Mathematical Economics - Akira Takayama (Dryden Press, 1974)
ECONOMICS
AKIRA TAKAYAMA
Purdue University
PREFACE iii
Some Frequently Used Notations xii
INTRODUCTION
A. Scope of the Book xv
B. Outline of the Book xviii
CHAPTER 0 PRELIMINARIES 1
A. Mathematical Preliminaries 1
B. Separation Theorems 35
C. Activity Analysis and the General Production Set 45
A. Introduction 55
B. Concave Programming-Saddle-Point Characterization 62
C. Differentiation and the Unconstrained Maximum Problem 75
a. Differentiation 75
b. Unconstrained Maximum 82
A. Introduction 169
B. Consumption Set and Preference Ordering 175
a. Historical Background 255
b. McKenzie's Proof 265
A. Introduction 295
B. Elements of the Theory of Differential Equations 302
C. The Stability of Competitive Equilibrium-The Historical
Background 313
D. A Proof of Global Stability for the Three-Commodity Case (with
Gross Substitutability)-An Illustration of the Phase Diagram
Technique 321
E. A Proof of Global Stability with Gross Substitutability-The
n-commodity Case 325
F. Some Remarks 331
A. Introduction 359
B. Frobenius Theorems 367
C. Dominant Diagonal Matrices 380
D. Some Applications 391
a. Introduction 419
b. Spaces of Functions and Optimization 421
c. Euler's Condition and a Sufficiency Theorem 426
a. Introduction 444
b. The Case of a Constant Capital:Output Ratio 450
c. Nonlinear Production Function with Infinite Time Horizon 459
a. Introduction 468
b. Model 470
c. The Optimal Attainable Paths 474
d. Sensitivity Analysis: Brock's Theorem 480
a. Introduction 486
b. Major Theorems 491
c. Two Remarks 497
a. Introduction 503
b. The Output System 507
c. The Price System 517
d. Inequalities and Optimization Model (Solow) 522
e. Morishima's Model of the Dynamic Leontief System 527
a. Introduction 559
b. The Basic Model and Optimality 561
c. Free Disposability and the Condition for Optimality 563
d. The Radner Turnpike Theorem 567
a. Introduction 575
INDEXES 721
Some Frequently
Used Notations
1. Sets
$\bigotimes_{i=1}^{n} X_i$ the Cartesian product of the $X_i$'s
(similarly, $X \otimes Y$, and so on)
2. Vectors
$\|x\|$ the norm of x
d(x, y) the distance between x and y
$x \cdot y$ the inner product of x and y
Given two vectors x and y in $R^n$
a. $x \geqq y$ means $x_i \ge y_i$ for all i
b. $x \ge y$ means $x_i \ge y_i$ for all i and with strict inequality for at least one i
c. $x > y$ means $x_i > y_i$ for all i
4. Preference Ordering
$x \succsim y$ x is not worse than y
(that is, y is not preferred to x)
$x \sim y$ x is indifferent to y
$x \succ y$ x is preferred to y
5. Others
$\Rightarrow$ means "implies"
$\because$ means "because"
$\equiv$ means "is by definition equal to" (or "is identically equal to")
det A: determinant of matrix A
Re(w): real part of a complex number w
Abbreviation resp. stands for respectively
Preface
and its development has been crucially dependent on a strong demand and
stimulus from such problems. However, the large number of viewpoints
based on diversified vested interests in a particular policy often obscures
transparent theoretical understanding. Hence it is very important for
economists to find the basic logical structure of each problem and to be
fully equipped with the major analytical tools, although many important
economic theories obviously can be neither mathematical nor analytical.
This book deals with the analytical and the mathematical aspects of
economic theory. It thus emphasizes a systematic exposition and extensions
of various mathematical tools of analysis which can be useful in many
diversified branches of economics, and of two topics in economic theory,
competitive equilibrium and economic growth, both of which, with their
rigor and theoretical thoroughness, will provide basic prototypes of analysis
and frames of reference for many other economic theories. Clearly, the
topics of interest in economics change rapidly from time to time, reflecting
the changing current concern with problems in the real world, and no book
can possibly cover all of these topics. However, I think that the material
presented in this book is useful and basic in analyzing many economic
problems, old and new.
In spite of the fact that the book is conspicuously analytical and mathe-
matical and that it is designed to bring the reader to the frontiers of mathe-
matical economics, the mathematical prerequisites have been kept to a mini-
mum. In virtually all sections of the book, the requirements include only
that level of knowledge of elementary calculus and elementary matrix theory
(say, the knowledge of matrix multiplication) which is now a standard
requirement for entering economics students in graduate schools in major
U.S. universities.
With regard to prerequisites in economic theory, the author can think
of several excellent introductory texts available on many of the topics dis-
cussed herein. More generally, a rigorous second or third year under-
graduate course should provide the reader with sufficient economic back-
ground to take up this study. Readers who are acquainted with books such as
Dorfman, Samuelson, and Solow, Linear Programming and Economic Anal-
ysis, New York, McGraw-Hill, 1958, and Hicks, Value and Capital, 2nd ed.,
Oxford, Clarendon Press, 1946, may benefit more from reading this book
than those unfamiliar with these works; however, familiarity with such texts
is by no means necessary.
The book is suitable for use as a textbook in graduate courses in eco-
nomic theory and mathematical economics and is also intended to serve as
a reference work for professional economists who wish to become familiar
with some of the topics and techniques of mathematical economics.
In addition, this book represents a record of my lectures on mathe-
matical economics and economic theory given to first- and second-year
In the course of writing the manuscript, I found that I owe a great debt
to an excellent graduate education in economics received at the University
of Rochester. I am also grateful for the atmosphere favorable to this modern
approach to economic theory existing at the University of Minnesota and
Purdue University, as well as at the University of Rochester. This atmo-
sphere, sponsored and nurtured by such distinguished scholars as Profes-
sors Leonid Hurwicz, John S. Chipman, Marcel K. Richter, Stanley Reiter,
and Robert L. Basmann, as well as Lionel W. McKenzie, Ronald W. Jones,
and Edward Zabel of Rochester, has provided a great deal of stimulation.
My greatest debt is to the students at Minnesota, Rochester, Hawaii,
and Purdue who took my courses on the topic and have constantly given
me stimulation, encouragement, and criticism. A number of people, in
addition to students in my classes, have read a portion or the whole of the
manuscript. Among them, I would like to express my gratitude to Michihiro
Ohyama, James C. Moore, William A. Brock, Mohamed A. El-Hodiri,
Takashi Negishi, Jinkichi Tsukui, Hiroshi Atsumi, Sheng Cheng Hu, John
Z. Drabicki, Yuji Kubo, Raj K. Jain, Kenneth Avio, and Fred Nordhauser.
In addition, I am grateful to Professor Richard E. Quandt of Princeton, who
read the entire manuscript and provided me with numerous comments as
well as encouragement. My special thanks also go to John Drabicki, for
without his help and encouragement, the completion of this book might have
been further delayed and hampered. I also appreciate the capable research
assistance provided by Erik Haites, Gene Warren, Robert Parks, Frank
Maris, and James Winder, as well as the excellent stenographic services of
Mrs. Gladys Cox, Mrs. Helen Antonienko, and others whose assistance
was made available to me, for the most part, through Purdue University.
I am also grateful to Professor Leonid Hurwicz for his readiness in
giving me permission to quote the results of one of his unpublished works
("LH-Oct. 1966" as revised, July 2, 1970). I would also like to record my
gratitude to those professors who have granted their kind permission to
quote many very interesting and illuminating passages from their writings.
The precise source and the author of each quotation is given in the respective
place of each quotation. Thanks are also due to the editors of Metroecono-
mica and the Quarterly Journal of Economics for permission to include in this
book some articles by the author which were originally published by them.
I am also grateful to Deans Emanuel T. Weiler, John S. Day, Rene
December, 1973
West Lafayette, Indiana
INTRODUCTION
Section A
SCOPE OF THE BOOK
and the economic theories separately. However, the fact that the mathematical
techniques are closely related to economic theories seems to make it difficult
to treat them effectively by themselves. In addition, treating the mathematical
techniques first might discourage the student before he gets to the economic
theories, whereas treating the economic theories first would not enable him to
take advantage of the mathematical techniques. The author takes the view that
this is not a difficulty but rather an advantage, in the sense that developments
in economic theory can be used to provide the unifying structure for the book.
The mathematical techniques can then be explained in connection with the
theoretical developments to which they are related. Mathematical theorems
will thus become more interesting and exciting to economists.
The author fully realizes some of the limitations of the book. For example,
in spite of quite a comprehensive coverage of the topics (which is broader than
in any other book currently in the field), it misses at least three important topics,
namely, the theory of uncertainty, the theory of social systems and organizations,
and the theory of conflicts and interactions. There is no question that
these topics are important and that significant contributions will be made in
the next few decades. They were excluded only because their inclusion would
make the book massive and because, in view of the current research carried out
in these fields, the materials covered would probably become obsolete by the
time of publication of this work.
Furthermore, even the topics covered in this book have important de-
ficiencies, in spite of the elegance and importance of all the literature related
to them. For example, in the case of the theory of competitive equilibrium one
may ask the following questions: (1) What is the rationale justifying assumptions
such as a fixed number of commodities, a fixed number of consumers, and a
fixed technology set? (2) Granting all the premises of the theory, how can we
reach an equilibrium? Walras offered the tâtonnement process, which provides
a way to reach an equilibrium without knowing individuals' preferences,
technology sets, and so forth. But the process excludes the possibility of intermediate
trading. When intermediate trading is permitted, the equilibrium depends on
the trading paths. (3) Even if the tâtonnement process is accepted, convergence
to an equilibrium is still an open question. So far, the proof of stability depends
on heroic assumptions such as "gross substitutability."
Although the monopoly of the nonanalytical methodology seems to be
over, mathematical economics, as it may be represented by this book, is no doubt
transitory. Future economists, completely free from prejudices against mathe-
matics and well trained in mathematics, econometric methods, and the theory
of measurement, and skilled in methods of electronic computation and simula-
tion, may be able to deal successfully with the institutional and political-economy
aspects of economics. However, the basic methodology and the framework
of thinking developed in mathematical economics will no doubt remain. Future
economists may be less concerned with such "large" problems as the competitive
equilibrium of the entire economy, and instead be more concerned with smaller
aspects of the economy. But they will still realize the importance of the analytical
method and the mode of developing analysis utilizing formally and honestly
constructed models. Furthermore, with the proper training, future economists
will not be in danger of overlooking the general equilibrium aspects of such
models.
In ending, it must be stressed that we should not overlook the importance
of the mathematical techniques developed in the course of the emergence of
mathematical economics. Although I would be the last person to argue whether
or not so-and-so's work is "good economics," I will be the first to defend the
importance of making the mathematical tools available to economists. These
are tools for every economist. Hence I have no hesitation to place heavy empha-
sis on mathematical techniques, almost on a par with my emphasis on economics.
Section B
OUTLINE OF THE BOOK
This book is essentially divided into three parts. The first part (Chapter 0)
provides the background materials in mathematics and economics necessary
for reading the rest of the book and also for further research in mathematical
economics. The second and the third parts constitute the main body of the book.
Roughly speaking, the second part (Chapters 1 through 4) is primarily concerned
with the theory of competitive markets, and the third part (Chapters 5 through 8)
is primarily concerned with the theory of growth. The above division between
the second and the third parts is a rough one, since the mathematical techniques
are closely interwoven with the economic topics, and it is not possible to classify
these techniques according to economic topics. For example, the theory of
nonlinear programming (Chapter 1) is a mathematical technique which lies at the
heart of the theory of competitive markets, yet it is an important technique
also for growth theory and for other fields of economics.
The first part, consisting of only one chapter (Chapter 0), is divided into
three sections. Section A collects the basic mathematics that will be useful for
reading the rest of the book and also for the reader's further study in economic
theory. Unlike the remainder of the book, most theorems here are stated with-
out proof so that the reader can grasp the basic mathematical concepts and
ideas without being led astray by the details of the proofs. Care is taken, how-
ever, not to misguide the reader into thinking that our world is always Euclidian,
and thus this section becomes more than a mere exposition of the mathematics
necessary for later sections of the book.
Section D begins with a summary of the results of B and C, and the reader, if
he so wishes, can skip most of the reading of B and C. The rich and wide applica-
tions shown in Section D will illustrate the power of this technique in economic
theory.
Chapter 5 has the dual purpose of introducing modern growth theory in the
form of an aggregate optimal growth model and of making the reader familiar
with an important mathematical tool, the calculus of variations. The calculus of
variations has had an unfortunate history among economists in that it was im-
mediately forgotten after the initial economic works of Roos, Evans, and
Ramsey, in the 1920s and 1930s. However, the power of this technique in physics
is well known, and economists are becoming more aware of its use in economics.
The optimal growth model gives an interesting example of its economic applica-
tions. In Section A, we make an expository account of this technique for the
simplest case. In Section B, we enter into a study of the second-order characteriza-
tions, which may be useful for the reader's further reading and research. It is
also shown there that, just as in nonlinear programming, concavity is sufficient
to guarantee that the first-order characterization ("Euler's condition") is neces-
sary and sufficient for an optimum. Section C digresses from the optimization
problem and attempts a compact summary of the one-sector growth model.
Section D then deals with the one-sector optimal growth model. An enormous
amount of literature in this field is treated in a unified and simple manner. Chapter
5 ends with the Appendix to Section D, in which we deal with the discrete-time
analogue of our discussion of Section D. This Appendix is intended to illustrate
the relation between the "continuous-time" model and the "discrete-time" model,
and to give an expository account of the existence problem and of the important
"sensitivity" results.
Chapter 6 discusses two important multisector growth models, the von
Neumann model and the dynamic Leontief model. In spite of many limitations,
the von Neumann model turns out to be fundamental in modern growth theory.
Section A deals with this model. Section B is concerned with the dynamic Leon-
tief model, which is particularly important among empirical economists. This
model, however, seems to have several important theoretical limitations. We
point out these difficulties. The Appendix to Section B uses a one-sector model
and points out these difficulties more sharply.
Chapter 7 deals with optimal growth in a multisector context. Section A
discusses some of the old results in this topic, namely, the "turnpike theorems."
In this material the role of consumption is completely subsumed and society is
concerned only with the terminal stock of goods. Our emphasis here is on an
elegant turnpike theorem due to Radner. Although the turnpike theorems of
Section A mark a great advance in theory compared with the von Neumann
theory, the above weakness is quite strong. This prompted the "neo-turnpike
theorems," notably that of David Gale. However, we deal with consumption
more explicitly than did Gale. Incidentally, we carry out our discussions of
Chapters 6 and 7 in terms of the discrete-time model, which the reader may,
FOOTNOTES
1. By empirical testing, I do not mean the "curve-fitting school," which relies heavily
on regression analysis. Although this school is fashionable among certain empirical
economists, it seems to represent institutionalism in one of its worst forms. Not
only does it suffer from a poor theoretical basis, but also it often ignores the
elementary theory of measurement.
2. However, there is no question that "diagrams" are often very useful tools for under-
standing important theories and that "common sense" verbal arguments are often
essential in leading to economically fruitful theories.
3. Keynes, J. M., "Alfred Marshall, 1842-1924," Economic Journal, XXXIV, Sep-
tember 1924, p. 333.
4. The analogy is imperfect, for we do not know of any economic theory which is
Section A
MATHEMATICAL PRELIMINARIES
$\{x : x \in A \text{ and } x \in B\}$. The collection of all the elements which belong to set A
or set B is denoted by $A \cup B$ and is called the union of A and B; that is,
$A \cup B \equiv \{x : x \in A \text{ or } x \in B\}$. The difference between two sets A and B (denoted by $A \setminus B$) is
defined by $A \setminus B = \{x : x \in A \text{ and } x \notin B\}$. The set that has no elements is also
considered a set. It is called the empty set and is denoted by $\emptyset$. If two sets A and
B have no elements in common, A and B are said to be disjoint or nonintersecting.
In other words, $A \cap B = \emptyset$.
The union (or intersection) can be taken over a finite or infinite collection
of sets. For example, if $S_1, S_2, \ldots, S_i, \ldots$ are sets, we can consider
$$\bigcup_{i=1}^{n} S_i \quad \text{or} \quad \bigcup_{i=1}^{\infty} S_i$$
In fact, the index i does not have to be an integer. For example, letting T be
the set of all real numbers between 0 and 1, we may consider the union $\bigcup_{t \in T} S_t$.
If $S_t$ is the set of all the human beings on earth at time t, $\bigcup_{t \in T} S_t$ is the set of
all human beings on earth during the period T. The reader should easily
understand the notations
$$\bigcap_{i=1}^{n} S_i, \quad \bigcap_{i=1}^{\infty} S_i, \quad \bigcap_{t \in T} S_t$$
When T is some index set (not necessarily the positive integers), we may
consider the Cartesian product such as $\bigotimes_{t \in T} X_t$. If T is the set of all the positive
integers, clearly
$$\bigotimes_{t \in T} X_t = \bigotimes_{i=1}^{\infty} X_i$$
(i) If S is bounded from above, there is a smallest element in the set of upper bounds
of S.
In other words, if S is bounded from above, the set U of its upper bounds,
$U = \{a : a \in R,\ a \ge x \text{ for all } x \in S\}$, is not empty. Proposition (i) asserts that there
exists an $a \in U$ such that $x < a$ implies $x \notin U$; a is called the supremum or the least
upper bound of S. It is denoted as $\sup S = a$ or $\sup_{x \in S} x = a$. If S is bounded from
below, the set L of its lower bounds, $L = \{b : b \in R,\ b \le x \text{ for all } x \in S\}$, is not
empty. Proposition (ii) asserts that there exists a $b \in L$ such that $x > b$ implies
$x \notin L$; b is called the infimum or the greatest lower bound of S. It is denoted as
$\inf S = b$ or $\inf_{x \in S} x = b$. Given a set S, a subset of R, S may not contain its
least upper bound even if it is bounded from above. Similarly, S may not contain
its greatest lower bound even if it is bounded from below. For example, the set S
defined by $S = \{1/q : q = 1, 2, \ldots\}$ is clearly a subset of R, and $\sup S = 1$ and
$\inf S = 0$. Note that $1 \in S$ and $0 \notin S$. That is, $\inf S \notin S$. In general, given an
arbitrary subset S of R, if $a = \sup S$ is in S, a is called the maximum element of
S. Similarly, if $b = \inf S$ is in S, b is called the minimum element of S. The above
is an example of the case in which a set does not contain its infimum.
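The example S = {1/q} can be explored numerically. The following is only a sketch under stated assumptions: the helper name `partial_set` and the cutoff of 1000 terms are illustrative choices, and a finite truncation can suggest, but not prove, that the infimum 0 is never attained.

```python
from fractions import Fraction

def partial_set(n_terms):
    """First n_terms elements of S = {1/q : q = 1, 2, ...}, as exact fractions."""
    return [Fraction(1, q) for q in range(1, n_terms + 1)]

S_head = partial_set(1000)   # hypothetical cutoff, chosen only for illustration
largest = max(S_head)        # attained at q = 1, so sup S is also the maximum
smallest = min(S_head)       # positive for every cutoff, but shrinks toward 0
```

Raising the cutoff makes `smallest` as close to 0 as desired while every element stays strictly positive, which is the sense in which inf S = 0 lies outside S.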
Finally, we may note that the notation "$\Rightarrow$" is often used to mean "imply."
For example $A \Rightarrow B$ is read as "statement A implies statement B." If $A \Rightarrow B$ holds,
we say that B is a necessary condition for A and that A is a sufficient condition for B.
When $A \Rightarrow B$ holds, it is not necessarily the case that $B \Rightarrow A$ holds (that is, "the
converse is not necessarily true"). For example, let Q be the set of all rational
numbers and J be the set of all integers; then $x \in J \Rightarrow x \in Q$ but $x \in Q$ does not
necessarily imply $x \in J$. When $A \Rightarrow B$ and $B \Rightarrow A$ both hold, then we may say
that A is a necessary and sufficient condition for B or B is a necessary and sufficient
condition for A. In this case A and B are also said to be logically equivalent. When
$A \Rightarrow B$ holds, then "B does not hold $\Rightarrow$ A does not hold" is true. On the other
hand, if we can show "B does not hold $\Rightarrow$ A does not hold," then $A \Rightarrow B$. This is
often used to prove the statement "$A \Rightarrow B$."
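The equivalence between a statement and its contrapositive, and the failure of the converse, can be checked by enumerating truth values. This is a minimal sketch; the function name `implies` is an illustrative choice, and material implication is assumed as the reading of "implies."

```python
from itertools import product

def implies(p, q):
    """Material implication: 'p implies q' is false only when p holds and q fails."""
    return (not p) or q

rows = list(product([False, True], repeat=2))

# "A implies B" agrees with "not B implies not A" on every row of the truth table.
contrapositive_equivalent = all(
    implies(a, b) == implies(not b, not a) for a, b in rows
)

# The converse "B implies A" does not agree with "A implies B" on every row.
converse_equivalent = all(implies(a, b) == implies(b, a) for a, b in rows)
```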
$\alpha \in R$, define the multiplication of an arbitrary member x of $R^n$ by $\alpha$ (called
scalar multiplication) by coordinate-wise multiplication; that is, $z = \alpha x = x\alpha$
means $z_i = \alpha x_i$, i = 1, 2, ..., n. Clearly $x \in R^n$, $\alpha \in R$ implies $\alpha x \in R^n$. In other
words, $R^n$ is "closed" under scalar multiplication.
When n = 2, we can illustrate the above concepts of coordinates, addition,
and scalar multiplication in Figure 0.2. This diagram should be well known.
Given the above rule of addition and scalar multiplication in $R^n$, we can
readily check that the following eight properties hold for arbitrary elements x,
y, and z of $R^n$ and scalars $\alpha, \beta \in R$.
(L-1) (Associative Law) $x + (y + z) = (x + y) + z$.
(L-2) There exists an element called 0 such that $x + 0 = 0 + x = x$.
(L-3) There exists an element $(-x)$ for every x such that $x + (-x) = 0$.
(L-4) (Commutative Law) $x + y = y + x$.
(L-5) (Associative Law) $\alpha(\beta x) = (\alpha\beta)x$.
(L-6) $1x = x$.
(L-7) (Distributive Law) $\alpha(x + y) = \alpha x + \alpha y$.
(L-8) (Distributive Law) $(\alpha + \beta)x = \alpha x + \beta x$.
[Figure 0.2, with axes $x_1$ and $x_2$]
A little notational caution is needed here about the symbol 0, which can mean
either the scalar zero or the n-tuple of 0's. The latter is sometimes called the
origin for the obvious geometric reason (see Figure 0.2).
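Properties (L-1) to (L-8) can be spot-checked for coordinate-wise addition and scalar multiplication. The sketch below assumes n = 2 and integer entries so that equality is exact; the vectors, scalars, and helper names `add` and `scale` are illustrative, and a check at particular points is of course not a proof.

```python
def add(x, y):
    """Coordinate-wise addition of two n-tuples."""
    return tuple(xi + yi for xi, yi in zip(x, y))

def scale(a, x):
    """Coordinate-wise scalar multiplication of an n-tuple by the scalar a."""
    return tuple(a * xi for xi in x)

x, y, z = (1, 2), (3, -1), (0, 5)   # illustrative vectors in R^2
alpha, beta = 2, -3                 # illustrative scalars
zero = (0, 0)                       # the origin, not the scalar zero
neg_x = scale(-1, x)

checks = {
    "L-1": add(x, add(y, z)) == add(add(x, y), z),
    "L-2": add(x, zero) == x == add(zero, x),
    "L-3": add(x, neg_x) == zero,
    "L-4": add(x, y) == add(y, x),
    "L-5": scale(alpha, scale(beta, x)) == scale(alpha * beta, x),
    "L-6": scale(1, x) == x,
    "L-7": scale(alpha, add(x, y)) == add(scale(alpha, x), scale(alpha, y)),
    "L-8": scale(alpha + beta, x) == add(scale(alpha, x), scale(beta, x)),
}
```

Note that `zero` is a tuple while `0` is a scalar, which mirrors the notational caution about the two meanings of the symbol 0.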
Given an arbitrary set X (not necessarily $R^n$), if "addition" and "scalar
multiplication" are defined, if X is closed under these two operations, and if
the above properties (L-1) to (L-8) are satisfied, then we call X a (real) linear
space or (real) vector space, and an element of X is called a vector. Of course
$R^n$ with the above rule of addition and scalar multiplication is only one example
of a linear space. From now on we shall regard $R^n$ as a particular one in which
such addition and multiplication are defined. We henceforth call $R^n$
"(n-dimensional) real space." For illustrative purposes we give the following as examples
of linear spaces other than $R^n$.
EXAMPLE 1: The set F of real-valued functions defined on the interval
[a, b]. Given $f, g \in F$, "addition" (f + g) is defined by f(x) + g(x) for each
$x \in [a, b]$ and scalar multiplication ($\alpha f$) is defined by $\alpha f(x)$ for each
$x \in [a, b]$.
EXAMPLE 2: $x = (x_1, x_2, \ldots, x_n, \ldots)$, where $\sum_k |x_k| < \infty$, with a similar
definition of addition and scalar multiplication as in $R^n$.
EXAMPLE 3: The set of all two-by-two matrices with real number entries.
EXAMPLE 4: The set of all continuous functions defined on the closed
interval [0, 1] into R (denoted by $C_{[0,1]}$), with the same rules of addition
and scalar multiplication as in example 1.
REMARK: Given an arbitrary set X, if "addition" is defined on X and
$x, y \in X$ implies $x + y \in X$ with the properties (L-1) to (L-4), we say X is an
Abelian (additive) group.
restrict the "scalars" to real numbers. If a linear space is defined over com-
plex numbers, we call it a complex linear space or a linear space with complex
field. In fact, we can use anything as a scalar in the defining properties of a
linear space, if it satisfies the properties of the algebraic concept "field."
However, in this book, we confine ourselves to a "real linear space," or a
"linear space with real field" (that is, the case in which the scalars are the
real numbers). Hence, when we subsequently refer to a linear space, we
mean it to be a real linear space as was defined above.
Let $S_1$ and $S_2$ be two subsets in a linear space X. Since addition and scalar
multiplication are defined in a linear space, we can define the following set S, for
fixed scalars $\alpha$ and $\beta$:
$$S \equiv \{\alpha x + \beta y : x \in S_1,\ y \in S_2,\ \text{and } \alpha, \beta \in R\}$$
We denote S by $\alpha S_1 + \beta S_2$ and call it a linear sum of $S_1$ and $S_2$. Clearly S is in
X. Given m sets, $S_1, S_2, \ldots, S_m$ in a linear space X, we can analogously define
$S \equiv \sum_{i=1}^{m} \alpha_i S_i$, $\alpha_i \in R$, i = 1, 2, ..., m. The linear sum of sets must be
distinguished from the union of sets (such as $\bigcup_{i=1}^{m} S_i$).
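The difference between a linear sum and a union is easy to see on small finite sets. In this sketch the sets are subsets of R (a one-dimensional linear space), and the function name `linear_sum` and the particular sets are illustrative assumptions.

```python
def linear_sum(alpha, S1, beta, S2):
    """The set {alpha*x + beta*y : x in S1, y in S2} for fixed scalars alpha, beta."""
    return {alpha * x + beta * y for x in S1 for y in S2}

S1 = {0, 1}
S2 = {10, 20}

sum_set = linear_sum(1, S1, 1, S2)   # every pairwise sum x + y
union_set = S1 | S2                  # elements belonging to S1 or to S2
```

Here the linear sum contains 11 and 21, which belong to neither set, while the union contains 0 and 1, which are not pairwise sums.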
Given any two arbitrary members $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$
of $R^n$, we may define a rule for multiplication of x and y as follows:
$$x \cdot y \equiv \sum_{i=1}^{n} x_i y_i$$
Note that $x \cdot y$ is a real number; $x \cdot y$ computed by the above rule is called the
inner product in $R^n$.
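As a worked instance of this rule, the sketch below computes the inner product of two vectors in three-dimensional real space. The function name `inner` and the sample vectors are illustrative assumptions.

```python
def inner(x, y):
    """Sum of coordinate-wise products of two n-tuples of equal length."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of coordinates")
    return sum(xi * yi for xi, yi in zip(x, y))

x = (1, 2, 3)
y = (4, -5, 6)
value = inner(x, y)   # 1*4 + 2*(-5) + 3*6
```

The result is a single real number, not a vector, and the order of the two arguments does not matter.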
In general, given an arbitrary linear space X (over a real field), an inner
product is defined as a real-valued function defined on the Cartesian product
$X \otimes X$ (denoted by $x \cdot y$ or $\langle x, y \rangle$ where $x \in X$ and $y \in X$), which satisfies
the following properties. For arbitrary elements x, y and $z \in X$ and $\alpha, \beta \in R$,
(I-1) $x \cdot y = y \cdot x$.
(I-2) $(\alpha x + \beta y) \cdot z = \alpha(x \cdot z) + \beta(y \cdot z)$.
(I-3) $x \cdot x \ge 0$, and $x \cdot x = 0$ if and only if x = 0.
A linear space with an inner product defined is called an inner product space.
Clearly the inner product defined above for $R^n$ satisfies the above axioms (I-1) to
(I-3). We call it the usual (Euclidian) inner product.
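Given a point $x = (x_1, x_2, \ldots, x_n)$ in $R^n$, the distance between x and the origin can
be computed by
$$d(x, 0) \equiv \sqrt{\sum_{i=1}^{n} x_i^2} = \sqrt{x \cdot x}$$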
Similarly, given any two arbitrary points $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots,$
The "distance" defined in the above formula is called the Euclidian distance. The
Euclidian distance is illustrated in Figure 0.3.
We can easily show that the Euclidian distance defined above satisfies the
following properties.
(M-1) d(x, y) = 0 if and only if x = y.
(M-2) (Triangular inequality) $d(x, y) + d(y, z) \ge d(z, x)$.
(M-3) $d(x, y) \ge 0$ for all x and y.
(M-4) (Symmetry) $d(x, y) = d(y, x)$.
Properties (M-3) and (M-4) can be obtained from (M-1) and (M-2). Given an
arbitrary set X, if a function d from $X \otimes X$ into R is defined and if d satisfies
the above properties (M-1) and (M-2) [hence also (M-3) and (M-4)], then X is
called a metric space, d is called a metric, and d(x, y) is called the distance between
two points x and y in X. The metric space is denoted by (X, d). The metric
(or distance) is a function from $X \otimes X$ into R.
REMARK: The first example of a metric space is obviously $R^n$ with the
Euclidian distance defined, from which the concept is formulated. However,
it is possible to think of many different kinds of metric spaces. In the
following example the reader can easily check that the axioms (M-1) and (M-2)
of metric spaces are satisfied.
Let X be an arbitrary nonempty set and define
$$d(x, y) \equiv \begin{cases} 0 & \text{if } x = y \\ 1 & \text{if } x \ne y \end{cases}$$
This example shows that every nonempty set can be regarded as a metric space.
This example is often used to show that certain statements which hold true in $R^n$
(with the Euclidian distance) do not necessarily hold true in an arbitrary metric
space.
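Axioms (M-1) and (M-2) can be checked by brute force on a finite sample of points, both for the Euclidian distance and for the two-valued distance just defined. This is a sketch only: the helper names, the sample points, and the floating-point tolerance are illustrative assumptions, and passing on a sample does not establish the axioms in general.

```python
import math
from itertools import product

def euclid(x, y):
    """Euclidian distance between two n-tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def discrete(x, y):
    """The two-valued distance: 0 if the points coincide, 1 otherwise."""
    return 0 if x == y else 1

def satisfies_m1_m2(d, points, tol=1e-12):
    """Check (M-1) and (M-2) for every triple drawn from a finite sample."""
    for x, y, z in product(points, repeat=3):
        if (d(x, y) == 0) != (x == y):          # (M-1)
            return False
        if d(x, y) + d(y, z) < d(z, x) - tol:   # (M-2), up to rounding error
            return False
    return True

points = [(0, 0), (3, 4), (1, -2), (3, 4.5)]   # sample points in R^2
labels = ["apple", "pear", "plum"]             # a set with no linear structure
```

The second sample illustrates that the two-valued distance needs nothing from the underlying set beyond the ability to tell whether two elements are equal.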
Given a point x in $R^n$, the Euclidian distance between x and the origin,
d(x, 0), is also called the Euclidian norm of x. We denote it by $\|x\|$. Then d(x, y)
can be denoted by $\|x - y\|$. Given arbitrary points x and y in $R^n$ and a scalar
$\alpha \in R$, we can easily verify that the following three properties hold for the
Euclidian norm.
(N-1) $\|x\| \ge 0$, and $\|x\| = 0$ if and only if x = 0.
(N-2) (Triangular inequality) $\|x + y\| \le \|x\| + \|y\|$.
(N-3) $\|\alpha x\| = |\alpha| \, \|x\|$.
Given an arbitrary vector space X (not necessarily $R^n$), if we define the real-valued
function, called a norm, which satisfies the above three properties, we call X a
normed vector space, or a normed linear space. Clearly every normed linear space
is a metric space with respect to the induced metric defined by $d(x, y) = \|x - y\|$.
It should be noted that the choice of a norm is not necessarily unique. For example,
$R^n$ is a normed linear space with the Euclidian norm, but it can also be a normed
linear space with the following norms:
$$\|x\| \equiv \max_{1 \le i \le n} |x_i|$$
or
$$\|x\| \equiv \sum_{i=1}^{n} |x_i|$$
The reader can check that either of the above two satisfies all three properties
(N-1) to (N-3).
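The three norms on real n-space mentioned here can be compared on a concrete vector, and properties (N-2) and (N-3) spot-checked. In this sketch the function names and sample values are illustrative assumptions, and `math.isclose` is used because the Euclidian norm involves a square root.

```python
import math

def norm_euclid(x):
    """Square root of the sum of squared coordinates."""
    return math.sqrt(sum(xi * xi for xi in x))

def norm_max(x):
    """Largest absolute coordinate."""
    return max(abs(xi) for xi in x)

def norm_sum(x):
    """Sum of absolute coordinates."""
    return sum(abs(xi) for xi in x)

x, y, alpha = (3, -4), (1, 2), -2
x_plus_y = tuple(a + b for a, b in zip(x, y))
alpha_x = tuple(alpha * a for a in x)

values = {}
properties_hold = {}
for name, norm in [("euclid", norm_euclid), ("max", norm_max), ("sum", norm_sum)]:
    values[name] = norm(x)
    triangle = norm(x_plus_y) <= norm(x) + norm(y) + 1e-12        # (N-2)
    homogeneous = math.isclose(norm(alpha_x), abs(alpha) * norm(x))  # (N-3)
    properties_hold[name] = triangle and homogeneous
```

The same vector has three different lengths under the three norms, which is the sense in which the choice of a norm is not unique.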
We should note that an arbitrary metric space cannot necessarily be a
normed linear space, for it may not be a linear space in the beginning.
Given a linear space X, we can also induce the concept of a norm from the
concept of an inner product. That is, let X be an inner product space and define
$$\|x\| \equiv (x \cdot x)^{1/2}, \quad \text{or} \quad \|x\|^2 = (x \cdot x)$$
We can easily verify all the properties of a norm, (N-1) to (N-3); hence X becomes
a normed linear space as well as an inner product space. Note that in $R^n$ this
relation holds when $\|x\|$ is the Euclidian norm and $(x \cdot x) \equiv \sum_{i=1}^{n} x_i^2$.
Thus given an arbitrary linear space X, if we first define an inner product
and if we induce the norm and then the metric in the way described above, then
X becomes a normed, metric, and inner product linear space. Thus all the prop-
erties (N-1) to (N-3), (M-1) to (M-4), (I-1) to (I-3), and (L-1) to (L-8) are available.
In particular, $R^n$ can be such a space with its usual Euclidian norm, metric, and
inner product. Unless otherwise stated, we consider $R^n$ as such a space. That is,
given x and y in $R^n$, we have
$$x \cdot y = \sum_{i=1}^{n} x_i y_i, \qquad \|x\| = \sqrt{\sum_{i=1}^{n} x_i^2}$$
and
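Definition: Let X be a linear space. Given m vectors in X, $x^1, x^2, \ldots, x^m$, the
vector x defined by
$$x = \sum_{i=1}^{m} \alpha_i x^i, \quad \alpha_i \in R,\ i = 1, 2, \ldots, m$$
is called a linear combination of these m vectors.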
Definition: A finite set of vectors $\{x^1, x^2, \ldots, x^m\}$ in a linear space is called
linearly independent if
$$\sum_{i=1}^{m} \alpha_i x^i = 0 \text{ implies that } \alpha_i = 0 \text{ for every } i$$
Corollary: The set of nonzero vectors, $\{x^1, x^2, \ldots, x^m\}$, is linearly dependent if
and only if one of the vectors in the set, say $x^i$, can be expressed as a linear
combination of other vectors in the set. (For the proof, see Halmos [5], pp. 9-10, for
example.)
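For vectors in real n-space, linear independence can be tested mechanically: a finite set is linearly independent exactly when the rank of the array formed by the vectors equals the number of vectors. The sketch below uses Gaussian elimination with exact fractions; the function names are illustrative assumptions and the routine is meant for small examples only.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of equal-length vectors, by exact Gauss-Jordan elimination."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def linearly_independent(vectors):
    """True when only the zero coefficients give the zero linear combination."""
    return rank(vectors) == len(vectors)
```

For example, (1, 1) = (1, 0) + (0, 1), so the three vectors (1, 0), (0, 1), (1, 1) are linearly dependent, in line with the Corollary.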
forms a basis for $R^n$. Obviously there can be many bases for a given linear
system X.
The following theorem is basic.
Theorem 0.A.1:
1. Define $f: R^n \to R$ by $f(x) = a \cdot x$ where $x \in R^n$ and a is any (fixed) vector in
$R^n$.
2. Let $C_{[a,b]}$ be the set of all continuous functions defined on the closed
interval [a, b]. Define $f: C_{[a,b]} \to R$ by $f(x) = \int_a^b x(t)\,dt$, where $x(t) \in C_{[a,b]}$.
Theorem 0.A.2: Let X be a finite dimensional linear space and define L as above;
$f \in L$ is invertible if and only if f(x) = 0 implies x = 0.
PROOF: See Halmos [5], p. 63, for example.
REMARK: In finite dimensional spaces "f is not invertible" therefore
implies that "there exists an $x \in X$, $x \ne 0$ such that f(x) = 0." Such an f is often
called singular and an invertible f is often called nonsingular.
Let X be an n-dimensional linear space where n is a finite positive integer.
Let $S = \{x^1, x^2, \ldots, x^n\}$ be a basis of X. Let $f \in L$; thus $f(x) \in X$. Hence in
particular $f(x^j) \in X$, $j = 1, 2, \ldots, n$. From the definition of basis, $f(x^j)$ can be
written as
$f(x^j) = \sum_{i=1}^n a_{ij} x^i$, $j = 1, 2, \ldots, n$
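$[a_{ij}] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}$
$0 = [0_{ij}]$
$[a_{ij}] \cdot [b_{ij}] = [c_{ij}]$
where
$c_{ij} = \sum_{k=1}^n a_{ik} b_{kj}$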
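The product rule $\sum_k a_{ik} b_{kj}$ is exactly what makes the matrix of a composite linear function equal to the product of the two matrices. The following sketch checks this on one small example; `mat_mul` and `apply` are illustrative names, not notation from the text.

```python
def mat_mul(A, B):
    """Matrix product: entry (i, j) is the sum over k of A[i][k] * B[k][j]."""
    inner_dim = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(inner_dim))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def apply(A, x):
    """Apply the linear function represented by matrix A to the vector x."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
x = [5, 7]

# Applying B and then A gives the same vector as applying the product AB once.
print(apply(A, apply(B, x)))
print(apply(mat_mul(A, B), x))
```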
Theorem 0.A.3: Given L and the set of all matrices $[a_{ij}]$ defined by $f(x^j) =
\sum_{i=1}^n a_{ij} x^i$ with a fixed basis of X, $\{x^1, x^2, \ldots, x^n\}$, there exists an isomorphic
Definition: Let f be a function on a linear space X into R (that is, a linear functional).
The function f is said to be linear affine or affine if f(x) - f(0) is linear.
REMARK: Obviously a linear function is a special case of a linear affine
function. Let $X \subset R^n$ and Y = R, and define $f(x) = a \cdot x + k = \sum_{i=1}^n a_i x_i + k$,
where $a, x \in R^n$, $k \in R$, and a and k are constants. In elementary mathematics
and in most literature in economics, such a function f is known as a
"linear function." However, as long as $k \neq 0$, f does not satisfy the definition
of linear functions; it is a linear affine function. Similarly, let F(x) be
defined by $F: R^n \to R^m$, $F(x) = A \cdot x + k$, where $x \in R^n$, $k \in R^m$ and A is an
(m × n) matrix. Then F is also a linear affine function as long as $k \neq 0$.
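The distinction can be checked on a concrete case. In the sketch below the vector `a`, the constant `k`, and the test points are arbitrary choices made for illustration: f(x) = a · x + k fails the linearity test when k ≠ 0, while g(x) = f(x) - f(0) passes it.

```python
def dot(a, x):
    return sum(ai * xi for ai, xi in zip(a, x))

def combine(alpha, x, beta, y):
    """The vector alpha*x + beta*y."""
    return [alpha * xi + beta * yi for xi, yi in zip(x, y)]

def is_linear_on(f, x, y, alpha, beta):
    """Check f(alpha*x + beta*y) == alpha*f(x) + beta*f(y) for one choice of inputs."""
    return f(combine(alpha, x, beta, y)) == alpha * f(x) + beta * f(y)

a, k = [2, -1], 5

def f(x):
    # linear affine: a . x + k with k != 0
    return dot(a, x) + k

def g(x):
    # f(x) - f(0), which is linear
    return f(x) - f([0, 0])

x, y = [1, 2], [3, 4]
print(is_linear_on(f, x, y, 2, 3))
print(is_linear_on(g, x, y, 2, 3))
```

A single failing pair is enough to show that f is not linear; the passing check for g is of course only a spot check, not a proof.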
REMARK WITH REGRET: In the course of this book, as long as the con-
text is clear we do not stick closely to this distinction between linear functions
and linear affine functions. In other words, we sometimes call a linear affine
function a "linear function" as long as this does not cause any confusion.
Although obviously imprecise, this is rather inevitable in view of the common
usage in economics. For example, linear programming typically contains
the "linear constraint" of the type $f(x) = a \cdot x + k \leq 0$ ($k \neq 0$). As remarked
above, f is linear affine and not linear. Similarly, F in the constraint
$F(x) = A \cdot x + k \leq 0$ ($k \neq 0$) is also linear affine. But it is too pedantic to rename
linear programming "linear affine programming" and linear constraints
"linear affine constraints." There are too many such examples in economics
to rename the relevant functions as "linear affine."
d. CONVEX SETS [See footnote 11.]
Here we consider an arbitrary linear space X. This X does not necessarily
have to be $R^n$. However, the reader can certainly confine his attention to $R^n$
(instead of X) if he so wishes.
[Figure 0.4. An Illustration of Convex Combination: $z = \theta x + (1 - \theta) y$ with $0 < \theta < 1$; if $\theta = 1$, z = x, and if $\theta = 0$, z = y.]
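Definition: Given m points, $x^1, x^2, \ldots, x^m$ in a linear space X, x defined by
$x = \sum_{i=1}^m \alpha_i x^i$, $\alpha_i \geq 0$, $i = 1, 2, \ldots, m$, $\sum_{i=1}^m \alpha_i = 1$
is called a convex combination of these points.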
Theorem 0.A.4:
(i) A set S in a linear space X is convex if and only if every convex combination of
(two or more) points in S belongs to S.
(ii) Any intersection (finite or infinite) of convex sets is also convex.
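Statement (i) can be illustrated numerically. The sketch below forms a convex combination of three points of the closed unit disk in $R^2$ (a convex set) and confirms that the result stays in the disk. The helper names and the particular points and weights are assumptions made for the example.

```python
def convex_combination(points, weights):
    """Weighted sum of points with nonnegative weights summing to one."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be nonnegative and sum to 1")
    n = len(points[0])
    return [sum(w * p[i] for w, p in zip(weights, points)) for i in range(n)]

def in_unit_disk(p):
    """Membership in the closed unit disk of R^2."""
    return p[0] ** 2 + p[1] ** 2 <= 1.0

points = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights = [0.5, 0.25, 0.25]
z = convex_combination(points, weights)
print(z, in_unit_disk(z))
```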
REMARK: The empty set $\emptyset$ and sets consisting of only one point are considered
convex sets.
Theorem 0.A.5: Let $S_i$, $i = 1, 2, \ldots, m$ be convex sets in a linear space X. Then
the following are true. [See footnote 12.]
$S_i$'s is the set of all m-tuples $(x^1, x^2, \ldots, x^i, \ldots, x^m)$, $x^i \in S_i$, $i = 1, 2, \ldots, m$).
REMARK: The set consisting of two different half lines starting from the
origin is a cone, but not a convex cone. If we include the area inside two
half lines with an acute angle, then it is a convex cone.
Theorem 0.A.6:
Definition: Given a set S in a linear space X, the intersection of all the convex
cones containing S is called a convex cone spanned by S or a convex cone generated
by S, and we denote it by K(S).
REMARK: We can show that K(S) is the "smallest" convex cone containing
S. That is, K is a convex cone containing S implies that $K(S) \subset K$. We can
also show that K(S) can be written as
(0, 1) and (1, 0) will generate a convex polyhedral cone K(S) which is the
nonnegative orthant of $R^2$.
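The orthant example can be sketched in code, assuming the standard description of K(S) for a finite S as the set of all nonnegative linear combinations of its points. For S = {(1, 0), (0, 1)} the coefficients of a point are simply its coordinates, so membership reduces to a sign check. The function names are hypothetical.

```python
def in_cone_of_unit_vectors(x):
    """Membership in K(S) for S = {(1, 0), (0, 1)}.

    x = c1*(1, 0) + c2*(0, 1) with c1 = x[0] and c2 = x[1], so x lies in the
    generated cone exactly when both coordinates are nonnegative.
    """
    return x[0] >= 0 and x[1] >= 0

def add(x, y):
    return [xi + yi for xi, yi in zip(x, y)]

def scale(lam, x):
    return [lam * xi for xi in x]

u, v = [2, 3], [0, 5]
# A convex cone is closed under addition and under nonnegative scaling.
print(in_cone_of_unit_vectors(add(u, v)))
print(in_cone_of_unit_vectors(scale(4, u)))
print(in_cone_of_unit_vectors([-1, 1]))
```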
In the previous subsection, we defined such concepts as the dimension of
a linear space and linear functions. With the aid of these concepts we can obtain
the following important characterization of convex sets and so forth.
Theorem 0.A.7: Let f be a linear function of a linear space X into a linear space
Y. If S is a convex subset (resp. cone, linear subspace) of X, then its image f(S)
is a convex subset (resp. cone, linear subspace) of Y.
PROOF: See Berge [1], p. 143, for example.
EXAMPLES: Let $X = R^n$ and $S \subset X$. Consider the following examples:
e. A LITTLE TOPOLOGY [See footnote 15.]
Consider a point $x_0$ in a metric space (X, d), say, $R^n$, and define a set
$B_r(x_0)$ by
$B_r(x_0) = \{x : x \in X, d(x, x_0) < r\}$
where r is some positive real number and $d(x, x_0)$ refers to the (Euclidian)
distance between x and (the fixed point) $x_0$. The set $B_r(x_0)$ is called the open
ball about $x_0$ with radius r. The point $x_0$ is called the center of $B_r(x_0)$. An open
ball is always nonempty, for it contains its center. Figure 0.6 illustrates some
examples of an open ball.
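A minimal sketch of the open ball in $R^n$ with the Euclidian distance follows. The name `in_open_ball` is chosen for the example. Note that the strict inequality excludes points at distance exactly r.

```python
import math

def dist(x, y):
    """Euclidian distance in R^n."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def in_open_ball(x, center, r):
    """True if x lies in the open ball about `center` with radius r > 0."""
    return dist(x, center) < r

center = [0.0, 0.0]
print(in_open_ball(center, center, 1.0))      # the center always belongs to the ball
print(in_open_ball([0.5, 0.5], center, 1.0))  # an interior point
print(in_open_ball([1.0, 0.0], center, 1.0))  # a point at distance exactly r
```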
[Figure 0.6. Panels: an open ball in $R^2$; an open ball in $C_{[0,1]}$, that is, $\{f \in C_{[0,1]} : d(f, f_0) < r\}$.]
Definition: Let S be a subset in a metric space (X, d). This set S is called an open
set, if, for any x in S, there exists a positive real number r such that $B_r(x) \subset S$.
REMARK: It is easy to check that every open ball is an open set.
Now consider the collection of all the open sets in X. Note that X itself is an
open set. The empty set $\emptyset$ can be considered as a trivial example of an open set.
We denote the collection of all the open sets in X by T. We can easily check that T
satisfies the following properties:
(T-1) $X \in T$, $\emptyset \in T$.
(T-2) $V_i \in T$, $i = 1, 2, \ldots, m$ implies $\bigcap_{i=1}^m V_i \in T$.
(T-3) $V_\alpha \in T$ for all $\alpha \in A$ implies $\bigcup_{\alpha \in A} V_\alpha \in T$.
Given an arbitrary set X (it does not have to be a metric space or a linear
space), if we define a collection of subsets T of X which satisfies the properties (T-1),
(T-2), and (T-3), we can call it a topological space with topology T. We denote a
topological space by (X, T). A member of T is called an open set. In fact, as the
reader can easily check, any set X can be a topological space for either one of the
following topologies:
In fact, many kinds of topologies other than the above two can be defined on an
arbitrary set. That is, there are many ways to transform a given arbitrary set to a
topological space. The symbol T in the notation (X, T) refers to the topology
specified for this topological space X.
In a metric space we are often concerned with the collection of open sets
defined in terms of open balls as a topology. We call this the usual topology in the
metric space or the topology induced by the metric. Although there are many ways
to make a metric space into a topological space, we henceforth refer to this usual
topology as the topology in metric space, unless otherwise specified.
Note that an arbitrary topological space does not have to be a metric space,
although every metric space can be a topological space by the topology induced by
the metric. Note also that an arbitrary topological space may not be a linear space,
although every normed linear space is a topological space by the topology induced
by the norm.
Real space, $R^n$, is a metric space with the Euclidian metric; hence it is also a
topological space with the usual topology. Moreover, $R^n$ is also a linear space, a
normed linear space, and an inner product space. In other words, it has all the
features of these spaces. Conversely, we may say that the properties of each of
these spaces are abstracted from $R^n$. This means that one can always get an intuitive
understanding of these concepts by a graphical representation in $R^2$. However,
the reader should note that this is also very dangerous, for these concepts are
far more general and broader than $R^2$ (or $R^n$).
We now define closed sets.
is a convergent sequence whose values are in S, with limit 1. But the point 1
is not a limit point of S, for S, consisting of a single point, cannot have any
limit points, as we remarked after the definition of the limit point.
The following theorem is easy to prove and is useful to relate the two con-
cepts of "limit" and "limit point."
PROOF: The sufficiency part of the theorem is obvious. For the proof of the
necessity part, see Rudin [10], p. 42, and Kelley [7], p. 73, for example.
REMARK: Restriction of S to be in the real space $R^n$ is necessary to prove
the necessity part only. In fact, this can be relaxed by requiring that $S \subset X$,
where X is a topological space satisfying the "first axiom of countability."
The reader is not required to understand the concept of the first axiom of
countability. However, he can always find its definition from any textbook on
general topology; see, for example, Kelley [7] , p. 50, and footnote 28 of the
present section.
$d_1(x, y) = \left(\sum_{i=1}^n (x_i - y_i)^2\right)^{1/2}$
$d_2(x, y) = \max_{1 \leq i \leq n} |x_i - y_i|$
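As a hedged illustration, the Euclidian metric and the maximum metric on $R^n$ can be computed side by side. The function names `d1` and `d2` follow the labels above, and the comparison $d_2 \leq d_1 \leq \sqrt{n}\, d_2$ printed at the end is a standard inequality between these two metrics, not a statement taken from the text.

```python
import math

def d1(x, y):
    """Euclidian metric: square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def d2(x, y):
    """Maximum metric: largest absolute coordinate difference."""
    return max(abs(xi - yi) for xi, yi in zip(x, y))

x, y = [0.0, 0.0], [3.0, 4.0]
n = len(x)
print(d1(x, y), d2(x, y))
print(d2(x, y) <= d1(x, y) <= math.sqrt(n) * d2(x, y))
```

Because each metric is bounded by a constant multiple of the other, a ball in one metric always contains a (smaller) ball in the other about the same center.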
Theorem 0.A.9: Let S be a subset in a topological space (X, T). Then the following
hold:
(i) S is closed if and only if S contains all its limit points (that is, if it contains
its derived set).
(ii) S is closed if and only if $S = \bar{S}$.
REMARK: If the derived set of S is empty (that is, if S has no limit point),
then S is always closed. Hence, for example, a set of only one element is a
closed set. A set consisting of a finite number of points is closed as is a finite
union of closed sets.
All these properties are obvious when X is a metric space with the usual
topology in the metric space.
EXAMPLE: An open ball $B_r(x_0) = \{x : x \in X, d(x, x_0) < r\}$ in a metric space
(X, d) is not a closed set. However, $\{x : x \in X, d(x, x_0) \leq r\}$, called a closed
ball, is a closed set and, in fact, is a closure of $B_r(x_0)$. The open ball $B_r(x_0)$
is the open kernel of this set $\bar{B}_r(x_0)$. The set $\{x : x \in X, d(x, x_0) = r\}$ is the
boundary of $B_r(x_0)$ and $\bar{B}_r(x_0)$. Any point in this boundary is a limit point
of the open ball $B_r(x_0)$ but is clearly not in $B_r(x_0)$.
REMARK : We started the discussion of topology with open sets. Open sets
satisfy the axioms of a topological space (that is, a finite intersection of open
sets is open and any union of open sets is open). A closed set is then defined to
be the complement of an open set. Then we showed that any closed set con-
tains all its limit points. A point x in a set X is a limit point of S, a subset of
X, if there exists a sequence of points other than x in S which converges to x.
Hence, in a closed set, every converging sequence of points in the set con-
verges to a point in the set. In other words, a closed set is a set which is
closed under the limit operation.
Definition: Let f be a function from a metric space $(X, d_1)$ into a metric space
$(Y, d_2)$. The function f is called continuous at a point $x_0$ in X if for any real number
$\epsilon > 0$, there exists a real number $\delta > 0$ such that $d_1(x, x_0) < \delta$ and $x \in X$ imply
$d_2(f(x), f(x_0)) < \epsilon$. The function f is called continuous in X if it is continuous at every point
of X.
It is easy to show that this definition is equivalent to either of the following
two statements (see Simmons [11], p. 76, for example).
(i) For each open ball $B_\epsilon(f(x_0))$ with center $f(x_0)$, there exists an open ball
$B_\delta(x_0)$ with center $x_0$ such that $f(B_\delta(x_0)) \subset B_\epsilon(f(x_0))$.
(ii) $x^q \to x_0$ implies $f(x^q) \to f(x_0)$.
Note that the above definition of continuity is strictly analogous to the one in $R^n$.
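For a concrete function on R the definition can be spot-checked numerically. The sketch below uses f(x) = 2x + 1, for which |f(x) - f(x_0)| = 2|x - x_0|, so that $\delta = \epsilon/2$ works. The function, the choice of $\delta$, and the sampling scheme are all assumptions made for this illustration; sampling finitely many points is not a proof of continuity.

```python
def f(x):
    return 2.0 * x + 1.0

def delta_for(eps):
    # For f(x) = 2x + 1 we have |f(x) - f(x0)| = 2|x - x0|, so eps/2 suffices.
    return eps / 2.0

def check(x0, eps, samples=1000):
    """Sample points with |x - x0| < delta and verify |f(x) - f(x0)| < eps."""
    d = delta_for(eps)
    for k in range(1, samples):
        x = x0 - d + 2.0 * d * k / samples  # strictly inside (x0 - d, x0 + d)
        if abs(x - x0) < d and not abs(f(x) - f(x0)) < eps:
            return False
    return True

print(check(1.0, 0.1))
print(check(-3.0, 1e-3))
```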
We have the following important theorem:
Theorem 0.A.10: Let f be a function from a metric space $(X, d_1)$ into a metric
space $(Y, d_2)$. The function f is continuous in X if and only if $f^{-1}(V)$ is open in X
whenever V is open in Y.
Theorem 0.A.11:
(i) A continuous function of a continuous function is also continuous.
(ii) The Cartesian product of continuous functions is also continuous (that is, let
$f_i$, $i = 1, \ldots, m$ be continuous functions from S into $T_i$; then a function from S
into $\prod_{i=1}^m T_i$ defined by $f(x) = [f_1(x), \ldots, f_m(x)]$ is also continuous).
PROOF: Statement (i) follows directly from the definition of continuity. For
(ii) and (iii), see Kelley [7], p. 91, for example.
REMARK: For the usual Euclidian topology, the proofs of (ii) and (iii) are
straightforward (see Rudin [10] , pp. 75-76, for example). However, the
ment (iii) says that every projection of a continuous function is also continuous.
The identity transformation f(x) = x on $R^n$ is continuous; hence
$f_i(x) = x_i$, $i = 1, 2, \ldots, n$ are also continuous functions of x.
Theorem 0.A.11 holds for any continuous function from a metric (or topological)
space into a metric (or topological) space. Suppose now that the range of
the function is in a linear space. We can then talk meaningfully about such things
as f + g, $\alpha f$, and so on. In particular, we consider the properties of a continuous
function whose range is in $R^n$ (or R). Then we can show the following. [See footnote 24.]
Theorem 0.A.12:
(i) Let $f_i: X \to R^n$ and $\alpha_i: X \to R$, $i = 1, 2, \ldots, m$, where X is a metric space, be
continuous functions; then $f = \sum_{i=1}^m \alpha_i(x) f_i(x)$ is also continuous in X.
(ii) Let $f_i$ be real-valued continuous functions on a metric space X ($i = 1, 2, \ldots, m$).
Then $\prod_{i=1}^m f_i$ is also continuous. [See footnote 25.] If $f: X \to R$ is continuous in X, 1/f is also
continuous for all $x \in X$ with $f(x) \neq 0$.
(iii) Let $f_i$ be continuous functions on a metric space X ($i = 1, 2, \ldots, m$). Then
$\max_i \{f_i(x)\}$ and $\min_i \{f_i(x)\}$ are also continuous on X.
REMARK: Two corollaries of the above theorem are (1) every polynomial
is a continuous function, and (2) the Cobb-Douglas function, $\prod_{i=1}^n x_i^{\alpha_i}$
($= x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}$), $0 < \alpha_i < 1$ for all i and $\sum_{i=1}^n \alpha_i = 1$, is a continuous
function.
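The Cobb-Douglas function is a finite product of the continuous functions $x_i^{\alpha_i}$, which is how statement (ii) of the theorem applies. A minimal sketch follows; the name `cobb_douglas`, the validation of the exponents, and the sample inputs are assumptions for the example.

```python
def cobb_douglas(x, alpha):
    """Product of x_i ** alpha_i, with 0 < alpha_i < 1 and the alpha_i summing to 1."""
    if not all(0 < a < 1 for a in alpha) or abs(sum(alpha) - 1.0) > 1e-12:
        raise ValueError("exponents must lie in (0, 1) and sum to 1")
    result = 1.0
    for xi, ai in zip(x, alpha):
        result *= xi ** ai
    return result

x = [4.0, 9.0]
alpha = [0.5, 0.5]
print(cobb_douglas(x, alpha))

# Because the exponents sum to one, doubling every input doubles the value
# (up to floating-point rounding).
doubled = cobb_douglas([2.0 * xi for xi in x], alpha)
print(abs(doubled - 2.0 * cobb_douglas(x, alpha)) < 1e-9)
```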
We may note the following important theorem, the proof of which can be
done by using the concept of a continuous function.
Theorem 0.A.14: Let (X, d) be a metric space with the usual topology defined.
Then we have the following:
(i) The set S is a compact subset of X if and only if every infinite subset of S has
a limit point (this is known as the Bolzano-Weierstrass property).
(ii) The set S is compact if and only if every sequence in S has a convergent sub-
sequence and its limit is in S (this is known as sequential compactness).
is, there is a way to generate T such that this is possible. Thus generated, T is the
product topology mentioned before. We do not discuss how to generate the product
topology, but we will state the result of its construction, known as the Tychonoff
theorem, which is probably the most important theorem in general topology.
Theorem 0.A.15 (Tychonoff): The product of any nonempty class of compact sets is
compact with the product topology.
The proof of this theorem is not easy, and, in fact, many of the past proofs are
known to be wrong. The proof requires use of Zorn's lemma. A consequence of this
theorem is the classical Heine-Borel theorem.
This theorem immediately shows the following examples of compact sets in $R^n$: (1)
a closed ball, (2) the boundary of a closed ball, and (3) the set defined by
$\{x : x \in R^n, x_i \geq 0, i = 1, 2, \ldots, n, \sum_{i=1}^n x_i \leq 1\}$
PROOF: For (i) and (ii), see Simmons [11], p. 111, for example. Statement
(iii) follows immediately from (ii), and (iv) follows immediately from the
definition of compactness.
Statement (ii) of the above theorem has an important corollary, known as
the Weierstrass theorem.
(i) The space X is said to be a $T_1$-space if $x, x' \in X$, and $x \neq x'$ imply that there
exist $V, V' \in T$ with $x \in V$ and $x' \in V'$, such that $x \notin V'$ and $x' \notin V$.
(ii) The space X is said to be a $T_2$-space, or Hausdorff space, if $x, x' \in X$, and
$x \neq x'$ imply that there exist $V, V' \in T$, with $x \in V$ and $x' \in V'$ such that
$V \cap V' = \emptyset$.
(iii) The space X is said to be a normal space if whenever U and U' are two disjoint
closed sets in X, then there exist V and V' in T with $U \subset V$, $U' \subset V'$ such that
$V \cap V' = \emptyset$. The space X is said to be a $T_4$-space if it is normal and $T_1$.
REMARK: In addition to $T_1$-, $T_2$-, and $T_4$-spaces, $T_0$-, $T_3$-, and $T_5$-spaces,
and so forth, are defined and discussed in general topology. Clearly
every $T_4$-space is a $T_2$-space and every $T_2$-space is a $T_1$-space. Converses
of these statements do not necessarily hold. Note also that any set can be a
normal space under the discrete topology.
The following theorem is an easy consequence of the above definition.
Theorem 0.A.19:
(i) A topological space X is a $T_1$-space if and only if each point in X, considered as a
set, is a closed set in X.
(ii) Every compact subset of a Hausdorff space is closed.
(iii) Every compact Hausdorff space is a $T_4$-space.
(iv) Every metric space is a $T_4$-space (hence a Hausdorff space).
(v) Let $\{x^q\}$ be a sequence in a Hausdorff space. If $\{x^q\}$ is convergent, then it has
a unique limit. [See footnote 27.]
(vi) The Cartesian product of any nonempty class of Hausdorff spaces is also a
Hausdorff space.
PROOF: See, for example, Simmons [11], pp. 130-134; also Berge [1], IV.5
and IV.6; Wilansky [12], 9.1; Kelley [7], pp. 56-57, pp. 112-113.
REMARK: Statement (v) implies that the proofs of theorems which involve
the limit of a sequence would usually require that the relevant set be a
Hausdorff space. Statement (iv) says that in a metric space we do not have
to worry about this. [See footnote 28.]
A set X is a topological space if it is equipped with a topology (say, T). What
about a subset S of X? We may construct a topology in S (in a natural way) so
that S is a topological space also.
Theorem 0.A.20: Let (X, T) be a topological space and let (S, t) be a subspace of
(X, T). Let $A \subset S$. Then the following hold.
(i) The set A is closed in (S, t) if and only if $A = B \cap S$ for some closed set B in
(X, T).
(ii) A point $x_0$ in X is a limit point of A with respect to t if and only if it is a limit
point of A with respect to T.
FOOTNOTES
1. The basic mathematics, which will be useful for the later sections and chapters and
for the reader's further study in economic theory, are sketched here. No prerequisite
knowledge is necessary to read this section. The reader, if he so wishes, may restrict
his attention to the usual "Euclidian space," or $R^n$. However, it should also be
noted that special care is taken not to misguide readers into thinking that our
world is always Euclidian. Consequently, this section becomes more than a mere
exposition of the mathematics necessary for later sections of this book. This approach
to mathematical preliminaries will be useful for readers who are serious about further
study and research in modern economic theory. Unlike the remainder of the book,
most theorems here are stated without proofs in order that the reader can grasp
the basic mathematical concepts and ideas without being led astray by complicated
proofs. For those readers who wish to see the proofs, references are given from
time to time.
2. For a more detailed exposition, see, for example, Kolmogorov and Fomin [8],
chap. 1; Rudin [10], chap. 1 (also pp. 21-27); Nikaido [9], secs. 6-8; Berge [1],
11.1; and Simmons [11], chap. 1. For a more complete exposition of set theory,
see Halmos [6], for example.
3. When the number of elements of a set is finite, it is often called a finite set. It is
called an infinite set if it is not finite. For example, R is an infinite set. A set is
called countably infinite if there is a one-to-one mapping between the set and the set
of all positive integers. (The phrase "one-to-one mapping" will be explained shortly.)
A set which is either finite or countably infinite is called countable; and a set which
is not countable is called uncountable. Then R is uncountable.
4. Let $S \subset X$; then f(S) is called the image of S under f. When Y = f(X), f is said
to be onto.
5. It should be noted, however, that in many treatments in the literature, "function"
usually refers to a single-valued function.
6. For a more detailed exposition, see, for example, Kolmogorov and Fomin [8],
secs. 8 and 21; Berge [1], IV.1, VII.2; Halmos [5], secs. 1-4; Simmons [11], secs.
9, 14, and 15. In this subsection, concepts such as "linear space," "inner product"
(space), "metric space," "norm," and "normed linear space" will be discussed.
The reader will realize that these concepts, although they are abstracted from $R^n$,
have much broader scope than $R^n$.
7. See Halmos [5], secs. 5-8 and 32-38; Nikaido [9], secs. 9 and 10; Wilansky
[12], 2.1-2.4.
8. This is certainly the case, if X is a linear space as is assumed in the above definition.
9. The notation is justified, for, after all, the multiplication of f and g is defined as
the composite function of f and g.
10. Strictly speaking, $A \cdot B$ may have to be denoted as $A \circ B$. However, this is too
pedantic. In fact, following the usual convention, we often denote it simply as
AB, unless it is confusing.
11. For a more detailed exposition see, for example, Fenchel [3], 1.1, 1.2, 11.1, and
11.2; Berge [1], VII.4; Berge and Ghouila-Houri [2], 1.4 and 1.5; Nikaido [9],
sec. 27; or Fleming [4], 1.4. The proofs of the theorems are fairly easy. The reader
can enhance his understanding of the content of this subsection by trying to prove
these theorems by himself.
12. The proof of statement (i) is easy if we utilize a later theorem, Theorem 0.A.7.
13. The corresponding concept to K(S) is convex hull, which is defined as the smallest
convex set containing a given set S. Denoting this by C(S), we can easily prove that
C(S) can be written as $C(S) = \{\sum_{i=1}^m \alpha_i x^i : x^i \in S, \alpha_i \in R, \alpha_i \geq 0, i = 1, 2, \ldots, m, \sum_{i=1}^m \alpha_i = 1\}$,
where m and the choice of $x^i$ and $\alpha_i$ are arbitrary. Note the
difference between C(S) and K(S).
14. Corresponding to this concept, the convex hull of a finite number of points in X
is called a convex polyhedron, or a convex polytope.
15. The material here is standard in general topology, and many textbooks are available
for those who wish to see the proofs of the theorems in this subsection and to
study this topic further. See, for example, Simmons [11]; Kelley [7]; and Berge
[1]. Kelley [7] is a standard textbook on this topic; however, Simmons [11] is
easier to read than Kelley [7]. Again, most of the proofs are omitted so that the
reader can grasp the basic ideas without being led astray in the "jungles" of the
proofs.
16. It is important to notice that the concept of limit point becomes concrete only
when the topology of the space is specified. In other words, whether a particular
point is a limit point or not depends on the topology. Let X be the set of real numbers.
With its usual topology the open interval (0, 1) is an open set and every point of
the closed interval [0, 1] is a limit point of (0, 1). However, if the discrete topology
is chosen for X, then no subset of X has a limit point.
17. For example, the closed interval [ a, b] in R is the closure of (a, b), (a, b] , and [a, b)
in R with its usual topology.
18. The reader of this book must have encountered the term "sequence" some time
earlier in his study of mathematics. A rigorous definition is as follows. A sequence
in X is a function defined on the set of all positive integers and whose range is
included in X. If the range of this function is a set of real numbers, then it is called
a sequence of real numbers. In general, however, the range can be any set. A sequence
is usually denoted by $\{x^1, x^2, \ldots\}$ or, in short, $\{x^q\}$, where $x^1, x^2, \ldots, x^q, \ldots$ are
the images of the function (and are called the values of the sequence, or the terms
of the sequence). If $x^q \in S$ for all q, then $\{x^q\}$ is said to be a sequence in (set) S.
19. Given a sequence $\{x^q\}$, consider a sequence $\{q_s\}$, where $q_1 < q_2 < \cdots < q_s < \cdots$.
Then the sequence $\{x^{q_s}\}$ is called a subsequence of $\{x^q\}$.
20. This remark with respect to Theorem 0.A.10 also means that the continuity of a
particular function depends on the topology specified in the space.
21. If $x_0 \cdot a < \alpha$, then for a sufficiently large q, we have $x^q \cdot a < \alpha$, for $x \cdot a$ is a continuous
function. This contradicts the assumption.
22. Similarly, we can also prove that (1) if x9 --xo and xq a < a for all q, then x0. a < a,
and (2) if xq->,r0, yq->yo with x9- a < xq yq for all q, then xo a < The
propositions in the present remark with this footnote are often utilized in economic
theory (for example, consider a as a price vector).
23. Again the basic motivation here is found in $R^n$. The set I of all the open intervals
(a, b) in R, under the usual topology of R, is called the open base of R in the sense
that every open set of R can be expressed as a union of open intervals. In other
words, every open set of R can be generated from I. Then define open cube in
$R^n$ by $\{(x_1, x_2, \ldots, x_n) : a_i < x_i < b_i, a_i, b_i, x_i \in R, i = 1, 2, \ldots, n\}$. We can prove
that the set of open cubes is an open base for $R^n$, that is, it will generate every open
set of $R^n$. In other words, we produced a topology for $R^n$ starting from that of R.
This idea of generating a product topology for $R^n$ is used for the general case.
24. For the proof, see any standard textbook on elementary analysis, or try to prove
it by yourself.
25. $\prod_{i=1}^m f_i$ denotes $f_1 f_2 \cdots f_m$.
26. A set S in $R^n$ (or any metric space) is said to be bounded if there exists an open ball
with a finite radius which contains S.
27. As we remarked earlier, we can define a limit of a sequence in an arbitrary topological
space (X, T); that is, $x_0$ is a limit of sequence $\{x^q\}$ if for each open set V containing $x_0$
there exists a $\bar{q}$ such that $q \geq \bar{q}$ implies $x^q \in V$. However, as we cautioned earlier,
such a limit may not be unique: for example, consider $T = \{X, \emptyset\}$ (indiscrete
topology); then any sequence converges to every point of X. A remarkable feature
of the Hausdorff space is that if a limit exists it is always unique.
28. In this section, we induced the concept of a topology from a metric and thus observed
that a topology is closely related to the concept of the "limit of a sequence." We may
reverse the problem; that is, given a topological space (X, T), what is the situation in
which the topology can be described in terms of sequences alone? This question then
leads us to the Moore-Smith convergence theory in terms of "directed sets" and
"nets." We do not go into this discussion (see Kelley [7], chap. 2). In any case, it
turns out that the most satisfactory situation in which a topology can be described by
sequences alone is the case of the "first axiom of countability." A topological space
(X, T) is said to satisfy the first axiom of countability, if, for each point x in X, there
exists a countable class of open sets such that every open set containing x is a union of
sets in this class. In short, (X, T) satisfies the "first axiom of countability" if it has a
countable open base at each of its points. An open base is a class of open sets such that
every open set is a union of sets in this class (see Kelley [7] , p. 50; Simmons [ 11] ,
pp. 99-100). The first axiom of countability makes the following statement meaningful
for general topological spaces. "A point $x_0$ is a limit point of a set S if and only if
there exists a sequence in $S \setminus \{x_0\}$ which converges to $x_0$" (see Kelley [7], theorem 8,
p. 72, and problem B, p. 76). It is known that every metric space satisfies the first
axiom of countability (for example, see Kelley [7], theorem 11, p. 120). Hence, when
we induced the concept of topology from a metric, the first axiom of countability had
already crept into our discussion. In other words, the metric space with its induced
topology is a "nice" topological space in terms of sequences, for it satisfies the first
axiom of countability as well as being Hausdorff.
29. In the remainder of this book, we usually assume that X is $R^n$. Hence the statement "A
is open" will mean, unless otherwise specified, that A is open in $R^n$ with its usual
topology.
REFERENCES
1. Berge, C., Topological Spaces, New York, Macmillan, 1963 (French original, 1959).
2. Berge, C., and Ghouila-Houri, A., Programming, Games and Transportation Networks,
New York, Wiley, 1965 (French original, 1962).
3. Fenchel, W., Convex Cones, Sets, and Functions (hectographed), Princeton, N.J.,
Princeton University Press, 1953.
4. Fleming, W. H., Functions of Several Variables, Reading, Mass., Addison-Wesley,
1965.
5. Halmos, P. R., Finite Dimensional Vector Spaces, 2nd ed., Princeton, N.J., Van
Nostrand, 1958.
6. Halmos, P. R., Naive Set Theory, Princeton, N.J., Van Nostrand, 1960.
7. Kelley, J. L., General Topology, New York, Van Nostrand, 1955.
8. Kolmogorov, A. N., and Fomin, S. V., Functional Analysis, Vol. 1, Rochester, N.Y.,
Grayrock, 1957 (Russian original, 1954).
9. Nikaido, H., Introduction to Sets and Mappings in Modern Economics, tr. K. Sato,
Amsterdam, North-Holland, 1970 (Japanese original, Tokyo, 1960).
10. Rudin, W., Principles of Mathematical Analysis, 2nd ed., New York, McGraw-Hill,
1964.
11. Simmons, G. F., Introduction to Topology and Modern Analysis, New York, McGraw-
Hill, 1963.
12. Wilansky, A., Functional Analysis, New York, Blaisdell, 1964.
Section B
SEPARATION THEOREMS
Definition: Let $p \in R^n$ with $p \neq 0$, $\|p\| < \infty$, and $\alpha \in R$. The set H defined by
$H = \{x : p \cdot x = \alpha, x \in R^n\}$ is called a hyperplane in $R^n$ with normal p.
REMARK: If n = 3, H is a plane; and if n = 2, H is a straight line.
REMARK: Suppose that there are two points, x* and y*, in H. Then by
definition $p \cdot x^* = \alpha$ and $p \cdot y^* = \alpha$. Hence $p \cdot (x^* - y^*) = 0$. In other words,
vector p is orthogonal to the line segment $(x^* - y^*)$, or to H. [See footnote 5.]
For the two-dimensional case (n = 2), we may illustrate this as in Figure 0.8.
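The orthogonality of the normal p to any segment lying in H can be verified on a small example in $R^2$. The values of `p`, `alpha`, and the two points below are arbitrary choices for illustration, and `on_hyperplane` is a hypothetical helper name.

```python
def dot(p, x):
    return sum(pi * xi for pi, xi in zip(p, x))

def on_hyperplane(p, alpha, x, tol=1e-12):
    """True if p . x equals alpha (up to a small tolerance)."""
    return abs(dot(p, x) - alpha) <= tol

p, alpha = [1.0, 1.0], 1.0   # the line x_1 + x_2 = 1 in R^2
x_star = [1.0, 0.0]
y_star = [0.0, 1.0]

print(on_hyperplane(p, alpha, x_star), on_hyperplane(p, alpha, y_star))

# p is orthogonal to the difference of any two points of H.
diff = [xs - ys for xs, ys in zip(x_star, y_star)]
print(dot(p, diff))
```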
$p \cdot x \geq \alpha$ for all $x \in X$
and
$p \cdot x \leq \alpha$ for all $x \in Y$
If we have strict inequalities in the above two inequalities, that is, $p \cdot x > \alpha$ for all
$x \in X$ and $p \cdot x < \alpha$ for all $x \in Y$, then we say X and Y are strictly separated by
H.
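For finite sets of points the two conditions can be checked directly. The sketch below is only an illustration: the function `separates`, its `strict` flag, and the sample sets are assumptions, and finite point sets stand in for the general sets X and Y of the definition.

```python
def dot(p, x):
    return sum(pi * xi for pi, xi in zip(p, x))

def separates(p, alpha, X, Y, strict=False):
    """Check whether the hyperplane p . x = alpha separates the point sets X and Y."""
    if strict:
        return (all(dot(p, x) > alpha for x in X)
                and all(dot(p, y) < alpha for y in Y))
    return (all(dot(p, x) >= alpha for x in X)
            and all(dot(p, y) <= alpha for y in Y))

X = [[2, 2], [3, 1]]
Y = [[0, 0], [1, 0]]
p = [1, 1]

print(separates(p, 2, X, Y, strict=True))   # strict separation with alpha = 2
print(separates(p, 1, X, Y))                # separation with alpha = 1
print(separates(p, 1, X, Y, strict=True))   # not strict: (1, 0) lies on the hyperplane
```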
REMARK: The above definition obviously holds even if X and/or Y is a set
consisting of only one point.
REMARK: A hyperplane H in $R^n$ divides $R^n$ into two "half spaces." In
particular, $H = \{x : p \cdot x = \alpha, x \in R^n\}$ determines the following two closed
half spaces.
$\{x : p \cdot x \geq \alpha, x \in R^n\}$ and $\{x : p \cdot x \leq \alpha, x \in R^n\}$
It can be seen readily that a closed half space is convex as well as closed.
A hyperplane itself is clearly closed and convex.
Theorem 0.B.1: Let X be a nonempty closed convex set in $R^n$. Let $x_0 \notin X$. Then
the following are true.
(i) There exists a point $a \in X$ such that $d(x_0, a) \leq d(x_0, x)$ for all $x \in X$, and
$d(x_0, a) > 0$.
(ii) There exists a $p \in R^n$, $p \neq 0$, $\|p\| < \infty$, and an $\alpha \in R$ such that
$p \cdot x \geq \alpha$ for all $x \in X$
and
$p \cdot x_0 < \alpha$
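PROOF:
(i) Let $\bar{B}(x_0)$ be a closed ball with center at $x_0$ and meeting X [that is,
$\bar{B}(x_0) \cap X \neq \emptyset$]. Write $A \equiv \bar{B}(x_0) \cap X$. The set A is nonempty, closed
and bounded (hence compact). Since A is compact and the distance
function is continuous, $d(x_0, x)$ achieves its minimum in A as a result
of Weierstrass's theorem. That is, there exists an $a \in A$ such that $d(x_0, a) \leq d(x_0, x)$
for all $x \in A$. Hence a fortiori $d(x_0, a) \leq d(x_0, x)$ for all $x \in X$.
Since $x_0 \notin X$ and $a \in X$, $d(x_0, a) > 0$.
(ii) Let $p \equiv a - x_0$ and $\alpha \equiv p \cdot a$. Note first that $p \cdot x_0 = (a - x_0) \cdot x_0 =
(a - x_0) \cdot (x_0 - a) + (a - x_0) \cdot a = -(a - x_0) \cdot (a - x_0) + (a - x_0) \cdot a =
-\|p\|^2 + \alpha < \alpha$, where $0 < \|p\| < \infty$. Let $x \in X$ (arbitrary point).
Since X is convex and $a \in X$, $x(t) \in X$, where $x(t) \equiv (1 - t)a + tx$,
$0 < t < 1$. Then $d(x_0, a) \leq d(x_0, x(t))$ by (i). In other words:
$\|a - x_0\|^2 \leq \|x(t) - x_0\|^2 = \|(1 - t)a + tx - x_0\|^2 = \|(1 - t)(a - x_0) + t(x - x_0)\|^2$
$= (1 - t)^2 \|a - x_0\|^2 + 2t(1 - t)(a - x_0) \cdot (x - x_0) + t^2 \|x - x_0\|^2$
Hence $0 \leq t(t - 2)\|a - x_0\|^2 + 2t(1 - t)(a - x_0) \cdot (x - x_0) + t^2 \|x - x_0\|^2$.
Dividing by t > 0 and letting $t \to 0$, we obtain $0 \leq -2\|a - x_0\|^2 + 2(a - x_0) \cdot (x - x_0)$,
or
$0 \geq (a - x_0) \cdot (a - x_0) - (a - x_0) \cdot (x - x_0) = (a - x_0) \cdot (a - x)$
That is, $(a - x_0) \cdot x \geq (a - x_0) \cdot a$, or $p \cdot x \geq p \cdot a = \alpha$ for all $x \in X$.
(Q.E.D.)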
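The construction in the proof can be traced numerically on a simple closed convex set. The sketch below takes X to be the unit square $[0, 1]^2$ in $R^2$, for which the nearest point a to an outside point $x_0$ is obtained by clamping each coordinate (a standard fact about boxes, assumed here rather than derived). It then forms $p = a - x_0$ and $\alpha = p \cdot a$ as in part (ii). All names and the chosen $x_0$ are illustrative.

```python
def dot(p, x):
    return sum(pi * xi for pi, xi in zip(p, x))

def project_onto_box(x0, lo, hi):
    """Nearest point of the box [lo_1, hi_1] x ... x [lo_n, hi_n] to x0."""
    return [min(max(v, l), h) for v, l, h in zip(x0, lo, hi)]

def separating_hyperplane(x0, a):
    """p = a - x0 and alpha = p . a, as in part (ii) of the proof."""
    p = [ai - xi for ai, xi in zip(a, x0)]
    return p, dot(p, a)

x0 = [2.0, 3.0]                                   # a point outside the unit square
a = project_onto_box(x0, [0.0, 0.0], [1.0, 1.0])  # closest point of X to x0
p, alpha = separating_hyperplane(x0, a)

corners = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(a, p, alpha)
print(dot(p, x0) < alpha)                         # x0 lies strictly on one side
print(all(dot(p, c) >= alpha for c in corners))   # X lies on the other side
```

Checking the corners is enough in this example because a linear function on the square attains its minimum at a corner.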
REMARK: Note that the convexity of X is used only in (ii) of the above
proof.
REMARK: The above proof is essentially due to von Neumann and Morgenstern
[11]. Debreu [4] offers the following alternative argument for the
second part (ii) of the above proof, which avoids the use of the limit process
(that is, $t \to 0$). His argument is rather intuitive (see Figure 0.10). A rigorous
proof along this line can be seen in Berge [1], p. 162, or Berge and Ghouila-Houri
[2], pp. 53-54. The proof is done by contradiction. That is, suppose
that there is a point x of X which is strictly on the same side of H as $x_0$.
Consider the point x(t) on the line segment ax such that $x_0 x(t)$ is orthogonal
to ax. Since $d(x_0, x) \geq d(x_0, a)$, the point x(t) is between a and x. Thus
$x(t) \in X$ and $d(x_0, x(t)) < d(x_0, a)$, which contradicts the choice of a.
boundary point a, then the tangent (hyper-) plane at the point gives such
a supporting hyperplane. In this case there is only one supporting hyper-
plane passing through the given point a. However, if the (hyper-) curve is
not smooth, then there can be many supporting hyperplanes passing through
the given point. These two cases are illustrated in Figure 0.11. It is important
to note that the above tangent hyperplane conceptually links the separation
theorems to calculus. The power of the separation theorem is that the
boundary (hyper-) curve does not have to be smooth (differentiable) at a,
and that it is more direct and set-theoretic.
Theorem 0.B.1 is sometimes stated in the following form.
Corollary: Let $X$ be a nonempty closed convex set in $R^n$ not containing the origin.
Then there exists a $p \in R^n$, $p \ne 0$, $\|p\| < \infty$, and an $\alpha \in R$, $\alpha > 0$ such that
$$p \cdot x \ge \alpha \quad \text{for all } x \in X$$
and this inequality can be made strict.
PROOF: In Theorem 0.B.1, let $x_0$ be the origin. Then, as a result of the theorem,
there exist $p \ne 0$ and $\alpha$ such that $p \cdot x \ge \alpha$ for all $x \in X$, where $p$ and $\alpha$ are
defined as $p = a - x_0 = a$ and $\alpha = p \cdot a$. By this definition, $\alpha = \|a\|^2 > 0$.
(Q.E.D.)
REMARK: Obviously, this corollary is really equivalent to Theorem 1, for
the choice of the origin can be arbitrary. The inequality in the statement
of the corollary can be made strict by choosing a point strictly in between
a and the origin (instead of a).
In the above theorem, X is assumed to be a closed set. In fact, we can
relax this assumption and we can obtain the following theorem.
Theorem 0.B.2: Let $X$ be a nonempty convex set in $R^n$ (not necessarily closed).
Let $x_0$ be a point in $R^n$ which is not in $X$. Then there exists $p \in R^n$, $p \ne 0$, $\|p\| < \infty$
such that³
$$p \cdot x \ge p \cdot x_0 \quad \text{for all } x \in X.$$
PROOF:
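Theorem 0.B.3 (Minkowski): Let $X$ and $Y$ be nonempty convex sets in $R^n$ (not
necessarily closed) such that $X \cap Y = \emptyset$. Then there exist $p \in R^n$, $p \ne 0$, $\|p\| < \infty$,
and $\alpha \in R$ such that $p \cdot x \ge \alpha$ for all $x \in X$ and $p \cdot y \le \alpha$ for all $y \in Y$.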
Lemma: Let $K$ be a cone, with the vertex at the origin, in $R^n$, and let $p$ be a
given point in $R^n$. If $p \cdot x$ is bounded from below for all $x \in K$, then $p \cdot x \ge 0$ for all
$x \in K$.
PROOF: By assumption, there exists an $\alpha \in R$ such that $p \cdot x \ge \alpha$ for all $x \in K$.
Since $K$ is a cone with the vertex at the origin, $x \in K$ implies $\theta x \in K$ for all
$\theta \ge 0$. Hence $p \cdot (\theta x) \ge \alpha$ or $p \cdot x \ge \alpha / \theta$ for all $x \in K$ and $\theta > 0$. Taking the
limit as $\theta \to \infty$ yields $p \cdot x \ge 0$. (Q.E.D.)
We can now prove the following theorem.
42 PRELIMINARIES
Theorem 0.B.4 (Minkowski-Farkas lemma): Let $a^1, a^2, \ldots, a^m$ and $b \ne 0$ be points
in $R^n$. Suppose that $b \cdot x \ge 0$ for all $x$ such that $a^i \cdot x \ge 0$, $i = 1, 2, \ldots, m$. Then
there exist coefficients $\lambda_1, \lambda_2, \ldots, \lambda_m$, all $\ge 0$ and not vanishing simultaneously,
such that $b = \sum_{i=1}^{m} \lambda_i a^i$.
PROOF: Let $K$ be a convex polyhedral cone generated by $a^1, a^2, \ldots, a^m$.
Then $K$ is a closed set. We want to show that $b \in K$. Suppose $b \notin K$. Then
$K$ is a nonempty, closed, convex set which is disjoint from $b$. Hence from
Theorem 0.B.1, there exist $p \in R^n$, $p \ne 0$, and $\alpha \in R$ such that
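REMARK: The converse of the above theorem is also true. If $a^1, a^2, \ldots, a^m$
and $b \ne 0$ are points in $R^n$, and if there exist coefficients $\lambda_1, \lambda_2, \ldots, \lambda_m$,
all $\ge 0$ (not vanishing simultaneously), such that $b = \sum_{i=1}^{m} \lambda_i a^i$, then $b \cdot x \ge 0$
for $x$ such that $a^i \cdot x \ge 0$, $i = 1, 2, \ldots, m$.
PROOF: Suppose $\lambda_1, \lambda_2, \ldots, \lambda_m$, all $\ge 0$, are such coefficients that
$b = \sum_{i=1}^{m} \lambda_i a^i$; then $b \cdot x = (\sum_{i=1}^{m} \lambda_i a^i) \cdot x = \sum_{i=1}^{m} \lambda_i (a^i \cdot x) \ge 0$. (Q. E. D.)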
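The converse can be checked on a small example. The sketch below assumes two hypothetical vectors $a^1 = (1, 0)$ and $a^2 = (1, 1)$ in $R^2$ with weights $\lambda = (2, 3)$, and tests the conclusion only on an integer grid, so it illustrates the statement rather than proving it.

```python
# Check of the converse on an integer grid. The vectors a^1, a^2 and the
# weights lam are hypothetical values chosen for illustration.
a = [(1, 0), (1, 1)]          # a^1, a^2 in R^2
lam = (2, 3)                  # lambda_1, lambda_2 >= 0, not both zero

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# b = sum_i lambda_i a^i
b = tuple(sum(l * ai[k] for l, ai in zip(lam, a)) for k in range(2))

# Grid points x with a^i . x >= 0 for every i, then the test b . x >= 0.
grid = [(x1, x2) for x1 in range(-5, 6) for x2 in range(-5, 6)]
in_cone = [x for x in grid if all(dot(ai, x) >= 0 for ai in a)]
converse_holds = all(dot(b, x) >= 0 for x in in_cone)
```

With these values $b = (5, 3)$, and $b \cdot x = 2(a^1 \cdot x) + 3(a^2 \cdot x)$, which is the term-by-term argument of the proof.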
Owing to the above remark, the Minkowski-Farkas lemma can also be stated
in the following form.
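Theorem 0.B.5: Given points $a^i$, $i = 1, 2, \ldots, m$ and $b \ne 0$ in $R^n$, exactly one of
the following two alternatives holds.
(i) There exist $\lambda_i$, $i = 1, 2, \ldots, m$, all $\ge 0$ (not vanishing simultaneously), such that
$b = \sum_{i=1}^{m} \lambda_i a^i$, or
(ii) There exists an $x \in R^n$ such that $b \cdot x < 0$ and $a^i \cdot x \ge 0$, $i = 1, 2, \ldots, m$.
Defining an $m \times n$ matrix $A$ whose rows are $a^1, a^2, \ldots, a^m$, a column vector $x$
with elements $x_1, x_2, \ldots, x_n$,
and a row vector $\lambda \equiv (\lambda_1, \lambda_2, \ldots, \lambda_m)$, we can state the above alternatives
(i) and (ii) in the following forms:
(i) $b = \lambda \cdot A$ has a solution $\lambda \ge 0$, $\lambda \ne 0$.
(ii) $A \cdot x \ge 0$, $b \cdot x < 0$ has a solution $x$.
Theorem 0.B.4 can be restated as "$b \cdot x \ge 0$ for all $x$ such that $A \cdot x \ge 0$
implies that there exists a $\lambda \ge 0$ with $\lambda \ne 0$ such that $b = \lambda \cdot A$."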
REMARK: A geometric interpretation of Theorem 0.B.4 is as follows:⁵
(i) The inequality $a^i \cdot x \ge 0$ for all $i$ means that $x$ is in the cone POQ (the
shaded area).
(ii) The inequality $b \cdot x \ge 0$ means that $x$ is in the half space, determined
by the hyperplane $H \equiv \{x: b \cdot x = 0, x \in R^n\}$, which contains the point $b$.
(iii) The conclusion of the theorem is "$b \in K$," where $K \equiv \{y: y = \lambda \cdot A, \lambda \ge 0\}$.
(iv) If $b \in K$ (Case a), then we have $b \cdot x \ge 0$ for all $x$ such that $a^i \cdot x \ge 0$,
$i = 1, 2, \ldots, m$ (that is, the cone AOB is contained in the half space,
determined by $H$, which contains $b$).
(v) If $b \notin K$ (Case b), then we cannot have $b \cdot x \ge 0$ for all $x$ such that
$a^i \cdot x \ge 0$, $i = 1, 2, \ldots, m$. In other words, there exists a point, say
$x$, in the cone AOB, but not in the half space which contains $b$ (Figure
0.13, Case b).
[Figure 0.13. Case a: $b \in K$; Case b: $b \notin K$.]
FOOTNOTES
REFERENCES
1. Berge, C., Topological Spaces, New York, Macmillan, 1963 (French original, 1959),
esp. chap. VIII.
2. Berge, C., and Ghouila-Houri, A., Programming, Games and Transportation Net-
works, New York, Wiley, 1965 (French original, 1962).
3. Debreu, G., Theory of Value, New York, Wiley, 1959.
4. Debreu, G., "Separation Theorem for Convex Sets," in "Selected Topics in Eco-
ACTIVITY ANALYSIS AND PRODUCTION SET 45
Section C
ACTIVITY ANALYSIS
AND THE
GENERAL PRODUCTION SET
outputs held constant. Thus the efficient manager in this case defines the unique
surface with a given amount of inputs.
"Activity analysis" revolutionizes traditional production analysis by dis-
carding the above concepts of a production function and an "efficient manager."
Instead, it postulates the set of production processes available in a given economy.
(Here the word "economy" can mean a firm, a collection of firms, the entire
national economy, or the whole world.) This set is called a production set. An
element of this set is an n-tuple which describes the technological relation of the
input-output combination of one process of production. An element of the
production set is called a process or an activity. We may also call it a blueprint
to stress its technological character. There is no presupposition about the existence
of an "efficient manager," so that nothing in the beginning is specified about
what processes or blueprints in the production set should be adopted or discarded.
If one likes, one can include the managerial ability of Mr. A in the list of com-
modities. We assume that there are n "commodities" in the economy and each
commodity is qualitatively homogeneous. In general, a commodity is defined
here by a specification of all its physical characteristics, of its availability location,
and of its availability date. Hence, for example, flows of technically the same
commodity in two different locations represent two different commodities.'
Note that we may always regard different commodities as one commodity, if
this facilitates a sharper and deeper analysis of a particular problem.
The production process is described by an ordered n-tuple of these com-
modities. (The dimension n is usually assumed to be a finite, positive integer, but
can be infinity.) The production set is the collection of these n-tuples. The
following example is from Koopmans and Bausch [9], pp. 99-100. Here we
consider an economy with four commodities and two processes.

                           Process 1      Process 2
                           (Tanning)      (Shoemaking)
Commodity 1 (shoes)            0              1
Commodity 2 (leather)          1          [illegible]
Commodity 3 (hides)           -1              0
Commodity 4 (labor)       [illegible]     [illegible]

In each process inputs are represented by negative numbers and outputs are rep-
resented by positive numbers. Note also that there can be any number of proces-
ses for tanning or shoemaking. Moreover, each process can have more than one
positive entry. This is the case of joint production. For example, in a process which
produces cow hides, beef may also be produced.
The scope of activity analysis is not limited to statics. The convention of
dating commodities (which is due to Hicks [5]) extends the scope of activity
analysis to dynamics and capital theory, in which time is involved in an essential
manner. For example, consider the Akerman-Wicksell model of the durability
of capital. Assume that one unit of the capital good (an axe) whose durability is
$j$ days is produced by $l_j$ units of labor. Assume that $l_j$ men are used as input
the first day, leaving one unit of the axe for the second day. Assume that the
axe of durability j lasts for j days after it is built with the same efficiency and
suddenly "dies" at the end of the (j + 1)th day with zero scrap value. Then the
jth production process which produces the axe of durability j can be expressed
by the following vector:
$\lambda_j a^j \in Y$ for all $\lambda_j \ge 0$, $\lambda_j \in R$, where $a^j \equiv (a_{1j}, \ldots, a_{nj})^T$.
The vector $a^j$ may be referred to as the $j$th activity (or process) of $Y$ in its unit
level of operation. Here $a_{ij}$ denotes the amount of the $i$th good involved in one
unit operation of the $j$th activity; $\lambda_j$ signifies the activity level of the $j$th activity.
Now we impose the third postulate:
(A-3) (Finite number of basic activities) There exist a finite number of $a^j$'s such that
$Y$ is a convex polyhedral cone generated by these $a^j$'s. These $a^j$'s are called basic
activities.
In other words, a typical element $y$ in $Y$ can be expressed as a nonnegative linear
combination of $a^1, a^2, \ldots, a^m$, where $m$ is a finite positive integer. Owing to the
above postulates, the production set $Y$ in activity analysis can be written as
$Y = \{y: y = A \cdot \lambda, \lambda \ge 0\}$, where $A$ is an $n \times m$ matrix (with real-number entries)
formed by $[a^1, \ldots, a^m]$ and $\lambda$ is an $m$-vector whose $j$th element is $\lambda_j$.
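The representation $y = A \cdot \lambda$ can be made concrete with a small computation. The following sketch uses two hypothetical basic activities over three commodities (the numbers are illustrative and are not taken from the text) and checks the proportionality and additivity postulates on them.

```python
from fractions import Fraction as F

# Hypothetical basic activities (columns of A) over n = 3 commodities;
# negative entries are inputs, positive entries are outputs.
a1 = (F(1), F(-1), F(-1, 2))     # activity 1 at unit level
a2 = (F(0), F(2), F(-3))         # activity 2 at unit level
A_columns = [a1, a2]

def produce(levels):
    """y = A . lambda for activity levels lambda >= 0."""
    if any(l < 0 for l in levels):
        raise ValueError("activity levels must be nonnegative")
    n = len(A_columns[0])
    return tuple(sum(l * col[i] for l, col in zip(levels, A_columns)) for i in range(n))

y = produce((F(2), F(1)))            # 2 units of activity 1 plus 1 unit of activity 2
y_scaled = produce((F(6), F(3)))     # proportionality: tripling the levels triples y
y_sum = tuple(u + v for u, v in zip(produce((F(2), F(0))), produce((F(0), F(1)))))
```

Running the two activities jointly gives the same vector as adding their separate results, which is the sense in which the activities do not interact.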
It should be clear that the proportionality postulate means complete divis-
ibility of all the commodities and constant returns to scale and that the additivity
postulate means the independent action of each activity (no interactions among
activities): in Scitovsky's terminology, there are no ("technological") external
economies or diseconomies.
That Y is a convex polyhedral cone implicitly entails several other features.
Some important ones are the following:
Theorem 0.C.1: Let $Y = \{y: y = A \cdot \lambda, \lambda \ge 0\}$. $Y$ satisfies postulate (A-5) if and
only if there exists a $p > 0$, $p \in R^n$, $\|p\| < \infty$, such that $p \cdot y \le 0$ for all $y \in Y$.
PROOF:
(i) (Sufficiency) $y \ge 0$, $y \ne 0$, implies $p \cdot y > 0$ for any $p > 0$. Hence by
assumption $y \notin Y$.
(ii) (Necessity) Omitted (an interesting exercise for the use of the separation
theorems).⁷
Note that (vii-a) implies (i) and (ii) and that (vii-b) implies (ii). Statements
(vii-c) and (ii) together imply that if $y \in Y$, then $\alpha y \in Y$ for all $0 \le \alpha \le 1$; in
other words, nonincreasing returns to scale prevail (or increasing returns to
scale are ruled out). Note also that the convexity of $Y$ presupposes the
divisibility of all the goods involved.
The production set Y as described above indicates the technological pos-
sibilities in a given economy; hence it is free from resource limitations. In other
words, y E Y indicates how much output can be produced after specifying the
amounts of the inputs, and we do not ask whether these inputs are, in fact,
available in the economy. (Thus we called y E Y a "blueprint.") In this sense it
corresponds to the concept of the classical production function. However, we
can also take resource limitations into account. For example, $Y$ can indicate a
"truncated" convex polyhedral cone, such as $Y = \{y: y = A \cdot \lambda, \lambda \ge 0, \lambda \in R^m,$
and $y + z \ge 0\}$, where $z \ge 0$ denotes the resource limitation of the economy.
With the no land of Cockaigne postulate, such a set is no longer a convex poly-
hedral cone, although it is still convex. Note that the set is compact now. This
truncation can easily be illustrated by Figure 0.15.
REMARK: Strictly speaking, the use of the words "activity analysis" may
have to be confined to a study of production processes when the number of
basic activities is finite. In other words, Y must be confined to a convex
polyhedral cone or a "truncated" convex polyhedral cone. We will not
adopt this narrow definition. The revolutionary character of activity analysis
is not in a particular shape of Y. It is in the set-theoretic approach which
is more fundamental and powerful than the traditional smooth (differen-
tiable) production function approach. We now introduce the most important
concept in activity analysis.
PROOF: Suppose not. Then there exists a $z > 0$ (that is, $z \in \Omega^\circ$) such that
$z \in (Y - \hat{y})$. Hence $z + \hat{y} \in Y$. Thus $\hat{y}$ is not an efficient point of $Y$, which
is a contradiction. (Q.E.D.)
$$\sum_{j=1}^{m} a_{ij} \lambda_j + r_i \ge 0, \quad i = 1, \ldots, n, \qquad \lambda_j \ge 0, \quad j = 1, 2, \ldots, m.$$
Then the problem of finding $\lambda = (\lambda_1, \ldots, \lambda_m)$ which maximizes $p \cdot y$ where
$y = \sum_{j=1}^{m} a^j \lambda_j$, subject to the above constraints is a typical linear programming
problem, of which the computational method is well known and widely
used in practice (the "simplex method"). Hence activity analysis also has
practical and computational significance.
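A toy instance shows the shape of such a problem. The sketch below assumes resource constraints of the form $y + r \ge 0$, in line with the truncated cone described earlier, and uses hypothetical data (two activities, three commodities). An exhaustive search over a grid of activity levels stands in for the simplex method; it is a brute-force illustration, not the algorithm used in practice.

```python
from fractions import Fraction as F

# Hypothetical data: n = 3 commodities (one output, two resources), m = 2 activities.
activities = [(1, -1, -2), (1, -2, -1)]   # a^1, a^2
r = (0, 4, 4)                             # resource endowments r_i
p = (1, 0, 0)                             # prices: only the output is valued

def net_output(lam):
    """y = sum_j a^j lambda_j."""
    return tuple(sum(l * a[i] for l, a in zip(lam, activities)) for i in range(3))

def feasible(lam):
    """lambda >= 0 and y + r >= 0 (assumed form of the resource constraints)."""
    y = net_output(lam)
    return all(l >= 0 for l in lam) and all(yi + ri >= 0 for yi, ri in zip(y, r))

def value(lam):
    """Objective p . y."""
    return sum(pi * yi for pi, yi in zip(p, net_output(lam)))

# Exhaustive search over activity levels 0, 1/3, 2/3, ..., 4.
levels = [F(k, 3) for k in range(13)]
candidates = [(l1, l2) for l1 in levels for l2 in levels if feasible((l1, l2))]
best = max(candidates, key=value)
```

In this instance both resource constraints bind at the optimum $\lambda = (4/3, 4/3)$, where the output is $8/3$.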
REMARK: It is important to realize the basic features of the neoclassical
"smooth" production function approach in terms of activity analysis
terminology. These are essentially the following. (1) It deals with a produc-
tion set which cannot be generated from a finite number of activities (that
is, it is not a convex polyhedral cone); rather a continuum of vectors is
required to characterize the set. (2) The "efficient manager" is presupposed,
so that production always takes place at an efficient point, that is, on the
set of efficient points (called "production frontier"), which is nothing but
the set defined by the production function. (3) This set of efficient points
constitutes a differentiable function.
FOOTNOTES
we can have a hyperplane passing through the origin and bounding for M, as a result
of the separation theorem (recall Theorems O.B.1 and O.B.2). Thus there exists
an a E R", a >_ 0, such that a- p< 0 for all p E Y*. From this we can show a E Y
with a > 0, contradicting the hypothesis.]
5. The notation $Y \cap \Omega \subset \{0\}$ means that the intersection of $Y$ with $\Omega$ contains at most
0, the origin. That is, $Y \cap \Omega$ can be an empty set, as in the case in which $0 \notin Y$. If $Y$
is a convex polyhedral cone, then $0 \in Y$; hence (iv) is replaced by $Y \cap \Omega = \{0\}$
as in (A-4).
6. The notation $Y - \hat{y}$ denotes $Y - \{\hat{y}\} \equiv \{z: z = y - \hat{y}, y \in Y\}$.
7. The proof is not too difficult.
REFERENCES
1. Afriat, S., "Economic Transformation," Krannert Institute Paper, Purdue University,
no. 152, November 1966.
2. Baumol, W. J., "Activity Analysis in One Lesson," American Economic Review,
LXVIII, December 1958.
3. Debreu, G., Theory of Value, New York, Wiley, 1959, esp. chap. 2 and 3.
4. Dorfman, R., Samuelson, P. A., and Solow, R. M., Linear Programming and
Economic Analysis, New York, McGraw-Hill, 1958.
5. Hicks, J. R., Value and Capital, 2nd ed., London, Oxford University Press, 1946.
6. Hicks, J. R., "Linear Theory," Economic Journal, LXV, December 1960.
7. Koopmans, T. C., ed., Activity Analysis of Production and Allocation, New York,
Wiley, 1951, esp. chap. III (by Koopmans).
8. Koopmans, T. C., Three Essays on the State of Economic Science, New York,
McGraw-Hill, 1957, esp. pp. 66-104.
9. Koopmans, T. C., and Bausch, A. F., "Selected Topics in Economics Involving
Mathematical Reasoning," SIAM Review, 1, July 1959, esp. Topic 3.
10. Malinvaud, E., "Capital Accumulation and Efficient Allocation of Resources,"
Econometrica, 21, April 1953; also "Corrigendum," Econometrica, 30, July 1962.
11. von Neumann, J., "A Model of General Economic Equilibrium," Review of Economic
Studies, XIII, no. 1, 1945-1946. Translation from German original published in
Ergebnisse eines mathematischen Kolloquiums, no. 8, 1937.
12. Wicksell, K., Lectures on Political Economy, London, Routledge & Kegan Paul,
1936 (Swedish original, 3rd ed., 1928).
I
DEVELOPMENTS OF NONLINEAR PROGRAMMING
Section A
INTRODUCTION
are inactive (or ineffective) at $\hat{x}$. The constraints $g_1(x) \ge 0$ and $g_2(x) \ge 0$ are active
(or effective) at $\hat{x}$ in both Case a and Case b of Figure 1.3 in the sense that
$g_1(\hat{x}) = 0$ and $g_2(\hat{x}) = 0$. Note also that the boundary surfaces of the constraint
sets (the shaded areas of Case a and Case b of Figure 1.3) do not allow derivatives
at $\hat{x}$ (that is, the boundary curve is not "differentiable" at $\hat{x}$). In economics, one
of the constraints is often the nonnegativity constraint, such as $x \ge 0$, since economic
variables such as price and output are usually nonnegative. And, if $\hat{x}$ is a
solution of the problem, we typically have a situation such that $\hat{x} > 0$; that is, the
constraint $x \ge 0$ is ineffective at $\hat{x}$.
In the classical maximization problem due to Lagrange and Euler, we are
concerned with the case in which all the constraints are always effective [that is,
the problem is the one of maximizing $f(x)$ subject to $g_j(x) = 0$, $j = 1, 2, \ldots, m$].
Although this form of constraint is often very inconvenient in dealing with economic
problems, it is also true that there are situations in which we know that
the constraints are always effective. For example, one of the constraints may be a
definitional equation, which is not an inequality. Can we then handle the problem
with equality constraints within the above formulation of the nonlinear programming
problem? The answer is simply yes, for we can rewrite an equality constraint
$$g_j(x) = 0$$
as
$$g_j(x) \ge 0 \quad \text{and} \quad -g_j(x) \ge 0.$$
that the solution for this problem does not exist. Even if the constraint set is
nonempty, we may still have a situation in which the solution does not exist.
This integral constraint contains the assumption that the consumer can borrow or
lend any amount at the fixed rate of interest $r$. In any case, this is a problem of
choosing a function $x(t)$ from a set of (say, continuous) functions, $X$, defined over
the interval $[0, T]$ such as to maximize a real-valued function $I[x(t)]$ subject to
the constraints $g[x(t)] \le M$ and $x(t) \ge 0$. This is clearly similar, at least formally,
to the problem discussed above. In fact, there has been a considerable effort
recently to consider this kind of problem as a natural extension of nonlinear
programming theory to problems in linear spaces (not necessarily $R^n$).⁹ However,
in this chapter we restrict ourselves to $R^n$ or its subset $X$, as formulated above.
In this way we can treat the theory in a much simpler manner. We discuss the
question of programming in a linear space later in the book when we discuss
such topics as the calculus of variations and optimal control theory.¹⁰
Now let us come back to the usual nonlinear programming problem, that is,
the one of maximizing $f(x)$ subject to $g_j(x) \ge 0$, $j = 1, 2, \ldots, m$, and $x \in X$, where
$X$ is a subset of $R^n$, or in other words, maximizing $f(x)$ subject to $x \in C \equiv \{x:
x \in X, g_j(x) \ge 0, j = 1, 2, \ldots, m\}$. The following questions are the natural questions
involved in any nonlinear programming problem.
QUESTION 1: Is the set $C$ nonempty; that is, does there exist a feasible
point?
QUESTION 2: Does there exist a solution $\hat{x}$, a point which maximizes $f(x)$
subject to $x \in C$?
QUESTION 3: What are the characteristics of this optimum $\hat{x}$?
QUESTION 4: Is the solution $\hat{x}$ unique, or is there any other solution besides
$\hat{x}$?
FOOTNOTES
1. The maximization problem is equivalent to the minimization problem, for one can
easily convert one to the other. For example, if this problem is taken to be one of
minimizing f(x), subject to a certain set of constraints, it can be converted to one
of maximizing [ -f (x)] , subject to the same set of constraints.
2. To name just a few, we have the transportation problem, the production scheduling
problem, the diet problem, the gasoline mixing problem, and the allocation problem.
3. For the theoretical apparatus developed in linear programming and its applications to
economic theory, see, for example, H. W. Kuhn and A. W. Tucker, eds., Linear
Inequalities and Related Systems, Princeton University Press, 1956.
4. The term nonlinear programming is a little confusing. It customarily includes linear
programming as a special case.
5. Linear programming aroused interest in constraints in the form of inequalities
and in the theory of linear inequalities and convex sets. The Kuhn-Tucker study
[3] appeared in the middle of this interest with a full recognition of such devel-
opments. However, the theory of nonlinear programming when the constraints are
all in the form of equalities has been well known for a long time-in fact, since Euler
and Lagrange. The inequality constraints were treated in a fairly satisfactory manner
already in 1939 by Karush [2] . Karush's work is apparently under the influence of a
similar work in the calculus of variations by Valentine. Unfortunately, Karush's work
has been more or less ignored.
6. The function f(x) is called the maximand function or the objective function.
7. The point $\hat{x}$ is also called a maximum point, a solution, an optimal solution, and an
optimal program.
8. See Section D of this chapter, for example.
9. For a pioneering work in this direction, see Hurwicz [1]. Programming in linear
spaces would presumably include such topics as the calculus of variations and optimal
control theory. The reverse approach-that is, treating the usual nonlinear pro-
gramming as a special case of optimal control theory-is also possible. This has
been recently investigated, especially after the interest aroused in the variational
approach by Pontryagin et al., Hestenes, and so on. See, for example, M. Canon,
C. Cullum, and E. Polak, "Constrained Maximization Problem in Finite-Dimen-
sional Spaces," Journal of SIAM Control, vol. 4, no. 3, August 1966.
10. The dynamic optimum consumption problem as stated above has recently been
treated in a more sophisticated manner by M. El-Hodiri, M. Yaari, A. Douglas,
K. Avio, and so on, by using the calculus of variations and optimal control theory.
See, for example, K. Avio, "Age-dependent Utility in the Lifetime Allocation
Problem," Krannert Institute Paper, Purdue University, no. 260, November 1969, for
this problem and the references. See also Chapter 8, Section C. Note also that the
dynamic consumption problem can also be treated by using the usual nonlinear pro-
gramming technique.
11. This does not imply, of course, that the scope of available algorithms is very
limited. On the contrary, thanks to electronic computers we are able to handle a
sufficiently large number of practical problems.
12. Moreover, we will not treat such topics as integer programming, stochastic program-
ming, and the like, as such.
13. Although this chapter was written prior to and independently of Mangasarian [4],
the reader may benefit from reading this excellent treatise on nonlinear program-
ming along with this chapter.
REFERENCES
Section B
CONCAVE PROGRAMMING-
SADDLE-POINT CHARACTERIZATION
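[Figure: graph of a concave function, showing $f[\theta x + (1 - \theta) y]$ and the chord value $\theta f(x) + (1 - \theta) f(y)$ at the point $\theta x + (1 - \theta) y$ between $x$ and $y$.]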
Theorem 1.B.1
(i) Let $f$ be a concave function on a convex subset $X$ of $R^n$. Then $S \equiv \{x: x \in X,
f(x) \ge 0\}$ is a convex set.
(ii) A nonnegative linear combination of concave functions is also concave. In other
words, if $f_i(x)$, $i = 1, 2, \ldots, m$, are concave functions on a convex subset $X$ of
$R^n$, then $f(x) \equiv \sum_{i=1}^{m} \alpha_i f_i(x)$, where $\alpha_i \in R$, $\alpha_i \ge 0$, $i = 1, 2, \ldots, m$, is also
a concave function on $X$.
(iii) Every concave function is continuous in the interior of the domain of the function.³
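PROOF:
$$\sum_{i=1}^{m} \alpha_i f_i[\theta x + (1 - \theta) y] \ge \sum_{i=1}^{m} \alpha_i [\theta f_i(x) + (1 - \theta) f_i(y)] = \theta \sum_{i=1}^{m} \alpha_i f_i(x) + (1 - \theta) \sum_{i=1}^{m} \alpha_i f_i(y)$$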
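(i) If $f$ is concave, then the set $\{x: x \in X, f(x) \ge \alpha\}$ (for each $\alpha \in R$) is convex in
$R^n$. [For the proof, simply observe (i) and (ii) of Theorem 1.B.1.]
(ii) The function $f$ is concave if and only if the set $\{(x, \alpha): x \in X, \alpha \in R,
f(x) \ge \alpha\}$ is convex in $R^{n+1}$. [The proof follows directly from the definitions.]
(iii) The function $f$ is concave if and only if, for each integer $m \ge 1$,
$$f(\theta_1 x^1 + \theta_2 x^2 + \cdots + \theta_m x^m) \ge \theta_1 f(x^1) + \theta_2 f(x^2) + \cdots + \theta_m f(x^m)$$
for all $x^i \in X$ and all $\theta_i \ge 0$, $i = 1, 2, \ldots, m$, with $\sum_{i=1}^{m} \theta_i = 1$.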
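Property (iii) lends itself to a direct numerical check. The sketch below assumes the concave function $f(x) = -x^2$ on $R$, four hypothetical points, and weights restricted to multiples of $1/4$; exact rational arithmetic avoids rounding questions.

```python
from fractions import Fraction as F
from itertools import product

def f(x):
    """A concave function on R, chosen only as an example."""
    return -x * x

points = [F(-2), F(0), F(1), F(3)]                 # x^1, ..., x^m with m = 4
weights = [F(k, 4) for k in range(5)]              # candidate values for each theta_i

def jensen_gap(thetas):
    """f(sum theta_i x^i) - sum theta_i f(x^i); nonnegative when f is concave."""
    mean = sum(t * x for t, x in zip(thetas, points))
    return f(mean) - sum(t * f(x) for t, x in zip(thetas, points))

# All weight vectors on the grid with theta_i >= 0 and sum theta_i = 1.
simplex = [th for th in product(weights, repeat=4) if sum(th) == 1]
gaps = [jensen_gap(th) for th in simplex]
```

Every entry of `gaps` is nonnegative here; for a function that is not concave (for example $f(x) = x^2$) the same check would produce negative gaps.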
The converse of (i) of the above remark is not necessarily true. A weaker property
than the concavity of $f$ will suffice to guarantee the convexity of the set (later
we discuss it as the quasi-concavity of a function). The set $\{x: x \in X, f(x) \ge \alpha\}$ is
often called the upper contour set (see Figure 1.7). [In the theory of consumer
demand, $f$ is a utility function and the set $\{x: x \in X, f(x) = \alpha\}$ is often called an
indifference curve.]
[Figure 1.7. Upper Contour Set.]
Theorem 1.B.2 (Fundamental Theorem): Let $X$ be a convex set in $R^n$ and let $f_1, f_2,
\ldots, f_m$ be real-valued concave functions defined on $X$. If the system
$$f_i(x) > 0, \quad i = 1, 2, \ldots, m$$
admits no solution $x$ in $X$, then there exist coefficients $p_1, p_2, \ldots, p_m$, all $p_i \ge 0$
(not vanishing simultaneously) and $p_i \in R$, $i = 1, 2, \ldots, m$, such that
$$\sum_{i=1}^{m} p_i f_i(x) \le 0 \quad \text{for all } x \in X.$$
Corollary: Let $X$ be a convex set in $R^n$ and let $f_1, f_2, \ldots, f_m$ be real-valued convex
functions. Then either the system $f_i(x) < 0$, $i = 1, 2, \ldots, m$, admits a solution $x \in X$,
or there exist $p_1, p_2, \ldots, p_m$, all $\ge 0$ and not vanishing simultaneously, such that
$\sum_{i=1}^{m} p_i f_i(x) \ge 0$ for all $x \in X$.⁴ If we wish, we may choose the $p_i$'s such that
$\sum_{i=1}^{m} p_i = 1$.
REMARK: There are several important applications of this fundamental
theorem. Berge and Ghouila-Houri ([2], pp. 64-68) proved, for example,
the theorem due to Bohnenblust, Karlin, and Shapley, the von Neumann
minimax theorem, and a generalized Minkowski-Farkas lemma as applica-
tions of this theorem.
We now prove the major theorem of this section, which is again a corollary of
Theorem 1.B.2.
Corollary:⁵ Suppose that the following additional condition is satisfied for Theorem
1.B.3.
(S) There exists an $\bar{x}$ in $X$ such that $g_j(\bar{x}) > 0$, $j = 1, 2, \ldots, m$.
Then we have $p_0 > 0$. Hence under the assumptions of the theorem together with
condition (S), there exist coefficients $\hat{\lambda}_1, \hat{\lambda}_2, \ldots, \hat{\lambda}_m$, all $\ge 0$, such that
(SP) $\Phi(x, \hat{\lambda}) \le \Phi(\hat{x}, \hat{\lambda}) \le \Phi(\hat{x}, \lambda)$, for all $x \in X$ and all $\lambda \ge 0$, where $\Phi(x, \lambda) \equiv
f(x) + \lambda \cdot g(x)$, and $\hat{\lambda} \equiv (\hat{\lambda}_1, \hat{\lambda}_2, \ldots, \hat{\lambda}_m)$.
In other words, $(\hat{x}, \hat{\lambda})$ is a saddle point of $\Phi(x, \lambda)$ on $X \otimes \Omega^m$ where $\Omega^m$ is the
nonnegative orthant of $R^m$.
PROOF: Suppose $p_0 \not> 0$. Then $p_0 = 0$ and $p \ge 0$. Hence by the above
theorem we have
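Theorem 1.B.4: Let $f, g_1, g_2, \ldots, g_m$ be real-valued functions defined over $X$ in $R^n$.
If there exists a point $(\hat{x}, \hat{\lambda})$ in $X \otimes \Omega^m$ such that
$\Phi(x, \hat{\lambda}) \le \Phi(\hat{x}, \hat{\lambda}) \le \Phi(\hat{x}, \lambda)$ for all $x \in X$ and all $\lambda \in \Omega^m$, then
(i) The point $\hat{x}$ maximizes $f(x)$ subject to $g_j(x) \ge 0$, $j = 1, 2, \ldots, m$, and $x \in X$.
(ii) $\hat{\lambda} \cdot g(\hat{x}) = 0$.
PROOF: The inequality $\Phi(\hat{x}, \hat{\lambda}) \le \Phi(\hat{x}, \lambda)$ for all $\lambda \in \Omega^m$ implies that
$\hat{\lambda} \cdot g(\hat{x}) \le \lambda \cdot g(\hat{x})$ for all $\lambda \in \Omega^m$. Thus $\lambda \cdot g(\hat{x})$ is bounded from below for
all $\lambda \in \Omega^m$, and $\Omega^m$ is a convex cone. Therefore, we have $\lambda \cdot g(\hat{x}) \ge 0$ for
all $\lambda \ge 0$. (Recall the lemma immediately preceding Theorem 0.B.4.) Thus
$g_j(\hat{x}) \ge 0$, $j = 1, 2, \ldots, m$. (Thus $\hat{x}$ satisfies the constraints.) Putting $\lambda = 0$
in the above inequality $\hat{\lambda} \cdot g(\hat{x}) \le \lambda \cdot g(\hat{x})$, we obtain $\hat{\lambda} \cdot g(\hat{x}) \le 0$, which
proves (ii) since $g(\hat{x}) \ge 0$ and $\hat{\lambda} \ge 0$. Now note that by assumption, $\Phi(x, \hat{\lambda}) \le
\Phi(\hat{x}, \hat{\lambda})$ for all $x \in X$. This means $f(x) + \hat{\lambda} \cdot g(x) \le f(\hat{x})$, since $\hat{\lambda} \cdot g(\hat{x}) = 0$;
or $f(\hat{x}) - f(x) \ge \hat{\lambda} \cdot g(x)$. [That is, $f(\hat{x}) - f(x) \ge 0$ for all $x \in X$ such that
$\hat{\lambda} \cdot g(x) \ge 0$.] In particular, $f(\hat{x}) - f(x) \ge 0$ for all $x \in X$ such that $g_j(x) \ge 0$,
$j = 1, 2, \ldots, m$. (Q.E.D.)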
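The saddle-point inequalities and the two conclusions of the theorem can be verified on a one-variable problem. The sketch below assumes the hypothetical problem of maximizing $f(x) = -(x - 2)^2$ subject to $g(x) = 1 - x \ge 0$ with $X = R$, for which $(\hat{x}, \hat{\lambda}) = (1, 2)$ can be found by hand; the checks run over finite samples of $x$ and $\lambda$ and use a small tolerance for floating-point rounding.

```python
# Hypothetical problem: maximize f(x) = -(x - 2)^2 subject to g(x) = 1 - x >= 0, X = R.
def f(x):
    return -(x - 2) ** 2

def g(x):
    return 1 - x

def lagrangian(x, lam):
    """Phi(x, lambda) = f(x) + lambda * g(x)."""
    return f(x) + lam * g(x)

x_hat, lam_hat = 1, 2        # candidate saddle point (found by hand)

xs = [k / 10 for k in range(-50, 51)]      # sample of X
lams = [k / 10 for k in range(0, 101)]     # sample of lambda >= 0

# Saddle-point inequalities: Phi(x, lam_hat) <= Phi(x_hat, lam_hat) <= Phi(x_hat, lam).
left = all(lagrangian(x, lam_hat) <= lagrangian(x_hat, lam_hat) + 1e-12 for x in xs)
right = all(lagrangian(x_hat, lam_hat) <= lagrangian(x_hat, lam) + 1e-12 for lam in lams)

# Conclusions (i) and (ii) of the theorem, checked on the same sample.
feasible = [x for x in xs if g(x) >= 0]
is_constrained_max = all(f(x) <= f(x_hat) + 1e-12 for x in feasible)
complementary_slackness = lam_hat * g(x_hat) == 0
```

In this example $\Phi(x, 2) = -(x - 1)^2 - 1$, so the left inequality holds for every $x$, while $\Phi(1, \lambda) = -1$ for every $\lambda$ because the constraint binds at $\hat{x}$.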
REMARK : In the above theorem we do not assume the concavity of the
f and the gj's, nor do we assume the convexity of X.
Combining Theorem 1.B.3 and its corollary, we immediately obtain the
following useful theorem.
Theorem 1.B.5: Let $f, g_1, g_2, \ldots, g_m$ be concave functions defined over a convex set
$X$ in $R^n$. Suppose that Slater's condition (S) is satisfied. Then $\hat{x}$ achieves a maximum
of $f(x)$ subject to $g_j(x) \ge 0$, $j = 1, 2, \ldots, m$, if and only if there exists a $\hat{\lambda} \ge 0$ such
that $(\hat{x}, \hat{\lambda})$ achieves the saddle point of the Lagrangian $\Phi(x, \lambda)$, that is, $\Phi(x, \hat{\lambda}) \le
\Phi(\hat{x}, \hat{\lambda}) \le \Phi(\hat{x}, \lambda)$ for all $x \in X$ and $\lambda \ge 0$.
REMARK: The above statement of the theorem presupposes the existence
of $\hat{x}$. We may also restate the theorem in the following way:
There exists a solution, $\hat{x}$, for the problem of maximizing $f(x)$ subject
to $g_j(x) \ge 0$, $j = 1, 2, \ldots, m$, if and only if there exists a saddle point for $\Phi(x, \lambda)$
such that $\Phi(x, \hat{\lambda}) \le \Phi(\hat{x}, \hat{\lambda}) \le \Phi(\hat{x}, \lambda)$ for all $x \in X$, $\lambda \ge 0$.
REMARK: In the above characterization of the solution of a nonlinear programming
problem, the solution $\hat{x}$ is a global solution; that is, it does not
refer to any small neighborhood about $\hat{x}$. The solution $\hat{x}$ is defined for the
entire domain $X$.
REMARK: In certain cases, Slater's condition (S) can be dispensed with.
For example, linear programming is a special case of concave programming
and it is known (by the "Goldman-Tucker theorem") that the above theorem
holds without (S) for the linear programming problem. We discuss this later
in Sections D and F of this chapter.
POSTSCRIPTS (FOR FURTHER READING): Here we are concerned with the
problem of finding $x \in X \subset R^n$ to maximize a real-valued function $f(x)$ on $X$
subject to $m$ real-valued function constraints $g_j(x) \ge 0$, $j = 1, \ldots, m$. Let $\Omega^m$
be the $m$-dimensional nonnegative orthant of $R^m$. Then, writing $g(x) \equiv [g_1(x), \ldots$
that as a result of this restriction to Euclidean spaces, Moore was able to simplify
some of the proofs considerably. Hence reading his paper may serve as a guide
to the difficult paper of Hurwicz, at least for the concave programming case.
Strictly speaking, Moore's results are not all special cases of Hurwicz [6].
We record one of his main results as a theorem for those readers who are interested
in [9] (his theorem 3).
FOOTNOTES
$x - x' \in K$, where $K$ is a given fixed convex cone in $X$. If $X = R^n$ and $K$ is $\Omega^n$, then
this definition coincides with the usual definition. In any case, in terms of this
definition of $\ge$ in a linear space, the concavity or convexity of functions on a
linear space can be defined analogously to the case of a real space. See Hurwicz
[6], for example.
3. It is important to note that a concave function may not be continuous at its boundary
points. For example, define $f$ on $[0, \infty)$ by $f(x) = 0$ if $x = 0$ and $f(x) = 1$ if $x > 0$. This
function is clearly concave but not continuous at $x = 0$. It is continuous on $(0, \infty)$.
4. Write $f = (f_1, f_2, \ldots, f_m)$ and $p = (p_1, p_2, \ldots, p_m)$. If there exists an $x \in X$ with
$f(x) < 0$, then clearly $p \cdot f(x) < 0$ for any $p \ge 0$, $p \ne 0$; that is, "$p \cdot f(x) \ge 0$ for all $p \ge 0$,
$x \in X$" does not hold. On the other hand, if $f(x) < 0$ admits no solution for $x \in X$,
then $-f(x) > 0$ admits no solution for $x \in X$. Hence, as a result of Theorem 1.B.2,
there exists a $p \ge 0$, $p \ne 0$, such that $p \cdot [-f(x)] \le 0$ or $p \cdot f(x) \ge 0$ for all $x \in X$. This
proves the corollary.
5. A generalization of this corollary and Theorem 1.B.3 to the case of linear topological
spaces is accomplished by Hurwicz [6], theorem V.3.1., pp. 91-93.
6. In many economic problems, we are often concerned with the problem of choosing
$x \in R^n$ which maximizes $f(x)$ subject to $g_j(x) \ge 0$, $j = 1, 2, \ldots, m$, and $x \ge 0$. For such
a problem it can easily be shown that Slater's condition is slightly weakened so that
there exists an $\bar{x} \ge 0$ such that $g_j(\bar{x}) > 0$, $j = 1, 2, \ldots, m$. For a more general result
with linear (affine) constraints such that $A \cdot x + b \ge 0$, instead of $x \ge 0$, see Moore's
theorem later in this section.
7. The need for some requirement for constraints when all the constraints are in the
form of inequalities was first investigated in 1939 by Karush in his Master's thesis at
the University of Chicago (Minima of Functions of Several Variables with Inequalities
as Side Conditions).
8. If (S) holds, then g(x) > 0 so that p g(x) > 0 for p >_ 0; that is, (K) holds. Con-
versely, if (K) holds, then there exists no p > 0 such that p g(x) < 0 for all x E X.
Then, owing to the corollary of Theorem 1.B.2, the system g(x) > 0 admits a solution,
say, x in X; that is, (S) holds. The equivalence of these two conditions for a more
general space is provided by L. Hurwicz and H. Uzawa, "A Note on the Lagrangian
Saddle-Points," in Studies in Linear and Non-Linear Programming, Stanford, Calif.,
Stanford University Press, 1958.
9. Slater's own counterexample is the following: $f(x) = x - 1$ and $g(x) = -(x - 1)^2$,
$x \in R$. The above example, due to Uzawa [13], p. 34, is obviously a slight modification
of Slater's example. In spite of this well-known counterexample, there seems to
be a confusion among economists on this point. See, for example, K. Lancaster,
Mathematical Economics, New York, Macmillan, 1968, p. 75 (the second proposition
of his "existence theorem"). See also p. 64.
10. For such a definition, recall our earlier remark in footnote 2.
11. The above definition of the K-concavity (or the K-convexity) is clearly motivated by
the definition of $\geq$ in a linear space. The reader should not confuse this concept with
that of the S-concavity (or S-convexity), which is discussed in Berge [1], and so on.
12. In Moore's analysis [9], $f$ is not restricted to a real-valued function but can be
vector-valued; that is, $f = [f_1, \ldots, f_n]$, where $f_i$, $i = 1, \ldots, n$, are real-valued.
Such a problem is called the "vector maximum problem" and we discuss it later in
Section E.
13. Thus $g^1(x)$ can be written as $g^1(x) = A \cdot x + b$, where $A$ is an $m_1 \times n$ matrix with
entries of real numbers and $b \in R^{m_1}$.
14. Let $K$ be a convex cone in $R^m$; then the nonnegative polar cone can be defined as
$K^* = \{z : z \in R^m,\ y \cdot z \geq 0 \text{ for all } y \in K\}$. It is easy to see that $K^*$ is also a convex cone.
Also, if $K = \Omega^m$, then $K^* = \Omega^m$.
15. There is a counterexample to Uzawa's theorem 3, as it is stated. This is pointed out
by Moore [9], p. 61.
REFERENCES
1. Berge, C., Topological Spaces, New York, Macmillan, 1963 (French original, 1959).
2. Berge, C., and Ghouila-Houri, A., Programming, Games and Transportation Net-
works, New York, Wiley, 1965 (French original, 1962).
3. Fenchel, W., Convex Cones, Sets and Functions, Princeton, N.J., Princeton University,
1953 (hectographed).
4. Fleming, W. H., Functions of Several Variables, Reading, Mass., Addison-Wesley,
1965.
5. Hadley, G., Nonlinear and Dynamic Programming, Reading, Mass., Addison-Wesley,
1964, chap. 3.
6. Hurwicz, L., "Programming in Linear Spaces," in Studies in Linear and Non-linear
Programming, ed. by K. J. Arrow, L. Hurwicz, and H. Uzawa, Stanford, Calif.,
Stanford University Press, 1958.
7. Karlin, S., Mathematical Methods and Theory in Games, Programming, and Economics,
Vol. 1, Reading, Mass., Addison-Wesley, 1959, esp. sec. 7.1, 7.2, and appendix B.
8. Kuhn, H. W., and Tucker, A. W., "Non-linear Programming," in Proceedings of the
Second Berkeley Symposium on Mathematical Statistics and Probability, ed. by J.
Neyman, Berkeley, Calif., University of California Press, 1951.
9. Moore, J. C., "Some Extensions of the Kuhn-Tucker Results in Concave Program-
ming," in Papers in Quantitative Economics, ed. by J. P. Quirk and A. Zarley,
Lawrence, Kansas, University of Kansas Press, 1968.
10. Nikaido, H., Introduction to Sets and Mappings in Modern Economics, tr. by K. Sato,
Amsterdam, North-Holland, 1970 (Japanese original, Tokyo, 1960).
11. Nikaido, H., Linear Mathematics for Economics, Tokyo, Baifukan, 1961 (in Japanese), esp.
chap. III, sec. 4.
12. Slater, M., "Lagrange Multipliers Revisited: A Contribution to Non-linear Program-
ming," Cowles Commission Discussion Paper, Math. 403, November 1950; also RM-
676, August 1951.
13. Uzawa, H., "The Kuhn-Tucker Theorem in Concave Programming," in Studies in
Linear and Non-linear Programming, ed. by K. J. Arrow, L. Hurwicz, and H. Uzawa,
Stanford, Calif., Stanford University Press, 1958.
Section C
DIFFERENTIATION AND THE
UNCONSTRAINED MAXIMUM PROBLEM
a. DIFFERENTIATION
$$\lim_{\substack{h \to 0 \\ h \neq 0}} \frac{f(x^0 + h) - f(x^0)}{h} = a$$
or
$$\lim_{\substack{h \to 0 \\ h \neq 0}} \frac{f(x^0 + h) - f(x^0) - ah}{h} = 0$$
We call $a$ the derivative of $f$ at $x^0$ and denote it by $f'(x^0)$. We can easily see that
$f'(x^0)$, if it exists, is unique.
REMARK: Since $x^0$ is an interior point of $X$, it is assumed that $X$ contains an
open interval $(\alpha, \beta)$ such that $x^0 \in (\alpha, \beta)$. This guarantees that $x^0 + h \in (\alpha, \beta)$
when $h$ is small enough. Hence such an $h$, if it is small enough, can be either
negative or positive. If, on the other hand, $f$ is defined on the closed interval
$[\alpha, \beta]$, then this is no longer the case. For example, if $x^0 = \alpha$, then $h$, however
small it may be, cannot be negative. To deal with such a situation the concepts
of the "left-hand derivative" and the "right-hand derivative" are
formulated. In other words,
$$\lim_{\substack{h \to 0 \\ h \neq 0}} \frac{f(x^0 + h) - f(x^0) - a \cdot h}{\|h\|} = 0$$
where $h$ is, of course, an $n$-vector also. The above $a$ is denoted by $f'(x^0)$ and is called
the derivative of $f$ at $x^0$.
REMARK: In the above definition, $a \cdot h$ is called the differential of $f$ at $x^0$.
It clearly depends on $x^0$ and $h$. If $f$ is differentiable at every point in a subset
$S$ of $X$, $f$ is called differentiable in $S$. If $X$ is open and if $f$ is differentiable
in $X$, then $f$ is called a differentiable function.
REMARK: Also, we can show that f'(x°), if it exists, is unique.
REMARK: Note that $X$ in the above definition does not have to be an open
set. However, x° is restricted to be an interior point. Hence it is assumed
that there is an open ball about x° which is contained in X.
REMARK: When $X$ is a (closed) rectangular region, that is,
$$X = \{x : x \in R^n,\ a_i \leq x_i \leq b_i,\ i = 1, 2, \ldots, n\}$$
we can define the concept of the left-hand and right-hand derivatives by
analogy. For example, an $n$-vector $a^+$ is the right-hand derivative of $f$ at $x^0$, if
$$\lim_{\substack{h \to 0 \\ h \geq 0}} \frac{f(x^0 + h) - f(x^0) - a^+ \cdot h}{\|h\|} = 0$$
Clearly the above concept can be defined even if $X$ is not bounded (for
example, $\Omega^n$). It should be clear, however, that the above concept is rather
limited because the closed rectangular region is a very special kind of
domain.
$$\lim_{\substack{h \to 0 \\ h \neq 0}} \frac{o(\|h\|)}{\|h\|} = 0$$
Definition: Let $e^i$ be an $n$-vector with the $i$th coordinate equal to 1 and all other
coordinates equal to 0. A real-valued function $f$ on a subset $X$ of $R^n$ is said to have a
partial derivative with respect to $x_i$ at $x^0$, where $x^0$ is an interior point of $X$, if there
exists a scalar $a_i$ such that
where $h$ is a scalar. The scalar $a_i$ is called the partial derivative of $f$ at $x^0$ and is often
denoted by $\partial f/\partial x_i \,|_{x = x^0}$.
We now state the basic theorems about derivatives, the proofs of which
can be found in any book on elementary analysis or advanced calculus.
Theorem 1.C.1.
$$f'(x^0) = \left[ \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n} \right]_{x = x^0}$$
REMARK: The function $f$ is said to be continuously differentiable at $x^0$ if it
is differentiable at $x^0$ and if $f'(x)$ is continuous at $x^0$.
REMARK: The vector $f'(x^0)$ is sometimes called the gradient vector of $f$ at
$x = x^0$. When notational simplicity is required, we will denote it by $f_x^0$.
REMARK: Although the differentiability of $f$ implies the continuity of $f$,⁴ the
converse is not necessarily true. (For example, $f(x) = |x|$, $x \in R$, is continuous
but not differentiable at $x = 0$.) Weierstrass constructed an example
of a continuous function on $R$ which is nowhere differentiable.⁵
REMARK: The partial derivatives are often called
the first-order partial derivatives. The $\partial f/\partial x_i$'s are also functions on $X$ as
$x^0$ varies over $X$. Thus the second-order partial derivatives are defined
analogously [for example, $f(x, y) = x^2 y$, $x, y \in R$, $\partial f/\partial x = 2xy$, $\partial^2 f/\partial x^2 = 2y$,
$\partial^2 f/\partial y \partial x = 2x$, $\partial f/\partial y = x^2$, $\partial^2 f/\partial y^2 = 0$, $\partial^2 f/\partial x \partial y = 2x$]. The partial
derivatives of $f$ of order $q = 3, 4, \ldots$, are also defined analogously. If the
$q$th order ($q = 1, 2, \ldots$) partial derivatives of $f$ exist and are continuous
in the domain, then $f$ is said to be a function of class $C^{(q)}$. If $f$ is simply
continuous, $f$ is a function of class $C^{(0)}$; $C^{(0)}$ and $C^{(1)}$ are often denoted
respectively by $C$ and $C'$. In Theorem 1.C.1, (ii) says that $f$ is (continuously)
differentiable if and only if $f \in C^{(1)}$; and (i) says that $f \in C^{(1)}$ implies
$f \in C^{(0)}$. As remarked above, $f \in C^{(0)}$ does not necessarily imply $f \in C^{(1)}$.
It can be shown that $f \in C^{(2)}$ implies $\partial^2 f/\partial x_i \partial x_j = \partial^2 f/\partial x_j \partial x_i$ for all $i$ and $j$
and for all points in the domain. When $f \in C^{(2)}$, $f$ is said to be twice
continuously differentiable.
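The symmetry of the mixed second-order partials can be checked numerically on the example $f(x, y) = x^2 y$ used in the remark above. The following is a minimal sketch using central differences; the helper `second_partial`, the step size, and the evaluation point are illustrative assumptions, not part of the text.

```python
# Illustrative check (not from the text): central differences for the
# second-order partials of the remark's example f(x, y) = x^2 * y.

def f(x, y):
    return x ** 2 * y


def second_partial(func, point, i, j, step=1e-3):
    """Central-difference estimate of d^2 func / (dx_i dx_j) at `point`."""
    def shifted(di, dj):
        p = list(point)
        p[i] += di * step
        p[j] += dj * step
        return func(*p)

    return (shifted(1, 1) - shifted(1, -1)
            - shifted(-1, 1) + shifted(-1, -1)) / (4 * step ** 2)


point = (1.5, 2.0)                      # arbitrary evaluation point (assumed)
f_xx = second_partial(f, point, 0, 0)   # analytic value: 2y = 4
f_xy = second_partial(f, point, 0, 1)   # analytic value: 2x = 3
f_yx = second_partial(f, point, 1, 0)   # analytic value: 2x = 3
f_yy = second_partial(f, point, 1, 1)   # analytic value: 0
```

Because $f$ is a polynomial of class $C^{(2)}$, the two mixed estimates `f_xy` and `f_yx` should agree up to rounding error.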
In the definition of differentiation and in the above theorem, we assume that
f is a real-valued function. We can extend the concept of differentiation to the
case in which f is a vector-valued function.
Definition: Let $f$ be a function from a subset $X$ of $R^n$ into $R^m$. Then $f$ is said to
be differentiable at $x^0$, where $x^0$ is an interior point of $X$, if there exists an $m \times n$
matrix $A$ with real number entries such that
$$f(x^0 + h) - f(x^0) = A \cdot h + o(\|h\|)$$
We can show that $A$, if it exists, is unique.
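REMARK: Writing $f(x) = [f_1(x), \ldots, f_m(x)]$, we can easily show that
$$A = \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \vdots & & \vdots \\ \dfrac{\partial f_m}{\partial x_1} & \cdots & \dfrac{\partial f_m}{\partial x_n} \end{bmatrix}_{x = x^0}$$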
We now state the extension of an important theorem (called the chain rule)
in elementary calculus.
EXAMPLE ($n = 1$, $k = 1$): Let $f(x) = [f_1(x), \ldots, f_m(x)]$ and $h(x) = g[f(x)]$,
where $x \in R$ and $h(x) \in R$. Writing $y_i = f_i(x)$, we have
$$h'(x) = \sum_{i=1}^{m} \frac{\partial g}{\partial y_i} \frac{d f_i}{d x}$$
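As a quick illustration of the chain rule formula for the case $n = 1$, $k = 1$, the sketch below compares the sum of products of partial derivatives with a finite-difference derivative of the composite function. The particular functions $f(x) = (x, x^2)$ and $g(y) = y_1 y_2$ are assumptions chosen for illustration only.

```python
# Illustrative chain-rule check; f and g below are hypothetical choices.

def f(x):
    """f maps R into R^2."""
    return (x, x ** 2)


def g(y):
    """g maps R^2 into R."""
    return y[0] * y[1]


def h(x):
    """Composite function h(x) = g[f(x)] (here h(x) = x^3)."""
    return g(f(x))


def chain_rule_derivative(x):
    """h'(x) as the sum over i of (dg/dy_i) * (df_i/dx)."""
    y = f(x)
    dg_dy = (y[1], y[0])          # partials of g evaluated at y = f(x)
    df_dx = (1.0, 2.0 * x)        # derivatives of the components of f
    return sum(a * b for a, b in zip(dg_dy, df_dx))


def numerical_derivative(func, x, step=1e-6):
    """Central-difference estimate of func'(x)."""
    return (func(x + step) - func(x - step)) / (2.0 * step)


x0 = 1.3                                   # arbitrary point (assumed)
analytic = chain_rule_derivative(x0)       # equals 3 * x0**2
numeric = numerical_derivative(h, x0)
```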
$$h(t) = g[f(t)]$$
where $f(t) = a + bt$ with $a, b \in R^m$, $t \in R$ ($a$, $b$ are constant vectors). Then
we have
$$f[x^0 + t(x - x^0)] - f(x^0) \geq t[f(x) - f(x^0)], \quad \text{for all } t,\ 0 < t < 1$$
Let $h = x - x^0$. Subtract $t f_x^0 \cdot h$ from both sides of the above relation and
divide by $t > 0$. Then we obtain
$$\frac{f[x^0 + th] - f(x^0) - t f_x^0 \cdot h}{t} \geq f(x) - f(x^0) - f_x^0 \cdot h$$
Now take the limit of $t \to 0$ ($t > 0$). Then the LHS of the above relation
The function $f$ is strictly convex if and only if the above holds with strict
inequality for any $x \neq x^0$ in $X$.
b. UNCONSTRAINED MAXIMUM
We now consider the maximization problem and its relation to deriva-
tives. The minimization problem is essentially the same as the maximization
problem, for the maximization of $f(x)$ is equivalent to the minimization of $-f(x)$,
and vice versa. In this section we take up the unconstrained maximum problem.
Obviously any global maximum (resp. minimum) point is also a local maxi-
mum (resp. minimum) point. The converse is not necessarily true. However, when
f is a concave (resp. convex) function, the converse is also true. In particular,
we prove the following:
Theorem 1.C.4: Let $f(x)$ be a concave function over a convex set $X$ in $R^n$. Then
any local maximum of $f(x)$ in $X$ is also a global maximum of $f(x)$ over $X$.
PROOF: If the global maximum is taken on at just a single point, then the
result is obvious. Suppose then that the global maximum is taken on at two
different points $\hat{x}$ and $x^*$. Let $\tilde{x} = t x^* + (1 - t)\hat{x}$, $0 < t < 1$. Since $f$ is concave
and since $f(\hat{x}) = f(x^*)$, we have
$$f(\tilde{x}) = f[t x^* + (1 - t)\hat{x}] \geq t f(x^*) + (1 - t) f(\hat{x}) = f(\hat{x}), \quad 0 < t < 1$$
Since $f(\tilde{x})$ cannot be greater than $f(\hat{x})$, $f(\tilde{x}) = f(\hat{x})$. Therefore $\tilde{x} \in S$, or
$t x^* + (1 - t)\hat{x} \in S$ for all $t$, $0 < t < 1$. (Q.E.D.)
REMARK: It should be clear that Theorems 1.C.4 and 1.C.5 remain correct
if we replace "$f$ is concave" by "$f$ is convex" and "maximum" by "minimum."
it is also taken at all the points in between those two points. It should also
be clear that if $f(x)$ is a strictly concave function, then the global maximum
is taken at a unique point. To prove this, suppose that the global maximum
is taken on at two distinct points, $\hat{x}$ and $x^*$. Then we have $f(\tilde{x}) > t f(x^*) +
(1 - t) f(\hat{x}) = f(\hat{x})$, where $\tilde{x} = t x^* + (1 - t)\hat{x}$, $0 < t < 1$. That is, we have
$f(\tilde{x}) > f(\hat{x})$, which contradicts the fact that $f(\hat{x})$ is the global maximum.
We now prove the following basic theorem.
Theorem 1.C.7: Let $f(x)$ be a differentiable and concave function over an open convex
set $X$ in $R^n$. The function $f(x)$ achieves its global maximum at $x = \hat{x}$ if and only if
$\hat{f}_x = 0$, where $\hat{f}_x = f'(\hat{x})$ (the gradient vector of $f$ at $\hat{x}$). Moreover, $\hat{x}$ furnishes a unique
maximum of $f$ if $f$ is strictly concave.
PROOF: If the global maximum is taken on at $x = \hat{x}$, clearly we have $\hat{f}_x = 0$.
Conversely, if $\hat{f}_x = 0$, then, as a result of Theorem 1.C.3, $0 \geq f(x) - f(\hat{x})$
for all $x \in X$, or $f(\hat{x}) \geq f(x)$ for all $x \in X$. (Q.E.D.)
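The content of Theorem 1.C.7 can be illustrated numerically: for a concave differentiable function, a point with zero gradient should dominate every other point. The sketch below uses a hypothetical concave quadratic on $R^2$ and a finite grid; both are assumptions for illustration, and a grid check is of course not a proof.

```python
import itertools

# Hypothetical concave function on R^2 (not from the text).
def f(x1, x2):
    return -(x1 - 1.0) ** 2 - 2.0 * (x2 + 0.5) ** 2


def grad_f(x1, x2):
    """Analytic gradient of f."""
    return (-2.0 * (x1 - 1.0), -4.0 * (x2 + 0.5))


x_hat = (1.0, -0.5)                       # candidate point with zero gradient
gradient_at_x_hat = grad_f(*x_hat)

# Compare f(x_hat) with f on a coarse grid over [-5, 5] x [-5, 5].
grid = [k / 4.0 for k in range(-20, 21)]
is_global_max_on_grid = all(
    f(*x_hat) >= f(a, b) for a, b in itertools.product(grid, grid)
)
```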
REMARK: Analogously, $f(x)$ achieves its global minimum at $\hat{x}$ if and only
if $f'(\hat{x}) = 0$, when $f$ is a convex function. Likewise, $\hat{x}$ furnishes a unique
minimum of $f$ if $f$ is strictly convex.
REMARK: In the literature there are usually discussions on the "second-
order sufficiency conditions" assuming f E C(2). When f is specified as con-
cave (or convex), we can see from the above theorem that such considera-
tions can be dispensed with. The second-order conditions are, however,
related to the concavity or the convexity of a function in a neighborhood
of the relevant point (see Section E, subsection c). Moreover, note that
Theorem 1.C.7 says that $\hat{f}_x = 0$ is a necessary and sufficient characterization
of a global maximum and not simply that of a local maximum. But this
can easily be understood in view of Theorem 1.C.4, which asserts that
under concavity every local maximum is a global maximum. In other words,
the concavity of $f$ also plays a crucial role in establishing the global
characterization of a maximum.
REMARK: Consider the constrained maximum problem of maximizing $f(x)$
subject to $g_j(x) \geq 0$, $j = 1, \ldots, m$, $x \in X \subset R^n$. Let $C = \{x \in X : g_j(x) \geq 0,
j = 1, 2, \ldots, m\}$ (the constraint set). If $\hat{x}$ is a solution of this problem, then
$\hat{x}$ maximizes $f(x)$ over $C$. Hence identifying set $X$ in Theorems 1.C.4 and
1.C.5 with set $C$ and assuming that $C$ is convex, we can assert under the
concavity of $f$ that every local maximum of $f$ over $C$ is a global maximum of
$f$ over $C$, and that $S = \{\hat{x} : f(\hat{x}) \geq f(x),\ x \in C\}$ is convex. Furthermore, if $f$
is strictly concave (and $C$ is convex), $\hat{x}$ is unique.
REMARK: Again consider the constrained maximum problem of maximizing
$f(x)$ over $C$. Suppose that the solution $\hat{x}$ is in the interior of $C$, so that
there exists an open ball about $\hat{x}$ [say, $B_\epsilon(\hat{x})$] which is in $C$. Then Theorems
1.C.6 and 1.C.7 can be applied directly to such a constrained maximum
problem by identifying $X$ in these theorems with $B_\epsilon(\hat{x})$. In other words, the
constrained maximum problem is reduced to an unconstrained maximum
problem. But there is nothing surprising in this, for that the solution $\hat{x}$ is
in the interior of $C$ means that none of the constraints $g_j(x) \geq 0$,
$j = 1, 2, \ldots, m$, are effective at $\hat{x}$.
FOOTNOTES
1. When X = R, there are now two different definitions of differentiability. That is,
f is differentiable at x° (i) if both the right-hand and the left-hand derivatives exist
and they are equal, or (ii) if the differential exists at x° in the above sense. It
can be shown that these two definitions are equivalent. See, for example, Brown
and Page [1], pp. 266-267, especially theorem 7.1.9.
2. In other words, if $f$ is differentiable at $x^0$ in one norm in $R^n$, then $f$ is differentiable
at $x^0$ in another norm in $R^n$ and the two derivatives coincide. See, for example,
Brown and Page [1], p. 273.
3. In infinite dimensional (normed linear) spaces, the differentiability and the value
of differentials, in general, depend on the choice of the norm.
4. For the proof of this statement, see, for example, Fleming [2] and Rudin [6].
Here it is crucial that $X$ is in a finite dimensional space such as $R^n$. If $X$ is in an
infinite dimensional space, the function may be differentiable at $x^0$ but may fail
to be continuous at $x^0$. See, for example, Brown and Page [1], p. 274 (exercise 3).
5. Weierstrass showed that the function $f(x) = \sum_{n=0}^{\infty} a^n \cos(b^n \pi x)$ is continuous but
nowhere differentiable when $b$ is an odd integer, $0 < a < 1$ and $ab > 1 + (3/2)\pi$.
This was first published by du Bois Reymond in 1875. Since then simpler examples
have been constructed. One of the simplest was given by B. L. van der Waerden
REFERENCES
1. Brown, A. L., and Page, A., Elements of Functional Analysis, London, England,
Van Nostrand Reinhold, 1970.
2. Fleming, W. H., Functions of Several Variables, Reading, Mass., Addison-Wesley,
1965, esp. chapters 1, 2, 3, and 4.
3. Goffman, C., Calculus of Several Variables, New York, Harper & Row, 1965, esp.
chapters 2 and 3.
4. Hadley, G., Nonlinear and Dynamic Programming, Reading, Mass., Addison-Wesley,
1964, esp. chapters 1 and 3.
5. Loomis, L. H., and Sternberg, S., Advanced Calculus, Reading, Mass., Addison-
Wesley, 1968, esp. chapter 3.
6. Rudin, W., Principles of Mathematical Analysis, 2nd ed., New York, McGraw-Hill,
1964, esp. chapter 9.
7. Vainberg, M. M., Variational Methods for the Study of Nonlinear Operators, (trans-
lated by Feinstein from the Russian original published in 1956), San Francisco,
Holden-Day, 1964, esp. chapter 1.
Section D
THE QUASI-SADDLE-POINT
CHARACTERIZATION
Supposing that we are given real-valued functions $f(x), g_1(x), g_2(x), \ldots,
g_m(x)$, on $X$ in $R^n$, in Section B we discussed the following two conditions:
(M) (Maximality condition) There exists an $\hat{x}$ in $X$ which maximizes $f(x)$ subject
to $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, and $x \in X$.
(SP) (Saddle-point condition) There exists an $(\hat{x}, \hat{\lambda})$ in $X \otimes \Omega^m$ such that $(\hat{x}, \hat{\lambda})$
is a saddle point of $\Phi(x, \lambda)$; that is, $\Phi(x, \hat{\lambda}) \leq \Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda)$, for all $x \in X$
and $\lambda \in \Omega^m$, where $\Phi(x, \lambda) = f(x) + \lambda \cdot g(x)$.
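The saddle-point inequalities in (SP) can be checked numerically on a small instance. The following sketch assumes a hypothetical one-variable problem, maximize $f(x) = -(x - 2)^2$ subject to $g(x) = 1 - x \geq 0$, whose solution is $\hat{x} = 1$ with multiplier $\hat{\lambda} = 2$; the problem, the grids, and the tolerance are all illustrative assumptions.

```python
# Hypothetical concave program (not from the text):
#   maximize f(x) = -(x - 2)^2  subject to  g(x) = 1 - x >= 0.

def f(x):
    return -(x - 2.0) ** 2


def g(x):
    return 1.0 - x


def lagrangian(x, lam):
    """Phi(x, lambda) = f(x) + lambda * g(x)."""
    return f(x) + lam * g(x)


x_hat, lam_hat = 1.0, 2.0                     # candidate saddle point
saddle_value = lagrangian(x_hat, lam_hat)     # equals f(x_hat) = -1

xs = [k / 10.0 for k in range(-50, 51)]       # grid of x values in [-5, 5]
lams = [k / 10.0 for k in range(0, 101)]      # grid of lambda >= 0

tol = 1e-12
left_ok = all(lagrangian(x, lam_hat) <= saddle_value + tol for x in xs)
right_ok = all(saddle_value <= lagrangian(x_hat, lam) + tol for lam in lams)
```

Here `left_ok` corresponds to $\Phi(x, \hat{\lambda}) \leq \Phi(\hat{x}, \hat{\lambda})$ and `right_ok` to $\Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda)$, each tested only on the chosen grid.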
In Section B, we showed that condition (M) implies condition (SP) if $f$ and
the $g_j$'s are all concave functions (where $X$ is a convex set) and if a normality
condition such as Slater's condition is satisfied. We also showed that condition
(SP) implies condition (M) (with no conditions such as the concavity of $f$ and the
$g_j$'s or Slater's condition). In this section, unlike Section B, we assume that $f$
and the $g_j$'s are differentiable on $X$. First we introduce the following condition,
which is also known as the first-order condition or the Kuhn-Tucker-Lagrange
condition.
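PROOF:
(i) By assumption, $\Phi(x, \hat{\lambda}) \leq \Phi(\hat{x}, \hat{\lambda})$ for all $x \in X$. [That is, $\hat{x}$ achieves a
global (hence local) maximum of $\Phi(x, \hat{\lambda})$ on $X$.] Therefore, by Theorem
1.C.7, $\Phi_x(\hat{x}, \hat{\lambda}) = 0$ or $\hat{f}_x + \hat{\lambda} \cdot \hat{g}_x = 0$. Also, by Theorem 1.B.4, (SP)
implies $\hat{\lambda} \cdot g(\hat{x}) = 0$. Hence $\Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda)$ for all $\lambda \geq 0$ implies that
$\lambda \cdot g(\hat{x}) \geq 0$ for all $\lambda \geq 0$. Hence, in particular, $g(\hat{x}) \geq 0$.
(ii) Since $\Phi(x, \hat{\lambda}) = f(x) + \hat{\lambda} \cdot g(x)$ is a nonnegative linear combination of
$f$ and the $g_j$'s, and since $f$ and the $g_j$'s are concave functions, $\Phi(x, \hat{\lambda})$ is
also a concave function. Then, by Theorem 1.C.7, we have $\Phi(x, \hat{\lambda}) \leq
\Phi(\hat{x}, \hat{\lambda})$, and $\Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda)$ follows trivially from the fact that
$\hat{\lambda} \cdot g(\hat{x}) = 0$, $g(\hat{x}) \geq 0$ and $\lambda \geq 0$. (Q.E.D.)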
Combining the above theorem with the corollary of Theorem 1.B.3, Theorem
1.D.2 follows at once.
[Figure 1.10. Relationships between (M), (SP), and (QSP) under Concavity. Recoverable arrow labels: "Always"; "$f$, $g_j$'s concave".]
(KTCQ) Let $C$ be the constraint set defined by $C = \{x : x \in X,\ g_j(x) \geq 0,\ j = 1,
2, \ldots, m\}$. Let $\hat{x}$ be a point in $C$ with $g_j(\hat{x}) = 0$ for $j \in E$ where $E \subset \{1, 2, \ldots, m\}$
and $E \neq \emptyset$. Let $\bar{x}$ be any point in $X$ such that $g_j'(\hat{x}) \cdot (\bar{x} - \hat{x}) \geq 0$ for all $j \in E$.
It is supposed that there exists a function $h(t)$ on $[0, 1]$ into $X$, which is differentiable
at 0 with the following properties.
This (KTCQ) is illustrated in Figure 1.12. In either of the two cases, (KTCQ)
is satisfied.
The case in which (KTCQ) is not satisfied is illustrated in Figure 1.13. This
is the case where there is some irregularity (such as a "cusp") on the boundaries.
It should be clear that for a point such as $\bar{x} \in Y$ in Figure 1.13, there is no function
$h(t)$ satisfying (i) and (ii) of (KTCQ).
Before we state Kuhn-Tucker's main theorem, we also should modify condi-
tion (M) [note that (M) implies (LM)].
(LM) (Local maximality condition) There exists an $\hat{x}$ in $X$ such that $f(x)$ has a
local maximum at $\hat{x}$ subject to $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, and $x \in X$.
In other words, there exists an open ball $B(\hat{x})$ about $\hat{x}$ such that $A = B(\hat{x}) \cap C \neq \emptyset$
and $f(\hat{x}) \geq f(x)$ for all $x \in A$, where $C$ is the constraint set.
We now state and prove the theorem.
[Figure 1.12. Illustration of (KTCQ) with $X = R^2$. Case a: $E = \{1, 2\}$; Case b: $E = \{2\}$. In each panel, $C$ is the darkly shaded region and $Y$ is the entire shaded region.]
$g_j(\hat{x}) = 0$, $j \in E$
$g_j(\hat{x}) > 0$, $j \notin E$
$o(\|t\|)$
or
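Choose $\hat{\lambda}_j = 0$ if $j \notin E$. We have now obtained $(\hat{x}, \hat{\lambda}_1, \ldots, \hat{\lambda}_m)$ such that
$f'(\hat{x}) + \sum_{j=1}^{m} \hat{\lambda}_j g_j'(\hat{x}) = 0$. That $g_j(\hat{x}) \geq 0$ for all $j$ follows immediately from
(LM). Since $g_j(\hat{x}) = 0$ for $j \in E$, and $\hat{\lambda}_j = 0$ for $j \notin E$, we have
$$\sum_{j=1}^{m} \hat{\lambda}_j g_j(\hat{x}) = 0$$
(Q.E.D.)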
REMARK: It should be clear from the above proof that the theorem follows
almost immediately from (KTCQ). Note also that the above theorem and
proof follow almost word for word when f and the gj's are real-valued
functions defined over X, where X is a "Banach space" (that is, a complete
normed linear space).³ See Ritter [13]. The Minkowski-Farkas lemma
holds almost as it is when X, the domain of the f and gj's, is an arbitrary
linear space (which can be infinite dimensional and does not even have to
be normed). See Fan [4], especially theorem 4, p. 104.
REMARK: In the (QSP) condition, we required, among others, the following
relation:
$$\hat{f}_x + \hat{\lambda} \cdot \hat{g}_x = 0$$
If we do not have (KTCQ), (LM) does not necessarily imply (QSP) (in
particular, the above relation). In other words, a statement such as "the
first-order conditions are the necessary conditions for a local maximum" is
not necessarily true. A special regularity condition such as (KTCQ) is
required to make this statement valid.
However, if we modify the above expression to
$$\hat{\lambda}_0 \hat{f}_x + \hat{\lambda} \cdot \hat{g}_x = 0 \quad \text{where } \hat{\lambda}_0 \geq 0$$
allowing the coefficient $\hat{\lambda}_0$ for $\hat{f}_x$ (with the possibility of $\hat{\lambda}_0 = 0$), then (LM)
always implies (QSP) with this modification. In other words, the role of
(KTCQ) is to guarantee that $\hat{\lambda}_0 > 0$ (which in turn enables us to set $\hat{\lambda}_0 = 1$).
Hence the regularity condition such as (KTCQ) is really the normality con-
(ii) The functions $g_j(x)$, $j = 1, 2, \ldots, m$, are linear or linear affine functions.
(iii) The functions $g_j(x)$, $j = 1, 2, \ldots, m$, are concave functions and there exists an
$\bar{x}$ in $X$ such that $g_j(\bar{x}) \geq 0$ for $j \in E'$ and $g_j(\bar{x}) > 0$ for $j \in E''$, where $E'$ is the
set of indices for the effective constraints (at $\hat{x}$) which are linear (affine), and
$E''$ is the set of indices for the effective constraints (at $\hat{x}$) which are not linear
(affine).
(iv) The constraint set, $C = \{x : g_j(x) \geq 0,\ j = 1, 2, \ldots, m,\ x \in R^n\}$, is convex and
possesses an interior point, and $g_j'(\hat{x}) \neq 0$ for all $j \in E$, where $E$ is the set of
indices of all the effective constraints at $\hat{x}$.
(v) (Rank condition) The rank of $[g_j'(\hat{x})]_{j \in E}$ equals the number of effective
constraints at $\hat{x}$, where the rank of the $k \times n$ matrix is defined as the (maximum)
number of its linearly independent rows (which is equal to the maximum
number of its linearly independent columns).
PROOF: Omitted. See Arrow, Hurwicz, and Uzawa [1], especially their
theorem 3. See also the appendix to this section.
REMARK: We may call the above five conditions the Arrow-Hurwicz-
Uzawa (or the A-H-U) conditions. Condition (ii) is obviously a special case
of condition (i). Condition (ii) is important in connection with linear pro-
gramming. Note that Slater's condition is a special case of condition (iii).
Conditions (iv) and (v) presuppose no concavity or convexity of the $g_j$'s.
Condition (v) is famous from classical Lagrangian multiplier theory which
deals with the case in which all the constraints are effective (that is, all the
constraints are "equality constraints"). It is important to note that all the
above five conditions are concerned with the constraints only.
REMARK: We illustrate Theorems 1.D.3 and 1.D.4 schematically in
Figure 1.14.
If $f$ and the $g_j$'s are concave, then (QSP) implies (LM), hence (M). This was
already discussed in Theorem 1.D.2.
We now show one immediate corollary of the above theorem.
[Figure 1.14. Schematic illustration of Theorems 1.D.3 and 1.D.4, relating (LM), (QSP), (KTCQ), and (A-H-U).]
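(i) There exists a $\hat{\lambda} \geq 0$ such that $(\hat{x}, \hat{\lambda})$ is a saddle point of $\Phi(x, \lambda) = f(x) + \lambda \cdot g(x)$
over $R^n \otimes \Omega^m$; that is, $\Phi(x, \hat{\lambda}) \leq \Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda)$ for all $x \in R^n$ and all
$\lambda \geq 0$.
Or
(ii) There exists a $\hat{\lambda} \geq 0$ such that $(\hat{x}, \hat{\lambda})$ satisfies (QSP).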
PROOF: Since gj(x) is linear affine for all j, condition (ii) of the Arrow-
Hurwicz-Uzawa theorem is satisfied. Hence (M) implies (QSP). Moreover,
f is concave and the gj's are linear affine, hence concave; therefore (QSP)
is sufficient for the maximality (M). Thus statement (ii) in the above theorem
is proved. Owing to the differentiability of $f$ and the $g_j$'s, (SP) implies (QSP).
Owing to the concavity of $f$ and the $g_j$'s, (QSP) implies (SP). Since (QSP) is
necessary and sufficient for (M), (SP) is now also necessary and sufficient
for (M), which proves statement (i) of the theorem. (Q.E.D.)
REMARK: To prove the above theorem, we really do not need the machinery
of the Arrow-Hurwicz-Uzawa theorem. But the extreme simplicity of the
above proof will indicate the power of the theorem, as well as enhance the
reader's understanding of the theorem.
REMARK: It may be worthwhile to recall the warning that we gave earlier.
That is, in order that an expression such as $\hat{f}_x + \hat{\lambda} \cdot \hat{g}_x = 0$ in (QSP) be
meaningful, $\hat{x}$ must be an interior point in $X$. If this condition is not satisfied,
the theorems which involve (QSP) become meaningless and those theorems
whose proofs require (QSP) may not hold. Consider the following example
from Moore [12].
Maximize (over $x \in R$): $f(x) = \sqrt{1 - x^2}$
Subject to: $g(x) = x - 1 \geq 0$
Note that the domain of $f(x)$ is restricted to the closed interval $[-1, 1]$ in $R$,
in order that $f(x)$ be a real-valued function. Hence, in view of the constraint
$x - 1 \geq 0$, the constraint set consists of only one point $x = 1$. The solution of
the above problem is obviously $\hat{x} = 1$. However, we cannot state the (QSP)
condition since $f'$ is not defined at $\hat{x} = 1$. Note that the Lagrangian
$\Phi \equiv \sqrt{1 - x^2} + \lambda(x - 1)$ does not have a saddle point at $\hat{x} = 1$. Note also that
the constraint function $g(x)$ is linear affine in this case, so that condition (ii)
of the A-H-U theorem is satisfied.
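The failure of the saddle-point property in this example can be seen numerically. A saddle point at $\hat{x} = 1$ would require $\sqrt{1 - x^2} + \lambda(x - 1) \leq 0$ for all $x \in [-1, 1]$ and some $\lambda \geq 0$. The sketch below, whose sample multipliers and choice of test point are illustrative assumptions, exhibits for each sampled $\lambda$ a point near 1 where the Lagrangian is strictly positive.

```python
import math

def lagrangian(x, lam):
    """Phi(x, lambda) = sqrt(1 - x^2) + lambda * (x - 1) on [-1, 1]."""
    return math.sqrt(1.0 - x * x) + lam * (x - 1.0)


def violating_point(lam):
    """A point x = 1 - delta in [0, 1) at which Phi(x, lam) > Phi(1, lam) = 0.

    With delta = 1 / (1 + lam^2) we have 2*delta - delta^2 > (lam*delta)^2,
    so the square-root term dominates the linear term.
    """
    delta = 1.0 / (1.0 + lam * lam)
    return 1.0 - delta


sample_lams = [0.0, 1.0, 10.0, 1000.0]        # assumed sample of multipliers
violations = [lagrangian(violating_point(lam), lam) for lam in sample_lams]
value_at_solution = lagrangian(1.0, 5.0)      # Phi(1, lambda) = 0 for any lambda
```

Every entry of `violations` is positive, so none of the sampled multipliers satisfies $\Phi(x, \lambda) \leq \Phi(1, \lambda)$ on $[-1, 1]$.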
[Figure 1.15. Relationships among (LM), (M), (SP), and (QSP). Recoverable arrow labels: "(Conc.)"; "(KTCQ) or (A-H-U)".]
In Figure 1.15, the arrow again reads "implies" under the conditions stated
with the arrow. If no conditions are stated, then no conditions are necessary
to obtain the given implication. In practical applications of nonlinear pro-
gramming theory to economics, the following conditions are often satisfied.
(i) The function f is concave and the gj's are all concave and satisfy (S), or
(ii) The function f is concave and the gj's are all linear (affine).
In such a "nice" situation, we can easily see that Figure 1.15 is considerably
simplified to Figure 1.16; that is, (LM), (M), (SP), and (QSP) are all equivalent.
The classical Lagrangian problem is concerned with the problem of "equality
constraints," that is, finding $\hat{x} \in X$, an open subset of $R^n$, which maximizes $f(x)$ subject
to $g_j(x) = 0$, $j = 1, 2, \ldots, m$, where $f$ and the $g_j$'s are real-valued continuously
differentiable functions on $X$ in $R^n$. As we remarked earlier, these constraints can
be converted into $g_j(x) \geq 0$ and $-g_j(x) \geq 0$, $j = 1, 2, \ldots, m$. Clearly if the constraints
are all linear, then the constraint qualification is satisfied as a result of
condition (ii) of the A-H-U theorem. However, suppose that the $g_j$'s are not
linear. Condition (i) cannot be applied, for if $g_j$ is convex, then $-g_j$ is concave.
Condition (iii) cannot be applied either, for if $g_j(x) > 0$ for some $x$, we cannot
have $-g_j(x) > 0$. The rank condition (v) may not seem applicable, for the rank
of the $(2m \times n)$ matrix obtained by stacking $[\partial g_j/\partial x_i]$ on $[-\partial g_j/\partial x_i]$
is certainly not equal to $2m$, for any $x \in X$, where $2m$ is apparently the number of
effective constraints. However, this is not correct reasoning. We have to note
that the constraints $g_j(x) \geq 0$ and $-g_j(x) \geq 0$ are not distinct constraints when the
values of $x$ are such that $g_j(x) = 0$. Hence, although there are $2m$ constraints in
appearance, the number of distinct constraints for $x$ with $g_j(x) = 0$ is $m$, and the
rank condition for the problem should be stated that the rank of the $(m \times n)$ matrix
$[\partial g_j/\partial x_i]$ should be equal to $m$. Hence we obtain the following classical theorem,
which was originally conceived by Lagrange and later developed by Caratheodory
[2] and Bliss.
Theorem 1.D.6 (Lagrange): Suppose that $\hat{x}$ satisfies (LRM).⁷ Suppose also that
(QSP)
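$\Phi_1 = \hat{x}_2 + 2\hat{\lambda}\hat{x}_1 = 0$, $\Phi_2 = \hat{x}_1 + 2\hat{\lambda}\hat{x}_2 = 0$, and $\hat{x}_1^2 + \hat{x}_2^2 - 1 = 0$
From these three equations we can easily obtain $\hat{x}_1, \hat{x}_2 = \pm 1/\sqrt{2}$ and $\hat{\lambda}$.
The rank condition which validates the above computation is that the rank
of $(\partial g/\partial x_1, \partial g/\partial x_2) = (2x_1, 2x_2)$ must be equal to one at $(\hat{x}_1, \hat{x}_2)$. It is
obvious that this condition holds.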
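The three equations and the rank condition can be verified directly. The sketch below assumes that the underlying problem is to maximize $x_1 x_2$ subject to $x_1^2 + x_2^2 - 1 = 0$ (this matches the first-order conditions above, but the problem statement itself is an assumption here), and that the multiplier is $\hat{\lambda} = -1/2$, a value computed for this sketch rather than quoted from the text.

```python
import math

# Assumed problem: maximize x1 * x2 subject to g(x) = x1^2 + x2^2 - 1 = 0.
x_hat = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))
lam_hat = -0.5                     # multiplier computed for this sketch


def first_order_residuals(x, lam):
    """Residuals of the two stationarity equations and the constraint."""
    phi_1 = x[1] + 2.0 * lam * x[0]
    phi_2 = x[0] + 2.0 * lam * x[1]
    constraint = x[0] ** 2 + x[1] ** 2 - 1.0
    return (phi_1, phi_2, constraint)


residuals = first_order_residuals(x_hat, lam_hat)

# Rank condition: the 1 x 2 matrix (2*x1, 2*x2) has rank one iff it is nonzero.
grad_g = (2.0 * x_hat[0], 2.0 * x_hat[1])
rank_is_one = any(abs(c) > 1e-12 for c in grad_g)

# Compare the objective at x_hat with its values along the unit circle.
angles = [2.0 * math.pi * k / 360.0 for k in range(360)]
best_on_circle = max(math.cos(a) * math.sin(a) for a in angles)
objective_at_x_hat = x_hat[0] * x_hat[1]
```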
The above consideration of the case with the equality constraints enables
us to extend our analysis to the case in which both equality and inequality
constraints are present (the case of mixed constraints). In other words, consider
the problem of finding x so as to
Maximize (over $x \in X$): $f(x)$
Theorem 1.D.7: Suppose that $\hat{x}$ satisfies (LM).⁸ Suppose also that the rank of the
$(m_e + l) \times n$ matrix
$$\begin{bmatrix} \dfrac{\partial g_j}{\partial x_i} \\[2ex] \dfrac{\partial h_k}{\partial x_i} \end{bmatrix}, \quad \text{where } j \in E \text{ and } k = 1, 2, \ldots, l$$
evaluated at $\hat{x}$, is equal to $(m_e + l)$. Then (QSP) holds for this $\hat{x}$ where the Lagrangian
for the above theorem is defined as
$$\Phi \equiv f(x) + \lambda \cdot g(x) + \mu \cdot h(x)$$
Here (QSP) requires $\hat{\lambda} \geq 0$, while $\hat{\mu}$ can be either positive or negative.
REMARK: For a further and more rigorous consideration of mixed
constraints, see Mangasarian [11], chapter 11.
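EXAMPLE: Consider the problem of choosing $(x_1, x_2) \in R^2$ to
Maximize: $x_1 x_2$
Subject to: $g(x) \equiv x_1 + 8x_2 - 4 \geq 0$
$h(x) \equiv x_1^2 + x_2^2 - 1 = 0$
Using the diagrammatical representation of the problem, the solution of
this problem can be obtained easily as $\hat{x}_1 = \hat{x}_2 = 1/\sqrt{2}$. The Lagrangian
of this problem is defined as $\Phi \equiv x_1 x_2 + \lambda_1(x_1 + 8x_2 - 4) + \lambda_2(x_1^2 + x_2^2 - 1)$
and the (QSP) conditions are written out as
$$\frac{\partial \Phi}{\partial x_1} = \hat{x}_2 + \hat{\lambda}_1 + 2\hat{\lambda}_2 \hat{x}_1 = 0$$
$$\frac{\partial \Phi}{\partial x_2} = \hat{x}_1 + 8\hat{\lambda}_1 + 2\hat{\lambda}_2 \hat{x}_2 = 0$$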
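The stated solution can be checked against these conditions. In the sketch below the multiplier values $\hat{\lambda}_1 = 0$ (the inequality constraint is not effective at the solution) and $\hat{\lambda}_2 = -1/2$ are computed for this illustration and are not quoted from the text.

```python
import math

x_hat = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))
lam1_hat, lam2_hat = 0.0, -0.5     # multipliers assumed for this sketch


def g(x):
    """Inequality constraint g(x) = x1 + 8*x2 - 4 >= 0."""
    return x[0] + 8.0 * x[1] - 4.0


def h(x):
    """Equality constraint h(x) = x1^2 + x2^2 - 1 = 0."""
    return x[0] ** 2 + x[1] ** 2 - 1.0


def stationarity(x, lam1, lam2):
    """Partial derivatives of the Lagrangian with respect to x1 and x2."""
    d1 = x[1] + lam1 + 2.0 * lam2 * x[0]
    d2 = x[0] + 8.0 * lam1 + 2.0 * lam2 * x[1]
    return (d1, d2)


g_value = g(x_hat)                                  # strictly positive: inactive
h_value = h(x_hat)                                  # zero up to rounding
d1, d2 = stationarity(x_hat, lam1_hat, lam2_hat)    # both zero up to rounding
complementary_slackness = lam1_hat * g_value        # lambda_1 * g(x_hat) = 0
```

Note that $\hat{\lambda}_1 \geq 0$ as (QSP) requires for the inequality constraint, while the multiplier on the equality constraint is negative, which the theorem permits.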
another route, that is, via calculus. The classical Lagrangian theorem is concerned
with the case in which all the constraints are equalities. Karush [9], much
prior to Kuhn and Tucker [10], considered the inequality constraints, reducing
them to the equality constraints by adding to, or subtracting from, each inequality
the square of a real number. For example, the constraints $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$,
can be converted into the following equality constraints:
FOOTNOTES
REFERENCES
1. Arrow, K. J., Hurwicz, L., and Uzawa, H., "Constraint Qualifications in Maximiza-
tion Problems," Naval Research Logistics Quarterly, vol. 8, no. 2, June 1961.
2. Caratheodory, C., Calculus of Variations and Partial Differential Equations of the
First Order, Part II, Calculus of Variations, San Francisco, Holden Day, 1967 (trans-
lated from the German original published in 1935).
Theorem 1.D.8: Let $f, g_1, g_2, \ldots, g_m$ be real-valued differentiable functions defined
on a nonempty open convex set $X$ in $R^n$. Suppose that condition (AHU) is satisfied.
Then (LM) implies (QSP). In other words, if $\hat{x}$ achieves a local maximum of the
problem, there exists a $\hat{\lambda}$ such that the quasi-saddle-point condition (QSP) is satisfied.
PROOF (HURWICZ):
(i) Suppose $g_j(\hat{x}) > 0$ for all $j = 1, 2, \ldots, m$ (that is, $E = \emptyset$). It follows
that $f'(\hat{x}) = 0$, for we have in this case the unconstrained maximization
problem. By choosing $\hat{\lambda}_1 = \hat{\lambda}_2 = \cdots = \hat{\lambda}_m = 0$, the (QSP) condition
is satisfied. Now we concentrate on the case in which $E \neq \emptyset$.
(ii) Suppose $g_j(x)$ is convex (that is, $j \in J$). Then $g_j(\hat{x} + th^*) - g_j(\hat{x}) \geq
g_j'(\hat{x}) \cdot (th^*)$ for all $t \in R$, $t \geq 0$, and any $h^* \in R^n$ such that $(\hat{x} + th^*) \in X$.
But by condition (AHU), we can choose $h^*$ such that $g_j'(\hat{x}) \cdot h^* \geq 0$.
Hence $g_j(\hat{x} + th^*) - g_j(\hat{x}) \geq 0$ for all $t \geq 0$, $t \in R$, such that $\hat{x} + th^* \in X$.
(iii) Suppose $g_j(x)$ is not convex (that is, $j \in J'$). By the definition of
differentiation, we have
$$g_j(x) - g_j(\hat{x}) = g_j'(\hat{x}) \cdot (x - \hat{x}) + o(\|x - \hat{x}\|) \quad \text{for all } x \in X$$
Let $x(t) \equiv \hat{x} + th^*$ such that $x(t) \in X$ and $t \geq 0$, $t \in R$. Then $g_j[x(t)] -
g_j(\hat{x}) = g_j'(\hat{x}) \cdot (th^*) + o(\|t\|)$. But by condition (AHU), we can choose $h^*$
such that $g_j'(\hat{x}) \cdot h^* > 0$. Hence $g_j[x(t)] - g_j(\hat{x}) > 0$ for sufficiently
small $t$. Hence choosing $t$ sufficiently small, say $0 < t < \bar{t}$, we can have
$g_j[x(t)] - g_j(\hat{x}) > 0$ for all $j \in J'$.
(iv) Let $x(t) = \hat{x} + th^*$, where $0 < t < \bar{t}$ with $x(t) \in X$. Then combining (ii)
and (iii) we have $g_j[x(t)] - g_j(\hat{x}) \geq 0$ for all $j \in E$, or $g_j[x(t)] \geq 0$
for all $j \in E$. Moreover, $g_j[x(t)] > 0$ for all $j \notin E$, for sufficiently small
$t$, say $t < \tilde{t}$, owing to the fact that $g_j(\hat{x}) > 0$ for all $j \notin E$ and the continuity
of the $g_j$'s. Thus $x(t) \in C$, where $C$ is the constraint set $\{x \in X : g_j(x) \geq 0,
j = 1, 2, \ldots, m\}$, if $t$ is sufficiently small (that is, $t < \min\{\bar{t}, \tilde{t}\}$).
(v) Now define $\Psi(t) \equiv f[x(t)] - f(\hat{x})$ for $0 \leq t < \hat{t}$ where $\hat{t} = \min\{\bar{t}, \tilde{t}\}$.
Note that $\Psi(0) = 0$. Also $\Psi(t) \leq 0$ for sufficiently small $t$, say, $0 \leq t < t_0 \leq
\hat{t}$ where $t_0 > 0$, because, by assumption, $x(t) \in C$ and $\hat{x}$ achieves a local
maximum of $f(x)$ subject to $x \in C$. Let $\Psi'_+(0)$ be the right-hand derivative
of $\Psi$ at $t = 0$.⁵ Then by the chain rule, we have $\Psi'_+(0) = f'(\hat{x}) \cdot h^*$.
Note that $\Psi(t) \leq 0 = \Psi(0)$, $0 \leq t < t_0$, which implies that $\Psi(t)$ is
nonincreasing at $t = 0$, or $\Psi'_+(0) \leq 0$. Hence $f'(\hat{x}) \cdot h^* \leq 0$. Since the choice of
$h^*$ can be arbitrary as long as condition (AHU) is satisfied, we have thus
established $f'(\hat{x}) \cdot h^* \leq 0$ for any $h^*$ for which condition (AHU) is satisfied.
Note that condition (AHU) is a fortiori satisfied if there exists an $h \in R^n$
such that $g_j'(\hat{x}) \cdot h > 0$ for all $j \in E$. Hence for any $h \in R^n$ for which
$g_j'(\hat{x}) \cdot h > 0$ for all $j \in E$, we have $f'(\hat{x}) \cdot h \leq 0$.
(vi) Now consider any h satisfying gj(z) h > 0 for all j E E. Then we have
gj'(z) (h + th*) > 0 for all j E E and for any t > 0, t E R, if condition
(AHU) is satisfied for h*. Then as a result of the conclusion obtained in
(v), we have
th*) 50
Take the limit as t - 0. Then, owing to the continuity of the inner
product, we obtain
f'(X)-h<0
Thus we have established that f'(z) h 5 0 for all h such that
jEE
(vii) Hence, from the Minkowski-Farkas lemma (Theorem O.B.4), there exist
,lj's, all > 0, such that
- f'(X) = jI Ajgj(x)
or
f'(z) + jE 0
111
f'(z) + l jgj'(z) = 0
=1
THE QUASI-SADDLE-POINT CHARACTERIZATION 105
That gj(c) ? 0 for all j follows immediately from condition (LM). Since
gj (s) = 0 for all j E E and ,ii = 0 for j it E, we obtain
'ti gi(x) = 0
(Q.E. D.)
REMARK: Just as (KTCQ), the above condition (AHU) is again the normal-
ity condition. Notice also that it is a qualification for the constraints (that is,
nothing is mentioned about the maximand function f ).
We are now ready to derive the conclusion of the Arrow-Hurwicz-Uzawa
theorem. In particular, we want to show that any one of the five conditions in the
A-H-U theorem implies condition (AHU). This part of the A-H-U theorem is
really a corollary of the above theorem and has already been established in the
original paper by Arrow, Hurwicz, and Uzawa (see the corollaries of their theorem
3 in [2] ).
First, note that if $g_j(x)$ is convex for all $j \in E$, then condition (AHU) is trivially satisfied. This can be seen easily by choosing $h^* = 0$ in the statement of condition (AHU). Clearly if either of the following two conditions is satisfied, then $g_j(x)$ is convex for all $j \in E$.
(i) The function $g_j(x)$ is convex for all $j = 1, 2, \ldots, m$.
(ii) The function $g_j(x)$ is linear for all $j = 1, 2, \ldots, m$.
Since every linear function is convex, (ii) is really a special case of (i); however, it
has a powerful implication, for, as remarked before, it implies that, in linear
programming, condition (AHU) is automatically satisfied.
Next we will see that the following modification of the Slater condition
implies condition (AHU):
(iii) The function $g_j(x)$ is concave for $j = 1, 2, \ldots, m$ and there exists an $\bar{x} \in X$ such that $g_j(\bar{x}) > 0$ for all $j = 1, 2, \ldots, m$.
To see this, first recall the following basic inequality for concave functions.
$g_j'(\hat{x}) \cdot (x - \hat{x}) \geq g_j(x) - g_j(\hat{x})$ for any $x, \hat{x} \in X$
In particular, set $x = \bar{x}$ and let $h = \bar{x} - \hat{x}$. Then, since $g_j(\hat{x}) = 0$ for $j \in E$, we have
$g_j'(\hat{x}) \cdot h \geq g_j(\bar{x}) > 0$ for all $j \in E$
That $g_j(\bar{x}) > 0$ (for all $j$) follows from the above condition (iii). Thus condition (AHU) is satisfied if condition (iii) is satisfied. It should be clear that, in view of (ii), condition (iii) can be slightly weakened as in (iii').
(iii') The functions $g_j(x)$, $j = 1, 2, \ldots, m$, are all concave and there exists an $\bar{x}$ in $X$ such that $g_j(\bar{x}) \geq 0$ for $j \in E'$ and $g_j(\bar{x}) > 0$, $j \in E''$, where $E'$ is the set of indices for the effective constraints (at $\hat{x}$) for which the $g_j$'s are linear, and $E''$ is the set of indices for the effective constraints (at $\hat{x}$) for which the $g_j$'s are not linear (but concave).
Next we show that the following rank condition implies condition (AHU):
(iv) The rank of the $m \times n$ matrix $g'(\hat{x})$ (that is, the Jacobian matrix) is equal to the number of effective constraints at $\hat{x}$.
Let $g_E'(\hat{x})$ be the submatrix of $g'(\hat{x})$ obtained from $g'(\hat{x})$ by deleting the rows which correspond to the constraints that are not effective at $\hat{x}$ [that is, "$g_j'(\hat{x})$ is a row of $g_E'(\hat{x})$" means $j \in E$]. Let $k$ be the number of effective constraints at $\hat{x}$. Then, owing to condition (iv), the number of linearly independent rows of the matrix $g'(\hat{x})$ is equal to $k$. Note that owing to an elementary property of matrices, the rank of a matrix cannot exceed the number of its columns or rows.^6 Hence $k \leq n$ as well as $k \leq m$. Since all the rows of $g_E'(\hat{x})$ are linearly independent, there are $k$ linearly independent columns in $g_E'(\hat{x})$. Without loss of generality we may suppose that the first $k$ columns of $g_E'(\hat{x})$ are linearly independent. Let $A$ be the $k \times k$ square matrix obtained from $g_E'(\hat{x})$ by deleting the $(k + 1)$th to the $n$th column (if $k < n$). Let $u$ be the $k$-vector whose elements are all equal to 1. Since $A$ is a nonsingular square matrix, there exists a $k$-vector $\bar{h}$ such that $A \cdot \bar{h} = u$. Let an $n$-vector $h^*$ be defined such that $h^*_i = \bar{h}_i$ for $i = 1, 2, \ldots, k$, and $h^*_i = 0$ for $i = k + 1, \ldots, n$. Then clearly we have $g_E'(\hat{x}) \cdot h^* = u > 0$, or $g_j'(\hat{x}) \cdot h^* > 0$ for $j \in E$. This establishes condition (AHU). Hence the rank condition (iv) implies condition (AHU).
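The construction of $h^*$ in the rank-condition argument can be traced numerically. The following is a minimal sketch, not part of the original proof: the matrix `g_E` is a made-up example of the Jacobian rows of two effective constraints in $R^3$, the helper names `solve_square` and `ahu_direction` are hypothetical, and the first $k$ columns are assumed to be linearly independent, as in the text.

```python
from fractions import Fraction

def solve_square(A, b):
    """Solve A h = b for a nonsingular square A by Gauss-Jordan elimination."""
    k = len(A)
    # Augmented matrix [A | b], kept in exact rational arithmetic.
    M = [[Fraction(v) for v in row] + [Fraction(b[i])] for i, row in enumerate(A)]
    for col in range(k):
        # Pick a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, k) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(k):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][k] for i in range(k)]

def ahu_direction(g_E):
    """Build h* from the k x n matrix of effective-constraint gradients.

    Assumes (as in the text, without loss of generality) that the first k
    columns of g_E are linearly independent.
    """
    k, n = len(g_E), len(g_E[0])
    A = [row[:k] for row in g_E]            # k x k nonsingular block
    h_bar = solve_square(A, [1] * k)        # A h_bar = u, u = (1, ..., 1)
    return h_bar + [Fraction(0)] * (n - k)  # pad with zeros to length n

# Hypothetical gradients of two effective constraints in R^3.
g_E = [[1, 2, 0],
       [3, 1, 5]]
h_star = ahu_direction(g_E)
products = [sum(Fraction(a) * h for a, h in zip(row, h_star)) for row in g_E]
print(h_star, products)
```

Every entry of `products` equals 1, so $g_j'(\hat{x}) \cdot h^* > 0$ for each effective constraint in this example, which is what condition (AHU) requires.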
There is another condition in the Arrow-Hurwicz-Uzawa theorem which implies condition (AHU): The constraint set $C$ is convex and has an interior, and $g_j'(\hat{x}) \neq 0$ for every $j \in E$. The proof that this condition implies condition (AHU) is a little complicated, and hence is omitted. Interested readers are referred to Arrow, Hurwicz, and Uzawa [2], p. 184.
Fritz John's famous theorem, originally obtained in 1948 ([4], theorem I, pp. 188-189),^7 is an easy consequence of Theorem 1.D.8.
Theorem 1.D.9 (John): Let $f, g_1, \ldots, g_m$ be real-valued differentiable functions defined on a nonempty open set $X$ in $R^n$. Suppose that (LM) is satisfied; that is, $\hat{x}$ achieves a local maximum of $f$ subject to $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, and $x \in X$. Then there exist $\hat{\lambda}_j \geq 0$, $j = 0, 1, 2, \ldots, m$, not vanishing simultaneously, such that
$\hat{\lambda}_0 f'(\hat{x}) + \sum_{j=1}^{m} \hat{\lambda}_j g_j'(\hat{x}) = 0$
PROOF (HURWICZ):
(i) Suppose that, for some $h^* \in R^n$, $g_j'(\hat{x}) \cdot h^* > 0$, $j \in E$. Then condition (AHU) is satisfied. Hence from the previous theorem, we are guaranteed the existence of $\hat{\lambda}_j \geq 0$, $j = 0, 1, 2, \ldots, m$, with $\hat{\lambda}_0 = 1$.
(ii) Suppose now that there exists no $h^* \in R^n$ for which
$g_j'(\hat{x}) \cdot h^* > 0, \quad j \in E$
Define the set $Z \equiv \{z \in R^k : z = g_E'(\hat{x}) \cdot h,\ h \in R^n\}$.^8 Then $Z$ does not contain any strictly positive element $z > 0$. Let $R^k_+$ be the positive orthant of $R^k$, that is, $\{z \in R^k : z > 0\}$. Then $R^k_+$ and $Z$ are two disjoint convex sets. Hence, owing to the Minkowski separation theorem (Theorem 0.B.3), there exists an $a \in R^k$, $a \neq 0$, such that^9
FOOTNOTES
1. I am grateful to Leonid Hurwicz for giving me permission to quote the results and the derivation from his unpublished paper [5], from which much of the material in this appendix is borrowed. Needless to say, any possible misunderstanding of [5], and hence mistakes, are mine.
2. The essence of Hurwicz [5] is to provide a proof without by-passing the use of (KTCQ).
3. Condition (QSP) says that there exist $\hat{x} \in X$ and $\hat{\lambda} \in R^m$, $\hat{\lambda} \geq 0$, such that $f'(\hat{x}) + \hat{\lambda} \cdot g'(\hat{x}) = 0$, $g(\hat{x}) \geq 0$, and $\hat{\lambda} \cdot g(\hat{x}) = 0$, where $g = (g_1, \ldots, g_m)$. If the nonnegativity constraint $x \geq 0$ is made explicit in addition to $g(x) \geq 0$, then (QSP) is modified to (QSP'): There exist $\hat{x} \in X$, $\hat{x} \geq 0$, $\hat{\lambda} \in R^m$, $\hat{\lambda} \geq 0$ such that $f'(\hat{x}) + \hat{\lambda} \cdot g'(\hat{x}) \leq 0$, $\hat{x} \cdot [f'(\hat{x}) + \hat{\lambda} \cdot g'(\hat{x})] = 0$, $g(\hat{x}) \geq 0$, and $\hat{\lambda} \cdot g(\hat{x}) = 0$. Recall our discussion on this point in Section D of this chapter.
4. It should be clear why $g_j'(\hat{x}) \cdot h^* > 0$ (instead of $\geq 0$) is required for $j \in J'$. If $g_j'(\hat{x}) \cdot h^* = 0$ is allowed for $j \in J'$, then we cannot guarantee that $g_j[x(t)] - g_j(\hat{x}) \geq 0$, $j \in J'$, for sufficiently small $t$.
5. Note that $\Psi'_+(0)$ exists because $f$ is differentiable in $X$.
6. The "rank" of a (rectangular) matrix is defined as the number of linearly independent
rows. As remarked before it can be shown that it is equal to the number of linearly
independent columns (the rank theorem).
7. When all the constraints are in the equality forms, that is, $g_j(x) = 0$, $j = 1, 2, \ldots, m$, then the theorem corresponding to Fritz John's theorem is known in the name of Lagrange and Euler. As remarked before, the proof of such a theorem is provided by Caratheodory ([3], pp. 176-177, theorem 2). See also theorem 76.1 of G. A. Bliss, Lectures on the Calculus of Variations, Chicago, Ill., University of Chicago Press, 1946.
8. Recall that $k$ is the number of effective constraints and that $g_E'(\hat{x})$ denotes the $k \times n$ matrix which is obtained from $g'(\hat{x})$ by deleting the rows which correspond to the ineffective constraints (ineffective at $\hat{x}$).
9. It should be clear that the separating hyperplane passes through the origin of R".
REFERENCES
1. Abadie, J., "On the Kuhn-Tucker Theorem," in Nonlinear Programming, ed. by J. Abadie, New York, Interscience, 1967.
2. Arrow, K. J., Hurwicz, L., and Uzawa, H., "Constraint Qualifications in Maximization Problems," Naval Research Logistics Quarterly, vol. 8, no. 2, June 1961.
3. Caratheodory, C., Calculus of Variations and Partial Differential Equations of the First Order, Part II, Calculus of Variations, San Francisco, Holden Day, 1967 (tr. by Robert Dean from German original, 1935).
4. John, F., "Extremum Problems with Inequalities as Subsidiary Conditions," Studies
and Essays, Courant Anniversary Volume, New York, Interscience, 1948.
5. Hurwicz, L., "LH-Oct. 1966," Lecture Note at the University of Minnesota, October
1966, revised July 2, 1970.
Section E
SOME EXTENSIONS
a. QUASI-CONCAVE PROGRAMMING
Definition: A real-valued function $f(x)$ defined over a convex set $X$ in $R^n$ is called quasi-convex if $-f(x)$ is quasi-concave. The function $f(x)$ is called strictly quasi-concave if
$f(x) \geq f(x')$ implies $f[tx + (1 - t)x'] > f(x')$ for all $x \neq x' \in X$ and $0 < t < 1$
The function $f(x)$ is called strictly quasi-convex if $-f(x)$ is strictly quasi-concave.
REMARK: Clearly a strictly quasi-concave (-convex) function is always
quasi-concave (-convex), but not vice versa.
We can easily show the following theorem.
Theorem 1.E.1:
(i) Any concave function is also quasi-concave, but the converse does not necessarily
hold. Similarly, any strictly concave function is also strictly quasi-concave, but the
converse does not necessarily hold.
(ii) Any monotone increasing (or decreasing) function is quasi-concave if $X \subset R$.
(iii) Any monotone nondecreasing function of a quasi-concave function is also quasi-
concave.
REMARK: An ordinary utility function whose corresponding indifference
curve is drawn convex to the origin is an example of a quasi-concave func-
tion. If the indifference curve does not contain a linear segment, then the
utility function is strictly quasi-concave. Although a quasi-concave function
is not necessarily concave, a quasi-concave function can be transformed
into a concave function, under a certain regularity condition, by a strict
positive transformation. See Fenchel [6], pp. 115-137. This observation is
Definition: Given the constraint set $C = \{x : x \in \Omega^n,\ g_j(x) \geq 0,\ j = 1, 2, \ldots, m\}$, we call the $i$th coordinate variable $x_i$ a relevant variable if there exists an $\bar{x}$ in $C$ such that $\bar{x}_i > 0$.
REMARK: As Arrow and Enthoven ([1], p. 783) explained, it is a variable
"which can take on a positive value without necessarily violating the con-
straints."
Theorem 1.E.2 (Arrow-Enthoven): Let $f, g_1, g_2, \ldots, g_m$ be differentiable, quasi-concave, real-valued functions of the $n$-dimensional vector $x$ on $R^n$ with $x \geq 0$. Then (QSP') implies (M'), provided that one of the following conditions is satisfied.
(i) $f_{x_i} < 0$ for at least one variable $x_i$, where $f_{x_i}$ is the partial derivative of $f(x)$ with respect to $x_i$, evaluated at $x = \hat{x}$.
(ii) $f_{x_i} > 0$ for some relevant variable $x_i$.
theorem are satisfied. Thus (QSP') implies (M'). This is nothing but the
result obtained as part of Theorem 1.D.2.
REMARK: If all the variables are relevant (the usual case in economic theory), then (i) and (ii) of Theorem 1.E.2 simply reduce to $f'(\hat{x}) \neq 0$.
Referring again to Arrow and Enthoven [1], we state the following theorem, which really corresponds to the Arrow-Hurwicz-Uzawa theorem [Theorem 1.D.4, conditions (iii) and (iv)]. The theorem is concerned with a necessary condition for the maximum.
Definition: A real-valued function $f(x)$ defined over a convex set $X$ in $R^n$ is called explicitly quasi-concave if it is quasi-concave and if
(i) If $f(x)$ is a concave (resp. convex) function defined on the convex set $X$ in $R^n$, then $f$ is explicitly quasi-concave (resp. explicitly quasi-convex) in $X$.^2
(ii) Let $f(x)$ be an explicitly quasi-concave (resp. explicitly quasi-convex) function on a convex set $X$ in $R^n$. Then every local maximum (resp. local minimum) of $f$ in the constraint set $C$, which is convex, is also a global maximum (resp. global minimum).^3
(iii) Let $f(x)$ be strictly quasi-concave on a convex set $X$ in $R^n$. Then if $\hat{x}$ achieves a local maximum of $f$ in the constraint set $C$ and if $C$ is a convex set, then it achieves a unique global maximum over $C$.^4
Definition: Let $f_1(x), f_2(x), \ldots, f_k(x)$ and $g_1(x), g_2(x), \ldots, g_m(x)$ be real-valued functions defined on $X$ in $R^n$. We say that $\hat{x}$ in $X$ gives a vector (global) maximum of $f(x) \equiv [f_1(x), f_2(x), \ldots, f_k(x)]$ subject to $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, if the following conditions hold:
(i) $g_j(\hat{x}) \geq 0$, $j = 1, 2, \ldots, m$, and $\hat{x} \in X$.
(ii) There exists no $\bar{x}$ satisfying
REMARK: The reader may realize that the concept of "efficient point" in activity analysis is a special case of the vector maximum where $f(x) = x$. One may also note that the vector maximum problem has immediate relevance to the concept of Pareto optimum, which is important in economic theory.
REMARK: The definition of vector local maximum is analogous to the
above definition of vector global maximum. For the distinction between a
local maximum and a global maximum, see Section C of this chapter. The
concept of a local maximum is concerned with maximization with respect to
some open ball (which can be very small).
REMARK: It follows from the above definition that if $f(\hat{x})$ is a constrained vector maximum, then
$\hat{x}$ maximizes $f_{i_0}(x)$ [that is, $f_{i_0}(\hat{x}) \geq f_{i_0}(x)$]
Subject to:
$f_i(x) \geq f_i(\hat{x})$ for all $i \neq i_0$
$g_j(x) \geq 0, \quad j = 1, 2, \ldots, m$
where the choice of $i_0$ is arbitrary. For if not, there exists an $\bar{x}$ and an $i_0$ such that
$f_{i_0}(\bar{x}) > f_{i_0}(\hat{x})$
and
$f_i(\bar{x}) \geq f_i(\hat{x})$ for all $i \neq i_0$
$g_j(\bar{x}) \geq 0, \quad j = 1, 2, \ldots, m$
This is a contradiction of the assumption that $\hat{x}$ is a vector maximum.
Utilizing this remark, we now prove the following theorem. The method
of proof using the above remark is due to El-Hodiri [4], who, in turn, attributes
the idea to Leonid Hurwicz.
Theorem 1.E.4: Let $f_1, f_2, \ldots, f_k, g_1, g_2, \ldots, g_m$ be real-valued concave functions defined on a convex set $X$ in $R^n$. Assume that Slater's condition (S) holds; that is, there exists an $\bar{x}$ in $X$ such that
$g_j(\bar{x}) > 0$ for all $j$
Then if $\hat{x}$ achieves a vector maximum of $f(x) = [f_1(x), \ldots, f_k(x)]$ subject to $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, there exist $\alpha \in R^k$, $\hat{\lambda} \in R^m$ with $\alpha \geq 0$, $\hat{\lambda} \geq 0$, and $\alpha \neq 0$ such that $(\hat{x}, \hat{\lambda})$ is a saddle point of $\Phi(x, \lambda) \equiv \alpha \cdot f(x) + \lambda \cdot g(x)$; that is,
$\Phi(x, \hat{\lambda}) \leq \Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda)$ for all $x \in X$ and $\lambda \geq 0$
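PROOF: By the above remark, for each $i_0 = 1, 2, \ldots, k$, $\hat{x}$ maximizes $f_{i_0}(x)$ subject to $f_i(x) \geq f_i(\hat{x})$, $i \neq i_0$, $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, and $x \in X$. Since all these functions are concave, there exist coefficients $\alpha_{i i_0} \geq 0$, $i = 1, 2, \ldots, k$, and $\lambda_{j i_0} \geq 0$, $j = 1, 2, \ldots, m$, not all zero, such that
$\alpha_{i_0 i_0} f_{i_0}(x) + \sum_{i \neq i_0} \alpha_{i i_0} [f_i(x) - f_i(\hat{x})] + \sum_{j=1}^{m} \lambda_{j i_0} g_j(x) \leq \alpha_{i_0 i_0} f_{i_0}(\hat{x}) + \sum_{i \neq i_0} \alpha_{i i_0} [f_i(\hat{x}) - f_i(\hat{x})] + \sum_{j=1}^{m} \lambda_{j i_0} g_j(\hat{x})$ for all $x \in X$
and
$\sum_{j=1}^{m} \lambda_{j i_0} g_j(\hat{x}) = 0$
These can easily be simplified to the following:
$\alpha_{i_0 i_0} f_{i_0}(x) + \sum_{i \neq i_0} \alpha_{i i_0} [f_i(x) - f_i(\hat{x})] + \sum_{j=1}^{m} \lambda_{j i_0} g_j(x) \leq \alpha_{i_0 i_0} f_{i_0}(\hat{x})$ for all $x \in X$
or,
$\sum_{i=1}^{k} \alpha_{i i_0} f_i(x) + \sum_{j=1}^{m} \lambda_{j i_0} g_j(x) \leq \sum_{i=1}^{k} \alpha_{i i_0} f_i(\hat{x}) + \sum_{j=1}^{m} \lambda_{j i_0} g_j(\hat{x})$ for all $x \in X$
and
$\sum_{j=1}^{m} \lambda_{j i_0} g_j(\hat{x}) = 0$
Summing these relations over $i_0 = 1, 2, \ldots, k$ and writing $\alpha_i \equiv \sum_{i_0} \alpha_{i i_0}$ and $\hat{\lambda}_j \equiv \sum_{i_0} \lambda_{j i_0}$, we obtain
$\sum_{i=1}^{k} \alpha_i f_i(x) + \sum_{j=1}^{m} \hat{\lambda}_j g_j(x) \leq \sum_{i=1}^{k} \alpha_i f_i(\hat{x}) + \sum_{j=1}^{m} \hat{\lambda}_j g_j(\hat{x})$ for all $x \in X$
and
$\sum_{j=1}^{m} \hat{\lambda}_j g_j(\hat{x}) = 0$
Or in vector notation,
$\alpha \cdot f(x) + \hat{\lambda} \cdot g(x) \leq \alpha \cdot f(\hat{x}) + \hat{\lambda} \cdot g(\hat{x})$ for all $x \in X$
and
$\hat{\lambda} \cdot g(\hat{x}) = 0$
It remains to show that
$\hat{\lambda} \cdot g(\hat{x}) \leq \lambda \cdot g(\hat{x})$ for all $\lambda \geq 0$
But this is obvious since $\hat{x}$ is a solution of this vector maximum problem, so that $g_j(\hat{x}) \geq 0$ for all $j$. (Q.E.D.)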
REMARK: We should note that Slater's condition (S) in the above theorem is used to guarantee $\alpha \neq 0$. Without this condition, $\alpha$ can be zero.
REMARK: It is possible to prove the above theorem directly from the separation theorem or the fundamental theorem of concave functions (Theorem 1.B.2), as we did for Theorem 1.B.3. But the above proof seems to be conceptually the simplest. See also Karlin [11], pp. 216-218, and Kuhn and Tucker [12].
one $i$ and $g_j(\bar{x}) \geq 0$, $j = 1, 2, \ldots, m$. Hence we have $\alpha \cdot f(\bar{x}) > \alpha \cdot f(\hat{x})$ with $g(\bar{x}) \geq 0$, $\bar{x} \in X$. This contradicts the above observation that $\hat{x}$ maximizes $\alpha \cdot f(x)$ subject to $g(x) \geq 0$ and $x \in X$. (Q.E.D.)
REMARK: Note that we do not need the concavity of the $f_i$ and $g_j$ nor the convexity of $X$ in Theorem 1.E.5. From the above proof we can immediately see the following useful theorem.^6
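Then
(i) $\hat{x}$ achieves a vector maximum of $f(x)$ subject to $g(x) \geq 0$ and $x \in X$.
(ii) $\hat{\lambda} \cdot g(\hat{x}) = 0$.
PROOF: Since $\Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda)$, we have $\hat{\lambda} \cdot g(\hat{x}) \leq \lambda \cdot g(\hat{x})$ for all $\lambda \geq 0$. Hence $\lambda \cdot g(\hat{x})$ is bounded from below for all $\lambda$ in $\Omega^m$. Therefore $\lambda \cdot g(\hat{x}) \geq 0$ for all $\lambda \geq 0$, so that we obtain $g(\hat{x}) \geq 0$. (Recall the lemma immediately preceding Theorem 0.B.4.) Putting $\lambda = 0$ in the above inequality, we obtain $\hat{\lambda} \cdot g(\hat{x}) \leq 0$. But $\hat{\lambda} \cdot g(\hat{x}) \geq 0$, since $\hat{\lambda} \geq 0$. Hence $\hat{\lambda} \cdot g(\hat{x}) = 0$.
Now note that $\Phi(x, \hat{\lambda}) \leq \Phi(\hat{x}, \hat{\lambda})$ for all $x \in X$; that is, $\alpha \cdot f(x) + \hat{\lambda} \cdot g(x) \leq \alpha \cdot f(\hat{x})$ (since $\hat{\lambda} \cdot g(\hat{x}) = 0$). This means that $\hat{x}$ maximizes $\alpha \cdot f(x)$ subject to $g(x) \geq 0$ and $x \in X$ with $\alpha > 0$. Owing to the above theorem, $\hat{x}$ achieves a vector maximum subject to $g(x) \geq 0$ and $x \in X$. (Q.E.D.)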
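The step used at the end of this proof, that a maximizer of $\alpha \cdot f(x)$ with $\alpha > 0$ is a vector maximum, can be checked on a small finite problem. The sketch below is only an illustration under stated assumptions: the feasible set is a made-up grid inside a quarter disk, $f(x) = x$ (the "efficient point" case), and the weights `alpha` and the helper names are hypothetical.

```python
def dominates(u, v):
    """True if u >= v in every component and u > v in at least one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))

def is_vector_maximum(x_hat, feasible, f):
    """x_hat is a vector maximum if no feasible x gives a dominating f(x)."""
    return not any(dominates(f(x), f(x_hat)) for x in feasible)

def f(x):
    return (x[0], x[1])              # f(x) = x: the efficient-point case

def g(x):
    return 4 - x[0] ** 2 - x[1] ** 2  # single constraint g(x) >= 0

# Finite stand-in for the constraint set: grid points with x >= 0 and g(x) >= 0.
grid = [(i / 2, j / 2) for i in range(5) for j in range(5)]
feasible = [x for x in grid if g(x) >= 0]

alpha = (1.0, 3.0)                   # strictly positive weights (assumed)
x_hat = max(feasible, key=lambda x: sum(a * fi for a, fi in zip(alpha, f(x))))

print(x_hat, is_vector_maximum(x_hat, feasible, f))
```

On this grid the weighted maximizer is undominated, while an interior point such as the origin is not; varying `alpha` over positive vectors picks out different undominated points.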
Now let us assume that the $f_i$'s and $g_j$'s are all differentiable in an open set $X$ in $R^n$. Then we can extend the above analysis of the constrained vector maximum problem in a manner similar to the analysis in Section D. Since the proof will be analogous to the proofs given above and in Section D, we need only list the main results. First we must define certain concepts.
Given differentiable vector-valued functions $f(x) = [f_1(x), f_2(x), \ldots, f_k(x)]$ and $g(x) = [g_1(x), \ldots, g_m(x)]$ defined over an open set $X$ in $R^n$, we define the following conditions.
(VM) There exists an $\hat{x} \in X$ which achieves a vector maximum of $f(x)$ subject to $g(x) \geq 0$, $x \in X$.
(LVM) There exists an open ball $B_\epsilon(\hat{x})$ with radius $\epsilon$ in $X$ about $\hat{x}$ such that $\hat{x}$ achieves a vector maximum of $f(x)$ subject to $g(x) \geq 0$ and $x \in B_\epsilon(\hat{x})$.
(VQSP) There exist $\alpha \in R^k$ with $\alpha \geq 0$, $\alpha \neq 0$, and $(\hat{x}, \hat{\lambda})$ in $X \otimes \Omega^m$
Theorem 1.E.8: Suppose that the $f_i$'s and $g_j$'s are all concave differentiable functions defined over an open convex set $X$ in $R^n$. Suppose also that Slater's condition is satisfied; that is,
(S) There exists an $\bar{x} \in X$ such that $g(\bar{x}) > 0$.
Then (VM) implies (VQSP).
Theorem 1.E.9: Suppose that the $g_j$'s satisfy (KTCQ) or (A-H-U) as defined in Section D. Then (LVM) implies (VQSP).
Theorem 1.E.10: Suppose that the $f_i$'s and $g_j$'s are all concave differentiable functions defined over an open convex set in $R^n$. Then (VQSP) implies (VM), where $\alpha$ in (VQSP) is assumed to be strictly positive.
PROOF OF THEOREM 1.E.10: Since $\alpha \cdot f(x) + \hat{\lambda} \cdot g(x)$ is a nonnegative linear combination of concave functions, it is concave. Hence by Theorem 1.C.7, we obtain from (VQSP)
$\alpha \cdot f(x) + \hat{\lambda} \cdot g(x) \leq \alpha \cdot f(\hat{x})$ for all $x \in X$
Since $\alpha > 0$, this proves that $\hat{x}$ achieves a vector maximum of $f(x)$ subject to $g(x) \geq 0$ and $x \in X$. (Q.E.D.)
tion problem.
We begin this discussion with some elementary concepts in linear algebra.
$f(x, x) \equiv \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j = x \cdot A \cdot x$
is called the quadratic form associated with $A$ over the real field.
REMARK: Clearly $f(x, x)$ is a real-valued function defined over $R^n \otimes R^n$, and it is bilinear in the sense that
$f(x + y, x) = f(x, x) + f(y, x)$ for all $x, y \in R^n$
$f(\alpha x, x) = \alpha f(x, x)$ for all $\alpha \in R$, $x \in R^n$
and
$f(x, x + y) = f(x, x) + f(x, y)$ for all $x, y \in R^n$
$f(x, \alpha x) = \alpha f(x, x)$ for all $\alpha \in R$ and $x \in R^n$
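where $(i, j, \ldots, k)$ is any permutation of $k$ integers from the set of integers $\{1, 2, \ldots, n\}$. The determinant $\tilde{D}_k$ is called a principal minor of $A$ with order $k$. Note that every $\tilde{D}_n$ has the same sign as the determinant of $A$, since both rows and columns are interchanged in the process of permutation.
EXAMPLES: For a $2 \times 2$ matrix $A$,
$D_1 = a_{11}, \quad D_2 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}$
$\tilde{D}_1 = a_{11}$ and $a_{22}$, $\quad \tilde{D}_2 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}$ and $\begin{vmatrix} a_{22} & a_{21} \\ a_{12} & a_{11} \end{vmatrix}$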
The following theorem is concerned with the characterization of a real
quadratic form.
has three kinds of D2, each kind having its own sign and value,
REMARK: Statements (i) and (ii) of the above theorem can be restated as the following: $Q(x)$ is positive definite if and only if $D_k > 0$, $k = 1, 2, \ldots, n$, and $Q(x)$ is negative definite if and only if $(-1)^k D_k > 0$, $k = 1, 2, \ldots, n$.
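The sign rules in this remark are easy to apply mechanically. The following is a minimal sketch, assuming a symmetric matrix given as a list of rows; the function names and the two test matrices are illustrative choices, not taken from the text, and the determinant is computed by cofactor expansion, which is adequate only for small matrices.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, a in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def leading_minors(A):
    """Successive principal minors D_1, ..., D_n of a square matrix A."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]

def is_positive_definite(A):
    """D_k > 0 for k = 1, ..., n (A assumed symmetric)."""
    return all(d > 0 for d in leading_minors(A))

def is_negative_definite(A):
    """(-1)^k D_k > 0 for k = 1, ..., n (A assumed symmetric)."""
    return all((-1) ** k * d > 0 for k, d in enumerate(leading_minors(A), start=1))

P = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]   # illustrative symmetric matrix
N = [[-v for v in row] for row in P]        # its negative

print(leading_minors(P), is_positive_definite(P))
print(leading_minors(N), is_negative_definite(N))
```

For the semidefinite cases the successive minors are not enough and all principal minors must be examined, so this sketch deliberately covers only the two definite cases.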
REMARK: The determinants $\tilde{D}_k$ in statements (iii) and (iv) of the above theorem cannot be replaced by $D_k$. For example, let
0
A =00 1
The $n$-vector $a$ is called the first derivative of $f$ at $x^0$ and $A$ is called the second derivative of $f$ at $x^0$. The first differential of $f$ at $x^0$ is the name given to $a \cdot h$, and $h \cdot A \cdot h$ is called the second differential of $f$ at $x^0$. The first and the second differentials are denoted by $\delta f$ (or $df$) and $\delta^2 f$ (or $d^2 f$), respectively. Note that $d^2 f = d(df)$.
REMARK: If X is a (normed) linear space which is not necessarily finite
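where $a_{ij} = \dfrac{\partial^2 f}{\partial x_i \partial x_j} = \dfrac{\partial}{\partial x_i} \left( \dfrac{\partial f}{\partial x_j} \right)$ (evaluated at $x^0$), $i, j = 1, 2, \ldots, n$. If the second partial derivatives are continuous, then the matrix $A$ is symmetric. The matrix $A$ is called the Hessian matrix of $f$ at $x^0$.^8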
REMARK: According to the usual convention in mathematics, the notation $[f''(x^0) \leq 0]$ means that the Hessian matrix $f''(x^0)$ is negative semidefinite, and not that each element of the matrix $f''(x^0)$ is nonpositive. Similarly, $[f''(x^0) < 0]$ means that the Hessian matrix $f''(x^0)$ is negative definite, and not that each element of $f''(x^0)$ is negative. When $x^0$ is a scalar, this convention does not create any confusion. But when the dimension of $x^0$ is greater than or equal to 2, this convention might cause confusion to some readers.^9 In this book, following the usual convention in economics, we reserve the notation $A \leq 0$ to mean that each element of the matrix $A$ is nonpositive. Similarly, $A < 0$ means that each element of $A$ is negative.
The following theorem offers a characterization of concave functions in
terms of the Hessian matrix.
(i) The function $f$ is concave on $X$ if and only if $f''(x)$ is negative semidefinite for all $x \in X$.
(ii) The function $f$ is strictly concave on $X$ if $f''(x)$ is negative definite for all $x \in X$.
(iii) The function $f$ is convex on $X$ if and only if $f''(x)$ is positive semidefinite for all $x \in X$.
(iv) The function $f$ is strictly convex on $X$ if $f''(x)$ is positive definite for all $x \in X$.
PROOF: See Fenchel [6], pp. 87-88.
REMARK: Note that concavity or convexity is a global concept. Hence in
each statement of Theorem 1.E.12, the phrase "for all x E X" is needed. If
f (x) is concave (or convex) in a convex subset S of its domain X, then X in
all four statements should be replaced by S.
REMARK: The converse of (ii) and the converse of (iv) do not necessarily
hold. For example, f (x) = - (x - 1)4, x E R, is strictly concave, but f" (1) =
0.
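The counterexample in this remark can be verified directly. The sketch below is an illustration only: it checks the strict midpoint inequality for $f(x) = -(x - 1)^4$ on a handful of arbitrarily chosen sample points (which suggests, but of course does not prove, strict concavity) and evaluates the second derivative, computed by hand as $f''(x) = -12(x - 1)^2$, at $x = 1$.

```python
def f(x):
    return -(x - 1) ** 4

def f2(x):
    """Second derivative of f, obtained by differentiating twice by hand."""
    return -12 * (x - 1) ** 2

# Arbitrary sample points; strict concavity requires
# f((a + b) / 2) > (f(a) + f(b)) / 2 whenever a != b.
points = [-1.0, 0.0, 0.5, 1.0, 1.5, 3.0]
strict_midpoint = all(
    f((a + b) / 2) > (f(a) + f(b)) / 2
    for a in points for b in points if a != b
)

print(strict_midpoint, f2(1.0))
```

The midpoint inequality holds strictly on every sampled pair while $f''(1) = 0$, so the Hessian is only negative semidefinite at that point, in line with the remark that the converse of (ii) fails.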
(i) The function $f$ is concave if and only if $\tilde{D}_1 \leq 0$, $\tilde{D}_2 \geq 0$, $\ldots$, $(-1)^n \tilde{D}_n \geq 0$ for all $x \in X$.
(ii) The function $f$ is strictly concave if $D_1 < 0$, $D_2 > 0$, $\ldots$, $(-1)^n D_n > 0$ for all $x \in X$.
(iii) The function $f$ is convex if and only if $\tilde{D}_1 \geq 0$, $\tilde{D}_2 \geq 0$, $\ldots$, $\tilde{D}_n \geq 0$.
(iv) The function $f$ is strictly convex if $D_1 > 0$, $D_2 > 0$, $\ldots$, $D_n > 0$.
REMARK: The converse of (ii) and the converse of (iv) are not necessarily
true, since the converse of (ii) and the converse of (iv) in Theorem 1.E.12
are not necessarily true.
EXAMPLES:
1. $Y = F(L, K)$ ($L, K, Y \in R$, and all $\geq 0$) is a concave function if $F_{LL} < 0$, $F_{KK} < 0$, and $F$ is linear homogeneous, where $F_{LL} \equiv \partial^2 F / \partial L^2$, $F_{KK} \equiv \partial^2 F / \partial K^2$.
2. In particular, $F(L, K) = L^{\alpha} K^{\beta} > 0$ is a strictly concave function if $\alpha + \beta < 1$.^11
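The second example can be checked with the determinantal conditions for strict concavity. The sketch below assumes $\alpha, \beta > 0$ (an assumption added here for the illustration), uses hand-computed second partial derivatives of $L^{\alpha} K^{\beta}$, and evaluates them at one arbitrarily chosen point $(L, K) = (2, 3)$; the function names are hypothetical.

```python
def cobb_douglas_hessian(L, K, a, b):
    """Hessian of F(L, K) = L**a * K**b, with derivatives taken by hand."""
    F_LL = a * (a - 1) * L ** (a - 2) * K ** b
    F_KK = b * (b - 1) * L ** a * K ** (b - 2)
    F_LK = a * b * L ** (a - 1) * K ** (b - 1)
    return [[F_LL, F_LK], [F_LK, F_KK]]

def leading_minors_2x2(H):
    """Successive principal minors D_1 and D_2 of a 2 x 2 matrix."""
    return H[0][0], H[0][0] * H[1][1] - H[0][1] * H[1][0]

# Case alpha + beta < 1: expect D_1 < 0 and D_2 > 0 (negative definite Hessian).
D1, D2 = leading_minors_2x2(cobb_douglas_hessian(2.0, 3.0, 0.3, 0.5))

# Case alpha + beta = 1 (linear homogeneous): D_2 vanishes up to rounding.
E1, E2 = leading_minors_2x2(cobb_douglas_hessian(2.0, 3.0, 0.5, 0.5))

print(D1, D2)
print(E1, E2)
```

Algebraically $D_2 = \alpha\beta(1 - \alpha - \beta) L^{2\alpha - 2} K^{2\beta - 2}$, so the sign of $D_2$ is governed by $1 - \alpha - \beta$; the degenerate case $\alpha + \beta = 1$ is the linear homogeneous one, where strict concavity is lost.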
REMARK: Recall that the second-order conditions are never mentioned in
the theorems developed in the previous sections when f and the gj's are
concave. If the gj's are concave (or even quasi-concave), the constraint set
$C = \{x \in X : g_j(x) \geq 0,\ j = 1, 2, \ldots, m\}$ is convex. Under the convexity of $C$, the concavity of $f$ implies that every local maximum is a global maximum; $f$ is concave if and only if the Hessian of $f$ is negative semidefinite for all
(i) If $f$ has a local maximum at $\hat{x} \in X$, then $f'(\hat{x}) = 0$ and $f''(\hat{x})$ is negative semidefinite.
(ii) Conversely, if $f'(\hat{x}) = 0$ and $f''(\hat{x})$ is negative definite, then there exists an open ball $B_\epsilon(\hat{x})$ about $\hat{x}$ with radius $\epsilon > 0$ and a positive number $\theta$ such that
PROOF: The proof is easy and therefore is omitted. (See, for example, Hestenes [9], pp. 18-20.)
REMARK: That $f''(\hat{x})$ be negative semidefinite in the above theorem is called the second-order necessary condition. That $f''(\hat{x})$ be negative definite is called the second-order sufficient condition.^12
Next we consider the second-order conditions for the constrained maximum problem. We consider, for the sake of generality, constraints which are a mixture of inequalities and equalities. In other words, we consider the problem of finding $x \in X$, an open subset of $R^n$, so as to
Maximize: $f(x)$
Subject to: $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$
$h_k(x) = 0$, $k = 1, 2, \ldots, l$
where $f$, $g_j$, $j = 1, 2, \ldots, m$, and $h_k$, $k = 1, 2, \ldots, l$, are all real-valued twice continuously differentiable functions on $X$.
In view of the presence of the equality constraints, we cannot use condi-
tions such as Slater's condition. We will assume the following rank condition (R).
(R) Let $E$ be the set of indices $j$ for which $g_j(\hat{x}) = 0$. Let $m_e$ be the number of such $j$'s (that is, the number of effective $g$-constraints at $\hat{x}$). Then it is required that the rank of the $(m_e + l) \times n$ matrix
$\begin{bmatrix} \partial g_j / \partial x_i \\ \partial h_k / \partial x_i \end{bmatrix}, \quad j \in E;\ k = 1, 2, \ldots, l;\ i = 1, 2, \ldots, n$
where each partial derivative is evaluated at $\hat{x}$, be equal to $(m_e + l)$.^13
We define the Lagrangian for the present problem by
$L(x) \equiv f(x) + \lambda \cdot g(x) + \mu \cdot h(x)$
where $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_m) \in \Omega^m$ and $\mu = (\mu_1, \mu_2, \ldots, \mu_l) \in R^l$. The quasi-saddle-point conditions or the first-order conditions for this problem are written as follows:
(LM) and the second statement is concerned with the second-order sufficient
conditions for (LM).
Theorem 1.E.16:^14
(i) Suppose that conditions (LM) and (R) are satisfied; then we have (FOC) and
where =x-.r
satisfying
and 1,2, .,l
where H is the Hessian matrix of L evaluated at z, that is, H = L"(z).
(ii) Suppose that conditions (FOC) and (R) are satisfied. Furthermore, suppose that
where-x-cr0
satisfying
and 1,2, ...,l
Then there exists an open ball BE(S) c X about z with radius E > 0 and a positive
number 0 > 0 such that
REMARK: Note that if $f$, the $g_j$'s, and the $h_k$'s are all concave and if all the multipliers $\lambda_j$'s and $\mu_k$'s are nonnegative, then the Lagrangian function $L$, as a nonnegative linear combination of concave functions, is concave. Hence the Hessian matrix of $L$ is negative semidefinite for all $x \in X$. In particular, $H$ is negative semidefinite. In other words, the quadratic form associated with $H$ is nonpositive for every vector. It should also be noted that Theorem 1.E.16 is concerned with only the local characterization. The (quasi-) concavity of $f$ together with the convexity of the constraint set guarantees a global characterization.
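As is well known, it is possible to characterize the second-order conditions in terms of the bordered Hessians. Let $A = [a_{ij}]$ be any $n \times n$ matrix with real entries and $B = [b_{ij}]$ be any $m \times n$ matrix with real entries. Here $A$ and $B$ are not necessarily Hessian or Jacobian matrices. Now define the following submatrices of $A$ and $B$.
$A_r \equiv \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1r} \\ a_{21} & a_{22} & \cdots & a_{2r} \\ \vdots & \vdots & & \vdots \\ a_{r1} & a_{r2} & \cdots & a_{rr} \end{bmatrix}, \qquad B_{mr} \equiv \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1r} \\ b_{21} & b_{22} & \cdots & b_{2r} \\ \vdots & \vdots & & \vdots \\ b_{m1} & b_{m2} & \cdots & b_{mr} \end{bmatrix}$
$|C_r| \equiv \det \begin{bmatrix} 0 & B_{mr} \\ B_{mr}' & A_r \end{bmatrix}, \quad r = m + 1, m + 2, \ldots, n$
where $B_{mr}'$ is the transpose of $B_{mr}$ and $0$ is the $m \times m$ matrix whose entries are all zero. Then we have the following theorem to characterize the second-order sufficient conditions.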
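REMARK: Let
$|\bar{C}_r| \equiv \det \begin{bmatrix} A_r & B_{mr}' \\ B_{mr} & 0 \end{bmatrix}, \quad r = m + 1, \ldots, n$
Then the bordered principal minors conditions of Theorem 1.E.17 can also be written in terms of $|\bar{C}_r|$; that is,
$(-1)^r |C_r| > 0$ if and only if $(-1)^r |\bar{C}_r| > 0$ $(r = m + 1, \ldots, n)$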
and
or equivalently,
all a12 a13 -PI
all a12 -P1 a21 a22 a23 -P2
a21 a22 -P2 > 0 ,
FOOTNOTES
6. For example, consider the problem of (vector-) maximizing $x \in R^n$ subject to $g(x) \geq 0$, $x \geq 0$. Interpret $g(x)$ as the usual production transformation locus and $x$ as the output vector. Theorems 1.E.5 and 1.E.6 signify that the solutions of the problems of maximizing $\alpha \cdot x$ with $g(x) \geq 0$ and $x \geq 0$ ($\alpha \in R^n$, $\alpha > 0$), when $\alpha$ varies, trace the points on the transformation locus. (The points on the transformation locus are the solutions of the above vector maximum problem.) The vector $\alpha$ may be interpreted as a "price vector."
7. The quadratic form Q(x) in each of the following statements may be replaced by
the symmetric matrix A.
8. If the second partial derivatives of f exist and are continuous for all x in the
domain, then f is called twice continuously differentiable (as remarked in Section C).
In this case, the Hessian matrix f"(x) is symmetric for all x in the domain.
9. Needless to say, a matrix can be negative definite without each element of A being
negative. Conversely, A may not be negative definite, even if each element of A is
negative.
10. There seems to be a confusion among economists on this point. For example, K.
Lancaster writes, "If f (x) is strictly convex (concave), its Hessian is positive (nega-
tive) definite." (See his Mathematical Economics, New York, Macmillan, 1968,
p. 333.) This statement is wrong in view of the above counterexample.
11. If $\alpha + \beta = 1$, $F$ is no longer strictly concave, although it is strictly quasi-concave. In general, if $f(x)$ on $x \in X$, a convex subset of $R^n$, is linear homogeneous, it cannot be strictly concave. The strict concavity of $f$ requires $f[(x + y)/2] > f(x)/2 + f(y)/2$ for all $x \neq y$ in $X$. But this is impossible under the linear homogeneity of $f$, if $y$ is a scalar multiple of $x$ (say $y = \alpha x$, for some $\alpha \in R$). To see this, observe $f[(x + y)/2] = (1 + \alpha) f(x)/2 = f(x)/2 + \alpha f(x)/2 = f(x)/2 + f(y)/2$.
12. There seems to be a confusion among economists between the second-order necessary condition and the second-order sufficient condition. For example, Hicks writes, "In order that u should be a true maximum, it is necessary to have not only du = 0 ... but also d2u < 0," (Value and Capital, 2nd ed., p. 306). Consider the problem of maximizing $f(x) = -(x - 1)^4$, $x \in R$. Clearly $f$ reaches its maximum at $x = 1$. Note that $f''(1) = 0$. In other words, $f''(1) < 0$ is by no means necessary for a maximum.
13. Assume $m_e + l < n$.
14. Recall that Theorem 1.D.7 has already established that (LM) and (R) imply (FOC).
This is the first statement of (i) of the present theorem.
15. See, for example, chapters 2, 3, 4, and 5 of Samuelson [16]. See also Appendix to Section F of this chapter for a complete summary of the local maximization theory and its applications to the comparative statics problem.
REFERENCES
1. Arrow, K. J., and Enthoven, A. C., "Quasi-Concave Programming," Econometrica,
29, October 1961.
2. Arrow, K. J., Hurwicz, L., and Uzawa, H., "Constraint Qualifications in Maximization Problems," Naval Research Logistics Quarterly, vol. 8, no. 2, June 1961.
3. Debreu, G., "Definite and Semidefinite Quadratic Forms," Econometrica, 20, April
1952.
4. El-Hodiri, M. A., Constrained Extrema: Introduction to the Differentiable Case with
Economic Applications, Berlin, Springer-Verlag, 1971 (originally "Constrained
Section F
SOME APPLICATIONS
a. LINEAR PROGRAMMING
Probably the most fundamental relation in the theory of linear programming
is the dual relation. The dual relation is concerned with the following
two types of problems, each one of which is called the "dual problem" of the
other.
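(MLP) Maximize: $p \cdot x$ over $x \in R^n$
      Subject to: $A \cdot x \leq r$ and $x \geq 0$
(mLP) Minimize: $w \cdot r$ over $w \in R^m$
      Subject to: $A' \cdot w \geq p$ and $w \geq 0$
Here $A'$ denotes the transpose of $A$. The two problems are related as follows: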
(i) The evaluating vector $p$ in (MLP) appears in the constraint of (mLP), and the
evaluating vector $r$ in (mLP) appears in the constraint of (MLP).
(ii) The constraint matrices are each other's transpose.
(iii) Except for the nonnegativity condition, the inequality in the constraint is
reversed.
The duality theorem of linear programming asserts the following:
(i) There exists an optimal solution $\hat{x}$ for (MLP) if and only if there exists an optimal
solution $\hat{w}$ for (mLP).
(ii) The vectors $\hat{x} \geq 0$ and $\hat{w} \geq 0$ satisfy $A \cdot \hat{x} \leq r$, $A' \cdot \hat{w} \geq p$, and
$\hat{w} \cdot (r - A \cdot \hat{x}) = \hat{x} \cdot (A' \cdot \hat{w} - p) = 0$ if and only if $\hat{x}$ is optimal for (MLP) and $\hat{w}$ is
optimal for (mLP) (moreover, in this case, $p \cdot \hat{x} = r \cdot \hat{w}$).
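PROOF: (i) Suppose $\hat{x}$ is optimal for (MLP). Then by Theorem 1.D.5, there
exists $\hat{w} \geq 0$ such that $(\hat{x}, \hat{w})$ is a saddle point of
$\Phi(x, w) \equiv p \cdot x + w \cdot (r - A \cdot x)$, that is,
(1) $\Phi(x, \hat{w}) \leq \Phi(\hat{x}, \hat{w}) \leq \Phi(\hat{x}, w)$ for all $x \geq 0$, $w \geq 0$
Define
$\Psi(w, x) \equiv -w \cdot r + x \cdot (A' \cdot w - p)$
so that $\Psi(w, x) = -\Phi(x, w)$. Then from (1) we have
$\Psi(w, \hat{x}) \leq \Psi(\hat{w}, \hat{x}) \leq \Psi(\hat{w}, x)$ for all $x \geq 0$, $w \geq 0$
Hence from the corollary of the Arrow-Hurwicz-Uzawa theorem, $\hat{w}$ maximizes
$-w \cdot r$ subject to $w \geq 0$, $A' \cdot w \geq p$; that is, $\hat{w}$ is optimal for (mLP).
Conversely, if $\hat{w}$ is optimal for (mLP), then, proceeding as before, there
exists an $\hat{x} \geq 0$ such that $\Psi(w, x)$ has a saddle point at $(\hat{w}, \hat{x})$. This in turn
implies $\Phi(x, w)$ has a saddle point at $(\hat{x}, \hat{w})$, which implies $\hat{x}$ is optimal for
(MLP).
(ii) If $\hat{x}$ and $\hat{w}$ are optimal solutions of (MLP) and (mLP), respectively,
then by the reasoning of part (i), $(\hat{x}, \hat{w})$ is a saddle point of $\Phi(x, w)$, and
$(\hat{w}, \hat{x})$ is a saddle point of $\Psi(w, x)$. But then by Theorem 1.B.4 (ii),
$\hat{w} \cdot (r - A \cdot \hat{x}) = \hat{x} \cdot (A' \cdot \hat{w} - p) = 0$
Moreover, since $\Phi(\hat{x}, \hat{w}) = -\Psi(\hat{w}, \hat{x})$, we then have $p \cdot \hat{x} = \hat{w} \cdot r$, which
verifies the parenthetical remark in (ii).
Conversely, suppose that there exist $\hat{x} \geq 0$ and $\hat{w} \geq 0$ such that
(2) $\hat{w} \cdot (r - A \cdot \hat{x}) = \hat{x} \cdot (A' \cdot \hat{w} - p) = 0$
and
(3) $A \cdot \hat{x} \leq r$, $A' \cdot \hat{w} \geq p$
Then if $x \geq 0$ and $w \geq 0$ we have, using (3), (2), and (3) in turn,
$\Phi(x, \hat{w}) = p \cdot x + \hat{w} \cdot (r - A \cdot x) = \hat{w} \cdot r + (p - A' \cdot \hat{w}) \cdot x \leq \hat{w} \cdot r$
$= \hat{w} \cdot r - \hat{x} \cdot (A' \cdot \hat{w} - p) = \Phi(\hat{x}, \hat{w})$
$= p \cdot \hat{x} + \hat{w} \cdot (r - A \cdot \hat{x}) = p \cdot \hat{x} \leq p \cdot \hat{x} + w \cdot (r - A \cdot \hat{x}) = \Phi(\hat{x}, w)$
Hence, again by Theorem 1.D.5, it follows that $\hat{x}$ is optimal for (MLP). The
fact that $\hat{w}$ is optimal for (mLP) then follows from (i). (Q.E.D.)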
the other by Dantzig ([4], pp. 129-134), which proves the theorem as an
application of the LP simplex method. Our proof will enhance the reader's
understanding of the nonlinear programming theory developed in this
chapter.
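As a numerical illustration of the duality theorem, the following sketch solves a small primal-dual pair and checks that the two optimal values coincide and that the complementary slackness terms vanish. The data A, p, r and the helper optimise_2d are hypothetical and introduced only for this example; the solver is a brute-force vertex enumeration for two variables, not the simplex method or the saddle-point argument used in the proof.

```python
from fractions import Fraction
from itertools import combinations

def optimise_2d(c, G, h, sense):
    """Optimise c.z over {z in R^2 : G z <= h} by testing every vertex.

    sense=+1 maximises, sense=-1 minimises. Brute force, for illustration
    only; it assumes the optimum is attained at a vertex.
    """
    rows = list(zip(G, h))
    best = None
    for (g1, b1), (g2, b2) in combinations(rows, 2):
        det = g1[0] * g2[1] - g1[1] * g2[0]
        if det == 0:
            continue  # parallel boundary lines: no vertex here
        # Cramer's rule for the intersection of the two boundary lines.
        z = (Fraction(b1 * g2[1] - g1[1] * b2, det),
             Fraction(g1[0] * b2 - b1 * g2[0], det))
        if any(g[0] * z[0] + g[1] * z[1] > b for g, b in rows):
            continue  # intersection violates another constraint
        val = c[0] * z[0] + c[1] * z[1]
        if best is None or sense * val > sense * best[0]:
            best = (val, z)
    return best

# Hypothetical data (assumption): a 2x2 constraint matrix A, prices p, resources r.
A = [[1, 1], [1, 3]]
p = [3, 5]
r = [4, 6]
nonneg = [[-1, 0], [0, -1]]  # encodes z >= 0 as -z <= 0

# (MLP): maximise p.x subject to A x <= r, x >= 0.
primal_val, x_hat = optimise_2d(p, A + nonneg, r + [0, 0], +1)

# (mLP): minimise w.r subject to A' w >= p, w >= 0, written as -A' w <= -p.
A_T = [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]
neg_A_T = [[-a for a in row] for row in A_T]
dual_val, w_hat = optimise_2d(r, neg_A_T + nonneg, [-p[0], -p[1], 0, 0], -1)

# Complementary slackness terms from part (ii) of the theorem.
slack_primal = sum(w_hat[i] * (r[i] - sum(A[i][j] * x_hat[j] for j in range(2)))
                   for i in range(2))
slack_dual = sum(x_hat[j] * (sum(A_T[j][i] * w_hat[i] for i in range(2)) - p[j])
                 for j in range(2))

print(x_hat, w_hat, primal_val, dual_val, slack_primal, slack_dual)
```

Exact rational arithmetic is used so that the equality of the two optimal values can be tested without rounding tolerance.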
In the course of the proof of the duality theorem, we also proved the following
theorem.
Theorem 1.F.2 (Goldman-Tucker): There exist optimal solutions for (MLP) and
(mLP), denoted by $\hat{x}$ and $\hat{w}$ respectively, if and only if there exists $(\hat{x}, \hat{w})$, which is a saddle
point of $\Phi(x, w) \equiv p \cdot x + w \cdot (r - A \cdot x)$; that is,
$\Phi(x, \hat{w}) \leq \Phi(\hat{x}, \hat{w}) \leq \Phi(\hat{x}, w)$ for all $x \geq 0$, and all $w \geq 0$
REMARK: For the original proof, which obviously does not rely on nonlinear
programming theory, see Goldman and Tucker [8], especially
theorem 6, pp. 77-78. They obtained this theorem from the LP duality
theorem.
REMARK : In Figure 1.17 we illustrate schematically the logical structure of
some of the important theorems established so far.
REMARK: We proved the fundamental theorem of activity analysis
(Theorem 0.C.3) by using the separation theorem. As we will see later, this
theorem can also be proved by an extension of the concave programming
theorem. The proof of the Minkowski-Farkas lemma by utilizing the LP
duality theorem and the proof of the LP duality theorem by utilizing the
Minkowski-Farkas lemma are not too difficult and will be interesting exercises
for the reader.
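[Figure 1.17: schematic of the logical structure of the theorems established so far, beginning with the separation theorems]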
b. CONSUMPTION THEORY
In the classical theory of consumer's choice as explained in Hicks [10],
for example, a consumer is supposed to maximize his satisfaction over the budget
set. Let $x \in R^n$ be his n-commodity consumption bundle and $u(x)$, a real-valued
function defined over $R^n$, be his utility function. Suppose that this consumer is a
"competitive consumer" so that he cannot influence the prices of the commodities
in the market. Then if a price vector $p$ prevails, his budget constraint can be
expressed as $p \cdot x \leq M$ (with $x \geq 0$) where $M$ is his income. Although the
nonnegativity condition $x \geq 0$ is not mentioned in Hicks, it is implied in the context.
Hicks wrote the budget constraint in the form of the equality $p \cdot x = M$. This
means that the consumer must spend all his income. We will use the inequality
constraint $p \cdot x \leq M$ instead, allowing the consumer the possibility of not spending
all his income. Later we will find a condition under which this constraint becomes
effective (that is, he spends all his income).
We can now write the problem for each consumer as follows:
Maximize: $u(x)$ over $x \in R^n$
Subject to: $p \cdot x \leq M$ and $x \geq 0$
Following the classical analysis, we assume that $u(x)$ is differentiable everywhere.¹
Hence we can use the theory developed in Sections D and E. Since the constraints
are linear, owing to the (A-H-U) theorem (Theorem 1.D.4), the (QSP') condition
is a necessary condition for global maximality. In other words, if $\hat{x}$ is a solution of
the above problem, then there exists a $\hat{\lambda} \in R$ such that
(QSP')
(4) $\hat{u}_{x_i} - \hat{\lambda} p_i \leq 0$, $i = 1, 2, \ldots, n$
(5) $\hat{x} \cdot (\hat{u}_x - \hat{\lambda} p) = 0$
(6) $\hat{\lambda} (M - p \cdot \hat{x}) = 0$, $\hat{\lambda} \geq 0$
(7) $\hat{x} \geq 0$
Here $\hat{u}_x \equiv u'(\hat{x})$, $\hat{u}_{x_i} \equiv \partial u / \partial x_i$ (evaluated at $x = \hat{x}$), and $\hat{u}_x = (\hat{u}_{x_1}, \ldots, \hat{u}_{x_n})$.
Conversely, assuming $u(x)$ is a concave function, if there exist $\hat{x}$ and $\hat{\lambda}$, both
nonnegative, such that the above (QSP') condition holds, then, owing to Theorem
1.D.2, $\hat{x}$ is a solution of the above constrained maximum problem for the consumer.
In other words, under the concavity of $u(x)$, the above (QSP') becomes a
necessary and sufficient condition for $\hat{x}$ to furnish a global maximum of the above
constrained maximum problem (see Theorem 1.D.5). Hence our attention will
be shifted to finding the values of $\hat{x}$ and $\hat{\lambda}$ which satisfy the above (QSP'). If $u(x)$ is
not concave but rather quasi-concave, then we need an additional assumption. In
particular, we assume
(A-c) $\hat{u}_{x_i} > 0$ for some relevant variable $x_i$ (that is, positive "marginal utility"
for some relevant variable).²
Then, applying the Arrow-Enthoven theorem (Theorem 1.E.2), we can again
134 DEVELOPMENTS OF NONLINEAR PROGRAMMING
conclude that (QSP') provides a necessary as well as sufficient condition for the
above constrained maximum problem.
With these remarks, we now shift our attention to (QSP'). First observe that
conditions (6) and (7) of (QSP') mean
(8) $\hat{\lambda} > 0$ implies $M - p \cdot \hat{x} = 0$
Since $\hat{u}_{x_i} \leq \hat{\lambda} p_i$ for all $i$ from condition (4) above, $\hat{u}_{x_i} > 0$ (positive marginal utility)
for some commodity $i$ is consistent only with $\hat{\lambda} > 0$ and a positive price of that
commodity ($p_i > 0$). We may recall that $\hat{u}_{x_i} > 0$ for some relevant $x_i$ is assumed
[(A-c)] when we adopt quasi-concave programming. In other words, if we
assume that there exists at least one commodity in which the consumer is never
satiated, then $\hat{\lambda} > 0$ so that $M = p \cdot \hat{x}$ [resulting from relations (6) and (7)]. The
nonsatiation assumption is therefore crucial in the
sense that it guarantees that all the income is spent. Thus we have revealed one
crucial assumption which underlies the Hicksian equality constraint $M = p \cdot x$.
Next note that conditions (4) and (5) of the above (QSP') mean
(9) $\hat{x}_i (\hat{u}_{x_i} - \hat{\lambda} p_i) = 0$, $i = 1, 2, \ldots, n$
Hence if we assume an interior solution for all $i$ (that is, $\hat{x}_i > 0$ for all $i$), then we
obtain
(10) $\hat{u}_{x_i} = \hat{\lambda} p_i$, $i = 1, 2, \ldots, n$
Note that this interior solution assumption is usually made implicit in the classical
analysis, as it is explained in Hicks, for example. In general, this assumption does
not necessarily hold. It is quite possible that $\hat{x}_i = 0$ for some $i$. A typical situation
is illustrated in Figure 1.18.
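A corner solution of this kind can be checked numerically. The sketch below tests conditions (4) through (7), that is, marginal utility not exceeding the multiplier times price for every good, complementary slackness for goods and for the budget, and nonnegativity, at a candidate bundle and multiplier. The utility function u(x1, x2) = x1 + ln(1 + x2), the prices, and the income are assumptions chosen only for illustration (they are not from the text); u is concave, so the conditions are also sufficient here.

```python
import math

def u(x):
    """Illustrative concave utility (an assumption, not from the text)."""
    return x[0] + math.log(1.0 + x[1])

def grad_u(x):
    """Gradient of u: (1, 1 / (1 + x2))."""
    return (1.0, 1.0 / (1.0 + x[1]))

def qsp_holds(x, lam, p, M, tol=1e-9):
    """Check conditions (4)-(7) at the candidate pair (x, lam)."""
    g = grad_u(x)
    spend = sum(pi * xi for pi, xi in zip(p, x))
    cond4 = all(gi - lam * pi <= tol for gi, pi in zip(g, p))
    cond5 = abs(sum(xi * (gi - lam * pi) for xi, gi, pi in zip(x, g, p))) <= tol
    cond6 = abs(lam * (M - spend)) <= tol and lam >= 0 and spend <= M + tol
    cond7 = all(xi >= 0 for xi in x)
    return cond4 and cond5 and cond6 and cond7

p, M = (1.0, 2.0), 1.0
corner = (1.0, 0.0)      # all income spent on good 1, none on good 2
interior = (0.5, 0.25)   # also exhausts the budget, but violates (5)

print(qsp_holds(corner, 1.0, p, M))
print(qsp_holds(interior, 1.0, p, M))

# Brute-force comparison along the budget line p.x = M.
grid = [(M - p[1] * k / 1000, k / 1000) for k in range(0, 501)]
best = max(grid, key=u)
print(best)
```

At the corner the marginal utility of good 2 is 1 while its multiplier-weighted price is 2, so condition (4) holds with strict inequality for that good and the consumer buys none of it.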
Following the classical analysis, we now proceed with the interior solution
assumption (that is, $\hat{x}_i > 0$ for all $i$). By relation (10), if $p_i \neq 0$ for some $i$, then $\hat{\lambda} > 0$.
In other words, under the assumption that the consumer consumes a positive
amount of every commodity, $p_i \neq 0$ for some $i$ guarantees $\hat{\lambda} > 0$. Then we have
$M = p \cdot \hat{x}$ [from (8)]. This and equation (10) provide $n + 1$ equations which are
available to determine $(n + 1)$ variables, that is, $\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_n$, and $\hat{\lambda}$. This is the
$| \; u_k \quad u_{k1} \quad u_{k2} \quad \cdots \quad u_{kk} \; |$
where $u_i \equiv \partial u / \partial x_i$ and $u_{ij} \equiv \partial^2 u / \partial x_i \partial x_j$ (all the partial derivatives are evaluated
at $x = \hat{x}$).
By Theorem 1.E.14, if the above condition holds for all $x$, then $u(x)$ is quasi-concave.
In other words, quasi-concave programming enables us to dispense with
the above bordered Hessian condition and provides us with a global maximum
(instead of a local maximum). Note that if $u$ is strictly quasi-concave, we would
have a unique global maximum. Needless to say, the quasi-concavity of the utility
function (that is, the convex-to-the-origin indifference curves) is more intuitively
appealing than to say that the utility function satisfies the bordered Hessian
condition.⁴
If $u(x)$ is concave, then the above bordered Hessian condition is replaced by
the stronger Hessian condition, as discussed in Section E. Strict concavity will give
a unique solution.
This finishes our critical review of the classical theory of consumer's choice
in terms of (quasi-) concave programming. The following points were made
explicit:
Finally, we should stress that the theory of concave (or quasi-concave) program-
ming provides a global characterization of the problem. The classical treatment
in terms of the Euler-Lagrange necessary conditions and the Hessian (or bordered
Hessian) condition (as utilized by Hicks and so on) only provides a local character-
ization; that is, it is concerned with the properties in some (possibly very small)
neighborhood of a solution point, and there may be many solution points, each
giving a different value for maximal utility.
c. PRODUCTION THEORY
The production activity of an economy is concerned with transforming one
set of commodities, called "inputs," denoted by a vector $v = (v_1, v_2, \ldots, v_m)$
into another set of commodities called "outputs," denoted by a vector $x = (x_1,
x_2, \ldots, x_n)$. In activity analysis, inputs were denoted by negative numbers, outputs
were denoted by positive numbers, and we called a vector $y = (-v, x)$ an "activity
vector" (after normalization with respect to a certain commodity, to define the
"activity level"). Then we considered the set of these $y$'s, $Y$, and called it the
"production set." We now wish to describe this set by a functional relation in order
to obtain an application of the theory established in this chapter. By the explicit
introduction of such a functional relation, our analysis will also serve as a
critical review of an important part of the classical production theory (as explained
in Hicks [10] and Carlson [2]). In the following analysis, we denote inputs
(say, $v_j$) by positive numbers (instead of negative numbers).
The functional relation that describes a production set can be written as
$F(v, x) \geq 0$
We assume $v \in R^m$ and $x \in R^n$ with $v \geq 0$ and $x \geq 0$. In the case where $v$ and
$x$ are real numbers, we may illustrate the above relation as in Figure 1.19. Here
the shaded area illustrates the values of $(v, x)$ which satisfy the above functional
relation.
We note that if $F(v, x) = f(v) - x$, the relation $x = f(v)$ can be obtained by
solving $F(v, x) = 0$. This relation $x = f(v)$ or $F(v, x) = 0$ (where $v \in R^m$, $x \in R^n$
with $v \geq 0$, $x \geq 0$) is the familiar production function in the traditional analysis. We
may call such a surface a production frontier. In the functional relation $F(v, x) \geq 0$,
we allowed the possibility of points which are not on the surface $F(v, x) = 0$.
This is illustrated in Figure 1.19. Points such as A are on the curve defined by
F(v, x) = 0 [or x = f(v)]. However, we also allow the possibility of points such
as B. Such points are allowed for either (or both) of the following two reasons.
(i) We allow the possibility of production processes that are technically inferior to
some other processes. In other words, we do not assume the existence of an
"efficient" manager.
(ii) We assume free disposability of commodities so that some inputs and outputs
can be thrown away in the process. This can happen, for example, if some com-
modities (either inputs or outputs) become "free" due to an excess supply of
those commodities in the economy.
Now let $p = (p_1, p_2, \ldots, p_n)$ be the price vector for outputs and $w = (w_1,
w_2, \ldots, w_m)$ be the price vector for inputs. Then the profit which can be obtained
by transforming $v$ into $x$ may be written as
$\pi = p \cdot x - w \cdot v$
Suppose that the "producer" is "competitive" so that he cannot affect the level of
prices, p and w, that prevails in the market. Suppose further that his behavioral rule
is profit maximization (for otherwise he will sooner or later be ruled out of a typic-
ally "competitive" market). Then his problem is the following nonlinear program-
ming problem.
Maximize: $\pi = p \cdot x - w \cdot v$ over $(x, v)$
Subject to: $F(v, x) \geq 0$, $x \geq 0$, and $v \geq 0$
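above. Under these assumptions, we can apply Theorem 1.D.2. In other words,
$(\hat{v}, \hat{x}) \geq 0$ is a solution of the above problem if and only if there exists a scalar $\hat{\lambda} \geq 0$
such that
(QSP')
(11) $p_i + \hat{\lambda} \hat{F}_{x_i} \leq 0$, $i = 1, 2, \ldots, n$
(12) $-w_j + \hat{\lambda} \hat{F}_{v_j} \leq 0$, $j = 1, 2, \ldots, m$
(13) $\hat{x} \cdot (p + \hat{\lambda} \hat{F}_x) + \hat{v} \cdot (-w + \hat{\lambda} \hat{F}_v) = 0$
(14) $F(\hat{v}, \hat{x}) \geq 0$, $\hat{\lambda} F(\hat{v}, \hat{x}) = 0$
where $\hat{F}_{x_i} \equiv \partial F / \partial x_i$, $\hat{F}_{v_j} \equiv \partial F / \partial v_j$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$ [each evaluated
at $(v, x) = (\hat{v}, \hat{x})$], and $\hat{F}_x = (\hat{F}_{x_1}, \ldots, \hat{F}_{x_n})$, $\hat{F}_v = (\hat{F}_{v_1}, \hat{F}_{v_2}, \ldots, \hat{F}_{v_m})$.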
By conditions (11) and (12), condition (13) of the above (QSP') is equivalent
to
(15) $\hat{x}_i (p_i + \hat{\lambda} \hat{F}_{x_i}) = 0$, $i = 1, 2, \ldots, n$
(16) $\hat{v}_j (-w_j + \hat{\lambda} \hat{F}_{v_j}) = 0$, $j = 1, 2, \ldots, m$
We assume that at least one output (say, $i_0$) is produced (that is, $\hat{x}_{i_0} > 0$) or at least
one input (say, $j_0$) is used in production (that is, $\hat{v}_{j_0} > 0$). Then from (15) or (16)
(17) $\hat{\lambda} > 0$
as long as $p_{i_0} > 0$ or $w_{j_0} > 0$.
Then from (14), we have $F(\hat{v}, \hat{x}) = 0$. In other words, under the above
assumption of $\hat{x}_{i_0} > 0$ or $\hat{v}_{j_0} > 0$ (for some $i_0$ or $j_0$), production will take place on the
production frontier if and only if the producer maximizes his profit. This corresponds
to the fundamental theorems of activity analysis (Theorems 0.C.2 and
0.C.3).
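If we assume an interior solution for every output and input (that is, $\hat{x}_i > 0$,
$\hat{v}_j > 0$ for all $i$ and $j$), as in Hicks [10], conditions (11) and (12) can be rewritten as
follows:
(11') $p_i = -\hat{\lambda} \hat{F}_{x_i}$, $i = 1, 2, \ldots, n$
(12') $w_j = \hat{\lambda} \hat{F}_{v_j}$, $j = 1, 2, \ldots, m$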
Under the assumption of an interior solution, we have $\hat{\lambda} > 0$, as noted above,
which in turn implies $F(\hat{v}, \hat{x}) = 0$. Combining this equation with conditions (11')
and (12'), we obtain $(n + m + 1)$ equations, which, in turn, would presumably
determine $(n + m + 1)$ variables, that is, $\hat{\lambda}$, the $\hat{x}_i$'s, and the $\hat{v}_j$'s as functions of the
$p_i$'s and $w_j$'s. By changing the values of the $p_i$'s and $w_j$'s, we get a comparative statics
analysis which will lead to Hicks's fundamental equation. In the above analysis,
the fact that (QSP') is necessary and sufficient for a (global) maximum depends on
our assumption that $F(v, x)$ is a concave function. The function $F(v, x)$ is concave if
and only if the following Hessian condition holds (assuming that $F$ is twice
differentiable). (See Theorem 1.E.13.)
there exist $v > 0$, $x > 0$, with $F(v, x) > 0$, (QSP') is necessary for $(\hat{v}, \hat{x})$ to furnish
a maximum, if⁵
(18) $\hat{F}_x \neq 0$ or $\hat{F}_v \neq 0$
If we have an interior solution for some $i$ or some $j$, then (11') and (12') imply (18).
Hence, assuming an interior solution, (QSP') becomes necessary and sufficient for
an optimum under the quasi-concavity of the function $F$. The quasi-concavity of $F$
can be characterized in terms of the bordered Hessian conditions, which corresponds
to Hicks's discussion of the topic ([10], p. 320). A condition that is alternative
to (18) can be obtained by utilizing the rank condition (Theorem 1.D.4). The
rank condition for the present problem is stronger than (18). Even with $\hat{x} > 0$,
$\hat{v} > 0$, we may, for example, require⁶
(18') $\hat{F}_x \neq 0$ and $\hat{F}_v \neq 0$
Again under the quasi-concavity of $F$, (QSP') provides a set of necessary and
sufficient conditions for an optimum.⁷
The quasi-concavity of $F(v, x)$ implies that the following bordered Hessian
condition holds (assuming that $F$ is twice differentiable).
(19) $B_1 \leq 0$, $B_2 \geq 0$, ..., $(-1)^{m+n} B_{m+n} \geq 0$
where $B_k$ is defined as in Section E (Theorem 1.E.14). We again emphasize that
the concept of concavity or quasi-concavity is more intuitively appealing than
the Hessian or the bordered Hessian conditions.
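In order to understand further the meaning of the above (QSP') condition,
we now assume that there is only one output in this production, so that $x$ and $p$
are now scalars. We also assume that $F(v, x) \geq 0$ can be written as $f(v) \geq x$. Then
our (QSP') condition can be rewritten as follows (with $\hat{f}_j$ denoting $\partial f / \partial v_j$
evaluated at $\hat{v}$, and $\hat{f}_v = (\hat{f}_1, \ldots, \hat{f}_m)$):
(20) $p - \hat{\lambda} \leq 0$
(21) $-w_j + \hat{\lambda} \hat{f}_j \leq 0$, $j = 1, 2, \ldots, m$
(22) $(p - \hat{\lambda}) \hat{x} + \hat{v} \cdot (-w + \hat{\lambda} \hat{f}_v) = 0$
(23) $\hat{\lambda} [f(\hat{v}) - \hat{x}] = 0$, $f(\hat{v}) - \hat{x} \geq 0$, $\hat{x} \geq 0$, $\hat{v} \geq 0$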
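The one-output conditions can be verified on a concrete case. The sketch below assumes a single input, the production function f(v) = sqrt(v), and the prices p = 4, w = 1 (all hypothetical, chosen for illustration). With positive output the multiplier equals p, and the input condition w = p f'(v) pins down the input level; the function qsp_one_output then tests the four conditions directly, and a grid search over v serves as an independent check.

```python
import math

def f(v):
    """Illustrative concave production function (an assumption)."""
    return math.sqrt(v)

def f_prime(v):
    return 0.5 / math.sqrt(v)  # only valid for v > 0

def qsp_one_output(x, v, lam, p, w, tol=1e-9):
    """Check the one-output conditions (20)-(23) at (x, v, lam), with v > 0."""
    marg = -w + lam * f_prime(v)                 # left side of (21)
    c20 = p - lam <= tol
    c21 = marg <= tol
    c22 = abs((p - lam) * x + v * marg) <= tol
    c23 = (abs(lam * (f(v) - x)) <= tol and f(v) - x >= -tol
           and x >= 0 and v >= 0)
    return c20 and c21 and c22 and c23

p, w = 4.0, 1.0
lam_hat = p                          # from (20) holding with equality when x > 0
v_hat = (lam_hat / (2 * w)) ** 2     # solves w = lam * f'(v) for f = sqrt
x_hat = f(v_hat)                     # production on the frontier

print(qsp_one_output(x_hat, v_hat, lam_hat, p, w))
print(qsp_one_output(1.0, 1.0, lam_hat, p, w))

# Independent check: maximise profit p*f(v) - w*v on a grid of input levels.
grid = [k / 100 for k in range(1, 1001)]
v_best = max(grid, key=lambda v: p * f(v) - w * v)
print(v_best)
```

The second call fails because at v = 1 the value of the marginal product exceeds the input price, so condition (21) is violated and the producer would want to use more input.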
(24) $(p - \hat{\lambda}) \hat{x} = 0$
and
d. ACTIVITY ANALYSIS
Let $a_{ij}$ be the amount of the ith commodity involved in a unit operation of the
jth activity and let $a_j$ be the vector for the jth activity whose ith element is $a_{ij}$.
Assume that there are n commodities and m activities. Let $x_j$ be the activity level of
the jth activity and let $x$ be the activity vector whose jth element is $x_j$. Then, as we
discussed in Chapter 0, Section C, the production set $Y$ is given by
(29) $Y = \{ y : y = A \cdot x, \; x \geq 0 \}$, where $A = [a_{ij}]$
or
$Y = \{ y : y = \sum_{j=1}^{m} a_j x_j, \; x \geq 0 \}$
An efficient point $\hat{y}$ of $Y$ is a point such that there does not exist a $y \in Y$ such
that $y \geq \hat{y}$ and $y \neq \hat{y}$. In other words, this $\hat{y}$ can be obtained as a solution of the following
vector maximum problem.
(Vector) Maximize: $y$
Subject to: $y \in Y$
Then from Theorem 1.E.4, if $\hat{y}$ is a solution of this problem, there exists $p \geq 0$ such
that
(30) $p \cdot \hat{y} \geq p \cdot y$ for all $y \in Y$
Obviously this holds even if $Y$ is not restricted to the form (29). Only the convexity
of $Y$ is required. Relation (30) corresponds to Theorem 0.C.3. Although the converse
of the theorem is easy to obtain, as discussed in Theorem 0.C.2, we can also
obtain this converse by using Theorem 1.E.6. In other words, if there exists a $p > 0$
such that $p \cdot \hat{y} \geq p \cdot y$ for all $y \in Y$, then $\hat{y}$ is an efficient point of $Y$ (or a solution of
the above vector maximum problem).
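The converse statement (a point that maximizes p·y for some strictly positive p is efficient) can be illustrated on a small example. The sketch below uses a hypothetical finite set of activity vectors with one input (negative entry) and one output (positive entry); a finite set is not convex, so this only illustrates the converse direction, which does not require convexity. The helper names are introduced for this example only.

```python
def dominates(y, y_hat):
    """True if y >= y_hat componentwise with at least one strict inequality."""
    return (all(a >= b for a, b in zip(y, y_hat))
            and any(a > b for a, b in zip(y, y_hat)))

def is_efficient(y_hat, Y):
    """y_hat is efficient in Y if no point of Y dominates it."""
    return not any(dominates(y, y_hat) for y in Y)

def profit_maximiser(p, Y):
    """A point of Y maximising p.y (the first one found in case of ties)."""
    return max(Y, key=lambda y: sum(pi * yi for pi, yi in zip(p, y)))

# Hypothetical production possibilities: (net input, net output).
Y = [(0, 0), (-1, 2), (-2, 3), (-2, 2), (-3, 3)]

efficient_points = [y for y in Y if is_efficient(y, Y)]
print(efficient_points)

# Every strictly positive price vector picks out an efficient point.
for p in [(2, 3), (3, 1), (1, 5)]:
    y_star = profit_maximiser(p, Y)
    print(p, y_star, is_efficient(y_star, Y))
```

Varying the price vector moves the profit-maximizing point along the efficient set, which mirrors the role of p in relation (30).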
We now consider a resource constraint which we write as follows:
(31) $y + z \geq 0$, $y \in Y$
where $z_i$, the ith component of $z$, denotes the amount of this ith commodity ("resource")
available in the economy. The feasible set $Y_F$ of this economy is then
$Y_F = \{ y : y \in Y, \; y + z \geq 0 \}$
Now we are interested in the problem of finding an efficient point of this feasible
set $Y_F$. The point $y_F$ is an efficient point of $Y_F$ if there does not exist $y \in Y_F$ such that
$y \geq y_F$ and $y \neq y_F$. Hence an efficient point of $Y_F$ can be obtained as a solution of the following
vector maximum problem.
Maximize: $y$
Subject to: $y + z \geq 0$ and $y \in Y$
Assume Slater's condition so that there exists a $y \in Y$ such that $y + z > 0$. Then
using Theorem 1.E.4, if $\hat{y}$ is a solution of this problem, there exist a $p \geq 0$ ($p \neq 0$)
and $\hat{\lambda} \geq 0$ such that
(SP) $\Phi(y, \hat{\lambda}) \leq \Phi(\hat{y}, \hat{\lambda}) \leq \Phi(\hat{y}, \lambda)$ for all $y \in Y$ and $\lambda \geq 0$, and $\hat{\lambda} \cdot (\hat{y} + z) = 0$,
where $\Phi(y, \lambda) \equiv p \cdot y + \lambda \cdot (y + z)$.
The first inequality of the above (SP) can be written as follows:
(32) $p \cdot y + \hat{\lambda} \cdot (y + z) \leq p \cdot \hat{y} + \hat{\lambda} \cdot (\hat{y} + z)$ for all $y \in Y$
or
$q \cdot y \leq q \cdot \hat{y}$ for all $y \in Y$, where $q \equiv p + \hat{\lambda}$
which means "profit maximization" with respect to $q$. Also note that, under
Slater's condition and the convexity of $Y$, (32) and $\hat{\lambda} \cdot (\hat{y} + z) = 0$ are equivalent to
(cf. Theorem 1.B.5):
(33) $p \cdot y \leq p \cdot \hat{y}$ for all $y \in Y$ such that $y + z \geq 0$
which means profit maximization with respect to $p$ subject to the resource
constraint.
Conversely, if there exist $p > 0$, $\hat{\lambda} \geq 0$, and $\hat{y} \in Y$ such that the above (SP)
condition holds, then $\hat{y}$ is a solution of the above constrained vector maximum
problem (Theorem 1.E.5). This corresponds to Theorem 0.C.2. Certainly, this is a
difficult way to reach such a theorem, but it does illustrate one use of the vector
maximum problem.
Now consider the following linear programming problem.
Maximize: $\alpha \cdot y$ over $y$
Subject to: $y + z \geq 0$ and $y \in Y$, where $\alpha \in R^n$, $\alpha \geq 0$
or
Maximize: $\alpha \cdot A \cdot x$ over $x$
Subject to: $A \cdot x + z \geq 0$ and $x \geq 0$
Clearly those two problems are equivalent. Hence if $\hat{x}$ is a solution of the latter
problem, $\hat{y} = A \cdot \hat{x}$ is a solution of the former problem. Now $\hat{y}$ is a solution of the
former problem if and only if there exists a $\hat{\lambda} \geq 0$ such that
(SP) $\Phi(y, \hat{\lambda}) \leq \Phi(\hat{y}, \hat{\lambda}) \leq \Phi(\hat{y}, \lambda)$ for all $y \in Y$ and $\lambda \geq 0$, where
$\Phi(y, \lambda) \equiv \alpha \cdot y + \lambda \cdot (y + z)$.
We can prove this by slightly modifying our proof of the Goldman-Tucker
theorem. In any case, this saddle-point condition means that if $\hat{y}$ is a solution of the
above constrained vector maximum problem, then it is a solution of the first linear
programming problem. Conversely, if we can find a solution $\hat{x}$ of the latter linear
programming problem with $\alpha > 0$, then $\hat{y} = A \cdot \hat{x}$ is a solution of the above constrained
vector maximum problem, thus providing an efficient point of $Y_F$. By
varying $\alpha$, we can obtain the set of efficient points.
              Country 1     Country 2
Commodity X   $l_{x1}$      $l_{x2}$
Commodity Y   $l_{y1}$      $l_{y2}$
where
$a_i \equiv L_i / l_{xi}$, $b_i \equiv L_i / l_{yi}$ ($i = 1, 2$)
The production possibility sets for the two countries are illustrated in Figure
1.20.
[Figure 1.20: production possibility sets of country 1 (axes $x_1$, $y_1$) and country 2 (axes $x_2$, $y_2$)]
(40) $l_{x1}/l_{y1} < l_{x2}/l_{y2}$ or $b_1/a_1 < b_2/a_2$
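(41) $\Phi(x, y; \hat{\lambda}) \leq \Phi(\hat{x}, \hat{y}; \hat{\lambda}) \leq \Phi(\hat{x}, \hat{y}; \lambda)$ for all $x, y, \lambda \geq 0$
where
$\Phi(x, y; \lambda) \equiv p_x x + p_y y + \lambda_1 [b/b_1 - x/a_1 - y/b_1] + \lambda_2 [a/a_2 - x/a_2 - y/b_2]$
and
(42) $\hat{\lambda}_1 [b/b_1 - \hat{x}/a_1 - \hat{y}/b_1] = 0$
(43) $\hat{\lambda}_2 [a/a_2 - \hat{x}/a_2 - \hat{y}/b_2] = 0$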
Conversely, from Theorem 1.E.7, if there exist $p > 0$ and $\hat{\lambda} \geq 0$ such that the above
saddle-point condition (41) holds, then $(\hat{x}, \hat{y})$ is a solution for the vector maximum
problem. It is easy to see that the values of $p_x$ and $p_y$ determine the location of
the solution on the line QRS.
Now consider the following linear programming problem [where $p \equiv (p_x, p_y) \geq 0$].
Maximize: $p_x x + p_y y$ over $(x, y)$
Subject to: $x/a_1 + y/b_1 \leq b/b_1$
$x/a_2 + y/b_2 \leq a/a_2$
$x \geq 0$, $y \geq 0$
Then from the Goldman-Tucker theorem (Theorem 1.F.2), $(\hat{x}, \hat{y})$ is a solution of
this problem if and only if there exist $(\hat{x}, \hat{y}) \geq 0$ and $\hat{\lambda} \geq 0$ such that
(44) $\Phi(x, y; \hat{\lambda}) \leq \Phi(\hat{x}, \hat{y}; \hat{\lambda}) \leq \Phi(\hat{x}, \hat{y}; \lambda)$ for all $x, y, \lambda \geq 0$
Hence the solutions for the above vector maximum problem are characterized by
this linear programming problem. In other words, if $(\hat{x}, \hat{y})$ is a solution of the above
vector maximum problem, it is a solution of the above linear programming
problem. And conversely, if $(\hat{x}, \hat{y})$ is a solution of the above linear programming
problem with $p_x > 0$, $p_y > 0$, then it is also a solution of the above vector maximum
problem.
If in the linear programming problem we choose $p_x$ and $p_y$ such that $p_x > 0$,
$p_y > 0$, and
(R) $l_{x1}/l_{y1} < p_x/p_y < l_{x2}/l_{y2}$
then we obtain point R of Figure 1.21 (as can be seen at once from the diagram).
This point R is called Ricardo's point by DOSSO ([6], p. 35),⁸ for this is exactly
the problem that David Ricardo was concerned with in his celebrated theory
$x/a_2 + y/b_2 \leq a/a_2$
$x \geq 0$, $y \geq 0$
This is essentially the problem that John Stuart Mill was trying to solve in [11].
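(QSP'_m)
(45) $\hat{u}_x - \hat{\lambda}_1/a_1 - \hat{\lambda}_2/a_2 \leq 0$, $\hat{u}_y - \hat{\lambda}_1/b_1 - \hat{\lambda}_2/b_2 \leq 0$
(46) $\hat{x} [\hat{u}_x - \hat{\lambda}_1/a_1 - \hat{\lambda}_2/a_2] + \hat{y} [\hat{u}_y - \hat{\lambda}_1/b_1 - \hat{\lambda}_2/b_2] = 0$
(47) $b/b_1 - \hat{x}/a_1 - \hat{y}/b_1 \geq 0$, $a/a_2 - \hat{x}/a_2 - \hat{y}/b_2 \geq 0$
(48) $\hat{\lambda}_1 [b/b_1 - \hat{x}/a_1 - \hat{y}/b_1] + \hat{\lambda}_2 [a/a_2 - \hat{x}/a_2 - \hat{y}/b_2] = 0$
$\hat{x} \geq 0$, $\hat{y} \geq 0$, $\hat{\lambda}_1 \geq 0$, $\hat{\lambda}_2 \geq 0$
[where $\hat{u}_x \equiv \partial u / \partial x$, $\hat{u}_y \equiv \partial u / \partial y$, both evaluated at $(\hat{x}, \hat{y})$]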
To establish the converse of the above statement, we apply Arrow and Enthoven's
theorem of quasi-concave programming (Theorem 1.E.2). In this problem we
can assume $\hat{u}_x > 0$, or $\hat{u}_y > 0$, which is the case for the utility function
$u(x, y) = x^{\alpha} y^{1-\alpha}$. Then, from the Arrow-Enthoven theorem, the above (QSP'_m) condition is
sufficient for the optimum. Hence we can assert that $(\hat{x}, \hat{y})$ is a solution of the above
nonlinear programming problem if and only if the above (QSP'_m) holds.
Now assume
$b_1/a_1 < b_2/a_2$
as above and find the condition under which country 1 specializes in the production
of X and country 2 specializes in the production of Y (the Ricardian pattern
of complete specialization). This is the question raised by J. S. Mill. Mathematically
speaking, we are now seeking the condition under which the Ricardian
pattern of specialization ($\hat{x} = \hat{x}_1 = a_1$, $\hat{y} = \hat{y}_2 = b_2$, $\hat{y}_1 = 0$, and $\hat{x}_2 = 0$) is the solution
of the above nonlinear problem [hence satisfies the above (QSP'_m)].
The solution of Mill's problem is rather easy to see from Figure 1.21 and it
does not need the above machinery of nonlinear programming such as (QSP'_m).
Assuming that the utility function is nicely shaped, such as $u(x, y) = x^{\alpha} y^{1-\alpha}$,
$0 < \alpha < 1$, we can easily see from Figure 1.21 that the necessary and sufficient
condition for $(a_1, b_2)$ to be the solution of the above nonlinear programming
problem is simply that the slope of the indifference curve at $(a_1, b_2)$ be between
the slopes of the lines QR and RS. Letting $(\hat{x}, \hat{y}) = (a_1, b_2)$, we can write this
condition as follows:
(50) $b_1/a_1 \leq \hat{u}_x/\hat{u}_y \leq b_2/a_2$ (with at least one strict inequality)
where $\hat{u}_x$ and $\hat{u}_y$ are now defined respectively as $u_x$ and $u_y$ both evaluated at $(a_1,
b_2)$. This condition is often called Mill's condition.
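Mill's condition is easy to test numerically for the Cobb-Douglas case. The sketch below evaluates the ratio of marginal utilities at Ricardo's point and compares it with the two slopes b1/a1 and b2/a2, then cross-checks the answer by maximizing utility on a grid along the world production frontier. The numbers a1, b1, a2, b2 are hypothetical, the function names are introduced for this example only, and the frontier is written taking a = a1 + a2 and b = b1 + b2, which is what makes Ricardo's point satisfy both resource constraints with equality.

```python
def mrs(alpha, x, y):
    """u_x / u_y for u(x, y) = x**alpha * y**(1 - alpha)."""
    return alpha * y / ((1 - alpha) * x)

def mill_condition(alpha, a1, b1, a2, b2):
    """b1/a1 <= u_x/u_y <= b2/a2 at (a1, b2), with at least one strict inequality."""
    m = mrs(alpha, a1, b2)
    lo, hi = b1 / a1, b2 / a2
    return lo <= m <= hi and (lo < m or m < hi)

def frontier(x, a1, b1, a2, b2):
    """World production frontier QRS: maximal y for a given x."""
    a, b = a1 + a2, b1 + b2
    return b - (b1 / a1) * x if x <= a1 else (b2 / a2) * (a - x)

def best_x(alpha, a1, b1, a2, b2, steps=600):
    """Grid search for the utility-maximising x on the frontier."""
    a = a1 + a2
    xs = [a * k / steps for k in range(steps + 1)]
    return max(xs, key=lambda x: x ** alpha
               * frontier(x, a1, b1, a2, b2) ** (1 - alpha))

# Hypothetical capacities with b1/a1 = 0.5 < b2/a2 = 2.
a1, b1, a2, b2 = 4, 2, 2, 4

print(mill_condition(0.5, a1, b1, a2, b2), best_x(0.5, a1, b1, a2, b2))
print(mill_condition(0.9, a1, b1, a2, b2), best_x(0.9, a1, b1, a2, b2))
```

With alpha = 0.5 the condition holds and the grid optimum sits at x = a1, that is, at Ricardo's point; with alpha = 0.9 the demand for X is too strong, the condition fails, and the optimum moves along RS so that country 2 also produces some X.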
We now obtain this necessary and sufficient condition mathematically. This
procedure is more tedious than the one in terms of the above diagram, but it is
useful in order to become familiar with our nonlinear programming theory as
well as to obtain the precise understanding of the solution. Moreover, it will
facilitate a further generalization (see Takayama [17]). First introduce the
following assumption on the utility function, which will guarantee that $\hat{x} > 0$, $\hat{y} > 0$.
Notice that if we cannot guarantee $\hat{x} > 0$, $\hat{y} > 0$, then $(a_1, b_2)$ cannot be a
solution.
(A-m) $u(x, 0) = 0$ and $u(0, y) = 0$;
$\partial u / \partial x > 0$ and $\partial u / \partial y > 0$ for all $x > 0$, $y > 0$
An example of a utility function that satisfies the above assumption is one of the
Cobb-Douglas type, $u = x^{\alpha} y^{1-\alpha}$, $0 < \alpha < 1$. The feasible set for our nonlinear
programming problem is $M \equiv \{(x, y) : (x, y) \geq 0, \; x/a_1 + y/b_1 \leq b/b_1, \; x/a_2 + y/b_2
\leq a/a_2\}$. The set $M$ is nonempty and contains a point $(\bar{x}, \bar{y})$ with $\bar{x} > 0$
and $\bar{y} > 0$. (Note that this also implies that Slater's condition holds for the present
problem.) Hence from the above assumption, $u(\bar{x}, \bar{y}) > u(x, 0)$ for all $x \geq 0$ and
$u(\bar{x}, \bar{y}) > u(0, y)$ for all $y \geq 0$. Therefore, an optimal point $(\hat{x}, \hat{y})$ must be such that
$\hat{x} > 0$ and $\hat{y} > 0$. Then the first two conditions (45) and (46) of the above (QSP'_m)
can be converted to the following equivalent condition:
(51) $\hat{u}_x - \hat{\lambda}_1/a_1 - \hat{\lambda}_2/a_2 = 0$, $\hat{u}_y - \hat{\lambda}_1/b_1 - \hat{\lambda}_2/b_2 = 0$
Mill's problem is that of finding the condition under which Ricardo's point
$(a_1, b_2)$ is optimal. Since at $(a_1, b_2)$ the two relations in condition (47) of the above
(QSP'_m) hold with equality, (48) is automatically satisfied; thus at $(a_1, b_2)$ both (47)
and (48) of (QSP'_m) are satisfied. Hence the necessary and sufficient condition for
Ricardo's point $(a_1, b_2)$ to be optimal is reduced to the following condition:
(52) There exist $\hat{\lambda}_1 \geq 0$, $\hat{\lambda}_2 \geq 0$ (with at least one strict inequality)¹⁰ such that
condition (51) holds at $(a_1, b_2)$.
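Solving (51) for $\hat{\lambda}_1$ and $\hat{\lambda}_2$, we obtain
(53-a) $\left(1 - \frac{a_2 b_1}{a_1 b_2}\right) \hat{\lambda}_1 = \left(\hat{u}_y - \frac{a_2}{b_2} \hat{u}_x\right) b_1$
(53-b) $\left(1 - \frac{a_2 b_1}{a_1 b_2}\right) \hat{\lambda}_2 = \left(\hat{u}_x - \frac{b_1}{a_1} \hat{u}_y\right) a_2$
where $\hat{u}_x \equiv u_x(a_1, b_2)$ and $\hat{u}_y \equiv u_y(a_1, b_2)$. Since $b_1/a_1 < b_2/a_2$ by hypothesis, we
have
(54) $\frac{a_2 b_1}{a_1 b_2} < 1$
Therefore, recalling $\hat{\lambda}_1 \geq 0$, $\hat{\lambda}_2 \geq 0$, $a_2 > 0$, and $b_1 > 0$, we obtain from (53-a) and (53-b)
(55) $\frac{a_2}{b_2} \hat{u}_x \leq \hat{u}_y \leq \frac{a_1}{b_1} \hat{u}_x$
and
$\hat{u}_y = \frac{a_2}{b_2} \hat{u}_x$ if $\hat{\lambda}_1 = 0$
and
$\hat{u}_y = \frac{a_1}{b_1} \hat{u}_x$ if $\hat{\lambda}_2 = 0$
Since $\hat{\lambda}_1$ and $\hat{\lambda}_2$ cannot both vanish by (52), this gives
(56) $\frac{b_1}{a_1} \leq \frac{\hat{u}_x}{\hat{u}_y} \leq \frac{b_2}{a_2}$ (with at least one strict inequality)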
which is Mill's condition. Therefore, if condition (52) is satisfied, then Mill's
condition (56) is satisfied. Conversely, if (56) is satisfied, then we can obtain
condition (52). If (56) holds with strict inequalities, then obtain $\hat{\lambda}_1$ and $\hat{\lambda}_2$ from
(53-a) and (53-b). If (56) holds with one equality, say, $b_1/a_1 = \hat{u}_x/\hat{u}_y$, then define
$\hat{\lambda}_2$ as $\hat{\lambda}_2 = 0$ and obtain $\hat{\lambda}_1$ from (53-a) as $\hat{\lambda}_1 = b_1 \hat{u}_y$. Thus obtained, $\hat{\lambda}_1$ and $\hat{\lambda}_2$
will satisfy condition (52). This finishes the mathematical proof that Mill's condition
is a necessary and sufficient condition for Ricardo's point to be optimal.
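The multipliers in this argument can be computed explicitly. The sketch below eliminates one multiplier at a time from the two equalities of (51), u_x = λ1/a1 + λ2/a2 and u_y = λ1/b1 + λ2/b2, which yields closed forms for λ1 and λ2, and then substitutes them back to confirm that both residuals vanish. The capacities and the utility function u(x, y) = xy are assumptions chosen for illustration, and the function name multipliers is hypothetical.

```python
from fractions import Fraction as F

def multipliers(ux, uy, a1, b1, a2, b2):
    """Solve the two equalities of (51) for (lambda_1, lambda_2) by elimination."""
    k = 1 - F(a2 * b1, a1 * b2)            # positive when b1/a1 < b2/a2
    lam1 = (uy - F(a2, b2) * ux) * b1 / k  # eliminate lambda_2
    lam2 = (ux - F(b1, a1) * uy) * a2 / k  # eliminate lambda_1
    return lam1, lam2

# Hypothetical capacities with b1/a1 < b2/a2.
a1, b1, a2, b2 = 4, 2, 2, 4

# For u(x, y) = x*y, the marginal utilities at Ricardo's point (a1, b2)
# are u_x = b2 and u_y = a1.
ux, uy = b2, a1
lam1, lam2 = multipliers(ux, uy, a1, b1, a2, b2)

# Substitute back into (51): both residuals should be exactly zero.
residual_x = ux - lam1 / a1 - lam2 / a2
residual_y = uy - lam1 / b1 - lam2 / b2

print(lam1, lam2, residual_x, residual_y)
```

Both multipliers come out strictly positive here, which corresponds to the case where the ratio of marginal utilities lies strictly between b1/a1 and b2/a2.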
It should be noted that Mill's condition and the above observation are
crucially dependent on the specification of the utility function. If we adopt a different
form of $u$, then we will obtain a different condition. As Chipman noted ([3],
p. 489), Mill realized this point and attempted to analyze more general cases
[more general, that is, than the case in which $u(x, y) = xy$]. (See [11], Book
III, chap. 18. esp. secs. 8 and 9.) However his mathematical equipment precluded
the derivation of any exact condition.
FOOTNOTES
1. The utility function $u$ may not be defined outside $\Omega^n$, the nonnegative orthant of $R^n$.
However, it would be more convenient to conceive that $u$ is defined over the entire
space $R^n$, in order to avoid the possibility of "corner" derivatives when we talk about
the (QSP') condition. Clearly, the consumer cannot place any utility outside his
consumption set $\Omega^n$; hence the definition of $u$ outside of $\Omega^n$ can be more or less
arbitrary, as long as differentiability is preserved. This convention of extending the
domain of the function is often useful in many economic problems in which many
functions are, strictly speaking, defined only on the nonnegative orthant, and in
which we are concerned with (QSP').
2. For the meaning of the "relevant variable," see Section E of this chapter or Arrow
and Enthoven [1], p. 783. This concept does not create any problem in the present
problem of consumer's choice.
3. Most readers are probably familiar with the procedure of obtaining the Hicks-Slutsky
equation. Clearly the author is not discounting any of the glory of the classical
demand theory a la Slutsky, Hicks, and so on. In the Appendix to this section, we
attempt the exposition of the classical demand theory, as an example of the time-
honored technique in economics, comparative statics. Later we will take up a modern
approach to the Hicks-Slutsky equation (Chapter 2, Section D).
4. The (strict) quasi-concavity of the utility function means that the consumer desires
to consume a variety of commodities rather than to consume any one commodity.
5. Here we are using condition (iv) of Theorem 1.D.4 (the A-H-U theorem), which is the
same as condition (ii) of Theorem 1.E.3 under the quasi-concavity of $F$. The condition
requires (a) condition (18) in addition to (b) the quasi-concavity of $F$ (or the convexity
of the constraint set), and (c) the existence of $(v, x) > 0$ with $F(v, x) > 0$ (or the
existence of an interior point in the constraint set).
6. As remarked above, assuming that at least one output is produced at the optimum,
we have $\hat{\lambda} > 0$ so that $F(\hat{v}, \hat{x}) = 0$. In other words, the constraint $F(v, x) \geq 0$ is
effective at the optimum. If we do not have the constraint $(v, x) \geq 0$, the rank condition
is satisfied if condition (18) holds (which ensures $[\hat{F}_v, \hat{F}_x] \neq 0$). A stronger
condition such as (18') is required for the present problem in view of the nonnegativity
constraints, $x \geq 0$ and $v \geq 0$. Note that, to ensure the rank condition,
neither the quasi-concavity of $F$ nor the existence of $(v, x) > 0$ with $F(v, x) > 0$ is
required.
7. Under the rank condition, (QSP') is necessary for an optimum. Under the quasi-
concavity of F, (QSP') is sufficient for an optimum.
8. DOSSO is the standard nickname of Dorfman, Samuelson, and Solow [6].
9. The optimality here is defined as the maximization of the value of output under a fixed
price vector $p$. Notice that the maximization of $p_x x + p_y y$ (resp. $p_x x_i + p_y y_i$, $i = 1, 2$)
is equivalent to the maximization of "real income" $(p_x/p_y)x + y$ or $x + (p_y/p_x)y$
[resp. $(p_x/p_y)x_i + y_i$ or $x_i + (p_y/p_x)y_i$, $i = 1, 2$], as long as $(p_x, p_y)$ is a fixed vector.
Notice also that a country can increase its welfare from the above "optimum" position
if it is allowed to alter $p$ or the terms of trade $p_x/p_y$. This will, in general, imply a
loss to the other country. The optimum tariff argument is concerned with the choice
of $p_x/p_y$ by means of tariffs so as to maximize one country's welfare.
10. If $\hat{\lambda}_1 = \hat{\lambda}_2 = 0$, then from (51) we obtain $\hat{u}_x = \hat{u}_y = 0$ which, in view of $\hat{x} > 0$ and
$\hat{y} > 0$, contradicts (A-m) (in particular, $u_x > 0$, $u_y > 0$ for all $x > 0$ and $y > 0$).
REFERENCES
1. Arrow, K., and Enthoven, A. C., "Quasi-Concave Programming," Econometrica,
vol. 29, October 1961.
2. Carlson, S., A Study on the Pure Theory of Production, Oxford, Basil Blackwell,
1956.
3. Chipman, J. S., "A Survey of the Theory of International Trade, Part 1, The Classical
Theory," Econometrica, vol. 33, July 1965.
4. Dantzig, G. B., Linear Programming and Extensions, Princeton, N.J., Princeton
University Press, 1963.
5. Dantzig, G. B., and Orden, A., "Notes on Linear Programming: Part II, Duality
Theorem," Rand Corporation, Research Memorandum, RM 1265, October 30, 1953.
6. Dorfman, R., Samuelson, P. A., and Solow, R. M., Linear Programming and Economic
Analysis, New York, McGraw-Hill, 1958.
7. Eisenberg, E., "Duality in Homogeneous Programming," Proceedings of the American
Mathematical Society, 12, October 1961.
8. Goldman, A. J., and Tucker, A. W., "Theory of Linear Programming," in Linear
Inequalities and Related Systems, ed. by H. W. Kuhn and A. W. Tucker, Princeton,
N.J., Princeton University Press, 1956.
9. Hadley, G., Linear Programming, Reading, Mass., Addison-Wesley, 1962.
10. Hicks, J. R., Value and Capital, 2nd ed., Oxford, Clarendon Press, 1946.
11. Mill, J. S., Principles of Political Economy, 3rd ed., London, Parker & Co., 1852 (1st
ed. 1848 by Parker, 9th ed. 1885 by Longmans, Green & Co.).
12. Nikaido, H., Introduction to Sets and Mappings in Modern Economics, tr. by K. Sato,
Amsterdam, North-Holland, 1970 (the Japanese original, Tokyo, 1960).
13. Ricardo, D., On the Principles of Political Economy and Taxation, London, John
Murray, 1817, in The Works and Correspondence of David Ricardo, Vol. 1, ed. by
P. Sraffa, Cambridge, Cambridge University Press, 1951.
14. Samuelson, P. A., Foundations of Economic Analysis, Cambridge, Mass., Harvard
University Press, 1947.
15. Shephard, R. W., Cost and Production Functions, Princeton, N.J., Princeton Univer-
sity Press, 1953.
16. Takayama, A., International Economics, Tokyo, Toyo-Keizai Shimpo-sha, 1963 (in
Japanese).
17. Takayama, A., International Trade-An Approach to the Theory, New York, Holt, Rinehart
and Winston, 1972, chaps. 4, 5, 6, and 7.
concisely so that the reader may be able to refresh his understanding of the theory
in a proper perspective.
Let $f(x, a)$ and $g_j(x, a)$, $j = 1, 2, \ldots, m$, be real-valued functions defined on
$X \otimes A$, where $X$ and $A$ are, respectively, open subsets of $R^n$ and $R^l$. We assume
(A-1) All the second partial derivatives of $f$ and $g_j$, $j = 1, 2, \ldots, m$, exist and
are continuous for all $(x, a)$ in $X \otimes A$.
Consider the following maximization problem:
Maximize (over $x$): $f(x, a)$
Subject to: $g_j(x, a) = 0$, $j = 1, 2, \ldots, m$, and $x \in X$
or equivalently,
(BHC') $(-1)^r \begin{vmatrix} A_r & B_{mr}' \\ B_{mr} & 0 \end{vmatrix} > 0, \quad r = m + 1, \ldots, n$
where Ar and Bmr are defined by
b. COMPARATIVE STATICS
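Hereafter, we assume (LM) and (R), so that (FOC) and (SONC) hold.
Condition (FOC) provides $(n + m)$ equations, which are then available to determine
the $(n + m)$ variables, $\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_n, \hat{\lambda}_1, \ldots, \hat{\lambda}_m$. Assume
(A-2) $\det \begin{bmatrix} 0 & \hat{B} \\ \hat{B}' & \hat{A} \end{bmatrix} \ne 0$
Then under assumptions (A-1) and (A-2), we can directly apply the implicit function
theorem.⁸ Thus we can conclude that there exist continuously differentiable
functions $x(a)$ and $\lambda(a)$ such that $\hat{x} = x(\hat{a})$ and $\hat{\lambda} = \lambda(\hat{a})$ and
(4-a) $\Phi_x[x(a), \lambda(a), a] = 0$
(4-b) $g[x(a), a] = 0$⁹
Differentiating (4-a) and (4-b) with respect to $a_k$, we obtain
(5) $\Phi_{xx}\,\dfrac{\partial x(a)}{\partial a_k} + \Phi_{x\lambda}\,\dfrac{\partial \lambda(a)}{\partial a_k} + \Phi_{x a_k} = 0, \qquad \Phi_{\lambda x}\,\dfrac{\partial x(a)}{\partial a_k} + \Phi_{\lambda a_k} = 0$
or, in matrix form,
(5') $\begin{bmatrix} 0 & \Phi_{\lambda x} \\ \Phi_{x\lambda} & \Phi_{xx} \end{bmatrix} \begin{bmatrix} \partial \lambda(a)/\partial a_k \\ \partial x(a)/\partial a_k \end{bmatrix} + \begin{bmatrix} \Phi_{\lambda a_k} \\ \Phi_{x a_k} \end{bmatrix} = 0$
for all $a \in N(\hat{a})$, where all the second partials of $\Phi$ are evaluated at $a$; that is,
$\Phi_{xx} = \Phi_{xx}[x(a), \lambda(a), a]$, and so on. Needless to say, $\Phi_{\lambda x} = \Phi_{x\lambda}' = g_x[x(a), a]$.¹⁰
(6) $H = \begin{bmatrix} 0 & \Phi_{\lambda x} \\ \Phi_{x\lambda} & \Phi_{xx} \end{bmatrix}$ and $\hat{H} = \begin{bmatrix} 0 & \hat{B} \\ \hat{B}' & \hat{A} \end{bmatrix}$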
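By (A-1) and the continuity of the functions $x(a)$ and $\lambda(a)$, every element of $H$
is continuous in $a$. But by (A-2), $\hat{H}$ is nonsingular. Hence $H$ is also nonsingular
for all $a$ in some neighborhood of $\hat{a}$, say, $\tilde{N}(\hat{a})$, where $\tilde{N}(\hat{a}) \subset N(\hat{a})$. Therefore,
from (5) we obtain¹²
(7) $\begin{bmatrix} \partial \lambda/\partial a_k \\ \partial x/\partial a_k \end{bmatrix} = -H^{-1} \begin{bmatrix} \Phi_{\lambda a_k} \\ \Phi_{x a_k} \end{bmatrix}$
for all $a$ in $\tilde{N}(\hat{a})$. This equation is the fundamental equation of comparative statics
obtained from (FOC).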
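The fundamental equation of comparative statics can be checked on a small case. The sketch below is illustrative only: the problem (maximize a ln x1 + ln x2 subject to 1 - x1 - x2 = 0), the function names, and the use of Cramer's rule are all assumptions introduced here, not taken from the text. Its closed-form solution is x1 = a/(1+a), x2 = 1/(1+a), so the derivatives obtained from the bordered Hessian can be compared with the analytic ones.

```python
def solve3(A, b):
    """Solve a 3x3 linear system A v = b by Cramer's rule."""
    def det(M):
        return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
                - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
                + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))
    d = det(A)
    solution = []
    for col in range(3):
        Ac = [row[:] for row in A]
        for r in range(3):
            Ac[r][col] = b[r]          # replace one column by the right-hand side
        solution.append(det(Ac) / d)
    return solution


def comparative_statics(a):
    """Return [dlambda/da, dx1/da, dx2/da] for the hypothetical problem
    maximize a*ln(x1) + ln(x2) subject to 1 - x1 - x2 = 0."""
    x1, x2 = a / (1 + a), 1 / (1 + a)              # closed-form optimum
    # Bordered Hessian H = [[0, Phi_lambda_x], [Phi_x_lambda, Phi_xx]]
    H = [[0.0, -1.0, -1.0],
         [-1.0, -a / x1 ** 2, 0.0],
         [-1.0, 0.0, -1.0 / x2 ** 2]]
    # Minus the vector [Phi_lambda_a; Phi_x_a]: g does not depend on a,
    # and d(a/x1)/da = 1/x1 in the first-order condition for x1.
    rhs = [0.0, -1.0 / x1, 0.0]
    return solve3(H, rhs)


dlam, dx1, dx2 = comparative_statics(1.0)
```

At a = 1 the analytic derivatives are dx1/da = 1/(1+a)^2 = 0.25, dx2/da = -0.25, and dλ/da = 1, which is what the linear system returns.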
where $h_{ji}$ is the cofactor of the $i$-$j$ element of $H$ and $\det H$ is the determinant
of $H$. Then in view of (8), we can conclude that, for each $a \in \tilde{N}(\hat{a})$
(10) sgn(det H) _ (- 1)"'+"
and
156 DEVELOPMENTS OF NONLINEAR PROGRAMMING
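(17) $H = \begin{bmatrix} 0 & -p_1 & \cdots & -p_n \\ -p_1 & u_{11} & \cdots & u_{1n} \\ \vdots & \vdots & & \vdots \\ -p_n & u_{n1} & \cdots & u_{nn} \end{bmatrix}$
where $u_{ij} = \partial^2 u/\partial x_i \partial x_j$. Evaluate these $u_{ij}$'s at $\hat{x}$ and set $p = \hat{p}$ in the above $H$.
Denote it by $\hat{H}$, and assume
(18) $\det \hat{H} \ne 0$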
which corresponds to (A-2). We can then apply the implicit function theorem to
(16-a) and (16-b) and obtain the continuously differentiable functions $\lambda(p, M)$ and
$x(p, M)$ with $\hat{\lambda} = \lambda(\hat{p}, \hat{M})$, $\hat{x} = x(\hat{p}, \hat{M})$, and
(19-a) $M - p \cdot x(p, M) = 0$
(19-b) $u_i[x(p, M)] - \lambda(p, M)\, p_i = 0, \quad i = 1, 2, \ldots, n$
for all $(p, M)$ in some neighborhood of $(\hat{p}, \hat{M})$.
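Partially differentiating (19) with respect to $p_j$, we obtain
(20) $H \begin{bmatrix} \partial \lambda/\partial p_j \\ \partial x_1/\partial p_j \\ \vdots \\ \partial x_n/\partial p_j \end{bmatrix} = \begin{bmatrix} x_j \\ 0 \\ \vdots \\ \lambda \\ \vdots \\ 0 \end{bmatrix}$
for all $(p, M)$ in the neighborhood of $(\hat{p}, \hat{M})$, where $\lambda = \lambda(p, M)$ and $x = x(p, M)$,
and where $\lambda$ on the RHS appears in the $(j + 1)$th position.
Let $e_j$ be the $(n + 1)$-vector whose $j$th element is one and all other elements are
zero. The RHS of (20) can be rewritten as
$x_j e_1 + \lambda e_{j+1}$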
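Since $H$ is nonsingular in some neighborhood of $(\hat{p}, \hat{M})$, say, $N(\hat{p}, \hat{M})$, from
(18), we obtain
$\begin{bmatrix} \partial \lambda/\partial p_j \\ \partial x/\partial p_j \end{bmatrix} = x_j(p, M)\, H^{-1} e_1 + \lambda(p, M)\, H^{-1} e_{j+1}$
for all $(p, M)$ in $N(\hat{p}, \hat{M})$, where $\lambda = \lambda(p, M)$ and $x = x(p, M)$. In particular,
(24) $\dfrac{\partial x_i(p, M)}{\partial p_j} = S_{ij}(p, M) - x_j(p, M)\,\dfrac{\partial x_i(p, M)}{\partial M}, \quad i, j = 1, 2, \ldots, n$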
where
-
Next, partially differentiating the budget equation (19-a) with respect to $p_j$ and $M$,
respectively, we obtain
(33-a) $\sum_{i=1}^{n} p_i \dfrac{\partial x_i(p, M)}{\partial p_j} + x_j = 0, \quad j = 1, 2, \ldots, n$
(33-b) $\sum_{i=1}^{n} p_i \dfrac{\partial x_i(p, M)}{\partial M} = 1$
Combining (33) with (24), or also directly from (28) and (32), we obtain
(34) $\sum_{i=1}^{n} p_i S_{ij} = 0$ for all $(p, M)$ in $N(\hat{p}, \hat{M})$
Equations (28), (29), (30), (32), and (34) exhaust all the important properties
of $S$.¹⁹
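Relations (33-a), (33-b), and (34) can be verified numerically for a specific demand system. The following is a minimal sketch assuming Cobb-Douglas demand functions with a hypothetical share parameter ALPHA; the helper names and the finite-difference step are illustrative choices, and the substitution terms are computed from (24) as S_ij = ∂x_i/∂p_j + x_j ∂x_i/∂M.

```python
ALPHA = 0.3  # hypothetical expenditure share of commodity 1


def demand(p, M):
    """Cobb-Douglas demand for two commodities (assumed for illustration)."""
    return [ALPHA * M / p[0], (1 - ALPHA) * M / p[1]]


def d_dp(i, j, p, M, h=1e-6):
    """Central-difference estimate of dx_i/dp_j."""
    up, dn = list(p), list(p)
    up[j] += h
    dn[j] -= h
    return (demand(up, M)[i] - demand(dn, M)[i]) / (2 * h)


def d_dM(i, p, M, h=1e-6):
    """Central-difference estimate of dx_i/dM."""
    return (demand(p, M + h)[i] - demand(p, M - h)[i]) / (2 * h)


def slutsky(p, M):
    """Substitution terms S_ij recovered from equation (24)."""
    x = demand(p, M)
    return [[d_dp(i, j, p, M) + x[j] * d_dM(i, p, M) for j in range(2)]
            for i in range(2)]


p, M = [2.0, 5.0], 100.0
x = demand(p, M)
S = slutsky(p, M)
# (33-a): sum_i p_i dx_i/dp_j + x_j = 0 for each j
eq33a = [sum(p[i] * d_dp(i, j, p, M) for i in range(2)) + x[j] for j in range(2)]
# (33-b): sum_i p_i dx_i/dM = 1
eq33b = sum(p[i] * d_dM(i, p, M) for i in range(2))
# (34): sum_i p_i S_ij = 0 for each j
eq34 = [sum(p[i] * S[i][j] for i in range(2)) for j in range(2)]
```

In this example the own substitution term S[0][0] is negative and the cross term S[0][1] is positive, so the two commodities are substitutes.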
We may note that the following relations can be obtained directly from the
budget equation (19-a) by utilizing (33) but without using (24):
where
8x;(P,M)PL (price elasticity)
(36-a)
aPj Xi
axi(p, M) M
(36-b) 7ri (income elasticity)
8M X;
n
(37) Z i 1, 2,...,n
i=1
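Finally, utilizing (19-b) and assuming that $\lambda(p, M) \ne 0$,²⁰ rewrite the
bordered Hessian in this problem as defined in (17) as follows:
$\det H = \dfrac{1}{\lambda^2} \begin{vmatrix} 0 & u_1 & \cdots & u_n \\ u_1 & u_{11} & \cdots & u_{1n} \\ \vdots & \vdots & & \vdots \\ u_n & u_{n1} & \cdots & u_{nn} \end{vmatrix}$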
Then it should be clear that there is a close relation between the present formula-
tion and quasi-concave programming (that is, Theorems 1.E.2, 1.E.3, and 1.E.14).
The only important difference between the two approaches is that the quasi-con-
cavity of u together with the linearity (hence the concavity) of the constraint
function ensure the global result by specifying the signs of the principal minors of
H for all x.
where
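(42) $\dfrac{\partial \Phi(x, \lambda, a)}{\partial a_k} = \dfrac{\partial f(x, a)}{\partial a_k} + \lambda \cdot \dfrac{\partial g(x, a)}{\partial a_k}, \quad k = 1, 2, \ldots, l$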
REMARK: In the vector notation, (41) and (42) are respectively written
X24
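Now change the original maximization problem in such a way that the $k$th
parameter $a_k$ is considered as one of the choice variables (such as the $x_i$'s).
Then from the first-order conditions of this new problem, we have
$\dfrac{\partial f}{\partial a_k} + \lambda \cdot \dfrac{\partial g}{\partial a_k} = 0$
Thus
$\dfrac{\partial F(a)}{\partial a_k} = 0$
Write $(a_1, \ldots, a_{k-1}, a_{k+1}, \ldots, a_l) = \tilde{a}$. Then we have the following two
equations:
(45) $\phi(F, a_k) = F - F(a) = 0$
(46) $\dfrac{\partial \phi(F, a_k)}{\partial a_k} = -\dfrac{\partial F(a)}{\partial a_k} = 0$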
(49) $\dfrac{\partial U}{\partial M} = \lambda(p, M)$
which is the well-known result that the Lagrangian multiplier of the problem
signifies the marginal utility of income. We also obtain
(50) $\dfrac{\partial U}{\partial p_j} = -\lambda x_j, \quad j = 1, 2, \ldots, n$
Hence if $\lambda(p, M) > 0$, then an increase in any price will decrease the
consumer's satisfaction. Also, from (49) and (50), we have
$(\partial U/\partial p_j)/(\partial U/\partial M) = -x_j(p, M)$.
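Results (49) and (50) can be illustrated with a logarithmic Cobb-Douglas utility, for which demand, the multiplier, and the indirect utility are available in closed form. This is only a sketch under that assumed utility function; ALPHA, the function names, and the finite-difference helper are hypothetical and not part of the text.

```python
import math

ALPHA = 0.3  # hypothetical preference weight on commodity 1


def demand(p, M):
    """Maximizer of ALPHA*ln(x1) + (1-ALPHA)*ln(x2) subject to p.x = M."""
    return [ALPHA * M / p[0], (1 - ALPHA) * M / p[1]]


def multiplier(p, M):
    """Lagrangian multiplier of the same problem (equals 1/M here)."""
    return 1.0 / M


def indirect_utility(p, M):
    """U(p, M) = u[x(p, M)]."""
    x = demand(p, M)
    return ALPHA * math.log(x[0]) + (1 - ALPHA) * math.log(x[1])


def central_diff(fn, z, h=1e-6):
    """Central-difference derivative of a scalar function at z."""
    return (fn(z + h) - fn(z - h)) / (2 * h)


p, M = [2.0, 5.0], 100.0
x, lam = demand(p, M), multiplier(p, M)
dU_dM = central_diff(lambda m: indirect_utility(p, m), M)              # compare with (49)
dU_dp1 = central_diff(lambda q: indirect_utility([q, p[1]], M), p[0])  # compare with (50)
recovered_x1 = -dU_dp1 / dU_dM   # the ratio of (50) to (49) returns the demand for good 1
```

The last line shows the practical use of the ratio: demand can be recovered from the indirect utility function alone.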
EXAMPLE 2 (the meaning of the multipliers): In general, consider the
problem of choosing $x \in R^n$ so as to
Maximize: $f(x)$
Subject to: $g_j(x) = b_j$, $j = 1, 2, \ldots, m$
Let $x(b)$ and $\lambda(b)$ correspond to $x(a)$ and $\lambda(a)$ in (4), where $b = (b_1, b_2, \ldots, b_m)$.
Define
(51-a) $F(b) = f[x(b)]$
and
Thus the $j$th Lagrangian multiplier signifies the marginal rate of change of
the optimal value of the objective function with respect to a change in the
$j$th constraint. Example 1 is clearly a special case of this. Interpreting $b_j$
as the amount of the $j$th resource supply, $\lambda_j$ signifies the shadow price of the
$j$th resource.
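The shadow-price interpretation can be seen in a one-constraint case. The sketch below assumes a hypothetical problem, maximize ln x1 + ln x2 subject to x1 + x2 = b, and assumes the Lagrangian is written as f(x) + λ[b - g(x)], so that the multiplier and the derivative of the optimal value have the same sign; neither the problem nor the names come from the text.

```python
import math


def solution(b):
    """Optimum of: maximize ln(x1) + ln(x2) subject to x1 + x2 = b.

    With the Lagrangian ln(x1) + ln(x2) + lam * (b - x1 - x2), the first-order
    conditions 1/x1 = lam and 1/x2 = lam give x1 = x2 = b/2 and lam = 2/b.
    """
    return [b / 2.0, b / 2.0], 2.0 / b


def optimal_value(b):
    """F(b) = f[x(b)], the maximized objective as a function of the resource b."""
    x, _ = solution(b)
    return math.log(x[0]) + math.log(x[1])


b, h = 4.0, 1e-6
_, lam = solution(b)
# Marginal value of one more unit of the resource, by central difference
dF_db = (optimal_value(b + h) - optimal_value(b - h)) / (2 * h)
```

Here both the multiplier and the numerical derivative of F are 2/b, so one extra unit of the resource raises the attainable objective by approximately λ.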
EXAMPLE 3 (cost minimization):
Consider the following problem:
Maximize (over $x$): $-w \cdot x$ (= Minimize: $w \cdot x$)
Subject to: $g(x) = y$
where $w > 0$, $y > 0$, and $g(x)$, respectively, signify the input price vector, the
output (scalar), and the production function. Let $x(w, y)$ and $\lambda(w, y)$,
respectively, correspond to $x(a)$ and $\lambda(a)$ of (4).
Define²⁷
(53-a) $C(w, y) = w \cdot x(w, y)$
and
(53-b) $\Phi(x, \lambda, w, y) = -w \cdot x + \lambda[g(x) - y]$
Then by Theorem 1.F.4,²⁸
(54) $\dfrac{\partial C}{\partial y} = \lambda(w, y)$
so that the multiplier signifies the long-run marginal cost. Also we obtain
(55) $\dfrac{\partial C}{\partial w_i} = x_i(w, y), \quad i = 1, 2, \ldots, n$
so that an increase in any factor price increases the minimum total cost $C$.
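From (55), we obtain
$\sum_{i=1}^{n} w_i \dfrac{\partial C}{\partial w_i} = \sum_{i=1}^{n} w_i x_i(w, y) = C(w, y)$
for all $(w, y)$. Hence by Euler's theorem, the minimal cost function $C(w, y)$
is homogeneous of degree one in $w$. Also noting that $\partial^2 C/\partial w_i \partial y = \partial x_i/\partial y = \partial \lambda/\partial w_i$,
we obtain
$\lambda(w, y) = \dfrac{\partial C}{\partial y} = \sum_{i=1}^{n} w_i \dfrac{\partial x_i(w, y)}{\partial y} = \sum_{i=1}^{n} w_i \dfrac{\partial \lambda}{\partial w_i}$
for all $(w, y)$.
Thus $\lambda(w, y)$ is also homogeneous of degree one in $w$. Recall that by the first-order
condition, we have
$w_i = \lambda(w, y)\, g_i(x), \quad i = 1, 2, \ldots, n$
where $g_i(x) = \partial g(x)/\partial x_i$. Therefore if, in particular, the production function
is homogeneous of degree one in $x$ so that $\sum_{i=1}^{n} g_i(x) x_i = g(x)$ for all $x$,
then we obtain
(57) $C(w, y) = \sum_{i=1}^{n} w_i x_i = \lambda \sum_{i=1}^{n} g_i(x) x_i = \lambda(w, y)\, y$
and
(58) $\dfrac{\partial \lambda(w, y)}{\partial y} = 0$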
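The properties derived in this example, (54), (55), (57), and (58), can be checked for a constant-returns technology. The sketch below assumes a hypothetical Cobb-Douglas production function g(x) = x1^A x2^(1-A); the exponent A, the closed-form factor demands, and the function names are illustrative assumptions, not the text's own specification.

```python
A = 0.4  # hypothetical output elasticity of input 1


def factor_demand(w, y):
    """Cost-minimizing inputs for g(x) = x1**A * x2**(1-A) = y."""
    r = A * w[1] / ((1 - A) * w[0])      # optimal ratio x1/x2
    return [y * r ** (1 - A), y * r ** (-A)]


def cost(w, y):
    """Minimum total cost C(w, y) = w . x(w, y)."""
    x = factor_demand(w, y)
    return w[0] * x[0] + w[1] * x[1]


def multiplier(w, y):
    """lambda(w, y) from the first-order condition w1 = lambda * g1(x)."""
    x = factor_demand(w, y)
    g1 = A * y / x[0]                    # marginal product of input 1, using g(x) = y
    return w[0] / g1


w, y, h = [2.0, 3.0], 5.0, 1e-6
x, lam = factor_demand(w, y), multiplier(w, y)
dC_dy = (cost(w, y + h) - cost(w, y - h)) / (2 * h)                        # compare with (54)
dC_dw1 = (cost([w[0] + h, w[1]], y) - cost([w[0] - h, w[1]], y)) / (2 * h)  # compare with (55)
```

Under constant returns the multiplier does not depend on y, and total cost equals the multiplier times output, so marginal and average cost coincide.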
EXAMPLE 4 (the envelope of the short-run cost curves): In the problem of the
previous example, reinterpret $x$ as the vector of variable factor inputs and
consider the following problem.
Maximize (over $x$): $-[w \cdot x + f(k)]$ [= Minimize: $w \cdot x + f(k)$]
Subject to: $g(x, k) = y$ and $x \in R^n$
where $k$ and $f(k)$, respectively, signify the "size of the plant" and the "fixed
cost." For the sake of simplicity, $k$ is assumed to be a scalar rather than a
vector signifying the spectrum of capital goods. Let $x(w, y, k)$ and $\lambda(w, y, k)$,
respectively, correspond to $x(a)$ and $\lambda(a)$ of (4).
Define³⁰
(59-a) $C(w, y, k) = w \cdot x(w, y, k) + f(k)$
and
(59-b) $\Phi(x, \lambda, w, y, k) = -[w \cdot x + f(k)] + \lambda[g(x, k) - y]$
Then from Theorem 1.F.4,
(60) $\dfrac{\partial C(w, y, k)}{\partial y} = \lambda(w, y, k)$
so that the multiplier $\lambda$ in this problem signifies the short-run marginal cost.
Assume that $w$ is fixed and define the function $\phi$ by
(61) $\phi(C, y, k) = C - C(w, y, k) = 0$
For a fixed value of $k$, the graph of $\phi$ in the (C-y)-plane denotes a short-run
(total) cost curve. In the long-run case in which $k$ is allowed to adjust, we
have
(62) $\dfrac{\partial \Phi}{\partial k} = -f'(k) + \lambda \dfrac{\partial g}{\partial k} = 0$
from the first-order conditions. Since $\partial C(w, y, k)/\partial k = -\Phi_k$ by Theorem
1.F.4, we obtain from (61) and (62)
(63) $\dfrac{\partial \phi(C, y, k)}{\partial k} = 0$
Suppose that we can obtain the unique relation between $C$ and $y$ by
eliminating $k$ from (61) and (63) as
(64) $C = \epsilon(y)$
Then the graph of $\epsilon$ in the (C-y)-plane signifies the long-run cost curve
as the envelope of the short-run cost curves.
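The envelope construction can be made concrete with a simple family of short-run cost curves. The sketch below assumes a hypothetical short-run cost C(y, k) = y^2/k + k (variable cost y^2/k plus fixed cost f(k) = k, with w held fixed); eliminating k between C - C(y, k) = 0 and ∂C(y, k)/∂k = 0 gives k = y and the long-run curve C = 2y. The functional form and names are assumptions made for illustration only.

```python
def short_run_cost(y, k):
    """Hypothetical short-run total cost for plant size k."""
    return y * y / k + k


def long_run_cost(y):
    """Envelope: from dC/dk = -y**2/k**2 + 1 = 0 we get k = y, hence C = 2y."""
    return 2.0 * y


def short_run_marginal_cost(y, k):
    """Derivative of the short-run cost with respect to output."""
    return 2.0 * y / k


outputs = [0.5 * i for i in range(1, 21)]   # y = 0.5, 1.0, ..., 10.0
plants = [1.0, 2.0, 4.0, 8.0]
# The gap equals (y - k)**2 / k, so every short-run curve lies on or above the envelope
gaps = [short_run_cost(y, k) - long_run_cost(y) for y in outputs for k in plants]
# Each short-run curve touches the envelope at the output for which its plant is optimal
touch = [short_run_cost(k, k) == long_run_cost(k) for k in plants]
```

At each tangency point the short-run marginal cost 2y/k equals 2, the slope of the long-run curve, which is the tangency property that defines an envelope.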
FOOTNOTES
1. This section is indebted to Otani [11] as well as Hicks [8] and Samuelson [12].
calculus. It is important to realize that this is a "local" theorem in the sense that
the above neighborhood may be very small.
9. Let $\Phi_\lambda$ be the gradient vector of $\Phi$ with respect to $\lambda$. Then by definition of $\Phi$, we
can rewrite (4-b) as $\Phi_\lambda[x(a), a] = 0$.
10. Here $\Phi_{x a_k}$ is the $n$-dimensional column vector whose $i$th element is
$\partial^2 \Phi[x(a), \lambda(a), a]/\partial x_i \partial a_k$, and similarly for $\Phi_{\lambda a_k}$. Clearly the $j$th element of $\Phi_{\lambda a_k}$ is equal to
$\partial g_j[x(a), a]/\partial a_k$.
11. Clearly, $\hat{H}$ is obtained from $H$ by evaluating every element of $H$ at $\hat{a}$.
12. One should note that, in many applications in economics, (A-2) fails to hold; thus
$H^{-1}$ fails to exist. The homogeneity and the concavity of the relevant functions are
often the source of such singularity.
13. This also means that (SOSC) holds for all $a$ in $\tilde{N}(\hat{a})$. Therefore, under (FOC) and
(R), we can conclude that, for each fixed $a$ in $\tilde{N}(\hat{a})$, $x(a)$ achieves a local maximum
of $f(x, a)$ subject to $g(x, a) = 0$, $x \in X$.
14. Note that $h_{m+n,m+n}$ is the $(m + n)$-$(m + n)$ cofactor of $H$; hence it has the sign
opposite of $\det H$ as a result of (8). Thus $\operatorname{sgn} h_{m+n,m+n} = (-1)^{m+n-1}$. From this,
we can deduce (11) by using the property of a determinant that a simultaneous
permutation of rows and columns does not alter the sign and the value of the
determinant.
15. They respectively correspond to $0$, $\Phi_{\lambda x}$, $\Phi_{x\lambda}$, and $\Phi_{xx}$ in $H$.
16. In general, we have the following theorem. Let $H$ be any $(m + n) \times (m + n)$ symmetric
matrix with real entries. Assume that $H$ is decomposed in the form of (6)
as $\hat{H}$. Assume also that rank $B = m$ and that $A$ is negative definite subject to $Bh = 0$.
Then $H^{-1}$ exists and $K_4$ of $H^{-1}$ [where $H^{-1}$ is decomposed as (14)] is negative
semidefinite. See Caratheodory ([4], pp. 195-196). Samuelson ([12], pp. 378-379)
contains the statement of such a theorem.
17. Since $S_{ii} < 0$, we have $\partial x_i(p, M)/\partial p_i < -x_i\, \partial x_i(p, M)/\partial M$, $i = 1, 2, \ldots, n$, from
(24). Commodity $i$ is defined as Giffen if $\partial x_i(p, M)/\partial p_i > 0$, and inferior if
$\partial x_i(p, M)/\partial M < 0$. Hence it is clear that every Giffen commodity is inferior, but
not necessarily vice versa.
18. The homogeneity of $x(p, M)$ is due to the fact that, in the original maximization
problem, $M - p \cdot x = 0$ if and only if $cM - (cp) \cdot x = 0$ for all scalars $c > 0$, so that
$\hat{x} = x(p, M) = x(cp, cM)$ for all $c > 0$.
19. The term $S_{ij}$ is called the net (or pure) substitution term. To understand the meaning
of this term, consider the problem of choosing $x \in R^n$ so as to minimize $p \cdot x$
(expenditures) subject to $u(x) = \bar{u}$, where $\bar{u}$ is fixed. Denote the solution to this
dual problem by $x = h(p, \bar{u})$. If $\bar{u} = u[x(p, M)]$, then the solution of the utility
maximization problem becomes the solution of this outlay minimization problem,
and it can be shown that $S_{ij} = \partial h_i(p, \bar{u})/\partial p_j$, that is, a change in the demand for
$i$ when $p_j$ is changed with a compensated change in income so as to keep the level
of utility $\bar{u}$ fixed. Two commodities $i$ and $j$ ($i \ne j$) are said to be substitutes if $S_{ij} > 0$ and
complements if $S_{ij} < 0$. From (34), it is clear that at least one pair must be substitutes
(that is, it is not possible that all commodities are complements).
20. This is satisfied if $u_i[x(p, M)] > 0$ for all $i$ (nonsatiation), as remarked earlier.
21. The discussion here can easily be carried out in the global context under proper
assumptions.
22. By definition of $\Phi$, we may also write (40) as $\Psi(a) = \Phi[x(a), \lambda(a), a]$.
23. The relation $\partial F/\partial a_k = \partial \Phi/\partial a_k$ is due to Afriat ([1], pp. 355-357). The proposition
in the form of this theorem is found in Otani [11]. See also D. G. Luenberger,
Optimization by Vector Space Methods, New York, Wiley, 1969, pp. 221-223.
24. The notations $F_a$, $\Psi_a$, and $\Phi_a$, respectively, denote the gradient vectors of $F$, $\Psi$, and
$\Phi$ with respect to $a$.
25. Here, $x_a$ is the $(n \times l)$ (Jacobian) matrix whose (i-k) element is $\partial x_i/\partial a_k$. Similarly,
$\lambda_a$ is the $(m \times l)$ matrix whose (j-k) element is $\partial \lambda_j/\partial a_k$. The proof is a simple
application of the chain rule (Theorem 1.C.2) with the first-order conditions [in
the form of (4-a) and (4-b)].
26. Consider, for example, the family of curves $y = (x - a)^2$ in the (x-y)-plane
where $a$ is a parameter. This is the family of parabolas obtained by translating
$y = x^2$ in the direction of the $x$-axis. Clearly the $x$-axis (that is, $y = 0$) is the
envelope. This is obtained by eliminating $a$ from $f(x, y, a) = y - (x - a)^2 = 0$ and
$\partial f/\partial a = 2(x - a) = 0$. In general, consider $f(x, y, a) = 0$ where $(x, y) \in R^2$ and
$a \in R$. Regarding $a$ as a parameter, we obtain a family of curves in the (x-y)-plane.
An envelope of a family of curves is a curve with the following two properties:
(1) At every one of its points it is tangent to at least one curve of the family; (2)
it is tangent to every curve of the family at at least one point. An envelope may
not be unique [for example, consider the family of circles, $(x - a)^2 + y^2 = 1$]. The
envelope of a family of curves is the union of its envelopes. The envelope is obtained
by eliminating $a$ from the two equations $f(x, y, a) = 0$ and $\partial f(x, y, a)/\partial a = 0$.
In general, the envelope of multiparameter surfaces, $f(z_1, z_2, \ldots, z_n; a_1, a_2, \ldots, a_s) = 0$,
is obtained by eliminating $a$ from this equation together with $\partial f(z, a)/\partial a_k = 0$,
$k = 1, 2, \ldots, s$. It is, of course, possible that a family of curves or surfaces may
never generate an envelope. The above procedure only gives a necessary (but not
necessarily a sufficient) condition to obtain an envelope. The exposition of the
envelope is found in most textbooks of advanced calculus (for example, E. B. Wilson,
1911; W. F. Osgood, 1925; H. B. Fine, 1937; D. V. Widder, 1947; J. M. H. Olmsted,
1961, and so on) and classical treatments of differential geometry.
27. Here C(w, y) is the long-run minimum (total) cost for given (w, y). Fixing w, the
graph of C as a function of y is the total cost curve that appears in many text-
books on price theory.
28. Note that $\partial C/\partial y = -\partial \Phi/\partial y$ by Theorem 1.F.4.
29. Equation (57) says that the average cost is equal to the marginal cost if constant
returns to scale prevail. To obtain (58), note that $\partial (C/y)/\partial y = [(\partial C/\partial y) - (C/y)]/y$,
which is equal to zero by (57). Equation (58) says that the function $\lambda$ is independent
of $y$. Hence we may write $\lambda(w, y) = \mu(w)$. Relations (57) and (58) are important in
the results known as the Shephard-Samuelson theorem. See Shephard [13] and
Takayama ([14], pp. 549-551).
30. The function C(w, y, k) signifies the short-run minimum total cost, for given (w, y, k).
REFERENCES
1. Afriat, S. N., "Theory of Maxima and the Method of Lagrange," SIAM Journal
on Applied Mathematics, 20, May 1971.
2. Bliss, G. M., Lectures on the Calculus of Variations, Chicago, University of Chicago
Press, 1946.
3. Burger, E., "On Extrema with Side Conditions," Econometrica, 23, October 1955.
4. Caratheodory, C., Calculus of Variations and Partial Differential Equations of the
First Order, Part II, San Francisco, Holden Day, 1967, esp. chap. I1 (German
original, 1935).
5. Debreu, G., "Definite and Semidefinite Quadratic Forms," Econometrica, 20, April
1952.
6. El-Hodiri, M. A., Constrained Extrema: Introduction to the Differentiable Case with
Economic Applications, New York, Springer-Verlag, 1971.
7. Hestenes, M. R., Calculus of Variations and Optimal Control Theory, New York,
Wiley, 1966, chap. 1.
8. Hicks, J. R., Value and Capital, 2nd ed., Oxford, Clarendon Press, 1946 (1st ed. 1939).
9. Mangasarian, O. L., Nonlinear Programming, New York, McGraw-Hill, 1969.
10. Mann, H. B., "Quadratic Forms with Linear Constraints," American Mathematical
Monthly, 1943: reprinted in Readings in Mathematical Economics, ed. by P. Newman,
Baltimore, Md., Johns Hopkins Press, Vol. 1, 1968.
11. Otani, Y., Microeconomic Theory, lecture notes at Purdue University, Fall 1971.
12. Samuelson, P. A., Foundations of Economic Analysis, Cambridge, Mass., Harvard
University Press, 1947.
13. Shephard, R. W., Cost and Production Functions, Princeton, N.J., Princeton
University Press, 1953.
14. Takayama, A., International Trade-An Approach to the Theory, New York, Holt,
Rinehart and Winston, 1972.
15. Uzawa, H., "A Note on the Menger-Wieser Theory of Imputation," Zeitschrift für
Nationalökonomie, XVIII, August 1958.
2
THE THEORY OF COMPETITIVE MARKETS
Section A
INTRODUCTION
assume that only the commodity bundles that a consumer wishes to consume
enter into his decision making and that the prices of the commodities do not
affect his preferences. In real life, one may wish to consume a certain commodity
mainly because it is expensive. Such a "snob effect" is assumed away in this
chapter. In this connection, we must note that there is an important complication
in the theory of consumer's choice. Unlike the theory of production, we do not
have a measurable behavioral criterion such as profit. However, it turns out that
important results in the theory of consumer's choice and the subsequent results
in the theory of competitive markets can be obtained without any reference to the
measurability of individual's satisfaction. We will observe this throughout the
chapter.
In addition to the study of the behavior of each type of economic agent,
the theory of competitive markets is also concerned with the interaction of many
agents in the economy. This is the question of a competitive equilibrium. Essentially,
a competitive equilibrium is a state of affairs in which each consumer maximizes
his satisfaction given his budget set defined by the prevailing price vector, each
producer maximizes his profit given the same price vector, and the total supply
of commodities is equal to the total demand for commodities.¹ In this chapter
we study the following two aspects of a competitive equilibrium.
$x_i$ is taken to be a vector whose components are all nonnegative. If Mr. i has resources
$\bar{x}_i$ and if he gets all his income by selling this $\bar{x}_i$, his income will be
$p \cdot \bar{x}_i$, when price vector $p$ prevails in the market. His budget constraint is thus
$p \cdot x_i \le p \cdot \bar{x}_i$
We may note that Mr. i, if he wishes, can retain a part of his resources for his
own consumption. One way to handle this is to (fictitiously) suppose that he sells
all his resources in the market and buys some of them back (with zero transaction
cost).
Some simple examples may clarify the above budget relation. Suppose Mr. i
has holdings of only one commodity, A. Suppose further that his initial holding
of A is the amount $\bar{X}_a$ and that he sells a part of A, say, $X_a$. His consumption of
A will then be $(\bar{X}_a - X_a)$. Now let us assume that he also consumes commodity
B of which he has no initial holding. Letting $p_a$ and $p_b$ denote the price of A and
B, respectively, his budget constraint can be written as
$p_a(\bar{X}_a - X_a) + p_b X_b \le p_a \bar{X}_a$
We may suppose that this budget constraint means that he sells all his initial
holdings of A, $\bar{X}_a$, and that he buys back the amount $(\bar{X}_a - X_a)$. The commodities
A and B could be such goods as apples and bananas, or one of them
could be labor services (or leisure). In other words, we may interpret $\bar{X}_a$ = 24 hours
a day, $X_a$ = the amount of labor he sells to the market per day, and $(\bar{X}_a - X_a)$ =
the amount of leisure he consumes per day.
The above budget relation could also be written as
$p_b X_b \le p_a X_a$
If we write the equation this way, we have to infer from $X_a$ the amount of commodity
A which enters into his consumption and his preference ordering. This
can be done by specifying the value of $\bar{X}_a$ and computing the value of $(\bar{X}_a - X_a)$.
As long as the total amounts of resources held by him are fixed, it does not
make any fundamental difference how we write the budget relation. In fact, we
may rewrite the above relation as
$p_a(-X_a) + p_b X_b \le 0$
We then consider his consumption bundle as $(-X_a, X_b)$ and his total budget
(income) as zero. We may suppose that his preference ordering (or utility function)
is defined on all possible values of $(-X_a, X_b)$ instead of all possible values of
$(\bar{X}_a - X_a, X_b)$. One caution to note is that under this supposition, his consumption
vector is no longer nonnegative.
In general, we can rewrite $p \cdot x_i \le p \cdot \bar{x}_i$ as $p \cdot z_i \le 0$, where $z_i \equiv x_i - \bar{x}_i$, and
consider $z_i$ as his consumption bundle and 0 as his total budget. Negative elements
in $z_i$ represent quantities of commodities supplied and positive elements in $z_i$
represent quantities received. In this convention, the consumption set $X_i$ is the
set of all possible consumptions and trades (of the $i$th consumer). Usually, it is
assumed that $X_i$ is a subset of $R^n$.
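The equivalence of the different ways of writing the budget relation can be checked mechanically. The sketch below uses made-up integer prices and quantities for the two-commodity case (A as time, B as a purchased good); the function names and numbers are hypothetical and serve only to show that the three forms accept and reject the same plans, and that the value of the net trade vector z = x - x̄ is nonpositive exactly when the plan is affordable.

```python
def budget_forms(p_a, p_b, endow_a, sold_a, bought_b):
    """Evaluate the three equivalent ways of writing the budget relation."""
    kept_a = endow_a - sold_a                                 # amount of A consumed
    form1 = p_a * kept_a + p_b * bought_b <= p_a * endow_a    # sell everything, buy back
    form2 = p_b * bought_b <= p_a * sold_a                    # purchases financed by sales
    form3 = p_a * (-sold_a) + p_b * bought_b <= 0             # net trades, zero budget
    return form1, form2, form3


def net_trade_value(p, x, endowment):
    """p . z with z = x - endowment; negative entries of z are supplies."""
    return sum(pi * (xi - ei) for pi, xi, ei in zip(p, x, endowment))


# Hypothetical numbers: 24 hours of time, 9 sold as labor, wage 2, price of B equal to 3.
affordable = budget_forms(2, 3, 24, 9, 5)
unaffordable = budget_forms(2, 3, 24, 9, 7)
value = net_trade_value([2, 3], [15, 5], [24, 0])
```

With these numbers the first plan satisfies all three forms and the second violates all three, as the equivalence requires.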
Further complications arise when we consider the producers. First, pro-
ducers may hold certain resources. The question is: Who claims the income from
these resources? There is one simple answer (not the only answer). We may assume
that all the resources are initially held by the consumers (and none by producers)
and some of them are sold to the producers (for example, labor service). Thus
consumers get the income. Second, producers may get positive (negative or zero)
profit. Who has the claim to the profit? This can be solved simply by assuming
that all the firms are owned by the consumers, and that ownership is represented
by stocks issued by the producers. Clearly some consumers may never own stocks
and hence receive no income from the profits that producers make. Let $y_j \in R^n$
be the production point (input-output combination) chosen by the $j$th producer
when price vector $p$ prevails. The negative elements of $y_j$ denote inputs and the
positive elements of $y_j$ denote outputs. His profit is represented by $p \cdot y_j$.⁶ Let
$\theta_{ji}$ be the fraction of the stock of the $j$th producer that the $i$th consumer owns.
Thus
$\sum_{i=1}^{m} \theta_{ji} = 1$ for all $j$ and $\theta_{ji} \ge 0$ for all $j$ and $i$
Then Mr. i gets the dividend from the $j$th producer in the amount of $\theta_{ji}(p \cdot y_j)$.
Letting $\bar{x}_i$ be the total amount of resources initially made available to Mr. i,
his total wealth (or income), prior to any consumption, can now be represented by
$p \cdot \bar{x}_i + \sum_{j=1}^{k} \theta_{ji}(p \cdot y_j)$
(he owns the stock issued by producer $j = 1, \ldots, k$).⁷ Note that this formulation
does not preclude the possibility that the same individual is a consumer and a
producer at the same time. In this case he, as a consumer, owns 100 percent of the
stock of himself as a producer.
Needless to say, the set of all possible input-output combinations for the
$j$th producer is the production set of $j$ (which we denote by $Y_j$). Clearly $y_j \in Y_j$.
In this chapter we assume that $Y_j$ is a subset of $R^n$. If there are no external
economies and diseconomies, the aggregate production set of the economy,
denoted by $Y$, can be defined by
$Y \equiv \sum_{j=1}^{k} Y_j$
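The wealth formula can be traced through a small numerical economy. The sketch below assumes two commodities, two consumers, and two producers with made-up prices, endowments, production points, and ownership shares; all names and numbers are hypothetical. It computes each consumer's wealth as the value of his endowment plus his share of profits, and checks that total wealth equals the value of all endowments plus total profit when the shares of each producer sum to one.

```python
def dot(u, v):
    """Inner product of two commodity vectors."""
    return sum(a * b for a, b in zip(u, v))


def wealth(p, endowment, shares, productions):
    """Value of the endowment plus the consumer's share of each producer's profit."""
    return dot(p, endowment) + sum(t * dot(p, y) for t, y in zip(shares, productions))


p = [1.0, 2.0]                          # hypothetical price vector
productions = [[-4.0, 3.0],             # producer 1: uses 4 of good 1, makes 3 of good 2
               [-2.0, 2.0]]             # producer 2: uses 2 of good 1, makes 2 of good 2
endowments = [[6.0, 0.0], [0.0, 1.0]]   # initial resources of consumers 1 and 2
theta = [[0.5, 0.5],                    # theta[j][i]: share of producer j owned by consumer i
         [1.0, 0.0]]

wealths = [wealth(p, endowments[i], [theta[j][i] for j in range(2)], productions)
           for i in range(2)]
total_profit = sum(dot(p, y) for y in productions)
total_endowment_value = sum(dot(p, e) for e in endowments)
```

Consumer 2 owns no stock in producer 2 and so receives nothing from its profit, which matches the remark that some consumers may receive no income from particular producers.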
FOOTNOTES
1. It is possible to suppose that the total demand for commodities does not exceed the
total supply of commodities. Note that this convention, allowing excess supply of
commodities, presupposes the free disposability of commodities.
2. Since this assumption does not involve any ethical connotation, the word "individual-
istic" seems to be better than "selfish."
3. The theory of the core will be explained later (in the appendix to Section C of this
chapter). For the theory of externalities, see also T. Negishi, General Equilibrium
Theory and International Trade, Amsterdam, North-Holland, 1972, esp. chapter 4.
4. Note that if we date commodities, then prices are also "dated" in the sense that
interest rates between various dates are incorporated into the model.
5. However, even in a stationary state, it is difficult for the consumer to know the
time of his death with perfect certainty.
6. The definition of "profits" depends upon what is included in the list of commodities.
For example, if "entrepreneurial skills" are not included in the list, the returns to
them constitute a part of the profits. On the other hand, if such items are included in
the list of commodities, the production set may become a convex cone, so that the
maximum profit becomes zero.
7. Such a convention is seen, for example, in Debreu [1]. One possible difficulty
here is that the explanation of the distribution of the $\theta_{ji}$'s is not clear. This is
especially true if entrepreneurial skills are not included in the list of commodities.
8. The basic methods involved in the studies of competitive equilibrium (existence,
welfare, stability, and comparative statics) are also useful and important when
we study even much simpler models in other branches of economics. Still another
important reason for the study of competitive markets is its welfare significance. See
our discussions in Section C and its appendix in this chapter and Section G of
Chapter 3.
REFERENCES
Section B
CONSUMPTION SET
AND PREFERENCE ORDERING
a. CONSUMPTION SET
The basic concept in the theory of consumer's choice is the "consumption
set," which is the set of all possible consumption bundles for a particular con-
sumer. This concept is clearly analogous to the concept of the production set,
which was discussed in Chapter 0, Section C. The consumption set, which we
will denote by $X$, is traditionally taken to be the entire nonnegative orthant of $R^n$.
We should realize that this convention implicitly or explicitly contains the follow-
ing assumptions:
(A-1) The set X is a convex set.
(A-2) An individual can consume any amount of goods (however large it may be).
(A-3) An individual can survive as long as he has a positive quantity of some com-
modity. Thus, for example, the origin is a minimum subsistence consumption.
(A-4) The set X is a subset of a finite dimensional vector space.
The third assumption implies that every individual has the same starvation
point regardless of his physiological capability if everybody in the market has the
nonnegative orthant as his consumption set.¹ The second assumption may not be
considered a strong one, for one may get satisfaction just from owning commodities.
However, we may then argue that we should distinguish the consumption
activity of actually consuming a commodity from that of simply owning it.
After all, eating cakes may provide a different kind of satisfaction to an individual
from owning cakes. The first assumption is very convenient but quite a strong one,
for it implies, among other things, the perfect divisibility of every commodity, in-
cluding commodities such as automobiles. We may note, however, that every com-
modity can be made perfectly divisible if we consider consumption per unit of time
of the commodity, since time is a continuum. For example, the consumption of an
electric bulb can be measured by the amount of time we use the bulb. If the bulb
lasts 1000 hours and if one consumed $10\pi$ hours of lighting by the bulb, we may say
that he consumed $10\pi/1000$ of the bulb (which is an irrational number). In this
context, we may even question the need to assume that X is a subset of a linear
space. For a linear space, by definition, must allow multiplication by any scalar.
It may be worthwhile to investigate the extent of the theory in which the con-
sumption set is not embedded in either the linear space structure or the topological
structure. By assuming that $X$ is in $R^n$, these structures may unnecessarily creep
into the theory. Note also that even if we assume that every commodity is perfectly
divisible, X may cease to be a subset of a finite dimensional vector space. This is
true, for example, if we date each commodity by continuum time. In this case, a
consumption vector is a function of time, $x(t)$, so that $x(t) \in X$ (which typically
presupposes $X$ to be a subset of an infinite dimensional linear space).² In Figure 2.1
we illustrate the consumption set in $R^2$ where one of the two commodities is
indivisible. The consumption set is the set of points on the horizontal lines.
[Figure 2.1: consumption set in $R^2$ with one indivisible commodity; axes $x_1$ and $x_2$]
If, in addition, $x\,R\,y$ and $y\,R\,x$ imply "$x = y$," then the relation is called a partial
ordering or simply an ordering. Furthermore, if in a quasi-ordering $R$ on $X$, we
necessarily have either $x\,R\,y$ or $y\,R\,x$ for arbitrary elements $x$, $y$ ($x \ne y$) of $X$, we call
$R$ a total quasi-ordering or a complete quasi-ordering. Similarly, we can define total
ordering.
Definition: Let $X$ be the consumption set of a given individual and let $\hat{x} \in X$. Then
(i) The set $\{x: x \in X, x \succsim \hat{x}\}$ is called the no-worse-than-$\hat{x}$ set or the upper contour
set of $\hat{x}$.
(ii) The set $\{x: x \in X, \hat{x} \succsim x\}$ is called the not-better-than-$\hat{x}$ set or the lower
contour set of $\hat{x}$.
Total ordering
c. UTILITY FUNCTION
Let X be the consumption set of a particular individual (say, Mr. A), and
let us suppose that Mr. A's satisfaction from consumption can be expressed by an
index which is a real number. This is called his utility index. The utility index is a
function from X into R where R is the set of real numbers. This function, denoted
by u(x), is called the utility function (of Mr. A). Since the set of real numbers has
the natural ordering $\ge$ or $>$, this amounts to assuming that Mr. A's preference is
representable by the natural order of real numbers.' In other words,
then
(i) $x \succ y$ if and only if $\phi[u(x)] > \phi[u(y)]$.
(ii) $x \succsim y$ if and only if $\phi[u(x)] \ge \phi[u(y)]$.
(iii) $x \sim y$ if and only if $\phi[u(x)] = \phi[u(y)]$.
Classical consumer's theory assumes that the consumption set is the non-
negative orthant of R" and a utility (index) function u(x) is defined on this. In the
diagrammatical analysis, which is so common in the traditional analysis, the con-
cept of an "indifference curve" is used. The indifference curve is the locus of x =
$(x_1, x_2)$ on $R^2$ such that $u(x)$ = constant. Traditionally, indifference curves are
drawn convex to the origin, indicating a diminishing marginal rate of substitution
between the two commodities.' It is generally supposed that u(x) increases as each
element of x increases. However, this does not have to be true. It may so happen
that a consumer is satiated with some commodity-say, i0-at an amount-say,
x o-so that u(x) < u(x*) where x, = x* for all i # i0 and for all x;o with xio > x*io .
Note that it is possible to have u(x) < u(x*), that is, the utility may actually de-
crease beyond a certain level of consumption (for example, the amount of light
in a room). Moreover, we may also question whether u (x, y) = constant can define
a unique curve, say, x = v(y). It is possible that we can have "thick" indifference
loci. In Figure 2.3 we illustrate such indifference "curves." On the left, the
indifference curves take the customary shape except that some of them contain
"thick" portions. On the right, we illustrate indifference curves for which u(x)
is not monotone increasing with respect to a coordinate-wise increase in x.
[Figure 2.3: two panels with axes $x_1$ and $x_2$; the right panel marks B, the "bliss" point.]
The question will arise as to whether we can represent the preference ordering
$\succsim$ (say, for Mr. A) by a utility function (that is, by real numbers).
This question was first solved by Debreu [3], [4] and later generalized by
Rader [9]. To state Debreu's theorem, we need the following two concepts.
REMARK:''
1. Condition (i) implies that all the upper contour sets of X must be convex.
2. If the preference ordering is continuous, then (ii) implies (i).
3. If the preference ordering is continuous, then (iii) implies (ii).
FOOTNOTES
1.As long as we consider a single consumer, the assumption that the origin denotes
a minimum (subsistence) level of consumption is not as strong as it appears. If
$\{x \in R^n : x \geq \bar{x} \geq 0\}$ is his consumption set, then by moving the origin properly,
we can obtain the origin as a minimum level of subsistence. His consumption set
may be denoted by $\{y \in R^n : y \geq 0\}$ where $y = x - \bar{x}$. However, if we consider more
than one consumer, then it involves the assumption that the minimum subsistence
levels of consumption are identical for all consumers.
2. In the analysis of consumer's behavior, it is well known that the different charac-
teristics of the commodities play an important role. For example, a consumer may
consider a blue Valiant to be a different commodity from a red Valiant even if all other
specifications are the same. Kelvin Lancaster, therefore, has recently proposed a
"new approach to consumer theory" by emphasizing the "different characteristics"
aspect of the commodities. See his "New Approach to Consumer Theory," Journal
of Political Economy, LXXIV, April 1966. However, he apparently assumes that these
characteristics are measurable quantities; see, for example, his phrase "the amount
of the ith characteristic" (p. 135). When the consumption set is considered to be a
subset of $R^n$, we have to be careful that anything measured on each coordinate is
representable by real numbers. The characteristics cannot be represented by real
numbers regardless of whether such a representation is ordinal or cardinal. A
statement such as "the quantities of the characteristics are directly proportional to
the quantities of the goods" is thus meaningless.
3. That $\succsim$ is total means that the consumer can give his preference ordering $\succsim$ for
any two elements of his consumption set. Its plausibility is sometimes questioned,
for some of the decisions in the consumption set might involve highly hypothetical
situations which our consumer never faces in real life. In such cases, he cannot
make any decisions. It is known that many theorems can be proved without this
axiom. A still more questionable axiom involved in regarding the preference order-
ing as a total quasi-ordering may be the transitivity axiom. For example, we can
show that the relation $\succsim$ on $X = \{x \in R : x \geq 0\}$ defined by $x' \succsim x$ if and only if
$x' \geq x - 1$ is intransitive. The existence of thick regions of indifference would often
cause intransitivity of ® . The intransitivity can be quite normal. See K.O. May,
"Transitivity, Utility, and Aggregation in Preference Patterns," Econometrica,
22, 1954, for example. In a remarkable paper [12], Sonnenschein showed that the
transitivity axiom can be dispensed with in proving many important results of the
theory of competitive equilibrium.
4. However, the relation "is indifferent to" can be intransitive. In that case, it cannot
be an equivalence relation.
5. Given the consumption set X of a particular individual, we do not have to define
$\succ$, $\succsim$, $\sim$, $\prec$, $\precsim$ all independently. We may first define $\succsim$ as a total
quasi-ordering on X; then $\sim$ is defined by $x \succsim y$ and $y \succsim x$; $\succ$ is defined by
$x \succsim y$ but not $y \succsim x$; $x \prec y$ is defined by $y \succ x$; and $x \precsim y$ is defined by $y \succsim x$.
6. The preference ordering represented by real numbers is obviously transitive, for
the natural order of the real numbers is transitive.
7. Note that this assumes that the preference ordering of the consumer is individualistic.
8. Note that if the utility function is replaceable by its monotone increasing trans-
formation, then the classical concept of "marginal utility" becomes rather meaning-
less, although the concept of marginal rate of substitution can still be meaningful.
Consider a differentiable utility function u(x) and a differentiable monotone
transformation $\Phi$ (where $\Phi' > 0$). Then clearly we have
$(\partial u/\partial x_i)/(\partial u/\partial x_j) = (\partial \Phi[u]/\partial x_i)/(\partial \Phi[u]/\partial x_j)$.
9. It can be shown that we can restate this (in an equivalent form) as follows: Let
$\{x^q\}$ and $\{\hat{x}^q\}$ be two sequences in X such that $x^q \to x$ and $\hat{x}^q \to \hat{x}$. Suppose
$x^q \succsim \hat{x}^q$ for all q; then $x \succsim \hat{x}$.
10. Rader [9] relaxed the transitivity assumption involved in Debreu's theorem.
11. For the proof, see Debreu [4] , pp. 72-73.
12. As remarked before, this requires, among other things, that all commodities are
perfectly divisible.
13. The following quotation from J. S. Chipman might be of some use in understanding
the significance of the convexity of preference.
Two pillars form the foundations of economic activity. One is the law of con-
vexity of preferences, which states that people desire to consume a variety-
or average-of products rather than limit their consumption to any one
commodity alone....
See his "The Nature and Meaning of Equilibrium in Economic Theory," in Func-
tionalism in the Social Sciences, Philadelphia, Pa., American Academy of Political
and Social Science, February 1965, p. 35.
14. For the proofs, see Debreu [4], pp. 60-61.
REFERENCES
1. Birkhoff, G., Lattice Theory, rev. ed., Providence, R. I., American Mathematical
Society, 1961.
2. Chipman, J. S., "Foundations of Utility," Econometrica, 28, April 1960.
3. Debreu, G., "Representation of Preference Ordering by a Numerical Function," in
Decision Processes, ed. by Thrall, Coombs, and Davis, New York, Wiley, 1954, pp.
159-165.
4. Debreu, G., Theory of Value, New York, Wiley, 1959.
5. Georgescu-Roegen, N., "Choice, Expectations, and Measurability," Quarterly Journal
of Economics, 68, November 1954.
6. Hicks, J. R., Value and Capital, 2nd ed., Oxford, Clarendon Press, 1946.
7. Koopmans, T. C., Three Essays on the State of Economic Science, New York,
McGraw-Hill, 1957.
8. Kuratowski, K., Introduction to Set Theory and Topology, Oxford, Pergamon Press,
1961 (tr. from Polish original).
9. Rader, T., "Existence of a Utility Function to Represent Preferences," Review of
Economic Studies, 30, October 1963.
10. Richter, M. K., "Revealed Preference Theory," Econometrica, 34, July 1966.
11. Sonnenschein, H. F., "The Relationship between Transitive Preference and the
Structure of the Choice Space," Econometrica, 33, July 1965.
12. Sonnenschein, H. F., "Demand Theory without Transitive Preferences, with Applications to the
Section C
THE TWO CLASSICAL PROPOSITIONS OF WELFARE ECONOMICS
"worse off" refer to the welfare of each individual consumer with respect to his
preference ordering.
Now the natural question becomes: What is the relationship between "competitive
equilibrium" and "Pareto optimum"? In particular, we may be interested
in asking whether every competitive equilibrium realizes a Pareto optimum and
whether a Pareto optimal state can be achieved and supported by a competitive
equilibrium. These are the two main questions in classical welfare economics. If
each question can be answered in the affirmative, then we want to know the pre-
cise conditions which support each conclusion. This is the task of this section.
Before we start our analysis, we may note that the above questions are not
really new in economics. A principal theme of Adam Smith was that "free
competition" realizes a "social optimum." Obviously, Smith did not have precise
concepts of "free competition" and "social optimum."
There have been many attempts in the history of economics to formalize the
above theme. The Ricardian theory of comparative advantage is probably the first
such attempt to be successful in connection with productive efficiency. Wicksell
[23] gave a formulation of how perfect competition maximizes production, which
corresponds to the results from activity analysis mentioned above. The concept of
the Pareto optimum is due to Pareto but was apparently introduced at the insistence
of his friends, Pantaleoni and Barone. Pareto perceived the Pareto optimum significance
of a competitive equilibrium (for this point, see Samuelson [22], pp. 212-214).
Apparently it is Barone [3] who first stated exactly and proved that a competitive
equilibrium, under quite general conditions, realizes a Pareto optimum.'
A somewhat converse proposition, that is, that a Pareto optimum state is
supported by a competitive equilibrium, also came from Pareto and from Barone
[3], Lange [14], Lerner [16], and others. Combined with the previous proposition,
these two propositions constitute the so-called "fundamental theorems of welfare
economics."
The studies of these propositions in the 1930s and 1940s by Lerner [15],
[16], Lange [14], Hicks [8], Samuelson [22], and others are characterized by
their recognition of the relationship between the marginal equivalences and a
competitive equilibrium.
The first rigorous formulation and proof of these propositions using a
modern set-theoretic approach was carried out by Arrow [1] and Debreu [4] and
has been further generalized by Debreu [5], Moore [17], and so forth. The revolutionary
character of this development is analogous to the advance from the traditional
production function approach in production theory to the activity analysis
approach (see Chapter 0, Section C). Our discussion in this section is based on the
modern version. The author has greatly benefited from excellent expositions by
Koopmans [12], and Koopmans and Bausch [13].
Before we turn to this modern approach, we may illustrate the problem by a
simple diagram. Figure 2.7 illustrates the choice of a competitive consumer, whose
consumption set is the nonnegative orthant of $R^2$. If he is faced with a price vector
indicated by the line H (in which the price of each commodity is positive), and if he
has chosen the point $\hat{x}$, then, assuming that he is a "rational" consumer, we can
immediately conclude that he must be maximizing his satisfaction over the points
such as x, in the nonnegative orthant, which are on or below the line H (the shaded
region).' No point in the shaded region costs more than $\hat{x}$ under this price line H.
Now consider an economy which consists of two consumers but no pro-
ducers. The consumers exchange commodities with each other. Assuming that
there are only two commodities in the economy and assuming also that each consumer's
consumption set is the nonnegative orthant of $R^2$, the well-known Edgeworth-Bowley
box diagram can be drawn. Any allocation of commodities between
the two consumers is possible as long as the point which represents such an allocation
stays inside or on the boundary of the box ("feasibility condition").5 In Figure
2.8, point R, for example, is not a Pareto optimal point, because it is possible to
improve one person's welfare without decreasing the other person's welfare simply
by moving within the "lens" formed by the indifference curves of the two individuals
which pass through R. However, any point on the curve PQ is a Pareto
optimal point. Clearly, point C is a competitive equilibrium if the price indicated
by H prevails, for each person maximizes his satisfaction in the sense described
above. Note that at point C the price line is tangent to each person's indifference
curve. In fact, this tangency condition is sufficient to guarantee that each person
maximizes his satisfaction in the sense described above.
As it can easily be seen from Figure 2.8 and as is well known, any Pareto
optimal point in the ordinary box diagram is a point at which the indifference
curves of the two individuals are tangent. (The collection of such points is called
the contract curve.) As was seen above, any competitive equilibrium point must be
on a line which is tangent to the indifference curve of the individual and it must be
at the point of tangency.6 Hence it follows immediately that any competitive
equilibrium point realizes a Pareto optimum.
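The tangency argument can be made concrete in a small numerical sketch. Everything specific in it is an assumption introduced for illustration and is not taken from the text: both consumers are given Cobb-Douglas utilities $u = x_1^{s} x_2^{1-s}$, the endowments are arbitrary, and commodity 2 is the numeraire. The sketch computes the market-clearing relative price, the resulting demands, and each consumer's marginal rate of substitution, which should coincide with the price ratio and hence with each other.

```python
import math

def equilibrium_price(a, b, w_a, w_b):
    """Price of commodity 1 (commodity 2 is the numeraire) that clears both
    markets when the two consumers have Cobb-Douglas shares a and b."""
    return (a * w_a[1] + b * w_b[1]) / ((1 - a) * w_a[0] + (1 - b) * w_b[0])

def demand(share, p1, endowment):
    """Cobb-Douglas demand of a price taker whose income is the value of
    his endowment at prices (p1, 1)."""
    income = p1 * endowment[0] + endowment[1]
    return (share * income / p1, (1 - share) * income)

def mrs(share, x):
    """Marginal rate of substitution (du/dx1)/(du/dx2) at the bundle x."""
    return (share / (1 - share)) * x[1] / x[0]

# Hypothetical preference parameters and initial holdings (point R in the box).
a, b = 0.5, 0.25
w_a, w_b = (3.0, 1.0), (1.0, 3.0)

p1 = equilibrium_price(a, b, w_a, w_b)
x_a = demand(a, p1, w_a)
x_b = demand(b, p1, w_b)

print("price ratio:", p1)
print("allocation:", x_a, x_b)
print("MRS of A and B:", mrs(a, x_a), mrs(b, x_b))
```

Since both marginal rates of substitution equal the common price ratio, the two indifference curves are tangent to each other at the equilibrium allocation, which therefore lies on the contract curve of this hypothetical economy.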
Note also that at any Pareto optimal point (that is, any point on the contract
curve), it is possible to draw a line which is tangent to an indifference curve for
each consumer. At point C, H is such a line, and at point P, H' is such a line. Representing
the price vector by the slope of such a line, we can at once conclude that
every Pareto optimal point can be achieved and supported by competitive pricing.7
Note that the above statement does not say that any Pareto optimal point can
be achieved by competitive pricing after starting from any arbitrary initial point.
For example, if point R represents the initial resource point, point C can be
achieved by pure exchange, with each individual acting as a competitive consumer
-that is, a price taker-under a price line H. But point P cannot be achieved
directly from R. It requires some reshuffling of goods so that point R is translated to
a point such as R' on the H' line.
The above "proofs" of the two classical propositions of welfare economics
look very simple, but they rely on many implicit assumptions. In fact, this is a prime
example of traditional economic theory, whose reasoning is so crucially dependent
on the diagram. We may ask the following questions, for example.
(i) How crucial is the assumption that the consumption set is the entire non-
negative orthant?
(ii) Is it necessary to assume the convexity of each consumer's consumption set;
and how essential is the divisibility assumption of each commodity?
(iii) Does the consumption set of every consumer have to include the same list
of n commodities?
(iv) Does an individual's indifference curve have to be (strictly) convex to the
origin?
(v) Does it have to be smooth? [Or is it necessary that we can define a unique
tangent line (such as H) for the individual's indifference curve?]
(vi) Do we have to assume the continuity of the individual's utility function (if
its differentiability can be dispensed with)?
(vii) What is the role of consumer's satiation in the above analysis?
(viii) What happens if the price line which should support a Pareto optimum
coincides with one of the edges of the box? Can we still have a competitive
equilibrium?
(ix) What happens if we introduce production? What assumptions are necessary
when production is introduced into the model?
(x) Will the conclusions be altered if there are more than two consumers and
two commodities and if there is an arbitrary number of producers?
(xi) What is the minimum possible set of assumptions which will guarantee all
the conclusions of the classical propositions?
Although we may get considerable insight into the above questions from Figure
2.8,8 nothing precise and definite can be said on the basis of it alone. Here we may
quote the following remark from Koopmans ([ 12], p. 174):
Nothing in the process of reading a diagram forces the full statement of as-
sumptions and the stepwise advance, through successive implications to con-
clusions that are characteristic of logical reasoning. Assumptions may be
concealed in the manner in which the curves are "usually" drawn and con-
clusions may be accepted unconditionally although they actually depend on
such unstated assumptions.
We now turn to the modern formulation and the proof of the above classical pro-
positions of welfare economics. The author hopes that the reader will fully appreci-
ate the above remark by Koopmans in the process of reading the following
exposition of the modern approach to welfare economics. In the following, we use
the minimal possible assumptions which are known at present.9 Any relaxation of
assumptions will be interesting and important. Some important counterexamples
will be offered when some of the assumptions are violated. Diagrams will be useful
to show such counterexamples.
Let $x_i$ be an n-vector of consumption by consumer i (i = 1, 2, ..., m), and let
$y_j$ be an n-vector of production by producer j (j = 1, 2, ..., k). The negative elements
of $y_j$ denote inputs and the positive elements of $y_j$ denote outputs. Let
$x \equiv \sum_{i=1}^{m} x_i$ and $y \equiv \sum_{j=1}^{k} y_j$. Denote by $X_i$ the consumption set of i and by $Y_j$
the production set of j. We assume that both $X_i$ and $Y_j$ are subsets of $R^n$. We
denote by X the aggregate consumption set and by Y the aggregate production set.10
We assume that the preference ordering $\succsim_i$ is defined for each consumption set
$X_i$.11 Given price p, the profit of producer j can be written as $p \cdot y_j$. There is an
initial bundle of commodities available in the economy. We denote it by $\bar{x}$. This
bundle of commodities can be held by consumers so that if we denote the initial
resource held by the ith consumer by $\bar{x}_i$, then $\sum_{i=1}^{m} \bar{x}_i = \bar{x}$.
We now define (in the usual manner) feasibility, Pareto optimum, and com-
petitive equilibrium.
The second relation in (iii') states that if there is an excess supply of some
commodity, its price must be zero. As we will see shortly, the case p = 0
will be precluded, under the assumption that $\hat{x}_i$ is a "local nonsatiation
chosen point."
"not satiated" with respect to this divisible commodity at point $x_i$ (that is,
there exists a point such as $x'_i \succ_i x_i$ for each $\epsilon$). Figure 2.9 illustrates the case
in which one commodity is perfectly divisible and the other commodity is
indivisible.17 The consumption set here is assumed to be the collection of
the horizontal lines in the nonnegative orthant.
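REMARK: Suppose $\hat{x}_i$ is a chosen point under p so that $\hat{x}_i \succsim_i x_i$ for all
$x_i \in X_i$ with $p \cdot x_i \leq p \cdot \hat{x}_i$. Then p cannot be a zero vector if $\hat{x}_i$ is a local
nonsatiation point. (For if p = 0, then $p \cdot x_i \leq p \cdot \hat{x}_i$ holds for any $x_i \in X_i$,
so that $\hat{x}_i \succsim_i x_i$ for every $x_i \in X_i$, which is impossible at a local nonsatiation point.)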
Lemma: Let $\hat{x}_i$ be a locally nonsatiating chosen point for the ith consumer when price
p prevails. Then under (A-1),
(i) $x_i \succ_i \hat{x}_i$ implies $p \cdot x_i > p \cdot \hat{x}_i$.
(ii) $x_i \succsim_i \hat{x}_i$ implies $p \cdot x_i \geq p \cdot \hat{x}_i$.
PROOF:
(i) Suppose not; that is, $p \cdot x_i \leq p \cdot \hat{x}_i$. Since $\hat{x}_i$ is a chosen point, $\hat{x}_i \succsim_i x_i$,
which is a contradiction.
(ii) Let $x_i \succsim_i \hat{x}_i$ and suppose that $p \cdot x_i < p \cdot \hat{x}_i$. Since $\hat{x}_i$ is not a point of
local satiation, neither is $x_i$ [by (A-1)]. Hence for all $\epsilon$, $0 < \epsilon < \delta$, there
exists $x'_i \in X_i$ with $x'_i \in B_\epsilon(x_i)$ such that $x'_i \succ_i x_i$, which in turn implies
$x'_i \succ_i \hat{x}_i$ by the transitivity of the preference ordering.15 We may choose
$x'_i$ close enough to $x_i$ so that $p \cdot x'_i < p \cdot \hat{x}_i$ (which is possible because the
value function $p \cdot x_i$ is continuous). This contradicts the assumption that
$\hat{x}_i$ is a chosen point under p. (Q.E.D.)
REMARK: The proof of statement (ii) of the above lemma can be illustrated
by Figure 2.10.
REMARK: The reader should realize that the choice of $x'_i$ close enough to $x_i$
so that $p \cdot x'_i < p \cdot \hat{x}_i$ needs the assumption that there exists at least one
commodity which is divisible.
Theorem 2.C.1: Let $[p, \{\hat{x}_i\}, \{\hat{y}_j\}]$ be a competitive equilibrium such that $\hat{x}_i$ is
a local nonsatiation point for all i = 1, 2, ..., m. Suppose assumption (A-1) holds
for all i. Then $[\{\hat{x}_i\}, \{\hat{y}_j\}]$ is a Pareto optimum.
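PROOF: Suppose $[\{\hat{x}_i\}, \{\hat{y}_j\}]$ is not a Pareto optimum. Then there exist
$[\{x'_i\}, \{y'_j\}]$ such that $x'_i \in X_i$, i = 1, 2, ..., m, $y'_j \in Y_j$, j = 1, 2, ..., k,
$x' = y' + \bar{x}$, and $x'_i \succsim_i \hat{x}_i$ for all i with $x'_i \succ_i \hat{x}_i$ for at least one i. By the
above lemma, $p \cdot x'_i \geq p \cdot \hat{x}_i$ for all i with strict inequality for at least one i. Hence
$\sum_{i=1}^{m} p \cdot x'_i > \sum_{i=1}^{m} p \cdot \hat{x}_i$, or $p \cdot x' > p \cdot \hat{x}$
But condition (iii) of C.E. requires $p \cdot \hat{x} = p \cdot \hat{y} + p \cdot \bar{x}$. Hence we have
$p \cdot x' > p \cdot \hat{y} + p \cdot \bar{x}$, or, since $x' = y' + \bar{x}$, $p \cdot y' > p \cdot \hat{y}$. This contradicts the
profit maximization of the producers in the competitive equilibrium. (Q.E.D.)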
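Definition: Let
$C_i(\hat{x}_i) \equiv \{x_i : x_i \in X_i,\ x_i \succsim_i \hat{x}_i\}$
$\hat{C}_i(\hat{x}_i) \equiv \{x_i : x_i \in X_i,\ x_i \succ_i \hat{x}_i\}$
The set $C_i(\hat{x}_i)$ is the no-worse-than-$\hat{x}_i$ set for i, and $\hat{C}_i(\hat{x}_i)$ is the preferred-to-$\hat{x}_i$
set for i.
REMARK: The convexity of the preference ordering (for i) implies the
convexity of $C_i(\hat{x}_i)$ and $\hat{C}_i(\hat{x}_i)$ for all $\hat{x}_i \in X_i$.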
We now introduce the following assumptions:
(A-2) The preference ordering $\succsim_i$ is convex for each i = 1, 2, ..., m.
(A-3) The set Y is convex.
(A-4) (cheaper-point)19 Given a point $\hat{x}_i$ and a prevailing price vector p, there exists
$x'_i \in X_i$ such that $p \cdot x'_i < p \cdot \hat{x}_i$.
(A-5) (continuity of $\succsim_i$) For each i = 1, 2, ..., m, the set $\{x_i : x_i \in X_i,\ x_i \precsim_i x'_i\}$
is closed for all $x'_i \in X_i$ (that is, if $\{x_i^q\}$ is a sequence in $X_i$ such that $x_i^q \precsim_i x'_i$ and
$x_i^q \to x_i^0$, then we have $x_i^0 \precsim_i x'_i$).
REMARK: Assumption (A-3) does not require that the production set for
each producer ($Y_j$) be convex. Assumption (A-4) is also called the minimum
wealth assumption.
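Theorem 2.C.2: Suppose that $[\{\hat{x}_i\}, \{\hat{y}_j\}]$ is a Pareto optimum such that at least
one consumer is not satiated. Then under assumptions (A-2) and (A-3), there exists a
(price) vector $p \neq 0$ such that
(i) $p \cdot \hat{x}_i \leq p \cdot x_i$ for all $x_i \in X_i$ with $x_i \succsim_i \hat{x}_i$, i = 1, 2, ..., m.
(ii) $p \cdot \hat{y}_j \geq p \cdot y_j$ for all $y_j \in Y_j$, j = 1, 2, ..., k.
(iii) $\hat{x} = \hat{y} + \bar{x}$.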
REMARK: This theorem does not require (A-4) and (A-5) but does not
quite say that to every Pareto optimum we can adjoin a price vector such that
it is supported by C.E. Condition (i) states that each consumer minimizes his
expenditure over his no-worse-than-$\hat{x}_i$ set, but it does not necessarily imply
the maximization of satisfaction over the budget set. To prove the latter,
we use (A-4) and (A-5). We first prove the above theorem.
PROOF: Without loss of generality, we can suppose that the first consumer
is nonsatiated (at $\hat{x}_1$). Let $C(\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_m) \equiv \hat{C}_1(\hat{x}_1) + \sum_{i=2}^{m} C_i(\hat{x}_i)$. For
notational simplicity, we abbreviate $C(\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_m)$ by C. By (A-2), C
is convex. Let $W \equiv \{w : w = y + \bar{x},\ y \in Y\}$. Since Y is convex by (A-3), W is
also convex. By the definition of P.O., $z \in C$ implies $z \notin W$. Hence C and
W are two nonempty disjoint convex sets. Hence by the Minkowski separation
theorem (Theorem 0.B.3), there exists a $p \neq 0$ and a real number $\alpha$ such
that
(a) $p \cdot w \leq \alpha$ for all $w \in W$
and
(b) $p \cdot z \geq \alpha$ for all $z \in C$
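Since $\hat{x} \in W$ (feasibility) and, by the nonsatiation of the first consumer together
with (A-2), $\hat{x}$ is a limit of points of C, the separation yields $\alpha = p \cdot \hat{x}$; that is,
(a') $p \cdot \hat{x} \geq p \cdot w$ for all $w \in W$
(b') $p \cdot z \geq p \cdot \hat{x}$ for all $z \in C$
From (a'),
$p \cdot \hat{x} \geq p \cdot (\sum_{j=1}^{k} y_j + \bar{x})$ for all $y_j \in Y_j$, j = 1, 2, ..., k
But $\hat{x} = \hat{y} + \bar{x}$ (feasibility of P.O.). Hence
$p \cdot (\sum_{j=1}^{k} \hat{y}_j + \bar{x}) \geq p \cdot (\sum_{j=1}^{k} y_j + \bar{x})$ for all $y_j \in Y_j$, j = 1, 2, ..., k
Or
$\sum_{j=1}^{k} p \cdot \hat{y}_j \geq \sum_{j=1}^{k} p \cdot y_j$ for all $y_j \in Y_j$, j = 1, 2, ..., k
Fix $j = j_0$ and let $y_j = \hat{y}_j$ for all $j \neq j_0$. Then $p \cdot \hat{y}_{j_0} \geq p \cdot y_{j_0}$ for all $y_{j_0} \in Y_{j_0}$.
Since the choice of $j_0$ is arbitrary, this proves condition (ii) of Theorem 2.C.2.
Condition (iii) of Theorem 2.C.2 is automatically satisfied by the feasibility
condition of P.O.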
Similarly, from (b') we obtain
(c) $p \cdot x_i \geq p \cdot \hat{x}_i$ for all $x_i \in C_i(\hat{x}_i)$
and
Corollary: If in addition (A-4) holds with respect to $\hat{x}_i$ and p in the above theorem,
and if (A-5) holds, then for every Pareto optimum $[\{\hat{x}_i\}, \{\hat{y}_j\}]$, there exists $p \neq 0$
such that $[p, \{\hat{x}_i\}, \{\hat{y}_j\}]$ is a competitive equilibrium.21
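PROOF: It suffices to show that condition (i) of C.E. holds. In the above
theorem we obtained
(e) $p \cdot x_i \geq p \cdot \hat{x}_i$ for all $x_i \in C_i(\hat{x}_i)$, i = 1, 2, ..., m
Suppose that there exists an $x_i \in X_i$ such that $p \cdot x_i = p \cdot \hat{x}_i$ and $x_i \succ_i \hat{x}_i$. By (A-4),
there exists $x'_i \in X_i$ with $p \cdot x'_i < p \cdot \hat{x}_i$. Let $x_i(t) \equiv t x'_i + (1 - t) x_i$, 0 < t < 1, so
that $p \cdot x_i(t) < p \cdot \hat{x}_i$. By (A-5), $x_i(t) \succ_i \hat{x}_i$ for t small enough.22
This contradicts relation (e) above. Hence there cannot exist an $x_i \in X_i$ such
that $p \cdot x_i = p \cdot \hat{x}_i$ and $x_i \succ_i \hat{x}_i$. This means that $p \cdot x_i = p \cdot \hat{x}_i$, $x_i \in X_i$, implies
$x_i \precsim_i \hat{x}_i$. Note that relation (e) means (taking its contraposition) that
$p \cdot x_i < p \cdot \hat{x}_i$, $x_i \in X_i$, implies $x_i \prec_i \hat{x}_i$. Hence we have obtained that
$p \cdot x_i \leq p \cdot \hat{x}_i$, $x_i \in X_i$, implies $x_i \precsim_i \hat{x}_i$
In other words, $\hat{x}_i \succsim_i x_i$ for all $x_i \in X_i$ with $p \cdot x_i \leq p \cdot \hat{x}_i$. This proves
condition (i) of C.E. (Q.E.D.)23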
REMARK: Note that the competitive equilibrium in the above corollary
can be achieved by allocating from the aggregate income of the society,
$p \cdot (\hat{y} + \bar{x})$, the amount $p \cdot \hat{x}_i$, i = 1, 2, ..., m, to each consumer [note that
condition (iii) in the definition of C.E. guarantees that all the income of the
society is completely absorbed by all the consumers in the society]. In other
words, without such a reallocation of ownership, a Pareto optimum cannot,
in general, be supported by competitive pricing.
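The reallocation described in this remark can be illustrated with a short numerical sketch of a pure exchange economy. All specifics are assumptions made for illustration and are not from the text: the two consumers have Cobb-Douglas utilities with shares `a` and `b`, the endowments and the chosen Pareto optimal allocation are hypothetical, and commodity 2 is the numeraire. The sketch picks a point on the contract curve, reads off the supporting price from the common marginal rate of substitution, computes the income $p \cdot \hat{x}_i$ each consumer must receive, and checks that the implied transfers cancel and that each consumer, given that income, demands exactly his part of the allocation.

```python
import math

def contract_curve_x2(x1_a, a, b, total):
    """Amount of commodity 2 held by A on the contract curve, given A's
    amount of commodity 1 (equates the two Cobb-Douglas MRS)."""
    k_a, k_b = a / (1 - a), b / (1 - b)
    return k_b * total[1] * x1_a / (k_a * (total[0] - x1_a) + k_b * x1_a)

def cobb_douglas_demand(share, p, income):
    """Utility-maximizing bundle of a price taker with the given income."""
    return (share * income / p[0], (1 - share) * income / p[1])

def value(p, x):
    """Inner product p . x for two commodities."""
    return p[0] * x[0] + p[1] * x[1]

# Hypothetical data: preference shares, initial holdings, aggregate resources.
a, b = 0.5, 0.25
endow_a, endow_b = (3.0, 1.0), (1.0, 3.0)
total = (endow_a[0] + endow_b[0], endow_a[1] + endow_b[1])

# A Pareto optimal allocation chosen on the contract curve.
xhat_a = (2.0, contract_curve_x2(2.0, a, b, total))
xhat_b = (total[0] - xhat_a[0], total[1] - xhat_a[1])

# Supporting price: slope of the common tangent line, commodity 2 as numeraire.
p = ((a / (1 - a)) * xhat_a[1] / xhat_a[0], 1.0)

# Income each consumer needs, and the transfer relative to his endowment value.
income_a, income_b = value(p, xhat_a), value(p, xhat_b)
transfer_a = income_a - value(p, endow_a)
transfer_b = income_b - value(p, endow_b)

print("supporting price:", p)
print("transfers:", transfer_a, transfer_b)
print("demands:", cobb_douglas_demand(a, p, income_a),
      cobb_douglas_demand(b, p, income_b))
```

With these numbers one consumer must give up purchasing power and the other must receive it; without that transfer the endowment values at the supporting price would not finance the chosen optimum.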
As we remarked above, Theorem 2.C.2 does not quite establish that a Pareto
optimum can be realized through a competitive equilibrium. To establish this
(the above corollary), we needed the additional assumptions (A-4) and (A-5). An
example showing that the conclusion of the corollary does not follow when the
cheaper point assumption (A-4) is missing was first offered by Arrow [1], and we
illustrate this with the Edgeworth-Bowley box diagram shown in Figure 2.14. The
consumption set for each consumer is assumed to be the nonnegative orthant, so
that one consumer's (consumer A) consumption set is the northeast orthant from
[Figure 2.14. Arrow's Anomalous Case.]
only feasible point (set X and set W have no intersection except at point P). The line
H (that is, the one which goes through PR) is the only line that separates the sets
X and W. (Recall that in the proof of Theorem 2.C.2 and its corollary the slope of
the separating hyperplane of X and W gave the price vector which supports a com-
petitive equilibrium.) But this line contains a point (say, R) in X. In other words,
point R is a better point than point P but has the same value as point P; hence, if
the price represented by line H prevails, Robinson the consumer will certainly in-
crease his satisfaction by trading the commodity bundle represented by P for the
bundle represented by R with Robinson the producer. That is, the separation of
decision-making functions by the price H has given rise to incompatible decisions
by the two Robinsons. Hence the Pareto optimal point P cannot be supported by
competitive pricing. Note that in this example there is no point of X below the line
H. In other words, the cheaper point assumption (A-4) is again violated. (The
reader should check that all the other assumptions can be satisfied by this
example.)
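The common structure of these anomalous cases (expenditure is minimized at the given point, yet a better point of the same value exists because no cheaper point is available) can be seen in a minimal numerical sketch. The preferences and numbers below are hypothetical and are not the example drawn in the figures: a consumer with utility $u(x) = x_1 + \sqrt{x_2}$ on the nonnegative orthant holds $\hat{x} = (1, 0)$, and the price vector is taken to be p = (0, 1), since for these preferences a supporting price at $\hat{x}$ must give commodity 1 a zero price. The check is carried out on a finite grid only.

```python
import math

def u_a(x):
    """Hypothetical utility: the marginal utility of commodity 2 is
    unbounded as x2 approaches zero."""
    return x[0] + math.sqrt(x[1])

p = (0.0, 1.0)          # assumed supporting price: commodity 1 is free
xhat_a = (1.0, 0.0)     # the point to be supported

def cost(x):
    """Value p . x of the bundle x."""
    return p[0] * x[0] + p[1] * x[1]

# Finite grid standing in for the consumption set (nonnegative orthant).
grid = [(i / 10, j / 10) for i in range(31) for j in range(31)]

# Condition (i) of Theorem 2.C.2: expenditure is minimized over the
# no-worse-than set.
upper = [x for x in grid if u_a(x) >= u_a(xhat_a)]
expenditure_minimized = all(cost(x) >= cost(xhat_a) for x in upper)

# Cheaper-point assumption (A-4): is there a bundle of strictly lower value?
has_cheaper_point = any(cost(x) < cost(xhat_a) for x in grid)

# Bundles that are affordable at the value of xhat_a and strictly preferred.
better_affordable = [x for x in grid
                     if cost(x) <= cost(xhat_a) and u_a(x) > u_a(xhat_a)]

print(expenditure_minimized, has_cheaper_point, len(better_affordable) > 0)
```

On this grid the point minimizes expenditure over its no-worse-than set, no cheaper point exists, and strictly preferred bundles of the same value are available, so the point is not a satisfaction maximum under p, which is the failure described in the text.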
If there is a point in X below the separating line, then a Pareto optimal point
can be supported by decentralized pricing. This is illustrated by Figure 2.16, which
is again indebted to Koopmans ([12], p. 36). Note that under the price represented
by line H, Robinson the consumer maximizes his satisfaction, Robinson the pro-
ducer maximizes profit, and the feasibility condition is satisfied. Hence the separa-
tion of decision-making functions has given rise to compatible decisions by the
two Robinsons.24
FOOTNOTES
aggregate production set means that there are no (technological) external economies
and diseconomies.
11. Note that $X_i \subset R^n$. Here it is assumed that individual i's preference ordering depends
only on his own consumption bundle and not on the consumption bundles of other
consumers (nor on the pattern of production). This assumption of the lack of "ex-
ternality" is one of the most crucial assumptions in the theorems of this section. In
the literature, this assumption is referred to as individualistic or selfish preference
ordering, as we remarked earlier.
12. Notice, however, that this does not preclude the possibility of the existence of pro-
duction processes which dispose of various types of waste.
13. Intuitively, a point is a local nonsatiation point if there are arbitrarily close points
which are preferred to it. The concept of a local nonsatiation point was first introduced
by Koopmans [12] and used again in [13]. Moore ([17], part I) reaffirmed the
importance of the concept in the literature.
14. An alternative way to state the above definition is as follows: x; E X; is called a
local nonsatiation point (for the ith consumer) if there exists xr E X; such that
x;®1x,and x;EX;where all t,0<t< 1.
15. If Q; is such that x; Q;z'; implies tzi + (1 - t)z'; ®,z;, 0 < t < 1, for z; z, (which
is true if Q; is strictly convex) with convex X;, then the transitivity assumption can
be dispensed with. To see this, suppose x; Q;z; with p x; < p x; as above and
let x; (t) = tx; + (1 - t)z;, 0 < t < 1. Then xi (t) Q; ac; , but p- x. (t) < p z;, which
contradicts (i) above.
16. It simply says that "if a competitive equilibrium exists, then it realizes a Pareto opti-
mum with (A-1)."
17. Such an example can be found in Quirk and Saposnik [20], p. 134. (Caution: Indif-
ference curves should take on values only on the lattice points.)
18. See Debreu [6], pp. 93-94.
19. A slightly stronger version of this assumption, which is also used in the literature, is
the following: For a given point $\hat{x}_i$, there exists $x'_i \in X_i$ such that $p \cdot x'_i < p \cdot \hat{x}_i$ for all
price vectors p.
20. The separating hyperplane can be written as $H = \{x : x \in X,\ p \cdot x = p \cdot \hat{x}\}$, where
$X = \sum_{i=1}^{m} X_i$.
21. Assumption (A-4) is rather awkward. It is certainly desirable to obtain the present
corollary replacing (A-4) by a more plausible assumption, that is, one that is based
directly on some properties of the preference orderings and/or the consumption and
production sets. For an investigation of such a point, see Moore [17], part I.
22. Suppose not. That is, suppose zj®,ij for any small t > 0. Let t - 0 so that z; - xi.
Then by (A-5), x; Q; xi , which contradicts x; Q; i1.
23. In establishing Theorem 2.C.2, which leads to the above corollary, we saw that the
separation theorem played a crucial role. In Chapter 1, we obtained theorems of non-
linear programming (notably that of concave programming) using the separation
theorem. We can conjecture that the above theorem and the corollary can be proved
using a theorem in concave programming. We attempt to do so in Section F of this
chapter.
24. The compatibility of decentralized decision making in terms of prices is the essence
of the concept of competitive equilibrium. The essence of Theorem 2.C.2 is that if a
separating hyperplane exists, it defines a price system that makes such a decentraliza-
tion possible.
REFERENCES
1. Arrow, K. J., "An Extension of the Basic Theorems of Classical Welfare Economics,"
Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Prob-
ability, ed. by J. Neyman, Berkeley, Calif., University of California Press, 1951.
2. Arrow, K. J., and Debreu, G., "Existence of an Equilibrium for a Competitive Econ-
omy," Econometrica, 22, July 1954.
3. Barone, E., "The Ministry of Production in the Collectivist State," in Collectivist
Economic Planning, ed. by F. A. von Hayek, London, Routledge, 1935 (Italian
original, 1908).
4. Debreu, G., "The Coefficient of Resource Utilization," Econometrica, 19, July 1951.
5. Debreu, G., "Valuation Equilibrium and Pareto Optimum," Proceedings of the National
Academy of Sciences of the U.S.A., 40, 1954.
6. Debreu, G., Theory of Value, New York, Wiley, 1959, esp. chap. 6.
7. Dorfman, R., Samuelson, P. A., and Solow, R. M., Linear Programming and Economic
Analysis, New York, McGraw-Hill, 1958.
8. Hicks, J. R., "The Foundations of Welfare Economics," Economic Journal, XLIX,
December 1939.
9. Hurwicz, L., "Optimality and Informational Efficiency in Resource Allocation Processes,"
in Mathematical Methods in the Social Sciences, 1959, ed. by K. J. Arrow,
S. Karlin, and P. Suppes, Stanford, Calif., Stanford University Press, 1960.
10. Karlin, S., Mathematical Methods and Theory in Games, Programming, and Economics,
Vol. I., 1st ed., Reading, Mass., Addison-Wesley, 1959.
11. Koopmans, T. C., "Efficient Allocation of Resources," Econometrica, 19, October
1951.
12. Koopmans, T. C., Three Essays on the State of Economic Science, New York, McGraw-Hill, 1957,
esp. secs. 1 and 2 of the first essay.
13. Koopmans, T. C., and Bausch, A., "Selected Topics Involving Mathematical Reason-
ing," SIAM Review, 1, July 1959, esp. pp. 83-95.
14. Lange, O., "Foundations of Welfare Economics," Econometrica, 10, January-October 1942.
15. Lerner, A. P., "The Concept of Monopoly and Measurement of Monopoly Power,"
Review of Economic Studies, 1, June 1934.
16. Lerner, A. P., Economics of Control, New York, Macmillan, 1944.
17. Moore, J. C., "On Pareto Optima and Competitive Equilibria (Part I: Relation-
ships Among Equilibria and Optima; Part II: The Existence of Equilibria and
Optima)," Krannert Institute Paper, nos. 268 and 269, April 1970, Purdue University.
18. Pareto, V., Manuel d'Economie Politique, 2nd ed., Paris, Giard, 1927 (1st ed., 1909),
esp. chap. VI.
19. Pigou, A. C., The Economics of Welfare, 4th ed., London, Macmillan, 1932, esp.
chaps. IX, X, XI.
20. Quirk, J., and Saposnik, R., Introduction to General Equilibrium Theory and Welfare
Economics, New York, McGraw-Hill, 1968, esp. chap. 4, sec. 5.
204 THE THEORY OF COMPETITIVE MARKETS
a. INTRODUCTION
Consider a simple two-person, two-commodity pure exchange economy,
which may be illustrated by the familiar Edgeworth-Bowley box diagram. Let
$\bar{x}_i$ and $\bar{y}_i$, respectively, be the amounts of commodities X and Y which are initially
held by consumer i, where i = 1, 2. We suppose that the two people, starting from
such an initial position, wish to improve their satisfaction by engaging in the trade
of these two commodities. The situation is illustrated in Figure 2.17.
In Figure 2.17, the indifference curves of the two people are denoted by
the usual strictly convex shapes ($\alpha_1, \alpha_2, \ldots$, and $\beta_1, \beta_2, \ldots$). The initial
endowment point is denoted by point R. The curve passing through points P, E, and
Q is the contract curve, which is the locus of points at which two individuals'
indifference curves are tangent to each other. Any point on the contract curve
is a Pareto optimum point.
If the two consumers, starting from the initial point R, trade with each
other, the result is a reallocation of the total amounts of the two commodities
between them, which may be denoted by a point in the Edgeworth-Bowley box
shown in Figure 2.17. If the competitive price mechanism¹ is introduced into
the trading, then we obtain a competitive equilibrium reallocation E, where the
H-line signifies the equilibrium price ratio.
[Figure 2.17: Edgeworth-Bowley box diagram, with origin $O_2$, the price line H, and box width $\bar{x}_1 + \bar{x}_2$.]
That the competitive equilibrium allocation is on the contract curve (and
hence is Pareto optimal) is an important welfare result of the competitive price
mechanism. But any other allocation on the contract curve is also a Pareto
optimum.
Suppose that the competitive price mechanism is dropped from the trading
scheme. Clearly the resulting allocation depends on the trading rule. However,
it is also clear that, under any trading rule, the resulting allocation of the two
people will not fall outside the lens-shaped region defined by the two indifference
curves $\alpha_1$ and $\beta_1$. For if it does, at least one person will be worse off compared
to the initial position R, and he can always refuse to trade. It is clear that the
final resulting allocation of the two people should lie on the PQ segment of the
contract curve.
Hence we may say that, given the initial endowment point R, the alloca-
tions on the PQ segment of the contract curve should occupy a more privileged
place compared to allocations outside the PQ segment. More strongly, any
point on the curve outside the PQ segment is irrelevant to our consideration
when the initial endowment point is given as R. We term the PQ segment the
core of the above economy.² Note that the competitive allocation E is on the
PQ segment.
In order to single out the importance of the competitive solution, Edgeworth
[11] in 1881 considered an expanded economy of 2n consumers, in which there
are two "types" of consumers. Every consumer of the same type has identical
tastes and identical initial endowments. In other words, the above box diagram
economy is replicated n times. Edgeworth then argued that as n tends to infinity,
the above PQ segment shrinks to one point: the competitive allocation E (or
the set of competitive equilibria if it is not unique).³ In 1963, Debreu and
Scarf [10] elegantly and rigorously proved Edgeworth's result.
The general principle given by Edgeworth was that of "recontracting."
Consider any subgroup of consumers. Suppose that it is possible for its members
to distribute their initial resources among themselves in such a way that no
member of the subgroup is made worse off, while one or more members of the
subgroup are made better off. Whenever this happens, "recontracting" takes
place without others' consent. "Final settlement"⁴ comes when a contract cannot
be amended by the recontract of any such subsets. For the two-person economy,
the above PQ segment, the "core," constitutes the set of final settlements, that is,
the set of allocations which result in no further recontracting. What Edgeworth
has shown is that, in the economy of 2n consumers of two "types," such a set
of allocations decreases as n increases and converges to the set of competitive
equilibria as $n \to \infty$.
It was Shubik [28] who related the Edgeworth notion of "final settlement"
in "recontracting" to Gillies' [12] concept of the "core" in the theory of n-person
games. This, in turn, stimulated various works on the problem including
Scarf and Debreu [10], mentioned above. The n-person game theory is concerned
with situations in which individuals ("players") with conflicting interests
compete, and hence it probes deeply into the question of the theory of competitive
equilibrium. Therefore it is quite natural that economists should attempt to
master the intricacies of game theory and the theory of the core.
As indicated above, the "core" of the economy is the set of allocations
which cannot be "blocked" by any subgroup ("coalition") of members of the
economy. Note that the concept of the core is free from prices. In other words, the
core solution provides an alternative approach to the price-guided competitive
solution, as well as offering an important characterization of competitive equilibrium
through the results of Debreu-Scarf [10], and others. Moreover, the
concept of "blocking coalition" in the theory of games and the core offers a
fresh interpretation of the concept of Pareto optimum. A Pareto optimum alloca-
tion is one that will not be blocked by the coalition involving all participants
of the economy. This then means that the core gives a stronger characterization
of competitive equilibrium than does Pareto optimum. In fact, the precision of
this characterization is quite strong, as the above Edgeworth-Debreu-Scarf result
indicates. An important merit of the core-theoretic approach here is that it
permits freedom of choice for each individual of the economy and deduces that
if the number of these individuals increases, each person might behave as if
he were a price taker. In the theory of competitive markets, on the other hand,
each individual is assumed (or destined) to be a price taker.⁵
With the increasing interest in the concept of the core, economists are now
more concerned with the theory of n-person games (for example, the publication
of a series of joint articles by Shapley and Shubik [22], [23], [24], [25], [26],
and so on, in economic journals). This is quite natural, as we have already
remarked. In the classical treatments of game theory such as the theory of von
Neumann and Morgenstern [32], it is assumed that payoffs are made in "utils,"
which are cardinal and, like money, fully transferable among the players. It is
further assumed that these utils are linear in money. Such an assumption is quite
convenient, since the classical theory of games is almost exclusively concerned
with cooperative games with side payments.⁶ However, the appropriateness of
this assumption of money-like transferable utils is naturally very questionable to
economists and others, and it has been extensively debated. This, no doubt,
prompted the development of the theory of cooperative games without side payments
(for example, Aumann and Peleg [5]). A few steps were necessary to
reach ordinality of preferences. The classical N-M theory assumed
cardinal utils which are linear in money, but Shapley and Shubik [21] pointed out
that linearity and perfect transferability are not essential to the theory. What
remains is an ordinal theory.⁷ Scarf's approach [17] with strictly ordinal utility
is attractive by the principle of Occam's razor. There seems little question that
this advance in game theory has made the theory much more attractive to
economists.
(1)  $\sum_{i=1}^{m} x_i = \sum_{i=1}^{m} \bar{x}_i$
(4)  $\sum_{i \in S} x_i' = \sum_{i \in S} \bar{x}_i$
We then say that x' is S-block superior to x, or x' dominates x by coalition S, and
denote this by $x' B_S x$, where $B_S$ is a binary relation defined on A.
REMARK: Intuitively, an allocation x is "blocked" by a coalition S if there
is another allocation which is feasible among the members of S and makes
no consumer in S worse off while making at least one consumer in S better off. The
consumers outside the coalition are "discriminated" against by the coalition
in the sense that some or all of them can be worse off in x' compared to
the initial allocation x.¹²
REMARK: Note that the coalition S may consist of all consumers in the
economy, that is, S = M. A Pareto optimal allocation is one that is not
blocked by the coalition involving all consumers.
Define the binary relation B on A by [x' B x] if and only if $x' B_S x$ for some
coalition S of M. Given a feasible allocation x in A, define set-valued functions
$B_S(x)$ and $B(x)$ by
    $B_S(x) \equiv \{x' \in A : x' B_S x\}$
    $B(x) \equiv \{x' \in A : x' B x\}$
In other words, $B_S(x)$ is the set of all feasible allocations that block x by a
particular coalition S in M, and $B(x)$ is the set of all feasible allocations that
block x by some coalition in M.
Definition: The core is the set of all feasible allocations that are not blocked by
any coalition. In other words, it is equal to
    $\{x \in A : B(x) = \emptyset\}$
REMARK: That x is Pareto optimal means that $B_M(x) = \emptyset$. Hence if x is in
the core, then x is Pareto optimal, whereas a Pareto optimal allocation need
not belong to the core.
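The definitions of blocking, Pareto optimality, and the core can be checked mechanically on a small example. The following Python sketch is only an illustration under stated assumptions: the function name `blocks`, the two-consumer economy, and the utility u(x, y) = xy are hypothetical choices and do not come from the text. It tests whether x' is S-block superior to x by checking feasibility among the members of S and the "no one worse off, someone better off" condition.

```python
from fractions import Fraction as F

def blocks(x_new, x, S, endow, utils):
    """Return True if coalition S blocks allocation x by means of x_new."""
    n_goods = len(endow[0])
    # Feasibility within S: members of S only redistribute their own endowments.
    feasible = all(
        sum(x_new[i][g] for i in S) == sum(endow[i][g] for i in S)
        for g in range(n_goods)
    )
    if not feasible:
        return False
    gains = [utils[i](x_new[i]) - utils[i](x[i]) for i in S]
    # No member of S is worse off, and at least one is strictly better off.
    return all(d >= 0 for d in gains) and any(d > 0 for d in gains)

def u(bundle):
    """Hypothetical utility u(x, y) = x * y, used by both consumers."""
    return bundle[0] * bundle[1]

utils = [u, u]
endow = [(F(3), F(1)), (F(1), F(3))]          # hypothetical endowments
equal_split = [(F(2), F(2)), (F(2), F(2))]

# The coalition of all consumers blocks the no-trade allocation,
# so the endowment point is not Pareto optimal.
grand_blocks_endowment = blocks(equal_split, endow, [0, 1], endow, utils)
# Consumer 0 alone cannot block the equal split by keeping his endowment.
single_blocks_split = blocks(endow, equal_split, [0], endow, utils)
```

In this hypothetical economy the coalition S = M blocks the initial allocation, which is the sense in which a Pareto optimum is an allocation not blocked by the coalition of all consumers.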
Let $x_S \equiv [x_i]_{i \in S}$ be a subvector of a feasible allocation x, in which $x_i \in x_S$
implies $i \in S$. Denote by $A_S$ the allocations attainable among the members of the
coalition S. That is,
(7)  $A_S \equiv \{x_S \in X_S : \sum_{i \in S} x_i = \sum_{i \in S} \bar{x}_i\}$
where $X_S \equiv \bigotimes_{i \in S} X_i$ and $x \in A$. Assuming that $X_i$, i = 1, 2, ..., m, are all closed
and bounded from below, $A_S$ is compact in $R^{ns}$ for any coalition S, where s is
the number of members of coalition S. It is easy to show that $A_S$ is convex for
any S if the $X_i$, i = 1, 2, ..., m, are all convex. Let the function $u_S : A_S \to R^s$ be
defined by $u_S(x_S) \equiv [u_i(x_i)]_{i \in S}$. That an allocation x is not blocked by a coalition S
[that is, $B_S(x) = \emptyset$] means that $x_S$ is a solution of the vector maximum problem
of maximizing $u_S(x_S)$ subject to $x_S \in A_S$.
(10)  $V(123) = \{(u^1, u^2, u^3) : u^i \le u_i(x_i)$ for some $x_i \in \Omega$, i = 1, 2, 3,
      with $x_1 + x_2 + x_3 = \bar{x}_1 + \bar{x}_2 + \bar{x}_3\}$
provided that the $u_i$'s are quasi-concave. To show this, first observe that the
assumptions imply
    $u^1 \le u_1(x_1)$, $u^2 \le u_2(x_2)$ with $x_1 + x_2 = \bar{x}_1 + \bar{x}_2$
    $u^2 \le u_2(y_2)$, $u^3 \le u_3(y_3)$ with $y_2 + y_3 = \bar{x}_2 + \bar{x}_3$
    $u^1 \le u_1(z_1)$, $u^3 \le u_3(z_3)$ with $z_1 + z_3 = \bar{x}_1 + \bar{x}_3$
But the allocation $[(x_1 + z_1)/2, (x_2 + y_2)/2, (y_3 + z_3)/2]$ is feasible for the
coalition consisting of all three consumers, for we have
(13)  $\frac{x_1 + z_1}{2} + \frac{x_2 + y_2}{2} + \frac{y_3 + z_3}{2} = \bar{x}_1 + \bar{x}_2 + \bar{x}_3$
Figure 2.18. An Illustration of V(S). (Axes: $u^1$, $u^2$.)
states that the core of any "balanced n-person game" is always nonempty,¹⁶ and
then he remarked that an exchange economy with convex preferences always
gives rise to a balanced n-person game; hence its core is nonempty. For the
concept of a "balanced n-person game" and the proof of the above theorem, we
simply refer to Scarf's elegant paper. For a recent generalization of Scarf's result,
we refer to Billera [6].¹⁷
That the core is nonempty is obviously important, for the core can indeed be
empty, and if the core is empty any discussion on the properties of the core
becomes meaningless. Moreover, in view of the close relation between the core
and the set of competitive equilibria, the study of the conditions for a nonempty
core can be utilized in the study of competitive equilibrium, such as the existence
of competitive equilibria. As we will show in the next subsection, every competitive
equilibrium is in the core. Hence if the core is empty, there exists no competitive
equilibrium.¹⁸
An example of an economy with an empty core (which is due to Scarf,
Shapley, and Shubik) is mentioned by Debreu and Scarf [10] and by Shapley and
Shubik [23]. The example is concerned with a pure exchange economy with two
commodities and three consumers, each of whom has nonconvex preferences, as
described by the indifference curves of Figure 2.19.
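Mathematically, the utility function for Figure 2.19 may, for example, be
written as
(14)  $u(x, y) = \begin{cases}
        x   & \text{if } x \le y/2 \\
        y/2 & \text{if } y/2 < x < y \\
        x/2 & \text{if } x = y \\
        x/2 & \text{if } y < x < 2y \\
        y   & \text{if } x \ge 2y
      \end{cases}$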
Assuming that each consumer has one unit of each commodity initially, the proof
that the core of this economy is empty may be sketched roughly as follows:
(i) Suppose that an allocation $c = (c_1, c_2, c_3)$ is in the core where $c_i$ represents
the consumption bundle of Mr. i (i = 1, 2, 3). Since c cannot be blocked by a
coalition consisting of one person, we must have $u(c_i) \ge u(1, 1) = 1/2$,
i = 1, 2, 3.
(ii) Moreover, c cannot be blocked by the coalition consisting of any two persons.
But the coalition of any two persons can give each member
$u(2/3, 4/3) = u(4/3, 2/3) = 2/3$ by redistributing the resources between the two,
one member receiving $(2/3, 4/3)$ and the other $(4/3, 2/3)$.
[Figure 2.19: indifference curves of the example; axes Commodity X and Commodity Y, with the rays x = y/2, x = y, and x = 2y and the levels u = 1 and u = 1/2.]
(iii) Therefore, at least two consumers (say, Mr. 1 and Mr. 2) have $u_i \ge 2/3$ for each
i (i = 1, 2).
(iv) Hence assume that $u_1 \ge 2/3$, $u_2 \ge 2/3$. Among all the possible allocations that give
Mr. 1 and Mr. 2 at least satisfaction 2/3, choose the one that gives Mr. 3 at least
satisfaction 1/2.
(v) It turns out from (14) that the only $c_3$ possible is the one such that $u(c_3) = 1/2$.
This implies that in view of (14), Mr. 3 must have either of the two commodities
in the amount of one unit. Note that owing to the lack of convexity of
preferences, $u(1, 1/2) = u(1, 1) = u(1/2, 1) = 1/2$.
(vi) Suppose that Mr. 3 gets a unit amount of X. Then we can show that the
coalition of 1 and 3 can block such an allocation.
(vii) With a similar analysis for the case where Mr. 3 receives one unit of Y, we
show that an allocation c can always be blocked by some coalition. Hence
c cannot be in the core. Thus the core is empty.
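The utility values quoted in steps (i), (ii), and (v) can be verified directly from (14). The following Python sketch is an illustrative check only; it assumes the piecewise form of (14) with the branches $x \le y/2$, $y/2 < x < y$, $x = y$, $y < x < 2y$, and $x \ge 2y$, and uses exact fractions to avoid rounding.

```python
from fractions import Fraction as F

def u(x, y):
    """Piecewise utility (14); the x == y case coincides with the x/2 branch."""
    if x <= y / 2:
        return x
    if x < y:
        return y / 2
    if x <= 2 * y:
        return x / 2
    return y

# Step (i): utility of the initial endowment (1, 1).
endowment_utility = u(F(1), F(1))
# Step (ii): a two-person coalition splits (2, 2) as (2/3, 4/3) and (4/3, 2/3).
pair_split = (u(F(2, 3), F(4, 3)), u(F(4, 3), F(2, 3)))
# Step (v): the three bundles that all give utility 1/2.
step_v = (u(F(1), F(1, 2)), u(F(1), F(1)), u(F(1, 2), F(1)))
# Nonconvexity: the midpoint of (1, 1/2) and (1/2, 1) is strictly worse than both.
midpoint_utility = u(F(3, 4), F(3, 4))
```

The last line shows the failure of quasi-concavity on which the example rests: two bundles with utility 1/2 have a midpoint with utility 3/8.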
In view of the above example, we can see that the convexity of preferences
plays a crucial role in asserting the nonemptiness of the core. In a recent study by
Shapley and Shubik [23], it is suggested, however, that the convexity of prefer-
ences is not as crucial as it appears in the above example, if the number of
participants is large. They showed that the core can be empty but that there is a set
of allocations which can be blocked only with very small preference on the part
of the blocking coalition. In other words, assuming that a coalition "blocks" an
allocation only when the increase in preferences (money-like utils) of the blocking
coalition is at least as great as some positive number ε, a quasi-core (called the
ε-core) defined in terms of such a "blocking" is always nonempty when the
number of participants is large enough. Thus the core is "approximately" nonempty
if the number of participants is large enough, and the ε-core shows such an
approximation. Although the convexity of preferences is not required in establishing
this result, it is obtained under a prohibitively strong assumption, that is, that
of "transferable utility." Later on we will point out another result in the literature:
if there is a continuum of participants, then the core in the strict sense is
nonempty even with nonconvex preferences and nontransferable utility.
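$p \cdot x_i' \ge p \cdot \bar{x}_i$ for all $i \in S$ with strict inequality for at least one i, so that
$\sum_{i \in S} p \cdot x_i' > \sum_{i \in S} p \cdot \bar{x}_i$, which contradicts $\sum_{i \in S} x_i' = \sum_{i \in S} \bar{x}_i$. (Q.E.D.)¹⁹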
REMARK: As remarked before, every allocation in the core is a Pareto
optimal allocation, while the converse does not necessarily hold. Hence
the above theorem is an extension of the result which says that every
competitive equilibrium realizes a Pareto optimum. Moreover, the above
theorem also asserts that if a competitive equilibrium exists, then the core
is nonempty, and that if the core is empty, there exists no competitive
equilibrium.
Definition: Two consumers (say, i and j) are said to be of the same type if they
have identical utility functions (that is, $u_i = u_j$) with identical consumption sets
(that is, $X_i = X_j$) and if they have the same initial endowment (that is, $\bar{x}_i = \bar{x}_j$).
Suppose that there are r consumers in each of k categories ("types") of
consumers in the economy (so that kr = m). Write the consumption bundle for
each consumer as
    $x_{ij}$, i = 1, 2, ..., k; j = 1, 2, ..., r
That is, $x_{ij}$ is the consumption vector of the jth consumer of the ith type. An
allocation vector then is written as $(x_{11}, \ldots, x_{1r}, \ldots, x_{k1}, \ldots, x_{kr})$. The utility
function and the consumption set of any consumer of the ith type are denoted by $u_i$
and $X_i$, respectively. His initial endowment vector is denoted by $\bar{x}_i$, so that the
aggregate endowment vector of the economy is equal to $\sum_{i=1}^{k} r\bar{x}_i$.
We now impose the following assumption.
(A-1) The consumption sets $X_i$ are convex for all i and the utility functions $u_i$ are
strictly quasi-concave for all i.²⁰
Theorem 2.C.4: Suppose that (A-1) holds. If $(x_{11}, \ldots, x_{1r}, \ldots, x_{k1}, \ldots, x_{kr})$ is
an allocation in the core, then $x_{i1} = x_{i2} = \cdots = x_{ir}$ for each i = 1, 2, ..., k; that is,
an allocation in the core assigns the same consumption to all consumers of the same
type.²¹
PROOF: Suppose not, so that the consumption vectors $x_{i_0 1}, \ldots, x_{i_0 r}$ are not
identical for some $i_0$. For such an $i_0$, let $x_{i_0 j_0}$ be the least desired consumption
vector (that is, Mr. $j_0$ of the $i_0$th type is the "underdog" of the $i_0$th type).
Then owing to the strict quasi-concavity of $u_i$, we have, for such a $j_0$,
(16)  $u_{i_0}\left(\frac{1}{r} x_{i_0 1} + \cdots + \frac{1}{r} x_{i_0 r}\right) > u_{i_0}(x_{i_0 j_0})$
while for any other i we have
(17)  $u_i\left(\frac{1}{r} x_{i1} + \cdots + \frac{1}{r} x_{ir}\right) \ge u_i(x_{ij})$, for some j
Theorem 2.C.5: Suppose that (A-1'), (A-2), and (A-3) hold. Then if $(x_1, \ldots, x_k)$ is
in the core for all r, it is a competitive equilibrium.
PROOF: The proof is carried out in four steps.
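(i) Define set $\Gamma_i$ by
(19)  $\Gamma_i \equiv \{z_i \in \Omega : u_i(z_i + \bar{x}_i) > u_i(x_i)\}$, i = 1, 2, ..., k
Since the nonsatiation assumption (A-2) holds and $u_i$ is strictly quasi-concave,
$\Gamma_i$ is nonempty and convex. Define set $\Gamma$ by²³
(20)  $\Gamma \equiv \{z : z = \sum_{i=1}^{k} \alpha_i z_i, \ \sum_{i=1}^{k} \alpha_i = 1, \ \alpha_i \ge 0, \ z_i \in \Gamma_i, \ i = 1, 2, \ldots, k\}$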
Choose s to be any positive integer with s < r. Let $a_i^s$ be the smallest
integer greater than or equal to $s\alpha_i^*$ and let I be the set of i for which
$\alpha_i^* > 0$. For each i in I, define $z_i^s$ by
(22)  $z_i^s \equiv \frac{s\alpha_i^*}{a_i^s} z_i^*$
Observe that $z_i^s$ approaches $z_i^*$ as s tends to infinity. Therefore, $z_i^s$ belongs
to $\Gamma_i$ for a sufficiently large s, since $\Gamma_i$ is an open set (for each i) and any
point sufficiently close to a point in $\Gamma_i$ (such as $z_i^*$) is in $\Gamma_i$. Observe also
d. SOME ILLUSTRATIONS
The purpose of this subsection is to illustrate some of the concepts and
theorems discussed thus far. To simplify the exposition, we assume that the utility
functions of all consumers are identical and are denoted by u(xi), i = 1, 2, ..., m.
Assume that the consumption set for each consumer is $\Omega$, the nonnegative orthant
of $R^n$, and that u takes nonnegative values with u(0) = 0. Furthermore, impose
the following restrictive assumption on the function u.
(A-4) The function u(z) is linear homogeneous, concave, and
    $u[t z_1 + (1 - t) z_2] > t u(z_1) + (1 - t) u(z_2)$
for all 0 < t < 1 and $z_1, z_2 \in \Omega$ with $z_1 \ne \beta z_2$ for any $\beta \in R$, $\beta > 0$, and $z_1 \ne 0$,
$z_2 \ne 0$.
REMARK: The last part of (A-4) says that u(z) is strictly concave for all
nonproportional $z_1$ and $z_2$. Note that u cannot be linear homogeneous and
strictly concave for proportional $z_1$ and $z_2$.²⁶ An example of u(z) which
satisfies (A-4) is²⁷
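    $f\left[\sum_{i=1}^{m} \alpha_i z_i\right] = \alpha f\left[\sum_{i=1}^{m} \frac{\alpha_i}{\alpha} z_i\right] \ge \alpha \sum_{i=1}^{m} \frac{\alpha_i}{\alpha} f(z_i) = \sum_{i=1}^{m} \alpha_i f(z_i)$, where $\alpha \equiv \sum_{i=1}^{m} \alpha_i$
That is,
(28)  $f\left[\sum_{i=1}^{m} \alpha_i z_i\right] \ge \sum_{i=1}^{m} \alpha_i f(z_i)$
for any $\alpha_i \ge 0$, i = 1, 2, ..., m, with $\sum_{i=1}^{m} \alpha_i > 0$. Note that $\sum_{i=1}^{m} \alpha_i$ does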
not have to be equal to one. For example, we have
    $f(z_1 + z_2 + \cdots + z_m) \ge f(z_1) + \cdots + f(z_m)$
for such a function f. If f is linear homogeneous and strictly concave for
nonproportional vectors, then the above inequality (28) is replaced by (28').
(28')  $f\left[\sum_{i=1}^{m} \alpha_i z_i\right] > \sum_{i=1}^{m} \alpha_i f(z_i)$
for all $\alpha_i > 0$, i = 1, 2, ..., m, provided that at least one $z_{i_0}$ is not proportional
to the others, that is, $z_{i_0} \ne \beta z_i$ for any $\beta \ge 0$ and any $i \ne i_0$, and that the
$z_i$'s do not vanish.
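Inequalities (28) and (28') are easy to check numerically. The following Python sketch is a minimal illustration under stated assumptions: the function $f(z_1, z_2) = \sqrt{z_1 z_2}$ is a hypothetical example of a linear homogeneous, concave function, and the helper names and test vectors are invented for the illustration.

```python
import math

def f(z):
    """Hypothetical example: f(z1, z2) = sqrt(z1 * z2), linear homogeneous and concave."""
    return math.sqrt(z[0] * z[1])

def weighted_sum(alphas, zs):
    """Componentwise sum of alpha_i * z_i."""
    return tuple(sum(a * z[k] for a, z in zip(alphas, zs)) for k in range(2))

def gap(alphas, zs):
    """f(sum alpha_i z_i) - sum alpha_i f(z_i); (28) says this is nonnegative."""
    return f(weighted_sum(alphas, zs)) - sum(a * f(z) for a, z in zip(alphas, zs))

# Nonproportional vectors: the inequality is strict, as in (28').
nonproportional_gap = gap([1, 1, 1], [(1, 4), (4, 1), (2, 2)])
# Proportional vectors: (28) holds with equality (up to rounding).
proportional_gap = gap([2, 3], [(1, 2), (2, 4)])
```

Note that the weights in both calls do not sum to one, in line with the remark following (28).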
We now prove the following lemma, which justifies the power of (A-4) for
the purpose of simplifying the illustration.
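Lemma: Let u be the identical utility function of all m consumers and assume that
u satisfies (A-4). Let V be defined by
(29)  $V \equiv \{(u^1, \ldots, u^m) : \sum_{i=1}^{m} u^i = u(\omega), \ u^i \ge 0, \ i = 1, 2, \ldots, m\}$
where $\omega \equiv \sum_{i=1}^{m} \bar{x}_i$ is the aggregate endowment vector.
If $(u^1, \ldots, u^m)$ is in V, then there exists a feasible allocation $x = (x_1, \ldots, x_m)$ such
that $u(x_i) = u^i$, i = 1, 2, ..., m, and that x is Pareto optimal. Conversely, if x is a
Pareto optimal allocation, then $(u^1, \ldots, u^m) = [u(x_1), \ldots, u(x_m)]$ is in V.²⁸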
The first two steps (i) and (ii) of the proof are concerned with the
first statement of the lemma, and the last step (iii) is concerned with the
second statement of the lemma.
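(i) First we show that there exists a feasible allocation $(x_1, \ldots, x_m)$ which
satisfies $\sum_{i=1}^{m} u^i = u(\omega)$, where $u^i = u(x_i)$, i = 1, ..., m. Let
(30)  $x_i \equiv \frac{u^i}{u(\omega)} \omega$, i = 1, 2, ..., m
Then, by the linear homogeneity of u,
    $u(x_i) = \frac{u^i}{u(\omega)} u(\omega) = u^i$, and $\sum_{i=1}^{m} x_i = \omega$
since $\sum_{i=1}^{m} u^i = u(\omega)$. In other words, the allocation $(x_1, \ldots, x_m)$ defined
by (30) is feasible.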
(ii) Next we show that if $[u(x_1), \ldots, u(x_m)] = (u^1, \ldots, u^m)$ is in V, then
$(x_1, \ldots, x_m)$ is Pareto optimal. Suppose the contrary and assume that
there exists $y_i \ge 0$, i = 1, 2, ..., m, such that $\sum_{i=1}^{m} y_i = \omega$ and $u(y_i) \ge u^i$
for all i with strict inequality for at least one i. Then using the concavity
of u, we can observe
    $u(\omega) = u\left(\sum_{i=1}^{m} y_i\right) \ge \sum_{i=1}^{m} u(y_i) > \sum_{i=1}^{m} u^i = u(\omega)$
which is a contradiction.
(iii) To show that $[u(x_1), \ldots, u(x_m)]$ is in V for every Pareto optimal allocation
$(x_1, \ldots, x_m)$, it suffices to show that $(x_1, \ldots, x_m)$ is proportional
in the sense that
(32)  $x_i = \alpha_i \omega$ for some $\alpha_i \ge 0$, i = 1, 2, ..., m, with $\sum_{i=1}^{m} \alpha_i = 1$
where $\omega \equiv \sum_{i=1}^{m} \bar{x}_i$, the aggregate endowment vector. For then we have
    $\sum_{i=1}^{m} u(x_i) = \sum_{i=1}^{m} u(\alpha_i \omega) = \sum_{i=1}^{m} \alpha_i u(\omega) = u(\omega)$
This contradicts the assumption that $(x_1, \ldots, x_m)$ is Pareto optimal.
(Q.E.D.)
REMARK: Observe that, in (iii) of the above proof, we showed that every
Pareto optimal allocation is proportional in the sense of (32).²⁹
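Step (i) of the proof is constructive, so it can be illustrated with a few lines of code. The Python sketch below is only an illustration: it assumes the hypothetical utility $u(z_1, z_2) = \sqrt{z_1 z_2}$ (which is linear homogeneous), an invented aggregate endowment $\omega = (4, 9)$, and an invented utility vector (1, 2, 3) whose components sum to $u(\omega) = 6$. It builds the proportional allocation of (30) and leaves the checks of $u(x_i) = u^i$ and feasibility to the reader or a test.

```python
import math

def u(z):
    """Hypothetical linear homogeneous utility: u(z1, z2) = sqrt(z1 * z2)."""
    return math.sqrt(z[0] * z[1])

def allocation_from_utilities(targets, omega):
    """Proportional allocation x_i = (u^i / u(omega)) * omega, as in (30)."""
    total = u(omega)
    if not math.isclose(sum(targets), total):
        raise ValueError("the utility vector must sum to u(omega)")
    return [tuple(t / total * w for w in omega) for t in targets]

omega = (4.0, 9.0)           # hypothetical aggregate endowment, u(omega) = 6
targets = [1.0, 2.0, 3.0]    # hypothetical utility vector with sum u(omega)
alloc = allocation_from_utilities(targets, omega)

achieved = [u(x) for x in alloc]
total_used = tuple(sum(x[k] for x in alloc) for k in range(2))
```

Each consumer receives a fixed share of every commodity, which is exactly the proportionality property (32).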
Next we turn to an illustration of the Edgeworth-Debreu-Scarf limit
theorem.³⁰ Assume now that there are two types of consumers and that there are
r consumers of each type. There are two commodities X and Y in the economy.
Assume for the sake of illustration that the consumers of both types have identical
utility functions of the "Cobb-Douglas" form
(33)  $u(x, y) = x^{\alpha} y^{1-\alpha}$, $0 < \alpha < 1$, where $x \ge 0$ and $y \ge 0$
As remarked before, this utility function satisfies (A-4) [as well as (A-1')]. The
consumers in the two different types are distinguished by their initial endowments.
Denote by $(\bar{x}_i, \bar{y}_i)$ the initial endowments of any consumer of the ith type (i = 1, 2).
Denote the aggregate endowments of X and Y by a and b, respectively, that is,
(34)  $a \equiv r\bar{x}_1 + r\bar{x}_2$ and $b \equiv r\bar{y}_1 + r\bar{y}_2$
We are interested in characterizing the core of such a replicated economy.
Since any allocation in the core assigns identical consumption bundles to every
consumer of the same type (the parity theorem of Subsection c), we may represent
an allocation in the core by $[(x_1, y_1), (x_2, y_2)]$ where $(x_i, y_i)$ denotes the
consumption bundle of any consumer of the ith type. Moreover, we know, by definition
of the core, that any allocation in the core is Pareto optimal. Furthermore,
as we observed in this subsection, any Pareto optimal allocation is proportional;
that is, it satisfies (32). Hence any allocation in the core is proportional. In other
words, any allocation in the core assigns the two commodities in the ratio of a/b,
that is,
(36-b)  $r x_2 = (1 - \theta) a$ and $r y_2 = (1 - \theta) b$
where $0 \le \theta \le 1$. Recalling that $a = r(\bar{x}_1 + \bar{x}_2)$ and $b = r(\bar{y}_1 + \bar{y}_2)$, we may
rewrite this as
(37)  $x_1 = \theta \bar{x}$, $y_1 = \theta \bar{y}$, $x_2 = (1 - \theta) \bar{x}$ and $y_2 = (1 - \theta) \bar{y}$
where $\bar{x}$ and $\bar{y}$ are defined by
(38)  $\bar{x} \equiv \bar{x}_1 + \bar{x}_2$ and $\bar{y} \equiv \bar{y}_1 + \bar{y}_2$
Therefore, each consumer of type 1 obtains the satisfaction represented by
$\theta \bar{x}^{\alpha} \bar{y}^{1-\alpha}$. Similarly, each person of type 2 obtains the satisfaction represented
by $(1 - \theta) \bar{x}^{\alpha} \bar{y}^{1-\alpha}$.
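(39)  $s\theta \bar{x}^{\alpha} \bar{y}^{1-\alpha} + t(1 - \theta) \bar{x}^{\alpha} \bar{y}^{1-\alpha} \ge (s\bar{x}_1 + t\bar{x}_2)^{\alpha} (s\bar{y}_1 + t\bar{y}_2)^{1-\alpha}$
for any integers s and t with $0 \le s, t \le r$. Dividing both sides of (39) by $s\bar{x}^{\alpha}\bar{y}^{1-\alpha}$
and writing $t/s \equiv \sigma$ (where $s \ge 1$), we obtain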
for any $\sigma$. Denote the RHS of this relation by $\phi(\sigma)$. Then we immediately obtain
so that $\phi'(\sigma) > 0$ for all $\sigma$. It is easy to check that $\phi(\sigma)$ is a strictly concave function³¹
and therefore
(47)  $\phi'(\sigma)(1 - \sigma) > \phi(1) - \phi(\sigma)$ for all $\sigma \ne 1$
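Now consider the limit as $r \to \infty$. Clearly when $r \to \infty$, $r/(r - 1) \to 1$ and
$(r - 1)/r \to 1$. Therefore, we have, in view of (49),
(50)  $\lim_{\sigma \to 1,\ \sigma < 1} \frac{d\phi}{d\sigma} \ \ge \ 1 - \theta \ \ge \ \lim_{\sigma \to 1,\ \sigma > 1} \frac{d\phi}{d\sigma}$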
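(56)  $y_i = \alpha \frac{\bar{y}}{\bar{x}} \bar{x}_i + (1 - \alpha) \bar{y}_i$, i = 1, 2
    Maximize: $x_i^{\alpha} y_i^{1-\alpha}$ over $(x_i, y_i)$
    Subject to: $p x_i + y_i \le p \bar{x}_i + \bar{y}_i$, $x_i \ge 0$, $y_i \ge 0$
The solution of this problem can be computed easily and (as is well known)
takes the following form:³²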
Therefore, combining this with (58-a), we can compute the unique equilibrium
price ratio p as
(60)  $p = \frac{\alpha}{1 - \alpha} \frac{\bar{y}}{\bar{x}}$
Therefore, the unique competitive allocation is computed by (58-a) and (58-b)
using this p. The resulting expressions for $x_i$ and $y_i$, i = 1, 2, are identical to the
ones in (55) and (56). In other words, when $r \to \infty$, the core allocation is unique
and coincides with the competitive allocation. It may be emphasized here that
both the core allocation (when $r \to \infty$) and the competitive allocation are unique
as a result of (A-4), especially the assumption that u is strictly concave with
respect to nonproportional vectors. If this assumption is relaxed, then the
uniqueness does not necessarily follow.
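The shrinking of the core in this Cobb-Douglas illustration can be traced numerically. The Python sketch below is a rough illustration under stated assumptions: it takes the no-blocking condition in the form (39) with weak inequality for all coalition sizes $0 \le s, t \le r$, parametrizes core allocations by the share $\theta$ of type 1 as in (37), and uses invented endowments (3, 1) and (1, 3) with $\alpha = 1/2$. The function names are hypothetical. The competitive share is obtained from (56) as $\theta^* = \alpha \bar{x}_1/\bar{x} + (1 - \alpha)\bar{y}_1/\bar{y}$.

```python
def core_interval(r, alpha, e1, e2):
    """Bounds on theta implied by condition (39) over all coalitions (s, t)."""
    (x1, y1), (x2, y2) = e1, e2
    xbar, ybar = x1 + x2, y1 + y2
    u_total = xbar ** alpha * ybar ** (1 - alpha)
    lo, hi = 0.0, 1.0
    for s in range(r + 1):
        for t in range(r + 1):
            if s == t:
                continue  # (39) then holds with equality and never binds
            g = (s * x1 + t * x2) ** alpha * (s * y1 + t * y2) ** (1 - alpha) / u_total
            bound = (g - t) / (s - t)   # (39) <=> (s - t) * theta >= g - t
            if s > t:
                lo = max(lo, bound)     # lower bound on theta
            else:
                hi = min(hi, bound)     # upper bound on theta
    return lo, hi

def competitive_theta(alpha, e1, e2):
    """Share of type 1 in the competitive allocation, from (56) and (37)."""
    (x1, y1), (x2, y2) = e1, e2
    return alpha * x1 / (x1 + x2) + (1 - alpha) * y1 / (y1 + y2)

alpha, e1, e2 = 0.5, (3.0, 1.0), (1.0, 3.0)   # hypothetical data
theta_star = competitive_theta(alpha, e1, e2)
intervals = {r: core_interval(r, alpha, e1, e2) for r in (1, 2, 10, 100)}
```

For these data the interval of admissible $\theta$ narrows around $\theta^*$ as r grows, which is the content of the limit theorem in this special case; with r = 1 the interval corresponds to the PQ segment of the box diagram.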
e. SOME REMARKS
The limit theorem obtained by Debreu and Scarf [10] has aroused great
interest among mathematicians and economists working on the theory of the core
and has produced various attempts to extend the Debreu-Scarf analysis. One focal
point is the particular way of increasing the number of persons in the economy.
It is assumed that there are k types of participants in the economy with r members
of each type. Debreu and Scarf then obtained their result by letting r increase. The
crucial step in obtaining the limit theorem is the equal treatment theorem which
says that an allocation in the core assigns the same consumption to all consumers
of the same type. In this way they avoid the difficulty of the feasibility condition
    $\sum_{i=1}^{m} x_i = \sum_{i=1}^{m} \bar{x}_i$
theorem, it is then natural to explore whether or not a result similar to their limit
theorem holds when we directly increase m. One such attempt is that of Vind [31].
Define the set $\Gamma_i$ by (19), following Debreu and Scarf [10], and let
$\Gamma_i(\varepsilon) \equiv \{z_i \in \Omega : N_\varepsilon(z_i) \subset \Gamma_i\}$, where $N_\varepsilon(z_i)$ is an $\varepsilon$-neighborhood of $z_i$.
Let $\rho(p, \varepsilon)$ be the number of persons such that $H \cap \Gamma_i(\varepsilon) \ne \emptyset$, where
$H \equiv \{z_i : p \cdot z_i < 0\}$. Let $\Gamma(\varepsilon)$ be the convex hull of $\bigcup_{i=1}^{m} \Gamma_i(\varepsilon)$.
If we can prove that $0 \notin \Gamma(\varepsilon)$, then using the separation theorem we can prove
that there exists $p \ge 0$ such that $\rho(p, \varepsilon) = 0$. Hence if we
can prove $0 \notin \Gamma(\varepsilon)$ for any $\varepsilon > 0$ when $m \to \infty$, then we have a generalization of the
Debreu-Scarf limit theorem. However, such a proof is impossible. What Vind [31]
has shown is that we can find an upper bound for $\rho(p, \varepsilon)$, which is independent of
m, for every core allocation. In other words, if $(z_1, \ldots, z_m)$ is a core allocation, then
there exists a $p \ge 0$ such that $\rho(p, \varepsilon) \le \bar{\rho}$ for any $\varepsilon > 0$, where $\bar{\rho}$ is defined as³⁴
<<)I Z and c(E) = sup d(0, H n r(E))
Clearly this result is useful only when $\bar{\rho}$ is finite for any m. However, $\bar{\rho}$ becomes
infinite when, for example, a particular individual has a complete monopoly over a
certain scarce resource in the production economy. The assumption that $\bar{\rho}$ is finite
then seems to play a role similar to the Debreu-Scarf equal treatment theorem.
Another approach to the theory of the core and the limit theorem starts with
Aumann's assumption of an "atomless" set or a continuum of traders [2]. In other
words, the concept of a competitive equilibrium requires that the influence of
each participant be zero, which is possible only when the number of participants
in the economy is infinite. Aumann therefore assumed that the economy contains a
continuum of traders, as many as there are real numbers in the unit interval $I \equiv [0, 1]$.
Define the initial endowment and the feasible allocation, respectively, as the
functions $\bar{x}$ and x defined over I to $\Omega$, such that
    $\int \bar{x} \equiv \int_I \bar{x}_i \, di > 0$
and
    $\int x \equiv \int_I x_i \, di = \int_I \bar{x}_i \, di$
where the integral is defined componentwise. Note that the integral gives the area
under the curve defined by the integrand, and therefore the area under a single
point (say, $x_i$) is zero. This is the basis of the "atomless" set of participants. In
the actual treatment of core theory with an atomless space of participants, a
branch of mathematics called "measure theory" is extensively used. Thus the
above integrals are taken in the sense of Lebesgue,³⁵ and $\bar{x}_i$ and $x_i$ are Lebesgue
integrable functions in i. Let $\mu(S)$ denote the Lebesgue measure where S is a
Lebesgue measurable subset of I. The core C, and the set of competitive
allocations E, are respectively defined by
    $C \equiv \{x : u_i(x_i') > u_i(x_i), \ i \in S, \ \mu(S) > 0$ imply $\int_S x_i' \, di \ne \int_S \bar{x}_i \, di\}$
    $E \equiv \{x : \exists \, p \ge 0$ such that $x_i' \in D_p(i)$ implies $u_i(x_i) \ge u_i(x_i')$ for a.e. $i \in I\}$
says that, under certain assumptions, every core allocation can be obtained as a
competitive equilibrium!
The question then boils down to the problem of finding an equilibrium
price vector. The tatonnement process, which we discuss in Chapter 3, provides
one method of finding an equilibrium price vector, as long as the process converges
to an equilibrium. The beauty of this process is that the "market manager" of this
process does not have to know the preferences of each consumer and the produc-
tion set of each producer. A major weakness of the process is that the convergence
of this process to an equilibrium is established only under a restrictive situation in
which all commodities are "gross substitutes." Recently Scarf [18 and 19], as
mentioned before, offered a constructive method of finding an equilibrium price
vector. Electronic computers will "quickly" calculate the equilibrium price
vector³⁹ if we know the technology available in the economy, the initial
endowment of each consumer, and have certain information with regard to each
consumer's demand function.⁴⁰ If one can successfully compute the equilibrium price
vector, then the existence of competitive equilibrium of a given economy can also
be ascertained.
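The idea of a price adjustment process that uses only excess demands can be illustrated on the Cobb-Douglas exchange economy of subsection d. The Python sketch below is a minimal illustration, not the process analyzed in Chapter 3: it assumes a discrete-time rule in which the price ratio p rises in proportion to the excess demand for X, the standard Cobb-Douglas demand $x_i = \alpha (p\bar{x}_i + \bar{y}_i)/p$, and invented endowments, adjustment speed, and starting price. The function names are hypothetical.

```python
def excess_demand_x(p, alpha, endowments):
    """Aggregate excess demand for X at price ratio p (Y is the numeraire)."""
    demand = sum(alpha * (p * x + y) / p for x, y in endowments)
    supply = sum(x for x, _ in endowments)
    return demand - supply

def tatonnement(p0, alpha, endowments, speed=0.1, steps=500):
    """Raise p when X is in excess demand, lower it when X is in excess supply."""
    p = p0
    for _ in range(steps):
        p += speed * excess_demand_x(p, alpha, endowments)
    return p

alpha = 0.5
endowments = [(3.0, 1.0), (1.0, 1.0)]     # hypothetical (x_bar_i, y_bar_i)
xbar = sum(x for x, _ in endowments)
ybar = sum(y for _, y in endowments)

p_star = alpha / (1 - alpha) * ybar / xbar  # equilibrium price ratio from (60)
p_final = tatonnement(1.0, alpha, endowments)
```

The adjustment rule never inspects utility functions, only the reported excess demand, which is the informational advantage described above. Convergence here relies on the gross substitutability of Cobb-Douglas demands and on a sufficiently small adjustment speed.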
That the core can be characterized "almost" completely by competitive
equilibrium has one important corollary. That is, if we can find circumstances in
which the core is empty, then the competitive mechanism will "fail," and con-
versely, if the competitive mechanism "fails," then there is a good possibility that
the core may be empty (assuming that the number of participants is large).⁴¹
In the literature, the cases in which the competitive mechanism fails are known as
the cases of market failures.⁴² This suggests the close connection between the
theory of the core and the theory of market failures (and the theory of monopoly).
A famous case of market failure is that of external economies and
diseconomies in production or consumption that affect the welfare of outsiders
regardless of their desires. A classical example of external economies is that of an
apple grower and a beekeeper in the adjacent field. External diseconomies have
attracted greater attention recently due to smoke, noise, and many forms of air
and water pollution. Recently a fresh look at this problem has been taken by
Shapley and Shubik [25] , who considered the problem of externality from the
viewpoint of the theory of the core. They argue, for example, that in certain cases
of diseconomies, the core may be empty. Needless to say, if the core is empty, the
competitive equilibrium, in general, does not exist. Here we may quote Shapley
and Shubik ([25] , p. 681) for such an example.
The Garbage Game. Each player has a bag of garbage which he must dump in
someone's yard. The utility of having b bags dumped in one's yard is -b.
It can be shown easily that if there are more than two players in this "game," there
is no core.43
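The emptiness of the core for more than two players can be checked with a short computation. The sketch below is not taken from Shapley and Shubik's text. It assumes a side-payment version of the game in which a coalition of s < n players dumps its bags on outsiders and receives the n - s bags of the outsiders, so that v(S) = -(n - s), while the grand coalition must keep all n bags, v(N) = -n. Because this assumed game is symmetric and the core is convex, the core is nonempty only if the equal split (a payoff of -1 for every player) is unblocked; the hypothetical function `core_is_empty` tests exactly that.

```python
def coalition_value(s, n):
    """Assumed characteristic function of the garbage game (illustrative).

    A proper coalition of s players dumps on outsiders and receives the
    n - s bags of the outsiders; the grand coalition keeps all n bags.
    """
    return -n if s == n else -(n - s)


def core_is_empty(n):
    """Return True if the core of the assumed n-player game is empty.

    By symmetry and convexity of the core, it suffices to test the equal
    split x_i = -1: a coalition of size s blocks it when -s < v(s).
    """
    for s in range(1, n):
        if -s < coalition_value(s, n):
            return True   # a coalition of size s does better on its own
    return False


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, core_is_empty(n))
```

Under these assumptions the coalitions of size n - 1 are the ones that block: each can guarantee itself a total of -1, which is more than the -(n - 1) it receives under the equal split as soon as n exceeds two.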
Another example of market failures may be the commodity called "informa-
tion." It is true that certain kinds of information can be traded in the market just
as can any other commodity. For example, information with regard to technical
FOOTNOTES
1. Note the following features of the competitive price mechanism: (1) Each economic
agent (here the consumer) is a price taker, and (2) there exists a price system which
is the same for all economic agents. The price system can exist, of course, without
each agent being a price taker. In the box diagram model, one or both can be
empowered to name the price. For a game-theoretic exposition, see Shapley and Shubik
[22].
2. Note that the core is a subset of the set of Pareto optimal points, in the sense that
the entire contract curve is now restricted to its PQ segment. In this sense, the core
is stronger than a Pareto optimum. The meaning of the core will be discussed more
fully below.
3. See Edgeworth [ 11] , pp. 35-39, in particular.
4. Edgeworth's definitions of terms such as "recontracting," "final settlement," and so
on, appear in [ 11] , pp. 18-19.
5. There is a slight oversell of the core theory here. The Debreu-Scarf result assumes
that there are an equal number of individuals of each "type." The power of this
assumption lies in its consequence that the individuals of the same type are treated
identically (that is, each has the same consumption bundle). If there are different
numbers of individuals of each type, then this parity (or equal treatment) result does
not follow (that is, there is a core allocation which treats individuals of the same type
differently). On the other hand, the basic premise of competitive equilibrium is
obviously that of equal treatment. Thus both the theory of competitive equilibrium
and the Debreu-Scarf result have one basic feature in common, that is, equal
treatment.
6. The game is said to be cooperative if the players are allowed to communicate before
each play and to make binding agreements about the strategies they will use. Side
payments are allowed when there is a medium of exchange-say, "money"-which
is freely transferable between the players and each player's utils are linear in (or
proportional to) money. It is known that cooperative games without side payments
include cooperative games with side payments as a special case. Noncooperative
games include cooperative games as a special case.
7. See, for example, Aumann [4] and Aumann and Peleg [5].
8. For an excellent survey of the theory of n-person games without side payments, see
Aumann [3].
9. Debreu and Scarf [ 10] suggested a way to generalize the results so as to incorporate
production into the model. Nikaido [ 15] , and Arrow and Hahn [ 1] have rigorous
formulations and the proofs of such a generalization.
10. Any order-preserving (that is, monotone increasing) function of a particular utility
function can also be a utility function. For the discussion of the representability of
preferences by a continuous real-valued function ("utility function"), see Debreu
[9]. Also recall our discussion in Section B.
11. The reader should find no difficulty (in most cases) in carrying out an analysis similar
to the one which follows, replacing the function $u_i$ by the usual preference ordering.
12. Note, however, that a person will not be worse off compared to his initial endow-
ment position, since he can always refuse trading. In other words, the existence of
a coalition does not mean that it would necessarily "take effect."
13. One of the most important applications of this concept in economics is the "com-
pensation principle" problem in welfare economics. For an exposition of the
compensation principle, see Takayama [29] , chapter 17. Clearly this problem offers
an interesting application of the theory of n-person (cooperative) games in economics.
14. A game (without side payments) can be defined by specifying V(S) for all coalitions S
and U(M). In the theory of games, some or all of the following assumptions are
imposed: (i) V(S) is convex, closed, and nonempty for each S; (ii) $v \in V(S)$ and
$v' \le v$, where $v' \in R^S$, imply that $v' \in V(S)$; and (iii) $V(S) \times V(S') \subset V(S \cup S')$ if
S and S' are two disjoint coalitions. Sometimes these assumptions are used as the
axioms of the theory.
15. In Figure 2.18, it is implicitly assumed that the normalization of units is made with
regard to the representation by the $u_i$'s such that $u_i(0) = 0$ for all i.
16. Let $e_S$ be a vector in $R^m$ whose ith element $e_{S,i}$ is defined as $e_{S,i} = 1$ if $i \in S$,
and $= 0$ if $i \notin S$. A collection T of coalitions, {S}, is called balanced if it is possible to assign
to each S in T a nonnegative number $\delta_S$ such that $\sum_{S \in T} \delta_S e_S = e_M$. If M = {1, 2, 3},
then T = {{1, 2}, {2, 3}, {1, 3}} is balanced where the $\delta_S$ are given by
$\delta_{\{1,2\}} = \delta_{\{2,3\}} = \delta_{\{1,3\}} = 1/2$. An m-person game is said to be balanced if for every balanced
collection T, $u_S \in V(S)$ for all $S \in T$ implies $u \in V(M)$.
17. Not only did he generalize Scarf's result, but he also obtained necessary and
sufficient conditions for a nonempty core for games whose payoff sets are assumed
to be convex.
18. On the other hand, the method of proving that the core is nonempty can be utilized
in the proof of the existence of competitive equilibrium. Apparently from this
viewpoint, Scarf [18; 19] showed a constructive proof of the existence of competitive
equilibrium. Compare these articles with [17].
19. It is easy to see that, in establishing this theorem, no stronger assumptions than
those needed in proving Theorem 2.C.1 ("every competitive equilibrium realizes
a Pareto optimum") are required.
20. A real-valued function f defined on a convex subset Z of $R^n$ is called strictly quasi-concave
if $f(z) \ge f(z')$ implies $f[tz + (1 - t)z'] > f(z')$ for all $z, z' \in Z$ with $z \ne z'$,
and for all t, 0 < t < 1 (see Chapter 1, Section E). Using this definition, we can prove
the following: Let f be strictly quasi-concave on Z, and let $z_1, z_2, \ldots, z_m$ be
m points in Z. Suppose that one of these m vectors (say, $z_{j_0}$) is distinct from any
other points with $f(z_j) \ge f(z_{j_0})$ for all j = 1, 2, ..., m. Then we have
$f(\theta_1 z_1 + \theta_2 z_2 + \cdots + \theta_m z_m) > f(z_{j_0})$ for all $\theta_j > 0$, $\sum_{j=1}^{m} \theta_j = 1$, such that $z_{j_0} \ne \sum_{j=1}^{m} \theta_j z_j$.
21. In this sense, Theorem 2.C.4 may be termed the parity theorem or the equal treatment
theorem.
22. In other words, the coalition of the underdogs can block the original allocation
by redistributing their own initial holdings among themselves, where the "underdog"
of the ith group now receives $(x_{i1}/r + \cdots + x_{ir}/r)$. Such a coalition is feasible as a
26. To see this, let $z^2 = \beta z^1$ for some $\beta$ and observe that $u[(z^1 + z^2)/2] = u(z^1)/2 + u(z^2)/2$
using linear homogeneity. But this contradicts strict concavity. Note also that if
$z^2 = 0$ and if u(0) = 0, then we again have $u[(z^1 + z^2)/2] = u(z^1)/2 + u(z^2)/2$,
contradicting strict concavity.
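27. To prove that u(z) is strictly concave for all nonproportional vectors, show that
$u'(z^1) \cdot (z^2 - z^1) > u(z^2) - u(z^1)$ for all nonproportional $z^1 > 0$ and $z^2 > 0$. To
show this, utilize the following well-known inequality:
$\theta_1^{\alpha_1} \theta_2^{\alpha_2} \cdots \theta_n^{\alpha_n} \le \alpha_1 \theta_1 + \alpha_2 \theta_2 + \cdots + \alpha_n \theta_n$
(the equality holds if and only if $\theta_1 = \theta_2 = \cdots = \theta_n$), where
$\sum_{i=1}^{n} \alpha_i = 1$, $\alpha_i > 0$, and $\theta_i \ge 0$ for all i.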
28. A similar result is obtained in E. Eisenberg, "Aggregation of Utility Functions,"
Management Science, 7, July 1961.
29. In terms of the box diagram, this means that the contract curve coincides with the
diagonal line of the box.
30. I learned recently that a similar example was discussed by Herbert Scarf in his
lecture at Yale.
31. Compute 0"(a) and examine q"(a) < 0 for all a. Alternatively, write q(a) = cD [f (a),
g(a)], where f (a) = (x I /x + x 2a/x )a and g(a) _ (y I /y + y 2a/y )' ". Observe that
both $f(\alpha)$ and $g(\alpha)$ are strictly concave. Also note that $\Phi$ is strictly concave and monotone
(that is, $\partial\Phi/\partial f > 0$ and $\partial\Phi/\partial g > 0$). These establish the strict concavity of $\phi$.
32. First note that the Cobb-Douglas form of the utility function guarantees an interior
solution. The condition requiring the tangency between an indifference curve and the
budget line is stated as $p = [\alpha/(1 - \alpha)](y_i/x_i)$. This together with $p x_i + y_i = p \bar{x}_i + \bar{y}_i$
constitutes a necessary and sufficient quasi-saddle-point characterization of the
solution and yields (58-a) and (58-b).
33. Green [ 13] contains such an example, which he acknowledges to Alan Kirman.
Green [ 13] then argues that there are bounds on the inequality of treatments. See
also Vind [31].
34. Let $X \subset R^n$ and $a \in R^n$. Then d(a, X) denotes the "distance" between a and X, meaning
$d(a, X) = \inf_{x \in X} \|x - a\|$.
35. Since the reader is not expected to know measure theory, the subsequent paragraph
may be omitted.
36. See R. J. Aumann, "Existence of Competitive Equilibrium in Markets with a Continuum
of Traders," Econometrica, 34, January 1966. A crucial assumption is that
of "monotonicity" in the sense that $x_i \ge x_i'$ implies $u_i(x_i) > u_i(x_i')$, and not the (quasi-)
concavity of the $u_i$'s.
37. Suppose that there exists x E E, but x 0 C,. Since x 0 C,, there exists x' and S c I
with µ(S) > 0, such that u;(x;) > u.(x;), i E S and Js x' = Js x. But, since x E E,,
there exists p ? 0 such that p x; > p x;, i E S, which implies p [ S xidi] >
I [ )s x;di] This contradicts the feasibility condition of the coalition S, Js x;di =
.
Js x;di.
38. For example, Vind [ 30] showed a different derivation of Aumann's result E, = C, in
[2]. Hildenbrand [14] introduced production, while Aumann and Vind assumed a
pure exchange economy. Moreover, Hildenbrand showed that the monotonicity assumption
can be relaxed and that the consumption set does not have to be restricted to
the nonnegative orthant of $R^n$.
39. Scarf writes ([ 19] , p. 669)
A considerable body of computational experience with larger models has
already been gathered. Over one hundred examples have been tested, ranging
from three to twenty sectors. The computational time, which is dependent on
the number of sectors, has never exceeded five minutes on an IBM 7094, and in
most cases is substantially smaller.
40. It is assumed that each consumer has a set of demand functions which can be expressed
as $x_{ij} = a_{ij} f_i(p) / p_j^{b_i}$, where $x_{ij}$ denotes consumer i's demand function of the
jth commodity, $a_{ij}$ measures the intensity of i's demand for j, and $b_i$ is the elasticity of
substitution for i. For computation, it is required that the $a_{ij}$'s and the $b_i$'s be known.
41. It is to be pointed out that competitive equilibrium may fail to exist for various
reasons, such as the nonconvexity of preferences and of the aggregate production set.
But the core can still be nonempty or at least "approximately" nonempty as in the
Shapley-Shubik theory of the $\varepsilon$-core [23]. This is a great merit of the theory of the
core. But it is also to be noted that the core can be very large, and its practical
significance may be greatly hampered.
42. The purpose of our discussion here is not to make a comprehensive survey of the
theory of market failures. For an early exposition of this topic, see F. M. Bator, "The
Anatomy of Market Failure," Quarterly Journal of Economics, LXXII, August 1958.
See also K. Imai, H. Uzawa, R. Komiya, T. Negishi, and Y. Murakami, Price Theory
II (in Japanese), Tokyo, Iwanami, 1971, Chapter 7.
43. Shapley and Shubik [ 25], on the other hand, indicated that the core will exist in the
case of external economies if they are internalized by being listed as explicit com-
modities. This apparent asymmetry between external economies and diseconomies
makes their result highly suspect, or at least urges us to consider this problem further.
See discussions on Shapley and Shubik [ 25] by K. J. Arrow and T. Rader, in American
Economic Review, LX, May 1970, pp. 462-464. Arrow, for example, suspects that the
lack of the core in the above garbage game may really be due to a possible lack of con-
vexity in the production set, instead of the presence of an external diseconomy (p.
463).
44. Actually there are some important cases of market failures that we have not discussed
here. For example, "the market fails" as a result of certain "public goods" for which
the beneficiaries of these goods cannot be distinguished from the nonbeneficiaries
(the lack of "exclusion"), as a result of the lack of "future markets" for certain
commodities, or simply as a result of future generations being unable to participate in the
market.
45. The information may be indispensable for the production of a certain new com-
modity, or the information may provide a significant cost saving method of produc-
tion of an existing commodity.
46. The information may be protected by patent rights and the possessor of the right may
refuse to sell the information; or the possessor of the information may simply hide it,
for publication of the information through patent rights may cause his techniques to
be imitated.
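The balancedness condition of footnote 16 lends itself to a mechanical check. The following is a minimal sketch, assuming the weights are supplied by the user; the function names `indicator` and `is_balanced_with` are illustrative and do not come from the text. It verifies only that a given set of nonnegative weights satisfies the defining equation; it does not search for weights.

```python
from fractions import Fraction


def indicator(coalition, players):
    """Vector e_S: 1 for members of the coalition, 0 otherwise."""
    return [Fraction(1) if i in coalition else Fraction(0) for i in players]


def is_balanced_with(weights, players):
    """Check that sum over S of delta_S * e_S equals e_M.

    `weights` maps each coalition (a frozenset) to its weight delta_S,
    which must be nonnegative.
    """
    if any(w < 0 for w in weights.values()):
        return False
    total = [Fraction(0)] * len(players)
    for coalition, w in weights.items():
        e_s = indicator(coalition, players)
        total = [t + w * e for t, e in zip(total, e_s)]
    return all(t == 1 for t in total)


players = [1, 2, 3]
half = Fraction(1, 2)
pairs = {
    frozenset({1, 2}): half,
    frozenset({2, 3}): half,
    frozenset({1, 3}): half,
}
print(is_balanced_with(pairs, players))
```

With the three two-person coalitions each weighted 1/2, every player belongs to exactly two coalitions, so each coordinate of the weighted sum equals one, as the footnote states.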
REFERENCES
1. Arrow, K. J., and Hahn, F. H., General Competitive Analysis, San Francisco,
Holden-Day, 1971.
2. Aumann, R. J., "Markets with Continuum of Traders," Econometrica, 32, January-
April 1964.
3. Aumann, R. J., "A Survey of Cooperative Games without Side Payments," Essays in
Mathematical Economics in Honor of Oscar Morgenstern, ed. by M. Shubik, Princeton, N.J.,
Princeton University Press, 1967.
4. Aumann, R. J., "The Core of a Cooperative Game without Side Payments," Bulletin of the
American Mathematical Society, XCVIII, March 1961.
5. Aumann, R. J., and Peleg, B., "von Neumann-Morgenstern Solutions to Cooperative
THE TWO CLASSICAL PROPOSITIONS OF WELFARE ECONOMICS 233
28. Shubik, M., "Edgeworth Market Games," in Contributions to the Theory of Games,
IV, ed. by A. W. Tucker, and R. D. Luce, Princeton, N.J., Princeton University Press,
1959.
29. Takayama, A., International Trade-An Approach to the Theory, New York, Holt,
Rinehart and Winston, 1972.
30. Vind, K., "Edgeworth-Allocations in an Exchange Economy with Many Traders,"
International Economic Review, 5, May 1964.
31. Vind, K., "A Theorem on the Core of an Economy," Review of Economic Studies,
XXXII, January 1965.
32. von Neumann, J., and Morgenstern, O., Theory of Games and Economic Behavior, 3rd
ed., Princeton, N.J., Princeton University Press, 1953.
Section D
DEMAND THEORY
The purpose of this section is to study the theory of demand for a "competitive"
consumer. Traditionally (as explained in Hicks [5]), this theory is developed
by postulating for each consumer a preference ordering representable by a
real-valued "utility" function. Each consumer is supposed to maximize his utility
subject to his budget constraint. The maximality condition (the first-order
condition) provides the demand function that relates the individual's demand for a
commodity to the prices of all commodities and his income. A comparative statics
analysis with regard to the maximality condition will yield the Hicks-Slutsky
equation and the properties of the substitution terms. The other approach, which is due
to Samuelson [13], is called the revealed preference theory. This theory presupposes
neither the utility function nor the preference ordering. It goes directly to the
demand for commodities. If a certain bundle of commodities is actually purchased
by a certain consumer at a certain price vector, it is supposed to "reveal" that he
prefers this bundle of commodities to the bundles of goods which cost less than, or
the same amount as, the bundle purchased. Using the consistency condition which
is essentially due to this observation (later called the weak axiom of revealed
preference), Samuelson proved most of the properties of the demand function,
especially the properties of the "substitution terms." However, he failed to prove
the symmetry of the substitution matrix, which was later proved by Houthakker
[6] by imposing another condition (called the "strong axiom of revealed
preference"). The natural question which arises is, What is the relation between the
traditional approach and the revealed preference approach?
(i) Given a demand function, can we tell whether it could be induced by a utility
function? This question is called the integrability problem and has recently been
studied by Samuelson [14], Houthakker [6], Uzawa [15], and so forth. The
converse of this problem is the traditional analysis explained in Hicks [5]. An
excellent survey article on the ("local") integrability problem is now available in
Hurwicz [7].
(ii) Given a demand function, can we tell whether it could be induced by a preference
relation? Aspects of this problem have been studied by Uzawa [15] and Arrow
[1]. The converse of this problem was studied by McKenzie [8]. As discussed in
Section B of this chapter, we can deduce the utility function from a preference
relation under certain assumptions. Then the converse problem will be the same
as the converse of problem (i).
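The weak axiom of revealed preference described in the opening paragraph of this section can be tested on a finite set of observed purchases. The sketch below is an illustration only: the function name `violates_warp` and the sample data are hypothetical, and the pairwise test is the usual textbook statement of the axiom rather than a formulation quoted from Samuelson.

```python
def dot(p, x):
    """Inner product p . x (the cost of bundle x at prices p)."""
    return sum(pi * xi for pi, xi in zip(p, x))


def violates_warp(observations):
    """Return a pair of indices violating the weak axiom, or None.

    Each observation is (p, x): the bundle x was bought at prices p.
    x^a is revealed preferred to a different bundle x^b if x^b cost no
    more than x^a at prices p^a. The weak axiom forbids x^b from also
    being revealed preferred to x^a.
    """
    n = len(observations)
    for a in range(n):
        pa, xa = observations[a]
        for b in range(a + 1, n):
            pb, xb = observations[b]
            if xa == xb:
                continue
            a_over_b = dot(pa, xb) <= dot(pa, xa)
            b_over_a = dot(pb, xa) <= dot(pb, xb)
            if a_over_b and b_over_a:
                return (a, b)
    return None


# Hypothetical data: (price vector, chosen bundle).
consistent = [((1, 1), (2, 2)), ((1, 3), (4, 0))]
inconsistent = [((1, 2), (1, 2)), ((2, 1), (2, 1))]
print(violates_warp(consistent), violates_warp(inconsistent))
```

In the second data set each bundle was affordable when the other was chosen, so each is revealed preferred to the other and the axiom fails.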
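(A-2) (local nonsatiation) Let $x \in F(p, M)$. Then there exists a positive
number $\delta$ such that for any $\varepsilon$, $0 < \varepsilon < \delta$, there exists a point $x' \in B_\varepsilon(x)$ and $x' \in X$
with $x' \succ x$, where $B_\varepsilon(x)$ is an open ball about x with radius $\varepsilon$, and
$B_\delta(x) \cap (X \setminus x) \ne \emptyset$.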
REMARK: This is the same assumption which was adopted in the previous
section. The following concept was also used in the previous section.
REMARK: The idea of this lemma was used in the proof of Theorem 2.C. 1,
especially in the proof of the lemma preceding the theorem.
Theorem 2.D.2: Let $x \in F(p, M)$. If (A-2) holds for x, then $p \cdot x = M$. If (A-2)
holds for all $z \in F(p, M)$, then $p \cdot x = M_x(p)$.
PROOF:
(i) By definition of F(p, M), $p \cdot x \le M$. Suppose $p \cdot x < M$. Then, using
Lemma 2.D.2, (A-2) implies that there exists an $x' \in X$, such that $x' \succ x$
with $p \cdot x' < M$. This contradicts the definition of F(p, M). Hence $p \cdot x = M$.
(ii) By the definition of $M_x(p)$, $M_x(p) \le p \cdot x = M$. Suppose $M_x(p) < M$. Then
$p \cdot x'' < M$ for some $x'' \succsim x$. Then $x'' \in F(p, M)$. Hence by (A-2), there
exists $x' \in X$, such that $x' \succ x''$ with $p \cdot x' < M$. This contradicts the
definition of F(p, M). Therefore we have $M_x(p) = M$. (Q.E.D.)
REMARK: Theorem 2.D.2 means, among other things, that the local non-
satiation at a chosen point implies that all the income is spent.
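The two conclusions of Theorem 2.D.2 can be illustrated numerically. The sketch below is not part of the text: it assumes a consumption set given by a finite grid (quantities measured in tenths of a unit), prices (1, 2), an income of 10, and a preference ordering represented by the product $x_1 x_2$, which is an order-preserving transform of a Cobb-Douglas index. A grid satisfies local nonsatiation only approximately, so this is an illustration rather than a verification.

```python
from itertools import product

UNITS = range(0, 101)          # quantities in tenths of a unit: 0.0 .. 10.0
P = (1, 2)                     # assumed prices
M = 100                        # assumed income, in the same tenths (10.0)


def utility(x):
    # Product x1 * x2: an order-preserving transform of a Cobb-Douglas index.
    return x[0] * x[1]


def cost(x):
    return P[0] * x[0] + P[1] * x[1]


X = list(product(UNITS, UNITS))                      # consumption set (grid)
budget = [x for x in X if cost(x) <= M]              # budget set
chosen = max(budget, key=utility)                    # a point of F(p, M)
at_least_as_good = [x for x in X if utility(x) >= utility(chosen)]   # C_x
min_expenditure = min(cost(x) for x in at_least_as_good)             # M_x(p)

print(chosen, cost(chosen), min_expenditure)   # prints: (50, 25) 100 100
```

Both the expenditure on the chosen bundle and the minimum expenditure needed to do at least as well equal the income, in line with parts (i) and (ii) of the theorem.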
To study the continuity property of H(p), we introduce the following
assumptions.
(A-3) (interior point) The set X contains an interior point x.
(A-4) The set X is convex.
REMARK: As remarked before, assumption (A-3) amounts to the cheaper-point
assumption (that is, there exists an $\hat{x}$ in X such that $p \cdot \hat{x} < p \cdot \bar{x}$).
Assumption (A-4) implies perfect divisibility of every commodity. This is
a restrictive, although very useful, assumption.
We now explain important mathematical concepts, upper and lower semi-
continuity.
(i) Let $\{x^q\}$ be a sequence in X such that $x^q \to x^0$, and let $\{y^q\}$ be a sequence
in Y such that $y^q \in \phi(x^q)$. If $y^q \to y^0$ implies $y^0 \in \phi(x^0)$, then $\phi$ is called
upper semicontinuous at $x^0$.
(ii) Let $\{x^q\}$ be a sequence in X such that $x^q \to x^0$. If $y^0 \in \phi(x^0)$ implies "there
exists a sequence $\{y^q\}$ in Y such that $y^q \to y^0$ and $y^q \in \phi(x^q)$," then $\phi$ is called
lower semicontinuous at $x^0$.
(iii) The function $\phi$ is called continuous at $x^0$ if it is both upper semicontinuous and
lower semicontinuous at $x^0$.
REMARK: The above concepts are illustrated in Figure 2.23. The graph of
$\phi$ is the shaded region, boundary included; $\phi(x^0)$ is the interval [a, b].
[Figure 2.23. Panels: Upper semicontinuity; Lower semicontinuity.]
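The sequence definitions of upper and lower semicontinuity can be tried out on two simple interval-valued maps. The sketch below is an illustration under stated assumptions: the maps `phi_a` and `phi_b`, the test sequence $x^q = 1/q$, and the helper names are all hypothetical. Checking finitely many terms of one sequence cannot prove semicontinuity; it can only exhibit a failure or show that the obvious candidate sequence works.

```python
def phi_a(x):
    """Interval-valued map: [0, 1] at x = 0 and {0} elsewhere."""
    return (0.0, 1.0) if x == 0 else (0.0, 0.0)


def phi_b(x):
    """Interval-valued map: {0} at x = 0 and [0, 1] elsewhere."""
    return (0.0, 0.0) if x == 0 else (0.0, 1.0)


def contains(interval, y):
    lo, hi = interval
    return lo <= y <= hi


def breaks_upper(phi, x0, xs, y):
    """True if the constant sequence y^q = y lies in phi(x^q) for every
    x^q in xs (taken to converge to x0) while y itself is not in phi(x0)."""
    return all(contains(phi(x), y) for x in xs) and not contains(phi(x0), y)


def gap_lower(phi, xs, y0):
    """Distance from y0 to phi(x^q) at the last x^q in xs. Lower
    semicontinuity requires this gap to vanish for each y0 in phi(x0)."""
    lo, hi = phi(xs[-1])
    nearest = min(max(y0, lo), hi)
    return abs(nearest - y0)


xs = [1.0 / q for q in range(1, 1001)]   # x^q -> 0

print(breaks_upper(phi_a, 0, xs, 1.0), gap_lower(phi_a, xs, 1.0))
print(breaks_upper(phi_b, 0, xs, 1.0), gap_lower(phi_b, xs, 0.0))
```

The first map has a closed graph but its value at 0 cannot be approached from nearby values, so it fails lower semicontinuity at 0; the second map can be approached but its graph is not closed at 0, so it fails upper semicontinuity there.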
(i) The function $\phi$ is upper semicontinuous on X if and only if its graph,
$\{(x, y): x \in X \text{ and } y \in \phi(x)\}$, is closed.
(ii) An upper (resp. lower) semicontinuous function of a continuous function is
upper (resp. lower) semicontinuous.
(iii) The Cartesian product of upper (resp. lower) semicontinuous functions $\phi_j$,
that is, $(\phi_1, \phi_2, \ldots, \phi_m)$, is also upper (resp. lower) semicontinuous.
Theorem 2.D.3: The function H(p) is lower semicontinuous for $p \ge 0$, under (A-3)
and (A-4).6
PROOF: By (A-3) there exists an $\hat{x} \in X$, and $p \cdot \hat{x} < p \cdot \bar{x}$. Consider
a sequence $\{p^q\}$ with $p^q \to p$. Then $p^q \cdot \hat{x} < p^q \cdot \bar{x}$ for q large enough. Let z
be an arbitrary point of H(p); we want to find a sequence $\{z^q\}$, $z^q \in H(p^q)$,
such that $z^q \to z \in H(p)$, as $p^q \to p$. For large enough q, we define
$z^q \equiv t^q z + (1 - t^q)\hat{x}$ where $t^q$ is maximal for $t^q \in [0, 1]$ such that $z^q \in H(p^q)$. We claim
such a $\{z^q\}$ is the sequence we want to find. That is, we want to show
$z^q \to z$ as $p^q \to p$. Note that $z^q \to z$ if and only if $t^q \to 1$. (Hence if we show
$t^q \to 1$ as $p^q \to p$, we are done.) If $t^q = 1$ (that is, if $p^q \cdot z \le p^q \cdot \bar{x}$) for large
enough q, we are done. Hence it suffices to consider the case in which
$t^q < 1$ for large enough q. Note that $t^q < 1$ implies $p^q \cdot z^q = p^q \cdot \bar{x}$, for
q large enough, or else $t^q$ is not a maximum. Suppose $t^q \not\to 1$. Since the
interval [0, 1] is a compact set, there exists a subsequence of $\{t^q\}$ (say,
$\{t^s\}$) such that $t^s \to t$, where $0 \le t < 1$. Since t < 1, $p^s \cdot z^s = p^s \cdot \bar{x}$ for
sufficiently large s. Write $z^s \equiv t^s z + (1 - t^s)\hat{x}$. Since $t^s \to t$, $z^s \to z^\infty$, where
$z^\infty \equiv tz + (1 - t)\hat{x}$. Since $p^s \cdot z^s = p^s \cdot \bar{x}$ for large s, we have
(*) $t\, p \cdot z + (1 - t)\, p \cdot \hat{x} = p \cdot z^\infty = p \cdot \bar{x}$
as $s \to \infty$. But since $p \cdot \hat{x} < p \cdot \bar{x}$, (*) yields $p \cdot z > p \cdot \bar{x}$, contradicting
$z \in H(p)$. Hence we must have $t^q \to 1$. (Q.E.D.)
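The construction used in the proof can be traced on a small example. The following sketch is illustrative only: the endowment, the cheaper point, the target point z, and the price sequence are all assumed values, and the function name `maximal_t` is hypothetical. It computes the largest weight $t^q$ that keeps the convex combination of z and the cheaper point inside the budget set at prices $p^q$, using exact rational arithmetic.

```python
from fractions import Fraction as Fr


def dot(p, x):
    return sum(pi * xi for pi, xi in zip(p, x))


def maximal_t(p, z, x_hat, x_bar):
    """Largest t in [0, 1] with p.(t z + (1 - t) x_hat) <= p.x_bar.

    Assumes x_hat is a cheaper point at p, that is, p.x_hat < p.x_bar.
    """
    income, c_z, c_hat = dot(p, x_bar), dot(p, z), dot(p, x_hat)
    if c_z <= income:
        return Fr(1)
    return (income - c_hat) / (c_z - c_hat)


x_bar = (Fr(1), Fr(1))          # assumed endowment; income is p.x_bar
x_hat = (Fr(1, 2), Fr(1, 2))    # assumed cheaper point
z = (Fr(2), Fr(0))              # a point of H(p) for p = (1, 1)

for q in (1, 10, 100, 1000):
    p_q = (1 + Fr(1, q), Fr(1))             # p^q -> p = (1, 1)
    t_q = maximal_t(p_q, z, x_hat, x_bar)
    z_q = tuple(t_q * a + (1 - t_q) * b for a, b in zip(z, x_hat))
    print(q, t_q, dot(p_q, z_q) == dot(p_q, x_bar))
```

In this example $t^q = (2q + 1)/(2q + 3)$, which tends to 1, and whenever $t^q < 1$ the point $z^q$ exhausts the budget at $p^q$, as the proof asserts.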
REMARK: The graph of H(p) is obviously closed. Hence H(p) is upper
semicontinuous.7 Thus from the above theorem, H(p) is in fact continuous.
REMARK: Note that the cheaper-point assumption (or the interior-point
assumption) plays a crucial role in the above theorem. If the consumer
starves to death when the price moves beyond a certain point (hence no
"cheaper point" in his consumption set), his budget function H(p) would
become discontinuous.
Write F(p) for F(p, M) where $M = p \cdot \bar{x}$. Then we can prove the following
theorem.
Theorem 2.D.4: The demand function F(p) is upper semicontinuous with respect
to p, if H(p) is lower semicontinuous in p, and (A-1) holds.
PROOF: Consider a sequence $\{p^q\}$ such that $p^q \to p$, as $q \to \infty$. Let z be an
arbitrary point of H(p). Then as a result of the lower semicontinuity of
H(p), there exists a sequence $\{z^q\}$ such that $z^q \in H(p^q)$ and $z^q \to z$. Let
$x^q \in F(p^q)$. Then by the definition of F, $x^q \succsim z^q$. When a different element
z is chosen from H(p), we have a different sequence $\{z^q\}$. But whatever
the sequence, $x^q \succsim z^q$ always holds by the definition of F and $z^q \in H(p^q)$.
Owing to the compactness of X, there is a subsequence of $\{x^q\}$ (say, $\{x^s\}$)
such that $x^s \to x'$ where $x' \in X$; $p^s \cdot x^s \le p^s \cdot \bar{x}$ implies $p \cdot x' \le p \cdot \bar{x}$ (take the
limit of $s \to \infty$). Thus $x' \in H(p)$. From (A-1) (the continuity of $\succsim$), $x^s \succsim z^s$
implies $x' \succsim z$. Since this holds, whatever the choice of z, $x' \in F(p)$. This,
together with the compactness of the range space X, proves the theorem.
(Q.E.D.)
Lemma 2.D.3: Let $x = F(p, M)$ and $x' = F[p', M_x(p')]$ with $x' \ne x$. Suppose that
(A-3') holds at x'. Suppose also that (A-1) holds. Then if (p', x') lies sufficiently close
to (p, x), $x \sim x'$.
PROOF:
(i) ($x' \succsim x$): By definition of F, $p' \cdot x' \le M_x(p')$. As we remarked in the
definition of the minimum expenditure function, there exists $z \in C_x$ such
that $p' \cdot z = M_x(p')$, since $C_x$ is compact. [In other words, z can be purchased
with income $M_x(p')$.] From the definition of x', $x' \succsim z$. Hence
$x' \in C_x$ or $x' \succsim x$.
(ii) ($x \succsim x'$): Since (p', x') is sufficiently close to (p, x), by (A-3'), there
exists an $\hat{x} \in X$ such that $x^q \equiv [t^q \hat{x} + (1 - t^q)x'] \in X$ with
$p' \cdot x^q < p' \cdot x'$, for all $0 < t^q < 1$. [Note that $p' \cdot \hat{x} < p' \cdot x'$ implies $p' \cdot x^q < p' \cdot x'$
for all $0 < t^q < 1$, since $p' \cdot x^q = t^q p' \cdot \hat{x} + (1 - t^q) p' \cdot x'$.] But $p' \cdot x'$
Theorem 2.D.6: If $f_x(p)$ is differentiable and if (A-1), (A-2), and (A-3') hold, then
$$p \cdot \frac{\partial f_x(p)}{\partial p_j} = 0, \qquad j = 1, 2, \ldots, n$$
PROOF: Let $z = f_x(p)$, and let $z' = f_x(p')$ where (p', z') is sufficiently close
to (p, z). Then by Lemma 2.D.3, $z' \sim z$, so that $z' \in C_x$. But by the definition
of $M_x(p)$, $M_x(p) \le p \cdot z'$ for all $z' \in C_x$. Since (A-2) holds, $M_x(p) = p \cdot z$
by Theorem 2.D.2. Therefore $p \cdot z \le p \cdot z'$ for all $z' \in C_x$ or $p \cdot f_x(p) \le p \cdot f_x(p')$
for all p', where (p', z') is sufficiently close to (p, z). In other words,
for a fixed p, $p \cdot f_x(p')$ is minimized with respect to p' at p. Hence using the
first-order characterization of a minimum, we obtain:
$$p \cdot \frac{\partial f_x(p')}{\partial p'_j} = 0 \ \text{ at } p' = p, \qquad j = 1, 2, \ldots, n$$
(Q.E.D.)
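The conclusion of Theorem 2.D.6 can be checked numerically for a specific preference ordering. The sketch below is an assumption-laden illustration, not the text's method: it takes preferences representable by a Cobb-Douglas index $x_1^a x_2^{1-a}$ with a = 0.3, fixes an indifference level of 2, uses the closed-form cost-minimizing bundle for that index, and estimates the price-weighted derivatives by central differences.

```python
A = 0.3          # assumed Cobb-Douglas exponent
U_BAR = 2.0      # assumed utility level of the indifference set


def compensated_demand(p):
    """Cost-minimizing bundle for u(x) = x1**A * x2**(1 - A) at level U_BAR."""
    p1, p2 = p
    ratio = (A * p2) / ((1 - A) * p1)       # optimal x1 / x2
    return (U_BAR * ratio ** (1 - A), U_BAR * ratio ** (-A))


def price_weighted_derivative(p, j, h=1e-6):
    """Central-difference estimate of sum_i p_i * d f_i / d p_j."""
    up = list(p)
    up[j] += h
    dn = list(p)
    dn[j] -= h
    f_up, f_dn = compensated_demand(up), compensated_demand(dn)
    return sum(pi * (a - b) / (2 * h) for pi, a, b in zip(p, f_up, f_dn))


p = (1.5, 2.0)
print([price_weighted_derivative(p, j) for j in range(2)])
```

Both estimates come out at zero up to finite-difference error, which is what the first-order condition in the proof requires.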
REMARK: It may so happen that $f_x(p)$ [hence $M_x(p)$ also] is not differentiable.
This happens, for example, when there is a "kink" in the indifference
curve. In Figure 2.28 (compare Hurwicz [7], p. 196), there is a kink
at the point $\bar{x}$ in the sense that there are two tangent lines to the indifference
curve $\alpha$ at the point $\bar{x}$.
[Figure 2.28. Nondifferentiable $f_x(p)$. Axes: Commodity 1, Commodity 2.]
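Theorem 2.D.7: If (A-1), (A-2), and (A-3') hold at $x = f(p)$, where $f(p) \equiv f_x(p)$, and
f(p) is differentiable at p, then,
$$\frac{\partial f_i(p)}{\partial p_j} = \frac{\partial f_j(p)}{\partial p_i}$$
PROOF: (i) By Theorem 2.D.2, $M_x(p) = p \cdot x = p \cdot F(p, M) = p \cdot f(p)$.
Hence $\partial M_x(p)/\partial p_i = \partial [p \cdot f(p)]/\partial p_i = f_i(p) + p \cdot \partial f/\partial p_i = f_i(p)$, by the
previous theorem. That $\partial f_i(p)/\partial p_j = \partial^2 M_x(p)/\partial p_i \partial p_j$ follows immediately
from this. If $M_x(p)$ is twice continuously differentiable at p, then clearly
$\partial f_i(p)/\partial p_j = \partial f_j(p)/\partial p_i$ (Q.E.D.)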
REMARK: The partial derivative $\partial f_i/\partial p_j$ signifies the rate at which the
consumer varies the consumption of the ith good per unit change of the jth
price when income changes are made at the same time and by a proper
magnitude to keep the consumer on the same indifference locus; $\partial f_i/\partial p_j$ is
called the substitution term by Hicks ([5], p. 103).
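The two facts used in the proof of Theorem 2.D.7, namely that the derivative of the minimum expenditure function with respect to the ith price equals the ith compensated demand and that the substitution terms are symmetric, can be checked by finite differences. The sketch below assumes a Cobb-Douglas index with exponent 0.3 and a utility level of 2; these values and the helper names are illustrative only.

```python
A, U_BAR = 0.3, 2.0   # assumed exponent and utility level


def f(p):
    """Compensated demand for u(x) = x1**A * x2**(1 - A) at level U_BAR."""
    ratio = (A * p[1]) / ((1 - A) * p[0])
    return (U_BAR * ratio ** (1 - A), U_BAR * ratio ** (-A))


def min_expenditure(p):
    """M_x(p) = p . f(p) for the assumed preferences."""
    return sum(pi * xi for pi, xi in zip(p, f(p)))


def partial(func, p, j, h=1e-5):
    """Central difference of func with respect to p_j."""
    up = list(p)
    up[j] += h
    dn = list(p)
    dn[j] -= h
    return (func(up) - func(dn)) / (2 * h)


p = (1.5, 2.0)
# Gap between dM_x/dp_i and f_i(p); should be close to zero.
expenditure_gap = [partial(min_expenditure, p, i) - f(p)[i] for i in range(2)]
cross_12 = partial(lambda q: f(q)[0], p, 1)   # d f_1 / d p_2
cross_21 = partial(lambda q: f(q)[1], p, 0)   # d f_2 / d p_1
print(expenditure_gap, cross_12, cross_21)
```

The two cross derivatives agree up to numerical error, which is the symmetry of the substitution terms asserted by the theorem.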
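Theorem 2.D.10: Suppose that (A-2) and Theorem 2.D.7 hold at x = F(p, M) and
that F is differentiable in p and M. Then
$$\frac{\partial F_i(p, M)}{\partial p_j} = \frac{\partial f_i(p)}{\partial p_j} - f_j(p) \frac{\partial F_i(p, M)}{\partial M}$$
where $f(p) \equiv f_x(p)$.
PROOF: Since (A-2) holds, Theorem 2.D.2 holds, so that $p \cdot x = M = M_x(p)$.
Therefore
$$\frac{\partial f_i(p)}{\partial p_j} = \frac{\partial F_i[p, M_x(p)]}{\partial p_j} = \frac{\partial F_i(p, M)}{\partial p_j} + \frac{\partial F_i(p, M)}{\partial M} \frac{\partial M_x(p)}{\partial p_j}$$
$$= \frac{\partial F_i(p, M)}{\partial p_j} + f_j(p) \frac{\partial F_i(p, M)}{\partial M} \quad \text{(by Theorem 2.D.7)}$$
(Q.E.D.)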
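The equation of Theorem 2.D.10 can be checked numerically once closed forms for both demand functions are available. The sketch below assumes Cobb-Douglas preferences with exponent 0.3 and a utility level of 2, for which the ordinary demand is $F_1 = aM/p_1$, $F_2 = (1 - a)M/p_2$; these choices and the function name `slutsky_gap` are illustrative and not taken from the text.

```python
A, U_BAR = 0.3, 2.0   # assumed exponent and utility level


def f(p):
    """Compensated demand at utility U_BAR for u = x1**A * x2**(1 - A)."""
    ratio = (A * p[1]) / ((1 - A) * p[0])
    return (U_BAR * ratio ** (1 - A), U_BAR * ratio ** (-A))


def F(p, m):
    """Ordinary Cobb-Douglas demand at prices p and income m."""
    return (A * m / p[0], (1 - A) * m / p[1])


def slutsky_gap(p, i, j, h=1e-5):
    """dF_i/dp_j - [df_i/dp_j - f_j * dF_i/dM], by central differences."""
    m = sum(pk * xk for pk, xk in zip(p, f(p)))      # M = M_x(p)
    up = list(p)
    up[j] += h
    dn = list(p)
    dn[j] -= h
    dF_dp = (F(up, m)[i] - F(dn, m)[i]) / (2 * h)
    df_dp = (f(up)[i] - f(dn)[i]) / (2 * h)
    dF_dm = (F(p, m + h)[i] - F(p, m - h)[i]) / (2 * h)
    return dF_dp - (df_dp - f(p)[j] * dF_dm)


p = (1.5, 2.0)
print([[slutsky_gap(p, i, j) for j in range(2)] for i in range(2)])
```

All four gaps vanish up to finite-difference error, so the ordinary price effect equals the substitution term minus the income term for each pair of goods in this example.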
FOOTNOTES
1. For the subject matter of this section, we have relied heavily on McKenzie [8] and
his lectures at the University of Rochester. An exposition of McKenzie's approach is
also seen in S. Karlin, Mathematical Methods and Theory in Games, Programming,
and Economics, Vol. I, Reading, Mass., Addison-Wesley, 1959, pp. 271-273. For
a more complete exposition of (static) demand theory, see D. W. Katzner, Static
Demand Theory, New York, Macmillan, 1970.
2. The compactness of X is assumed just for the simplicity of exposition. It can be
weakened. For example, it suffices to assume that X is closed and bounded from
below. This is due to the fact that X can be restricted to a set which is bounded
from above as a result of the budget constraint (that is, a finite income).
3. That is, we assume that the relation $\succsim$ is reflexive, transitive, and total. However,
we may note that the transitivity axiom is not essential in obtaining many results
in this section. In other words, in many results, it suffices to regard $\succsim$ only as a
binary relation on X, which is total. Needless to say, the transitivity axiom is needed
in obtaining some results here (such as Lemma 2.D.3 and Theorem 2.D.6).
4. The crucial underlying fact here is the assumption that some point-say, x-is
"chosen" [that is, F(p, M) is nonempty]. The power of this axiom of selection
in demand theory is well illustrated in the theory of revealed preference. Starting
from preference orderings, it is possible that such a choice is impossible. Here we
may recall Sonnenschein's example (quoted in Section B of this chapter): Assume
that the budget set consists of only three points x, y, and z, and suppose that our
consumer's preference is $x \succ y \succ z$ but $z \succ x$ (the case of intransitivity). Here no
choice is possible. Needless to say, if $\succ$ is intransitive, then $\succsim$ is also intransitive.
5. The minimum expenditure function $M_x(p)$ plays a crucial role in McKenzie's
approach to demand theory. As will be shown later, $M_x(p)$ turns out to be a concave
function, hence its Hessian matrix is negative semidefinite. It will also be pointed
out later that the elements of this Hessian matrix correspond to the effect of a
compensated price change on demand, that is, the substitution terms in the Hicks-
Slutsky theory. In other words, the discovery of the crucial role played by the
minimum expenditure function in demand theory is one of the important contribu-
tions of McKenzie [8].
6. Similarly, we can prove the lower semicontinuity of H(p, M). Such a proof is given
by Debreu [4], pp. 63-65. Our proof is due to Lionel McKenzie.
7. The function H(p) is upper semicontinuous if its graph is closed and if its range
space X is compact. Similarly, we can establish the lower semicontinuity and hence
continuity of H(p, M).
8. Since every concave function is continuous in the interior of the domain (compare
Theorem 1.B.1), $M_x(p)$ is continuous for all p > 0.
9. To see this, suppose the contrary. That is, suppose that for some $\varepsilon > 0$,
$M_x(p) < p \cdot z - \varepsilon$ for all $z \in C_x$. Let $\hat{z}$ be a point in $C_x$ such that $p \cdot \hat{z} = M_x(p)$. Then we
have $p \cdot \hat{z} < p \cdot \hat{z} - \varepsilon$, which is a contradiction.
10. It is known that every concave function is differentiable almost everywhere (that
is, except for sets of measure zero). See, for example, W. Fenchel, Convex Cones,
Sets, and Functions, Princeton University, September 1953 (hectographed).
11. Note that (iv) can also be obtained from (i) and (ii). Similarly, (i) can also be obtained
from (ii) and (iv).
REFERENCES
1. Arrow, K. J., "Rational Choice Functions and Orderings," Economica, N.S., 26,
May 1959.
2. Berge, C., Topological Spaces, New York, Macmillan, 1963 (French, 1959).
3. Chipman, J. S., Hurwicz, L., Richter, M. K., and Sonnenschein, H. F., eds., Prefer-
ences, Utility, and Demand: A Minnesota Symposium, New York, Harcourt Brace
Jovanovich, 1971.
4. Debreu, G., Theory of Value, New York, Wiley, 1959.
5. Hicks, J. R., Value and Capital, 2nd ed., Oxford, Clarendon Press, 1946.
6. Houthakker, H. S., "Revealed Preference and the Utility Function," Economica,
N.S., 17, May 1950.
7. Hurwicz, L., "On the Problem of Integrability of Demand Functions," in Preferences,
Utility, and Demand, New York, Harcourt Brace Jovanovich, 1971, chap. 9.
8. McKenzie, L. W., "Demand Theory without a Utility Index," Review of Economic
Studies, XXIV, June 1957.
9. McKenzie, L. W., "Further Comments," Review of Economic Studies, XXV, June 1958.
10. Newman, P. K., and Read, R. C., "Demand Theory without a Utility Index;
Comment," Review of Economic Studies, XXV, June 1958.
11. Newman, P. K., The Theory of Exchange, Englewood Cliffs, N.J., Prentice-Hall, 1965.
12. Richter, M. K., "Revealed Preference Theory," Econometrica, 34, July 1966.
13. Samuelson, P. A., Foundations of Economic Analysis, Cambridge, Mass., Harvard
University Press, 1947.
14. Samuelson, P. A., "The Problem of Integrability in Utility Theory," Economica, N.S., XVII,
November 1950.
15. Uzawa, H., "Preferences and Rational Choice in the Theory of Consumption,"
in Mathematical Methods in the Social Sciences, ed. by Arrow, Karlin, and Suppes,
Stanford, Calif., Stanford University Press, 1960. A revised version of this paper is
included in [3], chapter 1.
16. Yokoyama, T., "A Logical Foundation of the Theory of Consumer's Demand,"
Osaka Economic Papers, 2, 1953.
Definition:
(i) The function $\Phi$ is called upper semicontinuous (abbreviated u.s.c.) at $x^0$ if
for each open set V containing $\Phi(x^0)$ there exists a neighborhood $N(x^0)$ (or
an open set containing $x^0$) such that
$x \in N(x^0)$ implies $\Phi(x) \subset V$.
Definition:
(i) We say that $\Phi$ is closed in X if, for each $x^0 \in X$, "$x^q \to x^0$, $y^q \to y^0$, where
$x^q \in X$, $y^q \in \Phi(x^q)$" implies "$y^0 \in \Phi(x^0)$."
(ii) We say that $\Phi$ is G-closed in X if the graph of $\Phi$, $\{(x, y): x \in X, y \in \Phi(x)\}$, is
closed in $X \times Y$.
(iii) We say that $\Phi$ is quasi upper semicontinuous (abbreviated q.u.s.c.) in X if $\Phi$
is u.s.c. at each x in X.
(iv) We say that $\Phi$ is upper semicontinuous (u.s.c.) in X if $\Phi$ is q.u.s.c. in X and $\Phi(x)$
is compact for each x in X.
252 THE THEORY OF COMPETITIVE MARKETS
Theorem 2.D.12: The function $\Phi$ is u.s.c. in X if and only if $\Phi$ is closed in X and Y is
compact.
PROOF: See Berge [1], p. 112 (corollary of theorem 7), and Moore [4],
lemma 1-d and lemma 2.
REMARK: It is important to note that the compactness of Y is crucial here.
It is not accidental that in Debreu's definition of upper semicontinuity
([2], p. 17) and our definition in Chapter 2, Section D, Y is assumed to be
compact.
REMARK: Theorem 2.D.12 implies that every u.s.c. function is closed.
REMARK: Berge also proved that if $\Phi$ is u.s.c. in X, then $\Phi(A)$ is compact in
Y whenever A is compact in X ([1], p. 110).
REMARK: For the lower semicontinuity, we may conjecture that $\Phi$ is l.s.c.
in X if and only if for each $x^0 \in X$, "$y^0 \in \Phi(x^0)$" implies that there exists a
sequence $\{y^q\}$ in Y such that $y^q \to y^0$.
dorff spaces. Berge ([ 1 ] , pp. 115-116) stated and proved the following theorem
which has many important applications.
FOOTNOTES
1. In the material of this Appendix, we have relied heavily on Berge [1] and Moore
[4].
2. Let $X_2$ be the set of positions in which Black can move and $X_0$ be the set of positions
of checkmate or stalemate. Clearly X is the union of $X_1$, $X_2$, and $X_0$; $\Phi$ is the mapping of
$X \setminus X_0$ into X.
3. It is assumed that X and Y satisfy the "first axiom of countability." A few definitions
may be recalled here from Chapter 0, Section A. A topological space is said to satisfy
the first axiom of countability if it has a countable open base at each of its points. An
open base is a class of open sets such that every open set is a union of sets in this class.
THE EXISTENCE OF COMPETITIVE EQUILIBRIUM 255
REFERENCES
1. Berge, C., Topological Spaces, tr. by Patterson, New York, Macmillan, 1963 (French
original, 1959).
2. Debreu, G., Theory of Value, New York, Wiley, 1959.
3. Karlin, S., Mathematical Methods and Theory in Games, Programming, and Economics,
Vol. I, Reading, Mass., Addison-Wesley, 1959.
4. Moore, J. C., "A Note on Point-Set Mappings," in Papers in Quantitative Economics,
Vol. 1, ed. by J. P. Quirk, and A. M. Zarley, Lawrence, Kansas, University of Kansas
Press, 1968.
Section E
THE EXISTENCE OF
COMPETITIVE EQUILIBRIUM
a. HISTORICAL BACKGROUND
An economic model is constructed by specifying the economic agents
involved, their behavioral rules, and the various equilibrium relations. The model
is called a general equilibrium model if all the equilibrium relations in the model
are specified. It is called a partial equilibrium model if only a part of the equilibrium
relations is specified. The unspecified portion then is covered by the assumption
that "other things are equal." A partial equilibrium analysis is convenient for a
deeper analysis of some particular segment of the economy. However, it should be
realized that any partial equilibrium analysis presupposes a general equilibrium
analysis. For without knowing precisely under what conditions "other things are
equal," the partial equilibrium analysis is rather meaningless.
Full recognition of the importance of general equilibrium analysis and the
construction of the first general equilibrium model of a national economy is
attributable to Leon Walras [42]. Moreover, Walras stated his general equilibrium
system in mathematical forms whose impact on modern economic theory
is immense. The model of a competitive equilibrium that we have considered so
far in this chapter is an outgrowth of the Walrasian general equilibrium model.
The important properties of such a general equilibrium-the optimality, the
existence, and the stability of the equilibrium-have already been considered
by Walras. Although his consideration was not too satisfactory from the present
point of view, he was very much ahead of his time. The Walrasian construction of
general equilibrium models goes from a simple model to more complicated
models.1
We illustrate his model and his consideration of the existence of an equilibrium
by using his model with production2 ([42], part IV). (We use our own
notation.) Let $a_{ij}$ be the amount of the ith productive resource necessary to
produce one unit of the jth commodity (good or service). Let $x_j$, j = 1, 2, ..., n,
be the output of the jth commodity in the economy. Let $v_i$, i = 1, 2, ..., m, be the
amount of the ith factor made available in the economy. Let p be an n-vector which
gives the prices of the commodities that prevail in the economy, and let w be an
m-vector that gives the prices of the factors. The jth component of p is denoted
by $p_j$ and the ith component of w is denoted by $w_i$. The demand for the jth
commodity is a function of p and w. Similarly, the supply of the ith factor is a
function of p and w. Thus the Walrasian general equilibrium system of competitive
markets with production can be summarized by the following system of simul-
taneous equations.
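(1) $\sum_{j=1}^{n} a_{ij} x_j = v_i$, i = 1, 2, ..., m
(2) $\sum_{i=1}^{m} w_i a_{ij} = p_j$, j = 1, 2, ..., n
(3) $v_i = v_i(p, w)$, i = 1, 2, ..., m
(4) $x_j = x_j(p, w)$, j = 1, 2, ..., n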
Equation (1) determines the total demand for each factor, of which the supply is
given by (3). The demand for each commodity is given by (4). Note that the same
notation is used to denote the demand for and the supply of each factor and each
commodity (that is, $v_i$ and $x_j$). This implicitly assumes the equilibrium relation
(the demand for each factor is equal to its supply and the demand for each commodity
is equal to its supply). Equation (2) denotes the familiar profit condition
which states that under perfect competition, profit is eliminated. Although it is not
made explicit in the above model, Walras derived the above relations from the
behavioral rules of the economic agents (competitive consumers and competitive
producers). The market demand for commodity j was obtained by adding up the
individual demands for commodity j over all consumers. Each consumer's
demand for j was obtained by assuming that he maximizes his utility subject to his
budget condition. The $a_{ij}$ are the coefficients of production (or "coefficients
of fabrication"). Walras initially assumed that they were constant but later (in
the third edition; see lesson 36 of the fourth edition) showed that they were
determined by the cost minimization behavior of each producer.3 Alternatively the
$a_{ij}$ can be determined by the profit maximization behavior of each producer.
Hence in the above system of equations we may consider $a_{ij} = a_{ij}(p, w)$.
Altogether there are (2m + 2n) equations in the above system, and there are
(2m + 2n) variables to be determined in the system (that is, $p_j$, $x_j$, j = 1, 2, ..., n;
$w_i$, $v_i$, i = 1, 2, ..., m). The price of one of the commodities or factors can be used as
a numeraire to measure the relative prices of the other commodities and factors.4
Letting $p_1 = 1$, we thus reduce the number of variables by one. Then Walras
showed that there can be only (2m + 2n - 1) independent equations in the above
system, for one of the variables can be obtained from the identity, which was later
called Walras' Law.5
$$\sum_{j=1}^{n} p_j x_j = \sum_{i=1}^{m} w_i v_i$$
$$x_1 = -\sum_{j=2}^{n} p_j x_j + \sum_{i=1}^{m} w_i v_i$$
$$x^2 = -1$$
Even a simpler example would be x = -1, where x denotes "output."6 Hence
Walras' method of counting the number of equations and the number of variables
is quite unsatisfactory.7 Although this method often gives a necessary condition
for the existence of an equilibrium solution,8 it is not a sufficient condition.
Although the above Walrasian procedure of showing the existence of an
equilibrium solution is unsatisfactory, it was accepted for a long time without
question.9 The first satisfactory treatment came in the 1930s from Karl Menger's
seminar in Vienna. One of the most important contributions made here was the
reformulation of the Walras-Cassel system allowing inequalities.10 Based upon
such reformulations, in particular the ones due to Schlesinger [34] and Zeuthen
[43], Abraham Wald [39] 11 gave the first satisfactory and rigorous proof of the
existence of an equilibrium solution. Alternative proofs of the existence of an equilibrium
solution for Schlesinger's reformulation of the Walras-Cassel model have
recently been given independently by Kuhn [19] and DOSSO [13]. The proofs by
Kuhn and DOSSO are essentially similar in the sense that they are based on the
idea of utilizing the duality theorem of linear programming.
Schlesinger's reformulation of the Walras-Cassel system can be described as
follows (in terms of the above notation):
(5) $\sum_{j=1}^{n} a_{ij} x_j \le v_i$, i = 1, 2, ..., m
(6) $\sum_{j=1}^{n} a_{ij} x_j < v_i$ implies $w_i = 0$
(7) $\sum_{i=1}^{m} w_i a_{ij} = p_j$, j = 1, 2, ..., n
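Conditions (5) to (7) can be checked mechanically for any candidate triple (x, w, p). The following is only a minimal sketch: the function name `satisfies_schlesinger`, the tolerance, and the two-factor, two-commodity data are hypothetical illustrations, not part of Schlesinger's or Wald's formulation.

```python
def satisfies_schlesinger(A, v, x, w, p, tol=1e-9):
    """Check conditions (5)-(7) for a candidate (x, w, p).

    A[i][j] is the amount of factor i needed per unit of commodity j.
    """
    m, n = len(A), len(A[0])
    for i in range(m):
        used = sum(A[i][j] * x[j] for j in range(n))
        if used > v[i] + tol:
            return False  # (5) fails: factor i is over-used
        if used < v[i] - tol and abs(w[i]) > tol:
            return False  # (6) fails: a factor in excess supply has a nonzero price
    for j in range(n):
        unit_cost = sum(w[i] * A[i][j] for i in range(m))
        if abs(unit_cost - p[j]) > tol:
            return False  # (7) fails: price differs from unit cost
    return True


# Hypothetical data: factor 2 is abundant, so it must be a free good.
A = [[1.0, 1.0],
     [1.0, 0.0]]
v = [4.0, 10.0]
x = [2.0, 2.0]
w = [3.0, 0.0]
p = [3.0, 3.0]

print(satisfies_schlesinger(A, v, x, w, p))           # scarce factor priced, free factor at zero
print(satisfies_schlesinger(A, v, x, [3.0, 1.0], p))  # violates (6)
```

In this example the value of output and the value of factors coincide (both equal 12), which is the Walras' Law identity in this setting.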
Assumption (iii) precludes the Land of Cockaigne. Assumption (v) says that
if the demand for the jth commodity goes to zero, its price goes to infinity. This
means that the demand for each commodity will never be zero for any (finite) price,
however large. This assumption is clearly unrealistic. It is introduced primarily to
facilitate the proof. Assumption (vi) is needed in the proof of the uniqueness of the
equilibrium solution.
Since Wald's original proof is rather tedious, we will sketch the proof by
Kuhn [19] and DOSSO [13],13 which should be of interest in itself because of its
relation to the theory of linear programming.
Let $X = \{x: x \ge 0, A \cdot x \le v\}$, where $A = [a_{ij}]$ (the feasible set). We can
easily show that X is nonempty, compact, and convex. Then consider the following
linear programming problem:
Maximize: $p \cdot x$
Subject to: $A \cdot x \le v$, and $x \ge 0$
Define $p = f(x)$ for all x > 0 such that $x \in X$. For a fixed value of x, we first obtain
the value of p and then solve the above linear programming problem with this value
of p. We obtain a solution $x^*$ (which is obviously not necessarily unique). We now
have a mapping $x \to p \to x^*$ which we denote by F(x). It is a function from the interior
of X into X. Extend this mapping to the whole X and denote it by $\Phi(x)$; that is,
$\Phi(x) = F(x)$ for all x > 0. The extension can be achieved by taking the closure of
the graph of F in $X \times X$ (see Kuhn [19], pp. 269-270). Using the continuity of f(x),
we can show that $\Phi(x)$ is upper semicontinuous and $\Phi(x)$ is nonempty and convex,
for all $x \in X$. Now we use the following theorem, known as the Kakutani fixed point
theorem.
Theorem 2.E.1 (Kakutani): Let S be a nonempty, compact, convex subset of $R^n$. Let F
be an upper semicontinuous function from S into itself such that, for all $p \in S$, the set
F(p) is nonempty and convex.14 Then there exists a p in S such that $p \in F(p)$.
This theorem is a generalization of the following theorem, which is called
Brouwer's fixed point theorem.
Theorem 2.E.2 (Brouwer): Let S be a nonempty, compact, convex subset of $R^n$, and
let F be a single-valued continuous function from S into itself. Then there exists a
p in S such that p = F(p).
Both Brouwer's and Kakutani's theorems probe deeply into combinatorial
topology. For rather simple proofs, see Nikaido [30], [31], for example.15
Brouwer's theorem is illustrated in Figure 2.30. Here S is the unit interval [0, 1].
A continuous function from S to S must cross the diagonal line; thus F(p) = p.
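The one-dimensional case can be made concrete numerically. The sketch below is an illustration only, not a proof and not Scarf's algorithm: it assumes F is continuous and maps [0, 1] into itself, and it uses plain bisection on F(p) - p, which works only when S is an interval. The function name and the choice of F = cos are hypothetical.

```python
import math


def fixed_point_on_unit_interval(F, tol=1e-10):
    """Locate p in [0, 1] with F(p) = p, for continuous F mapping [0, 1] into [0, 1].

    Since F(0) >= 0 and F(1) <= 1, the function F(p) - p is nonnegative at 0 and
    nonpositive at 1, so bisection brackets a point where the graph of F meets
    the diagonal.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid) - mid >= 0:
            lo = mid  # graph is on or above the diagonal: crossing lies to the right
        else:
            hi = mid  # graph is below the diagonal: crossing lies to the left
    return 0.5 * (lo + hi)


# cos maps [0, 1] into [cos 1, 1], a subset of [0, 1].
p_star = fixed_point_on_unit_interval(math.cos)
print(p_star, math.cos(p_star))
```

In higher dimensions no such bracketing argument is available, which is why the general theorem requires the combinatorial machinery mentioned above.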
REMARK: The method of actually computing a fixed point in connection
with the theory of competitive equilibrium has been recently provided by
Scarf [33] ; his paper can also be considered to give a constructive proof of
the existence of competitive equilibrium. See also Arrow and Hahn [4],
Appendix C.
Reading the statement of Kakutani's fixed point theorem, we at once realize
that this theorem is applicable to the present problem. In other words, there exists
an $\hat{x} \in X$ such that $\hat{x} \in \Phi(\hat{x})$. Then, using assumption (v), we can show that $\hat{x} > 0$.
Thus we can find $\hat{x} > 0$ and $\hat{p} = f(\hat{x})$ such that $\hat{x}$ solves the above linear programming
problem. Then from the duality theorem of linear programming, there exists
a solution $\hat{w}$ for the following dual problem.
Minimize: $w \cdot v$
Subject to: $A' \cdot w = p$, $w \ge 0$
Then $(\hat{p}, \hat{x}, \hat{w})$ constitutes a solution for Schlesinger's version of the Walras-Cassel
model.16 Using (vi), the uniqueness can be proved.17 Note that Wald requires the
equality $A' \cdot w = p$. This implies that the price of every commodity has to be strictly
positive [under (iii)]. Assumption (v) is required to guarantee $\hat{x} > 0$ so that $A' \cdot \hat{w} = \hat{p}$.
This assumption says roughly that every commodity is indispensable to the consumer.
Note also that if we allow an inequality here, that is, $A' \cdot w \ge p$, then (from
the duality theorem) we admit zero production for some goods and the difficulty of
introducing assumption (v) disappears (see Kuhn [19]). (However, some sort of
the aggregate supply function, y(p), is the sum of the individual supply functions,
that is, $y(p) = \sum_j y_j(p)$, and y(p) is upper semicontinuous. Here a negative element
of $y_j(p)$ is an input for j. Assuming no externalities among the producers and the
consumers, we write the (aggregate) excess supply function as $z(p) = y(p) + \bar{x} - x(p)$,
where $\bar{x}$ is the total supply of resources available in the economy. Then z(p)
is also upper semicontinuous. Assuming free disposability, we write the feasibility
condition as $z(p) \cap \Omega \ne \emptyset$; or, equivalently, there exists a $z \in z(p)$ such that $z \ge 0$.
We say that $[\hat{p}, \{\hat{x}_i\}, \{\hat{y}_j\}]$ is a competitive equilibrium if
and
(ii) there exists a $\hat{z} \in z(\hat{p})$ such that $\hat{z} \ge 0$, where $\hat{z} = \sum_j \hat{y}_j + \bar{x} - \sum_i \hat{x}_i$.
or
This corresponds to Walras' Law.21 We will now use the following lemma, which is
proved independently in the literature by Gale [14] and Nikaido [28].22 The
lemma, which practically is the proof of the existence of competitive equilibrium,
can easily be proved if we use Kakutani's fixed point theorem. (For such a proof,
see Debreu [11], pp. 82-83, and Nikaido [31], pp. 266-267).23
Lemma 2.E.1 (Gale, Nikaido): Let P be the (n - 1)-unit simplex in $R^n$, that is,
$P \equiv \{p: p \in R^n, \sum_{i=1}^{n} p_i = 1, p \ge 0\}$, and let S be a nonempty compact subset of $R^n$. Let F
be an upper semicontinuous function from P to S, such that F(p) is nonempty and convex
for all $p \in P$ and $p \cdot F(p) \ge 0$ [that is, $p \cdot z \ge 0$ for all $z \in F(p)$]. Then there exists
a $p \in P$ such that $F(p) \cap \Omega \ne \emptyset$.
This lemma is illustrated in Figure 2.31. By the assumption that $p \cdot F(p) \ge 0$,
F(p) must be above the line passing through the origin and orthogonal to p. As p
moves in P, F(p) must intersect with $\Omega$.
Now let P be the (n - 1)-unit simplex of price vectors as defined above. Then
for each element p of P, we obtain $x_i(p)$ and $y_j(p)$. Define z(p) as
$z(p) = \sum_j y_j(p) + \bar{x} - \sum_i x_i(p)$. We see immediately that this z(p) satisfies the assumptions of the
lemma. Hence there exists a $\hat{p} \in P$ such that $z(\hat{p}) \cap \Omega \ne \emptyset$. Let $\hat{z} \in z(\hat{p})$ with $\hat{z} \ge 0$.
Then there exist $\hat{x}_i \in x_i(\hat{p})$ and $\hat{y}_j \in y_j(\hat{p})$ such that $\hat{z} \in z(\hat{p})$, where
$\hat{z} = \sum_j \hat{y}_j + \bar{x} - \sum_i \hat{x}_i$. This completes the proof of the existence of a competitive equilibrium.
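A small numerical illustration of the argument: the sketch below assumes a hypothetical pure exchange economy (no production) with two goods and two Cobb-Douglas consumers, so that demand is single-valued. It computes the excess supply z(p) on the price simplex, and searches for a price at which z(p) is nonnegative. The bisection search is a stand-in that happens to work for this two-good example; it is not the fixed point argument of the lemma.

```python
def excess_supply(p, endowments, shares):
    """z(p) = total endowment minus total demand, for Cobb-Douglas consumers.

    Consumer i spends the fraction shares[i][j] of the income p . endowments[i]
    on good j. Prices must be strictly positive.
    """
    n = len(p)
    z = [sum(e[j] for e in endowments) for j in range(n)]
    for e, a in zip(endowments, shares):
        income = sum(p[j] * e[j] for j in range(n))
        for j in range(n):
            z[j] -= a[j] * income / p[j]
    return z


def equilibrium_price(endowments, shares, tol=1e-12):
    """Bisect on p_1 in (0, 1), with p = (p_1, 1 - p_1) on the unit simplex."""
    lo, hi = 1e-9, 1.0 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        z1 = excess_supply([mid, 1.0 - mid], endowments, shares)[0]
        if z1 < 0:
            lo = mid  # good 1 is in excess demand: raise its price
        else:
            hi = mid
    p1 = 0.5 * (lo + hi)
    return [p1, 1.0 - p1]


# Hypothetical data: consumer 1 owns good 1, consumer 2 owns good 2.
endowments = [[1.0, 0.0], [0.0, 1.0]]
shares = [[0.5, 0.5], [0.25, 0.75]]

p_hat = equilibrium_price(endowments, shares)
z_hat = excess_supply(p_hat, endowments, shares)
walras = sum(pj * zj for pj, zj in zip(p_hat, z_hat))
print(p_hat, z_hat, walras)
```

Because every consumer spends exactly his income here, p . z(p) = 0 holds at every price, which is the equality form of the condition p . F(p) >= 0 required by the lemma.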
(i) The survival problem. This is the question of assuring that every consumer can
survive, given the equilibrium conditions. If an equilibrium exists, the equilib-
rium prices of the resources held by some consumer may be so low that he may
not be able to subsist on the income he obtains from his resources. The first
requirement for this problem, of course, is that the aggregate supply set contain
a point which is the sum of the minimal subsistence consumption requirements
for each consumer (otherwise some consumer is bound to die). In terms of the
notation of Section C, this means that there exist $x_i \in X_i$, for all i, and $y \in Y$
such that $x = y + \bar{x}$, where $x = \sum_i x_i$. The second requirement is that each consumer
be able to subsist with the resources (including labor) he holds without
engaging in exchange. This can be guaranteed if each consumer's consumption
set, with his resources added, has an intersection with the aggregate produc-
tion set of the economy. In fact, we need a little more. For example, we may
require that not only must such an intersection be nonempty, it must also have
an interior. This corresponds to the cheaper-point assumption discussed
in the previous section. Essentially, it guarantees the (upper semi-) continuity
of each consumer's demand function.
(ii) Satiation. When an equilibrium price prevails, some consumer, because the
prices of his resources are very high, may be able to purchase a consumption
bundle such that he is satiated. As we said in the previous section, the nonsatia-
tion assumption is needed to establish the lower semicontinuity of the budget
function (hence the upper semicontinuity of the demand function). Arrow and
Debreu simply assumed that every consumer is nonsatiated in his (somewhat
modified) consumption set. This is a strong assumption. The relaxation of this
assumption is possible and is attempted in the literature (for example, McKenzie
[21],[22]).
(iii) Utility function and the production set. Arrow and Debreu assume the exist-
ence of a continuous utility function for each consumer. McKenzie's formula-
tion is in terms of a preference relation, although his assumptions imply the
existence of a continuous utility function. The crucial assumption in this con-
nection, which is common in all the existence proofs, is the convexity of
individual preferences. Arrow and Debreu [ 3] assume the existence of a fixed
number of firms, each of which has a convex production set 2 McKenzie [ 22]
assumes that the aggregate production set is a convex cone so that constant
returns to scale prevails in the aggregate. McKenzie does not assume the ir-
reversibility of the production processes, nor does he assume free disposability
of commodities.
(iv) The number of producers.27 In Arrow and Debreu [3] and subsequent works
such as Debreu [11], it is assumed that the total number of firms (producers)
is fixed (at, say, k). It is well known and can easily be checked that diminishing
b. MCKENZIE'S PROOF
Essentially, we follow the proof due to McKenzie [22]. In order to understand
the principal problems and difficulties involved in the proof, we consider
his simpler case.18
Let $x_i$ be an n-commodity consumption vector for consumer i (i = 1,
2, ..., m) and let $X_i$ be his consumption set, which is assumed to be a subset of
$R^n$. We adopt the convention that the positive components of $x_i$ represent the
commodities demanded and the negative components represent commodities
supplied by the ith consumer (recall the discussion in Section A for this convention).
We do not take into explicit account a resource vector such as $v_i$; it is imbedded
in our convention of $x_i$. Let Y be the aggregate production set. We now state
and explain the assumptions which will be used in the present proof of the existence
of a competitive equilibrium.
Definition: The budget set for the ith consumer, denoted by Hi(p), is defined by
$$C(p) \equiv \sum_{i=1}^{m} C_i(p)$$
The set $C_i(p)$ is the ith upper contour set under price p and C(p) is the economy's
upper contour set under price p.
The concept of the upper contour set, $C_i(p)$, is illustrated in Figure 2.32.
The dotted lines indicate the indifference curves of the ith consumer.
Definition: We say that the ith consumer is satiated at $x_i$ if $x_i \succsim_i x_i'$ for all $x_i' \in X_i$
and that the ith consumer is satiated at price p if $x_i \succsim_i x_i'$ for all $x_i' \in C_i(p)$.
(iii) Assumptions relating consumption and production sets.
(A-5) The set $X_i \cap Y$ has an interior point for all i.
(A-6) Either (1) no consumer is satiated at p, or (2) if some consumer is satiated at
p, then $C(p) \cap Y = \emptyset$.
REMARK: Assumption (A-5) guarantees that every consumer can supply a
positive amount of every (unproduced) commodity to the producers.30 Thus
every consumer always has some income, given nonzero prices, so that his
budget set contains a point in his consumption set, and he can trade with
others. Assumption (A-5) guarantees the subsistence of every consumer
and it also corresponds to the cheaper-point (minimum-wealth) assumption
used in Sections C and D. This assumption, (A-5), is illustrated in Figure 2.33.
REMARK: Assumption (A-6) says that if some consumer is satiated while
trading at price p, then the total demand at p will exceed the possible produc-
tion. This concept is illustrated in Figure 2.34, in which we assume that there
is only one consumer in the economy. In this diagram, point x represents a
point of satiation. In other words, if the price of a certain commodity becomes
low enough (relative to other commodities), a consumer may be able
to purchase a large quantity of that good (in exchange for other commodities),
and thus he may be satiated with that good. Assumption (A-6) says that,
if this occurs, the demand for that good is beyond the society's productive
capacity. Therefore (A-6), in effect, precludes such a possibility. Analytically,
(A-6) corresponds to the nonsatiation assumption of demand theory.
We are now ready to start the proof of the existence of a competitive equilib-
rium. First we define competitive equilibrium (in the usual manner).
Fix $j = j_0$ and put $y_j = \hat{y}_j$ for all j except $j = j_0$. Then $\hat{p} \cdot y_{j_0} \le \hat{p} \cdot \hat{y}_{j_0}$ for all
$y_{j_0} \in Y_{j_0}$. Since the choice of $j_0$ is arbitrary, this shows the profit maximization
of each producer.
In order to prove the existence of a competitive equilibrium, it is sufficient to
confine our attention to a price vector p which satisfies condition (ii) of a competitive
equilibrium.
Let $\bar{x}_i \in$ interior $(X_i \cap Y)$, i = 1, 2, ..., m. This exists for all i from (A-5).
Write $\sum_i \bar{x}_i = \bar{x}$; then $\bar{x} \in$ interior Y.
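(i) (Convexity) Let $p, p' \in P$, and let $p'' = tp + (1 - t)p'$ ($0 \le t \le 1$). Then
$p'' \in Y^*$, since $Y^*$ is a convex cone. Moreover, $p \cdot \bar{x} = -1$ and
$p' \cdot \bar{x} = -1$ imply $p'' \cdot \bar{x} = -1$. Hence $p'' \in P$. Therefore P is convex.
(ii) (Closed) Let $p^q \to p$, $p^q \in P$; $p^q \cdot y \le 0$ implies $p \cdot y \le 0$ for $y \in Y$ (as a
result of the continuity of inner product). Hence $p \in Y^*$. $p^q \cdot \bar{x} = -1$
implies $p \cdot \bar{x} = -1$. Hence $p \in P$. Therefore P is closed.
(iii) (Bounded) Suppose there exists a sequence $\{p^q\}$ such that $\|p^q\| \to \infty$,
$p^q \in P$. Consider $p^q / \|p^q\| \equiv \tilde{p}^q$. Then $\tilde{p}^q \in Y^*$, since $Y^*$ is a convex
cone and $p^q \in Y^*$. Moreover, $\tilde{p}^q$ is an element of the intersection of
$Y^*$ and the (n - 1)-dimensional unit sphere. That is, $\tilde{p}^q$ is an element of
a compact set. Hence there exists a subsequence of $\{\tilde{p}^q\}$, say $\{\tilde{p}^s\}$,
such that $\tilde{p}^s \to \tilde{p}$, where $\|\tilde{p}^s\| = \|\tilde{p}\| = 1$. Moreover, $\tilde{p} \in Y^*$, for $Y^*$
is closed. Consider $\tilde{p}^s \cdot \bar{x}$. Since $p^s \cdot \bar{x} = -1$ ($\because p^s \in P$), $\tilde{p}^s \cdot \bar{x} = -1 / \|p^s\|$.
Then $\|p^s\| \to \infty$ implies that $\tilde{p}^s \cdot \bar{x} \to 0$ as $s \to \infty$, that is,
$\tilde{p} \cdot \bar{x} = 0$. But this is a contradiction, for $\tilde{p} \in Y^*$ and $\bar{x} \in$ interior Y
means $\tilde{p} \cdot \bar{x} < 0$. Therefore no such sequence $\{p^q\}$ can exist. Thus P is
bounded, so that it is compact. (Q.E.D.)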
Let z be a point which is not an interior point of Y, and consider a chord joining
$\bar{x}$ and z: $t\bar{x} + (1 - t)z$, $0 \le t \le 1$. Now consider the minimum of t subject to
$t\bar{x} + (1 - t)z \in Y$. Since t varies over the (closed) unit interval [0, 1], there exists
a t in [0, 1] for which t achieves its minimum. Denote this minimum by $t_z$. In other
words, we define $t_z$ = min t such that $t\bar{x} + (1 - t)z \in Y$, $0 \le t_z \le 1$. Now consider
a function h(z) from such $z \notin$ interior Y into Y by
$h(z) = t_z \bar{x} + (1 - t_z)z$
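The construction of h(z) can be sketched numerically. The code below assumes that Y is closed and convex and is available only through a membership test `in_Y`; the function name, the bisection search, and the example set Y (a half-plane in two dimensions) are hypothetical illustrations. Convexity of Y is what makes the feasible values of t an interval ending at 1, so bisection finds its left end point.

```python
def boundary_projection(in_Y, x_bar, z, tol=1e-12):
    """Return (t_z, h(z)), where t_z is the smallest t in [0, 1] such that
    t * x_bar + (1 - t) * z belongs to Y, and h(z) is that point.

    Assumes Y is closed and convex, x_bar is an interior point of Y, and z is
    not an interior point of Y.
    """
    def point(t):
        return [t * xb + (1.0 - t) * zi for xb, zi in zip(x_bar, z)]

    if in_Y(z):
        return 0.0, list(z)  # z is already in Y (on its boundary, by assumption)

    lo, hi = 0.0, 1.0  # invariant: point(lo) is outside Y, point(hi) is inside Y
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_Y(point(mid)):
            hi = mid
        else:
            lo = mid
    return hi, point(hi)


# Hypothetical production set: the convex cone Y = {y : y_1 + y_2 <= 0}.
def in_Y(y):
    return y[0] + y[1] <= 0.0


x_bar = [-1.0, -1.0]  # an interior point of Y
z = [2.0, 0.0]        # a point outside Y
t_z, h_z = boundary_projection(in_Y, x_bar, z)
print(t_z, h_z)
```

For this example the chord meets the boundary of Y halfway between z and the interior point, so h(z) lies on the line $y_1 + y_2 = 0$.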
subsequence $\{y^s\}$ such that $y^s \to y'$, where $y' \ne y \equiv h(z)$, and
$y' = t'\bar{x} + (1 - t')z$ for some t', $0 \le t' \le 1$. Since $\bar{x} \in$ interior Y, and $t_{z^q}$ and $t_z$ are all
less than 1, $y^q \ne \bar{x}$, $y' \ne \bar{x}$, and $y \ne \bar{x}$.
By the definition of $t_z$, we cannot have $t' < t_z$. Hence $t' \ge t_z$. But $t' = t_z$
implies y = y', so that we have $t' > t_z$. This implies $y' = \theta\bar{x} + (1 - \theta)y$ for
some $\theta$, where $0 < \theta \le 1$. Since $\bar{x} \in$ interior Y, there exists an open ball
$B_\epsilon(\bar{x})$ about $\bar{x}$ with radius $\epsilon > 0$, such that $B_\epsilon(\bar{x}) \subset Y$. Let $w \in B_\epsilon(\bar{x})$ and
define $w' \equiv \theta w + (1 - \theta)y$. Then $w' \in Y$, since both w and y are in Y. Hence
we have an open ball $B_{\theta\epsilon}(y')$ about y' with radius $\theta\epsilon > 0$, such that
$B_{\theta\epsilon}(y') \subset Y$. Hence y' is an interior point of Y. Therefore $y^q \in$ interior Y for
large q. This contradicts the definition $y^q = h(z^q)$. Thus we have $t' = t_z$, or
y = y'. (Q.E.D.)
REMARK: The above proof is illustrated in Figures 2.37 and 2.38.
Lemma 2.E.5: The set g(z) is convex and g(z) is an upper semicontinuous func-
tion of z.
PROOF: Convexity follows from the convexity of P and the linearity of
the inner product of z. (Check this for yourself; it is a simple exercise using
the definition of a convex set.)
To prove the upper semicontinuity of g(z), consider a sequence $\{z^q\}$ in
boundary Y such that $z^q \to z$. Then form a sequence $p^q \in g(z^q)$. We have to
show that if $p^q \to p$, then $p \in g(z)$. For this, simply observe:
(i) Since P is closed, $p \in P$.
(ii) By the continuity of inner product, $p \cdot z = 0$
($\because p^q \cdot z^q = 0$ implies $p \cdot z = 0$). Hence, $p \in g(z)$. (Q.E.D.)
Definition: $F \equiv g \circ h \circ f$, where $f \equiv \sum_{i=1}^{m} f_i$.
REMARK: The function F is illustrated by Figure 2.40.
FOOTNOTES
they desire to earn. It does not say that people spend all their income. The budget
constraint is a constraint and not a result of any choice.
6. The problem may be stated as follows: Let $f_i(x_1, \ldots, x_n, z_1, \ldots, z_m) = 0$, i = 1, 2, ...,
n, be the equilibrium system, where $x_1, \ldots, x_n$ are the "endogenous" variables and
$z_1, \ldots, z_m$ are the "exogenous" variables. The problem is whether we can obtain
$x_i = x_i(z_1, \ldots, z_m)$ such as to be consistent with the above set of equations. If we can, these
$x_i$'s define the equilibrium values of the endogenous variables and we call such
$(x_1, \ldots, x_n)$ a solution of the above system. In the above, the number of equations is taken
to be equal to the number of endogenous variables, for otherwise we cannot guarantee
the existence of a solution even when the $f_i$'s are all linear (affine). The well-known
implicit function theorem guarantees the (local) existence of a unique solution in the
neighborhood of a point $(x^0, z^0)$, if certain assumptions are satisfied, especially that
the Jacobian matrix $[\partial f_i / \partial x_j]$ evaluated at $(x^0, z^0)$ is nonsingular. The assumptions
of this theorem guarantee the global existence of a unique solution when the $f_i$'s
are all linear (affine). However, these assumptions are not sufficient for the global
existence of a solution for the nonlinear case.
7. We note that Walras clearly realized the possibility of the nonexistence of equilibria
for the two-commodity economy ([42], section 64, lesson 7).
8. In certain cases, the equality of the number of equations and variables is not even
a necessary condition. An example is $x^2 + y^2 = 0$, in which the number of equations
(=1) is different from the number of variables (=2), and yet there exists a
unique real solution (that is, x = 0 and y = 0).
9. The above difficulty of the Walrasian system (that is, there is no guarantee that
there exists a solution) was realized after Cassel's exposition [8] of the system. As
a result of the simplicity and the popularity of Cassel's exposition, the system then
became known as the Walras-Cassel system. For the reason that Cassel attracted
Austrians, Hicks says, "As is known, there was a phase [in the 1920's] when Cassel's
treatise was displacing those of the "historical" and "Austrian" schools in curricula
of Central European countries: during such struggles its weakness would be care-
fully watched," ([ 15], p. 674). The difficulty discussed above was made clear in the
seminar conducted by Karl Menger (a mathematician and the son of the famous economist
Menger). For the summary of the discussions in Menger's seminar on the
Cassel-Wald system, see Arrow and Debreu [3], pp. 287-289. Strictly speaking,
the Casselian system is quite different from the Walrasian system. Cassel did not pay
any attention to the behavior of each economic agent. He, in fact, proposed to reject
altogether the procedure of deriving an individual's demand function from this
hypothesized utility maximization behavior. Cassel a priori assumed the constancy
of the $a_{ij}$'s and the $v_i$'s.
10. In particular, condition (1) in the above is changed to $\sum_{j=1}^{n} a_{ij} x_j \le v_i$, i = 1,
2, ..., m. The equality condition for this relation is a very stringent one, if we assume
the $a_{ij}$'s and $v_i$'s are all constants, as in the Casselian system. If, on the other hand,
the $a_{ij}$'s and $v_i$'s are functions of prices as in Walras, the equality assumption is
not as strong as is generally believed. For then the equality can be achieved through
changes in prices. The inequality of condition (1) allows the possibility of an excess
supply of factors. If a factor is in excess supply, its price will be zero. In other words,
the inequality allows the possibility of determining the division of factors into free and
scarce (compare Zeuthen [43] ).
11. Wald's work [39] was first presented at Karl Menger's seminar. His article [41]
is the summary of the main results of [39] and [40] . These, together with the results
from Menger's seminar, clearly designate this period as the dawn of modern economics.
Notably, von Neumann's first paper [26] on game theory was published
in 1928, and his paper on the "von Neumann growth model" [27] was published in
1937. The latter paper clearly resembles modern activity analysis and also contains
the basic idea of the duality theorems of linear programming.
12. Note that (8) presupposes that the functions (3) and (4) are globally invertible and
that the supplies of productive factors (the vi's) are completely inelastic with respect
to all prices (p and w). Note also that the factor prices are left out in (8). In general,
these assumptions are not guaranteed, and Wald's procedure of using (8) is illegiti-
mate.
13. The procedure sketched here is the one prescribed by Kuhn [19]. DOSSO's procedure
[13] is a little different from this, although the mathematical structure of the
two procedures is essentially the same. Incidentally, there are some errors in
DOSSO's proof of existence [13]. They are pointed out and corrected by K. Inada.
See his "A Note on the Revision of the Proof of Dorfman, Samuelson, and Solow's
Existence Theorem of General Equilibrium," Economic Studies Quarterly, XIII,
February 1963.
14. If F is an upper semicontinuous function from a compact set X into itself, it can be
shown easily that the image set F(x), x E X, is also compact. See, for example, Berge
[6], section 1 of chapter VI (especially theorems 3 and 4 and the corollary of theorem
7). Note the distinction between his definition and our definition of upper semi-
continuity. See our discussion in Chapter 2, Appendix to Section D.
15. See also C. B. Tompkins, "Sperner's Lemma and Some Extensions," chapter 15 in
Applied Combinatorial Mathematics, ed. by E. F. Beckenbach, New York, Wiley, 1964,
and E. Burger, Introduction to the Theory of Games, Englewood Cliffs, N.J., Prentice-
Hall, 1963 (appendix).
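16. Note that from the duality theorem, $\hat{p} \cdot \hat{x} = \hat{w} \cdot v$ so that we have
$\hat{w} \cdot (v - A \cdot \hat{x}) = \hat{w} \cdot v - (A' \cdot \hat{w}) \cdot \hat{x} = \hat{w} \cdot v - \hat{p} \cdot \hat{x} = 0$, that is, condition (6) is satisfied. (See also
Theorem 1.F.1.) Note also that $\hat{p} \cdot \hat{x} = \hat{w} \cdot v$ implies that $(\hat{p}, \hat{x}, \hat{w})$ satisfies Walras' Law.
17. Suppose not. That is, suppose $(\hat{p}, \hat{x}, \hat{w})$ and $(p^*, x^*, w^*)$ are two different equilibria.
Since $\hat{x}$ maximizes $\hat{p} \cdot x$ over $x \in X$, we have $\hat{p} \cdot \hat{x} \ge \hat{p} \cdot x$ for all $x \in X$, where
$\hat{p} = f(\hat{x})$. In particular, $\hat{p} \cdot \hat{x} \ge \hat{p} \cdot x^*$. Similarly, $p^* \cdot x^* \ge p^* \cdot \hat{x}$ where $p^* = f(x^*)$,
which implies that $\hat{p} \cdot \hat{x} < \hat{p} \cdot x^*$ from assumption (vi). This contradicts $\hat{p} \cdot \hat{x} \ge \hat{p} \cdot x^*$,
which proves the uniqueness. The discussions of assumption (vi) will be postponed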
to the Appendix to Section E of this chapter and Chapter 3, Section E.
18. DOSSO [ 13] avoided the use of the inverse demand function.
19. In the issue of Econometrica before the one containing the article by Arrow and
Debreu [3], McKenzie [20] showed the existence of a solution for Graham's model
of world trade, which clearly resembles the model of Walrasian competitive markets.
We may also note that this is probably the first article in economics to use Kakutani's
fixed point theorem.
20. Sonnenschein [35] established the upper semicontinuity of $x_i(p)$ [hence also $x(p)$]
without assuming the transitivity of the underlying individual preferences. The main
result of [35], as the author puts it, is that "the transitivity of preferences can be
replaced by the convexity of preferences in establishing the existence of demand func-
tions" (p. 215).
21. Nikaido called the above relation with inequality the Walras law in the general sense,
and the usual Walras law with equality the Walras law in the narrow sense. ([30],
section 45; [31], p. 263)
22. A similar theorem was proved by Debreu [10] and it was also used in the proof of
the existence of a competitive equilibrium. See also [11].
23. The use of Kakutani's fixed point theorem, which is an extension of Brouwer's fixed
point theorem, is not a matter of mere technical convenience. Surprisingly enough,
THE EXISTENCE OF COMPETITIVE EQUILIBRIUM 277
it can be shown that the Gale-Nikaido lemma conversely implies Brouwer's fixed
point theorem. See Uzawa [38] and Nikaido [31], pp. 268-269. This produces
Uzawa's contention [ 38] that the existence of equilibria in the Walrasian system is
in a sense equivalent to Brouwer's fixed point theorem. Nikaido then remarked ([31],
p. 270): "The Walrasian general equilibrium theory [Walras, 1874] was published in
the 1870's, while Brouwer's work on fixed points [Brouwer, 1909, 1910] appeared
three decades later. It is therefore no wonder that Walras could not achieve a
mathematical consolidation of the conjecture in the days before the advancement of
topology; he should certainly not be criticized for his failure to achieve a mathe-
matical solution, but should be admired for his mathematical imagination which
let him formulate this well-posed conjecture."
24. For this purpose, the reader may also be interested in seeing Debreu [11], chapter 5,
for example.
25. See (A-5) of the next subsection and footnote 31. This assumption implies that every
consumer must be able to supply a positive amount of every unproduced commodity
(such as labor). Arrow and Debreu [3] imposed a stronger assumption which re-
quires that every consumer can supply a positive amount of every commodity. The
relaxation of Arrow and Debreu's assumption is seen in McKenzie [21], [22].
In [22], McKenzie introduced the concept of "irreducibility." For a further dis-
cussion on the concept of irreducibility, see Moore [24]. See also Arrow [2], and
J. T. Rader, "Pairwise Optimality and Noncompetitive Behavior," in Papers in Quan-
titative Economics, vol. 1, ed. by J. Quirk, and A. M. Zarley, Lawrence, Kansas,
University of Kansas Press, 1968. In essence, the concept of irreducibility says that
no matter how the consumers are partitioned into two groups, an increase in the
initial assets held by the members of one group can be used to make possible an
allocation which would improve the position of someone in the second group without
damaging the position of anyone else there.
26. As Arrow [ 1] points out, the convexity of each consumer's preferences and of each
producer's production set are "the empirically most vulnerable" assumptions. How-
ever, the nonconvexity of preferences would have no significant effect as long as each
consumer is small enough compared to the economy. Recall our discussion on the
core in the Appendix to Section C of this chapter. On the other hand, increasing
returns to scale for each firm (which precludes the convexity of each firm's produc-
tion set) over a sufficiently wide range may mean the appearance of large firms and the
failure of the existence of a competitive equilibrium.
27. There is also a problem in the procedure of fixing the number of consumers in the
economy, even if we assume that everybody can survive. But this seems to be much
less serious than the problem that arises in fixing the number of firms. See Koopmans
[17], pp. 64-65.
28. In particular we are concerned with his "special existence theorem," which provides
the core of his proof for a more general case.
29. In proving the "existence of competitive equilibria," no assumptions on each
producer's production set are required (only the assumptions on the aggregate
production set of the total economy are required). This was first shown by Uzawa in
1956 (Stanford Technical Paper No. 40), later published as [37].
30. Note that the origin represents the point of the initial endowments, and the con-
sumption set Xi represents the set of all possible trade (and consumption) for the ith
consumer. In Figure 2.33, labor is assumed to be the only unproduced commodity.
At point x, a positive amount of the produced commodity, food, is received by
consumer i. This restrictive assumption simplifies the proof in this subsection.
31. Since $\bar{x}_i \in$ interior $Y$, $p \cdot \bar{x}_i < 0$ for all $p \in P$. This means that the ith consumer is
guaranteed a positive income above subsistence requirements for all p E P. In this
sense, (A-5) takes care of the subsistence problem.
278 THE THEORY OF COMPETITIVE MARKETS
REFERENCES
23. , "On the Existence of General Equilibrium for a Competitive Market: Some
Corrections," Econometrica, 29, April 1961.
24. Moore, J. C., "On Pareto Optima and Competitive Equilibria, Part II. The Existence
of Equilibria and Optima," Krannert Institute Paper, no. 269, April 1970.
25. Morishima, M., "A Reconsideration of the Walras-Cassel-Leontief Model of General
Equilibrium," in Mathematical Methods in the Social Sciences, 1959, ed. by Arrow,
Karlin, and Suppes, Stanford, Calif., Stanford University Press, 1960.
26. von Neumann, J., "Zur Theorie der Gesellschaftsspiele," Mathematische Annalen,
100, 1928.
27. , "Über ein ökonomisches Gleichungssystem und eine Verallgemeinerung des
Fixpunktsatzes," Ergebnisse eines Mathematischen Kolloquiums, 1935-1936 (in
English, "A Model of General Economic Equilibrium," Review of Economic Studies,
VIII, 1945-1946).
28. Nikaido, H., "On the Classical Multilateral Exchange Problem," Metroeconomica, 8,
August 1956.
29. , "A Supplementary Note to 'On the Classical Multilateral Exchange
Problem,"' Metroeconomica, 9, December 1957.
30. , Introduction to Sets and Mappings in Modern Economics, tr. by K. Sato,
Theorem 2.E.4 (Wald):' Suppose (R) holds; then the equilibrium is unique.
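PROOF: Suppose there exist two equilibrium price vectors $\hat{p} > 0$ and $p^* > 0$
such that $\hat{p} \neq p^*$. Write $\hat{x} = f(\hat{p})$ and $x^* = f(p^*)$. By (R), we have either
$\hat{p} \cdot \hat{x} < \hat{p} \cdot x^*$ or $p^* \cdot x^* < p^* \cdot \hat{x}$. But since $\hat{p}$ and $p^*$ are equilibrium price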
.
Lemma 2.E.6: Assume the region P of $R^n$ is rectangular and suppose the Jacobian
matrix F(p) is Hicksian for all $p \in P$. Then
(i) The mapping f(p) is one-to-one for all $p \in P$.
(ii) The inequalities
(5) $(p_i - a_i)[f_i(p) - f_i(a)] \geq 0$, i = 1, 2, ..., n
have only the trivial solution p = a.
With the help of this lemma, the following uniqueness theorem is easy to
prove.
Theorem 2.E.5 (Nikaido): Let f (p) be an excess demand vector as considered above,
where f is differentiable and defined on a rectangular region P of $R^n$. Then the equi-
librium price vector is unique if the Jacobian matrix F(p) is Hicksian.
PROOF: Suppose $\hat{p}$ and $p^*$ are two equilibrium price vectors. In the inequali-
ties (5), let $a = \hat{p}$ and $p = p^*$. Then the LHS of (5) can be rewritten as
(6) $(p_i^* - \hat{p}_i)[f_i(p^*) - f_i(\hat{p})] = -\hat{p}_i f_i(p^*) - p_i^* f_i(\hat{p}) \geq 0$
by definition of equilibrium. Hence by (ii) of Lemma 2.E.6, $p^* = \hat{p}$.
(Q.E.D.)
REMARK: Suppose that the equilibrium relation is expressed in the Walras-
Cassel equality form
(7) $f_i(p) = 0$, i = 1, 2, ..., n
and suppose that such a p with p > 0 exists. Then (i) of Lemma 2.E.6 provides
the uniqueness of p immediately.
Unless some economic justifications are found for the condition that F(p) is
Hicksian, Theorem 2.E.5 remains essentially a mathematical theorem. Here there
is still much to be explored. The reader may find his own uniqueness theorems by
exploring further economic interpretations of Theorem 2.E.5.
To illustrate such a line of thought, let us quote the following result in the
literature, from which we shall prove one uniqueness theorem.
Lemma 2.E.7: Let $A = [a_{ij}]$ be an n × n matrix with $a_{ij} \geq 0$ for all $i \neq j$. Then A is
Hicksian if and only if
(8) There exists an $x > 0$ such that $A \cdot x < 0$.
PROOF: See Chapter 4, Section C.
To make use of this theorem, we assume
(G) $f_{ij} \equiv \partial f_i / \partial p_j \geq 0$ for all p and for all $i \neq j$.
That $f_{ij} > 0$ means that an increase (resp. decrease) in the price of the jth com-
modity will increase (resp. decrease) the excess demand for the ith commodity.
Condition (G) is known to be the (weak) gross substitutability condition and plays
an important role in the stability theorem of competitive equilibrium (see Chapter
3).
Next write
(9) $f_i(p) = f_i(p, p_0)$, i = 1, 2, ..., n
and
are homogeneous of degree zero with respect to all the arguments. Hence using
Euler's equation we obtain
n
Theorem 2.E.6:9 Assume (G) and (12). Then the equilibrium is unique.
PROOF: By Lemma 2.E.7, F(p) is Hicksian for all p. Hence by Theorem
2.E.5, the equilibrium is unique. (Q.E.D.)
This result has been known to economists since Wald [10]. Moreover, it can
be proved quite simply without using the knowledge of Theorem 2.E.5 and Lemma
2.E.7. For such a proof, see Lemma 3.E.2. Lemma 3.E.3 provides the relation be-
tween condition (R) and gross substitutability. The difficulty of Theorem 2.E.6
is that the economic plausibility of gross substitutability is very questionable.
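
As a rough illustration of Lemma 2.E.7, the following sketch checks both sides of the equivalence on two small matrices with nonnegative off-diagonal entries. It assumes that "Hicksian" means every principal minor of order k has the sign of (-1)^k. The function names (`det`, `is_hicksian`, `satisfies_condition_8`), the sample matrices, and the trial vector x = (1, 1) are hypothetical choices made only for this example.

```python
from itertools import combinations


def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0.0
    for j in range(n):
        # Delete row 0 and column j to form the minor.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total


def is_hicksian(A):
    """Assumed definition: each principal minor of order k has the sign of (-1)**k."""
    n = len(A)
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            sub = [[A[i][j] for j in idx] for i in idx]
            if (-1) ** k * det(sub) <= 0:
                return False
    return True


def satisfies_condition_8(A, x):
    """Check that a given x is a witness for (8): x > 0 and A x < 0."""
    if any(xi <= 0 for xi in x):
        return False
    Ax = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    return all(v < 0 for v in Ax)


# Both sample matrices have nonnegative off-diagonal entries.
A = [[-2.0, 1.0], [1.0, -3.0]]   # diagonal dominates
B = [[-1.0, 2.0], [2.0, -1.0]]   # off-diagonal dominates

x = [1.0, 1.0]                   # trial vector, chosen by hand
print(is_hicksian(A), satisfies_condition_8(A, x))
print(is_hicksian(B), satisfies_condition_8(B, x))
```

Note that a failed check for one particular x does not by itself show that no admissible x exists; the sketch only verifies a proposed witness for (8).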
FOOTNOTES
1. As remarked before, the existence of a numeraire presupposes that the price of such a
commodity is positive at least in equilibrium and that the excess demand for each
commodity is homogeneous of degree zero with respect to all prices including that of
the numeraire commodity.
2. By Walras' Law, we mean here that $\sum_{i=0}^{n} p_i f_i(p) = 0$ for all p. It is easy to see that,
under this law, (1) and (2) hold if and only if (3) holds.
3. To prove this theorem, there is no need to assume the existence of a numeraire, as
long as condition (R) is stated in a form which includes the numeraire. Then we can
assert the uniqueness of the price vector (including the numeraire commodity) up to
scalar multiples.
4. It may be worthwhile to recall Samuelson's weak axiom of revealed preference.
Interpret x and x' as the consumption vectors of a particular individual. Let x and x'
respectively, be chosen under p and p'. If x' is affordable at p, that is, $p \cdot x' \leq p \cdot x$,
then x is revealed to be preferred to x', for he could have bought x'. If this is the case,
x' cannot be revealed to be preferred to x; that is, $p' \cdot x \leq p' \cdot x'$ is impossible. There-
fore $p \cdot \Delta x \leq 0$ (where $\Delta x = x' - x$) implies $p' \cdot \Delta x < 0$, which is the weak axiom.
By this axiom, we have $p' \cdot \Delta x < 0$ or $p \cdot \Delta x > 0$ (for a particular individual). See P. A.
Samuelson, Foundations of Economic Analysis, Cambridge, Mass., Harvard University
Press, 1947, chapter 5.
5. In other words, the statement that the weak axiom holds in the aggregate is not a
REFERENCES
1. Arrow, K. J., "Economic Equilibrium," International Encyclopedia of Social Sciences,
New York, Macmillan and Free Press, 1968.
2. Arrow, K. J., Block, H. D., and Hurwicz, L., "On the Stability of the Competitive
Equilibrium, II," Econometrica, 27, January 1959.
3. Gale, D., and Nikaido, H., "The Jacobian Matrix and Global Univalence of Map-
pings," Mathematische Annalen, 159, 1965.
4. Inada, K., "The Production Coefficient Matrix and the Stolper-Samuelson Condi-
tion," Econometrica, 39, March 1971.
5. McKenzie, L. W., "Matrices with Dominant Diagonals and Economic Theory,"
Mathematical Methods in the Social Sciences, 1959, ed. by Arrow, Karlin, and Suppes,
Stanford, Calif., Stanford University Press, 1960.
6. Morishima, M., "A Generalization of the Gross Substitute System," Review of Eco-
nomic Studies, XXXVII, April 1970.
7. Nikaido, H., Convex Structure and Economic Theory, New York, Academic Press,
1968.
8. Uekawa, Y., "On the Generalization of the Stolper-Samuelson Theorem," Econo-
metrica, 39, March 1971.
9. Wald, A., "Über die eindeutige positive Lösbarkeit der neuen Produktions-
gleichungen," Ergebnisse eines Mathematischen Kolloquiums, 6, 1933-1934.
10. , "Über einige Gleichungssysteme der mathematischen Ökonomie," Zeit-
schrift für Nationalökonomie, 7, 1936 (in English, "On Some Systems of Equations of
Mathematical Economics," Econometrica, 19, October 1951).
PROGRAMMING AND COMPETITIVE EQUILIBRIA 285
Section F
PROGRAMMING, PARETO OPTIMUM,
AND THE EXISTENCE
OF COMPETITIVE EQUILIBRIA'
$$x = \sum_{i=1}^{m} x_i, \quad \bar{x} = \sum_{i=1}^{m} \bar{x}_i, \quad \text{and} \quad y = \sum_{j=1}^{k} y_j$$
where all externalities are assumed away.
Denote by Xi the consumption set of i and by Yj the production set of j. We
denote by X the aggregate consumption set and by Y the aggregate production
set. We assume that the preferences of consumer i are represented by a continuous
real-valued function $u_i(x_i)$.⁴ Given price p, the profit of producer j can be written
as $p \cdot y_j$. Pareto optimality and competitive equilibrium are defined (in the usual
manner) as follows:
Definition (satiation): The ith consumer is satiated at $x_i'$ if $u_i(x_i') \geq u_i(x_i)$ for all
$x_i \in X_i$.
We assume the following:
(A-1) There exist $x' \in X$, $y' \in Y$ such that $y' + \bar{x} - x' > 0$.
(A-2) The set Y is convex.⁶
(A-3) The function $u_i(x_i)$ is continuous and concave for all i = 1, 2, ..., m.⁷
(A-4) (cheaper point) Given a point $\hat{x}_i$, if a price vector p prevails, there
exists an $x_i' \in X_i$ such that $p \cdot \hat{x}_i > p \cdot x_i'$ for all i.
Notice also that our definition of feasibility tacitly assumes free disposability.
Theorem 2.F.1 Under (A-1), (A-2), and (A-3), if $[\{\hat{x}_i\}, \hat{y}]$ is a Pareto optimum,
then there exist a $\hat{p} \geq 0$ and $\{\hat{y}_j\}$ such that $[\hat{p}, \{\hat{x}_i\}, \{\hat{y}_j\}]$ is a competitive equilibrium,
provided that (A-4) holds at this $\hat{p}$.
PROOF: Let u be a vector function of which the ith component is $u_i(x_i)$.
Since $[\{\hat{x}_i\}, \hat{y}]$ is a Pareto optimum, it is a solution of the following vector
maximum problem:⁸ Choose $\{x_i\}$ and y so as to maximize u subject to
$x \leq y + \bar{x}$, and $x_i \in X_i$, i = 1, ..., m, and $y \in Y$. Hence, in view of (A-1),
(A-2), and (A-3), there exists an $(\hat{\alpha}, \hat{p})$ such that the following (1) and (2) hold:
(1)
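for all $x_i \in X_i$, i = 1, 2, ..., m, and $y \in Y$, where $\hat{\alpha} \geq 0$, $\hat{p} \geq 0$, and $\hat{\alpha} \neq 0$,⁹ and
$\hat{u} = [u_1(\hat{x}_1), u_2(\hat{x}_2), \ldots, u_m(\hat{x}_m)]$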
(2) y+x-z0
Condition (iii) of competitive equilibrium follows immediately from (2).
Since $\hat{y}$ and y are in Y, we can find $\hat{y}_j \in Y_j$ and $y_j \in Y_j$, j = 1, 2, ..., k, such
that $\hat{y} = \sum_{j=1}^{k} \hat{y}_j$ and $y = \sum_{j=1}^{k} y_j$. Put $x_i = \hat{x}_i$ for all i and $y_j = \hat{y}_j$ for all j
except for $j = j_0$ in (1). Then $\hat{p} \cdot \hat{y}_{j_0} \geq \hat{p} \cdot y_{j_0}$ for all $y_{j_0} \in Y_{j_0}$. Since the choice of
$j_0$ is arbitrary, this establishes condition (ii) of competitive equilibrium. Put
$y = \hat{y}$ and $x_i = \hat{x}_i$ for all i except for $i = i_0$. Then we have
$\hat{\alpha}_{i_0} u_{i_0}(\hat{x}_{i_0}) - \hat{\alpha}_{i_0} u_{i_0}(x_{i_0}) \geq \hat{p} \cdot \hat{x}_{i_0} - \hat{p} \cdot x_{i_0}$ for all $x_{i_0} \in X_{i_0}$
If $\hat{\alpha}_{i_0} > 0$, then condition (i) of competitive equilibrium is satisfied for $i_0$.
If $\hat{\alpha}_{i_0} = 0$, then $\hat{p} \cdot x_{i_0} \geq \hat{p} \cdot \hat{x}_{i_0}$ for all $x_{i_0} \in X_{i_0}$.¹⁰ This contradicts the
cheaper point assumption, (A-4), so that $\hat{\alpha}_{i_0} > 0$. Since the choice of $i_0$ is
arbitrary, this establishes condition (i) of competitive equilibrium.
(Q.E.D.)
Corollary: If there exists at least one consumer (say, $i_0$) who is not satiated at
$\hat{x}_{i_0}$, then $\hat{p} \neq 0$.
PROOF: From the proof of the theorem, we know
for all $x_i \in X_i$, $\hat{\alpha}_i > 0$, i = 1, 2, ..., m
Now suppose $\hat{p} = 0$. Then $u_{i_0}(\hat{x}_{i_0}) \geq u_{i_0}(x_{i_0})$ for all $x_{i_0} \in X_{i_0}$. This contradicts
the fact that $i_0$ is not satiated at $\hat{x}_{i_0}$. (Q.E.D.)
REMARK: Note that the cheaper point assumption plays a crucial role in
establishing that each consumer indeed maximizes his satisfaction subject
to his budget constraint. Without (A-4), we cannot say this, as was shown by
where $\theta_{ji}$ is the share of profits from j to i, $\sum_{i=1}^{m} \theta_{ji} = 1$, i = 1, 2, ..., m. Then
condition (i) in the definition of competitive equilibrium is restated as
$u_i(\hat{x}_i) \geq u_i(x_i)$ for all $x_i \in X_i$ with $\hat{p} \cdot x_i \leq M_i$, i = 1, 2, ..., m
To prove the existence, we impose the following additional assumptions."
(A-5) The set Z is compact, where $Z = \{(x_1, \ldots, x_m, y) : x \leq y + \bar{x},\ y \in Y,\ x_i \in X_i,\ i = 1, 2, \ldots, m\}$.
Theorem 2.F.2 Under (A-1), (A-2), (A-3), (A-5), (A-6), (A-7), and (A-8), there exists
a competitive equilibrium with a nonzero price vector.
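PROOF: Let $\alpha \in A$ where $A = \{\alpha \in R^m : \sum_{i=1}^{m} \alpha_i = 1,\ \alpha_i \geq 0 \text{ for all } i\}$. Let
$U = \sum_{i=1}^{m} \alpha_i u_i(x_i)$ and consider the following problem:¹⁴
Maximize: $U$ (over $x_i \in X_i$, $y \in Y$)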
Relations (4), (5), (7), and (8) all hold with p" replaced for p'.
The rest of the proof is analogous to Negishi [ 12]. It is simply
recorded here to keep this section sufficiently self-contained.
Since the set Z is bounded by (A-5), there exists a number M such that
$\sum_{i=1}^{m} |M_i(p, y) - p \cdot x_i| < M$ for all $(x_1, \ldots, x_m, y) \in Z$, and for all p
Now define
Since there exist $x_i^0 < \bar{x}_i$ by (A-7), $\alpha_i > 0$ for all i = 1, 2, ..., m, for other-
wise it contradicts relation .12 Hence, combining (7) and (10), we obtain
condition (i') of competitive equilibrium. (Q.E.D.)
PROOF: First note that $[\{\hat{x}_i\}, \{\hat{y}_j\}]$ maximizes $\sum_{i=1}^{m} \alpha_i u_i(x_i)$ (where $\alpha_i > 0$
for all i), subject to feasibility. Suppose $[\{\hat{x}_i\}, \{\hat{y}_j\}]$ is not a Pareto optimum.
Then by the definition of Pareto optimum, there exist $\tilde{x}_i \in X_i$, i = 1, 2, ..., m,
$\tilde{y}_j \in Y_j$, j = 1, 2, ..., k, such that $u_i(\tilde{x}_i) \geq u_i(\hat{x}_i)$ for all i with strict inequality
holding for at least one i, and $\tilde{x} \leq \tilde{y} + \bar{x}$. In other words, there exists a feasible
$[\{\tilde{x}_i\}, \{\tilde{y}_j\}]$ such that $\sum_{i=1}^{m} \alpha_i u_i(\tilde{x}_i) > \sum_{i=1}^{m} \alpha_i u_i(\hat{x}_i)$, contradicting the maxi-
mality of $[\{\hat{x}_i\}, \{\hat{y}_j\}]$. (Q.E.D.)
FOOTNOTES
1. This section is a revised version of Takayama and El-Hodiri [15]; their work
was developed from the discussions between Takayama and El-Hodiri during the
summer of 1966, and the actual writing was done by Takayama. I am indebted to
Takashi Negishi for comments.
2. One of the important by-products of such an approach is that we can avoid the
concepts of demand correspondence and supply correspondence altogether.
3. Moreover, in Negishi's paper, Xi is nonnegative and contains the origin for all i. This
means that every consumer has the same minimum subsistence consumption point,
the origin, regardless of his physiological need.
4. For conditions which guarantee the existence of a continuous real-valued utility
function, see Rader [ 12] . The reader may wish to note that one of his theorems
(theorem 3) does not require the transitivity axiom of the preference ordering.
5. Note that if $y + \bar{x} - x < 0$ for all $x \in X$ and $y \in Y$, then the society cannot guarantee
the survival of every one of its members.
6. Note that only the aggregate production set is assumed to be convex. Every
production set does not have to be convex.
7. From the consideration below, we may also surmise that the theorems of nonlinear
programming that we used can be extended to the case in which the maximand
function is explicitly quasi-concave rather than concave. This conjecture is due
to the fact that only explicit quasi-concavity is usually required in establishing the
corresponding theorems of this section (see Theorem 2.C.2 and Theorem 2.E.3).
Finally, we may note that the concavity implies the continuity in the interior of the
domain (here $X_i$), but not necessarily at the boundary. This is important, for $X_i$ may
be a closed set.
8. We are using the following theorem. Theorem: Let Z be a convex subset in $R^l$
and f(z) be a vector function of which the ith component is $f_i(z)$. Let $f_i(z)$,
i = 1, 2, ..., m, and $g_j(z)$, j = 1, 2, ..., n, be concave functions on Z. Suppose
also that there exists a $\bar{z} \in Z$ such that $g_j(\bar{z}) > 0$, j = 1, 2, ..., n (Slater's condition).
Then if $\hat{z}$ achieves a vector maximum of f(z) subject to $g_j(z) \geq 0$, j = 1, 2, ..., n,
then there exist $\hat{\alpha} \geq 0$, $\hat{p} \geq 0$ ($\hat{\alpha} \neq 0$) such that $(\hat{z}, \hat{p})$ is a saddle point of
$\hat{\alpha} \cdot f(z) + p \cdot g(z)$. See Theorem 1.E.4. Also see Karlin [7], pp. 216-218, and
Kuhn and Tucker [8], pp. 487-489. Note that the convexity of Y and $X_i$ as
well as the concavity of $u_i$ is required in applying this theorem. Note also
that assumption (A-1) provides Slater's condition, which implies Karlin's condition
as stated in [7], p. 201. See also Section B of Chapter 1.
9. Here $\alpha_i$ can be interpreted as the "weight" attached by the society to the ith
individual.
10. Note that this unfortunate consumer is still minimizing his expenditure.
11. To ensure the compactness of Z, assume, for example, that the Xi's are closed
and bounded from below, Y is closed, and that "no-land-of-Cockaigne" holds for
each $Y_j$. For the discussion of such a "compactification," see Arrow and Debreu
[2], pp. 276-277, and p. 279; Debreu [5], pp. 76-78, pp. 84-86; and Nikaido [13],
section 40.
12. The Weierstrass theorem asserts that a continuous (real-valued) function achieves
its maximum (and a minimum) on a compact set, and this theorem can easily be
extended to the case of a vector maximum.
13. These assumptions are analogous to those of Debreu [5], chapter 5. Assumptions
(A-5) and (A-7) may sound too strong, and some readers may wish to generalize in
the direction achieved by McKenzie [9]. It should be noted, however, that our set
of assumptions used to prove existence (Theorem 2.F.2) is more general than that
of Negishi [12]. We do not assume the existence of the right-hand and left-hand
derivatives of each Fk. As we noted before, we, in fact, completely avoided the
use of the individual production function Fk. Hence we do not assume that each Fk
is concave. Note that the concavity of each Fk implies the convexity of each
producer's production set. We only assume the convexity of the aggregate production
set. Slater's condition for each Fk does not have to be assumed. Our assumption
(A-1) is concerned with the aggregate sets. Note also that the assumption of Slater's
condition for each Fk also implies that the production set for each producer must
have a common interior point with the consumption set of every consumer (recall
that the origin is the starvation point for each consumer in Negishi [ 12] ).
14. In view of (A-3) and (A-5), the solution of the above maximization problem always
exists because of the Weierstrass theorem.
15. We are using the following version of the Kuhn-Tucker theorem. Theorem: Let Z
be a convex set in $R^n$, and $f(z)$, $g_j(z)$, j = 1, 2, ..., m, be concave functions on Z.
Suppose also there exists a $\bar{z} \in Z$ such that $g_j(\bar{z}) > 0$ for all j (Slater's condition).
Under these conditions, if $\hat{z}$ maximizes $f(z)$ subject to $g_j(z) \geq 0$, j = 1, 2, ..., m, then
there exists a $\hat{p} \geq 0$ such that $(\hat{z}, \hat{p})$ is a saddle point of $[f(z) + p \cdot g(z)]$. A beauti-
ful proof when $Z = R^n$ is provided by Uzawa [16]. The above slightly generalized
version is provided by Karlin [7] , pp. 201-203 (note that Slater's condition implies
Karlin's condition), and Nikaido [ 13] , section 37. See our discussion in Chapter 1,
Section B (especially the corollary of Theorem 1.B.3).
l 6. ,, > 0, since _y"_ ,.X, = 1 and _Y;" I [ (Mi (p, y) - p xi)/M] < 1.
Note that _y;'_'
17. Note that the range of the mapping (a) is compact. This is due to the fact that
the set Z is compact and the part of the range in which p' lives can be considered
as a compact subset (say, P) of the nonnegative orthant $\Omega^n$ of $R^n$; p' is bounded
and it is obviously nonnegative. Without loss of generality, we may also assume
that P is convex.
18. The image is convex because U is concave, the constraint set Z is convex (Theorem
I .C.5), and P is convex. Also recall here that the Cartesian product of convex sets
is always convex. Since the graph of this mapping is a closed set, it is a closed
REFERENCES
1. Arrow, K. J., "An Extension of the Basic Theorems of Classical Welfare Eco-
nomics," Proceedings of the Second Berkeley Symposium on Mathematical Statistics
and Probability, ed. by J. Neyman, Berkeley, Calif., University of California Press,
1951.
2. Arrow, K. J., and Debreu, G., "Existence of an Equilibrium for a Competitive
Economy," Econometrica, 22, July 1954.
3. Berge, C., Topological Spaces, tr. by Patterson, New York, Macmillan, 1963 (French
original, 1959).
4. Debreu, G., "The Coefficient of Resource Utilization," Econometrica, 19, July 1951.
5. , Theory of Value, New York, Wiley, 1959.
6. Fenchel, W., Convex Cones, Sets and Functions, Princeton, 1953 (offset).
7. Karlin, S., Mathematical Methods and Theory in Games, Programming, and Economics,
Vol. 1, 1st ed., Reading, Mass., Addison-Wesley, 1959.
8. Kuhn, H. W., and Tucker, A. W., "Nonlinear Programming," Proceedings of the
Second Berkeley Symposium on Mathematical Statistics and Probability, ed. by
J. Neyman, Berkeley, Calif., University of California Press, 1951.
9. McKenzie, L. W., "On the Existence of General Equilibrium for a Competitive
Market," Econometrica, 27, January 1959.
10. Moore, J. C., "Some Extensions of the Kuhn-Tucker Results in Concave Pro-
gramming," Papers in Quantitative Economics, ed. by J. P. Quirk and A. M. Zarley,
Lawrence, Kansas, University of Kansas Press, 1968.
11. , "A Note on Point-Set Mappings," Papers in Quantitative Economics, Vol. 1,
Section A
INTRODUCTION
296 THE STABILITY OF COMPETITIVE EQUILIBRIUM
In the case of a competitive equilibrium, this says that the movement of the ith
price, $x_i(t)$, is a function of the excess demand for the ith commodity $f_i$ (or excess
demand for all the commodities f). When the problem is written in differential
equation form, one suspects that the theory of stability developed for differential
equations might be of some value. This was indeed the case in the development
of the theory of the stability of a competitive equilibrium. In Section B, we survey
the basic material on differential equations. This discussion will also be useful in
later chapters.
Before concluding this introductory section, one important discussion is
necessary on the distinction between the "Walrasian stability" and the "Marshal-
lian stability."' In the introductory exposition of the stability problem of a com-
petitive equilibrium above, we considered the basic postulate in the form of
$$\dot{p}_i(t) = h_i[f_i(p(t))], \quad i = 1, 2, \ldots, n$$
where $p_i$ is the price of the ith commodity and p(t) is an n-vector in which the
ith element is $p_i(t)$ and $\dot{p}_i(t) = dp_i(t)/dt$. Also, $f_i(p(t))$ is the excess demand function
for the ith commodity and $h_i$ is any (fixed) monotone increasing differentiable real-
valued function. For the case of an isolated market for one commodity, we write
this equation as
$\dot{p} = h[D(p(t)) - S(p(t))]$
where h refers to some fixed monotone increasing differentiable real-valued
function. Or, more simply,
$\dot{p} = k[D(p(t)) - S(p(t))]$
where k > 0 can be interpreted as the "speed of adjustment" of the market.
There are two important premises in the above formulation. One is that
neither demanders nor suppliers can affect the price that prevails in the market,
but rather they take it as given. This is the premise of a competitive market. The
second premise is that the price is the only adjusting parameter of the market.
At each instant of time, demanders and suppliers, respectively, adjust the quanti-
ties that they wish to demand and supply, based only on the information of the
price given to them. This adjustment is assumed to be instantaneous. Then the
price moves as described in the differential equation above. As price moves, the
quantity of excess demand will vary and stability of the market is achieved when
the price moves in such a way that the excess demand vanishes.
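
The price adjustment process just described can be sketched numerically. The example below is a minimal illustration, assuming linear demand and supply curves and a forward Euler discretization of the adjustment equation with speed k; the function name `simulate_price_adjustment`, the particular curves, and the step size are hypothetical choices that do not come from the text.

```python
def simulate_price_adjustment(demand, supply, p0, k=0.5, dt=0.01, steps=5000):
    """Forward Euler integration of dp/dt = k * [D(p) - S(p)].

    Returns the whole price path so the movement toward equilibrium can be inspected.
    """
    p = p0
    path = [p]
    for _ in range(steps):
        excess_demand = demand(p) - supply(p)
        p = p + dt * k * excess_demand   # price rises when demand exceeds supply
        path.append(p)
    return path


# Assumed linear curves: D(p) = 10 - p and S(p) = 2 + p, so D(p) = S(p) at p = 4.
def demand(p):
    return 10.0 - p


def supply(p):
    return 2.0 + p


path = simulate_price_adjustment(demand, supply, p0=1.0)
final_price = path[-1]
final_excess_demand = demand(final_price) - supply(final_price)
print(final_price, final_excess_demand)
```

Starting below the equilibrium price, excess demand is positive, so the simulated price rises and the excess demand shrinks toward zero, which is the sense of stability used in the paragraph above.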
In contrast to such a price adjustment process, the quantity adjustment type
mechanism is often considered. In Figure 3.2, suppose $q^1$ is the given quantity
of the commodity. We denote by $D(q^1)$ and $S(q^1)$, respectively, the price that
buyers are willing to pay and the price that sellers are charging for a given
quantity $q^1$. The dynamic output adjustment equation for the above market can
be written as
$\dot{q} = k[D(q) - S(q)]$
where k > 0 is the speed of adjustment of the market. This reflects the fact that
if D(q) > S(q), for example, the suppliers can profitably increase the quantity
supplied. If the time path of the solution of the above differential equation con-
verges to the equilibrium quantity $\hat{q}$ as t extends without limit, then the equilibrium
is said to be "stable." Such a stability is often called the Marshallian stability,
while the stability in the price adjusting market as discussed before is often called
the Walrasian stability.
These two stability definitions apparently do not coincide. The market can
be Walrasian stable (resp. unstable) but Marshallian unstable (resp. stable). In
the literature, this is often illustrated by diagrams such as those shown in Figure
3.3. The diagrams should be self-explanatory.
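
The kind of case usually drawn in such diagrams can also be reproduced numerically. The sketch below assumes linear curves D(p) = a - b p and S(p) = c + s p with a negatively sloped supply curve (s < 0), and integrates the price adjustment and the output adjustment equations by forward Euler. All names (`euler_endpoint`, `walrasian_stable`, `marshallian_stable`) and parameter values are hypothetical and serve only to illustrate the two definitions.

```python
def euler_endpoint(rate, x0, dt=0.01, steps=2000):
    """Integrate dx/dt = rate(x) by forward Euler and return the final value."""
    x = x0
    for _ in range(steps):
        x += dt * rate(x)
    return x


def equilibrium(a, b, c, s):
    """Price and quantity at which D(p) = a - b*p equals S(p) = c + s*p."""
    p_star = (a - c) / (b + s)
    return p_star, a - b * p_star


def walrasian_stable(a, b, c, s, tol=1e-3):
    """Price adjustment: dp/dt = D(p) - S(p), started slightly above equilibrium."""
    p_star, _ = equilibrium(a, b, c, s)
    rate = lambda p: (a - b * p) - (c + s * p)
    return abs(euler_endpoint(rate, p_star + 0.5) - p_star) < tol


def marshallian_stable(a, b, c, s, tol=1e-3):
    """Output adjustment: dq/dt = (demand price at q) - (supply price at q)."""
    _, q_star = equilibrium(a, b, c, s)
    rate = lambda q: (a - q) / b - (q - c) / s
    return abs(euler_endpoint(rate, q_star + 0.5) - q_star) < tol


# Case 1: supply falls steeply with price (s = -2).
case1 = (walrasian_stable(10, 1, 16, -2), marshallian_stable(10, 1, 16, -2))
# Case 2: supply falls gently with price (s = -0.5).
case2 = (walrasian_stable(10, 1, 7, -0.5), marshallian_stable(10, 1, 7, -0.5))
print(case1, case2)
```

Both cases share the equilibrium p = 6, q = 4. In the first the price process diverges while the output process converges, and in the second the roles are reversed.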
However, comparison of the two concepts as shown in these diagrams
contains a very serious confusion. In essence, these two concepts are in com-
pletely different dimensions and should not be compared in the same figure.
One source of this confusion is probably attributable to Hicks' remark
([1], p. 62) on the distinction between these concepts. He states that the Marshal-
[Figure 3.3: two panels with axes P and q]
printed in 1879; the revision is reprinted in his Money, Trade and Commerce,
London, Macmillan, 1923, appendix J). The curves drawn there became known
later as "offer curves." The intersection of the two offer curves determines the
equilibrium outputs of the two commodities involved. It is assumed that the con-
sumers adjust to their optimum positions instantaneously and the Walrasian
adjustment process is completed instantaneously (the stability in the Walrasian
process is implicitly assumed). The adjustment from an off-equilibrium point to
the equilibrium point described in the above article is purely that of output
adjustment.4
We may remark that both Marshall and Walras clearly realized that there
are these two types of adjustments and they both used them in the proper context.'
Hence it may be rather misleading to call the stability in the price adjustment
process the "Walrasian stability" and the stability in the output adjustment pro-
cess the "Marshallian stability." But since this practice is already much too
common, we will not change it. A difference between these two approaches is
probably that Marshall emphasized the "short-run" output adjustment mecha-
nism and utilized a diagrammatical technique for this adjustment, whereas
Walras emphasized the "temporary" price adjustment mechanism and utilized
a diagrammatical technique for this adjustment in his theory of two-person
exchange.' Moreover, Walras [ 5], in his theory of production, treated the output
adjustment process as the one that occurs simultaneously with the price adjustment
process.`
The question still remains whether the Walrasian price adjustment process
is only relevant to the theory of exchange. I believe it is not. As long as both
demand and supply are functions of prices, the prices must be the final adjust-
ment parameter. After the "temporary" and the "short-run" adjustments are
completed, we should find an equilibrium position in which D(p) = S(p). Hence,
if we wish to abstract such "temporary" processes and "short-run" processes,
we may simply assume that both demand and production adjust instan-
taneously to price and then consider the time path as described by $\dot{p} = k[D(p) - S(p)]$,
and so on. In other words, we can still consider the price adjustment as
the one that describes the mechanism for the final equilibrium (see Walras' theory
of production [5] ).'
In the later revival of the stability theory, starting from Hicks [1] and
Samuelson [4] , the Walrasian type of price adjustment has been the main issue
and little attention has been paid to the output adjustment. This is rather un-
fortunate, but as long as the price is the sole independent variable in a competitive
market, it may be natural to emphasize the price adjustment process (either as a
theory of temporary equilibrium in exchange or as a theory of short-run equi-
librium when all adjustments including output are completed).
In any case, this chapter is dedicated to exploring this recent development
in the price-adjusting theory. We will examine both the mathematical technique
and the conceptual difficulties in this recent development of the theory. The
mathematical exploration serves as a beautiful example of the application of
the theory of differential equations to economics.
FOOTNOTES
REFERENCES
1. Hicks, J. R. Value and Capital, 2nd ed., Oxford, Clarendon Press, 1946.
2. Marshall, A., The Principles of Economics, 8th ed., London, Macmillan, 1920.
3. Newman, P., The Theory of Exchange, Englewood Cliffs, N.J., Prentice-Hall, 1965.
4. Samuelson, P. A., Foundations of Economic Analysis, Cambridge, Mass., Harvard
University Press, 1947.
5. Walras, L., Elements of Pure Economics, 1926 ed., tr. by W. Jaffe, London, George
Allen & Unwin, 1954.
Section B
ELEMENTS OF THE THEORY
OF DIFFERENTIAL EQUATIONS
Here x(t) is a real-valued differentiable function defined on the real line, and a
is some real number which is constant. The two most basic features of the above
equation that one should note are the following:
(i) The above equation holds for all values of t in the domain (here the entire real
line).
(ii) The function x(t) is not a priori specified, that is, it is an "unknown" function.1
$\dot{x}(t) = f[x(t), t]$
is called a system of n first-order differential equations. An $R^n$-valued function $\phi(t)$
defined on a subinterval $(t^1, t^2)$ of T is called a solution of the system if
(i) The function $\phi(t)$ is continuous on $(t^1, t^2)$.
(ii) $\phi(t) \in X$ for all t in $(t^1, t^2)$.
(iii) $\dot{\phi}(t) = f[\phi(t), t]$ for all t in $(t^1, t^2)$, except possibly for the elements of
some countable subset of $(t^1, t^2)$.
the change of variables, $y_1(t) = x(t)$, $y_2(t) = \dot{x}(t)$, ..., $y_n(t) = x^{(n-1)}(t)$,
allows the equations to be rewritten in the following form, which is a system
of n first-order differential equations:
$\dot{y}_i = y_{i+1}, \quad i = 1, 2, \ldots, n - 1$
$\dot{y}_n = \Phi[y_1(t), y_2(t), \ldots, y_n(t), t]$
Hence it is sufficient to consider the theory of a system of n first-order
differential equations (although it may not always be convenient).
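The reduction of an nth-order equation to a first-order system can be sketched in code. In the sketch below, the function name Phi for the right-hand side of the nth-order equation, the helper names, the Runge-Kutta integrator, and the second-order test equation x'' = -x are all assumptions made for illustration only.

```python
import math


def as_first_order_system(Phi, n):
    """Return f(y, t) for y = (x, x', ..., x^(n-1)), given x^(n) = Phi(y, t)."""
    def f(y, t):
        # y_i' = y_{i+1} for i < n, and y_n' = Phi(y, t)
        return [y[i + 1] for i in range(n - 1)] + [Phi(y, t)]
    return f


def rk4(f, y, t, h, steps):
    """Classical fourth-order Runge-Kutta integration of y' = f(y, t)."""
    for _ in range(steps):
        k1 = f(y, t)
        k2 = f([yi + 0.5 * h * k for yi, k in zip(y, k1)], t + 0.5 * h)
        k3 = f([yi + 0.5 * h * k for yi, k in zip(y, k2)], t + 0.5 * h)
        k4 = f([yi + h * k for yi, k in zip(y, k3)], t + h)
        y = [yi + h * (a + 2 * b + 2 * c + d) / 6
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y


# Hypothetical example: x'' = -x with x(0) = 1, x'(0) = 0, whose solution is cos t.
f = as_first_order_system(lambda y, t: -y[0], n=2)
y_end = rk4(f, [1.0, 0.0], 0.0, 0.001, 1000)      # integrate up to t = 1
print(y_end, (math.cos(1.0), -math.sin(1.0)))
```

The first component of y_end approximates x(1) and the second approximates the derivative of x at t = 1, which is why a theory stated for first-order systems suffices.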
Definition: Suppose that a system $\dot{x}(t) = f[x(t), t]$ can be written in the form
$\dot{x}(t) = A(t)\,x(t) + u(t)$
where $A(t) = [a_{ij}(t)]$; then the system is called linear. If $a_{ij}(t)$ = a constant for
all t and for all i and j, then it is called a linear system of constant coefficients. The
function u(t) is called the forcing function or control function of the system, and if
u(t) = 0, the linear system is said to be homogeneous. Unless otherwise specified,
we will be concerned primarily with the nonlinear system $\dot{x}(t) = f[x(t), t]$.
The fundamental theorem in the theory of differential equations, as
mentioned before, is the Cauchy-Peano theorem, which asserts the existence and
uniqueness of the solution. For the purposes of later chapters (especially Chapter
8), we will state this theorem for a system which has a form slightly different from
the one described above. We consider the form
$\dot{x}(t) = f[x(t), u(t), t]$
where u(t) is a known (or a priori given) function (sometimes called a control
function). It is an m vector-valued function of t [or $u(t) \in R^m$]. We may neglect
this function for the purposes of this chapter. We now state the theorem.
REMARKS :
(i) For the proof of this theorem, see Coddington and Levinson [5], chapter
1, or any standard textbook on differential equations.
(ii) Note that no assumptions are made about the existence and continuity of
the partial derivatives $\partial f/\partial u_k$.
(iii) The theorem gives a local result, for it establishes the existence of a solution
on an interval $(t^1, t^2)$, which can be very small.
(iv) In the statement of the theorem, $R^m$ can be replaced by any subset of $R^m$
which contains the closure of the range of u(t).
(v) Assumption (A-2) can be weakened; that is, it can be replaced by the
following condition, called the Lipschitz condition.4
other, then the quasi-stability of the system implies the global stability of the
system.7
The following simple example may be useful to clarify some of the above
concepts.
EXAMPLE: $\dot{x}(t) = -2x(t)$, $t \in R$, $x \in R$, and $x(t^0) = x^0$
Clearly $\hat{x} = 0$ (for all t) is an "equilibrium state" of this equation. It is
easy to see that this solution is unique. The solution of this differential
equation is obtained as
$\phi(t; x^0, t^0) = x^0 e^{-2(t - t^0)}$
Clearly, $\phi(t; x^0, t^0) \to 0$ ($= \hat{x}$) as $t \to \infty$ regardless of the initial value,
$(x^0, t^0)$. In other words, the equilibrium point of the above differential
equation is unique and globally stable. In general, given $\dot{x}(t) = a\,x(t)$, $t \in R$,
$x \in R$, and $x(t^0) = x^0$, $\hat{x} = 0$ is a unique equilibrium state (provided $a \neq 0$). It can easily be
seen that $\hat{x} = 0$ is globally stable if and only if a < 0.
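The role of the sign of a can be seen by evaluating the closed-form solution of the scalar equation for a few values of a. This is only a sketch; the function name phi, the initial value 3, and the sample times are arbitrary choices for illustration.

```python
import math


def phi(t, x0, t0, a):
    """Closed-form solution of x' = a x with x(t0) = x0."""
    return x0 * math.exp(a * (t - t0))


for a in (-2.0, 0.0, 0.5):
    path = [phi(t, 3.0, 0.0, a) for t in (0.0, 5.0, 10.0)]
    print(a, path)
```

For a < 0 the values shrink toward zero, for a = 0 the solution stays at its initial value, and for a > 0 it moves away from zero.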
A diagrammatical device is often useful to ascertain the stability property of
an equilibrium state when the dimension of x(t) is small (say, 1 or 2). To illustrate
this, consider the following example:
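Theorem 3.B.2: Let $\dot{x}(t) = A\,x(t)$ be a given system of differential equations
with $x(0) = x^0$. Then
$\phi(t; x^0) = e^{At} x^0$, where $e^{At} \equiv \sum_{k=0}^{\infty} A^k t^k / k!$
Corollary: If A, in the above system, has n distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, then
the solution can be written as
$\phi(t; x^0) = P e^{\Lambda t} P^{-1} x^0$
where $\Lambda$ is the diagonal matrix with diagonal elements $\lambda_1, \ldots, \lambda_n$ and the columns of P are
corresponding eigenvectors of A.
Hence in this case we can see that the stability property depends crucially
on the eigenvalues, the $\lambda_i$'s. In particular, if the $\lambda_i$'s are all negative, then clearly
$e^{\Lambda t} \to 0$ as $t \to \infty$. Hence the system is globally stable. In general, we have the
following theorem, which holds even when the eigenvalues are not all distinct.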
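Theorem 3.B.2 and its corollary can be checked numerically on a small example. The sketch below compares a truncated power series for the matrix exponential with the eigenvalue decomposition for a 2 x 2 matrix; the matrix A, the helper names, and the restriction to real distinct eigenvalues with a nonzero upper-right entry are assumptions made for this illustration.

```python
import math


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def expm_series(A, t, terms=60):
    """Truncated series sum_k A^k t^k / k! (adequate for moderate |t|)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v * t / k for v in row] for row in matmul(term, A)]
        result = [[r + s for r, s in zip(rr, ss)] for rr, ss in zip(result, term)]
    return result


def expm_eig_2x2(A, t):
    """P exp(Lambda t) P^{-1}, assuming real distinct eigenvalues and A[0][1] != 0."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = math.sqrt(tr * tr - 4.0 * det)
    l1, l2 = (tr + disc) / 2.0, (tr - disc) / 2.0
    P = [[b, b], [l1 - a, l2 - a]]                 # columns are eigenvectors
    det_p = b * (l2 - a) - b * (l1 - a)
    P_inv = [[(l2 - a) / det_p, -b / det_p], [-(l1 - a) / det_p, b / det_p]]
    E = [[math.exp(l1 * t), 0.0], [0.0, math.exp(l2 * t)]]
    return matmul(matmul(P, E), P_inv)


A = [[-2.0, 1.0], [1.0, -3.0]]                     # hypothetical matrix, both eigenvalues negative
print(expm_series(A, 1.5))
print(expm_eig_2x2(A, 1.5))
print(expm_eig_2x2(A, 20.0))                       # entries close to zero for large t
```

The two computations should agree up to rounding for moderate t, and with both eigenvalues negative the entries of the exponential become negligible as t grows.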
Theorem 3.B.3: Let $\dot{x}(t) = A\,x(t)$ be a given system of differential equations. The
equilibrium point $\hat{x} = 0$ is globally stable if and only if the real part of every eigenvalue
of A is negative.
PROOF: See, for example, Bellman [2], [3], Coddington and Levinson [5],
Birkhoff and Rota [4], and Gantmacher [6].
REMARK: If a system is given in the form $\dot{x}(t) = A[x(t) - \hat{x}]$, then,
clearly, $x(t) = \hat{x}$ is an equilibrium state and it is unique if A is nonsingular.
Carrying out the change of variable $y(t) = x(t) - \hat{x}$, we find that y = 0 is the
unique equilibrium state for $\dot{y}(t) = A\,y(t)$, and the above theorem can be
applied immediately.
REMARK: If A is negative definite, then from the elementary theory of
linear algebra, we know that all its eigenvalues are negative; hence the
system is stable.
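Given an arbitrary n × n matrix A, we now wish to know whether all the eigenvalues
of A have negative real parts. There is a famous theorem for this.
Theorem 3.B.4 (Routh-Hurwitz): A necessary and sufficient condition that all the
roots of the equation
$a_0 \lambda^n + a_1 \lambda^{n-1} + \cdots + a_n = 0$
with real coefficients have negative real parts is that the following conditions hold:
$$a_1 > 0,\quad
\begin{vmatrix} a_1 & a_0 \\ a_3 & a_2 \end{vmatrix} > 0,\quad
\begin{vmatrix} a_1 & a_0 & 0 \\ a_3 & a_2 & a_1 \\ a_5 & a_4 & a_3 \end{vmatrix} > 0,\ \ldots,\
\begin{vmatrix}
a_1 & a_0 & 0 & 0 & \cdots & 0 \\
a_3 & a_2 & a_1 & a_0 & \cdots & 0 \\
a_5 & a_4 & a_3 & a_2 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & 0 & \cdots & a_n
\end{vmatrix} > 0$$
Here $a_0$ is taken to be positive (if $a_0 < 0$, then multiply the equation by -1), and
$a_k$ is taken to be zero for k > n.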
PROOF: See Gantmacher [6], chapter XV.
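The determinantal conditions of the theorem can be evaluated mechanically. The following sketch builds the n × n array whose (i, j) element is $a_{2i-j}$ (with $a_k = 0$ outside 0 ≤ k ≤ n) and computes its leading principal minors; the function names and the three test polynomials are hypothetical choices for illustration.

```python
def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    M = [[float(v) for v in row] for row in M]
    n, sign, prod = len(M), 1.0, 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[pivot][c]) < 1e-14:
            return 0.0
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            sign = -sign
        prod *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= factor * M[c][k]
    return sign * prod


def hurwitz_minors(coeffs):
    """Leading principal minors for a0*x^n + a1*x^(n-1) + ... + an, with a0 > 0."""
    n = len(coeffs) - 1

    def a(k):
        return coeffs[k] if 0 <= k <= n else 0.0

    # With 0-based indices, element (i, j) is a_{2(i+1) - (j+1)}.
    H = [[a(2 * i - j + 1) for j in range(n)] for i in range(n)]
    return [det([row[:m] for row in H[:m]]) for m in range(1, n + 1)]


def all_roots_have_negative_real_parts(coeffs):
    return all(m > 0 for m in hurwitz_minors(coeffs))


# (x+1)(x+2)(x+3) = x^3 + 6x^2 + 11x + 6: all roots negative.
print(hurwitz_minors([1, 6, 11, 6]))
# (x+2)(x^2 - x + 3) = x^3 + x^2 + x + 6: positive coefficients, yet two roots
# have positive real part.
print(hurwitz_minors([1, 1, 1, 6]))
# (x-1)(x+2)(x+3) = x^3 + 4x^2 + x - 6: one positive root.
print(hurwitz_minors([1, 4, 1, -6]))
```

The second polynomial shows that positivity of the coefficients alone does not settle the matter; the determinants must be inspected.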
REMARK: The above condition is known as the Routh-Hurwitz condition.
Its power lies in the fact that it provides a necessary and sufficient condition
for stability. However, in actual application, its power is quite weak because
the linear approximation system. It is clear that if an equilibrium point in the linear
approximation system is (globally) stable, then it is locally stable in the original
system. We should note, however, that the converse is not necessarily true. In
other words, it is possible that an equilibrium point is locally or globally stable in
the original system and is not stable in its linear approximation system. This is
due to the fact that higher order terms may act favorably for stability. Consider
the following example.
EXAMPLE: $\dot{x}(t) = a\,x(t) - x(t)^3$. Clearly $\hat{x} = 0$ is an equilibrium point. If
a = 0, $\hat{x} = 0$ is a globally stable equilibrium point. Its linear approximation
system is $\dot{x}(t) = 0$ (when a = 0). Hence $\hat{x} = 0$ is not stable in the linear
approximation system: the solution starting from an initial point $x^0$ always stays
at $x^0$. In order to stress this fact, we call $\hat{x}$, an equilibrium point which is
stable in the linear approximation system, linear approximation stable.11 This point
is often confused in the
literature which applies Samuelson's "correspondence principle" to the
comparative statics problem. Although the Routh-Hurwitz condition
provides a necessary and sufficient condition for the stability of the linear
approximation system, it does not necessarily provide a necessary condition
for the (local) stability of the original (nonlinear) system, for the
reason discussed above. Hence the Routh-Hurwitz condition for the linear
approximation system cannot, in general, be utilized in obtaining comparative
statics results.
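The contrast in this example between the original system (with a = 0) and its linear approximation can be made concrete. The sketch below uses the closed-form solution $x(t) = x^0/\sqrt{1 + 2(x^0)^2 t}$ of the nonlinear equation, which is a standard result supplied here for illustration rather than taken from the text; the function names are likewise hypothetical.

```python
import math


def phi_nonlinear(t, x0):
    """Solution of x' = -x**3 with x(0) = x0 (separable equation)."""
    return x0 / math.sqrt(1.0 + 2.0 * x0 * x0 * t)


def phi_linearized(t, x0):
    """Solution of the linear approximation x' = 0 around the equilibrium 0."""
    return x0


for t in (0.0, 10.0, 1000.0, 1000000.0):
    print(t, phi_nonlinear(t, 2.0), phi_linearized(t, 2.0))
```

The nonlinear path approaches zero (slowly), while the path of the linear approximation never leaves its starting point, so stability of the original system cannot be read off the linear approximation here.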
FOOTNOTES
1. Observe also that the unknown function here, x(t), contains only one independent
variable, t. This is the defining characteristic of an ordinary differential equation. If
the unknown function contains more than one independent variable, then we have
a "partial differential equation." Here we are solely concerned with ordinary
differential equations.
312 THE STABILITY OF COMPETITIVE EQUILIBRIUM
2. Note that the interval $(t^1, t^2)$ on which the solution $\phi(t)$ is defined is a subset of
T. In other words, $\phi(t)$ may not be defined on the entire interval T. For example,
consider $\dot{x} = x^2$, $x \in R$. Clearly $\phi(t) = -1/t$ is a solution which passes through
$\phi(1) = -1$. However, $\phi(t)$ is not defined at t = 0. The existence theorem here asserts
only the existence of $\phi(t)$ in a neighborhood of $t^0$, that is, $(t^1, t^2)$.
3. When some of the assumptions of the theorem are violated, the solution which
satisfies the initial condition, even if it exists, may not be unique. For example, consider
$\dot{x} = \sqrt{x}$ if $x \geq 0$, and $\dot{x} = 0$ if x < 0, with $x \in R$ and $\phi(0) = 0$. Clearly
[$\phi(t) = 0$ for all t, $-\infty < t < \infty$] is a solution which satisfies $\phi(0) = 0$. However,
[$\phi(t) = t^2/4$, if $t \geq 0$, and $\phi(t) = 0$, if t < 0] is also a solution which satisfies $\phi(0) = 0$.
Here (A-2) is violated at x = 0. Given the initial point $(t^0, x^0)$, the problem of finding
the solution $\phi(t)$, defined on $(t^1, t^2)$, of a given system of differential equations which
satisfies $\phi(t^0) = x^0$, is called the initial value problem.
4. It can be shown that if f(x, t) has continuous partial derivatives, it satisfies the
Lipschitz condition. But $f(x) = \sqrt{x}$ (where $x \in R$, $x \geq 0$) does not satisfy even the
Lipschitz condition at x = 0. To see this, observe that $|\sqrt{x} - \sqrt{y}| = |x - y| / (\sqrt{x} + \sqrt{y})$
where $x \geq 0$ and $y \geq 0$ (not both zero). When x and y approach 0, $1/(\sqrt{x} + \sqrt{y})$ will
increase indefinitely. In general, the Lipschitz condition [or (A-2)] is crucial
to guarantee the uniqueness of the solution which satisfies $\phi(t^0) = x^0$. If f is not
Lipschitzian [hence (A-2) is violated] but if all the other assumptions of Theorem
3.B.1 are satisfied, then all the conclusions of Theorem 3.B.1 follow except (iv);
that is, the existence of $\phi(t)$ is guaranteed but not its uniqueness. Here the continuity
of f is the crucial assumption for existence.
5. However, in specific cases, global existence (and uniqueness) can be ascertained.
The procedure is as follows. Suppose that the solution $\phi(t)$ exists. Suppose f is
bounded as well as continuous in $X \otimes T$. Then we can show that $\phi(t^1 + 0)$ [that is,
$\lim \phi(t)$ as $t \to t^1$ with $t > t^1$] and $\phi(t^2 - 0)$ [that is, $\lim \phi(t)$ as $t \to t^2$ with $t < t^2$]
both exist. Suppose $\phi(t^1 + 0)$ and $\phi(t^2 - 0)$ are in X; then the solution exists in
neighborhoods of $(t^1 + 0)$ and $(t^2 - 0)$ by Theorem 3.B.1. In this way the solution can
be "continued" or extended to an interval which is larger than $(t^1, t^2)$ and, therefore,
we can prove the existence (and the uniqueness) of solutions for the interval
$(0, \infty)$ under certain assumptions. For the discussion of the "continuation" of
solutions, the reader is referred to any standard textbook on differential equations.
6. In other words, if for some sequence $t_q$, q = 1, 2, ..., such that $t_q \to \infty$,
$\lim \phi(t_q, x^0, t^0)$ as $q \to \infty$ exists, then $\lim \phi(t_q, x^0, t^0)$ as $q \to \infty$ is an equilibrium. As
Uzawa ([8], p. 619) has shown, the concept of quasi-stability in essence means
that the "distance" between the set of equilibrium points and $\phi(t, x^0, t^0)$ converges
to zero as $t \to \infty$.
7. In other words, whether or not equilibrium points are isolated is crucial. As an
example of a system that is quasi-stable but not (globally) stable, consider the case
in which $\phi(t, x^0, t^0)$ spirals toward the unit circle as t increases but approaches
no single point on the unit circle, while the set of equilibrium points is the unit
circle. See Section H of this chapter.
8. Let z be a scalar (real or complex). Then the exponential function $e^z$ can be
defined by $e^z = \sum_{k=0}^{\infty} z^k/k!$; hence the definition of $e^{At}$ below conforms with
this definition. When z is a real number, then the above definition of $e^z$ can be
obtained as a consequence of the usual definition of $e^z$ by using the Taylor expansion
theorem.
9. Readers who are not familiar with the concept of eigenvalues are referred to the
beginning of Section B, Chapter 4 (or any standard textbooks on matrix algebra).
10. Notice that the definition of $e^{\Lambda t}$ conforms with the above definition of $e^{At}$.
11. In the above example, $\dot{x} = -x^3$ (with a = 0), $\hat{x} = 0$ is not linear approximation
stable.
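The non-uniqueness example of footnote 3 can be checked numerically. The sketch below verifies, by central differences on a small grid, that both the zero function and the parabola branch satisfy the differential equation with the same initial condition; the function names, the grid, and the step h are arbitrary choices for illustration.

```python
import math


def f(x):
    """Right-hand side of the example: sqrt(x) for x >= 0, and 0 for x < 0."""
    return math.sqrt(x) if x >= 0 else 0.0


def phi_zero(t):
    return 0.0


def phi_parabola(t):
    return t * t / 4.0 if t >= 0 else 0.0


def max_residual(phi, h=1e-6):
    """Largest |phi'(t) - f(phi(t))| on a grid, with phi' from central differences."""
    grid = [-2.0 + 0.25 * i for i in range(17)]          # t = -2, -1.75, ..., 2
    return max(abs((phi(t + h) - phi(t - h)) / (2 * h) - f(phi(t))) for t in grid)


print(max_residual(phi_zero), max_residual(phi_parabola))
print(phi_zero(0.0), phi_parabola(0.0))                  # same initial condition
```

Both residuals are negligible, so two distinct functions pass through the same initial point, which is possible only because the Lipschitz condition fails at x = 0.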
REFERENCES
1. Athans, M., and Falb, P. L., Optimal Control, New York, McGraw-Hill, 1966,
esp. chap. 3.
2. Bellman, R., Stability Theory of Differential Equations, New York, McGraw-Hill,
1953.
3. Bellman, R., Introduction to Matrix Analysis, New York, McGraw-Hill, 1960.
4. Birkhoff, G., and Rota, G.-C., Ordinary Differential Equations, Boston, Ginn & Co.,
1962.
5. Coddington, E. A., and Levinson, N., Theory of Ordinary Differential Equations, New
York, McGraw-Hill, 1955.
6. Gantmacher, F. R., The Theory of Matrices, Vol. II, New York, Chelsea Publishing
Co., 1959 (tr. from Russian).
7. Samuelson, P. A., Foundations of Economic Analysis, Cambridge, Mass., Harvard
University Press, 1947.
8. Uzawa, H., "The Stability of Dynamic Processes," Econometrica, 29, October 1961.
Section C
THE STABILITY OF
COMPETITIVE EQUILIBRIUM-
THE HISTORICAL BACKGROUND
ficult and its solution has already been indicated in Section A. The complications
arise when we have repercussions among a number of markets.
The first satisfactory treatment of the stability of a competitive equilibrium
was done by Leon Walras [18]. He solved this problem fairly completely for the
two-commodity exchange economy. Hicks [7] extended the scope of the analysis
to a multicommodity economy. For the multimarket case, we suspect naturally
that repercussions among the various markets will complicate the analysis a great
deal. In order to deal with this problem, following Hicks, we distinguish two concepts
of "stability." An equilibrium in the market for the jth commodity (henceforth
the jth market) is said to be imperfectly stable if the markets for all the other
commodities are held in equilibrium (with possible adjustment of the prices of
these goods) and there is stability in the jth market. Let $\hat{p}$ be an equilibrium
price vector [that is, $f(\hat{p}) = 0$] (which is assumed to exist), and suppose that
the price vector p deviates from this $\hat{p}$. In order to avoid complicating the discussion,
let us assume that p lies in a certain small neighborhood of $\hat{p}$ (by this
convention we would like to avoid for the time being the problem of multiple
equilibria and local vs. global stability). Then the equilibrium in the jth market
is said to be imperfectly stable if
(i) $f_i(p) = 0$ for all $i \neq j$,
and
(ii) $p_j > \hat{p}_j$ implies $f_j(p) < 0$ and $p_j < \hat{p}_j$ implies $f_j(p) > 0$.
The equilibrium in the jth market is said to have perfect stability if the above
imperfect stability holds regardless of the number of the other markets adjusted to
equilibrium, or more specifically, whether or not other prices are fixed or adjusted
so as to maintain equilibrium in the relevant market [that is, for $i \neq j$, either
$f_i(p) = 0$ or $p_i$ = constant]. If the equilibrium in every market in the economy
is imperfectly stable (say, at $\hat{p}$), then Hicks states that the equilibrium of the system
is imperfectly stable. If the equilibrium in every market in the economy is perfectly
stable, then Hicks states that the equilibrium of the system is perfectly stable.
The Hicksian method of stability analysis is essentially that of comparative
statics. By differentiating the equilibrium system f(p) = 0 at a certain equilibrium
(say, $\hat{p}$) with respect to $p_j$ and applying the definitions of perfect stability
and imperfect stability (then repeating this for all j = 1, 2, ..., n), Hicks obtained
the following condition for perfect stability for the equilibrium of the system:
$$a_{ii} < 0,\quad
\begin{vmatrix} a_{ii} & a_{ij} \\ a_{ji} & a_{jj} \end{vmatrix} > 0,\quad
\begin{vmatrix} a_{ii} & a_{ij} & a_{ik} \\ a_{ji} & a_{jj} & a_{jk} \\ a_{ki} & a_{kj} & a_{kk} \end{vmatrix} < 0,\ \ldots$$
for all (distinct) i, j, k, ..., of the index set {1, 2, ..., n}. (See Quirk and Saposnik [16],
pp. 153-160, as well as Hicks [7].) Here $a_{ij} \equiv \partial f_i/\partial p_j$, evaluated at $\hat{p}$, i, j = 1, 2,
..., n. It is to be noted that the Walrasian condition for the stability of the two-commodity
economy corresponds to the first of the above conditions (that is,
$a_{ii} < 0$). When an n × n matrix $A = [a_{ij}]$ satisfies the above condition, A is said to
be Hicksian. The Hicksian condition for the imperfect stability was obtained as
$\Delta_{ii}/\Delta < 0$, where $\Delta$ denotes the determinant of A and $\Delta_{ii}$ denotes the cofactor of
$a_{ii}$ in $\Delta$.
The above Hicksian concepts of perfect and imperfect stability (in addition
to the assumption of timeless and instantaneous adjustment) clearly have an air
of artificiality about them and thus require some further examination. A care-
ful scrutiny of these concepts will reveal that they may in fact have little to
do with the stability problem that we are considering. Instead of checking these
points, Samuelson [17] proposed a fresh approach to the problem. First, he writes
the fundamental assumption of stability analysis as the following system of differential
equations:
$\frac{dp_i(t)}{dt} = k_i f_i[p_1(t), p_2(t), \ldots, p_n(t)], \quad i = 1, 2, \ldots, n$
Here $k_i$ denotes the speed of adjustment of the ith market.2 The fundamental assumption
of stability analysis specifies that $k_i$ is strictly positive. Stability
analysis is then reduced to examining the stability property of the dynamic system
generated by the above system of differential equations. Alternatively, one may also
formulate the fundamental assumption of stability analysis in terms of the following
system of difference equations:
$p_i(t+1) - p_i(t) = k_i f_i[p(t)], \quad i = 1, 2, \ldots, n$
where $k_i > 0$ is the speed of adjustment of the ith market.
Whichever approach one takes, we say that an equilibrium (or, more specifically,
an equilibrium price vector $\hat{p}$) is "stable" if the time path of the solution of
the dynamic system, starting from an initial point $p^0$, converges to $\hat{p}$. When this is
the case, Samuelson calls the equilibrium truly dynamically stable. We can distinguish
here between local stability and global stability. Samuelson was mainly
concerned with local stability. We may note that either one of the above dynamic
systems describes the behavior of the price vector when it is not an equilibrium
point. We can thus carry out the stability analysis by examining the stability property
of either of the above dynamic systems. Partly because the theory of differential
equations is more developed than the theory of difference equations, the
later development occurs mostly through the differential equation approach.
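The two formulations of the adjustment process can be compared in a small simulation. Everything specific in the sketch below is hypothetical: the linear excess demand functions, the equilibrium price vector (1, 1), the speeds of adjustment, and the Euler step used to approximate the differential equations.

```python
def excess_demand(p):
    """Hypothetical linear excess demand f(p) = A (p - p_hat), with p_hat = (1, 1)."""
    A = [[-2.0, 1.0], [1.0, -2.0]]
    p_hat = [1.0, 1.0]
    return [sum(A[i][j] * (p[j] - p_hat[j]) for j in range(2)) for i in range(2)]


def differential_process(p0, k, dt=0.01, steps=5000):
    """Euler approximation of dp_i/dt = k_i f_i(p)."""
    p = list(p0)
    for _ in range(steps):
        f = excess_demand(p)
        p = [p[i] + dt * k[i] * f[i] for i in range(2)]
    return p


def difference_process(p0, k, periods=200):
    """Iteration of p_i(t+1) - p_i(t) = k_i f_i(p(t))."""
    p = list(p0)
    for _ in range(periods):
        f = excess_demand(p)
        p = [p[i] + k[i] * f[i] for i in range(2)]
    return p


speeds = (0.5, 0.3)
print(differential_process([2.0, 0.5], speeds))
print(difference_process([2.0, 0.5], speeds))
```

For these particular numbers both processes approach the same price vector; this is an illustration only and does not establish convergence in general.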
This (dynamic) approach by Samuelson is conceptually much more trans-
parent than Hicks's approach in the sense that it properly handles the general
equilibrium nature of the stability analysis (that is, the repercussions among
various markets). It also has the advantage that it makes clear the dynamic
character of the adjustment process toward an equilibrium.
Samuelson then takes a linear approximation of the above system [that is,
he takes only the linear terms of the Taylor expansion of $f_i(p)$ about an equilibrium
point $\hat{p}$] and obtains
$\frac{dp(t)}{dt} = KA\,[p(t) - \hat{p}]$
where $A = [a_{ij}]$, and K is a diagonal matrix whose diagonal elements are $k_i$ and
whose nondiagonal elements are all zero. Then the stability analysis of a competitive
market is reduced to the stability analysis of the above system of linear differential
equations. We now recall the discussion of Section B, that is, a necessary
and sufficient condition for stability is that all the eigenvalues of KA have negative
real parts. In order to establish this property for the matrix KA, we refer to the Routh-
Hurwitz condition (Theorem 3.B.4).3 We recall that stability in the above linear
approximation system (that is, "linear approximation stability") implies local
stability in the original system and that the converse of this statement is not
necessarily true. In other words, an equilibrium point can be locally stable in the
original system but it may not be stable in the linear approximation system. We
note that as long as we deal with the stability of the linear approximation system,
we can also use the theory of linear differential equations.
Samuelson then considers the relation between the Hicksian stability (the
conditions of which were discussed above) and the true dynamic stability (in the
linear approximation system). He concludes that (1) for the two-commodity case,
the two conditions are equivalent, (2) for the three-commodity case, the Hicks
condition for perfect stability (that is, that matrix A be Hicksian) is sufficient for
true dynamic stability, and (3) for the n-commodity case (n > 3), the Hicks
condition for perfect stability is neither necessary nor sufficient for true dynamic
stability. This relation between Hicks' condition for perfect stability and
dynamic stability is explained in more detail in the literature (for example,
Samuelson [17], Lange [8], Metzler [10], and Morishima [11]), with the following
results.
(i) If A is symmetric (that is, $a_{ij} = a_{ji}$) and if $k_i = 1$ for all i, the Hicksian condition
is equivalent to the dynamic condition (Samuelson and Lange). This can
be seen easily by noting that if A is symmetric and Hicksian, then it is negative
definite, which implies that the real parts of the eigenvalues of A are always
negative.
(ii) If A is quasi-negative definite [that is, (A + A')/2 is negative definite where A'
is the transpose of A], and if $k_i = 1$ for all i, then Hicks' condition is equivalent
to the dynamic condition (Samuelson).
(iii) If the dynamic process is stable regardless of the values of the speeds of adjustment,
then Hicks' condition must be satisfied (Metzler).
(iv) If A has all its nondiagonal elements positive ($a_{ij} > 0$; $i \neq j$), Hicks' condition
is equivalent to the dynamic condition (Metzler).
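The Hicksian condition (principal minors of order m having the sign of $(-1)^m$) is easy to test on examples. The sketch below checks it for any square matrix and, for the two-commodity case only, compares it with the dynamic condition via the trace and determinant of KA; the function names and the sample matrices are hypothetical.

```python
from itertools import combinations


def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def is_hicksian(A):
    """True if every principal minor of order m has the sign of (-1)**m."""
    n = len(A)
    for m in range(1, n + 1):
        for idx in combinations(range(n), m):
            minor = det([[A[i][j] for j in idx] for i in idx])
            if minor * (-1) ** m <= 0:
                return False
    return True


def is_dynamically_stable_2x2(A, k=(1.0, 1.0)):
    """For 2 x 2 KA: both eigenvalues have negative real parts iff trace < 0 and det > 0."""
    (a, b), (c, d) = A
    trace = k[0] * a + k[1] * d
    determinant = k[0] * k[1] * (a * d - b * c)
    return trace < 0 and determinant > 0


metzler_stable = [[-2.0, 1.0], [3.0, -4.0]]      # positive off-diagonal elements
metzler_unstable = [[-1.0, 2.0], [3.0, -1.0]]
for A in (metzler_stable, metzler_unstable):
    print(is_hicksian(A), is_dynamically_stable_2x2(A))
print(is_hicksian([[-3.0, 1.0, 1.0], [1.0, -3.0, 1.0], [1.0, 1.0, -3.0]]))
```

In both two-commodity examples the two tests agree, as statement (iv) leads one to expect for matrices with positive off-diagonal elements.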
$a_{ij} = \frac{\partial f_i}{\partial p_j} = \frac{\partial x_i}{\partial p_j} = \sum_{k=1}^{m} \frac{\partial x_{ik}}{\partial p_j}$
(evaluated at $\hat{p}$). Hence $a_{ij} > 0$ if $\partial x_{ik}/\partial p_j > 0$ for all k = 1, 2, ..., m. This means
that for each consumer the demand for the ith commodity rises when the price of
the jth commodity rises. There should be no confusion between this concept of
substitutability and ordinary (net) substitutability. The latter is concerned with
the (positive) effect of a change in the price of commodity j on the demand for
commodity i when real income is properly compensated. Such a qualification of income
compensation is absent in "gross substitutability." That is, when $\partial x_{ik}/\partial p_j > 0$
holds (for all p), we say that commodity i is a gross substitute of j for Mr. k with
respect to the change in the price of j ($i \neq j$). Hence $a_{ij} > 0$ for all i and j ($i \neq j$) is
guaranteed if all the commodities are gross substitutes for each other for every
consumer in this pure exchange economy. We call this case, $a_{ij} > 0$ for all i and
j ($i \neq j$), the gross substitute case.
This gross substitute case attracted the attention of many economists, and in
1958 (that is, about ten years after the publication of Samuelson's Foundations
[17]), Hahn [6], Negishi [12], and Arrow and Hurwicz [2] independently
proved that if $a_{ij} > 0$, $i \neq j$, then the equilibrium point is stable in the linear approximation
system; hence it is locally stable in the original system. Note that in
statement (iv) above, Hicks' condition for perfect stability is stated as a necessary
and sufficient condition for dynamic stability. What Arrow and Hurwicz, Negishi,
and Hahn proved is that Hicks' condition can be totally dispensed with in the gross
substitute case. The novelty of their proof is that they take full advantage of the
implications of the economic assumptions underlying the competitive model, such
as Walras' Law, and the zero homogeneity of the individual's demand function.
In 1959 Arrow, Block, and Hurwicz [1] finally proved that the original system is
globally stable if all commodities are gross substitutes for each other and put an end
to one of the major periods in the history of the stability of competitive markets.
We may simply list some other major points considered after Samuelson [17].4
(i) The nonnegativity of the price vector has been explicitly considered (Nikaido
and Uzawa [15]).
(ii) Expectation has been introduced into the model (Enthoven and Arrow [5] for
the extrapolative expectation and Arrow and Nerlove [4] for adaptive expectation).
(iii) Non-tâtonnement processes have been introduced and examined.5
(iv) Some attempts to relax the gross substitutability assumption have been made.
In the course of such attempts, important examples of unstable equilibria
have been discovered by Scarf (see Section F), which in turn cast dark shadows
on the scope of the stability of competitive markets and the method of finding
an equilibrium by such an adjustment mechanism of the markets.
Finally, two remarks are in order. In the course of the proof, it was noticed
that the speed of adjustment, $k_i$, is immaterial for the stability property. Arrow
and Hurwicz [2] and Arrow, Block, and Hurwicz [1] noted that by choosing the
units of measurement properly, we can choose $k_i = 1$ for all i and for all t. If this is
the case, our basic dynamic adjustment system is simplified as
$\frac{dp_i(t)}{dt} = f_i[p_1(t), p_2(t), \ldots, p_n(t)], \quad i = 1, 2, \ldots, n$
or
$\frac{dp(t)}{dt} = f[p(t)]$
The second remark is concerned with the equilibrium state. In the system
f(p) = 0, which defines an equilibrium state, we note that one commodity can
be taken as the numeraire (for example, po = 1). If every individual's budget
relation holds with equality (that is, if everybody spends all his "income"-as a
result of nonsatiation and the like), then we have the relation known as Walras'
Law. That is, the price-weighted sum of all the excess demands is identically
equal to zero. This relation is supposed to hold whether the economy is in equi-
librium or not. When one of the prices is taken to be the numeraire and one of
the equations in the system is dropped because of Walras' Law, we say that the
system is a normalized system; otherwise it is a nonnormalized system. That one
commodity can be chosen as numeraire depends on the homogeneity assumptions.
For the nonnormalized system, none of the commodities is designated as numeraire,
although the homogeneity of the excess demand functions is usually assumed
to be still binding on the system. The Hicksian discussion on stability is based on
the normalized system, while dynamic stability can be (and has been) discussed
FOOTNOTES
REFERENCES
1. Arrow, K. J., Block, H. D., and Hurwicz, L., "On the Stability of the Competitive
Equilibrium, II," Econometrica, 27, January 1959.
2. Arrow, K. J., and Hurwicz, L., "On the Stability of the Competitive Equilibrium,
I," Econometrica, 26, October 1958.
3. Arrow, K. J., and Hurwicz, L., "Decentralization and Computation in Resource Allocation,"
in Essays in Economics and Econometrics, ed. by Pfouts, Chapel Hill, N.C., University
of North Carolina Press, 1960.
4. Arrow, K. J., and Nerlove, M., "A Note on Expectation and Stability," Econometrica,
26, April 1958.
5. Enthoven, A. C., and Arrow, K. J., "A Theorem on Expectations and the Stability of
Equilibrium," Econometrica, 24, July 1956.
6. Hahn, F. H., "Gross Substitutes and the Dynamic Stability of General Equilibrium,"
Econometrica, 26, January 1958.
7. Hicks, J. R., Value and Capital, 2nd ed., Oxford, Clarendon Press, 1946.
8. Lange, O., Price Flexibility and Employment, Bloomington, Ind., Principia Press, 1944.
9. McKenzie, L. W., "Stability of Equilibrium and the Value of Positive Excess
Demand," Econometrica, 28, July 1960.
10. Metzler, L., "Stability of Multiple Markets: The Hicks Conditions," Econometrica,
13, October 1945.
11. Morishima, M., "Notes on the Theory of Stability of Multiple Exchange," Review of
Economic Studies, XXIV, 1957.
12. Negishi, T., "A Note on the Stability of an Economy Where All Goods Are Gross
Substitutes," Econometrica, 26, July 1958.
13. Negishi, T., "The Stability of a Competitive Economy: A Survey Article," Econometrica,
Section D
A PROOF OF GLOBAL STABILITY
FOR THE THREE-COMMODITY CASE
(WITH GROSS SUBSTITUTABILITY)-
AN ILLUSTRATION OF THE
PHASE DIAGRAM TECHNIQUE
Our problem is to find out whether or not the solution of the above system of
differential equations, $p(t; p^0, 0)$, or simply $p(t; p^0)$, converges to the equilibrium
price vector $\hat{p}$, where $\hat{p}$ is defined by $f_i(\hat{p}) = 0$, i = 1, 2, 3. The phase diagram technique
is a device which shows the time path of $p(t; p^0)$ without explicitly solving
the differential equations. The technique (for the present case) is essentially based
on the fact that each [$\dot{p}_i = 0$] curve or [$f_i(p) = 0$] curve (i = 1, 2) [that is, the
locus of $(p_1, p_2)$ such that $\dot{p}_i = k_i f_i(p) = 0$] divides the entire $(p_1, p_2)$-plane into two
regions: the region in which $\dot{p}_i > 0$ and the region in which $\dot{p}_i < 0$, where i = 1, 2.
Recall in this connection that $p_3 = 1$ always. Hence we can omit any consideration
of $p_3$ or $\dot{p}_3$. This enables us to consider the problem in the two-dimensional plane.
First we ascertain the shape of the [$f_i(p) = 0$] curves (i = 1, 2). We assert
that they are both upward sloping and that they intersect only once [hence an
equilibrium point, that is, a point in which $f_i(p) = 0$ for all i, if it exists, is unique].
Moreover, we can assert that the [$f_2(p) = 0$] curve intersects the [$f_1(p) = 0$]
curve "from the left." By checking the signs of $\dot{p}_1$ and $\dot{p}_2$ in the four regions defined
by these two curves, we will be able to ascertain the global stability of $\hat{p}$. We now
pursue this process in detail.
First observe that the following "Euler's equation" holds owing to the homogeneity
assumption (A-2):
$\sum_{j=1}^{3} \frac{\partial f_i}{\partial p_j}\, p_j = 0$ for all p, i = 1, 2, 3
is, suppose that there exists another equilibrium point p* > 0, $f(p^*) = 0$ and
$p^* \neq \hat{p}$, where $p^*_3 = \hat{p}_3 = 1$. Then the [$f_1(p) = 0$] curve and the [$f_2(p) = 0$] curve
intersect at least twice, that is, at points p* and $\hat{p}$. Then at one of these two points
(say, at p*) the [$f_1(p) = 0$] curve must intersect the [$f_2(p) = 0$] curve from the
left. This is illustrated in Figure 3.5.
[Figure 3.5: diagram in the $(p_1, p_2)$-plane showing the $f_i(p) = 0$ curves]
Now consider a point p in the diagram; p is chosen such that $p_1 > p^*_1$ and $p_2 > p^*_2$.
(Note that $p_3 = p^*_3 = \hat{p}_3 = 1$.) Then $f_1(p) > 0$ and $f_2(p) > 0$. Totally differentiate
$f_3(p)$, and obtain
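[Figure: phase diagram in the $(p_1, p_2)$-plane showing the $k_1 f_1 = 0$ and $k_2 f_2 = 0$ curves and the signs of $\dot{p}_1$ and $\dot{p}_2$ in the regions they define]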
stitutability, is that, regardless of the initial value of the price vector, the price path
[$p_1(t)$, $p_2(t)$] is "trapped" inside the region in which $\dot{p}_1 < 0$ and $\dot{p}_2 < 0$ with $p_1 > \hat{p}_1$
and $p_2 > \hat{p}_2$, or inside the region in which $\dot{p}_1 > 0$ and $\dot{p}_2 > 0$ with $p_1 < \hat{p}_1$ and $p_2 < \hat{p}_2$.
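The behavior of the price paths in the three-commodity case can be illustrated by simulation. The sketch below is not part of the proof: it assumes a hypothetical pure exchange economy with two consumers whose demands come from fixed expenditure shares (so that all goods are gross substitutes), takes the third commodity as numeraire ($p_3 = 1$), and approximates the adjustment equations by Euler steps. The shares, endowments, speeds, and function names are all invented for the illustration.

```python
ALPHA = [(0.5, 0.3, 0.2), (0.2, 0.3, 0.5)]    # hypothetical expenditure shares
OMEGA = [(1.0, 2.0, 3.0), (3.0, 2.0, 1.0)]    # hypothetical endowments


def excess_demand(p):
    """Aggregate excess demand f_i(p), i = 1, 2, 3, for the assumed economy."""
    f = []
    for i in range(3):
        demand = sum(a[i] * sum(pj * wj for pj, wj in zip(p, w)) / p[i]
                     for a, w in zip(ALPHA, OMEGA))
        f.append(demand - sum(w[i] for w in OMEGA))
    return f


def price_path_end(p1, p2, k=(1.0, 1.0), dt=0.01, steps=3000):
    """Euler approximation of dp_i/dt = k_i f_i(p), i = 1, 2, with p_3 fixed at 1."""
    for _ in range(steps):
        f = excess_demand((p1, p2, 1.0))
        p1, p2 = p1 + dt * k[0] * f[0], p2 + dt * k[1] * f[1]
    return p1, p2


# For the numbers assumed above, f(p) = 0 at p = (1, 6/7, 1).
for start in [(0.2, 3.0), (5.0, 0.3), (3.0, 3.0), (0.3, 0.3)]:
    print(start, price_path_end(*start))
```

For this particular economy every starting point tried leads to the same price vector, in line with the global stability asserted for the gross substitute case.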
FOOTNOTES
drawn for each given $x^0$ and is continuous. Such a curve is called the (solution)
path or orbit, and the (x1, x2)-plane on which the solution path is drawn is called
the phase space. Since there can be many possible initial points, we can draw a family
of the solution paths, each path corresponding to each initial point. The phase
diagram technique is concerned with the technique of studying the behavior of the
solution paths on the phase space, without actually solving the given system of
differential equations.
REFERENCES
1. Arrow, K. J., and Hurwicz, L., "On the Stability of the Competitive Equilibrium, I,"
Econometrica, 26, October 1958.
2. Hicks, J. R., Value and Capital, 2nd ed., Oxford, Clarendon Press, 1946.
3. Marshall, A., Money, Credit and Commerce, London, Macmillan, 1923, appendix J
(this appendix was originally published in 1879 as "The Pure Theory of Foreign
Trade").
Section E
A PROOF OF GLOBAL STABILITY
WITH GROSS SUBSTITUTABILITY-
THE n-COMMODITY CASE
$p(t, p^0)$, $t \in [0, \infty)$ for the above dynamic system, there exists a positive equilibrium
price vector $\hat{p}$ [that is, $f_i(\hat{p}) = 0$, $i = 1, 2, \ldots, n$], and the demand functions are
single-valued and continuously differentiable. Finally, we assume that the following
relations hold:
(A-1) (Walras' Law) $\sum_{i=1}^{n} p_i f_i(p) = 0$.
(A-2) (Homogeneity) $x_i(p) = x_i(\alpha p)$, $i = 1, 2, \ldots, n$, for any positive number $\alpha$.
(A-3) (Gross substitutability) $\partial x_i(p)/\partial p_j > 0$, for all $p$, $i \neq j$, $i, j = 1, 2, \ldots, n$.³
REMARK: All the above relations are assumed to hold for any t > 0 and
$p(t) > 0$.⁴ Thus we may rewrite Walras' Law as
$$\sum_{i=1}^{n} p_i(t)\, f_i[p(t)] = 0$$
Lemma 3.E.2: The homogeneity and the gross substitutability assumptions, that is,
(A-2) and (A-3), imply that the equilibrium price vector is unique up to a positive
scalar multiple.
REMARK: This lemma states that any equilibrium price vector may be
expressed in the form $\alpha \hat{p}$ where $\alpha$ is some positive number. Geometrically,
this means that there is a unique "equilibrium ray" $\{\alpha \hat{p} : \alpha > 0\}$.
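PROOF: Suppose not. In other words, let $\hat{p}$ and $p^*$ be two equilibrium price
vectors such that $p^* \neq \alpha \hat{p}$ for any positive $\alpha$. Let $\hat{p}_k/p_k^* \equiv \min_i \{\hat{p}_1/p_1^*,
\hat{p}_2/p_2^*, \ldots, \hat{p}_n/p_n^*\}$, and write $\mu \equiv \hat{p}_k/p_k^*$. By definition, $\mu \leq \hat{p}_i/p_i^*$
for all $i$, or $\hat{p}_i \geq \mu p_i^*$ for all $i$. Since $p^* \neq \alpha \hat{p}$ for any positive $\alpha$, $\hat{p}_i > \mu p_i^*$
for some $i \neq k$. Write $p' \equiv \mu p^*$. Then $\hat{p}_i \geq p'_i$ for all $i$ with strict inequality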
$$\sum_{i=1}^{n} \hat{p}_i f_i(p) > 0 \quad \text{for all } p > 0 \text{ such that } p \neq \alpha \hat{p} \text{ for any } \alpha > 0$$
[Figure 3.7. The lines AB and CD through the point $\hat{x}$.]
whose slope is given by $p$. Let CD be the line which passes through the point
$\hat{x}$ and whose slope is given by $\hat{p}$. We assume that $\hat{p}_1/\hat{p}_2 > p_1/p_2$ (under the
assumption $\hat{p}_1/\hat{p}_2 < p_1/p_2$, the lemma can be proved analogously). In other
words, we assume that the line CD is steeper than the line AB. This assumption
means that $\hat{p}_1/p_1 > \hat{p}_2/p_2$. Write $\mu = \hat{p}_2/p_2$. Then $\hat{p}_1 > \mu p_1$ and $\hat{p}_2 = \mu p_2$.
Hence the gross substitutability assumption implies that $x_2(\hat{p}) > x_2(\mu p)$.
But Walras' Law implies that $\mu p_1 x_1(\hat{p}) + \mu p_2 x_2(\hat{p}) = \mu p_1 \hat{x}_1 + \mu p_2 \hat{x}_2 =
\mu p_1 x_1(\mu p) + \mu p_2 x_2(\mu p)$, so that we must have $x_1(\hat{p}) < x_1(\mu p)$. Using the
homogeneity assumption, we get $x_1(p) = x_1(\mu p) > x_1(\hat{p}) = \hat{x}_1$, and $x_2(p) =
x_2(\mu p) < x_2(\hat{p}) = \hat{x}_2$. Hence point $x(p) = [x_1(p), x_2(p)]$ must lie to the
right of the point $\hat{x}$ in Figure 3.7. Now draw a line parallel to CD passing
through the point $x(p)$. We see at once that $\hat{p} \cdot x(p) > \hat{p} \cdot \hat{x}$. Hence $\hat{p} \cdot f(p) > 0$
where $f(p) = [f_1(p), f_2(p)]$. (Q.E.D.)
REMARK: This lemma states that in any disequilibrium situation, the sum
of the excess demands weighted by the equilibrium prices is always positive.
We recall that Samuelson's weak axiom of revealed preference states that
$p \cdot \Delta x \leq 0$ implies $p' \cdot \Delta x < 0$, where $\Delta x = x(p') - x(p)$. That this axiom
holds for an individual is a consequence of rational behavior.⁵ However,
the statement that this axiom holds for the entire economy (that is, for the
market demand as a whole) is not a consequence of rational behavior but
is an additional assumption, as we remarked in the Appendix to Section E,
Chapter 2. In any case, suppose that this axiom holds for the entire economy.
Walras' Law implies that $p \cdot x(p) = p \cdot \bar{x} = p \cdot x(\hat{p})$, where $\hat{p}$ is an
equilibrium price vector, so that we have $p \cdot \Delta x = 0$ where $\Delta x = x(\hat{p}) - x(p)$.
Hence from the weak axiom of revealed preference for the entire economy,
we have $\hat{p} \cdot \Delta x < 0$. This means that $\hat{p} \cdot [x(\hat{p}) - x(p)] < 0$, or
$\hat{p} \cdot [\bar{x} - x(p)] < 0$, which is nothing but the statement of the lemma. Hence Lemma
3.E.3 is also implied by the weak axiom of revealed preference for the
entire economy.
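The inequality of Lemma 3.E.3 can be checked numerically on an example. The sketch below is only an illustration under stated assumptions: it uses a hypothetical three-trader Cobb-Douglas exchange economy (the share matrix `ALPHA` and the unit endowments are not from the text), whose excess demand satisfies Walras' Law, homogeneity, and gross substitutability and whose equilibrium ray is $p_1 = p_2 = p_3$. It evaluates $\hat{p} \cdot f(p)$ at randomly drawn price vectors.

```python
import random

# Hypothetical economy (an assumption for illustration): trader i owns one
# unit of commodity i and spends the share ALPHA[i][j] on commodity j.
ALPHA = [[0.5, 0.3, 0.2],
         [0.2, 0.5, 0.3],
         [0.3, 0.2, 0.5]]


def excess_demand(p):
    """Market excess demand f_j(p) = sum_i ALPHA[i][j] * p_i / p_j - 1."""
    n = len(p)
    return [sum(ALPHA[i][j] * p[i] for i in range(n)) / p[j] - 1.0
            for j in range(n)]


def weighted_excess_demand(p_hat, p):
    """The quantity p_hat . f(p) that appears in Lemma 3.E.3."""
    return sum(w * f for w, f in zip(p_hat, excess_demand(p)))


def smallest_value_on_random_prices(p_hat, trials=1000, seed=0):
    """Minimum of p_hat . f(p) over randomly drawn positive price vectors."""
    rng = random.Random(seed)
    return min(
        weighted_excess_demand(p_hat, [rng.uniform(0.1, 10.0) for _ in p_hat])
        for _ in range(trials)
    )


if __name__ == "__main__":
    p_hat = [1.0, 1.0, 1.0]
    print(weighted_excess_demand(p_hat, [2.0, 1.0, 1.0]))  # off the ray
    print(weighted_excess_demand(p_hat, [5.0, 5.0, 5.0]))  # on the ray
    print(smallest_value_on_random_prices(p_hat))
```

A random search of this kind does not prove the lemma; it only fails to find a counterexample in the assumed economy.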
REMARK: We may recall that if the weak axiom holds in the aggregate
(that is, the conclusion of Lemma 3.E.3), then the equilibrium is unique up to
a positive scalar multiple.⁶ To prove this, suppose not. That is, suppose there
exists a $p^* \neq \alpha \hat{p}$ for any $\alpha$, yet $f_i(p^*) = 0$ for all $i$. But by the assumption,
we have $\sum_{i=1}^{n} \hat{p}_i f_i(p^*) > 0$, which is a contradiction. Note that gross
substitutability is not needed in this proof. See the appendix to Section E,
Chapter 2.
REMARK: The uniqueness is nice,⁷ but it is a rather restrictive phenomenon.⁸
Note that the uniqueness here is a consequence of such restrictive
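PROOF: We consider the Euclidean distance between $p(t)$ and $\hat{p}$ and show
that this distance converges to zero as $t \to \infty$. Let $D(t) \equiv \| p(t) - \hat{p} \|^2 =
\sum_{i=1}^{n} [p_i(t) - \hat{p}_i]^2$. From Lemma 3.E.1, $\| p(t) \| = \| p(0) \|$. Normalize $\hat{p}$
such that $\| \hat{p} \| = \| p(0) \|$. Differentiate $D(t)$ with respect to $t$. That is,
$$\frac{dD(t)}{dt} = \frac{d}{dt}\Big[\sum_{i=1}^{n} \{p_i(t) - \hat{p}_i\}^2\Big] = 2 \sum_{i=1}^{n} \{p_i(t) - \hat{p}_i\} \frac{dp_i}{dt}$$
$$= 2 \sum_{i=1}^{n} \{p_i(t) - \hat{p}_i\} f_i(p) = 2 \Big\{\sum_{i=1}^{n} p_i(t) f_i(p) - \sum_{i=1}^{n} \hat{p}_i f_i(p)\Big\}$$
$$= -2 \sum_{i=1}^{n} \hat{p}_i f_i(p) \quad \text{(by Walras' Law).}$$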
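The two facts used in this computation, that $\| p(t) \|$ stays constant and that $D(t)$ falls whenever $\hat{p} \cdot f(p) > 0$, can be observed on a simulated path. The sketch below is an illustration under assumptions, not the argument of the text: the economy is a hypothetical three-trader Cobb-Douglas exchange economy (share matrix `ALPHA`, unit endowments), whose equilibrium ray is $p_1 = p_2 = p_3$, and the path of $dp/dt = f(p)$ is approximated with a classical fourth-order Runge-Kutta step.

```python
import math

# Hypothetical economy (an assumption): trader i owns one unit of commodity i
# and spends the share ALPHA[i][j] of his income on commodity j.
ALPHA = [[0.5, 0.3, 0.2],
         [0.2, 0.5, 0.3],
         [0.3, 0.2, 0.5]]


def excess_demand(p):
    """Market excess demand f_j(p) = sum_i ALPHA[i][j] * p_i / p_j - 1."""
    n = len(p)
    return [sum(ALPHA[i][j] * p[i] for i in range(n)) / p[j] - 1.0
            for j in range(n)]


def rk4_step(p, dt):
    """One classical Runge-Kutta step for dp/dt = f(p)."""
    def shifted(v, scale):
        return [a + scale * b for a, b in zip(p, v)]
    k1 = excess_demand(p)
    k2 = excess_demand(shifted(k1, dt / 2))
    k3 = excess_demand(shifted(k2, dt / 2))
    k4 = excess_demand(shifted(k3, dt))
    return [a + dt / 6 * (b + 2 * c + 2 * d + e)
            for a, b, c, d, e in zip(p, k1, k2, k3, k4)]


def norm(p):
    return math.sqrt(sum(a * a for a in p))


def distance_path(p0, dt=0.01, steps=1000):
    """Return the values D(t) = ||p(t) - p_hat||^2 and the final price vector.

    p_hat is the point of the equilibrium ray with the same norm as p0.
    """
    scale = norm(p0) / math.sqrt(len(p0))
    p_hat = [scale] * len(p0)
    p = list(p0)
    d_values = [sum((a - b) ** 2 for a, b in zip(p, p_hat))]
    for _ in range(steps):
        p = rk4_step(p, dt)
        d_values.append(sum((a - b) ** 2 for a, b in zip(p, p_hat)))
    return d_values, p


if __name__ == "__main__":
    start = [2.0, 1.0, 0.5]
    d_values, p_end = distance_path(start)
    print(d_values[0], d_values[-1])   # D at the start and at the end
    print(norm(start), norm(p_end))    # the norm is (numerically) preserved
```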
FOOTNOTES
1. It is possible to produce simpler proofs than the ones given by Arrow, Block,
and Hurwicz [1]. However, our attempt here to sketch one of the proofs in [1]
will be useful in enhancing our understanding of the stability problem. The facts
that we pick up along the way in the present roundabout manner of proof are
of some economic interest in themselves.
330 THE STABILITY OF COMPETITIVE EQUILIBRIUM
2. Arrow, Block, and Hurwicz [1] also proved the stability of the normalized system.
3. The gross substitutability assumption can be stated without using derivatives (that
is, without assuming differentiability) as follows: For any $j = 1, 2, \ldots, n$, we have
$p_i = p'_i$ for all $i \neq j$ and $p_j < p'_j$, implying that $f_i(p) < f_i(p')$ for all $i \neq j$, where $p =
(p_1, \ldots, p_n)$ and $p' = (p'_1, \ldots, p'_n)$. This is called gross substitutability in the finite
incremental form.
4. It can be easily shown that the gross substitutability and homogeneity assumptions
are inconsistent with each other if $p_i = 0$ for any $i$ (Section F-c of this chapter
and Hotaka [2]). Therefore, under these two assumptions, the $f_i$'s have no meaning
if $p(t)$ is a boundary point of the nonnegative orthant of $R^n$. This obviously implies
that if $\hat{p}$ is an equilibrium price vector [that is, $f_i(\hat{p}) = 0$, $i = 1, 2, \ldots, n$], then
$\hat{p} > 0$. An error by Arrow, Block, and Hurwicz [1] in this connection was pointed
out and corrected by Hotaka ([2], pp. 305-306), who also showed that these two
assumptions and Walras' Law imply that if $p_i \to 0$ for some $i$ (the other prices
being fixed), then $f_i(p) \to \infty$.
5. A brief recollection of the weak axiom may be useful. Interpret $x$ and $x'$ as the
consumption vectors of a particular individual. Let $x$ and $x'$, respectively, be chosen
by him when $p$ and $p'$ prevail. Assume the uniqueness of the choice. If $x'$ is affordable
at $p$ (that is, $p \cdot x' \leq p \cdot x$), then $x$ is revealed to be preferred to $x'$, for he could
have consumed $x'$. If this is the case, $x'$ cannot be revealed to be preferred to $x$;
that is, it is impossible to have $p' \cdot x \leq p' \cdot x'$, with $x'$ chosen under $p'$. In other words,
$p \cdot \Delta x \leq 0$ implies $p' \cdot \Delta x < 0$. See P. A. Samuelson, Foundations of Economic Analysis,
Cambridge, Mass., Harvard University Press, 1947, chapter 5.
6. We may recall that in the proof of the existence of a competitive equilibrium, Wald
proved the uniqueness of equilibrium by assuming that the weak axiom holds in the
aggregate.
7. When we have multiple equilibria, the property that $p(t, p^0)$ always converges to
a particular equilibrium point regardless of $p^0$ is rather restrictive. Thus "uniqueness"
is a nice property, especially when we are interested in global stability. However,
multiple equilibria are not necessarily destructive in stability analysis. Recall
Uzawa's concept of "quasi-stability" which we remarked upon in Section B. See
also Section H.
8. It may suffice to recall the possibility of multiple intersections of the offer curves
in the Mill-Marshall diagram in the theory of international trade.
REFERENCES
1. Arrow, K. J., Block, H. D., and Hurwicz, L., "On the Stability of the Competitive
Equilibrium, II," Econometrica, 27, January 1959.
2. Hotaka, R., "Some Basic Problems on Excess Demand Functions," Econometrica,
39, March 1971.
3. Negishi, T., "The Stability of a Competitive Economy: A Survey Article,"
Econometrica, 30, October 1962.
Section F
SOME REMARKS
where $\sum_{j=1}^{n} \alpha_{ij} = 1$, $\alpha_{ij} > 0$ for all $i, j$, and $x_{ij}$ ($j = 1, 2, \ldots, n$) is the amount of
the $j$th commodity consumed by Mr. $i$. We will show that the above utility function
yields a demand function for Mr. $i$ which exhibits the gross substitutability property.
For notational simplicity, we will omit the subscripts $i$. In other words,
we will represent Mr. $i$'s preference ordering as
$$u(x_1, x_2, \ldots, x_n) = \sum_{j=1}^{n} \alpha_j \log x_j, \quad \sum_{j=1}^{n} \alpha_j = 1, \text{ and } \alpha_j > 0 \text{ for all } j \tag{2}$$
Now suppose that Mr. $i$ maximizes his utility over his budget constraint and note
that he consumes a positive amount of every commodity (that is, an interior solution
is achieved for this constrained maximum problem).³ Then the first-order
condition (see Chapter 1, Section F) can be written in the following form:
$$\frac{\partial u}{\partial x_j} = \lambda p_j, \quad j = 1, 2, \ldots, n \tag{3}$$
where $\lambda$ is the Lagrangian multiplier for this problem. Since the above utility
function is a concave function, this condition is also sufficient for the solution
to be a global maximum. From (2), $\partial u/\partial x_j$ can immediately be obtained as
$$\frac{\partial u}{\partial x_j} = \frac{\alpha_j}{x_j} \tag{4}$$
Combining (3) and (4), we obtain
$$\alpha_j = \lambda p_j x_j, \quad j = 1, 2, \ldots, n \tag{5}$$
Note that this implies $\lambda > 0$. This, in turn, implies that all the income is spent.
In other words, if we denote Mr. $i$'s income by $M$, then $M = \sum_{j=1}^{n} p_j x_j$. Now
we sum equation (5) over $j$ and obtain
$$1 = \sum_{j=1}^{n} \alpha_j = \sum_{j=1}^{n} \lambda p_j x_j = \lambda M$$
or
$$\lambda = \frac{1}{M} \tag{6}$$
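Suppose that all his income is obtained by selling his resources in the markets.
Denote his initial holding of the $j$th resource by $\bar{x}_j$ (where we again omit the
subscript $i$ for notational simplicity). Then
$$M = \sum_{j=1}^{n} p_j \bar{x}_j \tag{7}$$
Using (5), (6), and (7), we then obtain
$$x_j = \frac{\alpha_j}{\lambda p_j} = \alpha_j \frac{M}{p_j} = \frac{\alpha_j}{p_j} \sum_{k=1}^{n} p_k \bar{x}_k$$
Therefore
$$\frac{\partial x_j}{\partial p_k} = \frac{\alpha_j \bar{x}_k}{p_j}, \quad k \neq j \tag{8}$$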
Hence under the assumptions $p_j > 0$ and $\bar{x}_k > 0$ for all $j$ and $k = 1, 2, \ldots, n$,
all commodities are gross substitutes for Mr. $i$. Thus if everybody in the economy
has the utility function specified by (2) (with the assumptions used above), the
market demand function also exhibits the "gross substitutability" property.⁴
To obtain the gross substitutability for the market demand function, we
note (resuming the subscript $i$ for Mr. $i$):
$$\sum_{i=1}^{m} x_{ij} = \frac{1}{p_j} \sum_{i=1}^{m} \sum_{k=1}^{n} \alpha_{ij}\, p_k \bar{x}_{ik}$$
Hence
$$\frac{\partial}{\partial p_k} \sum_{i=1}^{m} x_{ij} = \frac{1}{p_j} \sum_{i=1}^{m} \alpha_{ij} \bar{x}_{ik}, \quad k \neq j$$
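The demand function derived above and the cross-price effect (8) can be checked by a short computation. The sketch below is illustrative only: the function names and the sample numbers (shares, prices, endowment) are assumptions, and the numerical derivative is a simple forward difference.

```python
def cobb_douglas_demand(alpha, prices, endowment):
    """Demand x_j = alpha_j * M / p_j with income M = sum_k p_k * xbar_k."""
    income = sum(p * e for p, e in zip(prices, endowment))
    return [a * income / p for a, p in zip(alpha, prices)]


def cross_price_effect(alpha, prices, endowment, j, k):
    """Analytic value of dx_j/dp_k for k != j, as in equation (8)."""
    return alpha[j] * endowment[k] / prices[j]


def numerical_cross_price_effect(alpha, prices, endowment, j, k, h=1e-6):
    """Forward-difference approximation of dx_j/dp_k."""
    bumped = list(prices)
    bumped[k] += h
    base = cobb_douglas_demand(alpha, prices, endowment)[j]
    moved = cobb_douglas_demand(alpha, bumped, endowment)[j]
    return (moved - base) / h


if __name__ == "__main__":
    alpha = [0.2, 0.3, 0.5]      # hypothetical budget shares, summing to one
    prices = [1.0, 2.0, 4.0]     # hypothetical positive prices
    endowment = [3.0, 1.0, 2.0]  # hypothetical initial holdings
    print(cobb_douglas_demand(alpha, prices, endowment))
    print(cross_price_effect(alpha, prices, endowment, 0, 1))
    print(numerical_cross_price_effect(alpha, prices, endowment, 0, 1))
```

With positive prices and endowments every off-diagonal effect is positive, which is the gross substitutability property for the individual.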
b. SCARF'S COUNTEREXAMPLE
In Section E, we established the global stability of a competitive equilibrium
under the assumptions of Walras' Law, homogeneity, and gross substitutability.
It is natural to ask how far we can relax these assumptions. In particular, we
would like to know the extent to which gross substitutability can be relaxed.
It was conjectured that gross substitutability could be replaced by a more plau-
sible assumption on the utility function, such as quasi-concavity (that is, convex
to the origin indifference curves). Scarf [ 18] has constructed examples that cast
doubt on all such conjectures. His examples are useful in understanding the
basic problem involved in the stability question. Here we explain one of them.
Consider a pure exchange economy consisting of three consumers and three
commodities. Let $x_{ij}$ be the consumption of commodity $j$ ($j = 1, 2, 3$) by Mr. $i$
($i = 1, 2, 3$). Suppose that Mr. $i$'s utility function, $u_i$, can be written in the following
form.
$$u_1(x_{11}, x_{12}, x_{13}) = \min \{x_{11}, x_{12}\}$$
$$u_2(x_{21}, x_{22}, x_{23}) = \min \{x_{22}, x_{23}\}$$
$$u_3(x_{31}, x_{32}, x_{33}) = \min \{x_{31}, x_{33}\}$$
In other words, each individual desires only two commodities and wants them
only in the fixed ratio (one to one). It can easily be seen (by analogy to the case
of fixed production coefficients) that such a utility function gives rise to an L-shaped
indifference curve (which is clearly convex to the origin!). Let $\bar{x}_{ij}$ be the initial
holding of commodity $j$ by Mr. $i$. Let us suppose that
$$\bar{x}_{ii} = 1, \text{ and } \bar{x}_{ij} = 0 \text{ for } i \neq j$$
That is, Mr. $i$ ($i = 1, 2, 3$) only has one unit of commodity $i$ and none
of the other commodities. Scarf's indifference curve and the budget line are
illustrated in Figure 3.8.
Note that, in view of the specifications of the utility functions, the income
consumption path of each individual in Figure 3.8 is the 45-degree line, so that
$x_{11} = x_{12}$, $x_{22} = x_{23}$, and $x_{31} = x_{33}$. Consider the change in the price indicated
by the arrow in the diagram (a decrease in the price of commodity 2). It is clearly
illustrated by the diagram that there is only an "income effect"; there is no
"substitution effect." Hence for such indifference curves, the entire price change
is absorbed into the income effect.
The excess demand for commodity 1 can be written as
$$x_1 - \bar{x}_1 = (x_{11} + x_{21} + x_{31}) - (\bar{x}_{11} + \bar{x}_{21} + \bar{x}_{31}) = (x_{11} + x_{31}) - \bar{x}_{11} = (x_{11} - \bar{x}_{11}) + x_{31} \tag{9}$$
The budget equation for Mr. 1 can be written as $p_1 x_{11} + p_2 x_{12} = p_1 \bar{x}_{11}$. But by our
convention, $x_{11} = x_{12}$ and $\bar{x}_{11} = 1$. Hence $x_{11} = p_1/(p_1 + p_2)$ so that $x_{11} - \bar{x}_{11} =
-p_2/(p_1 + p_2)$. Similarly, $x_{31}$ can be obtained from Mr. 3's budget equation,
$p_1 x_{31} + p_3 x_{33} = p_3 \bar{x}_{33}$; that is, $x_{31} = p_3/(p_3 + p_1)$. Therefore, from (9), we obtain
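$$x_1 - \bar{x}_1 = \frac{-p_2}{p_1 + p_2} + \frac{p_3}{p_3 + p_1} \tag{10-a}$$
Similarly we obtain
$$x_2 - \bar{x}_2 = \frac{-p_3}{p_2 + p_3} + \frac{p_1}{p_1 + p_2} \tag{10-b}$$
$$x_3 - \bar{x}_3 = \frac{-p_1}{p_3 + p_1} + \frac{p_2}{p_2 + p_3} \tag{10-c}$$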
From equations (10-a, -b, -c), it is clear that there exists a unique "equilibrium
price ray" $p_1 = p_2 = p_3$.
We can write the dynamic adjustment equation as
$$\dot{p}_i(t) = x_i(t) - \bar{x}_i, \quad i = 1, 2, 3 \tag{11}$$
Now we want to show that $\| p(t) \|$ = constant for all $t$. To show this, we
differentiate $\| p(t) \|^2$ with respect to $t$. In other words
$$\frac{d}{dt}\Big[\sum_{i=1}^{3} p_i^2(t)\Big] = 2 \sum_{i=1}^{3} p_i(t)\, \dot{p}_i = 2 \sum_{i=1}^{3} p_i (x_i - \bar{x}_i) = 0$$
Hence we conclude that $\| p(t) \|^2 = \sum_{i=1}^{3} p_i^2(t)$ = constant.⁵
Next we want to show that $\prod_{i=1}^{3} p_i(t)$ = constant. To do so, differentiate this
as follows:
These conditions are somewhat peculiar when compared with the ordinary
Hicks-Slutsky model of consumer's behavior. Scarf [18] and Gale [5] also
considered cases of instability in which the substitution effect is present (however,
this effect is "smaller" than the income effect, which is Giffen's case). It is certainly
difficult to say precisely under what conditions the instability arises. However,
Scarf's examples indicate that instability may occur in a wide variety of cases.
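Scarf's example lends itself to a direct numerical check. The following sketch is an illustration under stated assumptions (the starting point, the step size, and the use of a classical Runge-Kutta integrator are choices made here, not taken from Scarf or from the text). It integrates the adjustment equation (11) with the excess demands (10-a, -b, -c) and reports $\sum_i p_i^2$, the product $p_1 p_2 p_3$, and the distance of the path from the equilibrium ray.

```python
import math


def scarf_excess_demand(p):
    """Excess demands (10-a), (10-b), (10-c) of Scarf's example."""
    p1, p2, p3 = p
    return [
        -p2 / (p1 + p2) + p3 / (p3 + p1),
        -p3 / (p2 + p3) + p1 / (p1 + p2),
        -p1 / (p3 + p1) + p2 / (p2 + p3),
    ]


def rk4_step(p, dt):
    """One classical Runge-Kutta step for dp/dt = f(p)."""
    def shifted(v, scale):
        return [a + scale * b for a, b in zip(p, v)]
    k1 = scarf_excess_demand(p)
    k2 = scarf_excess_demand(shifted(k1, dt / 2))
    k3 = scarf_excess_demand(shifted(k2, dt / 2))
    k4 = scarf_excess_demand(shifted(k3, dt))
    return [a + dt / 6 * (b + 2 * c + 2 * d + e)
            for a, b, c, d, e in zip(p, k1, k2, k3, k4)]


def simulate(p0, dt=0.01, steps=2000):
    """Return the list of price vectors visited, starting from p0."""
    path = [list(p0)]
    for _ in range(steps):
        path.append(rk4_step(path[-1], dt))
    return path


def distance_to_equilibrium(p, squared_norm):
    """Distance from p to the equilibrium point with the same norm."""
    c = math.sqrt(squared_norm / 3.0)
    return math.sqrt(sum((a - c) ** 2 for a in p))


if __name__ == "__main__":
    start = [1.2, 1.0, 0.8]          # hypothetical initial prices
    path = simulate(start)
    end = path[-1]
    print(sum(a * a for a in end))   # squared norm along the path
    print(end[0] * end[1] * end[2])  # product of the prices along the path
    print(min(distance_to_equilibrium(p, 3.08) for p in path))
```

Since the product $p_1 p_2 p_3$ on the equilibrium ray differs from its value at the assumed starting point, a path that preserves both quantities cannot approach the equilibrium.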
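$$f_i(p) = \frac{\sum_k a_{ki}\, p_k}{p_i}$$
where the $a_{ki}$'s are arbitrary constants such that $a_{ki} > 0$ for $k \neq i$ and $\sum_i a_{ki} = 0$.
Since $\partial f_i/\partial p_k = a_{ki}/p_i > 0$ for $k \neq i$, the gross substitutability assumption is
satisfied. Second, note that
$$\sum_i p_i f_i = \sum_i p_i \frac{\sum_k a_{ki}\, p_k}{p_i} = \sum_{i,k} a_{ki}\, p_k = \sum_k p_k \sum_i a_{ki} = 0$$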
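The two properties just verified can also be checked on a concrete coefficient matrix. The sketch below is only an illustration: the helper names and the sample off-diagonal coefficients are assumptions, and the diagonal entries are filled in so that $\sum_i a_{ki} = 0$ for each $k$.

```python
def make_coefficients(off_diagonal):
    """Copy the matrix and set a[k][k] so that each row k sums to zero."""
    n = len(off_diagonal)
    a = [row[:] for row in off_diagonal]
    for k in range(n):
        a[k][k] = -sum(a[k][i] for i in range(n) if i != k)
    return a


def excess_demand(a, p):
    """f_i(p) = (sum_k a[k][i] * p[k]) / p[i]."""
    n = len(p)
    return [sum(a[k][i] * p[k] for k in range(n)) / p[i] for i in range(n)]


def walras_sum(a, p):
    """The value of sum_i p_i f_i(p), which Walras' Law requires to be zero."""
    return sum(pi * fi for pi, fi in zip(p, excess_demand(a, p)))


def cross_effect(a, p, i, k):
    """The derivative df_i/dp_k = a[k][i] / p[i] for k != i."""
    return a[k][i] / p[i]


if __name__ == "__main__":
    # Hypothetical positive off-diagonal coefficients (diagonal is overwritten).
    a = make_coefficients([[0.0, 1.0, 2.0],
                           [3.0, 0.0, 1.0],
                           [2.0, 2.0, 0.0]])
    p = [1.5, 0.7, 2.0]
    print(excess_demand(a, p))
    print(walras_sum(a, p))
```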
d. NONNEGATIVE PRICES
The problem of the stability of a competitive equilibrium, as we have seen,
is concerned with the following system of differential equations:
$$\dot{p}_i(t) = f_i[p_1(t), p_2(t), \ldots, p_n(t)], \quad i = 1, 2, \ldots, n$$
If the $f_i$'s are defined on an open connected set $X$, the Cauchy-Peano theorem
guarantees the existence of a solution in a neighborhood of $t = 0$. But the Cauchy-
Peano theorem does not guarantee the existence of a solution for the entire
region $[0, \infty)$, with which stability analysis is concerned. A natural question is:
Can we guarantee the existence of a solution for the entire region of $t$, $[0, \infty)$, by
guaranteeing the existence of solutions in the local regions $[0, \epsilon_1]$, $[\epsilon_1, \epsilon_2]$,
$[\epsilon_2, \epsilon_3]$, ..., and so forth? Suppose we can make these "continuations" by some
suitable methods.⁸ We may find that we cannot go further than some region
$[\epsilon_j, \epsilon_{j+1}]$, for when $t$ is in the region $[\epsilon_j, \epsilon_{j+1}]$, the solution vector $p[t; p(0)]$
may lie outside the region $X$ on which the $f_i$'s are defined. Then the above
differential equation system would not have solutions for the entire region $[0, \infty)$.
In stability analysis, $X$ is often taken as the positive or nonnegative orthant of $R^n$,
or else $X = R^n$ with the explicit constraint $p \geq 0$. Hence the question here is one
of negative prices. In other words, when $t$ reaches the region $[\epsilon_j, \epsilon_{j+1}]$, the
solution $p_i[t; p(0)]$ may become negative for some $i$. Then $p$ is outside the region
$X$ or violates the restriction $p \geq 0$ so that it becomes meaningless to discuss the
question of whether or not $p[t; p(0)]$ converges to an equilibrium price vector
as $t$ goes to $\infty$.
This nonnegativity condition for the price vector is often neglected in the
literature. Explicit consideration is given to this problem in two masterpieces,
[1] and [8]. Here we illustrate the problem by using some other studies.
One way to avoid the above difficulty is to modify the above system of
differential equations such that, for each i,
This approach is adopted by Arrow, Hurwicz, and Uzawa [3], Morishima [9],
and Kose [7]. The advantage of this method is that we do not have to modify
the stability analysis of the previous sections very much, if we assume the existence
of a solution for the above modified system. The question now is whether we can
guarantee the existence of a solution for this modified system. The ordinary
Cauchy-Peano existence theorem is not applicable here because the right-hand
side of this system may not be a continuous function of $p$. To see this, consider a
sequence $\{p^\nu\}$ in $X$ such that $p^\nu \to \bar{p}$ with $\bar{p}_i = 0$. Assume $f_i(\bar{p}) < 0$. Then the RHS
of the above differential equation converges to 0, and not to $f_i(\bar{p})$, as $p^\nu \to \bar{p}$.
Hence we indeed have a discontinuity. Unless a new existence theorem is proved
for the above modified system, we are not being quite honest if we proceed with the
stability analysis.
Nikaido and Uzawa [ 15] proposed the following alternative system:
In ending, we may note one criticism of the above three devices for avoiding
the problem of negative prices. The process of "switching," whether continuous
FOOTNOTES
1. We should not, however, emphasize gross substitutability too much. Hicks [6]
considered "strongly asymmetrical income effects and extreme complementarity"
as causes of instability. Unfortunately, we do not have any important stability
theorem that applies when the gross substitutability assumption is relaxed, although
we do have some results for instability when this assumption is relaxed (cf. Scarf
[18] and Gale [5]). See subsection b of this section. Recent studies by Morishima
[10] and Ohyama [16] seem to offer interesting attempts when gross substitutability
is absent. Unfortunately, [10] seems to contain a serious error.
2. Note that this is a logarithmic transformation (which is a monotone transformation)
of a Cobb-Douglas type utility function $u_i = \prod_{j=1}^{n} x_{ij}^{\alpha_{ij}}$, $\sum_j \alpha_{ij} = 1$. Recall that a
preference ordering is invariant under a monotone (increasing) transformation of the
utility function.
3. This is due to our specification of the utility function, (2). For if consumption of
one of the commodities becomes zero, then the consumer's utility becomes $-\infty$. As
long as he has a positive income, this (zero utility) is certainly not optimal for him.
4. Recently Eisenberg [4] has shown that if each individual's utility function is of
the Cobb-Douglas type (or more generally homogeneous), then the individual utility
functions are "aggregated" to form a social welfare function, which is of the Cobb-
Douglas (or homogeneous) type. Here the "social welfare function" is not used to
describe the welfare level of the society, but rather it is used to describe the behavior
of the society.
5. This result was proved in Lemma 3.E.1. The proof is recorded here to keep our
exposition sufficiently self-contained.
6. For the expositions of this and the following subsections, I am indebted to Nikaido
[13].
7. As we observed in subsection a, such an excess demand function can be obtained
from the Cobb-Douglas type of utility function.
8. For "continuation" of the solution, recall our remark in Section B. For an explicit
proof of the possibility of continuation under the present framework, see Nikaido
([14], pp. 338-339).
REFERENCES
1. Arrow, K. J., Block, H. D., and Hurwicz, L., "On the Stability of the Competitive
Equilibrium, II," Econometrica, 27, January 1959.
2. Arrow, K. J., and Hurwicz, L., "On the Stability of the Competitive Equilibrium, I,"
Econometrica, 26, October 1958.
3. Arrow, K. J., Hurwicz, L., and Uzawa, H., eds., Studies in Linear and Nonlinear
Programming, Stanford, Calif., Stanford University Press, 1958.
4. Eisenberg, E., "Aggregation of Utility Functions," Management Science, 7, July 1961.
5. Gale, D., "A Note on Global Instability of Competitive Equilibrium," Naval
Research Logistics Quarterly, 10, March 1963.
6. Hicks, J. R., Value and Capital, 2nd ed., Oxford, Clarendon Press, 1946.
7. Kose, T., "Solutions of Saddle-Value Problems by Differential Equation," Econo-
metrica, 24, January 1956.
8. McKenzie, L. W., "Stability of Equilibrium and the Value of Positive Excess
Demand," Econometrica, 28, July 1960.
9. Morishima, M., "A Reconsideration of the Walras-Cassel-Leontief Model of
General Equilibrium," in Mathematical Methods in Social Sciences, ed. by Arrow
et al., Stanford, Calif., Stanford University Press, 1960.
10. Morishima, M., "A Generalization of the Gross Substitute System," Review of Economic
Studies, XXXVII, April 1970.
11. Negishi, T., "The Stability of a Competitive Economy: A Survey Article,"
Econometrica, 30, October 1962.
12. Nikaido, H., "Stability of Equilibrium by the Brown-von Neumann Differential
Equation," Econometrica, 27, October 1959.
13. Nikaido, H., "The Tatonnement Process and the Nonnegativity Condition," in New
Economic Analysis, ed. by M. Morishima et al., Tokyo, Sobunsha, 1960 (in Japanese).
14. Nikaido, H., Convex Structures and Economic Theory, New York, Academic Press, 1968.
15. Nikaido, H. and Uzawa, H., "Stability and Nonnegativity in a Walrasian Tatonnement
Process," International Economic Review, 1, January 1960.
16. Ohyama, M., "On the Stability of Generalized Metzlerian Systems," Review of
Economic Studies, XXXIX, April 1972.
17. Quirk, J., and Saposnik, R., Introduction to General Equilibrium Theory and
Welfare Economics, New York, McGraw-Hill, 1968.
18. Scarf, H., "Some Examples of Global Instability of the Competitive Equilibrium,"
International Economic Review, 1, September 1960.
Section G
THE TATONNEMENT AND THE
NON-TATONNEMENT PROCESSES
We pointed out in Section C (and in Section A) that this system reflects a funda-
mental assumption of stability analysis-that a positive excess demand for com-
modity i raises the price of i and a negative excess demand (that is, an excess
supply) for commodity i lowers the price of i. The above system of differential
equations is a straightforward mathematical formulation of this assumption.
This assumption, although it seems quite plausible, is beset with two serious
difficulties: (1) its behavioral background, and (2) the unrealistic nature of the
"tatonnement process," of which the above system of differential equations is a
mathematical formulation.
markets into equilibrium. If so, we say that the tatonnement process is a stable
process. This is the problem that we have considered so far in this chapter.
Under the tatonnement process, it is clear whose behavioral rule is described
by $\dot{p}(t) = f[p(t)]$. This is the behavioral rule of the "market manager."² However,
the question has not been completely answered yet, because we do not know why
the market manager has to obey this behavioral rule. No straightforward explanation
such as the profit maximization of producers or the utility maximization of
consumers is given for this behavioral rule. Thus the stability analysis may be
considered as an analysis which shows whether the tatonnement process converges
to an equilibrium when the market manager is instructed to behave according to
$\dot{p} = f[p(t)]$. Since it is not clear who should establish this rule or why the market
manager should behave according to this rule, we may consider this to be the "rule
of the game." Thus we get only a partial answer to the question of the stability of
the behavior which is described by this dynamic equation.
Some readers may not like the interpretation of the dynamic process $\dot{p} =
f[p(t)]$ in terms of the tatonnement process, for it appears quite unrealistic to think
of all traders being assembled in one place at one time to carry out the tatonnement
process as described. Thus we may come back to the original question: Whose
behavior is described by this dynamic adjustment process? We end this inquiry
into the behavioral background of the dynamic adjustment equation with the
following acute observation by Koopmans ([5], p. 179):³
If, for instance, the net rate of increase in price is assumed to be proportional
to the excess of demand over supply, whose behavior is thereby expressed?
And is the alternative hypothesis, that the rate of increase in supply is pro-
portional to the excess of demand price over supply price any more plausible,
or any better traceable to behavior motivations?
$f[p(t)]$. In other words, we used the same function $f$ to denote the equilibrium
relation and the dynamic process. If we allow intermediate purchases and actual
transactions in the process, then this excess demand function $f$ will change from
time to time as the traders' income or purchasing power varies.⁴ Hence the price
vector which prevails when the market is finally cleared depends on the time path
of the process and will, therefore, not generally be the same for any two processes.
Thus the process does not describe at all how the economy actually reaches an
equilibrium price vector $\hat{p}$, the very problem with which Walras was concerned.⁵
Here we may note that the stability analysis of the tatonnement process,
however unrealistic it may look, is one of fundamental importance in economics.
Some of the reasons for this are as follows:
(i) It is a genuine model which describes how the economy can reach an equilibrium.
As long as we describe our economy in terms of equilibrium relations such as
$f(p) = 0$, it is important to see how we can actually reach an equilibrium. Moreover,
our economy may be constantly in disequilibrium as a result of changes in
consumers' tastes, production technology, and the availability of resources in the
economy. That is, the equilibrium relation may be constantly changing. Hence
when the equilibrium relation $f(p) = 0$ moves to a new relation $f^*(p) = 0$, the
price vector $\hat{p}$ with $f(\hat{p}) = 0$ does not necessarily sustain an equilibrium
under the new relation. Hence if the economic model described by equilibrium
relations is to be meaningful, it must contain a model of the adjustment mechanism
by which the equilibrium, if disturbed, could be restored.⁶ [In the above
example, if $f^*(p^*) = 0$ for a unique $p^*$, the mechanism which brings $\hat{p}$ to $p^*$ should
be established.] The tatonnement process, if it is a stable process, offers such
a model. We may also note that the dynamic stability analysis which has been
described in this chapter can be relevant for a model which is more realistic and
does not necessarily involve the tatonnement process. The author once offered
such a model [13]. Even if we grant that the tatonnement process is unrealistic,
this does not negate the importance of the dynamic stability analysis described
in this chapter. Moreover, there exist adjustment processes in the real economy
which resemble the tatonnement process, for example, the stock market, the fish
market, the corn market, and so on.
(ii) In Chapter 2, Section C, we showed that a competitive equilibrium is a Pareto
optimal state. Moreover, a competitive equilibrium has a unique feature: a de-
centralized decision-making process. Even apart from the problem of individual
incentives and the like, the decentralized process seems to have a clear advantage
over a centralized decision-making process. It does not involve the almost
impossible task (and accompanying costs) of collecting all the relevant data on
each consumer's tastes, each firm's production set, the resource availability,
and so on, so that the "center" may treat these data in such a way as to obtain a
decision. Hence the model of a competitive equilibrium, whether it is realistic
or not, offers an excellent prototype for the optimal organization of a society
and can be used as a realistic means of achieving a social optimum (even by a
socialist state). Thus when the model of a competitive equilibrium is viewed
as a realistic device for achieving a Pareto optimum, we certainly should know
how we can actually reach this "equilibrium" state. The tatonnement process,
if it is stable, offers exactly such a process.⁷ See Arrow and Hurwicz [2], for
example.⁸
Now let us return to the unrealistic elements of the tatonnement process. How
can all the traders in the economy gather in one place and exchange tickets? How
can actual trade (and production) be prohibited until an equilibrium price has been
achieved? Despite the fact that we can offer an example from a real economy
which is based on the stability analysis described in this chapter but does not
contain the above difficulties, it is certainly very important to consider models
which explicitly avoid the unrealistic elements mentioned above. Such models
have been developed recently and are known as non-tatonnement processes. The
only models of non-tatonnement processes developed so far are pure exchange
models. In such models, the dynamic adjustment equation $dp(t)/dt = f[p(t)]$
($= \sum_i x_i[p(t)] - \sum_i \bar{x}_i$) is replaced by
m
dpj(t) m
and
Three kinds of non-tatonnement processes are well known in the literature, all
of which are confined to the model of the pure exchange economy:'
Uzawa [ 14] claims that he has shown that the Edgeworth process is stable.''
(iii) Hahn-Negishi process ([4]). This is based on the assumption that if there is
an excess supply of a certain commodity, then all the buyers of this commodity
can achieve their desires, and that if there is an excess demand for a certain
commodity, then all the sellers of this commodity can achieve their desires.
The following relations illustrate this process:
(1) (Disequilibrium) If $x_{ij}(t) - \bar{x}_{ij}(t) \neq 0$, then sign $[x_{ij}(t) - \bar{x}_{ij}(t)]$ = sign
$[\sum_{i=1}^{m} x_{ij}(t) - \sum_{i=1}^{m} \bar{x}_{ij}(t)]$, $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$.
(2) (Equilibrium) If $\sum_{i=1}^{m} x_{ij}(t) - \sum_{i=1}^{m} \bar{x}_{ij}(t) = 0$, then $x_{ij}(t) - \bar{x}_{ij}(t) = 0$,
for all $i = 1, 2, \ldots, m$.
Statement (1) means that if there is an excess demand for $j$ (that is, $\sum_i x_{ij} -
\sum_i \bar{x}_{ij} > 0$), then all the sellers can sell (hence $x_{ij} - \bar{x}_{ij} = 0$ if Mr. $i$ is a seller)
and not all the buyers can buy (hence $x_{ij} - \bar{x}_{ij} \geq 0$ if Mr. $i$ is a buyer). On the
other hand, if $\sum_i x_{ij} - \sum_i \bar{x}_{ij} < 0$, then all the buyers can buy (hence $x_{ij} -
\bar{x}_{ij} = 0$ if Mr. $i$ is a buyer) but not all the sellers can sell (hence $x_{ij} - \bar{x}_{ij} \leq 0$
if Mr. $i$ is a seller). Using properties (1) and (2), Hahn and Negishi [4] proved
the stability of this process without using the gross substitutability assumption.
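Relations (1) and (2) amount to a sign condition that can be tested for any given allocation. The sketch below is only an illustration: the function name, the numerical tolerance, and the sample allocations are assumptions. It takes the demands $x_{ij}$ and the holdings $\bar{x}_{ij}$ as lists indexed by trader and commodity and reports whether every nonzero individual excess demand has the sign of the market excess demand for that commodity.

```python
def sign(value, tol=1e-12):
    """Return +1, -1, or 0, treating magnitudes below tol as zero."""
    return (value > tol) - (value < -tol)


def satisfies_hahn_negishi(x, xbar, tol=1e-12):
    """Check relations (1) and (2) for demands x[i][j] and holdings xbar[i][j].

    For every commodity j, an individual excess demand that is nonzero must
    have the same sign as the market excess demand; in particular, if the
    market for j clears, every individual excess demand for j must be zero.
    """
    traders = len(x)
    commodities = len(x[0])
    for j in range(commodities):
        market = sign(sum(x[i][j] - xbar[i][j] for i in range(traders)), tol)
        for i in range(traders):
            own = sign(x[i][j] - xbar[i][j], tol)
            if own != 0 and own != market:
                return False
    return True


if __name__ == "__main__":
    holdings = [[1.0, 1.0], [1.0, 1.0]]   # hypothetical current holdings
    consistent = [[1.5, 1.0], [1.0, 0.7]]  # one unsatisfied buyer, one seller
    conflicting = [[1.5, 1.0], [0.8, 1.0]]  # a buyer and a seller of good 1
    print(satisfies_hahn_negishi(consistent, holdings))
    print(satisfies_hahn_negishi(conflicting, holdings))
```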
FOOTNOTES
1. Walras apparently was not fully aware of the significance of such a "false trading"
(or "recontracting") for he introduced such a concept in his theory of production but
there is no evidence that he considered it seriously in his theory of exchange (see
Patinkin, [12] esp. Note B). Newman argues that Walras' tatonnement was not meant
to be a device to deal with false trading, but that "the device was meant to cope
with the problem of convergence of the `excess demand' mechanism in multiple
market situations." (See P. Newman, The Theory of Exchange, Englewood Cliffs,
N. J., Prentice-Hall, 1965, p. 102.) Incidentally, for a modern mathematical treatment
of the Walrasian successive tatonnement process in the theory of pure exchange,
see H. Uzawa, "Walras' Tatonnement in the Theory of Exchange," Review of
Economic Studies, XXVII, June 1960.
2. Negishi ([11], p. 135) has proposed that the "market manager" in the tatonnement
process may be regarded as the "incarnation of the competitive forces in the market."
Although this is an interesting observation, it has an objectionable flavor of
metaphysics, as was the case with the "invisible hand." Moreover, in most markets, it is
not easy to think of such a "competitive force."
3. See also Arrow [1] and Takayama [13], for similar comments.
4. Or we may consider that f depends on the allocation of commodities among the
traders (in the theory of pure exchange). Given a price vector p and an initial resource
vector $\bar{x}_i = (\bar{x}_{i1}, \ldots, \bar{x}_{in})$, trader i chooses his demand vector $x_i = (x_{i1}, \ldots, x_{in})$,
so that we may write $x_i = x_i(p, \bar{x}_i)$. The market excess demand vector is defined as
$\sum_i x_i - \sum_i \bar{x}_i$, which can, therefore, be written as $f(p, \bar{x}_1, \ldots, \bar{x}_m)$, assuming m traders
in the economy. Now if actual transactions are allowed in the process, the $\bar{x}_i$'s (as well
as p) change from time to time so that $dp(t)/dt = f[p(t), \bar{x}_1(t), \ldots, \bar{x}_m(t)]$. Here f
does not change over time. See the latter part of this section.
5. Hahn [3] pointed out a rather artificial case in which intermediate transactions are
allowed which have no effect on the distribution of welfare between individuals. This
is the case in which no stocks of commodities exist but where there is a continuous
flow of perishable commodities.
6. This is the point emphasized by Samuelson when he proposed the "correspondence
REFERENCES
1. Arrow, K. J., "Towards a Theory of Price Adjustment," in The Allocations of Resources,
ed. by M. Abramovitz, Stanford, Calif., Stanford University Press, 1959.
2. Arrow, K. J., and Hurwicz, L., "Decentralization and Computation in Resource
Allocation," in Essays in Economics and Econometrics, ed. by R. Phouts, Chapel Hill,
N.C., University of North Carolina Press, 1960.
3. Hahn, F. H., "On the Stability of a Pure Exchange Equilibrium," International
Economic Review, 3, May 1962.
4. Hahn, F. H., and Negishi, T., "A Theorem on Non-tatonnement Stability," Econometrica,
30, July 1962.
5. Koopmans, T. C., Three Essays on the State of Economic Science, New York,
McGraw-Hill, 1957.
6. Morishima, M., Dynamic Economic Theory (Dogakuteki Keizai Riron), Tokyo,
Kobundo, 1950 (in Japanese).
7. Morishima, M., "The Stability of Exchange Equilibrium: An Alternative Approach,"
International Economic Review, 3, May 1962.
8. Negishi, T., "On the Formation of Prices," International Economic Review, 2, January
1961.
9. Negishi, T., "On the Successive Barter Process," Economic Studies Quarterly, XII, January
1962.
October 1962.
11. Negishi, T., Theories of Price and Resource Allocation (Kakaku to Haibun no Riron),
Tokyo, Toyo Keizai Shimpo-sha, 1965 (in Japanese).
12. Patinkin, D., Money, Interest and Prices, 2nd ed., New York, Harper and Row, 1965.
13. Takayama, A., "Stability in the Balance of Payments, A Multi-Country Approach,"
The Journal of Economic Behavior, 1, October 1961.
14. Uzawa, H., "On the Stability of Edgeworth's Barter Process," International Economic
Review, 3, May 1962.
15. Walras, L., Elements of Pure Economics, tr. by Jaffe, London, George Allen & Unwin,
1954.
Section H
LIAPUNOV'S SECOND METHOD
(NA) $\dfrac{dx_i}{dt} = f_i(x_1, x_2, \ldots, x_n; t), \quad i = 1, 2, \ldots, n$
or
$\dfrac{dx}{dt} = f(x, t)$, where $f: X \otimes (-\infty, \infty) \to R^n$, $X \subset R^n$
In both systems, f is assumed to be continuous.
As discussed in Section B, the system (A) is called the autonomous system and
the system (NA) is called the nonautonomous system. Let $\hat{x}$ be an equilibrium point
of the (A) system, that is, $f(\hat{x}) = 0$ [or $f(\hat{x}; t) = 0$ for all t, for (NA)]. We may
choose $\hat{x} = 0$ if we wish. This is not really much of a restriction, for by defining
$y \equiv x - \hat{x}$, we get a new system dy/dt = f(y) [or dy/dt = f(y; t)] whose equilibrium
point is the origin, that is, y = 0. We assume that $\hat{x}$ is an isolated equilibrium point
in the sense that there is no other equilibrium point in some open ball about $\hat{x}$.
We assume that the initial point $x(t^0)$ (where $t^0$ is the "initial" time) lies inside this
open ball. If the equilibrium is unique, we take this open ball to be the whole space
in which both systems, (A) and (NA), are defined. We are concerned with the
global stability of the solution of the above systems $x(t; x^0, t^0)$, which starts from
an initial point $x(t^0) = x^0$.
We now define various concepts of stability. We assume that there exists a
unique solution determined by the initial point and that it is continuous with
respect to the initial point. We write the solution vector starting from $(x^0, t^0)$
as $x(t; x^0, t^0)$.
In essence it says that if $x^0$ is sufficiently close to the equilibrium point $\hat{x}$, then
$x(t; x^0, t^0)$ remains bounded for all t.
REMARK: This again is a local concept, for $r(t^0)$ can be very small. A somewhat
puzzling thing in the definition is that (S1) (Liapunov stability) has
to be mentioned even though we have (ii); that is, (ii) in the above definition
alone does not necessarily imply (i). Kalman and Bertram ([3], pp. 375-376)
gave the following example.
EXAMPLE: Consider the second-order system in polar coordinates
$x = (r, \theta)$, $0 \leq r < \infty$, $0 \leq \theta < 2\pi$
$\dot{r} = [\dot{g}(\theta, t)/g(\theta, t)]\, r$
$\dot{\theta} = 0$
where $g(\theta, t) = \sin^2\theta/[\sin^4\theta + (1 - t\sin^2\theta)^2] + 1/(1 + t^2)$. Here (ii) of (S2)
is satisfied but (i) of (S2) [that is, (S1)] is not satisfied. However, if all
motions are continuous in $x^0$, then we can show that if every motion sufficiently
close to the equilibrium $\hat{x}$ converges uniformly to $\hat{x}$, then (S1) holds (see [3], p. 376).
(i) It is uniformly Liapunov stable in the sense that $\delta$ in the definition of (S1) does
not depend on $t^0$, and
(ii) Every motion converges to the equilibrium $\hat{x}$ as $t \to \infty$ uniformly in $t^0$ and $\|x^0\| \leq r$ where r is
fixed and can be arbitrarily large. [That is, given any r > 0 and $\mu > 0$, there is
some $T(\mu, r)$ such that $\|x^0 - \hat{x}\| \leq r$ implies $\|x(t; x^0, t^0) - \hat{x}\| < \mu$ for all
$t \geq t^0 + T$.]
We are now ready to state some of the major results obtained by Liapunov
and his followers.
Theorem 3.H.1: Consider the autonomous system (A) [that is, $\dot{x} = f(x)$] with $f(0) = 0$.
Suppose that there exists a real-valued continuously differentiable function V(x)
on X such that
Then the equilibrium state $\hat{x} = 0$ is uniformly globally stable, so that $x(t; x^0, t^0) \to 0$
(for any $t^0$ and $x^0$), as $t \to \infty$.
Theorem 3.H.2: Consider (A) with f(0) = 0 and suppose that there exists a real-valued
continuously differentiable function V(x) on X and a region $D \equiv \{x \in X: V(x) < k\}$
which is nonempty and bounded such that
REMARK: The function V(x) in the above theorem is again called the
Liapunov function of the system (A). Again, condition (ii) implies that the
origin is the unique equilibrium point.
EXAMPLE: Consider the van der Pol equation, $\ddot{x} - \varepsilon(x^2 - 1)\dot{x} + x = 0$
where $\varepsilon > 0$,⁸ or its equivalent form,⁹ $\dot{x} = y + \varepsilon(x^3/3 - x)$, $\dot{y} = -x$. Define
the Liapunov function for this by $V(x, y) \equiv (x^2 + y^2)/2$. Then $\dot{V} = \varepsilon x^2(x^2/3 - 1)$
along the solution path, so that $\dot{V} \leq 0$ if $x^2 < 3$. Define the region D by
$D \equiv \{(x, y) \in R^2: x^2 + y^2 < 3\}$. Then we have $\dot{V}(x, y) \leq 0$, as well as $V(x, y) \geq 0$,
for all (x, y) in D [note that V(x, y) = 0 only when (x, y) = (0, 0); that is, condition
(i) of Theorem 3.H.2 is satisfied]. Now observe that $\dot{V}(x, y) = 0$ along
the solution path only when x = 0, that is, only when (x, y) is on the y-axis.
But if $x^0 = 0$ and $y^0 \neq 0$, then $\dot{V} < 0$ for any $t > t^0$, for $\dot{x} = y$ on the y-axis.
This proves condition (ii) of Theorem 3.H.2. Therefore $x(t; x^0, y^0, t^0) \to 0$
and $y(t; x^0, y^0, t^0) \to 0$ as $t \to \infty$, provided that $(x^0, y^0) \in D$.
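The behavior of the Liapunov function in this example can be checked numerically. The following is only a minimal sketch, assuming $\varepsilon = 1$, the starting point (1, 1) (which lies in D), and a classical fourth-order Runge-Kutta step of length 0.01; none of these choices comes from the text, and the function names are hypothetical.

```python
# Van der Pol equation in Lienard form (assumed eps = 1):
#   x' = y + eps*(x**3/3 - x),   y' = -x
# Candidate Liapunov function from the example: V(x, y) = (x**2 + y**2)/2.

def rhs(x, y, eps):
    """Right-hand side of the Lienard-form system."""
    return y + eps * (x ** 3 / 3.0 - x), -x

def rk4_step(x, y, eps, h):
    """One classical Runge-Kutta step of length h."""
    k1x, k1y = rhs(x, y, eps)
    k2x, k2y = rhs(x + 0.5 * h * k1x, y + 0.5 * h * k1y, eps)
    k3x, k3y = rhs(x + 0.5 * h * k2x, y + 0.5 * h * k2y, eps)
    k4x, k4y = rhs(x + h * k3x, y + h * k3y, eps)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)

def liapunov(x, y):
    return 0.5 * (x * x + y * y)

def simulate(x0, y0, eps=1.0, h=0.01, steps=5000):
    """Return the final state and the values of V along the path."""
    x, y = x0, y0
    values = [liapunov(x, y)]
    for _ in range(steps):
        x, y = rk4_step(x, y, eps, h)
        values.append(liapunov(x, y))
    return x, y, values

# Start inside D = {x**2 + y**2 < 3}.
x_end, y_end, values = simulate(1.0, 1.0)
print(values[0], values[-1])
```

Along the computed path V should never rise and should end close to zero, which is what conditions (i) and (ii) of Theorem 3.H.2 lead one to expect for a starting point in D.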
(i) There exist continuous nondecreasing real-valued functions $\alpha$ and $\beta$ such that
$\alpha(0) = 0$ and $\beta(0) = 0$ and $0 < \alpha(\|x\|) \leq V(x, t) \leq \beta(\|x\|)$ for all t and all
$x \neq 0$,
(ii) There exists a continuous real-valued function $\gamma$ such that $\gamma(0) = 0$ and
$dV[x(t; x^0, t^0), t]/dt \leq -\gamma(\|x\|) < 0$ for all t and all $x \neq 0$,¹⁰
(iii) $\alpha(\|x\|) \to \infty$ with $\|x\| \to \infty$.
Then the equilibrium state $\hat{x} = 0$ is strongly uniformly globally stable so that
$x(t; x^0, t^0) \to \hat{x} = 0$ for any $x^0$ and $t^0$ when $t \to \infty$. The function V(x, t) is called a
Liapunov function of the system (NA).
REMARK: For the proof of the above theorems, see Kalman and Bertram
[3]; LaSalle and Lefschetz [4], chapter 2; and Yoshizawa [10], chapter 5.
REMARK: If V(x, t) in Theorem 3.H.3 is positive definite and if
$dV[x(t; x^0, t^0), t]/dt \leq 0$ (instead of < 0), then we can merely say that
the equilibrium $\hat{x} = 0$ is Liapunov stable.
REMARK: Conditions (i), (ii), and (iii) of Theorem 3.H.3 can be restated as
follows: There exist continuous positive definite functions a(x), b(x), and
c(x) such that
(i) $a(x) \leq V(x, t) \leq b(x)$ for all t and all $x \neq 0$,
(ii) $dV[x(t; x^0, t^0), t]/dt \leq -c(x)$, for all t and all $x \neq 0$, where $\|x\| < \infty$, and
(iii) $a(x) \to \infty$ as $\|x\| \to \infty$.
REMARK: The converse of Theorem 3.H.3 is also, in a sense, true. In
particular, we can show the following: Let f in (NA) be Lipschitzian¹¹ and
suppose that f(0, t) = 0 for all t. If the equilibrium $\hat{x} = 0$ is strongly uniformly globally stable,
then there exists a real-valued function V(x, t), infinitely differentiable in
x and t, which satisfies the hypothesis of Theorem 3.H.3.¹²
Functions $\alpha$ and $\beta$ in (i) and (ii) of Theorem 3.H.3 are illustrated by Figure
3.11.
As a result of the above theorems, the proof of the stability of a certain
dynamic system can be obtained by finding a Liapunov function for the system.
Many stability theorems in the literature are obtained as special cases of the above
dynamic systems (A) and (NA). (For example, see Kalman and Bertram [3].)
$V(t) \equiv \sum_{i=1}^{n} p_i |f_i(p)|$
that is, V(t) is the sum of the absolute values of the excess demands multiplied by
their prices. Clearly V(t) > 0, whenever $p \neq \hat{p}$ ($\hat{p}$ = an equilibrium price vector),
and V = 0 when $p = \hat{p}$. Hence, if we can show that $\dot{V} < 0$, the proof of the stability
is almost complete. For a short sketch of the proof that $\dot{V} < 0$, we refer the reader
to Negishi ([8], pp. 656-657). See also Chapter 4, Section D.
In the Liapunov method discussed above, it is assumed that the equilibrium
point is either isolated or unique. However, it is often important to consider cases
in which there is more than one equilibrium point and the equilibrium points are not
isolated. As we remarked in Section B, Uzawa [9] reconsidered Liapunov's second method to
allow for such a case. Let $f: X \to R^n$ be continuous and consider a system of differential
equations $\dot{x}(t) = f[x(t)]$. Let $x^0$ be the initial value of x at t = 0. Assume
that, for any value of $x^0 \in X$, this system of differential equations has a unique
solution $x(t, x^0)$ for all $t \geq 0$, which is continuous at $x^0$. Let E be the set of all
equilibrium vectors, that is, $E \equiv \{\hat{x}: \hat{x} \in X \text{ and } f(\hat{x}) = 0\}$. Since f(x) is continuous, E is
closed in X.
Definition: The process $\dot{x} = f[x(t)]$ is called quasi-stable if its solution $x(t; x^0)$
satisfies the following conditions:
(i) Every limit point of $x(t, x^0)$, as t tends to infinity, is an equilibrium. That is, if for
some sequence $t_q$, q = 1, 2, ..., such that $t_q \to \infty$, $\lim x(t_q, x^0)$, as $q \to \infty$,
exists, then $\lim_{q \to \infty} x(t_q, x^0)$ is an equilibrium.
(ii) It is uniformly bounded; that is, for any given r > 0, there is some number
B = B(r) such that $\|x^0 - \hat{x}\| < r$ implies $\|x(t; x^0, t^0) - \hat{x}\| < B$.
REMARK: Uzawa [9] has shown that if the set X is closed, quasi-stability is
equivalent to
The function V(x) signifies the distance between point x and the equilibrium
set E. This function V will play the role of the Liapunov function.
REMARK: If the equilibrium points are isolated from each other, then con-
dition (i') means nothing but the asymptotic convergence of x(t, x°) to some
equilibrium point. The concept of quasi-stability, however, allows the case in
which the equilibrium points are not isolated and the solution x(t, x°) does
not necessarily converge to a particular point in the equilibrium set (see
Section B of this chapter).
Uzawa then proved the following theorem and showed its application.
Theorem 3.H.4: Suppose that the solution $x(t; x^0, t^0)$ of $\dot{x} = f[x(t)]$ is contained in
a compact set X, and
(U) There exists a continuous function V(x) defined on X such that $V[x(t, x^0)]$ is a
strictly decreasing function with respect to t unless $x(t, x^0)$ is an equilibrium.
Then the process $\dot{x} = f[x(t)]$ is quasi-stable.
REMARK: Uzawa called the function V(x) above a modified Liapunov func-
tion. Unlike Liapunov's V, it is not assumed to be differentiable and positive
definite.
As an illustration of such a theorem applied to stability analysis, consider the
nonnormalized adjustment process of a competitive equilibrium:¹⁴
$\dfrac{dp_i}{dt} = f_i[p(t)], \quad i = 1, 2, \ldots, n$
where we have
(i) (Homogeneity) $f_i(p) = f_i(\alpha p)$ for all $\alpha > 0$, and for all p, i = 1, 2, ..., n.
(ii) (Gross substitutability) $\partial f_i/\partial p_j > 0$ for all $i \neq j$ and for all p.
Assume that an equilibrium price vector exists, and denote it by $\hat{p}$. Assume further
that the above system of differential equations has a unique solution $p(t; p^0)$ for all
$t \geq 0$, which is continuous in $p^0$, where $p^0 > 0$.
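Following Uzawa [9], define the functions V(p) and v(p) by
$V(p) \equiv \max_{i} \, p_i/\hat{p}_i$
and
$v(p) \equiv \min_{i} \, p_i/\hat{p}_i$
The functions V(p) and v(p) are "proxies" for the distance between p(t) and the equilibrium price vector $\hat{p}$.
These functions would play the role of the (modified) Liapunov function.¹⁵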
Without loss of generality, we may assume that $p_1(t)/\hat{p}_1 \geq p_i(t)/\hat{p}_i$ for all i.
That is, $V(p) = p_1(t)/\hat{p}_1$, for some time interval, say, $\tau$. To simplify the exposition,
assume further that V(p) and v(p) are differentiable in t.¹⁶ We, hence, observe that
$\dfrac{dV(p)}{dt} = \dfrac{1}{\hat{p}_1}\dfrac{dp_1(t)}{dt} = \dfrac{1}{\hat{p}_1}[x_1(p) - \bar{x}_1]$
where $x_1(p) - \bar{x}_1$ is the excess demand for the first commodity.
Since $\hat{p}_1/p_1 \leq \hat{p}_i/p_i$ for all i with strict inequality for at least one i,¹⁷ we have
$x_1(p_1, \ldots, p_n) = x_1\left(\dfrac{\hat{p}_1}{p_1}p_1, \ldots, \dfrac{\hat{p}_1}{p_1}p_n\right) < x_1(\hat{p}_1, \ldots, \hat{p}_n) = \bar{x}_1$
where the equality follows from homogeneity and the inequality is
due to gross substitutability. Hence dV/dt < 0 for the time interval $\tau$ if $p(t; p^0)$ is not
an equilibrium vector. A similar argument holds for any time interval so that dV/dt
< 0 for all t, if $p(t; p^0)$ is not an equilibrium vector. Similarly, we can show that
dv/dt > 0 for all t, if $p(t; p^0)$ is not an equilibrium vector. Hence the solution $p(t; p^0)$
is contained in a compact set $\{p: v(p^0) \leq p_i/\hat{p}_i \leq V(p^0), i = 1, 2, \ldots, n\}$ of positive
vectors.¹⁸ Hence by applying Theorem 3.H.4, every limit point of p(t) as t tends to
infinity is an equilibrium point. Hence there exists a sequence $t_q$ such that $t_q \to \infty$
($q \to \infty$) and
$\lim_{q \to \infty} \dfrac{p_i(t_q)}{\hat{p}_i} = 1, \quad i = 1, 2, \ldots, n$
Hence both $V[p(t_q; p^0)]$ and $v[p(t_q; p^0)]$ go to unity as $q \to \infty$. But
$v[p(t; p^0)] \leq \dfrac{p_i(t; p^0)}{\hat{p}_i} \leq V[p(t; p^0)]$
for all t and i = 1, 2, ..., n, and $V[p(t; p^0)]$ and $v[p(t; p^0)]$ are both bounded
and monotonic. Therefore $\lim_{t \to \infty} p_i(t)/\hat{p}_i$ always exists and equals 1, for i =
1, 2, ..., n. This proves the uniqueness and global stability of the equilibrium price
vector.
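The monotonicity of the two proxies can be illustrated by a small simulation. The sketch below rests on assumptions that are not in the text: a single Cobb-Douglas trader with expenditure shares `shares` and endowment `endowment`, so that the excess demand for good i is `shares[i] * (p . endowment) / p[i] - endowment[i]` (homogeneous of degree zero and gross substitutable), and a forward Euler step for the nonnormalized process $dp_i/dt = f_i(p)$. Since Walras' law keeps $\sum_i p_i^2$ constant along the exact solution, the equilibrium vector used as the reference is scaled to the norm of the starting price vector.

```python
import math

def excess_demand(p, shares, endowment):
    """Cobb-Douglas excess demand (an assumed example, not from the text)."""
    income = sum(pi * wi for pi, wi in zip(p, endowment))
    return [a * income / pi - wi for a, pi, wi in zip(shares, p, endowment)]

def equilibrium_with_norm(shares, endowment, norm):
    """Equilibrium prices are proportional to shares[i]/endowment[i]."""
    direction = [a / w for a, w in zip(shares, endowment)]
    scale = norm / math.sqrt(sum(d * d for d in direction))
    return [scale * d for d in direction]

def simulate(p0, shares, endowment, dt=0.01, steps=5000):
    """Euler path of dp/dt = f(p); returns the paths of max and min of p_i/p_hat_i."""
    p_hat = equilibrium_with_norm(shares, endowment,
                                  math.sqrt(sum(x * x for x in p0)))
    p = list(p0)
    big_v, small_v = [], []
    for _ in range(steps + 1):
        ratios = [pi / ph for pi, ph in zip(p, p_hat)]
        big_v.append(max(ratios))    # V(p) = max_i p_i / p_hat_i
        small_v.append(min(ratios))  # v(p) = min_i p_i / p_hat_i
        z = excess_demand(p, shares, endowment)
        p = [pi + dt * zi for pi, zi in zip(p, z)]
    return big_v, small_v

big_v, small_v = simulate([1.0, 1.0, 1.0], [0.2, 0.3, 0.5], [1.0, 1.0, 1.0])
print(big_v[0], big_v[-1], small_v[0], small_v[-1])
```

In this run the upper proxy should fall and the lower proxy should rise, both approaching one, up to the small drift that the Euler discretization introduces.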
FOOTNOTES
1. We may show that if there is Liapunov stability for some initial time $t^0$, then there
is Liapunov stability for any other initial time $t^1$, provided that all motions are
continuous in the initial state.
2. In the above definition of Liapunov stability, if $\delta$ can be chosen independently of
$t^0$, then the equilibrium $\hat{x}$ is said to be uniformly Liapunov stable. If $\hat{x}$ is not Liapunov stable,
$\hat{x}$ is called unstable. An example of a uniformly Liapunov stable equilibrium is
$\hat{x} = 0$ in $\dot{x} = 0$, $x \in R$.
3. As an example, consider $\dot{x} = -x/(t + 1)$, $x \in R$. The solution can be written as
$x(t; x^0, t^0) = x^0(t^0 + 1)/(t + 1)$. Then $\hat{x} = 0$ is asymptotically locally stable,
since (1) $\|x(t; x^0, t^0)\| \leq \|x^0\|$ whenever $t \geq t^0$ (Liapunov stable) and
(2) $x(t; x^0, t^0) \to 0$ as $t \to \infty$. If condition (i) in the definition is replaced by uniform
Liapunov stability (that is, $\delta$ does not depend on $t^0$) and if r and T do not depend on
$t^0$ in condition (ii), then $\hat{x}$ is said to be uniformly (asymptotically) locally stable.
As an example, consider $\dot{x} = -x$, $x \in R$ and $\hat{x} = 0$. The point $\hat{x} = 0$ in the above
example, $\dot{x} = -x/(t + 1)$, $x \in R$, is not uniformly locally stable. See Yoshizawa
[10], p. 96.
4. The phrase "globally stable" can be replaced by "stable in the large." Similarly,
"locally stable" can be replaced by "stable in the small." The word "asymptotically"
is often dropped in economics literature (although this is not usually the case in
mathematics literature).
5. Often the equilibrium $\hat{x}$ is simply called (asymptotically) uniformly globally stable; that is, the
word "strongly" is omitted. In this case no special name is given to the stability
property of $\hat{x}$ under (S4).
6. That is, V(x) is "positive definite." In general, any real-valued continuous function
V(x, t), defined on $X \otimes (-\infty, \infty)$ where $X \subset R^n$, is said to be positive definite in
region $D \subset X$, if there exists a continuous real-valued function W(x) defined on D
such that $V(x, t) \geq W(x)$ for all t where W(x) > 0 if $x \neq 0$ and W(0) = 0.
7. That is, $\dot{V} < 0$ along the solution path $x(t; x^0, t^0)$. We may define such a $\dot{V}(x)$ by
$\dot{V}(x) \equiv V_x(x) \cdot f(x)$, noting $\dot{x} = f(x)$, where $V_x$ is the gradient vector of V. Clearly
$\dot{V}(0) = 0$ since f(0) = 0. Note that condition (ii) implies that the origin is the unique
equilibrium point, since if $\hat{x} \neq 0$ is an equilibrium point [that is, $f(\hat{x}) = 0$], then
$\dot{V}(\hat{x}) = V_x(\hat{x}) \cdot f(\hat{x}) = 0$, contradicting condition (ii).
8. The sign of $\varepsilon$ is crucial. If $\varepsilon < 0$, it is known that the only equilibrium point is the
origin and it is unstable. Moreover, there is a unique "limit cycle" which surrounds
the origin. In economic theory such a differential equation (with $\varepsilon < 0$) is associated
with the so-called "Kalecki-Kaldor model" of business cycles. The limit cycle is
supposed to constitute the "business cycles." The name "van der Pol" is due to his
article, "Relaxation-Oscillations," Philosophical Magazine, series 7, vol. 2, November
1926.
9. This form is known as the "Liénard form." The present example is discussed in
Yoshizawa [10].
10. Note that $dV[x(t, x^0, t^0), t]/dt = V_x \cdot f[x(t, x^0, t^0), t] + \partial V/\partial t$.
11. That is, $\|f(x, t) - f(y, t)\| \leq k\|x - y\|$ where k is a positive constant.
12. This theorem is due to Massera. See J. L. Massera, "Contributions to Stability
Theory," Annals of Mathematics, 64, 1956.
13. The following remark by Kalman and Bertram ([3], p. 371) is quite instructive.
The principal idea of the second method is contained in the following physical
reasoning: If the rate of change dE(x)/dt of the energy E(x) of an isolated
physical system is negative for every possible state x, except for a single
equilibrium state xe, then the energy will continually decrease until it finally
assumes its minimum value E(xe). In other words, a dissipative system per-
turbed from its equilibrium state will always return to it.
14. This illustration is from Uzawa [9], pp. 623-624.
15. The use of such proxies rather than the distance itself, together with Theorem 3.H.4,
simplifies Uzawa's proof of global stability considerably.
16. In general, V and v are not necessarily differentiable in t, although they are continuous
in t. However, the nondifferentiable case can be analyzed analogously. See
Uzawa [9], pp. 623-624. Note that Theorem 3.H.4 requires only the continuity
(and not the differentiability) of the modified Liapunov function.
17. For, otherwise, p(t) is an equilibrium.
18. This proves $p(t; p^0) > 0$ for all $t \geq 0$, as long as $p^0 > 0$.
REFERENCES
Section A
INTRODUCTION
be the amount of the ith good used for (final) consumption purposes. Then the
demand = supply equilibrium relation for each good is written as
(1) $\sum_{j=1}^{n} a_{ij}x_j + c_i = x_i, \quad i = 1, 2, \ldots, n$
(2)
Clearly these two questions are very closely related. In fact, we can prove
that (i) is answered affirmatively if and only if (ii) is answered in the affirmative.
However, whether these questions can be answered in the affirmative is not at all
obvious; (I - A) x = c involves n equations and n unknowns for a given c, but this
certainly does not guarantee the existence of x or its nonnegativity.
The study of the existence problem produced the following interesting
conditions as necessary and sufficient for (i) to be answered in the affirmative.
where $b_{ij}$ is the i-j element of the matrix $B \equiv (I - A)$. In other words, for any $c \geq 0$
there exists a unique $x \geq 0$ such that $A \cdot x + c = x$ if and only if all the successive
principal minors of (I - A) are positive. The condition (H-S) is now known as the
Hawkins-Simon condition.
In order to obtain an intuitive understanding of this condition and the
Leontief system, let us consider a simple two-industry (say, steel and coal) input-
output model. In this case, the Hawkins-Simon condition is expressed as
(4) $1 - a_{11} > 0$ and $\begin{vmatrix} 1 - a_{11} & -a_{12} \\ -a_{21} & 1 - a_{22} \end{vmatrix} > 0$
Note that the second of the above conditions may also be written as
(5) $(1 - a_{11})(1 - a_{22}) > a_{12}a_{21}$
(6) $a^1 \equiv \begin{bmatrix} 1 - a_{11} \\ -a_{21} \end{bmatrix}, \quad a^2 \equiv \begin{bmatrix} -a_{12} \\ 1 - a_{22} \end{bmatrix}$
that $a^1 x_1$ and $a^2 x_2$ (such as points A and B) add up to c. It is easy to see that if
the slope of the OA ray is equal to or greater than the slope of the OB ray (that is,
the angle $\theta$ is equal to or greater than 180°), then there does not exist a point in
the nonnegative orthant of the ($c_1$-$c_2$)-plane (except the origin) such that $x_1 \geq 0$
and $x_2 \geq 0$ and $a^1 x_1 + a^2 x_2 = c$. Hence with the assumption of $1 - a_{11} > 0$ and
$1 - a_{22} > 0$ tacitly made in the construction of Figure 4.1, a necessary and
sufficient condition for an $x \geq 0$ to exist to satisfy $(I - A) \cdot x = c$ for any $c \geq 0$
is, for the two-industry case, that the slope of the OA ray be flatter than the
slope of the OB ray. In other words,
(7) $\dfrac{a_{21}}{1 - a_{11}} < \dfrac{1 - a_{22}}{a_{12}}$
which is nothing but condition (5) in the Hawkins-Simon condition.
Let c be a final demand vector. In order to obtain this bundle of goods, we
need the "first-round" input requirements of the goods, namely, $A \cdot c$. But to
obtain the bundle $A \cdot c$, we need the "second-round" input requirements of the
goods, that is, $A \cdot (A \cdot c) = A^2 \cdot c$. Then on to the third round, and so on, ad infinitum.
Therefore the total requirements of the goods would be
(8) $c + A \cdot c + A^2 \cdot c + \cdots$
Now the question arises whether this infinite series converges and is in fact equal
to the bundle of goods produced in the economy, x. That is, the following problem
is posed.
THE CONVERGENCE PROBLEM: Does the above series converge? If so,
can we assert that
(9) $\sum_{k=0}^{\infty} A^k \cdot c = x$
It turns out that this problem can be answered in the affirmative if and only
if the existence problem or the nonsingularity problem is answered in the affirma-
tive, or if and only if the Hawkins-Simon condition holds.
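The link between the Hawkins-Simon condition, the existence problem, and the convergence problem can be tried out on a small numerical case. The sketch below is illustrative only: the two-industry coefficient matrix `A` and the final demand `c` are made-up numbers, the function names are hypothetical, and the determinant is computed by cofactor expansion, which is adequate only for small n.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def identity_minus(a):
    """Return the matrix I - A."""
    n = len(a)
    return [[(1.0 if i == j else 0.0) - a[i][j] for j in range(n)]
            for i in range(n)]

def hawkins_simon(a):
    """True if all successive principal minors of I - A are positive."""
    b = identity_minus(a)
    return all(det([row[:k] for row in b[:k]]) > 0
               for k in range(1, len(b) + 1))

def solve_cramer(b, c):
    """Solve b x = c by Cramer's rule (assumes det(b) != 0)."""
    d = det(b)
    return [det([row[:j] + [c[i]] + row[j + 1:] for i, row in enumerate(b)]) / d
            for j in range(len(b))]

def neumann_sum(a, c, terms=200):
    """Partial sum c + A c + A^2 c + ... with the given number of extra terms."""
    total, term = list(c), list(c)
    for _ in range(terms):
        term = [sum(aij * tj for aij, tj in zip(row, term)) for row in a]
        total = [s + t for s, t in zip(total, term)]
    return total

A = [[0.2, 0.3], [0.4, 0.1]]   # assumed input coefficients
c = [10.0, 20.0]               # assumed final demand
x_direct = solve_cramer(identity_minus(A), c)
x_series = neumann_sum(A, c)
print(hawkins_simon(A), x_direct, x_series)
```

For this `A` the condition holds, and the solution of $(I - A) \cdot x = c$ agrees with the partial sums of the series; a matrix such as `[[0.9, 0.5], [0.5, 0.9]]` fails the second principal minor test.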
Empirically, the input-output matrix A is typically estimated for a particular
year. Suppose that for that year we have c > 0 and x > 0. In other words, we have
(10) For a given particular c > 0, there exists an x > 0 such that $(I - A) \cdot x = c$
Clearly if the answer to the existence problem is affirmative, (10) holds. What
about the converse? We will prove later that the converse also holds. Then condition
(10) becomes a necessary and sufficient condition for an affirmative answer
to the existence problem, the nonsingularity problem, and the convergence
problem, and it is also necessary and sufficient for the Hawkins-Simon condition
(see Section C).
Obviously condition (10) can be restated as follows:
(11) There exists an x > 0 such that (I - A) x > 0
It will be shown later (see Section D) that condition (11) is equivalent to the
following:
(12) There exists a $p \geq 0$ such that $(I - A)' \cdot p > 0$
where (I - A)' is the transpose of (I - A) and p may be interpreted as a "price"
vector. The amount of "value-added" (per unit output) by the jth industry, $v_j$,
can be defined as
(13) $v_j \equiv p_j - \sum_{i=1}^{n} p_i a_{ij}, \quad j = 1, 2, \ldots, n$
Hence condition (12) can be interpreted as implying the existence of a price
vector $p \geq 0$ such that the $v_j$ computed by use of this p is positive for all j. Obviously
the $v_j$'s go to other factors of production such as labor.
Consider the following sums of the coefficients ($a_{ij}$'s) of the input-output
matrix:
(14-a) $r_i \equiv \sum_{j=1}^{n} a_{ij}, \quad i = 1, 2, \ldots, n$
(14-b) $s_j \equiv \sum_{i=1}^{n} a_{ij}, \quad j = 1, 2, \ldots, n$
(15-a) $r_i < 1$ for all i
(15-b) $s_j < 1$ for all j
The conditions (15-a) and (15-b) are known as the (Brauer-) Solow conditions (see
Section C). Condition (15-a) should not be surprising to the reader, for it simply
asserts a special case of condition (11). That is, condition (15-a) asserts that
condition (11) is realized by an x whose elements are all equal to one. Similarly,
condition (15-b) asserts a special case of condition (12) [choose p in (12) with
elements all equal to one]. In other words, condition (15-b) states that if we choose
the unit of measurement of each good properly so that the price of each good
is equal to one, then the value-added (per unit output) of each good is positive.
In the course of these studies it was realized that the input-output matrix,
A, has a special property, that is, all of its elements are nonnegative. By imposing
this special nonnegativity restriction on the matrix, it was conjectured that we
should be able to obtain stronger results than those listed in the usual textbooks
on matrix algebra. Looking back into journals of mathematics, economists found
that such matrices had been discussed at the beginning of the century by the
German mathematicians Perron and Frobenius. Hence the theorem, now called
the "(Perron-) Frobenius theorem," suddenly attracted a great deal of attention
from economists.¹ A number of papers (by Metzler, Debreu-Herstein, Solow,
Chipman, Morishima, Goodwin, and so on) have been published on the properties
of A (the nonnegative matrix). (See Section C.) By using the properties of such
an A, the nature of the (I - A) matrix was made precise and clear (see Section D).
All of these studies are treated in a unified fashion by McKenzie [9] and Nikaido
[13]. The unifying concept here is taken from condition (11) or condition (12).
FOOTNOTES
REFERENCES
1. Chenery, H. B., and Clark, P. G., Interindustry Economics, New York, Wiley, 1959.
2. Chipman, J. S., The Theory of Inter-Sectoral Money Flows and Income Formation,
Baltimore, Md., Johns Hopkins University Press, 1951.
3. Dorfman, R., Samuelson, P. A., and Solow, R. M., Linear Programming and Economic
Analysis, New York, McGraw-Hill, 1958.
4. Fukuchi, T., Introduction to Linear Economics, Tokyo, Toyo Keizai Shimpo-sha,
1963 (in Japanese).
5. Gantmacher, F. R., The Theory of Matrices, Vol. II, New York, Chelsea Publishing
Co., 1959 (tr. from Russian).
6. Goodwin, R. M., "Does the Matrix Multiplier Oscillate?" Economic Journal, LX,
December 1950.
7. Karlin, S., Mathematical Methods and Theory in Games, Programming and Economics,
Vol. I, Reading, Mass., Addison-Wesley, 1959.
8. Leontief, W. W., The Structure of American Economy, 1919-1939, 2nd ed., New
York, Oxford University Press, 1951.
9. McKenzie, L. W., "Matrices with Dominant Diagonals and Economic Theory,"
in Mathematical Methods in the Social Sciences, 1959, ed. by Arrow, Karlin, and
Suppes, Stanford, Calif., Stanford University Press, 1960.
10. Metzler, L. A., "Stability of Multiple Markets: The Hicks Conditions," Econometrica,
13, October 1945.
11. Metzler, L. A., "A Multiple Region Theory of Income and Trade," Econometrica, 18,
October 1950.
12. Morishima, M., Interindustry Relations and Economic Fluctuations (Sangyo-renkan
to Keizai Hendo), Tokyo, Yuhikaku, 1955 (in Japanese).
13. Nikaido, H., Introduction to Sets and Mappings in Modern Economics, tr. by K. Sato,
Amsterdam, North-Holland, 1970 (the Japanese original, Tokyo, 1960).
14. Nikaido, H., Convex Structures and Economic Theory, New York, Academic Press, 1968.
Section B
FROBENIUS THEOREMS
(a) $|A - \lambda I| = \lambda^2 - 2\lambda = \lambda(\lambda - 2) = 0$ ∴ $\lambda = 0$ and 2
(b) $\lambda = 1$ (double root)
(c) $\lambda^2 = -1$ ∴ $\lambda = \pm i$ (complex roots)
When $\lambda$ is a simple root of the characteristic equation, $\lambda$ is called a simple
eigenvalue (or simple root) of A.
We will now restrict ourselves to an n × n matrix whose elements are all
nonnegative. First recall our conventions with regard to vector and matrix notation.
Here x is a vector whose ith element is $x_i$, and A is a matrix whose i-j
element is $a_{ij}$.
NOTATIONS:
(a) $x \geqq 0$ if $x_i \geq 0$ for all i
$x \geq 0$ if $x_i \geq 0$ for all i and $x_i > 0$ for some i (that is, $x \neq 0$)
$x > 0$ if $x_i > 0$ for all i
(b) $A \geqq 0$ if $a_{ij} \geq 0$ for all i and j
$A \geq 0$ if $a_{ij} \geq 0$ for all i and j and $a_{ij} > 0$ for some i and j
$A > 0$ if $a_{ij} > 0$ for all i and j
Although we are concerned here exclusively with an (n × n) square matrix A, the
matrix A in the above notation does not have to be square. If $A \geqq 0$, we call A
a nonnegative matrix, and if A > 0, we call A a (strictly) positive matrix ($A \geq 0$ is
often called a semipositive matrix). One may consider A as an input-output matrix
so that $a_{ij}$ denotes the amount of ith input needed to produce a unit of the jth
output.
EXAMPLE: $\sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$ is a permutation of 1 → 2, 2 → 3, and 3 → 1 [that
is, $\sigma(1) = 2$, $\sigma(2) = 3$, and $\sigma(3) = 1$].
A permutation matrix, usually denoted by P, is the one which is obtained by
permuting the columns (or rows) of the identity matrix. Or, more formally, it is
defined as follows.
j = 1, 2, ..., n [resp. $p_{i\sigma(i)} = 1$, i = 1, 2, ..., n], and if $p_{ij} = 0$ for all $i \neq \sigma(j)$
[resp. $p_{ij} = 0$ for all $j \neq \sigma(i)$].
REMARK: The identity matrix itself is a permutation matrix. Every per-
mutation matrix can be obtained by interchanging (two) columns (or rows)
of the identity matrix a finite number of times.
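EXAMPLE:
$P = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$
$I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \to \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$
obtained by permuting the rows of the identity matrix by $\sigma$. For example,
$P_\sigma = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}$
is obtained by permuting the rows of the (3 × 3) identity matrix by the above $\sigma$.
It can also be obtained by interchanging the rows as follows (the number to the right of each row is its row number in the identity matrix):
$I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{matrix} 1 \\ 2 \\ 3 \end{matrix} \to \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix} \begin{matrix} 3 \\ 2 \\ 1 \end{matrix} \to \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} \begin{matrix} 2 \\ 3 \\ 1 \end{matrix}$
It can be shown easily that $P_\sigma' = P_\sigma^{-1}$. Note that $P_\sigma^{-1} \cdot A \cdot P_\sigma$ (or $P_\sigma \cdot A \cdot P_\sigma^{-1}$) is the
matrix obtained by permuting the rows and the columns by $\sigma$.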
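The effect of such a simultaneous permutation of rows and columns can be made concrete with a short computation. The sketch below is an illustration under assumptions of its own: indices are 0-based, the permutation 1 → 2, 2 → 3, 3 → 1 is stored as the list `[1, 2, 0]`, the permutation matrix is built row-wise (row i has its one in column `sigma[i]`), and the entries of `A` are placeholder numbers chosen so that the entry `10*i + j` reveals its original position.

```python
def permutation_matrix(sigma):
    """Row i of the result has a 1 in column sigma[i] (0-based indices)."""
    n = len(sigma)
    return [[1 if j == sigma[i] else 0 for j in range(n)] for i in range(n)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def mat_mul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

sigma = [1, 2, 0]                      # 1 -> 2, 2 -> 3, 3 -> 1 in 0-based form
P = permutation_matrix(sigma)
identity = [[1 if i == j else 0 for j in range(3)] for i in range(3)]

# The transpose of a permutation matrix is its inverse.
is_orthogonal = mat_mul(P, transpose(P)) == identity

# Placeholder matrix: entry 10*(i+1) + (j+1) stands for a_ij.
A = [[11, 12, 13], [21, 22, 23], [31, 32, 33]]
relabelled = mat_mul(mat_mul(transpose(P), A), P)   # P^{-1} A P
print(P, is_orthogonal, relabelled)
```

With these conventions the product $P^{-1} \cdot A \cdot P$ moves the entry $a_{ij}$ to position $(\sigma(i), \sigma(j))$, so rows and columns are renumbered together and the diagonal entries stay on the diagonal.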
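EXAMPLE:
$\sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \quad A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$
1 → 3, 2 → 1, and 3 → 2. Since the numbering or the naming of the industries
should not alter what is going on in the economy, we may sometimes wish to
perform $P_\sigma^{-1} \cdot A \cdot P_\sigma$ by choosing $\sigma$ properly.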
We are now ready to introduce the first important concept.
where $A_{11}$ and $A_{22}$ are square submatrices.² If this is impossible, A is called
indecomposable.
REMARK: This definition can be restated as follows: A is called "decomposable"
if (1) there exists a partition {J, K} of N = {1, 2, ..., n}, such that
[ 0 A22]
this case, not only do the industries in the J-group not require any inputs
from the K-group industries, but also the K-group industries do not require
any inputs from the J-group industries as well.
EXAMPLE:
(12 2
is decomposable by P, where a = 1". That is,
3
1 1 1
0 1 1
0 1 1
0 0 1 2
0 2
(ii) AZ = 0 0 is indecomposable
3
2 4 0 0
REMARK: The reader may wish to prove that $A_1$ and $A_2$ in the above
examples are indeed indecomposable.
We now prove the following lemma.
$x = (0, 0, \ldots, 0, 1, 0, \ldots, 0)$
we have $[I + A]^{n-1} > 0$. To show that $[I + A]^{n-1} \cdot x > 0$ for any $x \geq 0$, it,
in turn, suffices to show that the vector $y \equiv [I + A] \cdot x$ always has fewer zero
coordinates than x does. Suppose the contrary. Note that $y = x + A \cdot x$ and
$A \cdot x \geqq 0$, so that for each positive coordinate of x, there corresponds a
positive coordinate of y. Thus x cannot have more positive coordinates than
y. Hence x has the same zero coordinates as y. Without loss of generality,
we may suppose that x and y have the form
Definition: The root $\hat{\lambda}$ in the above theorem is called the Frobenius root of A; it
is denoted by $\hat{\lambda}_A$ or simply $\hat{\lambda}$.
PROOF. OF THEOREM 4.B.1 (WIELANDT [121): Given x E Rn, with x >_ 0, define
Ax = max {A: A x > Ax, A E R}. Let (A x); =,E1 a;xj and define A.(x) by
(A. x);
x;
ifx;> 0
A (x) =
, oo if x; = 0 an d (A x);fined
0
und e if x1 = 0 and (A x); = 0
[Figure 4.2. An Illustration of $\lambda_x$.]
$$S = \{x \in R^n : \sum_{i=1}^n x_i^2 = 1,\ x \ge 0\}$$
$$\tilde{S} = \{y : y = (I + A)^{n-1} \cdot x,\ x \in S\}$$
Clearly $\tilde{S}$ is compact, since S is compact (Theorem 0.A.17). Moreover, by
Lemma 4.B.1, $\tilde{S}$ consists solely of positive vectors. Multiply both sides of
the inequality $A \cdot x \ge \lambda_x x$ by $(I + A)^{n-1} > 0$, and obtain $A \cdot y \ge \lambda_x y$, where
$y = (I + A)^{n-1} \cdot x$. Hence by the definition of $\lambda_y$, $\lambda_y \ge \lambda_x$. Hence, instead of
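(i-a) $\hat\lambda > 0$. Let u = (1, 1, ..., 1) ∈ R^n. Then $\lambda_u = \min_{1 \le i \le n} \sum_{j=1}^n a_{ij}$. Since no rows
of an indecomposable matrix can consist only of zeros, $\lambda_u > 0$. Since $\hat\lambda \ge \lambda_u$,
$\hat\lambda > 0$.
(i-b) $\hat\lambda$ is an eigenvalue of A and $\hat x$ is its eigenvector, that is, $A \cdot \hat x = \hat\lambda \hat x$. Suppose
that $A \cdot \hat x \ne \hat\lambda \hat x$. By the definition of $\hat\lambda$, $A \cdot \hat x \ge \hat\lambda \hat x$, so that $A \cdot \hat x - \hat\lambda \hat x \ge 0$ (and $\ne 0$).
Multiply both sides of this inequality by $(I + A)^{n-1} > 0$ (recall Lemma 4.B.1),
and obtain $A \cdot \hat y - \hat\lambda \hat y > 0$, where $\hat y = (I + A)^{n-1} \cdot \hat x$.
This contradicts the definition of $\hat\lambda$, for it implies that $A \cdot \hat y - (\hat\lambda + \varepsilon)\hat y > 0$ for
any sufficiently small ε > 0, that is, $\lambda_{\hat y} \ge (\hat\lambda + \varepsilon) > \hat\lambda$. Hence $A \cdot \hat x = \hat\lambda \hat x$.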
(ii) $\hat x > 0$. Let $z = (I + A)^{n-1} \cdot \hat x$. Then by Lemma 4.B.1, z > 0. But $(I + A)^{n-1} \cdot \hat x =
(1 + \hat\lambda)^{n-1} \hat x$. Hence $0 < z = (1 + \hat\lambda)^{n-1} \hat x$ so that $\hat x > 0$.
(iii) $\hat x$ is unique up to a scalar multiple. Let y be another eigenvector associated with
$\hat\lambda$. Let $\theta \equiv \min_{1 \le i \le n} y_i/\hat x_i$, and let $\bar y \equiv y - \theta \hat x$. Then $A \cdot \bar y = A \cdot (y - \theta \hat x) =
\hat\lambda y - \theta \hat\lambda \hat x = \hat\lambda \bar y$. By definition of θ, $\bar y \not> 0$. Suppose $\bar y \ne 0$. Then $A \cdot \bar y = \hat\lambda \bar y$
means $\bar y$ is also an eigenvector associated with $\hat\lambda$. Hence using a proof
similar to that of (ii), $\bar y > 0$. This contradicts the condition that $\bar y \not> 0$. Hence
$\bar y = 0$ or $y = \theta \hat x$.
(iv) A·x = μx and x ≥ 0 (x ≠ 0) imply that $\mu = \hat\lambda$. Clearly A' is indecomposable if and
only if A is indecomposable, where A' is the transpose of A. Denote the
$\hat\lambda$ for A by $\lambda_A$ and the $\hat\lambda$ for A' by $\lambda_{A'}$; then we can easily show that
(vii) $\hat\lambda$ is a simple root. (Proof omitted; see Gantmacher [4], p. 57, or Debreu and
Herstein [1], p. 599.) (Q.E.D.)
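Wielandt's quantity $\lambda_x$ can be computed directly, and maximizing it numerically gives the Frobenius root. The following is only a minimal sketch: it uses a simple normalized iteration on I + A (a swapped-in numerical technique, not the compactness argument of the proof), and the names `lam`, `frobenius_root`, and the 2 x 2 matrix are hypothetical choices for illustration.

```python
def lam(A, x):
    """Wielandt's lambda_x: the largest number with A x >= lambda x.

    For x >= 0 (x != 0) this equals the minimum of (A x)_i / x_i over the
    coordinates with x_i > 0.
    """
    n = len(A)
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    return min(Ax[i] / x[i] for i in range(n) if x[i] > 0)


def frobenius_root(A, steps=200):
    """Approximate the Frobenius root and a positive eigenvector of an
    indecomposable A >= 0 by iterating x <- (I + A) x with normalization.

    I + A is used instead of A so that the iteration also settles down
    when A is imprimitive.
    """
    n = len(A)
    x = [1.0] * n
    for _ in range(steps):
        y = [x[i] + sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        s = sum(y)
        x = [v / s for v in y]
    return lam(A, x), x


# Hypothetical example; its eigenvalues are 0.5 and 0.
A = [[0.2, 0.1],
     [0.6, 0.3]]
root, x_hat = frobenius_root(A)
print(root, x_hat)              # root is close to 0.5, x_hat proportional to (1, 3)
print(lam(A, [1.0, 1.0]))       # the minimal row sum, a lower bound for the root
```

For u = (1, 1, ..., 1) the function `lam` returns the minimal row sum, which is the bound used in step (i-a).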
REMARK: The essential part of the above proof is (i). Note that (i) follows
from Weierstrass' theorem, an elementary property of compact sets. An
alternative proof using sequential compactness is provided by Nikaido [7].
Debreu and Herstein [1] gave a very simple and elegant proof by using
Brouwer's fixed point theorem. It should be stressed, however, that we do
not need such a powerful theorem to prove Frobenius' theorem I. Debreu
and Herstein's proof of (i) goes roughly as follows:
[Figure 4.3. An Illustration of T(x).]
PROOF: The proof is omitted. See Gantmacher [4], pp. 66-68, or Debreu
and Herstein [1], p. 600.
REMARK: In the above theorem, $\hat\lambda$ is again called the Frobenius root of A
and $\lambda_A$ denotes the Frobenius root of A. Owing to the lack of indecomposability
we miss certain properties which hold for the indecomposable
0 0 0 A34 0
0 0 0 0 A,n_1,
0 0 0 ... 0
Here the 0's in the diagonal are square matrices, but the $A_{i,i+1} \ge 0$ are not necessarily
square. If such a permutation does not exist, A is called primitive (or acyclic).
REMARK: This definition can also be stated as follows. An n × n indecomposable
matrix $A = [a_{ij}]$ is called imprimitive (otherwise primitive) if
(i) There exists a partition $\{J_1, J_2, \ldots, J_m\}$ of N = {1, 2, ..., n} such that
$N = J_1 \cup J_2 \cup \cdots \cup J_m$, $J_i \cap J_j = \emptyset$ (i ≠ j), $J_i \ne \emptyset$, i = 1, 2, ..., m,
and
(ii) $a_{ij} = 0$ ($i \notin J_{k-1}$, $j \in J_k$), and $\sum_{i \in J_{k-1}} a_{ij} > 0$ ($j \in J_k$), k = 1, 2, ..., m.
Here we regard $J_0$ as $J_m$. We note that this partition is not necessarily unique.
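$$A = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 5 & 0 \\ 0 & 0 & 0 & 3 & 0 & 0 \\ 0 & 4 & 0 & 0 & 0 & 0 \\ 6 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 & 0 \end{bmatrix}$$
can be shown to be imprimitive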
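The imprimitivity of this matrix can also be checked mechanically. The sketch below does not use the permutation definition given above; it assumes instead the standard equivalent characterization that an n × n nonnegative matrix is primitive if and only if $A^{n^2 - 2n + 2} > 0$ (Wielandt's bound), which is not stated in this section. The helper name `power_is_positive` is hypothetical.

```python
def power_is_positive(A, k, add_identity=False):
    """Return True if the k-th power of A (or of I + A) is strictly positive.

    Only the sign pattern is tracked, so booleans replace numbers.
    """
    n = len(A)
    M = [[(A[i][j] > 0) or (add_identity and i == j) for j in range(n)]
         for i in range(n)]
    P = [row[:] for row in M]
    for _ in range(k - 1):
        P = [[any(P[i][m] and M[m][j] for m in range(n)) for j in range(n)]
             for i in range(n)]
    return all(all(row) for row in P)


# The 6 x 6 matrix of the example.
A = [[0, 0, 0, 0, 0, 1],
     [0, 0, 0, 0, 5, 0],
     [0, 0, 0, 3, 0, 0],
     [0, 4, 0, 0, 0, 0],
     [6, 0, 0, 0, 0, 0],
     [0, 0, 2, 0, 0, 0]]
n = len(A)

indecomposable = power_is_positive(A, n - 1, add_identity=True)  # Lemma 4.B.1 criterion
primitive = power_is_positive(A, n * n - 2 * n + 2)              # assumed Wielandt bound
print(indecomposable, primitive)
```

Here every row and column has exactly one positive entry, so every power of A has the same property and can never be strictly positive, while I + A reaches every good within five steps.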
2 3 4 5 6
5): then by the permutation matrix Pa , we have
(11
1. Let or, =
4263
r0 0 0 1 001011
00 4 0 0
0 0 0 0 0 5
PRA-I
0 0 0 0 2 0
0 3 0 0 0 0
6 0 0 0 0 0
FOOTNOTES
1. Note that the second, third, and first columns of I now become the first, second, and
third columns of $P_\sigma$. If A is any 3 × 3 matrix whose jth column is $a^j$ (that is,
$A = [a^1, a^2, a^3]$), then $A \cdot P_\sigma = [a^2, a^3, a^1]$, under the above permutation.
2. It immediately follows from the definition that A is decomposable if and only if
the transpose of A is decomposable.
3. In fact, it is not necessary to assume the indecomposability of both $A_1$ and $A_2$.
It suffices to assume that only $A_1$ is indecomposable.
REFERENCES
1. Debreu, G., and Herstein, I. N., "Nonnegative Square Matrices," Econometrica, 21,
October 1953.
2. Frobenius, G., "Über Matrizen aus Positiven Elementen," Sitzungsberichte der
Königlichen Preussischen Akademie der Wissenschaften, 1908, pp. 471-76, 1909,
pp. 514-518.
3. Frobenius, G., "Über Matrizen aus Nicht Negativen Elementen," Sitzungsberichte der
Königlichen Preussischen Akademie der Wissenschaften, 1912, pp. 456-477.
4. Gantmacher, F. R., The Theory of Matrices, Vol. II, New York, Chelsea Publishing
Co., 1959 (tr. from Russian).
5. McKenzie, L. W., "Matrices with Dominant Diagonals and Economic Theory," in
Mathematical Methods in the Social Sciences, 1959, ed. by Arrow, Karlin, and Suppes,
Stanford, Calif., Stanford University Press, 1960.
6. Morishima, M., "The Mathematical Theory of the Leontief System," in his
Inter-Industry Relations and Economic Fluctuations, Tokyo, Yuhikaku, 1955 (in
Japanese).
7. Nikaido, H., Introduction to Sets and Mappings in Modern Economics, tr. by K. Sato,
Amsterdam, North-Holland, 1970 (the Japanese original, Tokyo, 1960).
8. Nikaido, H., Linear Mathematics for Economics, Tokyo, Baifukan, 1961 (in Japanese).
9. Nikaido, H., Convex Structures and Economic Theory, New York, Academic Press, 1968.
10. Perron, O., "Zur Theorie der Matrizen," Mathematische Annalen, 64, July 1907,
pp. 248-263.
11. Solow, R. M., "On the Structure of Linear Models," Econometrica, 20, January 1952.
12. Wielandt, H., "Unzerlegbare, Nicht Negative Matrizen," Mathematische Zeitschrift,
LII, März 1950, pp. 642-648.
Section C
DOMINANT DIAGONAL MATRICES
$$[I - A] \cdot x = c$$
Here A ≥ 0 is the input-output matrix, x is the output vector, and c is the final
demand vector. We may call [I - A] the Leontief matrix. We begin this section
by reminding the reader of some of our discussions in Section A.
Suppose we estimate the input-output table A for a particular year from the
statistical data for this year. Suppose that c > 0 and also that x > 0. In other words,
we have the following property for [I - A].
(i) For some c > 0, there exists an x > 0 such that [I - A]·x = c.1
Suppose now that we want to use this input-output table A to predict x for future
years. This can be done if we can predict the final demand vector, c, for these
years and if we can assume that A is "fairly" constant. This is a more or less
usual procedure for the application of the input-output table A. However, there
remains one obvious question: How can we guarantee the nonnegativity of the
x-vector which corresponds to some future c > 0? In other words, we want to
be able to make the following assertion.
In the course of studying this question, the condition that all the successive
principal minors of [I - A] be positive has been shown to be important. This
condition is called the Hawkins-Simon condition, as pointed out in Section A.
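The Hawkins-Simon condition is easy to test for a given input-output table. The sketch below computes the successive principal minors of [I - A] by Gaussian elimination; the function names `det` and `hawkins_simon` and both 2 x 2 coefficient matrices are hypothetical illustrations, not data from the text.

```python
def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    n = len(M)
    M = [list(map(float, row)) for row in M]   # work on a copy
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[p][c] == 0.0:
            return 0.0
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d                              # a row swap flips the sign
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d


def hawkins_simon(A):
    """Return (condition holds?, list of successive principal minors of I - A)."""
    n = len(A)
    B = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)] for i in range(n)]
    minors = [det([row[:k] for row in B[:k]]) for k in range(1, n + 1)]
    return all(m > 0 for m in minors), minors


A_ok = [[0.2, 0.1],
        [0.6, 0.3]]      # hypothetical productive table
A_bad = [[0.5, 0.6],
         [0.9, 0.4]]     # hypothetical table that uses up more than it produces
print(hawkins_simon(A_ok))
print(hawkins_simon(A_bad))
```

For `A_ok` the minors are 0.8 and 0.5, so the condition holds; for `A_bad` the second minor is negative.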
Another question has arisen in connection with the problem of "dynamiz-
ing" the Leontief input-output relation. It is now known that the crucial con-
dition here is that the absolute values (modulus) of A's eigenvalues are all less
than one. It is also known that this condition is closely related to conditions (i) and
(ii) above, that is, the Hawkins-Simon condition. In the course of those studies,
the Frobenius theorems were rediscovered and have since played an important
role in developments in this area. We now know that condition (i) is crucial
in the study of the matrix [I - A] ; for then the concept of the "dominant diagonal
matrix" can be used by economists. The relationship of these properties of the
matrix [I - A] or of "dominant diagonal matrices" to other studies in economics,
such as the theory of stability of a competitive market, has also been realized.
McKenzie's article [11] brilliantly summarizes the whole of this unifying structure.
Nikaido's work [16], which was published at about the same time, is partly
devoted to displaying this unifying structure as well. The purpose of this section
is to clarify the mathematical structure of these problems. Hence it is natural that
our exposition rely heavily on McKenzie [11] and Nikaido [16]. We begin with
the definition of a dominant diagonal matrix.
we have $|x_j||b_{jj}| \le \sum_{i \ne j} |x_i||b_{ij}| \le \sum_{i \ne j} |x_j||b_{ij}|$ for $j \in J$. Or $|b_{jj}| \le
\sum_{i \ne j} |b_{ij}|$, $j \in J$. This contradicts the assumption that A has d.d. so that
$|b_{jj}| > \sum_{i \ne j} |b_{ij}|$ for all j. (Q.E.D.)
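A dominant diagonal in McKenzie's extended sense can be verified once a candidate weight vector d > 0 is given. The sketch below assumes the column form of the definition that appears later in this section, namely $d_j|a_{jj}| > \sum_{i \ne j} d_i|a_{ij}|$ for all j; the function name and the matrices are hypothetical, and finding a suitable d is not attempted here.

```python
def has_dominant_diagonal(A, d):
    """Check d_j |a_jj| > sum over i != j of d_i |a_ij| for every column j."""
    n = len(A)
    return all(
        d[j] * abs(A[j][j]) > sum(d[i] * abs(A[i][j]) for i in range(n) if i != j)
        for j in range(n)
    )


B1 = [[0.8, -0.1],
      [-0.6, 0.7]]
B2 = [[0.9, -0.1],
      [-1.2, 0.9]]

print(has_dominant_diagonal(B1, [1.0, 1.0]))   # usual sense, d = (1, 1)
print(has_dominant_diagonal(B2, [1.0, 1.0]))   # fails with equal weights
print(has_dominant_diagonal(B2, [3.0, 1.0]))   # holds in the extended sense
```

The second matrix shows why the extended definition matters: its first column is not dominated with equal weights, yet the weights (3, 1) restore dominance in both columns.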
Theorem 4.C.2: If an n x n matrix A has d. d. that is positive, all its eigenvalues have
positive real parts.
Theorem 4.C.3: Let $B = [b_{ij}]$ be an n × n matrix with $b_{ii} > 0$ for all i and $b_{ij} \le 0$
for i ≠ j. Then there exists a unique x ≥ 0 such that B·x = c for every c ≥ 0 if
and only if B has d.d.
Clearly the first term on the left is nonpositive since $x_j \ge 0$ for $j \notin J$ and $b_{ij} \le 0$
for i ≠ j. By assumption of d.d., $\sum_{i \in J,\, i \ne j} d_i |b_{ij}| < d_j |b_{jj}|$ for all j, hence for
$j \in J$. Since $b_{jj} > 0$, this implies that $\sum_{i \in J,\, i \ne j} d_i b_{ij} + d_j b_{jj} = \sum_{i \in J} d_i b_{ij} > 0$ for
$j \in J$. Hence $\sum_{i \in J} \sum_{j \in J} d_i b_{ij} x_j < 0$; that is, the second term on the left-hand
side of (*) is negative. Thus the left-hand side of (*) is negative, which is a
contradiction.
Now we prove necessity. Consider B·x = c. By assumption, for any
c ≥ 0, there exists a unique x ≥ 0. In particular, let c > 0. Then x > 0, since
$b_{ii} > 0$ for all i and $b_{ij} \le 0$, i ≠ j. Hence B' has d.d. realized by this x. Then
by the above, B'·p = π has a unique solution p ≥ 0 for any π ≥ 0. In particular,
let π > 0; then p > 0, since $b_{ii} > 0$ for all i and $b_{ij} \le 0$, i ≠ j. In
other words, B has d.d. with respect to this p. (Q.E.D.)
Theorem 4.C.4: Let $B = [b_{ij}]$ be an n × n matrix such that $b_{ij} \le 0$ for i ≠ j; then
the following conditions are equivalent.
$c^i = (0, 0, \ldots, 1, \ldots, 0)$
≥ 0, i = 1, 2, ..., n, which implies $B^{-1} \ge 0$.
(ii) [(III) ⇒ (II)]: The proof follows trivially.
(iii) [(II) ⇒ (I)]: The proof follows trivially. (Q.E.D.)
REMARK: From the proof it is obvious that if B satisfies (I), then B has d.d.
that is positive.
REMARK: Let B be the Leontief matrix [I - A]. Then conditions (I) and
(II) are obviously restatements of our conditions (i) and (ii), respectively.
Hence we now know that condition (i) is equivalent to condition (ii).
REMARK: Condition (I) can be restated as follows:
(I) For some c > 0, B·x = c has a solution x ≥ 0.
Nikaido [16] called (I) the weak solvability condition and (II) the strong solvability
condition.
In other words,
$$\det B_k > 0, \quad k = 1, 2, \ldots, n, \quad \text{where } B_k = \begin{bmatrix} b_{11} & \cdots & b_{1k} \\ b_{21} & \cdots & b_{2k} \\ \vdots & & \vdots \\ b_{k1} & \cdots & b_{kk} \end{bmatrix}$$
2, ..., n, and k = i. In other words, condition (I) holds for all $B_k$, k = 1, 2,
..., n. Hence by Theorem 4.C.4, $B_k$ is nonsingular. Now suppose that $b_{ij}$
(i ≠ j) shrinks to zero. In this process condition (I) is clearly preserved so that
the nonsingularity of $B_k$ is also preserved. In other words, in this process of
shrinking the $b_{ij}$'s (i ≠ j), det $B_k$ will never be zero so that det $B_k$ keeps the
same sign. But in the limit of $b_{ij} = 0$ (i ≠ j), det $B_k > 0$ since $b_{ii} > 0$ for all i.
Hence det $B_k$ must also be positive (k = 1, 2, ..., n) for the original $b_{ij}$ (i ≠ j).
[(H-S) ⇒ (I)]: It suffices to show that (H-S) ⇒ (II). We prove this by mathematical
induction. For n = 1, $b_{11} x_1 = c_1$. Condition (H-S) implies that $b_{11} > 0$.
Hence for any $c_1 \ge 0$, there exists an $x_1 \ge 0$ such that $b_{11} x_1 = c_1$ (obviously
$x_1 = c_1/b_{11}$). Suppose that (II) holds for n - 1, and we want to prove that
it holds for n. Consider $\sum_{j=1}^n b_{ij} x_j = c_i$, i = 1, 2, ..., n (or B·x = c). We
want to show that, for any $c = (c_1, c_2, \ldots, c_n) \ge 0$, there exists an $x =
(x_1, x_2, \ldots, x_n) \ge 0$ such that $\sum_{j=1}^n b_{ij} x_j = c_i$, i = 1, 2, ..., n. Noting that
bil > 0 (hence 0), we obtain IbiIxi bljxj] biI /bII = ci - I
>0,k=2,3,...,n
bk2 ... bkk
That is, the (H-S) condition holds for the (n - 1) × (n - 1) matrix $[b'_{ij}]$.
Then by the induction hypothesis, $\sum_{j=2}^n b'_{ij} x_j = c'_i$ has a nonnegative solution
$(x_2, \ldots, x_n)$ for any nonnegative $(c'_2, \ldots, c'_n)$. Let $c = (c_1, c_2, \ldots, c_n) \ge 0$ be
an arbitrary vector. Compute $c'_i$, i = 2, 3, ..., n, from $c'_i = c_i - c_1 b_{i1}/b_{11}$;
$c'_i \ge 0$ since $b_{i1} \le 0$, i ≠ 1, and $b_{11} > 0$. Then obtain $x_2, x_3, \ldots, x_n \ge 0$ and
Corollary: Let $B = [b_{ij}]$ be an n × n matrix with $b_{ij} \le 0$, i ≠ j. Then (H-S) implies
that all the principal minors of B are positive,3 that is,
PROOF: By Theorem 4.C.5, (H-S) implies that condition (I) holds. That is,
for some c > 0 there exists an x > 0 such that B x = c. Then renumber
the coordinates of x and c and renumber the $b_{ij}$'s correspondingly. Clearly,
(I) holds throughout this process. Hence by Theorem 4.C.5 (H-S) holds for
the new system. Since the renumbering can be any permutation of { 1, 2,
. . ., n}, condition (h-s) holds. (Q.E.D.)
REMARK: This (h-s) is the so-called "Hawkins-Simon condition." Gant-
macher called the above corollary the Kotelyanskii theorem ([5], pp.
71-73).4
Theorem 4.C.8: Conditions (I), (II), (III), and (IV) are all equivalent to the following
condition.
(V) The real parts of all the eigenvalues of B are positive.
PROOF: By Theorem 4.C.7, condition (V) is equivalent to the condition
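PROOF: Suppose $x \not> 0$. Then we may write $x = \begin{bmatrix} x' \\ 0 \end{bmatrix}$, where x' > 0. Write
$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$ accordingly. Then A·x ≤ μx implies $A_{21} \cdot x' \le \mu 0 = 0$.
Hence $A_{21} = 0$. This contradicts the indecomposability of A. (Q.E.D.)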
[(VIII') ⇔ (I')]: It suffices to show that (VIII') ⇔ (III') since (III') ⇔
(I') by Theorem 4.C.4. Since (VIII') ⇒ (III') follows trivially, it remains to
show that (III') ⇒ (VIII').
[(III') ⇒ (VIII')]: Let c ≥ 0 be an arbitrary semipositive vector. By
assumption, $B^{-1} \ge 0$. Hence $x = B^{-1} \cdot c \ge 0$. We will show that x > 0. Note
that B·x = c implies ρx = A·x + c ≥ A·x, so that x ≥ 0. Hence by the
previous lemma, x > 0. Choose
c = (0, 0, ..., 0, 1, 0, ..., 0)
Then $B^{-1} \cdot c > 0$ means that the ith column of $B^{-1}$ is strictly positive. Let
i = 1, 2, ..., n. Thus $B^{-1} > 0$. (Q.E.D.)
Theorem 4.C.10 (Brauer [1], Solow [19]): Let $A = [a_{ij}]$ be an n × n nonnegative
indecomposable matrix. Let $r_i \equiv \sum_{j=1}^n a_{ij}$, i = 1, 2, ..., n (row sum). If $\rho \ge r_i$ for all i
with strict inequality for some i, then $\rho > \lambda_A$ where $\lambda_A$ is the Frobenius root of A.
PROOF: Let $c_i = \rho - r_i$ and B = ρI - A. Then B·x = c has a solution x =
(1, 1, ..., 1). Hence condition (VII') holds, so that condition (VI') follows;
that is, $\rho > \lambda_A$. (Q.E.D.)
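$$\lambda_A \sum_{i=1}^n x_i = \sum_{i=1}^n \sum_{j=1}^n a_{ij} x_j = \sum_{j=1}^n x_j \sum_{i=1}^n a_{ij} = \sum_{j=1}^n x_j s_j$$
Hence
$$\lambda_A = \frac{\sum_{j=1}^n x_j s_j}{\sum_{i=1}^n x_i}$$
(that is, $\lambda_A$ is a nonnegative weighted average of the column sums, the $s_j$'s).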
The statement of the theorem follows immediately from this relation.
(Q.E.D.)
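The weighted-average relation can be checked on a small case. The sketch below uses a hypothetical 2 x 2 matrix, for which the Frobenius root is simply the larger root of the characteristic equation; the variable names are illustrative only.

```python
import math

# Hypothetical nonnegative, indecomposable 2 x 2 matrix.
A = [[0.2, 0.1],
     [0.6, 0.3]]

# Frobenius root: the larger root of lambda^2 - tr*lambda + det = 0.
tr = A[0][0] + A[1][1]
dt = A[0][0] * A[1][1] - A[0][1] * A[1][0]
lam_A = (tr + math.sqrt(tr * tr - 4 * dt)) / 2

# A positive eigenvector read off the first row of (A - lam_A I) x = 0.
x = [A[0][1], lam_A - A[0][0]]

# Column sums s_j and their x-weighted average.
s = [A[0][j] + A[1][j] for j in range(2)]
weighted = sum(x[j] * s[j] for j in range(2)) / sum(x)

print(lam_A, s, weighted)
```

In this example the column sums are 0.8 and 0.4, the eigenvector is proportional to (1, 3), and the weighted average reproduces the root 0.5, which therefore lies between the smallest and the largest column sum.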
REMARK: Takayama [20] proved the theorem for the case when $s_j = 1$ for
all j. Fisher's result [4] as recorded above is more general, but his method
of proof is identical with that of Takayama [20]. Hence in essence it is only a
slight generalization.
Finally we may point out the interesting discussion on the "choice of units"
by Fisher [3]. Consider the Leontief system A·x + c = x. Suppose the jth
element of x and c (that is, the jth good) is to be measured in new units. Then the
jth row of A must be multiplied by, and the jth column divided by, the same
appropriate conversion factor. In other words, the shift of units will convert the
original matrix A to
nant diagonals if there exist d > 0 such that $d_j(1 - a_{jj}) > \sum_{i \ne j} d_i a_{ij}$, j = 1, 2, ..., n;
that is, $1 > \sum_{i=1}^n (d_i/d_j) a_{ij}$, j = 1, 2, ..., n. In other words, [I - A] has dominant
diagonals if and only if the column sums are less than unity with an appropriate
shift of units. In the definition of dominant diagonals, we noted that McKenzie
extended the usual definition. We can now see easily that the dominant diagonals
in McKenzie's extended sense are equivalent to the existence of a set of units in
which the matrix in question has dominant diagonals in the usual sense (see Fisher
[3], p. 448).
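The effect of a change of units on the column sums can be illustrated numerically. In the sketch below the conversion factors are collected in a vector d, so that the entry $a_{ij}$ becomes $(d_i/d_j) a_{ij}$ as in the inequality above; the function names and the numbers are hypothetical.

```python
def change_units(A, d):
    """Multiply row i of A by d_i and divide column j by d_j."""
    n = len(A)
    return [[d[i] * A[i][j] / d[j] for j in range(n)] for i in range(n)]


def column_sums(A):
    n = len(A)
    return [sum(A[i][j] for i in range(n)) for j in range(n)]


# Hypothetical input-output table whose first column sum exceeds one.
A = [[0.1, 0.1],
     [1.2, 0.1]]
d = [3.0, 1.0]                     # hypothetical conversion factors

print(column_sums(A))                    # about [1.3, 0.2]
print(column_sums(change_units(A, d)))   # about [0.5, 0.4]
```

With the original units the first column sum is above unity, while after the change of units both column sums fall below unity, so [I - A] has dominant diagonals in the extended sense with these weights.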
FOOTNOTES
1. Note that this condition is equivalent to the following condition: For some c > 0,
there exists an x ≥ 0 such that [I - A]·x = c. That is, x > 0 can be replaced by a
weaker statement x ≥ 0. To see this, write [I - A]·x = c > 0 as $x_i - \sum_{j=1}^n a_{ij} x_j =
c_i > 0$ for all i, and observe that this requires $x_i > 0$ for all i anyway.
2. Define $M = [m_{ij}]$ from $A = [a_{ij}]$ as follows: $m_{ij} = -|a_{ij}|$, i ≠ j and $m_{ii} = |a_{ii}|$. Then
it is easy to see that A has a dominant diagonal if and only if there exists a d > 0 such
that M'·d > 0, where M' is the transpose of M. Clearly this condition is closely
related to the above condition (i) for the Leontief matrix.
3. Therefore the condition (H-S) is equivalent to the condition (h-s).
4. Hawkins and Simon [7] obtained (h-s) and Nicholas Georgescu-Roegen obtained
(H-S) in his "Some Properties of a Generalized Leontief Model," in Activity Analysis
of Production and Allocation, ed. by T. C. Koopmans, New York, Wiley, 1951 (theorem
7), reprinted with revision and "A Postscript (1964)" in his Analytical Economics,
Cambridge, Mass., Harvard University Press, 1966.
5. If (IV) [that is, (H-S)] holds, then ρ > 0 is automatically implied if $a_{11} \ge 0$, for
(H-S), among other things, requires $\rho - a_{11} > 0$.
6. In the proof it will be shown that if the series is convergent it is equal to $[\rho I - A]^{-1}$.
7. See Nikaido [17], p. 97, and [16], section 19.
8. Note that the prime in (I'), (V'), and (VI') indicates that B has the specific form
[ρI - A], where A ≥ 0 is a given nonnegative square matrix.
9. In view of this lemma, we can immediately assert that if A·x ≤ μx for some μ and
x ≥ 0, $x \not> 0$, then A is decomposable. In fact, we can also show the converse of this
statement, that is, if A is decomposable, then A·x ≤ μx for some μ and x ≥ 0,
$x \not> 0$. The proof of this follows easily from the definition of decomposability.
REFERENCES
1. Brauer, A., "Limits for the Characteristic Roots of a Matrix," Duke Mathematical
Journal, 13, September 1946.
2. Debreu, G., and Herstein, I. N., "Nonnegative Square Matrices," Econometrica,
21, October 1953.
3. Fisher, F. M., "Choice of Units, Column Sums, and Stability in Linear Dynamic
Systems with Nonnegative Square Matrices," Econometrica, 33, April 1965.
4. Fisher, F. M., "An Alternate Proof and Extension of Solow's Theorem on Nonnegative
Square Matrices," Econometrica, 30, April 1962.
5. Gantmacher, F. R., The Theory of Matrices, Vol. II, New York, Chelsea Publishing
Co., 1959 (tr. from Russian).
6. Hawkins, D., "Some Conditions of Macro-economic Stability," Econometrica, 16,
October 1948.
7. Hawkins, D., and Simon, H. A., "Note: Some Conditions of Macro-economic
Stability," Econometrica, 17, July-October 1949.
8. Herstein, I. N., "Comments on Solow's `Structure of Linear Models,"' Econometrica,
20, October 1952.
9. Karlin, S., Mathematical Methods and Theory in Games, Programming and Economics,
Vol. 1, Reading, Mass., Addison-Wesley, 1959.
10. McKenzie, L. W., "An Elementary Analysis of the Leontief System," Econometrica,
25, July 1957.
11. McKenzie, L. W., "Matrices with Dominant Diagonals and Economic Theory," in Mathe-
matical Methods in Social Sciences, 1959, ed. by Arrow, Karlin, and Suppes, Stanford,
Calif., Stanford University Press, 1960.
12. Metzler, L. A., "Stability of Multiple Markets: The Hicks Conditions," Econo-
metrica, 13, October 1945.
13. Metzler, L. A., "A Multiple Region Theory of Income and Trade," Econometrica, 18,
October 1950.
14. Morgenstern, O., ed., Economic Activity Analysis, New York, Wiley, 1954, esp. articles
by Wong, Y. K., and Woodbury, M. A.
15. Mosak, J. L., General Equilibrium Theory in International Trade, Bloomington, Ind.,
Principia Press, 1944.
16. Nikaido, H., Introduction to Sets and Mappings in Modern Economics, tr. by K. Sato,
Amsterdam, North-Holland, 1970 (the Japanese original, Tokyo, 1960).
17. Nikaido, H., Convex Structures and Economic Theory, New York, Academic Press, 1968.
18. Price, G. B., "Bounds for Determinants with Dominant Principal Diagonals,"
Proceedings of the American Mathematical Society, 2, 1951.
19. Solow, R. M., "On the Structure of Linear Models," Econometrica, 20, January 1952.
20. Takayama, A., "Stability in the Balance of Payments: A Multi-Country Approach,"
Journal of Economic Behavior, 1, October, 1961 (the paper presented at the Washing-
ton meeting of the Econometric Society, 1959, resume, Econometrica, 28, July 1960).
Section D
SOME APPLICATIONS
a. SUMMARY OF RESULTS
We begin this section by summarizing some of the results obtained in the
previous section. In order to make it easy to refer to these results, we will present
them as theorems.
Theorem 4.D.1: Let $B = [b_{ij}]$ be an n × n matrix with $b_{ij} \le 0$ for i ≠ j. Then the
following five conditions are mutually equivalent.
(I) There exists an x ≥ 0 such that B·x > 0 (that is, for some c > 0, there exists
an x ≥ 0 such that B·x = c).1
(II) For any c ≥ 0, there exists an x ≥ 0 such that B·x = c.
(III) The matrix B is nonsingular and $B^{-1} \ge 0$.
(IV) [or (H-S)] All the successive principal minors of B are positive. In other words,
Theorem 4.D.4: Let B be an n × n matrix such that $b_{ij} \le 0$ for i ≠ j. Then the
following four conditions are mutually equivalent.
(I) There exists an x ≥ 0 such that B·x > 0.
(Ī) There exists a p ≥ 0 such that B'·p > 0 where B' is the transpose of B.
(I*) There exists an x > 0 such that B·x > 0.
(Ī*) There exists a p > 0 such that B'·p > 0.
PROOF: From Theorem 4.D.1, condition (I) holds if and only if $B^{-1} \ge 0$
[condition (III)]. But by a well-known relation in elementary linear algebra,
$(B^{-1})' = (B')^{-1}$. Thus $B^{-1} \ge 0$ is true if and only if $(B')^{-1} \ge 0$. Then
applying Theorem 4.D.1 again, this is true if and only if there exists a p ≥ 0
such that B'·p > 0. Hence (I) and (Ī) are equivalent.
To show the equivalence of (I) and (I*), we recall that if there exists an
x ≥ 0 such that B·x > 0, x must be strictly positive since $b_{ij} \le 0$ for i ≠ j.
Conversely, if there exists an x > 0 such that B·x > 0, then condition (I) clearly
follows. Similarly, (Ī) holds if and only if there exists a p > 0 such that
B'·p > 0. Combining this with the first part of the theorem, we obtain the
second part of the theorem. (Q.E.D.)
REMARK: In the first part of the proof of Theorem 4.D.4, we use the
relation (I) ⇔ (III). By noting that the eigenvalues of a matrix are the same
as the eigenvalues of its transpose, we can prove the same statement using
the relation (I) ⇔ (V).
REMARK: The equivalence (I*) ⇔ (Ī*) in Theorem 4.D.4 implies that
B' has a dominant diagonal if and only if B has a dominant diagonal.4
REMARK: We also note that Theorem 4.D.4 can be proved directly from
Theorem 4.C.3. To do this, simply note that B·x > 0 for some x > 0 means
that B' has a dominant diagonal. Hence, from Theorem 4.C.3, B'·p = π > 0
has a solution p ≥ 0. The converse holds similarly.
b. INPUT-OUTPUT ANALYSIS
Let A be an input-output matrix so that $a_{ij}$ denotes the amount of the ith good
necessary to produce one unit of the jth good. Obviously A ≥ 0. Let c and x be the
final demand vector and the output vector, respectively. The basic relation of
(static) input-output analysis is written as
$$[I - A] \cdot x = c$$
Let A be computed for a particular year and let x and c be obtained for the
year as well. Clearly c > 0 and x > 0, and all the off-diagonal elements of [I - A]
are nonpositive. Hence condition (I) of Theorem 4.D.1 is satisfied for [I - A].
Suppose that this technology matrix A is expected to be fairly constant for some
years. Then by predicting the final demand vector $c^f$ for some future year, we
can easily compute the output vector for that particular year as $x^f = [I - A]^{-1} \cdot c^f$.
In order to apply the above theorems, we consider the following two questions
mentioned before.
(i) For any $c^f \ge 0$, does there exist an $x^f \ge 0$ such that $[I - A] \cdot x^f = c^f$?
(ii) Is [I - A] nonsingular? If so, is $[I - A]^{-1} \ge 0$?
Theorem 4.D.1, respectively. By Theorem 4.D.2, we must also have $1 > \lambda_A$, where
$\lambda_A$ is the Frobenius root of A. Conversely, if $1 > \lambda_A$, we can answer questions (i)
and (ii) in the affirmative. Thus $1 > \lambda_A$ offers a characterization of this problem in
terms of the matrix A.
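The computation of a future output vector from a predicted final demand can be sketched as follows. The solver is a plain Gauss-Jordan elimination written out so that the block is self-contained; it assumes a nonsingular matrix, and the table A and the demand vector are hypothetical numbers chosen so that the Frobenius root (0.5) is below unity.

```python
def solve(B, c):
    """Solve B x = c by Gauss-Jordan elimination with partial pivoting.

    Assumes B is nonsingular (no check is made for a zero pivot).
    """
    n = len(B)
    M = [list(map(float, B[i])) + [float(c[i])] for i in range(n)]
    for col in range(n):
        p = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[p] = M[p], M[col]
        piv = M[col][col]
        M[col] = [v / piv for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


A = [[0.2, 0.1],
     [0.6, 0.3]]                      # hypothetical input-output table
n = len(A)
leontief = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)] for i in range(n)]

c_f = [10.0, 5.0]                     # hypothetical predicted final demand
x_f = solve(leontief, c_f)
print(x_f)                            # close to [15.0, 20.0]

# The columns of the inverse are the solutions for the unit demand vectors.
inverse_columns = [solve(leontief, [1.0 if i == k else 0.0 for i in range(n)])
                   for k in range(n)]
print(inverse_columns)
```

Solving for each unit vector in turn gives the columns of $[I - A]^{-1}$, all of which are nonnegative here, in line with question (ii).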
Suppose that some elements of the final demand vector c (for the year in
which A is estimated) are zero. In other words, c ≥ 0 (instead of c > 0). This is possible
if certain goods are used only as intermediate goods. Can we again answer
questions (i) and (ii) above in the affirmative? The questions can be answered "yes"
by referring to Theorem 4.D.2. In other words, if A is indecomposable, we can say
that, for any $c^f \ge 0$, there exists an $x^f > 0$ such that $[I - A] \cdot x^f = c^f$ and that
[I - A] is nonsingular with $[I - A]^{-1} > 0$.
Suppose that there exists an x > 0 such that [I - A]·x = c for some c > 0. Then
from Theorem 4.D.4, this is true if and only if there exists a p > 0 such that
[I - A]'·p > 0. In order to understand the economic significance of this statement,
let us suppose that this productive system is realized in a competitive equilibrium.
Then we may suppose that a set of prices $p_i > 0$, i = 1, 2, ..., n, will be
established for the n goods. Then $\sum_{i=1}^n p_i a_{ij}$ constitutes the payment by the jth
industry for the goods used to produce one unit of the jth good (that is, "raw
material cost"). Since each industry presumably uses some primary factors (such
as labor) for the production, we must have
$$p_j - \sum_{i=1}^n p_i a_{ij} > 0 \quad (j = 1, 2, \ldots, n)$$
or
[I - A]'·p > 0 if x > 0, where [I - A]' is the transpose of [I - A]
we are also considering the intermediate step (that is, π > 0) to this "Walrasian
long run".] Then write $\pi \equiv \pi_1 = \pi_2 = \cdots = \pi_n$. Let l denote the n-vector whose
jth component is $l_j$. Then from the definition of $\pi_j$ and the condition on the $\pi_j$'s, we
obtain
$$[I - A]' \cdot p - wl = \pi [A' \cdot p + wl]$$
or
$$p = (1 + \pi)[A' \cdot p + wl]$$
Assume π > 0; then 1 + π > 0. Hence [ρI - A']·p = wl where ρ ≡ 1/(1 + π).
From Theorem 4.D.2, a necessary and sufficient condition that there exists a p ≥ 0
such that this relation holds for any wl ≥ 0 is simply $\rho > \lambda_A$ where $\lambda_A$ is the Frobenius
root of A (hence also of A'). This means
$$\frac{1}{1 + \pi} > \lambda_A$$
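The bound on the uniform profit rate can be illustrated with two goods. The sketch below solves [ρI - A']·p = wl by Cramer's rule for a hypothetical table whose Frobenius root is 0.5, so the bound reads π < 1; the function name, the labor coefficients, and the wage are all assumptions made for the example.

```python
def prices(A, labor, wage, profit_rate):
    """Solve [rho*I - A'] p = wage * labor for a 2 x 2 matrix A by Cramer's rule,
    where rho = 1 / (1 + profit_rate) and A' is the transpose of A."""
    rho = 1.0 / (1.0 + profit_rate)
    m11, m12 = rho - A[0][0], -A[1][0]      # first row of rho*I - A'
    m21, m22 = -A[0][1], rho - A[1][1]      # second row of rho*I - A'
    det = m11 * m22 - m12 * m21
    b1, b2 = wage * labor[0], wage * labor[1]
    return [(b1 * m22 - m12 * b2) / det, (m11 * b2 - m21 * b1) / det]


A = [[0.2, 0.1],
     [0.6, 0.3]]          # hypothetical table; Frobenius root 0.5
labor = [1.0, 1.0]        # hypothetical labor input coefficients
wage = 1.0

p_low = prices(A, labor, wage, 0.25)   # 1/(1 + 0.25) = 0.8 exceeds 0.5
p_high = prices(A, labor, wage, 1.5)   # 1/(1 + 1.5) = 0.4 is below 0.5
print(p_low)     # both prices positive
print(p_high)    # both prices negative: no meaningful price vector
```

A profit rate of 25 percent is compatible with positive prices, while a rate of 150 percent violates the bound and produces a negative solution.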
where $A = [a_{ij}]$ is an (n × n) matrix and x(t) is the output vector in period t. This
system may be justified by the assumption that the demand for the ith good by the
jth industry in period t [that is, $x_{ij}(t)$] is proportional to this industry's sales
(= output) in period t - 1 [that is, $x_j(t - 1)$]. Here $a_{ij}$ is simply defined by $a_{ij} \equiv
x_{ij}(t)/x_j(t - 1)$. It should be noted that the meaning of $a_{ij}$ here is slightly different
from that of the input-output (production) coefficient in the ordinary sense as discussed
above. Here A denotes the expenditure relations in this model.
Consider the stationary state in which x(t) = x(t - 1) = x* for all t. Then we
have x* = A·x* + c, or [I - A]·x* = c. Two questions immediately arise.
The first question is the problem of stability and the second question is the problem
of the existence of a nonnegative solution. From Theorem 4.D.1, we know
immediately that [I - A] is nonsingular and $[I - A]^{-1} \ge 0$ if and only if $1 > \lambda_A$
where $\lambda_A$ is the Frobenius root of A.
In order to understand the stability question, let us carry out the following
successive substitutions:
$$\begin{aligned} x(1) &= A \cdot x(0) + c \\ x(2) &= A \cdot x(1) + c = A^2 \cdot x(0) + (I + A) \cdot c \\ x(3) &= A \cdot x(2) + c = A^3 \cdot x(0) + (I + A + A^2) \cdot c \\ &\;\;\vdots \\ x(t) &= A \cdot x(t-1) + c = A^t \cdot x(0) + \sum_{k=0}^{t-1} A^k \cdot c \end{aligned}$$
Then from the remark following Theorem 4.D.2, we can say that $A^t \to 0$ as t → ∞
if and only if $1 > \lambda_A$, and that $\sum_{k=0}^{\infty} A^k \cdot c$ is convergent and equal to $[I - A]^{-1} \cdot c$
(= x*). Therefore $x(t) \to [I - A]^{-1} \cdot c$ as t → ∞ if and only if $1 > \lambda_A$. Hence we
see that the stability question and the problem of the existence of a nonnegative
solution are really equivalent.
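The successive substitutions can be carried out numerically to watch the convergence. The sketch below iterates x(t) = A·x(t - 1) + c for two hypothetical matrices, one with Frobenius root 0.5 and one with Frobenius root above unity; the function name `iterate` and all numbers are illustrative assumptions.

```python
def iterate(A, c, x0, steps):
    """Apply x(t) = A x(t-1) + c for the given number of periods."""
    n = len(A)
    x = list(x0)
    for _ in range(steps):
        x = [sum(A[i][j] * x[j] for j in range(n)) + c[i] for i in range(n)]
    return x


c = [10.0, 5.0]

A_stable = [[0.2, 0.1],
            [0.6, 0.3]]        # Frobenius root 0.5
A_unstable = [[0.5, 0.6],
              [0.9, 0.4]]      # Frobenius root about 1.19

print(iterate(A_stable, c, [0.0, 0.0], 200))      # approaches [15, 20]
print(iterate(A_stable, c, [100.0, 0.0], 200))    # same limit from another start
print(iterate(A_unstable, c, [0.0, 0.0], 200))    # grows without bound
```

In the stable case the limit [15, 20] is the solution of [I - A]·x* = c and does not depend on x(0); in the unstable case the partial sums of the series keep growing.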
$$Y_i(t) = \sum_{j=1}^n U_{ij}(t) + \sum_{j=1}^n V_{ij}(t), \quad i = 1, 2, \ldots, n$$
Write $\alpha_{ij} + \beta_{ij} \equiv a_{ij}$ and $\sum_{j=1}^n (u_{ij} + v_{ij}) = c_i$. Then the above relations can be combined
and written as
$$Y_i(t) = \sum_{j=1}^n a_{ij} Y_j(t - 1) + c_i, \quad i = 1, 2, \ldots, n$$
or in matrix-vector notation,
$$Y(t) = A \cdot Y(t - 1) + c, \quad \text{where } A = [a_{ij}]$$
This is exactly the same equation as that for the Leontief expenditure lag type
model discussed above. As we observed there, the following three conditions are
all equivalent.
(i) $Y(t) \to [I - A]^{-1} \cdot c$ as t → ∞.
(ii) [I - A] is nonsingular and $[I - A]^{-1} \ge 0$.
(iii) $1 > \lambda_A$ where $\lambda_A$ is the Frobenius root of A.
Here a;j is the input-output coefficient in the ordinary sense. We now ask whether
there exists a balanced growth solution to the above system, starting from an
arbitrary c(0), where balanced growth means, for t = 1, 2, ...,
p] rt ct where
When t = 0, [ρI - A]·x(0) = c(0). And $x(t) = \alpha^t x(0)$ and $c(t) = \alpha^t c(0)$ along
this balanced growth path. A necessary and sufficient condition for the existence
of a solution x(0) ≥ 0 for any c(0) ≥ 0 is given by Theorem 4.D.2 as $\rho > \lambda_A$ (or
$1/\alpha > \lambda_A$), where $\lambda_A$ is the Frobenius root of A [(II') ⇔ (VI')]. Hence $\rho > \lambda_A$
gives a necessary condition for the existence of a balanced growth solution starting
from an arbitrary initial c(0) ≥ 0. Conversely, if $\rho > \lambda_A$, we can also obtain the
above balanced growth solution. Hence a necessary and sufficient condition for
the existence of a balanced growth path with growth factor α is $1/\alpha > \lambda_A$. Note that
if c(0) = 0 and $1/\alpha > \lambda_A$, then c(t) = 0 and x(t) = 0 for all t. In order to achieve
x(t) > 0 for all t, we need the indecomposability of A. Suppose that for some
c(0) ≥ 0 there exists an x(0) ≥ 0 such that [ρI - A]·x(0) = c(0). Then condition
(VII') of Theorem 4.D.2 is satisfied, provided that A is indecomposable. Then from
condition (VIII'), [ρI - A] is nonsingular and $[\rho I - A]^{-1} > 0$. Hence x(0) > 0
with c(0) ≥ 0, so that $x(t) = \alpha^t x(0) > 0$ for all t. In these considerations it is
important to realize that the initial output vector cannot be arbitrary; it must be
equal to $[\rho I - A]^{-1} \cdot c(0)$.
Now suppose that the households form an industry with labor as its output.
Labor is now considered a good rather than a primary factor, and thus c(t) = 0
for all t. Then balanced growth with a positive growth factor in the above dynamic
system is possible if and only if there exists a ρ > 0 such that [ρI - A]·x(0) = 0.
If A is indecomposable, then we know that there exists a unique $\lambda_A > 0$ and an
x(0) > 0, such that $A \cdot x(0) = \lambda_A x(0)$. Then set $\lambda_A = \rho$. In other words, if A is
indecomposable, there exists a unique balanced growth path whose growth factor
is equal to $1/\lambda_A$ where $\lambda_A$ is the Frobenius root of A. Note that we cannot choose
the initial x(0) arbitrarily.
The basic weakness in the above dynamic analysis is that there is no con-
sideration of the stock of goods. A proper treatment of this will give rise to the so-
called "dynamic Leontief model," which we discuss in Chapter 6. There we show
that there exists a balanced growth path, corresponding to the Frobenius root,
and we will discuss the conditions for the "convergence" to this path starting
from an arbitrary given initial stock of goods.
The question is whether or not $p_j(t) \to \hat p_j$ as t → ∞ for all j. Since the above
system is linear, a global solution p(t) always exists for all t ≥ 0. Assume the
solution p(t) remains positive for all t ≥ 0. But owing to the linear approximation
procedure, the stability of the above system does not establish global stability,
although it does establish local stability. Write
$$q_i(t) = p_i(t) - \hat p_i, \quad i = 0, 1, \ldots, n$$
Then the above system can be rewritten as
$$\frac{dq_i(t)}{dt} = \sum_{j=0}^n a_{ij} q_j(t), \quad i = 0, 1, \ldots, n$$
We impose the following two assumptions:
$$f_i(\hat p) = 0, \quad i = 1, 2, \ldots, n$$
and the linear system of dynamic adjustment is written as
$$\frac{dp_i(t)}{dt} = \sum_{j=1}^n a_{ij}[p_j(t) - \hat p_j], \quad i = 1, 2, \ldots, n$$
or
$$\frac{dq_i(t)}{dt} = \sum_{j=1}^n a_{ij} q_j(t), \quad i = 1, 2, \ldots, n$$
or
$$\frac{dq(t)}{dt} = A \cdot q(t)$$
where $q(t) = [q_1(t), \ldots, q_n(t)]$ and $A = [a_{ij}]$, i, j = 1, 2, ..., n. Our problem
is to ascertain whether q(t) → 0 as t → ∞ in the above system of differential equations.
Needless to say, $p(t) \to \hat p$ if and only if q(t) → 0. From the elementary
theory of differential equations (see Chapter 3, Section B), it is known that
q(t) → 0 as t → ∞ if and only if the real parts of all the eigenvalues of A are negative.
Hence the question is reduced to finding the condition that will guarantee that
the real parts of all the eigenvalues of A are negative. If A is symmetric, then
negative definiteness gives this condition; this result is given by Samuelson (see
SOME APPLICATIONS 401
Theorem 4.D.5: Under (A-3), the normalized system $\dot q(t) = A\,q(t)$ is stable, that
is, $q(t) \to 0$ as $t \to \infty$, provided that either one of the following conditions holds.⁶
PROOF:
(i) First use (A-1) (that is, Walras' Law) and $a_{0j} > 0$ for $j \ne 0$. From (A-1)
we have $\sum_{i=1}^{n} \hat p_i a_{ij} = -a_{0j} < 0$ for all j. Hence condition (I") of Theorem
4.D.3 is satisfied for A' where A' is the transpose of A, which implies (I")
also holds for A from Theorem 4.D.4. Hence condition (V") of Theorem
4.D.3 holds, which in turn implies the stability.
(ii) Now use (A-2) (that is, homogeneity) and $a_{i0} > 0$, for $i \ne 0$. The homogeneity
implies $\sum_{j=0}^{n} a_{ij}\hat p_j = 0$ (Euler's equation). Or $\sum_{j=1}^{n} a_{ij}\hat p_j = -a_{i0}\hat p_0 < 0$, for all i. Hence condition (I") of Theorem 4.D.3 is satisfied
for A, which implies that condition (V") also holds for A. This again
establishes stability. (Q. E. D.)
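The logic of the proof can be checked on a small example. The sketch below assumes a hypothetical 2 x 2 matrix A with positive off-diagonal elements and a hypothetical price vector p_hat = (1, 1); it verifies the column condition used in step (i), computes the eigenvalues, and integrates the system by a simple Euler scheme. None of these numbers come from the text.

```python
import cmath

# Hypothetical gross-substitute matrix (off-diagonal elements positive).
A = [[-2.0, 1.0],
     [1.0, -3.0]]
p_hat = [1.0, 1.0]   # assumed positive weights (equilibrium prices)

# Column condition of step (i): sum_i p_hat_i * a_ij < 0 for every j.
col_sums = [sum(p_hat[i] * A[i][j] for i in range(2)) for j in range(2)]

def eig2(M):
    """Eigenvalues of a 2 x 2 matrix from its characteristic polynomial."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + disc) / 2.0, (tr - disc) / 2.0

eigs = eig2(A)

def simulate(M, q0, dt=0.01, steps=1000):
    """Explicit Euler integration of dq/dt = M q up to time dt * steps."""
    q = list(q0)
    for _ in range(steps):
        dq = [sum(M[i][j] * q[j] for j in range(2)) for i in range(2)]
        q = [q[i] + dt * dq[i] for i in range(2)]
    return q

q_T = simulate(A, [1.0, -0.5])
print(col_sums, eigs, q_T)
```

In this example both eigenvalues have negative real parts and the simulated q(t) shrinks toward zero, as the theorem asserts for matrices of this type.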
REMARK: The above proofs are not exactly the same as those of Negishi
and Hahn. This is understandable, for, at the time of their proofs, the struc-
ture of Theorem 4.D.3 was not well recognized. However, our proofs are
essentially based on their ideas. Note also that we only required $a_{ij} \ge 0$
$(i \ne j)$⁷ (called the weak gross substitute case) instead of $a_{ij} > 0$ $(i \ne j)$.
REMARK: As remarked above, Metzler [9] proved that under gross sub-
stitutability, the condition for dynamic stability (that is, the eigenvalues
with negative real parts) is equivalent to the Hicksian condition (that is, the
alternating sign of the successive principal minors). In other words, Metzler
proved part of Theorem 4.D.3 by establishing the equivalence between
402 FROBENIUS THEOREMS, DOMINANT DIAGONAL MATRICES, AND APPLICATIONS
conditions (V") and (IV"). The Hahn-Negishi theorem indicated the con-
nection between conditions (V") and (I") of Theorem 4.D.3.
We now prove the global stability of the nonnormalized differential equation
system, $\dot p = f(p)$. For the other systems (normalized system or difference equation
system), the reader can attempt a similar proof. The essential idea is to use
Liapunov's second method. (For Liapunov's second method for a difference
equation system, see Kalman and Bertram [5].) We will use our theorems to show
$\dot V < 0$ (or $\Delta V < 0$ for the difference equation system), where V is Liapunov's function
(as described in Chapter 3, Section H).
Theorem 4.D.6: Let $p(t; p^0)$ be the solution price vector for the system $\dot p(t) = f[p(t)]$, with the initial condition $p(0) = p^0$. Assume Walras' Law (A-1), homogeneity
(A-2), and
(A-3') $f_{ij}(p)\ (\equiv \partial f_i(p)/\partial p_j) > 0$ for all $i \ne j$ (i, j = 0, 1, ..., n) and for all p.
Then $p(t; p^0) \to \hat p$ as $t \to \infty$ where $f(\hat p) = 0$, regardless of the initial point $p^0$.
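$$\frac{dV[p(t)]}{dt} = \sum_{i \in J} f_i \dot f_i = \sum_{i \in J} \sum_{j=0}^{n} f_i f_{ij} \dot p_j = \sum_{i \in J} \sum_{j=0}^{n} f_i f_{ij} f_j = \sum_{i \in J} \sum_{j \in J} f_i f_{ij} f_j + \sum_{i \in J} \sum_{j \notin J} f_i f_{ij} f_j$$
where $f_{ij} \equiv f_{ij}(p) = \partial f_i(p)/\partial p_j$ (that is, evaluated at p rather than $\hat p$). From
(A-3') ($f_{ij} > 0$, $i \ne j$) and $f_j \le 0$ for $j \notin J$, we have $\sum_{i \in J} \sum_{j \notin J} f_i f_{ij} f_j \le 0$. We
now show that $\sum_{i \in J} \sum_{j \in J} f_i f_{ij} f_j < 0$. From the definition of J, $f_i > 0$ for $i \in J$.
By Walras' Law, $\sum_{i=0}^{n} p_i f_{ij} = -f_j$, j = 0, 1, ..., n. Hence $\sum_{i \in J} p_i f_{ij} \le -f_j < 0$,
for $j \in J$. Therefore $F_J'\,p_J < 0$, where $F_J'$ is the transpose of $F_J = [f_{ij}]$, $i, j \in J$;
$p_J$ is the vector $[p_j]$ where j is taken from J. Also by homogeneity,
$\sum_{j \in J} f_{ij} p_j + \sum_{j \notin J} f_{ij} p_j = 0$ for all i, so that $\sum_{j \in J} f_{ij} p_j < 0$ for $i \in J$, or $F_J\,p_J \le 0$.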
REMARK: The proof for the normalized system is similar. Or, alternatively,
use $\hat p \cdot f(p) > 0$ (Lemma 3.E.3); that is, define $V = \sum_i (p_i - \hat p_i)^2$ so that $\dot V = 2(p - \hat p) \cdot \dot p = 2(p - \hat p) \cdot f(p) < 0$ for all $p \ne \hat p$ by Lemma 3.E.3 and (A-1).
Here we normalize p by Ep;(t) = 1 for all t.
9. COMPARATIVE STATICS
Suppose that a certain economic system is described by
$f_i(x_1, x_2, \ldots, x_n) = 0$, $i = 1, 2, \ldots, n$
or simply
$f(x) = 0$
Such a system can describe the set of certain equilibrium relations or the set of
certain optimization conditions. An example of the former interpretation is that
$f_i$ and $x_i$ are respectively taken as the excess demand for and the price of the ith
commodity. The value of x (say, $\hat x$) which satisfies the above relations $f(x) = 0$ is
called the equilibrium value of x. Clearly the equilibrium value of x is not necessarily
unique.
In order to consider a shift of the above system, rewrite the system as
$f_i(x_1, \ldots, x_n; \alpha, \beta, \ldots) = 0$, $i = 1, 2, \ldots, n$
where $\alpha, \beta, \ldots$ are shift parameters. Writing the equilibrium value of x as a function of these parameters, we have
$f[x(\alpha, \beta, \ldots); \alpha, \beta, \ldots] = 0$, for all $(\alpha, \beta, \ldots)$ in the region S
Obviously the $f_i$'s are not so "well-behaved" in general, so that an explicit functional
relation such as $x = x(\alpha, \beta, \ldots)$ may not be obtained, either locally or
globally. A set of mathematical conditions guaranteeing the "local" possibility is
known as the implicit function theorem.¹¹ In this case S is some neighborhood of
$(\hat\alpha, \hat\beta, \ldots)$, which can be very small.
Comparative statics is concerned with the effect of a change in one or more of
the shift parameters $\alpha, \beta, \ldots$ on the equilibrium value of x. The comparative statics
analysis can be local or global, depending on whether the region S is large [for
example, so that it covers all possible values of $(\alpha, \beta, \ldots)$] or S is confined to a
certain (small) neighborhood of $(\hat\alpha, \hat\beta, \ldots)$. The local analysis in connection with
the classical optimization theory was discussed in the Appendix to Section F,
Chapter 1.¹²
Partially differentiating the system $f[x(\alpha, \beta, \ldots); \alpha, \beta, \ldots] = 0$ with respect
to one of the shift parameters, say $\alpha$ (while keeping the other parameters,
$\beta, \ldots$, constant), we obtain
$$\sum_{j=1}^{n} f_{ij}\,\frac{\partial x_j}{\partial \alpha} + b_i = 0, \quad i = 1, 2, \ldots, n$$
where $f_{ij} \equiv \partial f_i/\partial x_j$ and $b_i \equiv \partial f_i/\partial \alpha$ with $x = x(\alpha, \beta, \ldots)$. In matrix form, we may
rewrite this as
$F \cdot x_\alpha + b = 0$
where $F = [f_{ij}]$, $b = [b_i]$, and $x_\alpha$ is the n-vector whose jth element is $\partial x_j/\partial \alpha$.
Assuming that F is nonsingular, we can rewrite the above equation as
$x_\alpha = -F^{-1} b$¹³
$$\sum_{j=0}^{n} f_{ij}\,x_j = 0, \quad i = 0, 1, 2, \ldots, n, \text{ for all } (x; \alpha, \beta, \ldots)$$
Assume now that $f_{ij} > 0$ for all $i \ne j$ (i, j = 0, 1, ..., n) and for all x (gross substitutability).
Then we obtain
$$\sum_{j=1}^{n} f_{ij}\,x_j < 0, \quad \text{for all } i \text{ and for all } (x; \alpha, \beta, \ldots)$$
Therefore the n x n matrix $F = [f_{ij}]$ satisfies condition (I") of Theorem 4.D.3.
We are now ready to consider the so-called Hicksian laws of comparative statics.
As an illustration, we show that a shift in demand from the numeraire to commodity
k raises the price of k and the prices of all other commodities. For this purpose,
set $b_k \equiv \partial f_k/\partial \alpha = 1$ and $b_i \equiv \partial f_i/\partial \alpha = 0$ for all $i \ne k$ (i, k = 1, 2, ..., n). Since
F is indecomposable and condition (I") of Theorem 4.D.3 holds, condition (VIII")
also holds so that F is nonsingular and $F^{-1} < 0$. Hence $x_\alpha = -F^{-1} b$ implies
$x_\alpha > 0$; that is, the price of all the commodities (except that of the numeraire)
must rise.¹⁴
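The Hicksian law just stated can be illustrated with a small computation. The sketch below assumes a hypothetical 3 x 3 matrix F with positive off-diagonal elements and negative row sums (so that the row condition holds with x = (1, 1, 1)), inverts it by Gauss-Jordan elimination, and evaluates x_alpha = -F^{-1} b for a demand shift toward the first commodity. The matrix and the helper name `inverse` are illustrative assumptions.

```python
def inverse(M):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(M)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Hypothetical gross-substitute Jacobian: f_ij > 0 for i != j, negative row sums.
F = [[-3.0, 1.0, 1.0],
     [1.0, -4.0, 1.0],
     [2.0, 1.0, -5.0]]
b = [1.0, 0.0, 0.0]          # demand shifts toward commodity k = 1

F_inv = inverse(F)
x_alpha = [-sum(F_inv[i][j] * b[j] for j in range(3)) for i in range(3)]
# Check that F x_alpha + b = 0 holds for the computed solution.
check = [sum(F[i][j] * x_alpha[j] for j in range(3)) + b[i] for i in range(3)]
print(F_inv, x_alpha)
```

Every element of the computed inverse is negative, so every price derivative in x_alpha is positive, in line with the conclusion drawn in the text.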
We now turn to another illustration, the theory of the firm. Consider a
firm that produces a single product y using n inputs, $x_1, x_2, \ldots, x_n$, with the
production function $\Phi(x)$. Let p be the price of the product and w the factor price
vector, which are all taken as positive constants given to the firm. Assume that the
firm maximizes its profit $py - w \cdot x$ subject to $\Phi(x) \ge y$ and $x \ge 0$. Assume
further that
Condition (iii) implies that $\Phi$ is a strictly concave function. Under these assumptions,
the following set of conditions together with $\Phi(\hat x) = \hat y$ gives a necessary
and sufficient characterization of a unique global maximum:
$$f_i\!\left(x, \frac{w}{p}\right) \equiv \Phi_i(x) - \frac{w_i}{p} = 0, \quad i = 1, 2, \ldots, n$$
where we assume that the optimal values, $\hat x$ and $\hat y$, are strictly positive.¹⁵
First consider the effect of the minimum wage regulation (MWR) on employment.
Suppose that a local government imposes a minimum wage rate $\bar w$ which
is above $w_n$. Assume that this does not disturb the market in such a way as to
change p and $w_1, \ldots, w_{n-1}$. Thus $p, w_1, w_2, \ldots, w_{n-1}$ are taken as (positive)
constants to the firm. This is again a problem of comparative statics.¹⁶ Assume
that there exists a continuously differentiable function $x = x(w/p)$ such that
$f_i[x(w/p), w/p] = 0$, i = 1, 2, ..., n, for all w/p in some region S of w/p. Then the
partial differentiation of $f_i$ with respect to $w_n$ yields
$$\sum_{j=1}^{n} f_{ij}\,\frac{\partial x_j}{\partial w_n} + b_i = 0, \quad i = 1, 2, \ldots, n, \text{ for all } w/p \text{ in } S$$
where we now define the $f_{ij}$'s and the $b_i$'s by $f_{ij} \equiv \partial f_i/\partial x_j = \partial^2 \Phi/\partial x_i \partial x_j \equiv \Phi_{ij}$ and
In other words, a local MWR always decreases the employment of labor and all
other factors, provided that the above assumptions are satisfied.¹⁸
In order to explore the above line of thinking further, let us suppose that
the price of the kth factor changes, where k is not restricted to n. Then the partial
differentiation of the system $\Phi_i[x(w/p)] - w_i/p = 0$ (where $\Phi_i \equiv \partial \Phi/\partial x_i$), i = 1,
2, ..., n, with respect to $w_k$, again yields
$$\sum_{j=1}^{n} f_{ij}\,\frac{\partial x_j}{\partial w_k} + b_i = 0, \quad i, k = 1, 2, \ldots, n, \text{ for all } w/p \text{ in } S$$
where $b_i = 0$ if $i \ne k$, $b_k = -1/p$, and $f_{ij} \equiv \Phi_{ij}$. It is often assumed in general
equilibrium theory that commodities are gross substitutes. Our question now is
whether we can extend the list of these commodities to factors. In particular, we
want to examine whether $\partial x_j/\partial w_k > 0$ for all $j \ne k$, that is, whether an increase
in the kth factor price will increase the demand for the jth factor when $j \ne k$.
Consider again the "normal" case in which $f_{ij} > 0$, for all $i \ne j$, and for all x. Then
we can apply Theorem 4.D.3 again. Observing that the negative definiteness of
the Hessian matrix of $\Phi$ implies condition (IV") of Theorem 4.D.3 and that $F = [f_{ij}]$
is indecomposable, we can conclude that condition (VIII") is satisfied. In other
words, F is nonsingular and $F^{-1} < 0$ for all x. Hence we conclude¹⁹
$$\frac{\partial x_j}{\partial w_k} < 0 \quad \text{for all } j \ne k,\ j, k = 1, 2, \ldots, n, \text{ for all } \frac{w}{p} \text{ in } S, \text{ where } x = x\!\left(\frac{w}{p}\right)$$
In other words, "normally" factors are not gross substitutes but rather "gross
complements." This is the conclusion obtained by Rader [19].²⁰ The economic
interpretation of this result can be found in [19], p. 40.
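A concrete instance may help fix ideas. The sketch below assumes a hypothetical two-factor production function Phi(x) = x1^a x2^b with a + b < 1 (a choice made here for illustration, not taken from the text), evaluates its Hessian F at the point x = (1, 1), and computes the matrix of factor-demand derivatives from the formula x_w = -F^{-1} b with b_k = -1/p.

```python
# Hypothetical strictly concave production function Phi(x) = x1**a * x2**b.
a, b = 0.3, 0.4
p = 1.0                      # assumed product price
x1, x2 = 1.0, 1.0            # assumed evaluation point

def hessian(x1, x2):
    """Second partial derivatives Phi_ij of x1**a * x2**b."""
    phi = x1 ** a * x2 ** b
    cross = a * b * phi / (x1 * x2)
    return [[a * (a - 1.0) * phi / x1 ** 2, cross],
            [cross, b * (b - 1.0) * phi / x2 ** 2]]

F = hessian(x1, x2)
det = F[0][0] * F[1][1] - F[0][1] * F[1][0]
F_inv = [[F[1][1] / det, -F[0][1] / det],
         [-F[1][0] / det, F[0][0] / det]]

# dx_j/dw_k = -(F^{-1} b)_j with b_k = -1/p and b_i = 0 otherwise,
# which reduces to F_inv[j][k] / p.
dx_dw = [[F_inv[j][k] / p for k in range(2)] for j in range(2)]
print(F, dx_dw)
```

Here the cross partial Phi_12 is positive (the "normal" case), the Hessian is negative definite, and both the own and the cross derivatives of factor demand are negative, which matches the "gross complements" conclusion.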
FOOTNOTES
1. Since $b_{ij} \le 0$ for all $i \ne j$, $B \cdot x > 0$ with $x \ge 0$ implies $x > 0$. This was also noted
in the proof [step (i)] of Theorem 4.C.4. Hence condition (I) says that B' (that is,
the transpose of B) has d.d.
2. This means that like (I'), h has d.d.
3. The same argument is used in the proof [step (i)] of Theorem 4.C.4.
4. We can also conclude that an arbitrary matrix-say, A (that is, the one whose
off-diagonal elements do not necessarily have a definite sign)-has a dominant
diagonal if and only if its transpose A' has a dominant diagonal. To see this, recall
that (by the definition of a dominant diagonal) $A = [a_{ij}]$ has d.d. if and only if
the matrix $M = [m_{ij}]$, where $m_{ij} = -|a_{ij}|$, $i \ne j$, $m_{ii} = |a_{ii}|$, has d.d. Note that M is
a B matrix in Theorem 4.D.4.
5. It is, however, possible to interpret A as the technology matrix, if we assume that
sales expectations are made on the basis of simple extrapolation of all industries;
that is, the sales of the last period x(t - 1) are expected to continue as sales of this
period so that A x(t - 1) + c signifies the expected demands for the goods in period
t. The advantage of this interpretation is that we can interpret A as the technology
matrix.
6. In proving local stability, it suffices to assume that (A-1) or (A-2) holds at the
equilibrium p. The theorem is often referred to as the Hahn-Negishi theorem. Hahn
used (A-1) and Negishi used (A-2).
7. Strictly speaking, we also required that $a_{0j} > 0$ or $a_{i0} > 0$, i, j = 1, 2, ..., n, where
the 0th commodity is the numeraire. This also implies that the choice of the
numeraire is important.
8. Obviously, f = 0 implies V(f) = 0.
9. It is well known and easy to show that all the eigenvalues of any symmetric matrix
are real.
10. It is well known in matrix theory that a symmetric matrix is negative definite if and
only if its eigenvalues are all negative (for the proof, see any textbook on matrix
theory).
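11. As remarked earlier in Chapter 1, the (local) implicit function theorem roughly
states the following. Let $f_i(\hat x; \hat\alpha, \hat\beta, \ldots) = 0$ for some $\hat x$ and for some $(\hat\alpha, \hat\beta, \ldots)$.
Assume that $\det[a_{ij}] \ne 0$, where $a_{ij} \equiv \partial f_i/\partial x_j$, evaluated at $(\hat x, \hat\alpha, \hat\beta, \ldots)$. Then
there exist a neighborhood N of $(\hat\alpha, \hat\beta, \ldots)$ and a
function g such that $g(\hat\alpha, \hat\beta, \ldots) = \hat x$ and $f[g(\alpha, \beta, \ldots); \alpha, \beta, \ldots] = 0$ for all $(\alpha, \beta, \ldots)$
in N. See W. H. Fleming, Functions of Several Variables, Reading, Mass., Addison-
Wesley, 1965, and most textbooks on advanced calculus.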
12. The use of comparative statics, as remarked in the above, is not confined to optimization
theory; that is, the above system, $f(x, \alpha, \ldots) = 0$, is not necessarily made up of
first-order conditions.
13. See Samuelson [20], which also contains an excellent exposition of comparative
statics (especially chapters 2 and 3). Also see the Appendix to Section F, Chapter 1.
14. For the other Hicksian laws of comparative statics, see Mundell [14]. See also
Morishima [13], pp. 3-14.
15. Recall Section F of Chapter 1 (especially subsection c).
16. There seems to be a widespread misunderstanding among labor economists with
regard to the comparative statics nature of the problem. For example, in their
study of the MWR of New York City, M. Benewitz, and R. E. Weintraub wrote,
"Economic theorists assert, as a logical deduction from the diminishing returns,
that elasticity of the demand for labor is negative. This means that a rise in the wage
rate will lead to a decline in employment." See their "Employment Effects of a Local
Minimum Wage," Industrial and Labor Relations Review, 17, January 1964, p. 283.
The error seems to be in confusion between shifts of a curve and movements along
a curve; the employment of other factors as well as labor adjust to a new level of
the wage rate, which causes a shift of the marginal productivity curve of labor.
Also recall the famous controversy between Lester and Machlup in the American
Economic Review (in the 1940s) with regard to the validity of the marginal pro-
ductivity theory. See also J. M. Peterson, "Employment Effects of Minimum Wages,
1938-50," Journal of Political Economy, LXV, October 1957, as well as our footnote
18.
17. This means that an increase in the employment of the jth factor will increase the
marginal (physical) productivity of the ith factor, if $i \ne j$. This means that factors are
used in conjunction with each other rather than as substitutes. This so-called
"Wicksell's Law" is termed the normal case by Rader [19]. The case in which this
assumption is slightly weakened to $f_{ij} \ge 0$ for all $i \ne j$ can be analyzed in a method
similar to the subsequent analysis. Also note that the crucial condition for the global
invertibility of the gradient map of $\Phi$ [that is, the existence of the unique inverse for all $(w_1/p, \ldots, w_n/p)$]
in the Gale-Nikaido theorem is satisfied if the Hessian matrix of $\Phi$ is Hicksian.
See Nikaido [18], section 20, and our Chapter 2, Appendix to Section E.
18. If some of the assumptions are violated (which is quite plausible), it is possible that
a local MWR may increase the employment of labor. Such an analysis is carried out by
Takayama [22] . For example, suppose the demand function of the product is
p(y)=y-0- (a complete monopoly) and the production function is y = LK. Then we
can show that the profit-maximizing value of (L, K) is unique and that the imposition
of MWR increases the employment of labor L. If $p = a y^{-1/\eta}$ and $y = b L^{\alpha} K^{\beta}$,
where $\eta > 1$, $\alpha, \beta > 0$, and $a, b > 0$, then a necessary and sufficient condition that the
imposition of MWR increase the employment of labor is computed as $[1 - \epsilon(\alpha + \beta)] / (\epsilon\alpha - 1) > 0$
where $\epsilon = 1 - 1/\eta$. Hence the empirical findings that the imposition of
MWR does not necessarily decrease the amount of labor employment do not con-
stitute a sufficient reason to refute the marginal productivity theory.
19. If $f_{ij} \ge 0$, $i \ne j$, condition (VI") of Theorem 4.D.3 is equivalent to (III") instead of
(VIII"). In other words, F is nonsingular and $F^{-1} \le 0$, which slightly weakens the
conclusion to $\partial x_j/\partial w_k \le 0$, for all $j \ne k$.
20. A similar conclusion was obtained by M. Morishima in "A Note on a Point in Value
and Capital," Review of Economic Studies, XXI, 1953-1954 (which is cited in Rader
[ 19] ). See also D. V. T. Bear, "Inferior Inputs and the Theory of the Firm," Journal
of Political Economy, LXXIII, June 1965. Note that the above analysis of MWR
(Takayama [22]) essentially establishes the same result.
REFERENCES
1. Arrow, K. J., and Hurwicz, L., "On the Stability of the Competitive Equilibrium, I,"
Econometrica, 26, October 1958.
2. Bellman, R., Introduction to Matrix Analysis, New York, McGraw-Hill, 1960.
3. Chipman, J. S., The Theory of Inter-Sectoral Money Flows and Income Formation,
Baltimore, Md., Johns Hopkins University Press, 1951.
4. Hahn, F. H., "Gross Substitutes and the Dynamic Stability of General Equilibrium,"
Econometrica, 26, January 1958.
5. Kalman, R. E., and Bertram, J. E. "Control System Analysis and Design Via the
"Second Method" of Lyapunov, II, Discrete-Time Systems," Journal of Basic
Engineering, June 1960.
6. McKenzie, L. W., "Matrices with Dominant Diagonals and Economic Theory," in
Mathematical Methods in the Social Sciences, 1959, ed. by Arrow, Karlin, and Suppes,
Stanford, Calif., Stanford University Press, 1960.
7. McKenzie, L. W., "An Elementary Analysis of the Leontief System," Econometrica, 25, July
1957.
8. Metzler, L. A., "Underemployment Equilibrium in International Trade," Econometrica,
April 1942.
9. Metzler, L. A., "Stability of Multiple Markets: The Hicks Conditions," Econometrica, 13,
October 1945.
10. Metzler, L. A., "A Multiple Region Theory of Income and Trade," Econometrica, 18, October
1950.
11. Morishima, M., "The International Inter-relatedness of Economic Fluctuations,"
in his Inter-Industry Relations and Economic Fluctuations, Tokyo, Yuhikaku, 1955
(in Japanese).
12. Morishima, M., Introduction to the Inter-Industry Analysis, Tokyo, Sobun-sha, 1956 (in
Japanese).
13. Morishima, M., Equilibrium, Stability and Growth, Oxford, Oxford University Press, 1964.
14. Mundell, R. A., "The Homogeneity Postulate and the Law of Comparative Sta-
tics," Econometrica, 33, April 1965.
15. Negishi, T., "A Note on the Stability of an Economy Where All Goods Are Gross
Substitutes," Econometrica, 26, July 1958.
16. Nikaido, H., Introduction to Sets and Mappings in Modern Economics, tr. by K. Sato,
Amsterdam, North-Holland, 1970 (the Japanese original, Tokyo, 1960).
17. Nikaido, H., Linear Mathematics for Economics, Tokyo, Baifukan, 1961 (in Japanese).
18. Nikaido, H., Convex Structures and Economic Theory, Academic Press, N.Y., 1968.
19. Rader, T., "Normally, Factor Inputs Are Never Gross Substitutes," Journal of
Political Economy, 76, January/February 1968.
20. Samuelson, P. A., Foundations of Economic Analysis, Cambridge, Mass., Harvard
University Press, 1947.
21. Solow, R. M., "On the Structure of Linear Models," Econometrica, 20, January 1952.
22. Takayama, A., "Minimum Wage and Unemployment," Purdue University, March
1967 (unpublished manuscript).
5
THE CALCULUS OF VARIATIONS AND
THE OPTIMAL GROWTH OF AN
AGGREGATE ECONOMY
Section A
ELEMENTS OF
THE CALCULUS OF VARIATIONS
AND ITS APPLICATIONS
The problem then is one of finding a function ("arc") x(t) from the set of differentiable
functions¹ to minimize the above integral $J_D$ subject to $x(a) = \alpha$
and $x(b) = \beta$. This problem is called the minimum distance problem.
The answer to the above problem is obviously the straight line joining A
and B. This answer can readily be obtained without applying any of the theorems
in the calculus of variations. However, we can use this problem to illustrate the
nature of the technique of the calculus of variations.
The first major development in the calculus of variations came as a result
of a somewhat more difficult problem first discussed by Galileo in 1630 and then by
John Bernoulli in 1696. The problem was solved by John Bernoulli himself and
by James Bernoulli, Newton, and L'Hospital. The problem (later called the
brachistochrone problem) was as follows: Let O and A be two fixed points in a
vertical plane and consider a particle with mass sliding from O to A under the
force of gravity along a certain curve connecting O and A on this vertical plane.
The problem is to find a curve such that the particle moves from O to A in the least
amount of time. The problem is illustrated in Figure 5.2. Here O is taken as the
origin of the coordinates.
Using elementary mechanics, we can prove that the problem can be reduced
to one of minimizing the following integral:²
$$J_B = \int_0^a \frac{\sqrt{1 + y'(x)^2}}{\sqrt{2gy}}\,dx$$
Here y(x) [with $y(0) = 0$ and $y(a) = \alpha$] denotes the curve joining O and A, and
g denotes the gravitational constant.
Later it was discovered that the calculus of variations provided a unified
view of many problems in physics. For example, in optics it is known that light
travels in such a way that it traverses the distance between two given points
in the least possible time (Fermat's principle). In classical mechanics, W. Hamilton
discovered that the motion of a system of n particles (in x-y-z space) can be
explained by the minimization of the following integral:
$$\int_{t_0}^{t_1} (T - U)\,dt, \quad \text{where } t \text{ denotes time}$$
b. EULER'S EQUATION
We now obtain the first-order necessary condition for the maximization
(or minimization) problems discussed above. The emphasis will be on the exposi-
tion and the intuitive understanding of the derivation. Hence some sacrifice of
mathematical rigor is inevitable. Moreover, we will consider only the simplest
problem. We go to a more general analysis in the next section.
Let X be the set of all real-valued (and single-valued) continuously differentiable
functions defined on the closed interval [a, b] (admissible functions).
We want to find a function x(t) in X which maximizes (or minimizes) the following
integral:
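(3) $\frac{\partial}{\partial \epsilon} J[x_\epsilon] = 0$
where the partial derivative $\partial J[x_\epsilon]/\partial \epsilon$ is evaluated at $\epsilon = 0$. But we have
$$\int_a^b f_{x'}\,h'\,dt = -\int_a^b h\,\frac{d f_{x'}}{dt}\,dt \qquad [\because h(a) = h(b) = 0]$$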
provided we tacitly assume that $d f_{x'}/dt$ exists. Therefore, we obtain from (3), (4),
and (5)
Lemma: Let F(t) be a given continuous function on [a, b]. Let $X_0$ be the set of
all continuous functions on [a, b] such that $h(t) \in X_0$ implies $h(a) = h(b) = 0$. Suppose
that $\int_a^b F(t)\,h(t)\,dt = 0$ for all $h(t) \in X_0$. Then F(t) is identically equal to zero
for all $t \in [a, b]$.
PROOF: Suppose $F(\bar t) \ne 0$ at some point $\bar t$ in [a, b], say, $F(\bar t) > 0$. Then,
from the continuity of F(t), $F(t) > 0$ in some interval [c, d] where c < d,
$\bar t \in [c, d]$, and $[c, d] \subset [a, b]$. Choose $h(t) \in X_0$ such that $h(t) > 0$ for
$t \in (c, d)$ and $h(t) = 0$ for $t \notin (c, d)$. Then $\int_a^b F(t)\,h(t)\,dt > 0$, which is a
contradiction. (Q.E.D.)
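The construction in the proof can be made concrete. The sketch below assumes a hypothetical function F(t) = t - 0.5 on [0, 1], which is positive on [0.6, 0.9], and a hypothetical "bump" h that is positive on (0.6, 0.9) and zero elsewhere; the integral is approximated by a midpoint rule. All numbers are illustrative.

```python
a, b = 0.0, 1.0      # assumed interval [a, b]
c, d = 0.6, 0.9      # subinterval on which F is positive

def F(t):
    """A continuous function that is not identically zero (hypothetical)."""
    return t - 0.5

def h(t):
    """Continuous test function: positive on (c, d), zero elsewhere."""
    return (t - c) * (d - t) if c < t < d else 0.0

def integrate(g, lo, hi, n=10000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    step = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * step) for i in range(n)) * step

value = integrate(lambda t: F(t) * h(t), a, b)
print(h(a), h(b), value)
```

Since h vanishes at both end points yet the integral is strictly positive, such an F cannot satisfy the hypothesis of the lemma, which is the contradiction used in the proof.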
Using this lemma and noting the continuity of we obtain from (6) that
$$\frac{df}{dt} = f_x\,x' + f_{x'}\,x''$$
But for the optimal path $\hat x(t)$, Euler's condition is satisfied so that $f_x = d f_{x'}/dt$. Hence we have [along the optimal arc $\hat x(t)$]:
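Minimize: $J_D = \int_a^b \sqrt{1 + (x')^2}\,dt$
Subject to: $x(a) = \alpha$ and $x(b) = \beta$
Write $\sqrt{1 + (x')^2} \equiv f_D$; then Euler's condition is written as
$$\frac{\partial f_D}{\partial x} = \frac{d}{dt}\,\frac{\partial f_D}{\partial x'}$$
Since $f_D$ contains no x, $\partial f_D/\partial x = 0$. Hence
$$\frac{\partial f_D}{\partial x'} = \frac{x'}{\sqrt{1 + (x')^2}} = \text{const.}$$
In other words, $x'(t) = \gamma$ (constant). Hence $x(t) = \gamma t + \delta$. Since $x(a) = \alpha$ and
$x(b) = \beta$, we obtain $x(t) = [(\alpha - \beta)/(a - b)]\,t + (a\beta - \alpha b)/(a - b)$. This is the
equation which denotes the desired straight line.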
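The straight-line solution can be checked numerically against a nearby admissible arc. The following sketch assumes hypothetical end points a = 0, b = 1, alpha = 1, beta = 3 and a hypothetical perturbation 0.3 sin(pi t) that vanishes at both ends; arc length is approximated by a fine polyline.

```python
import math

a, b = 0.0, 1.0            # assumed end points of the interval
alpha, beta = 1.0, 3.0     # assumed boundary values x(a), x(b)

def x_line(t):
    """Straight line through (a, alpha) and (b, beta), as derived above."""
    return (alpha - beta) / (a - b) * t + (a * beta - alpha * b) / (a - b)

def bump(t):
    """Perturbation that vanishes at t = a and t = b."""
    return math.sin(math.pi * (t - a) / (b - a))

def arc_length(x, n=2000):
    """Polyline approximation of the integral of sqrt(1 + x'(t)^2)."""
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t0 = a + i * dt
        t1 = t0 + dt
        total += math.hypot(dt, x(t1) - x(t0))
    return total

len_line = arc_length(x_line)
len_perturbed = arc_length(lambda t: x_line(t) + 0.3 * bump(t))
print(x_line(a), x_line(b), len_line, len_perturbed)
```

The line meets both boundary conditions and is shorter than the perturbed arc, as expected of the minimizing curve.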
Minimize: $J_B = \int_0^a \sqrt{\dfrac{1 + (y')^2}{2gy}}\,dx$
Subject to: $y(0) = 0$ and $y(a) = \alpha$
or
$$\sqrt{\frac{1 + (y')^2}{2gy}} = \frac{(y')^2}{\sqrt{2gy\,[1 + (y')^2]}} + c$$
We find, on multiplying, squaring, and collecting terms, that
$$y = \frac{k}{1 + (y')^2}, \quad \text{where } k \equiv \frac{1}{2gc^2}$$
$$y = \frac{k}{1 + (y')^2} = \frac{k}{1 + \tan^2 \omega} = k \cos^2 \omega$$
Thus $y' = -2k \cos\omega \sin\omega\,d\omega/dx = \tan\omega$, so that $d\omega/dx = -1/(2k \cos^2\omega)$. Thus
$dx = -2k \cos^2\omega\,d\omega = -k(1 + \cos 2\omega)\,d\omega$. Hence integration yields
$x = -k(\omega + \tfrac{1}{2} \sin 2\omega) + \text{(constant)}$
Write $2\omega = \pi - \theta$ and note that $\sin(\pi - \theta) = \sin\theta$. Then we may write
$x = k_1(\theta - \sin\theta) + k_2$
where $k_1$ and $k_2$ are some constants. Obviously $k_1 = k/2$. Also y can be obtained
as follows:
$y = k \cos^2\omega = \tfrac{k}{2}(1 + \cos 2\omega) = \tfrac{k}{2}[1 + \cos(\pi - \theta)] = \tfrac{k}{2}[1 - \cos\theta] = k_1[1 - \cos\theta]$
In other words, we have obtained the parametric representation of the solution
as
$x = k_1(\theta - \sin\theta) + k_2$
$y = k_1(1 - \cos\theta)$
These equations define a family of "cycloids" with cusps on the x-axis. A unique
curve is determined by the boundary conditions
$y(0) = 0$ and $y(a) = \alpha$
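To see that the cycloid does beat an obvious competitor, the sketch below compares travel times. It assumes $k_1 = 1$, $k_2 = 0$, an end point at $\theta = \pi$, and g = 9.81 (all hypothetical choices). Along the cycloid the integrand of $J_B$ reduces to $\sqrt{k_1/g}\,d\theta$, which follows from the parametric equations; for the straight chord the time comes from uniform acceleration $g\,y/L$ along an incline of length L.

```python
import math

g = 9.81                 # assumed gravitational constant
k1 = 1.0                 # assumed cycloid parameter (k2 = 0)
theta_end = math.pi      # assumed parameter value at the end point A

x_end = k1 * (theta_end - math.sin(theta_end))
y_end = k1 * (1.0 - math.cos(theta_end))

# Closed form along the cycloid: dt = sqrt(k1 / g) * d(theta).
t_cycloid = theta_end * math.sqrt(k1 / g)

def travel_time_numeric(n=1000):
    """Midpoint-rule evaluation of J_B along the cycloid, in the parameter theta."""
    dth = theta_end / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * dth
        y = k1 * (1.0 - math.cos(th))
        dy_dx = math.sin(th) / (1.0 - math.cos(th))
        dx = k1 * (1.0 - math.cos(th)) * dth
        total += math.sqrt((1.0 + dy_dx ** 2) / (2.0 * g * y)) * dx
    return total

t_numeric = travel_time_numeric()

# Straight chord from O to A: L = (1/2) * (g * y_end / L) * T^2.
length = math.hypot(x_end, y_end)
t_line = length * math.sqrt(2.0 / (g * y_end))
print(t_cycloid, t_numeric, t_line)
```

The numerical value of the integral agrees with the closed form, and the cycloid time is smaller than the time along the straight chord between the same two points.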
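$$f_M = \frac{\partial f_M}{\partial p'}\,p' + S, \quad \text{where } S \text{ is some constant}$$
Thus $pD - C = [pD_{p'} - C'D_{p'}]\,p' + S$, where $D_{p'} \equiv \partial D/\partial p'$ and $C' \equiv dC/dx$. Clearly,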
C' signifies the marginal cost function. The above equation is a first-order differen-
tial equation, the solution of which involves two arbitrary constants that can be
determined by the boundary conditions p(O) = p° and p(T) = p. The following
special case illustrates the problem. Let
FOOTNOTES
1. The analysis can be extended to the case of continuous functions (instead of dif-
ferentiable functions).
2. Let m be the mass of the particle. The gravitational force F is given by F = mg,
where g is the gravitational constant. The force F is decomposed into its normal
and tangential components. The former plays no part in the motion. Let ds be
the tangential distance along the curve at point (x, y). Since $\cos\theta = dy/ds$ (note
$ds^2 = dx^2 + dy^2$), the tangential component of F at (x, y) is equal to $F \cos\theta = mg\,dy/ds$.
Hence the acceleration along ds is equal to $g\,dy/ds$. That is, $dv/dt = g\,dy/ds$
where $v = ds/dt$ (velocity along ds). Thus we obtain $v\,dv = g\,dy$. Integrating this and
using the initial condition y = 0, v = 0, we have $v^2 = 2gy$. Therefore $dt = ds/v = \sqrt{1 + y'^2}\,dx/\sqrt{2gy}$.
Hence for the required duration, we obtain the expression
$J(y) = \int_0^a \sqrt{(1 + y'^2)/2gy}\,dx$.
3. The existence of an optimal function $\hat x(t)$ is not at all obvious in many problems and
the proof of the existence should be supplied separately. But such a proof will exceed
the scope of the present section. We may also note that $\hat x(t)$, even if it exists, may not
be unique.
4. Solve equation (7) regarding x as an unknown function of t. The solution $\hat x(t)$ of (7)
will provide the equation that the optimal path must satisfy. It is important to realize,
however, that this $\hat x(t)$ does not necessarily maximize (or minimize) the objective
integral J, since equation (7), in general, only provides a necessary condition and
not a sufficient condition for the optimum. In Section B, we prove that under certain
concavity conditions equation (7) also provides a sufficient condition for the optimum.
REFERENCES
1. Allen, R. G. D., Mathematical Analysis for Economists, London, Macmillan, 1938,
esp. chap. XX.
2. Bliss, G. A., Lectures on the Calculus of Variations, Chicago, Ill., University of
Chicago Press, 1946.
Section B
SPACES OF FUNCTIONS AND
THE CALCULUS OF VARIATIONS'
a. INTRODUCTION
In the last section, we considered the problem of finding a function x(t) from
the set X[a,b] of differentiable real-valued functions defined on the closed interval
[a, b] such as to maximize the following integral:
$$J[x] = \int_a^b f[t, x(t), x'(t)]\,dt$$
The alert reader may already have noticed the resemblance of the above problem
to the ordinary nonlinear programming problem that we discussed in Chapter 1.
The set X[a,b] of continuously differentiable real-valued functions on [a, b] is a
linear space. The function J[x] is a real-valued function defined on X[a,b]. The
problem is to find an $\hat x \in$ X[a,b] which maximizes J[x]. As a matter of fact, this
analogy is the same even if we take x(t) as a continuously differentiable function of
[a, b] into $R^n$, that is, $x(t) = [x_1(t), x_2(t), \ldots, x_n(t)]$, and X[a,b] is the collection
of such vector-valued continuously differentiable functions on [a, b]. The function
$x'(t)$ in J[x] is simply defined by $x'(t) = [x_1'(t), \ldots, x_n'(t)]$, where $x_i'(t) \equiv dx_i(t)/dt$.
What then is the difference between the above problem and the ordinary
nonlinear programming problem? The crucial difference is simply that the linear
space X[a,b] is no longer finite dimensional. In the exposition of ordinary non-
linear programming, the choice set is typically a finite dimensional Euclidian
space, and theorems are developed under this basic assumption. But there is no
guarantee that the theorems that hold for a finite dimensional Euclidian space
remain valid for an infinite dimensional linear space.
However, it is interesting to note that many theorems in the theory of non-
linear programming for the finite dimensional Euclidian space can be re-proved
without too much difficulty (sometimes almost word for word) for infinite dimen-
sional linear spaces. There may be some unexpected difficulties in this task, but if
we accomplish it, we obtain a general theory of nonlinear programming which
covers both the finite dimensional and infinite dimensional cases. In particular,
such a theory will include the classical calculus of variations problem as a special
case. We may note that this corresponds to a trend in modern mathematics to
review the classical results of analysis for spaces of functions (or function spaces);
this field of study is known as functional analysis.
Extensive work along this line has been done by Hurwicz in his seventy-page
article, "Programming in Linear Spaces" [5]. Textbooks on the calculus of
variations have been written from the viewpoint of function spaces (see Gelfand
and Fomin [3] and Shilov [ 10] ; for the exposition of this section we are indebted
to them). We note that our Chapter 1 was written in the same spirit as was the
Hurwicz article. Although we confined our attention to the finite dimensional
case, we remarked in several places that the definitions and theorems could be
extended to the infinite dimensional case. This was done, for example, in the
definitions of derivatives and in the proof of the Kuhn-Tucker main theorem
(Theorem 1.D.3).
In this section we shall explicitly state our problem as a "nonlinear pro-
gramming" problem for the infinite dimensional case and proceed with our analysis.
Euler's condition will be rigorously and more systematically obtained under this
procedure. However, we will not follow this procedure through to its completion.
In other words, we will not be concerned here with developing all the results of
the classical calculus of variations from the viewpoint of nonlinear programming
in infinite dimensional vector spaces. One reason is that this attempt has not been
completed yet. In the meantime, a new development suddenly attracted a great
deal of attention from mathematicians. This development became a matter of vital
interest to American mathematicians after the publication (and translation) of
the book by Pontryagin and his students [ 8] . This was followed by the work of
Hestenes [4] and his students, in which all the major results of the classical
calculus of variations have been obtained and extended by this new approach. The
most important extension is probably the incorporation of inequality constraints
in a natural way. Active research and development in this field (known as optimal
control theory), which we will summarize in Chapter 8, is being carried out vigorously.
This new approach resembles Hurwicz's approach [5] in the sense that it
recognizes the problem as a choice problem in infinite dimensional space, but it
is different in the sense that it does not come out as a natural extension of the
ordinary linear and nonlinear programming theory. An interesting novelty in
viewing and formulating variational problems is seen in the Pontryagin-Hestenes
approach. A natural question now is whether we can develop the ordinary non-
linear programming theory for the finite dimensional case from this new formula-
tion. The answer should be yes, but how? This task has been partly accomplished
recently by Canon, Cullum, and Polak [2]. However, we will not go into this
problem here. Furthermore, in this section we restrict ourselves to the simplest
SPACES OF FUNCTIONS AND THE CALCULUS OF VARIATIONS 421
problem, that is, the problem in which there are no constraints and the "end
points" such as a and b are fixed.
Definition: Let X be a linear space. Any function from X into R is called a functional
on X. A functional J on X is called a linear functional on X if (1) $J[\alpha x] =
\alpha J[x]$ for any $x \in X$ and $\alpha \in R$, and (2) $J[x + y] = J[x] + J[y]$ for any
$x, y \in X$.
of the above statements, see Kolmogorov and Fomin [7], pp. 77-78, for
example (or the reader may try it himself).
422 CALCULUS OF VARIATIONS AND OPTIMAL GROWTH OF AN AGGREGATE ECONOMY
$$\lim_{\|h\| \to 0} \frac{o(\|h\|)}{\|h\|} = 0$$
Here $o(\|h\|^2)$ is an infinitesimal of higher order than $\|h\|^2$, that is, $\lim_{h \to 0}
o(\|h\|^2)/\|h\|^2 = 0$. $Q[h, h]$ is called the second differential (or second varia-
(a) $\|h\|^2 = \sum_{i=1}^{n} h_i^2$
or
REMARK: When $X = R^n$, we used the "chain rule" (Theorem 1.C.2) for the
proof of the above theorem (Theorem 1.C.6). By noting that the chain
rule also holds in an infinite dimensional (normed linear) space, we may
prove Theorem 5.B.1 in the way we proved the finite dimensional case
(Theorem 1.C.6). Let X, Y, and Z be normed linear spaces. Let f be a function
from X into Y and g be a function from $f(X) \subset Y$ into Z. Let f be differentiable
at $x^0$ and g be differentiable at $f(x^0)$. The chain rule for this case simply
states that for $h = g \circ f$, $h'(x^0) = g'[f(x^0)] \circ f'(x^0)$. The proof of Theorem
5.B.1 then goes as follows: Let $h \in X$ be such that $\|h\| = 1$. Consider
$\phi(\theta) \equiv J[\hat{x} + \theta h]$ where $\theta \in R$. Then $\phi(\theta)$ is a function from R into itself. By
assumption, $\phi(\theta)$ has a local extremum at $\theta = 0$; hence by elementary calculus,
$\phi'(0) = 0$. By the chain rule, $\phi'(0) = J'[\hat{x}]h$. Since this holds for every such h,
$J'[\hat{x}] = 0$ or $dJ[\hat{x}] = 0$.
REMARK: It should be recalled that X does not have to be a finite
dimensional linear space. For example, x(t) may be in C[a,b] with a certain
norm. Then the fact that J[x] has a local maximum at $\hat{x}$ means that $J[\hat{x}] -
J[x] \geq 0$ for all x in some neighborhood of the curve $\hat{x}(t)$, where the "neighborhood"
is defined in terms of the metric induced by the norm of C[a,b].
Theorem 5.B.3: A sufficient condition for a functional J[x] to have a unique local
minimum at $x = \hat{x}$, given that the first differential at $\hat{x}$ vanishes (that is, $dJ[\hat{x}; h] = 0$),
is that its second differential at $\hat{x}$, $d^2J[\hat{x}; h]$, be strongly positive definite.
PROOF: Since $dJ[\hat{x}; h] = 0$, we have $\Delta J[\hat{x}; h] = d^2J[\hat{x}; h] + \epsilon \|h\|^2$, where
$\epsilon \equiv o(\|h\|^2)/\|h\|^2$ (that is, $\epsilon \to 0$ as $h \to 0$). By assumption, there exists a
$\theta > 0$ such that $d^2J[\hat{x}; h] \geq \theta \|h\|^2$ for all h. Hence we obtain
$$\Delta J[\hat{x}; h] = d^2J[\hat{x}; h] + \epsilon \|h\|^2 \geq (\theta + \epsilon) \|h\|^2$$
For $\epsilon$ small enough (that is, $\|h\|$ small enough) with $h \neq 0$, we have $\theta + \epsilon > 0$.
Hence $\Delta J[\hat{x}; h] > 0$. (Q.E.D.)
Here f is defined in an open subset of $R^{2n+1}$ which includes the space of $[t, x(t), x'(t)]$
which is defined for $a \leq t \leq b$. The function f is assumed to possess continuous
first and second partial derivatives with respect to t, x, and x'. With this explicit
form of J[x], we realize at once that the general problem of minimization (or
maximization) of J[x] turns out to be that of the calculus of variations.
Let us now consider a "displacement" of x(t) by h(t), where $h(t) \in D^n[a,b]$
with $h_i(a) = h_i(b) = 0$, $i = 1, 2, \ldots, n$.
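$$|o(\|k\|)| \leq \tfrac{1}{2} N (2n)^2 \|k\|^2$$
Hence we obtain
$$\Delta J[x; h] = \int_a^b (f_y \cdot k)\,dt + \int_a^b o(\|k\|)\,dt \leq \int_a^b (f_y \cdot k)\,dt + 2Nn^2 \int_a^b \mu^2\,dt \quad \text{for } \|k\| < \mu$$
In other words,
$$\Delta J[x; h] \leq \int_a^b (f_y \cdot k)\,dt + 2Nn^2 (b - a) \mu^2 \quad \text{for } \|k\| < \mu$$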
Thus we see that the increment of the functional J[x] is split into a principal
linear part (linear with respect to k) and an infinitesimal of higher order. For the
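latter, note that $\lim_{\mu \to 0} [2Nn^2 (b - a) \mu^2]/\mu = 0$. Hence J[x] is differentiable and
its differential has the form
$$dJ[x; h] = \int_a^b (f_y \cdot k)\,dt = \int_a^b \left[ \frac{\partial f}{\partial x_1} h_1 + \cdots + \frac{\partial f}{\partial x_n} h_n + \frac{\partial f}{\partial x_1'} h_1' + \cdots + \frac{\partial f}{\partial x_n'} h_n' \right] dt$$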
To obtain the Euler condition, we assume that x(t) is twice differentiable and then
perform integration by parts for the terms which involve $h_i'$ in the above equation.
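$$\sum_{i=1}^{n} \int_a^b f_{x_i'} h_i'\,dt = \sum_{i=1}^{n} f_{x_i'} h_i \Big|_a^b - \sum_{i=1}^{n} \int_a^b \left[ \frac{d}{dt} f_{x_i'} \right] h_i\,dt$$
where $f_{x_i'} \equiv \partial f/\partial x_i'$. The first term on the right of the above equation is zero since
$h_i(a) = h_i(b) = 0$. Hence we have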
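$$0 = \int_a^b \sum_{i=1}^{n} f_{x_i} h_i\,dt - \int_a^b \sum_{i=1}^{n} \left[ \frac{d}{dt} f_{x_i'} \right] h_i\,dt = \int_a^b \sum_{i=1}^{n} \left\{ \left[ f_{x_i} - \frac{d}{dt} (f_{x_i'}) \right] h_i \right\} dt$$
for all $h_i(t)$, $i = 1, 2, \ldots, n$, where $f_{x_i} \equiv \partial f/\partial x_i$ and $f_{x_i'} \equiv \partial f/\partial x_i'$. Since all the
increments $h_i(t)$ are independent and arbitrary, we obtain, by setting all except
one to zero, that
$$\int_a^b \left[ f_{x_i} - \frac{d}{dt} (f_{x_i'}) \right] h_i\,dt = 0, \quad \text{for all } h_i(t)$$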
Hence from the fundamental lemma of the calculus of variations (Section A), we
must have (noting the continuity of $df_{x_i'}/dt$ which is due to the continuous
differentiability of $f_{x_i'}$)
(E) $f_{x_i} - \frac{d}{dt} f_{x_i'} = 0$, $i = 1, 2, \ldots, n$, for all t in [a, b]
This is a (necessary) condition that the optimal arc must satisfy and it is called
Euler's condition for the n-variable case.
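As a hedged numerical illustration of Euler's condition (not part of the original derivation), consider the scalar functional $J[x] = \int_0^1 (x'^2 + x^2)\,dt$ with x(0) = 0 and x(1) = 1. The integrand, the end points, the displacement h(t) = sin(pi t), the step counts, and all function names below are assumptions chosen for this sketch. Euler's condition $f_x - \frac{d}{dt} f_{x'} = 0$ reads x'' = x here, and the solution through the end points is sinh(t)/sinh(1). The sketch checks that the first differential at this arc is numerically zero and that displacing the arc raises J (the integrand is convex, so the arc is a minimum).

```python
import math

def J(x, a=0.0, b=1.0, n=2000):
    """Midpoint-rule approximation of the integral of x'(t)^2 + x(t)^2 over [a, b]."""
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t0, t1 = a + i * dt, a + (i + 1) * dt
        slope = (x(t1) - x(t0)) / dt      # approximates x'(t) on the subinterval
        mid = 0.5 * (x(t0) + x(t1))       # approximates x(t) at the midpoint
        total += (slope ** 2 + mid ** 2) * dt
    return total

def x_hat(t):
    """Solution of Euler's equation x'' = x with x(0) = 0, x(1) = 1."""
    return math.sinh(t) / math.sinh(1.0)

def h(t):
    """Admissible displacement: h(0) = h(1) = 0."""
    return math.sin(math.pi * t)

def displaced(eps):
    return lambda t: x_hat(t) + eps * h(t)

# Central difference in eps approximates the first differential dJ[x_hat; h].
eps = 1e-3
first_differential = (J(displaced(eps)) - J(displaced(-eps))) / (2 * eps)

J_hat = J(x_hat)
J_displaced = J(displaced(0.1))
print(first_differential, J_hat, J_displaced)
```

For this particular integrand the exact minimum value is coth(1), which gives an independent check on the quadrature.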
REMARK: When x(t) is not twice differentiable, then we cannot get (E) as
we discussed in Section A. We instead obtain
$$f_{x_i'} = \int_a^t f_{x_i}\,d\tau + c_i, \quad i = 1, 2, \ldots, n \quad (c_i \text{ constant})$$
We may obtain the expression for the second differential $d^2J$ for
$$J[x] \equiv \int_a^b f[t, x(t), x'(t)]\,dt$$
by assuming that all the second partial derivatives of f exist and are continuous,
and by using Taylor's expansion. For example, if x(t) is real-valued (instead of
$R^n$-valued), then
b
Theorem 5.B.4: Let $f[t, x(t), x'(t)]$ be differentiable with respect to x(t) and x'(t),
where x(t) is an $R^n$-valued twice differentiable function on the closed interval [a, b]
with $x(a) = \alpha$ and $x(b) = \beta$. Suppose that f is a concave function in x(t) and x'(t).
Then a necessary and sufficient condition that $\hat{x}(t)$ maximizes the integral
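$$J[x] - J[\hat{x}] = \int_a^b \{ f[t, x, x'] - f[t, \hat{x}, \hat{x}'] \}\,dt$$
$$\leq \int_a^b [ (x - \hat{x}) \cdot \hat{f}_x + (x' - \hat{x}') \cdot \hat{f}_{x'} ]\,dt \quad [\because \text{concavity of } f]$$
$$= \int_a^b (x - \hat{x}) \cdot \left( \hat{f}_x - \frac{d}{dt} \hat{f}_{x'} \right) dt + (x - \hat{x}) \cdot \hat{f}_{x'} \Big|_a^b \quad \left[ \because \int_a^b (x' - \hat{x}') \cdot \hat{f}_{x'}\,dt = (x - \hat{x}) \cdot \hat{f}_{x'} \Big|_a^b - \int_a^b (x - \hat{x}) \cdot \frac{d}{dt} (\hat{f}_{x'})\,dt \right]$$
$$= (x - \hat{x}) \cdot \hat{f}_{x'} \Big|_a^b \quad [\because \text{Euler's equation}]$$
$$= 0 \quad [\because \text{fixed end points, that is, } x(a) = \hat{x}(a) = \alpha \text{ and } x(b) = \hat{x}(b) = \beta]$$
Here $\hat{f}_x$ and $\hat{f}_{x'}$ denote the partial derivatives of f evaluated at $[t, \hat{x}(t), \hat{x}'(t)]$.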
(Q.E.D.)
REMARK: It can easily be seen from the above proof that if f is strictly
concave in x and x', then $\hat{x}(t)$ provides a unique global maximum. Note also
that if f is convex (resp. strictly convex) in x and x', then $\hat{x}(t)$ provides a
global minimum (resp. unique global minimum).
REMARK: Any integral evaluated at a single point is obviously zero. Also,
any integral evaluated on a set of countably many points is zero. Therefore,
if two (integrable) functions u(t) and v(t) differ only for countably many
points in [a, b], then the values of the integrals of these functions from a
to b will be the same. Hence the "uniqueness" of an optimal solution in the
above only means uniqueness except for countably many points.
REMARK: As we noted in Section A, the above theorem does not say that
there does exist such an $\hat{x}(t)$. It is possible that there is no solution for
the differential equation, that is, Euler's equation. The existence of $\hat{x}(t)$ is
simply assumed in the above theorem.
REMARK: This theorem was obtained by the author and presented as a
lecture at the University of Minnesota in the spring of 1966. See Takayama
[11]. It is now a special case of Mangasarian's theorem in optimal control
theory (see Theorem 8.C.5).
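The sufficiency claim of Theorem 5.B.4 can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration: the integrand $f = x - x'^2$ (linear in x, concave in x'), the interval [0, 1], the end points x(0) = x(1) = 0, and the family of rival arcs. Euler's equation gives 1 + 2x'' = 0, so the candidate arc is t(1 - t)/4, with J equal to 1/48. Because f is concave in (x, x'), every admissible rival should yield a value of J no larger than the candidate's.

```python
import math

def J(x, n=4000):
    """Midpoint-rule approximation of the integral of x(t) - x'(t)^2 over [0, 1]."""
    dt = 1.0 / n
    total = 0.0
    for i in range(n):
        t0, t1 = i * dt, (i + 1) * dt
        x_mid = 0.5 * (x(t0) + x(t1))
        x_dot = (x(t1) - x(t0)) / dt
        total += (x_mid - x_dot ** 2) * dt
    return total

def x_hat(t):
    """Arc satisfying Euler's equation 1 + 2 x'' = 0 with x(0) = x(1) = 0."""
    return t * (1.0 - t) / 4.0

def rival(c, m):
    """Admissible rival arc: same end points, displaced by c * sin(m * pi * t)."""
    return lambda t: x_hat(t) + c * math.sin(m * math.pi * t)

J_hat = J(x_hat)
# Gap J[x_hat] - J[x] for a few rivals; concavity implies each gap is positive.
gaps = [J_hat - J(rival(c, m))
        for c in (-0.2, -0.05, 0.05, 0.2)
        for m in (1, 2, 3)]
print(J_hat, min(gaps))
```

A finite list of rivals cannot prove global optimality; the sketch only shows the inequality of the proof at work on a concrete case.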
FOOTNOTES
1. For the first reading, this section can safely be skipped, except for Theorem 5.B.4.
The major purpose of this section (except for Theorem 5.B.4) is to clarify the basic
underlying mathematical structure of the calculus of variations problem (rather than
to provide theorems useful for practical applications), which then will be useful for
further theoretical studies on this topic. Theorem 5.B.4 can be read independently
of the rest of this section, and it provides a useful result in applications. That is,
under "concave cases" the Euler condition is sufficient (as well as necessary) for
a global maximum.
2. Define addition and scalar multiplication pointwise; that is, for any x(t) and
$y(t) \in C[a,b]$, define $(x + y)(t) \equiv x(t) + y(t)$ and $(\alpha x)(t) \equiv \alpha x(t)$. The zero element
is x(t) = 0 for all t and $(-x)(t) \equiv -x(t)$.
3. With this norm, C[a,b] is a Banach space. We may recall that the Banach space is
defined as a normed linear space which is "complete" as a metric space induced by
the norm. A metric space is called complete if every Cauchy sequence is convergent.
In general, $C_X$, the set of all bounded continuous real-valued functions on a topological
space X, is a Banach space with the norm $\|x\| \equiv \sup |x(t)|$. Convergence of a
sequence $\{x^q\}$ with respect to this norm is a "uniform convergence." The sequence
$\{x^q\}$ is said to converge to $x^0$ uniformly if for any $\epsilon > 0$ there exists a $\bar{q}$ such that
$q \geq \bar{q}$ implies $|x^q(t) - x^0(t)| < \epsilon$. It is crucial that $\bar{q}$ does not depend on t. If $\bar{q}$
depends on t, then we have a pointwise convergence. That the space $C_X$ is complete
amounts to the fact that $x^q \to x^0$ (uniformly) and $x^q \in C_X$ for each q implies $x^0 \in C_X$.
The norm in the above, $\|x\| = \sup |x(t)|$, is often called the uniform norm.
4. Clearly the definitions of differentiability and differentials depend on the choice
of the norm. However, as remarked earlier (Chapter 1, Section C), the choice of the
norm really does not matter in finite dimensional spaces; that is, if J is differentiable
at x0 in one norm, then J is differentiable at x0 in any other norm and the differentials
at x0 under any norm are the same.
REFERENCES
Section C
A DIGRESSION:
THE NEO-CLASSICAL
AGGREGATE GROWTH MODEL
(2) $Y_t = X_t + I_t$
Assuming that the amount of depreciation of capital at each instant of time is a
constant proportion ($\mu$) of the existing stock of capital,² the amount of gross
investment must be equal to $\dot{K}_t + \mu K_t$, where $\dot{K}_t \equiv dK_t/dt$. That is,
(3) $\dot{K}_t + \mu K_t = I_t$
Assume that labor grows at a constant rate n, so that we have
(4) $L_t = L_0 e^{nt}$
where $L_0$ is the amount of labor available at t = 0. So far, there are four equations
above but there are five variables, $L_t$, $K_t$, $Y_t$, $I_t$, $X_t$, excluding time t. Hence by
adding one more equation we can "close" the model; that is, if these five equations
are somewhat "nice," we should be able to solve for these five variables with
respect to t. The fifth equation which can be used to close the model is the
equation which describes the consumption behavior. A common behavioral
assumption here is that the amount of consumption is a constant fraction of
net income.
(i) In equation (1), $L_t$ denotes the amount of labor input and in equation (4), $L_t$
denotes the total amount of labor available in the economy. Hence the fact
that the same notation $L_t$ is used in those two equations means that full
employment of labor is assumed. Similarly, $K_t$ in equation (1) denotes the
amount of capital input and $K_t$ in equation (3) is the amount of the total stock
of capital available in the economy. Hence the fact that the same notation $K_t$
is used in these two equations means that full employment of capital is assumed.
It is true that the economy can deviate from such a full employment
state from time to time. But if we are interested in the long-run behavior of
the economy, we might as well consider such unemployment states as "short-run"
phenomena and abstract our model from them (at least as a first approximation).
(ii) Equation (2) describes the equilibrium relation in the output market. Nothing
is mentioned about how this equilibrium can be achieved. Typically, such an
equilibrium can be achieved through flexibility of certain price variables such
as the price of output (vis-à-vis money) and/or the rate of interest. A full
consideration of this mechanism involves the consideration of other markets
such as the money market. The above model is abstracted from this consideration.
This abstraction parallels the previous assumption of full employment
where the mechanism of how full employment of labor and capital can be
achieved is not considered. In the model, the full employment equilibrium in
the output market is maintained through adjustments in investment $I_t$. That is,
investment is assumed to be completely "passive"; the amount of $I_t$ is automatically
adjusted to the level just equal to the amount of saving.
$$Y_t = L_t F\left(1, \frac{K_t}{L_t}\right) = L_t f(k_t), \quad \text{where } k_t \equiv \frac{K_t}{L_t} \text{ and } f(k_t) \equiv F\left(1, \frac{K_t}{L_t}\right)$$
In other words,
(6) $y_t = f(k_t)$, where $y_t \equiv Y_t/L_t$
Henceforth we will assume $L_t > 0$ always (that is, labor is indispensable for
production). The function f is a real-valued differentiable function defined on the
half real line $[0, \infty)$.
We can prove the following two lemmas, both of which are important in
aggregate growth theory, fairly easily: (for the sake of notational simplicity, we
omit the subscript t in these two lemmas.)
Lemma 5.C.2: Let L > 0, K > 0. Then $\partial^2 F/\partial L^2 < 0$ for all L > 0, K > 0 if and
only if $f''(k) < 0$ for all k > 0, and $\partial^2 F/\partial K^2 < 0$ for all L > 0, K > 0 if and only if
$f''(k) < 0$ for all k > 0.
PROOF: We use Lemma 5.C.1.
$$\frac{\partial^2 F}{\partial L^2} = \frac{1}{L} k^2 f''(k)$$
From this the first statement of the lemma follows immediately. Also
$$\frac{\partial^2 F}{\partial K^2} = \frac{\partial}{\partial K} [f'(k)] = f''(k) \frac{1}{L}$$
returns to scale and diminishing returns with respect to each factor is illustrated
in Figure 5.3. The reader should note that both (i) $f'(k) > 0$ for all $k > 0$ and (ii)
$f''(k) < 0$ for all $k \geq 0$ are satisfied in the diagram. Note also that $f(0) = 0$, that is,
that capital is indispensable for production, is assumed in the diagram.
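The two identities behind Lemma 5.C.2, $\partial^2 F/\partial L^2 = k^2 f''(k)/L$ and $\partial^2 F/\partial K^2 = f''(k)/L$, can be checked numerically on a concrete production function. The sketch below assumes a Cobb-Douglas form $F(L, K) = L^{1-a} K^a$ (so that $f(k) = k^a$), an exponent a = 0.3, the evaluation point (L, K) = (2, 3), and a finite-difference step; none of these choices come from the text.

```python
def F(L, K, a=0.3):
    """Assumed Cobb-Douglas production function, homogeneous of degree one."""
    return L ** (1.0 - a) * K ** a

def f_second(k, a=0.3):
    """Second derivative of the per capita function f(k) = k**a."""
    return a * (a - 1.0) * k ** (a - 2.0)

def second_difference(g, z, step=1e-3):
    """Central finite-difference approximation of g''(z)."""
    return (g(z + step) - 2.0 * g(z) + g(z - step)) / step ** 2

L, K = 2.0, 3.0
k = K / L

F_LL = second_difference(lambda labor: F(labor, K), L)
F_KK = second_difference(lambda capital: F(L, capital), K)

lemma_LL = k ** 2 * f_second(k) / L   # (1/L) k^2 f''(k)
lemma_KK = f_second(k) / L            # f''(k) (1/L)

print(F_LL, lemma_LL)
print(F_KK, lemma_KK)
```

Both second derivatives carry the sign of f''(k), which is the content of the lemma.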
The next task is to simplify the above set of equations. From equations (1),
(2), and (3), we can immediately obtain
But
$$\dot{k}_t = \frac{\dot{K}_t}{L_t} - \frac{\dot{L}_t}{L_t} k_t$$
so that
$$\frac{\dot{K}_t}{L_t} = \dot{k}_t + \frac{\dot{L}_t}{L_t} k_t$$
Definition: The time path $(k_t, x_t)$ is called a (neo-classical aggregate) feasible
(growth) path, if it satisfies equation (8) and $k_t \geq 0$, $x_t \geq 0$. If in addition it satisfies
the prescribed initial conditions $k_0$ and $x_0$, then it is called the attainable path
with respect to $k_0$ and $x_0$.
REMARK: It should be clear that any path $(L_t, K_t, Y_t, X_t, I_t)$ which satisfies
equations (1), (2), (3), and (4) can be completely described by the path $(k_t,
x_t)$ which satisfies equation (8). This is because (8) is obtained from (1), (2),
(3), and (4), and from any path $(k_t, x_t)$ which satisfies equation (8) we can
obtain a path $(L_t, K_t, Y_t, X_t, I_t)$ which satisfies equations (1), (2), (3), and (4).
Clearly there are many attainable paths starting from the same point $(k_0,
x_0)$. This is due to the fact that there are two "unknowns" in equation (8). We can
close the model by specifying the behavior of consumption. Robert Solow [15],
following Harrod and Domar, adopted the consumption behavior as described in
equation (5). By dividing both sides of (5) by $L_t$ and referring to (6), we obtain
Theorem 5.C.1 (Solow): Under assumptions (A-1) to (A-5), there exists a feasible
path $(k_s, x_s)$, which is unique, where $k_s$ and $x_s$ are some positive constants, such
that any attainable growth path with the proportional saving behavior converges
monotonically to it (that is, $k_t \to k_s$ and $x_t \to x_s$ monotonically) as $t \to \infty$, regardless
of the initial value of $k_0 > 0$, where $k_s$ and $x_s$ are determined by $s f(k_s) =
\lambda^* k_s$ and $x_s = (1 - s) [f(k_s) - \mu k_s]$.
proves the existence and the uniqueness of the path $(k_s, x_s)$ with $k_s > 0$ and
$x_s > 0$. The rest of the proof is also easy. As is clear from Figure 5.4, $k_t \lessgtr k_s$
according to whether $s f(k_t) \gtrless \lambda^* k_t$. But equation (10) states that $\dot{k}_t \gtrless 0$ according
to whether $s f(k_t) \gtrless \lambda^* k_t$. Hence $\dot{k}_t \gtrless 0$ according to whether $k_t \lessgtr k_s$. In
other words, if $k_t > k_s$, then $\dot{k}_t < 0$ so that $k_t$ decreases over time. If $k_t < k_s$, then
$\dot{k}_t > 0$ so that $k_t$ increases over time. And if $k_t = k_s$, then $\dot{k}_t = 0$ so that $k_t$ stays
at $k_s$. (Q.E.D.)
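The monotone convergence asserted in Theorem 5.C.1 can be visualized with a small simulation. The sketch below is only illustrative: it assumes the per capita dynamics take the form $\dot{k} = s f(k) - \lambda^* k$ with $\lambda^*$ treated as a given positive constant, a Cobb-Douglas $f(k) = k^a$, made-up parameter values, and a simple forward-Euler discretization. For this functional form the stationary value is $k_s = (s/\lambda^*)^{1/(1-a)}$.

```python
def simulate_k(k0, s=0.2, lam_star=0.1, a=0.3, dt=0.01, horizon=300.0):
    """Forward-Euler path of k under the assumed law k' = s*k**a - lam_star*k."""
    steps = int(round(horizon / dt))
    path = [k0]
    k = k0
    for _ in range(steps):
        k += dt * (s * k ** a - lam_star * k)
        path.append(k)
    return path

# Stationary capital:labor ratio for the assumed Cobb-Douglas case.
k_s = (0.2 / 0.1) ** (1.0 / (1.0 - 0.3))

from_below = simulate_k(0.5)   # starts with k0 < k_s
from_above = simulate_k(6.0)   # starts with k0 > k_s

rises = all(b >= a for a, b in zip(from_below, from_below[1:]))
falls = all(b <= a for a, b in zip(from_above, from_above[1:]))
print(k_s, from_below[-1], from_above[-1], rises, falls)
```

Both paths approach the same value without overshooting it, whichever side they start from.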
REMARK: Starting from $k_0$, what will be the amount of time required to
reach a certain prescribed value $k^*$? To find this, note
$$t(k^*) = \int_{k_0}^{k^*} \frac{dt}{dk}\,dk = \int_{k_0}^{k^*} \frac{dk}{\dot{k}} = \int_{k_0}^{k^*} \frac{dk}{s f(k) - \lambda^* k}$$
Recall that
$$s f(k) - \lambda^* k \begin{cases} > 0 & \text{if } k < k_s \\ < 0 & \text{if } k > k_s \end{cases}$$
Theorem 5.C.1 establishes that k approaches $k_s$ monotonically. Hence $t(k^*)$
is meaningful only when $k^*$ is such that $k_0 < k^* < k_s$ or $k_s < k^* < k_0$.⁵ Let
$k_0 \neq k_s$. The question is: What is the amount of time necessary to reach $k_s$
when $k_s \neq k_0$? This can be resolved by considering the two cases $k_0 < k_s$
and $k_0 > k_s$. In either of these two cases it is elementary to see that
$$\lim_{k^* \to k_s} t(k^*) = \infty$$
since $s f(k^*) \to \lambda^* k^*$ as $k^* \to k_s$. In other words, it takes an infinite amount
of time to reach the path $(k_s, x_s)$.⁶
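The divergence of the adjustment time can be seen numerically. The sketch below evaluates the integral of $dk / [s f(k) - \lambda^* k]$ from $k_0$ to $k^*$ by the midpoint rule for targets $k^*$ that approach $k_s$ from below. The Cobb-Douglas form $f(k) = k^a$, the parameter values, the starting point, and the number of subintervals are assumptions made for this illustration only.

```python
def adjustment_time(k0, k_star, s=0.2, lam_star=0.1, a=0.3, n=100000):
    """Midpoint-rule value of the integral of dk / (s*k**a - lam_star*k) from k0 to k_star."""
    dk = (k_star - k0) / n
    total = 0.0
    for i in range(n):
        k = k0 + (i + 0.5) * dk
        total += dk / (s * k ** a - lam_star * k)
    return total

# Stationary value for the assumed Cobb-Douglas case.
k_s = (0.2 / 0.1) ** (1.0 / (1.0 - 0.3))

# Targets k* = k_s - gap, with the gap shrinking by a factor of ten each time.
gaps = [1e-1, 1e-2, 1e-3, 1e-4]
times = [adjustment_time(0.5, k_s - gap) for gap in gaps]

for gap, t in zip(gaps, times):
    print(gap, t)
```

Each tenfold reduction of the remaining gap adds a roughly constant amount of time, which is the logarithmic divergence one would expect when the integrand behaves like the reciprocal of the distance to $k_s$.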
REMARK: As was seen in the proof, (A-1) to (A-5) are used to guarantee
the existence and the uniqueness of a positive $k_s$, that is, the existence and
uniqueness of a path $(k_s, x_s)$ with $k_s > 0$ and $x_s > 0$. The reader can easily
think of many alternative sets of assumptions which guarantee the existence
and the uniqueness of a positive $k_s$.⁷ This path $(k_s, x_s)$ may be referred to
as Solow's path. Note that in Solow's path, labor, $L_t$, and capital, $K_t$, grow
at the same rate n (because $K_t/L_t$ = constant $k_s$), and $Y_t$ and $X_t$ also grow
at this rate n [because $f(k_s)$ and $(1 - s) f(k_s)$ are constant]. Investment $I_t$
also grows at this rate because of equation (2). Hence the above theorem
establishes that the path of $(L_t, K_t, Y_t, X_t, I_t)$ approaches a "balanced
growth" path in which these variables all grow at the same rate as time
extends without limit, regardless of the initial value of these variables. This
global stability theorem was not quite established in Solow's original paper
[15]. A further scrutiny and proof of this theorem with an explicit recognition
of key assumptions is due to Okamoto and Inada [10].
REMARK: Solow's path may be illustrated as a ray from the origin with its
slope equal to $k_s$ as illustrated in Figure 5.5. Note, however, that $k_t$
approaching $k_s$ as $t \to \infty$ does not guarantee that the $(L_t, K_t)$ configuration
in the L-K-plane asymptotically converges to the $k_s$-ray (as illustrated by
the dotted line in Figure 5.5). In fact, such an asymptotic convergence is
impossible for the present model, as is shown by Deardorf [3]. Let $\delta_t$ be
the vertical distance of the $(L_t, K_t)$ path from the $k_s$-ray; that is, $\delta_t \equiv
(k_t - k_s) L_t$. Deardorf argues that $\delta_t \to 0$ is impossible and that $\delta_t \to \infty$ is
more plausible.
Under (A-1), (A-3'), and (A-4'), we can draw Figure 5.6, from which we can
immediately assert the existence and the uniqueness of $k_c$ such that $T f'(k_c) = \lambda$
and $k_c > 0$. Moreover, the time path of $k_t$ is quite apparent from Figure 5.6. Thus
we can easily determine $\dot{k}_t \lessgtr 0$ according to whether $k_t \gtrless k_c$. As $k_t$ approaches $k_c$,
$x_t$ approaches $x_c$, where $x_c$ can easily be obtained from (12), that is,
(14) $x_c = f(k_c) - T f'(k_c) k_c$
Hence we have established the following theorem.
Theorem 5.C.2: Under assumptions (A-1), (A-3'), and (A-4') with the classical
saving behavior, there exists a unique feasible path $(k_c, x_c)$ with $k_c > 0$ and $x_c > 0$,
such that any attainable path $(k_t, x_t)$ approaches it monotonically as $t \to \infty$ regardless
of the initial value of $k_0$, where $k_c$ and $x_c$ are respectively determined by $T f'(k_c) =
\lambda$ and equation (14).
REMARK: We may call $(k_c, x_c)$ the classical path. Theorem 5.C.2 establishes
global stability for the classical path, which is again a balanced growth path
of $(L_t, K_t, Y_t, X_t, I_t)$. It can also be shown that the time required to reach the
classical path is infinity. Hence, like Solow's theorem, the above theorem
establishes "asymptotic" stability for the classical path. Note that the
above method of proof can also be used to prove Solow's theorem. For this,
simply divide (10) by $k_t$ and observe that we obtain an equation similar to
(13). Such a proof is used in Okamoto and Inada [10].
The above two theorems lead us to focus our attention on balanced growth
paths.
$f(k) - \lambda k$
$$\frac{d}{dk} [f(k) - \lambda k] = 0$$
or
(16) $f'(k) = \lambda$
There exists, in view of (A-1), only one value of k which satisfies (16), and $\hat{k}$ is
this value. Note that (A-1) guarantees the strict concavity of f and hence of
$[f(k) - \lambda k]$. Thus (16) gives a necessary and sufficient condition for the unique
global maximum (Theorem 1.C.7). Even without such a remark this may be
obvious from Figure 5.7. We now define a very important concept.
Theorem 5.C.3: Under assumptions (A-1), (A-2), (A-3'), and (A-4'), there exists
a unique golden rule path, $(\hat{k}, \hat{x})$, where $\hat{k}$ and $\hat{x}$ are respectively defined by $f'(\hat{k}) = \lambda$
and $\hat{x} = f(\hat{k}) - \lambda \hat{k}$.
REMARK: Thus in the golden rule path the marginal productivity of capital
$f'(\hat{k})$ is equal to the rate of population growth (n) plus the rate of depreciation
($\mu$). The above theorem was established by quite a number of different
economists. See Phelps [11], Robinson [14], Swan [17], von Weizsacker
[20], Allais [1], and Desrousseaux [4].
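For the Cobb-Douglas case the golden rule condition can be checked directly. The sketch below assumes $f(k) = k^a$ with a = 0.3, a population growth rate n = 0.02, a depreciation rate of 0.05 (so that their sum, written `lam` in the code, is 0.07), and a search grid on (0, 20]; all of these values and names are illustrative. It compares the closed form $(a/\lambda)^{1/(1-a)}$ with a brute-force maximizer of sustainable per capita consumption $f(k) - \lambda k$.

```python
def sustainable_consumption(k, lam, a=0.3):
    """Per capita consumption f(k) - lam*k on a balanced path, with assumed f(k) = k**a."""
    return k ** a - lam * k

a = 0.3
lam = 0.02 + 0.05   # assumed n plus assumed depreciation rate

# Closed-form golden rule capital:labor ratio for the Cobb-Douglas case.
k_hat = (a / lam) ** (1.0 / (1.0 - a))

# Brute-force maximization over a grid of candidate k values.
grid = [i * 0.001 for i in range(1, 20001)]
k_best = max(grid, key=lambda k: sustainable_consumption(k, lam, a))

# At the golden rule the marginal product f'(k) should equal lam.
marginal_product = a * k_hat ** (a - 1.0)
print(k_hat, k_best, marginal_product, lam)
```

The grid maximizer agrees with the closed form up to the grid spacing, and the marginal product at that point equals the sum of the two rates.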
FOOTNOTES
1. A complete survey of macro growth theory is not attempted here. For such an
attempt, see, for example, Hahn and Matthews [7].
2. This implies, of course, that Y is gross national product and I is gross investment.
If instead we take Y as net national product, then we can put $\mu = 0$. Then I is taken
as net investment. This convention is adopted by Solow [15] and others.
3. The theoretical justification for this from a "long-run" standpoint by Duesenberry
is well known. See J. Duesenberry, Income, Saving and the Theory of Consumer
Behavior, Cambridge, Mass., Harvard University Press, 1949.
4. As remarked in footnote 2, Solow has no explicit consideration of depreciation. A
similar model and a similar theorem were also obtained by Swan [16], although he
assumed that the production function is of the Cobb-Douglas form. Tobin [19]
obtained a similar but more general model that incorporates money. But he did not
obtain the stability theorem like Theorem 5.C.1.
5. In other words, we preclude such cases as $k^* < k_0 < k_s$ and $k_s < k_0 < k^*$.
6. It may be of some interest to investigate what is the time required for the actual
path to come "close enough" to the path $(k_s, x_s)$. This clearly depends on such
parameters as s, n, $\mu$, and $k_0$. There was a debate between R. Sato and K. Sato
on this point. See, for example, K. Sato, "On the Adjustment Time in Neo-Classical
Growth Models," Review of Economic Studies, XXXIII, July 1966.
7. For example, we can have the case in which the f(k)-curve is mound-shaped, that is,
$f'(k) < 0$ for sufficiently large k (capital satiation). The essential point here is that
the sf(k)-curve intersects the $\lambda^* k$-line from the "left" with only one point of intersection.
If the rate of population growth n (hence $\lambda^*$) is not constant but depends
on per capita income, and thus is a function of k, we can have multiple equilibria with
a mixture of stable and unstable ones. This has been studied by such economists
as R. R. Nelson, H. Leibenstein, J. Buttrick, and J. Niehans. This is used as a
rationale for the "big-push" thesis. However, we may question why n ($= \dot{L}/L$) rather
than L is a function of y.
8. The name is used to emphasize its mythological character. See J. Robinson, The
Accumulation of Capital, 2nd. ed., London, Macmillan, 1965, p. 99.
9. For the Cobb-Douglas case, that is, $f(k) = k^a$, $0 < a < 1$, the value of k in the golden
rule path is easily obtained as $\hat{k} = (a/\lambda)^{1/(1-a)}$. Note that, for this case, $\hat{k} \gtrless k_s$ according
to whether $a \gtrless s$, if $\lambda = \lambda^*$.
REFERENCES
1. Allais, M., "The Influence of the Capital-Output Ratio on Real National Income,"
Econometrica, 30, October 1962.
2. Champernowne, D. G., "Some Implications of Golden Age Conditions When
Savings Equal Profits," Review of Economic Studies, XXIX, June 1962.
3. Deardorf, A. V., "Growth Path in the Solow Neoclassical Growth Model," Quarterly
Journal of Economics, LXXXIV, February 1970.
4. Desrousseaux, J., "Expansion stable et taux d'interet optimal," Annales de Mines,
November 1961.
5. Domar, E. D., Essays in the Theory of Growth, London, Oxford University Press,
1957.
6. Haavelmo, T., A Study in the Theory of Investment, Chicago, Ill., University of
Chicago Press, 1960.
7. Hahn, F. H., and Matthews, R. C. O., "The Theory of Economic Growth: A Survey,"
Economic Journal, LXXIV, December 1964.
8. Harrod, R. F., "Second Essay in Dynamic Theory," Economic Journal, LXX, June
1960.
9. Meade, J. E., A Neo-classical Theory of Economic Growth, London, George Allen and
Unwin, 2nd. ed., 1962 (1st. ed. 1961).
10. Okamoto, T., and Inada, K., "A Note on the Theory of Economic Growth,"
Quarterly Journal of Economics, LXXVI, August 1962.
11. Phelps, E. S., "The Golden Rule of Accumulation: A Fable for Growthmen," Ameri-
can Economic Review, LI, September 1961.
12. , "Second Essay on the Golden Rule of Accumulation," American Economic
18. Takayama, A., "Per Capita Consumption and Growth: A Further Analysis,"
Western Economic Journal, V, March 1967.
19. Tobin, J., "A Dynamic Aggregative Model," Journal of Political Economy, LXIII,
April 1955.
20. von Weizsacker, C. C., Wachstum, Zins und optimale Investitionsquote, Basel,
Kyklos-Verlag, 1962.
Section D
THE STRUCTURE OF
THE OPTIMAL GROWTH PROBLEM
FOR AN AGGREGATE ECONOMY¹
a. INTRODUCTION
In the previous section we discussed an aggregate model of economic growth.
The model we considered can be described by the following three equations:
(1) $Y_t = F(L_t, K_t)$
(2) $\dot{K}_t + \mu K_t = Y_t - X_t$
(3) $\dot{L}_t / L_t = n$
This economy produces a single output, Y, using two inputs, labor (L) and capital
(K); X denotes the amount of consumption. The rate of depreciation is denoted by
µ and the subscript t denotes time. In the previous section we observed that, by
adding the equation which describes the consumption (or saving) behavior and by
specifying the initial capital and labor (or the capital:labor ratio $k_0$ if F is homogeneous
of degree one), we can "close" the model and thus completely describe the
time path of each variable.
In this section we ask a different question. Instead of specifying the consumption
behavior, we ask: What is the necessary amount of consumption at each
instant of time in order to maximize a certain target while satisfying the above three
equations (the feasibility condition) and the prescribed boundary conditions?
Clearly such a target must be based on the satisfaction that one can obtain from the
stream of consumption. The question thus posed casts a genuine question of
choice. If we consume more at present, then we have less saving so that the amount
of capital stock in the future will be less compared with the case in which we save
more (that is, consume less) at present. This, in turn, implies that we have less out-
put and less consumption (unless we eat up the capital accumulated) in the future
compared with the case in which we save more at present. Hence, although we can
get more satisfaction at present as we consume more now, we will have less satis-
faction in the future. The question is: What is the optimal amount of present con-
sumption? In this verbal presentation of the problem, we implicitly assumed that
our time consists of only two periods, present and future. In general, there are
more than two periods. But this does not create too much difficulty.
Supposing that we can choose the time path of consumption on a time
continuum, we may ask what the optimal time path of consumption is. Let $x_t \equiv
X_t/L_t$ be per capita consumption of the economy. One obvious target function
which the economy may wish to choose is
(4) $\int_0^T x_t\,dt$
where T is the planning horizon. Here the society wishes to maximize the total sum
of per capita consumption over time, satisfying the feasibility condition, equations
(1) to (3), and the boundary condition (say, $k_0$ and $k_T$). In Figure 5.8 we illustrate
two types of consumption streams. The $\alpha$-curve denotes the "thrifty type" of
economy, that is, one which chooses less consumption at present or in the immediate
future, while the $\beta$-curve denotes the "nonthrifty type." The problem here is to
compare the area under the $\alpha$-curve up to the T-line with the area under the $\beta$-curve
up to the T-line. If the former, for example, has a larger area than the latter,
we say that the former is "better" than the latter under the target prescribed in
equation (4). Clearly curves such as $\alpha$ and $\beta$ are not drawn arbitrarily; they must
satisfy the "feasibility" prescribed by equations (1), (2), and (3). The optimal pro-
gram we choose under the prescribed target equation (4) is the one which gives the
largest area under the curve up to the T-line.
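The comparison of the two areas can be mimicked with a rough simulation. The sketch below is purely illustrative and rests on several assumptions that are not in the text: the reduced feasibility condition is taken to be $\dot{k} = f(k) - \lambda k - x$ with $f(k) = k^a$, each "type" of economy consumes a fixed fraction of output, the parameter values are made up, and the terminal condition on $k_T$ is ignored. It accumulates the integral of $x_t$ up to T for a thrifty and a nonthrifty program over a short and a long horizon.

```python
def cumulative_consumption(saving_rate, T, k0=1.0, a=0.3, lam=0.1, dt=0.01):
    """Integral of per capita consumption x_t over [0, T] under an assumed
    fixed saving fraction, with k' = k**a - lam*k - x (forward Euler)."""
    k, total = k0, 0.0
    for _ in range(int(round(T / dt))):
        output = k ** a
        x = (1.0 - saving_rate) * output    # consumption per capita
        total += x * dt
        k += dt * (output - lam * k - x)    # assumed feasibility condition
    return total

THRIFTY, NONTHRIFTY = 0.3, 0.1              # hypothetical saving fractions

short_T, long_T = 2.0, 100.0
thrifty_short = cumulative_consumption(THRIFTY, short_T)
nonthrifty_short = cumulative_consumption(NONTHRIFTY, short_T)
thrifty_long = cumulative_consumption(THRIFTY, long_T)
nonthrifty_long = cumulative_consumption(NONTHRIFTY, long_T)

print(thrifty_short, nonthrifty_short)
print(thrifty_long, nonthrifty_long)
```

With these made-up numbers the nonthrifty program has the larger area over the short horizon and the thrifty program has the larger area over the long one, so the ranking depends on T.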
As alert readers may have already realized, such an optimal program depends
on the length of the planning horizon T. If the planning horizon T is longer,
the thrifty type of program may eventually be "better" than the nonthrifty type,
as it pays off at a later time. However, if T is short enough, the thrifty type of
program will not be optimal. Then a question arises as to what should be the length
of this planning horizon. Should it be 5 years, 10 years, or longer? This is a rather
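[Figure 5.8: two consumption streams $x_t$, the $\alpha$-curve and the $\beta$-curve, up to the T-line]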
economics, we usually argue that a utility function of the form $u(x_t) = x_t$ is rather
unrealistic. Instead, we say that the "marginal utility" is decreasing with an increasing
amount of consumption. We now impose such an assumption. In other
words, throughout this section, we assume that u is defined on $[0, \infty)$, is twice
differentiable, and
(A-1) $u'(x_t) > 0$ and $u''(x_t) < 0$ for all $x_t \geq 0$
Under this assumption, $u(x_t)$ is a strictly monotone and strictly concave function.
[Here $u'(0)$ and $u''(0)$, respectively, refer to the right-hand derivative of u and u' at
x = 0.]
The question of discounting the future (that is, $\rho > 0$) is not an easy one.
Frank Ramsey, who first studied the optimal saving problem systematically, argued
that $\rho$ should be equal to zero, for it is "unethical" to discount the utility of our
descendants compared to the utility of ourselves. The welfare of different generations
should be equally weighted. However, Koopmans [21] and Koopmans,
Diamond, and Williamson [25] have discovered that a utility function of all consumption
paths, which at the same time exhibits time neutrality and satisfies other
reasonable postulates on utility functions, does not exist. Since this question has
not been settled yet, we will not discuss it further. For the time being, we assume
that $\rho$ is constant and nonnegative.
For the infinite horizon formulation, equation (5) is rewritten as
(6) $J \equiv \int_0^\infty u(x_t) e^{-\rho t}\,dt$
Thus our problem is now to find the time path of x1 which maximizes J subject to
the feasibility conditions (1), (2), and (3) with the prescribed value of the initial
capital-labor ratio $k_0$ (assuming constant returns to scale) and the nonnegativity of
each variable.
The question thus formulated, however, casts another problem immediately.
That is, how can we guarantee that J converges? If, for some feasible paths with a
prescribed $k_0$, the integral J does not converge (say, goes to $\infty$), the above formulation
becomes meaningless.' This question of convergence is especially acute when
$\rho = 0$. Ramsey [35] solved this question beautifully by constructing some reference
path, say, $\hat{u}$, and converting the problem to one of maximizing
(7) $J_R \equiv \int_0^\infty [u(x_t) - \hat{u}]\,dt$
Note that both J and $J_R$ are bounded from below for the optimal path under the
monotonicity of the utility function u, assuming that the economy is "productive"
in the sense that it allows a strictly positive path of consumption starting from a
given $k_0$. When $\rho > 0$, the easiest way to guarantee the convergence of J is simply
to assume that the function u is bounded from above (that is, satiation). (See also
footnotes 12 and 16.)
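The convergence issue can be made concrete with a small numerical sketch. Every specific below is an assumption for illustration: the bounded utility function u(x) = 1 - exp(-x) (which satisfies u' > 0 and u'' < 0), a consumption path that rises toward 1, the reference level u(1), the discount rate 0.05, and the horizons. The sketch compares how much the truncated integrals change when the horizon is doubled from 200 to 400.

```python
import math

def u(x):
    """Assumed utility function: increasing, strictly concave, bounded above by 1."""
    return 1.0 - math.exp(-x)

def x_path(t):
    """Assumed consumption path, rising monotonically toward 1."""
    return 1.0 - 0.5 * math.exp(-t)

def integrate(g, T, dt=0.01):
    """Midpoint-rule integral of g over [0, T]."""
    n = int(round(T / dt))
    return sum(g((i + 0.5) * dt) for i in range(n)) * dt

u_ref = u(1.0)   # hypothetical reference utility level

def J_discounted(T, rho):
    return integrate(lambda t: u(x_path(t)) * math.exp(-rho * t), T)

def J_reference(T):
    return integrate(lambda t: u(x_path(t)) - u_ref, T)

# Change in each truncated integral when the horizon goes from 200 to 400.
tail_no_discount = J_discounted(400.0, 0.0) - J_discounted(200.0, 0.0)
tail_discounted = J_discounted(400.0, 0.05) - J_discounted(200.0, 0.05)
tail_reference = J_reference(400.0) - J_reference(200.0)
print(tail_no_discount, tail_discounted, tail_reference)
```

Without discounting the integral keeps growing roughly in proportion to the horizon, while both the discounted integral and the integral measured against the reference level settle down.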
So much for the discussion of the formulation of the problem. We now
proceed to the solution of the problem thus formulated. This question of the opti-
mal saving problem was first formulated and solved to a certain extent by Ramsey
in 1928 [35]. Then the problem was almost forgotten for some time probably as a
result of the Great Depression and the war. Then in the 1950s there was a revival of
the problem with enthusiasm and it was solved by Koopmans [22] and Cass [6] in
the early 1960s.1 Cass formulated the problem in terms of Pontryagin's maximum
principle, reflecting the fashion during the time the paper was written. (We shall
take up such a formulation in Chapter 8.) As we will see in this section, we really
do not need this new technique, the full understanding of which requires a con-
siderable mathematical maturity. Instead, we will use only the knowledge of the
elementary theory of the calculus of variations that we discussed in Section A.
Koopmans' paper [22], although masterly and very penetrating, is long, consist-
ing of sixty-three printed pages, and his proofs are sometimes difficult. This dif-
ficulty is partly the result of his thorough and important examination of the
"eligibility" conditions for the "feasible path."
We can simplify the treatment considerably if we realize the fact that the
whole problem is a straightforward application of the elementary part of the clas-
sical calculus of variations. We will see that the "phase diagram" will be very
useful and vital in our analysis. There is one basic difference between his and
our approach. Koopmans first eliminates the ineligible paths, then chooses the
optimal (eligible) path from the set of eligible (attainable) paths, whereas we first
eliminate the paths which do not satisfy the Euler condition as "nonoptimal," and
then choose the eligible (optimal) path from the set of the attainable paths that
satisfy Euler's condition. In the process of obtaining the set of attainable paths
that satisfy Euler's condition, we use the elementary theory of the calculus of
variations.
To solve our problem, we first have to simplify the constraint equations (1),
(2), and (3). This procedure of simplification has already been discussed in the
previous section, assuming that $F$ exhibits constant returns to scale. In short, for
$L_t > 0$, equations (1), (2), and (3) are reduced to
(8) $\dot{k}_t = f(k_t) - \lambda k_t - x_t$
where $k_t \equiv K_t/L_t$ (capital:labor ratio), $x_t \equiv X_t/L_t$, $f(k_t) \equiv F(L_t, K_t)/L_t$. We call
the path $(k_t, x_t)$ the feasible path if it satisfies (8). If, in addition, it satisfies the
arbitrarily prescribed initial value $k_0$ and the terminal value $k_T$, we call it the
attainable path. When $T \to \infty$, $k_T$ will not be specified. The problem, then, is the
following:⁴
Maximize: $J_T \equiv \int_0^T u(x_t)\,e^{-\rho t}\,dt$
or, substituting (8),
Maximize: $\int_0^T u[f(k_t) - \lambda k_t - \dot{k}_t]\,e^{-\rho t}\,dt$
Subject to: $k_t \ge 0$, $x_t \ge 0$ for all $t$
with the prescribed values of $k_0 > 0$ and $k_T \ge 0$.
The nonnegativity constraints $k_t \ge 0$, $x_t \ge 0$ do not cause any trouble here. For,
as we will see later, the solution path obtained by neglecting the nonnegativity
condition is in fact in the nonnegative orthant.
Neglecting the nonnegativity condition, we can immediately apply the Euler
condition to choose $x_t$ so as to maximize the integral $J_T$ from the set of attainable
paths. In other words, our problem is now converted to the calculus of variations
problem without the constraint. Thus letting
(9) $\Phi(t, k_t, \dot{k}_t) \equiv u[f(k_t) - \lambda k_t - \dot{k}_t]\,e^{-\rho t}$
we can write Euler's condition as follows:
(10) $\frac{\partial \hat{\Phi}}{\partial k_t} = \frac{d}{dt}\left[\frac{\partial \hat{\Phi}}{\partial \dot{k}_t}\right]$
where the partial derivatives are evaluated at the optimal path $\hat{k}_t$. For notational
simplicity we henceforth omit (^), which denotes the optimal path. Equation (10)
gives a necessary condition for $k_t$ to be an optimal path. By utilizing (9), (10) can
be computed as
path. We show that the eligible path which satisfies both feasibility (8) and Euler's
condition (11) is such that it monotonically approaches the "modified golden
rule path" (whose concept is to be defined later) regardless of the initial $k_0$ as $T$
increases. It can be shown that the integral $J$ (for $\rho > 0$) or $J_R$ (for $\rho = 0$) converges
along such a path. The eligible Euler feasible path thus obtained will be better than
any other feasible path starting from the same initial point $k_0$ for any sufficiently
large $T$. This criterion of choosing the optimal path corresponds to the one
proposed by von Weizsäcker [52] as the "overtaking criterion."
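Euler's condition (11) can be recovered from (9) and (10) by direct differentiation. The following is a sketch, writing $\Phi$ for the integrand in (9) and $x_t = f(k_t) - \lambda k_t - \dot{k}_t$ from (8), and assuming $u$ is twice differentiable with $u'' \ne 0$. The last line is offered as the presumed form of (11); it agrees with the special case (15) used for the constant capital:output ratio.

```latex
\begin{aligned}
\frac{\partial \Phi}{\partial k_t} &= u'(x_t)\,[f'(k_t) - \lambda]\,e^{-\rho t}, \qquad
\frac{\partial \Phi}{\partial \dot{k}_t} = -u'(x_t)\,e^{-\rho t}, \\
\frac{d}{dt}\!\left[\frac{\partial \Phi}{\partial \dot{k}_t}\right]
  &= -u''(x_t)\,\dot{x}_t\,e^{-\rho t} + \rho\,u'(x_t)\,e^{-\rho t}, \\
u'(x_t)\,[f'(k_t) - \lambda] &= -u''(x_t)\,\dot{x}_t + \rho\,u'(x_t)
  \;\;\Longrightarrow\;\;
\dot{x}_t = -\frac{u'(x_t)}{u''(x_t)}\,\bigl[f'(k_t) - (\lambda + \rho)\bigr].
\end{aligned}
```

With $u' > 0$ and $u'' < 0$, per capita consumption rises exactly when $f'(k_t) > \lambda + \rho$.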
Finally, one remark should be made about the feasibility condition (8), in particular the
shape of $f(k_t)$. In the neo-classical model as described in the
previous section, we supposed a strictly concave shape of $f$, that is, $f'(k_t) > 0$ and
$f''(k_t) < 0$ for all $k_t$. However, there is one other type of production function that
is quite common in the literature of economic growth and development. This
production function assumes a constant capital:output ratio.⁵ In
this case, $F(L_t, K_t)$ has the form
(12) $Y_t = \frac{1}{\sigma} K_t$
where $\sigma$ is a positive constant denoting the capital:output ratio. Notice that labor,
$L_t$, is not explicitly involved in this production function. By dividing both sides
by $L_t$ we obtain
(13) $y_t = \frac{1}{\sigma} k_t$, where $y_t \equiv Y_t/L_t$
In other words, by identifying $f(k_t)$ with $(1/\sigma)k_t$, we can consider the present case
as a special case of the production function considered above.⁶ We can use the
same conditions (8) and (11), with $f(k_t)$ now identified as $f(k_t) = (1/\sigma)k_t$.
jecture that the optimal attainable path is "insensitive" with respect to the
terminal capital stock $k_T$ and also that it is insensitive with respect to the planning
horizon $T$. We will show that these conjectures are true under a general framework,
which will shed some light on a later controversy between Chakravarty and
Maneschi.⁷ We note that our sensitivity analysis deals with a simple case of Brock's
elegant analysis [5]. We point out that, in the Chakravarty-Goodwin case,
the optimal program for a sufficiently large T approximates the program in which
consumption is kept at the subsistence level forever. The discussion of this sub-
section will be useful to increase the reader's understanding of the problem
involved in the constant capital:output ratio case and of the basic technique
employed in the analysis, as well as some of the basic difficulties involved in the
optimal growth problem of an aggregate economy.
With these preliminary remarks we now proceed with our analysis. First
rewrite the Euler equation, (11), for the case of a fixed capital:output ratio, that
is,
(15) $\dot{x}_t = -\frac{u'}{u''}\left[\frac{1}{\sigma} - (\lambda + \rho)\right]$
Since $f(k_t) - \lambda k_t - \dot{k}_t$ [$= (1/\sigma - \lambda)k_t - \dot{k}_t$] is a linear function in $k_t$ and $\dot{k}_t$ (hence
concave) and $u$ is a strictly concave function, the Euler condition, (15), is sufficient
for a global optimum as well as necessary (assuming that $T$ is finite). The optimal
feasible path is the one which satisfies equations (14) and (15) simultaneously with
$k_t \ge 0$ and $x_t \ge 0$. We may replace the condition $x_t \ge 0$ by $x_t \ge \bar{x}$, where $\bar{x}$ is the
subsistence level of consumption. We may note that Chakravarty assumes $\bar{x} = 0$.
In the formulation of the problem by Tinbergen, Goodwin, and Chakravarty,
the depreciation of capital is not explicit. In our formulation, this amounts to
putting $\mu = 0$. Also these three authors assumed that there is no time discount for
the future consumption so that $\rho = 0$. Tinbergen and Chakravarty in the main
assumed that there is no population growth in the economy so that $n = 0$ (thus
$\lambda = 0$). Goodwin gives a numerical example of the problem in which he assumes
$n = 0.01$ and $\sigma = 4$ (see [15], pp. 773-774). Hence all the treatments of Tinbergen,
Chakravarty, and Goodwin can be considered as special cases of the following
assumption:
(A-2) $1 - \sigma(\lambda + \rho) > 0$
The case in which $1 - \sigma(\lambda + \rho) < 0$ can be discussed mutatis mutandis so that the
analysis for this case can be omitted from our discussion. We may note that if
$1 - \sigma(\lambda + \rho) < 0$, then the path constrained by equation (15) requires the economy
continuously to reduce per capita consumption (that is, $\dot{x}_t < 0$), since $u' > 0$
and $u'' < 0$ by (A-1). This is an uninteresting case. We may note that (A-2) implies
$1 - \sigma\lambda > 0$. If $1 - \sigma\lambda < 0$, then, from equation (14), any feasible path (with $x_t$
some positive constant) must undergo a decrease in capital stock and the economy
must disappear for a sufficiently large $T$ in order to keep some positive level of
consumption. Otherwise, the amount of per capita consumption must become zero
and everyone in the society will eventually starve to death. Therefore, the case
($1 - \sigma\lambda < 0$) is uninteresting. Note that (A-2), among others, implies that $\rho$
cannot be too large, and in fact, if we accept $\rho = 0$, (A-2) can easily be accepted
as a realistic assumption.
The studies by Tinbergen, Goodwin, and Chakravarty all assume the
following specific form of the utility function,⁸ which obviously satisfies (A-1),
for $x_t > \bar{x}$:
(16) $u'(x_t) = (x_t - \bar{x})^{-v}$, $v > 0$
or
(17) $u(x_t) = \frac{1}{1 - v}(x_t - \bar{x})^{1 - v}$ if $v \ne 1$, and $u(x_t) = \log(x_t - \bar{x})$ if $v = 1$
Here $u$ is defined for $x_t > \bar{x}$ if $v \ge 1$ and for $x_t \ge \bar{x}$ for $0 < v < 1$. In (16) and (17),
$\bar{x}$ is the subsistence level of consumption. If we suppose $u(x) > 0$ for some value of
$x > \bar{x}$, then $v$ cannot be greater than 1. Goodwin assumes that $v = 1$. Tinbergen
quotes the figures from Frisch's study of 1931 which, for example, says $v = 0.6$ for
American workers ([48], p. 482). The specification of the utility function as above
may cause strong opposition from the viewpoint of the cardinality of utility.
However, since one of the purposes of this section is to survey the past studies,
we want to keep the explicit form of the utility function as defined above.
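With the above specification of the utility function, the Euler equation (15)
can be rewritten as
(18) $\dot{x}_t = \alpha(x_t - \bar{x})$, where $\alpha \equiv \frac{1}{v\sigma}[1 - \sigma(\lambda + \rho)] > 0$ from (A-2)
(21) $k_t = \frac{\bar{x}}{\beta} + (B - At)e^{\beta t}$, if $\alpha = \beta$
(22) $k_t = \frac{\bar{x}}{\beta} + \frac{A}{\beta - \alpha}e^{\alpha t} + Be^{\beta t}$, if $\alpha \ne \beta$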
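The solutions (21) and (22) can be checked numerically against the feasibility condition. The sketch below assumes that consumption follows $x_t = \bar{x} + Ae^{\alpha t}$ (the solution of (18)) and that feasibility reads $\dot{k}_t = \beta k_t - x_t$ with $\beta \equiv 1/\sigma - \lambda$, as implied by (14); the function names and all parameter values are illustrative choices, not taken from the text.

```python
import math

def consumption(t, x_bar, A, alpha):
    # Assumed solution of (18): x_t = x_bar + A * exp(alpha * t)
    return x_bar + A * math.exp(alpha * t)

def capital(t, x_bar, A, B, alpha, beta):
    # Equation (21) when alpha == beta, equation (22) otherwise
    if math.isclose(alpha, beta):
        return x_bar / beta + (B - A * t) * math.exp(beta * t)
    return (x_bar / beta + A / (beta - alpha) * math.exp(alpha * t)
            + B * math.exp(beta * t))

def residual(t, x_bar, A, B, alpha, beta, h=1e-6):
    # Central-difference estimate of k_dot, minus the right-hand side beta*k - x
    k_dot = (capital(t + h, x_bar, A, B, alpha, beta)
             - capital(t - h, x_bar, A, B, alpha, beta)) / (2 * h)
    return k_dot - (beta * capital(t, x_bar, A, B, alpha, beta)
                    - consumption(t, x_bar, A, alpha))

# Illustrative constants: x_bar = 1, A = 0.5, B = 3
for alpha, beta in [(0.24, 0.24), (0.10, 0.24), (0.40, 0.24)]:
    worst = max(abs(residual(t, 1.0, 0.5, 3.0, alpha, beta)) for t in (0.0, 1.0, 5.0))
    print(alpha, beta, worst)
```

A residual that is zero up to finite-difference error in both branches indicates that the two closed forms are consistent with the assumed differential equation.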
The two constants $A$ and $B$ are to be determined by the boundary conditions. One
of them is obviously the initial value of $k$. We can consider several candidates for
the other. Goodwin chooses the terminal growth rate $\dot{Y}_T/Y_T$. Since the capital:output
ratio is constant and the rate of labor growth is constant, this amounts to
choosing $\dot{k}_T/k_T$.⁹ Chakravarty chooses the terminal stock of capital $k_T$ as the other
boundary condition. In either case, the specification of the two boundary conditions
determines the values of $A$ and $B$, and hence specifies completely the optimal
attainable path of $(k_t, x_t)$. We may note that Goodwin assumes $v = 1$ and $\rho = 0$ so
that $\alpha = \beta$, whereas Tinbergen and Chakravarty consider the case in which $v < 1$
and $\rho = 0$ so that $\alpha \ne \beta$. In other words, Goodwin considers a special case of the
time path $(k_t, x_t)$ described by equations (20) and (21), while Tinbergen and Chakravarty
considered a special case of the path of $(k_t, x_t)$ described by (20) and (22).
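The distinction between the two cases can be made concrete with a few numbers. The sketch below takes $\alpha = [1 - \sigma(\lambda + \rho)]/(v\sigma)$, obtained by substituting (16) into (15), and $\beta = 1/\sigma - \lambda$, read off from (14). Goodwin's figures ($n = 0.01$, $\sigma = 4$, $v = 1$) come from the text; the value $\sigma = 4$ in the Tinbergen-style case is an assumption made only for illustration.

```python
def alpha(v, sigma, lam, rho):
    # Growth rate of (x_t - x_bar) in (18): [1 - sigma*(lam + rho)] / (v * sigma)
    return (1.0 - sigma * (lam + rho)) / (v * sigma)

def beta(sigma, lam):
    # Coefficient of k_t in the feasibility condition (14): 1/sigma - lam
    return 1.0 / sigma - lam

# Goodwin's numerical example: lam = n = 0.01, sigma = 4, v = 1, rho = 0
a_g, b_g = alpha(1.0, 4.0, 0.01, 0.0), beta(4.0, 0.01)

# Tinbergen-style case: v = 0.6 (Frisch's figure), lam = rho = 0, sigma = 4 (assumed)
a_t, b_t = alpha(0.6, 4.0, 0.0, 0.0), beta(4.0, 0.0)

print("Goodwin:   alpha, beta =", a_g, b_g)
print("Tinbergen: alpha, beta =", a_t, b_t)
```

Under these inputs the first pair coincides (up to rounding) and the second has $\alpha > \beta$, in line with the classification given above.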
To pursue the analysis further, let us assume $\bar{x} = 0$. This amounts to choosing
the origin properly and does not constitute a loss of generality. (One may, if he
so desires, redefine $x_t$, $k_t$ by $x_t - \bar{x}$ and $k_t - \bar{x}/\beta$, respectively.) Then (19), (21), and
(22) can be rewritten respectively as
CASE II: $\alpha \ne \beta$
(30) $B = a + \frac{A}{\alpha - \beta}$
(31) $A = \frac{(\alpha - \beta)(a - be^{-\beta T})}{e^{(\alpha - \beta)T} - 1}$
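The constants can be verified against the boundary conditions. The sketch below assumes $\bar{x} = 0$ and boundary values $k_0 = a$, $k_T = b$; the branch for $\alpha = \beta$ uses $A = (a - be^{-\beta T})/T$, $B = a$, which is derived here from $k_t = (B - At)e^{\beta t}$ and is not quoted from the passage. Function names and numbers are illustrative.

```python
import math

def constants(a, b, T, alpha, beta):
    # Solve k_0 = a and k_T = b for (A, B), with x_bar = 0
    if math.isclose(alpha, beta):
        A = (a - b * math.exp(-beta * T)) / T          # derived for alpha == beta
        B = a
    else:
        A = ((alpha - beta) * (a - b * math.exp(-beta * T))
             / (math.exp((alpha - beta) * T) - 1.0))   # equation (31)
        B = a + A / (alpha - beta)                     # equation (30)
    return A, B

def capital(t, A, B, alpha, beta):
    # k_t with x_bar = 0, in the two cases
    if math.isclose(alpha, beta):
        return (B - A * t) * math.exp(beta * t)
    return A / (beta - alpha) * math.exp(alpha * t) + B * math.exp(beta * t)

a, b, T = 1.0, 5.0, 10.0
for alpha, beta in [(0.25, 0.25), (0.40, 0.25), (0.10, 0.25)]:
    A, B = constants(a, b, T, alpha, beta)
    print(alpha, beta, capital(0.0, A, B, alpha, beta), capital(T, A, B, alpha, beta))
```

Each printed pair should reproduce $(a, b)$, which confirms that the constants satisfy both boundary conditions.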
We now examine the nonnegativity condition, that is, $k_t \ge 0$, $x_t \ge 0$ for all $t$.
Clearly whether this condition is satisfied or not depends on the magnitudes of
$A$ and $B$. So far as equations (27) to (30) are concerned, $A$ and $B$ can be either
negative or positive depending on the size of $T$ and the relative sizes of $a$ and $b$.
For example, if $T$ is sufficiently small and $b$ is sufficiently large relative to $a$,
then we may have $a < be^{-\beta T}$, so that $A < 0$ in (27). A necessary and sufficient
condition for $x_t \ge 0$ for all $t$ is, in view of (23), that $A \ge 0$ regardless of the
relative size of $\alpha$ and $\beta$. $A \ge 0$ holds [in view of (27) and (29)] if and only if
(32) $ae^{\beta T} \ge b$, regardless of the relative size of $\alpha$ and $\beta$
We assume that this condition holds, for otherwise $x_t < 0$ (for all $t$).
In order to consider the condition in which $k_t \ge 0$ for all $t$, we obtain the
expressions for $k_t$ using (24), (25), (27), (28), (29), and (30):
(34) $k_t = ae^{\beta t} - (ae^{\beta T} - b)\,\frac{e^{\alpha t} - e^{\beta t}}{e^{\alpha T} - e^{\beta T}}$, when $\alpha \ne \beta$
In view of (33), $k_t \ge 0$ for all $t$, when $\alpha = \beta$. Also in view of (35), $k_t \ge 0$ for all $t$,
when $\alpha \ne \beta$. In fact, $k_t > 0$ for all $t$, $0 \le t < T$, regardless of the relative size of
$\alpha$ and $\beta$ as long as $a > 0$.
In order to investigate what happens when $T$ is large enough, take the limit
as $T \to \infty$ in (27), (28), (29), and (30). We then obtain:
(36) $A = 0$, $B = a$, when $\alpha = \beta$
(37) $A = 0$, $B = a$, when $\alpha > \beta$
(38) $A = (\beta - \alpha)a$, $B = 0$, when $\alpha < \beta$
Hence in view of (23), (24), and (25), we obtain:
(39) $x_t = 0$, for all $t$, when $\alpha \ge \beta$
(40) $x_t = (\beta - \alpha)ae^{\alpha t}$, for all $t$, when $\alpha < \beta$
(41) $k_t = ae^{\beta t}$, for all $t$, when $\alpha \ge \beta$
(42) $k_t = ae^{\alpha t}$, for all $t$, when $\alpha < \beta$
Note that when $\alpha < \beta$, we obtain, in view of (40) and (42), the following relation:
(43) $x_t = (\beta - \alpha)k_t$, for all $t$
In other words, $x_t$ and $k_t$ grow at the same rate $\alpha$ and the ratio between them is
constant (that is, $\beta - \alpha$). For $\alpha \ge \beta$, we do not have such a solution.
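The limits (37) and (38) can be illustrated by evaluating the constants for increasing horizons. The sketch below assumes $\bar{x} = 0$, $k_0 = a$, $k_T = b$ and the case $\alpha \ne \beta$; the parameter values are chosen only for illustration.

```python
import math

def constants(a, b, T, alpha, beta):
    # (A, B) from k_0 = a and k_T = b with x_bar = 0, for alpha != beta
    A = ((alpha - beta) * (a - b * math.exp(-beta * T))
         / (math.exp((alpha - beta) * T) - 1.0))
    return A, a + A / (alpha - beta)

a, b = 1.0, 5.0
for alpha, beta in [(0.40, 0.25), (0.10, 0.25)]:
    for T in (10.0, 50.0, 400.0):
        A, B = constants(a, b, T, alpha, beta)
        print(f"alpha={alpha} beta={beta} T={T:6.1f}  A={A:.6e}  B={B:.6e}")
```

As $T$ grows, the output should approach $A = 0$, $B = a$ when $\alpha > \beta$ and $A = (\beta - \alpha)a$, $B = 0$ when $\alpha < \beta$, so that the terminal stock $b$ drops out of the limit path.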
Define the limit path as the path which is specified by (39) and (41) [or
(40) and (42)]. What can we infer from the limit path? One important implication
is that such a path gives an approximation of the optimal path when the planning
horizon is large enough.
Can we infer anything about the infinite horizon problem? First note that
we cannot specify the terminal stock b for the infinite horizon problem. It is
certainly meaningless to talk about the capital:labor ratio for the infinite future,
that is, the date which we can never reach! Therefore, strictly speaking, the
solution of the infinite horizon problem is not the limit of the finite horizon
problem. The problem is altered with regard to the specification of the terminal
stock b.
The reader may then wonder whether we can replace the boundary condition
$k_T = b$ by a condition such as $\lim_{T \to \infty} k_T = b$. Then the terminal condition
is specified. But we can immediately see that the limit path then does not give a
solution of the infinite horizon problem, by simply observing $k_t \to \infty$ as $t \to \infty$ in
the limit path. In other words, if we adopt the limit path approach for the infinite
horizon problem, the terminal condition should not be specified.
Furthermore, the limit path is not, in general, a solution of the infinite
horizon problem anyway, even if the terminal condition is unspecified. This is
easy to see by assuming $\alpha \ge \beta$ and recalling (39). In other words, if $\alpha \ge \beta$, $x_t = 0$
for all $t$ in the limit path; that is, consumption must be kept at the subsistence
level forever. Clearly the path in which $x_t = 0$ for all $t$ cannot be an "optimal"
path. In fact, it gives the worst possible path if $\beta > 0$, since it is possible for the
economy to sustain itself at more than the subsistence level. To see this, it suffices
to choose $x_t$ such that $0 < x_t < \beta a$ and examine (14). Clearly such a path is attainable
and $k_t$ is nondecreasing in $t$. Such a path is certainly better than the path
in which the economy is kept at the subsistence level for all $t$.
What then can we infer from this? The appropriate conclusion is that the
solution of the infinite horizon problem does not exist if $\alpha \ge \beta$. Actually a simple
procedure, which does not involve the tedious process of obtaining the limit
path, will also reveal this. First note that the solution must satisfy the feasibility
condition (14), and the Euler condition, (18) or (23).¹⁰ Any path which satisfies
these two conditions is called the feasible Euler path. The feasible Euler path is
not necessarily a solution path (an optimal path), that is, the solution of the
optimization problem. We then have to proceed to screen the solution path out
of the set of all feasible Euler paths. The test used in this screening process is
called the eligibility test. Note that if $x_0 = 0$, then $x_t = 0$ for all $t$ by Euler's condition
(23). Since the society can sustain itself at more than the subsistence level, the
path in which $x_t = 0$ for all $t$ is not "eligible" for the solution path. Recall (24)
and (25). Then if $x_0 > 0$ and $\alpha \ge \beta$, $k_t$ eventually becomes negative for a sufficiently
large $t$ [$\because A > 0$ from (23) and $x_0 > 0$]. In other words, if $\alpha \ge \beta$, none of the
feasible Euler paths is "eligible" for the solution path; that is, the solution path
for the infinite horizon problem does not exist.
Since the above conclusion crucially hinges on whether $\alpha \ge \beta$ or $\alpha < \beta$, let
us obtain the condition under which $\alpha < \beta$. This can easily be obtained from the
definitions of $\alpha$ and $\beta$ in (18) and (20), and we can conclude that the necessary
and sufficient condition for $\alpha < \beta$ is
(44) $(1 - v)(1 - \sigma\lambda) < \sigma\rho$
As we remarked earlier, if we assume $u(x) > 0$ for some $x > 0$, $v$ cannot be greater
than one. Hence, in this case, $\alpha \ge \beta$ must hold as long as $\rho = 0$. Note that if
$\rho > 0$, then $v \le 1$ is necessary for $\alpha \ge \beta$, and that $v \ge 1$ is sufficient for $\alpha < \beta$
[in view of (44)]. Goodwin's case ($v = 1$, $\lambda > 0$, $\rho = 0$) and Chakravarty's model
($v < 1$, $\lambda > 0$, $\rho = 0$), as well as the above-mentioned Tinbergen case ($\lambda = \rho = 0$,
$v < 1$), all yield the case in which $\alpha \ge \beta$. An interesting example of $\alpha < \beta$ may be
the case in which $\lambda = 0$, $\rho > 0$, and $v \ge 1$.
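The condition for $\alpha < \beta$ cited as (44) can be derived in two lines. The sketch below assumes $\alpha = [1 - \sigma(\lambda + \rho)]/(v\sigma)$ and $\beta = (1 - \sigma\lambda)/\sigma$, with $v > 0$ and $\sigma > 0$; the exact form in which the original states the condition may differ from the one shown.

```latex
\begin{aligned}
\alpha < \beta
 &\iff \frac{1 - \sigma\lambda - \sigma\rho}{v\sigma} < \frac{1 - \sigma\lambda}{\sigma}
  \iff 1 - \sigma\lambda - \sigma\rho < v\,(1 - \sigma\lambda) \\
 &\iff (1 - v)(1 - \sigma\lambda) < \sigma\rho .
\end{aligned}
```

Since (A-2) implies $1 - \sigma\lambda > 0$, the inequality fails for every $v \le 1$ when $\rho = 0$, and it holds automatically for $v \ge 1$ when $\rho > 0$, which matches the remarks on the Goodwin, Chakravarty, and Tinbergen cases.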
Tinbergen considered the infinite horizon problem with the above specification
(which amounts to $\alpha > \beta$) and contended that his article is "an unsuccessful
attempt to find a simple solution to the problem of optimum savings" ([48],
p. 481). Both Goodwin and Chakravarty considered the finite horizon problem;
hence there is no such "unsuccessful" story.
For the finite horizon problem, the limit path represents an approximation
to the case in which T is sufficiently large. The only question that remains is
how the economy can tolerate spending most of its time near the subsistence
level. The present contention of the author is that this is not a small criticism,
although such a judgment may be a matter of taste.
Confining ourselves to the finite horizon problem, there is a way to avoid
the above-mentioned problem of the "arbitrary cut-off point." This is the
"sensitivity analysis" explored by Brock [5]. Postponing its full discussion to
the Appendix, we now illustrate this analysis for the present problem. Assuming
that T is finite, this analysis examines questions such as the effect of a change in
the terminal stock $k_T = b$ and a change in the terminal date $T$ on the optimal
consumption program. Then we find out that the optimal consumption program
is "insensitive," at least for a certain initial period, with respect to these changes,
if T is large enough. As remarked before, Chakravarty [9] conjectured such
insensitivities by constructing certain numerical examples. These problems were
then solved under a general framework with both linear and nonlinear production
functions by Brock [5]. Our consideration here offers a simple case of Brock's
result. Also note that Brock dealt with a discrete time model while ours is a
continuous time model.
We first consider the effect of a change in the terminal stock requirement
$k_T$, assuming $T$ is fixed. Write the optimal path for $k_T = b_1$ as $(k_t^1, x_t^1)$
and the optimal path for $k_T = b_2$ as $(k_t^2, x_t^2)$. Similarly, we write the values of
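CASE II: $\alpha \ne \beta$
(48) $x_t^1 - x_t^2 = e^{\alpha t}(A_1 - A_2) = \frac{(b_2 - b_1)(\alpha - \beta)e^{\alpha t}}{e^{\alpha T} - e^{\beta T}} < 0$ (which implies $x_t^1 < x_t^2$)
(49) $k_t^1 - k_t^2 = \frac{e^{\alpha t} - e^{\beta t}}{e^{\alpha T} - e^{\beta T}}(b_1 - b_2) > 0$ (which implies $k_t^1 > k_t^2$)
(51) $k_t^1 - k_t^2 = (b_1 - b_2)\frac{e^{\alpha t} - e^{\beta t}}{e^{\beta T}(e^{(\alpha - \beta)T} - 1)} = (b_1 - b_2)\frac{e^{\beta t} - e^{\alpha t}}{e^{\alpha T}(e^{(\beta - \alpha)T} - 1)}$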
Hence the distance between $x_t^1$ and $x_t^2$ can again be made arbitrarily small for
each $t$ by choosing $T$ large enough (relative to $t$), when $\alpha \ne \beta$. Also the distance
between $k_t^1$ and $k_t^2$ can again be made arbitrarily small for each $t$ by choosing
$T$ large enough (relative to $t$), when $\alpha \ne \beta$. The choice of $T$ with a given distance
between $x_t^1$ and $x_t^2$ (resp. $k_t^1$ and $k_t^2$) and with a given value of $t$ can be computed
precisely from (50) [resp. (51)].
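The insensitivity to the terminal stock can be seen numerically. The sketch below evaluates the gaps between the two optimal paths for $\alpha \ne \beta$, using the closed forms that follow from the constants $A$ and $B$ with $\bar{x} = 0$ and a common initial stock; the terminal stocks $b_1 = 5$, $b_2 = 2$ and the rates are illustrative assumptions.

```python
import math

def x_gap(t, T, b1, b2, alpha, beta):
    # x_t^1 - x_t^2 for alpha != beta, cf. (48)
    return ((b2 - b1) * (alpha - beta) * math.exp(alpha * t)
            / (math.exp(alpha * T) - math.exp(beta * T)))

def k_gap(t, T, b1, b2, alpha, beta):
    # k_t^1 - k_t^2 for alpha != beta, cf. (49)
    return ((b1 - b2) * (math.exp(alpha * t) - math.exp(beta * t))
            / (math.exp(alpha * T) - math.exp(beta * T)))

t, b1, b2 = 1.0, 5.0, 2.0
for alpha, beta in [(0.40, 0.25), (0.10, 0.25)]:
    for T in (10.0, 50.0, 200.0):
        print(alpha, beta, T, x_gap(t, T, b1, b2, alpha, beta),
              k_gap(t, T, b1, b2, alpha, beta))
```

For a fixed early date $t$, both gaps shrink toward zero as $T$ increases, while at $t = T$ the capital gap equals $b_1 - b_2$ by construction.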
CASE II: $\alpha \ne \beta$
(55) $k_t^{T'} - k_t^T = -(e^{\alpha t} - e^{\beta t})\Delta_T$, where $\Delta_T \equiv \frac{a - be^{-\beta T'}}{e^{(\alpha - \beta)T'} - 1} - \frac{a - be^{-\beta T}}{e^{(\alpha - \beta)T} - 1}$
Note that $\delta_T$ and $\Delta_T$ can be made arbitrarily close to zero by choosing $T$ sufficiently
large (regardless of the relative size of $\alpha$ and $\beta$). Hence, for each fixed $t$, both the
distance between $x_t^T$ and $x_t^{T'}$ and the distance between $k_t^T$ and $k_t^{T'}$ can be made
arbitrarily close to zero, by choosing $T$ sufficiently large (relative to $t$), regardless
of the relative size of $\alpha$ and $\beta$.
Note also that if $b = 0$, then $\delta_T < 0$ so that $x_t^{T'} < x_t^T$ and $k_t^{T'} > k_t^T$ when
$\alpha = \beta$. Also $b = 0$ implies that $\Delta_T \lessgtr 0$ according to whether $\alpha \gtrless \beta$. Hence $x_t^{T'} < x_t^T$
and $k_t^{T'} > k_t^T$, when $\alpha \ne \beta$. In other words, the monotonicities of $x_t^T$ and
$k_t^T$ with respect to $T$ can be achieved whenever we have $b = 0$. Note that a necessary
and sufficient condition for such monotonicities can be computed from (52)
to (55) for the case in which $b > 0$.¹¹ Also note that we have established the insensitivity
of the optimal path with respect to $T$ without regard to any such
monotonicities.
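The behavior of $\delta_T$ and $\Delta_T$ can be illustrated numerically. Their displayed definitions are not legible in this passage, so the sketch below assumes $\delta_T$ is the change in the constant $A$ (case $\alpha = \beta$) and $\Delta_T$ the change in $A/(\alpha - \beta)$ (case $\alpha \ne \beta$) when the horizon is lengthened from $T$ to $T' > T$, with $\bar{x} = 0$, $k_0 = a$, $k_T = b$. The function names and parameter values are hypothetical.

```python
import math

def delta(T, T_prime, a, b, beta):
    # Case alpha == beta: A(T') - A(T), where A(T) = (a - b*exp(-beta*T)) / T
    return ((a - b * math.exp(-beta * T_prime)) / T_prime
            - (a - b * math.exp(-beta * T)) / T)

def big_delta(T, T_prime, a, b, alpha, beta):
    # Case alpha != beta: change in A / (alpha - beta) between horizons T' and T
    def term(h):
        return (a - b * math.exp(-beta * h)) / (math.exp((alpha - beta) * h) - 1.0)
    return term(T_prime) - term(T)

a = 1.0
for b in (0.0, 5.0):
    for T in (10.0, 100.0):
        print(b, T, delta(T, T + 10.0, a, b, 0.25),
              big_delta(T, T + 10.0, a, b, 0.40, 0.25),
              big_delta(T, T + 10.0, a, b, 0.10, 0.25))
```

With $b = 0$ the signs follow the pattern described above, and for either value of $b$ the magnitudes fall as $T$ grows, which is the insensitivity with respect to the horizon.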
Hence we obtained the conclusion that the optimal path is insensitive both
to a change in the planning horizon $T$ and to a change in the terminal stock $k_T$
for a certain initial period. I believe that this is a precise formalization of Chakravarty's
conjecture, where he confined himself to numerical examples.
In the above we noted the following features of the constant capital:output
ratio model when $\alpha \ge \beta$.
(i) The solution of the infinite horizon problem does not exist.
(ii) Although the sensitivity results hold for the finite horizon problem, the solution
for a sufficiently large T approximates the program in which consumption is
kept at the subsistence level for a long period of time.
of the production function, that is, the case in which the assumption of a con-
stant capital:output ratio does not hold. We then show that under a certain set
of plausible assumptions, the solution of the infinite horizon problem always con-
verges to a balanced growth path ("modified golden rule path"). In the Appendix
to Section D, we show that the sensitivity results hold in general, including such
a nonlinear case.
Here we should also recall the problem of the inequality between the
natural rate of growth and the warranted rate of growth in the Harrod-Domar
model. In other words, we have to ask ourselves the question whether we can
really describe the "optimal" path without any significant consideration of
the growth of labor. Will not such a path be bounded by the ceiling of the
growth with "full employment of labor"? Will not such a path cause continuously
increasing unemployment of labor? Will not the productivity of capital
($1/\sigma$) be decreased with the increase in the capital:labor ratio? There are no
clear answers to these questions as long as we retain the assumption of a
constant capital:output ratio.
to see that the "nonlinear" specification of $f$ creates an essential difference from
the "linear" specification of $f$ where the capital:output ratio is assumed to be
constant. In the subsequent analysis, we shall show that, under the nonlinear
specification of $f$ (also of $F$), there exists a unique optimal attainable path for
the infinite horizon problem, which approaches the modified golden rule path.
The "modified golden rule path" will be defined later. (It is equal to the golden
rule path when the discount factor $\rho$ is equal to zero.)
To obtain this result, we need one more specification on the utility function
in addition to (A-1):
(A-5) $u(x) \to -\infty$ as $x \to 0$ with $x > 0$.
This assumption is due to Koopmans [22]. He explains that "this means a strong
incentive to avoid periods of very low consumption as much as is feasible" (p. 241).
If $x_t = 0$ for any (small) time interval, then by (A-5) the objective integral diverges
to $-\infty$. That is, (A-5) in essence guarantees an interior solution.
We are now ready to proceed with our analysis. As discussed in subsection a,
we first solve the problem with a finite horizon, and then examine the optimal
feasible path when $T$ extends without limit. Thus our first task is to maximize the
integral $J_T$ [equation (5)] subject to feasibility. This is a straightforward calculus
of variations problem, of which the Euler condition is already obtained [equation
(11)]. Now note that since $f$ is strictly concave in $k$ from (A-3) and $u$ is a strictly
concave function from (A-1), $\Phi$ is a strictly concave function in $k$ and $\dot{k}$. Hence the
Euler condition as given in (11) is sufficient (as well as necessary) for a unique
global maximum (Theorem 5.B.4). Ignoring the possibility of a corner solution
(which may arise due to the nonnegativity condition $k_t \ge 0$, $x_t \ge 0$), the feasible
Euler path is the one that satisfies equations (11) and (8) simultaneously. Hence the
time path of $k_t$ and $x_t$ can be analyzed simply in terms of the following phase diagram,
where we now confine our attention to the nonnegative orthant of the
$(x - k)$-plane in view of the nonnegativity constraint.
In Figure 5.10, $\hat{k}(\rho)$ is defined as the value of $k$ which satisfies the following
equation:
(56) $f'(k) = \lambda + \rho$
From (A-3) and (A-4), $\hat{k}(\rho)$ lies between 0 and $\bar{k}$. Also, $\hat{x}(\rho)$ in Figure 5.10 is
defined by the following equation:
[Figure 5.10: phase diagram; the curve $x = f(k) - \lambda k$ is the locus $\dot{k} = 0$]
described in the above diagram. From the diagram it is clear that there are three
kinds of feasible Euler paths, $(k_t, x_t)$, namely,
(Type A) $k_t > \hat{k}(\rho)$ for all $t \ge \bar{t}$ (for some $\bar{t} \ge 0$).
(Type B) $k_t \to \hat{k}(\rho)$ and $x_t \to \hat{x}(\rho)$ as $t \to \infty$.
(Type C) $k_t < \hat{k}(\rho)$ for all $t \ge \bar{t}$ (for some $\bar{t} \ge 0$).
Along the type A path, $x_t < \hat{x}(\rho)$ as well as $k_t > \hat{k}(\rho)$ from some time on
(say after $t = t_0$). Then we can always improve on a given type A path by consuming
the capital stock (disinvesting) in some interval beginning at $t_0$ until $k_t$
diminishes to $\hat{k}(\rho)$, while we raise $x_t$ to $\hat{x}(\rho)$. After this, we maintain $\hat{x}(\rho)$ and
$\hat{k}(\rho)$, and we obtain a path superior to a given type A path. In other words, the
type A path cannot be optimal. Along the type C path, $\dot{k}_t < 0$ for all $t \ge \bar{t}$ so
that $k_t$ is decreasing over time, yet $x_t$ is nondecreasing over time as can be seen
from the above phase diagram. Hence $k_t$ eventually goes to some negative value
for a sufficiently long passage of time.¹⁴ This violates our assumption of $k_t \ge 0$
for all $t \ge 0$. Hence both the type A and type C paths are not eligible for the
infinite horizon problem. Note that when we consider the problem of $t \to \infty$
(hence also $T \to \infty$), we do not pre-specify the value of $k_T$.
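The three types of feasible Euler paths can be reproduced with a small simulation. The sketch below is only an illustration under assumed functional forms that are not in the text: $f(k) = k^{0.3}$, $u(x) = \log x$ (so that $-u'/u'' = x$ and (A-5) holds), $\lambda = 0.05$, $\rho = 0.03$. It integrates (8) and (11) by Euler steps from $k_0 < \hat{k}(\rho)$ and bisects on the initial consumption $x_0$ to separate type A from type C paths; the boundary value approximates the start of the type B path.

```python
A_SHARE, LAM, RHO = 0.3, 0.05, 0.03   # illustrative values, not from the text

def f(k):
    return k ** A_SHARE

def f_prime(k):
    return A_SHARE * k ** (A_SHARE - 1.0)

# k_hat solves f'(k) = lam + rho; x_hat is consumption on the k_dot = 0 locus there
K_HAT = (A_SHARE / (LAM + RHO)) ** (1.0 / (1.0 - A_SHARE))
X_HAT = f(K_HAT) - LAM * K_HAT

def classify(k0, x0, dt=0.01, t_max=300.0):
    """Follow a feasible Euler path from (k0, x0), assuming k0 < K_HAT."""
    k, x = k0, x0
    for _ in range(int(t_max / dt)):
        k_dot = f(k) - LAM * k - x
        if k_dot < 0.0:
            return "C"      # capital starts to fall while consumption keeps rising
        if k > K_HAT:
            return "A"      # capital overshoots k_hat and stays above it
        x_dot = x * (f_prime(k) - (LAM + RHO))   # Euler condition with u = log
        k, x = k + dt * k_dot, x + dt * x_dot
    return "B"              # neither exit within t_max: close to the saddle path

def saddle_consumption(k0, iterations=40):
    # Bisection on x0: too little consumption gives type A, too much gives type C
    lo, hi = 0.0, f(k0) - LAM * k0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        kind = classify(k0, mid)
        if kind == "A":
            lo = mid
        elif kind == "C":
            hi = mid
        else:
            return mid
    return 0.5 * (lo + hi)

x0_star = saddle_consumption(1.0)
print(K_HAT, X_HAT, x0_star)
```

Initial consumption slightly below the computed value yields a type A path and slightly above it a type C path, which mirrors the elimination argument: only the knife-edge path converges to $[\hat{k}(\rho), \hat{x}(\rho)]$.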
What about the type B path? If $\rho$ is positive, then the integral $J$ defined in (6)
along the type B path is clearly convergent, so that we obtain a unique eligible
path which is feasible and satisfies Euler's condition, for any positive initial $k_0$.¹⁵
The path approaches $[\hat{k}(\rho), \hat{x}(\rho)]$ monotonically as time extends without
limit.¹⁶ If $\rho$ is zero, then the integral $J$ defined in (6) along the type B path is not
convergent. However, the problem of divergence in this case can be avoided if we
redefine the target function as follows:
Along the type B path, we can show that the integral $J_R$ is convergent; hence the
feasible Euler path of type B is eligible for the infinite horizon problem under
this new target function $J_R$.¹⁷ This Ramseyian device is also used by Koopmans
[22]. We now obtain the following theorem.
is defined in the sense of maximizing the integral $J_R$, and this integral is convergent
for this optimal path.
REMARK: If $k_0 = \hat{k}(\rho)$, then the optimal feasible path is simply the path of
$[\hat{k}(\rho), \hat{x}(\rho)]$ for all $t \ge 0$. The target is the integral $J$ when $\rho > 0$ and $J_R$ when
$\rho = 0$.
REMARK: The path of $[\hat{k}(\rho), \hat{x}(\rho)]$ is the familiar "golden rule path" à la
Phelps, Robinson, and so on, when $\rho = 0$. We can, in general, call the path of
$[\hat{k}(\rho), \hat{x}(\rho)]$ with $\rho > 0$ the modified golden rule path.
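The relation between the golden rule and the modified golden rule can be computed explicitly for a specific technology. The sketch below assumes a Cobb-Douglas form $f(k) = k^{a}$ with $a = 0.3$ and $\lambda = 0.05$; these are illustrative choices, not from the text. It uses $f'(\hat{k}) = \lambda + \rho$ and $\hat{x} = f(\hat{k}) - \lambda\hat{k}$.

```python
def k_hat(rho, lam=0.05, a=0.3):
    # Solves f'(k) = lam + rho for f(k) = k**a
    return (a / (lam + rho)) ** (1.0 / (1.0 - a))

def x_hat(rho, lam=0.05, a=0.3):
    # Sustainable per capita consumption at k_hat: f(k) - lam * k
    k = k_hat(rho, lam, a)
    return k ** a - lam * k

for rho in (0.0, 0.01, 0.03, 0.05):
    print(rho, k_hat(rho), x_hat(rho))
```

At $\rho = 0$ the pair coincides with the golden rule, where $f(k) - \lambda k$ is maximal; a positive $\rho$ lowers both the stationary capital:labor ratio and the stationary consumption level.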
The importance of the above theorem may be emphasized. It gives a
completely new significance to the golden rule path. As discussed in Section C,
the concept of the golden rule path can be severely criticized on the grounds
that it neglects the historically given stocks of capital and labor, and that its
choice set is restricted to the golden age paths. This means that even if the
historically given value of the capital:labor ratio happens to be on the golden
rule path, it only maximizes per capita consumption in the choice set which is
limited to the set of the golden age paths. Theorem 5.D.1 gives an answer to
both of these criticisms. In other words, it says that the path which maximizes the
"Ramsey sum" of utility over the infinite horizon (that is, JR) converges to the
golden rule path regardless of the initial value of $k_0$, as long as it satisfies the
eligibility conditions. Here the choice set is not limited to the golden age paths,
so that $k_t$ can fluctuate over time (in fact, along this optimal attainable eligible
path, $k_t$ approaches $\hat{k}(\rho)$ monotonically; hence, in general, it is not constant). If
we have a positive discount factor ($\rho > 0$), the theorem says that the optimal
[Figure residue: curve labels $f'(k) = \lambda + \rho$ and $x = f(k) - \lambda k$]
FOOTNOTES
1. This section was first presented by the author as a lecture at the University of
Minnesota in the spring of 1966. See Takayama [46]. For a recent survey of
the same problem, see Koopmans [24], for example. In the first reading of this
section, the reader may skip reading subsection b.
2. This point is discussed by Chakravarty [8].
3. There exists an extensive literature on this topic including recent textbook expositions.
The earlier contributions on this problem in addition to [35], [22], and [6],
include: Tinbergen [47] and [48], Goodwin [15], Black [4], Chakravarty [8]
and [9], Dasgupta [12], Horvat [18] and [19], Leontief [27], Meade [31],
Samuelson [36], Sen [38] and [39], Stone [45], and von Weizsäcker [52]. (Discussion
on "investment criteria" in the 1950s by Sen, Eckstein, and others, especially
in the Quarterly Journal of Economics, also belongs to this category of problem.)
An extension to the multisector model has been attempted since the pioneering work
by Samuelson and Solow [37]. More recent turnpike theorems obviously belong
to this category. We take up this topic later (Chapter 7, Section A). See also a further
extension of this multisector growth model by Gale [14]. We also discuss this
later (Chapter 7, Section B). The extension to the two-sector optimization model is
attempted by Kurz [26], Srinivasan [43], Stoleru [44], Johansen [20], Uzawa
[50] and [51], Atsumi [3], and so on. See also J. Z. Drabicki and A. Takayama,
"On the Optimal Growth of the Two Sector Economy," Krannert Institute Paper,
No. 383, January 1973.
4. We implicitly assume that the economy can "eat up" the existing stock of capital:
that is, the economy can increase the amount of consumption by reducing the existing
stock of capital. Cass [6], and Arrow and Kurz [2] considered the optimal growth
problem by explicitly banning this possibility.
5. It is often referred to in connection with the "Harrod-Domar model." This representation
of a production function implicitly assumes that labor is not scarce.
Harrod's and Domar's original models are more sophisticated than the one with such
an assumption.
6. This also implies either that Y is defined as "net" (rather than gross) national
product or that the capital good is assumed to last forever.
7. See [11], [29], and [30].
8. Note that $v = -(x_t - \bar{x})u''/u'$, which signifies the elasticity of marginal utility. The
crux of such a specification of the utility function is the constancy of this elasticity.
that the uniqueness of the optimal path is provided for by the strict concavity of
the function u.
17. Our Euler condition (11), under this target function, can be transformed as
$u'(x_t)\dot{k}_t = u(\hat{x}) - u(x_t)$. This is the Keynes-Ramsey rule which says that "the net increase in
capital per worker multiplied by the marginal utility of consumption per worker at
any time equals the excess of the maximum sustainable utility level over the current
utility level." See Koopmans [22], p. 243, and also pp. 272-273. As we will show
in the next section, his equation (28) corresponds to our equation (11). As Koopmans
has shown, the time necessary to reach the golden rule path is infinity.
REFERENCES
17. Hicks, J. R., Capital and Growth, Oxford, Clarendon Press, 1965.
18. Horvat, B., "The Optimum Rate of Saving: A Note," Economic Journal, LXVII,
March 1958.
19. Horvat, B., "The Optimum Rate of Investment," Economic Journal, LXVIII, December
1958.
20. Johansen, L., "Saving and Growth in Long-Term Programming Models," in Econo-
metric Analysis for National Economic Planning, ed. by Hart, P. E., Mills, G., and
Whitaker, J. K., London, Butterworth, 1964.
21. Koopmans, T. C., "Stationary Ordinal Utility and Impatience," Econometrica, 28,
April 1960.
22. Koopmans, T. C., "On the Concept of Optimal Economic Growth," in The Econometric
Approach to Development Planning, Pontificiae Academiae Scientiarum Scripta
Varia, Amsterdam, North-Holland, 1965 (also "Discussion," pp. 289-300).
23. Koopmans, T. C., "On Flexibility of Future Preferences," in Human Judgement and Optimality,
ed. by Bryan and Shelly, New York, Wiley, 1966.
24. Koopmans, T. C., "Objectives, Constraints and Outcomes in Optimal Growth Models,"
Econometrica, 35, January 1967.
25. Koopmans, T. C., Diamond, P. A., and Williamson, R. E., "Stationary Utility
and Time Perspective," Econometrica, 32, January-April 1964.
26. Kurz, M., "Optimal Paths of Capital Accumulation under Minimum Time
Objective," Econometrica, 33, January 1965.
27. Leontief, W., "Theoretical Note on Time Preference, Productivity of Capital,
Stagnation, and Economic Growth," American Economic Review, XLVIII, March
1958.
28. Leontief, W., "Time Preference and Economic Growth: A Reply," American Economic
Review, XLIX, December 1959.
29. Maneschi, A., "Optimal Savings with Finite Planning Horizon: A Note," Inter-
national Economic Review, 7, January 1966.
30. Maneschi, A., "Optimal Savings with Finite Planning Horizon: A Rejoinder," International
Economic Review, 7, January 1966.
31. Meade, J. E., Trade and Welfare: Mathematical Supplement, London, Oxford Uni-
versity Press, 1955.
32. Mirrlees, J., "Optimal Growth When Technology is Changing," Review of Economic
Studies, XXXIV, January 1967.
33. Phelps, E., "The Ramsey Problem and the Golden Rule of Accumulation," in Phelps,
Golden Rules of Economic Growth, New York, W. W. Norton, 1966.
34. Pontryagin, L. S., Boltyanskii, V. G., Gamkrelidze, R. V., and Mishchenko, E. R.,
The Mathematical Theory of Optimal Processes, New York, Interscience, 1962, (tr. by
Trirogoff and Neustadt from Russian).
35. Ramsey, F. P., "A Mathematical Theory of Saving," Economic Journal, XXXVIII,
December 1928.
36. Samuelson, P. A., "A Catenary Turnpike Involving Consumption and the Golden
Rule," American Economic Review, LV, June 1965.
37. Samuelson, P. A., and Solow, R. M., "A Complete Capital Model Involving Hetero-
geneous Capital Goods," Quarterly Journal of Economics, LXX, November 1956.
468 CALCULUS OF VARIATIONS AND OPTIMAL GROWTH OF AN AGGREGATE ECONOMY
38. Sen, A. K., "A Note on Tinbergen on the Optimum Rate of Saving," Economic
Journal, LXVII, December 1957.
39. Sen, A. K., "On Optimising the Rate of Saving," Economic Journal, LXXI, September
1961.
40. Shell, K., "Applications of Pontryagin's Maximum Principle to Economics," in
Mathematical Systems, Theory and Economics, ed. by H. W. Kuhn and G. P. Szego,
Berlin, Springer-Verlag, 1969.
41. Solow, R. M., "A Contribution to the Theory of Economic Growth," Quarterly
Journal of Economics, LXX, February 1956.
42. Srinivasan, T. N., "Investment Criteria and Choice of Techniques of Production,"
Yale Economic Essays, 2, Spring 1962.
43. Srinivasan, T. N., "Optimal Savings in a Two-Sector Model of Growth," Econometrica, 32, July
1964.
44. Stoleru, L. G., "An Optimal Policy for Economic Growth," Econometrica, 33, April
1965.
45. Stone, R., "Misery and Bliss: A Comparison of the Effect of Certain Forms of Savings
Behaviour on the Standard of Living of a Growing Community," Economia Internazionale,
VIII, Febbraio 1955.
46. Takayama, A., "On the Structure of the Optimal Growth Problem," Krannert
Institute Paper, Purdue University, No. 178, June 1967.
47. Tinbergen, J., "The Optimum Rate of Saving," Economic Journal, LXVI, December
1956.
48. Tinbergen, J., "Optimum Savings and Utility Maximization over Time," Econometrica,
28, April 1960.
49. Tobin, J., "Economic Growth as an Objective of Government Policy," American Economic
Review, LIV, May 1964.
50. Uzawa, H., "Optimal Growth in a Two-Sector Model of Capital Accumulation,"
Review of Economic Studies, XXXI, January 1964.
51. Uzawa, H., "Optimal Technical Change in an Aggregative Model of Economic Growth,"
International Economic Review, 5, January 1965.
52. von Weizsacker, C. C., "Existence of Optimal Programs of Accumulation for an
Infinite Time Horizon," Review of Economic Studies, XXXII, April 1965.
53. Westfield, F. M., "Time-Preference and Economic Growth: Comment," American
Economic Review, XLIX, December 1959.
54. Yaari, M. E., "On the Existence of an Optimal Plan in Continuous-time Allocation
Process," Econometrica, 32, October 1964.
a. INTRODUCTION
In Section D, we have assumed that time t is a continuum or, more specific-
ally, that it is represented by real numbers. The purpose of this section is to con-
struct a one-sector optimal growth model when time t is not a continuum but
(i) A rigorous formulation of the discrete time model for the present topic.
(ii) The illustration of the use of nonlinear programming for the present topic.
(iii) The derivation of some important additional results, in particular the existence
and uniqueness of the optimal attainable path and Brock's theorem on
sensitivities [2].
The existence theorem is not a particularly easy topic when we use the cal-
culus of variations and differential equations. However, when we use nonlinear
programming, the simple Weierstrass theorem is often sufficient for this purpose.
We have already illustrated sensitivity analysis in Section D for the case of a
constant capital:output ratio. We will now record general results with the proofs.
b. MODEL
We define the notations $L_t$, $K_t$, $X_t$, $I_t$, and so on, as we have done in the two
previous sections, except that t now refers to period t. The labor supply equation is
now written as
entire amount of $I_{t+1}$ is invested at the beginning of the $(t + 1)$th period. Hence
$K_{t+1}$, the stock of capital available for production in the $(t + 1)$th period, is written
as
(A-1) $f(0) = 0$, $0 < f'(k_t) < \infty$ and $f''(k_t) \le 0$ for all $k_t < \infty$.
For the meaning of these assumptions, the reader is referred to Section C. Note
that $f''(k_t) = 0$ (for all $k_t$) corresponds to the case in which the capital:output
ratio is a constant. Note also that (A-1) implies the following:
(A-1') $g(0) = 0$, $0 < g'(k_t) < \infty$ and $g''(k_t) \le 0$ for all $k_t < \infty$.
This, among other things, implies that the function $g$ is concave.
We assume that the economy is endowed with the stock of a good whose
per capita amount is equal to a. We also assume that the economy is required to
bequeath a stock of that good to the amount of b per capita at the end of the Tth
period. Thus we have the following conditions:
(9) $x_0 + k_0 = a$
and
(10) $k_T = b$
If $a = 0$, then $k_0 = 0$ as well as $x_0 = 0$, which in turn implies that $k_t = 0$ and $x_t = 0$
for all $t$ in view of $g(0) = 0$ and (7).$^{3}$ In order to avoid this uninteresting case, we
assume $a > 0$ and that
(A-2) (a) There exists a unique $\bar{k}$, $0 < \bar{k} < \infty$, such that $g(\bar{k}) - \bar{k} = 0$, or
(b) $g'' = 0$ for all $k_t > 0$ (and $\bar{k} = \infty$).
In terms of $f$, this can be expressed as follows:$^{4}$
(A-2') (a) There exists a unique $\bar{k}$, $0 < \bar{k} < \infty$, such that $f(\bar{k}) = \lambda\bar{k}$, where
$\lambda \equiv \mu + n$, or
(b) $f'' = 0$ for all $k_t > 0$ (and $\bar{k} = \infty$).$^{5}$
Recall that an assumption similar to part (a) of (A-2') was imposed in the Cass-
Koopmans model which we discussed in Section D.
Now consider the problem of finding a path such that $k_t = k > 0$ and
$x_t = x > 0$, for all $t = 0, 1, \ldots, T$ ($k$, $x$ are constants). That is, we ask whether
there exists a nonzero balanced growth path.$^{6}$ This problem is reduced to one of
finding $k > 0$, $x > 0$, such that
(11) $k + x = a$ and $x = g(k) - k$
It is clear from Figure 5.12 that such a path exists uniquely if $a < \bar{k}$. We call such a
path the balanced growth (or the golden age) path with respect to $a$, and we denote
it by $\{k^*(a), x^*(a)\}$. Note that $\bar{k} > k^*(a) > 0$ and $x^*(a) > 0$. We henceforth assume
(A-3) (a) $a < \bar{k}$, when $g'' < 0$ for all $k_t > 0$, or
(b) $g(k_t) - k_t > 0$ for all $k_t > 0$, when $g'' = 0$ for all $k_t > 0$.
It is important to note that this consideration implies that the economy is
capable of growing with strictly positive values of $X_t$ and $K_t$ (or $x_t$ and $k_t$),
as long as the initial condition satisfies (A-3), for we can then choose $x_t = x^*(a)$
and $k_t = k^*(a)$. On the other hand, we may consider the path of pure accumulation
or the path of subsistence with respect to $a$, which is defined by
(12) $k_0 = a$, $k_{t+1} = g(k_t)$, $t = 0, 1, \ldots, T - 1$,
$x_t = 0$, $t = 0, 1, \ldots, T$
The path of pure accumulation is illustrated in Figure 5.13.
When $a > \bar{k}$ and $f'' < 0$ [so that (A-3) is violated], then $k_t$ monotonically
declines to $\bar{k}$ in the path defined by (12), as $t$ increases. This is an uninteresting case
and may be considered as another justification of (A-3).$^{7}$
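The balanced growth path of (11) and the path of pure accumulation of (12) can be computed numerically. The following is a minimal sketch, assuming the hypothetical function $g(k) = 2\sqrt{k}$ (chosen only for illustration; it is not taken from the text, and it violates the bound $g'(k) < \infty$ at $k = 0$ only) and the hypothetical endowment $a = 2$. The helper name `bisect` is likewise an illustrative choice.

```python
import math


def g(k):
    # Hypothetical per capita function g, used only for illustration.
    return 2.0 * math.sqrt(k)


def bisect(fn, lo, hi, iters=200):
    """Locate a sign change of fn on [lo, hi] by repeated halving."""
    sign_lo = fn(lo) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (fn(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


a = 2.0  # hypothetical initial endowment, satisfying a < k_bar

# k_bar solves g(k) - k = 0 with k > 0, as in (A-2), part (a).
k_bar = bisect(lambda k: g(k) - k, 1.0, 100.0)

# Balanced growth path (11): k + x = a and x = g(k) - k, hence g(k) = a.
k_star = bisect(lambda k: g(k) - a, 0.0, k_bar)
x_star = a - k_star

# Path of pure accumulation (12): k_0 = a, k_{t+1} = g(k_t), x_t = 0.
T = 6
pure = [a]
for _ in range(T):
    pure.append(g(pure[-1]))

print(k_bar, k_star, x_star)
print(pure)
```

Under these assumptions the pure accumulation path rises monotonically toward $\bar{k}$ from below, which is the case permitted by (A-3).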
is called an attainable path starting from $a$ and ending at $b$. When only (13) is
imposed and (14) is disregarded, it is called a feasible path. Clearly the optimal
attainable path is the path which maximizes $U$ among the set of all attainable paths.
Note that the set of attainable paths can be empty, so that there may not exist a
solution for the above maximization problem. For example, $b$ may be so large that
the economy cannot attain it within the prescribed $T$ periods, even if $x_t = 0$
for all $t$ (the path of pure accumulation). We may denote the set of all the attainable
paths by $A(a, b; T)$. For the infinite horizon problem ($T \to \infty$), this set is denoted
by $A(a, \infty)$, or simply $A(a)$, where we do not impose a constraint such as
$\lim_{T \to \infty} k_T = b$.
We assume that the set of attainable paths is nonempty, for otherwise it is
meaningless to consider the problem. It can be shown that the attainable set
$A(a, b; T)$ is compact in the $(2T + 2)$-dimensional Euclidean space. To show this,
let $(k_t^q, x_t^q)$, $q = 1, 2, \ldots$, be a sequence such that
(15) $x_t^q + k_t^q = g(k_{t-1}^q)$, $t = 1, 2, \ldots, T$
(16) $x_0^q + k_0^q = a$, $k_T^q = b$
and
compact.$^{9}$ Thus, the attainable set $A(a, b; T)$ is nonempty and compact. Hence,
in view of the Weierstrass theorem (Theorem 0.A.18) and the continuity of $u$,
there always exists a solution for the above nonlinear programming problem. That
is, the existence of an optimal attainable path is demonstrated.$^{10}$
Next we consider the nonnegativity of the optimal attainable path. In fact,
we can show, under a certain assumption, that $\hat{k}_t > 0$, $\hat{x}_t > 0$, for all $t = 0, 1, \ldots, T$
(except possibly for $\hat{k}_T = b$, which can be zero), where $(\hat{k}_t, \hat{x}_t)$ denotes an optimal
attainable path. To consider this problem, first suppose that $b = k^*(a)$. Then it is
clear that the path $(k_t, x_t)$, in which $k_t = k^*(a) > 0$, $x_t = x^*(a) > 0$, $t = 0, 1, \ldots, T$,
is an attainable path. Then in view of the assumptions that $u(0) = -\infty$ and
$f(0) = 0$,$^{11}$ we have
(18) $\hat{k}_t > 0$, $\hat{x}_t > 0$, for all $t = 0, 1, \ldots, T$ (except possibly for $\hat{k}_T$)
That is, we have an "interior solution" for the above maximization problem. Now
suppose that $b < k^*(a)$. Then we can similarly conclude that we have an interior
solution [that is, (18) holds] since the path $(k_t, x_t)$ in which $k_t = k^*(a)$, $x_t = x^*(a)$,
$t = 0, 1, \ldots, T - 1$, and $k_T = b$, $x_T = g[k^*(a)] - b = a - b$, is an attainable path
and $k_t > 0$, $x_t > 0$, for all $t$ along this path. Henceforth we impose the following
assumption:
(A-5) $b \le k^*(a)$.
Note that if $b = 0$, then this condition is always satisfied. We leave it to the
interested readers to work out the implications of the case in which (A-5) is not
satisfied.
We now assert the uniqueness of an optimal path. Although we should be
able to assert this by way of the Lagrangian of the above maximization problem
and using assumptions such as $u'' < 0$ in (A-4) and $g' > 0$ in (A-1'), here we
will prove uniqueness directly from the problem because that method has applications
to some other problems; in particular, we will use it for the multisector case
(Chapter 7, Section B). First, for the sake of notational simplicity, write $x =
(x_0, x_1, \ldots, x_T)$ and $k = (k_0, k_1, \ldots, k_T)$, so that $U(x) = \sum_{t=0}^{T} u(x_t)(1 + \rho)^{-t}$.
We first assert that the strict concavity of $u$ (that is, $u'' < 0$) implies that the
optimal consumption path $\hat{x}$ is unique. To show this, suppose that $(\hat{k}, \hat{x})$ and
$(k', x')$ are two optimal attainable paths with $\hat{x} \ne x'$. That is, $U(\hat{x}) = U(x')$ and
$a - \hat{x}_0 - \hat{k}_0 = 0$, $g(\hat{k}_{t-1}) - \hat{x}_t - \hat{k}_t = 0$, $t = 1, 2, \ldots, T$, $\hat{k}_T - b = 0$, $a - x'_0 - k'_0 = 0$,
$g(k'_{t-1}) - x'_t - k'_t = 0$, $t = 1, 2, \ldots, T$, $k'_T - b = 0$. Define a new path $(\tilde{k}_t, \tilde{x}_t)$ by
(19) $\tilde{x}_t = \frac{1}{2}(\hat{x}_t + x'_t)$, $t = 0, 1, \ldots, T - 1$
(20) $\tilde{k}_0 = a - \tilde{x}_0$, $\tilde{k}_t = g(\tilde{k}_{t-1}) - \tilde{x}_t$, $t = 1, 2, \ldots, T$
(21) $\tilde{x}_T = g(\tilde{k}_{T-1}) - b$
Then $(\tilde{k}_t, \tilde{x}_t)$ is an attainable path. Note that $\tilde{k}_0 = \frac{1}{2}(\hat{k}_0 + k'_0)$. Hence, from the
concavity of $g$, we obtain
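Hence we obtain $\tilde{k}_2 \ge \frac{1}{2}(\hat{k}_2 + k'_2)$. Here the equality holds if $\hat{k}_0 = k'_0$ and $\hat{k}_1 = k'_1$.
Repeating the above argument, we obtain
(24) $\tilde{x}_T + b = g(\tilde{k}_{T-1}) \ge g(\frac{1}{2}\hat{k}_{T-1} + \frac{1}{2}k'_{T-1}) \ge \frac{1}{2}g(\hat{k}_{T-1}) + \frac{1}{2}g(k'_{T-1})$
Since $g(\hat{k}_{T-1}) = \hat{x}_T + b$ and $g(k'_{T-1}) = x'_T + b$, this gives $\tilde{x}_T \ge \frac{1}{2}(\hat{x}_T + x'_T)$, so that
$U(\tilde{x}) > U(\hat{x})$ by the strict concavity of $u$ and $u' > 0$,
which is a contradiction. Note that the above consideration does not preclude
the possibility that $\hat{k} \ne k'$. We now show that this is impossible. Suppose that
$(\hat{x}, \hat{k})$ and $(\hat{x}, k')$ are two optimal attainable paths. Then observe from the attainability
that
(26) $g(\hat{k}_{t-1}) - \hat{k}_t = g(k'_{t-1}) - k'_t$, $t = 1, 2, \ldots, T$
Since $\hat{k}_T = k'_T = b$, we have $\hat{k}_{T-1} = k'_{T-1}$ as a result of the monotonicity of $g$.$^{12}$ Then
using the relation (26) successively, we obtain $\hat{k}_t = k'_t$, $t = 0, 1, 2, \ldots, T$, which
in turn is consistent with $a - \hat{k}_0 = a - k'_0$. Thus we obtain $\hat{k} = k'$. Note that in the
above proof the crucial assumption is $g' > 0$ and not $g'' < 0$. That is, the strict
concavity of $g$ (or $g'' < 0$) is not crucial.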
Having demonstrated the existence and the uniqueness of the optimal
attainable path, we now proceed to the characterization of such a path. To ease
the notation, we define the following $h$-functions within the constraints of the
above maximization problem:
(27) $h_0(k_0, x_0) = a - x_0 - k_0$
(28) $h_t(k_{t-1}, k_t, x_t) = g(k_{t-1}) - x_t - k_t$, $t = 1, 2, \ldots, T$
Note that $k_T$ may be replaced by $b$, so that it can be dropped from the list of the
control variables.$^{13}$ In order to obtain the first-order characterization of the
above maximization problem, we next examine the rank constraint qualification.$^{14}$
For this purpose, define the $(T + 1) \times (2T + 1)$ matrix $H$ as
(29) $H = \begin{bmatrix} \frac{\partial h_0}{\partial k_0} & \cdots & \frac{\partial h_0}{\partial k_{T-1}} & \frac{\partial h_0}{\partial x_0} & \cdots & \frac{\partial h_0}{\partial x_T} \\ \vdots & & \vdots & \vdots & & \vdots \\ \frac{\partial h_T}{\partial k_0} & \cdots & \frac{\partial h_T}{\partial k_{T-1}} & \frac{\partial h_T}{\partial x_0} & \cdots & \frac{\partial h_T}{\partial x_T} \end{bmatrix}$
Since the number of effective constraints for the above problem is $(T + 1)$, the
rank constraint qualification of this problem states that the rank of matrix $H$
should be equal to $(T + 1)$. It is easy to see that this condition is in general
satisfied by actually computing $H$ in view of (27) and (28) (and evaluating the
matrix along the optimal attainable path).
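The rank condition can be checked by direct computation. The sketch below builds $H$ from (27) and (28) for $T = 3$ and computes its rank by Gaussian elimination. It assumes the hypothetical function $g(k) = 2\sqrt{k}$ and an arbitrary positive capital path, neither of which is taken from the text; the function names are illustrative. Since each $h_t$ contains $-x_t$, the columns for $x_0, \ldots, x_T$ form the negative of an identity matrix, which is why the rank equals $T + 1$ regardless of the path.

```python
import math


def g_prime(k):
    # Derivative of the hypothetical g(k) = 2*sqrt(k).
    return 1.0 / math.sqrt(k)


def constraint_jacobian(k):
    """Jacobian of h_0, ..., h_T with respect to (k_0..k_{T-1}, x_0..x_T).

    k is the list [k_0, ..., k_{T-1}]; k_T is fixed at b and is not a variable.
    """
    T = len(k)
    H = [[0.0] * (2 * T + 1) for _ in range(T + 1)]
    for t in range(T + 1):
        if t < T:
            H[t][t] = -1.0                    # dh_t / dk_t
        if t >= 1:
            H[t][t - 1] = g_prime(k[t - 1])   # dh_t / dk_{t-1}
        H[t][T + t] = -1.0                    # dh_t / dx_t
    return H


def rank(matrix, tol=1e-12):
    """Rank by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in matrix]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = max(range(r, rows), key=lambda i: abs(M[i][c]))
        if abs(M[pivot][c]) < tol:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            for j in range(c, cols):
                M[i][j] -= factor * M[r][j]
        r += 1
        if r == rows:
            break
    return r


H = constraint_jacobian([1.0, 1.2, 0.9])  # arbitrary positive k_0, k_1, k_2
print(len(H), len(H[0]), rank(H))
```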
We now define the Lagrangian function $L$ of the above problem as
(30) $L = \sum_{t=0}^{T} u(x_t)(1 + \rho)^{-t} + \sum_{t=0}^{T} p_t h_t$
where $p_0, p_1, \ldots, p_T$ are the Lagrangian multipliers. Since the constraint qualification
is satisfied, the following first-order condition gives a set of necessary conditions
for an optimum (recall Chapter 1, Section D):
(31) $\partial L/\partial k_t = -p_t + p_{t+1} g'(\hat{k}_t) = 0$, $t = 0, 1, \ldots, T - 1$
(32) $\partial L/\partial x_t = u'(\hat{x}_t)(1 + \rho)^{-t} - p_t = 0$, $t = 0, 1, 2, \ldots, T$
Note that we do not have the inequalities $\partial L/\partial k_t \le 0$ and $\partial L/\partial x_t \le 0$, for we
ruled out the corner solution (that is, $\hat{k}_t = 0$, $\hat{x}_t = 0$, for some $t$) by (A-5) and
the assumptions $u(0) = -\infty$, $f(0) = 0$. Note that we have $p_t > 0$ for all $t = 0,
1, 2, \ldots, T$ in view of (32) and the assumption $u'(x_t) > 0$ for all $x_t$. There are
$(2T + 1)$ conditions in (31) and (32). These together with the $(T + 1)$ constraints
(that is, $h_t = 0$, $t = 0, 1, \ldots, T$) determine the optimal value of the $(2T + 1) +
(T + 1)$ variables (that is, $k_t$, $t = 0, 1, \ldots, T - 1$; $x_t$, $p_t$, $t = 0, 1, \ldots, T$).
It is easy to show that (31) and (32) together with the constraints (13) and (14)
also give a set of sufficiency conditions for a (global) optimum in view of the concavity
of $u$ and $g$. Since the optimal attainable path is unique, these conditions give
a set of necessary and sufficient conditions for a unique global optimum. We now
turn to a study of these conditions. First we note that conditions (31) and (32) can
be rewritten as
(33) $u'(\hat{x}_t) = u'(\hat{x}_{t+1}) g'(\hat{k}_t)/(1 + \rho)$, $t = 0, 1, \ldots, T - 1$
There are $T$ conditions in (33), which, together with the $(T + 2)$ conditions in (13)
and (14), completely determine the value of the $(2T + 2)$ variables $\hat{k}_t$, $\hat{x}_t$, $t = 0,
1, \ldots, T$.
The economic meaning of (33) is easy to see. By reducing consumption by
one unit of the good, the loss of utility is $u'(\hat{x}_t)$ for the $t$th period. By investing
this one unit, there is a gain in "net" output$^{15}$ by the amount of $g'(\hat{k}_t)$. This, in
turn, gives an increase in utility by the amount of $u'(\hat{x}_{t+1}) g'(\hat{k}_t)/(1 + \rho)$. Hence
the equality (33) gives nothing but the competitive intertemporal arbitrage condition.
It is easy to rewrite (33) in the following equivalent form:
(33') $u'(\hat{x}_{t+1}) - u'(\hat{x}_t) = -\frac{u'(\hat{x}_{t+1})}{1 + \rho}[g'(\hat{k}_t) - (1 + \rho)]$
It should be clear that this corresponds to the Euler equation obtained in Section
D of this chapter.
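The way in which (33), together with feasibility and the two boundary conditions, pins down the whole path can be illustrated by a simple shooting computation. The following is a minimal sketch, assuming the hypothetical forms $u(x) = \log x$ and $g(k) = 2\sqrt{k}$ and the hypothetical values $\rho = 0.05$, $a = 2$, $b = 0.5$, $T = 5$ (none of these is taken from the text). It guesses $x_0$, rolls the arbitrage condition forward, and bisects on $x_0$ until $k_T = b$; it then recovers the prices from (32) and checks (31).

```python
import math

RHO = 0.05  # hypothetical discount rate


def g(k):
    return 2.0 * math.sqrt(k)      # hypothetical g


def g_prime(k):
    return 1.0 / math.sqrt(k)


def u_prime(x):
    return 1.0 / x                 # marginal utility for u(x) = log x


def roll_forward(x0, a, T):
    """Apply x_0 + k_0 = a, then (33) and x_{t+1} + k_{t+1} = g(k_t).

    Returns (k, x), or None if some k_t with t < T is not positive.
    """
    k, x = [a - x0], [x0]
    for t in range(T):
        if k[t] <= 0.0:
            return None
        # (33) solved for x_{t+1} when u'(x) = 1/x.
        x.append(x[t] * g_prime(k[t]) / (1.0 + RHO))
        k.append(g(k[t]) - x[t + 1])
    return k, x


def optimal_path(a, b, T, iters=200):
    """Bisect on x_0: a larger x_0 lowers every later k_t, including k_T."""
    lo, hi = 0.0, a
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        path = roll_forward(mid, a, T)
        if path is not None and path[0][T] >= b:
            lo = mid
        else:
            hi = mid
    return roll_forward(lo, a, T)


a, b, T = 2.0, 0.5, 5
k, x = optimal_path(a, b, T)
# Multipliers from (32): p_t = u'(x_t) (1 + rho)^(-t).
p = [u_prime(x[t]) * (1.0 + RHO) ** (-t) for t in range(T + 1)]
# Residuals of (31): -p_t + p_{t+1} g'(k_t), which should vanish.
residuals = [-p[t] + p[t + 1] * g_prime(k[t]) for t in range(T)]
print(k)
print(x)
print(residuals)
```

The bisection relies on the monotone dependence of $k_T$ on $x_0$ under these particular functional forms; for other specifications a different root-finding scheme may be needed.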
We summarize some of the results obtained above.
Theorem 5.D.2: Under assumptions (A-1), (A-2), (A-3), (A-4), and (A-5), we have
the following:
(i) The balanced growth path [k*(a), x*(a)] starting from a > 0 exists, is unique,
and k*(a) > 0, x*(a) > 0.
(ii) The optimal attainable path $(\hat{k}_t, \hat{x}_t)$ starting from $a > 0$ and ending at $k_T = b \ge 0$
exists, is unique, and $\hat{k}_t > 0$, $\hat{x}_t > 0$ for all $t = 0, 1, 2, \ldots, T$, with the possible
exception that $\hat{k}_T = b = 0$.
(iii) A necessary and sufficient condition for the path to be optimal and attainable is
given by (33), (13), and (14).
Definition: The attainable path $(\hat{k}_t, \hat{x}_t)$ starting from $a$ and ending at $b$ is called
competitive if there exist nonnegative numbers ("prices") $p_t$ such that
$\sum_{t=0}^{T} u(x_t)(1 + \rho)^{-t} + p_0(a - x_0 - k_0) + \sum_{t=1}^{T} p_t[g(k_{t-1}) - x_t - k_t]$
for all $k_t, x_t \ge 0$, $t = 0, 1, \ldots, T$. Set $x_t = \hat{x}_t$ for all $t = 0, 1, \ldots, T$, except for
$t = t_0$. Then we obtain (34), the first condition of competitiveness, since the
choice of $t_0$ is arbitrary. Next set $k_t = \hat{k}_t$ for all $t = 1, \ldots, T$, except for $t = t_0$,
and $x_t = \hat{x}_t$ for all $t = 0, 1, \ldots, T$. Since the choice of $t_0$ is arbitrary, we
establish (35), the second condition of competitiveness. In other words, we
established that optimality implies competitiveness.
To show the converse, first note the following simple identity:
(38) $p_0 a - p_T b = p_0 a - p_T b$, where $k_T = \hat{k}_T = b$
Then summing both sides of (34), (35), and (38), we obtain (37), which
establishes the converse. (Q.E.D.)
We now turn to the problem in which $t$ becomes "very large," which is
obviously meaningful when $T$ is large enough. Define $k(\rho)$ and $x(\rho)$ by
(39) $g'[k(\rho)] = 1 + \rho$
and
(40) $x(\rho) = g[k(\rho)] - k(\rho)$
Here we assume $g''(k) < 0$ for all $k$ and impose assumption (A-2), part (a). Then
$[k(\rho), x(\rho)]$ defines the modified golden rule path for the present discrete time
model. Note that $0 < k(\rho) < \bar{k}$. Recall now the two basic equations of the
present model, that is, (7) for feasibility and (33') for optimality. Then, in view
of $g''(k_t) < 0$ for all $k_t$ and $u''(x_t) < 0$ for all $x_t$, we can easily conclude from
(33') that
We can now draw a phase diagram similar to the one used in Section D. Applying
our argument of the "eligibility condition," we can obtain the same conclusion we
obtained in Section D, which we list as another corollary of Theorem 5.D.2.
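The modified golden rule path defined by (39) and (40) is straightforward to compute once $g$ is specified. The following is a minimal sketch, assuming the hypothetical function $g(k) = 2\sqrt{k}$ (for which $\bar{k} = 4$) and the hypothetical discount rate $\rho = 0.05$; neither is taken from the text. Because $g'' < 0$, $g'$ is strictly decreasing, so (39) can be solved by bisection.

```python
import math


def g(k):
    return 2.0 * math.sqrt(k)      # hypothetical g


def g_prime(k):
    return 1.0 / math.sqrt(k)


def modified_golden_rule(rho, lo=1e-9, hi=100.0, iters=200):
    """Solve g'(k) = 1 + rho, equation (39), then apply (40)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g_prime(mid) > 1.0 + rho:
            lo = mid       # g' still too large, so k must increase
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    return k, g(k) - k


k_rho, x_rho = modified_golden_rule(0.05)
print(k_rho, x_rho)
```

For this particular $g$ the solution is available in closed form, $k(\rho) = 1/(1 + \rho)^2$, which lies strictly between 0 and $\bar{k} = 4$.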
mous" changes in $u$ and $g$; that is, $u(x_t)$ can be replaced by $u(x_t, t)$ and $g(k_t)$ can be
written as $g(k_t, t)$. The second argument $t$ in these functions signifies the autonomous
shifts of these functions. For example, $u(x_t, t)$ can involve the case in which
there is a change in the discount rate as well as taste over time, and $g(k_t, t)$ can
mean technological progress. We may then rewrite our feasibility condition (7)
accordingly as
(7'') $x_{t+1} + k_{t+1} = g(k_t, t)$, $t = 0, 1, \ldots, T - 1$
It can be shown fairly easily, by repeating our earlier argument, that the optimality
condition (33) can be rewritten accordingly as
(46) $\frac{P_0(b_1)}{P_1(b_1)} = \frac{g'_0(b_1)}{1 + \rho} > \frac{g'_0(b_2)}{1 + \rho} = \frac{P_0(b_2)}{P_1(b_2)}$
Hence $P_1(b_1) < P_1(b_2)$, which means $\hat{x}_1(b_1) > \hat{x}_1(b_2)$. Then, together with (45-c)
and feasibility, (13),
(48) $k_1(b_1) - k_1(b_2) = [g_0(b_1) - \hat{x}_1(b_1)] - [g_0(b_2) - \hat{x}_1(b_2)]$
We now compare two optimal attainable paths, both starting from $a$ and
ending at 0 with the only difference being the planning horizon, $T$ for one and
$(T + 1)$ for the other. Then $k_{T+1}^{T+1}(0) = k_T^T(0) = 0$, but $k_T^{T+1}(0) > 0$; for if
$k_T^{T+1}(0) = 0$, then $x_{T+1}^{T+1}(0) = 0$, which implies $u[x_{T+1}^{T+1}, T + 1] = -\infty$.
Next observe that
(52) $k_t^{T+1}(0) = k_t^T(k_T^{T+1}(0))$, $t = 0, 1, \ldots, T$
for otherwise one can always increase the value of $U$ for the $(T + 1)$-period
program with $(k_{T+1}^{T+1} = 0)$ by following the path $k_t^T(k_T^{T+1}(0))$, $t = 0, \ldots, T$
(that is, up to the $T$th period). Since $k_T^{T+1}(0) > 0$, (52) implies
and
In other words, an increase in the planning horizon with zero terminal stocks
always increases the optimal (per capita) stock for each t and decreases the
optimal (per capita) consumption for each t.
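This monotonicity in the planning horizon can be observed numerically. The sketch below is only an illustration under assumed functional forms, namely the hypothetical $u(x) = \log x$, $g(k) = 2\sqrt{k}$, $\rho = 0.05$, and $a = 2$ (none taken from the text). It solves the zero-terminal-stock problem for several horizons by shooting on $x_0$ with the arbitrage condition (33), and compares the resulting paths period by period.

```python
import math

RHO = 0.05  # hypothetical discount rate


def g(k):
    return 2.0 * math.sqrt(k)      # hypothetical g


def g_prime(k):
    return 1.0 / math.sqrt(k)


def roll_forward(x0, a, T):
    """Generate (k, x) from x_0 using (33) with u = log and feasibility."""
    k, x = [a - x0], [x0]
    for t in range(T):
        if k[t] <= 0.0:
            return None            # capital exhausted before the horizon
        x.append(x[t] * g_prime(k[t]) / (1.0 + RHO))
        k.append(g(k[t]) - x[t + 1])
    return k, x


def optimal_path(a, b, T, iters=200):
    """Bisect on x_0 until the terminal stock k_T equals b."""
    lo, hi = 0.0, a
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        path = roll_forward(mid, a, T)
        if path is not None and path[0][T] >= b:
            lo = mid
        else:
            hi = mid
    return roll_forward(lo, a, T)


# Optimal paths with zero terminal stock for horizons T = 3, ..., 7.
paths = {T: optimal_path(a=2.0, b=0.0, T=T) for T in range(3, 8)}

for T in range(3, 7):
    k_short, x_short = paths[T]
    k_long, x_long = paths[T + 1]
    print(T, [round(k_long[t] - k_short[t], 6) for t in range(T + 1)])
```

Each printed list contains the differences $k_t^{T+1}(0) - k_t^T(0)$, which are positive for every $t$ in this example.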
Next note that
(56) $k_t^T(0) < \bar{k}_t$ for all $t = 0, 1, \ldots, T$; $T = 1, 2, \ldots$
where $\bar{k}_t$ denotes $k_t$ in the path of pure accumulation. Hence, for each $t$, $k_t^T(0)$
is a monotone increasing sequence with respect to $T$, which is bounded from
above. Hence $\lim_{T \to \infty} k_t^T(0)$ exists. Denote this by $\hat{k}_t$. Also $x_t^T(0)$ is a monotone
decreasing sequence with respect to $T$, which is bounded from below by zero.
Hence $\lim_{T \to \infty} x_t^T(0)$ exists. Denote this by $\hat{x}_t$. The path $(\hat{k}_t, \hat{x}_t)$ is attainable
in view of the continuity of $g$.
This convergence is the essential result in the sensitivity analysis with
respect to $T$, for it asserts that the distance between $k_t^T(0)$ and $k_t^{T'}(0)$ and the
distance between $x_t^T(0)$ and $x_t^{T'}(0)$ can be made arbitrarily small, at least for
certain initial periods, when $T$ and $T'$ are sufficiently large. Also note that $(\hat{k}_t, \hat{x}_t)$
corresponds to the path obtained by Koopmans and Cass for the infinite horizon
problem (which also implies that $\hat{k}_t$ does not converge to 0 when $t \to \infty$).
Theorem 5.D.3 (Brock): Under (A-1) and (A-4), we have (51) and (55), and the
optimal attainable path for the $T$-period problem with no terminal stock requirements,
$[k_t^T(0), x_t^T(0)]$, converges (for each $t$) to a limit path $(\hat{k}_t, \hat{x}_t)$ as $T \to \infty$.
Can we assert that $[k_t^T(b), x_t^T(b)]$ also converges to $(\hat{k}_t, \hat{x}_t)$ when $b > 0$, as
$T \to \infty$? Brock [2] proved the following corollary (his theorem 3), which asserts,
in essence, that if $b$ is below a certain value, then such a convergence holds.
Corollary: If $b < \lim_{t \to \infty} \hat{k}_t$, then $\lim_{T \to \infty} k_t^T(b)$ exists and equals $\hat{k}_t$.
Write $\liminf_{q \to \infty} x^q = \underline{x}$. Then we can show easily that $\underline{x} \in X$, and that if $a < \underline{x}$ then
there exists an integer $N$ such that $q \ge N$ implies $x^q > a$. (That is, for a given
$\epsilon > 0$, there exists an $N$ such that $q \ge N$ implies $x^q > \underline{x} - \epsilon$.) Similar results
hold for lim sup. We now turn to the proof of the above corollary.
PROOF: By assumption, $b < \lim_{t \to \infty} \hat{k}_t$. Hence there exists a $T_0$ such that
(59) $\hat{k}_t > b$, for $t > T_0$
For $T > T_0$, choose $N$, which depends on $T$, such that
with equality when $b = 0$. Observe that $k_t^T(0) \to \hat{k}_t$ as $T \to \infty$ and $k_t^N(0) \to \hat{k}_t$ as
$N \to \infty$. Since $N \to \infty$ as $T \to \infty$, (62) implies that $k_t^T(b) \to \hat{k}_t$ (for each $t$) as
$T \to \infty$. (Q.E.D.)
REMARK: This corollary shows that as long as $b$ is below the bound $\lim_{t \to \infty}
\hat{k}_t$, the optimal attainable path is insensitive to changes in the terminal stock $b$,
for certain initial periods. In other words, as long as $b_1$ and $b_2$ are below the
bound, the distance between $k_t^T(b_1)$ and $k_t^T(b_2)$ can be made arbitrarily small
for certain initial periods by choosing $T$ sufficiently large.
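The insensitivity described in the remark can be seen in a small numerical experiment. The following sketch is an illustration under assumed forms only: the hypothetical $u(x) = \log x$, $g(k) = 2\sqrt{k}$, $\rho = 0.05$, $a = 2$, and the two hypothetical terminal stocks $b_1 = 0$ and $b_2 = 0.5$ (none taken from the text). It measures the largest gap between the two optimal capital paths over the first three periods as the horizon $T$ is lengthened.

```python
import math

RHO = 0.05  # hypothetical discount rate


def g(k):
    return 2.0 * math.sqrt(k)      # hypothetical g


def g_prime(k):
    return 1.0 / math.sqrt(k)


def roll_forward(x0, a, T):
    """Generate (k, x) from x_0 using (33) with u = log and feasibility."""
    k, x = [a - x0], [x0]
    for t in range(T):
        if k[t] <= 0.0:
            return None
        x.append(x[t] * g_prime(k[t]) / (1.0 + RHO))
        k.append(g(k[t]) - x[t + 1])
    return k, x


def optimal_path(a, b, T, iters=200):
    """Bisect on x_0 until the terminal stock k_T equals b."""
    lo, hi = 0.0, a
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        path = roll_forward(mid, a, T)
        if path is not None and path[0][T] >= b:
            lo = mid
        else:
            hi = mid
    return roll_forward(lo, a, T)


def early_gap(T, b1=0.0, b2=0.5, a=2.0, periods=3):
    """Largest |k_t^T(b1) - k_t^T(b2)| over the first few periods."""
    k1, _ = optimal_path(a, b1, T)
    k2, _ = optimal_path(a, b2, T)
    return max(abs(k1[t] - k2[t]) for t in range(periods))


gaps = [early_gap(T) for T in (4, 8, 12)]
print(gaps)
```

In this example the gap shrinks as $T$ grows, in line with the remark; the rate at which it shrinks depends on the assumed functional forms.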
FOOTNOTES
1. This does not preclude the importance of period analysis in the empirical contexts.
For example, in empirical econometric research one is often forced to use period
analysis since the time series data are tabulated at discrete time intervals (for example,
GNP). In optimization models, one may be able to change the policy (or control)
variables only at discrete time intervals as a result of practical considerations. Then
such a time interval may define the period.
2. We should note that our specification of the model is not the only way to produce
equation (3). Our point here is simply that we should make the specification explicit
to avoid possible misunderstanding.
3. We here adopt the convention that zero is the subsistence level of consumption.
Hence xt = 0 for all t means that the economy is at the subsistence level all the time.
4. In the literature, the following alternative assumptions are used in place of (A-2'),
part (a): $f'(0) = \infty$ and $f'(\infty) = 0$. See Sections C and D. Clearly this implies the
present assumption, (A-2'), part (a).
5. This is the case of a constant capital:output ratio. Obviously for such a case, (A-2),
part (a), cannot be imposed.
6. If $k_{t_0} = 0$ for some $t_0$, then $f(0) = 0$ implies $x_t = 0$, $k_t = 0$ for all $t > t_0$. This is an
uninteresting case of "balanced growth."
7. See Koopmans [6], p. 237, for example. When $a = \bar{k}$, then $k_t = \bar{k}$ and $x_t = 0$ for all $t$
except possibly for $t = T$ and $T - 1$. This is again a trivial and uninteresting case.
8. Recall the definition of continuous functions and note that the linear (affine) func-
tions are continuous everywhere in the domain. The function g is continuous because
it is differentiable.
9. Each of $k_t$, $x_t$, $t = 0, 1, 2, \ldots, T$, is bounded from below by 0; $k_t$, $t = 0, 1, 2, \ldots, T$, are
bounded from above by the path of pure accumulation; $x_t$, $t = 0, 1, 2, \ldots, T$, are
bounded from above by $g'(k_t) < \infty$ for all $k_t < \infty$ and equation (7). Let $\bar{x}_t$, $\bar{k}_t$, $t = 0,
1, 2, \ldots, T$, be these upper bounds and consider rectangles $S_t$ in $R^2$ defined by $S_t =
\{(x_t, k_t): 0 \le k_t \le \bar{k}_t, 0 \le x_t \le \bar{x}_t\}$, $t = 0, 1, \ldots, T$. Clearly the $S_t$'s are compact;
hence in view of Tychonoff's theorem $\prod_{t=0}^{T} S_t$ is also compact in $R^{2T+2}$ with
respect to the product topology (Theorem 0.A.15). As a closed subset of the compact
set, $A(a, b; T)$ is compact.
10. Consider $S_t$, $t = 0, 1, 2, \ldots$, ad inf. Define $S \equiv \prod_{t=0}^{\infty} S_t$. Then $S$ is again compact as
a result of the Tychonoff theorem. Hence $A(a)$ with $T \to \infty$ is compact as a closed subset
of $S$. Hence, using the Weierstrass theorem again, we can assert the existence of an
optimal attainable program for the infinite horizon problem ($T = \infty$), as long as
$U(x_0, x_1, \ldots) \equiv \sum_{t=0}^{\infty} u(x_t)(1 + \rho)^{-t}$ remains continuous and bounded. This argu-
REFERENCES
1. Beals, R., and Koopmans, T. C., "Maximizing Stationary Utility in a Constant
Technology," SIAM Journal of Applied Mathematics, 17, September 1969.
2. Brock, W. A., "Sensitivity of Optimal Growth Paths with Respect to a Change in
Target Stocks," Zeitschrift für Nationalökonomie, Supp. 1, 1971 (originally presented
at the Purdue Meeting of the Kansas-Missouri Seminar on Quantitative
Economics, October 1969).
3. Brock, W. A., and Gale, D., "Optimal Growth under Factor Augmenting Progress,"
Journal of Economic Theory, vol. 1, October 1969.
4. Gale, D., "On Optimal Development in a Multisector Economy," Review of Economic
Studies, XXXIV, January 1967.
5. Gale, D., and Sutherland, W. R., "Analysis of a One Good Model of Economic
Development," in Mathematics of the Decision Sciences, Part 2, ed. by G. B. Dantzig
and A. F. Veinott, Providence, R. I., American Mathematical Society, 1968.
6. Koopmans, T. C., "On the Concept of Optimal Economic Growth," in The Econometric
Approach to Development Planning, Pontificiae Academiae Scientiarum Scripta Varia,
Amsterdam, North-Holland, 1965.
7. Samuelson, P. A., "A Turnpike Refutation of the Golden Rule in a Welfare-Maximiz-
ing Many Year Plan," in Essays on the Theory of Optimal Economic Growth, ed. by K.
Shell, Cambridge, Mass., M.I.T. Press, 1967.
MULTISECTOR MODELS OF ECONOMIC
GROWTH
Section A
THE VON NEUMANN MODEL
a. INTRODUCTION
If we had to name the most important immediate forerunner of modern
mathematical economics, we would not hesitate to choose John von Neumann for
his model of economic growth presented in his 1937 paper [22]. Not only did this
paper provide the first explicit nonaggregate model in capital and growth theory,
but also it presented (1) the first explicit activity analysis model of production and
(2) the first abstract model of a competitive economy (together with Wald's model
in his papers published in 1935 and 1936), which led to the models of the 1950s (see
Chapter 2, Section E). In addition to the modern character of the model, the prob-
lems that von Neumann dealt with are also modern. In particular, he was con-
cerned with the path that gives the maximal rate of growth and the price implica-
tion of such a path. In the first part of this section, we present the model as
formulated now so as to convey its modern character. In subsection b, we prove his
major results in an elementary fashion.
Because of the innovative character of the paper, many papers have been
written on the von Neumann models, but we restrict ourselves to his growth model
as such and exclude its impact on other developments such as activity analysis and
the theory of competitive markets.
A major effort has been devoted to simplifying the proof of von Neumann's
major theorem in the original paper. To prove his theorem von Neumann used
Brouwer's fixed point theorem. Later, Loomis [15], Georgescu-Roegen [5],
Gale [3], and Karlin [11] all provided elementary proofs. The proof of the
existence of the balanced growth path with the maximal growth rate can be
separated from the proof of the price implication of such a path. The essence
of the proof of the existence is simply to utilize the compactness of the relevant
set of production. Our proof in subsection b also follows this line. In this con-
nection, we may note a formal similarity to the theory of nonnegative matrices in
which the existence of the Frobenius root can be proved by utilizing the compact-
ness of a certain set, although it can also be proved by using Brouwer's fixed
point theorem (see Chapter 4). For the proof of the price implication of the
maximal rate balanced growth path, Georgescu-Roegen, Gale, and Karlin made
direct use of the separation theorem of convex sets. We use, instead, the funda-
mental theorem of concave functions (Chapter 1, Section B). It is true that this
theorem is derived from the separation theorem, but the use of this fundamental
theorem will avoid many steps in the proof (which is hence much simpler) compared
with those necessary in the proof which uses the separation theorem directly.
Another major effort devoted to the von Neumann model is in the direction
of the generalization of the model and its results. Essentially there are three weak-
nesses in the original paper:
(i) There is no discussion of the irreducibility of the model, and there is no proof
that the value of output at each time period is positive.
(ii) The growth paths that von Neumann considered were restricted to the "balanced
growth paths," that is, paths in which all commodities grow at the
same rate.
(iii) There is no explicit treatment of consumption.
period all the commodities are simultaneously fed into the process and at the
end of the period all the commodities are simultaneously produced. In reality, such
a simultaneous input or output rarely occurs. More commonly, various com-
modities are fed into the process at different points in time and various commodi-
ties are produced from time to time. However, such "successive" inputs and out-
puts can be handled simply as follows. Consider a production process in which
inputs go into the production process at time $t_0$ and $t_1$ and outputs come out at time
$t_2$ and $t_3$. Then we can "decompose" such a production process into three
"steps." In other words, the commodities that are fed in at time $t_0$ produce
certain "intermediate commodities," and at $t_1$, new inputs, together with these
intermediate commodities, are fed in. At time $t_2$ certain commodities are produced
as final outputs together with the higher order intermediate goods. At time
$t_3$ only the final outputs are produced. Hence, including these intermediate commodities
in the classification of commodities, in each of the production periods
$(t_0, t_1)$, $(t_1, t_2)$, $(t_2, t_3)$, all the commodities are simultaneously fed in at the
beginning of the period and produced at the end of the period. We only consider
such "decomposed" processes.
The second apparent difficulty in the concept of a production period lies
in the difference in the actual time period from process to process. This can
simply be handled as follows. Suppose there are only three processes z1, z2, and
z3 such that z1 takes 30 days, z2 takes 60 days, and z3 takes 45 days. Take the
greatest common divisor of these three periods {30, 60, 45}-that is, 15-and
define this 15 days as a unit period of production. Then z1 is decomposed into
two steps. In other words, at the end of the first period (that is, at the end of 15
days), the process produces certain "intermediate" (or unfinished) commodities,
and during the second period of production these intermediate commodities
are all transformed into the final commodities of the process z1. Thus, at the end
of the second period (that is, at the end of 30 days), this process z1 is completed.
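The choice of the unit period can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `unit_period` and `steps_per_process` are hypothetical and do not appear in the text; the durations are those of the example above.

```python
from functools import reduce
from math import gcd

# Durations (in days) of the three processes in the example above.
durations = {"z1": 30, "z2": 60, "z3": 45}

def unit_period(days_by_process):
    """Greatest common divisor of all process durations (hypothetical helper)."""
    return reduce(gcd, days_by_process.values())

def steps_per_process(days_by_process):
    """Number of unit periods ("steps") into which each process is decomposed."""
    unit = unit_period(days_by_process)
    return {name: days // unit for name, days in days_by_process.items()}

unit = unit_period(durations)
steps = steps_per_process(durations)
print(unit)   # 15
print(steps)  # {'z1': 2, 'z2': 4, 'z3': 3}
```

With a 15-day unit period, z1 needs one layer of "intermediate" commodities, z3 needs two, and z2 needs three.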
Hence, by choosing the unit of period properly and by properly including
the "intermediate" commodities in the list of commodities, we can avoid the two
difficulties and can proceed meaningfully to our analysis. Time in this economy
elapses with the succession of such production periods, and each production
process is in the technology set T. We assume that this set T is constant over
time. This implies, among other things, that there is no technological progress
in the economy.
In the original von Neumann presentation of the model, a concrete explanation
of the input-output vector $(x, y)$ is given. Let $a_{ij}$ be the amount of the
ith commodity needed as an input in a one-unit operation of the jth process (or
"activity"). Let $b_{ij}$ be the amount of the ith commodity produced in a one-unit
operation of the jth process. Let there be n commodities and m processes in the
economy. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be $n \times m$ matrices whose entries are
nonnegative real numbers and possibly zero for some elements. Let $a^j$ be an n-vector
whose ith element is $a_{ij}$ and let $b^j$ be an n-vector whose ith element is $b_{ij}$. Some
of the $a_{ij}$'s and the $b_{ij}$'s can be zero. One unit operation of the jth process transforms
$a^j$ to $b^j$. Let $z(t)$ be an m-vector whose jth element, $z_j(t)$, signifies the level of
operation of the jth process in period t; $z_j(t) \geq 0$ for all j and t. The vector $z(t)$
is called the activity level vector (or process level vector) in period t. That $x(t)$
(often denoted also as $x_t$) is an input vector in period t means that $x(t)$ can be
written as
$$x(t) = \sum_{j=1}^{m} a^j z_j(t)$$
for some $z_j(t) \geq 0$ (j = 1, 2, ..., m). Similarly, that $y(t)$ (or $y_t$) is an output vector
in period t means that it can be written as
$$y(t) = \sum_{j=1}^{m} b^j z_j(t)$$
for some $z_j(t) \geq 0$ (j = 1, 2, ..., m). The assumption that the technology set T is
constant implies that these $a_{ij}$'s and $b_{ij}$'s are constant over time.
Allow "free disposability"; that is, if the process (x, y) is in the
Assumption (A-ii) means that every process uses some commodity as an in-
put, which implies "the impossibility of the land of Cockaigne." Assumption
(A-iii) means that every commodity is producible by some process. Assumptions
(A-ii) and (A-iii) modify (AN-2) in a significant way.
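The matrix description of the technology, together with the verbal content of (A-ii) and (A-iii) given above, can be illustrated with a small sketch. The matrices and the function names below are hypothetical choices made for illustration only; the checks encode just the interpretations stated in the text (every process uses some input, every commodity is producible).

```python
def input_output(A, B, z):
    """Return (x, y) with x = A z and y = B z for activity levels z >= 0."""
    n, m = len(A), len(A[0])
    x = [sum(A[i][j] * z[j] for j in range(m)) for i in range(n)]
    y = [sum(B[i][j] * z[j] for j in range(m)) for i in range(n)]
    return x, y

def every_process_uses_an_input(A):
    """Each column j of A has some a_ij > 0 (interpretation of (A-ii))."""
    n, m = len(A), len(A[0])
    return all(any(A[i][j] > 0 for i in range(n)) for j in range(m))

def every_commodity_is_producible(B):
    """Each row i of B has some b_ij > 0 (interpretation of (A-iii))."""
    return all(any(b > 0 for b in row) for row in B)

# Hypothetical two-commodity, two-process economy (rows: commodities).
A = [[1.0, 0.0],
     [0.0, 1.0]]
B = [[0.0, 2.0],
     [2.0, 0.0]]

x, y = input_output(A, B, z=[1.0, 1.0])
print(x, y)                               # [1.0, 1.0] [2.0, 2.0]
print(every_process_uses_an_input(A))     # True
print(every_commodity_is_producible(B))   # True
```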
Karlin [11], in his formulation of the von Neumann model, did not use
the von Neumann specification of the technology set in terms of the matrices
A and B. Hence he did not adopt the above specification of assumptions in terms
of the $a_{ij}$'s and the $b_{ij}$'s. Rather, he abstracted the essential nature of the von
Neumann technology by supposing that the technology set T of the economy is
specified by the following four assumptions.
(A-1) T is a closed convex cone in $\Omega^{2n}$, the nonnegative orthant of $R^{2n}$.
(A-2) (Free disposability) $(x, y) \in T$, $x' \geq x$, and $0 \leq y' \leq y$ imply
$(x', y') \in T$.
(A-3) (The impossibility of the land of Cockaigne) $(0, y) \in T$ implies $y = 0$.
(A-4) (The "productiveness") For any i, there exists an $(x, y) \in T$ such that
$y_i > 0$ (that is, every commodity is producible).
REMARK: As remarked in the exposition of activity analysis (Chapter 0,
Section C), "T is a convex cone" implies additivity, proportionality (that is,
complete divisibility and constant returns to scale), and the possibility of
inaction.
REMARK: In view of (A-1), (A-4) is equivalent to the following:
(A-4') There exists an $(x, y) \in T$ such that $y > 0$.
It is easy to see that the von Neumann technology set $T_N$ with (A-i) to (A-iii)
is a special case of the technology set T with (A-1) to (A-4). To see this, note that
(A-i) together with the definition of $T_N$ imply (A-1) and (A-2); (A-ii) implies
(A-3); and (A-iii) implies (A-4).
The economy transforms the stock of commodities at the beginning of period t,
$x(t)$, into the stock of commodities $y(t)$ by spending one time period, such that
$(x(t), y(t)) \in T$, where T satisfies (A-1) to (A-4). Thus the movement of the economy
can be depicted schematically by Figure 6.1.
[Figure 6.1: schematic of the movement of the economy, in which T transforms x(t) into y(t) in each period.]
(i) Activity analysis: the existence of an efficient point in a production set and
the assertion that every efficient point can be realized as a profit maximization
point (Theorem 0.C.3).
(ii) Welfare economics: the existence of a Pareto optimal point and the assertion
that every Pareto optimal point can be supported by competitive pricing
(Theorem 1.C.2).
b. MAJOR THEOREMS
We now proceed to the investigation of the two major problems stated above:
(1) the existence of the balanced growth path with a maximal rate, and (2) the
price implication of such a path.
The value of $\alpha(x, y)$ is called the rate of expansion of the process (x, y) in T.
REMARK: For (0, 0), $\alpha(x, y)$ cannot be defined. Note also that, owing to
(A-3), $(0, y) \notin T$ if $y \geq 0$ and $y \neq 0$. Hence $\alpha(x, y)$ is not defined on such
points under (A-3). Thus $\alpha(x, y)$ is not defined for (0, 0) and (0, y) with
$y \geq 0$, $y \neq 0$. Since $x \geq 0$ and $y \geq 0$, this means that $\alpha(x, y)$ is defined only
when $x \neq 0$. This implies that $\alpha(x, y) \geq 0$. The concept of "rate of expansion"
is illustrated in Figure 6.2. As an example of $\alpha(x, y) = 0$, consider x = (0, 1) and
y = (2, 0). Note that $\alpha(x, y)$ may be less than 1; hence the process can produce
"decay" rather than "expansion." Writing $x = (x_1, x_2, \ldots, x_n)$ and
$y = (y_1, y_2, \ldots, y_n)$, we can also write the above definition of the expansion
rate as follows. Let $\alpha_i(x, y)$, i = 1, 2, ..., n, be defined as
$$\alpha_i(x, y) = \begin{cases} y_i / x_i & \text{if } x_i > 0 \\ \infty & \text{if } x_i = 0 \text{ and } y_i > 0 \\ \text{undefined} & \text{if } x_i = 0 \text{ and } y_i = 0 \end{cases}$$
Then
$$\alpha(x, y) = \min_i \alpha_i(x, y) \quad \text{for } x \geq 0,\ x \neq 0$$
[Figure 6.2: the rate of expansion, with Commodity 1 and Commodity 2 on the axes; $y' = \alpha x$, where $\alpha = \alpha(x, y)$.]
As is clear from Figure 6.2, the concept of the rate of expansion is that of
a "balanced growth path," that is, a ray from the origin. Given (x, y), if
y is not on the ray from the origin passing through x, y is brought into such
a ray as illustrated in Figure 6.2 ($y \to y'$).
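The componentwise definition of the expansion rate translates directly into code. The following is a minimal sketch, assuming vectors are given as plain Python lists; the function name `expansion_rate` is hypothetical. Components with $x_i = 0$ contribute either infinity or nothing to the minimum, so only components with $x_i > 0$ need to be examined.

```python
def expansion_rate(x, y):
    """alpha(x, y) = min_i y_i / x_i, taken over the components with x_i > 0.

    Components with x_i = 0 give either infinity (y_i > 0) or are
    undefined (y_i = 0), so they never determine the minimum.
    """
    if all(xi == 0 for xi in x):
        raise ValueError("alpha(x, y) is not defined when x = 0")
    return min(yi / xi for xi, yi in zip(x, y) if xi > 0)

print(expansion_rate([0, 1], [2, 0]))   # 0.0  (the example in the remark)
print(expansion_rate([2, 4], [3, 2]))   # 0.5  ("decay": rate below 1)
print(expansion_rate([1, 1], [2, 3]))   # 2.0
```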
Since $\alpha(x, y)$ is a function of (x, y), the value of $\alpha(x, y)$, the expansion
rate, varies from process to process. The following theorem asserts that there
exists a process $(\hat{x}, \hat{y})$ in T which gives the maximum rate of expansion. The
essential argument for this existence theorem is the standard compactness
argument if we assume that $\alpha(x, y)$ is continuous.
Theorem 6.A.1: Under the assumptions (A-1) to (A-4), there exists an $(\hat{x}, \hat{y}) \in T$
such that $\hat{y} = \hat{\alpha}\hat{x}$, where $\hat{\alpha} = \alpha(\hat{x}, \hat{y})$, and $\hat{\alpha} \geq \alpha(x, y)$ for all $(x, y) \in T$ with
$x \geq 0$, $x \neq 0$. Also we have $0 < \hat{\alpha} < \infty$.
PROOF: Let $\bar{T}$ be the intersection of T with the unit sphere (with the center
at the origin) in $R^{2n}$. Since T is closed, $\bar{T}$ is also closed, which in turn implies
that $\bar{T}$ is compact. Suppose $\alpha(x, y)$ is continuous for all points of $\bar{T}$ (or
T). Then $\alpha(x, y)$ achieves a maximum on $\bar{T}$ by Weierstrass' theorem
(Theorem 0.A.18). However, as Glycopantis [6] pointed out, $\alpha(x, y)$ can be
discontinuous (for such an example, see [6], p. 296). This, in essence, is
due to the fact that the domains of the functions $\alpha_i(x, y)$ are not identical.
To avoid this difficulty, define the subset $T_i$ of T as
$$T_i = \{(x, y) \in T : \alpha(x, y) = \alpha_i(x, y)\}$$
Clearly, $T \setminus \{0\} = \bigcup_i T_i$. Let $\bar{T}_i = T_i \cup \{0\}$, and let $(x^q, y^q)$ be a convergent
sequence in $\bar{T}_i$ with limit $(x^\circ, y^\circ)$. Then, since $y_k^q / x_k^q \geq y_i^q / x_i^q$ for $(x^q, y^q) \neq 0$
implies that $y_k^\circ / x_k^\circ \geq y_i^\circ / x_i^\circ$, we have $\alpha(x^\circ, y^\circ) = \alpha_i(x^\circ, y^\circ)$. That is,
$(x^\circ, y^\circ) \in \bar{T}_i$, so that the $\bar{T}_i$'s are closed cones. Also the functions $\alpha_i(x, y)$ are
continuous except when they are undefined. Let $T_i^*$ be the intersection of $\bar{T}_i$
with the unit sphere in $R^{2n}$. Then the $T_i^*$'s are compact. Hence, from Weierstrass'
theorem, $\alpha_i(x, y)$ achieves its maximum $b_i$ on $T_i^*$. Choose the maximum
of the $b_i$'s over i, which clearly exists. Thus we have shown that there exists
an $(x^*, y^*) \in \bar{T}$ such that $\alpha(x^*, y^*) \geq \alpha(x, y)$ for all $(x, y) \in \bar{T}$
($\because \bigcup_i T_i = T \setminus \{0\}$). Write $\hat{\alpha} \equiv \alpha(x^*, y^*)$. Since T is a cone, for any
$(x, y) \in T$ with $x \neq 0$, there exists an $(\bar{x}, \bar{y})$ in $\bar{T}$ such that
$(\lambda\bar{x}, \lambda\bar{y}) = (x, y)$ for some $\lambda > 0$, $\lambda \in R$ [note
$(\bar{x}, \bar{y}) = (x, y) / \|(x, y)\|$]. However, by the definition of $\alpha(x, y)$, we
have $\alpha(x, y) = \alpha(\lambda\bar{x}, \lambda\bar{y}) = \alpha(\bar{x}, \bar{y})$. Since $\alpha(x^*, y^*) \geq \alpha(\bar{x}, \bar{y})$ for
all $(\bar{x}, \bar{y}) \in \bar{T}$ with $\bar{x} \neq 0$, we then obtain that $\alpha(x^*, y^*) \geq \alpha(x, y)$ for all
$(x, y) \in T$ with $x \neq 0$. From the free disposability assumption (A-2),
we can find an $(\hat{x}, \hat{y})$ in T such that $\alpha(\hat{x}, \hat{y}) = \alpha(x^*, y^*)$ and $\hat{y} = \hat{\alpha}\hat{x}$. Thus
$\hat{\alpha} = \alpha(\hat{x}, \hat{y}) \geq \alpha(x, y)$ for all $(x, y) \in T$ with $x \neq 0$, and $\hat{y} = \hat{\alpha}\hat{x}$. Finally, we show
that $0 < \hat{\alpha} < \infty$. From (A-4) there exists an $(\tilde{x}, \tilde{y})$ in T such that $\tilde{y} > 0$.
Since $\alpha(\tilde{x}, \tilde{y}) > 0$ and $\hat{\alpha} \geq \alpha(\tilde{x}, \tilde{y})$, $\hat{\alpha} > 0$. $\hat{\alpha} < \infty$ clearly follows from (A-3)
and $\hat{y} = \hat{\alpha}\hat{x}$. (Q.E.D.)
REMARK: In the above proof (which is in essence due to [6]), we observe
that the possible discontinuities of the function $\alpha(x, y)$ make it necessary
to complicate the "compactness" proof. Note that in the above proof the
continuity of $\alpha(x, y)$ is neither established nor utilized. The proof relies
on the continuity of the $\alpha_i(x, y)$.
REMARK: We may recall that the "failure" of the "compactness" proof
(as a result of the lack of continuity) also appeared in the proof of the
Frobenius theorem (Theorem 4.B.1). The reader may, therefore, wish to
consider Theorem 6.A.1 and the Frobenius theorem under a unified frame-
work.
REMARK: For the case of n = 1, that is, a one commodity economy, the
above proof can be illustrated by Figure 6.3.
REMARK: The definition of $\alpha(x, y)$ reduces the rate of expansion to the
rate in the corresponding balanced growth path, as was illustrated in the
definition of $\alpha(x, y)$. Hence Theorem 6.A.1 simply asserts that there exists
a balanced growth path which maximizes the rate of expansion in the set of
all the balanced growth paths in the economy. We call such a path the
von Neumann path. It is important to notice that the von Neumann path is
not necessarily unique. In the above illustrations, the von Neumann path
was supposed to constitute a unique ray from the origin. But, as McKenzie
[16] emphasized in connection with the turnpike theorem, this is not
necessarily the case. The set of von Neumann paths, in general, constitutes a
facet.
$$\text{Maximize}_{z}: \ \alpha \qquad \text{Subject to: } [B - \alpha A] \cdot z \geq 0 \text{ and } z \geq 0,\ z \neq 0$$
As Gale noted ([4], p. 312), this problem yields the following problem,
which appears strikingly analogous to the dual problem of linear programming.
$$\text{Minimize}_{p}: \ \beta \qquad \text{Subject to: } p \cdot [B - \beta A] \leq 0 \text{ and } p \geq 0,\ p \neq 0$$
Then using exactly the same argument as in the proof of Theorem 6.A.1, we
show that there exists a solution $\hat{\beta}$ for the above problem. In general, we
cannot, however, show that $\hat{\beta} = \hat{\alpha}$, although it can be shown that $\hat{\beta} \leq \hat{\alpha}$. The
above "dual problem" can have various economic interpretations. If
$p \cdot a^j > 0$, the ratio $(p \cdot b^j)/(p \cdot a^j)$ is meaningful and signifies return divided by
cost, a kind of "profit ratio" of the jth activity. The inequality $p \cdot [B - \beta A] \leq 0$
means $(p \cdot b^j)/(p \cdot a^j) \leq \beta$ whenever $p \cdot a^j > 0$. In other words, $\beta$ is the
maximum profit rate. In a competitive economy with free entry, competition
Theorem 6.A.2: Under assumptions (A-1) to (A-4), there exists a $\hat{p}$ such that $\hat{p} \geq 0$,
$\hat{p} \neq 0$, and $\hat{p} \cdot (y - \hat{\alpha}x) \leq 0$ for all $(x, y) \in T$.
PROOF: By definition of $\hat{\alpha}$, there exists no (x, y) in T such that $y - \hat{\alpha}x > 0$
[$\because$ if there exists an $(x', y') \in T$ such that $y' - \hat{\alpha}x' > 0$, then there exists
an $\varepsilon > 0$ such that $y' - (\hat{\alpha} + \varepsilon)x' > 0$, so that $\hat{\alpha}$ is not the maximum expansion
ratio, a contradiction]. Since T is convex and the function $f(x, y) \equiv y - \hat{\alpha}x$
is concave (in fact linear), we can apply the fundamental theorem of concave
functions, that is, Theorem 1.B.2. From this theorem, there exists a $\hat{p} \geq 0$, $\hat{p} \neq 0$,
such that $\hat{p} \cdot f(x, y) \leq 0$ for all $(x, y) \in T$. (Q.E.D.)
REMARK: We can prove Theorem 6.A.2 directly from the separation
theorem by considering two convex sets: $\{y - \hat{\alpha}x : (x, y) \in T,\ \|(x, y)\| \leq 1\}$
and the interior of $\Omega^n$. For such a proof, see Karlin [11], pp. 339-340. We
note, however, that Theorem 6.A.2 follows immediately once we recall
the fundamental theorem (Theorem 1.B.2). The above proof illustrates the
usefulness of the fundamental theorem.
REMARK: As we discussed above in connection with the "dual problem,"
$(p \cdot y)/(p \cdot x)$ is the "profit ratio" of process (x, y), whenever $p \cdot x > 0$.
Theorem 6.A.2 states that this does not exceed the maximum expansion
rate. Note that this theorem also implies that $\hat{\beta} \leq \hat{\alpha}$. Gale ([4], p. 316)
gave an example of $\hat{\beta} < \hat{\alpha}$.
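The relation between the maximum expansion rate and the profit ratios can be checked numerically on a toy economy. The sketch below is only an illustration under several assumptions: the matrices are hypothetical, there are exactly two processes so that a crude grid search over normalized activity levels suffices, and the price vector is picked by hand rather than derived. It is not the existence argument used in the text.

```python
def mat_vec(M, z):
    """Matrix-vector product M z for list-of-lists M."""
    return [sum(row[j] * z[j] for j in range(len(z))) for row in M]

def expansion_rate(x, y):
    """alpha(x, y): minimum of y_i / x_i over components with x_i > 0."""
    return min(yi / xi for xi, yi in zip(x, y) if xi > 0)

def best_rate_on_grid(A, B, steps=100):
    """Grid search over z = (s, 1 - s), 0 < s < 1, for the largest alpha(Az, Bz)."""
    best_alpha, best_z = -1.0, None
    for k in range(1, steps):
        s = k / steps
        z = [s, 1.0 - s]
        alpha = expansion_rate(mat_vec(A, z), mat_vec(B, z))
        if alpha > best_alpha:
            best_alpha, best_z = alpha, z
    return best_alpha, best_z

def profit_ratios(A, B, p):
    """(p . b^j) / (p . a^j) for each process j with p . a^j > 0, else None."""
    n, m = len(A), len(A[0])
    ratios = []
    for j in range(m):
        cost = sum(p[i] * A[i][j] for i in range(n))
        revenue = sum(p[i] * B[i][j] for i in range(n))
        ratios.append(revenue / cost if cost > 0 else None)
    return ratios

# Hypothetical economy: process 1 turns good 1 into two units of good 2,
# process 2 turns good 2 into two units of good 1.
A = [[1.0, 0.0], [0.0, 1.0]]
B = [[0.0, 2.0], [2.0, 0.0]]

alpha_hat, z_hat = best_rate_on_grid(A, B)
ratios = profit_ratios(A, B, p=[1.0, 1.0])   # p chosen by hand (assumption)
print(alpha_hat, z_hat)   # 2.0 [0.5, 0.5]
print(ratios)             # [2.0, 2.0]
```

In this example no process earns a profit ratio above the maximum expansion rate, in line with the statement of Theorem 6.A.2.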
The following theorem, originally due to von Neumann [22], is now an
immediate corollary of Theorems 6.A.1 and 6.A.2.
Theorem 6.A.3 (von Neumann): Let A and B respectively be the input and the output
matrices in the von Neumann technology. Then under assumptions (A-1) to (A-4)
there exist $\hat{\alpha} > 0$, $\hat{z} \geq 0$, and $\hat{p} \geq 0$, where $\hat{\alpha} \in R$, $\hat{z} \in R^m$, and $\hat{p} \in R^n$, such that
(i) $[B - \hat{\alpha}A] \cdot \hat{z} \geq 0$
(ii) $\hat{p} \cdot [B - \hat{\alpha}A] \leq 0$
(iii) $\hat{p} \cdot [B - \hat{\alpha}A] \cdot \hat{z} = 0$
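PROOF: Let $(\hat{x}, \hat{y})$ be the process which gives rise to the maximum expansion
rate $\hat{\alpha}$ in Theorem 6.A.1; then there exists a $\hat{z} \geq 0$ such that $\hat{y} = \hat{\alpha}\hat{x}$, $\hat{y} \leq B \cdot \hat{z}$,
and $\hat{x} \geq A \cdot \hat{z}$. Hence $B \cdot \hat{z} \geq \hat{\alpha}A \cdot \hat{z}$, or (i) is shown. Since $\hat{\alpha} > 0$ and
$\hat{y} \geq 0$ with $\hat{y} \neq 0$, we also have $\hat{z} \neq 0$. To show (ii) of Theorem 6.A.3, recall
Theorem 6.A.2. Then we have $\hat{p} \cdot [B - \hat{\alpha}A] \cdot z \leq 0$ with $\hat{p} \geq 0$ for all $z \geq 0$, so that
$\hat{p} \cdot [B - \hat{\alpha}A] \leq 0$ with $\hat{p} \geq 0$. Thus (ii) is shown. To show (iii), note that (i) and
$\hat{p} \geq 0$ imply $\hat{p} \cdot [B - \hat{\alpha}A] \cdot \hat{z} \geq 0$, so that $\hat{p} \cdot [B - \hat{\alpha}A] \cdot \hat{z} = 0$, in view of (ii).
(Q.E.D.)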
REMARK: The relation (iii) of Theorem 6.A.3 states that if the jth activity
yields negative profit, the corresponding activity level $\hat{z}_j$ is zero, and that if
the ith commodity is expanding at a rate greater than $\hat{\alpha}$, being "oversupplied,"
its price $\hat{p}_i$ is zero.
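The three relations used in the proof above, namely $[B - \hat{\alpha}A] \cdot \hat{z} \geq 0$, $\hat{p} \cdot [B - \hat{\alpha}A] \leq 0$, and $\hat{p} \cdot [B - \hat{\alpha}A] \cdot \hat{z} = 0$, are easy to verify for a candidate triplet. The following is a minimal sketch with a hypothetical function name, hypothetical matrices, and a numerical tolerance that is an implementation choice and not part of the theorem.

```python
def check_von_neumann(A, B, alpha, z, p, tol=1e-9):
    """Return three booleans for relations (i), (ii), (iii) of a candidate triplet."""
    n, m = len(A), len(A[0])
    # surplus_i = sum_j (b_ij - alpha * a_ij) z_j : excess supply of commodity i
    surplus = [sum((B[i][j] - alpha * A[i][j]) * z[j] for j in range(m))
               for i in range(n)]
    # profit_j = sum_i p_i (b_ij - alpha * a_ij) : profit of process j
    profit = [sum(p[i] * (B[i][j] - alpha * A[i][j]) for i in range(n))
              for j in range(m)]
    value = sum(p[i] * surplus[i] for i in range(n))
    return (all(s >= -tol for s in surplus),   # (i)   no commodity is in deficit
            all(q <= tol for q in profit),     # (ii)  no process earns positive profit
            abs(value) <= tol)                 # (iii) complementary slackness

# Hypothetical two-good, two-process economy.
A = [[1.0, 0.0], [0.0, 1.0]]
B = [[0.0, 2.0], [2.0, 0.0]]

ok = check_von_neumann(A, B, alpha=2.0, z=[1.0, 1.0], p=[1.0, 1.0])
too_fast = check_von_neumann(A, B, alpha=3.0, z=[1.0, 1.0], p=[1.0, 1.0])
print(ok)        # (True, True, True)
print(too_fast)  # (False, True, False)
```

Raising the candidate rate above the maximum violates (i): the outputs can no longer cover the inputs required for growth at that rate.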
In order to increase our understanding of Theorem 6.A.3, let us consider
more closely the original von Neumann model [22] in terms of the matrix [A, B].
Let $z(t)$ and $p(t)$ denote the activity level vector in period t and the price vector
in period t, respectively. Let $\beta(t)$ be the interest factor in period t. Then we have
the following "equilibrium" relations.
(i) $A \cdot z(t + 1) \leq B \cdot z(t)$
(ii) $p(t + 1) \cdot [B \cdot z(t) - A \cdot z(t + 1)] = 0$
(iii) $\beta(t)\, p(t) \cdot A \geq p(t + 1) \cdot B$
(iv) $[\beta(t)\, p(t) \cdot A - p(t + 1) \cdot B] \cdot z(t) = 0$
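Under the balanced growth assumption, with $z(t + 1) = \alpha z(t)$ and with $p(t) = p$ and
$\beta(t) = \beta$ constant, these relations become
(i') $[B - \alpha A] \cdot z \geq 0$
(ii') $p \cdot [B - \alpha A] \cdot z = 0$
(iii') $p \cdot [B - \beta A] \leq 0$
(iv') $p \cdot [B - \beta A] \cdot z = 0$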
We may call the quadruplet $[z, p, \alpha, \beta]$ which satisfies the above four relations
the von Neumann quadruplet. Theorem 6.A.3 asserts the existence of such a
quadruplet with $\alpha = \beta > 0$ and $z \geq 0$, $p \geq 0$. It is important to note that in this
interpretation of the model, $\alpha$ is not defined as the maximum expansion factor. An
interpretation in terms of the dual maximization and minimization problems that
Gale conceived is not intended here. Strictly speaking, $\beta$ here is not interpreted as
the solution of Gale's minimization problem. The model (i') to (iv') describes the
workings of a closed economy as interpreted above with the fundamental
assumption of balanced growth. That is, the interpretation here is that of a
descriptive model and not that of a planning model. However, it should also be noted that
Theorem 6.A.3 provides a "planning" interpretation. In other words, $\alpha$ in the
von Neumann quadruplet can be interpreted as the maximum expansion factor.
This means that if the economy is organized as described by (i') to (iv') with
the attached interpretations, it will maximize the expansion factor $\alpha$. This result
is analogous to a result in the theory of competitive equilibrium, namely, that
every competitive equilibrium realizes a Pareto optimum.
c. TWO REMARKS
Irreducibility
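In Theorem 6.A.3, we proved the existence of $[\hat{\alpha}, \hat{p}, \hat{z}]$ with $\hat{\alpha} > 0$, $\hat{p} \geq 0$,
and $\hat{z} \geq 0$, such that
$$[B - \hat{\alpha}A] \cdot \hat{z} \geq 0, \qquad \hat{p} \cdot [B - \hat{\alpha}A] \leq 0, \qquad \hat{p} \cdot [B - \hat{\alpha}A] \cdot \hat{z} = 0$$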
We may call this triplet $[\hat{\alpha}, \hat{p}, \hat{z}]$ the von Neumann equilibrium of the [A, B]
economy. However, in the conclusion of Theorem 6.A.3, or the definition of
the von Neumann equilibrium, the possibility that $\hat{p} \cdot B \cdot \hat{z} = 0$ is not precluded.
The condition $\hat{p} \cdot B \cdot \hat{z} = 0$ means (intuitively) that the total value of all
commodities produced in the von Neumann equilibrium is zero. This is rather
annoying. Gale [3] apparently realized this and considered the "regular" von
Neumann model, where the "regularity" is defined in terms of $B \cdot \hat{z} > 0$. Clearly
$B \cdot \hat{z} > 0$, together with $\hat{p} \geq 0$ and $\hat{p} \neq 0$, implies $\hat{p} \cdot B \cdot \hat{z} > 0$. Thompson [21] and
Kemeny, Morgenstern, and Thompson [12] also realized this and explicitly introduced
the condition $\hat{p} \cdot B \cdot \hat{z} > 0$ into the definition of the von Neumann equilibrium.
A natural question now is: What then is the condition which would guarantee
that $\hat{p} \cdot B \cdot \hat{z} > 0$? Gale [3] and [4] introduced "irreducibility," the concept
which is analogous to the indecomposability of the Frobenius theorem. It turns
out that this concept of "irreducibility" plays an important role in the above
question.
for all $i \in I$ and for some $j \in J$. The input-output matrix [A, B] is said to be
irreducible if $I' = \emptyset$.
REMARK: Hence, if the model is reducible, there is a certain permutation
of rows and columns of A (that is, renumbering of indices) such that A is
decomposed as
$$A = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix}$$
where the row blocks correspond to $I$ and $I'$ and the column blocks to $J$ and $J'$.
Theorem 6.A.4: Let A and B, respectively, be the input and the output matrices in
the von Neumann technology. Suppose there exist $\hat{\alpha} > 0$, $\hat{z} \geq 0$, $\hat{p} \geq 0$, $\hat{\alpha} \in R$,
$\hat{z} \in R^m$, and $\hat{p} \in R^n$ such that (i), (ii), and (iii) of Theorem 6.A.3 hold. Then if the
input-output matrix [A, B] is irreducible, we have $\hat{p} \cdot B \cdot \hat{z} > 0$.
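PROOF: Renumber j and partition $\hat{z}$ such that $\hat{z} = (\hat{z}^0, \hat{z}^1)$, where $\hat{z}^0 > 0$
and $\hat{z}^1 = 0$. Let J be the set of indices where $\hat{z}^0 > 0$ (so that $J' = M \setminus J$ is
the set of indices where $\hat{z}^1 = 0$). Let $b_i$ be the m-vector whose jth element is
$b_{ij}$. Let I be the set of indices i such that $b_i \cdot \hat{z} > 0$ (so that $I' = N \setminus I$ is the
set of indices i where $b_i \cdot \hat{z} = 0$). Renumbering i and j, A and B respectively
can be partitioned as
$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}$$
where the row blocks correspond to $I$ and $I'$ and the column blocks to $J$ and $J'$.
By assumption of the theorem, $B \cdot \hat{z} \geq \hat{\alpha}A \cdot \hat{z}$, which, in turn, means
$B_{11} \cdot \hat{z}^0 \geq \hat{\alpha}A_{11} \cdot \hat{z}^0$ and $B_{21} \cdot \hat{z}^0 \geq \hat{\alpha}A_{21} \cdot \hat{z}^0$ ($\because \hat{z}^1 = 0$). But $B_{21} \cdot \hat{z}^0 = 0$ by definition
of I and J ($\because 0 = B_{21} \cdot \hat{z}^0 + B_{22} \cdot \hat{z}^1 = B_{21} \cdot \hat{z}^0$). Hence $\hat{\alpha}A_{21} \cdot \hat{z}^0 \leq 0$, or
$A_{21} \cdot \hat{z}^0 \leq 0$ since $\hat{\alpha} > 0$. But $A_{21} \cdot \hat{z}^0 \geq 0$, since $A_{21} \geq 0$ and $\hat{z}^0 > 0$, so that
we must have $A_{21} \cdot \hat{z}^0 = 0$. This, in view of $A_{21} \geq 0$ and $\hat{z}^0 > 0$, implies that
$A_{21} = 0$. That is, $a_{ij} = 0$ for $i \in I'$, $j \in J$. Since [A, B] is irreducible by
assumption, the set $I'$ must be empty. That is, $I = N$, so that $b_i \cdot \hat{z} > 0$ for all i =
1, 2, ..., n. Hence $B \cdot \hat{z} > 0$, which implies $\hat{p} \cdot B \cdot \hat{z} > 0$.
(Q.E.D.)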
REMARK: The above proof is based on Gale [4], p. 315.
REMARK: If we have assumptions (A-i) through (A-iii), then we can prove
the above theorem without irreducibility. See Kemeny, Morgenstern, and
Thompson [12]. For alternative proofs, see Howe [10], p. 638, or Nikaido
[20], p. 146.
where $q(t) \equiv p(t) / \left(\sum_{i=1}^{n} p_i(t)\right)$, f and g are n-vectors, and $\sum_{i=1}^{n} p_i(t) > 0$ for all
t is assumed. Note that E(t) < 0 implies f = 0 by assumption. Defining $\mu(t)$ by
$\mu(t) \equiv w(t) / \left(\sum_{i=1}^{n} p_i(t)\right)$, e(t) can be further rewritten as $e(t) = \mu(t)\, g[q(t)]\, L \cdot z(t)$.
Here $\mu$ may be interpreted as the "real wage rate."
Morishima [18] retains the fundamental balanced growth assumption à la
von Neumann so that
(11) $z(t) = \alpha z(t - 1)$, $p(t) = \text{constant} = p$, $w(t) = \text{constant} = w$, $\beta(t) = \text{constant} = \beta$
Also the unit of measurement of the commodities is chosen properly so that
$\sum_{i=1}^{n} p_i = 1$. Note that this convention implies $p = q$ and $\mu = w$. Then, using (7)
and (10), the model consisting of (1'), (2'), (3'), (4'), and (5) can be rewritten as
(12) B.z>_c 1) wL]
(13) wL] >ppB
Section B
THE DYNAMIC LEONTIEF MODEL
a. INTRODUCTION
The dynamic Leontief model is a natural extension of the static input-output
Leontief model to a dynamic case. As in the static case, the general equilibrium
interaction among various industries in an economy is explicitly taken into
account. Like the static model, the dynamic model is also used extensively for
empirical purposes to ascertain the industrial structure of particular economies
for forecasting, planning, and so on.
Theoretically speaking, the dynamic Leontief model can be considered a
special case of the von Neumann model, in which there is only one production
process available for the production of each good (the "fixed coefficient assump-
tion") and no joint output is allowed except that capital goods may be considered
as joint outputs in the sense that they are transferred from one period to the next.
The essential idea in dynamizing the static Leontief model seems to have
come from the Harrod-Domar model. Hence the assumption of fixed capital
coefficients is essential in the model.
The fixed coefficients assumption or the strict linearity of the model,
although a useful assumption for empirical purposes, causes serious theoretical
difficulties in the dynamic Leontief model because of its rigidity. The most notable
of such difficulties is known as causal indeterminacy. That is, unless the initial
output vector and stock vector are on a certain ray from the origin, it may so
happen that the output and the stock of at least one good may become negative
for sufficiently large t. Thus, waking up on a bright Monday morning, one may
find that the dynamic Leontief economy, which had started with a positive initial
stock of commodities, has realized a negative stock of some commodity!
There have been various attempts to rescue the dynamic Leontief model
from this difficulty. One useful concept in this connection turns out to be that
of "relative stability," developed by Solow and Samuelson [39]. That is, if the
coefficient matrices A and B satisfy certain conditions, then there exists a balanced
growth path in which all outputs (or stocks) grow at the same rate such that the
ratio between the output (or stock) of the balanced growth path and the actual
output (or stock) for each good converges to a certain positive constant, as time
extends without limit, regardless of the initial configuration of outputs and stocks.
(Incidentally, the definitions of coefficient matrices A and B will be given at the
beginning of subsection b. Here we simply proceed with our discussion without
worrying about the definitions of A and B.)
It turns out that the study of this question is concerned with a system of
first-order, linear, homogeneous difference equations of the form
$$x(t + 1) = M \cdot x(t)$$
where M is an $n \times n$ constant matrix and $x(t)$ [and $x(t + 1)$] is an n-vector.
In the (closed) dynamic Leontief model, it turns out that M is written as
$M = I + B^{-1}(I - A)$.
A necessary and sufficient condition for the relative stability of the above
system was discovered by Tsukui [42]. It states that the above system is relatively
stable if and only if there exists a positive integer m such that $M^m > 0$. Observe
that by iteration the above system of difference equations yields
$$x(t) = M^t \cdot x(0)$$
Thus Tsukui's theorem means that, for a sufficiently large t (say, m), the output
of every good becomes positive regardless of the initial point, $x(0) \geq 0$.
This is an interesting result even from a purely mathematical viewpoint, for a
system of difference equations such as $x(t + 1) = M \cdot x(t)$ can appear in many
fields of economics other than the dynamic Leontief system. Hence it has many
potential applications.
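Tsukui's condition is straightforward to test numerically for a given matrix. The sketch below is an illustration only: the matrix M is a hypothetical example chosen to have one negative entry, the search bound `max_power` is an arbitrary cutoff, and failing to find a positive power within the cutoff proves nothing.

```python
def mat_mul(P, Q):
    """Product of two square matrices given as lists of lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(M, x):
    """Matrix-vector product M x."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def first_positive_power(M, max_power=50):
    """Smallest m <= max_power with every entry of M**m positive, else None."""
    power = M
    for m in range(1, max_power + 1):
        if all(entry > 0 for row in power for entry in row):
            return m
        power = mat_mul(power, M)
    return None

# Hypothetical M with a negative entry whose square is strictly positive.
M = [[-0.1, 1.0],
     [1.0, 0.5]]

m_star = first_positive_power(M)
x0 = [1.0, 0.0]            # nonnegative, nonzero initial point
x1 = mat_vec(M, x0)        # one period later: first component is negative
x2 = mat_vec(M, x1)        # two periods later: every component is positive
print(m_star)              # 2
print(x1[0] < 0, all(v > 0 for v in x2))   # True True
```

The path may dip below zero before period m, which is why the condition concerns sufficiently large t only.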
Coming back to the dynamic Leontief system, suppose now that the
coefficient matrices A and B are such that we do not have relative stability. Then
we are back to the problem of causal indeterminacy. In general, there is nothing
which would guarantee the relative stability of the system. The answer to this
question of causal indeterminacy under these general circumstances can be sought
from two directions. One is to convert the (deterministic) dynamic Leontief model
into a planning model, in which case the problem of causal indeterminacy can
be avoided trivially by explicitly introducing the nonnegativity of the output and
the stock vectors (for all t) in the constraints. The nontrivial part of this conversion
procedure is to change the equalities in the Leontief system into inequalities.
The procedure of converting the deterministic Leontief model to a planning
model, thus avoiding the problem of causal indeterminacy, was developed by
Solow [38]. Since the coefficient matrices A and B are fixed so that the system
is linear, he obtained a linear programming model. Then considering the dual
problem of this linear programming model and interpreting the dual variables
as "price" variables, Solow obtained a remarkable result: the price implication
of the output system of the dynamic Leontief model.
Preserving linearity in the sense of linear homogeneity of the production pro-
cesses, we can still avoid the problem of causal indeterminacy by explicitly intro-
ducing some sort of "nonlinearity" into the system. In particular, we may point
out the following three kinds of "nonlinearity" to avoid causal indeterminacy.
(i) Allow factor substitution in the production processes. In this case, the $a_{ij}$'s
and $b_{ij}$'s in the matrices A and B are no longer fixed but are functions of prices.
Moreover, labor can be introduced in this substitution mechanism.
(ii) Allow demand (of consumer) substitution. In the usual dynamic Leontief
model, the final demand vector c(t) is exogenously given; but we may allow
it to depend on prices.
(iii) Introduce a "floor" and "ceiling," just as Hicks [13] introduced them into
Samuelson's business cycle model. As Goodwin [8] observed, this essentially
amounts to introducing nonlinearity into the system.
It is easy to see that causal indeterminacy could be avoided by (i) and/or (ii).
If the stock of a certain good decumulates in a certain period(s) (too close to zero),
then the scarcity of this good would, in general, cause an increase in its marginal
productivity and/or an increase in its marginal utility. This would increase the
demand for the good, thus increasing its price. This, in turn, would cause an
increase in its supply, which would avoid the stock of the good decumulating to
zero.
That labor can be introduced in the mechanism of producer's substitution
in (i) has another important implication in the sense that it will avoid another
major difficulty in the dynamic Leontief model with fixed coefficients. We now
discuss this difficulty. Under full employment of capital, the output vector x(t)
may follow the law of motion described by a system of difference equations such
as $x(t + 1) = M \cdot x(t) + d(t)$ [where d(t) is exogenously given owing to final
demand] in the open Leontief model in which labor is explicitly introduced. As
long as there is a fixed relation between the labor input and the output of each
good, the movement of x(t) will uniquely determine the labor requirement in the
economy. Let it be L(t). Suppose this L(t) does not correspond to the actual
supply of labor. There is no mechanism in the usual dynamic Leontief system
to eliminate the gap between L(t) and the supply of labor. For example, suppose
the supply of labor is given exogenously by population and the like, and grows
proportionately as $L_0(1 + n)^t$. Suppose also that the output
determined by $x(t + 1) = M \cdot x(t) + d(t)$ is given by $x(0)(1 + \mu)^t$ + (constant).
Then we have ever-expanding unemployment of labor if $n > \mu$. If, on the other
hand, $n < \mu$, then the output system such as $x(t + 1) = M \cdot x(t) + d(t)$ is
meaningless, for such a growth of output is impossible due to the labor constraint. This
difficulty can be avoided by introducing labor in the mechanism of producer's
substitution. In other words, if labor grows faster than is required, then the price of
labor will go down and encourage the use of labor in the production vis-a-vis other
factors. This, in turn, will increase the labor requirement. In other words, the
labor coefficients-the amounts of labor necessary to produce one unit of each
good-are not fixed constants but functions of prices.
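The widening gap between labor supply and labor requirement under fixed coefficients can be seen in a one-good numerical sketch. Everything below is a simplifying assumption made for illustration: a single good, a single fixed labor coefficient, the additive constant in the output path ignored, and hypothetical parameter values.

```python
def labor_gap(L0, n, x0, mu, labor_coeff, periods):
    """Labor supply minus labor requirement for t = 0, ..., periods.

    Supply grows as L0 * (1 + n)**t; output grows as x0 * (1 + mu)**t and
    requires labor_coeff units of labor per unit of output (fixed coefficient).
    """
    gaps = []
    for t in range(periods + 1):
        supply = L0 * (1 + n) ** t
        required = labor_coeff * x0 * (1 + mu) ** t
        gaps.append(supply - required)
    return gaps

# Hypothetical numbers: labor grows at 3 percent, output at 1 percent.
gaps = labor_gap(L0=100.0, n=0.03, x0=100.0, mu=0.01, labor_coeff=1.0, periods=5)
print(gaps[0])                                      # 0.0 (full employment at t = 0)
print(all(b > a for a, b in zip(gaps, gaps[1:])))   # True: unemployment keeps growing
```

With n below mu the sign of the gap reverses, which corresponds to the infeasible case described above.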
Morishima [26] introduced nonlinearity of type (i). However, he was
apparently misled by his desire to obtain the substitution theorem for the dynamic
Leontief system. He was concerned with reducing nonlinearity to linearity by
arguing that only one set of values of the $a_{ij}$'s and $b_{ij}$'s will be chosen regardless
of the value of final demand, rather than with the problem of causal indeterminacy
per se.
The second term of this expression can be understood by supposing that the
production of each good requires a stock of goods (such as "capital") as well
as current inputs (such as "raw materials"). In other words, suppose that, in the
production of one unit of the jth good, $b_{ij}$ units of the stock of the ith good are
necessary as a capital good. Let $K_{ij}(t)$ be the amount of the stock of the ith good
required as capital in the jth industry in period t. Then we have
(2) $b_{ij} x_j(t) = K_{ij}(t)$
Let $K_i(t)$ be the total stock of the ith good required as a capital good in the
economy in period t, that is, $K_i(t) \equiv \sum_{j=1}^{n} K_{ij}(t)$. Then we have [in view of (2)]
(3) $K_i(t) = \sum_{j=1}^{n} b_{ij} x_j(t)$
Assume that capital is freely transferable from one industry to another, and
assume also the full employment of capital so that $K_i(t)$ in (3) also denotes the
supply of the ith capital as well as its demand (i = 1, 2, ..., n). Then
(4) $I_i(t) \equiv \Delta K_i(t) \equiv K_i(t + 1) - K_i(t) = \sum_{j=1}^{n} b_{ij} [x_j(t + 1) - x_j(t)]$
where $I_i(t)$ is the amount of the ith good demanded (and supplied) for
"investment" purposes. Expression (1) may be interpreted as (demand as a current
input) + (investment demand) + (final demand) for the ith good.
Since $x_i(t)$ denotes the output of the ith good in period t, the basic supply =
demand equilibrium relation for the ith good can now be written as
(5) $x_i(t) = \sum_{j=1}^{n} a_{ij} x_j(t) + \sum_{j=1}^{n} b_{ij} [x_j(t + 1) - x_j(t)] + c_i(t)$
or in matrix form,
(6) $x(t) = A \cdot x(t) + B \cdot [x(t + 1) - x(t)] + c(t)$
where c(t) is an n-vector whose ith element is $c_i(t)$. We may also rewrite (6) as
where $x^i$ is an eigenvector associated with $\lambda_i$. That this is a solution of (8) can
be checked easily by substituting this into (8) and noticing that (8) is reduced to
an identity. If all the eigenvalues of M are distinct, then the n particular solutions
in the form of (9) (i = 1, 2, ..., n) are linearly independent, and the general
solution of (8) can be written as
(10) $x(t) = h_1 \lambda_1^t x^1 + \cdots + h_n \lambda_n^t x^n$
where $h_1, h_2, \ldots, h_n$ are determined by the n initial (boundary) conditions. The
fact that this is a solution of (8) can be checked easily by noticing that (10) reduces
(8) to an identity.
In general, the $\lambda_i$'s are complex numbers and the $x^i$'s can contain negative
elements, so that a solution (9) may not have any economic meaning. Suppose,
however, that one of the eigenvalues (say, $\lambda_1$) is a positive (real) number and that
an eigenvector associated with it (say, $x^1$) is a positive vector (that is, $x^1 > 0$);
then a solution
(11) $x^*(t) = \lambda_1^t x^1$
does make economic sense, because this tells us that if the initial output vector,
x(0), of the economy is $x^1$ (or its positive constant multiple, say, $h_1 x^1$), then the
economy is capable of "balanced growth" (at the rate of $\lambda_1$) for $\lambda_1 > 1$ or
"balanced decay" for $0 < \lambda_1 < 1$. We simply call the path such as (11) a balanced
growth path or a balanced growth solution. This is an interesting conclusion, for if
the initial output vector x(0) is $x^1$ or its (positive) constant multiple, then, in
the economy in which (8) holds, the output of every good grows (or decays) at the
same rate $\lambda_1$.
A natural question that follows from this consideration is: What are the
conditions which would guarantee the existence of a positive eigenvalue $\lambda_1$ and a
positive eigenvector $x^1$ associated with it? An immediate thought about this is
to consider the case where M is a nonnegative matrix. For if M is a nonnegative
indecomposable matrix, then owing to the Frobenius theorem (Theorem 4.B.1),
there exists a positive eigenvalue (called the Frobenius root) and a positive
eigenvector associated with it. That is, simply by taking the Frobenius root as $\lambda_1$
and the associated eigenvector as $x^1$, we have a balanced growth solution, (11).
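The Frobenius root and its positive eigenvector can be approximated numerically. The following is a minimal sketch using power iteration; the function name `frobenius_root` and the matrix `M_demo` are hypothetical illustrations, not taken from the text, and the iteration is only assumed to converge because `M_demo` is strictly positive (hence primitive).

```python
def frobenius_root(M, iterations=200):
    """Power iteration for the dominant eigenvalue/eigenvector of a positive matrix."""
    n = len(M)
    x = [1.0 / n] * n                      # positive start, components sum to 1
    lam = 0.0
    for _ in range(iterations):
        y = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = sum(y)                       # x sums to 1, so sum(M x) estimates the root
        x = [v / lam for v in y]           # renormalize to keep the sum equal to 1
    return lam, x


# Hypothetical positive matrix (not from the text); its eigenvalues are 4 and -1.
M_demo = [[1.0, 2.0], [3.0, 2.0]]
root, vector = frobenius_root(M_demo)
```

Starting the economy at any positive multiple of `vector` would then give a balanced growth path of the form (11) with growth factor `root`.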
However, there is one basic difficulty, namely, the question of how we can
guarantee that $M \equiv [I + B^{-1} \cdot (I - A)]$ is a nonnegative matrix. In general, M will
not be nonnegative. To see this, consider the following example by DOSSO
([4], p. 297):
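EXAMPLE 1:
Let $A = \begin{bmatrix} 1/3 & 1/3 \\ 1/3 & 1/3 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$.
Then $M \equiv [I + B^{-1} \cdot (I - A)] = \begin{bmatrix} 5/3 & -1/3 \\ -1/3 & 5/3 \end{bmatrix}$.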
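The computation in Example 1 can be checked with exact rational arithmetic. The sketch below assumes that A has every entry equal to 1/3 and that B is the identity, which is the reading consistent with the matrix M quoted later in this section; the helper names are hypothetical.

```python
from fractions import Fraction as Fr


def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def mat_add(X, Y, sign=1):
    return [[X[i][j] + sign * Y[i][j] for j in range(2)] for i in range(2)]


def inv2(X):
    """Inverse of a 2x2 matrix by the adjugate formula."""
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det], [-X[1][0] / det, X[0][0] / det]]


I2 = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]
A = [[Fr(1, 3), Fr(1, 3)], [Fr(1, 3), Fr(1, 3)]]   # assumed input coefficients
B = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]               # assumed capital coefficients

# M = I + B^{-1} (I - A)
M = mat_add(I2, mat_mul(inv2(B), mat_add(I2, A, sign=-1)))
has_negative_entry = any(entry < 0 for row in M for entry in row)
```

Under these assumptions the off-diagonal entries of `M` are negative, so the Frobenius theorem cannot be applied to M directly.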
However, the fact that M is not a nonnegative matrix may not preclude the
possibility that the system of difference equations, (8), contains a balanced growth
solution. Hence, we want to find a set of plausible assumptions under which there
exists a balanced growth solution of (8). To do this, first assume that the matrix
$[I - A]$ has a dominant diagonal, that is,
(15) $\lim_{t \to \infty} \bar{x}_i(t) / x_i^*(t) = \sigma$ exists such that $\infty > \sigma > 0$ and $\sigma$ is independent
of i, where i stands for the ith component
REMARK: The concept of relative stability is really independent of
whether the motion of $x(t)$ is described by a system of linear difference
equations such as (8). Essentially, if $\bar{x}(t)$ behaves according to a certain law
of motion (which can be anything) starting from the initial value $\bar{x}(0)$, and if
there exists a reference path $x^*(t)$ [for example, $x^*(t) = \lambda^t x$], which is
positive for all t, then a definition such as the one described by (15) applies.
REMARK: One of the crucial features of the concept of relative stability is
that $\bar{x}(t)$ can start from any arbitrary initial point. That is, regardless of the
initial value $\bar{x}(0)$, relation (15) holds if $x^*(t)$ is relatively stable. Suppose
$\bar{x}(t)$ and $\bar{x}^{\circ}(t)$ are two solutions starting from $\bar{x}(0)$ and $\bar{x}^{\circ}(0)$, respectively,
such that $\bar{x}(t) > 0$ for all t. Then noting that
$\bar{x}_i^{\circ}(t)/\bar{x}_i(t) = [\bar{x}_i^{\circ}(t)/x_i^*(t)] / [\bar{x}_i(t)/x_i^*(t)]$,
we can conclude that if there exists a balanced growth path
$x^*(t) > 0$ which is relatively stable, then $\bar{x}_i^{\circ}(t)/\bar{x}_i(t)$ also converges,
as $t \to \infty$, to a constant which is independent of i.
In Figure 6.4, we illustrate the concept of relative stability in such a way
that $\bar{x}(t)$ asymptotically approaches the balanced growth path $x^*(t)$. Contrary
to a common misunderstanding, this asymptotic convergence is not necessary
in the concept of relative stability. In other words, that $x^*(t)$ is relatively stable
does not necessarily imply that $\bar{x}(t) \to x^*(t)$ as $t \to \infty$. An example of such a case
is given by Nikaido ([32], section 22) as follows:
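where $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ are the eigenvectors associated with 4 and 2, respectively.
Clearly,
(17) $x_i^*(t) = 4^t$, $i = 1, 2$
is a balanced growth path of the economy. If the initial output vector is such that
$\bar{x}_1(0) = 2$ and $\bar{x}_2(0) = 0$, then the path of the output of each good, determined by
(16), is
(18) $\bar{x}_1(t) = 4^t + 2^t$ and $\bar{x}_2(t) = 4^t - 2^t$
Hence $\bar{x}_1(t)/x_1^*(t)$ and $\bar{x}_2(t)/x_2^*(t)$ both approach 1 as $t \to \infty$; that is, $x^*(t)$ is
relatively stable. But the Euclidean distance between $\bar{x}(t)$ and $x^*(t)$ in period t
is given by
$\{[\bar{x}_1(t) - x_1^*(t)]^2 + [\bar{x}_2(t) - x_2^*(t)]^2\}^{1/2} = \sqrt{2} \cdot 2^t$
which grows without bound as $t \to \infty$.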
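The contrast between relative stability and asymptotic convergence in this example can be tabulated directly. The sketch below only evaluates the paths (17) and (18); the function names are hypothetical.

```python
import math


def reference_path(t):
    """Balanced growth path (17): both outputs equal 4**t."""
    return (4 ** t, 4 ** t)


def actual_path(t):
    """Path (18), which starts from the outputs (2, 0)."""
    return (4 ** t + 2 ** t, 4 ** t - 2 ** t)


def ratios(t):
    """Componentwise ratios of the actual path to the reference path."""
    return tuple(a / r for a, r in zip(actual_path(t), reference_path(t)))


def distance(t):
    """Euclidean distance between the actual path and the reference path."""
    return math.hypot(*(a - r for a, r in zip(actual_path(t), reference_path(t))))


for t in (0, 5, 10, 20):
    print(t, ratios(t), distance(t))
```

The ratios approach 1 while the distance keeps doubling each period, which is the point of the example.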
(22) $(I + F)^m \cdot x^1 = \lambda_1^m x^1$
Also note that $(F^{-1}) \cdot x^i = \nu_i x^i$ implies $(1/\nu_i)\, x^i = F \cdot x^i$, which in turn implies
$(I + F) \cdot x^i = (1 + 1/\nu_i)\, x^i$. Hence we have
(24)
Since $\lambda_1 = 1 + 1/\nu_1$, and also $\lambda_1 > |\lambda_i|$ for all $i = 2, \ldots, n$, by assumption,13
this implies
(25)
Let $e^i$ be the n-dimensional column vector whose ith component is 1 and all
other components are zero, and consider the system of difference equations
x(t + 1) =
Then, assuming that all the eigenvalues of M are distinct, the general
solution of this system of difference equations is [as discussed in connection
with (10)]
so that
since the $\lambda_i$'s are the eigenvalues of M and the $x^i$'s are the eigenvectors of
M associated with the $\lambda_i$'s, $i = 1, 2, \ldots, n$. Also (26) implies
n
(28)
r= 1
(30) Ofort> k,
Let $m_i$ be the smallest of such $k_i$'s and let $m \equiv \max\{m_1, m_2, \ldots, m_i, \ldots, m_n\}$.
Then $M^m \cdot e^i > 0$ for all $i = 1, 2, \ldots, n$, so that
(31) $M^m > 0$
as desired. (Q.E.D.)
We are now ready to consider the relative stability property of a balanced
growth solution of the system of difference equations $x(t + 1) = M \cdot x(t)$, where
$M \equiv (I + F)$. Consider the following particular solution, which is a (meaningful)
balanced growth solution if $\lambda_1 > 0$ and $x^1 > 0$:
(32) $x^*(t) = \lambda_1^t x^1$
Hence, assuming that all the eigenvalues of M are distinct, the solution $x^*(t)$
is relatively stable (by definition) if and only if
(33) $\dfrac{\bar{x}_j(t)}{x_j^*(t)} = \dfrac{\sum_{i=1}^{n} h_i \lambda_i^t x_j^i}{\lambda_1^t x_j^1} \to \sigma > 0$ as $t \to \infty$, $j = 1, 2, \ldots, n$
where $x_j^i$ is the jth element of $x^i$, $i = 1, 2, \ldots, n$, and $\bar{x}(t)$ is the general solution
written in the form of (10), whose jth element is $\bar{x}_j(t)$. It is clear then that the
balanced growth path $x^*(t)$ is relatively stable if and only if $\lambda_1 > |\lambda_i|$, $i = 2, \ldots, n$,
with $h_1 > 0$. Hence, in view of the previous lemma, we can conclude the following.
Theorem 6.B.1: Suppose there exists a positive integer m such that $M^m > 0$
with $F^{-1} > 0$; then the balanced growth solution $x^*(t)$ as defined in (32) is relatively
stable, where $\lambda_1^m$ is the Frobenius root of $M^m$. Conversely, if the solution $x^*(t)$ is
relatively stable and $F^{-1} > 0$, then there exists a positive integer m such that
$M^m > 0$.18
REMARK: Nikaido ([32], section 22) proved the first half of this theorem
for the case m = 1 (that is, M > 0), which is a special case of the Solow-
Samuelson theorem.
REMARK: The matrix M is not necessarily a nonnegative matrix in the
above theorem. However, if M is a nonnegative matrix, then $\lambda_1$ becomes
the Frobenius root of M. Moreover, if M is nonnegative and indecomposable
(but not necessarily M > 0), then from Theorem 4.B.4 there exists a positive
integer m such that $M^m > 0$ if and only if M is primitive. Hence we obtain the
following corollary.
Above, we remarked that the balanced growth path $x^*(t)$ defined by (32) is
relatively stable if and only if $\lambda_1 > |\lambda_i|$, $i = 2, \ldots, n$. However, the examination
of (33) also reveals that the path $x^*(t)$ is relatively unstable if
(34) $\lambda_1 < |\lambda_i|$, $i = 2, \ldots, n$
Now suppose that $M^{-m} > 0$ for some positive integer m, where $M^{-m}$ is defined by
$(M^m)^{-1}$. Write the Frobenius root of $M^{-m}$ as $\rho_1$ and observe that $\rho_1 > |\rho_i|$,
$i = 2, \ldots, n$, and $1/\rho_i = \mu_i = \lambda_i^m$, $i = 1, 2, \ldots, n$, which in turn implies condition (34).
On the other hand, if $M \equiv I + F$, where $F^{-1} > 0$, and if (34) holds, we can prove
that there exists a positive integer m such that $M^{-m} > 0$. The proof is analogous
to the sufficiency proof of Tsukui's lemma. Therefore, we obtain the following
direct opposite to Tsukui's lemma, which is also due to Tsukui [44].
$M \equiv I + B^{-1} \cdot (I - A) = \begin{bmatrix} 5/3 & -1/3 \\ -1/3 & 5/3 \end{bmatrix}$
$h_1 + h_2$ and $\bar{x}_2(0) = -h_1 + h_2$. If $\bar{x}_1(0)$ and $\bar{x}_2(0)$ are such that $h_1 = 0$, then
the economy is on a balanced growth path $x_i^*(t) = h_2 (4/3)^t$, $i = 1, 2$. On the other
hand, if $\bar{x}_1(0)$ and $\bar{x}_2(0)$ are such that $h_1 \neq 0$, then one of the outputs eventually
becomes negative. For example, if $h_1 > 0$, then $\bar{x}_2(t) < 0$ for all $t > \bar{t}$ for some $\bar{t}$.
Note that the balanced growth solution $h_1 2^t$ is impossible for any $h_1 > 0$ under
the assumption that $x(0) \geq 0$, that is, $x_i(0) \geq 0$, $i = 1, 2$, with strict inequality for
at least one i.18 In other words, we have shown for the present dynamic Leontief
system that one of the outputs eventually becomes negative except when the
boundary conditions are such that the economy is actually on its only balanced
growth path $x_i^*(t) = h_2 (4/3)^t$, $i = 1, 2$. We may note that if $\bar{x}_i(t)$ becomes negative
for $t > \bar{t}$, then $K_i(t)$, the stock of this ith good, also becomes negative for $t > \bar{t}$
in view of (3). Clearly, negative output and a negative stock of goods do not make
any economic sense in the present discussion. Such a possibility in a dynamic
Leontief system is called "causal indeterminacy."19
Definition: If the relative configuration of the initial outputs (or the initial stock
of goods) does not coincide with that of any possible balanced growth path of
the economy, then the growth path may ultimately reach a situation at which the
output (and the stock) of at least one good becomes negative. If this happens,
then we say that we have causal indeterminacy.
Clearly this possibility of causal indeterminacy seriously undermines the
dynamic Leontief model. Note also that if the economy possesses a balanced
growth path which is relatively stable, then there is no causal indeterminacy in
such an economy. This is, as mentioned earlier, another point of crucial import-
ance in the concept of relative stability.
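Causal indeterminacy can be seen by iterating x(t + 1) = M x(t) from an initial output vector that is off the balanced growth ray. The sketch below assumes the matrix M with diagonal entries 5/3 and off-diagonal entries -1/3 from the example above; the function names and the two starting vectors are hypothetical choices for illustration.

```python
from fractions import Fraction as Fr

# Assumed matrix of the example: M = I + B^{-1}(I - A) with A = (1/3) ones, B = I.
M = [[Fr(5, 3), Fr(-1, 3)], [Fr(-1, 3), Fr(5, 3)]]


def step(M, x):
    """One period of the difference equation x(t + 1) = M x(t)."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]


def first_negative_period(M, x0, horizon=50):
    """First period in which some output is negative, or None within the horizon."""
    x = list(x0)
    for t in range(horizon + 1):
        if any(v < 0 for v in x):
            return t
        x = step(M, x)
    return None


off_path = first_negative_period(M, [Fr(2), Fr(1)])   # not proportional to (1, 1)
on_path = first_negative_period(M, [Fr(1), Fr(1)])    # on the balanced growth ray
```

With these inputs the off-ray start reaches a negative second output after a few periods, while the start on the ray (1, 1) stays positive throughout the horizon.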
C. THE PRICE SYSTEM20
Assume that our economy is equipped with "money" which can be produced
at no cost and which functions as a medium of exchange as well as a unit of
account by which the price of each good is measured. Let p(t) be the price vector in
period t, whose ith element pi(t) denotes the price of the ith good in period t.
We assume that the production of all goods takes exactly one period and that
prices are constant throughout each period. It is assumed that no individual
can affect the price of any good that prevails in the market (the "competitiveness"
assumption).
Consider a bundle of goods denoted by the vector $b_j = (b_{1j}, b_{2j}, \ldots, b_{nj})$.
This bundle of goods gives the necessary configuration of capital equipment for
the production of one unit of the jth good. The value of this bundle is equal to
$v_j \equiv p(t) \cdot b_j = \sum_{i=1}^{n} p_i(t)\, b_{ij}$. Consider an individual who has money in the amount
of $v_j$ at the beginning of period t. Suppose he can either lend this (say to a "bank")
at the rate of interest r(t) or invest it in the production of the jth good.
We assume that no individual can affect the interest rate which prevails in
the economy and that r(t) is the rate which prevails in the economy throughout
where $a_j = [a_{1j}, a_{2j}, \ldots, a_{nj}]$. The current profit for period t per unit production
of the jth good is thus given by
$\pi_j(t) = p_j(t+1) - p_0(t+1)\, a_{0j} - p(t+1) \cdot a_j$
Since the configuration of capital equipment $b_j$ will be worth $p(t+1) \cdot b_j$ at the
beginning of period (t + 1), the total value of his assets at the beginning of period
(t + 1) is given by
(37) $\pi_j(t) + p(t+1) \cdot b_j$
This is the well-known equation of capital theory. We may consider $v_j(t)$ the price
of a unit capacity to produce the jth good. A usual way of interpreting (39) is as
follows: Suppose a person has one unit of money (say, a "dollar") with which he
can buy the capacity for the jth good in the amount of $1/v_j(t)$.21 One unit of this
capacity is worth $v_j(t+1)$ in period (t + 1), so that $v_j(t+1)/v_j(t)$ is the value of
the capacity of the original $1/v_j(t)$ units. Moreover, $1/v_j(t)$ units of capacity
yield current profits of $\pi_j(t)/v_j(t)$ in period t. On the other hand, one dollar, if it
is loaned to a "bank," will be worth $[1 + r(t)]$ at the beginning of the (t + 1)th
period. Thus, in equilibrium, we have
(40) $\dfrac{v_j(t+1)}{v_j(t)} + \dfrac{\pi_j(t)}{v_j(t)} = 1 + r(t)$
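The arbitrage condition (40) can be read as a rule that pins down next period's capacity value once the current value, the current profit, and the interest rate are known. A minimal numerical sketch follows; the function names and the sample figures are hypothetical.

```python
def return_on_capacity(v_now, v_next, profit):
    """Rate of return from holding capacity for one period: left side of (40) minus 1."""
    return (v_next + profit) / v_now - 1.0


def equilibrium_next_value(v_now, profit, r):
    """Capacity value next period that makes the return equal to the interest rate r."""
    return (1.0 + r) * v_now - profit


# Hypothetical figures: capacity worth 100 today, current profit 8, interest rate 5%.
v_next = equilibrium_next_value(100.0, 8.0, 0.05)
implied_rate = return_on_capacity(100.0, v_next, 8.0)
```

Here `implied_rate` reproduces the assumed interest rate, so lending and investing in capacity are equally attractive.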
where ao is an n-vector whose jth element is aoj. This is the basic price equation
of the dynamic Leontief system. This equation is the "dual" of the output equation
(7), which we may rewrite as
implicitly assumed.
In the closed dynamic Leontief system in which labor is subsumed as one of
the industries, the term W does not appear, so that (42) is simplified to
(43) $p(t+1) = [1 + r(t)]\, p(t) \cdot [I + (I - A) \cdot B^{-1}]^{-1}$
This is again a system of n first-order, linear, homogeneous difference equations.
We should note the one fundamental assumption in the above procedure of
obtaining (43). That is, one has to know p(t + 1) in period t. Clearly, p(t + 1) is a
future price in period t; thus this is the assumption of perfect foresight.22
We may also note that this system of equations, (43), is not self-contained.
That is, there are (n + 1) unknown time profiles (the prices of n goods and the
interest rate), whereas (43) contains only n equations. If we specify r(t), then we
can solve this system of equations for $p_i(t)$, $i = 1, 2, \ldots, n$. An unspecified interest
rate is appropriate in the present model, for there is no mechanism in the model
which determines the interest rate. Alternatively, if we set one of the goods to
be the numeraire (say $p_1(t) = 1$ for all t), then (43) determines r(t) and $p_i(t)$,
$i = 2, \ldots, n$. But then there is no mechanism in the model which determines the
absolute prices. The banking system, demand for money, and the like, are not
discussed. In his treatment of the dynamic Leontief price system, Solow proposed
"to let the interest rate hang and treat it as an arbitrary function of time" ([38],
p. 36). Then, soon after this statement, he assumed r(t) constant for all t (p. 36).
Jorgenson [15], in proving his dual stability theorem, assumed that r(t) = 0 for
all t. Here let us also assume r(t) = 0 for all t, so that we now have
(44) p(t + 1) = p(t) N
where $N \equiv [I + (I - A) \cdot B^{-1}]^{-1}$. Let $\rho_i$, $i = 1, 2, \ldots, n$, be the eigenvalues of N
(47) i=1,2,...,n
Then
(48) 9,[I+(I-A)B-1] =(1 i = 1,2,...,n
That is, the (I + ,)'s are the eigenvalues of [I + (I - A)B '] Then we have .
(50) r =1 + ;1
and Pi=9;,i= 1,2,...,n
that $\lambda_1 > |\lambda_i|$, $i = 2, \ldots, n$, with $x^1 > 0$, where $x^1$ is the eigenvector associated with
$\lambda_1 > 0$. The relevant balanced growth path which is relatively stable is
$x^*(t) = \lambda_1^t x^1$. But if $\lambda_1 > |\lambda_i|$, $i = 2, \ldots, n$, then we have, in view of (51),
$p^*(t) = \rho_1^t p^1$
where $\rho_1 > 0$ and $p^1 > 0$ as long as $\lambda_1 > 0$ and $x^1 > 0$. Hence if $x^*(t)$ is relatively
stable, then $p^*(t)$ is not relatively stable, for the ratio $p_i(t)/p_i^*(t)$, $i = 1, 2, \ldots, n$,
does not, in general, converge to a constant as $t \to \infty$ in view of (46). Conversely, if
there exists a $\rho_1 > 0$ and the corresponding eigenvector $p^1 > 0$ such that $\rho_1 > |\rho_i|$,
$i = 2, \ldots, n$, then we have $\lambda_1 < |\lambda_i|$, $i = 2, \ldots, n$. Hence the balanced growth
path $x^*(t)$ is not relatively stable. We may call this result the (Solow-Jorgenson)
dual stability theorem in view of Solow [38] and Jorgenson [15].23
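The dual stability result can be illustrated with the two-good example used for the output system. The sketch below assumes r(t) = 0, A with every entry 1/3, and B equal to the identity, so that N in (44) reduces to the inverse of M; the eigenvalues of M are then 2 and 4/3 and those of N are 1/2 and 3/4. The starting vectors and function names are hypothetical.

```python
from fractions import Fraction as Fr


def inv2(X):
    """Inverse of a 2x2 matrix by the adjugate formula."""
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det], [-X[1][0] / det, X[0][0] / det]]


M = [[Fr(5, 3), Fr(-1, 3)], [Fr(-1, 3), Fr(5, 3)]]   # assumed output matrix
N = inv2(M)                                          # price matrix when B = I and r = 0


def output_step(x):
    """x(t + 1) = M x(t), with x a column vector."""
    return [sum(M[i][j] * x[j] for j in range(2)) for i in range(2)]


def price_step(p):
    """p(t + 1) = p(t) N, with p a row vector, as in (44)."""
    return [sum(p[i] * N[i][j] for i in range(2)) for j in range(2)]


x = [Fr(2), Fr(1)]   # hypothetical initial outputs
p = [Fr(2), Fr(1)]   # hypothetical initial prices
for _ in range(40):
    x = output_step(x)
    p = price_step(p)

output_ratio = x[1] / x[0]   # drifts away from the balanced ray (1, 1)
price_ratio = p[1] / p[0]    # settles on the balanced ray (1, 1)
```

In this example the output path leaves the positive ray while relative prices converge, which is the pattern the dual stability theorem describes.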
Finally, let us examine whether it is possible for the price vector to be constant
for all t. Coming back to (41), put p(t + 1) = p(t) = constant = p, and
$p_0(t)$ = constant = $p_0$. Then we obtain
(53) $p = p \cdot A + r\, p \cdot B + p_0 a_0$
In other words, if the initial price vectors p(0) and $p_0(0)$ happen to be such that
p(0) = p and $p_0(0) = p_0$ and that p and $p_0$ satisfy (53), then all the prices are
constant for all t. For the special case r = 0 and $p_0 a_0 = 0$, that is, the case we are
considering in connection with (44), we have p = 0, assuming $(I - A)$ is nonsingular.
This can also be seen from (44) directly. This means that in the closed
Leontief system with r = 0, the only constant price case is the case in which all
the prices are zero. The constant price solution is the one that Morishima [26]
is concerned with. A discussion of this is given in subsection e.
matter of the demand conditions. In this connection, we should note that point Q
and point P are related in a definite way, for vector d(0) is uniquely specified once
x(0) is specified. Hence the choice of x(0) is uniquely related to the choice of
$\Delta K(0)$ if the equality (54) is to hold, so that K(1) is determined in a definite way.
In other words, the "demand condition" simultaneously determines the output
vector of the current period [here x(0)] and the stock vector of the next period
[here K(1)]. Once the stock vector of the next period is determined, we are ready
to consider the next period by taking the stock vector of this period as a new initial
condition.
Solow [38] then proposed to consider this demand condition as a solution of
an optimization model. Suppose, for example, that the demand condition is not
determined in a decentralized way as the sum of each individual's desires or tastes
but is rather determined by the central planning authority in such a fashion as to
optimize a certain target.
As such a target (following Solow), we choose a weighted sum of the terminal
capital stocks, $\sum_i a_i K_i(T)$, or $a \cdot K(T)$, where $a \geq 0$. Recalling our discussion
in Chapter 1, Section E, we may regard this as the vector maximum problem of
maximizing K(T). In any case, given the value of a, our problem now is a simple
linear programming problem of maximizing $a \cdot K(T)$ subject to
$(I - A) \cdot x(t) \geq c(t) + \Delta K(t)$, $t = 0, 1, 2, \ldots, T - 1$
1,2,...,T- 1
and
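$$
\left[\begin{array}{ccccc|ccccc}
I & 0 & \cdots & 0 & 0 & -C & 0 & \cdots & 0 & 0 \\
-I & I & \cdots & 0 & 0 & 0 & -C & \cdots & 0 & 0 \\
\vdots & & & & \vdots & \vdots & & & & \vdots \\
0 & 0 & \cdots & -I & I & 0 & 0 & \cdots & 0 & -C \\
\hline
0 & 0 & \cdots & 0 & 0 & B & 0 & \cdots & 0 & 0 \\
-I & 0 & \cdots & 0 & 0 & 0 & B & \cdots & 0 & 0 \\
\vdots & & & & \vdots & \vdots & & & & \vdots \\
0 & 0 & \cdots & -I & 0 & 0 & 0 & \cdots & 0 & B
\end{array}\right]
\begin{bmatrix} K(1) \\ K(2) \\ \vdots \\ K(T) \\ x(0) \\ x(1) \\ \vdots \\ x(T-1) \end{bmatrix}
\leq
\begin{bmatrix} K(0) - c(0) \\ -c(1) \\ \vdots \\ -c(T-1) \\ K(0) \\ 0 \\ \vdots \\ 0 \end{bmatrix}
$$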
Clearly the choice variables here are K(t), $t = 1, 2, \ldots, T$, and x(t), $t = 0, 1, 2, \ldots,
T - 1$. It is also possible to generalize the above maximization problem by adding
the primary resource constraints such as $a_0 \cdot x(t) \leq L(t)$, where L(t) is the exogenous
supply of labor in period t. Since such an analysis would be analogous to
the subsequent one, we shall leave it to the interested reader.
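A single period of the planning constraints can be checked mechanically. The sketch below assumes the two constraint families read off the display above, namely (I - A) x(t) >= c(t) + K(t + 1) - K(t) and B x(t) <= K(t), together with nonnegativity; the coefficient values and function names are hypothetical and chosen only for illustration.

```python
def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def is_feasible_step(A, B, c, K_now, K_next, x, tol=1e-12):
    """Check one period of the planning constraints for given x(t) and K(t + 1)."""
    Ax = mat_vec(A, x)
    Bx = mat_vec(B, x)
    n = len(x)
    # Net output must cover final demand plus capital accumulation.
    net_output_ok = all(x[i] - Ax[i] >= c[i] + (K_next[i] - K_now[i]) - tol for i in range(n))
    # Capital required by production may not exceed the available stock.
    capacity_ok = all(Bx[i] <= K_now[i] + tol for i in range(n))
    nonnegative = all(v >= -tol for v in list(x) + list(K_next))
    return net_output_ok and capacity_ok and nonnegative


# Hypothetical two-good data.
A = [[0.2, 0.1], [0.1, 0.3]]
B = [[1.0, 0.5], [0.5, 1.0]]
c = [1.0, 1.0]
K0 = [10.0, 10.0]

ok = is_feasible_step(A, B, c, K0, [12.0, 11.5], [5.0, 5.0])
too_much_investment = is_feasible_step(A, B, c, K0, [13.0, 11.5], [5.0, 5.0])
beyond_capacity = is_feasible_step(A, B, c, K0, [10.0, 10.0], [8.0, 8.0])
```

Because the capacity constraint is an inequality, a plan with idle capital passes the check, which is how this formulation sidesteps causal indeterminacy.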
In order to consider the price implications of the above problem, Solow con-
sidered the dual of this linear programming problem, which can be easily obtained
by constructing the dual constraints from the above original constraints. That is,
the dual constraints are explicitly written out as follows:
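$$
\left[\begin{array}{ccccc|ccccc}
I & -I & \cdots & 0 & 0 & 0 & -I & \cdots & 0 & 0 \\
\vdots & & & & \vdots & \vdots & & & & \vdots \\
0 & 0 & \cdots & I & -I & 0 & 0 & \cdots & 0 & -I \\
0 & 0 & \cdots & 0 & I & 0 & 0 & \cdots & 0 & 0 \\
\hline
-C' & 0 & \cdots & 0 & 0 & B' & 0 & \cdots & 0 & 0 \\
0 & -C' & \cdots & 0 & 0 & 0 & B' & \cdots & 0 & 0 \\
\vdots & & & & \vdots & \vdots & & & & \vdots \\
0 & 0 & \cdots & 0 & -C' & 0 & 0 & \cdots & 0 & B'
\end{array}\right]
\begin{bmatrix} u(0) \\ u(1) \\ \vdots \\ u(T-2) \\ u(T-1) \\ q(0) \\ q(1) \\ \vdots \\ q(T-1) \end{bmatrix}
\geq
\begin{bmatrix} 0 \\ \vdots \\ 0 \\ a \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
$$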
as well as the nonnegativity constraints, $u(t) \geq 0$, $q(t) \geq 0$, $t = 0, 1, \ldots, T - 1$. The
objective of this dual problem is to minimize
$u(0) \cdot [K(0) - c(0)] - \sum_{t=1}^{T-1} u(t) \cdot c(t) + q(0) \cdot K(0)$
Now assuming that all stocks are always positive, that is, K(t) > 0 for all t, we
obtain from (60)
(68) $[1 + r(t)]\, p(t) = R(t+1) + p(t+1)$
from which we can easily obtain
(69) $\dfrac{p_j(t+1) - p_j(t)}{p_j(t)} + \dfrac{R_j(t+1)}{p_j(t)} = r(t)$, for all t
which corresponds to our old intertemporal-arbitrage equilibrium condition (39).
Note that $R_j(t+1)$ corresponds to $\pi_j(t)$, which is the current profit (own rent) of
period t from a spectrum of equipment which has a capacity of producing one unit
of the jth good. On the other hand, $R_j(t)$ is the rent on a stock consisting of one unit
of the jth good. Also from (61) we obtain [by assuming K(T) > 0]
p(T - 1)
(70) a= T-
,=I[ 1 + r(i)]
Using (66) and (67), the constraint $-C' \cdot u(t) + B' \cdot q(t) \geq 0$ can be written as
(71) $-C' \cdot p(t) + B' \cdot R(t) \geq 0$, or $p(t) \leq A' \cdot p(t) + B' \cdot R(t)$
Suppose K(t) > 0 for all t so that (60) holds; then we have
(72) $(B + C)' \cdot p(t+1) \leq [1 + r(t)]\, B' \cdot p(t)$, for all t
If x(t) > 0, then (71) holds with equality for all t in view of (62). Hence (72) holds
with equality for all t, and we can easily rewrite it as the basic price equation (43)
in the dynamic Leontief system. [If we had incorporated the labor constraint
$a_0 \cdot x(t) \leq L(t)$ in the original maximization problem, we would obtain equation
(42) instead.]
If some stock is not held for some period, that is, if $K_j(t) = 0$ for some j and
some t, then (69) does not necessarily hold with equality. In other words,
(73) $\dfrac{p_j(t+1) - p_j(t)}{p_j(t)} + \dfrac{R_j(t+1)}{p_j(t)} \leq r(t)$
The strict inequality here would induce a holder of the stock to get rid of it. We
henceforth assume $K_j(t) > 0$ for all j and for all t.
Suppose that $a_j > 0$, that is, the terminal stock of the jth good is positively
weighted. Then we can show that $u_j(t) > 0$ for all t. To see this, observe that, owing
to (60), $u_j(t) = 0$ implies $u_j(t+1) = 0$ and $q_j(t+1) = 0$, so that $u_j(t+2) =
u_j(t+3) = \cdots = u_j(T-1) = 0$. But $u_j(T-1) = a_j > 0$. Hence $u_j(t) = 0$ is impossible
for any t. Therefore, assuming a > 0, we can conclude that u(t) > 0 for all t.
Hence, in view of (63), we obtain
(74) $x(t) = A \cdot x(t) + \Delta K(t) + c(t)$, for all t
In terms of Figure 6.5 this means that the output must always be chosen on the
frontier, that is, the kinked line EFG, as we remarked earlier.
If the stock of the jth good has excess capacity in period t, then in view of (65),
we have $q_j(t) = 0$, so that $R_j(t) = 0$. Then, owing to (69),
(75) $p_j(t+1) = [1 + r(t)]\, p_j(t)$
In other words, the nominal price of the jth good will increase simply at the
compound rate of interest r(t).
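The price recursion implied by (68) can be traced period by period, which makes the special case (75) easy to see. The sketch below treats a single good with scalar prices and rents; the function names and the numerical rents and interest rates are hypothetical.

```python
def next_price(p_now, rent_next, r):
    """Solve (68) for next period's price: p(t + 1) = [1 + r(t)] p(t) - R(t + 1)."""
    return (1.0 + r) * p_now - rent_next


def price_path(p0, rents, rates):
    """Price sequence generated by repeated use of next_price."""
    path = [p0]
    for rent_next, r in zip(rents, rates):
        path.append(next_price(path[-1], rent_next, r))
    return path


# Excess capacity in every period: zero rent, so the price compounds at the rate r.
idle = price_path(1.0, rents=[0.0, 0.0, 0.0], rates=[0.1, 0.1, 0.1])
# Hypothetical case in which the rent exactly offsets the interest charge.
used = price_path(1.0, rents=[0.1, 0.1, 0.1], rates=[0.1, 0.1, 0.1])
```

With zero rent the path grows by the factor 1 + r each period, as (75) states; with a positive rent the capital gain needed for equilibrium is correspondingly smaller.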
What does this all add up to? It may be worthwhile to recapitulate some of
the results obtained above.
(i) The problem of causal indeterminacy can be avoided simply by converting the
equalities to inequalities, that is, by allowing excess capacity of capital together
with the nonnegativity constraint.
(ii) The dual variables u(t) and q(t) play the role of prices in the competitive mech-
anism. For example, the competitive intertemporal arbitrage equation (69) is
obtained by interpreting the dual variables as prices.
(iii) The values of the dual variables can be computed explicitly by solving the dual
linear programming problem.
(iv) There are certain important relations between prices [p(t) and R(t)] and the
real variables [x(t) and K(t)] implied by relations (59) to (65). Some of them are
obtained as (69), (71), (72), (73), (74), and (75). The possibility of zero prices
and of excess capacity is a novel feature in the present formulation.
(v) The price equation (43) holds if K(t) > 0 and x(t) > 0 for all t.
Finally, we should stress that the above model à la Solow is a planning model
and not a descriptive model of an economy. Although it can be interpreted as an
"optimal path" generated by a "competitive" mechanism, it does not describe the
mechanism nor the equilibrium of a competitive dynamic economy. This is in
marked contrast to Morishima's model, which we describe in the next subsection.
However, if we recall the treatment of a competitive (static) equilibrium in terms of
linear programming (with the duality theorem) by Kuhn [18] and DOSSO [4]
(see our Chapter 2, Section E, subsection a), we realize immediately that we can
construct a model for a competitive dynamic economy utilizing the above planning
model and prove various properties of the model such as existence, and so on. In
other words, we may consider the above planning model as a part of a descriptive
model.
There is another possible route of development in the above planning
model, and that is dropping the assumption of fixed coefficients and allowing
various production processes for the production of one or more goods, while still
retaining the basic planning character of the model. A natural question which
then arises is that of characterizations of the "optimal path." The turnpike
theorems that we discuss in the next chapter (Section A) are concerned with this
question, under the assumption that the planning horizon T is long enough. This
route is further investigated by Gale, and others, with a more satisfactory treat-
ment of consumption (Section B of Chapter 7).
Leontief model is that the production coefficients such as the $a_{ij}$'s and $b_{ij}$'s are
no longer assumed to be constant. That is, these coefficients can now vary depending
on changes in the prices of goods (which are used as factors of production) as
well as on changes in the wage rate.26
One of the important objectives of Morishima in the above-mentioned
papers was to prove that the coefficients, the $a_{ij}$'s and $b_{ij}$'s, are, in fact, constant.
Although these coefficients are allowed to vary, only fixed values are chosen in
equilibrium so that the usual analysis of the dynamic Leontief system with fixed
coefficients could be justified. This is an extension of the famous substitution
theorems27 of Samuelson from a static to a dynamic case.
Although we will argue that his dynamic substitution theorem holds only
for very limited cases, his introduction of the possibility of factor substitution
is a very important contribution. For example, as we remarked in subsection a,
it could enable us to avoid the difficulty of causal indeterminacy which is in-
herent in the dynamic Leontief model with fixed coefficients. In other words,
suppose that for certain period(s) of time, the stock of some good diminishes.
Then, under the usual circumstances, such a decumulation would increase the
marginal productivity of this good when it is used as a factor of production, which
may, in turn, encourage the production of this good. Hence the decumulation of
the stock of this good could be stopped, thus avoiding causal indeterminacy.28
We do not, however, attempt to prove this observation rigorously in this
subsection. We leave this to the interested reader. Here, instead, we try to build
a foundation for such an attempt by discussing Morishima's model critically.
This will enable the reader to understand how the model is to be constructed
when factor substitution is allowed. In this connection we may also point out an
interesting feature in Morishima's model, that is, he distinguishes capital goods
from noncapital goods explicitly, so there are goods that are never used as capital
goods (unlike the usual Leontief model in which the nonsingularity of the B
matrix is assumed).
Second, Morishima's model represents a polar assumption with regard to the
treatment of the price equation compared with our price equation (41) à la
Solow [38]. In other words, as we will discuss later, whereas (41) represents
the dynamic price equation with changing prices but perfect foresight, Mori-
shima's price equation is based on the assumption that entrepreneurs always
expect prices to remain constant.29
(77) k = m + 1,...,n
where $x_{ij}$ is the amount of the ith good used for the production of the jth (noncapital)
good and $x_{ik}$ is the amount of the ith good used for the production of
the kth (capital) good. When i = 0, it refers to labor. It is assumed that (homogeneous)
labor is the only primary factor of production. Assuming that these
production functions (the $f_j$'s and $f_k$'s) exhibit constant returns to scale (that is,
linear homogeneity), and then dividing both sides of the above equations by $x_j$ or
$x_k$, we obtain
(78) $1 = f_i(a_{0i}, \ldots, a_{mi}, b_{m+1,i}, \ldots, b_{ni})$, $i = 1, 2, \ldots, n$
where $a_{ji}$ and $b_{ki}$ are now defined as $a_{ji} \equiv x_{ji}/x_i$, $j = 0, 1, \ldots, m$, and $b_{ki} \equiv x_{ki}/x_i$,
$k = m + 1, \ldots, n$ ($i = 1, 2, \ldots, n$).
Unlike the usual dynamic Leontief system, these $a_{ji}$'s and $b_{ki}$'s are not
constant. They are assumed to depend on prices, thus reflecting the substitutability
of these goods as factors of production. That is,
(79) $a_{ji} = a_{ji}(p_0, p_1, \ldots, p_n)$, $j = 0, 1, \ldots, m$; $i = 1, 2, \ldots, n$
(80) $b_{ki} = b_{ki}(p_0, p_1, \ldots, p_n)$, $k = m + 1, \ldots, n$; $i = 1, 2, \ldots, n$
It is assumed that $a_{ji} > 0$ and $b_{ki} > 0$ for all i, j, and k. The $a_{ji}$'s and $b_{ki}$'s may
be chosen to minimize the cost of production subject to the production constraints
(76) and (77) [or (78)].
The treatment of price equations in Morishima [26] is of the traditional
Walrasian type and is different from Solow's. Let $p_i$ ($i = 1, 2, \ldots, m$) be the price
of the noncapital good i, $p_k$ ($k = m + 1, \ldots, n$) be the price of capital service k,
and $p_0$ be the wage rate. Let $P_k$ be the price of new capital good k and let $\delta_k' P_k$
and $\delta_k'' P_k$ be the depreciation charges and the insurance premium, respectively, to
be deducted from the gross income $p_k$ of one unit of capital good k ($k =
m + 1, \ldots, n$). Then the net price of capital service k is $p_k - (\delta_k' + \delta_k'') P_k$. Let r be
the rate of interest. Following Walras [46], Morishima supposed that the net price
of each capital service is equal to $r P_k$. That is, $p_k - (\delta_k' + \delta_k'') P_k = r P_k$, or30
(81) $P_k = \dfrac{p_k}{r + \delta_k}$, $\delta_k \equiv \delta_k' + \delta_k''$, $k = m + 1, \ldots, n$
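Relation (81) is a capitalization formula: the price of a new capital good equals its service price divided by the interest rate plus the combined depreciation and insurance rate. A small sketch follows; the function name and the sample rates are hypothetical.

```python
def capital_good_price(service_price, r, depreciation_rate, insurance_rate):
    """Price of a new capital good implied by (81), given its service price."""
    delta = depreciation_rate + insurance_rate      # combined deduction rate
    if r + delta <= 0:
        raise ValueError("r + delta must be positive for a finite positive price")
    return service_price / (r + delta)


# Hypothetical figures: service price 12, interest 5%, depreciation 6%, insurance 1%.
P_k = capital_good_price(12.0, 0.05, 0.06, 0.01)
net_service_price = 12.0 - (0.06 + 0.01) * P_k      # gross income less deductions
```

With these figures the net service price equals the interest rate times `P_k`, which is the condition from which (81) is derived.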
Under a regime of perfect competition, owing to free entry and exit of firms,
profit is zero for all industries. Thus we have, for each period,
(82) $p_i = \sum_{j=0}^{m} a_{ji}\, p_j + \sum_{k=m+1}^{n} b_{ki}\, p_k$, $i = 1, \ldots, m$
(83) $P_i = \sum_{j=0}^{m} a_{ji}\, p_j + \sum_{k=m+1}^{n} b_{ki}\, p_k$, $i = m + 1, \ldots, n$
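Let
$A_1 = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mm} \end{bmatrix}$, $B_1 = \begin{bmatrix} b_{m+1,1} & b_{m+1,2} & \cdots & b_{m+1,m} \\ b_{m+2,1} & b_{m+2,2} & \cdots & b_{m+2,m} \\ \vdots & & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nm} \end{bmatrix}$,
$A_2 = \begin{bmatrix} a_{1,m+1} & a_{1,m+2} & \cdots & a_{1n} \\ a_{2,m+1} & a_{2,m+2} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m,m+1} & a_{m,m+2} & \cdots & a_{mn} \end{bmatrix}$, $B_2 = \begin{bmatrix} b_{m+1,m+1} & \cdots & b_{m+1,n} \\ b_{m+2,m+1} & \cdots & b_{m+2,n} \\ \vdots & & \vdots \\ b_{n,m+1} & \cdots & b_{nn} \end{bmatrix}$
Then writing $p^1 = (p_1, p_2, \ldots, p_m)$, $p^2 = (p_{m+1}, \ldots, p_n)$, $P^2 = (P_{m+1}, \ldots, P_n)$,
$l_1 = (a_{01}, \ldots, a_{0m})$, and $l_2 = (a_{0,m+1}, \ldots, a_{0n})$, we can rewrite (82) and (83) as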
(84)
(85)
BI B2-15 02 B2 12 (rl + 8)
are constant, so there are no capital gains or losses by holding a stock of goods.
However, to say that Morishima's equation is a special case of Solow's equa-
tion may be a bit too strong. The best stand with regard to this comparison can be
found in Solow's own writing, where he states ([38] , p. 32):
Morishima's model and mine can be reconciled by recognizing that they rep-
resent polar assumptions about price expectations. I assume that entre-
preneurs have perfect foresight and (correctly) expect prices to change, and
I ask what price movements are then logically consistent. Morishima assumes
that entrepreneurs always expect prices to remain constant although in
fact they do change from time to time in order to clear markets, and he asks
what set of constant prices (and interest rate) can actually be made to endure.
It's a toss-up which assumption does more violence to reality.
In other words, what Morishima [26] was concerned with is the possibility of
enduring constant prices. Since he wished to prove the dynamic substitution
theorem in which prices are uniquely chosen (thus constant), he must find the set
of prices (and interest rate) that can be kept constant. Therefore, his way of
treatment of the price equation is a natural consequence of his interest in proving his
dynamic substitution theorem.31
We now count the number of equations and the number of unknowns. For
this purpose, first note that the matrices A and B are completely specified once
the price vector $(p_0, p)$ is given. Hence we may write (88) as
(89) $p = p \cdot A(p_0, p) + r\, p \cdot B(p_0, p) + p_0 l$
which we may also write as
(89') $\Phi(p_0, p, r) = 0$
It is well known (and can be shown easily) that the $a_{ji}$'s and the $b_{ki}$'s are all
homogeneous of degree zero in $(p_0, p)$. Hence we have $A(\alpha p_0, \alpha p) = A(p_0, p)$
and $B(\alpha p_0, \alpha p) = B(p_0, p)$ for any positive number $\alpha$. Hence (89) implies
(90) $\alpha p = \alpha p \cdot A(\alpha p_0, \alpha p) + r(\alpha p) \cdot B(\alpha p_0, \alpha p) + \alpha p_0 l$
for any $\alpha > 0$. In other words,
(91) $\Phi(\alpha p_0, \alpha p, r) = 0$
for any $\alpha > 0$. In view of the homogeneity of $\Phi$ in $(p_0, p)$, we may impose the
following price normalization equation:32
(92) $\sum_{i=0}^{n} p_i = 1$
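The zero-degree homogeneity of the cost-minimizing coefficients, and the normalization (92), can be illustrated with a specific technology. The sketch below assumes a two-input Cobb-Douglas unit isoquant, which is a hypothetical choice for illustration and not the production function used in the text; the function names and the exponent `alpha` are likewise assumptions.

```python
def unit_coefficients(p_a, p_b, alpha=0.3):
    """Cost-minimizing inputs (a, b) on the unit isoquant a**alpha * b**(1 - alpha) = 1."""
    k = (1.0 - alpha) * p_a / (alpha * p_b)   # cost-minimizing ratio b / a
    a = k ** (-(1.0 - alpha))
    b = k ** alpha
    return a, b


def normalize(prices):
    """Rescale a positive price vector so that its components sum to 1, as in (92)."""
    total = sum(prices)
    return [p / total for p in prices]


base = unit_coefficients(1.0, 2.0)
scaled = unit_coefficients(3.0, 6.0)        # all prices tripled
simplex_prices = normalize([1.0, 2.0, 5.0])
```

Since the coefficients depend only on relative prices, multiplying every price by the same positive number leaves them unchanged, which is why one normalization such as (92) can be imposed without loss.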
Equations (89) and (92) combined provide us with (n + 1) equations. There are
(n + 2) variables to be determined within the system, that is, $p_i$, $i = 0, 1, \ldots, n$, and
r. Hence if we can preassign the value of either $p_0$ or r, the system is completely
specified. This is schematically described by
(93-a) $r \to (p_0, p)$
(93-b) $p_0 \to (p, r)$
The values of $(p_0, p)$ or $(p, r)$ thus determined define the equilibrium values of the
system. Note that the mechanism described in (93) does not determine the
absolute (monetary) prices of the goods. This is because there are two degrees of
freedom in (89) and one of them is controlled by (92). Alternatively, we may
specify both r and $p_0$ exogenously where (92) is not binding; then the absolute
prices of the goods can all be specified.
As we remarked in Chapter 1, Section E, subsection a, the procedure of
counting the number of equations and the number of unknowns merely checks the
consistency of the model and proves neither the existence nor the uniqueness
of the equilibrium values of the unknowns. The task of establishing the existence,
uniqueness, and nonnegativity of the equilibrium values was attempted by
Morishima [26]. The problem of existence is not a particularly difficult one. The
continuity of the linear maps $A(p_0, p)$ and $B(p_0, p)$ immediately establishes the
continuity of $\Phi(p_0, p, r)$. For the $r \to (p_0, p)$ determination, it then suffices to
consider a continuous map from the unit simplex $\{(p_0, p): \sum_{i=0}^{n} p_i = 1\}$ into itself
and to apply the Brouwer fixed point theorem (Theorem 2.E.2). The existence of
the equilibrium values for the $p_0 \to (p, r)$ specification can be proved analogously.33
For an attempt to prove the uniqueness, the reader is referred to
Morishima [26]. Unfortunately, Morishima [26] apparently forgot to prove the
nonnegativity of the equilibrium values. Take the case of the $p_0 \to (p, r)$ specification,
for example. Just looking at (89), it can immediately be seen that if $p_0$ is large
enough, r may have to be negative to preserve the nonnegativity of the $(p_0, p)$
vector by (92). Hence it is not surprising to find the brilliant example due to
Georgescu-Roegen [5], in which the value of r is negative when the value of $p_0$
is preassigned. Such a defect can be remedied if we can set an upper bound on the
value of $p_0$. Morishima and Murata [29] thus "remedied" this defect simply by
assuming such a bound.34
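The possibility of a negative r under the $p_0 \to (p, r)$ specification can be seen in a one-good sketch. This is not Georgescu-Roegen's example; it is a hypothetical illustration with a single produced good, fixed coefficients a, b, l (assumed values), and the normalization $p_0 + p = 1$, so that (89) can be solved for r directly.

```python
# One produced good plus labor; the coefficients a, b, l are illustrative
# assumptions and are held fixed, unlike in the text.

def interest_rate_given_wage(p0, a, b, l):
    """Solve p = p*a + r*p*b + p0*l together with p0 + p = 1 for r."""
    p = 1.0 - p0                      # normalization (92)
    return ((1.0 - a) * p - p0 * l) / (b * p)


for p0 in (0.25, 0.50, 0.75):
    print(p0, interest_rate_given_wage(p0, a=0.5, b=1.0, l=0.5))
```

With these numbers r is positive for the low wage, zero at $p_0 = 0.5$, and negative for the high wage, which is the sense in which an upper bound on $p_0$ is needed.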
That the value of $(p_0, p)$ is uniquely determined for a given value of r, or that
the value of $(p, r)$ is uniquely determined for a given value of $p_0$, has a very important
implication; it means that the $a_{ji}$'s and $b_{ki}$'s remain constant regardless of
any change in the values of the final demand for goods, provided that these
changes do not disturb the preassigned values of r and $p_0$. This means, under
certain assumptions, that a perfectly competitive economy would choose to
produce each good by one process. Hence (as remarked before) the above result,
essentially obtained by Morishima [26], is considered an extension of Samuelson's
famous substitution theorem for the static Leontief system to the case of
intertemporal production (see, for example, Hahn and Matthews [9], p. 870).
Under what circumstances are the values of r and $p_0$ determined in such a
way that they are undisturbed by changes in the final demand for goods? An
obvious case is the "Keynesian situation" in which the money rate of interest r is
fixed owing to the "liquidity trap" in the money market and/or the money wage
rate is fixed owing to its rigidity in the labor market.35 However, except for such
rather extreme cases, we cannot, in general, establish that the values of r and $p_0$
are undisturbed by changes in the final demand for goods. This obviously undermines
the use of Morishima's substitution theorem in the dynamic Leontief system
described above, contrary to the belief of Morishima [26] and Hahn and
Matthews [9].
To illustrate this point, consider the $p_0 \to (p, r)$ determination. In this case,
the equilibrium value of $(p, r)$ is uniquely determined as long as the value of $p_0$
is given. But if the value of $p_0$, that is, the wage rate, changes, the values of p
and r also move, thus causing changes in the values of the $a_{ji}$'s and the $b_{ki}$'s.
This is not the case for the substitution theorem of the static Leontief model.36
Theoretically speaking, the (absolute) values of $p_0$ and r are determined
under a broader general equilibrium system which incorporates the markets
ignored in the above consideration (that is, the markets for labor, money, bonds,
and so on) with the introduction of the store of value as a function of money as
well as of bonds. If we incorporate these markets into our analysis, it is more difficult,
at least for this author, to accept Morishima's basic postulate of constant
prices, for constant prices presuppose certain assumptions with regard to the
supply of money and the like.37 Suppose now that prices change over time. Then
the basic equation (89) is no longer valid, for it lacks the term that signifies price
changes, which certainly affect the profit condition.38 This factor, among others,
may destroy the homogeneity of the function $\Phi$, which in turn makes the price
normalization equation (92) invalid.
In this connection, we may point out that within the profession an active
interest has arisen recently in incorporating money (which, among other things,
functions as a store of value) into a growth model (see Tobin [41] and the ap-
pendix to this section, for example).39 Although such an analysis in the literature
has so far been confined to the model in which there is only one commodity be-
sides labor, it is certainly possible to extend it to a multicommodity model by
using the dynamic Leontief system as discussed in this section.
Keeping this in mind, let us return to the constant price world to explore
further implications of such a model. Consider (89) and normalize the price vector
p by the wage rate $p_0$ instead of the price normalization represented by equation
(92). Thus rewrite (89) as
(94) $\bar{p} = \bar{p}A^*(\bar{p}) + r\,\bar{p}B^*(\bar{p}) + l$
where $\bar{p} = (\bar{p}_i)$, $\bar{p}_i = p_i/p_0$, i = 1, 2, ..., n, $A^*(\bar{p}) = A(1, p/p_0)$, and
$B^*(\bar{p}) = B(1, p/p_0)$. There are n equations in (94) with (n + 1) variables, $\bar{p}$ and r.
Suppose that the value of r is preassigned, and ask the question whether it is possible
to have identical technology matrices $A^*$ and $B^*$ for two different values of
r. If it is possible, we call such a phenomenon reswitching of techniques.40 If the
relation between r and $\bar{p}$ is one-to-one, then "reswitching" is obviously impossible.41
Note that the relation between r and $\bar{p}$ may not be one-to-one, even if
the functions $A^*(\bar{p})$ and $B^*(\bar{p})$ are continuous and one-to-one.
Actually we can also show that reswitching is impossible even in the absence
of a one-to-one relation between r and $\bar{p}$,42 if $\bar{p}$ is strictly positive for all relevant
changes in r. To prove this, let r and r' be two interest rates, and suppose that reswitching
is possible. In other words, suppose that we have the same matrices $A^*$
and $B^*$ for both r and r'. Since $A^*(\bar{p})$ and $B^*(\bar{p})$ are one-to-one, we have the same
$\bar{p}$ for r and r'. Therefore
(95-a) $\bar{p} = \bar{p}A^* + r\,\bar{p}B^* + l$
and
(95-b) $\bar{p} = \bar{p}A^* + r'\,\bar{p}B^* + l$
Hence
(96) $(r - r')\,\bar{p}B^* = 0$
Since $\bar{p} > 0$ by assumption, we obtain r = r' as long as at least one element of
$B^*$ is positive. This proves the impossibility of the reswitching of techniques.
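The step from (95) to r = r' can be illustrated numerically: once the normalized price vector and the technique are given, every component of (94) with a positive capital cost returns the same r. The sketch below uses hypothetical 2 x 2 matrices and a hypothetical price vector; the helper names `row_times` and `implied_rates` are illustrative only.

```python
# Sketch: given a positive normalized price vector and one technique
# (A*, B*, l), equation (94) can hold for only one interest rate.
# All numbers are illustrative assumptions.

def row_times(p, M):
    """Row vector p times matrix M."""
    n = len(p)
    return [sum(p[i] * M[i][j] for i in range(n)) for j in range(n)]


def implied_rates(p, A, B, l):
    """Interest rate implied by each component of p = pA + r pB + l."""
    pA, pB = row_times(p, A), row_times(p, B)
    return [(p[j] - pA[j] - l[j]) / pB[j] for j in range(len(p)) if pB[j] > 0]


A_star = [[0.2, 0.1], [0.1, 0.3]]
B_star = [[0.5, 0.2], [0.3, 0.4]]
p_bar = [1.0, 2.0]
r_true = 0.07

# Choose the labor coefficients so that (94) holds exactly at r_true.
pA, pB = row_times(p_bar, A_star), row_times(p_bar, B_star)
l_vec = [p_bar[j] - pA[j] - r_true * pB[j] for j in range(2)]

print(implied_rates(p_bar, A_star, B_star, l_vec))
```

Each entry of the printed list equals `r_true` up to rounding, so a second rate r' with the same prices and technique cannot satisfy (94).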
Let us now turn to the determination of outputs in the above system. Let
$x_j(t)$, j = 1, 2, ..., m, represent the output of noncapital good j and let $x_k(t)$,
k = m + 1, ..., n, represent the output of capital good k. Then we have
(97) $x_j(t) = \sum_{i=1}^{n} a_{ji}x_i(t) + c_j(t)$, j = 1, 2, ..., m
where $c_j(t)$, j = 1, 2, ..., m, represents the final demand for the jth noncapital
good. Here, unlike the usual dynamic Leontief system, $c_j$ as well as the $a_{ji}$'s are
functions of the prices $(p_0, p_1, ..., p_n)$.
Let $y_k(t)$ be the existing quantity of capital good k in period t. Assuming that
the existing capital goods are all fully employed, we have
(98) $y_k(t) = \sum_{i=1}^{n} b_{ki}x_i(t)$, k = m + 1, ..., n
Since the obsolete and destroyed portions of capital goods are $\delta_k y_k(t)$, k =
m + 1, ..., n, we obtain
(99) $x_k(t) = \delta_k y_k(t) + [y_k(t + 1) - y_k(t)] + c_k(t)$, k = m + 1, ..., n
where $c_k(t)$ is the final demand for capital good k. Hence, in view of (98), we obtain
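(101) $A \equiv \begin{pmatrix} A_1 & A_2 \\ \delta B_1 & \delta B_2 \end{pmatrix}$, $B \equiv \begin{pmatrix} 0_1 & 0_2 \\ B_1 & B_2 \end{pmatrix}$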
$\lambda_1^m$ is the Frobenius root of $M^m > 0$, since $\lambda_1^m > |\lambda_i^m|$, i = 2, ..., n [$\because \lambda_1 > |\lambda_i|$,
i = 2, ..., n]. Note that the assumption $F^{-1} > 0$ can be weakened so that $F^{-1}$ is
nonnegative, indecomposable, and primitive. Incidentally, the x1 in (32) is a positive
vector as is clear from the proof of Tsukui's lemma.
17. When the output system (6) is replaced by x(t) = A x(t) + B [x(t) - x(t - 1)] +
T (t), where the coefficients in B now can be interpreted similarly to the coefficients
in the usual acceleration principle, then relative stability is considered to be empiri-
cally more plausible. I owe this observation to Jinkichi Tsukui. In this system,
investment is assumed to be "passive," that is, it takes place only to supplement the
excess demand for capacity in the preceding period.
18. It suffices to show that we cannot have $h_2 = 0$. To show this, suppose the contrary,
or $h_2 = 0$. Then $x_1(0) = -x_2(0)\,(= h_1)$, which contradicts $x(0) \geq 0$.
19. Our Example 1, as an illustration of causal indeterminacy, is due to DOSSO [4].
20. The formulation of the price system here is due to Solow [38]. See also DOSSO [4]
and Samuelson [35].
21. Assume perfect divisibility of the capital good.
22. Because (43) is obtained by comparing only two periods, the assumption is also
known as that of "myopic perfect foresight."
23. Observing this, Jorgenson proved Solow's conjecture that in the closed Leontief
model, if the output system is relatively stable, then the price system is relatively
unstable and vice versa, provided that n > 2. A similar result is obtained by Uzawa
[45], M. Fukuoka, and H. Niida.
24. In other words, unlike the notation in subsection b, K(t) denotes only the supply of
capital and does not denote the demand for capital, whereas $\Delta K(t)$ denotes the
demand for an increase in the supply of capital.
25. Mathematically speaking, this means that although inequality may be allowed in
(54) [that is, $x(t) \geq Ax(t) + \Delta K(t) + e(t)$], only the equality case is chosen.
Such a choice can be justified if for each good there exists an individual who is
not satiated with the good.
26. One difference between [26] and [27] is that in [26] such a neo-classical "smooth"
substitution with a continuum of activities is assumed, whereas in [27] a "discrete"
substitution with a finite number of activities is assumed. Although the latter
resembles "reality" more closely, there is little theoretical difference between the
two approaches. Hence we mostly adhere to the simpler neo-classical case [26].
27. Since the "substitution theorem" asserts that the input coefficients are in fact
fixed, it is often (perhaps more properly) called the nonsubstitution theorem.
28. If we allow such a price flexibility and factor substitution, then all the factors (here
labor and the stocks of goods = capital) are fully employed. Hence Morishima's
model is in sharp contrast to Solow's excess capacity model [38] described in
subsection d. It resembles more closely Solow's neo-classical revision [37] of the
Harrod-Domar model. Morishima's model also contrasts sharply with Jorgenson's
descriptive excess capacity model [16] which converts Solow's optimization model
[38] to a descriptive model, allowing excess capacities but sticking to the fixed
coefficients of production (as in Solow [38]). Jorgenson's model [16] thus also
aims to remove the difficulty of causal indeterminacy and the dual stability theorem,
but has unfortunately attracted severe criticism by McManus [25].
29. In other words, Morishima assumed static expectations. As we will discuss later,
he then asked: What set of constant prices can actually be made to endure so that
such an expectation is realized for each t (which also, in fact, implies perfect foresight)?
Such a state of constant prices is sustained, except for knife edge cases,
only on a balanced growth or decay path [often termed the steady state (growth)
path]. These two polar assumptions with regard to prices and expectation-that is,
perfect foresight with changing prices and static foresight with constant prices-are
quite common in growth theory literature. Both these assumptions are clearly
unrealistic. This rather unfortunate state of the theory is, among other things, due
to the fact that we do not have any elaborate theory with regard to future expectation
and uncertainty. See also the Appendix to this section.
30. As in Walras [46] , part V, (81) is crucial in Morishima in establishing the consistency
of the system with capital accumulation or decumulation under constant prices (that
is, a balanced growth or decay path). Note that (81) also signifies that the returns to
various capital goods are equal, or [pk - (8k + 8k)Pk] /Pk is the same for all k
(and equal to r). This means that capital goods are freely transferable among
industries. In the absence of price changes, r is equal to the interest rate (or the own
rate of interest in the moneyless economy).
31. As we remarked earlier, the state of constant prices is sustained, except for knife
edge cases, only in a state of balanced growth or decay. It is not clear how the
economy reaches such a steady state starting from a historically given initial point.
This question, in spite of its great importance, was not investigated by Morishima,
thus undermining the significance of his work.
32. Choose $\alpha = 1/\sum_{i=0}^{n} p_i$ and let $\hat{p}_i \equiv \alpha p_i$, i = 0, 1, ..., n. Then clearly $\sum_{i=0}^{n} \hat{p}_i = 1$.
The imposition of (92) amounts to writing $p_i$ for each such $\hat{p}_i$ and dropping the
homogeneity from (89) or (89').
33. There is a slight complication that we must take care of in the proof. That is, it is
easy to see from (89) that if r is large enough, it may not be possible to have a
positive $p_0$; hence it may be impossible to find a "fixed point" in the simplex.
34. However, it is not quite clear from Morishima and Murata [29] what mechanism
sets the upper bound of the wage rate $p_0$.
35. Morishima [26] pointed out another situation, the "Ricardo-Marx" case, in which
the "real" wage rate is fixed as a result of the "reserve army" of labor and so on
([26], p. 69). Here the "real" wage rate means the "money" wage rate $p_0$ deflated
by a certain price index (see [26], p. 66).
36. Unlike in Morishima [26], it seems more natural to emphasize the $r \to (p_0, p)$
determination as a dynamic substitution theorem. In this case, as long as r is given,
the $a_{ji}$'s and the $b_{ki}$'s are fixed regardless of p and $p_0$. Since r reflects intertemporal
choice (hence it is abstracted away from the static theory), the static substitution
theorem may be considered as a special case (that is, r = 0) of the $r \to (p_0, p)$
determination.
37. With the introduction of money, which functions as a store of value, the state of
balanced growth or decay of the real goods sector does not necessarily imply
constant absolute prices, although relative prices may be constant. This seems
to be an obvious point, but it is often forgotten in growth theory literature.
38. It will not be too difficult to modify (89) if we assume perfect foresight. The real
task, of course, is to modify (89) under a suitable assumption with regard to
expectations concerning future prices.
39. For a recent survey of such a theory, see, for example, Burmeister and Dobell [1],
chapter 6.
40. The reswitching of techniques has the following important implication: If it occurs,
then it is impossible to say that a lower interest rate implies (in steady-state equili-
REFERENCES
1. Burmeister, E., and Dobell, A. R., Mathematical Theories of Economic Growth,
London, Macmillan, 1970.
2. Chakravarty, S., Capital and Development Planning, Cambridge, Mass., M.I.T.
Press, 1969.
3. Domar, E. D., Essays in the Theory of Growth, New York, Oxford University Press,
1957.
4. Dorfman, R., Samuelson, P. A., and Solow, R. M., Linear Programming and
Economic Analysis, New York, McGraw-Hill, 1958.
5. Georgescu-Roegen, N., "Book Review: Morishima, M., Equilibrium, Stability and
Growth-A Multi-Sectoral Analysis," American Economic Review, LV, March 1965.
6. Georgescu-Roegen, N., Analytical Economics, Cambridge, Mass., Harvard University Press, 1966.
7. Goodwin, R., "A Non-linear Theory of the Cycle," Review of Economics and Statistics,
XXXII, November 1950.
8. Goodwin, R., "The Non-linear Accelerator and the Persistence of Business Cycles,"
Econometrica, 19, January 1951.
9. Hahn, F. H., and Matthews, R. C. O., "The Theory of Economic Growth: A Survey,"
Economic Journal, LXXIV, December 1964.
10. Harrod, R. F., "An Essay in Dynamic Theory," Economic Journal, XLIX, March
1939.
11. Harrod, R. F., Towards a Dynamic Economics, London, Macmillan, 1948.
12. Harrod, R. F., "Domar and Dynamic Economics," Economic Journal, LXIX, September
1959.
13. Hicks, J. R., A Contribution to the Theory of the Trade Cycle, Oxford, Clarendon
Press, 1950.
14. Jorgenson, D. W., "On Stability in the Sense of Harrod," Economica, XXVII, August
1960.
23. McKenzie, L. W., "Turnpike Theorems for a Generalized Leontief Model," Econo-
metrica, 31, January-April 1963.
24. McManus, M., "Self-Contradiction in Leontief's Dynamic Model," Yorkshire
Bulletin, 9, May 1957.
25. McManus, M., "Notes on Jorgenson's Model," Review of Economic Studies, XXX, June
1963.
26. Morishima, M., "A Dynamic Leontief System with Neo-Classical Production Func-
tion," chap. III in his Equilibrium, Stability and Growth: A Multi-Sectoral Analysis,
Oxford, Clarendon Press, 1965 (a revision of his paper in Econometrica, 26, July
1958.)
27. Morishima, M., "An Alternative Dynamic System with a Spectrum of Technique," chap. IV
36. Sargan, J. D., "The Instability of the Leontief Dynamic Model," Econometrica, 26,
July 1958.
37. Solow, R. M., "A Contribution to the Theory of Economic Growth," Quarterly
Journal of Economics, LXX, February 1956.
38. Solow, R. M., "Competitive Valuation in a Dynamic Input-Output System," Econometrica,
27, January 1959.
39. Solow, R. M., and Samuelson, P. A., "Balanced Growth under Constant Returns
to Scale," Econometrica, 21, July 1953.
40. Suits, D. B., "Dynamic Growth Under Diminishing Returns to Scale," Econometrica,
22, October 1954.
41. Tobin, J., "Money and Growth," Econometrica, 33, December 1965.
42. Tsukui, J., "On a Theorem of Relative Stability," International Economic Review,
2, May 1961.
43. Tsukui, J., "Efficient and Balanced Growth Paths in Dynamic Input-Output System -
A Turn-Pike Theorem," Economic Studies Quarterly, XIII, September 1962 (in
Japanese).
44. Tsukui, J., "Application of a Turn-Pike Theorem to Planning for Efficient Accumula-
where $\sigma > 0$ is a constant. Note that $K_t$ here means the amount of capital employed
in period t. That the same notation $K_t$ is used for the supply of capital in period
t implies that we are assuming the full employment of capital. Thus (2) can be
obtained easily from (3) and (4).
Equations (1) and (2) can be rewritten as
(5) $Y_t = \frac{1}{\sigma}[Y_{t+1} - Y_t] + X_t$
which corresponds to (6) in Section B. Here $1/\sigma$ corresponds to the B matrix, and
there is nothing which corresponds to the A matrix.
It is certainly possible to specify consumption, such as
(6) $X_t = (1 - s)Y_t + c$
where s denotes the marginal propensity to save (which is assumed to be constant
and 0 < s < 1), and $c \geq 0$ is a constant. In the Harrod-Domar model, the specification
of the behavior of consumption as above is made explicit. Moreover, it is
usually assumed that c = 0. The closed Leontief model in which consumption is
included corresponds to a special case of the above, that is, the case in which $X_t = 0$
for all t, or s = 1 and c = 0 for all t. The open Leontief model with a fixed bundle
of final consumption corresponds to a special case of the above, that is, the case in
which $X_t$ = constant, or s = 1 and $X_t = c > 0$. However, it is also to be noted that,
in the general open Leontief model, $X_t$ may not be constant over t but rather is
given exogenously as an explicit function of time t. In any case, as long as $X_t$ is given
either in the induced form such as (6) or in an exogenous manner as a function of t,
we can solve for $Y_t$ explicitly as a function of t.
We may consider the present model to consist of four equations, (1), (3),
(4), and (6), and four variables $(Y_t, K_t, X_t, I_t)$ to be determined in the system. If $X_t$ is
given exogenously, then equation (6) drops out from this list and the variable $X_t$ is
also dropped accordingly. In the latter case, we simply obtain $Y_t$ from (5), and in
the former case, we obtain the following equation by combining (5) and (6):
(7) $Y_{t+1} = (1 + s\sigma)Y_t - c\sigma$
From this, $Y_t$ is obtained explicitly as a function of time. Then $K_t$ is obtained from
(4), $I_t$ is obtained from either (2) or (3), and $X_t$ is obtained from (5).
The solution of equation (7) is easily found to be
(8) $Y_t = (1 + s\sigma)^t (Y_0 - Y^*) + Y^*$
where $Y^* = c/s$ and $Y_0$ is the initial output. Clearly, as long as $Y_0 > Y^*$, the
economy is capable of growth. In the usual Harrod-Domar model, c is assumed to
be zero (thus $Y^* = 0$) so that $Y_0 > Y^*$ will always hold, and the economy grows at a
constant rate, $s\sigma$, that is,
(9) $Y_t = Y_0(1 + s\sigma)^t$
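The closed-form solution (8) can be compared with a direct iteration of the difference equation (7). The following Python sketch does this for hypothetical parameter values; the function names `iterate_output` and `closed_form_output` and all numbers are illustrative assumptions, with `sigma` standing for the output:capital ratio.

```python
# Sketch: iterate (7) and compare with the closed form (8).
# Parameter values are illustrative assumptions only.

def iterate_output(y0, s, sigma, c, periods):
    """Return [Y_0, ..., Y_periods] from Y_{t+1} = (1 + s*sigma) Y_t - c*sigma."""
    path = [y0]
    for _ in range(periods):
        path.append((1.0 + s * sigma) * path[-1] - c * sigma)
    return path


def closed_form_output(y0, s, sigma, c, t):
    """Equation (8): Y_t = (1 + s*sigma)^t (Y_0 - Y*) + Y*, with Y* = c/s."""
    y_star = c / s
    return (1.0 + s * sigma) ** t * (y0 - y_star) + y_star


path = iterate_output(y0=100.0, s=0.2, sigma=0.5, c=10.0, periods=5)
for t, y in enumerate(path):
    print(t, y, closed_form_output(100.0, 0.2, 0.5, 10.0, t))
```

Setting `c=0.0` reduces the closed form to (9), so that output grows at the constant rate `s*sigma`.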
put. Suppose this is a fixed constant. In order to avoid the above difficulty, assume
$Y_0 > Y^*$ so that $Y_t$ grows over time according to (8). As $Y_t$ grows over time, the
labor requirement, denoted by $L_t$, also grows according to $lY_t$, or
$\pi_t = \frac{p_{t+1}Y_t - w_tL_t}{p_tK_t}$
or
(13) $\pi_t = \frac{\sigma}{p_t}[p_{t+1} - w_t l]$
sight)2 as
(14) $p_{t+1}K_t + [p_{t+1}Y_t - w_tL_t] = (1 + r_t)(p_tK_t)$
which can be rewritten as
(14') $\frac{p_{t+1}}{p_t} + \frac{\sigma[p_{t+1} - w_t l]}{p_t} = 1 + r_t$
where $r_t$ is the interest rate in period t. This is also written as
(15) $\frac{p_{t+1} - p_t}{p_t} + \frac{\sigma[p_{t+1} - w_t l]}{p_t} = r_t$
Equation (14) or (15) corresponds to equation (39) in Section B.
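The arbitrage relation can be evaluated directly. The sketch below computes the interest rate implied by (14') for given prices, and also the price that stays constant for a given r and w, obtained by setting the two prices equal in (14') and solving. The function names and parameter values are hypothetical; `sigma` denotes Y/K and `l` denotes L/Y, both treated here as fixed numbers.

```python
# Sketch of the intertemporal arbitrage relation (14') with fixed sigma and l.
# All numerical values are illustrative assumptions.

def implied_interest_rate(p_t, p_next, w_t, sigma, l):
    """r_t from p_{t+1}/p_t + sigma (p_{t+1} - w_t l)/p_t = 1 + r_t."""
    return p_next / p_t + sigma * (p_next - w_t * l) / p_t - 1.0


def constant_price(w, l, r, sigma):
    """Price p* with p_{t+1} = p_t = p*: sigma (p* - w l) = r p*, needs sigma > r."""
    if sigma <= r:
        raise ValueError("a positive constant price requires sigma > r")
    return w * l / (1.0 - r / sigma)


p_star = constant_price(w=1.0, l=0.5, r=0.1, sigma=0.5)
print(p_star, implied_interest_rate(p_star, p_star, 1.0, 0.5, 0.5))
```

Feeding the constant price back into `implied_interest_rate` returns the preassigned r, which is a quick consistency check on the algebra.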
Following Morishima [16], we now ask whether constant prices are possible
in (15). That is, setting $r_t = r$, $w_t = w$, and $p_t = p^*$, we obtain from (15)
(16) $p^* = \frac{r}{\sigma}p^* + wl$
which corresponds to our equation (78) in Section B. Notice that subscript t is now
attached to l and $\sigma$. If we follow Morishima in [16], these coefficients $l_t$ and $\sigma_t$
are now chosen to minimize the unit cost $[w_t l_t + \rho_t/\sigma_t]$ subject to (20),4 where $\rho_t$
is the price of the capital service of the good. Then we may write the optimal
values of $l_t$ and $\sigma_t$ as
(21) $l_t = l(\rho_t, w_t)$
which corresponds to Morishima's (81) in Section B. The price equation (16) can
be rewritten [in view of (23)] as
(24) $p_t = p^* = \frac{\rho^*}{\sigma_t} + wl_t$, for all t
(26) i=Mlr,,Yr,P,)
We assume labor is employed up to the point where the marginal productivity
of labor is equal to the real wage rate.8 Then we have
(27) $\frac{w_t}{p_t} = \frac{\partial}{\partial L_t}F(L_t, K_t)$
Assuming that the production function F is homogeneous of degree one, so that
we can write $F(L_t, K_t) = L_t f(k_t)$ where $k_t \equiv K_t/L_t$ and $f(k_t) \equiv F(1, k_t)$, we have
(28) $\frac{w_t}{p_t} = f(k_t) - k_t f'(k_t)$
where $\delta$ is the rate of depreciation and $I_t$ now denotes gross investment instead of
net investment. Corresponding to this, $Y_t$ now denotes gross output instead of net
output. As a result of the explicit introduction of depreciation, (14') should be
modified to9
(14") $\frac{p_{t+1}}{p_t}(1 - \delta) + \frac{\sigma_t[p_{t+1} - w_t l_t]}{p_t} = 1 + r_t$
where $\sigma_t \equiv Y_t/K_t$ and $l_t \equiv L_t/Y_t$.
(37) $\frac{\dot{p}_t}{p_t}k_t + [(1 - \mu)k_t + f(k_t)] = (1 + r_t)k_t + \frac{w_t}{p_t}$
Equations (35), (36), (37), and (28) determine the time path of $[k_t, p_t, r_t, w_t]$,
once the behavior of per capita consumption $x_t$ is specified and the per capita
money supply $m_t$ is determined. The model can be complicated further (and
generalized) by introducing government expenditures and taxes explicitly. This
complication can be handled by reformulating the equilibrium relation of the
goods market (1) and by formulating explicitly the relation among government
expenditures, taxes, and the money supply. We leave this to the interested reader.
Note that equations (37) and (28) can be combined to yield"
(38) $r_t = [f'(k_t) - \mu] + \phi_t$
where $\phi_t \equiv \dot{p}_t/p_t$, the rate of inflation or deflation. Equation (38) implies that there
is inflation ($\phi_t > 0$) or deflation ($\phi_t < 0$) at time t, depending on whether the
money interest rate $r_t$ exceeds or falls short of the "net" marginal productivity of
capital, $[f'(k_t) - \mu]$, at time t. Price stability ($\phi_t = 0$) is achieved if and only if
$r_t = f'(k_t) - \mu$.
Per capita consumption $x_t$ depends on the choice between present consumption
and future consumption (present savings) for each individual and on the
income distribution among people (say, between the capitalists and the workers).
Hence $x_t$ would, in general, depend on variables such as $r_t$, $p_t$, $w_t$, and so on. There
is one simple specification of $x_t$ which ignores all such considerations, namely,
the proportional savings behavior, $X_t = (1 - s)[Y_t - \mu K_t]$. Or
(39) $x_t = (1 - s)[f(k_t) - \mu k_t]$
where 0 < s < 1 is assumed to be constant.
If we adopt this specification of $x_t$, then (35) is simplified to
(40) $\dot{k}_t = sf(k_t) - (n + s\mu)k_t$
so that the time path of $k_t$ becomes independent of the other part of the model.
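The self-contained dynamics of (40) can be traced numerically. The sketch below integrates (40) with a simple Euler scheme under the additional assumption of a Cobb-Douglas form $f(k) = k^{\alpha}$, which is not part of the text; the function names, step size, and parameter values are likewise illustrative assumptions.

```python
# Sketch: Euler integration of (40), k' = s f(k) - (n + s*mu) k.
# Assumption: f(k) = k**alpha (Cobb-Douglas); all numbers are illustrative.

def solow_steady_state(s, n, mu, alpha):
    """Positive root of s k**alpha = (n + s*mu) k."""
    return (s / (n + s * mu)) ** (1.0 / (1.0 - alpha))


def solow_path_end(k0, s, n, mu, alpha, dt=0.1, steps=10000):
    """Value of k after steps*dt units of time, starting from k0."""
    k = k0
    for _ in range(steps):
        k += dt * (s * k ** alpha - (n + s * mu) * k)
    return k


params = dict(s=0.2, n=0.02, mu=0.05, alpha=0.3)
k_s = solow_steady_state(**params)
print(k_s, solow_path_end(1.0, **params), solow_path_end(30.0, **params))
```

Starting below or above the steady state, the simulated capital-labor ratio approaches the same value, in line with the stability result for this case.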
This is the case Solow was concerned with in his 1956 paper [22]. As we proved
power or the purchasing power in terms of the real good. If we recognize that the
price of money in terms of the real good is $1/p_t$, and if we denote it by $p_{mt}$, then
$d(M_t/p_t)/dt = \dot{p}_{mt}M_t + p_{mt}\dot{M}_t$, where $p_{mt} \equiv 1/p_t$. Also denote the "net output"
by $Q_t$; that is, $Q_t \equiv F(L_t, K_t) - \mu K_t$. Then (42) may also be written as
(43) $X_t = (1 - s)[Q_t + \dot{p}_{mt}M_t + p_{mt}\dot{M}_t]$
Alternatively, we may assume that the money value of consumption is a
constant fraction of the money value of income. Then instead of (42) or (43), we
have
(44) $p_tX_t = (1 - s)[p_tQ_t + \dot{M}_t + \dot{p}_tK_t]$
The two specifications (43) and (44) are fundamentally different. As is well known,
(44) involves money illusion (see Burmeister and Dobell [1], pp. 166-167). We
proceed by using (43) or (42).
Write the per capita real cash balances as $z_t \equiv m_t/p_t$. Also write the rate of
change in the money supply as $\theta_t \equiv \dot{M}_t/M_t$. Then, dividing both sides of (42) by
Lt, we 17
where
(53) $\psi(k_t, z_t; \theta_t) = \theta_t - n + [f'(k_t) - \mu - g(k_t, z_t)]$
Equations (50) and (52) then define the equilibrium path of $(k_t, z_t)$ for a preassigned
value of $\theta_t$. Once the path of $(k_t, z_t)$ is determined, the rate of price
change $\phi_t$ is determined by (49). The dynamic behavior of $(k_t, z_t)$ can be studied
by constructing a phase diagram in the $(k_t, z_t)$-plane using (50) and (52). The
construction of the phase diagram also reveals the condition for the existence and
uniqueness of the steady state path in which $\dot{z}_t = 0$ and $\dot{k}_t = 0$. This task is left to
the interested reader. Such an analysis can be seen, for example, in Burmeister
and Dobell ([1], Chapter 6). In this connection, note that (46) and (51) imply
(54) $\dot{k}_t = sf(k_t) - (n + s\mu)k_t - (1 - s)(\dot{z}_t + nz_t)$
Hence, if the steady state path (k, z) is ever achieved, then
(55) $sf(k) = (n + s\mu)k + (1 - s)nz$
Therefore, assuming that $f'(k_t) > 0$ for all $k_t$ and k > 0, (55) implies
(56) $k < k_s$
where $k_s$ is the value of k in Solow's steady state path defined by (41). Equation (56)
implies that per capita output under the present steady state path is lower than that
under Solow's steady state path, that is, $f(k) < f(k_s)$. Note that these conclusions
are independent of the rate of change in the money supply, $\theta_t$.21 It is, however, to
be stressed that the convergence to the steady state path under the present model
does not necessarily hold (unlike in Solow's theorem in [22]).22 Hence the value
of any statement with regard to the steady state path under the present model is not
very great.
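The inequality (56) can be illustrated numerically. The sketch below takes the steady-state real balance z as a given positive number (in the text it is determined jointly with k), assumes a Cobb-Douglas form $f(k) = k^{\alpha}$, and locates a root of (55) by bisection between the peak of $sf(k) - (n + s\mu)k$ and Solow's $k_s$. All names and numbers are hypothetical.

```python
# Sketch: a root of (55), s f(k) = (n + s*mu) k + (1 - s) n z, for a given z > 0.
# Assumptions: f(k) = k**alpha and z treated as exogenous; numbers are illustrative.

def excess(k, z, s, n, mu, alpha):
    """Left side minus right side of (55)."""
    return s * k ** alpha - (n + s * mu) * k - (1.0 - s) * n * z


def monetary_steady_state(z, s, n, mu, alpha, tol=1e-12):
    """Return (k, k_s), where k is the root of (55) on [k_peak, k_s]."""
    k_s = (s / (n + s * mu)) ** (1.0 / (1.0 - alpha))             # Solow steady state
    k_peak = (s * alpha / (n + s * mu)) ** (1.0 / (1.0 - alpha))  # maximizer of s f(k) - (n + s*mu) k
    lo, hi = k_peak, k_s
    if excess(lo, z, s, n, mu, alpha) <= 0.0:
        raise ValueError("z is too large for a steady state to exist")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid, z, s, n, mu, alpha) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi), k_s


k, k_s = monetary_steady_state(z=1.0, s=0.2, n=0.02, mu=0.05, alpha=0.3)
print(k, k_s)
```

Because the right side of (55) exceeds $(n + s\mu)k$ whenever z > 0 and n > 0, any root must lie below $k_s$; the bisection simply picks out the larger of the two candidate roots in this parametrization.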
Above we assumed that the rate of change of the money supply $\theta_t$ is exogenously
given. Alternatively, we may suppose that the monetary authority
manipulates $\theta_t$ so as to maintain price stability (that is, $\phi_t = 0$ for all t).23 Imposing
$\phi_t = 0$, we obtain from (46), (51), and (49)
(57) $\dot{k}_t = sf(k_t) - (n + s\mu)k_t - (1 - s)z_t\theta_t$
(58) $\dot{z}_t = z_t(\theta_t - n)$
(59) $f'(k_t) = g(k_t, z_t) + \mu$
Note that (59) implies
(60) $\frac{dz_t}{dk_t} = \frac{f'' - g_k}{g_z}$
where $g_k \equiv \partial g/\partial k_t$, and $g_z \equiv \partial g/\partial z_t$. Hence, assuming that $g_k > 0$, $g_z < 0$, and
$f'' < 0$, we obtain $dz_t/dk_t > 0$.24 Therefore, we may write
The fact that r(k) > 0 means that the transaction demand for money is positive.
The shape of r(k) is illustrated in Figure 6.6.
Assuming that $f''(k) < 0$, $f'(0) = \infty$, and $f'(\infty) = 0$, the shape of $[f'(k) - \mu]$
is also illustrated in Figure 6.6. As is clear from the diagram, under the above
assumptions, there exists a unique value of k > 0 which satisfies
(65) $f'(k) = r(k) + \mu$
The dynamic behavior of $k_t$ can now be deduced easily from Figure 6.8.
It is clear from Figure 6.8 that there exists a unique k* with $0 < k^* < k_s$
which is defined by
(67) $sf(k^*) = (n + s\mu)k^* + (1 - s)n\kappa(k^*)$
Moreover, from (66) we can immediately conclude that $k_t$ converges to k*
monotonically as $t \to \infty$, regardless of the initial value of $k_0$. In other words,
k* is globally stable. If $k_t$ converges to k* monotonically, then from (61') we can
also conclude that $z_t$ converges monotonically to z*, where $z^* \equiv \kappa(k^*)$. Note
that at (k*, z*), $\dot{k}_t = \dot{z}_t = 0$, so that (58) implies $\theta_t = n$; that is, the money supply
is increasing at the rate of population growth. Moreover, if $k_t < k^*$, then $z_t < z^*$,
and $z_t$ is monotonically increasing, so that (58) implies $\theta_t > n$. On the other hand,
if $k_t > k^*$, then we can similarly conclude that $\theta_t < n$. The precise formula for
$\theta_t$ can be computed from (57), (58), and (59). Some of the above conclusions may
be summarized as follows:
Figure 6.8. Dynamics of $k_t$.
Proposition Under the above specification and the assumptions of the model, if
the monetary authority manipulates the money supply so as to maintain price stability,
then there exists a unique steady state path (k*, z*), where $0 < k^* < k_s$, which is
globally stable. Along the stipulated price stability path of the money supply, $\theta_t \gtreqless n$
according to whether $k_t \lesseqgtr k^*$.
The relation between money and growth in connection with the one-industry
model has recently attracted the attention of many economists since Tobin's
fundamental paper [28]. We have simply traced and developed some thoughts
along these lines. Since active research is still being done on this topic, we do
not go any further.27 The above analysis is an exercise under the rather limited
conditions of perfect foresight, full employment, and no fiscal elements.28 The
reader may wish to extend our analysis by realizing these limitations. However, the
purpose of this Appendix is fulfilled if the reader realizes some of the inherent
difficulties in the dynamic Leontief model.
FOOTNOTES
1. The major result on money and growth in this appendix is taken from Takayama
[26].
2. Note that intertemporal arbitrage is concerned only with two periods (t and t + 1).
Hence this perfect foresight assumption is often called myopic perfect foresight, as
mentioned before.
3. Recall that $p^* > 0$ only when $\sigma > r$.
4. This is clearly myopic, and can be justified under Morishima's assumption of a
stationary state. When we divert ourselves from the steady state (or balanced
growth) assumption, it is desirable to reconsider this decision rule. Here we simply
assume that such a "long-range" decision rule is reduced to the present myopic
rule.
5. Note that in (23), depreciation and obsolescence are assumed away.
6. Here $A_t$ signifies the money value of assets. An alternative formulation with regard
to monetary equilibrium is $M_t = M(r_t, p_tK_t, A_t)$. See Tobin [28], for example.
Following Tobin [28], we may assume that there are only two kinds of assets,
(outside) money, and the stock of capital, which certainly justifies the definition of $A_t$
in (25). However, (25) can incorporate private (nongovernmental) bonds, and signify
a part of the portfolio equilibrium of the three types of assets, M, K, and private
bonds. The introduction of interest-yielding government bonds will complicate the
formulation, although the essence of the conclusions in this Appendix will still
remain. On the other hand, if there are no bonds, equation (15) [and also (14")
and (32), which will appear later as a modified version of (15)] becomes the defining
equation of the own rate of interest (or the money rate of return on the physical
capital) and does not show the intertemporal arbitrage relation. In this case, (25)
describes the portfolio equation of M and K alone.
7. The behavioral rule of cost minimization (for each period) is the major background
for this result.
8. Assume, for example, that $w_t$ adjusts the labor market. Then an excess demand
(resp. supply) of labor will increase (resp. lower) $w_t$ with a given $p_t$, thus increasing
(resp. lowering) $w_t/p_t$. It is usually assumed that $p_t$ adjusts the goods market. This
adjustment mechanism takes place within the framework of the Hicksian week. Here
it is assumed that such an equilibrium is achieved in order to focus attention on
the equilibrium path. Under the Keynesian framework, w1 has downward rigidity;
that is, an excess supply of labor will not reduce wt.
9. With $p_t K_t$ dollars, one can buy $K_t$ units of capital in period t, which are worth
$p_{t+1}(K_t - \delta K_t)$ dollars in period t + 1. The current profit in t is $p_{t+1} Y_t - w_t L_t$.
Hence we have the intertemporal arbitrage equation
$$p_{t+1}(K_t - \delta K_t) + (p_{t+1} Y_t - w_t L_t) = (1 + r_t)\, p_t K_t .$$
From this, we can deduce (14").
10. Equation (32) corresponds to (14"). Note that (32) cannot be obtained by simply
setting $p_{t+1} = p_t + \dot{p}_t$ in (14"). To obtain (32), first observe the following
intertemporal arbitrage equation under perfect foresight for the continuous time case,
$$p_t = \int_t^{\infty} \rho_\tau \exp\left[-\int_t^{\tau} (r_s + \mu)\, ds\right] d\tau ,$$
where $\rho_\tau \equiv p_\tau a_\tau - w_\tau a_\tau l_\tau$, and we assume that the integral converges. The term
inside the integral gives the present value of the quasi-rent for time $\tau$. Totally
differentiating the above equation with respect to t, we obtain
$\dot{p}_t = -[p_t a_t - w_t a_t l_t] + \mu p_t + r_t p_t$, from which we obtain (32). In this derivation of (32), no
assumption is made with regard to the myopic nature of intertemporal arbitrage. It
is assumed that $r_\tau$ is known for all future time ($\tau$). However, (32) can also be obtained
by assuming the myopic nature of intertemporal arbitrage (that is, the arbitrage is
concerned only with the current time and the next instant of time).
11. Suppose that we have the alternative formulation $M_t = M[r_t, p_t K_t, A_t]$. Then
under a suitable set of assumptions we can conclude that $M_t/(p_t K_t)$ is a function of
$r_t$ alone and that it is a decreasing function.
12. Equation (37) is obtained from (32), which assumes perfect foresight in the
intertemporal arbitrage relation. In general, $(\dot{p}_t/p_t)$ should be replaced by $E(\dot{p}_t/p_t)$,
which denotes the expected rate of price change at t. However, if $(\dot{p}_t/p_t) \neq E(\dot{p}_t/p_t)$,
that is, if expectations turn out to be incorrect, then some sort of learning device
to correct such a mistake is necessary. One device which is used in the literature to
cope with this problem is the simple "adaptive expectations" postulate of the form
$\dot{\pi}_t = \epsilon(\dot{p}_t/p_t - \pi_t)$, where $\pi_t$ denotes $E(\dot{p}_t/p_t)$ and $\epsilon \geq 0$ signifies the speed of
adjustment in expectations. It says that the rate of adjustment in the current expected
rate of price change is linearly dependent on the error made in predicting the
current rate of change. No doubt this device can be useful. The reader can, if he is
interested, modify our analysis accordingly using this device. However, the
fundamental question of how this device can be justified from the viewpoint
of rational behavior is still unresolved.
13. Equation (38) says that the money rate of interest (rt) is equal to the net marginal
productivity of capital [ f'(kt) - , ] plus the rate of inflation (or deflation) 41.
14. In (39), it is assumed that consumption is a constant fraction of net national product.
Alternatively, we may assume that consumption is a constant fraction of gross
national product. Then (39) is simply written as
(39') xt = (1 - s)f(kt)
15. This conclusion will be unaltered even if (39) is replaced by (39'). In that case,
the definition of ks needs to be modified to the one specified by
(41') sf (ks) = Y.ks
16. Consider the possibility that real output is very low, or to dramatize the story,
consider the case in which $F(L_t, K_t) = 0$. The consumption specification (42) or
(43) then says that usual bounds on s such as 0 < s < 1 are absurd. If $d(M_t/p_t)/dt < 0$,
$X_t < 0$, which is absurd. If, on the other hand, $d(M_t/p_t)/dt$ is positive and large
enough, $X_t > 0$. Assuming that one does not "eat up" the capital that is already
invested, this is again absurd, for this then implies that one can live by paper money
alone. Hence we may naturally impose a constraint such as $0 < X_t < F(L_t, K_t)$.
17. Actually, several other alternative specifications are possible. See, for example,
Levhari and Patinkin [15] and Stein [25]. Note also that if $E(\dot{p}_t/p_t) \neq \dot{p}_t/p_t$, then
capital gains or losses due to miscalculation should also be introduced in defining
the purchasing power. This point seems to be ignored in the literature.
18. It is easy to see that $\partial M/\partial r_t < 0$ and $\partial M/\partial f > 0$ from the usual Keynesian hypothesis
on liquidity preference; $\partial M/\partial (k_t + z_t) > 0$ or $\partial M/\partial A_t > 0$ says that money is not
an inferior good; and $1 > \partial M/\partial (k_t + z_t)$ says that an extra dollar of wealth will not
all be held in the form of money.
19. The proof of this statement is left to the reader.
20. Equation (51) is obtained from zt - mt/pt. Equation (51) implies that in the steady
state path in which it = 0, 0t = Bt - n; that is, the rate of price change is equal to
the rate of change of the money supply minus the rate of population increase.
21. However, the actual values of k and z, in general, depend on Bt. This is often
called the nonneutrality of money. The conclusion that "money matters" is rather
obvious in view of the change of the consumption specification from (39) to (45).
22. In various money and growth models, it has been established that the steady state
path under the present specifications is not globally (nor locally) stable. See, for
example, Burmeister and Dobell [1] and Nagatani [18].
23. To me, this is a much more acceptable hypothesis than the usual one in the money
and growth literature which assumes that the monetary authority keeps the rate of
money change (Bt) constant forever, regardless of the state of the economy.
24. In other words, if price stability is maintained, then kt and zt move together. Needless
to say, (59) signifies that if price stability is maintained, then the marginal physical
productivity of capital is equal to the rate of depreciation plus the rate of interest.
25. The proof is again left to the reader.
26. Such an illustration of (kt) is also seen in Burmeister and Dobell ([1], p. 169).
27. For a recent survey of the discussion on "money and growth," the reader is referred
to Stein [25] and Burmeister and Dobell [1].
28. Another major limitation is that we are concerned only with the equilibrium path
in which temporary (or momentary) equilibrium in all markets is achieved
(instantaneously). For pioneering studies in which "disequilibrium" is allowed in this
context, see, for example, Rose [19], Stein [24], and Tsiang [30]. In essence, they
assume the price of the good changes if and only if there is disequilibrium in the
goods market. Writing $I_t$ and $S_t$, respectively, for planned investment and planned
saving at t, they (somewhat arbitrarily) imposed that $\dot{k}_t = \eta I_t + (1 - \eta) S_t$, where
$\eta$ is a constant with $0 < \eta < 1$.
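The adaptive expectations postulate mentioned in note 12 can be illustrated numerically. The following is only a minimal sketch, assuming a constant actual rate of price change and a simple Euler discretization of the adjustment rule (expected rate moves toward the actual rate at speed eps); the function name, the step size, and all numerical values are hypothetical and are not taken from the text.

```python
def adaptive_expectations(actual_inflation, pi0, eps, dt):
    """Euler steps of d(pi)/dt = eps * (actual - pi).

    actual_inflation: sequence of actual rates of price change, one per step
    pi0: initial expected rate; eps: speed of adjustment (eps >= 0)
    dt: step size of the discretization (an assumption of this sketch)
    """
    pi = pi0
    path = [pi]
    for actual in actual_inflation:
        # The expected rate is revised in proportion to the prediction error.
        pi += dt * eps * (actual - pi)
        path.append(pi)
    return path


# Hypothetical illustration: actual inflation stays at 5 percent,
# while expectations start at zero.
actual = [0.05] * 1000
path = adaptive_expectations(actual, pi0=0.0, eps=1.0, dt=0.01)
final_error = abs(path[-1] - 0.05)
```

With a constant actual rate the expected rate approaches it monotonically, and a larger eps shortens the time needed for the prediction error to become small.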
REFERENCES
1. Burmeister, E., and Dobell, A. R., Mathematical Theories of Economic Growth, New
York, Macmillan, 1970.
2. Domar, E. D., "Capital Expansion, Rate of Growth and Employment," Econo-
metrica, 14, April 1946.
March 1970.
26. Takayama, A., "A Note on Money and Growth," Krannert Institute Paper No. 305,
Purdue University, March 1971.
27. Tobin, J., "A Dynamic Aggregative Model," Journal of Political Economy, LXIII,
April 1955.
28. ____, "Money and Growth," Econometrica, 33, December 1965.
29. ____, "The Neutrality of Money in Growth Models: A Comment," Economica,
XXXIV, February 1967.
30. Tsiang, S. C., "A Critical Note on the Optimum Supply of Money," Journal of
Money, Credit and Banking, 1, May 1969.
MULTISECTOR OPTIMAL GROWTH MODELS
7
Section A
TURNPIKE THEOREMS
a. INTRODUCTION
Consider a trip from a suburb of Chicago to a suburb of New York City.
There are many routes that one could take. The fastest way is probably not the
route that is the shortest, that is, the route that is approximately a straight line
between the two suburbs. The fastest route is probably to get to the "turnpike"
as quickly as possible and travel on it until reaching an exit that leads to the
destination. This is true even if the "turnpike" appears to be a very roundabout
route compared to a straight line between the starting point and the terminal
point.
In the problem of economic growth, one may wonder whether or not there is
a path of growth that resembles a "turnpike," that is, a growth path on which an
economy should spend most of its time. This problem was considered by Dorfman,
Samuelson, and Solow (DOSSO) [2] with respect to the von Neumann type of
growth model. They conjectured that there is such a path and that it is none
other than the von Neumann growth path, that is, the path which maximizes the
growth rate among the set of balanced growth paths. This conjecture was first
proved rigorously by Morishima [22] and Radner [25] for the n-commodity case.
Since then, there have been many extensions and variations of the basic theorem.
For example, we list the following important papers: McKenzie [17], [18], [19],
and [20], Nikaido [23], Inada [10], Tsukui [28], [29], and [30], Drandakis [3],
Winter [35], and expository articles by Koopmans [14] and Hahn-Matthews [7]
for simpler cases. Because of the variations in these numerous papers, the theorem
is often referred to in the plural form as turnpike theorems.
Let us now describe the essence of these turnpike theorems more specifically.
The basic model is the von Neumann (or at least a von Neumann type) economy
with n-commodities. The vector of historically given initial stocks of the commodi-
ties is given arbitrarily. It is supposed that the economy wishes to maximize the
vector of the final stocks of commodities or the utility function which is defined
with regard to only the final stocks of commodities as its arguments. Then, in
terms of this optimality criterion, the "best" way for the economy to achieve
its goal is to spend "most" of its time "sufficiently close" to the von Neumann
growth path, regardless of the initial point, provided that the planning horizon
(that is, the terminal time) is sufficiently far away. In almost all versions of the
turnpike theorems, it is not advocated that the optimal path actually be on the von
Neumann path most of the time; it is only required that it spend most of the time
"sufficiently close" to the von Neumann path. Hence the above Chicago-New
York analogy is not quite accurate.
The above statement of the essence of the turnpike theorems may be illust-
rated by a simple diagram (see Figure 7.1). The turnpike theorems essentially
require that the "optimal" path arch toward the von Neumann path; it is in this
sense that the von Neumann path plays the role of the "turnpike." It is important
to note that in the statement above, optimality is defined with respect to the final
state only and that, unlike the analogy of the Chicago-New York trip, the terminal
point is not given whereas the time to reach the terminal state is specified. It is
possible to conjecture that the "turnpike property" of arching toward a certain
path holds for other types of models. Optimality may depend on the interim
states as well as on the final state, or the final state may be given and optimality
may be defined to minimize the time in reaching the final state.
The significance of the turnpike theorems for the von Neumann model and
the von Neumann theorem should now be clear. As we remarked in Chapter 6,
Section A, it saves the von Neumann theorem from its two basic criticisms: the
von Neumann path ignores the historically given stocks of commodities and
consideration is restricted to balanced growth paths. The situation is analogous
to the Ramsey-Koopmans-Cass theorem for the one-sector economy concerning
the golden age path (recall Chapter 5, Section D).
Aside from the number of commodities, there is, however, one important
(A-1) The set T is a closed cone in the nonnegative orthant of the 2n-dimensional
real space, $R^{2n}$.
(A-2) (No land of Cockaigne) $(0, y) \in T$ implies $y = 0$.
The model described above is the von Neumann type "closed" model of produc-
tion, in which there is no explicit treatment of consumption.
Definition (feasibility): Given N, the span of the programming periods, and given
the vector of initial commodity stocks, $\bar{x}_0$, a sequence $\{x_t\}$, t = 0, 1, ..., N, is
called a feasible path with respect to $\bar{x}_0$, if $(x_t, x_{t+1}) \in T$, t = 0, 1, ..., N - 1, and
$x_0 = \bar{x}_0$.
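The definition of feasibility can be checked mechanically once a technology set is written down. The following is only a sketch, assuming a hypothetical toy technology in which (x, y) belongs to T when x and y are nonnegative and y does not exceed A x for a given nonnegative matrix A; this set is a closed cone and satisfies (A-2), but it is an illustration and not the technology of the text. The function names and the numbers are likewise assumptions.

```python
def in_technology(x, y, A, tol=1e-12):
    """Membership test for the toy cone T = {(x, y): x >= 0, 0 <= y <= A x}."""
    n = len(x)
    if any(v < -tol for v in x) or any(v < -tol for v in y):
        return False
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    return all(y[i] <= Ax[i] + tol for i in range(n))


def is_feasible_path(path, x0_bar, A, tol=1e-12):
    """Check that the path starts at x0_bar and that every consecutive
    pair (x_t, x_{t+1}) is a process in T."""
    if any(abs(a - b) > tol for a, b in zip(path[0], x0_bar)):
        return False
    return all(
        in_technology(path[t], path[t + 1], A, tol)
        for t in range(len(path) - 1)
    )


# Hypothetical two-commodity example.
A = [[1.0, 2.0], [2.0, 1.0]]
x0_bar = [1.0, 0.0]
good = [[1.0, 0.0], [1.0, 2.0], [5.0, 4.0]]  # each step satisfies x_{t+1} <= A x_t
bad = [[1.0, 0.0], [2.0, 2.0], [5.0, 4.0]]   # the first step asks for too much of good 1
good_ok = is_feasible_path(good, x0_bar, A)
bad_ok = is_feasible_path(bad, x0_bar, A)
```

In this toy set, an input of zero allows only zero output, which is the "no land of Cockaigne" condition in miniature.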
Definition (von Neumann path): A triplet $(\hat{z}, \hat{p}, \hat{\lambda})$, where $\hat{z}$ and $\hat{p}$ are nonzero
elements in the nonnegative orthant of $R^n$ and $\hat{\lambda} \in R$ with $\hat{\lambda} > 0$, is called a von
Neumann triplet or a von Neumann equilibrium, if
(i) $(\hat{z}, \hat{\lambda}\hat{z}) \in T$
(ii) $\hat{p} \cdot (y - \hat{\lambda}x) \leq 0$ for all $(x, y) \in T$
We call the process $(\hat{z}, \hat{\lambda}\hat{z})$ a von Neumann process. The ray from the origin through
$\hat{z}$ is called the von Neumann ray (with respect to $\hat{z}$) or the von Neumann (growth)
path (with respect to $\hat{z}$). This ray is denoted by the set $\{x: x \in R^n, x = \alpha\hat{z}, \alpha > 0,
\alpha \in R\}$. In the above triplet, $\hat{p}$ is called the (von Neumann) price, and $\hat{\lambda}$ is called
the (von Neumann) interest factor [or sometimes the (von Neumann) growth
factor]. An evaluation of process $(x, y) \in T$ by $\hat{p} \cdot (y - \hat{\lambda}x)$, where $\hat{p}$ is the von
Neumann price and $\hat{\lambda}$ is the von Neumann interest factor, is called the von
Neumann value (or the von Neumann profit) of the process (x, y).
REMARK: Set $y = \hat{\lambda}\hat{z}$ and $x = \hat{z}$ in condition (ii) of the definition of the
von Neumann triplet. Then we have $\hat{p} \cdot (y - \hat{\lambda}x) = \hat{p} \cdot (\hat{\lambda}\hat{z} - \hat{\lambda}\hat{z}) = 0$. In
other words, the von Neumann value of the von Neumann process $(\hat{z}, \hat{\lambda}\hat{z})$
is zero.
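The von Neumann value can be computed directly for a small example. The sketch below assumes a hypothetical toy technology, T = {(x, y): x >= 0, 0 <= y <= A x} with a symmetric positive matrix A, for which the growth factor is the dominant eigenvalue of A and both the ray vector and the price vector are its eigenvector. The matrix, the vectors, and the function name are assumptions for illustration only. Note also that this toy set is not strictly convex, so it illustrates condition (ii) of the definition but not the stronger assumption (A-4).

```python
def von_neumann_value(x, y, p, lam):
    """Von Neumann value p . (y - lam * x) of the process (x, y)."""
    return sum(pi * (yi - lam * xi) for pi, xi, yi in zip(p, x, y))


# Hypothetical data: A has eigenvalues 3 and -1, and (1, 1) is the
# eigenvector (right and left, since A is symmetric) for the eigenvalue 3.
A = [[1.0, 2.0], [2.0, 1.0]]
z_hat = [1.0, 1.0]
p_hat = [1.0, 1.0]
lam_hat = 3.0

# Condition (i): the process (z_hat, lam_hat * z_hat) uses A z_hat = lam_hat * z_hat.
A_z = [sum(A[i][j] * z_hat[j] for j in range(2)) for i in range(2)]
on_ray_output = [lam_hat * v for v in z_hat]

# The von Neumann process itself has value zero.
value_on_ray = von_neumann_value(z_hat, on_ray_output, p_hat, lam_hat)

# A process that wastes some attainable output has a negative value:
# input (1, 0) could yield (1, 2), but only (0.5, 1) is produced here.
value_wasteful = von_neumann_value([1.0, 0.0], [0.5, 1.0], p_hat, lam_hat)
```

Because p_hat A equals lam_hat times p_hat in this example, any process with y <= A x has a nonpositive value, which is condition (ii).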
We now assume:
(A-3) There exists a von Neumann triplet.
REMARK: In Chapter 6, Section A, we proved the "von Neumann
theorem" which asserts the existence of a von Neumann triplet under (A-1),
(A-2), and the following:
(Convexity) T is convex.
(Free disposability) $(x, y) \in T$, $x' \geq x$, and $0 \leq y' \leq y$ imply $(x', y') \in T$.
(Productiveness) There exists an $(x, y) \in T$ such that $y > 0$.
Radner imposed the following assumption which qualifies the von Neumann
triplet:
(A-4) Let $(\hat{z}, \hat{p}, \hat{\lambda})$ be a von Neumann triplet. Then $\hat{p} \cdot (y - \hat{\lambda}x) < 0$ for all (x, y)'s
in T that are not proportional to $(\hat{z}, \hat{\lambda}\hat{z})$, that is, those (x, y)'s in T which are not on
the von Neumann ray with respect to $\hat{z}$.
REMARK: It is important to note that (A-4) guarantees the uniqueness of
a von Neumann ray. Radner remarked that (A-4) can be obtained from the
following assumption:
(A-4') The set T has a nonempty interior and is a strictly convex cone, in
the sense that $z, z' \in T$, with z' not proportional to z, implies $\theta z + (1 - \theta)z'$
is in the interior of T for any $\theta$ where $0 < \theta < 1$.
That (A-4') implies (A-4) can be proved as follows. Suppose (A-4) does
not hold, so that there exists an $(x', y') \in T$, which is not proportional to
$(\hat{z}, \hat{\lambda}\hat{z})$, but $\hat{p} \cdot (y' - \hat{\lambda}x') = 0$. Let $(x, y) = \frac{1}{2}(\hat{z}, \hat{\lambda}\hat{z}) + \frac{1}{2}(x', y')$. Then
Subject to:
$x_t \leq y_{t-1}$, t = 1, 2, ..., N
$x_0 \leq \bar{x}_0$
and $(x_t, y_t) \in T$, t = 0, 1, ..., N
Here the inequalities again presuppose free disposability. Suppose, for example,
u is concave and "Slater's condition" holds [that is, there exists an $(x, y) \in T$
such that $x < \bar{x}_0$ and $x < y$]; then the solution to Problem II may be regarded
as a solution to Problem I with the following particular utility function:
$$u(y_N) = p^* \cdot y_N, \quad p^* \geq 0, \quad p^* \neq 0$$
(Recall Theorem 1.E.4.) Clearly this utility function, as remarked before, satisfies
assumptions (A-5) and (A-6). Since it may be rather artificial to conceive of a
utility function for the economy in which the consumers are not explicit, the
formulation of optimality in terms of Problem II may be better than that in terms
of Problem I. In this subsection, we use the formulation in terms of Problem II.
REMARK: In the above, we assumed free disposability and used the in-
equality constraints. This was done to utilize the ready-made theorems
developed in Chapter 1, and hence to relate the present discussion to that
chapter. In general, the free disposability assumption is not essential, and
the results in the present subsection follow in the main without such an
assumption by using the equality constraints.
Here we may digress from Radner's discussion of the turnpike theorem and
characterize the optimal feasible path (a la Problem II). In other words, by apply-
ing a theorem on vector maximum (especially Theorem 1.E.4), we can assert the
following:
t= 1
On the other hand, in view of condition (ii) of the definition of the von
Neumann equilibrium,
$$\hat{p} \cdot (y_t - \hat{\lambda} x_t) \leq 0, \quad \text{for all } (x_t, y_t) \in T$$
That is,
$$p_{t+1} \cdot y_t - p_t \cdot x_t \leq 0, \quad \text{for all } (x_t, y_t) \in T$$
Hence we obtain
$$p_{t+1} \cdot \hat{y}_t - p_t \cdot \hat{x}_t \geq p_{t+1} \cdot y_t - p_t \cdot x_t, \quad \text{for all } (x_t, y_t) \in T,\ t = 0, 1, \ldots, N$$
where we set $p^* = p_{N+1}$. It is elementary to see that the other conditions
set in Theorem 7.A.1 also hold.
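REMARK: The above theorem is, in essence, concerned with the saddle-
point characterization of the optimal feasible program. Let us now consider
the quasi-saddle-point characterization. To do this, first write the
production set T as
$$T = \{(x, y): F(x, y) \geq 0,\ (x, y) \in \Omega^{2n}\}$$
where the production function F is assumed to be continuously differentiable
and concave. Consider Problem II and assume again that the Slater condition
holds for this problem [that is, there exists an $(x, y) \in \Omega^{2n}$ such that
$x < \bar{x}_0$, $x < y$, and $F(x, y) > 0$]. The Lagrangian of this problem can be written
as
$$\Phi \equiv p^* \cdot y_N + \sum_{t=1}^{N} p_t \cdot (y_{t-1} - x_t) + p_0 \cdot (\bar{x}_0 - x_0) + \sum_{t=0}^{N} q_t F(x_t, y_t)$$
Let $(\hat{x}_t, \hat{y}_t)$, t = 0, 1, 2, ..., N, be the solution for Problem II. Write the ith
element of $\hat{x}_t$, $\hat{y}_t$, $p_t$, and $p^*$ as $\hat{x}_t^i$, $\hat{y}_t^i$, $p_t^i$, and $p^{*i}$, respectively. Then,
assuming an interior solution for all t (that is, $\hat{x}_t > 0$ and $\hat{y}_t > 0$ for all t) and
$p^* > 0$, the following quasi-saddle-point conditions are necessary and
sufficient for an optimum:
$$-p_t^i + q_t \frac{\partial F(\hat{x}_t, \hat{y}_t)}{\partial x_t^i} = 0, \quad i = 1, 2, \ldots, n;\ t = 0, 1, \ldots, N$$
$$p_{t+1}^i + q_t \frac{\partial F(\hat{x}_t, \hat{y}_t)}{\partial y_t^i} = 0, \quad i = 1, 2, \ldots, n;\ t = 0, 1, \ldots, N - 1$$
$$p^{*i} + q_N \frac{\partial F(\hat{x}_N, \hat{y}_N)}{\partial y_N^i} = 0, \quad i = 1, 2, \ldots, n$$
Here $\partial F(\hat{x}_t, \hat{y}_t)/\partial y_t^i$ and $\partial F(\hat{x}_t, \hat{y}_t)/\partial x_t^i$ denote that these partial derivatives
are evaluated at $(\hat{x}_t, \hat{y}_t)$. Assume $p_t > 0$ and $q_t > 0$ for all t. Then we have
$\hat{y}_{t-1} = \hat{x}_t$, t = 1, 2, ..., N, $\hat{x}_0 = \bar{x}_0$, and $F(\hat{x}_t, \hat{y}_t) = 0$, t = 0, 1, 2, ..., N.
Moreover, the first two sets of conditions yield
$$\frac{\partial F(\hat{x}_{t-1}, \hat{y}_{t-1})/\partial y_{t-1}^i}{\partial F(\hat{x}_{t-1}, \hat{y}_{t-1})/\partial y_{t-1}^j} = \frac{\partial F(\hat{x}_t, \hat{y}_t)/\partial x_t^i}{\partial F(\hat{x}_t, \hat{y}_t)/\partial x_t^j}, \quad t = 1, \ldots, N, \text{ and } i, j = 1, \ldots, n$$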
Definition: The (Radner) distance between two vectors z' and z" is defined as
$$d(z', z'') \equiv \left\| \frac{z'}{\|z'\|} - \frac{z''}{\|z''\|} \right\|$$
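The distance just defined compares directions only, since each vector is first scaled to unit length. A minimal sketch follows, assuming the Euclidean norm (the choice of norm is an assumption of this illustration) and a hypothetical function name.

```python
import math


def radner_distance(z1, z2):
    """Distance between the normalized vectors z1/||z1|| and z2/||z2||."""
    norm1 = math.sqrt(sum(v * v for v in z1))
    norm2 = math.sqrt(sum(v * v for v in z2))
    if norm1 == 0.0 or norm2 == 0.0:
        raise ValueError("the distance is undefined for a zero vector")
    diff = [a / norm1 - b / norm2 for a, b in zip(z1, z2)]
    return math.sqrt(sum(d * d for d in diff))


# Vectors on the same ray are at distance zero, whatever their lengths.
same_ray = radner_distance([1.0, 1.0], [2.0, 2.0])
# Orthogonal directions are at distance sqrt(2) under the Euclidean norm.
orthogonal = radner_distance([1.0, 0.0], [0.0, 1.0])
```

This is why the distance is suited to statements about closeness to a ray: scaling a stock vector up or down along its own ray does not change its distance to any other ray.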
In proving the main theorem, Radner first proved the following lemma,
which is crucial to the proof of his main theorem. It is often referred to as Radner's
lemma. Here we do not assume free disposability.
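Radner's Lemma: Suppose that (A-1), (A-2), (A-3), and (A-4) hold, and let
$(\hat{z}, \hat{p}, \hat{\lambda})$ be a von Neumann triplet. Then for any $\epsilon > 0$, there exists a $\delta$, $0 < \delta < \hat{\lambda}$,
such that $(x, y) \in T$ and $d(x, \hat{z}) \geq \epsilon$ imply $\hat{p} \cdot y \leq (\hat{\lambda} - \delta)(\hat{p} \cdot x)$.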
REMARK: If $(x, y) \in T$ is on a von Neumann ray, then $\hat{p} \cdot y = \hat{\lambda}(\hat{p} \cdot x)$; that
is, the value of the output is $\hat{\lambda}$ times the value of the input. Radner's lemma
asserts that whenever the distance from the process $(x, y) \in T$ to a given
von Neumann ray [or, equivalently, to $(\hat{z}, \hat{\lambda}\hat{z})$] exceeds some number $\epsilon$,
then the value of the output falls short of $\hat{\lambda}$ times the value of the input for
such a process (x, y), by some proportion $\delta$, as long as $\hat{p} \cdot x > 0$. In other
words, there is a certain "value loss" associated with such a process. It is easy
to see that the lemma is crucial in establishing the turnpike theorem. Suppose,
for example, that a feasible program $\{x_t\}$, t = 0, 1, ..., N, deviates from a
given von Neumann path in many periods. Then the sum of the values lost
may be excessive. If we could link the "value loss" to the optimality criterion,
$T_1 = \{y : (x, y) \in T \text{ and } \|x\| = 1\}$
X 11 = 1, or i -k 0. Since (P. yq)l (p xq) = (p. Yq)l (p zq) and (p. yq)l
11
REMARK: Assumption (A-8) is satisfied if, for example, all the coordinates
of $\hat{p}$ are positive. Assumption (A-7) is satisfied if, for example, there is free
disposability and $\bar{x}_0$ provides positive amounts of all those commodities. This
assumption enables the economy to reach the von Neumann ray one period
after the initial time, starting from an arbitrary initial point $\bar{x}_0$. Assumption
(A-7) can be slightly weakened as follows: An initial vector $\bar{x}_0$ is given such
that there exists a feasible sequence $\{x_t\}$, t = 0, 1, ..., $N_0$ ($N_0 \geq 1$), starting
from the given value of $\bar{x}_0$ at t = 0, such that $x_{N_0} = k\hat{z}$ for some k > 0. In other
words, the economy can reach the von Neumann ray within a finite number
of periods. Assumption (A-9) can be weakened as follows (as pointed out by
Radner [25]): For some integer $N_1 > 0$ and some commodity vector y for
which u(y) > 0, there is a feasible sequence from $\hat{z}$ to y in $N_1$ periods.
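for any $(x_t, x_{t+1}) \in T$. Suppose that $d(x_t, \hat{z}) \geq \epsilon$ for N' periods. Then we have
$$\hat{p} \cdot x_N \leq (\hat{\lambda} - \delta)^{N'} \hat{\lambda}^{N-N'} (\hat{p} \cdot \bar{x}_0)$$
Then, by (A-8), there exists an $\alpha > 0$ such that
$$u(x_N) \leq \alpha(\hat{p} \cdot x_N) \leq \alpha(\hat{\lambda} - \delta)^{N'} \hat{\lambda}^{N-N'} (\hat{p} \cdot \bar{x}_0)$$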
On the other hand, by the homogeneity of u, (A-6),
U(XN) = UN-1 u(x)
or
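$$\log b + N' \log\left(\frac{\hat{\lambda} - \delta}{\hat{\lambda}}\right) \geq 0$$
Here it is essential to note that $\delta < \hat{\lambda}$; for, otherwise, $\log[(\hat{\lambda} - \delta)/\hat{\lambda}]$ makes
no sense. The above inequality can be rewritten as
$$N' \leq \frac{\log b}{\log[\hat{\lambda}/(\hat{\lambda} - \delta)]}$$
Define $\bar{N}$ by
$$\bar{N} \equiv \max\left\{1,\ \frac{\log b}{\log[\hat{\lambda}/(\hat{\lambda} - \delta)]}\right\} \qquad \text{(Q.E.D.)}$$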
REMARK: The number $\bar{N}$ gives the maximum number of periods that any
optimal feasible path can remain at a distance exceeding $\epsilon$ from the von
Neumann ray. It is crucial to observe that $\bar{N}$ is independent of the planning
period N. Hence if N is sufficiently large, N becomes sufficiently larger than
$\bar{N}$ and any optimal feasible path starting from an arbitrary initial point
spends "most" of its time within the $\epsilon$-distance from the von Neumann ray.
Note also that Radner's theorem (just as several other versions of the
turnpike theorems) does not advocate that any optimal feasible path must be
on the von Neumann ray most of the time. It requires only that it must be
sufficiently close to the von Neumann ray most of the time.
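The point that the bound does not depend on the planning period can be seen with a few numbers. The sketch below assumes that the bound takes the form max{1, log b / log[lambda/(lambda - delta)]}, which is how the garbled formula above is read here; the function name and the values of b, lambda, and delta are hypothetical.

```python
import math


def max_periods_off_turnpike(b, lam, delta):
    """Upper bound on the number of periods spent farther than epsilon
    from the von Neumann ray, under the assumed form of the bound."""
    if not 0.0 < delta < lam:
        # Outside this range log[lam / (lam - delta)] is undefined or not positive.
        raise ValueError("delta must satisfy 0 < delta < lam")
    if b <= 0.0:
        raise ValueError("b must be positive")
    return max(1.0, math.log(b) / math.log(lam / (lam - delta)))


# Hypothetical values: the bound is about 3 periods, whatever the horizon.
n_bar = max_periods_off_turnpike(b=8.0, lam=2.0, delta=1.0)

# Share of the horizon that can be spent away from the ray as N grows.
shares = [n_bar / N for N in (10, 100, 1000)]
```

The bound stays fixed while the horizon grows, so the share of periods that can be spent away from the ray shrinks toward zero, which is the sense of "most of the time" in the remark.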
One of the difficulties in Radner's turnpike theorem is that it does not
preclude the possibility that an optimal feasible path may run out of the
neighboring $\epsilon$-cone of the von Neumann ray around the halfway point of the entire
programming period. In other words, the optimal feasible path may enter and leave
the neighboring $\epsilon$-cone several times. In this sense, Radner's theorem is
sometimes referred to as a hop-skip-jumping turnpike theorem or a weak turnpike
theorem. This possibility can, with certain additional assumptions, be ruled out.
Such a theorem is often referred to as a strong turnpike theorem. In terms of the
Radner type model, such a theorem is proved by Nikaido [23] and Inada [10].⁴
Nikaido [23] imposed the following assumptions in addition to the assumptions of
Radner's theorem.
(N-1) For any $x \geq 0$ there is some y such that $(x, y) \in T$, where y can be 0.
(N-2) $\hat{z} > 0$.
(N-3) The function u(x) is such that $x > x' \geq 0$ implies $u(x) > u(x')$.
Assumption (N-1) is related to but weaker than the usual free disposability
assumption, which says that $(x, y) \in T$, $x' \geq x$, and $y' \leq y$ imply $(x', y') \in T$.
Assumption (N-3) is satisfied if, for example, $u(x) = p^* \cdot x$ with $p^* > 0$.
Under these additional assumptions, Nikaido's strong turnpike theorem
([23], p. 154) asserts the following:
For any $\epsilon > 0$, there is a number $N_\epsilon$ such that, for any N and for any optimal
feasible program, $\{\hat{x}_t\}$, t = 0, 1, ..., N, starting from an arbitrarily given $\bar{x}_0$, we have
$$d(\hat{x}_t, \hat{z}) < \epsilon \quad \text{for } N_\epsilon \leq t \leq N - N_\epsilon$$
FOOTNOTES
2. For an excellent attempt to survey various turnpike theorems, see Turnovsky [32],
for example.
3. Obviously this is not necessarily true, if the prescribed initial stock $\bar{x}_0$ is not on
the von Neumann ray. On the other hand, we can conclude that any balanced growth
path other than the von Neumann path is not optimal even if the initial stock $\bar{x}_0$
is on such a path. This is owing to the observation made in the previous remark
that the converse of the above theorem also holds.
4. In this connection, we should point out Tsukui's contribution [28]. In a Leontief
type model with alternative techniques, he proved a strong turnpike theorem as well
as other results. The result of this paper overlaps with those in McKenzie [17],
Drandakis [3], and Tsukui [29]. As in [29], Tsukui in [28] also proved a "dual
theorem" which shows the turnpike behavior of the shadow prices of the efficient
path about the von Neumann price ray. Strikingly enough, his [28] was apparently
completed in February 1961 (as a Ph.D. thesis at the Hitotsubashi University), and it
appears to be independent even of pioneering works by Morishima [22] and Radner
[25]. Tsukui's contribution in [28] seems to be unduly ignored. In this connection,
the truly pioneering nature of the Japanese edition (published in 1957) of Furuya and
Inada [4] in the turnpike literature should be emphasized. Incidentally, Nikaido
[23] was apparently written under the stimulus of Tsukui [28] (see [23], p. 151).
REFERENCES
1. Atsumi, H., "Neoclassical Growth and the Efficient Program of Capital Accumula-
tion," Review of Economic Studies, XXXII, April 1965.
2. Dorfman, R. A., Samuelson, P. A., and Solow, R. M., Linear Programming and
Economic Analysis, New York, McGraw-Hill, 1958, chap. 12.
3. Drandakis, E. M., "On Efficient Accumulation Paths in the Closed Production
Model," Econometrica, 34, April 1966.
4. Furuya, H., and Inada, K., "Balanced Growth and Intertemporal Efficiency in
Capital Accumulation," International Economic Review, 3, January 1962.
5. Gale, D., "The Closed Linear Model of Production," in Linear Inequalities and
Related Systems, ed. by H. W. Kuhn, and A. W. Tucker, Princeton, N.J., Princeton
University Press, 1956.
6. ____, "On Optimal Development in a Multi-Sector Economy," Review of Economic
Studies, XXXIV, January 1967.
7. Hahn, F. H., and Matthews, R. C. O., "The Theory of Economic Growth: A Survey,"
Economic Journal, LXXIV, December 1964.
8. Hicks, J. R., "The Story of a Mare's Nest," Review of Economic Studies, XXVIII,
February 1961.
9. ____, Capital and Growth, Oxford, Clarendon Press, 1965.
32. Turnovsky, S. J., "Turnpike Theorems and Efficient Economic Growth," Chapter
10 of Mathematical Theories of Economic Growth, by E. Burmeister and A. R. Dobell,
New York, Macmillan, 1970.
33. von Neumann, J., "A Model of General Economic Equilibrium," Review of Economic
Studies, XII, 1, 1945-46 (originally published in German, 1937).
34. Winter, S. G., "Some Properties of the Closed Linear Model of Production,"
International Economic Review, 6, May 1965.
35. ____, "The Norm of a Closed Technology and the Straight-Down-the-Turnpike
Theorem," Review of Economic Studies, XXXIV, January 1967.
Section B
MULTISECTOR
OPTIMAL GROWTH
WITH CONSUMPTION
a. INTRODUCTION
In spite of all the excitement in the profession, the turnpike theory, at least
in its earlier versions, has one major weakness: It assumes that the utility function
is a function of the terminal stock of commodities only. This means that the
economy's concern about the intermediate periods is restricted only to their effect
on the terminal stock of commodities. The utility function in the (earlier) turn-
pike theory is defined only on the terminal stock of commodities and not on the
stock of commodities in any intermediate period. As Koopmans remarked, "the
purpose of economic activity is by implication assumed to be the fastest growth
rather than the enjoyment of life by all generations" ([9], p. 357).
Ramsey, Koopmans, and Cass have overcome these shortcomings of the
turnpike theory for a one-commodity economy. We have already discussed their
problem in Chapter 5, Section D. For a multisector model Gale [6], then
McKenzie [13], have made major progress and have provided an almost complete
solution of the problem. In their treatment of the problem, the utility function
depends on every intermediate state as well as on the terminal state. If $s_t$ represents
the state of period t relevant to satisfaction, Gale's utility function is represented
as $\sum_{t=1}^{N} u(s_t)$ for an N-period program, or as $\sum_{t=1}^{\infty} u(s_t)$ for an infinite horizon
program. Gale and McKenzie are concerned primarily with the optimal program
when the time horizon is infinite. In this sense, they are addressing the same ques-
tion for the multisector optimal growth problem that Ramsey, Koopmans, Cass,
and so on, addressed for the one-sector optimal growth problem. Gale shows the
existence of an optimal path by actually constructing such a path, which at the
same time exhibits the basic characteristics of the optimal path. In arriving at
this major result, Gale utilizes Radner's procedure in proving his turnpike
theorem; hence a concept analogous to the von Neumann ray becomes essential in
his procedure. For this purpose, he defines the concept of an "optimal stationary
program"; then the "loss" associated with paths which deviate from this "optimal
stationary program" plays a crucial role in establishing his major theorem. He
confesses, "it may well be true that there is a more direct way of obtaining our
existence theorem," but "the facts we pick up along the way are of economic
interest in themselves describing properties of an 'optimal path'" ([6], p. 1).
Although in showing the existence and in characterizing the optimal path
we essentially follow Gale's procedure, our presentation is more expository. In
addition, it differs from Gale's presentation in the following respects:
(i) Gale assumed that the utility function is defined on the input-output process
adopted at each period. If $(x_t, y_t)$ denotes such a process, then $s_t = (x_t, y_t)$,
and $\sum u(x_t, y_t)$ represents his utility series. Here $(x_t, y_t)$ includes consumption
activities such as eating cakes as well as production activities such as producing
cakes. Although he claims that this is "the conceptually correct way" ([6],
p. 6), and although he may be right in his defense, this has the weakness of
obscuring the distinction between the production activity and the consumption
activity, with minor implications such as obscuring the role of consumers'
satiation. Certainly an activity of consuming cakes is essentially different from
an activity of producing cakes and in economics it is often very important to
make this distinction clear. It would be difficult to rewrite the entire theory
of competitive markets (such as described in Chapter 2) by adopting Gale's
procedure. Hence in this section we assume that the utility function is defined
on consumption vectors instead of on input-output vectors. Such a procedure
is certainly the case for the one-sector optimal growth theory a la Ramsey,
Koopmans, and Cass.
(ii) In the discussion of the "optimal stationary program," Gale [7] utilized his
new results in the theory of nonlinear programming and developed the Kuhn-
Tucker theorem. We show that we can do the same job by utilizing the ordinary
concave programming theory (Chapter 1, Section B) without any new result.
(iii) Brock [3] worries about Gale's assumption of strict concavity of the utility
function. His worry is mainly due to the fact that it does not include the "von
Neumann" economy. Although he followed Gale in defining a utility function
on input-output vectors, Brock simplified Gale's procedure on one important
point which we call "Brock's lemma." For a discussion of his other important
contribution on the "weakly maximal program," the reader is referred to his
paper [3].
A rough preview of this section is now in order. First, we may note that we
agree with Gale about the importance of appreciating various "sceneries" in
connection with the present problem. In subsection b, we formulate the basic
model of this section. Then in subsection c, we discuss the finite horizon problem.
There we show that every "competitive" program is "optimal" and that every
"optimal" program is "competitive" (Theorems 7.B.1 and 7.B.2). (As pointed out
by Malinvaud [11], similar theorems for the infinite horizon case would not hold
without an important modification, that is, "cost minimization.") In subsection d,
we switch to the infinite horizon problem. First, we introduce the concept of "optimal
stationary program" (O.S.P.). Theorem 7.B.3 asserts its existence and Theorem
7.B.4 asserts the price implications of the optimal stationary program. In the
corollaries of Theorem 7.B.4, we discuss (i) the consumers' nonsatiation condi-
tions which guarantee semipositiveness and strict positiveness of the price vector
associated with the O.S.P, and (ii) the conditions which guarantee the uniqueness
of the O.S.P. In subsection e, we compare an arbitrary "attainable" program (for
the infinite horizon problem) with the O.S.P. In Theorem 7.B.5, we assert that
no attainable program is "infinitely better" than the O.S.P. In Theorem 7.B.6, we
characterize the attainable paths which are not "infinitely worse" than the O.S.P.
(called the "eligible programs"). Theorem 7.B.7 establishes the existence of an
eligible attainable program, and in Theorem 7.B.8, we assert that every eligible
attainable program converges to the O.S.P. if the O.S.P. is unique. In subsection f,
we turn to the optimal program for the infinite horizon problem and, by Theorem
7.B.9, prove the crucial result of this section, the existence of the optimal attainable
program. Before we prove Theorem 7.B.9, we introduce Brock's lemma,
which is crucial to this theorem.
b. THE MODEL
Consider an economy with n commodities. Let $(x_t, y_t)$ denote the production
process in period t, where $x_t$ and $y_t$, respectively, denote the (stock) input vector
and the (stock) output vector. Let T be the set of such processes which are technologically
feasible in the economy. Thus T is the technology set (or the production
set) of the economy. Let $c_t$ denote a consumption vector of the economy and let
C be the set of all possible consumption vectors (not necessarily technologically
feasible) in the economy. We assume that:
(A-1) (i) The set T is a nonempty, compact, and convex subset of $R^{2n}$, and
(ii) C is a nonempty, compact, and convex subset of $R^n$.
The set T is bounded because of some sort of resource limitation, which we
clarify in an example later in this subsection, and C is bounded from below for
the obvious reason of subsistence, and so on. We may assume that C is bounded
from above owing to the physiological limitation of personal consumption, for if
C is not bounded from above and if the economy can "grow" indefinitely as time
extends without limit, then the above assumption would not hold. However, the
justification for the upper bound of C may not be acceptable to some readers.
One way to avoid this question is to introduce consumers' satiation, which imposes
a practical upper bound on consumption; for example, recall Ramsey's "bliss"
in [16]. Another way is to suppose an upper bound on capital accumulation
resulting from capital satiation. The latter may be more acceptable. (See, for
example, McKenzie [13].) If there is an upper bound on capital accumulation,
then this, together with the lower bound on the consumption set, will practically
make the relevant "attainable" set bounded, which makes the relevant consumption
set bounded. In this connection, we may remind the reader of the procedure
in the theory of competitive equilibria, in which the attainable set is "compactified."
(See Debreu [4], pp. 76-78.)
In any case, we proceed with our analysis under the assumption that both
T and C are bounded. Since both T and C are compact, $T \otimes C$ is also compact.
That $T \otimes C$ is nonempty and convex follows from the fact that a product of nonempty
convex sets is nonempty and convex. Hence (A-1) implies that $T \otimes C$ is
nonempty, compact, and convex. Let $r_0$ denote the vector of the stock of commodities
made available at the beginning of period 0. It is called the initial resource
vector. Let Z be the set of all possible initial resource vectors. We assume that
Z is a nonempty bounded subset of $R^n$.
We assume that the welfare of the society in period t can be represented by
the utility function $u(c_t)$ such that:
We call such a sequence $\{(x_t, y_t, c_t)\}$ an N-period attainable program starting from
$r_0$. The set of all the N-period attainable programs starting from $r_0$ is denoted by
$A_N(r_0)$. When $N \to \infty$, we can analogously define the infinite horizon attainable
program starting from $r_0$. The movement of our economy, that is, a sequence
$\{(x_t, y_t, c_t)\}$ where $(x_t, y_t) \in T$ and $c_t \in C$, is described by Figure 7.4.
We now discuss an important example of the economy described above.
EXAMPLE: We consider an economy in which n "commodities" and one
type of "labor" are involved. The essential characteristics of this "labor" are
that it is indispensable for any production process and that it grows at a constant
rate $\mu$. Let $a_{ij}$ be the amount of the ith commodity input per unit operation
of the jth process and let $b_{ij}$ be the amount of the ith commodity output
per unit operation of the jth process. Here the unit operation of each process
is measured by one unit input of "labor." We assume that $a_{ij} \ge 0$ and
$b_{ij} \ge 0$ for all i and j. Let A and B be n x m matrices such that $A = [a_{ij}]$
and $B = [b_{ij}]$. The technology set may be defined as $\{(Av, Bv): v \ge 0,$
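[Figure 7.4: the movement of the economy over periods (t-1), t, and (t+1). In each period the input $x_t$ is transformed through T into the output $y_t$, which is then divided between the consumption $c_{t+1}$ and the input $x_{t+1}$ of the following period.]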
PROGRAM $\alpha$
t= 1,2,...,N
<Lr,vt>_0andct>0, t=0, 1,...,N
where u = (1, 1, ..., 1) E R-
zt-1,
t= 1, 2,...,N
1 +,u.
1,zt>_0andct>0, t=0, 1,...,N
Write $x_t = A z_t$ and $y_{t-1} = B z_{t-1}/(1 + \mu)$. Let $T \equiv \{(x_t, y_t): x_t = A z_t,$
$y_t = B z_t/(1 + \mu),\ 0 \le u \cdot z_t \le 1\}$. Clearly, T is nonempty, compact, and
convex. We can now construct an attainable program.
PROGRAM $\beta$
$(x_t, y_t) \in T,\ c_t \in \Omega^n$
$x_t + c_t \le y_{t-1}, \quad t = 1, 2, \ldots, N$
$x_0 + c_0 \le r_0$
Clearly program $\alpha$ and program $\beta$ are equivalent in the sense that there
exists a one-to-one correspondence by the rule defined above. Hence the
set of utility sequences $u(c_t/L_t)$ in program $\alpha$ and the set of utility sequences
$u(c_t)$ in program $\beta$ are identical. Therefore, it suffices to consider only
program $\beta$.
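The attainability constraints of the per-capita program can be checked mechanically. The following is a minimal sketch, not part of the original exposition: the matrices A and B, the growth rate, and the function names are hypothetical values chosen only for illustration. It builds $(x_t, y_t)$ from activity levels $z_t$ by $x_t = A z_t$, $y_t = B z_t/(1 + \mu)$ and tests $x_0 + c_0 \le r_0$ and $x_t + c_t \le y_{t-1}$.

```python
import numpy as np

# Hypothetical two-commodity, two-process technology (illustrative numbers only).
A = np.array([[0.2, 0.1],
              [0.1, 0.3]])   # a_ij: input of commodity i per unit of process j
B = np.array([[1.0, 0.4],
              [0.3, 1.2]])   # b_ij: output of commodity i per unit of process j
MU = 0.05                    # assumed growth rate of labor


def process_pair(z):
    """Map per-capita activity levels z (z >= 0, u.z <= 1) to a pair (x, y) in T."""
    z = np.asarray(z, dtype=float)
    if np.any(z < 0) or z.sum() > 1 + 1e-12:
        raise ValueError("activity levels must satisfy z >= 0 and u.z <= 1")
    return A @ z, B @ z / (1.0 + MU)


def is_attainable(zs, cs, r0, tol=1e-12):
    """Check x_0 + c_0 <= r0 and x_t + c_t <= y_{t-1} for t >= 1, with c_t >= 0."""
    pairs = [process_pair(z) for z in zs]
    if np.any(np.asarray(cs, dtype=float) < -tol):
        return False
    available = np.asarray(r0, dtype=float)   # stock available in the current period
    for (x, y), c in zip(pairs, cs):
        if np.any(x + np.asarray(c, dtype=float) > available + tol):
            return False
        available = y                         # this period's output feeds the next period
    return True
```

For instance, with activity levels `[[0.5, 0.5]] * 3` and initial stock `[0.3, 0.3]`, a consumption sequence `[[0.1, 0.05], [0.5, 0.5], [0.5, 0.5]]` passes the check, whereas raising the second-period consumption to `[0.6, 0.6]` exceeds the output carried over from period 0.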
(i)
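(ii) $p_{t+1} \cdot \hat{y}_t - p_t \cdot \hat{x}_t \ge p_{t+1} \cdot y_t - p_t \cdot x_t$ for all $(x_t, y_t) \in T$, $t = 0, 1, \ldots, N$
(iii) $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\} \in A_N(r_0)$, $p_0 \cdot (r_0 - \hat{c}_0 - \hat{x}_0) = 0$, $p_t \cdot (\hat{y}_{t-1} - \hat{c}_t - \hat{x}_t) = 0$,
$t = 1, 2, \ldots, N$, and $p_{N+1} = 0$
(2) $u(\hat{c}_t) - u(c_t) \ge p_t \cdot (\hat{c}_t - c_t) \ge p_t \cdot (\hat{y}_{t-1} - \hat{x}_t) - p_t \cdot (y_{t-1} - x_t)$,
$1 \le t \le N$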
Theorem 7.B.2: Under (A-2) and (A-4), if $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\}$ is an optimal program with
respect to $A_N(r_0)$, then it is competitive.
PROOF: By the hypothesis of the theorem, $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\}$ maximizes
$\sum_{t=0}^{N} u(c_t)$
subject to $r_0 \ge x_0 + c_0$, $y_{t-1} \ge x_t + c_t$, $t = 1, 2, \ldots, N$
and $(x_t, y_t) \in T$, $c_t \in C$, $t = 0, 1, 2, \ldots, N$
Since u is concave and since Slater's condition is satisfied from (A-4), we can
apply the Kuhn-Tucker-Uzawa theorem of concave programming (Theorem
1.B.3 and its corollary). In other words, there exist $p_t \ge 0$, $t = 0, 1, 2, \ldots, N$,
such that
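(5) $\sum_{t=0}^{N} u(\hat{c}_t) + p_0 \cdot (r_0 - \hat{x}_0 - \hat{c}_0) + \sum_{t=1}^{N} p_t \cdot (\hat{y}_{t-1} - \hat{x}_t - \hat{c}_t)$
(8) $u(\hat{c}_{t_0}) - p_{t_0} \cdot \hat{c}_{t_0} \ge u(c_{t_0}) - p_{t_0} \cdot c_{t_0}$ for all $c_{t_0} \in C$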
for all $(x_t, y_t) \in T$, $t = 0, 1, \ldots, N - 1$. Now set $x_t = \hat{x}_t$ and $y_t = \hat{y}_t$ for all
$t = 0, 1, \ldots, N - 1$, except for $t_0$; then we obtain
(9) $p_{t_0+1} \cdot (\hat{y}_{t_0} - y_{t_0}) \ge p_{t_0} \cdot (\hat{x}_{t_0} - x_{t_0})$ for all $(x_{t_0}, y_{t_0}) \in T$
Note that the choice of $t_0$ is arbitrary, so that (9) holds for any $t_0 = 0, 1, \ldots,$
$N - 1$. Next put $c_t = \hat{c}_t$, $x_t = \hat{x}_t$, and $y_t = \hat{y}_t$ (for $t = 0, 1, \ldots, N - 1$) in
relation (7), and obtain
$-p_N \cdot \hat{x}_N \ge -p_N \cdot x_N$
or
Theorem 7.B.3: Under (A-1), (A-2), and (A-5), there exists an O.S.P.
REMARK: In establishing this theorem the strict inequality (<) in (A-5) can
be weakened to the weak inequality ($\le$).
We now prove the following important theorem which is in Gale [6].
Theorem 7.B.4: If $(\hat{x}, \hat{y}, \hat{c})$ is an O.S.P. and if (A-2) and (A-5) are satisfied, then
there exists a $p \ge 0$, $p \in R^n$, such that
(10) $u(\hat{c}) \ge u(c) + p \cdot (y - x - c)$
for all $(x, y) \in T$ and $c \in C$
and
(11) $p \cdot (\hat{y} - \hat{x} - \hat{c}) = 0$
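The content of Theorem 7.B.4 can be illustrated numerically in a one-commodity economy. The sketch below is only an illustration under assumptions that are not in the text: the technology $T = \{(x, y): 0 \le x \le 4,\ 0 \le y \le 2\sqrt{x}\}$, the consumption set $C = [0, 4]$, and the utility $u(c) = \log(1 + c)$ are hypothetical choices. The stationary problem (maximize $u(c)$ subject to $x + c \le y$) then has the solution $\hat{x} = 1$, $\hat{y} = 2$, $\hat{c} = 1$, and the candidate price is the marginal utility $p = u'(\hat{c}) = 1/2$. The code checks on a grid that $u(\hat{c}) - [u(c) + p \cdot (y - x - c)]$ is never negative and that $p \cdot (\hat{y} - \hat{x} - \hat{c}) = 0$.

```python
import numpy as np

X_MAX, C_MAX = 4.0, 4.0   # assumed bounds that make T and C compact


def f(x):
    """Hypothetical concave frontier: T = {(x, y): 0 <= x <= X_MAX, 0 <= y <= f(x)}."""
    return 2.0 * np.sqrt(x)


def u(c):
    """Hypothetical concave, increasing utility on C = [0, C_MAX]."""
    return np.log1p(c)


# Stationary problem: maximize u(c) subject to x + c <= y, (x, y) in T, c in C.
# Since u is increasing, this amounts to maximizing f(x) - x, which is attained at x = 1.
x_hat = 1.0
y_hat = float(f(x_hat))    # 2.0
c_hat = y_hat - x_hat      # 1.0
p = 1.0 / (1.0 + c_hat)    # candidate price: marginal utility u'(c_hat) = 0.5


def support_gap(x, y, c):
    """u(c_hat) - [u(c) + p (y - x - c)]; the supporting-price inequality needs this >= 0."""
    return u(c_hat) - (u(c) + p * (y - x - c))


def worst_gap(m=41):
    """Smallest gap over a grid of (x, y) in T and c in C."""
    cs = np.linspace(0.0, C_MAX, m)
    worst = np.inf
    for x in np.linspace(0.0, X_MAX, m):
        for y in np.linspace(0.0, f(x), m):
            worst = min(worst, float(np.min(support_gap(x, y, cs))))
    return worst
```

In this instance the smallest gap on the grid is attained at the stationary point itself, where it equals zero; a grid check of this kind is of course not a proof, only a sanity check of the inequality.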
REMARK: We call the p obtained in Theorem 7.B.4 the price vector associated
with the O.S.P. It corresponds to the von Neumann price vector.
REMARK: The above inequality holds, a fortiori, for all $(x, y) \in T$ and $c \in C$
such that $x + c \le r_0$ for a given $r_0 \in Z$ [provided that such an (x, y, c) exists].
If $\hat{c}$ in the above theorem is not a satiation consumption [that is, if there
exists a $c' \in C$ such that $u(c') > u(\hat{c})$], then we can also assert that $p \ne 0$, so that
p is semipositive. To see this, suppose p = 0 in the inequality (10); then $u(\hat{c}) \ge u(c)$
for all $c \in C$, which contradicts $\hat{c}$ being a nonsatiation consumption. Suppose we
strengthen this assumption of nonsatiation such that, for $\hat{c}$, there exists a
${}^{i}c' = (\hat{c}_1, \hat{c}_2, \ldots, c_i', \ldots, \hat{c}_n) \in C$ such that $u({}^{i}c') > u(\hat{c})$, for all $i = 1, 2, \ldots, n$.
This means that the consumption vector $\hat{c}$ can be improved by changing the amount
of consumption of any commodity. In other words, at $\hat{c}$, the consumption of no commodity
reaches a satiation point. We call this assumption the strong nonsatiation assumption,
and we call the former assumption ["there exists a $c' \in C$ such that $u(c') >
u(\hat{c})$"] the weak nonsatiation assumption. If this strong nonsatiation assumption
holds, then we can assert that p is strictly positive. To see this, suppose the contrary,
so that $p_{i_0} = 0$ for some $i_0$ in $\{1, 2, \ldots, n\}$. Let $c_i = \hat{c}_i$, $x_i = \hat{x}_i$, and $y_i = \hat{y}_i$ for all i =
1, 2, ..., n, except for $i = i_0$ in inequality (10). Then we have $u(\hat{c}) \ge u({}^{i_0}c)$ for
all ${}^{i_0}c \in C$ where ${}^{i_0}c = (\hat{c}_1, \hat{c}_2, \ldots, c_{i_0}, \ldots, \hat{c}_n)$. This contradicts the strong
nonsatiation assumption. We may summarize this as a corollary of Theorem 7.B.4.
It should be clear that the O.S.P. is not necessarily unique. If u(c) is a strictly
concave function of c, then we can assert that $\hat{c}$ is unique. But this does not
guarantee that the associated input-output vector $(\hat{x}, \hat{y})$ is unique, unless T has
some special feature. Here we point out one such feature, that is, the strict convexity
of the set T.
Definition: The set T is said to be strictly convex if $(x, y) \in T$, $(x', y') \in T$, and
$(x, y) \ne (x', y')$ imply that there exists an $(\tilde{x}, \tilde{y}) \in T$ such that, for some $\theta$,
$0 < \theta < 1$,
and
$\tilde{y} > \theta y + (1 - \theta) y'$
Corollary 2: Suppose that u is strictly concave and that both T and C are convex.
Let $(\hat{x}, \hat{y}, \hat{c})$ be an O.S.P. Then
(i) $\hat{c}$ is unique.
(ii) If in addition, the assumptions of Theorem 7.B.4 hold with $p \ne 0$ and if T is
strictly convex, then the $(\hat{x}, \hat{y})$ associated with $\hat{c}$ is also unique. Thus $(\hat{x}, \hat{y}, \hat{c})$ is a
unique O.S.P.
PROOF:
(i) We first assert that the strict concavity of u implies the uniqueness of
$\hat{c}$. To prove this, suppose that $(\hat{x}, \hat{y}, \hat{c})$ and $(x', y', c')$ are two O.S.P.'s
such that $\hat{c} \ne c'$. Note that $u(\hat{c}) = u(c')$ by the definition of O.S.P. Let
$\tilde{c} \equiv \frac{1}{2}(\hat{c} + c')$, $\tilde{x} \equiv \frac{1}{2}(\hat{x} + x')$, and $\tilde{y} \equiv \frac{1}{2}(\hat{y} + y')$. Clearly $\tilde{x} + \tilde{c} \le \tilde{y}$. Also
$(\tilde{x}, \tilde{y}) \in T$ and $\tilde{c} \in C$, resulting from the convexity of T and C. Hence the
constraints for the defining programming problem are all satisfied for
$(\tilde{x}, \tilde{y}, \tilde{c})$. But owing to the strict concavity we have
$u(\tilde{c}) > \frac{1}{2}[u(\hat{c}) + u(c')] = u(\hat{c})$
which is a contradiction.
(ii) Next we show that the input-output vector $(\hat{x}, \hat{y})$ associated with $\hat{c}$ is
unique under the strict convexity of T. To show this, suppose the contrary
so that both $(\hat{x}, \hat{y}) \in T$ and $(x', y') \in T$ are associated with $\hat{c}$, where
$(\hat{x}, \hat{y}) \ne (x', y')$. In other words, both $(\hat{x}, \hat{y}, \hat{c})$ and $(x', y', \hat{c})$ are solutions
of the defining programming problem of O.S.P. Since the assumptions of
Theorem 7.B.4 hold, the necessary and sufficient condition for O.S.P.,
that is, the relations (10) and (11), holds. In view of (11) and the assumption
that both $(\hat{x}, \hat{y}, \hat{c})$ and $(x', y', \hat{c})$ are O.S.P.'s, we have
or
or
$p \cdot (y^* - x^* - \hat{c}) > 0$
Set $x = x^*$, $y = y^*$, and $c = \hat{c}$ in relation (10), and note (11). Then we
obtain
$p \cdot (y^* - x^* - \hat{c}) \le 0$
which is a contradiction. (Q.E.D.)
REMARK: As we remarked in subsection a, Gale [6] and Brock [3]
assumed that u is a function of (x, y) rather than of c and suppressed c from
their entire analysis. Hence in Gale [6], his assumption of the strict concavity
of u implies the uniqueness of his O.S.P., $(\hat{x}, \hat{y})$, without any assumption
such as the strict convexity of T. However, as Brock pointed out, the
strict concavity of u(x, y) precludes the von Neumann type model from the
analysis. Brock therefore assumed the concavity of u(x, y) instead of its
strict concavity to allow for the von Neumann model. To obtain some of his
major theorems, he also assumed that the O.S.P. is unique. We may wish
to obtain the conditions, other than the strict convexity of T and the strict
concavity of u, which would imply the uniqueness of the O.S.P. We leave
this to the interested reader.
REMARK: The uniqueness of the optimal attainable program
for the finite horizon problem can be established in a manner similar to
that in the above Corollary 2.
(i) The utility sequence of any attainable program cannot be infinitely better than
that of the O.S.P. (Theorem 7.B.5).
(ii) An attainable program can be "infinitely worse" than the O.S.P. by deviating
from it sufficiently. The characterization of such a path is given by Theorem
7.B.6.
(iii) Any attainable program that is not "infinitely worse" must converge to the
O.S.P. asymptotically (Theorem 7.B.8), if the O.S.P. is unique.
Neumann ray in establishing his turnpike theorem (see Section A of this chapter).
Following Gale [6], we now establish the above statements one by one. First
we prove the following.
Theorem 7.B.5: Suppose that $(\hat{x}, \hat{y}, \hat{c})$ is an O.S.P. and that (A-1), (A-2), and
(A-5) hold. Then, for any N and for any attainable program $\{(x_t, y_t, c_t)\}$ starting from
any given $r_0 \in Z$, there exists an M such that
(13) $\sum_{t=0}^{N} (u_t - \hat{u}) \le M$, where $u_t \equiv u(c_t)$ and $\hat{u} \equiv u(\hat{c})$
PROOF: Using (A-2) and (A-5), we apply Theorem 7.B.4. Then there exists a
$p \ge 0$ such that
(14) $u(\hat{c}) \ge u(c) + p \cdot (y - x - c)$ for all $(x, y) \in T$ and $c \in C$
or
(15) $u(c_t) - \hat{u} \le p \cdot (x_t + c_t - y_t)$ for all $(x_t, y_t) \in T$ and $c_t \in C$,
$t = 0, 1, 2, \ldots, N$
Now suppose $\{(x_t, y_t, c_t)\} \in A_N(r_0)$ so that $x_t + c_t \le y_{t-1}$, $t = 1, 2, \ldots, N$. Then
summing both sides of the above inequality (15) over t, we obtain the following
relation for $\{(x_t, y_t, c_t)\} \in A_N(r_0)$:
(16) $\sum_{t=0}^{N} (u_t - \hat{u}) \le \sum_{t=0}^{N} p \cdot (x_t + c_t - y_t)$
$= p \cdot (x_0 + c_0) + \sum_{t=1}^{N} p \cdot (x_t + c_t - y_{t-1}) - p \cdot y_N$
$\le p \cdot (x_0 + c_0) - p \cdot y_N \le p \cdot r_0 - p \cdot y_N$
The last two inequalities hold because $x_t + c_t \le y_{t-1}$ and $x_0 + c_0 \le r_0$ with
$p \ge 0$. Since T and Z are bounded, there exists a real number M, independent
of N, such that
(17) $p \cdot r_0 - p \cdot y_N \le M$
Hence
$\sum_{t=0}^{N} (u_t - \hat{u}) \le M$
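The bound obtained in this proof, namely that the partial sums of $u_t - \hat{u}$ never exceed $p \cdot r_0 - p \cdot y_N$, can be observed in a simple simulation. The following sketch is illustrative only and rests on assumptions that are not in the text: a one-commodity technology with frontier $y = 2\sqrt{x}$, utility $u(c) = \log(1 + c)$ (so that $\hat{c} = 1$ and $p = 1/2$), and a hypothetical constant-saving rule that reinvests the fraction s of each period's output.

```python
import math

C_HAT = 1.0   # stationary consumption of the assumed one-commodity example
P = 0.5       # its associated price, u'(C_HAT) for u(c) = log(1 + c)


def f(x):
    """Hypothetical production frontier y = 2 sqrt(x)."""
    return 2.0 * math.sqrt(x)


def u(c):
    """Hypothetical utility u(c) = log(1 + c)."""
    return math.log1p(c)


def sums_and_bounds(r0, s, horizon):
    """Partial sums of u_t - u_hat and the bounds p*r0 - p*y_N for N = 0..horizon.

    The program uses x_0 + c_0 = r0 and x_t + c_t = y_{t-1}, with x_t the
    fraction s of the available stock (a hypothetical saving rule).
    """
    x, c = s * r0, (1.0 - s) * r0
    total, sums, bounds = 0.0, [], []
    for _ in range(horizon + 1):
        y = f(x)
        total += u(c) - u(C_HAT)
        sums.append(total)
        bounds.append(P * (r0 - y))
        x, c = s * y, (1.0 - s) * y   # split this period's output for the next period
    return sums, bounds
```

With `r0 = 0.5`, the rule `s = 0.5` drives the program toward the stationary point and its partial sums settle near a finite value, while `s = 0.3` yields partial sums that keep falling; in both cases every partial sum stays below the corresponding bound.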
Definition: An infinite horizon attainable program $\{(x_t, y_t, c_t)\}$, starting from an
arbitrary given $r_0 \in Z$, is called eligible if its associated utility series is bounded
from below; that is, there exists a real number E such that
(18) $\sum_{t=0}^{N} (u_t - \hat{u}) \ge E$ for any N
Theorem 7.B.6: If an attainable program $\{(x_t, y_t, c_t)\}$ starting from any given
$r_0 \in Z$ is not eligible, and if (A-1), (A-2), and (A-5) hold, then
(19) $\sum_{t=0}^{N} (u_t - \hat{u}) \to -\infty$ as $N \to \infty$
PROOF: Since $\{(x_t, y_t, c_t)\}$ is not eligible, for any E there exists an N dependent
on E such that
(20) $\sum_{t=0}^{N} (u_t - \hat{u}) < E$
Also, in view of relation (15), we have
Since $x_t + c_t - y_{t-1} \le 0$ for all t for $\{(x_t, y_t, c_t)\} \in A(r_0)$, this implies
It follows that
(24) $\sum_{t=0}^{N} (u_t - \hat{u}) \to -\infty$ as $N \to \infty$
Theorem 7.B.7: Suppose that an O.S.P., $(\hat{x}, \hat{y}, \hat{c})$, exists. Then under (A-1), (A-6),
and (A-7), there exists an eligible attainable path starting from an arbitrary given
$r_0$ in Z.
PROOF: By (A-6), there exist an $(x, y) \in T$ and a $c \in C$ such that $x + c < y$
and $x + c \le r_0$. Define $x_t$, $y_t$, and $c_t$ as
(26) $x_t = (1 - \lambda)\hat{x} + \lambda x_{t-1}$, where $0 < \lambda < 1$
$= (1 - \lambda^t)\hat{x} + \lambda^t x$,
$y_t = (1 - \lambda)\hat{y} + \lambda y_{t-1}$
$c_t = (1 - \lambda)\hat{c} + \lambda c_{t-1}$
We now show that $x_t + c_t \le y_{t-1}$ and $(x_t, y_t) \in T$ for all t = 1, 2, .... First
note that this is true for t = 1 by putting $y_0 = y$. Next, by mathematical induction
on t,
(35) $x_{t+1} + c_{t+1} = (1 - \lambda)\hat{x} + \lambda x_t + (1 - \lambda)\hat{c} + \lambda c_t$
$= (1 - \lambda)(\hat{x} + \hat{c}) + \lambda(x_t + c_t) \le (1 - \lambda)\hat{y} + \lambda y_{t-1} = y_t$
using $x_t + c_t \le y_{t-1}$. Also $(x_t, y_t) \in T$ implies $(x_{t+1}, y_{t+1}) \in T$, owing to the
convexity of T. That $c_t \in C$ for all t = 1, 2, ..., can also be shown easily by
mathematical induction. First recall $c_1 \in C$. Then the convexity of C with
$\hat{c} \in C$ and $c_t \in C$ implies $c_{t+1} \in C$ in view of (34). Let $x_0 \equiv x$ and $c_0 \equiv c$.
Then $x_0 + c_0 \le r_0$. Hence $\{(x_t, y_t, c_t)\}$, t = 0, 1, 2, ..., constitutes an attainable
program.
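The construction used in this proof, a path that moves geometrically from a feasible starting triple toward the stationary triple, is easy to trace numerically. The sketch below is a hedged illustration in an assumed one-commodity economy (frontier $y = 2\sqrt{x}$, utility $\log(1 + c)$, stationary triple $(1, 2, 1)$); the starting triple, the weights, and the function names are hypothetical. It generates the path by the convex-combination rule and checks the attainability constraints period by period.

```python
import math

X_HAT, Y_HAT, C_HAT = 1.0, 2.0, 1.0   # stationary triple of the assumed example


def f(x):
    """Hypothetical production frontier y = 2 sqrt(x)."""
    return 2.0 * math.sqrt(x)


def u(c):
    """Hypothetical utility u(c) = log(1 + c)."""
    return math.log1p(c)


def convex_path(x0, y0, c0, lam, horizon):
    """Iterate (x, y, c) <- (1 - lam) * stationary triple + lam * (x, y, c)."""
    path = [(x0, y0, c0)]
    for _ in range(horizon):
        x, y, c = path[-1]
        path.append(((1 - lam) * X_HAT + lam * x,
                     (1 - lam) * Y_HAT + lam * y,
                     (1 - lam) * C_HAT + lam * c))
    return path


def is_attainable(path, r0, tol=1e-12):
    """Check x_0 + c_0 <= r0, (x_t, y_t) in T, c_t >= 0, and x_t + c_t <= y_{t-1}."""
    x0, _, c0 = path[0]
    if x0 + c0 > r0 + tol:
        return False
    if any(y > f(x) + tol or c < -tol for x, y, c in path):
        return False
    return all(x + c <= y_prev + tol
               for (_, y_prev, _), (x, _, c) in zip(path, path[1:]))


def utility_gap_sums(path):
    """Partial sums of u(c_t) - u(c_hat) along the path."""
    total, sums = 0.0, []
    for _, _, c in path:
        total += u(c) - u(C_HAT)
        sums.append(total)
    return sums
```

Starting from `(0.3, f(0.3), 0.2)` with `r0 = 0.5`, the weight `lam = 0.8` gives a path that passes every check and whose partial sums of utility gaps level off, since the gaps shrink geometrically. In this particular instance the smaller weight `lam = 0.5` already violates the first-period constraint, so the choice of the weight matters for the first step.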
To show that the program is eligible, note that u has a bounded steep-
ness by (A-7). In other words, there exists a a > 0 such that
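Theorem 7.B.8: Suppose that (A-2) and (A-5) hold and that $(\hat{x}, \hat{y}, \hat{c})$ is the unique
O.S.P. with strict convexity of T and strict concavity of u. Then if an attainable program
$\{(x_t, y_t, c_t)\}$ starting from $r_0$ is eligible, then $\{(x_t, y_t, c_t)\}$ converges to $(\hat{x}, \hat{y}, \hat{c})$ as
$t \to \infty$, regardless of the value of $r_0$ in Z.
PROOF: By the hypothesis of the present theorem, Theorem 7.B.4 holds, so
that relation (15) also follows. (See the proof of Theorem 7.B.5.) Rewriting
relation (15), we obtain
(38) $u_t - \hat{u} = p \cdot (x_t + c_t - y_t) - \beta_t$, $t = 0, 1, 2, \ldots$
for all $(x_t, y_t) \in T$ and $c_t \in C$, where $u_t \equiv u(c_t)$, $\hat{u} \equiv u(\hat{c})$, and $\beta_t \ge 0$ for all
t. Summing this, we obtain
(39) $\sum_{t=0}^{N} (u_t - \hat{u}) = \sum_{t=0}^{N} p \cdot (x_t + c_t - y_t) - \sum_{t=0}^{N} \beta_t$
$= p \cdot (x_0 + c_0 - y_N) + \sum_{t=1}^{N} p \cdot (x_t + c_t - y_{t-1}) - \sum_{t=0}^{N} \beta_t$
Since $\{(x_t, y_t, c_t)\} \in A(r_0)$, we have
(40) $\sum_{t=0}^{N} (u_t - \hat{u}) \le M - \sum_{t=0}^{N} \beta_t$
But the eligibility of $\{(x_t, y_t, c_t)\}$ implies that there exists an E such that
$E \le \sum_{t=0}^{N} (u_t - \hat{u})$. Hence we have
$E \le \sum_{t=0}^{N} (u_t - \hat{u}) \le M - \sum_{t=0}^{N} \beta_t$
or
(41) $\sum_{t=0}^{N} \beta_t \le M - E$, $N = 0, 1, \ldots$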
REMARK: For the case in which $(\hat{x}, \hat{y}, \hat{c})$ may not be a unique O.S.P., see
Brock ([3], his lemma 4).
Corollary: Suppose that the assumptions of the previous theorem hold. Then
$\sum_{t=0}^{\infty} (u_t - \hat{u})$ converges to a finite value.
PROOF: By (39) and the eligibility of the program,
$\sum_{t=1}^{N} p \cdot (x_t + c_t - y_{t-1}) \ge E + p \cdot (y_N - r_0) + \sum_{t=0}^{N} \beta_t$, for all N
Here a real number E which is independent of N exists owing to the eligibility
of the program. Since $\sum_{t=0}^{\infty} \beta_t$ is convergent (from the proof of Theorem
7.B.8) and $y_N \to \hat{y}$ as $N \to \infty$, the series $\sum_{t=1}^{\infty} p \cdot (x_t + c_t - y_{t-1})$ is bounded
from below by $k - p \cdot r_0$ where $k \equiv E + p \cdot \hat{y} + \sum_{t=0}^{\infty} \beta_t$, a fixed number.
(Since Z is bounded, $p \cdot r_0$ is also bounded.) Since $p \ge 0$ and $(x_t + c_t -
y_{t-1}) \le 0$ for $\{(x_t, y_t, c_t)\} \in A(r_0)$, and $\sum_{t=1}^{\infty} p \cdot (x_t + c_t - y_{t-1})$ is bounded
from below, this series $\sum_{t=1}^{\infty} p \cdot (x_t + c_t - y_{t-1})$ is monotone nonincreasing
and converges to a finite value. Therefore, in view of equation (39) and recalling
again that $y_N \to \hat{y}$ as $N \to \infty$, $\sum_{t=0}^{\infty} (u_t - \hat{u})$ converges to a fixed
value. (Q.E.D.)
(44) $\sum_{t=0}^{\infty} u(c_t)$
or
(45) $\sum_{t=0}^{\infty} \frac{u(c_t)}{(1 + \rho)^t}$, where $\rho \ge 0$ is a discount factor
Following Ramsey [16] and Gale [6], we assume that the discount factor is zero.
But we cannot simply adopt the target function such as (44), for such a target may
diverge to infinity in many attainable programs. To avoid such a situation for
infinite horizon programs, we define optimality as follows.
Definition: An attainable program $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\}$ starting from the initial resource
vector $r_0$ is said to be optimal if there exists an $\bar{N}$ such that, for any attainable
program $\{(x_t, y_t, c_t)\}$ starting from the same $r_0$,
(46) $\sum_{t=0}^{N} [u(\hat{c}_t) - u(c_t)] \ge 0$ for all $N \ge \bar{N}$
The question that we wish to ask now is whether there exists an optimal
attainable program and, if so, what the characteristics of such a program are.
Gale answered both questions simultaneously by constructing such a program.
This is probably the most important although the most tedious part of his paper
[6]. Brock in a recent paper [3] has simplified this tedious procedure. In this
simplification, the following lemma, which we call Brock's lemma, plays a central
role.
Brock's Lemma: Suppose that (A-1), (A-2), and (A-5) hold. Let $\{(x_t, y_t, c_t)\}$ be an
attainable program starting from an arbitrary given $r_0 \in Z$. Assume that an eligible
attainable program exists starting from $r_0$. Let $(\hat{x}, \hat{y}, \hat{c})$ be an O.S.P. with an associated
price vector p. Then there exists a nonnegative sequence $\{\delta_t\}_{t=0}^{\infty}$ associated
with $\{(x_t, y_t, c_t)\}$ such that
Since $x_t + c_t \le y_{t-1}$, for all t = 1, 2, ..., and $x_0 + c_0 \le r_0$, for any attainable
$\{(x_t, y_t, c_t)\}$, we have $\delta_t \ge 0$ for all t = 0, 1, 2, .... This proves
the first statement of the lemma.
(ii) Consider an attainable program $\{(x_t, y_t, c_t)\}$ starting from an arbitrary
given $r_0$ which is eligible. Then for any N = 0, 1, 2, ..., there exist E
and B such that
(49) $E \le \sum_{t=0}^{N} (u_t - \hat{u}) = p \cdot (r_0 - y_N) - \sum_{t=0}^{N} \delta_t \le B - \sum_{t=0}^{N} \delta_t$
where the existence of a bound B, independent of N and $r_0 \in Z$, can
be asserted owing to the boundedness of T and Z. Therefore we have
(50) $\sum_{t=0}^{N} \delta_t \le B - E$, N = 1, 2, ....
Hence $\sum_{t=0}^{\infty} \delta_t < \infty$ (that is, bounded from above), for any eligible and
attainable $\{(x_t, y_t, c_t)\}$ starting from $r_0 \in Z$.
(iii) Given any $\{(x_t, y_t, c_t)\}$ starting from a given $r_0$, we can obtain the
associated sequence of $\delta_t \ge 0$, t = 0, 1, 2, ..., defined in (48). Let $\alpha$ be
defined by
(51) $\alpha \equiv \inf \{\sum_{t=0}^{\infty} \delta_t : \{\delta_t\}$ is associated with an attainable $\{(x_t, y_t, c_t)\}$
starting from $r_0\}$
Here the infimum is taken over the set of all attainable $\{(x_t, y_t, c_t)\}$ starting
from a given $r_0$. Since an eligible program exists by assumption,
$\alpha < \infty$ [see step (ii) of the proof]. Starting from $r_0$, there may not exist
any attainable program such that its associated series is equal to $\alpha$. Our
task now is to show that there does exist such a program. That is, we wish
to find an attainable program $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\}$ starting from $r_0$ such that its
associated sequence $\{\hat{\delta}_t\}$ is such that
(52) $\alpha = \sum_{t=0}^{\infty} \hat{\delta}_t$
In other words, we wish to find a program $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\}$ such that its
associated series $\sum_{t=0}^{\infty} \hat{\delta}_t$ is minimal in the class of programs starting
from a given $r_0$.
(iv) By the definition of $\alpha$, there exists an attainable program $\{(x_t^N, y_t^N, c_t^N)\}$
starting from $r_0$ with its associated series $\sum_{t=0}^{\infty} \delta_t^N$ such that
(53) $\sum_{t=0}^{\infty} \delta_t^N \le \alpha + \frac{1}{N + 1}$, N = 0, 1, 2, ...
Now for a given t, consider $(x_t^N, y_t^N, c_t^N)$ as a sequence over N, where
N = 0, 1, 2, .... Then owing to the compactness of $T \otimes C$, it contains a
convergent subsequence whose limit is in $T \otimes C$. That is, there exists an
$\{N'\} \subset \{N\}$ such that
(54) $(x_t^{N'}, y_t^{N'}, c_t^{N'}) \to (\hat{x}_t, \hat{y}_t, \hat{c}_t)$ as $N' \to \infty$
where $(\hat{x}_t, \hat{y}_t) \in T$ and $\hat{c}_t \in C$. Note that $x_t^{N'} + c_t^{N'} \le y_{t-1}^{N'}$ for all t, so
that we have $\hat{x}_t + \hat{c}_t \le \hat{y}_{t-1}$. Hence $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\} \in A(r_0)$. Owing to the
compactness of $T \otimes C$, the boundedness of Z, and the continuity of u,
the sequence $\{\delta_t^{N'}\}$ (sequence with respect to N') is bounded for each t
[recall equations (38) and (48)]. Hence for each t there exists a convergent
subsequence of $\{\delta_t^{N'}\}$; that is, there exists a subsequence $\{M\} \subset
\{N'\}$ such that, for each t = 0, 1, 2, ...,
(55) $\delta_t^M \to \hat{\delta}_t$ as $M \to \infty$
(56) $\sum_{t=0}^{\infty} \hat{\delta}_t \ge \alpha$
by the definition of $\alpha$.
Write $\beta \equiv \sum_{t=0}^{\infty} \hat{\delta}_t$. Suppose $\beta > \alpha$. Then choose $r_1$ and $r_2$ such that
(57) $\beta > r_2 > r_1 > \alpha$
Choose $N_0$ large enough so that
(58) $\sum_{t=0}^{N_0} \hat{\delta}_t > r_2$
(59) -Y S,M = rl
t=O
M+1 >rl
(61) +
Theorem 7.B.9: Suppose that (A-1), (A-2), and (A-5) hold, that an eligible attainable
program exists starting from $r_0$, and that the O.S.P. is unique. Then there exists
an attainable program $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\}$ starting from $r_0$ such that, for any attainable
program $\{(x_t, y_t, c_t)\}$ starting from the same $r_0$, there exists an $\bar{N}$ such that
(62) $\sum_{t=0}^{N} [u(\hat{c}_t) - u(c_t)] \ge 0$ for all $N \ge \bar{N}$
PROOF: Let $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\}$ be the program with minimal associated series
$\sum_{t=0}^{\infty} \hat{\delta}_t$ which is obtained in the previous lemma. We claim that this is
the optimal program that is desired in Theorem 7.B.9. As remarked before,
(63) $\sum_{t=0}^{N} [u(\hat{c}_t) - u(c_t)] = p \cdot [r_0 - r_0 + (y_N - \hat{y}_N)] + \sum_{t=0}^{N} \delta_t - \sum_{t=0}^{N} \hat{\delta}_t$
Also as a result of the eligibility of the two programs and the uniqueness
of the O.S.P., $y_N \to \hat{y}$ and $\hat{y}_N \to \hat{y}$ as $N \to \infty$. By definition, $\sum_{t=0}^{\infty} \hat{\delta}_t$ is the
minimal series so that $\sum_{t=0}^{\infty} \delta_t \ge \sum_{t=0}^{\infty} \hat{\delta}_t$. Hence the conclusion of the
theorem follows from (63). (Q.E.D.)
REFERENCES
1. Arrow, K. J., Hurwicz, L., and Uzawa, H., "Constraint Qualification in Maximiza-
tion Problems," Naval Research Logistics Quarterly, 8, June 1961.
2. Atsumi, H., "Neoclassical Growth and the Efficient Program of Capital Accumula-
tion," Review of Economic Studies, XXXII, April 1965.
3. Brock, W. A., "On Existence of Weakly Maximal Programmes in a Multi-Sector
Economy," Review of Economic Studies, XXXVII, April 1970.
4. Debreu, G., Theory of Value, Cowles Foundation Monograph, No. 17, New York,
Wiley, 1959.
5. Drandakis, E. M., "On Efficient Accumulation Paths in the Closed Production
Model," Econometrica, 34, April 1966.
6. Gale, D., "On Optimal Development in a Multi-Sector Economy," Review of
Economic Studies, XXXIV, January 1967 (also "Correction," Review of Economic
Studies, XXXVIII, July 1971).
7. Gale, D., "A Geometric Duality Theorem with Economic Applications," Review of
Economic Studies, XXXIV, January 1967.
8. Koopmans, T. C., "Analysis of Production as an Efficient Combination of Activ-
ities," in Activity Analysis of Production and Allocation, ed. by T. C. Koopmans,
Cowles Foundation Monograph, No. 13, New York, Wiley, 1951, chap. 3.
9. Koopmans, T. C., "Economic Growth at a Maximal Rate," Quarterly Journal of Economics,
LXXVIII, August 1964.
10. Koopmans, T. C., "On the Concept of Optimal Economic Growth," in The Econometric
Approach to Development Planning, Pontificiae Academiae Scientiarvm Scriptvm
Varia, Amsterdam, North-Holland, 1965.
11. Malinvaud, E., "Capital Accumulation and Efficient Allocation of Resources,"
Econometrica, 21, April 1953 (also "A Corrigendum," Econometrica, 30, July 1962).
12. McKenzie, L. W., "Turnpike Theorems for a Generalized Leontief Model," Econo-
metrica, 31, January-April 1963.
13. McKenzie, L. W., "Accumulation Programs of Maximum Utility and the von Neumann Facet,"
in Value, Capital and Growth, Papers in Honour of Sir John Hicks, ed. by J. N. Wolfe,
Edinburgh, Edinburgh University Press, 1968.
14. Nikaido, H., Convex Structures and Economic Theory, New York, Academic Press,
1968.
15. Radner, R., "Paths of Economic Growth that Are Optimal with Regard Only to
Final States: A Turnpike Theorem," Review of Economic Studies, XXVIII, February
1961.
16. Ramsey, F. P., "A Mathematical Theory of Saving," Economic Journal, XXXVIII,
December 1928.
17. Tsukui, J., "Turnpike Theorem in a Generalized Dynamic Input-Output System,"
Econometrica, 34, April 1966.
18. Tsukui, J., "The Consumption and the Output Turnpike Theorems in a von Neumann
Type of Model-A Finite Term Problem," Review of Economic Studies, XXXIV,
January 1967.
19. Uzawa, H., "The Kuhn-Tucker Theorem in Concave Programming," in Studies in
Linear and Non-Linear Programming, ed. by K. J. Arrow, L. Hurwicz, and H. Uzawa,
Stanford, Calif., Stanford University Press, 1958.
20. von Weizsacker, C. C., "Existence of Optimal Programs of Accumulation for
an Infinite Time Horizon," Review of Economic Studies, XXXII, April 1965.
CHAPTER 8
DEVELOPMENTS OF OPTIMAL CONTROL THEORY
AND ITS APPLICATIONS
Section A
PONTRYAGIN'S MAXIMUM PRINCIPLE
traditional control theory in various fields of engineering, but also it has attracted
attention throughout our society. It can be considered a mathematical theory with
applications extending to all of human activity. Mathematically, optimal control
theory is closely related to the calculus of variations, as can be suggested by the
problem of optimal growth. In fact, optimal control theory provides a link to the
vast literature on the calculus of variations. And, by contrast to the classical
calculus of variations, optimal control theory incorporates general constraints
imposed on the problem in a direct and natural way. The work by the famous
Russian mathematician L. S. Pontryagin and his associates [11] is chiefly
responsible for this new approach. Although pioneering works were done by
F. A. Valentine in 1937, E. J. McShane in 1939 and 1940, and M. R. Hestenes in
1949, this new approach has attracted a huge audience of mathematicians,
engineers, economists, and so on, only after the publication (of the English
translation) of Pontryagin et al. [11]. Especially since the publication of this
work, the literature in the field of optimal control theory has been increasing very
rapidly, and already includes a number of good textbooks (for example, [2],
[8], and [9]).
The basic result of Pontryagin et al. [11] is called Pontryagin's maximum
principle, which is concerned with the necessary conditions for optimality. This
condition is analogous to the maximization of the Lagrangian in the classical
theory of nonlinear programming. Further results by Hestenes [5] and others
extended this condition to incorporate various kinds of constraints.
The purpose of this chapter is to give an expository account of this theory
and to illustrate it with some applications in economics. Since a rigorous exposi-
tion of optimal control theory requires a book, and since there are several such
books available, our exposition here is rather intuitive.
Consider a system of n first-order differential equations
the $f_i$'s, $x_i$'s, and $u_k$'s are all real-valued functions. The boundary conditions for
(1) are given by
u(t)'s in U is denoted by $\mathcal{U}$. The region U is called the control region and $\mathcal{U}$ is called
the set of admissible controls. When $u(t) \in \mathcal{U}$, u(t) is called an admissible control
(function). In this section, we assume that the control region U is independent of
x(t) and t. The case in which U is restricted by a constraint such as $g(x, u, t) \ge 0$
will be discussed in Section C. Throughout this chapter, we assume that $\mathcal{U}$ is
restricted to the set where u(t) is "piecewise continuous." By piecewise continuous
we mean that a function is continuous except possibly at a finite number of points.
It is important to note that discontinuities are allowed for the control functions
(recall footnote 7). Notice that the control region U can be a closed set. In other
words, U can incorporate a constraint such as
$0 \le u(t) \le 1$ for all t
Such a bound may appear if, for example, u(t) is the propensity to save at time t.
The $f_i$'s are assumed to be continuous in each $x_i$, $u_k$, and t, and possess
continuous partial derivatives with respect to each $x_i$ and t. The range of x(t)
is denoted by X, which is assumed to be an open connected subset of $R^n$. The
boundary point $(x^0, t^0)$ must be such that $x^0 \in X$ and $t^0 \in (t_1, t_2)$. It is required
that x(t) be continuous and have piecewise continuous derivatives.
We now set the target as follows (where T is a fixed constant):
(3) $S \equiv \sum_{i=1}^{n} c_i x_i(T)$, where $T \in (t_1, t_2)$
Theorem 8.A.1: Under the above specifications of the problem, in order that $\hat{u}(t)$ be a
solution of the above problem with the corresponding state variable $\hat{x}(t)$, it is necessary
that there exist a nonzero, continuous vector-valued function $p(t) = [p_1(t), p_2(t), \ldots,
p_n(t)]$ such that
(i) p(t) together with $\hat{u}(t)$ and $\hat{x}(t)$ solve the following Hamiltonian system:
(4) $\dot{x}_i = \frac{\partial \hat{H}}{\partial p_i}$, $\dot{p}_i = -\frac{\partial \hat{H}}{\partial x_i}$, $i = 1, 2, \ldots, n$
where H is defined by
$\dot{p}_i = -\sum_{j=1}^{n} p_j \frac{\partial \hat{f}_j}{\partial x_i}$, $i = 1, 2, \ldots, n$
where $\hat{f}$ denotes $f[\hat{x}(t), \hat{u}(t), t]$. In other words, Theorem 8.A.1 produces
2n first-order differential equations with 2n boundary conditions $x_i(t^0) =
x_i^0$ and $p_i(T) = c_i$, $i = 1, 2, \ldots, n$; hence the actual solution of the above
problem is reduced to solving a system of differential equations. Clearly this
system is unsolvable unless the function u(t) is specified. The choice of u(t)
depends upon condition (ii) (that is, the maximization of H). The triplet
$[\hat{x}(t), \hat{u}(t), p(t)]$ thus found in Theorem 8.A.1 is called the optimal triplet or
the solution triplet. The pair $[\hat{x}(t), \hat{u}(t)]$ is called the optimal pair or the
solution pair.
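The way in which the Hamiltonian system and the maximization of H determine the control can be traced in a small numerical experiment. The problem below is a hypothetical illustration and is not taken from the text: the state equations are $\dot{x}_1 = u x_1$ and $\dot{x}_2 = (1 - u) x_1$ with $0 \le u \le 1$, $x_1(0) = 1$, $x_2(0) = 0$, and the target is $x_2(T)$ with T = 2, so that c = (0, 1). Assuming the usual form $H = p_1 u x_1 + p_2 (1 - u) x_1$, the costate equations are $\dot{p}_1 = -[p_1 u + p_2 (1 - u)]$ and $\dot{p}_2 = 0$ with $p_1(T) = 0$, $p_2(T) = 1$, and H is maximized by u = 1 when $p_1 > p_2$ and by u = 0 otherwise. The sketch integrates the costates backward and the states forward with a simple Euler scheme.

```python
import math


def bang_bang_solution(horizon=2.0, steps=20000, x1_start=1.0):
    """Euler sketch for the hypothetical problem described in the lead-in.

    Returns the first time at which the H-maximizing control equals 0,
    together with the terminal value x2(T).
    """
    dt = horizon / steps
    p2 = 1.0                      # p2 is constant because H does not depend on x2
    p1 = [0.0] * (steps + 1)      # terminal condition p1(T) = c1 = 0
    # Backward pass: the costate equation involves only p1 and the control,
    # and the control is chosen to maximize H (u = 1 iff p1 > p2).
    for k in range(steps, 0, -1):
        u_k = 1.0 if p1[k] > p2 else 0.0
        dp1 = -(p1[k] * u_k + p2 * (1.0 - u_k))
        p1[k - 1] = p1[k] - dt * dp1
    # Forward pass: integrate the states under the same control rule.
    x1, x2 = x1_start, 0.0
    switch_time = None
    for k in range(steps):
        u_k = 1.0 if p1[k] > p2 else 0.0
        if u_k == 0.0 and switch_time is None:
            switch_time = k * dt
        x1, x2 = x1 + dt * u_k * x1, x2 + dt * (1.0 - u_k) * x1
    return switch_time, x2
```

For this instance the control switches from 1 to 0 close to t = 1, and the computed terminal value is close to e, which agrees with the closed-form solution obtained by accumulating until t = T - 1 and consuming afterward; the Euler scheme is used only for transparency and is accurate to first order in the step size.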
REMARK: Note that the target function as described in (3) is more general
than it appears, as it includes the following target function:
(6) $I \equiv \int_0^T f_0[x(t), u(t), t]\,dt$
To see this, define $x_0(t)$ by $\dot{x}_0 = f_0[x(t), u(t), t]$ with $x_0(0) = 0$. Then $I =
x_0(T)$, which is clearly a special case of (3). Hence the problem of maximizing
I subject to (1) and (2) can be converted to the problem of simply maximizing
$x_0(T)$ subject to (1), (2), and $\dot{x}_0 = f_0[x(t), u(t), t]$ and $x_0(0) = 0$. We can then
immediately apply Theorem 8.A.1.
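As a worked illustration of this conversion, suppose (as an assumption of this sketch) that the Hamiltonian has the usual form of a price-weighted sum of the right-hand sides. Adding the state $x_0$ with target weights $c_0 = 1$ and $c_1 = \cdots = c_n = 0$, and noting that none of the functions depends on $x_0$, the conditions of Theorem 8.A.1 would read:

```latex
\begin{aligned}
&\dot{x}_0 = f_0[x(t), u(t), t], \quad x_0(0) = 0, \qquad S = x_0(T),\\
&H = p_0 f_0 + \sum_{i=1}^{n} p_i f_i,\\
&\dot{p}_0 = -\frac{\partial H}{\partial x_0} = 0, \quad p_0(T) = 1
  \;\Longrightarrow\; p_0(t) \equiv 1,\\
&\dot{p}_i = -\frac{\partial f_0}{\partial x_i}
  - \sum_{j=1}^{n} p_j \frac{\partial f_j}{\partial x_i},
  \quad p_i(T) = 0, \quad i = 1, 2, \ldots, n.
\end{aligned}
```

The auxiliary variable attached to $x_0$ is thus constant and equal to one, so that the integrand $f_0$ enters the Hamiltonian with unit weight.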
REMARK: In the above formulation of the problem, we defined the target
function by $S \equiv \sum_{i=1}^{n} c_i x_i(T)$. We noted in the above remark that an integral
target in the form of
$I \equiv \int_0^T f_0[x(t), u(t), t]\,dt$
can be converted into the form of S. The converse is also true. In other words,
the target in the form of S can be converted to the above integral form. To
see this, note that
$S = \sum_{i=1}^{n} c_i x_i(T) = \int_0^T \sum_{i=1}^{n} c_i \dot{x}_i(t)\,dt + \sum_{i=1}^{n} c_i x_i(0)$
$J \equiv \int_0^T \sum_{i=1}^{n} c_i \dot{x}_i(t)\,dt$
Hence the maximization of S subject to $x_i(0) = x_i^0$, $\dot{x}_i = f_i[x(t), u(t), t]$,
i = 1, 2, ..., n, is reformulated as follows:
REMARK: If we define M by
(7) $M[\hat{x}(t), t, p(t)] \equiv \sup_{u \in U} H[\hat{x}(t), u(t), t, p(t)]$
REMARK: Those readers who are familiar with the calculus of variations
problem with differential equation constraints should be able to see that
Theorem 8.A.1 may be reduced to a well-known result in the calculus of
variations when u(t) is in the interior of U. To see this, note that the
problem is reduced to one of maximizing
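$\int_0^T \sum_{i=1}^{n} c_i \dot{x}_i(t)\,dt$
subject to (1) and (2)
since the constant term $\sum_{i=1}^{n} c_i x_i(0)$ will not affect the solution $[\hat{x}(t), \hat{u}(t)]$.
Then form the "Lagrangian," $\Phi \equiv \sum_{i=1}^{n} c_i \dot{x}_i + \sum_{i=1}^{n} \lambda_i (\dot{x}_i - f_i)$. Viewing
this as a calculus of variations problem, we can write Euler's conditions
here as
$\dfrac{d}{dt}\dfrac{\partial \Phi}{\partial \dot{x}_i} = \dfrac{\partial \Phi}{\partial x_i}$ and $\dfrac{d}{dt}\dfrac{\partial \Phi}{\partial \dot{u}_k} = \dfrac{\partial \Phi}{\partial u_k}$ for all the i's and k's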
PROBLEM:
Subject to:
(9) $\dot{x}_i = f_i[x(t), u(t), t]$, i = 1, 2, ..., n
(10) $x_i(0) = x_i^0$ (fixed), i = 1, 2, ..., n
and
(11) $u(t) \in U$
where $x(t) = [x_1(t), \ldots, x_n(t)]$ and $u(t) = [u_1(t), \ldots, u_r(t)]$.
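First we define the "auxiliary variables" $p_i(t)$, i = 1, 2, ..., n, by
(12) $\dot{p}_i(t) = -\sum_{j=1}^{n} p_j(t)\,\dfrac{\partial \hat{f}_j}{\partial x_i}, \quad p_i(T) = c_i, \quad i = 1, 2, \ldots, n$
Here $\partial \hat{f}_j/\partial x_i$ denotes $\partial f_j/\partial x_i$ evaluated at $[\hat{x}(t), \hat{u}(t), t]$. Define the function H by
(13) $H[x(t), u(t), p(t)] \equiv \sum_{i=1}^{n} p_i(t) f_i[x(t), u(t), t]$
where $p(t) = [p_1(t), \ldots, p_n(t)]$; it is clear from the definition of the $p_i(t)$'s that
(14) $\dot{p}_i(t) = -\dfrac{\partial \hat{H}}{\partial x_i}$, where $\hat{H} \equiv H[\hat{x}(t), \hat{u}(t), p(t)]$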
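We assume that an optimal control vector $\hat{u}(t)$ has been found and let $\hat{x}(t)$
be the corresponding state vector. We are concerned with the characterization of
this solution pair $[\hat{x}(t), \hat{u}(t)]$. Consider now a variation $\Delta u(t)$ from the optimal
control vector $\hat{u}(t)$ such that $\hat{u}(t) + \Delta u(t) \in U$, and let $\Delta x(t)$ be the resulting total
variation from the optimal state vector $\hat{x}(t)$. Then from (9), we have
(15) $\Delta \dot{x}_i = f_i(\hat{x} + \Delta x, \hat{u} + \Delta u, t) - f_i(\hat{x}, \hat{u}, t), \quad i = 1, 2, \ldots, n$
Hence we obtain
(16) $\int_0^T \sum_{i=1}^{n} p_i \Delta \dot{x}_i\,dt = \int_0^T \sum_{i=1}^{n} p_i\,[f_i(\hat{x} + \Delta x, \hat{u} + \Delta u, t) - f_i(\hat{x}, \hat{u}, t)]\,dt$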
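Then, rewriting the second term of the RHS of (17) by utilizing (12), and using
(16) and (17), we obtain
(18) $\left[\sum_{i=1}^{n} p_i \Delta x_i\right]_0^T = -\int_0^T \sum_{i=1}^{n}\sum_{j=1}^{n} p_i(t)\,\dfrac{\partial \hat{f}_i}{\partial x_j}\,\Delta x_j\,dt + \int_0^T \sum_{i=1}^{n} p_i\,[f_i(\hat{x} + \Delta x, \hat{u} + \Delta u, t) - f_i(\hat{x}, \hat{u}, t)]\,dt$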
Since the initial state vector x(0) is assumed to be fixed, Ax(0) = 0. Note that in
(12), pi(T) is chosen such that pi(T) = ci. Hence we have
n T n
where $0 < \xi < 1$. Here it is assumed that the continuous first and second partial
derivatives of f exist. From (18), (19), and (20), we obtain
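$\cdots + \dfrac{1}{2}\int_0^T \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n} p_i\,\dfrac{\partial^2 f_i}{\partial x_j\,\partial x_k}(\hat{x} + \xi \Delta x,\ \hat{u} + \Delta u,\ t)\,\Delta x_j\,\Delta x_k\,dt$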
Now recall our special form of the f's, that is, equation (8). Then, owing to (8), the
last two members of the RHS of (21) vanish and $\Delta S$ becomes
A sufficient condition for a maximum of the payoff function S at $(\hat{x}, \hat{u}, t)$ is clearly
$\Delta S \leq 0$, and a sufficient condition for $\Delta S \leq 0$, in turn, is obtained from (22) as
(23) $H(\hat{x}, \hat{u} + \Delta u, t) - H(\hat{x}, \hat{u}, t) \leq 0$
for all $0 \leq t \leq T$.
C. VARIOUS CASES
As already remarked, the above theorem is concerned with the case in which
the time horizon (T) is fixed and the end-point x(T) is not a priori fixed (it is deter-
mined from the solution of the problem). However, in many circumstances this
may not be the case. For example, if the target of the problem is to minimize
the time (T) to reach a certain target, then T is not a priori specified but it is
rather obtained as a solution of the problem. Such a problem is called the time
optimal problem. In general, we can formulate various problems depending, first,
on whether or not some (or all) coordinates of the state vector x(T) are a priori
fixed and, second, on whether or not the "final time" (T) is fixed.
A few examples are now in order. In the problem of minimizing the time to
fill a bathtub, the final state x(T) = 100 (%) is a priori fixed, but the final time T
is not specified; it is determined as a solution of the problem. In the problem of
shooting a missile to intercept an airplane in a minimum amount of time, both the
final time T and the final state x(T) are unspecified. They are determined as a
part of the solution. In the optimal growth problem of maximizing the discounted
sum (integral) of utilities over time [0, T] with fixed initial and terminal
capital:labor ratios, k0 and kT, both the final time T and the final state k(T) are
a priori fixed.
We now turn to a general consideration of such problems. First we discuss
the case in which m coordinates of the terminal value of the state vector, x(T),
are a priori fixed, where m < n or m = n. Next we consider the case in which the
final time T is not a priori fixed.
(i) The Right-Hand End-Point x(T) Partially Specified: In other words
Here the $\lambda_i$'s are unknown variables that are constant over time. Note that
in (27) we have m new equations, which correspond to m new variables, the
$\lambda_i$'s. The rest of Theorem 8.A.1 holds as it is. Clearly, Theorem 8.A.1 with
condition (iii) revised as above is a generalization of the original Theorem 8.A.1.
Note that the original theorem is concerned with the case in which m = 0
[that is, none of the $x_i(T)$'s are a priori specified]. A further generalization
of the theorem is obtained if, instead of specification (27), we have the following
functional specification on the right-hand end-point x(T):
(30) $F_j[x(T)] = 0$, j = 1, 2, ..., m
where the $F_j$'s are real-valued differentiable functions. Clearly (27) is a special
case of (30) in which $F_j[x(T)] \equiv x_j(T) - x_{jT}$, j = 1, 2, ..., m. In this case, the
transversality conditions (28) and (29) are rewritten as
(31) $p_i(T) = c_i + \sum_{j=1}^{m} \lambda_j \dfrac{\partial F_j}{\partial x_i}, \quad i = 1, 2, \ldots, n$
where the $\lambda_j$'s are unspecified variables which are constant over t. It should
be clear that (31) is a generalization of (28) and (29) in the sense that (28) and
(29) are obtained from (31) (not vice versa). Thus by replacing the transversality
condition (iii) by (31), we obtain a further generalization of Theorem 8.A.1.
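To see the special case concretely, the following short derivation substitutes the coordinate specification into the functional form of the transversality condition. It assumes that the multipliers are written $\lambda_j$ and that conditions (28) and (29), which are not reproduced above, are the two resulting statements.

```latex
\[
F_j[x(T)] \equiv x_j(T) - x_{jT}
\quad\Longrightarrow\quad
\frac{\partial F_j}{\partial x_i} = \delta_{ij},
\qquad j = 1, \ldots, m,\ i = 1, \ldots, n
\]
\[
p_i(T) = c_i + \sum_{j=1}^{m} \lambda_j \delta_{ij}
=
\begin{cases}
c_i + \lambda_i, & i = 1, \ldots, m \quad (\text{terminal value } x_i(T) \text{ fixed}) \\
c_i, & i = m+1, \ldots, n \quad (\text{terminal value } x_i(T) \text{ free})
\end{cases}
\]
```

Since each $\lambda_i$ is free, the first line leaves $p_i(T)$ unrestricted for the fixed coordinates, while the second line reproduces the original condition $p_i(T) = c_i$ for the free coordinates.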
(ii) Final Time Open: We now turn to the consideration of the case in which the
"terminal time" T is not a priori specified. Since T is not specified, we have
one additional degree of freedom in the system. Hence one additional equation
is required, which is written as follows:
(32) $\sum_{i=1}^{n} p_i(T)\,\dot{\hat{x}}_i(T) = 0$
Here $[\hat{x}(T), \hat{u}(T), p(T)]$ denotes the solution triplet at T. In terms of $M[\hat{x}(T),
T, p(T)]$ as defined in (7), (33) can be rewritten as
(34) $M[\hat{x}(T), T, p(T)] = 0$
In the case of an autonomous system in which the f's do not explicitly depend
FIXED "FINAL TIME": The two conditions given above for the case of open
final time are not required for the case of fixed final time.
Theorem 8.A.1 with the above modifications in (i) and (ii) may be called the
generalized Theorem 8.A.1. However, for the sake of simplicity, we will hence-
forth refer to this simply as Theorem 8.A.1. With these modifications, we are
ready to derive the results for various interesting cases as corollaries of Theorem
8.A.1. Not only will these corollaries give us some readily available results, they
will also enhance the reader's understanding of Theorem 8.A.1. In fact, the
reader will observe that all the theorems listed in chapter 1 of Pontryagin et al.
[ 11 ] are really special cases of this theorem.
(α) FIXED-TIME WITH FIXED-END POINTS PROBLEM: We consider the
following problem in which the final time T is fixed (with fixed end-points):
Here T is fixed and both the initial and terminal end points, $x_i^0$ and $x_{iT}$, are
also fixed. To consider this problem, define $x_0(t)$ by $\dot{x}_0 = f_0[x(t), u(t), t]$,
$x_0(0) = 0$. Then, as we noted earlier, the problem is converted to one of maximizing
$x_0(T)$ subject to $\dot{x}_i = f_i[x(t), u(t), t]$, i = 0, 1, 2, ..., n, and $x_0(0) = 0$,
Theorem 8.A.2 (Pontryagin et al. [11], pp. 67-68): In the above problem, in order
that $[\hat{x}(t), \hat{u}(t)]$ be optimal, it is necessary that there exist a nonzero (n + 1)-vector-valued
continuous function $p(t) \equiv [p_0(t), p_1(t), \ldots, p_n(t)]$, which has piecewise
continuous derivatives, such that
(i) $\hat{x}_0(t)$, $\hat{x}(t)$, $\hat{u}(t)$, and p(t) solve the following Hamiltonian system:
$\dot{x}_i = \dfrac{\partial \hat{H}}{\partial p_i}, \quad \dot{p}_i = -\dfrac{\partial \hat{H}}{\partial x_i}, \quad i = 0, 1, 2, \ldots, n$
where
and
in
X pjgj(u) = 0
j= 1
Theorem 8.A.3 (Pontryagin et al. [11], pp. 60-61): In the above problem, in order
that $[\hat{x}(t), \hat{u}(t)]$ be optimal, it is necessary that there exist a nonzero (n + 1)-vector-valued
continuous function $p(t) \equiv [p_0(t), p_1(t), \ldots, p_n(t)]$, which has piecewise
continuous derivatives, such that
(i) $\hat{x}_0(t)$, $\hat{x}(t)$, $\hat{u}(t)$, and p(t) solve the following Hamiltonian system:
$\dot{x}_i = \dfrac{\partial \hat{H}}{\partial p_i}, \quad \dot{p}_i = -\dfrac{\partial \hat{H}}{\partial x_i}, \quad i = 0, 1, 2, \ldots, n$
for all t follows from $\dot{p}_0 = -\partial H/\partial x_0 = 0$. The $p_i(T)$, i = 1, 2, ..., n, are left
unspecified because the $x_i(T)$, i = 1, 2, ..., n, are specified. Note that $p_0 = 0$
is possible in the above theorem.
(γ) TIME OPTIMAL PROBLEM (NONAUTONOMOUS CASE): Consider the
following problem.
Minimize: T
u(t)
we can apply Theorem 8.A.3 and obtain the following theorem (here we
define $H[x(t), u(t), t, p(t)] \equiv \sum_{i=1}^{n} p_i(t) f_i[x(t), u(t), t]$ and note that $\mathcal{H} =
-p_0 + H$, where $\mathcal{H}$ is the Hamiltonian that includes the i = 0 term).
Theorem 8.A.4 (Pontryagin et al. [11], p. 65): In the above problem, in order
that $[\hat{x}(t), \hat{u}(t)]$ be optimal, it is necessary that there exist a nonzero n-vector-valued
continuous function $p(t) \equiv [p_1(t), p_2(t), \ldots, p_n(t)]$, which has piecewise continuous
derivatives, such that
(i) $\hat{x}(t)$, $\hat{u}(t)$ and p(t) solve the following Hamiltonian system:
where
(ii) $H[\hat{x}(t), \hat{u}(t), t, p(t)] \geq H[\hat{x}(t), u(t), t, p(t)]$ for all $u(t) \in U$
(iii) $H[\hat{x}(T), \hat{u}(T), T, p(T)] \geq 0$
REMARK: The above condition (iii) follows from $p_0(T) \geq 0$ and
$\mathcal{H}[\hat{x}(T), \hat{u}(T), T, p(T)] \equiv -p_0(T) + H[\hat{x}(T), \hat{u}(T), T, p(T)] = 0$
(δ) FINAL TIME OPEN WITH FIXED END POINTS (AUTONOMOUS CASE):
Consider the following problem.
(b) Minimize: T
u(t)
Theorem 8.A.5 (Pontryagin et al. [11], p. 19 and pp. 20-21): The necessary
conditions for $[\hat{x}(t), \hat{u}(t)]$ to be a solution of problem (δ-a) are obtained by replacing
condition (iii) of Theorem 8.A.3 by condition (36). (The other conditions of Theorem
8.A.3 hold as they are.) The necessary conditions for $[\hat{x}(t), \hat{u}(t)]$ to be a solution
of problem (δ-b) are the same as those in Theorem 8.A.4, except that condition (iii) of
the theorem is replaced by condition (37).
(ε) FIXED-TIME WITH VARIABLE RIGHT-HAND END POINTS PROBLEM:
Now consider the following problem:
Maximize: $\int_0^T f_0[x(t), u(t), t]\,dt$
u(t)
Here T is fixed but the $x_i(T)$, i = 1, 2, ..., n, are not fixed. Again defining $x_0(t)$
by $\dot{x}_0 = f_0[x(t), u(t), t]$, $x_0(0) = 0$, we can convert the above problem to one
of maximizing $x_0(T)$. Thus applying Theorem 8.A.1, we obtain the following
theorem.
Theorem 8.A.6 (Pontryagin et al. [11], p. 69): In the above problem, in order
that $[\hat{x}(t), \hat{u}(t)]$ be optimal it is necessary that there exist a nonzero (n + 1)-vector-valued
continuous function $p(t) \equiv [p_0(t), p_1(t), \ldots, p_n(t)]$, which has piecewise
continuous derivatives, such that
(i) $\hat{x}_0(t)$, $\hat{x}(t)$, $\hat{u}(t)$, and p(t) solve the following Hamiltonian system:
$\dot{x}_i = \dfrac{\partial \hat{H}}{\partial p_i}, \quad \dot{p}_i = -\dfrac{\partial \hat{H}}{\partial x_i}, \quad i = 0, 1, 2, \ldots, n$
where
(ii) $H[\hat{x}(t), \hat{u}(t), t, p(t)] \geq H[\hat{x}(t), u(t), t, p(t)]$, for all $u(t) \in U$
(iii) p(T) = (1, 0, ..., 0) [that is, $p_0(T) = 1$ and $p_i(T) = 0$, i = 1, 2, ..., n]
(iv) $p_0(t) = 1$ for all t
REMARK: Owing to the transversality condition for the variable end points,
$p_i(T) = 0$ for all i = 1, 2, ..., n. Since $p(T) = [p_0(T), p_1(T), \ldots, p_n(T)]$
is a nonzero vector, this implies $p_0(T) \neq 0$, or $p_0(T) > 0$. Thus without loss of
generality we may take $p_0(T) = 1$. Hence we obtain condition (iii),
especially $p_0(T) = 1$, without mentioning anything about the normality condition.
(38) $H[k(t), x(t), t, p(t)] = u[x(t)]\,e^{-\rho t} + p(t)\,\{f[k(t)] - \lambda k(t) - x(t)\}$
then $\hat{x}(t) = 0$ (for all t) cannot be optimal. However, even without (43), we can
guarantee that $\hat{x}(t) = 0$ (for all t) is not optimal. To see this, notice that the
productivity conditions imposed by (A-2) and (A-3) permit the existence of $\tilde{x}(t) >
0$ for all t, which is technically feasible in the economy. Since u'(x) > 0 for all x,
the path with $\tilde{x}(t)$ is better than the path with $\hat{x}(t)$; that is, $\hat{x}(t) = 0$ for all t cannot
be optimal. Hence (42) is a contradiction so that (41) is also a contradiction.
This then implies that $p_0 = 0$ is a contradiction. Therefore we obtain $p_0 > 0$. We
can now choose $p_0$ to be unity without loss of generality since all the necessary
conditions for optimality in the various theorems (such as Theorem 8.A.2) stated
so far in this section will not be affected by this choice. This then justifies our
definition of the Hamiltonian in (38). The above rather lengthy consideration,
which justifies $p_0 = 1$, is usually ignored in the literature.
Furthermore, note that with po = 1, (40) may be rewritten as
and
(48) $\dfrac{\partial \hat{H}}{\partial x} = 0$
Since u'(x) > 0 for all x, this implies that p(t) > 0 for all t. From (49) we obtain
(53), and (54) and their interpretations are the generalizations of Koopmans' prop-
osition F([6], pp. 245-246), for our propositions are concerned with each instant
of time. Note also that (49) and (52) yield
(55) $u[\hat{x}(t)] - u[x(t)] \geq u'[\hat{x}(t)]\,[\hat{x}(t) - x(t)]$
for all $x(t) \geq 0$, $0 \leq t \leq T$
In other words, at any instant of time, the excess of $\hat{x}(t)$ over x(t) multiplied by the
marginal utility of the optimal consumption at t cannot exceed the excess of utility
at $\hat{x}(t)$ over that at x(t).
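Inequality (55) is the usual gradient inequality for a concave function, and it can be spot-checked numerically. The sketch below uses an illustrative utility function, u(x) = log(1 + x), which is increasing and strictly concave on x ≥ 0; this choice and the function names are assumptions, not part of the text.

```python
import math

def utility(x):
    """Illustrative concave utility u(x) = log(1 + x), defined for x >= 0."""
    return math.log(1.0 + x)

def marginal_utility(x):
    """u'(x) = 1 / (1 + x) > 0 (no satiation)."""
    return 1.0 / (1.0 + x)

def concavity_gap(x_opt, x):
    """LHS minus RHS of (55): u(x_opt) - u(x) - u'(x_opt) * (x_opt - x)."""
    return utility(x_opt) - utility(x) - marginal_utility(x_opt) * (x_opt - x)

if __name__ == "__main__":
    grid = [0.25 * i for i in range(41)]          # consumption levels 0, 0.25, ..., 10
    smallest = min(concavity_gap(a, b) for a in grid for b in grid)
    # The gap should never be negative (up to rounding) and is zero when x = x_opt.
    print(smallest)
```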
In fact, the above formulation (51) in terms of the maximum principle and
the subsequent implications discussed above should hold even if we replace
$u[x(t)]\,e^{-\rho t}$ and f[k(t)] by more general functions u[x(t), t] and f[k(t), t],
where u and f are continuously differentiable in t [as well as in x(t) and k(t)]. The
function u[x(t), t] allows the possibility of a nonconstant discount, and f[k(t), t]
allows the possibility of technological progress. We may rewrite relations (52),
(53), and (54) in terms of these new functions u and f as follows:
Hence, if we have u'(x) > 0 for all x (no satiation) so that p(T) > 0, we must have
k(T) = 0. This condition then replaces (58). Thus k(0) = k0 and k(T) = 0 specify
the two boundary conditions for (46) and (47).
To analyze the present problem, we will consider a phase diagram which is
different from the one used in Chapter 5, Section D. For this purpose, first note
that (A-1) [especially u"(x) < 0 for all x] and (49) imply
(60) $\hat{x}(t) = g[q(t)]$
where
A
x
0
we can again assert that the solution is an interior solution, that is, $\hat{x}(t) > 0$ for all
t.15
Now recalling the definition of q(t) in (61), we rewrite equation (47) as
(64) $\dot{q} = -q(t)\,\{f'[k(t)] - (\lambda + \rho)\}$
Hence we obtain the phase diagram from equations (62) and (64) as in Figure 8.2.
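The direction field behind such a phase diagram can be sketched numerically. The code below assumes, for illustration only, that equation (62) has the form $\dot{k} = f(k) - \lambda k - g(q)$, which is what (38) and (60) suggest but which is not reproduced above. It further assumes a Cobb-Douglas technology $f(k) = k^{\alpha}$ and logarithmic utility, so that u'(x) = 1/x and g(q) = 1/q. All parameter values and function names are hypothetical.

```python
ALPHA = 0.3    # capital share in f(k) = k**ALPHA (illustrative)
LAM = 0.05     # the constant multiplying k in the capital equation (illustrative)
RHO = 0.03     # discount rate (illustrative)

def f(k):
    return k ** ALPHA

def f_prime(k):
    return ALPHA * k ** (ALPHA - 1.0)

def g(q):
    """Inverse of u'(x) = 1/x for u(x) = log x, so x = g(q) = 1/q."""
    return 1.0 / q

def phase_field(k, q):
    """Return (k_dot, q_dot) at a point of the (k, q) plane."""
    k_dot = f(k) - LAM * k - g(q)              # assumed form of equation (62)
    q_dot = -q * (f_prime(k) - (LAM + RHO))    # equation (64)
    return k_dot, q_dot

def steady_state():
    """Point where both derivatives vanish: f'(k) = LAM + RHO and x = f(k) - LAM*k."""
    k_star = (ALPHA / (LAM + RHO)) ** (1.0 / (1.0 - ALPHA))
    x_star = f(k_star) - LAM * k_star
    return k_star, 1.0 / x_star

if __name__ == "__main__":
    k_star, q_star = steady_state()
    print(phase_field(k_star, q_star))          # both components near zero
    print(phase_field(0.5 * k_star, q_star))    # q falls to the left of k_star
    print(phase_field(2.0 * k_star, q_star))    # q rises to the right of k_star
```

Evaluating `phase_field` on a grid of (k, q) points gives the arrows of the diagram: the vertical line $f'(k) = \lambda + \rho$ separates the region where q falls from the region where q rises.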
The transversality condition (58') means that $q(T) \geq 0$ and $q(T)k(T) = 0$.
Suppose that satiation is not allowed so that u'(x) > 0 for all x. Then q(T) cannot be
zero. Hence, as remarked earlier, the transversality condition is replaced by
k(T) = 0. This means that it is always better to "eat up" the capital to increase
consumption for some period of time and leave nothing after the planning horizon.
This reflects the fact that k(T) is not a priori specified by $k(T) = k_T$. The optimal
attainable path would in general be unique up to the boundary conditions $k(0) =
k_0$ and k(T) = 0. It is illustrated by a curve such as the CC' path in Figure 8.2. A
curve such as the AB path cannot be optimal, because at point B we have q = 0 so
that u' = 0, which violates the nonsatiation assumption. Note that a path such as
the DE path in Figure 8.2 cannot be optimal whether or not satiation in consumption
is allowed, because the transversality condition (58') cannot be satisfied in
any way.
Figure 8.2. An Alternative Phase Diagram for the Optimal Growth Problem.
What happens for the infinite horizon problem ($T = \infty$)? As long as we do not
specify the terminal stock k(T) when $T \to \infty$, the problem is identical with the usual
optimal growth problem, that is, the one discussed in Section D of Chapter 5,
except in one important aspect. How should the transversality condition be modified
for the infinite horizon problem? Note that when $T \to \infty$ the problem of satiation
discussed above does not arise, since $p(T) \to 0$ when $T \to \infty$ as long as u'(x) is
bounded (by the bound on the movement of x). In other words, the condition in
the form of either (58) or (58') is satisfied as $T \to \infty$. The real question here is
whether such a condition indeed constitutes a condition for optimality. In other
words, the question we have to ask is: What is a transversality condition at infinity?
Mathematically speaking, the "transversality conditions" refer to the conditions
which require that the state variables be in a particular target set at the
terminal point (see, for example, Pontryagin et al. [11], p. 49). For the finite horizon
problem, the values of the state variables would have a definite meaning; but
the meaning of the limit of these values when $T \to \infty$ is ambiguous, for the limit
may not exist for all attainable paths.16 Hence the phrase "transversality conditions"
is rather meaningless for the infinite horizon problem. Although for the infinite
horizon case the condition [$p(T) \to 0$ as $T \to \infty$] may appear to be a natural
extension of the transversality condition [p(T) = 0 for finite T], counterexamples
have been discovered where this is not true.17 In general, appropriate conditions
for the infinite horizon problem, which replace the transversality conditions for the
finite horizon problem, are not known. However, for the present problem of
optimal growth, Arrow [1] pointed out that the following condition happens to be
necessary for optimality, as long as $\rho > 0$:18
(65) $\lim_{T \to \infty} p(T) \geq 0$ and $\lim_{T \to \infty} p(T)k(T) = 0$
In other words, the simple extension of the finite horizon transversality condition
(58') holds in this particular case.
Clearly any path in which x(t) approaches a finite value as $t \to \infty$ will satisfy
condition (65) so long as $\rho > 0$. It can be shown easily that of the paths illustrated
in Figure 8.2, only the FG path and the HG path satisfy condition (65)
[note that $k(t) \to k_\rho$ implies $\hat{x}(t) \to x_\rho$ from (46)]. Hence assuming the existence of
a unique optimal attainable path, we have FG as the optimal attainable path if
$k_0 < k_\rho$ and we have HG as the optimal attainable path if $k_0 > k_\rho$. Here $[k_\rho, x_\rho]$ is
the modified golden rule path discussed in Chapter 5, Section D.19
In the above analysis, we assumed that $\rho > 0$ (positive discount rate).
Suppose now that $\rho = 0$. Then the condition such as $\lim_{T \to \infty} u'[x(T)]\,k(T) = 0$
which would correspond to (65) does not hold in general. That is, condition
(65) is false when $\rho = 0$. As Koopmans ([6], proposition C and lemma 3) has
shown, the following condition is necessary for the present problem, with $\rho = 0$:
or
Thus in the case where $\rho = 0$, condition (65) is replaced by condition (66) or (67).
Condition (66) reconfirms our conclusion that the only optimal attainable
path is the one which converges to the golden rule path. Needless to say, the
maximand integral for the case of $\rho = 0$ should be changed to
When there are no restrictions on the final state, the transversality condition
required in Theorem 8.A.6 is $p_i(T) = 0$, i = 1, 2, ..., n. However, the condition that
$p_i(T) \to 0$, i = 1, 2, ..., n, as $T \to \infty$ may fail to hold for the infinite horizon
problem. A counterexample, which is due to H. Halkin, is reported by Arrow and
Kurz.20 In view of the importance of the problem, we reproduce Halkin's
counterexample here.
Consider a control problem which maximizes
$\int_0^{\infty} [1 - y(t)]\,v(t)\,dt$
subject to $\dot{y}(t) = [1 - y(t)]\,v(t)$, $-1 \leq v(t) \leq 1$, and y(0) = 0, where $y(t) \in R$
denotes the state variable and $v(t) \in R$ denotes the control variable. Observe that
$\int_0^{\infty} [1 - y(t)]\,v(t)\,dt = \int_0^{\infty} \dot{y}(t)\,dt = \lim_{t \to \infty} y(t)$
But by direct integration, $y(t) = 1 - e^{-V(t)}$ where $V(t) \equiv \int_0^t v(\tau)\,d\tau$. Hence y(t) < 1
for all t. Hence any choice of v, $-1 \leq v \leq 1$, for which $\lim_{t \to \infty} V(t) = \infty$ is optimal.
For example, $v(t) = v_0$ (constant), where $0 < v_0 < 1$, is optimal. The Hamiltonian
for this problem is
$H = [1 + p(t)]\,[1 - y(t)]\,v(t)$
where p(t) is the auxiliary variable. Since $v(t) = v_0$ is a solution, it maximizes H.
Since $v_0$ is in the interior of [-1, 1], the control region, the maximality of H with
respect to v in turn implies that $\partial H/\partial v = [1 + p(t)]\,[1 - y(t)] = 0$. Hence p(t) =
-1 for all t, since y(t) < 1 for all t. Owing to the continuity of p(t), $\lim_{t \to \infty} p(t) = -1$
and not 0.
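Halkin's example can be checked with a short simulation. The sketch below integrates $\dot{y} = (1 - y)v_0$ for a constant control and accumulates the payoff by the trapezoidal rule; it then confirms that the payoff equals y(T), that y(T) matches $1 - e^{-v_0 T}$, and that p = -1 is a rest point of the auxiliary equation $\dot{p} = -\partial H/\partial y = (1 + p)v$. The step size, horizon, and function names are illustrative assumptions.

```python
import math

def simulate_halkin(v0=0.5, horizon=20.0, steps=20000):
    """Integrate y' = (1 - y) * v0 from y(0) = 0 and accumulate the payoff integral.

    Returns (y(T), payoff), where payoff approximates the integral of (1 - y) * v0.
    """
    h = horizon / steps
    y, payoff = 0.0, 0.0
    for _ in range(steps):
        integrand_left = (1.0 - y) * v0
        # Classical RK4 step; the equation is autonomous, so t is not needed.
        k1 = (1.0 - y) * v0
        k2 = (1.0 - (y + h / 2 * k1)) * v0
        k3 = (1.0 - (y + h / 2 * k2)) * v0
        k4 = (1.0 - (y + h * k3)) * v0
        y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        integrand_right = (1.0 - y) * v0
        payoff += h / 2 * (integrand_left + integrand_right)   # trapezoidal rule
    return y, payoff

def costate_rhs(p, v):
    """p' = -dH/dy for H = (1 + p) * (1 - y) * v."""
    return (1.0 + p) * v

if __name__ == "__main__":
    y_T, payoff = simulate_halkin()
    print(y_T, payoff, 1.0 - math.exp(-0.5 * 20.0))
    print(costate_rhs(-1.0, 0.5))   # zero: p stays at -1 and does not tend to 0
```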
FOOTNOTES
1. The above problem of shooting a guided missile is a favorite example in the literature
of optimal control theory. An expository account of the solution of this problem can
be found, for example, in Leitmann [9] , section 8, chapter 2, and in Saaty and Bram
[12].
2. See, for example, K. Shell ed., Essays on the Theory of Optimal Economic Growth,
Cambridge, Mass., M.I.T. Press, 1967, as well as Arrow [1], El-Hodiri [4], and Shell
[13]. See also G. Hadley and M. C. Kemp, Variational Methods in Economics,
Amsterdam, North-Holland, 1971.
3. The development of the classical calculus of variations reached its culmination in the
1930s, especially at the University of Chicago.
4. The major results in optimal control theory and the relation between the calculus of
variations and optimal control theory are discussed in Hestenes [5] in a systematic
and unified way. The major content of this work was published in 1965 in the Journal
of SIAM Control.
5. It received the Lenin Prize in 1962.
REFERENCES
1. Arrow, K. J., "Applications of Control Theory to Economic Growth," in Mathe-
matics of the Decision Sciences, Part 2, ed. by G. B. Dantzig and A. F. Veinott,
Providence, R.I., American Mathematical Society, 1968.
2. Athans, M., and Falb, P. L., Optimal Control, New York, McGraw-Hill, 1966.
3. Cass, D., "Optimum Growth in an Aggregative Model of Capital Accumulation,"
Review of Economic Studies, XXXII, July 1965.
4. El-Hodiri, M. A., Constrained Extrema: Introduction to the Differentiable Case with
Economic Applications, Berlin, Springer-Verlag, 1971.
5. Hestenes, M. R., Calculus of Variations and Optimal Control Theory, New York, Wiley,
1966.
6. Koopmans, T. C., "On the Concept of Optimal Economic Growth," in The Econo-
metric Approach to Development Planning, Pontificiae Academiae Scientiarvm Scriptvm
Varia, Amsterdam, North-Holland, 1965.
7. Kopp, R. E., "Pontryagin Maximum Principle," in Optimization Techniques, ed. by
G. Leitmann, New York, Academic Press, 1962.
8. Lee, E. B., and Markus, L., Foundations of Optimal Control Theory, New York, Wiley,
1967.
9. Leitmann, G., An Introduction to Optimal Control, New York, McGraw-Hill, 1966.
10. Mangasarian, O. L., "Sufficient Conditions for the Optimal Control of Nonlinear
Systems," Journal of SIAM Control, vol. 4, February 1966.
11. Pontryagin, L. S., Boltyanskii, V. G., Gamkrelidze, R. V., and Mishchenko, E. F.,
The Mathematical Theory of Optimal Processes, New York, Interscience, 1962 (tr. by
K. N. Trirogoff from Russian original). (A translation by D. E. Brown was published
by Macmillan, 1964.)
12. Saaty, T. L., and Bram, J., Nonlinear Mathematics, New York, McGraw-Hill, 1964,
esp. chap. 5.
13. Shell, K., "Applications of Pontryagin's Maximum Principle to Economics," in
Mathematical Systems, Theory and Economics, ed. by H. W. Kuhn and G. P. Szego,
Berlin, Springer-Verlag, 1969.
14. Takayama, A., "On the Structure of the Optimal Growth Problem," Krannert In-
stitute Paper, No. 178, Purdue University, June 1967.
Section B
SOME APPLICATIONS
(7) pi= 1, 2
Henceforth we omit (^), which denotes the optimality, for the sake of notational
simplicity.
According to the maximum principle we choose the control variables so as to
maximize H. Noting that $H = [\beta(p_1 - p_2) + p_2]\,(g_1 K_1 + g_2 K_2)$, we obtain
(8) $\beta = 1$ if $p_1 > p_2$ and $\beta = 0$ if $p_1 < p_2$
Equations (7) and (6) can be rewritten as
(7') $\dot{p}_1 = -[\beta(p_1 - p_2) + p_2]\,g_1$, $p_1(T) = b_1$
$\dot{p}_2 = -[\beta(p_1 - p_2) + p_2]\,g_2$, $p_2(T) = b_2$
Hence, observing that $\dot{p}_1/\dot{p}_2 = g_1/g_2$ from the above, we obtain
If $g_1 > g_2$ and $s_2 > s_1$ (also for $g_1 = g_2$ and $s_2 > s_1$, or $g_1 > g_2$ and $s_2 = s_1$),
then $p_1 > p_2$ always. Hence, $\beta = 1$. In particular if the saving rates in both regions
are the same, we should obviously invest all the funds in that region where
productivity of capital, b, is higher. Similarly, if the productivities are the same
($b_1 = b_2$), we should invest all the funds in the region where the saving rate is
higher. In this case, therefore, there is no switch for our control variable $\beta$. Since
this is rather obvious, one may wonder why Intriligator was misled to conclude that
there is always a "switch" at the terminal date.
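The switching rule (8) together with the auxiliary equations (7') can be explored numerically by integrating backward from the terminal conditions. In the sketch below the bracket $\beta(p_1 - p_2) + p_2$ is replaced by max(p1, p2), which is what rule (8) implies. The closed-form comparison is derived here from (7') and is not quoted from the text; it applies only when $b_2 > b_1$ and $g_1 > g_2$. The parameter values and function names are illustrative assumptions.

```python
import math

def switching_time(b1, b2, g1, g2, horizon, steps=100_000):
    """Integrate (7') backward from p_i(T) = b_i and return the switch date, if any.

    Under rule (8), beta*(p1 - p2) + p2 equals max(p1, p2), so each
    p_i' = -max(p1, p2) * g_i. Returns None if beta never changes on [0, T].
    """
    h = horizon / steps
    p1, p2, t = b1, b2, horizon
    beta_at_end = 1 if p1 > p2 else 0
    for _ in range(steps):
        lead = max(p1, p2)
        p1 += h * lead * g1      # backward Euler-type step: p(t - h) ~ p(t) - h * p'(t)
        p2 += h * lead * g2
        t -= h
        if (1 if p1 > p2 else 0) != beta_at_end:
            return t
    return None

def analytic_switching_time(b1, b2, g1, g2, horizon):
    """Closed form for b2 > b1, g1 > g2, derived from (7') as an assumption-laden check.

    Near T, beta = 0 and p2(t) = b2 * exp(g2 * (T - t)); the switch occurs where p2
    reaches the common value p_bar at which p1 = p2.
    """
    p_bar = (g1 * b2 - g2 * b1) / (g1 - g2)
    return horizon - math.log(p_bar / b2) / g2

if __name__ == "__main__":
    print(switching_time(0.3, 0.4, 0.15, 0.10, 10.0))
    print(analytic_switching_time(0.3, 0.4, 0.15, 0.10, 10.0))
    print(switching_time(0.4, 0.3, 0.15, 0.10, 10.0))   # None: no switch when b1 > b2
```

With these illustrative numbers the policy is $\beta = 1$ before the switch date and $\beta = 0$ afterward. If the computed date were negative, that is, if the horizon were too short, no switch would occur and `switching_time` would return None.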
In order to understand the more difficult case $g_1 > g_2$, $s_1 > s_2$,8 we draw
a diagram for equation (9), as in Figure 8.3. Although in Figure 8.3, $\bar{p}_2$ is to
the right of $b_2$, in fact, it may be on either side of $b_2$. The value of $\bar{p}_2$ is obtained
by setting $p_1 = p_2$ in (9):
(11) $\bar{p}_2 \gtreqless b_2$ according to whether $b_2 \gtreqless b_1$
CASE i: $b_2 > b_1$
This is the case depicted in Figure 8.3. Let t* be the point of time at which
$p_2(t)$ takes the value of $\bar{p}_2$, and let $t_0$ be the initial point of time. Since both $p_1$
and $p_2$ are monotone decreasing functions of time t from equation (7'), the point
t* is unique and we can consider case i as composed of two subcases.
$t_0 < t^*$
Equating $p_2(t)$ to $\bar{p}_2$, the value of which is given in (10), we obtain the exact
expression for the switching time t* as follows:
P1
Hence if $g_1 > g_2$ and $s_1 < s_2$ (instead of $s_1 > s_2$), then $p_1(t) - p_2(t) > 0$ for all
t < T, and the optimal policy is to invest the entire fund in the first region. However,
if $s_1 > s_2$ (together with $g_1 > g_2$ and $\lambda \neq 0$), then we cannot arrive at any
immediate conclusion about the optimal policy. This forces us to reconsider the
problem under Intriligator's target function from a completely fresh viewpoint.
To do this, we obtain from (15) the following equation:
(18) $\dot{p}_1 - \dot{p}_2 = -[\beta(p_1 - p_2) + p_2]\,(g_1 - g_2) + [(1 - s_2)b_2 - (1 - s_1)b_1]\,e^{-\lambda t}$
If $\sigma \equiv (1 - s_2)b_2 - (1 - s_1)b_1$ is negative, then $\dot{p}_1 - \dot{p}_2 < 0$ for optimal values
of $\beta$, provided that $g_1 > g_2$. If $g_1 > g_2$ and $s_1 < s_2$, then $b_1 > b_2$, which is required
in order that $\sigma < 0$. Note, however, that $\sigma$ can be negative even if we have $g_1 > g_2$
and $s_1 > s_2$. Since $p_1(T) = p_2(T) = 0$, $\dot{p}_1(t) - \dot{p}_2(t) < 0$ for all t < T implies $p_1(t) -
p_2(t) > 0$ for all t < T. Hence the optimal value of $\beta$ is equal to one. In other words,
if $g_1 > g_2$ and $\sigma < 0$, we have $\beta = 1$. The same conclusion holds when $g_1 > g_2$ and
$\sigma < 0$.
However, when $\sigma > 0$, $\dot{p}_1 - \dot{p}_2$ is not necessarily negative. In the subsequent
analysis, we assume $\sigma > 0$. First we define $q_i(t) \equiv p_i(t)\,e^{\lambda t}$, i = 1, 2. It should be
clear that, for all t < T, $q_1(t) \gtreqless q_2(t)$ according to whether $p_1(t) \gtreqless p_2(t)$. We
also note that $q_i(t) \gtreqless 0$ according to whether $p_i(t) \gtreqless 0$ for each t, i = 1, 2. From this
we can conclude that $q_1(T) = q_2(T) = 0$ and that $q_i(t) > 0$, i = 1, 2, for all t < T.
With this definition of $q_i(t)$, we immediately have the following equations:
i = 1, 2
We can also show that $\dot{q}_i(t) < 0$, i = 1, 2, for all t < T for $\beta = 1$ or 0, if we assume
that $g_i > \lambda$, i = 1, 2.15 Using the definitions of the $q_i(t)$ and (19), we now consider
(18) for the two cases $\beta = 0$ and $\beta = 1$.
CASE a: $q_1 < q_2$ (that is, $\beta = 0$)
In this case equation (18) can be rewritten as
Figure 8.5. Case a: $q_1 < q_2$ with $g_1 - g_2 + \lambda > 0$. (Axes $q_1$ and $q_2$; the 45° line; the line given by Equation (21).)
($q_1$-$q_2$)-plane and divides the entire plane into two parts, that is, the region
where $q_1 - q_2 > 0$ and the region where $q_1 - q_2 < 0$. We should recall that we
are concerned with the case in which $q_1 > q_2$; hence the "relevant region" in the
present case is the region above the 45° line in the ($q_1$-$q_2$)-plane where $q_1 - q_2
> 0$.18 We are again concerned with the nonnegative values of $q_1$ and $q_2$. The
sign of the slope of the straight line which satisfies (23) will differ according to
whether $g_1 - g_2 - \lambda > 0$ or < 0.18 We illustrate the relevant region for the case
$q_1 > q_2$ with $g_1 - g_2 - \lambda > 0$ by the shaded triangle in Figure 8.6.
Figure 8.6. (Axes $q_1$ and $q_2$; the 45° line; the line given by Equation (23).)
In Figure 8.7, we combine Figures 8.5 and 8.6, and we obtain the path of
($q_1$, $q_2$), which is illustrated by an arrowed line. In Figure 8.8, we illustrate the
case in which $g_1 - g_2 - \lambda < 0$, when $q_1 > q_2$. The optimal path of ($q_1$, $q_2$) is again
indicated by an arrowed line. Note that in both cases, that is, $g_1 - g_2 - \lambda > 0$ and
$g_1 - g_2 - \lambda < 0$, there is a possibility of a switch of the optimal policy from
$\beta = 1$ to $\beta = 0$. For example, in the case of $g_1 - g_2 - \lambda > 0$, the optimal value of
$\beta$ is equal to one until $q_1(t^*) = q_2(t^*) = \sigma/(g_1 - g_2)$, and then it switches to zero
until the terminal point of time T. The same holds for the case of $g_1 - g_2 - \lambda < 0$.
Finally, let us obtain the switching time t*. This can be done by noting that
$q_1(t^*) = q_2(t^*) = \sigma/(g_1 - g_2)$. The explicit expression for $q_1(t)$ with $\beta = 1$ is
written as20
(24) $q_1(t) = \dfrac{(1 - s_1)b_1}{g_1 - \lambda}\,\left[e^{(g_1 - \lambda)(T - t)} - 1\right]$
Define A as
q1
positive, there is a switch of the optimal policy at t = t*. If T is not big enough,
there is no switch and the optimal policy will always be $\beta = 0$. If $g_1 < \lambda$, then A < 0;
hence as long as the difference between $g_1$ and $\lambda$ is not too large,21 log (A + 1) < 0,
and there is again a possibility of a switch of the optimal policy at t = t* provided
T is sufficiently large so that the RHS of (26) is positive.
This finishes our analysis under Intriligator's target function. We summarize
the results as follows:
(i) $g_1 > g_2$, $s_1 < s_2$ (or $g_1 = g_2$, $s_1 < s_2$; $g_1 > g_2$, $s_1 = s_2$; or $g_1 > g_2$, $\lambda = 0$):
$\beta = 1$ always.
(ii) $g_1 > g_2$, $\sigma < 0$: $\beta = 1$ always.22
(iii) $g_1 = g_2$, $\sigma > 0$: $\beta = 0$ always.23
(iv) $g_1 > g_2$, $\sigma > 0$: possibility of a switch from $\beta = 1$ to $\beta = 0$.
S1 >s2 s1 <s2(bl>b2)
We may now discuss whether the model is really plausible or not. The first
question is whether we can assume that the $b_i$'s are kept constant, since the $b_i$'s
may decline, owing to the law of diminishing returns, as capital accumulates. One
reason the $b_i$'s may be kept from declining is that labor is freely available so that
the capital:labor ratio is kept constant. However, this is impossible in a full-
employment economy unless the total labor supply increases at the same rate as
capital. Even in an economy with an "unlimited" supply of labor, it is not easy
to conceive of a mechanism which would determine the total employment of labor
and the allocation of this labor to different regions such that the capital:labor
ratio would remain constant in each region.
Apart from this, a more important question is the implication of our optimal
policy which says that the planner should invest all the funds in one region only
(say, region 1). If the income is growing in region 1 while the income in region 2
is stagnant, there may be a migration of labor from region 2 to region 1. It is not
quite clear whether there should be a mechanism to stop this and whether we
should consider the effect of this migration on productivity and the propensity to
save of each region. In short, the question we ought to face is whether we can
keep labor implicit in our model.
ADDENDUM: Here we record the explicit expressions for $p_i(t)$ and $\dot{p}_i(t)$,
i = 1, 2, as functions of time, corresponding to the optimal values of $\beta$ under
Intriligator's objective function. They can be obtained by putting $\beta = 1$ or
0 in equation (15) and solving the linear differential equations thus obtained
subject to the boundary conditions $p_1(T) = p_2(T) = 0$.25
CASE i: $\beta = 1$
(28) $\dot{p}_1(t) = e^{-\lambda t}\,\dfrac{(1 - s_1)b_1}{g_1 - \lambda}\,\left[\lambda - g_1 e^{(g_1 - \lambda)(T - t)}\right]$
(29) $\dot{p}_2(t) = -e^{-\lambda t}\left\{(1 - s_2)b_2 + \dfrac{(1 - s_1)b_1 g_2}{g_1 - \lambda}\left[e^{(g_1 - \lambda)(T - t)} - 1\right]\right\}$
Then substituting (27) into this, we immediately obtain the expression for
$p_2(t)$. Note that $e^{(g_1 - \lambda)(T - t)} \lesseqgtr 1$ according to whether $g_1 \lesseqgtr \lambda$, for all t < T.
Therefore from (27), (28), and (29) we obtain $p_1(t) > 0$, $\dot{p}_1(t) < 0$ and $\dot{p}_2(t) <
0$, for all t < T. Since $p_2(T) = 0$, we also have $p_2(t) > 0$ for all t < T. Using
(19), (27), and (28), we can easily show that $\dot{q}_1(t) < 0$ for all t < T. If $g_1 > \lambda$,
we can also show that $\dot{q}_2(t) < 0$ for all t < T.26
638 DEVELOPMENTS OF OPTIMAL CONTROL THEORY AND ITS APPLICATIONS
CASE ii: A= 0
[(' - s2)b2
(30) P2(t) = e-2.t
e(92-2.)(r-t)
-1
g2-A
(31) P1(t) = -e-"t (1 - si)bl + (1 -g2-A
s2)b291 [e(92-2)(T-1)
([1_s2b2] A - g2e(82-A)(T-1)
(32) P2 = e -' 11
[92
The expression for $p_1(t)$ can be obtained by substituting (30) into (17').
Using an argument analogous to the one above, we can show that $p_i(t) > 0$
and $\dot p_i(t) < 0$, i = 1, 2, for all t < T. Using (19), (30), and (32), we can show
that $\dot q_2(t) < 0$ for all t < T. Also if $g_2 > \lambda$, we can show that $\dot q_1(t) < 0$ for all
t < T.
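The closed forms of the Addendum can be checked numerically. The following is only a sketch under stated assumptions: it supposes that, once β is fixed at a corner, the relevant multiplier obeys a linear equation of the form $\dot p = -(1 - s)b\,e^{-\lambda t} - g\,p$ with $p(T) = 0$, and that its solution has the shape of (30). The equation form, the function names, and every parameter value are illustrative assumptions, not taken from the text.

```python
import math

def p_closed(t, s, b, g, lam, T):
    """Assumed closed form: (1-s)b/(g-lam) * e^{-lam t} * [e^{(g-lam)(T-t)} - 1]."""
    return (1 - s) * b / (g - lam) * math.exp(-lam * t) * (
        math.exp((g - lam) * (T - t)) - 1.0
    )

def p_numeric(t0, s, b, g, lam, T, steps=20000):
    """Integrate the assumed ODE backward from p(T) = 0 to t0 with RK4."""
    h = (t0 - T) / steps  # negative step: we move from T down to t0

    def rhs(t, p):
        return -(1 - s) * b * math.exp(-lam * t) - g * p

    t, p = T, 0.0
    for _ in range(steps):
        k1 = rhs(t, p)
        k2 = rhs(t + h / 2, p + h * k1 / 2)
        k3 = rhs(t + h / 2, p + h * k2 / 2)
        k4 = rhs(t + h, p + h * k3)
        p += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return p

# Hypothetical parameter values, chosen only for illustration.
params = dict(s=0.2, b=0.5, g=0.10, lam=0.04, T=10.0)
gap = abs(p_numeric(0.0, **params) - p_closed(0.0, **params))
```

Under these assumptions the two values agree to within the integration error, and the closed form is positive for t < T whether g exceeds λ or not.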
(33) $J = \int_0^\infty x_t e^{-\rho t}\,dt$, where ρ > 0
(36) $J = \int_0^\infty (1 - s_t) f(k_t) e^{-\rho t}\,dt$
SOME APPLICATIONS 639
The fact that $s_t \ge 0$ means that consumption does not exceed current income; that
is, gross investment $I_t$ is nonnegative. This signifies that we do not allow capital
to be "eaten up" (except for depreciation), which means that once the output is
invested as capital stock it is not used for the purpose of consumption. The
assumption of $I_t \ge 0$ is often called the irreversibility of investment (see Arrow [1],
for example).
Note three (mathematical) features in the present formulation of the optimal
growth problem: (1) the objective function is linear in the control (st); (2) the
RHS of the differential equation constraint (37) is again linear in st; and (3)
the control is in the closed region prescribed by (38). Under these features, it will
be observed that we obtain a "corner solution" as a usual case. Since it is typically
supposed in the classical calculus of variations that the control region is an open
set, the corner solution requires special consideration. However, the Pontryagin
maximum principle, in which the control region can be a closed set, is well suited
for the analysis of this problem. Moreover, the solution of the above problem is
such that there is a jump in the optimal control from a corner solution to an
interior solution. Hence the assumption of the piecewise continuity of the control
function is useful.
The problem is to choose the time path of st so as to maximize J defined
in (36) subject to (37) and (38) with a given ko. The solution path is called the
optimal attainable path (with respect to ko). For this problem st is the control
variable and kt is the state variable. We consider this problem as the one with open
terminal end point and apply Theorem 8.A.6. The Hamiltonian for this problem
is
(i) The variables $k_t$, $s_t$, $p_t$ solve the Hamiltonian system which consists of (37)
and the following equation:
(40) $\dot p_t = -\left[e^{-\rho t}(1 - s_t)f'(k_t) + p_t\left(s_t f'(k_t) - \lambda\right)\right]$
(ii) The Hamiltonian H is maximized with respect to $s_t$.
(iii) The right-hand end-point condition: $\lim_{t\to\infty} p_t = 0$.
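[Figure: phase diagram in the (k, q)-plane, showing the loci $\dot q = 0$ and $\dot k = 0$, the initial point $[k_0, q(k_0)]$, and the points $k_0$, $k^*$, $\bar k$ on the k-axis.]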
$(k_t', q_t')$, which is the same as the above path for the period 0 ≤ t ≤ T but
is (k*, 1) for t > T.
We may now examine whether the path (k*, 1) satisfies the system of
differential equations (37) and (42). To do this, first note that $q_t = 1$ implies
$\pi_t = 1$, and for the path (k*, 1), $\dot k_t = 0$ and $\dot q_t = 0$. Therefore, (37) and (42)
are reduced to
(52) $0 = s_t f(k^*) - \lambda k^*$
(53) $0 = (\lambda + \rho) - f'(k^*)$
Equation (53) is obviously satisfied by the definition of k* [see (51)].
Equation (52) is satisfied if and only if $s_t$ takes the value
(54) $s^* = \lambda k^*/f(k^*)$
Note that s* > 0, and that $k^* < \bar k$ implies s* < 1. Hence we have 0 < s* < 1,
which satisfies (38). Thus (k*, 1) satisfies both (37) and (42).
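To make the stationary point concrete, the following sketch computes k* and s* for one particular production function. The Cobb-Douglas form f(k) = k^α, the function name, and the parameter values are assumptions introduced here for illustration; the text itself only requires f to be concave.

```python
def modified_golden_rule(alpha, lam, rho):
    """Return (k_star, s_star) for the assumed technology f(k) = k**alpha.

    k_star solves f'(k) = lam + rho, as in (53);
    s_star = lam * k_star / f(k_star), the value that makes (52) hold.
    """
    k_star = (alpha / (lam + rho)) ** (1.0 / (1.0 - alpha))
    s_star = lam * k_star / k_star ** alpha
    return k_star, s_star

# Hypothetical parameter values.
alpha, lam, rho = 0.3, 0.05, 0.03
k_star, s_star = modified_golden_rule(alpha, lam, rho)

# k_bar is the stock at which f(k) = lam * k, so that k* < k_bar gives s* < 1.
k_bar = (1.0 / lam) ** (1.0 / (1.0 - alpha))
```

For this technology s* reduces to αλ/(λ + ρ), which lies strictly between 0 and 1 whenever ρ > 0 and 0 < α < 1.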
Hence the path $(k_t', q_t')$ defined above satisfies all three conditions of
the maximum principle for this problem, including condition (iii) [or (44)].
It can be easily shown that along this path $(k_t', q_t')$, the integral J defined
by (36) also converges.
CASE ii: $q_t < 1$
In this case $s_t = 0$ and $\pi_t = 1$ in view of (46) and (47), so that (37) and
(42) can be rewritten respectively as
(55) $\dot k_t = -\lambda k_t$
(56) $\dot q_t = (\lambda + \rho)q_t - f'(k_t)$
The phase diagram for these two differential equations is depicted in Figure
8.10.
If $k_0 > k^*$, there exists a path $(k_t, q_t)$, starting from $[k_0, q(k_0)]$, such
that it reaches the state (k*, 1). This is also illustrated by an alternative phase
diagram, Figure 8.11, which again can be obtained from (55) and (56). It can
be shown that it reaches (k*, 1) within a finite amount of time, say, T.
Assuming $k_0 > k^*$, we therefore define the path, denoted by $(k_t'', q_t'')$,
as the above path for the period 0 ≤ t ≤ T and (k*, 1) for t > T. Clearly
$(k_t'', q_t'')$ satisfies all three conditions of the maximum principle for the
present problem, including condition (44). Also, along this path, the integral
J defined by (36) converges.
We may summarize the above discussion in the following theorem.
Theorem 8.B.1: For the above model, given an arbitrary initial value of k, there is
a unique optimal attainable path which is characterized as follows:
(i) $k_0 < k^*$: $s_t = 1$, and after $k_t$ reaches k*, $k_t = k^*$ for all such t; that is, the path
$(k_t', q_t')$.
(ii) $k_0 > k^*$: $s_t = 0$, and after $k_t$ reaches k*, $k_t = k^*$ for all such t; that is, the path
$(k_t'', q_t'')$.
(iii) $k_0 = k^*$: $k_t = k^*$ for all t and $s_t = s^* = \lambda k^*/f(k^*)$.
In other words, the optimal attainable path is the one that reaches the modified
golden rule path (k*, x*) [where x* = (1 - s*)f(k*)] with a maximum speed and
stays thereafter on it. This optimal attainable path is illustrated in Figure 8.12.
Thus the solution path is such that if k0 < k*, the economy maximizes savings
from current income until time T and after time T maintains a constant saving
ratio s*; if k0 > k*, the economy minimizes saving from current income until
time T and after time T maintains a constant saving ratio s*. As is clear from
the above diagrams, the optimal saving ratio is a kind of bang-bang control or,
more precisely, "bang-off" (or "bang-coast") control.
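[Figure 8.12: the optimal attainable path. Time paths of $s_t$ and $k_t$, with the switch to s* at time T, for the two cases $k_0 < k^*$ and $k_0 > k^*$.]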
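The "bang-coast" policy of Theorem 8.B.1 can be traced out numerically. The sketch below is illustrative only: it assumes the state equation $\dot k = s f(k) - \lambda k$ with the Cobb-Douglas form f(k) = k^α, uses a simple Euler step, and picks arbitrary parameter values; none of these choices come from the text.

```python
def optimal_saving_path(k0, alpha=0.3, lam=0.05, rho=0.03, dt=0.001, horizon=80.0):
    """Simulate s = 1 (k0 < k*) or s = 0 (k0 > k*) until k hits k*, then hold s*.

    Assumes f(k) = k**alpha and k' = s*f(k) - lam*k (illustrative choices).
    Returns (k_star, s_star, hit_time, path) with path a list of (t, k, s).
    """
    k_star = (alpha / (lam + rho)) ** (1.0 / (1.0 - alpha))
    s_star = lam * k_star ** (1.0 - alpha)
    start_below = k0 < k_star
    hit_time = 0.0 if k0 == k_star else None
    k = k0
    path = []
    for n in range(int(round(horizon / dt))):
        t = n * dt
        if hit_time is not None:
            s = s_star                      # coast on the modified golden rule path
        else:
            s = 1.0 if start_below else 0.0  # corner control
        path.append((t, k, s))
        if hit_time is None:
            k_next = k + dt * (s * k ** alpha - lam * k)
            crossed = k_next >= k_star if start_below else k_next <= k_star
            if crossed:
                k_next = k_star             # snap to k* and switch regime
                hit_time = t + dt
            k = k_next
    return k_star, s_star, hit_time, path

k_star_lo, s_star_lo, T_lo, path_lo = optimal_saving_path(1.0)   # k0 < k*
k_star_hi, s_star_hi, T_hi, path_hi = optimal_saving_path(20.0)  # k0 > k*
```

With s = 0 the stock decays at the rate λ, so for k0 > k* the switching time in this sketch should be close to ln(k0/k*)/λ; after the switch the control stays at s* and k stays at k*.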
FOOTNOTES
1. This subsection is taken from Takayama ([14] and [15]), which were originally
developed in his lectures at the University of Minnesota in the spring of 1966.
2. Our analysis can be modified to a two-sector economy (for example, agriculture and
industry). It can also be extended to an n-region or n-sector economy without too
much difficulty. An attempt for the two-sector economy is made by Bruno [3].
3. We can also consider the objective function in the form of $c_1Y_1(T) + c_2Y_2(T)$,
where $c_i$ is some weight attached to the income of each region by the planner. The
analysis, in this case, will be analogous to the one which we develop below. It is
also possible to consider different propensities to save of each factor (labor, capital,
and so on). The analysis will be similar as long as we assume fixed coefficients of
production. For such variations, see, for example, Dorfman [5] .
4. The maximum principle, as it is presented and proved by Pontryagin et al. [9],
gives necessary conditions for the optimum. Since the right-hand sides of equations
(3-a) and (3-b) are linear (hence concave) functions of β, $K_1$, and $K_2$, the maximum
principle is also sufficient for the optimum. See Mangasarian [8] and Section C
of this chapter.
5. We can interpret $p_i$ as the "shadow price" of investment in the ith region. Condition
(8) can be interpreted simply as investing the entire fund in the region where the
"shadow price" of investment is higher.
6. From (7'), it should be clear that $p_1(t) > 0$ and $p_2(t) > 0$, for 0 < t < T, for the
optimal values of β (0 or 1).
7. If $g_1 = g_2$ and $s_1 = s_2$, the two regions would look exactly the same to the planner,
so that the choice between the two would be indifferent.
8. Since the name of the region is arbitrary, this exhausts all the possible cases.
9. We can certainly extend our analysis to the case in which the target function is
more generally defined as $\int_0^T (X/N)e^{-\rho t}\,dt + a_1K_1(T) + a_2K_2(T)$, where ρ is a time
discount factor for future consumption and $a_i$ is a weight attached by the planner
for the capital stock in the ith region at time T.
10. We choose the units of population properly so that the initial amount of population
is equal to one.
11. We can show that $p_1(t)$ and $p_2(t)$ are positive, 0 < t < T, if β = 1 or 0.
12. Or: (i) $g_1 = g_2$, $s_1 < s_2$; (ii) $g_1 > g_2$, $s_1 = s_2$; (iii) $g_1 > g_2$, a = 0.
13. If $g_1 > g_2$ and $s_1 > s_2$, then $[(1 - s_2)b_2 - (1 - s_1)b_1]$ is not necessarily negative.
14. This is because we can show that $p_1(t) > 0$ and $p_2(t) > 0$, for all t < T, if β = 1 or 0,
and that the optimal value of β is either 0 or 1. In the Addendum to this subsection,
we show our proof for $p_1(t)$, $p_2(t) > 0$, for all t < T, if β = 1 or 0. In the argument
that follows, we assume that $g_1 > g_2$. If $g_1 = g_2$ and a > 0, then $\dot p_1 - \dot p_2 > 0$ so
that $p_1 - p_2 < 0$ for all t < T. In other words, the optimal policy is to invest the
entire fund in the second region (β = 0). When $g_1 = g_2$ and a < 0, β = 1 always.
But this case is already covered by case i of footnote 12.
15. See the Addendum to this subsection.
16. We may recall that $g_1 > g_2$ by assumption.
17. This is due to the fact that $\dot q_1 - \dot q_2 > 0$ implies $q_1 - q_2 < 0$ for all t < T
since $q_1(T) = q_2(T) = 0$. If $\dot q_1 - \dot q_2 < 0$, then $q_1 > q_2$, which is a case that
should be excluded from the assumption of the present case ($q_1 < q_2$). We also
note that $q_i > 0$, i = 1, 2, for all t < T. Hence we are concerned with the
nonnegative orthant of the $(q_1$-$q_2)$-plane.
18. Again this is due to the fact that $\dot q_1 - \dot q_2 < 0$ implies $q_1 - q_2 > 0$ for all t < T since
$q_1(T) = q_2(T) = 0$.
19. If we approximate the discount factor ρ by the current market rate of interest, the
$g_i$'s may become much larger than λ (= n + ρ), where n is the rate of population
growth. Note also that if this is the case, both $q_i(t)$, i = 1, 2, decrease over time
for all t < T.
20. See the Addendum to this subsection, especially (27), and recall that $p_i(t) \equiv q_i(t)e^{-\lambda t}$.
21. If the difference between gi and A is too big, then (A + 1) is negative and (26)
makes no sense. We may avoid such a possibility altogether by assuming that gi > A.
22. In this case, $b_1 > b_2$. Needless to say, $b_1 > b_2$ and $g_1 > g_2$ do not necessarily
imply a < 0.
23. In this case, $b_1 < b_2$.
24. Here we assume that $g_1 > g_2$.
25. Some of the computational procedure can be simplified by transforming the pi(t) to
qi(t) and noting (19).
26. Use (19), (29), (17'), and (27).
27. This part is also from my lectures at the University of Minnesota given in the spring
of 1966. This is a simplification of Uzawa's model [16], which involves the two
sectors, material output and knowledge. This simplification illuminates the
significance of a linear objective more dramatically.
28. The condition that $x_t \ge 0$ or, equivalently, $s_t \le 1$, implicitly assumes that the
starvation level of consumption is zero. If we want to explicitly consider a positive
level of consumption as the starvation level, then we alter this condition to
$x_t \ge \bar x > 0$ or, equivalently, $s_t \le \bar s < 1$, where $\bar x$ is the starvation level of
consumption and $\bar s$ is the corresponding propensity to save. However, this change will
not alter the subsequent analysis in any essential way as we simply assume $\bar x = 0$.
29. Using a proof similar to the one used in Mangasarian's theorem [8] , or in Theorem
8.C.5, we can show that these conditions are also sufficient for optimality. Note
also that condition (iii) below needs a proof, for Theorem 8.A.6 is concerned only
with the finite horizon problem.
REFERENCES
Economics, LXXXII, August, 1968 (also Krannert Institute Paper, No. 186, Purdue
University, August 1967).
16. Uzawa, H., "Optimal Technical Change in an Aggregative Model of Economic
Growth," International Economic Review, 5, January 1965.
Section C
FURTHER DEVELOPMENTS
IN OPTIMAL CONTROL THEORY
Here $f_0$, the $f_i$'s, and the $g_j$'s are assumed to be continuously differentiable
in (x, u, t)-space. The functions $x_i(t)$, i = 1, 2, ..., n, 0 ≤ t ≤ T, are
continuous, and the $u_i(t)$, i = 1, 2, ..., r, are piecewise continuous. In this
problem the final time T is fixed and the terminal end points, the $x_i(T)$'s,
are not specified.
Here we shall not attempt a full exposition of the maximum principle with
the g-constraints (1). Instead, following Arrow [1], we simply give below a heuristic
explanation of the main result for Problem I.
First we consider the above problem without the constraints (1); the problem
is then reduced to the one discussed for Theorem 8.A.6. Hence we obtain the
necessary conditions described in that theorem. The essential part of Theorem
8.A.6 is the maximization of H with respect to u. That is, for each t,
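(3) $H[\hat x(t), \hat u(t), t, p(t)] \ge H[\hat x(t), u(t), t, p(t)]$ for all $u(t) \in U$
where
(4) $H[x(t), u(t), t, p(t)] = f_0[x(t), u(t), t] + \sum_{i=1}^{n} p_i(t) f_i[x(t), u(t), t]$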
Now we add constraints (1) to the problem and consider the maximization of
H as a constrained maximum problem, that is, the problem of maximizing H
subject to the constraints (1). Thus (3) may now be replaced by
(5) $\hat u(t)$ maximizes $H[\hat x(t), u(t), t, p(t)]$ (for each t)
subject to $g_j[\hat x(t), u(t), t] \ge 0$, j = 1, 2, ..., m, and $u(t) \in U$
Let us now define the Lagrangian L (or the generalized Hamiltonian) by
(6) $L[x(t), u(t), t, p(t), q(t)] = H[x(t), u(t), t, p(t)] + \sum_{j=1}^{m} q_j(t) g_j[x(t), u(t), t]$
where H is defined by (4), and the qj(t)'s are the "multipliers" associated with
the g-constraints. Then, in view of our discussions in Chapter 1, the maximization
of L implies the following conditions for each t, provided the "constraint
qualification"² holds:³
(7) $\partial \hat L/\partial u_i = 0$, i = 1, 2, ..., r, where $\hat L \equiv L[\hat x(t), \hat u(t), t, p(t), q(t)]$
(8) $q(t) \cdot g[\hat x(t), \hat u(t), t] = 0$ and $q(t) \ge 0$, where $q(t) = [q_1(t), ..., q_m(t)]$
or
Lemma: Any one of the following conditions provides the constraint qualification:
Theorem 8.C.1: Assuming that the constraint qualification holds, in order that
$\hat u(t)$ be a solution of Problem I with the corresponding state variable $\hat x(t)$, it is
necessary that there exist vector-valued functions $p(t) = [p_1(t), p_2(t), ..., p_n(t)]$
and $q(t) = [q_1(t), q_2(t), ..., q_m(t)]$,⁵ where the $p_i(t)$'s are continuous and have
piecewise continuous derivatives and the $q_j(t)$'s are piecewise continuous and
continuous at all points of continuity of $\hat u(t)$, such that
(i) The function p(t) together with $\hat u(t)$ and $\hat x(t)$ solve the following Hamiltonian
system:
(9) $\dot x_i = \partial \hat L/\partial p_i$ and $\dot p_i = -\partial \hat L/\partial x_i$, i = 1, 2, ..., n,
for each interval on which $\hat u(t)$ is continuous
$H[\hat x(t), \hat u(t), t, p(t)] \ge H[\hat x(t), u(t), t, p(t)]$
for all $u(t) \in U$ such that $g_j[\hat x(t), u(t), t] \ge 0$, j = 1, 2, ..., m.
Conversely, if (13) holds, then we can easily find multipliers $\mu_i(t) \ge 0$,
i = 1, 2, ..., n, so that condition (iii) of the above theorem is satisfied. An
alert reader may have realized that this procedure and condition (13) are
analogous to those discussed in connection with the nonnegative quasi-
saddle-point condition in Chapter 1, Section D.
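A one-dimensional example may help fix the relation between the constrained maximization of H and the multiplier conditions. The sketch below is purely illustrative: at a fixed t it takes H(u) = -(u - a)^2 and a single constraint g(u) = c - u ≥ 0, both of which are assumptions made here, forms L = H + q g, and finds the pair (u, q) with ∂L/∂u = 0, q g = 0, and q ≥ 0.

```python
def maximize_with_constraint(a, c):
    """Maximize H(u) = -(u - a)**2 subject to g(u) = c - u >= 0.

    Returns (u, q), where q is the multiplier in L = H + q*g.
    H and g are hypothetical functions chosen for illustration.
    """
    u = a          # unconstrained maximizer, valid when the constraint is slack
    q = 0.0
    if c - u < 0:  # the unconstrained maximizer is infeasible, so g must bind
        u = c
        q = -2.0 * (u - a)  # solves dL/du = -2(u - a) - q = 0
    return u, q

def dL_du(u, q, a):
    """Derivative of L = -(u - a)**2 + q*(c - u) with respect to u."""
    return -2.0 * (u - a) - q

u_bind, q_bind = maximize_with_constraint(a=2.0, c=1.0)    # constraint binds
u_slack, q_slack = maximize_with_constraint(a=0.5, c=1.0)  # constraint slack
```

In the binding case the multiplier is strictly positive and g = 0; in the slack case the multiplier is zero and g > 0, so q g = 0 holds either way.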
REMARK: It should also be realized that Pontryagin's maximum principle
as discussed in Section A can be considered as a special case of Theorem
8.C.1, in which $g_j[x(t), u(t), t] \ge 0$ takes the form $g_j[u(t)] \ge 0$, j = 1, 2, ..., m,
or the constraint region U is restricted by the g-constraint.
In many problems of economics, it may be desirable to impose the following
condition explicitly:
(14) $x_i(T) \ge 0$, i = 1, 2, ..., n
To do so, we first alter the objective functional of Problem I as follows:
where T is fixed, x(T) is not specified, and the $c_i$, i = 1, 2, ..., n, are some fixed
constants. The transversality condition for Problem I with this change in the
objective is
then assuming that an optimal policy exists, all the conclusions of the above
corollary except condition (17) and all the conclusions of Theorem 8.C.1
except condition (v) hold. As remarked at the end of Section A, the
appropriate conditions which replace the transversality conditions (17) or (v)
(of Theorem 8.C.1) for the infinite horizon problem are not yet known in a
general form. So far it is necessary to prove such conditions for each case.
b. HESTENES' THEOREM
In an important paper [5] and later in a book [6] that discusses the
relation between the classical calculus of variations and optimal control theory,
Hestenes presented a general formulation of the necessary conditions for
optimality in optimal control theory and the proofs of his major theorems. The formulation
is general enough to cover constraints of the type $g[x(t), u(t), t] \ge 0$, equality
constraints, integral constraints with both inequalities and equalities, as well as
ordinary differential equation constraints of the Pontryagin type. The formulation
also introduces the "control parameter."
An example of the integral constraint problem follows.
PROBLEM II:
Here T is fixed and the xi (T)'s are unspecified. The integral constraint in
the above problem is stated in the form of an inequality."
Let $[\hat x(t), \hat u(t)]$ be a solution pair of the above problem. Assuming all
functions, $f_0$, the $f_i$'s, and the $h_k$'s, are continuously differentiable with respect
to their arguments, we have the following theorem describing the necessary
conditions of optimality.
Theorem 8.C.2: Suppose $[\hat x(t), \hat u(t)]$ is a solution of the above problem. Assume
that the constraint qualification holds. Then there exist multipliers $p_0$, $p_i(t)$, i = 1,
2, ..., n, $\lambda_k$, k = 1, 2, ..., l, not vanishing simultaneously on 0 ≤ t ≤ T, and a function
H,
(20) H[x(t), u(t), t, p(t)] = pofo[x(t), u(t), t]
The functions $\hat x(t)$, $\hat u(t)$, p(t) satisfy the equations
(21) $\dot x_i = \partial H/\partial p_i$, $\dot p_i = -\partial H/\partial x_i$, i = 1, 2, ..., n
Moreover, we have
REMARK: If the terminal end points, the $x_i(T)$'s, are fixed such that $x_i(T) =
x_{iT}$, i = 1, 2, ..., n', where n' ≤ n, then the above transversality condition
(iv) is replaced by
(25-a) $x_i(T) = x_{iT}$, i = 1, 2, ..., n'
and
(25-b) $p_i(T) = 0$, i = n' + 1, n' + 2, ..., n
EXAMPLE: Consider a consumer who wishes to maximize the sum of his
satisfaction from consumption over his lifetime. Assume, for the sake of
simplicity, that he knows that his life span is T, and that he also knows
the time path of the price vector $p_t$ of his consumption bundle $c_t$ and his
income $y_t$ over his entire life span. Let r be the market rate of interest
which is assumed to be a positive constant. Assume that this consumer is
"competitive" (that is, "small" enough relative to the economy) so that
his choice of $c_t$ for any t will not affect the $p_t$ and r that prevail in the
market. Let M be his total (discounted) income; that is, $M \equiv \int_0^T e^{-rt} y_t\,dt$.
Let a differentiable real-valued function $u(c_t)$ represent his satisfaction
from the consumption vector $c_t$. Let C be his consumption set. His problem
is to choose the time path of consumption $c_t$ from C such as to maximize
his satisfaction over time subject to his budget constraint. That is,
Maximize (with respect to $c_t$): $\int_0^T e^{-\rho t} u(c_t)\,dt$
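(27) $p_0 e^{-\rho t}u(\hat c_t) + \lambda\left(\dfrac{M}{T} - p_t \cdot \hat c_t e^{-rt}\right) \ge p_0 e^{-\rho t}u(c_t) + \lambda\left(\dfrac{M}{T} - p_t \cdot c_t e^{-rt}\right)$
for all $c_t \in C$, where λ ≥ 0, and
(28) $\lambda\left[M - \int_0^T p_t \cdot \hat c_t e^{-rt}\,dt\right] = 0$
(28') $\int_0^T p_t \cdot \hat c_t e^{-rt}\,dt = M$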
In other words, all of his income is spent over his lifetime. This is certainly
a natural consequence under the nonsatiation assumption.
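(29) $e^{-\rho t}[u(\hat c_t) - u(c_t)] \ge \lambda[p_t \cdot \hat c_t - p_t \cdot c_t]e^{-rt}$ for all $c_t \in C$
Thus, for λ > 0,
(30) $u(\hat c_t) \ge u(c_t)$ for all $c_t \in C$ such that $p_t \cdot \hat c_t \ge p_t \cdot c_t$
and
(31) $p_t \cdot \hat c_t \le p_t \cdot c_t$ for all $c_t \in C$ such that $u(c_t) \ge u(\hat c_t)$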
Condition (30) says that this consumer maximizes his satisfaction at each
instant of time over those consumptions whose values do not exceed the
value of the optimal consumption $\hat c_t$. Condition (31) says that, for the optimal
consumption bundle, his consumption expenditure is minimized at each
instant of time over those consumption bundles which would give him
satisfaction that is higher than or equal to the satisfaction obtained from $\hat c_t$.
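The conditions of this example can be worked out in closed form for one simple case. The sketch below is illustrative and rests on assumptions introduced here: a single good with price $p_t = 1$ for all t, logarithmic utility u(c) = ln c, and $p_0 = 1$. Pointwise maximization of $e^{-\rho t}u(c_t) - \lambda c_t e^{-rt}$ then gives $c_t = e^{(r-\rho)t}/\lambda$, and λ is fixed by requiring that discounted spending equal M.

```python
import math

def lifetime_plan(M, r, rho, T):
    """Return (lam, c) for the assumed case u(c) = ln(c), price = 1.

    lam is the multiplier on the budget constraint and c(t) the consumption path
    satisfying exp(-rho*t)/c(t) = lam*exp(-r*t).
    """
    lam = (1.0 - math.exp(-rho * T)) / (rho * M)

    def c(t):
        return math.exp((r - rho) * t) / lam

    return lam, c

def discounted_spending(c, r, T, n=20000):
    """Midpoint-rule approximation of the integral of c(t)*exp(-r*t) over [0, T]."""
    h = T / n
    return h * sum(c((i + 0.5) * h) * math.exp(-r * (i + 0.5) * h) for i in range(n))

# Hypothetical values for income, interest rate, discount rate, and life span.
M, r, rho, T = 100.0, 0.05, 0.03, 50.0
lam, c = lifetime_plan(M, r, rho, T)
spent = discounted_spending(c, r, T)
```

In this sketch the multiplier is strictly positive, so the budget is exhausted, and consumption rises over time whenever r exceeds ρ.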
The control variable u(t) is a function of time t. In many cases it may so
happen that we can choose a variable that does not depend on t. Such a variable is
called a control parameter. Let $b = [b_1, b_2, ..., b_a]$ be an a-dimensional vector
which denotes the control parameter. Let $B \subset R^a$ be the set to which b is restricted.
Consider the following problem.
PROBLEM III:''
Maximize: 0(b) + f f0 [x(t), u(t), t] dt
-u(t)EU,bEB 0
Theorem 8.C.3: Suppose $[\hat x(t), \hat u(t), b]$ is a solution of the above problem and the
constraint qualification holds. Then there exist multipliers $p_0$, $p_i(t)$, i = 1, 2, ..., n,
$q_j(t)$, j = 1, 2, ..., m, not vanishing simultaneously on 0 ≤ t ≤ T, and a function L,
(32) $L[x(t), u(t), t, p(t), q(t)] = H[x(t), u(t), t, p(t)] + \sum_{j=1}^{m} q_j(t) g_j[x(t), u(t), t]$
where
(33) $H[x(t), u(t), t, p(t)] = p_0 f_0[x(t), u(t), t] + \sum_{i=1}^{n} p_i f_i[x(t), u(t), t]$
(35) $\dot x_i = \partial \hat L/\partial p_i$, $\dot p_i = -\partial \hat L/\partial x_i$, i = 1, 2, ..., n
(36) $\partial \hat L/\partial u_i = 0$, i = 1, 2, ..., r
where
(37) $d\hat L/dt = \partial \hat L/\partial t$
on each interval of continuity of $\hat u(t)$ and the function $\hat L$ is continuous on 0 ≤ t ≤ T.
n r
(38) LTeb + iI.IPi(T) ab- = 0
-PO ab
-
where
$L_T \equiv L[\hat x(T), \hat u(T), T, p(T), q(T)]$
REMARK: The last condition, (iv), summarizes (or generalizes) the simple
transversality conditions discussed in Section A. In particular,
(a) T is fixed and $b_i = x_{iT}$: $p_i(T) = 0$, i = 1, 2, ..., n.
(b) T is unspecified and the $x_{iT}$'s are fixed: Set b = T and obtain
and
(47) $t_0 = t_0(b)$, $t_1 = t_1(b)$
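(49) $\left[\dfrac{\partial g}{\partial u},\ \delta_{ij} g_j\right]$
has rank m at each element $[\hat x(t), \hat u(t), b, t]$ in $X_0$, where $\delta_{ij}$ is "Kronecker's delta"
defined by $\delta_{ij} = 1$ if i = j and $\delta_{ij} = 0$ if i ≠ j, and ∂g/∂u is the Jacobian matrix of
g with respect to u evaluated at $[\hat x(t), \hat u(t), b, t]$.
The matrix (49) can be written out as
$$\begin{bmatrix}
\dfrac{\partial g_1}{\partial u_1} & \dfrac{\partial g_1}{\partial u_2} & \cdots & \dfrac{\partial g_1}{\partial u_r} & g_1 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
\dfrac{\partial g_m}{\partial u_1} & \dfrac{\partial g_m}{\partial u_2} & \cdots & \dfrac{\partial g_m}{\partial u_r} & 0 & 0 & \cdots & g_m
\end{bmatrix}$$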
-(51)
has rank s, where E is the set of indices in which the $g_j$'s are effective, that is,
(52) $E \equiv \{\, j : g_j[x, u, b, t] = 0 \,\}$
and s is the number of these effective constraints. In other words, (A-2) says that
the rank of $(\partial g_E/\partial u)$ is equal to the number of the effective constraints. If all
the $g_j$-constraints are inequality constraints (so that m' = m), then (A-2) amounts
to the rank constraint qualification discussed in subsection a [see condition (iv) of
the lemma and Arrow-Hurwicz-Uzawa [2]].
with H defined as
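(59) $\dot x_i = \partial \hat L/\partial p_i$, $\dot p_i = -\partial \hat L/\partial x_i$, i = 1, 2, ..., n
(60) $\partial \hat L/\partial u_i = 0$, i = 1, 2, ..., r
(61) $d\hat L/dt = \partial \hat L/\partial t$
on each interval of continuity of $\hat u(t)$ and the function $\hat L$ is continuous on
$t_0 \le t \le t_1$.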
(iii) The following formula holds:
(63) - a0 + [_!sotS
abj
n
i=1
ax .c
)abj
1,2,...,a
0
c-o
holds, where
REMARK: If $f_0$, the $f_i$'s, $g_j$'s, and $h_k$'s do not contain b explicitly (so that
L does not contain b explicitly), then the right-hand side of the transversality
condition is identically equal to zero. If, in addition, $t_0$ is fixed and does
not depend on b and the $x_{i0}$'s are fixed, the transversality condition is further
simplified to
(64) a bj
+ [_Li aJ
+
i_
Pi(ti) 6 J=O,j= 1, 2, ...,a
J
c. A SUFFICIENCY THEOREM
All the theorems we have discussed so far have been concerned with the
necessary conditions for optimality. A naturally important question is: Under
what conditions are these conditions also sufficient for optimality? In the case of
ordinary nonlinear programming and the calculus of variations, several important
sufficiency theorems exist; however, in each case there exists a simple but powerful
sufficiency theorem which implies optimality when the relevant functions are
concave. Here we prove such a theorem, which is a generalization of a theorem
due to Mangasarian [9].
We consider a problem in which x(t) is the state variable and u(t) is the
control variable. The function x(t) is an n-dimensional vector-valued continuous
function and u(t) is an r-dimensional vector-valued piecewise continuous func-
tion. The problem is as follows:
T
(68) Ok[x(T)] + ,lo hk[x(t), u(t), t] dt > 0, k = 1' + 1,-, 1
Note that in Mangasarian [9], there are no integral constraints. In this problem
both the initial and the terminal time (that is, 0 and T) are assumed to be fixed.
Regarding the vector [x(0), x(T)] as a control parameter b, we can apply the
Hestenes theorem. Thus, under suitable assumptions, we have the following set of
necessary conditions in order that $[\hat x(t), \hat u(t)]$ be optimal:
(i) There exist multipliers $p_0$, $p(t) = [p_1(t), ..., p_n(t)]$, $q(t) = [q_1(t), ..., q_m(t)]$,
and $\lambda = [\lambda_1, ..., \lambda_l]$, such that
(i-a) $p_0$ and λ are constants and λ ≥ 0, with
T
(69) A [ r/, + h tit] = 0, where
J
(71) $\dot x_i = \partial \hat L/\partial p_i$, $\dot p_i = -\partial \hat L/\partial x_i$, i = 1, 2, ..., n
(72) $\partial \hat L/\partial u_i = 0$, i = 1, 2, ..., r
where L is defined by
(73) L = L[x(t), u(t), t, p(t), q(t)]
and
I
+ -y3.khk[x(1), u(t), t]
6 -I
(76) 1,2,...,n
where
Theorem 8.C.5: For the above problem, if (A-3) and (A-4) hold, then all the
necessary conditions (i), (ii), and (iii), stated above, are also sufficient for $[\hat x(t), \hat u(t)]$
to be a global optimum solution of the problem, provided that $p_0 = 1$ and the following
additional condition holds:
(82) $p(t) \ge 0$ for all t
If the concavity in (A-3) and (A-4) is replaced by strict concavity, then the optimality
is "unique."
00)
I[x, u] = foT(o -fo)dt +
fT ou] dt + [z(0) - x(0)]ox(o)
+ 9' gu + A. hu)] dt
x(z)]
+
- fT' hdt +
(g)
0
> 0
(h)
hold:
Following are the reasons the above relations
and the concavity offo and io.''
Inequality (a) by the differentiability
Equation (b) by (78), (79), (80), and (S I)(65), and the continuity of x(t), x(t)
Equation (c) by integration by parts'
and p(t).
Inequality (d) by the differentiability and concavity of f, g, and h, $q(t) \ge 0$,
and (82) [note that this is the only step in the proof where (82) is used].
Inequality (e) by (70), $q(t) \ge 0$, and (66).
Inequality (f) by the concavity and differentiability of 0, and by A > 0.
Equation (g) by (69).
Inequality (h) by λ ≥ 0, (67), and (68).
$x(0) = x_0$
Then this condition replaces transversality condition (75). Similarly, if the
right-hand end-point is fixed as $x(T) = x_T$, then this replaces transversality condition
(76). With this remark, we can easily prove the following corollary.
Corollary:
(i) If $x(0) = x_0$ [or $x(T) = x_T$], and if $h_k = 0$ for all k, then Theorem 8.C.5 holds
with (69), (75) [or (76)], and λ ≥ 0, and the vector λ, all deleted.
(ii) If $x(0) = x_0$ and $x(T) = x_T$, and if $h_k = 0$ for all k, then Theorem 8.C.5 holds
with (69), (75), (76), and λ ≥ 0, and the vector λ, all deleted.
FOOTNOTES
1. When the g-function lacks the u(t) (the case of bounded state variables), the optimal
control problems become quite difficult and tedious, and are beyond the scope of
our exposition. The interested readers are referred to Hestenes ([6], chapter 8) and
Russak [10], for example. See also footnote 16.
2. The constraint qualification is the qualification imposed on the constraint to
guarantee "normality." In other words, if the constraint qualification does not hold,
then L must be written as $L \equiv q_0 H + \sum_j q_j g_j$, where $q_0$ can be zero. The
constraint qualification in this context can be interpreted as the qualification imposed
on the constraint to guarantee $q_0 > 0$. (Note that if $q_0 > 0$, we can choose $q_0 = 1$,
for we can always redefine the multipliers $q_j$ by $q_j/q_0$. Recall our discussion on the
normality condition in Chapter 1.)
3. More precisely, the notation $\partial \hat L/\partial u_i$ means $\partial L/\partial u_i$ evaluated at $[\hat x(t), \hat u(t), t,
p(t), q(t)]$ for each t. In the subsequent discussion, we use the notation $\partial \hat L/\partial x_i$,
$\partial \hat L/\partial p_i$ in the same sense.
4. That constraint $g_j[x, u, t] \ge 0$ is effective means $g_j[\hat x, \hat u, t] = 0$. In the
subsequent discussion we refer to condition (iv) as the rank constraint qualification.
5. The $p_i$'s and the $q_j$'s are often called multipliers. It is important to note that in the
definition of the functions L and H, the multiplier corresponding to $f_0$ (that is, $p_0$) is
set equal to one. This is due to the fact that the present problem corresponds to the
one considered in Theorem 8.A.6. In other words, this is the case with variable
right-hand end-points. As should be clear from our discussion in Section A, if the right-hand
end-points are fixed, then in general we do not obtain $p_0 = 1$. In this case, H should be
defined as $H \equiv p_0 f_0[x(t), u(t), t] + \sum_{i=1}^{n} p_i f_i[x(t), u(t), t]$ with $p_0 \ge 0$ (constant) and
the definition of L should be modified accordingly.
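6. More precisely, $\hat L$ is continuous along $[\hat x(t), \hat u(t), t]$ and has a piecewise continuous
derivative given by $\partial \hat L/\partial t$ on each interval on which $\hat u(t)$ is continuous. From
conditions (i) and (iii), we obtain $d\hat L/dt = (\partial \hat L/\partial x) \cdot \dot{\hat x} + (\partial \hat L/\partial u) \cdot \dot{\hat u} + (\partial \hat L/\partial t) +
(\partial \hat L/\partial p) \cdot \dot p + (\partial \hat L/\partial q) \cdot \dot q = (\partial \hat L/\partial t) + (\partial \hat L/\partial q) \cdot \dot q = (\partial \hat L/\partial t) + \hat g \cdot \dot q$, where
$\hat g \equiv g[\hat x(t), \hat u(t), t]$. If $\hat g_j = 0$, then $\hat g_j \dot q_j = 0$. On the other hand, if $\hat g_j > 0$ on some interval,
then $q_j \hat g_j = 0$ [condition (iii)] means $q_j(t) = 0$ (= constant) for this interval so
that $\dot q_j = 0$ for this interval. Thus we have $\hat g_j \dot q_j = 0$ for this interval. In other words,
we have $\hat g_j \dot q_j = 0$ for all j so that $d\hat L/dt = \partial \hat L/\partial t$.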
7. Clearly, if L is not concave in u, then condition (iii) does not necessarily imply
condition (ii). It is important to note that condition (ii) always implies condition (iii)
provided that the constraint qualification holds.
8. From condition (iii), aL/au; = 0 and µ;(t) > 0, µ;(t)u;(t) = 0, i = 1, 2, ..., r. But
REFERENCES
Section D
TWO ILLUSTRATIONS:
THE CONSTRAINT
g[x(t), u(t), t] ≥ 0 AND THE USE OF
THE CONTROL PARAMETER
inequality g-constraint. The notations of the problem are the following: $N_t$, labor
force; $K_t$, capital stock; $I_t$, gross investment; $X_t$, consumption; n, rate of
population growth; μ, rate of capital depreciation; ρ, discount factor; F, production
function; u, utility function; subscript t, time t. (Refer to Chapter 5, Section C.)
Then our problem can be written as follows:
Subject to:
(6) $f(k_t) - x_t - i_t \ge 0$
(7) $\dot k_t = i_t - \lambda k_t$, where $\lambda \equiv n + \mu$
(8) $k_t \ge 0$, $k_0$ is given
(9) $x_t \ge 0$
where $x_t \equiv X_t/N_t$ and $i_t \equiv I_t/N_t$. Equation (7) is obtained from (2) and (3). We
retain the assumptions made before for this problem: $f'(k) > 0$, $f''(k) < 0$ for all k,
$f'(0) = \infty$, $f'(\infty) = 0$, $f(0) = 0$, and $u''(x) < 0$ for all x. Thus f is strictly concave
in k, and u is strictly concave in x. Viewing this problem as an optimal control
problem, $x_t$ and $i_t$ are the control variables and $k_t$ is the state variable. Here
$k_\infty$ is not fixed. We first proceed with our analysis without explicit consideration
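of the state variable constraint (8). Introducing the multipliers $p_t$, $r_t$, and $v_t$,
we define the function L as follows:
(10) $L \equiv L[k_t, x_t, i_t, t, p_t, r_t, v_t] = u(x_t)e^{-\rho t} + p_t(i_t - \lambda k_t) + r_t[f(k_t) - x_t - i_t] + v_t x_t$
Note that the rank constraint qualification is trivially satisfied, for we can observe²
$$\begin{bmatrix} \dfrac{\partial [f(k) - x - i]}{\partial x} & \dfrac{\partial [f(k) - x - i]}{\partial i} \\[2mm] \dfrac{\partial x}{\partial x} & \dfrac{\partial x}{\partial i} \end{bmatrix} = \begin{bmatrix} -1 & -1 \\ 1 & 0 \end{bmatrix}$$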
Then, in view of Theorem 8.C.1, the solution $[\hat k_t, \hat x_t, \hat i_t]$ of the above problem
must satisfy the following conditions:
(i) The variables $\hat k_t$, $\hat x_t$, $\hat i_t$, $p_t$, $r_t$, and $v_t$ must satisfy the
Euler-Lagrange-Hamiltonian equations
(11) $\dot k_t = \partial \hat L/\partial p_t$, $\dot p_t = -\partial \hat L/\partial k_t$
(14) $v_t \ge 0$, $v_t \hat x_t = 0$
hold.
$H[\hat k_t, \hat x_t, \hat i_t, p_t] \ge H[\hat k_t, x_t, i_t, p_t]$
for all $[\hat k_t, x_t, i_t]$ which satisfy $f(\hat k_t) - x_t - i_t \ge 0$ and $x_t \ge 0$, where
must hold. (See Arrow [1], pp. 92-93, and recall our discussion at the end
of Section A.)
Condition (11) can be written as equation (7) for the optimal path and
(18) $u'(\hat x_t)e^{-\rho t} - r_t + v_t = 0$
and
(19) $p_t - r_t = 0$
Clearly condition (iii) implies relations (12), (13), and (14) under the assumption
that the constraint qualification holds. Moreover, the converse [that is, (12), (13),
and (14) imply condition (iii)] also holds, because L is (strictly) concave in $x_t$
and $i_t$ in view of the (strict) concavity of u. Note also that (19), in view of $r_t \ge 0$,
implies
(20) $p_t \ge 0$
Moreover, (19) and (17) imply
(21) $\dot p_t = -p_t[f'(\hat k_t) - \lambda]$
Define a new variable qt by
Then assuming $u'(x_t) > 0$ (nonsatiation) for all $x_t \ge 0$, or at least for the optimal
per capita consumption path $\hat x_t$, (24) [or (24')] implies
(25) $p_t > 0$
or
(25') $q_t > 0$ for t < ∞
Then (13) combined with (19) implies
(26) $f(\hat k_t) - \hat x_t - \hat i_t = 0$
In other words, constraint (6) holds with equality. It is important to note that
the equality constraint is obtained as a result of the explicit recognition of the
nonsatiation assumption.
Combining (26) with (11), we now obtain
(29) $u'(\hat x_t)e^{-\rho t} = p_t$
or
(29') $u'(\hat x_t) = q_t$
Combining (23), (27), and (29'), we can draw the phase diagram on either the
(x-k)-plane or the (q-k)-plane. The rest of the analysis is the same as that carried
out in Section A. Recall that the nonnegativity of the state variable-that is,
condition (8)-is satisfied along the optimal path which converges to the modified
golden rule path. And along this path, the right-hand end-point condition (16)
is satisfied, and the integral J converges.
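As a numerical illustration of the modified golden rule path mentioned above, the following sketch computes its steady state. It rests on assumptions that are not in the text: a Cobb-Douglas technology $f(k) = k^{\alpha}$ and purely illustrative parameter values. The steady-state condition $f'(k) = \lambda + \rho$ used here follows from (21) if $q_t = p_t e^{\rho t}$, which is the relation implied by comparing (29) with (29').

```python
# Minimal sketch of the modified golden rule steady state.
# Assumptions (not from the text): Cobb-Douglas technology f(k) = k**alpha
# and the illustrative parameter values used below.

def modified_golden_rule(alpha, lam, rho):
    """Return (k_star, x_star) where f'(k_star) = lam + rho and f(k) = k**alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if lam + rho <= 0.0:
        raise ValueError("lam + rho must be positive")
    # f'(k) = alpha * k**(alpha - 1) = lam + rho, solved for k.
    k_star = (alpha / (lam + rho)) ** (1.0 / (1.0 - alpha))
    # With k constant, the accumulation term in (10) gives i = lam * k,
    # and (26) then gives x = f(k) - i.
    x_star = k_star ** alpha - lam * k_star
    return k_star, x_star


k_mgr, x_mgr = modified_golden_rule(alpha=0.3, lam=0.05, rho=0.03)
k_gr, x_gr = modified_golden_rule(alpha=0.3, lam=0.05, rho=0.0)  # golden rule
print(f"modified golden rule: k = {k_mgr:.3f}, x = {x_mgr:.3f}")
print(f"golden rule:          k = {k_gr:.3f}, x = {x_gr:.3f}")
```

With a positive discount rate the steady-state capital and consumption both lie below their golden rule values, which is the sense in which the rule is "modified."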
In the above analysis, it is not assumed that $i_t \ge 0$. In other words, $i_t$ or $I_t$
can be negative. This means $\dot K_t + \mu K_t$ can be negative. If $\dot K_t < 0$, then the economy
may "eat up" the capital accumulated in the past. We may suppose that this is
impossible. In other words, we introduce the assumption of the irreversibility of
investment, that is, $I_t \ge 0$ (or $i_t \ge 0$). This means that investment once made in
physical form cannot be converted into consumer goods; hence the economy
cannot "eat up" the capital accumulated in the past. Such a problem is discussed
by Arrow [2] and Arrow and Kurz [3], but we omit it here.
utilization. Then there is a capacity limit given by the relation $bY_t \le K$. We
assume that b is a positive constant. It is important to note that the suffix t is not
attached to K; that is, K is not a function of time. The demand function for the output
of the firm is given by $D(p_t, t)$, where $p_t$ is the price of the output at time t.
Thus if $p_t$ is constant for all t (say, $p_t = p$), we can draw the time path of demand
for the output as illustrated in Figure 8.13.
Suppose that the firm is required to produce an output that will meet the
peak demand. If the firm builds a capacity which will meet the peak demand,
then there will be an excess capacity during the nonpeak periods, for the output
is nonstorable by the assumption of the peak-load problem. Such a loss of excess
capacity can be reduced if the firm sets a higher price for a peak period, thus
"flattening" the demand curve. One version of the peak-load problem is that
of choosing the amount of initial investment K and the time path of price so
as to minimize such a "loss."
Let w be the price vector of L and T be the planning horizon of the firm.'
For the sake of simplicity, we assume that the capital lasts for the period T with
the same efficiency and w is constant over time and over the relevant range of
output.' We also assume that the initial purchase of capital stock costs the firm
r dollars per unit of capital at each t.` Assume that r is a positive number.
There are at least two types of targets that the firm might wish to achieve.
In one case, the firm wishes to maximize total social welfare over time. This may
be the case when the firm is owned by a public authority, for example. In the
other case, the firm wishes to maximize the total profit over time. This may be
the case when the firm is privately owned. The solution may be different in each
of the two cases; then the problem of optimal public regulation occurs.
First we consider the case in which the firm wishes to maximize social
welfare over time. This formulation of the peak-load problem seems to be more
common in the literature (see, for example, Williamson [11] and Steiner [8] ).
The definition of "social welfare" or at least its maximization will cause well-
known difficulties. We assume that the "optimum conditions" of production and
exchange are satisfied elsewhere in the economy in order to avoid the "second
best" digression. We also assume that the social welfare at each instant of time
is measured by (total revenue) plus (consumer's surplus) minus (social cost). That
is,
Notice that the firm can select either the price policy of choosing the time path of
pt or the output policy of choosing the time path of Yt. However, in view of the
demand relation Yt = D(pt, t) or pt = P(Yt, t), the choice of one policy auto-
matically implies the choice of the other policy. In other words, it does not make
any difference whether we suppose the firm adopts the price policy or the output
policy. Thus if the firm adopts the price policy, it has a uniquely implied output
policy determined by Yt = D(pt, t). Here we suppose that the firm adopts the
output policy (that is, the policy of choosing the time path of Yt). The price policy
is then implied by pt = P(Yt, t). The demand function is illustrated in the tradi-
tional manner in Figure 8.14. Note that as t changes (say, from t1 to t2), the
demand curve shifts.
The total revenue plus consumers' surplus at time t, when $Y_t$ is chosen,
is given by
(33) $F(Y_t, t) \equiv \int_0^{Y_t} P(y_t, t)\,dy_t$
Assume again that w, r, a, and b are all constant over time and over the relevant
range of output. Then total social cost at time t, that is, $TC_t$, is given by
(34) $TC_t = (w \cdot a)Y_t + rK$
Thus total social benefit over the period of time [0, T] is given by
(35) $W \equiv \int_0^T W_t\,dt = \int_0^T (TR_t + S_t - TC_t)\,dt = \int_0^T [F(Y_t, t) - (w \cdot a)Y_t - rK]\,dt$
The analysis with a positive future discount (that is, $W \equiv \int_0^T W_t e^{-\sigma t}\,dt$, $\sigma > 0$, where
$\sigma$ is the social discount rate) is analogous to the subsequent analysis; hence it is left
as an exercise to the interested reader.
We are now ready to formulate the present version of the peak-load
problem.
PROBLEM I:
Maximize over $Y_t$, K: $W \equiv \int_0^T [F(Y_t, t) - (w \cdot a)Y_t - rK]\,dt$
Subject to:
(36) $K \ge bY_t$
and
(37) $Y_t \ge 0$
(i) $\phi_0 \ge 0$ (constant), $q_t \ge 0$, $\mu_t \ge 0$, for all t, and $\phi_0$, $q_t$, and $\mu_t$ do not vanish
simultaneously. Moreover, $\mu_t \hat Y_t = 0$, for all t.
(39) $\phi_0\{F_Y - (w \cdot a)\} - bq_t \le 0$ and $[\phi_0\{F_Y - (w \cdot a)\} - bq_t]\hat Y_t = 0$, for all t
where $F_Y \equiv \partial F(\hat Y_t, t)/\partial Y_t$.
(iii) The following relations hold:
(40) $q_t(\hat K - b\hat Y_t) = 0$ and $\hat K \ge b\hat Y_t$, for all t
(iv) The following relation also holds:
(41) $\phi_0[F(\hat Y_t, t) - (w \cdot a)\hat Y_t - r\hat K] \ge \phi_0[F(Y_t, t) - (w \cdot a)Y_t - r\hat K]$
for all t, and for all $Y_t$ such that $\hat K \ge bY_t$ and $Y_t \ge 0$.
(v) The following transversality condition holds:
(42) $\int_0^T (-\phi_0 r + q_t)\,dt = 0$
Next we show $\phi_0 > 0$ so that we can take $\phi_0 = 1$. To see this, simply note the
relation (44). If $\phi_0 = 0$, then (44) implies $q_t = 0$. Since we assumed $\hat Y_t > 0$ (the interior
solution) (so that $\mu_t = 0$), this means that all the multipliers ($\phi_0$, $q_t$, and $\mu_t$)
vanish simultaneously. This contradicts condition (i) in the above. Hence the relations
(44), (41), and (42) can now be rewritten as follows:
(46) $P(\hat Y_t, t) - w \cdot a = bq_t$, for all t, or $q_t = [P(\hat Y_t, t) - w \cdot a]/b$, for all t
(47) $F(\hat Y_t, t) - (w \cdot a)\hat Y_t - r\hat K \ge F(Y_t, t) - (w \cdot a)Y_t - r\hat K$, for all t
or
(48) $F(\hat Y_t, t) - (w \cdot a)\hat Y_t \ge F(Y_t, t) - (w \cdot a)Y_t$, for all t
and for all $Y_t$ such that $\hat K \ge bY_t$, $Y_t \ge 0$. Also,
Relation (47) means that $W_t$ is maximized subject to the constraints at each instant
of time.
Conditions (46), (47), (49), and (40) constitute necessary conditions for
$(\hat Y_t, \hat K)$ to be optimal. Assuming $\partial^2 F/\partial Y_t^2 = \partial P/\partial Y_t < 0$, $W_t$ is a concave function
in $Y_t$ and K, so that condition (47) is implied from condition (46). Therefore,
in view of Mangasarian's theorem (Theorem 8.C.5), conditions (40), (46), and (49)
constitute a set of necessary and sufficient conditions for an optimum. In other
words, conditions (40), (46), and (49) completely describe the solution of the
problem, $\hat Y_t$, $q_t$, $\hat K$, $t \in [0, T]$. Notice that if $q_t > 0$ for all t, then $\hat K = b\hat Y_t$, $t \in
[0, T]$, replaces (40). In general, $q_t$ can be zero for some t, although $q_t > 0$ holds
over a certain period of time in view of (49).
If the firm has an existing stock of capital $\bar K$, then $\hat K$ is written as $\hat K \equiv
\bar K + K_a$, where $K_a$ is the additional capital requirement. If $\hat K \ge \bar K$, then our
analysis above follows word for word, except that it should be reinterpreted accordingly.
If $\hat K < \bar K$, a slight modification of the analysis would be necessary, and
r would presumably be zero. If r = 0, (49) implies $q_t = 0$ for "almost all" t (that is,
for all t except for a countable number of isolated points in [0, T]), so that we
have $P(\hat Y_t, t) = w \cdot a$ for almost all t. We proceed with our analysis for $\hat K \ge \bar K$.
From (46) and (49) we obtain
$= \int_0^T q_t \hat K\,dt - rT\hat K = 0$
In other words, the profit over the whole planning horizon should be zero.
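The zero-profit result can be checked numerically. The sketch below is only an illustration under assumptions that are not in the text: demand is a two-period step function with linear inverse demand $P_i(Y) = A_i - BY$ on a period of length $T_i$, the inner product $w \cdot a$ is passed as the single number `wa`, and capacity is fully used in both periods. Capacity is found from (46) together with the condition $\int_0^T q_t\,dt = rT$ that (42) yields once the constant multiplier is normalized to one.

```python
# Illustrative sketch: welfare-maximizing capacity with a two-period step demand.
# Assumptions (not from the text): P_i(Y) = A_i - B*Y, full capacity in both
# periods, and the numbers used in the example call.

def welfare_capacity(A1, A2, B, T1, T2, wa, b, r):
    """Return (Y, K, (q1, q2)) for full-capacity output Y and capacity K = b*Y."""
    T = T1 + T2
    # q_i = [P_i(Y) - wa] / b from (46); T1*q1 + T2*q2 = r*T then gives
    # T1*P1(Y) + T2*P2(Y) = T*(wa + b*r), which is linear in Y.
    Y = (T1 * A1 + T2 * A2 - T * (wa + b * r)) / (B * T)
    q1 = (A1 - B * Y - wa) / b
    q2 = (A2 - B * Y - wa) / b
    if Y <= 0 or min(q1, q2) <= 0:
        raise ValueError("full capacity in both periods is not optimal here")
    return Y, b * Y, (q1, q2)


def total_profit(Y, A1, A2, B, T1, T2, wa, b, r):
    """Revenue minus operating cost minus capital cost over the whole horizon."""
    operating = T1 * (A1 - B * Y - wa) * Y + T2 * (A2 - B * Y - wa) * Y
    return operating - r * (b * Y) * (T1 + T2)


params = dict(A1=10.0, A2=14.0, B=1.0, T1=1.0, T2=1.0, wa=2.0, b=1.0, r=3.0)
Y, K, (q1, q2) = welfare_capacity(**params)
print(Y, K, q1, q2, total_profit(Y, **params))
```

In this example the peak period carries most of the capacity charge (its multiplier is larger), yet the two periods together just cover the capital cost, so total profit is zero.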
Now suppose that the demand function P is such that $q_t > 0$ for all t so that
A = [0, T] (that is, full capacity output always occurs). Then $\hat Y_t = \hat K/b$ = constant
($\equiv \hat Y$) for all t, so that the value of $\hat K$ is determined by (50) as follows:
Figure 8.15. An Illustration of the Solution When Full Capacity Is Achieved Always.
Figure 8.16. An Illustration of the Solution When Full Capacity Is Not Necessarily Achieved.
$(p_t - w \cdot a)D(p_t, t) - rK$
Note that once $p_t$ is set, the firm knows the demand for output by $D(p_t, t)$ and
hence produces an amount $Y_t = D(p_t, t)$.
The firm is supposed to maximize
(64) $\pi \equiv \int_0^T \pi_t e^{-\rho t}\,dt$
where $\rho \ge 0$ denotes the discount rate for the firm. We assume $\rho = 0$ for the sake
of simplicity. The analysis in which $\rho > 0$ is analogous to the subsequent analysis;
hence it is left as an exercise for the interested reader. We are now ready to state
our problem.
PROBLEM II:
Maximize over $p_t$, K: $\int_0^T [(p_t - w \cdot a)D(p_t, t) - rK]\,dt$
Subject to:
Here $\phi_0$, $q_t$, and $\mu_t$ are multipliers. Although the same notation is used for these
multipliers (as well as L) as in the previous problem, their values can, of course,
be different from the corresponding ones in the previous problem. The same
notation is used purely for the sake of notational simplicity. Using Hestenes'
theorem, we now have the following necessary conditions for $p_t^*$ and $K^*$ to be
optimal:
(i) The multipliers $\phi_0$, $q_t$, and $\mu_t$ do not vanish simultaneously and $\phi_0 \ge 0$
(constant), $q_t \ge 0$, $\mu_t \ge 0$, for all t. Moreover, $\mu_t p_t^* = 0$, for all t.
(ii) $\partial L^*/\partial p_t = 0$, where $L^* \equiv L(p_t^*, K^*, q_t, \mu_t)$, for all t, that is,
(72) $\int_0^T (-\phi_0 r + q_t)\,dt = 0$
In other words, we may assume that the rank constraint qualification is satisfied
because
(74) $D_p \ne 0$, or $D_p < 0$
from the assumption on the demand function. This implies $\phi_0 > 0$. To see this,
suppose $\phi_0 = 0$. Then the relation (69) with condition (74) implies $q_t = 0$. Since
$\mu_t = 0$ from $p_t^* > 0$, all the multipliers vanish, contradicting condition (i) above.
Thus $\phi_0 > 0$, so that we may choose $\phi_0 = 1$. Conditions (69), (71), and (72) are now
simplified as follows:'
for all t and for all $p_t$ such that $K^* \ge bD(p_t, t)$ and $p_t \ge 0$.
(77) $\int_0^T (q_t - r)\,dt = 0$, or $rT = \int_0^T q_t\,dt$
Note that relation (76) means that the "current profit" (that is, profit except for
capital cost) as well as the total profit are to be maximized at each instant of time.
Conditions (75), (76), (77), and (70) constitute necessary conditions for an
optimum for the present problem. Moreover, if we assume $D_{pp} \equiv \partial^2 D/\partial p_t^2 < 0$
for all $p_t$, then our $\pi_t$ (hence $\pi$ also) is a concave function so that these conditions
are also sufficient for an optimum (again in view of Mangasarian's theorem or
Theorem 8.C.5).
We now proceed to further characterizations of the above solution. First,
we define the elasticity of demand by
(86) $\pi^* \equiv \int_0^T [(p_t^* - w \cdot a)D(p_t^*, t) - rK^*]\,dt$
where $\int_B$ denotes the integration in t over the range of B. Recall that, in the case of
a welfare maximizing monopoly, total profit is zero.
Now for the sake of illustration, assume that $\eta_t^*$ is constant over t. An
example of a demand function that yields a constant $\eta_t^*$ is
$= T(\varepsilon^* - 1)\,w \cdot a$
That is, total profit is larger when $\varepsilon^*$ (or the degree of monopoly $1/\eta^*$) is larger.
When $\varepsilon^*$ is constant, (83) can be rewritten as
(89) $\int_0^T p_t^*\,dt = \varepsilon^*(w \cdot a + br)T$
Suppose further that the demand conditions are such that $q_t > 0$ for all t. Then
$Y^* = D(p_t^*, t)$ for all t, where $Y^* \equiv K^*/b$. Using this relation, we obtain
(90) $p_t^* = P(Y^*, t)$, for all t
For the sake of illustration, suppose also that relation (55) holds for the function
P. Then (89) and (90) yield
(91) $T_1P_1(Y^*) + T_2P_2(Y^*) = \varepsilon^* T(w \cdot a + br)$
Since $\varepsilon^* > 1$, $Y^* < \hat Y$, so that $K^* < \hat K$. That is, capacity for the profit maximizing
monopoly tends to be less than the socially optimum amount. It is easy to prove
that this conclusion also holds even if $\varepsilon_t^*$ changes over time.
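The comparison between the two capacities can be illustrated with a small computation. This is a sketch under assumptions that do not come from the text: relation (55) is not reproduced here, so a two-period step demand with constant-elasticity inverse demand $P_i(Y) = A_i Y^{-1/\eta}$ on a period of length $T_i$ is assumed, with $\eta > 1$ and $\varepsilon^* = 1/(1 - 1/\eta)$ as in footnote 12. The welfare-maximizing case is obtained from the same formula with $\varepsilon^*$ replaced by 1, and `wa` stands for the inner product $w \cdot a$.

```python
# Illustrative sketch of (91) with a constant-elasticity two-period demand.
# Assumptions (not from the text): P_i(Y) = A_i * Y**(-1/eta), eta > 1,
# full capacity in both periods, and the numbers in the example call.

def full_capacity_output(A1, A2, T1, T2, eta, wa, b, r, monopoly):
    """Solve T1*P1(Y) + T2*P2(Y) = eps*T*(wa + b*r) for Y."""
    if eta <= 1.0:
        raise ValueError("eta must exceed 1 for eps = 1/(1 - 1/eta) to be finite")
    T = T1 + T2
    eps = eta / (eta - 1.0) if monopoly else 1.0
    # Y**(-1/eta) * (T1*A1 + T2*A2) = eps * T * (wa + b*r)
    scale = eps * T * (wa + b * r) / (T1 * A1 + T2 * A2)
    Y = scale ** (-eta)
    # Full capacity requires a positive multiplier in each period:
    # price above wa (welfare case) or marginal revenue P/eps above wa (monopoly).
    for A in (A1, A2):
        if A * Y ** (-1.0 / eta) / eps <= wa:
            raise ValueError("full capacity in both periods is not optimal here")
    return Y


args = dict(A1=8.0, A2=12.0, T1=1.0, T2=1.0, eta=2.0, wa=1.0, b=1.0, r=1.0)
Y_hat = full_capacity_output(monopoly=False, **args)   # welfare maximum
Y_star = full_capacity_output(monopoly=True, **args)   # profit maximum
print(f"welfare output {Y_hat:.2f}, monopoly output {Y_star:.2f}")
```

Because capacity is $K = bY$ in both cases, the smaller monopoly output translates directly into a smaller capacity.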
If, on the other hand, the demand conditions are those specified by (57),
then (82) and (77) imply
FOOTNOTES
8. We will assume that $p_t > 0$ implies $Y_t = D(p_t, t) > 0$ for all t.
9. In view of the assumption made in footnote 8 and Fy < 0, Y, > 0 implies p1
Fy(Y1, t) > 0.
10. In view of the assumption made in footnote 8, $p_t^* > 0$ implies $D_t^* \equiv D(p_t^*, t) > 0$.
11. Equation (75) can be rewritten as $p_t^* + D_t^*/D_p = w \cdot a + q_t b$. But $D_t^* = Y_t^*$ and
$1/D_p = \partial P(Y_t^*, t)/\partial Y_t \equiv P_Y^*$. Hence the LHS of this equation is
$p_t^* + P_Y^* Y_t^* = \partial(p_t^* Y_t^*)/\partial Y_t$, which signifies the marginal revenue. The RHS of the equation,
$w \cdot a + q_t b$, will signify the "marginal cost." Therefore equation (75) may be interpreted
as the familiar rule, $MR_t = MC_t$.
12. Note that $\varepsilon_t^* \equiv 1/(1 - 1/\eta_t^*)$, and that $1/\eta_t^*$ is the well-known degree of monopoly
à la Lerner. Notice that $\varepsilon_t^*$ is greater, the greater the degree of monopoly. From
footnote 11, $(w \cdot a + q_t b)$ is equal to the marginal revenue. Hence (82) signifies the
usual rule that the difference between the price and the marginal revenue increases
as the degree of monopoly increases.
REFERENCES
Section E
THE NEO-CLASSICAL THEORY
OF INVESTMENT AND
ADJUSTMENT COSTS-
AN APPLICATION OF
OPTIMAL CONTROL THEORY'
a. INTRODUCTION
The essence of the present treatment of the theory of investment is the
behavioral assumption that a firm maximizes the present value of net cash flows
subject to constraints such as a production function and a capital accumulation
equation. Hence it is a part of dynamic decision theory. Since the firm determines
both the demand for factors such as labor and the demand for investment, the
name "theory of investment" seems slightly inappropriate. Rather it should be
termed the dynamic theory of the firm.
Whatever we call it, there seems to be quite a bit of confusion in the theory
of investment. The purpose of this section is partly expository in the sense that we
attempt to correct these confusions and partly illustrative in the sense that we
present various theories in a unified and generalized fashion.
First there is the argument (Haavelmo [20] and Lerner [43], for example)
which says that there is no investment demand schedule for an individual firm.
Assuming that the firm is competitive and small enough and that all prices are
constant, the firm can and would adjust instantaneously to the desired stock of
capital, which is constant. In this case, investment is always equal to the amount
of depreciation and there is no investment function as such. Thus Haavelmo, for
example, concludes the following ([20], p. 216):
What we should reject is the naive reasoning that there is a `demand schedule'
for investment which would be derived from a classical scheme of producer's
behavior in maximizing profit.
The capital is adjusted to the desired level instantaneously at the initial time
and it will be kept constant over the whole planning horizon ([20], p. 163).
Jorgenson, being apparently distressed by this, argued that "it is possible
to derive a demand function for investment based on purely neoclassical con-
siderations" ([27], p. 133). The secret of Jorgenson's innovative procedure of
obtaining the investment demand schedule is to change prices, notably the price
of capital goods, over time ([27], p. 149). The amount of investment then changes
over time depending on the time path of the prices.
However, as Tobin ([61], p. 157) noticed, nothing basic is changed. If all
prices are assumed (or expected by the firm) to be constant, then the amount of
investment is also constant over time in Jorgenson's model. In other words, the
1. The investment policy for the firm is to reach the "long-run" desired stock
of capital (K*) as soon as possible (that is, $I_t = I_{max}$ if $K_0 < K^*$, and $I_t = I_{min}$
if $K_0 > K^*$), and after reaching K* to remain at K*.
2. The "long-run" desired stock of capital, K*, is determined by the usual marginal
productivity principle.
3. The above conclusions imply that the investment demand changes once over time
when the capital stock reaches K* (except for the case in which K0 = K*).
4. The (Lerner-Haavelmo-Jorgenson) conclusion of instantaneous adjustment
cannot occur for the continuous time model (see, for example, [20] and [27] ),
regardless of the sizes of Imax and Imin, as long as they are finite. This is simply
because the integral over a point of time-say, at t = 0-is zero. For the discrete
time model, instantaneous adjustments can occur (see, for example, Takayama
[ 59] ).
5. However, if the sizes of $I_{max}$ and $|I_{min}|$ are large enough, then the time required
to reach K* can be made very small; that is, an "almost" instantaneous adjust-
ment occurs.
$\dot K_t = \delta(\bar K - K_t)$
where $\bar K \equiv \bar I/\delta$ and $\delta$ is the rate of depreciation. Viewing $\bar K$ as the "long-run"
desired stock of capital, this equation seems to define the usual "response
function" which is often seen in the empirical literature. Our first remark in
subsection d is a critical note on such a claim.
The second remark in subsection d is a critical summary of Uzawa's treat-
ment of adjustment costs, the "Penrose effect". The third remark is concerned
with a possible extension of investment theory. Among other things, we point
out that essentially the same results follow for the complete monopoly case.
(1) $W \equiv \int_0^\infty e^{-rt}W_t\,dt$
where
(2) $W_t \equiv p_tQ_t - w_tL_t - q_tI_t$
Here we use the following notations: $Q_t$, output; $L_t$, labor input; $I_t$, investment;
$p_t$, price of output; $w_t$, wage rate; $q_t$, price of capital goods; and r, discount rate.3
There are three constraints in this maximization problem. The first is the production
function, which we write as
(3) $Q(L_t, K_t) - Q_t = 0$
where $K_t$ is the stock of capital. The second is the capital accumulation constraint,
which, following the literature, we write as
(4) $\dot K_t = I_t - \delta K_t$
where $\delta$ denotes the rate of depreciation ($0 < \delta < 1$).
(8) $\dot\mu_t = -\partial H/\partial K_t$
by explicitly introducing the constraint $\lim_{t\to\infty} K_t \ge 0$. But the analysis and the
conclusion would be the same as the present one. Condition (9) signifies the
maximization of the Hamiltonian H with respect to the control variables $L_t$ and
$I_t$. Setting $L_t = \hat L_t$ for all t, condition (9) implies
(11) $(\mu_t - e^{-rt}q)\hat I_t \ge (\mu_t - e^{-rt}q)I_t$ for all $I_t$ with $I_{min} \le I_t \le I_{max}$
In other words,
(12-a) $\hat I_t = I_{max}$ if $\mu_t > e^{-rt}q$
(12-b) $\hat I_t = I_{min}$ if $\mu_t < e^{-rt}q$
(12-c) $\hat I_t \in [I_{min}, I_{max}]$ if $\mu_t = e^{-rt}q$
Now the significance of the constraint (5), $I_{min} \le I_t \le I_{max}$, should be apparent.
If there are no such bounds, $\hat I_t = \infty$ when $\mu_t > e^{-rt}q$ and $\hat I_t = -\infty$ when
$\mu_t < e^{-rt}q$; neither of these conditions makes much sense, either economically
or mathematically. The usual calculus of variations approach as seen in
Jorgenson (for example, [25], [27], [28], and so on) is thus rather inappropriate
for the present case. Note that the solution described in (12) arises from the fact
that the terms inside the objective integral and the constraint function are both
linear in $I_t$. This "bang-bang" characteristic described in (12) is ignored in the
literature.
By setting $I_t = \hat I_t$ for all t in (9), we obtain8
(13) $pQ_L - w \le 0$ and $(pQ_L - w)\hat L_t = 0$
where $Q_L \equiv \partial Q/\partial L_t$ evaluated at $(\hat L_t, \hat K_t)$. Assuming $\hat L_t > 0$ for all t, we obtain
(14) $Q_L = \frac{w}{p}$
which is the familiar marginal productivity rule with respect to labor.
Conditions (7), (8), and (10) are respectively rewritten as9
(15) $\dot{\hat K}_t = \hat I_t - \delta\hat K_t$
(16) $\dot\lambda_t = (r + \delta)\lambda_t - pQ_K$
where
and
(20) $\hat L_t = L(\hat K_t, p)$
and write $pQ_K(\hat L_t, \hat K_t) = pQ_K[L(\hat K_t, p), \hat K_t] \equiv \phi(\hat K_t)$; that is,
0(pr)
(21) QK
(22) 9 = 0 (K 8
From Figure 8.17, it is clear that the only path that is eligible13 is the one that approaches
K*, which is described by the heavy lines. Mathematically,'
(23-a) $\dot{\hat K}_t = I_{max} - \delta\hat K_t$ if $K_0 < K^*$
(23-b) $\dot{\hat K}_t = I_{min} - \delta\hat K_t$ if $K_0 > K^*$
(23-c) $\hat K_t = K^*$ if $K_0 = K^*$
From (23-a), $\hat K_t$ is explicitly obtained for $K_0 < K^*$ as follows:
K,=0
and
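$K^* = K_0 e^{-\delta T^{**}}$
In other words,15
(25-a) $T^* = \frac{1}{\delta}\log\frac{I_{max} - \delta K_0}{I_{max} - \delta K^*} > 0$
(25-b) $T^{**} = \frac{1}{\delta}\log\frac{K_0}{K^*} > 0$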
The optimal policy in (24) may be described as the one by which the firm
reaches K* as soon as possible, and after reaching K*, remains there. It is illus-
trated in Figure 8.18.
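The "as soon as possible" policy can be made concrete with a short computation. The sketch below integrates (23-a) or (23-b) in closed form, $K_t = I/\delta + (K_0 - I/\delta)e^{-\delta t}$ with $I$ equal to the relevant bound, and solves $K_t = K^*$ for the arrival time. The function names and all numbers are illustrative assumptions; with the default $I_{min} = 0$ the downward case reduces to the form of (25-b).

```python
# Illustrative sketch of the bang-bang adjustment path toward K*.
# Assumptions (not from the text): function names, the default I_min = 0,
# and the numbers in the example calls.
import math


def time_to_reach(K0, K_star, delta, I_max, I_min=0.0):
    """Time needed to reach K_star when investing at the relevant bound."""
    if K0 == K_star:
        return 0.0
    I = I_max if K0 < K_star else I_min
    num = I - delta * K0
    den = I - delta * K_star
    # K_star is reachable in finite time only if the ratio exceeds one.
    if den == 0.0 or num / den <= 1.0:
        raise ValueError("K_star cannot be reached in finite time with this bound")
    return math.log(num / den) / delta


def capital_path(t, K0, K_star, delta, I_max, I_min=0.0):
    """Capital stock at time t: move at the bound, then stay at K_star."""
    T = time_to_reach(K0, K_star, delta, I_max, I_min)
    if t >= T:
        return K_star
    I = I_max if K0 < K_star else I_min
    return I / delta + (K0 - I / delta) * math.exp(-delta * t)


T_up = time_to_reach(K0=50.0, K_star=100.0, delta=0.1, I_max=20.0)
T_down = time_to_reach(K0=200.0, K_star=100.0, delta=0.1, I_max=20.0)
print(f"time to build up: {T_up:.3f}, time to run down: {T_down:.3f}")
```

Raising `I_max` shortens the build-up time without ever making it zero, which matches the remark that only an "almost" instantaneous adjustment is possible in continuous time.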
We call K* the long-run desired stock of capital. If $\hat K_t$ = constant = K*, then
from (20), $\hat L_t$ = constant = L*. Recalling (14) and (22), the values of L* and K* are
determined by
(26-a) $Q_L(L^*, K^*) = \frac{w}{p}$
(26-b) $Q_K(L^*, K^*) = \frac{(r + \delta)q}{p}$
Here $(r + \delta)q$ signifies the rent for capital; hence (26-a) and (26-b) are the famous
marginal productivity rules. To see that $(r + \delta)q$ is the rent for capital, suppose
that one unit of the capital good is rented with rent c. The physical quantity of one
unit of capital good at time t will be decayed to $e^{-\delta t}$; hence the rental payment
will be $ce^{-\delta t}$. Therefore, if the capital good is rented for an infinite future, the
present value of rent over all future time will be
(29) $q = \int_0^\infty pQ_K^* e^{-(\delta + r)t}\,dt$
where $Q_K^* \equiv \partial Q/\partial K_t$ evaluated at (L*, K*). This is the famous Keynesian rule
of the marginal efficiency of capital which states that the demand for the stock of
capital is determined by the equality between the unit price of capital and the
present value of all future income from an additional unit of capital. Since
equations (26-b) and (29) are equivalent, the Keynesian rule of the marginal
efficiency of capital coincides with the neo-classical marginal productivity rule
if the firm is on the path (L*, K*). A similar observation with regard to the
equivalence between the marginal efficiency rule and the marginal productivity
rule can be made in terms of a discrete time model with depreciation by "sudden
death" (see Takayama [59]). Therefore, we disagree with the following view
which is taken by Jorgenson ([27], p. 152) as well as others: "Keynes' construction
of the demand function for investment must be dismissed as inconsistent with the
neoclassical theory of optimal capital accumulation."
Needless to say, Keynes does not explicitly impose some of the above
assumptions. In other words, in Keynes, prices may change over time, the firm
may not be a price taker, and the demand function that the firm (if monopolistic)
faces may change in the future. Therefore Keynes obtained a rule which is much
less explicit than (29); that is,
$q = \int_0^\infty R_t e^{-(\delta + r)t}\,dt$
where $R_t$ is the "expected rate of return" on capital at time t, that is, $R_t$ is the
expected revenue minus the expected operating cost (not including the deprecia-
tion cost) per additional unit of capital.
It is important to observe that the values of L* and K* are equal to the ones
determined by maximizing the "short-run" (or instantaneous) profit
pQ(L, K) - wL - cK
In other words, the "long-run" solution (L*, K*) for the dynamic optimization
problem is reduced to the one for the static optimization problem. The myopic
rule is optimal after all from the long-run viewpoint.
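The static problem can be solved explicitly for a simple technology, which makes the marginal productivity rules easy to check. The following is a sketch under assumptions that are not in the text: a Cobb-Douglas production function $Q = L^aK^b$ with $a + b < 1$ (so that the maximum is unique), the rental price $c = (r + \delta)q$, and illustrative numbers.

```python
# Illustrative sketch: long-run targets (L*, K*) from the static profit maximum.
# Assumptions (not from the text): Q(L, K) = L**a * K**b with a + b < 1,
# and the parameter values in the example calls.

def long_run_targets(p, w, q, r, delta, a, b):
    """Return (L_star, K_star) maximizing p*Q(L, K) - w*L - c*K, c = (r + delta)*q."""
    if not (a > 0 and b > 0 and a + b < 1):
        raise ValueError("need a > 0, b > 0 and a + b < 1")
    c = (r + delta) * q
    # Dividing the two first-order conditions gives K/L = (b/a) * (w/c).
    ratio = (b / a) * (w / c)
    # Substituting K = ratio * L into p*a*L**(a-1)*K**b = w and solving for L.
    L_star = (w / (p * a * ratio ** b)) ** (1.0 / (a + b - 1.0))
    return L_star, ratio * L_star


base = dict(p=1.0, w=1.0, q=1.0, r=0.05, delta=0.05, a=0.3, b=0.3)
L_star, K_star = long_run_targets(**base)
L_r, K_r = long_run_targets(**{**base, "r": 0.10})   # higher discount rate
L_w, K_w = long_run_targets(**{**base, "w": 1.20})   # higher wage rate
print(f"L* = {L_star:.4f}, K* = {K_star:.4f}, I* = {base['delta'] * K_star:.4f}")
```

In this example a higher discount rate or a higher wage lowers both targets, and replacement investment $\delta K^*$ moves with the capital target.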
With constant prices, the effect of changes in parameters such as p, w, q,
and r on L* and K* can easily be obtained from (26-a) and (26-b) by using the usual
comparative statics procedure. For this purpose, assume, for example, that
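$Q_{LL}Q_{KK} - Q_{LK}^2 > 0$, $Q_{LL} < 0$, $Q_{KK} < 0$, and $Q_{LK} > 0$. Then it can be established,
for example, that $\partial L^*/\partial r < 0$, $\partial K^*/\partial r < 0$, $\partial L^*/\partial w < 0$, and $\partial K^*/\partial w < 0$. Since
$I^* = \delta K^*$, we also obtain $\partial I^*/\partial r < 0$ and $\partial I^*/\partial w < 0$.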
Assuming that the values of K* and L* are uniquely determined by (26-a)
and (26-b), these values are constant as long as the prices (w, p, q, and r) are
constant. In this case, I* is also constant and equal to the amount of depreciation
$\delta K^*$. In other words, the firm's investment is constant after it reaches K*.
Jorgenson [27] obtained results in which $I_t$ changes over time. This is due to
the fact that he allowed the price of capital q to vary over time, while he in the
main assumed all other prices (p, w, and r) constant." It is not quite clear why he
allowed this asymmetry with regard to the expectation of future prices.
Finally, let us consider the case in which the production function $Q(L_t, K_t)$
is homogeneous of degree one (constant returns to scale). In this case, $Q_{LL}Q_{KK} - Q_{LK}^2 = 0$
for all $(L_t, K_t)$, and the above analysis should be modified. Note that,
in this constant returns to scale case, $Q_L(L_t, K_t)$ and $Q_K(L_t, K_t)$ are both homogeneous
of degree zero. Then in view of (14), $K_t/L_t$ is constant ($\equiv k$) for the fixed
value of w/p, as long as $Q_{LL} < 0$ and $Q_{KK} < 0$ for all $(L_t, K_t)$. Hence
$Q_K = Q_K(1, k)$ is also constant.18 Hence in view of (17), the $\lambda_t$ which satisfies (16) is
obtained as
PQK(1,k)
Figure 8.19. An Illustration of $\lambda > q$. (Axis labels: $pQ_K$ and $k \equiv K/L$.)
The time path of the optimal stock of capital is obtained from (15) and
(24') as
$\lim_{t\to\infty} \hat K_t = 0$ if $\lambda < q$
The time path of $\hat K_t$ for the constant returns to scale case is illustrated in Figure
8.20.
Figure 8.20. The Time Path of $\hat K_t$ (the Constant Returns to Scale Case).
We have remarked that $\lambda = q$ is the "knife edge" case. Although this may be
true for the behavior of an individual firm, the situation $\lambda \ne q$ cannot continue forever,
if every firm behaves under the rule described above and if every firm has
more or less the same production function (that is, the same technical efficiency).
For example, if $\lambda > q$, the total demand for capital for the market as a whole
would exceed its supply and the price of the capital good (q) would sooner or
later rise to the point at which $\lambda = q$. Notice that in the meantime, the demand
for labor would increase as $\hat K_t$ increases in order to keep k constant, which
might push up the real wage rate. Then the value of k would decrease to keep the
relation $Q_L = w/p$, which in turn would increase $\lambda$. In other words, the equilibrium
would be realized at an increased level of $\lambda$. In any case, if $\lambda = q$ is to be achieved
sooner or later, then $\lambda = q$ is not really a knife edge case. Notice also that under
$\lambda = q$, the marginal productivity rule with respect to capital, (26-b), is also
realized, although the volumes of optimal investment and capital stock become
indeterminate.
ment costs of capital? Eisner and Strotz [14], Lucas [45], and Gould [18]
suggested the following form to replace $q_tI_t$:
(31) $C_t = C(I_t)$
where $C(I_t) > 0$, $C'(I_t) > 0$, $C''(I_t) > 0$ for all $I_t > 0$, $C'(0) > 0$, and $C(0) = 0$.19
The condition $C''(I_t) > 0$ means that adjustment costs will be greater, on the
average, the greater the rate of investment.
On the other hand, Lucas [47] , viewing the adjustment cost as the internal
cost of the output foregone, introduced adjustment costs by altering the usual
production function (3) to the following form:
(32) Qt = Q(Lt, Kt, It)
where it is assumed that $\partial Q/\partial I_t < 0$ and $\partial^2 Q/\partial I_t^2 < 0$ for all $(L_t, K_t, I_t) > 0$.11
Clearly, the choice of the mathematical formulation may affect the con-
clusion. Such a choice would depend on empirical considerations and will vary
between industries. Here we simply adopt the form presented in (31).
The firm's maximization problem is now slightly altered by this modification;
$q_t$ is no longer exogenous to the problem. In the definition of H in (6), we should
replace $q_tI_t$ by $C(I_t)$. In other words, we rewrite the firm's problem as follows.
PROBLEM II: Choose the time path of $L_t$ and $I_t$ so as to
$I_{min} \le I_t \le I_{max}$
$L_t \ge 0$, $K_t \ge 0$
and a given value of $K_0$
Write the Hamiltonian H now as 21
Denote again the optimal path by $(\hat K_t, \hat L_t, \hat I_t)$. Assuming that the function Q is
concave and noting that $-C$ is (strictly) concave, the following conditions are
sufficient as well as necessary for an optimum:
(33) $d\hat K_t/dt = \hat I_t - \delta\hat K_t$
(35) $Q_L = \frac{w}{p}$
where $Q_L \equiv \partial Q/\partial L_t$ evaluated at $(\hat L_t, \hat K_t)$;
(36) $\lambda_t = C'(\hat I_t)$
and23
(37) $\lim_{t\to\infty} \lambda_t e^{-rt} = 0$
The values of $\lambda_t$ and $\hat K_t$ at the intersection of these two curves are denoted by $\lambda^*$
and K*, respectively, and are defined by the following equations:
(43-a)
(43-b)
In other words, an increase in the interest rate (resp. the price of output) lowers
(resp. increases) the "long-run" desired stock of capital (K*) and investment (I*).
There are three types of $(K_t, \lambda_t)$-paths in Figure 8.21: (1) the path in which
both $K_t$ and $\lambda_t$ decrease over time; (2) the path in which $\lambda_t \to \infty$; (3) the path in
which $K_t \to K^*$ as $t \to \infty$. Clearly only the third type of path is eligible.26
In the third path, $\hat K_t$ monotonically approaches K*, as indicated in Figure
8.21. Notice that at K*
(46) $\hat I_t = I^* = g(\lambda^*) = \delta K^*$
as we can see from (43-b). In other words, the optimal investment at $(K^*, \lambda^*)$ is
just equal to the amount of depreciation. Now observe that
(47) $\hat I_t - \delta\hat K_t = \hat I_t - I^* - (\delta\hat K_t - \delta K^*) = [g(\lambda_t) - g(\lambda^*)] - \delta(\hat K_t - K^*)$
In the third path, $\lambda_t$ approaches $\lambda^*$ as $\hat K_t$ approaches K*. Therefore, from (47),
$\hat I_t - \delta\hat K_t$ approaches zero as $\hat K_t$ approaches K*. This implies that it takes an infinite
amount of time to reach the steady state $(K^*, \lambda^*)$. We thus conclude the
following:
(48-a) $K_0 < K^*$: $d\hat K_t/dt > 0$, $d\hat I_t/dt < 0$ for all $t \ge 0$, $\lim_{t\to\infty} \hat K_t = K^*$, $\lim_{t\to\infty} \hat I_t = I^*$
In other words, the usual response function which appears in the literature cannot
hold for any t (see subsection d).
(52) $\lambda_t = \frac{c}{r + \delta} + Ae^{(r + \delta)t}$
where A is the integrating constant. In view of the right end-point condition (37), A
must be zero. Hence
(53) $\lambda_t = \frac{c}{r + \delta}$
which corresponds to (30). Then in view of (36) with $C'' > 0$, and hence in view
of (41), we obtain
(54) $\hat I_t = g\left[\frac{c}{r + \delta}\right] \equiv \bar I$ (= a positive constant)
for all t, where it is assumed that
(55) $\frac{c}{r + \delta} > C'(0)$
Therefore we conclude that the optimal investment is constant and positive for all
t. Notice that we obtained this result without assuming the quadratic approxima-
tion of the adjustment cost function which is the usual convention in the liter-
ature.28
Since the function g is monotone increasing, it is easy to conclude
(56)
a
r <0'>0'as <o
in view of (54).
The path of the optimal capital stock, $\hat K_t$, is easily obtained from (33) as
(57) $d\hat K_t/dt = \bar I - \delta\hat K_t$
Notice that the strict convexity of the function C (that is, $C'' > 0$) guarantees
the uniqueness of the optimal investment as a result of (54), and thus guarantees
the uniqueness of $\hat K_t$ from (58), which in turn implies the uniqueness of $\hat L_t$ since
$\hat K_t/\hat L_t = k$ = constant. In other words, the strict convexity guarantees the uniqueness
of the solution $(\hat K_t, \hat L_t, \hat I_t)$ (almost everywhere), in spite of the fact that Q is
homogeneous of degree one, and hence not strictly concave.
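The constant-investment result of (53) and (54) and the capital path implied by (57) can be traced numerically. The sketch below is illustrative only. It assumes a quadratic adjustment cost $C(I) = \alpha I + \beta I^2/2$, chosen purely for convenience since the argument above does not require it, so that $g$, the inverse of $C'$, is $g(\lambda) = (\lambda - \alpha)/\beta$. The constant `c` is taken as given, and all numbers are hypothetical.

```python
# Illustrative sketch: constant optimal investment and the implied capital path.
# Assumptions (not from the text): C(I) = alpha*I + beta*I**2/2, so that
# C'(I) = alpha + beta*I and g(lam) = (lam - alpha)/beta; example numbers.
import math


def constant_investment(c, r, delta, alpha, beta):
    """I_bar = g(c/(r + delta)), valid when c/(r + delta) > C'(0) = alpha."""
    lam = c / (r + delta)                 # shadow price of capital, as in (53)
    if lam <= alpha:
        raise ValueError("condition (55) fails: c/(r + delta) must exceed C'(0)")
    return (lam - alpha) / beta           # investment, as in (54)


def crs_capital_path(t, K0, I_bar, delta):
    """Solution of dK/dt = I_bar - delta*K with K(0) = K0."""
    K_bar = I_bar / delta
    return K_bar + (K0 - K_bar) * math.exp(-delta * t)


I_bar = constant_investment(c=0.3, r=0.05, delta=0.1, alpha=1.0, beta=0.5)
K_bar = I_bar / 0.1
print(f"I_bar = {I_bar:.3f}, K_bar = {K_bar:.3f}")
print([round(crs_capital_path(t, 5.0, I_bar, 0.1), 3) for t in (0, 10, 50)])
```

The path approaches $\bar I/\delta$ at the rate $\delta$, so the speed of adjustment here is the depreciation rate and not a free behavioral parameter.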
d. SOME REMARKS
Lerner-Haavelmo observation. If all the prices, $p_t$, $w_t$, $q_t$, and r, are constant, then
the desired stock of capital K*, whose value depends on the parameters p, w, q, and
r, would be constant. In Jorgenson's study [27], these prices are not constant;
hence the desired stock of capital is not constant, and is hence denoted by $K_t^*$.
Jorgenson and his associates then insisted that the actual stock of capital Kt is
in general different from the desired level. In "reality," the firm cannot adjust its
stock of capital to the desired level instantaneously and frictionlessly. Hence the
investment demand at time t will be determined in such a way as to accommodate
this adjustment.
Suppose that the response mechanism is represented in such a way that
actual capital is a weighted average of all past levels of desired capital with geo-
metrically declining weight. That is
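(61) $K_t = \sum_{\tau=1}^{\infty} a_\tau K^*_{t-\tau}$
where29
(63) $K_t = \int_0^\infty a_\tau K^*_{t-\tau}\,d\tau$
where
(64) $\int_0^\infty a_\tau\,d\tau = 1$ and $a_\tau \ge 0$ for all $\tau \ge 0$
If the lag function $a_\tau$ is a simple exponential lag
(65) $a_\tau = ae^{-a\tau}$, $a > 0$
then it is well known that (63) can be rewritten as30
(66) $\dot K_t = a(K^*_t - K_t)$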
where a may be termed the response parameter or the speed of adjustment.
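The passage from the distributed lag (63) with exponential weights (65) to the response function (66) can be checked numerically. The sketch below is only an illustration: the desired-capital path $K^*_t = 10 + \sin t$, the midpoint integration rule, and the truncation of the infinite integral are all assumptions made for the example.

```python
# Illustrative sketch: an exponential distributed lag implies dK/dt = a*(K* - K).
# Assumptions (not from the text): the desired path 10 + sin(t), the midpoint
# rule, and truncating the integral at tail/a.
import math


def lagged_capital(t, desired, a, n_steps=20000, tail=40.0):
    """Approximate K_t = integral over tau >= 0 of a*exp(-a*tau)*desired(t - tau)."""
    upper = tail / a                      # weights beyond this point are negligible
    h = upper / n_steps
    total = 0.0
    for j in range(n_steps):
        tau = (j + 0.5) * h               # midpoint of the j-th subinterval
        total += a * math.exp(-a * tau) * desired(t - tau)
    return total * h


def desired_path(s):
    return 10.0 + math.sin(s)


a, t, dt = 0.5, 3.0, 1e-3
K_t = lagged_capital(t, desired_path, a)
dK_dt = (lagged_capital(t + dt, desired_path, a)
         - lagged_capital(t - dt, desired_path, a)) / (2 * dt)
print(f"dK/dt = {dK_dt:.5f}, a*(K* - K) = {a * (desired_path(t) - K_t):.5f}")
```

The two printed numbers agree up to discretization error, which is the content of (66): with exponential weights, the whole history of desired capital is summarized by the current gap between desired and actual capital.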
It is possible to select a lag function other than the above simple exponential
lag, and the specification of the lag function would, in general, affect the results.
For example, in their empirical studies of comparing alternative theories of
investment, Jorgenson and Siebert ([29], p. 688) remarked the following:
Misspecification of the lag distribution for a given theory of investment
behavior may bias the results of our comparison. Accordingly, we choose the
best lag distribution for each alternative specification of desired capital from
among the class of general Pascal distributed lag functions.31
It goes without saying that the investment function ... which is derived
from logically completely inconsistent assumptions, cannot have any
economic and empirical significance, no matter how good the statistical
fit of that function may be. (Translation is mine.)
In other words, a response mechanism such as (61) or (66) affects the profits, and
hence affects the desired stock of capital.
The introduction of adjustment costs by such authors as Treadway [62],
Lucas [45] and [47], and Gould [18], as described in subsection c, is, no doubt,
based on serious skepticism about the above difficulty in the studies by Jorgenson
and his associates.
However, the result they obtained is remarkable. As we can observe from
our discussion in subsection c [especially from equations (57) and (59)], if the
production function exhibits constant returns to scale with respect to labor and
capital and if the adjustment cost function is strictly convex with respect to
investment ($I_t$), then we obtain a result which states that $I_t = I$ (constant) for all
$t$ and
(67) $\dfrac{dK_t}{dt} = \delta\,(K^* - K_t)$
where $\delta$ is the rate of depreciation. From (67) it is concluded that it is optimal for
the firm to adjust in the way described by (66) where $a = \delta$.
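To see what the adjustment rule (67) implies for the time path of capital, the sketch below integrates it with the explicit Euler method and compares the result with the closed-form solution $K_t = K^* + (K_0 - K^*)e^{-\delta t}$. The numerical values of $K_0$, $K^*$, and $\delta$ are hypothetical, as are the function names; only the differential equation itself comes from the text.

```python
import math


def simulate_adjustment(k0, k_star, delta, t_end, steps=100_000):
    """Integrate dK/dt = delta*(K* - K) from K_0 = k0 by the explicit Euler method."""
    dt = t_end / steps
    k = k0
    for _ in range(steps):
        k += dt * delta * (k_star - k)
    return k


def closed_form_adjustment(k0, k_star, delta, t):
    """Exact solution of the same equation: K_t = K* + (K_0 - K*) exp(-delta*t)."""
    return k_star + (k0 - k_star) * math.exp(-delta * t)


if __name__ == "__main__":
    k0, k_star, delta = 50.0, 100.0, 0.1  # hypothetical values
    for t in (5.0, 20.0, 60.0):
        numeric = simulate_adjustment(k0, k_star, delta, t)
        exact = closed_form_adjustment(k0, k_star, delta, t)
        print(f"t = {t:5.1f}   Euler = {numeric:9.4f}   exact = {exact:9.4f}")
```

The gap between actual and desired capital shrinks at the proportional rate $\delta$, so under these assumptions the speed of adjustment is governed entirely by the rate of depreciation.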
Now we should comment on this result. As remarked earlier, this result
depends crucially on the assumption of constant returns to scale. As we showed
in subsection c, we cannot obtain a nice response function such as (66) for the case
of nonconstant returns to scale.
Secondly, one crucial assumption in the formulation by Gould, Treadway,
Lucas, and so on, is that there is no lag between the investment decision and the
realization of the decision. This is in sharp contrast to the usual rationalization
of a time lag. For example, Jorgenson made the following (empirically) very sound
remark ([25], p. 2):
... we divide the investment process into separate stages. The first stage of the
process is a change in the demand for capital services. Subsequent to an
alteration in demand for capital services, architectural and engineering plans
must be drawn up, cost estimates prepared, funds appropriated and funds
committed through the issuing of orders for equipment or the letting of
contracts for construction. Actual investment expenditure is the final stage
in the investment process. Only after a given investment project has passed
through each of the intermediate stages can actual investment expenditure
take place.
(68) $\dfrac{I_t}{K_t} = \pi\!\left(\dfrac{\dot{K}_t}{K_t}\right)$
where $\pi' > 0$ and $\pi'' > 0$; $\pi'' > 0$ signifies that the marginal effect of investment
upon the growth process is diminishing.33 The $\pi$-schedule depends on how the
managerial and administrative resources are accumulated in the course of the
growth of the firm. Notice that $K_t$ in (68) is not the usual physical capital stock,
but rather it incorporates the scarcity of managerial and administrative resources.
In essence, the Penrose effect seems to treat the managerial and administra-
tive resources as another factor of production. It is assumed that such resources
are directly related to the accumulated physical stock of capital. Moreover,
instead of directly relating $\dot{K}_t$ to $I_t$, Uzawa's Penrose function relates $\dot{K}_t/K_t$ to
$I_t/K_t$. These conventions enable him to avoid the problem of how to measure
managerial and administrative resources,34 to simplify mathematical deduction
considerably, and thus to obtain some definite conclusions.
The Penrose function is illustrated in Figure 8.24 and it is called the Penrose
curve.
[Figure 8.24. The Penrose curve]
depreciation relation
(69) $\dot{K}_t = I_t - \delta K_t$
is already included. This relation may be considered as the case in which the
Penrose curve takes a special form:
(69') $\dfrac{I_t}{K_t} = \dfrac{\dot{K}_t}{K_t} + \delta$
The problem of the firm is again to maximize the present value of all future
profits subject to the constraints
Maximize: $\int_0^\infty e^{-rt}\,[\,pQ(L_t, K_t) - wL_t - q\,\pi(Z_t/K_t)\,K_t\,]\,dt$
Subject to: $\dot{K}_t = Z_t$
$Z_{\min} \le Z_t \le Z_{\max}$
$K_t \ge 0$, $L_t \ge 0$
and a given value of $K_0$
Here $K_t$ is the state variable and $Z_t$ and $L_t$ are the control variables. The
introduction of a new control variable like $Z_t$ is a standard practice in control theory.
We omit the discussion of the procedure of obtaining the solution to this
problem, for it is already described in Uzawa ([64] and [66]), and the reader,
if he so wishes, should easily be able to obtain it by himself.39
Assuming the linear homogeneity40 of the function Q, Uzawa obtained the
following results:
(70-a) $z_t$ = constant ($= z$) for all $t$
(70-b) $k_t$ = constant ($= k$) for all $t$
(70-c) $\lambda_t$ = constant ($= \lambda$) for all $t$
where $z_t \equiv Z_t/K_t$ and $k_t = K_t/L_t$. The value of $k$ is determined by $Q_L = w/p$ as
a result of the homogeneity assumption. The value of $z$ is obtained as a solution of
the problem of maximizing
(71) $v = \dfrac{c - q\,\pi(z)}{r - z}$, where $c \equiv pQ_K$
with respect to $z$.41
Since $\dot{K}_t/K_t = z_t$ and $I_t/K_t = \pi(z_t)$, the above solution implies
(72) $K_t = K_0 e^{zt}$
(73) $I_t = I_0 e^{zt}$
where $I_0 \equiv K_0 \pi(z)$, and
(74) $L_t = L_0 e^{zt}$
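The determination of $z$ from (71) and the resulting paths (72) and (73) can be illustrated with a short Python sketch. Everything specific in it is an assumption made for the example: the Penrose function is taken to be $\pi(z) = z + z^2$ (which satisfies $\pi(0) = 0$, $\pi'(0) = 1$, $\pi'' > 0$), and the values of $c$, $q$, $r$, and $K_0$ are hypothetical. The growth rate is found by bisection on the first-order condition of (71), $c - q\pi(z) = q\pi'(z)(r - z)$.

```python
import math


def penrose(z):
    """Hypothetical Penrose function: pi(0) = 0, pi'(0) = 1, pi'' > 0."""
    return z + z * z


def penrose_slope(z):
    """Derivative of the hypothetical Penrose function above."""
    return 1.0 + 2.0 * z


def value_per_unit_capital(z, c, q, r):
    """Objective (71): v = (c - q*pi(z)) / (r - z), defined for z < r."""
    return (c - q * penrose(z)) / (r - z)


def optimal_growth_rate(c, q, r, tol=1e-12):
    """Bisection on the first-order condition c - q*pi(z) - q*pi'(z)*(r - z) = 0.

    With pi'' > 0 the left-hand side is decreasing in z on [0, r], so a sign
    change on that interval brackets the unique maximizer of (71).
    """
    def foc(z):
        return c - q * penrose(z) - q * penrose_slope(z) * (r - z)

    lo, hi = 0.0, r
    if foc(lo) <= 0.0:
        return 0.0  # corner case: growth does not pay at these parameters
    if foc(hi) >= 0.0:
        raise ValueError("no interior maximum below r for these parameters")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def capital_path(k0, z, t):
    """Equation (72): K_t = K_0 * exp(z*t)."""
    return k0 * math.exp(z * t)


def investment_path(k0, z, t):
    """Equation (73) with I_0 = K_0 * pi(z)."""
    return k0 * penrose(z) * math.exp(z * t)


if __name__ == "__main__":
    c, q, r, k0 = 0.105, 1.0, 0.10, 100.0  # hypothetical values
    z = optimal_growth_rate(c, q, r)
    print(f"z = {z:.6f}, v = {value_per_unit_capital(z, c, q, r):.6f}")
    for t in (0.0, 10.0, 20.0):
        print(t, round(capital_path(k0, z, t), 3), round(investment_path(k0, z, t), 3))
```

Because $c$, $q$, and $r$ are treated as constants, the maximizing $z$ is the same at every date, which is why capital and investment grow at a common exponential rate in this sketch.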
obviously funded from saving. Thus there is a possible gap between the demand
for and supply of capital. Uzawa did realize these points. Assuming that the supply
of labor grows at a constant rate and devising a theory of saving, he concluded
that there is a steady state solution in the economy which is dynamically stable.
It is assumed that the firm makes its decision on the assumption that the current
interest rate and prices continue to remain the same for all future times. Clearly,
the firm realizes its expectations are wrong in every case, for these prices and
the interest rate change continuously over time. It is assumed that the firm never
learns from past experience, which, to me, is highly dubious.
In this connection, the following confession by Uzawa with regard to the
weakness of his theory seems to be pertinent ([66], pp. 651-652):
However, the most serious limitation of the present analysis is the hypothesis
that the aggregative behavior of each of two major sectors of the national
economy may be explained in terms of the representative unit which behaves
itself in a way similar to each individual unit. It might be less objectionable for
a static or stationary analysis, but an economic model which is purportedly
analyzing the mechanism of a growing economy would be deemed question-
able if enough attention were not paid to the process of aggregation.
Needless to say, if the economy is in the stationary state, all the prices and the
interest rate will presumably be constant; hence there should be no problem
of aggregation nor should there be a problem of not learning from past experience.
However, in Uzawa's investment theory, the firm wishes to expand exponentially
without any limit; hence it cannot be a static theory. In other words, it seems
that his theory also contains an inconsistency.
Some Extensions
casting nor does he have to learn from past mistakes. This perfect foresight
assumption is thus useful in building a theoretically consistent model. But it is
obviously an unrealistic assumption to make for a theory intended to explain the
behavior of the firm. The firm does make mistakes with regard to the forecasting
of future prices; hence it has to revise its program from time to time.
An obvious generalization here is to introduce an element of risk into fore-
casting and incorporate the firm's utility function with regard to risks into the
analysis. Commenting on Jorgenson, Karl Borch has made the following remark
([24] p. 273):
It can then be immediately seen that nothing basic is changed mathematically. The
function $pQ(L_t, K_t)$ is replaced by $G(L_t, K_t)$. If we replace the concavity assumption
of $Q$ with that of $G$, our analysis in subsections b and c follows almost word
for word. Notice that the choice of the path $(L_t, K_t)$ implies the choice of the
output path $Q_t$, which in turn implies the choice of the price by $p_t = p(Q_t)$.
In the case of no adjustment costs, the firm will adjust to the long-run
optimal stock of capital and labor ($K^*$ and $L^*$) very quickly as long as $I_{\max}$ is
large enough or $I_{\min}$ is small enough. Both $K^*$ and $L^*$ are determined by the
famous rule that marginal revenue equals marginal cost or, equivalently, that the
marginal revenue product equals the factor price. In other words, $L^*$ and $K^*$
are determined by
(77-a) $G_L(L^*, K^*) = w$
(77-b) $G_K(L^*, K^*) = (r + \delta)q$
where $G_L = \partial G/\partial L$ and $G_K = \partial G/\partial K$. Hence the optimal price $p^*$ is given by
FOOTNOTES
1. This note is a revised version of my lecture given at Purdue University, May 1971,
which is recorded in A. Takayama, "The Neoclassical Theory of Investment and
Adjustment Costs," Krannert Institute Paper, no. 349, Purdue University, April 1972.
2. A similar attempt is made by Arrow [7], for example, in the model in which labor
is not explicitly introduced.
3. It is assumed that r is constant for all t. It can be interpreted as the current interest
rate, if the firm can borrow or lend any amount at this rate r.
4. This assumption on depreciation contrasts, in the present analysis, to the so-called
"depreciation by sudden death" assumption, which means that the capital goods last
for a finite period of time with a constant efficiency and then "die" suddenly at the
end of that period (that is, they are scrapped with zero value). The assumption of
depreciation by evaporation ("radioactive" decay) is a very common assumption
in the literature dealing with the continuous time model.
5. In the classical stationary state where everything is repeated over and over again,
all prices are constant. Hence everybody expects all prices to remain constant and
their expectations are always correct (perfect foresight). This makes the classical
theory simple and elegant. But in a dynamic economy, the problem of future expecta-
tions causes very difficult problems.
16. If the capital good is rented for a finite period of time, say $T$, then the
intertemporal arbitrage relation is
sample residual variance. This procedure, developed by Theil, is appropriate only for
non-stochastic regressors, but J & S have It, as regressors!" By "J & S," Swanson
seems to have been referring to Jorgenson and Stephenson [32].
32. See, for example, Uzawa [64], [65], [66], [67], and [68].
33. A powerful outcome of the assumption $\pi'' > 0$ is that the uniqueness of the optimal
path of capital accumulation $K_t$ is obtained, even if the production function is
homogeneous of degree one. See Uzawa [66], p. 642. In addition to the above
restrictions on the function $\pi$, Uzawa ([64], p. 4; [66], p. 641) also imposed the
following conditions: $\pi(0) = 0$ and $\pi'(0) = 1$. These conditions are assumed to hold
for mathematical convenience and do not impair the generality of his argument.
34. The crucial feature of managerial and administrative resources is that they are not
usually bought and sold in the market; hence the market prices of these resources
usually do not exist. This then creates the problem of how to measure them.
35. In other words, Uzawa claims that the crucial feature of the "fixity of capital" is in
the limitational character of administrative and managerial resources.
36. Notice that, in Problems I and II, there is really no problem of how to measure the
units of capital, output, labor, and cost of adjustment. In the profit function, $pQ_t$,
$wL_t$, $qI_t$, and $C(I_t)$ all enter in dollar terms. Notice also that as long as all prices are
constant, the maximization of $\int_0^\infty e^{-rt}[pQ_t - wL_t - qI_t]\,dt$ will give the same result
as that of the maximization of the present value of real profit
$\int_0^\infty e^{-rt}[Q_t - wL_t/p - qI_t/p]\,dt$.
37. This does not deny the importance of such a treatment. In fact, this is one of the
most important features of Uzawa's theory, reflecting clearly the influence of Joan
Robinson.
38. Uzawa maximizes the present value of real profit
$\int_0^\infty e^{-rt}[Q(L_t, K_t) - wL_t/p - \pi(Z_t/K_t)K_t]\,dt$, where his convention with regard to
the measurement of units (mathematically) amounts to setting $p = q$.
39. Assuming the concavity of $Q$ and an interior solution, the following conditions are
necessary and sufficient for an optimum: $\dot{K}_t = Z_t$; $\dot{\lambda}_t = r\lambda_t - [pQ_K - q(\pi - \pi' Z_t/K_t)]$;
$Q_L = w/p$; $\lambda_t = q\pi'$; $\lim_{t\to\infty} e^{-rt}\lambda_t = 0$. Here the optimal path is again
denoted by $(K_t, L_t, Z_t)$. Also $\pi'$ denotes $d\pi/dz_t$ evaluated at $\hat{z}_t$, where $z_t \equiv Z_t/K_t$.
Similarly, $\pi = \pi(\hat{z}_t)$, $Q_K = Q_K(L_t, K_t)$ and $Q_L = Q_L(L_t, K_t)$.
40. The reader can easily analyze the case of nonconstant returns to scale. The analysis
will be analogous to the one in subsection c. The results are different from Uzawa's.
41. The range of the variation of $z$ is assumed to be restricted to $0 < z < r$ and
$0 < q\pi(z) \le c$. It is shown then that the maximizing value $\hat{z}$ satisfies $0 < \hat{z} < r$.
42. There is a more difficult problem involved. The firm may not be able to ascertain the
probability distribution of its future prices or payments. Recall the famous Knightian
distinction between risk and uncertainty.
43. It is assumed that there is no monopsony in the labor market. In the case of no
adjustment costs, it is further assumed that there is no monopsony in the capital
market.
44. Compare (77) with equations (26-a) and (26-b).
REFERENCES
1. Alchian, A. A., "The Rate of Interest, Fisher's Rate of Return Over Costs and
Keynes' Internal Rate of Return," in The Management of Corporate Capital, ed. by
E. Solomon, Glencoe, Ill., Free Press, 1959.
and Growth, Papers in Honour of Sir John Hicks, ed. by J. N. Wolfe, Edinburgh,
Edinburgh University Press, 1968.
8. Arrow, K. J., Beckman, M. J., and Karlin, S., "The Optimal Expansion of the
Capacity of a Firm," in Studies in the Mathematical Theory of Inventory and Produc-
tion, ed. by K. J. Arrow, S. Karlin, and H. Scarf, Stanford, Calif., Stanford University
Press, 1958.
9. Arrow, K. J., and Kurz, M., Public Investment, the Rate of Return and Optimal Fiscal
Policy, Baltimore, Md., Johns Hopkins Press, 1970.
10. Bailey, M. J., "Formal Criteria for Investment Decisions," Journal of Political
Economy, 67, October 1959.
11. ______, National Income and the Price Level, 2nd ed., New York, McGraw-Hill, 1971,
Chap. 8.
12. Chenery, H. B., "Overcapacity and the Acceleration Principle," Econometrica, 20,
January 1952.
13. Eisner, R., "A Distributed Lag Investment Function," Econometrica, 28, January
1960.
14. Eisner, R., and Strotz, R. H., "Determinants of Business Investment," in Impacts of
Monetary Policy, by D. B. Suits et al., Englewood Cliffs, N.J., Prentice-Hall, 1963.
15. Eisner, R., and Nadiri, M. I., "Investment Behavior and the Neo-Classical Theory,"
Review of Economics and Statistics, L, August 1968.
16. Fisher, I. N., The Theory of Interest, New York, Macmillan, 1930.
17. Griliches, Z., "Distributed Lags: A Survey," Econometrica, 35, January 1967.
18. Gould, J. P., "Adjustment Costs in the Theory of Investment of the Firm," Review
of Economic Studies, XXXV, January 1968.
19. ______, "The Use of Endogenous Variables in Dynamic Models of Investment,"
24. Jorgenson, D. W., "Capital Theory and Investment Behavior," American Economic
metric Model of the United States, ed. by E. Kuh, G. Fromm, and L. R. Klein,
Amsterdam, North-Holland, 1965.
26. ______, "Rational Distributed Lag Functions," Econometrica, 34, January 1966.
43. Lerner, A. P., The Economics of Control; Principles of Welfare Economics, New York,
Macmillan, 1944.
44. ______, "On Some Recent Developments in Capital Theory," American Economic
Review, LV, May 1965.
45. Lucas, R. E., "Optimal Investment Policy and the Flexible Accelerator," Inter-
national Economic Review, 8, February 1967.
46. ______, "Tests of a Capital Theoretic Model of Technological Change," Review of
Economic Studies, XXXIV, April 1967.
47. ______, "Adjustment Costs and the Theory of Supply," Journal of Political Economy,
September 1971.
64. Uzawa, H., "The Penrose Effect and Optimum Growth," Economic Studies Quarterly,
XIX, March 1968.
65. ______, "A New Theory of the Investment Function," Nihon Keizai Shimbun, July
1969 (in Japanese).
66. ______, "Time Preference and the Penrose Effect in a Two Class Model of Economic
Growth," Journal of Political Economy, 77, July/August 1969.
67. ______, "Towards a Keynesian Model of Monetary Growth," IEA Conference on
Trade," presented at the Far Eastern Meeting of the Econometric Society, June
1970.
69. Viner, J., "Cost Curves and Supply Curves," Zeitschrift für Nationalökonomie, 1931
[reprinted in Readings in Price Theory, ed. by G. J. Stigler, and K. E. Boulding, with
"Supplementary Note (1950)," Chicago, Ill., Irwin, 1952].
70. Witte, J., "The Microfoundations of the Social Investment Function," Journal of
Political Economy, 71, October 1963.
71. Wright, J. F., "Notes on the Marginal Efficiency of Capital," Oxford Economic
Papers, n.s., 15, June 1963.
Name Index
Chakravarty, S., 450, 451, 452, 453, 456, Fenchel, W., 33, 65, 109, 248
464, 465 Fermat, P. de, 412
Chenery, H. B., 360 Filipov, A. F., 666
Chipman, J. S., 149, 181, 184, 364, 365, Fine, H. B., 167
397 Fisher, F. M., 388, 389, 390
Clark, P. G., 360 Fleming, W. H., 65, 85, 407
Cobb, C. W., 27, 220, 231, 338, 433, Fomin, S. V., 32, 420, 421
442, 563 Frechet, M., 77
Coddington, E. A., 306, 310 Frisch, R., 452
Cohn, A., 319 Frobenius, G., 364, 372, 375
Cournot, A. A., 159 Fromovitz, S., 100, 101
Cullum, C., 61, 420 Fukuchi, T., 360
Fukuoka, M., 537
Dantzig, G. B., 55, 132 Furuya, H., 573
Dasgupta, A., 464
Davis, D. G., 301 Gale, D., xxi, 44, 101, 174, 261, 262,
Deardorf, A. V., 438 263, 281, 335, 338, 446, 464, 479,
Debreu, G., 38, 126, 174, 180, 181, 184, 485, 486, 487, 494, 497, 498, 527,
186, 201, 202, 205, 206, 211, 213, 561, 575, 576, 581, 584, 587, 588,
215, 224, 225, 226, 229, 235, 248, 590, 594, 595
254, 255, 261, 263, 264, 265, 275, Galilei, Galileo, xv, 411
276, 277, 292, 364, 374, 379, 578 Gambill, R. A., 666
Desrousseaux, J., 441 Gantmacher, F. R., 119, 310, 364, 374,
Deusenberry, J., 442 378, 379, 385
Diamond, R. A., 447 George, M. D., 696
Dobell, A. R., 538, 539, 551, 556 Georgescue-Roegen, N., 181, 390, 486,
Domar, E. D., 436, 459, 464, 503, 537, 487, 532
542, 543 Ghouila-Houri, A., 26, 33, 38, 41, 67, 68
Dorfman, R., 644 Gillies, D. B., 205
DOSSO (Dorfman, R., Samuelson, P. A., Glicksberg, I., 67, 69
Solow, R. M.), 145, 150, 258, 259, Glycopantis, D., 379, 492
261, 276, 366, 487, 509, 516, 527, Goldman, A. J., 44, 132
537, 559, 567 Goodwin, R. M., 364, 365, 366, 450,
Douglas, A., 61
451, 452, 453, 456, 464, 465, 505
Douglas, P. H., 27, 220, 231, 338, 433,
Gould, F. J., 100, 101
442, 563
Gould, J. P., 687, 688, 698, 705, 707,
Drabicki, J. Z., 464 714
Drandakis, E. M., 559, 573 Graham, F., 276
Eckstein, O., 464 Green, J., 231
Edgeworth, F. T., 187, 193, 198, 204, Griliches, Z., 714
205, 206, 215, 229, 344
Eisenberg, E., 231, 338 Haavelmo, T., 685, 686, 687
Eisner, R., 687, 698 Hadamard, J. S., 381
El-Hodiri, M. A., 61, 100, 113, 125, 291, Hadley, G., 625
602, 625 Haga, H., 499
Engel, E., 159, 500 Hahn, F. H., 229, 260, 317, 344, 345,
Enthoven, A. C., 110, 111, 123, 150 401, 442, 532, 533, 536, 559
Euler, L., 58, 61, 108, 136, 413, 414, Halkin, H., 625, 651
426, 428, 659 Halmos, P. R., 11, 14, 32
Evans, G. C., xxi, 100, 101, 412 Hamilton, W. R., 412, 603, 659
Harrod, R. F., 436, 459, 464, 503, 537,
Falb, P. L., 309 542, 543
Fan, K., 67, 92 Hausdorff, F., 30
Farkas, J., 42 Hawkins, D., 360, 383, 390, 583
Mangasarian, O. L., xxii, 61, 99, 100, 277, 281, 284, 292, 318, 324, 337,
101, 603, 644, 645, 660, 661, 713 364, 365, 371, 374, 377, 378, 381,
Markus, L., 666 383, 385, 390, 408, 498, 511, 512,
Marschak, T., 346 513, 515, 536, 559, 572, 573
Marshall, A., 174, 266, 280, 297, 298,
300, 301, 321, 330, 697 Occam, W., 350
Marton, B., 127 Ohyama, M., 338
Marx, K., 499, 538 Okamoto, T., 438, 440
Massera, J. L., 357 Olmstead, J. M. H., 167
Matthews, R. C. O., 442, 532, 533, 536, Otani, Y., 165, 166
559 Otsuki, M., 499
May, K. O., 183
McKenzie, L. W., 174, 235, 242, 248, Page, A., 85
263, 264, 265, 276, 277, 292, 347, Pareto, V., 187, 188, 190, 201, 208, 286
353, 364, 365, 381, 390, 402, 487, Pascal, B., 704, 714
493, 536, 559, 561, 565, 569, 573, Patinkin, D., 345, 346, 556
575, 577, 592 Peano, G., 305
McManus, M., 537 Peleg, B., 206, 229
McShane, E. J., 601 Penrose, E. T., 706
Meade, J. E., 464 Perron, O., 364, 365
Menger, Karl, 258, 275 Peterson, J. M., 408
Metzler, L. A., 316, 317, 364, 365, 366, Phelps, E. S., 441
397, 401 Planck, Max, xv
Mill, J. S., 142, 146, 147, 148, 330 Polak, E., 61, 420
Minkowski, H., 40, 41, 42 Ponstein, J., 127
Moore, E. H., 34 Pontryagin, L. S., xxii, 61, 420, 600,
Moore, J. C., 71, 72, 73, 74, 95, 186, 601, 603, 604, 611, 612, 613, 614,
201, 202, 254, 255, 277, 293 615, 616, 623, 626, 644, 651
Morgenstern, O., 38, 206, 487, 489, 497, Proctor, M. S., 713
498, 499, 501 Quirk, J., 202, 314, 319
Morishima, M., 55, 274, 301, 316, 337,
338, 344, 364, 366, 397, 398, 407, Rader, T., 180, 184, 232, 277, 291, 406,
408, 487, 489, 499, 500, 501, 505, 408, 713
530, 531, 532, 533, 537, 538, 539, Radner, R., 346, 487, 559, 561, 562,
530, 531, 532, 533, 537, 538, 539, 563, 564, 567, 568, 569, 570, 573, 587
545, 546, 559, 563, 569, 573 Rahman, M. A., 628, 629, 630, 631, 636
Mundell, R. A., 407 Ramsey, F. P., 412, 447, 448, 462, 463,
Murata, Y., 532, 538 466, 480, 575, 576, 577, 594
Muth, J. F., 512 Reiter, S., 346
Reymond, du Bois, 85
Nagatani, K., 556 Ricardo, D., 142, 144, 145, 186, 538
Negishi, T., 174, 285, 289, 291, 292, Richter, M. K., 235
293, 301, 317, 319, 325, 327, 344, Ritter, K., 92
345, 353, 401, 407 Robinson, J., 441, 443, 715
Nelson, R. R., 442 Roos, C. F., xxi, 412
Nerlove, M., 318 Rose, H., 556
Newman, P., 299, 301, 345 Rota, C. C., 310
Newton, Isaac, xv, 411 Routh, E. J., 310
Niehans, J., 442 Rudin, W., 23, 26, 32, 85
Niida, H., 537 Ruppert, R. W., 346
Nikaido, H., 23, 27, 28, 32, 33, 62, 131, Russak, B., 665
174, 229, 260, 261, 262, 263, 265, Russel, R. R., 346
Sen, A. K., 464 Tsukui, J., 504, 513, 516, 535, 536, 537,
Shapley, L., 68, 206, 211, 212, 227, 229, 559, 573
232 Tucker, A. W., 44, 56, 60, 61, 68, 87, 89,
Shell, K., 549, 625, 626 90, 91, 93, 100, 115, 132, 292, 582,
Shephard, R. W., 167 664
Shilov, G. Y., 420 Turnovsky, S. J., 573
Shubik, M., 205, 206, 211, 212, 227, Tychonoff, A. N., 29
229, 232
Sidrauski, M., 549 Uekawa, Y., 284
Siebert, C. D., 704, 714 Uzawa, H., 60, 68, 69, 72, 73, 74, 93, 94,
Simmons, G. F., 24, 25, 27, 29, 31, 32, 100, 103, 161, 174, 235, 277, 292,
33 312, 318, 330, 337, 344, 345, 353,
Simon, H. A., 360, 383, 384, 390 354, 355, 357, 464, 537, 582, 645,
Slater, M., 69, 93 658, 687, 688, 705, 706, 707, 708,
Slutsky, E. E., 158, 247 709, 710, 715
Smith, Adam, 186
Smith, H. L., 34 Valentine, F. A., 61, 107, 601
Solow, R. M., 364, 378, 388, 396, 436, van der Pol, B., 35, 357
438, 442, 464, 503, 504, 505, 506, van der Waerden, B. L., 85
512, 520, 521, 522, 523, 524, 525, Vind, K., 225, 231
527, 528, 529, 530, 531, 536, 537, Viner, J., 162, 713
545, 546, 549, 551 von Neumann, J., 38, 68, 206, 276, 337,
Sonnenschein, H. F., 182, 183, 248, 276 486, 488, 489, 490, 493, 494, 495,
Srinivasan, T. N., 464 497, 499, 500, 559, 562
Starrett, D. A., 539 von Weizsacker, C. C., 441, 450, 464, 594
Stein, J. L., 556
Steiner, P. O., 672, 677, 678, 683 Wald, Abraham, 256, 258, 259, 260, 275,
Stephenson, J. A., 715 276, 280, 283, 330, 486, 522
Stiglitz, J. E., 549 Walras, L., xvii, 174, 256, 257, 258, 274,
Stoleru, L. G., 464 275, 277, 280, 297, 298, 299, 300,
Stone, R., 464 301, 314, 340, 342, 345, 359, 395,
Strotz, R. H., 687, 698 396, 499, 522, 529, 538
Suits, D. B., 512 Weierstrass, K., 28, 29, 78, 85, 428
Suppes, P., xxiii Weintraub, R. E., 407
Sutherland, W. R., 485 Wicksell, K., 46, 47, 186, 408
Subject Index
set, 3 defined, 76
system, 14 Differential, 76, 77, 79, 424
Core, 170, 174, 204, 205, 206, 208, 209, Differential equations, 302, 315, 347,
211, 213, 214, 215, 216, 217, 221, 465, 548
223, 224, 225, 226, 227 continuation of solutions, 312, 336, 338
defined, 208 control function, 305
theory of, 204-234 existence of a solution, 305, 312
Corner solution, 134, 136, 626, 639 first order, 303
Correspondence, 3 forcing function, 305
Correspondence principle, 311, 345-346 global existence of a solution, 312
Cost minimization, 163 homogeneous, 305
Costate variables, 603 initial condition, 304
Costs of coalitions, 226, 228 initial value problem, 312
Countable set, 32 linear, 305
Countably infinite set, 32 local existence of a solution, 306, 312
Cournot aggregation property, 159 n-th order, 303
Cyclic matrix, 376, 377 ordinary, 311
solution of, 303, 304, 309
theory of, 302-313
uniqueness of a solution, 305, 312
Debreu's theorem, 180, 184 with constant coefficients, 305
Decentralized decision-making, 202, 342 Differentiation, 75-82, 422-423
Decomposable matrix, 370, 371, 375, Diminishing returns, 433, 435
377, 378, 379, 390 Diminishing returns to scale, 265, 266
completely, 370 Discontinuity of the first kind, 626
defined, 370 Discount factor, 446, 463, 594, 624
Degree of monopoly, 682, 684 negative, 463
Demand function, xix, 234, 235, 236, nonconstant, 620
237, 241, 271 Discounting the future, 447
continuity of, 235, 264 Discrete time, 468, 469, 470, 484
homogeneity of, 237, 271 Discrete topology, 20
single-valuedness of, 271 Disjoint set, 2
Demand theory, 234-249 Distance, 8 (see Metric)
Depreciation, 432, 442, 444, 470, 535, Euclidian, 8, 329
542, 547, 548, 688, 712 Distributed lag, 704
by evaporation, 689, 712 Distributive law, 5
by sudden death, 712 Divisibility, 48, 50, 179, 184, 188, 194,
Derivative, 75, 76 239, 490
Derived set, 21 Dominant diagonal (d.d.), xx, 359, 364,
Descriptive model, 497, 527 365, 380, 381, 382, 383, 389, 390,
Desired goods, 51 394, 407, 509
Diagrams defined, 381
use of, xxii, 189, 201 DOSSO, 150
Difference equations, 469, 504, 508, 510, Dual stability theorem, 520, 521, 537
512, 515, 519, 522, 601, 604, 606 Dynamic adjustment, 321, 400
Differentiable at a point, 75, 76, 77, 78, Dynamic Leontief model (system), xxi
79, 80, 85, 422 399, 503-540, 541, 542, 554
two definitions of, 85 basic output equation of, 508
Differentiable function, 76, 78, 80, 84, basic price equation of, 519
87, 88, 90, 94, 95, 103, 116, 117, capital coefficient matrix, 508
133, 137, 244, 245, 246, 247, 281, closed, 508, 519, 543
355, 410, 413, 414, 419, 422, 424, current input coefficient, 508
427, 429, 430, 662 Morishima's model, 527-535
Feasibility, 187, 189, 207, 286, 444, 522, cooperative, 206, 207, 229
561 garbage, 227, 232
Feasible set, 57 noncooperative, 229, 261
Fermat's principle, 412 side-payment, 206, 229
Final settlement, 205, 229 theory of, xxiii
Finite horizon, 623, 624 General equilibrium, 255, 256
Finite set, 32 analysis, xvi
First axiom of countability, 34, 254, 255 General production set, 45, 49, 51, 52
First derivative, 120, 422 (see also Production set)
First differential, 120, 422, 425 Generalized Hamiltonian, 647
First-order condition, 84, 87, 96, 124, Golden age path, 440, 442, 463, 472
152, 153, 154, 155, 156, 160 [see defined, 440
Condition (FOC)] golden age program, 584
First-order partial derivative, 98 Golden rule path, 441, 442, 463, 466,
First variation, 422 624
Fixed coefficient assumption, 503, 506, defined, 441
508, 527, 528 golden rule program, 584
Frechet differential, 77, 79 Goldman-Tucker, theorem, 71, 130, 132,
Free disposability, 50, 137, 174, 190, Goldman-Tucker theorem, 71, 130, 132,
201, 264, 266, 269, 290, 489, 490, stated, 132
562, 563, 564, 568 Gradient vector, 78, 84
Fritz John theorem, 106-107 Greatest lower bound (glb), 4, 483, 485
Frobenius root, 372, 375, 376, 378, 386, (see Infimum)
387, 388, 392, 393, 395, 396, 398, Gross complements, 406
399, 487, 509, 513, 515, 516, 536, Gross substitutability, xvii, 282, 283, 317,
537 318, 321, 325, 326, 327, 328, 329,
defined, 372, 375 330, 335, 338, 345, 355, 401, 404,
Frobenius theorem, xx, 359, 366, 406
367-380, 386, 493, 497, 509, 510, example of, 331-332
512, 536 in the finite incremental form, 330
theorem I, stated and proved, 372-375 weak, 401
theorem II, stated, 375 Guided missile problem, 600, 625
Function(s), 3, 32
affine, 15 Hahn-Negishi process, 344
composite, 3 Hahn-Negishi theorem, 402, 407
constant, 3 Halkin's counterexample, 625, 651
domain of, 3 Hamiltonian, 603, 614, 617, 618, 619,
linear affine, 15 625, 626, 629, 632, 639, 690, 698,
multivalued, 3 713
single-valued, 3, 12, 32 Hamiltonian system, 603, 612, 613, 614,
space of, 419, 421 616, 629, 632, 639, 640, 655
value of, 3 Hamilton's principle, 412
Functional, 12, 421 Harrod-Domar model, 459, 464, 503,
Functional analysis, 420 537, 542, 543
Functions of class C1, 78 Hausdorff space, 30, 31, 34, 255
defined, 30
G-closed function, 251, 255 Hawkins-Simon condition, 360, 361, 362,
g-constraint, 646, 647, 650, 668 363, 365, 380, 384, 385 [see Condi-
defined, 646 tion (H-S)]
Gain function, 253 defined, 360, 383-384
Gale-Nikaido lemma, 263, 277 Hawkins-Simon theorem, 383-384
Gale-Nikaido theorem, 408 Heine-Borel theorem, 28, 29, 30
Game stated, 29
balanced, 211, 230 Helly's theorem, 67
Hessian, 117, 121, 122, 123, 125, 136, 458, 459, 460, 463, 474, 482, 484,
138, 153, 248, 405, 406, 408 578, 589, 594, 623, 650
defined, 121 justifications of, 446
Hestenes' theorem, 651-660, 661, 674, Infinite set, 32
679, 683 Infinitesimal, 77, 423, 427
stated, 658-660 Information (cost), 227-228, 232
Hick's condition for stability, 314, 316, Injection, 3
317,365,401 Inner product, 7, 44
stated, 314 axiom of, 7
Hicks-Slutsky equation, xix, 135, 150, Euclidian, 7
156-160, 234, 235, 236, 242, 247 usual, 7
defined, 156 Inner product space, 7, 9
Hicksian matrix, 281, 282, 283, 315, Input-output analysis, 319, 360, 366, 394
316, 393, 401, 406, 408 Input-output matrix, 359, 360, 363, 364,
defined, 281, 315 370, 380, 394
Hicksian method of stability, 314 defined, 359
Hicksian week, 555 Insurance premium, 228
Homogeneity, 262, 284, 321, 326, 327, Integrability problem, 234, 235
328, 330, 355, 400, 402, 404 Interest rate, 517, 520, 527, 529, 531,
Homogeneous of degree one, 164, 218, 532, 538, 545, 549, 554, 555, 645,
246, 471, 547, 563 712
Homogeneous of degree zero, 237, 246, Interior, 24, 72, 85, 271, 562, 605, 619
247, 262, 271, 283, 321, 326, 400, defined, 24
500, 531 Interior point, 24, 75, 76, 77, 100, 150,
Hyperplane, 35, 37 239, 267, 292
bounding for a set, 36 defined, 24
separated by, 35, 36, 37 Interior point assumption, 215, 239, 243
strictly separated by, 36 Interior solution, 134, 138, 231, 619,
supporting, 36, 38 622, 626, 675, 680, 699, 715
Intertemporal arbitrage, 526, 527, 544,
Image, 3, 32 555, 694, 714
inverse, 3 Inverse demand function, 258, 261, 2776
imperfect stability, 314 Irreversibility of investment, 639, 671,
Implicit function theorem, 101, 157, 689
165, 275, 404, 407 Irreversibility of production, 48, 50, 264,
stated, 165, 407 266, 277
Imprimitive matrix, 376, 377 Isolated point, 21
Income, 171 Isomorphic correspondence, 14
elasticity, 159 Isoperimetric problem, 666
Increasing returns to scale, 50, 277
non-, 50 Jacobian, 79, 165, 275, 281, 658, 662
Indecomposable matrix, 365, 370, 371, defined, 79
372, 375, 377, 378, 379, 387, 388, Journal of SIAM Control, 625, 626
389, 392, 393, 395, 399, 509, 510,
513, 515 K-concavity, 71, 73
defined, 370 K-convexity, 72, 73
Indifference curve, 66, 179 Kakutani's fixed point theorem, 259,
Indifference set of z, 179 260, 263, 273, 276, 290
Indifferent-to-z set, 179 stated, 259
Indiscrete topology, 20 Kalecki-Kaldor model of business cycles,
Infimum, 4, 485 (see Greatest lower 357
bound) Karlin's condition, 69
Infinite horizon, 446, 447, 455, 456, Keynes-Ramsey rule, 466
Monopoly, 228, 299, 412, 417, 672, 679, Solow's path, 438, 549
711 Neo-classical theory of investment,
Moore-Smith convergence theory, 34 685-715
Moore's theorem, 72, 73 case of no adjustment costs, 688-697
Multicountry income flows, 397-398 case with adjustment costs, 697-703
Multiplier, 603, 612, 647, 651, 655, 656, complete monopoly, 711-712
658, 659, 661, 665, 679, 680, 683, critiques of Jorgenson's theory, 687,
714 705, 714-715
Multisector model of economic growth, lag distribution, 704
486-541 long-run desired stock of capital, 686,
dynamic Leontief model, 503-541 687, 693, 700, 703, 714
von Neumann model, 486-502 response function, 703-706
Multisector optimal growth model with response mechanism, 704
consumption, 575-599 response parameter, 704
attainable program (finite horizon), 578 speed of adjustment, 704
attainable program (infinite horizon), Uzawa on the Penrose effect, 706-710
578, 583, 587, 588, 589, 590, 594, Neo-turnpike theorem, xxi, 572
595 (Net) substitution term, 166, 246
competitive program, 577, 581, 582 No-worse-than-z set, 178, 238
eligible attainable program, 577, 589, Nonautonomous system, 304, 306, 348,
590, 592, 595, 598 352, 611, 613, 614
feasible program, 583 conditions for strong uniform global
finite horizon problem, 580-583 stability, 352
golden age program, 584 defined, 304
golden rule program, 584 equilibrium point (defined), 306
initial resource vector, 578 Nonlinear programming, xix, xx, 44, 55,
optimal (attainable) program (finite 56, 59, 61, 103, 285, 419, 420, 469,
horizon), 581, 582 470, 475, 494, 563, 580, 601, 603,
optimal (attainable) program (infinite 612, 613, 621, 683
horizon), 594-598 exposition of, 55-168
defined, 594 feasible point, 60
optimal stationary program (O.S.P.), maximand function, 61
583-587 (see also Optimal stationary maximum point, 61
program) objective function, 61
O.S.P. and eligibility, 587-594 optimal program, 61
stationary program, 584 optimal solution, 61
solution, 60, 61
Necessary condition, 4 uniqueness of solution, 60, 84, 112,
Necessary and sufficient condition, 4 127
Negative definite matrix, 118-123, 128, Nonnegative matrix, 364, 368, 372, 375,
166, 316, 406, 407 378, 385, 387, 388, 392, 509, 510,
defined, 118 515, (see also p. 121)
Negative prices, 135, 136, 269 defined, 368
Negative semidefinite matrix, 118-123, Nonnegative quasi-saddle point (QSP')
124, 125, 156, 158, 247, 248 condition, 88, 100, 649 [see Condi-
defined, 118 tion (QSP')]
Neo-classical aggregate growth model, Nonnormalized system, 318, 319, 325,
432-444, 546-554 402
attainable path, 436, 440 defined, 318
classical path, 440 Nonsatiation, 136, 195, 215, 264, 267,
feasible path, 435, 436, 440 268, 287, 289, 485, 585, 623, 653,
fundamental equation of, 435 670, 671
with money, 546-554, 556 defined, 195
with linear objective function, Perfect foresight, 520, 528, 531, 538,
638-643 544, 555, 689, 710, 711, 712
optimal (attainable) path [see Optimal and intertemporal arbitrage, 555
(attainable) path] myopic, 537, 554, 555
sensitivity analysis, 456-458, 480-484 Perfect stability, 314
solution path, 449, 455 Period, 398, 469, 487, 488, 561
Optimal stationary program (O.S.P.), of production, 47, 398, 470, 487, 488
576, 577, 583, 584, 586, 587, 588, Period analysis, 469, 470, 484
589, 590, 592, 593, 597 Permutation, 368
defined, 584 matrix, 368, 369, 376
price vector associated with, 585 Phase diagram, 323, 324, 448, 461, 622,
uniqueness of, 585, 586, 587, 597 623, 640, 641, 642, 671, 692
Optimum tariff argument, 150 orbit, 325
Ordering, 177 phase space, 325
Origin, 6 solution path, 325
Overtaking criterion, 450, 594 Phase diagram technique, 309, 321, 322,
Own rate of interest, 519, 554 325
applied to the stability of competitive
Parameterizability condition, 89 equilibrium, 321-325
Pareto optimum, xix, 113, 185, 186, 187, Piecewise continuous derivatives, 602,
188, 190, 192, 193, 195, 197, 198, 612, 613, 614, 616, 626, 648, 652,
201, 202, 204, 205, 206, 208, 219, 655, 659, 661, 665
220, 221, 229, 285, 286, 287, 288, Piecewise continuous function, 305, 602,
291, 342, 491, 497, 561, 580, 581, 605, 626, 639, 647, 648, 655, 659,
582 661
Arrow's anomalous case, 199 defined, 305, 602
and core, 208 Planning horizon, 445, 446, 458, 480,
defined, 190, 286 481, 482, 527, 672, 685
Koopmans' example, 200 Planning model, 497, 506, 527
Parity theorem (core), 230 Pointwise convergence, 430
stated, 214 Polar cone, 269
Partial derivative, 77-78, 305, 413, 426, negative, 53
512, 536 nonnegative, 23
defined, 77-78 normalized, 269
Partial equilibrium, 255, 256 Pontryagin's maximum principle,
Partial ordering, 177 600-627, 628, 649 (see Maximum
Partial preordering, 177 principle)
Partial quasi-ordering, 177
basic theorem, 602-603
proof of a simple case, 606-609
Pascal distributed lag function, 704 stated, 603
Pascal probability distribution, 714 various cases, 609-617
Passive investment, 433 Positive definite function, 352, 356
Path of pure accumulation, 465, 473, 482 defined, 356
Peak-load problems, 654, 667, 671-684 Positive definite matrix, 118-120, 122
firm peak case, 678 Positive matrix, 368
full capacity, 677, 678 Positive semidefinite matrix, 118-120,
nonpeak periods, 672 122
peak demand, 672 defined, 118
shifting peak case, 677 Possibility of inaction, 48, 289, 490
social welfare, 672-673 Preference ordering, 176-179, 180,
top-peak periods, 676 181-183, 184, 190, 194
Penrose curve, 707, 708 closed, 180
Penrose effect, 688, 706, 707, 708 connected, 180
Total quasi-ordering, 177, 180, 235, 265 Utility function, 109, 150, 179-181,
defined, 177 184, 188, 207, 229, 234, 264, 338
Transformation, 3 aggregate, 146, 581
Transversality condition, 603, 610, 611, defined, 179
616, 621, 622, 623, 624, 625, 626, existence of, 180
629, 650, 652, 655, 656, 660, 662, indirect, 162
664, 675, 680 Utility index, 179
at infinity, 623-625, 626, 650-651 Utility possibility set, 209
stated in the most general form, 660
Triangular inequality, 8, 9 Value-added, 363, 395
True dynamic stability, 315, 316 van der Pol equation, 351
Truncated production cone, 50, 51 Vector(s), 6, 15
Tsukui's lemma, 513, 516, 537 linear combination of, 10
stated, 513 nonnegative linear combination of, 17,
Turnpike property, 560 111
Turnpike theorem, xxi, 464, 527, Vector local maximum, 113
559-575 Vector maximum, 112-113, 115, 116,
feasible path, 561, 564 209, 289, 291, 564, 567
free disposability and optimality, defined, 112-113
563-567 problem, 73, 112-117, 128, 141, 142,
hop-skip-jumping, 572 144
intertemporal efficiency condition, 567 Vector space, 6, 420 (see Linear space)
optimal path, 563 Vector subspace, 6 (see Linear subspace)
strong, 572 von Neumann
value loss in, 568, 569 equilibrium, 497, 501, 562
weak, 572 facet, 592
Twice continuously differentiable func- growth factor, 562
tion, 79, 122, 123, 128, 152, 246 interest factor, 562
defined, 79 path, 379, 493, 560, 562, 573, 584
Twice differentiable function, 120, 121,
price, 562
414, 423, 424, 427, 428, 429
Twice differentiable at a point, 120, 422 process, 562
defined, 120 profit, 562
Tychonoff's theorem, 29, 484 quadruplet, 497
stated, 29 triplet, 562, 568, 570
value, 560
Unconstrained maximum, 75, 82-85, von Neumann (growth) model, xxi, 276,
123-124 486-502, 508, 560
Uncountable set, 32 dual problem, 494-495
Uniform convergence, 430 existence of maximum rate of expan-
Uniform norm, 430 sion, 492
Uniformly bounded, 350, 354 existence of price vector, 495
Upper bound, 4 independent subset, 497
interest factor, 495
Upper contour set, 66, 178, 267
irreducibility, 497-498
Upper inverse, 250, 255 maximum profit rate, 494
Upper semicontinuous function, rate of expansion, 491, 493, 496
239-242, 250-254, 259, 261, 262, regular, 497
263, 276, 293 von Neumann theorem, 495-496, 560
defined, 240, 251 von Neumann model with consumption
Upper semicontinuous at a point, 239, Marx-von Neumann model, 499
250 Morishima's treatment of, 499-501
Util, 206 Walras-von Neumann model, 499
von Neumann ray, 562, 563, 565, 567, Weierstrass theorem, 29, 53, 59, 288,
568, 569, 570, 572, 573, 576, 588 292, 373, 374, 470, 475, 484, 492,
uniqueness of, 562, 563, 569 495, 580
proved, 30
Walras-Cassel system, 258, 266, 275, 282 stated, 29
Walras' Law, 259, 262, 274, 318, 319, Welfare economics, 185, 187, 491
321, 326, 327, 328, 330, 335, 341, two classical propositions of, 185-204
400, 402 Wicksell's Law, 408
in the general sense, 276 Wong-Viner envelope theorem, 162
in the narrow sense, 276 World efficient frontier, 146
Walrasian "long run," 396 World production set, 144
Weak solvability condition, 383 Worse-than-z set, 179
Wealth, 171
Zorn's lemma, 11