
Mathematical Economics - Akira Takayama (Dryden Press, 1974)


MATHEMATICAL

ECONOMICS
AKIRA TAKAYAMA
Purdue University

THE DRYDEN PRESS

Hinsdale, Illinois
To my parents

Copyright © 1974 by The Dryden Press,

a division of Holt, Rinehart and Winston, Inc.
All Rights Reserved
Library of Congress Catalog Card Number: 72-79146
ISBN: 0-03-086653-7
Printed in the United States of America
1234 038 123456789
Contents

PREFACE iii
Some Frequently Used Notations xii
INTRODUCTION
A. Scope of the Book xv
B. Outline of the Book xviii

CHAPTER 0 PRELIMINARIES 1

A. Mathematical Preliminaries 1

a. Some Basic Concepts and Notations 1

b. R^n and Linear Space 5

c. Basis and Linear Functions 10
d. Convex Sets 16
e. A Little Topology 19

B. Separation Theorems 35
C. Activity Analysis and the General Production Set 45

CHAPTER 1 DEVELOPMENTS OF NONLINEAR PROGRAMMING 55

A. Introduction 55
B. Concave Programming-Saddle-Point Characterization 62
C. Differentiation and the Unconstrained Maximum Problem 75

a. Differentiation 75
b. Unconstrained Maximum 82

D. The Quasi-Saddle-Point Characterization 86

Appendix to Section D: A Further Note on the Arrow-Hurwicz-Uzawa
Theorem 102

E. Some Extensions 108

a. Quasi-Concave Programming 109


b. The Vector Maximum Problem 112

c. Quadratic Forms, Hessians, and Second-Order Conditions 117

F. Some Applications 129

a. Linear Programming 130

b. Consumption Theory 133
c. Production Theory 136
d. Activity Analysis 140
e. Ricardo's Theory of Comparative Advantage and Mill's
Problem 142

Appendix to Section F: Optimization and Comparative Statics-A

Local Theory 151

a. The Classical Theory of Optimization 151

b. Comparative Statics 154
c. The Second-Order Conditions and Comparative Statics 155
d. An Example: Hicks-Slutsky Equation 156
e. The Envelope Theorem 160

CHAPTER 2 THE THEORY OF COMPETITIVE MARKETS 169

A. Introduction 169
B. Consumption Set and Preference Ordering 175

a. Consumption Set 175

b. Quasi-Ordering and Preference Ordering 176
c. Utility Function 179
d. The Convexity of Preference Ordering 181

C. The Two Classical Propositions of Welfare Economics 185

Appendix to Section C: Introduction to the Theory of the Core 204

a. Introduction 204
b. Some Basic Concepts 207
c. Theorems of Debreu and Scarf 213
d. Some Illustrations 218
e. Some Remarks 224

D. Demand Theory 234

Appendix to Section D: Various Concepts of Semicontinuity and the
Maximum Theorem 249

a. Various Concepts of Semicontinuity 249

b. The Maximum Theorem 253

E. The Existence of Competitive Equilibrium 255

a. Historical Background 255
b. McKenzie's Proof 265

Appendix to Section E: On the Uniqueness of Competitive

Equilibrium 280

F. Programming, Pareto Optimum, and the Existence of
Competitive Equilibria 285

CHAPTER 3 THE STABILITY OF COMPETITIVE EQUILIBRIUM 295

A. Introduction 295
B. Elements of the Theory of Differential Equations 302
C. The Stability of Competitive Equilibrium-The Historical
Background 313
D. A Proof of Global Stability for the Three-Commodity Case (with
Gross Substitutability)-An Illustration of the Phase Diagram
Technique 321
E. A Proof of Global Stability with Gross Substitutability-The
n-commodity Case 325
F. Some Remarks 331

a. An Example of Gross Substitutability 331

b. Scarf's Counterexample 333
c. Consistency of Various Assumptions 335
d. Nonnegative Prices 336
G. The Tâtonnement and the Non-Tâtonnement Processes 339
a. The Behavioral Background and the Tâtonnement Process 340
b. The Tâtonnement and the Non-Tâtonnement Processes 341

H. Liapunov's Second Method 347

CHAPTER 4 FROBENIUS THEOREMS, DOMINANT DIAGONAL

MATRICES, AND APPLICATIONS 359

A. Introduction 359
B. Frobenius Theorems 367
C. Dominant Diagonal Matrices 380
D. Some Applications 391

a. Summary of Results 391

b. Input-Output Analysis 394
c. The Expenditure Lag Input-Output Analysis 396
d. Multicountry Income Flows 397
e. A Simple Dynamic Leontief Model 398
f. Stability of Competitive Equilibrium 399
g. Comparative Statics 403

CHAPTER 5 THE CALCULUS OF VARIATIONS AND THE

OPTIMAL GROWTH OF AN AGGREGATE
ECONOMY 410

A. Elements of the Calculus of Variations and Its Applications 410

a. Statement of the Problem 410


b. Euler's Equation 413

c. Solutions of Illustrative Problems 415

B. Spaces of Functions and the Calculus of Variations 419

a. Introduction 419
b. Spaces of Functions and Optimization 421
c. Euler's Condition and a Sufficiency Theorem 426

C. A Digression: The Neo-Classical Aggregate Growth Model 432

D. The Structure of the Optimal Growth Problem for an Aggregate
Economy 444

a. Introduction 444
b. The Case of a Constant Capital:Output Ratio 450
c. Nonlinear Production Function with Infinite Time Horizon 459

Appendix to Section D: A Discrete Time Model of One-Sector

Optimal Growth and Sensitivity Analysis 468

a. Introduction 468
b. Model 470
c. The Optimal Attainable Paths 474
d. Sensitivity Analysis: Brock's Theorem 480

CHAPTER 6 MULTISECTOR MODELS OF ECONOMIC GROWTH 486

A. The von Neumann Model 486

a. Introduction 486
b. Major Theorems 491
c. Two Remarks 497

B. The Dynamic Leontief Model 503

a. Introduction 503
b. The Output System 507
c. The Price System 517
d. Inequalities and Optimization Model (Solow) 522
e. Morishima's Model of the Dynamic Leontief System 527

Appendix to Section B: Some Problems in the Dynamic Leontief

Model-The One-Industry Illustration 541

CHAPTER 7 MULTISECTOR OPTIMAL GROWTH MODELS 559

A. Turnpike Theorems 559

a. Introduction 559
b. The Basic Model and Optimality 561
c. Free Disposability and the Condition for Optimality 563
d. The Radner Turnpike Theorem 567

B. Multisector Optimal Growth with Consumption 575

a. Introduction 575

b. The Model 577

c. Finite Horizon: Optimality and Competitiveness 580
d. Optimal Stationary Program 583
e. O.S.P. and Eligibility 587
f. Optimal Program for an Infinite Horizon Problem 594

CHAPTER 8 DEVELOPMENTS OF OPTIMAL CONTROL THEORY

AND ITS APPLICATIONS 600

A. Pontryagin's Maximum Principle 600

a. Optimal Control: A Simple Problem and the Maximum
Principle 600
b. The Proof of a Simple Case 606
c. Various Cases 609
d. An Illustrative Problem: The Optimal Growth Problem 617
B. Some Applications 627
a. Regional Allocation of Investment 627
b. Optimal Growth with a Linear Objective Function 638

C. Further Developments in Optimal Control Theory 646

a. Constraint: g[x(t), u(t), t] ≥ 0 646

b. Hestenes' Theorem 651
c. A Sufficiency Theorem 660
D. Two Illustrations: The Constraint g[x(t), u(t), t] ≥ 0 and the
Use of the Control Parameter 667

a. Optimal Growth Once Again 667

b. Two Peak-Load Problems 671

E. The Neo-Classical Theory of Investment and Adjustment

Costs-An Application of Optimal Control Theory 685
a. Introduction 685
b. The Case of No Adjustment Costs 688
c. The Case with Adjustment Costs 697
d. Some Remarks 703

INDEXES 721
Some Frequently
Used Notations

1. Sets

x ∈ X    x belongs to X (x is a member of X)
x ∉ X    x does not belong to X (x is not a member of X)
{x: properties of x}    set notation
∅    the empty set
R    the set of real numbers
R^n    the n-dimensional real space
Ω^n    the non-negative orthant of R^n (or simply Ω when the dimension n is clear
       in the context)
X^c    the complement of X
X°    the open kernel (the interior of X)
      (for example, Ω° = the positive orthant)
X ⊂ Y    X is contained in Y (X is a subset of Y)
X = Y    X is equal to Y (that is, X ⊂ Y and Y ⊂ X)
X ∩ Y    the intersection of X and Y
         (similarly, ∩_{i=1}^{n} X_i or ∩_{t∈T} X_t)
X ∪ Y    the union of X and Y
         (similarly, ∪_{i=1}^{n} X_i or ∪_{t∈T} X_t)
X + Y    the vector sum of X and Y (that is, {x + y: x ∈ X, y ∈ Y})
         (similarly, Σ_{i=1}^{n} X_i, αX + βY, and so on)
X − Y    the vector difference between X and Y
         (that is, {x − y: x ∈ X, y ∈ Y})
X \ Y    {x: x ∈ X, x ∉ Y}
⊗_{i=1}^{n} X_i    the Cartesian product of the X_i's
         (similarly, X ⊗ Y, and so on)


2. Vectors
‖x‖    the norm of x
d(x, y)    the distance between x and y
x · y    the inner product of x and y
Given two vectors x and y in R^n
a. x ≧ y means x_i ≥ y_i for all i
b. x ≥ y means x_i ≥ y_i for all i and with strict inequality for at least one i
c. x > y means x_i > y_i for all i
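
The three vector orderings above differ only in where strictness is required, which is easy to misread. As a minimal sketch (the function names are hypothetical and are not part of the book's notation), they can be written as componentwise tests:

```python
def weakly_geq(x, y):
    """Ordering (a): x_i >= y_i for every i."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same dimension")
    return all(a >= b for a, b in zip(x, y))


def semi_greater(x, y):
    """Ordering (b): x_i >= y_i for every i, with x_i > y_i for at least one i."""
    return weakly_geq(x, y) and any(a > b for a, b in zip(x, y))


def strictly_greater(x, y):
    """Ordering (c): x_i > y_i for every i."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same dimension")
    return all(a > b for a, b in zip(x, y))
```

For instance, x = (1, 3) and y = (1, 2) satisfy (a) and (b) but not (c), while x = y satisfies only (a).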

3. Matrix and Vector Multiplication

Let A be an m × n matrix
A · x implies x is an n-dimensional column vector
x · A implies x is an m-dimensional row vector
x · A · y implies x is an m-dimensional row vector and y is an n-dimensional
column vector
In other words, we do not use any transpose notation for vectors, unless so specified.
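
The convention that a vector's orientation is read off from its position relative to the matrix can be illustrated with a short sketch. The helper names below are assumptions made for this example only; vectors are stored as plain lists with no row or column tag, and the side of multiplication fixes the required dimension.

```python
def mat_vec(A, x):
    """A x: x is read as an n-dimensional column vector; the result has m entries."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("x must have as many entries as A has columns")
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def vec_mat(x, A):
    """x A: x is read as an m-dimensional row vector; the result has n entries."""
    if len(x) != len(A):
        raise ValueError("x must have as many entries as A has rows")
    n = len(A[0])
    return [sum(x[i] * A[i][j] for i in range(len(A))) for j in range(n)]


def bilinear(x, A, y):
    """x A y: x is an m-dimensional row vector, y an n-dimensional column vector."""
    return sum(xi * vi for xi, vi in zip(x, mat_vec(A, y)))


A = [[1, 2, 3],
     [4, 5, 6]]            # m = 2 rows, n = 3 columns
Ax = mat_vec(A, [1, 0, 1])  # the vector must have n = 3 entries
xA = vec_mat([1, 1], A)     # the vector must have m = 2 entries
xAy = bilinear([1, 1], A, [1, 0, 1])
```

No transpose ever appears: the same list type serves as a row or a column depending only on which side of A it stands.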

4. Preference Ordering
x ≿ y    x is not worse than y
         (that is, y is not preferred to x)
x ∼ y    x is indifferent to y
x ≻ y    x is preferred to y

5. Others

⇒ means "implies"
∵ means "because"
≡ means "is by definition equal to" (or "is identically equal to")
det A: determinant of matrix A
Re(w): real part of a complex number w
Abbreviation resp. stands for respectively
Preface

This book is intended to provide a systematic treatment of mathematical
economics, a field that has progressed enormously in recent decades.
It discusses existing theories in the field and attempts to extend them. The
coverage herein is much broader than in any other book currently used in
the field.
The literature on mathematical economics is enormous. The traditional
method of education in economics, that of assigning many books
and articles to be read by the student, is clearly inappropriate for the study
of mathematical economics. This is both because of the size and complexity
of the field and because the traditional method fails to make the student
aware of the importance of the analytical character of economic theory.
Here an attempt is made to provide all of the material usually obtained from
a multitude of different sources but within a single framework, using con-
sistent terminology, and requiring a minimum of outside reading.
More than a mere survey of the literature, this book strongly empha-
sizes both the unifying structure of economic theory and the mathematical
methods involved in modern economic theory with the intention of provid-
ing the reader both the technical tools and the methodological approach
necessary for doing original research in the field.
Furthermore, the book is not an exposition of elementary calculus and
matrix theory with applications to economic problems; rather it is a
book on economic problems using mathematical tools to aid in the analysis.
Nor is it an introduction to a higher level text. It begins at a rather elementary
level and brings the reader right to the frontiers of current research. Care is
also taken so that each chapter can be read more or less independently (that
is, each chapter can be read without careful reading of other chapters).
Needless to say, economics is concerned with real world problems,
and its development has been crucially dependent on a strong demand and
stimulus from such problems. However, the large number of viewpoints
based on diversified vested interests in a particular policy often obscures
transparent theoretical understanding. Hence it is very important for
economists to find the basic logical structure of each problem and to be
fully equipped with the major analytical tools, although many important
economic theories obviously can be neither mathematical nor analytical.
This book deals with the analytical and the mathematical aspects of
economic theory. It thus emphasizes a systematic exposition and extensions
of various mathematical tools of analysis which can be useful in many
diversified branches of economics, and of two topics in economic theory,
competitive equilibrium and economic growth, both of which, with their
rigor and theoretical thoroughness, will provide basic prototypes of analysis
and frames of reference for many other economic theories. Clearly, the
topics of interest in economics change rapidly from time to time, reflecting
the changing current concern with problems in the real world, and no book
can possibly cover all of these topics. However, I think that the material
presented in this book is useful and basic in analyzing many economic
problems, old and new.
In spite of the fact that the book is conspicuously analytical and mathematical
and that it is designed to bring the reader to the frontiers of mathematical
economics, the mathematical prerequisites have been kept to a minimum.
In virtually all sections of the book, the requirements include only
that level of knowledge of elementary calculus and elementary matrix theory
(say, the knowledge of matrix multiplication) which is now a standard
requirement for entering economics students in graduate schools in major
U.S. universities.
With regard to prerequisites in economic theory, the author can think
of several excellent introductory texts available on many of the topics dis-
cussed herein. More generally, a rigorous second or third year under-
graduate course should provide the reader with sufficient economic back-
ground to take up this study. Readers who are acquainted with books such as
Dorfman, Samuelson, and Solow, Linear Programming and Economic Anal-
ysis, New York, McGraw-Hill, 1958, and Hicks, Value and Capital, 2nd ed.,
Oxford, Clarendon Press, 1946, may benefit more from reading this book
than those unfamiliar with these works; however, familiarity with such texts
is by no means necessary.
The book is suitable for use as a textbook in graduate courses in eco-
nomic theory and mathematical economics and is also intended to serve as
a reference work for professional economists who wish to become familiar
with some of the topics and techniques of mathematical economics.
In addition, this book represents a record of my lectures on mathematical
economics and economic theory given to first- and second-year
graduate students at the University of Minnesota, the University of Rochester,
the University of Hawaii, and Purdue University during the past eight
years. In spite of considerable revisions and attempts to update the book
during those years, it is natural and inevitable that the book carries some
flavor of the period 1965-1966 when the entity of the book was first tried out
at Minnesota.

In the course of writing the manuscript, I found that I owe a great debt
to an excellent graduate education in economics received at the University
of Rochester. I am also grateful for the atmosphere favorable to this modern
approach to economic theory existing at the University of Minnesota and
Purdue University, as well as at the University of Rochester. This atmo-
sphere, sponsored and nurtured by such distinguished scholars as Profes-
sors Leonid Hurwicz, John S. Chipman, Marcel K. Richter, Stanley Reiter,
and Robert L. Basmann, as well as Lionel W. McKenzie, Ronald W. Jones,
and Edward Zabel of Rochester, has provided a great deal of stimulation.
My greatest debt is to the students at Minnesota, Rochester, Hawaii,
and Purdue who took my courses on the topic and have constantly given
me stimulation, encouragement, and criticism. A number of people, in
addition to students in my classes, have read a portion or the whole of the
manuscript. Among them, I would like to express my gratitude to Michihiro
Ohyama, James C. Moore, William A. Brock, Mohamed A. El-Hodiri,
Takashi Negishi, Jinkichi Tsukui, Hiroshi Atsumi, Sheng Cheng Hu, John
Z. Drabicki, Yuji Kubo, Raj K. Jain, Kenneth Avio, and Fred Nordhauser.
In addition, I am grateful to Professor Richard E. Quandt of Princeton, who
read the entire manuscript and provided me with numerous comments as
well as encouragement. My special thanks also go to John Drabicki, for
without his help and encouragement, the completion of this book may have
been further delayed and hampered. I also appreciate the capable research
assistance provided by Erik Haites, Gene Warren, Robert Parks, Frank
Maris, and James Winder, as well as the excellent stenographic services of
Mrs. Gladys Cox, Mrs. Helen Antonienko, and others whose assistance
was made available to me, for the most part, through Purdue University.
I am also grateful to Professor Leonid Hurwicz for his readiness in
giving me permission to quote the results of one of his unpublished works
("LH-Oct. 1966" as revised, July 2, 1970). I would also like to record my
gratitude to those professors who have granted their kind permission to
quote many very interesting and illuminating passages from their writings.
The precise source and the author of each quotation is given in the respective
place of each quotation. Thanks are also due to the editors of Metroeconomica
and the Quarterly Journal of Economics for permission to include in this
book some articles by the author which were originally published by them.
I am also grateful to Deans Emanuel T. Weiler, John S. Day, Rene
P. Manes, and Jay W. Wiley of the Krannert School of Industrial Administration
of Purdue University, who have provided me with generous encouragement
as well as unusually favorable research conditions. Finally,
my wife, Machiko, greatly helped me in preparing the indexes of the book,
as well as providing me with encouragement.
A. T.

December, 1973
West Lafayette, Indiana
INTRODUCTION

Section A
SCOPE OF THE BOOK

The essential feature of modern economic theory is that it is analytical and
mathematical. Mathematics is a language that facilitates the honest presentation
of a theory by making the assumptions explicit and by making each step
of the logical deduction clear. Thus it provides a basis for further developments
and extensions. Moreover, it provides the possibility for more accurate empirical
testing.¹ Not only are some assumptions hidden and obscured in the theories of
the verbal and the "curve-bending" economic schools,² but their approaches
provide no scope for accurate empirical testing, simply because such testing
requires explicit and mathematical representations of the premises of the theories
to be tested.
Hence it is often argued that the "institutionalists" (as representing a
methodological approach to economic problems) have been largely driven out
of the temple and, furthermore, the relative weights of the curve-bending and
the mathematical methodologies have been moving in the direction of the latter
in many departments of economics in U.S. universities. Yet economics is
a complex subject and involves many things that cannot be expressed readily in
terms of mathematics. Commenting on Max Planck's decision not to study
economics, J. M. Keynes remarked that economics involves the "amalgam of
logic and intuition and wide knowledge of facts, most of which are not precise."³
In other words, economics is a combination of poetry and hard-boiled analysis
accompanied by institutional facts. This does not imply, contrary to what many
poets and institutionalists feel, that hard-boiled analysis is useless. Rather, it
is the best way to express oneself honestly without being buried by the millions
of institutional facts. Abstract economic theorizing with analytical and mathematical
methodology does provide an excellent way to investigate real-world
problems and institutions. An analogy here would be that of Tycho Brahe vs.
Kepler, Galilei and Newton.⁴ Clearly both are important, but this book chooses
to discuss the analytical and mathematical approach.


Mathematical economics is a field that is concerned with complete and
hard-boiled analysis. The essence here is the method of analysis and not the
resulting collection of theorems, for actual economies are too complex to allow
ready application of these theorems. J. M. Keynes once remarked that "the
theory of economics does not furnish a body of settled conclusions immediately
applicable to policy. It is a method rather than a doctrine, an apparatus of the
mind, a technique of thinking, which helps its possessor to draw conclusions."⁵
An immediate corollary of this is that the theorems are useless without
explicit recognition of the assumptions and complete understanding of the
logic involved. It is important to get an intuitive understanding of the theorems
(by means of diagrams and so on, if necessary), but this understanding is useless
without a thorough knowledge of the assumptions and the proofs. Hence, in this
book, all the major theorems are stated in full and proved rigorously. An intro-
ductory account of each topic is given to help provide an intuitive understanding
of the theory involved. Care is taken to make the proofs as simple (or at least
conceptually as elementary) as possible and no steps are omitted (so that they
can be followed by readers of a nonmathematical inclination). Furthermore, a
special effort is made to make the economic meaning of the theorems and the
concepts involved in the theorems explicit and clear.
Modern economic theory may be discussed either in terms of the analytical
techniques employed, or in terms of the topics discussed. Economists have
lavished particular attention on two topics (the theory of competitive markets
and the theory of growth, especially their general equilibrium aspects), with
the result that both have achieved the status of rigorous, elegant theories, a
state of theoretical development unmatched by any other branch of economics.
The perfection of these theories has been very closely tied to the exploration,
advancement, and elaboration of various mathematical techniques. This book
will be restricted to these two theories, the theory of competitive markets and
the theory of growth, and to those mathematical techniques that will assist in
explaining and clarifying them. The book adopts a unified viewpoint: that of
general equilibrium analysis.
The great danger in a book of this kind is that it may tend to become a
patchwork of theories that are collected from different sources and artificially
pasted together. The reader of such a book may be embarrassed by the knowledge
he gains because he will be unaware of the relationships among the theories.
Particular emphasis is placed herein on the relationships among the theories
by bringing out the principles common to them. Also, insofar as is possible, a
consistent body of terminology and notation is employed.
There exists a great deal of work in the profession indicating the utility
of this approach as well as the utility of mathematical analysis in economic
theory, for example, Samuelson's Foundations of Economic Analysis (1947). This
book follows this pattern, paying particular attention to the advancements of
economic science over the past 25 years.
It is tempting, in a study of this type, to treat the mathematical techniques
and the economic theories separately. However, the fact that the mathematical
techniques are closely related to economic theories seems to make it difficult
to treat them effectively by themselves. In addition, treating the mathematical
techniques first might discourage the student before he gets to the economic
theories, whereas treating the economic theories first would not enable him to
take advantage of the mathematical techniques. The author takes the view that
this is not a difficulty but rather an advantage, in the sense that developments
in economic theory can be used to provide the unifying structure for the book.
The mathematical techniques can then be explained in connection with the
theoretical developments to which they are related. Mathematical theorems
will thus become more interesting and exciting to economists.
The author fully realizes some of the limitations of the book. For example,
in spite of quite a comprehensive coverage of the topics (which is broader than
in any other book currently in the field), it misses at least three important topics,
namely, the theory of uncertainty, the theory of social systems and organizations,⁶
and the theory of conflicts and interactions.⁷ There is no question that
these topics are important and that significant contributions will be made in
the next few decades. They were excluded only because their inclusion would
make the book massive and because, in view of the current research carried out
in these fields, the materials covered would probably become obsolete by the
time of publication of this work.⁸
Furthermore, even the topics covered in this book have important de-
ficiencies, in spite of the elegance and importance of all the literature related
to them. For example, in the case of the theory of competitive equilibrium one
may ask the following questions: (1) What is the rationale justifying assumptions
such as a fixed number of commodities, a fixed number of consumers, and a
fixed technology set? (2) Granting all the premises of the theory, how can we
reach an equilibrium? Walras offered the tâtonnement process, which provides
a way to reach an equilibrium without knowing individuals' preferences, technology
sets, and so forth.⁹ But the process excludes the possibility of intermediate
trading. When intermediate trading is permitted, the equilibrium depends on
the trading paths. (3) Even if the tâtonnement process is accepted, convergence
to an equilibrium is still an open question. So far, the proof of stability depends
on heroic assumptions such as "gross substitutability."¹⁰
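
To make the price-adjustment idea concrete, here is a minimal numerical sketch, not taken from the book. It assumes a pure exchange economy with Cobb-Douglas consumers (a standard case in which gross substitutability holds), a price rule that raises each price in proportion to its excess demand, and a simple Euler discretization. The expenditure shares, endowments, step size, and function names are all illustrative assumptions.

```python
def excess_demand(p, shares, endowments):
    """Aggregate excess demand of a Cobb-Douglas exchange economy at prices p."""
    # Each consumer's income is the value of the endowment at current prices.
    incomes = [sum(pj * wj for pj, wj in zip(p, w)) for w in endowments]
    z = []
    for j in range(len(p)):
        # A Cobb-Douglas consumer spends the fixed share a[j] of income on good j.
        demand = sum(a[j] * m for a, m in zip(shares, incomes)) / p[j]
        supply = sum(w[j] for w in endowments)
        z.append(demand - supply)
    return z


def tatonnement(p, shares, endowments, step=0.05, iterations=5000):
    """Euler steps of dp_j/dt = z_j(p); no trade takes place along the way."""
    for _ in range(iterations):
        z = excess_demand(p, shares, endowments)
        p = [pj + step * zj for pj, zj in zip(p, z)]
    return p


# Two consumers, three goods (illustrative numbers only).
SHARES = [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
ENDOWMENTS = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]

prices = tatonnement([1.0, 1.0, 1.0], SHARES, ENDOWMENTS)
residual = max(abs(z) for z in excess_demand(prices, SHARES, ENDOWMENTS))
```

In this example the adjustment rule uses only the aggregate excess demands, never the individual preferences directly, and the excess demands shrink toward zero. Only relative prices are determined, so the level of the resulting price vector depends on the starting point.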
Although the monopoly of the nonanalytical methodology seems to be
over, mathematical economics, as it may be represented by this book, is no doubt
transitory. Future economists, completely free from prejudices against mathematics
and well trained in mathematics, econometric methods, and the theory
of measurement, and skilled in methods of electronic computation and simulation,
may be able to deal successfully with the institutional and political-economy
aspects of economics.¹¹ However, the basic methodology and the framework
of thinking developed in mathematical economics will no doubt remain. Future
economists may be less concerned with such "large" problems as the competitive
equilibrium of the entire economy, and instead be more concerned with smaller
aspects of the economy. But they will still realize the importance of the analytical
method and the mode of developing analysis utilizing formally and honestly
constructed models. Furthermore, with the proper training, future economists
will not be in danger of overlooking the general equilibrium aspects of such
models.
In ending, it must be stressed that we should not overlook the importance
of the mathematical techniques developed in the course of the emergence of
mathematical economics. Although I would be the last person to argue whether
or not so-and-so's work is "good economics," I will be the first to defend the
importance of making the mathematical tools available to economists. These
are tools for every economist. Hence I have no hesitation to place heavy empha-
sis on mathematical techniques, almost on a par with my emphasis on economics.

Section B
OUTLINE OF THE BOOK

This book is essentially divided into three parts. The first part (Chapter 0)
provides the background materials in mathematics and economics necessary
for reading the rest of the book and also for further research in mathematical
economics. The second and the third parts constitute the main body of the book.
Roughly speaking, the second part (Chapters 1 through 4) is primarily concerned
with the theory of competitive markets, and the third part (Chapters 5 through 8)
is primarily concerned with the theory of growth. The above division between
the second and the third parts is a rough one, since the mathematical techniques
are closely interwoven with the economic topics, and it is not possible to classify
these techniques according to economic topics. For example, the theory of nonlinear
programming (Chapter 1) is a mathematical technique which lies at the
heart of the theory of competitive markets, yet it is an important technique
also for growth theory and for other fields of economics.
The first part, consisting of only one chapter (Chapter 0), is divided into
three sections. Section A collects the basic mathematics that will be useful for
reading the rest of the book and also for the reader's further study in economic
theory. Unlike the remainder of the book, most theorems here are stated without
proof so that the reader can grasp the basic mathematical concepts and
ideas without being led astray by the details of the proofs. Care is taken, however,
not to misguide the reader into thinking that our world is always Euclidean,
and thus this section becomes more than a mere exposition of the mathematics
necessary for later sections of the book.
Section B of Chapter 0 is an exposition of separation theorems, one of the
most important of all mathematical theories which contribute to the foundations
of modern economic theory. One of the important features of modern economic
theory is that it is set-theoretic, and Section C offers an exposition of
activity analysis, which represents one of the most basic materials for the set-
theoretic feature of modern economic theory. Separation theorems are the
important mathematical technique used here.
The main content of the book starts with the exposition of nonlinear pro-
gramming theory (Chapter 1), which is probably the most important mathematical
technique in modern economic theory. There are many approaches one can
take in this theory. Our approach utilizes the separation theorems of convex
sets because this approach seems more natural (than, say, the implicit function
theorem approach) in providing results that do not require differentiability.
Section B of this chapter summarizes such results. Differentiation is introduced
in Section C, and Section D summarizes the major results on the characteriza-
tion of the solution of the constrained maximum problem in terms of derivatives.
Section E provides an exposition of some additional (yet important) topics such
as quasi-concave programming, vector maximization, the characterization of
concave or quasi-concave functions in terms of Hessian matrices, and the
second-order (necessary or sufficient) conditions for an optimum. The last section
(F) of Chapter 1 provides examples of various economic applications. Obviously,
the applications of nonlinear programming are not exhausted in Section F. Only
some of the well-known examples are given. No doubt these applications have
stimulated the interest of economists in the theory of nonlinear programming. It
is probably natural to assume that the readers of this book have already con-
fronted the standard use of the classical optimization theory (for example, Hicks's
Value and Capital, Mathematical Appendix), so that they are motivated to read
the treatment of the modern theory in Chapter 1 without too much economic
introduction. Applications are thus placed at the end of the chapter. In the
Appendix to Section F, we summarize the classical theory of optimization and
its standard applications to comparative statics analysis, in order to make this
book sufficiently self-contained.
Chapter 2 deals with the theory of competitive markets, especially its wel-
fare aspects and the existence problem. Section B introduces the discussion of
consumers and consumer preferences. Section C proves the two classical prop-
ositions of welfare economics: (1) a competitive equilibrium always realizes a
Pareto optimum, and (2) for any Pareto optimum, there exists a reallocation of
initial resources such that it can be supported by a competitive equilibrium. In
the Appendix to Section C, we attempt an introductory exposition of the theory
of the core, a topic which has recently attracted great interest. It is hoped that
interest is aroused among readers to study the theory of n-person games, which
has great potential with regard to its applications to economics. Section D deals
with demand theory. Two main results, the continuity property of demand
functions and the Hicks-Slutsky equation, are the central themes here. In the
Appendix to Section D, related mathematical concepts and theorems are
discussed. In particular, we discuss the relation among the various concepts of
semicontinuity and a useful mathematical theorem known as the "maximum
theorem." Section E deals with the existence of a competitive equilibrium.
Mathematical techniques known as "fixed point theorems" play a central role
here. Various approaches to the existence problem are discussed, for each ap-
proach has an interesting feature and contains potential applications to other
problems. The Appendix to Section E contains a brief discussion of the uniqueness
of equilibrium. The last section (F) attempts to make it clear that the mechanism
of competitive markets can be viewed as a mechanism that generates a solution
to a nonlinear programming problem. Thus the topics of the two chapters on
nonlinear programming and competitive markets are now related. In particular,
we prove the two classical propositions of welfare economics and the existence
of a competitive equilibrium from the point of view of nonlinear programming.
Chapter 3 deals with the stability of competitive markets. The tâtonnement
process provides an institutional scheme under which one can find a com-
petitive equilibrium without actually knowing each consumer's preferences and
each producer's technology set. After a discussion of the historical background
of the topic in Section C, the Arrow-Block-Hurwicz proof of global stability
under the gross substitutes case is discussed in Section E. Section F provides
certain remarks on this main result, the most important of which is Scarf's
example of instability. Section G questions the institutional plausibility of the
tâtonnement mechanism and discusses non-tâtonnement processes. Owing to the
strictness of the gross substitutability assumptions coupled with Scarf's
counterexample, and with the doubt about the institutional plausibility of the tâtonnement
process, there are certain economists who are left cold by the entire stability
analysis. However, this analysis has made economists aware of the importance
of disequilibrium analysis and adjustment processes toward an equilibrium.
Moreover, it has also made economists realize the importance of the differential
equations technique in economics. The exposition of this technique is attempted
in Section B and the Liapunov second method, a powerful tool for stability
analysis, is explained in Section H. An important diagrammatical technique, the
phase diagram, is also made available to economists through stability analysis.
This technique, discussed in Section D, has many applications in dynamic eco-
nomic theories.
Chapter 4 deals with the mathematical techniques developed in connection
with Frobenius' theorems and dominant diagonal matrices. These are developed
in connection with the Leontief input-output analysis and stability analysis. In
Section A, we motivate our discussion of this chapter by using the Leontief
input-output model, which in turn is a model of a general equilibrium competi-
tive economy. Section B deals with Frobenius' theorems, and Section C deals
with dominant diagonal matrices in cases where off-diagonal elements are either
all nonnegative or all nonpositive. After the rather tedious mathematical dis-
cussions of Sections B and C, economic applications are taken up in Section D.
Section D begins with a summary of the results of B and C, and the reader, if
he so wishes, can skip most of the reading of B and C. The rich and wide applica-
tions shown in Section D will illustrate the power of this technique in economic
theory.
Chapter 5 has the dual purpose of introducing modern growth theory in the
form of an aggregate optimal growth model and of making the reader familiar
with an important mathematical tool, the calculus of variations. The calculus of
variations has had an unfortunate history among economists in that it was im-
mediately forgotten after the initial economic works of Roos, Evans, and
Ramsey, in the 1920s and 1930s. However, the power of this technique in physics
is well known, and economists are becoming more aware of its use in economics.
The optimal growth model gives an interesting example of its economic applica-
tions. In Section A, we make an expository account of this technique for the
simplest case. In Section B, we enter into a study of the second-order characteriza-
tions, which may be useful for the reader's further reading and research. It is
also shown there that, just as in nonlinear programming, concavity is sufficient
to guarantee that the first-order characterization ("Euler's condition") is neces-
sary and sufficient for an optimum. Section C digresses from the optimization
problem and attempts a compact summary of the one-sector growth model.
Section D then deals with the one-sector optimal growth model. An enormous
amount of literature in this field is treated in a unified and simple manner. Chapter
5 ends with the Appendix to Section D, in which we deal with the discrete-time
analogue of our discussion of Section D. This Appendix is intended to illustrate
the relation between the "continuous-time" model and the "discrete-time" model,
and to give an expository account of the existence problem and of the important
"sensitivity" results.
Chapter 6 discusses two important multisector growth models, the von
Neumann model and the dynamic Leontief model. In spite of many limitations,
the von Neumann model turns out to be fundamental in modern growth theory.
Section A deals with this model. Section B is concerned with the dynamic Leon-
tief model, which is particularly important among empirical economists. This
model, however, seems to have several important theoretical limitations. We
point out these difficulties. The Appendix to Section B uses a one-sector model
and points out these difficulties more sharply.
Chapter 7 deals with optimal growth in a multisector context. Section A
discusses some of the old results in this topic, namely, the "turnpike theorems."
In this material the role of consumption is completely subsumed and society is
concerned only with the terminal stock of goods. Our emphasis here is on an
elegant turnpike theorem due to Radner. Although the turnpike theorems of
Section A mark a great advance in theory compared with the von Neumann
theory, the above weakness is quite strong. This prompted the "neo-turnpike
theorems," notably that of David Gale. However, we deal with consumption
more explicitly than did Gale. Incidentally, we carry out our discussions of
Chapters 6 and 7 in terms of the discrete-time model, which the reader may,
if he so wishes, translate to the continuous-time model. In Chapter 7, where we
discuss the optimization problem, we see that the nonlinear programming
technique (Chapter 1) is again found to have important economic applications.
Chapter 8 is the last chapter of the book and deals with one important and
powerful technique, optimal control theory. In Section A we summarize the
important results obtained by Pontryagin and others with some illustrations of
the applications of these results. In Section B we discuss, in full, two applications
of these results: the problem of regional allocation of investment and the optimal
growth problem with a linear objective. The latter is particularly useful as an
illustration of the "bang-bang" solution. In Section C we return to the basic
theory again and discuss some important generalizations of the results sum-
marized in Section A. In addition to the usual Pontryagin-type differential
equation constraint, constraints of other forms (such as $g[x, u, t] \ge 0$, integral
constraints, and so on) are introduced. The major theorem here is due to Hestenes.
We again point out the importance of concavity and present a result which is
slightly more general than Mangasarian's sufficiency theorem. Section D deals
with some applications of the results of Section C. Optimal growth is again taken
up because of the reader's familiarity with this topic. Another application in
Section D is concerned with the peak-load problem. In Section E, we study the
neoclassical theory of investment as an application of the optimal control tech-
nique, and present various theories in a unified and generalized fashion. In spite
of our rather complete treatment of investment theory in Section E, our empha-
sis in Chapter 8 is still to expose the reader to this powerful technique rather
than to discuss in detail all possible economic applications. It is hoped that
the reader will find other important economic applications of the technique.
Although economic applications in the literature have been concerned almost
exclusively with growth theory, there should be no such restriction. Obviously
the interpretation of t does not have to be confined to "time." The variable t
can refer to Mr. t in the continuum of traders model or to income "t" in the
taxation model, and so forth.

FOOTNOTES

1. By empirical testing, I do not mean the "curve-fitting school," which relies heavily
on regression analysis. Although this school is fashionable among certain empirical
economists, it seems to represent institutionalism in one of its worst forms. Not
only does it suffer from a poor theoretical basis, but also it often ignores the
elementary theory of measurement.
2. However, there is no question that "diagrams" are often very useful tools for under-
standing important theories and that "common sense" verbal arguments are often
essential in leading to economically fruitful theories.
3. Keynes, J. M., "Alfred Marshall, 1842-1924," Economic Journal, XXXIV, Sep-
tember 1924, p. 333.
4. The analogy is imperfect, for we do not know of any economic theory which is
as successful in its application as is Newtonian mechanics in physics. Thus Planck
decided not to study economics.
5. Keynes, J. M., "Introduction" (to the series), Cambridge Economic Handbooks (the
first book in the series, Supply and Demand by H. D. Henderson, appeared in 1922,
published by Harcourt Brace and Co.).
6. A classical study in this field is L. Hurwicz, "Optimality and Informational Ef-
ficiency in Resource Allocation Processes," in Mathematical Methods in the Social
Sciences, 1959, ed. by K. J. Arrow, S. Karlin, and P. Suppes, Stanford, Calif.,
Stanford University Press, 1960. In the study of the problems of social choice, the
classical work is K. J. Arrow, Social Choice and Individual Values, 2nd ed., New
York, Wiley, 1963 (1st ed., 1951).
7. These three topics are obviously interrelated. An important example of this inter-
relationship is found in the body of mathematical knowledge known as the "theory
of games." The list of good textbooks on the theory of games has been expanding,
consequently discouraging me from including the subject in this book.
8. There seems to be no question that the topics covered in this book will provide a
basis for further research in the above three fields. For example, the theory of
competitive equilibrium is a theory of a social system-competitive markets-
given a priori, operating under conditions of certainty. The theory of growth
provides an understanding of the complications that arise when the possibility of
intertemporal choice is explicitly introduced. In view of the prior importance of
the theory of competitive equilibrium and the theory of growth, the omission might
not be as serious as it may seem.
9. H. Scarf recently offered a method of computing the competitive equilibrium
when we know the aggregate technology set and certain information on the pref-
erences of individual consumers. See his article "On the Computation of Equi-
librium Prices," in Economic Studies in the Tradition of Irving Fisher, New York,
Wiley, 1967, and "An Example of an Algorithm for Calculating General Equi-
librium Prices," American Economic Review, LIX, September 1969.
10. Some of these points are discussed in the book. See our discussion on the
non-tâtonnement processes (Chap. 3, Section G), for example. Incidentally,
disequilibrium analysis is another important topic whose rapid progress is expected in
the next few decades.
11. Needless to say, they should still have ample knowledge of data and institutions
as well as deep insight into the workings of the real world. Lacking these, there is
some doubt as to whether they are qualified to be called "economists."
PRELIMINARIES

Section A
MATHEMATICAL PRELIMINARIES

a. SOME BASIC CONCEPTS AND NOTATIONS

A set is a collection of objects (of any kind). For example, the collection
of all the positive integers is a set, the collection of all human beings on earth
is a set, and the collection of all the transistor radios made in Japan during the
year 1973 is a set. Words such as "family," "collection," and "class" are often
used synonymously with the word "set." If x is a member of a set S, we denote
that by $x \in S$. If x is not a member of S, we denote that by $x \notin S$. A set is often
denoted by braces, { }. Inside the braces, a colon often separates two descriptions:
the first denotes a typical element of the set and the second denotes the properties
that a typical element must have to belong to the set. For example, if I is the set
of all the positive integers, we may denote it by I = {n: n is a positive integer}. The
set of points on the unit circle in the two-dimensional plane can be denoted by
$\{(x, y) : x \in R,\ y \in R,\ x^2 + y^2 = 1\}$, where R is the set of all real numbers. That
any S can be written in the form {x: P(x)}, where P(x) are the properties for x
to be a member of S, is sometimes called the axiom of specification (see Halmos
[6], p. 6). In this book, unless otherwise stated, R will always refer to the set of all
real numbers. A set with a finite number of elements can (in principle) be
denoted by enumerating its elements; for example, {x, y, z}, {0}, and so on. An
element of a set is often called a "point" in the set.
Given two sets A and B, if every element of A is an element of B, we say
that A is a subset of B or B includes A; we write $A \subset B$ or $B \supset A$. Note that
$A \subset A$. We say that two sets A and B are equal if they have the same elements,
and we write A = B or B = A. (This is often called the axiom of extension; see
Halmos [6], pp. 2-3.) It is easy to see that A = B if and only if both $A \subset B$ and
$B \subset A$.
The collection of all the elements which belong to both set A and B is
denoted by $A \cap B$ and is called the intersection of A and B; that is,
$A \cap B = \{x : x \in A \text{ and } x \in B\}$. The collection of all the elements which belong to set A
or set B is denoted by $A \cup B$ and is called the union of A and B; that is,
$A \cup B = \{x : x \in A \text{ or } x \in B\}$. The difference between two sets A and B (denoted by $A \setminus B$) is
defined by $A \setminus B = \{x : x \in A \text{ and } x \notin B\}$. The set that has no elements is also
considered a set. It is called the empty set and is denoted by $\emptyset$. If two sets A and
B have no elements in common, A and B are said to be disjoint or nonintersecting.
In other words, $A \cap B = \emptyset$.
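The set operations defined so far can be tried out on small finite sets. The following is only a minimal sketch in Python; the particular sets A, B, C, and odds are illustrative assumptions and do not come from the text.

```python
# Finite stand-ins for the sets in the text (the elements are illustrative).
A = {n for n in range(1, 11) if n % 2 == 0}   # {n : 1 <= n <= 10, n is even}
B = {n for n in range(1, 11) if n % 3 == 0}   # {n : 1 <= n <= 10, n is divisible by 3}

intersection = A & B      # {x : x in A and x in B}     -> {6}
union = A | B             # {x : x in A or x in B}
difference = A - B        # {x : x in A and x not in B}

# A = B if and only if A is a subset of B and B is a subset of A.
C = {10, 8, 6, 4, 2}
equal_by_inclusion = (A <= C) and (C <= A)

# Two sets are disjoint when their intersection is the empty set.
odds = {1, 3, 5}
disjoint = (A & odds) == set()

print(intersection, union, difference, equal_by_inclusion, disjoint)
```

The set comprehension mirrors the notation {x: P(x)}: the part before `for` is the typical element and the condition after `if` plays the role of P(x).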
The union (or intersection) can be taken over a finite or infinite collection
of sets. For example, if $S_1, S_2, \ldots, S_i, \ldots$ are sets, we can consider

$$\bigcup_{i=1}^{n} S_i \quad \text{or} \quad \bigcup_{i=1}^{\infty} S_i$$

In fact, the index i does not have to be an integer. For example, letting T be
the set of all real numbers between 0 and 1, we may consider the union $\bigcup_{t \in T} S_t$.
If $S_t$ is the set of all the human beings on earth at time t, $\bigcup_{t \in T} S_t$ is the set of
all human beings on earth during the period T. The reader should easily
understand the notations

$$\bigcap_{i=1}^{n} S_i, \quad \bigcap_{i=1}^{\infty} S_i, \quad \bigcap_{t \in T} S_t$$

If I is the set of all positive integers, we may write

$$\bigcup_{i=1}^{\infty} S_i = \bigcup_{i \in I} S_i$$
When S is a subset of set X, the set of elements which belong to X but not
to S is said to be the complement of S relative to X. When it is obviously "relative
to X," we often omit this phrase and denote it by $S^c$, where $S^c = \{x : x \in X,\ x \notin S\}$.
Clearly $X^c = \emptyset$ and $(S^c)^c = S$.
Given two sets X and Y, consider an ordered pair (x, y) where $x \in X$ and
$y \in Y$. The collection of all those ordered pairs is called the Cartesian product
of X and Y and is denoted by $X \otimes Y$. The ordinary two-dimensional plane can
be written as $R \otimes R$, also $R^2$, where R is the set of real numbers. Given sets $X_1,
X_2, \ldots$, we may consider the Cartesian product such as

$$\bigotimes_{i=1}^{n} X_i, \quad \bigotimes_{i=1}^{\infty} X_i$$

When T is some index set (not necessarily the positive integers), we may
consider the Cartesian product such as $\bigotimes_{t \in T} X_t$. If T is the set of all the positive
integers, clearly

$$\bigotimes_{t \in T} X_t = \bigotimes_{i=1}^{\infty} X_i$$

If we write $x = (x_1, x_2, \ldots, x_n) \in \bigotimes_{i=1}^{n} X_i$, $x_i$ is called the ith coordinate of the
point. Similarly, in $\bigotimes_{t \in T} X_t$, $x_t \in X_t$ is the tth coordinate of the respective point.
The set $X_t$ is called the tth coordinate set.
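For finite sets the Cartesian product can be enumerated directly. The sketch below uses `itertools.product` from the Python standard library; the sets X and Y and the sample point are illustrative assumptions.

```python
from itertools import product

X = {0, 1}
Y = {"a", "b", "c"}

# Cartesian product of X and Y: all ordered pairs (x, y) with x in X, y in Y.
XY = set(product(X, Y))

# The n-fold product of a single set with itself; here n = 3.
X3 = set(product(X, repeat=3))

# The ith coordinate of a point (the text counts coordinates from 1).
point = (1, 0, 1)
i = 2
ith_coordinate = point[i - 1]

print(len(XY), len(X3), ith_coordinate)
```

Ordered pairs are represented as tuples, so (0, "a") and ("a", 0) are different objects, as the definition requires.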
Given two sets X and Y, if we can associate each member of X with an
element of Y in a certain manner, which we denote by f(x), then we say f is a
function from X into Y, denoted by $f: X \to Y$. The set X is said to be the domain of
f and the set $\{f(x) : x \in X\}$ [often denoted by f(X)] is said to be the range of f;
f(x) is the value or the image of x under f. The terms map, transformation, operator,
and function are synonymous. If we can associate more than one point in Y for
each point x in X, we call it a set-valued function, a multivalued function, or a
correspondence. When only one point in Y is associated with each point of X, we
call it a single-valued function or simply a function. Even if a function is
single-valued, it is still possible that more than one point in X is associated with the same
value in Y under this function. An example is $f: R \to R$ with f(x) = a for all x (called
a constant function). Let A be a subset of f(X). The set defined as $\{x : x \in X,\ f(x) \in A\}$
is called the inverse image of A under f and is denoted by $f^{-1}(A)$. If A consists
of only one element, say, A = {y}, then $f^{-1}(A)$ is the inverse image of y under f. For
example, if f(x) = 3, $x \in R$, then $f^{-1}(3) = R$. One can define a mapping $f^{-1}$ of Y
into X by $x = f^{-1}(y)$ if and only if y = f(x). The function $f^{-1}$ can be either
single-valued or multivalued. When both f and $f^{-1}$ are single-valued, then f is said to be
one to one or an injection. If, in addition, f(X) = Y, then f is called one to one and
onto or a bijection. If f is a function from X into Y and if g is a function from f(X)
into Z, then we can define a function h from X into Z by first applying f, then g. That
is, h(x) = g[f(x)], $x \in X$. We call h the composite function of f and g and denote it by
$h = g \circ f$. Let f be a function from X into Y. Then the set defined by
$\{(x, y) : (x, y) \in X \otimes f(X),\ y = f(x)\}$ is called the "graph" of f. The graphs on $R^2$ shown in
Figure 0.1 may be useful illustrations.
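On a finite domain, the notions of range, inverse image, one-to-one function, composite function, and graph can be computed by enumeration. The following sketch is only an illustration; the domain X, the functions f and g, and the helper names are assumptions introduced here.

```python
X = [-2, -1, 0, 1, 2]           # a small finite domain

def f(x):
    return x * x                # single-valued, but not one to one on X

def g(y):
    return y + 1                # one to one

def image(func, domain):
    """The range func(domain) = {func(x) : x in domain}."""
    return {func(x) for x in domain}

def inverse_image(func, domain, A):
    """{x : x in domain, func(x) in A}."""
    return {x for x in domain if func(x) in A}

def is_one_to_one(func, domain):
    """True when distinct points of the domain have distinct values."""
    values = [func(x) for x in domain]
    return len(set(values)) == len(values)

def compose(outer, inner):
    """The composite function: first apply inner, then outer."""
    return lambda x: outer(inner(x))

h = compose(g, f)                       # h(x) = g[f(x)]
graph_of_f = {(x, f(x)) for x in X}     # the graph of f restricted to X

print(image(f, X), inverse_image(f, X, {4}), is_one_to_one(f, X), h(2))
```

Here the inverse image of {4} under f contains two points, which is exactly why the inverse of f is multivalued and f is not one to one.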
Figure 0.1. Functions. (Three panels plotting f(x) against x: one-to-one; single-valued but not one-to-one; multivalued.)

Let S be a set of real numbers, that is, $S \subset R$. Then S is said to be bounded from
above if there exists an $a \in R$ (a is not necessarily in S) such that $x \le a$ for all $x \in S$,
and a is called an upper bound of S. Similarly, S is said to be bounded from below if
there exists a $b \in R$ (b is not necessarily in S) such that $x \ge b$ for all $x \in S$; b is
called a lower bound of S. Clearly there can be many upper bounds and lower
bounds for a given set of real numbers. We do not discuss the axioms of the real
number system here (the interested reader can refer to any relevant book in pure
mathematics). But from these axioms, one can easily derive the following
proposition:

(i) If S is bounded from above, there is a smallest element in the set of upper bounds
of S.

From (i), we can then derive (ii):

(ii) If S is bounded from below, there is a largest element in the set of lower bounds
of S.

In other words, if S is bounded from above, the set U of its upper bounds,
$U = \{a : a \in R,\ a \ge x \text{ for all } x \in S\}$, is not empty. Proposition (i) asserts that there
exists an $a \in U$ such that $x < a$ implies $x \notin U$; a is called the supremum or the least
upper bound of S. It is denoted as $\sup S = a$ or $\sup_{x \in S} x = a$. If S is bounded from
below, the set L of its lower bounds, $L = \{b : b \in R,\ b \le x \text{ for all } x \in S\}$, is not
empty. Proposition (ii) asserts that there exists a $b \in L$ such that $x > b$ implies
$x \notin L$; b is called the infimum or the greatest lower bound of S. It is denoted as
$\inf S = b$ or $\inf_{x \in S} x = b$. Given a set S, a subset of R, S may not contain its
least upper bound even if it is bounded from above. Similarly, S may not contain
its greatest lower bound even if it is bounded from below. For example, the set S
defined by $S = \{1/q : q = 1, 2, \ldots\}$ is clearly a subset of R, and $\sup S = 1$ and
$\inf S = 0$. Note that $1 \in S$ and $0 \notin S$. That is, $\inf S \notin S$. In general, given an
arbitrary subset S of R, if $a = \sup S$ is in S, a is called the maximum element of
S. Similarly, if $b = \inf S$ is in S, b is called the minimum element of S. The above
is an example of the case in which a set does not contain its infimum.
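The example S = {1/q: q = 1, 2, ...} can be explored numerically with exact rational arithmetic. The sketch below is an illustration only: a computer can list just finitely many elements, so the truncation length and the helper names are assumptions introduced here, and the code supports the claims about sup S and inf S without proving them.

```python
from fractions import Fraction

def S_truncated(n):
    """The first n elements of S = {1/q : q = 1, 2, ...}."""
    return [Fraction(1, q) for q in range(1, n + 1)]

elements = S_truncated(1000)
largest = max(elements)      # 1, attained at q = 1, so sup S is the maximum of S
smallest = min(elements)     # 1/1000: every listed element is strictly positive

def element_below(b):
    """For a positive Fraction b, return an element of S strictly below b.

    q = floor(1/b) + 1 satisfies q > 1/b, hence 1/q < b, so no positive
    number can be a lower bound of S.
    """
    q = int(1 / b) + 1
    return Fraction(1, q)

print(largest, smallest, element_below(Fraction(1, 1000)))
```

Since 0 is a lower bound of S and every positive candidate fails, the greatest lower bound is 0, which is not itself an element of S.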
Finally, we may note that the notation "$\Rightarrow$" is often used to mean "imply."
For example, $A \Rightarrow B$ is read as "statement A implies statement B." If $A \Rightarrow B$ holds,
we say that B is a necessary condition for A and that A is a sufficient condition for B.
When $A \Rightarrow B$ holds, it is not necessarily the case that $B \Rightarrow A$ holds (that is, "the
converse is not necessarily true"). For example, let Q be the set of all rational
numbers and J be the set of all integers; then $x \in J \Rightarrow x \in Q$, but $x \in Q$ does not
necessarily imply $x \in J$. When $A \Rightarrow B$ and $B \Rightarrow A$ both hold, then we may say
that A is a necessary and sufficient condition for B or B is a necessary and sufficient
condition for A. In this case A and B are also said to be logically equivalent. When
$A \Rightarrow B$ holds, then "B does not hold $\Rightarrow$ A does not hold" is true. On the other
hand, if we can show "B does not hold $\Rightarrow$ A does not hold," then $A \Rightarrow B$. This is
often used to prove the statement "$A \Rightarrow B$."
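The equivalence between a statement and its contrapositive, and the failure of the converse, can be confirmed by listing all truth values. This is a minimal sketch; the function name `implies` is an assumption introduced for illustration.

```python
from itertools import product

def implies(a, b):
    """Truth value of 'a implies b': false only when a holds and b does not."""
    return (not a) or b

truth_values = list(product([False, True], repeat=2))

# 'A implies B' and 'not B implies not A' agree in every case.
contrapositive_equivalent = all(
    implies(a, b) == implies(not b, not a) for a, b in truth_values
)

# 'A implies B' and its converse 'B implies A' do not agree in every case.
converse_equivalent = all(
    implies(a, b) == implies(b, a) for a, b in truth_values
)

print(contrapositive_equivalent, converse_equivalent)
```

The case in which A holds and B does not is the one that separates a statement from its converse.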
b. $R^n$ AND LINEAR SPACE

Let R be the set of all real numbers. Consider an n-tuple of real numbers,
$x = (x_1, x_2, \ldots, x_n)$. Write $R^n$ for the set of all n-tuples of real numbers, that is,
$R^n = \{x = (x_1, x_2, \ldots, x_n) : x_i \in R,\ i = 1, 2, \ldots, n\}$. Then $R^n$ is the n-fold Cartesian
product of R. The ith element $x_i$ of $x \in R^n$ is the ith coordinate of x. Define the
addition of any two arbitrary members x and y of $R^n$ by coordinate-wise
addition, that is, $x + y = z$ means $x_i + y_i = z_i$, $i = 1, 2, \ldots, n$. Clearly $z \in R^n$ if $x,
y \in R^n$. In other words, $R^n$ is closed under addition. Given an arbitrary scalar
$\alpha \in R$, define the multiplication of an arbitrary member x of $R^n$ by $\alpha$ (called
scalar multiplication) by coordinate-wise multiplication, that is, $z = \alpha x = x\alpha$
means $z_i = \alpha x_i$, $i = 1, 2, \ldots, n$. Clearly $x \in R^n$, $\alpha \in R$ implies $\alpha x \in R^n$. In other
words, $R^n$ is "closed" under scalar multiplication.
When n = 2, we can illustrate the above concepts of coordinates, addition,
and scalar multiplication in Figure 0.2. This diagram should be well known.
Given the above rule of addition and scalar multiplication in $R^n$, we can
readily check that the following eight properties hold for arbitrary elements x,
y, and z of $R^n$ and scalars $\alpha, \beta \in R$.
(L-1) (Associative Law) $x + (y + z) = (x + y) + z$.
(L-2) There exists an element called 0 such that $x + 0 = 0 + x = x$.
(L-3) There exists an element (-x) for every x such that $x + (-x) = 0$.
(L-4) (Commutative Law) $x + y = y + x$.
(L-5) (Associative Law) $\alpha(\beta x) = (\alpha\beta)x$.
(L-6) $1x = x$.
(L-7) (Distributive Law) $\alpha(x + y) = \alpha x + \alpha y$.
(L-8) (Distributive Law) $(\alpha + \beta)x = \alpha x + \beta x$.
Figure 0.2. An Illustration of $R^2$.

A little notational caution is needed here about the symbol 0, which can mean
either the scalar zero or the n-tuple of 0's. The latter is sometimes called the
origin for the obvious geometric reason (see Figure 0.2).
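The eight properties (L-1) to (L-8) can be spot-checked for particular vectors. The sketch below represents a point of R^3 as a tuple of integers so that the arithmetic is exact; the helper names and sample values are assumptions introduced here, and a check on sample vectors is of course not a proof for all of R^n.

```python
def add(x, y):
    """Coordinate-wise addition of two n-tuples."""
    return tuple(xi + yi for xi, yi in zip(x, y))

def scale(alpha, x):
    """Coordinate-wise multiplication of an n-tuple by the scalar alpha."""
    return tuple(alpha * xi for xi in x)

x, y, z = (1, -2, 3), (4, 0, -1), (2, 2, 2)
alpha, beta = 3, -5
zero = (0, 0, 0)          # the origin, not the scalar zero
neg_x = scale(-1, x)

checks = {
    "L-1": add(x, add(y, z)) == add(add(x, y), z),
    "L-2": add(x, zero) == add(zero, x) == x,
    "L-3": add(x, neg_x) == zero,
    "L-4": add(x, y) == add(y, x),
    "L-5": scale(alpha, scale(beta, x)) == scale(alpha * beta, x),
    "L-6": scale(1, x) == x,
    "L-7": scale(alpha, add(x, y)) == add(scale(alpha, x), scale(alpha, y)),
    "L-8": scale(alpha + beta, x) == add(scale(alpha, x), scale(beta, x)),
}

print(checks)
```

Note that `zero` is the n-tuple of 0's, which the code keeps distinct from the scalar 0.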
Given an arbitrary set X (not necessarily $R^n$), if "addition" and "scalar
multiplication" are defined, if X is closed under these two operations, and if
the above properties (L-1) to (L-8) are satisfied, then we call X a (real) linear
space or (real) vector space, and an element of X is called a vector. Of course
$R^n$ with the above rule of addition and scalar multiplication is only one example
of a linear space. From now on we shall regard $R^n$ as a particular one in which
such addition and multiplication are defined. We henceforth call $R^n$
"(n-dimensional) real space." For illustrative purposes we give the following as examples
of linear spaces other than $R^n$.
EXAMPLE 1: The set F of real-valued functions defined on the interval
[a, b]. Given $f, g \in F$, "addition" (f + g) is defined by f(x) + g(x) for each
$x \in [a, b]$, and scalar multiplication ($\alpha f$) is defined by $\alpha f(x)$ for each
$x \in [a, b]$.
EXAMPLE 2: $x = (x_1, x_2, \ldots, x_n, \ldots)$, where $\sum_k |x_k| < \infty$, with a similar
definition of addition and scalar multiplication as in $R^n$.
EXAMPLE 3: The set of all two-by-two matrices with real number entries.
EXAMPLE 4: The set of all continuous functions defined on the closed
interval [0, 1] into R (denoted by $C_{[0,1]}$), with the same rules of addition
and scalar multiplication as in Example 1.
REMARK: Given an arbitrary set X, if "addition" is defined on X and $x,
y \in X$ implies $x + y \in X$ with the properties (L-1) to (L-4), we say X is an
Abelian (additive) group.

A subset S of a linear space X is called a linear subspace or vector subspace
if $x, y \in S \Rightarrow x + y \in S$ and $x \in S$, $\alpha \in R \Rightarrow \alpha x \in S$. It should be clear that
a linear subspace is also a linear space [that is, the above axioms (L-1) to (L-8)
are satisfied in S]. If $S_i$, $i = 1, 2, \ldots, n$, are linear subspaces in a linear space X,
then the intersection $\bigcap_{i=1}^{n} S_i$ is also a linear subspace.

REMARK: Given a linear space X, an arbitrary subset of X is not necessarily
a linear space. In fact, in most cases it is not. However, the cases in which
a subset itself is a linear space are important, for we can then utilize the
above eight properties (L-1) to (L-8). The set $R^3$ is a linear space and its
subsets $S_1 = \{(x_1, 0, 0) \in R^3 : x_1 \in R\}$ and $S_2 = \{(x_1, x_2, 0) \in R^3 : x_1,
x_2 \in R\}$, respectively, are linear subspaces of $R^3$. However, $S_1 \cup \{(1, 1, 1)\}$ is
not a linear space.
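The last claim of the remark can be seen by exhibiting one sum and one scalar multiple that leave the set. The sketch below is an illustration only; the membership tests and the sample vectors are assumptions introduced here.

```python
def add(x, y):
    return tuple(xi + yi for xi, yi in zip(x, y))

def scale(alpha, x):
    return tuple(alpha * xi for xi in x)

def in_S1(v):
    """Membership in S1 = {(x1, 0, 0)}: the last two coordinates vanish."""
    return v[1] == 0 and v[2] == 0

def in_S1_with_extra_point(v):
    """Membership in the union of S1 and the single point (1, 1, 1)."""
    return in_S1(v) or v == (1, 1, 1)

u, w = (2, 0, 0), (1, 1, 1)

# S1 itself is closed under both operations (checked on sample vectors).
closed_examples = in_S1(add(u, (5, 0, 0))) and in_S1(scale(-3, u))

# The enlarged set is not: both results below fall outside it.
sum_escapes = not in_S1_with_extra_point(add(u, w))          # (3, 1, 1)
multiple_escapes = not in_S1_with_extra_point(scale(2, w))   # (2, 2, 2)

print(closed_examples, sum_escapes, multiple_escapes)
```

One failure of closure is enough to rule out a subspace, whereas closure of S1 has to hold for all of its elements, which the sample check only suggests.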
REMARK: In the above definition of linear space, scalar multiplication is
confined to multiplication by real numbers. In general we do not have to
restrict the "scalars" to real numbers. If a linear space is defined over com-
plex numbers, we call it a complex linear space or a linear space with complex
field. In fact, we can use anything as a scalar in the defining properties of a
linear space, if it satisfies the properties of the algebraic concept "field."
However, in this book, we confine ourselves to a "real linear space," or a
"linear space with real field" (that is, the case in which the scalars are the
real numbers). Hence, when we subsequently refer to a linear space, we
mean it to be a real linear space as was defined above.
Let $S_1$ and $S_2$ be two subsets in a linear space X. Since addition and scalar
multiplication are defined in a linear space, we can define the following set S, for
fixed scalars $\alpha$ and $\beta$:

$$S = \{\alpha x + \beta y : x \in S_1,\ y \in S_2, \text{ and } \alpha, \beta \in R\}$$

We denote S by $\alpha S_1 + \beta S_2$ and call it a linear sum of $S_1$ and $S_2$. Clearly S is in
X. Given m sets, $S_1, S_2, \ldots, S_m$ in a linear space X, we can analogously define
$S = \sum_{i=1}^{m} \alpha_i S_i$, $\alpha_i \in R$, $i = 1, 2, \ldots, m$. The linear sum of sets must be
distinguished from the union of sets (such as $\bigcup_{i=1}^{m} S_i$).
Given any two arbitrary members $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots,
y_n)$ of $R^n$, we may define a rule for multiplication of x and y as follows:

$$x \cdot y = \sum_{i=1}^{n} x_i y_i$$

Note that $x \cdot y$ is a real number; $x \cdot y$ computed by the above rule is called the
inner product in $R^n$.
In general, given an arbitrary linear space X (over a real field), an inner
product is defined as a real-valued function defined on the Cartesian product
$X \otimes X$ (denoted by $x \cdot y$ or $\langle x, y \rangle$, where $x \in X$ and $y \in X$), which satisfies
the following properties. For arbitrary elements x, y, and $z \in X$ and $\alpha, \beta \in R$,
(I-1) $x \cdot y = y \cdot x$.
(I-2) $(\alpha x + \beta y) \cdot z = \alpha(x \cdot z) + \beta(y \cdot z)$.
(I-3) $x \cdot x \ge 0$, and $x \cdot x = 0$ if and only if x = 0.
A linear space with an inner product defined is called an inner product space.
Clearly the inner product defined above for $R^n$ satisfies the above axioms (I-1) to
(I-3). We call it the usual (Euclidian) inner product.
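The usual inner product on R^n and its defining properties can be spot-checked with integer vectors. In the sketch below the three checks are labeled symmetry, linearity in the first argument, and positivity, which are the standard inner product axioms; the helper names and sample values are assumptions introduced for illustration.

```python
def dot(x, y):
    """The usual inner product: the sum of x_i * y_i over i = 1, ..., n."""
    return sum(xi * yi for xi, yi in zip(x, y))

x, y, z = (1, -2, 3), (4, 0, -1), (2, 2, 2)
alpha, beta = 3, -5
zero = (0, 0, 0)

# The vector alpha*x + beta*y, built coordinate by coordinate.
combo = tuple(alpha * xi + beta * yi for xi, yi in zip(x, y))

symmetric = dot(x, y) == dot(y, x)
linear = dot(combo, z) == alpha * dot(x, z) + beta * dot(y, z)
positive = dot(x, x) > 0 and dot(zero, zero) == 0

print(dot(x, y), dot(combo, z), symmetric, linear, positive)
```

For these vectors the inner product of x and y equals 1, and both sides of the linearity check equal -18.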
Given a point $x = (x_1, x_2, \ldots, x_n)$ in $R^n$, the distance between x and the origin can
be computed by

$$d(x, 0) = \sqrt{\sum_{i=1}^{n} x_i^2} = \sqrt{x \cdot x}$$

Similarly, given any two arbitrary points $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots,
y_n)$ in $R^n$, the distance between x and y can be computed by

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} = \sqrt{(x - y) \cdot (x - y)}$$

The "distance" defined in the above formula is called the Euclidian distance. The
Euclidian distance is illustrated in Figure 0.3.
We can easily show that the Euclidian distance defined above satisfies the
following properties.
(M-1) $d(x, y) = 0$ if and only if x = y.
(M-2) (Triangular inequality) $d(x, y) + d(y, z) \ge d(z, x)$.
(M-3) $d(x, y) \ge 0$ for all x and y.
(M-4) (Symmetry) $d(x, y) = d(y, x)$.
Properties (M-3) and (M-4) can be obtained from (M-1) and (M-2). Given an
arbitrary set X, if a function d from $X \otimes X$ into R is defined and if d satisfies
the above properties (M-1) and (M-2) [hence also (M-3) and (M-4)], then X is
called a metric space, d is called a metric, and d(x, y) is called the distance between
two points x and y in X. The metric space is denoted by (X, d). The metric
(or distance) is a function from $X \otimes X$ into R.
Figure 0.3. An Illustration of Euclidian Distance.

REMARK: The first example of a metric space is obviously $R^n$ with the
Euclidian distance defined, from which the concept is formulated. However,
it is possible to think of many different kinds of metric spaces. In the
following example the reader can easily check that the axioms (M-1) and (M-2)
of metric spaces are satisfied.
Let X be an arbitrary nonempty set and define

$$d(x, y) = \begin{cases} 0 & \text{if } x = y \\ 1 & \text{if } x \ne y \end{cases}$$

This example shows that every nonempty set can be regarded as a metric space.
This example is often used to show that certain statements which hold true in $R^n$
(with the Euclidian distance) do not necessarily hold true in an arbitrary metric
space.
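The metric axioms can be tested by brute force on a finite sample of points, both for the Euclidian distance and for the 0-1 distance just defined. The sketch below is illustrative only: the function names, the sample points, the small tolerance used for floating-point comparisons, and the contrasting function `squared_gap` (which is not from the text) are all assumptions introduced here.

```python
import math
from itertools import product

def euclidean(x, y):
    """Euclidian distance between two n-tuples."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def discrete(x, y):
    """The 0-1 distance, defined on any nonempty set."""
    return 0 if x == y else 1

def squared_gap(x, y):
    """Squared difference of two numbers; it fails the triangular inequality."""
    return (x - y) ** 2

def satisfies_metric_axioms(d, points, tol=1e-12):
    """Check (M-1) to (M-4) on every triple drawn from a finite sample."""
    for x, y, z in product(points, repeat=3):
        if (d(x, y) == 0) != (x == y):          # (M-1)
            return False
        if d(x, y) + d(y, z) < d(z, x) - tol:   # (M-2)
            return False
        if d(x, y) < 0:                         # (M-3)
            return False
        if abs(d(x, y) - d(y, x)) > tol:        # (M-4)
            return False
    return True

plane_points = [(0, 0), (3, 4), (1, 1), (-2, 5)]
labels = ["a", "b", "c"]      # a set with no linear structure at all

print(satisfies_metric_axioms(euclidean, plane_points))
print(satisfies_metric_axioms(discrete, labels))
print(satisfies_metric_axioms(squared_gap, [0, 1, 2]))
```

The third call fails because the squared gaps from 0 to 1 and from 1 to 2 sum to 2, which is less than the squared gap of 4 from 2 to 0.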
Given a point x in $R^n$, the Euclidian distance between x and the origin,
d(x, 0), is also called the Euclidian norm of x. We denote it by $\|x\|$. Then d(x, y)
can be denoted by $\|x - y\|$. Given arbitrary points x and y in $R^n$ and a scalar
$\alpha \in R$, we can easily verify that the following three properties hold for the
Euclidian norm.
(N-1) $\|x\| \ge 0$, and $\|x\| = 0$ if and only if x = 0.
(N-2) (Triangular inequality) $\|x + y\| \le \|x\| + \|y\|$.
(N-3) $\|\alpha x\| = |\alpha| \, \|x\|$.
Given an arbitrary vector space X (not necessarily $R^n$), if we define a real-valued
function, called a norm, which satisfies the above three properties, we call X a
normed vector space, or a normed linear space. Clearly every normed linear space
is a metric space with respect to the induced metric defined by $d(x, y) = \|x - y\|$.
It should be noted that the choice of a norm is not necessarily unique. For example,
$R^n$ is a normed linear space with the Euclidian norm, but it can also be a normed
linear space with the following norms:

$$\|x\| = \max_{1 \le i \le n} |x_i|$$

or

$$\|x\| = \sum_{i=1}^{n} |x_i|$$

The reader can check that either of the above two satisfies all three properties
(N-1) to (N-3).
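The three norms mentioned above can be compared on the same vectors. The following sketch checks (N-1) to (N-3) for sample values only; the function names, the vectors, and the floating-point tolerance are assumptions introduced for illustration.

```python
import math

def euclidean_norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

def max_norm(x):
    return max(abs(xi) for xi in x)

def sum_norm(x):
    return sum(abs(xi) for xi in x)

x, y, alpha = (3, -4), (1, 2), -2
zero = (0, 0)

def check_norm(norm, tol=1e-12):
    """Spot-check (N-1), (N-2), and (N-3) for the sample vectors above."""
    x_plus_y = tuple(xi + yi for xi, yi in zip(x, y))
    alpha_x = tuple(alpha * xi for xi in x)
    n1 = norm(x) >= 0 and norm(zero) == 0
    n2 = norm(x_plus_y) <= norm(x) + norm(y) + tol
    n3 = abs(norm(alpha_x) - abs(alpha) * norm(x)) <= tol
    return n1 and n2 and n3

for norm in (euclidean_norm, max_norm, sum_norm):
    print(norm.__name__, norm(x), check_norm(norm))
```

The same vector x = (3, -4) has length 5, 4, or 7 depending on the norm chosen, which illustrates that the choice of a norm is not unique.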
We should note that an arbitrary metric space cannot necessarily be a
normed linear space, for it may not be a linear space in the beginning.
Given a linear space X, we can also induce the concept of a norm from the
concept of an inner product. That is, let X be an inner product space and define
$\|x\| = (x \cdot x)^{1/2}$, or $\|x\|^2 = x \cdot x$.
We can easily verify all the properties of a norm, (N-1) to (N-3); hence X becomes
a normed linear space as well as an inner product space. Note that in $R^n$ this
relation holds when $\|x\|$ is the Euclidian norm and $x \cdot x = \sum_{i=1}^{n} x_i^2$.
Thus given an arbitrary linear space X, if we first define an inner product
and if we induce the norm and then the metric in the way described above, then
X becomes a normed, metric, and inner product linear space. Thus all the prop-
erties (N-1) to (N-3), (M-1) to (M-4), (I-1) to (I-3), and (L-1) to (L-8) are available.
In particular, $R^n$ can be such a space with its usual Euclidian norm, metric, and
inner product. Unless otherwise stated, we consider $R^n$ as such a space. That is,
given x and y in $R^n$, we have

$$x \cdot y = \sum_{i=1}^{n} x_i y_i, \qquad \|x\| = (x \cdot x)^{1/2} = \Big(\sum_{i=1}^{n} x_i^2\Big)^{1/2}$$

and

$$d(x, y) \equiv \|x - y\| = [(x - y) \cdot (x - y)]^{1/2} = \Big(\sum_{i=1}^{n} (x_i - y_i)^2\Big)^{1/2}$$

with (N-1) to (N-3), (M-1) to (M-4), (I-1) to (I-3), (L-1) to (L-8), and all the properties
derived from them.
We remark that to have a norm it is not necessary to have an inner product
at all. We also remark that there are normed linear spaces over which one cannot
define an inner product that will "generate" or "induce" the norm on the space.
In other words, there may not exist any inner product such that $x \cdot x = \|x\|^2$,
where $\|x\|$ is the norm on the space. However, if further conditions are imposed
on the norm, then one can guarantee the existence of an inner product that will
induce the norm.
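The text does not name the "further conditions." One standard condition, supplied here as an outside fact rather than taken from the book, is the parallelogram law $\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2$, which any norm induced by an inner product must satisfy. The sketch below (function names are hypothetical) shows that the Euclidian norm satisfies it at a sample pair of points while the max norm does not, so the max norm on $R^2$ cannot be induced by any inner product.

```python
import math


def norm_euclid(x):
    return math.sqrt(sum(xi * xi for xi in x))


def norm_max(x):
    return max(abs(xi) for xi in x)


def parallelogram_gap(norm, x, y):
    """Return ||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2 (zero if the law holds at x, y)."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return norm(s) ** 2 + norm(d) ** 2 - 2 * norm(x) ** 2 - 2 * norm(y) ** 2


x, y = [1.0, 0.0], [0.0, 1.0]
print(parallelogram_gap(norm_euclid, x, y))  # zero up to rounding error
print(parallelogram_gap(norm_max, x, y))     # nonzero: the law fails for the max norm
```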

c. BASIS AND LINEAR FUNCTIONS

We begin this subsection by defining some important concepts of a linear
space.

Definition: Let X be a linear space. Given m vectors in X, $x^1, x^2, \ldots, x^m$, the
vector x defined by

$$x = \sum_{i=1}^{m} \alpha_i x^i, \qquad \alpha_i \in R, \; i = 1, 2, \ldots, m$$

is called a linear combination of these m vectors.

REMARK: In order that the above definition of x be meaningful, X cannot
be an arbitrary set. That is, scalar multiplication and addition must be de-
fined on the set.' Note also that x is in X since X is a linear space. Note also
that m must be a finite number, since a linear combination is defined only
with respect to a finite sum.

Definition: A finite set of vectors $\{x^1, x^2, \ldots, x^m\}$ in a linear space is called
linearly independent if

$$\sum_{i=1}^{m} \alpha_i x^i = 0 \text{ implies that } \alpha_i = 0 \text{ for every } i$$

An arbitrary (finite or infinite) collection of vectors, S, is said to be linearly
independent if every finite subset is linearly independent. The collection of vectors
S is said to be linearly dependent if it is not linearly independent; S is linearly
dependent if and only if there exists a linear combination of (a finite number of)
vectors in S, say $x^1, x^2, \ldots, x^m$, such that

$$\sum_{i=1}^{m} \alpha_i x^i = 0 \text{ for some } \alpha_i \in R, \; \alpha_i \neq 0, \; i = 1, 2, \ldots, m$$

REMARK: If S is finite, say, $\{x^1, x^2, \ldots, x^m\}$, S is linearly dependent if
and only if there exist scalars $\alpha_1, \alpha_2, \ldots, \alpha_m$, all in R and not vanishing
simultaneously, such that $\sum_{i=1}^{m} \alpha_i x^i = 0$.
The following proposition is an immediate but very important corollary of
the above definitions.

Corollary: The set of nonzero vectors, $\{x^1, x^2, \ldots, x^m\}$, is linearly dependent if
and only if one of the vectors in the set, say, $x^i$, can be expressed as a linear combination
of other vectors in the set. (For the proof, see Halmos [5], pp. 9-10, for
example.)
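For vectors in $R^n$, linear independence can be tested mechanically: a finite set of m vectors is linearly independent exactly when the matrix whose rows are those vectors has rank m. The following is a minimal sketch using exact rational arithmetic; the helper names `rank` and `is_linearly_independent` are illustrative and not part of the text.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over exact fractions."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def is_linearly_independent(vectors):
    """A finite set of vectors is independent iff its rank equals its size."""
    return rank(vectors) == len(vectors)


print(is_linearly_independent([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))
print(is_linearly_independent([(1, 2, 3), (2, 4, 6)]))   # second = 2 * first
print(is_linearly_independent([(1, 0), (0, 1), (1, 1)]))  # third = first + second
```

The last two sets illustrate the corollary: in each, one vector is a linear combination of the others.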

Definition: Let X be a linear space. A linearly independent set S in X, with the
property that every vector x in X can be expressed as a linear combination of
the vectors in S, is called a basis (or a Hamel basis) for X.
REMARK: A basis S may consist of a finite or infinite number of members.
If it is finite, X is said to be finite dimensional (and otherwise infinite
dimensional).
REMARK: The usual Cartesian coordinate system of $R^n$, that is, n vectors,
(1, 0, ..., 0), (0, 1, ..., 0), ..., (0, ..., 0, 1), each consisting of n members,
forms a basis for $R^n$. Obviously there can be many bases for a given linear
space X.
The following theorem is basic.

Theorem 0.A.1:

(i) Every linear space has a basis.

(ii) Any two bases of a linear space are in one-to-one correspondence.
PROOF: See Wilansky [12], pp. 16-17, for example.
REMARK: The proof of (i) requires Zorn's lemma; an understanding of
this lemma is not required in this book. The interested reader can refer to
any standard textbook on set theory or topology (for example, Halmos
[6], sec. 16, and Kelley [7], p. 33).
A corollary of (ii) in the above theorem is that the number of elements
in any basis of a finite dimensional linear space is the same as in any other
basis. Hence we can define the number of elements of a basis of a finite dimen-
sional linear space as the dimension of the space. Moreover, if X is an n-dimen-
sional linear space, then every set of (n + 1) vectors in X is linearly dependent.

We now define an important class of functions, "linear functions." For the
remainder of this section, all functions are taken to be single-valued.

Definition: A function f from a linear space X into a linear space Y is said to be
a linear function if

(i) f(x + x') = f(x) + f(x') for every $x, x' \in X$;

(ii) $f(\alpha x) = \alpha f(x)$ for every $\alpha \in R$ and $x \in X$.

REMARK: In particular, if Y = R (that is, if f is real-valued), then f is
often called a linear form or a linear functional. A real-valued function
which may not be linear is simply called a functional.
REMARK: In the above definition addition, such as x + x' and f(x) +
f(x'), and scalar multiplication, such as ax and af(x), are meaningful because
both X and Y are linear spaces. Note also that if f is a linear function, then
-f is also a linear function.
EXAMPLES OF LINEAR FUNCTIONALS:

1. Define $f: R^n \to R$ by $f(x) = a \cdot x$, where $x \in R^n$ and a is any (fixed) vector in
$R^n$.

2. Let $C_{[a,b]}$ be the set of all continuous functions defined on the closed
interval [a, b]. Define $f: C_{[a,b]} \to R$ by $f(x) = \int_a^b x(t)\,dt$, where $x(t) \in C_{[a,b]}$.

Incidentally, $C_{[a,b]}$ is a linear space but not finite dimensional.

Consider now the collection of all linear functions of a linear space X
into a linear space Y. We denote this collection by L[X, Y]. First, note that
a function which maps $x \in X$ into $0 \in Y$ is in L[X, Y], and we shall reserve
the notation 0 to denote this, that is, $0 \in L[X, Y]$ and 0(x) = 0. Now define
the addition of any two elements, say, f and g, of L[X, Y] by [f + g](x) =
f(x) + g(x), for each $x \in X$. Define [-f] by [-f](x) = -f(x) for each $x \in X$
and $f \in L[X, Y]$. We can easily see that f + [-f] = 0. Furthermore, [-f] is
the only g in L[X, Y] such that f + g = 0. Using the defining properties (i) and
(ii) of linear functions, the argument can be continued to establish L[X, Y] as
a linear space. In particular, the set of linear functions of a linear space X into
itself (that is, L[X, X]) is a linear space, and the set of linear functionals L[X, R]
is a linear space.
Consider now L[X, X]. Denote as L the set of those $f \in L[X, X]$ for which
f(X) = X (that is, "onto"). The identity function which maps each element of
X into itself is in L. We reserve the notation I for this function; thus we can
write $I \in L$ and I(x) = x.
Let $f, g \in L$. Define the multiplication of linear functions f and g (denoted
by $f \circ g$, or simply fg) by $[f \circ g](x) \equiv f[g(x)]$. It should be noted that the order
of transformation of $f \circ g$ is to transform first by g and then by f. In general,
$f \circ g$ is not the same as $g \circ f$.

Definition: The function $f \in L$ is said to be invertible if the following hold:

(i) $x, x' \in X$ with $x \neq x'$ implies $f(x) \neq f(x')$.

(ii) For every $y \in X$, there exists at least one $x \in X$ such that f(x) = y.

Definition: Let $f \in L$ be invertible. Define the inverse of f (again denoted by
$f^{-1}$) as follows. If $y_0$ is any vector in X, we may by (ii) find an $x_0$ in X such that
$f(x_0) = y_0$. Moreover, by (i) this $x_0$ is unique. Define $f^{-1}(y_0)$ to be this $x_0$.
REMARK: From the definition it is easy to see that 0 is not invertible and
that "f is invertible" implies "$f \circ f^{-1} = f^{-1} \circ f = I$."
For finite dimensional spaces, we have the following remarkable result.

Theorem 0.A.2: Let X be a finite dimensional linear space and define L as above;
$f \in L$ is invertible if and only if f(x) = 0 implies x = 0.
PROOF: See Halmos [5], p. 63, for example.
REMARK: In finite dimensional spaces "f is not invertible" therefore implies
that "there exists an $x \in X$, $x \neq 0$, such that f(x) = 0." Such an f is often
called singular and an invertible f is often called nonsingular.
Let X be an n-dimensional linear space where n is a finite positive integer.
Let $S = \{x^1, x^2, \ldots, x^n\}$ be a basis of X. Let $f \in L$; thus $f(x) \in X$. Hence in
particular $f(x^j) \in X$, j = 1, 2, ..., n. From the definition of basis, $f(x^j)$ can be
written as

$$f(x^j) = \sum_{i=1}^{n} a_{ij} x^i, \qquad j = 1, 2, \ldots, n$$

where $a_{ij} \in R$, i, j = 1, 2, ..., n. Consider the following array of the $a_{ij}$'s.

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}$$

or simply $A = [a_{ij}]$. Such an array is called a matrix.

It should be noted that the matrix A is determined with respect to a particular
set of basis vectors. That is, a matrix A is a representation of a linear function f
when a particular basis is chosen. When a different basis is chosen, we obtain a
different array of scalars. To emphasize this we used different notations, f for
a linear function and A for its matrix representation under a particular basis, which
is contrary to the usual convention.
Note also that in the matrix representation of a linear function, the order
of the basis chosen is important. When the basis is defined as a set of vectors, the
order of the vectors is clearly not particularly important. However, in order to
fix the matrix representation of linear functions, we also have to fix the order of
the basis vectors.
In the above, a matrix is defined in connection with f applied on a basis
$\{x^1, \ldots, x^n\}$. Consider now an arbitrary vector x in X. Let $x = \sum_{j=1}^{n} \xi_j x^j$ in terms
of these basis vectors. Then noting $f(x) = \sum_{j=1}^{n} \xi_j f(x^j)$ since f is linear, we again
obtain an array of scalars $[a_{ij}]$, where the $a_{ij}$ are given by $f(x^j) = \sum_{i=1}^{n} a_{ij} x^i$.
Note that if $y = f(x) = \sum_{i=1}^{n} \eta_i x^i$, then $\eta_i = \sum_{j=1}^{n} a_{ij} \xi_j$. Considering x and y as
(column) vectors whose ith element is $\xi_i$ and $\eta_i$ respectively, y = f(x) is represented
by the usual textbook matrix notation $\eta_i = \sum_{j=1}^{n} a_{ij} \xi_j$, or y = Ax.
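The construction can be traced concretely in $R^2$ with the standard basis. In the sketch below, column j of the matrix holds the coordinates of $f(x^j)$, and applying the matrix to a coordinate vector reproduces f. The particular map `f` and the helper names are hypothetical choices made only for illustration.

```python
def matrix_of(f, n):
    """Matrix A = [a_ij] of f with respect to the standard basis of R^n.

    Column j holds the coordinates of f(e_j), so a_ij is the i-th coordinate of f(e_j).
    """
    columns = []
    for j in range(n):
        e_j = [1 if k == j else 0 for k in range(n)]
        columns.append(f(e_j))
    return [[columns[j][i] for j in range(n)] for i in range(n)]


def apply(A, x):
    """Compute y = A x, that is, y_i = sum_j a_ij x_j."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def f(x):
    # A hypothetical linear map on R^2, chosen only for illustration.
    return [2 * x[0] + x[1], 3 * x[1]]


A = matrix_of(f, 2)
x = [5, -4]
print(A)
print(apply(A, x), f(x))  # the matrix reproduces the map on any vector
```

Choosing a different (ordered) basis would change the entries of A while f itself stays the same, which is the distinction the text draws between a linear function and its matrix.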
Using a fixed basis (often called a coordinate system), we obtain the matrix
representation $A = [a_{ij}]$ of a linear function f. It can be shown that this representation
is one to one, that is, matrices of two different functions in L are different.
Moreover, we can also assert that every array [aij] of n2 scalars is the matrix of
some linear function in L. For the proofs of these statements, see Halmos [5],
pp. 67-68, for example. Moreover, we can assert that this association between L
and the set of matrices preserves addition, multiplication, and the 0 and identity
element.
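For this purpose, consider the set of all matrices $[a_{ij}]$, $[b_{ij}]$, and so on,
i, j = 1, 2, ..., n, and define addition, scalar multiplication, product, the 0 element
(0), and the identity element (I) in this set by

$$[a_{ij}] + [b_{ij}] = [a_{ij} + b_{ij}]$$

$$\alpha [a_{ij}] = [\alpha a_{ij}], \qquad \alpha \in R$$

$$[a_{ij}] [b_{ij}] = \Big[\sum_{k=1}^{n} a_{ik} b_{kj}\Big]$$

$$0 = [0_{ij}], \text{ where } 0_{ij} = 0 \in R \text{ for all } i \text{ and } j$$

$$I = [e_{ij}], \text{ where } e_{ij} = 0 \in R \text{ if } i \neq j \text{ and } e_{ij} = 1 \in R \text{ if } i = j$$

Then we can assert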
Then we can assert

Theorem 0.A.3: Given L and the set T of all matrices $[a_{ij}]$ defined by
$f(x^j) = \sum_{i=1}^{n} a_{ij} x^i$ with a fixed basis of X, $\{x^1, x^2, \ldots, x^n\}$, there exists an isomorphic
correspondence, that is, a one-to-one correspondence between L and T that preserves
addition, scalar multiplication, product, and the 0 and identity elements, when these
operations on the matrices are defined as above.

PROOF: See Halmos [5], pp. 67-68, for example.
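One part of this correspondence, that the product of linear functions corresponds to the product of their matrices, can be checked on a small case. The sketch below uses two hypothetical invertible maps on $R^2$ (a shear and a quarter-turn rotation) and the standard basis; all names are illustrative. It also shows that the order of composition matters.

```python
def matrix_of(f, n):
    """Matrix of f w.r.t. the standard basis of R^n (column j = coordinates of f(e_j))."""
    cols = [f([1 if k == j else 0 for k in range(n)]) for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Matrix product: entry (i, j) is sum_k a_ik b_kj."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def compose(f, g):
    """Return f o g: transform first by g and then by f."""
    return lambda x: f(g(x))


def f(x):
    return [x[0] + x[1], x[1]]   # a shear (hypothetical example)


def g(x):
    return [-x[1], x[0]]         # rotation by a quarter turn (hypothetical example)


A, B = matrix_of(f, 2), matrix_of(g, 2)
print(matrix_of(compose(f, g), 2) == matmul(A, B))  # composition matches the product
print(matmul(A, B) == matmul(B, A))                  # the order matters
```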

REMARK: To arrive at the above theorem, we restricted L to be the set
of linear functions of an n-dimensional linear space X onto itself. However,
we may extend our consideration in an analogous fashion to the case where L
is the set of linear functions defined on an n-dimensional linear space X onto
an m-dimensional linear space Y. In this case we do not get square matrices,
but obtain $(m \times n)$ rectangular matrices as the representation of such linear
functions.
We have come a rather long way. Starting from the definitions of basis, linear
functions, and so on, we finally reached "matrices," which, after all, must be
familiar to the readers who have finished an elementary course in matrix algebra.
We should note, however, that we have discovered that these "matrices" have
much more profound meaning than a mere array of numbers. Namely, the set
of these matrices is an "isomorphic" representation of the set of linear functions
on a finite dimensional linear space.
Some remarks on notations are in order now. In this book the dot between
a matrix and a vector (or a matrix) indicates the multiplication of the matrix with
the vector (or the matrix). We will not make a distinction between "row" or
"column" vectors. We assume that the readers of this book know the rules of the
multiplication of two matrices, say, A and B, of a matrix A and vector x, and of
two vectors x and y, so that there should be clear understanding of the meaning of
$A \cdot B$, $A \cdot x$, and $x \cdot y$ (and $y \cdot x$) and no misunderstanding about the distinction
between and After all, it is much more plausible to consider a vector
as an element of a linear space rather than as an array of numbers and a matrix
as a representation of a linear function rather than as an array of numbers."'
We close this subsection by mentioning the definition of "linear affine"
functions, since they are often confused with linear functions.

Definition: Let f be a function on a linear space X into R (that is, a functional).
The function f is said to be linear affine or affine if f(x) - f(0) is linear.
REMARK: Obviously a linear function is a special case of a linear affine
function. Let $X = R^n$ and Y = R, and define $f(x) = a \cdot x + k = \sum_{i=1}^{n} a_i x_i + k$,
where $a, x \in R^n$, $k \in R$, and a and k are constants. In elementary mathematics
and in most literature in economics, such a function f is known as a
"linear function." However, as long as $k \neq 0$, f does not satisfy the definition
of linear functions; it is a linear affine function. Similarly, let F(x) be
defined by $F: R^n \to R^m$, $F(x) = A \cdot x + k$, where $x \in R^n$, $k \in R^m$, and A is an
$(m \times n)$ matrix. Then F is also a linear affine function as long as $k \neq 0$.
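The distinction is easy to see numerically. In the sketch below the vector `a`, the constant `k`, and the sample points are hypothetical values chosen so that the floating-point arithmetic is exact: f fails additivity because the constant k is counted twice on one side, whereas g(x) = f(x) - f(0) passes both defining properties of a linear function at the sampled points.

```python
def dot(a, x):
    return sum(ai * xi for ai, xi in zip(a, x))


a = [2.0, -1.0]   # hypothetical coefficient vector
k = 3.0           # hypothetical nonzero constant


def f(x):
    """Affine: f(x) = a . x + k with k != 0."""
    return dot(a, x) + k


def g(x):
    """g(x) = f(x) - f(0), which removes the constant term."""
    return f(x) - f([0.0, 0.0])


x, y = [1.0, 4.0], [-2.0, 0.5]
x_plus_y = [xi + yi for xi, yi in zip(x, y)]

print(f(x_plus_y), f(x) + f(y))           # differ: f is not additive
print(g(x_plus_y), g(x) + g(y))           # agree: g is additive at these points
print(g([3.0 * xi for xi in x]), 3.0 * g(x))  # agree: g is homogeneous at this point
```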
REMARK WITH REGRET: In the course of this book, as long as the con-
text is clear we do not stick closely to this distinction between linear functions
and linear affine functions. In other words, we sometimes call a linear affine
function a "linear function" as long as this does not cause any confusion.
Although obviously imprecise, this is rather inevitable in view of the common
usage in economics. For example, linear programming typically contains
the "linear constraint" of the type $f(x) = a \cdot x + k \le 0$ ($k \neq 0$). As remarked
above, f is linear affine and not linear. Similarly, F in the constraint
$F(x) = A \cdot x + k \le 0$ ($k \neq 0$) is also linear affine. But it is too pedantic to rename
linear programming "linear affine programming" and linear constraints
"linear affine constraints." There are too many such examples in economics
to rename the relevant functions as "linear affine."

d. CONVEX SETS
Here we consider an arbitrary linear space X. This X does not necessarily
have to be $R^n$. However, the reader can certainly confine his attention to $R^n$
(instead of X) if he so wishes.

Definition: Given x and y in a linear space X, z defined by $z \equiv \theta x + (1 - \theta) y$,
$0 \le \theta \le 1$, $\theta \in R$, is called a convex combination of x and y.

REMARK: The concept of a convex combination can be illustrated in $R^2$
by Figure 0.4.

Figure 0.4. An Illustration of Convex Combination. [The point $z = \theta x + (1 - \theta) y$ lies on the segment joining x and y (here $0 < \theta < 1$); if $\theta = 1$, z = x, and if $\theta = 0$, z = y.]

Definition: Let $S \subset X$, where X is a linear space. If an arbitrary convex combination
of any two points of S is in S, then S is called a convex set. That is, S is
convex if $x, y \in S$ implies $\theta x + (1 - \theta) y \in S$ for $0 \le \theta \le 1$.
REMARK: A circle is not a convex set, but a disk that includes all the interior
points of a circle is a convex set. The area covered by the Chinese character
凸 (translation "convex") is not convex (see Figure 0.5).

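Definition: Given m points, $x^1, x^2, \ldots, x^m$, in a linear space X, x defined by

$$x = \sum_{i=1}^{m} \theta_i x^i, \quad \text{where } 0 \le \theta_i \le 1, \; \theta_i \in R, \; i = 1, \ldots, m, \text{ and } \sum_{i=1}^{m} \theta_i = 1$$

is called a convex combination of these m points.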

The proof of the following theorem is easy to do and so is left for the reader.

Theorem 0.A.4:

(i) A set S in a linear space X is convex if and only if every convex combination of
(two or more) points in S belongs to S.
(ii) Any intersection (finite or infinite) of convex sets is also convex.
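The circle-versus-disk remark made earlier gives a simple way to exercise the definition of a convex set. The sketch below samples convex combinations of points of the closed unit disk in $R^2$ and checks that they stay in the disk, then exhibits a convex combination of two points of the unit circle that leaves the circle. Names, sample size, and tolerance are illustrative assumptions, and the sampled test is only suggestive, not a proof of convexity.

```python
import math
import random


def convex_combination(x, y, theta):
    """Return theta*x + (1 - theta)*y for 0 <= theta <= 1."""
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    return [theta * a + (1.0 - theta) * b for a, b in zip(x, y)]


def in_unit_disk(p, tol=1e-9):
    return p[0] ** 2 + p[1] ** 2 <= 1.0 + tol


def on_unit_circle(p, tol=1e-9):
    return abs(p[0] ** 2 + p[1] ** 2 - 1.0) <= tol


def random_disk_point(rng):
    r = math.sqrt(rng.random())
    t = rng.uniform(0.0, 2.0 * math.pi)
    return [r * math.cos(t), r * math.sin(t)]


rng = random.Random(0)
disk_ok = all(
    in_unit_disk(
        convex_combination(random_disk_point(rng), random_disk_point(rng), rng.random())
    )
    for _ in range(10_000)
)
# Two points on the circle whose midpoint is the origin, which is off the circle.
midpoint = convex_combination([1.0, 0.0], [-1.0, 0.0], 0.5)

print(disk_ok)
print(on_unit_circle(midpoint))
```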

Figure 0.5. Chinese Character "Convex": Not a Convex Set.

REMARK: The empty set $\emptyset$ and sets consisting of only one point are considered
convex sets.

Definition: Let $S \subset X$, where X is a linear space. Given m points in S, $x^1, x^2, \ldots, x^m$,
x defined by $x = \sum_{i=1}^{m} \alpha_i x^i$, where $\alpha_i \in R$ and $\alpha_i \ge 0$, i = 1, 2, ..., m, is called
a nonnegative linear combination of these m points.
REMARK: As we defined earlier, if we do not restrict $\alpha_i$ to be nonnegative,
x is simply called a linear combination of those points. Given any arbitrary
m points in S, a subset of a linear space X, neither a convex combination
nor a (nonnegative) linear combination of these m points has to be in S,
although both of them are in X.
The following theorem is very useful.

Theorem 0.A.5: Let $S_i$, i = 1, 2, ..., m, be convex sets in a linear space X. Then
the following are true.

(i) Their linear sum $\sum_{i=1}^{m} \alpha_i S_i$ is also convex;

(ii) Their Cartesian product $\bigotimes_{i=1}^{m} S_i$ is also convex (the Cartesian product of the
$S_i$'s is the set of all m-tuples $(x^1, x^2, \ldots, x^i, \ldots, x^m)$, $x^i \in S_i$, i = 1, 2, ..., m).

Definition: Let $K \subset X$, where X is a linear space; K is called a cone with vertex
at the origin if $\alpha \ge 0$, $\alpha \in R$, and $x \in K$ imply $\alpha x \in K$.

Definition: Let $K \subset X$, where X is a linear space; K is called a convex cone with
vertex at the origin if it is a cone with vertex at the origin, with the following
property:
$x, y \in K$ implies $x + y \in K$
REMARK: It is possible to define a (convex) cone with vertex at a point
other than the origin. However, in this book we confine our discussion of
convex cones to those with the vertex at the origin (this does not hamper
the generality of the discussion, for the choice of the origin can be arbitrary).
Hence when we refer to (convex) cones, we omit the phrase "vertex at the
origin."
REMARK: From the above definitions the following properties should be
obvious.

(i) Every convex cone is a convex set.

(ii) Every cone contains the origin.

REMARK: The set consisting of two different half lines starting from the
origin is a cone, but not a convex cone. If we include the area inside two
half lines with an acute angle, then it is a convex cone.

Theorem 0.A.6:

(i) $\sum_{i=1}^{m} K_i$ is a convex cone if $K_i$ is a convex cone for all i.

(ii) Any (finite or infinite) intersection of convex cones is also a convex cone.

REMARK: The empty set $\emptyset$ is considered a convex cone. Note that $K_1 \cup K_2$
is not necessarily a convex cone even if both $K_1$ and $K_2$ are convex cones.

Definition: Given a set S in a linear space X, the intersection of all the convex
cones containing S is called a convex cone spanned by S or a convex cone generated
by S, and we denote it by K(S).
REMARK: We can show that K(S) is the "smallest" convex cone containing
S. That is, K is a convex cone containing S implies that $K(S) \subset K$. We can
also show that K(S) can be written as

$$K(S) = \Big\{ \sum_{i=1}^{m} \alpha_i x^i : x^i \in S, \; \alpha_i \in R, \; \alpha_i \ge 0, \; i = 1, 2, \ldots, m \Big\}$$

where m and the choice of $x^i$ and $\alpha_i$ are arbitrary.
REMARK: When S is a set consisting of a finite number of points, K(S) is
called a convex polyhedral cone. In $R^2$, for example, the set of two points
(0, 1) and (1, 0) will generate a convex polyhedral cone K(S) which is the
nonnegative orthant of $R^2$.
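Membership in the cone generated by two linearly independent points of $R^2$ can be tested by solving for the coefficients and checking their signs. The following minimal sketch does this with Cramer's rule and exact fractions; the function names are hypothetical. With generators (0, 1) and (1, 0), a point belongs to the cone exactly when both of its coordinates are nonnegative, which is the nonnegative orthant described above.

```python
from fractions import Fraction


def cone_coefficients(u, v, p):
    """Solve a*u + b*v = p for linearly independent generators u, v in R^2 (Cramer's rule)."""
    det = Fraction(u[0] * v[1] - u[1] * v[0])
    if det == 0:
        raise ValueError("generators must be linearly independent")
    a = Fraction(p[0] * v[1] - p[1] * v[0]) / det
    b = Fraction(u[0] * p[1] - u[1] * p[0]) / det
    return a, b


def in_cone(u, v, p):
    """p lies in K({u, v}) iff it is a nonnegative linear combination of u and v."""
    a, b = cone_coefficients(u, v, p)
    return a >= 0 and b >= 0


u, v = (0, 1), (1, 0)
print(cone_coefficients(u, v, (2, 3)))  # coefficients on (0, 1) and (1, 0)
print(in_cone(u, v, (2, 3)))            # both coordinates nonnegative
print(in_cone(u, v, (-1, 2)))           # a negative coordinate
```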
In the previous subsection, we defined such concepts as the dimension of
a linear space and linear functions. With the aid of these concepts we can obtain
the following important characterization of convex sets and so forth.

Theorem 0.A.7: Let f be a linear function of a linear space X into a linear space
Y. If S is a convex subset (resp. cone, linear subspace) of X, then its image f(S)
is a convex subset (resp. cone, linear subspace) of Y.
PROOF: See Berge [1], p. 143, for example.
EXAMPLES: Let $X = R^n$ and $S \subset X$. Consider the following examples:

1. $f: S \to X$ by $f(x) = \alpha x$, where $\alpha \in R$. Then $\alpha S$ is a convex subset of X if
S is convex.
2. $f: S \to R^m$ by $f(x) = A \cdot x$, where A is an $(m \times n)$ matrix with real
entries. Then the set defined by $\{y : y = A \cdot x, \; x \in S\}$ is a convex subset in
$R^m$ if S is convex.

e. A LITTLE TOPOLOGY15
Consider a point $x_0$ in a metric space (X, d), say, $R^n$, and define a set
$B_r(x_0)$ by

$$B_r(x_0) = \{x : x \in X, \; d(x, x_0) < r\}$$

where r is some positive real number and $d(x, x_0)$ refers to the (Euclidian)
distance between x and (the fixed point) $x_0$. The set $B_r(x_0)$ is called the open
ball about $x_0$ with radius r. The point $x_0$ is called the center of $B_r(x_0)$. An open
ball is always nonempty, for it contains its center. Figure 0.6 illustrates some
examples of an open ball.

Figure 0.6. Illustration of Open Balls. [Panels: an open ball in $R^2$; an open ball in $C_{[0,1]}$, that is, $\{f \in C_{[0,1]} : d(f, f_0) < r\}$.]


There is one very important characteristic in the concept of an open ball.

Given an open ball, say, $B_r(x_0)$ in $R^n$, pick any point, say, x, in $B_r(x_0)$. Then
we can always find another open ball about this point x which is contained in
$B_r(x_0)$. This is illustrated in Figure 0.7. Given an arbitrary set S in (X, d), if S
has this characteristic, S is called an open set. In other words, we have the following
definition.

Figure 0.7. A Property of an Open Ball.

Definition: Let S be a subset in a metric space (X, d). This set S is called an open
set, if, for any x in S, there exists a positive real number r such that $B_r(x) \subset S$.
REMARK: It is easy to check that every open ball is an open set.
Now consider the collection of all the open sets in X. Note that X itself is an
open set. The empty set $\emptyset$ can be considered as a trivial example of an open set.
We denote the collection of all the open sets in X by $\tau$. We can easily check that $\tau$
satisfies the following properties:
(T-1) $X \in \tau$, $\emptyset \in \tau$.
(T-2) $V_i \in \tau$, i = 1, 2, ..., m implies $\bigcap_{i=1}^{m} V_i \in \tau$.
(T-3) $V_\alpha \in \tau$ for all $\alpha \in A$ implies $\bigcup_{\alpha \in A} V_\alpha \in \tau$.

Given an arbitrary set X (it does not have to be a metric space or a linear
space), if we define a collection $\tau$ of subsets of X which satisfies the properties (T-1),
(T-2), and (T-3), we can call it a topological space with topology $\tau$. We denote a
topological space by $(X, \tau)$. A member of $\tau$ is called an open set. In fact, as the
reader can easily check, any set X can be a topological space for either one of the
following topologies:

1. $\tau = \{X, \emptyset\}$ (called the indiscrete topology).

2. $\tau$ = all the subsets of X (called the discrete topology).

In fact, many kinds of topologies other than the above two can be defined on an
arbitrary set. That is, there are many ways to transform a given arbitrary set to a
topological space. The symbol $\tau$ in the notation $(X, \tau)$ refers to the topology
specified for this topological space X.
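On a finite set the three properties can be checked by brute force. The sketch below is an illustration only: the function names are hypothetical, and it relies on the observation that, for a finite collection, closure under pairwise intersections and pairwise unions already gives (T-2) and (T-3) by induction. It confirms that the indiscrete and discrete topologies on a three-point set qualify, while a collection missing one union does not.

```python
from itertools import combinations


def powerset(X):
    """All subsets of X, as frozensets."""
    items = list(X)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def is_topology(X, tau):
    """Check (T-1) to (T-3) for a collection tau of subsets of a finite set X.

    For a finite collection, closure under pairwise intersection and pairwise
    union implies closure under all finite intersections and all unions.
    """
    X = frozenset(X)
    tau = {frozenset(V) for V in tau}
    if X not in tau or frozenset() not in tau:        # (T-1)
        return False
    for U, V in combinations(tau, 2):
        if U & V not in tau:                          # (T-2)
            return False
        if U | V not in tau:                          # (T-3)
            return False
    return True


X = {1, 2, 3}
indiscrete = [X, set()]
discrete = powerset(X)
not_a_topology = [X, set(), {1}, {2}]   # the union {1, 2} is missing

print(is_topology(X, indiscrete))
print(is_topology(X, discrete))
print(is_topology(X, not_a_topology))
```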

In a metric space we are often concerned with the collection of open sets
defined in terms of open balls as a topology. We call this the usual topology in the
metric space or the topology induced by the metric. Although there are many ways
to make a metric space into a topological space, we henceforth refer to this usual
topology as the topology in metric space, unless otherwise specified.
Note that an arbitrary topological space does not have to be a metric space,
although every metric space can be a topological space by the topology induced by
the metric. Note also that an arbitrary topological space may not be a linear space,
although every normed linear space is a topological space by the topology induced
by the norm.
Real space, $R^n$, is a metric space with the Euclidian metric; hence it is also a
topological space with the usual topology. Moreover, $R^n$ is also a linear space, a
normed linear space, and an inner product space. In other words, it has all the
features of these spaces. Conversely, we may say that the properties of each of
these spaces are abstracted from $R^n$. This means that one can always get an intuitive
understanding of these concepts by a graphical representation in $R^2$. However,
the reader should note that this is also very dangerous, for these concepts are
far more general and broader than $R^2$ (or $R^n$).
We now define closed sets.

Definition: Let $(X, \tau)$ be a topological space. Then a subset S of X is called a
closed set if its complement is an open set, that is, $S^c \in \tau$ (where $S^c$ denotes $X \setminus S$).
REMARK: Note that the empty set $\emptyset$ and X are also closed sets (as well as
open sets). Open intervals in R, such as (a, b), $(-\infty, a)$, $(b, \infty)$, are all open sets
in R, and closed intervals in R, such as [a, b], are closed sets. The sets $[b, \infty)$
and $(-\infty, a]$ are also closed sets. However, (a, b] and [a, b) (where a and b are
finite) are neither open sets nor closed sets.

REMARK: It is an elementary exercise in set theory to show the following
propositions:

(i) Any (finite or infinite) intersection of closed sets is closed.

(ii) Any finite union of closed sets is closed.

Definition: Given a topological space $(X, \tau)$ and a subset S of X, a point $x_0$ of X is
called a limit point or an accumulation point of S if every open set containing $x_0$
contains a point of S other than $x_0$. A point $x_0$ of S is called an isolated point of S if
it is contained in an open subset of X which has no other points in S. The set of all
the limit points of S is called the derived set of S. The union of S and its derived set
is called the closure of S, which we denote by $\bar{S}$.
REMARK: If S is a set which consists of only one point, that is, $S = \{x_0\}$,
then $x_0$ cannot be a limit point of S, for S has no point other than $x_0$.

REMARK: A limit point of S can be a point of S, but it is not necessary that
it be a point of S. For example, let S be an interval (0, 1] of the real line with
the usual metric in R [that is, d(a, b) = |b - a|]; then every point in S is a limit
point of S. But point 0, which does not belong to S, is also a limit point of S.

Definition: Given a metric space (X, d), a point $x_0$ in X, and a subset S of X, a
sequence$^{18}$ $\{x^q\}$ in S is said to converge to $x_0$ (denoted by $x^q \to x_0$ or $\lim x^q = x_0$) if
for any real number $\epsilon > 0$, there exists a positive integer $\bar{q}$ such that $q \ge \bar{q}$ implies
$d(x_0, x^q) < \epsilon$. The point $x_0$ is called the limit of $\{x^q\}$, and such a sequence $\{x^q\}$ is
called a convergent sequence in S if $x_0 \in S$.
REMARK: Intuitively speaking, this means that if the terms of a sequence
approach a limit, they get close together. Note again that $x_0$ does not have to
be in S. For example, let S be an interval (0, 1] with the usual metric. The
sequence $\{1/q\}$, where the q's are positive integers, is a sequence whose
values are in S. The point 0 is the limit of this sequence, but it is not in S.
REMARK: The concept of convergence can be generalized for a topological
space which is not necessarily a metric space. Given a topological space
$(X, \tau)$, a sequence $\{x^q\}$ in X is called convergent in X if there exists a point
$x_0 \in X$ such that for every open set V containing $x_0$, there exists a positive
integer $\bar{q}$ such that $q \ge \bar{q}$ implies $x^q \in V$. The point $x_0$ is called a limit of $\{x^q\}$,
and we say $x^q$ converges to $x_0$. However, such a limit may not be unique in
an arbitrary topological space.
REMARK: Given an arbitrary sequence in a metric space, say, $\{x^q\}$, it may
have a limit or it may not have a limit (nonconvergent sequence). But if the
limit exists, it can be shown easily from the definition of a limit that the limit
is always unique in metric spaces. Moreover, if a sequence has a limit, every
subsequence$^{19}$ of the sequence has the same limit. If a sequence has no limit,
a subsequence of the sequence may have a limit or may have no limit. For
example, {0, 1, 0, 1, ...} has no limit but has two convergent subsequences
{0, 0, ...} and {1, 1, ...} with limits 0 and 1 respectively.
REMARK: A common confusion is in the distinction between "limit" and
"limit point." These two terms are different. A sequence is not a set of
points; rather, it is a function defined on the positive integers (q = 1, 2, ...)
with values in the set (that is, $x^q \in S$). Moreover, (the value of) the limit of a
sequence may not be a limit point of the set. That is, that $x^q \to x_0$, where
$\{x^q\}$ is in S, does not necessarily imply that $x_0$ is a limit point of S. For example,
let S be a set which consists of a single point 1; then a sequence {1, 1, 1, ...}
is a convergent sequence whose values are in S, with limit 1. But the point 1
is not a limit point of S, for S, consisting of a single point, cannot have any
limit points, as we remarked after the definition of the limit point.

The following theorem is easy to prove and is useful to relate the two con-
cepts of "limit" and "limit point."

Theorem 0.A.8: Let $S \subset R^n$. Then a point $x_0$ in $R^n$ is a limit point of S if and only if
there exists a sequence $\{x^q\}$ in $S \setminus \{x_0\}$ such that

$$\lim_{q \to \infty} x^q = x_0$$

PROOF: The sufficiency part of the theorem is obvious. For the proof of the
necessity part, see Rudin [10], p. 42, and Kelley [7], p. 73, for example.
REMARK: Restriction of S to be in the real space Rn is necessary to prove
the necessity part only. In fact, this can be relaxed by requiring that S c X,
where X is a topological space satisfying the "first axiom of countability."
The reader is not required to understand the concept of the first axiom of
countability. However, he can always find its definition from any textbook on
general topology; see, for example, Kelley [7] , p. 50, and footnote 28 of the
present section.

REMARK: Let $\{x^q\}$ be a sequence in X. The convergence of this sequence
depends on the metric or topology defined on X. For example, $\{x^q\}$ may be
convergent under one metric but may not be convergent under another
metric. Moreover, even if $\{x^q\}$ is convergent under two different metrics, the
limit may be different depending on the metric chosen. In the real space
$R^n$, the following three metrics are well known and important.

$$d_1(x, y) = \Big[\sum_{i=1}^{n} (x_i - y_i)^2\Big]^{1/2}$$

$$d_2(x, y) = \max_{i} |x_i - y_i|$$

$$d_3(x, y) = \sum_{i=1}^{n} |x_i - y_i|$$

It can be shown that if a sequence $\{x^q\}$ is convergent under any of $d_1$, $d_2$, and
$d_3$, it is also convergent under the other two metrics, and the limits under
these three metrics are all the same. (For the proof, see Nikaido [9], sec. 11,
for example.)
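A quick numerical illustration of this equivalence: the sketch below evaluates the three metrics along a hypothetical sequence in $R^2$ converging to (0, 2) and shows all three distances shrinking together. The chain of inequalities $d_2 \le d_1 \le d_3 \le n \, d_2$ mentioned in the comments is a standard fact supplied here for orientation; it is not quoted from the text, and the sequence itself is an arbitrary choice.

```python
import math


def d1(x, y):
    """Euclidian metric."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def d2(x, y):
    """Maximum-coordinate metric."""
    return max(abs(a - b) for a, b in zip(x, y))


def d3(x, y):
    """Sum-of-absolute-differences metric."""
    return sum(abs(a - b) for a, b in zip(x, y))


limit = (0.0, 2.0)


def term(q):
    """q-th term of a hypothetical sequence in R^2 converging to `limit`."""
    return (1.0 / q, 2.0 + 1.0 / q ** 2)


# Since d2 <= d1 <= d3 <= n * d2 on R^n, the three distances go to zero together.
for q in (1, 10, 100, 1000):
    x = term(q)
    print(q, d1(x, limit), d2(x, limit), d3(x, limit))
```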
The following theorems are rather easy to prove but important.

Theorem 0.A.9: Let S be a subset in a topological space $(X, \tau)$. Then the following
hold:

(i) S is closed if and only if S contains all its limit points (that is, if it contains
its derived set).
(ii) S is closed if and only if $S = \bar{S}$.

PROOF: See, for example, Kelley [7], pp. 41-43; Simmons [11], sec. 17.

REMARK: If the derived set of S is empty (that is, if S has no limit point),
then S is always closed. Hence, for example, a set of only one element is a
closed set. A set consisting of a finite number of points is closed as is a finite
union of closed sets.

Definition: Let S be a subset of a topological space $(X, \tau)$. A point $x_0$ in S is called
an interior point of S if S contains an open set containing $x_0$. The set of all the interior
points of S is called the interior (or the open kernel) of S (denoted by $S^\circ$ or
"interior S"). If a point is in $\bar{S}$ (the closure of S) but not in $S^\circ$, then it is called a
boundary point of S. The collection of all the boundary points of S is called the
boundary of S and is denoted by "boundary S."
REMARK: It is not difficult to prove (see Kelley [7], p. 44, for example)
the following propositions:

(i) $S^\circ$ is an open set and it is the largest open subset of S.
(ii) S is open if and only if $S = S^\circ$.
(iii) $\bar{S} = S \cup$ boundary S and boundary $S = \bar{S} \cap \overline{S^c}$, where $S^c$ is the complement of S in X.

All these properties are obvious when X is a metric space with the usual
topology in the metric space.
EXAMPLE: An open ball $B_r(x_0) \equiv \{x : x \in X, d(x, x_0) < r\}$ in a metric space
(X, d) is not a closed set. However, $\{x : x \in X, d(x, x_0) \le r\}$, called a closed
ball, is a closed set and, in fact, is a closure of $B_r(x_0)$. Denote it by $\bar{B}_r(x_0)$. The open ball $B_r(x_0)$
is the open kernel of this set $\bar{B}_r(x_0)$. The set $\{x : x \in X, d(x, x_0) = r\}$ is the
boundary of $B_r(x_0)$ and $\bar{B}_r(x_0)$. Any point in this boundary is a limit point
of the open ball $B_r(x_0)$ but is clearly not in $B_r(x_0)$.
REMARK : We started the discussion of topology with open sets. Open sets
satisfy the axioms of a topological space (that is, a finite intersection of open
sets is open and any union of open sets is open). A closed set is then defined to
be the complement of an open set. Then we showed that any closed set con-
tains all its limit points. A point x in a set X is a limit point of S, a subset of
X, if there exists a sequence of points other than x in S which converges to x.
Hence, in a closed set, every converging sequence of points in the set con-
verges to a point in the set. In other words, a closed set is a set which is
closed under the limit operation.

We can reverse the construction of a topology or topological spaces by confining
ourselves to a metric space, thus defining convergence in terms of a metric;
that is, we start with the definition of a closed set as a set that is closed under the
limit operation and then define an open set as the complement of a closed set. We
can then show that a finite union of closed sets is closed and any intersection of
closed sets is closed. From the definition of open set, we can then trivially con-
clude the properties of topological spaces.
Now we define the very important concept of a "continuous function." As re-
marked before, function here always refers to a single-valued function.

Definition: Let f be a function from a metric space $(X, d_1)$ into a metric space
$(Y, d_2)$. The function f is called continuous at a point $x_0$ in X if for any real number
$\epsilon > 0$, there exists a real number $\delta > 0$ such that $d_1(x, x_0) < \delta$ and $x \in X$ imply
$d_2(f(x), f(x_0)) < \epsilon$. The function f is called continuous in X if it is continuous at every point
of X.
It is easy to show that this definition is equivalent to either of the following
two statements (see Simmons [ 11 ] , p. 76, for example).

(i) For each open ball $B_\epsilon(f(x_0))$ with center $f(x_0)$, there exists an open ball
$B_\delta(x_0)$ with center $x_0$ such that $f(B_\delta(x_0)) \subset B_\epsilon(f(x_0))$.
(ii) $x^q \to x_0$ implies $f(x^q) \to f(x_0)$.

Note that the above definition of continuity is strictly analogous to the one in $R^n$.
We have the following important theorem:

Theorem 0.A.10: Let f be a function from a metric space $(X, d_1)$ into a metric
space $(Y, d_2)$. The function f is continuous in X if and only if $f^{-1}(V)$ is open in X
whenever V is open in Y.

PROOF: See Simmons [ 11 ] , pp. 76-77, for example.

This theorem induces the definition of continuity in an arbitrary topological
space (not necessarily a metric space). That is, we say that a function f from a
topological space $(X, T_1)$ to $(Y, T_2)$ is continuous if $f^{-1}(V) \in T_1$ whenever $V \in T_2$.
Therefore the concept of the continuous function is quite a general one for it is not
confined to metric spaces.20
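
On a finite set the open-set definition of continuity can be checked by enumeration. The following Python sketch is only an illustration: the spaces `X` and `Y`, the topologies `T1` and `T2`, and the maps `f` and `g` are assumptions, not objects from the text. It tests whether the inverse image of every open set is open.

```python
def preimage(f, V, X):
    """Return the inverse image {x in X : f(x) in V} as a frozenset."""
    return frozenset(x for x in X if f[x] in V)

def is_continuous(f, X, T1, T2):
    """f is continuous iff the inverse image of every set in T2 belongs to T1."""
    return all(preimage(f, V, X) in T1 for V in T2)

X = frozenset({1, 2, 3})
Y = frozenset({"a", "b"})
T1 = {frozenset(), frozenset({1}), frozenset({1, 2}), X}  # an assumed topology on X
T2 = {frozenset(), frozenset({"a"}), Y}                   # an assumed topology on Y

f = {1: "a", 2: "b", 3: "b"}  # inverse image of {"a"} is {1}, which is open in X
g = {1: "b", 2: "b", 3: "a"}  # inverse image of {"a"} is {3}, which is not open in X

print(is_continuous(f, X, T1, T2))  # True
print(is_continuous(g, X, T1, T2))  # False
```

Changing `T1` to the discrete topology on `X` would make `g` continuous as well, which illustrates that continuity depends on the topologies chosen.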

Corollary: The function f is continuous in X if and only if $f^{-1}(V)$ is closed in X
whenever V is closed in Y.
REMARK: The statement "f is continuous in X and V is an open set (resp.
a closed set)" does not necessarily imply "f(V) is an open set (resp. a closed
set)." For a counterexample, see Kolmogorov and Fomin [8], sec. 12.

EXAMPLE 1: A constant function is a continuous function.

EXAMPLE 2: The Euclidian norm on $R^n$ is a continuous function into R. In
fact, any norm $\|x\|$ on an arbitrary linear space X is a continuous function
into R with respect to the metric induced by the norm [that is, $d(x, y) = \|x - y\|$],
because $x, y \in X$ and $d(x, y) < \delta$ imply that $|\,\|x\| - \|y\|\,| < \delta$
($\because |\,\|x\| - \|y\|\,| \le \|x - y\|$ owing to the triangular inequality).
EXAMPLE 3: The distance (or metric) function $d(x, x_0)$ on $X \otimes X$ (that is,
$x_0, x \in X$) is a function on X into R when we fix $x_0$, and it is continuous
because $x, y \in X$ and $d(x, y) < \delta$ imply $|d(x, x_0) - d(y, x_0)| < \delta$.
EXAMPLE 4: The function $r(x, y)$ of $R^n \otimes R^n$ into $R^n$ defined by $r(x, y) = x + y$
is continuous.
EXAMPLE 5: The function $k(\alpha, x)$ of $R \otimes R^n$ into $R^n$ defined by $k(\alpha, x) = \alpha x$
is continuous. (The proofs of Examples 4 and 5 are easy, or see Berge and
Ghouila-Houri [2], p. 38).
REMARK: In the expression $\lim_{x \to a} f(x) = b$ [or $f(x^q) \to b$ as $x^q \to a$],
it is neither required that f be defined at a nor that $f(a) = b$ when f is
defined at a. The function f is "continuous" at a if f is defined at a and
$f(x^q) \to f(a)$ as $x^q \to a$.
REMARK: Let X be a subspace of $R^n$ and let $x \cdot y \equiv \sum_{i=1}^{n} x_i y_i$. Let a be
a fixed point in $R^n$ and $\{x^q\}$ be a sequence in X such that $x^q \to x_0$,
$x_0 \in X$, as $q \to \infty$. Suppose $x^q \cdot a \ge \alpha$ for all q, where $\alpha$ is a fixed real number.
We can conclude that $x_0 \cdot a \ge \alpha$. To prove this, note that $x \cdot a$ is a continuous
function on X, and obtain a contradiction21 if $x_0 \cdot a < \alpha$. Similarly, if
$\{x^q\}$, $\{y^q\}$ are two sequences in X such that $x^q \to x_0$ and $y^q \to y_0$ with
$x^q \cdot a \ge x^q \cdot y^q$ for all q, then we can show $x_0 \cdot a \ge x_0 \cdot y_0$. (Note that $x \cdot y$ is continuous
on $X \otimes X$.)22
The following theorem is important.

Theorem O.A.11:
(i) A continuous function of a continuous function is also continuous.
(ii) The Cartesian product of continuous functions is also continuous (that is, let
$f_i$, i = 1, ..., m, be continuous functions from S into $T_i$; then a function from S
into $\otimes_{i=1}^{m} T_i$ defined by $f(x) = [f_1(x), \ldots, f_m(x)]$ is also continuous).

(iii) The converse of statement (ii) is also true.

PROOF: Statement (i) follows directly from the definition of continuity. For
(ii) and (iii), see Kelley [7], p. 91, for example.
REMARK: For the usual Euclidian topology, the proofs of (ii) and (iii) are
straightforward (see Rudin [10], pp. 75-76, for example). However, the
construction of a topology on the product space $\otimes_{i=1}^{m} T_i$ as the one naturally
induced from the component spaces is extremely important and is not
easy. See Kelley [7], pp. 88-90, and Simmons [11], pp. 115-118. The
topology thus constructed is known as the product topology, and the usual
topology of the product space is this product topology.23 We take up this
topic once again in connection with the Tychonoff theorem on compact sets.
REMARK: Given $f = [f_1, \ldots, f_m]$, $f_i$ is called the ith projection of f. Statement
(iii) says that every projection of a continuous function is also continuous.
The identity transformation $f(x) = x$ on $R^n$ is continuous; hence
$f_i(x) = x_i$, i = 1, 2, ..., n, are also continuous functions of x.

Theorem 0.A.11 holds for any continuous function from a metric (or topological)
space into a metric (or topological) space. Suppose now that the range of
the function is in a linear space. We can then talk meaningfully about such things
as $f + g$, $\alpha f$, and so on. In particular, we consider the properties of a continuous
function whose range is in $R^n$ (or R). Then we can show the following.24

Theorem O.A.12:

(i) Let $f_i: X \to R^n$ and $\alpha_i: X \to R$, i = 1, 2, ..., m, where X is a metric space, be
continuous functions; then $f = \sum_{i=1}^{m} \alpha_i(x) f_i(x)$ is also continuous in X.
(ii) Let $f_i$ be real-valued continuous functions on a metric space X (i = 1, 2, ..., m).
Then $\prod_{i=1}^{m} f_i$ is also continuous.25 If $f: X \to R$ is continuous in X, $1/f$ is also
continuous for all $x \in X$ with $f(x) \ne 0$.
(iii) Let $f_i$ be real-valued continuous functions on a metric space X (i = 1, 2, ..., m). Then
$\max_i \{f_i(x)\}$ and $\min_i \{f_i(x)\}$ are also continuous on X.

REMARK: Two corollaries of the above theorem are (1) every polynomial
is a continuous function, and (2) the Cobb-Douglas function, $\prod_{i=1}^{n} x_i^{\alpha_i}$
($= x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}$), $0 < \alpha_i < 1$ for all i and $\sum_{i=1}^{n} \alpha_i = 1$, is a continuous
function.
We may note the following important theorem, the proof of which can be
done by using the concept of a continuous function.

Theorem O.A.13: Every convex polyhedral cone is a closed set.

PROOF: See, for example, Nikaido [9], theorem 5, sec. 27, for such a proof.
For a similar but alternative proof, see Hestenes, M. R., Calculus of Variations
and Optimal Control Theory, New York, Wiley, 1966, pp. 15-16
(lemma 5.5).
REMARK: Note that an arbitrary convex set may not be closed, but every
convex polyhedral cone is closed by the above theorem.

Next we discuss another very important topological concept, "compactness."
In the real space $R^n$, a set is called "compact" if it is a closed set and if
it is bounded (that is, if there exists an open ball with a finite radius that con-
tains the set). Then it can be shown that a set is compact if and only if every open
"cover" of the set contains a finite subcover (the Heine-Borel theorem). It is now
argued that this theorem probes very deeply into the nature of "compactness,"
and the conclusion of the Heine-Borel theorem is converted into a definition of
compactness.

Definition: Let (X, T) be a topological space. Then a class of open sets $\{V_\alpha\}$,
$\alpha \in A$, in (X, T) (where A is an index set), is called an open cover of S, a subset
of X, if each point of S belongs to at least one $V_\alpha$. A subclass of an open cover
which is itself an open cover is called a subcover.

Definition: A subset S of X in a topological space (X, T) is called compact if

every open cover of S has a finite subcover.
REMARK: Note that the set must be in a topological space if compactness
is to be defined at all.
In a metric space (not necessarily R") compactness with the usual topology
has the following important consequence.

Theorem 0.A.14: Let (X, d) be a metric space with the usual topology defined.
Then we have the following:

(i) The set S is a compact subset of X if and only if every infinite subset of S has
a limit point in S (this is known as the Bolzano-Weierstrass property).
(ii) The set S is compact if and only if every sequence in S has a convergent
subsequence and its limit is in S (this is known as sequential compactness).

REMARK: Let S be a set of two points 0, 1 in R, that is, S = {0, 1}. Clearly
S is closed and bounded in R, and it is compact (see Theorem 0.A.16).
Consider an infinite sequence in S, {0, 1, 0, 1, ...}. This sequence is clearly not
convergent, but it has convergent subsequences such as {0, 0, ...} or {1,
1, ...}. The property of sequential compactness is useful for obtaining many
important results. Nikaido [9], for example, exploited this property
throughout his book.
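
The alternating sequence in the remark above can be written out explicitly. The following Python sketch is an illustration only (the function name `x` and the index choices are assumptions). It lists the first terms of the sequence and of the two constant subsequences obtained by taking odd and even indices.

```python
def x(q):
    """q-th term (q = 1, 2, ...) of the alternating sequence 0, 1, 0, 1, ..."""
    return (q + 1) % 2

terms = [x(q) for q in range(1, 11)]
odd_sub = [x(q) for q in range(1, 11, 2)]   # subsequence with indices 1, 3, 5, ...
even_sub = [x(q) for q in range(2, 11, 2)]  # subsequence with indices 2, 4, 6, ...

print(terms)     # [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
print(odd_sub)   # [0, 0, 0, 0, 0]
print(even_sub)  # [1, 1, 1, 1, 1]
```

Each subsequence is constant, hence convergent, and its limit (0 or 1) lies in S.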
Given a collection of topological spaces $(X_\alpha, T_\alpha)$, where $\alpha \in A$ and A is
an index set, form the Cartesian product $\otimes_{\alpha \in A} X_\alpha$ and denote it by X. There
will be many ways to generate a topology T in X from $T_\alpha$. The question is
whether we can generate T in such a way that the product of compact sets in $X_\alpha$
in terms of $T_\alpha$, $\alpha \in A$, is also a compact set in terms of T. The answer is yes; that
is, there is a way to generate T such that this is possible. Thus generated, T is the
product topology mentioned before. We do not discuss how to generate the product
topology, but we will state the result of its construction, known as the Tychonoff
theorem, which is probably the most important theorem in general topology.

Theorem O.A.15 (Tychonoff): The product of any nonempty class of compact sets is
compact with the product topology.
The proof of this theorem is not easy, and, in fact, many of the past proofs are
known to be wrong. The proof requires use of Zorn's lemma. A consequence of this
theorem is the classical Heine-Borel theorem.

Theorem 0.A.16 (Heine-Borel): A subset of $R^n$ is compact if and only if it is closed
and bounded.26
PROOF: See Kelley [7], pp. 144-145, and Simmons [ 11 ] , pp. 119-120,
for example.

This theorem immediately shows the following examples of compact sets in $R^n$: (1)
a closed ball, (2) the boundary of a closed ball, and (3) the set defined by
$\{x : x \in R^n, x_i \ge 0, i = 1, 2, \ldots, n, \sum_{i=1}^{n} x_i \le 1\}$

The following theorem is useful and easy to prove.

Theorem O.A.17: Let (X, T) be a topological space. Then

(i) Any closed subset of a compact set in (X, T) is compact.
(ii) Any continuous image of a compact set in (X, T) is compact.
(iii) Let X be a linear space and $X_i$, i = 1, 2, ..., m, be subsets in X. Then their linear
sum set $\sum_{i=1}^{m} \alpha_i X_i$ ($\alpha_i \in R$, i = 1, 2, ..., m) is compact if all the $X_i$'s, i = 1,
2, ..., m, are compact.
(iv) The union of a finite number of compact sets in (X, T) is compact.

PROOF: For (i) and (ii), see Simmons [11], p. 111, for example. Statement
(iii) follows immediately from (ii), and (iv) follows immediately from the
definition of compactness.
Statement (ii) of the above theorem has an important corollary, known as
the Weierstrass theorem.

Theorem O.A.18 (Weierstrass): Let (X, T) be a topological space and f be a real-

valued continuous function on X. Let S be a compact subset of X. Then f achieves
a maximum and a minimum in S.

PROOF: Since f is continuous and S is compact, f(S) is compact. Also, since
$f(S) \subset R$, it is closed and bounded in R by the Heine-Borel theorem. Hence
f has a maximum f(a) and a minimum f(b), where both a and b are in S.
(Q.E.D.)
REMARK: For example, f(x) = x defined on the unit open interval (0, 1)
of the real line is a continuous function, but it does not achieve a maximum
(or minimum) at any point in (0, 1). If f(x) = x is defined on [0, 1] instead,
then it achieves the maximum (resp. minimum) at x = 1 (resp. at x = 0).
Note that [0, 1] is closed and bounded in R, hence compact.
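
The contrast between (0, 1) and [0, 1] can be illustrated numerically. The following Python sketch is only suggestive, since a finite sample cannot prove that no maximum exists on the open interval; the sample points $1 - 1/q$ and the grid are assumptions chosen for illustration.

```python
def f(x):
    return x

# The points 1 - 1/q lie in the open interval (0, 1); f keeps increasing along them.
values = [f(1 - 1 / q) for q in (2, 10, 100, 1000, 10**6)]
print(values)  # increasing toward 1, but every value stays below 1
assert all(v < 1 for v in values)
assert all(a < b for a, b in zip(values, values[1:]))

# On the closed interval [0, 1] the endpoints belong to the domain.
grid = [k / 1000 for k in range(1001)]  # includes 0.0 and 1.0
print(max(grid, key=f), min(grid, key=f))  # 1.0 0.0
```

On (0, 1) the supremum 1 is approached but never attained; on [0, 1] it is attained at x = 1.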
REMARK: Note that Theorem O.A.18 holds even if X is not Euclidian.
Important concepts which are closely related to concepts such as limit,
continuity, and compactness are the separation properties of topological spaces.
We introduce some of them in the following definitions.

Definition: Let (X, T) be a topological space. Then

(i) The space X is said to be a $T_1$-space if $x, x' \in X$, and $x \ne x'$ imply that there
exist $V, V' \in T$ with $x \in V$ and $x' \in V'$, such that $x \notin V'$ and $x' \notin V$.
(ii) The space X is said to be a $T_2$-space, or Hausdorff space, if $x, x' \in X$, and
$x \ne x'$ imply that there exist $V, V' \in T$, with $x \in V$ and $x' \in V'$ such that
$V \cap V' = \emptyset$.
(iii) The space X is said to be a normal space if whenever U and U' are two disjoint
closed sets in X, then there exist V and V' in T with $U \subset V$, $U' \subset V'$ such that
$V \cap V' = \emptyset$. The space X is said to be a $T_4$-space if it is normal and $T_1$.

REMARK: In addition to $T_1$-, $T_2$-, and $T_4$-spaces, $T_0$-, $T_3$-, and $T_5$-spaces,
and so forth, are defined and discussed in general topology. Clearly
every $T_4$-space is a $T_2$-space and every $T_2$-space is a $T_1$-space. Converses
of these statements do not necessarily hold. Note also that any set can be a
normal space under the discrete topology.
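
For a finite set the $T_1$ and Hausdorff conditions can be verified by brute force. The following Python sketch is an illustration under assumed data: the three-point set `X` and the helper names `is_t1` and `is_hausdorff` are hypothetical. It compares the indiscrete and the discrete topology.

```python
from itertools import combinations

def is_t1(X, T):
    """Each of two distinct points has an open set that excludes the other point."""
    return all(
        any(x in V and y not in V for V in T)
        and any(y in W and x not in W for W in T)
        for x, y in combinations(X, 2)
    )

def is_hausdorff(X, T):
    """Two distinct points always have disjoint open sets containing them."""
    return all(
        any(x in V and y in W and not (V & W) for V in T for W in T)
        for x, y in combinations(X, 2)
    )

X = frozenset({1, 2, 3})
indiscrete = {frozenset(), X}
discrete = {frozenset(s) for r in range(4) for s in combinations(X, r)}

print(is_t1(X, indiscrete), is_hausdorff(X, indiscrete))  # False False
print(is_t1(X, discrete), is_hausdorff(X, discrete))      # True True
```

The indiscrete topology fails both conditions, while the discrete topology satisfies both.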
The following theorem is an easy consequence of the above definition.

Theorem O.A.19:
(i) A topological space X is a $T_1$-space if and only if each point in X, considered as a
set, is a closed set in X.
(ii) Every compact subset of a Hausdorff space is closed.
(iii) Every compact Hausdorff space is a $T_4$-space.
(iv) Every metric space is a $T_4$-space (hence a Hausdorff space).
(v) Let $\{x^q\}$ be a sequence in a Hausdorff space. If $\{x^q\}$ is convergent, then it has
a unique limit.27
(vi) The Cartesian product of any nonempty class of Hausdorff spaces is also a
Hausdorff space.

PROOF: See, for example, Simmons [11], pp. 130-134; also Berge [ 1 ] , IV.5
and IV.6; Wilansky [12], 9.1; Kelley [7], pp. 56-57, pp. 112-113.
REMARK: Statement (v) implies that the proofs of theorems which involve
the limit of a sequence would usually require that the relevant set be a
Hausdorff space. Statement (iv) says that in a metric space we do not have
to worry about this.28
A set X is a topological space if it is equipped with a topology (say, T). What
about a subset S of X? We may construct a topology in S (in a natural way) so
that S is a topological space also.

Definition: Let (X, T) be a topological space and S be a subset of X. Then
$t = \{U : U = V \cap S, V \in T\}$, that is, the collection of all intersections of members
of T with S, is called the relative topology of S.
REMARK: It is easy to see that t is indeed a topology so that (S, t) is a
topological space. The space (S, t) is called a subspace of (X, T); $U \in t$ is
said to be open in S and $S \setminus U$ is said to be closed in S whenever $U \in t$.
REMARK: Let S be a subset of a topological space (X, T). Let A be a sub-
set of S. Since A is also a subset of X, we can determine whether A is open
or not by the topology T. However, we can also determine whether A is open
or not in the topological space (S, t) by the relative topology t. One may
conjecture that A is open (closed) in (S, t) if and only if A is open (closed)
in (X, T). However, this conjecture is not true, as one can see by considering
the following example.
EXAMPLE: Consider R (the set of all real numbers) with its usual topology.
Let S be the set of all rational numbers. The set S itself is a closed set in
the space S. However, S is not a closed set in R since $\bar{S}$ (the closure of
S) = R so that $S \ne \bar{S}$.
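
The same phenomenon can be seen on a finite space, where the relative topology can be computed directly. The following Python sketch is an illustration only; the set `X`, the topology `T`, the subset `S`, and the helper names are assumptions. It builds the relative topology t, confirms that t satisfies the axioms of a topology on S, and exhibits a set that is open in S but not open in X.

```python
def relative_topology(T, S):
    """Collect all intersections of members of T with S."""
    return {V & S for V in T}

def is_topology(X, T):
    """Finite check: T contains the empty set and X, and is closed under
    pairwise unions and intersections."""
    return (
        frozenset() in T
        and X in T
        and all(U | V in T and U & V in T for U in T for V in T)
    )

X = frozenset({1, 2, 3})
T = {frozenset(), frozenset({1}), frozenset({1, 2}), X}  # an assumed topology on X
S = frozenset({2, 3})

t = relative_topology(T, S)
print(is_topology(X, T), is_topology(S, t))  # True True

A = frozenset({2})
print(A in t, A in T)  # True False: A is open in S but not open in X
```

Here {2} equals {1, 2} ∩ S, so it is open in the subspace even though it is not a member of T.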
REMARK: In this book the statement "A is open (closed)" will mean that
A is open (closed) in X with topology T.29 If A is open (closed) with respect
to the relative topology t, we shall explicitly specify the relative topology
(unless it is clear from the context).
The following theorem is important but follows immediately from the
definition of relative topology.

Theorem O.A.20: Let (X, T) be a topological space and let (S, t) be a subspace of
(X, T). Let $A \subset S$. Then the following hold:

(i) The set A is closed in (S, t) if and only if $A = B \cap S$ for some closed set B in
(X, T).
(ii) A point $x_0$ in S is a limit point of A with respect to t if and only if it is a limit
point of A with respect to T.

FOOTNOTES

1. The basic mathematics, which will be useful for the later sections and chapters and
for the reader's further study in economic theory, are sketched here. No prerequisite
knowledge is necessary to read this section. The reader, if he so wishes, may restrict
his attention to the usual "Euclidian space," or $R^n$. However, it should also be
noted that special care is taken not to misguide readers into thinking that our
world is always Euclidian. Consequently, this section becomes more than a mere
exposition of the mathematics necessary for later sections of this book. This approach
to mathematical preliminaries will be useful for readers who are serious about further
study and research in modern economic theory. Unlike the remainder of the book,
most theorems here are stated without proofs in order that the reader can grasp
the basic mathematical concepts and ideas without being led astray by complicated
proofs. For those readers who wish to see the proofs, references are given from
time to time.
2. For a more detailed exposition, see, for example, Kolmogorov and Fomin [8],
chap. 1; Rudin [10], chap. 1 (also pp. 21-27); Nikaido [9], secs. 6-8; Berge [1],
11.1; and Simmons [11], chap. 1. For a more complete exposition of set theory,
see Halmos [6], for example.
3. When the number of elements of a set is finite, it is often called a finite set. It is
called an infinite set if it is not finite. For example, R is an infinite set. A set is
called countably infinite if there is a one-to-one mapping between the set and the set
of all positive integers. (The phrase "one-to-one mapping" will be explained shortly.)
A set which is either finite or countably infinite is called countable; and a set which
is not countable is called uncountable. Then R is uncountable.
4. Let $S \subset X$; then f(S) is called the image of S under f. When Y = f(X), f is said
to be onto.
5. It should be noted, however, that in many treatments in the literature, "function"
usually refers to a single-valued function.
6. For a more detailed exposition, see, for example, Kolmogorov and Fomin [8],
secs. 8 and 21; Berge [1], IV.1, VII.2; Halmos [5], secs. 1-4; Simmons [11], secs.
9, 14, and 15. In this subsection, concepts such as "linear space," "inner product"
(space), "metric space," "norm," and "normed linear space" will be discussed.
The reader will realize that these concepts, although they are abstracted from $R^n$,
have much broader scope than $R^n$.
7. See Halmos [5], secs. 5-8 and 32-38; Nikaido [9], secs. 9 and 10; Wilansky
[12], 2.1-2.4.
8. This is certainly the case, if X is a linear space as is assumed in the above definition.
9. The notation is justified, for, after all, the multiplication of f and g is defined as
the composite function of f and g.
10. Strictly speaking, $A \cdot B$ may have to be denoted as $A \circ B$. However, this is too
pedantic. In fact, following the usual convention, we often denote it simply as
AB, unless it is confusing.

11. For a more detailed exposition see, for example, Fenchel [3], 1.1, 1.2, 11.1, and
11.2; Berge [1], VII.4; Berge and Ghouila-Houri [2], 1.4 and 1.5; Nikaido [9],
sec. 27; or Fleming [4], 1.4. The proofs of the theorems are fairly easy. The reader
can enhance his understanding of the content of this subsection by trying to prove
these theorems by himself.
12. The proof of statement (i) is easy if we utilize a later theorem, Theorem O.A.7.
13. The corresponding concept to K(S) is convex hull, which is defined as the smallest
convex set containing a given set S. Denoting this by C(S), we can easily prove that
C(S) can be written as $C(S) = \{\sum_{i=1}^{m} \alpha_i x^i : x^i \in S, \alpha_i \in R, \alpha_i \ge 0, i = 1, 2, \ldots, m, \sum_{i=1}^{m} \alpha_i = 1\}$,
where m and the choice of $x^i$ and $\alpha_i$ are arbitrary. Note the
difference between C(S) and K(S).
14. Corresponding to this concept, the convex hull of a finite number of points in X
is called a convex polyhedron, or a convex polytope.
15. The material here is standard in general topology, and many textbooks are avail-
able for those who wish to see the proofs of the theorems in this subsection and to
study this topic further. See, for example, Simmons [11]; Kelley [7]; and Berge
[1]. Kelley [7] is a standard textbook on this topic; however, Simmons [11] is
easier to read than Kelley [7]. Again, most of the proofs are omitted so that the
reader can grasp the basic ideas without being led astray in the "jungles" of the
proofs.
16. It is important to notice that the concept of limit point becomes concrete only
when the topology of the space is specified. In other words, whether a particular
point is a limit point or not depends on the topology. Let X be the set of real numbers.
With its usual topology the open interval (0, 1) is an open set and every point of
the closed interval [0, 1] is a limit point of (0, 1). However, if the discrete topology
is chosen for X, then no subset of X has a limit point.
17. For example, the closed interval [ a, b] in R is the closure of (a, b), (a, b] , and [a, b)
in R with its usual topology.
18. The reader of this book must have encountered the term "sequence" some time
earlier in his study of mathematics. A rigorous definition is as follows. A sequence
in X is a function defined on the set of all positive integers and whose range is
included in X. If the range of this function is a set of real numbers, then it is called
a sequence of real numbers. In general, however, the range can be any set. A sequence
is usually denoted by $\{x^1, x^2, \ldots\}$ or, in short, $\{x^q\}$, where $x^1, x^2, \ldots, x^q, \ldots$ are
the images of the function (and are called the values of the sequence, or the terms
of the sequence). If $x^q \in S$ for all q, then $\{x^q\}$ is said to be a sequence in (set) S.
19. Given a sequence $\{x^q\}$, consider a sequence $\{q_s\}$, where $q_1 < q_2 < \cdots < q_s < \cdots$.
Then the sequence $\{x^{q_s}\}$ is called a subsequence of $\{x^q\}$.
20. This remark with respect to Theorem O.A.10 also means that the continuity of a
particular function depends on the topology specified in the space.
21. If $x_0 \cdot a < \alpha$, then for a sufficiently large q, we have $x^q \cdot a < \alpha$, for $x \cdot a$ is a continuous
function. This contradicts the assumption.
22. Similarly, we can also prove that (1) if $x^q \to x_0$ and $x^q \cdot a \le \alpha$ for all q, then $x_0 \cdot a \le \alpha$,
and (2) if $x^q \to x_0$, $y^q \to y_0$ with $x^q \cdot a \le x^q \cdot y^q$ for all q, then $x_0 \cdot a \le x_0 \cdot y_0$. The
propositions in the present remark with this footnote are often utilized in economic
theory (for example, consider a as a price vector).
23. Again the basic motivation here is found in $R^n$. The set I of all the open intervals
(a, b) in R, under the usual topology of R, is called the open base of R in the sense
that every open set of R can be expressed as a union of open intervals. In other
words, every open set of R can be generated from I. Then define open cube in
$R^n$ by $\{(x_1, x_2, \ldots, x_n) : a_i < x_i < b_i, a_i, b_i, x_i \in R, i = 1, 2, \ldots, n\}$. We can prove
that the set of open cubes is an open base for $R^n$, that is, it will generate every open
set of $R^n$. In other words, we produced a topology for $R^n$ starting from that of R.
This idea of generating a product topology for $R^n$ is used for the general case.
24. For the proof, see any standard textbook on elementary analysis, or try to prove
it by yourself.
25. $\prod_{i=1}^{m} f_i$ denotes $f_1 f_2 \cdots f_m$.
26. A set S in $R^n$ (or any metric space) is said to be bounded if there exists an open ball
with a finite radius which contains S.
27. As we remarked earlier, we can define a limit of a sequence in an arbitrary topological
space (X, T); that is, $x_0$ is a limit of sequence $\{x^q\}$ if for each open set V containing $x_0$
there exists a $\bar{q}$ such that $q \ge \bar{q}$ implies $x^q \in V$. However, as we cautioned earlier,
such a limit may not be unique: for example, consider $T = \{X, \emptyset\}$ (indiscrete
topology); then any sequence converges to every point of X. A remarkable feature
of the Hausdorff space is that if a limit exists it is always unique.
28. In this section, we induced the concept of a topology from a metric and thus observed
that a topology is closely related to the concept of the "limit of a sequence." We may
reverse the problem; that is, given a topological space (X, T), what is the situation in
which the topology can be described in terms of sequences alone? This question then
leads us to the Moore-Smith convergence theory in terms of "directed sets" and
"nets." We do not go into this discussion (see Kelley [7], chap. 2). In any case, it
turns out that the most satisfactory situation in which a topology can be described by
sequences alone is the case of the "first axiom of countability." A topological space
(X, T) is said to satisfy the first axiom of countability, if, for each point x in X, there
exists a countable class of open sets such that every open set containing x is a union of
sets in this class. In short, (X, T) satisfies the "first axiom of countability" if it has a
countable open base at each of its points. An open base is a class of open sets such that
every open set is a union of sets in this class (see Kelley [7], p. 50; Simmons [11],
pp. 99-100). The first axiom of countability makes the following statement meaningful
for general topological spaces. "A point $x_0$ is a limit point of a set S if and only if
there exists a sequence in $S \setminus \{x_0\}$ which converges to $x_0$" (see Kelley [7], theorem 8,
p. 72, and problem B, p. 76). It is known that every metric space satisfies the first
axiom of countability (for example, see Kelley [7], theorem 11, p. 120). Hence, when
we induced the concept of topology from a metric, the first axiom of countability had
already crept into our discussion. In other words, the metric space with its induced
topology is a "nice" topological space in terms of sequences, for it satisfies the first
axiom of countability as well as being Hausdorff.
29. In the remainder of this book, we usually assume that X is $R^n$. Hence the statement "A
is open" will mean, unless otherwise specified, that A is open in $R^n$ with its usual
topology.

REFERENCES

1. Berge, C., Topological Spaces, New York, Macmillan, 1963 (French original, 1959).
2. Berge, C., and Ghouila-Houri, A., Programming, Games and Transportation Networks,
New York, Wiley, 1965 (French original, 1962).
3. Fenchel, W., Convex Cones, Sets, and Functions (hectographed), Princeton, N.J.,
Princeton University Press, 1953.
4. Fleming, W. H., Functions of Several Variables, Reading, Mass., Addison-Wesley,
1965.

5. Halmos, P. R., Finite Dimensional Vector Spaces, 2nd ed., Princeton, N.J., Van
Nostrand, 1958.
6. Halmos, P. R., Naive Set Theory, Princeton, N.J., Van Nostrand, 1960.
7. Kelley, J. L., General Topology, New York, Van Nostrand, 1955.
8. Kolmogorov, A. N., and Fomin, S. V., Functional Analysis, Vol. 1, Rochester, N.Y.,
Graylock, 1957 (Russian original, 1954).
9. Nikaido, H., Introduction to Sets and Mappings in Modern Economics, tr. K. Sato,
Amsterdam, North-Holland, 1970 (Japanese original, Tokyo, 1960).
10. Rudin, W., Principles of Mathematical Analysis, 2nd ed., New York, McGraw-Hill,
1964.

11. Simmons, G. F., Introduction to Topology and Modern Analysis, New York, McGraw-
Hill, 1963.
12. Wilansky, A., Functional Analysis, New York, Blaisdell, 1964.

Section B
SEPARATION THEOREMS

Optimization problems deeply underlie many branches of economic theory.

The theorem (known as the separation theorem) which asserts the existence of a
hyperplane that separates two disjoint convex sets is probably the most funda-
mental theorem in the mathematical theory of optimization. In this section, we
study the essence of this theorem by confining ourselves to a simple but very important
case, that is, the whole space is the real space, $R^n$, and considering it
both as a vector space and a topological space (with the usual metric topology of
$R^n$).

Definition: Let $p \in R^n$ with $p \ne 0$, $\|p\| < \infty$,1 and $\alpha \in R$. The set H defined by
$H = \{x : p \cdot x = \alpha, x \in R^n\}$ is called a hyperplane in $R^n$ with normal p.
REMARK: If n = 3, H is a plane; and if n = 2, H is a straight line.
REMARK: Suppose that there are two points, $x^*$ and $y^*$, in H. Then by
definition $p \cdot x^* = \alpha$ and $p \cdot y^* = \alpha$. Hence $p \cdot (x^* - y^*) = 0$. In other words,
vector p is orthogonal to the line segment $(x^* - y^*)$, or to H. [See footnote 5.]
For the two-dimensional case (n = 2), we may illustrate this as in Figure 0.8.

Figure 0.8. An Illustration of a Hyperplane.

Definition: Given two nonempty sets X and Y in $R^n$ and a hyperplane
$H = \{x : p \cdot x = \alpha, x \in R^n\}$, we say X and Y are separated by H (or H separates X and
Y) if
$p \cdot x \ge \alpha$ for all $x \in X$
and
$p \cdot x \le \alpha$ for all $x \in Y$
If we have strict inequalities in the above two inequalities, that is, $p \cdot x > \alpha$ for all
$x \in X$ and $p \cdot x < \alpha$ for all $x \in Y$, then we say X and Y are strictly separated by
H.
REMARK: The above definition obviously holds even if X and/or Y is a set
consisting of only one point.
REMARK: A hyperplane H in $R^n$ divides $R^n$ into two "half spaces." In
particular, $H = \{x : p \cdot x = \alpha, x \in R^n\}$ determines the following two closed
half spaces.
$\{x : p \cdot x \ge \alpha, x \in R^n\}$ and $\{x : p \cdot x \le \alpha, x \in R^n\}$
It can be seen readily that a closed half space is convex as well as closed.
A hyperplane itself is clearly closed and convex.
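
The two closed half spaces can be explored with a few lines of code. The following Python sketch is an illustration only; the normal `p`, the constant `alpha`, the sample points, and the helper names `dot` and `side` are assumptions. It classifies points relative to a hyperplane in $R^2$ and spot-checks convexity of a closed half space along one line segment.

```python
def dot(p, x):
    """Inner product of two vectors given as tuples."""
    return sum(pi * xi for pi, xi in zip(p, x))

def side(p, alpha, x, tol=1e-12):
    """Return 0 if x lies on H = {x : p.x = alpha}, +1 if p.x > alpha, -1 if p.x < alpha."""
    v = dot(p, x) - alpha
    if abs(v) <= tol:
        return 0
    return 1 if v > 0 else -1

p, alpha = (1.0, 2.0), 4.0
print(side(p, alpha, (2.0, 1.0)))  # 0, since 1*2 + 2*1 = 4
print(side(p, alpha, (3.0, 3.0)))  # 1
print(side(p, alpha, (0.0, 0.0)))  # -1

# Convexity spot check: x and y both satisfy p.x >= alpha, and so does
# every sampled convex combination (1 - t) x + t y.
x, y = (4.0, 0.0), (0.0, 5.0)
for k in range(11):
    t = k / 10
    z = tuple((1 - t) * a + t * b for a, b in zip(x, y))
    assert dot(p, z) >= alpha - 1e-12
```

The check works because $p \cdot z = (1 - t)\, p \cdot x + t\, p \cdot y$, so the value at z lies between the values at x and y.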

Definition: Given a nonempty set X in $R^n$, a hyperplane H is called bounding for
X if X is contained in one of the two closed half spaces determined by H. If H
is bounding for X and H has a point in common with the boundary of X (that is,
$\inf_{x \in X} p \cdot x = \alpha$), then H is called a supporting hyperplane to X.

Definition: An intersection of hyperplanes is called a linear manifold.

We first prove a separation theorem in its simplest version.

Theorem 0.B.1: Let X be a nonempty closed convex set in $R^n$. Let $x_0 \notin X$. Then
the following are true.

(i) There exists a point $a \in X$ such that $d(x_0, a) \le d(x_0, x)$ for all $x \in X$, and
$d(x_0, a) > 0$.
(ii) There exists a $p \in R^n$, $p \ne 0$, $\|p\| < \infty$, and an $\alpha \in R$ such that
$p \cdot x \ge \alpha$ for all $x \in X$
and
$p \cdot x_0 < \alpha$

In other words, X and $x_0$ are separated by a hyperplane $H = \{x : p \cdot x = \alpha, x \in R^n\}$.

PROOF:

(i) Let $B(x_0)$ be a closed ball with center at $x_0$ and meeting $X$ [that is,
$B(x_0) \cap X \neq \emptyset$]. Write $A \equiv B(x_0) \cap X$. The set $A$ is nonempty, closed,
and bounded (hence compact). Since $A$ is compact and the distance
function is continuous, $d(x_0, x)$ achieves its minimum in $A$ as a result
of Weierstrass's theorem. That is, there exists an $a \in A$ such that $d(x_0, a) \le
d(x_0, x)$ for all $x \in A$. Hence a fortiori $d(x_0, a) \le d(x_0, x)$ for all $x \in X$.
Since $x_0 \notin X$ and $a \in X$, $d(x_0, a) > 0$.

[Figure 0.9. An Illustration of the Proof of Theorem 0.B.1.]

(ii) Let $p \equiv a - x_0$ and $\alpha \equiv p \cdot a$. Note first that
$p \cdot x_0 = (a - x_0) \cdot x_0 = (a - x_0) \cdot (x_0 - a) + (a - x_0) \cdot a = -(a - x_0) \cdot (a - x_0) + (a - x_0) \cdot a = -\|p\|^2 + \alpha < \alpha$,
where $0 < \|p\| < \infty$. Let $x \in X$ (arbitrary point).
Since $X$ is convex and $a \in X$, $x(t) \in X$, where $x(t) \equiv (1 - t)a + tx$,
$0 < t \le 1$. Then $d(x_0, a) \le d(x_0, x(t))$ by (i). In other words:
$\|a - x_0\|^2 \le \|x(t) - x_0\|^2 = \|(1 - t)a + tx - x_0\|^2 = \|(1 - t)(a - x_0) + t(x - x_0)\|^2$
$= (1 - t)^2 \|a - x_0\|^2 + 2t(1 - t)(a - x_0) \cdot (x - x_0) + t^2 \|x - x_0\|^2$.
Hence we obtain
$0 \le t(t - 2)\|a - x_0\|^2 + 2t(1 - t)(a - x_0) \cdot (x - x_0) + t^2 \|x - x_0\|^2$.
Divide both sides by $t$ ($t > 0$), and we obtain
$0 \le (t - 2)\|a - x_0\|^2 + 2(1 - t)(a - x_0) \cdot (x - x_0) + t\|x - x_0\|^2$.
Take a limit as $t \to 0$. Then we obtain
$0 \le -2\|a - x_0\|^2 + 2(a - x_0) \cdot (x - x_0)$
or
$0 \ge (a - x_0) \cdot (a - x)$;
that is, $p \cdot x \ge p \cdot a = \alpha$ for all $x \in X$.
(Q.E.D.)
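
The construction in the proof ($a$ is the point of $X$ nearest to $x_0$, $p \equiv a - x_0$, $\alpha \equiv p \cdot a$) can be traced numerically. The following is a minimal sketch, assuming $X$ is an axis-aligned box in $R^2$, for which the nearest point is obtained by clamping each coordinate; the box, the point $x_0$, and all function names are hypothetical choices made for illustration.

```python
def dot(u, v):
    """Inner product of two vectors given as sequences of floats."""
    return sum(ui * vi for ui, vi in zip(u, v))

def project_onto_box(x0, lo, hi):
    """Nearest point of the box {x : lo_i <= x_i <= hi_i} to x0 (componentwise clamp)."""
    return tuple(min(max(xi, l), h) for xi, l, h in zip(x0, lo, hi))

def separating_hyperplane(x0, lo, hi):
    """Return (a, p, alpha) with p = a - x0 and alpha = p . a, as in part (ii) of the proof."""
    a = project_onto_box(x0, lo, hi)
    p = tuple(ai - xi for ai, xi in zip(a, x0))
    if dot(p, p) == 0.0:
        raise ValueError("x0 lies in the box, so the theorem does not apply")
    alpha = dot(p, a)
    return a, p, alpha

# Hypothetical example: X = [0, 1] x [0, 1], x0 = (3, 2).
x0 = (3.0, 2.0)
lo, hi = (0.0, 0.0), (1.0, 1.0)
a, p, alpha = separating_hyperplane(x0, lo, hi)

# A linear function attains its minimum over a box at a corner,
# so checking the four corners verifies p . x >= alpha on all of X.
corners = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
print(a, p, alpha)
print(dot(p, x0) < alpha, all(dot(p, c) >= alpha for c in corners))
```

Here $p \cdot x_0 = -\|p\|^2 + \alpha$, in agreement with the first computation in part (ii).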

REMARK: Note that the convexity of X is used only in (ii) of the above
proof.

REMARK: Since $0 < \|p\| < \infty$, we may choose $p$ such that $\|p\| = 1$. [See footnote 2.]

REMARK: The above proof is essentially due to von Neumann and Morgenstern
[11]. Debreu [4] offers the following alternative argument for the
second part (ii) of the above proof, which avoids the use of the limit process
(that is, $t \to 0$). His argument is rather intuitive (see Figure 0.10). A rigorous
proof along this line can be seen in Berge [1], p. 162, or Berge and Ghouila-Houri
[2], pp. 53-54. The proof is done by contradiction. That is, suppose
that there is a point $x$ of $X$ which is strictly on the same side of $H$ as $x_0$.
Consider the point $x(t)$ on the line segment $ax$ such that $x_0 x(t)$ is orthogonal
to $ax$. Since $d(x_0, x) \ge d(x_0, a)$, the point $x(t)$ is between $a$ and $x$. Thus
$x(t) \in X$ and $d(x_0, x(t)) < d(x_0, a)$, which contradicts the choice of $a$.

REMARK: Note that the hyperplane $H = \{x : p \cdot x = \alpha,\ x \in R^n\}$ is a supporting
hyperplane to $X$. This hyperplane separates set $X$ from the given point
$x_0$ (which itself is a convex set). Hence the existence of such a supporting
hyperplane to $X$ is the crux of the theorem. Given a nonempty closed convex
set (say, $X$) and a point (say, $a$) in the boundary of $X$, we can assert that
there exists at least one supporting hyperplane to $X$ passing through the
point $a$. In Theorem 1, we saw that such a hyperplane played an important
role in the theorem. In fact, such a supporting hyperplane plays an even
more crucial role in other versions of the separation theorem, which we
prove in this section (Theorems 0.B.2 and 0.B.3). If the (hyper-) curve
which defines the boundary of $X$ is smooth ("differentiable") at the given
boundary point $a$, then the tangent (hyper-) plane at the point gives such
a supporting hyperplane. In this case there is only one supporting hyperplane
passing through the given point $a$. However, if the (hyper-) curve is
not smooth, then there can be many supporting hyperplanes passing through
the given point. These two cases are illustrated in Figure 0.11. It is important
to note that the above tangent hyperplane conceptually links the separation
theorems to calculus. The power of the separation theorem is that the
boundary (hyper-) curve does not have to be smooth (differentiable) at $a$,
and that it is more direct and set-theoretic.

[Figure 0.10. Debreu's Argument.]

Theorem 1 is sometimes stated in the following form.

Corollary: Let $X$ be a nonempty closed convex set in $R^n$ not containing the origin.
Then there exists a $p \in R^n$, $p \neq 0$, $\|p\| < \infty$, and an $\alpha \in R$, $\alpha > 0$, such that
$p \cdot x \ge \alpha$ for all $x \in X$
and this inequality can be made strict.
PROOF: In Theorem 1, let $x_0$ be the origin. Then, as a result of the theorem,
there exist $p \neq 0$ and $\alpha$ such that $p \cdot x \ge \alpha$ for all $x \in X$, where $p$ and $\alpha$ are
defined as $p \equiv a - x_0 = a$ and $\alpha \equiv p \cdot a$. By this definition, $\alpha = \|a\|^2 > 0$.
(Q.E.D.)
REMARK: Obviously, this corollary is really equivalent to Theorem 1, for
the choice of the origin can be arbitrary. The inequality in the statement
of the corollary can be made strict by choosing a point strictly in between
a and the origin (instead of a).
In the above theorem, X is assumed to be a closed set. In fact, we can
relax this assumption and we can obtain the following theorem.

[Figure 0.11. Supporting Hyperplanes. Panels: smooth case; nonsmooth case.]

Theorem 0.B.2: Let $X$ be a nonempty convex set in $R^n$ (not necessarily closed).
Let $x_0$ be a point in $R^n$ which is not in $X$. Then there exists $p \in R^n$, $p \neq 0$, $\|p\| < \infty$,
such that [see footnote 3]
$p \cdot x \ge p \cdot x_0$ for all $x \in X$
PROOF:

(i) Suppose $x_0 \notin \bar{X}$, where $\bar{X}$ is the closure of $X$. Then, by Theorem 1, there
exist $p \in R^n$, $p \neq 0$, and $\alpha \in R$ such that $p \cdot x \ge \alpha$ for all $x \in \bar{X}$ and
$p \cdot x_0 < \alpha$. Thus $p \cdot x > p \cdot x_0$ for all $x \in \bar{X}$. Hence, a fortiori, $p \cdot x > p \cdot x_0$
for all $x \in X$.
(ii) Suppose $x_0 \in \bar{X}$. Since $x_0 \notin X$ (that is, $x_0 \in X^c$) by assumption, $x_0$ is a
boundary point of $\bar{X}$. Then, for any open ball containing $x_0$, there exists a
point which is not in $\bar{X}$. That is, there exists a sequence $\{x^q\}$ such that
$x^q \notin \bar{X}$ and $x^q \to x_0$. Since $x^q \notin \bar{X}$ and $\bar{X}$ is nonempty, closed, and
convex, there exist, by Theorem 0.B.1, $p^q \in R^n$, $p^q \neq 0$, such that
$p^q \cdot x > p^q \cdot x^q$ for all $x \in \bar{X}$. This is illustrated in Figure 0.12. Now
without a loss of generality, we can choose $p^q$ such that $\|p^q\| = 1$. Then
the sequence $\{p^q\}$ moves in the unit sphere of $R^n$. Since the unit sphere
is compact, there exists a convergent subsequence in the sphere; that is,
there exists a subsequence such that $p^{q_s} \to p$ with $\|p\| = 1$, where $\{p^{q_s}\}$
corresponds to $\{x^{q_s}\}$. Take the limit of $p^{q_s} \cdot x > p^{q_s} \cdot x^{q_s}$ as $q_s \to \infty$. Since
an inner product is a continuous function, we have $p \cdot x \ge p \cdot x_0$ for all
$x \in \bar{X}$. Hence, a fortiori, $p \cdot x \ge p \cdot x_0$ for all $x \in X$. (Q.E.D.)

Theorem 0.B.3 (Minkowski): Let $X$ and $Y$ be nonempty convex sets in $R^n$ (not
necessarily closed) such that $X \cap Y = \emptyset$. Then there exist $p \in R^n$, $p \neq 0$, $\|p\| < \infty$,
and $\alpha \in R$ such that $p \cdot x \ge \alpha$ for all $x \in X$ and $p \cdot y \le \alpha$ for all $y \in Y$. [See footnote 4.]

[Figure 0.12. An Illustration of the Proof of Theorem 0.B.2.]

PROOF: Consider $S \equiv X + (-Y)$ (the set obtained by vector addition). Since
$X$ and $Y$ are convex, $S$ is also convex. Also $0 \notin S$. (If $0 \in S$, then there exist
$x^* \in X$ and $y^* \in Y$ such that $x^* + (-y^*) = 0$, or $x^* = y^*$. This contradicts
$X \cap Y = \emptyset$.) Hence, owing to Theorem 0.B.2, there exists a $p \neq 0$ such that
$p \cdot z \ge p \cdot 0 = 0$ for all $z \in S$. Write $z = x - y$, $x \in X$, $y \in Y$. Thus we have
$p \cdot x \ge p \cdot y$ for all $x \in X$ and $y \in Y$. In other words, $\inf_{x \in X} p \cdot x \ge \sup_{y \in Y} p \cdot y$.
Hence we can pick $\alpha$ in such a way that the conclusion of the theorem
holds. (Q.E.D.)
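
The last step of the proof (choose $\alpha$ between $\inf_{x \in X} p \cdot x$ and $\sup_{y \in Y} p \cdot y$) can be illustrated with two disjoint closed disks, for which both bounds have closed forms. This is a minimal sketch under that assumption; taking $p$ as the difference of the centers is a convenient choice for disks, not the construction used in the proof, and all names and numbers are hypothetical.

```python
import math

def dot(u, v):
    """Inner product of two vectors given as sequences of floats."""
    return sum(ui * vi for ui, vi in zip(u, v))

def separate_disks(c1, r1, c2, r2):
    """Separate the closed disks X = D(c1, r1) and Y = D(c2, r2).

    Returns (p, alpha, inf_X, sup_Y) with p = c1 - c2, where
    inf_X = min of p.x over X and sup_Y = max of p.y over Y.
    """
    p = tuple(u - v for u, v in zip(c1, c2))
    norm = math.sqrt(dot(p, p))
    # Two closed disks are disjoint exactly when the centers are
    # farther apart than the sum of the radii.
    if norm <= r1 + r2:
        raise ValueError("the disks intersect, so they cannot be separated this way")
    inf_X = dot(p, c1) - r1 * norm   # attained at c1 - r1 * p / |p|
    sup_Y = dot(p, c2) + r2 * norm   # attained at c2 + r2 * p / |p|
    alpha = 0.5 * (inf_X + sup_Y)    # any value in [sup_Y, inf_X] would do
    return p, alpha, inf_X, sup_Y

# Hypothetical example: X centered at (4, 0) with radius 1, Y centered at the origin with radius 2.
p, alpha, inf_X, sup_Y = separate_disks((4.0, 0.0), 1.0, (0.0, 0.0), 2.0)
print(p, alpha, inf_X, sup_Y)
```

Because both disks are closed and bounded, the gap between `sup_Y` and `inf_X` is positive and the separation is in fact strict.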
REMARK: If, in addition, $X$ is closed and $Y$ is compact, then we can
strengthen the conclusion of Theorem 0.B.3 as $p \cdot x > \alpha$ for all $x \in X$ and
$p \cdot y < \alpha$ for all $y \in Y$. For the proof of this theorem, see Berge [1], pp.
163-164; and Berge and Ghouila-Houri [2], p. 55.
REMARK: Theorem O.B.2 is clearly a special case of Theorem O.B.3, in
which one of the two sets is a set consisting of only one point (which is
obviously convex). The previous remark states the theorem which general-
izes Theorem O.B.1, since a set consisting of only one point is compact.
REMARK: In the above theorems we used expressions such as
$p \cdot x \ge \alpha$ for all $x \in X$ (or $p \cdot x > \alpha$ for all $x \in X$)
and
$p \cdot y \le \alpha$ for all $y \in Y$ (or $p \cdot y < \alpha$ for all $y \in Y$)
The directions of these inequalities are immaterial to the essence of the
theorems, for we can easily reverse the direction of the inequalities by
defining $\hat{p} \equiv -p$ and $\hat{\alpha} \equiv -\alpha$. Then $\hat{p} \cdot x \le \hat{\alpha}$ for all $x \in X$ (or $\hat{p} \cdot x < \hat{\alpha}$ for
all $x \in X$), and $\hat{p} \cdot y \ge \hat{\alpha}$ for all $y \in Y$ (or $\hat{p} \cdot y > \hat{\alpha}$ for all $y \in Y$). Notice
also that in the statement of the above theorems we did not specify the signs
of $\alpha$ and of each component of $p$.
We finish our discussion of the separation theorems by showing one of their
important applications. We will prove the Minkowski-Farkas lemma by using
Theorem 0.B.1. In order to do this we need the following lemma.

Lemma: Let $K$ be a cone, with the vertex at the origin, in $R^n$, and let $p$ be a
given point in $R^n$. If $p \cdot x$ is bounded from below for all $x \in K$, then $p \cdot x \ge 0$ for all
$x \in K$.
PROOF: By assumption, there exists an $\alpha \in R$ such that $p \cdot x \ge \alpha$ for all $x \in K$.
Since $K$ is a cone with the vertex at the origin, $x \in K$ implies $\theta x \in K$ for all
$\theta \ge 0$. Hence $p \cdot (\theta x) \ge \alpha$, or $p \cdot x \ge \alpha/\theta$, for all $x \in K$ and $\theta > 0$. Taking the
limit as $\theta \to \infty$ yields $p \cdot x \ge 0$. (Q.E.D.)
We can now prove the following theorem.

Theorem 0.B.4 (Minkowski-Farkas lemma): Let $a^1, a^2, \ldots, a^m$ and $b \neq 0$ be points
in $R^n$. Suppose that $b \cdot x \ge 0$ for all $x$ such that $a^i \cdot x \ge 0$, $i = 1, 2, \ldots, m$. Then
there exist coefficients $\lambda_1, \lambda_2, \ldots, \lambda_m$, all $\ge 0$ and not vanishing simultaneously,
such that $b = \sum_{i=1}^m \lambda_i a^i$.
PROOF: Let $K$ be a convex polyhedral cone generated by $a^1, a^2, \ldots, a^m$.
Then $K$ is a closed set. We want to show that $b \in K$. Suppose $b \notin K$. Then
$K$ is a nonempty, closed, convex set which is disjoint from $b$. Hence from
Theorem 0.B.1, there exist $p \in R^n$, $p \neq 0$, and $\alpha \in R$ such that
$p \cdot b < \alpha$ and $p \cdot x \ge \alpha$ for all $x \in K$. Because of the previous
lemma, we have $p \cdot x \ge 0$ for all $x \in K$. Also note that $0 \in K$ means $p \cdot 0 \ge \alpha$,
or $\alpha \le 0$. Thus we have $p \cdot b < 0$. Since $a^i \in K$ for all $i$, $p \cdot a^i \ge 0$ for all $i$.
Thus for this $p$, we have $b \cdot p < 0$ with $a^i \cdot p \ge 0$, $i = 1, 2, \ldots, m$. This contradicts
the hypothesis of the theorem. Hence $b \in K$. In other words, there exist
$\lambda_1, \lambda_2, \ldots, \lambda_m$, all $\ge 0$, such that $b = \sum_{i=1}^m \lambda_i a^i$. Since $b \neq 0$, the $\lambda_i$'s cannot
vanish simultaneously. (Q.E.D.)
REMARK: If $b = 0$, then it is possible that $\lambda_i = 0$ for all $i = 1, 2, \ldots, m$.

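REMARK: The converse of the above theorem is also true. If $a^1, a^2, \ldots, a^m$
and $b \neq 0$ are points in $R^n$, and if there exist coefficients $\lambda_1, \lambda_2, \ldots, \lambda_m$,
all $\ge 0$ (not vanishing simultaneously), such that $b = \sum_{i=1}^m \lambda_i a^i$, then $b \cdot x \ge 0$
for all $x$ such that $a^i \cdot x \ge 0$, $i = 1, 2, \ldots, m$.
PROOF: Suppose $\lambda_1, \lambda_2, \ldots, \lambda_m$, all $\ge 0$, are such coefficients that $b =
\sum_{i=1}^m \lambda_i a^i$; then $b \cdot x = (\sum_{i=1}^m \lambda_i a^i) \cdot x = \sum_{i=1}^m \lambda_i (a^i \cdot x) \ge 0$. (Q.E.D.)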
Owing to the above remark, the Minkowski-Farkas lemma can also be stated
in the following form.

Theorem 0.B.5: Given points $a^i$, $i = 1, 2, \ldots, m$, and $b \neq 0$ in $R^n$, exactly one of
the following two alternatives holds.

(i) There exist $\lambda_i$, $i = 1, 2, \ldots, m$, all $\ge 0$ (not vanishing simultaneously), such that
$b = \sum_{i=1}^m \lambda_i a^i$, or

(ii) There exists an $x \in R^n$ such that
$b \cdot x < 0$ and $a^i \cdot x \ge 0$, $i = 1, 2, \ldots, m$
REMARK: Theorem 0.B.5 is stated as follows: "if (ii) does not hold, then
(i) holds."

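Defining an $m \times n$ matrix $A$, a column vector $x$, and a row vector $\lambda$ by

$$A \equiv \begin{bmatrix} a^1 \\ a^2 \\ \vdots \\ a^m \end{bmatrix}, \qquad x \equiv \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad \lambda \equiv (\lambda_1, \lambda_2, \ldots, \lambda_m),$$

we can state the above alternatives (i) and (ii) in the following forms:

(i) The equation $b = \lambda \cdot A$ has a nonnegative solution $\lambda \ge 0$, $\lambda \neq 0$.

(ii) The inequalities $b \cdot x < 0$ and $A \cdot x \ge 0$ have a solution $x$.

Theorem 0.B.4 can be restated as "$b \cdot x \ge 0$ for all $x$ such that $A \cdot x \ge 0$
implies that there exists a $\lambda \ge 0$ with $\lambda \neq 0$ such that $b = \lambda \cdot A$."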
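
The two alternatives can be made concrete in the simplest nontrivial case. The following is a minimal sketch, assuming $n = m = 2$ with $a^1$ and $a^2$ linearly independent and $b \neq 0$, so that $b = \lambda \cdot A$ has a unique solution obtainable by Cramer's rule; the function name `farkas_2d` and the sample vectors are hypothetical. When some $\lambda_k$ is negative, the vector $x$ solving $A \cdot x = e_k$ (the $k$th unit vector) satisfies $a^i \cdot x \ge 0$ for all $i$ and $b \cdot x = \lambda_k < 0$, which is alternative (ii).

```python
def dot(u, v):
    """Inner product of two vectors given as sequences of floats."""
    return sum(ui * vi for ui, vi in zip(u, v))

def farkas_2d(a1, a2, b, tol=1e-12):
    """Decide which alternative holds for generators a1, a2 in R^2 and b != 0.

    Returns ("i", (lam1, lam2)) if b = lam1*a1 + lam2*a2 with lam >= 0,
    otherwise ("ii", x) with b.x < 0, a1.x >= 0, a2.x >= 0.
    """
    det = a1[0] * a2[1] - a1[1] * a2[0]
    if abs(det) <= tol:
        raise ValueError("this sketch assumes a1 and a2 are linearly independent")
    # Solve b = lam1*a1 + lam2*a2 by Cramer's rule.
    lam1 = (b[0] * a2[1] - b[1] * a2[0]) / det
    lam2 = (a1[0] * b[1] - a1[1] * b[0]) / det
    if lam1 >= 0 and lam2 >= 0:
        return "i", (lam1, lam2)
    if lam1 < 0:
        # Solve a1.x = 1, a2.x = 0, so that b.x = lam1 < 0.
        x = (a2[1] / det, -a2[0] / det)
    else:
        # Solve a1.x = 0, a2.x = 1, so that b.x = lam2 < 0.
        x = (-a1[1] / det, a1[0] / det)
    return "ii", x

a1, a2 = (1.0, 0.0), (0.0, 1.0)
print(farkas_2d(a1, a2, (1.0, 2.0)))    # b inside the cone: alternative (i)
print(farkas_2d(a1, a2, (-1.0, 1.0)))   # b outside the cone: alternative (ii)
```

For general $m$ and $n$ the decision requires solving a linear program; the closed form above relies entirely on the invertibility assumed here.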
REMARK: A geometric interpretation of Theorem 0.B.4 is as follows: [See footnote 5.]

[Figure 0.13. A Geometric Interpretation of the Minkowski-Farkas Lemma. Panels: Case a, $b \in K$; Case b, $b \notin K$.]

(i) The inequality $a^i \cdot x \ge 0$ for all $i$ means that $x$ is in the cone POQ (the
shaded area).
(ii) The inequality $b \cdot x \ge 0$ means that $x$ is in the half space, determined
by the hyperplane $H \equiv \{x : b \cdot x = 0,\ x \in R^n\}$, which contains the point $b$.
(iii) The conclusion of the theorem is "$b \in K$," where $K \equiv \{y : y = \lambda \cdot A,\ \lambda \ge 0\}$.
(iv) If $b \in K$ (Case a), then we have $b \cdot x \ge 0$ for all $x$ such that $a^i \cdot x \ge 0$,
$i = 1, 2, \ldots, m$ (that is, the cone AOB is contained in the half space,
determined by $H$, which contains $b$).
(v) If $b \notin K$ (Case b), then we cannot have $b \cdot x \ge 0$ for all $x$ such that
$a^i \cdot x \ge 0$, $i = 1, 2, \ldots, m$. In other words, there exists a point (say,
$x$) in the cone AOB, but not in the half space which contains $b$ (Figure
0.13, Case b).

REMARK: The Minkowski-Farkas lemma plays an important role in the
theory of linear programming (for example, the duality theorem), game
theory (for example, the zero-sum two-person game), and the theory of
nonlinear programming (for example, the Kuhn-Tucker theorem), and so
on. There are many ways to prove this theorem. The above proof (the proof
of Theorem 0.B.4) is a minor modification of the proof by Berge [1], p. 164.
Alternative proofs can be found, for example, in Gale [6], pp. 44-46;
Goldman, A. J., and Tucker, A. W., "Polyhedral Convex Cones" in [7];
Nikaido [8], sec. 29, [9], 1.3; and Hestenes, M. R., Calculus of Variations
and Optimal Control Theory, New York, Wiley, 1966, pp. 13-15.

FOOTNOTES

1. By $\|p\| < \infty$, we mean $\|p\|$ is finite. Note also that $p \neq 0$ means $\|p\| > 0$.

2. In the statement of the theorem, $p \cdot x \ge \alpha$ for all $x \in X$ and $p \cdot x_0 < \alpha$, redefine
$p$ and $\alpha$ by $p/\|p\|$ and $\alpha/\|p\|$ respectively. Needless to say, $\| p/\|p\| \| = 1$.
3. Again, $p$ can be chosen such that $\|p\| = 1$, if one wishes to do so.
4. Again, $p$ can be chosen such that $\|p\| = 1$, if one wishes to do so.
5. Given two vectors $x$ and $y$ in $R^n$, we can prove that $x \cdot y = \|x\| \|y\| \cos\theta$, where
$\|x\|^2 \equiv \sum_{i=1}^n x_i^2 = x \cdot x$ and $\theta$ is the angle between the two vectors $x$ and $y$
($0 \le \theta \le \pi$). Hence $x \cdot y > 0$ if $0 \le \theta < \pi/2$, $x \cdot y = 0$ if $\theta = \pi/2$, and $x \cdot y < 0$ if $\pi/2 < \theta \le \pi$.
Incidentally, from the above relation $x \cdot y = \|x\| \|y\| \cos\theta$, we can also obtain
$(x \cdot y)^2 \le \|x\|^2 \|y\|^2$, which is called the Cauchy-Schwarz inequality (in fact, this
relation holds for any inner product space).

REFERENCES
1. Berge, C., Topological Spaces, New York, Macmillan, 1963 (French original, 1959),
esp. chap. VIII.
2. Berge, C., and Ghouila-Houri, A., Programming, Games and Transportation Net-
works, New York, Wiley, 1965 (French original, 1962).
3. Debreu, G., Theory of Value, New York, Wiley, 1959.
4. Debreu, G., "Separation Theorem for Convex Sets," in "Selected Topics in Economics
Involving Mathematical Reasoning" by Koopmans, T. C., and Bausch, A. F.,
SIAM Review, 1, July 1959.
5. Fenchel, W., Convex Cones, Sets, and Functions, Princeton, N.J., Princeton Univer-
sity, 1950 (hectographed).
6. Gale, D., The Theory of Linear Economic Models, New York, McGraw-Hill, 1960.
7. Kuhn, H. W., and Tucker, A. W., eds., Linear Inequalities and Related Systems,
Princeton, N.J., Princeton University Press, 1956.
8. Nikaido, H., Introduction to Sets and Mappings in Modern Economics, tr. K. Sato,
Amsterdam, North-Holland, 1970 (Japanese original, Tokyo, 1960).
9. Nikaido, H., Convex Structures and Economic Theory, New York, Academic Press, 1969.
10. Valentine, F. A., Convex Sets, New York, McGraw-Hill, 1964.
11. von Neumann, J., and Morgenstern, O., Theory of Games and Economic Behavior,
2nd ed., Princeton, N.J., Princeton University Press, 1947 (1st ed., 1944).

Section C
ACTIVITY ANALYSIS
AND THE
GENERAL PRODUCTION SET

The central concept of the traditional (or "neoclassical") analysis of produc-

tion is that of a production function. A production function is a function which
describes the technological relation between various outputs and various inputs.
If we denote the output vector by $x = (x_1, x_2, \ldots, x_k)$ and the input vector by
v = (v1, v2, ..., v,), then a "production function" can be written as F(x, v) = 0.
This relation is usually supposed to define a unique surface on the x-plane for
a given value of v. In order to understand the meaning of the traditional produc-
tion function analysis more fully, let us first consider the case where there is
no joint production so that x = x1. In this case a production function is usually
written in the form x = f(v). It is generally supposed that this relation assigns a
unique value of x for each value of the vector v. In other words, f is assumed to
be a single-valued function. Then the traditional analysis usually proceeds with
a further assumption, the differentiability of function f, and the analysis then
becomes one which may best be described as "marginal analysis."
It has long been realized that this concept of a production function is un-
necessarily restrictive. It, in a sense, presupposes the existence of an "efficient
manager." Given available quantities of factors, the efficient manager maximizes
the amount of output produced. In the joint production case, this manager supposedly
maximizes the production of one arbitrary output with all the other
outputs held constant. Thus the efficient manager in this case defines the unique
surface with a given amount of inputs.
"Activity analysis" revolutionizes traditional production analysis by dis-
carding the above concepts of a production function and an "efficient manager."
Instead, it postulates the set of production processes available in a given economy.
(Here the word "economy" can mean a firm, a collection of firms, the entire
national economy, or the whole world.) This set is called a production set. An
element of this set is an n-tuple which describes the technological relation of the
input-output combination of one process of production. An element of the
production set is called a process or an activity. We may also call it a blueprint
to stress its technological character. There is no presupposition about the existence
of an "efficient manager," so that nothing in the beginning is specified about
what processes or blueprints in the production set should be adopted or discarded.
If one likes, one can include the managerial ability of Mr. A in the list of com-
modities. We assume that there are n "commodities" in the economy and each
commodity is qualitatively homogeneous. In general, a commodity is defined
here by a specification of all its physical characteristics, of its availability location,
and of its availability date. Hence, for example, flows of technically the same
commodity in two different locations represent two different commodities. [See footnote 1.]
Note that we may always regard different commodities as one commodity, if
this facilitates a sharper and deeper analysis of a particular problem.
The production process is described by an ordered n-tuple of these com-
modities. (The dimension n is usually assumed to be a finite, positive integer, but
can be infinity.) The production set is the collection of these n-tuples. The
following example is from Koopmans and Bausch [9], pp. 99-100. Here we
consider an economy with four commodities and two processes.

                           Process 1      Process 2
                           (Tanning)      (Shoemaking)

Commodity 1 (shoes)            0              1
Commodity 2 (leather)          1          [illegible]
Commodity 3 (hides)           -1              0
Commodity 4 (labor)        [illegible]    [illegible]

In each process inputs are represented by negative numbers and outputs are rep-
resented by positive numbers. Note also that there can be any number of proces-
ses for tanning or shoemaking. Moreover, each process can have more than one
positive entry. This is the case of joint production. For example, in a process which
produces cow hides, beef may also be produced.
The scope of activity analysis is not limited to statics. The convention of
dating commodities (which is due to Hicks [5]) extends the scope of activity
analysis to dynamics and capital theory, in which time is involved in an essential
manner. For example, consider the Åkerman-Wicksell model of the durability
of capital. Assume that one unit of the capital good (an axe) whose durability is
$j$ days is produced by $l_j$ units of labor. Assume that $l_j$ men are used as input
the first day, leaving one unit of the axe for the second day. Assume that the
axe of durability $j$ lasts for $j$ days after it is built with the same efficiency and
suddenly "dies" at the end of the $(j + 1)$th day with zero scrap value. Then the
$j$th production process which produces the axe of durability $j$ can be expressed
by the following vector:

$(-l_j, 1, 1, \ldots, 1, 0, \ldots, 0)$

where there are $j$ 1's in this vector. Assuming that the maximum durability in
use one can obtain (with any amount of the initial labor input) is $m$ days, there
are $(m - j)$ 0's in the above vector.
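
The dated-commodity vector above is easy to build mechanically. The following is a minimal sketch, assuming the labor requirement $l_j$ is supplied as a number; the function name `durability_process` and the sample values are hypothetical.

```python
def durability_process(j, m, labor_j):
    """Process vector (-l_j, 1, ..., 1, 0, ..., 0) for an axe of durability j.

    The first entry is the labor input on day 1; it is followed by j ones
    (axe services on the next j days) and (m - j) zeros, where m is the
    maximum durability. The vector therefore has m + 1 dated components.
    """
    if not 1 <= j <= m:
        raise ValueError("durability j must satisfy 1 <= j <= m")
    if labor_j < 0:
        raise ValueError("the labor requirement must be nonnegative")
    return [-labor_j] + [1] * j + [0] * (m - j)

# Hypothetical numbers: durability 2 out of a maximum of 4 days, using 3 units of labor.
print(durability_process(2, 4, 3.0))

# The m available processes, one per durability (labor requirements are made up).
labor = {1: 1.0, 2: 3.0, 3: 6.0, 4: 10.0}
processes = [durability_process(j, 4, labor[j]) for j in range(1, 5)]
```

Choosing the durability of the axe then amounts to choosing one element of `processes`.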
the axe, and the choice of the durability of the axe amounts to choosing a proper
process from these m processes. Clearly this convention of "dating commodities"
can also be applied to the Böhm-Bawerk-Wicksell theory of the period of production.
If the inputs (say grape juice) are "sunk" for certain periods of time, as
in the Wicksellian model of vintage wine, then there are zeros in the production
process vector corresponding to such periods. The choice of the period of produc-
tion amounts to choosing a proper process among the set of processes which are
distinguished by the number of these zeros. For a completely general treatment on
capital theory from the activity analysis viewpoint, we simply refer to Malinvaud
[10]. Clearly, it is also possible to build a model of growth or capital without
following the convention of dating commodities. Simpler treatments are often
possible.
The modus operandi of activity analysis is through the use of set theory and
other branches of modern mathematics. Activity analysis is axiomatic, more
fundamental, and more rigorous than the traditional production function analysis.
Separation theorems will play an important role in activity analysis just as deriva-
tives played an important role in production function analysis. If we like, how-
ever, we can characterize the production set by some functional relations and
pursue an analysis using these relations. The analysis then looks similar to the
traditional analysis except that it is more general. Now we will study the elements
of this modern production analysis. This will provide a good bridge to the
modern economic theory which we propose to study in subsequent chapters.
Let $Y$ be the set of all the technically possible production processes in a
given "economy." We assume $Y \subset R^n$, and $y \in Y$ denotes a production process
in the economy. We use the convention that the $i$th component, $y_i$, of $y$ represents
an "output" if $y_i > 0$ and represents an "input" if $y_i < 0$. [See footnote 2.] The quantity $|y_i|$ indicates
the amount of the $i$th "commodity" involved in this process $y$. We first
impose the following two postulates.

(A-1) (Additivity) y E Y and y' E Y imply y + y' E Y.

(A-2) (Proportionality) $y \in Y$ implies $\alpha y \in Y$ for all $\alpha \ge 0$, $\alpha \in R$.

Thus $Y$ is a convex cone. Due to the proportionality, if $a^j \in Y$, then
$\lambda_j a^j \in Y$ for all $\lambda_j \ge 0$, $\lambda_j \in R$, where

$$a^j \equiv \begin{bmatrix} a_{1j} \\ \vdots \\ a_{nj} \end{bmatrix}$$

The vector $a^j$ may be referred to as the $j$th activity (or process) of $Y$ in its unit
level of operation. Here $a_{ij}$ denotes the amount of the $i$th good involved in one
unit operation of the $j$th activity; $\lambda_j$ signifies the activity level of the $j$th activity.
Now we impose the third postulate:
(A-3) (Finite number of basic activities) There exist a finite number of $a^j$'s such that
$Y$ is a convex polyhedral cone generated by these $a^j$'s. These $a^j$'s are called basic
activities.
In other words, a typical element $y$ in $Y$ can be expressed as a nonnegative linear
combination of $a^1, a^2, \ldots, a^m$, where $m$ is a finite positive integer. Owing to the
above postulates, the production set $Y$ in activity analysis can be written as
$Y = \{y : y = A \cdot \lambda,\ \lambda \ge 0\}$, where $A$ is an $n \times m$ matrix (with real-number entries)
formed by $[a^1, \ldots, a^m]$ and $\lambda$ is an $m$-vector whose $j$th element is $\lambda_j$.
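
The representation $y = A \cdot \lambda$ with $\lambda \ge 0$ can be illustrated with a two-activity, four-commodity economy. The following is a minimal sketch; the coefficients in `A` are hypothetical numbers chosen for exact arithmetic (they are not the figures of the Koopmans and Bausch example), and the function name `net_output` is illustrative.

```python
def net_output(A, lam):
    """Return y = A . lam for an n x m activity matrix A (list of rows) and levels lam >= 0."""
    if any(l < 0 for l in lam):
        raise ValueError("activity levels must be nonnegative")
    n, m = len(A), len(A[0])
    if len(lam) != m:
        raise ValueError("one activity level is needed per column of A")
    return [sum(A[i][j] * lam[j] for j in range(m)) for i in range(n)]

# Rows: shoes, leather, hides, labor. Columns: tanning, shoemaking.
# Inputs are negative and outputs positive; all entries are hypothetical.
A = [
    [0.0, 1.0],     # shoes
    [1.0, -0.5],    # leather
    [-1.0, 0.0],    # hides
    [-0.25, -1.0],  # labor
]

lam = [2.0, 4.0]            # run tanning at level 2 and shoemaking at level 4
y = net_output(A, lam)
print(y)                    # leather nets out to zero: it is produced and used up

# Proportionality (A-2): doubling every activity level doubles y.
y2 = net_output(A, [2 * l for l in lam])
```

With these levels leather appears only as an intermediate good: the tanning output exactly covers the shoemaking requirement.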
It should be clear that the proportionality postulate means complete divis-
ibility of all the commodities and constant returns to scale and that the additivity
postulate means the independent action of each activity (no interactions among
activities): in Scitovsky's terminology, there are no ("technological") external
economies or diseconomies.
That Y is a convex polyhedral cone implicitly entails several other features.
Some important ones are the following:

(i) $0 \in Y$ (possibility of inaction). That is, it is possible for the producer to do
nothing.
(ii) Y is a closed set. (This is mathematically both a very important and a nice
feature. Economically it means that any production process that can be approxi-
mated by processes in Y is itself in Y.)

Koopmans [7] imposed the following three additional important postulates
(whose economic meanings should be self-evident).
(A-4) (Productiveness) There exists at least one positive element for some $y$ in $Y$.
(A-5) (No land of Cockaigne) $y \ge 0$ (that is, $y$ nonnegative and $y \neq 0$) implies $y \notin Y$. Or $Y \cap \Omega = \{0\}$. [See footnote 3.]
(A-6) (Irreversibility) $y \in Y$ and $y \neq 0$ imply $-y \notin Y$. Or $Y \cap (-Y) = \{0\}$.
The two diagrams of Figure 0.14 illustrate the meaning of some of the above postulates.
In case a, (A-4) and (A-5) hold but not (A-6). In case b, (A-4), (A-5), and (A-6)
all hold. We may note one important consequence of (A-5), which illustrates the
meaning of the above postulates. For a detailed investigation of the implications
of these postulates, see Koopmans [7], chap. III.

[Figure 0.14. Illustrations of Two Cases. Panels: Case a; Case b.]

Theorem 0.C.1: Let $Y = \{y : y = A \cdot \lambda,\ \lambda \ge 0\}$. $Y$ satisfies postulate (A-5) if and
only if there exists a $p > 0$, $p \in R^n$, $\|p\| < \infty$, such that $p \cdot y \le 0$ for all $y \in Y$.

PROOF:

(i) (Sufficiency) $y \ge 0$ implies $p \cdot y > 0$ for any $p > 0$. Hence by assumption
$y \notin Y$.
(ii) (Necessity) Omitted (an interesting exercise for the use of the separation
theorems). [See footnote 4.]

REMARK: If we interpret $p$ as a price vector, then $p \cdot y$ represents the profit
from $y$. Hence, for example, $p \cdot y \le 0$ for all $y \in Y$ means that the maximum
profit is at most 0.
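
When $Y = \{y : y = A \cdot \lambda,\ \lambda \ge 0\}$, the condition $p \cdot y \le 0$ for all $y \in Y$ holds exactly when every basic activity earns a nonpositive profit at unit level, because $p \cdot y = \sum_j \lambda_j (p \cdot a^j)$ with $\lambda_j \ge 0$. The following is a minimal sketch of that check; the activity matrix and the two price vectors are hypothetical, and the function names are illustrative.

```python
def unit_profits(A, p):
    """Profit p . a^j of each basic activity (column of A) at unit level."""
    n, m = len(A), len(A[0])
    return [sum(p[i] * A[i][j] for i in range(n)) for j in range(m)]

def profit(A, p, lam):
    """Profit p . y of the process y = A . lam, computed as sum_j lam_j * (p . a^j)."""
    return sum(l * u for l, u in zip(lam, unit_profits(A, p)))

def no_positive_profit(A, p):
    """True if p . y <= 0 for every y in the cone generated by the columns of A."""
    return all(u <= 0 for u in unit_profits(A, p))

# Rows: shoes, leather, hides, labor. Columns: tanning, shoemaking (hypothetical data).
A = [
    [0.0, 1.0],
    [1.0, -0.5],
    [-1.0, 0.0],
    [-0.25, -1.0],
]

p_low = (1.0, 1.0, 1.0, 1.0)    # every activity loses money at these prices
p_high = (2.0, 1.0, 0.5, 1.0)   # both activities are profitable at these prices

print(unit_profits(A, p_low), no_positive_profit(A, p_low))
print(unit_profits(A, p_high), no_positive_profit(A, p_high))
print(profit(A, p_low, [2.0, 4.0]))
```

Under `p_high` profit is unbounded above on the cone, since any profitable activity can be scaled up without limit.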
We stated postulates (A-4), (A-5), and (A-6) in connection with the production
set $Y$ which is a convex polyhedral cone. In general these postulates can
be stated even if $Y$ is not a convex polyhedral cone. When $Y$ represents the collection
of input-output combinations that are technically feasible in a given economy,
and when we do not require $Y$ to be a convex polyhedral cone, we call $Y$ a general
production set. We can list some of the important postulates that we may wish
to impose on the general production set $Y$. (Most of the results in activity analysis
follow in an arbitrary normed linear space, as well as in $R^n$.)

(i) The set Y is closed.

(ii) Possibility of inaction ($0 \in Y$).
(iii) Productiveness [that is, (A-4)].
(iv) No land of Cockaigne ($Y \cap \Omega \subset \{0\}$). [See footnote 5.]
(v) Irreversibility ($Y \cap (-Y) \subset \{0\}$).
(vi) Free disposability [$y \in -\Omega$ implies $y \in Y$, or $Y \supset (-\Omega)$].
(vii-a) The set $Y$ is a convex polyhedral cone.
(vii-b) The set $Y$ is a convex cone.
(vii-c) The set $Y$ is convex.

Note that (vii-a) implies (i) and (ii) and that (vii-b) implies (ii). Statements
(vii-c) and (ii) together imply that if $y \in Y$, then $\alpha y \in Y$ for all $0 \le \alpha \le 1$; in
other words, nonincreasing returns to scale prevail (or increasing returns to
scale are ruled out). Note also that the convexity of $Y$ presupposes the
divisibility of all the goods involved.
The production set Y as described above indicates the technological pos-
sibilities in a given economy; hence it is free from resource limitations. In other
words, y E Y indicates how much output can be produced after specifying the
amounts of the inputs, and we do not ask whether these inputs are, in fact,
available in the economy. (Thus we called y E Y a "blueprint.") In this sense it
corresponds to the concept of the classical production function. However, we
can also take resource limitations into account. For example, $Y$ can indicate a
"truncated" convex polyhedral cone, such as $Y = \{y : y = A \cdot \lambda,\ \lambda \ge 0,\ \lambda \in R^m$,
and $y + z \ge 0\}$, where $z \ge 0$ denotes the resource limitation of the economy.
With the no land of Cockaigne postulate, such a set is no longer a convex polyhedral
cone, although it is still convex. Note that the set is compact now. This
truncation can easily be illustrated by Figure 0.15.

[Figure 0.15. A Truncated Production Cone.]

REMARK: Strictly speaking, the use of the words "activity analysis" may
have to be confined to a study of production processes when the number of
basic activities is finite. In other words, Y must be confined to a convex
polyhedral cone or a "truncated" convex polyhedral cone. We will not
adopt this narrow definition. The revolutionary character of activity analysis
is not in a particular shape of Y. It is in the set-theoretic approach which
is more fundamental and powerful than the traditional smooth (differen-
tiable) production function approach. We now introduce the most important
concept in activity analysis.

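Definition: Let $Y \subset R^n$ be a general production set. A point $\hat{y}$ in $Y$ is called
an efficient point of $Y$ if there does not exist a $y \in Y$ such that $y \ge \hat{y}$ (that is,
$y_i \ge \hat{y}_i$ for all $i$, with strict inequality for at least one $i$).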
REMARK: An efficient point represents a boundary point, and it cor-
responds to a point on the classical production function. It is an input-output
combination such that no output can be increased without decreasing other
outputs or increasing inputs. In terms of the previous diagram, OA is the set
of efficient points in the truncated production cone. If the production set
is the entire cone, the half line from 0 passing through A is the set of the
efficient points.
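
For a finite production set the definition of an efficient point can be applied directly by pairwise comparison. The following is a minimal sketch, assuming processes are tuples with the output first and the input (negative) second; the set `Y` and the function names are hypothetical.

```python
def dominates(y, z):
    """True if y >= z componentwise with strict inequality in at least one component."""
    return all(a >= b for a, b in zip(y, z)) and any(a > b for a, b in zip(y, z))

def efficient_points(Y):
    """Points of the finite set Y that no other point of Y dominates."""
    return [z for z in Y if not any(dominates(y, z) for y in Y)]

# Hypothetical processes: (output, input), with the input entered as a negative number.
Y = [(1.0, -2.0), (2.0, -2.0), (2.0, -3.0), (0.0, 0.0), (1.0, -1.0)]
print(efficient_points(Y))
```

In this sample, (1, -2) is inefficient because (2, -2) yields more output from the same input, and (2, -3) is inefficient because (2, -2) yields the same output from less input.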
REMARK: An efficient point is often defined with an explicit recognition
of the resource constraint. As we discussed in the previous remark, a
general production set Y can also be regarded as the one which takes
the resource constraint into account. Thus our concept of an efficient
point can include such a case. However, we also note that our interpreta-
tion of Y is rather flexible in the sense that it also allows the case in which no
resource constraints are taken into account. Hence our concept of an ef-
ficient point is also flexible accordingly. If no resource constraints are taken
into account in Y, then some efficient points may not be attainable in the
given economy because they may be outside the range of the resource
constraints.
REMARK: In activity analysis a distinction among primary, intermediate,
and desired commodities is often made. The primary commodities are
the ones which flow into production from outside the production system;
the intermediate commodities are the ones which are produced only for
use as inputs for further production; and the desired goods are the ones
which are produced for consumption or other uses outside the production
system. However, these distinctions can be arbitrary. For example, the
same commodity can often be used either for final consumption or as
an input for further production. Hence we do not emphasize these distinc-
tions.
We now state and prove two fundamental theorems in activity analysis.

Theorem 0.C.2: Let $Y$ be a general production set in $R^n$. A point $\hat{y}$ in $Y$ is an efficient
point of $Y$ if there exists a $p > 0$, $p \in R^n$, and $\|p\| < \infty$ such that $p \cdot \hat{y} \ge p \cdot y$ for all
$y \in Y$.
PROOF: Suppose not. Then there exists $y \in Y$ such that $y \ge \hat{y}$. Thus $p \cdot y
> p \cdot \hat{y}$, since $p > 0$, which is a contradiction. (Q.E.D.)
REMARK: The above theorem specifies nothing with respect to the postulates
on $Y$.
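
Theorem 0.C.2 says that maximizing $p \cdot y$ over $Y$ with strictly positive prices always lands on an efficient point, whatever the shape of $Y$. The following is a minimal sketch on a finite, nonconvex sample set; the set `Y`, the price vectors, and the function names are hypothetical.

```python
def dot(u, v):
    """Inner product of two vectors given as sequences of floats."""
    return sum(ui * vi for ui, vi in zip(u, v))

def dominates(y, z):
    """True if y >= z componentwise with strict inequality in at least one component."""
    return all(a >= b for a, b in zip(y, z)) and any(a > b for a, b in zip(y, z))

def profit_maximizer(Y, p):
    """A point of the finite set Y maximizing p . y (the first one found in case of ties)."""
    return max(Y, key=lambda y: dot(p, y))

def is_efficient(Y, z):
    """True if no point of Y dominates z."""
    return not any(dominates(y, z) for y in Y)

# Hypothetical processes: (output, input), with the input entered as a negative number.
Y = [(1.0, -2.0), (2.0, -2.0), (2.0, -3.0), (0.0, 0.0), (1.0, -1.0)]

for p in [(3.0, 1.0), (1.0, 3.0), (1.0, 1.0)]:
    best = profit_maximizer(Y, p)
    print(p, best, is_efficient(Y, best))
```

With a high output price the maximizer is (2, -2); with a high input price it is inaction, (0, 0). Both are efficient, as the theorem requires.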
In order to prove Theorem O.C.3, we need the following lemma.

Lemma: Let $Y$ be a general production set in $R^n$. If $\hat{y}$ is an efficient point of $Y$,
then $(Y - \hat{y}) \cap \Omega^0 = \emptyset$, where $\Omega^0$ is the positive orthant of $R^n$ (that is, the interior
of the nonnegative orthant of $R^n$, $\Omega$).

PROOF: Suppose not. Then there exists a $z > 0$ (that is, $z \in \Omega^0$) such that
$z \in (Y - \hat{y})$. Hence $z + \hat{y} \in Y$. Thus $\hat{y}$ is not an efficient point of $Y$, which
is a contradiction. (Q.E.D.)

Theorem 0.C.3: Let $Y$ be a convex production set in $R^n$. If $\hat{y}$ is an efficient point
of $Y$, then there exists $p \ge 0$, $p \in R^n$, $\|p\| < \infty$ such that $p \cdot \hat{y} \ge p \cdot y$ for all $y \in Y$.
PROOF: Let $X \equiv Y - \hat{y}$. Then $X$ is convex as it is a linear sum of two convex
sets. (See Theorem 0.A.5.) Since $\hat{y}$ is an efficient point of $Y$, we have
$X \cap \Omega^0 = \emptyset$ by the previous lemma. Thus we have two disjoint nonempty
convex sets. Hence, from the Minkowski separation theorem (Theorem
0.B.3), there exists a $p \in R^n$, $p \neq 0$, $\|p\| < \infty$, and $\alpha \in R$ such that
$p \cdot z \ge \alpha$ for all $z \in \Omega^0$
and
$p \cdot x \le \alpha$ for all $x \in X$
Note that $0 \in X$, for $\hat{y} \in Y$. Hence $\alpha \ge 0$. Then we have $p \ge 0$. For if not, there
exists an element of $p$ (say, $p_i$) which is negative, since $p \neq 0$. By choosing
the corresponding element of $z$ (say, $z_i$) large enough in $\Omega^0$, we can have
$p \cdot z < 0$, which is a contradiction. Also $\alpha \le 0$, for if not, then $p \cdot z \ge \alpha > 0$
for all $z \in \Omega^0$. This is impossible, for by choosing $\|z\|$ small enough we can
have $p \cdot z < \alpha$. Since $\alpha \ge 0$ and $\alpha \le 0$, we have $\alpha = 0$. Therefore $p \cdot x \le 0$
for all $x \in X$ with $p \ge 0$, or $p \cdot (y - \hat{y}) \le 0$; that is, $p \cdot \hat{y} \ge p \cdot y$ for all $y \in Y$ with
$p \ge 0$. (Q.E.D.)
REMARK: Because of the homogeneity of relation p y >_ p y, we may
choose p so that 2:7_ 1pi = 1.
REMARK: If Y is a convex polyhedral cone in the above theorem, then it can
be shown that we can choose p > 0.'

REMARK: By Theorem O.C.3, the concept of "efficient point" is now

characterized by profit maximization; that is, maximization of p . y with
respect to y over Y. The existence of a solution for this maximization
problem is guaranteed if, for example, Y is a compact set (the Weierstrass
theorem), since the inner product is a continuous function.
REMARK: Suppose that $Y$ can be characterized by linear inequalities such
as:

$\sum_{j=1}^{m} a_{ij}\lambda_j \leq r_i$, $i = 1, \ldots, n$, $\lambda_j \geq 0$, $j = 1, 2, \ldots, m$

Then the problem of finding $\lambda = (\lambda_1, \ldots, \lambda_m)$ which maximizes $p \cdot y$, where
$y_i = \sum_{j=1}^{m} a_{ij}\lambda_j$, subject to the above constraints is a typical linear
programming problem, of which the computational method is well known and widely
used in practice (the "simplex method"). Hence activity analysis also has
practical and computational significance.
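A small numerical sketch may make the linear programming remark concrete. It assumes the constraints take the form $\sum_j a_{ij}\lambda_j \leq r_i$, $\lambda_j \geq 0$ (the direction of the inequality is an assumption here), and it uses hypothetical data with two activities and two commodities. The method is plain vertex enumeration, not the simplex method; it relies on the fact that a linear function on a bounded polyhedron attains its maximum at a vertex.

```python
from itertools import combinations

# Hypothetical data: a[i][j] = amount of commodity i per unit level of activity j.
a = [[1.0, 2.0],
     [3.0, 1.0]]
r = [4.0, 6.0]   # assumed bounds: sum_j a[i][j] * lam[j] <= r[i]
p = [1.0, 1.0]   # hypothetical price vector

# Objective p . y with y_i = sum_j a[i][j] * lam[j], rewritten as c . lam.
c = [sum(p[i] * a[i][j] for i in range(2)) for j in range(2)]

# Every constraint in the form row . lam <= rhs, including lam_j >= 0.
rows = [(a[0], r[0]), (a[1], r[1]), ([-1.0, 0.0], 0.0), ([0.0, -1.0], 0.0)]


def intersect(c1, c2):
    """Intersection of two constraint boundaries (Cramer's rule), or None."""
    (a1, b1), (a2, b2) = c1, c2
    det = a1[0] * a2[1] - a1[1] * a2[0]
    if abs(det) < 1e-12:
        return None
    return ((b1 * a2[1] - a1[1] * b2) / det,
            (a1[0] * b2 - b1 * a2[0]) / det)


def feasible(lam, tol=1e-9):
    """True if lam satisfies every constraint up to a small tolerance."""
    return all(row[0] * lam[0] + row[1] * lam[1] <= rhs + tol
               for row, rhs in rows)


# Candidate optima are the feasible intersections of pairs of boundaries.
vertices = []
for c1, c2 in combinations(rows, 2):
    v = intersect(c1, c2)
    if v is not None and feasible(v):
        vertices.append(v)

best = max(vertices, key=lambda lam: c[0] * lam[0] + c[1] * lam[1])
best_value = c[0] * best[0] + c[1] * best[1]
print(best, best_value)
```

Vertex enumeration is only practical for very small problems; the simplex method mentioned in the text moves between adjacent vertices instead of listing them all.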
REMARK: It is important to realize the basic features of the neoclassical
"smooth" production function approach in terms of activity analysis
terminology. These are essentially the following. (1) It deals with a produc-
tion set which cannot be generated from a finite number of activities (that
is, it is not a convex polyhedral cone); rather a continuum of vectors is
required to characterize the set. (2) The "efficient manager" is presupposed,
so that production always takes place at an efficient point, that is, on the
set of efficient points (called "production frontier"), which is nothing but
the set defined by the production function. (3) This set of efficient points
constitutes a differentiable function.

FOOTNOTES

1. "Services" and "factors of production" as well as ordinary "goods" are com-

modities.
2. Clearly $y_i$ may be 0 for some $i$; $y_i = 0$ means that the $i$th commodity is used neither
as an input nor as an output for the process $y$; that is, it is not involved in the
production process $y$.
3. Given two vectors $x, y \in R^n$, the notation $x \geq y$ means that every element of $x$
is greater than or equal to the corresponding element of $y$, with strict inequality for
at least one element; that is, $x_i \geq y_i$ for all $i = 1, 2, \ldots, n$, and $x_i > y_i$ for at
least one $i$. This should be distinguished from the notation $x \geqq y$, which requires
only the first of the above conditions, that is, $x_i \geq y_i$ for all $i$. The notation
$x > y$ is used to mean $x_i > y_i$ for all $i = 1, 2, \ldots, n$. The symbol $\Omega$ denotes the
nonnegative orthant of $R^n$.
4. The sketch of the proof is as follows. By hypothesis, $Y$ does not contain any $y \geq 0$.
Let $Y^* \equiv \{p: p \cdot y \leq 0 \text{ for all } y \in Y\}$ (called the negative polar cone of $Y$). We can
then show that there exists $p \in Y^*$ such that $p > 0$, which completes the proof.
[Suppose not. Then $Y^*$ does not contain any interior points of $\Omega$, that is, any points
of the positive orthant $\Omega^\circ$ of $R^n$. Let $M \equiv \Omega^\circ - Y^*$; then we can show that $M$ does not
contain the origin. Moreover, $M$ is convex because both $\Omega^\circ$ and $Y^*$ are convex. Thus
we can have a hyperplane passing through the origin and bounding for $M$, as a result
of the separation theorem (recall Theorems 0.B.1 and 0.B.2). Thus there exists
an $a \in R^n$, $a \geq 0$, such that $a \cdot p \leq 0$ for all $p \in Y^*$. From this we can show $a \in Y$
with $a \geq 0$, contradicting the hypothesis.]
5. The notation $Y \cap \Omega \subset \{0\}$ means that the intersection of $Y$ with $\Omega$ contains at most
0, the origin. That is, $Y \cap \Omega$ can be an empty set, as in the case in which $0 \notin Y$. If $Y$
is a convex polyhedral cone, then $0 \in Y$; hence (iv) is replaced by $Y \cap \Omega = \{0\}$
as in (A-4).
6. The notation $Y - \hat{y}$ denotes $Y - \{\hat{y}\} \equiv \{z: z = y - \hat{y},\ y \in Y\}$.
7. The proof is not too difficult.

REFERENCES
1. Afriat, S., "Economic Transformation," Krannert Institute Paper, Purdue University,
no. 152, November 1966.
2. Baumol, W. J., "Activity Analysis in One Lesson," American Economic Review,
LXVIII, December 1958.
3. Debreu, G., Theory of Value, New York, Wiley, 1959, esp. chap. 2 and 3.
4. Dorfman, R., Samuelson, P. A., and Solow, R. M., Linear Programming and
Economic Analysis, New York, McGraw-Hill, 1958.
5. Hicks, J. R., Value and Capital, 2nd ed., London, Oxford University Press, 1946.
6. Hicks, J. R., "Linear Theory," Economic Journal, LXV, December 1960.
7. Koopmans, T. C., ed., Activity Analysis of Production and Allocation, New York,
Wiley, 1951, esp. chap. III (by Koopmans).
8. Koopmans, T. C., Three Essays on the State of Economic Science, New York,
McGraw-Hill, 1957, esp. pp. 66-104.
9. Koopmans, T. C., and Bausch, A. F., "Selected Topics in Economics Involving
Mathematical Reasoning," SIAM Review, 1, July 1959, esp. Topic 3.
10. Malinvaud, E., "Capital Accumulation and Efficient Allocation of Resources,"
Econometrica, 21, April 1953; also "Corrigendum," Econometrica, 30, July 1962.
11. von Neumann, J., "A Model of General Economic Equilibrium," Review of Economic
Studies, XIII, no. 1, 1945-1946. Translation from German original published in
Ergebnisse eines mathematischen Kolloquiums, no. 8, 1937.
12. Wicksell, K., Lectures on Political Economy, London, Routledge & Kegan Paul,
1936 (Swedish original, 3rd ed., 1928).
1
DEVELOPMENTS OF NONLINEAR PROGRAMMING

Section A
INTRODUCTION

In many economic problems, as well as problems in science and engineering,
it is often necessary to maximize or minimize a certain real-valued function, say
$f(x)$, where $x \in X$, a subset of $R^n$, subject to certain constraints, say
$g_1(x) \geq 0$, $g_2(x) \geq 0$, ..., $g_m(x) \geq 0$
where each $g_j$ is a real-valued function.1 For example, in the theory of consumer
choice $f(x)$ is a utility function for an individual and $x$ is his n-commodity
consumption vector. If his income is given by $M$ when the price vector $p$ prevails, a
"competitive" consumer, being unable to affect the level of $p$, maximizes his
satisfaction $f(x)$ subject to the constraints $M - p \cdot x \geq 0$, $x \geqq 0$.
When the functions $f, g_1, g_2, \ldots, g_m$ are all linear functions, except for
constant differences, the problem is known as the linear programming problem.
For example, a country may wish to maximize the value of her national output
(n-vector $x$), which can be measured by $p \cdot x$ (a linear function) if the price vector $p$
prevails. Let $a_{ij}$ be the amount of the $j$th resource necessary to produce one unit
of the $i$th commodity and let $r_j$ be the amount of the $j$th resource available in
this country. Then the constraints of the problem are the linear functions,
$r_j - a_j \cdot x \geq 0$, $j = 1, 2, \ldots, m$ and $x \geqq 0$
where $a_j$ represents the vector whose $i$th component is $a_{ij}$. In this formulation
we implicitly assumed that only one production process is available for each
commodity, but if one wishes one can easily introduce as many processes as one
likes for each industry. The problem remains a linear programming problem.
Linear programming, first formulated by the Russian mathematician
Kantorovich but developed chiefly in the United States by G. Dantzig and others,
is applicable to many different problems.2 The activity analysis of production, as
sketched in the preceding chapter, was no doubt influenced by linear programming.
There is perhaps little doubt that interest in linear programming was
prompted, at least in its earlier stage, by the invention of the computational
algorithm known as the "simplex method."
The development of linear programming also prompted the study of the
optimization problem when the $f$ and $g_j$'s are not necessarily linear. We now see
that the theoretical apparatus3 which was developed in connection with linear
programming can be used in or extended to the problem of nonlinear
programming.4 The simplex method also encouraged study of algorithms for certain
nonlinear (notably quadratic) programming problems. In this study we are not
concerned with algorithms as such; rather we are concerned with the theoretical
structure of the nonlinear programming problem and with its connection to
modern economic theory. The crucial paper in the development of nonlinear
programming is Kuhn and Tucker [3].5
We may now explain the problem we are going to deal with in this chapter.
A functional relation $g_j(x) = 0$, whether it is linear or nonlinear, may define a
surface (or curve) on x-space (or plane). Suppose that it divides the space $R^n$
into two regions, the region where $g_j(x) \geq 0$ and the region where $g_j(x) \leq 0$. The
surface of $g_j(x) = 0$ serves as the common boundary of these two regions. In
Figure 1.1, $g_1(x) \equiv 1 - x_1 - x_2 = 0$ defines a straight line on the x-plane and
divides $R^2$ into two regions. In the region which contains the origin (the shaded
region), we can easily show that $1 - x_1 - x_2 \geq 0$, and in the other region we have
$1 - x_1 - x_2 \leq 0$.
In Figure 1.2, the shaded region satisfies the four functional relations,
$g_1(x) \geq 0$, $g_2(x) \geq 0$, $g_3(x) \geq 0$, and $x \geqq 0$ on $R^2$. It may happen that a set of
functional relations is inconsistent in the sense that it does not allow any $x$ which
satisfies all the relations. Mathematically, we may express this by saying that the set
$C \equiv \{x: x \in X,\ g_j(x) \geq 0,\ j = 1, 2, \ldots, m\}$ is empty. For example,
$g_1(x) \equiv 1 - x_1 - x_2 \geq 0$ and $g_2(x) \equiv x_1 + x_2 - 2 \geq 0$ do not allow any $(x_1, x_2)$ which
satisfies these two relations simultaneously, even if $X = R^2$, the entire space.
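The inconsistency in this example can be checked directly. Reading the two relations as $g_1(x) = 1 - x_1 - x_2$ and $g_2(x) = x_1 + x_2 - 2$ (this reading of the garbled formula is an assumption), their sum is identically $-1$, so the two cannot be nonnegative at the same point. The short sketch below confirms this on a hypothetical grid of sample points.

```python
def g1(x1, x2):
    return 1 - x1 - x2


def g2(x1, x2):
    return x1 + x2 - 2


# Hypothetical sample grid on [-5, 5] x [-5, 5] with step 0.25.
grid = [i / 4 for i in range(-20, 21)]

# Points at which both constraints hold; the list should stay empty.
feasible_points = [(x1, x2) for x1 in grid for x2 in grid
                   if g1(x1, x2) >= 0 and g2(x1, x2) >= 0]

# The algebraic reason: g1 + g2 = -1 at every point.
sums = {g1(x1, x2) + g2(x1, x2) for x1 in grid for x2 in grid}

print(len(feasible_points), sums)
```

A grid search cannot prove emptiness by itself; the identity $g_1 + g_2 = -1$ is what settles the matter.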

Figure 1.1. The Division of the Space.

Figure 1.2. The Constraint Set.

Hence the nonlinear programming problem of maximizing $f(x)$ subject to
$g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, can be considered as one of choosing the $x$ which
maximizes $f(x)$ over the nonempty set $C$. The set $C$ is called the constraint set or
the feasible set.6 The following two diagrams (Case a and Case b of Figure 1.3)
illustrate the problem. Case a is concerned with the case in which $f(x)$ and the
$g_j(x)$'s are all linear or linear affine (the problem of linear programming), and
Case b is concerned with the case in which these functions are nonlinear. The
optimum point7 is denoted by $z$, where in both cases $X = R^2$.

Figure 1.3. The Optimal Point. (Case a: A linear case. Case b: A nonlinear case.)

Note that in both cases in Figure 1.3 the optimum point $z$ is strictly inside
the region of the constraints $g_3(x) \geq 0$ and $g_4(x) \geq 0$. We say that these constraints
are inactive (or ineffective) at $z$. The constraints $g_1(x) \geq 0$ and $g_2(x) \geq 0$ are active
(or effective) at $z$ in both Case a and Case b of Figure 1.3 in the sense that
$g_1(z) = 0$ and $g_2(z) = 0$. Note also that the boundary surfaces of the constraint
sets (the shaded areas of Case a and Case b of Figure 1.3) do not allow derivatives
at $z$ (that is, the boundary curve is not "differentiable" at $z$). In economics, one
of the constraints is often the nonnegativity constraint, such as $x \geqq 0$, since
economic variables such as price and output are usually nonnegative. And, if $z$ is a
solution of the problem, we typically have a situation such that $z > 0$; that is, the
constraint $x \geqq 0$ is ineffective at $z$.
In the classical maximization problem due to Lagrange and Euler, we are
concerned with the case in which all the constraints are always effective [that is,
the problem is the one of maximizing $f(x)$ subject to $g_j(x) = 0$, $j = 1, 2, \ldots, m$].
Although this form of constraint is often very inconvenient in dealing with
economic problems, it is also true that there are situations in which we know that
the constraints are always effective. For example, one of the constraints may be a
definitional equation, which is not an inequality. Can we then handle the problem
with equality constraints within the above formulation of the nonlinear
programming problem? The answer is simply yes, for we can rewrite an equality constraint
$g_j(x) = 0$
as

$g_j(x) \geq 0$ and $-g_j(x) \geq 0$
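The rewriting of an equality as a pair of inequalities can be sketched in a few lines. The function `g` below stands for a hypothetical definitional equation $x_1 + x_2 = 1$, and `satisfies_equality` is an illustrative helper, not something taken from the text.

```python
def satisfies_equality(g, x):
    """Test g(x) = 0 through the two inequalities g(x) >= 0 and -g(x) >= 0."""
    return g(x) >= 0 and -g(x) >= 0


def g(x):
    """Hypothetical definitional equation x1 + x2 - 1 = 0."""
    return x[0] + x[1] - 1


print(satisfies_equality(g, (0.25, 0.75)))  # point on the line x1 + x2 = 1
print(satisfies_equality(g, (0.5, 0.75)))   # point off the line
```

In numerical work one would normally allow a small tolerance in both inequalities; the exact comparison is used here only because the sample values are exactly representable.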

The solution $z$ as defined above maximizes $f(x)$ subject to $g_j(x) \geq 0$, $j = 1,
2, \ldots, m$, over the entire domain of $f$, that is, $X$. In this sense, it is often said that
$z$ achieves a global maximum. In the traditional Lagrange-Euler formulation, we
are also concerned with a local maximum. The point $z$ is said to achieve a local
maximum of $f(x)$ subject to $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, and $x \in X$, if there exists
an open ball $B$ about $z$ (which may be very small) intersecting with the constraint
set $C$ (that is, $B \cap C \neq \emptyset$) such that $f(z) \geq f(x)$ subject to $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$,
and $x \in X \cap B$ (that is, $f(z) \geq f(x)$ for all $x$ in $B \cap C$). We will introduce these
concepts in a proper context. We may note that the concept of local maximum
or minimum, being concerned with a (small) neighborhood, is closely associated
with calculus. When we treat the problem by means other than calculus, it is not
necessarily the case that we restrict ourselves to the local concept of optimization.
Moreover, the calculus approach obviously presupposes the differentiability of
certain functions. This is often considered too restrictive.
Given the problem of maximizing (or minimizing) a certain real-valued
function $f(x)$, where $x$ is an n-vector, subject to the constraints of $m$ real-valued
functions $g_1(x) \geq 0$, $g_2(x) \geq 0$, ..., $g_m(x) \geq 0$, and $x \in X$, there is no guarantee
that a solution $z$ exists for this problem. First, as remarked above, the set
$C \equiv \{x: x \in X,\ g_j(x) \geq 0,\ j = 1, 2, \ldots, m\}$ may be empty, in which case it is clear
that the solution for this problem does not exist. Even if the constraint set is
nonempty, we may still have a situation in which the solution does not exist.
For example, consider the problem of maximizing $f(x) = x$, $x \in R$, over the
constraint set $C = (0, 1)$, the unit open interval on $R$. Again the solution does not
exist. What, then, are the conditions which guarantee the existence of a solution
of the above problem? One powerful condition is obtained by utilizing the
Weierstrass theorem, which asserts the existence of a maximum (and minimum) of a
continuous function over a compact set (Theorem 0.A.18). In regard to our
present problem, this means that if the maximand function is continuous and the
constraint set $C$ is compact, then, from the Weierstrass theorem, we can assert
that there exists a solution for the present nonlinear programming problem.
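The open-interval example can be made tangible with a short sketch. For $f(x) = x$ on $(0, 1)$, any feasible point can be improved upon by moving halfway toward 1, so no feasible point is a maximum; on a sample of the closed interval $[0, 1]$ the maximum is attained at the endpoint. The helper `better_point` and the sample grid are illustrative assumptions.

```python
def f(x):
    return x


def better_point(x):
    """Given x in (0, 1), return a point of (0, 1) with a larger value of f."""
    return (x + 1.0) / 2.0


# On the open interval (0, 1): every feasible point can be strictly improved.
x = 0.5
values = [f(x)]
for _ in range(20):
    x = better_point(x)
    values.append(f(x))

# On a finite sample of the compact interval [0, 1] the endpoint 1 is attained.
closed_grid = [k / 100 for k in range(101)]
x_max = max(closed_grid, key=f)

print(values[-1], x_max)
```

The improving sequence approaches 1 without reaching it, which is the failure of compactness; the closed interval contains its endpoint, so the supremum is attained there.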
In the above formulation of the nonlinear programming problem we
confined ourselves to the real space $R^n$ or some subset $X$. However, we can extend the
analysis to the case in which $X$ is a subset of some linear space, say $L$, with
a certain topological structure defined so that we can talk about such things as
continuous functions. In particular, we may be concerned with the problem of
choosing a certain function $x(t)$ from a certain set of functions $X$. The set of
functions often constitutes a linear space so that we may be able to regard $X$ as
a subset of a certain linear space. For example, consider a consumer who wants
to maximize the sum of the utility stream $u[x(t)]$ attained by the consumption
stream $x(t)$ over his lifetime. Suppose that he knows his income at time $t$, $y(t)$, and
the commodity-price vector at time $t$, $p(t)$, for all $t$ in the span of his lifetime.
Let $\rho$ and $r$ respectively denote his subjective discount rate and the market rate
of interest, both of which are assumed to be positive constants. Assume that this
consumer is "small" enough so that his choice of $x(t)$ does not affect the $p(t)$
and $r$ that prevail in the market. Then the problem may be formulated as follows:

Maximize: $I[x(t)] \equiv \int_0^T u[x(t)]\, e^{-\rho t}\, dt$

Subject to: $g[x(t)] \equiv \int_0^T p(t) \cdot x(t)\, e^{-rt}\, dt \leq M$

and $x(t) \geqq 0$, $t \in [0, T]$, where $M \equiv \int_0^T y(t)\, e^{-rt}\, dt$

This integral constraint contains the assumption that the consumer can borrow or
lend any amount at the fixed rate of interest $r$. In any case, this is a problem of
choosing a function $x(t)$ from a set of (say, continuous) functions, $X$, defined over
the interval $[0, T]$ such as to maximize a real-valued function $I[x(t)]$ subject to
the constraints $g[x(t)] \leq M$ and $x(t) \geqq 0$. This is clearly similar, at least formally,
to the problem discussed above. In fact, there has been a considerable effort
recently to consider this kind of problem as a natural extension of nonlinear
programming theory to problems in linear spaces (not necessarily $R^n$).9 However,
in this chapter we restrict ourselves to $R^n$ or its subset $X$, as formulated above.
In this way we can treat the theory in a much simpler manner. We discuss the
question of programming in a linear space later in the book when we discuss
such topics as the calculus of variations and optimal control theory.10

Now let us come back to the usual nonlinear programming problem, that is,
the one of maximizing $f(x)$ subject to $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, and $x \in X$, where
$X$ is a subset of $R^n$, or in other words, maximizing $f(x)$ subject to
$x \in C \equiv \{x: x \in X,\ g_j(x) \geq 0,\ j = 1, 2, \ldots, m\}$. The following questions are the natural
questions involved in any nonlinear programming problem.

QUESTION 1: Is the set C nonempty; that is, does there exist a feasible
point?
QUESTION 2: Does there exist a solution $z$, a point which maximizes $f(x)$
subject to $x \in C$?
QUESTION 3: What are the characteristics of this optimum $z$?
QUESTION 4: Is the solution $z$ unique, or is there any other solution besides
$z$?

QUESTION 5: What is the algorithm to find all the solutions?

Because this is a book dealing primarily with economic theory, we are not
interested in Question 5. Much work is being done on the problem of finding
algorithms, but no definite methods have yet been found, except in some special
cases.11 Readers are referred to articles in professional journals.
Owing to our interest in theory, we pay greatest attention here to Question
3,12 discussing, in particular, the "saddle-point characterization" and the "quasi-
saddle-point characterization." Question 3 should also be of central concern to
those interested in algorithms.
In Section B, we discuss the problem posed when the functions $f$ and $g_j$ belong
to a certain class of functions, called "concave" (or "convex") functions. We do
not assume differentiability of these functions, and global results are obtained.
The central characterization of optimality is that of a "saddle point." In Section
C we remind the reader of certain basic facts, such as the definition of differentia-
bility and the unconstrained maximum problem. This section will prepare the
way for Section D, in which we study problems where we can assume the dif-
ferentiability of f and the gj's. The basic characterization for optimality under
such an assumption is well known and is called the first-order condition (or "quasi-
saddle-point condition"). The central theorems are Kuhn and Tucker's main
theorem and the Arrow-Hurwicz-Uzawa theorem. In the Appendix to Section D,
we sketch the proof of the Arrow-Hurwicz-Uzawa theorem. In Section E, we
extend the nonlinear programming theory established thus far. In particular, we
discuss (1) quasi-concave programming, (2) the vector maximum problem, and
(3) the characterization of concave (or quasi-concave) functions and the second-
order conditions. In Section F we illustrate economic applications of the theory
established in this chapter.13 In the Appendix to Section F, we summarize the
classical theory of optimization and its standard applications to comparative
statics analysis. The reader may find this appendix a useful review.

FOOTNOTES

1. The maximization problem is equivalent to the minimization problem, for one can
easily convert one to the other. For example, if this problem is taken to be one of
minimizing $f(x)$, subject to a certain set of constraints, it can be converted to one
of maximizing $[-f(x)]$, subject to the same set of constraints.
2. To name just a few, we have the transportation problem, the production scheduling
problem, the diet problem, the gasoline mixing problem, and the allocation problem.
3. For the theoretical apparatus developed in linear programming and its applications to
economic theory, see, for example, H. W. Kuhn and A. W. Tucker, eds., Linear
Inequalities and Related Systems, Princeton University Press, 1956.
4. The term nonlinear programming is a little confusing. It customarily includes linear
programming as a special case.
5. Linear programming aroused interest in constraints in the form of inequalities
and in the theory of linear inequalities and convex sets. The Kuhn-Tucker study
[3] appeared in the middle of this interest with a full recognition of such
developments. However, the theory of nonlinear programming when the constraints are
all in the form of equalities has been well known for a long time (in fact, since Euler
and Lagrange). The inequality constraints were treated in a fairly satisfactory manner
already in 1939 by Karush [2]. Karush's work was apparently influenced by a
similar work in the calculus of variations by Valentine. Unfortunately, Karush's work
has been more or less ignored.
6. The function f(x) is called the maximand function or the objective function.
7. The point z is also called a maximum point, a solution, an optimal solution, and an
optimal program.
8. See Section D of this chapter, for example.
9. For a pioneering work in this direction, see Hurwicz [1]. Programming in linear
spaces would presumably include such topics as the calculus of variations and optimal
control theory. The reverse approach-that is, treating the usual nonlinear pro-
gramming as a special case of optimal control theory-is also possible. This has
been recently investigated, especially after the interest aroused in the variational
approach by Pontryagin et al., Hestenes, and so on. See, for example, M. Canon,
C. Cullum, and E. Polak, "Constrained Maximization Problem in Finite-Dimen-
sional Spaces," Journal of SIAM Control, vol. 4, no. 3, August 1966.
10. The dynamic optimum consumption problem as stated above has recently been
treated in a more sophisticated manner by M. El-Hodiri, M. Yaari, A. Douglas,
K. Avio, and so on, by using the calculus of variations and optimal control theory.
See, for example, K. Avio, "Age-dependent Utility in the Lifetime Allocation
Problem," Krannert Institute Paper, Purdue University, no. 260, November 1969, for
this problem and the references. See also Chapter 8, Section C. Note also that the
dynamic consumption problem can also be treated by using the usual nonlinear
programming technique.
11. This does not imply, of course, that the scope of available algorithms is very
limited. On the contrary, thanks to electronic computers we are able to handle a
sufficiently large number of practical problems.
12. Moreover, we will not treat such topics as integer programming, stochastic program-
ming, and the like, as such.
13. Although this chapter was written prior to and independently of Mangasarian [4],
the reader may benefit from reading this excellent treatise on nonlinear program-
ming along with this chapter.

REFERENCES

1. Hurwicz, L., "Programming in Linear Spaces," in Studies in Linear and Nonlinear

Programming, ed. by K. J. Arrow, L. Hurwicz, and H. Uzawa, Stanford, Calif.,
Stanford University Press, 1958.
2. Karush, W., Minima of Functions of Several Variables with Inequalities as Side
Conditions, Master's Thesis, University of Chicago, 1939.
3. Kuhn, H. W., and Tucker, A. W., "Nonlinear Programming," Proceedings of Second
Berkeley Symposium on Mathematical Statistics and Probability, ed. by J. Neyman,
Berkeley, Calif., University of California Press, 1951, pp. 481-492.
4. Mangasarian, O. L., Nonlinear Programming, New York, McGraw-Hill, 1969.

Section B
CONCAVE PROGRAMMING-
SADDLE-POINT CHARACTERIZATION

In this section we discuss one important characterization of the solution

of the nonlinear programming problem: the saddle-point characterization. A
major feature in this characterization is that we need no assumptions with regard
to the differentiability of any functions involved, and hence we do not have to
talk about differentiation here.

Definition: Let $\Phi(x, y)$ be a real-valued function defined on $X \otimes Y$ where $x \in X$
and $y \in Y$. A point $(\hat{x}, \hat{y})$ in $X \otimes Y$ is called a saddle point of $\Phi(x, y)$ if
$\Phi(x, \hat{y}) \leq \Phi(\hat{x}, \hat{y}) \leq \Phi(\hat{x}, y)$ for all $x \in X$ and $y \in Y$
REMARK: Clearly a saddle point need not exist, and even if it exists, it is
not necessarily unique. Note that, for a fixed value $\hat{y}$, $\hat{x}$ achieves the
maximum of $\Phi(x, \hat{y})$ and that, for a fixed value $\hat{x}$, $\hat{y}$ achieves the minimum of
$\Phi(\hat{x}, y)$. In other words, $\Phi(\hat{x}, \hat{y}) = \max_{x \in X} \Phi(x, \hat{y})$ and
$\Phi(\hat{x}, \hat{y}) = \min_{y \in Y} \Phi(\hat{x}, y)$. Intuitively, this could produce a picture like a horse
saddle, as illustrated in Figure 1.4. We should note, however, that there is a common
misconception that a saddle point always looks similar to such a saddle.
Nikaido [11], pp. 142-143, gave the following example of a saddle point,
which does not look like a saddle.
EXAMPLE: $\Phi(x, y) = 1 - x + y$, $0 \leq x \leq 1$, $0 \leq y \leq 1$.
Point (0, 0) is a saddle point of $\Phi(x, y)$. As can be seen from Figure 1.5, $\Phi$ does
not look like a saddle.
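The example can be verified numerically. The sketch below takes the function to be $1 - x + y$ on the unit square and tests the two saddle-point inequalities on a hypothetical grid of sample points; `is_saddle_point` is an illustrative helper, not notation from the text.

```python
def phi(x, y):
    """The example function 1 - x + y on the unit square."""
    return 1 - x + y


# Hypothetical sample of [0, 1] used for both x and y.
grid = [i / 10 for i in range(11)]


def is_saddle_point(x_hat, y_hat):
    """Check phi(x, y_hat) <= phi(x_hat, y_hat) <= phi(x_hat, y) on the grid."""
    center = phi(x_hat, y_hat)
    return (all(phi(x, y_hat) <= center for x in grid)
            and all(center <= phi(x_hat, y) for y in grid))


print(is_saddle_point(0.0, 0.0))  # the point named in the example
print(is_saddle_point(1.0, 1.0))  # another corner, for comparison
```

At (0, 0) the inequalities read $1 - x \leq 1 \leq 1 + y$, which hold on the whole square, so the grid test agrees with the exact argument.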

Figure 1.4. An Illustration of a Saddle Point.

Definition: Let $f$ be a real-valued function defined on a convex set $X$ in $R^n$. The
function $f$ is called a concave function if, for all $x, y \in X$, and $0 \leq \theta \leq 1$,
$f[\theta x + (1 - \theta)y] \geq \theta f(x) + (1 - \theta)f(y)$
If the inequalities in the above definition are strict for all $x, y \in X$ with $x \neq y$, and
$0 < \theta < 1$, that is, if $f[\theta x + (1 - \theta)y] > \theta f(x) + (1 - \theta)f(y)$, for all $\theta$, $0 < \theta < 1$,
and for all $x, y \in X$ with $x \neq y$, then $f$ is called a strictly concave function. On
the other hand, $f$ is called a convex (resp. strictly convex) function if $(-f)$ is
concave (resp. strictly concave).
REMARK: A (strictly) concave function is illustrated in Figure 1.6 (where
$X = R$). Needless to say, every strictly concave function is concave (and
every strictly convex function is convex).
REMARK: Intuitively, $f$ is a concave function if the chord joining any two
points on the function lies on or below the function. The set $X$ must be
convex; otherwise $\theta x + (1 - \theta)y$ may not be in $X$, so that the LHS of the
inequality may be meaningless. Henceforth $X$ is automatically taken to be a
convex set if it is a domain of a concave function.

Figure 1.5. Nonsaddle-like Saddle Point.

Figure 1.6. An Illustration of a Concave Function.

REMARK: If, in particular, $f$ can be written as $f(x) = a \cdot x + a_0$ where $a$ is a
constant vector in $R^n$ and $a_0$ is a constant real number,1 then $f$ is both concave
and convex, but neither strictly concave nor strictly convex. Clearly there are
many real-valued functions that are neither concave nor convex (hence
certainly neither strictly concave nor strictly convex).
REMARK: Note that the concept of concave (or convex) functions is a
global concept in the sense that the defining property is concerned with all
the points of the domain. Note also that the above definition still holds even if
we replace $R^n$ by any linear space.2
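The defining inequality of concavity can be tested on sample points. The sketch below is a necessary-condition check only: passing it on a finite sample does not prove concavity, while failing it does disprove concavity. The sample points, the weights, and the tolerance are hypothetical choices for the illustration.

```python
import math


def is_concave_on_samples(f, points, thetas, tol=1e-9):
    """Check f[t*x + (1-t)*y] >= t*f(x) + (1-t)*f(y) on all sampled triples."""
    return all(f(t * x + (1 - t) * y) >= t * f(x) + (1 - t) * f(y) - tol
               for x in points for y in points for t in thetas)


points = [0.1 * k for k in range(1, 51)]   # sample of the convex set (0, 5]
thetas = [0.0, 0.25, 0.5, 0.75, 1.0]       # sample of weights in [0, 1]


def affine(x):
    """An affine function a*x + a0 with hypothetical coefficients."""
    return 2.0 * x + 1.0


def neg_affine(x):
    return -affine(x)


print(is_concave_on_samples(math.sqrt, points, thetas))   # concave
print(is_concave_on_samples(math.exp, points, thetas))    # convex, not concave
print(is_concave_on_samples(affine, points, thetas),
      is_concave_on_samples(neg_affine, points, thetas))  # both hold
```

The last line reflects the remark on affine functions: both $f$ and $-f$ pass the test, so such a function is concave and convex at once, though not strictly so.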
The class of concave functions (and convex functions) probably forms the
most important class of functions in economics, for reasons which will become
clear in reading this book. We now list some of the important properties of concave
functions, which follow immediately from the definition.

Theorem 1.B.1

(i) Let $f$ be a concave function on a convex subset $X$ of $R^n$. Then $S \equiv \{x: x \in X,\ f(x) \geq 0\}$
is a convex set.
(ii) A nonnegative linear combination of concave functions is also concave. In other
words, if $f_i(x)$, $i = 1, 2, \ldots, m$, are concave functions on a convex subset $X$ of
$R^n$, then $f(x) \equiv \sum_{i=1}^{m} \alpha_i f_i(x)$, where $\alpha_i \in R$, $\alpha_i \geq 0$, $i = 1, 2, \ldots, m$, is also
a concave function on $X$.
(iii) Every concave function is continuous in the interior of the domain of the function.3

PROOF:

(i) Let $x, y \in S \subset X$. Then $f(x) \geq 0$ and $f(y) \geq 0$. Also $\theta x + (1 - \theta)y \in X$,
for $0 \leq \theta \leq 1$, since $X$ is convex. Using the concavity of $f$, we have

$f[\theta x + (1 - \theta)y] \geq \theta f(x) + (1 - \theta)f(y) \geq 0$ for $0 \leq \theta \leq 1$

Hence $[\theta x + (1 - \theta)y]$ must also be in $S$.

(ii) Let $x, y \in X$. Then $\theta x + (1 - \theta)y \in X$, for $0 \leq \theta \leq 1$, due to the convexity
of $X$. Using the concavity of the $f_i$'s, we obtain

$f[\theta x + (1 - \theta)y] = \sum_{i=1}^{m} \alpha_i f_i[\theta x + (1 - \theta)y]$

$\geq \sum_{i=1}^{m} \alpha_i [\theta f_i(x) + (1 - \theta)f_i(y)] = \theta \sum_{i=1}^{m} \alpha_i f_i(x) + (1 - \theta) \sum_{i=1}^{m} \alpha_i f_i(y)$

$= \theta f(x) + (1 - \theta)f(y)$ for $0 \leq \theta \leq 1$

Hence $f$ is also a concave function.

(iii) Proof is omitted. It is an easy exercise which follows in a straightforward
way from the definitions of continuity and concavity. For this proof, the
reader may refer to Fenchel [3], pp. 75-76; Berge [1], pp. 193-194; or
Fleming [4], pp. 26-27. (Q.E.D.)

REMARK: Let $f(x)$ be a real-valued function on a convex subset $X$ of $R^n$;
then the following can be shown:

(i) If $f$ is concave, then the set $\{x: x \in X,\ f(x) \geq \alpha\}$ (for each $\alpha \in R$) is convex in
$R^n$. [For the proof, simply observe (i) and (ii) of Theorem 1.B.1.]
(ii) The function $f$ is concave if and only if the set $\{(x, \alpha): x \in X,\ \alpha \in R,\ f(x) \geq \alpha\}$
is convex in $R^{n+1}$. [The proof follows directly from the definitions.]
(iii) The function $f$ is concave if and only if, for each integer $m \geq 1$,

$f(\theta_1 x^1 + \theta_2 x^2 + \cdots + \theta_m x^m) \geq \theta_1 f(x^1) + \theta_2 f(x^2) + \cdots + \theta_m f(x^m)$

for all $x^j \in X$, $\theta_j \in R$, $\theta_j \geq 0$, $j = 1, 2, \ldots, m$, with $\sum_{j=1}^{m} \theta_j = 1$. [For the
proof use (ii) above.]
(iv) If $f_i$, $i = 1, 2, \ldots, k$, are concave functions on $X$ which are bounded from
below, then $f(x) \equiv \inf_i f_i(x)$ is also concave. [For the proof, use (ii) above.]

The converse of (i) of the above remark is not necessarily true. A weaker
property than the concavity of $f$ will suffice to guarantee the convexity of the set (later
we discuss it as the quasi-concavity of a function). The set $\{x: x \in X,\ f(x) \geq \alpha\}$ is
often called the upper contour set (see Figure 1.7). [In the theory of consumer
demand, $f$ is a utility function and the set $\{x: x \in X,\ f(x) = \alpha\}$ is often called an
indifference curve.]

Figure 1.7. Upper Contour Set.

REMARK: Let $g_j$, $j = 1, 2, \ldots, m$, be concave functions on $X$ in $R^n$. Let
$C \equiv \{x: x \in X,\ g_j(x) \geq 0,\ j = 1, 2, \ldots, m\}$ (the constraint set). Then $C$ is a convex
set since it is an intersection of the convex sets $C_j \equiv \{x: x \in X,\ g_j(x) \geq 0\}$ (that
is, $C = \bigcap_{j=1}^{m} C_j$).
We now prove the fundamental theorem of concave functions from which

many of the important implications of concave functions can be proved.

Theorem 1.B.2 (Fundamental Theorem): Let $X$ be a convex set in $R^n$ and let $f_1, f_2,
\ldots, f_m$ be real-valued concave functions defined on $X$. If the system
$f_i(x) > 0$, $i = 1, 2, \ldots, m$
admits no solution $x$ in $X$, then there exist coefficients $p_1, p_2, \ldots, p_m$, all $p_i \geq 0$
(not vanishing simultaneously) and $p_i \in R$, $i = 1, 2, \ldots, m$, such that

$\sum_{i=1}^{m} p_i f_i(x) \leq 0$ for all $x \in X$

If we wish, we may choose $p_i$ such that $\sum_{i=1}^{m} p_i = 1$.

PROOF: Given a point x in X, define a set Z, by

ZX tV ZZ, . ., Zi, . . ., 27771 E . ZI <f(X), =

R117.

1, 2, . ., m}
.

Then consider a set Z defined by

Z U Zc
SEX

Set $Z$ is illustrated by Figure 1.8. By assumption, $Z$ does not contain the
origin [$\because$ if it does, $0 \in Z_x$ for some $Z_x$, or $0 < f_i(x)$ for all $i$ and for some $x$,
which is a contradiction]. Also $Z$ is convex, since if $z \in Z_x$ and $z' \in Z_{x'}$,
and $\theta \in R$ such that $0 \leq \theta \leq 1$, then

$$\theta z_i + (1 - \theta) z_i' < \theta f_i(x) + (1 - \theta) f_i(x') \leq f_i[\theta x + (1 - \theta) x'] \quad \text{for } i = 1, 2, \ldots, m$$

and so $\theta z + (1 - \theta) z' \in Z_{\theta x + (1 - \theta) x'} \subset Z$ [$\because \theta x + (1 - \theta) x' \in X$ since $X$
is convex]. Thus we have a convex set which is disjoint from a point, that is,
the origin. Hence, owing to Theorem 0.B.2 (or Theorem 0.B.3), there exists
a $\bar{p} \neq 0$ such that

$$\bar{p} \cdot z \geq 0 \quad \text{for all } z \in Z$$

Since $z_i$ can take any absolutely large negative value, $\bar{p} \leq 0$. Write $p \equiv -\bar{p}$.
Then

$$p \cdot z \leq 0 \quad \text{for all } z \in Z$$

where $p \geq 0$. An arbitrary point $z \in Z$ can be expressed by $z_i = f_i(x) - \epsilon_i$
for some $x$ and some $\epsilon_i > 0$. By varying $x$ in $X$ and $\epsilon_i > 0$, we can obtain all
points $z \in Z$. Hence we obtain $\sum_{i=1}^{m} p_i [f_i(x) - \epsilon_i] \leq 0$ for all $x \in X$ and all
$\epsilon_i > 0$ ($i = 1, 2, \ldots, m$). In other words, $\sum_{i=1}^{m} p_i f_i(x) \leq \epsilon$ for all $x \in X$ and all
$\epsilon > 0$, where $\epsilon \equiv \sum_{i=1}^{m} p_i \epsilon_i$. Since this relation holds for all $\epsilon > 0$, we have
$\sum_{i=1}^{m} p_i f_i(x) \leq 0$ for all $x \in X$, as required. We can suppose that $\sum_{i=1}^{m} p_i = 1$
by dividing each $p_i$ by the number $p_1 + p_2 + \cdots + p_m > 0$. (Q.E.D.)

Figure 1.8. Illustration of Set Z.
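
The statement of Theorem 1.B.2 can be checked numerically on a small case. The following is a minimal sketch, assuming a hypothetical pair of affine (hence concave) functions on a grid standing in for $X = [-2, 2]$; the functions `f1`, `f2`, the grid, and the weights `p` are illustrative choices, not taken from the text.

```python
# Hypothetical illustration of Theorem 1.B.2 on a grid standing in for X = [-2, 2].
# f1 and f2 are affine, hence concave; f1(x) > 0 and f2(x) > 0 cannot hold together.

def f1(x):
    return -x          # positive only for x < 0

def f2(x):
    return x - 1.0     # positive only for x > 1

def grid(lo, hi, n):
    """Return n + 1 equally spaced points covering [lo, hi]."""
    return [lo + (hi - lo) * k / n for k in range(n + 1)]

X = grid(-2.0, 2.0, 400)

# The system f_i(x) > 0, i = 1, 2, admits no solution on the grid.
system_solvable = any(f1(x) > 0 and f2(x) > 0 for x in X)

# Candidate coefficients: nonnegative, not all zero, summing to one.
p = (0.5, 0.5)
largest_weighted_value = max(p[0] * f1(x) + p[1] * f2(x) for x in X)

print(system_solvable)
print(largest_weighted_value)
```

Here the weighted sum $p_1 f_1(x) + p_2 f_2(x)$ is nonpositive at every grid point, which is what the theorem asserts for some admissible $p$ whenever the system has no solution.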

REMARK: Theorem 1.B.2 is essentially due to K. Fan, I. Glicksberg, and
A. J. Hoffman, "Systems of Inequalities Involving Convex Functions,"
American Mathematical Society Proceedings, 8, 1957. See also Berge [1],
pp. 201-202, and Berge and Ghouila-Houri [2], pp. 62-64.
REMARK: In the case where $m > n + 1$, we can take all but $(n + 1)$ of the
numbers $p_1, p_2, \ldots, p_m$ to be zero. For the proof of this statement (which
requires "Helly's theorem"), see Berge and Ghouila-Houri [2], p. 64. For
the exposition of Helly's theorem, see, for example, [1], pp. 165-166; and
[2], p. 62. The following corollary follows immediately.

Corollary: Let $X$ be a convex set in $R^n$ and let $f_1, f_2, \ldots, f_m$ be real-valued convex
functions. Then either the system $f_i(x) < 0$, $i = 1, 2, \ldots, m$, admits a solution $x \in X$,
or there exist $p_1, p_2, \ldots, p_m$, all $\geq 0$ and not vanishing simultaneously, such that
$\sum_{i=1}^{m} p_i f_i(x) \geq 0$ for all $x \in X$.⁴ If we wish, we may choose the $p_i$'s such that
$\sum_{i=1}^{m} p_i = 1$.
REMARK: There are several important applications of this fundamental
theorem. Berge and Ghouila-Houri ([2], pp. 64-68) proved, for example,
the theorem due to Bohnenblust, Karlin, and Shapley, the von Neumann
minimax theorem, and a generalized Minkowski-Farkas lemma as applica-
tions of this theorem.
We now prove the major theorem of this section, which is again a corollary of
Theorem 1.B.2.

Theorem 1.B.3 (Kuhn-Tucker, Uzawa): Let $f, g_1, g_2, \ldots, g_m$ be real-valued
concave functions defined on a convex set $X$ in $R^n$. Suppose that $\hat{x}$ achieves a maximum
of $f(x)$ on $X$ subject to $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, $x \in X$. Then there exist coefficients
$p_0, p_1, p_2, \ldots, p_m$, all $\geq 0$, not all equal to zero, such that

$$p_0 f(x) + p \cdot g(x) \leq p_0 f(\hat{x}) \quad \text{for all } x \in X$$

where $p \equiv (p_1, p_2, \ldots, p_m)$ and $g(x)$ is the $m$-vector whose $j$th component is $g_j(x)$.
Also, $p \cdot g(\hat{x}) = 0$. If one wishes, the $p_j$ may be chosen such that $\sum_{j=0}^{m} p_j = 1$.
PROOF: By hypothesis, the system

$$g_j(x) \geq 0, \quad j = 1, 2, \ldots, m$$
$$f(x) - f(\hat{x}) > 0$$

has no solution in $X$. Hence, a fortiori, the system

$$g_j(x) > 0, \quad j = 1, 2, \ldots, m$$
$$f(x) - f(\hat{x}) > 0$$

has no solution in $X$. Thus, by the fundamental theorem, there exist coefficients
$p_0, p_1, p_2, \ldots, p_m$, all $\geq 0$, not all equal to zero, which can be chosen
with $\sum_{j=0}^{m} p_j = 1$, such that

$$p_0 [f(x) - f(\hat{x})] + \sum_{j=1}^{m} p_j g_j(x) \leq 0 \quad \text{for all } x \in X$$

or

$$p_0 f(x) + p \cdot g(x) \leq p_0 f(\hat{x}) \quad \text{for all } x \in X, \text{ where } p \equiv (p_1, p_2, \ldots, p_m)$$

To show $p \cdot g(\hat{x}) = 0$, set $x = \hat{x}$ in the above inequality. Then we obtain
$p \cdot g(\hat{x}) \leq 0$. But $p \geq 0$ and $g(\hat{x}) \geq 0$ so that $p \cdot g(\hat{x}) \geq 0$. Therefore we have
$p \cdot g(\hat{x}) = 0$. (Q.E.D.)
We now state and prove the immediate corollary of the above theorem. The
above theorem is better known in the form of the following corollary.

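Corollary:⁵ Suppose that the following additional condition is satisfied for Theorem
1.B.3.⁶
(S) There exists an $\bar{x}$ in $X$ such that $g_j(\bar{x}) > 0$, $j = 1, 2, \ldots, m$.
Then we have $p_0 > 0$. Hence under the assumptions of the theorem together with
condition (S), there exist coefficients $\hat{\lambda}_1, \hat{\lambda}_2, \ldots, \hat{\lambda}_m$, all $\geq 0$, such that
(SP) $\Phi(x, \hat{\lambda}) \leq \Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda)$, for all $x \in X$ and all $\lambda \geq 0$, where $\Phi(x, \lambda) \equiv
f(x) + \lambda \cdot g(x)$, and $\hat{\lambda} \equiv (\hat{\lambda}_1, \hat{\lambda}_2, \ldots, \hat{\lambda}_m)$.
In other words, $(\hat{x}, \hat{\lambda})$ is a saddle point of $\Phi(x, \lambda)$ on $X \otimes \Omega^m$, where $\Omega^m$ is the
nonnegative orthant of $R^m$.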
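PROOF: Suppose $p_0 \not> 0$. Then $p_0 = 0$ and $p \geq 0$ with $p \neq 0$. Hence by the above
theorem we have $p \cdot g(x) \leq 0$ for all $x \in X$; in particular, $p \cdot g(\bar{x}) \leq 0$. Since
$p \geq 0$ and $p \neq 0$, this contradicts the above condition (S), under which
$p \cdot g(\bar{x}) > 0$. Therefore, we must have $p_0 > 0$. Write $\hat{\lambda}_j \equiv p_j / p_0$, $j = 1, 2, \ldots, m$.
Then the first part of the above relation (SP) follows immediately from
the statement of the theorem. The second part of (SP) also follows
immediately from $\hat{\lambda} \cdot g(\hat{x}) = 0$, $g(\hat{x}) \geq 0$, and $\lambda \geq 0$. (Q.E.D.)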

REMARK: The above condition (S) is known as Slater's condition⁷ (see
Slater [12]). Slater's condition (S) can be replaced by the following
condition (K).
(K) For any $p \geq 0$ with $p \neq 0$, there exists an $x$ in $X$ such that $p \cdot g(x) > 0$.
This condition is known as Karlin's condition, for it is due to Karlin ([7], p. 201).
It can easily be shown that condition (S) and condition (K) are, in fact, equivalent.⁸
In any case, condition (S) or condition (K) guarantees a strictly positive $p_0$ in
the statement of Theorem 1.B.3. Such a condition is often called the normality
condition.
REMARK: In the original paper by Kuhn and Tucker [8], the need for a
normality condition such as Slater's condition is not explicitly stated. It was
hidden in a condition called the Kuhn-Tucker constraint qualification, which
we take up in Section D. The first really elegant proof of the above theorem
using Slater's condition (and without relying on the Kuhn-Tucker constraint
qualification and the differentiability of the $g_j$'s) is provided by Uzawa [13].
Uzawa's proof does not utilize the fundamental theorem (that is, Theorem
1.B.2); rather, it directly utilizes the separation theorem. The present proof
is a slight modification of the proof by Berge [1], which is essentially
similar to the one by Uzawa. We may note that our theorem is slightly more
general than Uzawa's, for we do not assume the set $X$ to be the nonnegative
orthant, $\Omega^n$. Also note that Uzawa's proof essentially amounts to re-proving
the above fundamental theorem, which is originally due to Fan, Glicksberg,
and Hoffman.

REMARK: If Slater's condition (S) is not satisfied, then the conclusion
of the above corollary does not necessarily follow. Consider the problem
of maximizing $f(x) = x$ on $x \in R$, subject to $g(x) \equiv -x^2 \geq 0$. Clearly,
$\hat{x} = 0$ is the solution of this constrained maximization problem. However,
it can be shown easily that the point $(0, \hat{\lambda})$ cannot be a saddle point of
$\Phi(x, \lambda) \equiv f(x) + \lambda g(x)$ for any nonnegative $\hat{\lambda}$. (Notice that $\partial \Phi / \partial x$ evaluated
at $\hat{x} = 0$ is positive for any value of $\lambda$.)⁹
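
The failure of the saddle-point property in this counterexample can be made concrete. The sketch below, under the assumption that a handful of sample multipliers is representative, exhibits for each candidate multiplier a point $x$ at which the Lagrangian exceeds its value at $x = 0$; the helper names `lagrangian` and `violating_point` are hypothetical.

```python
# Counterexample from the remark: maximize f(x) = x subject to g(x) = -x^2 >= 0.
# The only feasible point is x = 0, yet (0, lam_hat) is never a saddle point.

def lagrangian(x, lam):
    return x - lam * x * x

def violating_point(lam_hat):
    """Return an x with lagrangian(x, lam_hat) > lagrangian(0, lam_hat) = 0."""
    # For lam_hat > 0 the Lagrangian in x peaks at x = 1 / (2 * lam_hat);
    # for lam_hat = 0 it is simply x, so any positive x works.
    return 1.0 if lam_hat == 0 else 1.0 / (2.0 * lam_hat)

candidates = [0.0, 0.1, 1.0, 10.0, 1000.0]   # sample nonnegative multipliers
gaps = [lagrangian(violating_point(lam), lam) - lagrangian(0.0, lam)
        for lam in candidates]

print(all(gap > 0 for gap in gaps))
```

Each gap is strictly positive, so the left-hand inequality of (SP) fails at $\hat{x} = 0$ for every multiplier tried, in line with the positive derivative noted in the remark.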
REMARK: The function $\Phi(x, \lambda) \equiv f(x) + \lambda \cdot g(x)$ [or, but much less often,
$p_0 f(x) + p \cdot g(x)$] defined on $X \otimes \Omega^m$ is called the Lagrangian function
or simply the Lagrangian of the above nonlinear programming problem. The
above characterization (SP) of the solution of the problem in terms of the
saddle point of the Lagrangian function $\Phi$ is called the saddle-point
characterization. Any nonlinear programming problem with both concave objective
and constraint functions (that is, the $f$ and the $g_j$'s) is called a concave
programming problem.
We now prove the converse of the above corollary.

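Theorem 1.B.4: Let $f, g_1, g_2, \ldots, g_m$ be real-valued functions defined over $X$ in $R^n$.
If there exists a point $(\hat{x}, \hat{\lambda})$ in $X \otimes \Omega^m$ such that

$$\Phi(x, \hat{\lambda}) \leq \Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda) \quad \text{for all } x \in X \text{ and all } \lambda \in \Omega^m$$

where $\Phi(x, \lambda) \equiv f(x) + \lambda \cdot g(x)$, then

(i) The point $\hat{x}$ maximizes $f(x)$ subject to $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, and $x \in X$.
(ii) $\hat{\lambda} \cdot g(\hat{x}) = 0$.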

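PROOF: The inequality $\Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda)$ for all $\lambda \in \Omega^m$ implies that
$\hat{\lambda} \cdot g(\hat{x}) \leq \lambda \cdot g(\hat{x})$ for all $\lambda \in \Omega^m$. Thus $\lambda \cdot g(\hat{x})$ is bounded from below for
all $\lambda \in \Omega^m$, and $\Omega^m$ is a convex cone. Therefore, we have $\lambda \cdot g(\hat{x}) \geq 0$ for
all $\lambda \geq 0$. (Recall the lemma immediately preceding Theorem 0.B.4.) Thus
$g_j(\hat{x}) \geq 0$, $j = 1, 2, \ldots, m$. (Thus $\hat{x}$ satisfies the constraints.) Putting $\lambda = 0$
in the above inequality $\hat{\lambda} \cdot g(\hat{x}) \leq \lambda \cdot g(\hat{x})$, we obtain $\hat{\lambda} \cdot g(\hat{x}) \leq 0$, which
proves (ii) since $g(\hat{x}) \geq 0$ and $\hat{\lambda} \geq 0$. Now note that by assumption, $\Phi(x, \hat{\lambda}) \leq
\Phi(\hat{x}, \hat{\lambda})$ for all $x \in X$. This means $f(x) + \hat{\lambda} \cdot g(x) \leq f(\hat{x})$, since $\hat{\lambda} \cdot g(\hat{x}) = 0$;
or $f(\hat{x}) - f(x) \geq \hat{\lambda} \cdot g(x)$. [That is, $f(\hat{x}) - f(x) \geq 0$ for all $x \in X$ such that
$\hat{\lambda} \cdot g(x) \geq 0$.] In particular, $f(\hat{x}) - f(x) \geq 0$ for all $x \in X$ such that $g_j(x) \geq 0$,
$j = 1, 2, \ldots, m$. (Q.E.D.)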
REMARK: In the above theorem we do not assume the concavity of the
$f$ and the $g_j$'s, nor do we assume the convexity of $X$.
Combining the corollary of Theorem 1.B.3 with Theorem 1.B.4, we immediately
obtain the following useful theorem.

Theorem 1.B.5: Let $f, g_1, g_2, \ldots, g_m$ be concave functions defined over a convex set
$X$ in $R^n$. Suppose that Slater's condition (S) is satisfied. Then $\hat{x}$ achieves a maximum
of $f(x)$ subject to $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, if and only if there exists a $\hat{\lambda} \geq 0$ such
that $(\hat{x}, \hat{\lambda})$ achieves the saddle point of the Lagrangian $\Phi(x, \lambda)$, that is, $\Phi(x, \hat{\lambda}) \leq
\Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda)$ for all $x \in X$ and $\lambda \geq 0$.
REMARK: The above statement of the theorem presupposes the existence
of $\hat{x}$. We may also restate the theorem in the following way:
There exists a solution, $\hat{x}$, for the problem of maximizing $f(x)$ subject
to $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, if and only if there exists a saddle point $(\hat{x}, \hat{\lambda})$ for $\Phi(x, \lambda)$
such that $\Phi(x, \hat{\lambda}) \leq \Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda)$ for all $x \in X$, $\lambda \geq 0$.
REMARK: In the above characterization of the solution of a nonlinear
programming problem, the solution $\hat{x}$ is a global solution; that is, it does not
refer to any small neighborhood about $\hat{x}$. The solution $\hat{x}$ is defined for the
entire domain $X$.
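
The saddle-point characterization of Theorem 1.B.5 can be illustrated on a one-variable concave program. The following is a minimal sketch, assuming a hypothetical problem (maximize $-(x-2)^2$ subject to $1 - x \geq 0$) whose solution $\hat{x} = 1$ and multiplier $\hat{\lambda} = 2$ were worked out by hand; the finite grids only stand in for $X$ and for $\lambda \geq 0$, so the check is suggestive rather than a proof.

```python
# Hypothetical concave program (not from the text):
#   maximize f(x) = -(x - 2)^2  subject to  g(x) = 1 - x >= 0,  x in R.
# Slater's condition (S) holds, for example g(0) = 1 > 0.

def f(x):
    return -(x - 2.0) ** 2

def g(x):
    return 1.0 - x

def lagrangian(x, lam):
    return f(x) + lam * g(x)

x_hat, lam_hat = 1.0, 2.0   # candidate saddle point, found by hand

xs = [-3.0 + 6.0 * k / 600 for k in range(601)]   # grid standing in for X
lams = [10.0 * k / 200 for k in range(201)]       # grid standing in for lambda >= 0
tol = 1e-12                                       # allowance for rounding error

saddle_value = lagrangian(x_hat, lam_hat)
left_ok = all(lagrangian(x, lam_hat) <= saddle_value + tol for x in xs)
right_ok = all(saddle_value <= lagrangian(x_hat, lam) + tol for lam in lams)

# Compare with the constrained maximum found directly on the feasible grid points.
best_feasible_value = max(f(x) for x in xs if g(x) >= 0)
complementary_slackness = lam_hat * g(x_hat)

print(left_ok, right_ok, best_feasible_value, complementary_slackness)
```

Both inequalities of (SP) hold on the grids, the constrained maximum coincides with $f(\hat{x})$, and $\hat{\lambda} \cdot g(\hat{x}) = 0$, matching parts (i) and (ii) of Theorem 1.B.4.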
REMARK: In certain cases, Slater's condition (S) can be dispensed with.
For example, linear programming is a special case of concave programming
and it is known (by the "Goldman-Tucker theorem") that the above theorem
holds without (S) for the linear programming problem. We discuss this later
in Sections D and F of this chapter.
POSTSCRIPTS (FOR FURTHER READING): Here we are concerned with the
problem of finding $x \in X \subset R^n$ to maximize a real-valued function $f(x)$ on $X$
subject to $m$ real-valued function constraints $g_j(x) \geq 0$, $j = 1, \ldots, m$. Let $\Omega^m$
be the $m$-dimensional nonnegative orthant of $R^m$. Then, writing $g(x) = [g_1(x), \ldots,
g_m(x)]$, we may rephrase this problem as one of maximizing $f(x)$ subject to
$g(x) \in \Omega^m$. Now notice that $\Omega^m$ is a convex cone. Thus we can generalize the
above problem to finding $x \in X \subset R^n$ so as to
Maximize: $f(x)$
Subject to: $g(x) \in K$
where $K$ is a convex cone in $R^m$. Note that this allows equality constraints as
well as inequality constraints. For example, this reformulation allows the following
constraints:
$$g_j(x) \geq 0, \quad j = 1, 2, \ldots, m'$$
and
$$g_j(x) = 0, \quad j = m' + 1, \ldots, m$$
Besides, $g(x) \in K$ allows the extension to a general linear space in which the
meaning of the ordering $\geq$ has to be defined properly.¹⁰
Confining himself to real spaces, $R^n$, $R^m$, and so on, Moore [9] recently
examined concave programming. In order to cope with the extension of $\Omega^m$ to
a general convex cone $K$, he, following Hurwicz [6], redefined the concavity of
the $g_j$'s in terms of "K-concavity."

Definition: Let $g$ be a function of $X$ into $R^m$, where $X$ is a convex subset of $R^n$.
The function $g$ is said to be K-concave if
$$g[\theta x + (1 - \theta) x'] - [\theta g(x) + (1 - \theta) g(x')] \in K \quad \text{for all } x, x' \in X$$
and $0 \leq \theta \leq 1$, $\theta \in R$, where $K$ is a convex cone in $R^m$.
We may define K-convexity analogously. If $K$ is the nonnegative orthant of
$R$ ($m = 1$), this definition of K-concavity (or K-convexity) coincides with the usual
definition of concavity (or convexity).¹¹ Moore restricted his attention to the case
in which $g$ is K-concave.¹²
Although Moore's paper is quite general, his analysis is still confined to
Euclidian spaces. The analysis of a more general case, that is, the case of real
linear, topological spaces, is thoroughly discussed by Hurwicz [6]. We may note
that as a result of this restriction to Euclidian spaces, Moore was able to simplify
some of the proofs considerably. Hence reading his paper may serve as a guide
to the difficult paper of Hurwicz, at least for the concave programming case.
Strictly speaking, Moore's results are not all special cases of Hurwicz [6].
We record one of his main results as a theorem for those readers who are interested
in [9] (his theorem 3).

Moore's Theorem: Let $f$ be a real-valued concave function on $X$, a convex subset of $R^n$,
and let $g(x) = [g^1(x), g^2(x)]$ be a function of $X$ into $R^m$. Suppose that $\hat{x} \in X$ achieves
a maximum of $f(x)$ subject to $g^1(x) \in K_1$ and $g^2(x) \in K_2$, where $K_1$ is a convex cone
in $R^{m_1}$, $K_2$ is a convex cone in $R^{m_2}$, and $m_1 + m_2 = m$. Assume the following.

(i) The function $g^1$ is linear affine¹³ and $g^2$ is $K_2$-concave.
(ii) Interior $X \neq \emptyset$ and Interior $K_2 \neq \emptyset$ (in $R^{m_2}$).
(iii) There exists an $\bar{x} \in$ Interior $X$ such that $g^1(\bar{x}) \in K_1$.
(iv) There exists an $x \in X$ such that $g^1(x) \in K_1$ and $g^2(x) \in$ Interior $K_2$.
Then there exists a $\hat{\lambda} \in K^*$, where $K^*$ denotes the nonnegative polar cone¹⁴ of
$K \equiv K_1 \otimes K_2$, such that $(\hat{x}, \hat{\lambda})$ is a saddle point of $f(x) + \lambda \cdot g(x)$.
REMARK: The corollary of Theorem 1.B.3 is not really a special case of the
above theorem in view of the requirement that Interior $X \neq \emptyset$ in the above
theorem. This theorem was apparently conceived as a generalization of
Uzawa's theorem 3 ([13], pp. 35-37). As Moore pointed out, Uzawa's
theorem 3 is not correct in a strict sense.¹⁵

FOOTNOTES

1. In other words, f is linear affine.

2. In an arbitrary linear space the meaning of the ordering $\geq$ is not obvious. Usually it is
defined as follows: Let $X$ be a linear space and $x$ and $x'$ be points in $X$; $x \geq x'$ if
$x - x' \in K$, where $K$ is a given fixed convex cone in $X$. If $X = R^n$ and $K$ is $\Omega^n$, then
this definition coincides with the usual definition. In any case, in terms of this
definition of $\geq$ in a linear space, the concavity or convexity of functions on a
linear space can be defined analogously to the case of a real space. See Hurwicz
[6], for example.
3. It is important to note that a concave function may not be continuous at its boundary
points. For example, define $f$ on $[0, \infty)$ by $f(x) = 0$ if $x = 0$ and $f(x) = 1$ if $x > 0$. This
function is clearly concave but not continuous at $x = 0$. It is continuous on $(0, \infty)$.
4. Write $f = (f_1, f_2, \ldots, f_m)$ and $p = (p_1, p_2, \ldots, p_m)$. If there exists an $x \in X$ with
$f(x) < 0$, then clearly $p \cdot f(x) < 0$ for any $p \geq 0$, $p \neq 0$; that is, no such $p$ satisfies
"$p \cdot f(x) \geq 0$ for all $x \in X$." On the other hand, if $f(x) < 0$ admits no solution for $x \in X$,
then $-f(x) > 0$ admits no solution for $x \in X$. Hence, as a result of Theorem 1.B.2,
there exists a $p \geq 0$, $p \neq 0$, such that $p \cdot [-f(x)] \leq 0$ or $p \cdot f(x) \geq 0$ for all $x \in X$. This
proves the corollary.
5. A generalization of this corollary and Theorem 1.B.3 to the case of linear
topological spaces is accomplished by Hurwicz [6], theorem V.3.1., pp. 91-93.
6. In many economic problems, we are often concerned with the problem of choosing
$x \in R^n$ which maximizes $f(x)$ subject to $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, and $x \geq 0$. For such
a problem it can easily be shown that Slater's condition is slightly weakened so that
there exists an $\bar{x} \geq 0$ such that $g_j(\bar{x}) > 0$, $j = 1, 2, \ldots, m$. For a more general result
with linear (affine) constraints such that $A \cdot x + b \geq 0$, instead of $x \geq 0$, see Moore's
theorem later in this section.
7. The need for some requirement for constraints when all the constraints are in the
form of inequalities was first investigated in 1939 by Karush in his Master's thesis at
the University of Chicago (Minima of Functions of Several Variables with Inequalities
as Side Conditions).
8. If (S) holds, then $g(\bar{x}) > 0$ so that $p \cdot g(\bar{x}) > 0$ for $p \geq 0$, $p \neq 0$; that is, (K) holds.
Conversely, if (K) holds, then there exists no $p \geq 0$, $p \neq 0$, such that $p \cdot g(x) \leq 0$ for all $x \in X$.
Then, owing to the corollary of Theorem 1.B.2, the system $g(x) > 0$ admits a solution,
say, $\bar{x}$ in $X$; that is, (S) holds. The equivalence of these two conditions for a more
general space is provided by L. Hurwicz and H. Uzawa, "A Note on the Lagrangian
Saddle-Points," in Studies in Linear and Non-Linear Programming, Stanford, Calif.,
Stanford University Press, 1958.
9. Slater's own counterexample is the following: $f(x) = x - 1$ and $g(x) \equiv -(x - 1)^2$,
$x \in R$. The above example, due to Uzawa [13], p. 34, is obviously a slight
modification of Slater's example. In spite of this well-known counterexample, there seems to
be a confusion among economists on this point. See, for example, K. Lancaster,
Mathematical Economics, New York, Macmillan, 1968, p. 75 (the second proposition
of his "existence theorem"). See also p. 64.
10. For such a definition, recall our earlier remark in footnote 2.
11. The above definition of the K-concavity (or the K-convexity) is clearly motivated by
the definition of $\geq$ in a linear space. The reader should not confuse this concept with
that of the S-concavity (or S-convexity), which is discussed in Berge [1], and so on.
12. In Moore's analysis [9], $f$ is not restricted to a real-valued function but can be
vector-valued; that is, $f = [f_1, \ldots, f_n]$, where $f_i$, $i = 1, \ldots, n$, are real-valued.
Such a problem is called the "vector maximum problem" and we discuss it later in
Section E.
13. Thus $g^1(x)$ can be written as $g^1(x) = A \cdot x + b$, where $A$ is an $m_1 \times n$ matrix with
entries of real numbers and $b \in R^{m_1}$.
14. Let $K$ be a convex cone in $R^m$; then the nonnegative polar cone can be defined as
$K^* \equiv \{z : z \in R^m,\ y \cdot z \geq 0 \text{ for all } y \in K\}$. It is easy to see that $K^*$ is also a convex cone.
Also, if $K = \Omega^m$, then $K^* = \Omega^m$.
15. There is a counterexample to Uzawa's theorem 3, as it is stated. This is pointed out
by Moore [9], p. 61.

REFERENCES

1. Berge, C., Topological Spaces, New York, Macmillan, 1963 (French original, 1959).
2. Berge, C., and Ghouila-Houri, A., Programming, Games and Transportation Net-
works, New York, Wiley, 1965 (French original, 1962).
3. Fenchel, W., Convex Cones, Sets and Functions, Princeton, N.J., Princeton University,
1953 (hectographed).
4. Fleming, W. H., Functions of Several Variables, Reading, Mass., Addison-Wesley,
1965.
5. Hadley, G., Nonlinear and Dynamic Programming, Reading, Mass., Addison-Wesley,
1964, chap. 3.
6. Hurwicz, L., "Programming in Linear Spaces," in Studies in Linear and Non-linear
Programming, ed. by K. J. Arrow, L. Hurwicz, and H. Uzawa, Stanford, Calif.,
Stanford University Press, 1958.
7. Karlin, S., Mathematical Methods and Theory in Games, Programming, and Economics,
Vol. 1, Reading, Mass., Addison-Wesley, 1959, esp. sec. 7.1, 7.2, and appendix B.
8. Kuhn, H. W., and Tucker, A. W., "Non-linear Programming," in Proceedings of the
Second Berkeley Symposium on Mathematical Statistics and Probability, ed. by J.
Neyman, Berkeley, Calif., University of California Press, 1951.
9. Moore, J. C., "Some Extensions of the Kuhn-Tucker Results in Concave Program-
ming," in Papers in Quantitative Economics, ed. by J. P. Quirk and A. Zarley,
Lawrence, Kansas, University of Kansas Press, 1968.
10. Nikaido, H., Introduction to Sets and Mappings in Modern Economics, tr. by K. Sato,
Amsterdam, North-Holland, 1970 (Japanese original, Tokyo, 1960).
11. Nikaido, H., Linear Mathematics for Economics, Tokyo, Baifukan, 1961 (in Japanese), esp.
chap. III, sec. 4.
12. Slater, M., "Lagrange Multipliers Revisited: A Contribution to Non-linear Program-
ming," Cowles Commission Discussion Paper, Math. 403, November 1950; also RM-
676, August 1951.
13. Uzawa, H., "The Kuhn-Tucker Theorem in Concave Programming," in Studies in
Linear and Non-linear Programming, ed. by K. J. Arrow, L. Hurwicz, and H. Uzawa,
Stanford, Calif., Stanford University Press, 1958.

Section C
DIFFERENTIATION AND THE
UNCONSTRAINED MAXIMUM PROBLEM

In this section we summarize some important results, a few of which are
probably known to readers who have finished an advanced calculus course.
However, this review is necessary as a bridge to the next important section. We begin
by reminding the reader of the definition of differentiation on a real line $R$.

a. DIFFERENTIATION

Definition: A real-valued function $f$ defined on a subset $X$ of $R$ is said to be
differentiable at $x^0$, where $x^0$ is an interior point of $X$, if there exists a real number $a$
which depends on $x^0$ such that

$$\lim_{\substack{h \to 0 \\ h \neq 0}} \frac{f(x^0 + h) - f(x^0)}{h} = a$$

or, equivalently,

$$\lim_{\substack{h \to 0 \\ h \neq 0}} \frac{f(x^0 + h) - f(x^0) - ah}{h} = 0$$

We call $a$ the derivative of $f$ at $x^0$ and denote it by $f'(x^0)$. We can easily see that
$f'(x^0)$, if it exists, is unique.
REMARK: Since $x^0$ is an interior point of $X$, it is assumed that $X$ contains an
open interval $(\alpha, \beta)$ such that $x^0 \in (\alpha, \beta)$. This guarantees that $x^0 + h \in (\alpha, \beta)$
when $h$ is small enough. Hence such an $h$, if it is small enough, can be either
negative or positive. If, on the other hand, $f$ is defined on the closed interval
$[\alpha, \beta]$, then this is no longer the case. For example, if $x^0 = \alpha$, $h$, however
small it may be, cannot be negative. To deal with such a situation the
concepts of the "left-hand derivative" and "the right-hand derivative" are
formulated. In other words,

$$\lim_{\substack{h \to 0 \\ h > 0}} \frac{f(x^0 + h) - f(x^0)}{h} = a^+$$

and

$$\lim_{\substack{h \to 0 \\ h < 0}} \frac{f(x^0 + h) - f(x^0)}{h} = a^-$$

The numbers $a^+$ and $a^-$ are respectively called the right-hand derivative
and the left-hand derivative of $f$ at $x^0$, and are denoted by $f'^+(x^0)$ and
$f'^-(x^0)$ respectively. At $x = \alpha$ in $[\alpha, \beta]$, only $f'^+(\alpha)$ can be defined, and
at $x = \beta$, only $f'^-(\beta)$ can be defined; $f'(x^0)$ exists only if $f'^+(x^0)$ and $f'^-(x^0)$
both exist and are equal.
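
A quick numerical look at the one-sided difference quotients may help here. The sketch below, assuming the standard example $f(x) = |x|$ at $x^0 = 0$ (the helper name `one_sided_quotient` is hypothetical), shows that the right-hand and left-hand quotients settle on different values, so $f'(0)$ does not exist.

```python
# One-sided difference quotients of f(x) = |x| at x0 = 0.

def one_sided_quotient(f, x0, h):
    """Difference quotient [f(x0 + h) - f(x0)] / h for a nonzero step h."""
    return (f(x0 + h) - f(x0)) / h

steps = [10.0 ** (-k) for k in range(1, 8)]            # h = 0.1, 0.01, ...

right = [one_sided_quotient(abs, 0.0, h) for h in steps]    # h > 0
left = [one_sided_quotient(abs, 0.0, -h) for h in steps]    # h < 0

print(right[-1], left[-1])
```

The right-hand quotients all equal $1$ and the left-hand quotients all equal $-1$, which corresponds to $a^+ = 1 \neq a^- = -1$.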
Now suppose that $f$ is defined on $R^n$. The above definition of differentiability
needs to be modified, for $h$ must now be a vector as $x^0$ is a vector in $R^n$. Letting $\|x\|$
be the usual Euclidian norm of $x$, the following modified definition is a natural
generalization of the definition given above.

Definition: A real-valued function defined on a subset $X$ of $R^n$ is said to be
differentiable at $x^0 \in X$, where $x^0$ is an interior point of $X$, if there exists an $n$-vector $a$
which depends on $x^0$ such that

$$\lim_{\substack{h \to 0 \\ h \neq 0}} \frac{f(x^0 + h) - f(x^0) - a \cdot h}{\|h\|} = 0$$

where $h$ is, of course, an $n$-vector also. The above $a$ is denoted by $f'(x^0)$ and is called
the derivative of $f$ at $x^0$.
REMARK: In the above definition, $a \cdot h$ is called the differential of $f$ at $x^0$.
It clearly depends on $x^0$ and $h$. If $f$ is differentiable at every point in a
subset $S$ of $X$, $f$ is called differentiable in $S$. If $X$ is open and if $f$ is differentiable
in $X$, then $f$ is called a differentiable function.¹
REMARK: Also, we can show that $f'(x^0)$, if it exists, is unique.
REMARK: Note that $X$ in the above definition does not have to be an open
set. However, $x^0$ is restricted to be an interior point. Hence it is assumed
that there is an open ball about $x^0$ which is contained in $X$.
REMARK: When $X$ is a (closed) rectangular region, that is,
$$X = \{x : x \in R^n,\ a_i \leq x_i \leq b_i,\ i = 1, 2, \ldots, n\}$$
we can define the concept of the left-hand and right-hand derivatives by
analogy. For example, an $n$-vector $a^+$ is the right-hand derivative of $f$ at $x^0$, if

$$\lim_{\substack{h \to 0 \\ h \geq 0}} \frac{f(x^0 + h) - f(x^0) - a^+ \cdot h}{\|h\|} = 0$$

Clearly the above concept can be defined even if $X$ is not bounded (for
example, $\Omega^n$). It should be clear, however, that the above concept is rather
limited because the closed rectangular region is a very special kind of
domain.

REMARK: The definition of differentiation is often restated in the following
manner: "$f$ is differentiable at $x^0 \in X$, where $x^0$ is an interior point of $X \subset R^n$,
if there exists an $a \in R^n$ such that
$$f(x^0 + h) - f(x^0) = a \cdot h + o(\|h\|)$$
where $o(\|h\|)$ is an infinitesimal of higher order than $h$." [More rigorously,
$o(\|h\|)$ denotes that for any $\epsilon > 0$, there exists a $\delta > 0$ such that
$\|h\| < \delta$ implies $|o(\|h\|)| < \epsilon \|h\|$. In other words,

$$\lim_{\substack{h \to 0 \\ h \neq 0}} \frac{o(\|h\|)}{\|h\|} = 0 \, ]$$

This is often called Landau's o-symbol. The notation $r(h) = o(\|h\|)$
means that $\|r(h)\| / \|h\| \to 0$ as $h \to 0$ with $h \neq 0$. In general, $r$ is a
vector-valued function.
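
The role of the remainder $o(\|h\|)$ can be seen numerically. The following is a minimal sketch, assuming a hypothetical function $f(x_1, x_2) = x_1^2 x_2$, the point $x^0 = (1, 2)$, and its hand-computed gradient $a = (4, 1)$; it evaluates the ratio $|f(x^0 + h) - f(x^0) - a \cdot h| / \|h\|$ along steps $h = (s, s)$ of shrinking length.

```python
import math

# Hypothetical example: f(x1, x2) = x1^2 * x2 at x0 = (1, 2).

def f(x):
    x1, x2 = x
    return x1 ** 2 * x2

x0 = (1.0, 2.0)
a = (4.0, 1.0)   # gradient of f at x0, computed by hand: (2*x1*x2, x1^2)

def remainder_ratio(s):
    """Return |f(x0 + h) - f(x0) - a.h| / ||h|| for the step h = (s, s)."""
    h = (s, s)
    shifted = (x0[0] + h[0], x0[1] + h[1])
    remainder = f(shifted) - f(x0) - (a[0] * h[0] + a[1] * h[1])
    return abs(remainder) / math.hypot(h[0], h[1])

ratios = [remainder_ratio(10.0 ** (-k)) for k in range(1, 6)]
print(ratios)
```

The ratios shrink roughly in proportion to $\|h\|$, which is the behavior the definition requires of the remainder.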
REMARK: In fact, in the above definition of differentiability, it is not
necessary that $x^0$ be restricted to an interior point of $X$. In other words,
$f(x)$ can be defined to be differentiable at $x^0 \in X$, if there exists a linear
function $a \cdot (x - x^0)$ such that
$$f(x) - f(x^0) = a \cdot (x - x^0) + o(\|x - x^0\|)$$
That is, only the existence of the differential is crucial in the definition of
differentiability.
REMARK: In the above definition, although we let $\|x\|$ be the usual
Euclidian norm, there is no necessity for this. As long as we consistently use
one norm, the choice of a norm is really immaterial. In fact, in $R^n$ we can
show that if $f$ is differentiable at $x^0$ under a certain norm, then $f$ is
differentiable at $x^0$ under any other norm.²
REMARK: We may also note that it is not necessary that our space be $R^n$.
The above definition can be extended word for word to the case in which $X$ is
an arbitrary normed linear space (not necessarily $R^n$), except that $a \cdot h$ is
replaced by a linear functional $a(h)$. Then $a(h)$ is called the differential of $f$
at $x^0$. Note that $a(h)$ depends on $x^0$, as was the case above [so it is often
also denoted as $a(x^0, h)$]. This linear functional $a(h)$, defined on a normed
linear space, is also called the Fréchet differential and the $o(\|h\|)$ is called
the remainder of the differential. (See Vainberg [7], p. 40, for example.)³

Definition: Let $e^i$ be an $n$-vector with the $i$th coordinate equal to 1 and all other
coordinates equal to 0. A real-valued function $f$ on a subset $X$ of $R^n$ is said to have a
partial derivative with respect to $x_i$ at $x^0$, where $x^0$ is an interior point of $X$, if there
exists a scalar $a_i$ such that

$$\lim_{\substack{h \to 0 \\ h \neq 0}} \frac{f(x^0 + h e^i) - f(x^0) - a_i h}{h} = 0$$

where $h$ is a scalar. The scalar $a_i$ is called the partial derivative of $f$ at $x^0$ and is often
denoted by $\partial f / \partial x_i |_{x = x^0}$.
We now state the basic theorems about derivatives, the proofs of which
can be found in any book on elementary analysis or advanced calculus.

Theorem 1.C.1:

(i) If a real-valued function $f(x)$ on $X \subset R^n$ is differentiable at $x^0$, then it is continuous
at $x^0$ and has partial derivatives with respect to each of its coordinate variables
such that

$$a = (a_1, a_2, \ldots, a_n)$$

that is,

$$f'(x^0) = \left[ \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n} \right]_{x = x^0}$$

(ii) The function $f$ is differentiable at $x^0$ and its differential is continuous at $x^0$ if
and only if $f$ has continuous partial derivatives at $x^0$ with respect to each of its
coordinate variables.

REMARK: The function $f$ is said to be continuously differentiable at $x^0$ if it
is differentiable at $x^0$ and if $f'(x)$ is continuous at $x^0$.
REMARK: The vector $f'(x^0)$ is sometimes called the gradient vector of $f$ at
$x = x^0$. When notational simplicity is required, we will denote it by $f_x^0$.
REMARK: Although the differentiability of $f$ implies the continuity of $f$,⁴ the
converse is not necessarily true. (For example, $f(x) = |x|$, $x \in R$, is
continuous but not differentiable at $x = 0$.) Weierstrass constructed an example
of a continuous function on $R$ which is nowhere differentiable.⁵
REMARK: The partial derivatives $\partial f / \partial x_i$ are often called
the first-order partial derivatives. The $\partial f / \partial x_i$'s are also functions on $X$ as
$x^0$ varies over $X$. Thus the second-order partial derivatives are defined
analogously [for example, $f(x, y) = x^2 y$, $x, y \in R$, $\partial f / \partial x = 2xy$, $\partial^2 f / \partial x^2 =
2y$, $\partial^2 f / \partial y \partial x = 2x$, $\partial f / \partial y = x^2$, $\partial^2 f / \partial y^2 = 0$, $\partial^2 f / \partial x \partial y = 2x$]. The partial
derivatives of $f$ of order $q = 3, 4, \ldots$, are also defined analogously. If the
$q$th order ($q = 1, 2, \ldots$) partial derivatives of $f$ exist and are continuous
in the domain, then $f$ is said to be a function of class $C^{(q)}$. If $f$ is simply
continuous, $f$ is a function of class $C^{(0)}$; $C^{(0)}$ and $C^{(1)}$ are often denoted
respectively by $C$ and $C'$. In Theorem 1.C.1, (ii) says that if and only if
$f \in C^{(1)}$, $f$ is (continuously) differentiable; and (i) says that $f \in C^{(1)}$ implies
$f \in C^{(0)}$. As remarked above, $f \in C^{(0)}$ does not necessarily imply $f \in C^{(1)}$.
It can be shown that $f \in C^{(2)}$ implies $\partial^2 f / \partial x_i \partial x_j = \partial^2 f / \partial x_j \partial x_i$ for all $i$ and $j$
and for all points in the domain.⁶ When $f \in C^{(2)}$, $f$ is said to be twice
continuously differentiable.
In the definition of differentiation and in the above theorem, we assume that
f is a real-valued function. We can extend the concept of differentiation to the
case in which f is a vector-valued function.

Definition: Let $f$ be a function from a subset $X$ of $R^n$ into $R^m$. Then $f$ is said to
be differentiable at $x^0$, where $x^0$ is an interior point of $X$, if there exists an $m \times n$
matrix $A$ with real number entries such that
$$f(x^0 + h) - f(x^0) = A \cdot h + o(\|h\|)$$
We can show that $A$, if it exists, is unique.
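REMARK: Writing $f(x) = [f_1(x), \ldots, f_m(x)]$, we can easily show that $f(x)$
is differentiable at $x^0$ if and only if $f_i(x)$ is differentiable at $x^0$ for all $i = 1,
\ldots, m$. The above $A$ can be written in the form

$$A = \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \vdots & & \vdots \\ \dfrac{\partial f_m}{\partial x_1} & \cdots & \dfrac{\partial f_m}{\partial x_n} \end{bmatrix}_{x = x^0}$$

This matrix is sometimes called the Jacobian matrix of $f$ at $x^0$.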

REMARK: If $X$ and $Y$ are normed linear spaces (not necessarily $R^n$), we
replace $A \cdot h$ by a linear function $A(h)$ from $X$ into $Y$. The function $A(h)$
is called the differential of $f$ at $x^0$, or the Fréchet differential of $f$ at $x^0$, and
$o(\|h\|)$ is the remainder of the differential.
REMARK: A theorem analogous to Theorem 1.C.1 also holds for
vector-valued functions. For example, let $f(x) = [f_1(x), \ldots, f_m(x)]$. Then $f$ has a
continuous differential at $x^0$ if and only if the partial derivatives of $f_i$ ($i = 1,
2, \ldots, m$) exist and are continuous at $x^0$.

We now state the extension of an important theorem (called the chain rule)
in elementary calculus.

Theorem 1.C.2 (Composite Function Theorem): Let $f$ be a function from $X \subset R^n$
into $R^m$, and let $g$ be a function from $Y \subset R^m$ into $R^k$ where $f(X) \subset Y$. Suppose that
$f$ is differentiable at $x^0 \in X$ and $g$ is differentiable at $f(x^0) \in Y$; then the function
$h \equiv g \circ f : X \to R^k$ is differentiable at $x^0$, and $h'(x^0) = g'[f(x^0)] \circ f'(x^0)$.
REMARK: Note that $g'[f(x^0)]$ is the ($k \times m$) Jacobian matrix of $g$ and $f'(x^0)$
is the ($m \times n$) Jacobian matrix of $f$, so that $h'(x^0)$ can simply be expressed
as the product of these two matrices. This is the formula for the generalized
chain rule.
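
The matrix form of the chain rule can be checked on a small case. The sketch below, assuming hypothetical maps $f: R^2 \to R^2$ and $g: R^2 \to R^1$ with hand-derived Jacobians, multiplies the two Jacobian matrices and compares the product with a central-difference approximation of the Jacobian of $g \circ f$; the helper names `matmul` and `numerical_jacobian` are illustrative.

```python
# Hypothetical maps: f(x1, x2) = (x1*x2, x1 + x2^2) and g(y1, y2) = y1^2 + 3*y2.

def f(x):
    x1, x2 = x
    return [x1 * x2, x1 + x2 ** 2]

def g(y):
    y1, y2 = y
    return [y1 ** 2 + 3.0 * y2]

def jacobian_f(x):
    """(m x n) = (2 x 2) Jacobian of f, derived by hand."""
    x1, x2 = x
    return [[x2, x1], [1.0, 2.0 * x2]]

def jacobian_g(y):
    """(k x m) = (1 x 2) Jacobian of g, derived by hand."""
    y1, _ = y
    return [[2.0 * y1, 3.0]]

def matmul(a, b):
    """Plain matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def numerical_jacobian(func, x, step=1e-6):
    """Central-difference approximation of the Jacobian of func at x."""
    rows = len(func(x))
    jac = [[0.0] * len(x) for _ in range(rows)]
    for j in range(len(x)):
        up, down = list(x), list(x)
        up[j] += step
        down[j] -= step
        f_up, f_down = func(up), func(down)
        for i in range(rows):
            jac[i][j] = (f_up[i] - f_down[i]) / (2.0 * step)
    return jac

x0 = [1.0, 2.0]
chain_rule = matmul(jacobian_g(f(x0)), jacobian_f(x0))     # g'[f(x0)] times f'(x0)
direct = numerical_jacobian(lambda x: g(f(x)), x0)         # Jacobian of h = g o f

print(chain_rule, direct)
```

The product of the ($1 \times 2$) and ($2 \times 2$) matrices agrees with the finite-difference Jacobian of the composite up to discretization error.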

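EXAMPLE ($n = 1$, $k = 1$): Let $f(x) = [f_1(x), \ldots, f_m(x)]$ and $h(x) = g[f(x)]$,
where $x \in R$ and $h(x) \in R$. Writing $y_i = f_i(x)$, we have

$$h'(x) = \sum_{i=1}^{m} \frac{\partial g}{\partial y_i} \frac{\partial f_i}{\partial x}$$

In particular, we may consider

$$h(t) = g[f(t)]$$

where $f(t) = a + bt$ with $a, b \in R^m$, $t \in R$ ($a$, $b$ are constant vectors). Then
we have

$$h'(t) = \sum_{i=1}^{m} \frac{\partial g}{\partial y_i} b_i = g'[f(t)] \cdot b$$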

We conclude this part by noting a very important characterization of a
concave or convex function in terms of the gradient vector of the function.

Theorem 1.C.3: Let $f$ be a differentiable real-valued function defined on an open
convex set $X$ in $R^n$. Then $f$ is concave if and only if, for any $x$ and $x^0$ in $X$,

$$f_x^0 \cdot (x - x^0) \geq f(x) - f(x^0)$$

where $f_x^0$ is the gradient vector of $f$ at $x^0$.
PROOF: (Necessity) Suppose $f$ is concave; then we have

$$f[(1 - t) x^0 + t x] \geq (1 - t) f(x^0) + t f(x) \quad \text{for all } t,\ 0 \leq t \leq 1$$

or

$$f[x^0 + t(x - x^0)] - f(x^0) \geq t [f(x) - f(x^0)] \quad \text{for all } t,\ 0 \leq t \leq 1$$

Let $h \equiv x - x^0$. Subtract $t f_x^0 \cdot h$ from both sides of the above relation and
divide by $t > 0$. Then we obtain

$$\frac{f[x^0 + t h] - f(x^0) - t f_x^0 \cdot h}{t} \geq f(x) - f(x^0) - f_x^0 \cdot h$$

Now take the limit of $t \to 0$ ($t > 0$). Then the LHS of the above relation
goes to 0 by definition of $f_x^0$. Therefore we obtain

$$f_x^0 \cdot (x - x^0) \geq f(x) - f(x^0)$$

(Sufficiency) We suppose the above inequality holds and show that $f$ is
concave. Let $x^1$ and $x^2$ be arbitrary points in $X$ such that $x^1 \neq x^2$. Let $x^0 \equiv
(1 - t) x^1 + t x^2$, $0 < t < 1$, $t \in R$, and $h \equiv x^1 - x^0$. Then $x^2 = x^0 - [(1 - t)/t] h$.
As a result of the above presupposed inequality, we have

$$f(x^1) - f(x^0) \leq f_x^0 \cdot h$$

$$f(x^2) - f(x^0) \leq f_x^0 \cdot \left( -\frac{1 - t}{t} h \right)$$

Hence we obtain

$$\left[ \frac{1 - t}{t} f(x^1) + f(x^2) \right] - \left( \frac{1 - t}{t} + 1 \right) f(x^0) \leq 0$$

or

$$(1 - t) f(x^1) + t f(x^2) \leq f(x^0), \quad 0 < t < 1$$

When $t = 1$ or 0, this inequality obviously holds. (Q.E.D.)
REMARK: The above theorem can be illustrated by Figure 1.9.
REMARK: If $f$ is strictly concave, then we have, for any two points $x$ and
$x^0$, $x \neq x^0$, in $X$,
\[ f_x^0 \cdot (x - x^0) > f(x) - f(x^0) \]
The converse of this statement is also true. The proof of this remark is
analogous to the proof of the above theorem.
REMARK: The function $f$ is convex if and only if the inequality of the
above theorem is reversed, that is,
\[ f_x^0 \cdot (x - x^0) \leq f(x) - f(x^0) \]
The function $f$ is strictly convex if and only if the above holds with strict
inequality for any $x \neq x^0$ in $X$.

Figure 1.9. Illustration of Theorem 1.C.3.
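The gradient inequality of Theorem 1.C.3 can be checked numerically. The following is only a minimal sketch under our own assumptions: the concave quadratic $f(x) = -x_1^2 - x_1 x_2 - x_2^2$, the sampling box, and the helper name `gradient_gap` are illustrative choices and do not come from the text.

```python
import random


def f(x):
    """Illustrative concave quadratic on R^2 (its Hessian is negative definite)."""
    x1, x2 = x
    return -x1 ** 2 - x1 * x2 - x2 ** 2


def grad_f(x):
    """Gradient vector of f at x."""
    x1, x2 = x
    return (-2 * x1 - x2, -x1 - 2 * x2)


def gradient_gap(x, x0):
    """Return f'(x0).(x - x0) - [f(x) - f(x0)]; nonnegative when f is concave."""
    g = grad_f(x0)
    linear_part = sum(gi * (xi - x0i) for gi, xi, x0i in zip(g, x, x0))
    return linear_part - (f(x) - f(x0))


random.seed(0)
gaps = []
for _ in range(1000):
    x = (random.uniform(-5, 5), random.uniform(-5, 5))
    x0 = (random.uniform(-5, 5), random.uniform(-5, 5))
    gaps.append(gradient_gap(x, x0))

# Smallest gap over the sample; a small tolerance absorbs rounding error.
min_gap = min(gaps)
print("inequality holds on the sample:", min_gap >= -1e-9)
```

For this particular quadratic the gap equals a positive definite quadratic form in $x - x^0$, so it vanishes only when $x = x^0$, which is the strict inequality of the remark on strict concavity.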

b. UNCONSTRAINED MAXIMUM
We now consider the maximization problem and its relation to deriva-
tives. The minimization problem is essentially the same as the maximization
problem, for the maximization of f(x) is equivalent to the minimization of -f(x),
and vice versa. In this section we take up the unconstrained maximum problem.

Definition: A real-valued function $f$ defined on $X$ in $R^n$ is said to achieve its
local (or relative) maximum at $\hat{x} \in X$ if there exists an open ball $B_\epsilon(\hat{x})$ with center
$\hat{x}$ and radius $\epsilon > 0$ such that $f(\hat{x}) \geq f(x)$ for all $x \in A$, where $A = B_\epsilon(\hat{x}) \cap X$. If
$f(\hat{x}) > f(x)$ for all $x \in A$ with $x \neq \hat{x}$, we say that $f$ achieves its strong (or unique)
local maximum at $\hat{x}$.
REMARK: Note that in the above definition $X$ is not necessarily open. Also
note that $\epsilon$ can be very small.

Definition: A real-valued function $f$ defined on $X$ in $R^n$ is said to achieve its
global (or absolute) maximum at $\hat{x} \in X$ if $f(\hat{x}) \geq f(x)$ for all $x \in X$. If $f(\hat{x}) > f(x)$
for all $x \in X$ with $x \neq \hat{x}$, then we say that $f$ achieves its strong (or unique) global
maximum at $\hat{x}$.
REMARK: Local minimum, strong local minimum, global minimum, and
strong global minimum are defined analogously. We say extremum for either
maximum or minimum.
REMARK: In the above definitions of maximum and minimum, the domain
of $f$ does not necessarily have to be $R^n$. For the concept of global extremum,
X can be anything. For the concept of local extremum it has to be a set
in a metric space in order for the concept "open ball" to be meaningful.
If we replace "open ball" by "open set," X does not even have to be in a
metric space, although it has to be in a topological space.
REMARK: Theorem 1.B.3 of the previous section may be reworded as
follows:
If $f, g_1, g_2, \ldots, g_m$ are concave functions on a convex set $X$ in $R^n$, and if $f$
achieves its maximum at $\hat{x}$, subject to $g(x) \geq 0$, then there exist $p_0, p_1, \ldots,
p_m$, all $\geq 0$, not all equal to zero, such that
\[ L(\hat{x}) \geq L(x) \ \text{for all } x \in X, \quad \text{where } L(x) \equiv p_0 f(x) + p \cdot g(x), \ p = (p_1, \ldots, p_m) \]
Note that this theorem thus says that the global maximum of $f$ over the
constraint set $C \equiv \{x : x \in X \text{ and } g(x) \geq 0\}$ implies the global maximum of
$L$ over $X$.

Obviously any global maximum (resp. minimum) point is also a local maxi-
mum (resp. minimum) point. The converse is not necessarily true. However, when
f is a concave (resp. convex) function, the converse is also true. In particular,
we prove the following:

Theorem 1.C.4: Let $f(x)$ be a concave function over a convex set $X$ in $R^n$. Then
any local maximum of $f(x)$ in $X$ is also a global maximum of $f(x)$ over $X$.

PROOF: Suppose $f$ achieves its local maximum at $\hat{x}$ with respect to an open
ball $B_\epsilon(\hat{x})$. Suppose that $\hat{x}$ is not a global maximum point. In other words,
there exists an $x^* \in X$ such that
\[ f(x^*) > f(\hat{x}) \]
Clearly, $x^* \notin B_\epsilon(\hat{x})$. Since $f(x)$ is concave, we have
\[ f[tx^* + (1 - t)\hat{x}] \geq tf(x^*) + (1 - t)f(\hat{x}) \quad \text{for all } t, \ 0 \leq t \leq 1 \]
Since $f(x^*) > f(\hat{x})$, the RHS of the above inequality exceeds $f(\hat{x})$ if $t \neq 0$,
so that
\[ f[tx^* + (1 - t)\hat{x}] > f(\hat{x}) \quad \text{for all } t, \ 0 < t \leq 1 \]
Let $x = tx^* + (1 - t)\hat{x}$ for $t < \bar{t}$, with $\bar{t}$ chosen such that $\bar{t} < 1$ and
$0 < \bar{t} < \epsilon / \| \hat{x} - x^* \|$, so that $x$ is inside $B_\epsilon(\hat{x})$ (note that $x \in X$, for $X$ is a convex
set). Then $f(x) > f(\hat{x})$ for all $t$, $0 < t < \bar{t}$, which contradicts the fact that
$f(x)$ takes a local maximum at $\hat{x}$ with respect to the ball $B_\epsilon(\hat{x})$. (Q.E.D.)

Theorem 1.C.5: Let $f$ be a concave function defined on a convex set $X$ in $R^n$. Let
$S$ be the set of points in $X$ at which $f$ takes on its global maximum. Then $S$ is a
convex set.

PROOF: If the global maximum is taken on at just a single point, then the
result is obvious. Suppose then that the global maximum is taken on at two
different points $\hat{x}$ and $x^*$. Let $\bar{x} = tx^* + (1 - t)\hat{x}$, $0 \leq t \leq 1$. Since $f$ is
concave and since $f(\hat{x}) = f(x^*)$, we have
\[ f(\bar{x}) = f[tx^* + (1 - t)\hat{x}] \geq tf(x^*) + (1 - t)f(\hat{x}) = f(\hat{x}), \quad 0 \leq t \leq 1 \]
Since $f(\bar{x})$ cannot be greater than $f(\hat{x})$, $f(\bar{x}) = f(\hat{x})$. Therefore $\bar{x} \in S$, or
$tx^* + (1 - t)\hat{x} \in S$ for all $t$, $0 \leq t \leq 1$. (Q.E.D.)
REMARK: It should be clear that Theorems 1.C.4 and 1.C.5 remain correct
if we replace "$f$ is concave" by "$f$ is convex" and "maximum" by "minimum."

REMARK: Hence, if the global maximum is taken at two different points,
it is also taken at all the points in between those two points. It should also
be clear that if $f(x)$ is a strictly concave function, then the global maximum
is taken at a unique point. To prove this, suppose that the global maximum
is taken on at two distinct points, $\hat{x}$ and $x^*$. Then we have $f(\bar{x}) > tf(x^*) +
(1 - t)f(\hat{x}) = f(\hat{x})$, where $\bar{x} = tx^* + (1 - t)\hat{x}$, $0 < t < 1$. That is, we have
$f(\bar{x}) > f(\hat{x})$, which contradicts the fact that $f(\hat{x})$ is the global maximum.
We now prove the following basic theorem.

Theorem 1.C.6: Let $f$ be a real-valued function on an open set $X$ in $R^n$. If $f$ has
a local extremum at $\hat{x}$ and $f$ is differentiable at $\hat{x}$, then $f'(\hat{x}) = 0$.
PROOF: Let $v$ be a vector in $R^n$ such that $\| v \| = 1$, where $\| v \|$ is the Euclidean
norm of $v$. Consider $\alpha(t) = f(\hat{x} + tv)$, where $t \in R$ and $\hat{x} + tv \in X$. By
assumption $\alpha(t)$ has a local extremum at $t = 0$; hence by elementary calculus
$\alpha'(0) = 0$. But $\alpha'(0) = f'(\hat{x}) \cdot v$ by the chain rule (Theorem 1.C.2). Since the
choice of $v$ is arbitrary, this implies $f'(\hat{x}) = 0$.
REMARK: Hence $f'(\hat{x}) = 0$ is a necessary condition that $\hat{x}$ furnish a local
extremum of $f$, and it is called the first-order condition. It is not a sufficient
condition. In other words, the converse of the above theorem is not necessarily
true. For example, consider $f(x) = x^3$, $x \in R$. We know that $f'(x) =
3x^2$ is 0 at $x = 0$ [that is, $f'(0) = 0$], but 0 is not an extremum point. However,
when $f$ is a concave (or a convex) function [note $f(x) = x^3$ is neither
concave nor convex in $R$], the converse of the above theorem is also true.
Thus we have Theorem 1.C.7.

Theorem 1.C.7: Let $f(x)$ be a differentiable and concave function over an open convex
set $X$ in $R^n$. The function $f(x)$ achieves its global maximum at $x = \hat{x}$ if and only if
$\hat{f}_x = 0$, where $\hat{f}_x = f'(\hat{x})$ (the gradient vector of $f$ at $\hat{x}$). Moreover, $\hat{x}$ furnishes a unique
maximum of $f$ if $f$ is strictly concave.
PROOF: If the global maximum is taken on at $x = \hat{x}$, clearly we have $\hat{f}_x = 0$.
Conversely, if $\hat{f}_x = 0$, then, as a result of Theorem 1.C.3, $0 \geq f(x) - f(\hat{x})$
for all $x \in X$, or $f(\hat{x}) \geq f(x)$ for all $x \in X$. (Q.E.D.)
REMARK: Analogously, $f(x)$ achieves its global minimum at $\hat{x}$ if and only
if $f'(\hat{x}) = 0$, when $f$ is a convex function. Likewise, $\hat{x}$ furnishes a unique
minimum of $f$ if $f$ is strictly convex.
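To make the contrast between Theorems 1.C.6 and 1.C.7 concrete, the following sketch compares a strictly concave function with $f(x) = x^3$. It is only an illustration under our own assumptions: the concave function $-(x - 2)^2 + 3$, the grid, and the central-difference helper `numerical_derivative` are hypothetical choices, not part of the text.

```python
def numerical_derivative(func, x, step=1e-6):
    """Central-difference approximation of func'(x)."""
    return (func(x + step) - func(x - step)) / (2 * step)


def concave(x):
    """Strictly concave example with its maximum at x = 2."""
    return -(x - 2.0) ** 2 + 3.0


def cubic(x):
    """f(x) = x^3: neither concave nor convex on R."""
    return x ** 3


grid = [i / 100.0 for i in range(-500, 501)]

# Concave case: the stationary point x = 2 is a global maximum on the grid.
d_concave = numerical_derivative(concave, 2.0)
is_global_max = all(concave(2.0) >= concave(x) for x in grid)

# Cubic case: the derivative vanishes at 0, yet 0 is not a local extremum.
d_cubic = numerical_derivative(cubic, 0.0)
near_zero = [x for x in grid if abs(x) < 0.1]
is_extremum = (all(cubic(0.0) >= cubic(x) for x in near_zero)
               or all(cubic(0.0) <= cubic(x) for x in near_zero))

print("concave: derivative ~ 0 and global max:", abs(d_concave) < 1e-6, is_global_max)
print("cubic:   derivative ~ 0 but extremum:  ", abs(d_cubic) < 1e-6, is_extremum)
```

In both cases the first-order condition holds, but only under concavity does it characterize a maximum.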
REMARK: In the literature there are usually discussions on the "second-order
sufficiency conditions" assuming $f \in C^{(2)}$. When $f$ is specified as concave
(or convex), we can see from the above theorem that such considerations
can be dispensed with. The second-order conditions are, however,
related to the concavity or the convexity of a function in a neighborhood
of the relevant point (see Section E, subsection c). Moreover, note that
Theorem 1.C.7 says that $\hat{f}_x = 0$ is a necessary and sufficient characterization
of a global maximum and not simply that of a local maximum. But this
can easily be understood in view of Theorem 1.C.4, which asserts that
under concavity every local maximum is a global maximum. In other words,
the concavity of $f$ also plays a crucial role in establishing the global
characterization of a maximum.
REMARK: Consider the constrained maximum problem of maximizing $f(x)$
subject to $g_j(x) \geq 0$, $j = 1, \ldots, m$, $x \in X \subset R^n$. Let $C = \{x \in X : g_j(x) \geq 0,
j = 1, 2, \ldots, m\}$ (the constraint set). If $\hat{x}$ is a solution of this problem, then
$\hat{x}$ maximizes $f(x)$ over $C$. Hence identifying set $X$ in Theorems 1.C.4 and
1.C.5 with set $C$ and assuming that $C$ is convex, we can assert under the
concavity of $f$ that every local maximum of $f$ over $C$ is a global maximum of
$f$ over $C$, and that $S = \{\hat{x} : f(\hat{x}) \geq f(x), x \in C\}$ is convex. Furthermore, if $f$
is strictly concave (and $C$ is convex), $\hat{x}$ is unique.
REMARK: Again consider the constrained maximum problem of maximizing
$f(x)$ over $C$. Suppose that the solution $\hat{x}$ is in the interior of $C$, so that
there exists an open ball about $\hat{x}$ [say, $B_\epsilon(\hat{x})$] which is in $C$. Then Theorems
1.C.6 and 1.C.7 can be applied directly to such a constrained maximum
problem by identifying $X$ in these theorems with $B_\epsilon(\hat{x})$. In other words, the
constrained maximum problem is reduced to an unconstrained maximum
problem. But there is nothing surprising in this, for that the solution $\hat{x}$ is
in the interior of $C$ means that none of the constraints $g_j(x) \geq 0$, $j = 1, 2, \ldots,
m$, are effective at $\hat{x}$.

FOOTNOTES
1. When X = R, there are now two different definitions of differentiability. That is,
f is differentiable at x° (i) if both the right-hand and the left-hand derivatives exist
and they are equal, or (ii) if the differential exists at x° in the above sense. It
can be shown that these two definitions are equivalent. See, for example, Brown
and Page [ 1 ] , pp. 266-267, especially theorem 7.1.9.
2. In other words, if f is differentiable at x° in one norm in R", then f is differentiable
at x° in another norm in R" and the two derivatives coincide. See, for example,
Brown and Page [ 1] , p. 273.
3. In infinite dimensional (normed linear) spaces, the differentiability and the value
of differentials, in general, depend on the choice of the norm.
4. For the proof of this statement, see, for example, Fleming [2] and Rudin [6].
Here it is crucial that X is in a finite dimensional space such as R". If X is in an
infinite dimensional space, the function may be differentiable at x° but may fail
to be continuous at x°. See, for example, Brown and Page [ 1] , p. 274 (exercise 3).
5. Weierstrass showed that the function $f(x) = \sum_{n=0}^{\infty} a^n \cos(b^n x)$ is continuous but
nowhere differentiable when $b$ is an odd integer, $0 < a < 1$ and $ab > 1 + (3/2)\pi$.
This was first published by du Bois Reymond in 1875. Since then simpler examples
have been constructed. One of the simplest was given by B. L. van der Waerden
in 1930. A systematic discussion of nowhere differentiable functions is given in
E. W. Hobson, The Theory of Functions of a Real Variable, Vol. 2, 2nd ed., Cambridge
University Press, 1926, pp. 401-412.
6. If $f \notin C^{(2)}$ or if the second partial derivatives of $f$ are not continuous, then we
do not necessarily have $\partial^2 f / \partial x_i \partial x_j = \partial^2 f / \partial x_j \partial x_i$. A usual counterexample in textbooks
is the function $f$ defined by $f(x, y) = xy(x^2 - y^2)/(x^2 + y^2)$ if $(x, y) \neq
(0, 0)$, and $f(0, 0) = 0$. It can be shown easily that $\partial^2 f / \partial y \partial x = -1$ at $(0, 0)$ and
$\partial^2 f / \partial x \partial y = 1$ at $(0, 0)$. See also Section E of this chapter.

REFERENCES
1. Brown, A. L., and Page, A., Elements of Functional Analysis, London, England,
Van Nostrand Reinhold, 1970.
2. Fleming, W. H., Functions of Several Variables, Reading, Mass., Addison-Wesley,
1965, esp. chapters 1, 2, 3, and 4.
3. Goffman, C., Calculus of Several Variables, New York, Harper & Row, 1965, esp.
chapters 2 and 3.
4. Hadley, G., Nonlinear and Dynamic Programming, Reading, Mass., Addison-Wesley,
1964, esp. chapters 1 and 3.
5. Loomis, L. H., and Sternberg, S., Advanced Calculus, Reading, Mass., Addison-
Wesley, 1968, esp. chapter 3.
6. Rudin, W., Principles of Mathematical Analysis, 2nd ed., New York, McGraw-Hill,
1964, esp. chapter 9.
7. Vainberg, M. M., Variational Methods for the Study of Nonlinear Operators, (trans-
lated by Feinstein from the Russian original published in 1956), San Francisco,
Holden-Day, 1964, esp. chapter 1.

Section D
THE QUASI-SADDLE-POINT
CHARACTERIZATION

Supposing that we are given real-valued functions $f(x), g_1(x), g_2(x), \ldots,
g_m(x)$ on $X$ in $R^n$, in Section B we discussed the following two conditions:
(M) (Maximality condition) There exists an $\hat{x}$ in $X$ which maximizes $f(x)$ subject
to $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, and $x \in X$.
(SP) (Saddle-point condition) There exists an $(\hat{x}, \hat{\lambda})$ in $X \otimes \Omega^m$ such that $(\hat{x}, \hat{\lambda})$
is a saddle point of $\Phi(x, \lambda)$; that is, $\Phi(x, \hat{\lambda}) \leq \Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda)$ for all $x \in X$
and $\lambda \in \Omega^m$, where $\Phi(x, \lambda) = f(x) + \lambda \cdot g(x)$.
In Section B, we showed that condition (M) implies condition (SP) if $f$ and
the $g_j$'s are all concave functions (where $X$ is a convex set) and if a normality
condition such as Slater's condition is satisfied. We also showed that condition
(SP) implies condition (M) (with no conditions such as the concavity of $f$ and the
$g_j$'s or Slater's condition). In this section, unlike Section B, we assume that $f$
and the $g_j$'s are differentiable on $X$. First we introduce the following condition,
which is also known as the first-order condition or the Kuhn-Tucker-Lagrange
condition.

(QSP) (Quasi-saddle-point condition) There exists an $(\hat{x}, \hat{\lambda})$ in $X \otimes \Omega^m$ such that
$\hat{f}_x + \hat{\lambda} \cdot \hat{g}_x = 0$, $g(\hat{x}) \geq 0$ and $\hat{\lambda} \cdot g(\hat{x}) = 0$, where $\hat{f}_x = f'(\hat{x})$ and $\hat{g}_x = g'(\hat{x})$. (Here
$X$ is an open set to make differentiation meaningful.)¹
REMARK: In the above we supposed $X \subset R^n$. The space $R^n$ can be replaced
by a normed linear space.
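REMARK: Note that condition $\hat{f}_x + \hat{\lambda} \cdot \hat{g}_x = 0$ can be spelled out as the
following:
\[ \frac{\partial f}{\partial x_i} + \sum_{j=1}^{m} \hat{\lambda}_j \frac{\partial g_j}{\partial x_i} = 0, \quad i = 1, 2, \ldots, n \]
(where the partial derivatives are evaluated at $x = \hat{x}$). If we write $L(x)$
$[= \Phi(x, \hat{\lambda})] = f(x) + \hat{\lambda} \cdot g(x)$, then this condition can also be written as
\[ \left. \frac{\partial L(x)}{\partial x_i} \right|_{x = \hat{x}} = 0, \quad i = 1, 2, \ldots, n \]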
Now we can immediately observe the following theorem.

Theorem 1.D.1: Let $f, g_1, g_2, \ldots, g_m$ be real-valued differentiable functions on an
open set $X$ in $R^n$.

(i) Condition (SP) implies condition (QSP).

(ii) If, in addition, $f$ and $g_1, \ldots, g_m$ are all concave functions, then (QSP) implies (SP)
(where $X$ is now taken to be convex).

PROOF:
(i) By assumption, $\Phi(x, \hat{\lambda}) \leq \Phi(\hat{x}, \hat{\lambda})$ for all $x \in X$. [That is, $\hat{x}$ achieves a
global (hence local) maximum of $\Phi(x, \hat{\lambda})$ on $X$.] Therefore, by Theorem
1.C.6, $\Phi_x(\hat{x}, \hat{\lambda}) = 0$ or $\hat{f}_x + \hat{\lambda} \cdot \hat{g}_x = 0$. Also, by Theorem 1.B.4, (SP)
implies $\hat{\lambda} \cdot g(\hat{x}) = 0$. Hence $\Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda)$ for all $\lambda \geq 0$ implies that
$\lambda \cdot g(\hat{x}) \geq 0$ for all $\lambda \geq 0$. Hence, in particular, $g(\hat{x}) \geq 0$.
(ii) Since $\Phi(x, \hat{\lambda}) = f(x) + \hat{\lambda} \cdot g(x)$ is a nonnegative linear combination of
$f$ and the $g_j$'s, and since $f$ and the $g_j$'s are concave functions, $\Phi(x, \hat{\lambda})$ is
also a concave function. Then, by Theorem 1.C.7, we have $\Phi(x, \hat{\lambda}) \leq
\Phi(\hat{x}, \hat{\lambda})$, and $\Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda)$ follows trivially from the fact that
$\hat{\lambda} \cdot g(\hat{x}) = 0$, $g(\hat{x}) \geq 0$ and $\lambda \geq 0$. (Q.E.D.)

Combining the above theorem with the corollary of Theorem 1.B.3, Theorem
1.D.2 follows at once.

Theorem 1.D.2: Let $f, g_1, g_2, \ldots, g_m$ be concave and differentiable on an open
convex set $X$ in $R^n$. Assume that Slater's condition holds, that is,
(S) There exists an $\bar{x} \in X$ such that $g_j(\bar{x}) > 0$ for all $j = 1, 2, \ldots, m$.
Then condition (M) holds if and only if (QSP) holds.
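A small numerical illustration of Theorems 1.D.1 and 1.D.2 may be helpful. The sketch below uses a hypothetical problem of our own choosing (it does not appear in the text): maximize $f(x) = -(x_1 - 2)^2 - (x_2 - 2)^2$ subject to $g(x) = 2 - x_1 - x_2 \geq 0$. Both functions are concave and Slater's condition holds (for instance at the origin), so (QSP), (SP), and (M) should agree at $\hat{x} = (1, 1)$ with $\hat{\lambda} = 2$. The helper name `satisfies_qsp` and the grid are also assumptions.

```python
def f(x):
    """Concave objective of the illustrative problem."""
    return -(x[0] - 2.0) ** 2 - (x[1] - 2.0) ** 2


def g(x):
    """Single linear (hence concave) constraint g(x) >= 0."""
    return 2.0 - x[0] - x[1]


def grad_f(x):
    return (-2.0 * (x[0] - 2.0), -2.0 * (x[1] - 2.0))


def grad_g(x):
    return (-1.0, -1.0)


def lagrangian(x, lam):
    """Phi(x, lambda) = f(x) + lambda * g(x)."""
    return f(x) + lam * g(x)


def satisfies_qsp(x, lam, tol=1e-9):
    """Check f_x + lam * g_x = 0, g(x) >= 0, lam * g(x) = 0 and lam >= 0."""
    stationary = all(abs(a + lam * b) <= tol for a, b in zip(grad_f(x), grad_g(x)))
    return stationary and g(x) >= -tol and abs(lam * g(x)) <= tol and lam >= 0


x_hat, lam_hat = (1.0, 1.0), 2.0
qsp_ok = satisfies_qsp(x_hat, lam_hat)

# Grid check of (M) and of the two saddle-point inequalities in (SP).
grid = [i / 4.0 for i in range(-12, 13)]
points = [(a, b) for a in grid for b in grid]
is_constrained_max = all(f(x_hat) >= f(p) for p in points if g(p) >= 0)
saddle_in_x = all(lagrangian(p, lam_hat) <= lagrangian(x_hat, lam_hat) for p in points)
saddle_in_lam = all(lagrangian(x_hat, lam_hat) <= lagrangian(x_hat, lam)
                    for lam in (0.0, 0.5, 1.0, 5.0))

print(qsp_ok, is_constrained_max, saddle_in_x, saddle_in_lam)
```

A grid check of this kind is of course no proof; it only confirms that the three conditions agree at the candidate point for this example.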
REMARK: We illustrate schematically some important results obtained
so far in this chapter in Figure 1.10. The arrow here reads "implies" under
the conditions stated with the arrow.
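REMARK: In economics it is often necessary to take into account explicitly
a nonnegativity condition such as $x \geq 0$. So here let us consider the problem
of maximizing $f(x)$ subject to $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, and $x \geq 0$, where
$x \in R^n$. Note that the set $X$ in the above theorem is now taken to be the
whole space $R^n$, which is obviously open and convex. We can write the
(QSP) condition for the present case as follows:
There exist $\hat{x}$, $\hat{\lambda}$, $\hat{\mu}$ such that
\[ \hat{f}_x + \hat{\lambda} \cdot \hat{g}_x + \hat{\mu} = 0 \]
\[ \hat{\lambda} \cdot g(\hat{x}) + \hat{\mu} \cdot \hat{x} = 0 \]
and
\[ g(\hat{x}) \geq 0, \quad \hat{x} \geq 0 \]
\[ \hat{\lambda} \geq 0, \quad \hat{\mu} \geq 0 \]
This condition can easily be converted to the following equivalent and better
known condition, which we call condition (QSP').²
(QSP') There exist $\hat{x} \geq 0$ and $\hat{\lambda} \geq 0$ such that
\[ \hat{f}_x + \hat{\lambda} \cdot \hat{g}_x \leq 0, \quad (\hat{f}_x + \hat{\lambda} \cdot \hat{g}_x) \cdot \hat{x} = 0 \]
\[ \hat{\lambda} \cdot g(\hat{x}) = 0 \quad \text{and} \quad g(\hat{x}) \geq 0 \]
We illustrate the above condition in Figure 1.11. Here $X = R$ and we have
only one constraint, $x \geq 0$. From the diagram, $\hat{x} = 0$ achieves the maximum
of $f(x)$ subject to $x \geq 0$, and $f'(\hat{x}) < 0$, $f'(\hat{x}) \hat{x} = 0$. In this problem, if we
take $X = \Omega$ (that is, $\{x : x \in R, x \geq 0\}$), then the explicit constraint will
disappear; $f'(\hat{x})$ should be replaced by the right-hand derivative $f'_+(\hat{x})$.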

Figure 1.10. Relationships between (M), (SP), and (QSP) under Concavity.
(Arrow labels in the diagram: "f, g_j's concave and (Slater)"; "Always"; "f, g_j's differentiable"; "f, g_j's concave".)

Figure 1.11. An Illustration of Corner Solution.

In Theorem 1.D.2 we considered the relation between (M) and (QSP) by
going through (SP). We are now concerned with the problem of finding the relation
between (M) and (QSP) without going through (SP). This is, in fact, the
approach adopted by Kuhn and Tucker in their famous paper [10]. To do this,
we first discuss the following condition, (KTCQ), which was introduced by
Kuhn and Tucker and is now called the Kuhn-Tucker constraint qualification. Here
$X$ is assumed to be an open subset of $R^n$, and the $g_j$'s are assumed to be
differentiable in $X$.

(KTCQ) Let $C$ be the constraint set defined by $C = \{x : x \in X, g_j(x) \geq 0, j = 1,
2, \ldots, m\}$. Let $\hat{x}$ be a point in $C$ with $g_j(\hat{x}) = 0$ for $j \in E$, where $E \subset \{1, 2, \ldots, m\}$
and $E \neq \emptyset$. Let $\bar{x}$ be any point in $X$ such that $g_j'(\hat{x}) \cdot (\bar{x} - \hat{x}) \geq 0$ for all $j \in E$.
It is supposed that there exists a function $h(t)$ on $[0, 1]$ into $X$, which is
differentiable at 0 with the following properties.

(i) $h(0) = \hat{x}$, and $h(t) \in C$, $0 \leq t \leq 1$,

(ii) $h'(0) = \alpha(\bar{x} - \hat{x})$ for some positive number $\alpha$.

Because of $h(t)$, this condition (KTCQ) may be referred to as the parameterizability
condition.
REMARK: Intuitively speaking, (KTCQ) says the following:
For every "line" originating from $\hat{x}$ and lying in the set defined by
\[ Y \equiv \{x : g_j'(\hat{x}) \cdot (x - \hat{x}) \geq 0, \ j \in E\} \]
there exists a differentiable curve $h(t)$ which lies in the constraint set $C$ such
that at $\hat{x}$, $h(t)$ is tangent to the line.

This (KTCQ) is illustrated in Figure 1.12. In either of the two cases, (KTCQ)
is satisfied.
The case in which (KTCQ) is not satisfied is illustrated in Figure 1.13. This
is the case where there is some irregularity (such as a "cusp") on the boundaries.
It should be clear that for a point such as $\bar{x} \in Y$ in Figure 1.13, there is no function
$h(t)$ satisfying (i) and (ii) of (KTCQ).
Before we state Kuhn-Tucker's main theorem, we also should modify condition
(M) [note that (M) implies (LM)].
(LM) (Local maximality condition) There exists an $\hat{x}$ in $X$ such that $f(x)$ has a
local maximum at $\hat{x}$ subject to $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, and $x \in X$.
In other words, there exists an open ball $B(\hat{x})$ about $\hat{x}$ such that $A \equiv B(\hat{x}) \cap C \neq \emptyset$
and $f(\hat{x}) \geq f(x)$ for all $x \in A$, where $C$ is the constraint set.
We now state and prove the theorem.

Theorem 1.D.3 (Kuhn-Tucker's main theorem): Let $f, g_1, g_2, \ldots, g_m$ be real-valued,
differentiable functions on an open set $X$ in $R^n$. Assume that the functions $g_j$, $j = 1,
2, \ldots, m$, satisfy (KTCQ). Then condition (LM) implies condition (QSP).
PROOF: Let $\hat{x}$ be a local maximum. If $g_j(\hat{x}) > 0$ for all $j = 1, 2, \ldots, m$, then
it follows easily that $f'(\hat{x}) = 0$ (Theorem 1.C.6, unconstrained local maximum).
Then if we choose $\hat{\lambda}_1 = \hat{\lambda}_2 = \cdots = \hat{\lambda}_m = 0$, the (QSP) condition
is satisfied. Now suppose
\[ g_j(\hat{x}) = 0, \quad j \in E \]
\[ g_j(\hat{x}) > 0, \quad j \notin E \]
where $E \subset \{1, 2, \ldots, m\}$ and $E \neq \emptyset$. Let $\bar{x}$ be an arbitrary point in $X$
such that
\[ g_j'(\hat{x}) \cdot (\bar{x} - \hat{x}) \geq 0 \quad \text{for } j \in E \]
We consider $h(t)$ in (KTCQ), with $h(0) = \hat{x}$. Write $x = h(t)$. By the definition
of differentiation, we have
\[ f(x) - f(\hat{x}) = f'(\hat{x}) \cdot (x - \hat{x}) + o(\| x - \hat{x} \|) \]
Also
\[ x - \hat{x} = h(t) - h(0) = h'(0)t + o(\| t \|) \quad \text{and} \quad o(\| x - \hat{x} \|) = o(\| t \|) \]
Hence
\[ f(x) - f(\hat{x}) = f'(\hat{x}) \cdot [h'(0)t + o(\| t \|)] + o(\| t \|) \]
Therefore
\[ f(x) - f(\hat{x}) = \alpha f'(\hat{x}) \cdot (\bar{x} - \hat{x}) t + o(\| t \|) \]
Since $x = h(t) \in C$ for $0 \leq t \leq 1$, $f(x) - f(\hat{x}) \leq 0$ for sufficiently small $t$ from
the local maximality of $f(\hat{x})$ in $C$. Hence for sufficiently small $t$, $f'(\hat{x}) \cdot (\bar{x} - \hat{x})
\leq 0$. Write $\xi = (\bar{x} - \hat{x})$. Then $-f'(\hat{x}) \cdot \xi \geq 0$ for all $\xi$ with $g_j'(\hat{x}) \cdot \xi \geq 0$,
$j \in E$. Therefore, by the Minkowski-Farkas lemma (Theorem 0.B.4), there
exist $\hat{\lambda}_j \geq 0$, for all $j \in E$, such that
\[ -f'(\hat{x}) = \sum_{j \in E} \hat{\lambda}_j g_j'(\hat{x}) \]
or
\[ f'(\hat{x}) + \sum_{j \in E} \hat{\lambda}_j g_j'(\hat{x}) = 0 \]
Choose $\hat{\lambda}_j = 0$ if $j \notin E$. We have now obtained $(\hat{x}, \hat{\lambda}_1, \ldots, \hat{\lambda}_m)$ such that
$f'(\hat{x}) + \sum_{j=1}^{m} \hat{\lambda}_j g_j'(\hat{x}) = 0$. That $g_j(\hat{x}) \geq 0$ for all $j$ follows immediately from
(LM). Since $g_j(\hat{x}) = 0$ for $j \in E$, and $\hat{\lambda}_j = 0$ for $j \notin E$, we have
\[ \sum_{j=1}^{m} \hat{\lambda}_j g_j(\hat{x}) = 0 \]
(Q.E.D.)

Figure 1.12. Illustrations of the Kuhn-Tucker Regularity.
(Case a: $E = \{1, 2\}$. Case b: $E = \{2\}$. In both cases $C$ = darkly shaded region, $Y$ = entire shaded region, $X = R^2$.)

Figure 1.13. An Illustration of the Kuhn-Tucker Irregularity.
($C$ = darkly shaded region, $Y$ = the straight line passing through $\hat{x}$, $X = R^2$.)
REMARK: It should be clear from the above proof that the theorem follows
almost immediately from (KTCQ). Note also that the above theorem and
proof follow almost word for word when $f$ and the $g_j$'s are real-valued
functions defined over $X$, where $X$ is a "Banach space" (that is, a complete
normed linear space).³ See Ritter [13]. The Minkowski-Farkas lemma
holds almost as it is when $X$, the domain of $f$ and the $g_j$'s, is an arbitrary
linear space (which can be infinite dimensional and does not even have to
be normed). See Fan [4], especially theorem 4, p. 104.
REMARK: In the (QSP) condition, we required, among others, the following
relation:
\[ \hat{f}_x + \hat{\lambda} \cdot \hat{g}_x = 0 \]
If we do not have (KTCQ), (LM) does not necessarily imply (QSP) (in
particular, the above relation). In other words, a statement such as "the
first-order conditions are the necessary conditions for a local maximum" is
not necessarily true. A special regularity condition such as (KTCQ) is
required to make this statement valid.
However, if we modify the above expression to
\[ \hat{\lambda}_0 \hat{f}_x + \hat{\lambda} \cdot \hat{g}_x = 0 \quad \text{where } \hat{\lambda}_0 \geq 0 \]
allowing the coefficient $\hat{\lambda}_0$ for $\hat{f}_x$ (with the possibility of $\hat{\lambda}_0 = 0$), then (LM)
always implies (QSP) with this modification. In other words, the role of
(KTCQ) is to guarantee that $\hat{\lambda}_0 > 0$ (which in turn enables us to set $\hat{\lambda}_0 = 1$).
Hence the regularity condition such as (KTCQ) is really the normality
condition, which we discussed in Section B of this chapter. The theorem which
states that (LM) implies the above modified (QSP) was proved by Fritz John
in 1948 [8].
To make clear this point about $\hat{\lambda}_0$, let us consider the following
example (Slater's example) again.
Maximize: $f(x) = x$, $x \in R$
Subject to: $g(x) = -x^2 \geq 0$
Clearly $\hat{x} = 0$ is a solution of this problem. Since the Lagrangian is written
as $\Phi(x, \lambda) = f(x) + \lambda g(x) = x - \lambda x^2$, we have $f_x + \lambda g_x = 1 - 2\lambda x$. Therefore
$\hat{f}_x + \hat{\lambda} \hat{g}_x = 1 - 2\hat{\lambda}\hat{x} = 1$. In other words, $(\hat{f}_x + \hat{\lambda} \hat{g}_x) \neq 0$ at $\hat{x} = 0$.
Hence the usual first-order condition is not a necessary condition for maximality
in this case. However, the condition
\[ \hat{\lambda}_0 \hat{f}_x + \hat{\lambda} \hat{g}_x = 0 \]
is certainly satisfied at $\hat{x} = 0$ with $\hat{\lambda}_0 = 0$.
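Slater's example is simple enough to verify by direct computation. The sketch below is only an illustration: the function name `fritz_john_residual`, the multiplier values tried, and the grid are our own hypothetical choices. It evaluates $\lambda_0 f_x + \lambda g_x$ at $\hat{x} = 0$ and confirms that the residual equals 1 whenever $\lambda_0 = 1$, but vanishes when $\lambda_0 = 0$.

```python
def g(x):
    """Constraint function of Slater's example: g(x) = -x^2."""
    return -x * x


def grad_f(x):
    """Derivative of the objective f(x) = x."""
    return 1.0


def grad_g(x):
    """Derivative of g(x) = -x^2."""
    return -2.0 * x


def fritz_john_residual(x, lam0, lam):
    """Left-hand side of lam0 * f_x + lam * g_x = 0."""
    return lam0 * grad_f(x) + lam * grad_g(x)


x_hat = 0.0

# Usual first-order condition (lam0 = 1): the residual is 1 for every multiplier.
residuals = [fritz_john_residual(x_hat, 1.0, lam) for lam in (0.0, 1.0, 10.0, 1000.0)]

# Modified condition with lam0 = 0 and a positive multiplier.
fj = fritz_john_residual(x_hat, 0.0, 1.0)

# The constraint set is the single point 0, and no grid point gives g(x) > 0.
grid = [i / 10.0 for i in range(-20, 21)]
feasible = [x for x in grid if g(x) >= 0]
slater_holds = any(g(x) > 0 for x in grid)

print(residuals, fj, feasible, slater_holds)
```

Since $g'(0) = 0$, no choice of the multiplier on the constraint can offset $f'(0) = 1$, which is why the coefficient on $f$ must be allowed to vanish here.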
When the condition (KTCQ) is satisfied, such a nonlinear programming
problem is called Kuhn-Tucker regular, and in this case we have [(LM) ⇒ (QSP)]
(with $\hat{\lambda}_0 = 1$ as (QSP) is usually defined). However, (KTCQ) is not the only
condition that guarantees [(LM) ⇒ (QSP)]. In fact, in Theorem 1.D.2 we already
observed that Slater's condition (S) together with the concavity of the function
$f$ and the $g_j$'s are also sufficient to guarantee [(LM) ⇒ (QSP)]; hence they are
also the conditions for normality. We can check easily (as in fact we already
remarked in Section B) that Slater's example given above does violate Slater's
condition (S).
We are now interested in finding some other conditions that would guarantee
[(LM) ⇒ (QSP)]. To begin, we may observe that (KTCQ) is not a particularly
easy condition to check, so that we may be interested in finding conditions that
are easier to apply. Second, we may also relate the present theory to the linear
programming theory and to the classical maximization theory in which all the
constraints are expressed as equalities. In this connection we may recall that in
linear programming no normality condition such as (KTCQ) is required and
that in the classical theory the normality condition called the "rank condition"
is required.
With this background, we are now ready to state beautiful results by Arrow,
Hurwicz, and Uzawa which provide us with important conditions that replace
the (KTCQ) condition.

Theorem 1.D.4 (Arrow-Hurwicz-Uzawa): If any one of the following five conditions
holds, then (KTCQ) in Theorem 1.D.3 can be dispensed with [that is, (LM) implies
(QSP)], where $X$ is an open convex subset of $R^n$.

(i) The functions $g_j(x)$, $j = 1, 2, \ldots, m$, are convex functions.
(ii) The functions $g_j(x)$, $j = 1, 2, \ldots, m$, are linear or linear affine functions.
(iii) The functions $g_j(x)$, $j = 1, 2, \ldots, m$, are concave functions and there exists an
$\bar{x}$ in $X$ such that $g_j(\bar{x}) \geq 0$ for $j \in E'$ and $g_j(\bar{x}) > 0$ for $j \in E''$, where $E'$ is the
set of indices for the effective constraints (at $\hat{x}$) which are linear (affine), and
$E''$ is the set of indices for the effective constraints (at $\hat{x}$) which are not linear
(affine).
(iv) The constraint set, $C = \{x : g_j(x) \geq 0, j = 1, 2, \ldots, m, x \in R^n\}$, is convex and
possesses an interior point, and $g_j'(\hat{x}) \neq 0$ for all $j \in E$, where $E$ is the set of
indices of all the effective constraints at $\hat{x}$.
(v) (Rank condition) The rank of $[g_j'(\hat{x})]_{j \in E}$ equals the number of effective
constraints at $\hat{x}$,⁴ where the rank of the $k \times n$ matrix is defined as the (maximum)
number of its linearly independent rows (which is equal to the maximum
number of its linearly independent columns).⁵

PROOF: Omitted. See Arrow, Hurwicz, and Uzawa [1], especially their
theorem 3. See also the appendix to this section.
REMARK: We may call the above five conditions the Arrow-Hurwicz-
Uzawa (or the A-H-U) conditions. Condition (ii) is obviously a special case
of condition (i). Condition (ii) is important in connection with linear pro-
gramming. Note that Slater's condition is a special case of condition (iii).
Conditions (iv) and (v) presuppose no concavity or convexity of the $g_j$'s.
Condition (v) is famous from classical Lagrangian multiplier theory which
deals with the case in which all the constraints are effective (that is, all the
constraints are "equality constraints"). It is important to note that all the
above five conditions are concerned with the constraints only.
REMARK: We illustrate Theorems 1.D.3 and 1.D.4 schematically in
Figure 1.14.
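The rank condition (v) is mechanical to check once the effective constraints at a point are known. The sketch below is a hedged illustration in plain Python; the helper names (`matrix_rank`, `effective_gradients`, `rank_condition_holds`) and both test problems are our own assumptions. The second problem is the standard textbook "cusp" example, $g_1(x) = (1 - x_1)^3 - x_2 \geq 0$ and $g_2(x) = x_2 \geq 0$ at $\hat{x} = (1, 0)$, which is not taken from this section.

```python
def matrix_rank(rows, tol=1e-10):
    """Rank of a small matrix by Gaussian elimination (row echelon form)."""
    m = [[float(v) for v in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a usable pivot in this column at or below the current rank row.
        pivot = next((r for r in range(rank, len(m)) if abs(m[r][col]) > tol), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank


def effective_gradients(x, constraints, tol=1e-9):
    """Gradients g_j'(x) of the constraints that hold with equality at x."""
    return [grad(x) for g, grad in constraints if abs(g(x)) <= tol]


def rank_condition_holds(x, constraints):
    """Condition (v): rank of [g_j'(x)], j in E, equals the number of effective constraints."""
    grads = effective_gradients(x, constraints)
    return len(grads) == 0 or matrix_rank(grads) == len(grads)


# Regular case: unit disk and x1 >= 0, evaluated at (0, 1).
regular = [
    (lambda x: 1.0 - x[0] ** 2 - x[1] ** 2, lambda x: (-2.0 * x[0], -2.0 * x[1])),
    (lambda x: x[0], lambda x: (1.0, 0.0)),
]
regular_ok = rank_condition_holds((0.0, 1.0), regular)

# Cusp case: the two effective gradients at (1, 0) are (0, -1) and (0, 1).
cusp = [
    (lambda x: (1.0 - x[0]) ** 3 - x[1], lambda x: (-3.0 * (1.0 - x[0]) ** 2, -1.0)),
    (lambda x: x[1], lambda x: (0.0, 1.0)),
]
cusp_ok = rank_condition_holds((1.0, 0.0), cusp)

print("regular:", regular_ok, " cusp:", cusp_ok)
```

In the cusp case the two effective gradients are linearly dependent, so the rank is 1 while two constraints are effective, and condition (v) fails.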

If $f$ and the $g_j$'s are concave, then (QSP) implies (LM), hence (M). This was
already discussed in Theorem 1.D.2.
We now show one immediate corollary of the above theorem.

Theorem 1.D.5: Let $f(x)$ be a real-valued concave differentiable function on $R^n$.
Let $a^j$, $j = 1, 2, \ldots, m$, be points in $R^n$. Let $g_j(x) = r_j - a^j \cdot x$, $j = 1, 2, \ldots, m$,
where $r_j \in R$. Then $\hat{x}$ achieves a maximum of $f(x)$ subject to $g_j(x) \geq 0$, $j = 1, 2, \ldots,
m$, if and only if

(i) There exists a $\hat{\lambda} \geq 0$ such that $(\hat{x}, \hat{\lambda})$ is a saddle point of $\Phi(x, \lambda) = f(x) + \lambda \cdot g(x)$
over $R^n \otimes \Omega^m$; that is, $\Phi(x, \hat{\lambda}) \leq \Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda)$ for all $x \in R^n$ and all
$\lambda \geq 0$.

Or
(ii) There exists a $\hat{\lambda} \geq 0$ such that $(\hat{x}, \hat{\lambda})$ satisfies (QSP).

Figure 1.14. An Illustration of Theorems 1.D.3 and 1.D.4.
(The diagram shows (LM) implying (QSP) under (KTCQ) or (A-H-U).)

PROOF: Since $g_j(x)$ is linear affine for all $j$, condition (ii) of the Arrow-Hurwicz-Uzawa
theorem is satisfied. Hence (M) implies (QSP). Moreover,
$f$ is concave and the $g_j$'s are linear affine, hence concave; therefore (QSP)
is sufficient for the maximality (M). Thus statement (ii) in the above theorem
is proved. Owing to the differentiability of $f$ and the $g_j$'s, (SP) implies (QSP).
Owing to the concavity of $f$ and the $g_j$'s, (QSP) implies (SP). Since (QSP) is
necessary and sufficient for (M), (SP) is now also necessary and sufficient
for (M), which proves statement (i) of the theorem. (Q.E.D.)
REMARK: To prove the above theorem, we really do not need the machinery
of the Arrow-Hurwicz-Uzawa theorem. But the extreme simplicity of the
above proof will indicate the power of the theorem, as well as enhance the
reader's understanding of the theorem.
REMARK: It may be worthwhile to recall the warning that we gave earlier.
That is, in order that an expression such as $\hat{f}_x + \hat{\lambda} \cdot \hat{g}_x = 0$ in (QSP) be
meaningful, $\hat{x}$ must be an interior point in $X$. If this condition is not satisfied,
the theorems which involve (QSP) become meaningless and those theorems
whose proofs require (QSP) may not hold. Consider the following example
from Moore [12].

Maximize: $f(x) = \sqrt{1 - x^2}$, $x \in R$
Subject to: $g(x) = x - 1 \geq 0$
Note that the domain of $f(x)$ is restricted to a closed interval $[-1, 1]$ in $R$,
in order that $f(x)$ be a real-valued function. Hence, in view of the constraint
$x - 1 \geq 0$, the constraint set consists of only one point $x = 1$. The solution of
the above problem is obviously $\hat{x} = 1$. However, we cannot state the (QSP)
condition since the derivative of $f$ is not defined at $\hat{x} = 1$. Note that the Lagrangian $\Phi =
\sqrt{1 - x^2} + \lambda(x - 1)$ does not have a saddle point at $\hat{x} = 1$. Note also that
the constraint function $g(x)$ is linear affine in this case, so that condition (ii)
of the A-H-U theorem is satisfied.

It may be worthwhile now to summarize some of the results obtained
with respect to the characterization of a solution. For the sake of simplicity,
we assume $X = R^n$, which, in particular, implies that $X$ is open and convex.
Moreover, whenever (QSP) appears, the differentiability of $f$ and the $g_j$'s is
assumed. First we remind ourselves of the problem and various conditions.

PROBLEM: Maximize: $f(x)$
Subject to: $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$
$x \in X = R^n$
or
Maximize: $f(x)$
Subject to: $x \in C$, where $C = \{x \in X : g_j(x) \geq 0, j = 1, \ldots, m\}$

Condition (M) $\hat{x}$ is a solution of the above problem.

Condition (LM) There exists an open ball (or neighborhood) $B(\hat{x})$ such that
$\hat{x}$ maximizes $f(x)$ subject to $x \in B(\hat{x}) \cap C$.

Condition (SP) The point $(\hat{x}, \hat{\lambda})$ is a saddle point of $\Phi(x, \lambda) = f(x) +
\lambda \cdot g(x)$.

Condition (QSP) (Or the first-order conditions) The point $(\hat{x}, \hat{\lambda})$ is a
quasi-saddle-point of $\Phi(x, \lambda)$; that is,

(i) $\hat{f}_x + \hat{\lambda} \cdot \hat{g}_x = 0$
(ii) $\hat{\lambda} \cdot g(\hat{x}) = 0$, $g(\hat{x}) \geq 0$
and
(iii) $\hat{\lambda} \geq 0$, $\hat{\lambda} \in R^m$, $\hat{x} \in X$

Condition (KTCQ) The constraint functions $g_j$'s satisfy the Kuhn-Tucker
constraint qualification.

Condition (A-H-U) The constraint functions $g_j$'s satisfy any one of the five
conditions of the Arrow-Hurwicz-Uzawa theorem. For example,

(i) The $g_j$'s are all convex or linear (affine).
(ii) The $g_j$'s are all concave and satisfy the Slater condition (S):
(S) $\exists \bar{x} \in X$ such that $g_j(\bar{x}) > 0$ for all $j$.
(iii) The rank condition is satisfied.

Condition (Conc.) The functions $f, g_j$, $j = 1, 2, \ldots, m$, are all concave.

We are now ready to obtain the diagram which shows the logical connections
of the above conditions (Figure 1.15).

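Figure 1.15. Characterization of $\hat{x}$.
(Implications shown in the diagram: (M) implies (LM); (LM) implies (M) under (Conc.); (SP) implies (M); (M) implies (SP) under (Conc.) + (S); (SP) implies (QSP); (QSP) implies (SP) under (Conc.); (LM) implies (QSP) under (KTCQ) or (A-H-U).)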

In Figure 1.15, the arrow again reads "implies" under the conditions stated
with the arrow. If no conditions are stated, then no conditions are necessary
to obtain the given implication. In practical applications of nonlinear pro-
gramming theory to economics, the following conditions are often satisfied.

(i) The function f is concave and the gj's are all concave and satisfy (S), or
(ii) The function f is concave and the gj's are all linear (affine).

In such a "nice" situation, we can easily see that Figure 1.15 is considerably
simplified to Figure 1.16; that is, (LM), (M), (SP), and (QSP) are all equivalent.
The classical Lagrangian problem is concerned with the problem of "equality
constraints," that is, finding $\hat{x} \in X$, an open subset of $R^n$, which maximizes $f(x)$ subject
to $g_j(x) = 0$, $j = 1, 2, \ldots, m$, where $f$ and the $g_j$'s are real-valued continuously
differentiable functions on $X$ in $R^n$. As we remarked earlier, these constraints can
be converted into $g_j(x) \geq 0$ and $-g_j(x) \geq 0$, $j = 1, 2, \ldots, m$. Clearly if the constraints
are all linear, then the constraint qualification is satisfied as a result of
condition (ii) of the A-H-U theorem. However, suppose that the $g_j$'s are not
linear. Condition (i) cannot be applied, for if $g_j$ is convex, then $-g_j$ is concave.
Condition (iii) cannot be applied either, for if $g_j(x) > 0$ for some $x$, we cannot
have $-g_j(x) > 0$. The rank condition (v) may not seem applicable, for the rank
of the $(2m \times n)$ matrix
\[ \begin{bmatrix} \partial g_j / \partial x_i \\ -\partial g_j / \partial x_i \end{bmatrix} \]
is certainly not equal to $2m$, for any $x \in X$, where $2m$ is apparently the number of
effective constraints. However, this is not correct reasoning. We have to note
that the constraints $g_j(x) \geq 0$ and $-g_j(x) \geq 0$ are not distinct constraints when the
values of $x$ are such that $g_j(x) = 0$. Hence, although there are $2m$ constraints in
appearance, the number of distinct constraints for $x$ with $g_j(x) = 0$ is $m$, and the
rank condition for the problem should be stated that the rank of the $(m \times n)$ matrix
$[\partial g_j / \partial x_i]$ should be equal to $m$. Hence we obtain the following classical theorem,
which was originally conceived by Lagrange and later developed by Carathéodory
[2] and Bliss.⁶

[Figure 1.16. Characterization of $\hat{x}$ for a "Nice" Case.]

Theorem 1.D.6. (Lagrange): Suppose that $\hat{x}$ satisfies (LM).7 Suppose also that
the rank of the $(m \times n)$ matrix $[\partial g_j / \partial x_i]$ evaluated at $\hat{x}$ is equal to $m$, where it is
assumed that $m < n$. Then (QSP) holds for this $\hat{x}$.
REMARK: The Lagrangian for this problem is written as

$\Phi = f(x) + \sum_{j=1}^{m} (\mu_j - \nu_j) g_j(x) = f(x) + \sum_{j=1}^{m} \lambda_j g_j(x)$

where $\lambda_j \equiv \mu_j - \nu_j$, $j = 1, 2, \ldots, m$. Although in (QSP) we may require
$\mu_j \geq 0$ and $\nu_j \geq 0$, we cannot require $\lambda_j \geq 0$ ($j = 1, 2, \ldots, m$); that is, $\lambda_j$
can be either positive or negative.

EXAMPLE: Consider the problem of choosing $(x_1, x_2) \in R^2$ to

Maximize: $x_1 x_2$
Subject to: $g(x) = x_1^2 + x_2^2 - 1 = 0$

Using a diagrammatical representation of the problem, we can easily obtain
the solution of the problem as $\hat{x}_1 = \hat{x}_2 = \pm 1/\sqrt{2}$. We now obtain this as
an application of the above theorem. Define the Lagrangian by $\Phi \equiv x_1 x_2 + \lambda (x_1^2 + x_2^2 - 1)$.
The (QSP) conditions can be written as

$\partial \Phi / \partial x_1 = \hat{x}_2 + 2 \hat{\lambda} \hat{x}_1 = 0$, $\partial \Phi / \partial x_2 = \hat{x}_1 + 2 \hat{\lambda} \hat{x}_2 = 0$, and $\hat{x}_1^2 + \hat{x}_2^2 - 1 = 0$
From these three equations we can easily obtain $\hat{x}_1 = \hat{x}_2 = \pm 1/\sqrt{2}$ and $\hat{\lambda}$.
The rank condition which validates the above computation is that the rank
of $(\partial g / \partial x_1, \partial g / \partial x_2) = (2x_1, 2x_2)$ must be equal to one at $(\hat{x}_1, \hat{x}_2)$. It is
obvious that this condition holds.
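As a quick numerical check of the example above, the following sketch evaluates the three first-order equations at the candidate points. The multiplier value $-1/2$ is not taken from the text; it is an assumption obtained here by substituting $\hat{x}_1 = \hat{x}_2$ into the first equation, and the function name is purely illustrative.

```python
import math

def qsp_residuals(x1, x2, lam):
    """Residuals of the first-order conditions for
    maximize x1*x2 subject to x1**2 + x2**2 - 1 = 0,
    using the Lagrangian x1*x2 + lam*(x1**2 + x2**2 - 1)."""
    d_x1 = x2 + 2.0 * lam * x1        # derivative with respect to x1
    d_x2 = x1 + 2.0 * lam * x2        # derivative with respect to x2
    constraint = x1**2 + x2**2 - 1.0  # equality constraint g(x)
    return d_x1, d_x2, constraint

s = 1.0 / math.sqrt(2.0)
lam_hat = -0.5  # assumed value, derived from x2 + 2*lam*x1 = 0 with x1 = x2

# Both sign choices of the solution should give (near) zero residuals.
for sign in (1.0, -1.0):
    print(sign, qsp_residuals(sign * s, sign * s, lam_hat))
```

All three residuals should vanish up to floating-point rounding at both points, which is what the rank condition is meant to guarantee.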
The above consideration of the case with the equality constraints enables
us to extend our analysis to the case in which both equality and inequality
constraints are present (the case of mixed constraints). In other words, consider
the problem of finding $x$ so as to
Maximize: $f(x)$
$x \in X$

Subject to: $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$

$h_k(x) = 0$, $k = 1, 2, \ldots, l$
It is assumed that $f$ and the $g_j$'s and $h_k$'s are continuously differentiable. Let $\hat{x}$ be a
solution of this problem and let $E$ be the set of indices such that $g_j(\hat{x}) = 0$ (that
is, the effective constraints). Let $m_e$ be the number of elements in $E$. Then we
can obtain the following theorem, assuming $m_e + l < n$.

Theorem 1.D.7: Suppose that $\hat{x}$ satisfies (LM).8 Suppose also that the rank of the
$(m_e + l) \times n$ matrix

$\begin{bmatrix} \partial g_j / \partial x_i \\ \partial h_k / \partial x_i \end{bmatrix}$, where $j \in E$ and $k = 1, 2, \ldots, l$

evaluated at $\hat{x}$, is equal to $(m_e + l)$. Then (QSP) holds for this $\hat{x}$, where the Lagrangian
for the above theorem is defined as
$\Phi \equiv f(x) + \lambda \cdot g(x) + \mu \cdot h(x)$
Here (QSP) requires $\hat{\lambda} \geq 0$, while $\hat{\mu}$ can be either positive or negative.
REMARK: For a further and more rigorous consideration of mixed
constraints, see Mangasarian [11], chapter 11.
EXAMPLE: Consider the problem of choosing $(x_1, x_2) \in R^2$ to
Maximize: $x_1 x_2$
Subject to: $g(x) = x_1 + 8x_2 - 4 \geq 0$
$h(x) = x_1^2 + x_2^2 - 1 = 0$
Using the diagrammatical representation of the problem, the solution of
this problem can be obtained easily as $\hat{x}_1 = \hat{x}_2 = 1/\sqrt{2}$. The Lagrangian
of this problem is defined as $\Phi \equiv x_1 x_2 + \lambda_1 (x_1 + 8x_2 - 4) + \lambda_2 (x_1^2 + x_2^2 - 1)$,
and the (QSP) conditions are written out as

$\partial \Phi / \partial x_1 = \hat{x}_2 + \hat{\lambda}_1 + 2 \hat{\lambda}_2 \hat{x}_1 = 0$

$\partial \Phi / \partial x_2 = \hat{x}_1 + 8 \hat{\lambda}_1 + 2 \hat{\lambda}_2 \hat{x}_2 = 0$

$\hat{x}_1^2 + \hat{x}_2^2 - 1 = 0$

$\hat{\lambda}_1 (\hat{x}_1 + 8 \hat{x}_2 - 4) = 0$

$\hat{\lambda}_1 \geq 0$
Solving this, we can also obtain $\hat{x}_1 = \hat{x}_2 = 1/\sqrt{2}$ as well as $\hat{\lambda}_1 = 0$ and $\hat{\lambda}_2 = -1/2$.
Note that the Lagrangian multiplier $\hat{\lambda}_2$ which corresponds to the equality
constraint is negative here. It can easily be seen that the constraint
$g(x) \geq 0$ is ineffective at $(\hat{x}_1, \hat{x}_2)$, that is, $1/\sqrt{2} + 8/\sqrt{2} - 4 > 0$. Hence the
rank condition for this problem is that the rank of $(\partial h / \partial x_1, \partial h / \partial x_2) = (2x_1, 2x_2)$
be equal to 1 at $x_1 = x_2 = 1/\sqrt{2}$, which is trivially satisfied.
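The mixed-constraint example can be checked numerically in the same spirit. The sketch below is only an illustration under stated assumptions: the function name and the tolerance are hypothetical, and the conditions coded are the ones written out in the example (including feasibility of the inequality constraint).

```python
import math

def mixed_qsp_check(x1, x2, lam1, lam2, tol=1e-12):
    """Check the (QSP) conditions for
    maximize x1*x2 s.t. g = x1 + 8*x2 - 4 >= 0, h = x1**2 + x2**2 - 1 = 0."""
    g = x1 + 8.0 * x2 - 4.0
    h = x1**2 + x2**2 - 1.0
    d_x1 = x2 + lam1 + 2.0 * lam2 * x1        # derivative of the Lagrangian in x1
    d_x2 = x1 + 8.0 * lam1 + 2.0 * lam2 * x2  # derivative of the Lagrangian in x2
    return (abs(d_x1) <= tol and abs(d_x2) <= tol
            and abs(h) <= tol              # equality constraint holds
            and abs(lam1 * g) <= tol       # complementary slackness
            and g >= 0.0 and lam1 >= 0.0)  # feasibility and sign of lam1

s = 1.0 / math.sqrt(2.0)
print(mixed_qsp_check(s, s, 0.0, -0.5))    # candidate from the example
print(mixed_qsp_check(-s, -s, 0.0, -0.5))  # violates g(x) >= 0
```

The second call shows why the inequality matters here: the point $(-1/\sqrt{2}, -1/\sqrt{2})$ satisfies the stationarity equations but is excluded by $g(x) \geq 0$.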

POSTSCRIPT: This section and Section B constitute the main characterizations

of optimal solutions. Our approach here is via the theory of convex sets utilizing
the separation theorems and the Minkowski-Farkas lemma. This approach was
motivated by the development of linear programming. Historically there is
another route, that is, via calculus. The classical Lagrangian theorem is concerned
with the case in which all the constraints are equalities. Karush [9], much
prior to Kuhn and Tucker [10], considered the inequality constraints, reducing
them to the equality constraints by adding to, or subtracting from, each inequality
the square of a real number.9 For example, the constraints $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$,
can be converted into the following equality constraints:

$g_j(x) - \delta_j^2 = 0$, where $\delta_j^2 \geq 0$, $j = 1, 2, \ldots, m$

Unfortunately, this work has been unduly ignored. El-Hodiri [3] rediscovered
Karush [9] and put it in a better perspective. The essential tool in the calculus
approach is the implicit function theorem. Although several approaches are
possible in this calculus route, three expositions seem to be most useful, that is,
Hestenes [6], Mangasarian [11], and El-Hodiri, M. A. (Constrained Extrema:
Introduction to the Differentiable Case with Economic Applications, Berlin,
Springer-Verlag, 1971).
Next to Karush [9] , but still prior to Kuhn and Tucker [10] , Fritz John
([8], 1948) considered the nonlinear programming problem with inequality
constraints. He assumed no qualification except that all functions are continuously
differentiable. Here the Lagrangian expression looks like $(\lambda_0 f + \lambda \cdot g)$ instead
of $(f + \lambda \cdot g)$, and $\lambda_0$ can be 0 in the first-order conditions. The Kuhn-Tucker
constraint qualification ([10], 1951) amounts to providing a condition which
guarantees $\lambda_0 > 0$ (that is, a normality condition). Arrow-Hurwicz-Uzawa ([1],
1961) then provided a weaker normality condition than the Kuhn-Tucker constraint
qualification, and also provided us with some very useful constraint qualifications
that imply their normality condition. Their normality condition was
shown to be the weakest possible if the constraint set is convex. Since then there
have been some efforts to weaken the normality condition. In 1967, Abadie
introduced a new normality condition which neither implies nor is implied by
the Arrow-Hurwicz-Uzawa condition. Evans (1969) then provided us with a
normality condition that was weaker than any of those mentioned above. The
normality condition for the case in which both inequality and equality constraints
are present was considered by Mangasarian and Fromovitz in 1967 and by
Mangasarian ([ 11], chap. 11). They presented a normality condition for the case
of mixed constraints. A further investigation for the case of mixed constraints was
done by Gould and Tolle (1971).10

FOOTNOTES

1. In order that the expression $\hat{f}_x + \hat{\lambda} \cdot \hat{g}_x$ be meaningful, $\hat{x}$ must be an interior
point in $X$. That is, there exists an open ball about $\hat{x}$ which is contained in $X$.
Intuitively, $\hat{x}$ is surrounded by points in $X$. If $X$ is an open set, this condition is satisfied.
In many nonlinear programming problems, it is simply assumed that $X = R^n$.
2. This is often known as the nonnegative quasi-saddle-point condition.
3. A Banach space is a normed space that is "complete" as a metric space induced
by the norm. A metric space $X$ is called complete if every sequence $\{x^q\}$ in $X$ which
has the following property is convergent. For each $\varepsilon > 0$, there exists a positive
integer $\bar{q}$ such that $d(x^q, x^{q'}) < \varepsilon$ for all $q, q' \geq \bar{q}$. (Any sequence satisfying this
property is called a Cauchy sequence.) The set of all the continuous functions defined
on a closed interval in R is an example of a Banach space, when the norm is properly
defined.
4. These constraints $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, must be distinct. For example, the
two constraints $x_1 + x_2 \leq 1$ and $2x_1 + 2x_2 \leq 2$ are not distinct. More formally,
the functions gj, j = 1, 2, . . ., m, must be linearly independent. It is easy to see
that this rank condition will not hold under any circumstances if we allow non-
distinct constraints.
5. The maximum number of linearly independent rows of any (possibly rectangular)
matrix is equal to the maximum number of its linearly independent columns. This
is often known as the rank theorem. See D. Gale, The Theory of Linear Economic
Models, New York, McGraw-Hill, 1960, pp. 36-37. Here it is assumed that the
number of effective constraints is less than n.
6. Caratheodory [2] stated and proved the theorem in which the normality condition
is not involved (thus the coefficient for f, A0, appears in the Lagrangian). See his
theorem 2, p. 177. The proof is a simple application of the implicit function theorem.
See also G. A. Bliss, Lectures on the Calculus of Variations, Chicago, Ill., University
of Chicago Press, 1946.
7. The definition of (LM) must be modified to cope with the equality constraint. Such
a modification must be obvious, that is, $g_j(x) \geq 0$ in the definition of (LM) is
replaced by $g_j(x) = 0$.
8. Again, $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, in the definition of (LM) should be replaced by
$g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, and $h_k(x) = 0$, $k = 1, 2, \ldots, l$.
9. This is obviously motivated by the similar procedure in the calculus of variations
by Valentine. See F. A. Valentine, "The Problem of Lagrange in the Calculus of
Variations with Inequalities as Added Side Conditions," in Contributions to the
Calculus of Variations (1933-1937), Chicago, Ill., University of Chicago Press,
1937.
10. The works cited in this paragraph are as follows: J. Abadie, "On the Kuhn-Tucker
Theorem," in Nonlinear Programming, ed. by J. Abadie, New York, Wiley, 1967;
J. Evans, "A Note on Constraint Qualifications," Report 6917, Center for Mathe-
matical Studies in Business and Economics, University of Chicago, June 1969;
O. L. Mangasarian, and S. Fromovitz, "The Fritz John Necessary Optimality Con-
ditions in the Presence of Equality and Inequality Constraints," Journal of Mathe-
matical Analysis and Applications, 17, January 1967; F. J. Gould, and J. W. Tolle,
"A Necessary and Sufficient Qualification for Constraint Optimization," SIAM
Journal on Applied Mathematics, 20, March 1971.

REFERENCES

1. Arrow, K. J., Hurwicz, L., and Uzawa, H., "Constraint Qualifications in Maximiza-
tion Problems," Naval Research Logistics Quarterly, vol. 8, no. 2, June 1961.
2. Caratheodory, C., Calculus of Variations and Partial Differential Equations of the
First Order, Part II, Calculus of Variations, San Francisco, Holden Day, 1967 (trans-
lated from the German original published in 1935).
3. El-Hodiri, M., "The Karush Characterization of Constrained Extrema of Functions
of a Finite Number of Variables," Ministry of Treasury UAR, Research Memoranda,
series A, no. 3, July 1967.
4. Fan, Ky, "On Systems of Linear Inequalities," in Linear Inequalities and Related
Systems, ed. by Kuhn and Tucker, Princeton, N.J., Princeton University Press,
1956.
5. Hadley, G., Nonlinear and Dynamic Programming, Reading, Mass., Addison-
Wesley, 1964, esp. chap. 6.
6. Hestenes, M. R., Calculus of Variations and Optimal Control Theory, New York,
Wiley, 1966, esp. chap. 1.
7. Hurwicz, L., "Programming in Linear Spaces," in Studies in Linear and Non-
linear Programming, ed. by Arrow, Hurwicz, and Uzawa, Stanford, Calif., Stanford
University Press, 1958.
8. John, F., "Extremum Problems with Inequalities as Subsidiary Conditions,"
Studies and Essays, Courant Anniversary Volume, New York, Interscience, 1948.
9. Karush, W., Minima of Functions of Several Variables with Inequalities as Side
Conditions, Master's Thesis, University of Chicago, 1939.
10. Kuhn, H. W., and Tucker, A. W., "Non-linear Programming," Proceedings of the
Second Berkeley Symposium on Mathematical Statistics and Probability, ed. by
Neyman, Berkeley, Calif., University of California Press, 1951, pp. 481-492.
11. Mangasarian, O. L., Nonlinear Programming, New York, McGraw-Hill, 1969.
12. Moore, J. C., "Some Extensions of the Kuhn-Tucker Results in Concave Program-
ming," in Papers in Quantitative Economics, ed. by J. P. Quirk and A. Zarley, Lawrence,
Kansas, University of Kansas Press, 1969.
13. Ritter, K., "Duality for Nonlinear Programming in a Banach Space," SIAM Journal
on Applied Mathematics, vol. 15, no. 2, March 1967.
14. Samuelson, P. A., Foundations of Economic Analysis, Cambridge, Mass., Harvard
University Press, 1947, esp. appendix A.

Appendix to Section D: A Further Note on the Arrow-Hurwicz-Uzawa Theorem1

In Section D, we introduced a very useful result due to Arrow, Hurwicz, and

Uzawa (Theorem 1.D.4) that provided five conditions, any one of which replaces
the (KTCQ). In other words, if any of these five conditions holds, the (LM) condi-
tion implies the (QSP) condition. The beauty of these conditions is that it is much
easier to apply them than the (KTCQ). In proving this theorem, Arrow, Hurwicz,
and Uzawa [2] first proposed the "condition W," as they called it, which replaces
(KTCQ) but which is "slightly weaker" than (KTCQ). Then they prove that
any one of the above five conditions implies this condition W.
Later in a lecture at Minnesota [5], Hurwicz simplified the proof of the
Arrow-Hurwicz-Uzawa theorem considerably.2 The purpose of this Appendix is to
provide an expository account of this new proof of the Arrow-Hurwicz-Uzawa
theorem.
We consider the following nonlinear programming problem.
Maximize: $f(x)$
Subject to: $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$
$x \in X$
We assume that $X$ is a nonempty convex open set in $R^n$. Let $\hat{x}$ be a point in $X$
which achieves a local maximum of $f(x)$, subject to the constraints [that is, the
condition (LM) is realized by $\hat{x}$]. Let $E$ be the set of indices for the effective
constraints at $\hat{x}$; that is, $g_j(\hat{x}) = 0$ if $j \in E$, $E \subset \{1, 2, \ldots, m\}$. Let $J$ be the set
of indices for the convex effective constraints and let $J'$ be the set of indices for
the nonconvex effective constraints. In other words, $j \in J$ means $j \in E$ and $g_j(x)$
is convex on $X$, and $j \in J'$ means $j \in E$ and $g_j(x)$ is not convex on $X$. Clearly
$E = J \cup J'$. We now define the following condition due to Arrow, Hurwicz, and
Uzawa [2] (p. 183, theorem 3), which we call condition (AHU) or the
Arrow-Hurwicz-Uzawa constraint qualification.
(AHU) There exists an $h^* \in R^n$ such that
$g_j'(\hat{x}) \cdot h^* \geq 0$ for all $j \in J$
and $g_j'(\hat{x}) \cdot h^* > 0$ for all $j \in J'$
This condition plays an important role in Arrow-Hurwicz-Uzawa [2] and will
play a crucial role in the following proof. We are now ready to state the main
theorem of this appendix.
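To make condition (AHU) concrete, the following minimal sketch tests a candidate direction against the gradients of the effective constraints. Everything here is an illustrative assumption: the function names are hypothetical, and the data correspond to an invented pair of constraints, $g_1(x) = x_1$ (linear, hence convex) and $g_2(x) = x_2 - x_1^2$ (concave but not convex), both effective at the origin.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def satisfies_ahu(convex_grads, nonconvex_grads, h):
    """Return True if h satisfies condition (AHU):
    g_j' . h >= 0 for convex effective constraints (j in J) and
    g_j' . h >  0 for nonconvex effective constraints (j in J')."""
    weak = all(dot(g, h) >= 0 for g in convex_grads)
    strict = all(dot(g, h) > 0 for g in nonconvex_grads)
    return weak and strict

# Hypothetical gradients evaluated at the origin.
convex_grads = [(1.0, 0.0)]     # gradient of g1(x) = x1
nonconvex_grads = [(0.0, 1.0)]  # gradient of g2(x) = x2 - x1**2

print(satisfies_ahu(convex_grads, nonconvex_grads, (0.0, 1.0)))
print(satisfies_ahu(convex_grads, nonconvex_grads, (0.0, 0.0)))
```

The zero direction passes the weak inequality but fails the strict one, which is why $h^* = 0$ is admissible only when every effective constraint is convex.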

Theorem 1.D.8: Let $f, g_1, g_2, \ldots, g_m$ be real-valued differentiable functions defined
on a nonempty open convex set $X$ in $R^n$. Suppose that condition (AHU) is satisfied.
Then (LM) implies (QSP). In other words, if $\hat{x}$ achieves a local maximum of the
problem, there exists a $\hat{\lambda}$ such that the quasi-saddle-point condition (QSP) is
satisfied.3
PROOF (HURWICZ):
(i) Suppose $g_j(\hat{x}) > 0$ for all $j = 1, 2, \ldots, m$ (that is, $E = \emptyset$). It follows
that $f'(\hat{x}) = 0$, for we have in this case the unconstrained maximization
problem. By choosing $\hat{\lambda}_1 = \hat{\lambda}_2 = \cdots = \hat{\lambda}_m = 0$, the (QSP) condition
is satisfied. Now we concentrate on the case in which $E \neq \emptyset$.
(ii) Suppose $g_j(x)$ is convex (that is, $j \in J$). Then $g_j(\hat{x} + th^*) - g_j(\hat{x}) \geq g_j'(\hat{x}) \cdot (th^*)$
for all $t \in R$, $t \geq 0$, and any $h^* \in R^n$ such that $(\hat{x} + th^*) \in X$.
But by condition (AHU), we can choose $h^*$ such that $g_j'(\hat{x}) \cdot h^* \geq 0$.
Hence $g_j(\hat{x} + th^*) - g_j(\hat{x}) \geq 0$ for all $t \geq 0$, $t \in R$, such that $\hat{x} + th^* \in X$.
(iii) Suppose $g_j(x)$ is not convex (that is, $j \in J'$). By the definition of
differentiation, we have
$g_j(x) - g_j(\hat{x}) = g_j'(\hat{x}) \cdot (x - \hat{x}) + o(\| x - \hat{x} \|)$ for all $x \in X$
Let $x(t) \equiv \hat{x} + th^*$ such that $x(t) \in X$ and $t \geq 0$, $t \in R$. Then
$g_j[x(t)] - g_j(\hat{x}) = g_j'(\hat{x}) \cdot (th^*) + o(\| t \|)$. But by condition (AHU), we can choose $h^*$
such that $g_j'(\hat{x}) \cdot h^* > 0$. Hence $g_j[x(t)] - g_j(\hat{x}) > 0$ for sufficiently
small $t$.4 Hence choosing $t$ sufficiently small, say $0 < t < \bar{t}$, we can have
$g_j[x(t)] - g_j(\hat{x}) > 0$ for all $j \in J'$.
(iv) Let $x(t) = \hat{x} + th^*$, where $0 < t < \bar{t}$ with $x(t) \in X$. Then combining (ii)
and (iii) we have $g_j[x(t)] - g_j(\hat{x}) \geq 0$ for all $j \in E$, or $g_j[x(t)] \geq 0$
for all $j \in E$. Moreover, $g_j[x(t)] > 0$ for all $j \notin E$, for sufficiently small
$t$, say $t < \tilde{t}$, owing to the fact that $g_j(\hat{x}) > 0$ for all $j \notin E$ and the continuity
of the $g_j$'s. Thus $x(t) \in C$, where $C$ is the constraint set $\{x \in X: g_j(x) \geq 0,
j = 1, 2, \ldots, m\}$, if $t$ is sufficiently small (that is, $t < \min \{\bar{t}, \tilde{t}\}$).
(v) Now define $\Psi(t) \equiv f[x(t)] - f(\hat{x})$ for $0 \leq t < t_0$, where $t_0$ is sufficiently small.
Then $\Psi(t) \leq 0$, because, by assumption, $x(t) \in C$ and $\hat{x}$ achieves a local
maximum of $f(x)$ subject to $x \in C$. Let $\Psi'_+(0)$ be the right-hand derivative
of $\Psi$ at $t = 0$.5 Then by the chain rule, we have $\Psi'_+(0) = f'(\hat{x}) \cdot h^*$.
Note that $\Psi(t) \leq 0 = \Psi(0)$, $0 \leq t < t_0$, which implies that $\Psi(t)$ is
nonincreasing at $t = 0$, or $\Psi'_+(0) \leq 0$. Hence $f'(\hat{x}) \cdot h^* \leq 0$. Since the choice of
$h^*$ can be arbitrary as long as condition (AHU) is satisfied, we have thus
established $f'(\hat{x}) \cdot h^* \leq 0$ for any $h^*$ for which condition (AHU) is satisfied.
Note that condition (AHU) is a fortiori satisfied if there exists an $h \in R^n$
such that $g_j'(\hat{x}) \cdot h > 0$ for all $j \in E$. Hence for any $h \in R^n$ for which
$g_j'(\hat{x}) \cdot h > 0$ for all $j \in E$, we have $f'(\hat{x}) \cdot h \leq 0$.
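(vi) Now consider any $h$ satisfying $g_j'(\hat{x}) \cdot h \geq 0$ for all $j \in E$. Then, if condition
(AHU) is satisfied for $h^*$, the vector $h + th^*$ also satisfies condition (AHU)
for any $t > 0$, $t \in R$; that is, $g_j'(\hat{x}) \cdot (h + th^*) \geq 0$ for all $j \in J$ and
$g_j'(\hat{x}) \cdot (h + th^*) > 0$ for all $j \in J'$. Then as a result of the conclusion obtained in
(v), we have
$f'(\hat{x}) \cdot (h + th^*) \leq 0$
Take the limit as $t \to 0$. Then, owing to the continuity of the inner
product, we obtain
$f'(\hat{x}) \cdot h \leq 0$
Thus we have established that $f'(\hat{x}) \cdot h \leq 0$ for all $h$ such that
$g_j'(\hat{x}) \cdot h \geq 0$, $j \in E$
(vii) Hence, from the Minkowski-Farkas lemma (Theorem 0.B.4), there exist
$\hat{\lambda}_j$'s, all $\geq 0$, such that
$-f'(\hat{x}) = \sum_{j \in E} \hat{\lambda}_j g_j'(\hat{x})$
or
$f'(\hat{x}) + \sum_{j \in E} \hat{\lambda}_j g_j'(\hat{x}) = 0$

Choose $\hat{\lambda}_j = 0$ if $j \notin E$. Then

$f'(\hat{x}) + \sum_{j=1}^{m} \hat{\lambda}_j g_j'(\hat{x}) = 0$

That $g_j(\hat{x}) \geq 0$ for all $j$ follows immediately from condition (LM). Since
$g_j(\hat{x}) = 0$ for all $j \in E$ and $\hat{\lambda}_j = 0$ for $j \notin E$, we obtain

$\sum_{j=1}^{m} \hat{\lambda}_j g_j(\hat{x}) = 0$

(Q.E.D.)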
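Step (vii) of the proof asserts the existence of nonnegative multipliers; in small cases they can be computed directly. The sketch below does this for an invented problem that is not in the text: maximize $x_1 + 2x_2$ subject to $g_1 = 1 - x_1 \geq 0$ and $g_2 = 1 - x_2 \geq 0$, whose maximum is at $(1, 1)$ with both constraints effective. The helper name and the use of Cramer's rule are illustrative choices, not the method of the proof.

```python
def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve [[a11, a12], [a21, a22]] @ (u, v) = (b1, b2) by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the two gradients are linearly dependent")
    u = (b1 * a22 - a12 * b2) / det
    v = (a11 * b2 - b1 * a21) / det
    return u, v

# Hypothetical data at x_hat = (1, 1).
f_grad = (1.0, 2.0)    # gradient of f(x) = x1 + 2*x2
g1_grad = (-1.0, 0.0)  # gradient of g1(x) = 1 - x1
g2_grad = (0.0, -1.0)  # gradient of g2(x) = 1 - x2

# Solve lam1*g1' + lam2*g2' = -f' component by component.
lam1, lam2 = solve_2x2(g1_grad[0], g2_grad[0],
                       g1_grad[1], g2_grad[1],
                       -f_grad[0], -f_grad[1])
print(lam1, lam2)
```

Both multipliers come out nonnegative, as the Minkowski-Farkas lemma guarantees whenever $f'(\hat{x}) \cdot h \leq 0$ for every $h$ with $g_j'(\hat{x}) \cdot h \geq 0$, $j \in E$.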
REMARK: Just as (KTCQ), the above condition (AHU) is again the normal-
ity condition. Notice also that it is a qualification for the constraints (that is,
nothing is mentioned about the maximand function f ).
We are now ready to derive the conclusion of the Arrow-Hurwicz-Uzawa
theorem. In particular, we want to show that any one of the five conditions in the
A-H-U theorem implies condition (AHU). This part of the A-H-U theorem is
really a corollary of the above theorem and has already been established in the
original paper by Arrow, Hurwicz, and Uzawa (see the corollaries of their theorem
3 in [2] ).
First, note that if $g_j(x)$ is convex for all $j \in E$, then condition (AHU) is trivially
satisfied. This can be seen easily by choosing $h^* = 0$ in the statement of condition
(AHU). Clearly if either of the following two conditions is satisfied, then $g_j(x)$
is convex for all $j \in E$.
(i) The function $g_j(x)$ is convex for all $j = 1, 2, \ldots, m$.
(ii) The function $g_j(x)$ is linear for all $j = 1, 2, \ldots, m$.

Since every linear function is convex, (ii) is really a special case of (i); however, it
has a powerful implication, for, as remarked before, it implies that, in linear
programming, condition (AHU) is automatically satisfied.
Next we will see that the following modification of the Slater condition
implies condition (AHU):

(iii) The function $g_j(x)$ is concave for $j = 1, 2, \ldots, m$ and there exists an $\bar{x} \in X$ such
that $g_j(\bar{x}) > 0$ for all $j = 1, 2, \ldots, m$.

To see this, first recall the following basic inequality for concave functions.
$g_j'(\hat{x}) \cdot (x - \hat{x}) \geq g_j(x) - g_j(\hat{x})$ for any $x, \hat{x} \in X$
In particular, set $x = \bar{x}$ and let $h = \bar{x} - \hat{x}$. Then, since $g_j(\hat{x}) = 0$ for $j \in E$, we have
$g_j'(\hat{x}) \cdot h \geq g_j(\bar{x}) > 0$ for all $j \in E$
That $g_j(\bar{x}) > 0$ (for all $j$) follows from the above condition (iii). Thus condition
(AHU) is satisfied if condition (iii) is satisfied. It should be clear that, in view of (ii),
condition (iii) can be slightly weakened as in (iii').

(iii') The functions $g_j(x)$, $j = 1, 2, \ldots, m$, are all concave and there exists an $\bar{x}$
in $X$ such that $g_j(\bar{x}) \geq 0$ for $j \in E'$ and $g_j(\bar{x}) > 0$, $j \in E''$, where $E'$ is the set
of indices for the effective constraints (at $\hat{x}$) for which the $g_j$'s are linear,
and $E''$ is the set of indices for the effective constraints (at $\hat{x}$) for which the
$g_j$'s are not linear (but concave).
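The argument that the Slater condition yields an (AHU) direction can be traced on a small case. The sketch below uses an assumed concave constraint $g(x) = 1 - x_1^2 - x_2^2$, a boundary point $\hat{x} = (1, 0)$, and the Slater point $\bar{x} = (0, 0)$; none of these come from the text, and the variable names are illustrative.

```python
def g(x):
    """Assumed concave constraint function g(x) = 1 - x1**2 - x2**2."""
    return 1.0 - x[0] ** 2 - x[1] ** 2

def g_grad(x):
    """Gradient of g."""
    return (-2.0 * x[0], -2.0 * x[1])

x_hat = (1.0, 0.0)  # effective constraint: g(x_hat) = 0
x_bar = (0.0, 0.0)  # Slater point: g(x_bar) = 1 > 0

# Direction h = x_bar - x_hat, as in the argument above.
h = tuple(b - a for a, b in zip(x_hat, x_bar))

# Directional derivative g'(x_hat) . h
slope = sum(d * hi for d, hi in zip(g_grad(x_hat), h))

print(h, slope, g(x_bar) - g(x_hat))
```

Here the directional derivative is at least $g(\bar{x}) - g(\hat{x})$, which is strictly positive, so $h = \bar{x} - \hat{x}$ serves as the $h^*$ required by condition (AHU).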

Next we show that the following rank condition implies condition (AHU):
(iv) The rank of the $m \times n$ matrix $g'(\hat{x})$ (that is, the Jacobian matrix) is equal
to the number of effective constraints at $\hat{x}$.
Let $g_E'(\hat{x})$ be the submatrix of $g'(\hat{x})$ obtained from $g'(\hat{x})$ by deleting the rows
which correspond to the constraints that are not effective at $\hat{x}$ [that is, "$g_j'(\hat{x})$
is a row of $g_E'(\hat{x})$" means $j \in E$]. Let $k$ be the number of effective constraints at $\hat{x}$.
Then, owing to condition (iv), the number of linearly independent rows of the
matrix $g_E'(\hat{x})$ is equal to $k$. Note that owing to an elementary property of matrices,
the rank of matrices cannot exceed the number of columns or rows.6 Hence $k \leq n$
as well as $k \leq m$. Since all the rows of $g_E'(\hat{x})$ are linearly independent, there are $k$
linearly independent columns in $g_E'(\hat{x})$. Without loss of generality we may suppose
that the first $k$ columns of $g_E'(\hat{x})$ are linearly independent. Let $A$ be the $k \times k$ square
matrix obtained from $g_E'(\hat{x})$ by deleting the $(k + 1)$th to the $n$th column (if $k < n$).
Let $u$ be the $k$-vector whose elements are all equal to 1. Since $A$ is a nonsingular
square matrix, there exists a $k$-vector $h$ such that $A \cdot h = u$. Let an $n$-vector $h^*$
be defined such that $h_i^* = h_i$ for $i = 1, 2, \ldots, k$, and $h_i^* = 0$ for $i = k + 1, \ldots, n$.
Then clearly we have $g_E'(\hat{x}) \cdot h^* = u > 0$, or $g_j'(\hat{x}) \cdot h^* > 0$ for $j \in E$. This establishes
condition (AHU). Hence the rank condition (iv) implies condition (AHU).
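The construction of $h^*$ in the rank argument is mechanical enough to sketch in code. The following is a minimal illustration, assuming (as in the text) that the first $k$ columns of the effective Jacobian are linearly independent; the function names and the $2 \times 3$ matrix are hypothetical, and exact fractions are used only to keep the arithmetic transparent.

```python
from fractions import Fraction

def solve_square(a, b):
    """Solve a @ h = b by Gauss-Jordan elimination with exact fractions.
    Assumes a is a nonsingular k x k matrix."""
    k = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(a, b)]
    for col in range(k):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, k) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        # Eliminate this column from every other row.
        for r in range(k):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[k] for row in m]

def ahu_direction(g_e):
    """Build h* from the k x n effective Jacobian g_e:
    solve A h = u on the first k columns, then pad with zeros."""
    k, n = len(g_e), len(g_e[0])
    a = [row[:k] for row in g_e]
    h = solve_square(a, [1] * k)  # u is the vector of ones
    return h + [Fraction(0)] * (n - k)

# Hypothetical effective Jacobian with k = 2 constraints and n = 3 variables.
g_e = [[1, 2, 0],
       [0, 1, 5]]
h_star = ahu_direction(g_e)
products = [sum(Fraction(c) * hi for c, hi in zip(row, h_star)) for row in g_e]
print(h_star, products)
```

Every entry of `products` equals one, so each effective gradient has a strictly positive inner product with $h^*$, which is exactly what condition (AHU) requires.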
There is another condition in the Arrow-Hurwicz-Uzawa theorem which
implies condition (AHU): The constraint set $C$ is convex and has an interior
and $g_j'(\hat{x}) \neq 0$ for every $j \in E$. The proof that this condition implies condition
(AHU) is a little complicated, and hence is omitted. Interested readers are referred
to Arrow, Hurwicz, and Uzawa [2] , p. 184.
Fritz John's famous theorem, originally obtained in 1948 ([4], theorem I,
pp. 188-189),7 is an easy consequence of Theorem 1.D.8.

Theorem 1.D.9 (John): Let $f, g_1, \ldots, g_m$ be real-valued differentiable functions
defined on a nonempty open set $X$ in $R^n$. Suppose that (LM) is satisfied; that is, $\hat{x}$
achieves a local maximum of $f$ subject to $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, and $x \in X$.
Then there exist $\hat{\lambda}_j \geq 0$, $j = 0, 1, 2, \ldots, m$, not vanishing simultaneously, such that
$\hat{\lambda}_0 f'(\hat{x}) + \sum_{j=1}^{m} \hat{\lambda}_j g_j'(\hat{x}) = 0$
PROOF (HURWICZ):
(i) Suppose that, for some $h^* \in R^n$, $g_j'(\hat{x}) \cdot h^* > 0$, $j \in E$. Then condition
(AHU) is satisfied. Hence from the previous theorem, we are guaranteed
the existence of $\hat{\lambda}_j \geq 0$, $j = 0, 1, 2, \ldots, m$, with $\hat{\lambda}_0 = 1$.
(ii) Suppose now that there exists no $h^* \in R^n$ for which
$g_j'(\hat{x}) \cdot h^* > 0$, $j \in E$
Define the set $Z$ by8

$Z \equiv \{z \in R^k : z = g_E'(\hat{x}) \cdot h, h \in R^n\}$
Then $Z$ does not contain any strictly positive element $z > 0$. Let $R^k_+$ be the positive
orthant of $R^k$, that is, $\{z \in R^k: z > 0\}$. Then $R^k_+$ and $Z$ are two disjoint convex
sets. Hence, owing to the Minkowski separation theorem (Theorem 0.B.3), there
exists an $a \in R^k$, $a \neq 0$, such that9

$a \cdot z \geq 0$ for all $z \in R^k_+$

and
$a \cdot z \leq 0$ for all $z \in Z$
From the first relation, it is clear that $a \geq 0$. In the second relation, if $a \cdot z < 0$
for some $z \in Z$, then $a \cdot (-z) > 0$, since $z \in Z$ implies $-z \in Z$. This is a
contradiction. Hence

$a \cdot z = 0$ for all $z \in Z$; that is, $a \cdot [g_E'(\hat{x}) \cdot h] = 0$ for all $h$. Therefore

$a \cdot g_E'(\hat{x}) = 0$

Let $\hat{\lambda}_0 = 0$, $\hat{\lambda}_j = a_j$ if $j \in E$ and $\hat{\lambda}_j = 0$ if $j \notin E$. Then we obtain the desired $\hat{\lambda}$'s.

(Q.E.D.)
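Case (ii) of the proof, where $\hat{\lambda}_0 = 0$, can look abstract, so here is a small sketch of a problem in which it is unavoidable. The problem is an assumption chosen for illustration and does not appear in the text: maximize $f(x) = x$ subject to $g(x) = -x^2 \geq 0$, whose only feasible point is $\hat{x} = 0$.

```python
def f_prime(x):
    """Derivative of the assumed maximand f(x) = x."""
    return 1.0

def g_prime(x):
    """Derivative of the assumed constraint g(x) = -x**2."""
    return -2.0 * x

def john_residual(lam0, lam1, x):
    """Left-hand side of the Fritz John condition lam0*f'(x) + lam1*g'(x)."""
    return lam0 * f_prime(x) + lam1 * g_prime(x)

x_hat = 0.0  # the only feasible point, hence the maximum

print(john_residual(0.0, 1.0, x_hat) == 0.0)  # lam0 = 0, lam1 = 1 works
print(john_residual(1.0, 5.0, x_hat) == 0.0)  # lam0 = 1 fails for any lam1
```

Because $g'(\hat{x}) = 0$, no direction $h^*$ gives $g'(\hat{x}) \cdot h^* > 0$, so condition (AHU) fails and the multiplier on $f$ must vanish; this is the situation the normality conditions are designed to rule out.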

FOOTNOTES

1. I am grateful to Leonid Hurwicz for giving me permission to quote the results and
the derivation from his unpublished paper [5], from which much of the material
in this appendix is borrowed. Needless to say, any possible misunderstanding of
[5], and hence mistakes, are mine.
2. The essence of Hurwicz [5] is to provide a proof without by-passing the use of
(KTCQ).
3. Condition (QSP) says that there exist $\hat{x} \in X$ and $\hat{\lambda} \in R^m$, $\hat{\lambda} \geq 0$, such that
$f'(\hat{x}) + \hat{\lambda} \cdot g'(\hat{x}) = 0$, $g(\hat{x}) \geq 0$, and $\hat{\lambda} \cdot g(\hat{x}) = 0$, where $g = (g_1, \ldots, g_m)$. If the nonnegativity
constraint $x \geq 0$ is made explicit in addition to $g(x) \geq 0$, then (QSP) is modified to
(QSP'): There exist $\hat{x} \in X$, $\hat{x} \geq 0$, $\hat{\lambda} \in R^m$, $\hat{\lambda} \geq 0$ such that $f'(\hat{x}) + \hat{\lambda} \cdot g'(\hat{x}) \leq 0$,
$\hat{x} \cdot [f'(\hat{x}) + \hat{\lambda} \cdot g'(\hat{x})] = 0$, $g(\hat{x}) \geq 0$, and $\hat{\lambda} \cdot g(\hat{x}) = 0$. Recall our discussion on this
point in Section D of this chapter.
4. It should be clear why $g_j'(\hat{x}) \cdot h^* > 0$ (instead of $\geq 0$) is required for $j \in J'$. If
$g_j'(\hat{x}) \cdot h^* = 0$ is allowed for $j \in J'$, then we cannot guarantee that
$g_j[x(t)] - g_j(\hat{x}) \geq 0$, $j \in J'$, for sufficiently small $t$.
5. Note that $\Psi'_+(0)$ exists because $f$ is differentiable in $X$.
6. The "rank" of a (rectangular) matrix is defined as the number of linearly independent
rows. As remarked before it can be shown that it is equal to the number of linearly
independent columns (the rank theorem).
7. When all the constraints are in the equality forms, that is, $g_j(x) = 0$, $j = 1, 2, \ldots, m$,
then the theorem corresponding to Fritz John's theorem is known in the name of
Lagrange and Euler. As remarked before, the proof of such a theorem is provided by
Caratheodory ([3], pp. 176-177, theorem 2). See also theorem 76.1 of G. A. Bliss,
Lectures on the Calculus of Variations, Chicago, Ill., University of Chicago Press, 1946.
8. Recall that $k$ is the number of effective constraints and that $g_E'(\hat{x})$ denotes the
$k \times n$ matrix which is obtained from $g'(\hat{x})$ by deleting the rows which correspond to
the ineffective constraints (ineffective at $\hat{x}$).
9. It should be clear that the separating hyperplane passes through the origin of $R^k$.

REFERENCES
1. Abadie, J., "On the Kuhn-Tucker Theorem," in Nonlinear Programming, ed. by
J. Abadie, New York, Interscience, 1967.
2. Arrow, K. J., Hurwicz, L., and Uzawa, H., "Constraint Qualifications in
Maximization Problems," Naval Research Logistics Quarterly, vol. 8, no. 2, June 1961.
3. Caratheodory, C., Calculus of Variations and Partial Differential Equations of the First
Order, Part II, Calculus of Variations, San Francisco, Holden Day, 1967 (tr. by Robert
Dean from German original, 1935).
4. John, F., "Extremum Problems with Inequalities as Subsidiary Conditions," Studies
and Essays, Courant Anniversary Volume, New York, Interscience, 1948.
5. Hurwicz, L., "LH-Oct. 1966," Lecture Note at the University of Minnesota, October
1966, revised July 2, 1970.

Section E
SOME EXTENSIONS

In this section we extend the nonlinear programming theory established so

far. This extension, set out under three principal topics, will provide us with useful
applications in economic theory.
The first topic is concerned with constrained maximization problems where
the maximand function $f(x)$ and the constraint functions are not necessarily
concave but "quasi-concave." As we will see later, the quasi-concavity of the
consumer's utility function corresponds to the ordinary utility function whose
indifference curves are convex to the origin, and the quasi-concavity of the pro-
duction function allows increasing returns to scale. These observations alone
should be sufficient to motivate a study of "quasi-concave programming."
The second topic is the constrained vector maximum problem. So far we
have assumed that the maximand function f(x) is a real-valued function. We want
to extend this to problems where $f(x)$ is a vector-valued function. The constrained
vector maximum problem will be related to the concept of "efficient point" in
activity analysis and the concept of "Pareto optimum" in welfare economics.
The third topic is the characterization of differentiable concave or quasi-concave
functions in terms of the Hessian matrix. This will clarify the relation
between the theory of nonlinear programming and the ordinary second-order con-
ditions, and it will give a useful method of determining whether or not a particular
function is concave or quasi-concave. At the end, we will discuss the so-called
"second-order (necessary or sufficient) conditions" for an optimum.

a. QUASI-CONCAVE PROGRAMMING

Definition: A real-valued function $f(x)$ defined over a convex set $X$ in $R^n$ is called
quasi-concave if
$f(x) \geq f(x')$ implies $f[tx + (1 - t)x'] \geq f(x')$ for all $x, x' \in X$ and $0 \leq t \leq 1$
REMARK: Therefore, $f(x)$, over a convex set $X$, is a quasi-concave function
if and only if $\{x: x \in X, f(x) \geq a\}$ is a convex set for all $a \in R$.

Definition: A real-valued function $f(x)$ defined over a convex set $X$ in $R^n$ is called
quasi-convex if $-f(x)$ is quasi-concave. The function $f(x)$ is called strictly
quasi-concave if
$f(x) \geq f(x')$ implies $f[tx + (1 - t)x'] > f(x')$ for all $x \neq x' \in X$ and $0 < t < 1$
The function $f(x)$ is called strictly quasi-convex if $-f(x)$ is strictly quasi-concave.
REMARK: Clearly a strictly quasi-concave (-convex) function is always
quasi-concave (-convex), but not vice versa.
We can easily show the following theorem.

Theorem 1.E.1:
(i) Any concave function is also quasi-concave, but the converse does not necessarily
hold. Similarly, any strictly concave function is also strictly quasi-concave, but the
converse does not necessarily hold.
(ii) Any monotone increasing (or decreasing) function is quasi-concave if $X \subset R$.
(iii) Any monotone nondecreasing function of a quasi-concave function is also quasi-
concave.
REMARK: An ordinary utility function whose corresponding indifference
curve is drawn convex to the origin is an example of a quasi-concave func-
tion. If the indifference curve does not contain a linear segment, then the
utility function is strictly quasi-concave. Although a quasi-concave function
is not necessarily concave, a quasi-concave function can be transformed
into a concave function, under a certain regularity condition, by a strict
positive transformation. See Fenchel [6], pp. 115-137. This observation is
interesting, since utility is usually supposed to be an index, so that the utility
function must be invariant under a monotone transformation.
Some examples of a quasi-concave function which is not concave are the
following:
(1) $f(x) = x^2$, where $x \in R$, $x \geq 0$
(2) $Y = L^{\alpha} K^{\beta}$ where $\alpha + \beta > 1$, $\alpha > 0$, $\beta > 0$ ($L \geq 0$, $K \geq 0$)
It can also be shown that any homogeneous function with degree less than or
equal to one is a concave function, if it is a quasi-concave function.
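Example (1) can be probed numerically. The sketch below tests the defining inequalities on a finite grid, using the equivalent form $f[tx + (1 - t)x'] \geq \min\{f(x), f(x')\}$ for quasi-concavity. A grid test can only refute a property, never prove it, and the function names and grid values are illustrative assumptions.

```python
def is_quasi_concave_on_grid(f, points, weights):
    """Check f(t*x + (1-t)*y) >= min(f(x), f(y)) on a finite grid."""
    return all(f(t * x + (1 - t) * y) >= min(f(x), f(y))
               for x in points for y in points for t in weights)

def is_concave_on_grid(f, points, weights):
    """Check f(t*x + (1-t)*y) >= t*f(x) + (1-t)*f(y) on a finite grid."""
    return all(f(t * x + (1 - t) * y) >= t * f(x) + (1 - t) * f(y)
               for x in points for y in points for t in weights)

def square(x):
    """Example (1): f(x) = x**2, considered on x >= 0."""
    return x * x

points = [0.0, 1.0, 2.0, 3.0, 4.0]     # sample of the domain x >= 0
weights = [0.0, 0.25, 0.5, 0.75, 1.0]  # convex-combination weights

print(is_quasi_concave_on_grid(square, points, weights))
print(is_concave_on_grid(square, points, weights))
```

The quasi-concavity test passes because $x^2$ is monotone on $x \geq 0$, while concavity already fails at the midpoint of 0 and 2, where $f(1) = 1 < 2$.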
We are now concerned with the problem of maximizing a quasi-concave
function $f(x)$ over the $n$-dimensional nonnegative orthant $\Omega^n$ subject to the
constraints $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, where the $g_j$'s are all quasi-concave functions
defined on $\Omega^n$. More specifically, we are interested in characterizing the
maximality condition of the problem in terms of the (QSP') condition. The (QSP')
condition for the present problem can be written (as discussed in the remark
following Theorem 1.D.2) as

(QSP') There exists an $(\hat{x}, \hat{\lambda})$ such that

$\hat{f}_x + \hat{\lambda} \cdot \hat{g}_x \leq 0$, $(\hat{f}_x + \hat{\lambda} \cdot \hat{g}_x) \cdot \hat{x} = 0$

$\hat{\lambda} \cdot g(\hat{x}) = 0$, $g(\hat{x}) \geq 0$

$\hat{x} \geq 0$ and $\hat{\lambda} \geq 0$
The maximality condition (M') would be the same as before, that is,
(M') There exists an $\hat{x} \in \Omega^n$ which maximizes $f(x)$ subject to $g_j(x) \geq 0$, $j = 1,
2, \ldots, m$, and $x \geq 0$ with $x \in R^n$.
We now introduce a new concept and state a sufficiency theorem for the
maximum.

Definition: Given the constraint set $C = \{x : x \in \Omega^n, g_j(x) \geq 0, j = 1, 2, \ldots, m\}$,
we call the $i$th coordinate variable $x_i$ a relevant variable if there exists an $x$ in $C$
such that $x_i > 0$.
REMARK: As Arrow and Enthoven ([1], p. 783) explained, it is a variable
"which can take on a positive value without necessarily violating the con-
straints."

Theorem 1.E.2 (Arrow-Enthoven): Let $f, g_1, g_2, \ldots, g_m$ be differentiable, quasi-
concave, real-valued functions of the n-dimensional vector $x$ on $R^n$ with $x \geq 0$. Then
(QSP') implies (M'), provided that one of the following conditions is satisfied.
(i) $\hat{f}_{x_i} < 0$ for at least one variable $x_i$, where $\hat{f}_{x_i}$ is the partial derivative of $f(x)$ with
respect to $x_i$, evaluated at $x = \hat{x}$.
(ii) $\hat{f}_{x_i} > 0$ for some relevant variable $x_i$.
(iii) $\hat{f}_x \neq 0$ and $f(x)$ is twice differentiable in the neighborhood of $\hat{x}$.
(iv) The function $f(x)$ is concave.

PROOF: Omitted. See Arrow and Enthoven [1].

REMARK: In concave programming, we assume that both $f$ and $g_j$ ($j = 1,
2, \ldots, m$) are concave. Condition (iv) in Theorem 1.E.2 is a slight weakening
of this. Note also that if all $g_j$'s are quasi-concave, then the constraint set
$C = \{x : x \in \Omega^n, g_j(x) \geq 0, j = 1, 2, \ldots, m\}$ is a convex set.
REMARK: The requirement that the constraint functions $g_1(x), \ldots, g_m(x)$
be quasi-concave can be replaced by the weaker condition that the constraint
set $C = \{x : x \in \Omega^n, g_j(x) \geq 0, j = 1, 2, \ldots, m\}$ be a convex set. Obviously
$C$ is convex if $g_j$ is quasi-concave for all $j$, but it can also be convex if some
$g_j$'s are not quasi-concave. See Arrow and Enthoven [1], p. 788. Note that
if $f$ and $g_j$, $j = 1, 2, \ldots, m$, are all concave, then all the requirements of the
theorem are satisfied. Thus (QSP') implies (M'). This is nothing but the
result obtained as part of Theorem 1.D.2.
REMARK: If all the variables are relevant (the usual case in economic
theory), then (i) and (ii) of Theorem 1.E.2 simply reduce to $\hat{f}_x \neq 0$.
Referring again to Arrow and Enthoven [1], we state the following theorem,
which really corresponds to the Arrow-Hurwicz-Uzawa theorem [Theorem 1.D.4,
conditions (iii) and (iv)]. The theorem is concerned with a necessary condition for
the maximum.

Theorem 1.E.3: Let $g_j(x)$, $j = 1, 2, \ldots, m$, be differentiable quasi-concave real-
valued functions. Suppose that there exists an $\bar{x} \geq 0$ such that $g_j(\bar{x}) > 0$ for all $j$. Then
(M') implies (QSP'), provided that either of the following conditions is satisfied:
(i) The functions $g_j(x)$ are concave for all $j$.
(ii) The derivatives $g_j'(x) \neq 0$ for all $j$ and for all $x \in C$.
REMARK: We must note a very unpleasant fact about quasi-concave func-
tions. As we mentioned before, any nonnegative linear combination of
concave functions is also concave. But a nonnegative linear combination
of quasi-concave functions is not necessarily quasi-concave. As we will see
later, this will restrict the applicability of quasi-concave programming. (We
call the constrained maximization problem quasi-concave programming if the
maximand function and the constraint functions are all quasi-concave).
Finally, we should mention the concept of "explicit" quasi-concavity (or
quasi-convexity).¹

Definition: A real-valued function $f(x)$ defined over a convex set $X$ in $R^n$ is called
explicitly quasi-concave if it is quasi-concave and if

\[ f(x) > f(x') \ \text{implies} \ f[tx + (1 - t)x'] > f(x') \]

for all $x, x' \in X$ (with $x \neq x'$) and for all $t$ with $0 < t < 1$.
REMARK: The two properties required in the above definition are inde-
pendent. For example, consider $f(x)$, $x \in R$, defined by
$f(x) = -1$ if $x = 0$, and $f(x) = 0$ if $x \neq 0$
Then $f$ is not quasi-concave, but it satisfies the second property required
in the above definition. However, if $f$ is continuous, the second property
implies the quasi-concavity of $f$, so that quasi-concavity is superfluous in the
above definition. The proof is easy and is left to the interested reader.
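The independence of the two properties can be seen on the example just given. The sketch below evaluates both conditions on a handful of sample points; the sample points and weights are assumptions chosen for illustration, and a finite check of this kind does not replace the argument in the text.

```python
# f(x) = -1 if x == 0, and 0 otherwise (the example in the remark above).
def f(x):
    return -1.0 if x == 0 else 0.0

points = [-2.0, -1.0, 0.0, 1.0, 2.0]   # illustrative sample points
ts = [0.25, 0.5, 0.75]                 # interior weights, 0 < t < 1

# Quasi-concavity requires f(t*x + (1-t)*y) >= min(f(x), f(y)).
quasi_violations = [
    (x, y, t)
    for x in points for y in points for t in ts
    if f(t * x + (1 - t) * y) < min(f(x), f(y))
]

# Second property: f(x) > f(y) implies f(t*x + (1-t)*y) > f(y).
second_violations = [
    (x, y, t)
    for x in points for y in points for t in ts
    if x != y and f(x) > f(y) and not f(t * x + (1 - t) * y) > f(y)
]
```

The triple (-1.0, 1.0, 0.5) appears among the quasi-concavity violations, because the midpoint is 0 where the function drops to -1, while the list of violations of the second property stays empty.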
REMARK: Note that every strictly quasi-concave function is explicitly
quasi-concave, and that every explicitly quasi-concave function is quasi-
concave.

Definition: A real-valued function $f(x)$ defined over a convex set $X$ in $R^n$ is
called explicitly quasi-convex if $-f(x)$ is explicitly quasi-concave.

REMARK: The following very useful propositions can easily be proved:

(i) If $f(x)$ is a concave (resp. convex) function defined on the convex set $X$ in $R^n$,
then $f$ is explicitly quasi-concave (resp. explicitly quasi-convex) in $X$.²
(ii) Let $f(x)$ be an explicitly quasi-concave (resp. explicitly quasi-convex)
function on a convex set $X$ in $R^n$. Then every local maximum (resp. local
minimum) of $f$ in the constraint set $C$, which is convex, is also a global
maximum (resp. global minimum).³
(iii) Let $f(x)$ be strictly quasi-concave on a convex set $X$ in $R^n$. Then if $\hat{x}$ achieves
a local maximum of $f$ in the constraint set $C$ and if $C$ is a convex set, then
it achieves a unique global maximum over $C$.⁴

b. THE VECTOR MAXIMUM PROBLEM

So far we have been concerned with the problem of maximizing a certain
real-valued function $f(x)$ subject to certain constraints. Here we are concerned
with the problem of maximization when $f(x)$ is a vector-valued function.

Definition: Let $f_1(x), f_2(x), \ldots, f_k(x)$ and $g_1(x), g_2(x), \ldots, g_m(x)$ be real-valued
functions defined on $X$ in $R^n$. We say that $\hat{x}$ in $X$ gives a vector (global) maximum
of $f(x) \equiv [f_1(x), f_2(x), \ldots, f_k(x)]$ subject to $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$, if the
following conditions hold:
(i) $g_j(\hat{x}) \geq 0$, $j = 1, 2, \ldots, m$, and $\hat{x} \in X$.
(ii) There exists no $\bar{x}$ satisfying

\[ f_i(\bar{x}) \geq f_i(\hat{x}) \quad \text{for all } i = 1, 2, \ldots, k \]

\[ f_i(\bar{x}) > f_i(\hat{x}) \quad \text{for some } i \]

\[ g_j(\bar{x}) \geq 0, \ j = 1, 2, \ldots, m, \ \text{and} \ \bar{x} \in X \]

REMARK: The reader may realize that the concept of "efficient point" in
activity analysis is a special case of the vector maximum where $f(x) = x$. One
may also note that the vector maximum problem has immediate relevance to
the concept of Pareto optimum, which is important in economic theory.
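The definition of a vector maximum can be made concrete on a finite candidate set. The sketch below is an illustration only: the function names, the grid of candidates, and the single resource constraint are assumptions invented for this example, with f(x) = x as in the "efficient point" case mentioned above.

```python
def dominates(fa, fb):
    """True if fa >= fb in every component and fa > fb in at least one."""
    return all(a >= b for a, b in zip(fa, fb)) and any(a > b for a, b in zip(fa, fb))

def vector_maxima(candidates, f, g):
    """Feasible candidates that no other feasible candidate dominates."""
    feasible = [x for x in candidates if all(gj >= 0 for gj in g(x))]
    return [x for x in feasible
            if not any(dominates(f(y), f(x)) for y in feasible)]

# Hypothetical example: divide one unit of a good between two agents.
def f(x):
    return (x[0], x[1])            # f(x) = x

def g(x):
    return (1.0 - x[0] - x[1],)    # resource constraint g(x) >= 0

candidates = [(i / 4, j / 4) for i in range(5) for j in range(5)]
efficient = vector_maxima(candidates, f, g)
```

On this grid the vector maxima are exactly the allocations that exhaust the resource, which is the discrete analogue of the Pareto frontier.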
REMARK: The definition of vector local maximum is analogous to the
above definition of vector global maximum. For the distinction between a
local maximum and a global maximum, see Section C of this chapter. The
concept of a local maximum is concerned with maximization with respect to
some open ball (which can be very small).
REMARK: It follows from the above definition that if $f(\hat{x})$ is a constrained
vector maximum, then
$\hat{x}$ maximizes $f_{i_0}(x)$ [that is, $f_{i_0}(\hat{x}) \geq f_{i_0}(x)$]
Subject to:
$f_i(x) \geq f_i(\hat{x})$ for all $i \neq i_0$
$g_j(x) \geq 0$, $j = 1, 2, \ldots, m$
where the choice of $i_0$ is arbitrary. For if not, there exists an $\bar{x}$ and an $i_0$
such that
\[ f_{i_0}(\bar{x}) > f_{i_0}(\hat{x}) \]
and
\[ f_i(\bar{x}) \geq f_i(\hat{x}) \ \text{for all } i \neq i_0, \qquad g_j(\bar{x}) \geq 0, \ j = 1, 2, \ldots, m \]
This is a contradiction of the assumption that $\hat{x}$ is a vector maximum.
Utilizing this remark, we now prove the following theorem. The method
of proof using the above remark is due to El-Hodiri [4], who, in turn, attributes
the idea to Leonid Hurwicz.

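Theorem 1.E.4: Let $f_1, f_2, \ldots, f_k, g_1, g_2, \ldots, g_m$ be real-valued concave functions
defined on a convex set $X$ in $R^n$. Assume that Slater's condition (S) holds; that is,
there exists an $\bar{x}$ in $X$ such that
\[ g_j(\bar{x}) > 0 \quad \text{for all } j \]
Then if $\hat{x}$ achieves a vector maximum of $f(x) = [f_1(x), \ldots, f_k(x)]$ subject to $g_j(x) \geq 0$,
$j = 1, 2, \ldots, m$, there exist $\alpha \in R^k$, $\hat{\lambda} \in R^m$ with $\alpha \geq 0$, $\hat{\lambda} \geq 0$, and $\alpha \neq 0$ such that
$(\hat{x}, \hat{\lambda})$ is a saddle point of $\Phi(x, \lambda) \equiv \alpha \cdot f(x) + \lambda \cdot g(x)$; that is,
\[ \Phi(x, \hat{\lambda}) \leq \Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda) \quad \text{for all } x \in X \text{ and } \lambda \geq 0 \]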
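PROOF: As a result of the above remark and Theorem 1.B.4, fixing $i_0$, there
exist $\alpha_{i i_0}$, $i = 1, 2, \ldots, k$, and $\lambda_{j i_0}$, $j = 1, 2, \ldots, m$, with $\alpha_{i i_0} \geq 0$, $\lambda_{j i_0} \geq 0$ for
all $i$ and $j$ but not all equal to 0, such that

\[ \alpha_{i_0 i_0} f_{i_0}(x) + \sum_{i \neq i_0} \alpha_{i i_0} [f_i(x) - f_i(\hat{x})] + \sum_{j=1}^{m} \lambda_{j i_0} g_j(x) \leq \alpha_{i_0 i_0} f_{i_0}(\hat{x}) + \sum_{i \neq i_0} \alpha_{i i_0} [f_i(\hat{x}) - f_i(\hat{x})] + \sum_{j=1}^{m} \lambda_{j i_0} g_j(\hat{x}) \quad \text{for all } x \in X \]

and

\[ \sum_{j=1}^{m} \lambda_{j i_0} g_j(\hat{x}) = 0 \]

These can easily be simplified to the following:

\[ \alpha_{i_0 i_0} f_{i_0}(x) + \sum_{i \neq i_0} \alpha_{i i_0} [f_i(x) - f_i(\hat{x})] + \sum_{j=1}^{m} \lambda_{j i_0} g_j(x) \leq \alpha_{i_0 i_0} f_{i_0}(\hat{x}) \quad \text{for all } x \in X \]

or,

\[ \sum_{i=1}^{k} \alpha_{i i_0} f_i(x) + \sum_{j=1}^{m} \lambda_{j i_0} g_j(x) \leq \sum_{i=1}^{k} \alpha_{i i_0} f_i(\hat{x}) + \sum_{j=1}^{m} \lambda_{j i_0} g_j(\hat{x}) \quad \text{for all } x \in X \]

and

\[ \sum_{j=1}^{m} \lambda_{j i_0} g_j(\hat{x}) = 0 \]

Summing this inequality and equation over $i_0$ from 1 to $k$, and defining

\[ \alpha_i \equiv \sum_{i_0=1}^{k} \alpha_{i i_0}, \qquad \hat{\lambda}_j \equiv \sum_{i_0=1}^{k} \lambda_{j i_0} \]

we obtain

\[ \sum_{i=1}^{k} \alpha_i f_i(x) + \sum_{j=1}^{m} \hat{\lambda}_j g_j(x) \leq \sum_{i=1}^{k} \alpha_i f_i(\hat{x}) + \sum_{j=1}^{m} \hat{\lambda}_j g_j(\hat{x}) \quad \text{for all } x \in X \]

and

\[ \sum_{j=1}^{m} \hat{\lambda}_j g_j(\hat{x}) = 0 \]

Or in vector notation,

\[ \alpha \cdot f(x) + \hat{\lambda} \cdot g(x) \leq \alpha \cdot f(\hat{x}) + \hat{\lambda} \cdot g(\hat{x}) \quad \text{for all } x \in X \]

and

\[ \hat{\lambda} \cdot g(\hat{x}) = 0 \]

Now we want to show that $\alpha \neq 0$. Suppose that $\alpha = 0$; then $\hat{\lambda} \neq 0$. Also,
owing to the above relation we have $\hat{\lambda} \cdot g(x) \leq 0$ for all $x \in X$. Let $x = \bar{x}$
for condition (S); then we obtain a contradiction, since $\hat{\lambda} \neq 0$ and $\hat{\lambda} \geq 0$.
Now we show $\Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda)$ for all $\lambda \geq 0$; that is, $\alpha \cdot f(\hat{x}) + \hat{\lambda} \cdot g(\hat{x}) \leq
\alpha \cdot f(\hat{x}) + \lambda \cdot g(\hat{x})$ for all $\lambda \geq 0$. Or we want to show

\[ \lambda \cdot g(\hat{x}) \geq 0 \quad \text{for all } \lambda \geq 0 \]

But this is obvious since $\hat{x}$ is a solution of this vector maximum problem, so
that $g_j(\hat{x}) \geq 0$ for all $j$. (Q.E.D.)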
REMARK: We should note that Slater's condition (S) in the above theorem
is used to guarantee a # 0. Without this condition, a can be zero.
REMARK: It is possible to prove the above theorem directly from the
separation theorem or the fundamental theorem of concave functions
(Theorem 1.B.2), as we did for Theorem 1.B.3. But the above proof seems
to be conceptually the simplest. See also Karlin [11], pp. 216-218, and
Kuhn and Tucker [12].

We now prove a sufficiency theorem.

Theorem 1.E.5: Let $f_i$, $i = 1, 2, \ldots, k$, and $g_j$, $j = 1, 2, \ldots, m$, be real-valued
functions over $X$ in $R^n$. Suppose that there exist $\hat{x} \in X$, $\alpha > 0$ and $\hat{\lambda} \geq 0$, such that
\[ \alpha \cdot f(x) + \hat{\lambda} \cdot g(x) \leq \alpha \cdot f(\hat{x}) + \hat{\lambda} \cdot g(\hat{x}) \quad \text{for all } x \in X \]
and
\[ \hat{\lambda} \cdot g(\hat{x}) = 0, \quad g(\hat{x}) \geq 0 \]
Then $\hat{x}$ achieves a vector maximum subject to $g(x) \geq 0$, $x \in X$.
PROOF: The proof is almost trivial. By the hypothesis of the theorem, we
have $\alpha \cdot f(\hat{x}) - \alpha \cdot f(x) \geq \hat{\lambda} \cdot g(x)$ for all $x \in X$. Hence $\hat{x}$ maximizes a real-
valued function $\alpha \cdot f(x)$ subject to the constraints $g(x) \geq 0$ and $x \in X$. Now
suppose $\hat{x}$ is not a vector maximum point. Then there exists an $\bar{x} \in X$ such
that $f_i(\bar{x}) \geq f_i(\hat{x})$ for all $i = 1, 2, \ldots, k$ with strict inequality for at least
one $i$ and $g_j(\bar{x}) \geq 0$, $j = 1, 2, \ldots, m$. Hence we have $\alpha \cdot f(\bar{x}) > \alpha \cdot f(\hat{x})$ with
$g(\bar{x}) \geq 0$, $\bar{x} \in X$. This contradicts the above observation that $\hat{x}$ maximizes
$\alpha \cdot f(x)$ subject to $g(x) \geq 0$ and $x \in X$. (Q.E.D.)
REMARK: Note that we do not need the concavity of the $f_i$ and $g_j$ nor the
convexity of $X$ in Theorem 1.E.5. From the above proof we can imme-
diately see the following useful theorem.
Theorem 1.E.6: Let $f_i$, $i = 1, \ldots, k$, and $g_j$, $j = 1, 2, \ldots, m$, be real-valued
functions over $X$ in $R^n$. Suppose that there exists an $\hat{x} \in X$ and coefficients $\alpha_1, \alpha_2, \ldots, \alpha_k$
in $R$ with $\alpha_i > 0$ for all $i$, such that $\hat{x}$ maximizes $\alpha \cdot f(x)$ subject to $g(x) \geq 0$ and
$x \in X$. Then $\hat{x}$ gives a vector maximum of $f(x)$ subject to $g(x) \geq 0$, $x \in X$. Here
$f(x) \equiv [f_1(x), \ldots, f_k(x)]$ and $g(x) \equiv [g_1(x), \ldots, g_m(x)]$.
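Theorem 1.E.6 says that maximizing a strictly positive weighted sum of the objectives yields a vector maximum. A minimal sketch of this on a finite grid follows; the objective, the constraint, the weights, and the grid are all hypothetical choices made for illustration.

```python
def dominates(fa, fb):
    """True if fa >= fb in every component and fa > fb in at least one."""
    return all(a >= b for a, b in zip(fa, fb)) and any(a > b for a, b in zip(fa, fb))

# Hypothetical problem: two objectives over the quarter disk.
def f(x):
    return (x[0], x[1])

def g(x):
    return (1.0 - x[0] ** 2 - x[1] ** 2,)   # g(x) >= 0

alpha = (1.0, 2.0)   # strictly positive weights, as the theorem requires

candidates = [(i / 8, j / 8) for i in range(9) for j in range(9)]
feasible = [x for x in candidates if all(gj >= 0 for gj in g(x))]

# Maximize the scalar function alpha . f(x) over the feasible candidates.
best = max(feasible, key=lambda x: sum(a * fi for a, fi in zip(alpha, f(x))))

# The maximizer should not be dominated by any feasible candidate.
is_vector_max = not any(dominates(f(y), f(best)) for y in feasible)
```

If some weight were allowed to be zero, a maximizer of the weighted sum could be dominated in the objective with zero weight, which is why the theorem asks for every weight to be strictly positive.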
We can obtain the following theorem, whose proof is analogous to that of
Theorem 1.B.4.

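Theorem 1.E.7: Let $f_i$, $i = 1, 2, \ldots, k$, and $g_j$, $j = 1, 2, \ldots, m$, be real-valued
functions over $X$ in $R^n$. Suppose that there exist coefficients $\alpha_1, \alpha_2, \ldots, \alpha_k$ in $R$ with
$\alpha_i > 0$ for all $i$ and a saddle point $(\hat{x}, \hat{\lambda})$ in $X \otimes \Omega^m$ such that

\[ \Phi(x, \hat{\lambda}) \leq \Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda) \quad \text{for all } x \in X \text{ and all } \lambda \in \Omega^m \]

where

\[ \Phi(x, \lambda) \equiv \alpha \cdot f(x) + \lambda \cdot g(x) \]

Then
(i) $\hat{x}$ achieves a vector maximum of $f(x)$ subject to $g(x) \geq 0$ and $x \in X$.
(ii) $\hat{\lambda} \cdot g(\hat{x}) = 0$.
PROOF: Since $\Phi(\hat{x}, \hat{\lambda}) \leq \Phi(\hat{x}, \lambda)$, we have $\hat{\lambda} \cdot g(\hat{x}) \leq \lambda \cdot g(\hat{x})$ for all $\lambda \geq 0$.
Hence $\lambda \cdot g(\hat{x})$ is bounded from below for all $\lambda$ in $\Omega^m$. Therefore $\lambda \cdot g(\hat{x}) \geq 0$
for all $\lambda \geq 0$, so that we obtain $g(\hat{x}) \geq 0$. (Recall the lemma immediately
preceding Theorem 0.B.4.) Putting $\lambda = 0$ in the above inequality, we obtain
$\hat{\lambda} \cdot g(\hat{x}) \leq 0$. But $\hat{\lambda} \cdot g(\hat{x}) \geq 0$, since $\hat{\lambda} \geq 0$. Hence $\hat{\lambda} \cdot g(\hat{x}) = 0$.
Now note that $\Phi(x, \hat{\lambda}) \leq \Phi(\hat{x}, \hat{\lambda})$ for all $x \in X$; that is, $\alpha \cdot f(x) +
\hat{\lambda} \cdot g(x) \leq \alpha \cdot f(\hat{x})$ (since $\hat{\lambda} \cdot g(\hat{x}) = 0$). This means that $\hat{x}$ maximizes $\alpha \cdot f(x)$
subject to $g(x) \geq 0$ and $x \in X$ with $\alpha > 0$. Owing to the above theorem, $\hat{x}$
achieves a vector maximum subject to $g(x) \geq 0$ and $x \in X$. (Q.E.D.)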
Now let us assume that the $f_i$'s and $g_j$'s are all differentiable in an open
set $X$ in $R^n$. Then we can extend the above analysis of the constrained vector
maximum problem in a manner similar to the analysis in Section D. Since the
proof will be analogous to the proofs given above and in Section D, we need
only list the main results. First we must define certain concepts.
Given differentiable vector-valued functions $f(x) = [f_1(x), f_2(x), \ldots,
f_k(x)]$ and $g(x) = [g_1(x), \ldots, g_m(x)]$ defined over an open set $X$ in $R^n$, we define
the following conditions.
(VM) There exists an $\hat{x} \in X$ which achieves a vector maximum of $f(x)$ subject
to $g(x) \geq 0$, $x \in X$.
(LVM) There exists an open ball $B_\epsilon(\hat{x})$ with radius $\epsilon$ in $X$ about $\hat{x}$ such that
$\hat{x}$ achieves a vector maximum of $f(x)$ subject to $g(x) \geq 0$ and $x \in B_\epsilon(\hat{x})$.
(VQSP) There exist $\alpha \in R^k$ with $\alpha \geq 0$, $\alpha \neq 0$, and $(\hat{x}, \hat{\lambda})$ in $X \otimes \Omega^m$
such that $\alpha \cdot \hat{f}_x + \hat{\lambda} \cdot \hat{g}_x = 0$ and $\hat{\lambda} \cdot g(\hat{x}) = 0$ with $g(\hat{x}) \geq 0$, where $\hat{f}_x = f'(\hat{x})$ and
$\hat{g}_x = g'(\hat{x})$.
Now we are ready to list the theorems, whose proofs are obvious.

Theorem 1.E.8: Suppose that the $f_i$'s and $g_j$'s are all concave differentiable functions
defined over an open convex set $X$ in $R^n$. Suppose also that Slater's condition is satis-
fied; that is,
(S) There exists an $\bar{x} \in X$ such that $g(\bar{x}) > 0$.
Then (VM) implies (VQSP).

Theorem 1.E.9: Suppose that the gj's satisfy (KTCQ) or (A-H-U) as defined in
Section D. Then (LVM) implies (VQSP).

Theorem 1.E.10: Suppose that the $f_i$'s and $g_j$'s are all concave differentiable functions
defined over an open convex set in $R^n$. Then (VQSP) implies (VM), where $\alpha$ in (VQSP)
is assumed to be strictly positive.
PROOF OF THEOREM 1.E.10: Since $\alpha \cdot f(x) + \hat{\lambda} \cdot g(x)$ is a nonnegative linear
combination of concave functions, it is concave. Hence by Theorem 1.C.7,
we obtain from (VQSP)
\[ \alpha \cdot f(x) + \hat{\lambda} \cdot g(x) \leq \alpha \cdot f(\hat{x}) \quad \text{for all } x \in X \]
Since $\alpha > 0$, this proves that $\hat{x}$ achieves a vector maximum of $f(x)$ subject to
$g(x) \geq 0$ and $x \in X$. (Q.E.D.)

c. QUADRATIC FORMS, HESSIANS, AND SECOND-ORDER CONDITIONS

In this section we are interested in characterizing concave or quasi-concave
functions in terms of their "Hessian matrices" and in the second-order (necessary
or sufficient) conditions for an optimum. To do this we first have to introduce the
useful properties of symmetric matrices, namely their negative or positive
(semi)definiteness. We also define the second-order derivatives to which the
Hessian matrix corresponds. Since these concepts are important in mathematical
economics, we will discuss them in detail before we come to the characterization
of the concave or quasi-concave functions. We will also discuss the relation
between this characterization and the so-called "second-order condition" for the
maximization problem.
We begin this discussion with some elementary concepts in linear algebra.

Definition: Given an $n \times n$ symmetric matrix $A = [a_{ij}]$, whose entries are real
numbers, and an n-vector, $x = (x_1, x_2, \ldots, x_n) \in R^n$,

\[ f(x, x) \equiv \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j = x \cdot A \cdot x \]

is called the quadratic form associated with $A$ over the real field.
REMARK: Clearly $f(x, x)$ is a real-valued function defined over $R^n \otimes
R^n$, and it is bilinear in the sense that
\[ f(x + y, x) = f(x, x) + f(y, x) \quad \text{for all } x, y \in R^n \]
\[ f(\alpha x, x) = \alpha f(x, x) \quad \text{for all } \alpha \in R, \ x \in R^n \]
and
\[ f(x, x + y) = f(x, x) + f(x, y) \quad \text{for all } x, y \in R^n \]
\[ f(x, \alpha x) = \alpha f(x, x) \quad \text{for all } \alpha \in R \text{ and } x \in R^n \]

In general, a real-valued function $f(x, x)$ defined over $X \otimes X$ where $X$
is a linear space (not necessarily finite dimensional) is called a quadratic
functional if the above bilinearity holds (where $R^n$ in the above definition is
replaced by $X$). Every quadratic functional can be expressed as a quadratic
form (that is, in the form of $x \cdot A \cdot x$) if $X$ is finite dimensional.

Definition: Let $Q(x) \equiv f(x, x)$ be a quadratic functional (or quadratic form) on a
linear space $X$. Then $Q(x)$ is said to be negative (positive) definite if $Q(x) < 0$ ($> 0$)
for all nonzero $x$ in $X$, and $Q(x)$ is said to be negative (positive) semidefinite if
$Q(x) \leq 0$ ($\geq 0$) for all $x \in X$.
If $Q(x) = x \cdot A \cdot x$, where $A$ is a symmetric matrix, the matrix $A$ is said to
be negative (positive) definite if $Q(x)$ is negative (positive) definite. The matrix $A$
is said to be negative (positive) semidefinite if $Q(x)$ is negative (positive) semi-
definite.
EXAMPLES:
1. If $x = (x_1, x_2)$ and
\[ A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]
then $Q(x) = x \cdot A \cdot x = x_1^2 + x_2^2$. Hence $Q(x) > 0$ for all nonzero
$x \in R^2$.
2. If $x = (x_1, x_2)$ and
\[ A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \]
then $Q(x) = x \cdot A \cdot x = 2 x_1 x_2$. Here $Q(x)$ can be negative, positive, or zero,
depending on the value of $x_1$ and $x_2$.
3. $X = C_{[0,1]}$ (the set of all continuous functions on $[0, 1]$); $Q(f) \equiv
\int_0^1 [f(t)]^2 \, dt$ where $f(t) \in X$ is a quadratic functional that is positive definite.
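We call a quadratic form $Q(x) = x \cdot A \cdot x$ a real quadratic form if $x \in R^n$
and $A = [a_{ij}]$ with $a_{ij} \in R$. We are concerned with real quadratic forms. Given an
$n \times n$ matrix $A = [a_{ij}]$, we may define the following determinants.
\[ D_k \equiv \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & & \vdots \\ a_{k1} & a_{k2} & \cdots & a_{kk} \end{vmatrix}, \quad k = 1, 2, \ldots, n \]
The determinants $D_1, D_2, \ldots, D_n$ are called the successive principal minors of $A$.
Given $A$, we may also define the following $k \times k$ determinant
\[ \tilde{D}_k \equiv \begin{vmatrix} a_{ii} & a_{ij} & \cdots & a_{ik} \\ a_{ji} & a_{jj} & \cdots & a_{jk} \\ \vdots & \vdots & & \vdots \\ a_{ki} & a_{kj} & \cdots & a_{kk} \end{vmatrix} \]
where $(i, j, \ldots, k)$ is any permutation of $k$ integers from the set of integers
$\{1, 2, \ldots, n\}$. The determinant $\tilde{D}_k$ is called a principal minor of $A$ with order $k$.
Note that every $\tilde{D}_n$ has the same sign as the determinant of $A$, since both rows and
columns are interchanged in the process of permutation.
EXAMPLES:

\[ D_1 = a_{11}, \qquad D_2 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} \]

\[ \tilde{D}_1 = a_{11} \text{ and } a_{22}, \qquad \tilde{D}_2 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} \text{ and } \begin{vmatrix} a_{22} & a_{21} \\ a_{12} & a_{11} \end{vmatrix} \]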
The following theorem is concerned with the characterization of a real
quadratic form.

Theorem 1.E.11: Let $Q(x) = x \cdot A \cdot x$, $x \in R^n$, be a real quadratic form. Then

(i) $Q(x)$ is positive definite if and only if $D_1 > 0, D_2 > 0, \ldots, D_n > 0$ (that is,
all the successive principal minors are positive).
(ii) $Q(x)$ is negative definite if and only if $D_1 < 0, D_2 > 0, \ldots, (-1)^n D_n > 0$ (that
is, the successive principal minors alternate in signs).
(iii) $Q(x)$ is positive semidefinite if and only if all $\tilde{D}_1 \geq 0, \tilde{D}_2 \geq 0, \ldots, \tilde{D}_n \geq 0$.
(iv) $Q(x)$ is negative semidefinite if and only if all $\tilde{D}_1 \leq 0, \tilde{D}_2 \geq 0, \ldots, (-1)^n \tilde{D}_n \geq 0$.
(v) A positive (negative) semidefinite $Q(x)$ is positive (negative) definite if and only
if $A$ is a nonsingular matrix.
PROOF: See Gantmacher [7], pp. 306-308, and Hestenes [9], pp. 20-21,
for example.
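REMARK: The determinants $\tilde{D}_k$ have all the same signs if $\{i, j, \ldots, k\}$ is
the same index set, since both rows and columns are interchanged in the
process of permutation. For example,
\[ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \]
has three kinds of $\tilde{D}_2$, each kind having its own sign and value,

\[ \tilde{D}_2^1 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} \text{ and } \begin{vmatrix} a_{22} & a_{21} \\ a_{12} & a_{11} \end{vmatrix} \]

\[ \tilde{D}_2^2 = \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} \text{ and } \begin{vmatrix} a_{33} & a_{31} \\ a_{13} & a_{11} \end{vmatrix} \]

\[ \tilde{D}_2^3 = \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} \text{ and } \begin{vmatrix} a_{33} & a_{32} \\ a_{23} & a_{22} \end{vmatrix} \]

REMARK: Statements (i) and (ii) of the above theorem can be restated
as the following: $Q(x)$ is positive definite if and only if $D_k > 0$, $k = 1, 2, \ldots, n$,
and $Q(x)$ is negative definite if and only if $(-1)^k D_k > 0$, $k = 1, 2, \ldots, n$.
REMARK: The determinants $\tilde{D}_k$ in statements (iii) and (iv) of the above
theorem cannot be replaced by $D_k$. For example, let

\[ A = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \]

Then $D_1 = 0$, $D_2 = 0$ (but $\tilde{D}_1 = 0$ and $1$). Hence this satisfies the condition of
(iv) if $\tilde{D}_k$ is replaced by $D_k$, but $Q(x) = x \cdot A \cdot x = x_2^2$ is not negative semi-
definite (while it is positive semidefinite but not positive definite).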
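The difference between successive principal minors and all principal minors in the counterexample above can be reproduced directly. This is a minimal sketch for small matrices; the helper names are assumptions of this example, and the cofactor expansion used for the determinant is only practical for small n.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def successive_principal_minors(a):
    """D_1, ..., D_n: determinants of the upper-left k x k blocks."""
    return [det([row[:k] for row in a[:k]]) for k in range(1, len(a) + 1)]

def principal_minors(a, k):
    """All order-k principal minors, one per index set of size k."""
    return [det([[a[i][j] for j in idx] for i in idx])
            for idx in combinations(range(len(a)), k)]

def is_negative_semidefinite(a):
    """Test (iv): (-1)^k times every order-k principal minor is >= 0."""
    return all((-1) ** k * d >= 0
               for k in range(1, len(a) + 1) for d in principal_minors(a, k))

def is_positive_semidefinite(a):
    """Test (iii): every principal minor is >= 0."""
    return all(d >= 0
               for k in range(1, len(a) + 1) for d in principal_minors(a, k))

A = [[0, 0], [0, 1]]
D = successive_principal_minors(A)
# The test that looks only at D_1, ..., D_n wrongly passes for this A.
successive_only_passes = all((-1) ** k * d >= 0 for k, d in enumerate(D, start=1))
```

Here the successive minors are both zero, so the successive-only test passes, while the full test correctly rejects negative semidefiniteness because the order-1 principal minor a_22 = 1 has the wrong sign.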
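Now we define the second derivative.

Definition: Let $f$ be a real-valued differentiable function on an open subset
$X$ in $R^n$; $f$ is said to be twice differentiable at $x^0$, where $x^0 \in X$, if there exists an
n-vector $a$ and an $n \times n$ matrix $A$ such that
\[ f(x^0 + h) - f(x^0) = a \cdot h + \tfrac{1}{2} h \cdot A \cdot h + o(\|h\|^2) \]

The n-vector $a$ is called the first derivative of $f$ at $x^0$ and $A$ is called the second
derivative of $f$ at $x^0$. The first differential of $f$ at $x^0$ is the name given to $a \cdot h$,
and $h \cdot A \cdot h$ is called the second differential of $f$ at $x^0$. The first and the second
differentials are denoted by $\delta f$ (or $df$) and $\delta^2 f$ (or $d^2 f$), respectively. Note that
$d^2 f = d(df)$.
REMARK: If $X$ is a (normed) linear space which is not necessarily finite
dimensional, then $a \cdot h$ is replaced by the linear functional $a(h)$ and $h \cdot A \cdot h$ is
replaced by the quadratic functional $A(h)$. The above definition and this
remark are natural extensions of the concept of a derivative and differential
as discussed in Section C. It can be shown that $a$ and $A$ in the above defini-
tion are unique if they exist.
REMARK: If $f$ is twice differentiable at $x^0 \in X$, an open subset of $R^n$,
then the second partial derivatives $\partial^2 f(x) / \partial x_i \partial x_j$ ($i, j = 1, 2, \ldots, n$) exist
at $x^0$ and the second derivative $A$ (at $x^0$), also denoted by $f''(x^0)$, has the
following expression:
\[ A = \begin{pmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{pmatrix} \]

where $\dfrac{\partial^2 f}{\partial x_i \partial x_j} \equiv \dfrac{\partial}{\partial x_i} \left( \dfrac{\partial f}{\partial x_j} \right)$ (evaluated at $x^0$), $i, j = 1, 2, \ldots, n$. If the second
partial derivatives are continuous at $x^0$, then $f''$ is continuous at $x^0$ and

\[ \frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}, \quad i, j = 1, 2, \ldots, n \ \text{(evaluated at } x^0 \text{)} \]

In other words, the above
matrix $A$ is symmetric. The matrix $A$ is called the Hessian matrix of $f$ at $x^0$.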
REMARK: According to the usual convention in mathematics, the notation
$[f''(x^0) \leq 0]$ means that the Hessian matrix $f''(x^0)$ is negative semidefinite,
and not that each element of the matrix $f''(x^0)$ is nonpositive. Similarly,
$[f''(x^0) < 0]$ means that the Hessian matrix $f''(x^0)$ is negative definite, and
not that each element of $f''(x^0)$ is negative. When $x^0$ is a scalar, this con-
vention does not create any confusion. But when the dimension of $x^0$ is
greater than or equal to 2, this convention might cause confusion to some
readers. In this book, following the usual convention in economics, we
reserve the notation $A \leq 0$ to mean that each element of the matrix $A$ is non-
positive. Similarly, $A < 0$ means that each element of $A$ is negative.
The following theorem offers a characterization of concave functions in
terms of the Hessian matrix.

Theorem 1.E.12: Let $f(x)$ be a twice continuously differentiable real-valued function
on an open convex set $X$ in $R^n$, and let $f''(x)$ be its Hessian matrix. Then

(i) The function $f$ is concave on $X$ if and only if $f''(x)$ is negative semidefinite for
all $x \in X$.
(ii) The function $f$ is strictly concave on $X$ if $f''(x)$ is negative definite for all $x \in X$.
(iii) The function $f$ is convex on $X$ if and only if $f''(x)$ is positive semidefinite for all
$x \in X$.
(iv) The function $f$ is strictly convex on $X$ if $f''(x)$ is positive definite for all $x \in X$.
PROOF: See Fenchel [6], pp. 87-88.
REMARK: Note that concavity or convexity is a global concept. Hence in
each statement of Theorem 1.E.12, the phrase "for all $x \in X$" is needed. If
$f(x)$ is concave (or convex) in a convex subset $S$ of its domain $X$, then $X$ in
all four statements should be replaced by $S$.
REMARK: The converse of (ii) and the converse of (iv) do not necessarily
hold. For example, $f(x) = -(x - 1)^4$, $x \in R$, is strictly concave, but $f''(1) =
0$.

Combining Theorem 1.E.12 with Theorem 1.E.11, we obtain Theorem 1.E.13.

Theorem 1.E.13: Let $f(x)$ be a twice continuously differentiable real-valued function
on an open convex set $X$ in $R^n$. Let $f''(x) = [a_{ij}]$ be the Hessian matrix of $f$ for $x \in X$.
Let $D_k$ and $\tilde{D}_k$ ($k = 1, 2, \ldots, n$), respectively, be the successive principal minors and
principal minors of $f''(x)$. Then the following are true.

(i) The function $f$ is concave if and only if $\tilde{D}_1 \leq 0, \tilde{D}_2 \geq 0, \ldots, (-1)^n \tilde{D}_n \geq 0$ for
all $x \in X$.
(ii) The function $f$ is strictly concave if $D_1 < 0, D_2 > 0, \ldots, (-1)^n D_n > 0$ for all
$x \in X$.
(iii) The function $f$ is convex if and only if $\tilde{D}_1 \geq 0, \tilde{D}_2 \geq 0, \ldots, \tilde{D}_n \geq 0$.
(iv) The function $f$ is strictly convex if $D_1 > 0, D_2 > 0, \ldots, D_n > 0$.

REMARK: The converse of (ii) and the converse of (iv) are not necessarily
true, since the converse of (ii) and the converse of (iv) in Theorem 1.E.12
are not necessarily true.
EXAMPLES:
1. $Y = F(L, K)$ ($L, K, Y \in R$, and all $> 0$) is a concave function if $F_{LL} < 0$,
$F_{KK} < 0$, and $F$ is linear homogeneous, where $F_{LL} \equiv \partial^2 F / \partial L^2$, $F_{KK} \equiv
\partial^2 F / \partial K^2$.
2. In particular, $F(L, K) = L^{\alpha} K^{\beta} > 0$ is a strictly concave function if $\alpha + \beta
< 1$.
REMARK: Recall that the second-order conditions are never mentioned in
the theorems developed in the previous sections when $f$ and the $g_j$'s are
concave. If the $g_j$'s are concave (or even quasi-concave), the constraint set
$C = \{x \in X : g_j(x) \geq 0, j = 1, 2, \ldots, m\}$ is convex. Under the convexity of
$C$, the concavity of $f$ implies that every local maximum is a global maximum;
$f$ is concave if and only if the Hessian of $f$ is negative semidefinite for all
$x \in X$. That is, $(-1)^k \tilde{D}_k \geq 0$, $k = 1, 2, \ldots, n$ for all $x \in X$. In other words,
a global maximum of $f$ corresponds to the Hessian of $f$ being globally nega-
tive semidefinite.
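The Hessian test of Theorem 1.E.13 can be tried on the Cobb-Douglas function of Example 2. The sketch below uses the analytic second derivatives of $L^{\alpha} K^{\beta}$ and evaluates the sign pattern $D_1 < 0$, $D_2 > 0$ at a few sample points; the sample points and the exponent values are assumptions for illustration, and checking finitely many points does not establish the condition for all $x$.

```python
def cobb_douglas_hessian(L, K, alpha, beta):
    """Hessian of F(L, K) = L**alpha * K**beta from its analytic derivatives."""
    F_LL = alpha * (alpha - 1) * L ** (alpha - 2) * K ** beta
    F_KK = beta * (beta - 1) * L ** alpha * K ** (beta - 2)
    F_LK = alpha * beta * L ** (alpha - 1) * K ** (beta - 1)
    return [[F_LL, F_LK], [F_LK, F_KK]]

def is_negative_definite_2x2(h):
    """Successive principal minors alternate in sign: D_1 < 0 and D_2 > 0."""
    d1 = h[0][0]
    d2 = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return d1 < 0 and d2 > 0

samples = [(L, K) for L in (0.5, 1.0, 2.0, 4.0) for K in (0.5, 1.0, 2.0, 4.0)]

# alpha + beta < 1: the Hessian is negative definite at every sample point.
decreasing_returns = all(
    is_negative_definite_2x2(cobb_douglas_hessian(L, K, 0.3, 0.5)) for L, K in samples
)

# alpha + beta > 1: D_2 < 0, so the test fails at every sample point.
increasing_returns = any(
    is_negative_definite_2x2(cobb_douglas_hessian(L, K, 0.7, 0.6)) for L, K in samples
)
```

Analytically, $D_2 = \alpha \beta (1 - \alpha - \beta) L^{2\alpha - 2} K^{2\beta - 2}$, so its sign is the sign of $1 - \alpha - \beta$, in line with the condition $\alpha + \beta < 1$ in Example 2.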
We now turn to the characterization of quasi-concave functions.

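Definition: Let $f(x)$ be a real-valued function on an open subset $X$ of $R^n$, which
is twice differentiable at $x^0$ in $X$. Then the following matrix $B$ is called the
bordered Hessian matrix of $f$ evaluated at $x^0$.
\[ B \equiv \begin{pmatrix} 0 & f_1 & f_2 & \cdots & f_n \\ f_1 & f_{11} & f_{12} & \cdots & f_{1n} \\ \vdots & \vdots & \vdots & & \vdots \\ f_n & f_{n1} & f_{n2} & \cdots & f_{nn} \end{pmatrix} \]

where $f_i \equiv \partial f / \partial x_i$, and $f_{ij} \equiv \partial^2 f / \partial x_i \partial x_j$, $i, j = 1, 2, \ldots, n$, all evaluated at $x^0$.

Denote the $(k + 1)$th successive principal minor of $B$ by $B_k$, which we call
the kth (successive) bordered Hessian determinant evaluated at $x^0$. In other words,
\[ B_k \equiv \begin{vmatrix} 0 & f_1 & f_2 & \cdots & f_k \\ f_1 & f_{11} & f_{12} & \cdots & f_{1k} \\ \vdots & \vdots & \vdots & & \vdots \\ f_k & f_{k1} & f_{k2} & \cdots & f_{kk} \end{vmatrix}, \quad k = 1, 2, \ldots, n \]

When $x^0$ moves in $X$, the values of the $B_k$'s also change in general.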

Theorem 1.E.14: Let $f(x)$ be a twice continuously differentiable real-valued function
on $R^n$. Then the following holds for $x \geq 0$.
(i) If $f(x)$ is quasi-concave, then $B_2 \geq 0, B_3 \leq 0, \ldots, (-1)^n B_n \geq 0$ for all $x \in \Omega^n$
($B_1 \leq 0$ holds always).
(ii) Conversely, if $B_1 < 0, B_2 > 0, B_3 < 0, \ldots, (-1)^n B_n > 0$ for all $x \in \Omega^n$, then
$f(x)$ is quasi-concave on $\Omega^n$, where $\Omega^n$ is the nonnegative orthant of $R^n$.
PROOF: See Arrow and Enthoven [1], especially their theorem 5.
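As an illustration of the bordered Hessian determinants, consider the hypothetical function $f(x_1, x_2) = x_1 x_2$ (not taken from the text), which is quasi-concave but not concave on the positive orthant. The sketch below builds $B$ from the analytic derivatives and evaluates $B_1$ and $B_2$ at one point; the helper names are assumptions of this example.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def bordered_hessian(grad, hess):
    """Border the Hessian with the gradient: [[0, f_j], [f_i, f_ij]]."""
    n = len(grad)
    return [[0.0] + list(grad)] + [[grad[i]] + list(hess[i]) for i in range(n)]

def bordered_determinants(grad, hess):
    """B_k = (k+1)th successive principal minor of B, for k = 1, ..., n."""
    b = bordered_hessian(grad, hess)
    return [det([row[:k + 1] for row in b[:k + 1]]) for k in range(1, len(grad) + 1)]

def minors_for_product(x1, x2):
    """B_1, B_2 for the assumed example f(x1, x2) = x1 * x2."""
    grad = [x2, x1]                      # f_1 = x2, f_2 = x1
    hess = [[0.0, 1.0], [1.0, 0.0]]      # f_11 = f_22 = 0, f_12 = f_21 = 1
    return bordered_determinants(grad, hess)

B1, B2 = minors_for_product(1.0, 2.0)
```

For this function $B_1 = -x_2^2$ and $B_2 = 2 x_1 x_2$, so the sign pattern $B_1 < 0$, $B_2 > 0$ of statement (ii) holds whenever $x_1 > 0$ and $x_2 > 0$.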
We now turn to the discussion of the second-order conditions. The following
characterization of the unconstrained maximum problem is a classical result.

Theorem 1.E.15: Let $f(x)$ be a twice continuously differentiable real-valued function
on an open set $X$ in $R^n$ and $f''(\hat{x})$ be the Hessian matrix of $f$ at $\hat{x}$. Then

(i) If $f$ has a local maximum at $\hat{x} \in X$, then $f'(\hat{x}) = 0$ and $f''(\hat{x})$ is negative semi-
definite.
(ii) Conversely, if $f'(\hat{x}) = 0$ and $f''(\hat{x})$ is negative definite, then there exists an open
ball $B_\epsilon(\hat{x})$ about $\hat{x}$ with radius $\epsilon > 0$ and a positive number $\theta$ such that

\[ f(\hat{x}) \geq f(x) + \theta \| x - \hat{x} \|^2 \quad \text{for all } x \text{ in } B_\epsilon(\hat{x}) \]

PROOF: The proof is easy and therefore is omitted. (See, for example,
Hestenes [9], pp. 18-20.)
REMARK: That $f''(\hat{x})$ be negative semidefinite in the above theorem is
called the second-order necessary condition. That $f''(\hat{x})$ be negative definite
is called the second-order sufficient condition.¹²
Next we consider the second-order conditions for the constrained maximum
problem. We consider, for the sake of generality, constraints which are a mixture
of inequalities and equalities. In other words, we consider the problem of finding
$x \in X$, an open subset of $R^n$, so as to
Maximize: $f(x)$
Subject to: $g_j(x) \geq 0$, $j = 1, 2, \ldots, m$
$h_k(x) = 0$, $k = 1, 2, \ldots, l$
where $f$, $g_j$, $j = 1, 2, \ldots, m$, and $h_k$, $k = 1, 2, \ldots, l$, are all real-valued twice con-
tinuously differentiable functions on $X$.
In view of the presence of the equality constraints, we cannot use condi-
tions such as Slater's condition. We will assume the following rank condition (R).
(R) Let $E$ be the set of indices $j$ for which $g_j(\hat{x}) = 0$. Let $m_e$ be the number of
such $j$'s (that is, the number of effective g-constraints at $\hat{x}$). Then it is required
that the rank of the $(m_e + l) \times n$ matrix
\[ \begin{pmatrix} \partial g_j / \partial x_i \\ \partial h_k / \partial x_i \end{pmatrix}, \quad j \in E; \ k = 1, 2, \ldots, l; \ i = 1, 2, \ldots, n \]
where each partial derivative is evaluated at $\hat{x}$, be equal to $(m_e + l)$.¹³
We define the Lagrangian for the present problem by
\[ L(x) \equiv f(x) + \lambda \cdot g(x) + \mu \cdot h(x) \]
where $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_m) \in \Omega^m$ and $\mu = (\mu_1, \mu_2, \ldots, \mu_l) \in R^l$. The quasi-saddle-
point conditions or the first-order conditions for this problem are written as follows:

(FOC) There exists an $(\hat{x}, \hat{\lambda}, \hat{\mu})$ in $X \otimes \Omega^m \otimes R^l$ such that $\hat{f}_x + \hat{\lambda} \cdot \hat{g}_x + \hat{\mu} \cdot \hat{h}_x = 0$,
$\hat{\lambda} \cdot g(\hat{x}) = 0$, $g(\hat{x}) \geq 0$, $\hat{\lambda} \geq 0$, and $h(\hat{x}) = 0$, where $\hat{f}_x = f'(\hat{x})$, $\hat{g}_x = g'(\hat{x})$ and
$\hat{h}_x = h'(\hat{x})$.
The (local) maximality condition is written out as:
(LM) There exists an $\hat{x}$ in $X$ such that $f(x)$ has a local maximum at $\hat{x}$ subject to $g_j(x)
\geq 0$, $j = 1, 2, \ldots, m$, and $h_k(x) = 0$, $k = 1, 2, \ldots, l$.
We are now ready to state an important theorem which characterizes (LM).
The first statement is concerned with the second-order necessary conditions for
(LM) and the second statement is concerned with the second-order sufficient
conditions for (LM).

Theorem 1.E.16:¹⁴

(i) Suppose that conditions (LM) and (R) are satisfied; then we have (FOC) and
\[ \xi \cdot H \cdot \xi \leq 0, \quad \text{where } \xi \equiv x - \hat{x} \]
satisfying
\[ g_j'(\hat{x}) \cdot \xi = 0, \quad j \in E \]
and
\[ h_k'(\hat{x}) \cdot \xi = 0, \quad k = 1, 2, \ldots, l \]
where $H$ is the Hessian matrix of $L$ evaluated at $\hat{x}$, that is, $H = L''(\hat{x})$.
(ii) Suppose that conditions (FOC) and (R) are satisfied. Furthermore, suppose that
\[ \xi \cdot H \cdot \xi < 0, \quad \text{where } \xi \equiv x - \hat{x} \neq 0 \]
satisfying
\[ g_j'(\hat{x}) \cdot \xi = 0, \quad j \in E \]
and
\[ h_k'(\hat{x}) \cdot \xi = 0, \quad k = 1, 2, \ldots, l \]
Then there exists an open ball $B_\epsilon(\hat{x}) \subset X$ about $\hat{x}$ with radius $\epsilon > 0$ and a positive
number $\theta > 0$ such that

\[ f(\hat{x}) \geq f(x) + \theta \| x - \hat{x} \|^2 \quad \text{for all } x \in B_\epsilon(\hat{x}) \]

PROOF: See Hestenes [9], chapter 1, sections 9 and 10, and El-Hodiri [4].

REMARK: Note that if $f$, the $g_j$'s, and the $h_k$'s are all concave and if all
the multipliers $\lambda_j$'s and $\mu_k$'s are nonnegative, then the Lagrangian function
$L$, as a nonnegative linear combination of concave functions, is concave.
Hence the Hessian matrix of $L$ is negative semidefinite for all $x \in X$. In
particular, $H$ is negative semidefinite. In other words, $\xi \cdot H \cdot \xi \leq 0$ for all $\xi$.

It should also be noted that Theorem 1.E.16 is concerned with only the
local characterization. The (quasi-) concavity of $f$ together with the con-
vexity of the constraint set guarantees a global characterization.
As is well known, it is possible to characterize the second-order conditions
in terms of the bordered Hessians. Let $A = [a_{ij}]$ be any $n \times n$ matrix with real
entries and $B = [b_{ij}]$ be any $m \times n$ matrix with real entries. Here $A$ and $B$ are
not necessarily Hessian or Jacobian matrices. Now define the following sub-
matrices of $A$ and $B$.
\[ A_r \equiv \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1r} \\ a_{21} & a_{22} & \cdots & a_{2r} \\ \vdots & \vdots & & \vdots \\ a_{r1} & a_{r2} & \cdots & a_{rr} \end{pmatrix}, \qquad B_{mr} \equiv \begin{pmatrix} b_{11} & b_{12} & \cdots & b_{1r} \\ b_{21} & b_{22} & \cdots & b_{2r} \\ \vdots & \vdots & & \vdots \\ b_{m1} & b_{m2} & \cdots & b_{mr} \end{pmatrix} \]

where $m < n$ is assumed. Furthermore, define the following determinants $|C_r|$.

\[ |C_r| \equiv \det \begin{pmatrix} 0 & B_{mr} \\ B_{mr}' & A_r \end{pmatrix}, \quad r = m + 1, m + 2, \ldots, n \]

where $B_{mr}'$ is the transpose of $B_{mr}$ and $0$ is the $m \times m$ matrix whose entries are
all zero. Then we have the following theorem to characterize the second-order
sufficient conditions.

Theorem 1.E.17: Let $A$ be symmetric and $B_{mm}$ be nonsingular. Then

(i) $\xi \cdot A \cdot \xi < 0$ for all $\xi \neq 0$ such that $B \xi = 0$, if and only if

\[ (-1)^r |C_r| > 0, \quad r = m + 1, m + 2, \ldots, n \]

(ii) $\xi \cdot A \cdot \xi > 0$ for all $\xi \neq 0$ such that $B \xi = 0$, if and only if

\[ (-1)^m |C_r| > 0, \quad r = m + 1, m + 2, \ldots, n \]

REMARK: In (i), the sign of $|C_r|$ depends on $r$; thus (i) says that the last
$(n - m)$ successive principal minors of the bordered matrix $C$ alternate in
signs. In (ii), the sign of $|C_r|$ depends on $m$; thus (ii) says that the last
$(n - m)$ successive principal minors of $C$ all have the identical sign $(-1)^m$.
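Condition (i) of Theorem 1.E.17 can be checked on a small case. The sketch below assumes the illustrative data $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ and $B = (1, 1)$, so that $n = 2$, $m = 1$ and only $r = 2$ matters; these matrices and the helper names are assumptions for this example.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def bordered_determinants(a, b):
    """Return {r: |C_r|} for r = m+1, ..., n, with C_r = [[0, B_mr], [B_mr', A_r]]."""
    n, m = len(a), len(b)
    result = {}
    for r in range(m + 1, n + 1):
        top = [[0] * m + list(b[i][:r]) for i in range(m)]
        bottom = [[b[i][j] for i in range(m)] + list(a[j][:r]) for j in range(r)]
        result[r] = det(top + bottom)
    return result

def negative_definite_on_constraint(a, b):
    """Condition (i): (-1)**r * |C_r| > 0 for r = m+1, ..., n."""
    return all((-1) ** r * c > 0 for r, c in bordered_determinants(a, b).items())

A = [[0, 1], [1, 0]]     # A alone is indefinite: Q(x) = 2 * x1 * x2
B = [[1, 1]]             # constraint B xi = 0, i.e. xi1 + xi2 = 0

minors = bordered_determinants(A, B)
holds = negative_definite_on_constraint(A, B)

# Direct check on a vector satisfying B xi = 0.
xi = (1, -1)
q = sum(xi[i] * A[i][j] * xi[j] for i in range(2) for j in range(2))
```

Although $A$ is indefinite on $R^2$, on the line $\xi_1 + \xi_2 = 0$ the form equals $-2 \xi_1^2$, which agrees with $(-1)^2 |C_2| = 2 > 0$.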

REMARK: Theorem 1.E.17 became well known to economists through
Hicks's Value and Capital (Mathematical Appendix) and Samuelson's
Foundations of Economic Analysis (especially pp. 376-378). A complete and
sound proof of this theorem is given by Debreu [3]. It is well known that
this theorem, together with Theorem 1.E.16, plays an important role in the
comparative statics analysis.¹⁵

REMARK: Let

\[ |\bar{C}_r| \equiv \det \begin{pmatrix} A_r & B_{mr}' \\ B_{mr} & 0 \end{pmatrix}, \quad r = m + 1, \ldots, n \]

Then the bordered principal minors conditions of Theorem 1.E.17 can also
be written in terms of $|\bar{C}_r|$; that is,
\[ (-1)^r |C_r| > 0 \ \text{if and only if} \ (-1)^r |\bar{C}_r| > 0 \quad (r = m + 1, \ldots, n) \]
and
\[ (-1)^m |C_r| > 0 \ \text{if and only if} \ (-1)^m |\bar{C}_r| > 0 \quad (r = m + 1, \ldots, n) \]

As an example of the applications of Theorems 1.E.16 and 1.E.17, consider
the problem of choosing $x \in R^n$ so as to
Maximize: $f(x)$
Subject to: $g(x) = M - p \cdot x = 0$
where $p \ge 0$ is a constant vector in $R^n$ and $M$ is a positive constant. Define the
Lagrangian by $L(x) = f(x) + \lambda (M - p \cdot x)$ and let $A$ be the Hessian matrix of $L(x)$
evaluated at $\hat{x}$; that is, $A = L''(\hat{x})$ so that $a_{ij} = \partial^2 L(\hat{x})/\partial x_i \partial x_j$. Let $B$ be the gradient
vector $(\partial g/\partial x_1, \ldots, \partial g/\partial x_n) = (-p_1, -p_2, \ldots, -p_n)$. Then combining (ii) of
Theorem 1.E.16 and (i) of Theorem 1.E.17, we can assert that a sufficient condition
for $\hat{x}$ to achieve a local maximum of $f(x)$ subject to the constraint is $L'(\hat{x}) = 0$ and

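$$\begin{vmatrix} 0 & -p_1 & -p_2 \\ -p_1 & a_{11} & a_{12} \\ -p_2 & a_{21} & a_{22} \end{vmatrix} > 0, \qquad \begin{vmatrix} 0 & -p_1 & -p_2 & -p_3 \\ -p_1 & a_{11} & a_{12} & a_{13} \\ -p_2 & a_{21} & a_{22} & a_{23} \\ -p_3 & a_{31} & a_{32} & a_{33} \end{vmatrix} < 0, \ \ldots$$

or equivalently,

$$\begin{vmatrix} a_{11} & a_{12} & -p_1 \\ a_{21} & a_{22} & -p_2 \\ -p_1 & -p_2 & 0 \end{vmatrix} > 0, \qquad \begin{vmatrix} a_{11} & a_{12} & a_{13} & -p_1 \\ a_{21} & a_{22} & a_{23} & -p_2 \\ a_{31} & a_{32} & a_{33} & -p_3 \\ -p_1 & -p_2 & -p_3 & 0 \end{vmatrix} < 0, \ \ldots$$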

Note that the rank condition (R) is satisfied because $p \ne 0$ by assumption.
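
The alternating-sign condition can be checked numerically on a small case. The Python sketch below is an illustration only and is not part of the original text: the function names and the example ($f(x) = x_1 x_2 x_3$, $p = (1, 1, 1)$, $M = 3$, $\hat{x} = (1, 1, 1)$, so that $\lambda = 1$) are our own assumptions. Because the constraint is linear, the Hessian of the Lagrangian coincides with the Hessian of $f$.

```python
# Illustrative check of (-1)^r |C_r| > 0 for a single linear constraint (m = 1).
# All names and the numerical example are hypothetical.

def det(matrix):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * matrix[0][col] * det(minor)
    return total

def bordered_minors(hessian, gradient):
    """Return [|C_2|, ..., |C_n|] for one constraint whose gradient is B."""
    n = len(hessian)
    minors = []
    for r in range(2, n + 1):
        top = [0] + list(gradient[:r])
        rows = [[gradient[i]] + list(hessian[i][:r]) for i in range(r)]
        minors.append(det([top] + rows))
    return minors

def satisfies_maximum_condition(hessian, gradient):
    """True when (-1)^r |C_r| > 0 for r = 2, ..., n."""
    n = len(hessian)
    minors = bordered_minors(hessian, gradient)
    return all((-1) ** r * c > 0 for r, c in zip(range(2, n + 1), minors))

# Hessian of f(x) = x1*x2*x3 at (1, 1, 1) and the gradient B = (-p1, -p2, -p3).
A = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
B = [-1, -1, -1]
print(bordered_minors(A, B), satisfies_maximum_condition(A, B))
```

In this example $|C_2| = 2 > 0$ and $|C_3| = -3 < 0$, so the sufficient condition for a local constrained maximum holds at $\hat{x}$.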

FOOTNOTES

1. Note the difference between "strictly quasi-concave" and "explicitly quasi-concave."
Clearly if $f$ is strictly quasi-concave (resp. strictly quasi-convex), then it is explicitly
quasi-concave (resp. explicitly quasi-convex). The converse does not necessarily
hold. Mathematicians often use the term "strictly quasi-concave" for the second
property of "explicitly quasi-concave." See Ponstein [15], for example. The terms
such as "explicitly quasi-concave" and "explicitly quasi-convex" are used in Martos
[14], one of the papers which introduced the concept for the first time in the
literature.
2. This statement can be proved by applying the arithmetic mean theorem; that is,
$\min\{a, b\} \le ta + (1 - t)b \le \max\{a, b\}$, where $a, b, t \in R$, and $0 \le t \le 1$. Let
$a = f(x)$ and $b = f(x')$, and apply the left inequality. Note the converse of the
statement does not necessarily hold.
3. Prove by contradiction. Suppose $\hat{x}$ is a local maximum point in $C$ which is not
global. Then there exists $x \in C$ such that $f(x) > f(\hat{x})$. Due to the explicit
quasi-concavity of $f$, this means $f[tx + (1 - t)\hat{x}] > f(\hat{x})$ for all $t$, $0 < t < 1$. Choosing $t$
close enough to zero, we get a contradiction of the local maximality of $\hat{x}$. Note
that only the second property of explicit quasi-concavity is used in the proof. See
Ponstein [15], for example.
4. That $\hat{x}$ achieves a global maximum follows from the previous statement since every
strictly quasi-concave function is explicitly quasi-concave. To show the uniqueness,
suppose the contrary. That is, $f(\hat{x}) = f(x^*)$ with $\hat{x} \ne x^*$ and $\hat{x}, x^* \in C$, where $\hat{x}$ and $x^*$
both achieve a global maximum. Let $\tilde{x} = \frac{1}{2}\hat{x} + \frac{1}{2}x^*$. Then $\tilde{x} \in C$ and $f(\tilde{x}) > f(x^*)$,
which is a contradiction.
5. However, it should also be noted that the consideration of the case in which the
maximand function $f$ is real-valued is not really a prerequisite for considering the
case in which $f$ is vector-valued. Without too much difficulty, the reader should be
able to rephrase our discussions in Sections B and D such that they hold for the case
in which $f$ is vector-valued. See also Hurwicz [10], for example.
6. For example, consider the problem of (vector-) maximizing $x \in R^n$ subject to $g(x) \ge 0$,
$x \ge 0$. Interpret $g(x)$ as the usual production transformation locus and $x$ as
the output vector. Theorems 1.E.5 and 1.E.6 signify that the solutions of the problems
of maximizing $a \cdot x$ with $g(x) \ge 0$ and $x \ge 0$ ($a \in R^n$, $a > 0$), when $a$ varies, trace
the points on the transformation locus. (The points on the transformation locus
are the solutions of the above vector maximum problem.) The vector $a$ may be
interpreted as a "price vector."
7. The quadratic form Q(x) in each of the following statements may be replaced by
the symmetric matrix A.
8. If the second partial derivatives of f exist and are continuous for all x in the
domain, then f is called twice continuously differentiable (as remarked in Section C).
In this case, the Hessian matrix f"(x) is symmetric for all x in the domain.
9. Needless to say, a matrix can be negative definite without each element of A being
negative. Conversely, A may not be negative definite, even if each element of A is
negative.
10. There seems to be a confusion among economists on this point. For example, K.
Lancaster writes, "If f (x) is strictly convex (concave), its Hessian is positive (nega-
tive) definite." (See his Mathematical Economics, New York, Macmillan, 1968,
p. 333.) This statement is wrong in view of the above counterexample.
11. If a + 1, F is no longer strictly concave, although it is strictly quasi-concave. In
general, if $f(x)$ on $x \in X$, a convex subset of $R^n$, is linear homogeneous, it cannot
be strictly concave. The strict concavity of $f$ requires $f[(x + y)/2] > f(x)/2 + f(y)/2$
for all $x \ne y$ in $X$. But this is impossible under the linear homogeneity of $f$, if $y$ is a
scalar multiple of $x$ (say $y = ax$, for some $a \in R$). To see this, observe $f[(x + y)/2]$
$= (1 + a) f(x)/2 = f(x)/2 + a f(x)/2 = f(x)/2 + f(y)/2$.
12. There seems to be a confusion among economists between the second-order
necessary condition and the second-order sufficient condition. For example, Hicks writes,
"In order that u should be a true maximum, it is necessary to have not only du = 0 ...
but also $d^2u < 0$," (Value and Capital, 2nd ed., p. 306). Consider the problem of
maximizing $f(x) = -(x - 1)^4$, $x \in R$. Clearly $f$ reaches its maximum at $x = 1$. Note
that $f''(1) = 0$. In other words, $f''(1) < 0$ is by no means necessary for a maximum.
13. Assume $m + 1 < n$.
14. Recall that Theorem 1.D.7 has already established that (LM) and (R) imply (FOC).
This is the first statement of (i) of the present theorem.
15. See, for example, chapters 2, 3, 4, and 5 of Samuelson [16]. See also Appendix to
Section F of this chapter for a complete summary of the local maximization theory
and its applications to the comparative statics problem.

REFERENCES
1. Arrow, K. J., and Enthoven, A. C., "Quasi-Concave Programming," Econometrica,
29, October 1961.
2. Arrow, K. J., Hurwicz, L., and Uzawa, H., "Constraint Qualifications in Maximiza-
tion Problems," Naval Research Logistics Quarterly, vol. 8, no. 2, June 1961.
3. Debreu, G., "Definite and Semidefinite Quadratic Forms," Econometrica, 20, April
1952.
4. El-Hodiri, M. A., Constrained Extrema: Introduction to the Differentiable Case with
Economic Applications, Berlin, Springer-Verlag, 1971 (originally "Constrained
Extrema of Functions of a Finite Number of Variables: Review and Generalizations,"
Krannert Institute Paper, No. 141, Purdue University, 1966).
5. El-Hodiri, M. A., "The Karush Characterization of Constrained Extrema of Functions of
a Finite Number of Variables," UAR Ministry of Treasury, Research Memoranda,
series A, no. 3, July 1967.
6. Fenchel, W., Convex Cones, Sets, and Functions, Princeton, N.J., Princeton University,
1953 (hectographed).
7. Gantmacher, F. R., The Theory of Matrices, Vol. 1, New York, Chelsea Publishing
Co., 1959, esp. chap. X (tr. from Russian).
8. Hadley, G., Linear Algebra, Reading, Mass., Addison-Wesley, 1961, esp. chap 7.
9. Hestenes, M. R., Calculus of Variations and Optimal Control Theory, New York,
Wiley, 1966.
10. Hurwicz, L., "Programming in Linear Spaces," in Studies in Linear and Non-linear
Programming, ed. by Arrow, Hurwicz, and Uzawa, Stanford, Calif., Stanford
University Press, 1958.
11. Karlin, S., Mathematical Methods and Theory in Games, Programming, and Economics,
Vol. 1, Reading, Mass., Addison-Wesley, 1959, esp. pp. 216-218 and appendices
A and B.
12. Kuhn, H. W., and Tucker, A. W., "Non-linear Programming," in Proceedings of
the Second Berkeley Symposium on Mathematical Statistics and Probability, ed. by J.
Neyman, Berkeley, Calif., University of California Press, 1951.
13. Marcus, M., and Minc, H., A Survey of Matrix Theory and Matrix Inequalities, Boston,
Allyn and Bacon, 1964, esp. part II.
14. Martos, B., "The Direct Power of Adjacent Vertex Programming Methods,"
Management Science, Series A, 12, November 1965.
15. Ponstein, J., "Seven Kinds of Convexity," SIAM Review, vol. 9, January 1967.
16. Samuelson, P. A., Foundations of Economic Analysis, Cambridge, Mass., Harvard
University Press, 1947, esp. appendix A.

Section F
SOME APPLICATIONS

In this section we give applications of some of the theorems established
in the previous sections of this chapter. This will indicate the practical importance
of these theorems and at the same time enhance our understanding of them. First
we derive two important theorems in linear programming as simple corollaries
of the theorems established thus far. Then we illustrate the applications to
consumer's choice, theory of production, activity analysis, and the Ricardo-Mill
problem in the theory of international trade.

a. LINEAR PROGRAMMING
Probably the most fundamental relation in the theory of linear programming
is the dual relation. The dual relation is concerned with the following
two types of problems, each one of which is called the "dual problem" of the
other.
(MLP) Maximize: $p \cdot x$ (over $x \in R^n$)
Subject to: $A \cdot x \le r$ and $x \ge 0$

where $p$ is a given vector in $R^n$, $r$ is a given vector in $R^m$, and $A$ is an $m \times n$
matrix.
(mLP) Minimize: $w \cdot r$ (over $w \in R^m$)
Subject to: $A' \cdot w \ge p$ and $w \ge 0$
where $A'$ is the transpose of $A$.
We may recall here the convention of multiplying a vector by a matrix or
by another vector. We do not make a distinction in our notation between a "row"
or a "column" vector, assuming that the reader will be able to tell the difference
by use. The dot between a matrix (or a vector) and a vector indicates multiplica-
tion of the matrix (or the vector) with the vector. The reader should ensure that
he knows the result of this product. In the above two problems, one should note
the following:

(i) The evaluating vector $p$ in (MLP) appears in the constraint of (mLP), and the
evaluating vector $r$ in (mLP) appears in the constraint of (MLP).
(ii) The constraint matrices are each other's transpose.
(iii) Except for the nonnegativity condition, the inequality in the constraint is
reversed.

We now prove two important theorems in connection with dual problems:
the duality theorem and the Goldman-Tucker theorem. These theorems were
originally proved without using the theory of nonlinear programming, and, in fact,
played an important role in the development of the theory of nonlinear pro-
gramming. Here we prove them using the theorems established in the previous
sections.

Theorem 1.F.1 (LP duality theorem):

(i) There exists an optimal solution $\hat{x}$ for (MLP) if and only if there exists an optimal
solution $\hat{w}$ for (mLP).
(ii) The vectors $\hat{x} \ge 0$ and $\hat{w} \ge 0$ satisfy $A \cdot \hat{x} \le r$, $A' \cdot \hat{w} \ge p$, and $\hat{w} \cdot (r - A \cdot \hat{x}) =
\hat{x} \cdot (A' \cdot \hat{w} - p) = 0$ if and only if $\hat{x}$ is optimal for (MLP) and $\hat{w}$ is optimal for
(mLP) (moreover, in this case, $p \cdot \hat{x} = r \cdot \hat{w}$).

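PROOF: (i) Suppose $\hat{x}$ is optimal for (MLP). Then by Theorem 1.D.5, there
exists $\hat{w} \ge 0$ such that $(\hat{x}, \hat{w})$ is a saddle point of $\Phi(x, w) = p \cdot x + w \cdot (r - A \cdot x)$,
that is,
(1) $\Phi(x, \hat{w}) \le \Phi(\hat{x}, \hat{w}) \le \Phi(\hat{x}, w)$ for all $x \ge 0$, $w \ge 0$
Define
$\Psi(w, x) \equiv -\Phi(x, w) = -w \cdot r + x \cdot (A' \cdot w - p)$
Then from (1) we have
$\Psi(w, \hat{x}) \le \Psi(\hat{w}, \hat{x}) \le \Psi(\hat{w}, x)$ for all $x \ge 0$, $w \ge 0$
Hence from the corollary of the Arrow-Hurwicz-Uzawa theorem, $\hat{w}$ maximizes
$-w \cdot r$ subject to $w \ge 0$, $A' \cdot w \ge p$; that is, $\hat{w}$ is optimal for (mLP).
Conversely, if $\hat{w}$ is optimal for (mLP), then, proceeding as before, there
exists an $\hat{x} \ge 0$ such that $\Psi(w, x)$ has a saddle point at $(\hat{w}, \hat{x})$. This in turn
implies $\Phi(x, w)$ has a saddle point at $(\hat{x}, \hat{w})$, which implies $\hat{x}$ is optimal for
(MLP).
(ii) If $\hat{x}$ and $\hat{w}$ are optimal solutions of (MLP) and (mLP), respectively,
then by the reasoning of part (i), $(\hat{x}, \hat{w})$ is a saddle point of $\Phi(x, w)$, and
$(\hat{w}, \hat{x})$ is a saddle point of $\Psi(w, x)$. But then by Theorem 1.B.4 (ii),
$\hat{w} \cdot (r - A \cdot \hat{x}) = 0$ and $\hat{x} \cdot (A' \cdot \hat{w} - p) = 0$
Moreover, since $\Phi(\hat{x}, \hat{w}) = -\Psi(\hat{w}, \hat{x})$, we then have $p \cdot \hat{x} = \hat{w} \cdot r$, which
verifies the parenthetical remark in (ii).
Conversely, suppose that there exist $\hat{x} \ge 0$ and $\hat{w} \ge 0$ such that
(2) $\hat{w} \cdot (r - A \cdot \hat{x}) = 0$, $\hat{x} \cdot (A' \cdot \hat{w} - p) = 0$
and
(3) $A \cdot \hat{x} \le r$, $A' \cdot \hat{w} \ge p$
Then if $x \ge 0$ and $w \ge 0$ we have, using (3), (2), and (3) in turn,
$\Phi(x, \hat{w}) = p \cdot x + \hat{w} \cdot (r - A \cdot x) = \hat{w} \cdot r + x \cdot (p - A' \cdot \hat{w}) \le \hat{w} \cdot r$
$= \hat{w} \cdot r - \hat{x} \cdot (A' \cdot \hat{w} - p) = \Phi(\hat{x}, \hat{w}) = p \cdot \hat{x} + \hat{w} \cdot (r - A \cdot \hat{x})$
$\le p \cdot \hat{x} + w \cdot (r - A \cdot \hat{x}) = \Phi(\hat{x}, w)$
Hence, again by Theorem 1.D.5, it follows that $\hat{x}$ is optimal for (MLP). The
fact that $\hat{w}$ is optimal for (mLP) then follows from (i). (Q.E.D.)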
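
Part (ii) of the theorem can be verified directly on a small numerical example. The Python sketch below is an illustration only: the function names and the data ($A$, $p$, $r$, and the candidate pair $\hat{x} = (2, 2)$, $\hat{w} = (1, 1)$) are our own assumptions, not taken from the text. It checks feasibility of both problems, the two complementary slackness products, and the equality $p \cdot \hat{x} = r \cdot \hat{w}$.

```python
# Illustrative check of the conditions in part (ii) of the LP duality theorem.
# All names and numbers below are hypothetical.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(matrix, vector):
    return [dot(row, vector) for row in matrix]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def check_duality(A, p, r, x_hat, w_hat):
    """Evaluate feasibility, complementary slackness, and both objective values."""
    Ax = mat_vec(A, x_hat)
    Atw = mat_vec(transpose(A), w_hat)
    return {
        "primal_feasible": all(x >= 0 for x in x_hat)
        and all(lhs <= rhs for lhs, rhs in zip(Ax, r)),
        "dual_feasible": all(w >= 0 for w in w_hat)
        and all(lhs >= rhs for lhs, rhs in zip(Atw, p)),
        "w_slack": dot(w_hat, [rhs - lhs for lhs, rhs in zip(Ax, r)]),   # w.(r - Ax)
        "x_slack": dot(x_hat, [lhs - rhs for lhs, rhs in zip(Atw, p)]),  # x.(A'w - p)
        "primal_value": dot(p, x_hat),
        "dual_value": dot(r, w_hat),
    }

# (MLP): maximize 3*x1 + 2*x2 subject to x1 + x2 <= 4, 2*x1 + x2 <= 6, x >= 0.
A = [[1, 1], [2, 1]]
p = [3, 2]
r = [4, 6]
report = check_duality(A, p, r, x_hat=[2, 2], w_hat=[1, 1])
print(report)
```

Both slackness products vanish and the two objective values coincide (both equal 10), so by the theorem the pair is optimal for (MLP) and (mLP).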

REMARK: There are a variety of proofs of the LP duality theorem which
do not rely on the theory of nonlinear programming. For these, see any
standard textbook on linear programming. Two proofs that are among the
simplest and most interesting are one by Nikaido [ 12], which proves the LP
duality theorem as an application of the Minkowski-Farkas lemma, and
the other by Dantzig ([4], pp. 129-134), which proves the theorem as an
application of the LP simplex method. Our proof will enhance the reader's
understanding of the nonlinear programming theory developed in this
chapter.
In the course of the proof of the duality theorem, we also proved the following
theorem.

Theorem 1.F.2 (Goldman-Tucker): There exist optimal solutions for (MLP) and
(mLP), denoted by $\hat{x}$ and $\hat{w}$ respectively, if and only if there exists $(\hat{x}, \hat{w})$, which is a saddle
point of $\Phi(x, w) = p \cdot x + w \cdot (r - A \cdot x)$; that is,
$\Phi(x, \hat{w}) \le \Phi(\hat{x}, \hat{w}) \le \Phi(\hat{x}, w)$ for all $x \ge 0$, and all $w \ge 0$
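
The saddle-point inequality can be spot-checked on a finite grid. The following Python sketch is an illustration under our own assumptions: the data, the candidate pair $\hat{x} = (4, 0)$, $\hat{w} = (3, 0)$, and the function names are hypothetical, and a grid check is evidence rather than a proof that the inequality holds for all $x \ge 0$ and $w \ge 0$.

```python
# Illustrative grid check of Phi(x, w_hat) <= Phi(x_hat, w_hat) <= Phi(x_hat, w).
# All names and numbers below are hypothetical.
from itertools import product

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def lagrangian(x, w, A, p, r):
    """Phi(x, w) = p.x + w.(r - A.x)."""
    residual = [r_i - dot(row, x) for row, r_i in zip(A, r)]
    return dot(p, x) + dot(w, residual)

def is_saddle_on_grid(x_hat, w_hat, A, p, r, grid):
    centre = lagrangian(x_hat, w_hat, A, p, r)
    for x in product(grid, repeat=len(p)):
        if lagrangian(x, w_hat, A, p, r) > centre:
            return False  # some x >= 0 improves on x_hat
    for w in product(grid, repeat=len(r)):
        if lagrangian(x_hat, w, A, p, r) < centre:
            return False  # some w >= 0 lowers Phi below the centre value
    return True

# (MLP): maximize 3*x1 + 2*x2 subject to x1 + x2 <= 4, x1 + 3*x2 <= 6, x >= 0.
A = [[1, 1], [1, 3]]
p = [3, 2]
r = [4, 6]
grid = range(0, 11)
print(is_saddle_on_grid((4, 0), (3, 0), A, p, r, grid))
print(is_saddle_on_grid((3, 1), (3, 0), A, p, r, grid))
```

Here $\Phi(x, \hat{w}) = 12 - x_2$ and $\Phi(\hat{x}, w) = 12 + 2w_2$, so the first pair passes the check, while the feasible but non-optimal point $(3, 1)$ does not.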
REMARK: For the original proof, which obviously does not rely on nonlinear
programming theory, see Goldman and Tucker [8], especially
theorem 6, pp. 77-78. They obtained this theorem from the LP duality
theorem.
REMARK : In Figure 1.17 we illustrate schematically the logical structure of
some of the important theorems established so far.
REMARK: We proved the fundamental theorem of activity analysis
(Theorem 0.C.3) by using the separation theorem. As we will see later, this
theorem can also be proved by an extension of the concave programming
theorem. The proof of the Minkowski-Farkas lemma by utilizing the LP
duality theorem and the proof of the LP duality theorem by utilizing the
Minkowski-Farkas lemma are not too difficult and will be interesting exer-
cises for the reader.

[Figure 1.17. Logical Structure of Some Important Theorems. The diagram relates the
separation theorems, the fundamental theorem of concave functions, the
Minkowski-Farkas lemma, the LP duality theorem, concave programming, the
A-H-U theorem, and Kuhn-Tucker's main theorem.]


b. CONSUMPTION THEORY
In the classical theory of consumer's choice as explained in Hicks [10],
for example, a consumer is supposed to maximize his satisfaction over the budget
set. Let $x \in R^n$ be his n-commodity consumption bundle and $u(x)$, a real-valued
function defined over $R^n$, be his utility function. Suppose that this consumer is a
"competitive consumer" so that he cannot influence the prices of the commodities
in the market. Then if a price vector $p$ prevails, his budget constraint can be
expressed as $p \cdot x \le M$ (with $x \ge 0$) where $M$ is his income. Although the
nonnegativity condition $x \ge 0$ is not mentioned in Hicks, it is implied in the context.
Hicks wrote the budget constraint in the form of the equality $p \cdot x = M$. This
means that the consumer must spend all his income. We will use the inequality
constraint $p \cdot x \le M$ instead, allowing the consumer the possibility of not spending
all his income. Later we will find a condition under which this constraint becomes
effective (that is, he spends all his income).
We can now write the problem for each consumer as follows:
Maximize: $u(x)$ (over $x \in R^n$)
Subject to: $p \cdot x \le M$ and $x \ge 0$
Following the classical analysis, we assume that $u(x)$ is differentiable¹ everywhere.
Hence we can use the theory developed in Sections D and E. Since the constraints
are linear, owing to the (A-H-U) theorem (Theorem 1.D.4), the (QSP') condition
is a necessary condition for global maximality. In other words, if $\hat{x}$ is a solution of
the above problem, then there exists a $\lambda \in R$ such that
(4) $\hat{u}_{x_i} - \lambda p_i \le 0$, $i = 1, 2, \ldots, n$
(5) $\hat{x} \cdot (\hat{u}_x - \lambda p) = 0$
(6) $\lambda (M - p \cdot \hat{x}) = 0$, $\lambda \ge 0$
(7) $\hat{x} \ge 0$
(QSP')
Here $\hat{u}_x = u'(\hat{x})$, $\hat{u}_{x_i} = \partial u/\partial x_i$ (evaluated at $x = \hat{x}$), and $\hat{u}_x = (\hat{u}_{x_1}, \ldots, \hat{u}_{x_n})$.
Conversely, assuming $u(x)$ is a concave function, if there exist $\hat{x}$ and $\lambda$, both
nonnegative, such that the above (QSP') condition holds, then, owing to Theorem
1.D.2, $\hat{x}$ is a solution of the above constrained maximum problem for the
consumer. In other words, under the concavity of $u(x)$, the above (QSP') becomes a
necessary and sufficient condition for $\hat{x}$ to furnish a global maximum of the above
constrained maximum problem (see Theorem 1.D.5). Hence our attention will
be shifted to finding the values of $\hat{x}$ and $\lambda$ which satisfy the above (QSP'). If $u(x)$ is
not concave but rather quasi-concave, then we need an additional assumption. In
particular, we assume
(A-c) $\hat{u}_{x_i} > 0$ for some relevant variable $x_i$ (that is, positive "marginal utility"
for some relevant variable).²
Then, applying the Arrow-Enthoven theorem (Theorem 1.E.2), we can again
conclude that (QSP') provides a necessary as well as sufficient condition for the
above constrained maximum problem.
With these remarks, we now shift our attention to (QSP'). First observe that
conditions (6) and (7) of (QSP') mean
(8) $\lambda > 0$ implies $M - p \cdot \hat{x} = 0$
Since $\hat{u}_{x_i} \le \lambda p_i$ for all $i$ from condition (4) above, $\hat{u}_{x_i} > 0$ (positive marginal utility)
for some commodity $i$ is consistent only with $\lambda > 0$ and a positive price of that
commodity ($p_i > 0$). We may recall that $\hat{u}_{x_i} > 0$ for some relevant $x_i$ is assumed
[(A-c)] when we adopt quasi-concave programming. In other words, if we
assume that there exists at least one commodity in which the consumer is never
satiated, then $\lambda > 0$ so that $M = p \cdot \hat{x}$ [resulting from relations (6) and (7)]. This
means that the nonsatiation assumption is a crucial assumption in the
sense that it guarantees that all the income is spent. Thus we have revealed one
crucial assumption which underlies the Hicksian equality constraint $M = p \cdot x$.
Next note that conditions (4) and (5) of the above (QSP') mean
(9) $\hat{x}_i (\hat{u}_{x_i} - \lambda p_i) = 0$, $i = 1, 2, \ldots, n$
Hence if we assume an interior solution for all $i$ (that is, $\hat{x}_i > 0$ for all $i$), then we
obtain
(10) $\hat{u}_{x_i} = \lambda p_i$, $i = 1, 2, \ldots, n$
Note that this interior solution assumption is usually made implicitly in the classical
analysis, as it is explained in Hicks, for example. In general, this assumption does
not necessarily hold. It is quite possible that $\hat{x}_i = 0$ for some $i$. A typical situation
is illustrated in Figure 1.18.
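
A corner solution of this kind can be exhibited numerically. The Python sketch below is an illustration under our own assumptions and does not come from the text: the utility function $u(x) = \ln(1 + x_1) + \ln(1 + x_2)$ (concave and differentiable for $x \ge 0$), the prices $p = (1, 3)$, the income $M = 1$, and all function names are hypothetical. It tests conditions (4) through (7) at the candidate $\hat{x} = (1, 0)$ with $\lambda = 1/2$.

```python
# Illustrative check of the (QSP') conditions (4)-(7) at a corner solution.
# The utility function, prices, income, and names are hypothetical.
import math

def utility(x):
    return sum(math.log(1.0 + x_i) for x_i in x)

def marginal_utility(x):
    return [1.0 / (1.0 + x_i) for x_i in x]

def satisfies_qsp(x, lam, p, M, tol=1e-9):
    grad = marginal_utility(x)
    spending = sum(p_i * x_i for p_i, x_i in zip(p, x))
    cond4 = all(g - lam * p_i <= tol for g, p_i in zip(grad, p))
    cond5 = abs(sum(x_i * (g - lam * p_i) for x_i, g, p_i in zip(x, grad, p))) <= tol
    cond6 = lam >= 0 and spending <= M + tol and abs(lam * (M - spending)) <= tol
    cond7 = all(x_i >= 0 for x_i in x)
    return cond4 and cond5 and cond6 and cond7

p = [1.0, 3.0]
M = 1.0
print(satisfies_qsp([1.0, 0.0], 0.5, p, M))        # corner candidate
print(satisfies_qsp([0.0, 1.0 / 3.0], 0.25, p, M))  # spends all income on good 2
```

At $\hat{x} = (1, 0)$ the marginal utility of the second good is 1, which falls short of $\lambda p_2 = 1.5$, so condition (4) holds with strict inequality for that good and the good is not consumed. Since $u$ is concave, (QSP') is also sufficient here.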
Following the classical analysis, we now proceed with the interior solution
assumption (that is, $\hat{x}_i > 0$ for all $i$). By relation (10), if $p_i \ne 0$ for some $i$, then $\lambda > 0$.
In other words, under the assumption that the consumer consumes a positive
amount of every commodity, $p_i \ne 0$ for some $i$ guarantees $\lambda > 0$. Then we have
$M = p \cdot \hat{x}$ [from (8)]. This and equation (10) provide $n + 1$ equations which are
available to determine $(n + 1)$ variables, that is, $\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_n$, and $\lambda$. This is the
classical procedure as explained in Hicks [10] (especially the mathematical
appendix). Assuming that there exist "equilibrium" values of these variables, we
can then go to comparative statics and obtain the "fundamental equation of value
theory" (or the "Hicks-Slutsky equation").³ We may note that these equilibrium
values of the $\hat{x}_i$'s will furnish a global maximum for the above constrained
maximum problem, which clearly has important implications in comparative statics.

Figure 1.18. An Illustration of Corner Solution.
We may note here that (10) implies that $\hat{u}_{x_i} > 0$ if and only if $p_i > 0$ (with
$\lambda > 0$). This means of course that if the consumer decides to consume a positive
amount of some commodity (which enters into his original choice problem), then
the positive price of a particular commodity implies a positive marginal utility
for that commodity. Note that this allows the possibility of a negative price
for a certain commodity. If the price of a certain commodity is negative, the
marginal utility of this commodity must be negative. Conversely, if a certain
commodity has a negative marginal utility (thus is "undesired") for every
consumer, then it must have a negative price. To pursue this converse problem
more precisely, we would have to construct a model of the economy into which
all the relevant consumers and producers are included. We will not pursue this
problem here.
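Next we note the bordered Hessian condition as discussed in Hicks. Hicks
pointed out the following second-order condition as a sufficient condition for
(local) maximality of the constrained maximum problem: $(-1)^k B_k > 0$,
$k = 1, 2, 3, \ldots, n$, at a solution $\hat{x}$ of the problem, where

$$B_k \equiv \begin{vmatrix} 0 & u_1 & u_2 & \cdots & u_k \\ u_1 & u_{11} & u_{12} & \cdots & u_{1k} \\ u_2 & u_{21} & u_{22} & \cdots & u_{2k} \\ \vdots & \vdots & \vdots & & \vdots \\ u_k & u_{k1} & u_{k2} & \cdots & u_{kk} \end{vmatrix}$$

where $u_i \equiv \partial u/\partial x_i$ and $u_{ij} \equiv \partial^2 u/\partial x_i \partial x_j$ (all the partial derivatives are evaluated
at $x = \hat{x}$).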
By Theorem 1.E.14, if the above condition holds for all x, then u(x) is quasi-
concave. In other words, quasi-concave programming enables us to dispense with
the above bordered Hessian condition and provides us with a global maximum
(instead of a local maximum). Note that if u is strictly quasi-concave, we would
have a unique global maximum. Needless to say, the quasi-concavity of the utility
function (that is, the convex-to-the-origin indifference curves) is more intuitively
appealing than to say that the utility function satisfies the bordered Hessian
condition.⁴
If u(x) is concave, then the above bordered Hessian condition is replaced by
the stronger Hessian condition, as discussed in Section E. Strict concavity will give
a unique solution.
This finishes our critical review of the classical theory of consumer's choice
in terms of (quasi-) concave programming. The following points were made
explicit:

(i) The formulation of the problem as a modern (quasi-) concave
programming problem.
(ii) The assumption which guarantees the equality $M = p \cdot \hat{x}$ (that is, the
nonsatiation assumption).
(iii) The possibility of a corner solution.
(iv) The possibility of negative prices.
(v) The relation between the second-order condition and quasi-concave pro-
gramming theory.

Finally, we should stress that the theory of concave (or quasi-concave) program-
ming provides a global characterization of the problem. The classical treatment
in terms of the Euler-Lagrange necessary conditions and the Hessian (or bordered
Hessian) condition (as utilized by Hicks and so on) only provides a local character-
ization; that is, it is concerned with the properties in some (possibly very small)
neighborhood of a solution point, and there may be many solution points, each
giving a different value for maximal utility.

c. PRODUCTION THEORY
The production activity of an economy is concerned with transforming one
set of commodities, called "inputs," denoted by a vector $v = (v_1, v_2, \ldots, v_m)$,
into another set of commodities called "outputs," denoted by a vector
$x = (x_1, x_2, \ldots, x_n)$. In activity analysis, inputs were denoted by negative numbers, outputs
were denoted by positive numbers, and we called a vector $y = (-v, x)$ an "activity
vector" (after normalization with respect to a certain commodity, to define the
"activity level"). Then we considered the set of these $y$'s, $Y$, and called it the
"production set." We now wish to describe this set by a functional relation in order
to obtain an application of the theory established in this chapter. By the explicit
introduction of such a functional relation, our analysis will also serve as a
critical review of an important part of the classical production theory (as explained
in Hicks [10] and Carlson [2]). In the following analysis, we denote inputs
(say, $v_j$) by positive numbers (instead of negative numbers).
The functional relation that describes a production set can be written as
$F(v, x) \ge 0$
We assume $v \in R^m$ and $x \in R^n$ with $v \ge 0$ and $x \ge 0$. In the case where $v$ and
$x$ are real numbers, we may illustrate the above relation as in Figure 1.19. Here
the shaded area illustrates the values of $(v, x)$ which satisfy the above functional
relation.
We note that if $F(v, x) = f(v) - x$, the relation $x = f(v)$ can be obtained by
solving $F(v, x) = 0$. This relation $x = f(v)$ or $F(v, x) = 0$ (where $v \in R^m$, $x \in R^n$
with $v \ge 0$, $x \ge 0$) is the familiar production function in the traditional analysis. We
may call such a surface a production frontier. In the functional relation $F(v, x) \ge 0$,
we allowed the possibility of points which are not on the surface $F(v, x) = 0$.
This is illustrated in Figure 1.19. Points such as A are on the curve defined by
$F(v, x) = 0$ [or $x = f(v)$]. However, we also allow the possibility of points such
as B. Such points are allowed for either (or both) of the following two reasons.
(i) We allow the possibility of production processes that are technically inferior to
some other processes. In other words, we do not assume the existence of an
"efficient" manager.
(ii) We assume free disposability of commodities so that some inputs and outputs
can be thrown away in the process. This can happen, for example, if some
commodities (either inputs or outputs) become "free" due to an excess supply of
those commodities in the economy.

Figure 1.19. An Illustration of Production Set.
Now let $p = (p_1, p_2, \ldots, p_n)$ be the price vector for outputs and
$w = (w_1, w_2, \ldots, w_m)$ be the price vector for inputs. Then the profit which can be obtained
by transforming $v$ into $x$ may be written as
$\pi = p \cdot x - w \cdot v$
Suppose that the "producer" is "competitive" so that he cannot affect the level of
prices, $p$ and $w$, that prevails in the market. Suppose further that his behavioral rule
is profit maximization (for otherwise he will sooner or later be ruled out of a
typically "competitive" market). Then his problem is the following nonlinear
programming problem.
Maximize: $\pi = p \cdot x - w \cdot v$ (over $(v, x)$)
Subject to: $F(v, x) \ge 0$ and $v \ge 0$, $x \ge 0$

First, let us suppose that $F(v, x)$ is a differentiable concave function with respect to
$v$ and $x$, and that the following Slater's condition holds.
(S) There exist $\bar{v} \ge 0$, $\bar{x} \ge 0$ such that $F(\bar{v}, \bar{x}) > 0$.
This condition (S) can be accepted without much difficulty. Essentially it allows
an interior point in the production set which is possible as a result of either (i) or (ii)
above. Under these assumptions, we can apply Theorem 1.D.2. In other words,
$(\hat{v}, \hat{x}) \ge 0$ is a solution of the above problem if and only if there exists a scalar $\lambda \ge 0$
such that
(11) $p_i + \lambda F_{x_i} \le 0$, $i = 1, 2, \ldots, n$
(12) $-w_j + \lambda F_{v_j} \le 0$, $j = 1, 2, \ldots, m$
(13) $(p + \lambda F_x) \cdot \hat{x} + (-w + \lambda F_v) \cdot \hat{v} = 0$
(14) $F(\hat{v}, \hat{x}) \ge 0$, $\lambda F(\hat{v}, \hat{x}) = 0$
(QSP')
where $F_{x_i} = \partial F/\partial x_i$, $F_{v_j} = \partial F/\partial v_j$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$ [each evaluated
at $(v, x) = (\hat{v}, \hat{x})$], and $F_x = (F_{x_1}, \ldots, F_{x_n})$, $F_v = (F_{v_1}, F_{v_2}, \ldots, F_{v_m})$.
By conditions (11) and (12), condition (13) of the above (QSP') is equivalent
to
(15) $\hat{x}_i (p_i + \lambda F_{x_i}) = 0$, $i = 1, 2, \ldots, n$
(16) $\hat{v}_j (-w_j + \lambda F_{v_j}) = 0$, $j = 1, 2, \ldots, m$
We assume that at least one output (say, $i_0$) is produced (that is, $\hat{x}_{i_0} > 0$) or at least
one input (say, $j_0$) is used in production (that is, $\hat{v}_{j_0} > 0$). Then from (15) or (16)
(17) $\lambda > 0$
as long as $p_{i_0} > 0$ or $w_{j_0} > 0$.
Then from (14), we have $F(\hat{v}, \hat{x}) = 0$. In other words, under the above
assumption of $\hat{x}_{i_0} > 0$ or $\hat{v}_{j_0} > 0$ (for some $i_0$ or $j_0$), production will take place on the
production frontier if and only if the producer maximizes his profit. This
corresponds to the fundamental theorems of activity analysis (Theorems 0.C.2 and
0.C.3).
If we assume an interior solution for every output and input (that is, $\hat{x}_i > 0$,
$\hat{v}_j > 0$ for all $i$ and $j$), as in Hicks [10], conditions (11) and (12) can be rewritten as
follows:
(11') $p_i = -\lambda F_{x_i}$, $i = 1, 2, \ldots, n$
(12') $w_j = \lambda F_{v_j}$, $j = 1, 2, \ldots, m$
Under the assumption of an interior solution, we have $\lambda > 0$, as noted above,
which in turn implies $F(\hat{v}, \hat{x}) = 0$. Combining this equation with conditions (11')
and (12'), we obtain $(n + m + 1)$ equations, which, in turn, would presumably
determine $(n + m + 1)$ variables, that is, $\lambda$, the $\hat{x}_i$'s, and the $\hat{v}_j$'s as functions of the
$p_i$'s and $w_j$'s. By changing the values of the $p_i$'s and $w_j$'s, we get a comparative statics
analysis which will lead to Hicks's fundamental equation. In the above analysis,
the fact that (QSP') is necessary and sufficient for a (global) maximum depends on
our assumption that $F(v, x)$ is a concave function. The function $F(v, x)$ is concave if
and only if the following Hessian condition holds (assuming that $F$ is twice
differentiable). (See Theorem 1.E.13.)
$D_1 \le 0$, $D_2 \ge 0$, $\ldots$, $(-1)^{m+n} D_{m+n} \ge 0$

where $D_k$ is defined as in Section E.
If $F$ is quasi-concave instead of concave, we can still say that the above
(QSP') is sufficient for the optimality of $(\hat{v}, \hat{x})$ since the maximand function is linear
and hence concave [condition (iv) of the Arrow-Enthoven theorem, Theorem
1.E.2]. Moreover, (QSP') is also necessary for the optimality if a certain constraint
qualification is satisfied. The constraint qualifications which require neither the
concavity nor the convexity of the constraint functions are provided, for example,
in (KTCQ) and the Arrow-Hurwicz-Uzawa theorem [conditions (iv) and (v) of
Theorem 1.D.4 or condition (ii) of Theorem 1.E.3]. For example, assuming that
there exist $\bar{v} \ge 0$, $\bar{x} \ge 0$, with $F(\bar{v}, \bar{x}) > 0$, (QSP') is necessary for $(\hat{v}, \hat{x})$ to furnish
a maximum, if⁵
(18) $F_x \ne 0$ or $F_v \ne 0$

If we have an interior solution for some $i$ or some $j$, then (11') and (12') imply (18).
Hence, assuming an interior solution, (QSP') becomes necessary and sufficient for
an optimum under the quasi-concavity of the function $F$. The quasi-concavity of $F$
can be characterized in terms of the bordered Hessian conditions, which
corresponds to Hicks's discussion of the topic ([10], p. 320). A condition that is
alternative to (18) can be obtained by utilizing the rank condition (Theorem 1.D.4). The
rank condition for the present problem is stronger than (18). Even with $\hat{x} > 0$,
$\hat{v} > 0$, we may, for example, require⁶
(18') $F_x \ne 0$ and $F_v \ne 0$
Again under the quasi-concavity of $F$, (QSP') provides a set of necessary and
sufficient conditions for an optimum.⁷
The quasi-concavity of $F(v, x)$ implies that the following bordered Hessian
condition holds (assuming that $F$ is twice differentiable).
(19) $B_1 \le 0$, $B_2 \ge 0$, $\ldots$, $(-1)^{m+n} B_{m+n} \ge 0$
where $B_k$ is defined as in Section E (Theorem 1.E.14). We again emphasize that
the concept of concavity or quasi-concavity is more intuitively appealing than
the Hessian or the bordered Hessian conditions.
In order to understand further the meaning of the above (QSP') condition,
we now assume that there is only one output in this production, so that $x$ and $p$
are now scalars. We also assume that $F(v, x) \ge 0$ can be written as $f(v) \ge x$. Then
our (QSP') condition can be rewritten as follows (here $f_j$ denotes $\partial f/\partial v_j$ evaluated
at $\hat{v}$, and $f_v = (f_1, \ldots, f_m)$):
(20) $p - \lambda \le 0$
(21) $-w_j + \lambda f_j \le 0$, $j = 1, 2, \ldots, m$
(22) $(p - \lambda)\hat{x} + \hat{v} \cdot (-w + \lambda f_v) = 0$
(23) $\lambda [f(\hat{v}) - \hat{x}] = 0$, $f(\hat{v}) - \hat{x} \ge 0$, $\hat{x} \ge 0$, $\hat{v} \ge 0$
Conditions (20), (21), and (22) imply that

(24) $(p - \lambda)\hat{x} = 0$
and

(25) $\hat{v}_j (-w_j + \lambda f_j) = 0$, $j = 1, 2, \ldots, m$

Assume $p > 0$; then $\lambda > 0$ as a result of (20). Then, in view of (23), $\hat{x} = f(\hat{v})$, which
means that the production takes place on the production frontier at an optimum.
Note that this does not preclude the possibility of $\hat{x} = 0$. In other words, it is
possible that zero output is optimum. Assume that $\hat{x} > 0$, for otherwise it would not be
of interest to discuss the problem. Then from (24), $p = \lambda$. In other words, the
Lagrangian multiplier $\lambda$ for the problem is equal to the price of the output $p$ which
is given to the producer. Thus the set of conditions (20), (21), (22), and (23) are
rewritten as follows:
(26) $p f_j \le w_j$, $j = 1, 2, \ldots, m$
(27) $\hat{v}_j (w_j - p f_j) = 0$, $j = 1, 2, \ldots, m$
(28) $f(\hat{v}) = \hat{x}$
Conditions (26), (27), and (28) provide the necessary and sufficient conditions
for $(\hat{x}, \hat{v})$ to be optimal under the quasi-concavity of $[f(v) - x]$ or the concavity of
$f(v)$, given the proper additional conditions discussed above.
Condition (26) says that the value of the jth factor's marginal product cannot
exceed the price of the jth factor. Condition (27) says that if the price of the jth
factor exceeds the value of its marginal product, then this jth factor will not be used
in the profit maximizing activity. Condition (28) says that only profit
maximization is compatible with "efficient" production (production on the frontier). We
may also note that the nonnegativity conditions ($x \ge 0$, $v \ge 0$) are now explicitly
considered.
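
Conditions (26), (27), and (28) can be illustrated with a small numerical case in which one factor is priced out of use. The Python sketch below is our own illustration, not part of the text: the concave production function $f(v) = \ln(1 + v_1) + \ln(1 + v_2)$, the prices $p = 2$, $w = (1, 3)$, and the function names are all hypothetical. The candidate is $\hat{v} = (1, 0)$ with $\hat{x} = f(\hat{v}) = \ln 2$.

```python
# Illustrative check of conditions (26)-(28) for a single-output producer.
# The production function, prices, and names are hypothetical.
import math

def output(v):
    return sum(math.log(1.0 + v_j) for v_j in v)

def marginal_products(v):
    return [1.0 / (1.0 + v_j) for v_j in v]

def profit(v, p, w):
    # Profit when production takes place on the frontier x = f(v).
    return p * output(v) - sum(w_j * v_j for w_j, v_j in zip(w, v))

def satisfies_conditions(x, v, p, w, tol=1e-9):
    f_v = marginal_products(v)
    cond26 = all(p * f_j <= w_j + tol for f_j, w_j in zip(f_v, w))
    cond27 = all(abs(v_j * (w_j - p * f_j)) <= tol for v_j, f_j, w_j in zip(v, f_v, w))
    cond28 = abs(output(v) - x) <= tol
    nonnegative = x >= 0 and all(v_j >= 0 for v_j in v)
    return cond26 and cond27 and cond28 and nonnegative

p = 2.0
w = [1.0, 3.0]
v_hat = [1.0, 0.0]
x_hat = output(v_hat)
print(satisfies_conditions(x_hat, v_hat, p, w), profit(v_hat, p, w))
```

For the second factor the value of the marginal product at zero use is $p f_2 = 2$, which is below its price $w_2 = 3$, so by (27) the factor is not used, while the first factor is used up to the point where $p f_1 = w_1$.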

d. ACTIVITY ANALYSIS
Let $a_{ij}$ be the amount of the $i$th commodity involved in a unit operation of the
$j$th activity and let $a_j$ be the vector for the $j$th activity whose $i$th element is $a_{ij}$.
Assume that there are $n$ commodities and $m$ activities. Let $x_j$ be the activity level of
the $j$th activity and let $x$ be the activity vector whose $j$th element is $x_j$. Then, as we
discussed in Chapter 0, Section C, the production set $Y$ is given by
(29) $Y \equiv \{ y : y = A \cdot x, \ x \ge 0 \}$, where $A = [a_{ij}]$
or
$Y \equiv \{ y : y = \sum_{j=1}^{m} a_j x_j, \ x \ge 0 \}$

An efficient point $\hat{y}$ of $Y$ is a point such that there does not exist a $y \in Y$ such
that $y \ge \hat{y}$ (with $y \ne \hat{y}$). In other words, this $\hat{y}$ can be obtained as a solution of the following
vector maximum problem.
(Vector) Maximize: $y$
Subject to: $y \in Y$
Then from Theorem 1.E.4, if $\hat{y}$ is a solution of this problem, there exists $p \ge 0$ ($p \ne 0$) such
that
(30) $p \cdot \hat{y} \ge p \cdot y$ for all $y \in Y$
Obviously this holds even if $Y$ is not restricted to the form (29). Only the convexity
of $Y$ is required. Relation (30) corresponds to Theorem 0.C.3. Although the
converse of the theorem is easy to obtain, as discussed in Theorem 0.C.2, we can also
obtain this converse by using Theorem 1.E.6. In other words, if there exists a $p > 0$
such that $p \cdot \hat{y} \ge p \cdot y$ for all $y \in Y$, then $\hat{y}$ is an efficient point of $Y$ (or a solution of
the above vector maximum problem).
We now consider a resource constraint which we write as follows:
(31) $y + z \ge 0$, $y \in Y$
where $z_i$, the $i$th component of $z$, denotes the amount of the $i$th commodity
("resource") available in the economy. The feasible set $Y_F$ of this economy is then
$Y_F \equiv \{ y : y \in Y, \ y + z \ge 0 \}$
Now we are interested in the problem of finding an efficient point of this feasible
set $Y_F$. The point $\hat{y}$ is an efficient point of $Y_F$ if there does not exist $y \in Y_F$ such that
$y \ge \hat{y}$ (with $y \ne \hat{y}$). Hence an efficient point of $Y_F$ can be obtained as a solution of the following
vector maximum problem.
Maximize: $y$
Subject to: $y + z \ge 0$ and $y \in Y$
Assume Slater's condition so that there exists a $y \in Y$ such that $y + z > 0$. Then
using Theorem 1.E.4, if $\hat{y}$ is a solution of this problem, there exist a $p \ge 0$ ($p \ne 0$)
and $\hat{\lambda} \ge 0$ such that
(SP) $\Phi(y, \hat{\lambda}) \le \Phi(\hat{y}, \hat{\lambda}) \le \Phi(\hat{y}, \lambda)$ for all $y \in Y$ and $\lambda \ge 0$, and $\hat{\lambda} \cdot (\hat{y} + z) = 0$,
where $\Phi(y, \lambda) \equiv p \cdot y + \lambda \cdot (y + z)$.
The first inequality of the above (SP) can be written as follows:
(32) $p \cdot y + \hat{\lambda} \cdot (y + z) \le p \cdot \hat{y} + \hat{\lambda} \cdot (\hat{y} + z)$ for all $y \in Y$
or

(32') $q \cdot y \le q \cdot \hat{y}$ for all $y \in Y$, where $q \equiv p + \hat{\lambda}$


which means "profit maximization" with respect to $q$. Also note that, under
Slater's condition and the convexity of $Y$, (32) and $\hat{\lambda} \cdot (\hat{y} + z) = 0$ are equivalent to
(cf. Theorem 1.B.5):
(33) $p \cdot y \le p \cdot \hat{y}$ for all $y \in Y$ such that $y + z \ge 0$
which means profit maximization with respect to $p$ subject to the resource
constraint.
Conversely, if there exist $p > 0$, $\hat{\lambda} \ge 0$, and $\hat{y} \in Y$ such that the above (SP)
condition holds, then $\hat{y}$ is a solution of the above constrained vector maximum
problem (Theorem 1.E.5). This corresponds to Theorem 0.C.2. Certainly, this is a
difficult way to reach such a theorem, but it does illustrate one use of the vector
maximum problem.
Now consider the following linear programming problem.
Maximize (with respect to $y$): $\alpha \cdot y$
Subject to: $y + z \ge 0$ and $y \in Y$, where $\alpha \in R^n$, $\alpha \ge 0$
or

Maximize (with respect to $x$): $\alpha \cdot A \cdot x$
Subject to: $A \cdot x + z \ge 0$ and $x \ge 0$
Clearly those two problems are equivalent. Hence if $\hat{x}$ is a solution of the latter
problem, $\hat{y} = A \cdot \hat{x}$ is a solution of the former problem. Now $\hat{y}$ is a solution of the
former problem if and only if there exists a $\hat{\lambda} \ge 0$ such that
(SP) $\Phi(y, \hat{\lambda}) \le \Phi(\hat{y}, \hat{\lambda}) \le \Phi(\hat{y}, \lambda)$ for all $y \in Y$ and $\lambda \ge 0$, where
$\Phi(y, \lambda) \equiv \alpha \cdot y + \lambda \cdot (y + z)$.
We can prove this by slightly modifying our proof of the Goldman-Tucker
theorem. In any case, this saddle-point condition means that if $\hat{y}$ is a solution of the
above constrained vector maximum problem, then it is a solution of the first linear
programming problem. Conversely, if we can find a solution $\hat{x}$ of the latter linear
programming problem with $\alpha > 0$, then $\hat{y} = A \cdot \hat{x}$ is a solution of the above
constrained vector maximum problem, thus providing an efficient point of $Y_F$. By
varying $\alpha$, we can obtain the set of efficient points.
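
The converse direction (strictly positive weights yield an efficient point) can be sketched numerically. The Python fragment below is only an illustration under stated assumptions: the activity matrix `A`, the resource vector `z`, the weights `alpha`, and the coarse grid of activity levels are all hypothetical, and the grid replaces the full feasible set $Y_F$ by a finite subset so that no linear programming solver is needed.

```python
from itertools import product

# Hypothetical activity matrix: rows are commodities, columns are activities.
A = [[1.0, 0.0],     # commodity 1 is produced by activity 1
     [0.0, 1.0],     # commodity 2 is produced by activity 2
     [-1.0, -2.0]]   # commodity 3 (a resource) is used by both activities
z = [0.0, 0.0, 4.0]  # resource endowment: 4 units of commodity 3

def output(x):
    """Net output vector y = A x for activity levels x."""
    return tuple(sum(A[i][j] * x[j] for j in range(2)) for i in range(3))

def feasible(y):
    """Resource constraint y + z >= 0."""
    return all(y[i] + z[i] >= 0 for i in range(3))

def dominates(y, y_hat):
    """True if y >= y_hat componentwise and y != y_hat."""
    return all(a >= b for a, b in zip(y, y_hat)) and y != y_hat

# Finite subset of the feasible set Y_F, generated on a grid of activity levels.
grid = [0.5 * k for k in range(9)]
Y_F = [output(x) for x in product(grid, repeat=2) if feasible(output(x))]

alpha = (1.0, 1.0, 0.1)  # strictly positive weights (an assumption)
y_hat = max(Y_F, key=lambda y: sum(a * c for a, c in zip(alpha, y)))
print(y_hat, any(dominates(y, y_hat) for y in Y_F))
```

Any point that dominated `y_hat` would give a strictly larger value of $\alpha \cdot y$ when $\alpha > 0$, which is why the final check finds no dominating point.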

e. RICARDO'S THEORY OF COMPARATIVE ADVANTAGE AND MILL'S PROBLEM
Consider a two-country world, where each country (1 and 2) is able to
produce two commodities, X and Y, using one factor, "labor." Let $l_{xi}$ and $l_{yi}$ be the
amount of labor necessary to produce one unit of X and Y respectively in country
$i$ ($i = 1, 2$), which are assumed to be positive constants. Let $L_i$ be the total supply
of labor in country $i$. We suppose that labor is immobile between the countries,
and that the transport costs of X and Y are negligible. The production activities
are described in the following table.

                         Country 1               Country 2
Commodity X              1          0            1          0
Commodity Y              0          1            0          1
Labor of country 1   $-l_{x1}$  $-l_{y1}$        0          0
Labor of country 2       0          0        $-l_{x2}$  $-l_{y2}$

Note that we are assuming that each country has only one production process for
the production of each commodity. Letting $x_i$ and $y_i$ be the output of X and Y
respectively for country $i$, the resource constraints for the two countries can be
written as follows:
(34) $l_{x1} x_1 + l_{y1} y_1 \le L_1$
(35) $l_{x2} x_2 + l_{y2} y_2 \le L_2$
Or we can write
(36) $\dfrac{x_1}{L_1 / l_{x1}} + \dfrac{y_1}{L_1 / l_{y1}} \le 1$
(37) $\dfrac{x_2}{L_2 / l_{x2}} + \dfrac{y_2}{L_2 / l_{y2}} \le 1$

We may also write

(38) $\dfrac{x_1}{a_1} + \dfrac{y_1}{b_1} \le 1$
(39) $\dfrac{x_2}{a_2} + \dfrac{y_2}{b_2} \le 1$

where
$a_i \equiv L_i / l_{xi}$, $b_i \equiv L_i / l_{yi}$ ($i = 1, 2$)
The production possibility sets for the two countries are illustrated in Figure
1.20.

Figure 1.20. Each Country's Production Possibility Set.


In the above diagrams we assumed that

(40) $\dfrac{l_{x1}}{l_{y1}} < \dfrac{l_{x2}}{l_{y2}}$ or $\dfrac{b_1}{a_1} < \dfrac{b_2}{a_2}$

This condition is called Ricardo's condition of comparative advantage. Essentially,
this says that country 1 is comparatively more "efficient" in producing commodity
X and less "efficient" in producing commodity Y than country 2. Letting $x$ and $y$
be the total world output for the two commodities (that is, $x \equiv x_1 + x_2$, $y \equiv y_1 + y_2$),
and using Figure 1.20, we obtain Figure 1.21 where the block OSRQ describes
the world production set.
Mathematically, the world production set can be described as the set of
points $(x, y)$ which satisfy the following constraints with $x \ge 0$, $y \ge 0$:
$\dfrac{x}{a_1} + \dfrac{y}{b_1} \le \dfrac{b}{b_1}$, $\dfrac{x}{a_2} + \dfrac{y}{b_2} \le \dfrac{a}{a_2}$

where $x \equiv x_1 + x_2$, $y \equiv y_1 + y_2$, $a \equiv a_1 + a_2$, and $b \equiv b_1 + b_2$. We can also check
algebraically that these constraints are equivalent to the constraints for the
individual countries given above (see Chipman [3], p. 485).
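
To make the construction concrete, the following Python sketch computes $a_i$ and $b_i$ from hypothetical labor data and tests a few points against the two world constraints. The numerical values (labor supplies and labor coefficients) and the function names are assumptions made for illustration only; they satisfy Ricardo's condition (40).

```python
def capacities(L, lx, ly):
    """Maximum outputs a = L/lx of X and b = L/ly of Y for one country."""
    return L / lx, L / ly

# Hypothetical data: labor supply and unit labor requirements per country.
a1, b1 = capacities(L=100.0, lx=1.0, ly=2.0)   # a1 = 100, b1 = 50
a2, b2 = capacities(L=120.0, lx=3.0, ly=2.0)   # a2 = 40,  b2 = 60
a, b = a1 + a2, b1 + b2

def in_world_set(x, y, tol=1e-9):
    """Check x, y >= 0 and the two constraints describing the block OSRQ."""
    return (x >= 0 and y >= 0
            and x / a1 + y / b1 <= b / b1 + tol
            and x / a2 + y / b2 <= a / a2 + tol)

Q, R, S = (0.0, b), (a1, b2), (a, 0.0)         # corner points of the frontier
print(b1 / a1 < b2 / a2)                        # Ricardo's condition (40)
print([in_world_set(*pt) for pt in (Q, R, S)])
print(in_world_set(a2, b1), in_world_set(a, b))
```

The point $(a_2, b_1)$, which arises when each country specializes in the "wrong" commodity, lies inside the set but not on the frontier, while $(a, b)$ is infeasible.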
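Now consider the vector maximum problem of maximizing $(x, y)$ subject to
the above constraints and $x \ge 0$, $y \ge 0$. The set of solutions of this problem is the
kinked line QRS. Since Slater's condition is trivially satisfied (or since the
constraints are all linear) in this problem, we can immediately apply Theorem 1.E.4.
Hence, if $(\hat{x}, \hat{y})$ is a solution of this problem, there exist $p = (p_x, p_y) \ge 0$ ($p \ne 0$)
and $\hat{\lambda} = (\hat{\lambda}_1, \hat{\lambda}_2) \ge 0$ such that

(41) $\Phi(x, y; \hat{\lambda}) \le \Phi(\hat{x}, \hat{y}; \hat{\lambda}) \le \Phi(\hat{x}, \hat{y}; \lambda)$ for all $x \ge 0$, $y \ge 0$, $\lambda \ge 0$

Figure 1.21. The World Production Possibility Set.

where
$\Phi(x, y; \lambda) \equiv p_x x + p_y y + \lambda_1 \left[ \dfrac{b}{b_1} - \dfrac{x}{a_1} - \dfrac{y}{b_1} \right] + \lambda_2 \left[ \dfrac{a}{a_2} - \dfrac{x}{a_2} - \dfrac{y}{b_2} \right]$
and

(42) $\hat{\lambda}_1 \left[ \dfrac{b}{b_1} - \dfrac{\hat{x}}{a_1} - \dfrac{\hat{y}}{b_1} \right] = 0$

(43) $\hat{\lambda}_2 \left[ \dfrac{a}{a_2} - \dfrac{\hat{x}}{a_2} - \dfrac{\hat{y}}{b_2} \right] = 0$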

Conversely, from Theorem 1.E.7, if there exist $p > 0$ and $\hat{\lambda} \ge 0$ such that the above
saddle-point condition (41) holds, then $(\hat{x}, \hat{y})$ is a solution for the vector maximum
problem. It is easy to see that the values of $p_x$ and $p_y$ determine the location of
the solution on the line QRS.
Now consider the following linear programming problem [where $p \equiv (p_x, p_y) \ge 0$].

Maximize (with respect to $(x, y)$): $p_x x + p_y y$

Subject to: $\dfrac{x}{a_1} + \dfrac{y}{b_1} \le \dfrac{b}{b_1}$
$\dfrac{x}{a_2} + \dfrac{y}{b_2} \le \dfrac{a}{a_2}$
$x \ge 0$, $y \ge 0$
Then from the Goldman-Tucker theorem (Theorem 1.F.2), $(\hat{x}, \hat{y}) \ge 0$ is a solution of
this problem if and only if there exists $\hat{\lambda} \ge 0$ such that
(44) $\Phi(x, y; \hat{\lambda}) \le \Phi(\hat{x}, \hat{y}; \hat{\lambda}) \le \Phi(\hat{x}, \hat{y}; \lambda)$ for all $x \ge 0$, $y \ge 0$, $\lambda \ge 0$
Hence the solutions for the above vector maximum problem are characterized by
this linear programming problem. In other words, if $(\hat{x}, \hat{y})$ is a solution of the above
vector maximum problem, it is a solution of the above linear programming
problem. And conversely, if $(\hat{x}, \hat{y})$ is a solution of the above linear programming
problem with $p_x > 0$, $p_y > 0$, then it is also a solution of the above vector maximum
problem.
If in the linear programming problem we choose $p_x$ and $p_y$ such that $p_x > 0$,
$p_y > 0$, and

(R) $\dfrac{l_{x1}}{l_{y1}} < \dfrac{p_x}{p_y} < \dfrac{l_{x2}}{l_{y2}}$

then we obtain point R of Figure 1.21 (as can be seen at once from the diagram).
This point R is called Ricardo's point by DOSSO ([6], p. 35),8 for this is exactly
the problem that David Ricardo was concerned with in his celebrated theory
of comparative advantage. Note that point R is obtained if country 1 specializes
in the production of commodity X and country 2 specializes in the production
of commodity Y. In short, Ricardo was concerned with the problem of finding
the frontier of the world production set (the line QRS) and he specified point
R by specifying the slope of the "price line," px/py. Point R is the point of com-
plete specialization for each country. Borrowing the terminology of activity
analysis, we may call the line QRS the world efficient frontier.
Note that when price vector p prevails with condition (R), then, as a
solution of the above linear programming problem, (z, y) at point R maximizes
the value of total world output. Note also that if price vector p prevails with
condition (R) after free trade (with no transport cost), then each country's
national product is maximized. This can easily be seen from Figure 1.20 which
describes the production possibility sets for the two countries (the price line is
indicated by the dotted lines). In other words, point R is the "optimum" point
from the point of view of both the world as a whole and each country individually.9
Also note that if price vector p prevails with condition (R), then point R (which
signifies complete specialization for both countries) is obtained under the com-
petitive rule, assuming that the price ratio of the two commodities in each
country will be equal to the slope of its production possibility line before trade.
This can be seen as follows: Suppose that international trade is initiated between
the two countries with price vector p such that condition (R) holds. Then each
country's merchants will come to the world market and trade the commodities
at the world price vector p, thus maximizing their profit. This, in turn, will
bring each country to the point of complete specialization (country 1 in commodity
X and country 2 in commodity Y). The above logic can be easily seen
from the dotted price lines in Figure 1.20.
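
A small numerical sketch may help to see how the price ratio selects a point on the frontier QRS. The Python code below relies on the standard fact that a linear program attains its maximum at a vertex of the feasible polygon, so it simply compares the value of output at the corner points O, Q, R, and S. The capacities ($a_1 = 100$, $b_1 = 50$, $a_2 = 40$, $b_2 = 60$), the price vectors, and the function names are hypothetical choices for illustration.

```python
def vertices(a1, b1, a2, b2):
    """Corner points of the world production set, assuming b1/a1 < b2/a2."""
    return {"O": (0.0, 0.0), "Q": (0.0, b1 + b2),
            "R": (a1, b2), "S": (a1 + a2, 0.0)}

def best_vertex(px, py, a1, b1, a2, b2):
    """Label of the vertex maximizing the value of world output px*x + py*y."""
    pts = vertices(a1, b1, a2, b2)
    return max(pts, key=lambda k: px * pts[k][0] + py * pts[k][1])

def national_choice(px, py, a_i, b_i):
    """Commodity in which a country specializes to maximize its national product."""
    return "X" if px * a_i > py * b_i else "Y"

a1, b1, a2, b2 = 100.0, 50.0, 40.0, 60.0   # so b1/a1 = 0.5 < b2/a2 = 1.5

print(best_vertex(1.0, 1.0, a1, b1, a2, b2))   # price ratio 1.0 satisfies (R)
print(best_vertex(1.0, 4.0, a1, b1, a2, b2))   # price ratio 0.25 lies below b1/a1
print(best_vertex(2.0, 1.0, a1, b1, a2, b2))   # price ratio 2.0 lies above b2/a2
print(national_choice(1.0, 1.0, a1, b1), national_choice(1.0, 1.0, a2, b2))
```

Under condition (R) the world optimum is R, and each country, maximizing its own national product at the same prices, chooses the same pattern of specialization.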
The above analysis has one serious drawback: It does not involve a con-
sideration of the utility of each country's consumers (that is, the so-called
"demand conditions"). Let us now take up this problem.
Assume for the sake of simplicity that we can write the world welfare
obtained by consuming the bundle $(x, y)$ by $u(x, y)$, where $u$ is a quasi-concave
continuously differentiable function on the nonnegative orthant of $R^2$. Now
consider the following nonlinear programming problem.
Maximize (with respect to $(x, y)$): $u(x, y)$
Subject to: $\dfrac{x}{a_1} + \dfrac{y}{b_1} \le \dfrac{b}{b_1}$
$\dfrac{x}{a_2} + \dfrac{y}{b_2} \le \dfrac{a}{a_2}$
$x \ge 0$, $y \ge 0$
This is essentially the problem that John Stuart Mill was trying to solve in [11].
For the utility function $u$, he assumed:

As the simplest and most convenient, let us suppose that in both countries
any given increase of cheapness produces an exactly proportional increase of
consumption: or, in other words, that the value expended in the commodity,
the cost incurred for the sake of obtaining it, is always the same,
whether the cost affords a greater or a smaller quantity of the commodity.
([11], p. 155)
Chipman noted ([3], pp. 484-485) that such a demand function is yielded by
a utility function of the form $u = x^{\alpha} y^{\beta}$, and Mill chose a special case in which
$\alpha = \beta = 1$. We generalized his utility function by assuming it to be quasi-concave.
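Now let us return to the above nonlinear programming problem. Noting that
the constraints are all linear, we apply the Arrow-Hurwicz-Uzawa theorem
(Theorem 1.D.4). As a result of this theorem, the following (QSP') condition is a
necessary condition for the optimum:
(QSP'_m) There exist $(\hat{x}, \hat{y})$ and $\hat{\lambda} = (\hat{\lambda}_1, \hat{\lambda}_2)$ such that

(45) $\hat{u}_x - \dfrac{\hat{\lambda}_1}{a_1} - \dfrac{\hat{\lambda}_2}{a_2} \le 0$, $\hat{u}_y - \dfrac{\hat{\lambda}_1}{b_1} - \dfrac{\hat{\lambda}_2}{b_2} \le 0$

(46) $\left[ \hat{u}_x - \dfrac{\hat{\lambda}_1}{a_1} - \dfrac{\hat{\lambda}_2}{a_2} \right] \hat{x} + \left[ \hat{u}_y - \dfrac{\hat{\lambda}_1}{b_1} - \dfrac{\hat{\lambda}_2}{b_2} \right] \hat{y} = 0$

(47) $\dfrac{b}{b_1} - \dfrac{\hat{x}}{a_1} - \dfrac{\hat{y}}{b_1} \ge 0$, $\dfrac{a}{a_2} - \dfrac{\hat{x}}{a_2} - \dfrac{\hat{y}}{b_2} \ge 0$

(48) $\hat{\lambda}_1 \left[ \dfrac{b}{b_1} - \dfrac{\hat{x}}{a_1} - \dfrac{\hat{y}}{b_1} \right] + \hat{\lambda}_2 \left[ \dfrac{a}{a_2} - \dfrac{\hat{x}}{a_2} - \dfrac{\hat{y}}{b_2} \right] = 0$

$\hat{x} \ge 0$, $\hat{y} \ge 0$, $\hat{\lambda}_1 \ge 0$, $\hat{\lambda}_2 \ge 0$
[where $\hat{u}_x \equiv \partial u / \partial x$, $\hat{u}_y \equiv \partial u / \partial y$, both evaluated at $(\hat{x}, \hat{y})$]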
To establish the converse of the above statement, we apply Arrow and Enthoven's
theorem of quasi-concave programming (Theorem 1.E.2). In this problem we
can assume $\hat{u}_x > 0$, or $\hat{u}_y > 0$, which is the case for the utility function
$u(x, y) = x^{\alpha} y^{\beta}$. Then, from the Arrow-Enthoven theorem, the above (QSP'_m) condition is
sufficient for the optimum. Hence we can assert that $(\hat{x}, \hat{y})$ is a solution of the above
nonlinear programming problem if and only if the above (QSP'_m) holds.
Now assume

(49) $\dfrac{b_1}{a_1} < \dfrac{b_2}{a_2}$, that is, $\dfrac{l_{x1}}{l_{y1}} < \dfrac{l_{x2}}{l_{y2}}$

as above and find the condition under which country 1 specializes in the
production of X and country 2 specializes in the production of Y (the Ricardian pattern
of complete specialization). This is the question raised by J. S. Mill.
Mathematically speaking, we are now seeking the condition under which the Ricardian
pattern of specialization ($\hat{x} = \hat{x}_1 = a_1$, $\hat{y} = \hat{y}_2 = b_2$, $\hat{y}_1 = 0$, and $\hat{x}_2 = 0$) is the
solution of the above nonlinear problem [hence satisfies the above (QSP'_m)].

The solution of Mill's problem is rather easy to see from Figure 1.21 and it
does not need the above machinery of nonlinear programming such as (QSP'_m).
Assuming that the utility function is nicely shaped, such as $u(x, y) = x^{\alpha} y^{1-\alpha}$,
$0 < \alpha < 1$, we can easily see from Figure 1.21 that the necessary and sufficient
condition for $(a_1, b_2)$ to be the solution of the above nonlinear programming
problem is simply that the slope of the indifference curve at $(a_1, b_2)$ be between
the slopes of the lines QR and RS. Letting $(\hat{x}, \hat{y}) = (a_1, b_2)$, we can write this
condition as follows:
(50) $\dfrac{b_1}{a_1} \le \dfrac{\hat{u}_x}{\hat{u}_y} \le \dfrac{b_2}{a_2}$ (with at least one strict inequality)

where $\hat{u}_x$ and $\hat{u}_y$ are now defined respectively as $u_x$ and $u_y$, both evaluated at
$(a_1, b_2)$. This condition is often called Mill's condition.
We now obtain this necessary and sufficient condition mathematically. This
procedure is more tedious than the one in terms of the above diagram, but it is
useful in order to become familiar with our nonlinear programming theory as
well as to obtain a precise understanding of the solution. Moreover, it will
facilitate a further generalization (see Takayama [17]). First introduce the
following assumption on the utility function, which will guarantee that $\hat{x} > 0$, $\hat{y} > 0$.
Notice that if we cannot guarantee $\hat{x} > 0$, $\hat{y} > 0$, then $(a_1, b_2)$ cannot be a
solution.
(A-m) $u(x, 0) = 0$ and $u(0, y) = 0$; $\partial u / \partial x > 0$ and $\partial u / \partial y > 0$ for all $x > 0$, $y > 0$
An example of a utility function that satisfies the above assumption is one of the
Cobb-Douglas type, $u = x^{\alpha} y^{1-\alpha}$, $0 < \alpha < 1$. The feasible set for our nonlinear
programming problem is $M \equiv \{(x, y) : (x, y) \ge 0, \ x/a_1 + y/b_1 \le b/b_1, \ x/a_2 + y/b_2 \le a/a_2\}$.
The set $M$ is nonempty and contains a point $(\bar{x}, \bar{y})$ with $\bar{x} > 0$
and $\bar{y} > 0$. (Note that this also implies that Slater's condition holds for the present
problem.) Hence from the above assumption, $u(\bar{x}, \bar{y}) > u(x, 0)$ for all $x \ge 0$ and
$u(\bar{x}, \bar{y}) > u(0, y)$ for all $y \ge 0$. Therefore, an optimal point $(\hat{x}, \hat{y})$ must be such that
$\hat{x} > 0$ and $\hat{y} > 0$. Then the first two conditions (45) and (46) of the above (QSP'_m)
can be converted to the following equivalent condition:

(51) $\hat{u}_x - \dfrac{\hat{\lambda}_1}{a_1} - \dfrac{\hat{\lambda}_2}{a_2} = 0$, $\hat{u}_y - \dfrac{\hat{\lambda}_1}{b_1} - \dfrac{\hat{\lambda}_2}{b_2} = 0$

Mill's problem is that of finding the condition under which Ricardo's point
$(a_1, b_2)$ is optimal. Since at $(a_1, b_2)$ the two relations in condition (47) of the above
(QSP'_m) hold with equality, (48) is automatically satisfied; thus at $(a_1, b_2)$ both (47)
and (48) of (QSP'_m) are satisfied. Hence the necessary and sufficient condition for
Ricardo's point $(a_1, b_2)$ to be optimal is reduced to the following condition:
(52) There exist $\hat{\lambda}_1 \ge 0$, $\hat{\lambda}_2 \ge 0$ (with at least one strict inequality)10 such that
condition (51) holds at $(a_1, b_2)$.

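From (51) we obtain

(53-a) $\hat{\lambda}_1 - \dfrac{a_2 b_1}{a_1 b_2} \hat{\lambda}_1 = \left( \hat{u}_y - \dfrac{a_2}{b_2} \hat{u}_x \right) b_1$

(53-b) $\hat{\lambda}_2 - \dfrac{a_2 b_1}{a_1 b_2} \hat{\lambda}_2 = \left( \hat{u}_x - \dfrac{b_1}{a_1} \hat{u}_y \right) a_2$

where $\hat{u}_x \equiv u_x(a_1, b_2)$ and $\hat{u}_y \equiv u_y(a_1, b_2)$. Since $b_1/a_1 < b_2/a_2$ by hypothesis, we
have
(54) $\dfrac{a_2 b_1}{a_1 b_2} < 1$

Therefore, recalling $\hat{\lambda}_1 \ge 0$ and $\hat{\lambda}_2 \ge 0$, we obtain from (53-a) and (53-b)

(55-a) $\hat{u}_y > \left( \dfrac{a_2}{b_2} \right) \hat{u}_x$ if $\hat{\lambda}_1 > 0$, and $\hat{u}_y = \left( \dfrac{a_2}{b_2} \right) \hat{u}_x$ if $\hat{\lambda}_1 = 0$

(55-b) $\hat{u}_y < \left( \dfrac{a_1}{b_1} \right) \hat{u}_x$ if $\hat{\lambda}_2 > 0$, and $\hat{u}_y = \left( \dfrac{a_1}{b_1} \right) \hat{u}_x$ if $\hat{\lambda}_2 = 0$

Hence, recalling that $\hat{\lambda}_1$ and $\hat{\lambda}_2$ cannot vanish simultaneously, we obtain from
(55-a) and (55-b)

(56) $\dfrac{b_1}{a_1} \le \dfrac{\hat{u}_x}{\hat{u}_y} \le \dfrac{b_2}{a_2}$ (with at least one strict inequality)

which is Mill's condition. Therefore, if condition (52) is satisfied, then Mill's
condition (56) is satisfied. Conversely, if (56) is satisfied, then we can obtain
condition (52). If (56) holds with strict inequalities, then obtain $\hat{\lambda}_1$ and $\hat{\lambda}_2$ from
(53-a) and (53-b). If (56) holds with one equality, say, $b_1/a_1 = \hat{u}_x/\hat{u}_y$, then define
$\hat{\lambda}_2$ as $\hat{\lambda}_2 = 0$ and obtain $\hat{\lambda}_1$ from (53-a) as $\hat{\lambda}_1 = b_1 \hat{u}_y$. Thus obtained, $\hat{\lambda}_1$ and $\hat{\lambda}_2$
will satisfy condition (52). This finishes the mathematical proof that Mill's
condition is a necessary and sufficient condition for Ricardo's point to be optimal.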
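
The argument can be checked numerically for a Cobb-Douglas utility function $u = x^{\alpha} y^{1-\alpha}$. The Python sketch below evaluates Mill's condition at Ricardo's point and solves (51) for the multipliers using the closed forms implied by (53-a) and (53-b). The capacities and the values of $\alpha$ are hypothetical, and the function names are introduced only for this illustration.

```python
def marginal_utilities(x, y, alpha):
    """u_x and u_y for the Cobb-Douglas utility u = x**alpha * y**(1 - alpha)."""
    ux = alpha * x ** (alpha - 1) * y ** (1 - alpha)
    uy = (1 - alpha) * x ** alpha * y ** (-alpha)
    return ux, uy

def mills_condition(a1, b1, a2, b2, alpha):
    """Check b1/a1 <= u_x/u_y <= b2/a2 at Ricardo's point (a1, b2)."""
    ux, uy = marginal_utilities(a1, b2, alpha)
    return b1 / a1 <= ux / uy <= b2 / a2

def multipliers(a1, b1, a2, b2, alpha):
    """Solve (51) at (a1, b2) for (lambda_1, lambda_2), following (53-a), (53-b)."""
    ux, uy = marginal_utilities(a1, b2, alpha)
    k = 1.0 - (a2 * b1) / (a1 * b2)          # positive by (54)
    lam1 = (uy - (a2 / b2) * ux) * b1 / k
    lam2 = (ux - (b1 / a1) * uy) * a2 / k
    return lam1, lam2

a1, b1, a2, b2 = 100.0, 50.0, 40.0, 60.0     # assumed data with b1/a1 < b2/a2

for alpha in (0.5, 0.9):
    print(alpha, mills_condition(a1, b1, a2, b2, alpha),
          multipliers(a1, b1, a2, b2, alpha))
```

With these numbers, $\alpha = 0.5$ gives a marginal rate of substitution of 0.6 at $(a_1, b_2)$, inside the interval $[0.5, 1.5]$, and both multipliers are positive; with $\alpha = 0.9$ the rate is 5.4, Mill's condition fails, and $\hat{\lambda}_1$ would have to be negative.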
It should be noted that Mill's condition and the above observation are
crucially dependent on the specification of the utility function. If we adopt a
different form of $u$, then we will obtain a different condition. As Chipman noted ([3],
p. 489), Mill realized this point and attempted to analyze more general cases
[more general, that is, than the case in which $u(x, y) = xy$]. (See [11], Book
III, chap. 18, esp. secs. 8 and 9.) However, his mathematical equipment precluded
the derivation of any exact condition.

FOOTNOTES

1. The utility function $u$ may not be defined outside $\Omega^n$, the nonnegative orthant of $R^n$.
However, it would be more convenient to conceive that $u$ is defined over the entire
space $R^n$, in order to avoid the possibility of "corner" derivatives when we talk about
the (QSP') condition. Clearly, the consumer cannot place any utility outside his
consumption set $\Omega^n$; hence the definition of $u$ outside of $\Omega^n$ can be more or less
arbitrary, as long as differentiability is preserved. This convention of extending the
domain of the function is often useful in many economic problems in which many
functions are, strictly speaking, defined only on the nonnegative orthant, and in
which we are concerned with (QSP').
2. For the meaning of the "relevant variable," see Section E of this chapter or Arrow
and Enthoven [ 1 ] , p. 783. This concept does not create any problem in the present
problem of consumer's choice.
3. Most readers are probably familiar with the procedure of obtaining the Hicks-Slutsky
equation. Clearly the author is not discounting any of the glory of the classical
demand theory a la Slutsky, Hicks, and so on. In the Appendix to this section, we
attempt the exposition of the classical demand theory, as an example of the time-
honored technique in economics, comparative statics. Later we will take up a modern
approach to the Hicks-Slutsky equation (Chapter 2, Section D).
4. The (strict) quasi-concavity of the utility function means that the consumer desires
to consume a variety of commodities rather than to consume any one commodity.
5. Here we are using condition (iv) of Theorem 1.D.4 (the A-H-U theorem), which is the
same as condition (ii) of Theorem 1.E.3 under the quasi-concavity of $F$. The condition
requires (a) condition (18) in addition to (b) the quasi-concavity of $F$ (or the convexity
of the constraint set), and (c) the existence of $(v, x) \ge 0$ with $F(v, x) > 0$ (or the
existence of an interior point in the constraint set).
6. As remarked above, assuming that at least one output is produced at the optimum,
we have $\hat{\lambda} > 0$ so that $F(\hat{v}, \hat{x}) = 0$. In other words, the constraint $F(v, x) \ge 0$ is
effective at the optimum. If we do not have the constraint $(v, x) \ge 0$, the rank
condition is satisfied if condition (18) holds (which ensures $[F_v, F_x] \ne 0$). A stronger
condition such as (18') is required for the present problem in view of the
nonnegativity constraints, $x \ge 0$ and $v \ge 0$. Note that, to ensure the rank condition,
neither the quasi-concavity of $F$ nor the existence of $(v, x) \ge 0$ with $F(v, x) > 0$ is
required.
7. Under the rank condition, (QSP') is necessary for an optimum. Under the quasi-
concavity of F, (QSP') is sufficient for an optimum.
8. DOSSO is the standard nickname of Dorfman, Samuelson, and Solow [ 6] .
9. The optimality here is defined as the maximization of the value of output under a fixed
price vector $p$. Notice that the maximization of $p_x x + p_y y$ (resp. $p_x x_i + p_y y_i$, $i = 1, 2$)
is equivalent to the maximization of "real income" $(p_x/p_y)x + y$ or $x + (p_y/p_x)y$
[resp. $(p_x/p_y)x_i + y_i$ or $x_i + (p_y/p_x)y_i$, $i = 1, 2$], as long as $(p_x, p_y)$ is a fixed vector.
Notice also that a country can increase its welfare from the above "optimum"
position if it is allowed to alter $p$ or the terms of trade $p_x/p_y$. This will, in general, imply a
loss to the other country. The optimum tariff argument is concerned with the choice
of $p_x/p_y$ by means of tariffs so as to maximize one country's welfare.
10. If $\hat{\lambda}_1 = \hat{\lambda}_2 = 0$, then from (51) we obtain $\hat{u}_x = \hat{u}_y = 0$ which, in view of $\hat{x} > 0$ and
$\hat{y} > 0$, contradicts (A-m) (in particular, $u_x > 0$, $u_y > 0$ for all $x > 0$ and $y > 0$).

REFERENCES
1. Arrow, K., and Enthoven, A. C., "Quasi-Concave Programming," Econometrica,
vol. 29, October 1961.
2. Carlson, S., A Study on the Pure Theory of Production, Oxford, Basil Blackwell,
1956.
3. Chipman, J. S., "A Survey of the Theory of International Trade, Part 1, The Classical
Theory," Econometrica, vol. 33, July 1965.
4. Dantzig, G. B., Linear Programming and Extensions, Princeton, N.J., Princeton
University Press, 1963.
5. Dantzig, G. B., and Orden, A., "Notes on Linear Programming: Part II, Duality
Theorem," Rand Corporation, Research Memorandum, RM 1265, October 30, 1953.
6. Dorfman, R., Samuelson, P. A., and Solow, R. M., Linear Programming and Economic
Analysis, New York, McGraw-Hill, 1958.
7. Eisenberg, E., "Duality in Homogeneous Programming," Proceedings of the American
Mathematical Society, 12, October 1961.
8. Goldman, A. J., and Tucker, A. W., "Theory of Linear Programming," in Linear
Inequalities and Related Systems, ed. by H. W. Kuhn and A. W. Tucker, Princeton,
N.J., Princeton University Press, 1956.
9. Hadley, G., Linear Programming, Reading, Mass., Addison-Wesley, 1962.
10. Hicks, J. R., Value and Capital, 2nd ed., Oxford, Clarendon Press, 1946.
11. Mill, J. S., Principles of Political Economy, 3rd ed., London, Parker & Co., 1852 (1st
ed. 1848 by Parker, 9th ed. 1885 by Longmans, Green & Co.).
12. Nikaido, H., Introduction to Sets and Mappings in Modern Economics, tr. by K. Sato,
Amsterdam, North-Holland, 1970 ( the Japanese original, Tokyo, 1960).
13. Ricardo, D., On the Principles of Political Economy and Taxation, London, John
Murray, 1817, in The Works and Correspondence of David Ricardo, Vol. 1, ed. by
P. Sraffa, Cambridge, Cambridge University Press, 1951.
14. Samuelson, P. A., Foundations of Economic Analysis, Cambridge, Mass., Harvard
University Press, 1947.
15. Shephard, R. W., Cost and Production Functions, Princeton, N.J., Princeton Univer-
sity Press, 1953.
16. Takayama, A., International Economics, Tokyo, Toyo-Keizai Shimpo-sha, 1963 (in
Japanese).
17. Takayama, A., International Trade-An Approach to the Theory, New York, Holt, Rinehart
and Winston, 1972, chaps. 4, 5, 6, and 7.

Appendix to Section F: Optimization and Comparative Statics-A Local Theory1

a. THE CLASSICAL THEORY OF OPTIMIZATION

The classical theory of optimization and comparative statics has become
very well known to economists through Hicks [8], Samuelson [12], and many
textbooks on price theory. The purpose of this Appendix is to review this topic
concisely so that the reader may be able to refresh his understanding of the theory
in a proper perspective.
Let $f(x, \alpha)$ and $g_j(x, \alpha)$, $j = 1, 2, \ldots, m$, be real-valued functions defined on
$X \otimes A$ where $X$ and $A$ are, respectively, open subsets of $R^n$ and $R^l$. We assume
(A-1) All the second partial derivatives of $f$ and $g_j$, $j = 1, 2, \ldots, m$, exist and
are continuous for all $(x, \alpha)$ in $X \otimes A$.
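Consider the following maximization problem:
Maximize (with respect to $x$): $f(x, \hat{\alpha})$

Subject to: $g_j(x, \hat{\alpha}) = 0$, $j = 1, 2, \ldots, m$, and $x \in X$

where $\hat{\alpha}$ is a given vector in $A$. The $\alpha_k$'s are called the shift parameters. The local
maximum condition (LM) is written as:
(LM) $\exists \ \hat{x} \in X$ such that it achieves a local maximum of $f$ subject to the
constraints.
The first-order condition (FOC) of this problem can be stated as follows:
(FOC) $\exists \ \hat{x} \in X$ and $\hat{\lambda} \in R^m$ such that
$\Phi_x(\hat{x}, \hat{\lambda}, \hat{\alpha}) = 0$, $g(\hat{x}, \hat{\alpha}) = 0$
where
$\Phi(x, \lambda, \hat{\alpha}) \equiv f(x, \hat{\alpha}) + \lambda \cdot g(x, \hat{\alpha})$

and $g(x, \hat{\alpha}) \equiv [g_1(x, \hat{\alpha}), \ldots, g_m(x, \hat{\alpha})]$. Here $\Phi_x$ denotes the gradient vector of $\Phi$
with respect to $x$.2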
Next, we write the second-order necessary condition (SONC) and the
second-order sufficient condition (SOSC), respectively, as
(SONC) $\xi \cdot A \cdot \xi \le 0$ for all $\xi$ such that $B \cdot \xi = 0$.
(SOSC) $\xi \cdot A \cdot \xi < 0$ for all $\xi \ne 0$ such that $B \cdot \xi = 0$.
where $A = [a_{ij}]$ and $B = [b_{ij}]$ are, respectively, $(n \times n)$ and $(m \times n)$ matrices
defined by3

(3-a) $A \equiv \Phi_{xx}(\hat{x}, \hat{\lambda}, \hat{\alpha})$; that is, $a_{ij} \equiv \dfrac{\partial^2 \Phi(\hat{x}, \hat{\lambda}, \hat{\alpha})}{\partial x_i \, \partial x_j}$

(3-b) $B \equiv g_x(\hat{x}, \hat{\alpha})$; that is, $b_{ij} \equiv \dfrac{\partial g_i(\hat{x}, \hat{\alpha})}{\partial x_j}$

The rank condition (R) is,4 assuming $m < n$,

(R) Rank $B = m$.
The fundamental theorem of the classical optimization theory is now stated.5

Theorem 1.F.3 Assume that condition (R) holds. Then

(i) (LM) implies (FOC) and (SONC).
(ii) (FOC) and (SOSC) imply (LM).

In the classical optimization theory, all the constraints are assumed to be
effective for all $x \in X$. Now suppose we consider the following problem with
inequality constraints.

Maximize (with respect to $x$): $f(x, \hat{\alpha})$

Subject to: $g_j(x, \hat{\alpha}) \ge 0$, $j = 1, 2, \ldots, m$, $x \ge 0$ and $x \in R^n$

The first-order conditions of this problem may be stated as6
(QSP') $\exists \ \hat{x} \ge 0$ and $\hat{\lambda} \ge 0$ (where $\hat{x} \in X$ and $\hat{\lambda} \in R^m$) such that
$\Phi_x(\hat{x}, \hat{\lambda}, \hat{\alpha}) \le 0$, $\Phi_x(\hat{x}, \hat{\lambda}, \hat{\alpha}) \cdot \hat{x} = 0$, $g(\hat{x}, \hat{\alpha}) \ge 0$, $\hat{\lambda} \cdot g(\hat{x}, \hat{\alpha}) = 0$
If we can assume $\hat{x} > 0$ and $\hat{\lambda} > 0$, then this condition is reduced to
$\Phi_x(\hat{x}, \hat{\lambda}, \hat{\alpha}) = 0$ and $g(\hat{x}, \hat{\alpha}) = 0$
which then looks precisely the same as (FOC). Therefore, we may regard the above
(FOC) as (QSP') with $\hat{x} > 0$ and $\hat{\lambda} > 0$, and we can carry out a similar comparative
statics analysis using (QSP'). However, in (FOC) of the classical theory, there are
no provisions to guarantee $\hat{x} \ge 0$ and $\hat{\lambda} \ge 0$. Note also that if $f$ and the $g_j$'s are all
concave and if $\hat{\lambda} \ge 0$, then $\Phi$ is concave in $x$ so that the Hessian matrix $\Phi_{xx}$
is negative semidefinite for all $x$ in $X$. Thus (SONC) is automatically satisfied.
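In any case, we proceed here with the classical theory.7 As remarked earlier
(Theorem 1.E.17), (SOSC) is equivalent to the following bordered Hessian condition
(BHC), assuming that rank $B_{mm} = m$ and $m < n$.
(BHC) $(-1)^r \det \begin{bmatrix} 0 & B_{mr} \\ B'_{mr} & A_r \end{bmatrix} > 0$, $r = m + 1, \ldots, n$

or equivalently,
(BHC') $(-1)^r \det \begin{bmatrix} A_r & B'_{mr} \\ B_{mr} & 0 \end{bmatrix} > 0$, $r = m + 1, \ldots, n$
where $A_r$ and $B_{mr}$ are defined by

$A_r \equiv \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1r} \\ a_{21} & a_{22} & \cdots & a_{2r} \\ \vdots & & & \vdots \\ a_{r1} & a_{r2} & \cdots & a_{rr} \end{bmatrix}$, $B_{mr} \equiv \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1r} \\ b_{21} & b_{22} & \cdots & b_{2r} \\ \vdots & & & \vdots \\ b_{m1} & b_{m2} & \cdots & b_{mr} \end{bmatrix}$

Here $B'_{mr}$ is the transpose of $B_{mr}$.
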

b. COMPARATIVE STATICS
Hereafter, we assume (LM) and (R), so that (FOC) and (SONC) hold.
Condition (FOC) provides $(n + m)$ equations, which are then available to
determine the $(n + m)$ variables, $\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_n, \hat{\lambda}_1, \ldots, \hat{\lambda}_m$. Assume

(A-2) $\det \begin{bmatrix} 0 & B \\ B' & A \end{bmatrix} \ne 0$
Then under assumptions (A-1) and (A-2), we can directly apply the implicit
function theorem.8 Thus we can conclude that there exist continuously differentiable
functions $x$ and $\lambda$ such that $\hat{x} = x(\hat{\alpha})$ and $\hat{\lambda} = \lambda(\hat{\alpha})$ and
(4-a) $\Phi_x[x(\alpha), \lambda(\alpha), \alpha] = 0$
(4-b) $g[x(\alpha), \alpha] = 0$ 9

for all $\alpha$ in some neighborhood of $\hat{\alpha}$, say, $N(\hat{\alpha})$.

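The comparative statics analysis in the context of optimization is to establish
the effect of changes in the $\alpha_k$'s on the values of $x(\alpha)$ and $\lambda(\alpha)$, using (4-a) and
(4-b). Differentiating (4-a) and (4-b) with respect to $\alpha_k$, we obtain (for each $k$)

(5) $\Phi_{\lambda x} \dfrac{\partial x(\alpha)}{\partial \alpha_k} + \Phi_{\lambda \alpha_k} = 0$, $\Phi_{x\lambda} \dfrac{\partial \lambda(\alpha)}{\partial \alpha_k} + \Phi_{xx} \dfrac{\partial x(\alpha)}{\partial \alpha_k} + \Phi_{x \alpha_k} = 0$

for all $\alpha \in N(\hat{\alpha})$, or equivalently,

(5') $\begin{bmatrix} 0 & \Phi_{\lambda x} \\ \Phi_{x\lambda} & \Phi_{xx} \end{bmatrix} \begin{bmatrix} \partial \lambda(\alpha) / \partial \alpha_k \\ \partial x(\alpha) / \partial \alpha_k \end{bmatrix} + \begin{bmatrix} \Phi_{\lambda \alpha_k} \\ \Phi_{x \alpha_k} \end{bmatrix} = 0$

for all $\alpha \in N(\hat{\alpha})$, where all the second partials of $\Phi$ are evaluated at $\alpha$; that is,
$\Phi_{xx} = \Phi_{xx}[x(\alpha), \lambda(\alpha), \alpha]$, and so on. Needless to say, $\Phi_{\lambda x} = \Phi'_{x\lambda} = g_x[x(\alpha), \alpha]$.10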

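Define the $(m + n) \times (m + n)$ matrices $H$ and $\hat{H}$ by11

(6) $H \equiv \begin{bmatrix} 0 & \Phi_{\lambda x} \\ \Phi_{x\lambda} & \Phi_{xx} \end{bmatrix}$ and $\hat{H} \equiv \begin{bmatrix} 0 & B \\ B' & A \end{bmatrix}$

By (A-1) and the continuity of the functions $x(\alpha)$ and $\lambda(\alpha)$, every element of $H$
is continuous in $\alpha$. But by (A-2), $\hat{H}$ is nonsingular. Hence $H$ is also nonsingular
for all $\alpha$ in some neighborhood of $\hat{\alpha}$, say, $\bar{N}(\hat{\alpha})$, where $\bar{N}(\hat{\alpha}) \subset N(\hat{\alpha})$. Therefore,
from (5) we obtain12

(7) $\begin{bmatrix} \partial \lambda(\alpha) / \partial \alpha_k \\ \partial x(\alpha) / \partial \alpha_k \end{bmatrix} = - \begin{bmatrix} 0 & \Phi_{\lambda x} \\ \Phi_{x\lambda} & \Phi_{xx} \end{bmatrix}^{-1} \begin{bmatrix} \Phi_{\lambda \alpha_k} \\ \Phi_{x \alpha_k} \end{bmatrix}$

for all $\alpha$ in $\bar{N}(\hat{\alpha})$. This equation is the fundamental equation of comparative statics
obtained from (FOC).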
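
The fundamental equation of comparative statics can be tried out on a small example. The Python sketch below is an illustration under stated assumptions, not part of the original exposition: it takes the hypothetical problem of maximizing $f(x, \alpha) = \alpha \ln x_1 + \ln x_2$ subject to $g(x) = 1 - x_1 - x_2 = 0$, whose solution is known in closed form, builds the matrix $H$ with the border in the first row and column, and solves $H z = -[\Phi_{\lambda\alpha}, \Phi_{x\alpha}]$ by Gaussian elimination.

```python
def solve(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            factor = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= factor * aug[c][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (aug[r][n] - s) / aug[r][r]
    return z

def optimum(alpha):
    """Closed-form (x1, x2, lambda) for: max alpha*ln(x1) + ln(x2), x1 + x2 = 1."""
    return alpha / (1 + alpha), 1 / (1 + alpha), 1 + alpha

def comparative_statics(alpha):
    """Return [dlambda/dalpha, dx1/dalpha, dx2/dalpha] from the matrix equation."""
    x1, x2, _ = optimum(alpha)
    H = [[0.0, -1.0, -1.0],                # border: g_x = (-1, -1)
         [-1.0, -alpha / x1 ** 2, 0.0],    # Phi_xx is diagonal in this example
         [-1.0, 0.0, -1.0 / x2 ** 2]]
    rhs = [0.0, -1.0 / x1, 0.0]            # minus [Phi_lambda_alpha, Phi_x_alpha]
    return solve(H, rhs)

print(comparative_statics(1.0))
```

At $\alpha = 1$ the closed-form solution gives $d\lambda/d\alpha = 1$, $dx_1/d\alpha = 1/4$, and $dx_2/d\alpha = -1/4$, which is what the linear system should reproduce up to rounding.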

c. THE SECOND-ORDER CONDITIONS AND COMPARATIVE STATICS
It is clear from (7) that the key to establishing the comparative statics results
is in the matrix $H^{-1}$. The concavity of the functions $f$ and the $g_j$'s would provide
very useful global information on $H^{-1}$, as we shall see later (for example, see
Chapter 4, Section D). Here we investigate the local information which can be
deduced from the second-order conditions.
In Theorem 1.F.3, we stated that (LM) together with (R) imply (SONC) as
well as (FOC). It can be shown that (SONC) and (A-2) together imply
$\xi \cdot A \cdot \xi < 0$ for all $\xi \ne 0$ such that $B \cdot \xi = 0$
that is, (SOSC), hence (BHC). In other words, under (LM) and (R), (A-2) implies
that the last $(n - m)$ principal minors of $\hat{H}$ alternate signs, as in (BHC). Since every
element of $H$ is continuous in $\alpha$, the last $(n - m)$ principal minors of $H$ alternate
in signs for each fixed $\alpha$ in $\bar{N}(\hat{\alpha})$. In other words,13
(8) $(-1)^{i-m} H_i > 0$, $i = 2m + 1, \ldots, m + n$, for all $\alpha \in \bar{N}(\hat{\alpha})$
where $H_i$ is the $i$th successive principal minor of $H$.
Write $H^{-1}$ as

(9) $H^{-1} = \dfrac{1}{\det H} \begin{bmatrix} h_{11} & h_{21} & \cdots & h_{m+n,1} \\ h_{12} & h_{22} & \cdots & h_{m+n,2} \\ \vdots & & & \vdots \\ h_{1,m+n} & h_{2,m+n} & \cdots & h_{m+n,m+n} \end{bmatrix}$

where $h_{ij}$ is the cofactor of the $i$-$j$ element of $H$ and $\det H$ is the determinant
of $H$. Then in view of (8), we can conclude that, for each $\alpha \in \bar{N}(\hat{\alpha})$
(10) $\operatorname{sgn}(\det H) = (-1)^{n}$
and
(11) $\operatorname{sgn} h_{ii} = (-1)^{n-1}$, $i = 1, 2, \ldots, m + n$ 14

Therefore, for each $\alpha \in \bar{N}(\hat{\alpha})$,

(12) $\operatorname{sgn} \dfrac{h_{ii}}{\det H} = -1$, $i = 1, 2, \ldots, m + n$

Next decompose the matrix $H^{-1}$ as

(13) $H^{-1} = \begin{bmatrix} K_1 & K_2 \\ K_3 & K_4 \end{bmatrix}$

where $K_1$, $K_2$, $K_3$, and $K_4$ are, respectively, $m \times m$, $m \times n$, $n \times m$, and $n \times n$
matrices. Since $H$ is symmetric, so is $H^{-1}$; $K_1$ and $K_4$ are also symmetric.
Moreover, by condition (R) and (8), we can conclude that, for all $\alpha \in \bar{N}(\hat{\alpha})$,
(14) $\xi \cdot K_4 \cdot \xi \le 0$ for all $\xi \in R^n$
that is, $K_4$ is negative semidefinite.15
In summary, $K_4$ [that is, the $(n \times n)$ southeast submatrix of $H^{-1}$] is
symmetric, negative semidefinite, and its diagonal elements are all negative. Also
$H^{-1}$ is symmetric and its diagonal elements are all negative.
We now illustrate this discussion in terms of the Hicks-Slutsky equation.
The purpose here is only for illustration, and a more general (and elegant) dis-
cussion is postponed to Chapter 2, Section D.

d. AN EXAMPLE: HICKS-SLUTSKY EQUATION

Consider the problem of choosing $x \in R^n$ so as to
Maximize: $u(x)$
Subject to: $g(x) \equiv M - p \cdot x = 0$
where $p > 0$ and $M > 0$ are the parameters of the problem. We assume that (A-1)
holds with respect to the functions $u$ and $g$. That $p \ne 0$ ensures the rank condition
(R), since
$\left( \dfrac{\partial g}{\partial x_1}, \ldots, \dfrac{\partial g}{\partial x_n} \right) = (-p_1, -p_2, \ldots, -p_n) \ne 0$

for all $x$. Thus $B = -p$. Define the Lagrangian by

(15) $\Phi(x, \lambda, p, M) \equiv u(x) + \lambda (M - p \cdot x)$
Assume that there exists an $\hat{x} > 0$ which satisfies the (LM) condition. Then the
first-order condition (FOC) for this problem is now written as follows:
(16-a) $\hat{M} - \hat{p} \cdot \hat{x} = 0$
(16-b) $u_i(\hat{x}) - \hat{\lambda} \hat{p}_i = 0$, $i = 1, 2, \ldots, n$

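where $u_i(\hat{x}) \equiv \partial u(\hat{x}) / \partial x_i$. Define the bordered Hessian matrix $H$ by

(17) $H \equiv \begin{bmatrix} 0 & -p_1 & -p_2 & \cdots & -p_n \\ -p_1 & u_{11} & u_{12} & \cdots & u_{1n} \\ -p_2 & u_{21} & u_{22} & \cdots & u_{2n} \\ \vdots & & & & \vdots \\ -p_n & u_{n1} & u_{n2} & \cdots & u_{nn} \end{bmatrix}$
where $u_{ij} \equiv \partial^2 u / \partial x_i \partial x_j$. Evaluate these $u_{ij}$'s at $\hat{x}$ and set $p = \hat{p}$ in the above $H$.
Denote it by $\hat{H}$, and assume
(18) $\det \hat{H} \ne 0$
which corresponds to (A-2). We can then apply the implicit function theorem to
(16-a) and (16-b) and obtain the continuously differentiable functions $\lambda(p, M)$ and
$x(p, M)$ with $\hat{\lambda} = \lambda(\hat{p}, \hat{M})$, $\hat{x} = x(\hat{p}, \hat{M})$, and
(19-a) $M - p \cdot x(p, M) = 0$
(19-b) $u_i[x(p, M)] - \lambda(p, M) p_i = 0$, $i = 1, 2, \ldots, n$
for all $(p, M)$ in some neighborhood of $(\hat{p}, \hat{M})$.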
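Partially differentiating (19) with respect to $p_j$, we obtain

(20) $H \begin{bmatrix} \dfrac{\partial \lambda}{\partial p_j} \\[6pt] \dfrac{\partial x}{\partial p_j} \end{bmatrix} = \begin{bmatrix} x_j \\ 0 \\ \vdots \\ 0 \\ \lambda \\ 0 \\ \vdots \\ 0 \end{bmatrix}$

for all $(p, M)$ in the neighborhood of $(\hat{p}, \hat{M})$, where $\lambda = \lambda(p, M)$ and $x = x(p, M)$.
Let $e_j$ be the $(n + 1)$-vector whose $j$th element is one and all other elements are zero. The RHS of (20) can be rewritten as
$x_j e_1 + \lambda e_{j+1}$
Since $H$ is nonsingular in some neighborhood of $(\hat{p}, \hat{M})$, say $N(\hat{p}, \hat{M})$, from (18), we obtain

(21) $\begin{bmatrix} \dfrac{\partial \lambda}{\partial p_j} \\[6pt] \dfrac{\partial x}{\partial p_j} \end{bmatrix} = x_j(p, M) H^{-1} e_1 + \lambda(p, M) H^{-1} e_{j+1}$

for all $(p, M)$ in $N(\hat{p}, \hat{M})$, where $\lambda = \lambda(p, M)$ and $x = x(p, M)$.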

Next, partially differentiating (19) with respect to $M$ and taking account of (18), we obtain

(22) $\begin{bmatrix} \dfrac{\partial \lambda}{\partial M} \\[6pt] \dfrac{\partial x}{\partial M} \end{bmatrix} = -H^{-1} e_1$

for all $(p, M)$ in $N(\hat{p}, \hat{M})$, where $\lambda = \lambda(p, M)$ and $x = x(p, M)$. Therefore, combining (21) with (22), we obtain the following fundamental equation.
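(23) $\begin{bmatrix} \dfrac{\partial \lambda}{\partial p_j} \\[6pt] \dfrac{\partial x}{\partial p_j} \end{bmatrix} = \lambda H^{-1} e_{j+1} - x_j \begin{bmatrix} \dfrac{\partial \lambda}{\partial M} \\[6pt] \dfrac{\partial x}{\partial M} \end{bmatrix}$

for all $(p, M)$ in $N(\hat{p}, \hat{M})$, where $\lambda \equiv \lambda(p, M)$ and $x \equiv x(p, M)$. In particular,

(24) $\dfrac{\partial x_i(p, M)}{\partial p_j} = S_{ij}(p, M) - x_j(p, M) \dfrac{\partial x_i(p, M)}{\partial M}, \quad i, j = 1, 2, \ldots, n$

where

(25) $S_{ij} \equiv \lambda(p, M) \dfrac{h_{i+1, j+1}}{\det H}, \quad i, j = 1, 2, \ldots, n$

and

(26) $H^{-1} \equiv \dfrac{1}{\det H} [h_{ij}]$

as defined in (9). Equation (24) is called the Hicks-Slutsky equation. By (19-b), $\lambda(p, M) > 0$ if $u_i[x(p, M)] > 0$ for all $i$ (nonsatiation).
We now obtain the basic properties of $S_{ij}$.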
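Define $S$ by

(27) $S \equiv \begin{bmatrix} S_{11} & S_{12} & \cdots & S_{1n} \\ S_{21} & S_{22} & \cdots & S_{2n} \\ \vdots & \vdots & & \vdots \\ S_{n1} & S_{n2} & \cdots & S_{nn} \end{bmatrix}$

Then $S$ corresponds to $K_4$ in (13). Therefore, from our discussion on $K_4$, we can immediately conclude that, for all $(p, M)$ in $N(\hat{p}, \hat{M})$,
(28) $S_{ij} = S_{ji}, \quad i, j = 1, 2, \ldots, n$
(29) $S_{ii} < 0, \quad i = 1, 2, \ldots, n$ $^{17}$
(30) $S$ is negative semidefinite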

Next observe that $x_i(p, M)$ is homogeneous of degree zero in $p$ and $M$.$^{18}$ Hence by the Euler equation, we obtain

(31) $\sum_{j=1}^{n} \dfrac{\partial x_i(p, M)}{\partial p_j} p_j + \dfrac{\partial x_i(p, M)}{\partial M} M = 0, \quad i = 1, 2, \ldots, n$

for all $(p, M)$ in $N(\hat{p}, \hat{M})$. Combining this with (24), we obtain

(32) $\sum_{j=1}^{n} S_{ij} p_j = 0$, for all $(p, M)$ in $N(\hat{p}, \hat{M})$

Next, partially differentiating the budget equation (19-a) with respect to $p_j$ and $M$, respectively, we obtain

(33-a) $\sum_{i=1}^{n} p_i \dfrac{\partial x_i(p, M)}{\partial p_j} + x_j = 0, \quad j = 1, 2, \ldots, n$

(33-b) $\sum_{i=1}^{n} p_i \dfrac{\partial x_i(p, M)}{\partial M} = 1$

Combining (33) with (24), or also directly from (28) and (32), we obtain

(34) $\sum_{i=1}^{n} p_i S_{ij} = 0$ for all $(p, M)$ in $N(\hat{p}, \hat{M})$

Equations (28), (29), (30), (32), and (34) exhaust all the important properties of $S$.$^{19}$
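The properties (28), (29), (30), and (32) can be checked numerically on a concrete case. The following is a minimal sketch, not part of the original text: it assumes a log-linear (Cobb-Douglas) utility $u(x) = \sum_i a_i \ln x_i$, for which the demand functions are known in closed form, builds the bordered Hessian (17), inverts it, and forms $S = \lambda K_4$ as in (25). The function names `invert` and `cobb_douglas_slutsky` and all numbers are illustrative assumptions.

```python
def invert(A):
    """Gauss-Jordan inverse with partial pivoting (pure Python, small matrices)."""
    n = len(A)
    # Augment A with the identity matrix.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        d = aug[col][col]
        aug[col] = [v / d for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def cobb_douglas_slutsky(a, p, M):
    """Return (S, x, lam) for u(x) = sum_i a_i ln x_i (an assumed utility)."""
    n = len(a)
    total = sum(a)
    x = [a[i] * M / (total * p[i]) for i in range(n)]  # demand functions
    lam = total / M                                    # multiplier, from u_i = lam * p_i
    # Bordered Hessian (17): zero corner, border -p, and u_ij = -a_i / x_i^2 on the diagonal.
    H = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        H[0][i + 1] = H[i + 1][0] = -p[i]
        H[i + 1][i + 1] = -a[i] / x[i] ** 2
    Hinv = invert(H)
    # S = lam * K4, where K4 is the southeast n x n block of H^{-1}.
    S = [[lam * Hinv[i + 1][j + 1] for j in range(n)] for i in range(n)]
    return S, x, lam


a, p, M = [0.2, 0.3, 0.5], [1.0, 2.0, 4.0], 10.0
S, x, lam = cobb_douglas_slutsky(a, p, M)
for row in S:
    print(["%.4f" % v for v in row])
```

For this utility the demand derivatives are available analytically, so the matrix obtained from $H^{-1}$ can be compared entry by entry with $\partial x_i/\partial p_j + x_j\,\partial x_i/\partial M$, which is what (24) asserts.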
We may note that the following relations can be obtained directly from the budget equation (19-a) by utilizing (33) but without using (24):

(35-a) $\sum_{i=1}^{n} \theta_i \eta_{ij} = -\theta_j$ (the Cournot aggregation property)

(35-b) $\sum_{i=1}^{n} \theta_i \pi_i = 1$ (the Engel aggregation property)

where

(36-a) $\eta_{ij} \equiv \dfrac{\partial x_i(p, M)}{\partial p_j} \dfrac{p_j}{x_i}$ (price elasticity)

(36-b) $\pi_i \equiv \dfrac{\partial x_i(p, M)}{\partial M} \dfrac{M}{x_i}$ (income elasticity)

(36-c) $\theta_i \equiv \dfrac{p_i x_i}{M}$ (budget proportion of commodity $i$)

Also from homogeneity, or (31), we obtain

(37) $\sum_{j=1}^{n} \eta_{ij} + \pi_i = 0, \quad i = 1, 2, \ldots, n$

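Finally, utilizing (19-b) and assuming that $\lambda(p, M) \neq 0$,$^{20}$ rewrite the bordered Hessian in this problem as defined in (17) as follows:

(38) $H = \begin{bmatrix} 0 & u_1 & u_2 & \cdots & u_n \\ u_1 & u_{11} & u_{12} & \cdots & u_{1n} \\ u_2 & u_{21} & u_{22} & \cdots & u_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ u_n & u_{n1} & u_{n2} & \cdots & u_{nn} \end{bmatrix}$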

Then it should be clear that there is a close relation between the present formula-
tion and quasi-concave programming (that is, Theorems 1.E.2, 1.E.3, and 1.E.14).
The only important difference between the two approaches is that the quasi-con-
cavity of u together with the linearity (hence the concavity) of the constraint
function ensure the global result by specifying the signs of the principal minors of
H for all x.

e. THE ENVELOPE THEOREM$^{21}$

As before, consider the problem of choosing $x$ so as to
Maximize: $f(x, \alpha)$
Subject to: $g_j(x, \alpha) = 0$, $j = 1, 2, \ldots, m$, and $x \in X$ (an open subset of $R^n$)
Let $x(\alpha)$ and $\lambda(\alpha)$ be the functions corresponding to (4-a) and (4-b). It is assumed that (LM) holds as well as (R) so that (FOC) and (SONC) hold. Define the functions $F(\alpha)$ and $\Psi(\alpha)$ by$^{22}$
(39) $F(\alpha) \equiv f[x(\alpha), \alpha]$
(40) $\Psi(\alpha) \equiv f[x(\alpha), \alpha] + \lambda(\alpha) \cdot g[x(\alpha), \alpha]$
Then we have the following theorem.

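Theorem 1.F.4:$^{23}$ Assume that $F$ and $\Psi$ are continuously differentiable in $\alpha$. Then

(41) $\dfrac{\partial F(\alpha)}{\partial \alpha_k} = \dfrac{\partial \Psi(\alpha)}{\partial \alpha_k} = \dfrac{\partial \Phi(x, \lambda, \alpha)}{\partial \alpha_k}, \quad k = 1, 2, \ldots, l$

where

(42) $\dfrac{\partial \Phi(x, \lambda, \alpha)}{\partial \alpha_k} = \dfrac{\partial f(x, \alpha)}{\partial \alpha_k} + \lambda \cdot \dfrac{\partial g(x, \alpha)}{\partial \alpha_k}, \quad k = 1, 2, \ldots, l$

REMARK: In the vector notation, (41) and (42) are respectively written as$^{24}$

(43) $F_\alpha = \Psi_\alpha = \Phi_\alpha$

(44) $\Phi_\alpha = f_\alpha + \lambda \cdot g_\alpha$
REMARK: It is possible that $F$ and $\Psi$ are not continuously differentiable in $\alpha$. See Uzawa [15], for example.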
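PROOF:$^{25}$ Simply observe that
$$\begin{aligned} \Psi_\alpha &= \Phi_x \cdot x_\alpha + \Phi_\lambda \cdot \lambda_\alpha + \Phi_\alpha = 0 \cdot x_\alpha + g \cdot \lambda_\alpha + \Phi_\alpha \\ &= \Phi_\alpha \quad \text{[by using (4-a) and (4-b)]} \\ F_\alpha &= f_x \cdot x_\alpha + f_\alpha \\ &= -\lambda \cdot g_x \cdot x_\alpha + f_\alpha \quad \text{[since } f_x = -\lambda \cdot g_x \text{ from (4-a)]} \\ &= \lambda \cdot g_\alpha + f_\alpha \quad \text{[since } g_x \cdot x_\alpha + g_\alpha = 0 \text{ by (4-b)]} \\ &= \Phi_\alpha \end{aligned}$$
(Q.E.D.)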
REMARK: Notice that $\partial \Psi/\partial \alpha_k$ measures the total effect of a change in $\alpha_k$ on the Lagrangian, while $\partial \Phi/\partial \alpha_k$ measures the partial effect of a change in $\alpha_k$ on the Lagrangian with $x$ and $\lambda$ being held constant. Note that every partial derivative in (41) is evaluated at $[x(\alpha), \lambda(\alpha), \alpha]$.
From Theorem 1.F.4,

$\dfrac{\partial F(\alpha)}{\partial \alpha_k} = \dfrac{\partial f}{\partial \alpha_k} + \lambda \cdot \dfrac{\partial g}{\partial \alpha_k}$

Now change the original maximization problem in such a way that the $k$th parameter $\alpha_k$ is considered as one of the choice variables (such as the $x_i$'s). Then from the first-order conditions of this new problem, we have

$\dfrac{\partial f}{\partial \alpha_k} + \lambda \cdot \dfrac{\partial g}{\partial \alpha_k} = 0$

Thus

$\dfrac{\partial F(\alpha)}{\partial \alpha_k} = 0$

Write $(\alpha_1, \ldots, \alpha_{k-1}, \alpha_{k+1}, \ldots, \alpha_l) \equiv \beta$. Then we have the following two equations:
(45) $\phi(F, \beta, \alpha_k) \equiv F - F(\alpha) = 0$
(46) $\psi(\beta, \alpha_k) \equiv \dfrac{\partial F(\alpha)}{\partial \alpha_k} = 0$

Keeping $\alpha_k$ constant, $\phi(F, \beta, \alpha_k) = 0$ defines a surface in the ($F$-$\beta$)-space. By changing $\alpha_k$, we then obtain a family of such surfaces. By solving this equation (45) together with equation (46), we obtain the envelope of such a family of surfaces.$^{26}$ For example, if $\beta$ is a scalar, then (45) defines a curve in the ($F$-$\beta$)-plane for a given value of $\alpha_k$. By changing $\alpha_k$, we obtain a family of curves. The envelope of this family of curves in the ($F$-$\beta$)-plane may be obtained by eliminating $\alpha_k$ from (45) and (46) as
(47) $F = e(\beta)$ or $\epsilon(F, \beta) = 0$
The famous Wong-Viner envelope theorem that the long-run cost curve is
the envelope of the family of short-run cost curves is an example of the
above consideration, as we will see later (Example 4).
We now show some examples of the applications of Theorem 1.F.4. We
hope that these examples will illustrate the use of this theorem.
EXAMPLE 1 (marginal utility of income):
Consider the following problem again:
Maximize: $u(x)$
Subject to: $M - p \cdot x = 0$, $x \in R^n$

Let $x(p, M)$ and $\lambda(p, M)$, respectively, correspond to $x(\alpha)$ and $\lambda(\alpha)$ in (4). Define
(48) $U[p, M] \equiv u[x(p, M)]$ (indirect utility function)

and $\Phi[x, \lambda, p, M] \equiv u(x) + \lambda(M - p \cdot x)$

Then by Theorem 1.F.4, we obtain

(49) $\dfrac{\partial U}{\partial M} = \lambda(p, M)$

which is the well-known result that the Lagrangian multiplier of the problem signifies the marginal utility of income. We also obtain
(50) $\dfrac{\partial U}{\partial p_j} = -\lambda x_j, \quad j = 1, 2, \ldots, n$
Hence if $\lambda(p, M) > 0$, then an increase in any price will decrease the consumer's satisfaction. Also, from (49) and (50), we have $(\partial U/\partial p_j)/(\partial U/\partial M) = -x_j(p, M)$.
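Relations (49) and (50) lend themselves to a quick finite-difference check. The sketch below is an illustration only, under the assumption of a log-linear utility $u(x) = \sum_i a_i \ln x_i$ (not taken from the text), whose demand and multiplier are known in closed form. The helper names `demand`, `indirect_utility`, and `central_diff` are hypothetical.

```python
import math


def demand(a, p, M):
    """Demand x_i = a_i M / (sum(a) p_i) for the assumed utility sum_i a_i ln x_i."""
    total = sum(a)
    return [ai * M / (total * pi) for ai, pi in zip(a, p)]


def indirect_utility(a, p, M):
    """U[p, M] = u[x(p, M)], as in (48)."""
    return sum(ai * math.log(xi) for ai, xi in zip(a, demand(a, p, M)))


def central_diff(fn, t, h=1e-6):
    """Two-sided finite-difference approximation of fn'(t)."""
    return (fn(t + h) - fn(t - h)) / (2 * h)


a, p, M = [0.2, 0.3, 0.5], [1.0, 2.0, 4.0], 10.0
x = demand(a, p, M)
lam = sum(a) / M  # multiplier implied by u_i(x) = lam * p_i

# (49): dU/dM should equal the multiplier.
dU_dM = central_diff(lambda m: indirect_utility(a, p, m), M)

# (50): dU/dp_j should equal -lam * x_j.
dU_dp = [
    central_diff(lambda q, j=j: indirect_utility(a, p[:j] + [q] + p[j + 1:], M), p[j])
    for j in range(len(p))
]

print(dU_dM, lam)
print(dU_dp, [-lam * xj for xj in x])
```

The ratio `dU_dp[j] / dU_dM` then recovers `-x[j]`, which is the last relation stated in the example.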
EXAMPLE 2 (the meaning of the multipliers): In general, consider the problem of choosing $x \in R^n$ so as to
Maximize: $f(x)$
Subject to: $g_j(x) = b_j$, $j = 1, 2, \ldots, m$
Let $x(b)$ and $\lambda(b)$ correspond to $x(\alpha)$ and $\lambda(\alpha)$ in (4), where $b = (b_1, b_2, \ldots, b_m)$. Define
(51-a) $F(b) \equiv f[x(b)]$

and

(51-b) $\Phi(x, \lambda, b) \equiv f(x) + \lambda \cdot [b - g(x)]$

Then by Theorem 1.F.4, we obtain
(52) $\dfrac{\partial F(b)}{\partial b_j} = \lambda_j(b), \quad j = 1, 2, \ldots, m$

Thus the $j$th Lagrangian multiplier signifies the marginal rate of change of the optimal value of the objective function with respect to a change in the $j$th constraint. Example 1 is clearly a special case of this. Interpreting $b_j$ as the amount of the $j$th resource supply, $\lambda_j$ signifies the shadow price of the $j$th resource.
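As a small worked instance of (52), consider the following hypothetical problem (the objective and constraint are chosen only for illustration and do not appear in the text): maximize $f(x) = -(x_1^2 + x_2^2)$ subject to $x_1 + x_2 = b$.

```latex
% Illustrative example (assumed, not from the text): f(x) = -(x_1^2 + x_2^2), g(x) = x_1 + x_2.
\begin{align*}
\Phi(x, \lambda, b) &= -(x_1^2 + x_2^2) + \lambda\,[\,b - x_1 - x_2\,] \\
\Phi_{x_i} = 0 &\;\Longrightarrow\; -2x_i - \lambda = 0, \quad i = 1, 2 \\
x_1 + x_2 = b &\;\Longrightarrow\; x_1(b) = x_2(b) = \tfrac{b}{2}, \qquad \lambda(b) = -b \\
F(b) = f[x(b)] &= -\tfrac{b^2}{2}, \qquad \frac{dF(b)}{db} = -b = \lambda(b)
\end{align*}
```

Here the multiplier is negative when $b > 0$: tightening the requirement $x_1 + x_2 = b$ upward lowers the attainable value of the objective, and $\lambda(b)$ measures that rate exactly, in agreement with (52).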
EXAMPLE 3 (cost minimization):
Consider the following problem:
Maximize: $-w \cdot x$ (= Minimize: $w \cdot x$)
Subject to: $g(x) = y$
where $w > 0$, $y > 0$, and $g(x)$, respectively, signify the input price vector, the output (scalar), and the production function. Let $x(w, y)$ and $\lambda(w, y)$, respectively, correspond to $x(\alpha)$ and $\lambda(\alpha)$ of (4).
Define$^{27}$
(53-a) $C(w, y) \equiv w \cdot x(w, y)$
and
(53-b) $\Phi(x, \lambda, w, y) \equiv -w \cdot x + \lambda[g(x) - y]$
Then by Theorem 1.F.4,$^{28}$

(54) $\dfrac{\partial C}{\partial y} = \lambda(w, y)$

so that the multiplier signifies the long-run marginal cost. Also we obtain

(55) $\dfrac{\partial C}{\partial w_i} = x_i(w, y), \quad i = 1, 2, \ldots, n$

so that an increase in any factor price increases the minimum total cost $C$. From (55), we obtain

$\sum_{i=1}^{n} \dfrac{\partial C(w, y)}{\partial w_i} w_i = \sum_{i=1}^{n} w_i x_i(w, y) = C(w, y)$

for all $(w, y)$. Hence by Euler's theorem, the minimal cost function $C(w, y)$ is homogeneous of degree one in $w$. Also noting that $\partial^2 C/\partial w_i \partial y = \partial^2 C/\partial y \partial w_i$, we obtain from (54) and (55)

(56) $\dfrac{\partial x_i(w, y)}{\partial y} = \dfrac{\partial \lambda(w, y)}{\partial w_i}$

Then, using (54), (53), and (56) successively, we can observe

$\lambda(w, y) = \dfrac{\partial C}{\partial y} = \sum_{i=1}^{n} w_i \dfrac{\partial x_i(w, y)}{\partial y} = \sum_{i=1}^{n} \dfrac{\partial \lambda}{\partial w_i} w_i$ for all $(w, y)$

Thus $\lambda(w, y)$ is also homogeneous of degree one in $w$. Recall that by the first-order condition, we have
$w_i = \lambda(w, y) g_i(x), \quad i = 1, 2, \ldots, n$
where $g_i(x) \equiv \partial g(x)/\partial x_i$. Therefore if, in particular, the production function is homogeneous of degree one in $x$ so that $\sum_{i=1}^{n} g_i(x) x_i = g(x)$ for all $x$, then we obtain

$C(w, y) = \sum_{i=1}^{n} w_i x_i = \lambda \sum_{i=1}^{n} g_i(x) x_i = \lambda(w, y) y$

From this we can conclude that,$^{29}$ for all $(w, y)$,

(57) $\dfrac{C}{y} = \dfrac{\partial C}{\partial y}$ $(= \lambda)$

(58) $\dfrac{\partial \lambda(w, y)}{\partial y} = 0$
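The relations (54), (55), (57), and (58) can be illustrated with a constant-returns technology. The sketch below assumes a two-input Cobb-Douglas production function $g(x) = x_1^{a} x_2^{1-a}$ with $a = 0.4$ (an assumption made here for illustration, not taken from the text); the function names are hypothetical, and the derivatives are approximated by central differences.

```python
def conditional_demand(w, y, a=0.4):
    """Cost-minimizing inputs for the assumed technology g(x) = x1^a * x2^(1-a)."""
    w1, w2 = w
    x1 = y * (a * w2 / ((1 - a) * w1)) ** (1 - a)
    x2 = y * ((1 - a) * w1 / (a * w2)) ** a
    return [x1, x2]


def cost(w, y, a=0.4):
    """Minimum cost C(w, y) = w . x(w, y), as in (53-a)."""
    x = conditional_demand(w, y, a)
    return w[0] * x[0] + w[1] * x[1]


def multiplier(w, y, a=0.4):
    """lambda(w, y) from the first-order condition w_1 = lambda * g_1(x)."""
    x1, x2 = conditional_demand(w, y, a)
    g1 = a * x1 ** (a - 1) * x2 ** (1 - a)  # marginal product of input 1
    return w[0] / g1


def central_diff(fn, t, h=1e-6):
    """Two-sided finite-difference approximation of fn'(t)."""
    return (fn(t + h) - fn(t - h)) / (2 * h)


w, y = [2.0, 3.0], 5.0
x = conditional_demand(w, y)
lam = multiplier(w, y)

dC_dy = central_diff(lambda t: cost(w, t), y)            # compare with (54)
dC_dw1 = central_diff(lambda t: cost([t, w[1]], y), w[0])  # compare with (55), i = 1
dC_dw2 = central_diff(lambda t: cost([w[0], t], y), w[1])  # compare with (55), i = 2

print(dC_dy, lam, cost(w, y) / y)  # marginal cost, multiplier, average cost
print(dC_dw1, x[0], dC_dw2, x[1])
```

Because the assumed technology is homogeneous of degree one, average cost, marginal cost, and the multiplier coincide, and evaluating `multiplier` at different output levels returns the same number, which is what (57) and (58) state.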

EXAMPLE 4 (the envelope of the short-run cost curves): In the problem of the previous example, reinterpret $x$ as the vector of variable factor inputs and consider the following problem.
Maximize: $-[w \cdot x + f(k)]$ [= Minimize: $w \cdot x + f(k)$]
Subject to: $g(x, k) = y$ and $x \in R^n$
where $k$ and $f(k)$, respectively, signify the "size of the plant" and the "fixed cost." For the sake of simplicity, $k$ is assumed to be a scalar rather than a vector signifying the spectrum of capital goods. Let $x(w, y, k)$ and $\lambda(w, y, k)$, respectively, correspond to $x(\alpha)$ and $\lambda(\alpha)$ of (4).
Define$^{30}$
(59-a) $C(w, y, k) \equiv w \cdot x(w, y, k) + f(k)$
and
(59-b) $\Phi(x, \lambda, w, y, k) \equiv -[w \cdot x + f(k)] + \lambda[g(x, k) - y]$
Then from Theorem 1.F.4,

(60) $\dfrac{\partial C(w, y, k)}{\partial y} = \lambda(w, y, k)$

so that the multiplier $\lambda$ in this problem signifies the short-run marginal cost. Assume that $w$ is fixed and define the function $\phi$ by
(61) $\phi(C, y, k) \equiv C - C(w, y, k) = 0$
For a fixed value of $k$, the graph of $\phi$ in the ($C$-$y$)-plane denotes a short-run (total) cost curve. In the long-run case in which $k$ is allowed to adjust, we have
(62) $\dfrac{\partial \Phi}{\partial k} = -f'(k) + \lambda \dfrac{\partial g}{\partial k} = 0$
from the first-order conditions. Since $\partial C(w, y, k)/\partial k = -\Phi_k$ by Theorem 1.F.4, we obtain from (61) and (62)
(63) $\dfrac{\partial \phi(C, y, k)}{\partial k} = 0$
Suppose that we can obtain the unique relation between $C$ and $y$ by eliminating $k$ from (61) and (63) as
(64) $C = e(y)$
Then the graph of $e$ in the ($C$-$y$)-plane signifies the long-run cost curve as the envelope of the short-run cost curves.
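A concrete family of short-run cost curves makes the elimination of $k$ from (61) and (63) explicit. The sketch below is purely illustrative and rests on assumptions not in the text: a single variable input with $g(x, k) = \sqrt{xk}$, a fixed cost $f(k) = k$, and an input price $w$. Then $x = y^2/k$, so $C(w, y, k) = w y^2/k + k$; setting $\partial C/\partial k = 0$ gives $k = y\sqrt{w}$ and the long-run cost $2y\sqrt{w}$.

```python
def short_run_cost(y, k, w=1.0):
    """C(w, y, k) = w * x + f(k) with x = y^2 / k (from sqrt(x k) = y) and f(k) = k."""
    return w * y ** 2 / k + k


def long_run_plant(y, w=1.0):
    """Plant size solving dC/dk = -w y^2 / k^2 + 1 = 0, the analogue of (63)."""
    return y * w ** 0.5


def long_run_cost(y, w=1.0):
    """Envelope e(y): short-run cost evaluated at the cost-minimizing plant size."""
    return short_run_cost(y, long_run_plant(y, w), w)


# The long-run curve lies on or below every short-run curve ...
for y in [0.5, 1.0, 2.0, 3.0]:
    gaps = [short_run_cost(y, k) - long_run_cost(y) for k in [0.25, 0.5, 1.0, 2.0, 4.0]]
    print(y, ["%.3f" % g for g in gaps])

# ... and is tangent to the short-run curve whose plant size is optimal at y0.
h, y0 = 1e-6, 2.0
k0 = long_run_plant(y0)
sr_slope = (short_run_cost(y0 + h, k0) - short_run_cost(y0 - h, k0)) / (2 * h)
lr_slope = (long_run_cost(y0 + h) - long_run_cost(y0 - h)) / (2 * h)
print(sr_slope, lr_slope)
```

The equality of the two slopes at $y_0$ is the statement that short-run and long-run marginal cost coincide at the output for which the plant size is optimal.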

FOOTNOTES
1. This section is indebted to Otani [11] as well as Hicks [8] and Samuelson [12].
2. Therefore, $\Phi_x = f_x + \lambda \cdot g_x$, where $g_x$ denotes the Jacobian matrix of $g$ with respect to $x$, and $f_x$ denotes the gradient vector of $f$ with respect to $x$.
3. In other words, $\Phi_{xx}$ denotes the Jacobian matrix of $\Phi_x$ with respect to $x$, so that it is the Hessian matrix of $\Phi$ in $x$.
4. The rank of a rectangular matrix is equal to the number of linearly independent
rows, which is equal to the number of linearly independent columns. Rank B denotes
"the rank of the matrix B."
5. Recall Theorem 1.E.16.
6. Recall, for example, Chapter 1, Section D.
7. To save space, we will not attempt to make detailed comments in the subsequent
discussions from the viewpoint of modern theory. The reader is urged to do this
job by himself.
8. The implicit function theorem states the following: Let $f_i(y, \alpha)$, $i = 1, 2, \ldots, s$, be continuously differentiable real-valued functions on $Y \otimes A$ where $Y$ and $A$, respectively, are open subsets of $R^s$ and $R^l$. Let $f_i(\hat{y}, \hat{\alpha}) = 0$, $i = 1, 2, \ldots, s$, for some $(\hat{y}, \hat{\alpha}) \in Y \otimes A$, and assume that the determinant of the Jacobian matrix $[\partial f_i(\hat{y}, \hat{\alpha})/\partial y_j]$ is nonvanishing. Then there exists a continuously differentiable function $h$ such that $\hat{y} = h(\hat{\alpha})$ and $f_i[h(\alpha), \alpha] = 0$, $i = 1, 2, \ldots, s$, for all $\alpha$ in some neighborhood of $\hat{\alpha}$. For the proof of this theorem, see any textbook of advanced calculus. It is important to realize that this is a "local" theorem in the sense that the above neighborhood may be very small.
9. Let $\Phi_\lambda$ be the gradient vector of $\Phi$ with respect to $\lambda$. Then by definition of $\Phi$, we can rewrite (4-b) as $\Phi_\lambda[x(\alpha), \alpha] = 0$.
10. Here $\Phi_{x\alpha_k}$ is the $n$-dimensional column vector whose $i$th element is $\partial^2 \Phi[x(\alpha), \lambda(\alpha), \alpha]/\partial x_i \partial \alpha_k$, and similarly for $\Phi_{\lambda\alpha_k}$. Clearly the $j$th element of $\Phi_{\lambda\alpha_k}$ is equal to $\partial g_j[x(\alpha), \alpha]/\partial \alpha_k$.
11. Clearly, $\hat{H}$ is obtained from $H$ by evaluating every element of $H$ at $\hat{\alpha}$.
12. One should note that, in many applications in economics, (A-2) fails to hold; thus $H^{-1}$ fails to exist. The homogeneity and the concavity of the relevant functions are often the source of such singularity.
13. This also means that (SOSC) holds for all $\alpha$ in $N(\hat{\alpha})$. Therefore, under (FOC) and (R), we can conclude that, for each fixed $\alpha$ in $N(\hat{\alpha})$, $x(\alpha)$ achieves a local maximum of $f(x, \alpha)$ subject to $g(x, \alpha) = 0$, $x \in X$.
14. Note that $h_{m+n,m+n}$ is the $(m + n)$-$(m + n)$ cofactor of $H$; hence it has the sign opposite of $\det H$ as a result of (8). Thus $\operatorname{sgn} h_{m+n,m+n} = (-1)^{m+n-1}$. From this, we can deduce (11) by using the property of a determinant that a simultaneous permutation of rows and columns does not alter the sign and the value of the determinant.
15. They respectively correspond to 0, $\Phi_{\lambda x}$, $\Phi_{x\lambda}$, and $\Phi_{xx}$ in $H$.
16. In general, we have the following theorem. Let $H$ be any $(m + n) \times (m + n)$ symmetric matrix with real entries. Assume that $H$ is decomposed in the form of (6). Assume also that rank $B = m$ and that $A$ is negative definite subject to $Bh = 0$. Then $H^{-1}$ exists and $K_4$ of $H^{-1}$ [where $H^{-1}$ is decomposed as (13)] is negative semidefinite. See Caratheodory ([4], pp. 195-196). Samuelson ([12], pp. 378-379) contains the statement of such a theorem.
17. Since $S_{ii} < 0$, we have $\partial x_i(p, M)/\partial p_i < -x_i \partial x_i(p, M)/\partial M$, $i = 1, 2, \ldots, n$, from (24). Commodity $i$ is defined as Giffen if $\partial x_i(p, M)/\partial p_i > 0$, and inferior if $\partial x_i(p, M)/\partial M < 0$. Hence it is clear that every Giffen commodity is inferior, but not necessarily vice versa.
18. The homogeneity of $x(p, M)$ is due to the fact that, in the original maximization problem, $M - p \cdot x = 0$ if and only if $cM - (cp) \cdot x = 0$ for all scalars $c > 0$, so that $\hat{x} = x(p, M) = x(cp, cM)$ for all $c > 0$.
19. The term $S_{ij}$ is called the net (or pure) substitution term. To understand the meaning of this term, consider the problem of choosing $x \in R^n$ so as to minimize $p \cdot x$ (expenditures) subject to $u(x) = \bar{u}$, where $\bar{u}$ is fixed. Denote the solution to this dual problem by $x = h(p, \bar{u})$. If $\bar{u} = u[x(p, M)]$, then the solution of the utility maximization problem becomes the solution of this outlay minimization problem, and it can be shown that $S_{ij} = \partial h_i(p, \bar{u})/\partial p_j$, that is, a change in the demand for $i$ when $p_j$ is changed with a compensated change in income so as to keep the level of utility $\bar{u}$ fixed. Two commodities $i$ and $j$ are said to be substitutes if $S_{ij} > 0$ and complements if $S_{ij} < 0$. From (34), it is clear that at least one pair must be substitutes (that is, it is not possible that all commodities are complements).
20. This is satisfied if $u_i[x(p, M)] > 0$ for all $i$ (nonsatiation), as remarked earlier.
21. The discussion here can easily be carried out in the global context under proper assumptions.
22. By definition of $\Phi$, we may also write (40) as $\Psi(\alpha) = \Phi[x(\alpha), \lambda(\alpha), \alpha]$.
23. The relation $\partial F/\partial \alpha_k = \partial \Phi/\partial \alpha_k$ is due to Afriat ([1], pp. 355-357). The proposition in the form of this theorem is found in Otani [11]. See also D. G. Luenberger, Optimization by Vector Space Methods, New York, Wiley, 1969, pp. 221-223.
24. The notations $F_\alpha$, $\Psi_\alpha$, and $\Phi_\alpha$, respectively, denote the gradient vectors of $F$, $\Psi$, and $\Phi$ with respect to $\alpha$.
25. Here, $x_\alpha$ is the ($n \times l$) (Jacobian) matrix whose ($i$-$k$) element is $\partial x_i/\partial \alpha_k$. Similarly, $\lambda_\alpha$ is the ($m \times l$) matrix whose ($j$-$k$) element is $\partial \lambda_j/\partial \alpha_k$. The proof is a simple application of the chain rule (Theorem 1.C.2) with the first-order conditions [in the form of (4-a) and (4-b)].
26. Consider, for example, the family of curves $y = (x - \alpha)^2$ in the ($x$-$y$)-plane where $\alpha$ is a parameter. This is the family of parabolas obtained by translating $y = x^2$ in the direction of the $x$-axis. Clearly the $x$-axis (that is, $y = 0$) is the envelope. This is obtained by eliminating $\alpha$ from $f(x, y, \alpha) \equiv y - (x - \alpha)^2 = 0$ and $\partial f/\partial \alpha = 2(x - \alpha) = 0$. In general, consider $f(x, y, \alpha) = 0$ where $(x, y) \in R^2$ and $\alpha \in R$. Regarding $\alpha$ as a parameter, we obtain a family of curves in the ($x$-$y$)-plane. An envelope of a family of curves is a curve with the following two properties: (1) At every one of its points it is tangent to at least one curve of the family; (2) it is tangent to every curve of the family at at least one point. An envelope may not be unique [for example, consider the family of circles, $(x - \alpha)^2 + y^2 = 1$]. The envelope of a family of curves is the union of its envelopes. The envelope is obtained by eliminating $\alpha$ from the two equations $f(x, y, \alpha) = 0$ and $\partial f(x, y, \alpha)/\partial \alpha = 0$. In general, the envelope of multiparameter surfaces, $f(z_1, z_2, \ldots, z_n; \alpha_1, \alpha_2, \ldots, \alpha_s) = 0$, is obtained by eliminating $\alpha$ from this equation together with $\partial f(z, \alpha)/\partial \alpha_k = 0$, $k = 1, 2, \ldots, s$. It is, of course, possible that a family of curves or surfaces may never generate an envelope. The above procedure only gives a necessary (but not necessarily a sufficient) condition to obtain an envelope. The exposition of the envelope is found in most textbooks of advanced calculus (for example, E. B. Wilson, 1911; W. F. Osgood, 1925; H. B. Fine, 1937; D. V. Widder, 1947; J. M. H. Olmsted, 1961, and so on) and classical treatments of differential geometry.
27. Here $C(w, y)$ is the long-run minimum (total) cost for given $(w, y)$. Fixing $w$, the graph of $C$ as a function of $y$ is the total cost curve that appears in many textbooks on price theory.
28. Note that $\partial C/\partial y = -\partial \Phi/\partial y$ by Theorem 1.F.4.
29. Equation (57) says that the average cost is equal to the marginal cost if constant returns to scale prevail. To obtain (58), note that $\partial(C/y)/\partial y = [(\partial C/\partial y) - (C/y)]/y$, which is equal to zero by (57). Equation (58) says that the function $\lambda$ is independent of $y$. Hence we may write $\lambda(w, y) = u(w)$. Relations (57) and (58) are important in the results known as the Shephard-Samuelson theorem. See Shephard [13] and Takayama ([14], pp. 549-551).
30. The function $C(w, y, k)$ signifies the short-run minimum total cost, for given $(w, y, k)$.

REFERENCES
1. Afriat, S. N., "Theory of Maxima and the Method of Lagrange," SIAM Journal
on Applied Mathematics, 20, May 1971.
2. Bliss, G. M., Lectures on the Calculus of Variations, Chicago, University of Chicago
Press, 1946.
3. Burger, E., "On Extrema with Side Conditions," Econometrica, 23, October 1955.
4. Caratheodory, C., Calculus of Variations and Partial Differential Equations of the
First Order, Part II, San Francisco, Holden Day, 1967, esp. chap. 11 (German
original, 1935).

5. Debreu, G., "Definite and Semidefinite Quadratic Forms," Econometrica, 20, April
1952.
6. El-Hodiri, M. A., Constrained Extrema: Introduction to the Differentiable Case with
Economic Applications, New York, Springer-Verlag, 1971.
7. Hestenes, M. R., Calculus of Variations and Optimal Control Theory, New York,
Wiley, 1966, chap. 1.
8. Hicks, J. R., Value and Capital, 2nd ed., Oxford, Clarendon Press, 1946 (1st ed. 1939).
9. Mangasarian, O. L., Nonlinear Programming, New York, McGraw-Hill, 1969.
10. Mann, H. B., "Quadratic Forms with Linear Constraints," American Mathematical
Monthly, 1943: reprinted in Readings in Mathematical Economics, ed. by P. Newman,
Baltimore, Md., Johns Hopkins Press, Vol. 1, 1968.
11. Otani, Y., Microeconomic Theory, lecture notes at Purdue University, Fall 1971.
12. Samuelson, P. A., Foundations of Economic Analysis, Cambridge, Mass., Harvard
University Press, 1947.
13. Shephard, R. W., Cost and Production Functions, Princeton, N.J., Princeton
University Press, 1953.
14. Takayama, A., International Trade-An Approach to the Theory, New York, Holt,
Rinehart and Winston, 1972.
15. Uzawa, H., "A Note on the Menger-Wieser Theory of Imputation," Zeitschrift für
Nationalökonomie, XVIII, August 1958.
2
THE THEORY OF COMPETITIVE MARKETS

Section A
INTRODUCTION

In this chapter, we study the theory of competitive markets. We consider
an economy which consists of two types of "economic agents," one called a
"producer" and the other a "consumer." There are m consumers and k producers
in the economy. These economic agents are concerned with "commodities." A
commodity bundle is considered to be an element of $R^n$; that is, it is an
n-dimensional vector whose components are real numbers. Usually a "commodity" is
defined by the specification of all its physical characteristics, its availability
location, and its availability date. Services are also considered to be commodities.
We call attention to the fact that each commodity is dated. Hence today's apple
is a different commodity from tomorrow's apple. Thus the time element is intro-
duced into the model. There is a "price" for each commodity and the price vector
is an element in $R^n$. It is assumed in the theory of "competitive" markets that
each economic agent is so "small" relative to the size of the economy that he
cannot affect the price level that prevails in the economy (or, more precisely,
the impact of his action, as a producer or a consumer, on market prices is
negligible). This assumption obviously implies that there are many producers and
consumers in the economy. A behavioral rule is assumed for each economic agent.
It is assumed that each consumer maximizes his satisfaction over the set of com-
modity bundles that he can afford to buy with his income, and that each producer
maximizes his profit using the process or processes available in his production
set. Activity analysis is most typically concerned with a characterization of such
a competitive producer when his production set is a convex polyhedral cone.
For example, we showed that every efficient point of a production set is a profit
maximization point under a certain fixed price vector and that every profit maxi-
mization point (under a certain fixed price vector) is an efficient point (Theorems
0.C.2 and 0.C.3). In Section D of this chapter we are concerned with the theory
of a competitive consumer. In describing the theory of consumer's choice, we
assume that only the commodity bundles that a consumer wishes to consume
enter into his decision making and that the prices of the commodities do not
affect his preferences. In real life, one may wish to consume a certain commodity
mainly because it is expensive. Such a "snob effect" is assumed away in this
chapter. In this connection, we must note that there is an important complication
in the theory of consumer's choice. Unlike the theory of production, we do not
have a measurable behavioral criterion such as profit. However, it turns out that
important results in the theory of consumer's choice and the subsequent results
in the theory of competitive markets can be obtained without any reference to the
measurability of individual's satisfaction. We will observe this throughout the
chapter.
In addition to the study of the behavior of each type of economic agent,
the theory of competitive markets is also concerned with the interaction of many
agents in the economy. This is the question of a competitive equilibrium. Essentially,
a competitive equilibrium is a state of affairs in which each consumer maximizes
his satisfaction given his budget set defined by the prevailing price vector, each
producer maximizes his profit given the same price vector, and the total supply
of commodities is equal to the total demand for commodities.$^{1}$ In this chapter
we study the following two aspects of a competitive equilibrium.

(i) The welfare implication of a competitive equilibrium (Section C).

(ii) The existence of a competitive equilibrium (Section E).

The existence of a competitive equilibrium depends on whether or not there

exists a price vector such that a competitive equilibrium as described above
can be sustained. In other words, it is essentially concerned with the "consistency"
of the concept of a competitive equilibrium and the model of a competitive
economy in the sense of whether the actions of numerous "competitive" producers
and "competitive" consumers can be consistent with each other.
In developing such a theory of competitive markets, one fundamental
assumption is often made and usually plays a crucial role in establishing the
major results. This is the assumption of the absence of "externalities." In other
words, it is usually assumed that the interdependence among the economic
agents (consumer vs. consumer, consumer vs. producer, producer vs. producer)
is negligible. One type of interdependence among consumers is known as the "de-
monstration effect." In essence, the "independence" among consumers says that
each consumer cares only about the consumption bundle that he consumes and
does not care about the consumption bundles of others. If this is the case, the
consumer is said to be selfish or individualistic.$^{2}$ One type of interdependence among
producers is known as the (technological) external economies and diseconomies.
A famous example of the interdependence between producer and consumer is the
dissatisfaction incurred by the public as a result of smoke from a factory's
chimney. The theory of such externalities is making rapid progress in economics.
Here we only point out that fresh views are given for the entire theory by Hurwicz
[2] and by the "core" theorists.$^{3}$

Some preliminary remarks on notations are now in order. The production

process $y_j$ for the jth producer is an n-vector whose negative elements denote
"inputs" and whose positive elements denote "outputs." The set of possible
consumption bundles for the ith individual is denoted by $X_i$, and is called his
consumption set. Given a price vector $p$ and income $M_i$, his budget set is
$\{x_i : x_i \in X_i$ and $p \cdot x_i \leq M_i\}$. Each individual receives his income either by selling (or offering)
his "resources" in the market or by receiving a gift from someone else. His
resources may be physical goods such as apples or labor services performed by
him. Whether there is money in the economy or not is immaterial here since we
do not consider its specific features as such in this chapter. Money, if it exists in
this economy, serves mainly as a unit of measurement of prices and the generally
accepted means of payment. In a "money" economy he receives his "money
income" by selling his resources to the market, while in a barter economy he
obtains a bundle of goods and services for his consumption in exchange for his
resources. The convenience due to the existence of money when compared with
the barter economy may enter the utility function of each consumer. But since
money, then, is like a public good and the amount of satisfaction to each individual
cannot be well described, we might as well assume that money does not enter
the individual's utility function. Or we may just consider that money is simply
inherent to the institutional scheme of the monetary economy, so that one does
not obtain any particular utility from it. If the reader so desires, for the sake of
simplicity of discussion, he can assume that our entire discussion is confined to
a barter economy.
The relationships between a consumer's budget set and his consumption
bundle require additional clarification. First, the word "income" may be mis-
leading, as from the above argument it can be considered to be the total value of
his "resources." For example, one can sell the land that he owns if he so wishes,
instead of selling the services from the land. In other words, the stocks of those
commodities which are marketable can enter the definition of the consumer's
budget constraint. Thus by the word "income" we mean the total value of all the
flows and stocks of the commodities that he can sell to the markets. Hence the
word "income" may more properly be replaced by "wealth." However, once the
above point is recognized, it is rather immaterial whether we call it "income" or
"wealth," and thus we stick to the more conventional word "income." Note also
that as long as the commodities are dated, income may also be dated, so that the
possibility of borrowing and lending will affect the consumer's budget set.$^{4}$
One implication of dating commodities is that it is assumed that each con-
sumer knows all of the bundles of commodities that he will want to consume in
the future and all of the price vectors that will prevail; that is, we assume that
each consumer has perfect foresight. This assumption can easily be justified in
a stationary state (or even on a "balanced growth" path),$^{5}$ though under any
other conditions justification for the assumption is difficult. The extension of the
theory of competitive markets to the world of uncertainty has recently been
explored.
In any case, let us now return to Mr. i's consumption vector $x_i$. Traditionally,
$x_i$ is taken to be a vector whose components are all nonnegative. If Mr. i has
resources $\bar{x}_i$ and if he gets all his income by selling this $\bar{x}_i$, his income will be
$p \cdot \bar{x}_i$, when price vector $p$ prevails in the market. His budget constraint is thus

$p \cdot x_i \leq p \cdot \bar{x}_i$

We may note that Mr. i, if he wishes, can retain a part of his resources for his
own consumption. One way to handle this is to (fictitiously) suppose that he sells
all his resources in the market and buys some of them back (with zero transaction
cost).
Some simple examples may clarify the above budget relation. Suppose Mr. i
has holdings of only one commodity, A. Suppose further that his initial holding
of A is the amount $\bar{X}_a$ and that he sells a part of A, say, $X_a$. His consumption of
A will then be $(\bar{X}_a - X_a)$. Now let us assume that he also consumes commodity
B of which he has no initial holding. Letting $p_a$ and $p_b$ denote the price of A and
B, respectively, his budget constraint can be written as
$p_a(\bar{X}_a - X_a) + p_b X_b = p_a \bar{X}_a$
We may suppose that this budget constraint means that he sells all his initial
holdings of A, $\bar{X}_a$, and that he buys back the amount $(\bar{X}_a - X_a)$. The
commodities A and B could be such goods as apples and bananas, or one of them
could be labor services (or leisure). In other words, we may interpret $\bar{X}_a$ = 24 hours
a day, $X_a$ = the amount of labor he sells to the market per day, and $(\bar{X}_a - X_a)$ =
the amount of leisure he consumes per day.
The above budget relation could also be written as
$p_b X_b = p_a X_a$
If we write the equation this way, we have to infer from $X_a$ the amount of
commodity A which enters into his consumption and his preference ordering. This
can be done by specifying the value of $\bar{X}_a$ and computing the value of $(\bar{X}_a - X_a)$.
As long as the total amounts of resources held by him are fixed, it does not
make any fundamental difference how we write the budget relation. In fact, we
may rewrite the above relation as
$p_a(-X_a) + p_b X_b \leq 0$
We then consider his consumption bundle as $(-X_a, X_b)$ and his total budget
(income) as zero. We may suppose that his preference ordering (or utility function)
is defined on all possible values of $(-X_a, X_b)$ instead of all possible values of
$(\bar{X}_a - X_a, X_b)$. One caution to note is that under this supposition, his
consumption vector is no longer nonnegative.
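The three ways of writing the budget relation can be compared on one set of numbers. The figures below are hypothetical and chosen only for illustration: an endowment of 24 hours, 8 hours of labor sold, a wage of 10, and a price of 5 for commodity B.

```latex
% Hypothetical numbers: \bar{X}_a = 24, X_a = 8, p_a = 10, p_b = 5.
\begin{align*}
p_a(\bar{X}_a - X_a) + p_b X_b = p_a \bar{X}_a
  &:\quad 10(24 - 8) + 5 X_b = 240 \;\Longrightarrow\; X_b = 16 \\
p_b X_b = p_a X_a
  &:\quad 5 \cdot 16 = 10 \cdot 8 = 80 \\
p_a(-X_a) + p_b X_b \leq 0
  &:\quad 10(-8) + 5 \cdot 16 = 0 \leq 0
\end{align*}
```

In the first form the consumption bundle is (16 hours of leisure, 16 units of B) with an income of 240; in the third it is the net trade $(-8, 16)$ with an income of zero. Both describe the same choice.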
In general, we can rewrite $p \cdot x_i \leq p \cdot \bar{x}_i$ as $p \cdot z_i \leq 0$, where $z_i \equiv x_i - \bar{x}_i$, and
consider $z_i$ as his consumption bundle and 0 as his total budget. Negative elements
in $z_i$ represent quantities of commodities supplied and positive elements in $z_i$
represent quantities received. In this convention, the consumption set $X_i$ is the
set of all possible consumptions and trades (of the ith consumer). Usually, it is
assumed that $X_i$ is a subset of $R^n$.
Further complications arise when we consider the producers. First, pro-
ducers may hold certain resources. The question is: Who claims the income from
these resources? There is one simple answer (not the only answer). We may assume
that all the resources are initially held by the consumers (and none by producers)
and some of them are sold to the producers (for example, labor service). Thus
consumers get the income. Second, producers may get positive (negative or zero)
profit. Who has the claim to the profit? This can be solved simply by assuming
that all the firms are owned by the consumers, and that ownership is represented
by stocks issued by the producers. Clearly some consumers may never own stocks
and hence receive no income from the profits that producers make. Let $y_j \in R^n$
be the production point (input-output combination) chosen by the jth producer
when price vector p prevails. The negative elements of $y_j$ denote inputs and the
positive elements of $y_j$ denote outputs. His profit is represented by $p \cdot y_j$.⁶ Let
$\theta_{ji}$ be the fraction of the stock of the jth producer that the ith consumer owns.
Thus
$$\sum_{i=1}^{m} \theta_{ji} = 1 \text{ for all } j, \quad \text{and} \quad \theta_{ji} \geq 0 \text{ for all } j \text{ and } i$$

Then Mr. i gets the dividend from the jth producer in the amount of $\theta_{ji}(p \cdot y_j)$.
Letting $\bar{x}_i$ be the total amount of resources initially made available to Mr. i,
his total wealth (or income), prior to any consumption, can now be represented by

$$p \cdot \bar{x}_i + \sum_{j=1}^{k} \theta_{ji}(p \cdot y_j)$$

(he owns the stock issued by producer j = 1, ..., k).⁷ Note that this formulation
does not preclude the possibility that the same individual is a consumer and a
producer at the same time. In this case he, as a consumer, owns 100 percent of the
stock of himself as a producer.
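The wealth formula can be illustrated with a small computation. This is a sketch under stated assumptions: the two-good, two-producer, two-consumer economy below is entirely hypothetical, and the helper names `dot` and `wealth` are introduced only for this example.

```python
def dot(p, v):
    # Inner product p . v of a price vector and a commodity vector.
    return sum(pi * vi for pi, vi in zip(p, v))

def wealth(p, x_bar_i, theta_i, ys):
    # p . x_bar_i + sum over j of theta_ji * (p . y_j) for one consumer i.
    return dot(p, x_bar_i) + sum(t * dot(p, y) for t, y in zip(theta_i, ys))

# Hypothetical data (not from the text).
p = [1.0, 2.0]
ys = [[-2.0, 3.0], [-1.0, 1.0]]     # production points: inputs negative, outputs positive
x_bars = [[4.0, 0.0], [2.0, 1.0]]   # initial resources of consumers 1 and 2
theta = [[0.5, 1.0], [0.5, 0.0]]    # theta[i][j]: share of producer j owned by consumer i

# Shares of each producer add up to one across consumers.
share_sums = [sum(theta[i][j] for i in range(2)) for j in range(2)]

wealths = [wealth(p, x_bars[i], theta[i], ys) for i in range(2)]
total_profit = sum(dot(p, y) for y in ys)
total_resources_value = sum(dot(p, xb) for xb in x_bars)
print(wealths, total_profit, total_resources_value)
```

Because the shares of each producer sum to one, total wealth equals the value of all initial resources plus total profit, so nothing is lost or created by the ownership convention.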
Needless to say, the set of all possible input-output combinations for the
jth producer is the production set of j (which we denote by $Y_j$). Clearly $y_j \in Y_j$.
In this chapter we assume that $Y_j$ is a subset of $R^n$. If there are no external
economies and diseconomies, the aggregate production set of the economy,
denoted by Y, can be defined by
$$Y \equiv \sum_{j=1}^{k} Y_j$$
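The sum of production sets is a set sum: Y collects every vector obtained by adding one production point from each producer. A minimal sketch for finite sets follows; the two production sets are hypothetical, and real production sets are typically infinite, so this only illustrates the definition.

```python
from itertools import product

def aggregate(*production_sets):
    # Set sum: every way of picking one point per producer, added coordinate-wise.
    return {tuple(map(sum, zip(*combo))) for combo in product(*production_sets)}

# Hypothetical finite production sets in R^2 (inputs negative, outputs positive).
Y1 = {(0, 0), (-1, 2)}
Y2 = {(0, 0), (-2, 1)}

Y = aggregate(Y1, Y2)
print(sorted(Y))
```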

Finally, we may remark that the definition of a competitive equilibrium can,
in fact, be independent of the question of who owns the resources, because one
can describe a consumer's behavior not by specifying how much his total income
is, but by specifying the point chosen by him. We may simply define each consumer's
behavior as being described by the choice of the consumption bundle which
is preferred or equivalent to all other alternatives in his consumption set that
are of equal or less value. Note that in this definition of consumer's behavior,
nothing is said about how he obtains his income.
With these remarks, we now enter the theory of competitive markets. This
is probably the most rigorously and elegantly developed field in economics. It
certainly deals with a very simple type of economy, a "competitive economy."
Because it is simple, we can more fully appreciate many of the difficult problems
which arise at this stage. The study of this theory is very important in order to
better understand a more complicated or more "realistic" model (which may not
yet be well covered in the literature).⁸ We may also remind the reader that there
was at least one period of time in history (if not now) in which the model of
competitive equilibrium is thought to have approximated the real world fairly
well. After all, Walras did not draw his theory from a hat. Great economists such
as Walras and Marshall, who were concerned with the competitive economy,
were all interested in the real world. The theory of competitive markets has
probably the longest history of any subject in economics, and the literature is
voluminous. But we should also note that most of the modern development that
took place in the 1950s is due to economists such as Arrow, Debreu, Hurwicz,
McKenzie, Gale, Nikaido, and Uzawa after the classical contributions of Walras,
Hicks, Samuelson, and so on.

FOOTNOTES

1. It is possible to suppose that the total demand for commodities does not exceed the
total supply of commodities. Note that this convention, allowing excess supply of
commodities, presupposes the free disposability of commodities.
2. Since this assumption does not involve any ethical connotation, the word "individual-
istic" seems to be better than "selfish."
3. The theory of the core will be explained later (in the appendix to Section C of this
chapter). For the theory of externalities, see also T. Negishi, General Equilibrium
Theory and International Trade, Amsterdam, North-Holland, 1972, esp. chapter 4.
4. Note that if we date commodities, then prices are also "dated" in the sense that
interest rates between various dates are incorporated into the model.
5. However, even in a stationary state, it is difficult for the consumer to know the
time of his death with perfect certainty.
6. The definition of "profits" depends upon what is included in the list of commodities.
For example, if "entrepreneurial skills" are not included in the list, the returns to
them constitute a part of the profits. On the other hand, if such items are included in
the list of commodities, the production set may become a convex cone, so that the
maximum profit becomes zero.
7. Such a convention is seen, for example, in Debreu [1]. One possible difficulty
here is that the explanation of the distribution of the $\theta_{ji}$'s is not clear. This is
especially true if entrepreneurial skills are not included in the list of commodities.
8. The basic methods involved in the studies of competitive equilibrium (existence,
welfare, stability, and comparative statics) are also useful and important when
we study even much simpler models in other branches of economics. Still another
important reason for the study of competitive markets is its welfare significance. See
our discussions in Section C and its appendix in this chapter and Section G of
Chapter 3.

REFERENCES

1. Debreu, G., Theory of Value, New York, Wiley, 1959.

2. Hurwicz, L., "Optimality and Informational Efficiency in Resource Allocation
Processes," in Mathematical Methods in Social Sciences, 1959, ed. by K. J. Arrow,
S. Karlin, and P. Suppes, Stanford, Calif., Stanford University Press, 1960.
3. Koopmans, T. C., Three Essays on the State of Economic Science, New York, McGraw-
Hill, 1957.
4. Walras, L., Elements of Pure Economics, tr. by W. Jaffe, Homewood, Ill., Richard
D. Irwin, 1954.

Section B
CONSUMPTION SET
AND PREFERENCE ORDERING

a. CONSUMPTION SET
The basic concept in the theory of consumer's choice is the "consumption
set," which is the set of all possible consumption bundles for a particular con-
sumer. This concept is clearly analogous to the concept of the production set,
which was discussed in Chapter 0, Section C. The consumption set, which we
will denote by X, is traditionally taken to be the entire nonnegative orthant of $R^n$.
We should realize that this convention implicitly or explicitly contains the follow-
ing assumptions:
(A-1) The set X is a convex set.
(A-2) An individual can consume any amount of goods (however large it may be).
(A-3) An individual can survive as long as he has a positive quantity of some com-
modity. Thus, for example, the origin is a minimum subsistence consumption.
(A-4) The set X is a subset of a finite dimensional vector space.
The third assumption implies that every individual has the same starvation
point regardless of his physiological capability if everybody in the market has the
nonnegative orthant as his consumption set.¹ The second assumption may not be
considered a strong one, for one may get satisfaction just from owning commodities.
However, we may then argue that we should distinguish the consumption
activity of actually consuming a commodity from that of simply owning it.
After all, eating cakes may provide a different kind of satisfaction to an individual
from owning cakes. The first assumption is very convenient but quite a strong one,
for it implies, among other things, the perfect divisibility of every commodity, in-
cluding commodities such as automobiles. We may note, however, that every com-
modity can be made perfectly divisible if we consider consumption per unit of time
of the commodity, since time is a continuum. For example, the consumption of an
electric bulb can be measured by the amount of time we use the bulb. If the bulb
lasts 1000 hours and if one consumed $10\pi$ hours of lighting by the bulb, we may say
that he consumed $10\pi/1000$ of the bulb (which is an irrational number). In this
context, we may even question the need to assume that X is a subset of a linear
space. For a linear space, by definition, must allow multiplication by any scalar.
It may be worthwhile to investigate the extent of the theory in which the con-
sumption set is not embedded in either the linear space structure or the topological
structure. By assuming that X is in $R^n$, these structures may unnecessarily creep
into the theory. Note also that even if we assume that every commodity is perfectly
divisible, X may cease to be a subset of a finite dimensional vector space. This is
true, for example, if we date each commodity by continuum time. In this case, a
consumption vector is a function of time, x(t), so that $x(t) \in X$ (which typically
presupposes X to be a subset of an infinite dimensional linear space).² In Figure 2.1
we illustrate the consumption set in $R^2$ where one of the two commodities is
indivisible. The consumption set is the set of points on the horizontal lines.

[Figure 2.1. An Illustration of Consumption Set.]

b. QUASI-ORDERING AND PREFERENCE ORDERING

Let X be the consumption set for a certain individual, say, Mr. A. Given two
elements (consumption bundles) x, y in X, we suppose that Mr. A can say (from the
point of view of the satisfaction he obtains from consuming these bundles of
goods) such things as: (1) "I prefer x to y"; (2) "x is no worse than y"; (3) "x and y
are equivalent to me." This means that he is defining a certain "relation" (called a
"preference relation") on two elements of his consumption set X. If we wish, we
may consider such a relation to be a collection of ordered pairs (x, y) where
$x \in X$ and $y \in X$. For example, we may interpret this ordered pair as "x is preferred
to y." If we wish, we may use symbols such as $x \succ y$ or x P y. If we want to
stress that this is Mr. A's ordering, we may write it as $x \succ_A y$ or $x P_A y$.
In general, a (binary) relation is a mathematical concept which is a set of
ordered pairs (x, y). If R is a relation, we can write (x, y) E R. This is also written as
"x R y," and we say that x is R-related to y. More intuitively, "relation" is defined
on a certain set, say X, such that the statement x R y, where R is a verbal phrase
such as "is not worse than," is meaningful in the sense that it can be classified
definitely as "true" or "false." For example, if X is the set of all positive integers,
we can define a relation by interpreting R as "is less than"; that is, x R y means x < y
where x and y are positive integers.

Definition: A relation R on X is called a (partial) quasi-ordering, or (partial) preordering,
if it satisfies

(i) x R x for every $x \in X$ (reflexivity).

(ii) x R y and y R z imply x R z where $x, y, z \in X$ (transitivity).

If, in addition, x R y and y R x imply "x = y," then the relation is called a partial
ordering or simply an ordering. Furthermore, if in a quasi-ordering R on X, we
necessarily have either x R y or y R x for arbitrary elements x, y ($x \neq y$) of X, we call
R a total quasi-ordering or a complete quasi-ordering. Similarly, we can define total
ordering.

Definition: A relation R on X is called an equivalence relation if it satisfies (i) and
(ii) above (that is, reflexivity and transitivity) and

(iii) x R y implies y R x (symmetry).

In other words, an equivalence relation is a quasi-ordering which is symmetric.
An equivalence relation is denoted by $x \sim y$.
EXAMPLE: The relation "is not less than" (that is, $\geq$) defined on the set of
positive integers is a total ordering. The relation "is equal to" is an equivalence
relation. If X is the set of all fractions a/b where a and b are integers
with $b \neq 0$, then we can define the equivalence relation by saying that $a/b \sim c/d$
($d \neq 0$) whenever ad = bc holds, where ad and bc are integers in the usual
sense (for instance, $1/2 \sim 2/4$). The preference relation "is not worse than"
(denoted by $\succsim$) can be a quasi-ordering which is not necessarily an ordering.
The relation $\succsim$ is often assumed to be total, but this is not necessarily the
case. In fact, $\succsim$ can even fail to be a (partial) quasi-ordering by missing the
transitivity axiom.³ The preference relation "is indifferent to" can be an
example of an equivalence relation.⁴ If $X = R^n$ and if we define $x \geq y$ for x,
$y \in R^n$ by $x_i \geq y_i$, i = 1, 2, ..., n, this relation $\geq$ is a partial ordering which
is not total (when $n \geq 2$).
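On a finite set the axioms above can be checked by brute force. The sketch below does this for three relations: "is not less than" on a few positive integers, the coordinate-wise relation on a small grid in the plane, and a threshold relation of the kind often used to illustrate intransitivity (x is related to y when x is at least y minus one). The sample sets and function names are assumptions made for the illustration.

```python
from itertools import product

def is_reflexive(X, R):
    return all(R(x, x) for x in X)

def is_transitive(X, R):
    return all(R(x, z) for x, y, z in product(X, repeat=3) if R(x, y) and R(y, z))

def is_antisymmetric(X, R):
    # x R y and y R x imply x = y (what turns a quasi-ordering into an ordering).
    return all(x == y for x, y in product(X, repeat=2) if R(x, y) and R(y, x))

def is_total(X, R):
    return all(R(x, y) or R(y, x) for x, y in product(X, repeat=2))

def geq(x, y):
    return x >= y

def vec_geq(x, y):
    # Coordinate-wise "not less than" on vectors.
    return all(a >= b for a, b in zip(x, y))

def threshold(x, y):
    # Hypothetical relation: x is "no worse than" y when x >= y - 1.
    return x >= y - 1

integers = list(range(1, 8))
grid = list(product(range(3), repeat=2))
levels = [0, 1, 2, 3]

for name, X, R in [("geq", integers, geq), ("vec_geq", grid, vec_geq),
                   ("threshold", levels, threshold)]:
    print(name, is_reflexive(X, R), is_transitive(X, R),
          is_antisymmetric(X, R), is_total(X, R))
```

A finite check of this kind can refute a property but cannot prove it on an infinite set; it is meant only to make the definitions concrete.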
NOTATION: We use the following notations for the "preference ordering":
(a) x is preferred to y: $x \succ y$ or $y \prec x$.
(b) x is not preferred to y: $x \precsim y$ or $y \succsim x$.
(c) x is indifferent to y: $x \sim y$ or $y \sim x$.
If we wish to indicate the individual (say, i) for whom the preference ordering
holds, we write
$x \succ_i y$, $x \succsim_i y$, $x \sim_i y$
REMARK: In the "usual" theory of consumer's choice, the preference or-
dering of a consumer is assumed to be defined on a consumption set whose
elements consist of his own consumption bundles alone. That is, his pre-
ference ordering is independent of the consumption bundles of other people
and of the pattern of production, and so on. This is clearly a strong restriction
imposed on the preference ordering, although it is known to be powerful
enough to produce important results. The preference ordering with this
restriction is called selfish or individualistic, as mentioned in Section A.
REMARK: Figure 2.2 may be useful for the understanding of the above
concepts.

Definition: Let X be the consumption set of a given individual and let $\hat{x} \in X$. Then

(i) The set $\{x: x \in X, x \succsim \hat{x}\}$ is called the no-worse-than-$\hat{x}$ set or the upper contour
set of $\hat{x}$.
(ii) The set $\{x: x \in X, x \precsim \hat{x}\}$ is called the not-better-than-$\hat{x}$ set or the lower
contour set of $\hat{x}$.
(iii) The set $\{x: x \in X, x \succ \hat{x}\}$ is called the preferred-to-$\hat{x}$ set, or the better-than-$\hat{x}$
set.
(iv) The set $\{x: x \in X, x \prec \hat{x}\}$ is called the worse-than-$\hat{x}$ set.
(v) The set $\{x: x \in X, x \sim \hat{x}\}$ is called the indifferent-to-$\hat{x}$ set or the indifference set
of $\hat{x}$.

[Figure 2.2. Relations among Various Orderings. A (partial) quasi-ordering (reflexive and transitive) becomes an equivalence relation if it is symmetric, a total quasi-ordering if either x R y or y R x, and a (partial) ordering if x R y and y R x imply x = y; a total ordering is both a total quasi-ordering and a (partial) ordering.]
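The contour sets just defined can be computed directly on a finite consumption set. The sketch below assumes a hypothetical preference (a bundle is no worse than another when the product of its two coordinates is at least as large) on a small integer grid; neither the grid nor the preference comes from the text.

```python
from itertools import product

X = list(product(range(4), repeat=2))   # hypothetical finite consumption set

def no_worse(x, y):
    # Hypothetical preference: x is no worse than y when x1*x2 >= y1*y2.
    return x[0] * x[1] >= y[0] * y[1]

def upper_contour(x_hat):
    return {x for x in X if no_worse(x, x_hat)}

def lower_contour(x_hat):
    return {x for x in X if no_worse(x_hat, x)}

def better_than(x_hat):
    # Strict preference: no worse than x_hat, and x_hat is not no worse than x.
    return {x for x in X if no_worse(x, x_hat) and not no_worse(x_hat, x)}

def indifferent_to(x_hat):
    return upper_contour(x_hat) & lower_contour(x_hat)

x_hat = (1, 2)
print(sorted(indifferent_to(x_hat)))
print(sorted(better_than(x_hat)))
```

Since this preference is total, the upper and lower contour sets together cover X, and their intersection is the indifference set.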

c. UTILITY FUNCTION
Let X be the consumption set of a particular individual (say, Mr. A), and
let us suppose that Mr. A's satisfaction from consumption can be expressed by an
index which is a real number. This is called his utility index. The utility index is a
function from X into R where R is the set of real numbers. This function, denoted
by u(x), is called the utility function (of Mr. A). Since the set of real numbers has
the natural ordering $\geq$ or $>$, this amounts to assuming that Mr. A's preference is
representable by the natural order of real numbers.⁶ In other words,

(i) $x \succsim y$ if and only if $u(x) \geq u(y)$,
(ii) $x \succ y$ if and only if $u(x) > u(y)$, and
(iii) $x \sim y$ if and only if $u(x) = u(y)$.

The fundamental characteristic of such a utility function is that it can be replaced
by any monotone increasing function of itself without altering anything substantial
(that is, the ordering). In other words, if $\phi$ is a real-valued function such that
$$\phi(u^1) \gtreqless \phi(u^2) \text{ according to whether } u^1 \gtreqless u^2$$
then
(i) $x \succsim y$ if and only if $\phi[u(x)] \geq \phi[u(y)]$.
(ii) $x \succ y$ if and only if $\phi[u(x)] > \phi[u(y)]$.
(iii) $x \sim y$ if and only if $\phi[u(x)] = \phi[u(y)]$.
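That a monotone increasing transformation leaves the ordering unchanged can be seen on a small example. In this sketch the utility function u(x) = x1·x2 and the transformation φ(v) = log(1 + v) are hypothetical choices made for illustration only.

```python
import math
from itertools import product

def u(x):
    return x[0] * x[1]             # hypothetical utility function

def phi(v):
    return math.log(1.0 + v)       # strictly increasing for v >= 0

def v(x):
    return phi(u(x))               # transformed utility function

def sign(a):
    return (a > 0) - (a < 0)

X = list(product(range(5), repeat=2))

# The two functions rank every pair of bundles the same way.
same_ordering = all(sign(u(x) - u(y)) == sign(v(x) - v(y)) for x in X for y in X)

# Utility differences, by contrast, are not preserved.
gap_u = u((2, 2)) - u((1, 1))
gap_v = v((2, 2)) - v((1, 1))
print(same_ordering, gap_u, gap_v)
```

The ranking is identical while the size of utility differences changes, which is why only the ordering carries meaning here.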
Classical consumer's theory assumes that the consumption set is the nonnegative
orthant of $R^n$ and a utility (index) function u(x) is defined on this. In the
diagrammatical analysis, which is so common in the traditional analysis, the concept
of an "indifference curve" is used. The indifference curve is the locus of
$x = (x_1, x_2)$ in $R^2$ such that u(x) = constant. Traditionally, indifference curves are
drawn convex to the origin, indicating a diminishing marginal rate of substitution
between the two commodities.⁸ It is generally supposed that u(x) increases as each
element of x increases. However, this does not have to be true. It may so happen
that a consumer is satiated with some commodity, say $i_0$, at an amount, say
$x^*_{i_0}$, so that $u(x) \leq u(x^*)$ where $x_i = x^*_i$ for all $i \neq i_0$ and for all $x_{i_0}$ with $x_{i_0} > x^*_{i_0}$.

Note that it is possible to have u(x) < u(x*), that is, the utility may actually de-
crease beyond a certain level of consumption (for example, the amount of light
in a room). Moreover, we may also question whether u (x, y) = constant can define
a unique curve, say, x = v(y). It is possible that we can have "thick" indifference
loci. In Figure 2.3 we illustrate such indifference "curves." On the left, the indifference
curves take the customary shape except that some of them contain
"thick" portions. On the right, we illustrate indifference curves for which u(x)
is not monotone increasing with respect to a coordinate-wise increase in x.

[Figure 2.3. Illustration of Indifference "Curves." Axes: $x_1$ (horizontal), $x_2$ (vertical). The right panel marks B, the "bliss" point.]
The question will arise as to whether we can represent the preference ordering
$\succsim$ (say, for Mr. A) by a utility function (that is, by real numbers).
This question was first solved by Debreu [3], [4] and later generalized by
Rader [9]. To state Debreu's theorem, we need the following two concepts.

Definition: Let X be a topological space. Then X is called a connected set if it

cannot be represented as the union of two disjoint, nonempty, open sets.
REMARK: From the definition it follows that X is connected if and only if
the only subsets of X which are both open and closed are X itself and the
empty set $\emptyset$. A subset S of X becomes a topological space with the relative
topology of S. The set S is called connected (subset of X) if it is connected as a
topological space with the relative topology. Hence it follows that S is connected
if and only if it cannot be partitioned into two disjoint nonempty
subsets of X which are open in S with respect to the relative topology. Intuitively
speaking, a set is connected if it is of "one piece" (but possibly with
holes).

Definition: A preference ordering $\succsim$ on X is called continuous or closed if for
every $\hat{x} \in X$, the two sets $\{x: x \in X, x \succsim \hat{x}\}$ and $\{x: x \in X, x \precsim \hat{x}\}$ are both closed
sets. In other words, if $\{x^q\}$ is a sequence in X with $x^q \succsim \hat{x}$ for all q (resp. $x^q \precsim \hat{x}$
for all q), then $x^q \to x^0$ implies $x^0 \succsim \hat{x}$ (resp. $x^0 \precsim \hat{x}$).⁹
We are now ready to state Debreu's theorem.

Theorem 2.B.1 (Debreu): Let X be a connected subset of $R^n$ and let $\succsim$ be a continuous
total quasi-ordering defined on X. Then there exists a continuous utility
function on X for $\succsim$.¹⁰
REMARK: A well-known example of a preference ordering which is not
representable by a real-valued function is the lexicographic ordering.¹¹ The
lexicographic ordering can be illustrated in $R^2$ by
$x \succ y$ if $x_1 > y_1$, or if $x_2 > y_2$ with $x_1 = y_1$,
$x \sim y$ if and only if $x = y$, where $x = (x_1, x_2)$ and $y = (y_1, y_2)$.
In other words, in the lexicographic ordering, preference ordering is arranged
according to the dictionary rule. The British Treasury may have a
lexicographic ordering for its employees in the sense that it always prefers the
Oxbridge graduates to others and the American Defense Department may
have a lexicographic ordering for its employees in the sense that it always
prefers noncommunists to communists. The upper contour set (of $\hat{x}$) with
the above lexicographic ordering is illustrated by the shaded area in Figure
2.4. Clearly it is not a closed set, for it does not contain the dotted line.
Obviously we can consider a more general lexicographic ordering of which
the upper contour set (of $\hat{x}$) can be illustrated by the shaded area in Figure
2.5. For discussions on lexicographic orderings, see Georgescu-Roegen
[5], and Chipman [2].
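The failure of closedness for the lexicographic ordering can be seen with a concrete sequence. In the sketch below, which uses a hypothetical reference bundle (1, 1), every term of the sequence lies in the upper contour set, yet the limit does not. Python happens to compare tuples by the dictionary rule, so the built-in comparison is used as a stand-in for the ordering.

```python
def lex_no_worse(x, y):
    # Tuples are compared coordinate by coordinate, first coordinate first,
    # which is the dictionary rule of the lexicographic ordering in R^2.
    return tuple(x) >= tuple(y)

x_hat = (1.0, 1.0)                                           # hypothetical reference bundle
sequence = [(1.0 + 1.0 / q, 0.0) for q in range(1, 1001)]    # approaches (1, 0) as q grows
limit = (1.0, 0.0)

all_in_upper = all(lex_no_worse(x, x_hat) for x in sequence)
limit_in_upper = lex_no_worse(limit, x_hat)
print(all_in_upper, limit_in_upper)
```

Each term has a first coordinate above 1 and so is preferred to (1, 1), but the limit (1, 0) ties on the first coordinate and loses on the second. This is the kind of discontinuity that Debreu's theorem rules out.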

[Figure 2.4. A Lexicographic Ordering.]

[Figure 2.5. A More General Lexicographic Ordering.]

d. THE CONVEXITY OF PREFERENCE ORDERING

Given a preference ordering $\succsim$ (hence $\succ$ and $\sim$ are also given) on X, we
may define the following relations on the assumption that X is a convex set.¹² For
any two points x and y in X we may have

(i) $x \succsim y$ implies $tx + (1 - t)y \succsim y$, 0 < t < 1, where $x \neq y$.

(ii) $x \succ y$ implies $tx + (1 - t)y \succ y$, 0 < t < 1, where $x \neq y$.

(iii) $x \sim y$ implies $tx + (1 - t)y \succ y$, 0 < t < 1, where $x \neq y$.

(iv) $x \sim y$ implies $tx + (1 - t)y \succsim y$, 0 < t < 1.

A preference ordering is called weakly convex if (i) holds; it is called convex if
(ii) and (iv) hold; and it is called strictly convex if (ii) and (iii) hold.¹³ These preference
orderings are illustrated by the indifference curves in Figure 2.6. Note that
a weakly convex preference ordering allows a "thick" indifference curve (band)
and a convex preference ordering allows a "flat" indifference curve. For a discus-
sion of the relationship among these convexities, see Debreu [4] section 4.7.
We may note that if the preference ordering is representable by a real-valued func-
tion (utility function), then the strict quasi-concavity of the utility function cor-
responds to the strict convexity of the preference ordering and the quasi-concavity
of the utility function corresponds to the weak convexity of the preference ordering.
The explicit quasi-concavity of the utility function corresponds to the convexity
of the preference ordering.
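The difference between a "flat" and a strictly curved indifference curve shows up in conditions (iii) and (iv). The sketch below tests them at a few mixing weights for two hypothetical utility functions, a linear one and a product one, on pairs of bundles chosen to be indifferent. It checks particular points only and is not a proof of convexity.

```python
def mix(x, y, t):
    # Convex combination t*x + (1 - t)*y of two bundles.
    return tuple(t * a + (1 - t) * b for a, b in zip(x, y))

TS = (0.25, 0.5, 0.75)   # sample weights strictly between 0 and 1

def check_iii(u, x, y):
    # (iii): for indifferent x != y, every mixture is strictly preferred to y.
    return all(u(mix(x, y, t)) > u(y) for t in TS)

def check_iv(u, x, y):
    # (iv): for indifferent x and y, every mixture is no worse than y.
    return all(u(mix(x, y, t)) >= u(y) for t in TS)

def linear(x):
    return x[0] + x[1]            # hypothetical utility with flat indifference curves

def product_u(x):
    return x[0] * x[1]            # hypothetical utility with curved indifference curves

flat_pair = ((0.0, 2.0), (2.0, 0.0))      # linear(x) = linear(y) = 2
curved_pair = ((1.0, 4.0), (4.0, 1.0))    # product_u(x) = product_u(y) = 4

linear_iii = check_iii(linear, *flat_pair)
linear_iv = check_iv(linear, *flat_pair)
product_iii = check_iii(product_u, *curved_pair)
print(linear_iii, linear_iv, product_iii)
```

The linear utility satisfies (iv) but fails (iii) on this pair, as a flat indifference curve should, while the product utility satisfies (iii) on its pair.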

REMARK:¹⁴
1. Condition (i) implies that all the upper contour sets of X must be convex.
2. If the preference ordering is continuous, then (ii) implies (i).
3. If the preference ordering is continuous, then (iii) implies (ii).

REMARK: In showing that the behavior of consumers without transitive
preferences is compatible with most results in the theory of competitive
equilibria, Sonnenschein [12] had to rely on a rather strong assumption
that all the upper contour sets of X are convex. However, it is also true that
in proving major theorems of the competitive equilibrium theory, such an
assumption is required. The reason why only the convexity is required in
place of the transitivity can be seen (intuitively) from the following example
in [12] (p. 216). Suppose that X contains three points x, y, and z with
$x \succ y \succ z \succ x$. Then any budget set which contains these three points may
not contain any optimal consumption plan. However, the convexity of
preferences ensures the existence of points in the budget set which are
preferred to all three.

[Figure 2.6. Illustrations of the Convexity of Preference Ordering. Panels: weakly convex, convex, strictly convex.]
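The first half of this example, that a cyclic preference may leave a budget set without an optimal plan, can be made concrete for a budget set consisting of the three points alone. The sketch below encodes the strict preference as a set of ordered pairs; the labels and the helper name `optimal_plans` are assumptions for the illustration, and the role of convexity is not modeled.

```python
def optimal_plans(points, prefers):
    # A plan is optimal if no other plan in the set is strictly preferred to it.
    # `prefers` holds pairs (a, b) meaning "a is strictly preferred to b".
    return [a for a in points if not any((b, a) in prefers for b in points)]

budget_set = ["x", "y", "z"]

cyclic_preference = {("x", "y"), ("y", "z"), ("z", "x")}        # x > y > z > x
transitive_preference = {("x", "y"), ("y", "z"), ("x", "z")}    # x > y > z

cyclic_optimum = optimal_plans(budget_set, cyclic_preference)
transitive_optimum = optimal_plans(budget_set, transitive_preference)
print(cyclic_optimum, transitive_optimum)
```

With the cycle, every point is beaten by some other point, so the list of optimal plans is empty; with the transitive preference the top-ranked point survives.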

FOOTNOTES

1. As long as we consider a single consumer, the assumption that the origin denotes
a minimum (subsistence) level of consumption is not as strong as it appears. If
$\{x \in R^n: x \geq \bar{x} \geq 0\}$ is his consumption set, then by moving the origin properly,
we can obtain the origin as a minimum level of subsistence. His consumption set
may be denoted by $\{y \in R^n: y \geq 0\}$ where $y = x - \bar{x}$. However, if we consider more
than one consumer, then it involves the assumption that the minimum subsistence
levels of consumption are identical for all consumers.
2. In the analysis of consumer's behavior, it is well known that the different charac-
teristics of the commodities play an important role. For example, a consumer may
consider a blue Valiant to be a different commodity from a red Valiant even if all other
specifications are the same. Kelvin Lancaster, therefore, has recently proposed a
"new approach to consumer theory" by emphasizing the "different characteristics"
aspect of the commodities. See his "New Approach to Consumer Theory," Journal
of Political Economy, LXXIV, April 1966. However, he apparently assumes that these
characteristics are measurable quantities; see, for example, his phrase "the amount
of the ith characteristic" (p. 135). When the consumption set is considered to be a
subset of $R^n$, we have to be careful that anything measured on each coordinate is
representable by real numbers. The characteristics cannot be represented by real
numbers regardless of whether such a representation is ordinal or cardinal. A
statement such as "the quantities of the characteristics are directly proportional to
the quantities of the goods" is thus meaningless.
3. That $\succsim$ is total means that the consumer can give his preference ordering $\succsim$ for
any two elements of his consumption set. Its plausibility is sometimes questioned,
for some of the decisions in the consumption set might involve highly hypothetical
situations which our consumer never faces in real life. In such cases, he cannot
make any decisions. It is known that many theorems can be proved without this
axiom. A still more questionable axiom involved in regarding the preference ordering
as a total quasi-ordering may be the transitivity axiom. For example, we can
show that the relation $\succsim$ on $X = \{x \in R: x \geq 0\}$ defined by $x' \succsim x$ if and only if
$x' \geq x - 1$ is intransitive. The existence of thick regions of indifference would often
cause intransitivity of $\succsim$. The intransitivity can be quite normal. See K. O. May,
"Transitivity, Utility, and Aggregation in Preference Patterns," Econometrica,
22, 1954, for example. In a remarkable paper [12], Sonnenschein showed that the
transitivity axiom can be dispensed with in proving many important results of the
theory of competitive equilibrium.
4. However, the relation "is indifferent to" can be intransitive. In that case, it cannot
be an equivalence relation.
5. Given the consumption set X of a particular individual, we do not have to define $\succsim$,
$\succ$, $\sim$, $\precsim$, $\prec$ all independently. We may first define $\succsim$ as a total quasi-ordering
on X; then define $x \sim y$ by $x \succsim y$ and $y \succsim x$; $x \succ y$ is defined by $x \succsim y$ but not
$y \succsim x$; $x \precsim y$ is defined by $y \succsim x$; and $x \prec y$ is defined by $y \succ x$.
6. The preference ordering represented by real numbers is obviously transitive, for
the natural order of the real numbers is transitive.
7. Note that this assumes that the preference ordering of the consumer is individualistic.
In general, we should write individual i's utility function, $u_i$, as $u_i(x^1, x^2, \ldots, x^m)$,
$x^i \in X_i$, i = 1, 2, ..., m, where $X_i$ is individual i's consumption set.
8. Note that if the utility function is replaceable by its monotone increasing transformation,
then the classical concept of "marginal utility" becomes rather meaningless,
although the concept of marginal rate of substitution can still be meaningful.
Consider a differentiable utility function u(x) and a differentiable monotone transformation
$\phi$ (where $\phi' > 0$). Then clearly we have
$(\partial u/\partial x_i)/(\partial u/\partial x_j) = (\partial \phi[u]/\partial x_i)/(\partial \phi[u]/\partial x_j)$.
9. It can be shown that we can restate this (in an equivalent form) as follows: Let
$\{x^q\}$ and $\{\hat{x}^q\}$ be two sequences in X such that $x^q \to x$ and $\hat{x}^q \to \hat{x}$. Suppose
$x^q \succsim \hat{x}^q$ for all q; then $x \succsim \hat{x}$.
10. Rader [9] relaxed the transitivity assumption involved in Debreu's theorem.
11. For the proof, see Debreu [4] , pp. 72-73.
12. As remarked before, this requires, among other things, that all commodities are
perfectly divisible.
13. The following quotation from J. S. Chipman might be of some use in understanding
the significance of the convexity of preference.
Two pillars form the foundations of economic activity. One is the law of con-
vexity of preferences, which states that people desire to consume a variety-
or average-of products rather than limit their consumption to any one
commodity alone....
See his "The Nature and Meaning of Equilibrium in Economic Theory," in Func-
tionalism in the Social Sciences, Philadelphia, Pa., American Academy of Political
and Social Science, February 1965, p. 35.
14. For the proofs, see Debreu [4], pp. 60-61.

REFERENCES
1. Birkhoff, G., Lattice Theory, rev. ed., Providence, R.I., American Mathematical
Society, 1961.
2. Chipman, J. S., "Foundations of Utility," Econometrica, 28, April 1960.
3. Debreu, G., "Representation of Preference Ordering by a Numerical Function," in
Decision Processes, ed. by Thrall, Coombs, and Davis, New York, Wiley, 1954, pp.
159-165.
4. Debreu, G., Theory of Value, New York, Wiley, 1959.
5. Georgescu-Roegen, N., "Choice, Expectations, and Measurability," Quarterly Journal
of Economics, 68, November 1954.
6. Hicks, J. R., Value and Capital, 2nd ed., Oxford, Clarendon Press, 1946.
7. Koopmans, T. C., Three Essays on the State of Economic Science, New York, McGraw-
Hill, 1957.
8. Kuratowski, K., Introduction to Set Theory and Topology, Oxford, Pergamon Press,
1961 (tr. from Polish original).
9. Rader, T., "Existence of a Utility Function to Represent Preferences," Review of
Economic Studies, 30, October 1963.
10. Richter, M. K., "Revealed Preference Theory," Econometrica, 34, July 1966.
11. Sonnenschein, H. F., "The Relationship between Transitive Preference and the
Structure of the Choice Space," Econometrica, 33, July 1965.
12. Sonnenschein, H. F., "Demand Theory without Transitive Preferences, with Applications to the
Theory of Competitive Equilibrium," in Preferences, Utility, and Demand: A Minnesota
Symposium, ed. by J. S. Chipman, L. Hurwicz, M. K. Richter, and H. F.
Sonnenschein, New York, Harcourt Brace Jovanovich, 1971.

Section C
THE TWO CLASSICAL
PROPOSITIONS OF
WELFARE ECONOMICS

When we discussed activity analysis in Chapter 0, we proved that a point in
the production set which maximizes profit under a certain "price" vector is an
efficient point and that we can associate a price vector with every efficient point
such that it becomes a profit maximization point. (For the latter, we needed
the convexity of the production set.) In a competitive market, every producer is a
price taker-that is, he cannot affect the prices which prevail in the market-and
he must maximize his profit under the given price vector. The above results from
activity analysis indicate a strong relationship between competitive pricing and the
efficient point. Before we explore this further, we must recall the behavior of the
other important economic unit in this economy, the consumer. In a competitive
market, every consumer is a price taker-that is, he cannot affect the prices that
prevail in the market-so he maximizes his satisfaction over the bundles of com-
modities which can be purchased under a given price vector. Assuming that an
economy consists of these two basic types of economic units, and assuming that
the total supply of commodities is equal to the total demand for commodities, a
natural question is whether such an economy achieves a certain optimum of social
welfare. The concept of "efficient production point" is concerned with such an
optimality concept in production. But this is valid from the society's point of view
only if we disregard consumers. What happens if we introduce consumers? When
we introduce consumers into our consideration, we immediately recall that an individual's
utility cannot be measured,¹ so that we cannot add individual utilities to
get a measure of social welfare. The search for a concept of social welfare, as is well
known, led to the concept of the Pareto optimum, the state in which nobody can be
better off without making others worse off, given that the total supply of commodities
is equal to total demand for commodities.² Since a consumer presumably
gets his satisfaction from his consumption activity, the phrases "better off" or
"worse off" refer to the welfare of each individual consumer with respect to his
preference ordering.
Now the natural question becomes: What is the relationship between "competitive
equilibrium" and "Pareto optimum"? In particular, we may be interested
in asking whether every competitive equilibrium realizes a Pareto optimum and
whether a Pareto optimal state can be achieved and supported by a competitive
equilibrium. These are the two main questions in classical welfare economics. If
each question can be answered in the affirmative, then we want to know the pre-
cise conditions which support each conclusion. This is the task of this section.
Before we start our analysis, we may note that the above questions are not
really new in economics. A principal theme of Adam Smith was that "free competition"
realizes a "social optimum." Obviously, Smith did not have precise
concepts of "free competition" and "social optimum."
There have been many attempts in the history of economics to formalize the
above theme. The Ricardian theory of comparative advantage is probably the first
such attempt to be successful in connection with productive efficiency. Wicksell
[23] gave a formulation of how perfect competition maximizes production, which
corresponds to the results from activity analysis mentioned above. The concept of
the Pareto optimum is due to Pareto but was apparently introduced at the insistence
of his friends, Pantaleoni and Barone. Pareto perceived the Pareto optimum significance
of a competitive equilibrium (for this point, see Samuelson [22], pp. 212-214).
Apparently it is Barone [3] who first stated exactly and proved that a competitive
equilibrium, under quite general conditions, realizes a Pareto optimum.³
A somewhat converse proposition, that is, that a Pareto optimum state is
supported by a competitive equilibrium, also came from Pareto and from Barone
[3], Lange [14], Lerner [16], and others. Combined with the previous proposition,
these two propositions constitute the so-called "fundamental theorems of welfare
economics."
The studies of these propositions in the 1930s and 1940s by Lerner [15],
[16], Lange [14], Hicks [8], Samuelson [22], and others are characterized by
their recognition of the relationship between the marginal equivalences and a
competitive equilibrium.
The first rigorous formulation and proof of these propositions using a
modern set-theoretic approach was carried out by Arrow [1] and Debreu [4] and
has been further generalized by Debreu [5], Moore [17], and others. The revolutionary
character of this development is analogous to the advance from the traditional
production function approach in production theory to the activity analysis
approach (see Chapter 0, Section C). Our discussion in this section is based on the
modern version. The author has greatly benefited from excellent expositions by
Koopmans [12] and Koopmans and Bausch [13].
Before we turn to this modern approach, we may illustrate the problem by a
simple diagram. Figure 2.7 illustrates the choice of a competitive consumer, whose
consumption set is the nonnegative orthant of $R^2$. If he is faced with a price vector
Figure 2.7. Rational Consumer and Price Line.

indicated by the line H (in which the price of each commodity is positive), and if he
has chosen the point $\hat{x}$, then, assuming that he is a "rational" consumer, we can
immediately conclude that he must be maximizing his satisfaction over the points
such as x, in the nonnegative orthant, which are on or below the line H (the shaded
region).⁴ No point in the shaded region costs more than $\hat{x}$ under this price line H.
Now consider an economy which consists of two consumers but no pro-
ducers. The consumers exchange commodities with each other. Assuming that
there are only two commodities in the economy and assuming also that each consumer's
consumption set is the nonnegative orthant of $R^2$, the well-known Edgeworth-Bowley
box diagram can be drawn. Any allocation of commodities between
the two consumers is possible as long as the point which represents such an allocation
stays inside or on the boundary of the box ("feasibility condition").⁵ In Figure
2.8, point R, for example, is not a Pareto optimal point, because it is possible to
improve one person's welfare without decreasing the other person's welfare simply
by moving within the "lens" formed by the indifference curves of the two individuals
which pass through R. However, any point on the curve PQ is a Pareto
optimal point. Clearly, point C is a competitive equilibrium if the price indicated
by H prevails, for each person maximizes his satisfaction in the sense described
above. Note that at point C the price line is tangent to each person's indifference
curve. In fact, this tangency condition is sufficient to guarantee that each person
maximizes his satisfaction in the sense described above.

Figure 2.8. An Illustration of the Two Classical Propositions.

As can easily be seen from Figure 2.8 and as is well known, any Pareto
optimal point in the ordinary box diagram is a point at which the indifference
curves of the two individuals are tangent. (The collection of such points is called
the contract curve.) As was seen above, any competitive equilibrium point must be
on a line which is tangent to the indifference curve of the individual and it must be
at the point of tangency.⁶ Hence it follows immediately that any competitive
equilibrium point realizes a Pareto optimum.
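
The tangency argument can be tried out numerically. The following is only a minimal sketch under assumptions that are not in the text: both consumers are given Cobb-Douglas utilities $u(x) = x_1^a x_2^{1-a}$, the exponents and the initial resource point are arbitrary illustrative numbers, and commodity 2 serves as numeraire. It computes the market-clearing price from the initial resource point and compares each consumer's marginal rate of substitution with the slope of the price line.

```python
# Minimal sketch: a two-consumer, two-commodity exchange economy with
# Cobb-Douglas utilities (an assumed functional form, not taken from the text).

def equilibrium_price(a_A, a_B, e_A, e_B):
    """Price of commodity 1 (commodity 2 is the numeraire) that clears both markets."""
    num = a_A * e_A[1] + a_B * e_B[1]
    den = (1 - a_A) * e_A[0] + (1 - a_B) * e_B[0]
    return num / den

def demand(a, p1, endowment):
    """Cobb-Douglas demand: the share a of wealth is spent on commodity 1."""
    wealth = p1 * endowment[0] + endowment[1]
    return (a * wealth / p1, (1 - a) * wealth)

def mrs(a, x):
    """Marginal rate of substitution of u(x) = x1**a * x2**(1 - a)."""
    return (a / (1 - a)) * x[1] / x[0]

a_A, a_B = 0.3, 0.6                  # illustrative exponents
e_A, e_B = (3.0, 1.0), (1.0, 3.0)    # illustrative initial resource point R

p1 = equilibrium_price(a_A, a_B, e_A, e_B)
x_A = demand(a_A, p1, e_A)
x_B = demand(a_B, p1, e_B)

print("slope of the price line (p1/p2):", p1)
print("A's bundle:", x_A, "MRS:", mrs(a_A, x_A))
print("B's bundle:", x_B, "MRS:", mrs(a_B, x_B))
```

In this example both marginal rates of substitution coincide with the price ratio and the two bundles add up to the total resources, which is the tangency picture at point C.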
Note also that at any Pareto optimal point (that is, any point on the contract
curve), it is possible to draw a line which is tangent to an indifference curve for
each consumer. At point C, H is such a line, and at point P, H' is such a line. Rep-
resenting the price vector by the slope of such a line, we can at once conclude that
every Pareto optimal point can be achieved and supported by competitive pricing.⁷
Note that the above statement does not say that any Pareto optimal point can
be achieved by competitive pricing after starting from any arbitrary initial point.
For example, if point R represents the initial resource point, point C can be
achieved by pure exchange, with each individual acting as a competitive consumer
(that is, a price taker) under a price line H. But point P cannot be achieved
directly from R. It requires some reshuffling of goods so that point R is translated to
a point such as R' on the H' line.
The above "proofs" of the two classical propositions of welfare economics
look very simple, but they rely on many implicit assumptions. In fact, this is a prime
example of traditional economic theory, whose reasoning is so crucially dependent
on the diagram. We may ask the following questions, for example.

(i) How crucial is the assumption that the consumption set is the entire non-
negative orthant?
(ii) Is it necessary to assume the convexity of each consumer's consumption set;
and how essential is the divisibility assumption of each commodity?
(iii) Does the consumption set of every consumer have to include the same list
of n commodities?
(iv) Does an individual's indifference curve have to be (strictly) convex to the
origin?
(v) Does it have to be smooth? [Or is it necessary that we can define a unique
tangent line (such as H) for the individual's indifference curve?]
(vi) Do we have to assume the continuity of the individual's utility function (if
its differentiability can be dispensed with)?
(vii) What is the role of consumer's satiation in the above analysis?
(viii) What happens if the price line which should support a Pareto optimum
coincides with one of the edges of the box? Can we still have a competitive
equilibrium?
(ix) What happens if we introduce production? What assumptions are necessary
when production is introduced into the model?
(x) Will the conclusions be altered if there are more than two consumers and
two commodities and if there is an arbitrary number of producers?
(xi) What is the minimum possible set of assumptions which will guarantee all
the conclusions of the classical propositions?

Although we may get considerable insight into the above questions from Figure
2.8,⁸ nothing precise and definite can be said on the basis of it alone. Here we may
quote the following remark from Koopmans ([12], p. 174):
Nothing in the process of reading a diagram forces the full statement of as-
sumptions and the stepwise advance, through successive implications to con-
clusions that are characteristic of logical reasoning. Assumptions may be
concealed in the manner in which the curves are "usually" drawn and con-
clusions may be accepted unconditionally although they actually depend on
such unstated assumptions.
We now turn to the modern formulation and the proof of the above classical pro-
positions of welfare economics. The author hopes that the reader will fully appreci-
ate the above remark by Koopmans in the process of reading the following
exposition of the modern approach to welfare economics. In the following, we use
the minimal possible assumptions which are known at present.⁹ Any relaxation of
assumptions will be interesting and important. Some important counterexamples
will be offered when some of the assumptions are violated. Diagrams will be useful
to show such counterexamples.
Let $x_i$ be an n-vector of consumption by consumer i (i = 1, 2, ..., m), and let
$y_j$ be an n-vector of production by producer j (j = 1, 2, ..., k). The negative elements
of $y_j$ denote inputs and the positive elements of $y_j$ denote outputs. Let
$x \equiv \sum_{i=1}^{m} x_i$ and $y \equiv \sum_{j=1}^{k} y_j$. Denote by $X_i$ the consumption set of i and by $Y_j$
the production set of j. We assume that both $X_i$ and $Y_j$ are subsets of $R^n$. We
denote by X the aggregate consumption set and by Y the aggregate production set.¹⁰
We assume that the preference ordering $\succsim_i$ is defined for each consumption set
$X_i$.¹¹ Given price p, the profit of producer j can be written as $p \cdot y_j$. There is an
initial bundle of commodities available in the economy. We denote it by $\bar{x}$. This
bundle of commodities can be held by consumers so that if we denote the initial
resource held by the ith consumer by $\bar{x}_i$, then $\sum_{i=1}^{m} \bar{x}_i = \bar{x}$.
We now define (in the usual manner) feasibility, Pareto optimum, and com-
petitive equilibrium.

Definition (feasibility): An array of consumption vectors $\{x_i\}$ is said to be feasible
if there exists an array of production vectors $\{y_j\}$ such that $x = y + \bar{x}$.

Definition (Pareto optimality): A feasible $\{\hat{x}_i\}$ is said to be Pareto optimal (P.O.)
if there does not exist a feasible $\{x_i\}$ such that $x_i \succsim_i \hat{x}_i$ for all i = 1, 2, ..., m
with $x_i \succ_i \hat{x}_i$ for at least one i.

Definition (competitive equilibrium): An array of vectors $[p, \{\hat{x}_i\}, \{\hat{y}_j\}]$ is called
a competitive equilibrium (C.E.), if $\hat{x}_i \in X_i$, i = 1, 2, ..., m, $\hat{y}_j \in Y_j$, j = 1, 2, ..., k,
and
(i) $\hat{x}_i \succsim_i x_i$ for all $x_i \in X_i$ such that $p \cdot x_i \le p \cdot \hat{x}_i$,
i = 1, 2, ..., m (consumer's equilibrium)
(ii) $p \cdot \hat{y}_j \ge p \cdot y_j$ for all $y_j \in Y_j$,
j = 1, 2, ..., k (profit maximization)
(iii) $\hat{x} = \hat{y} + \bar{x}$ (feasibility)
REMARK: In the definition of feasibility above, we required the equality
$x = y + \bar{x}$. In the literature, this is often replaced by $x \le y + \bar{x}$, allowing
an excess supply of commodities. This implicitly or explicitly assumes "free
disposability" of commodities.¹² If free disposability is assumed, it is necessary
that the price vector p in the definition of competitive equilibrium be
nonnegative. We do not assume free disposability; hence the "undesired
commodity" cannot be freely disposed of and its price will be negative.
Under the free disposability assumption, not only do we require $x \le y + \bar{x}$
for feasibility, but we also change condition (iii) of the definition of competitive
equilibrium as follows:
(iii') $\hat{x} \le \hat{y} + \bar{x}$ and $p \cdot \hat{x} = p \cdot (\hat{y} + \bar{x})$

The second relation in (iii') states that if there is an excess supply of some
commodity, its price must be zero. As we will see shortly, the case p = 0
will be precluded, under the assumption that $\hat{x}_i$ is a "local nonsatiation
chosen point."

Definition (chosen point): When price p prevails, a point $\hat{x}_i \in X_i$ is called a chosen
point of the ith consumer, if
$\hat{x}_i \succsim_i x_i$ for all $x_i \in X_i$ with $p \cdot x_i \le p \cdot \hat{x}_i$

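Definition (local nonsatiation point): A point $x_i \in X_i$ is called a local nonsatiation
point, if there exists a $\delta > 0$ with $B_\delta(x_i) \cap (X_i \setminus \{x_i\}) \ne \emptyset$ such that for any $\epsilon$, $0 < \epsilon < \delta$,
with $B_\epsilon(x_i) \cap (X_i \setminus \{x_i\}) \ne \emptyset$, we have $x_i' \succ_i x_i$ for some $x_i' \in B_\epsilon(x_i) \cap X_i$, where
$B_\delta(x_i)$ and $B_\epsilon(x_i)$ are open balls with center $x_i$ and radii $\delta$ and $\epsilon$, respectively.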
REMARK: The above definition of "local nonsatiation point" assumes that
there exists at least one commodity which is divisible (since $\epsilon$, the radius of
the ball, can be any real number with $0 < \epsilon < \delta$)¹³ and that the consumer is
"not satiated" with respect to this divisible commodity at point $x_i$ (that is,
there exists a point $x_i'$ such that $x_i' \succ_i x_i$ for each $\epsilon$). Figure 2.9 illustrates the case
in which one commodity is perfectly divisible and the other commodity is
indivisible.¹⁴ The consumption set here is assumed to be the collection of
the horizontal lines in the nonnegative orthant.

Figure 2.9. An Illustration of the Local Nonsatiation Point.

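REMARK: Suppose $\hat{x}_i$ is a chosen point under p so that $\hat{x}_i \succsim_i x_i$ for all
$x_i \in X_i$ with $p \cdot x_i \le p \cdot \hat{x}_i$. Then p cannot be a zero vector if $\hat{x}_i$ is a local
nonsatiation point. (For if p = 0, then $p \cdot x_i \le p \cdot \hat{x}_i$ holds for any $x_i \in X_i$, so that
$\hat{x}_i$ would be at least as good as every point of $X_i$, contradicting local nonsatiation.)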

Definition (locally nonsaturating $\succsim_i$): The preference ordering $\succsim_i$ is called
locally nonsaturating if, given any local nonsatiation point $x_i$, $x_i' \succsim_i x_i$ implies
that $x_i'$ is also a local nonsatiation point.
We now introduce the following assumption.
(A-1) The preference ordering $\succsim_i$ is locally nonsaturating for every consumer.
REMARK: This assumption presupposes that there exists at least one per-
fectly divisible commodity.

Lemma: Let $\hat{x}_i$ be a locally nonsatiating chosen point for the ith consumer when price
p prevails. Then under (A-1),
(i) $x_i \succ_i \hat{x}_i$ implies $p \cdot x_i > p \cdot \hat{x}_i$.
(ii) $x_i \succsim_i \hat{x}_i$ implies $p \cdot x_i \ge p \cdot \hat{x}_i$.
PROOF:
(i) Suppose not; that is, $p \cdot x_i \le p \cdot \hat{x}_i$. Since $\hat{x}_i$ is a chosen point, $\hat{x}_i \succsim_i x_i$,
which is a contradiction.
(ii) Let $x_i \succsim_i \hat{x}_i$ and suppose that $p \cdot x_i < p \cdot \hat{x}_i$. Since $\hat{x}_i$ is not a point of
local satiation, neither is $x_i$ [by (A-1)]. Hence for all $\epsilon$, $0 < \epsilon < \delta$, there
exists $x_i' \in X_i$ and $x_i' \in B_\epsilon(x_i)$ such that $x_i' \succ_i x_i$, which in turn implies
$x_i' \succ_i \hat{x}_i$ by the transitivity of the preference ordering.¹⁵ We may choose
$x_i'$ close enough to $x_i$ so that $p \cdot x_i' < p \cdot \hat{x}_i$ (which is possible because the
value function $p \cdot x_i$ is continuous). This contradicts the assumption that
$\hat{x}_i$ is a chosen point under p. (Q.E.D.)
REMARK: The proof of statement (ii) of the above lemma can be illustrated
by Figure 2.10.

Figure 2.10. An Illustration of the Lemma ($X_i = R^2$).

REMARK: The reader should realize that the choice of $x_i'$ close enough to $x_i$
so that $p \cdot x_i' < p \cdot \hat{x}_i$ needs the assumption that there exists at least one commodity
which is divisible.

Theorem 2.C.1: Let $[p, \{\hat{x}_i\}, \{\hat{y}_j\}]$ be a competitive equilibrium such that $\hat{x}_i$ is
a local nonsatiation point for all i = 1, 2, ..., m. Suppose assumption (A-1) holds
for all i. Then $[\{\hat{x}_i\}, \{\hat{y}_j\}]$ is a Pareto optimum.
PROOF: Suppose $[\{\hat{x}_i\}, \{\hat{y}_j\}]$ is not a Pareto optimum. Then there exist
$[\{x_i\}, \{y_j\}]$ such that $x_i \in X_i$, i = 1, 2, ..., m, $y_j \in Y_j$, j = 1, 2, ..., k, and

(i) $x = y + \bar{x}$ (feasibility)
(ii) $x_i \succsim_i \hat{x}_i$ for all i = 1, 2, ..., m
(iii) $x_i \succ_i \hat{x}_i$ for some i

Hence from the previous lemma we have

$\sum_{i=1}^{m} p \cdot x_i > \sum_{i=1}^{m} p \cdot \hat{x}_i$ or $p \cdot x > p \cdot \hat{x}$

But condition (iii) of C.E. requires $p \cdot \hat{x} = p \cdot \hat{y} + p \cdot \bar{x}$. Hence we have
$p \cdot x > p \cdot \hat{y} + p \cdot \bar{x}$. Condition (ii) of C.E. requires $p \cdot \hat{y}_j \ge p \cdot z_j$ for all $z_j \in Y_j$,
j = 1, 2, ..., k. Hence, in particular, $p \cdot \hat{y}_j \ge p \cdot y_j$, j = 1, 2, ..., k. Or,
summing over j, $p \cdot \hat{y} \ge p \cdot y$. Therefore we now have

$p \cdot (x - y - \bar{x}) > 0$. This contradicts the feasibility of $\{x_i\}$. (Q.E.D.)

REMARK: Since $\hat{x}_i$ is a local nonsatiating chosen point, we have $p \ne 0$.
REMARK: In the proofs of the above theorem and the preceding lemma,
no convexity of the preference ordering is assumed.
REMARK: Although Theorem 2.C.1 states that every competitive equi-
librium realizes a Pareto optimum under the extremely weak assumption
(A-1), it does not say that a competitive equilibrium can exist under the
same assumption.¹⁶ At present the proof of the existence of competitive
equilibrium requires a more stringent set of assumptions. This will be dis-
cussed in Section E of this chapter.
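
As a rough numerical companion to Theorem 2.C.1, the sketch below searches a grid of feasible allocations for a Pareto improvement. Everything in it is an assumption made for illustration: a pure exchange economy with two consumers, the symmetric utility $u(x) = x_1^{1/2} x_2^{1/2}$ for both, total resources (4, 4), and a small tolerance for floating-point comparisons. A grid search of this kind can only fail to find a counterexample; it does not prove anything.

```python
# Brute-force illustration of Theorem 2.C.1 on a grid (a sketch, not a proof).
# Assumed setup: two consumers, two commodities, no production,
# u(x) = x1**0.5 * x2**0.5 for both consumers, total resources (4, 4).

TOTAL = (4.0, 4.0)
TOL = 1e-9

def u(x):
    return (x[0] ** 0.5) * (x[1] ** 0.5)

def pareto_improvement(x_A, x_B, steps=40):
    """Return a feasible grid allocation that Pareto-dominates (x_A, x_B), or None."""
    uA_star, uB_star = u(x_A), u(x_B)
    for i in range(steps + 1):
        for j in range(steps + 1):
            cand_A = (TOTAL[0] * i / steps, TOTAL[1] * j / steps)
            cand_B = (TOTAL[0] - cand_A[0], TOTAL[1] - cand_A[1])
            uA, uB = u(cand_A), u(cand_B)
            weakly = uA >= uA_star - TOL and uB >= uB_star - TOL
            strictly = uA > uA_star + TOL or uB > uB_star + TOL
            if weakly and strictly:
                return cand_A, cand_B
    return None

# Initial resource point R and the competitive equilibrium C reached from it.
# With these symmetric utilities the equilibrium price ratio is 1, each consumer
# has wealth 4 and spends half of it on each commodity.
R = ((3.0, 1.0), (1.0, 3.0))
C = ((2.0, 2.0), (2.0, 2.0))

print("improvement on R:", pareto_improvement(*R))
print("improvement on C:", pareto_improvement(*C))
```

The search finds a dominating allocation for the initial point R but none for the equilibrium allocation C, in line with the theorem.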
REMARK: We may easily construct an example in which a competitive
equilibrium does not realize a Pareto optimum if (A-1) is not satisfied. We
illustrate this by the Edgeworth-Bowley diagram (Figure 2.11) which deals
with a two-person pure exchange economy. In the diagram, $O_A$ represents
the origin for individual A and $O_B$ represents the origin for individual B. The
initial resource point is illustrated by R. The indifference curves are drawn in
the ordinary convex fashion. Note that individual A is satiated over and
above his indifference curve, and each point in the satiation region (the
shaded region) gives him the same level of satisfaction.
Point C is a competitive equilibrium under the price represented by
line H. But this point C is not a Pareto optimal point, for by moving from
point C to point P, individual B certainly can increase his satisfaction with-
out affecting A's satisfaction. (Note that P is a Pareto optimal point.) We
may also note that in this example, point C can be achieved from point R

Figure 2.11. (C.E. ⇏ P.O.)

by competitive pricing under price line H. We may, however, recall that

the above definition of C.E. does not require that point C should be achieved
from an arbitrary initial resource point (say, R). The reader should be able
to construct an example in which a competitive equilibrium does not realize
a Pareto optimum when none of the commodities are divisible.¹⁷
In the above definition of competitive equilibrium, we have not specified
how each consumer obtains the income which enables him to purchase commodities
with a value of $p \cdot \hat{x}_i$. One typical case is that the ith consumer receives
the values of his resources $\bar{x}_i$ (where $\bar{x}_i \in R^n$ and $\sum_{i=1}^{m} \bar{x}_i = \bar{x}$) and shares $\theta_{1i}$,
$\theta_{2i}$, ..., $\theta_{ji}$, ..., $\theta_{ki}$ of the profit of the 1st, ..., jth, ..., kth producer (where
$\theta_{ji} \in R$ with $\theta_{ji} \ge 0$ and $\sum_{i=1}^{m} \theta_{ji} = 1$). The $\bar{x}_i$'s are the given quantities of the commodities
that he owns a priori, and $\theta_{ji}$ can be interpreted as the fraction of the
stock of the jth producer that he owns.
The case thus described can be called the private ownership economy. We now
wish to relate the definition of a competitive equilibrium as defined above to the
competitive equilibrium of the private ownership economy.

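Definition (competitive equilibrium of the private ownership economy): An array
of vectors $[p, \{\hat{x}_i\}, \{\hat{y}_j\}, \{\theta_{ji}\}]$ is a competitive equilibrium of the private ownership
economy (C.E.P.O.E.) if $\hat{x}_i \in X_i$, i = 1, 2, ..., m, $\hat{y}_j \in Y_j$, j = 1, 2, ..., k, and
(i) $\hat{x}_i \succsim_i x_i$ for all $x_i \in X_i$ such that $p \cdot x_i \le M_i$, where $M_i \equiv p \cdot \bar{x}_i + \sum_{j=1}^{k} \theta_{ji}\, p \cdot \hat{y}_j$,
i = 1, 2, ..., m.
(ii) $p \cdot \hat{y}_j \ge p \cdot y_j$ for all $y_j \in Y_j$, j = 1, 2, ..., k.
(iii) $\hat{x} = \hat{y} + \bar{x}$.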

Clearly if $[p, \{\hat{x}_i\}, \{\hat{y}_j\}, \{\theta_{ji}\}]$ is a C.E.P.O.E., then it is a C.E. It is easy to
check that every C.E. can be derived from some C.E.P.O.E. This can be done by
giving the ith consumer the resources $\bar{x}_i = \hat{x}_i - (1/m)\hat{y}$ and the shares $\theta_{ji} = 1/m$
(observe that $\sum_{i=1}^{m} \theta_{ji} = 1$, $\bar{x} + \hat{y} = \hat{x}$, and so on), where $[p, \{\hat{x}_i\}, \{\hat{y}_j\}]$ is a C.E.¹⁸
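
The check left to the reader can be written out as follows. This is only a worked restatement of the construction above, writing $\hat{x}_i$, $\hat{y}_j$ for the equilibrium quantities, $\bar{x}_i$ for the assigned resources, and $M_i$ for the resulting income of consumer i; it shows that each consumer's income equals the value of his equilibrium bundle and that the assigned resources add up to the initial bundle.

```latex
\begin{align*}
M_i &= p\cdot\bar{x}_i + \sum_{j=1}^{k}\theta_{ji}\,p\cdot\hat{y}_j
     = p\cdot\Bigl(\hat{x}_i - \tfrac{1}{m}\hat{y}\Bigr)
       + \tfrac{1}{m}\sum_{j=1}^{k} p\cdot\hat{y}_j \\
    &= p\cdot\hat{x}_i - \tfrac{1}{m}\,p\cdot\hat{y} + \tfrac{1}{m}\,p\cdot\hat{y}
     = p\cdot\hat{x}_i , \\[4pt]
\sum_{i=1}^{m}\bar{x}_i &= \sum_{i=1}^{m}\hat{x}_i - \hat{y}
     = \hat{x} - \hat{y} = \bar{x}
     \qquad\text{[by condition (iii) of C.E.]}
\end{align*}
```

Hence the budget constraint $p \cdot x_i \le M_i$ of the private ownership economy coincides with the constraint $p \cdot x_i \le p \cdot \hat{x}_i$ in condition (i) of C.E.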
We now turn to a deeper or at least a more difficult theorem, that is, a
proposition somewhat converse to Theorem 2.C.1. Given the fact that an economy
is in a Pareto optimal state, we want to know whether or not there exists a price
vector such that it can be supported as a competitive equilibrium with this price
vector (allowing a redistribution of ownership of the resources, if necessary). The
answer is affirmative under a stronger set of assumptions than that required in
Theorem 2.C.1. It will be shown that the crucial tool in establishing such a theorem
is the separation theorem and that the slope of the separating hyperplane will give
such a price vector. We start our discussion by recalling some definitions intro-
duced in the previous section.

Definition (convex preference ordering): We call a preference ordering $\succsim_i$ on $X_i$ convex
if
(i) The set $X_i$ is convex.
(ii) $x \succ_i x'$ implies $tx + (1 - t)x' \succ_i x'$, for 0 < t < 1.
(iii) $x \sim_i x'$ implies $tx + (1 - t)x' \succsim_i x'$, for 0 < t < 1.
Here i refers to the ith consumer.
REMARK: The convexity of $X_i$ is necessary to make statements (ii) and (iii)
of the above definition meaningful. The convexity of $X_i$ presupposes the
divisibility of all the commodities.

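Definition: Let
$C_i(\hat{x}_i) \equiv \{x_i : x_i \in X_i,\ x_i \succsim_i \hat{x}_i\}$
$\mathring{C}_i(\hat{x}_i) \equiv \{x_i : x_i \in X_i,\ x_i \succ_i \hat{x}_i\}$
The set $C_i(\hat{x}_i)$ is the no-worse-than-$\hat{x}_i$ set for i, and $\mathring{C}_i(\hat{x}_i)$ is the preferred-to-$\hat{x}_i$
set for i.
REMARK: The convexity of the preference ordering (for i) implies the
convexity of $C_i(\hat{x}_i)$ and $\mathring{C}_i(\hat{x}_i)$ for all $\hat{x}_i \in X_i$.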
We now introduce the following assumptions:
(A-2) The preference ordering $\succsim_i$ is convex for each i = 1, 2, ..., m.
(A-3) The set Y is convex.
(A-4) (cheaper point)¹⁹ Given a point $\hat{x}_i$ and a prevailing price vector p, there exists
$x_i' \in X_i$ such that $p \cdot x_i' < p \cdot \hat{x}_i$.
(A-5) (continuity of $\succsim_i$) For each i = 1, 2, ..., m, the set $\{x_i : x_i \in X_i,\ x_i \succsim_i x_i'\}$
is closed for all $x_i' \in X_i$ (that is, if $\{x_i^q\}$ is a sequence in $X_i$ such that $x_i^q \succsim_i x_i'$ and
$x_i^q \to x_i^0$, then we have $x_i^0 \succsim_i x_i'$).
REMARK: Assumption (A-3) does not require that the production set for
each producer ($Y_j$) be convex. Assumption (A-4) is also called the minimum
wealth assumption.

Definition (nonsatiation): The ith consumer is said to be nonsatiated at $\hat{x}_i$ if there
exists an $x_i \in X_i$ such that $x_i \succ_i \hat{x}_i$.

Theorem 2.C.2: Suppose that $[\{\hat{x}_i\}, \{\hat{y}_j\}]$ is a Pareto optimum such that at least
one consumer is not satiated. Then under assumptions (A-2) and (A-3), there exists a
(price) vector $p \ne 0$ such that
(i) $p \cdot \hat{x}_i \le p \cdot x_i$ for all $x_i \in X_i$ with $x_i \succsim_i \hat{x}_i$, i = 1, 2, ..., m.
(ii) $p \cdot \hat{y}_j \ge p \cdot y_j$ for all $y_j \in Y_j$, j = 1, 2, ..., k.
(iii) $\hat{x} = \hat{y} + \bar{x}$.

REMARK: This theorem does not require (A-4) and (A-5) but does not
quite say that to every Pareto optimum we can adjoin a price vector such that
it is supported by C.E. Condition (i) states that each consumer minimizes his
expenditure over his no-worse-than-$\hat{x}_i$ set, but it does not necessarily imply
the maximization of satisfaction over the budget set. To prove the latter,
we use (A-4) and (A-5). We first prove the above theorem.
PROOF: Without loss of generality, we can suppose that the first consumer
is nonsatiated (at $\hat{x}_1$). Let $C(\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_m) \equiv \mathring{C}_1(\hat{x}_1) + \sum_{i=2}^{m} C_i(\hat{x}_i)$. For
notational simplicity, we abbreviate $C(\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_m)$ by C. By (A-2), C
is convex. Let $W \equiv \{w : w = y + \bar{x},\ y \in Y\}$. Since Y is convex by (A-3), W is
also convex. By the definition of P.O., $z \in C$ implies $z \notin W$. Hence C and
W are two nonempty disjoint convex sets. Hence by the Minkowski separation
theorem (Theorem 0.B.3), there exists a $p \ne 0$ and a real number $\alpha$ such
that
(a) $p \cdot w \le \alpha$ for all $w \in W$
and

(b) $p \cdot z \ge \alpha$ for all $z \in C$

We now show that $p \cdot \hat{x} = \alpha$.
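Since $\hat{x} = \hat{y} + \bar{x}$ and $\hat{y} \in Y$, we have $\hat{x} \in W$, so that $p \cdot \hat{x} \le \alpha$. Suppose
$x' \in C$; then we can find $x_1' \in \mathring{C}_1(\hat{x}_1)$ and $x_i' \in C_i(\hat{x}_i)$, i = 2, 3, ..., m, such
that $x' = \sum_{i=1}^{m} x_i'$. Now let $x_i(t) \equiv t x_i' + (1 - t)\hat{x}_i$, 0 < t < 1, i = 1, 2, ..., m,
and $x(t) \equiv \sum_{i=1}^{m} x_i(t)$. By (A-2), $x_1(t) \in \mathring{C}_1(\hat{x}_1)$ and $x_i(t) \in C_i(\hat{x}_i)$, i = 2, 3, ...,
m. Hence $x(t) \in C$. Now suppose that $p \cdot \hat{x} < \alpha$. Then from (b) above, we
obtain $p \cdot z \ge \alpha > p \cdot \hat{x}$ for all $z \in C$. In particular, $p \cdot x(t) \ge \alpha > p \cdot \hat{x}$ for 0 < t < 1.
Since, by choosing t small enough, x(t) can be arbitrarily close to $\hat{x}$ (so that $p \cdot x(t)$
is arbitrarily close to $p \cdot \hat{x}$), this is a
contradiction. Thus we have $p \cdot \hat{x} \ge \alpha$. This together with $p \cdot \hat{x} \le \alpha$ gives
$p \cdot \hat{x} = \alpha$ (that is, the hyperplane separating the two sets W and C goes
through $\hat{x}$).²⁰ Hence (a) and (b) may be rewritten as follows:
(a') $p \cdot w \le p \cdot \hat{x}$ for all $w \in W$
and

(b') $p \cdot z \ge p \cdot \hat{x}$ for all $z \in C$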

That $w \in W$ means w can be written as $w = \sum_{j=1}^{k} y_j + \bar{x}$ where $y_j \in Y_j$, j = 1,
2, ..., k. Therefore, from (a') we obtain

$p \cdot \hat{x} \ge p \cdot \bigl(\sum_{j=1}^{k} y_j + \bar{x}\bigr)$ for all $y_j \in Y_j$, j = 1, 2, ..., k

But $\hat{x} = \hat{y} + \bar{x}$ (feasibility of P.O.). Hence

$p \cdot \bigl(\sum_{j=1}^{k} \hat{y}_j + \bar{x}\bigr) \ge p \cdot \bigl(\sum_{j=1}^{k} y_j + \bar{x}\bigr)$ for all $y_j \in Y_j$, j = 1, 2, ..., k

Or

$\sum_{j=1}^{k} p \cdot \hat{y}_j \ge \sum_{j=1}^{k} p \cdot y_j$ for all $y_j \in Y_j$, j = 1, 2, ..., k

Fix $j = j_0$ and let $y_j = \hat{y}_j$ for all $j \ne j_0$. Then $p \cdot \hat{y}_{j_0} \ge p \cdot y_{j_0}$ for all $y_{j_0} \in Y_{j_0}$.
Since the choice of $j_0$ is arbitrary, this proves condition (ii) of Theorem 2.C.2.
Condition (iii) of Theorem 2.C.2 is automatically satisfied by the feasibility
condition of P.O.
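Similarly, from (b') we obtain
(c) $p \cdot x_1 \ge p \cdot \hat{x}_1$ for all $x_1 \in \mathring{C}_1(\hat{x}_1)$
and

(d) $p \cdot x_i \ge p \cdot \hat{x}_i$ for all $x_i \in C_i(\hat{x}_i)$, i = 2, 3, ..., m

Figure 2.12. An Illustration of $x_1(t)$ and $x_1''$.

We wish to assert that $p \cdot x_1 \ge p \cdot \hat{x}_1$ for all $x_1 \in C_1(\hat{x}_1)$. To do this we have to
show that $x_1 \sim_1 \hat{x}_1$ implies $p \cdot x_1 \ge p \cdot \hat{x}_1$. Since $\hat{x}_1$ is not a satiation point,
we can find a point $x_1''$ with $x_1'' \succ_1 \hat{x}_1 \sim_1 x_1$. Let $x_1(t) \equiv t x_1'' + (1 - t)x_1$, 0 < t < 1.
Then by the convexity of $\succsim_1$ [(A-2)], $x_1(t) \succ_1 x_1$, 0 < t < 1, so that
$x_1(t) \succ_1 \hat{x}_1$, 0 < t < 1. Then, from (c) above, $p \cdot x_1(t) \ge p \cdot \hat{x}_1$. Hence by
the continuity of the function $p \cdot x_1$, it follows that $p \cdot x_1 \ge p \cdot \hat{x}_1$. Thus we
obtain
(e) $p \cdot x_i \ge p \cdot \hat{x}_i$ for all $x_i \in C_i(\hat{x}_i)$, i = 1, 2, ..., m
This proves condition (i) of the theorem. (Q.E.D.)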
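
Conclusion (i) of the theorem, expenditure minimization over the no-worse-than set, can be illustrated numerically. The sketch below rests on assumptions that are not part of the text: a pure exchange economy, Cobb-Douglas utilities with exponents 0.3 and 0.6, total resources (4, 4), and commodity 2 as numeraire. It picks a point on the contract curve, takes the common tangent line as the price vector, and searches a grid for a no-worse bundle that costs less. The function names are hypothetical helpers for this illustration only.

```python
# Sketch of conclusion (i) of Theorem 2.C.2 in a pure exchange economy.
# Assumptions (not from the text): Cobb-Douglas utilities, total resources (4, 4).

a_A, a_B = 0.3, 0.6
TOTAL = (4.0, 4.0)

def u(a, x):
    return (x[0] ** a) * (x[1] ** (1 - a))

def pareto_optimal_bundle_for_A(x_A1):
    """Point on the contract curve: equate the two marginal rates of substitution."""
    k_A, k_B = a_A / (1 - a_A), a_B / (1 - a_B)
    x_A2 = k_B * TOTAL[1] * x_A1 / (k_A * (TOTAL[0] - x_A1) + k_B * x_A1)
    return (x_A1, x_A2)

def supporting_price(x_A):
    """Normal of the common tangent line, with commodity 2 as numeraire."""
    return ((a_A / (1 - a_A)) * x_A[1] / x_A[0], 1.0)

def min_expenditure_on_grid(a, x_hat, p, steps=80, box=8.0):
    """Smallest p.x over grid points x that are no worse than x_hat."""
    u_hat = u(a, x_hat)
    best = None
    for i in range(steps + 1):
        for j in range(steps + 1):
            x = (box * i / steps, box * j / steps)
            if u(a, x) >= u_hat:
                cost = p[0] * x[0] + p[1] * x[1]
                best = cost if best is None else min(best, cost)
    return best

x_A = pareto_optimal_bundle_for_A(1.0)
x_B = (TOTAL[0] - x_A[0], TOTAL[1] - x_A[1])
p = supporting_price(x_A)

for name, a, x_hat in (("A", a_A, x_A), ("B", a_B, x_B)):
    value = p[0] * x_hat[0] + p[1] * x_hat[1]
    print(name, "value of bundle:", value,
          "cheapest no-worse grid point:", min_expenditure_on_grid(a, x_hat, p))
```

For both consumers the cheapest no-worse grid point costs at least as much as the Pareto optimal bundle itself, as condition (i) requires.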

Corollary: If in addition (A-4) holds with respect to $\hat{x}_i$ and p in the above theorem,
and if (A-5) holds, then for every Pareto optimum $[\{\hat{x}_i\}, \{\hat{y}_j\}]$, there exists $p \ne 0$
such that $[p, \{\hat{x}_i\}, \{\hat{y}_j\}]$ is a competitive equilibrium.²¹
PROOF: It suffices to show that condition (i) of C.E. holds. In the above
theorem we obtained
(e) $p \cdot x_i \ge p \cdot \hat{x}_i$ for all $x_i \in C_i(\hat{x}_i)$, i = 1, 2, ..., m
This does not preclude the existence of an $x_i \in X_i$ such that $p \cdot x_i = p \cdot \hat{x}_i$ and
$x_i \succ_i \hat{x}_i$. We will show that this cannot happen under (A-4) and (A-5). Suppose
the contrary. In other words, suppose that there exists an $x_i \in X_i$ such
that
$p \cdot x_i = p \cdot \hat{x}_i$ and $x_i \succ_i \hat{x}_i$
From (A-4), there exists an $x_i' \in X_i$ such that $p \cdot x_i' < p \cdot \hat{x}_i$. Relation (e)
implies $x_i' \prec_i \hat{x}_i$. Now consider $z_i \equiv t x_i' + (1 - t)x_i$, 0 < t < 1. Obviously,
$p \cdot z_i < p \cdot x_i = p \cdot \hat{x}_i$. By choosing t small enough, we can make $z_i$ arbitrarily
close to $x_i$. Then, from (A-5), we obtain²²
$z_i \succ_i \hat{x}_i$, but $p \cdot z_i < p \cdot \hat{x}_i$

Figure 2.13. An Illustration of the Proof of the Corollary.

This contradicts relation (e) above. Hence there cannot exist an $x_i \in X_i$ such
that $p \cdot x_i = p \cdot \hat{x}_i$ and $x_i \succ_i \hat{x}_i$. This means that $p \cdot x_i = p \cdot \hat{x}_i$, $x_i \in X_i$, implies
$x_i \precsim_i \hat{x}_i$. Note that relation (e) means (taking its contraposition) that
$p \cdot x_i < p \cdot \hat{x}_i$, $x_i \in X_i$, implies $x_i \prec_i \hat{x}_i$. Hence we have obtained that
$p \cdot x_i \le p \cdot \hat{x}_i$, $x_i \in X_i$, implies $x_i \precsim_i \hat{x}_i$
In other words, $\hat{x}_i \succsim_i x_i$ for all $x_i \in X_i$ with $p \cdot x_i \le p \cdot \hat{x}_i$. This proves condition
(i) of C.E. (Q.E.D.)²³
REMARK: Note that the competitive equilibrium in the above corollary
can be achieved by allocating from the aggregate income of the society,
$p \cdot (\hat{y} + \bar{x})$, the amount $p \cdot \hat{x}_i$, i = 1, 2, ..., m, to each consumer [note that
condition (iii) in the definition of C.E. guarantees that all the income of the
society is completely absorbed by all the consumers in the society]. In other
words, without such a reallocation of ownership, a Pareto optimum cannot,
in general, be supported by competitive pricing.
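
The reallocation described in this remark can be sketched with numbers. The example is an assumption-laden illustration, not part of the text: a pure exchange economy with Cobb-Douglas utilities (exponents 0.3 and 0.6), total resources (4, 4), and a Pareto optimal allocation chosen so that the two marginal rates of substitution coincide. Each consumer is handed the income $p \cdot \hat{x}_i$, and the closed-form Cobb-Douglas demand is then compared with $\hat{x}_i$.

```python
# Sketch of the reallocation in the remark: give consumer i the income p.x_hat_i.
# Assumptions (illustrative only): Cobb-Douglas utilities, pure exchange,
# total resources (4, 4), commodity 2 as numeraire.

def cobb_douglas_demand(a, p, income):
    """Utility-maximizing bundle for u(x) = x1**a * x2**(1 - a)."""
    return (a * income / p[0], (1 - a) * income / p[1])

def value(p, x):
    return p[0] * x[0] + p[1] * x[1]

a_A, a_B = 0.3, 0.6
total = (4.0, 4.0)

# A Pareto optimal allocation (the marginal rates of substitution are equal) ...
x_hat_A = (1.0, 28.0 / 13.0)
x_hat_B = (total[0] - x_hat_A[0], total[1] - x_hat_A[1])
# ... and the price vector given by the common tangent line.
p = ((a_A / (1 - a_A)) * x_hat_A[1] / x_hat_A[0], 1.0)

income_A = value(p, x_hat_A)
income_B = value(p, x_hat_B)

print("incomes:", income_A, income_B, "aggregate income:", value(p, total))
print("A demands:", cobb_douglas_demand(a_A, p, income_A), "x_hat_A:", x_hat_A)
print("B demands:", cobb_douglas_demand(a_B, p, income_B), "x_hat_B:", x_hat_B)
```

With these incomes each consumer's demand reproduces his Pareto optimal bundle, and the two incomes exhaust the aggregate income. With a different distribution of income the same prices would lead the consumers to other bundles.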
As we remarked above, Theorem 2.C.2 does not quite establish that a Pareto
optimum can be realized through a competitive equilibrium. To establish this
(the above corollary), we needed the additional assumptions (A-4) and (A-5). An
example showing that the conclusion of the corollary does not follow when the
cheaper point assumption (A-4) is missing was first offered by Arrow [1], and we
illustrate this with the Edgeworth-Bowley box diagram shown in Figure 2.14. The
consumption set for each consumer is assumed to be the nonnegative orthant, so
that one consumer's (consumer A) consumption set is the northeast orthant from
$O_A$ and the other consumer's (consumer B) consumption set is the southwest
orthant from $O_B$. There is no production in the economy.

Figure 2.14. Arrow's Anomalous Case.

In the diagram, point P is a Pareto optimal point. The line tangent to the two
consumers' indifference curves is the line going through $O_A P Q$ (the upper contour
set for each consumer at P is represented by the shaded area). But this cannot represent
the price line which supports a competitive equilibrium at point P. For
under this price line, the price of commodity X is zero and consumer A is certainly
not maximizing his satisfaction subject to his budget at point P. He can increase
his satisfaction by moving in the direction of Q (this is possible since he can get
commodity X free with the given price line). Note that at point P, the value of his
commodity bundle is zero and he has no point in his consumption set below this line
($O_A P Q$). In other words, the cheaper point assumption (A-4) is not satisfied. Note
also that all the other assumptions for the corollary can be satisfied in the above
example.
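
A small numerical example in the same spirit may help. It is a construction made up for this illustration and is not the example drawn in Figure 2.14: consumer A has utility $u_A(x, y) = x + \sqrt{y}$, consumer B has utility $u_B(x, y) = y$, total resources are (1, 1), and the allocation gives A the bundle (1, 0) and B the bundle (0, 1). This allocation is Pareto optimal (B already holds all of commodity Y, and A cannot gain without taking some of it), and under these assumptions the only supporting price direction is p = (0, 1), so that commodity X is free. The sketch checks on a grid that expenditure is minimized for both consumers, that A has no cheaper point, and that A is nevertheless not maximizing his satisfaction.

```python
import math

# Assumed example in the spirit of Arrow's anomalous case (not the one in the figure):
# u_A(x, y) = x + sqrt(y), u_B(x, y) = y, total resources (1, 1).

def u_A(b):
    return b[0] + math.sqrt(b[1])

def u_B(b):
    return b[1]

P_A, P_B = (1.0, 0.0), (0.0, 1.0)   # the Pareto optimal allocation considered
p = (0.0, 1.0)                      # supporting price: commodity X is free

def value(b):
    return p[0] * b[0] + p[1] * b[1]

# Grid of consumption bundles in [0, 2] x [0, 2] (part of the nonnegative orthant).
grid = [(i / 20, j / 20) for i in range(41) for j in range(41)]

# Conclusion (i) of Theorem 2.C.2: expenditure is minimized on the no-worse-than sets.
exp_min_A = all(value(b) >= value(P_A) for b in grid if u_A(b) >= u_A(P_A))
exp_min_B = all(value(b) >= value(P_B) for b in grid if u_B(b) >= u_B(P_B))

# (A-4) fails for A: no bundle in the consumption set is cheaper than P_A.
cheaper_point_exists = any(value(b) < value(P_A) for b in grid)

# Utility maximization fails for A: (2, 0) costs no more than P_A but is preferred.
better = (2.0, 0.0)
affordable_and_better = value(better) <= value(P_A) and u_A(better) > u_A(P_A)

print("expenditure minimized (A, B):", exp_min_A, exp_min_B)
print("cheaper point exists for A:", cheaper_point_exists)
print("A can do better within his budget:", affordable_and_better)
```

The conclusion of Theorem 2.C.2 holds at this allocation, but the conclusion of the corollary does not, and the failure occurs exactly where the cheaper point assumption is violated.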
This example does not allow production in the economy. An example that
does allow production is offered by Koopmans ([12], pp. 34-35; [13], pp.
92-93). His example is concerned with an economy which involves only one consumer
and one producer (if one likes, one can visualize a situation in which one
person, say, Mr. Robinson Crusoe, performs two roles: one as a consumer and
the other as a producer). Since there is only one consumer, a Pareto optimal point
will be the point at which this consumer achieves his maximum satisfaction given
the "feasibility condition," that is, given the entire supply of goods in the economy.
(Recall that we denoted the aggregate supply set of the economy by W in the proof
of Theorem 2.C.2.) We now illustrate Koopmans's example in Figure 2.15.
In this economy there are only two commodities, food and labor service. The
consumption set X of the consumer in this economy is the region above the curve
QRPS. The consumer's indifference curves are represented by the dotted lines.
Assume that the indifference curve passing through point P is tangent to the line
PR (but stops at P and does not go farther along PR). Note also that any point on
the line RP (except point P) is better than P for this consumer. Now in the diagram,
point P is the only Pareto optimal point in this economy, simply because it is the
Figure 2.15. Koopmans's Example.

only feasible point (set X and set W have no intersection except at point P). The line
H (that is, the one which goes through PR) is the only line that separates the sets
X and W. (Recall that in the proof of Theorem 2.C.2 and its corollary the slope of
the separating hyperplane of X and W gave the price vector which supports a com-
petitive equilibrium.) But this line contains a point (say, R) in X. In other words,
point R is a better point than point P but has the same value as point P; hence, if
the price represented by line H prevails, Robinson the consumer will certainly in-
crease his satisfaction by trading the commodity bundle represented by P for the
bundle represented by R with Robinson the producer. That is, the separation of
decision-making functions by the price H has given rise to incompatible decisions
by the two Robinsons. Hence the Pareto optimal point P cannot be supported by
competitive pricing. Note that in this example there is no point of X below the line
H. In other words, the cheaper point assumption (A-4) is again violated. (The
reader should check that all the other assumptions can be satisfied by this
example.)
If there is a point in X below the separating line, then a Pareto optimal point
can be supported by decentralized pricing. This is illustrated by Figure 2.16, which
is again indebted to Koopmans ([12], p. 36). Note that under the price represented
by line H, Robinson the consumer maximizes his satisfaction, Robinson the pro-
ducer maximizes profit, and the feasibility condition is satisfied. Hence the separa-
tion of decision-making functions has given rise to compatible decisions by the
two Robinsons.24

FOOTNOTES

1. The measurability of individuals' utility is a long debated question. At present we
do not know any way of measuring it; we customarily treat it as nonmeasurable.
However, the important empirical and theoretical consequences known in the litera-
ture do not depend on the measurability; hence it suffices to assume only the ordinal-
ity of utility. That is, the general practice of using ordinal utility (rather than cardinal
utility) is a use of Occam's razor.
2. We may replace the last phrase of this sentence by "given that the total demand for
commodities does not exceed the total supply of commodities." In fact, this may be a
more common phraseology. However, we should note that if we allow excess supply
of commodities, as above, then we are implicitly assuming free disposability of those
commodities that are in excess supply.
3. The term "Pareto optimum" was apparently coined by I. M. D. Little in A Critique
of Welfare Economics, Oxford, Clarendon Press, 1950, p. 89.
4. Note that we do not ask here how this point x is achieved. In other words, we do not
ask here how the consumer obtains his "income" that brings him to point x. This
observation will later become relevant when we discuss the price implication of a
Pareto optimum (Theorem 2.C.2).
5. It goes without saying that the same point in the box should be chosen by the two con-
sumers. This is required for the demand for each commodity to be equal to (or not to
exceed) its supply.
6. In other words, at any competitive equilibrium point, the "price line" is tangent to
each individual's indifference curve. In view of the fact that the demand for each com-
modity should be equal to its supply in a competitive equilibrium (so that a common
point in the box must be chosen by the two consumers), this implies that any
competitive equilibrium point is a point in which the indifference curves for the two
consumers are tangent to each other.
7. The calculus proof and a quite rigorous statement of the above two classical
propositions of welfare economics are seen in Lange [14]. Although his treatment is
compact and elegant, mathematical limitations restrict the generality of his
exposition. This leads the way to a further generalization by Arrow [1] and Debreu
[4].
8. Although diagrammatical analysis is not often accepted by serious economic
theorists as the proof (or even the formulation) of a particular theorem that the an-
alysis is concerned with, any serious theorist will not question the usefulness of
diagrams in yielding an insight into the problem. It is often very advisable to think the
problem through in terms of diagrams before one mathematizes it.
9. A complete scrutiny of the literature on the topic covered in this section is done by
Moore [17], especially part I, with further extensions and interesting
counterexamples.
10. Here we define $X \equiv \sum_{i=1}^{m} X_i$ and $Y \equiv \sum_{j} Y_j$. Both are subsets of $R^n$. That $Y$ is the
aggregate production set means that there are no (technological) external economies
and diseconomies.
11. Note that $X_i \subset R^n$. Here it is assumed that individual i's preference ordering depends
only on his own consumption bundle and not on the consumption bundles of other
consumers (nor on the pattern of production). This assumption of the lack of "ex-
ternality" is one of the most crucial assumptions in the theorems of this section. In
the literature, this assumption is referred to as individualistic or selfish preference
ordering, as we remarked earlier.
12. Notice, however, that this does not preclude the possibility of the existence of pro-
duction processes which dispose of various types of waste.
13. Intuitively, a point is a local nonsatiation point if there are arbitrarily close points
which are preferred to it. The concept of a local nonsatiation point was first
introduced by Koopmans [12] and used again in [13]. Moore ([17], part I) reaffirmed the
importance of the concept in the literature.
14. An alternative way to state the above definition is as follows: x; E X; is called a
local nonsatiation point (for the ith consumer) if there exists xr E X; such that
x;®1x,and x;EX;where all t,0<t< 1.
15. If Q; is such that x; Q;z'; implies tzi + (1 - t)z'; ®,z;, 0 < t < 1, for z; z, (which
is true if Q; is strictly convex) with convex X;, then the transitivity assumption can
be dispensed with. To see this, suppose x; Q;z; with p x; < p x; as above and
let x; (t) = tx; + (1 - t)z;, 0 < t < 1. Then xi (t) Q; ac; , but p- x. (t) < p z;, which
contradicts (i) above.
16. It simply says that "if a competitive equilibrium exists, then it realizes a Pareto opti-
mum with (A-1)."
17. Such an example can be found in Quirk and Saposnik [20], p. 134. (Caution: Indif-
ference curves should take on values only on the lattice points.)
18. See Debreu [6], pp. 93-94.
19. A slightly stronger version of this assumption, which is also used in the literature, is
the following: For a given point $\hat{x}_i$, there exists $x_i' \in X_i$ such that $p \cdot x_i' < p \cdot \hat{x}_i$ for all
price vectors $p$.
20. The separating hyperplane can be written as $H = \{x : x \in X,\ \hat{p} \cdot x = \hat{p} \cdot \hat{x}\}$, where
$X = \sum_{i=1}^{m} X_i$.
21. Assumption (A-4) is rather awkward. It is certainly desirable to obtain the present
corollary replacing (A-4) by a more plausible assumption, that is, one that is based
directly on some properties of the preference orderings and/or the consumption and
production sets. For an investigation of such a point, see Moore [17], part I.
22. Suppose not. That is, suppose zj®,ij for any small t > 0. Let t - 0 so that z; - xi.
Then by (A-5), x; Q; xi , which contradicts x; Q; i1.
23. In establishing Theorem 2.C.2, which leads to the above corollary, we saw that the
separation theorem played a crucial role. In Chapter 1, we obtained theorems of non-
linear programming (notably that of concave programming) using the separation
theorem. We can conjecture that the above theorem and the corollary can be proved
using a theorem in concave programming. We attempt to do so in Section F of this
chapter.
24. The compatibility of decentralized decision making in terms of prices is the essence
of the concept of competitive equilibrium. The essence of Theorem 2.C.2 is that if a
separating hyperplane exists, it defines a price system that makes such a decentraliza-
tion possible.

REFERENCES

1. Arrow, K. J., "An Extension of the Basic Theorems of Classical Welfare Economics,"
Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Prob-
ability, ed. by J. Neyman, Berkeley, Calif., University of California Press, 1951.
2. Arrow, K. J., and Debreu, G., "Existence of an Equilibrium for a Competitive Econ-
omy," Econometrica, 22, July 1954.
3. Barone, E., "The Ministry of Production in the Collectivist State," in Collectivist
Economic Planning, ed. by F. A. von Hayek, London, Routledge, 1935 (Italian
original, 1908).
4. Debreu, G., "The Coefficient of Resource Utilization," Econometrica, 19, July 1951.
5. Debreu, G., "Valuation Equilibrium and Pareto Optimum," Proceedings of the National
Academy of Sciences of the U.S.A., 40, 1954.
6. Debreu, G., Theory of Value, New York, Wiley, 1959, esp. chap. 6.
7. Dorfman, R., Samuelson, P. A., and Solow, R. M., Linear Programming and Economic
Analysis, New York, McGraw-Hill, 1958.
8. Hicks, J. R., "The Foundations of Welfare Economics," Economic Journal, XLIX,
December 1939.
9. Hurwicz, L., "Optimality and Informational Efficiency in Resource Allocation
Processes," in Mathematical Methods in the Social Sciences, 1959, ed. by K. J. Arrow,
S. Karlin, and P. Suppes, Stanford, Calif., Stanford University Press, 1960.
10. Karlin, S., Mathematical Methods and Theory in Games, Programming, and Economics,
Vol. I., 1st ed., Reading, Mass., Addison-Wesley, 1959.
11. Koopmans, T. C., "Efficient Allocation of Resources," Econometrica, 19, October
1951.
12. Koopmans, T. C., Three Essays on the State of Economic Science, New York, McGraw-Hill, 1957,
esp. secs. 1 and 2 of the first essay.
13. Koopmans, T. C., and Bausch, A., "Selected Topics Involving Mathematical Reason-
ing," SIAM Review, 1, July 1959, esp. pp. 83-95.
14. Lange, O., "Foundations of Welfare Economics," Econometrica, 10,
January-October 1942.
15. Lerner, A. P., "The Concept of Monopoly and Measurement of Monopoly Power,"
Review of Economic Studies, 1, June 1934.
16. Lerner, A. P., Economics of Control, New York, Macmillan, 1944.
17. Moore, J. C., "On Pareto Optima and Competitive Equilibria (Part I: Relation-
ships Among Equilibria and Optima; Part II: The Existence of Equilibria and
Optima)," Krannert Institute Paper, nos. 268 and 269, April 1970, Purdue University.
18. Pareto, V., Manuel d'Economie Politique, 2nd ed., Paris, Giard, 1927 (1st ed., 1909),
esp. chap. VI.
19. Pigou, A. C., The Economics of Welfare, 4th ed., London, Macmillan, 1932, esp.
chaps. IX, X, XI.
20. Quirk, J., and Saposnik, R., Introduction to General Equilibrium Theory and Welfare
Economics, New York, McGraw-Hill, 1968, esp. chap. 4, sec. 5.
21. Rader, J. T., "Pairwise Optimality and Noncompetitive Behavior," in Papers in
Quantitative Economics, Vol. I, ed. by J. Quirk and A. M. Zarley, Lawrence, Kansas,
University of Kansas Press, 1968.
22. Samuelson, P. A., Foundations of Economic Analysis, Cambridge, Mass., Harvard
University Press, 1947.
23. Wicksell, K., Lectures on Political Economy, Vol. I, London, Routledge & Kegan Paul,
Ltd., 1935 (Swedish original, 1901).

Appendix to Section C: Introduction to the Theory of the Core

a. INTRODUCTION
Consider a simple two-person, two-commodity pure exchange economy,
which may be illustrated by the familiar Edgeworth-Bowley box diagram. Let
$x_i$ and $y_i$, respectively, be the amounts of commodities X and Y which are initially
held by consumer i, where i = 1, 2. We suppose that the two people, starting from
such an initial position, wish to improve their satisfaction by engaging in the trade
of these two commodities. The situation is illustrated in Figure 2.17.
In Figure 2.17, the indifference curves of the two people are denoted by
the usual strictly convex shapes ($\alpha_1, \alpha_2, \ldots$, and $\beta_1, \beta_2, \ldots$). The initial
endowment point is denoted by point R. The curve passing through points P, E, and
Q is the contract curve, which is the locus of points at which two individuals'
indifference curves are tangent to each other. Any point on the contract curve
is a Pareto optimum point.
If the two consumers, starting from the initial point R, trade with each
other, the result is a reallocation of the total amounts of the two commodities
between them, which may be denoted by a point in the Edgeworth-Bowley box


Figure 2.17. Two-Person, Two-Commodity Pure Exchange Economy.


shown in Figure 2.17. If the competitive price mechanism1 is introduced into
the trading, then we obtain a competitive equilibrium reallocation E, where the
H-line signifies the equilibrium price ratio.
That the competitive equilibrium allocation is on the contract curve (and
hence is Pareto optimal) is an important welfare result of the competitive price
mechanism. But any other allocation on the contract curve is also a Pareto
optimum.
Suppose that the competitive price mechanism is dropped from the trading
scheme. Clearly the resulting allocation depends on the trading rule. However,
it is also clear that, under any trading rule, the resulting allocation of the two
people will not fall outside the lens-shaped region defined by the two indifference
curves $\alpha_1$ and $\beta_1$. For if it does, at least one person will be worse off compared
to the initial position R, and he can always refuse to trade. It is clear that the
final resulting allocation of the two people should lie on the PQ segment of the
contract curve.
Hence we may say that, given the initial endowment point R, the alloca-
tions on the PQ segment of the contract curve should occupy a more privileged
place compared to allocations outside the PQ segment. More strongly, any
point on the curve outside the PQ segment is irrelevant to our consideration
when the initial endowment point is given as R. We term the PQ segment the
core of the above economy.2 Note that the competitive allocation E is on the
PQ segment.
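
As a numerical illustration of the PQ segment, the following minimal sketch works out a hypothetical box economy that is not taken from the text: both consumers are assumed to have the Cobb-Douglas utility $\sqrt{xy}$, and the initial point R is assumed to give consumer 1 the bundle (3, 1) and consumer 2 the bundle (1, 3). Under these assumptions the contract curve is the diagonal of the box, so a point on it can be indexed by a single number t, with consumer 1 receiving (t, t).

```python
import math

def u(x, y):
    """Hypothetical Cobb-Douglas utility, assumed identical for both consumers."""
    return math.sqrt(x * y)

# Assumed initial endowment point R (not from the text)
endow1, endow2 = (3.0, 1.0), (1.0, 3.0)
total = (endow1[0] + endow2[0], endow1[1] + endow2[1])

def on_core_segment(t):
    """Consumer 1 gets (t, t) on the contract curve; the point lies on the
    PQ segment iff neither consumer is worse off than at R."""
    a1 = (t, t)
    a2 = (total[0] - t, total[1] - t)
    return u(*a1) >= u(*endow1) and u(*a2) >= u(*endow2)

P = math.sqrt(3.0)        # consumer 1 is indifferent between P and R
Q = 4.0 - math.sqrt(3.0)  # consumer 2 is indifferent between Q and R
E = 2.0                   # competitive allocation: equal prices, each demands (2, 2)

print(on_core_segment(E), on_core_segment(1.5), on_core_segment(2.5))
```

In this example the competitive allocation E lies strictly between P and Q, in line with the remark that E is on the PQ segment.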
In order to single out the importance of the competitive solution, Edgeworth
[11] in 1881 considered an expanded economy of 2n consumers, in which there
are two "types" of consumers. Every consumer of the same type has identical
tastes and identical initial endowments. In other words, the above box diagram
economy is replicated n times. Edgeworth then argued that as n tends to infinity,
the above PQ segment shrinks to one point: the competitive allocation E (or
the set of competitive equilibria if it is not unique).3 In 1963, Debreu and
Scarf [10] elegantly and rigorously proved Edgeworth's result.
The general principle given by Edgeworth was that of "recontracting."
Consider any subgroup of consumers. Suppose that it is possible for its members
to distribute their initial resources among themselves in such a way that no
member of the subgroup is made worse off, while one or more members of the
subgroup are made better off. Whenever this happens, "recontracting" takes
place without others' consent. "Final settlement"4 comes when a contract cannot
be amended by the recontract of any such subsets. For the two-person economy,
the above PQ segment, the "core," constitutes the set of final settlements, that is,
the set of allocations which result in no further recontracting. What Edgeworth
has shown is that, in the economy of 2n consumers of two "types," such a set
of allocations decreases as n increases and converges to the set of competitive
equilibria as $n \to \infty$.
It was Shubik [28] who related the Edgeworth notion of "final settlement"
in "recontracting" to Gillies' [12] concept of the "core" in the theory of
n-person games. This, in turn, stimulated various works on the problem including
Scarf and Debreu [10], mentioned above. The n-person game theory is
concerned with situations in which individuals ("players") with conflicting interests
compete, and hence it probes deeply into the question of the theory of competitive
equilibrium. Therefore it is quite natural that economists should attempt to
master the intricacies of game theory and the theory of the core.
As indicated above, the "core" of the economy is the set of allocations
which cannot be "blocked" by any subgroup ("coalition") of members of the
economy. Note that the concept of the core is free from prices. In other words, the
core solution provides an alternative approach to the price-guided competitive
solution, as well as offering an important characterization of competitive
equilibrium through the results of Debreu-Scarf [10], and others. Moreover, the
concept of "blocking coalition" in the theory of games and the core offers a
fresh interpretation of the concept of Pareto optimum. A Pareto optimum alloca-
tion is one that will not be blocked by the coalition involving all participants
of the economy. This then means that the core gives a stronger characterization
of competitive equilibrium than does Pareto optimum. In fact, the precision of
this characterization is quite strong, as the above Edgeworth-Debreu-Scarf result
indicates. An important merit of the core-theoretic approach here is that it
permits freedom of choice for each individual of the economy and deduces that
if the number of these individuals increases, each person might behave as if
he were a price taker. In the theory of competitive markets, on the other hand,
each individual is assumed (or destined) to be a price taker.5
With the increasing interest in the concept of the core, economists are now
more concerned with the theory of n-person games (for example, the publication
of a series of joint articles by Shapley and Shubik [22], [23], [24], [25], [26],
and so on, in economic journals). This is quite natural, as we have already
remarked. In the classical treatments of game theory such as the theory of von
Neumann and Morgenstern [32], it is assumed that payoffs are made in "utils,"
which are cardinal and, like money, fully transferable among the players. It is
further assumed that these utils are linear in money. Such an assumption is quite
convenient, since the classical theory of games is almost exclusively concerned
with cooperative games with side payments.6 However, the appropriateness of
this assumption of money-like transferable utils is naturally very questionable to
economists and others, and it has been extensively debated. This, no doubt,
prompted the development of the theory of cooperative games without side
payments (for example, Aumann and Peleg [5]). A few further steps were needed
to reach ordinality of preferences. The classical N-M theory assumed
cardinal utils which are linear in money, but Shapley and Shubik [21] pointed out
that linearity and perfect transferability are not essential to the theory. What
remains is an ordinal theory.7 Scarf's approach [17] with strictly ordinal utility
is attractive from the standpoint of Occam's razor. There seems little question that
this advance in game theory has made the theory much more attractive to
economists.

It is not the intention of the author to give an expository account of the
theory of cooperative games.8 The author wishes only to point out the relevance
and the importance of the theory of games to economics. The purpose of this
appendix is to attempt an introductory exposition of the theory of the core, to
the extent that it requires no exposition of game theory. Moreover, it is not
the intention of the author to make a comprehensive survey of the entire litera-
ture. The literature in the field has been expanding quite rapidly. The aim of
this Appendix is thus a much more modest one; it is simply to familiarize the
reader with some basic concepts in the theory of the core and the important
result of Debreu and Scarf, which hopefully will facilitate further study on the
theory of the core. The author then hopes that this will further increase the
inquisitiveness of the reader about the general topic of game theory. In order to
simplify our exposition, we will confine our attention to the pure exchange econ-
omy. The extension to an economy in which production is involved is more or
less straightforward in certain cases.9

b. SOME BASIC CONCEPTS

Let $x_i$ be an n-vector of consumption by consumer i and $\bar{x}_i$ be an n-vector
of initial resources held by the ith consumer. There are m consumers and let
M be the set of all consumers, that is, $M = \{1, 2, \ldots, m\}$. Denote by $X_i$ the
consumption set of i, where $X_i \subset R^n$, and by $X$ we denote the Cartesian product
$\bigotimes_{i=1}^{m} X_i$. It is assumed
that each consumer's preference ordering is represented by a continuous
real-valued function $u_i(x_i)$ defined on $X_i$, i = 1, 2, ..., m. Clearly such a
representation is, in general, not unique.10 We arbitrarily select one of them for each
consumer, and the analysis remains purely ordinal. That is, the representation
here is only for convenience and is not essential in the subsequent discussion.11
An array of consumption vectors $x = (x_1, x_2, \ldots, x_m)$, where $x \in X$ (that is,
$x_i \in X_i$, i = 1, 2, ..., m), is called an allocation. An allocation $x$ is said to be
feasible if

(1)  $\sum_{i=1}^{m} x_i = \sum_{i=1}^{m} \bar{x}_i$

Let A be the set of all feasible allocations, that is,

(2)  $A \equiv \{x : x \in X,\ \sum_{i=1}^{m} x_i = \sum_{i=1}^{m} \bar{x}_i\}$

The central concept in the theory of the core is that of blocking.

Definition: By a coalition we mean a nonempty subset S of the set M of all
consumers. A feasible allocation $x \in A$ is said to be blocked by a coalition S if
there exists another feasible allocation $x'$ such that

(3)  $u_i(x_i') \ge u_i(x_i)$ for all $i \in S$, and $u_i(x_i') > u_i(x_i)$ for some $i \in S$

and

(4)  $\sum_{i \in S} x_i' = \sum_{i \in S} \bar{x}_i$

We then say that $x'$ is S-block superior to $x$, or $x'$ dominates $x$ by coalition S, and
denote this by $x' B_S x$, where $B_S$ is a binary relation defined on A.
REMARK: Intuitively, an allocation x is "blocked" by a coalition S if there
is another allocation which is feasible among the members of S and makes
no consumer in S worse off while making at least one consumer in S better off. The
consumers outside the coalition are "discriminated" against by the coalition
in the sense that some or all of them can be worse off in x' compared to
the initial allocation x.12
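
The blocking conditions can be made concrete with a small sketch. It assumes a hypothetical two-consumer economy with Leontief utilities min(x, y), which are not taken from the text, and it numbers consumers from 0 as is usual in Python. The function `blocks` checks that the proposed allocation is feasible for the economy, that the coalition uses only its own initial resources, and that no member loses while at least one gains.

```python
def total(bundles):
    """Component-wise sum of a list of commodity bundles (tuples)."""
    return tuple(sum(col) for col in zip(*bundles))

def blocks(S, x_new, x, endow, utils):
    """True if coalition S (a list of 0-based indices) blocks x via x_new."""
    # x_new must itself be a feasible allocation for the whole economy
    if total(x_new) != total(endow):
        return False
    # the coalition redistributes only its own initial resources
    if total([x_new[i] for i in S]) != total([endow[i] for i in S]):
        return False
    # nobody in S is worse off and somebody in S is strictly better off
    gains = [utils[i](x_new[i]) - utils[i](x[i]) for i in S]
    return all(g >= 0 for g in gains) and any(g > 0 for g in gains)

leontief = lambda bundle: min(bundle)   # hypothetical utility function
utils = [leontief, leontief]
endow = [(2, 0), (0, 2)]
x = endow                               # the no-trade allocation
x_new = [(1, 1), (1, 1)]

print(blocks([0, 1], x_new, x, endow, utils))   # coalition of both consumers
print(blocks([0], x_new, x, endow, utils))      # consumer 0 acting alone
```

Here the coalition of both consumers blocks the no-trade allocation, whereas consumer 0 alone cannot, because the bundle (1, 1) is not attainable from his own resources (2, 0).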
REMARK: Note that the coalition S may consist of all consumers in the
economy, that is, S = M. A Pareto optimal allocation is one that is not
blocked by the coalition involving all consumers.
Define the binary relation B on A by $[x' B x]$ if and only if $x' B_S x$ for some
coalition S of M. Given a feasible allocation x in A, define set-valued functions
$B_S(x)$ and $B(x)$ by

(5)  $B_S(x) = \{z : z B_S x,\ z \in A\}$

(6)  $B(x) = \{z : z B x,\ z \in A\}$

In other words, $B_S(x)$ is the set of all feasible allocations that block x by a
particular coalition S in M, and $B(x)$ is the set of all feasible allocations that
block x by some coalition in M.

Definition: The core is the set of all feasible allocations that are not blocked by
any coalition. In other words, it is equal to
$\{x \in A : B(x) = \emptyset\}$
REMARK: That x is Pareto optimal means that $B_M(x) = \emptyset$. Hence if x is in
the core, then x is Pareto optimal, whereas a Pareto optimal allocation need
not belong to the core.
Let $x_S = [x_i]_{i \in S}$ be a subvector of a feasible allocation x, in which $x_i \in x_S$
implies $i \in S$. Denote by $A_S$ the allocations attainable among the members of the
coalition S. That is,
(7)  $A_S \equiv \{x_S \in X_S : \sum_{i \in S} x_i = \sum_{i \in S} \bar{x}_i\}$

where $X_S \equiv \bigotimes_{i \in S} X_i$ and $x \in A$. Assuming that the $X_i$, i = 1, 2, ..., m, are all closed
and bounded from below, $A_S$ is compact in $R^{ns}$ for any coalition S, where s is
the number of members of coalition S. It is easy to show that $A_S$ is convex for
any S if the $X_i$, i = 1, 2, ..., m, are all convex. Let the function $u_S : A_S \to R^s$ be
defined by $u_S(x_S) = [u_i(x_i)]_{i \in S}$. That an allocation $x$ is not blocked by a coalition S
[that is, $B_S(x) = \emptyset$] means that $x_S$ is a solution of the vector maximum problem
of maximizing $u_S(x_S)$ subject to $x_S \in A_S$.

Definition: The set U(S) defined by

(8)  $U(S) \equiv \{u_S(x_S) : x_S \in A_S\}$

is called the utility possibility set of coalition S.
REMARK: Clearly
(8')  $U(S) = \{u^S : \exists\, x_S \in A_S \text{ such that } u^S = u_S(x_S)\}$
Note that U(S) is compact if $A_S$ is compact (since the $u_i$'s are all continuous).
REMARK: The concept of the utility possibility set corresponds to that of a
"characteristic function" in game theory. In economics, such a concept
(when S = M) is well known through Samuelson [27] and others.13
In the theories of the core and games, a weaker concept than that of U(S)
is often used.14 In other words, define the set V(S) by
(9)  $V(S) \equiv \{u^S : \exists\, x_S \in A_S \text{ such that } u^S \le u_S(x_S)\}$
Clearly $U(S) \subset V(S)$, but the converse does not necessarily hold. As an example
of $V(S) \not\subset U(S)$, it suffices to consider the case in which the consumption set of
each consumer has a "hole." Obviously the relevant concept in the theory of
the core should be U(S). However, it is often convenient and useful (for obtaining
sharper results) to carry out an analysis by simply assuming (or starting out with
the assumptions which imply) U(S) = V(S).
For the sake of illustration, suppose that the number of consumers in the
economy is 3 (m = 3). Assume also that the consumption set for each consumer
is $\Omega$, the nonnegative orthant of $R^n$. In such an economy, there are obviously
seven possible coalitions, that is, {1, 2, 3}, {1, 2}, {2, 3}, {1, 3}, {1}, {2}, and {3}.
Denote the sets V({1, 2, 3}), V({1, 2}), V({1}), and so on, by V(123), V(12), V(1),
and so on. Thus, for example,

(10)  $V(123) = \{(u^1, u^2, u^3) : u^i \le u_i(x_i) \text{ for some } x_i \in \Omega,\ i = 1, 2, 3, \text{ with } x_1 + x_2 + x_3 = \bar{x}_1 + \bar{x}_2 + \bar{x}_3\}$

(11)  $V(12) = \{(u^1, u^2) : u^i \le u_i(x_i) \text{ for some } x_i \in \Omega,\ i = 1, 2, \text{ with } x_1 + x_2 = \bar{x}_1 + \bar{x}_2\}$

(12)  $V(1) = \{u^1 : u^1 \le u_1(x_1) \text{ for some } x_1 \in \Omega \text{ with } x_1 = \bar{x}_1\}$

The concepts of V(12), V(1), and V(2) are illustrated in Figure 2.18.15 It should be
clear that if $u^1 \in V(1)$ and $(u^2, u^3) \in V(23)$, then $(u^1, u^2, u^3) \in V(123)$. Moreover, as
Scarf [17] has shown,

$(u^1, u^2) \in V(12)$, $(u^2, u^3) \in V(23)$, and $(u^1, u^3) \in V(13)$ imply $(u^1, u^2, u^3) \in V(123)$

provided that the $u_i$'s are quasi-concave. To show this, first observe that the
assumptions imply
$u^1 \le u_1(x_1)$, $u^2 \le u_2(x_2)$ with $x_1 + x_2 = \bar{x}_1 + \bar{x}_2$
$u^2 \le u_2(y_2)$, $u^3 \le u_3(y_3)$ with $y_2 + y_3 = \bar{x}_2 + \bar{x}_3$
$u^1 \le u_1(z_1)$, $u^3 \le u_3(z_3)$ with $z_1 + z_3 = \bar{x}_1 + \bar{x}_3$
But the allocation $[(x_1 + z_1)/2, (x_2 + y_2)/2, (y_3 + z_3)/2]$ is feasible for the
coalition consisting of all three consumers, for we have

(13)  $\frac{x_1 + z_1}{2} + \frac{x_2 + y_2}{2} + \frac{y_3 + z_3}{2} = \bar{x}_1 + \bar{x}_2 + \bar{x}_3$

Moreover, in view of the quasi-concavity of $u_1$, $u_1[(x_1 + z_1)/2] \ge \min\{u_1(x_1), u_1(z_1)\} \ge u^1$.
Similarly, we also have $u_2[(x_2 + y_2)/2] \ge u^2$ and $u_3[(y_3 + z_3)/2] \ge u^3$.
Therefore, $(u^1, u^2, u^3) \in V(123)$.
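
The averaging argument can be checked numerically. The sketch below is only an illustration under assumed data: all three consumers are given the quasi-concave Leontief utility min(x, y), the initial resources are (2, 0), (0, 2), and (1, 1), and the three two-consumer reallocations x, y, z are chosen by hand. None of these numbers come from the text.

```python
from fractions import Fraction as F

def u(bundle):
    """Hypothetical quasi-concave (Leontief) utility, the same for everyone."""
    return min(bundle)

def add(*bundles):
    """Component-wise sum of bundles."""
    return tuple(sum(col) for col in zip(*bundles))

def half(bundle):
    """Multiply a bundle by 1/2."""
    return tuple(c / 2 for c in bundle)

# Assumed initial resources of consumers 1, 2, 3
e1, e2, e3 = (F(2), F(0)), (F(0), F(2)), (F(1), F(1))

# Reallocations attainable by each two-consumer coalition
x1, x2 = (F(1), F(1)), (F(1), F(1))              # coalition {1, 2}
y2, y3 = (F(1, 2), F(3, 2)), (F(1, 2), F(3, 2))  # coalition {2, 3}
z1, z3 = (F(3, 2), F(1, 2)), (F(3, 2), F(1, 2))  # coalition {1, 3}

# The averaged allocation used in the argument
avg = [half(add(x1, z1)), half(add(x2, y2)), half(add(y3, z3))]

print(add(*avg) == add(e1, e2, e3))          # feasibility, as in (13)
print(u(avg[0]) >= min(u(x1), u(z1)))        # quasi-concavity bound for consumer 1
```

The same bound holds for consumers 2 and 3, so any utility vector dominated by the three pairwise reallocations is also dominated by the averaged allocation of the full coalition.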
It is important to note that the quasi-concavity assumption plays a crucial
role in connecting the three "two-consumer" coalitions to the coalition of all three
consumers. Using this as a guide, Scarf [17] proved a remarkable theorem which

Figure 2.18. An Illustration of V(S).

states that the core of any "balanced m-person game" is always nonempty,16 and
then he remarked that an exchange economy with convex preferences always
gives rise to a balanced m-person game; hence its core is nonempty. For the
concept of a "balanced m-person game" and the proof of the above theorem, we
simply refer to Scarf's elegant paper. For a recent generalization of Scarf's result,
we refer to Billera [6].17

That the core is nonempty is obviously important, for the core can indeed be
empty, and if the core is empty any discussion on the properties of the core
becomes meaningless. Moreover, in view of the close relation between the core
and the set of competitive equilibria, the study of the conditions for a nonempty
core can be utilized in the study of competitive equilibrium, such as the existence
of competitive equilibria. As we will show in the next subsection, every
competitive equilibrium is in the core. Hence if the core is empty, there exists no
competitive equilibrium.18
An example of an economy with an empty core (which is due to Scarf,
Shapley, and Shubik) is mentioned by Debreu and Scarf [10] and by Shapley and
Shubik [23]. The example is concerned with a pure exchange economy with two
commodities and three consumers, each of whom has nonconvex preferences, as
described by the indifference curves of Figure 2.19.
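Mathematically, the utility function for Figure 2.19 may, for example, be
written as

(14)  $u(x, y) = \begin{cases} x & \text{if } x \le y/2 \\ y/2 & \text{if } y/2 < x < y \\ x/2 & \text{if } x = y \\ x/2 & \text{if } y < x < 2y \\ y & \text{if } x \ge 2y \end{cases}$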
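
A minimal sketch of this utility function may help in following the steps below. It assumes the reading of (14) in which u equals x for x not exceeding y/2, y/2 for x between y/2 and y, x/2 for x between y and 2y, and y for x at least 2y; the printed formula is partly illegible, so this reading is an assumption, chosen because it reproduces the values u(1, 1) = u(1, 1/2) = u(1/2, 1) = 1/2 used in the argument. Exact fractions are used throughout.

```python
from fractions import Fraction as F

def u(x, y):
    """Assumed reading of the piecewise utility function (14)."""
    if 2 * x <= y:       # x <= y/2
        return x
    if x < y:            # y/2 < x < y
        return y / 2
    if x <= 2 * y:       # y <= x <= 2y (includes x == y)
        return x / 2
    return y             # x > 2y

half, one = F(1, 2), F(1)
print(u(one, one), u(one, half), u(half, one))   # all three bundles give 1/2

# Nonconvexity: the midpoint of two bundles with utility 1/2 is strictly worse
mid = u(F(3, 4), F(3, 4))
print(mid)
```

The last line shows why preferences are nonconvex: the bundles (1/2, 1) and (1, 1/2) both yield 1/2, but their midpoint (3/4, 3/4) yields only 3/8.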
Assuming that each consumer has one unit of each commodity initially, the proof
that the core of this economy is empty may be sketched roughly as follows:
(i) Suppose that an allocation $c = (c_1, c_2, c_3)$ is in the core where $c_i$ represents
the consumption bundle of Mr. i (i = 1, 2, 3). Since c cannot be blocked by a
coalition consisting of one person, we must have $u(c_i) \ge u(1, 1) = 1/2$,
i = 1, 2, 3.
(ii) Moreover, c cannot be blocked by the coalition consisting of any two persons.
But the coalition of any two persons can give each member $u(2/3, 4/3) =
u(4/3, 2/3) = 2/3$ by redistributing the resources between the two.

Figure 2.19. Preferences for the Empty Core. (Axes: Commodity X and Commodity Y.)

(iii) Therefore, at least two consumers (say, Mr. 1 and Mr. 2) have $u_i \ge 2/3$ for each
i (i = 1, 2).
(iv) Hence assume that $u_1 \ge 2/3$, $u_2 \ge 2/3$. Among all the possible allocations that give
Mr. 1 and Mr. 2 at least satisfaction 2/3, choose the one that gives Mr. 3 at least
satisfaction 1/2.
(v) It turns out from (14) that the only $c_3$ possible is the one such that $u(c_3) = 1/2$.
This implies that in view of (14), Mr. 3 must have either of the two commodities
in the amount of one unit. Note that owing to the lack of convexity of
preferences, $u(1, 1/2) = u(1, 1) = u(1/2, 1) = 1/2$.
(vi) Suppose that Mr. 3 gets a unit amount of X. Then we can show that the
coalition of 1 and 3 can block such an allocation.
(vii) With a similar analysis for the case where Mr. 3 receives one unit of Y, we
show that an allocation c can always be blocked by some coalition. Hence
c cannot be in the core. Thus the core is empty.
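
The flavor of steps (i) through (vii) can be seen in a short computation. The sketch below illustrates, and does not prove, the emptiness of the core: it only verifies a cycle of three blocking moves. It assumes the same reading of (14) as a piecewise function (x for x not exceeding y/2, y/2 up to x = y, x/2 up to x = 2y, and y beyond), with consumers numbered from 0 and one unit of each commodity held initially by everyone.

```python
from fractions import Fraction as F

def u(bundle):
    """Assumed reading of the piecewise utility function (14)."""
    x, y = bundle
    if 2 * x <= y:
        return x
    if x < y:
        return y / 2
    if x <= 2 * y:
        return x / 2
    return y

def total(bundles):
    """Component-wise sum of bundles."""
    return tuple(sum(col) for col in zip(*bundles))

def blocks(S, x_new, x, endow):
    """True if coalition S (0-based indices) blocks x via x_new."""
    attainable = total([x_new[i] for i in S]) == total([endow[i] for i in S])
    gains = [u(x_new[i]) - u(x[i]) for i in S]
    return attainable and min(gains) >= 0 and max(gains) > 0

one = (F(1), F(1))
endow = [one, one, one]
a, b = (F(2, 3), F(4, 3)), (F(4, 3), F(2, 3))   # each bundle yields utility 2/3

no_trade = [one, one, one]
after_12 = [a, b, one]    # Mr. 1 and Mr. 2 trade; Mr. 3 keeps (1, 1)
after_13 = [a, one, b]    # Mr. 1 and Mr. 3 trade; Mr. 2 keeps (1, 1)
after_23 = [one, a, b]    # Mr. 2 and Mr. 3 trade; Mr. 1 keeps (1, 1)

print(blocks([0, 1], after_12, no_trade, endow))
print(blocks([0, 2], after_13, after_12, endow))
print(blocks([1, 2], after_23, after_13, endow))
```

Each two-person trade leaves the third consumer at utility 1/2, and that consumer can always find a partner willing to form a new blocking pair, which is the mechanism behind the argument above.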
In view of the above example, we can see that the convexity of preferences
plays a crucial role in asserting the nonemptiness of the core. In a recent study by
Shapley and Shubik [23], it is suggested, however, that the convexity of prefer-
ences is not as crucial as it appears in the above example, if the number of
participants is large. They showed that the core can be empty but that there is a set
of allocations which can be blocked only with very small preference on the part
of the blocking coalition. In other words, assuming that a coalition "blocks" an
allocation only when the increase in preferences (money-like utils) of the blocking
coalition is at least as great as some positive number $\epsilon$, a quasi-core (called the
$\epsilon$-core) defined in terms of such a "blocking" is always nonempty when the
number of participants is large enough. Thus the core is "approximately"
nonempty if the number of participants is large enough, and the $\epsilon$-core shows such an
approximation. Although the convexity of preferences is not required in establish-
ing this result, it is obtained under a prohibitively strong assumption, that is, that
of "transferable utility." Later on we will point out another result in the litera-
ture: if there is a continuum of participants, then the core in the strict sense is
nonempty even with nonconvex preferences and nontransferable utility.

c. THEOREMS OF DEBREU AND SCARF

It is important to note that the concept of the core is free from any considera-
tions of the particular price system. The redistribution of commodities in the
concept of blocking coalitions can take place even when trading is not constrained
by prices. Suppose now that trading is restricted by the competitive price mecha-
nism; that is, each trader takes the prices as given, the prices are the same for all
traders, and each trader maximizes his satisfaction subject to his budget deter-
mined by prices. Our next concern is then to relate the concept of the core to
such an economy, that is, one that is guided by competitive prices.

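Definition: An array of vectors $[\hat{x}, \hat{p}]$ is said to be a competitive equilibrium
(or C.E.) if $\hat{x} \in A$ and $\hat{p} \in R^n$, $\hat{p} \ge 0$, such that

(i)  $u_i(\hat{x}_i) \ge u_i(x_i)$ for all $x_i \in X_i$ with $\hat{p} \cdot x_i \le \hat{p} \cdot \bar{x}_i$, i = 1, 2, ..., m

(ii)  $\sum_{i=1}^{m} \hat{x}_i = \sum_{i=1}^{m} \bar{x}_i$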

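REMARK: Let $[\hat{x}, \hat{p}]$ be a C.E. In view of (ii), $\sum_{i=1}^{m} \hat{p} \cdot (\hat{x}_i - \bar{x}_i) = 0$.
But $\hat{p} \cdot (\hat{x}_i - \bar{x}_i) \le 0$ for all i by (i). Hence we have $\hat{p} \cdot \hat{x}_i = \hat{p} \cdot \bar{x}_i$ for all i.
In Section C, we proved that if $u_i(\hat{x}_i) \ge u_i(x_i)$ for all $x_i$ with $\hat{p} \cdot x_i \le \hat{p} \cdot \hat{x}_i$,
then, under the assumptions made there,
(15-a)  $u_i(x_i) > u_i(\hat{x}_i)$ implies $\hat{p} \cdot x_i > \hat{p} \cdot \hat{x}_i$
and
(15-b)  $u_i(x_i) = u_i(\hat{x}_i)$ implies $\hat{p} \cdot x_i \ge \hat{p} \cdot \hat{x}_i$
Suppose that these relations hold for each i; then we can prove the following
theorem.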

Theorem 2.C.3: Every competitive equilibrium is in the core.

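PROOF: Let $[\hat{x}, \hat{p}]$ be a competitive equilibrium, and suppose that it is not
in the core. Then there exists a coalition S such that $x' B_S \hat{x}$, for some feasible
allocation $x'$ with $\sum_{i \in S} x_i' = \sum_{i \in S} \bar{x}_i$. Hence $u_i(x_i') \ge u_i(\hat{x}_i)$ for all $i \in S$
with strict inequality for at least one i. Hence by (15-a) and (15-b), we have
$\hat{p} \cdot x_i' \ge \hat{p} \cdot \hat{x}_i$ for all $i \in S$ with strict inequality for at least one i, so that
$\sum_{i \in S} \hat{p} \cdot x_i' > \sum_{i \in S} \hat{p} \cdot \hat{x}_i = \sum_{i \in S} \hat{p} \cdot \bar{x}_i$ (recall that $\hat{p} \cdot \hat{x}_i = \hat{p} \cdot \bar{x}_i$ for all i),
which contradicts $\sum_{i \in S} x_i' = \sum_{i \in S} \bar{x}_i$. (Q.E.D.)19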
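
The theorem can be illustrated on a small example. The sketch below assumes a hypothetical two-consumer, two-commodity economy with Cobb-Douglas utilities $x^{a} y^{1-a}$ (weights 1/3 and 2/3) and initial resources (3, 1) and (1, 3); none of these numbers come from the text. For two consumers, membership in the core amounts to individual rationality plus Pareto optimality, and for these smooth utilities an interior Pareto optimum is characterized by equal marginal rates of substitution, which is what the code checks.

```python
from fractions import Fraction as F

alpha = [F(1, 3), F(2, 3)]               # assumed Cobb-Douglas weights on commodity X
endow = [(F(3), F(1)), (F(1), F(3))]     # assumed initial resources

# Price of X in units of Y, from market clearing for X under Cobb-Douglas demand
p = (sum(a * e[1] for a, e in zip(alpha, endow))
     / sum((1 - a) * e[0] for a, e in zip(alpha, endow)))

def demand(a, e):
    """Cobb-Douglas demand at price p: spend share a of wealth on X."""
    wealth = p * e[0] + e[1]
    return (a * wealth / p, (1 - a) * wealth)

x_hat = [demand(a, e) for a, e in zip(alpha, endow)]

def utility(a, bundle):
    return float(bundle[0]) ** float(a) * float(bundle[1]) ** float(1 - a)

def mrs(a, bundle):
    """Marginal rate of substitution of X for Y at the given bundle."""
    return a / (1 - a) * bundle[1] / bundle[0]

print(p, x_hat)
print([mrs(a, b) == p for a, b in zip(alpha, x_hat)])
```

The budget of each consumer is exactly exhausted at the equilibrium, as stated in the Remark preceding the theorem, and neither consumer acting alone can block because each is at least as well off as with his initial resources.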
REMARK: As remarked before, every allocation in the core is a Pareto
optimal allocation, while the converse does not necessarily hold. Hence
the above theorem is an extension of the result which says that every
competitive equilibrium realizes a Pareto optimum. Moreover, the above
theorem also asserts that if a competitive equilibrium exists, then the core
is nonempty, and that if the core is empty, there exists no competitive
equilibrium.

Definition: Two consumers, say i and j, are said to be of the same type if they have identical utility functions (that is, $u_i = u_j$) with identical consumption sets (that is, $X_i = X_j$) and if they have the same initial endowment (that is, $\bar{x}_i = \bar{x}_j$).
Suppose that there are r consumers in each of k categories ("types") of consumers in the economy (so that kr = m). Write the consumption bundle for each consumer as
$x_{ij}$, i = 1, 2, ..., k; j = 1, 2, ..., r
That is, $x_{ij}$ is the consumption vector of the jth consumer of the ith type. An allocation vector then is written as $(x_{11}, \ldots, x_{1r}, \ldots, x_{k1}, \ldots, x_{kr})$. The utility function and the consumption set of any consumer of the ith type are denoted by $u_i$ and $X_i$, respectively. His initial endowment vector is denoted by $\bar{x}_i$, so that the aggregate endowment vector of the economy is equal to $\sum_{i=1}^{k} r\bar{x}_i$.
We now impose the following assumption.
(A-1) The consumption sets Xi are convex for all i and the utility functions ui are
strictly quasi-concave for all i.20

Theorem 2.C.4: Suppose that (A-1) holds. If $(x_{11}, \ldots, x_{1r}, \ldots, x_{k1}, \ldots, x_{kr})$ is an allocation in the core, then $x_{i1} = x_{i2} = \cdots = x_{ir}$ for each i = 1, 2, ..., k; that is, an allocation in the core assigns the same consumption to all consumers of the same type.21

PROOF: Suppose not, so that the consumption vectors $x_{i_0 1}, \ldots, x_{i_0 r}$ are not identical for some $i_0$. For such an $i_0$, let $x_{i_0 j_0}$ be the least desired consumption vector (that is, Mr. $j_0$ of the $i_0$th type is the "underdog" of the $i_0$th type). Then owing to the strict quasi-concavity of $u_{i_0}$, we have, for such a $j_0$,

(16) $u_{i_0}\left(\frac{1}{r} x_{i_0 1} + \cdots + \frac{1}{r} x_{i_0 r}\right) > u_{i_0}(x_{i_0 j_0})$

while for any other i we have

(17) $u_i\left(\frac{1}{r} x_{i1} + \cdots + \frac{1}{r} x_{ir}\right) \ge u_i(x_{ij})$, for some j

On the other hand, we have

(18) $\sum_{i=1}^{k} \left[\left(\frac{1}{r} x_{i1} + \cdots + \frac{1}{r} x_{ir}\right) - \bar{x}_i\right] = 0$

since

$\sum_{i=1}^{k} (x_{i1} + \cdots + x_{ir}) = \sum_{i=1}^{k} r\bar{x}_i$

Therefore the coalition consisting of one consumer of each type, each of whom receives a least preferred consumption (that is, the "underdogs"), would block.22 (Q.E.D.)
REMARK: Suppose that the above theorem holds so that for each i, $x_{i1} = x_{i2} = \cdots = x_{ir}$, which we write as $x_i$ to simplify the condition. Then under this theorem, an allocation in the core may be described by $(x_1, x_2, \ldots, x_k)$, where $\sum_{i=1}^{k} x_i = \sum_{i=1}^{k} \bar{x}_i$.
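The averaging step behind inequality (16) can be checked numerically. The following is only a minimal sketch: the Cobb-Douglas utility, the exponent 0.5, and the three bundles are made-up illustrative choices, not taken from the text. It compares the utility of the average bundle of one type with the utility of that type's "underdog."

```python
# Illustrative check of inequality (16): with a strictly quasi-concave utility,
# the average of unequal bundles is strictly preferred to the least-desired bundle.
# The utility function and the bundles below are hypothetical examples.

def cobb_douglas(bundle, alpha=0.5):
    """u(x, y) = x**alpha * y**(1 - alpha), strictly quasi-concave for x, y > 0."""
    x, y = bundle
    return x ** alpha * y ** (1 - alpha)

def average_bundle(bundles):
    """Componentwise mean (1/r) * (x_i1 + ... + x_ir) of the bundles of one type."""
    r = len(bundles)
    dimension = len(bundles[0])
    return tuple(sum(b[k] for b in bundles) / r for k in range(dimension))

# Made-up bundles held by the r = 3 consumers of a single type.
bundles_of_type = [(1.0, 4.0), (2.0, 2.0), (5.0, 1.0)]

underdog_utility = min(cobb_douglas(b) for b in bundles_of_type)
average_utility = cobb_douglas(average_bundle(bundles_of_type))

print("underdog utility:", underdog_utility)
print("utility of the average bundle:", average_utility)
```

In this example the average bundle gives the underdog strictly more than his assigned bundle, which is the source of the blocking coalition in the proof.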
We now proceed with our analysis by assuming that (A-1) holds. Consider
the set of all allocations in the core. Clearly it depends on r, the number of
consumers in each category. We then denote the core by C(r). It is easy to see
that if an allocation $(x_1, \ldots, x_k)$ with r members of each type is blocked by a
coalition S, then $(x_1, \ldots, x_k)$ with (r + 1) members of each type is blocked by S.
Hence $C(r + 1) \subset C(r)$. In other words, the core "shrinks," or at least is nonincreasing, as r increases. The main theorem obtained by Debreu and Scarf [10]
says that if $(x_1, \ldots, x_k)$ is in the core for all r, then it is a competitive allocation.
Intuitively, this means that the core shrinks to the set of competitive equilibria as
$r \to \infty$. The result is known as a limit theorem, which, as remarked earlier, was
originally discussed by Edgeworth [11] when k = 2.
In order to simplify our exposition, we impose the following assumption in
place of (A-1).

(A-1') For each i, the consumption set $X_i$ is $\Omega$, the nonnegative orthant of $R^n$. The utility functions $u_i$ are strictly quasi-concave for all i.
Also assume nonsatiation, that is
(A-2) For all $x_i \in \Omega$, there exists an $x_i' \in \Omega$ such that $u_i(x_i') > u_i(x_i)$, i = 1, 2, ..., k.
Furthermore, we impose the interior-point assumption, that is
(A-3) $\bar{x}_i > 0$ for all i.
Assumption (A-3) implies that if a "price" vector $p \ge 0$ prevails, then there exists an $x_i' \ge 0$ such that $p \cdot x_i' < p \cdot \bar{x}_i$. In other words, (A-3) amounts to the cheaper-point assumption.

We now prove the following theorem.

Theorem 2.C.5: Suppose that (A-1'), (A-2), and (A-3) hold. Then if $(\hat{x}_1, \ldots, \hat{x}_k)$ is in the core for all r, it is a competitive equilibrium.
PROOF: The proof is carried out in four steps.
(i) Define the set $\Gamma_i$ by
(19) $\Gamma_i \equiv \{z_i : z_i + \bar{x}_i \in \Omega,\ u_i(z_i + \bar{x}_i) > u_i(\hat{x}_i)\}$, i = 1, 2, ..., k
Since the nonsatiation assumption (A-2) holds and $u_i$ is strictly quasi-concave, $\Gamma_i$ is nonempty and convex. Define the set $\Gamma$ by23
(20) $\Gamma \equiv \{z : z = \sum_{i=1}^{k} \alpha_i z_i,\ \sum_{i=1}^{k} \alpha_i = 1,\ \alpha_i \ge 0,\ z_i \in \Gamma_i,\ i = 1, 2, \ldots, k\}$

Clearly $\Gamma$ is nonempty and convex. This set $\Gamma$ is illustrated in Figure 2.20. Next we show that the origin 0 does not belong to $\Gamma$ (allowing us to utilize the separation theorem between 0 and $\Gamma$).
(ii) Suppose that $0 \in \Gamma$. Then there exist $\alpha_i^* \ge 0$, $z_i^* \in \Gamma_i$, i = 1, 2, ..., k, with $\sum_{i=1}^{k} \alpha_i^* = 1$ such that
(21) $\sum_{i=1}^{k} \alpha_i^* z_i^* = 0$

Figure 2.20. An Illustration of the Set $\Gamma$.

Let s be any positive integer with s < r. Let $a_i^s$ be the smallest integer greater than or equal to $s\alpha_i^*$ and let I be the set of i for which $\alpha_i^* > 0$. For each i in I, define $z_i^s$ by

(22) $z_i^s \equiv \frac{s\alpha_i^*}{a_i^s} z_i^*$

Observe that $z_i^s$ approaches $z_i^*$ as s tends to infinity. Therefore, $z_i^s$ belongs to $\Gamma_i$ for a sufficiently large s, since $\Gamma_i$ is an open set (for each i) and any point sufficiently close to a point in $\Gamma_i$ (such as $z_i^*$) is in $\Gamma_i$. Observe also

(23) $\sum_{i \in I} a_i^s z_i^s = s \sum_{i \in I} \alpha_i^* z_i^* = 0$

Now consider the coalition consisting of $a_i^s$ members of each type $i \in I$, to each one of which we assign $(z_i^s + \bar{x}_i)$. Such a coalition is possible owing to (23). Also $u_i(z_i^s + \bar{x}_i) > u_i(\hat{x}_i)$ for all $i \in I$, if s is large enough (so that $z_i^s \in \Gamma_i$). Therefore this coalition blocks $(\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_k)$ for a sufficiently large s, which contradicts the assumption that $(\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_k)$ is in the core for all r. Therefore $0 \notin \Gamma$.
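(iii) Hence, from the Minkowski separation theorem (Theorems 0.B.2 or 0.B.3), there exists a $\hat{p} \in R^n$, $\hat{p} \ge 0$, such that24
(24) $\hat{p} \cdot z \ge 0$ for all $z \in \Gamma$
Now consider $x_i' \in \Omega$ such that $u_i(x_i') > u_i(\hat{x}_i)$. Then $(x_i' - \bar{x}_i)$ is in $\Gamma_i$, so also in $\Gamma$. Hence from (24), $\hat{p} \cdot x_i' \ge \hat{p} \cdot \bar{x}_i$. In other words,
(25) $u_i(x_i') > u_i(\hat{x}_i)$ implies $\hat{p} \cdot x_i' \ge \hat{p} \cdot \bar{x}_i$, i = 1, 2, ..., k
From (A-2) and the strict quasi-concavity of $u_i$, there exists an $x_i'' \in \Omega$ in every neighborhood of $\hat{x}_i$ such that $u_i(x_i'') > u_i(\hat{x}_i)$. Then we obtain $\hat{p} \cdot \hat{x}_i \ge \hat{p} \cdot \bar{x}_i$ [for if $\hat{p} \cdot \hat{x}_i < \hat{p} \cdot \bar{x}_i$, then some such $x_i''$ close enough to $\hat{x}_i$ satisfies $\hat{p} \cdot x_i'' < \hat{p} \cdot \bar{x}_i$ and $u_i(x_i'') > u_i(\hat{x}_i)$, which contradicts (25)]. On the other hand, we also have $\sum_{i=1}^{k} (\hat{x}_i - \bar{x}_i) = 0$. Combining this with $\hat{p} \cdot \hat{x}_i \ge \hat{p} \cdot \bar{x}_i$, we obtain
(26) $\hat{p} \cdot \hat{x}_i = \hat{p} \cdot \bar{x}_i$, i = 1, 2, ..., k
(iv) We now show that $\hat{x}_i$ satisfies condition (i) of competitive equilibrium. For this purpose, it remains to be shown that $u_i(x_i') > u_i(\hat{x}_i)$ implies $\hat{p} \cdot x_i' > \hat{p} \cdot \hat{x}_i$ [because of (25) and (26)]. To prove this, suppose that there exists $x_i' \in \Omega$ such that $u_i(x_i') > u_i(\hat{x}_i)$ and $\hat{p} \cdot x_i' = \hat{p} \cdot \hat{x}_i$. Then by the cheaper-point assumption (A-3),25 there exists $x_i^* \in \Omega$ such that $\hat{p} \cdot x_i^* < \hat{p} \cdot \hat{x}_i$. Hence we can choose a point $x_i^0 \in \Omega$ which is close enough to $x_i'$ so that $u_i(x_i^0) > u_i(\hat{x}_i)$, and yet $\hat{p} \cdot x_i^0 < \hat{p} \cdot \hat{x}_i$, which contradicts (25) in view of (26). (Q.E.D.)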

d. SOME ILLUSTRATIONS
The purpose of this subsection is to illustrate some of the concepts and
theorems discussed thus far. To simplify the exposition, we assume that the utility
functions of all consumers are identical and are denoted by u(xi), i = 1, 2, ..., m.
Assume that the consumption set for each consumer is $\Omega$, the nonnegative orthant of $R^n$, and that u takes nonnegative values with u(0) = 0. Furthermore, impose the following restrictive assumption on the function u.
(A-4) The function u(z) is linear homogeneous, concave, and
$u[tz_1 + (1 - t)z_2] > tu(z_1) + (1 - t)u(z_2)$
for all 0 < t < 1 and $z_1, z_2 \in \Omega$ with $z_1 \ne \beta z_2$ for any $\beta \in R$, $\beta > 0$, and $z_1 \ne 0$, $z_2 \ne 0$.

REMARK: The last part of (A-4) says that u(z) is strictly concave for all nonproportional $z_1$ and $z_2$. Note that u cannot be linear homogeneous and strictly concave for proportional $z_1$ and $z_2$.26 An example of u(z) which satisfies (A-4) is27

$u: \Omega \to R$, $u(z) = \zeta_1^{\alpha_1} \zeta_2^{\alpha_2} \cdots \zeta_n^{\alpha_n}$

where $z = (\zeta_1, \zeta_2, \ldots, \zeta_n)$, $\sum_{i=1}^{n} \alpha_i = 1$, $\alpha_i > 0$, i = 1, 2, ..., n.

REMARK: It may be worthwhile to observe some consequences of the homogeneity and concavity requirements imposed in (A-4). Let f(z) be a linear homogeneous and concave function defined on a convex subset Z of $R^n$. Let $z_1, z_2, \ldots, z_m$ be m points of Z. Note that the concavity of f alone implies

(27) $f\left[\sum_{i=1}^{m} t_i z_i\right] \ge \sum_{i=1}^{m} t_i f(z_i)$

for all $t_i \ge 0$, i = 1, ..., m, with $\sum_{i=1}^{m} t_i = 1$. Since f is linear homogeneous as well as concave, then for any $\alpha_i \ge 0$, i = 1, 2, ..., m, with $\sum_{i=1}^{m} \alpha_i > 0$,

$f\left[\sum_{i=1}^{m} \alpha_i z_i\right] = \alpha f\left[\sum_{i=1}^{m} \frac{\alpha_i}{\alpha} z_i\right] \ge \alpha \sum_{i=1}^{m} \frac{\alpha_i}{\alpha} f(z_i) = \sum_{i=1}^{m} \alpha_i f(z_i)$, where $\alpha \equiv \sum_{i=1}^{m} \alpha_i$

That is,

(28) $f\left[\sum_{i=1}^{m} \alpha_i z_i\right] \ge \sum_{i=1}^{m} \alpha_i f(z_i)$

for any $\alpha_i \ge 0$, i = 1, 2, ..., m, with $\sum_{i=1}^{m} \alpha_i > 0$. Note that $\sum_{i=1}^{m} \alpha_i$ does not have to be equal to one. For example, we have
$f(z_1 + z_2 + \cdots + z_m) \ge f(z_1) + \cdots + f(z_m)$
for such a function f. If f is linear homogeneous and strictly concave for nonproportional vectors, then the above inequality (28) is replaced by (28').

(28') $f\left[\sum_{i=1}^{m} \alpha_i z_i\right] > \sum_{i=1}^{m} \alpha_i f(z_i)$

for all $\alpha_i > 0$, i = 1, 2, ..., m, provided that at least one $z_{i_0}$ is not proportional to the others, that is, $z_{i_0} \ne \beta z_i$ for any $\beta \ge 0$ and any $i \ne i_0$, and that the $z_i$'s do not vanish.
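Inequalities (28) and (28') can be illustrated numerically. The sketch below is only an illustration under assumed data: the Cobb-Douglas exponents (0.3, 0.7), the weights, and the sample points are hypothetical choices. It computes the gap between the two sides of (28) for a nonproportional family of points, where the gap should be strictly positive, and for a proportional family, where linear homogeneity forces equality.

```python
# Numerical illustration of (28) and (28') for a linear homogeneous, concave function.
# Exponents, weights, and sample points are hypothetical illustrative values.

def u(z, exponents=(0.3, 0.7)):
    """Cobb-Douglas function whose exponents sum to one (linear homogeneous)."""
    value = 1.0
    for coordinate, exponent in zip(z, exponents):
        value *= coordinate ** exponent
    return value

def weighted_sum(weights, points):
    """Vector sum of alpha_i * z_i."""
    dimension = len(points[0])
    return tuple(sum(w * p[k] for w, p in zip(weights, points)) for k in range(dimension))

def gap(weights, points):
    """Left-hand side of (28) minus its right-hand side."""
    lhs = u(weighted_sum(weights, points))
    rhs = sum(w * u(p) for w, p in zip(weights, points))
    return lhs - rhs

weights = (2.0, 0.5, 3.0)  # positive weights that need not sum to one
nonproportional = [(1.0, 2.0), (3.0, 1.0), (2.0, 2.0)]
proportional = [(1.0, 2.0), (2.0, 4.0), (0.5, 1.0)]

gap_strict = gap(weights, nonproportional)  # strictly positive, as in (28')
gap_equal = gap(weights, proportional)      # zero up to rounding error

print("gap for nonproportional points:", gap_strict)
print("gap for proportional points:", gap_equal)
```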
We now prove the following lemma, which justifies the power of (A-4) for
the purpose of simplifying the illustration.

Lemma: Let u be the identical utility function of all m consumers and assume that u satisfies (A-4). Let V be defined by

(29) $V \equiv \{(u^1, \ldots, u^m) : \sum_{i=1}^{m} u^i = u(\omega)\}$, where $\omega \equiv \sum_{i=1}^{m} \bar{x}_i$

If $(u^1, \ldots, u^m)$ is in V, then there exists a feasible allocation $x = (x_1, \ldots, x_m)$ such that $u(x_i) = u^i$, i = 1, 2, ..., m, and that x is Pareto optimal. Conversely, if x is a Pareto optimal allocation, then $(u^1, \ldots, u^m) = [u(x_1), \ldots, u(x_m)]$ is in V.28
PROOF: The first two steps (i) and (ii) of the proof are concerned with the first statement of the lemma, and the last step (iii) is concerned with the second statement of the lemma.

(i) First we show that there exists a feasible allocation $(x_1, \ldots, x_m)$ which satisfies $\sum_{i=1}^{m} u^i = u(\omega)$, where $u^i = u(x_i)$, i = 1, ..., m. Let

(30) $x_i \equiv \frac{u^i}{u(\omega)}\, \omega$, i = 1, 2, ..., m

To show $u^i = u(x_i)$, i = 1, 2, ..., m, simply observe

$u(x_i) = \frac{u^i}{u(\omega)}\, u(\omega) = u^i$, i = 1, 2, ..., m

where we employ the linear homogeneity of u. Now observe that

(31) $\sum_{i=1}^{m} x_i = \sum_{i=1}^{m} \frac{u^i}{u(\omega)}\, \omega = \omega$

since $\sum_{i=1}^{m} u^i = u(\omega)$. In other words, the allocation $(x_1, \ldots, x_m)$ defined by (30) is feasible.
(ii) Next we show that if $[u(x_1), \ldots, u(x_m)] = (u^1, \ldots, u^m)$ is in V, then $(x_1, \ldots, x_m)$ is Pareto optimal. Suppose the contrary and assume that there exist $y_i \ge 0$, i = 1, 2, ..., m, such that $\sum_{i=1}^{m} y_i = \omega$ and $u(y_i) \ge u^i$ for all i with strict inequality for at least one i. Then using the concavity (and linear homogeneity) of u, we can observe

$u(\omega) = u\left(\sum_{i=1}^{m} y_i\right) \ge \sum_{i=1}^{m} u(y_i) > \sum_{i=1}^{m} u^i = u(\omega)$

which is a contradiction.
(iii) To show that $[u(x_1), \ldots, u(x_m)]$ is in V for every Pareto optimal allocation $(x_1, \ldots, x_m)$, it suffices to show that $(x_1, \ldots, x_m)$ is proportional in the sense that

(32) $x_i = \alpha_i \omega$ for some $\alpha_i \ge 0$, i = 1, 2, ..., m, with $\sum_{i=1}^{m} \alpha_i = 1$

where $\omega \equiv \sum_{i=1}^{m} \bar{x}_i$, the aggregate endowment vector. For then we have

$\sum_{i=1}^{m} u(x_i) = \sum_{i=1}^{m} u(\alpha_i \omega) = \sum_{i=1}^{m} \alpha_i u(\omega) = u(\omega)$

To show that every Pareto optimal allocation $(x_1, \ldots, x_m)$ is proportional, suppose the contrary. Then in view of (A-4), we have $u\left(\sum_{i=1}^{m} x_i\right) > \sum_{i=1}^{m} u(x_i)$ from (28'). But, since $\sum_{i=1}^{m} x_i = \omega$, this implies that $u(\omega) > \sum_{i=1}^{m} u(x_i)$. Define $y_i \equiv u(x_i)\,\omega / \sum_{j=1}^{m} u(x_j)$, i = 1, 2, ..., m, and observe that $(y_1, \ldots, y_m)$ is a feasible allocation and that

$u(y_i) = \frac{u(x_i)\,u(\omega)}{\sum_{j=1}^{m} u(x_j)} > u(x_i)$, i = 1, 2, ..., m

This contradicts the assumption that $(x_1, \ldots, x_m)$ is Pareto optimal.
(Q.E.D.)
REMARK: Observe that, in (iii) of the above proof, we showed that every Pareto optimal allocation is proportional in the sense of (32).29
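The two constructions used in the proof of the lemma can be traced numerically. The following is a minimal sketch under assumed data: the Cobb-Douglas exponent 0.4, the aggregate endowment (6, 3), the utility shares, and the "lopsided" allocation are all hypothetical. It first builds the proportional allocation (30) from a utility vector in V, and then applies the reallocation of step (iii) to a nonproportional allocation.

```python
# Sketch of the lemma's constructions for a two-good Cobb-Douglas utility.
# All numerical values are hypothetical illustrations.

def u(bundle, alpha=0.4):
    """u(x, y) = x**alpha * y**(1 - alpha); linear homogeneous and concave."""
    x, y = bundle
    return x ** alpha * y ** (1 - alpha)

def proportional_allocation(targets, omega):
    """Allocation (30): x_i = (u^i / u(omega)) * omega."""
    total = u(omega)
    return [tuple(t / total * c for c in omega) for t in targets]

omega = (6.0, 3.0)                        # made-up aggregate endowment
shares = (0.5, 0.3, 0.2)                  # made-up utility shares summing to one
targets = [s * u(omega) for s in shares]  # a point of V: utilities sum to u(omega)

allocation = proportional_allocation(targets, omega)
achieved = [u(x) for x in allocation]
resources_used = tuple(sum(x[k] for x in allocation) for k in range(2))

# Step (iii): a feasible but nonproportional allocation can be improved for everyone
# by handing each consumer the share u(x_i) / sum_j u(x_j) of omega.
lopsided = [(5.0, 0.5), (0.5, 2.0), (0.5, 0.5)]
total_lopsided = sum(u(x) for x in lopsided)
improved = [tuple(u(x) / total_lopsided * c for c in omega) for x in lopsided]

print("achieved utilities:", achieved)
print("resources used:", resources_used)
print("utilities before:", [u(x) for x in lopsided])
print("utilities after: ", [u(y) for y in improved])
```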
Next we turn to an illustration of the Edgeworth-Debreu-Scarf limit theorem.30 Assume now that there are two types of consumers and that there are r consumers of each type. There are two commodities X and Y in the economy. Assume for the sake of illustration that the consumers of both types have identical utility functions of the "Cobb-Douglas" form
(33) $u(x, y) = x^{\alpha} y^{1-\alpha}$, $0 < \alpha < 1$, where $x \ge 0$ and $y \ge 0$
As remarked before, this utility function satisfies (A-4) [as well as (A-1')]. The consumers in the two different types are distinguished by their initial endowments. Denote by $(\bar{x}_i, \bar{y}_i)$ the initial endowments of any consumer of the ith type (i = 1, 2). Denote the aggregate endowments of X and Y by a and b, respectively, that is,
(34) $a \equiv r\bar{x}_1 + r\bar{x}_2$ and $b \equiv r\bar{y}_1 + r\bar{y}_2$
We are interested in characterizing the core of such a replicated economy. Since any allocation in the core assigns identical consumption bundles to every consumer of the same type (the parity theorem of Subsection c), we may represent an allocation in the core by $[(x_1, y_1), (x_2, y_2)]$ where $(x_i, y_i)$ denotes the consumption bundle of any consumer of the ith type. Moreover, we know, by definition of the core, that any allocation in the core is Pareto optimal. Furthermore, as we observed in this subsection, any Pareto optimal allocation is proportional; that is, it satisfies (32). Hence any allocation in the core is proportional. In other words, any allocation in the core assigns the two commodities in the ratio of a/b, that is,

(35) $\frac{x_i}{y_i} = \frac{a}{b}$ for all i = 1, 2

Therefore, we may write

(36-a) $x_1 = \theta \frac{a}{r}$ and $y_1 = \theta \frac{b}{r}$

(36-b) $x_2 = (1 - \theta) \frac{a}{r}$ and $y_2 = (1 - \theta) \frac{b}{r}$

where $0 \le \theta \le 1$. Recalling that $a = r(\bar{x}_1 + \bar{x}_2)$ and $b = r(\bar{y}_1 + \bar{y}_2)$, we may rewrite this as
(37) $x_1 = \theta \bar{x}$, $y_1 = \theta \bar{y}$, $x_2 = (1 - \theta)\bar{x}$, and $y_2 = (1 - \theta)\bar{y}$
where $\bar{x}$ and $\bar{y}$ are defined by
(38) $\bar{x} \equiv \bar{x}_1 + \bar{x}_2$ and $\bar{y} \equiv \bar{y}_1 + \bar{y}_2$
Therefore, each consumer of type 1 obtains the satisfaction represented by $\theta \bar{x}^{\alpha} \bar{y}^{1-\alpha}$. Similarly, each person of type 2 obtains the satisfaction represented by $(1 - \theta)\bar{x}^{\alpha} \bar{y}^{1-\alpha}$.

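Consider the coalition consisting of s arbitrary consumers of type 1 and t arbitrary consumers of type 2. If $[(x_1, y_1), (x_2, y_2)]$ is an allocation in the core, then it cannot be blocked by any such coalition. In other words, using the lemma of this subsection, we obtain

(39) $s\theta \bar{x}^{\alpha} \bar{y}^{1-\alpha} + t(1 - \theta)\bar{x}^{\alpha} \bar{y}^{1-\alpha} \ge (s\bar{x}_1 + t\bar{x}_2)^{\alpha} (s\bar{y}_1 + t\bar{y}_2)^{1-\alpha}$

for any integers s and t with $0 \le s, t \le r$. Dividing both sides of (39) by $s\bar{x}^{\alpha} \bar{y}^{1-\alpha}$ and writing $t/s \equiv \sigma$ (where $s \ge 1$), we obtain

(40) $\theta + (1 - \theta)\sigma \ge \left(\frac{\bar{x}_1 + \bar{x}_2 \sigma}{\bar{x}}\right)^{\alpha} \left(\frac{\bar{y}_1 + \bar{y}_2 \sigma}{\bar{y}}\right)^{1-\alpha}$

for any $\sigma$. Denote the RHS of this relation by $\phi(\sigma)$. Then we immediately obtain

(41) $\theta \ge \frac{\phi(\sigma) - \sigma}{1 - \sigma}$, if $\sigma < 1$

We may rewrite this as
(42) $\theta \ge 1 - \frac{\phi(1) - \phi(\sigma)}{1 - \sigma}$, if $\sigma < 1$

where we note that $\phi(1) = 1$. Similarly, we have

(43) $\theta \le 1 - \frac{\phi(1) - \phi(\sigma)}{1 - \sigma}$, if $\sigma > 1$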
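Write

(44) $\psi(\sigma) \equiv \frac{\phi(1) - \phi(\sigma)}{1 - \sigma}$

Then

(45) $\psi'(\sigma) = -\frac{1}{(1 - \sigma)^2}\left[\phi'(\sigma)(1 - \sigma) - (\phi(1) - \phi(\sigma))\right]$

On the other hand, we can compute

(46) $\phi'(\sigma) = \phi(\sigma)\left[\frac{\alpha \bar{x}_2}{\bar{x}_1 + \bar{x}_2 \sigma} + \frac{(1 - \alpha)\bar{y}_2}{\bar{y}_1 + \bar{y}_2 \sigma}\right]$

so that $\phi'(\sigma) > 0$ for all $\sigma$. It is easy to check that $\phi(\sigma)$ is a strictly concave function31 and therefore
(47) $\phi'(\sigma)(1 - \sigma) > \phi(1) - \phi(\sigma)$ for all $\sigma \ne 1$

Hence from (45) we conclude

(48) $\psi'(\sigma) < 0$ for all $\sigma \ne 1$
Now for $\sigma < 1$, $\psi(\sigma)$ reaches a minimum when $\sigma$ (= t/s) is as close as possible to 1, that is, t = r - 1 and s = r. For $\sigma > 1$, $\psi(\sigma)$ reaches a maximum when $\sigma$ (= t/s) is as close as possible to 1, that is, t = r and s = r - 1. Therefore from (42) and (43), we should have

(49) $1 - \psi\left(\frac{r - 1}{r}\right) \le \theta \le 1 - \psi\left(\frac{r}{r - 1}\right)$

Now consider the limit as $r \to \infty$. Clearly when $r \to \infty$, $r/(r - 1) \to 1$ and $(r - 1)/r \to 1$. Therefore, we have, in view of (49),

(50) $1 - \lim_{\sigma \to 1,\ \sigma < 1} \frac{\Delta\phi}{\Delta\sigma} \le \theta \le 1 - \lim_{\sigma \to 1,\ \sigma > 1} \frac{\Delta\phi}{\Delta\sigma}$

where $\Delta\phi/\Delta\sigma$ denotes the difference quotient $[\phi(1) - \phi(\sigma)]/(1 - \sigma) = \psi(\sigma)$.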

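But by the differentiability of $\phi(\sigma)$, we have

(51) $\lim_{\sigma \to 1,\ \sigma < 1} \frac{\Delta\phi}{\Delta\sigma} = \lim_{\sigma \to 1,\ \sigma > 1} \frac{\Delta\phi}{\Delta\sigma} = \phi'(1)$

where $\phi'(1)$ is easily computed from (46) as

(52) $\phi'(1) = \alpha \frac{\bar{x}_2}{\bar{x}} + (1 - \alpha)\frac{\bar{y}_2}{\bar{y}}$

From (50) and (51) we obtain

(53) $\theta = 1 - \phi'(1) = 1 - \left[\alpha \frac{\bar{x}_2}{\bar{x}} + (1 - \alpha)\frac{\bar{y}_2}{\bar{y}}\right]$

in the limit as $\sigma \to 1$. It is easy to see that $0 < \theta < 1$. In particular, if the type 1 consumers do not have Y initially and the type 2 consumers do not have X initially (that is, $\bar{y}_1 = 0$ and $\bar{x}_2 = 0$), then we have
(54) $\theta = \alpha$
In any case, an allocation in the core when $r \to \infty$ is uniquely determined and easily computed as

(55) $x_i = \alpha \bar{x}_i + (1 - \alpha)\bar{y}_i \frac{\bar{x}}{\bar{y}}$, i = 1, 2

and

(56) $y_i = \alpha \bar{x}_i \frac{\bar{y}}{\bar{x}} + (1 - \alpha)\bar{y}_i$, i = 1, 2

When $\bar{x} = \bar{x}_1$ and $\bar{y} = \bar{y}_2$ so that $\theta = \alpha$, the computation is even simpler, and we obtain
(57) $x_1 = \alpha \bar{x}$, $x_2 = (1 - \alpha)\bar{x}$, $y_1 = \alpha \bar{y}$, and $y_2 = (1 - \alpha)\bar{y}$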
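The shrinking of the core in this example can be traced numerically. The sketch below is only an illustration under assumed data: the exponent 0.3 and the endowments (4, 1) and (1, 3) are hypothetical. It evaluates the interval (49) for the share $\theta$ at several values of r and compares it with the limiting value (53).

```python
# Bounds (49) on theta for the replicated two-type Cobb-Douglas economy,
# compared with the limit value (53). Parameter values are hypothetical.

def phi(sigma, alpha, e1, e2):
    """Right-hand side of (40) for endowments e1 = (x1, y1) and e2 = (x2, y2)."""
    (x1, y1), (x2, y2) = e1, e2
    x_bar, y_bar = x1 + x2, y1 + y2
    return ((x1 + x2 * sigma) / x_bar) ** alpha * ((y1 + y2 * sigma) / y_bar) ** (1 - alpha)

def psi(sigma, alpha, e1, e2):
    """Difference quotient (44): (phi(1) - phi(sigma)) / (1 - sigma), sigma != 1."""
    return (phi(1.0, alpha, e1, e2) - phi(sigma, alpha, e1, e2)) / (1.0 - sigma)

def theta_bounds(r, alpha, e1, e2):
    """Interval (49), obtained from the coalitions (s, t) = (r, r-1) and (r-1, r)."""
    lower = 1.0 - psi((r - 1) / r, alpha, e1, e2)
    upper = 1.0 - psi(r / (r - 1), alpha, e1, e2)
    return lower, upper

def theta_limit(alpha, e1, e2):
    """Limit value (53): theta = 1 - phi'(1)."""
    (x1, y1), (x2, y2) = e1, e2
    return 1.0 - (alpha * x2 / (x1 + x2) + (1 - alpha) * y2 / (y1 + y2))

alpha = 0.3
type1, type2 = (4.0, 1.0), (1.0, 3.0)  # made-up endowments of the two types
replications = (2, 10, 100, 1000)

intervals = {r: theta_bounds(r, alpha, type1, type2) for r in replications}
limit = theta_limit(alpha, type1, type2)

for r in replications:
    lower, upper = intervals[r]
    print(f"r = {r:5d}: {lower:.6f} <= theta <= {upper:.6f}")
print("limit value of theta:", limit)
```

With these numbers the interval narrows around the limiting share as r grows, in line with (50) through (53).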
Finally we relate the above discussions to the theory of competitive equilibrium. For this purpose we have to introduce prices. Denote by $p \equiv p_x/p_y$ the relative price of X vis-a-vis Y. It is assumed that each consumer chooses his consumption bundle so as to maximize his satisfaction subject to his budget constraint. In other words, his consumption bundle $(x_i, y_i)$ is a solution of the following nonlinear programming problem. For each i = 1, 2,

Maximize: $x_i^{\alpha} y_i^{1-\alpha}$ over $(x_i, y_i)$
Subject to: $px_i + y_i \le p\bar{x}_i + \bar{y}_i$, $x_i \ge 0$, $y_i \ge 0$
The solution of this problem can be computed easily and (as is well known) takes the following form:32

(58-a) $x_i = \frac{\alpha}{p}(p\bar{x}_i + \bar{y}_i)$, i = 1, 2

(58-b) $y_i = (1 - \alpha)(p\bar{x}_i + \bar{y}_i)$, i = 1, 2
where the notation (such as ^) which denotes the optimal value is omitted to simplify the notation. The condition for a competitive equilibrium is described by
(59) $x_1 + x_2 = \bar{x}_1 + \bar{x}_2$

Therefore, combining this with (58-a), we can compute the unique equilibrium price ratio p as

(60) $p = \frac{\alpha}{1 - \alpha}\,\frac{\bar{y}}{\bar{x}}$

Therefore, the unique competitive allocation is computed by (58-a) and (58-b) using this p. The resulting expressions for $x_i$ and $y_i$, i = 1, 2, are identical to the ones in (55) and (56). In other words, when $r \to \infty$, the core allocation is unique and coincides with the competitive allocation. It may be emphasized here that both the core allocation (when $r \to \infty$) and the competitive allocation are unique as a result of (A-4), especially the assumption that u is strictly concave with respect to nonproportional vectors. If this assumption is relaxed, then the uniqueness does not necessarily follow.
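The coincidence of the competitive allocation with the limiting core allocation can be verified directly. The following is a minimal sketch under assumed data: the exponent 0.3 and the endowments (4, 1) and (1, 3) are hypothetical. It computes the price ratio (60), the demands (58-a) and (58-b), and the bundles (55) and (56).

```python
# Competitive equilibrium of the two-type Cobb-Douglas exchange economy,
# compared with the limiting core allocation. Parameter values are hypothetical.

def equilibrium_price(alpha, e1, e2):
    """Price ratio (60): p = alpha / (1 - alpha) * y_bar / x_bar."""
    x_bar = e1[0] + e2[0]
    y_bar = e1[1] + e2[1]
    return alpha / (1 - alpha) * y_bar / x_bar

def demand(p, alpha, endowment):
    """Demands (58-a) and (58-b) at relative price p."""
    wealth = p * endowment[0] + endowment[1]
    return alpha * wealth / p, (1 - alpha) * wealth

def limit_core_bundle(alpha, endowment, x_bar, y_bar):
    """Bundles (55) and (56) of the core allocation as r tends to infinity."""
    x_e, y_e = endowment
    x = alpha * x_e + (1 - alpha) * y_e * x_bar / y_bar
    y = alpha * x_e * y_bar / x_bar + (1 - alpha) * y_e
    return x, y

alpha = 0.3
type1, type2 = (4.0, 1.0), (1.0, 3.0)  # made-up endowments of the two types
x_bar = type1[0] + type2[0]
y_bar = type1[1] + type2[1]

p = equilibrium_price(alpha, type1, type2)
competitive = [demand(p, alpha, e) for e in (type1, type2)]
core_limit = [limit_core_bundle(alpha, e, x_bar, y_bar) for e in (type1, type2)]

print("equilibrium price ratio:", p)
print("competitive allocation:", competitive)
print("limiting core allocation:", core_limit)
```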

e. SOME REMARKS
The limit theorem obtained by Debreu and Scarf [ 10] has aroused great
interest among mathematicians and economists working on the theory of the core
and has produced various attempts to extend the Debreu-Scarf analysis. One focal
point is the particular way of increasing the number of persons in the economy.
It is assumed that there are k types of participants in the economy with r members
of each type. Debreu and Scarf then obtained their result by letting r increase. The
crucial step in obtaining the limit theorem is the equal treatment theorem which
says that an allocation in the core assigns the same consumption to all consumers
of the same type. In this way they avoid the difficulty of the feasibility condition

$\sum_{i=1}^{m} x_i = \sum_{i=1}^{m} \bar{x}_i$

becoming meaningless when m is directly increased rather than r. However, the
assumption that there is an equal number of individuals (= r) of every type is a
strong assumption indeed. Moreover, the Debreu-Scarf limit theorem is con-
cerned with the case in which r is far larger than k, the number of types of indivi-
duals. In general, an economy would contain a different number of individuals in
each of various types, and the number of types may exceed the number of indi-
viduals in any one type. If this is the case, then the equal treatment theorem breaks
down and simple counterexamples can be found easily in which a core allocation
does not assign the same consumption to two individuals of the same type.33 Since
the equal treatment theorem is crucial in the proof of the Debreu-Scarf limit
theorem, it is then natural to explore whether or not a result similar to their limit
theorem holds when we directly increase m. One such attempt is that of Vind [ 31 ] .

Define the set $\Gamma_i$ by (19), following Debreu and Scarf [10], and let $\Gamma_i(\varepsilon) \equiv \{z_i : N_\varepsilon(z_i) \subset \Gamma_i\}$, where $N_\varepsilon(z_i)$ is an $\varepsilon$-neighborhood of $z_i$. Let $\rho(p, \varepsilon)$ be the number of persons such that $H \cap \Gamma_i(\varepsilon) \ne \emptyset$, where $H \equiv \{z_i : p \cdot z_i < 0\}$. Let $\Gamma(\varepsilon)$ be the convex hull of $\bigcup_{i=1}^{m} \Gamma_i(\varepsilon)$. If we can prove that $0 \notin \Gamma(\varepsilon)$, then using the separation theorem we can prove that there exists $\hat{p} \ge 0$ such that $\rho(\hat{p}, \varepsilon) = 0$. Hence if we can prove $0 \notin \Gamma(\varepsilon)$ for any $\varepsilon > 0$ when $m \to \infty$, then we have a generalization of the Debreu-Scarf limit theorem. However, such a proof is impossible. What Vind [31] has shown is that we can find an upper bound for $\rho(p, \varepsilon)$, which is independent of m, for every core allocation. In other words, if $(\hat{x}_1, \ldots, \hat{x}_m)$ is a core allocation, then
there exists a p >_ 0 such that p(p, c) 0, where p is defined as34
<<)I Z and c(E) = sup d(0, H n r(E))

Clearly this result is useful only when this upper bound is finite for any m. However, the bound becomes infinite when, for example, a particular individual has a complete monopoly over a certain scarce resource in the production economy. The assumption that the bound is finite then seems to play a role similar to the Debreu-Scarf equal treatment theorem.
Another approach to the theory of the core and the limit theorem starts with
Aumann's assumption of an "atomless" set or a continuum of traders [ 2] . In other
words, the concept of a competitive equilibrium requires that the influence of
each participant be zero, which is possible only when the number of participants
in the economy is infinity. Then Aumann assumed that the economy contains a
continuum of traders of as many real numbers as in the unit interval $I \equiv [0, 1]$.

Define the initial endowment and the feasible allocation, respectively, as the functions $\bar{x}$ and x defined over I to $\Omega$, such that
$\int \bar{x} \equiv \int_I \bar{x}_i\,di > 0$
and
$\int x \equiv \int_I x_i\,di = \int_I \bar{x}_i\,di$
where the integral is defined componentwise. Note that the integral gives the area under the curve defined by the integrand, and therefore the area under a single point, say $x_i$, is zero. This is the basis of the "atomless" set of participants. In the actual treatment of core theory with an atomless space of participants, a branch of mathematics called "measure theory" is extensively used. Thus the above integrals are taken in the sense of Lebesgue,35 and $x_i$ and $\bar{x}_i$ are Lebesgue integrable functions in i. Let $\mu(S)$ denote the Lebesgue measure where S is a Lebesgue measurable subset of I. The core $C_I$ and the set of competitive allocations $E_I$ are respectively defined by
$C_I \equiv \{x : u_i(x_i') > u_i(x_i),\ i \in S,\ \mu(S) > 0 \text{ imply } \int_S x_i'\,di \ne \int_S \bar{x}_i\,di\}$
$E_I \equiv \{x : \exists\, p \ge 0 \text{ such that } x_i' \in D_p(i) \text{ implies } u_i(x_i) \ge u_i(x_i') \text{ for a.e. } i \in I\}$
where $D_p(i) \equiv \{z \in \Omega : p \cdot z \le p \cdot \bar{x}_i\}$. The existence of a competitive equilibrium then amounts to asserting that $E_I \ne \emptyset$, which is proved by Aumann.36 The proof of the statement $E_I \subset C_I$ is almost trivial.37 This in turn implies that $C_I \ne \emptyset$ as long as $E_I \ne \emptyset$. Using the Minkowski separation theorem, Aumann [2] showed that $C_I \subset E_I$, which together with $E_I \subset C_I$ implies $C_I = E_I$.
A remarkable feature in Aumann's proof of the nonemptiness of $E_I$ and hence $C_I$ is that no assumption is necessary with regard to the quasi-concavity of an individual's utility function (or the convexity of an individual's preference ordering). However, as we remarked before, in the economy consisting of a finite number of consumers, the core can be empty when preference orderings lack convexity (recall Figure 2.19). Another remarkable feature is that in his proof of $C_I = E_I$, which corresponds to Debreu-Scarf's limit theorem, there is no need to suppose various "types" of consumers with the same number of members in each type. These two features are clearly very powerful, and there has been active research in the field of an atomless space of economic agents.38 The cost of obtaining these features is the assumption of a continuum of economic agents. Although
this assumption may appear to be the natural consequence of the assumption that
the influence of each agent is nil, this is indeed a striking assumption for econo-
mists. Clearly the number of economic agents is finite. It seems too far-fetched to
leave Debreu and Scarf's world of countably many agents and to jump into a
world with a continuum of agents. Furthermore, the fundamental notion of a
competitive equilibrium is that each agent is a price taker rather than that the
influence of each agent is nil. The latter implies the former, but not vice versa. It
is true that each agent would be silly to act as a price taker if he can influence
prices; but the amount of his influence may be so small and the cost of obtaining
information with regard to his influence and of forming coalitions may be so large
that each agent may end up acting as a price taker. In other words, one can argue
that the influence of each agent in a competitive market is nil, not because he
is atomless, but because the high cost of a coalition forces him to be a price taker.
That there is a cost involved in any coalition can possibly constitute a serious
weakness in the theory of the core, for it usually ignores such a cost. Consider the
case in which the number of participants m is finite. The cost to each participant
of finding a coalition which blocks a given allocation can be very large indeed if
m is large, and if every participant is not somewhat like the others. Hence to find
a core allocation, that is, an allocation that cannot be blocked by any coalition,
may become practically impossible because of the information cost involved in
finding the effects of all coalitions.
On the other hand, the price mechanism involved in the competitive
economy is quite remarkable in this respect. Each participant is required to
know only the prices given to him, and even though no more information is
required for his actions, yet the economy, by this mechanism, reaches an allocation
that cannot be blocked by any possible coalition of the economy, that is, a core
allocation. One crucial difference here is that the competitive price mechanism
does not involve excessive information cost. The Debreu-Scarf limit theorem then
says that, under certain assumptions, every core allocation can be obtained as a
competitive equilibrium!
The question then boils down to the problem of finding an equilibrium
price vector. The tatonnement process, which we discuss in Chapter 3, provides
one method of finding an equilibrium price vector, as long as the process converges
to an equilibrium. The beauty of this process is that the "market manager" of this
process does not have to know the preferences of each consumer and the produc-
tion set of each producer. A major weakness of the process is that the convergence
of this process to an equilibrium is established only under a restrictive situation in
which all commodities are "gross substitutes." Recently Scarf [18 and 19], as
mentioned before, offered a constructive method of finding an equilibrium price
vector. Electronic computers will "quickly" calculate the equilibrium price
vector39 if we know the technology available in the economy, the initial endowment of each consumer, and have certain information with regard to each consumer's demand function.40 If one can successfully compute the equilibrium price
vector, then the existence of competitive equilibrium of a given economy can also
be ascertained.
That the core can be characterized "almost" completely by competitive
equilibrium has one important corollary. That is, if we can find circumstances in
which the core is empty, then the competitive mechanism will "fail," and con-
versely, if the competitive mechanism "fails," then there is a good possibility that
the core may be empty (assuming that the number of participants is large).41
In the literature, the cases in which the competitive mechanism fails are known as
the cases of market failures.42 This suggests the close connection between the
theory of the core and the theory of market failures (and the theory of monopoly).
A famous case for market failures is the case of external economies and
diseconomies in production or consumption that affect the welfare of outsiders
regardless of their desires. A classical example of external economies is that of an
apple grower and a beekeeper in the adjacent field. External diseconomies have
attracted a greater attention recently due to smoke, noise, and many forms of air
and water pollution. Recently a fresh look at this problem has been taken by
Shapley and Shubik [25] , who considered the problem of externality from the
viewpoint of the theory of the core. They argue, for example, that in certain cases
of diseconomies, the core may be empty. Needless to say, if the core is empty, the
competitive equilibrium, in general, does not exist. Here we may quote Shapley
and Shubik ([25] , p. 681) for such an example.
The Garbage Game. Each player has a bag of garbage which he must dump in
someone's yard. The utility of having b bags dumped in one's yard is -b.
It can be shown easily that if there are more than two players in this "game," there
is no core.43
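The emptiness of the core in the Garbage Game can be checked with a short computation. The sketch below rests on assumptions that are not spelled out in the quotation: utility is treated as transferable, a coalition of s < n players is assumed to dump its own bags on outsiders while receiving the outsiders' n - s bags (so its worth is -(n - s)), and the coalition of all n players must keep all n bags (worth -n). Because this game is symmetric, its core is nonempty exactly when the equal split of the grand coalition's worth cannot be improved upon by any coalition, and the function names below are hypothetical.

```python
# Core check for the Garbage Game under an assumed characteristic function.
# v(S) = -(n - s) for a coalition of size s < n, and v(N) = -n (assumptions).

def worth(coalition_size, n):
    """Assumed worth of a coalition with the given number of members."""
    if coalition_size == n:
        return -n
    return -(n - coalition_size)

def core_is_nonempty(n):
    """For a symmetric game, test whether the equal split of v(N) is unblocked."""
    equal_payoff = worth(n, n) / n
    return all(s * equal_payoff >= worth(s, n) for s in range(1, n))

results = {n: core_is_nonempty(n) for n in range(2, 7)}

for n, nonempty in results.items():
    print(f"n = {n}: core is {'nonempty' if nonempty else 'empty'}")
```

Under these assumptions the core is nonempty only for two players, which agrees with the statement quoted above.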
Another example of market failures may be the commodity called "informa-
tion." It is true that certain kinds of information can be traded in the market just
as can any other commodity. For example, information with regard to technical
know-how is traded for certain prices, called "royalties." Similarly, insurance
premiums can be considered as the price for the information with regard to certain
uncertainties. However, there are some types of information that are not traded in
the market. The most important example would be "basic research," which in
practice is often carried out in universities through funds given by the government,
foundations, and so on, according to somewhat arbitrary principles. Clearly such
information plays a role very similar to that of the externalities discussed above.44
Even if information can be treated as any other commodity, it is still possible
that we do not have a competitive economy in practice. Inequalities in the distribu-
tion of information can cause fundamental inequalities among the members of the
society and thus generate a possibility of blocking coalitions. For example, some
forms of technical know-how that may be crucial for the production of certain
commodities45 cannot easily be imitated by others.46 Then the possessor of this
know-how can form a blocking coalition, thus giving rise to a monopoly. Similarly,
some specialized skills that are scarce relative to the size of the economy can give
rise to monopolies. Examples would be associations such as the American Medical
Association or an electricians' union.
In this connection, we may recall our previous discussion with regard to the
cost of coalitions. For example, a particular commodity such as "unskilled labor"
may be indispensable for the production of any commodity, but the cost of the
coalition of "working men of all countries" may be prohibitively high. The U.S.
textile workers may refuse to have a coalition with the Japanese textile workers.
On the other hand, in the above examples of specialized skills that are scarce
relative to the size of the economy, the cost of coalition would be relatively small.
In fact, the relative differences in the costs of coalitions among the possible
coalitions in the economy may be more important in explaining monopoly than
the scarcities of skills, technical know-how, and so on. Monopoly can arise solely
as the result of the ease of coalition and without any regard to the scarcities of
skills, know-how, and the like. Incidentally, the cost of coalition may not be con-
fined to the pecuniary cost alone; such things as differences in social class (such as
caste in India), and matriculation from a university (say, Oxbridge) can explain
various coalitions in a given society.
Some monopolies (or oligopolies) can be explained in terms of indivisibilities
of certain commodities which give rise to increasing returns to scale. The produc-
tion of electricity, automobiles, and so on, is often cited as an example. Here
again the way in which a coalition is formed to obtain the initial capacity may be
crucial in explaining the birth of these monopolies and oligopolies. Such coalitions
are not formed by considering all the possible coalitions in the society under the
assumption that the cost of forming each coalition is zero.
These considerations suggest an urgent need for introducing the cost of
coalitions in the existing theory of the core. The author has a serious doubt with
regard to the existing theory of the core, which considers all possible coalitions of
the economy but ignores completely the cost of coalitions. The explicit introduction
of the cost of coalitions will provide another important area for applications of the
theory of the n-person cooperative game. Instead of exploring allocations which
cannot be blocked by any coalition, one may be more inclined to study the coali-
tions which block certain allocations. Such a study may give rise to a fresh
approach to the study of monopolies and oligopolies and to the study of various
other forms of social establishment.

FOOTNOTES

1. Note the following features of the competitive price mechanism: (1) Each economic
agent (here the consumer) is a price taker, and (2) there exists a price system which
is the same for all economic agents. The price system can exist, of course, without
each agent being a price taker. In the box diagram model, one or both can be empowered
to name the price. For a game-theoretic exposition, see Shapley and Shubik [22].

2. Note that the core is a subset of the set of Pareto optimal points, in the sense that
the entire contract curve is now restricted to its PQ segment. In this sense, the core
is stronger than a Pareto optimum. The meaning of the core will be discussed more
fully below.
3. See Edgeworth [11], pp. 35-39, in particular.
4. Edgeworth's definitions of terms such as "recontracting," "final settlement," and so
on, appear in [11], pp. 18-19.
5. There is a slight oversell of the core theory here. The Debreu-Scarf result assumes
that there are an equal number of individuals of each "type." The power of this
assumption lies in its consequence that the individuals of the same type are treated
identically (that is, each has the same consumption bundle). If there are different
numbers of individuals of each type, then this parity (or equal treatment) result does
not follow (that is, there is a core allocation which treats individuals of the same type
differently). On the other hand, the basic premise of competitive equilibrium is
obviously that of equal treatment. Thus both the theory of competitive equilibrium
and the Debreu-Scarf result have one basic feature in common, that is, equal
treatment.
6. The game is said to be cooperative if the players are allowed to communicate before
each play and to make binding agreements about the strategies they will use. Side
payments are allowed when there is a medium of exchange (say, "money") which
is freely transferable between the players and each player's utils are linear in (or
proportional to) money. It is known that cooperative games without side payments
include cooperative games with side payments as a special case. Noncooperative
games include cooperative games as a special case.
7. See, for example, Aumann [4] and Aumann and Peleg [5].
8. For an excellent survey of the theory of n-person games without side payments, see
Aumann [3].
9. Debreu and Scarf [10] suggested a way to generalize the results so as to incorporate
production into the model. Nikaido [15], and Arrow and Hahn [1] give rigorous
formulations and proofs of such a generalization.
10. Any order-preserving (that is, monotone increasing) function of a particular utility
function can also be a utility function. For the discussion of the representability of
preferences by a continuous real-valued function ("utility function"), see Debreu
[9]. Also recall our discussion in Section B.
11. The reader should find no difficulty (in most cases) in carrying out an analysis similar
to the one which follows, replacing the function $u_i$ by the usual preference ordering.
12. Note, however, that a person will not be worse off compared to his initial endow-
ment position, since he can always refuse trading. In other words, the existence of
a coalition does not mean that it would necessarily "take effect."
13. One of the most important applications of this concept in economics is the "com-
pensation principle" problem in welfare economics. For an exposition of the
compensation principle, see Takayama [29], chapter 17. Clearly this problem offers
an interesting application of the theory of n-person (cooperative) games in economics.
14. A game (without side payments) can be defined by specifying V(S) for all coalitions S
and U(M). In the theory of games, some or all of the following assumptions are
imposed: (i) V(S) is convex, closed, and nonempty for each S; (ii) $v \in V(S)$ and
$v' \le v$ where $v' \in R^S$ imply that $v' \in V(S)$; and (iii) $V(S) \times V(S') \subset V(S \cup S')$ if
S and S' are two disjoint coalitions. Sometimes these assumptions are used as the
axioms of the theory.
15. In Figure 2.18, it is implicitly assumed that the normalization of units is made with
regard to the representation by the $u_i$'s such that $u_i(0) = 0$ for all i.
16. Let $e_S$ be a vector in $R^m$ whose ith element $e_{S,i}$ is defined as $e_{S,i} = 1$ if $i \in S$, and $= 0$
if $i \notin S$. A collection T of coalitions, {S}, is called balanced if it is possible to assign
to each S in T a nonnegative number $\delta_S$ such that $\sum_{S \in T} \delta_S e_S = e_M$. If M = {1, 2, 3},
then T = {{1, 2}, {2, 3}, {1, 3}} is balanced where the $\delta_S$ are given by
$\delta_{\{1,2\}} = \delta_{\{2,3\}} = \delta_{\{1,3\}} = 1/2$. An m-person game is said to be balanced if for every balanced
collection T, $u^S \in V(S)$ for all $S \in T$ implies $u \in V(M)$.
17. Not only did he generalize Scarf's result, but he also obtained necessary and
sufficient conditions for a nonempty core for games whose payoff sets are assumed
to be convex.
18. On the other hand, the method of proving that the core is nonempty can be utilized
in the proof of the existence of competitive equilibrium. Apparently from this view-
point, Scarf [18; 19] showed a constructive proof of the existence of competitive
equilibrium. Compare these articles with [17].

19. It is easy to see that, in establishing this theorem, no stronger assumptions than
those needed in proving Theorem 2.C.1 ("every competitive equilibrium realizes
a Pareto optimum") are required.
20. A real-valued function f defined on a convex subset Z of $R^n$ is called strictly quasi-
concave if $f(z) \ge f(z')$ implies $f[tz + (1 - t)z'] > f(z')$ for all $z, z' \in Z$ with $z \ne z'$,
and for all t, 0 < t < 1 (see Chapter 1, Section E). Using this definition, we can prove
the following: Let f be strictly quasi-concave on Z, and let $z_1, z_2, \ldots, z_m$ be
m points in Z. Suppose that one of these m vectors, say $z_{j_0}$, is distinct from any
other points with $f(z_j) \ge f(z_{j_0})$ for all j = 1, 2, ..., m. Then we have
$f(\theta_1 z_1 + \theta_2 z_2 + \cdots + \theta_m z_m) > f(z_{j_0})$ for all $\theta_j > 0$, $\sum_{j=1}^m \theta_j = 1$, such that $z_{j_0} \ne \sum_{j=1}^m \theta_j z_j$.
21. In this sense, Theorem 2.C.4 may be termed the parity theorem or the equal treatment
theorem.
22. In other words, the coalition of the underdogs can block the original allocation
by redistributing their own initial holdings among themselves, where the "underdog"
of the ith group now receives $(x_{i1}/r + \cdots + x_{ir}/r)$. Such a coalition is feasible as a
result of (18) and (17).

23. Such a set is the convex hull generated by the sets F,.
24. That $p \ne 0$ follows directly from the separation theorem. That $p \ge 0$ is then obvious
from (24).
25. This use of the cheaper-point assumption is standard practice. See the proof of the
corollary to Theorem 2.C.2.
26. To see this, let $z^2 = \beta z^1$ for some $\beta$ and observe that $u[(z^1 + z^2)/2] = u(z^1)/2 + u(z^2)/2$
using linear homogeneity. But this contradicts strict concavity. Note also that if
$z^2 = 0$ and if u(0) = 0, then we again have $u[(z^1 + z^2)/2] = u(z^1)/2 + u(z^2)/2$,
contradicting strict concavity.
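27. To prove that u(z) is strictly concave for all nonproportional vectors, show that
$u'(z^1) \cdot (z^2 - z^1) > u(z^2) - u(z^1)$ for all nonproportional $z^1 > 0$ and $z^2 > 0$. To
show this, utilize the following well-known inequality:
$\theta_1^{\alpha_1} \theta_2^{\alpha_2} \cdots \theta_n^{\alpha_n} \le \alpha_1 \theta_1 + \alpha_2 \theta_2 + \cdots + \alpha_n \theta_n$
(the equality holds if and only if $\theta_1 = \theta_2 = \cdots = \theta_n$), where
$\sum_{i=1}^n \alpha_i = 1$, $\alpha_i > 0$, and $\theta_i \ge 0$ for all i.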
28. A similar result is obtained in E. Eisenberg, "Aggregation of Utility Functions,"
Management Science, 7, July 1961.
29. In terms of the box diagram, this means that the contract curve coincides with the
diagonal line of the box.
30. I learned recently that a similar example was discussed by Herbert Scarf in his
lecture at Yale.
31. Compute $\phi''(a)$ and examine $\phi''(a) < 0$ for all a. Alternatively, write $\phi(a) = \Phi$[f(a),
g(a)], where f (a) = (x I /x + x 2a/x )a and g(a) _ (y I /y + y 2a/y )' ". Observe that
both f(a) and g(a) are strictly concave. Also note that $\Phi$ is strictly concave and monotone
(that is, $\partial\Phi/\partial f > 0$ and $\partial\Phi/\partial g > 0$). These establish the strict concavity of $\phi$.
32. First note that the Cobb-Douglas form of the utility function guarantees an interior
solution. The condition requiring the tangency between an indifference curve and the
budget line is stated as $p = [a/(1 - a)](y_i/x_i)$. This together with $p x_i + y_i = p \bar{x}_i + \bar{y}_i$
constitutes a necessary and sufficient quasi-saddle-point characterization of the
solution and yields (58-a) and (58-b).
33. Green [13] contains such an example, which he acknowledges to Alan Kirman.
Green [13] then argues that there are bounds on the inequality of treatments. See
also Vind [31].
34. Let $X \subset R^n$ and $a \in R^n$. Then d(a, X) denotes the "distance" between a and X, meaning
$d(a, X) = \inf_{x \in X} \| x - a \|$.
35. Since the reader is not expected to know measure theory, the subsequent paragraph
may be omitted.
36. See R. J. Aumann, "Existence of Competitive Equilibrium in Markets with a Continuum
of Traders," Econometrica, 34, January 1966. A crucial assumption is that
of "monotonicity" in the sense that $x_i \ge x_i'$, $x_i \ne x_i'$, implies $u_i(x_i) > u_i(x_i')$, and not the (quasi-)
concavity of the $u_i$'s.
37. Suppose that there exists x E E, but x 0 C,. Since x 0 C,, there exists x' and S c I
with µ(S) > 0, such that u;(x;) > u.(x;), i E S and Js x' = Js x. But, since x E E,,
there exists p ? 0 such that p x; > p x;, i E S, which implies p [ S xidi] >
I [ )s x;di] This contradicts the feasibility condition of the coalition S, Js x;di =
.

Js x;di.
38. For example, Vind [ 30] showed a different derivation of Aumann's result E, = C, in
[2]. Hildenbrand [14] introduced production, while Aumann and Vind assumed a
pure exchange economy. Moreover, Hildenbrand showed that the monotonicity assumption
can be relaxed and that the consumption set does not have to be restricted to
the nonnegative orthant of $R^n$.
39. Scarf writes ([19], p. 669):
A considerable body of computational experience with larger models has
already been gathered. Over one hundred examples have been tested, ranging
from three to twenty sectors. The computational time, which is dependent on
the number of sectors, has never exceeded five minutes on an IBM 7094, and in
most cases is substantially smaller.
40. It is assumed that each consumer has a set of demand functions which can be expressed
as $x_{ij} = a_{ij} f_i(p)/p_j^{b_i}$, where $x_{ij}$ denotes consumer i's demand function of the
jth commodity, $a_{ij}$ measures the intensity of i's demand for j, and $b_i$ is the elasticity of
substitution for i. For computation, it is required that the $a_{ij}$'s and the $b_i$'s be known.
41. It is to be pointed out that competitive equilibrium may fail to exist for various
reasons, such as the nonconvexity of preferences and of the aggregate production set.
But the core can still be nonempty or at least "approximately" nonempty as in the
Shapley-Shubik theory of the $\epsilon$-core [23]. This is a great merit of the theory of the
core. But it is also to be noted that the core can be very large, and its practical signi-
ficance may be greatly hampered.
42. The purpose of our discussion here is not to make a comprehensive survey of the
theory of market failures. For an early exposition of this topic, see F. M. Bator, "The
Anatomy of Market Failure," Quarterly Journal of Economics, LXXII, August 1958.
See also K. Imai, H. Uzawa, R. Komiya, T. Negishi, and Y. Murakami, Price Theory
II (in Japanese), Tokyo, Iwanami, 1971, Chapter 7.
43. Shapley and Shubik [25], on the other hand, indicated that the core will exist in the
case of external economies if they are internalized by being listed as explicit com-
modities. This apparent asymmetry between external economies and diseconomies
makes their result highly suspect, or at least urges us to consider this problem further.
See discussions on Shapley and Shubik [25] by K. J. Arrow and T. Rader, in American
Economic Review, LX, May 1970, pp. 462-464. Arrow, for example, suspects that the
lack of the core in the above garbage game may really be due to a possible lack of con-
vexity in the production set, instead of the presence of an external diseconomy (p.
463).
44. Actually there are some important cases of market failures that we have not discussed
here. For example, "the market fails" as a result of certain "public goods" in which
the beneficiaries of these goods cannot be distinguished from the nonbeneficiaries
(the lack of "exclusion") as a result of the lack of "future markets" for certain com-
modities, or simply as a result of future generations being unable to participate in the
market.
45. The information may be indispensable for the production of a certain new com-
modity, or the information may provide a significant cost saving method of produc-
tion of an existing commodity.
46. The information may be protected by patent rights and the possessor of the right may
refuse to sell the information; or the possessor of the information may simply hide it,
for publication of the information through patent rights may cause his techniques to
be imitated.

REFERENCES
1. Arrow, K. J., and Hahn, F. H., General Competitive Analysis, San Francisco,
Holden-Day, 1971.
2. Aumann, R. J., "Markets with a Continuum of Traders," Econometrica, 32,
January-April 1964.
3. Aumann, R. J., "A Survey of Cooperative Games without Side Payments," Essays in
Mathematical Economics in Honor of Oscar Morgenstern, ed. by M. Shubik, Princeton, N.J.,
Princeton University Press, 1967.
4. Aumann, R. J., "The Core of a Cooperative Game without Side Payments," Transactions
of the American Mathematical Society, XCVIII, March 1961.
5. Aumann, R. J., and Peleg, B., "von Neumann-Morgenstern Solutions to Cooperative
Games without Side Payments," Bulletin of the American Mathematical Society,
LXVI, May 1960.
6. Billera, L., "Some Problems on the Core of an N-Person Game without Side-
Payments," SIAM Journal on Applied Mathematics, 18, May 1970.
7. Chipman, J. S., "The Nature and Meaning of Equilibrium Economic Theory," in
Functionalism in Social Sciences, American Academy of Political and Social Science,
Philadelphia, February 1965.
8. Debreu, G., "On a Theorem of Scarf," Review of Economic Studies, XXX, October
1963.
9. Debreu, G., Theory of Value, New York, Wiley, 1959.
10. Debreu, G., and Scarf, H., "A Limit Theorem on the Core of an Economy," Inter-
national Economic Review, 4, September 1963.
11. Edgeworth, F. Y., Mathematical Psychics, London, Kegan Paul, 1881.
12. Gillies, D. B., Some Theorems on N-Person Games, Ph.D. Thesis, Princeton Univer-
sity, 1953.
13. Green, J., "A Note on the Cores of Trading Economies," unpublished manuscript,
University of Rochester, May 1969.
14. Hildenbrand, W., "On the Core of an Economy with a Measure Space of Economic
Agents," Review of Economic Studies, XXXV, October 1968.
15. Nikaido, H., Convex Structures and Economic Theory, New York, Academic Press,
1968, esp. sec. 17.2.
16. Scarf, H. E., "An Analysis of Markets with a Large Number of Participants," in
Recent Advances in Game Theory, ed. by M. Maschler, Princeton, N.J., Princeton
University Press, 1962.
17. Scarf, H. E., "The Core of an N-Person Game," Econometrica, 35, January 1967.
18. Scarf, H. E., "On the Computation of Equilibrium Prices," in Ten Economic Studies in the
Tradition of Irving Fisher, New York, Wiley, 1967.
19. Scarf, H. E., "An Example of an Algorithm for Calculating General Equilibrium
Prices," American Economic Review, LIX, September 1969.
20. Shapley, L. S., "On Balanced Sets and Cores," Naval Research Logistics Quarterly, 14,
December 1967.
21. Shapley, L. S., and Shubik, M., "Solution of N-Person Games with Ordinal Utilities"
(abstract), Econometrica, 21, April 1953.
22. Shapley, L. S., and Shubik, M., "Concepts and Theories of Pure Competition," Essays in
Mathematical Economics in Honor of Oscar Morgenstern, ed. by M. Shubik, Princeton, N.J.,
Princeton University Press, 1967.
23. Shapley, L. S., and Shubik, M., "Quasi-cores in a Monetary Economy with Nonconvex
Preferences," Econometrica, 34, October 1966.
24. Shapley, L. S., and Shubik, M., "On Market Games," Journal of Economic Theory, 1, June 1969.
25. Shapley, L. S., and Shubik, M., "On the Core of an Economic System with Externalities,"
American Economic Review, LIX, September 1969.
26. Shapley, L. S., and Shubik, M., "Pure Competition, Coalitional Power, and Fair Division,"
International Economic Review, 10, October 1969.

27. Samuelson, P. A., "Evaluation of Real National Income," Oxford Economic Papers,
2, January 1950.
28. Shubik, M., "Edgeworth Market Games," in Contributions to the Theory of Games,
IV, ed. by A. W. Tucker, and R. D. Luce, Princeton, N.J., Princeton University Press,
1959.
29. Takayama, A., International Trade: An Approach to the Theory, New York, Holt,
Rinehart and Winston, 1972.
30. Vind, K., "Edgeworth-Allocations in an Exchange Economy with Many Traders,"
International Economic Review, 5, May 1964.
31. Vind, K., "A Theorem on the Core of an Economy," Review of Economic Studies,
XXXII, January 1965.
32. von Neumann, J., and Morgenstern, O., Theory of Games and Economic Behavior, 3rd
ed., Princeton, N.J., Princeton University Press, 1953.

Section D
DEMAND THEORY

The purpose of this section is to study the theory of demand for a "competitive"
consumer. Traditionally (as explained in Hicks [5]), this theory is developed
by postulating for each consumer a preference ordering representable by a real-
valued "utility" function. Each consumer is supposed to maximize his utility
subject to his budget constraint. The maximality condition (the first-order condi-
tion) provides the demand function that relates the individual's demand for a com-
modity to the prices of all commodities and his income. A comparative statics
analysis with regard to the maximality condition will yield the Hicks-Slutsky equation
and the properties of the substitution terms. The other approach, which is due
to Samuelson [13], is called the revealed preference theory. This theory presupposes
neither the utility function nor the preference ordering. It goes directly to the
demand for commodities. If a certain bundle of commodities is actually purchased
by a certain consumer at a certain price vector, it is supposed to "reveal" that he
prefers this bundle of commodities to the bundles of goods which cost less than, or
the same amount as, the bundle purchased. Using the consistency condition which
is essentially due to this observation (later called the weak axiom of revealed pref-
erence), Samuelson proved most of the properties of the demand function,
especially the properties of the "substitution terms." However, he failed to prove
the symmetry of the substitution matrix, which was later proved by Houthakker
[6] by imposing another condition (called the "strong axiom of revealed pref-
erence"). The natural question which arises is, What is the relation between the
traditional approach and the revealed preference approach?
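
As a rough illustration of the consistency condition described above, the following Python sketch checks a finite list of observed (price vector, bundle) pairs for violations of the weak axiom. The names `dot` and `violates_warp` and the sample data are hypothetical and do not come from the text. The check assumes the usual finite-data reading of the axiom: if bundle i is chosen when bundle j is affordable, then bundle i must not be affordable when bundle j is chosen.

```python
from itertools import combinations

def dot(p, x):
    """Value of bundle x at prices p."""
    return sum(pi * xi for pi, xi in zip(p, x))

def violates_warp(observations):
    """Return index pairs (i, j) of observations that contradict each other.

    Each observation is a pair (price vector, chosen bundle).
    """
    violations = []
    for (i, (p_i, x_i)), (j, (p_j, x_j)) in combinations(enumerate(observations), 2):
        if x_i == x_j:
            continue
        # x_i is revealed preferred to x_j if x_j was affordable when x_i was bought.
        i_over_j = dot(p_i, x_j) <= dot(p_i, x_i)
        j_over_i = dot(p_j, x_i) <= dot(p_j, x_j)
        if i_over_j and j_over_i:
            violations.append((i, j))
    return violations

# Hypothetical data: two observations each.
consistent = [((1.0, 2.0), (4.0, 1.0)), ((2.0, 1.0), (1.0, 4.0))]
inconsistent = [((1.0, 2.0), (1.0, 4.0)), ((2.0, 1.0), (4.0, 1.0))]

print(violates_warp(consistent))
print(violates_warp(inconsistent))
```

In the second data set each bundle is affordable when the other is chosen, so the pair is reported as a violation.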

(i) Given a demand function, can we tell whether it could be induced by a utility
function? This question is called the integrability problem and has recently been
studied by Samuelson [14], Houthakker [6], Uzawa [15], and others. The
converse of this problem is the traditional analysis explained in Hicks [5]. An
excellent survey article on the ("local") integrability problem is now available in
Hurwicz [7].
(ii) Given a demand function, can we tell whether it could be induced by a preference
relation? Aspects of this problem have been studied by Uzawa [15] and Arrow
[1]. The converse of this problem was studied by McKenzie [8]. As discussed in
Section B of this chapter, we can deduce the utility function from a preference
relation under certain assumptions. Then the converse problem will be the same
as the converse of problem (i).

One highlight of these discussions is seen in a recent elegant paper by Richter
[12]. One of his conclusions (his theorem 1, p. 639) is that a consumer acts "rationally"
according to his preference ordering, which is a total quasi-ordering, if and
only if he is "congruous" in the sense that this action satisfies the Samuelson-
Houthakker revealed preference axiom in a more general sense (which essentially
takes account of satiation and the nonuniqueness of demand function). His
theorem on the integrability problem, concerning whether the strong axiom of
revealed preference is sufficient to suppose that a consumer acts as though he has a
real-valued "utility" function ("representable" consumer preferences), needs
further assumptions. The purpose of this section is not to give an exposition of this
article, although we strongly encourage the reader to read this elegant master-
piece. Here we want to stick to a more or less traditional approach, starting from
a preference relation (if not a utility function). By imposing a preference relation
which is total, over an individual's consumption set, we want to obtain the Hicks-
Slutsky equation-in particular, the properties of the substitution terms. This is
the approach adopted by McKenzie [8] and Yokoyama [16]. The author believes
that this will be very helpful in understanding the basic structure of modern
demand theory, for the McKenzie approach, although it is essentially based on
the traditional approach, is greatly influenced by the revealed preference theory
and appears to have influenced recent discussions of the integrability problem
(for example, a paper by Hurwicz and Richter in [3]). On the other hand, the
reader who is interested in the integrability problem (local or global) is referred
to some articles in [3] as well as [12].
Another important aspect of demand theory is the continuity property of the
demand function. First, the demand function is not necessarily single-valued.
Hence a new concept of continuity is needed for multivalued functions. In partic-
ular, we will prove that under certain conditions, the demand function is
"upper semicontinuous." This property is obtained by presupposing the pref-
erence relation on the consumption set. As we see in the next section, this will play
a very important role in the proof of the existence of competitive equilibria. In
Debreu [4], this property is proved by using a theorem which Berge [2] called the
maximum theorem. We do not use this theorem in our proof. We derive it directly
from our consideration of a preference relation. In the Appendix, we attempt an
expository account of the "maximum theorem." In addition to the upper semicontinuity
of the demand function, we establish some other important facts in
demand theory. This section has two parts. In the first part we want to prove the
upper semicontinuity of the demand function, and in the second part, we obtain
the Hicks-Slutsky equation and related results.1
Let X be the consumption set of a certain individual. We assume that it is a
compact2 subset of $R^n$ and totally quasi-ordered by $\succsim$.3 At the outset, we do not
assume that X is convex. Let $p \in R^n$, $p \ge 0$, be the price vector which prevails in the
market. We suppose that this individual is a "competitive" consumer in the sense
that he cannot affect the prices that prevail in the market. We also suppose that his
behavioral rule as a consumer is to maximize his satisfaction from the consumption
bundles that he can afford with his income. Let M, a positive number, denote his
(money) income. We are tempted to define the "budget set" H(p, M) by
$\{x : x \in X, p \cdot x \le M\}$. But this definition is not quite right, for we have not
specified the domain of the price vector p and income M, so that H(p, M) can be
empty. The set H(p, M) would be empty if the prices became so high compared
to his income that he could not afford to buy anything in his consumption set X.
He may starve to death. To remedy this difficulty, we first define the set S of
price-income pairs (p, M) by $S = \{(p, M) \in R^{n+1} : \Theta(p, M) \ne \emptyset\}$, where
$\Theta(p, M) = \{x : x \in X, p \cdot x \le M\}$. We then assume that the set S is nonempty, and
we have the following definition.

Definition: Let H be a multivalued function from S into X such that
$H(p, M) = \{x : x \in X, p \cdot x \le M\}$. The function H is called the budget function and H(p, M) is
called the budget set when the consumer's income is M and price p prevails. In
many cases, it is more convenient to explicitly write $M = p \cdot \bar{x}$, where $\bar{x}$ is the
endowment vector of the consumer. Then we define his budget set H(p) by
$H(p) = \{x : x \in X, p \cdot x \le p \cdot \bar{x}\}$, where p is taken in the subset of $R^n$ in which H(p) is
nonempty.

Definition (demand function):4 The demand function is a multivalued function,
F, from $S \subset R^{n+1}$ into X such that $x \in F(p, M)$ means $p \cdot x \le M$ and $x \succsim z$
for all z such that $p \cdot z \le M$. We assume that F(p, M) is nonempty.
REMARK: If $x \in F(p, M)$, it means that $x \in H(p, M)$ and $x \succsim z$ for all
$z \in H(p, M)$. Figure 2.21 illustrates the points of F(p, M). The left-hand
diagram illustrates the case when F(p, M) is single-valued and the right-hand
diagram illustrates the case when F(p, M) is multivalued. The curves, which
are drawn convex to the origin in the diagram, indicate the utility indif-
ference curves. Although it is not shown in Figure 2.21, it is possible to have
$x \in$ interior H(p, M) for some $x \in F(p, M)$; that is, x can be a satiation point.
REMARK: When $M = p \cdot \bar{x}$, we may write F(p, M) as F(p). To obtain the
Hicks-Slutsky equation and the traditional results in demand theory, it
would be more convenient to use F(p, M). However, to discuss the upper
semicontinuity of the demand function (and the lower semicontinuity of the
budget set), it would be more useful, especially in connection with Section
E, to consider F(p) instead of F(p, M) (and H(p) instead of H(p, M)).

Figure 2.21. Demand Functions ($X = \Omega^2$).

The following theorem is an immediate consequence of the definition of the
demand function.

Theorem 2.D.1: Let F(p, M) be a single-valued demand function. Then F(p, M)
is positively homogeneous of degree zero in (p, M); that is, x = F(p, M) means
x = F(tp, tM) for any positive number t.
PROOF: Since F is single-valued, we may write x = F(p, M). From the
definition of F, x = F(p, M) if and only if $x \succsim z$ for all z with $p \cdot z \le M$. But
$p \cdot z \le M$ if and only if $tp \cdot z \le tM$, where t is any positive number. Hence
x = F(tp, tM). (Q.E.D.)
REMARK: From the above proof it is also clear that if $x \in F(p, M)$ (that
is, F is multivalued), then $x \in F(tp, tM)$ for any t > 0. Also if $x \in H(p, M)$,
then $x \in H(tp, tM)$.
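
The homogeneity property can be checked numerically on a small example. The Python sketch below is only an illustration under assumptions that are not in the text: the consumption set is replaced by a finite integer grid, preferences are represented by the hypothetical utility function `utility(x) = x1 * x2`, and prices and income are integers so that all comparisons are exact.

```python
from itertools import product

# Finite stand-in for a compact consumption set X (an assumption of this sketch).
X = list(product(range(11), repeat=2))

def utility(x):
    """Hypothetical utility representing the preference ordering."""
    return x[0] * x[1]

def budget_set(p, M):
    """H(p, M): bundles in X costing no more than M."""
    return [x for x in X if p[0] * x[0] + p[1] * x[1] <= M]

def demand(p, M):
    """F(p, M): the set of best affordable bundles (possibly several)."""
    H = budget_set(p, M)
    best = max(utility(x) for x in H)
    return {x for x in H if utility(x) == best}

p, M, t = (2, 3), 24, 5
scaled_p = tuple(t * pi for pi in p)

print(demand(p, M))
print(demand(scaled_p, t * M))
```

Scaling prices and income by the same positive factor leaves the budget set, and hence the demand set, unchanged, which is exactly the argument of the proof.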
We introduce the following two basic assumptions, (A-1) and (A-2), of
demand theory.
(A-1) (continuity of $\succsim$) Let $\{x^q\}$ and $\{z^q\}$ be two arbitrary sequences in X such
that $x^q \to x$ and $z^q \to z$. Then $x^q \succsim z^q$ for all q implies $x \succsim z$.
REMARK: The set X is compact, hence closed. Thus $x^q \to x$ and $z^q \to z$
imply $x \in X$ and $z \in X$. We may rewrite (A-1) as follows:
(A-1'): Let $\{x^q\}$ be a sequence in X such that $x^q \to x$. If $x^q \succsim z$ for all q, then
$x \succsim z$. And if $z \succsim x^q$ for all q, then $z \succsim x$.
(A-2) (local nonsatiation) Let $x \in F(p, M)$. Then there exists a positive
number $\delta$ such that for any $\epsilon$, $0 < \epsilon < \delta$, there exists a point $x' \in B_\epsilon(x)$ and $x' \in X$
with $x' \succ x$, where $B_\epsilon(x)$ is an open ball about x with radius $\epsilon$, and
$B_\delta(x) \cap (X \setminus x) \ne \emptyset$.
REMARK: This is the same assumption which was adopted in the previous
section. The following concept was also used in the previous section.

Definition: $C_x = \{z : z \in X, z \succsim x\}$ is called the no-worse-than-x set.

Lemma 2.D.1: The set $C_x$ is closed, if (A-1) holds.

PROOF: Let $z^q \to z$ where $z^q \in C_x$. Then $z \in X$ since X is a closed set. Also
$z^q \succsim x$ for all q implies $z \succsim x$ from (A-1). Hence $z \in C_x$, so that $C_x$ is a closed set. (Q.E.D.)
REMARK: Since X is bounded, $C_x$ is bounded, hence compact.

Definition: $M_x(p) = \min p \cdot z$ for $z \in C_x$.

REMARK: Since $C_x$ is compact, there exists a z in $C_x$ which minimizes $p \cdot z$.
The function $M_x(p)$ is called the minimum expenditure function (which
achieves a level of satisfaction that is at least as great as the satisfaction
obtained from x at price p).5 This function $M_x(p)$ is illustrated in Figure 2.22.
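
The two definitions above can be made concrete with a small computation. The following Python sketch is an illustration only: it assumes a finite integer grid in place of the compact set X and a hypothetical utility function `utility(z) = z1 * z2` representing the preference ordering, neither of which appears in the text. It builds the no-worse-than-x set and minimizes expenditure over it.

```python
from itertools import product

# Finite stand-in for the compact consumption set X (assumption of this sketch).
X = list(product(range(11), repeat=2))

def utility(z):
    """Hypothetical utility representing the preference ordering."""
    return z[0] * z[1]

def no_worse_than(x):
    """C_x: bundles in X that are at least as good as x."""
    return [z for z in X if utility(z) >= utility(x)]

def min_expenditure(p, x):
    """Return M_x(p) together with the bundles in C_x that attain it."""
    C_x = no_worse_than(x)
    cost = lambda z: p[0] * z[0] + p[1] * z[1]
    m = min(cost(z) for z in C_x)
    return m, sorted(z for z in C_x if cost(z) == m)

x = (6, 4)
print(min_expenditure((2, 3), x))
print(min_expenditure((1, 1), x))
```

At prices (1, 1) the minimum is attained at several bundles, which shows that the minimizer need not be unique even though the value $M_x(p)$ is.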

Lemma 2.D.2: Given $p \in R^n$, select an $x \in X$ such that $p \cdot x < \alpha$ where $\alpha \in R$.
Then (A-2) implies that there exists an $x' \in X$ such that $x' \succ x$ and $p \cdot x' < \alpha$.
PROOF: By (A-2), there exists a $\delta > 0$ such that for all $\epsilon$, $0 < \epsilon < \delta$, there
exists an $x' \in B_\epsilon(x)$, $x' \in X$, with $x' \succ x$. Because of the continuity of the
value function $p \cdot z$, we may choose x' close enough to x so that $p \cdot x' < \alpha$.
(Q.E.D.)

Figure 2.22. An Illustration of $M_x(p)$.

REMARK: The idea of this lemma was used in the proof of Theorem 2.C.1,
especially in the proof of the lemma preceding the theorem.

Theorem 2.D.2: Let $x \in F(p, M)$. If (A-2) holds for x, then $p \cdot x = M$. If (A-2)
holds for all $z \in F(p, M)$, then $p \cdot x = M_x(p)$.
PROOF:
(i) By definition of F(p, M), $p \cdot x \le M$. Suppose $p \cdot x < M$. Then, using
Lemma 2.D.2, (A-2) implies that there exists an $x' \in X$, such that $x' \succ x$
with $p \cdot x' < M$. This contradicts the definition of F(p, M). Hence
$p \cdot x = M$.
(ii) By the definition of $M_x(p)$, $M_x(p) \le p \cdot x = M$. Suppose $M_x(p) < M$. Then
$p \cdot x'' < M$ for some $x'' \succsim x$. Then $x'' \in F(p, M)$. Hence by (A-2), there
exists $x' \in X$, such that $x' \succ x''$ with $p \cdot x' < M$. This contradicts the
definition of F(p, M). Therefore we have $M_x(p) = M$. (Q.E.D.)
REMARK: Theorem 2.D.2 means, among other things, that local nonsatiation
at a chosen point implies that all the income is spent.
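
The role of nonsatiation in this theorem can be seen by comparing two hypothetical consumers. The Python sketch below is an illustration under assumptions not made in the text: X is replaced by a finite integer grid (so nonsatiation can only hold approximately), and the two utility functions `monotone` and `satiated` are invented for the example. The first consumer always prefers more of both goods; the second has a bliss point at (2, 2).

```python
from itertools import product

# Finite stand-in for the compact consumption set X (assumption of this sketch).
X = list(product(range(11), repeat=2))

def cost(p, x):
    return p[0] * x[0] + p[1] * x[1]

def demand(u, p, M):
    """F(p, M) for the preference ordering represented by u."""
    H = [x for x in X if cost(p, x) <= M]
    best = max(u(x) for x in H)
    return [x for x in H if u(x) == best]

def min_expenditure(u, p, x):
    """M_x(p): cheapest way of being no worse off than at x."""
    return min(cost(p, z) for z in X if u(z) >= u(x))

def monotone(x):
    """Hypothetical nonsatiated utility."""
    return x[0] * x[1]

def satiated(x):
    """Hypothetical utility with a bliss point at (2, 2)."""
    return -((x[0] - 2) ** 2 + (x[1] - 2) ** 2)

p, M = (2, 3), 24
for u in (monotone, satiated):
    for x in demand(u, p, M):
        print(u.__name__, x, cost(p, x), min_expenditure(u, p, x))
```

For the nonsatiated consumer the chosen bundle exhausts income and its cost equals the minimum expenditure. For the satiated consumer the chosen bundle is the bliss point, which lies in the interior of the budget set, so income is not fully spent.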
To study the continuity property of H(p), we introduce the following
assumptions.
(A-3) (interior point) The set X contains an interior point $\bar{x}$.
(A-4) The set X is convex.
REMARK: As remarked before, assumption (A-3) amounts to the cheaper-point
assumption (that is, there exists an $\hat{x}$ in X such that $p \cdot \hat{x} < p \cdot \bar{x}$).
Assumption (A-4) implies perfect divisibility of every commodity. This is
a restrictive although very useful assumption.
We now explain important mathematical concepts, upper and lower semi-
continuity.

Definition: Let $\phi$ be a multivalued function from $X \subset R^m$ into $Y \subset R^n$ where
Y is assumed to be compact. Let $x^0$ be a point in X.

(i) Let $\{x^q\}$ be a sequence in X such that $x^q \to x^0$, and let $\{y^q\}$ be a sequence
in Y such that $y^q \in \phi(x^q)$. If $y^q \to y^0$ implies $y^0 \in \phi(x^0)$, then $\phi$ is called
upper semicontinuous at $x^0$.

(ii) Let $\{x^q\}$ be a sequence in X such that $x^q \to x^0$. If $y^0 \in \phi(x^0)$ implies "there
exists a sequence $\{y^q\}$ in Y such that $y^q \to y^0$ and $y^q \in \phi(x^q)$," then $\phi$ is called
lower semicontinuous at $x^0$.
(iii) The function $\phi$ is called continuous at $x^0$ if it is both upper semicontinuous and
lower semicontinuous at $x^0$.

REMARK: The above concepts are illustrated in Figure 2.23. The graph of
$\phi$ is the shaded region, boundary included; $\phi(x^0)$ is the interval [a, b].

Figure 2.23. An Illustration of Semicontinuity (two panels: upper semicontinuity and lower semicontinuity).

REMARK: Semicontinuity and continuity on $X$ can be defined as semicontinuity
and continuity at every point of $X$.
Assuming that the range set $Y$ is compact, the following statements follow
easily from the above definition (see Berge [2], chapter VI, for example).

(i) The function $\phi$ is upper semicontinuous on $X$ if and only if its graph,
$\{(x, y): x \in X \text{ and } y \in \phi(x)\}$, is closed.
(ii) An upper (resp. lower) semicontinuous function of a continuous function is
upper (resp. lower) semicontinuous.
(iii) The Cartesian product of upper (resp. lower) semicontinuous functions $\phi_j$,
that is, $(\phi_1, \phi_2, \ldots, \phi_m)$, is also upper (resp. lower) semicontinuous.

Theorem 2.D.3: The function $H(p)$ is lower semicontinuous for $p \geq 0$, under (A-3)
and (A-4).⁶
PROOF: By (A-3) there exists an $\hat{x} \in X$ with $p \cdot \hat{x} < p \cdot \bar{x}$. Let $z \in H(p)$,
and let $p^q \to p$. For large enough $q$, we define $z^q \equiv t^q z + (1 - t^q)\hat{x}$,
where $t^q$ is maximal for $t^q \in [0, 1]$ such that $z^q \in H(p^q)$. We claim
such a $\{z^q\}$ is the sequence we want to find. That is, we want to show
$z^q \to z$ as $p^q \to p$. Note that $z^q \to z$ if and only if $t^q \to 1$. (Hence if we show
$t^q \to 1$ as $p^q \to p$, we are done.) If $t^q = 1$ (that is, if $p^q \cdot z \leq p^q \cdot \bar{x}$) for large
enough $q$, we are done. Hence it suffices to consider the case in which
$t^q < 1$ for large enough $q$. Note that $t^q < 1$ implies $p^q \cdot z^q = p^q \cdot \bar{x}$ for
$q$ large enough, or else $t^q$ is not a maximum. Suppose $t^q \not\to 1$. Since the
interval $[0, 1]$ is a compact set, there exists a subsequence of $\{t^q\}$, say
$\{t^s\}$, such that $t^s \to t$, where $0 \leq t < 1$. Since $t < 1$, $p^s \cdot z^s = p^s \cdot \bar{x}$ for
sufficiently large $s$. Write $z^s \equiv t^s z + (1 - t^s)\hat{x}$. Since $t^s \to t$, $z^s \to z^{\infty}$, where
$z^{\infty} \equiv tz + (1 - t)\hat{x}$. Since $p^s \cdot z^s = p^s \cdot \bar{x}$ for large $s$, we have

(*) $t\, p \cdot z + (1 - t)\, p \cdot \hat{x} = p \cdot z^{\infty} = p \cdot \bar{x}$

as $s \to \infty$. But since $p \cdot \hat{x} < p \cdot \bar{x}$, (*) yields $p \cdot z > p \cdot \bar{x}$, contradicting
$z \in H(p)$. Hence we must have $t^q \to 1$. (Q.E.D.)

[Figure 2.24. An Illustration of the Proof of Theorem 2.D.3.]
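
The construction used in this proof, $z^q = t^q z + (1 - t^q)\hat{x}$ with $t^q$ maximal, can be traced on a small example. The sketch below is illustrative only and rests on assumptions not in the text: the consumption set is taken to be the nonnegative orthant of $R^2$ (so convex combinations stay in $X$), and the endowment `X_BAR`, cheaper point `X_HAT`, target point `Z`, and price path `price(q)` are hypothetical values chosen for the illustration.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product p . x."""
    return sum(ai * bi for ai, bi in zip(a, b))


def max_t(p: Sequence[float], z: Sequence[float],
          x_hat: Sequence[float], x_bar: Sequence[float]) -> float:
    """Largest t in [0, 1] with p . (t z + (1 - t) x_hat) <= p . x_bar."""
    wealth, cost_z, cost_hat = dot(p, x_bar), dot(p, z), dot(p, x_hat)
    if cost_z <= wealth:
        return 1.0  # z itself is affordable at price p
    if cost_hat >= wealth:
        raise ValueError("x_hat is not a cheaper point at this price")
    # The cost is linear in t, so the budget constraint binds at this t.
    return (wealth - cost_hat) / (cost_z - cost_hat)


def z_q(p: Sequence[float], z: Sequence[float],
        x_hat: Sequence[float], x_bar: Sequence[float]) -> List[float]:
    """The point z^q = t z + (1 - t) x_hat for the maximal t."""
    t = max_t(p, z, x_hat, x_bar)
    return [t * zi + (1.0 - t) * hi for zi, hi in zip(z, x_hat)]


# Hypothetical data: limit price p = (1, 1), endowment (1, 1), cheaper point (0, 0).
X_BAR = [1.0, 1.0]
X_HAT = [0.0, 0.0]
Z = [2.0, 0.0]  # lies in H(p) at the limit price, since p . Z = p . X_BAR


def price(q: int) -> List[float]:
    """A price sequence p^q converging to (1, 1)."""
    return [1.0 + 1.0 / q, 1.0]


if __name__ == "__main__":
    for q in (1, 10, 100, 10**6):
        print(q, max_t(price(q), Z, X_HAT, X_BAR), z_q(price(q), Z, X_HAT, X_BAR))
```

In this example $t^q$ rises toward 1 and $z^q$ approaches $z$ while remaining in the budget set at $p^q$, as the proof asserts. If the cheaper point is removed (so that `cost_hat >= wealth`), the construction breaks down, which is the role (A-3) plays.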
REMARK: The graph of $H(p)$ is obviously closed. Hence $H(p)$ is upper
semicontinuous.⁷ Thus from the above theorem, $H(p)$ is in fact continuous.
REMARK: Note that the cheaper-point assumption (or the interior-point
assumption) plays a crucial role in the above theorem. If the consumer
starves to death when the price moves beyond a certain point (hence no
"cheaper point" in his consumption set), his budget function H(p) would
become discontinuous.
Write $F(p)$ for $F(p, M)$ where $M = p \cdot \bar{x}$. Then we can prove the following
theorem.

Theorem 2.D.4: The demand function F(p) is upper semicontinuous with respect
to p, if H(p) is lower semicontinuous in p, and (A-1) holds.
PROOF: Consider a sequence $\{p^q\}$ such that $p^q \to p$ as $q \to \infty$. Let $z$ be an
arbitrary point of $H(p)$. Then as a result of the lower semicontinuity of
$H(p)$, there exists a sequence $\{z^q\}$ such that $z^q \in H(p^q)$ and $z^q \to z$. Let
$x^q \in F(p^q)$. Then by the definition of $F$, $x^q \succsim z^q$. When a different element
$z$ is chosen from $H(p)$, we have a different sequence $\{z^q\}$. But whatever
the sequence, $x^q \succsim z^q$ always holds by the definition of $F$ and $z^q \in H(p^q)$.
Owing to the compactness of $X$, there is a subsequence of $\{x^q\}$, say $\{x^s\}$,
such that $x^s \to x'$ where $x' \in X$; $p^s \cdot x^s \leq p^s \cdot \bar{x}$ implies $p \cdot x' \leq p \cdot \bar{x}$ (take the
limit as $s \to \infty$). Thus $x' \in H(p)$. From (A-1) (the continuity of $\succsim$), $x^s \succsim z^s$
implies $x' \succsim z$. Since this holds whatever the choice of $z$, $x' \in F(p)$. This,
together with the compactness of the range space $X$, proves the theorem.
(Q.E.D.)
REMARK: As a result of this theorem, $F(p)$ is upper semicontinuous in
$p$ if (A-1), (A-3), and (A-4) hold.
REMARK: Note the importance of the cheaper-point assumption in estab-
lishing the above theorem. As we remarked in connection with Theorem
2.D.3, if the consumer starves to death when the price goes beyond a certain
point, his demand function becomes discontinuous.
We now introduce a fifth assumption.
(A-5) (strict convexity of $\succsim$) Let $x, x' \in X$ with $x \neq x'$, where $X$ is assumed to
be convex; $x \succsim x'$ implies $x'' \succ x'$ where $x'' \equiv tx + (1 - t)x'$, $0 < t < 1$.
REMARK: Note that (A-5) presupposes (A-4).

Theorem 2.D.5: The demand function $F(p, M)$ is single-valued under (A-5).

PROOF: Let $x$ and $x' \in F(p, M)$. Then by definition,
$x \succsim z$ for all $z \in X$ such that $p \cdot z \leq M$
$x' \succsim z$ for all $z \in X$ such that $p \cdot z \leq M$
Suppose $x \neq x'$. Then from (A-5), $x'' \succ x$ where $x'' \equiv \frac{1}{2}x + \frac{1}{2}x'$. But
$p \cdot x'' \leq M$. This contradicts the condition that $x \in F(p, M)$. Hence $x = x'$.
(Q.E.D.)
REMARK: This theorem gives a sufficient condition for the single-valued-
ness of the demand function. It does not provide a necessary condition.
REMARK: From the above proof, it should be clear that F(p) is also single-
valued under (A-5). Therefore, under (A-1), (A-3), and (A-5), F(p) is
single-valued and continuous.
We now come back to the minimum expenditure function $M_x(p)$. Using
this concept, we would eventually like to obtain the Hicks-Slutsky equation,
especially the properties of the "substitution term." Obviously, the Hicks-Slutsky
equation can be obtained by performing a comparative static analysis on the
maximality condition obtained from maximizing a consumer's utility, $u(x)$,
subject to his budget constraint. The essence of the comparative statics procedure
is fully explained and discussed in Hicks [5] and in our Appendix to Section F,
Chapter 1. Here we will arrive at the Hicks-Slutsky equation without having to
resort to the concept of a utility function. The essential idea in this procedure is to
use $M_x(p)$, which, as is mentioned above, is due to McKenzie [8]. The fact that
the utility function is dispensed with in the analysis is an exercise of Occam's
Razor, as McKenzie points out ([8], p. 185). The importance of the following
analysis lies in its simplicity, its directness, and the clarification of the structure
of demand theory. It is direct in the sense that it does not presuppose any
knowledge or development of nonlinear programming. It does not even require the
separation theorem.

We first modify the interior-point assumption (A-3) as follows.

(A-3') (cheaper point) Let $x \in F(p, M)$. Then there exists a positive number $\delta$
such that for any $\epsilon$, $0 < \epsilon < \delta$, there exists a point $\hat{x} \in X$ and $\hat{x} \in B_\epsilon(x)$ with
$p \cdot \hat{x} < p \cdot x$, where $B_\epsilon(x)$ is an open ball about $x$ with radius $\epsilon$ and
$B_\delta(x) \cap (X \setminus \{x\}) \neq \emptyset$.

[Figure 2.25. An Illustration of (A-3').]

REMARK: If $X$ is convex (hence all commodities are divisible), then the
above assumption is simplified as follows: [Let $x \in F(p, M)$. Then there
exists an $\hat{x} \in X$ such that $p \cdot \hat{x} < p \cdot x$.] Note that the convexity of $X$ implies
$x' \equiv [t\hat{x} + (1 - t)x] \in X$ for all $0 \leq t \leq 1$, and obviously $p \cdot x' < p \cdot x$ for all
$0 < t \leq 1$.
In order to sharpen the argument, we henceforth assume that the demand
function $F(p, M)$ is single-valued. First we prove the following lemma.

Lemma 2.D.3: Let $x = F(p, M)$ and $x' = F[p', M_x(p')]$ with $x' \neq x$. Suppose that
(A-3') holds at $x'$. Suppose also that (A-1) holds. Then if $(p', x')$ lies sufficiently close
to $(p, x)$, $x \sim x'$.
PROOF:
(i) ($x' \succsim x$): By definition of $F$, $p' \cdot x' \leq M_x(p')$. As we remarked in the
definition of the minimum expenditure function, there exists $z \in C_x$ such
that $p' \cdot z = M_x(p')$, since $C_x$ is compact. [In other words, $z$ can be purchased
with income $M_x(p')$.] From the definition of $x'$, $x' \succsim z$. Hence
$x' \in C_x$, or $x' \succsim x$.
(ii) ($x \succsim x'$): Since $(p', x')$ is sufficiently close to $(p, x)$, by (A-3'), there
exists an $\hat{x} \in X$ such that $x^q \equiv [t^q \hat{x} + (1 - t^q)x'] \in X$ with $p' \cdot x^q < p' \cdot x'$,
for all $0 < t^q < 1$. [Note that $p' \cdot \hat{x} < p' \cdot x'$ implies $p' \cdot x^q < p' \cdot x'$
for all $0 < t^q < 1$, since $p' \cdot x^q = t^q p' \cdot \hat{x} + (1 - t^q) p' \cdot x'$.] But $p' \cdot x' \leq M_x(p')$
by definition of $F$, so that we have $p' \cdot x^q < M_x(p')$. Hence from
the definition of $M_x(p')$, $x^q \notin C_x$, or $x \succ x^q$. Letting $t^q \to 0$, by (A-1),
$x \succsim x'$. Combining this with ($x' \succsim x$), we obtain $x \sim x'$. (Q.E.D.)

[Figure 2.26. An Illustration of the Proof of Lemma 2.D.3.]

REMARK: Lemma 2.D.3 amounts to asserting that there is an indifference
curve in the neighborhood of $(p, x)$.

Definition: Let $f_x(p) \equiv F[p, M_x(p)]$, which is defined in the neighborhood of
$(p, x)$, in which (A-3') holds. The function $f_x(p)$ is called the compensated demand
function.
REMARK: When it is convenient and not confusing, we abbreviate $f_x(p)$
by $f(p)$. Now $f(p)$, like $F(p, M)$, is a vector-valued function, the $i$th
component of which can be written as $f_i(p)$. Thus $f_i(p)$ is the compensated
demand function for the $i$th commodity. The concept of the compensated
demand function is important in classical demand theory. The compensated
demand function $f(p)$ indicates the point chosen at price $p$, when a
consumer's income is guaranteed (compensated) such that it is just sufficient
for him to obtain a level of satisfaction as great as at point $x$. This concept
is illustrated in Figure 2.27. Note that $x = F(p, M)$ obviously means
$x = f_x(p)$. Lemma 2.D.3 says that if $x = f_x(p)$ and $(p', x')$ is close enough
to $(p, x)$ such that $x' = f_x(p')$, then $x \sim x'$.

Theorem 2.D.6: If $f_x(p)$ is differentiable and if (A-1), (A-2), and (A-3') hold, then

$$p \cdot \frac{\partial f_x(p)}{\partial p_j} = 0, \qquad j = 1, 2, \ldots, n$$

[Figure 2.27. An Illustration of $f_x(p)$. Axes: Commodity 1, Commodity 2.]

PROOF: Let $z = f_x(p)$, and let $z' = f_x(p')$ where $(p', z')$ is sufficiently close
to $(p, z)$. Then by Lemma 2.D.3, $z' \sim z$, so that $z' \in C_x$. But by the definition
of $M_x(p)$, $M_x(p) \leq p \cdot z'$ for all $z' \in C_x$. Since (A-2) holds, $M_x(p) = p \cdot z$
by Theorem 2.D.2. Therefore $p \cdot z \leq p \cdot z'$ for all $z' \in C_x$, or $p \cdot f_x(p) \leq p \cdot f_x(p')$
for all $p'$, where $(p', z')$ is sufficiently close to $(p, z)$. In other words,
for a fixed $p$, $p \cdot f_x(p')$ is minimized with respect to $p'$ at $p$. Hence using the
first-order characterization of a minimum, we obtain:

$$p \cdot \frac{\partial f_x(p')}{\partial p'_j} = 0 \quad \text{at } p' = p, \qquad j = 1, 2, \ldots, n$$

(Q.E.D.)
REMARK: It may so happen that $f_x(p)$ [hence $M_x(p)$ also] is not
differentiable. This happens, for example, when there is a "kink" in the
indifference curve. In Figure 2.28 (compare Hurwicz [7], p. 196), there is a kink
at the point $x$ in the sense that there are two tangent lines to the indifference
curve $\alpha$ at the point $x$.

[Figure 2.28. Nondifferentiable $f_x(p)$. Axes: Commodity 1, Commodity 2.]

Theorem 2.D.7: If (A-1), (A-2), and (A-3') hold at $x = f(p)$, where $f(p) \equiv f_x(p)$, and
$f(p)$ is differentiable at $p$, then

$$\frac{\partial M_x(p)}{\partial p_i} = f_i(p) \quad \text{and} \quad \frac{\partial f_i(p)}{\partial p_j} = \frac{\partial^2 M_x(p)}{\partial p_i\, \partial p_j} \qquad (i, j = 1, 2, \ldots, n)$$

If $M_x(p)$ is twice continuously differentiable at $p$, then

$$\frac{\partial f_i(p)}{\partial p_j} = \frac{\partial f_j(p)}{\partial p_i}$$

PROOF: By Theorem 2.D.2, $M_x(p) = p \cdot x = p \cdot F(p, M) = p \cdot f(p)$.
Hence $\partial M_x(p)/\partial p_i = \partial [p \cdot f(p)]/\partial p_i = f_i(p) + p \cdot \partial f/\partial p_i = f_i(p)$, by the
previous theorem. That $\partial f_i(p)/\partial p_j = \partial^2 M_x(p)/\partial p_i \partial p_j$ follows immediately
from this. If $M_x(p)$ is twice continuously differentiable at $p$, then clearly
$\partial f_i(p)/\partial p_j = \partial f_j(p)/\partial p_i$. (Q.E.D.)
REMARK: The partial derivative $\partial f_i/\partial p_j$ signifies the rate at which the
consumer varies the consumption of the $i$th good per unit change of the $j$th
price when income changes are made at the same time and by a proper
magnitude to keep the consumer on the same indifference locus; $\partial f_i/\partial p_j$ is
called the substitution term by Hicks ([5], p. 103).

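Lemma 2.D.4: The function $M_x(p)$ is concave⁸ for all $p \geq 0$.

PROOF: Let $p \equiv tp'' + (1 - t)p'$, $0 \leq t \leq 1$. By definition of $M_x(p)$, for any
$\epsilon > 0$ there exists a $z \in C_x$ such that $M_x(p) \geq p \cdot z - \epsilon$.⁹ Hence
$M_x(p) \geq t\, p'' \cdot z + (1 - t)\, p' \cdot z - \epsilon$ for all $\epsilon > 0$ and
$0 \leq t \leq 1$. Since $z \in C_x$, $p'' \cdot z \geq M_x(p'')$ and $p' \cdot z \geq M_x(p')$. Therefore we
have $M_x(p) \geq tM_x(p'') + (1 - t)M_x(p') - \epsilon$ for all $\epsilon > 0$ and $0 \leq t \leq 1$.
Therefore $M_x(p) \geq tM_x(p'') + (1 - t)M_x(p')$, $0 \leq t \leq 1$. (Q.E.D.)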

Theorem 2.D.8: $\sum_i \sum_j (\partial f_i/\partial p_j)\, dp_i\, dp_j \leq 0$, almost everywhere.

PROOF: $\partial f_i/\partial p_j = \partial^2 M_x(p)/\partial p_i \partial p_j$. Since $M_x(p)$ is a concave function
by Lemma 2.D.4, and since the Hessian matrix of a concave function is negative
semidefinite (Chapter 1, Section E, c), the statement of the theorem
follows trivially.¹⁰ (Q.E.D.)

Lemma 2.D.5: The function $M_x(p)$ is positively homogeneous of degree one.

PROOF: By definition, $M_x(p) = \inf p \cdot z$ for $z \in C_x$. But this is true if and
only if $tM_x(p) = \inf tp \cdot z$, for all $z \in C_x$, $t > 0$. (Q.E.D.)
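
Both properties of the minimum expenditure function, concavity and positive homogeneity of degree one, follow from its being an infimum of functions linear in $p$, and no utility function is needed to see them. The sketch below checks this on a toy case. It is an illustration under stated assumptions: the finite list `C_X` is a hypothetical stand-in for the "no-worse-than-$x$" set $C_x$, and `min_expenditure` is a name introduced here, not one from the text.

```python
from typing import List, Sequence, Tuple

# Hypothetical finite stand-in for the set C_x of bundles at least as good as x.
C_X: List[Tuple[float, float]] = [(4.0, 1.0), (2.0, 2.0), (1.0, 4.0)]


def min_expenditure(p: Sequence[float]) -> float:
    """M_x(p): the least cost p . z over z in C_x (a minimum, since C_X is finite)."""
    return min(sum(pi * zi for pi, zi in zip(p, z)) for z in C_X)


def mix(t: float, p2: Sequence[float], p1: Sequence[float]) -> List[float]:
    """The price vector t p'' + (1 - t) p'."""
    return [t * a + (1.0 - t) * b for a, b in zip(p2, p1)]


def concavity_holds(p2: Sequence[float], p1: Sequence[float], steps: int = 10) -> bool:
    """Check M(t p'' + (1-t) p') >= t M(p'') + (1-t) M(p') on a grid of t values."""
    for k in range(steps + 1):
        t = k / steps
        lhs = min_expenditure(mix(t, p2, p1))
        rhs = t * min_expenditure(p2) + (1.0 - t) * min_expenditure(p1)
        if lhs < rhs - 1e-12:
            return False
    return True


if __name__ == "__main__":
    p_prime, p_double_prime = (1.0, 3.0), (3.0, 1.0)
    print(min_expenditure(p_prime), min_expenditure(p_double_prime))
    print(min_expenditure(mix(0.5, p_double_prime, p_prime)))
    print(concavity_holds(p_double_prime, p_prime))
```

At the two hypothetical prices the minimum cost is 7, while at their midpoint it is 8, so the inequality of Lemma 2.D.4 holds strictly there; scaling all prices by 3 scales the minimum cost by 3.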

Theorem 2.D.9: The function $f_x(p)$ is positively homogeneous of degree zero.

PROOF: By definition,

$$\begin{aligned}
f_x(p) &\equiv F[p, M_x(p)] \\
&= F[tp, tM_x(p)] && \text{(by Theorem 2.D.1)} \\
&= F[tp, M_x(tp)] && \text{(by Lemma 2.D.5)} \\
&= f_x(tp) && \text{(by definition)}
\end{aligned}$$

(Q.E.D.)

REMARK: Using the Euler equation for the homogeneous function of
degree zero, we may write

$$\sum_{j=1}^{n} \frac{\partial f_i}{\partial p_j}\, p_j = 0$$

where $f_i$ is the $i$th component of $f_x(p)$.

REMARK: Writing $S_{ij} \equiv \partial f_i/\partial p_j$ and $S \equiv [S_{ij}]$ ($n \times n$ matrix), we may
summarize the results obtained in Theorems 2.D.6 through 2.D.9 as follows:¹¹

(i) $p \cdot S = 0$ (or $\sum_{i=1}^{n} p_i S_{ij} = 0$) (Theorem 2.D.6)

(ii) $S$ is symmetric (or $S_{ij} = S_{ji}$) (Theorem 2.D.7)

(iii) $S$ is negative semidefinite (or $\sum_{i,j}^{n} S_{ij}\, dp_i\, dp_j \leq 0$) (Theorem 2.D.8)

(iv) $S \cdot p = 0$ (or $\sum_{j=1}^{n} S_{ij}\, p_j = 0$) (Theorem 2.D.9)

(v) $S_{ii} \leq 0$ [from (iii) above]
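
Properties (i) through (v) can be checked numerically on a concrete case. The sketch below is an illustration only and departs from the text in one respect: it assumes a Cobb-Douglas consumer, for which the minimum expenditure function has the familiar closed form used in `expenditure`, whereas the section itself avoids any utility function. The share `ALPHA`, the level `U_BAR`, and the price vector are hypothetical, and the matrix $S$ is approximated by central finite differences.

```python
from typing import List, Sequence

ALPHA = 0.3  # hypothetical Cobb-Douglas share of good 1
U_BAR = 2.0  # hypothetical fixed level of satisfaction


def expenditure(p: Sequence[float]) -> float:
    """Minimum expenditure M_x(p) for the assumed Cobb-Douglas consumer."""
    return U_BAR * (p[0] / ALPHA) ** ALPHA * (p[1] / (1.0 - ALPHA)) ** (1.0 - ALPHA)


def compensated_demand(p: Sequence[float]) -> List[float]:
    """f(p) = F[p, M_x(p)], using the Cobb-Douglas ordinary demand F_i = share_i * M / p_i."""
    m = expenditure(p)
    return [ALPHA * m / p[0], (1.0 - ALPHA) * m / p[1]]


def substitution_matrix(p: Sequence[float], h: float = 1e-6) -> List[List[float]]:
    """S[i][j] approximates d f_i / d p_j by central differences."""
    n = len(p)
    s = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up, down = list(p), list(p)
        up[j] += h
        down[j] -= h
        f_up, f_down = compensated_demand(up), compensated_demand(down)
        for i in range(n):
            s[i][j] = (f_up[i] - f_down[i]) / (2.0 * h)
    return s


def quad_form(s: List[List[float]], v: Sequence[float]) -> float:
    """The quadratic form sum_ij S_ij v_i v_j."""
    return sum(s[i][j] * v[i] * v[j] for i in range(len(v)) for j in range(len(v)))


if __name__ == "__main__":
    p = [2.0, 5.0]
    S = substitution_matrix(p)
    print("p.S :", [sum(p[i] * S[i][j] for i in range(2)) for j in range(2)])
    print("S.p :", [sum(S[i][j] * p[j] for j in range(2)) for i in range(2)])
    print("S12 - S21:", S[0][1] - S[1][0])
    print("diagonal :", S[0][0], S[1][1])
```

Up to finite-difference error, the printed vectors are zero, the off-diagonal terms agree, and the diagonal terms are nonpositive.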

Finally we obtain the Hicks-Slutsky equation.

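Theorem 2.D.10: Suppose that (A-2) and Theorem 2.D.7 hold at $x = F(p, M)$ and
that $F$ is differentiable in $p$ and $M$. Then

$$\frac{\partial F_i(p, M)}{\partial p_j} = \frac{\partial f_i(p)}{\partial p_j} - f_j(p)\, \frac{\partial F_i(p, M)}{\partial M}$$

where $f(p) \equiv f_x(p)$.
PROOF: Since (A-2) holds, Theorem 2.D.2 holds, so that $p \cdot x = M = M_x(p)$.
Therefore

$$\begin{aligned}
\frac{\partial f_i(p)}{\partial p_j} = \frac{\partial F_i[p, M_x(p)]}{\partial p_j}
&= \frac{\partial F_i(p, M)}{\partial p_j} + \frac{\partial F_i(p, M)}{\partial M}\, \frac{\partial M_x(p)}{\partial p_j} \\
&= \frac{\partial F_i(p, M)}{\partial p_j} + f_j(p)\, \frac{\partial F_i(p, M)}{\partial M} && \text{(by Theorem 2.D.7)}
\end{aligned}$$

(Q.E.D.)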
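
The Hicks-Slutsky equation can be checked numerically on a simple case. The sketch below is illustrative only: it assumes a Cobb-Douglas ordinary demand function $F_i(p, M) = a_i M / p_i$ (an assumption introduced here, not taken from the text), builds the compensated demand as $f(p) = F[p, M_x(p)]$ with the satisfaction level fixed so that $M_x(p) = M$ at the base price, and compares the two sides by central finite differences. All function names and numbers are hypothetical.

```python
from typing import List, Sequence

ALPHA = 0.3  # hypothetical Cobb-Douglas share of good 1


def ordinary_demand(p: Sequence[float], m: float) -> List[float]:
    """F(p, M) for the assumed Cobb-Douglas consumer."""
    return [ALPHA * m / p[0], (1.0 - ALPHA) * m / p[1]]


def unit_cost(p: Sequence[float]) -> float:
    """Minimum expenditure per unit of the satisfaction index."""
    return (p[0] / ALPHA) ** ALPHA * (p[1] / (1.0 - ALPHA)) ** (1.0 - ALPHA)


def compensated_demand(p: Sequence[float], level: float) -> List[float]:
    """f(p) = F[p, M_x(p)] with M_x(p) = level * unit_cost(p)."""
    return ordinary_demand(p, level * unit_cost(p))


def shift(p: Sequence[float], k: int, d: float) -> List[float]:
    """Copy of p with the k-th price moved by d."""
    out = list(p)
    out[k] += d
    return out


def slutsky_gap(p: Sequence[float], m: float, i: int, j: int, h: float = 1e-6) -> float:
    """dF_i/dp_j - (df_i/dp_j - f_j * dF_i/dM); should be near zero."""
    level = m / unit_cost(p)  # chosen so that M_x(p) = M at the base price
    d_f_ord_dp = (ordinary_demand(shift(p, j, h), m)[i]
                  - ordinary_demand(shift(p, j, -h), m)[i]) / (2.0 * h)
    d_f_ord_dm = (ordinary_demand(p, m + h)[i] - ordinary_demand(p, m - h)[i]) / (2.0 * h)
    d_f_comp_dp = (compensated_demand(shift(p, j, h), level)[i]
                   - compensated_demand(shift(p, j, -h), level)[i]) / (2.0 * h)
    f_j = compensated_demand(p, level)[j]
    return d_f_ord_dp - (d_f_comp_dp - f_j * d_f_ord_dm)


if __name__ == "__main__":
    p, m = [2.0, 5.0], 10.0
    for i in range(2):
        for j in range(2):
            print(i, j, slutsky_gap(p, m, i, j))
```

For this example the cross-price effect $\partial F_1/\partial p_2$ is zero, so the substitution term and the income term cancel exactly; the printed gaps differ from zero only by finite-difference error.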

FOOTNOTES

1. For the subject matter of this section, we have relied heavily on McKenzie [8] and
his lectures at the University of Rochester. An exposition of McKenzie's approach is
also seen in S. Karlin, Mathematical Methods and Theory in Games, Programming,
and Economics, Vol. I, Reading, Mass., Addison-Wesley, 1959, pp. 271-273. For
a more complete exposition of (static) demand theory, see D. W. Katzner, Static
Demand Theory, New York, Macmillan, 1970.
2. The compactness of X is assumed just for the simplicity of exposition. It can be
weakened. For example, it suffices to assume that X is closed and bounded from
below. This is due to the fact that X can be restricted to a set which is bounded
from above as a result of the budget constraint (that is, a finite income).
3. That is, we assume that the relation $\succsim$ is reflexive, transitive, and total. However,
we may note that the transitivity axiom is not essential in obtaining many results
in this section. In other words, in many results, it suffices to regard $\succsim$ only as a
binary relation on $X$, which is total. Needless to say, the transitivity axiom is needed
in obtaining some results here (such as Lemma 2.D.3 and Theorem 2.D.6).
4. The crucial underlying fact here is the assumption that some point, say $x$, is
"chosen" [that is, $F(p, M)$ is nonempty]. The power of this axiom of selection
in demand theory is well illustrated in the theory of revealed preference. Starting
from preference orderings, it is possible that such a choice is impossible. Here we
may recall Sonnenschein's example (quoted in Section B of this chapter): Assume
that the budget set consists of only three points $x$, $y$, and $z$, and suppose that our
consumer's preference is $x \succ y \succ z$ but $z \succ x$ (the case of intransitivity). Here no
choice is possible. Needless to say, if $\succ$ is intransitive, then $\succsim$ is also intransitive.
5. The minimum expenditure function Mx(p) plays a crucial role in McKenzie's
approach to demand theory. As will be shown later, Mx(p) turns out to be a concave
function, hence its Hessian matrix is negative semidefinite. It will also be pointed
out later that the elements of this Hessian matrix correspond to the effect of a
compensated price change on demand, that is, the substitution terms in the Hicks-
Slutsky theory. In other words, the discovery of the crucial role played by the
minimum expenditure function in demand theory is one of the important contribu-
tions of McKenzie [8].
6. Similarly, we can prove the lower semicontinuity of $H(p, M)$. Such a proof is given
by Debreu [4], pp. 63-65. Our proof is due to Lionel McKenzie.
7. The function $H(p)$ is upper semicontinuous if its graph is closed and if its range
space $X$ is compact. Similarly, we can establish the lower semicontinuity and hence
continuity of $H(p, M)$.
8. Since every concave function is continuous in the interior of the domain (compare
Theorem 1.B.1), $M_x(p)$ is continuous for all $p > 0$.
9. To see this, suppose the contrary. That is, suppose that for some $\epsilon > 0$, $M_x(p) <
p \cdot z - \epsilon$ for all $z \in C_x$. Let $\hat{z}$ be a point in $C_x$ such that $p \cdot \hat{z} = M_x(p)$. Then we
have $p \cdot \hat{z} < p \cdot \hat{z} - \epsilon$, which is a contradiction.
10. It is known that every concave function is differentiable almost everywhere (that
is, except for sets of measure zero). See, for example, W. Fenchel, Convex Cones,
Sets, and Functions, Princeton University, September 1953 (hectographed).
11. Note that (iv) can also be obtained from (i) and (ii). Similarly, (i) can also be obtained
from (ii) and (iv).

REFERENCES
1. Arrow, K. J., "Rational Choice Functions and Orderings," Economica, N.S., 26,
May 1959.
2. Berge, C., Topological Spaces, New York, Macmillan, 1963 (French, 1959).
3. Chipman, J. S., Hurwicz, L., Richter, M. K., and Sonnenschein, H. F., eds., Prefer-
ences, Utility, and Demand: A Minnesota Symposium, New York, Harcourt Brace
Jovanovich, 1971.
4. Debreu, G., Theory of Value, New York, Wiley, 1959.
5. Hicks, J. R., Value and Capital, 2nd ed., Oxford, Clarendon Press, 1946.
6. Houthakker, H. S., "Revealed Preference and the Utility Function," Economica,
N.S., 17, May 1950.
7. Hurwicz, L., "On the Problem of Integrability of Demand Functions," in Preferences,
Utility, and Demand, New York, Harcourt Brace Jovanovich, 1971, chap. 9.
8. McKenzie, L. W., "Demand Theory without a Utility Index," Review of Economic
Studies, XXIV, June 1957.
9. McKenzie, L. W., "Further Comments," Review of Economic Studies, XXV, June 1958.
10. Newman, P. K., and Read, R. C., "Demand Theory without a Utility Index;
Comment," Review of Economic Studies, XXV, June 1958.
11. Newman, P. K., The Theory of Exchange, Englewood Cliffs, N.J., Prentice-Hall, 1965.
12. Richter, M. K., "Revealed Preference Theory," Econometrica, 34, July 1966.
13. Samuelson, P. A., Foundations of Economic Analysis, Cambridge, Mass., Harvard
University Press, 1947.
14. Samuelson, P. A., "The Problem of Integrability in Utility Theory," Economica, N.S., XVII,
November 1950.
15. Uzawa, H., "Preferences and Rational Choice in the Theory of Consumption,"
in Mathematical Methods in the Social Sciences, ed. by Arrow, Karlin, and Suppes,
Stanford, Calif., Stanford University Press, 1960. A revised version of this paper is
included in [3], chapter 1.
16. Yokoyama, T., "A Logical Foundation of the Theory of Consumer's Demand,"
Osaka Economic Papers, 2, 1953.

Appendix to Section D: Various Concepts of Semicontinuity and the
Maximum Theorem¹

a. VARIOUS CONCEPTS OF SEMICONTINUITY

In the literature various definitions of semicontinuity of a multivalued
function are used, and the definitions given by different authors do not always
coincide. Hence one should be careful in applying the theorems obtained in the
(mathematics) literature to economic problems whenever such problems involve
the concept of semicontinuity. Here we take up a few of these definitions of
semicontinuity and point out the relations among them.
We start our discussion by introducing the following concepts.

Definition: Let $\phi$ be a function, multivalued or single-valued, from a set $X$ into
a set $Y$. The lower inverse of $\phi$, denoted by $\phi^-$, is the mapping from $\phi(X)$
into $X$ defined as
$$\phi^-(y) \equiv \{x : x \in X,\ y \in \phi(x)\}$$
Let $B$ be a nonempty subset of $Y$; then we define $\phi^-(B)$ as
$$\phi^-(B) \equiv \{x : x \in X,\ \phi(x) \cap B \neq \emptyset\}$$
Let $B \subset Y$; then the upper inverse of $\phi$, denoted by $\phi^+$, is defined as
$$\phi^+(B) \equiv \{x : x \in X,\ \phi(x) \subset B\}$$

REMARK: If $B = \emptyset$, then we define $\phi^-(B) = \emptyset$. Note that, by definition of
$\phi^+$, we have $\phi^+(\emptyset) = \{x : x \in X,\ \phi(x) = \emptyset\}$. Note also that for all $B \subset Y$,
$B \neq \emptyset$, we have $\phi^+(B) \subset \phi^-(B)$. If, in particular, $\phi$ is single-valued, then
we have $\phi^+(B) = \phi^-(B) = \phi^{-1}(B)$.
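
On finite sets the two inverses reduce to simple set comprehensions, which may help fix the definitions. The sketch below is purely illustrative: the correspondence `PHI`, the set `B`, and the function names are hypothetical, and every image is taken to be nonempty so that the inclusion of the upper inverse in the lower inverse can be seen directly.

```python
from typing import Dict, Hashable, Set

Correspondence = Dict[Hashable, Set[Hashable]]


def lower_inverse(phi: Correspondence, b: Set[Hashable]) -> Set[Hashable]:
    """All x whose image phi(x) meets B."""
    return {x for x, image in phi.items() if image & b}


def upper_inverse(phi: Correspondence, b: Set[Hashable]) -> Set[Hashable]:
    """All x whose image phi(x) is contained in B."""
    return {x for x, image in phi.items() if image <= b}


# Hypothetical multivalued function from X = {1, 2, 3} into Y = {"a", "b", "c"}.
PHI: Correspondence = {1: {"a"}, 2: {"a", "b"}, 3: {"b", "c"}}
B: Set[Hashable] = {"a", "b"}

# Hypothetical single-valued function: every image is a singleton.
SINGLE: Correspondence = {1: {"a"}, 2: {"b"}, 3: {"c"}}

if __name__ == "__main__":
    print("lower inverse:", lower_inverse(PHI, B))
    print("upper inverse:", upper_inverse(PHI, B))
    print("single-valued:", lower_inverse(SINGLE, B), upper_inverse(SINGLE, B))
```

Here the point 3 belongs to the lower inverse of `B` but not to the upper inverse, since its image meets `B` without being contained in it. For the single-valued map the two inverses coincide with the ordinary inverse image.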
The following example, given by Berge [1] (p. 25), may be useful in
understanding the above concepts, as well as being somewhat amusing to the
reader.
EXAMPLE: Let $X$ be the set of possible positions in the game of chess: A
position consists of the coordinates of the different pieces on the chess board
and the player whose move is next. Let $X_1$ be the set of positions in which White
can move. Clearly $X_1 \subset X$. Let $x \in X_1$ and define $\phi(x)$ as the set of positions
which White can reach immediately after position $x$.² The image $\phi^-(x)$ is
the set of possible positions which could have occurred before position $x$.
If $A$ is a nonempty subset of $X_1$, $\phi^+(A)$ is the set of positions that can give
only a position belonging to $A$ in the following move. If $K$ denotes the set of
positions in which Black is checkmated, then the set of positions White can
mate "in two moves" is $\phi^- \cdot \phi^+ \cdot \phi^-(K)$.
As a second example, let $X = Y = [0, 1]$, the unit interval on the real line,
and let $\phi$ be a function of $X$ into $Y$. The two diagrams of Figure 2.29 illustrate
the concepts of lower inverse and upper inverse, where the shaded region of each
diagram denotes the graph of $\phi$.
Now we restrict our sets $X$ and $Y$ to be topological spaces: $\phi$ is now a
(multivalued) function of a topological space $X$ into a topological space $Y$.³ We
then define the following concepts.

[Figure 2.29. Illustrations of Lower Inverse and Upper Inverse. Panels: Case a, Case b.]

Definition:
(i) The function $\phi$ is called upper semicontinuous (abbreviated u.s.c.) at $x^0$ if
for each open set $V$ containing $\phi(x^0)$ there exists a neighborhood $N(x^0)$ (or
an open set containing $x^0$) such that
$$x \in N(x^0) \text{ implies } \phi(x) \subset V$$
(ii) The function $\phi$ is called lower semicontinuous (abbreviated l.s.c.) at $x^0$ if for
each open set $V$ meeting $\phi(x^0)$ (that is, $V \cap \phi(x^0) \neq \emptyset$) there exists a
neighborhood $N(x^0)$ (or an open set containing $x^0$) such that
$$x \in N(x^0) \text{ implies } \phi(x) \cap V \neq \emptyset$$
(iii) The function $\phi$ is called continuous at $x^0$ if it is both u.s.c. and l.s.c. at $x^0$.
REMARK: When $\phi$ is single-valued, we know that $\phi$ is continuous at $x^0$ if
and only if for each open set $V$ containing $\phi(x^0)$, there exists a neighborhood
$N(x^0)$ (or an open set containing $x^0$) such that
$$x \in N(x^0) \text{ implies } \phi(x) \in V$$
In other words, $\phi$ is both u.s.c. and l.s.c. at $x^0$.

We are now ready to state various definitions of semicontinuity in X.

Definition:

(i) We say that $\phi$ is closed in $X$ if, for each $x^0 \in X$, "$x^q \to x^0$, $y^q \to y^0$, where
$x^q \in X$, $y^q \in \phi(x^q)$" implies "$y^0 \in \phi(x^0)$."
(ii) We say that $\phi$ is G-closed in $X$ if the graph of $\phi$, $\{(x, y): x \in X,\ y \in \phi(x)\}$, is
closed in $X \otimes Y$.
(iii) We say that $\phi$ is quasi upper semicontinuous (abbreviated q.u.s.c.) in $X$⁴ if $\phi$
is u.s.c. at each $x$ in $X$.
(iv) We say that $\phi$ is upper semicontinuous (u.s.c.) in $X$ if $\phi$ is q.u.s.c. in $X$ and $\phi(x)$
is compact for each $x$ in $X$.
(v) We say that $\phi$ is lower semicontinuous (l.s.c.) in $X$ if $\phi$ is l.s.c. at each $x$ in $X$.
(vi) We say that $\phi$ is continuous in $X$ if it is both u.s.c. and l.s.c. in $X$.
REMARK: Definitions (i) and (ii) are equivalent.⁵ If $\phi$ is single-valued, then
(i), (ii), (iii), (iv), (v), and (vi) are all equivalent.
REMARK: Definition (ii) is equivalent to saying that for each $x^0 \in X$,
"$y^0 \in Y$, $y^0 \notin \phi(x^0)$" implies that "there exist open sets $U$ in $X$ and $V$ in $Y$
with $x^0 \in U$ and $y^0 \in V$ such that $\phi(x) \cap V = \emptyset$ for all $x \in U$."
We now state theorems which would clarify the meaning of the various
definitions of semicontinuity and link these definitions.

Theorem 2.D.11 (Berge):⁶

(i) The function $\phi$ is l.s.c. in $X$ if and only if for each open set $V$ in $Y$ the set
$\phi^-(V)$ is open in $X$.
(ii) The function $\phi$ is u.s.c. in $X$ if and only if for each open set $V$ in $Y$ the set
$\phi^+(V)$ is open in $X$ and $\phi(x)$ is compact for each $x \in X$.
PROOF: See Berge [1], pp. 109-110.

Theorem 2.D.12: The function $\phi$ is u.s.c. in $X$ if and only if $\phi$ is closed in $X$ and $Y$ is
compact.
PROOF: See Berge [1], p. 112 (corollary of theorem 7), and Moore [4],
lemma 1-d and lemma 2.
REMARK: It is important to note that the compactness of $Y$ is crucial here.⁷
It is not accidental that in Debreu's definition of upper semicontinuity
([2], p. 17) and our definition in Chapter 2, Section D, $Y$ is assumed to be
compact.
REMARK: Theorem 2.D.12 implies that every u.s.c. function is closed.
REMARK: Berge also proved that if $\phi$ is u.s.c. in $X$, then $\phi(A)$ is compact in
$Y$ whenever $A$ is compact in $X$ ([1], p. 110).
REMARK: For the lower semicontinuity, we may conjecture that $\phi$ is l.s.c.
in $X$ if and only if for each $x^0 \in X$, "$x^q \to x^0$, $y^0 \in \phi(x^0)$" implies that there exists a
sequence $\{y^q\}$ in $Y$ such that $y^q \in \phi(x^q)$ and $y^q \to y^0$.⁸

Theorem 2.D.13 (Berge):

(i) The function $\phi \circ \psi$ is l.s.c. (resp. u.s.c.) if both $\phi$ and $\psi$ are l.s.c. (resp. u.s.c.).
(ii) The union (resp. intersection) of a family (finite or infinite) of l.s.c. (resp. u.s.c.)
functions is also l.s.c. (resp. u.s.c.).
(iii) The union of a finite number of u.s.c. functions is also u.s.c. (no corresponding
property holds for l.s.c. functions).
(iv) The Cartesian product of a finite number of u.s.c. (resp. l.s.c.) functions is also u.s.c.
(resp. l.s.c.).

PROOF: See [1], pp. 113-115.

REMARK: Moore ([4], pp. 135-137) also proved the following.
(i) The function $\phi \circ \psi$ is q.u.s.c. if both $\phi$ and $\psi$ are q.u.s.c.
(ii) The function $\phi \circ \psi$ is closed if $\phi$ is closed and $\psi$ is u.s.c.

b. THE MAXIMUM THEOREM

Consider a typical competitive consumer with a utility function $u(x)$ defined
on his consumption set $X$. The function $u(x)$ is real-valued from $X$ into $R$. He
chooses $x$ (his "action") so as to maximize $u(x)$. His choice is restricted to a subset
of $X$, called the "budget set." Suppose that his income is $M$ and he is faced with a
market price vector $p$. Let $S$ be the set of all possible values of $(p, M)$. Then his
budget set can be defined by a set-valued function, $H$, from $S$ into $X$. In the theory
of consumer choice, $H$ is defined in terms of $p \cdot x \leq M$. However, given an arbitrary
value of $(p, M)$, $H(p, M)$ may not have an image in $X$, that is, $H(p, M)$ can be empty.
We assume that there is at least one value of $(p, M)$ in $S$ such that $H(p, M)$ has an
image in $X$, that is, $H(p, M)$ is nonempty for some $(p, M) \in S$. We then restrict the
set $S$ to its subset $Z$ such that $H(p, M)$ is nonempty. In other words, $H(p, M)$ is
nonempty for all $(p, M)$ in $Z$. Now his action $x$ (consumption vector) is such as to
maximize $u(x)$ over the budget set $H(p, M)$.
In general, let us consider an economic agent who has a set of actions, which
is a priori available to him, denoted by $X$. Let $Z$ be the set of his possible
environments. Given an element $z \in Z$, we suppose that his action is restricted to a subset
of $X$ by a set-valued function $\phi$ from $Z$ into $X$. In other words, his action is restricted
to $\phi(z) \subset X$. We assume that $\phi(z)$ is nonempty for all $z$ in $Z$. Let us consider the
outcome of action $x$ when his environment is $z$. His action $x$ is restricted to $\phi(z) \subset X$.
In particular, we define a real-valued function $u$ on $Z \otimes X$, called the gain function;
$u(z, x)$ represents the gain when his environment is $z$ and his action is $x$. Here the
value of $u$ may depend only on the action $x$ which is taken. Hence $u(z, x)$ may
simply be written $u(x)$. In demand theory, $u$ is the utility function of the consumer;
$z$ is a price-income pair $(p, M)$, $X$ is the set of possible consumptions, and $\phi(z)$ is the
budget set. In the theory of production, the economic agent is a producer, $X$ is his
production set (given the resource constraints), $Z$ is the set of price-resource pairs,
and $u$ is the profit function.
Hence the above formulation of the function $u$ defined on $Z \otimes X$ with
$x \in \phi(z)$ has a wider application to economic problems. Let us suppose that the
agent maximizes $u(z, x)$ for all $x \in \phi(z)$ with a given $z \in Z$. Let $V$ be the value of
this action, that is, $V(z) \equiv \sup \{u(z, x): x \in \phi(z)\}$. In other words, $V(z)$ is a
real-valued function (and obviously single-valued) defined on $Z$. Let $T$ be the set of such
"optimal" actions. In other words, $T(z) \equiv \{x: x \in \phi(z),\ u(z, x) = V(z)\}$. In general,
$T(z)$ is a multivalued function from $Z$ into $X$. We assume that $X$ and $Z$ are Hausdorff
spaces. Berge ([1], pp. 115-116) stated and proved the following theorem
which has many important applications.

Theorem 2.D.14 (maximum theorem): Let $u(z, x)$ be a real-valued (and single-valued)
continuous function in $Z \otimes X$, and $\phi(z)$ be a multivalued function from $Z$ into $X$
such that $\phi(z) \neq \emptyset$ for all $z \in Z$ and $\phi(z)$ is continuous in $Z$. Then $V(z)$ is continuous and
$T(z)$ is u.s.c. in $Z$.
REMARK: In the above theorem, the requirement that $u(z, x)$ is continuous
in $Z \otimes X$ can be replaced by the requirement that $u(x)$ is continuous in $X$.
This is owing to the continuity of $\phi(z)$. This observation would be useful
for the application of the theorem to demand theory, for example.
REMARK: In demand theory, $u(x)$ is the individual's utility function, $\phi(z)$ is
his budget function, $V(z)$ is his indirect utility function, and $T(z)$ is his
demand function (multivalued). The lower semicontinuity of $\phi(z)$
corresponds to the lower semicontinuity of $H(p, M)$. The upper semicontinuity
of $T(z)$ corresponds to the upper semicontinuity of the demand function
$F(p, M)$.
REMARK: Let $Y \subset R^n$ be the production set for a certain "competitive"
producer. When price $p$ prevails, his profit is $p \cdot y$, for action $y \in Y$. Let $Z$ be
the subset of $R^n$ such that $p \cdot y$ attains a maximum over $Y$. We assume that $Z$ is
nonempty. Define $u(p, y)$ on $Z \otimes Y$ such that $u(p, y) \equiv p \cdot y$. There is no
restriction by $\phi$ on $Y$; $V(p) \equiv \sup p \cdot y$, $y \in Y$. The supply function of this
producer (multivalued function in general) is $T(p)$. Then from the above
theorem (since $p \cdot y$ is obviously continuous), we can immediately conclude
that his supply function $T(p)$ is upper semicontinuous and his profit function
$V(p)$ is continuous.
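
The producer case of the maximum theorem can be seen in a one-commodity toy example. The sketch below is illustrative and makes assumptions not in the text: the production set is taken to be the compact interval $[-1, 1]$, represented by a finite grid `Y_GRID`, and `profit_value` and `supply` are hypothetical names standing for $V(p)$ and $T(p)$. Being a grid computation, it approximates rather than proves the continuity claims.

```python
from typing import List

# Hypothetical compact production set Y = [-1, 1], represented by a grid of 201 points.
Y_GRID: List[float] = [i / 100 - 1.0 for i in range(201)]


def profit_value(p: float) -> float:
    """V(p): the largest profit p * y over the grid."""
    return max(p * y for y in Y_GRID)


def supply(p: float, tol: float = 1e-12) -> List[float]:
    """T(p): all grid points attaining V(p), up to a small tolerance."""
    v = profit_value(p)
    return [y for y in Y_GRID if p * y >= v - tol]


if __name__ == "__main__":
    for p in (-1e-6, 0.0, 1e-6):
        t = supply(p)
        print(p, profit_value(p), (min(t), max(t)), len(t))
```

The value $V(p) = |p|$ changes continuously through $p = 0$. The supply set jumps from the single point $-1$ to the whole interval and then to the single point $1$: limits of optimal actions at nearby prices remain optimal at $p = 0$ (upper semicontinuity), but interior points of the set at $p = 0$ are not approached from nearby prices, so $T$ is not lower semicontinuous there.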
REMARK: Theorem 2.D.14 was used by Debreu [2] to establish upper
semicontinuity of the demand function and the supply function, but no
explicit mention of the literature for the above theorems (such as Berge [1])
was made by him.

FOOTNOTES

1. In the material of this Appendix, we have relied heavily on Berge [1] and Moore
[4].
2. Let $X_2$ be the set of positions in which Black can move and $X_0$ be the set of positions
of checkmate or stalemate. Clearly $X$ is the union of $X_1$, $X_2$, and $X_0$; $\phi$ is the mapping of
$X \setminus X_0$ into $X$.
3. It is assumed that $X$ and $Y$ satisfy the "first axiom of countability." A few definitions
may be recalled here from Chapter 0, Section A. A topological space is said to satisfy
the first axiom of countability if it has a countable open base at each of its points. An
open base is a class of open sets such that every open set is a union of sets in this class.
The first axiom of countability practically enables a topology to be defined in terms of
sequences alone. Clearly the usual Euclidean space satisfies the first axiom of
countability. We further restrict $Y$ to be a Hausdorff space. A topological space $Y$ is called a
Hausdorff space if for any two different points $y_1$ and $y_2$ in $Y$, there exist disjoint open
sets $V_1$ and $V_2$ in $Y$ such that $y_1 \in V_1$ and $y_2 \in V_2$ (recall Chapter 0, Section A). This restriction of
$Y$ is required to obtain the subsequent results. It is known that every metric space is
a Hausdorff space and also satisfies the first axiom of countability.
4. A function may be q.u.s.c. in $X$ but may fail to be G-closed. A counterexample is the
following: $X = R$, $Y = [0, 1]$, and define $\phi$ by $\phi(x) = (0, 1)$ for all $x \in X$ (see Moore [4],
pp. 131-132). As Moore points out ([4], p. 138, footnotes 7 and 8), there seems to be
a slight confusion about this point in Karlin ([3], p. 409).
5. See Berge [1], p. 111, and Moore [4], p. 130.
6. Recall that when $\phi$ is single-valued, $\phi$ is continuous if and only if for each open set $V$ in
$Y$, the set $\phi^{-1}(V)$ is open in $X$. The definitions of lower inverse and upper inverse are
thus linked in a natural way to the concepts of (semi-) continuity.
7. When $Y$ fails to be compact, Theorem 2.D.12 fails to be valid. For an example of a
G-closed function which fails to be q.u.s.c. (hence also fails to be u.s.c.), see Moore [4],
p. 132.
8. The latter is the usual definition of l.s.c. appearing in the literature. See Debreu [2],
p. 17.

REFERENCES
1. Berge, C., Topological Spaces, tr. by Patterson, New York, Macmillan, 1963 (French
original, 1959).
2. Debreu, G., Theory of Value, New York, Wiley, 1959.
3. Karlin, S., Mathematical Methods and Theory in Games, Programming, and Economics,
Vol. I, Reading, Mass., Addison-Wesley, 1959.
4. Moore, J. C., "A Note on Point-Set Mappings," in Papers in Quantitative Economics,
Vol. 1, ed. by J. P. Quirk and A. M. Zarley, Lawrence, Kansas, University of Kansas
Press, 1968.

Section E
THE EXISTENCE OF
COMPETITIVE EQUILIBRIUM

a. HISTORICAL BACKGROUND
An economic model is constructed by specifying the economic agents
involved, their behavioral rules, and the various equilibrium relations. The model
is called a general equilibrium model if all the equilibrium relations in the model
are specified. It is called a partial equilibrium model if only a part of the equilibrium
relations is specified. The unspecified portion then is covered by the assumption
that "other things are equal." A partial equilibrium analysis is convenient for a
deeper analysis of some particular segment of the economy. However, it should be
realized that any partial equilibrium analysis presupposes a general equilibrium
analysis. For without knowing precisely under what conditions "other things are
equal," the partial equilibrium analysis is rather meaningless.
Full recognition of the importance of general equilibrium analysis and the
construction of the first general equilibrium model of a national economy is
attributable to Leon Walras [42]. Moreover, Walras stated his general equilibrium
system in mathematical forms whose impact on modern economic theory
is immense. The model of a competitive equilibrium that we have considered so
far in this chapter is an outgrowth of the Walrasian general equilibrium model.
The important properties of such a general equilibrium-the optimality, the
existence, and the stability of the equilibrium-have already been considered
by Walras. Although his consideration was not too satisfactory from the present
point of view, he was very much ahead of his time. The Walrasian construction of
general equilibrium models goes from a simple model to more complicated
models.1
We illustrate his model and his consideration of the existence of an equilibrium
by using his model with production2 ([42], part IV). (We use our own
notation.) Let a_{ij} be the amount of the ith productive resource necessary to
produce one unit of the jth commodity (good or service). Let x_j, j = 1, 2, ..., n,
be the output of the jth commodity in the economy. Let v_i, i = 1, 2, ..., m, be the
amount of the ith factor made available in the economy. Let p be an n-vector which
gives the prices of the commodities that prevail in the economy, and let w be an
m-vector that gives the prices of the factors. The jth component of p is denoted
by p_j and the ith component of w is denoted by w_i. The demand for the jth
commodity is a function of p and w. Similarly, the supply of the ith factor is a
function of p and w. Thus the Walrasian general equilibrium system of competitive
markets with production can be summarized by the following system of simultaneous
equations.

(1) \sum_{j=1}^{n} a_{ij} x_j = v_i,  i = 1, 2, ..., m

(2) \sum_{i=1}^{m} w_i a_{ij} = p_j,  j = 1, 2, ..., n

(3) v_i = v_i(p, w),  i = 1, 2, ..., m

(4) x_j = x_j(p, w),  j = 1, 2, ..., n

Equation (1) determines the total demand for each factor, of which the supply is
given by (3). The demand for each commodity is given by (4). Note that the same
notation is used to denote the demand for and the supply of each factor and each
commodity (that is, v_i and x_j). This implicitly assumes the equilibrium relation
(the demand for each factor is equal to its supply and the demand for each com-
modity is equal to its supply). Equation (2) denotes the familiar profit condition
which states that under perfect competition, profit is eliminated. Although it is not
made explicit in the above model, Walras derived the above relations from the
behavioral rules of the economic agents (competitive consumers and competitive
producers). The market demand for commodity j was obtained by adding up the
individual demands for commodity j over all consumers. Each consumer's
demand for j was obtained by assuming that he maximizes his utility subject to his
budget condition. The a_{ij} are the coefficients of production (or "coefficients
of fabrication"). Walras initially assumed that they were constant but later (in
the third edition; see lesson 36 of the fourth edition) showed that they were
determined by the cost minimization behavior of each producer.3 Alternatively the
a_{ij} can be determined by the profit maximization behavior of each producer.
Hence in the above system of equations we may consider a_{ij} = a_{ij}(p, w).
Altogether there are (2m + 2n) equations in the above system, and there are
(2m + 2n) variables to be determined in the system (that is, p_j, x_j, j = 1, 2, ..., n;
w_i, v_i, i = 1, 2, ..., m). The price of one of the commodities or factors can be used as
a numeraire to measure the relative prices of the other commodities and factors.4
Letting p_1 = 1, we thus reduce the number of variables by one. Then Walras
showed that there can be only (2m + 2n - 1) independent equations in the above
system, for one of the variables can be obtained from the identity, which is later
called Walras' Law.5

\sum_{j=1}^{n} p_j x_j = \sum_{i=1}^{m} w_i v_i

For example, x1 can be obtained as

x_1 = - \sum_{j=2}^{n} p_j x_j + \sum_{i=1}^{m} w_i v_i

Now there are (2m + 2n - 1) variables and (2m + 2n - 1) independent
equations. Walras' fundamental argument with regard to the existence of an equilibrium
was that there exists an "equilibrium" (that is, the solution for the equilibrium
values of the p_j's, x_j's, w_i's, and v_i's) because the number of equations
and the number of variables are the same.
Although his recognition of the importance of the question of whether or not
there exists an equilibrium is very ingenious and quite beyond his time, the logic
used to obtain the conclusion, as described above, is clearly wrong. For example,
the following two simultaneous equations in the variables x, y E R, will not yield
any solution, for these two equations are inconsistent (they are clearly indepen-
dent).
x + y = 1
x + y = -1
Moreover, even if there is a solution, there is a question as to whether the solution is
economically meaningful. For example, the following system with one equation
yields a solution, but the solution is usually economically meaningless:

x^2 = -1
Even a simpler example would be x = -1, where x denotes "output."6 Hence
Walras' method of counting the number of equations and the number of variables
is quite unsatisfactory.7 Although this method often gives a necessary condition
for the existence of an equilibrium solution,8 it is not a sufficient condition.
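
The point that equal numbers of equations and unknowns guarantee neither existence nor economic meaning can be checked numerically. The following Python sketch is illustrative only: the helper name solve_2x2 and all numerical coefficients are assumptions chosen for this example and are not part of Walras' system.

```python
from fractions import Fraction


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1 and a21*x + a22*y = b2 by Cramer's rule.

    Returns (x, y) when the solution is unique, and None when the
    determinant is zero (no solution, or infinitely many).
    """
    det = a11 * a22 - a12 * a21
    if det == 0:
        return None
    x = Fraction(b1 * a22 - a12 * b2, det)
    y = Fraction(a11 * b2 - b1 * a21, det)
    return x, y


# Two equations in two unknowns with identical left-hand sides and
# different right-hand sides: the counts match, yet nothing solves both.
inconsistent = solve_2x2(1, 1, 1, 1, 1, -1)

# Two equations in two unknowns with a unique solution, x + y = 1 and
# x - y = 3, but one component is negative, so it cannot be an "output".
negative = solve_2x2(1, 1, 1, 1, -1, 3)
meaningful = negative is not None and all(c >= 0 for c in negative)
```

Here the first system has no solution at all, and the second has one whose sign rules it out as a quantity, which is why a separate existence argument with nonnegativity built in is needed.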
Although the above Walrasian procedure of showing the existence of an
equilibrium solution is unsatisfactory, it was accepted for a long time without
question.9 The first satisfactory treatment came in the 1930s from Karl Menger's
seminar in Vienna. One of the most important contributions made here was the
reformulation of the Walras-Cassel system allowing inequalities.10 Based upon
such reformulations, in particular the ones due to Schlesinger [34] and Zeuthen
[43], Abraham Wald [39]11 gave the first satisfactory and rigorous proof of the
existence of an equilibrium solution. Alternative proofs of the existence of an equilibrium
solution for Schlesinger's reformulation of the Walras-Cassel model have
recently been given independently by Kuhn [19] and DOSSO [13]. The proofs by
Kuhn and DOSSO are essentially similar in the sense that they are based on the
idea of utilizing the duality theorem of linear programming.
Schlesinger's reformulation of the Walras-Cassel system can be described as
follows (in terms of the above notation):

(5) \sum_{j=1}^{n} a_{ij} x_j \le v_i,  i = 1, 2, ..., m

(6) \sum_{j=1}^{n} a_{ij} x_j < v_i implies w_i = 0

(7) \sum_{i=1}^{m} w_i a_{ij} = p_j,  j = 1, 2, ..., n

(8) p_j = f_j(x_1, x_2, ..., x_n),  j = 1, 2, ..., n  [or, in short, p = f(x)]

(9) v_i = \bar{v}_i (constant),  i = 1, 2, ..., m

(10) p_j \ge 0, x_j \ge 0,  j = 1, 2, ..., n;  w_i \ge 0,  i = 1, 2, ..., m

The model is almost self-explanatory, given the above explanation of Walras'
system. Relation (6) says that if there is an excess supply of the ith factor, its price
will be zero. Equation (8) is the "inverse" demand function, expressing the price of
each commodity as a function of the quantity demanded. This convention facilitated
the proof by Wald.12 Wald proved the existence of a unique solution (p, x, v)
in the above system under the following assumptions:

(i) a_{ij} ≥ 0 for all i and j.

(ii) v > 0, a_{ij} = constant for all i, j (fixed coefficients).
(iii) For each j there exists at least one i such that a_{ij} > 0.
(iv) The demand function f is single-valued, continuous, and defined for all x > 0.
(v) If {x^q} is a sequence of commodity vectors such that x^q → x̄ with x̄_j = 0, then
f_j(x^q) → ∞.
(vi) Given x, x' > 0, and letting p = f(x) and p' = f(x'), we have either (a) p·x <
p·x' or (b) p'·x' < p'·x.

Assumption (iii) precludes the Land of Cockaigne. Assumption (v) says that
if the demand for the jth commodity goes to zero, its price goes to infinity. This
means that the demand for each commodity will never be zero for any (finite) price,
however large. This assumption is clearly unrealistic. It is introduced primarily to
facilitate the proof. Assumption (vi) is needed in the proof of the uniqueness of the
equilibrium solution.
Since Wald's original proof is rather tedious, we will sketch the proof by
Kuhn [19] and DOSSO [13],13 which should be of interest in itself because of its
relation to the theory of linear programming.
Let X = {x: x ≥ 0, A·x ≤ v}, where A = [a_{ij}] (the feasible set). We can
easily show that X is nonempty, compact, and convex. Then consider the following
linear programming problem:
Maximize: p·x
Subject to: A·x ≤ v, and x ≥ 0
Define p = f(x) for all x > 0 such that x ∈ X. For a fixed value of x, we first obtain
the value of p and then solve the above linear programming problem with this value
of p. We obtain a solution x* (which is obviously not necessarily unique). We now
have a mapping x → p → x* which we denote by F(x). It is a function from the
interior of X into X. Extend this mapping to the whole X and denote it by Φ(x); that is,
Φ(x) = F(x) for all x > 0. The extension can be achieved by taking the closure of
the graph of F in X × X (see Kuhn [19], pp. 269-270). Using the continuity of f(x),
we can show that Φ(x) is upper semicontinuous and Φ(x) is nonempty and convex,
for all x ∈ X. Now we use the following theorem, known as the Kakutani fixed point
theorem.

Theorem 2.E.1 (Kakutani): Let S be a nonempty, compact, convex subset of R^n. Let F
be an upper semicontinuous function from S into itself such that, for all p ∈ S, the set
F(p) is nonempty and convex.14 Then there exists a p in S such that p ∈ F(p).
This theorem is a generalization of the following theorem, which is called
Brouwer's fixed point theorem.
Theorem 2.E.2 (Brouwer): Let S be a nonempty, compact, convex subset of R^n, and
let F be a single-valued continuous function from S into itself. Then there exists a
p in S such that p = F(p).
Both Brouwer's and Kakutani's theorems probe deeply into combinatorial
topology. For rather simple proofs, see Nikaido [30], [31], for example.15
Brouwer's theorem is illustrated in Figure 2.30. Here S is the unit interval [0, 1].
A continuous function from S to S must cross the diagonal line; thus F(p) = p at the crossing point.
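
The one-dimensional case can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the function name fixed_point is hypothetical, the map F = cos is an arbitrary continuous function carrying [0, 1] into itself, and bisection is used here as a convenient numerical device, not as the method of proof of the theorem.

```python
import math


def fixed_point(F, tol=1e-10):
    """Locate p in [0, 1] with F(p) = p for a continuous F: [0, 1] -> [0, 1].

    Since 0 <= F(p) <= 1, the gap g(p) = F(p) - p satisfies g(0) >= 0 and
    g(1) <= 0, so the graph of F must meet the diagonal; bisection on g
    narrows down one crossing point.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) - mid >= 0:
            lo = mid      # graph still on or above the diagonal: move right
        else:
            hi = mid      # graph below the diagonal: move left
    return (lo + hi) / 2


# cos maps [0, 1] into [cos 1, 1], a subset of [0, 1].
p = fixed_point(math.cos)
gap = abs(math.cos(p) - p)
```

The sketch relies on the ordering of the real line, which is why it does not extend directly to higher dimensions, where the combinatorial arguments mentioned above are needed.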
REMARK: The method of actually computing a fixed point in connection
with the theory of competitive equilibrium has been recently provided by
Scarf [33] ; his paper can also be considered to give a constructive proof of
the existence of competitive equilibrium. See also Arrow and Hahn [4],
Appendix C.
Reading the statement of Kakutani's fixed point theorem, we at once realize
that this theorem is applicable to the present problem. In other words, there exists
an x̂ ∈ X such that x̂ ∈ Φ(x̂). Then, using assumption (v), we can show that x̂ > 0.
Thus we can find x̂ > 0 and p̂ = f(x̂) such that x̂ solves the above linear programming
problem. Then from the duality theorem of linear programming, there exists
a solution ŵ for the following dual problem.
Minimize: w·v
Subject to: A'·w = p̂, w ≥ 0
Then (p̂, x̂, ŵ) constitutes a solution for Schlesinger's version of the Walras-Cassel
model.16 Using (vi), the uniqueness can be proved.17 Note that Wald requires the
equality A'·w = p. This implies that the price of every commodity has to be strictly
positive [under (iii)]. Assumption (v) is required to guarantee x̂ > 0 so that A'·w =
p. This assumption says roughly that every commodity is indispensable to the consumer.
Note also that if we allow an inequality here (that is, A'·w ≥ p), then (from
the duality theorem) we admit zero production for some goods and the difficulty of
introducing assumption (v) disappears (see Kuhn [19]). (However, some sort of
indispensability assumption seems still to be needed to guarantee that the output
of at least one commodity will be positive.)

Figure 2.30. An Illustration of Brouwer's Fixed Point Theorem.
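
The linear programming step and its dual can be illustrated with a small Python sketch. Everything numerical here is an assumption made for the example: the coefficient matrix A, the factor supplies v, and the price vector p are hypothetical, p is held fixed instead of being generated by an inverse demand function, and the vertex enumeration in primal_optimum works only for two goods.

```python
from fractions import Fraction as Fr
from itertools import combinations

A = [[Fr(1), Fr(2)], [Fr(3), Fr(1)]]   # a_ij: factor i needed per unit of good j
v = [Fr(4), Fr(6)]                     # factor supplies
p = [Fr(1), Fr(1)]                     # commodity prices, held fixed here


def solve2(m, b):
    """Solve the 2x2 system m x = b by Cramer's rule; None if singular."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        return None
    return [(b[0] * m[1][1] - m[0][1] * b[1]) / det,
            (m[0][0] * b[1] - b[0] * m[1][0]) / det]


def primal_optimum(A, v, p):
    """Maximize p.x over {x >= 0, A x <= v} by checking every vertex."""
    rows = A + [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]   # add the lines x_1 = 0, x_2 = 0
    rhs = v + [Fr(0), Fr(0)]
    best = None
    for i, j in combinations(range(len(rows)), 2):
        x = solve2([rows[i], rows[j]], [rhs[i], rhs[j]])
        if x is None:
            continue
        feasible = all(c >= 0 for c in x) and all(
            r[0] * x[0] + r[1] * x[1] <= b for r, b in zip(A, v))
        value = p[0] * x[0] + p[1] * x[1]
        if feasible and (best is None or value > p[0] * best[0] + p[1] * best[1]):
            best = x
    return best


x_star = primal_optimum(A, v, p)

# Both outputs are positive at the optimum, so the dual holds with equality:
# A' w = p, where A' is the transpose of A.
A_T = [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]
w = solve2(A_T, p)

primal_value = sum(pj * xj for pj, xj in zip(p, x_star))
dual_value = sum(wi * vi for wi, vi in zip(w, v))
```

In this instance both factors are fully used and both factor prices are positive, so the rule that an unused factor has a zero price holds trivially, and the value of output equals the value of the factors.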
Wald's original proof is quite tedious because he obviously could not utilize
the duality theorem. Although we now have a simple proof by Kuhn and DOSSO,
there are some unsatisfactory points in the above formulation. For example, it is
not clear what will guarantee the existence of the inverse demand function and its
continuity.18 No behavioral rule for the consumers or for the producer is stated.
The production set implied from the model is for the entire economy and is not for
each producer. Only a special type of production set is considered. Thus we are
forced to return to a model such as the one discussed in the previous sections of
this chapter. In other words, we first specify the consumption set for each con-
sumer, the production set for each producer, the behavioral rule for each econo-
mic agent, and a competitive equilibrium. Then, using the assumptions on the
consumption set and the production set, and so forth, we want to prove the exis-
tence of an equilibrium. The problem then is no longer one of finding a solution for
the simultaneous equations or inequalities. The stress now lies in the compatibility
of each economic agent's behavior. The following excerpts from Koopmans ([ 17],
p. 60) point this out precisely.
The problem is no longer conceived as that of proving that a certain set of
equations has a solution. It has been reformulated as one of proving that a
number of maximizations of individual goals under interdependent restraints can
be simultaneously carried out.
This is the essence of the modern formulation of the existence question. The first
successful formulation and proof of this problem is due to Arrow and Debreu
[3].19
The essential idea of their paper is to consider the model of competitive
markets as the model of an n-person noncooperative game and to utilize a theory
developed in game theory. Independently, Gale [14] and Nikaido [28] , [29]
presented other proofs of existence at almost the same time. Their proofs and
the proof given in Debreu [11] are quite similar. The development of their proofs
can be obtained easily from our discussion in the previous sections of this chapter,
especially Section D.
Starting with each consumer's consumption set and preference ordering, we
can show that his demand function, x_i(p), is an upper semicontinuous function
of the price vector p, where p includes the price of all commodities including
primary factors. The aggregate demand function is obtained as the sum of individual
demand functions, that is, x(p) = Σ_i x_i(p), assuming the absence of
interactions among consumers' preferences. Since a linear combination of upper
semicontinuous functions is also upper semicontinuous (which can be shown
easily), the aggregate demand function x(p) is also upper semicontinuous.20
Similarly, we can show that the supply function of an individual producer (who is
a profit maximizer) is upper semicontinuous (see Appendix to Section D of this
chapter). Then, assuming no "(technological) externalities" among the producers,
the aggregate supply function, y(p), is the sum of the individual supply functions,
that is, y(p) = Σ_j y_j(p), and y(p) is upper semicontinuous. Here a negative element
of y_j(p) is an input for j. Assuming no externalities among the producers and the
consumers, we write the (aggregate) excess supply function as z(p) = y(p) + x̄ -
x(p), where x̄ is the total supply of resources available in the economy. Then z(p)
is also upper semicontinuous. Assuming free disposability, we write the feasibility
condition as z(p) ∩ Ω ≠ ∅; or, equivalently, there exists a z ∈ z(p) such that z ≥ 0.
We say that [p̂, {x̂_i}, {ŷ_j}] is a competitive equilibrium if

(i) x̂_i ∈ x_i(p̂), ŷ_j ∈ y_j(p̂) for all i and j,

and
(ii) there exists a ẑ ∈ z(p̂) such that ẑ ≥ 0, where ẑ = Σ_j ŷ_j + x̄ - Σ_i x̂_i.

We normalize the price vector p by setting Σ_{i=1}^{n} p_i = 1. This corresponds to
and replaces the Walrasian convention of setting p_1 = 1. The underlying assumption
which makes the normalization possible is the "homogeneity postulate"; that
is, each element of vectors x_i(p)'s and y_j(p)'s [hence z(p) also] is homogeneous
of degree zero [thus z(ap) = z(p), for all p, for any a > 0, a ∈ R, for example], so
that an equilibrium is unaltered if all prices are multiplied by the same positive
constant. Hence, without loss of generality, we can assume that Σ_{i=1}^{n} p_i = 1 or
p_1 = 1 (see footnote 4). Note also that if we wish to choose a particular commodity
(say commodity 1) to be numeraire, then its price should not be zero, at least in
equilibrium (so that we can set p_1 = 1). If there exists a commodity for which the
excess demand is positive whenever its price is zero, regardless of the prices of all
other commodities, then such a commodity is a good candidate for numeraire.
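
The homogeneity postulate and the normalization of prices can be checked on a toy economy. The Python sketch below is an assumption-laden illustration and not the model of the text: it uses a pure exchange economy with two hypothetical Cobb-Douglas consumers (so demand is single-valued and prices must be strictly positive), and the names excess_demand and normalize are invented for the example.

```python
def excess_demand(p, alphas, endowments):
    """Aggregate excess demand of Cobb-Douglas consumers at prices p > 0."""
    z = [0.0] * len(p)
    for alpha, e in zip(alphas, endowments):
        income = sum(pj * ej for pj, ej in zip(p, e))   # value of the endowment
        for j, pj in enumerate(p):
            z[j] += alpha[j] * income / pj - e[j]        # demand minus endowment
    return z


def normalize(p):
    """Rescale p so that its components sum to one."""
    total = sum(p)
    return [pj / total for pj in p]


alphas = [(0.3, 0.7), (0.6, 0.4)]        # expenditure shares (hypothetical)
endowments = [(1.0, 0.0), (0.0, 1.0)]    # initial resources (hypothetical)

p = [2.0, 3.0]
z_raw = excess_demand(p, alphas, endowments)
z_scaled = excess_demand([10 * pj for pj in p], alphas, endowments)
z_simplex = excess_demand(normalize(p), alphas, endowments)

# Value of aggregate excess demand at the prevailing prices.
walras = sum(pj * zj for pj, zj in zip(p, z_raw))
```

Multiplying all prices by ten, or rescaling them onto the simplex, leaves the excess demand vector unchanged, and its value at the prevailing prices is zero up to rounding, in line with the normalization argument above.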
Let P = {p: Σ_{i=1}^{n} p_i = 1, p_i ≥ 0, i = 1, 2, ..., n}. From a consideration of the
individual's demand function, we prove that each x_i(p) and each y_j(p) is a convex
set. [The convexity of the set x_i(p) follows from the assumption of the weak convexity
and the continuity of the preference ordering, and the convexity of y_j(p)
follows from the convexity of the production set Y_j.] Here we simply assume this.
Let θ_{ji} be the share of the profits of the jth producer going to the ith consumer.
If all the resources are held by consumers (so that x̄ = Σ_i x̄_i), then each consumer's
budget constraint can be written as

p \cdot x_i \le p \cdot \bar{x}_i + \sum_{j=1}^{k} \theta_{ji} p \cdot y_j

where x_i is the consumption vector of i and y_j is the production vector of j.
Summing over i and recalling that Σ_i θ_{ji} = 1, we obtain

p \cdot x \le p \cdot y + p \cdot \bar{x},  where x = \sum_i x_i and y = \sum_j y_j
This corresponds to Walras' Law.21 We will now use the following lemma, which is
proved independently in the literature by Gale [ 14] and Nikaido [ 28] .22 The
lemma, which practically is the proof of the existence of competitive equilibrium,
can easily be proved if we use Kakutani's fixed point theorem. (For such a proof,
see Debreu [ 11 ] , pp. 82-83, and Nikaido [31 ], pp. 266-267).23
Lemma 2.E.1 (Gale, Nikaido): Let P be the (n - 1)-unit simplex in R^n, that is, P =
{p: p ∈ R^n, Σ_{i=1}^{n} p_i = 1, p ≥ 0}, and let S be a nonempty compact subset of R^n. Let F
be an upper semicontinuous function from P to S, such that F(p) is nonempty and convex
for all p ∈ P and p·F(p) ≥ 0 [that is, p·z ≥ 0 for all z ∈ F(p)]. Then there exists
a p ∈ P such that F(p) ∩ Ω ≠ ∅.
This lemma is illustrated in Figure 2.31. By the assumption that p·F(p) ≥ 0,
F(p) must be above the line passing through the origin and orthogonal to p. As p
moves in P, F(p) must intersect with Ω.
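
The conclusion of the lemma can be seen at work in a two-good example. The Python sketch below is a hedged illustration under strong simplifying assumptions: the economy is a hypothetical pure exchange economy with Cobb-Douglas consumers, so the excess supply mapping is single-valued, and the price on the simplex is located by bisection, a device chosen for this example and not the fixed point argument used to prove the lemma. The names excess_supply and equilibrium_price are invented here.

```python
def excess_supply(t, alphas, endowments):
    """Excess supply z(p) at p = (t, 1 - t) in a two-good exchange economy."""
    p = (t, 1.0 - t)
    z = [sum(e[j] for e in endowments) for j in range(2)]   # total resources
    for alpha, e in zip(alphas, endowments):
        income = p[0] * e[0] + p[1] * e[1]
        for j in range(2):
            z[j] -= alpha[j] * income / p[j]                # subtract demand
    return z


def equilibrium_price(alphas, endowments, tol=1e-12):
    """Bisect on the price of good 1 until its excess supply vanishes."""
    lo, hi = 1e-9, 1.0 - 1e-9
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if excess_supply(mid, alphas, endowments)[0] < 0:
            lo = mid      # good 1 is in excess demand: raise its price
        else:
            hi = mid      # good 1 is in excess supply: lower its price
    return (lo + hi) / 2


alphas = [(0.3, 0.7), (0.6, 0.4)]        # expenditure shares (hypothetical)
endowments = [(1.0, 0.0), (0.0, 1.0)]    # initial resources (hypothetical)

t = equilibrium_price(alphas, endowments)
z = excess_supply(t, alphas, endowments)
value = t * z[0] + (1.0 - t) * z[1]      # p.z(p), zero by Walras' Law
```

At the price found, the excess supply vector is zero up to rounding, so it lies in the nonnegative orthant as the lemma asserts.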
Now let P be the (n - 1)-unit simplex of price vectors as defined above. Then
for each element p of P, we obtain x_i(p) and y_j(p). Define z(p) as z(p) = Σ_j y_j(p) +
x̄ - Σ_i x_i(p). We see immediately that this z(p) satisfies the assumptions of the
lemma. Hence there exists a p̂ ∈ P such that z(p̂) ∩ Ω ≠ ∅. Let ẑ ∈ z(p̂) with ẑ ≥ 0.
Then there exist x̂_i ∈ x_i(p̂) and ŷ_j ∈ y_j(p̂) such that ẑ ∈ z(p̂), where ẑ = Σ_j ŷ_j +
x̄ - Σ_i x̂_i. This completes the proof of the existence of a competitive equilibrium
in the manner of Gale [14], Nikaido [28], and Debreu [11].

In the above sketch of the proof of the existence of a competitive equilibrium
by Gale, Nikaido, and Debreu, we relied heavily on results obtained in the previous
sections. Hence it was not made explicit what the crucial assumptions are or how
they are used. The reader can check this by going back to the previous sections of
this chapter.24 Now we present a complete proof of the existence of a competitive
equilibrium. Our proof and the formulation are based essentially on McKenzie
[22] and [21]. We do not use the most general version of McKenzie's model; for
our purpose, the proof of a simpler formulation will suffice. Moreover, this will
serve as a guide to McKenzie's rather difficult article [22]. For this purpose the
author is also indebted to McKenzie for his lectures at the University of Rochester.

Figure 2.31. An Illustration of the Gale-Nikaido Lemma.

Before we go on to McKenzie's proof, let us sketch an outline of some of the
important and difficult problems in the proof of existence, which are not clearly or
explicitly stated in the above expositions and proofs. The first thorough recogni-
tion of these problems (and hence the meaning of the assumptions) is due to
Arrow and Debreu [ 3] , whose work remains the standard reference for the
problem.

(i) The survival problem. This is the question of assuring that every consumer can
survive, given the equilibrium conditions. If an equilibrium exists, the equilib-
rium prices of the resources held by some consumer may be so low that he may
not be able to subsist on the income he obtains from his resources. The first
requirement for this problem, of course, is that the aggregate supply set contain
a point which is the sum of the minimal subsistence consumption requirements
for each consumer (otherwise some consumer is bound to die). In terms of the
notation of Section C, this means that there exist x_i ∈ X_i for all i and y ∈ Y
such that x = y + x̄, where x = Σ_i x_i. The second requirement is that each consumer
be able to subsist with the resources (including labor) he holds without
engaging in exchange. This can be guaranteed if each consumer's consumption
set, with his resources added, has an intersection with the aggregate production
set of the economy. In fact, we need a little more. For example, we may
require that not only must such an intersection be nonempty, it must also have
an interior.25 This corresponds to the cheaper-point assumption discussed
in the previous section. Essentially, it guarantees the (upper semi-) continuity
of each consumer's demand function.
(ii) Satiation. When an equilibrium price prevails, some consumer, because the
prices of his resources are very high, may be able to purchase a consumption
bundle such that he is satiated. As we said in the previous section, the nonsatia-
tion assumption is needed to establish the lower semicontinuity of the budget
function (hence the upper semicontinuity of the demand function). Arrow and
Debreu simply assumed that every consumer is nonsatiated in his (somewhat
modified) consumption set. This is a strong assumption. The relaxation of this
assumption is possible and is attempted in the literature (for example, McKenzie
[21],[22]).
(iii) Utility function and the production set. Arrow and Debreu assume the exist-
ence of a continuous utility function for each consumer. McKenzie's formula-
tion is in terms of a preference relation, although his assumptions imply the
existence of a continuous utility function. The crucial assumption in this con-
nection, which is common in all the existence proofs, is the convexity of
individual preferences. Arrow and Debreu [ 3] assume the existence of a fixed
number of firms, each of which has a convex production set.26 McKenzie [22]
assumes that the aggregate production set is a convex cone so that constant
returns to scale prevails in the aggregate. McKenzie does not assume the ir-
reversibility of the production processes, nor does he assume free disposability
of commodities.
(iv) The number of producers.27 In Arrow and Debreu [ 3] and subsequent works
such as Debreu [ 11 ] , it is assumed that the total number of firms (producers)
is fixed (at, say, k). It is well known and can easily be checked that diminishing
returns to scale for an individual producer implies a positive profit, which in
turn should imply that firms enter the market. Constant returns to scale for the
aggregate production set can be justified on the basis of an adjustment in the
number of firms, which are small in size compared to the industry. Diminishing
returns to scale for an individual firm typically occur when there are certain
limitational fixed factors, such as managerial ability or entrepreneurship,
which are not explicitly introduced in the model (and are not marketed). There-
fore, diminishing returns to scale (for each firm) plus a finite fixed set of firms
imply the scarcity of certain commodities (factors) and freezing the assignment
to various production processes of these commodities. (Such a model will not
be useful for exploring possible effects of a redistribution of these resources.)
Under diminishing returns to scale, firms may make profits, which are attribut-
able to payments for the use of such resources as entrepreneurial skills or special
talents of some kind. In McKenzie's model [ 22] , such resources are explicitly
included in the list of commodities (and marketed) and the number of firms need
not be fixed, so that we can safely assume constant returns to scale for the
aggregate production set. McKenzie also shows the concordance of his model
with the usual Hicksian model of a fixed number of firms, each of which has a
closed and convex production set (pp. 66-67).

b. MCKENZIE'S PROOF
Essentially, we follow the proof due to McKenzie [ 22] . In order to under-
stand the principal problems and difficulties involved in the proof, we consider
his simpler case.28
Let x_i be an n-commodity consumption vector for consumer i (i = 1,
2, ..., m) and let X_i be his consumption set, which is assumed to be a subset of
R^n. We adopt the convention that the positive components of x_i represent the
commodities demanded and the negative components represent commodities
supplied by the ith consumer (recall the discussion in Section A for this convention).
We do not take into explicit account a resource vector such as v_i; it is imbedded
in our convention of x_i. Let Y be the aggregate production set. We now state
and explain the assumptions which will be used in the present proof of the existence
of a competitive equilibrium.

(i) Assumptions on consumption sets.

(A-1) The set X_i is convex, closed, and bounded.
(A-2) The set X_i is totally quasi-ordered by a strictly convex and continuous
preference ordering ≿_i.
REMARK: For discussions of the compactness of X_i, see Debreu [11],
Arrow and Debreu [3], and Nikaido [30]. They show how we can restrict
our attention to a compact consumption set. The crucial part of this assumption
is that X_i is bounded from below.
REMARK: We recall the following definitions:
(strict convexity of ≿_i) Preference ordering ≿_i is called strictly convex
if X_i is convex and x'_i ≿_i x_i and x'_i ≠ x_i imply [t x'_i + (1 - t) x_i] ≻_i x_i, 0 < t < 1.
(continuity of ≿_i) Preference ordering ≿_i is said to be continuous if for
any sequences {x_i^q}, {x'_i^q} in X_i with x_i^q → x_i and x'_i^q → x'_i, x_i^q ≿_i x'_i^q for
all q implies x_i ≿_i x'_i.

(ii) Assumptions on the aggregate production set Y.

(A-3) The set Y is a closed convex cone.

(A-4) Y ∩ Ω = {0} (the impossibility of the Land of Cockaigne).
REMARK: Assumption (A-3) does not necessarily imply that each Y_j is convex.
It means constant returns to scale for the aggregate production set.29
As mentioned earlier, diminishing returns to scale for a particular producer
(firm) can be subdued in the aggregate by increasing the number of firms in
which the entrepreneurial factor is not private to the firm. The absence of
technological external economies and diseconomies (interactions among
production processes) is assumed here. The absence of Marshallian external
economies and diseconomies is also assumed. The Marshallian externalities
are due to a change in the size of an industry; hence they are external to each
firm, but internal to the industry or the economy. Such externalities should
be distinguished from the purely technological externalities. Note also that
(A-3) and (A-4) do not assume the free disposability nor the irreversibility of
production.
REMARK : From (A-3) and (A-4), the total profit in the economy is zero (re-
call our discussion of activity analysis, Chapter 0, Section C). We can sup-
pose that each consumer receives zero shares of profit income. Or we can
suppose that the profit share from a particular producer to a particular con-
sumer is simply the return to the resources offered by the consumer to the
producer. In any case, we suppose that each consumer's income is restricted
to the value of the resources that he offers to the market. Hence his budget
constraint can be expressed by p·x_i ≤ 0, x_i ∈ X_i, if price p ∈ R^n prevails.

Definition: The budget set for the ith consumer, denoted by H_i(p), is defined by

H_i(p) = \{ x_i : p \cdot x_i \le 0, x_i \in X_i \}

We assume H_i(p) is nonempty for all possible p. Recall our discussion on this
assumption in Section D and its appendix.
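
A membership test for the budget set makes the sign convention concrete. The Python sketch below rests on assumptions introduced only for illustration: the consumption set is taken to be a box (one simple convex, closed, and bounded set), the two commodities are labeled food and labor, and the function name in_budget_set and all numbers are hypothetical.

```python
def in_budget_set(x, p, lower, upper):
    """Return True when x lies in H(p) = {x in X : p.x <= 0}, with X a box.

    Positive components of x are quantities demanded and negative
    components are quantities supplied, so p.x <= 0 says that purchases
    are paid for by the value of what the consumer supplies.
    """
    in_consumption_set = all(lo <= xj <= hi for xj, lo, hi in zip(x, lower, upper))
    value = sum(pj * xj for pj, xj in zip(p, x))
    return in_consumption_set and value <= 0


# Food can be consumed between 0 and 5 units; up to 8 units of labor
# can be supplied (recorded as a negative quantity).
lower, upper = (0.0, -8.0), (5.0, 0.0)
p = (2.0, 1.0)   # price of food, wage rate

affordable = in_budget_set((1.0, -3.0), p, lower, upper)    # value 2 - 3 = -1
too_costly = in_budget_set((2.0, -3.0), p, lower, upper)    # value 4 - 3 = 1
outside_set = in_budget_set((1.0, -9.0), p, lower, upper)   # labor beyond the bound
```

The third case shows why the consumption set matters: the bundle is affordable in value terms but requires more labor than the consumer can supply.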

Definition: Define the sets C_i(p) and C(p) by

C_i(p) = \{ x'_i : x'_i \succsim_i x_i for all x_i \in H_i(p) \}

C(p) = \sum_{i=1}^{m} C_i(p)

The set C_i(p) is the ith upper contour set under price p and C(p) is the economy's
upper contour set under price p.
The concept of the upper contour set, C_i(p), is illustrated in Figure 2.32.
The dotted lines indicate the indifference curves of the ith consumer.

Definition: We say that the ith consumer is satiated at x_i if x_i ≿_i x'_i for all x'_i ∈ X_i
and that the ith consumer is satiated at price p if x_i ≿_i x'_i for all x_i ∈ C_i(p) and all x'_i ∈ X_i.
(iii) Assumptions relating consumption and production sets.
(A-5) The set X_i ∩ Y has an interior point for all i.
(A-6) Either (1) no consumer is satiated at p, or (2) if some consumer is satiated at
p, then C(p) ∩ Y = ∅.
REMARK: Assumption (A-5) guarantees that every consumer can supply a
positive amount of every (unproduced) commodity to the producers.30 Thus
every consumer always has some income, given nonzero prices, so that his
budget set contains a point in his consumption set, and he can trade with
others. Assumption (A-5) guarantees the subsistence of every consumer
and it also corresponds to the cheaper-point (minimum-wealth) assumption
used in Sections C and D. This assumption, (A-5), is illustrated in Figure 2.33.
REMARK: Assumption (A-6) says that if some consumer is satiated while
trading at price p, then the total demand at p will exceed the possible produc-
tion. This concept is illustrated in Figure 2.34, in which we assume that there
is only one consumer in the economy. In this diagram, point x represents a
point of satiation. In other words, if the price of a certain commodity be-
comes low enough (relative to other commodities), a consumer may be able
to purchase a large quantity of that good (in exchange for other commodities),
and thus he may be satiated with that good. Assumption (A-6) says that,
if this occurs, the demand for that good is beyond the society's productive
capacity. Therefore (A-6), in effect, precludes such a possibility. Analytically,
(A-6) corresponds to the nonsatiation assumption of demand theory.

Figure 2.32. An Illustration of C_i(p). [C_i(p) = the shaded area.]

Figure 2.33. An Illustration of (A-5).
We are now ready to start the proof of the existence of a competitive equilib-
rium. First we define competitive equilibrium (in the usual manner).

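Definition (competitive equilibrium): An array of vectors
$[\{\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_m\}, \hat{y}, \hat{p}]$
is called a competitive equilibrium if the following hold:

(i) $\hat{x}_i \in C_i(\hat{p}) \cap H_i(\hat{p})$, $i = 1, 2, \ldots, m$.

(ii) $\hat{y} \in Y$, $\hat{p} \cdot \hat{y} = 0$, and $\hat{p} \cdot y \le 0$ for all $y \in Y$.

(iii) $\sum_{i=1}^{m} \hat{x}_i = \hat{y}$.

Figure 2.34. An Illustration of (A-6). (Axis label: Food.)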

REMARK: Condition (i) is the familiar one of a consumer maximizing his
satisfaction over his budget set ("demand condition"). Condition (iii) is the
requirement that total demand be equal to total supply. This is stated as an
equality; hence no free disposability is assumed, and negative prices are
allowed. Condition (ii) is the familiar zero profit condition of a competitive
economy. If there exists a $y' \in Y$ such that $\hat{p} \cdot y' > 0$, then consumers, as
owners of resources, could receive higher returns for their resources by
offering some of their resources at a production point $\alpha y'$ for some $\alpha > 0$
($\alpha$ can be less than one, of course). Hence some consumers can be made
better off by engaging in such trades. Thus the situation would not be stable.
On the other hand, if $\hat{p} \cdot \hat{y} < 0$, some returns would be reduced.
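REMARK: Note that $\hat{p} \cdot \hat{y} = 0$ and $\hat{p} \cdot y \le 0$ for all $y \in Y$. If we suppose
that the number of the producers in the economy is fixed ($j = 1, 2, \ldots, k$),
and that there are no externalities among the producers, then we can write
$y = \sum_{j=1}^{k} y_j$ and $\hat{y} = \sum_{j=1}^{k} \hat{y}_j$, where $y_j, \hat{y}_j \in Y_j$, $j = 1, 2, \ldots, k$.
Then $\hat{p} \cdot y \le \hat{p} \cdot \hat{y} = 0$ for all $y \in Y$ means

$$\sum_{j=1}^{k} \hat{p} \cdot y_j \le \sum_{j=1}^{k} \hat{p} \cdot \hat{y}_j = 0 \quad \text{for all } y_j \in Y_j,\ j = 1, 2, \ldots, k$$

Fix $j = j_0$ and put $y_j = \hat{y}_j$ for all j except $j = j_0$. Then $\hat{p} \cdot y_{j_0} \le \hat{p} \cdot \hat{y}_{j_0}$ for all
$y_{j_0} \in Y_{j_0}$. Since the choice of $j_0$ is arbitrary, this shows the profit maximization
of each producer.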
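
The decomposition argument in the remark above can be checked numerically on a toy case. The following is a minimal sketch, assuming small finite production sets (the sets `Y1`, `Y2`, the price vector `p`, and the function names are hypothetical illustrations, not taken from the text, and finite sets are of course not the production sets assumed in this section). It finds a plan that maximizes aggregate profit by brute force and then confirms that each component of that plan maximizes profit over its own set.

```python
from itertools import product

def dot(p, y):
    """Inner product p . y."""
    return sum(p_k * y_k for p_k, y_k in zip(p, y))

def aggregate_max(p, production_sets):
    """Brute-force maximum of p . y over Y = Y_1 + ... + Y_k.

    Returns the maximal aggregate profit and one plan (y_1, ..., y_k)
    that attains it.
    """
    best_value, best_plan = None, None
    for plan in product(*production_sets):
        value = sum(dot(p, y_j) for y_j in plan)
        if best_value is None or value > best_value:
            best_value, best_plan = value, plan
    return best_value, best_plan

def individually_maximal(p, plan, production_sets):
    """True if each y_j in the plan maximizes p . y over its own Y_j."""
    return all(
        dot(p, y_hat) >= max(dot(p, y) for y in Y_j)
        for y_hat, Y_j in zip(plan, production_sets)
    )

# Hypothetical data: good 1 is labor (an input, hence negative), good 2 is food.
Y1 = [(0.0, 0.0), (-1.0, 1.0), (-2.0, 1.5)]
Y2 = [(0.0, 0.0), (-1.0, 0.5), (-3.0, 3.0)]
p = (1.0, 1.0)

value, plan = aggregate_max(p, [Y1, Y2])
print(value, plan, individually_maximal(p, plan, [Y1, Y2]))
```

In this example the maximal aggregate profit is zero, in line with the zero profit condition (ii), and the check on individual maximization succeeds, as the remark predicts.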
In order to prove the existence of a competitive equilibrium, it is sufficient to
confine our attention to a price vector $p$ which satisfies condition (ii) of a
competitive equilibrium.

Definition: The polar cone of Y, denoted by $Y^*$, is defined by

$$Y^* \equiv \{p : p \cdot y \le 0 \text{ for all } y \in Y\}$$

Let $\bar{x}_i \in$ interior $(X_i \cap Y)$, $i = 1, 2, \ldots, m$. This exists for all i from (A-5).
Write $\sum_{i=1}^{m} \bar{x}_i \equiv \bar{x}$; then $\bar{x} \in$ interior Y.

Definition: We define P, called the normalized polar cone of Y, by (see footnote 31)

$$P \equiv \{p : p \in Y^*,\ p \cdot \bar{x} = -1\}$$

Figure 2.35. An Illustration of Y* and P.

REMARK: The normalization of the price vector is done for convenience.
Since some prices can be negative, we cannot use a more customary normalization
such as $\sum_{i=1}^{n} p_i = 1$. It turns out that the above normalization is
convenient for the present analysis.

Lemma 2.E.2: The set P is convex and compact.

PROOF: We prove this in three steps.

(i) (Convexity) Let $p, p' \in P$, and let $p'' \equiv tp + (1 - t)p'$ ($0 \le t \le 1$). Since $Y^*$
is a convex cone, $p'' \in Y^*$. But $p \cdot \bar{x} = -1$ and
$p' \cdot \bar{x} = -1$ imply $p'' \cdot \bar{x} = -1$. Hence $p'' \in P$. Therefore P is convex.
(ii) (Closed) Let $p^q \to p$, $p^q \in P$; $p^q \cdot y \le 0$ implies $p \cdot y \le 0$ for $y \in Y$ (as a
result of the continuity of inner product). Hence $p \in Y^*$. $p^q \cdot \bar{x} = -1$
implies $p \cdot \bar{x} = -1$. Hence $p \in P$. Therefore P is closed.
(iii) (Bounded) Suppose there exists a sequence $\{p^q\}$ such that $\|p^q\| \to \infty$,
$p^q \in P$. Consider $p^q / \|p^q\| \equiv \tilde{p}^q$. Then $\tilde{p}^q \in Y^*$, since $Y^*$ is a convex
cone and $p^q \in Y^*$. Moreover, $\tilde{p}^q$ is an element of the intersection of
$Y^*$ and the (n - 1)-dimensional unit sphere. That is, $\tilde{p}^q$ is an element of
a compact set. Hence there exists a subsequence of $\{\tilde{p}^q\}$, say $\{\tilde{p}^s\}$,
such that $\tilde{p}^s \to \tilde{p}$, where $\|\tilde{p}^s\| = \|\tilde{p}\| = 1$. Moreover, $\tilde{p} \in Y^*$, for $Y^*$
is closed. Consider $\tilde{p}^s \cdot \bar{x}$. Since $p^s \cdot \bar{x} = -1$ (∵ $p^s \in P$), $\tilde{p}^s \cdot \bar{x} = -1/\|p^s\|$.
Then $\|p^s\| \to \infty$ implies that $\tilde{p}^s \cdot \bar{x} \to 0$ as $s \to \infty$, that is,
$\tilde{p} \cdot \bar{x} = 0$. But this is a contradiction, for $\tilde{p} \in Y^*$, $\tilde{p} \ne 0$, and $\bar{x} \in$ interior Y
mean $\tilde{p} \cdot \bar{x} < 0$. Therefore no such sequence $\{p^q\}$ can exist. Thus P is
bounded, so that it is compact. (Q.E.D.)

Definition: The ith demand function, denoted by $f_i(p)$, is defined by

$$f_i(p) \equiv C_i(p) \cap H_i(p), \quad p \in P$$

The following lemma was proved in Section D.

Lemma 2.E.3: Under (A-1), (A-2), and (A-5),

(i) The function $f_i(p)$ is single-valued and continuous on P.
(ii) If the ith consumer is not satiated at $x_i = f_i(p)$, then $p \cdot f_i(p) = 0$.
(iii) The function $f_i(p)$ is positively homogeneous of degree zero.

Let z be a point which is not an interior point of Y, and consider a chord joining
$\bar{x}$ and z: $t\bar{x} + (1 - t)z$, $0 \le t \le 1$. Now consider the minimum of t subject to
$t\bar{x} + (1 - t)z \in Y$. Since t varies over the (closed) unit interval [0, 1], there exists
a t in [0, 1] for which t achieves its minimum. Denote this minimum by $t_z$. In other
words, we define $t_z \equiv \min t$ such that $t\bar{x} + (1 - t)z \in Y$, $0 \le t_z < 1$. Now consider
a function h(z) from such $z \notin$ interior Y into Y by

$$h(z) \equiv t_z \bar{x} + (1 - t_z) z$$

This function h(z) is illustrated in Figure 2.36.
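
The construction of $t_z$ and h(z) can be illustrated numerically. The following is a minimal sketch, assuming that Y is convex and closed and is available only through a membership test; the particular set `in_Y` (a half-space), the points `x_bar` and `z`, and the function names are hypothetical choices made for illustration. Because Y is convex and contains $\bar{x}$, the feasible values of t form an interval ending at 1, so the minimum can be located by bisection.

```python
def in_Y(y):
    """Hypothetical production set: the closed half-space y1 + y2 <= 0."""
    return y[0] + y[1] <= 0.0

def chord_point(t, x_bar, z):
    """The point t * x_bar + (1 - t) * z on the chord joining x_bar and z."""
    return [t * a + (1.0 - t) * b for a, b in zip(x_bar, z)]

def t_min(x_bar, z, member, tol=1e-10):
    """Approximate the smallest t in [0, 1] with t*x_bar + (1-t)*z in Y.

    Assumes Y is convex and closed and x_bar lies in Y, so that the
    feasible t form an interval [t_z, 1].
    """
    if member(z):
        return 0.0          # z itself is in Y, so the minimum is t = 0
    lo, hi = 0.0, 1.0       # lo is infeasible, hi (the point x_bar) is feasible
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if member(chord_point(mid, x_bar, z)):
            hi = mid
        else:
            lo = mid
    return hi

def h(x_bar, z, member):
    """The map h(z) = t_z * x_bar + (1 - t_z) * z."""
    return chord_point(t_min(x_bar, z, member), x_bar, z)

x_bar = (-1.0, -1.0)        # an interior point of the hypothetical Y
z = (1.0, 1.0)              # a point outside Y
print(t_min(x_bar, z, in_Y), h(x_bar, z, in_Y))
```

For a point z on the boundary of Y the function returns t = 0, so h leaves such points unchanged; for a point outside Y it returns (approximately) the point where the chord first enters Y.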

Lemma 2.E.4: The function h(z) is continuous for $z \notin$ interior Y.

PROOF: Consider a sequence $\{z^q\}$, $z^q \notin$ interior Y, such that $z^q \to z$. Let $y^q \equiv h(z^q)$.
Suppose h(z) is not continuous. That is, suppose $y^q \to y \equiv h(z)$ does not hold.
Then by the compactness of the unit interval (∵ $t_z$ varies in [0, 1]), there is a
subsequence $\{y^s\}$ such that $y^s \to y'$, where $y' \ne y \equiv h(z)$, and $y' = t'\bar{x} +
(1 - t')z$ for some $t'$, $0 \le t' \le 1$. Since $\bar{x} \in$ interior Y, and $t^q$ and $t_z$ are all
less than 1, $y^q \ne \bar{x}$, $y' \ne \bar{x}$, and $y \ne \bar{x}$.
By the definition of $t_z$, we cannot have $t' < t_z$. Hence $t' \ge t_z$. But $t' = t_z$
implies $y = y'$, so that we have $t' > t_z$. This implies $y' = \theta\bar{x} + (1 - \theta)y$ for
some $\theta$, where $0 < \theta \le 1$. Since $\bar{x} \in$ interior Y, there exists an open ball
$B_\epsilon(\bar{x})$ about $\bar{x}$ with radius $\epsilon > 0$, such that $B_\epsilon(\bar{x}) \subset Y$. Let $w \in B_\epsilon(\bar{x})$ and
define $w' \equiv \theta w + (1 - \theta)y$. Then $w' \in Y$, since both w and y are in Y. Hence
we have an open ball $B_{\theta\epsilon}(y')$ about $y'$ with radius $\theta\epsilon > 0$, such that
$B_{\theta\epsilon}(y') \subset Y$. Hence $y'$ is an interior point of Y. Therefore $y^q \in$ interior Y for
large q. This contradicts the definition $y^q \equiv h(z^q)$. Thus we have $t' = t_z$, or
$y = y'$. (Q.E.D.)

Figure 2.36. An Illustration of h(z). (Annotation: the direction of the minimization of t.)
REMARK: The above proof is illustrated in Figures 2.37 and 2.38.

Definition: $g(z) \equiv \{p : p \in P \text{ and } p \cdot z = 0\}$, where $z \in$ boundary Y.

Lemma 2.E.5: The set g(z) is convex and g(z) is an upper semicontinuous function
of z.
PROOF: Convexity follows from the convexity of P and the linearity of
the inner product $p \cdot z$ in $p$. (Check this for yourself; it is a simple exercise using
the definition of a convex set.)
To prove the upper semicontinuity of g(z), consider a sequence $\{z^q\} \subset$
boundary Y such that $z^q \to z$. Then form a sequence $p^q \in g(z^q)$. We have to
show that if $p^q \to p$, then $p \in g(z)$. For this, simply observe:
(i) Since P is closed, $p \in P$.
(ii) By the continuity of inner product, $p \cdot z = 0$
(∵ $p^q \cdot z^q = 0$ implies $p \cdot z = 0$). Hence, $p \in g(z)$. (Q.E.D.)

Figure 2.37. An Illustration of $y'$ and $B_\epsilon(\bar{x})$.

Figure 2.38. An Illustration of $B_\epsilon(\bar{x})$ and $B_{\theta\epsilon}(y')$.

Definition: $F \equiv g \circ h \circ f$, where $f \equiv \sum_{i=1}^{m} f_i$.
REMARK: The function F is illustrated by Figure 2.40.

Theorem 2.E.3: Under assumptions (A-1) through (A-6), there is a competitive
equilibrium.
PROOF: If $f(p) \in$ interior Y, then by (A-6) no consumer is satiated. Then by
Lemma 2.E.3 (ii), $p \cdot f(p) = 0$. This contradicts the fact that $p \in P$, since
$p \in Y^*$, $p \ne 0$, and $f(p) \in$ interior Y imply $p \cdot f(p) < 0$. Hence
$f(p) \notin$ interior Y for $p \in P$.
By Lemma 2.E.3 (i), Lemma 2.E.4, and Lemma 2.E.5, F is upper semicontinuous.
Also F maps $p \in P$ to a convex subset $F(p) \subset P$. Hence by
Kakutani's fixed point theorem, there exists a $\hat{p} \in F(\hat{p})$. Consider
$[\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_m, \hat{y}, \hat{p}]$ where $\hat{x}_i \equiv f_i(\hat{p})$, $i = 1, 2, \ldots, m$, and
$\hat{y} \equiv h(f(\hat{p})) = \hat{t}\bar{x} + (1 - \hat{t})\hat{x}$, where $\hat{x} \equiv \sum_{i=1}^{m} \hat{x}_i$, $0 \le \hat{t} < 1$.
By the definition of g, $\hat{p} \cdot \hat{y} = 0$.
Also $\hat{p} \cdot y \le 0$ for all $y \in Y$, since $\hat{p} \in Y^*$ from $\hat{p} \in P$. Hence the profit
condition, (ii), of a competitive equilibrium is satisfied. By the definition of
$f_i$, the demand condition (i) of a competitive equilibrium is satisfied.
Now we want to prove $\hat{t} = 0$, which implies condition (iii) of a competitive
equilibrium. Since $\hat{x}_i \in H_i(\hat{p})$, $\hat{p} \cdot \hat{x}_i \le 0$ (the budget condition).
Thus $\hat{p} \cdot \hat{x} \le 0$. Moreover, $\hat{p} \cdot \hat{y} = 0$, but $\hat{p} \cdot \hat{y} = \hat{p} \cdot [\hat{t}\bar{x} + (1 - \hat{t})\hat{x}]$.
Since $\hat{p} \cdot \bar{x} < 0$, $\hat{p} \cdot \hat{x} \le 0$, and $0 \le \hat{t} < 1$, this implies $\hat{t} = 0$. Therefore
$\hat{y} = \hat{x}$ and condition (iii) of competitive equilibrium is satisfied. (Q.E.D.)

Figure 2.39. An Illustration of g(z).

Figure 2.40. An Illustration of F(p).

FOOTNOTES

1. Roughly the order is as follows: (a) (multiperson) two-commodity pure exchange
economy; (b) (multiperson) multicommodity pure exchange economy; (c) introduction
of production; (d) introduction of capital goods; and finally (e) introduction of
money.
2. For the sake of simplicity, we leave out an explicit treatment of intermediate
goods. The modification of our illustration with intermediate goods should be an
easy exercise for the reader. See Walras [42] and Morishima [25].
3. See William Jaffe's "Translator's Notes" in Walras [42], pp. 549-553.
4. Assume that the functions with p and w as arguments (that is, $v_i$, $x_j$, $a_{ij}$) are
homogeneous of degree zero. Then $x_j(p, w)$, for example, can be written as
$x_j(1, p_2/p_1, \ldots, p_n/p_1, w_1/p_1, \ldots, w_m/p_1)$. By redefining symbols, we may write
$(1, p_2/p_1, \ldots, w_m/p_1)$ as (p, w). This amounts to setting $p_1 = 1$.
5. We can easily show that Walras' Law holds if and only if the budget constraint of each
consumer is satisfied as an equality. In general, a consumer can be satiated so that he
may not spend all his income. Then the Walras law identity is replaced by an inequality
such as $\sum_j p_j x_j \le \sum_i w_i v_i$. Since the substance of Walras' Law is the
individual's budget constraint, it in essence says that, whatever the market prices
may be, the amount people wish to spend is equal to (or does not exceed) the amount
they desire to earn. It does not say that people spend all their income. The budget
constraint is a constraint and not a result of any choice.
6. The problem may be stated as follows: Let $f_i(x_1, \ldots, x_n, z_1, \ldots, z_m) = 0$, $i = 1, 2, \ldots, n$,
be the equilibrium system, where $x_1, \ldots, x_n$ are the "endogenous" variables and
$z_1, \ldots, z_m$ are the "exogenous" variables. The problem is whether we can obtain
$x_i = x_i(z_1, \ldots, z_m)$ such as to be consistent with the above set of equations. If we can, these
$x_i$'s define the equilibrium values of the endogenous variables and we call such
$(x_1, \ldots, x_n)$ a solution of the above system. In the above, the number of equations is taken
to be equal to the number of endogenous variables, for otherwise we cannot guarantee
the existence of a solution even when the $f_i$'s are all linear (affine). The well-known
implicit function theorem guarantees the (local) existence of a unique solution in the
neighborhood of a point $(x^0, z^0)$, if certain assumptions are satisfied, especially that
the Jacobian matrix $[\partial f_i / \partial x_j]$ evaluated at $(x^0, z^0)$ is nonsingular. The assumptions
of this theorem guarantee the global existence of a unique solution when the $f_i$'s
are all linear (affine). However, these assumptions are not sufficient for the global
existence of a solution for the nonlinear case.
7. We note that Walras clearly realized the possibility of the nonexistence of equilibria
for the two-commodity economy ([42], section 64, lesson 7).
8. In certain cases, the equality of the number of equations and variables is not even
a necessary condition. An example is $x^2 + y^2 = 0$, in which the number of equations
(=1) is different from the number of variables (=2), and yet there exists a
unique real solution (that is, x = 0 and y = 0).
9. The above difficulty of the Walrasian system (that is, there is no guarantee that
there exists a solution) was realized after Cassel's exposition [8] of the system. As
a result of the simplicity and the popularity of Cassel's exposition, the system then
became known as the Walras-Cassel system. On the reason why Cassel attracted the
attention of the Austrians, Hicks says, "As is known, there was a phase [in the 1920's] when Cassel's
treatise was displacing those of the "historical" and "Austrian" schools in curricula
of Central European countries: during such struggles its weakness would be carefully
watched," ([15], p. 674). The difficulty discussed above was made clear in the
seminar conducted by Karl Menger (a mathematician and the son of the famous economist
Menger). For the summary of the discussions in Menger's seminar on the
Cassel-Wald system, see Arrow and Debreu [3], pp. 287-289. Strictly speaking,
the Casselian system is quite different from the Walrasian system. Cassel did not pay
any attention to the behavior of each economic agent. He, in fact, proposed to reject
altogether the procedure of deriving an individual's demand function from his
hypothesized utility maximization behavior. Cassel a priori assumed the constancy
of the $a_{ij}$'s and the $v_i$'s.
10. In particular, condition (1) in the above is changed to $\sum_{j=1}^{n} a_{ij} x_j \le v_i$, $i = 1,
2, \ldots, m$. The equality condition for this relation is a very stringent one, if we assume
the $a_{ij}$'s and $v_i$'s are all constants, as in the Casselian system. If, on the other hand,
the $a_{ij}$'s and $v_i$'s are functions of prices as in Walras, the equality assumption is
not as strong as is generally believed. For then the equality can be achieved through
changes in prices. The inequality of condition (1) allows the possibility of an excess
supply of factors. If a factor is in excess supply, its price will be zero. In other words,
the inequality allows the possibility of determining the division of factors into free and
scarce (compare Zeuthen [43]).
11. Wald's work [39] was first presented at Karl Menger's seminar. His article [41]
is the summary of the main results of [39] and [40]. These, together with the results
from Menger's seminar, clearly designate this period as the dawn of modern economics.
Notably, von Neumann's first paper [26] on game theory was published
in 1928, and his paper on the "von Neumann growth model" [27] was published in
1937. The latter paper clearly resembles modern activity analysis and also contains
the basic idea of the duality theorems of linear programming.
12. Note that (8) presupposes that the functions (3) and (4) are globally invertible and
that the supplies of productive factors (the vi's) are completely inelastic with respect
to all prices (p and w). Note also that the factor prices are left out in (8). In general,
these assumptions are not guaranteed, and Wald's procedure of using (8) is illegiti-
mate.
13. The procedure sketched here is the one prescribed by Kuhn [19]. DOSSO's procedure
[13] is a little different from this, although the mathematical structure of the
two procedures is essentially the same. Incidentally, there are some errors in
DOSSO's proof of existence [13]. They are pointed out and corrected by K. Inada.
See his "A Note on the Revision of the Proof of Dorfman, Samuelson, and Solow's
Existence Theorem of General Equilibrium," Economic Studies Quarterly, XIII,
February 1963.
14. If F is an upper semicontinuous function from a compact set X into itself, it can be
shown easily that the image set F(x), $x \in X$, is also compact. See, for example, Berge
[6], section 1 of chapter VI (especially theorems 3 and 4 and the corollary of theorem
7). Note the distinction between his definition and our definition of upper
semicontinuity. See our discussion in Chapter 2, Appendix to Section D.
15. See also C. B. Tompkins, "Sperner's Lemma and Some Extensions," chapter 15 in
Applied Combinatorial Mathematics, ed. by E. F. Beckenbach, New York, Wiley, 1964,
and E. Burger, Introduction to the Theory of Games, Englewood Cliffs, N.J., Prentice-
Hall, 1963 (appendix).
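16. Note that from the duality theorem, $\hat{p} \cdot \hat{x} = \hat{w} \cdot v$ so that we have
$\hat{w} \cdot (v - A \cdot \hat{x}) = \hat{w} \cdot v - (A' \cdot \hat{w}) \cdot \hat{x} = \hat{w} \cdot v - \hat{p} \cdot \hat{x} = 0$, that is, condition (6) is satisfied. (See also
Theorem 1.F.1.) Note also that $\hat{p} \cdot \hat{x} = \hat{w} \cdot v$ implies that $(\hat{p}, \hat{x}, \hat{w})$ satisfies Walras' Law.
17. Suppose not. That is, suppose $(\hat{p}, \hat{x}, \hat{w})$ and $(p^*, x^*, w^*)$ are two different equilibria.
Since $\hat{x}$ maximizes $\hat{p} \cdot x$ for all $x \in X$, we have $\hat{p} \cdot \hat{x} \ge \hat{p} \cdot x$ for all $x \in X$ where
$\hat{p} = f(\hat{x})$. In particular, $\hat{p} \cdot \hat{x} \ge \hat{p} \cdot x^*$. Similarly, $p^* \cdot x^* \ge p^* \cdot \hat{x}$ where $p^* = f(x^*)$,
which implies that $\hat{p} \cdot \hat{x} < \hat{p} \cdot x^*$ from assumption (vi). This contradicts $\hat{p} \cdot \hat{x} \ge \hat{p} \cdot x^*$,
which proves the uniqueness. The discussions of assumption (vi) will be postponed
to the Appendix to Section E of this chapter and Chapter 3, Section E.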
18. DOSSO [13] avoided the use of the inverse demand function.
19. In the issue of Econometrica before the one containing the article by Arrow and
Debreu [3], McKenzie [20] showed the existence of a solution for Graham's model
of world trade, which clearly resembles the model of Walrasian competitive markets.
We may also note that this is probably the first article in economics to use Kakutani's
fixed point theorem.
20. Sonnenschein [35] established the upper semicontinuity of $x_i(p)$ [hence also x(p)]
without assuming the transitivity of the underlying individual preferences. The main
result of [35], as the author puts it, is that "the transitivity of preferences can be
replaced by the convexity of preferences in establishing the existence of demand
functions." (p. 215).
21. Nikaido called the above relation with inequality the Walras law in the general sense,
and the usual Walras law with equality the Walras law in the narrow sense. ([30],
section 45; [31], p. 263)
22. A similar theorem was proved by Debreu [10] and it was also used in the proof of
the existence of a competitive equilibrium. See also [11].
23. The use of Kakutani's fixed point theorem, which is an extension of Brouwer's fixed
point theorem, is not a matter of mere technical convenience. Surprisingly enough,
it can be shown that the Gale-Nikaido lemma conversely implies Brouwer's fixed
point theorem. See Uzawa [38] and Nikaido [31], pp. 268-269. This produces
Uzawa's contention [38] that the existence of equilibria in the Walrasian system is
in a sense equivalent to Brouwer's fixed point theorem. Nikaido then remarked ([31],
p. 270): "The Walrasian general equilibrium theory [Walras, 1874] was published in
the 1870's, while Brouwer's work on fixed points [Brouwer, 1909, 1910] appeared
three decades later. It is therefore no wonder that Walras could not achieve a
mathematical consolidation of the conjecture in the days before the advancement of
topology; he should certainly not be criticized for his failure to achieve a mathematical
solution, but should be admired for his mathematical imagination which
let him formulate this well-posed conjecture."
24. For this purpose, the reader may also be interested in seeing Debreu [11], chapter 5,
for example.
25. See (A-5) of the next subsection and footnote 31. This assumption implies that every
consumer must be able to supply a positive amount of every unproduced commodity
(such as labor). Arrow and Debreu [3] imposed a stronger assumption which re-
quires that every consumer can supply a positive amount of every commodity. The
relaxation of Arrow and Debreu's assumption is seen in McKenzie [21], [22].
In [22], McKenzie introduced the concept of "irreducibility." For a further discussion
on the concept of irreducibility, see Moore [24]. See also Arrow [2], and
J. T. Rader, "Pairwise Optimality and Noncompetitive Behavior," in Papers in Quan-
titative Economics, vol. 1, ed. by J. Quirk, and A. M. Zarley, Lawrence, Kansas,
University of Kansas Press, 1968. In essence, the concept of irreducibility says that
no matter how the consumers are partitioned into two groups, an increase in the
initial assets held by the members of one group can be used to make possible an
allocation which would improve the position of someone in the second group without
damaging the position of anyone else there.
26. As Arrow [1] points out, the convexity of each consumer's preferences and of each
producer's production set are "the empirically most vulnerable" assumptions. How-
ever, the nonconvexity of preferences would have no significant effect as long as each
consumer is small enough compared to the economy. Recall our discussion on the
core in the Appendix to Section C of this chapter. On the other hand, increasing
returns to scale for each firm (which precludes the convexity of each firm's produc-
tion set) over a sufficiently wide range may mean the appearance of large firms and the
failure of the existence of a competitive equilibrium.
27. There is also a problem in the procedure of fixing the number of consumers in the
economy, even if we assume that everybody can survive. But this seems to be much
less serious than the problem that arises in fixing the number of firms. See Koopmans
[17], pp. 64-65.
28. In particular we are concerned with his "special existence theorem," which provides
the core of his proof for a more general case.
29. In proving the "existence of competitive equilibria," no assumptions on each
producer's production set are required (only the assumptions on the aggregate
production set of the total economy are required). This was first shown by Uzawa in
1956 (Stanford Technical Paper No. 40), later published as [37].
30. Note that the origin represents the point of the initial endowments, and the consumption
set $X_i$ represents the set of all possible trade (and consumption) for the ith
consumer. In Figure 2.33, labor is assumed to be the only unproduced commodity.
At point x, a positive amount of the produced commodity, food, is received by
consumer i. This restrictive assumption simplifies the proof in this subsection.
31. Since $\bar{x}_i \in$ interior Y, $p \cdot \bar{x}_i < 0$ for all $p \in P$. This means that the ith consumer is
guaranteed a positive income above subsistence requirements for all $p \in P$. In this
sense, (A-5) takes care of the subsistence problem.

REFERENCES

1. Arrow, K. J., "Economic Equilibrium," International Encyclopedia of Social Sciences,
New York, Macmillan and Free Press, 1968.
2. Arrow, K. J., "The Firm in General Equilibrium," Technical Report, no. 3, Harvard
University, May 1969.
3. Arrow, K. J., and Debreu, G., "Existence of an Equilibrium for a Competitive Economy,"
Econometrica, 22, July 1954.
4. Arrow, K. J., and Hahn, F. H., General Competitive Analysis, San Francisco, Holden
Day, 1971.
5. Aumann, R. J., "Existence of Competitive Equilibria in Markets with a Continuum
of Traders," Econometrica, 34, January 1966.
6. Berge, C., Topological Spaces, New York, Macmillan, 1963 (French, 1959).
7. Brouwer, L. E. J., "Über Abbildung von Mannigfaltigkeiten," Mathematischen
Annalen, 71, 1912.
8. Cassel, G., Theory of Social Economy, tr. by McCabe, London, T. Fisher Unwin, 1923.
9. Debreu, G., "A Social Equilibrium Existence Theorem," Proceedings of the National
Academy of Sciences of the U.S.A., 42, November 1952.
10. Debreu, G., "Market Equilibrium," Proceedings of the National Academy of Sciences of the
U.S.A., 42, November 1956.
11. Debreu, G., Theory of Value, New York, Wiley, 1959.
12. Debreu, G., "New Concepts and Techniques for Equilibrium Analysis," International
Economic Review, 3, September 1962.
13. Dorfman, R., Samuelson, P. A., and Solow, R. M., Linear Programming and Economic
Analysis, New York, McGraw-Hill, 1958, esp. chap. 13.
14. Gale, D., "The Law of Supply and Demand," Mathematica Scandinavica, 3, 1955.
15. Hicks, J. R., "Linear Theory," Economic Journal, LXX, December 1960, reprinted in
Surveys of Economic Theory, Vol. III, prepared for the American Economic Association
and the Royal Economic Society, London, Macmillan, 1967.
16. Kakutani, S., "A Generalization of Brouwer's Fixed Point Theorem," Duke Mathematical
Journal, 8, 1941.
17. Koopmans, T. C., Three Essays on the State of Economic Science, New York, McGraw-Hill,
1957.
18. Koopmans, T. C., and Bausch, A., "Selected Topics Involving Mathematical Reasoning,"
SIAM Review, 1, July 1959.
19. Kuhn, H. W., "On a Theorem of Wald," in Linear Inequalities and Related Systems,
ed. by H. W. Kuhn and A. W. Tucker, Princeton, N.J., Princeton University Press,
1956.
20. McKenzie, L. W., "On Equilibrium in Graham's Model of World Trade and Other
Competitive Systems," Econometrica, 22, April 1954.
21. McKenzie, L. W., "Competitive Equilibrium with Dependent Consumer Preferences," in
Proceedings of the Second Symposium in Linear Programming, ed. by H. A. Antosiewicz,
Washington, D.C., National Bureau of Standards, 1955.
22. McKenzie, L. W., "On the Existence of General Equilibrium for a Competitive Market,"
Econometrica, 27, January 1959.
23. McKenzie, L. W., "On the Existence of General Equilibrium for a Competitive Market: Some
Corrections," Econometrica, 29, April 1961.
24. Moore, J. C., "On Pareto Optima and Competitive Equilibria, Part II. The Existence
of Equilibria and Optima," Krannert Institute Paper, no. 269, April 1970.
25. Morishima, M., "A Reconsideration of the Walras-Cassel-Leontief Model of General
Equilibrium," in Mathematical Methods in the Social Sciences, 1959, ed. by Arrow,
Karlin, and Suppes, Stanford, Calif., Stanford University Press, 1960.
26. von Neumann, J., "Zur Theorie der Gesellschaftsspiele," Mathematischen Annalen,
100, 1928.
27. von Neumann, J., "Über ein Ökonomisches Gleichungssystem und eine Verallgemeinerung des
Fixpunktsatzes," Ergebnisse eines Mathematischen Kolloquiums, 1935-1936 (in
English, "A Model of General Economic Equilibrium," Review of Economic Studies,
VIII, 1945-1946).
28. Nikaido, H., "On the Classical Multilateral Exchange Problem," Metroeconomica, 8,
August 1956.
29. Nikaido, H., "A Supplementary Note to 'On the Classical Multilateral Exchange
Problem,'" Metroeconomica, 9, December 1957.
30. Nikaido, H., Introduction to Sets and Mappings in Modern Economics, tr. by K. Sato,
Amsterdam, North-Holland, 1970 (Japanese original, Tokyo, 1960).
31. Nikaido, H., Convex Structures and Economic Theory, New York, Academic Press, 1968.
32. Quirk, J., and Saposnik, R., Introduction to General Equilibrium Theory and Welfare
Economics, New York, McGraw-Hill, 1968.
33. Scarf, H., "On the Computation of Equilibrium Prices," in Ten Economic Studies in
the Tradition of Irving Fisher, New York, Wiley, 1967.
34. Schlesinger, K., "Über die Produktionsgleichungen der Ökonomischen Wertlehre,"
Ergebnisse eines Mathematischen Kolloquiums, 6, 1933-1934.
35. Sonnenschein, H. F., "Demand Theory without Transitive Preferences," in Preferences,
Utility, and Demand, ed. by J. S. Chipman et al., New York, Harcourt Brace
Jovanovich, 1971.
36. Starr, R. M., "Quasi-Equilibria in Markets with Non-Convex Preferences," Econometrica,
37, January 1969.
37. Uzawa, H., "Aggregate Convexity and the Existence of Competitive Equilibrium,"
Economic Studies Quarterly, XII, January 1962.
38. Uzawa, H., "Walras' Existence Theorem and Brouwer's Fixed Point Theorem," Economic
Studies Quarterly, XIII, March 1962.
39. Wald, A., "Über die Eindeutige Positive Lösbarkeit der Neuen Produktionsgleichungen,"
Ergebnisse eines Mathematischen Kolloquiums, 6, 1933-1934.
40. Wald, A., "Über die Produktionsgleichungen der Ökonomischen Wertlehre,"
Ergebnisse eines Mathematischen Kolloquiums, 7, 1934-1935.
41. Wald, A., "Über Einige Gleichungssysteme der Mathematischen Ökonomie,"
Zeitschrift für Nationalökonomie, 7, 1936 (in English, "On Some Systems of Equations
of Mathematical Economics," Econometrica, 19, October 1951).
42. Walras, L., Elements of Pure Economics (1926 ed.), tr. by W. Jaffe, London, Allen &
Unwin, 1954 (1st ed., 1874).
43. Zeuthen, F., "Das Prinzip der Knappheit, technische Kombination und ökonomische
Qualität," Zeitschrift für Nationalökonomie, 4, 1933.

Appendix to Section E: On the Uniqueness of Competitive Equilibrium

The existence of a competitive equilibrium does not necessarily guarantee its

uniqueness. This has been known to economists since Walras and Marshall. In
fact, the conditions needed to ensure uniqueness are somewhat different from
those needed to ensure existence. The purpose of this Appendix is to review briefly
the uniqueness problem.
Let there be n + 1 commodities in the economy and assume that the 0th
commodity can be chosen as the numeraire.1 Let $f_i(p)$ be the excess demand function
for the ith commodity, $i = 1, 2, \ldots, n$, where $p = (p_1, p_2, \ldots, p_n) \in \Omega^n$ (the
nonnegative orthant of $R^n$) denotes the price vector. It is assumed that the $f_i$'s are
single-valued, continuous, and bounded from below. Define "equilibrium" as
follows.

Definition: If the following conditions are satisfied, $\hat{p} \in \Omega^n$ is said to be an
equilibrium price vector.
(1) $f_i(\hat{p}) \le 0$, $i = 0, 1, 2, \ldots, n$
and
(2) $\hat{p}_i f_i(\hat{p}) = 0$, $i = 0, 1, 2, \ldots, n$ (where $\hat{p}_0 = 1$)

Using Walras's Law, (1) and (2) can be equivalently rewritten (in vector
notations) as2
(3) $f(\hat{p}) \le 0$ and $\hat{p} \cdot f(\hat{p}) = 0$, where $f(p) \equiv [f_1(p), \ldots, f_n(p)]$
We assume that there exists an equilibrium price vector $\hat{p}$.
Needless to say, the value of p under which (1) and (2) hold may not be
unique. International trade theorists, for example, have often encountered the
possibility of multiple equilibria in connection with their simple two-commodity
trade models. This is the situation when two offer curves intersect more than once.
Our question here is: Under what conditions can we guarantee uniqueness? In
Section E, we remarked that Wald [10] proved uniqueness assuming the following
condition:
(R) We have either $p \cdot x < p \cdot x'$ or $p' \cdot x' < p' \cdot x$, where $x \equiv f(p)$ and $x' \equiv f(p')$.
Then the following theorem is almost trivial to prove.

Theorem 2.E.4 (Wald):3 Suppose (R) holds; then the equilibrium is unique.
PROOF: Suppose there exist two equilibrium price vectors $\hat{p} \ge 0$ and $p^* \ge 0$
such that $\hat{p} \ne p^*$. Write $\hat{x} \equiv f(\hat{p})$ and $x^* \equiv f(p^*)$. By (R), we have either
$\hat{p} \cdot \hat{x} < \hat{p} \cdot x^*$ or $p^* \cdot x^* < p^* \cdot \hat{x}$. But since $\hat{p}$ and $p^*$ are equilibrium price
vectors, it must be that $\hat{p} \cdot \hat{x} = 0$ and $p^* \cdot x^* = 0$, so we have either $\hat{p} \cdot x^* > 0$
or $p^* \cdot \hat{x} > 0$. But $\hat{p} \cdot x^* > 0$ is impossible for $x^* \le 0$ (∵ $p^*$ is an equilibrium
price vector). Hence $p^* \cdot \hat{x} > 0$, which is also impossible, since $\hat{x} \le 0$ (∵ $\hat{p}$ is
an equilibrium price vector). (Q.E.D.)
REMARK: If x is interpreted as an excess demand vector for all the commodities
of an individual consumer, then (R) refers to Samuelson's well-known
weak axiom of revealed preference.4 However, as remarked before,
the plausibility of the weak revealed preference axiom for the entire economy
is to be questioned.5
As a matter of fact, the most appropriate uniqueness theorem has not been
fully explored. However, important mathematical theorems have recently been
made available by Gale and Nikaido [3] and others in connection with their
studies on the factor price equalization theorem,6 and this line of thought turns out
to be useful in the study of the uniqueness problem (Nikaido [7], p. 338).
To consider this, let f(p) be a differentiable function from P into $R^n$ where P is
a "region" in $R^n$ (that is, an open connected subset of $R^n$). For the time being (that
is, as long as we are concerned with mathematical theorems), we do not adhere to
the economic interpretation of f(p) (such as "excess demand" vector). Let $F(p) \equiv
[f_{ij}]$ be the Jacobian matrix of f(p), that is, $f_{ij} \equiv \partial f_i(p)/\partial p_j$.

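Definition: An $n \times n$ matrix $A = [a_{ij}]$ is said to be Hicksian if it has all the
principal minors of odd order negative and those of even order positive. In other
words,

(4) $a_{ii} < 0$, $\begin{vmatrix} a_{ii} & a_{ij} \\ a_{ji} & a_{jj} \end{vmatrix} > 0$, $\begin{vmatrix} a_{ii} & a_{ij} & a_{ik} \\ a_{ji} & a_{jj} & a_{jk} \\ a_{ki} & a_{kj} & a_{kk} \end{vmatrix} < 0$, $\ldots$,

where $i, j, k, \ldots = 1, 2, \ldots, n$.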
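
The definition can be tested directly on small matrices. The following is a minimal sketch (the function names and the example matrices are illustrative assumptions, not from the text): it enumerates every principal submatrix, computes its determinant by cofactor expansion, and checks that minors of order k have the sign of $(-1)^k$. Cofactor expansion is used only for simplicity; it is practical for small n only.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Delete row 0 and column j to form the minor.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def is_hicksian(a):
    """True if every principal minor of order k has the sign of (-1)**k."""
    n = len(a)
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            sub = [[a[i][j] for j in idx] for i in idx]
            if (-1) ** k * det(sub) <= 0:
                return False
    return True

# Hypothetical examples.
A = [[-2, 1], [1, -2]]    # minors: -2, -2, and det = 3
B = [[-1, 2], [2, -1]]    # minors: -1, -1, and det = -3
print(is_hicksian(A), is_hicksian(B))
```

Note that the check covers all principal minors, not only the leading ones, as the definition in (4) requires.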

Then the following proposition is available from Gale-Nikaido [3] and
Nikaido [7].

Lemma 2.E.6: Assume the region P of $R^n$ is rectangular and suppose the Jacobian
matrix F(p) is Hicksian for all $p \in P$. Then
(i) The mapping f(p) is one-to-one for all $p \in P$.
(ii) The inequalities
(5) $(p_i - a_i)[f_i(p) - f_i(a)] \ge 0$, $i = 1, 2, \ldots, n$
have only the trivial solution p = a.

With the help of this lemma, the following uniqueness theorem is easy to
prove.
Theorem 2.E.5 (Nikaido): Let $f(p)$ be an excess demand vector as considered above,
where $f$ is differentiable and defined on a rectangular region $P$ of $\Omega^n$. Then the
equilibrium price vector is unique if the Jacobian matrix $F(p)$ is Hicksian.

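PROOF: Suppose $\hat{p}$ and $p^*$ are two equilibrium price vectors. In the
inequalities (5), let $a = \hat{p}$ and $p = p^*$. Then the LHS of (5) can be rewritten as
(6) $(p_i^* - \hat{p}_i)[f_i(p^*) - f_i(\hat{p})] = -\hat{p}_i f_i(p^*) - p_i^* f_i(\hat{p}) \ge 0$
by definition of equilibrium (at an equilibrium price vector each product $p_i f_i(p)$
vanishes, while prices are nonnegative and excess demands are nonpositive).
Hence by (ii) of Lemma 2.E.6, $p^* = \hat{p}$.
(Q.E.D.)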
REMARK: Suppose that the equilibrium relation is expressed in the
Walras-Cassel equality form
(7) $f_i(p) = 0, \quad i = 1, 2, \ldots, n$
and suppose that such a $p$ with $p > 0$ exists. Then (i) of Lemma 2.E.6 provides
the uniqueness of $p$ immediately.
Unless some economic justifications are found for the condition that F(p) is
Hicksian, Theorem 2.E.5 remains essentially a mathematical theorem. Here there
is still much to be explored. The reader may find his own uniqueness theorems by
exploring further economic interpretations of Theorem 2.E.5.
To illustrate such a line of thought, let us quote the following result in the
literature, from which we shall prove one uniqueness theorem.

Lemma 2.E.7: Let $A = [a_{ij}]$ be an n x n matrix with $a_{ij} \ge 0$ for all $i \ne j$. Then A is
Hicksian if and only if
(8) There exists an $x > 0$ such that $A \cdot x < 0$.
PROOF: See Chapter 4, Section C.
To make use of this lemma, we assume
(G) $f_{ij} \ge 0$ for all p and for all $i \ne j$.
That $f_{ij} > 0$ means that an increase (resp. decrease) in the price of the jth
commodity will increase (resp. decrease) the excess demand for the ith commodity.
Condition (G) is known to be the (weak) gross substitutability condition and plays
an important role in the stability theorem of competitive equilibrium (see Chapter
3).
Next write
(9) $f_i(p) = f_i(p, p_0), \quad i = 1, 2, \ldots, n$
and

(10) $f_0(p) = f_0(p, p_0)$

where $p_0 = 1$ and $f_0(p)$ signifies the excess demand function of the 0th commodity
(numeraire). Observe that $\partial f_i(p)/\partial p_j = \partial f_i(p, p_0)/\partial p_j$ for all i, j = 1, 2, ..., n. As remarked
before, the existence of a numeraire presupposes that the $f_i(p, 1)$, i = 0, 1, 2, ..., n,⁷
are homogeneous of degree zero with respect to all the arguments. Hence using
Euler's equation we obtain

(11) $\sum_{j=1}^{n} f_{ij} p_j = -f_{i0}$, for all p, i = 1, 2, ..., n,

where $f_{ij} \equiv \partial f_i(p, p_0)/\partial p_j$, i = 1, 2, ..., n, j = 0, 1, 2, ..., n. Assume
(12) $f_{i0} > 0$ for all p, i = 1, 2, ..., n.⁸
The economic interpretation of (12) should be obvious. Clearly equations (11) and
(12) imply that condition (8) is satisfied for the n x n Jacobian matrix F(p) for all
p. Hence, as a simple corollary of Theorem 2.E.5, we immediately obtain the
following theorem.

Theorem 2.E.6:⁹ Assume (G) and (12). Then the equilibrium is unique.
PROOF: By Lemma 2.E.7, F(p) is Hicksian for all p. Hence by Theorem
2.E.5, the equilibrium is unique. (Q.E.D.)
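
The step from (11) and (12) to condition (8), which the proof uses through Lemma 2.E.7, can be spelled out as follows. This is only a sketch of the reasoning already stated in the text; it takes $p > 0$ for granted (compare footnote 8) and writes $f_i(p, p_0)$ for the excess demand function including the numeraire price.

```latex
% Euler's equation for f_i(p, p_0), homogeneous of degree zero in (p, p_0):
\sum_{j=0}^{n} f_{ij}\, p_j = 0, \qquad i = 1, 2, \ldots, n .
% Set p_0 = 1 and move the j = 0 term to the right-hand side; this is (11),
% and (12) gives the sign:
\sum_{j=1}^{n} f_{ij}\, p_j = -f_{i0} < 0, \qquad i = 1, 2, \ldots, n .
% In matrix form, with x = p > 0, this is condition (8) for A = F(p):
F(p)\, p < 0 .
% (G) supplies f_{ij} \ge 0 for i \ne j, so Lemma 2.E.7 applies and F(p) is Hicksian.
```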
This result has been known to economists since Wald [10]. Moreover, it can
be proved quite simply without using the knowledge of Theorem 2.E.5 and Lemma
2.E.7. For such a proof, see Lemma 3.E.2. Lemma 3.E.3 provides the relation
between condition (R) and gross substitutability. The difficulty of Theorem 2.E.6
is that the economic plausibility of gross substitutability is very questionable.

FOOTNOTES

1. As remarked before, the existence of a numeraire presupposes that the price of such a
commodity is positive at least in equilibrium and that the excess demand for each
commodity is homogeneous of degree zero with respect to all prices including that of
the numeraire commodity.
2. By Walras' Law, we mean here that $\sum_{i=0}^{n} p_i f_i(p) = 0$ for all p. It is easy to see that,
under this law, (1) and (2) hold if and only if (3) holds.
3. To prove this theorem, there is no need to assume the existence of a numeraire, as
long as condition (R) is stated in a form which includes the numeraire. Then we can
assert the uniqueness of the price vector (including the numeraire commodity) up to
scalar multiples.
4. It may be worthwhile to recall Samuelson's weak axiom of revealed preference.
Interpret x and x' as the consumption vectors of a particular individual. Let x and x',
respectively, be chosen under p and p'. If x' is affordable at p, that is, $p\cdot x' \le p\cdot x$,
then x is revealed to be preferred to x', for he could have bought x'. If this is the case,
x' cannot be revealed to be preferred to x; that is, $p'\cdot x \le p'\cdot x'$ is impossible.
Therefore $p\cdot\Delta x \le 0$ (where $\Delta x \equiv x' - x$) implies $p'\cdot\Delta x < 0$, which is the weak axiom.
By this axiom, we have $p'\cdot\Delta x < 0$ or $p\cdot\Delta x > 0$ (for a particular individual). See P. A.
Samuelson, Foundations of Economic Analysis, Cambridge, Mass., Harvard University
Press, 1947, chapter 5.
5. In other words, the statement that the weak axiom holds in the aggregate is not a
consequence of rational behavior but is an additional assumption. Hence, unless
some behavioral background is discussed for such a statement, Theorem 2.E.4 is
essentially a mathematical theorem. It can be shown that sufficiently small income
effects could ensure such a statement.
6. See, for example, Inada [4], Nikaido [7], and Uekawa [8].
7. Besides, the price of the numeraire commodity must be positive, at least in equi-
librium.
8. Note that conditions (G) (with strict inequalities) and (12) are inconsistent with
homogeneity unless p > 0. To see this, suppose $p_i = 0$ for some i = 1, 2, ..., n. Then
$f_i(\alpha p, \alpha) > f_i(p, 1)$ for $\alpha > 1$ in view of (G) and (12), which contradicts the homogeneity
condition (see Chapter 3, Section G). In other words, under such conditions, none of the
commodities are free.
9. As in Theorem 2.E.5, $f(p)$ is an excess demand vector (deleting the numeraire
commodity), which is differentiable and defined on a rectangular region P in $\Omega^n$.

REFERENCES
1. Arrow, K. J., "Economic Equilibrium," International Encyclopedia of Social Sciences,
New York, Macmillan and Free Press, 1968.
2. Arrow, K. J., Block, H. D., and Hurwicz, L., "On the Stability of the Competitive
Equilibrium, II," Econometrica, 27, January 1959.
3. Gale, D., and Nikaido, H., "The Jacobian Matrix and Global Univalence of Map-
pings," Mathematische Annalen, 159, 1965.
4. Inada, K., "The Production Coefficient Matrix and the Stolper-Samuelson Condi-
tion," Econometrica, 39, March 1971.
5. McKenzie, L. W., "Matrices with Dominant Diagonals and Economic Theory,"
Mathematical Methods in the Social Sciences, 1959, ed. by Arrow, Karlin, and Suppes,
Stanford, Calif., Stanford University Press, 1960.
6. Morishima, M., "A Generalization of the Gross Substitute System," Review of
Economic Studies, XXXVII, April 1970.
7. Nikaido, H., Convex Structure and Economic Theory, New York, Academic Press,
1968.
8. Uekawa, Y., "On the Generalization of the Stolper-Samuelson Theorem," Econo-
metrica, 39, March 1971.
9. Wald, A., "Über die Eindeutige Positive Lösbarkeit der Neuen
Produktionsgleichungen," Ergebnisse eines Mathematischen Kolloquiums, 6, 1933-1934.
10. Wald, A., "Über einige Gleichungssysteme der Mathematischen Ökonomie,"
Zeitschrift für Nationalökonomie, 7, 1936 (in English, "On Some Systems of Equations of
Mathematical Economics," Econometrica, 19, October 1951).

Section F
PROGRAMMING, PARETO OPTIMUM,
AND THE EXISTENCE
OF COMPETITIVE EQUILIBRIA¹

In Chapter 1, we described the recent developments in the theory of nonlinear
programming. In this chapter, we have described one of the most important
developments of economic theory, that is, the theory of competitive markets. The
purpose of this section is to relate these two developments. Specifically, we will
prove the optimality and the existence of competitive equilibria, as an applica-
tion of the theory of nonlinear programming. The reader will see that the two
developments, the theory of nonlinear programming and the theory of competitive
equilibria, are thus closely related to each other in their structures. We may
even say that the model of competitive equilibria is a programming model with
economic content attached to it. And in a sense, the theory of nonlinear pro-
gramming is a mathematical reformulation of the theory of competitive equilibria.
Thus this section is intended to give a unified treatment of the problem of exis-
tence, welfare economics, and the theory of nonlinear programming. Moreover,
as a result of this recognition, our treatment of the theory of competitive equilibria
becomes very simple and straightforward, while retaining the generality of the
model of the competitive market fairly well. Such simplicity will encourage further
research.
We should mention one important predecessor in such an attempt. Negishi,
in his ingenious paper [12], proved the existence of competitive equilibria by
using the theory of nonlinear programming. His formulation is based on the
quasi-saddle-point characterization of the constrained maximum problem. Hence
he assumed, among other things, the existence of the right-hand and left-hand
derivatives of each producer's production function (his $F_k$). Moreover, the use
of the quasi-saddle-point characterization complicated his formulation and the
proof of the existence a great deal. Here we entirely avoid the use of each
producer's production function; hence we do not impose any conditions on such
functions as $F_k$ (conditions such as concavity and the existence of the right-hand
and left-hand derivatives of each $F_k$, as imposed by Negishi [12]).² Our treatment
and proof will be much simpler than that of Negishi, yet our model of competitive
equilibrium will be more general than Negishi's. The reader will note that the
simplicity and generality of the present section is achieved by using the simple
saddle-point characterization (rather than the quasi-saddle-point characteriza-
tion) of optimality. In Negishi's paper, the characterization of competitive equi-
librium as a Pareto optimum is not attempted, but we will attempt it here. The
reader may note that the proof of the Pareto optimality characterization and the
proof of existence parallel each other very closely. In Theorem 2.F.1, we prove
that every Pareto optimum can be realized by competitive pricing; in Theorem
2.F.2, we prove the existence of competitive equilibria; and in Theorem 2.F.3
we show that such a competitive equilibrium will realize a Pareto optimum.
Let $x_i$ be an n-vector of consumption by consumer i (i = 1, 2, ..., m) and
$\bar{x}_i$ be the initial resources held by the ith consumer, and let $y_j$ be an n-vector
of production by producer j (j = 1, 2, ..., k). Let

$x \equiv \sum_{i=1}^{m} x_i, \quad \bar{x} \equiv \sum_{i=1}^{m} \bar{x}_i, \quad \text{and} \quad y \equiv \sum_{j=1}^{k} y_j$

where all externalities are assumed away.
Denote by $X_i$ the consumption set of i and by $Y_j$ the production set of j. We
denote by X the aggregate consumption set and by Y the aggregate production
set. We assume that the preferences of consumer i are represented by a continuous
real-valued function $u_i(x_i)$.⁴ Given price p, the profit of producer j can be written
as $p\cdot y_j$. Pareto optimality and competitive equilibrium are defined (in the usual
manner) as follows:

Definition (feasibility): An array of consumption vectors $\{x_i\}$ is said to be feasible
if there exists an array of production vectors $\{y_j\}$ such that $y + \bar{x} - x \ge 0$ with
$x_i \in X_i$ for all i and $y_j \in Y_j$ for all j.

Definition (Pareto optimality): A feasible $\{\hat{x}_i\}$ is said to be Pareto optimal if there
does not exist a feasible $\{x_i\}$ such that $u_i(x_i) \ge u_i(\hat{x}_i)$ for all i = 1, 2, ..., m with
strict inequality for at least one i.

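Definition (competitive equilibrium): An array of vectors $[\hat{p}, \{\hat{x}_i\}, \{\hat{y}_j\}]$ with
$\hat{p} \ge 0$, $\hat{x}_i \in X_i$ for all i, and $\hat{y}_j \in Y_j$ for all j, is a competitive equilibrium, if the
following hold:
(i) $u_i(\hat{x}_i) \ge u_i(x_i)$ for all $x_i \in X_i$ with $\hat{p}\cdot x_i \le \hat{p}\cdot\hat{x}_i$, i = 1, 2, ..., m.
(ii) $\hat{p}\cdot\hat{y}_j \ge \hat{p}\cdot y_j$ for all $y_j \in Y_j$, j = 1, 2, ..., k.
(iii) $\hat{x} \le \hat{y} + \bar{x}$ and $\hat{p}\cdot(\hat{y} + \bar{x} - \hat{x}) = 0$.

Definition (satiation): The ith consumer is satiated at $x_i'$ if $u_i(x_i') \ge u_i(x_i)$ for all
$x_i \in X_i$.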
We assume the following:
(A-1) There exist $x' \in X$, $y' \in Y$ such that $y' + \bar{x} - x' > 0$.
(A-2) The set Y is convex.⁶
(A-3) The function $u_i(x_i)$ is continuous and concave for all i = 1, 2, ..., m.⁷
(A-4) (cheaper point) Given a point $\hat{x}_i$, if a price vector $\hat{p}$ prevails, there
exists an $x_i' \in X_i$ such that $\hat{p}\cdot\hat{x}_i > \hat{p}\cdot x_i'$ for all i.

Notice also that our definition of feasibility tacitly assumes free disposability.

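Theorem 2.F.1: Under (A-1), (A-2), and (A-3), if $[\{\hat{x}_i\}, \hat{y}]$ is a Pareto optimum,
then there exists a $\hat{p} \ge 0$ and $\{\hat{y}_j\}$ such that $[\hat{p}, \{\hat{x}_i\}, \{\hat{y}_j\}]$ is a competitive equilibrium,
provided that (A-4) holds at this $\hat{p}$.
PROOF: Let u be a vector function of which the ith component is $u_i(x_i)$.
Since $[\{\hat{x}_i\}, \hat{y}]$ is a Pareto optimum, it is a solution of the following vector
maximum problem:⁸ Choose $\{x_i\}$ and y so as to maximize u subject to
$x \le y + \bar{x}$, and $x_i \in X_i$, i = 1, ..., m, and $y \in Y$. Hence, in view of (A-1),
(A-2), and (A-3), there exists an $(\alpha, \hat{p})$ such that the following (1) and (2) hold:

(1) $\alpha\cdot u + \hat{p}\cdot(y + \bar{x} - x) \le \alpha\cdot\hat{u} + \hat{p}\cdot(\hat{y} + \bar{x} - \hat{x})$
for all $x_i \in X_i$, i = 1, 2, ..., m, and $y \in Y$, where $\alpha \ge 0$, $\hat{p} \ge 0$, and $\alpha \ne 0$,⁹ and
$\hat{u} \equiv [u_1(\hat{x}_1), u_2(\hat{x}_2), \ldots, u_m(\hat{x}_m)]$
(2) $\hat{p}\cdot(\hat{y} + \bar{x} - \hat{x}) = 0$, and $\hat{y} + \bar{x} - \hat{x} \ge 0$
Condition (iii) of competitive equilibrium follows immediately from (2).
Since y and $\hat{y}$ are in Y, we can find $y_j \in Y_j$ and $\hat{y}_j \in Y_j$, j = 1, 2, ..., k, such
that $y = \sum_{j=1}^{k} y_j$ and $\hat{y} = \sum_{j=1}^{k} \hat{y}_j$. Put $x_i = \hat{x}_i$ for all i and $y_j = \hat{y}_j$ for all j
except for $j = j_0$ in (1). Then $\hat{p}\cdot\hat{y}_{j_0} \ge \hat{p}\cdot y_{j_0}$ for all $y_{j_0} \in Y_{j_0}$. Since the choice of
$j_0$ is arbitrary, this establishes condition (ii) of competitive equilibrium. Put
$y = \hat{y}$ and $x_i = \hat{x}_i$ for all i except for $i = i_0$. Then we have
$\alpha_{i_0} u_{i_0}(\hat{x}_{i_0}) - \alpha_{i_0} u_{i_0}(x_{i_0}) \ge \hat{p}\cdot\hat{x}_{i_0} - \hat{p}\cdot x_{i_0}$ for all $x_{i_0} \in X_{i_0}$

If $\alpha_{i_0} > 0$, then condition (i) of competitive equilibrium is satisfied for $i_0$.
If $\alpha_{i_0} = 0$, then $\hat{p}\cdot x_{i_0} \ge \hat{p}\cdot\hat{x}_{i_0}$ for all $x_{i_0} \in X_{i_0}$.¹⁰ This contradicts the
cheaper point assumption, (A-4), so that $\alpha_{i_0} > 0$. Since the choice of $i_0$ is
arbitrary, this establishes condition (i) of competitive equilibrium.
(Q.E.D.)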

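Corollary: If there exists at least one consumer (say, $i_0$) who is not satiated at
$\hat{x}_{i_0}$, then $\hat{p} \ne 0$.
PROOF: From the proof of the theorem, we know
$\alpha_i u_i(\hat{x}_i) - \alpha_i u_i(x_i) \ge \hat{p}\cdot\hat{x}_i - \hat{p}\cdot x_i$ for all $x_i \in X_i$, $\alpha_i > 0$, i = 1, 2, ..., m
Now suppose $\hat{p} = 0$. Then $u_{i_0}(\hat{x}_{i_0}) \ge u_{i_0}(x_{i_0})$ for all $x_{i_0} \in X_{i_0}$. This contradicts
the fact that $i_0$ is not satiated at $\hat{x}_{i_0}$. (Q.E.D.)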

REMARK: Note that the cheaper point assumption plays a crucial role in
establishing that each consumer indeed maximizes his satisfaction subject
to his budget constraint. Without (A-4), we cannot say this, as was shown by
Arrow [1], although each consumer still minimizes his expenditure subject
to a given satisfaction. See Section C of this chapter.
REMARK: That $\alpha_{i_0} = 0$ means that individual $i_0$ is completely
"disregarded" by the society at a given Pareto optimal state $[\{\hat{x}_i\}, \hat{y}]$. If
$\alpha_{i_0} = 0$, he gets the minimum possible income at this price vector $\hat{p}$. As noted
before, (A-4) avoids this possibility. As is well known, the society can be
at a Pareto optimum when every member of the society except one is
"disregarded" by the society. The importance of the cheaper point assumption
has been well known since Arrow's famous example [1]; however, its role in
precluding such a case (that is, $\alpha_i = 0$) is not well recognized in the literature.
REMARK: Let $z \equiv (x_1, \ldots, x_m, y)$ and define the set Z by
$Z \equiv \{z : x \le y + \bar{x},\ y \in Y,\ x_i \in X_i,\ i = 1, 2, \ldots, m\}$. The sets $X_i$'s and Y may not be compact,
but we may assume that Z is compact without much difficulty.¹¹ Since the
constraint set of the maximization problem in the above proof is Z, the
compactness of Z (together with the continuity of the $u_i$'s) guarantees the
existence of a Pareto optimum as a result of the Weierstrass theorem.¹²
REMARK: Note that the competitive equilibrium in the above definition
can be achieved by allocating the aggregate income of the society $\hat{p}\cdot(\hat{y} + \bar{x})$
to each consumer by the amount of $\hat{p}\cdot\hat{x}_i$, i = 1, 2, ..., m [note that
$\hat{p}\cdot\hat{x} = \hat{p}\cdot(\hat{y} + \bar{x})$ from (2), so all the income of the society is completely
absorbed by each member of the society]. In other words, without such a
reallocation of income, a given Pareto optimum cannot, in general, be
supported by competitive pricing.
To prove the "existence" of competitive equilibria, we have to show the
existence of a price vector which would support the conditions of the competitive
equilibrium without such a reallocation of income as discussed in the previous
remark. Hence we have to rephrase condition (i) in the previous definition of
competitive equilibrium. To do this, first note that the income of consumer i,
denoted by $M_i$, when a price vector p prevails and the output vector for the jth
producer is $y_j$, can be written as

(3) $M_i(p, y) \equiv p\cdot\bar{x}_i + \max\{0, \sum_{j=1}^{k} \theta_{ji}\, p\cdot y_j\}$

where $\theta_{ji}$ is the share of profits from j to i, $\sum_{i=1}^{m} \theta_{ji} = 1$ for each j, and i = 1, 2, ..., m. Then
condition (i) in the definition of competitive equilibrium is restated as
(i') $u_i(\hat{x}_i) \ge u_i(x_i)$ for all $x_i \in X_i$ with $\hat{p}\cdot x_i \le M_i(\hat{p}, \hat{y})$, i = 1, 2, ..., m
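
The income formula (3) and the budget test in the restated condition can be illustrated numerically. The Python sketch below is a hypothetical illustration only: the function names `dot`, `income`, and `affordable`, the symbol `theta_i` for consumer i's profit shares, and all numbers are assumptions made here, not notation or data from the text.

```python
def dot(p, v):
    """Inner product p . v of two equal-length lists."""
    return sum(p_s * v_s for p_s, v_s in zip(p, v))


def income(p, xbar_i, theta_i, ys):
    """M_i(p, y) = p . xbar_i + max{0, sum_j theta_ji * p . y_j}.

    theta_i[j] is consumer i's share in producer j's profit and
    ys[j] is producer j's production vector (both hypothetical inputs).
    """
    profit_share = sum(t * dot(p, y_j) for t, y_j in zip(theta_i, ys))
    return dot(p, xbar_i) + max(0.0, profit_share)


def affordable(p, x_i, m_i):
    """Budget test used in the restated condition: p . x_i <= M_i."""
    return dot(p, x_i) <= m_i


p = [1.0, 2.0]         # price vector (assumed)
xbar_i = [1.0, 0.0]    # initial resources of consumer i (assumed)
theta_i = [0.5]        # consumer i owns half of the single producer (assumed)

m_profit = income(p, xbar_i, theta_i, [[-1.0, 1.0]])  # producer earns p . y = 1
m_loss = income(p, xbar_i, theta_i, [[1.0, -1.0]])    # producer earns p . y = -1
print(m_profit, m_loss)
```

With a profitable plan the consumer receives 1 + 0.5 = 1.5; with a loss-making plan the `max` truncates the negative share at zero, so income is just the value of the initial resources, 1.0.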
To prove the existence, we impose the following additional assumptions.¹³
(A-5) The set Z is compact, where $Z \equiv \{(x_1, \ldots, x_m, y) : x \le y + \bar{x},\ y \in Y,\ x_i \in X_i,\ i = 1, 2, \ldots, m\}$.
(A-6) (Nonsatiation) For every consumption $x_i$ in $X_i$ with $[\{x_i\}, y] \in Z$,
there is a consumption in $X_i$ preferred to $x_i$, i = 1, 2, ..., m.
(A-7) (Survival) There is an $x_i^0$ in $X_i$ such that $x_i^0 < \bar{x}_i$ for all i = 1, 2, ..., m.
(A-8) (The possibility of inaction) $0 \in Y_j$, j = 1, 2, ..., k.

Theorem 2.F.2: Under (A-1), (A-2), (A-3), (A-5), (A-6), (A-7), and (A-8), there exists
a competitive equilibrium with a nonzero price vector.
PROOF: Let $\alpha \in A$ where $A \equiv \{\alpha \in R^m : \sum_{i=1}^{m} \alpha_i = 1,\ \alpha_i \ge 0 \text{ for all } i\}$. Let
$U \equiv \sum_{i=1}^{m} \alpha_i u_i(x_i)$ and consider the following problem:¹⁴

Maximize: U (with respect to $x_i \in X_i$, $y \in Y$)

Subject to: $y + \bar{x} - x \ge 0$, $x_i \in X_i$, i = 1, 2, ..., m, and $y \in Y$

Hence, from the Kuhn-Tucker theorem,¹⁵ there exists a $p' \ge 0$ such that the
following (4) and (5) hold:

(4) $U + p'\cdot(y + \bar{x} - x) \le U' + p'\cdot(y' + \bar{x} - x')$

for all $x_i \in X_i$, i = 1, 2, ..., m, $y \in Y$, where $x_i'$ and $y'$ are the optimal vectors
for the above problem, and $U' \equiv \sum_{i=1}^{m} \alpha_i u_i(x_i')$:
(5) $p'\cdot(y' + \bar{x} - x') = 0$, and $y' + \bar{x} - x' \ge 0$
In a manner similar to the proof of Theorem 2.F.1, we can immediately show
that the following (6), (7), and (8) hold:
(6) $p'\cdot y_j' \ge p'\cdot y_j$ for all $y_j \in Y_j$, j = 1, 2, ..., k
where the $y_j'$ are obtained from $y'$ as $y' = \sum_{j=1}^{k} y_j'$,
$y_j' \in Y_j$, j = 1, 2, ..., k;
(7) $u_i(x_i') \ge u_i(x_i)$ for all $x_i \in X_i$ with $p'\cdot x_i \le p'\cdot x_i'$, if $\alpha_i > 0$
(8) $p'\cdot x_i \ge p'\cdot x_i'$ for all $x_i \in X_i$, if $\alpha_i = 0$
Since there is no satiation consumption in Z as a result of assumption (A-6),
(7) implies $p' \ne 0$. Hence we may normalize the price vector $p'$ as follows:

(9) $p_s'' \equiv p_s' / \sum_{s=1}^{n} p_s'$, s = 1, 2, ..., n

Relations (4), (5), (7), and (8) all hold with $p''$ replaced for $p'$.
The rest of the proof is analogous to Negishi [12]. It is simply
recorded here to keep this section sufficiently self-contained.
Since the set Z is bounded by (A-5), there exists a number M such that

$\sum_{i=1}^{m} |M_i(p, y) - p\cdot x_i| < M$ for all $(x_1, \ldots, x_m, y) \in Z$, and for all p

Now define

$\mu_i \equiv \max\{0, \alpha_i + (M_i(p, y) - p\cdot x_i)/M\}$, i = 1, 2, ..., m

where p is a point in the (n - 1)-dimensional unit simplex.¹⁶ Define $\alpha'$ by

$\alpha_i' \equiv \mu_i / \sum_{h=1}^{m} \mu_h$, i = 1, 2, ..., m

Note that $\alpha'$ is a point of the (m - 1)-dimensional unit simplex. Following
Negishi [12], we construct the following mappings:¹⁷

(a) $\alpha \to [\{x_i'\}, y', p']$

(b) $[\{x_i'\}, y', p'] \to [\alpha', \{x_i'\}, y', p'']$
The point-to-set mapping (a) is the mapping from $\alpha$ to the saddle point of
$U + p\cdot(y + \bar{x} - x)$. Its image is nonempty from (A-3) and (A-5) (see
footnote 14), and the mapping is upper semicontinuous with compact and convex
images.¹⁸ The mapping (b) is a point-to-point mapping and is continuous.
Let $(x_1, \ldots, x_m, y)$ be an arbitrary point in Z. Let p be an arbitrary
point in the (n - 1) unit simplex. Combining (a) and (b), the mapping
$[\alpha, \{x_i\}, y, p] \to [\alpha', \{x_i'\}, y', p'']$ is an upper semicontinuous mapping¹⁹ from
a convex compact set into itself whose image is nonempty and convex.
Hence, from Kakutani's fixed point theorem,²⁰ there exists a fixed point
$[\hat{\alpha}, \{\hat{x}_i\}, \hat{y}, \hat{p}]$. Since conditions (ii) and (iii) of competitive equilibrium are
met by the construction of the mapping, it suffices to show that condition (i')
of competitive equilibrium (C.E.) is satisfied. Note that (A-8) and condition
(ii) of C.E. imply $\hat{p}\cdot\hat{y}_j \ge 0$ for all j. Hence $M_i(\hat{p}, \hat{y}) = \hat{p}\cdot\bar{x}_i + \sum_{j=1}^{k} \theta_{ji}\,\hat{p}\cdot\hat{y}_j$.
Note also that the $(M_i(\hat{p}, \hat{y}) - \hat{p}\cdot\hat{x}_i)$, i = 1, 2, ..., m, must be of equal sign
or zero by the construction of the mapping. Hence, from (5), we have²¹
(10) $M_i(\hat{p}, \hat{y}) = \hat{p}\cdot\hat{x}_i$, i = 1, 2, ..., m

Since there exist $x_i^0 < \bar{x}_i$ by (A-7), $\hat{\alpha}_i > 0$ for all i = 1, 2, ..., m, for
otherwise it contradicts relation (8).²² Hence, combining (7) and (10), we obtain
condition (i') of competitive equilibrium. (Q.E.D.)
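
The weight-adjustment construction in the proof can be made concrete in a very special case. The Python sketch below is an illustration under strong assumptions that are not in the text: pure exchange (no production, so each consumer's income is the value of his initial resources), Cobb-Douglas utilities in logarithmic form, the unnormalized multiplier used as the price vector, and repeated application of the map from the weights to the updated weights. The proof itself relies on Kakutani's theorem and does not claim that such iteration converges; all names and numbers are hypothetical.

```python
# Hypothetical economy: u_i(x_i) = sum_s beta[i][s] * log(x_i[s]),
# with each row of beta summing to 1.
beta = [[0.5, 0.5], [0.25, 0.75]]   # preference parameters (assumed)
xbar = [[1.0, 0.0], [0.0, 1.0]]     # initial resources of each consumer (assumed)
m, n = len(beta), len(beta[0])
total = [sum(xbar[i][s] for i in range(m)) for s in range(n)]


def welfare_optimum(alpha):
    """Maximize sum_i alpha_i * u_i(x_i) subject to sum_i x_i <= total.

    For log utilities the first-order conditions give closed forms:
    multiplier p_s = sum_i alpha_i * beta_is / total_s, and
    x_is = alpha_i * beta_is / p_s.
    """
    p = [sum(alpha[i] * beta[i][s] for i in range(m)) / total[s] for s in range(n)]
    x = [[alpha[i] * beta[i][s] / p[s] for s in range(n)] for i in range(m)]
    return x, p


def budget_gaps(x, p):
    """Income minus expenditure, p . xbar_i - p . x_i, for each consumer."""
    return [sum(p[s] * (xbar[i][s] - x[i][s]) for s in range(n)) for i in range(m)]


def update_weights(alpha, bound):
    """mu_i = max{0, alpha_i + gap_i / M}, then renormalize so the weights sum to 1."""
    x, p = welfare_optimum(alpha)
    gap = budget_gaps(x, p)
    mu = [max(0.0, alpha[i] + gap[i] / bound) for i in range(m)]
    return [v / sum(mu) for v in mu]


alpha = [0.5, 0.5]
for _ in range(200):
    # bound=2.0 exceeds the sum of absolute budget gaps in this example.
    alpha = update_weights(alpha, bound=2.0)

x_hat, p_hat = welfare_optimum(alpha)
gaps = budget_gaps(x_hat, p_hat)
print(alpha, p_hat, x_hat)
```

In this example the weights settle near (1/3, 2/3) with prices near (1/3, 2/3); at that point every budget gap vanishes, which is the analogue of relation (10), and the allocation clears both markets.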

REMARK: If we do not assume free disposability, then we can no longer
guarantee a nonnegative price vector in Theorems 2.F.1 and 2.F.2. The prices of
"undesired" commodities can be negative. To analyze such a case, we must
alter the statement of the definition of feasibility. In other words,
$[y + \bar{x} - x \ge 0]$ should be replaced by $[y + \bar{x} - x = 0]$. Then our problem becomes
the programming problem with an equality constraint. Although we can
carry out our analysis analogously to that above, we can no longer use the
saddle-point characterization of maximality. We have to rely upon the
quasi-saddle-point characterization, for the Slater condition is no longer
applicable and the Kuhn-Tucker constraint qualification or the classical
rank condition would now be relevant. This forces us to introduce the
undesirable assumption, that is, differentiability of the utility functions. The
proper route should be in extending the theory of the saddle-point
characterization which takes proper account of the equality constraint.

Theorem 2.F.3: Let $[\{\hat{x}_i\}, \{\hat{y}_j\}, \hat{p}]$ be a competitive equilibrium in Theorem 2.F.2;
then $[\{\hat{x}_i\}, \{\hat{y}_j\}]$ is a Pareto optimum.

PROOF: First note that $[\{\hat{x}_i\}, \{\hat{y}_j\}]$ maximizes $\sum_{i=1}^{m} \hat{\alpha}_i u_i(x_i)$ (where $\hat{\alpha}_i > 0$
for all i), subject to feasibility. Suppose $[\{\hat{x}_i\}, \{\hat{y}_j\}]$ is not a Pareto optimum.
Then by the definition of Pareto optimum, there exist $\tilde{x}_i \in X_i$, i = 1, 2, ..., m,
$\tilde{y}_j \in Y_j$, j = 1, 2, ..., k, such that $u_i(\tilde{x}_i) \ge u_i(\hat{x}_i)$ for all i with strict inequality
holding for at least one i, and $\tilde{x} \le \tilde{y} + \bar{x}$. In other words, there exists a feasible
$[\{\tilde{x}_i\}, \{\tilde{y}_j\}]$ such that $\sum_{i=1}^{m} \hat{\alpha}_i u_i(\tilde{x}_i) > \sum_{i=1}^{m} \hat{\alpha}_i u_i(\hat{x}_i)$, contradicting the
maximality of $[\{\hat{x}_i\}, \{\hat{y}_j\}]$. (Q.E.D.)

FOOTNOTES

1. This section is a revised version of Takayama and El-Hodiri [15]; their work
was developed from the discussions between Takayama and El-Hodiri during the
summer of 1966, and the actual writing was done by Takayama. I am indebted to
Takashi Negishi for comments.
2. One of the important by-products of such an approach is that we can avoid the
concepts of demand correspondence and supply correspondence altogether.
3. Moreover, in Negishi's paper, Xi is nonnegative and contains the origin for all i. This
means that every consumer has the same minimum subsistence consumption point,
the origin, regardless of his physiological need.
4. For conditions which guarantee the existence of a continuous real-valued utility
function, see Rader [14]. The reader may wish to note that one of his theorems
(theorem 3) does not require the transitivity axiom of the preference ordering.
5. Note that if $y + \bar{x} - x < 0$ for all $x \in X$ and $y \in Y$, then the society cannot guarantee
the survival of every one of its members.
6. Note that only the aggregate production set is assumed to be convex. Every
production set does not have to be convex.
7. From the consideration below, we may also surmise that the theorems of nonlinear
programming that we used can be extended to the case in which the maximand
function is explicitly quasi-concave rather than concave. This conjecture is due
to the fact that only explicit quasi-concavity is usually required in establishing the
corresponding theorems of this section (see Theorem 2.C.2 and Theorem 2.E.3).
Finally, we may note that the concavity implies the continuity in the interior of the
domain (here $X_i$), but not necessarily at the boundary. This is important, for $X_i$ may
be a closed set.
8. We are using the following theorem. Theorem: Let Z be a convex subset in $R^l$
and f(z) be a vector function of which the ith component is $f_i(z)$. Let $f_i(z)$,
i = 1, 2, ..., m, and $g_j(z)$, j = 1, 2, ..., n, be concave functions on Z. Suppose
also that there exists a $\bar{z} \in Z$ such that $g_j(\bar{z}) > 0$, j = 1, 2, ..., n (Slater's condition).
Then if $\hat{z}$ achieves a vector maximum of f(z) subject to $g_j(z) \ge 0$, j = 1, 2, ..., n,
then there exist $\alpha \ge 0$, $\hat{p} \ge 0$ ($\alpha \ne 0$) such that $(\hat{z}, \hat{p})$ is a saddle point of
$\alpha\cdot f(z) + p\cdot g(z)$. See Theorem 1.E.4. Also see Karlin [7], pp. 216-218, and
Kuhn and Tucker [8], pp. 487-489. Note that the convexity of Y and $X_i$ as
well as the concavity of $u_i$ is required in applying this theorem. Note also
that assumption (A-1) provides Slater's condition, which implies Karlin's condition
as stated in [7], p. 201. See also Section B of Chapter 1.
9. Here ai can be interpreted as the "weight" attached by the society to the ith
individual.
10. Note that this unfortunate consumer is still minimizing his expenditure.
11. To ensure the compactness of Z, assume, for example, that the $X_i$'s are closed
and bounded from below, Y is closed, and that "no-land-of-Cockaigne" holds for
each $Y_j$. For the discussion of such a "compactification," see Arrow and Debreu
[2], pp. 276-277, and p. 279; Debreu [5], pp. 76-78, pp. 84-86; and Nikaido [13],
section 40.
12. The Weierstrass theorem asserts that a continuous (real-valued) function achieves
its maximum (and a minimum) on a compact set, and this theorem can easily be
extended to the case of a vector maximum.
13. These assumptions are analogous to those of Debreu [5], chapter 5. Assumptions
(A-5) and (A-7) may sound too strong, and some readers may wish to generalize in
the direction achieved by McKenzie [9]. It should be noted, however, that our set
of assumptions used to prove existence (Theorem 2.F.2) is more general than that
of Negishi [12]. We do not assume the existence of the right-hand and left-hand
derivatives of each $F_k$. As we noted before, we, in fact, completely avoided the
use of the individual production function $F_k$. Hence we do not assume that each $F_k$
is concave. Note that the concavity of each $F_k$ implies the convexity of each
producer's production set. We only assume the convexity of the aggregate production
set. Slater's condition for each $F_k$ does not have to be assumed. Our assumption
(A-1) is concerned with the aggregate sets. Note also that the assumption of Slater's
condition for each $F_k$ also implies that the production set for each producer must
have a common interior point with the consumption set of every consumer (recall
that the origin is the starvation point for each consumer in Negishi [12]).
14. In view of (A-3) and (A-5), the solution of the above maximization problem always
exists because of the Weierstrass theorem.
15. We are using the following version of the Kuhn-Tucker theorem. Theorem: Let Z
be a convex set in $R^n$, and f(z), $g_j(z)$, j = 1, 2, ..., m, be concave functions on Z.
Suppose also there exists a $\bar{z} \in Z$ such that $g_j(\bar{z}) > 0$ for all j (Slater's condition).
Under these conditions, if $\hat{z}$ maximizes f(z) subject to $g_j(z) \ge 0$, j = 1, 2, ..., m, then
there exists a $\hat{p} \ge 0$ such that $(\hat{z}, \hat{p})$ is a saddle point of $[f(z) + p\cdot g(z)]$. A beautiful
proof when $Z = R^n$ is provided by Uzawa [16]. The above slightly generalized
version is provided by Karlin [7], pp. 201-203 (note that Slater's condition implies
Karlin's condition), and Nikaido [13], section 37. See our discussion in Chapter 1,
Section B (especially the corollary of Theorem 1.B.3).
16. Note that $\sum_{i=1}^{m} \mu_i > 0$, since $\sum_{i=1}^{m} \alpha_i = 1$ and $\sum_{i=1}^{m} |(M_i(p, y) - p\cdot x_i)/M| < 1$.
17. Note that the range of the mapping (a) is compact. This is due to the fact that
the set Z is compact and the part of the range in which $p'$ lives can be considered
as a compact subset (say, P) of the nonnegative orthant $\Omega^n$ of $R^n$; $p'$ is bounded
and it is obviously nonnegative. Without loss of generality, we may also assume
that P is convex.
18. The image is convex because U is concave, the constraint set Z is convex (Theorem
1.C.5), and P is convex. Also recall here that the Cartesian product of convex sets
is always convex. Since the graph of this mapping is a closed set, it is a closed
mapping (see Berge [3], p. 111). Furthermore, it is an upper semicontinuous (u.s.c.)
mapping, since its range is compact as a result of our observation in the previous
footnote. The importance of this is pointed out by Moore [11] (especially p. 133
and his footnotes 10 and 12 on p. 139), who corrected the impreciseness on this
point in Negishi [12]. In general, a closed mapping is u.s.c. if the range space is
compact (see Berge [3], p. 112, corollary of theorem 7). The image of the mapping
(a) is compact, for the image of any u.s.c. mapping is always compact.
19. In general, the composite mapping of two u.s.c. mappings is also u.s.c. (see Berge
[3], p. 113). For a generalization of this, see Moore [11].
20. Kakutani's fixed point theorem states the following: Let Z be a nonempty convex
and compact subset of $R^n$. If F is an upper semicontinuous mapping from Z to Z
such that for all $z \in Z$, the set F(z) is nonempty and convex, then there exists a $\hat{z}$
(called a "fixed point") such that $\hat{z} \in F(\hat{z})$. Recall Section E of this chapter.
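21. Note that $\hat{p}\cdot(\hat{y} + \bar{x}) = \sum_{i=1}^{m} M_i(\hat{p}, \hat{y})$. Hence $\sum_{i=1}^{m} [M_i(\hat{p}, \hat{y}) - \hat{p}\cdot\hat{x}_i] = \hat{p}\cdot(\hat{y} + \bar{x} - \hat{x})$,
which is zero from (5). Since the $[M_i(\hat{p}, \hat{y}) - \hat{p}\cdot\hat{x}_i]$'s are all of equal sign
or zero, this proves (10).
22. Since $\hat{p} \ge 0$ with $\hat{p} \ne 0$, $x_i^0 < \bar{x}_i$ implies $\hat{p}\cdot x_i^0 < \hat{p}\cdot\bar{x}_i$. But $\hat{p}\cdot\bar{x}_i \le M_i(\hat{p}, \hat{y}) = \hat{p}\cdot\hat{x}_i$, so that we
have $\hat{p}\cdot x_i^0 < \hat{p}\cdot\hat{x}_i$. Now suppose the contrary, that is, $\hat{\alpha}_i = 0$ for some i. Then
(8) says that $\hat{p}\cdot x_i \ge \hat{p}\cdot\hat{x}_i$ for all $x_i \in X_i$; hence, in particular, $\hat{p}\cdot x_i^0 \ge \hat{p}\cdot\hat{x}_i$, which
is a contradiction.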

REFERENCES

1. Arrow, K. J., "An Extension of the Basic Theorems of Classical Welfare Eco-
nomics," Proceedings of the Second Berkeley Symposium on Mathematical Statistics
and Probability, ed. by J. Neyman, Berkeley, Calif., University of California Press,
1951.
2. Arrow, K. J., and Debreu, G., "Existence of an Equilibrium for a Competitive
Economy," Econometrica, 22, July 1954.
3. Berge, C., Topological Spaces, tr. by Patterson, New York, Macmillan, 1963 (French
original, 1959).
4. Debreu, G., "The Coefficient of Resource Utilization," Econometrica, 19, July 1951.
5. Debreu, G., Theory of Value, New York, Wiley, 1959.
6. Fenchel, W., Convex Cones, Sets and Functions, Princeton, 1953 (offset).
7. Karlin, S., Mathematical Methods and Theory in Games, Programming, and Economics,
Vol. 1, 1st ed., Reading, Mass., Addison-Wesley, 1959.
8. Kuhn, H. W., and Tucker, A. W., "Nonlinear Programming," Proceedings of the
Second Berkeley Symposium on Mathematical Statistics and Probability, ed. by
J. Neyman, Berkeley, Calif., University of California Press, 1951.
9. McKenzie, L. W., "On the Existence of General Equilibrium for a Competitive
Market," Econometrica, 27, January 1959.
10. Moore, J. C., "Some Extensions of the Kuhn-Tucker Results in Concave Pro-
gramming," Papers in Quantitative Economics, ed. by J. P. Quirk and A. M. Zarley,
Lawrence, Kansas, University of Kansas Press, 1968.
11. Moore, J. C., "A Note on Point-Set Mappings," Papers in Quantitative Economics, Vol. 1,
ed. by J. P. Quirk and A. M. Zarley, Lawrence, Kansas, University of Kansas Press,
1968.
12. Negishi, T., "Welfare Economics and the Existence of an Equilibrium for a Competi-
tive Economy," Metroeconomica, 12, Agosto-Dicembre 1960.
13. Nikaido, H., Introduction to Sets and Mappings in Modern Economics, tr. by K. Sato,
Amsterdam, North-Holland, 1970 (Japanese original, Tokyo, 1960).
14. Rader, T., "Existence of a Utility Function to Represent Preferences," Review of
Economic Studies, 30, October 1963.
15. Takayama, A., and El-Hodiri, M., "Programming, Pareto Optimum, and the
Existence of Competitive Equilibria," Metroeconomica, XX, Gennaio-Aprile 1968.
16. Uzawa, H., "The Kuhn-Tucker Theorem in Concave Programming," in Studies
in Linear and Non-linear Programming, ed. by Arrow, Hurwicz, and Uzawa, Stanford,
Calif., Stanford University Press, 1958.
3
THE STABILITY OF COMPETITIVE EQUILIBRIUM

Section A
INTRODUCTION

Consider an isolated (competitive) market for one commodity (say, A).

Suppose that the demand for A, denoted by D(p), is a function of its price p,
and suppose also that the supply of A, denoted by S(p), is a function of
its price. An equilibrium price, $\hat{p}$, is a price such that $D(\hat{p}) = S(\hat{p})$. Whether or not
there exists such a $\hat{p}$ is the problem of the "existence" of an equilibrium. The
existence of an equilibrium is guaranteed if the demand curve and the supply
curve cross. If they cross at many points, then there are many equilibria. Suppose
that the price of A, p, deviates from a certain equilibrium price, say, $\hat{p}$. The
question now is what happens to the time path of p. In particular, we want to
know if p will converge to the original $\hat{p}$. This is the question of "stability." To
solve this "stability problem," we impose one basic assumption: an excess demand
for A will raise its price, p, and an excess supply of A will lower its price.
Mathematically, $dp/dt \gtrless 0$ according to whether $D(p) - S(p) \gtrless 0$. Here dp/dt
denotes the time derivative of p. In order to facilitate our understanding of the
problem, let us assume that there exists a unique equilibrium price $\hat{p}$ such that
$D(\hat{p}) = S(\hat{p})$. When the question is posed in this manner, the answer is almost
obvious from a diagrammatical analysis. The left-hand diagram in Figure 3.1
illustrates the case of a stable equilibrium and the right-hand diagram illustrates
the case of an unstable equilibrium. The reader should easily be able to check the
direction in which p moves when p is off the equilibrium price $\hat{p}$.

Figure 3.1. Stability of an Isolated Market. (Two panels; vertical axis: P.)

To consider a more general problem, let us now suppose that a certain
economy is described by n equations which describe the "equilibrium relations"
of the economy. Let $x = (x_1, x_2, \ldots, x_n)$ be the variables which should be determined
from this system of n equations. Let the equilibrium relations of the economy
be described by $f_i(x) = 0$, $i = 1, 2, \ldots, n$ [or $f(x) = 0$]. The value of x which
satisfies the above system (assuming that there exists such an x) is called an
equilibrium value of x, and we denote it by $\hat{x}$. The stability question is the problem
of what happens to the time path of x when it deviates from $\hat{x}$. In particular, we
want to consider whether x will converge to $\hat{x}$. An example of such an equilibrium
system is the system of a competitive market, where $f_i$ denotes the excess demand
for the ith commodity and $x_i$ denotes the price of the ith commodity. Another
example of an equilibrium system is the classical or the Keynesian macro equilibrium
system, which typically consists of the equilibrium relations that describe
the goods market, the money market, and the labor market.
Let us suppose that the initial value of x is given by $x^0$. Assume that $x^0$
is not an equilibrium value, that is, $f(x^0) \neq 0$. Let us suppose that this generates
a certain adjustment process, from which the time path of x, denoted by x(t),
is described by the following system of equations:
$$F_i[x(t), t] = 0, \quad i = 1, 2, \ldots, n$$
with $x(0) = x^0$; or $F[x(t), t] = 0$ with $x(0) = x^0$. Typically this system of equations
is generated by a system of differential (difference) equations that describe the
adjustment processes. The stability analysis is concerned with the solution of the
above system or the question of whether x(t) converges to an equilibrium value
$\hat{x}$. Most typically, the system of differential equations which generates the above
dynamic system may be written as follows:
$$\frac{dx_i(t)}{dt} = h_i[f(x(t))]$$
Or, more simply,
$$\frac{dx_i(t)}{dt} = h_i[f_i(x(t))]$$

In the case of a competitive equilibrium, this says that the movement of the ith
price, $x_i(t)$, is a function of the excess demand for the ith commodity, $f_i$ (or of the
excess demand for all the commodities, f). When the problem is written in differential
equation form, one suspects that the theory of stability developed for differential
equations might be of some value. This was indeed the case in the development
of the theory of the stability of a competitive equilibrium. In Section B, we survey
the basic material on differential equations. This discussion will also be useful in
later chapters.
Before concluding this introductory section, one important discussion is
necessary on the distinction between the "Walrasian stability" and the "Marshallian
stability."1 In the introductory exposition of the stability problem of a competitive
equilibrium above, we considered the basic postulate in the form of

$$\dot{p}_i(t) = h_i[f_i(p(t))], \quad i = 1, 2, \ldots, n$$

where $p_i$ is the price of the ith commodity, p(t) is an n-vector in which the
ith element is $p_i(t)$, and $\dot{p}_i(t) = dp_i(t)/dt$. Also, $f_i(p(t))$ is the excess demand function
for the ith commodity and $h_i$ is any (fixed) monotone increasing differentiable
real-valued function. For the case of an isolated market for one commodity, we write
this equation as
$$\dot{p} = h[D(p(t)) - S(p(t))]$$
where h refers to some fixed monotone increasing differentiable real-valued
function. Or, more simply,
$$\dot{p} = k[D(p(t)) - S(p(t))]$$
where k > 0 can be interpreted as the "speed of adjustment" of the market.
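The price adjustment equation above is easy to explore numerically. The following is only a minimal sketch under stated assumptions: the linear curves D(p) = 10 - p and S(p) = 2 + p (which cross at the price 4), the value k = 0.5, and the forward Euler discretization are all hypothetical choices made for illustration and do not come from the text.

```python
def simulate_price(demand, supply, p0, k=0.5, dt=0.01, steps=5000):
    """Integrate dp/dt = k * [D(p) - S(p)] by forward Euler; return the price path."""
    path = [p0]
    p = p0
    for _ in range(steps):
        # Excess demand raises the price; excess supply lowers it.
        p = p + dt * k * (demand(p) - supply(p))
        path.append(p)
    return path

# Hypothetical linear curves (assumptions, not from the text); they cross at p = 4.
demand = lambda p: 10.0 - p
supply = lambda p: 2.0 + p

path_low = simulate_price(demand, supply, p0=1.0)   # starts below the equilibrium
path_high = simulate_price(demand, supply, p0=9.0)  # starts above the equilibrium
print(path_low[-1], path_high[-1])
```

With these particular curves the excess demand is 8 - 2p, so the simulated price rises when it starts below 4 and falls when it starts above 4; other curves may of course behave differently.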
There are two important premises in the above formulation. One is that
neither demanders nor suppliers can affect the price that prevails in the market,
but rather they take it as given. This is the premise of a competitive market. The
second premise is that the price is the only adjusting parameter of the market.
At each instant of time, demanders and suppliers, respectively, adjust the quanti-
ties that they wish to demand and supply, based only on the information of the
price given to them. This adjustment is assumed to be instantaneous. Then the
price moves as described in the differential equation above. As price moves, the
quantity of excess demand will vary and stability of the market is achieved when
the price moves in such a way that the excess demand vanishes.
In contrast to such a price adjustment process, the quantity adjustment type
mechanism is often considered. In Figure 3.2, suppose $q^1$ is the given quantity
of the commodity. We denote by $D(q^1)$ and $S(q^1)$, respectively, the price that
buyers are willing to pay and the price that sellers are charging for a given
quantity $q^1$. The dynamic output adjustment equation for the above market can
be written as
$$\dot{q} = k[D(q) - S(q)]$$
where k > 0 is the speed of adjustment of the market. This reflects the fact that
if D(q) > S(q), for example, the suppliers can profitably increase the quantity
supplied. If the time path of the solution of the above differential equation
converges to the equilibrium quantity $\hat{q}$ as t extends without limit, then the equilibrium
is said to be "stable." Such a stability is often called the Marshallian stability,
while the stability in the price adjusting market as discussed before is often called
the Walrasian stability.

Figure 3.2. An Illustration of Output Adjustment. (Vertical axis: Price.)
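The output adjustment equation can be sketched in the same way. The example below is a hedged illustration only: the demand-price schedule D(q) = 10 - q and the two supply-price schedules S(q) = 2 + q and S(q) = 14 - 2q are hypothetical (both pairs cross at the quantity 4), as are k = 0.5 and the forward Euler step.

```python
def simulate_quantity(demand_price, supply_price, q0, k=0.5, dt=0.01, steps=600):
    """Integrate dq/dt = k * [D(q) - S(q)] by forward Euler; return the final quantity."""
    q = q0
    for _ in range(steps):
        # Output expands when the demand price exceeds the supply price.
        q = q + dt * k * (demand_price(q) - supply_price(q))
    return q

# Hypothetical schedules (assumptions, not from the text).
demand_price = lambda q: 10.0 - q          # price buyers are willing to pay
rising_supply = lambda q: 2.0 + q          # supply price rises with output
falling_supply = lambda q: 14.0 - 2.0 * q  # supply price falls faster than the demand price

# Both pairs of schedules cross at q = 4; start slightly above that quantity.
q_stable = simulate_quantity(demand_price, rising_supply, q0=4.1)
q_unstable = simulate_quantity(demand_price, falling_supply, q0=4.1)
print(q_stable, q_unstable)
```

In the first case the simulated quantity returns toward 4; in the second, D(q) - S(q) = q - 4 is positive above the crossing point, so output keeps expanding away from it.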
These two stability definitions apparently do not coincide. The market can
be Walrasian stable (resp. unstable) but Marshallian unstable (resp. stable). In
the literature, this is often illustrated by diagrams such as those shown in Figure
3.3. The diagrams should be self-explanatory.

Figure 3.3. Two Concepts of Stability. Left panel: "Walrasian stable, Marshallian unstable"; right panel: "Walrasian unstable, Marshallian stable." (Axes: price p, quantity q.)

However, comparison of the two concepts as shown in these diagrams
contains a very serious confusion. In essence, these two concepts are in completely
different dimensions and should not be compared in the same figure.
One source of this confusion is probably attributable to Hicks' remark
([1], p. 62) on the distinction between these concepts. He states that the Marshallian
stability concept is more appropriate to conditions of monopoly than to those
of perfect competition.
As Newman ([3], pp. 106-108) has pointed out, the common confusion
about the Walrasian vs. the Marshallian stability lies in the failure to distinguish
clearly the theory of exchange from the theory of production. The Marshallian
stability conditions are explicitly designed for the theory of production, whereas
the Walrasian price adjustment is more suited for the theory of exchange. Hence
these two concepts cannot be compared in the same dimension. Thus the descrip-
tion of an equilibrium as Walrasian stable but Marshallian unstable is rather
meaningless. It contains a "serious substantive error of muddling up exchange
with production" (Newman [3], p. 107). As Newman has pointed out, this con-
fusion is found frequently in the literature.
The next question is: What is the essential distinction between the problem
of exchange and the problem of production? One answer is essentially the time
involved in the two problems. Exchange can be considered as the "temporary"
problem, whereas production is the "short-run" problem.2 This is based on the
recognition that producers take a significant period of time before attaining
their optimum positions, whereas the consumer's adjustment can be much
faster. Take, for example, an isolated market for one commodity, say, apples.
When apples are harvested in the fall, the quantity that each producer can supply
to the market may be considered to be fixed; hence, the total quantity of apples
supplied to the market is fixed. The market thus characterized is that of exchange
rather than that of production. The Walrasian adjustment process is one excellent
way to explain the mechanism of reaching an equilibrium price. Once an equi-
librium price is determined, the producers determine the next year's output
based on the price of apples this year. Here the output-adjusting Marshallian
mechanism is probably most relevant. Note that in this example the quantity of
apples is fixed in the Walrasian adjustment process, whereas the price of apples
is fixed in the Marshallian adjustment process. In this example the behavior of
the market is very similar to that described in the Cobweb model.3 In general,
this does not have to be the case. For example, the market supply of a commodity
can still be a function of price, even if the total amount is fixed (until the produc-
tion of the next period is finished).
In any case, what the above example illustrates is that the Walrasian price
adjustment process is more appropriate for the "temporary" period in which
production is not completed, whereas the Marshallian adjustment is better suited
to the "short-run" period in which the adjustment of output is explicitly con-
sidered. It is important to note that the Marshallian output adjustment process
is, contrary to Hicks' remark, perfectly relevant for a competitive market.
Producers in the above-described apple market are competitive in the sense that
they take the market price of apples as given. The usual discussion that the
Marshallian process is more appropriate for monopoly is thus wrong.
One typical and brilliant diagrammatical analysis of the Marshallian mecha-
nism is found in Marshall's paper "The Pure Theory of Foreign Trade" (privately
printed in 1879; the revision is reprinted in his Money, Trade and Commerce,
London, Macmillan, 1923, appendix J). The curves drawn there became known
later as "offer curves." The intersection of the two offer curves determines the
equilibrium outputs of the two commodities involved. It is assumed that the con-
sumers adjust to their optimum positions instantaneously and the Walrasian
adjustment process is completed instantaneously (the stability in the Walrasian
process is implicitly assumed). The adjustment from an off-equilibrium point to
the equilibrium point described in the above article is purely that of output
adjustment.4
We may remark that both Marshall and Walras clearly realized that there
are these two types of adjustments and they both used them in the proper context.5
Hence it may be rather misleading to call the stability in the price adjustment
process the "Walrasian stability" and the stability in the output adjustment pro-
cess the "Marshallian stability." But since this practice is already much too
common, we will not change it. A difference between these two approaches is
probably that Marshall emphasized the "short-run" output adjustment mecha-
nism and utilized a diagrammatical technique for this adjustment, whereas
Walras emphasized the "temporary" price adjustment mechanism and utilized
a diagrammatical technique for this adjustment in his theory of two-person
exchange.6 Moreover, Walras [5], in his theory of production, treated the output
adjustment process as the one that occurs simultaneously with the price adjustment
process.7
The question still remains whether the Walrasian price adjustment process
is only relevant to the theory of exchange. I believe it is not. As long as both
demand and supply are functions of prices, the prices must be the final adjust-
ment parameter. After the "temporary" and the "short-run" adjustments are
completed, we should find an equilibrium position in which D(p) = S(p). Hence,
if we wish to abstract such "temporary" processes and "short-run" processes,
we may simply assume that both demand and production adjust instantaneously
to price and then consider the time path as described by $\dot{p} = k[D(p) - S(p)]$,
and so on. In other words, we can still consider the price adjustment as
the one that describes the mechanism for the final equilibrium (see Walras' theory
of production [5]).8
In the later revival of the stability theory, starting from Hicks [1] and
Samuelson [4], the Walrasian type of price adjustment has been the main issue
and little attention has been paid to the output adjustment. This is rather un-
fortunate, but as long as the price is the sole independent variable in a competitive
market, it may be natural to emphasize the price adjustment process (either as a
theory of temporary equilibrium in exchange or as a theory of short-run equi-
librium when all adjustments including output are completed).
In any case, this chapter is dedicated to exploring this recent development
in the price-adjusting theory. We will examine both the mathematical technique
and the conceptual difficulties in this recent development of the theory. The
mathematical exploration serves as a beautiful example of the application of
the theory of differential equations to economics.
A short summary of this chapter is now in order. After the exposition on
the elements of the theory of differential equations in Section B, we start our
discussion on the stability of competitive equilibria with a historical survey of the
topic in Section C. In Section D, we give a proof of the global stability of a
three-commodity market. This section is also useful as an illustration of the phase
diagram technique which has recently turned out to be very useful in many other
branches of economics. In Section E, we sketch the proof of the global stability
of a competitive economy given by Arrow, Block, and Hurwicz, and in Section
F, we make some important remarks on the stability analysis of a competitive
market. In Section G, we discuss some basic problems involved in the dynamic
adjustment equations, in particular the tatonnement and the non-tatonnement
processes. We end the chapter with a short survey of Liapunov's "second
method," which recently turned out to be useful in the stability analysis of a
competitive equilibrium (Section H).

FOOTNOTES

1. I am indebted to Takashi Negishi for the subsequent remark. Clearly, mistakes, if
there are any, are my own.
2. This approach is due to Marshall [2]. The terms such as "temporary" and "short
run" are also his. Marshall's "long run" is, as is well known, concerned with the
adjustment process which involves the adjustment of the capital stocks.
3. Such an intertemporal model which contains more than one (production) period is
really outside the scope of the ordinary "stability" analysis, which is concerned with
the adjustment process within the one-period model. The analysis of a multiperiod
model usually belongs to the theory of growth, business cycles, and so forth.
4. For an excellent attempt to reconsider Marshall's study from this viewpoint, see
A. Amano, "Stability Conditions in the Pure Theory of International Trade: A
Rehabilitation of the Marshallian Approach," Quarterly Journal of Economics,
LXXXII, May 1968.
5. For another study (besides Newman [3]) on the Marshallian vs. Walrasian stability
conditions, see D. G. Davis, "A Note on Marshallian versus Walrasian Stability
Conditions," Canadian Journal of Economics and Political Science, 29, November
1963.
6. Marshall did not utilize diagrams for the price adjustment process. Walras did not
utilize diagrams for his output (-price) adjustment process.
7. For the case of a single-commodity isolated market, an example of such a simultaneous
adjustment process may be formulated as follows: $\dot{p} = k_1[D(p) - q]$ and
$\dot{q} = k_2[p - S(q)]$, that is, the system of simultaneous differential equations. For
further studies based on the view that the adjustment processes are simultaneous,
see, for example, M. Morishima, "A Reconsideration of the Walras-Cassel-Leontief
Model of General Equilibrium," in Mathematical Methods in the Social Sciences, 1959,
ed. by K. J. Arrow, S. Karlin, and P. Suppes, Stanford, Calif., Stanford University
Press, 1960, and E. Malinvaud, "Decentralized Procedures for Planning," in
Activity Analysis in the Theory of Growth and Planning, ed. by E. Malinvaud and
M. O. L. Bacharach, London, Macmillan, 1967.
8. Such a view may be represented by some of the above approaches to simultaneous
adjustment of prices and outputs. Another view is that all the adjustments boil
down to price adjustments after all. For example, Jones obtained the stability
condition interpreting the adjustment mechanism in the international trade equilibrium
à la Marshall (that is, the intersection of the two countries' offer curves)
as the one with the price-adjusting type. See R. W. Jones, "Stability Conditions in
International Trade: A General Equilibrium Analysis," International Economic
Review, 2, May 1961.

REFERENCES

1. Hicks, J. R., Value and Capital, 2nd ed., Oxford, Clarendon Press, 1946.
2. Marshall, A., The Principles of Economics, 8th ed., London, Macmillan, 1920.
3. Newman, P., The Theory of Exchange, Englewood Cliffs, N.J., Prentice-Hall, 1965.
4. Samuelson, P. A., Foundations of Economic Analysis, Cambridge, Mass., Harvard
University Press, 1947.
5. Walras, L., Elements of Pure Economics, 1926 ed., tr. by W. Jaffe, London, George
Allen & Unwin, 1954.

Section B
ELEMENTS OF THE THEORY
OF DIFFERENTIAL EQUATIONS

Before we pursue a discussion of the stability of a competitive equilibrium,
let us review some basic material from the theory of differential equations. Not
only has this technique been very useful in science and engineering, but it has also
proved itself useful in economics. This is especially true in problems of dynamics,
that is, problems that have "time" as an essential element, such as those found in
the theory of the stability of an economic (micro or macro) equilibrium as well as
in business cycle theory and growth theory. Hence this discussion is important for
several branches of economics.
First let us consider a very simple example of an "(ordinary) differential
equation":
$$\dot{x}(t) = a\,x(t), \quad a \neq 0, \quad \text{where } \dot{x}(t) \equiv \frac{dx(t)}{dt}$$

Here x(t) is a real-valued differentiable function defined on the real line, and a
is some real number which is constant. The two most basic features of the above
equation that one should note are the following:

(i) The above equation holds for all values of t in the domain (here the entire real
line).
(ii) The function x(t) is not a priori specified, that is, it is an "unknown" function.1

To "solve" a given differential equation(s) is to specify the unknown
function(s) so that the given differential equation(s) is reduced to an identity(ies).
For example, $x(t) = ce^{at}$, where c is some constant, will reduce the above
differential equation to an identity. In other words, we have now "found" the
unknown function, which turns out to be $x(t) = ce^{at}$; hence it is a "solution" of
the above differential equation. The reader may immediately note that c can be
any real number, as long as it is fixed. Hence there are infinitely many solutions.
However, if we specify one more condition in the above equation such that
$x(t^0) = x^0$, then this is no longer the case, for the above solution $x(t) = ce^{at}$ must
satisfy this "boundary condition" (or "initial condition"). The constant c must
take some fixed value, $c = x^0 e^{-at^0}$, so that the solution is now written as
$x(t) = x^0 e^{a(t - t^0)}$. It is easy to check that this solution is unique up to the boundary
condition; that is, if there exists another solution $\tilde{x}(t)$ such that $\tilde{x}(t^0) = x^0$, then
$\tilde{x}(t) = x(t) = x^0 e^{a(t - t^0)}$. Often $t^0$ is taken to be 0. Then $c = x^0$ so that $x(t) = x^0 e^{at}$
is a (unique) solution. When an unspecified constant(s), such as c in the above
example, is specified by a boundary condition(s), the solution thus obtained is
often called a (particular) solution. The solution with an unspecified constant(s) is
often called a general solution. The differential equation in the above example
contains only the first derivative of the unknown function, and it is called a first-order
differential equation. When the highest order of the derivative of the unknown
function is n, it is called an nth-order differential equation. To obtain a
particular solution of an nth-order equation, n boundary conditions are usually
required. It is also possible that we have more than one unknown function. For
example, there can be $x_1(t), x_2(t), \ldots, x_n(t)$, that
is, n unknown functions. For the first-order system of n simultaneous differential
equations, n boundary conditions are usually required to obtain a particular
solution of x(t).
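The claim that the particular solution reduces the equation to an identity can be checked numerically. The sketch below is illustrative only: the values of a, the initial value, and the initial time are arbitrary assumptions, and the derivative is estimated by a central difference rather than computed exactly.

```python
import math

def particular_solution(a, x0, t0):
    """Return x(t) = x0 * exp(a * (t - t0)), the solution of dx/dt = a*x with x(t0) = x0."""
    return lambda t: x0 * math.exp(a * (t - t0))

def residual(x, a, t, h=1e-6):
    """Central-difference estimate of dx/dt minus a*x(t); close to zero for a solution."""
    return (x(t + h) - x(t - h)) / (2.0 * h) - a * x(t)

a, x0, t0 = -0.7, 3.0, 1.0   # illustrative values (assumptions, not from the text)
x = particular_solution(a, x0, t0)

print(x(t0))   # the boundary condition x(t0) = x0
worst = max(abs(residual(x, a, t)) for t in [0.0, 1.0, 2.5, 5.0])
print(worst)   # small, up to finite-difference and rounding error
```

Changing x0 while keeping a fixed picks out a different member of the general solution, which is the sense in which the constant c is pinned down by the boundary condition.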
In passing, we should note one fundamental fact in the above example
$\dot{x}(t) = ax(t)$. We observed that $x(t) = ce^{at}$ is a general solution and a particular
solution is unique up to a boundary condition such as $x(t^0) = x^0$. In other words,
in this example there exists a solution, and a particular solution is unique up to
the boundary condition. This may not necessarily be the case when we are given
a general form of a differential equation(s). However, without the existence (and
uniqueness) of a solution, the study of differential equations will be meaningless
or, at most, uninteresting. Hence the first fundamental theorem in the theory of
differential equations is the one which gives the conditions under which a solution
exists. This is stated later as the Cauchy-Peano theorem. We are now ready to
begin a more formal discussion of differential equations.

Definition: Let $f_i$, i = 1, 2, ..., n, be real-valued functions defined on $X \otimes T$,
where $X \subset R^n$ and $T = (T^0, T^1) \subset R$. The system of equations

$$\dot{x}_i(t) = f_i[x_1(t), x_2(t), \ldots, x_n(t), t], \quad i = 1, 2, \ldots, n$$
where $\dot{x}_i(t) = dx_i(t)/dt$, or, in vector notation,
$$\dot{x}(t) = f[x(t), t]$$
is called a system of n first-order differential equations. An $R^n$-valued function $\phi(t)$
defined on a subinterval of T, $(t^1, t^2)$, is called a solution of the system if
(i) The function $\phi(t)$ is continuous on $(t^1, t^2)$.
(ii) $\phi(t) \in X$ for all t in $(t^1, t^2)$.
(iii) $\dot{\phi}(t) = f[\phi(t), t]$ for all t in $(t^1, t^2)$, except possibly for the elements of
some countable subset of $(t^1, t^2)$.

Let $\dot{x}(t) = f[x(t), t]$ be a system of n first-order differential equations, which
is defined as above. Let $(x^0, t^0)$ be a point in $X \otimes T$ such that $x(t^0) = x^0$. We call
$(x^0, t^0)$ the initial condition if $\phi(t^0) = x^0$. We denote the solution which satisfies
the initial condition by $\phi(t; x^0, t^0)$ or, if there is no danger of confusion, simply by
$x(t; x^0, t^0)$, or $x(t; x^0)$, or even x(t). The notation x(t) is clearly sloppy, but it is
often used in the engineering literature.
REMARK: The system of differential equations
$$\dot{x}(t) = f[x(t), t]$$
is sometimes contrasted with the following special case:
$$\dot{x}(t) = f[x(t)]$$
In the latter case, t does not explicitly appear on the right side of the
equations. The former system is called a nonautonomous system and the latter
is called an autonomous system.
REMARK: Let $\Phi$ be a continuous real-valued function defined on an open
subset of $R^{n+2}$. Then the equation $\Phi[x(t), \dot{x}(t), \ldots, x^{(n)}(t), t] = 0$, where
$x^{(n)}(t) = d^n x(t)/dt^n$, is called an nth-order differential equation. In particular,
if this is written in the form

$$x^{(n)}(t) = \varphi[x(t), \dot{x}(t), \ldots, x^{(n-1)}(t), t]$$

the change of variables, $y_1(t) = x(t)$, $y_2(t) = \dot{x}(t)$, ..., $y_n(t) = x^{(n-1)}(t)$,
allows the equations to be rewritten in the following form, which is a system
of n first-order differential equations:
$$\dot{y}_i = y_{i+1}, \quad i = 1, 2, \ldots, n - 1$$
$$\dot{y}_n = \varphi[y_1(t), y_2(t), \ldots, y_n(t), t]$$
Hence it is sufficient to consider the theory of a system of n first-order
differential equations (although it may not always be convenient).
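The change of variables in this remark is also how higher-order equations are usually handled numerically. The following is a minimal sketch, assuming the illustrative second-order equation x'' = -x with x(0) = 1 and x'(0) = 0 (whose solution is cos t) and a classical Runge-Kutta integrator; the helper names `to_first_order` and `rk4` are hypothetical and do not appear in the text.

```python
import math

def to_first_order(g):
    """Given x^(n) = g(x, x', ..., x^(n-1), t), return F(y, t) for the system
    y_i' = y_{i+1} (i < n) and y_n' = g(y_1, ..., y_n, t)."""
    def F(y, t):
        return list(y[1:]) + [g(*y, t)]
    return F

def rk4(F, y0, t0, t1, steps):
    """Classical fourth-order Runge-Kutta integration of y' = F(y, t) from t0 to t1."""
    h = (t1 - t0) / steps
    y, t = list(y0), t0
    for _ in range(steps):
        k1 = F(y, t)
        k2 = F([yi + 0.5 * h * ki for yi, ki in zip(y, k1)], t + 0.5 * h)
        k3 = F([yi + 0.5 * h * ki for yi, ki in zip(y, k2)], t + 0.5 * h)
        k4 = F([yi + h * ki for yi, ki in zip(y, k3)], t + h)
        y = [yi + h / 6.0 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

# Illustrative equation (an assumption): x'' = -x, so y1 = x and y2 = x'.
F = to_first_order(lambda x, xdot, t: -x)
y_end = rk4(F, [1.0, 0.0], 0.0, math.pi, 1000)
print(y_end)   # y_end[0] approximates cos(pi); y_end[1] approximates -sin(pi)
```

The same `to_first_order` helper would accept an equation of any order, since it only shifts the components of y and appends the highest derivative.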

Definition: Suppose that a system $\dot{x}(t) = f[x(t), t]$ can be written in the form

$$\dot{x}(t) = A(t)\,x(t) + u(t)$$

or equivalently

$$\dot{x}_i(t) = \sum_{j=1}^{n} a_{ij}(t)\,x_j(t) + u_i(t), \quad i = 1, 2, \ldots, n$$

where $A(t) = [a_{ij}(t)]$; then the system is called linear. If $a_{ij}(t)$ = a constant for
all t and for all i and j, then it is called a linear system of constant coefficients. The
function u(t) is called the forcing function or control function of the system, and if
u(t) = 0, the linear system is said to be homogeneous. Unless otherwise specified,
we will be concerned primarily with the nonlinear system $\dot{x}(t) = f[x(t), t]$.
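To make the linear form concrete, the sketch below integrates a two-dimensional linear system of constant coefficients with a constant forcing function. Everything specific here is an assumption chosen for illustration: the matrix A, the forcing vector u, the zero initial state, and the forward Euler scheme. For this particular choice the closed-form solution is known, which gives a check on the numerical path.

```python
import math

def linear_rhs(A, u):
    """Right-hand side of dx/dt = A(t) x + u(t); A(t) returns an n-by-n list, u(t) an n-list."""
    def f(x, t):
        At, ut = A(t), u(t)
        n = len(x)
        return [sum(At[i][j] * x[j] for j in range(n)) + ut[i] for i in range(n)]
    return f

def euler(f, x0, t0, t1, steps):
    """Forward Euler integration of dx/dt = f(x, t) from t0 to t1."""
    h = (t1 - t0) / steps
    x, t = list(x0), t0
    for _ in range(steps):
        dx = f(x, t)
        x = [xi + h * di for xi, di in zip(x, dx)]
        t += h
    return x

# Hypothetical constant coefficients and constant forcing (assumptions).
A = lambda t: [[0.0, 1.0], [-2.0, -3.0]]
u = lambda t: [0.0, 2.0]

x_end = euler(linear_rhs(A, u), [0.0, 0.0], 0.0, 5.0, 5000)
# Closed form for this example: x1 = (1 - e^{-t})^2, x2 = 2e^{-t} - 2e^{-2t}.
exact = [(1.0 - math.exp(-5.0)) ** 2, 2.0 * math.exp(-5.0) - 2.0 * math.exp(-10.0)]
print(x_end, exact)
```

Setting u to the zero vector in this sketch gives the homogeneous case of the definition.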
The fundamental theorem in the theory of differential equations, as
mentioned before, is the Cauchy-Peano theorem, which asserts the existence and
uniqueness of the solution. For the purposes of later chapters (especially Chapter
8), we will state this theorem for a system which has a form slightly different from
the one described above. We consider the form
$$\dot{x}(t) = f[x(t), u(t), t]$$
where u(t) is a known (or a priori given) function (sometimes called a control
function). It is an m-vector-valued function of t [or $u(t) \in R^m$]. We may neglect
this function for the purposes of this chapter. We now state the theorem.

Theorem 3.B.1 (Cauchy-Peano): Let $\dot{x}(t) = f[x(t), u(t), t]$ be a system of n
first-order differential equations, where f is an $R^n$-valued function on $X \otimes R^m \otimes T$
[X is an open connected subset of $R^n$ and $T = (T^1, T^2) \subset R$]. Suppose that the
following conditions hold.
(A-1) The function f is continuous on $X \otimes R^m \otimes T$.
(A-2) The partial derivative $\partial f_i/\partial x_j$ exists and is continuous on $X \otimes R^m \otimes T$ for all i
and j = 1, 2, ..., n.
(A-3) The function u(t) is "piecewise" continuous on T, that is, continuous on T
except possibly for a countable number of points in T.
(A-4) $(x^0, t^0) \in X \otimes T$.
Then there exists a function $\phi(t)$ from some interval $(t^1, t^2)$ containing $t^0$ into $R^n$ such
that2

(i) The function $\phi(t)$ is continuous on $(t^1, t^2)$ and $\phi(t^0) \in X$.
(ii) $\phi(t^0) = x^0$.
(iii) $\dot{\phi}(t) = f[\phi(t), u(t), t]$ (that is, $\phi(t)$ is a solution of the system).
(iv) If $\psi(t)$ satisfies (i), (ii), and (iii) above on an interval $(s^1, s^2)$, then $\psi(t) = \phi(t)$
on $(t^1, t^2) \cap (s^1, s^2)$ (that is, the solution which satisfies the initial condition is
unique).3

REMARKS:

(i) For the proof of this theorem, see Coddington and Levinson [5], chapter
1, or any standard textbook on differential equations.
(ii) Note that no assumptions are made about the existence and continuity of
the partial derivatives $\partial f_i/\partial u_k$.
(iii) The theorem gives a local result, for it establishes the existence of a solution
on an interval $(t^1, t^2)$, which can be very small.
(iv) In the statement of the theorem, $R^m$ can be replaced by any subset of $R^m$
which contains the closure of the range of u(t).
(v) Assumption (A-2) can be weakened; that is, it can be replaced by the
following condition, called the Lipschitz condition.4

(A-2') There exists a constant k > 0 such that

$$\|f(x^1, u, t) - f(x^2, u, t)\| \leq k\,\|x^1 - x^2\|$$

for all $x^1, x^2 \in X$ and $t \in T$, where $\|\cdot\|$ denotes the Euclidean norm.
REMARK: Since the Cauchy-Peano theorem gives only a local result, and
since we are primarily concerned with global results in this chapter [that is,
the process of $t \to \infty$ presupposes the existence of a unique solution on
$(t^0, \infty)$], this theorem is only a guide for the existence of a solution. The global
existence theorem has not been fully established as yet.5 Henceforth we will
often adopt a simple assumption such as the following: "We assume that
there exists a unique solution to the system determined by the initial point
$(x^0, t^0)$." This, however, does not diminish the importance of the Cauchy-
Peano theorem.
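A standard illustration of why a local result need not extend to the whole time axis is the equation dx/dt = x^2, which is not taken from the text. Its right-hand side is continuous with a continuous derivative, yet the solution through x(0) = 1 is x(t) = 1/(1 - t), which exists only for t < 1. The sketch below simply evaluates this closed form and checks it against the equation by a central difference; the function names are hypothetical.

```python
def phi(t, x0=1.0):
    """Solution of dx/dt = x**2 with x(0) = x0; for x0 > 0 it is defined only for t < 1/x0."""
    return x0 / (1.0 - x0 * t)

def residual(t, h=1e-6):
    """Central-difference estimate of d(phi)/dt minus phi(t)**2; near zero for a solution."""
    return (phi(t + h) - phi(t - h)) / (2.0 * h) - phi(t) ** 2

# The solution grows without bound as t approaches 1 from below.
for t in [0.0, 0.5, 0.9, 0.99, 0.999]:
    print(t, phi(t))
print(residual(0.5))
```

In this example no solution through (1, 0) can be continued up to t = 1, so a statement about its behavior as t grows without limit would have no meaning; this is the kind of case that the assumption of a unique solution on the whole half-line rules out.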
REMARK: In the linear system $\dot{x}(t) = A(t)\,x(t) + u(t)$, assumptions (A-1)
and (A-2) are clearly satisfied. Let u(t) be continuous and let $(x^0, t^0)$ be a
point in $X \otimes T$. Then we can show that the unique solution which satisfies the
initial condition exists and is such that $(t^1, t^2)$ in the above theorem may be
replaced by the whole domain of t, that is, T. In other words, for the linear
case, the existence theorem gives a global result.
We will now discuss the stability of a system of n first-order differential
equations, $\dot{x}(t) = f[x(t), t]$, as defined above.

Definition: A point $\hat{x} \in X$ in the above system of differential equations is called
an equilibrium point (state) if $f(\hat{x}, t) = 0$ for all t. In the autonomous system, $\hat{x}$ is
called an equilibrium point if $f(\hat{x}) = 0$.
We are concerned with the behavior of the solution, $\phi(t; x^0, t^0)$, which
satisfies the boundary condition, $\phi(t^0; x^0, t^0) = x^0$ or, in short, $\phi(x^0, t^0) = x^0$. In
particular, we are concerned with whether $\phi(t; x^0, t^0) \to \hat{x}$ as $t \to \infty$. There are two
important concepts in this connection; one is concerned with the behavior of $\phi$
when $x^0$ is "sufficiently" close to $\hat{x}$, and the other is concerned with the behavior of
$\phi$ when $x^0$ is an arbitrary point in the x-plane. The former is the one required for
"local stability" and the latter is the one required for "global stability."
We should also note that the solution path $\phi(t; x^0, t^0)$ depends on $t^0$ as well
as on $x^0$. Here we are concerned with the behavior of $\phi$, regardless of the value of
$t^0$; thus, for example, we can pick an arbitrary value of $t^0$, such as $t^0 = 0$. Hence
our definition of stability is now stated in a form which is independent of the value
of $t^0$.

Definition: An equilibrium state $\hat{x}$ is called globally stable if $\phi(t; x^0, t^0) \to \hat{x}$ as
$t \to \infty$ regardless of the value of $(x^0, t^0)$. Or, more precisely, for any $\epsilon > 0$, there
exists a $\bar{t}$ such that $\|\phi(t; x^0, t^0) - \hat{x}\| \leq \epsilon$ for all $t > t^0 + \bar{t}$, where $\|x^0\| \leq \delta$ and
$\delta > 0$ can be arbitrarily large. (Note that $\bar{t}$ in the above definition depends on $\epsilon$
but not on $x^0$ and $t^0$.)

Definition: An equilibrium state $\hat{x}$ is called locally stable if there exists a closed
ball $B_\delta(\hat{x})$ about $\hat{x}$ with radius $\delta > 0$ such that $x^0 \in B_\delta(\hat{x})$ implies
$\phi(t; x^0, t^0) \to \hat{x}$ as $t \to \infty$. More precisely, for any $\epsilon > 0$ there exists a $\delta(\epsilon)$ (that is, $\delta$ depends on $\epsilon$)
and a $\bar{t}(\epsilon, x^0)$ (that is, $\bar{t}$ depends on $\epsilon$ and $x^0$) such that $\|x^0 - \hat{x}\| < \delta$ implies
$\|\phi(t; x^0, t^0) - \hat{x}\| < \epsilon$ for all $t > t^0 + \bar{t}$. (Note that in the above definition, $\bar{t}$ does
not depend on $t^0$ although it depends on $\epsilon$ and $x^0$.)

REMARK: In the above definitions of global and local stability, $\bar{t}$ may
depend on $t^0$ and $x^0$, and $\delta$ may depend on $t^0$ as well as on $\epsilon$. In order to
discuss this, a more complete exposition of the stability concepts is necessary.
We postpone this task to Section H of this chapter where, among other
things, we will discuss how the above concepts are related to the concepts
called "uniform global stability" and "uniform local stability." Since such
a discussion will be confined solely to Section H, the reader need not worry
about it here.
REMARK: An equilibrium state may not be unique. In other words, there
may exist many values of $x$ such that $f(x, t) = 0$ for all $t$. The above definition
of global stability is much too strong in this case, for it requires that the
solution $\phi(t; x^0, t^0)$ go to a particular equilibrium state regardless of the
initial point $x^0$. When there are multiple equilibria (each of which may or
may not be isolated from the others), and if the solution $\phi(t; x^0, t^0)$ converges
to some equilibrium point, then we may say that the system is globally stable.
If every limit point of $\phi(t; x^0, t^0)$, as $t \to \infty$, is an equilibrium,$^6$ then the system
is said to be quasi-stable. This concept was studied by Uzawa [8]. If the
system is globally stable, then it is quasi-stable. The converse does not necessarily
hold. However, if all the equilibrium points are isolated from each
other, then the quasi-stability of the system implies the global stability of the
system.$^7$
The following simple example may be useful to clarify some of the above
concepts.
EXAMPLE: $\dot{x}(t) = -2x(t)$, $t \in R$, $x \in R$, and $x(t^0) = x^0$.
Clearly $\hat{x} = 0$ (for all $t$) is an "equilibrium state" of this equation. It is
easy to see that this equilibrium state is unique. The solution of this differential
equation is obtained as
$$\phi(t; x^0, t^0) = x^0 e^{-2(t - t^0)}$$
Clearly, $\phi(t; x^0, t^0) \to 0$ $(= \hat{x})$ as $t \to \infty$ regardless of the initial value,
$(x^0, t^0)$. In other words, the equilibrium point of the above differential
equation is unique and globally stable. In general, given $\dot{x}(t) = a x(t)$, $t \in R$,
$x \in R$, and $x(t^0) = x^0$, $\hat{x} = 0$ is a unique equilibrium state. It can easily be
seen that $\hat{x} = 0$ is globally stable if and only if $a < 0$.
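The scalar example above can be checked numerically. The following is only a minimal sketch: the function names `phi` and `euler`, the step count, and the sample values are our own illustrative choices, not part of the text. It compares the closed-form solution $x^0 e^{a(t - t^0)}$ with a crude forward-Euler integration of $\dot{x} = a x$.

```python
import math

def phi(t, x0, t0, a):
    """Closed-form solution x0 * exp(a * (t - t0)) of dx/dt = a * x."""
    return x0 * math.exp(a * (t - t0))

def euler(a, x0, t0, t_end, steps=100_000):
    """Forward-Euler approximation of dx/dt = a * x from t0 to t_end."""
    h = (t_end - t0) / steps
    x = x0
    for _ in range(steps):
        x += h * a * x
    return x

# With a < 0 the path decays toward the equilibrium 0; with a > 0 it moves away.
for a in (-2.0, 0.5):
    print(a, phi(10.0, 3.0, 0.0, a), euler(a, 3.0, 0.0, 10.0))
```

The numerical path is only an approximation; the closed form is what establishes the "if and only if $a < 0$" statement.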
A diagrammatical device is often useful to ascertain the stability property of
an equilibrium state when the dimension of x(t) is small (say, 1 or 2). To illustrate
this, consider the following example:

EXAMPLE: $\dot{x}(t) = -x^3(t)$, $t \in R$, $x \in R$, $x(t^0) = x^0$.
Clearly, $\hat{x} = 0$ (for all $t$) is an equilibrium state. The above differential equation
is illustrated in Figure 3.4. From the diagram it is clear that if $x > 0$,
$\dot{x} < 0$ so that $x(t)$ decreases as $t$ increases, and that if $x < 0$, $\dot{x} > 0$ so
that $x(t)$ increases as $t$ increases. Hence in either case $x(t) \to 0$ as $t \to \infty$,
regardless of the initial position of $x$ [or more precisely, regardless of the
initial value, $(x^0, t^0)$]. It is also easy to see from the diagram that $\hat{x} = 0$
is a unique equilibrium state. The diagrammatical proof of a slightly more
complicated case will be illustrated in Section D in connection with the
proof of global stability of a competitive equilibrium for the three-commodity
case. Such a diagrammatical technique is in general referred to as
the phase diagram technique. Note that in this technique no explicit solution
of the differential equation is sought.

Figure 3.4. An Illustration of Simple Proof of Stability.
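The sign argument behind the phase diagram can be mimicked in a few lines. This is a sketch under our own assumptions: the helper names `direction` and `moves_toward_zero` and the sample points are hypothetical. Checking finitely many points is of course not a proof, only an illustration of reading the direction of motion from the sign of $\dot{x}$ without solving the equation.

```python
def f(x):
    """Right-hand side of the example dx/dt = -x**3."""
    return -x ** 3

def direction(x, rhs=f):
    """Direction of motion at x, read off the sign of rhs(x)."""
    v = rhs(x)
    if v > 0:
        return "increasing"
    if v < 0:
        return "decreasing"
    return "at rest"

def moves_toward_zero(points, rhs=f):
    """True if, at every nonzero sample point, the motion points toward 0."""
    return all(x * rhs(x) < 0 for x in points if x != 0)

for x in (-2.0, 0.0, 2.0):
    print(x, direction(x))
```

Passing a different right-hand side, for example `rhs=lambda x: x`, shows the opposite pattern: motion away from 0 on both sides.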
Suppose that we now have the following homogeneous linear system with
constant coefficients:
$$\dot{x}(t) = A x(t), \quad \text{where } A = [a_{ij}] \text{ is an } n \times n \text{ matrix}$$
Clearly $\hat{x} = 0$ is an equilibrium state and it is unique if $A$ is nonsingular. From the
remark on the Cauchy-Peano existence theorem, we know that there exists a
unique solution. In fact, an explicit solution to the above system can easily be
obtained. We state it as a theorem.

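Theorem 3.B.2: Let $\dot{x}(t) = A x(t)$ be a given system of differential equations
with $x(0) = x^0$. Then
$$\phi(t; x^0) = e^{At} x^0, \quad \text{where } e^{At} \equiv \sum_{k=0}^{\infty} \frac{A^k t^k}{k!}$$
(so that $e^{At}$ is an $n \times n$ matrix).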
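To make the series definition of $e^{At}$ concrete, the following sketch sums the first few dozen terms for a small matrix. The helper names, the truncation at 60 terms, and the $2 \times 2$ matrix (with eigenvalues $-1$ and $-2$) are our own assumptions for illustration; a truncated series is adequate only when $\|At\|$ is moderate.

```python
import math

def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_exp(a, t, terms=60):
    """Partial sum of the series sum_k A^k t^k / k! for k = 0, ..., terms - 1."""
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]          # holds A^k t^k / k!
    for k in range(1, terms):
        term = [[v * t / k for v in row] for row in mat_mul(term, a)]
        result = [[r + v for r, v in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result

def solve(a, x0, t):
    """phi(t; x0) = e^{At} x0 for dx/dt = A x with x(0) = x0."""
    e = mat_exp(a, t)
    return [sum(e[i][j] * x0[j] for j in range(len(x0)))
            for i in range(len(x0))]

# Hypothetical example matrix with eigenvalues -1 and -2.
A = [[0.0, 1.0], [-2.0, -3.0]]
print(solve(A, [1.0, 0.0], 1.0))
```

For this particular $A$ and $x^0 = (1, 0)$ the exact solution is $x_1(t) = 2e^{-t} - e^{-2t}$, $x_2(t) = -2e^{-t} + 2e^{-2t}$, which the partial sums reproduce closely.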

REMARK: For an excellent but simple discussion of the explicit solution of
the linear system, see, for example, Athans and Falb [1], pp. 125-149.
Suppose $A$ has $n$ distinct real eigenvalues,$^9$ $\lambda_1, \lambda_2, \ldots, \lambda_n$; then we know
from the elementary theory of linear algebra that there exists a nonsingular $n \times n$
matrix $P$ such that $P^{-1} A P$ is a diagonal matrix where the diagonal elements are
$\lambda_1, \lambda_2, \ldots, \lambda_n$. Using this property we can easily rewrite the solution of the above
linear system as follows.

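Corollary: If $A$, in the above system, has $n$ distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, then
the solution can be written as
$$\phi(t; x^0) = P e^{\Lambda t} P^{-1} x^0$$
where $P$ is an $n \times n$ nonsingular matrix and$^{10}$
$$e^{\Lambda t} = \begin{bmatrix} e^{\lambda_1 t} & 0 & \cdots & 0 \\ 0 & e^{\lambda_2 t} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & e^{\lambda_n t} \end{bmatrix}$$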

Hence in this case we can see that the stability property depends crucially
on the eigenvalues, the $\lambda_i$'s. In particular, if the $\lambda_i$'s are all negative, then clearly
$e^{\lambda_i t} \to 0$ as $t \to \infty$. Hence the system is globally stable. In general, we have the
following theorem, which holds even when the eigenvalues are not all distinct.

Theorem 3.B.3: Let $\dot{x}(t) = A x(t)$ be a given system of differential equations. The
equilibrium point $\hat{x} = 0$ is globally stable if and only if the real part of every eigenvalue
of $A$ is negative.
PROOF: See, for example, Bellman [2], [3], Coddington and Levinson [5],
Birkhoff and Rota [4], and Gantmacher [6].
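For a $2 \times 2$ matrix the criterion of Theorem 3.B.3 can be checked directly, since the eigenvalues are the roots of $\lambda^2 - (\mathrm{tr}\, A)\lambda + \det A = 0$. The sketch below is an illustration only; the function names and the sample matrices are hypothetical and not taken from the text.

```python
import cmath

def eigenvalues_2x2(a):
    """Roots of lambda^2 - tr(A) * lambda + det(A) = 0 for a 2 x 2 matrix."""
    (a11, a12), (a21, a22) = a
    tr = a11 + a22
    det = a11 * a22 - a12 * a21
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

def is_stable(a):
    """True if every eigenvalue of the 2 x 2 matrix has a negative real part."""
    return all(lam.real < 0 for lam in eigenvalues_2x2(a))

print(is_stable([[0.0, 1.0], [-2.0, -3.0]]))   # eigenvalues -1 and -2
print(is_stable([[0.0, 1.0], [-1.0, 0.0]]))    # purely imaginary eigenvalues
```

The second matrix has eigenvalues $\pm i$: its solutions circle the origin without converging, so the strict inequality in the theorem matters.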
REMARK: If a system is given in the form $\dot{x}(t) = A[x(t) - \hat{x}]$, then,
clearly, $x(t) = \hat{x}$ is an equilibrium state and it is unique if $A$ is nonsingular.
Carrying out the change of variable $y(t) = x(t) - \hat{x}$, we find that $y = 0$ is the
unique equilibrium state for $\dot{y}(t) = A y(t)$, and the above theorem can be
applied immediately.
REMARK: If A is negative definite, then from the elementary theory of
linear algebra, we know that all its eigenvalues are negative; hence the
system is stable.
Given an arbitrary n x n matrix, we now wish to know whether all the eigen-
values of A have negative real parts. There is a famous theorem for this.

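Theorem 3.B.4 (Routh-Hurwitz): A necessary and sufficient condition that all the
roots of the equation
$$a_0 \lambda^n + a_1 \lambda^{n-1} + \cdots + a_n = 0$$
with real coefficients have negative real parts is that the following conditions hold:
$$a_1 > 0, \quad \begin{vmatrix} a_1 & a_0 \\ a_3 & a_2 \end{vmatrix} > 0, \quad \begin{vmatrix} a_1 & a_0 & 0 \\ a_3 & a_2 & a_1 \\ a_5 & a_4 & a_3 \end{vmatrix} > 0, \quad \ldots, \quad \begin{vmatrix} a_1 & a_0 & 0 & 0 & \cdots & 0 \\ a_3 & a_2 & a_1 & a_0 & \cdots & 0 \\ a_5 & a_4 & a_3 & a_2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & 0 & \cdots & a_n \end{vmatrix} > 0$$
Here $a_0$ is taken to be positive (if $a_0 < 0$, then multiply the equation by $-1$).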
PROOF: See Gantmacher [6], chapter XV.
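The determinantal conditions can be evaluated mechanically. The following is a minimal sketch, assuming the usual convention that $a_k = 0$ whenever $k > n$ (or $k < 0$) when filling in the array; the function names are ours, and exact fractions are used only to avoid rounding in the determinants.

```python
from fractions import Fraction

def det(m):
    """Determinant by Gaussian elimination over exact fractions."""
    m = [[Fraction(v) for v in row] for row in m]
    n, sign, result = len(m), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return sign * result

def hurwitz_minors(coeffs):
    """Leading principal minors of the Hurwitz array for a0, a1, ..., an."""
    n = len(coeffs) - 1

    def a(k):
        return coeffs[k] if 0 <= k <= n else 0   # assumed convention

    h = [[a(2 * i - j + 1) for j in range(n)] for i in range(n)]
    return [det([row[:k] for row in h[:k]]) for k in range(1, n + 1)]

def routh_hurwitz_stable(coeffs):
    """True if all roots of a0 x^n + ... + an have negative real parts."""
    if coeffs[0] < 0:
        coeffs = [-c for c in coeffs]
    return all(d > 0 for d in hurwitz_minors(coeffs))

# (x + 1)(x + 2)(x + 3) = x^3 + 6x^2 + 11x + 6
print(hurwitz_minors([1, 6, 11, 6]), routh_hurwitz_stable([1, 6, 11, 6]))
```

As a contrasting case, $\lambda^3 + \lambda^2 + \lambda + 6 = (\lambda + 2)(\lambda^2 - \lambda + 3)$ has two roots with positive real part, and the second minor is negative.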
REMARK: The above condition is known as the Routh-Hurwitz condition.
Its power lies in the fact that it provides a necessary and sufficient condition
for stability. However, in actual application, its power is quite weak because
of the computations required when n is large. In fact, when n > 4, the
computation usually becomes too tedious.
Let us return to the (autonomous) nonlinear differential equation
$$\dot{x}(t) = f[x(t)]$$
to obtain the local property of the solution path. The standard procedure is to
take a Taylor expansion of $f$ about an equilibrium value $\hat{x}$, a point at which
$f(\hat{x}) = 0$ (if there exists such a value), and then to disregard the second and higher
order terms. Thus we obtain the linear differential equation
$$\dot{x}(t) = A[x(t) - \hat{x}]$$
where $A = [a_{ij}]$ and $a_{ij} = \partial f_i/\partial x_j$ (evaluated at $x = \hat{x}$) [in other words, $A = f'(\hat{x})$,
the Jacobian matrix of $f$ at $\hat{x}$]. We call the linear system obtained in this manner
the linear approximation system. It is clear that if an equilibrium point in the linear
approximation system is (globally) stable, then it is locally stable in the original
system. We should note, however, that the converse is not necessarily true. In
other words, it is possible that an equilibrium point is locally or globally stable in
the original system and is not stable in its linear approximation system. This is
due to the fact that higher order terms may act favorably for stability. Consider
the following example.
EXAMPLE: $\dot{x}(t) = a x(t) - x(t)^3$. Clearly $\hat{x} = 0$ is an equilibrium point. If
$a = 0$, $\hat{x} = 0$ is a globally stable equilibrium point. Its linear approximation
system is $\dot{x}(t) = 0$ (when $a = 0$). Hence $\hat{x} = 0$ is not stable in the linear
approximation system: the solution
starting from an initial point $x^0$ always stays at $x^0$. In order to stress this
fact, we call $\hat{x}$, an equilibrium point which is stable in the linear approximation
system, linear approximation stable.$^{11}$ This point is often confused in the
literature which applies Samuelson's "correspondence principle" to the
comparative statics problem. Although the Routh-Hurwitz condition
provides a necessary and sufficient condition for the stability of the linear
approximation system, it does not necessarily provide a necessary condition
for the (local) stability of the original (nonlinear) system owing to the
reason discussed above. Hence the Routh-Hurwitz condition for the linear
approximation system cannot, in general, be utilized in obtaining comparative
statics results.
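The gap between the original system and its linear approximation can be seen numerically for the case $a = 0$. The sketch below integrates both with a standard fourth-order Runge-Kutta step; the integrator, step count, and starting point $x^0 = 1$ are our own illustrative assumptions. For $\dot{x} = -x^3$ the exact solution from $x^0$ at $t^0 = 0$ is $x^0/\sqrt{1 + 2(x^0)^2 t}$, which serves as a check.

```python
import math

A_COEFF = 0.0   # the parameter a of the example

def original(x):
    """Right-hand side a*x - x**3 of the nonlinear system."""
    return A_COEFF * x - x ** 3

def linearized(x):
    """Linear approximation at 0: the derivative a - 3x^2 at x = 0 is a."""
    return A_COEFF * x

def rk4(rhs, x0, t_end, steps=20_000):
    """Classical Runge-Kutta integration of dx/dt = rhs(x) from 0 to t_end."""
    h = t_end / steps
    x = x0
    for _ in range(steps):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * h * k1)
        k3 = rhs(x + 0.5 * h * k2)
        k4 = rhs(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x

print(rk4(original, 1.0, 100.0), 1 / math.sqrt(201.0))   # drifts toward 0
print(rk4(linearized, 1.0, 100.0))                       # never leaves x0
```

Note that the convergence of the nonlinear path is slow (algebraic rather than exponential), which is exactly what the zero eigenvalue of the linear approximation cannot detect.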

FOOTNOTES

1. Observe also that the unknown function here, x(t), contains only one independent
variable, t. This is the defining characteristic of an ordinary differential equation. If
the unknown function contains more than one independent variable, then we have
a "partial differential equation." Here we are solely concerned with ordinary
differential equations.
2. Note that the interval $(t^1, t^2)$ on which the solution $\phi(t)$ is defined is a subset of
$T$. In other words, $\phi(t)$ may not be defined on the entire interval $T$. For example,
consider $\dot{x} = x^2$, $x \in R$. Clearly $\phi(t) = -1/t$ is a solution which passes through
$\phi(1) = -1$. However, $\phi(t)$ is not defined at $t = 0$. The existence theorem here asserts
only the existence of $\phi(t)$ in a neighborhood of $t^0$, that is, $(t^1, t^2)$.
3. When some of the assumptions of the theorem are violated, the solution which
satisfies the initial condition, even if it exists, may not be unique. For example,
consider $\dot{x} = \sqrt{x}$ if $x \ge 0$, and $\dot{x} = 0$ if $x < 0$, with $x \in R$ and $\phi(0) = 0$. Clearly
[$\phi(t) = 0$ for all $t$, $-\infty < t < \infty$] is a solution which satisfies $\phi(0) = 0$. However,
[$\phi(t) = t^2/4$, if $t \ge 0$, and $\phi(t) = 0$, if $t < 0$] is also a solution which satisfies $\phi(0) = 0$.
Here (A-2) is violated at $x = 0$. Given the initial point $(t^0, x^0)$, the problem of finding
the solution $\phi(t)$, defined on $(t^1, t^2)$, of a given system of differential equations which
satisfies $\phi(t^0) = x^0$, is called the initial value problem.
4. It can be shown that if $f(x, t)$ has continuous partial derivatives, it satisfies the
Lipschitz condition. But $f(x) = \sqrt{x}$ (where $x \in R$, $x \ge 0$) does not satisfy even the
Lipschitz condition at $x = 0$. To see this, observe that
$|\sqrt{x} - \sqrt{y}| = |x - y| / (\sqrt{x} + \sqrt{y})$ where $x \ge 0$ and $y \ge 0$. When $x$ and $y$ approach 0, $1/(\sqrt{x} + \sqrt{y})$ will
increase indefinitely. In general, the Lipschitz condition [or (A-2)] is crucial
to guarantee the uniqueness of the solution which satisfies $\phi(t^0) = x^0$. If $f$ is not
Lipschitzian [hence (A-2) is violated] but if all the other assumptions of Theorem
3.B.1 are satisfied, then all the conclusions of Theorem 3.B.1 follow except (iv);
that is, the existence of $\phi(t)$ is guaranteed but not its uniqueness. Here the continuity
of $f$ is the crucial assumption for existence.
5. However, in specific cases, global existence (and uniqueness) can be ascertained.
The procedure is as follows. Suppose that the solution $\phi(t)$ exists. Suppose $f$ is
bounded as well as continuous in $X \otimes T$. Then we can show that $\phi(t^1 + 0)$ [that is,
$\lim \phi(t)$ as $t \to t^1$ with $t > t^1$] and $\phi(t^2 - 0)$ [that is, $\lim \phi(t)$ as $t \to t^2$ with $t < t^2$]
both exist. Suppose $\phi(t^1 + 0)$ and $\phi(t^2 - 0)$ are in $X$; then the solution exists in
neighborhoods of $(t^1 + 0)$ and $(t^2 - 0)$ by Theorem 3.B.1. In this way the solution can
be "continued" or extended to an interval which is larger than $(t^1, t^2)$ and, therefore,
we can prove the existence (and the uniqueness) of solutions for the interval
$(0, \infty)$ under certain assumptions. For the discussion of the "continuation" of
solutions, the reader is referred to any standard textbook on differential equations.
6. In other words, if for some sequence $t_q$, $q = 1, 2, \ldots$, such that $t_q \to \infty$,
$\lim \phi(t_q; x^0, t^0)$ as $q \to \infty$ exists, then $\lim \phi(t_q; x^0, t^0)$ as $q \to \infty$ is an equilibrium. As
Uzawa ([8], p. 619) has shown, the concept of quasi-equilibrium in essence means
that the "distance" between the set of equilibrium points and $\phi(t; x^0, t^0)$ converges
to zero as $t \to \infty$.
7. In other words, whether or not equilibrium points are isolated is crucial. As an
example of a system that is quasi-stable but not (globally) stable, consider the case
in which $\phi(t; x^0, t^0)$ spirals toward the unit circle as $t$ increases but approaches
no single point on the unit circle, while the set of equilibrium points is the unit
circle. See Section H of this chapter.
8. Let $z$ be a scalar (real or complex). Then the exponential function $e^z$ can be
defined by $e^z = \sum_{k=0}^{\infty} z^k/k!$; hence the definition of $e^{At}$ below conforms with
this definition. When $z$ is a real number, then the above definition of $e^z$ can be
obtained as a consequence of the usual definition of $e^z$ by using the Taylor expansion
theorem.
9. Readers who are not familiar with the concept of eigenvalues are referred to the
beginning of Section B, Chapter 4 (or any standard textbooks on matrix algebra).
10. Notice that the definition of $e^{\Lambda t}$ conforms with the above definition of $e^{At}$.
11. In the above example, $\dot{x} = -x^3$ (with $a = 0$), $\hat{x} = 0$ is not linear approximation
stable.

REFERENCES

1. Athans, M., and Falb, P. L., Optimal Control, New York, McGraw-Hill, 1966,
esp. chap. 3.
2. Bellman, R., Stability Theory of Differential Equations, New York, McGraw-Hill,
1953.
3. Bellman, R., Introduction to Matrix Analysis, New York, McGraw-Hill, 1960.
4. Birkhoff, G., and Rota, G.-C., Ordinary Differential Equations, Boston, Ginn & Co.,
1962.
5. Coddington, E. A., and Levinson, N., Theory of Ordinary Differential Equations, New
York, McGraw-Hill, 1955.
6. Gantmacher, F. R., The Theory of Matrices, Vol. II, New York, Chelsea Publishing
Co., 1959 (tr. from Russian).
7. Samuelson, P. A., Foundations of Economic Analysis, Cambridge, Mass., Harvard
University Press, 1947.
8. Uzawa, H., "The Stability of Dynamic Processes," Econometrica, 29, October 1961.

Section C
THE STABILITY OF
COMPETITIVE EQUILIBRIUM-
THE HISTORICAL BACKGROUND

We assume here that a competitive equilibrium is described by the following
system of equations:
$$f_i(p_1, p_2, \ldots, p_n) = 0, \quad i = 1, 2, \ldots, n \quad [\text{or } f(p) = 0]$$
where $p_i$ denotes the price of the $i$th commodity and $f_i$ denotes the excess demand
for the $i$th commodity,$^1$ and we consider its stability. The fundamental assumption
in the stability analysis of a competitive market is that an excess demand for the
$i$th commodity raises the price of the $i$th commodity and that an excess supply of
the $i$th commodity lowers the price of the $i$th commodity. One may question this
traditionally accepted assumption of the competitive market, but here we will
proceed on the basis of this assumption. Note that if the market for a certain commodity
is isolated from all the other markets, the stability analysis is not too difficult
and its solution has already been indicated in Section A. The complications
arise when we have repercussions among a number of markets.
The first satisfactory treatment of the stability of a competitive equilibrium
was done by Leon Walras [18]. He solved this problem fairly completely for the
two-commodity exchange economy. Hicks [7] extended the scope of the analysis
to a multicommodity economy. For the multimarket case, we suspect naturally
that repercussions among the various markets will complicate the analysis a great
deal. In order to deal with this problem, following Hicks, we distinguish two concepts
of "stability." An equilibrium in the market for the $j$th commodity (henceforth
the $j$th market) is said to be imperfectly stable if the markets for all the other
commodities are held in equilibrium (with possible adjustment of the prices of
these goods) and there is stability in the $j$th market. Let $\hat{p}$ be an equilibrium
price vector [that is, $f(\hat{p}) = 0$] (which is assumed to exist), and suppose that
the price vector $p$ deviates from this $\hat{p}$. In order to avoid complicating the discussion,
let us assume that $p$ lies in a certain small neighborhood of $\hat{p}$ (by this
convention we would like to avoid for the time being the problem of multiple
equilibria and local vs. global stability). Then the equilibrium in the $j$th market
is said to be imperfectly stable if
(i) $f_i(p) = 0$ for all $i \ne j$,
and
(ii) $p_j > \hat{p}_j$ implies $f_j(p) < 0$ and $p_j < \hat{p}_j$ implies $f_j(p) > 0$.
The equilibrium in the $j$th market is said to have perfect stability if the above
imperfect stability holds regardless of the number of the other markets adjusted to
equilibrium, or more specifically, whether or not other prices are fixed or adjusted
so as to maintain equilibrium in the relevant market [that is, for $i \ne j$, either
$f_i(p) = 0$ or $p_i$ = constant]. If the equilibrium in every market in the economy
is imperfectly stable (say, at $\hat{p}$), then Hicks states that the equilibrium of the system
is imperfectly stable. If the equilibrium in every market in the economy is perfectly
stable, then Hicks states that the equilibrium of the system is perfectly stable.
The Hicksian method of stability analysis is essentially that of comparative
statics. By differentiating the equilibrium system $f(p) = 0$ at a certain equilibrium
(say, $\hat{p}$) with respect to $p_j$ and applying the definitions of perfect stability
and imperfect stability (then repeating this for all $j = 1, 2, \ldots, n$), Hicks obtained
the following condition for perfect stability for the equilibrium of the system:

$$a_{ii} < 0, \quad \begin{vmatrix} a_{ii} & a_{ij} \\ a_{ji} & a_{jj} \end{vmatrix} > 0, \quad \begin{vmatrix} a_{ii} & a_{ij} & a_{ik} \\ a_{ji} & a_{jj} & a_{jk} \\ a_{ki} & a_{kj} & a_{kk} \end{vmatrix} < 0, \quad \ldots$$
for all $i, j, k, \ldots$ of the index set $\{1, 2, \ldots, n\}$. (See Quirk and Saposnik [16],
pp. 153-160, as well as Hicks [7].) Here $a_{ij} = \partial f_i/\partial p_j$, evaluated at $\hat{p}$, $i, j = 1, 2, \ldots, n$.
It is to be noted that the Walrasian condition for the stability of the two-commodity
economy corresponds to the first of the above conditions (that is,
$a_{ii} < 0$). When an $n \times n$ matrix $A = [a_{ij}]$ satisfies the above condition, $A$ is said to
be Hicksian. The Hicksian condition for the imperfect stability was obtained as
$\Delta_{ii}/\Delta < 0$, where $\Delta$ denotes the determinant of $A$ and $\Delta_{ii}$ denotes the co-factor of
$A$ at $a_{ii}$.
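The Hicksian condition is a statement about the signs of all principal minors: those of order $r$ must have the sign of $(-1)^r$. A small check along these lines is sketched below; the function names and the test matrices are hypothetical, and the cofactor expansion used for the determinant is practical only for small $n$.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def is_hicksian(a):
    """True if every principal minor of order r has the sign of (-1)**r."""
    n = len(a)
    for r in range(1, n + 1):
        for idx in combinations(range(n), r):
            minor = det([[a[i][j] for j in idx] for i in idx])
            if minor * (-1) ** r <= 0:
                return False
    return True

print(is_hicksian([[-2, 1], [1, -2]]))   # minors: -2, -2, and det = 3
print(is_hicksian([[-1, 2], [2, -1]]))   # det = -3 has the wrong sign
```

The second matrix satisfies $a_{ii} < 0$ in each market taken alone, yet fails the second-order condition, which is where the repercussions between markets enter.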
The above Hicksian concepts of perfect and imperfect stability (in addition
to the assumption of timeless and instantaneous adjustment) clearly have an air
of artificiality about them and thus require some further examination. A careful
scrutiny of these concepts will reveal that they may in fact have little to
do with the stability problem that we are considering. Instead of checking these
points, Samuelson [17] proposed a fresh approach to the problem. First, he writes
the fundamental assumption of stability analysis as the following system of differential
equations:
$$\frac{dp_i(t)}{dt} = k_i f_i[p_1(t), p_2(t), \ldots, p_n(t)], \quad i = 1, 2, \ldots, n$$
Here $k_i$ denotes the speed of adjustment of the $i$th market.$^2$ The fundamental assumption
of stability analysis specifies that $k_i$ is strictly positive. Then stability
analysis is reduced to the problem of examining the dynamic system generated by
the above system of differential equations. This amounts to examining the stability
property of the above system of differential equations. Alternatively, one may also
formulate the fundamental assumption of stability analysis in terms of the following
system of difference equations:
$$p_i(t+1) - p_i(t) = k_i f_i[p(t)], \quad i = 1, 2, \ldots, n$$
where $k_i > 0$ is the speed of adjustment of the $i$th market.
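Both formulations can be simulated with the same update rule. The sketch below assumes a hypothetical linear excess demand $f(p) = A(p - \hat{p})$ with made-up numbers; the names `P_HAT`, `A`, `adjust`, and `distance` are ours. With step $h = 1$ the update is the difference equation itself; with a small $h$ it is a forward-Euler approximation of the differential equation.

```python
P_HAT = [1.0, 2.0]                  # hypothetical equilibrium prices
A = [[-2.0, 1.0], [1.0, -2.0]]      # hypothetical slopes of excess demand

def excess_demand(p):
    """Hypothetical linear excess demand f(p) = A (p - p_hat)."""
    return [sum(A[i][j] * (p[j] - P_HAT[j]) for j in range(len(p)))
            for i in range(len(p))]

def adjust(p, k, h, steps):
    """Iterate p_i <- p_i + h * k_i * f_i(p) for the given number of steps."""
    for _ in range(steps):
        f = excess_demand(p)
        p = [pi + h * ki * fi for pi, ki, fi in zip(p, k, f)]
    return p

def distance(p):
    """Largest absolute deviation of p from the equilibrium prices."""
    return max(abs(pi - qi) for pi, qi in zip(p, P_HAT))

p0 = [3.0, 0.5]
print(distance(adjust(p0, [1.0, 1.0], 0.001, 20_000)))   # differential form
print(distance(adjust(p0, [0.25, 0.25], 1.0, 200)))      # difference form
print(distance(adjust(p0, [1.0, 1.0], 1.0, 50)))         # difference form
```

In this made-up example the differential form converges with $k_i = 1$, while the difference form with the same speeds overshoots and diverges; it converges once the speeds are reduced to $0.25$. So the two formulations need not agree on stability for given $k_i$.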
Whichever approach one takes, we say that an equilibrium (or, more specifically,
an equilibrium price vector $\hat{p}$) is "stable" if the time path of the solution of
the dynamic system, starting from an initial point $p^0$, converges to $\hat{p}$. When this is
the case, Samuelson calls the equilibrium truly dynamically stable. We can distinguish
here between local stability and global stability. Samuelson was mainly
concerned with local stability. We may note that either one of the above dynamic
systems describes the behavior of the price vector when it is not an equilibrium
point. We can thus carry out the stability analysis by examining the stability prop-
erty of either of the above dynamic systems. Partly because the theory of dif-
ferential equations is more developed than the theory of difference equations, the
later development occurs mostly through the differential equation approach.
This (dynamic) approach by Samuelson is conceptually much more trans-
parent than Hicks's approach in the sense that it properly handles the general
equilibrium nature of the stability analysis (that is, the repercussions among
various markets). It also has the advantage that it makes clear the dynamic
character of the adjustment process toward an equilibrium.
Samuelson then takes a linear approximation of the above system [that is,
he takes only the linear terms of the Taylor expansion of $f_i(p)$ about an equilibrium
price vector $\hat{p}$]. Noting that $f_i(\hat{p}) = 0$ from the definition of an equilibrium, we
easily obtain
$$\frac{dp_i(t)}{dt} = k_i \sum_{j=1}^{n} a_{ij}[p_j(t) - \hat{p}_j], \quad i = 1, 2, \ldots, n$$
where $a_{ij} = \partial f_i/\partial p_j$ evaluated at $p = \hat{p}$; or, in vector notation
$$\frac{dp(t)}{dt} = KA[p(t) - \hat{p}]$$
where $A = [a_{ij}]$, and $K$ is a diagonal matrix whose diagonal elements are $k_i$ and
whose nondiagonal elements are all zero. Then the stability analysis of a competitive
market is reduced to the stability analysis of the above system of linear differential
equations. We now recall the discussion of Section B, that is, a necessary
and sufficient condition for stability is that all the eigenvalues of $KA$ have negative
real parts. In order to establish this property for the matrix $KA$, we refer to the
Routh-Hurwitz condition (Theorem 3.B.4).$^3$ We recall that stability in the above linear
approximation system (that is, "linear approximation stability") implies local
stability in the original system and that the converse of this statement is not
necessarily true. In other words, an equilibrium point can be locally stable in the
original system but it may not be stable in the linear approximation system. We
note that as long as we deal with the stability of the linear approximation system,
we can also use the theory of linear differential equations.
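Since the matrix of the linear approximation system is $KA$, the speeds of adjustment can matter for stability. The sketch below uses a hypothetical $2 \times 2$ matrix $A$ chosen by us (it is not Hicksian, since $a_{11} > 0$) to show the eigenvalue test giving different answers for two sets of speeds; all names and numbers are illustrative assumptions.

```python
import cmath

def eigenvalues_2x2(m):
    """Roots of lambda^2 - tr(M) * lambda + det(M) = 0 for a 2 x 2 matrix."""
    (m11, m12), (m21, m22) = m
    tr = m11 + m22
    det = m11 * m22 - m12 * m21
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

def speed_weighted(k, a):
    """Return KA, where K is the diagonal matrix of adjustment speeds k."""
    return [[ki * v for v in row] for ki, row in zip(k, a)]

def linear_approximation_stable(k, a):
    """True if every eigenvalue of KA has a negative real part."""
    return all(lam.real < 0 for lam in eigenvalues_2x2(speed_weighted(k, a)))

A = [[1.0, 3.0], [-1.0, -2.0]]      # hypothetical, not taken from the text

print(linear_approximation_stable([1.0, 1.0], A))   # trace -1, det 1
print(linear_approximation_stable([3.0, 1.0], A))   # trace +1, det 3
```

With equal speeds the trace of $KA$ is negative and the determinant positive, so both eigenvalues have negative real parts; tripling the speed in the first market makes the trace positive.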
Samuelson then considers the relation between the Hicksian stability (the
conditions of which were discussed above) and the true dynamic stability (in the
linear approximation system). He concludes that (1) for the two-commodity case,
the two conditions are equivalent, (2) for the three-commodity case, the Hicks
condition for perfect stability (that is, that matrix A be Hicksian) is sufficient for
true dynamic stability, and (3) for the $n$-commodity case ($n > 3$), the Hicks
condition for perfect stability is neither necessary nor sufficient for true dynamic
stability. This relation between Hicks' condition for perfect stability and
dynamic stability is explained in more detail in the literature (for example,
Samuelson [17], Lange [8], Metzler [10], and Morishima [11]), with the following
results.

(i) If $A$ is symmetric (that is, $a_{ij} = a_{ji}$) and if $k_i = 1$ for all $i$, the Hicksian condition
is equivalent to the dynamic condition (Samuelson and Lange). This can
be seen easily by noting that if $A$ is symmetric and Hicksian, then it is negative
definite, which implies that the real parts of the eigenvalues of $A$ are always
negative.
(ii) If $A$ is quasi-negative definite [that is, $(A + A')/2$ is negative definite where $A'$
is the transpose of $A$], and if $k_i = 1$ for all $i$, then Hicks' condition is equivalent
to the dynamic condition (Samuelson).
(iii) If the dynamic process is stable regardless of the values of the speeds of adjustment,
then Hicks' condition must be satisfied (Metzler).
(iv) If $A$ has all its nondiagonal elements positive ($a_{ij} > 0$; $i \ne j$), Hicks' condition
is equivalent to the dynamic condition (Metzler).

In order to understand the meaning of statement (iv), let us consider the
model of pure exchange. Let $x_i$ denote the total demand for the $i$th commodity in
the economy and $\bar{x}_i$ the total holdings of the $i$th commodity in the economy. We
note that $x_i$ is the sum of each consumer's demand for the $i$th commodity. Thus if
$x_{ik}$ denotes the demand for the $i$th commodity by the $k$th consumer and if there
are $m$ consumers in the economy,
$$x_i = \sum_{k=1}^{m} x_{ik}$$
The excess demand function $f_i(p)$ may then be written as
$$f_i(p) = x_i(p) - \bar{x}_i, \quad i = 1, 2, \ldots, n$$
Then
$$a_{ij} = \frac{\partial f_i}{\partial p_j} = \frac{\partial x_i}{\partial p_j} = \sum_{k=1}^{m} \frac{\partial x_{ik}}{\partial p_j}$$
(evaluated at $\hat{p}$). Hence $a_{ij} > 0$ if $\partial x_{ik}/\partial p_j > 0$ for all $k = 1, 2, \ldots, m$. This means
that for each consumer the demand for the $i$th commodity rises when the price of
the $j$th commodity rises. There should be no confusion between this concept of
substitutability and ordinary (net) substitutability. The latter is concerned with
the (positive) effect of a change in the price of commodity $j$ on the demand for
commodity $i$ when real income is properly compensated. Such a qualification of income
compensation is absent in "gross substitutability." That is, when $\partial x_{ik}/\partial p_j > 0$
holds (for all $p$), we say that commodity $i$ is a gross substitute of $j$ for Mr. $k$ with
respect to the change in the price of $j$ ($i \ne j$). Hence $a_{ij} > 0$ for all $i$ and $j$ ($i \ne j$) is
guaranteed if all the commodities are gross substitutes for each other for every
consumer in this pure exchange economy. We call this case, $a_{ij} > 0$ for all $i$ and
$j$ ($i \ne j$), the gross substitute case.
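A pure exchange economy in which every consumer spends fixed budget shares on each good is a convenient illustration of the gross substitute case. The sketch below is built on our own assumptions: the share and endowment numbers, and the names `ALPHA`, `OMEGA`, `jacobian`, and `is_gross_substitute_case`, are hypothetical. It computes $a_{ij}$ by central differences and checks the sign pattern at a given price vector.

```python
ALPHA = [[0.5, 0.3, 0.2], [0.2, 0.2, 0.6]]   # budget shares, one row per consumer
OMEGA = [[1.0, 2.0, 1.0], [2.0, 1.0, 3.0]]   # holdings, one row per consumer

def excess_demand(p):
    """f_i(p): sum over consumers of share_i * income / p_i, minus holdings."""
    incomes = [sum(pj * wj for pj, wj in zip(p, w)) for w in OMEGA]
    return [sum(a[i] * m for a, m in zip(ALPHA, incomes)) / p[i]
            - sum(w[i] for w in OMEGA) for i in range(len(p))]

def jacobian(p, eps=1e-6):
    """Central-difference approximation of a_ij = d f_i / d p_j at p."""
    n = len(p)
    rows = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up, down = p[:], p[:]
        up[j] += eps
        down[j] -= eps
        f_up, f_down = excess_demand(up), excess_demand(down)
        for i in range(n):
            rows[i][j] = (f_up[i] - f_down[i]) / (2 * eps)
    return rows

def is_gross_substitute_case(p):
    """True if every off-diagonal a_ij is positive at the price vector p."""
    a = jacobian(p)
    n = len(p)
    return all(a[i][j] > 0 for i in range(n) for j in range(n) if i != j)

print(is_gross_substitute_case([1.0, 1.0, 1.0]))
```

In this example a rise in $p_j$ raises the value of each consumer's holdings and hence the demand for every other good, so the off-diagonal terms are positive while the diagonal terms are negative.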
This gross substitute case attracted the attention of many economists, and in
1958 (that is, about ten years after the publication of Samuelson's Foundations
[17]), Hahn [6], Negishi [12], and Arrow and Hurwicz [2] independently
proved that if $a_{ij} > 0$, $i \ne j$, then the equilibrium point is stable in the linear approximation
system; hence it is locally stable in the original system. Note that in
statement (iv) above, Hicks' condition for perfect stability is stated as a necessary
and sufficient condition for dynamic stability. What Arrow and Hurwicz, Negishi,
and Hahn proved is that Hicks' condition can be totally dispensed with in the gross
substitute case. The novelty of their proof is that they take full advantage of the
implications of the economic assumptions underlying the competitive model, such
as Walras' Law, and the zero homogeneity of the individual's demand function.
In 1959 Arrow, Block, and Hurwicz [1] finally proved that the original system is
globally stable if all commodities are gross substitutes for each other and put an end
to one of the major periods in the history of the stability of competitive markets.
We may simply list some other major points considered after Samuelson [17].$^4$

(i) The nonnegativity of the price vector has been explicitly considered (Nikaido
and Uzawa [15]).
(ii) Expectation has been introduced into the model (Enthoven and Arrow [5] for
the extrapolative expectation and Arrow and Nerlove [4] for adaptive expectation).
(iii) Non-tâtonnement processes have been introduced and examined.$^5$
(iv) Some attempts to relax the gross substitutability assumption have been made.
In the course of such attempts, important examples for unstable equilibrium
have been discovered by Scarf (see Section F), which in turn cast dark shadows
on the scope of the stability of competitive markets and the method of finding
an equilibrium by such an adjustment mechanism of the markets.$^6$

Finally, two remarks are in order. In the course of the proof, it was noticed
that the speed of adjustment, $k_i$, is immaterial for the stability property. Arrow
and Hurwicz [2] and Arrow, Block, and Hurwicz [1] noted that by choosing the
units of measurement properly, we can choose $k_i = 1$ for all $i$ and for all $t$.$^7$ If this is
the case, our basic dynamic adjustment system is simplified as
$$\frac{dp_i(t)}{dt} = f_i[p_1(t), p_2(t), \ldots, p_n(t)], \quad i = 1, 2, \ldots, n$$
or
$$\frac{dp(t)}{dt} = f[p(t)]$$
The second remark is concerned with the equilibrium state. In the system
$f(p) = 0$, which defines an equilibrium state, we note that one commodity can
be taken as the numeraire (for example, $p_0 = 1$). If every individual's budget
relation holds with equality (that is, if everybody spends all his "income" as a
result of nonsatiation and the like), then we have the relation known as Walras'
Law. That is, the price-weighted sum of all the excess demands is identically
equal to zero. This relation is supposed to hold whether the economy is in equilibrium
or not. When one of the prices is taken to be the numeraire and one of
the equations in the system is dropped because of Walras' Law, we say that the
system is a normalized system; otherwise it is a nonnormalized system. That one
commodity can be chosen as numeraire depends on the homogeneity assumptions.
For the nonnormalized system, none of the commodities is designated as numeraire,
although the homogeneity of the excess demand functions is usually assumed
to be still binding on the system. The Hicksian discussion on stability is based on
the normalized system, while dynamic stability can be (and has been) discussed
under either the normalized or the nonnormalized system. Interestingly enough,
the proof of the dynamic stability of the normalized system is in general different
from that of the nonnormalized system, and the stability relation between the two
systems has not been studied thoroughly.
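Walras' Law, zero homogeneity, and the normalized adjustment process can all be illustrated in one small exchange economy. The following is only a sketch under our own assumptions: consumers with fixed budget shares, made-up shares and holdings, commodity 0 as numeraire, and a forward-Euler approximation of $dp_i/dt = f_i(p)$ for the remaining prices. None of the names or numbers come from the text.

```python
ALPHA = [[0.5, 0.3, 0.2], [0.2, 0.2, 0.6]]   # budget shares, one row per consumer
OMEGA = [[1.0, 2.0, 1.0], [2.0, 1.0, 3.0]]   # holdings, one row per consumer

def excess_demand(p):
    """f_i(p): sum over consumers of share_i * income / p_i, minus holdings."""
    incomes = [sum(pj * wj for pj, wj in zip(p, w)) for w in OMEGA]
    return [sum(a[i] * m for a, m in zip(ALPHA, incomes)) / p[i]
            - sum(w[i] for w in OMEGA) for i in range(len(p))]

def walras_gap(p):
    """Price-weighted sum of excess demands; zero under Walras' Law."""
    return sum(pi * fi for pi, fi in zip(p, excess_demand(p)))

def normalized_tatonnement(p, h=0.01, steps=5000):
    """Euler path of dp_i/dt = f_i(p) for i >= 1, holding p_0 = 1 fixed."""
    p = [1.0] + list(p[1:])
    for _ in range(steps):
        f = excess_demand(p)
        p = [1.0] + [pi + h * fi for pi, fi in zip(p[1:], f[1:])]
    return p

p_out = [0.7, 1.9, 3.2]                       # an arbitrary disequilibrium point
print(walras_gap(p_out))                      # approximately zero
print(excess_demand(p_out), excess_demand([3 * v for v in p_out]))
print(normalized_tatonnement([1.0, 3.0, 0.3]))
```

For these made-up numbers the equilibrium with $p_0 = 1$ works out to $p_1 = 0.76$, $p_2 = 1.08$, and the market for the numeraire clears as well once the other two markets do, as Walras' Law requires.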

FOOTNOTES

1. Let there be n + 1 commodities. Assuming homogeneity, choose one commodity

(say the 0-th commodity) as the numeraire. By Walras' Law one equation can
be dropped and the above system of n equations describes the equilibrium. Recall
our discussion in Section E of Chapter 2.
2. In a dynamic form, this means that the price of the ith commodity rises if its demand
exceeds its supply and falls in the opposite case, the so-called "law of supply and
demand."
3. As long as we are talking about linear approximation stability, the difference equation
approach is as good as the differential equation approach. For the analysis of the
stability of a nonlinear system, the differential equation approach is often more convenient.
The condition which corresponds to the Routh-Hurwitz condition in the
difference equation case is known as the Schur-Cohn condition. See Samuelson [17], for example.
4. For an excellent survey article on the stability problem of competitive equilibrium,
see Negishi [13]. See also Quirk and Saposnik [16], especially chapter 5.
5. For the survey of this discussion, see Negishi [13] and Section G of this chapter.
6. In addition to these attempts, which deepened our understanding of the stability
theory of competitive markets, important mathematical techniques were made
known to economists and added to the list of standard tools of economic analysis.
For example, the mathematical theory of the Leontief's input-output analysis was
found to be relevant to stability analysis, resulting in the inclusion of the theory of
"dominant diagonal matrices" in our box of standard tools (see Chapter 4). Also the
importance of the Liapunov "second method" in economic analysis was recognized
and added to our standard tools.
7. As Parry Lewis pointed out, this convention of setting ki = 1 for all i "by a suitable
choice of units of commodities" implies that these commodities have to be measured
in peculiar units. Although this does not seem to affect the analysis by Arrow and
Hurwicz [2] and Arrow, Block, and Hurwicz [1] with gross substitutability,
serious confusions may occur unless these units of measurement and dimensions are
kept carefully in mind. See J. P. Lewis, "Dimensions in Economic Theory," Manchester
School of Economic and Social Studies, 31, September 1963. Moreover, we may
also note that while the equilibrium may become stable for one set of speeds of
adjustment, it may remain unstable for another set of speeds of adjustment. To
illustrate this point, consider the case in which each market is stable when the
repercussions from the other markets are ignored, but the equilibrium of the system
as a whole is unstable for one set of speeds of adjustment; then, by making speeds
of adjustment large enough in a sufficiently large number of markets, we may
actually obtain the stability of the equilibrium of the system as a whole.
8. If there are n (instead of n + 1) commodities in the economy, $dp_i/dt = f_i(p_1, p_2, \ldots, p_n)$,
$i = 1, 2, \ldots, n$, describes the adjustment mechanism of the nonnormalized
system. Here none of the commodities is taken to be a numeraire.

320 THE STABILITY OF COMPETITIVE EQUILIBRIUM

REFERENCES
1. Arrow, K. J., Block, H. D., and Hurwicz, L., "On the Stability of the Competitive
Equilibrium, II," Econometrica, 27, January 1959.
2. Arrow, K. J., and Hurwicz, L., "On the Stability of the Competitive Equilibrium,
I," Econometrica, 26, October 1958.
3. Arrow, K. J., and Hurwicz, L., "Decentralization and Computation in Resource Allocation,"
in Essays in Economics and Econometrics, ed. by Pfouts, Chapel Hill, N. C., University
of North Carolina Press, 1960.
4. Arrow, K. J., and Nerlove, M., "A Note on Expectation and Stability," Econometrica,
26, April 1958.
5. Enthoven, A. C., and Arrow, K. J., "A Theorem on Expectations and the Stability of
Equilibrium," Econometrica, 24, July 1956.
6. Hahn, F. H., "Gross Substitutes and the Dynamic Stability of General Equilibrium,"
Econometrica, 26, January 1958.
7. Hicks, J. R., Value and Capital, 2nd ed., Oxford, Clarendon Press, 1946.
8. Lange, O., Price Flexibility and Employment, Bloomington, Ind., Principia Press, 1944.
9. McKenzie, L. W., "Stability of Equilibrium and the Value of Positive Excess
Demand," Econometrica, 28, July 1960.
10. Metzler, L., "Stability of Multiple Markets: The Hicks Conditions," Econometrica,
13, October 1945.
11. Morishima, M., "Notes on the Theory of Stability of Multiple Exchange," Review of
Economic Studies, XXIV, 1957.
12. Negishi, T., "A Note on the Stability of an Economy Where All Goods Are Gross
Substitutes," Econometrica, 26, July 1958.
13. Negishi, T., "The Stability of a Competitive Economy: A Survey Article," Econometrica,
30, October 1962.
14. Negishi, T., The Theory of Price and Resource Allocation, Tokyo, Toyokeizai Shimpo-sha,
1965 (in Japanese).
15. Nikaido, H., and Uzawa, H., "Stability and Nonnegativity in Walrasian Taton-
nement Process," International Economic Review, 1, January 1960.
16. Quirk, J., and Saposnik, R., Introduction to General Equilibrium Theory and Welfare
Economics, New York, McGraw-Hill, 1968.
17. Samuelson, P. A., Foundations of Economic Analysis, Cambridge, Mass., Harvard
University Press, 1947.
18. Walras, L., Elements of Pure Economics, 4th ed., tr. by Jaffe, London, George Allen
& Unwin, 1954.

For a more complete bibliography on this topic, see Negishi [13].

AN ILLUSTRATION OF THE PHASE DIAGRAM TECHNIQUE 321

Section D
A PROOF OF GLOBAL STABILITY
FOR THE THREE-COMMODITY CASE
(WITH GROSS SUBSTITUTABILITY)-
AN ILLUSTRATION OF THE
PHASE DIAGRAM TECHNIQUE

Here we consider a model of a competitive economy in which there are only
three commodities. We prove the global stability of the system using a diagrammatical
technique called the phase diagram technique. Since this technique has
many applications in other branches of economics, such as macroeconomics and
growth theory, the reader will benefit by becoming familiar with its use. This technique
was first introduced to economics by Marshall [3] and adapted by Hicks [2]
to the stability analysis of a competitive market. But its first rigorous use with
complete recognition of the assumptions made in its formulation was by Arrow
and Hurwicz [1] in proving the global stability of the three-commodity economy.
Our exposition is based chiefly on this paper. The discussion here will illustrate the
use of the phase diagram technique as well as a proof of global stability, especially
its use of the economic assumptions involved in the system.
Let $f_i(p)$, where $p = (p_1, p_2, p_3)$, denote the excess demand function of the
ith commodity. Consider the model of a competitive equilibrium described by
$f_i(p_1, p_2, p_3) = 0$, $i = 1, 2, 3$
or, in short, $f_i(p) = 0$, $i = 1, 2, 3$. We assume
(A-1) (Gross substitutability) $f_{ij}(p) > 0$ for all values of p, $i \neq j$, $i, j = 1, 2, 3$.$^{1}$
(A-2) (Homogeneity) $f_i(p)$, $i = 1, 2, 3$, are positively homogeneous of degree 0
[that is, $f_i(\alpha p) = f_i(p)$ for all $\alpha > 0$, $\alpha \in R$].
(A-3) (Walras' Law) $\sum_{i=1}^{3} p_i f_i(p) = 0$ for all p.
(A-4) $p_i > 0$, $i = 1, 2, 3$.
(A-5) There exists at least one $\hat{p}$ such that $f_i(\hat{p}) = 0$ for all i.$^{2}$
In view of (A-2), we henceforth normalize the price vector p such that $p_3 = 1$
always.$^{3}$
Our dynamic adjustment process is described by
$\dot{p}_i = k_i f_i(p_1, p_2, p_3)$, $k_i > 0$, $i = 1, 2$
Note that $f_1 = 0$ and $f_2 = 0$ imply $f_3 = 0$ from Walras' Law (that is, if the first two
markets are brought into equilibrium, the third market is automatically brought
into equilibrium). Hence it suffices to consider the adjustment process of the first
two markets for the stability analysis.

Our problem is to find out whether or not the solution of the above system of
differential equations, $p(t; p^0, 0)$, or simply $p(t; p^0)$, converges to the equilibrium
price vector $\hat{p}$, where $\hat{p}$ is defined by $f_i(\hat{p}) = 0$, $i = 1, 2, 3$. The phase diagram
technique is a device which shows the time path of $p(t; p^0)$ without explicitly solving
the differential equations. The technique (for the present case) is essentially based
on the fact that each $[\dot{p}_i = 0]$ curve or $[f_i(p) = 0]$ curve ($i = 1, 2$) [that is, the
locus of $(p_1, p_2)$ such that $\dot{p}_i = k_i f_i(p) = 0$] divides the entire $(p_1$-$p_2)$-plane into two
regions: the region in which $\dot{p}_i > 0$ and the region in which $\dot{p}_i < 0$, where $i = 1, 2$.
Recall in this connection that $p_3 = 1$ always. Hence we can omit any consideration
of $p_3$ or $\dot{p}_3$. This enables us to consider the problem in the two-dimensional plane.
First we ascertain the shape of the $[f_i(p) = 0]$ curves ($i = 1, 2$). We assert
that they are both upward sloping and that they intersect only once [hence an
equilibrium point, that is, a point in which $f_i(p) = 0$ for all i, if it exists, is unique].
Moreover, we can assert that the $[f_2(p) = 0]$ curve intersects the $[f_1(p) = 0]$
curve "from the left." By checking the signs of $\dot{p}_1$ and $\dot{p}_2$ in the four regions defined
by these two curves, we will be able to ascertain the global stability of $\hat{p}$. We now
pursue this process in detail.
First observe that the following "Euler's equation" holds owing to the homogeneity
assumption (A-2):

$\sum_{j=1}^{3} f_{ij} p_j = 0$ for all p, $i = 1, 2, 3$

Then in view of (A-1) and (A-4), we obtain

$f_{ii} < 0$ for all p, $i = 1, 2, 3$
Consider the values of $p_1$ and $p_2$ for which $f_1 = 0$. In the $(p_1$-$p_2)$-plane this defines
a curve. We want to obtain the slope of this curve, that is, $dp_2/dp_1$. This can be
obtained simply by differentiating $f_1 = 0$. That is, $f_{11} dp_1 + f_{12} dp_2 = 0$ (∵ $p_3 = 1$ or
$dp_3 = 0$). Thus $dp_2/dp_1 = -f_{11}/f_{12}$, which is positive from the fact that $f_{ii} < 0$ for all
i and from (A-1). Similarly, we obtain the slope of the curve defined by $f_2 = 0$ on
the $(p_1$-$p_2)$-plane by differentiating $f_2 = 0$, that is, $f_{21} dp_1 + f_{22} dp_2 = 0$. Hence
$dp_2/dp_1 = -f_{21}/f_{22}$, which again is positive. Thus we have established that both
the $[f_1(p) = 0]$ curve and the $[f_2(p) = 0]$ curve are upward sloping.
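
The sign pattern above can be checked numerically on a concrete case. The sketch below is only an illustration under stated assumptions: the excess demand function comes from a single hypothetical Cobb-Douglas consumer (the shares ALPHA and the endowment OMEGA are invented for the example and do not appear in the text), and the derivatives $f_{ij}$ are approximated by central differences. It evaluates Euler's equation, the own-price derivatives, and the two slopes $-f_{11}/f_{12}$ and $-f_{21}/f_{22}$ at one price vector with $p_3 = 1$.

```python
# Hypothetical example economy (not from the text): one Cobb-Douglas consumer
# with expenditure shares ALPHA and endowment OMEGA, so that
# f_i(p) = ALPHA[i] * (p . OMEGA) / p_i - OMEGA[i].
ALPHA = (0.5, 0.3, 0.2)
OMEGA = (1.0, 1.0, 1.0)


def excess_demand(p):
    """Excess demand vector f(p) of the assumed example economy."""
    income = sum(pj * wj for pj, wj in zip(p, OMEGA))
    return [a * income / pj - w for a, pj, w in zip(ALPHA, p, OMEGA)]


def jacobian(p, h=1e-6):
    """Central-difference approximation of f_ij = d f_i / d p_j."""
    n = len(p)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = list(p)
        up[j] += h
        dn = list(p)
        dn[j] -= h
        f_up, f_dn = excess_demand(up), excess_demand(dn)
        for i in range(n):
            J[i][j] = (f_up[i] - f_dn[i]) / (2 * h)
    return J


p = (2.0, 3.0, 1.0)  # p_3 = 1 is the normalization
J = jacobian(p)

# Euler's equation: sum_j f_ij * p_j should vanish for every i.
euler_sums = [sum(J[i][j] * p[j] for j in range(3)) for i in range(3)]

# Slopes dp2/dp1 of the level curves of f_1 and f_2 through p.
slope_f1 = -J[0][0] / J[0][1]
slope_f2 = -J[1][0] / J[1][1]

print("Euler sums:", euler_sums)
print("own-price derivatives:", [J[i][i] for i in range(3)])
print("slopes:", slope_f1, slope_f2)
```

For this assumed economy the off-diagonal entries are positive, the diagonal entries are negative, and both slopes are positive, which is the pattern the argument relies on.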
Next we ascertain the signs of $\dot{p}_i$ in the regions defined by these two curves.
Since $f_{12} > 0$ for all p by (A-1), to the left (resp. right) of the $[f_1(p) = 0]$ curve,
$f_1(p) > 0$ or $\dot{p}_1 > 0$ (resp. $f_1(p) < 0$ or $\dot{p}_1 < 0$). Since $f_{21} > 0$ for all p by (A-1), to the
left (resp. right) of the $[f_2(p) = 0]$ curve, $f_2(p) < 0$ or $\dot{p}_2 < 0$ (resp. $f_2(p) > 0$ or
$\dot{p}_2 > 0$). In establishing the above facts, we can use $f_{11} < 0$ and $f_{22} < 0$ instead of
$f_{12} > 0$ and $f_{21} > 0$, respectively.
By (A-5), there exists at least one equilibrium point. Thus the two curves, the
$[f_1(p) = 0]$ curve and the $[f_2(p) = 0]$ curve, intersect at least once. Now we
assert that such an intersection happens only once. In other words, we assert that
the equilibrium price vector $\hat{p}$ is unique. To show this, suppose the contrary, that
is, suppose that there exists another equilibrium point $p^* > 0$, $f_i(p^*) = 0$ and
$p^* \neq \hat{p}$, where $p^*_3 = \hat{p}_3 = 1$. Then the $[f_1(p) = 0]$ curve and the $[f_2(p) = 0]$ curve
intersect at least twice, that is, at points $p^*$ and $\hat{p}$. Then at one of these two points
(say, at $p^*$) the $[f_1(p) = 0]$ curve must intersect the $[f_2(p) = 0]$ curve from the
left. This is illustrated in Figure 3.5.
Figure 3.5. The Proof of the Uniqueness of an Equilibrium.

Now consider a point $\tilde{p}$ in the diagram; $\tilde{p}$ is chosen such that $\tilde{p}_1 > p^*_1$ and $\tilde{p}_2 > p^*_2$.
(Note that $\tilde{p}_3 = p^*_3 = \hat{p}_3 = 1$.) Then $f_1(\tilde{p}) > 0$ and $f_2(\tilde{p}) > 0$. Totally differentiate
$f_3(p)$, and obtain

$df_3 = f_{31} dp_1 + f_{32} dp_2$ (∵ $dp_3 = 0$ for $p_3 = 1$ always)

Compare the two points $p^*$ and $\tilde{p}$. Since $f_3(p^*) = 0$, the above equation implies
$f_3(\tilde{p}) > 0$ (∵ $f_{31} > 0$, $f_{32} > 0$, $dp_1 > 0$, $dp_2 > 0$, so that $df_3 > 0$), so that $f_i(\tilde{p}) > 0$
for all $i = 1, 2, 3$. This contradicts Walras' Law, or (A-3), in view of (A-4). Hence
we establish the uniqueness of an equilibrium point and that $p^*$ cannot be an
equilibrium point; that is, the $[f_2(p) = 0]$ curve intersects the $[f_1(p) = 0]$ curve
from the left only (and the intersection point is $\hat{p}$).$^{4}$
We now obtain the phase diagram illustrated in Figure 3.6, from which the
global stability of $\hat{p}$ can easily be seen.$^{5}$ In the diagram the time paths of the price
vector $(p_1, p_2)$ corresponding to two possible initial points are illustrated.
For example, consider the initial point $p^0$ in Figure 3.6. At $p^0$, $f_1 > 0$ and $f_2 < 0$
so that $\dot{p}_1 > 0$ and $\dot{p}_2 < 0$. In other words, $p_1$ increases whereas $p_2$ decreases over
time. The price path of $[p_1(t), p_2(t)]$ eventually hits the $[f_1 = 0]$ curve, say, at
point A where $f_2$ is still negative. At point A, then, $\dot{p}_1 = 0$ and $\dot{p}_2 < 0$, so that the
price path enters the region in which $f_1 < 0$ and $f_2 < 0$, where both $p_1$ and $p_2$ decrease
over time. Notice that the price path may hit the $[f_2 = 0]$ curve afterward,
say, at point B. But at point B, $\dot{p}_1 < 0$. Hence the price path will "bounce back" to
the region in which $f_1 < 0$ and $f_2 < 0$, approaching the equilibrium point.
The essence of the phase diagram proof of stability here, under gross substitutability,
is that, regardless of the initial value of the price vector, the price path
$[p_1(t), p_2(t)]$ is "trapped" inside the region in which $\dot{p}_1 < 0$ and $\dot{p}_2 < 0$ with $p_1 > \hat{p}_1$
and $p_2 > \hat{p}_2$, or inside the region in which $\dot{p}_1 > 0$ and $\dot{p}_2 > 0$ with $p_1 < \hat{p}_1$ and $p_2 < \hat{p}_2$.

Figure 3.6. An Illustration of the Phase Diagram.
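
The behavior of the price path can also be traced numerically. The following is a minimal sketch, not part of the Arrow-Hurwicz argument: it assumes a hypothetical Cobb-Douglas excess demand function (shares and endowments invented for illustration), sets $p_3 = 1$, takes $k_1 = k_2 = 1$, and integrates $\dot{p}_i = k_i f_i(p)$ for $i = 1, 2$ with a simple forward Euler scheme. For this assumed economy the normalized equilibrium is $(p_1, p_2) = (2.5, 1.5)$.

```python
# Hypothetical three-commodity economy (an assumption for illustration only):
# f_i(p) = alpha_i * (p . endowment) / p_i - endowment_i, which satisfies
# gross substitutability, homogeneity of degree 0, and Walras' Law.
def excess_demand(p, alpha=(0.5, 0.3, 0.2), endowment=(1.0, 1.0, 1.0)):
    income = sum(pj * wj for pj, wj in zip(p, endowment))
    return [a * income / pj - w for a, pj, w in zip(alpha, p, endowment)]


def simulate(p1, p2, k=(1.0, 1.0), dt=0.01, steps=20000):
    """Forward-Euler path of (p1, p2) with p3 fixed at 1 (the numeraire)."""
    path = [(p1, p2)]
    for _ in range(steps):
        f = excess_demand((p1, p2, 1.0))
        p1 += dt * k[0] * f[0]
        p2 += dt * k[1] * f[1]
        path.append((p1, p2))
    return path


# Start in the region where f_1 > 0 and f_2 < 0, as in the discussion of p^0.
path = simulate(0.5, 4.0)
print("start:", path[0], "end:", path[-1])
```

Plotting the list `path` in the $(p_1$-$p_2)$-plane, together with the two curves $f_1 = 0$ and $f_2 = 0$, reproduces the kind of picture the phase diagram describes; the step size and horizon are arbitrary choices.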

FOOTNOTES

1. As before, $f_{ij}$ denotes $\partial f_i/\partial p_j$.

2. We will observe that under the present set of assumptions, especially (A-2), the
equilibrium $\hat{p}$ is unique. The proof of the existence of equilibrium is fairly simple
under the present set of assumptions, (A-1) to (A-4), especially with gross substitutability.
For such a proof, see H. Nikaido, "Generalized Gross Substitutability
and Extremization," Advances in Game Theory, ed. by M. Dresher, L. S. Shapley,
and A. W. Tucker, Princeton, N.J., Princeton University Press, 1964. For a
generalization of Nikaido's result which relaxes the assumption of the continuous
differentiability of the excess demand functions, see K. Kuga, "Weak Gross
Substitutability and the Existence of Competitive Equilibrium," Econometrica,
33, July 1965.
3. The normalization thus amounts to adding another equation $p_3 = 1$ to the system.
Assumption (A-2) implies that if $\hat{p}$ is an equilibrium price vector, then $\alpha \hat{p}$ is also
an equilibrium price vector for any $\alpha > 0$. By the normalization with $p_3 = 1$, this
is no longer possible.
4. Using similar logic, we can prove easily that the $[f_1(p) = 0]$ curve will not overlap
with the $[f_2(p) = 0]$ curve for any "interval." To see this, suppose that the two
curves overlap over the interval between the two points $p^*$ and $\hat{p}$ in Figure 3.5. Then
we have $f_i(p^*) = f_i(\hat{p}) = 0$, $i = 1, 2$, so that $f_3(p^*) = f_3(\hat{p}) = 0$. But using
$df_3 = f_{31} dp_1 + f_{32} dp_2$ and $f_3(p^*) = 0$, we also obtain $f_3(\hat{p}) > 0$. This contradicts
$f_3(\hat{p}) = 0$.
5. In general, let $\dot{x}(t) = f[x(t)]$, where $\dot{x}_i(t) = f_i[x(t)]$, be a given system of differential
equations. The solution $x(t)$ of this differential equation defines a curve on the
$(x_1$-$x_2)$-plane for a given initial point $x^0$, where t is taken to be the parameter. With
the existence and the uniqueness of a (continuous) solution, such a curve is uniquely
drawn for each given $x^0$ and is continuous. Such a curve is called the (solution)
path or orbit, and the $(x_1$-$x_2)$-plane on which the solution path is drawn is called
the phase space. Since there can be many possible initial points, we can draw a family
of the solution paths, each path corresponding to each initial point. The phase
diagram technique is concerned with the technique of studying the behavior of the
solution paths on the phase space, without actually solving the given system of
differential equations.

REFERENCES

1. Arrow, K. J., and Hurwicz, L., "On the Stability of the Competitive Equilibrium, I,"
Econometrica, 26, October 1958.
2. Hicks, J. R., Value and Capital, 2nd ed., Oxford, Clarendon Press, 1946.
3. Marshall, A., Money, Credit and Commerce, London, Macmillan, 1923, appendix J
(this appendix was originally published in 1879 as "The Pure Theory of Foreign
Trade").

Section E
A PROOF OF GLOBAL STABILITY
WITH GROSS SUBSTITUTABILITY-
THE n-COMMODITY CASE

In this section we study the proof of the global stability of a competitive
equilibrium due to Arrow, Block, and Hurwicz [1].$^{1}$ For this exposition, I
benefited from an excellent survey article on the stability problem by Negishi [3]
and a critical scrutiny of [1] by Hotaka [2]. We consider the following nonnormalized
system of an n-commodity pure exchange economy:$^{2}$

(1) $\frac{dp_i(t)}{dt} = f_i[p_1(t), p_2(t), \ldots, p_n(t)] = x_i[p_1(t), p_2(t), \ldots, p_n(t)] - \bar{x}_i$

where the functions $f_i$ are defined on the interior of the nonnegative orthant of
$R^n$ and are assumed to be continuously differentiable. Here we adopt the Arrow-Hurwicz
convention of setting all the speeds of adjustment equal to one by
choosing units of measurement for each commodity properly. Denote the
price vector $(p_1, p_2, \ldots, p_n)$ by p. It should be understood that p is a function
of time, t; that is, $p = p(t)$. The price vector $p(t)$, which is the solution
of the above system of equations, obviously depends on the initial condition,
$p(0)$. We denote the value of $p(0)$ by $p^0$ and assume that it is positive. We
also assume that, for any given initial price $p^0$, there exists a unique solution
$p(t, p^0)$, $t \in [0, \infty)$, for the above dynamic system; that there exists a positive equilibrium
price vector $\hat{p}$ [that is, $f_i(\hat{p}) = 0$, $i = 1, 2, \ldots, n$]; and that the demand functions are
single-valued and continuously differentiable. Finally, we assume that the following
relations hold:
(A-1) (Walras' Law) $\sum_{i=1}^{n} p_i f_i(p) = 0$.
(A-2) (Homogeneity) $x_i(p) = x_i(\alpha p)$, $i = 1, 2, \ldots, n$, for any positive number $\alpha$.
(A-3) (Gross substitutability) $\partial x_i(p)/\partial p_j > 0$ for all p, $i \neq j$, $i, j = 1, 2, \ldots, n$.$^{3}$

REMARK: All the above relations are assumed to hold for any $t \geq 0$ and
$p(t) > 0$.$^{4}$ Thus we may rewrite Walras' Law as

$\sum_{i=1}^{n} p_i(t) f_i[p(t)] = 0$

The homogeneity assumption, (A-2), implies that $f_i(p)$ is homogeneous of
degree zero. By Euler's equation,

$\sum_{j=1}^{n} f_{ij} p_j = 0$, $i = 1, 2, \ldots, n$

where $f_{ij} \equiv \partial f_i/\partial p_j$. In view of (A-3), $f_{ij} > 0$, $i \neq j$, $i, j = 1, 2, \ldots, n$.

Lemma 3.E.1: Assume Walras' Law; then we have

$\| p(t) \| = \| p(0) \|$ for all $t \geq 0$, where $\| p(t) \|^2 \equiv \sum_{i=1}^{n} p_i^2(t)$

PROOF: Differentiate $\| p(t) \|^2$ with respect to t. Then

$\frac{d}{dt} \sum_{i=1}^{n} p_i(t)^2 = 2 \sum_{i=1}^{n} p_i(t) \dot{p}_i = 2 \sum_{i=1}^{n} p_i(t) f_i[p(t)] = 0$ (∵ Walras' Law)

Hence $\| p(t) \|$ = constant = $\| p(0) \|$. (Q.E.D.)

Lemma 3.E.2: The homogeneity and the gross substitutability assumptions, that is,
(A-2) and (A-3), imply that the equilibrium price vector is unique up to a positive
scalar multiple.
REMARK: This lemma states that any equilibrium price vector may be
expressed in the form $\alpha \hat{p}$ where $\alpha$ is some positive number. Geometrically,
this means that there is a unique "equilibrium ray" $\{\alpha \hat{p}: \alpha > 0\}$.
PROOF: Suppose not. In other words, let $\hat{p}$ and $p^*$ be two equilibrium price
vectors such that $p^* \neq \alpha \hat{p}$ for any positive $\alpha$. Let
$\hat{p}_k/p^*_k \equiv \min_i \{\hat{p}_1/p^*_1, \hat{p}_2/p^*_2, \ldots, \hat{p}_n/p^*_n\}$, and write $\mu \equiv \hat{p}_k/p^*_k$.
By definition, $\mu \leq \hat{p}_i/p^*_i$ for all i, or $\hat{p}_i \geq \mu p^*_i$ for all i. Since $p^* \neq \alpha \hat{p}$ for any
positive $\alpha$, $\hat{p}_i > \mu p^*_i$ for some $i \neq k$. Write $\tilde{p} \equiv \mu p^*$. Then $\hat{p}_i \geq \tilde{p}_i$ for all i with
strict inequality for some $i \neq k$. From the gross substitutability assumption, this implies
$x_k(\hat{p}) > x_k(\tilde{p})$, and from the homogeneity assumption $f(p^*) = 0$ implies
$f(\tilde{p}) = 0$. Thus we have $\bar{x}_k = x_k(\hat{p}) > x_k(\tilde{p}) = \bar{x}_k$. This is a contradiction.
(Q.E.D.)
REMARK: A slight error by Arrow, Block, and Hurwicz [1] in this
connection was pointed out by Hotaka [2].
REMARK: Note that Walras' Law is not used in the proof of Lemma 3.E.2.
REMARK: From Lemma 3.E.1, we know that $p(t)$ moves only along the
sphere with radius $\| p(0) \|$. Hence Lemma 3.E.2 implies that if $p(t) \to \hat{p}$ (an
equilibrium price vector), then $\hat{p}$ is unique, where $\hat{p}$ is confined to such a
sphere.

Lemma 3.E.3: Let $\hat{p}$ be an equilibrium price vector. Under the assumptions of
Walras' Law, homogeneity, and gross substitutability, we have

$\sum_{i=1}^{n} \hat{p}_i f_i(p) > 0$ for all $p > 0$ such that $p \neq \alpha \hat{p}$ for any $\alpha > 0$

PROOF: We illustrate the proof for the two-commodity case diagrammatically
in Figure 3.7 and refer to Arrow, Block, and Hurwicz [1] and its
revision by Hotaka [2] for the proof of the general case. This exposition
for the two-commodity case is due to Negishi [3].
The point $\bar{x}$ in Figure 3.7 represents the total stock of the two commodities,
that is, the vector $\bar{x} = (\bar{x}_1, \bar{x}_2)$. Let $\hat{p}$ be an equilibrium price
vector; then $x_1(\hat{p}) = \bar{x}_1$ and $x_2(\hat{p}) = \bar{x}_2$. Let p be a price vector such that
$p \neq \alpha \hat{p}$ for any positive number $\alpha$. By Lemma 3.E.2, p is not an equilibrium
price vector. As a result of Walras' Law, $\sum_{i=1}^{2} p_i x_i(p) = \sum_{i=1}^{2} p_i \bar{x}_i$. Hence a
point $[x_1(p), x_2(p)]$ is on the line AB which passes through the point $\bar{x}$ and
whose slope is given by p. Let CD be the line which passes through the point
$\bar{x}$ and whose slope is given by $\hat{p}$. We assume that $\hat{p}_1/\hat{p}_2 > p_1/p_2$ (under the
assumption $\hat{p}_1/\hat{p}_2 < p_1/p_2$, the lemma can be proved analogously). In other
words, we assume that the line CD is steeper than the line AB. This assumption
means that $\hat{p}_1/p_1 > \hat{p}_2/p_2$. Write $\mu \equiv \hat{p}_2/p_2$. Then $\hat{p}_1 > \mu p_1$ and $\hat{p}_2 = \mu p_2$.
Hence the gross substitutability assumption implies that $x_2(\hat{p}) > x_2(\mu p)$.
But Walras' Law implies that $\mu p_1 x_1(\hat{p}) + \mu p_2 x_2(\hat{p}) = \mu p_1 \bar{x}_1 + \mu p_2 \bar{x}_2 =
\mu p_1 x_1(\mu p) + \mu p_2 x_2(\mu p)$, so that we must have $x_1(\hat{p}) < x_1(\mu p)$. Using the
homogeneity assumption, we get $x_1(p) = x_1(\mu p) > x_1(\hat{p}) = \bar{x}_1$, and $x_2(p) =
x_2(\mu p) < x_2(\hat{p}) = \bar{x}_2$. Hence point $x(p) = [x_1(p), x_2(p)]$ must lie to the
right of the point $\bar{x}$ in Figure 3.7. Now draw a line parallel to CD passing
through the point $x(p)$. We see at once that $\hat{p} \cdot x(p) > \hat{p} \cdot \bar{x}$. Hence $\hat{p} \cdot f(p) > 0$
where $f(p) = [f_1(p), f_2(p)]$. (Q.E.D.)

Figure 3.7. An Illustration of the Proof of Lemma 3.E.3.
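
The inequality of Lemma 3.E.3 can be spot-checked numerically for n = 3. The sketch below is an illustration only and rests on assumptions not made in the text: the economy consists of a single hypothetical Cobb-Douglas consumer with shares ALPHA and endowment OMEGA, for which an equilibrium price vector is available in closed form as $\hat{p}_i = \alpha_i/\bar{x}_i$. It evaluates $\sum_i \hat{p}_i f_i(p)$ at randomly drawn positive price vectors.

```python
import random

# Hypothetical example economy (assumed for illustration, not from the text).
ALPHA = (0.5, 0.3, 0.2)   # expenditure shares, summing to one
OMEGA = (1.0, 2.0, 3.0)   # total stocks (x-bar)

# For this economy one equilibrium price vector is p_hat_i = alpha_i / omega_i.
P_HAT = tuple(a / w for a, w in zip(ALPHA, OMEGA))


def excess_demand(p):
    """f_i(p) = alpha_i * (p . omega) / p_i - omega_i."""
    income = sum(pj * wj for pj, wj in zip(p, OMEGA))
    return [a * income / pj - w for a, pj, w in zip(ALPHA, p, OMEGA)]


def weighted_excess_demand(p):
    """Sum of excess demands weighted by the equilibrium prices."""
    return sum(ph * fi for ph, fi in zip(P_HAT, excess_demand(p)))


rng = random.Random(0)  # fixed seed so that the draw is reproducible
samples = [tuple(rng.uniform(0.1, 5.0) for _ in ALPHA) for _ in range(1000)]
values = [weighted_excess_demand(p) for p in samples]

print("smallest weighted excess demand over the sample:", min(values))
print("value on the equilibrium ray:",
      weighted_excess_demand(tuple(7.0 * ph for ph in P_HAT)))
```

In this assumed economy the weighted sum is strictly positive off the equilibrium ray and vanishes (up to rounding) on it, in agreement with the lemma; a random sample is of course not a proof.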
REMARK: This lemma states that in any disequilibrium situation, the sum
of the excess demands weighted by the equilibrium prices is always positive.
We recall that Samuelson's weak axiom of revealed preference states that
$p \cdot \Delta x \leq 0$ implies $p' \cdot \Delta x < 0$, where $\Delta x \equiv x(p') - x(p)$. That this axiom
holds for an individual is a consequence of rational behavior.$^{5}$ However,
the statement that this axiom holds for the entire economy (that is, for the
market demand as a whole) is not a consequence of rational behavior but
is an additional assumption as we remarked in the Appendix to Section E,
Chapter 2. In any case, suppose that this axiom holds for the entire economy.
Walras' Law implies that $p \cdot x(p) = p \cdot \bar{x} = p \cdot x(\hat{p})$, where $\hat{p}$ is an equilibrium
price vector, so that we have $p \cdot \Delta x = 0$ where $\Delta x \equiv x(\hat{p}) - x(p)$.
Hence from the weak axiom of revealed preference for the entire economy,
we have $\hat{p} \cdot \Delta x < 0$. This means that $\hat{p} \cdot [x(\hat{p}) - x(p)] < 0$, or
$\hat{p} \cdot [\bar{x} - x(p)] < 0$, which is nothing but the statement of the lemma. Hence Lemma
3.E.3 is also implied from the weak axiom of revealed preference for the
entire economy.
REMARK: We may recall that if the weak axiom holds in the aggregate
(that is, the conclusion of Lemma 3.E.3), then the equilibrium is unique up to
a positive scalar multiple.$^{6}$ To prove this, suppose not. That is, suppose there
exists a $p^* \neq \alpha \hat{p}$ for any $\alpha$, yet $f_i(p^*) = 0$ for all i. But by the assumption,
we have $\sum_{i=1}^{n} \hat{p}_i f_i(p^*) > 0$, which is a contradiction. Note that gross substitutability
is not needed in this proof. See the appendix to Section E,
Chapter 2.
REMARK: The uniqueness is nice,$^{7}$ but it is a rather restrictive phenomenon.$^{8}$
Note that the uniqueness here is a consequence of such restrictive
assumptions as gross substitutability or the weak axiom in the aggregate.

Theorem 3.E.1 (Arrow-Block-Hurwicz): Let $\hat{p}$ be an equilibrium price vector.
Under the assumptions of Walras' Law, homogeneity, and gross substitutability, the
system described in equation (1) is globally stable.
PROOF: We consider the Euclidean distance between $p(t)$ and $\hat{p}$ and show
that this distance converges to zero as $t \to \infty$. Let $D(t) \equiv \| p(t) - \hat{p} \|^2 =
\sum_{i=1}^{n} [p_i(t) - \hat{p}_i]^2$. From Lemma 3.E.1, $\| p(t) \| = \| p(0) \|$. Normalize $\hat{p}$
such that $\| \hat{p} \| = \| p(0) \|$. Differentiate $D(t)$ with respect to t. That is,

$\frac{dD(t)}{dt} = \frac{d}{dt} \left[ \sum_{i=1}^{n} \{p_i(t) - \hat{p}_i\}^2 \right] = 2 \sum_{i=1}^{n} \{p_i(t) - \hat{p}_i\} \frac{dp_i}{dt}$
$= 2 \sum_{i=1}^{n} \{p_i(t) - \hat{p}_i\} f_i(p) = 2 \left\{ \sum_{i=1}^{n} p_i(t) f_i(p) - \sum_{i=1}^{n} \hat{p}_i f_i(p) \right\}$
$= -2 \sum_{i=1}^{n} \hat{p}_i f_i(p)$ (∵ Walras' Law).

Hence $dD(t)/dt < 0$, by Lemma 3.E.3, as long as $p \neq \alpha \hat{p}$ for any $\alpha > 0$. If
$p = \alpha \hat{p}$, then $dD(t)/dt = 0$ [since $f_i(\alpha \hat{p}) = 0$ for all i], and we are done;
so we assume that $p \neq \alpha \hat{p}$. $dD(t)/dt < 0$ implies that the convergence of
$p(t)$ to the equilibrium point, $\hat{p}$, is monotone. This monotone movement of
$p(t)$ toward $\hat{p}$ does not preclude the possibility of $p(t)$ never reaching $\hat{p}$. In
other words, $D(t)$ may be bounded away from 0. Suppose it is, that is, $p(t)$
is bounded away from $\hat{p}$. Let $P \equiv \{p: \| p \| = \| p(0) \|\}$. Then there exists an
open ball $B_\epsilon(\hat{p})$ about $\hat{p}$ with radius $\epsilon > 0$ such that $p(t) \in \tilde{P} \equiv P \setminus B_\epsilon(\hat{p})$, for
all t. Since $\tilde{P}$ is compact and $dD/dt$ is continuous in p, $dD/dt$ achieves its
maximum in $\tilde{P}$ (Weierstrass' theorem). Since $dD/dt < 0$ in $\tilde{P}$, this implies that
there exists a $\delta > 0$ such that $dD/dt \leq -\delta < 0$. Integrating both sides of
the above inequality from 0 to t, we obtain $D(t) - D(0) \leq -\delta t$, or $D(t) \leq
D(0) - \delta t$. Hence for t large enough, $D(t) < 0$. This contradicts the condition
that the norm is always nonnegative. (Q.E.D.)
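
The two ingredients of the proof, the constancy of $\| p(t) \|$ and the monotone decline of $D(t)$, can be observed in a small simulation. This is only a sketch under assumptions of our own: the excess demand function is that of a hypothetical Cobb-Douglas economy (so that all three assumptions of the theorem hold), the nonnormalized system $\dot{p}_i = f_i(p)$ is integrated with a classical fourth-order Runge-Kutta step, and the step size and horizon are arbitrary.

```python
import math

# Hypothetical example economy (assumed for illustration, not from the text).
ALPHA = (0.5, 0.3, 0.2)
OMEGA = (1.0, 1.0, 1.0)


def excess_demand(p):
    """f_i(p) = alpha_i * (p . omega) / p_i - omega_i."""
    income = sum(pj * wj for pj, wj in zip(p, OMEGA))
    return [a * income / pj - w for a, pj, w in zip(ALPHA, p, OMEGA)]


def rk4_step(p, dt):
    """One classical Runge-Kutta step for dp/dt = f(p)."""
    def shifted(base, slope, scale):
        return [b + scale * s for b, s in zip(base, slope)]
    k1 = excess_demand(p)
    k2 = excess_demand(shifted(p, k1, dt / 2))
    k3 = excess_demand(shifted(p, k2, dt / 2))
    k4 = excess_demand(shifted(p, k3, dt))
    return [pi + dt / 6 * (a + 2 * b + 2 * c + d)
            for pi, a, b, c, d in zip(p, k1, k2, k3, k4)]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


p = [1.0, 2.0, 2.0]                       # initial prices, with norm 3
radius = norm(p)
ray = [a / w for a, w in zip(ALPHA, OMEGA)]
p_hat = [radius * r / norm(ray) for r in ray]   # equilibrium on the same sphere

distances, norms = [], []
for _ in range(3001):
    distances.append(sum((pi - qi) ** 2 for pi, qi in zip(p, p_hat)))  # D(t)
    norms.append(norm(p))
    p = rk4_step(p, 0.01)

print("D(0) =", distances[0], " D(T) =", distances[-1])
print("largest deviation of the norm from its initial value:",
      max(abs(n - radius) for n in norms))
```

Because the equilibrium is normalized onto the sphere of radius $\| p(0) \|$, the recorded distances decline toward zero while the norm stays at its initial value up to integration error.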
REMARK: In view of the above proof and the remark on uniqueness
immediately following Lemma 3.E.3, the gross substitutability assumption
can be replaced by the assumption that the weak axiom holds in the aggregate.
REMARK: Arrow, Block, and Hurwicz also showed the proof in terms of
the maximum norm, $D_m \equiv \max_i \{(p_i(t) - \hat{p}_i)/\hat{p}_i\}$ (instead of the Euclidean
norm used above), under a weaker set of conditions. However, the proof is
more difficult, and monotonicity of convergence in the maximum norm
does not necessarily imply monotonicity of convergence in the Euclidean
norm.

FOOTNOTES

1. It is possible to produce simpler proofs than the ones given by Arrow, Block,
and Hurwicz [ 1] . However, our attempt here to sketch one of the proofs in [1]
will be useful in enhancing our understanding of the stability problem. The facts
that we pick up along the way in the present round-about manner of proof are
of some economic interest in themselves.

2. Arrow, Block, and Hurwicz [1] also proved the stability of the normalized system.
3. The gross substitutability assumption can be stated without using derivatives (that
is, without assuming differentiability) as follows: For any $j = 1, 2, \ldots, n$, we have
$p_i = p'_i$ for all $i \neq j$ and $p_j < p'_j$, implying that $f_i(p) < f_i(p')$ for all $i \neq j$, where $p =
(p_1, \ldots, p_n)$ and $p' = (p'_1, \ldots, p'_n)$. This is called gross substitutability in the finite
incremental form.
4. It can be easily shown that the gross substitutability and homogeneity assumptions
are inconsistent with each other if $p_i = 0$ for any i (Section F-c of this chapter
and Hotaka [2]). Therefore, under these two assumptions, the $f_i$'s have no meaning
if $p(t)$ is a boundary point of the nonnegative orthant of $R^n$. This obviously implies
that if $\hat{p}$ is an equilibrium price vector [that is, $f_i(\hat{p}) = 0$, $i = 1, 2, \ldots, n$], then
$\hat{p} > 0$. An error by Arrow, Block, and Hurwicz [1] in this connection was pointed
out and corrected by Hotaka ([2], pp. 305-306), who also showed that these two
assumptions and Walras' Law imply that if $p_i \to 0$ for some i (the other prices
being fixed), then $f_i(p) \to \infty$.
5. A brief recollection of the weak axiom may be useful. Interpret x and x' as the
consumption vectors of a particular individual. Let x and x', respectively, be chosen
by him when p and p' prevail. Assume the uniqueness of the choice. If x' is affordable
at p (that is, $p \cdot x' \leq p \cdot x$), then x is revealed to be preferred to x', for he could
have consumed x'. If this is the case, x' cannot be revealed to be preferred to x;
that is, it is impossible to have $p' \cdot x \leq p' \cdot x'$, with x' chosen under p'. In other words,
$p \cdot \Delta x \leq 0$ implies $p' \cdot \Delta x < 0$. See P. A. Samuelson, Foundations of Economic Analysis,
Cambridge, Mass., Harvard University Press, 1947, chapter 5.
6. We may recall that in the proof of the existence of a competitive equilibrium, Wald
proved the uniqueness of equilibrium by assuming that the weak axiom holds in the
aggregate.
7. When we have multiple equilibria, the property that $p(t, p^0)$ always converges to
a particular equilibrium point regardless of $p^0$ is rather restrictive. Thus "uniqueness"
is a nice property, especially when we are interested in global stability. However,
multiple equilibria are not necessarily destructive in stability analysis. Recall
Uzawa's concept of "quasi-stability" which we remarked upon in Section B. See
also Section H.
8. It may suffice to recall the possibility of multiple intersections of the offer curves
in the Mill-Marshall diagram in the theory of international trade.

REFERENCES

1. Arrow, K. J., Block, H. D., and Hurwicz, L., "On the Stability of the Competitive
Equilibrium, II," Econometrica, 27, January 1959.
2. Hotaka, R., "Some Basic Problems on Excess Demand Functions," Econometrica,
39, March 1971.
3. Negishi, T., "The Stability of a Competitive Economy: A Survey Article," Econometrica,
30, October 1962.
SOME REMARKS 331

Section F
SOME REMARKS

a. AN EXAMPLE OF GROSS SUBSTITUTABILITY

From the previous analysis, it is clear that the gross substitutability assumption
plays an important role in stability analysis. An interesting question, then, is:
What sort of utility function will give rise to a demand function which exhibits
the gross substitutability property? Arrow and Hurwicz [2] have presented such
an example.$^{1}$ Suppose that Mr. i's preference ordering can be represented by the
following real-valued (utility) function:$^{2}$

(1) $u_i(x_{i1}, x_{i2}, \ldots, x_{in}) = \sum_{j=1}^{n} \alpha_{ij} \log x_{ij}$

where $\sum_{j=1}^{n} \alpha_{ij} = 1$, $\alpha_{ij} > 0$ for all i, j, and $x_{ij}$ ($j = 1, 2, \ldots, n$) is the amount of
the jth commodity consumed by Mr. i. We will show that the above utility function
yields a demand function for Mr. i which exhibits the gross substitutability property.
For notational simplicity, we will omit the subscripts i. In other words,
we will represent Mr. i's preference ordering as

(2) $u(x_1, x_2, \ldots, x_n) = \sum_{j=1}^{n} \alpha_j \log x_j$, $\sum_{j=1}^{n} \alpha_j = 1$, and $\alpha_j > 0$ for all j

Now suppose that Mr. i maximizes his utility over his budget constraint and note
that he consumes a positive amount of every commodity (that is, an interior solution
is achieved for this constrained maximum problem).$^{3}$ Then the first-order
condition (see Chapter 1, Section F) can be written in the following form:

(3) $\frac{\partial u}{\partial x_j} = \lambda p_j$, $j = 1, 2, \ldots, n$

where $\lambda$ is the Lagrangian multiplier for this problem. Since the above utility
function is a concave function, this condition is also sufficient for the global
maximum of the solution. From (2), $\partial u/\partial x_j$ can immediately be obtained as

(4) $\frac{\partial u}{\partial x_j} = \frac{\alpha_j}{x_j}$

Therefore, from (3) and (4), we have

(5) $\frac{\alpha_j}{x_j} = \lambda p_j$, or $\alpha_j = \lambda p_j x_j$, $j = 1, 2, \ldots, n$

Note that this implies $\lambda > 0$. This, in turn, implies that all the income is spent.
In other words, if we denote Mr. i's income by M, then $M = \sum_{j=1}^{n} p_j x_j$. Now
we sum equation (5) over j and obtain

$1 = \sum_{j=1}^{n} \alpha_j = \lambda \sum_{j=1}^{n} p_j x_j = \lambda M$

or

(6) $\lambda = \frac{1}{M}$

Suppose that all his income is obtained by selling his resources in the markets.
Denote his initial holding of the jth resource by $\bar{x}_j$ (where we again omit the
subscript i for notational simplicity). Then

(7) $M = \sum_{j=1}^{n} p_j \bar{x}_j$

Using (5), (6), and (7), we then obtain

$x_j = \frac{\alpha_j}{\lambda p_j} = \alpha_j \frac{M}{p_j} = \alpha_j \frac{\sum_{k=1}^{n} p_k \bar{x}_k}{p_j}$

Therefore

(8) $\frac{\partial x_j}{\partial p_k} = \frac{\alpha_j \bar{x}_k}{p_j}$, $k \neq j$

Hence under the assumptions $p_j > 0$ and $\bar{x}_k > 0$ for all j and $k = 1, 2, \ldots, n$,
all commodities are gross substitutes for Mr. i. Thus if everybody in the economy
has the utility function specified by (2) (with the assumptions used above), the
market demand function also exhibits the "gross substitutability" property.$^{4}$
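To obtain the gross substitutability for the market demand function, we
note (resuming the subscript i for Mr. i, $i = 1, 2, \ldots, m$):

$\sum_{i=1}^{m} x_{ij} = \frac{1}{p_j} \sum_{i=1}^{m} \alpha_{ij} \sum_{k=1}^{n} p_k \bar{x}_{ik}$

Hence

$\frac{\partial}{\partial p_k} \sum_{i=1}^{m} x_{ij} = \frac{1}{p_j} \sum_{i=1}^{m} \alpha_{ij} \bar{x}_{ik}$, $k \neq j$

which is positive if $p_j > 0$ and $\sum_{i=1}^{m} \bar{x}_{ik} > 0$.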
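
The individual demand function and the cross-price derivative in (8) are easy to verify numerically. The sketch below uses hypothetical values for the shares $\alpha_j$ and the initial holdings $\bar{x}_j$ (they are assumptions chosen for the example), computes $x_j = \alpha_j M/p_j$ with $M = \sum_k p_k \bar{x}_k$, and compares a central-difference estimate of $\partial x_j/\partial p_k$ with the closed form $\alpha_j \bar{x}_k/p_j$.

```python
ALPHA = (0.5, 0.3, 0.2)      # hypothetical expenditure shares, summing to one
ENDOWMENT = (1.0, 2.0, 3.0)  # hypothetical initial holdings (x-bar)


def demand(p):
    """x_j = alpha_j * M / p_j with M = sum_k p_k * xbar_k."""
    income = sum(pk * wk for pk, wk in zip(p, ENDOWMENT))
    return [a * income / pj for a, pj in zip(ALPHA, p)]


def cross_derivative(p, j, k, h=1e-6):
    """Central-difference estimate of d x_j / d p_k."""
    up = list(p)
    up[k] += h
    dn = list(p)
    dn[k] -= h
    return (demand(up)[j] - demand(dn)[j]) / (2 * h)


p = (2.0, 1.0, 4.0)
x = demand(p)
print("demand:", x)

checks = []
for j in range(3):
    for k in range(3):
        if j != k:
            numeric = cross_derivative(p, j, k)
            formula = ALPHA[j] * ENDOWMENT[k] / p[j]   # equation (8)
            checks.append((numeric, formula))
            print(j + 1, k + 1, round(numeric, 6), round(formula, 6))
```

With positive prices and positive holdings every off-diagonal derivative is positive, and the first-order condition $\alpha_j/x_j = \lambda p_j$ holds with $\lambda = 1/M$.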


b. SCARF'S COUNTEREXAMPLE
In Section E, we established the global stability of a competitive equilibrium
under the assumptions of Walras' Law, homogeneity, and gross substitutability.
It is natural to ask how far we can relax these assumptions. In particular, we
would like to know the extent to which gross substitutability can be relaxed.
It was conjectured that gross substitutability could be replaced by a more plau-
sible assumption on the utility function, such as quasi-concavity (that is, convex
to the origin indifference curves). Scarf [ 18] has constructed examples that cast
doubt on all such conjectures. His examples are useful in understanding the
basic problem involved in the stability question. Here we explain one of them.
Consider a pure exchange economy consisting of three consumers and three
commodities. Let $x_{ij}$ be the consumption of commodity j ($j = 1, 2, 3$) by Mr. i
($i = 1, 2, 3$). Suppose that Mr. i's utility function, $u_i$, can be written in the following
form.
$u_1(x_{11}, x_{12}, x_{13}) = \min \{x_{11}, x_{12}\}$
$u_2(x_{21}, x_{22}, x_{23}) = \min \{x_{22}, x_{23}\}$
$u_3(x_{31}, x_{32}, x_{33}) = \min \{x_{31}, x_{33}\}$
In other words, each individual desires only two commodities and wants them
only in the fixed ratio (one to one). It can easily be seen (by analogy to the case
of fixed production coefficients) that such a utility function gives rise to an L-shaped
indifference curve (which is clearly convex to the origin!). Let $\bar{x}_{ij}$ be the initial
holding of commodity j by Mr. i. Let us suppose that
$\bar{x}_{ii} = 1$, and $\bar{x}_{ij} = 0$ for $i \neq j$;
that is, Mr. i ($i = 1, 2, 3$) only has one unit of commodity i and none
of the other commodities. Scarf's indifference curve and the budget line are
illustrated in Figure 3.8.
Note that, in view of the specifications of the utility functions, the income
consumption path of each individual in Figure 3.8 is the 45-degree line, so that
$x_{11} = x_{12}$, $x_{22} = x_{23}$, and $x_{31} = x_{33}$. Consider the change in the price indicated
by the arrow in the diagram (a decrease in the price of commodity 2). It is clearly
illustrated by the diagram that there is only an "income effect"; there is no
"substitution effect." Hence for such indifference curves, the entire price change
is absorbed into the income effect.
The excess demand for commodity 1 can be written as
(9) $x_1 - \bar{x}_1 = (x_{11} + x_{21} + x_{31}) - (\bar{x}_{11} + \bar{x}_{21} + \bar{x}_{31})$
$= (x_{11} + x_{31}) - \bar{x}_{11} = (x_{11} - \bar{x}_{11}) + x_{31}$
The budget equation for Mr. 1 can be written as $p_1 x_{11} + p_2 x_{12} = p_1 \bar{x}_{11}$. But by our
convention, $x_{11} = x_{12}$ and $\bar{x}_{11} = 1$. Hence $x_{11} = p_1/(p_1 + p_2)$ so that $x_{11} - \bar{x}_{11} =
-p_2/(p_1 + p_2)$. Similarly, $x_{31}$ can be obtained from Mr. 3's budget equation,
$p_1 x_{31} + p_3 x_{33} = p_3 \bar{x}_{33}$; that is, $x_{31} = p_3/(p_3 + p_1)$. Therefore, from (9), we obtain

(10-a) $x_1 - \bar{x}_1 = -\frac{p_2}{p_1 + p_2} + \frac{p_3}{p_3 + p_1}$

Similarly we obtain

(10-b) $x_2 - \bar{x}_2 = -\frac{p_3}{p_2 + p_3} + \frac{p_1}{p_1 + p_2}$

(10-c) $x_3 - \bar{x}_3 = -\frac{p_1}{p_3 + p_1} + \frac{p_2}{p_2 + p_3}$

From equations (10-a, -b, -c), it is clear that there exists a unique "equilibrium
price ray" $p_1 = p_2 = p_3$.

Figure 3.8. Scarf's Counterexample (the Case of Mr. 1).
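We can write the dynamic adjustment equation as
$$\dot{p}_i(t) = x_i(t) - \bar{x}_i, \quad i = 1, 2, 3 \tag{11}$$
Now we want to show that $\|p(t)\|$ = constant for all t. To show this, we
differentiate $\|p(t)\|^2$ with respect to t. In other words
$$\frac{d}{dt}\left[\sum_{i=1}^{3} p_i^2(t)\right] = 2\sum_{i=1}^{3} p_i(t)\,\dot{p}_i(t) = 2\sum_{i=1}^{3} p_i\,(x_i - \bar{x}_i) = 0$$
where the last equality follows from Walras' Law.
Hence we conclude that $\|p(t)\|^2 = \sum_{i=1}^{3} p_i^2(t)$ = constant.⁵
Next we want to show that $\prod_{i=1}^{3} p_i(t)$ = constant. To do so, differentiate this
as follows:
$$\frac{d}{dt}\left[\prod_{i=1}^{3} p_i(t)\right] = \dot{p}_1 p_2 p_3 + \dot{p}_2 p_3 p_1 + \dot{p}_3 p_1 p_2 = (x_1 - \bar{x}_1)\,p_2 p_3 + (x_2 - \bar{x}_2)\,p_3 p_1 + (x_3 - \bar{x}_3)\,p_1 p_2 = 0$$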

The last equality is obtained by using equations (10-a, -b, -c).

Now we can show that the dynamic process (11) is not globally stable.
First choose the initial prices $p_i(0)$, i = 1, 2, 3, such that $\sum_{i=1}^{3} p_i^2(0) = 3$ and
$\prod_{i=1}^{3} p_i(0) \neq 1$. Then $\sum_{i=1}^{3} p_i^2(t) = 3$ and $\prod_{i=1}^{3} p_i(t) \neq 1$ for all t. Since
$\sum_{i=1}^{3} p_i^2(t) = 3$, the only possible equilibrium prices are $p_1 = p_2 = p_3 = 1$, at which the
product of the prices equals 1. Hence the solution
of the above system of differential equations, (11), denoted by $p_i(t; p_i(0), 0)$, cannot
converge to the equilibrium price $\hat{p}$ where $\hat{p}_1 = \hat{p}_2 = \hat{p}_3 = 1$.
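
The argument above can be checked numerically. The following is only a minimal sketch, assuming a classical Runge-Kutta integrator and a hypothetical initial price vector with $\sum p_i^2(0) = 3$ and $\prod p_i(0) \neq 1$; the names `scarf_excess_demand` and `rk4_path`, the step size, and the starting point are illustrative choices, not part of Scarf's paper.

```python
import numpy as np

def scarf_excess_demand(p):
    """Excess demands (10-a), (10-b), (10-c) for positive prices p."""
    p1, p2, p3 = p
    z1 = -p2 / (p1 + p2) + p3 / (p3 + p1)
    z2 = -p3 / (p2 + p3) + p1 / (p1 + p2)
    z3 = -p1 / (p3 + p1) + p2 / (p2 + p3)
    return np.array([z1, z2, z3])

def rk4_path(p0, dt=0.005, steps=4000):
    """Integrate dp/dt = z(p), equation (11), with fourth-order Runge-Kutta."""
    p = np.array(p0, dtype=float)
    path = [p.copy()]
    for _ in range(steps):
        k1 = scarf_excess_demand(p)
        k2 = scarf_excess_demand(p + 0.5 * dt * k1)
        k3 = scarf_excess_demand(p + 0.5 * dt * k2)
        k4 = scarf_excess_demand(p + dt * k3)
        p = p + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        path.append(p.copy())
    return np.array(path)

# Hypothetical initial prices: sum of squares equals 3, product differs from 1.
p0 = np.array([1.2, 1.0, 0.0])
p0[2] = np.sqrt(3.0 - p0[0] ** 2 - p0[1] ** 2)

path = rk4_path(p0)
sum_sq = (path ** 2).sum(axis=1)            # should stay at 3 along the path
prod = path.prod(axis=1)                    # should stay at its initial value
dist = np.linalg.norm(path - 1.0, axis=1)   # distance from p = (1, 1, 1)

print("max drift of sum of squares:", np.abs(sum_sq - 3.0).max())
print("max drift of product:", np.abs(prod - prod[0]).max())
print("closest approach to (1, 1, 1):", dist.min())
```

Up to integration error, both quantities stay constant while the distance from (1, 1, 1) stays bounded away from zero, which is the content of the instability argument.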
We may observe the following facts in the above example.

(i) There is no substitution effect.

(ii) The indifference curve is not strictly convex to the origin.
(iii) The indifference curve has a kink and hence is not differentiable.

These conditions are somewhat peculiar when compared with the ordinary
Hicks-Slutsky model of consumer's behavior. Scarf [ 18] and Gale [5] also con-
sidered cases of instability in which the substitution effect is present (however,
this effect is "smaller" than the income effect, that is, Giffen's case). It is certainly
difficult to say precisely under what conditions the instability arises. However,
Scarf's examples indicate that instability may occur in a wide variety of cases.

c. CONSISTENCY OF VARIOUS ASSUMPTIONS⁶

We have observed that a certain set of assumptions is necessary to prove
the stability of a competitive equilibrium. A natural question now is whether the
assumptions, by which we can guarantee the stability, are consistent with each
other. It would also be nice to know some of the implications of these assump-
tions, other than the stability of a competitive equilibrium, given that they are
consistent with each other. Following are some important observations on the
gross substitutability and homogeneity assumptions.
These two assumptions can be inconsistent unless $p_i > 0$ for all i. For
example, if we suppose $p_i \geq 0$ for all i with equality for some i, then we can get
a contradiction. To see this, note that zero degree homogeneity (by definition)
means $x_i(p) = x_i(\alpha p)$, i = 1, 2, ..., n, for any positive real number $\alpha$, where $x_i$
is the demand for the ith commodity. Now suppose that $p_{i_0} = 0$ for some $i_0$. Then,
in moving from p to $\alpha p$ with $\alpha > 1$, the price of commodity $i_0$ stays at zero while
every positive price rises, so that
the gross substitutability assumption implies $x_{i_0}(\alpha p) > x_{i_0}(p)$ for $\alpha > 1$. This
contradicts the homogeneity condition.
However, if $p_i > 0$ for all i, then the three assumptions (gross substitutability,
homogeneity, and Walras' Law) are consistent. Consider the following
example of an excess demand function:⁷
$$f_i(p) = \frac{\sum_k a_{ki}\,p_k}{p_i}$$
where the $a_{ki}$'s are arbitrary constants such that $a_{ki} > 0$ for $k \neq i$ and $\sum_i a_{ki} = 0$.
Since $\partial f_i/\partial p_k = a_{ki}/p_i > 0$ for $k \neq i$, the gross substitutability assumption is
satisfied. Second, note that
$$\sum_i p_i f_i = \sum_i p_i\,\frac{\sum_k a_{ki}\,p_k}{p_i} = \sum_{i,k} a_{ki}\,p_k = \sum_k p_k \sum_i a_{ki} = 0$$
Hence Walras' Law is satisfied. Finally, note that
$f_i(\alpha p) = (\sum_k a_{ki}\,\alpha p_k)/(\alpha p_i) = (\sum_k a_{ki}\,p_k)/p_i = f_i(p)$.
Hence the homogeneity condition is satisfied.
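
The three properties of this example can also be verified numerically. The sketch below assumes a randomly drawn coefficient matrix with $a_{ki} > 0$ for $k \neq i$ and $\sum_i a_{ki} = 0$; the dimension, the random seed, and the helper name `gross_substitutes` are hypothetical and serve only as an illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Hypothetical coefficients: a[k, i] > 0 for k != i, and every row sums to zero
# (sum over i of a_ki = 0), so the diagonal entries are negative.
a = rng.uniform(0.5, 1.5, size=(n, n))
np.fill_diagonal(a, 0.0)
np.fill_diagonal(a, -a.sum(axis=1))

def f(p):
    """f_i(p) = (sum_k a_ki p_k) / p_i for strictly positive prices p."""
    return (a.T @ p) / p

def gross_substitutes(p, eps=1e-6):
    """Check that raising p_k raises f_i for every i != k (finite differences)."""
    base = f(p)
    for k in range(n):
        bumped = p.copy()
        bumped[k] += eps
        others = np.delete(f(bumped) - base, k)
        if not np.all(others > 0):
            return False
    return True

p = rng.uniform(0.5, 2.0, size=n)
walras = p @ f(p)                              # Walras' Law: should vanish
homogeneous = np.allclose(f(3.0 * p), f(p))    # zero degree homogeneity

print("Walras' Law residual:", walras)
print("homogeneous of degree zero:", homogeneous)
print("gross substitutes:", gross_substitutes(p))
```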

d. NONNEGATIVE PRICES
The problem of the stability of a competitive equilibrium, as we have seen,
is concerned with the following system of differential equations:
$$\dot{p}_i(t) = f_i[p_1(t), p_2(t), \ldots, p_n(t)], \quad i = 1, 2, \ldots, n$$
If the $f_i$'s are defined on an open connected set X, the Cauchy-Peano theorem
guarantees the existence of a solution in a neighborhood of t = 0. But the Cauchy-
Peano theorem does not guarantee the existence of a solution for the entire
region $[0, \infty)$, with which stability analysis is concerned. A natural question is:
Can we guarantee the existence of a solution for the entire region of t, $[0, \infty)$, by
guaranteeing the existence of solutions in the local regions $[0, \varepsilon_1]$, $[\varepsilon_1, \varepsilon_2]$,
$[\varepsilon_2, \varepsilon_3]$, ..., and so forth? Suppose we can make these "continuations" by some
suitable methods.⁸ We may find that we cannot go further than some region
$[\varepsilon_j, \varepsilon_{j+1}]$, for when t is in the region $[\varepsilon_j, \varepsilon_{j+1}]$, the solution vector $p[t; p(0)]$
may lie outside the region X on which the $f_i$'s are defined. Then the above dif-
ferential equation system would not have solutions for the entire region $[0, \infty)$.
In stability analysis, X is often taken as the positive or nonnegative orthant of $R^n$,
or else $X = R^n$ with the explicit constraint $p \geq 0$. Hence the question here is one
of negative prices. In other words, when t reaches the region $[\varepsilon_j, \varepsilon_{j+1}]$, the solu-
tion $p_i[t; p(0)]$ may become negative for some i. Then p is outside the region
X or violates the restriction $p \geq 0$ so that it becomes meaningless to discuss the
question of whether or not $p[t; p(0)]$ converges to an equilibrium price vector
as t goes to $\infty$.

This nonnegativity condition for the price vector is often neglected in the
literature. Explicit consideration is given to this problem in two masterpieces,
[1] and [8]. Here we illustrate the problem by using some other studies.

One way to avoid the above difficulty is to modify the above system of
differential equations such that, for each i,

$$\frac{dp_i(t)}{dt} = \begin{cases} 0, & \text{if } p_i(t) = 0 \text{ and } f_i[p_1(t), \ldots, p_n(t)] < 0 \\ f_i[p_1(t), \ldots, p_n(t)], & \text{for all other cases} \end{cases}$$

This approach is adopted by Arrow, Hurwicz, and Uzawa [3], Morishima [9],
and Kose [7]. The advantage of this method is that we do not have to modify
the stability analysis of the previous sections very much, if we assume the existence
of a solution for the above modified system. The question now is whether we can
guarantee the existence of a solution for this modified system. The ordinary
Cauchy-Peano existence theorem is not applicable here because the right-hand
side of this system may not be a continuous function of p. To see this, consider a
sequence $\{p^q\}$ in X such that $p^q \to \hat{p}$ with $\hat{p}_i = 0$. Assume $f_i(\hat{p}) < 0$. Then, along
any such sequence with $p_i^q > 0$, the RHS of the above differential equation equals
$f_i(p^q)$, which converges to $f_i(\hat{p}) < 0$ as $p^q \to \hat{p}$, whereas the RHS evaluated at $\hat{p}$ itself is 0.
Hence we indeed have a discontinuity. Unless a new existence theorem is proved
for the above modified system, we are not being quite honest if we proceed with the
stability analysis.
Nikaido and Uzawa [ 15] proposed the following alternative system:

$$\frac{dp_i(t)}{dt} = \max\{f_i[p_1(t), \ldots, p_n(t)],\ -p_i(t)\}, \quad i = 1, 2, \ldots, n \tag{NU}$$
The right-hand side is a continuous function of p if $f_i$ is continuous; hence the
problem of discontinuity disappears. Using the continuity and the homogeneity
of $f_i$, Nikaido and Uzawa proved a global existence theorem for the above
system (by first showing that the continuity and the homogeneity imply the
boundedness of the above system). The (NU) system states that when $p_i$ happens to
hit the "wall" of $p_i = 0$, then the RHS of (NU) is either 0 or positive. Hence
the nonnegativity condition will be satisfied for the entire process. Moreover, the
"switching" from $dp_i/dt = f_i$ to $dp_i/dt = -p_i$ is carried out continuously. In
other words, when the sequence of prices, $p^q$, converges to $\hat{p}$ where one of the
components (say, $\hat{p}_i$) of $\hat{p}$ is zero, $f_i(p^q)$ may become negative (it can be positive,
but then there will be no such "switching"). But $dp_i/dt$ would already have
become $-p_i = 0$. When $f_i(p^q) < -p_i$, the dynamic process is switched from
$dp_i/dt = f_i$ to $dp_i/dt = -p_i$ continuously.
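
The difference between the unmodified system and (NU) can be illustrated with a small numerical sketch. The excess demand function `f` below is purely hypothetical (it is neither homogeneous nor derived from utility maximization) and is chosen only so that the first price is pushed toward the wall $p_1 = 0$; the explicit Euler scheme and the step size are likewise assumptions. With a step $h \leq 1$ the (NU) update satisfies $p_i + h\max\{f_i, -p_i\} \geq (1 - h)\,p_i \geq 0$, so nonnegativity is preserved step by step.

```python
import numpy as np

def f(p):
    """Hypothetical excess demand: commodity 1 is always in excess supply."""
    return np.array([-1.0, 1.0 - p[1]])

def nu_rhs(p):
    """Right-hand side of (NU): max{f_i(p), -p_i}, component by component."""
    return np.maximum(f(p), -p)

def euler(rhs, p0, h=0.1, steps=100):
    """Explicit Euler path for dp/dt = rhs(p)."""
    p = np.array(p0, dtype=float)
    path = [p.copy()]
    for _ in range(steps):
        p = p + h * rhs(p)
        path.append(p.copy())
    return np.array(path)

p0 = [0.5, 2.0]
naive = euler(f, p0)      # unmodified system dp/dt = f(p)
nu = euler(nu_rhs, p0)    # Nikaido-Uzawa system (NU)

print("lowest price along the unmodified path:", naive.min())
print("lowest price along the (NU) path:", nu.min())
```

In this sketch the unmodified path drives the first price below zero, while the (NU) path approaches the wall without crossing it.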
Another dynamic process for the adjustment of a competitive market was
proposed by Nikaido [12]. The essential idea of this procedure was to utilize
the Brown-von Neumann differential equation which had been developed in the
study of the two-person zero-sum games. The Brown-von Neumann equation for
the present case can be written as
$$\frac{dp_i(t)}{dt} = F_i[p_1(t), p_2(t), \ldots, p_n(t)] - G[p_1(t), \ldots, p_n(t)]\,p_i(t) \quad (i = 1, 2, \ldots, n)$$
$$F_i[p_1(t), p_2(t), \ldots, p_n(t)] = \max\{f_i[p_1(t), \ldots, p_n(t)],\ 0\} \quad (i = 1, 2, \ldots, n)$$
$$G[p_1(t), p_2(t), \ldots, p_n(t)] = \sum_{i=1}^{n} F_i[p_1(t), \ldots, p_n(t)]$$
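
A short sketch may help to see how this equation keeps prices admissible. Summing the equations over i gives $d(\sum_i p_i)/dt = G\,(1 - \sum_i p_i)$, so a path that starts with $\sum_i p_i = 1$ keeps that normalization, and at $p_i = 0$ the right-hand side reduces to $F_i \geq 0$. The code below assumes, purely for illustration, the excess demand functions (10-a, -b, -c) of Scarf's example, an explicit Euler scheme, and a hypothetical starting point.

```python
import numpy as np

def scarf_excess_demand(p):
    """Excess demands (10-a), (10-b), (10-c), used here only as a test case."""
    p1, p2, p3 = p
    return np.array([
        -p2 / (p1 + p2) + p3 / (p3 + p1),
        -p3 / (p2 + p3) + p1 / (p1 + p2),
        -p1 / (p3 + p1) + p2 / (p2 + p3),
    ])

def bvn_rhs(p, f):
    """Brown-von Neumann right-hand side: F_i(p) - G(p) p_i."""
    F = np.maximum(f(p), 0.0)   # F_i = max{f_i, 0}
    G = F.sum()                 # G = sum of the F_i
    return F - G * p

p = np.array([0.5, 0.3, 0.2])   # hypothetical start with p_1 + p_2 + p_3 = 1
h = 0.01                        # Euler step (assumption)
sums = []
for _ in range(2000):
    p = p + h * bvn_rhs(p, scarf_excess_demand)
    sums.append(p.sum())

print("largest deviation of the price sum from 1:", np.abs(np.array(sums) - 1.0).max())
print("smallest price at the end of the run:", p.min())
```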

In ending, we may note one criticism of the above three devices for avoiding
the problem of negative prices. The process of "switching," whether continuous
or discontinuous, is quite mechanical or artificial and little economic meaning can
be given to it. If we understand the dynamic adjustment equation as simply "a rule
of the game," as mentioned before, we may not have to question this. The switch-
ing rule is then a part of the rule. Since the dynamic adjustment equation involves
an extremely difficult problem (which we discuss in the next section), we do not
discuss this issue any further.

FOOTNOTES

1. We should not, however, emphasize gross substitutability too much. Hicks [6]
considered "strongly asymmetrical income effects and extreme complementarity"
as causes of instability. Unfortunately, we do not have any important stability
theorem that applies when the gross substitutability assumption is relaxed, although
we do have some results for instability when this assumption is relaxed (cf. Scarf
[ 18] and Gale [ 5] ). See subsection b of this section. Recent studies by Morishima
[ 10] and Ohyama [ 16] seem to offer interesting attempts when gross substitutability
is absent. Unfortunately, [ 10] seems to contain a serious error.
2. Note that this is a logarithmic transformation (which is a monotone transformation)
of a Cobb-Douglas type utility function $u_i = \prod_{j=1}^{n} x_{ij}^{\alpha_{ij}}$, $\sum_j \alpha_{ij} = 1$. Recall that a
preference ordering is invariant under a monotone (increasing) transformation of the
utility function.
3. This is due to our specification of the utility function, (2). For if consumption of
one of the commodities becomes zero, then the consumer's utility becomes $-\infty$. As
long as he has a positive income, this (zero utility) is certainly not optimal for him.
4. Recently Eisenberg [4] has shown that if each individual's utility function is of
the Cobb-Douglas type (or more generally homogeneous), then the individual utility
functions are "aggregated" to form a social welfare function, which is of the Cobb-
Douglas (or homogeneous) type. Here the "social welfare function" is not used to
describe the welfare level of the society, but rather it is used to describe the behavior
of the society.
5. This result was proved in Lemma 3.E.1. The proof is recorded here to keep our
exposition sufficiently self-contained.
6. For the expositions of this and the following subsections, I am indebted to Nikaido
[13].
7. As we observed in subsection a, such an excess demand function can be obtained
from the Cobb-Douglas type of utility function.
8. For "continuation" of the solution, recall our remark in Section B. For an explicit
proof of the possibility of continuation under the present framework, see Nikaido
([ 14] , pp. 338-339).

3. Arrow, K. J., Hurwicz, L., and Uzawa, H., eds., Studies in Linear and Nonlinear
Programming, Stanford, Calif., Stanford University Press, 1958.
4. Eisenberg, E., "Aggregation of Utility Functions," Management Science, 7, July 1961.
5. Gale, D., "A Note on Global Instability of Competitive Equilibrium," Naval
Research Logistics Quarterly, 10, March 1963.
6. Hicks, J. R., Value and Capital, 2nd ed., Oxford, Clarendon Press, 1946.
7. Kose, T., "Solutions of Saddle-Value Problems by Differential Equation," Econo-
metrica, 24, January 1956.
8. McKenzie, L. W., "Stability of Equilibrium and the Value of Positive Excess
Demand," Econometrica, 28, July 1960.
9. Morishima, M., "A Reconsideration of the Walras-Cassel-Leontief Model of
General Equilibrium," in Mathematical Methods in the Social Sciences, ed. by Arrow
et al., Stanford, Calif., Stanford University Press, 1960.
10. Morishima, M., "A Generalization of the Gross Substitute System," Review of Economic
Studies, XXXVII, April 1970.
11. Negishi, T., "The Stability of a Competitive Economy: A Survey Article," Econo-
metrica, 30, October 1962.
12. Nikaido, H., "Stability of Equilibrium by the Brown-von Neumann Differential
Equation," Econometrica, 27, October 1959.
13. Nikaido, H., "The Tâtonnement Process and the Nonnegativity Condition," in New
Economic Analysis, ed. by M. Morishima et al., Tokyo, Sobunsha, 1960 (in Japanese).
14. Nikaido, H., Convex Structures and Economic Theory, New York, Academic Press, 1968.
15. Nikaido, H. and Uzawa, H., "Stability and Nonnegativity in a Walrasian Tâtonnement
Process," International Economic Review, 1, January 1960.
16. Ohyama, M., "On the Stability of Generalized Metzlerian Systems," Review of
Economic Studies, XXXIX, April 1972.
17. Quirk, J., and Saposnik, R., Introduction to General Equilibrium Theory and
Welfare Economics, New York, McGraw-Hill, 1968.
18. Scarf, H., "Some Examples of Global Instability of the Competitive Equilibrium,"
International Economic Review, 1, September 1960.

Section G
THE TATONNEMENT AND THE
NON-TATONNEMENT PROCESSES

The following system of differential equations has been an essential part of
the analysis of the dynamic adjustment process:
$$\frac{dp_i(t)}{dt} = f_i[p_1(t), p_2(t), \ldots, p_n(t)], \quad i = 1, 2, \ldots, n$$

We pointed out in Section C (and in Section A) that this system reflects a funda-
mental assumption of stability analysis-that a positive excess demand for com-
modity i raises the price of i and a negative excess demand (that is, an excess
supply) for commodity i lowers the price of i. The above system of differential
equations is a straightforward mathematical formulation of this assumption.
This assumption, although it seems quite plausible, is beset with two serious
difficulties: (1) its behavioral background, and (2) the unrealistic nature of the
"tatonnement process," of which the above system of differential equations is a
mathematical formulation.

a. THE BEHAVIORAL BACKGROUND AND THE TATONNEMENT PROCESS
The first difficulty is that it is not clear whose behavior is described by the
above system of differential equations. If it describes the behavior of each
"market," it is not at all clear what type of economic agent is behind each market
and what type of behavior leads to the above adjustment process. Walras [ 15]
gave one ingenious answer. He assumed that all the traders gather in one place and
that there exists a "market manager." The market manager quotes a price for the
commodity (say, i). Then each trader writes the amount of that commodity that
he wishes to buy or sell on a piece of paper (called a ticket). If there is an excess
demand for i, the market manager raises the price of i, and if there is an excess
supply of i, he lowers the price of i. Each time he quotes a new price, the "tickets"
are again collected. This process continues until the excess demand becomes zero
(that is, until an equilibrium price is called). Until then, no actual transaction takes
place. This process is called the tatonnement process. Two varieties of the tatonne-
ment process are discussed in the literature. The first assumes that this adjustment
process is carried out simultaneously for all markets. The second assumes that this
process is carried out in one market after another. In other words, first the adjustment
process is carried out in the first market; after an equilibrium price is called
in the first market, the adjustment process takes place in the second market, and so
on. In each process, only one price is adjusted. For example, when the adjustment
process is carried out in the ith market, only the price of commodity i is adjusted
so as to bring the ith market into equilibrium. This process continues until the
last market is in equilibrium; but by that time, the markets considered earlier are,
generally, in disequilibrium again. Thus another "round" of adjustment is carried
out from the first to the nth market. This cycle continues until all the markets are in
equilibrium. The two varieties of tatonnement process may be called the simul-
taneous tatonnement process and the successive tatonnement process, respectively.
The dynamic process, $\dot{p} = f[p(t)]$, is a simultaneous process and the original
tâtonnement process considered by Walras is a successive process. In any case,
the crux of the tâtonnement process lies in the exchange of tickets with no actual
trade being carried out until all the markets are in equilibrium,¹ and the crux of
the stability analysis is to see whether such a tâtonnement process can bring all the
markets into equilibrium. If so, we say that the tâtonnement process is a stable
process. This is the problem that we have considered so far in this chapter.
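
The simultaneous process described above can be sketched in discrete time as $p(t+1) = p(t) + k\,z[p(t)]$, where z is the sum of the quantities written on the tickets. The example below is a minimal sketch only: the two-trader, two-commodity Cobb-Douglas economy, the expenditure shares, the initial holdings, the adjustment speed k, and the function names are all hypothetical. For these numbers market clearing requires $p_2/p_1 = 4/3$.

```python
import numpy as np

alpha = np.array([[0.6, 0.4],
                  [0.3, 0.7]])   # hypothetical expenditure shares alpha[i, j]
endow = np.array([[1.0, 0.0],
                  [0.0, 1.0]])   # hypothetical initial holdings of trader i

def tickets(p):
    """Net purchase orders each trader writes down at the quoted prices p."""
    income = endow @ p                        # value of each trader's holdings
    demand = alpha * income[:, None] / p      # Cobb-Douglas demands x_ij
    return demand - endow

def tatonnement(p0, k=0.1, rounds=2000):
    """Simultaneous adjustment in all markets; no trade occurs along the way."""
    p = np.array(p0, dtype=float)
    for _ in range(rounds):
        z = tickets(p).sum(axis=0)   # the manager only adds up the tickets
        p = p + k * z                # raise p_j under excess demand, lower it otherwise
    return p

p_final = tatonnement([1.0, 1.0])
z_final = tickets(p_final).sum(axis=0)

print("final price ratio p2/p1:", p_final[1] / p_final[0])
print("remaining excess demands:", z_final)
```

Note that the manager never evaluates the excess demand functions directly; the holdings in `endow` are left untouched throughout, which is what distinguishes this process from one with intermediate trading.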
Under the tatonnement process, it is clear whose behavioral rule is described
by $\dot{p}(t) = f[p(t)]$. This is the behavioral rule of the "market manager."² However,
the question has not been completely answered yet, because we do not know why
the market manager has to obey this behavioral rule. No straightforward explana-
tion such as the profit maximization of producers or the utility maximization of
consumers is given for this behavioral rule. Thus the stability analysis may be
considered as an analysis which shows whether the tatonnement process converges
to an equilibrium when the market manager is instructed to behave according to
$\dot{p} = f[p(t)]$. Since it is not clear who should establish this rule or why the market
manager should behave according to this rule, we may consider this to be the "rule
of the game." Thus we get only a partial answer to the question of the stability of
the behavior which is described by this dynamic equation.
Some readers may not like the interpretation of the dynamic process $\dot{p} =
f[p(t)]$ in terms of the tâtonnement process, for it appears quite unrealistic to think
of all traders being assembled in one place at one time to carry out the tatonnement
process as described. Thus we may come back to the original question: Whose
behavior is described by this dynamic adjustment process? We end this inquiry
into the behavioral background of the dynamic adjustment equation with the
following acute observation by Koopmans ([5], p. 179):³
If, for instance, the net rate of increase in price is assumed to be proportional
to the excess of demand over supply, whose behavior is thereby expressed?
And is the alternative hypothesis, that the rate of increase in supply is pro-
portional to the excess of demand price over supply price any more plausible,
or any better traceable to behavior motivations?

b. THE TATONNEMENT AND THE NON-TATONNEMENT PROCESSES
In the above description, we pointed out that no transactions are carried
out in the tatonnement process until all the markets reach equilibrium. The follow-
ing passage from Takayama ([13], p. 142) summarizes the problems involved
when we allow intermediate purchases and actual transactions in the adjustment
process:
if we admit actual purchases in the process towards equilibrium, the excess
demand function is necessarily affected. This is because of the difference
between the trader's purchasing power evaluated at the current price and the
current holding of goods (before trade is carried out), and that evaluated at
the changed price and the changed holding of goods (after trade has taken
place), when Walras's law is effective and we do not assume "recontracting"
in the tatonnement. Then the excess demand function is clearly changed and
the eventual equilibrium will depend on the time path of "tatonnement".
In terms of our equations, we defined $\hat{p}$, an equilibrium price vector, to be the
one such that $f(\hat{p}) = 0$, and we then described the dynamic process by $dp(t)/dt =
f[p(t)]$. In other words, we used the same function f to denote the equilibrium
relation and the dynamic process. If we allow intermediate purchases and actual
transactions in the process, then this excess demand function f will change from
time to time as the traders' income or purchasing power varies.⁴ Hence the price
vector which prevails when the market is finally cleared depends on the time path
of the process and will, therefore, not generally be the same for any two processes.
Thus the process does not describe at all how the economy actually reaches an
equilibrium price vector $\hat{p}$, the very problem with which Walras was concerned.⁵
Here we may note that the stability analysis of the tâtonnement process,
however unrealistic it may look, is one of fundamental importance in economics.
Some of the reasons for this are as follows:

(i) It is a genuine model which describes how the economy can reach an equilibrium.
As long as we describe our economy in terms of equilibrium relations such as
f (p) = 0, it is important to see how we can actually reach an equilibrium. More-
over, our economy may be constantly in disequilibrium as a result of changes in
consumers' tastes, production technology, and the availability of resources in the
economy. That is, the equilibrium relation may be constantly changing. Hence
when the equilibrium relation $f(p) = 0$ moves to a new relation $f^*(p) = 0$, the
price vector $\hat{p}$ with $f(\hat{p}) = 0$ does not necessarily sustain an equilibrium
under the new relation. Hence if the economic model described by equilibrium
relations is to be meaningful, it must contain a model of the adjustment mechan-
ism, by which the equilibrium if disturbed could be restored.⁶ [In the above
example, if $f^*(p^*) = 0$ for a unique $p^*$, the mechanism which brings $\hat{p}$ to $p^*$ should
be established.] The tâtonnement process, if it is a stable process, offers such
a model. We may also note that the dynamic stability analysis which has been
described in this chapter can be relevant for a model which is more realistic and
does not necessarily involve the tâtonnement process. The author once offered
such a model [13]. Even if we grant that the tâtonnement process is unrealistic,
this does not negate the importance of the dynamic stability analysis described
in this chapter. Moreover, there exist adjustment processes in the real economy
which resemble the tâtonnement process, for example, the stock market, the fish
market, the corn market, and so on.
(ii) In Chapter 2, Section C, we showed that a competitive equilibrium is a Pareto
optimal state. Moreover, a competitive equilibrium has a unique feature: a de-
centralized decision-making process. Even apart from the problem of individual
incentives and the like, the decentralized process seems to have a clear advantage
over a centralized decision-making process. It does not involve the almost
impossible task (and accompanying costs) of collecting all the relevant data on
each consumer's tastes, each firm's production set, the resource availability,
and so on, so that the "center" may treat these data in such a way as to obtain a
decision. Hence the model of a competitive equilibrium, whether it is realistic
or not, offers an excellent prototype for the optimal organization of a society
and can be used as a realistic means of achieving a social optimum (even by a
socialist state). Thus when the model of a competitive equilibrium is viewed
as a realistic device for achieving a Pareto optimum, we certainly should know
how we can actually reach this "equilibrium" state. The tâtonnement process,
if it is stable, offers exactly such a process.⁷ See Arrow and Hurwicz [2], for
example.⁸

Now let us return to the unrealistic elements of the tatonnement process. How
can all the traders in the economy gather in one place and exchange tickets? How
can actual trade (and production) be prohibited until an equilibrium price has been
achieved? Despite the fact that we can offer an example from a real economy
which is based on the stability analysis described in this chapter but does not
contain the above difficulties, it is certainly very important to consider models
which explicitly avoid the unrealistic elements mentioned above. Such models
have been developed recently and are known as non-tatonnement processes. The
only models of non-tatonnement processes developed so far are pure exchange
models. In such models, the dynamic adjustment equation $dp(t)/dt = f[p(t)]$
$(= \sum_i x_i[p(t)] - \sum_i \bar{x}_i)$ is replaced by
$$\frac{dp_j(t)}{dt} = \sum_{i=1}^{m} x_{ij}[p(t), \bar{x}_1(t), \bar{x}_2(t), \ldots, \bar{x}_m(t)] - \sum_{i=1}^{m} \bar{x}_{ij}(t), \quad j = 1, 2, \ldots, n$$
and
$$\frac{d\bar{x}_{ij}(t)}{dt} = F_{ij}[p(t), \bar{x}_1(t), \ldots, \bar{x}_m(t)], \quad i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n$$
Here $x_{ij}$ denotes Mr. i's demand for commodity j, $\bar{x}_{ij}(t)$ denotes Mr. i's holding
of the commodity j at time t (i = 1, 2, ..., m), and $\bar{x}_i(t)$ is a vector denoting
$[\bar{x}_{i1}(t), \ldots, \bar{x}_{in}(t)]$. The functions $F_{ij}$ denote the transaction rules that individuals
follow to change their stock of commodities $\bar{x}_{ij}$. Thus $\sum_{i=1}^{m} \bar{x}_i(t)$ denotes the
vector of all the commodities available in the economy at time t. In the case of pure
exchange, $\sum_{i=1}^{m} \bar{x}_i(t)$ is clearly a constant for all t (which implies
$\sum_{i=1}^{m} d\bar{x}_{ij}/dt = \sum_{i=1}^{m} F_{ij} = 0$ for all j). It should be clear that $\bar{x}_{ij}(t)$ moves over time and this represents
the essence of the non-tâtonnement process which allows intermediate pur-
chases. In the non-tâtonnement process the resource allocation $\bar{x}(t) = [\bar{x}_1(t),
\bar{x}_2(t), \ldots, \bar{x}_m(t)]$, as well as the price vector $[p_1(t), \ldots, p_n(t)]$, are the adjustment
parameters of the system. We call $[p^*, \bar{x}^*]$ an equilibrium state of the non-tâton-
nement process if
$$\sum_{i=1}^{m} x_{ij}[p^*, \bar{x}^*] = \sum_{i=1}^{m} \bar{x}^*_{ij} \left(= \sum_{i=1}^{m} \bar{x}_{ij}(0)\right), \quad j = 1, 2, \ldots, n$$
We can define "stability" in terms of
$$[p(t), \bar{x}(t)] \to [p^*, \bar{x}^*] \quad \text{as } t \to \infty$$
Three kinds of non-tâtonnement processes are well known in the literature, all
of which are confined to the model of the pure exchange economy:⁹

(i) A barter process (Negishi [8]). In a barter exchange economy, to get some-
thing one must offer something else of equal value. Hence such an exchange
does not alter the total value of the commodities held by each individual.
That is,
$$\sum_{j=1}^{n} p_j \frac{d\bar{x}_{ij}}{dt} = \sum_{j=1}^{n} p_j F_{ij} = 0 \quad \text{for all } t,\ i = 1, 2, \ldots, m$$
Negishi [8] showed that the barter process is stable under the gross sub-
stitutability assumption.
However, one may ask whether or not we can avoid the use of the gross
substitutability assumption in the adjustment of the non-tâtonnement process.
This may be done by specifying the rule of exchange in the above barter
process in more detail. In fact, the Edgeworth process and the Hahn-Negishi
process are examples of such a process.
(ii) Edgeworth process (Uzawa [ 14] , Hahn [ 3] , Morishima [ 7] ). This process
is based on the assumption that each individual participates in the exchange
as long as it increases his satisfaction. Hence in the Edgeworth exchange
process, it is supposed that the utility of the stock of commodities held by
each individual increases over time as a result of the exchange transactions
(the process is named after Edgeworth who first described it in his paper
originally written in 1891 and published in his Papers Relating to Political
Economy, Vol. II, London, Macmillan, 1925). When the process reaches a
Pareto optimal point, it cannot move any further by definition; hence it is
an equilibrium point. Let $u_i$ be Mr. i's utility function and consider the following
constrained maximum problem:¹⁰
$$\text{Maximize: } \sum_{i=1}^{m} a_i u_i(x_i)$$
$$\text{Subject to: } \sum_{i=1}^{m} x_{ij} = \sum_{i=1}^{m} \bar{x}_{ij}, \quad \sum_{j=1}^{n} p_j x_{ij} = \sum_{j=1}^{n} p_j \bar{x}_{ij}, \quad \text{and} \quad u_i(x_i) \geq u_i(\bar{x}_i)$$
where $x_i = (x_{i1}, x_{i2}, \ldots, x_{in})$. Let $\tilde{x} = (\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_m)$ be a solution to this
constrained maximum problem. Obviously for each t we have a different
$\bar{x}_{ij}(t)$ and hence a different $\tilde{x}$. The Edgeworth process moves in the direction
of a solution of such a constrained maximum problem; that is,
$$\frac{d\bar{x}_{ij}}{dt} = \tilde{x}_{ij}(t) - \bar{x}_{ij}(t), \quad i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n$$
Uzawa [14] claims that he has shown that the Edgeworth process is stable.¹¹
(iii) Hahn-Negishi process ([4]). This is based on the assumption that if there is
an excess supply of a certain commodity, then all the buyers of this commodity
can achieve their desires, and that if there is an excess demand for a certain
commodity, then all the sellers of this commodity can achieve their desires.
The following relations illustrate this process:
(1) (Disequilibrium) If $x_{ij}(t) - \bar{x}_{ij}(t) \neq 0$, then
$\text{sign}[x_{ij}(t) - \bar{x}_{ij}(t)] = \text{sign}[\sum_{i=1}^{m} x_{ij}(t) - \sum_{i=1}^{m} \bar{x}_{ij}(t)]$, i = 1, 2, ..., m; j = 1, 2, ..., n.
(2) (Equilibrium) If $\sum_{i=1}^{m} x_{ij}(t) - \sum_{i=1}^{m} \bar{x}_{ij}(t) = 0$, then $x_{ij}(t) - \bar{x}_{ij}(t) = 0$,
for all i = 1, 2, ..., m.
Statement (1) means that if there is an excess demand for j (that is,
$\sum_i x_{ij} - \sum_i \bar{x}_{ij} > 0$), then all the sellers can sell (hence $x_{ij} - \bar{x}_{ij} = 0$ if Mr. i is a seller)
and not all the buyers can buy (hence $x_{ij} - \bar{x}_{ij} \geq 0$ if Mr. i is a buyer). On the
other hand, if $\sum_i x_{ij} - \sum_i \bar{x}_{ij} < 0$, then all the buyers can buy (hence
$x_{ij} - \bar{x}_{ij} = 0$ if Mr. i is a buyer) but not all the sellers can sell (hence $x_{ij} - \bar{x}_{ij} \leq 0$
if Mr. i is a seller). Using properties (1) and (2), Hahn and Negishi [4] proved
the stability of this process without using the gross substitutability assumption.
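
As a small illustration of the barter condition in (i), the sketch below constructs one hypothetical transaction rule for two traders and two commodities and checks that it satisfies both $\sum_j p_j F_{ij} = 0$ (each trader gives up as much value as he receives) and $\sum_i F_{ij} = 0$ (total stocks are conserved under pure exchange). The rule `barter_flows`, the prices, the holdings, and the trading rate are assumptions made for the example; they are not taken from Negishi [8].

```python
import numpy as np

def barter_flows(p, a):
    """Hypothetical transaction rule F[i, j] for two traders and two goods.

    Trader 1 receives good 1 at rate a from trader 2 and pays for it with
    good 2 of equal value at the current prices p.
    """
    p1, p2 = p
    return np.array([[a, -a * p1 / p2],
                     [-a, a * p1 / p2]])

p = np.array([2.0, 1.0])               # current prices (assumed)
holdings = np.array([[1.0, 3.0],
                     [2.0, 1.0]])      # current holdings of traders 1 and 2

F = barter_flows(p, a=0.5)
value_change = F @ p                   # sum_j p_j F_ij for each trader
stock_change = F.sum(axis=0)           # sum_i F_ij for each commodity
new_holdings = holdings + 0.1 * F      # holdings after a short trading interval

print("change in value of each trader's holdings:", value_change)
print("change in total stock of each commodity:", stock_change)
```

At the given prices the value of each trader's holdings is the same before and after the trade, even though the holdings themselves have moved.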

FOOTNOTES

1. Walras apparently was not fully aware of the significance of such a "false trading"
(or "recontracting") for he introduced such a concept in his theory of production but
there is no evidence that he considered it seriously in his theory of exchange (see
Patinkin, [12] esp. Note B). Newman argues that Walras' tâtonnement was not meant
to be a device to deal with false trading, but that "the device was meant to cope
with the problem of convergence of the `excess demand' mechanism in multiple
market situations." (See P. Newman, The Theory of Exchange, Englewood Cliffs,
N. J., Prentice-Hall, 1965, p. 102.) Incidentally, for a modern mathematical treatment
of the Walrasian successive tâtonnement process in the theory of pure exchange,
see H. Uzawa, "Walras' Tatonnement in the Theory of Exchange," Review of
Economic Studies, XXVII, June 1960.
2. Negishi ([11], p. 135) has proposed that the "market manager" in the tâtonnement
process may be regarded as the "incarnation of the competitive forces in the market."
Although this is an interesting observation, it has an objectionable flavor of meta-
physics, as was the case with the "invisible hand." Moreover, in most markets, it is
not easy to think of such a "competitive force."
3. See also Arrow [1] and Takayama [13], for similar comments.
4. Or we may consider that f depends on the allocation of commodities among the
traders (in the theory of pure exchange). Given a price vector p and an initial resource
vector $\bar{x}_i = (\bar{x}_{i1}, \ldots, \bar{x}_{in})$, trader i chooses his demand vector $x_i = (x_{i1}, \ldots, x_{in})$,
so that we may write $x_i = x_i(p, \bar{x}_i)$. The market excess demand vector is defined as
$\sum_i x_i - \sum_i \bar{x}_i$, which can, therefore, be written as $f(p, \bar{x}_1, \ldots, \bar{x}_m)$, assuming m traders
in the economy. Now if actual transactions are allowed in the process, the $\bar{x}_i$'s (as well
as p) change from time to time so that $dp(t)/dt = f[p(t), \bar{x}_1(t), \ldots, \bar{x}_m(t)]$. Here f
does not change over time. See the latter part of this section.
5. Hahn [3] pointed out a rather artificial case in which intermediate transactions are
allowed which have no effect on the distribution of welfare between individuals. This
is the case in which no stocks of commodities exist but where there is a continuous
flow of perishable commodities.
6. This is the point emphasized by Samuelson when he proposed the "correspondence
principle." Its application to macro-economics was emphasized by Patinkin [12].
But we cannot see the relevance of the tâtonnement process to macro-economics, for
we cannot quite visualize all people involved in a "macro-economy" gathered in one
place to exchange tickets.
7. Notice that the tâtonnement process offers a method of computing an equilibrium
price vector without actually specifying the function f. In the process, $dp_i(t)/dt = f_i[p(t)]$
[or $p_i(t + 1) = p_i(t) + k_i f_i[p(t)]$], for example, $f_i[p(t)]$, i = 1, 2, ..., n,
can be easily computed by adding or subtracting the numbers recorded on the
"tickets," so that the market manager does not have to know the functions $f_i$ at
all in computing the new price vector. If the process is stable, the price vector
eventually comes close to the equilibrium vector.
8. There is a recent interest in the topic of "optimal" organization of society and pos-
sible adjustment processes toward an "optimal" state by economists such as L.
Hurwicz, S. Reiter, J. Marschak, T. Marschak, R. Radner, and A. Camacho. The
classical article on this topic is L. Hurwicz, "Optimality and Informational Efficiency
in Resource Allocation Processes," in Mathematical Methods in the Social Sciences,
ed. by K. J. Arrow, S. Karlin, and P. Suppes, Stanford, Calif., Stanford University
Press, 1960.
9. The proofs of stability for these processes can best be done by utilizing the Liapunov
"second method" or its modified form. For a sketch of the proof, see Negishi [10].
An exposition of the Liapunov second method will be given in the next section of
this chapter.
10. Here the ai's are assumed to be positive constants.
11. A minor flaw in [14] was pointed out by R. W. Ruppert and R. R. Russell in their
"A Note on Uzawa's Barter Process," International Economic Review, 13, June 1972.

REFERENCES
1. Arrow, K. J., "Towards a Theory of Price Adjustment," in The Allocations of
Resources, ed. by M. Abramovitz, Stanford, Calif., Stanford University Press, 1959.
2. Arrow, K. J., and Hurwicz, L., "Decentralization and Computation in Resource
Allocation," in Essays in Economics and Econometrics, ed. by R. Pfouts, Chapel Hill,
N.C., University of North Carolina Press, 1960.
3. Hahn, F. H., "On the Stability of a Pure Exchange Equilibrium," International
Economic Review, 3, May 1962.
4. Hahn, F. H., and Negishi, T., "A Theorem on Non-tâtonnement Stability,"
Econometrica, 30, July 1962.
5. Koopmans, T. C., Three Essays on the State of Economic Science, New York,
McGraw-Hill, 1957.
6. Morishima, M., Dynamic Economic Theory (Dogakuteki Keizai Riron), Tokyo,
Kobundo, 1950 (in Japanese).
7. Morishima, M., "The Stability of Exchange Equilibrium: An Alternative Approach,"
International Economic Review, 3, May 1962.
8. Negishi, T., "On the Formation of Prices," International Economic Review, 2, January
1961.
9. Negishi, T., "On the Successive Barter Process," Economic Studies Quarterly, XII,
January 1962.
10. Negishi, T., "Stability of a Competitive Economy: A Survey Article," Econometrica, 30,
October 1962.
11. Negishi, T., Theories of Price and Resource Allocation (Kakaku to Haibun no Riron),
Tokyo, Toyo Keizai Shimpo-sha, 1965 (in Japanese).
12. Patinkin, D., Money, Interest and Prices, 2nd ed., New York, Harper and Row, 1965.
13. Takayama, A., "Stability in the Balance of Payments, A Multi-Country Approach,"
The Journal of Economic Behavior, 1, October 1961.
14. Uzawa, H., "On the Stability of Edgeworth's Barter Process," International Economic
Review, 3, May 1962.
15. Walras, L., Elements of Pure Economics, tr. by Jaffe, London, George Allen & Unwin,
1954.

Section H
LIAPUNOV'S SECOND METHOD

In connection with the proof of the global stability of an equilibrium, people
found that Liapunov's second method might be useful. The distance function
adopted in the above proof of global stability by Arrow, Block, and Hurwicz may
be considered a Liapunov function (of his "second method"). The first explicit use
of this method was probably by McKenzie [7], who, in turn, attributes the
idea to Arrow. The history of Liapunov's method, consisting of the "first method"
and the "second method," goes back to 1892, but it was practically unknown until
its French translation appeared in 1907 [1]. Since the 1930s, major developments
in the theory of the stability of differential equations have occurred in the U.S.S.R.
Liapunov's first method is to find explicit power series solutions, convergent near
the origin, and then to deduce the stability from the behavior of the series, for large
t. Liapunov's second method, on the other hand, makes no attempt to find explicit
solutions. This allows a far greater range of applications for the second method
than for the first. However, the second method cannot provide any explicit in-
formation on the actual behavior of the solutions.
Consider the following systems of differential equations:
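(A)  $\dfrac{dx_i}{dt} = f_i(x_1, x_2, \ldots, x_n), \quad i = 1, 2, \ldots, n$

or

$\dfrac{dx}{dt} = f(x)$, where $f: X \to R^n$, $X \subset R^n$

(NA)  $\dfrac{dx_i}{dt} = f_i(x_1, x_2, \ldots, x_n; t), \quad i = 1, 2, \ldots, n$

or

$\dfrac{dx}{dt} = f(x, t)$, where $f: X \otimes (-\infty, \infty) \to R^n$, $X \subset R^n$
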
In both systems, f is assumed to be continuous.
As discussed in Section B, the system (A) is called the autonomous system and
the system (NA) is called the nonautonomous system. Let z be an equilibrium point
of the (A) system, that is, f(z) = 0 [or f(z; t) = 0 for all t, for (NA)] . We may
choose z = 0 if we wish. This is not really much of a restriction, for by defining
y ≡ x - z, we get a new system dy/dt = f(y) [or dy/dt = f(y; t)] whose equilibrium
point is the origin, that is, y = 0. We assume that z is an isolated equilibrium point
in the sense that there is no other equilibrium point in some open ball about z.
We assume that the initial point x(t°) (where t° is the "initial" time) lies inside this
open ball. If the equilibrium is unique, we take this open ball to be the whole space
in which both systems, (A) and (NA), are defined. We are concerned with the
global stability of the solution x(t; x°, t°) of the above systems, which starts from
an initial point x(t°) = x°.
We now define various concepts of stability. We assume that there exists a
unique solution determined by the initial point and that it is continuous with
respect to the initial point. We write the solution vector starting from (x°, t°)
as x(t; x°, t°).

Definition (S1): An equilibrium state z of a dynamic system is called Liapunov
stable if for any real number $\varepsilon > 0$ and any $t^0$ there exists a positive real number $\delta$
such that

$\| x^0 - z \| \le \delta$ implies $\| x(t; x^0, t^0) - z \| \le \varepsilon$

for all $t > t^0$, where $\delta = \delta(\varepsilon, t^0)$ (that is, $\delta$ is dependent on $\varepsilon$ and $t^0$).¹
REMARK: This concept is a local concept; that is, it refers to behavior near
z, since $\delta$ can be very small. Moreover, this does not say that $x(t; x^0, t^0)$ converges
to z. The concept of Liapunov stability is illustrated in Figure 3.9.²

Figure 3.9. An Illustration of Liapunov Stability.


In essence it says that if x° is sufficiently close to z, then x(t; x°, t°) remains
bounded for all t.

Definition (S2): An equilibrium state z of a dynamic system is called (asymptotically)
locally stable if³

(i) It is Liapunov stable, and

(ii) Every motion starting sufficiently close to z converges to z as $t \to \infty$. In other
words, for any $\mu > 0$ there exist an $r = r(t^0) > 0$ and a $T = T(\mu, x^0, t^0)$ such that
$\| x^0 - z \| \le r(t^0)$ implies $\| x(t; x^0, t^0) - z \| \le \mu$ for all $t \ge t^0 + T$, where r and
T are some real numbers dependent on $t^0$ and $(\mu, x^0, t^0)$, respectively.

REMARK: This again is a local concept, for $r(t^0)$ can be very small. A somewhat
puzzling thing in the definition is that (S1) (Liapunov stability) has
to be mentioned even though we have (ii); that is, (ii) in the above definition
alone does not necessarily imply (i). Kalman and Bertram ([3], pp. 375-376)
gave the following example.
EXAMPLE: Consider the second-order system in polar coordinates

$x = (r, \theta), \quad 0 \le r < \infty, \quad 0 \le \theta < 2\pi$
$\dot{r} = [\dot{g}(\theta, t)/g(\theta, t)]\, r$
$\dot{\theta} = 0$

where $g(\theta, t) = \sin^2\theta / [\sin^4\theta + (1 - t \sin^2\theta)^2] + 1/(1 + t^2)$. Here (ii) of (S2)
is satisfied but (i) of (S2) [that is, (S1)] is not satisfied. However, if all
motions are continuous in $x^0$, then we can show that if every motion sufficiently
close to z converges uniformly to z, then (S1) holds (see [3], p. 376).

Definition (S3): An equilibrium state z of a dynamic system is called (asymptotically)
globally stable⁴ if

(i) It is Liapunov stable, and

(ii) Every motion converges to z as $t \to \infty$.

REMARK: If a system is autonomous [that is, dx/dt = f(x)], then we can show
that $\delta$, r, and T in the preceding definitions do not depend on $t^0$. Hence (1) if
z is Liapunov stable, it is uniformly Liapunov stable; and (2) if z is asymptotically
locally stable, it is uniformly asymptotically locally stable. Moreover,
we can show that if z is asymptotically globally stable it is uniformly
asymptotically globally stable, the latter being defined as follows:

Definition (S4): An equilibrium state z of a dynamic system is called uniformly
(asymptotically) globally stable if

(i) It is uniformly Liapunov stable in the sense that $\delta$ in the definition of (S1) does
not depend on $t^0$, and
(ii) Every motion converges to z as $t \to \infty$ uniformly in $t^0$ and $\| x^0 \| \le r$, where r is
fixed and can be arbitrarily large. [That is, given any $r > 0$ and $\mu > 0$, there is
some $T(\mu, r)$ such that $\| x^0 - z \| \le r$ implies $\| x(t; x^0, t^0) - z \| \le \mu$ for all
$t \ge t^0 + T$.]

Definition (S5): An equilibrium state z of a dynamic system is called (asymptotically)
strongly uniformly globally stable⁵ if
(i) It is uniformly (asymptotically) globally stable, and
(ii) It is uniformly bounded in the sense that, for any given r > 0, there is some B =
B(r) such that $\| x^0 - z \| \le r$ implies $\| x(t; x^0, t^0) - z \| \le B$.
REMARK: The above concepts may be illustrated by Figure 3.10, where the
arrow reads "implies" and "+ (autonomous)" reads "if the system is autonomous."
[Figure 3.10. Relations among Various Concepts of Stability. The diagram links the
following concepts by "implies" arrows, some of them marked "+ (autonomous)":
strong uniform global stability, uniform global stability, global stability, uniform
local stability, local stability, Liapunov stability, uniform Liapunov stability.]

We are now ready to state some of the major results obtained by Liapunov
and his followers.

Theorem 3.H.1: Consider the autonomous system (A) [that is, dx/dt = f(x)] with f(0) = 0.
Suppose that there exists a real-valued continuously differentiable function V(x)
on X such that

(i) $V(x) > 0$ for all $x \ne 0$, $V(0) = 0$,⁶

(ii) $dV[x(t; x^0, t^0)]/dt < 0$ for all $x(t; x^0, t^0) \ne 0$,⁷
(iii) $V(x) \to \infty$ with $\| x \| \to \infty$.

Then the equilibrium state z = 0 is uniformly globally stable, so that $x(t; x^0, t^0) \to 0$
(for any $t^0$ and $x^0$), as $t \to \infty$.

REMARK: The function V(x) is called a Liapunov function of system (A).

EXAMPLE: Consider $dx/dt = -x^3$, $x \in R$. Define the Liapunov function for
this system by $V(x) = x^2$. Conditions (i) and (iii) above are obviously satisfied.
To show (ii), observe that $dV/dt = 2x \cdot (-x^3) = -2x^4$, which is negative for all $x \ne 0$.
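
The example can also be checked numerically. The following is a minimal sketch, not part of the original text: it integrates dx/dt = -x^3 by the forward-Euler method and records V(x) = x^2 along the path. The function names, the step size `dt`, the number of steps, and the starting point 2.0 are all illustrative assumptions.

```python
def f(x):
    """Right-hand side of the scalar system dx/dt = -x**3."""
    return -x ** 3


def V(x):
    """Candidate Liapunov function V(x) = x**2."""
    return x ** 2


def V_dot(x):
    """Derivative of V along a solution: V'(x) * f(x) = 2x * (-x**3) = -2x**4."""
    return 2.0 * x * f(x)


def simulate(x0, dt=1e-3, steps=20000):
    """Forward-Euler path of dx/dt = -x**3 from x0 (dt and steps are illustrative)."""
    path = [x0]
    for _ in range(steps):
        path.append(path[-1] + dt * f(path[-1]))
    return path


# Hypothetical starting point; any nonzero x0 with dt * x0**2 < 1 behaves the same way.
path = simulate(2.0)
values = [V(x) for x in path]
```

Along this path the recorded values of V fall at every step, as condition (ii) requires. The approach to the origin is slow, since the exact solution is x(t) = x0 / sqrt(1 + 2 x0^2 t): the theorem guarantees convergence but says nothing about its speed.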
The above theorem provides sufficient conditions for "complete" stability in
the sense that the uniform global stability holds for the entire region X (which can
be $R^n$). The next theorem provides sufficient conditions for the uniform global
stability of a certain subregion of X.

Theorem 3.H.2: Consider (A) with f(0) = 0 and suppose that there exists a real-valued
continuously differentiable function V(x) on X and a region $D = \{x \in X: V(x) < k\}$
which is nonempty and bounded such that

(i) $V(x) > 0$ for all $x \ne 0$, $x \in D$, and $V(0) = 0$, and

(ii) $dV[x(t; x^0, t^0)]/dt < 0$ for all $x(t; x^0, t^0) \ne 0$, $x(t; x^0, t^0) \in D$.
Then z = 0 is Liapunov stable and $x(t; x^0, t^0)$, the solution of (A), converges to z = 0
as $t \to \infty$ for any $t^0$, if $x^0 \in D$.

REMARK: The function V(x) in the above theorem is again called the
Liapunov function of the system (A). Again, condition (ii) implies that the
origin is the unique equilibrium point.
EXAMPLE: Consider the van der Pol equation, $\ddot{x} - \varepsilon(x^2 - 1)\dot{x} + x = 0$
where $\varepsilon > 0$,⁸ or its equivalent form,⁹ $\dot{x} = y + \varepsilon(x^3/3 - x)$, $\dot{y} = -x$. Define
the Liapunov function for this by $V(x, y) = (x^2 + y^2)/2$. Then $\dot{V} = \varepsilon x^2(x^2/3 - 1)$
along the solution path, so that $\dot{V} \le 0$ if $x^2 < 3$. Define the region D by
$D = \{(x, y) \in R^2: x^2 + y^2 < 3\}$. Then we have $\dot{V}(x, y) \le 0$, as well as $V(x, y) \ge 0$,
for all (x, y) in D [note that $V(x, y) = 0$ only when (x, y) = (0, 0); that is, condition
(i) of Theorem 3.H.2 is satisfied]. Now observe that $\dot{V}(x, y) = 0$ along
the solution path only when x = 0, that is, only when (x, y) is on the y-axis.
But if $x^0 = 0$ and $y^0 \ne 0$, then $\dot{V} < 0$ for any $t > t^0$, for $\dot{x} = y$ on the y-axis.
This proves condition (ii) of Theorem 3.H.2. Therefore $x(t; x^0, y^0, t^0) \to 0$
and $y(t; x^0, y^0, t^0) \to 0$ as $t \to \infty$, provided that $(x^0, y^0) \in D$.
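
A small numerical sketch may help to visualize this example. It is an illustration only, under stated assumptions: the system is taken in the form dx/dt = y + eps(x^3/3 - x), dy/dt = -x, the parameter value `EPS = 1.0`, the classical Runge-Kutta step, the step size, and the starting point (1, 1), which lies inside D, are all choices made here and not in the original text.

```python
EPS = 1.0  # illustrative value of the parameter epsilon > 0


def field(x, y):
    """Vector field: dx/dt = y + eps*(x**3/3 - x), dy/dt = -x."""
    return y + EPS * (x ** 3 / 3.0 - x), -x


def V(x, y):
    """Liapunov function V(x, y) = (x**2 + y**2) / 2."""
    return 0.5 * (x * x + y * y)


def rk4_step(x, y, dt):
    """One classical fourth-order Runge-Kutta step for the planar system."""
    k1x, k1y = field(x, y)
    k2x, k2y = field(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = field(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = field(x + dt * k3x, y + dt * k3y)
    x_new = x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
    y_new = y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
    return x_new, y_new


def simulate(x0, y0, dt=0.01, steps=4000):
    """Integrate from (x0, y0) and record V after every step."""
    x, y = x0, y0
    values = [V(x, y)]
    for _ in range(steps):
        x, y = rk4_step(x, y, dt)
        values.append(V(x, y))
    return (x, y), values


# (1, 1) satisfies x**2 + y**2 = 2 < 3, so the path starts inside D.
(end_x, end_y), values = simulate(1.0, 1.0)
```

With these settings V starts at 1, stays below the bound 3/2 that defines D, and shrinks toward zero, so the path never leaves D and spirals into the origin.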

We now state the asymptotic stability theorem for the nonautonomous system.

Theorem 3.H.3: Consider the nonautonomous system (NA), dx/dt = f(x, t) with
f(0; t) = 0 for all t. Suppose that there exists a real-valued continuously differentiable
function V(x, t) on $X \otimes (-\infty, \infty)$ such that V(0, t) = 0, and

(i) There exist continuous nondecreasing real-valued functions $\alpha$ and $\beta$ such that
$\alpha(0) = 0$ and $\beta(0) = 0$ and $0 < \alpha(\| x \|) \le V(x, t) \le \beta(\| x \|)$ for all t and all
$x \ne 0$,
(ii) There exists a continuous real-valued function $\gamma$ such that $\gamma(0) = 0$ and
$dV[x(t; x^0, t^0), t]/dt \le -\gamma(\| x \|) < 0$ for all t and all $x \ne 0$,¹⁰
(iii) $\alpha(\| x \|) \to \infty$ with $\| x \| \to \infty$.

Then the equilibrium state z = 0 is strongly uniformly globally stable so that
$x(t; x^0, t^0) \to z = 0$ for any $x^0$ and $t^0$ when $t \to \infty$. The function V(x, t) is called a
Liapunov function of the system (NA).
REMARK: For the proof of the above theorems, see Kalman and Bertram
[3]; LaSalle and Lefschetz [4], chapter 2; and Yoshizawa [10], chapter 5.
REMARK: If V(x, t) in Theorem 3.H.3 is positive definite and if
$dV[x(t; x^0, t^0), t]/dt \le 0$ (instead of < 0), then we can merely say that
z = 0 is Liapunov stable.
REMARK: Conditions (i), (ii), and (iii) of Theorem 3.H.3 can be restated as
follows: There exist continuous positive definite functions a(x), b(x), and
c(x) such that

(i) $a(x) \le V(x, t) \le b(x)$ for all t and all $x \ne 0$,
(ii) $dV[x(t; x^0, t^0), t]/dt \le -c(x)$ for all t and all $x \ne 0$, where $\| x \| < \infty$, and
(iii) $a(x) \to \infty$ as $\| x \| \to \infty$.
REMARK: The converse of Theorem 3.H.3 is also, in a sense, true. In
particular, we can show the following: Let f in (NA) be Lipschitzian¹¹ and
suppose that f(0, t) = 0 for all t. If z = 0 is strongly uniformly globally stable,
then there exists a real-valued function V(x, t), infinitely differentiable in
x and t, which satisfies the hypothesis of Theorem 3.H.3.¹²
Functions $\alpha$ and $\beta$ in (i) and (ii) of Theorem 3.H.3 are illustrated by Figure
3.11.
As a result of the above theorems, the proof of the stability of a certain
dynamic system can be obtained by finding a Liapunov function for the system.
Many stability theorems in the literature are obtained as special cases of the above
dynamic systems (A) and (NA). (For example, see Kalman and Bertram [3].)

Figure 3.11. An Illustration of $\alpha$ and $\beta$ in Theorem 3.H.3.

An important stability condition for the linear case [f(x) = Ax where A is an
$n \times n$ matrix with constant components], that is, the Routh-Hurwitz condition,
can be obtained from Theorem 3.H.1. (See, for example, Kalman and Bertram
[3], pp. 381-382.) In mechanics or thermodynamics, a natural candidate for a
Liapunov function is the amount of total energy¹³ or, with the sign reversed, total
entropy. The proof of the stability of a competitive equilibrium can be handled by
finding a Liapunov function. Let $dp/dt = f(p)$, where $p = (p_1, p_2, \ldots, p_n)$ denotes a
price vector. This is clearly an autonomous system. The distance function $D(t)$
[or $D_m(t)$] introduced in the proof by Arrow, Block, and Hurwicz can be considered
a Liapunov function, although they made no explicit use of Theorem
3.H.1. McKenzie [7] constructed the following Liapunov function for the above
system:

$V(t) \equiv \sum_{i=1}^{n} p_i \, |f_i(p)|$

that is, V(t) is the sum of the absolute values of the excess demands multiplied by
their prices. Clearly $V(t) > 0$ whenever $p \ne \hat{p}$ ($\hat{p}$ = an equilibrium price vector),
and $V = 0$ when $p = \hat{p}$. Hence, if we can show that $\dot{V} < 0$, the proof of the stability
is almost complete. For a short sketch of the proof that $\dot{V} < 0$, we refer the reader
to Negishi ([8], pp. 656-657). See also Chapter 4, Section D.
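
To see how the sum of price-weighted absolute excess demands behaves along a price adjustment path, consider the following sketch. Everything specific in it is an assumption made for illustration and does not come from the text: a three-good exchange economy in which trader k owns one unit of good k and all traders spend the fixed budget shares `SHARES` (Cobb-Douglas demands, which satisfy gross substitutability), so that the excess demand for good j is f_j(p) = a_j (p_1 + p_2 + p_3)/p_j - 1. The adjustment dp/dt = f(p) is integrated by the forward-Euler method with an illustrative step size.

```python
SHARES = (0.5, 0.3, 0.2)  # hypothetical budget shares, summing to 1


def excess_demand(p):
    """f_j(p) = a_j * (total income) / p_j - 1, with one unit of each good supplied."""
    income = sum(p)  # trader k owns one unit of good k, so total income is sum(p)
    return [a * income / pj - 1.0 for a, pj in zip(SHARES, p)]


def price_weighted_abs_excess_demand(p):
    """V = sum_i p_i * |f_i(p)|, the function described in the text."""
    return sum(pj * abs(fj) for pj, fj in zip(p, excess_demand(p)))


def adjust_prices(p0, dt=1e-3, steps=10000):
    """Forward-Euler path of dp/dt = f(p) starting at p0."""
    p = list(p0)
    path = [tuple(p)]
    for _ in range(steps):
        f = excess_demand(p)
        p = [pj + dt * fj for pj, fj in zip(p, f)]
        path.append(tuple(p))
    return path


path = adjust_prices((1.0, 1.0, 1.0))
# Sample V once per unit of time (every 1000 Euler steps).
samples = [price_weighted_abs_excess_demand(p) for p in path[::1000]]
```

In this economy the sampled values of V fall from one sample to the next and approach zero, and the final prices are close to proportional to `SHARES`, which is the equilibrium ray of this particular example.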
In the Liapunov method discussed above, it is assumed that the equilibrium
point is either isolated or unique. However, it is often important to consider cases
in which there is more than one equilibrium point and the equilibrium points are not
isolated. As we remarked in Section B, Uzawa [9] reconsidered Liapunov's second method to
allow for such a case. Let $f: X \to R^n$ be continuous and consider a system of differential
equations $dx(t)/dt = f[x(t)]$. Let $x^0$ be the initial value of x at t = 0. Assume
that, for any value of $x^0 \in X$, this system of differential equations has a unique
solution $x(t, x^0)$ for all $t \ge 0$, which is continuous at $x^0$. Let E be the set of all
equilibrium vectors, that is, $E = \{z: z \in X \text{ and } f(z) = 0\}$. Since f(x) is continuous, E is
closed in X.

Definition: The process $dx/dt = f[x(t)]$ is called quasi-stable if its solution $x(t; x^0)$
satisfies the following conditions:

(i) Every limit point of $x(t, x^0)$, as t tends to infinity, is an equilibrium. That is, if for
some sequence $t_q$, q = 1, 2, ..., such that $t_q \to \infty$, $\lim x(t_q, x^0)$, as $q \to \infty$,
exists, then $\lim_{q \to \infty} x(t_q, x^0)$ is an equilibrium.
(ii) It is uniformly bounded; that is, for any given r > 0, there is some number
B = B(r) such that $\| x^0 - z \| \le r$ implies $\| x(t; x^0, t^0) - z \| \le B$.

REMARK: Uzawa [9] has shown that if the set X is closed, quasi-stability is
equivalent to

(i') $\lim_{t \to \infty} V[x(t; x^0)] = 0$, where $V(x) = \inf_{z \in E} | x - z |$, $x \in X$

The function V(x) signifies the distance between point x and the equilibrium
set E. This function V will play the role of the Liapunov function.
REMARK: If the equilibrium points are isolated from each other, then con-
dition (i') means nothing but the asymptotic convergence of x(t, x°) to some
equilibrium point. The concept of quasi-stability, however, allows the case in
which the equilibrium points are not isolated and the solution x(t, x°) does
not necessarily converge to a particular point in the equilibrium set (see
Section B of this chapter).
Uzawa then proved the following theorem and showed its application.

Theorem 3.H.4: Suppose that the solution $x(t; x^0, t^0)$ of $dx/dt = f[x(t)]$ is contained in
a compact set X, and
(U) There exists a continuous function V(x) defined on X such that $V[x(t, x^0)]$ is a
strictly decreasing function with respect to t unless $x(t, x^0)$ is an equilibrium.
Then the process $dx/dt = f[x(t)]$ is quasi-stable.
REMARK: Uzawa called the function V(x) above a modified Liapunov function.
Unlike Liapunov's V, it is not assumed to be differentiable and positive
definite.
As an illustration of such a theorem applied to stability analysis, consider the
nonnormalized adjustment process of a competitive equilibrium:¹⁴

$\dfrac{dp_i}{dt} = f_i[p(t)], \quad i = 1, 2, \ldots, n$

where we have

(i) (Homogeneity) $f_i(p) = f_i(\alpha p)$ for all $\alpha > 0$, and for all p, i = 1, 2, ..., n.
(ii) (Gross substitutability) $\partial f_i/\partial p_j > 0$ for all $i \ne j$ and for all p.

Assume that an equilibrium price vector exists, and denote it by $\hat{p}$. Assume further
that the above system of differential equations has a unique solution $p(t; p^0)$ for all
$t \ge 0$, which is continuous in $p^0$, where $p^0 > 0$.
Following Uzawa [9], define the functions V(p) and v(p) by

$V(p) = \max \left\{ \dfrac{p_1(t)}{\hat{p}_1}, \dfrac{p_2(t)}{\hat{p}_2}, \ldots, \dfrac{p_n(t)}{\hat{p}_n} \right\}$

and

$v(p) = \min \left\{ \dfrac{p_1(t)}{\hat{p}_1}, \dfrac{p_2(t)}{\hat{p}_2}, \ldots, \dfrac{p_n(t)}{\hat{p}_n} \right\}$

The functions V(p) and v(p) are "proxies" for the distance between p(t) and $\hat{p}$.
These functions would play the role of the (modified) Liapunov function.¹⁵
Without loss of generality, we may assume that $p_1(t)/\hat{p}_1 \ge p_i(t)/\hat{p}_i$ for all i.
That is, $V(p) = p_1(t)/\hat{p}_1$, for some time interval, say, $\tau$. To simplify the exposition,
assume further that V(p) and v(p) are differentiable in t.¹⁶ We, hence, observe that

$\dfrac{dV(p)}{dt} = \dfrac{1}{\hat{p}_1} \dfrac{dp_1(t)}{dt} = \dfrac{1}{\hat{p}_1} \left[ x_1(p) - \bar{x}_1 \right] = \dfrac{1}{\hat{p}_1} \left[ x_1\!\left( \hat{p}_1, \dfrac{\hat{p}_1}{p_1} p_2, \ldots, \dfrac{\hat{p}_1}{p_1} p_n \right) - \bar{x}_1 \right]$ (by homogeneity)

Since $\hat{p}_1/p_1 \le \hat{p}_i/p_i$ for all i with strict inequality for at least one i,¹⁷ we have

$x_1\!\left( \hat{p}_1, \dfrac{\hat{p}_1}{p_1} p_2, \ldots, \dfrac{\hat{p}_1}{p_1} p_n \right) < x_1(\hat{p}_1, \ldots, \hat{p}_n) = \bar{x}_1$

due to gross substitutability. Hence dV/dt < 0 for the time interval $\tau$ if $p(t; p^0)$ is not
an equilibrium vector. A similar argument holds for any time interval so that dV/dt
< 0 for all t, if $p(t; p^0)$ is not an equilibrium vector. Similarly, we can show that
dv/dt > 0 for all t, if $p(t; p^0)$ is not an equilibrium vector. Hence the solution $p(t; p^0)$
is contained in a compact set $\{p : v(p^0) \le p_i/\hat{p}_i \le V(p^0),\ i = 1, 2, \ldots, n\}$ of positive
vectors.¹⁸ Hence by applying Theorem 3.H.4, every limit point of p(t) as t tends to
infinity is an equilibrium point. Hence there exists a sequence $t_q$ such that $t_q \to \infty$
($q \to \infty$) and

$\lim_{q \to \infty} \dfrac{p_i(t_q)}{\hat{p}_i} = 1, \quad i = 1, 2, \ldots, n$

Hence both $V[p(t_q; p^0)]$ and $v[p(t_q; p^0)]$ go to unity as $q \to \infty$. But

$v[p(t; p^0)] \le \dfrac{p_i(t; p^0)}{\hat{p}_i} \le V[p(t; p^0)]$

for all t and i = 1, 2, ..., n, and $V[p(t; p^0)]$ and $v[p(t; p^0)]$ are both bounded
and monotonic. Therefore $\lim_{t \to \infty} p_i(t)/\hat{p}_i$ always exists and equals 1, for i =
1, 2, ..., n. This proves the uniqueness and global stability of the equilibrium price
vector.
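
The behavior of the largest and smallest price ratios can be traced in a small numerical sketch. The economy used here is a hypothetical one chosen only because its equilibrium is known in closed form: three goods, excess demands f_j(p) = a_j (p_1 + p_2 + p_3)/p_j - 1 with budget shares `SHARES` (homogeneous of degree zero and gross substitutes), so that every equilibrium is proportional to `SHARES`. Since Walras' law makes the length of p constant along dp/dt = f(p), the sketch takes as reference equilibrium the point of that ray with the same length as the initial price vector. The Euler step size and all names are assumptions of the sketch.

```python
import math

SHARES = (0.5, 0.3, 0.2)  # hypothetical budget shares, summing to 1


def excess_demand(p):
    """f_j(p) = a_j * sum(p) / p_j - 1 for the illustrative economy."""
    income = sum(p)
    return [a * income / pj - 1.0 for a, pj in zip(SHARES, p)]


def reference_equilibrium(p0):
    """Equilibrium c * SHARES having the same Euclidean norm as p0.

    Walras' law gives p . f(p) = 0, so the norm of p(t) is constant
    along the exact solution of dp/dt = f(p).
    """
    norm_p0 = math.sqrt(sum(x * x for x in p0))
    norm_shares = math.sqrt(sum(a * a for a in SHARES))
    return [norm_p0 / norm_shares * a for a in SHARES]


def run(p0, dt=1e-3, steps=20000):
    """Record max and min of p_i / p_hat_i along a forward-Euler path."""
    p_hat = reference_equilibrium(p0)
    p = list(p0)
    upper, lower = [], []
    for _ in range(steps + 1):
        ratios = [pj / hj for pj, hj in zip(p, p_hat)]
        upper.append(max(ratios))
        lower.append(min(ratios))
        f = excess_demand(p)
        p = [pj + dt * fj for pj, fj in zip(p, f)]
    return upper, lower


upper, lower = run((1.0, 1.0, 1.0))
```

In this run the largest ratio never rises and the smallest never falls, and both end close to one, which is the squeezing argument of the text in numerical form. The Euler method lets the length of p drift slightly, so the limit is one only up to a small discretization error.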
FOOTNOTES

1. We may show that if there is Liapunov stability for some initial time $t^0$, then there
is Liapunov stability for any other initial time $t^1$, provided that all motions are
continuous in the initial state.
2. In the above definition of Liapunov stability, if $\delta$ can be chosen independently of
$t^0$, then z is said to be uniformly Liapunov stable. If z is not Liapunov stable,
z is called unstable. An example of a uniformly Liapunov stable equilibrium is
z = 0 in $dx/dt = 0$, $x \in R$.
3. As an example, consider $dx/dt = -x/(t + 1)$, $x \in R$. The solution can be written as
$x(t; x^0, t^0) = x^0 (t^0 + 1)/(t + 1)$. Then z = 0 is asymptotically locally stable,
since (1) $\| x(t; x^0, t^0) \| \le \| x^0 \|$ whenever $t \ge t^0$ (Liapunov stable) and
(2) $x(t; x^0, t^0) \to 0$ as $t \to \infty$. If condition (i) in the definition is replaced by uniform
Liapunov stability (that is, $\delta$ does not depend on $t^0$) and if r and T do not depend on
$t^0$ in condition (ii), then z is said to be uniformly (asymptotically) locally stable.
As an example, consider $dx/dt = -x$, $x \in R$ and z = 0. The point z = 0 in the above
example, $dx/dt = -x/(t + 1)$, $x \in R$, is not uniformly locally stable. See Yoshizawa
[10], p. 96.
4. The phrase "globally stable" can be replaced by "stable in the large." Similarly,
"locally stable" can be replaced by "stable in the small." The word "asymptotically"
is often dropped in economics literature (although this is not usually the case in
mathematics literature).
5. Often z is simply called (asymptotically) uniformly globally stable; that is, the
word "strongly" is omitted. In this case no special name is given to the stability
property of z under (S4).
6. That is, V(x) is "positive definite." In general, any real-valued continuous function
V(x, t), defined on $X \otimes (-\infty, \infty)$ where $X \subset R^n$, is said to be positive definite in
region $D \subset X$, if there exists a continuous real-valued function W(x) defined on D
such that $V(x, t) \ge W(x)$ for all t where $W(x) > 0$ if $x \ne 0$ and $W(0) = 0$.
7. That is, $\dot{V} < 0$ along the solution path $x(t; x^0, t^0)$. We may define such a $\dot{V}(x)$ by
$\dot{V}(x) \equiv V_x(x) \cdot f(x)$, noting $dx/dt = f(x)$, where $V_x$ is the gradient vector of V. Clearly
$\dot{V}(0) = 0$ since f(0) = 0. Note that condition (ii) implies that the origin is the unique
equilibrium point, since if $z \ne 0$ is an equilibrium point [that is, f(z) = 0], then
$\dot{V}(z) = V_x(z) \cdot f(z) = 0$, contradicting condition (ii).
8. The sign of $\varepsilon$ is crucial. If $\varepsilon < 0$, it is known that the only equilibrium point is the
origin and it is unstable. Moreover, there is a unique "limit cycle" which surrounds
the origin. In economic theory such a differential equation (with $\varepsilon < 0$) is associated
with the so-called "Kalecki-Kaldor model" of business cycles. The limit cycle is
supposed to constitute the "business cycles." The name "van der Pol" is due to his
article, "Relaxation-Oscillations," Philosophical Magazine, series 7, vol. 2, November
1926.
9. This form is known as the "Liénard form." The present example is discussed in
Yoshizawa [10].
10. Note that $dV[x(t; x^0, t^0), t]/dt = V_x \cdot f[x(t; x^0, t^0), t] + \partial V/\partial t$.
11. That is, $\| f(x, t) - f(y, t) \| \le k \| x - y \|$ where k is a positive constant.
12. This theorem is due to Massera. See J. L. Massera, "Contributions to Stability
Theory." Annals of Mathematics, 64, 1956.
13. The following remark by Kalman and Bertram ([3], p. 371) is quite instructive.
The principal idea of the second method is contained in the following physical
reasoning: If the rate of change dE(x)/dt of the energy E(x) of an isolated
physical system is negative for every possible state x, except for a single
equilibrium state xe, then the energy will continually decrease until it finally
assumes its minimum value E(xe). In other words, a dissipative system per-
turbed from its equilibrium state will always return to it.
14. This illustration is from Uzawa [9], pp. 623-624.
15. The use of such proxies rather than the distance itself together with Theorem 3.H.4
simplifies Uzawa's proof of global stability considerably.
16. In general, V and v are not necessarily differentiable in t, although they are continuous
in t. However, the nondifferentiable case can be analyzed analogously. See
Uzawa [9], pp. 623-624. Note that Theorem 3.H.4 requires only the continuity
(and not the differentiability) of the modified Liapunov function.
17. For, otherwise, p(t) is an equilibrium.
18. This proves $p(t; p^0) > 0$ for all $t \ge 0$, as long as $p^0 > 0$.

REFERENCES

1. Antosiewicz, H., "A Survey of Liapunov's Second Method," in Contribution to
Nonlinear Oscillations IV, Princeton, N.J., Princeton University Press, 1958.
2. Hahn, W., Theory and Application of Liapunov's Direct Method, Englewood Cliffs,
N.J., Prentice-Hall, 1963 (German original, 1959).
3. Kalman, R. E., and Bertram, J. E., "Control System Analysis and Design Via the
`Second Method' of Liapunov, I: Continuous-Time System," Journal of Basic
Engineering, June 1960.
4. LaSalle, J., and Lefschetz, S., Stability by Liapunov's Direct Method, New York,
Academic Press, 1961.
5. Liapunov, A. M., "Problème général de la stabilité du mouvement," Annales de la
Faculté des Sciences de l'Université de Toulouse (2), 9, 1907, pp. 203-247 (in French).
Reprinted in Annals of Mathematical Study No. 17, Princeton, N.J., Princeton
University Press, 1949.
6. Liapunov, A. M., Stability of Motion, New York, Academic Press, 1966.
7. McKenzie, L. W., "Stability of Equilibrium and the Value of Positive Excess
Demand," Econometrica, 28, 1960.
8. Negishi, T., "The Stability of a Competitive Economy: A Survey Article,"
Econometrica, 30, October 1962.
9. Uzawa, H., "The Stability of Dynamic Processes," Econometrica, 29, October 1961.
10. Yoshizawa, T., Introduction to Differential Equations, Tokyo, Asakura-Shoten, 1967
(in Japanese).
11. Zubov, V. I., Methods of A. M. Liapunov and their Application, Netherlands,
Noordhoff, Groningen, 1964.
FROBENIUS THEOREMS, DOMINANT DIAGONAL
MATRICES, AND APPLICATIONS

Section A
INTRODUCTION

In a general equilibrium model of an economy (such as the one we discussed
in Chapter 2), there are usually two types of economic agents (consumers and
producers) and various commodities. Each commodity is either reproducible or
nonreproducible. The nonreproducible commodities are called "primary factors."
In this section we call the reproducible commodities "goods." Each good is used
for the production of other goods and/or consumption. In other words, it is
demanded by producers and/or consumers, the two types of economic agents.
When a general equilibrium model such as the Walras-Cassel model was popular-
ized in the 1920s and 1930s, Wassily Leontief conceived of doing the empirical
groundwork for it and actually attempted to do so using the U.S. economy as
an example [8]. Thus we note that his path-breaking book has the subtitle
"An Empirical Application of Equilibrium Analysis."
Let $a_{ij}$ be the amount of the ith good used to produce one unit of the jth
good (i, j = 1, 2, ..., n). Let $x_j$ be the amount of the jth good produced, and let $c_i$
be the amount of the ith good used for (final) consumption purposes. Then the
demand = supply equilibrium relation for each good is written as

(1)  $\sum_{j=1}^{n} a_{ij} x_j + c_i = x_i, \quad i = 1, 2, \ldots, n$

or, in vector notation,

(1')  $A x + c = x$, where $A = [a_{ij}]$

Here it is assumed that there is only one productive process which produces
each good so that $a_{ij}$ is constant for all i and j. (This assumption was adopted
by Cassel and copied by Leontief.) The matrix A is called the input-output matrix.
It was actually empirically estimated by Leontief [8]. To illustrate the use of the
above relation, rewrite it as


(2)  $x = (I - A)^{-1} c$

Here the matrix (I - A) is assumed to be nonsingular. Now if we can predict the
final consumption demand c, then x can be immediately computed from the
above formula. Thus we can predict the output of each industry. This device was
successfully used by Leontief for a post-World War II forecast and has become
very popular since then. The input-output matrix (A) has been empirically estim-
ated for many countries and its size has been considerably expanded in several
of the countries. Many applications have been discovered for this type of study,
and it is now called "input-output analysis." See, for example, Chenery and Clark
[1] and Fukuchi [4]. It is considered an important tool for the analysis of
economic decisions by governments.
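
As a concrete illustration of this forecasting use, the sketch below computes gross outputs x from a given final demand c by solving (I - A)x = c with Gaussian elimination. It is a minimal example and not Leontief's own procedure; the function names and the two-industry coefficients are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def solve_linear(B, c):
    """Solve B x = c by Gaussian elimination with partial pivoting."""
    n = len(B)
    # Augmented matrix [B | c], copied so the inputs are left untouched.
    M = [[float(v) for v in row] + [float(ci)] for row, ci in zip(B, c)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is (numerically) singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def gross_output(A, c):
    """Gross outputs x satisfying A x + c = x, that is, (I - A) x = c."""
    n = len(A)
    B = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)] for i in range(n)]
    return solve_linear(B, c)


# Hypothetical input coefficients and final demand for two industries.
A = [[0.2, 0.3],
     [0.4, 0.1]]
c = [10.0, 20.0]
x = gross_output(A, c)
```

For these numbers the determinant of I - A is 0.6, and the solution is x_1 = 25 and x_2 = 100/3. Substituting back confirms A x + c = x for both goods: 0.2(25) + 0.3(100/3) + 10 = 25 and 0.4(25) + 0.1(100/3) + 20 = 100/3.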
Let us return to the basic relation of the input-output analysis, (I - A) x =
c. Although this looks like a very simple relation, it conceals many difficult ques-
tions. At the outset we can immediately raise the following questions.
(i) (The existence problem) For any given $c \ge 0$, can we guarantee that there
exists an $x \ge 0$ such that (I - A)x = c? If so, is such an x unique?
(ii) (The nonsingularity problem) Is the matrix (I - A) nonsingular? If so, is
$(I - A)^{-1} \ge 0$?

Clearly these two questions are very closely related. In fact, we can prove
that (i) is answered affirmatively if and only if (ii) is answered in the affirmative.
However, whether these questions can be answered in the affirmative is not at all
obvious; (I - A) x = c involves n equations and n unknowns for a given c, but this
certainly does not guarantee the existence of x or its nonnegativity.
The study of the existence problem produced the following interesting
conditions as necessary and sufficient for (i) to be answered in the affirmative.

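(3) (H-S)  $b_{11} > 0, \quad \begin{vmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{vmatrix} > 0, \quad \ldots, \quad \begin{vmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots & \vdots & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nn} \end{vmatrix} > 0$

where $b_{ij}$ is the i-j element of matrix $B \equiv (I - A)$. In other words, for any $c \ge 0$
there exists a unique $x \ge 0$ such that $A x + c = x$ if and only if all the successive
principal minors of (I - A) are positive. The condition (H-S) is now known as the
Hawkins-Simon condition.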
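
The condition is easy to test mechanically for a given coefficient matrix. The sketch below computes the successive (leading) principal minors of I - A and reports whether all of them are positive. The function names and the two sample matrices are hypothetical illustrations, and the determinant routine is a plain Gaussian elimination written for this example.

```python
def determinant(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    n = len(M)
    M = [[float(v) for v in row] for row in M]  # work on a copy
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            det = -det  # a row swap flips the sign
        det *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n):
                M[r][k] -= factor * M[col][k]
    return det


def successive_principal_minors(A):
    """Determinants of the upper-left k x k blocks of I - A, k = 1, ..., n."""
    n = len(A)
    B = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)] for i in range(n)]
    return [determinant([row[:k] for row in B[:k]]) for k in range(1, n + 1)]


def hawkins_simon(A):
    """True if every successive principal minor of I - A is positive."""
    return all(m > 0 for m in successive_principal_minors(A))


# Two hypothetical coefficient matrices.
A_ok = [[0.2, 0.3], [0.4, 0.1]]    # minors 0.8 and 0.6
A_bad = [[0.5, 0.6], [0.9, 0.2]]   # minors 0.5 and -0.14
```

Here `hawkins_simon(A_ok)` holds, while `hawkins_simon(A_bad)` fails at the second minor: (1 - 0.5)(1 - 0.2) = 0.40 is smaller than 0.6 × 0.9 = 0.54, so in that example the two industries together use up more of the goods than they produce.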
In order to obtain an intuitive understanding of this condition and the
Leontief system, let us consider a simple two-industry (say, steel and coal) input-
output model. In this case, the Hawkins-Simon condition is expressed as
(4)  $1 - a_{11} > 0$ and $\begin{vmatrix} 1 - a_{11} & -a_{12} \\ -a_{21} & 1 - a_{22} \end{vmatrix} > 0$

Note that the second of the above conditions may also be written as

(5)  $(1 - a_{11})(1 - a_{22}) > a_{12} a_{21}$

This coupled with $1 - a_{11} > 0$ in (4) implies $1 - a_{22} > 0$.

The conditions that $1 - a_{11} > 0$ and $1 - a_{22} > 0$ obviously mean that each
industry produces positive net output of its own good. To ease the understanding
of (5), suppose, for example, that $a_{22} = 0$, which means that the second industry
(say, coal) requires none of its own input (that is, coal). Then equation (5) states
that $1 > a_{11} + a_{12} a_{21}$, where $a_{11}$ is the amount of steel required to produce a
ton of steel directly and $a_{12} a_{21}$ represents the amount of steel required to make
coal to make a ton of steel. That is, $(a_{11} + a_{12} a_{21})$, the total requirement of steel
to produce a ton of steel as direct and indirect input, must be less than the
amount of output (that is, a ton of steel). In other words, an intuitive meaning
of condition (5) is that the unit production of any good must use less than one
unit of itself as direct and indirect input. When the number of industries is more
than two, there are additional strings of determinants than those indicated by
(4), but the interpretation would always be that all subgroups of industries should
be "self-sustaining," directly and indirectly.
The nature of the Leontief system and the Hawkins-Simon condition can be
made more specific with the aid of a diagram. For this purpose, we first define two
vectors $a^1$ and $a^2$ by

(6)
$$
a^1 \equiv \begin{bmatrix} 1 - a_{11} \\ -a_{21} \end{bmatrix}, \qquad
a^2 \equiv \begin{bmatrix} -a_{12} \\ 1 - a_{22} \end{bmatrix}
$$

Clearly, $a^1$ and $a^2$ signify the input-output combination involved in a one-unit
operation (that is, one unit of gross output) of the first and the second industry,
respectively. In Figure 4.1, points A and B, respectively, denote $a^1$ and $a^2$. Point
C denotes the final consumption vector $c$, and $(I - A) \cdot x = c$ can be written as
$a^1 x_1 + a^2 x_2 = c$. Points A and B, respectively, denote the vectors $a^1 x_1$ and
$a^2 x_2$, and point C is obtained by the parallelogram law of vector addition from
the two points A and B.
Notice that in Figure 4.1 the positive angle between the OA and OB rays
(that is, the angle $\theta$) is less than 180°. As long as this condition holds, for any
point in the nonnegative orthant of the ($c_1$-$c_2$)-plane except the origin, such as
point C, we can find an $x_1 \geq 0$ and an $x_2 \geq 0$ (not vanishing simultaneously) such
that $a^1 x_1$ and $a^2 x_2$ (such as points A and B) add up to $c$. It is easy to see that if
the slope of the OA ray is equal to or greater than the slope of the OB ray (that is,
the angle $\theta$ is equal to or greater than 180°), then there does not exist a point in
the nonnegative orthant of the ($c_1$-$c_2$)-plane (except the origin) such that $x_1 \geq 0$
and $x_2 \geq 0$ and $a^1 x_1 + a^2 x_2 = c$. Hence with the assumption of $1 - a_{11} > 0$ and
$1 - a_{22} > 0$ tacitly made in the construction of Figure 4.1, a necessary and
sufficient condition for an $x \geq 0$ to exist to satisfy $(I - A) \cdot x = c$ for any $c \geq 0$
is, for the two-industry case, that the slope of the OA ray be flatter than the
slope of the OB ray. In other words,

(7)
$$
\frac{a_{21}}{1 - a_{11}} < \frac{1 - a_{22}}{a_{12}}
$$

which is nothing but condition (5) in the Hawkins-Simon condition.

Figure 4.1. An Illustration of the Hawkins-Simon Condition.

Let $c$ be a final demand vector. In order to obtain this bundle of goods, we
need the "first-round" input requirements of the goods, namely, $A \cdot c$. But to
obtain the bundle $A \cdot c$, we need the "second-round" input requirements of the
goods, that is, $A \cdot (A \cdot c) = A^2 \cdot c$. Then on the third round, and so on, ad infinitum.
Therefore the total requirements of the goods would be

(8) $c + A \cdot c + A^2 \cdot c + \cdots + A^k \cdot c + \cdots$

Now the question arises whether this infinite series converges and is in fact equal
to the bundle of goods produced in the economy, $x$. That is, the following problem
is posed.
THE CONVERGENCE PROBLEM: Does the above series converge? If so,
can we assert that

(9) $x = \sum_{k=0}^{\infty} A^k \cdot c$, where $A^0 \equiv I$?

It turns out that this problem can be answered in the affirmative if and only
if the existence problem or the nonsingularity problem is answered in the affirma-
tive, or if and only if the Hawkins-Simon condition holds.
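
The round-by-round argument can be illustrated numerically. The sketch below, which is not part of the original text, accumulates the partial sums $c + A \cdot c + A^2 \cdot c + \cdots$ and compares the result with the gross output that solves $(I - A) \cdot x = c$. The matrix, the demand vector, and the helper names `mat_vec` and `total_requirements` are illustrative assumptions.

```python
def mat_vec(A, v):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def total_requirements(A, c, rounds=200):
    """Partial sum c + A c + A^2 c + ... over the given number of rounds."""
    total = list(c)
    term = list(c)
    for _ in range(rounds):
        term = mat_vec(A, term)          # next-round input requirements
        total = [t + u for t, u in zip(total, term)]
    return total


# Hypothetical two-industry example; it satisfies the Hawkins-Simon condition.
A = [[0.2, 0.3],
     [0.4, 0.1]]
c = [10.0, 20.0]

x = total_requirements(A, c)
# If the series has converged to the gross output, (I - A) x - c is close to 0.
residual = [xi - axi - ci for xi, axi, ci in zip(x, mat_vec(A, x), c)]
print(x, residual)
```

Solving the two equations by hand for these numbers gives $x = (25, 100/3)$, which the partial sums approach.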
Empirically, the input-output matrix $A$ is typically estimated for a particular
year. Suppose that for that year we have $c > 0$ and $x > 0$. In other words, we have

(10) For a given particular $c > 0$, there exists an $x > 0$ such that $(I - A) \cdot x = c$

Clearly if the answer to the existence problem is affirmative, (10) holds. What
about the converse? We will prove later that the converse also holds. Then con-
dition (10) becomes a necessary and sufficient condition for an affirmative answer
to the existence problem, the nonsingularity problem, and the convergence
problem, and it is also necessary and sufficient for the Hawkins-Simon condition
(see Section C).
Obviously condition (10) can be restated as follows:
(11) There exists an $x \geq 0$ such that $(I - A) \cdot x > 0$
It will be shown later (see Section D) that condition (11) is equivalent to the
following:
(12) There exists a $p \geq 0$ such that $(I - A)' \cdot p > 0$
where $(I - A)'$ is the transpose of $(I - A)$ and $p$ may be interpreted as a "price"
vector. The amount of "value-added" (per unit output) by the $j$th industry, $v_j$,
can be defined as
(13) $v_j \equiv p_j - \sum_{i=1}^{n} p_i a_{ij}, \quad j = 1, 2, \ldots, n$
Hence condition (12) can be interpreted as implying the existence of a price
vector p > 0 such that the vj computed by use of this p is positive for all j. Obviously
the vj's go to other factors of production such as labor.
Consider the following sums of the coefficients ($a_{ij}$'s) of the input-output
matrix:
(14-a) $r_i \equiv \sum_{j=1}^{n} a_{ij}, \quad i = 1, 2, \ldots, n$
(14-b) $s_j \equiv \sum_{i=1}^{n} a_{ij}, \quad j = 1, 2, \ldots, n$
where $r_i$ and $s_j$ signify the $i$th row sum and the $j$th column sum, respectively. Let
$r \equiv \max r_i$ for $i = 1, 2, \ldots, n$ and $s \equiv \max s_j$ for $j = 1, 2, \ldots, n$. In the course of
the study of the input-output matrix, it has become clear that either of the following
two conditions, (15-a) and (15-b), is sufficient to answer the existence problem,
the nonsingularity problem, and the convergence problem in the affirmative:

(15-a) $r < 1$

(15-b) $s < 1$

The conditions (15-a) and (15-b) are known as the (Brauer-) Solow conditions (see
Section C). Condition (15-a) should not be surprising to the reader, for it simply
asserts a special case of condition (11). That is, condition (15-a) asserts that
condition (11) is realized by an $x$ whose elements are all equal to one. Similarly,
condition (15-b) asserts a special case of condition (12) [choose $p$ in (12) with
elements all equal to one]. In other words, condition (15-b) states that if we choose
the unit of measurement of each good properly so that the price of each good
is equal to one, then the value-added (per unit output) of each good is positive.
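
The row-sum and column-sum conditions are easy to compute directly. The following sketch, not from the original text, evaluates both for a hypothetical coefficient matrix in which the column-sum condition holds but the row-sum condition fails, which illustrates that either one alone is enough to invoke the result. The function names and the numbers are illustrative assumptions.

```python
def row_sums(A):
    """r_i: sum of the coefficients in row i."""
    return [sum(row) for row in A]


def col_sums(A):
    """s_j: sum of the coefficients in column j."""
    return [sum(col) for col in zip(*A)]


def solow_row_condition(A):
    """Largest row sum strictly below one."""
    return max(row_sums(A)) < 1


def solow_col_condition(A):
    """Largest column sum strictly below one."""
    return max(col_sums(A)) < 1


# Hypothetical coefficients: row sums are 1.1 and 0.3, column sums 0.6 and 0.8.
A = [[0.5, 0.6],
     [0.1, 0.2]]

# With all prices equal to one, value-added per unit of output of good j
# is 1 minus the j-th column sum.
value_added = [1 - s for s in col_sums(A)]
print(solow_row_condition(A), solow_col_condition(A), value_added)
```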
In the course of these studies it was realized that the input-output matrix,
$A$, has a special property, that is, all of its elements are nonnegative. By imposing
this special nonnegativity restriction on the matrix, it was conjectured that we
should be able to obtain stronger results than those listed in the usual textbooks
on matrix algebra. Looking back into journals of mathematics, economists found
that such matrices had been discussed at the beginning of the century by the
German mathematicians Perron and Frobenius. Hence the theorem, now called
the "(Perron-) Frobenius theorem," suddenly attracted a great deal of attention
from economists.1 A number of papers (by Metzler, Debreu-Herstein, Solow,
Chipman, Morishima, Goodwin, and so on) have been published on the properties
of $A$ (the nonnegative matrix). (See Section C.) By using the properties of such
an $A$, the nature of the $(I - A)$ matrix was made precise and clear (see Section D).
All of these studies are treated in a unified fashion by McKenzie [9] and Nikaido
[13]. The unifying concept here is taken from condition (11) or condition (12).
McKenzie [9] thus discovered the relevance of the concept of "dominant diagonal
matrices" in this context (see Sections C and D).
In retrospect, many of the results thus discovered with a great deal of effort
by economists were already known, especially among Russian mathematicians.
For this reason Karlin writes ([7], p. 289),
Since 1908 there have been innumerable extensions and applications of the
Frobenius results. ... An impressive collection of these extensions is found
in Gantmacher and Krein, which in addition develops much of the finer
structure of the theory of positive matrices. ... Unfortunately, economists
continue to rediscover many of these theorems and to assign them thoroughly
inaccurate priorities.
This comment may be rather harsh. We may add that the studies by Nikaido [13]
and McKenzie [9], mentioned above, provide us with an interesting unifying view,
which is certainly new in the literature. More importantly, we cannot overstress
the fact that the structure of these mathematical theorems has been straightened
out and their nature made intuitively clear to economists (as indicated above)
through the Leontief input-output system.2
An interesting by-product of the study of the input-output matrix is the
study of the stability of a competitive market. The discussion of linear approximation
stability is essentially concerned with the stability of the equilibrium of a
differential equation system of the form $\dot{p} = A \cdot p$. In connection with the stability
of this system, Metzler [10] (see also Chapter 3, Section C) found that if $A = [a_{ij}]$
is such that $a_{ij} > 0$, $i \neq j$, then Hicks' stability condition is equivalent to the
stability of the equilibrium of the above differential equation. Recall that
Hicks' condition is that the successive principal minors of $A$ alternate in sign.
This condition is obviously equivalent to saying that all the successive principal
minors of $(-A)$ are positive; in other words, $(-A)$ satisfies the Hawkins-Simon
condition!
Hence the study of the stability of a competitive market and input-output
analysis developed in parallel, and the unified theory as presented by McKenzie or
Nikaido has applications in both fields.
Clearly the economic applications of the theory of such matrices are not
confined to these two fields. For example, Metzler [11], Chipman [2], and Goodwin
[6] in the 1950s considered an application to the multisectoral income
propagation of the Keynesian type model. Later, economists found applications
to the dynamic Leontief theory, the substitution theorem, the turnpike theorem,
and so forth. The purpose of this chapter is to study the properties of such matrices
in a systematic fashion and to show some of the applications. The theory
developed in this chapter is an extremely useful technique in economics and the
reader may find many unknown applications for it.3 Some of the important
applications are explained in Section D.
A brief summary of this chapter is now in order. The most useful results
of this chapter are summarized in the beginning of Section D. Section D then
develops important applications of these results to various topics of economic
theory (such as the input-output analysis, the stability of competitive equilibrium,
and comparative statics). Section D thus illustrates the use of the basic results
of this chapter, enabling and motivating the reader to investigate further applica-
tions of the results.
Sections B and C do the major groundwork for Section D. Section B high-
lights well-known "(Perron-) Frobenius theorems" on matrices whose entries are
all nonnegative. Important concepts such as "indecomposable matrix" and
"primitive matrix" are explained in this connection. Section C discusses matrices
with "dominant diagonals," which play a key role in the type of problems dis-
cussed in Section A. The Frobenius theorems are crucial in obtaining some of
the important results in Section C.
The reader may find that Sections B and C, especially some of the proofs
of the theorems in these sections, are rather tedious to read. In the first reading
of these sections, it may therefore be advisable for the reader to skip reading
these proofs altogether. In fact, to understand Section D, the reader can even
skip reading the statements of these theorems (especially those in Section C).
Only familiarity with some of the basic concepts is really required for the reading
of Section D.4

FOOTNOTES

1. Morishima ([12], p. 1) claims that Frobenius' theorems were rediscovered independently
by him, R. Goodwin, and T. Yasui.
2. For an excellent exposition of the input-output theory, see Dorfman, Samuelson,
and Solow (DOSSO) [3], for example.
3. In these applications, the matrices whose off-diagonal elements are all nonnegative
(such as the gross substitution matrix) or are all nonpositive (such as the Leontief
input-output matrix) turn out to yield very sharp results. To honor the name of a
pioneer of such a matrix, the matrix is often called the Metzler matrix.
4. The reader, if he so wishes, can therefore go directly to Section D, occasionally
referring to previous sections for the concepts used. After reading Section D, he may
then be more motivated to read the statements and proofs of the theorems in
Sections B and C.

REFERENCES

1. Chenery, H. B., and Clark, P. G., Interindustry Economics, New York, Wiley, 1959.
2. Chipman, J. S., The Theory of Inter-Sectoral Money Flows and Income Formation,
Baltimore, Md., Johns Hopkins University Press, 1951.
3. Dorfman, R., Samuelson, P. A., and Solow, R. M., Linear Programming and Economic
Analysis, New York, McGraw-Hill, 1958.
4. Fukuchi, T., Introduction to Linear Economics, Tokyo, Toyo Keizai Shimpo-sha,
1963 (in Japanese).
5. Gantmacher, F. R., The Theory of Matrices, Vol. II, New York, Chelsea Publishing
Co., 1959 (tr. from Russian).
6. Goodwin, R. M., "Does the Matrix Multiplier Oscillate?" Economic Journal, LX,
December 1950.
7. Karlin, S., Mathematical Methods and Theory in Games, Programming and Economics,
Vol. I, Reading, Mass., Addison-Wesley, 1959.
8. Leontief, W. W., The Structure of American Economy, 1919-1939, 2nd ed., New
York, Oxford University Press, 1951.
9. McKenzie, L. W., "Matrices with Dominant Diagonals and Economic Theory,"
in Mathematical Methods in the Social Sciences, 1959, ed. by Arrow, Karlin, and
Suppes, Stanford, Calif., Stanford University Press, 1960.
10. Metzler, L. A., "Stability of Multiple Markets: The Hicks Conditions," Econometrica,
13, October 1945.
11. Metzler, L. A., "A Multiple Region Theory of Income and Trade," Econometrica, 18,
October 1950.
12. Morishima, M., Interindustry Relations and Economic Fluctuations (Sangyo-renkan
to Keizai Hendo), Tokyo, Yuhikaku, 1955 (in Japanese).
13. Nikaido, H., Introduction to Sets and Mappings in Modern Economics, tr. by K. Sato,
Amsterdam, North-Holland, 1970 (the Japanese original, Tokyo, 1960).
14. Nikaido, H., Convex Structures and Economic Theory, New York, Academic Press, 1968.

Section B
FROBENIUS THEOREMS

We begin this section by recalling some of the basic concepts of linear
algebra which are important in this chapter. Let $A$ be an $m \times n$ matrix. Here we
assume that all the elements of $A = [a_{ij}]$ are real numbers. Note that $A$ is a linear
function from $R^n$ into $R^m$, in the sense that, for any $x$ and $y$ in $R^n$, we have
$A \cdot (x \pm y) = A \cdot x \pm A \cdot y$ and $A \cdot (\alpha x) = \alpha A \cdot x$ for any real number $\alpha$. On the other
hand, any linear function from $R^n$ into $R^m$ can be written in matrix form. Clearly,
$A$ is a continuous function from any subset $X$ of $R^n$ into $R^m$. Hence if $X$ is compact
in $R^n$, its image $A(X)$ is compact in $R^m$. From now on we will confine ourselves to a square
matrix, that is, let $A$ be an ($n \times n$) square matrix. Thus $A$ is a linear continuous
function from $R^n$ into itself. If we have $A \cdot x = \lambda x$, $x \neq 0$, where $x$ is an $n$-vector
and $\lambda$ is a scalar (real or complex), then we call $\lambda$ an eigenvalue or characteristic
root of $A$. The vector $x$ is called the eigenvector (or characteristic vector) associated
with $\lambda$. If we write the above equation in the form
(1) $(\lambda I - A) \cdot x = 0$
we can immediately see that if $\lambda$ is an eigenvalue, then equation (1) has a solution
$x \neq 0$, so that we have
(2) $\varphi(\lambda) \equiv |\lambda I - A| = 0$
That is, $(\lambda I - A)$ must be singular [here $|\lambda I - A|$ denotes the determinant of
$(\lambda I - A)$]. Conversely, if $\lambda$ is a solution of (2), then (1) has a nonzero solution $x$.
Hence (1) and (2) are really equivalent. Equation (2) is called the characteristic
equation or eigen equation (of $A$). Clearly we may write (2) as
$$
\varphi(\lambda) = \lambda^n + \alpha_1 \lambda^{n-1} + \cdots + \alpha_{n-1} \lambda + \alpha_n = 0
$$
where the $\alpha_i$'s are functions of the $a_{ij}$'s. By the fundamental theorem of algebra,
the above polynomial has $n$ (not necessarily distinct) roots. Each root is not
necessarily a real number. Hence an eigenvector is not necessarily a real vector.
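EXAMPLES:

(a) $A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$: $|\lambda I - A| = \begin{vmatrix} \lambda - 1 & -1 \\ -1 & \lambda - 1 \end{vmatrix} = \lambda^2 - 2\lambda = \lambda(\lambda - 2) = 0$, so $\lambda = 0$ and $2$.

(b) $A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$: $|\lambda I - A| = (\lambda - 1)^2 = 0$, so $\lambda = 1$ (double root).

(c) $A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$: $|\lambda I - A| = \lambda^2 + 1 = 0$, so $\lambda = \pm i$ (complex roots).

When $\lambda$ is a simple root of the characteristic equation, $\lambda$ is called the simple
eigenvalue (or simple root) of $A$.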
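
For a $2 \times 2$ matrix the characteristic equation is the quadratic $\lambda^2 - (\text{trace})\lambda + (\text{determinant}) = 0$, so its roots can be computed with the quadratic formula. The sketch below is not part of the original text; the function name `eigenvalues_2x2` and the two test matrices (a matrix of ones and a rotation-type matrix) are illustrative choices showing a real pair of roots and a complex pair.

```python
import cmath


def eigenvalues_2x2(A):
    """Roots of lambda^2 - (trace) * lambda + det = 0 for a 2 x 2 matrix."""
    (a, b), (c, d) = A
    trace = a + d
    determinant = a * d - b * c
    # cmath.sqrt handles a negative discriminant and returns a complex root.
    disc = cmath.sqrt(trace * trace - 4 * determinant)
    return ((trace + disc) / 2, (trace - disc) / 2)


ones = [[1, 1], [1, 1]]        # real roots
rotation = [[0, 1], [-1, 0]]   # purely imaginary roots

print(eigenvalues_2x2(ones))
print(eigenvalues_2x2(rotation))
```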
We will now restrict ourselves to an $n \times n$ matrix whose elements are all
nonnegative. First recall our conventions with regard to vector and matrix notation.
Here $x$ is a vector whose $i$th element is $x_i$, and $A$ is a matrix whose $i$-$j$
element is $a_{ij}$.

NOTATIONS:
(a) $x \geqq 0$ if $x_i \geq 0$ for all $i$
    $x \geq 0$ if $x_i \geq 0$ for all $i$ and $x_i > 0$ for some $i$ (that is, $x \geqq 0$ and $x \neq 0$)
    $x > 0$ if $x_i > 0$ for all $i$
(b) $A \geqq 0$ if $a_{ij} \geq 0$ for all $i$ and $j$
    $A \geq 0$ if $a_{ij} \geq 0$ for all $i$ and $j$ and $a_{ij} > 0$ for some $i$ and $j$
    $A > 0$ if $a_{ij} > 0$ for all $i$ and $j$
Although we are concerned here exclusively with an ($n \times n$) square matrix $A$, the
matrix $A$ in the above notation does not have to be square. If $A \geqq 0$, we call $A$
a nonnegative matrix, and if $A > 0$, we call $A$ a (strictly) positive matrix ($A \geq 0$ is
often called a semipositive matrix). One may consider $A$ as an input-output matrix
so that $a_{ij}$ denotes the amount of the $i$th input needed to produce a unit of the $j$th
output.

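Definition: A permutation is a one-to-one function from the set $\{1, 2, 3, \ldots, n\}$
onto itself. We denote it by

$$
\sigma = \begin{pmatrix} i_1 & i_2 & \cdots & i_n \\ j_1 & j_2 & \cdots & j_n \end{pmatrix}
$$

or

$$
\sigma(i_k) = j_k, \quad k = 1, 2, \ldots, n
$$

EXAMPLE: $\sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$ is a permutation of $1 \to 2$, $2 \to 3$, and $3 \to 1$ [that
is, $\sigma(1) = 2$, $\sigma(2) = 3$, and $\sigma(3) = 1$].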
A permutation matrix, usually denoted by P, is the one which is obtained by
permuting the columns (or rows) of the identity matrix. Or, more formally, it is
defined as follows.

Definition: An $n \times n$ matrix $P = [p_{ij}]$ is called a permutation matrix if, for some
permutation $\sigma$, $p_{\sigma(j)j} = 1$, $j = 1, 2, \ldots, n$ [resp. $p_{i\sigma(i)} = 1$, $i = 1, 2, \ldots, n$],
and if $p_{ij} = 0$ for all $i \neq \sigma(j)$ [resp. $p_{ij} = 0$ for all $j \neq \sigma(i)$].
REMARK: The identity matrix itself is a permutation matrix. Every per-
mutation matrix can be obtained by interchanging (two) columns (or rows)
of the identity matrix a finite number of times.
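EXAMPLE:

$$
P = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}
$$

is obtained by interchanging the columns as follows:

$$
I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\;\to\;
\begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}
\;\to\;
\begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}
$$

where the columns of the identity matrix appear in the orders (1 2 3), (3 2 1), and (2 3 1), respectively.

If $\sigma$ is some permutation $\begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma_1 & \sigma_2 & \cdots & \sigma_n \end{pmatrix}$, we denote by $P_\sigma$ the permutation
matrix obtained by permuting the columns of the identity matrix $I$ by $\sigma$. In the
above example, $\sigma$ was $\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$.1 Similarly, the transpose $P_\sigma'$ of $P_\sigma$ can be
obtained by permuting the rows of the identity matrix by $\sigma$. For example,

$$
P_\sigma' = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}
$$

is obtained by permuting the rows of the ($3 \times 3$) identity matrix by the above $\sigma$.
It can also be obtained by interchanging the rows as follows:

$$
I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\;\to\;
\begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}
\;\to\;
\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}
$$

where the rows of the identity matrix appear in the orders (1 2 3), (3 2 1), and (2 3 1), respectively.

It can be shown easily that $P_\sigma' = P_\sigma^{-1}$. Note that $P_\sigma^{-1} \cdot A \cdot P_\sigma$ (or $P_\sigma' \cdot A \cdot P_\sigma$) is the
matrix obtained by permuting the rows and the columns by $\sigma$.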
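EXAMPLE:

$$
\sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \qquad
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
$$

$$
P_\sigma^{-1} \cdot A \cdot P_\sigma = \begin{bmatrix} a_{22} & a_{23} & a_{21} \\ a_{32} & a_{33} & a_{31} \\ a_{12} & a_{13} & a_{11} \end{bmatrix}
$$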
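
The effect of a simultaneous row and column permutation can be verified with a few lines of code. The sketch below is not part of the original text; it builds the permutation matrix with $p_{\sigma(j)j} = 1$ and forms $P' \cdot A \cdot P$. The helper names are illustrative, indices are 0-based (so the permutation $1 \to 2$, $2 \to 3$, $3 \to 1$ is written `[1, 2, 0]`), and the entry `ij` of the sample matrix simply stands for $a_{ij}$.

```python
def perm_matrix(sigma):
    """Matrix with P[sigma[j]][j] = 1, i.e. the columns of I permuted by sigma."""
    n = len(sigma)
    P = [[0] * n for _ in range(n)]
    for j, s in enumerate(sigma):
        P[s][j] = 1
    return P


def transpose(M):
    return [list(row) for row in zip(*M)]


def mat_mul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


sigma = [1, 2, 0]            # 0-based form of 1 -> 2, 2 -> 3, 3 -> 1
A = [[11, 12, 13],
     [21, 22, 23],
     [31, 32, 33]]           # the entry "ij" stands for a_ij

P = perm_matrix(sigma)
renamed = mat_mul(mat_mul(transpose(P), A), P)   # P' A P, with P' = P^{-1}
for row in renamed:
    print(row)
```

The entry in position $(i, j)$ of the result is $a_{\sigma(i)\sigma(j)}$, which is the renumbering of rows and columns described in the text.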

REMARK: If $A$ is an input-output matrix, then $P_\sigma^{-1} \cdot A \cdot P_\sigma$ amounts to
renaming (or renumbering) the industries in the economy by the permutation
$\sigma$. If $\sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$, then the industries are renamed as follows:
$1 \to 3$, $2 \to 1$, and $3 \to 2$. Since the numbering or the naming of the industries
should not alter what is going on in the economy, we may sometimes wish to
perform $P_\sigma^{-1} \cdot A \cdot P_\sigma$ by choosing $\sigma$ properly.
We are now ready to introduce the first important concept.

Definition: An $n \times n$ matrix $A$ is called decomposable if there exists a permutation
matrix $P$ such that

$$
P^{-1} \cdot A \cdot P = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix}
$$

where $A_{11}$ and $A_{22}$ are square submatrices.2 If this is impossible, $A$ is called
indecomposable.
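REMARK: This definition can be restated as follows: $A$ is called "decomposable"
if (1) there exists a partition $\{J, K\}$ of $N \equiv \{1, 2, \ldots, n\}$, such that
$N = J \cup K$ and $J \cap K = \emptyset$ with $J \neq \emptyset$ and $K \neq \emptyset$, and (2) $a_{ij} = 0$ for $i \in K$
and $j \in J$. That is,

$$
\begin{array}{c|cc} & J & K \\ \hline J & A_{11} & A_{12} \\ K & 0 & A_{22} \end{array}
$$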

REMARK: If $A$ is the input-output matrix, we can interpret the decomposable
matrix as follows: The whole economy is partitioned into two groups
of industries, the $J$-group and the $K$-group. Any industry that belongs to
the $J$-group does not require any inputs from the industries in the $K$-group.
If $A$ can be transformed by a permutation matrix $P$ such that
$P^{-1} \cdot A \cdot P = \begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix}$, then $A$ is called completely decomposable. In
this case, not only do the industries in the $J$-group not require any inputs
from the $K$-group industries, but the $K$-group industries also do not require
any inputs from the $J$-group industries.
EXAMPLE:

(12 2
is decomposable by P, where a = 1". That is,
3

1 1 1

0 1 1

In the above example, the decomposability is really obvious without performing
$P_\sigma$. Another obvious case of decomposability is one in which all the
elements in some row (or column) are zero. However, there are many cases in
which this is not obvious. The following examples are given by Nikaido ([8],
pp. 83-85).

(i) $A_1 = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}$ is indecomposable

0 0 1 2
0 2
(ii) AZ = 0 0 is indecomposable
3
2 4 0 0

REMARK: The reader may wish to prove that A 1 and A2 in the above
examples are indeed indecomposable.
We now prove the following lemma.

Lemma 4.B.1: Let $A$ be a nonnegative indecomposable $n \times n$ matrix. Then
$[I + A]^{n-1} > 0$, where $I$ is the identity matrix.
PROOF: It suffices to show that for every $x \geq 0$, $[I + A]^{n-1} \cdot x > 0$, for then
by choosing

$$
x = (0, 0, \ldots, 0, 1, 0, \ldots, 0)
$$

we have $[I + A]^{n-1} > 0$. To show that $[I + A]^{n-1} \cdot x > 0$ for any $x \geq 0$, it,
in turn, suffices to show that the vector $y \equiv [I + A] \cdot x$ always has fewer zero
coordinates than $x$ does. Suppose the contrary. Note that $y = x + A \cdot x$ and
$A \cdot x \geqq 0$, so that for each positive coordinate of $x$, there corresponds a
positive coordinate of $y$. Thus $x$ cannot have more positive coordinates than
$y$. Hence $x$ has the same zero coordinates as $y$. Without loss of generality,
we may suppose that $x$ and $y$ have the form

$$
x = \begin{bmatrix} u \\ 0 \end{bmatrix}, \quad y = \begin{bmatrix} v \\ 0 \end{bmatrix}, \quad u > 0, \; v > 0
$$

where $u$ and $v$ are of the same dimension. Let

$$
A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
$$

where $A_{11}$ and $A_{22}$ are square submatrices. Then $x + A \cdot x = y$ means that
$A_{21} \cdot u = 0$. Since $u > 0$, $A_{21} = 0$. This contradicts the indecomposability of
$A$. (Q.E.D.)
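
The lemma suggests a simple computational test. For a nonnegative matrix, positivity of every entry of $[I + A]^{n-1}$ is implied by indecomposability (the lemma), and the converse is also a standard fact, so the test below can be used as a check of indecomposability. This is a minimal sketch, not part of the original text; the function name is an illustrative assumption, and only the zero/nonzero pattern is tracked since the actual magnitudes do not matter.

```python
def power_of_I_plus_A_is_positive(A):
    """True if every entry of (I + A)^(n-1) is positive, for nonnegative A."""
    n = len(A)
    # Zero/nonzero pattern of I + A: the diagonal is always positive.
    M = [[i == j or A[i][j] > 0 for j in range(n)] for i in range(n)]
    R = [[i == j for j in range(n)] for i in range(n)]   # pattern of (I + A)^0
    for _ in range(n - 1):
        # Boolean matrix product: entry is positive if some path term is.
        R = [[any(R[i][k] and M[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
    return all(all(row) for row in R)


cyclic = [[0, 1, 0],
          [0, 0, 1],
          [1, 0, 0]]          # every industry is linked to every other one
triangular = [[1, 1],
              [0, 1]]         # already in the block form of a decomposable matrix

print(power_of_I_plus_A_is_positive(cyclic))
print(power_of_I_plus_A_is_positive(triangular))
```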
We now state and prove the important theorems called the Frobenius
theorems.

Theorem 4.B.1 (Frobenius' theorem I): Let $A$ be a nonnegative indecomposable
$n \times n$ matrix. Then

(i) $A$ has an eigenvalue $\hat{\lambda} > 0$.
(ii) An eigenvector $\hat{x} > 0$ can be associated with $\hat{\lambda}$.
(iii) The eigenvector $\hat{x}$ is unique up to a scalar multiple; that is, if $y$ is an eigenvector
associated with $\hat{\lambda}$, then $y = \theta \hat{x}$ for some scalar $\theta$.
(iv) If $A \cdot x = \mu x$ for some $\mu \geq 0$ and $x \geq 0$, then $\mu = \hat{\lambda}$.
(v) If $\omega$ is any eigenvalue of $A$, then $|\omega| \leq \hat{\lambda}$.
(vi) The eigenvalue $\hat{\lambda}$ increases when any element of $A$ increases; that is, if $A_1 \geq A_2 \geqq 0$
and $A_1$ and $A_2$ are indecomposable,3 then $\hat{\lambda}_{A_1} > \hat{\lambda}_{A_2}$, where $\hat{\lambda}_{A_1}$ and $\hat{\lambda}_{A_2}$,
respectively, denote the $\hat{\lambda}$ associated with $A_1$ and $A_2$.
(vii) The eigenvalue $\hat{\lambda}$ is a simple root.

Definition: The root $\hat{\lambda}$ in the above theorem is called the Frobenius root of $A$; it
is denoted by $\hat{\lambda}_A$ or simply $\hat{\lambda}$.

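PROOF OF THEOREM 4.B.1 (WIELANDT [12]): Given $x \in R^n$, with $x \geq 0$, define
$\lambda_x \equiv \max \{\lambda : A \cdot x \geqq \lambda x, \lambda \in R\}$. Let $(A \cdot x)_i \equiv \sum_{j=1}^{n} a_{ij} x_j$ and define $\lambda_i(x)$ by

$$
\lambda_i(x) \equiv
\begin{cases}
(A \cdot x)_i / x_i & \text{if } x_i > 0 \\
\infty & \text{if } x_i = 0 \text{ and } (A \cdot x)_i \neq 0 \\
\text{undefined} & \text{if } x_i = 0 \text{ and } (A \cdot x)_i = 0
\end{cases}
$$

Observe that $\lambda_x$ can also be written as

$$
\lambda_x = \min_i \lambda_i(x)
$$

The concept of $\lambda_x$ is illustrated in Figure 4.2.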

We are interested in establishing the existence of an $\hat{x} \geq 0$ which maximizes
the value of $\lambda_x$ over $x \geq 0$, for it turns out later that such a maximum
value of $\lambda_x$ will give the Frobenius root of $A$, with which the eigenvector
$\hat{x}$ is associated. First note that $\lambda_x = \lambda_{\alpha x} \equiv \max \{\lambda : A \cdot (\alpha x) \geqq \lambda(\alpha x), \lambda \in R\}$,
where $\alpha$ is any positive number. Hence for the maximization of $\lambda_x$ over
$x \geq 0$, it suffices to restrict $x$ to the following portion of the unit sphere,
denoted by $S$:

$$
S \equiv \{x \in R^n : \sum_{i=1}^{n} x_i^2 = 1, \; x \geqq 0\}
$$

Figure 4.2. An Illustration of $\lambda_x$.

If the function $\lambda_x$ were continuous on $S$, then the existence of an $\hat{x}$ in $S$
which maximizes $\lambda_x$ is guaranteed by Weierstrass' theorem. However, $\lambda_x$
can have discontinuities at the boundary points of $S$ at which one of the
coordinates vanishes, although it is continuous for all $x > 0$.
An ingenious way to avoid this difficulty of possible discontinuities of
$\lambda_x$ on $S$ is used by Wielandt ([12], p. 644).4 Define the set $\tilde{S}$, in place of
$S$, by

$$
\tilde{S} \equiv \{y : y = (I + A)^{n-1} \cdot x, \; x \in S\}
$$

Clearly $\tilde{S}$ is compact, since $S$ is compact (Theorem 0.A.17). Moreover, by
Lemma 4.B.1, $\tilde{S}$ consists solely of positive vectors. Multiply both sides of
the inequality $A \cdot x \geqq \lambda_x x$ by $(I + A)^{n-1} > 0$, and obtain $A \cdot y \geqq \lambda_x y$, where
$y = (I + A)^{n-1} \cdot x$ and we use the fact that $A$ commutes with $(I + A)^{n-1}$.
Hence by the definition of $\lambda_y$, $\lambda_y \geq \lambda_x$. Hence, instead of
considering the maximization of $\lambda_x$ over $S$, it suffices to consider the maximization
of $\lambda_y$ over $\tilde{S}$. Since $y \in \tilde{S}$ implies $y > 0$, $\lambda_y$ is continuous on $\tilde{S}$
and achieves its maximum in $\tilde{S}$, say, at $\hat{z} > 0$, $\hat{z} \in \tilde{S}$.
Write $\hat{\lambda} \equiv \lambda_{\hat{z}}$. Denote by $\hat{x}$ every vector for which $\hat{\lambda} = \lambda_{\hat{x}}$. Clearly
$\hat{x} \geq 0$. We will prove that $\hat{\lambda}$ is a Frobenius root and $\hat{x}$ is its eigenvector.
In other words, we will prove each statement of Frobenius' theorem I.

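(i-a) $\hat{\lambda} > 0$. Let $u = (1, 1, \ldots, 1) \in R^n$. Then $\lambda_u = \min_{1 \leq i \leq n} \sum_{j=1}^{n} a_{ij}$. Since no rows
of an indecomposable matrix can consist only of zeros, $\lambda_u > 0$. Since $\hat{\lambda} \geq \lambda_u$,
$\hat{\lambda} > 0$.
(i-b) $\hat{\lambda}$ is an eigenvalue of $A$ and $\hat{x}$ is its eigenvector, that is, $A \cdot \hat{x} = \hat{\lambda}\hat{x}$. Suppose
that $A \cdot \hat{x} \neq \hat{\lambda}\hat{x}$. By the definition of $\hat{\lambda}$, $A \cdot \hat{x} \geqq \hat{\lambda}\hat{x}$, so that $A \cdot \hat{x} - \hat{\lambda}\hat{x} \geq 0$.
Multiply both sides of this inequality by $(I + A)^{n-1} > 0$ (recall Lemma 4.B.1),
and write $\hat{z} \equiv (I + A)^{n-1} \cdot \hat{x}$. Then

$$
A \cdot \hat{z} - \hat{\lambda}\hat{z} > 0
$$

This contradicts the definition of $\hat{\lambda}$, for it implies that $A \cdot \hat{z} - (\hat{\lambda} + \epsilon)\hat{z} > 0$ for
any sufficiently small $\epsilon > 0$, that is, $\lambda_{\hat{z}} \geq (\hat{\lambda} + \epsilon) > \hat{\lambda}$. Hence $A \cdot \hat{x} = \hat{\lambda}\hat{x}$.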

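(ii) $\hat{x} > 0$. Let $z \equiv (I + A)^{n-1} \cdot \hat{x}$. Then by Lemma 4.B.1, $z > 0$. But $(I + A)^{n-1} \cdot \hat{x} =
(1 + \hat{\lambda})^{n-1}\hat{x}$. Hence $0 < z = (1 + \hat{\lambda})^{n-1}\hat{x}$ so that $\hat{x} > 0$.
(iii) $\hat{x}$ is unique up to a scalar multiple. Let $y$ be another eigenvector associated with
$\hat{\lambda}$. Let $\theta \equiv \min_{1 \leq i \leq n} y_i/\hat{x}_i$, and let $\bar{y} \equiv y - \theta\hat{x}$. Then $A \cdot \bar{y} = A \cdot (y - \theta\hat{x}) =
\hat{\lambda}y - \theta\hat{\lambda}\hat{x} = \hat{\lambda}\bar{y}$. By definition of $\theta$, $\bar{y} \geqq 0$ with at least one zero coordinate, so
that $\bar{y} \not> 0$. Suppose $\bar{y} \neq 0$. Then $A \cdot \bar{y} = \hat{\lambda}\bar{y}$
means $\bar{y}$ is also an eigenvector associated with $\hat{\lambda}$. Hence using a proof
similar to that of (ii), $\bar{y} > 0$. This contradicts the condition that $\bar{y} \not> 0$. Hence
$\bar{y} = 0$ or $y = \theta\hat{x}$.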
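(iv) $A \cdot x = \mu x$ and $x \geq 0$ imply that $\mu = \hat{\lambda}$. Clearly $A'$ is indecomposable if and
only if $A$ is indecomposable, where $A'$ is the transpose of $A$. Denote the
$\hat{\lambda}$ for $A$ by $\hat{\lambda}_A$ and the $\hat{\lambda}$ for $A'$ by $\hat{\lambda}_{A'}$; then we can easily show that
$\hat{\lambda}_A = \hat{\lambda}_{A'}$. Hence there exists a $y > 0$ such that $A' \cdot y = \hat{\lambda}_A y$. By assumption,
$A \cdot x = \mu x$, $x \geq 0$. Consider the inner product $\mu\langle x, y \rangle = \langle \mu x, y \rangle =
\langle A \cdot x, y \rangle = \langle x, A' \cdot y \rangle = \hat{\lambda}_A \langle x, y \rangle$. Hence $(\hat{\lambda}_A - \mu)\langle x, y \rangle = 0$.
But $x \geq 0$, $y > 0$ imply that $\langle x, y \rangle > 0$, so that $\hat{\lambda}_A - \mu = 0$.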
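(v) $|\omega| \leq \hat{\lambda}$. Let $\omega$ be an eigenvalue of $A$, so that $A \cdot x = \omega x$ for some $x \neq 0$.
Taking the absolute values of both sides and using the triangular inequality,
we obtain

$$
|\omega| x^{+} \leqq A \cdot x^{+}
$$

where $x^{+}$ is obtained by replacing all the elements of $x$ by their absolute values.
Hence $|\omega| \leq \lambda_{x^{+}} \leq \hat{\lambda}$, where $\lambda_{x^{+}} \equiv \max \{\lambda : A \cdot x^{+} \geqq \lambda x^{+}, \lambda \in R\}$.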
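(vi) $A_1 \geq A_2 \geqq 0 \Rightarrow \hat{\lambda}_{A_1} > \hat{\lambda}_{A_2}$, if $A_1$ and $A_2$ are indecomposable.5 Let $C \equiv
\frac{1}{2}(A_1 + A_2)$; then $C$ is also indecomposable. Consider the $\hat{\lambda}$ for $C$ and denote it
by $\hat{\lambda}_C$. Let $z > 0$ be its eigenvector [the existence of $\hat{\lambda}_C$ and $z > 0$ are
guaranteed by (i) and (ii) above and the indecomposability of $C$]. Clearly
$\hat{\lambda}_C z = C \cdot z \leq A_1 \cdot z$, since $A_1 \geq C$. Let $y > 0$ be an eigenvector associated with
$\hat{\lambda}_{A_1'}$. Consider the inner product $\hat{\lambda}_C \langle y, z \rangle = \langle y, \hat{\lambda}_C z \rangle = \langle y, C \cdot z \rangle < \langle y, A_1 \cdot z \rangle =
\langle A_1' \cdot y, z \rangle = \langle \hat{\lambda}_{A_1'} y, z \rangle = \hat{\lambda}_{A_1'} \langle y, z \rangle$. Hence $(\hat{\lambda}_{A_1'} - \hat{\lambda}_C)\langle y, z \rangle > 0$. But $y > 0$
and $z > 0$ imply that $\langle y, z \rangle > 0$, so that $(\hat{\lambda}_{A_1'} - \hat{\lambda}_C) > 0$. Hence $\hat{\lambda}_{A_1} =
\hat{\lambda}_{A_1'} > \hat{\lambda}_C$. Similarly, we can show that $\hat{\lambda}_C > \hat{\lambda}_{A_2}$ (in view of $C \geq A_2$), so that
$\hat{\lambda}_{A_1} > \hat{\lambda}_{A_2}$.

(vii) $\hat{\lambda}$ is a simple root. (Proof omitted; see Gantmacher [4], p. 57, or Debreu and
Herstein [1], p. 599.) (Q.E.D.)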
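
The quantity $\lambda_x$ used in the proof is also convenient numerically, since it gives a lower bound on the Frobenius root for every trial vector. The sketch below is not part of the original text and is not Wielandt's argument itself: it repeatedly applies $(I + A)$ to a positive starting vector and evaluates the minimum ratio $(A \cdot x)_i / x_i$. Using $(I + A)$ rather than $A$ is a design choice made here to avoid the oscillation that plain repeated multiplication by $A$ can show for cyclic matrices. The function names, starting vector, and iteration count are illustrative assumptions.

```python
def mat_vec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def lambda_x(A, x):
    """Minimum of (A x)_i / x_i over coordinates with x_i > 0."""
    Ax = mat_vec(A, x)
    return min(a / v for a, v in zip(Ax, x) if v > 0)


def frobenius_root(A, iters=500):
    """Approximate the Frobenius root of an indecomposable nonnegative matrix."""
    n = len(A)
    x = [float(i + 1) for i in range(n)]          # arbitrary positive start
    for _ in range(iters):
        Ax = mat_vec(A, x)
        y = [xi + axi for xi, axi in zip(x, Ax)]  # y = (I + A) x
        total = sum(y)
        x = [v / total for v in y]                # rescale to keep numbers bounded
    return lambda_x(A, x), x


A = [[2.0, 1.0],
     [1.0, 3.0]]       # hypothetical positive matrix
root, vector = frobenius_root(A)
print(root, vector)
print(lambda_x(A, [1.0, 1.0]))   # a lower bound obtained from a trial vector
```

For this matrix the characteristic equation is $\lambda^2 - 5\lambda + 5 = 0$, so the Frobenius root is $(5 + \sqrt{5})/2$, and the trial vector $(1, 1)$ gives the lower bound 3.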
REMARK: The essential part of the above proof is (i). Note that (i) follows
from Weierstrass' theorem, an elementary property of compact sets. An
alternative proof using sequential compactness is provided by Nikaido [7].
Debreu and Herstein [1] gave a very simple and elegant proof by using
Brouwer's fixed point theorem. It should be stressed, however, that we do
not need such a powerful theorem to prove Frobenius' theorem I. Debreu
and Herstein's proof of (i) goes roughly as follows:

(a) $x \geq 0$ implies $A \cdot x \geq 0$. If $A \cdot x = 0$, then $A$ would have a column of zeros
and would not be indecomposable.
(b) $A$ has an eigenvalue $\hat{\lambda} > 0$. Let $\bar{S} \equiv \{x \in R^n : x \geq 0, \sum_{i=1}^{n} x_i = 1\}$ be the
($n - 1$) unit simplex. For each $x \in \bar{S}$, we define $T(x) \equiv [1/\lambda(x)] A \cdot x$
where $\lambda(x) > 0$ is so determined that $T(x) \in \bar{S}$ [by (a), such a $\lambda(x)$ exists
for every $x \in \bar{S}$]. The mapping $T(x)$ is illustrated in Figure 4.3. Clearly
$T(x)$ is a continuous function from $\bar{S}$ (which is compact and convex) into
itself. Hence by Brouwer's fixed point theorem there exists an $\hat{x}$ in $\bar{S}$ such
that $\hat{x} = T(\hat{x}) = [1/\lambda(\hat{x})] A \cdot \hat{x}$. Finally, let $\hat{\lambda} \equiv \lambda(\hat{x})$.

Figure 4.3. An Illustration of $T(x)$.

Frobenius' theorem I is concerned with indecomposable nonnegative
matrices. A similar theorem for arbitrary (hence possibly decomposable) nonnegative
matrices can be obtained by observing that every nonnegative matrix
$A \geqq 0$ can be represented as the limit of a sequence of positive matrices $A_q > 0$
(which are obviously indecomposable), that is

$$
A = \lim_{q \to \infty} A_q \quad (A_q > 0, \; q = 1, 2, \ldots)
$$
6

Theorem 4.B.2 (Frobenius' theorem II): Let $A$ be a nonnegative $n \times n$ matrix.
Then
(i) The matrix $A$ has an eigenvalue $\hat{\lambda} \geq 0$.
(ii) With $\hat{\lambda}$ we can associate an eigenvector $\hat{x} \geq 0$.
(iii) If $A \cdot x \geqq \mu x$ for some real number $\mu$ and $x \geq 0$, then $\hat{\lambda} \geq \mu$.
(iv) If $\omega$ is any eigenvalue of $A$, then $\hat{\lambda} \geq |\omega|$.
(v) If $A_1 \geqq A_2 \geqq 0$, then $\hat{\lambda}_{A_1} \geq \hat{\lambda}_{A_2}$.

PROOF: The proof is omitted.7 See Gantmacher [4], pp. 66-68, or Debreu
and Herstein [1], p. 600.
REMARK: In the above theorem, $\hat{\lambda}$ is again called the Frobenius root of $A$
and $\hat{\lambda}_{A_i}$ denotes the Frobenius root of $A_i$. Owing to the lack of indecomposability
we miss certain properties which hold for the indecomposable
case (Frobenius' theorem I). In particular, note that if $A$ is not indecomposable,
then
(i) The root $\hat{\lambda}$ can be zero.8
(ii) Some (not all) elements of $\hat{x}$ can be zero.
(iii) Both $x$ and $y$ with $y \neq \theta x$ for any $\theta \in R$ can be eigenvectors associated
with $\hat{\lambda}$.
(iv) The root $\hat{\lambda}$ is not necessarily a simple root.
REMARK: Let $\hat{\lambda}(A)$ be the Frobenius root of $A \geqq 0$ (decomposable or
indecomposable). Then we can easily show that

(i) $\hat{\lambda}(A) = \hat{\lambda}(A')$ where $A'$ is the transpose of $A$.
(ii) $\hat{\lambda}(\alpha A) = \alpha\hat{\lambda}(A)$ where $\alpha$ is any nonnegative real number.
(iii) $\hat{\lambda}(A^m) = \hat{\lambda}(A)^m$ where $m$ is any positive integer.
The proofs of these propositions are left to the interested reader.
In the above discussion we observed that when a matrix does not have the
indecomposability property, it also does not have all of the nice properties that
we obtained in Theorem 4.B.1. Now we will impose an additional restriction on
indecomposable nonnegative matrices to see whether we can obtain nicer proper-
ties for such matrices.

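Definition: An $n \times n$ indecomposable matrix $A$ is called imprimitive (or cyclic)
if there exists a permutation matrix $P$ such that

$$
P^{-1} \cdot A \cdot P = \begin{bmatrix}
0 & A_{12} & 0 & \cdots & 0 \\
0 & 0 & A_{23} & \cdots & 0 \\
\vdots & \vdots & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & A_{m-1,m} \\
A_{m1} & 0 & 0 & \cdots & 0
\end{bmatrix}
$$

Here the 0's in the diagonal are square matrices, but the $A_{i,i+1} \geq 0$ (and $A_{m1} \geq 0$) are not
necessarily square. If such a permutation does not exist, $A$ is called primitive (or acyclic).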
REMARK: This definition can also be stated as follows. An n x n indecomposable
matrix $A = [a_{ij}]$ is called imprimitive (otherwise primitive) if

(i) There exists a partition $\{J_1, J_2, \ldots, J_m\}$ of $N = \{1, 2, \ldots, n\}$ such that
$N = J_1 \cup J_2 \cup \cdots \cup J_m$, $J_k \cap J_l = \emptyset$ ($k \neq l$), $J_k \neq \emptyset$, $k = 1, 2, \ldots, m$,
and
(ii) $a_{ij} = 0$ ($i \notin J_{k-1}$, $j \in J_k$), and $\sum_{i \in J_{k-1}} a_{ij} > 0$ ($j \in J_k$), $k = 1, 2, \ldots, m$.

Here we regard $J_0$ as $J_m$. We note that this partition is not necessarily unique.

REMARK: Hence if A is imprimitive, it means that any industry which
belongs to the $J_k$-group industries does use the outputs of the $J_{k-1}$-group
industries but does not use the outputs of any other group of industries as
inputs.
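EXAMPLE: Nikaido ([7], section 21) gave the following example:

$$
A =
\begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 5 & 0 \\
0 & 0 & 0 & 3 & 0 & 0 \\
0 & 4 & 0 & 0 & 0 & 0 \\
6 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 2 & 0 & 0 & 0
\end{pmatrix}
$$

can be shown to be imprimitive.

1. Let $\sigma_1 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 1 & 4 & 2 & 6 & 3 & 5 \end{pmatrix}$; then by the permutation matrix $P_{\sigma_1}$, we have

$$
P_{\sigma_1}^{-1} \cdot A \cdot P_{\sigma_1} =
\begin{pmatrix}
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 4 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 5 \\
0 & 0 & 0 & 0 & 2 & 0 \\
0 & 3 & 0 & 0 & 0 & 0 \\
6 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
$$

Here the partition is $J_1 = \{1, 4\}$, $J_2 = \{2, 6\}$, $J_3 = \{3, 5\}$.

2. Also let $\sigma_2 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 1 & 6 & 3 & 4 & 2 & 5 \end{pmatrix}$; then

$$
P_{\sigma_2}^{-1} \cdot A \cdot P_{\sigma_2} =
\begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 2 & 0 & 0 & 0 \\
0 & 0 & 0 & 3 & 0 & 0 \\
0 & 0 & 0 & 0 & 4 & 0 \\
0 & 0 & 0 & 0 & 0 & 5 \\
6 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
$$

Here the partition is $J_1 = \{1\}$, $J_2 = \{6\}$, $J_3 = \{3\}$, $J_4 = \{4\}$, $J_5 = \{2\}$, and
$J_6 = \{5\}$.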
This example also shows that the partition which reveals the imprimitiveness
is not necessarily unique. The relationships between the above concepts may
be illustrated as follows:

decomposable
indecomposable
    primitive (acyclic)
    imprimitive (cyclic)
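The two reorderings in Nikaido's example can be reproduced mechanically. The following is a minimal sketch, assuming that applying a permutation means listing rows and columns in the order given by the second row of the permutation symbol; the function name `reorder` is hypothetical.

```python
# Reorder rows and columns of Nikaido's 6 x 6 matrix according to a listing
# sigma = (sigma(1), ..., sigma(6)); entry (r, c) of the result is
# a_{sigma(r), sigma(c)}. The function name `reorder` is hypothetical.

A = [
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 5, 0],
    [0, 0, 0, 3, 0, 0],
    [0, 4, 0, 0, 0, 0],
    [6, 0, 0, 0, 0, 0],
    [0, 0, 2, 0, 0, 0],
]

def reorder(matrix, sigma):
    """Simultaneously permute rows and columns (sigma uses 1-based indices)."""
    return [[matrix[r - 1][c - 1] for c in sigma] for r in sigma]

first = reorder(A, [1, 4, 2, 6, 3, 5])     # groups {1, 4}, {2, 6}, {3, 5}
second = reorder(A, [1, 6, 3, 4, 2, 5])    # six singleton groups

for row in first:
    print(row)
print()
for row in second:
    print(row)
```

In both outputs the nonzero entries sit only in the blocks just above the block diagonal and in the lower-left block, which is the cyclic pattern required by the definition.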
We will now state the most important theorem on primitive (indecomposable)
matrices.

Theorem 4.B.3: A nonnegative indecomposable square matrix A is primitive if and
only if its Frobenius root is unique; that is, there are no other eigenvalues of A whose
absolute values are equal to or greater than this.
PROOF: The proof is omitted. See Nikaido [7], section 21, and [9], section
8.3, for example.
REMARK: From this theorem, we may define the "primitive matrix" in
terms of its unique Frobenius root (see Gantmacher [4], p. 80, for example).
In fact, if we define the primitive matrix in terms of the unique Frobenius
root and define the acyclic matrix as above, then the above theorem can
be restated as follows: A nonnegative indecomposable square matrix is
acyclic if and only if it is primitive (see Solow [11], p. 40, for example).
REMARK: Frobenius' theorem I asserts that if A is nonnegative and
indecomposable, then its Frobenius root $\hat{\lambda}$ is a simple root and $\hat{\lambda} \geq |\omega|$ for any
other eigenvalue $\omega$ of A. In other words, there can be an eigenvalue whose
absolute value is as large as $\hat{\lambda}$. The above theorem asserts that if A is
primitive, then there is no such root.
The following theorem is a rather easy corollary of Theorem 4.B.3.

Theorem 4.B.4: A nonnegative indecomposable square matrix A is primitive if and
only if some power of A is positive; that is, $A^m > 0$ for some positive integer $m \geq 1$.
PROOF: The proof is omitted. See Gantmacher [4], pp. 80-81, for example.
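Theorem 4.B.4 suggests a direct test for primitivity: raise the matrix to successive powers and look for a strictly positive one. The sketch below is an illustration under stated assumptions: the input is taken to be nonnegative and indecomposable, only the zero/nonzero pattern is tracked, and the search stops at the exponent (n - 1)^2 + 1, which is Wielandt's bound and is imported from outside this section. The function names are hypothetical.

```python
# Test primitivity via Theorem 4.B.4 by looking for a strictly positive power.
# Only the zero/nonzero pattern matters, so Boolean matrices are used.
# The cutoff (n - 1)^2 + 1 is Wielandt's bound, an assumption not stated in the text.

def pattern_product(X, Y):
    """Boolean product of two square zero/nonzero patterns."""
    n = len(X)
    return [[any(X[i][k] and Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def first_positive_power(A):
    """Return the smallest m with A^m > 0, or None if none is found up to the bound."""
    n = len(A)
    pattern = [[entry > 0 for entry in row] for row in A]
    power = pattern
    for m in range(1, (n - 1) ** 2 + 2):
        if all(all(row) for row in power):
            return m
        power = pattern_product(power, pattern)
    return None

with_positive_diagonal = [[1, 1], [1, 0]]   # indecomposable, a_11 > 0
two_cycle = [[0, 1], [1, 0]]                # indecomposable but cyclic
print(first_positive_power(with_positive_diagonal))
print(first_positive_power(two_cycle))
```

The first matrix has a positive diagonal element, so it also illustrates the sufficient condition of Theorem 4.B.5; the second alternates between two permutation patterns and never becomes positive.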
The following theorem, which is due to Frobenius, gives a sufficient condi-
tion for A to be primitive.

Theorem 4.B.5: A nonnegative indecomposable square matrix A is primitive if it

has at least one diagonal element which is positive.
PROOF: The proof is obvious from the definition of primitive matrices.

FOOTNOTES

1. Note that the second, third, and first columns of I now become the first, second, and
third columns of $P_\sigma$. If A is any 3 x 3 matrix whose jth column is $a_j$, that is,
$A = [a_1, a_2, a_3]$, then $A \cdot P_\sigma = [a_2, a_3, a_1]$ under the above permutation.
2. It immediately follows from the definition that A is decomposable if and only if
the transpose of A is decomposable.
3. In fact, it is not necessary to assume the indecomposability of both $A_1$ and $A_2$.
It suffices to assume that only $A_1$ is indecomposable.
4. The "modified" compactness argument used by D. Glycopantis to prove the existence
of the von Neumann path can be used as an alternative method to avoid this
difficulty. See his "The Closed Linear Model of Production: A Note," Review of
Economic Studies, XXXVII, April 1970. See also Section A of Chapter 6.
5. As remarked before, it suffices to assume that only $A_1$ is indecomposable. The
proof is analogous to the one below. Note that the matrix C is still indecomposable
as long as $A_1$ is indecomposable. If $A_2$ is decomposable, $\lambda_C > \lambda_{A_2}$ should be modified
to $\lambda_C \geq \lambda_{A_2}$, and to assert this we need the first two statements of Frobenius'
theorem II. The proofs of these two statements do not presuppose the present
statement (vi) of Frobenius' theorem I.
6. The proof based on this observation is given by Gantmacher [4]. Alternatively,
observe that decomposable matrices can be written (by suitable permutation of rows
and columns) in a form which has indecomposable submatrices on the principal
diagonal. Then, using Frobenius' theorem I, we can obtain the theorem for the
decomposable case. Such a proof is given by Debreu and Herstein [ 1] .
7. The proof of statements (i) and (ii) can be sketched as follows. Let $A_q \to A$ as
$q \to \infty$, where $A_q > 0$, and $x_q > 0$ is the eigenvector associated with its Frobenius
root $\lambda_q$. Let $x_q$ be chosen such that $x_q$ is in the unit sphere S. Since S is compact,
the sequence $\{x_q\}$ has a convergent subsequence $\{x_{q_h}\}$ whose limit $\hat{x}$ is in S. Since
$A_{q_h} \cdot x_{q_h} = \lambda_{q_h} x_{q_h}$ in this subsequence with $x_{q_h} > 0$ and $\lambda_{q_h} > 0$, we have
$A \cdot \hat{x} = \hat{\lambda} \hat{x}$ where $\hat{x} \geq 0$ and $\hat{\lambda} \geq 0$ in the limit.
8. An obvious example is A = 0. It can be shown that a necessary and sufficient
condition for $\hat{\lambda} = 0$ is that $A^m = 0$ for some positive integer m. In economic
applications, such a case turns out to be rather uninteresting.

REFERENCES

1. Debreu, G., and Herstein, I. N., "Nonnegative Square Matrices," Econometrica, 21,
October 1953.
2. Frobenius, G., "Über Matrizen aus Positiven Elementen," Sitzungsberichte der
Königlich Preussischen Akademie der Wissenschaften, 1908, pp. 471-76, 1909,
pp. 514-518.
3. Frobenius, G., "Über Matrizen aus Nicht Negativen Elementen," Sitzungsberichte der
Königlich Preussischen Akademie der Wissenschaften, 1912, pp. 456-477.
4. Gantmacher, F. R., The Theory of Matrices, Vol. II, New York, Chelsea Publishing
Co., 1959 (tr. from Russian).
5. McKenzie, L. W., "Matrices with Dominant Diagonals and Economic Theory," in
Mathematical Methods in the Social Sciences, 1959, ed. by Arrow, Karlin, and Suppes,
Stanford, Calif., Stanford University Press, 1960.
6. Morishima, M., "The Mathematical Theory of the Leontief System," in his
Inter-Industry Relations and Economic Fluctuations, Tokyo, Yuhikaku, 1955 (in
Japanese).
7. Nikaido, H., Introduction to Sets and Mappings in Modern Economics, tr. by K. Sato,
Amsterdam, North-Holland, 1970 (the Japanese original, Tokyo, 1960).
8. Nikaido, H., Linear Mathematics for Economics, Tokyo, Baifukan, 1961 (in Japanese).
9. Nikaido, H., Convex Structures and Economic Theory, New York, Academic Press, 1968.
10. Perron, O., "Zur Theorie der Matrizen," Mathematische Annalen, 64, July 1907,
pp. 248-263.
11. Solow, R. M., "On the Structure of Linear Models," Econometrica, 20, January 1952.
12. Wielandt, H., "Unzerlegbare, Nicht Negative Matrizen," Mathematische Zeitschrift,
LII, März 1950, pp. 642-648.

Section C
DOMINANT DIAGONAL MATRICES

Consider the following relation which appears in the Leontief input-output
analysis:

$$[I - A] \cdot x = c$$

Here $A \geq 0$ is the input-output matrix, x is the output vector, and c is the final
demand vector. We may call [I - A] the Leontief matrix. We begin this section
by reminding the reader of some of our discussions in Section A.
Suppose we estimate the input-output table A for a particular year from the
statistical data for this year. Suppose that c > 0 and also that x > 0. In other words,
we have the following property for [I - A].
(i) For some c > 0, there exists an x > 0 such that $[I - A] \cdot x = c$.1
Suppose now that we want to use this input-output table A to predict x for future
years. This can be done if we can predict the final demand vector, c, for these
years and if we can assume that A is "fairly" constant. This is a more or less
usual procedure for the application of the input-output table A. However, there
remains one obvious question: How can we guarantee the nonnegativity of the
x-vector which corresponds to some future $c \geq 0$? In other words, we want to
be able to make the following assertion.

(ii) For any $c \geq 0$, there exists an $x \geq 0$ such that $[I - A] \cdot x = c$.
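To make the prediction exercise concrete, the sketch below solves the Leontief system for one final demand vector and checks the sign of the solution. The two-sector table and the demand vector are hypothetical numbers chosen for illustration, the helper `solve` is an assumption of this sketch, and the Leontief matrix is assumed to be nonsingular. Exact rational arithmetic is used so that the nonnegativity check is not blurred by rounding.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve matrix * x = rhs exactly by Gauss-Jordan elimination.

    Assumes the matrix is nonsingular (otherwise the pivot search fails).
    """
    n = len(matrix)
    rows = [[Fraction(v) for v in row] + [Fraction(b)]
            for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[n] for row in rows]

# Hypothetical two-sector input-output table and final demand.
A = [[Fraction(1, 5), Fraction(3, 10)],
     [Fraction(2, 5), Fraction(1, 10)]]
c = [Fraction(10), Fraction(20)]

leontief = [[(1 if i == j else 0) - A[i][j] for j in range(2)] for i in range(2)]
x = solve(leontief, c)
print(x)                        # gross outputs
print(all(v >= 0 for v in x))   # nonnegativity check for this particular c
```

A single solve only confirms nonnegativity for one c; assertion (ii) asks for it for every nonnegative c, which is what the conditions developed in this section are designed to guarantee.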

In the course of studying this question, the condition that all the successive
principal minors of [I - A] be positive has been shown to be important. This
condition is called the Hawkins-Simon condition, as pointed out in Section A.
Another question has arisen in connection with the problem of "dynamiz-
ing" the Leontief input-output relation. It is now known that the crucial con-
dition here is that the absolute values (modulus) of A's eigenvalues are all less
than one. It is also known that this condition is closely related to conditions (i) and
(ii) above, that is, the Hawkins-Simon condition. In the course of those studies,
the Frobenius theorems were rediscovered and have since played an important
role in developments in this area. We now know that condition (i) is crucial
in the study of the matrix [I - A]; for then the concept of the "dominant diagonal
matrix" can be used by economists. The relationship of these properties of the
matrix [I - A] or of "dominant diagonal matrices" to other studies in economics,
such as the theory of stability of a competitive market, has also been realized.
McKenzie's article [11] brilliantly summarizes the whole of this unifying structure.
Nikaido's work [16], which was published at about the same time, is partly
devoted to displaying this unifying structure as well. The purpose of this section
is to clarify the mathematical structure of these problems. Hence it is natural that
our exposition rely heavily on McKenzie [11] and Nikaido [16]. We begin with
the definition of a dominant diagonal matrix.

Definition: An n x n matrix $A = [a_{ij}]$ is said to have a dominant diagonal if there
exist positive numbers $d_1, d_2, \ldots, d_n$ such that

$$d_j |a_{jj}| > \sum_{i \neq j} d_i |a_{ij}|, \quad j = 1, 2, \ldots, n$$

The phrase "a dominant diagonal" will be abbreviated "d.d."2
REMARK: As the defining property, we could have chosen

$$d_i |a_{ii}| > \sum_{j \neq i} d_j |a_{ij}|, \quad i = 1, 2, \ldots, n$$

which would define row dominance as compared with column dominance.
The choice of column dominance here is mainly for the convenience of the
present exposition.
REMARK: The usual definition in the literature is that A is said to have a
dominant diagonal if $|a_{jj}| > \sum_{i \neq j} |a_{ij}|$ for all j, which is due to Hadamard.
If D is the diagonal matrix whose diagonal elements $d_{ii}$ are the $d_i$'s in column
dominance, then $D \cdot A$ has a dominant diagonal in the Hadamard sense.
The following theorem by McKenzie [11] is a slight extension of the theorem
due to Hadamard in terms of the new definition. This theorem is a fundamental
theorem for dominant diagonal matrices.

Theorem 4.C.1 (Hadamard, McKenzie): If an n x n matrix A has d.d., A is
nonsingular.
PROOF: Suppose A is singular. Then there exists an $x \neq 0$ such that $B' \cdot x = 0$
where $B \equiv D \cdot A$ and D is a diagonal matrix with $d_{ii} > 0$ (i = 1, 2, ..., n).
(Here D is chosen so that A has d.d. with respect to D, that is, $|b_{jj}| >
\sum_{i \neq j} |b_{ij}|$ for all j = 1, 2, ..., n.) Therefore, $x_j b_{jj} + \sum_{i \neq j} x_i b_{ij} = 0$, j = 1,
2, ..., n. Or $|x_j| |b_{jj}| = |\sum_{i \neq j} x_i b_{ij}| \leq \sum_{i \neq j} |x_i| |b_{ij}|$, j = 1, 2, ..., n. Let J
be the index set such that $|x_j| \geq |x_i|$ for all i = 1, 2, ..., n, when $j \in J$. Then
we have $|x_j| |b_{jj}| \leq \sum_{i \neq j} |x_i| |b_{ij}| \leq \sum_{i \neq j} |x_j| |b_{ij}|$ for $j \in J$. Or $|b_{jj}| \leq
\sum_{i \neq j} |b_{ij}|$, $j \in J$. This contradicts the assumption that A has d.d. so that
$|b_{jj}| > \sum_{i \neq j} |b_{ij}|$ for all j. (Q.E.D.)

Theorem 4.C.2: If an n x n matrix A has d.d. that is positive, all its eigenvalues have
positive real parts.

PROOF (McKENZIE): Consider $[\rho I - A]$. Suppose that $\rho$ has a nonpositive
real part; then $|\rho - a_{ii}| \geq |a_{ii}|$, i = 1, 2, ..., n. (Here the absolute value
symbols mean modulus if complex values are being considered.) Hence
$[\rho I - A]$ has d.d. and is nonsingular by the previous theorem. Hence $\rho$
cannot be an eigenvalue of A (because if $\rho$ is an eigenvalue of A, $\det [\rho I - A] = 0$).
(Q.E.D.)
REMARK: Similarly, we can show that if A has d.d. that is negative, all its
eigenvalues have negative real parts.
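The difference between Hadamard's definition and McKenzie's weighted version is easy to see on a small matrix. The sketch below is illustrative only: the function name `has_dominant_diagonal`, the sample matrix, and the weights are assumptions. The last lines use the elementary fact that a real 2 x 2 matrix has both eigenvalues with positive real parts exactly when its trace and determinant are positive.

```python
# Column dominant diagonal in McKenzie's sense, with weights d_1, ..., d_n.
# The function name and the sample matrix are hypothetical.

def has_dominant_diagonal(A, d):
    """True if d_j |a_jj| > sum over i != j of d_i |a_ij| for every column j."""
    n = len(A)
    return all(
        d[j] * abs(A[j][j]) > sum(d[i] * abs(A[i][j]) for i in range(n) if i != j)
        for j in range(n)
    )

A = [[5, -15],
     [-1, 5]]

print(has_dominant_diagonal(A, [1, 1]))   # Hadamard's definition (all d_i = 1)
print(has_dominant_diagonal(A, [1, 4]))   # McKenzie's definition with d = (1, 4)

# Scaling row i by d_i gives D.A, which is dominant in Hadamard's sense.
DA = [[d_i * a for a in row] for d_i, row in zip([1, 4], A)]
print(has_dominant_diagonal(DA, [1, 1]))

# Theorem 4.C.1: A is nonsingular. Theorem 4.C.2: the diagonal is positive,
# so both eigenvalues should have positive real parts (trace > 0 and det > 0).
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
trace_A = A[0][0] + A[1][1]
print(det_A, trace_A)
```

The matrix fails the unweighted test in its second column but passes once the second row is weighted more heavily, which is the extension attributed to McKenzie in the text.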
The next theorem is clearly important in the Leontief input-output theory.
It characterizes condition (ii) in terms of the matrix with d.d.

Theorem 4.C.3: Let $B = [b_{ij}]$ be an n x n matrix with $b_{ii} > 0$ for all i and $b_{ij} \leq 0$
for $i \neq j$. Then there exists a unique $x \geq 0$ such that $B \cdot x = c$ for every $c \geq 0$ if
and only if B has d.d.

PROOF: We first prove sufficiency. Suppose that there exist $d_i > 0$, i = 1, 2,
..., n, with which B has d.d. By Theorem 4.C.1, B is nonsingular. Hence a
unique solution x exists. To show $x \geq 0$, suppose that $x_j < 0$ for $j \in J$
and $x_j \geq 0$ for $j \notin J$, where J is a nonempty set of indices. Consider
$\sum_{j \notin J} b_{ij} x_j + \sum_{j \in J} b_{ij} x_j = c_i \geq 0$ for $i \in J$. Multiplying by $d_i$ and summing,
we obtain

$$(*) \quad \sum_{i \in J} \sum_{j \notin J} d_i b_{ij} x_j + \sum_{i \in J} \sum_{j \in J} d_i b_{ij} x_j = \sum_{i \in J} d_i c_i \geq 0$$

Clearly the first term on the left is nonpositive since $x_j \geq 0$ for $j \notin J$ and
$b_{ij} \leq 0$ for $i \neq j$. By assumption of d.d., $\sum_{i \in J, i \neq j} d_i |b_{ij}| < d_j |b_{jj}|$ for all j,
hence for $j \in J$. Since $b_{jj} > 0$, this implies that $\sum_{i \in J, i \neq j} d_i b_{ij} + d_j b_{jj} =
\sum_{i \in J} d_i b_{ij} > 0$ for $j \in J$. Hence $\sum_{i \in J} \sum_{j \in J} d_i b_{ij} x_j < 0$; that is, the second
term on the left-hand side of (*) is negative. Thus the left-hand side of (*) is
negative, which is a contradiction.
Now we prove necessity. Consider $B \cdot x = c$. By assumption, for any
$c \geq 0$, there exists a unique $x \geq 0$. In particular, let c > 0. Then x > 0, since
$b_{ii} > 0$ for all i and $b_{ij} \leq 0$, $i \neq j$. Hence B' has d.d. realized by this x. Then
by the above, $B' \cdot p = \pi$ has a unique solution $p \geq 0$ for any $\pi \geq 0$. In
particular, let $\pi > 0$; then p > 0, since $b_{ii} > 0$ for all i and $b_{ij} \leq 0$, $i \neq j$. In
other words, B has d.d. with respect to this p. (Q.E.D.)

The following theorem follows immediately from Theorem 4.C.3.

Theorem 4.C.4: Let $B = [b_{ij}]$ be an n x n matrix such that $b_{ij} \leq 0$ for $i \neq j$; then
the following conditions are equivalent.

(I) There exists an $x \geq 0$ such that $B \cdot x > 0$.
(II) For any $c \geq 0$, there exists an $x \geq 0$ such that $B \cdot x = c$.
(III) The matrix B is nonsingular and $B^{-1} \geq 0$.
PROOF: We prove (I) $\Rightarrow$ (III) $\Rightarrow$ (II) $\Rightarrow$ (I).

(i) [(I) $\Rightarrow$ (III)]: Since $b_{ij} \leq 0$, $i \neq j$, $B \cdot x > 0$ implies that
$b_{ii} x_i > -\sum_{j \neq i} b_{ij} x_j \geq 0$ for all i so that $b_{ii} > 0$ for all i and x > 0. Hence B
has d.d. with respect to this x. Therefore B is nonsingular by Theorem 4.C.1, and,
by the previous theorem, for any $c \geq 0$, $B^{-1} \cdot c \geq 0$. Let
$c^i \equiv (0, 0, \ldots, 1, \ldots, 0) \geq 0$ (with 1 in the ith place), i = 1, 2, ..., n, which
implies $B^{-1} \geq 0$.
(ii) [(III) $\Rightarrow$ (II)]: The proof follows trivially.
(iii) [(II) $\Rightarrow$ (I)]: The proof follows trivially. (Q.E.D.)
REMARK: From the proof it is obvious that if B satisfies (I), then B has d.d.
that is positive.
REMARK: Let B be the Leontief matrix [I - A]. Then conditions (I) and
(II) are obviously restatements of our conditions (i) and (ii), respectively.
Hence we now know that condition (i) is equivalent to condition (ii).
REMARK: Condition (I) can be restated as follows:
(I) For some c > 0, $B \cdot x = c$ has a solution $x \geq 0$.
Nikaido [16] called (I) the weak solvability condition and (II) the strong solvability
condition.

Theorem 4.C.5 (Hawkins-Simon): Let $B = [b_{ij}]$ be an n x n matrix with $b_{ij} \leq 0$,
$i \neq j$; then the following two conditions are equivalent.
(I) There exists an $x \geq 0$ such that $B \cdot x > 0$.
(IV) [or (H-S)] All the successive principal minors of B are positive, that is,

$$
b_{11} > 0, \quad
\begin{vmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{vmatrix} > 0, \quad \ldots, \quad
\begin{vmatrix}
b_{11} & b_{12} & \cdots & b_{1n} \\
b_{21} & b_{22} & \cdots & b_{2n} \\
\vdots & \vdots & & \vdots \\
b_{n1} & b_{n2} & \cdots & b_{nn}
\end{vmatrix} > 0
$$

In other words,

$$
\det B_k > 0, \quad k = 1, 2, \ldots, n, \quad \text{where } B_k \equiv
\begin{pmatrix}
b_{11} & \cdots & b_{1k} \\
b_{21} & \cdots & b_{2k} \\
\vdots & & \vdots \\
b_{k1} & \cdots & b_{kk}
\end{pmatrix}
$$

REMARK: We call condition (H-S) the Hawkins-Simon condition.

PROOF [(I) $\Rightarrow$ (H-S)]: Condition (I) means there exist $x_j \geq 0$ such that
$\sum_{j=1}^{n} b_{ij} x_j > 0$, i = 1, 2, ..., n. Since $b_{ij} \leq 0$, $i \neq j$, this implies $b_{ii} > 0$ for all i
and that there exist $x_j \geq 0$ such that $\sum_{j=1}^{k} b_{ij} x_j > 0$ (i = 1, 2, ..., k), for k = 1,
2, ..., n. In other words, condition (I) holds for all $B_k$, k = 1, 2,
..., n. Hence by Theorem 4.C.4, $B_k$ is nonsingular. Now suppose that $b_{ij}$
($i \neq j$) shrinks to zero. In this process condition (I) is clearly preserved so that
the nonsingularity of $B_k$ is also preserved. In other words, in this process of
shrinking the $b_{ij}$'s ($i \neq j$), $\det B_k$ will never be zero so that $\det B_k$ keeps the
same sign. But in the limit of $b_{ij} = 0$ ($i \neq j$), $\det B_k > 0$ since $b_{ii} > 0$ for all i.
Hence $\det B_k$ must also be positive (k = 1, 2, ..., n) for the original $b_{ij}$ ($i \neq j$).
[(H-S) $\Rightarrow$ (I)]: It suffices to show that (H-S) $\Rightarrow$ (II). We prove this by
mathematical induction. For n = 1, $b_{11} x_1 = c_1$. Condition (H-S) implies that $b_{11} > 0$.
Hence for any $c_1 \geq 0$, there exists an $x_1 \geq 0$ such that $b_{11} x_1 = c_1$ (obviously
$x_1 = c_1 / b_{11}$). Suppose that (II) holds for n - 1, and we want to prove that
it holds for n. Consider $\sum_{j=1}^{n} b_{ij} x_j = c_i$, i = 1, 2, ..., n (or $B \cdot x = c$). We
want to show that, for any $c = (c_1, c_2, \ldots, c_n) \geq 0$, there exists an $x =
(x_1, x_2, \ldots, x_n) \geq 0$ such that $\sum_{j=1}^{n} b_{ij} x_j = c_i$, i = 1, 2, ..., n. Noting that
$b_{11} > 0$ (hence $\neq 0$), we obtain $[b_{11} x_1 + \sum_{j=2}^{n} b_{1j} x_j] \, b_{i1} / b_{11} = c_1 b_{i1} / b_{11}$
for i = 2, 3, ..., n. Define $b'_{ij} \equiv b_{ij} - b_{1j} b_{i1} / b_{11}$ and $c'_i \equiv c_i -
c_1 b_{i1} / b_{11}$, i = 2, 3, ..., n, j = 1, 2, ..., n. Then we obtain $\sum_{j=2}^{n} b'_{ij} x_j = c'_i$,
i = 2, 3, ..., n, and $b'_{i1} = 0$. Note that

$$
\begin{vmatrix}
b_{11} & b_{12} & \cdots & b_{1k} \\
b_{21} & b_{22} & \cdots & b_{2k} \\
\vdots & \vdots & & \vdots \\
b_{k1} & b_{k2} & \cdots & b_{kk}
\end{vmatrix}
=
\begin{vmatrix}
b_{11} & b_{12} & \cdots & b_{1k} \\
0 & b'_{22} & \cdots & b'_{2k} \\
\vdots & \vdots & & \vdots \\
0 & b'_{k2} & \cdots & b'_{kk}
\end{vmatrix}
= b_{11}
\begin{vmatrix}
b'_{22} & \cdots & b'_{2k} \\
\vdots & & \vdots \\
b'_{k2} & \cdots & b'_{kk}
\end{vmatrix}
$$

Since $\det B_k > 0$, k = 1, 2, ..., n [by (H-S)], we obtain

$$
\begin{vmatrix}
b'_{22} & \cdots & b'_{2k} \\
\vdots & & \vdots \\
b'_{k2} & \cdots & b'_{kk}
\end{vmatrix} > 0, \quad k = 2, 3, \ldots, n
$$

That is, the (H-S) condition holds for the (n - 1) x (n - 1) matrix $[b'_{ij}]$.
Then by the induction hypothesis, $\sum_{j=2}^{n} b'_{ij} x_j = c'_i$ has a nonnegative solution
$(x_2, \ldots, x_n)$ for any nonnegative $(c'_2, \ldots, c'_n)$. Let $c = (c_1, c_2, \ldots, c_n) \geq 0$ be
an arbitrary vector. Compute $c'_i$, i = 2, 3, ..., n, from $c'_i = c_i - c_1 b_{i1} / b_{11}$;
$c'_i \geq 0$ since $b_{i1} \leq 0$, $i \neq 1$, and $b_{11} > 0$. Then obtain $x_2, x_3, \ldots, x_n \geq 0$ and
obtain $x_1$ from $x_1 = [c_1 - \sum_{j=2}^{n} b_{1j} x_j] / b_{11}$. Since $b_{1j} \leq 0$, $1 \neq j$, and $b_{11} > 0$,
$x_1$ is also nonnegative. Hence $B \cdot x = c$ has a solution $x \geq 0$ for any $c \geq 0$.
Thus (II) holds for n. (Q.E.D.)
REMARK: The proof of [(H-S) $\Rightarrow$ (I)] is due to Nikaido [16], section 3,
and [17], p. 92.
REMARK: Combining Theorem 4.C.4 and Theorem 4.C.5, we can say that
conditions (I), (II), (III), and (IV) are all equivalent.
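The Hawkins-Simon condition is straightforward to test by computing the successive principal minors. The sketch below is a minimal illustration using exact rational arithmetic; the function names `determinant` and `hawkins_simon` and the two sample matrices are assumptions made for this example.

```python
from fractions import Fraction

def determinant(matrix):
    """Exact determinant by Gaussian elimination with row swaps."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n = len(rows)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            result = -result               # a row swap flips the sign
        result *= rows[col][col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return result

def hawkins_simon(B):
    """True if det B_k > 0 for every leading k x k submatrix B_k of B."""
    return all(determinant([row[:k] for row in B[:k]]) > 0
               for k in range(1, len(B) + 1))

# Two hypothetical matrices of the form I - A with nonpositive off-diagonals.
productive = [[Fraction(4, 5), Fraction(-3, 10)],
              [Fraction(-2, 5), Fraction(9, 10)]]
unproductive = [[Fraction(1, 10), Fraction(-1, 2)],
                [Fraction(-1, 2), Fraction(1, 10)]]
print(hawkins_simon(productive), hawkins_simon(unproductive))
```

Only the n successive minors are computed, which by the equivalences above is enough to decide whether a nonnegative solution exists for every nonnegative right-hand side.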

Corollary: Let $B = [b_{ij}]$ be an n x n matrix with $b_{ij} \leq 0$, $i \neq j$. Then (H-S) implies
that all the principal minors of B are positive,3 that is,

$$
\text{(h-s)} \quad b_{ii} > 0, \quad
\begin{vmatrix} b_{ii} & b_{ij} \\ b_{ji} & b_{jj} \end{vmatrix} > 0, \quad
\begin{vmatrix} b_{ii} & b_{ij} & b_{ik} \\ b_{ji} & b_{jj} & b_{jk} \\ b_{ki} & b_{kj} & b_{kk} \end{vmatrix} > 0, \quad \ldots
$$

PROOF: By Theorem 4.C.5, (H-S) implies that condition (I) holds. That is,
for some c > 0 there exists an $x \geq 0$ such that $B \cdot x = c$. Then renumber
the coordinates of x and c and renumber the $b_{ij}$'s correspondingly. Clearly,
(I) holds throughout this process. Hence by Theorem 4.C.5 (H-S) holds for
the new system. Since the renumbering can be any permutation of {1, 2,
..., n}, condition (h-s) holds. (Q.E.D.)
REMARK: This (h-s) is the so-called "Hawkins-Simon condition." Gant-
macher called the above corollary the Kotelyanskii theorem ([5], pp.
71-73).4

Theorem 4.C.6: Suppose B is written as $B = [\rho I - A]$ where $A = [a_{ij}]$ is an n x n
nonnegative matrix (that is, $A \geq 0$) and $\rho$ is a positive real number.5 Then any one of
the conditions (I), (II), (III), and (IV) is equivalent to the following condition:6

The series $\frac{1}{\rho} \sum_{k=0}^{\infty} \left( \frac{A}{\rho} \right)^k$ is convergent.

PROOF (NIKAIDO7): We prove the above equivalence by showing the
equivalence of the above condition and (III). First we show that (III) implies the
above condition with the sum of the series being $[\rho I - A]^{-1}$. Let $T_m \equiv
\sum_{k=0}^{m} (A/\rho)^k / \rho$. Then

$$(1) \quad T_m \cdot [\rho I - A] = [\rho I - A] \cdot T_m = I - \frac{A^{m+1}}{\rho^{m+1}}$$

This implies that $[\rho I - A] \cdot T_m \leq I$. Hence $T_m \leq [\rho I - A]^{-1}$ in view of
(III), so that the sequence $\{T_m\}$ is bounded from above. But $T_0 \leq T_1 \leq T_2
\leq \cdots$, since $\rho > 0$. Hence $T_m$ is convergent. Write $\lim T_m = T$. Since
$A^{m+1} / \rho^{m+1} = \rho T_{m+1} - \rho T_m \to 0$ as $m \to \infty$, from (1) we obtain $(\rho I - A) \cdot T = I$,
or $T = [\rho I - A]^{-1}$. Conversely, if $T_m$ is convergent, then (1) converges to
$T \cdot [\rho I - A] = [\rho I - A] \cdot T = I$, since $A^{m+1} / \rho^{m+1} \to 0$, as above. That is,
$[\rho I - A]$ is invertible and $[\rho I - A]^{-1} = T$. Since $T_m \geq 0$, $T \geq 0$ so that
$[\rho I - A]^{-1} \geq 0$. (Q.E.D.)
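The convergence of the series can be observed numerically by accumulating the partial sums used in the proof. The sketch below is illustrative: the matrix, the value of the scalar (called `rho` in the code), the number of terms, and the function names are all assumptions. The sample matrix has eigenvalues 0.5 and -0.2, so the series is expected to converge quickly.

```python
# Partial sums T_m = (1/rho) * sum_{k=0}^{m} (A/rho)^k for a hypothetical 2 x 2 matrix.

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def partial_sum(A, rho, m):
    """Return T_m as a list of rows."""
    n = len(A)
    scaled = [[a / rho for a in row] for row in A]                         # A / rho
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # (A/rho)^0
    total = [[0.0] * n for _ in range(n)]
    for _ in range(m + 1):
        total = [[t + s for t, s in zip(tr, sr)] for tr, sr in zip(total, term)]
        term = matmul(term, scaled)
    return [[t / rho for t in row] for row in total]

A = [[0.2, 0.3], [0.4, 0.1]]
rho = 1.0
T = partial_sum(A, rho, 100)

# Multiply T by (rho * I - A); the product should be close to the identity.
B = [[(rho if i == j else 0.0) - A[i][j] for j in range(2)] for i in range(2)]
product = matmul(T, B)
print(T)
print(product)
```

Every partial sum is a sum of nonnegative matrices, so the limit is nonnegative as well, which is how the proof recovers condition (III).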
We now come to using the Frobenius theorem. Let $A = [a_{ij}]$ be an n x n
nonnegative matrix, and consider $B = [\rho I - A]$ where $\rho$ is some real number.
Clearly $b_{ij} \leq 0$, $i \neq j$. We prove the following theorem.

Theorem 4.C.7: Let $B \equiv \rho I - A$, where $A = [a_{ij}] \geq 0$ and $\rho \in R$; then the
following conditions are all equivalent.8
(I') There exists an $x \geq 0$ such that $B \cdot x > 0$.
(V') The real parts of all the eigenvalues of B are positive.
(VI') We have $\rho > \lambda_A$, where $\lambda_A$ is the Frobenius root of A.
PROOF: Let $\omega$ be an eigenvalue of A; then by definition $0 = \det [\omega I - A] =
\det [-\omega I + A] = \det [(\rho - \omega) I - (\rho I - A)]$. Hence any eigenvalue of B
can be written in the form $(\rho - \omega)$. We now prove this theorem in the
following order: (I') $\Rightarrow$ (V') $\Rightarrow$ (VI') $\Rightarrow$ (I').
[(I') $\Rightarrow$ (V')]: Condition (I') implies that B has d.d. Hence (V') follows
by Theorem 4.C.2.
[(V') $\Rightarrow$ (VI')]: By (V'), we have $\mathrm{Re}(\rho - \omega) > 0$, where $\omega$ is any
eigenvalue of A and $\mathrm{Re}(\rho - \omega)$ denotes the real part of $(\rho - \omega)$. By Frobenius'
theorem II, A has a maximal nonnegative eigenvalue $\lambda_A$. Since $\lambda_A$ is a real
root, $\rho - \lambda_A = \mathrm{Re}(\rho - \lambda_A) > 0$. Hence $\rho > \lambda_A$.
[(VI') $\Rightarrow$ (I')]: By Frobenius' theorem II, condition (iii), if $A \cdot x \geq \rho x$
for some $x \geq 0$ and for some real number $\rho$, then we have $\lambda_A \geq \rho$. Hence
if $\lambda_A < \rho$, then for any $x \geq 0$ we have $A \cdot x < \rho x$, or $[\rho I - A] \cdot x > 0$, which
in turn implies that (I') holds. (Q.E.D.)
REMARK: In view of Theorems 4.C.4 and 4.C.5 these conditions are also
equivalent to (II), (III), and (IV) for this $B = [\rho I - A]$.
We now come back to our original matrix $B = [b_{ij}]$ with $b_{ij} \leq 0$, $i \neq j$. We
want to write B in the form of $[\rho I - A]$, for some $A \geq 0$. To do this, let $\rho$ be a
positive number which is big enough so that $[\rho I - B] \geq 0$. Write $A \equiv \rho I - B$. Then we
can write $B = \rho I - A$, where $A \geq 0$. Hence, using Theorem 4.C.7, we at once
obtain the following theorem.

Theorem 4.C.8: Conditions (I), (II), (III), and (IV) are all equivalent to the following
condition.
(V) The real parts of all the eigenvalues of B are positive.
PROOF: By Theorem 4.C.7, condition (V) is equivalent to the condition
$\rho > \lambda_A$ where $\lambda_A$ is the Frobenius root of A. This, in turn, is equivalent to
condition (I). (Q.E.D.)

Corollary (Metzler): Let A be an n x n nonnegative matrix and $\lambda_A$ be its Frobenius
root. Then a necessary and sufficient condition for $\lambda_A < 1$ is that all the principal minors
of [I - A] are positive.
PROOF: This is a result of (VI') $\Leftrightarrow$ (IV') [or (H-S)]. (Q.E.D.)
When we considered $[\rho I - A]$, we simply assumed $A \geq 0$. If, in addition, A is
indecomposable, we can get a stronger result. But first we prove the following
lemma.

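Lemma: Let $A = [a_{ij}]$ be an n x n nonnegative matrix. Suppose $A \cdot x \leq \mu x$ for some
$\mu \in R$ and $x \geq 0$, $x \neq 0$. Then if A is indecomposable, x > 0.9

PROOF: Suppose $x \not> 0$. Then we may write $x = \begin{pmatrix} x^1 \\ 0 \end{pmatrix}$, where $x^1 > 0$. Write
$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$ accordingly. Then $A \cdot x \leq \mu x$ implies $A_{21} \cdot x^1 \leq \mu 0 = 0$.
Hence $A_{21} = 0$. This contradicts the indecomposability of A. (Q.E.D.)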

Theorem 4.C.9: Let $A = [a_{ij}]$ be an n x n nonnegative indecomposable matrix and
let $\rho$ be a real number. Consider $B \equiv \rho I - A$. Then the following conditions are
equivalent.

(I') There exists an $x \geq 0$ such that $B \cdot x > 0$.
(VII') There exists an $x \geq 0$ such that $B \cdot x \geq 0$, $B \cdot x \neq 0$.
(VIII') The matrix B is nonsingular and $B^{-1} > 0$.

REMARK: Condition (VII') is a strengthening of (I'), and (VIII') is a
strengthening of (III'), where condition (III') means (III) holds when B can
be written as $B \equiv \rho I - A$, $A \geq 0$.
PROOF: [(VII') $\Leftrightarrow$ (I')]: Condition (I') $\Rightarrow$ (VII') follows trivially. Hence it
suffices to show that (VII') $\Rightarrow$ (I'). By assumption, there exists an $x \geq 0$ such
that $B \cdot x \geq 0$, $B \cdot x \neq 0$. Clearly $x \neq 0$. Hence x > 0. Let A' be the transpose of A and let
$\lambda_{A'}$ be its Frobenius root with associated eigenvector y. Since A is
indecomposable, so is A'. Thus by Frobenius' theorem I, y > 0. We have $A' \cdot y = \lambda_{A'} y$.
Write $B \cdot x = c$, where $c \geq 0$, $c \neq 0$. Then $\rho x = A \cdot x + c \geq A \cdot x$. Consider the inner
product of y and $\rho x$: $\rho \langle y, x \rangle = \langle y, \rho x \rangle > \langle y, A \cdot x \rangle = \langle A' \cdot y, x \rangle = \langle \lambda_{A'} y, x \rangle
= \lambda_{A'} \langle y, x \rangle$. Since $\langle y, x \rangle > 0$, we have $\rho > \lambda_{A'}$. Since the Frobenius root of A
(denoted by $\lambda_A$) is equal to $\lambda_{A'}$, we have $\rho > \lambda_A$. Since (VI') $\Rightarrow$ (I') by
Theorem 4.C.7, (I') follows.
[(VIII') $\Leftrightarrow$ (I')]: It suffices to show that (VIII') $\Leftrightarrow$ (III') since (III') $\Leftrightarrow$
(I') by Theorem 4.C.4. Since (VIII') $\Rightarrow$ (III') follows trivially, it remains to
show that (III') $\Rightarrow$ (VIII').
[(III') $\Rightarrow$ (VIII')]: Let $c \geq 0$, $c \neq 0$, be an arbitrary semipositive vector. By
assumption, $B^{-1} \geq 0$. Hence $x = B^{-1} \cdot c \geq 0$. We will show that x > 0. Note
that $B \cdot x = c$ implies $\rho x = A \cdot x + c \geq A \cdot x$, and that $x \neq 0$. Hence by the
previous lemma, x > 0. Choose

$$c = (0, 0, \ldots, 0, 1, 0, \ldots, 0) \quad \text{(with 1 in the ith place)}$$

Then $B^{-1} \cdot c > 0$ means that the ith column of $B^{-1}$ is strictly positive. Let
i = 1, 2, ..., n. Thus $B^{-1} > 0$. (Q.E.D.)

Theorem 4.C.10 (Brauer [1], Solow [19]): Let $A = [a_{ij}]$ be an n x n nonnegative
indecomposable matrix. Let $r_i \equiv \sum_{j=1}^{n} a_{ij}$, i = 1, 2, ..., n (row sum). If $\rho \geq r_i$ for all i
with strict inequality for some i, then $\rho > \lambda_A$ where $\lambda_A$ is the Frobenius root of A.
PROOF: Let $c_i \equiv \rho - r_i$ and $B \equiv \rho I - A$. Then $B \cdot x = c$ has a solution x =
(1, 1, ..., 1). Hence condition (VII') holds, so that condition (VI') follows;
that is, $\rho > \lambda_A$. (Q.E.D.)

REMARK: If we let $s_j \equiv \sum_{i=1}^{n} a_{ij}$, j = 1, 2, ..., n (column sum), we can show
analogously that if $\rho \geq s_j$ for all j with strict inequality for some j, then $\rho > \lambda_A$.
REMARK: If B = [I - A], this condition gives a sufficient condition for
$\lambda_A < 1$.
REMARK: If $r_i < \rho$ for all i (or $s_j < \rho$ for all j), then condition (I') immediately
follows so that $\lambda_A < \rho$. No indecomposability assumption is necessary.
In fact, this condition provides a trivial (but very useful) sufficient condition
for conditions (I'), (II'), and so on, to hold.
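The row-sum bound of Theorem 4.C.10 can be illustrated on a small positive (hence indecomposable) matrix. In the sketch below the matrix, the helper `frobenius_root`, and the use of power iteration to approximate the root are assumptions made for illustration; the scalar of the theorem is called `rho` and is set to the largest row sum.

```python
# Brauer-Solow bound: if rho >= every row sum, with strict inequality for at
# least one row, then rho exceeds the Frobenius root. Names are hypothetical.

def frobenius_root(A, iterations=200):
    """Approximate the Frobenius root of a positive matrix by power iteration."""
    n = len(A)
    x = [1.0 / n] * n
    root = 0.0
    for _ in range(iterations):
        y = [sum(a * v for a, v in zip(row, x)) for row in A]
        root = sum(y)                      # estimate of the root, since sum(x) = 1
        x = [v / root for v in y]
    return root

A = [[1.0, 2.0], [3.0, 2.0]]
row_sums = [sum(row) for row in A]         # r_1 = 3, r_2 = 5
rho = max(row_sums)                        # rho >= r_i, strictly above r_1
c = [rho - r for r in row_sums]            # B.(1, ..., 1) = c, as in the proof
lam = frobenius_root(A)
print(row_sums, c, lam, rho > lam)
```

The vector c is nonnegative with one positive entry, which is exactly the situation covered by condition (VII') in the proof.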
REMARK: The following proposition is a slight generalization of Brauer-
Solow's result.

Theorem 4.C.11 (Fisher [4], Takayama [20]): Let $A = [a_{ij}]$ be an n x n
nonnegative matrix (not necessarily indecomposable) and let $\lambda_A$ be its Frobenius root.
Let $s_j \equiv \sum_{i=1}^{n} a_{ij}$, j = 1, 2, ..., n (column sum) and let $\bar{s} \equiv \max_j s_j$ and $\underline{s} \equiv \min_j s_j$;
then we have $\underline{s} \leq \lambda_A \leq \bar{s}$, and if, in particular, $s_j = 1$ for all j, then $\lambda_A = 1$. A similar
proposition can be obtained with respect to the row sums of A.
PROOF: Let $\hat{x} = (\hat{x}_1, \ldots, \hat{x}_n) \geq 0$ be the eigenvector associated with $\lambda_A$.
By definition, we have $\lambda_A \hat{x} = A \cdot \hat{x}$, that is, $\lambda_A \hat{x}_i = \sum_{j=1}^{n} a_{ij} \hat{x}_j$, i = 1, 2, ..., n.
Summing over i, we obtain

$$\lambda_A \sum_{i=1}^{n} \hat{x}_i = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} \hat{x}_j = \sum_{j=1}^{n} \hat{x}_j \sum_{i=1}^{n} a_{ij} = \sum_{j=1}^{n} \hat{x}_j s_j$$

Hence

$$\lambda_A = \frac{\sum_{j=1}^{n} \hat{x}_j s_j}{\sum_{i=1}^{n} \hat{x}_i}$$

(that is, $\lambda_A$ is a nonnegative weighted average of the column sums, the $s_j$'s).
The statement of the theorem follows immediately from this relation.
(Q.E.D.)
REMARK: Takayama [20] proved the theorem for the case when $s_j = 1$ for
all j. Fisher's result [4] as recorded above is more general, but his method
of proof is identical with that of Takayama [20]. Hence in essence it is only a
slight generalization.
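The weighted-average relation in the proof can be verified directly once an eigenvector is available. The following sketch is illustrative only: the sample matrix, the helper `frobenius_pair`, and the reliance on power iteration (assumed to converge because the matrix is strictly positive) are not part of the original text.

```python
# Check that the Frobenius root is a weighted average of the column sums,
# with weights given by the associated eigenvector. Names are hypothetical.

def frobenius_pair(A, iterations=200):
    """Return (root, eigenvector) approximations; the eigenvector sums to 1."""
    n = len(A)
    x = [1.0 / n] * n
    root = 0.0
    for _ in range(iterations):
        y = [sum(a * v for a, v in zip(row, x)) for row in A]
        root = sum(y)
        x = [v / root for v in y]
    return root, x

A = [[2.0, 1.0], [1.0, 3.0]]
col_sums = [sum(col) for col in zip(*A)]   # s_1 = 3, s_2 = 4
lam, x = frobenius_pair(A)
weighted = sum(xj * sj for xj, sj in zip(x, col_sums)) / sum(x)
print(min(col_sums), lam, max(col_sums), weighted)
```

Because the weights are nonnegative, the average cannot leave the interval between the smallest and the largest column sum, which is the content of the theorem.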
Finally we may point out the interesting discussion on the "choice of units"
by Fisher [3]. Consider the Leontief system $A \cdot x + c = x$. Suppose the jth
element of x and c (that is, the jth good) is to be measured in new units. Then the
jth row of A must be multiplied by, and the jth column divided by, the same
appropriate conversion factor. In other words, the shift of units will convert the
original matrix A to

$$A^* = D \cdot A \cdot D^{-1}$$

where D is an n x n diagonal matrix with positive diagonal elements. Write $x^* \equiv
D \cdot x$ and $c^* \equiv D \cdot c$; then the original system can be rewritten with new units as
$A^* \cdot x^* + c^* = x^*$. Then we can show easily that if A is indecomposable, there
exists a set of units in which all column (row) sums of $A^*$ are equal to its Frobenius
root. To prove this, first note that A and $A^*$ have the same Frobenius root (say,
$\lambda_A$) and that if $\hat{x} > 0$ is the eigenvector of A associated with $\lambda_A$, then $D \cdot \hat{x}$ is the
eigenvector of $A^*$ associated with $\lambda_A$. Then set $d_i = 1/\hat{x}_i$, i = 1, 2, ..., n, where $d_i$
is the ith diagonal element of D. Then the above proposition follows immediately
from $A^* \cdot (D \cdot \hat{x}) = \lambda_A (D \cdot \hat{x})$.
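The change of units can be carried out explicitly on a small example. The sketch below is a hedged illustration: the matrix and the helper `frobenius_pair` are assumptions, and the eigenvector is approximated by power iteration. With the eigenvector of A itself, the rescaled matrix has equal row sums; the column-sum version is assumed to follow by applying the same construction to the transpose.

```python
# Rescale units with d_i = 1 / x_i, where x is the Frobenius eigenvector of A,
# and check that every row sum of D.A.D^{-1} equals the Frobenius root.
# Names and the sample matrix are hypothetical.

def frobenius_pair(A, iterations=200):
    """Return (root, eigenvector) approximations; the eigenvector sums to 1."""
    n = len(A)
    x = [1.0 / n] * n
    root = 0.0
    for _ in range(iterations):
        y = [sum(a * v for a, v in zip(row, x)) for row in A]
        root = sum(y)
        x = [v / root for v in y]
    return root, x

A = [[1.0, 2.0], [3.0, 2.0]]
lam, x = frobenius_pair(A)
d = [1.0 / xi for xi in x]                                  # conversion factors
A_star = [[d[i] * A[i][j] / d[j] for j in range(2)] for i in range(2)]
row_sums = [sum(row) for row in A_star]
print(lam, A_star, row_sums)
```

In the new units the vector D.x consists of equal entries, so multiplying the rescaled matrix by it simply reads off the row sums.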
With this observation, we can immediately obtain the following corollary of
Theorem 4.C.10 (see Fisher [3] , p. 446, for a decomposable case).

Corollary: Let A be indecomposable. Then a necessary and sufficient condition for

1 > AA is that there exist a set of units in which all column (row) sums are at most
unity, and one such sum is less than unity.
In this connection, we may also recall the definition of dominant diagonal
matrices. By this definition, the Leontief matrix [I - A], where $A \geq 0$, has
dominant diagonals if there exist $d_j > 0$ such that $d_j (1 - a_{jj}) > \sum_{i \neq j} d_i a_{ij}$, j = 1, 2, ..., n;
that is, $1 > \sum_{i=1}^{n} (d_i / d_j) a_{ij}$, j = 1, 2, ..., n. In other words, [I - A] has dominant
diagonals if and only if the column sums are less than unity with an appropriate
shift of units. In the definition of dominant diagonals, we noted that McKenzie
extended the usual definition. We can now see easily that the dominant diagonals
in McKenzie's extended sense are equivalent to the existence of a set of units in
which the matrix in question has dominant diagonals in the usual sense (see Fisher
[3], p. 448).

FOOTNOTES

1. Note that this condition is equivalent to the following condition: For some c > 0,
there exists an $x \geq 0$ such that $[I - A] \cdot x = c$. That is, x > 0 can be replaced by a
weaker statement $x \geq 0$. To see this, write $[I - A] \cdot x = c > 0$ as $x_i - \sum_{j=1}^{n} a_{ij} x_j =
c_i > 0$ for all i, and observe that this requires $x_i > 0$ for all i anyway.
2. Define $M = [m_{ij}]$ from $A = [a_{ij}]$ as follows: $m_{ij} = -|a_{ij}|$, $i \neq j$, and $m_{ii} = |a_{ii}|$. Then
it is easy to see that A has a dominant diagonal if and only if there exists a d > 0 such
that $M' \cdot d > 0$, where M' is the transpose of M. Clearly this condition is closely
related to the above condition (i) for the Leontief matrix.
3. Therefore the condition (H-S) is equivalent to the condition (h-s).
4. Hawkins and Simon [7] obtained (h-s) and Nicholas Georgescu-Roegen obtained
(H-S) in his "Some Properties of a Generalized Leontief Model," in Activity Analysis
of Production and Allocation, ed. by T. C. Koopmans, New York, Wiley, 1951 (theorem
7), reprinted with revision and "A Postscript (1964)" in his Analytical Economics,
Cambridge, Mass., Harvard University Press, 1966.
5. If (IV) [that is, (H-S)] holds, then $\rho > 0$ is automatically implied if $a_{ii} \geq 0$, for
(H-S), among other things, requires $\rho - a_{11} > 0$.
6. In the proof it will be shown that if the series is convergent it is equal to $[\rho I - A]^{-1}$.
7. See Nikaido [17], p. 97, and [16], section 19.
8. Note that the prime in (I'), (V'), and (VI') indicates that B has the specific form
$[\rho I - A]$, where $A \geq 0$ is a given nonnegative square matrix.
9. In view of this lemma, we can immediately assert that if $A \cdot x \leq \mu x$ for some $\mu$ and
$x \geq 0$ with $x \neq 0$ and $x \not> 0$, then A is decomposable. In fact, we can also show the
converse of this statement, that is, if A is decomposable, then $A \cdot x \leq \mu x$ for some $\mu$
and $x \geq 0$ with $x \neq 0$ and $x \not> 0$. The proof of this follows easily from the definition of
decomposability.

REFERENCES

1. Brauer, A., "Limits for the Characteristic Roots of a Matrix," Duke Mathematical
Journal, 13, September 1946.
2. Debreu, G., and Herstein, I. N., "Nonnegative Square Matrices," Econometrica,
21, October 1953.
3. Fisher, F. M., "Choice of Units, Column Sums, and Stability in Linear Dynamic
Systems with Nonnegative Square Matrices," Econometrica, 33, April 1965.
4. Fisher, F. M., "An Alternate Proof and Extension of Solow's Theorem on Nonnegative
Square Matrices," Econometrica, 30, April 1962.
SOME APPLICATIONS 391

5. Gantmacher, F. R., The Theory of Matrices, Vol. II, New York, Chelsea Publishing
Co., 1959 (tr. from Russian).
6. Hawkins, D., "Some Conditions of Macro-economic Stability," Econometrica, 16,
October 1948.
7. Hawkins, D., and Simon, H. A., "Note: Some Conditions of Macro-economic
Stability," Econometrica, 17, July-October 1949.
8. Herstein, I. N., "Comments on Solow's `Structure of Linear Models,"' Econometrica,
20, October 1952.
9. Karlin, S., Mathematical Methods and Theory in Games, Programming and Economics,
Vol. 1, Reading, Mass., Addison-Wesley, 1959.
10. McKenzie, L. W., "An Elementary Analysis of the Leontief System," Econometrica,
25, July 1957.
11. McKenzie, L. W., "Matrices with Dominant Diagonals and Economic Theory," in
Mathematical Methods in Social Sciences, 1959, ed. by Arrow, Karlin, and Suppes, Stanford,
Calif., Stanford University Press, 1960.
12. Metzler, L. A., "Stability of Multiple Markets: The Hicks Conditions,"
Econometrica, 13, October 1945.
13. Metzler, L. A., "A Multiple Region Theory of Income and Trade," Econometrica, 18,
October 1950.
14. Morgenstern, O., ed., Economic Activity Analysis, New York, Wiley, 1954, esp. articles
by Wong, Y. K., and Woodbury, M. A.
15. Mosak, S. L., General Equilibrium Theory in International Trade, Bloomington, Ind.,
Principia Press, 1944.
16. Nikaido, H., Introduction to Sets and Mappings in Modern Economics, tr. by K. Sato,
Amsterdam, North-Holland, 1970 (the Japanese original, Tokyo, 1960).
17. Nikaido, H., Convex Structures and Economic Theory, New York, Academic Press, 1968.
18. Price, G. G., "Bounds for Determinants with Dominant Principal Diagonals,"
Proceedings of the American Mathematical Society, 2, 1951.
19. Solow, R. M., "On the Structure of Linear Models," Econometrica, 20, January 1952.
20. Takayama, A., "Stability in the Balance of Payments: A Multi-Country Approach,"
Journal of Economic Behavior, 1, October, 1961 (the paper presented at the Washing-
ton meeting of the Econometric Society, 1959, resume, Econometrica, 28, July 1960).

Section D
SOME APPLICATIONS

a. SUMMARY OF RESULTS
We begin this section by summarizing some of the results obtained in the
previous section. In order to make it easy to refer to these results, we will present
them as theorems.
392 FROBENIUS THEOREMS, DOMINANT DIAGONAL MATRICES, AND APPLICATIONS

Theorem 4.D.1: Let $B = [b_{ij}]$ be an $n \times n$ matrix with $b_{ij} \leq 0$ for $i \neq j$. Then the
following five conditions are mutually equivalent.
(I) There exists an $x \geq 0$ such that $B\cdot x > 0$ (that is, for some $c > 0$, there exists
an $x \geq 0$ such that $B\cdot x = c$).¹
(II) For any $c \geq 0$, there exists an $x \geq 0$ such that $B\cdot x = c$.
(III) The matrix $B$ is nonsingular and $B^{-1} \geq 0$.
(IV) [or (H-S)] All the successive principal minors of $B$ are positive. In other words,

$$b_{11} > 0, \quad \begin{vmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{vmatrix} > 0, \quad \ldots, \quad \begin{vmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots & \vdots & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nn} \end{vmatrix} > 0$$

(V) The real parts of all the eigenvalues of $B$ are positive.

PROOF: The proof is obvious from Theorem 4.C.8.
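
The equivalences in Theorem 4.D.1 can be spot-checked numerically. The following is a minimal sketch in Python for one illustrative 2 x 2 matrix; the matrix `B` and the helper names (`det2`, `inv2`, `matvec`) are assumptions made for this example and do not come from the text. Exact rational arithmetic is used so that the sign checks are not affected by rounding.

```python
from fractions import Fraction as Fr

def det2(m):
    """Determinant of a 2x2 matrix given as a list of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a nonsingular 2x2 matrix (adjugate formula)."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def matvec(m, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# Illustrative matrix with nonpositive off-diagonal elements (an assumption).
B = [[Fr(4, 5), Fr(-3, 10)],
     [Fr(-2, 5), Fr(9, 10)]]

# (I): x = (1, 1) is nonnegative and gives B.x > 0.
cond_I = all(y > 0 for y in matvec(B, [Fr(1), Fr(1)]))

# (III): B is nonsingular and its inverse is nonnegative.
B_inv = inv2(B)
cond_III = det2(B) != 0 and all(e >= 0 for row in B_inv for e in row)

# (II), checked for one sample c >= 0 only: x = B_inv.c is nonnegative.
x_for_c = matvec(B_inv, [Fr(1), Fr(0)])
cond_II_sample = all(e >= 0 for e in x_for_c)

# (IV): the successive principal minors b11 and det(B) are positive.
cond_IV = B[0][0] > 0 and det2(B) > 0

# (V): for a real 2x2 matrix, both eigenvalues have positive real parts
# exactly when the trace and the determinant are both positive.
cond_V = (B[0][0] + B[1][1]) > 0 and det2(B) > 0

print(cond_I, cond_II_sample, cond_III, cond_IV, cond_V)
```

For this particular matrix all five checks come out true together, as the theorem predicts; a single example of course illustrates the statement rather than proving it.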
The matrix $B$ is often written in the form $B = [\rho I - A]$, where $A \geq 0$.
Clearly Theorem 4.D.1 is applicable to this $B$. In fact, we can say even more
in this case, as is done in our next theorem.

Theorem 4.D.2: Let $A = [a_{ij}]$ be an $n \times n$ nonnegative matrix. Let $B = [\rho I - A]$,
where $\rho$ is a real number and $I$ is the identity matrix. Then the following six conditions
are mutually equivalent.
(I') There exists an $x \geq 0$ such that $B\cdot x > 0$.
(II') For any $c \geq 0$, there exists an $x \geq 0$ such that $B\cdot x = c$.
(III') The matrix $B$ is nonsingular and $B^{-1} \geq 0$.
(IV') [(H-S)] All the successive principal minors of $B$ are positive.
(V') The real parts of all the eigenvalues of $B$ are positive.
(VI') We have $\rho > \lambda_A$, where $\lambda_A$ is the Frobenius root of $A$.
If in addition, $A$ is indecomposable, then any of the above conditions is equivalent to
either of the following conditions:
(VII') There exists an $x \geq 0$ such that $B\cdot x \geq 0$ and $B\cdot x \neq 0$.
(VIII') The matrix $B$ is nonsingular and $B^{-1} > 0$.
PROOF: The proof is obvious from Theorems 4.C.7, 4.C.8, and 4.C.9.
REMARK: From Theorem 4.C.6, the first six conditions [(I')-(VI')] are
equivalent to the following condition, if $\rho > 0$.

(IX') The series $\frac{1}{\rho}\sum_{k=0}^{\infty}\left(\frac{A}{\rho}\right)^{k}$ is convergent.

REMARK: As it was remarked in Theorem 4.C.6, if the series

$$\frac{1}{\rho}\sum_{k=0}^{\infty}\left(\frac{A}{\rho}\right)^{k}$$

is convergent, it is equal to $[\rho I - A]^{-1}$.
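
The series remark can be illustrated numerically. The sketch below, written in Python under stated assumptions, sums the first terms of the series for a 2 x 2 nonnegative matrix and compares the result with the closed-form inverse. The matrix `A` (whose Frobenius root is 0.5), the two values of the scalar `rho`, and all function names are illustrative choices, not part of the text.

```python
def matmul2(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def neumann_partial_sum(A, rho, terms):
    """Return (1/rho) * sum_{k=0}^{terms-1} (A/rho)^k for a 2x2 matrix A."""
    scaled = [[a / rho for a in row] for row in A]
    power = [[1.0, 0.0], [0.0, 1.0]]          # (A/rho)^0
    total = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(terms):
        total = [[total[i][j] + power[i][j] for j in range(2)] for i in range(2)]
        power = matmul2(power, scaled)
    return [[t / rho for t in row] for row in total]

def inverse_rhoI_minus_A(A, rho):
    """Closed-form inverse of [rho*I - A] for a 2x2 matrix A."""
    b11, b12 = rho - A[0][0], -A[0][1]
    b21, b22 = -A[1][0], rho - A[1][1]
    d = b11 * b22 - b12 * b21
    return [[b22 / d, -b12 / d], [-b21 / d, b11 / d]]

A = [[0.2, 0.3], [0.4, 0.1]]   # illustrative; eigenvalues 0.5 and -0.2

# rho = 1 exceeds the Frobenius root, so the partial sums settle down.
series = neumann_partial_sum(A, 1.0, 60)
exact = inverse_rhoI_minus_A(A, 1.0)
max_gap = max(abs(series[i][j] - exact[i][j]) for i in range(2) for j in range(2))

# rho = 0.4 is below the Frobenius root, so the partial sums keep growing.
diverging = neumann_partial_sum(A, 0.4, 60)
largest_entry = max(max(row) for row in diverging)

print(max_gap, largest_entry)
```

With the scalar above the Frobenius root the partial sums agree with the inverse to within rounding error, while with the scalar below it the entries grow without bound as more terms are added.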
In the above theorems we are concerned with $B$ such that $b_{ij} \leq 0$ for $i \neq j$,
that is, such that all its off-diagonal elements are nonpositive. Suppose, on the
contrary, that all the off-diagonal elements are nonnegative. We know that such
a matrix appears in the stability analysis of a competitive market as the gross
substitution matrix; hence we may suspect that if we know the properties of
such a matrix, it will have many applications in economics. But the properties
of such a matrix can immediately be obtained from the above theorems. Let $\hat{B}$
be a (square) matrix whose off-diagonal elements are all nonnegative, and let
$B \equiv -\hat{B}$. Then clearly all the off-diagonal elements of $B$ are nonpositive. Thus
Theorem 4.D.1 can be applied and we obtain the following theorem.

Theorem 4.D.3: Let $\hat{B}$ be an $n \times n$ matrix whose off-diagonal elements are all
nonnegative. Then the following conditions are mutually equivalent:
(I") There exists an $x \geq 0$ such that $\hat{B}\cdot x < 0$.²
(II") For any $c \leq 0$, there exists an $x \geq 0$ such that $\hat{B}\cdot x = c$.
(III") The matrix $\hat{B}$ is nonsingular and $\hat{B}^{-1} \leq 0$.
(IV") The successive principal minors of $\hat{B}$ alternate in sign; that is, if $\hat{B}_k$ is the
successive principal minor of $\hat{B}$ of order $k$, then $(-1)^k \hat{B}_k > 0$, $k = 1, 2, \ldots, n$. In other
words, $\hat{B}$ is Hicksian.
(V") The real parts of all the eigenvalues of $\hat{B}$ are negative.
If $\hat{B}$ is written in the form $\hat{B} = [A - \rho I]$ where $A \geq 0$ and $\rho$ is a real number,
then any of the above conditions is equivalent to the following:
(VI") We have $\rho > \lambda_A$, where $\lambda_A$ is the Frobenius root of $A$.
If in addition, $A$ is indecomposable, then any of the above six conditions is
equivalent to either of the following conditions:
(VII") There exists an $x \geq 0$ such that $\hat{B}\cdot x \leq 0$ and $\hat{B}\cdot x \neq 0$.
(VIII") The matrix $\hat{B}$ is nonsingular and $\hat{B}^{-1} < 0$.
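
The Hicksian condition (IV") is easy to test mechanically for a small matrix. The following Python sketch computes the successive principal minors by cofactor expansion and checks the alternating-sign pattern. The two test matrices and the function names are assumptions introduced for illustration; cofactor expansion is used only because it is short, not because it is efficient for large n.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def leading_minors(m):
    """Successive (leading) principal minors of orders 1, 2, ..., n."""
    return [det([row[:k] for row in m[:k]]) for k in range(1, len(m) + 1)]

def is_hicksian(m):
    """True when (-1)^k times the k-th successive principal minor is positive."""
    return all((-1) ** k * d > 0
               for k, d in enumerate(leading_minors(m), start=1))

# Illustrative matrix with nonnegative off-diagonal elements.
B_hat = [[-3, 1, 1],
         [1, -3, 1],
         [1, 1, -3]]
minors = leading_minors(B_hat)

# Condition (I"): x = (1, 1, 1) gives B_hat.x < 0.
B_hat_x = [sum(row) for row in B_hat]

# A matrix with nonnegative off-diagonal elements that is not Hicksian.
not_hicksian = [[-1, 2],
                [2, -1]]

print(minors, is_hicksian(B_hat), is_hicksian(not_hicksian))
```

In the first example conditions (I") and (IV") hold together; in the second, the off-diagonal elements are too large relative to the diagonal and the second minor has the wrong sign.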
Finally we prove the following theorem, which will be useful and important
in several economic applications.

Theorem 4.D.4: Let $B$ be an $n \times n$ matrix such that $b_{ij} \leq 0$ for $i \neq j$. Then the
following four conditions are mutually equivalent.
(I) There exists an $x \geq 0$ such that $B\cdot x > 0$.
(I') There exists a $p \geq 0$ such that $B'\cdot p > 0$, where $B'$ is the transpose of $B$.
(I*) There exists an $x > 0$ such that $B\cdot x > 0$.
(I*') There exists a $p > 0$ such that $B'\cdot p > 0$.

PROOF: From Theorem 4.D.1, condition (I) holds if and only if $B^{-1} \geq 0$
[condition (III)]. But by a well-known relation in elementary linear algebra,
$(B^{-1})' = (B')^{-1}$. Thus $B^{-1} \geq 0$ is true if and only if $(B')^{-1} \geq 0$. Then
applying Theorem 4.D.1 again, this is true if and only if there exists a $p \geq 0$
such that $B'\cdot p > 0$. Hence (I) and (I') are equivalent.
To show the equivalence of (I) and (I*), we recall that if there exists an
$x \geq 0$ such that $B\cdot x > 0$, $x$ must be strictly positive since $b_{ij} \leq 0$ for $i \neq j$.³
Conversely, if there exists an $x > 0$ such that $B\cdot x > 0$, then condition (I) clearly
follows. Similarly, (I*') holds if and only if there exists a $p \geq 0$ such that
$B'\cdot p > 0$. Combining this with the first part of the theorem, we obtain the
second part of the theorem. (Q.E.D.)
REMARK: In the first part of the proof of Theorem 4.D.4, we use the
relation (I) $\Leftrightarrow$ (III). By noting that the eigenvalues of a matrix are the same
as the eigenvalues of its transpose, we can prove the same statement using
the relation (I) $\Leftrightarrow$ (V).
REMARK: The equivalence (I*) $\Leftrightarrow$ (I*') in Theorem 4.D.4 implies that
$B'$ has a dominant diagonal if and only if $B$ has a dominant diagonal.⁴
REMARK: We also note that Theorem 4.D.4 can be proved directly from
Theorem 4.C.3. To do this, simply note that $B\cdot x > 0$ for some $x > 0$ means
that $B'$ has a dominant diagonal. Hence, from Theorem 4.C.3, $B'\cdot p = \pi > 0$
has a solution $p > 0$. The converse holds similarly.
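
As a small worked illustration of Theorem 4.D.4 (the matrix below is an assumption chosen for the example, not taken from the text), consider a 2 x 2 matrix with nonpositive off-diagonal elements. A strictly positive vector works for the matrix itself, and a different strictly positive vector works for its transpose:

$$B = \begin{bmatrix} 0.8 & -0.3 \\ -0.4 & 0.9 \end{bmatrix}, \qquad x = \begin{bmatrix} 2 \\ 2 \end{bmatrix}, \qquad B\cdot x = \begin{bmatrix} 1.6 - 0.6 \\ -0.8 + 1.8 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} > 0$$

$$B' = \begin{bmatrix} 0.8 & -0.4 \\ -0.3 & 0.9 \end{bmatrix}, \qquad p = \begin{bmatrix} 13/6 \\ 11/6 \end{bmatrix}, \qquad B'\cdot p = \begin{bmatrix} (10.4 - 4.4)/6 \\ (-3.9 + 9.9)/6 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} > 0$$

Here the two vectors are obtained by applying the inverse of the matrix and the inverse of its transpose to the vector of ones; both inverses are nonnegative, which is the link through condition (III) of Theorem 4.D.1 used in the proof.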

b. INPUT-OUTPUT ANALYSIS
Let $A$ be an input-output matrix so that $a_{ij}$ denotes the amount of the $i$th good
necessary to produce one unit of the $j$th good. Obviously $A \geq 0$. Let $c$ and $x$ be the
final demand vector and the output vector, respectively. The basic relation of
(static) input-output analysis is written as

$$[I - A]\cdot x = c$$

Let $A$ be computed for a particular year and let $x$ and $c$ be obtained for the same
year as well. Clearly $c > 0$ and $x > 0$, and all the off-diagonal elements of $[I - A]$
are nonpositive. Hence condition (I) of Theorem 4.D.1 is satisfied for $[I - A]$.
Suppose that this technology matrix $A$ is expected to be fairly constant for some
years. Then by predicting the final demand vector $c_f$ for some future year, we
can easily compute the output vector for that particular year as $x_f = [I - A]^{-1}\cdot c_f$.
In order to apply the above theorems, we consider the following two questions
mentioned before.
(i) For any $c_f \geq 0$, does there exist an $x_f \geq 0$ such that $[I - A]\cdot x_f = c_f$?
(ii) Is $[I - A]$ nonsingular? If so, is $[I - A]^{-1} \geq 0$?

By use of Theorem 4.D.1, we can immediately answer both these questions
in the affirmative. Questions (i) and (ii) are nothing but conditions (II) and (III) of
Theorem 4.D.1, respectively. By Theorem 4.D.2, we must also have $1 > \lambda_A$, where
$\lambda_A$ is the Frobenius root of $A$. Conversely, if $1 > \lambda_A$, we can answer questions (i)
and (ii) in the affirmative. Thus $1 > \lambda_A$ offers a characterization of this problem in
terms of the matrix $A$.
Suppose that some elements of the final demand vector $c$ (for the year in
which $A$ is estimated) are zero. In other words, $c \geq 0$ with $c \neq 0$ (instead of $c > 0$). This is
possible if certain goods are used only as intermediate goods. Can we again answer
questions (i) and (ii) above in the affirmative? The questions can be answered "yes"
by referring to Theorem 4.D.2. In other words, if $A$ is indecomposable, we can say
that, for any $c_f \geq 0$ with $c_f \neq 0$, there exists an $x_f > 0$ such that $[I - A]\cdot x_f = c_f$ and that
$[I - A]$ is nonsingular with $[I - A]^{-1} > 0$.
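
The two cases just discussed can be sketched in a few lines of Python. The coefficient matrix and the final demand vectors below are illustrative assumptions (a two-sector economy with all coefficients positive, hence indecomposable), and `solve_leontief_2x2` is a hypothetical helper that solves the basic relation by Cramer's rule.

```python
from fractions import Fraction as Fr

def solve_leontief_2x2(A, c):
    """Solve [I - A].x = c for a 2x2 coefficient matrix A by Cramer's rule."""
    b11, b12 = 1 - A[0][0], -A[0][1]
    b21, b22 = -A[1][0], 1 - A[1][1]
    d = b11 * b22 - b12 * b21
    if d == 0:
        raise ValueError("[I - A] is singular")
    x1 = (c[0] * b22 - b12 * c[1]) / d
    x2 = (b11 * c[1] - b21 * c[0]) / d
    return [x1, x2]

# Illustrative input coefficients; the Frobenius root of this A is 1/2 < 1.
A = [[Fr(1, 5), Fr(3, 10)],
     [Fr(2, 5), Fr(1, 10)]]

# Strictly positive final demand.
x_pos = solve_leontief_2x2(A, [Fr(10), Fr(20)])

# Final demand with a zero component: good 1 is purely intermediate.
x_semi = solve_leontief_2x2(A, [Fr(0), Fr(6)])

print(x_pos, x_semi)
```

Even when the final demand for the first good is zero, both outputs are strictly positive, because the second industry needs the first good as an input; this is the role played by indecomposability.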
Suppose that there exists an $x \geq 0$ such that $[I - A]\cdot x = c$ for some $c > 0$. Then
from Theorem 4.D.4, this is true if and only if there exists a $p \geq 0$ such that
$[I - A]'\cdot p > 0$. In order to understand the economic significance of this
statement, let us suppose that this productive system is realized in a competitive
equilibrium. Then we may suppose that a set of prices $p_i \geq 0$, $i = 1, 2, \ldots, n$, will be
established for the $n$ goods. Then $\sum_{i=1}^{n} p_i a_{ij}$ constitutes the payment by the $j$th
industry for the goods used to produce one unit of the $j$th good (that is, "raw
material cost"). Since each industry presumably uses some primary factors (such
as labor) for the production, we must have

$$p_j - \sum_{i=1}^{n} p_i a_{ij} > 0 \quad (j = 1, 2, \ldots, n)$$

or

$$[I - A]'\cdot p > 0 \ \text{ if } x > 0, \ \text{where } [I - A]' \text{ is the transpose of } [I - A]$$

Conversely, suppose that $[I - A]'\cdot p > 0$ for some $p$. Then by paying an
amount equal to $[I - A]'\cdot p$ for the primary factors, we can achieve positive
output $x > 0$ under this competitive equilibrium. This illustrates a use of Theorem
4.D.4 in input-output analysis.
Above we observed that $[I - A]'\cdot p$ represents the payments for the primary
factors. In fact, it represents the vector of the value added per unit of output in each
industry. Suppose there is only one kind of primary factor, "labor," and let us
denote the amount of labor necessary to produce one unit of the $j$th good by $l_j$.
Assume $l_j$ is constant for all $j = 1, 2, \ldots, n$. Cases in which there are more than one
primary factor can be analyzed analogously and hence are omitted from the
subsequent discussion. Let us suppose the competitive condition holds so that wages
in each industry are equalized. Denote this wage by $w$. The cost of production per
unit output of the $j$th good is $v_j \equiv \left[\sum_{i=1}^{n} p_i a_{ij} + w l_j\right]$. The "profit" per unit output
of the $j$th good is simply $[p_j - v_j]$. Let $\pi_j \equiv [p_j - v_j]/v_j$. Then $\pi_j$ is the rate of
profit per unit of "working capital" in the $j$th industry. So far $\pi_j$ can be different
from industry to industry. Under the competitive condition $\pi_j$ will be equalized
for all industries. [In the long run, $\pi_j$ will be zero, as noted by Walras. Here
we are also considering the intermediate step (that is, $\pi > 0$) to this "Walrasian
long run".] Then write $\pi \equiv \pi_1 = \pi_2 = \cdots = \pi_n$. Let $l$ denote the $n$-vector whose
$j$th component is $l_j$. Then from the definition of $\pi_j$ and the condition on the $\pi_j$'s, we
obtain

$$[I - A]'\cdot p - wl = \pi[A'\cdot p + wl]$$

or

$$p = (1 + \pi)[A'\cdot p + wl]$$

Assume $\pi \geq 0$; then $1 + \pi \neq 0$. Hence $[\rho I - A']\cdot p = wl$ where $\rho \equiv 1/(1 + \pi)$.
From Theorem 4.D.2, a necessary and sufficient condition that there exists a $p \geq 0$
such that this relation holds for any $wl \geq 0$ is simply $\rho > \lambda_A$, where $\lambda_A$ is the
Frobenius root of $A$ (hence also of $A'$). This means

$$\frac{1}{1 + \pi} > \lambda_A$$

Under this condition, $[\rho I - A']$ is nonsingular and $[\rho I - A']^{-1} \geq 0$ as a result of
condition (III') of Theorem 4.D.2. Then $p$ is explicitly obtained as

$$p = [\rho I - A']^{-1}\cdot wl \quad \text{for a given } w$$
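
The price formula can be traced through on a small example. In the Python sketch below, the coefficient matrix, the labor coefficients, the wage, and the uniform profit rate are all illustrative assumptions, and `competitive_prices` is a hypothetical helper for the 2 x 2 case. The script solves for prices and then recomputes the profit rate of each industry from the definitions of unit cost and profit given above.

```python
from fractions import Fraction as Fr

def competitive_prices(A, labor, wage, profit_rate):
    """Solve [rho*I - A'].p = wage*labor with rho = 1/(1 + profit_rate), 2x2 case."""
    rho = 1 / (1 + profit_rate)
    # Entries of rho*I - A', where A' is the transpose of A.
    b11, b12 = rho - A[0][0], -A[1][0]
    b21, b22 = -A[0][1], rho - A[1][1]
    d = b11 * b22 - b12 * b21
    r1, r2 = wage * labor[0], wage * labor[1]
    # Cramer's rule.
    return [(r1 * b22 - b12 * r2) / d, (b11 * r2 - b21 * r1) / d]

A = [[Fr(1, 5), Fr(3, 10)],       # illustrative; Frobenius root 1/2
     [Fr(2, 5), Fr(1, 10)]]
labor = [Fr(3, 10), Fr(3, 5)]     # labor needed per unit of each good
wage = Fr(1)
profit_rate = Fr(1, 4)            # rho = 4/5, which exceeds 1/2

p = competitive_prices(A, labor, wage, profit_rate)

# Unit cost: raw material cost plus wage cost, then the implied profit rates.
unit_cost = [sum(p[i] * A[i][j] for i in range(2)) + wage * labor[j]
             for j in range(2)]
implied_rates = [(p[j] - unit_cost[j]) / unit_cost[j] for j in range(2)]

print(p, implied_rates)
```

The recomputed profit rates coincide with the assumed uniform rate, which confirms that the solved prices satisfy the pricing equation for this example.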

c. THE EXPENDITURE LAG INPUT-OUTPUT ANALYSIS

In an earlier study of input-output analysis, Solow [21] considered the
following system of difference equations:

$$x(t) = A\cdot x(t - 1) + c$$

where $A = [a_{ij}]$ is an $(n \times n)$ matrix and $x(t)$ is the output vector in period $t$. This
system may be justified by the assumption that the demand for the $i$th good by the
$j$th industry in period $t$ [that is, $x_{ij}(t)$] is proportional to this industry's sales
(= output) in period $t - 1$ [that is, $x_j(t - 1)$]. Here $a_{ij}$ is simply defined by
$a_{ij} \equiv x_{ij}(t)/x_j(t - 1)$. It should be noted that the meaning of $a_{ij}$ here is slightly different
from that of the input-output (production) coefficient in the ordinary sense as
discussed above. Here $A$ denotes the expenditure relations in this model.⁵
Consider the stationary state in which $x(t) = x(t - 1) = x^*$ for all $t$. Then we
have $x^* = A\cdot x^* + c$, or $[I - A]\cdot x^* = c$. Two questions immediately arise.

(i) Does $x(t) \to x^*$ as $t \to \infty$?

(ii) Is $[I - A]$ nonsingular and is $[I - A]^{-1} \geq 0$?

The first question is the problem of stability and the second question is the
problem of the existence of a nonnegative solution. From Theorem 4.D.1, we know
immediately that $[I - A]$ is nonsingular and $[I - A]^{-1} \geq 0$ if and only if $1 > \lambda_A$,
where $\lambda_A$ is the Frobenius root of $A$.
In order to understand the stability question, let us carry out the following
successive substitutions:

$$\begin{aligned}
x(1) &= A\cdot x(0) + c \\
x(2) &= A\cdot x(1) + c = A^2\cdot x(0) + (I + A)\cdot c \\
x(3) &= A\cdot x(2) + c = A^3\cdot x(0) + (I + A + A^2)\cdot c \\
&\;\;\vdots \\
x(t) &= A\cdot x(t - 1) + c = A^t\cdot x(0) + (I + A + \cdots + A^{t-1})\cdot c
\end{aligned}$$

Then from the remark following Theorem 4.D.2, we can say that $A^t \to 0$ as $t \to \infty$
if and only if $1 > \lambda_A$, and that $\sum_{k=0}^{\infty} A^k\cdot c$ is convergent and equal to $[I - A]^{-1}\cdot c$
$(= x^*)$. Therefore $x(t) \to [I - A]^{-1}\cdot c$ as $t \to \infty$ if and only if $1 > \lambda_A$. Hence we
see that the stability question and the problem of the existence of a nonnegative
solution are really equivalent.
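
The successive substitutions above amount to iterating the difference equation, which is straightforward to simulate. The Python sketch below does this for two illustrative 2 x 2 matrices, one with Frobenius root below one and one with every row sum above one (so that its Frobenius root exceeds one). The matrices, the constant vector, the starting points, and the function name are assumptions made for the example.

```python
def iterate_expenditure_lag(A, c, x0, periods):
    """Apply x(t) = A.x(t-1) + c for the given number of periods."""
    x = list(x0)
    n = len(x)
    for _ in range(periods):
        x = [sum(A[i][j] * x[j] for j in range(n)) + c[i] for i in range(n)]
    return x

c = [10.0, 20.0]

# Stable case: the Frobenius root of A_stable is 0.5 < 1.
A_stable = [[0.2, 0.3], [0.4, 0.1]]
x_star = [25.0, 100.0 / 3.0]          # solves [I - A].x = c for A_stable
from_zero = iterate_expenditure_lag(A_stable, c, [0.0, 0.0], 200)
from_far = iterate_expenditure_lag(A_stable, c, [500.0, 1.0], 200)
gap = max(abs(a - b) for path in (from_zero, from_far)
          for a, b in zip(path, x_star))

# Unstable case: every row sum of A_unstable is at least 1.2.
A_unstable = [[0.9, 0.6], [0.5, 0.7]]
exploding = iterate_expenditure_lag(A_unstable, c, [0.0, 0.0], 200)

print(from_zero, from_far, exploding)
```

In the stable case both starting points lead to the same stationary output vector, and in the unstable case the iterates grow without bound, in line with the equivalence stated above.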

d. MULTICOUNTRY INCOME FLOWS

Another application of our theorems is in the stability problem of the (naive)
Keynesian multiplier model with a multisector specification (see Chipman [3],
Metzler [8] and [9], and Morishima [11], for example). The "multisector" model
can be a "multi-industry" model or a "multicountry" model. Here we will
illustrate this application in terms of a multicountry model of income flows. Let there be
$n$ countries in the world and let $U_{ij}(t)$ be country $j$'s demand for country $i$'s goods
for consumption purposes in period $t$. Let $V_{ij}(t)$ be country $j$'s demand for country
$i$'s goods for investment purposes in period $t$. Let $Y_j(t)$ be the national income of
country $j$ in period $t$. We assume the following expenditure lag model.

$$U_{ij}(t) = \alpha_{ij}Y_j(t - 1) + u_{ij}, \quad i, j = 1, 2, \ldots, n$$

$$V_{ij}(t) = \beta_{ij}Y_j(t - 1) + v_{ij}, \quad i, j = 1, 2, \ldots, n$$

where $\alpha_{ij}$, $u_{ij}$, $\beta_{ij}$, $v_{ij}$ are all constants. We then assume the following equilibrium
relations.

$$Y_i(t) = \sum_{j=1}^{n} U_{ij}(t) + \sum_{j=1}^{n} V_{ij}(t), \quad i = 1, 2, \ldots, n$$

Write $\alpha_{ij} + \beta_{ij} \equiv a_{ij}$ and $\sum_{j=1}^{n}(u_{ij} + v_{ij}) \equiv c_i$. Then the above relations can be
combined and written as

$$Y_i(t) = \sum_{j=1}^{n} a_{ij}Y_j(t - 1) + c_i, \quad i = 1, 2, \ldots, n$$

or in matrix-vector notation,

$$Y(t) = A\cdot Y(t - 1) + c, \quad \text{where } A = [a_{ij}]$$

This is exactly the same equation as that for the Leontief expenditure lag type
model discussed above. As we observed there, the following three conditions are
all equivalent.

(i) $Y(t) \to [I - A]^{-1}\cdot c$ as $t \to \infty$.
(ii) $[I - A]$ is nonsingular and $[I - A]^{-1} \geq 0$.
(iii) $1 > \lambda_A$, where $\lambda_A$ is the Frobenius root of $A$.

More complicated models are discussed by Morishima [11] and others.

e. A SIMPLE DYNAMIC LEONTIEF MODEL

Above we considered a dynamic model of input-output analysis based on an
expenditure lag. We noted there that $A$ is no longer a technological input-output
matrix unless a certain restrictive assumption is made. Another way to dynamize
the static Leontief model is to introduce a production lag (instead of an
expenditure lag). Then $A$, in the new dynamized model, represents the technological
matrix. We will discuss a simple version of such a model. Let us suppose that the
production of each good takes one period. Production periods will vary from good
to good in an actual economy, but by defining one period as the largest common
divisor of the production periods of all the goods, we may suppose that the
production period for every good is equal to one (see Chapter 6, Section A). Let $x_i(t)$
be the output of the $i$th good in the $t$th period. At the beginning of the $t$th period
[that is, at the end of the $(t - 1)$th period], the amount of the $i$th good available
is $x_i(t - 1)$. This is used for production in the $t$th period. Hence we have, for
$t = 1, 2, \ldots$,

$$x_i(t - 1) = \sum_{j=1}^{n} a_{ij}x_j(t) + c_i(t), \quad \text{or} \quad x(t - 1) = A\cdot x(t) + c(t)$$

Here $a_{ij}$ is the input-output coefficient in the ordinary sense. We now ask whether
there exists a balanced growth solution to the above system, starting from an
arbitrary $c(0)$, where balanced growth means, for $t = 1, 2, \ldots$,

$$x_i(t) = \alpha x_i(t - 1), \ \text{and} \ c_i(t) = \alpha c_i(t - 1), \quad i = 1, 2, \ldots, n$$

Here $\alpha$ is a positive constant called the growth factor, and $\alpha > 1$ means "growth"
and $\alpha < 1$ means "decay" of the economy. Write $\alpha = 1 + \gamma$; then $\gamma$ is called the
growth rate. Substitute the above balanced growth relation into the original system
of difference equations to obtain, for $t = 0, 1, 2, \ldots$,

$$[\rho I - A]\cdot x(t) = c(t), \quad \text{where } \rho \equiv 1/\alpha$$

When $t = 0$, $[\rho I - A]\cdot x(0) = c(0)$. And $x(t) = \alpha^t x(0)$ and $c(t) = \alpha^t c(0)$ along
this balanced growth path. A necessary and sufficient condition for the existence
of a solution $x(0) \geq 0$ for any $c(0) \geq 0$ is given by Theorem 4.D.2 as $\rho > \lambda_A$ (or
$1/\alpha > \lambda_A$), where $\lambda_A$ is the Frobenius root of $A$ [(II') $\Leftrightarrow$ (VI')]. Hence $\rho > \lambda_A$
gives a necessary condition for the existence of a balanced growth solution starting
from an arbitrary initial $c(0) \geq 0$. Conversely, if $\rho > \lambda_A$, we can also obtain the
above balanced growth solution. Hence a necessary and sufficient condition for
the existence of a balanced growth path with growth factor $\alpha$ is $1/\alpha > \lambda_A$. Note that
if $c(0) = 0$ and $1/\alpha > \lambda_A$, then $c(t) = 0$ and $x(t) = 0$ for all $t$. In order to achieve
$x(t) > 0$ for all $t$, we need the indecomposability of $A$. Suppose that for some
$c(0) \geq 0$ with $c(0) \neq 0$ there exists an $x(0) \geq 0$ such that $[\rho I - A]\cdot x(0) = c(0)$. Then condition
(VII') of Theorem 4.D.2 is satisfied, provided that $A$ is indecomposable. Then from
condition (VIII'), $[\rho I - A]$ is nonsingular and $[\rho I - A]^{-1} > 0$. Hence $x(0) > 0$
with $c(0) \geq 0$, $c(0) \neq 0$, so that $x(t) = \alpha^t x(0) > 0$ for all $t$. In these considerations it is
important to realize that the initial output vector cannot be arbitrary; it must be
equal to $[\rho I - A]^{-1}\cdot c(0)$.

Now suppose that the households form an industry with labor as its output.
Labor is now considered a good rather than a primary factor, and thus $c(t) = 0$
for all $t$. Then balanced growth with a positive growth factor in the above dynamic
system is possible if and only if there exists a $\rho > 0$ such that $[\rho I - A]\cdot x(0) = 0$.
If $A$ is indecomposable, then we know that there exists a unique $\lambda_A > 0$ and an
$x(0) > 0$ such that $A\cdot x(0) = \lambda_A x(0)$. Then set $\lambda_A = \rho$. In other words, if $A$ is
indecomposable, there exists a unique balanced growth path whose growth factor
is equal to $1/\lambda_A$, where $\lambda_A$ is the Frobenius root of $A$. Note that we cannot choose
the initial $x(0)$ arbitrarily.
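
The closed case (no final demand) can be illustrated by approximating the Frobenius root and its eigenvector numerically. The Python sketch below uses simple power iteration, which is an assumption of this example rather than a method from the text, and which is only guaranteed to converge for a primitive matrix such as the illustrative one used here. The function name and the matrix are hypothetical.

```python
def frobenius_root(A, iterations=200):
    """Approximate the Frobenius root and a positive eigenvector of a
    nonnegative matrix by power iteration (assumes the iteration converges)."""
    n = len(A)
    v = [1.0 / n] * n                 # start from a positive vector summing to one
    root = 0.0
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        root = sum(w)                 # valid because the entries of v sum to one
        v = [wi / root for wi in w]   # renormalize so the entries sum to one
    return root, v

# Illustrative indecomposable matrix; its eigenvalues are 0.6 and -0.1.
A = [[0.5, 0.2],
     [0.3, 0.0]]

root, x0 = frobenius_root(A)
growth_factor = 1.0 / root

# Along the balanced path x(1) = growth_factor * x(0); check x(0) = A.x(1).
x1 = [growth_factor * e for e in x0]
back = [sum(A[i][j] * x1[j] for j in range(2)) for i in range(2)]
residual = max(abs(b - e) for b, e in zip(back, x0))

print(root, x0, growth_factor)
```

The initial output proportions are pinned down by the eigenvector (here two units of the first good for each unit of the second), which is the sense in which the initial output vector cannot be chosen arbitrarily.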
The basic weakness in the above dynamic analysis is that there is no con-
sideration of the stock of goods. A proper treatment of this will give rise to the so-
called "dynamic Leontief model," which we discuss in Chapter 6. There we show
that there exists a balanced growth path, corresponding to the Frobenius root,
and we will discuss the conditions for the "convergence" to this path starting
from an arbitrary given initial stock of goods.

f. STABILITY OF COMPETITIVE EQUILIBRIUM

Let $f_i(p)$, $i = 0, 1, \ldots, n$, be the excess demand function for the $i$th good
in a competitive trading system. As in Chapter 3, we consider the following system
of dynamic adjustment equations:

$$\frac{dp_i(t)}{dt} = f_i[p(t)], \quad i = 0, 1, \ldots, n$$

where $p(t) \equiv [p_0(t), p_1(t), \ldots, p_n(t)]$. Let $\hat{p}$ be an isolated equilibrium price
so that (i) $f_i(\hat{p}) = 0$, $i = 0, 1, \ldots, n$, and (ii) $\hat{p}$ is unique in a certain open ball
about $\hat{p}$. Assume $\hat{p} > 0$. Assume also that the $f_i$'s are differentiable, and expand
the above relation about $\hat{p}$ according to Taylor's expansion. Then taking only
the first-order term of the Taylor series, we obtain

$$\frac{dp_i(t)}{dt} = \sum_{j=0}^{n} a_{ij}[p_j(t) - \hat{p}_j], \quad i = 0, 1, \ldots, n$$

where $a_{ij} \equiv \partial f_i/\partial p_j$ evaluated at $p = \hat{p}$.
The question is whether or not $p_j(t) \to \hat{p}_j$ as $t \to \infty$ for all $j$. Since the above
system is linear, a global solution $p(t)$ always exists for all $t \geq 0$. Assume the
solution $p(t)$ remains positive for all $t \geq 0$. But owing to the linear approximation
procedure, the stability of the above system does not establish global stability,
although it does establish local stability. Write

$$q_i(t) \equiv p_i(t) - \hat{p}_i, \quad i = 0, 1, \ldots, n$$

Then the above system can be rewritten as

$$\frac{dq_i(t)}{dt} = \sum_{j=0}^{n} a_{ij}q_j(t), \quad i = 0, 1, \ldots, n$$
We impose the following two assumptions:

(A-1) (Walras' Law) $\sum_{i=0}^{n} f_i(p)p_i = 0$ (for all $p$).
(A-2) (Homogeneity) $f_i(p)$ is positively homogeneous of degree zero (for all $p$),
$i = 0, 1, \ldots, n$.
In view of Walras' Law, if $f_i(p) = 0$, $i = 1, 2, \ldots, n$, then $f_0(p) = 0$. By
homogeneity, we may set one of the prices, say $p_0$, equal to unity (numeraire).
Therefore, the equilibrium is described by

$$f_i(\hat{p}) = 0, \quad i = 1, 2, \ldots, n$$

and the linear system of dynamic adjustment is written as

$$\frac{dp_i(t)}{dt} = \sum_{j=1}^{n} a_{ij}[p_j(t) - \hat{p}_j], \quad i = 1, 2, \ldots, n$$

or

$$\frac{dq_i(t)}{dt} = \sum_{j=1}^{n} a_{ij}q_j(t), \quad i = 1, 2, \ldots, n$$

since $p_0(t) = \hat{p}_0 = 1$ for all $t$. In matrix notation, we may rewrite this as

$$\frac{dq(t)}{dt} = A\cdot q(t)$$
where $q(t) \equiv [q_1(t), \ldots, q_n(t)]$ and $A = [a_{ij}]$, $i, j = 1, 2, \ldots, n$. Our problem
is to ascertain whether $q(t) \to 0$ as $t \to \infty$ in the above system of differential
equations. Needless to say, $p(t) \to \hat{p}$ if and only if $q(t) \to 0$. From the elementary
theory of differential equations (see Chapter 3, Section B), it is known that
$q(t) \to 0$ as $t \to \infty$ if and only if the real parts of all the eigenvalues of $A$ are negative.
Hence the question is reduced to finding the condition that will guarantee that
the real parts of all the eigenvalues of $A$ are negative. If $A$ is symmetric, then
negative definiteness gives this condition; this result is given by Samuelson (see
Chapter 3, Section C). However, the symmetry of $A$ is hard to justify. Metzler
observed that if $a_{ij} > 0$ ($i \neq j$), then this condition on the eigenvalues is equivalent
to Hicks' condition (that is, $A$ is "Hicksian"). In 1958, Arrow and Hurwicz
[1], Hahn [4], and Negishi [15] independently proved that if $a_{ij} > 0$, $i \neq j$ (gross
substitutability), then this condition on the eigenvalues follows. In other words,
linear approximation stability is implied by the strong gross substitutability
condition. The essential idea was to utilize such economic laws as Walras' Law and
the homogeneity of the excess demand functions.
Instead of gross substitutability, we impose a slightly weaker condition.
(A-3) $a_{ij} \geq 0$ for all $i \neq j$, where $a_{ij} \equiv \partial f_i/\partial p_j$ evaluated at $\hat{p}$ ($i, j = 0, 1, \ldots, n$).
This assumption is known as the weak gross substitutability assumption.
We are now ready to prove the (linear approximation) stability of the
normalized system.

Theorem 4.D.5: Under (A-3), the normalized system $\dot{q}(t) = A\cdot q(t)$ is stable, that
is, $q(t) \to 0$ as $t \to \infty$, provided that either one of the following conditions holds.⁶

(i) Assumption (A-1) and $a_{0j} > 0$, $j = 1, 2, \ldots, n$.

(ii) Assumption (A-2) and $a_{i0} > 0$, $i = 1, 2, \ldots, n$.

PROOF:

(i) First use (A-1) (that is, Walras' Law) and $a_{0j} > 0$ for $j \neq 0$. From (A-1)
we have⁷ $\sum_{i=1}^{n} \hat{p}_i a_{ij} = -a_{0j} < 0$ for all $j$. Hence condition (I") of Theorem
4.D.3 is satisfied for $A'$, where $A'$ is the transpose of $A$, which implies (I")
also holds for $A$ from Theorem 4.D.4. Hence condition (V") of Theorem
4.D.3 holds, which in turn implies the stability.
(ii) Now use (A-2) (that is, homogeneity) and $a_{i0} > 0$ for $i \neq 0$. The
homogeneity implies $\sum_{j=0}^{n} a_{ij}\hat{p}_j = 0$ (Euler's equation). Or $\sum_{j=1}^{n} a_{ij}\hat{p}_j = -a_{i0}\hat{p}_0 = -a_{i0} < 0$
for all $i$. Hence condition (I") of Theorem 4.D.3 is satisfied
for $A$, which implies that condition (V") also holds for $A$. This again
establishes stability. (Q.E.D.)
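
A rough numerical illustration of case (i) is given below in Python. The matrix `A`, the equilibrium prices, the starting deviation, and the step size are all assumptions made for the example, and explicit Euler integration is used only as a simple approximation to the differential equation system, not as a method from the text.

```python
def simulate_linear_adjustment(A, q0, step, n_steps):
    """Explicit Euler integration of dq/dt = A.q (a rough numerical sketch)."""
    q = list(q0)
    n = len(q)
    for _ in range(n_steps):
        dq = [sum(A[i][j] * q[j] for j in range(n)) for i in range(n)]
        q = [q[i] + step * dq[i] for i in range(n)]
    return q

# Illustrative matrix with nonnegative off-diagonal elements, as in (A-3).
A = [[-0.8, 0.3],
     [0.4, -0.9]]
p_hat = [1.0, 1.0]     # assumed equilibrium prices of goods 1 and 2

# Walras' Law at equilibrium gives sum_i p_hat_i * a_ij = -a_0j, so the
# implied cross effects on the numeraire good are:
a_0 = [-sum(p_hat[i] * A[i][j] for i in range(2)) for j in range(2)]

# Integrate from an arbitrary initial deviation over 40 units of time.
q_end = simulate_linear_adjustment(A, [1.0, -2.0], 0.01, 4000)

print(a_0, q_end)
```

Both implied coefficients for the numeraire are positive, so condition (i) of the theorem applies, and the simulated deviation from equilibrium shrinks toward zero.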

REMARK: The above proofs are not exactly the same as those of Negishi
and Hahn. This is understandable, for, at the time of their proofs, the
structure of Theorem 4.D.3 was not well recognized. However, our proofs are
essentially based on their ideas. Note also that we only required $a_{ij} \geq 0$
($i \neq j$) (called the weak gross substitute case) instead of $a_{ij} > 0$ ($i \neq j$).
REMARK: As remarked above, Metzler [9] proved that under gross
substitutability, the condition for dynamic stability (that is, the eigenvalues
with negative real parts) is equivalent to the Hicksian condition (that is, the
alternating sign of the successive principal minors). In other words, Metzler
proved part of Theorem 4.D.3 by establishing the equivalence between
conditions (V") and (IV"). The Hahn-Negishi theorem indicated the
connection between conditions (V") and (I") of Theorem 4.D.3.
We now prove the global stability of the nonnormalized differential equation
system, $\dot{p} = f(p)$. For the other systems (normalized system or difference equation
system), the reader can attempt a similar proof. The essential idea is to use
Liapunov's second method. (For Liapunov's second method for a difference
equation system, see Kalman and Bertram [5].) We will use our theorems to show
$\dot{V} < 0$ (or $\Delta V < 0$ for the difference equation system), where $V$ is Liapunov's
function (as described in Chapter 3, Section H).

Theorem 4.D.6: Let $p(t; p^0)$ be the solution price vector for the system $\dot{p}(t) = f[p(t)]$,
with the initial condition $p(0) = p^0$. Assume Walras' Law (A-1), homogeneity
(A-2), and
(A-3') $f_{ij}(p)$ $(\equiv \partial f_i(p)/\partial p_j) \geq 0$ for all $i \neq j$ $(i, j = 0, 1, \ldots, n)$ and for all $p$.
Then $p(t; p^0) \to \hat{p}$ as $t \to \infty$, where $f(\hat{p}) = 0$, regardless of the initial point $p^0$.

REMARK: Clearly, (A-3') is a stronger assumption than (A-3). In the above
theorem, it is tacitly assumed that the equilibrium price vector $\hat{p}$ is unique up
to a positive scalar multiple, which is certainly the case if (A-3') is
strengthened to the following: $f_{ij}(p) > 0$ for all $i \neq j$ and for all $p$ (see Lemma 3.E.2).
The following proof is, in essence, due to McKenzie [6].

PROOF: We use Liapunov's second method. Define $V[f(p)] \equiv \frac{1}{2}\sum_{i \in J} f_i^2(p)$,
where $f(p) \equiv [f_0(p), \ldots, f_n(p)]$ and $J \equiv \{i : f_i(p) > 0\}$. In other words, $J$
is the set of indices for goods whose excess demands are positive. Obviously
the set $J$ changes as prices move from time to time. We shall show that $V$ is a
Liapunov function. First note that $V(f) > 0$ if $f \neq 0$, so that $V(f) = 0$ if and
only if $f = 0$.⁸ By Lemma 3.E.1, $\|p(t)\| = \|p(0)\|$ for all $t$, so that we can
confine our attention to such a path of $p(t)$. This in particular implies that
the equilibrium price vector $\hat{p}$ is unique in which $\|\hat{p}\| = \|p(0)\|$. Also
$V(f) > 0$ for all $p(t) \neq \hat{p}$. We now show that $\dot{V}(f) < 0$ for all $p(t) \neq \hat{p}$.
Consider

$$\frac{dV[p(t)]}{dt} = \sum_{i \in J} f_i \sum_{j=0}^{n} f_{ij}\dot{p}_j = \sum_{i \in J}\sum_{j=0}^{n} f_i f_{ij} f_j = \sum_{i \in J}\sum_{j \in J} f_i f_{ij} f_j + \sum_{i \in J}\sum_{j \notin J} f_i f_{ij} f_j$$

where $f_{ij} \equiv f_{ij}(p) = \partial f_i(p)/\partial p_j$ (that is, evaluated at $p$ rather than $\hat{p}$). From
(A-3') ($f_{ij} \geq 0$, $i \neq j$) and $f_j \leq 0$ for $j \notin J$, we have $\sum_{i \in J}\sum_{j \notin J} f_i f_{ij} f_j \leq 0$. We
now show that $\sum_{i \in J}\sum_{j \in J} f_i f_{ij} f_j < 0$. From the definition of $J$, $f_i > 0$ for $i \in J$.
By Walras' Law, $\sum_{i=0}^{n} p_i f_{ij} = -f_j$, $j = 0, 1, \ldots, n$. Hence $\sum_{i \in J} p_i f_{ij} \leq -f_j < 0$
for $j \in J$. Therefore $F_J'\cdot p_J < 0$, where $F_J'$ is the transpose of $F_J \equiv [f_{ij}]$, $i, j \in J$;
$p_J$ is the vector $[p_j]$ where $j$ is taken from $J$. Also by homogeneity,
$\sum_{j \in J} f_{ij}p_j + \sum_{j \notin J} f_{ij}p_j = 0$ for all $i$, so that $\sum_{j \in J} f_{ij}p_j \leq 0$ for $i \in J$, or $F_J\cdot p_J \leq 0$.
Consider $F_J^* \equiv (F_J + F_J')$. Then $F_J^*\cdot p_J < 0$, so that condition (I") of
Theorem 4.D.3 is satisfied for $F_J^*$. Hence the real parts of all the eigenvalues
of $F_J^*$ are negative as a result of condition (V"). Since $F_J^*$ is symmetric, the
eigenvalues of $F_J^*$ are all real,⁹ so that they are all negative. Hence $F_J^*$ is
negative definite,¹⁰ that is, $x\cdot F_J^*\cdot x < 0$ for all $x \neq 0$. Since $x\cdot F_J^*\cdot x =
2x\cdot F_J\cdot x$, this implies $x\cdot F_J\cdot x < 0$ for all $x \neq 0$. Hence $f_J\cdot F_J\cdot f_J < 0$, where
$f_J$ is the vector $[f_j]$ with $j$ taken from $J$. From
Walras' Law, $f_J$ and $f$ are zero together, where $f \equiv (f_0, \ldots, f_n)$. Hence we have
established $\dot{V}[f(p)] < 0$ for $f \neq 0$. Hence $f \to 0$ as $t \to \infty$. (Q.E.D.)

REMARK: The proof for the normalized system is similar. Or, alternatively,
use $\hat{p}\cdot f(p) > 0$ (Lemma 3.E.3); that is, define $V \equiv \sum_i (p_i - \hat{p}_i)^2$ so that $\dot{V} =
2(p - \hat{p})\cdot\dot{p} = 2(p - \hat{p})\cdot f(p) < 0$ for all $p \neq \hat{p}$ by Lemma 3.E.3 and (A-1).
Here we normalize $p$ by $\sum_i p_i^2(t) = 1$ for all $t$.

g. COMPARATIVE STATICS
Suppose that a certain economic system is described by

$$f_i(x_1, x_2, \ldots, x_n) = 0, \quad i = 1, 2, \ldots, n$$

or simply

$$f(x) = 0$$

Such a system can describe the set of certain equilibrium relations or the set of
certain optimization conditions. An example of the former interpretation is that
$f_i$ and $x_i$ are respectively taken as the excess demand for and the price of the $i$th
commodity. The value of $x$ (say, $\hat{x}$) which satisfies the above relations $f(x) = 0$ is
called the equilibrium value of $x$. Clearly the equilibrium value of $x$ is not
necessarily unique.
In order to consider a shift of the above system, rewrite the system as

$$f_i(x_1, \ldots, x_n; \alpha, \beta, \ldots) = 0, \quad i = 1, 2, \ldots, n$$

or simply $f(x; \alpha, \beta, \ldots) = 0$, where all the second partial derivatives of $f_i$ $(i = 1, 2,
\ldots, n)$ with respect to $x_1, x_2, \ldots, x_n$ and $\alpha, \beta, \ldots$ are assumed to exist and be
continuous in the domain. Here $\alpha, \beta, \ldots$ are the shift parameters or exogenous variables.
The equilibrium value of $x$ clearly depends on the values of $\alpha, \beta, \ldots$. The concept of
an equilibrium value of $x$ is meaningful only after the values of $\alpha, \beta, \ldots$ are
specified, say, as $\bar{\alpha}, \bar{\beta}, \ldots$. If the $f_i$'s are "well-behaved," then we can write such
dependence relations as

$$x_i = x_i(\alpha, \beta, \ldots), \quad i = 1, 2, \ldots, n$$

or simply $x = x(\alpha, \beta, \ldots)$, where the function $x$ is continuously differentiable in
some region $S$ about $(\bar{\alpha}, \bar{\beta}, \ldots)$, and

$$f[x(\alpha, \beta, \ldots); \alpha, \beta, \ldots] = 0, \quad \text{for all } (\alpha, \beta, \ldots) \text{ in the region } S$$

Obviously the $f_i$'s are not so "well-behaved" in general, so that an explicit
functional relation such as $x = x(\alpha, \beta, \ldots)$ may not be obtained, either locally or
globally. A set of mathematical conditions guaranteeing the "local" possibility is
known as the implicit function theorem.¹¹ In this case $S$ is some neighborhood of
$(\bar{\alpha}, \bar{\beta}, \ldots)$, which can be very small.
Comparative statics is concerned with the effect of a change in one or more of
the shift parameters $\alpha, \beta, \ldots$ on the equilibrium value of $x$. The comparative statics
analysis can be local or global, depending on whether the region $S$ is large [for
example, so that it covers all possible values of $(\alpha, \beta, \ldots)$] or $S$ is confined to a
certain (small) neighborhood of $(\bar{\alpha}, \bar{\beta}, \ldots)$. The local analysis in connection with
the classical optimization theory was discussed in the Appendix to Section F,
Chapter 1.¹²
Partially differentiating the system $f[x(\alpha, \beta, \ldots); \alpha, \beta, \ldots] = 0$ with respect
to one of the shift parameters, say $\alpha$ (while keeping the other parameters,
$\beta, \gamma, \ldots$, constant), we obtain
$$\sum_{j=1}^{n} f_{ij} \frac{\partial x_j}{\partial \alpha} + b_i = 0, \quad i = 1, 2, \ldots, n, \quad \text{for all } (\alpha, \beta, \ldots) \text{ in } S$$
where $f_{ij} \equiv \partial f_i/\partial x_j$ and $b_i \equiv \partial f_i/\partial \alpha$ with $x = x(\alpha, \beta, \ldots)$. In matrix form, we may
rewrite this as
$$F \cdot x_\alpha + b = 0$$
where $F \equiv [f_{ij}]$, $b \equiv [b_i]$, and $x_\alpha$ is the n-vector whose jth element is $\partial x_j/\partial \alpha$.
Assuming that F is nonsingular, we can rewrite the above equation as
$$x_\alpha = -F^{-1} \cdot b$$
which is called the fundamental equation of comparative statics.¹³ The solution $x_\alpha$
indicates the effect of a change in $\alpha$ on the equilibrium value of x in the region S.
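The fundamental equation can be checked numerically on a small example. The following is only a sketch under stated assumptions: the two-equation system `f`, its Jacobian, the starting point, and the helper names (`solve2`, `equilibrium`, `x_alpha`) are hypothetical illustrations and do not come from the text. The equilibrium is found by Newton's method, and $x_\alpha = -F^{-1} \cdot b$ is then computed from the Jacobian at that equilibrium.

```python
def solve2(a11, a12, a21, a22, r1, r2):
    """Solve the 2x2 linear system [[a11, a12], [a21, a22]] z = (r1, r2) by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return ((r1 * a22 - a12 * r2) / det, (a11 * r2 - a21 * r1) / det)


def f(x, alpha):
    """Hypothetical system f(x; alpha) = 0 with one shift parameter alpha."""
    x1, x2 = x
    return (alpha - x1 ** 2 + 0.5 * x2, 1.0 + 0.5 * x1 - x2 ** 2)


def jacobian(x):
    """F = [f_ij] = [df_i/dx_j], returned row by row as (f11, f12, f21, f22)."""
    x1, x2 = x
    return (-2.0 * x1, 0.5, 0.5, -2.0 * x2)


def equilibrium(alpha, x=(1.0, 1.0), steps=50):
    """Newton's method for f(x; alpha) = 0, started at an assumed initial guess."""
    for _ in range(steps):
        f1, f2 = f(x, alpha)
        dx1, dx2 = solve2(*jacobian(x), -f1, -f2)
        x = (x[0] + dx1, x[1] + dx2)
    return x


def x_alpha(alpha):
    """Fundamental equation of comparative statics: x_alpha = -F^{-1} b."""
    x = equilibrium(alpha)
    b = (1.0, 0.0)  # b_i = df_i/dalpha for the system above
    return solve2(*jacobian(x), -b[0], -b[1])


if __name__ == "__main__":
    print(x_alpha(1.0))
```

Comparing `x_alpha(1.0)` with a central finite difference of `equilibrium` around $\alpha = 1$ is a simple way to confirm that the formula measures the shift of the equilibrium itself.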
The use of the theorems of this section in comparative statics is clear. They
can be applied when all the off-diagonal elements of F have a definite sign. If
$f_{ij} < 0$ for all $i \neq j$, then we can use Theorem 4.D.1, and if $f_{ij} > 0$ for all $i \neq j$,
then we can use Theorem 4.D.3.
For example, suppose that $f_i$ and $x_i$, respectively, denote the excess demand
for and the price of the ith commodity. Let $\alpha$ represent the taste of the consumers.
Let there be (n + 1) commodities (i = 0, 1, ..., n). Assume homogeneity and let
commodity 0 be the numeraire ($x_0 = 1$). Assume that the equilibrium of the
market can be described by n equations: $f_i[x(\alpha, \beta, \ldots); \alpha, \beta, \ldots] = 0$, i = 1,
2, ..., n, for $(\alpha, \beta, \ldots)$ in the region S. From homogeneity, we have
$$\sum_{j=0}^{n} f_{ij} x_j = 0, \quad i = 0, 1, 2, \ldots, n, \quad \text{for all } (x; \alpha, \beta, \ldots)$$
Assume now that $f_{ij} > 0$ for all $i \neq j$ (i, j = 0, 1, ..., n) and for all x (gross substitutability).
Then, since $f_{i0} x_0 > 0$, we obtain
$$\sum_{j=1}^{n} f_{ij} x_j < 0, \quad \text{for all } i \text{ and for all } (x; \alpha, \beta, \ldots)$$
Therefore the n x n matrix $F = [f_{ij}]$ satisfies condition (I") of Theorem 4.D.3.
We are now ready to consider the so-called Hicksian laws of comparative statics.
As an illustration, we show that a shift in demand from the numeraire to commodity
k raises the price of k and the prices of all other commodities. For this purpose,
set $b_k \equiv \partial f_k/\partial \alpha = 1$ and $b_i \equiv \partial f_i/\partial \alpha = 0$ for all $i \neq k$ (i, k = 1, 2, ..., n). Since
F is indecomposable and condition (I") of Theorem 4.D.3 holds, condition (VII')
also holds so that F is nonsingular and $F^{-1} < 0$. Hence $x_\alpha = -F^{-1} \cdot b$ implies
$x_\alpha > 0$; that is, the price of all the commodities (except that of the numeraire)
must rise.¹⁴
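The Hicksian law just derived can be illustrated with numbers. The sketch below is an assumption-laden example, not part of the text: the 3 x 3 matrix `F`, the price vector, and the helper `inverse` are hypothetical. The matrix has positive off-diagonal elements (gross substitutability) and price-weighted row sums that are negative, so its inverse should be negative and every price should rise when demand shifts toward commodity k.

```python
def inverse(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * u for v, u in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical matrix F = [f_ij] for the n = 3 non-numeraire goods:
# f_ij > 0 for i != j, and sum_j f_ij x_j < 0 at the prices below.
F = [[-3.0, 1.0, 1.0],
     [1.0, -4.0, 1.0],
     [2.0, 1.0, -5.0]]
prices = [1.0, 1.0, 1.0]
weighted_row_sums = [sum(fij * xj for fij, xj in zip(row, prices)) for row in F]

F_inv = inverse(F)

k = 1                                   # demand shifts from the numeraire to good k
b = [1.0 if i == k else 0.0 for i in range(3)]
x_alpha = [-sum(F_inv[i][j] * b[j] for j in range(3)) for i in range(3)]

if __name__ == "__main__":
    print(weighted_row_sums)
    print(F_inv)
    print(x_alpha)
```

Changing `k` selects a different column of $-F^{-1}$; since every element of $F^{-1}$ is negative here, the sign of the price effects does not depend on which commodity receives the demand shift.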
We now turn to another illustration, the theory of the firm. Consider a
firm that produces a single product y using n inputs, $x_1, x_2, \ldots, x_n$, with the
production function $\Phi(x)$. Let p be the price of the product and w the factor price
vector, which are all taken as positive constants given to the firm. Assume that the
firm maximizes its profit $py - w \cdot x$ subject to $\Phi(x) \geq y$ and $x \geq 0$. Assume
further that

(i) $\partial \Phi/\partial x_i > 0$, $i = 1, 2, \ldots, n$, for all $x > 0$.

(ii) There exists an $\bar{x} > 0$ and a $\bar{y} > 0$ such that $\Phi(\bar{x}) > \bar{y}$ (Slater's condition).
(iii) The Hessian matrix of $\Phi$ is negative definite for all x.

Condition (iii) implies that $\Phi$ is a strictly concave function. Under these assumptions,
the following set of conditions together with $\Phi(\hat{x}) = \hat{y}$ gives a necessary
and sufficient characterization of a unique global maximum:
$$f_i(\hat{x}, w/p) \equiv \Phi_i(\hat{x}) - \frac{w_i}{p} = 0, \quad i = 1, 2, \ldots, n$$
where $\Phi_i \equiv \partial \Phi/\partial x_i$ and we assume that the optimal values, $\hat{x}$ and $\hat{y}$, are strictly positive.¹⁵
First consider the effect of the minimum wage regulation (MWR) on employment.
Suppose that a local government imposes a minimum wage rate $\bar{w}_n$ which
is above $w_n$. Assume that this does not disturb the market in such a way as to
change p and $w_1, \ldots, w_{n-1}$. Thus $p, w_1, w_2, \ldots, w_{n-1}$ are taken as (positive)
constants to the firm. This is again a problem of comparative statics.¹⁶ Assume
that there exists a continuously differentiable function $x = x(w/p)$ such that
$f_i[x(w/p), w/p] = 0$, $i = 1, 2, \ldots, n$, for all w/p in some region S of w/p. Then the
partial differentiation of $f_i$ with respect to $w_n$ yields
$$\sum_{j=1}^{n} f_{ij} \frac{\partial x_j}{\partial w_n} + b_i = 0, \quad i = 1, 2, \ldots, n, \quad \text{for all } w/p \text{ in } S$$
where we now define the $f_{ij}$'s and the $b_i$'s by $f_{ij} \equiv \partial f_i/\partial x_j = \partial^2 \Phi/\partial x_i \partial x_j \equiv \Phi_{ij}$ and
$b_i \equiv \partial(-w_i/p)/\partial w_n$, respectively. Note that $b_i = 0$ for all $i \neq n$ and that $b_n =
-1/p$. Since the Hessian matrix of $\Phi$ is negative definite, it
is Hicksian (see Chapter 1, Section E). Assume further¹⁷
(iv) $f_{ij} > 0$ for all $i \neq j$, for all x.
Then noting that condition (IV') of Theorem 4.D.3 is satisfied and that $F = [f_{ij}]$
is indecomposable, we can conclude that condition (VII") is also satisfied. In other
words, F is nonsingular and $F^{-1} < 0$ for all x. Therefore we obtain
$$\frac{\partial x_j}{\partial w_n} < 0, \quad j = 1, 2, \ldots, n, \quad \text{for all } w/p \text{ in } S \text{ where } x = x(w/p)$$
In other words, a local MWR always decreases the employment of labor and all
other factors, provided that the above assumptions are satisfied.¹⁸
In order to explore the above line of thinking further, let us suppose that
the price of the kth factor changes, where k is not restricted to n. Then the partial
differentiation of the system $\Phi_i[x(w/p)] - w_i/p = 0$ (where $\Phi_i \equiv \partial \Phi/\partial x_i$), i = 1,
2, ..., n, with respect to $w_k$, again yields
$$\sum_{j=1}^{n} f_{ij} \frac{\partial x_j}{\partial w_k} + b_i = 0, \quad i, k = 1, 2, \ldots, n, \quad \text{for all } w/p \text{ in } S$$
where $b_i = 0$ if $i \neq k$, $b_k = -1/p$, and $f_{ij} \equiv \Phi_{ij}$. It is often assumed in general
equilibrium theory that commodities are gross substitutes. Our question now is
whether we can extend the list of these commodities to factors. In particular, we
want to examine whether $\partial x_j/\partial w_k > 0$ for all $j \neq k$, that is, whether an increase
in the kth factor price will increase the demand for the jth factor when $j \neq k$.
Consider again the "normal" case in which $f_{ij} > 0$, for all $i \neq j$, and for all x. Then
we can apply Theorem 4.D.3 again. Observing that the negative definiteness of
the Hessian matrix of $\Phi$ implies condition (IV') of Theorem 4.D.3 and that $F = [f_{ij}]$
is indecomposable, we can conclude that condition (VIII') is satisfied. In other
words, F is nonsingular and $F^{-1} < 0$ for all x. Hence we conclude¹⁹
$$\frac{\partial x_j}{\partial w_k} < 0 \quad \text{for all } j \neq k, \quad j, k = 1, 2, \ldots, n, \quad \text{for all } w/p \text{ in } S \text{ where } x = x(w/p)$$
In other words, "normally" factors are not gross substitutes but rather "gross
complements." This is the conclusion obtained by Rader [19].²⁰ The economic
interpretation of this result can be found in [19], p. 40.
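The "gross complements" conclusion can be illustrated with a two-input example. This is a minimal sketch under assumptions that are not in the text: the production function is taken to be $\Phi(x) = x_1^{a} x_2^{b}$ with $a + b < 1$ (so its Hessian is negative definite and its cross partials are positive for $x > 0$), and the function names and parameter values are hypothetical.

```python
import math


def factor_demand(w1, w2, p=1.0, a=0.3, b=0.4):
    """Profit-maximizing inputs for Phi(x) = x1**a * x2**b, assuming a + b < 1.

    The first-order conditions Phi_i(x) = w_i / p are linear in logarithms,
    so they are solved here by Cramer's rule.
    """
    r1 = math.log(w1 / p) - math.log(a)
    r2 = math.log(w2 / p) - math.log(b)
    det = (a - 1.0) * (b - 1.0) - a * b          # equals 1 - a - b > 0
    l1 = (r1 * (b - 1.0) - b * r2) / det
    l2 = ((a - 1.0) * r2 - a * r1) / det
    return math.exp(l1), math.exp(l2)


def hessian(x1, x2, a=0.3, b=0.4):
    """F = [Phi_ij] at (x1, x2), returned as (f11, f12, f21, f22)."""
    phi = x1 ** a * x2 ** b
    f11 = a * (a - 1.0) * phi / x1 ** 2
    f12 = a * b * phi / (x1 * x2)
    f22 = b * (b - 1.0) * phi / x2 ** 2
    return f11, f12, f12, f22


def demand_response(w1, w2, k, p=1.0):
    """dx/dw_k = -F^{-1} b, where b_k = -1/p and b_i = 0 for i != k."""
    x1, x2 = factor_demand(w1, w2, p)
    f11, f12, f21, f22 = hessian(x1, x2)
    det = f11 * f22 - f12 * f21
    inv = ((f22 / det, -f12 / det), (-f21 / det, f11 / det))
    b = [0.0, 0.0]
    b[k] = -1.0 / p
    return tuple(-(inv[i][0] * b[0] + inv[i][1] * b[1]) for i in range(2))


if __name__ == "__main__":
    print(demand_response(0.2, 0.3, k=0))   # effect of a rise in w_1
    print(demand_response(0.2, 0.3, k=1))   # effect of a rise in w_2
```

In this example both elements of each response vector are negative: a higher price for either factor lowers the demand for both, which is the pattern the text calls the "normal" case.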

FOOTNOTES

1. Since $b_{ij} < 0$ for all $i \neq j$, $B \cdot x > 0$ with $x \geq 0$ implies $x > 0$. This was also noted
in the proof [step (i)] of Theorem 4.C.4. Hence condition (I) says that B' (that is,
the transpose of B) has d.d.
2. This means that like (I'), h has d.d.

3. The same argument is used in the proof [step (i)] of Theorem 4.C.4.
4. We can also conclude that an arbitrary matrix, say, A (that is, the one whose
off-diagonal elements do not necessarily have a definite sign), has a dominant
diagonal if and only if its transpose A' has a dominant diagonal. To see this, recall
that (by the definition of a dominant diagonal) $A = [a_{ij}]$ has d.d. if and only if
the matrix $M = [m_{ij}]$, where $m_{ij} \equiv -|a_{ij}|$, $i \neq j$, $m_{ii} \equiv |a_{ii}|$, has d.d. Note that M is
a B matrix in Theorem 4.D.4.
5. It is, however, possible to interpret A as the technology matrix, if we assume that
sales expectations are made on the basis of simple extrapolation of all industries;
that is, the sales of the last period x(t - 1) are expected to continue as sales of this
period so that A x(t - 1) + c signifies the expected demands for the goods in period
t. The advantage of this interpretation is that we can interpret A as the technology
matrix.
6. In proving local stability, it suffices to assume that (A-1) or (A-2) holds at the
equilibrium p. The theorem is often referred to as the Hahn-Negishi theorem. Hahn
used (A-1) and Negishi used (A-2).
7. Strictly speaking, we also required that $a_{0j} > 0$ or $a_{i0} > 0$, i, j = 1, 2, ..., n, where
the 0th commodity is the numeraire. This also implies that the choice of the
numeraire is important.
8. Obviously, f = 0 implies V(f) = 0.
9. It is well known and easy to show that all the eigenvalues of any symmetric matrix
are real.
10. It is well known in matrix theory that a symmetric matrix is negative definite if and
only if its eigenvalues are all negative (for the proof, see any textbook on matrix
theory).
11. As remarked earlier in Chapter 1, the (local) implicit function theorem roughly
states the following. Let $f_i(\hat{x}; \bar{\alpha}, \bar{\beta}, \ldots) = 0$ for some $\hat{x}$ and for some $(\bar{\alpha}, \bar{\beta}, \ldots)$.
Assume that $\det[a_{ij}] \neq 0$, where $a_{ij} \equiv \partial f_i/\partial x_j$, evaluated at $(\hat{x}, \bar{\alpha}, \bar{\beta}, \ldots)$. Then
there exists a neighborhood N of $(\bar{\alpha}, \bar{\beta}, \ldots)$ and a unique continuously differentiable
function g such that $g(\bar{\alpha}, \bar{\beta}, \ldots) = \hat{x}$ and $f[g(\alpha, \beta, \ldots); \alpha, \beta, \ldots] = 0$ for all $(\alpha, \beta, \ldots)$
in N. See W. H. Fleming, Functions of Several Variables, Reading, Mass., Addison-
Wesley, 1965, and most textbooks on advanced calculus.
12. The use of comparative statics, as remarked in the above, is not confined to optimization
theory; that is, the above system, $f(x; \alpha, \ldots) = 0$, is not necessarily made up of
first-order conditions.
13. See Samuelson [20], which also contains an excellent exposition of comparative
statics (especially chapters 2 and 3). Also see the Appendix to Section F, Chapter 1.
14. For the other Hicksian laws of comparative statics, see Mundell [14]. See also
Morishima [13], pp. 3-14.
15. Recall Section F of Chapter 1 (especially subsection c).
16. There seems to be a widespread misunderstanding among labor economists with
regard to the comparative statics nature of the problem. For example, in their
study of the MWR of New York City, M. Benewitz and R. E. Weintraub wrote,
"Economic theorists assert, as a logical deduction from the diminishing returns,
that elasticity of the demand for labor is negative. This means that a rise in the wage
rate will lead to a decline in employment." See their "Employment Effects of a Local
Minimum Wage," Industrial and Labor Relations Review, 17, January 1964, p. 283.
The error seems to be in confusion between shifts of a curve and movements along
a curve; the employment of other factors as well as labor adjusts to a new level of
the wage rate, which causes a shift of the marginal productivity curve of labor.
Also recall the famous controversy between Lester and Machlup in the American
Economic Review (in the 1940s) with regard to the validity of the marginal productivity
theory. See also J. M. Peterson, "Employment Effects of Minimum Wages,
1938-50," Journal of Political Economy, LXV, October 1957, as well as our footnote
18.
17. This means that an increase in the employment of the jth factor will increase the
marginal (physical) productivity of the ith factor, if $i \neq j$. This means that factors are
used in conjunction with each other rather than as substitutes. This so-called
"Wicksell's Law" is termed the normal case by Rader [19]. The case in which this
assumption is slightly weakened to $f_{ij} \geq 0$ for all $i \neq j$ can be analyzed in a method
similar to the subsequent analysis. Also note that the crucial condition for the global
invertibility of the map $x \mapsto [\Phi_1(x), \ldots, \Phi_n(x)]$ [that is, the existence of its unique
inverse for all $(w_1/p, \ldots, w_n/p)$] in the Gale-Nikaido theorem is satisfied if the
Hessian matrix of $\Phi$ is Hicksian.
See Nikaido [18], section 20, and our Chapter 2, Appendix to Section E.
18. If some of the assumptions are violated (which is quite plausible), it is possible that
a local MWR may increase the employment of labor. Such an analysis is carried out by
Takayama [22]. For example, suppose the demand function of the product is
p(y)=y-0- (a complete monopoly) and the production function is y = LK. Then we
can show that the profit-maximizing value of (L, K) is unique and that the imposition
of MWR increases the employment of labor L. If $p = a y^{-1/\eta}$ and $y = b L^{\alpha} K^{\beta}$,
where $\eta > 1$, $\alpha, \beta > 0$, and $a, b > 0$, then a necessary and sufficient condition that the
imposition of MWR increase the employment of labor is computed as $[1 - e(\alpha + \beta)]/
(e\alpha - 1) > 0$ where $e \equiv 1 - 1/\eta$. Hence the empirical findings that the imposition of
MWR does not necessarily decrease the amount of labor employment do not constitute
a sufficient reason to refute the marginal productivity theory.
19. If $f_{ij} \geq 0$, $i \neq j$, condition (VI") of Theorem 4.D.3 is equivalent to (III") instead of
(VIII"). In other words, F is nonsingular and $F^{-1} \leq 0$, which slightly weakens the
conclusion to $\partial x_j/\partial w_k \leq 0$, for all $j \neq k$.
20. A similar conclusion was obtained by M. Morishima in "A Note on a Point in Value
and Capital," Review of Economic Studies, XXI, 1953-1954 (which is cited in Rader
[ 19] ). See also D. V. T. Bear, "Inferior Inputs and the Theory of the Firm," Journal
of Political Economy, LXXIII, June 1965. Note that the above analysis of MWR
(Takayama [22]) essentially establishes the same result.

REFERENCES

1. Arrow, K. J., and Hurwicz, L., "On the Stability of the Competitive Equilibrium, I,"
Econometrica, 26, October 1958.
2. Bellman, R., Introduction to Matrix Analysis, New York, McGraw-Hill, 1960.
3. Chipman, J. S., The Theory of Inter-Sectoral Money Flows and Income Formation,
Baltimore, Md., Johns Hopkins University Press, 1951.
4. Hahn, F. H., "Gross Substitutes and the Dynamic Stability of General Equilibrium,"
Econometrica, 26, January 1958.
5. Kalman, R. E., and Bertram, J. E. "Control System Analysis and Design Via the
"Second Method" of Lyapunov, II, Discrete-Time Systems," Journal of Basic
Engineering, June 1960.
6. McKenzie, L. W., "Matrices with Dominant Diagonals and Economic Theory," in
Mathematical Methods in the Social Sciences, 1959, ed. by Arrow, Karlin, and Suppes,
Stanford, Calif., Stanford University Press, 1960.
7. McKenzie, L. W., "An Elementary Analysis of the Leontief System," Econometrica, 25, July
1957.
8. Metzler, L. A., "Underemployment Equilibrium in International Trade," Econometrica,
April 1942.
9. Metzler, L. A., "Stability of Multiple Markets: The Hicks Conditions," Econometrica, 13,
October 1945.
10. Metzler, L. A., "A Multiple Region Theory of Income and Trade," Econometrica, 18, October
1950.
11. Morishima, M., "The International Inter-relatedness of Economic Fluctuations,"
in his Inter-Industry Relations and Economic Fluctuations, Tokyo, Yuhikaku, 1955
(in Japanese).
12. Morishima, M., Introduction to the Inter-Industry Analysis, Tokyo, Sobun-sha, 1956 (in
Japanese).
13. Morishima, M., Equilibrium, Stability and Growth, Oxford, Oxford University Press, 1964.
14. Mundell, R. A., "The Homogeneity Postulate and the Law of Comparative Sta-
tics," Econometrica, 33, April 1965.
15. Negishi, T., "A Note on the Stability of an Economy Where All Goods Are Gross
Substitutes," Econometrica, 26, July 1958.
16. Nikaido, H., Introduction to Sets and Mappings in Modern Economics, tr. by K. Sato,
Amsterdam, North-Holland, 1970 (the Japanese original, Tokyo, 1960).
17. Nikaido, H., Linear Mathematics for Economics, Tokyo, Baifukan, 1961 (in Japanese).
18. Nikaido, H., Convex Structures and Economic Theory, New York, Academic Press, 1968.
19. Rader, T., "Normally, Factor Inputs Are Never Gross Substitutes," Journal of
Political Economy, 76, January/ February 1968.
20. Samuelson, P. A., Foundations of Economic Analysis, Cambridge, Mass., Harvard
University Press, 1947.
21. Solow, R. M., "On the Structure of Linear Models," Econometrica, 20, January 1952.
22. Takayama, A., "Minimum Wage and Unemployment," Purdue University, March
1967 (unpublished manuscript).
5
THE CALCULUS OF VARIATIONS AND
THE OPTIMAL GROWTH OF AN
AGGREGATE ECONOMY

Section A
ELEMENTS OF
THE CALCULUS OF VARIATIONS
AND ITS APPLICATIONS

a. STATEMENT OF THE PROBLEM

Consider the following (Riemann) integral:
$$J = \int_a^b f[t, x(t), x'(t)]\,dt, \quad \text{where } x'(t) \equiv \frac{dx}{dt}$$
and a and b are some constants. The function x(t) can either be real-valued or
$R^n$-valued. Clearly the value of this integral depends on the function x(t). By
changing the function x(t), we can get different values of J. Suppose we are given
a certain class of functions X (for example, the set of all differentiable functions
defined on the closed interval [a, b]). Then we can consider the problem of
choosing a function x(t) from the class of functions X such as to maximize the
integral J, subject to the conditions $x(a) = \alpha$ and $x(b) = \beta$. This is the type of
problem that the calculus of variations is concerned with.
The simplest problem in the calculus of variations is probably the problem
of finding the curve which joins two fixed points on the plane with the minimum
distance. Given the two points A and B in Figure 5.1, a curve joining A and B can be
represented by x(t) with $x(a) = \alpha$ and $x(b) = \beta$.
Given an "arc" (or "path") x(t), the distance along each infinitesimal
segment of x(t) is $ds = \sqrt{(dt)^2 + (dx)^2} = \sqrt{1 + x'(t)^2}\,dt$. Hence the distance between
A and B along this arc can simply be computed as
$$J_D = \int_a^b \sqrt{1 + x'(t)^2}\,dt, \quad \text{where } x'(t) \equiv \frac{dx}{dt}$$
The problem then is one of finding a function ("arc") x(t) from the set of differentiable
functions¹ to minimize the above integral $J_D$ subject to $x(a) = \alpha$
and $x(b) = \beta$. This problem is called the minimum distance problem.

Figure 5.1. The Minimum Distance Problem.

The answer to the above problem is obviously the straight line joining A
and B. This answer can readily be obtained without applying any of the theorems
in the calculus of variations. However, we can use this problem to illustrate the
nature of the technique of the calculus of variations.
The first major development in the calculus of variations came as a result
of a somewhat more difficult problem first discussed by Galileo in 1630 and then by
John Bernoulli in 1696. The problem was solved by John Bernoulli himself and
by James Bernoulli, Newton, and L'Hospital. The problem (later called the
brachistochrone problem) was as follows: Let O and A be two fixed points in a
vertical plane and consider a particle with mass sliding from O to A under the
force of gravity along a certain curve connecting O and A on this vertical plane.
The problem is to find a curve such that the particle moves from O to A in the least
amount of time. The problem is illustrated in Figure 5.2. Here O is taken as the
origin of the coordinates.
Using elementary mechanics, we can prove that the problem can be reduced
to one of minimizing the following integral:²
$$J_B = \int_0^a \sqrt{\frac{1 + y'(x)^2}{2gy}}\,dx$$
Here y(x) [with y(0) = 0 and $y(a) = \alpha$] denotes the curve joining O and A, and
g denotes the gravitational constant.

Figure 5.2. The Brachistochrone Problem.

Later it was discovered that the calculus of variations provided a unified
view of many problems in physics. For example, in optics it is known that light
travels in such a way that it traverses the distance between two given points
in the least possible time (Fermat's principle). In classical mechanics, W. Hamilton
discovered that the motion of a system of n particles (in x-y-z space) can be
explained by the minimization of the following integral:
$$\int_{t_0}^{t_1} (T - U)\,dt, \quad \text{where } t \text{ denotes time}$$
Here $T \equiv T[\dot{x}_1(t), \dot{y}_1(t), \dot{z}_1(t), \ldots, \dot{x}_n(t), \dot{y}_n(t), \dot{z}_n(t)]$ and $U \equiv U[x_1(t), \ldots,
z_n(t)]$, respectively, denote the kinetic and the potential energy of the system
at time t. (This principle is called Hamilton's principle.) Similar theorems hold in
the field of electricity and magnetism and in Einstein's theory of relativity.
We may note that in the 1920s and 1930s, specialists in the calculus of
variations, men such as Roos and Evans, were greatly interested in economics. For
a complete bibliography of their works, the reader is referred to Evans [ 3] , p. 166.
Unfortunately, there was hardly any response from economists at the time,
except that some of their results were incorporated by R. G. D. Allen ([1],
chapter XX) in 1938. A simple problem discussed by them which is summarized
in Allen [ 1 ] is the problem of dynamic monopoly.
Consider a complete monopolist who produces and sells a single good X.
Suppose that his cost function is given by C(x), where x denotes the amount of
output of X. Suppose that the price of X at time t is given by p(t). Assume that the
function p(t) is differentiable and let $\dot{p}(t) \equiv dp(t)/dt$. Let us suppose the demand
function of X is given by $D = D[p(t), \dot{p}(t)]$ where $\dot{p}$ represents a "speculative"
element. The monopolist's profit per unit of time is obviously $Dp(t) - C(x)$ where
x = D. Suppose that he wants to find the optimal pricing policy p(t) so as to
maximize his profit over a period of time, say, [0, T]. In other words, he wants
to find the function p(t) which will maximize the following integral subject to the
boundary conditions $p(0) = p^0$ and $p(T) = p^1$:
$$\int_0^T [Dp(t) - C(x)]\,dt = \int_0^T \{D[p(t), \dot{p}(t)]\,p(t) - C[D(p(t), \dot{p}(t))]\}\,dt$$
where x = D and $\dot{p} \equiv dp/dt$. This is again a variational problem.
During the period of this interest in economics by specialists in the calculus
of variations, Frank Ramsey [5] considered the problem of maximizing social
welfare over time. Ramsey's problem is again a problem in the calculus of variations.
This problem, once forgotten, has recently attracted enormous attention
and was solved, to a certain extent, by Koopmans and Cass. We take up this
problem in Section D as another illustration of the application of the calculus
of variations to economics.

b. EULER'S EQUATION
We now obtain the first-order necessary condition for the maximization
(or minimization) problems discussed above. The emphasis will be on the exposi-
tion and the intuitive understanding of the derivation. Hence some sacrifice of
mathematical rigor is inevitable. Moreover, we will consider only the simplest
problem. We go to a more general analysis in the next section.
Let X be the set of all real-valued (and single-valued) continuously differentiable
functions defined on the closed interval [a, b] (admissible functions).
We want to find a function x(t) in X which maximizes (or minimizes) the following
integral:

(1) $J[x] \equiv \int_a^b f[t, x(t), x'(t)]\,dt$

where $x'(t) \equiv dx/dt$, subject to $x(a) = \alpha$ and $x(b) = \beta$. We assume that f possesses
continuous first and second partial derivatives with respect to all its arguments.
Assume that there exists a function $\hat{x}(t)$ in X which maximizes (or minimizes)
J.³ Consider an arbitrary differentiable function (called the displacement) $h(t) \in X$
such that h(a) = 0 and h(b) = 0. Let $\epsilon$ be a real number and define $x_\epsilon(t) \in X$ by
(2) $x_\epsilon(t) \equiv \hat{x}(t) + \epsilon h(t)$
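By assumption, $J[x_\epsilon]$ attains its maximum (or minimum) when $\epsilon = 0$. Regarding
$J[x_\epsilon]$ as a function of $\epsilon$ and assuming an interior maximum (or minimum),
this means

(3) $\frac{\partial}{\partial \epsilon} J[x_\epsilon] = 0$

where the partial derivative $\partial J[x_\epsilon]/\partial \epsilon$ is evaluated at $\epsilon = 0$. But we have

(4) $\frac{\partial}{\partial \epsilon} J[x_\epsilon]\Big|_{\epsilon=0} = \left[\frac{\partial}{\partial \epsilon} \int_a^b f[t, \hat{x} + \epsilon h, \hat{x}' + \epsilon h']\,dt\right]_{\epsilon=0} = \int_a^b f_x h\,dt + \int_a^b f_{x'} h'\,dt$

where $f_x$ and $f_{x'}$ denote the partial derivatives of f, respectively, with respect
to its second and third argument evaluated at $\epsilon = 0$.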
Integration by parts yields

(5) $\int_a^b f_{x'} h'\,dt = f_{x'} h\Big|_a^b - \int_a^b h \frac{d f_{x'}}{dt}\,dt = -\int_a^b h \frac{d f_{x'}}{dt}\,dt$ [since h(a) = h(b) = 0]

provided we tacitly assume that $df_{x'}/dt$ exists. Therefore, we obtain from (3), (4),
and (5)

(6) $\frac{\partial}{\partial \epsilon} J[x_\epsilon]\Big|_{\epsilon=0} = \int_a^b \left[f_x - \frac{d f_{x'}}{dt}\right] h(t)\,dt = 0$

This is true for any $h(t) \in X$ with h(a) = h(b) = 0. To be able to conclude from
(6) that $f_x - df_{x'}/dt = 0$, we prove the following lemma, which is often called
the fundamental lemma of the calculus of variations.

Lemma: Let F(t) be a given continuous function on [a, b]. Let $X_0$ be the set of
all continuous functions on [a, b] such that $h(t) \in X_0$ implies h(a) = h(b) = 0. Suppose
that $\int_a^b F(t) h(t)\,dt = 0$ for all $h(t) \in X_0$. Then F(t) is identically equal to zero
for all $t \in [a, b]$.
PROOF: Suppose $F(\bar{t}) \neq 0$ at some point $\bar{t}$ in [a, b], say, $F(\bar{t}) > 0$. Then,
from the continuity of F(t), F(t) > 0 in some interval [c, d] where c < d,
$\bar{t} \in [c, d]$, and $[c, d] \subset [a, b]$. Choose $h(t) \in X_0$ such that h(t) > 0 for
$t \in (c, d)$ and h(t) = 0 for $t \notin (c, d)$. Then $\int_a^b F(t) h(t)\,dt > 0$, which is a
contradiction. (Q.E.D.)
Using this lemma and noting the continuity of $f_x - df_{x'}/dt$, we obtain from (6) that

(7) $f_x - \frac{d}{dt} f_{x'} = 0$, with $\hat{x}(a) = \alpha$, $\hat{x}(b) = \beta$

This equation is called Euler's equation (condition) and gives a necessary condition
(the first-order condition) for the maximum (or the minimum) of the integral J.⁴
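REMARK: Since $f_{x'} = f_{x'}[t, \hat{x}(t), \hat{x}'(t)]$, $f_x - df_{x'}/dt = 0$ means
$$f_x - f_{x't} - f_{x'x}\,\hat{x}' - f_{x'x'}\,\hat{x}'' = 0$$
This is a second-order ordinary differential equation where the unknown
function is $\hat{x}(t)$. The two boundary conditions are given by $\hat{x}(a) = \alpha$ and
$\hat{x}(b) = \beta$.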
REMARK: The above remark also reveals that if $f_{x'}$ is a differentiable
function with respect to its arguments, then $df_{x'}/dt$ exists if and only if
$\hat{x}''$ exists [that is, $\hat{x}(t)$ is twice differentiable].
The twice differentiability of $\hat{x}(t)$ is a rather strong assumption, for $\hat{x}(t)$
is an unknown function and, in defining the objective integral $J = \int_a^b f[t, x(t),
x'(t)]\,dt$, only the differentiability of x(t) is required. For this reason, Euler's
equation is often expressed in the following form:

(8) $f_{x'} = \int_a^t f_x\,d\tau + \text{const.}$

This equation can be obtained formally by integrating Euler's equation.
When $df_{x'}/dt$ does not exist, this procedure of integrating Euler's equation
is illegitimate since Euler's equation itself is illegitimate. However, the
above equation can be directly obtained from (3) and (4) and the integration
by parts of the first term of the RHS of (4). Thus Euler's equation (7) implies
(8). But (8) does not necessarily imply (7). However, if $\partial f_{x'}/\partial x'$ (denoted
by $f_{x'x'}$) exists and is nonzero everywhere in the domain of f, then it can be
shown that (8) always implies (7). It is interesting to observe that $\hat{x}(t)$ then
has a higher order of differentiability than the admissible functions. When
the condition $f_{x'x'} \neq 0$ is satisfied, we call the above problem regular.
REMARK: In the above discussion we assumed that the function inside the
integral J has the form f[t, x(t), x'(t)]. In certain cases it may lack certain
arguments, and useful formulae can be obtained for such special cases.
Consider, for example, the case where f lacks t so that it has the form
f[x(t), x'(t)]. First note that
$$\frac{df}{dt} = f_x x' + f_{x'} x''$$
But for the optimal path $\hat{x}(t)$, Euler's condition is satisfied so that $f_x =
df_{x'}/dt$. Hence we have [along the optimal arc $\hat{x}(t)$]:
$$\frac{df}{dt} = \left[\frac{d f_{x'}}{dt}\right] \hat{x}' + f_{x'}\,\hat{x}'' = \frac{d}{dt}\left[f_{x'}\,\hat{x}'\right]$$
where $f \equiv f[\hat{x}(t), \hat{x}'(t)]$ and $f_{x'} \equiv \partial f/\partial x'$. Therefore, we obtain the following
condition in place of Euler's condition:
(9) $f = f_{x'}\,\hat{x}' + c$
where c is some constant.

c. SOLUTIONS OF ILLUSTRATIVE PROBLEMS

In subsection a we discussed the nature of the variational problem by giving
some examples of actual problems, and in subsection b we obtained the first-order
condition for a simple case. Here we solve the problems presented
in subsection a by using Euler's equation. Since we have not discussed sufficiency
conditions, our discussion here will naturally be confined to the necessary condition
and the path which satisfies Euler's necessary condition. We will not discuss
whether there exists a path which satisfies Euler's equation. In the application of
Euler's equations (7) or (9), we omit the ^ for notational simplicity.

1. THE MINIMUM DISTANCE PROBLEM

We are to find an $x(t) \in X$ such as to

Minimize: $J_D \equiv \int_a^b \sqrt{1 + (x')^2}\,dt$
Subject to: $x(a) = \alpha$ and $x(b) = \beta$

Write $\sqrt{1 + (x')^2} \equiv f_D$; then Euler's condition is written as
$$\frac{\partial f_D}{\partial x} = \frac{d}{dt} \frac{\partial f_D}{\partial x'}$$
Since $f_D$ contains no x, $\partial f_D/\partial x = 0$. Hence
$$\frac{\partial f_D}{\partial x'} = \frac{x'}{\sqrt{1 + (x')^2}} = \text{const.}$$
In other words, $x'(t) = \mu$ (constant). Hence $x(t) = \mu t + \nu$. Since $x(a) = \alpha$ and
$x(b) = \beta$, we obtain $x(t) = [(\alpha - \beta)/(a - b)]\,t + (a\beta - \alpha b)/(a - b)$. This is the
equation which denotes the desired straight line.
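The role of the displacement h(t) in this solution can be seen numerically. The sketch below rests on assumptions that are not in the text: the endpoints (0, 0) and (1, 2), the particular displacement $h(t) = \sin[\pi (t - a)/(b - a)]$, and the polyline approximation of the arc are all hypothetical choices. It evaluates the length of $x_\epsilon(t) = x(t) + \epsilon h(t)$, where x(t) is the straight line, for several values of $\epsilon$.

```python
import math


def arc_length(eps, t0=0.0, t1=1.0, x0=0.0, x1=2.0, n=2000):
    """Length of the polyline through x_eps(t) = line(t) + eps * h(t).

    line(t) joins (t0, x0) and (t1, x1); h vanishes at both endpoints,
    so every x_eps satisfies the boundary conditions.
    """
    def x_eps(t):
        line = x0 + (x1 - x0) * (t - t0) / (t1 - t0)
        h = math.sin(math.pi * (t - t0) / (t1 - t0))
        return line + eps * h

    total = 0.0
    for i in range(n):
        ta = t0 + (t1 - t0) * i / n
        tb = t0 + (t1 - t0) * (i + 1) / n
        total += math.hypot(tb - ta, x_eps(tb) - x_eps(ta))
    return total


if __name__ == "__main__":
    for eps in (-0.1, 0.0, 0.1):
        print(eps, arc_length(eps))
```

The length is smallest at $\epsilon = 0$, and a central difference of `arc_length` around $\epsilon = 0$ is numerically zero, which is the first-order condition (3) for this particular displacement.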

2. THE BRACHISTOCHRONE PROBLEM

Here we are to find a y(x) such as to

Minimize: $J_B \equiv \int_0^a \sqrt{\frac{1 + (y')^2}{2gy}}\,dx$
Subject to: y(0) = 0 and $y(a) = \alpha$

Write $f_B \equiv \sqrt{[1 + (y')^2]/(2gy)}$. Since $f_B$ does not explicitly contain x, we use
formula (9).
In other words,
$$f_B = \frac{\partial f_B}{\partial y'}\,y' + c, \quad \text{where } c \text{ is some constant}$$
that is,
$$\sqrt{\frac{1 + (y')^2}{2gy}} = \frac{(y')^2}{\sqrt{2gy\,[1 + (y')^2]}} + c$$
We find, on multiplying, squaring, and collecting terms, that
$$y = \frac{k}{1 + (y')^2}, \quad \text{where } k \equiv \frac{1}{2gc^2}$$
We proceed with the following parametric representation of y':
$$y' = \tan \omega$$
Hence
$$y = \frac{k}{1 + (y')^2} = \frac{k}{1 + \tan^2 \omega} = k \cos^2 \omega$$
Thus $y' = -2k \cos \omega \sin \omega\,(d\omega/dx) = \tan \omega$, so that $d\omega/dx = -1/(2k \cos^2 \omega)$. Thus
$dx = -2k \cos^2 \omega\,d\omega = -k(1 + \cos 2\omega)\,d\omega$. Hence integration yields
$$x = -k\left(\omega + \tfrac{1}{2} \sin 2\omega\right) + \text{(constant)}$$
Write $2\omega \equiv \pi - \theta$ and note that $\sin(\pi - \theta) = \sin \theta$. Then we may write
$$x = k_1(\theta - \sin \theta) + k_2$$
where $k_1$ and $k_2$ are some constants. Obviously $k_1 = k/2$. Also y can be obtained
as follows:
$$y = k \cos^2 \omega = \frac{k}{2}(1 + \cos 2\omega) = \frac{k}{2}[1 + \cos(\pi - \theta)] = \frac{k}{2}[1 - \cos \theta] = k_1[1 - \cos \theta]$$
In other words, we have obtained the parametric representation of the solution
as
$$x = k_1(\theta - \sin \theta) + k_2$$
$$y = k_1(1 - \cos \theta)$$
These equations define a family of "cycloids" with cusps on the x-axis. A unique
curve is determined by the boundary conditions
y(0) = 0 and $y(a) = \alpha$
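The cycloid can be compared numerically with the straight line through the same endpoints. This is only an illustrative sketch: the value g = 9.8, the choice $k_1 = 1$, $k_2 = 0$ with $\theta$ running from 0 to $\pi$ (so that the endpoint is $(\pi, 2)$), and the helper `descent_time` are assumptions made for the example. Each curve is replaced by a polyline; on a straight segment the acceleration is constant, so the time spent on it equals its length divided by the average of the two endpoint speeds $\sqrt{2gy}$.

```python
import math

G = 9.8  # assumed value of the gravitational constant g


def descent_time(points):
    """Travel time along a polyline for a particle starting at rest at points[0].

    y is measured downward, so the speed at depth y is sqrt(2 * G * y).
    """
    total = 0.0
    for (xa, ya), (xb, yb) in zip(points, points[1:]):
        length = math.hypot(xb - xa, yb - ya)
        va, vb = math.sqrt(2 * G * ya), math.sqrt(2 * G * yb)
        total += 2.0 * length / (va + vb)
    return total


n = 2000
k1 = 1.0
# Cycloid x = k1 (theta - sin theta), y = k1 (1 - cos theta), theta in [0, pi].
cycloid = [(k1 * (th - math.sin(th)), k1 * (1.0 - math.cos(th)))
           for th in (math.pi * i / n for i in range(n + 1))]
end_x, end_y = cycloid[-1]
# Straight line from the origin to the same endpoint.
line = [(end_x * i / n, end_y * i / n) for i in range(n + 1)]

t_cycloid = descent_time(cycloid)
t_line = descent_time(line)

if __name__ == "__main__":
    print(t_cycloid, t_line)
```

With these assumed values the cycloid takes roughly 1.00 seconds against roughly 1.19 seconds for the straight line, so the shortest path is not the quickest one.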

3. A PROBLEM OF DYNAMIC MONOPOLY

We are to find the function p(t) which maximizes
$$J_M \equiv \int_0^T \{D[p(t), \dot{p}(t)]\,p(t) - C[D[p(t), \dot{p}(t)]]\}\,dt$$
subject to $p(0) = p^0$ and $p(T) = p^1$ (where, we may recall, x = D). Denote the inside
of the above integral by $f_M$; that is, $J_M \equiv \int_0^T f_M\,dt$. Since $f_M$ does not contain t explicitly,
we can use formula (9) in place of Euler's equation, that is,
$$f_M = \frac{\partial f_M}{\partial \dot{p}}\,\dot{p} + \delta, \quad \text{where } \delta \text{ is some constant}$$
Thus $pD - C = [pD_{\dot{p}} - C'D_{\dot{p}}]\,\dot{p} + \delta$, where $D_{\dot{p}} \equiv \partial D/\partial \dot{p}$ and $C' \equiv dC/dx$. Clearly,
C' signifies the marginal cost function. The above equation is a first-order differential
equation, the solution of which involves two arbitrary constants that can be
determined by the boundary conditions $p(0) = p^0$ and $p(T) = p^1$. The following
special case illustrates the problem. Let
$$D[p, \dot{p}] = ap + b\dot{p} + c, \quad a < 0$$
$$C(x) = \alpha x^2 + \beta x + \gamma$$
where a, b, c, $\alpha$, $\beta$, and $\gamma$ are all constants. The solution for this particular problem
can be obtained as follows (see Allen [1], pp. 535-536, for example, or try it yourself):
$$p(t) = \bar{p} + A e^{\lambda t} + B e^{-\lambda t}$$
where $\bar{p}$ and $\lambda$ are defined by
$$\bar{p} \equiv \frac{c - (2\alpha c + \beta)a}{2a(\alpha a - 1)} \quad \text{and} \quad \lambda \equiv \frac{1}{b}\sqrt{\frac{a(\alpha a - 1)}{\alpha}}$$
The constants A and B are determined from the boundary conditions $p(0) = p^0$
and $p(T) = p^1$.
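The closed-form price path can be checked against formula (9). The following sketch uses hypothetical parameter values and boundary prices chosen only for illustration (with $\alpha > 0$ so that $\lambda$ is real); none of the numbers come from the text. It determines A and B from the boundary conditions and then evaluates $f_M - (\partial f_M/\partial \dot{p})\,\dot{p}$ at several dates, which should give the same constant $\delta$ each time.

```python
import math

# Hypothetical demand D = a p + b pdot + c and cost C = alpha x^2 + beta x + gamma.
a, b, c = -2.0, 1.0, 10.0
alpha, beta, gamma = 0.5, 1.0, 0.0
p0, pT, T = 3.0, 3.5, 1.0            # assumed boundary prices and horizon

p_bar = (c - (2 * alpha * c + beta) * a) / (2 * a * (alpha * a - 1))
lam = math.sqrt(a * (alpha * a - 1) / alpha) / b

# Solve A + B = p0 - p_bar and A e^{lam T} + B e^{-lam T} = pT - p_bar.
e1, e2 = math.exp(lam * T), math.exp(-lam * T)
B = ((pT - p_bar) - (p0 - p_bar) * e1) / (e2 - e1)
A = (p0 - p_bar) - B


def price(t):
    return p_bar + A * math.exp(lam * t) + B * math.exp(-lam * t)


def price_dot(t):
    return lam * (A * math.exp(lam * t) - B * math.exp(-lam * t))


def first_integral(t):
    """f_M - (df_M/dpdot) * pdot, which formula (9) says is constant in t."""
    p, pdot = price(t), price_dot(t)
    demand = a * p + b * pdot + c
    cost = alpha * demand ** 2 + beta * demand + gamma
    marginal_cost = 2 * alpha * demand + beta
    return p * demand - cost - (p - marginal_cost) * b * pdot


if __name__ == "__main__":
    print([first_integral(t) for t in (0.0, 0.25, 0.5, 0.75, 1.0)])
```

Any spread in the printed values beyond rounding error would signal that the path does not satisfy Euler's condition for the chosen parameters.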

FOOTNOTES

1. The analysis can be extended to the case of continuous functions (instead of dif-
ferentiable functions).
2. Let m be the mass of the particle. The gravitational force F is given by F = mg,
where g is the gravitational constant. The force F is decomposed into its normal
and tangential components. The former plays no part in the motion. Let ds be
the tangential distance along the curve at point (x, y). Since $\cos \theta = dy/ds$ (note
$ds^2 = dx^2 + dy^2$), the tangential component of F at (x, y) is equal to $F \cos \theta =
mg\,dy/ds$. Hence the acceleration along ds is equal to $g\,dy/ds$. That is, $dv/dt = g\,dy/ds$
where $v \equiv ds/dt$ (velocity along ds). Thus we obtain $v\,dv = g\,dy$. Integrating this and
using the initial condition y = 0, v = 0, we have $v = \sqrt{2gy}$. Therefore $dt = ds/v =
\sqrt{1 + y'^2}\,dx/\sqrt{2gy}$. Hence for the required duration, we obtain the expression
$J(y) = \int_0^a \sqrt{(1 + y'^2)/2gy}\,dx$.
3. The existence of an optimal function $\hat{x}(t)$ is not at all obvious in many problems and
the proof of the existence should be supplied separately. But such a proof will exceed
the scope of the present section. We may also note that $\hat{x}(t)$, even if it exists, may not
be unique.
4. Solve equation (7) regarding $\hat{x}$ as an unknown function of t. The solution $\hat{x}(t)$ of (7)
will provide the equation that the optimal path must satisfy. It is important to realize,
however, that this $\hat{x}(t)$ does not necessarily maximize (or minimize) the objective
integral J, since equation (7), in general, only provides a necessary condition and
not a sufficient condition for the optimum. In Section B, we prove that under certain
concavity conditions equation (7) also provides a sufficient condition for the optimum.

REFERENCES
1. Allen, R. G. D., Mathematical Analysis for Economists, London, Macmillan, 1938,
esp. chap. XX.
2. Bliss, G. A., Lectures on the Calculus of Variations, Chicago, Ill., University of
Chicago Press, 1946.
3. Evans, G. C., Mathematical Introduction to Economics, New York, McGraw-Hill,
1930.
4. Gelfand, I. M., and Fomin, S. V., Calculus of Variations, Englewood Cliffs, N.J.,
Prentice-Hall, 1963 (tr. from Russian).
5. Ramsey, F. P., "A Mathematical Theory of Saving," Economic Journal, XXXVIII,
December 1928.
6. Shilov, G. Y., Mathematical Analysis, Oxford, Pergamon Press, 1965 (tr. from
Russian).
7. Tomiyama, K., The Logic of Modern Physics, Tokyo, Iwanami, 1956 (in Japanese).

Section B
SPACES OF FUNCTIONS AND
THE CALCULUS OF VARIATIONS¹

a. INTRODUCTION
In the last section, we considered the problem of finding a function $x(t)$ from
the set $X_{[a,b]}$ of differentiable real-valued functions defined on the closed interval
$[a, b]$ so as to maximize the following integral:

$$J[x] = \int_a^b f[t, x(t), x'(t)]\,dt$$

The alert reader may already have noticed the resemblance of the above problem
to the ordinary nonlinear programming problem that we discussed in Chapter 1.
The set $X_{[a,b]}$ of continuously differentiable real-valued functions on $[a, b]$ is a
linear space. The function $J[x]$ is a real-valued function defined on $X_{[a,b]}$. The
problem is to find an $\hat{x} \in X_{[a,b]}$ which maximizes $J[x]$. As a matter of fact, this
analogy is the same even if we take $x(t)$ as a continuously differentiable function of
$[a, b]$ into $R^n$, that is, $x(t) = [x_1(t), x_2(t), \ldots, x_n(t)]$, and $X_{[a,b]}$ is the collection
of such vector-valued continuously differentiable functions on $[a, b]$. The function
$x'(t)$ in $J[x]$ is simply defined by $x'(t) = [x_1'(t), \ldots, x_n'(t)]$, where $x_i'(t) \equiv dx_i(t)/dt$.
What then is the difference between the above problem and the ordinary
nonlinear programming problem? The crucial difference is simply that the linear
space X[a,b] is no longer finite dimensional. In the exposition of ordinary non-
linear programming, the choice set is typically a finite dimensional Euclidian
space, and theorems are developed under this basic assumption. But there is no
guarantee that the theorems that hold for a finite dimensional Euclidian space
remain valid for an infinite dimensional linear space.
However, it is interesting to note that many theorems in the theory of non-
linear programming for the finite dimensional Euclidian space can be re-proved
without too much difficulty (sometimes almost word for word) for infinite dimen-
sional linear spaces. There may be some unexpected difficulties in this task, but if
we accomplish it, we obtain a general theory of nonlinear programming which
covers both the finite dimensional and infinite dimensional cases. In particular,
such a theory will include the classical calculus of variations problem as a special
case. We may note that this corresponds to a trend in modern mathematics to
review the classical results of analysis for spaces of functions (or function spaces);
this field of study is known as functional analysis.
Extensive work along this line has been done by Hurwicz in his seventy-page
article, "Programming in Linear Spaces" [5]. Textbooks on the calculus of
variations have been written from the viewpoint of function spaces (see Gelfand
and Fomin [3] and Shilov [10]; for the exposition of this section we are indebted
to them). We note that our Chapter 1 was written in the same spirit as was the
Hurwicz article. Although we confined our attention to the finite dimensional
case, we remarked in several places that the definitions and theorems could be
extended to the infinite dimensional case. This was done, for example, in the
definitions of derivatives and in the proof of the Kuhn-Tucker main theorem
(Theorem 1.D.3).
In this section we shall explicitly state our problem as a "nonlinear pro-
gramming" problem for the infinite dimensional case and proceed with our analysis.
Euler's condition will be rigorously and more systematically obtained under this
procedure. However, we will not follow this procedure through to its completion.
In other words, we will not be concerned here with developing all the results of
the classical calculus of variations from the viewpoint of nonlinear programming
in infinite dimensional vector spaces. One reason is that this attempt has not been
completed yet. In the meantime, a new development suddenly attracted a great
deal of attention from mathematicians. This development became a matter of vital
interest to American mathematicians after the publication (and translation) of
the book by Pontryagin and his students [8]. This was followed by the work of
Hestenes [4] and his students, in which all the major results of the classical
calculus of variations have been obtained and extended by this new approach. The
most important extension is probably the incorporation of inequality constraints
in a natural way. Active research and development in this field (known as optimal
control theory), which we will summarize in Chapter 8, is being carried out vigorously.
This new approach resembles Hurwicz's approach [5] in the sense that it
recognizes the problem as a choice problem in infinite dimensional space, but it
is different in the sense that it does not come out as a natural extension of the
ordinary linear and nonlinear programming theory. An interesting novelty in
viewing and formulating variational problems is seen in the Pontryagin-Hestenes
approach. A natural question now is whether we can develop the ordinary non-
linear programming theory for the finite dimensional case from this new formula-
tion. The answer should be yes, but how? This task has been partly accomplished
recently by Canon, Cullum, and Polak [2]. However, we will not go into this
problem here. Furthermore, in this section we restrict ourselves to the simplest
problem, that is, the problem in which there are no constraints and the "end
points" such as a and b are fixed.

b. SPACES OF FUNCTIONS AND OPTIMIZATION

We begin our discussion by reminding the reader of some basic definitions
(see Chapter 0, Section A, and Chapter 1, Section C).

Definition: A set $X$ of elements $x, y, z, \ldots$ of any kind is called a linear space
(or vector space) over the real field if $X$ is closed under "addition" (denoted by $+$)
and multiplication by real numbers such as $\alpha$, $\beta$, and so on, and if the following
axioms (for any $x, y, z \in X$ and for any $\alpha, \beta \in R$) are satisfied:
(L-1) $x + y = y + x$.
(L-2) $(x + y) + z = x + (y + z)$.
(L-3) $\exists$ an element "0" ("zero") such that $x + 0 = x$ for any $x \in X$.
(L-4) For each $x \in X$, there exists an element "$-x$" such that $x + (-x) = 0$.
(L-5) $\alpha(\beta x) = (\alpha\beta)x$ and $1x = x$.
(L-6) $\alpha(x + y) = \alpha x + \alpha y$.
(L-7) $(\alpha + \beta)x = \alpha x + \beta x$.

Definition: A linear space $X$ is said to be normed if to each element $x \in X$ there
corresponds a nonnegative number $\|x\|$ (called the norm of $x$) such that (1) $\|x\| = 0$
if and only if $x = 0$; (2) $\|x + y\| \le \|x\| + \|y\|$ for any $x, y \in X$; and (3)
$\|\alpha x\| = |\alpha|\,\|x\|$ for any $\alpha \in R$.

Definition: Let $X$ be a linear space. Any function from $X$ into $R$ is called a functional
on $X$. A functional $J$ on $X$ is called a linear functional on $X$ if (1) $J[\alpha x] = \alpha J[x]$
for any $x \in X$ and $\alpha \in R$, and (2) $J[x + y] = J[x] + J[y]$ for any $x, y \in X$.

Definition: Let $X$ be a normed linear space. A functional $J$ is said to be continuous
at $x^0$ if, for any $\epsilon > 0$, there exists a $\delta > 0$ such that $\|x - x^0\| < \delta$ implies
$|J[x] - J[x^0]| < \epsilon$. When $J$ is continuous at every point of $X$, $J$ is called
continuous in $X$.
REMARK: It can be shown fairly easily that
(i) A linear functional on $X$ is continuous in $X$ if it is continuous at one point
$x^0 \in X$.
(ii) A linear functional is continuous if and only if it is bounded (that is,
there exists a $\mu$ such that $|J[x]| \le \mu\|x\|$ for all $x \in X$). For the proofs
of the above statements, see Kolmogorov and Fomin [7], pp. 77-78, for
example (or the reader may try it himself).
EXAMPLE: Let $C_{[a,b]}$ be the set of all (bounded) continuous real-valued
functions $x(t)$ defined on a closed interval $[a, b]$. Then $C_{[a,b]}$ is a linear
space.² $\|x\| \equiv \sup_{a \le t \le b} |x(t)|$ is a norm.³ Note that there can be many
kinds of norms. $J[x] = \int_a^b x(t)\,dt$ (Riemann integral) is a continuous linear
functional on $C_{[a,b]}$. $J[x] = \int_a^b \alpha(t)x(t)\,dt$, where $\alpha(t)$ is a given function
in $C_{[a,b]}$, is also a continuous linear functional.
REMARK: The normed linear space will be considered as a metric space
with the metric naturally induced by the norm.
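
The example above can be checked numerically. The following is only a sketch under stated assumptions: the interval $[0, 2]$, the two sample functions, the grid size, and the helper names `sup_norm` and `integral` are illustrative choices, not part of the text. It approximates the sup norm on a grid and verifies that the integral functional is linear and satisfies the bound $|J[x]| \le (b - a)\|x\|$, so that $\mu = b - a$ serves as the constant in the definition of boundedness.

```python
import math

def sup_norm(x, a, b, n=10_000):
    """Grid approximation of ||x|| = sup over [a, b] of |x(t)|."""
    return max(abs(x(a + (b - a) * i / n)) for i in range(n + 1))

def integral(x, a, b, n=10_000):
    """Composite midpoint rule for the integral of x over [a, b]."""
    step = (b - a) / n
    return step * sum(x(a + (i + 0.5) * step) for i in range(n))

a, b = 0.0, 2.0                       # assumed interval
J = lambda u: integral(u, a, b)       # J[x] = integral of x(t) dt over [a, b]

x = math.sin                          # two sample elements of C[a, b]
y = lambda t: t * t - 1.0

# Linearity: J[3x + y] should equal 3 J[x] + J[y] up to rounding.
linearity_gap = abs(J(lambda t: 3.0 * x(t) + y(t)) - (3.0 * J(x) + J(y)))

# Boundedness: |J[x]| <= (b - a) * ||x||.
bound_holds = abs(J(x)) <= (b - a) * sup_norm(x, a, b)

print(linearity_gap, bound_holds)
```

The grid maximum can only underestimate the true supremum, so the boundedness check here is slightly stricter than the inequality it illustrates.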

Definition: Let $X$ be a normed linear space. A functional $J$ on $X$ is said to be
differentiable at $x^0$, where $x^0 \in X$, if there exists a continuous linear functional
$\Phi$ on $X$ such that $\Delta J[x^0; h] \equiv J[x^0 + h] - J[x^0] = \Phi[h] + o(\|h\|)$, where
$o(\|h\|)$ is an infinitesimal of higher order than $\|h\|$ (Landau's $o$), that is,

$$\lim_{h \to 0} \frac{o(\|h\|)}{\|h\|} = 0$$

The functional $J$ is said to be differentiable in $X$, if it is differentiable at each $x$ in $X$.

We may recall that when $X$ is finite (say, $n$) dimensional, $\Phi[h]$ can be written
as $\Phi \cdot h$ where $\Phi$ is an $n$-vector. Note that $\Phi[h]$ denotes the value of the linear
functional $\Phi$ at $h$. The linear functional $\Phi$ is called the first derivative of $J$ and is
denoted by $J'$ or $J'[x^0]$; $\Phi[h]$ is called the first differential (or the first variation)
of $J$ and is denoted by $dJ[x^0; h]$ or $dJ[h]$.⁴
REMARK: Needless to say, for a fixed $x^0$, $\Phi[h]$ depends only on $h$. However,
if $x^0$ changes, $\Phi[h]$ also changes.⁵ Hence $\Phi[h]$ may be written as
$\Phi[x^0; h]$. It can be shown that, given $x^0$, the differential $\Phi[h]$ of a differentiable
functional is unique, which makes the above definition meaningful.
REMARK: A linear functional $J$ is obviously always differentiable since, by
definition,
$$J[x + h] - J[x] = J[h]$$
REMARK: From the above definition, we can say that a functional is
differentiable if its increment $\Delta J[x]$ is split into two parts: a linear
functional of $h$ and an infinitesimal of higher order than $\|h\|$, that is, $o(\|h\|)$.
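
To make the splitting of the increment concrete, here is a minimal numerical sketch. The functional $J[x] = \int_0^1 x(t)^2\,dt$, the base point $x^0(t) = 1 + t$, and the displacement direction $\cos 3t$ are assumptions chosen for illustration only. For this $J$ the linear part is $2\int_0^1 x^0(t)h(t)\,dt$ and the remainder is $\int_0^1 h(t)^2\,dt$, so the ratio of the remainder to the sup norm of $h$ should shrink in proportion to the size of $h$.

```python
import math

def integral(g, a=0.0, b=1.0, n=2_000):
    """Composite midpoint rule on [a, b]."""
    step = (b - a) / n
    return step * sum(g(a + (i + 0.5) * step) for i in range(n))

def J(x):
    """Assumed example functional: J[x] = integral of x(t)^2 over [0, 1]."""
    return integral(lambda t: x(t) ** 2)

def first_variation(x, h):
    """Linear part of the increment of J at x in the direction h."""
    return integral(lambda t: 2.0 * x(t) * h(t))

x0 = lambda t: 1.0 + t                # assumed base point
ratios = []
for scale in (1e-1, 1e-2, 1e-3):
    h = lambda t, s=scale: s * math.cos(3.0 * t)
    sup_h = scale                     # sup of |s cos(3t)| on [0, 1], attained at t = 0
    increment = J(lambda t: x0(t) + h(t)) - J(x0)
    remainder = increment - first_variation(x0, h)
    ratios.append(abs(remainder) / sup_h)

print(ratios)
```

Each tenfold reduction in the size of $h$ reduces the ratio roughly tenfold, which is the behavior the definition of $o(\|h\|)$ requires.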

Definition: A functional $J[x]$ on a normed linear space $X$ is said to be twice
differentiable at $x^0$, where $x^0 \in X$, if there exists a linear functional $\Phi$ and a quadratic
functional $Q$ such that

$$\Delta J[x^0; h] \equiv J[x^0 + h] - J[x^0] = \Phi[h] + Q[h, h] + o(\|h\|^2)$$

Here $o(\|h\|^2)$ is the infinitesimal of higher order than $\|h\|^2$, that is,
$\lim_{h \to 0} o(\|h\|^2)/\|h\|^2 = 0$. $Q[h, h]$ is called the second differential (or second
variation) of $J$ at $x^0$ and is denoted by $d^2J[x^0; h]$ or $d^2J[h]$. When $J$ is twice
differentiable for all $x \in X$, $J$ is said to be twice differentiable in $X$.
REMARK: We may recall the definition of the quadratic functional (Chapter
1, Section E). A real-valued function $Q(x, x)$ on a linear space $X$ is called a
quadratic functional on $X$ if $Q(x, y)$ is a bilinear functional.
REMARK: Since a quadratic functional on $R^n$ can be written as a quadratic
form, $Q[h, h]$ in the above definition can be written as $Q[h, h] = h \cdot A \cdot h$,
where $A = [a_{ij}]$ is an $n \times n$ matrix, $h \in R^n$, that is, $h = (h_1, h_2, \ldots, h_n)$,
and $a_{ij} = \partial^2 J/\partial x_i \partial x_j$ evaluated at $x = x^0$. If the $a_{ij}$'s are bounded in the
neighborhood of $x^0$ (that is, if there exists a $\mu$ such that $|a_{ij}| \le \mu$ for all $i$ and $j$),
then $h \cdot A \cdot h \le \sum_{i,j} \mu |h_i|\,|h_j| \le \mu n^2 \|h\|^2$, where we may define $\|h\|$ by

(a) $\|h\| \equiv \left(\sum_{i=1}^n h_i^2\right)^{1/2}$
or
(b) $\|h\| \equiv \max_{1 \le i \le n} \{|h_i|\}$
We now turn to the problem of maximization (or minimization) in an infinite

dimensional space. The reader will readily recognize that our discussions here are
strictly analogous to the ones in the unconstrained maximization (or minimization)
problem for a finite dimensional case (see Chapter 1, Section C).

Definition: Let $J[x]$ be a functional (not necessarily linear) defined over a
normed linear space $X$.
(i) The functional $J[x]$ is said to achieve its local maximum (resp. local minimum)
at $\hat{x} \in X$, if there exists an open ball $B_\epsilon(\hat{x})$ about $\hat{x}$ with radius $\epsilon$ such that
$B_\epsilon(\hat{x}) \subset X$ and $J[\hat{x}] \ge J[x]$ (resp. $J[\hat{x}] \le J[x]$) for all $x \in B_\epsilon(\hat{x})$. If
$J[\hat{x}] > J[x]$ (resp. $J[\hat{x}] < J[x]$) for all $x \in B_\epsilon(\hat{x})$ whenever $x \ne \hat{x}$, we say
that $J$ achieves its unique local maximum (resp. unique local minimum) at $\hat{x}$.
(ii) We say that $J$ achieves its global maximum (resp. global minimum) at $\hat{x} \in X$ if
$J[\hat{x}] \ge J[x]$ (resp. $J[\hat{x}] \le J[x]$) for all $x \in X$. If $J[\hat{x}] > J[x]$ (resp. $J[\hat{x}] < J[x]$)
whenever $x \ne \hat{x}$, then we say that $J$ achieves its unique global maximum
(resp. unique global minimum) at $\hat{x}$.
REMARK: When $J[x]$ has either a local maximum or a local minimum at $\hat{x}$,
it is said to have a local extremum at $\hat{x}$, and similarly for unique local extremum,
global extremum, and unique global extremum.
REMARK: It should be clear that if $\hat{x}$ furnishes a global maximum (resp.
global minimum) of $J[x]$, it also furnishes a local maximum (resp. local
minimum) of $J[x]$, but not vice versa. In other words, local maximality is
necessary for global maximality but is not sufficient.

Theorem 5.B.1: Let $J[x]$ be a differentiable functional defined over a normed linear
space $X$. A necessary condition for $J[x]$ to have a local (or global) extremum at $\hat{x} \in X$
is that its differential vanishes at $x = \hat{x}$; that is, $dJ[\hat{x}; h] = 0$ for all $h$.
PROOF: From the definition of a differential, $\Delta J[\hat{x}; h] \equiv J[\hat{x} + h] - J[\hat{x}]
= dJ[\hat{x}; h] + \delta\|h\|$, where $\delta \equiv o(\|h\|)/\|h\|$. By the definition of $o(\|h\|)$,
$\delta \to 0$ as $h \to 0$. Hence $\Delta J[\hat{x}; h]$ and $dJ[\hat{x}; h]$ have the same sign [that is,
$\Delta J[\hat{x}; h] > 0$ (or $< 0$) depending on whether $dJ[\hat{x}; h] > 0$ (or $< 0$) for
sufficiently small $h$]. Now suppose that $dJ[\hat{x}; h^0] \ne 0$ for some admissible $h^0$.
Since $dJ$ is a linear functional in $h$, $dJ[\hat{x}; -\theta h^0] = -\theta\,dJ[\hat{x}; h^0] \ne 0$ for any
real $\theta \ne 0$. Let $\theta > 0$ be small enough. Then $\Delta J[\hat{x}; h]$ can be neither always
$\ge 0$ nor always $\le 0$, for arbitrarily small $h$ (or small $\theta$). This contradicts the
assumption that $J[x]$ has a (local) extremum at $\hat{x}$ [that is, $(J[\hat{x}] - J[x])$ has a
definite sign in some neighborhood of $\hat{x}$]. (Q.E.D.)

REMARK: When $X = R^n$, we used the "chain rule" (Theorem 1.C.2) for the
proof of the above theorem (Theorem 1.C.6). By noting that the chain
rule also holds in an infinite dimensional (normed linear) space, we may
prove Theorem 5.B.1 in the way we proved the finite dimensional case
(Theorem 1.C.6). Let $X$, $Y$, and $Z$ be normed linear spaces. Let $f$ be a function
from $X$ into $Y$ and $g$ be a function from $f(X) \subset Y$ into $Z$. Let $f$ be differentiable
at $x^0$ and $g$ be differentiable at $f(x^0)$. The chain rule for this case simply
states that for $h \equiv g \circ f$, $h'(x^0) = g'[f(x^0)] \circ f'(x^0)$. The proof of Theorem
5.B.1 then goes as follows: Let $h \in X$ be such that $\|h\| = 1$. Consider
$\varphi(\theta) \equiv J[\hat{x} + \theta h]$ where $\theta \in R$. Then $\varphi(\theta)$ is a function from $R$ into itself. By
assumption, $\varphi(\theta)$ has a local extremum at $\theta = 0$; hence by elementary calculus,
$\varphi'(0) = 0$. By the chain rule, $\varphi'(0) = J'[\hat{x}] \cdot h$. Thus $J'[\hat{x}] = 0$ or
$dJ[\hat{x}] = 0$.
REMARK: It may have to be recalled that $X$ does not have to be a finite
dimensional linear space. For example, $x(t)$ may be in $C_{[a,b]}$ with a certain
norm. Then the fact that $J[x]$ has a local maximum at $\hat{x}$ means that
$J[\hat{x}] - J[x] \ge 0$ for some neighborhood of the curve $\hat{x}(t)$, where the "neighborhood"
is defined in terms of the metric induced by the norm of $C_{[a,b]}$.

Theorem 5.B.2: Let $J[x]$ be a twice differentiable functional defined on a normed
linear space $X$. A necessary condition for $J[x]$ to have a local (or global) minimum
at $x = \hat{x}$ is that
$$d^2J[\hat{x}; h] \ge 0 \quad \text{for all } h$$
PROOF: By definition, we have

$$\Delta J[\hat{x}; h] \equiv J[\hat{x} + h] - J[\hat{x}] = dJ[\hat{x}; h] + d^2J[\hat{x}; h] + o(\|h\|^2)$$

By Theorem 5.B.1, $dJ[\hat{x}; h] = 0$. Hence $\Delta J[\hat{x}; h]$ and $d^2J[\hat{x}; h]$ have the
same sign for sufficiently small $\|h\|$. Suppose $d^2J[\hat{x}; h^0] < 0$ for some $h^0$.
Then from the bilinearity of the quadratic functional $d^2J[\hat{x}; h]$, we have
$$d^2J[\hat{x}; \theta h^0] = \theta^2 d^2J[\hat{x}; h^0] < 0 \quad \text{for any } \theta \ne 0$$
Hence $\Delta J[\hat{x}; h]$ can be made negative for an arbitrarily small $\|h\|$, which
contradicts the assumption that $J[x]$ has a local minimum at $x = \hat{x}$.
(Q.E.D.)
REMARK: Similarly, we can easily prove that $d^2J[\hat{x}; h] \le 0$ is a necessary
condition for $\hat{x}$ to furnish a local maximum for $J[x]$. We call $d^2J[\hat{x}; h] \ge 0$
(or $\le 0$ for maximum) the second-order necessary condition for a local
minimum (or maximum). For the finite dimensional case, recall Theorem
1.E.15.

Definition: A quadratic functional $Q[x]$ defined on a normed linear space $X$ is
called strongly positive definite if there exists a constant $\theta > 0$ such that
$$Q[x] \ge \theta\|x\|^2 \quad \text{for all } x \in X$$

Theorem 5.B.3: A sufficient condition for a functional $J[x]$ to have a unique local
minimum at $x = \hat{x}$, given that the first differential at $\hat{x}$ vanishes (that is, $dJ[\hat{x}; h] = 0$),
is that its second differential at $\hat{x}$, $d^2J[\hat{x}; h]$, be strongly positive definite.
PROOF: Since $dJ[\hat{x}; h] = 0$, we have $\Delta J[\hat{x}; h] = d^2J[\hat{x}; h] + \epsilon\|h\|^2$, where
$\epsilon \equiv o(\|h\|^2)/\|h\|^2$ (that is, $\epsilon \to 0$ as $h \to 0$). By assumption, there exists a
$\theta > 0$ such that $d^2J[\hat{x}; h] \ge \theta\|h\|^2$ for all $h$. Hence we obtain
$$\Delta J[\hat{x}; h] = d^2J[\hat{x}; h] + \epsilon\|h\|^2 \ge (\theta + \epsilon)\|h\|^2$$
For $\epsilon$ small enough (that is, $\|h\|$ small enough) with $h \ne 0$, we have $\theta + \epsilon > 0$.
Hence $\Delta J[\hat{x}; h] > 0$. (Q.E.D.)

REMARK: In a finite dimensional space, the strong positive definiteness of
a quadratic form is equivalent to its positive definiteness. In the general
case (not necessarily finite dimensional), strong positive definiteness can
be stronger than positive definiteness. Hence in the above sufficiency
condition, the strong positive definiteness of $d^2J[\hat{x}; h]$ cannot be replaced
by the positive definiteness (that is, $d^2J[\hat{x}; h] > 0$ for all $h \ne 0$). The following
example is given by Shilov ([10], pp. 90-91): $J[x] = \int_0^1 x^2(t)[t - x(t)]\,dt$,
where $x(t)$ is continuously differentiable on $[0, 1]$. Let $\hat{x}(t) \equiv 0$. Then
$dJ[\hat{x}; h] = dJ[0; h] = 0$. Moreover $d^2J[\hat{x}; h] = d^2J[0; h] = \int_0^1 t\,h^2(t)\,dt$,
which is positive for every $h(t) \not\equiv 0$. But we can easily pick a function $x(t)$
such that $J[x] < 0$; that is, $J[x]$ does not achieve a minimum at $\hat{x}(t) \equiv 0$.
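
Shilov's example can be explored numerically. The sketch below is illustrative only: the family of test functions `bump(eps)` and the use of the sup norm are assumptions made here, not taken from the text. Each `bump(eps)` is continuously differentiable, has sup norm `eps`, and gives a negative value of $J$; at the same time the ratio of the second variation to the squared norm tends to zero, which is exactly the failure of strong positive definiteness.

```python
def integral(g, a=0.0, b=1.0, n=20_000):
    """Composite midpoint rule on [a, b]."""
    step = (b - a) / n
    return step * sum(g(a + (i + 0.5) * step) for i in range(n))

def J(x):
    """Shilov's functional: integral over [0, 1] of x(t)^2 * (t - x(t))."""
    return integral(lambda t: x(t) ** 2 * (t - x(t)))

def second_variation(h):
    """Second variation at the zero function: integral of t * h(t)^2."""
    return integral(lambda t: t * h(t) ** 2)

def bump(eps):
    """Assumed test function: eps*(1 - t/eps)^2 on [0, eps], zero beyond; sup norm eps."""
    return lambda t: eps * (1.0 - t / eps) ** 2 if t < eps else 0.0

results = []
for eps in (0.2, 0.1, 0.05):
    h = bump(eps)
    results.append((eps, J(h), second_variation(h) / eps ** 2))

for eps, value, ratio in results:
    print(eps, value, ratio)
```

For this family a direct integration gives $J = -23\epsilon^4/210$ and a ratio of $\epsilon^2/30$, so the printed values of $J$ are negative for every `eps` while the ratio shrinks toward zero.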
REMARK: Theorem 5.B.3 is formulated in terms of minimum. For the
maximum problem, we can say $\hat{x}$ furnishes a unique local maximum of
$J[x]$, if $dJ[\hat{x}; h] = 0$ and if there exists a $\theta > 0$ such that
$d^2J[\hat{x}; h] \le -\theta\|h\|^2$ for all $h$.

c. EULER'S CONDITION AND A SUFFICIENCY THEOREM

Let $x(t) = [x_1(t), \ldots, x_n(t)]$ and $x'(t) = [x_1'(t), \ldots, x_n'(t)]$, where $x_i(t)$,
$i = 1, 2, \ldots, n$, are real-valued continuously differentiable functions defined on
the closed interval $[a, b]$, and $x_i'(t) \equiv dx_i(t)/dt$, $i = 1, 2, \ldots, n$. Let us denote this
by $x_i(t) \in D_{[a,b]}$ or $x(t) \in D^n_{[a,b]}$. Now consider the following specific form of $J[x]$.

$$J[x] = \int_a^b f[t, x(t), x'(t)]\,dt$$

Here $f$ is defined in an open subset of $R^{2n+1}$ which includes the space of $[t, x(t), x'(t)]$
which is defined for $a \le t \le b$. The function $f$ is assumed to possess continuous
first and second partial derivatives with respect to $t$, $x$, and $x'$. With this explicit
form of $J[x]$, we realize at once that the general problem of minimization (or
maximization) of $J[x]$ turns out to be that of the calculus of variations.
Let us now consider a "displacement" of $x(t)$ by $h(t)$, where $h(t) \in D^n_{[a,b]}$
with $h_i(a) = h_i(b) = 0$, $i = 1, 2, \ldots, n$.

$$\Delta J[x; h] = J[x + h] - J[x] = \int_a^b \{f[t, x(t) + h(t), x'(t) + h'(t)] - f[t, x(t), x'(t)]\}\,dt$$

For notational simplicity, write the $2n$-vectors as $(x, x') \equiv y$ and $(h, h') \equiv k$. Clearly
$(x + h, x' + h') = y + k$. From the differentiability of $f$ with respect to $y$, we have,
for each fixed $t$,

$$f[t, x + h, x' + h'] - f[t, x, x'] = f[t, y + k] - f[t, y] = f_y \cdot k + o(\|k\|)$$

where $f_y \equiv (\partial f/\partial x_1, \ldots, \partial f/\partial x_n, \partial f/\partial x_1', \ldots, \partial f/\partial x_n')$. Here the norm $\|k\|$ is defined as

$$\|k\| \equiv \max_{a \le t \le b} \{|h_1(t)|, \ldots, |h_n(t)|, |h_1'(t)|, \ldots, |h_n'(t)|\}$$

Alternatively, we can carry out a similar analysis with the following norm:

$$\|k\| \equiv \max_{a \le t \le b} \left[\sum_{i=1}^n h_i^2(t) + \sum_{i=1}^n (h_i'(t))^2\right]^{1/2}$$

Note that $o(\|k\|)$ depends on $t$ and $y$ as well as on $k$, so that we may write
$o(\|k\|) = r(t, y, k)$. Assume that the second partial derivatives of the function
$f(t, y)$ with respect to $y$ are bounded by $N$ (in absolute value), so that we have⁶

$$|o(\|k\|)| \le \tfrac{1}{2} N (2n)^2 \|k\|^2$$

Hence we obtain

$$\Delta J[x; h] = \int_a^b (f_y \cdot k)\,dt + \int_a^b o(\|k\|)\,dt \le \int_a^b (f_y \cdot k)\,dt + 2Nn^2 \int_a^b \mu^2\,dt \quad \text{for } \|k\| \le \mu$$

In other words,

$$\Delta J[x; h] \le \int_a^b (f_y \cdot k)\,dt + 2Nn^2(b - a)\mu^2 \quad \text{for } \|k\| \le \mu$$

Thus we see that the increment of the functional $J[x]$ is split into a principal
linear part (linear with respect to $k$) and an infinitesimal of higher order. For the
latter, note that $\lim_{\mu \to 0} [2Nn^2(b - a)\mu^2]/\mu = 0$. Hence $J[x]$ is differentiable and
its differential has the form

$$dJ[x; h] = \int_a^b (f_y \cdot k)\,dt = \int_a^b \left[\frac{\partial f}{\partial x_1}h_1 + \cdots + \frac{\partial f}{\partial x_n}h_n + \frac{\partial f}{\partial x_1'}h_1' + \cdots + \frac{\partial f}{\partial x_n'}h_n'\right]dt$$

In other words, if $f$, defined on a Euclidian space, is differentiable with respect
to $x$ and $x'$, and if the second partial derivatives of $f$ with respect to $x$ and $x'$ are
bounded, then the functional $J$ is differentiable and its first differential is written
as above.
From Theorem 5.B.1, a necessary condition for $J$ to have an extremum is
that its first differential vanish. Hence for the above $J$, we ought to have, for the
optimal arc $x(t)$,

$$dJ = \int_a^b \left[\sum_{i=1}^n \frac{\partial f}{\partial x_i}h_i + \sum_{i=1}^n \frac{\partial f}{\partial x_i'}h_i'\right]dt = 0$$

To obtain the Euler condition, we assume that $x(t)$ is twice differentiable and then
perform integration by parts for the terms which involve $h_i'$ in the above equation.

$$\int_a^b \sum_{i=1}^n \frac{\partial f}{\partial x_i'}h_i'\,dt = \sum_{i=1}^n f_{x_i'}h_i\Big|_a^b - \int_a^b \sum_{i=1}^n \left[\frac{d}{dt}f_{x_i'}\right]h_i\,dt$$

where $f_{x_i'} \equiv \partial f/\partial x_i'$. The first term on the right of the above equation is zero since
$h_i(a) = h_i(b) = 0$. Hence we have

$$dJ = \int_a^b \left[\sum_{i=1}^n f_{x_i}h_i - \sum_{i=1}^n \frac{d}{dt}f_{x_i'}h_i\right]dt = \int_a^b \sum_{i=1}^n \left[f_{x_i} - \frac{d}{dt}(f_{x_i'})\right]h_i\,dt = 0$$

for all $h_i(t)$, $i = 1, 2, \ldots, n$, where $f_{x_i} \equiv \partial f/\partial x_i$ and $f_{x_i'} \equiv \partial f/\partial x_i'$. Since all the
increments $h_i(t)$ are independent and arbitrary, we obtain, by setting all except
one to zero, that

$$\int_a^b \left[f_{x_i} - \frac{d}{dt}(f_{x_i'})\right]h_i\,dt = 0 \quad \text{for all } h_i(t)$$

Hence from the fundamental lemma of the calculus of variations (Section A), we
must have (noting the continuity of $df_{x_i'}/dt$, which is due to the continuous
differentiability of $f_{x_i'}$)

(E) $f_{x_i} - \dfrac{d}{dt}(f_{x_i'}) = 0$, $i = 1, 2, \ldots, n$

This is a (necessary) condition that the optimal arc must satisfy and it is called
Euler's condition for the n-variable case.
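
As an illustration of condition (E) in the one-variable case, the following sketch evaluates the Euler residual $f_x - \frac{d}{dt}f_{x'}$ along two candidate paths. The integrand $f = x'^2 + x^2$ on $[0, 1]$ with end points $x(0) = 0$, $x(1) = 1$, the helper `euler_residual`, and the central-difference step are all assumptions chosen for illustration. For this $f$, (E) reads $2x - 2x'' = 0$, which the path $\sinh t/\sinh 1$ satisfies and the straight line $x(t) = t$ does not.

```python
import math

def euler_residual(f_x, f_xp, x, dx, t, step=1e-4):
    """f_x - d/dt f_{x'} along the path x(t); d/dt is a central difference."""
    g = lambda s: f_xp(s, x(s), dx(s))
    return f_x(t, x(t), dx(t)) - (g(t + step) - g(t - step)) / (2.0 * step)

# Assumed integrand f(t, x, x') = x'^2 + x^2 and its partial derivatives.
f_x = lambda t, x, p: 2.0 * x
f_xp = lambda t, x, p: 2.0 * p

# Candidate 1: solves x'' = x with x(0) = 0, x(1) = 1.
x_hat = lambda t: math.sinh(t) / math.sinh(1.0)
dx_hat = lambda t: math.cosh(t) / math.sinh(1.0)

# Candidate 2: the straight line joining the same end points.
line = lambda t: t
dline = lambda t: 1.0

grid = [0.1 * i for i in range(1, 10)]
worst_hat = max(abs(euler_residual(f_x, f_xp, x_hat, dx_hat, t)) for t in grid)
line_mid = euler_residual(f_x, f_xp, line, dline, 0.5)

print(worst_hat, line_mid)
```

The residual along the first path is zero up to discretization error, while along the straight line it equals $2t$, so the line joins the same end points without satisfying (E).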

REMARK: When $x(t)$ is not twice differentiable, then we cannot get (E) as
we discussed in Section A. We instead obtain

$$\int_a^t f_{x_i}\,d\tau = f_{x_i'} + c, \quad i = 1, 2, \ldots, n, \text{ where } c \text{ is some constant}$$

We may obtain the expression for the second differential $d^2J$ for
$$J[x] \equiv \int_a^b f[t, x(t), x'(t)]\,dt$$
by assuming that all the second partial derivatives of $f$ exist and are continuous,
and by using Taylor's expansion. For example, if $x(t)$ is real-valued (instead of
$R^n$-valued), then

$$d^2J = \frac{1}{2}\int_a^b [f_{xx}h^2 + 2f_{xx'}hh' + f_{x'x'}h'^2]\,dt$$

A similar expression can be obtained readily when $x$ is $R^n$-valued. Then using
Theorems 5.B.2 and 5.B.3, we can obtain expressions of the second-order
necessary conditions and sufficient conditions. These conditions are known by
the names of Legendre, Weierstrass, Jacobi, and so on. The simplest is the
Legendre condition for necessity (of a minimum), which states

$$f_{x'x'} \ge 0$$

when $x$ is real-valued. For a maximum, it is

$$f_{x'x'} \le 0$$
Since discussions of these conditions are tedious, we omit them entirely. We will,
however, prove the following remarkable sufficiency theorem, which has its
counterpart for the finite dimensional case (see Theorem 1.C.7).

Theorem 5.B.4: Let $f[t, x(t), x'(t)]$ be differentiable with respect to $x(t)$ and $x'(t)$,
where $x(t)$ is an $R^n$-valued twice differentiable function on the closed interval $[a, b]$
with $x(a) = \alpha$ and $x(b) = \beta$. Suppose that $f$ is a concave function in $x(t)$ and $x'(t)$.
Then a necessary and sufficient condition that $\hat{x}(t)$ maximizes the integral

$$J[x] = \int_a^b f[t, x(t), x'(t)]\,dt$$

is that it satisfies the Euler condition

$$\frac{\partial f}{\partial x} = \frac{d}{dt}\left[\frac{\partial f}{\partial x'}\right], \quad \text{with } x(a) = \alpha \text{ and } x(b) = \beta$$

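PROOF: The necessity is obvious, so we prove the sufficiency. Let $\hat{x}(t)$ and
$\hat{x}'(t)$ satisfy the above Euler equation. Denote $f[t, \hat{x}(t), \hat{x}'(t)]$ by $\hat{f}$, and let
$\partial\hat{f}/\partial x \equiv \hat{f}_x$, $\partial\hat{f}/\partial x' \equiv \hat{f}_{x'}$. Then, we can write the following string of an
inequality and equalities:

$$J[x] - J[\hat{x}] = \int_a^b (f - \hat{f})\,dt \le \int_a^b [(x - \hat{x}) \cdot \hat{f}_x + (x' - \hat{x}') \cdot \hat{f}_{x'}]\,dt \quad [\because \text{concavity}]$$

$$= \int_a^b (x - \hat{x}) \cdot \left(\hat{f}_x - \frac{d}{dt}\hat{f}_{x'}\right)dt + (x - \hat{x}) \cdot \hat{f}_{x'}\Big|_a^b$$

[Integration by parts yields $\int_a^b (x' - \hat{x}') \cdot \hat{f}_{x'}\,dt = (x - \hat{x}) \cdot \hat{f}_{x'}\big|_a^b - \int_a^b (x - \hat{x}) \cdot \frac{d}{dt}(\hat{f}_{x'})\,dt$]

$$= (x - \hat{x}) \cdot \hat{f}_{x'}\Big|_a^b \quad [\because \text{Euler's equation}]$$

$$= 0 \quad [\because \text{fixed end points, that is, } x(a) = \hat{x}(a) = \alpha \text{ and } x(b) = \hat{x}(b) = \beta]$$
(Q.E.D.)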
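
The sufficiency result can be illustrated numerically. This is a sketch under assumptions chosen here, not an example from the text: the integrand is $f = -(x'^2 + x^2)$, which is concave in $(x, x')$, on $[0, 1]$ with $x(0) = 0$ and $x(1) = 1$. Its Euler equation is $x'' = x$, solved by $\hat{x}(t) = \sinh t/\sinh 1$. The code compares $J[\hat{x}]$ with $J[\hat{x} + h]$ for several displacements $h$ that vanish at both end points.

```python
import math

def integral(g, a=0.0, b=1.0, n=20_000):
    """Composite midpoint rule on [a, b]."""
    step = (b - a) / n
    return step * sum(g(a + (i + 0.5) * step) for i in range(n))

def J(x, dx):
    """Assumed concave integrand f(t, x, x') = -(x'^2 + x^2)."""
    return integral(lambda t: -(dx(t) ** 2 + x(t) ** 2))

# Path satisfying the Euler equation x'' = x with x(0) = 0, x(1) = 1.
x_hat = lambda t: math.sinh(t) / math.sinh(1.0)
dx_hat = lambda t: math.cosh(t) / math.sinh(1.0)

J_hat = J(x_hat, dx_hat)

gaps = []
for k in (1, 2, 3):
    for c in (0.5, 0.1, -0.1):
        # Displacement c*sin(k*pi*t) vanishes at t = 0 and t = 1.
        h = lambda t, k=k, c=c: c * math.sin(k * math.pi * t)
        dh = lambda t, k=k, c=c: c * k * math.pi * math.cos(k * math.pi * t)
        J_alt = J(lambda t: x_hat(t) + h(t), lambda t: dx_hat(t) + dh(t))
        gaps.append(J_alt - J_hat)

print(J_hat, max(gaps))
```

Every gap is negative, as the theorem predicts for a concave integrand. For this particular $f$ the value $J[\hat{x}]$ equals $-\coth 1$ by direct integration.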
REMARK: It can easily be seen from the above proof that if $f$ is strictly
concave in $x$ and $x'$, then $\hat{x}(t)$ provides a unique global maximum. Note also
that if $f$ is convex (resp. strictly convex) in $x$ and $x'$, then $\hat{x}(t)$ provides a
global minimum (resp. unique global minimum).
REMARK: Any integral evaluated at a single point is obviously zero. Also,
any integral evaluated on a set of countably many points is zero. Therefore,
if two (integrable) functions u(t) and v(t) differ only for countably many
points in [a, b], then the values of the integrals of these functions from a
to b will be the same. Hence the "uniqueness" of an optimal solution in the
above only means uniqueness except for countably many points.
REMARK: As we noted in Section A, the above theorem does not say that
there does exist such an $\hat{x}(t)$. It is possible that there is no solution for
the differential equation, that is, Euler's equation. The existence of $\hat{x}(t)$ is
simply assumed in the above theorem.
REMARK: This theorem was obtained by the author and presented as a
lecture at the University of Minnesota in the spring of 1966. See Takayama
[11]. It is now a special case of Mangasarian's theorem in optimal control
theory (see Theorem 8.C.5).

FOOTNOTES

1. For the first reading, this section can safely be skipped, except for Theorem 5.B.4.
The major purpose of this section (except for Theorem 5.B.4) is to clarify the basic
underlying mathematical structure of the calculus of variations problem (rather than
to provide theorems useful for practical applications), which then will be useful for
further theoretical studies on this topic. Theorem 5.B.4 can be read independently
of the rest of this section, and it provides a useful result in applications. That is,
under "concave cases" the Euler condition is sufficient (as well as necessary) for
a global maximum.
2. Define addition and scalar multiplication pointwise; that is, for any $x(t)$ and
$y(t) \in C_{[a,b]}$, define $(x + y)(t) \equiv x(t) + y(t)$ and $(\alpha x)(t) \equiv \alpha x(t)$. The zero element
is $x(t) \equiv 0$ for all $t$ and $(-x)(t) \equiv -x(t)$.
3. With this norm, $C_{[a,b]}$ is a Banach space. We may recall that the Banach space is
defined as a normed linear space which is "complete" as a metric space induced by
the norm. A metric space is called complete if every Cauchy sequence is convergent.
In general, $C_X$, the set of all bounded continuous real-valued functions on a topological
space $X$, is a Banach space with the norm $\|x\| \equiv \sup |x(t)|$. Convergence of a
sequence $\{x^q\}$ with respect to this norm is a "uniform convergence." The sequence
$\{x^q\}$ is said to converge to $x^0$ uniformly if for any $\epsilon > 0$ there exists a $\bar{q}$ such that
$q \ge \bar{q}$ implies $|x^q(t) - x^0(t)| < \epsilon$ for all $t$. It is crucial that $\bar{q}$ does not depend on $t$. If $\bar{q}$
depends on $t$, then we have a pointwise convergence. That the space $C_X$ is complete
amounts to the fact that $x^q \to x^0$ (uniformly) and $x^q \in C_X$ for each $q$ implies $x^0 \in C_X$.
The norm in the above, $\|x\| \equiv \sup |x(t)|$, is often called the uniform norm.
4. Clearly the definitions of differentiability and differentials depend on the choice
of the norm. However, as remarked earlier (Chapter 1, Section C), the choice of the
norm really does not matter in finite dimensional spaces; that is, if $J$ is differentiable
at $x^0$ in one norm, then $J$ is differentiable at $x^0$ in any other norm and the differentials
at $x^0$ under any norm are the same.
5. If $\Phi$ is continuous at $x^0$ with respect to $x$, $J$ is said to be continuously differentiable
at $x^0$. If $J$ is continuously differentiable at each $x$ in $X$, $J$ is said to be continuously
differentiable in $X$.
6. If the second partial derivatives of $f$ in $y$ (denoted by $f_{yy}$) exist, then we have, by
the well-known Taylor theorem, $r(t, y, k) = \frac{1}{2}\sum_{i,j} f_{y_i y_j}(t, y + \theta k)k_i k_j$, for some $\theta$,
$0 < \theta < 1$. For the Taylor theorem, see any standard textbook on advanced calculus.

REFERENCES

1. Bliss, G. A., Lectures on the Calculus of Variations, Chicago, Ill., University of
Chicago Press, 1946.
2. Canon, M., Cullum, C., and Polak, E., "Constrained Maximization Problem in
Finite-Dimensional Spaces," Journal of SIAM Control, Vol. 4, no. 3, 1966.
3. Gelfand, I. M., and Fomin, S. V., Calculus of Variations, Englewood Cliffs, N.J.,
Prentice-Hall, 1963 (tr. from Russian).
4. Hestenes, M. R., Calculus of Variations and Optimal Control Theory, New York,
Wiley, 1966.
5. Hurwicz, L., "Programming in Linear Spaces," in Studies in Linear and Non-
linear Programming, ed. by Arrow, Hurwicz, and Uzawa, Stanford, Calif., Stanford
University Press, 1958.
6. -, "Programming Involving Infinitely Many Variables and Constraints," in
Activity Analysis in the Theory of Growth and Planning, ed. by Malinvaud and
Bacharach, London, Macmillan, 1967.
7. Kolmogorov, A. N., and Fomin, S. V., Elements of the Theory of Functions and
Functional Analysis, Vol. I, Rochester, N.Y., Grayrock, 1957 [tr. from 1st (1954)
Russian ed.] .
8. Pontryagin, L. S., Boltyanskii, V. G., Gamkrelidze, R. V., and Mishchenko, E. R.,
The Mathematical Theory of Optimal Processes, New York, Interscience, 1962 [tr.
from Russian (1961) ed.].
9. Ritter, K., "Duality for Nonlinear Programming in a Banach Space," SIAM Journal
of Applied Mathematics, vol. 15, no. 2, March 1967.
10. Shilov, G. Y., Mathematical Analysis, Oxford, Pergamon Press, 1965, esp. chap. III
(tr. from Russian).
11. Takayama, A., "On the Structure of the Optimal Growth Problem," Krannert
Institute Paper, No. 178, Purdue University, June 1967.
Section C
A DIGRESSION:
THE NEO-CLASSICAL
AGGREGATE GROWTH MODEL

In the next section we treat the problem of optimal growth of an aggregative

economy. We will argue that we can consider the problem as a straightforward
application of the calculus of variations. However, before we turn to this problem,
a short summary of the discussion on the aggregate growth model is probably
useful and hence we digress here from our main topic to do the summary work.
Although this section will serve as an introduction to modern growth theory, it has
nothing to do with the calculus of variations, so that those readers who are familiar
with this much of growth theory may, without too much difficulty, skip this
section.'
Let us suppose that the economy can be characterized by one sector, which
produces "national product," Y. Let us suppose that this is produced by two
factors, labor (L) and capital (K), with the following production function:

(1) $Y_t = F(L_t, K_t)$

where t denotes time. Denoting consumption by $X_t$ and investment by $I_t$,
equilibrium in the output market (output $Y_t$ = the demand for the output) is
described by

(2) $Y_t = X_t + I_t$

Assuming that the amount of depreciation of capital at each instant of time is a
constant proportion ($\mu$) of the existing stock of capital,² the amount of gross
investment must be equal to $\dot{K}_t + \mu K_t$, where $\dot{K}_t \equiv dK_t/dt$. That is,

(3) $\dot{K}_t + \mu K_t = I_t$

Assume that labor grows at a constant rate n, so that we have

(4) $L_t = L_0 e^{nt}$

where $L_0$ is the amount of labor available at t = 0. So far, there are four equations
above but there are five variables, $L_t$, $K_t$, $Y_t$, $I_t$, $X_t$, excluding time t. Hence by
adding one more equation we can "close" the model; that is, if these five equations
are somewhat "nice," we should be able to solve for these five variables with
respect to t. The fifth equation which can be used to close the model is the
equation which describes the consumption behavior. A common behavioral
assumption here is that the amount of consumption is a constant fraction of
net income.

(5) $X_t = (1 - s)(Y_t - \mu K_t)$

Here s is a constant, a fraction between 0 and 1, called the average propensity to
save and (1 - s) is called the average propensity to consume. The usual justification
for this is the famous empirical observation due to Simon Kuznets for U.S. data.³
The saving behavior implied by (5) is called the proportional saving behavior.
The above model is quite well known in the literature, but the following
two important remarks are sometimes forgotten.

(i) In equation (1), $L_t$ denotes the amount of labor input and in equation (4), $L_t$
denotes the total amount of labor available in the economy. Hence the fact
that the same notation $L_t$ is used in those two equations means that full
employment of labor is assumed. Similarly, $K_t$ in equation (1) denotes the
amount of capital input and $K_t$ in equation (3) is the amount of the total stock
of capital available in the economy. Hence the fact that the same notation $K_t$
is used in these two equations means that full employment of capital is
assumed. It is true that the economy can deviate from such a full employment
state from time to time. But if we are interested in the long-run behavior of
the economy, we might as well consider such unemployment states as
"short-run" phenomena and abstract our model from them (at least as a first
approximation).
(ii) Equation (2) describes the equilibrium relation in the output market. Nothing
is mentioned about how this equilibrium can be achieved. Typically, such an
equilibrium can be achieved through flexibility of certain price variables such
as the price of output (vis-à-vis money) and/or the rate of interest. A full
consideration of this mechanism involves the consideration of other markets
such as the money market. The above model is abstracted from this
consideration. This abstraction parallels the previous assumption of full employment
where the mechanism of how full employment of labor and capital can be
achieved is not considered. In the model, the full employment equilibrium in
the output market is maintained through adjustments in investment $I_t$. That is,
investment is assumed to be completely "passive"; the amount of $I_t$ is
automatically adjusted to the level just equal to the amount of saving.

In order to analyze the properties of the model, we have to make some
preliminary comments about the production function. Following a standard
convention, we assume that F is defined on the nonnegative orthant of $R^2$ and that it
exhibits constant returns to scale with diminishing returns with respect to each
factor. That F exhibits constant returns to scale means
$F(\alpha L_t, \alpha K_t) = \alpha F(L_t, K_t)$ for any positive real number $\alpha$.
That F exhibits diminishing returns with respect
to each factor can be described as (assuming that F is twice differentiable)
$\partial^2 F/\partial L_t^2 < 0$ and $\partial^2 F/\partial K_t^2 < 0$ for all values of $L_t$ and $K_t$ in the domain. This
means that $MPP_L$, the marginal physical product of labor ($\partial F/\partial L_t$), decreases as
$L_t$ increases and that $MPP_K$, the marginal physical product of capital ($\partial F/\partial K_t$),
decreases as $K_t$ increases. The well-known example of such a production function
is the Cobb-Douglas function, $F(L_t, K_t) = L_t^{1-a} K_t^{a}$, 0 < a < 1.

From the assumption of constant returns to scale on F, for $L_t > 0$ we
can rewrite equation (1) as

$Y_t = L_t F(1, K_t/L_t) = L_t f(k_t)$, where $k_t \equiv K_t/L_t$ and $f(k_t) \equiv F(1, k_t)$

In other words,

(6) $y_t = f(k_t)$, where $y_t \equiv Y_t/L_t$

Henceforth we will assume $L_t > 0$ always (that is, labor is indispensable for
production). The function f is a real-valued differentiable function defined on the
half real line $[0, \infty)$.
We can prove the following two lemmas, both of which are important in
aggregate growth theory, fairly easily. (For the sake of notational simplicity, we
omit the subscript t in these two lemmas.)

Lemma 5.C.1: MPPL = f (k) - k f'(k), and MPPK = f'(k).

PROOF: $MPP_L = \partial F/\partial L = \partial [L f(k)]/\partial L = f(k) + L f'(k)(-K/L^2) = f(k) - k f'(k)$.
$MPP_K = \partial F/\partial K = \partial [L f(k)]/\partial K = L f'(k)(1/L) = f'(k)$.
(Q.E.D.)
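
The lemma can be checked numerically. The following is a minimal sketch, assuming the Cobb-Douglas form $F(L, K) = L^{1-a} K^a$ with the hypothetical values a = 0.3, L = 2, K = 5 (none of these numbers come from the text); it compares central finite-difference approximations of the two marginal products with the expressions in Lemma 5.C.1.

```python
A = 0.3  # hypothetical Cobb-Douglas exponent a, 0 < a < 1


def F(L, K):
    """Cobb-Douglas production function F(L, K) = L^(1-a) K^a."""
    return L ** (1 - A) * K ** A


def f(k):
    """Per capita form f(k) = F(1, k) = k^a."""
    return k ** A


def f_prime(k):
    """Derivative f'(k) = a k^(a-1)."""
    return A * k ** (A - 1)


def partial(fn, L, K, wrt, h=1e-6):
    """Central finite-difference approximation of a first partial derivative."""
    if wrt == "L":
        return (fn(L + h, K) - fn(L - h, K)) / (2 * h)
    return (fn(L, K + h) - fn(L, K - h)) / (2 * h)


L, K = 2.0, 5.0          # hypothetical input levels
k = K / L
mpp_l = partial(F, L, K, "L")
mpp_k = partial(F, L, K, "K")

# Left column: finite differences of F. Right column: Lemma 5.C.1.
print(mpp_l, f(k) - k * f_prime(k))
print(mpp_k, f_prime(k))
```

The two columns should agree up to the finite-difference error; the step size h is an arbitrary choice.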

Lemma 5.C.2: Let L > 0, K > 0. Then $\partial^2 F/\partial L^2 < 0$ for all L > 0, K > 0 if and
only if $f''(k) < 0$ for all k > 0, and $\partial^2 F/\partial K^2 < 0$ for all L > 0, K > 0 if and only if
$f''(k) < 0$ for all k > 0.
PROOF: We use Lemma 5.C.1.

$\frac{\partial^2 F}{\partial L^2} = \frac{\partial}{\partial L}[f(k) - k f'(k)] = f'(k)\left(-\frac{K}{L^2}\right) - \left(-\frac{K}{L^2}\right) f'(k) - k f''(k)\left(-\frac{K}{L^2}\right) = \frac{1}{L} k^2 f''(k)$

From this the first statement of the lemma follows immediately. Also

$\frac{\partial^2 F}{\partial K^2} = \frac{\partial}{\partial K}[f'(k)] = f''(k)\frac{1}{L}$

From this the second statement of the lemma follows immediately.

(Q.E.D.)
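
As a rough numerical check of the two second-derivative formulas in the proof, the sketch below again assumes a Cobb-Douglas technology with hypothetical values (a = 0.3, L = 2, K = 5) and compares second-order finite differences of F with $k^2 f''(k)/L$ and $f''(k)/L$.

```python
A = 0.3  # hypothetical Cobb-Douglas exponent


def F(L, K):
    """F(L, K) = L^(1-a) K^a."""
    return L ** (1 - A) * K ** A


def f_second(k):
    """f''(k) for f(k) = k^a."""
    return A * (A - 1) * k ** (A - 2)


def second_partial(fn, L, K, wrt, h=1e-4):
    """Central finite-difference approximation of a second partial derivative."""
    if wrt == "L":
        return (fn(L + h, K) - 2 * fn(L, K) + fn(L - h, K)) / h ** 2
    return (fn(L, K + h) - 2 * fn(L, K) + fn(L, K - h)) / h ** 2


L, K = 2.0, 5.0   # hypothetical input levels
k = K / L
F_LL = second_partial(F, L, K, "L")
F_KK = second_partial(F, L, K, "K")

print(F_LL, k ** 2 * f_second(k) / L)   # both negative, approximately equal
print(F_KK, f_second(k) / L)            # both negative, approximately equal
```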
A corollary of Lemma 5.C.1 is that if $f'(k) > 0$ for all k > 0, then the marginal
physical product of capital is always positive. Here $f'(0)$ and $f''(0)$ denote,
respectively, the right-hand derivatives of f and f' at 0. The production function with constant
returns to scale and diminishing returns with respect to each factor is illustrated
in Figure 5.3. The reader should note that both (i) $f'(k) > 0$ for all k > 0 and (ii)
$f''(k) < 0$ for all $k \geq 0$ are satisfied in the diagram. Note also that f(0) = 0, that is,
that capital is indispensable for production, is assumed in the diagram.

Figure 5.3. An Illustration of Production Function.
The next task is to simplify the above set of equations. From equations (1),
(2), and (3), we can immediately obtain

$F(L_t, K_t) = X_t + (\dot{K}_t + \mu K_t)$

Dividing both sides of the equation by $L_t$ (> 0) and writing $x_t \equiv X_t/L_t$, we
obtain

(7) $f(k_t) = x_t + \frac{\dot{K}_t}{L_t} + \mu k_t$

But

$\dot{k}_t = \frac{\dot{K}_t}{L_t} - \frac{\dot{L}_t}{L_t} k_t$

Since equation (4) implies $\dot{L}_t/L_t = n$, we obtain

$\frac{\dot{K}_t}{L_t} = \dot{k}_t + n k_t$

Combining this equation with (7), we obtain

(8) $\dot{k}_t = f(k_t) - \lambda k_t - x_t$, where $\lambda \equiv n + \mu$

This is the fundamental equation of the (neo-classical) aggregate growth model.

Definition: The time path $(k_t, x_t)$ is called a (neo-classical aggregate) feasible
(growth) path, if it satisfies equation (8) and $k_t \geq 0$, $x_t \geq 0$. If in addition it satisfies
the prescribed initial conditions $k_0$ and $x_0$, then it is called the attainable path
with respect to $k_0$ and $x_0$.
REMARK: It should be clear that any path $(L_t, K_t, Y_t, X_t, I_t)$ which satisfies
equations (1), (2), (3), and (4) can be completely described by the path
$(k_t, x_t)$ which satisfies equation (8). This is because (8) is obtained from (1), (2),
(3), and (4), and from any path $(k_t, x_t)$ which satisfies equation (8) we can
obtain a path $(L_t, K_t, Y_t, X_t, I_t)$ which satisfies equations (1), (2), (3), and (4).
Clearly there are many attainable paths starting from the same point
$(k_0, x_0)$. This is due to the fact that there are two "unknowns" in equation (8). We can
close the model by specifying the behavior of consumption. Robert Solow [15],
following Harrod and Domar, adopted the consumption behavior as described in
equation (5). By dividing both sides of (5) by $L_t$ and referring to (6), we obtain

(9) $x_t = (1 - s)[f(k_t) - \mu k_t]$

Hence combining (9) and (8), we have

(10) $\dot{k}_t = s f(k_t) - \lambda^* k_t$, where $\lambda^* \equiv n + s\mu$
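
The substitution of (9) into (8) can be verified mechanically. The sketch below is only an illustration: the functional form $f(k) = k^{0.3}$ and the parameter values s = 0.2, n = 0.02, $\mu$ = 0.05 are assumptions chosen for the example, not values from the text.

```python
def f(k, a=0.3):
    """Hypothetical per capita production function f(k) = k^a."""
    return k ** a


def k_dot_from_8_and_9(k, s, n, mu):
    """Right-hand side of (8) with x_t given by the saving rule (9)."""
    lam = n + mu
    x = (1 - s) * (f(k) - mu * k)      # equation (9)
    return f(k) - lam * k - x          # equation (8)


def k_dot_from_10(k, s, n, mu):
    """Right-hand side of (10) with lambda* = n + s*mu."""
    lam_star = n + s * mu
    return s * f(k) - lam_star * k


s, n, mu = 0.2, 0.02, 0.05             # hypothetical parameter values
gaps = [abs(k_dot_from_8_and_9(k, s, n, mu) - k_dot_from_10(k, s, n, mu))
        for k in (0.5, 1.0, 4.0, 20.0)]
print(max(gaps))                       # differs from zero only by rounding error
```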

We now impose the following assumptions
(A-1) $f'(k) > 0$ and $f''(k) < 0$ for all k > 0.
(A-2) $f(0) = 0$.
(A-3) $f'(0)$ is "sufficiently" large or, more specifically, $f'(0) > \lambda^*/s$.
(A-4) $f'(\infty)$ is "sufficiently" small or, more specifically, $f'(\infty) < \lambda^*/s$.
(A-5) $0 < s \leq 1$.
Assumptions (A-3) and (A-4) are often written in a stronger form as
(A-3') $f'(0) = \infty$.
(A-4') $f'(\infty) = 0$.
We are now ready to state and prove the theorem due to Solow [15].

Theorem 5.C.1 (Solow): Under assumptions (A-1) to (A-5), there exists a feasible
path $(k_s, x_s)$, which is unique, where $k_s$ and $x_s$ are some positive constants, such
that any attainable growth path with the proportional saving behavior converges
monotonically to it (that is, $k_t \to k_s$ and $x_t \to x_s$ monotonically) as $t \to \infty$,
regardless of the initial value of $k_0 > 0$, where $k_s$ and $x_s$ are determined by
$s f(k_s) = \lambda^* k_s$ and $x_s = (1 - s)[f(k_s) - \mu k_s]$.

PROOF: Under assumptions (A-1) to (A-5), we can have the situation as
illustrated in Figure 5.4. It is important to note that assumptions (A-1) to
(A-5) guarantee the unique intersection of the $\lambda^* k$-line (from below) with the
$s f(k)$-curve at a positive value of k, $k_s$, as illustrated in Figure 5.4, which
proves the existence and the uniqueness of the path $(k_s, x_s)$ with $k_s > 0$ and
$x_s > 0$. The rest of the proof is also easy. As is clear from Figure 5.4, $k_t \lesseqgtr k_s$
according to whether $s f(k_t) \gtreqless \lambda^* k_t$. But equation (10) states that $\dot{k}_t \gtreqless 0$
according to whether $s f(k_t) \gtreqless \lambda^* k_t$. Hence $\dot{k}_t \gtreqless 0$ according to whether $k_t \lesseqgtr k_s$. In
other words, if $k_t > k_s$, then $\dot{k}_t < 0$ so that $k_t$ decreases over time. If $k_t < k_s$, then
$\dot{k}_t > 0$ so that $k_t$ increases over time. And if $k_t = k_s$, then $\dot{k}_t = 0$ so that $k_t$ stays
at $k_s$. (Q.E.D.)

Figure 5.4. An Illustration of the Proof of Solow's Theorem.
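
The monotone convergence asserted by the theorem can be illustrated with a simple simulation. This is a sketch under stated assumptions, not part of the proof: it takes $f(k) = k^{0.3}$, the hypothetical parameters s = 0.2, n = 0.02, $\mu$ = 0.05, locates $k_s$ by bisection on $s f(k) - \lambda^* k$, and integrates equation (10) with a forward Euler step from one starting value below $k_s$ and one above it.

```python
def f(k, a=0.3):
    """Hypothetical per capita production function f(k) = k^a."""
    return k ** a


def steady_state(s, lam_star, lo=1e-9, hi=1e6, tol=1e-12):
    """Bisection on g(k) = s f(k) - lam_star k, positive below k_s and negative above."""
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if s * f(mid) - lam_star * mid > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simulate(k0, s, n, mu, T=400.0, dt=0.01):
    """Forward Euler integration of equation (10) starting from k0."""
    lam_star = n + s * mu
    k, path = k0, [k0]
    for _ in range(int(round(T / dt))):
        k += dt * (s * f(k) - lam_star * k)
        path.append(k)
    return path


s, n, mu = 0.2, 0.02, 0.05                  # hypothetical parameters
k_s = steady_state(s, n + s * mu)
from_below = simulate(1.0, s, n, mu)        # k_0 < k_s
from_above = simulate(40.0, s, n, mu)       # k_0 > k_s
print(k_s, from_below[-1], from_above[-1])
```

Both simulated paths should move toward $k_s$ without overshooting; the Euler step size and the horizon T are arbitrary choices made for the illustration.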
REMARK: Starting from $k_0$, what will be the amount of time required to
reach a certain prescribed value $k^*$? To find this, note that

$t(k^*) = \int_{k_0}^{k^*} \frac{dt}{dk}\,dk = \int_{k_0}^{k^*} \frac{dk}{\dot{k}} = \int_{k_0}^{k^*} \frac{dk}{s f(k) - \lambda^* k}$

Recall that

$s f(k) - \lambda^* k > 0$ if $k < k_s$, and $s f(k) - \lambda^* k < 0$ if $k > k_s$

Theorem 5.C.1 establishes that k approaches $k_s$ monotonically. Hence $t(k^*)$
is meaningful only when $k^*$ is such that $k_0 < k^* < k_s$ or $k_s < k^* < k_0$.⁵ Let
$k_0 \neq k_s$. The question is: What is the amount of time necessary to reach $k_s$
when $k_s \neq k_0$? This can be resolved by considering the two cases $k_0 < k_s$
and $k_0 > k_s$. In either of these two cases it is elementary to see that

$\lim_{k^* \to k_s} t(k^*) = \infty$

since $s f(k^*) \to \lambda^* k^*$ as $k^* \to k_s$. In other words, it takes an infinite amount
of time to reach the path $(k_s, x_s)$.⁶
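
To see the divergence of $t(k^*)$ concretely, the sketch below evaluates the integral $\int_{k_0}^{k^*} dk/(s f(k) - \lambda^* k)$ by the midpoint rule for targets $k^*$ progressively closer to $k_s$. It assumes the Cobb-Douglas case $f(k) = k^a$ with hypothetical parameters (a = 0.3, s = 0.2, n = 0.02, $\mu$ = 0.05). For this special case the substitution $z = k^{1-a}$ turns (10) into a linear equation, which gives a closed form used here as a cross-check; that closed form is an addition for the example and is not stated in the text.

```python
import math

A, S, N, MU = 0.3, 0.2, 0.02, 0.05        # hypothetical parameters
LAM_STAR = N + S * MU
K_S = (S / LAM_STAR) ** (1 / (1 - A))     # solves s k^a = lambda* k


def travel_time(k0, k_star, steps=200_000):
    """Midpoint-rule approximation of t(k*) = integral of dk / (s f(k) - lambda* k)."""
    h = (k_star - k0) / steps
    total = 0.0
    for i in range(steps):
        k = k0 + (i + 0.5) * h
        total += h / (S * k ** A - LAM_STAR * k)
    return total


def travel_time_closed_form(k0, k_star):
    """Exact t(k*) in the Cobb-Douglas case, from z' = (1-a)(s - lambda* z), z = k^(1-a)."""
    z0, zs, zt = k0 ** (1 - A), S / LAM_STAR, k_star ** (1 - A)
    return -math.log((zt - zs) / (z0 - zs)) / ((1 - A) * LAM_STAR)


k0 = 1.0
results = []
for frac in (0.9, 0.99, 0.999):           # share of the gap k_s - k_0 to be closed
    k_star = k0 + frac * (K_S - k0)
    results.append((travel_time(k0, k_star), travel_time_closed_form(k0, k_star)))
    print(frac, results[-1])
```

Each tenfold reduction of the remaining gap adds roughly the same amount of time, which is the logarithmic divergence behind the limit above.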

REMARK: As was seen in the proof, (A-1) to (A-5) are used to guarantee
the existence and the uniqueness of a positive $k_s$, that is, the existence and
uniqueness of a path $(k_s, x_s)$ with $k_s > 0$ and $x_s > 0$. The reader can easily
think of many alternative sets of assumptions which guarantee the existence
and the uniqueness of a positive $k_s$.⁷ This path $(k_s, x_s)$ may be referred to
as Solow's path. Note that in Solow's path, labor, $L_t$, and capital, $K_t$, grow
at the same rate n (because $K_t/L_t$ = constant $k_s$), and $Y_t$ and $X_t$ also grow
at this rate n [because $f(k_s)$ and $(1 - s) f(k_s)$ are constant]. Investment $I_t$
also grows at this rate because of equation (2). Hence the above theorem
establishes that the path of $(L_t, K_t, Y_t, X_t, I_t)$ approaches a "balanced
growth" path in which these variables all grow at the same rate as time
extends without limit, regardless of the initial value of these variables. This
global stability theorem was not quite established in Solow's original paper
[15]. A further scrutiny and proof of this theorem with an explicit
recognition of key assumptions is due to Okamoto and Inada [10].

REMARK: If $F(L, K) = L^{1-a} K^{a}$, 0 < a < 1 (the Cobb-Douglas case), for
example, then $f(k) = k^{a}$. Solow's path requires that $s f(k_s) = \lambda^* k_s$ or
$k_s = (s/\lambda^*)^{1/(1-a)}$.
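
A short sketch of this remark: the closed-form $k_s$ is compared with the root of $s k^a - \lambda^* k$ found by bisection. The parameter values are hypothetical and chosen only for illustration.

```python
def k_s_closed_form(s, lam_star, a):
    """k_s = (s / lambda*)^(1 / (1 - a)) for f(k) = k^a."""
    return (s / lam_star) ** (1.0 / (1.0 - a))


def k_s_bisection(s, lam_star, a, lo=1e-9, hi=1e9, iters=200):
    """Positive root of s k^a - lambda* k, located by bisection."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if s * mid ** a - lam_star * mid > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


a, s, n, mu = 0.3, 0.2, 0.02, 0.05      # hypothetical parameters
lam_star = n + s * mu
k_closed = k_s_closed_form(s, lam_star, a)
k_numeric = k_s_bisection(s, lam_star, a)
x_s = (1 - s) * (k_closed ** a - mu * k_closed)   # from equation (9)
print(k_closed, k_numeric, x_s)
```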

REMARK: Solow's path may be illustrated as a ray from the origin with its
slope equal to $k_s$, as illustrated in Figure 5.5. Note, however, that $k_t$
approaching $k_s$ as $t \to \infty$ does not guarantee that the $(L_t, K_t)$ configuration
in the L-K-plane asymptotically converges to the $k_s$-ray (as illustrated by
the dotted line in Figure 5.5). In fact, such an asymptotic convergence is
impossible for the present model, as is shown by Deardorf [3]. Let $\delta_t$ be
the vertical distance of the $(L_t, K_t)$ path from the $k_s$-ray; that is,
$\delta_t \equiv (k_t - k_s) L_t$. Deardorf argues that $\delta_t \to 0$ is impossible and that $\delta_t \to \infty$ is
more plausible.

Figure 5.5. An Illustration of Solow's Theorem.


One basic premise in Solow's theorem is that consumption is a constant
proportion of income. This is a rather awkward assumption, for it means that
the overall (average) propensity to save is constant regardless of income
distribution between the two factors. Empirical evidence may not mean too much in this
connection, for it is an interplay of various other factors such as technological
progress, international trade, and factor movements. An alternative assumption
on consumption behavior is the one adopted by the classical economists, that is,
the assumption that the workers save nothing and the capitalists save a certain
constant fraction of their gross income. We call this the classical saving behavior.
The capitalists' income can be described by $(MPP_K)K$, which, from Lemma 5.C.1,
is equal to $f'(k)K$. Letting $\bar{s}$ be the capitalists' average propensity to save, where
$0 < \bar{s} < 1$, the following equation now replaces equation (5).

(11) $\bar{s} f'(k_t) K_t = Y_t - X_t$

Note that $X_t$ consists of consumption by capitalists and by workers, so that

(12) $X_t = (1 - \bar{s}) f'(k_t) K_t + [f(k_t) - k_t f'(k_t)] L_t$

We can check easily that the $X_t$ thus obtained satisfies (11) with $Y_t = L_t f(k_t)$.
By dividing both sides of (11) by $L_t$, we obtain

$\bar{s} f'(k_t) k_t = f(k_t) - x_t$

Combining this equation with the fundamental equation (8), we obtain (assuming
$k_t > 0$)

(13) $\frac{\dot{k}_t}{k_t} = \bar{s} f'(k_t) - \lambda$, where $\lambda \equiv n + \mu$

Under (A-1), (A-3'), and (A-4'), we can draw Figure 5.6, from which we can
immediately assert the existence and the uniqueness of $k_c$ such that $\bar{s} f'(k_c) = \lambda$
and $k_c > 0$. Moreover, the time path of $k_t$ is quite apparent from Figure 5.6. Thus
we can easily determine $\dot{k}_t \lesseqgtr 0$ according to whether $k_t \gtreqless k_c$. As $k_t$ approaches $k_c$,
$x_t$ approaches $x_c$, where $x_c$ can easily be obtained from (12), that is,

$x_c = (1 - \bar{s}) f'(k_c) k_c + [f(k_c) - k_c f'(k_c)]$

or

(14) $x_c = f(k_c) - \bar{s} f'(k_c) k_c$

Hence we have established the following theorem.

Figure 5.6. An Illustration of Equation (13).

Theorem 5.C.2: Under assumptions (A-1), (A-3'), and (A-4') with the classical
saving behavior, there exists a unique feasible path $(k_c, x_c)$ with $k_c > 0$ and $x_c > 0$,
such that any attainable path $(k_t, x_t)$ approaches it monotonically as $t \to \infty$ regardless
of the initial value of $k_0$, where $k_c$ and $x_c$ are respectively determined by
$\bar{s} f'(k_c) = \lambda$ and equation (14).
REMARK: We may call $(k_c, x_c)$ the classical path. Theorem 5.C.2 establishes
global stability for the classical path, which is again a balanced growth path
of $(L_t, K_t, Y_t, X_t, I_t)$. It can also be shown that the time required to reach the
classical path is infinity. Hence, like Solow's theorem, the above theorem
establishes "asymptotic" stability for the classical path. Note that the
above method of proof can also be used to prove Solow's theorem. For this,
simply divide (10) by $k_t$ and observe that we obtain an equation similar to
(13). Such a proof is used in Okamoto and Inada [10].
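
The classical saving case can be simulated in the same way as Solow's. The sketch below is illustrative only: it assumes $f(k) = k^a$ (which satisfies (A-1), (A-3'), and (A-4')) and hypothetical parameters a = 0.3, capitalists' saving propensity 0.5, n = 0.02, $\mu$ = 0.05. For this functional form, $\bar{s} f'(k_c) = \lambda$ gives $k_c = (\bar{s} a/\lambda)^{1/(1-a)}$.

```python
A, S_BAR, N, MU = 0.3, 0.5, 0.02, 0.05    # hypothetical parameters
LAM = N + MU


def f(k):
    """Hypothetical per capita production function f(k) = k^a."""
    return k ** A


def f_prime(k):
    return A * k ** (A - 1)


def simulate(k0, T=300.0, dt=0.01):
    """Forward Euler integration of equation (13), written as k' = k (s_bar f'(k) - lambda)."""
    k, ks, xs = k0, [], []
    for _ in range(int(round(T / dt))):
        ks.append(k)
        xs.append(f(k) - S_BAR * f_prime(k) * k)   # per capita form of (11)
        k += dt * k * (S_BAR * f_prime(k) - LAM)
    return ks, xs


k_c = (S_BAR * A / LAM) ** (1 / (1 - A))           # solves s_bar f'(k_c) = lambda
x_c = f(k_c) - S_BAR * f_prime(k_c) * k_c          # equation (14)

ks_low, xs_low = simulate(0.5)                     # k_0 < k_c
ks_high, xs_high = simulate(10.0)                  # k_0 > k_c
print(k_c, ks_low[-1], ks_high[-1])
print(x_c, xs_low[-1], xs_high[-1])
```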
The above two theorems lead us to focus our attention on balanced growth
paths.

Definition: A neo-classical feasible path $(k_t, x_t)$ is called a golden age path⁸ if
$k_t$ and $x_t$ are both constant over time.
REMARK: In other words, the set of all the golden age paths is the set of
all the balanced growth paths of $(L_t, K_t, Y_t, X_t, I_t)$ which satisfy equations
(1) to (4).
Since $k_t$ and $x_t$ are constant in the golden age paths, we write them simply
as k and x, respectively. Note that $\dot{k} = 0$ in the golden age paths. Hence from
equation (8), we obtain

(15) $x = f(k) - \lambda k$

Then under assumptions (A-1), (A-2), (A-3'), and (A-4'), the locus of the (k, x)'s
which satisfy equation (15) can be illustrated by Figure 5.7. It is clear from Figure
5.7 that there exists a unique positive value of k which maximizes x globally subject
to (15). We denote it by $\hat{k}$. Formally we obtain this by maximizing $[f(k) - \lambda k]$
with respect to k. The first-order condition can be obtained as

$\frac{d}{dk}[f(k) - \lambda k] = 0$

or

(16) $f'(\hat{k}) = \lambda$

There exists, in view of (A-1), only one value of k which satisfies (16), and $\hat{k}$ is
this value. Note that (A-1) guarantees the strict concavity of f and hence of
$[f(k) - \lambda k]$. Thus (16) gives a necessary and sufficient condition for the unique
global maximum (Theorem 1.C.7). Even without such a remark this may be
obvious from Figure 5.7. We now define a very important concept.

Figure 5.7. An Illustration of Equation (15).

Definition: A golden age path which maximizes per capita consumption x at
every instant of time is called the golden rule path.⁹
In view of this definition, the above consideration has really established the
theorem which can be stated as follows.

Theorem 5.C.3: Under assumptions (A-1), (A-2), (A-3'), and (A-4'), there exists
a unique golden rule path, $(\hat{k}, \hat{x})$, where $\hat{k}$ and $\hat{x}$ are respectively defined by
$f'(\hat{k}) = \lambda$ and $\hat{x} = f(\hat{k}) - \lambda \hat{k}$.
REMARK: Thus in the golden rule path the marginal productivity of capital
$f'(\hat{k})$ is equal to the rate of population growth (n) plus the rate of depreciation
($\mu$). The above theorem was established by quite a number of different
economists. See Phelps [11], Robinson [14], Swan [17], von Weizsacker
[20], Allais [1], and Desrousseaux [4].
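
For a concrete instance of the golden rule, the sketch below assumes the Cobb-Douglas case $f(k) = k^a$ with hypothetical values a = 0.3, n = 0.02, $\mu$ = 0.05. In that case $f'(\hat{k}) = \lambda$ gives $\hat{k} = (a/\lambda)^{1/(1-a)}$, and the code confirms by a crude grid search that this value maximizes golden age consumption $f(k) - \lambda k$.

```python
A, N, MU = 0.3, 0.02, 0.05        # hypothetical parameters
LAM = N + MU


def x_golden_age(k):
    """Per capita consumption on a golden age path, equation (15)."""
    return k ** A - LAM * k


def f_prime(k):
    return A * k ** (A - 1)


k_hat = (A / LAM) ** (1 / (1 - A))            # solves f'(k) = lambda
x_hat = x_golden_age(k_hat)

# Grid search over k in (0, 50] with step 0.01 as an independent check.
grid = [0.01 * i for i in range(1, 5001)]
k_best = max(grid, key=x_golden_age)

print(k_hat, k_best, x_hat)
```

Because $f(k) - \lambda k$ is strictly concave under (A-1), the grid maximizer can differ from $\hat{k}$ by at most one grid step.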

REMARK: It is easy to see that (A-3') in the theorem can be relaxed to
(A-3") $f'(0) > \lambda$.
Also assumption (A-4') can be relaxed to
(A-4") $f'(\infty) < \lambda$.
REMARK: It is important to recall that the golden rule path is that path
which maximizes per capita consumption in the set of all the golden age
paths (that is, balanced growth paths). It disregards the historically given
value of $k_0$ (or $L_0$ and $K_0$). In other words, there is no guarantee that the
historically given value of $k_0$ is actually on the golden rule path. This seriously
undermines the usefulness of this concept. However, as we shall see in the
next section, we can assert that, under a certain situation, an "optimal" path
which starts from an arbitrarily given $k_0$ converges to the golden rule path
as time extends without limit. From this theorem, the concept of the golden
rule path becomes important.

FOOTNOTES

1. A complete survey of macro growth theory is not attempted here. For such an
attempt, see, for example, Hahn and Matthews [7].
2. This implies, of course, that Y is gross national product and I is gross investment.
If instead we take Y as net national product, then we can put $\mu = 0$. Then I is taken
as net investment. This convention is adopted by Solow [15] and others.
3. The theoretical justification for this from a "long-run" standpoint by Duesenberry
is well known. See J. Duesenberry, Income, Saving and the Theory of Consumer
Behavior, Cambridge, Mass., Harvard University Press, 1949.
4. As remarked in footnote 2, Solow has no explicit consideration of depreciation. A
similar model and a similar theorem were also obtained by Swan [16], although he
assumed that the production function is of the Cobb-Douglas form. Tobin [19]
obtained a similar but more general model that incorporates money. But he did not
obtain the stability theorem like Theorem 5.C.1.
5. In other words, we preclude such cases as $k^* < k_0 < k_s$ and $k_s < k_0 < k^*$.
6. It may be of some interest to investigate what is the time required for the actual
path to come "close enough" to the path $(k_s, x_s)$. This clearly depends on such
parameters as s, n, $\mu$, and $k_0$. There was a debate between R. Sato and K. Sato
on this point. See, for example, K. Sato, "On the Adjustment Time in Neo-Classical
Growth Models," Review of Economic Studies, XXXIII, July 1966.
7. For example, we can have the case in which the f(k)-curve is mound-shaped, that is,
$f'(k) < 0$ for sufficiently large k (capital satiation). The essential point here is that
the $s f(k)$-curve intersects the $\lambda^* k$-line from the "left" with only one point of
intersection. If the rate of population growth n (hence $\lambda^*$) is not constant but depends
on per capita income, and thus is a function of k, we can have multiple equilibria with
a mixture of stable and unstable ones. This has been studied by such economists
as R. R. Nelson, H. Leibenstein, J. Buttrick, and J. Niehans. This is used as a
rationale for the "big-push" thesis. However, we may question why n (= $\dot{L}/L$) rather
than $\dot{L}$ is a function of y.
8. The name is used to emphasize its mythological character. See J. Robinson, The
Accumulation of Capital, 2nd. ed., London, Macmillan, 1965, p. 99.
9. For the Cobb-Douglas case, that is, $f(k) = k^{a}$, 0 < a < 1, the value of k in the golden
rule path is easily obtained as $\hat{k} = (a/\lambda)^{1/(1-a)}$. Note that, for this case, $\hat{k} \gtreqless k_s$
according to whether $a \gtreqless s$, if $\lambda = \lambda^*$.

REFERENCES

1. Allais, M., "The Influence of the Capital-Output Ratio on Real National Income,"
Econometrica, 30, October 1962.
2. Champernowne, D. G., "Some Implications of Golden Age Conditions When
Savings Equal Profits," Review of Economic Studies, XXIX, June 1962.
3. Deardorf, A. V., "Growth Path in the Solow Neoclassical Growth Model," Quarterly
Journal of Economics, LXXXIV, February 1970.
4. Desrousseaux, J., "Expansion stable et taux d'intérêt optimal," Annales de Mines,
November 1961.
5. Domar, E. D., Essays in the Theory of Growth, London, Oxford University Press,
1957.
6. Haavelmo, T., A Study in the Theory of Investment, Chicago, Ill., University of
Chicago Press, 1960.
7. Hahn, F. H., and Matthews, R. C. O., "The Theory of Economic Growth: A Survey,"
Economic Journal, LXXIV, December 1964.
8. Harrod, R. F., "Second Essay in Dynamic Theory," Economic Journal, LXX, June
1960.
9. Meade, J. E., A Neo-classical Theory of Economic Growth, London, George Allen and
Unwin, 2nd. ed., 1962 (1st. ed. 1961).
10. Okamoto, T., and Inada, K., "A Note on the Theory of Economic Growth,"
Quarterly Journal of Economics, LXXVI, August 1962.
11. Phelps, E. S., "The Golden Rule of Accumulation: A Fable for Growthmen," Ameri-
can Economic Review, LI, September 1961.
12. Phelps, E. S., "Second Essay on the Golden Rule of Accumulation," American Economic
Review, LV, September 1965.
13. Phelps, E. S., Golden Rules of Economic Growth, New York, W.W. Norton, 1966.
14. Robinson, J., "A Neo-Classical Theorem," Review of Economic Studies, XXIX, June
1962.
15. Solow, R. M., "A Contribution to the Theory of Economic Growth," Quarterly
Journal of Economics, LXX, February 1956.
16. Swan, T. W., "Economic Growth and Capital Accumulation," Economic Record,
XXXII, November 1956.
17. Swan, T. W., "Growth Models of Golden Ages and Production Functions," in Economic
Development with Special Reference to East Asia, Proceedings of International Economic
Conference, ed. by Barrill, London, Macmillan, 1963.
18. Takayama, A., "Per Capita Consumption and Growth: A Further Analysis,"
Western Economic Journal, V, March 1967.
19. Tobin, J., "A Dynamic Aggregative Model," Journal of Political Economy, LXIII,
April 1955.
20. von Weizsacker, C. C., Wachstum, Zins und Optimale Investitionsquote, Basel,
Kyklos-Verlag, 1962.

Section D
THE STRUCTURE OF
THE OPTIMAL GROWTH PROBLEM
FOR AN AGGREGATE ECONOMY¹

a. INTRODUCTION
In the previous section we discussed an aggregate model of economic growth.
The model we considered can be described by the following three equations:

(1) $Y_t = F(L_t, K_t)$
(2) $\dot{K}_t + \mu K_t = Y_t - X_t$
(3) $\dot{L}_t / L_t = n$

This economy produces a single output, Y, using two inputs, labor (L) and capital
(K); X denotes the amount of consumption. The rate of depreciation is denoted by
$\mu$ and the subscript t denotes time. In the previous section we observed that, by
adding the equation which describes the consumption (or saving) behavior and by
specifying the initial capital and labor (or the capital:labor ratio $k_0$ if F is
homogeneous of degree one), we can "close" the model and thus completely describe the
time path of each variable.
In this section we ask a different question. Instead of specifying the
consumption behavior, we ask: What is the necessary amount of consumption at each
instant of time in order to maximize a certain target while satisfying the above three
equations (the feasibility condition) and the prescribed boundary conditions?
Clearly such a target must be based on the satisfaction that one can obtain from the
stream of consumption. The question thus posed is a genuine question of
choice. If we consume more at present, then we have less saving so that the amount
of capital stock in the future will be less compared with the case in which we save
more (that is, consume less) at present. This, in turn, implies that we have less
output and less consumption (unless we eat up the capital accumulated) in the future
compared with the case in which we save more at present. Hence, although we can
get more satisfaction at present as we consume more now, we will have less
satisfaction in the future. The question is: What is the optimal amount of present
consumption? In this verbal presentation of the problem, we implicitly assumed that
our time consists of only two periods, present and future. In general, there are
more than two periods. But this does not create too much difficulty.
Supposing that we can choose the time path of consumption on a time
continuum, we may ask what the optimal time path of consumption is. Let
$x_t \equiv X_t/L_t$ be per capita consumption of the economy. One obvious target function
which the economy may wish to choose is

(4) $I \equiv \int_0^T x_t\,dt$

where T is the planning horizon. Here the society wishes to maximize the total sum
of per capita consumption over time, satisfying the feasibility condition, equations
(1) to (3), and the boundary condition (say, $k_0$ and $k_T$). In Figure 5.8 we illustrate
two types of consumption streams. The $\alpha$-curve denotes the "thrifty type" of
economy, that is, one which chooses less consumption at present or in the
immediate future, while the $\beta$-curve denotes the "nonthrifty type." The problem here is to
compare the area under the $\alpha$-curve up to the T-line with the area under the
$\beta$-curve up to the T-line. If the former, for example, has a larger area than the latter,
we say that the former is "better" than the latter under the target prescribed in
equation (4). Clearly curves such as $\alpha$ and $\beta$ are not drawn arbitrarily; they must
satisfy the "feasibility" prescribed by equations (1), (2), and (3). The optimal
program we choose under the prescribed target equation (4) is the one which gives the
largest area under the curve up to the T-line.
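
The comparison of areas can be made concrete with two feasible consumption streams. The sketch below is an illustration under assumptions that are not in the text: it generates the streams from the proportional saving rule of the previous section with $f(k) = k^{0.3}$, n = 0.02, $\mu$ = 0.05, $k_0$ = 1, a "thrifty" saving rate of 0.10 and a "nonthrifty" rate of 0.02, and it ignores any terminal condition on $k_T$. Target (4) is then approximated by a Riemann sum for a short and a long horizon.

```python
def consumption_path(s, k0=1.0, a=0.3, n=0.02, mu=0.05, T=200.0, dt=0.01):
    """Euler path of per capita consumption under proportional saving at rate s."""
    lam_star = n + s * mu
    k, xs = k0, []
    for _ in range(int(round(T / dt))):
        xs.append((1 - s) * (k ** a - mu * k))     # x_t = (1 - s)[f(k_t) - mu k_t]
        k += dt * (s * k ** a - lam_star * k)      # k' = s f(k) - lambda* k
    return xs


def total_consumption(xs, T, dt=0.01):
    """Left Riemann sum approximating the integral of x_t over [0, T]."""
    return sum(xs[: int(round(T / dt))]) * dt


thrifty = consumption_path(s=0.10)       # hypothetical "alpha" stream
nonthrifty = consumption_path(s=0.02)    # hypothetical "beta" stream

for T in (5.0, 200.0):
    print(T, total_consumption(thrifty, T), total_consumption(nonthrifty, T))
```

With these particular numbers the nonthrifty stream has the larger area over the short horizon and the thrifty stream has the larger area over the long horizon, which is the dependence on T discussed in the text.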
As alert readers may have already realized, such an optimal program
depends on the length of the planning horizon T. If the planning horizon T is longer,
the thrifty type of program may eventually be "better" than the nonthrifty type,
as it pays off at a later time. However, if T is short enough, the thrifty type of
program will not be optimal. Then a question arises as to what should be the length
of this planning horizon. Should it be 5 years, 10 years, or longer? This is a rather
difficult problem. However, there is one serious objection to a finite T, however
large T may be. What happens after time T? If we allow the accumulated capital to
be used up, then the optimal program must be such that there is no capital stock
left after time T. Thus the people in the economy will starve to death after time T.
Even if we do not allow ourselves to eat up the accumulated capital, one essential
difficulty remains: the arbitrariness of the cut-off point. When we decide the size
of T, we automatically decide to ignore the time after T, and such a T is arbitrarily
chosen for there is no a priori criterion by which to choose the size of T. The
general consensus among economists about this point seems to be to choose
$T = \infty$ in order to avoid such an arbitrary cut-off point.

Figure 5.8. An Illustration of Optimal Consumption Problem.
There are, however, some difficulties in the infinite horizon formulation as
well. First, as astronomers tell us, our world may cease to exist after some few
billions of years (which is still finite!). Second, there is always the problem of
uncertainty. In particular, how do we know that our technology (that is, the
production function) will be the same for the next hundred or thousand years (which is,
incidentally, much shorter than "some few billion years")?
One answer is that the optimal program may not be particularly sensitive to
what we do a hundred years later but depends more crucially on what we do in the
neighborhood of the present. As Koopmans [22] discovered, all the "eligible"
paths in his formulation closely approach some fixed balanced growth path; hence
they all look the same for a large time horizon. In other words, the infinite horizon
formulation, contrary to some people's expectation, may really describe the im-
mediate future more than it does the infinite future. In this connection, the follow-
ing analogy due to Gale ([ 14], p. 2) may be useful.
One is guiding a ship on a long journey by keeping it lined up with a point on
the horizon even though one knows that long before that point is reached the
weather will change (but in an unpredictable way) and it will be necessary to
pick up a new course with a new reference point, again on the horizon rather
than just a short distance ahead.
Another justification simply admits that the infinite horizon formulation is more
convenient and revealing. On this point, Arrow ([ 1], p. 92) writes
As elsewhere in mathematical approximation to the real world, it is frequently
more convenient and more revealing to proceed to the limit to make a mathe-
matical infinity in the model correspond to the vast futurity of the real world.
So much for the discussion of the size of T. The next question is to find a more
sensible target function than the one described in equation (4). One answer is

(5) $J_T \equiv \int_0^T u(x_t) e^{-\rho t}\,dt$

where u is a utility function associated with the "representative" individual in the
society and $\rho$ is the time discount factor. Clearly the target described in (4) is a
special case of the one described in (5), where $u(x_t) = x_t$ and $\rho = 0$. In elementary
economics, we usually argue that a utility function of the form $u(x_t) = x_t$ is rather
unrealistic. Instead, we say that the "marginal utility" is decreasing with an
increasing amount of consumption. We now impose such an assumption. In other
words, throughout this section, we assume that u is defined on $[0, \infty)$, is twice
differentiable, and

(A-1) $u'(x_t) > 0$ and $u''(x_t) < 0$ for all $x_t \geq 0$

Under this assumption, $u(x_t)$ is a strictly monotone and strictly concave function.
[Here $u'(0)$ and $u''(0)$, respectively, refer to the right-hand derivatives of u and u' at
x = 0.]
The question of discounting the future (that is, $\rho > 0$) is not an easy one.
Frank Ramsey, who first studied the optimal saving problem systematically, argued
that $\rho$ should be equal to zero, for it is "unethical" to discount the utility of our
descendants compared to the utility of ourselves. The welfare of different
generations should be equally weighted. However, Koopmans [21] and Koopmans,
Diamond, and Williamson [25] have discovered that a utility function of all
consumption paths, which at the same time exhibits time neutrality and satisfies other
reasonable postulates on utility functions, does not exist. Since this question has
not been settled yet, we will not discuss it further. For the time being, we assume
that $\rho$ is constant and nonnegative.
For the infinite horizon formulation, equation (5) is rewritten as

(6) J - rl u(x1)e-Ptdt
0

Thus our problem is now to find the time path of x1 which maximizes J subject to
the feasibility conditions (1), (2), and (3) with the prescribed value of the initial
capital-labor ratio ko (assuming constant returns to scale) and the nonnegativity of
each variable.
The question thus formulated, however, immediately raises another problem.
That is, how can we guarantee that $J$ converges? If, for some feasible paths with a
prescribed $k_0$, the integral $J$ does not converge (say, goes to $\infty$), the above
formulation becomes meaningless. This question of convergence is especially acute when
$\rho = 0$. Ramsey [35] solved this question beautifully by constructing some
reference path, say, $\hat{u}$, and converting the problem to one of maximizing

(7) $J_R \equiv \int_0^\infty [u(x_t) - \hat{u}]\,dt$
Note that both $J$ and $J_R$ are bounded from below for the optimal path under the
monotonicity of the utility function $u$, assuming that the economy is "productive"
in the sense that it allows a strictly positive path of consumption starting from a
given $k_0$. When $\rho > 0$, the easiest way to guarantee the convergence of $J$ is simply
to assume that the function $u$ is bounded from above (that is, satiation). (See also
footnotes 12 and 16.)
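The role of the boundedness assumption can be seen in one line. As a sketch, write $M$ for an upper bound of $u$ (the symbol $M$ is introduced here only for illustration) and take $\rho > 0$:

```latex
\[
J = \int_0^\infty u(x_t)\,e^{-\rho t}\,dt
  \;\le\; M \int_0^\infty e^{-\rho t}\,dt
  \;=\; \frac{M}{\rho} \;<\; \infty .
\]
```

This rules out divergence of $J$ to $+\infty$; the bound fails when $\rho = 0$, which is why that case needs a device such as (7).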
So much for the discussion of the formulation of the problem. We now
proceed to the solution of the problem thus formulated. The optimal saving
problem was first formulated, and solved to a certain extent, by Ramsey
in 1928 [35]. The problem was then almost forgotten for some time, probably as a
result of the Great Depression and the war. In the 1950s interest in the problem
revived with enthusiasm, and it was solved by Koopmans [22] and Cass [6] in
the early 1960s. Cass formulated the problem in terms of Pontryagin's maximum
principle, reflecting the fashion at the time the paper was written. (We shall
take up such a formulation in Chapter 8.) As we will see in this section, we really
do not need this new technique, the full understanding of which requires
considerable mathematical maturity. Instead, we will use only the
elementary theory of the calculus of variations that we discussed in Section A.
Koopmans' paper [22], although masterly and very penetrating, is long, consisting
of sixty-three printed pages, and his proofs are sometimes difficult. This
difficulty is partly the result of his thorough and important examination of the
"eligibility" conditions for the "feasible path."
We can simplify the treatment considerably once we realize that the
whole problem is a straightforward application of the elementary part of the
classical calculus of variations. We will see that the "phase diagram" is very
useful, indeed vital, in our analysis. There is one basic difference between
Koopmans' approach and ours. Koopmans first eliminates the ineligible paths, then chooses the
optimal (eligible) path from the set of eligible (attainable) paths, whereas we first
eliminate the paths which do not satisfy the Euler condition as "nonoptimal," and
then choose the eligible (optimal) path from the set of the attainable paths that
satisfy Euler's condition. In the process of obtaining the set of attainable paths
that satisfy Euler's condition, we use the elementary theory of the calculus of
variations.
To solve our problem, we first have to simplify the constraint equations (1),
(2), and (3). This procedure of simplification has already been discussed in the
previous section, assuming that $F$ exhibits constant returns to scale. In short, for
$L_t > 0$, equations (1), (2), and (3) are reduced to
(8) $\dot{k}_t = f(k_t) - \lambda k_t - x_t$
where $k_t \equiv K_t/L_t$ (capital:labor ratio), $x_t \equiv X_t/L_t$, and $f(k_t) \equiv F(L_t, K_t)/L_t$. We call
the path $(k_t, x_t)$ the feasible path if it satisfies (8). If, in addition, it satisfies the
arbitrarily prescribed initial value $k_0$ and the terminal value $k_T$, we call it the
attainable path. When $T \to \infty$, $k_T$ will not be specified. The problem, then, is the
following:

Maximize: $J_T \equiv \int_0^T u(x_t)e^{-\rho t}\,dt$

Subject to: $\dot{k}_t = f(k_t) - \lambda k_t - x_t$ and $k_t \geq 0$, $x_t \geq 0$ for all $t$

with the prescribed values of $k_0 > 0$ and $k_T \geq 0$.
We first observe that $x_t$ in the target function can be expressed in terms of
$k_t$ and $\dot{k}_t$ in view of the constraint equation (8). In other words, the problem may
be reformulated as follows:

Maximize: $\int_0^T u[f(k_t) - \lambda k_t - \dot{k}_t]\,e^{-\rho t}\,dt$
Subject to: $k_t \geq 0$, $x_t \geq 0$ for all $t$
with the prescribed values of $k_0 > 0$ and $k_T \geq 0$.
The nonnegativity constraints $k_t \geq 0$, $x_t \geq 0$ do not cause any trouble here, for,
as we will see later, the solution path obtained by neglecting the nonnegativity
condition is in fact in the nonnegative orthant.
Neglecting the nonnegativity condition, we can immediately apply the Euler
condition to choose $x_t$ so as to maximize the integral $J_T$ from the set of attainable
paths. In other words, our problem is now converted to the calculus of variations
problem without the constraint. Thus letting
(9) $\Phi(t, k_t, \dot{k}_t) \equiv u[f(k_t) - \lambda k_t - \dot{k}_t]\,e^{-\rho t}$
we can write Euler's condition as follows:

(10) $\frac{\partial \hat{\Phi}}{\partial k_t} = \frac{d}{dt}\left[\frac{\partial \hat{\Phi}}{\partial \dot{k}_t}\right]$
where the partial derivatives are evaluated at the optimal path $\hat{k}_t$. For notational
simplicity we henceforth omit (^), which denotes the optimal path. Equation (10)
gives a necessary condition for $k_t$ to be an optimal path. By utilizing (9), (10) can
be computed as

(11) $\dot{x}_t = -\frac{u'(x_t)}{u''(x_t)}\left[f'(k_t) - (\lambda + \rho)\right]$

This is a necessary condition for an optimum. It is also sufficient for a unique
optimum, as we proved in Section B (Theorem 5.B.4), if $\Phi$ is strictly concave in $k_t$
and $\dot{k}_t$ (as long as $T$ is finite).
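The computation that leads from (9) and (10) to the Euler equation is routine but worth spelling out once. The following is a sketch of the differentiation, writing $x_t = f(k_t) - \lambda k_t - \dot{k}_t$ for the argument of $u$:

```latex
\begin{align*}
\frac{\partial \Phi}{\partial k_t}
  &= u'(x_t)\,[f'(k_t) - \lambda]\,e^{-\rho t}, \\
\frac{\partial \Phi}{\partial \dot{k}_t}
  &= -u'(x_t)\,e^{-\rho t}, \\
\frac{d}{dt}\!\left[\frac{\partial \Phi}{\partial \dot{k}_t}\right]
  &= -u''(x_t)\,\dot{x}_t\,e^{-\rho t} + \rho\,u'(x_t)\,e^{-\rho t}.
\end{align*}
% Equating the first and third lines as (10) requires, and dividing by e^{-\rho t}:
\[
u'(x_t)\,[f'(k_t) - \lambda - \rho] = -u''(x_t)\,\dot{x}_t
\quad\Longrightarrow\quad
\dot{x}_t = -\frac{u'(x_t)}{u''(x_t)}\,[f'(k_t) - (\lambda + \rho)].
\]
```

Since $u' > 0$ and $u'' < 0$ by (A-1), consumption rises along an Euler path exactly when $f'(k_t) > \lambda + \rho$.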
Thus we have obtained two equations, (8) and (11). The former describes
the feasibility condition and the latter describes the optimality condition. The path
$(k_t, x_t)$ which satisfies both equations is called the feasible Euler path. By
prescribing the values of two boundary conditions $k_0$ and $k_T$, we can obtain the
solution path of the problem as described.
We now ask the question: What happens to the path $(k_t, x_t)$ which satisfies
both feasibility (8) and optimality (11) with the initial condition $k_0$, when the
planning horizon $T$ is sufficiently large (where $k_T$ is no longer fixed)? As we argue
later, there can be three such possible types of paths, depending on the initial
condition. We argue that two of these three paths create some difficulties for a
sufficiently large $T$. Hence we call the paths in these two classes noneligible. The
path in the third class does not create such a difficulty and it is called an eligible
path. We show that the eligible path which satisfies both feasibility (8) and Euler's
condition (11) is such that it monotonically approaches the "modified golden
rule path" (whose concept is to be defined later) regardless of the initial $k_0$ as $T$
increases. It can be shown that the integral $J$ (for $\rho > 0$) or $J_R$ (for $\rho = 0$) converges
along such a path. The eligible Euler feasible path thus obtained will be better than
any other feasible path starting from the same initial point $k_0$ for any sufficiently
large $T$. This criterion of choosing the optimal path corresponds to the one
proposed by von Weizsäcker [52] as the "overtaking criterion."
Finally, one remark about the feasibility condition (8), in particular the
shape of $f(k_t)$, should be mentioned. In the neo-classical model as described in the
previous section, we supposed a strictly concave shape of $f$, that is, $f'(k_t) > 0$ and
$f''(k_t) < 0$ for all $k_t$. However, there is one other type of production function that
is quite common in the literature of economic growth and development. This
production function assumes a constant capital:output ratio. In
this case, $F(L_t, K_t)$ has the form

(12) $Y_t = \frac{1}{\sigma}K_t$
where $\sigma$ is a positive constant denoting the capital:output ratio. Notice that labor,
$L_t$, is not explicitly involved in this production function. By dividing both sides
by $L_t$ we obtain

(13) $y_t = \frac{1}{\sigma}k_t$, where $y_t \equiv Y_t/L_t$

In other words, by identifying $f(k_t)$ with $(1/\sigma)k_t$, we can consider the present case
as a special case of the production function considered above. We can use the
same conditions (8) and (11), with $f(k_t)$ now identified as $f(k_t) = (1/\sigma)k_t$. Equation
(8) can now be rewritten as

(14) $\dot{k}_t = \frac{1}{\sigma}[1 - \sigma\lambda]k_t - x_t$

b. THE CASE OF A CONSTANT CAPITAL:OUTPUT RATIO

During the revival period of the optimal growth problem in the 1950s
and early 1960s, there were important discussions by Tinbergen [47], [48],
Chakravarty [9], Goodwin [15], and others, before we reached the culmination
by Koopmans [22] and Cass [6]. They considered the case in which there exists a
constant capital:output ratio with a special form of the utility function and a finite
planning horizon. Such a case clearly constitutes the simplest possible case in the
problem formulated in subsection a. We now attempt a critical survey of the
literature during this period, especially [48], [9], and [15]. We will show that the
model with a constant capital:output ratio yields a difficulty when the planning
horizon is infinite. That is, the optimal path in this case does not exist at all in
many cases. Chakravarty [9] considered a finite horizon problem, as did Goodwin
[15]. There, in terms of numerical examples, Chakravarty made the very interesting
conjecture that the optimal attainable path is "insensitive" with respect to the
terminal capital stock $k_T$ and also that it is insensitive with respect to the planning
horizon $T$. We will show that these conjectures are true under a general
framework, which will shed some light on a later controversy between Chakravarty and
Maneschi. We note that our sensitivity analysis deals with a simple case of Brock's
elegant analysis [5]. We point out that, in the Chakravarty-Goodwin case,
the optimal program for a sufficiently large $T$ approximates the program in which
consumption is kept at the subsistence level forever. The discussion of this
subsection will be useful to increase the reader's understanding of the problem
involved in the constant capital:output ratio case and of the basic technique
employed in the analysis, as well as some of the basic difficulties involved in the
optimal growth problem of an aggregate economy.
With these preliminary remarks we now proceed with our analysis. First
rewrite the Euler equation, (11), for the case of a fixed capital:output ratio, that
is,

(15) $\dot{x}_t = -\frac{u'}{u''}\left[\frac{1}{\sigma} - (\lambda + \rho)\right]$

Since $f(k_t) - \lambda k_t - \dot{k}_t$ [$= (1/\sigma - \lambda)k_t - \dot{k}_t$] is a linear function in $k_t$ and $\dot{k}_t$ (hence
concave) and $u$ is a strictly concave function, the Euler condition, (15), is sufficient
for a global optimum as well as necessary (assuming that $T$ is finite). The optimal
feasible path is the one which satisfies equations (14) and (15) simultaneously with
$k_t \geq 0$ and $x_t \geq 0$. We may replace the condition $x_t \geq 0$ by $x_t \geq \bar{x}$, where $\bar{x}$ is the
subsistence level of consumption. We may note that Chakravarty assumes $\bar{x} = 0$.
In the formulation of the problem by Tinbergen, Goodwin, and Chakravarty,
the depreciation of capital is not explicit. In our formulation, this amounts to
putting $\mu = 0$. Also these three authors assumed that there is no time discount for
future consumption, so that $\rho = 0$. Tinbergen and Chakravarty in the main
assumed that there is no population growth in the economy so that $n = 0$ (thus
$\lambda = 0$). Goodwin gives a numerical example of the problem in which he assumes
$n = 0.01$ and $\sigma = 4$ (see [15], pp. 773-774). Hence all the treatments of Tinbergen,
Chakravarty, and Goodwin can be considered as special cases of the following
assumption:
(A-2) $1 - \sigma(\lambda + \rho) > 0$
The case in which $1 - \sigma(\lambda + \rho) < 0$ can be discussed mutatis mutandis so that the
analysis for this case can be omitted from our discussion. We may note that if
$1 - \sigma(\lambda + \rho) < 0$, then the path constrained by equation (15) requires the economy
continuously to reduce per capita consumption (that is, $\dot{x}_t < 0$), since $u' > 0$
and $u'' < 0$ by (A-1). This is an uninteresting case. We may note that (A-2) implies
$1 - \sigma\lambda > 0$. If $1 - \sigma\lambda < 0$, then, from equation (14), any feasible path (with $x_t$
some positive constant) must undergo a decrease in capital stock and the economy
must disappear for a sufficiently large $T$ in order to keep some positive level of
consumption. Otherwise, the amount of per capita consumption must become zero
and everyone in the society will eventually starve to death. Therefore, the case
$1 - \sigma\lambda < 0$ is uninteresting. Note that (A-2), among others, implies that $\rho$
cannot be too large, and in fact, if we accept $\rho = 0$, (A-2) can easily be accepted
as a realistic assumption.
The studies by Tinbergen, Goodwin, and Chakravarty all assume the
following specific form of the utility function, which obviously satisfies (A-1),
for $x_t > \bar{x}$:
(16) $u'(x_t) = (x_t - \bar{x})^{-\nu}$, $\nu > 0$
or

(17) $u(x_t) = \log(x_t - \bar{x})$ if $\nu = 1$, and $u(x_t) = \frac{1}{1 - \nu}(x_t - \bar{x})^{1 - \nu}$ if $\nu > 0$, $\nu \neq 1$

Here $u$ is defined for $x_t > \bar{x}$ if $\nu = 1$ and for $x_t \geq \bar{x}$ if $0 < \nu < 1$. In (16) and (17),
$\bar{x}$ is the subsistence level of consumption. If we suppose $u(x) > 0$ for some value of
$x > \bar{x}$, then $\nu$ cannot be greater than 1. Goodwin assumes that $\nu = 1$. Tinbergen
quotes the figures from Frisch's study of 1931 which, for example, says $\nu = 0.6$ for
American workers ([48], p. 482). The specification of the utility function as above
may provoke strong opposition from the viewpoint of the cardinality of utility.
However, since one of the purposes of this section is to survey the past studies,
we want to keep the explicit form of the utility function as defined above.
With the above specification of the utility function, the Euler equation (15)
can be rewritten as

(18) $\dot{x}_t = \alpha(x_t - \bar{x})$, where $\alpha \equiv \frac{1}{\sigma\nu}[1 - \sigma(\lambda + \rho)] > 0$ from (A-2)

The solution of this differential equation can immediately be obtained as

(19) $x_t = \bar{x} + Ae^{\alpha t}$
where $\alpha > 0$ from (A-2), and $A$ is a constant determined by the boundary
conditions. Equation (14) can now be rewritten as

(20) $\dot{k}_t - \beta k_t = -(\bar{x} + Ae^{\alpha t})$, where $\beta \equiv \frac{1}{\sigma}(1 - \sigma\lambda) > 0$ from (A-2)

This is a simple linear differential equation, and its solution can be obtained as

(21) $k_t = \frac{\bar{x}}{\beta} + (B - At)e^{\beta t}$, if $\alpha = \beta$

(22) $k_t = \frac{\bar{x}}{\beta} + \frac{A}{\beta - \alpha}e^{\alpha t} + Be^{\beta t}$, if $\alpha \neq \beta$
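The closed forms above can be checked numerically. The following is a minimal sketch, not part of the original argument: it evaluates the paths (19), (21), and (22) and measures how far they are from satisfying (20), using a central difference for $\dot{k}_t$. The function names and the step size `h` are our own illustrative choices.

```python
import math

def x_path(t, A, alpha, x_bar=0.0):
    """Consumption path (19): x_t = x_bar + A * exp(alpha * t)."""
    return x_bar + A * math.exp(alpha * t)

def k_path(t, A, B, alpha, beta, x_bar=0.0):
    """Capital path: (21) when alpha == beta, (22) otherwise."""
    if math.isclose(alpha, beta):
        return x_bar / beta + (B - A * t) * math.exp(beta * t)
    return (x_bar / beta
            + A / (beta - alpha) * math.exp(alpha * t)
            + B * math.exp(beta * t))

def residual_20(t, A, B, alpha, beta, x_bar=0.0, h=1e-5):
    """Left side minus right side of (20); k-dot is a central difference."""
    k_dot = (k_path(t + h, A, B, alpha, beta, x_bar)
             - k_path(t - h, A, B, alpha, beta, x_bar)) / (2 * h)
    lhs = k_dot - beta * k_path(t, A, B, alpha, beta, x_bar)
    rhs = -x_path(t, A, alpha, x_bar)
    return lhs - rhs

# Hypothetical constants, chosen only to exercise both branches.
print(residual_20(2.0, A=0.5, B=3.0, alpha=0.3, beta=0.2, x_bar=1.0))
print(residual_20(2.0, A=0.5, B=3.0, alpha=0.2, beta=0.2, x_bar=1.0))
```

Both residuals should be zero up to the discretization error of the finite difference.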
The two constants $A$ and $B$ are to be determined by the boundary conditions. One
of them is obviously the initial value of $k$. We can consider several candidates for
the other. Goodwin chooses the terminal growth rate $\dot{Y}_T/Y_T$. Since the
capital:output ratio is constant and the rate of labor growth is constant, this amounts to
choosing $\dot{k}_T/k_T$. Chakravarty chooses the terminal stock of capital $k_T$ as the other
boundary condition. In either case, the specification of the two boundary
conditions determines the values of $A$ and $B$, and hence specifies completely the optimal
attainable path of $(k_t, x_t)$. We may note that Goodwin assumes $\nu = 1$ and $\rho = 0$ so
that $\alpha = \beta$, whereas Tinbergen and Chakravarty consider the case in which $\nu < 1$
and $\rho = 0$ so that $\alpha \neq \beta$. In other words, Goodwin considers a special case of the
time path $(k_t, x_t)$ described by equations (20) and (21), while Tinbergen and
Chakravarty considered a special case of the path of $(k_t, x_t)$ described by (20) and (22).
To pursue the analysis further, let us assume $\bar{x} = 0$. This amounts to
choosing the origin properly and does not constitute a loss of generality. (One may, if one
so desires, redefine $x_t$, $k_t$ by $x_t - \bar{x}$ and $k_t - \bar{x}/\beta$, respectively.) Then (19), (21), and
(22) can be rewritten respectively as

(23) $x_t = Ae^{\alpha t}$, regardless of the relative size of $\alpha$ and $\beta$

(24) $k_t = (B - At)e^{\beta t}$, when $\alpha = \beta$

(25) $k_t = \frac{A}{\beta - \alpha}e^{\alpha t} + Be^{\beta t}$, when $\alpha \neq \beta$
Write the two boundary conditions as
(26) $k_0 = a$ and $k_T = b$, where we assume $a > 0$ and $b \geq 0$
Note that if $a = 0$, then $k_t = 0$ and $x_t = 0$ for all $t$. We may disregard this
uninteresting case. Using (26), we can obtain expressions for $A$ and $B$ as follows:
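CASE I: $\alpha = \beta$
(27) $A = \frac{a - be^{-\beta T}}{T}$
(28) $B = a$

CASE II: $\alpha \neq \beta$

(29) $A = (\alpha - \beta)\,\frac{ae^{\beta T} - b}{e^{\alpha T} - e^{\beta T}}$

(30) $B = a + \frac{A}{\alpha - \beta}$

We may rewrite (29) as follows:

(31) $A = (\alpha - \beta)\,\frac{a - be^{-\beta T}}{e^{(\alpha - \beta)T} - 1}$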
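As a quick consistency check on (27) to (31), one can compute $A$ and $B$ for given boundary stocks and confirm that the resulting path meets both boundary conditions in (26). The sketch below assumes $\bar{x} = 0$ as in the text; the function names and the numerical values are hypothetical.

```python
import math

def constants(a, b, T, alpha, beta):
    """Return (A, B): (27)-(28) when alpha == beta, else (31) and (30)."""
    gap = a - b * math.exp(-beta * T)
    if math.isclose(alpha, beta):
        return gap / T, a
    # expm1(z) = exp(z) - 1, the denominator of (31)
    A = (alpha - beta) * gap / math.expm1((alpha - beta) * T)
    B = a + A / (alpha - beta)
    return A, B

def k_path(t, A, B, alpha, beta):
    """Capital path (24) or (25), with the subsistence level set to zero."""
    if math.isclose(alpha, beta):
        return (B - A * t) * math.exp(beta * t)
    return A / (beta - alpha) * math.exp(alpha * t) + B * math.exp(beta * t)

a, b, T = 1.0, 2.0, 10.0          # hypothetical k_0, k_T, and horizon
for alpha, beta in [(0.24, 0.24), (0.3, 0.2), (0.1, 0.2)]:
    A, B = constants(a, b, T, alpha, beta)
    print(alpha, beta, k_path(0.0, A, B, alpha, beta), k_path(T, A, B, alpha, beta))
```

In each case the two printed stocks should reproduce $a$ and $b$ up to rounding.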
We now examine the nonnegativity condition, that is, $k_t \geq 0$, $x_t \geq 0$ for all $t$.
Clearly whether this condition is satisfied or not depends on the magnitudes of
$A$ and $B$. So far as equations (27) to (30) are concerned, $A$ and $B$ can be either
negative or positive depending on the size of $T$ and the relative sizes of $a$ and $b$.
For example, if $T$ is sufficiently small and $b$ is sufficiently large relative to $a$,
then we may have $a < be^{-\beta T}$, so that $A < 0$ in (27). A necessary and sufficient
condition for $x_t \geq 0$ for all $t$ is, in view of (23), that $A \geq 0$ regardless of the
relative size of $\alpha$ and $\beta$. $A \geq 0$ holds [in view of (27) and (29)] if and only if
(32) $ae^{\beta T} \geq b$, regardless of the relative size of $\alpha$ and $\beta$
We assume that this condition holds, for otherwise $x_t < 0$ (for all $t$).
In order to consider the condition under which $k_t \geq 0$ for all $t$, we obtain the
expressions for $k_t$ using (24), (25), (27), (28), (29), and (30):

(33) $k_t = ae^{\beta t}\,\frac{T - t}{T} + be^{\beta(t - T)}\,\frac{t}{T}$, when $\alpha = \beta$

(34) $k_t = ae^{\beta t} - (ae^{\beta T} - b)\,\frac{e^{\alpha t} - e^{\beta t}}{e^{\alpha T} - e^{\beta T}}$, when $\alpha \neq \beta$

where (34) can be rewritten as

(35) $k_t = \frac{1}{e^{\alpha T} - e^{\beta T}}\left[a\left(e^{\alpha T + \beta t} - e^{\alpha t + \beta T}\right) + b\left(e^{\alpha t} - e^{\beta t}\right)\right]$ (when $\alpha \neq \beta$)

In view of (33), $k_t \geq 0$ for all $t$, when $\alpha = \beta$. Also in view of (35), $k_t \geq 0$ for all $t$,
when $\alpha \neq \beta$. In fact, $k_t > 0$ for all $t$, $0 \leq t < T$, regardless of the relative size of
$\alpha$ and $\beta$ as long as $a > 0$.
In order to investigate what happens when $T$ is large enough, take the limit
as $T \to \infty$ in (27), (28), (29), and (30). We then obtain:
(36) $A = 0$, $B = a$, when $\alpha = \beta$
(37) $A = 0$, $B = a$, when $\alpha > \beta$
(38) $A = (\beta - \alpha)a$, $B = 0$, when $\alpha < \beta$
Hence in view of (23), (24), and (25), we obtain:
(39) $x_t = 0$, for all $t$, when $\alpha \geq \beta$
(40) $x_t = (\beta - \alpha)ae^{\alpha t}$, for all $t$, when $\alpha < \beta$
(41) $k_t = ae^{\beta t}$, for all $t$, when $\alpha \geq \beta$
(42) $k_t = ae^{\alpha t}$, for all $t$, when $\alpha < \beta$
Note that when $\alpha < \beta$, we obtain, in view of (40) and (42), the following relation:
(43) $x_t = (\beta - \alpha)k_t$, for all $t$
In other words, $x_t$ and $k_t$ grow at the same rate $\alpha$ and the ratio between them is
constant (that is, $\beta - \alpha$). For $\alpha \geq \beta$, we do not have such a solution.
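The limits of $A$ as $T$ grows can be observed directly. The following sketch, with hypothetical values of $a$, $b$, $\alpha$, and $\beta$, evaluates $A$ from (27) or (31) for increasing horizons; `A_of_T` is our own helper name.

```python
import math

def A_of_T(a, b, T, alpha, beta):
    """Constant A from (27) if alpha == beta, otherwise from (31)."""
    gap = a - b * math.exp(-beta * T)
    if math.isclose(alpha, beta):
        return gap / T
    return (alpha - beta) * gap / math.expm1((alpha - beta) * T)

a, b = 1.0, 2.0   # hypothetical boundary stocks
for alpha, beta in [(0.2, 0.2), (0.3, 0.2), (0.1, 0.2)]:
    values = [A_of_T(a, b, T, alpha, beta) for T in (10, 100, 1000)]
    print(alpha, beta, values)
```

For $\alpha \geq \beta$ the values shrink toward zero (slowly, like $a/T$, when $\alpha = \beta$), while for $\alpha < \beta$ they settle near $(\beta - \alpha)a$.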
Define the limit path as the path which is specified by (39) and (41) [or
(40) and (42)]. What can we infer from the limit path? One important implication
is that such a path gives an approximation of the optimal path when the planning
horizon is large enough.
Can we infer anything about the infinite horizon problem? First note that
we cannot specify the terminal stock $b$ for the infinite horizon problem. It is
certainly meaningless to talk about the capital:labor ratio for the infinite future,
that is, the date which we can never reach! Therefore, strictly speaking, the
solution of the infinite horizon problem is not the limit of the finite horizon
problem. The problem is altered with regard to the specification of the terminal
stock $b$.
The reader may then wonder whether we can replace the boundary
condition $k_T = b$ by a condition such as $\lim_{T \to \infty} k_T = b$. Then the terminal condition
is specified. But we can immediately see that the limit path then does not give a
solution of the infinite horizon problem, by simply observing that $k_t \to \infty$ as $t \to \infty$ in
the limit path. In other words, if we adopt the limit path approach for the infinite
horizon problem, the terminal condition should not be specified.
Furthermore, the limit path is not, in general, a solution of the infinite
horizon problem anyway, even if the terminal condition is unspecified. This is
easy to see by assuming $\alpha \geq \beta$ and recalling (39). In other words, if $\alpha \geq \beta$, $x_t = 0$
for all $t$ in the limit path; that is, consumption must be kept at the subsistence
level forever. Clearly the path in which $x_t = 0$ for all $t$ cannot be an "optimal"
path. In fact, it gives the worst possible path if $\beta > 0$, since it is possible for the
economy to sustain itself at more than the subsistence level. To see this, it suffices
to choose a constant $x_t$ such that $0 < x_t \leq \beta a$ and examine (14). Clearly such a path is
attainable and $k_t$ is non-decreasing in $t$. Such a path is certainly better than the path
in which the economy is kept at the subsistence level for all $t$.
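The claim that a constant consumption level $0 < x \leq \beta a$ is sustainable follows from solving (14) with $x_t = x$: the solution is $k_t = x/\beta + (a - x/\beta)e^{\beta t}$, which is non-decreasing exactly when $a \geq x/\beta$. A minimal sketch, with hypothetical numbers and a helper name of our own:

```python
import math

def k_constant_consumption(t, a, x, beta):
    """Solution of (14) with x_t = x constant and k_0 = a."""
    return x / beta + (a - x / beta) * math.exp(beta * t)

beta, a = 0.24, 1.0                      # hypothetical values
for x in (0.2, beta * a):                # strictly below, then equal to beta * a
    path = [k_constant_consumption(t, a, x, beta) for t in range(0, 50)]
    print(x, all(k2 >= k1 for k1, k2 in zip(path, path[1:])))
```

With $x = \beta a$ the stock stays at $a$ forever; with any smaller positive $x$ it grows.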
What then can we infer from this? The appropriate conclusion is that the
solution of the infinite horizon problem does not exist if $\alpha \geq \beta$. Actually a simple
procedure, which does not involve the tedious process of obtaining the limit
path, will also reveal this. First note that the solution must satisfy the feasibility
condition (14), and the Euler condition, (18) or (23). Any path which satisfies
these two conditions is called the feasible Euler path. The feasible Euler path is
not necessarily a solution path (an optimal path), that is, the solution of the
optimization problem. We then have to proceed to screen the solution path out
of the set of all feasible Euler paths. The test used in this screening process is
called the eligibility test. Note that if $x_0 = 0$, then $x_t = 0$ for all $t$ by Euler's
condition (23). Since the society can sustain itself at more than the subsistence level, the
path in which $x_t = 0$ for all $t$ is not "eligible" for the solution path. Recall (24)
and (25). Then if $x_0 > 0$ and $\alpha \geq \beta$, $k_t$ eventually becomes negative for a sufficiently
large $t$ [$\because A > 0$ from (23) and $x_0 > 0$]. In other words, if $\alpha \geq \beta$, none of the
feasible Euler paths is "eligible" for the solution path; that is, the solution path
for the infinite horizon problem does not exist.
Since the above conclusion crucially hinges on whether $\alpha \geq \beta$ or $\alpha < \beta$, let
us obtain the condition under which $\alpha < \beta$. This can easily be obtained from the
definitions of $\alpha$ and $\beta$ in (18) and (20), and we can conclude that the necessary
and sufficient condition for $\alpha < \beta$ is

(44) $\nu(1 - \sigma\lambda) > 1 - \sigma(\lambda + \rho)$

Hence, in particular, when $\rho = 0$ (no future discount), the necessary and sufficient
condition for $\alpha < \beta$ is simplified to
(45) $\nu > 1$

As we remarked earlier, if we assume $u(x) > 0$ for some $x > 0$, $\nu$ cannot be greater
than one. Hence, in this case, $\alpha \geq \beta$ must hold as long as $\rho = 0$. Note that if
$\rho > 0$, then $\nu \leq 1$ is necessary for $\alpha \geq \beta$, and that $\nu \geq 1$ is sufficient for $\alpha < \beta$
[in view of (44)]. Goodwin's case ($\nu = 1$, $\lambda > 0$, $\rho = 0$) and Chakravarty's model
($\nu < 1$, $\lambda > 0$, $\rho = 0$), as well as the above-mentioned Tinbergen case ($\lambda = \rho = 0$,
$\nu < 1$), all yield the case in which $\alpha \geq \beta$. An interesting example of $\alpha < \beta$ may be
the case in which $\lambda = 0$, $\rho > 0$, and $\nu > 1$.
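The classification into $\alpha \geq \beta$ and $\alpha < \beta$ is easy to mechanize. The sketch below computes $\alpha$ and $\beta$ from their definitions in (18) and (20) and tests condition (44). Goodwin's values $n = 0.01$, $\sigma = 4$, $\nu = 1$, $\rho = 0$ are taken from the text (with $\mu = 0$, so $\lambda = n$); the second parameter set is purely hypothetical, and the function names are ours.

```python
def growth_rates(nu, sigma, lam, rho):
    """Return (alpha, beta) as defined in (18) and (20)."""
    alpha = (1.0 - sigma * (lam + rho)) / (sigma * nu)
    beta = (1.0 - sigma * lam) / sigma
    return alpha, beta

def alpha_below_beta(nu, sigma, lam, rho):
    """Condition (44): nu * (1 - sigma*lam) > 1 - sigma*(lam + rho)."""
    return nu * (1.0 - sigma * lam) > 1.0 - sigma * (lam + rho)

# Goodwin's numerical example: alpha and beta coincide.
print(growth_rates(nu=1.0, sigma=4.0, lam=0.01, rho=0.0),
      alpha_below_beta(1.0, 4.0, 0.01, 0.0))

# Hypothetical case with discounting and nu > 1: alpha falls below beta.
print(growth_rates(nu=1.5, sigma=4.0, lam=0.0, rho=0.05),
      alpha_below_beta(1.5, 4.0, 0.0, 0.05))
```

The design choice here is to test (44) directly rather than compare the two computed rates, which avoids dividing by $\sigma\nu$ and mirrors the statement in the text.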
Tinbergen considered the infinite horizon problem with the above
specification (which amounts to $\alpha > \beta$) and contended that his article is "an unsuccessful
attempt to find a simple solution to the problem of optimum savings" ([48],
p. 481). Both Goodwin and Chakravarty considered the finite horizon problem;
hence there is no such "unsuccessful" story.
For the finite horizon problem, the limit path represents an approximation
to the case in which $T$ is sufficiently large. The only question that remains is
how the economy can tolerate spending most of its time near the subsistence
level. The present contention of the author is that this is not a small criticism,
although such a judgment may be a matter of taste.
Confining ourselves to the finite horizon problem, there is a way to avoid
the above-mentioned problem of the "arbitrary cut-off point." This is the
"sensitivity analysis" explored by Brock [5]. Postponing its full discussion to
the Appendix, we now illustrate this analysis for the present problem. Assuming
that $T$ is finite, this analysis examines questions such as the effect of a change in
the terminal stock $k_T = b$ and a change in the terminal date $T$ on the optimal
consumption program. Then we find out that the optimal consumption program
is "insensitive," at least for a certain initial period, with respect to these changes,
if $T$ is large enough. As remarked before, Chakravarty [9] conjectured such
insensitivities by constructing certain numerical examples. These problems were
then solved under a general framework with both linear and nonlinear production
functions by Brock [5]. Our consideration here offers a simple case of Brock's
result. Also note that Brock dealt with a discrete time model while ours is a
continuous time model.
We first consider the effect of a change in the terminal stock requirement
$k_T$, assuming $T$ is fixed. Write the optimal path for $k_T = b_1$ as $(k_t^1, x_t^1)$
and the optimal path for $k_T = b_2$ as $(k_t^2, x_t^2)$. Similarly, we write the values of
$A$ and $B$ for $k_T = b_1$ as $A_1$ and $B_1$ and those for $k_T = b_2$ as $A_2$ and $B_2$. Then
using (23) to (25) and (34), we can compute the following, where we assume
$b_1 > b_2$:
CASE I: $\alpha = \beta$
(46) $x_t^1 - x_t^2 = e^{\alpha t}(A_1 - A_2) = e^{\alpha t}(b_2 - b_1)\,\frac{e^{-\beta T}}{T} < 0$ (which implies $x_t^1 < x_t^2$)

(47) $k_t^1 - k_t^2 = te^{\alpha t}(A_2 - A_1) = te^{\alpha t}(b_1 - b_2)\,\frac{e^{-\beta T}}{T} > 0$ for $t > 0$ (which implies $k_t^1 > k_t^2$)

CASE II: $\alpha \neq \beta$
(48) $x_t^1 - x_t^2 = e^{\alpha t}(A_1 - A_2) = (\alpha - \beta)(b_2 - b_1)\,\frac{e^{\alpha t}}{e^{\alpha T} - e^{\beta T}} < 0$ (which implies $x_t^1 < x_t^2$)
(49) $k_t^1 - k_t^2 = \frac{e^{\alpha t} - e^{\beta t}}{e^{\alpha T} - e^{\beta T}}\,(b_1 - b_2) > 0$ for $t > 0$ (which implies $k_t^1 > k_t^2$)
Hence on the optimal path, an increase in the terminal stock requirement $k_T$
implies a decrease in $x_t$ for each $t$ and an increase in $k_t$ for each $t$. Moreover,
when $\alpha = \beta$, the distance between $x_t^1$ and $x_t^2$ can be made arbitrarily small for
each $t$ by choosing $T$ sufficiently large and $t$ sufficiently small; also the distance
between $k_t^1$ and $k_t^2$ can be made arbitrarily small for each $t$ by choosing a
sufficiently large $T$, provided that $t$ is sufficiently small. In other words, the optimal
path is "insensitive" to a change in the terminal stock $k_T$ for a certain initial
period when $T$ is sufficiently large, provided that $\alpha = \beta$. The degree of this
insensitivity (that is, the choice of $T$ and $t$) can be precisely computed from (46)
and (47).
In order to see whether a similar conclusion can be obtained for $\alpha \neq \beta$, we
rewrite (48) and (49), respectively, as follows.

(50) $x_t^1 - x_t^2 = \frac{e^{\alpha t}(b_2 - b_1)(\alpha - \beta)}{e^{\beta T}\left(e^{(\alpha - \beta)T} - 1\right)} = \frac{e^{\alpha t}(b_2 - b_1)(\beta - \alpha)}{e^{\alpha T}\left(e^{(\beta - \alpha)T} - 1\right)}$

(51) $k_t^1 - k_t^2 = (b_1 - b_2)\,\frac{e^{\alpha t} - e^{\beta t}}{e^{\beta T}\left(e^{(\alpha - \beta)T} - 1\right)} = (b_1 - b_2)\,\frac{e^{\beta t} - e^{\alpha t}}{e^{\alpha T}\left(e^{(\beta - \alpha)T} - 1\right)}$
Hence the distance between $x_t^1$ and $x_t^2$ can again be made arbitrarily small for
each $t$ by choosing $T$ large enough (relative to $t$), when $\alpha \neq \beta$. Also the distance
between $k_t^1$ and $k_t^2$ can again be made arbitrarily small for each $t$ by choosing
$T$ large enough (relative to $t$), when $\alpha \neq \beta$. The choice of $T$ with a given distance
between $x_t^1$ and $x_t^2$ (resp. $k_t^1$ and $k_t^2$) and with a given value of $t$ can be computed
precisely from (50) [resp. (51)].
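The insensitivity with respect to the terminal stock can be illustrated numerically. The sketch below evaluates the gaps in (46) to (49) at a fixed early date $t$ for successively longer horizons; all parameter values are hypothetical and the function names are our own.

```python
import math

def x_gap(t, T, b1, b2, alpha, beta):
    """x_t^1 - x_t^2: (46) when alpha == beta, (48) otherwise."""
    if math.isclose(alpha, beta):
        return math.exp(alpha * t) * (b2 - b1) * math.exp(-beta * T) / T
    return ((alpha - beta) * (b2 - b1) * math.exp(alpha * t)
            / (math.exp(alpha * T) - math.exp(beta * T)))

def k_gap(t, T, b1, b2, alpha, beta):
    """k_t^1 - k_t^2: (47) when alpha == beta, (49) otherwise."""
    if math.isclose(alpha, beta):
        return t * math.exp(alpha * t) * (b1 - b2) * math.exp(-beta * T) / T
    return ((math.exp(alpha * t) - math.exp(beta * t))
            / (math.exp(alpha * T) - math.exp(beta * T)) * (b1 - b2))

t, b1, b2 = 5.0, 3.0, 2.0     # hypothetical date and terminal stocks, b1 > b2
for alpha, beta in [(0.2, 0.2), (0.3, 0.2), (0.1, 0.2)]:
    for T in (20.0, 40.0, 80.0):
        print(alpha, beta, T, x_gap(t, T, b1, b2, alpha, beta),
              k_gap(t, T, b1, b2, alpha, beta))
```

In every case the consumption gap is negative, the capital gap is positive, and both shrink in absolute value as $T$ increases.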
Next we consider the effect of a change in the planning horizon $T$ on the
optimal path. Write the optimal path for the $T$-period problem as $(k_t^T, x_t^T)$ and
the optimal path for the $T'$-period problem as $(k_t^{T'}, x_t^{T'})$. Assume $T' > T$. Using
(23) to (25) and (34), compute the following:
CASE I: $\alpha = \beta$

(52) $x_t^{T'} - x_t^T = e^{\alpha t}\delta_T$, where $\delta_T \equiv \frac{a - be^{-\beta T'}}{T'} - \frac{a - be^{-\beta T}}{T}$
(53) $k_t^{T'} - k_t^T = -te^{\alpha t}\delta_T$

CASE II: $\alpha \neq \beta$

(54) $x_t^{T'} - x_t^T = e^{\alpha t}(\alpha - \beta)\Delta_T$, where $\Delta_T \equiv \frac{a - be^{-\beta T'}}{e^{(\alpha - \beta)T'} - 1} - \frac{a - be^{-\beta T}}{e^{(\alpha - \beta)T} - 1}$
(55) $k_t^{T'} - k_t^T = -(e^{\alpha t} - e^{\beta t})\Delta_T$

Note that $\delta_T$ and $\Delta_T$ can be made arbitrarily close to zero by choosing $T$ sufficiently
large (regardless of the relative size of $\alpha$ and $\beta$). Hence, for each fixed $t$, both the
distance between $x_t^T$ and $x_t^{T'}$ and the distance between $k_t^T$ and $k_t^{T'}$ can be made
arbitrarily close to zero, by choosing $T$ sufficiently large (relative to $t$), regardless
of the relative size of $\alpha$ and $\beta$.
Note also that if $b = 0$, then $\delta_T < 0$ so that $x_t^{T'} < x_t^T$ and $k_t^{T'} > k_t^T$ when
$\alpha = \beta$. Also $b = 0$ implies that $\Delta_T \lessgtr 0$ according to whether $\alpha \gtrless \beta$. Hence $x_t^{T'}
< x_t^T$ and $k_t^{T'} > k_t^T$, when $\alpha \neq \beta$. In other words, the monotonicities of $x_t^T$ and
$k_t^T$ with respect to $T$ can be achieved whenever we have $b = 0$. Note that a
necessary and sufficient condition for such monotonicities can be computed from (52)
to (55) for the case in which $b > 0$. Also note that we have established the
insensitivity of the optimal path with respect to $T$ without regard to any such
monotonicities.
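The behavior of $\delta_T$ and $\Delta_T$ can be checked in the same spirit. The sketch below, with hypothetical parameter values and our own function names, computes both quantities for $T' = 2T$ and shows their signs when $b = 0$ as well as their decay as $T$ grows.

```python
import math

def delta_T(T, T_prime, a, b, beta):
    """delta_T from (52), the alpha == beta case."""
    return ((a - b * math.exp(-beta * T_prime)) / T_prime
            - (a - b * math.exp(-beta * T)) / T)

def Delta_T(T, T_prime, a, b, alpha, beta):
    """Delta_T from (54), the alpha != beta case."""
    return ((a - b * math.exp(-beta * T_prime)) / math.expm1((alpha - beta) * T_prime)
            - (a - b * math.exp(-beta * T)) / math.expm1((alpha - beta) * T))

a = 1.0                                   # hypothetical initial stock
for T in (10.0, 40.0, 160.0):
    print(T,
          delta_T(T, 2 * T, a, 0.0, 0.2),           # negative when b = 0
          Delta_T(T, 2 * T, a, 0.0, 0.3, 0.2),      # negative: alpha > beta
          Delta_T(T, 2 * T, a, 0.0, 0.1, 0.2))      # positive: alpha < beta
```

In both cases of $\alpha \neq \beta$ the product $(\alpha - \beta)\Delta_T$ is negative, so consumption under the longer horizon lies below consumption under the shorter one, as stated in the text.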
Hence we obtained the conclusion that the optimal path is insensitive both
to a change in the planning horizon $T$ and to a change in the terminal stock $k_T$
for a certain initial period. I believe that this is a precise formalization of
Chakravarty's conjecture, where he confined himself to numerical examples.
In the above we noted the following features of the constant capital:output
ratio model when $\alpha \geq \beta$.

(i) The solution of the infinite horizon problem does not exist.
(ii) Although the sensitivity results hold for the finite horizon problem, the solution
for a sufficiently large $T$ approximates the program in which consumption is
kept at the subsistence level for a long period of time.

These observations lead us to suspect the plausibility of the constant
capital:output ratio model. In the next subsection we take up a nonlinear specification
of the production function, that is, the case in which the assumption of a
constant capital:output ratio does not hold. We then show that under a certain set
of plausible assumptions, the solution of the infinite horizon problem always
converges to a balanced growth path ("modified golden rule path"). In the Appendix
to Section D, we show that the sensitivity results hold in general, including such
a nonlinear case.
Here we should also recall the problem of the inequality between the
natural rate of growth and the warranted rate of growth in the Harrod-Domar
model. In other words, we have to ask ourselves the question whether we can
really describe the "optimal" path without any significant consideration of
the growth of labor. Will not such a path be bounded by the ceiling of the
growth with "full employment of labor"? Will not such a path cause
continuously increasing unemployment of labor? Will not the productivity of capital
($1/\sigma$) be decreased with the increase in the capital:labor ratio? There are no
clear answers to these questions as long as we retain the assumption of a
constant capital:output ratio.

c. NONLINEAR PRODUCTION FUNCTION WITH INFINITE TIME HORIZON
In this subsection we consider the case in which there is substitution
between labor and capital in the production function. In other words, we are
dealing with equation (8), where $f$ is some nonlinear function of $k$. We consider
the problem in which the time horizon is infinite. This is the problem that was
posed and answered by Koopmans [22] and Cass [6]. We will see that with
a nonlinear specification of the production function, the difficulty that arose in the
previous subsection will not arise here. The assumptions that we impose on the
function $f$ are as follows:
(A-3) $f'(k) > 0$ and $f''(k) < 0$ for all $k \geq 0$.
(A-4) $f'(0) = \infty$, $f'(\infty) = 0$, and $f(0) = 0$.
Here f'(0) and f"(0), respectively, are the right-hand derivatives off and f at 0.
Note that $f'(k) > 0$ for all $k$ means that the marginal physical product of capital
is always positive, and $f''(k) < 0$ means that the marginal physical product of
capital (labor) is a decreasing function with respect to capital (labor); $f(0) = 0$
means that capital is indispensable for production. These assumptions are also
introduced in the previous section. Under (A-3) and (A-4), we can draw the
following familiar diagram. Note that (A-3) and (A-4) guarantee the existence of
a unique positive solution for $f(k) = \lambda k$. We denote the value of $k$ which
satisfies this equation by $\tilde{k}$. Note also that the first assumption of (A-4) is
strategic in avoiding the uninteresting possibility of $f(k) - \lambda k < 0$ for all $k$
(such a possibility would mean that, in order to keep consumption positive ($x > 0$),
the economy must eventually disappear, regardless of the optimality condition).
It can also be seen from Figure 5.9 that if $k_0 > \tilde{k}$, then $\dot{k} < 0$, regardless
of the optimality condition, in order to keep the consumption positive.12

Figure 5.9. An Illustration of Production Function.

It is, in any case, very important to see that the "nonlinear" specification of $f$
creates an essential difference from the "linear" specification of $f$ where the
capital:output ratio is assumed to be constant. In the subsequent analysis, we shall
show that, under the nonlinear specification of $f$ (also of $F$), there exists a unique
optimal attainable path for the infinite horizon problem, which approaches the
modified golden rule path. The "modified golden rule path" will be defined later.
(It is equal to the golden rule path when the discount factor $\rho$ is equal to zero.)
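As a minimal numerical sketch of what (A-3) and (A-4) deliver, one may take the Cobb-Douglas form $f(k) = k^{\alpha}$ with $0 < \alpha < 1$, which satisfies both assumptions. The functional form, the parameter values, and the function names below are illustrative assumptions and do not come from the text. The code locates the positive root of $f(k) = \lambda k$ by bisection and compares it with the closed form available for this particular $f$.

```python
# Illustrative assumption: Cobb-Douglas technology f(k) = k**ALPHA, 0 < ALPHA < 1.
ALPHA = 0.3   # hypothetical capital elasticity
LAM = 0.05    # hypothetical value of lambda


def f(k):
    """Per capita production function (assumed form)."""
    return k ** ALPHA


def k_tilde_bisect(lo=1e-9, hi=1e6, iters=200):
    """Positive root of f(k) - LAM*k by bisection.

    The bracket is valid because f(k) - LAM*k is positive near 0
    (f'(0) is infinite) and negative for large k (f'(k) tends to 0).
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) - LAM * mid > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


k_tilde = k_tilde_bisect()
# Closed form for the assumed Cobb-Douglas case: k**(ALPHA-1) = LAM.
k_tilde_closed = (1.0 / LAM) ** (1.0 / (1.0 - ALPHA))
```

For any $k$ between 0 and the computed root, $f(k) - \lambda k$ is positive, which is the region in which positive consumption is compatible with a non-shrinking capital:labor ratio.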
To obtain this result, we need one more specification on the utility function
in addition to (A-1):
(A-5) $u(x) \to -\infty$ as $x \to 0$ with $x > 0$.
This assumption is due to Koopmans [22]. He explains that "this means a strong
incentive to avoid periods of very low consumption as much as is feasible" (p. 241).
If $x_t = 0$ for any (small) time interval, then by (A-5) the objective integral diverges
to $-\infty$. That is, (A-5) in essence guarantees an interior solution.
We are now ready to proceed with our analysis. As discussed in subsection a,
we first solve the problem with a finite horizon, and then examine the optimal
feasible path when $T$ extends without limit. Thus our first task is to maximize the
integral $J_T$ [equation (5)] subject to feasibility. This is a straightforward calculus
of variations problem, of which the Euler condition is already obtained [equation
(11)]. Now note that since $f$ is strictly concave in $k$ from (A-3) and $u$ is a strictly
concave function from (A-5), the integrand of $J_T$ is a strictly concave function in
$k$ and $\dot{k}$. Hence the Euler condition as given in (11) is sufficient (as well as
necessary) for a unique global maximum (Theorem 5.B.4). Ignoring the possibility
of a corner solution (which may arise due to the nonnegativity condition $k_t \ge 0$,
$x_t \ge 0$), the feasible Euler path is the one that satisfies equations (11) and (8)
simultaneously. Hence the time path of $k_t$ and $x_t$ can be analyzed simply in terms
of the following phase diagram, where we now confine our attention to the
nonnegative orthant of the $(k, x)$-plane in view of the nonnegativity constraint.
In Figure 5.10, $\hat{k}(\rho)$ is defined as the value of $k$ which satisfies the following
equation:
(56) $f'(k) = \lambda + \rho$
From (A-3) and (A-4), $\hat{k}(\rho)$ lies between 0 and $\tilde{k}$. Also, $\hat{x}(\rho)$ in Figure 5.10 is
defined by the following equation:

(57) $\hat{x}(\rho) = f(\hat{k}(\rho)) - \lambda \hat{k}(\rho)$
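Equations (56) and (57) can be evaluated directly once a functional form is chosen. The sketch below assumes the Cobb-Douglas form $f(k) = k^{\alpha}$ and hypothetical parameter values (neither is taken from the text); the function names are likewise illustrative. It computes the pair defined by (56) and (57), and checks that the capital level defined by (56) lies strictly between 0 and the positive root of $f(k) = \lambda k$.

```python
# Illustrative assumptions: f(k) = k**ALPHA with hypothetical parameters.
ALPHA = 0.3
LAM = 0.05


def f(k):
    return k ** ALPHA


def k_hat(rho):
    """Solve (56): f'(k) = ALPHA * k**(ALPHA - 1) = LAM + rho."""
    return (ALPHA / (LAM + rho)) ** (1.0 / (1.0 - ALPHA))


def x_hat(rho):
    """Evaluate (57): consumption on the path with capital k_hat(rho)."""
    k = k_hat(rho)
    return f(k) - LAM * k


# Positive root of f(k) = LAM * k for the assumed Cobb-Douglas case.
k_tilde = (1.0 / LAM) ** (1.0 / (1.0 - ALPHA))

rho = 0.03  # hypothetical positive discount factor
modified = (k_hat(rho), x_hat(rho))   # pair for rho > 0
golden = (k_hat(0.0), x_hat(0.0))     # pair for rho = 0
```

In this example the pair for $\rho = 0$ carries more capital and more consumption than the pair for $\rho > 0$, since $f(k) - \lambda k$ is maximized where $f'(k) = \lambda$.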

In Figure 5.10, the vertical line starting from $\hat{k}(\rho)$ represents the set of $(k, x)$
combinations which satisfy equation (11), so that $\dot{x} = 0$ along this line, $\dot{x} > 0$ to
the left of this line, and $\dot{x} < 0$ to the right of this line [which follows from (A-3)].
The mound-shaped curve in Figure 5.10 represents the set of $(k, x)$ combinations
which satisfy $x = f(k) - \lambda k$, so that $\dot{k} = 0$ along this curve, $\dot{k} < 0$ above the
curve, and $\dot{k} > 0$ below the curve. Hence we can obtain the arrows in Figure 5.10
and trace various paths of $(k_t, x_t)$ on the diagram. Therefore, given the initial $k_0$
and another boundary condition (say, $k_T$), we can completely describe the shape of
the time path of $(k_t, x_t)$ on Figure 5.10. It may be interesting to observe that all the
optimal paths of $(k_t, x_t)$ arch toward the path of $[\hat{k}(\rho), \hat{x}(\rho)]$ when $T$ is
sufficiently large.13 This phenomenon is the basis of the theorem which Samuelson
called the "consumption turnpike theorem" [36].
Figure 5.10. Phase Diagram for the Nonlinear Case. (The mound-shaped curve is
$x = f(k) - \lambda k$, or $\dot{k} = 0$.)

Let us now turn to the problem with an infinite time horizon. We are now
concerned with the problem of maximizing $J$ as defined in (6) subject to feasibility.
The problem can simply be analyzed by tracing the feasible Euler paths as
described in the above diagram. From the diagram it is clear that there are three
kinds of feasible Euler paths, $(k_t, x_t)$, namely,

(Type A) $k_t > \hat{k}(\rho)$ for all $t \ge \bar{t}$ (for some $\bar{t} > 0$).
(Type B) $k_t \to \hat{k}(\rho)$ and $x_t \to \hat{x}(\rho)$ as $t \to \infty$.
(Type C) $k_t < \hat{k}(\rho)$ for all $t \ge \bar{t}$ (for some $\bar{t} > 0$).

Along the type A path, $x_t < \hat{x}(\rho)$ as well as $k_t > \hat{k}(\rho)$ from some time on
(say after $t = t_0$). Then we can always improve on a given type A path by
consuming the capital stock (disinvesting) in some interval beginning at $t_0$ until $k_t$
diminishes to $\hat{k}(\rho)$, while we raise $x_t$ to $\hat{x}(\rho)$. After this, we maintain $\hat{x}(\rho)$ and
$\hat{k}(\rho)$, and we obtain a path superior to the given type A path. In other words, the
type A path cannot be optimal. Along the type C path, $\dot{k}_t < 0$ for all $t \ge \bar{t}$, so
that $k_t$ is decreasing over time, yet $x_t$ is nondecreasing over time, as can be seen
from the above phase diagram. Hence $k_t$ eventually goes to some negative value
after a sufficiently long passage of time.14 This violates our assumption of $k_t \ge 0$
for all $t \ge 0$. Hence neither the type A nor the type C path is eligible for the
infinite horizon problem. Note that when we consider the problem of $t \to \infty$
(hence also $T \to \infty$), we do not pre-specify the value of $k_T$.
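The three-way classification can be illustrated numerically. The sketch below is only a rough illustration under several assumptions that are not in the text: Cobb-Douglas technology $f(k) = k^{\alpha}$, logarithmic utility (so that the Euler condition is taken to read $\dot{x} = x[f'(k) - \lambda - \rho]$, a form consistent with (56) but assumed here), hypothetical parameter values, and a crude explicit Euler integration. Starting from a capital level below the one defined by (56), a path is labeled type C when capital is exhausted and type A when capital overshoots that level; a bisection on initial consumption then brackets the type B path.

```python
# Illustrative assumptions: f(k) = k**ALPHA, u(x) = log(x), hypothetical parameters.
ALPHA, LAM, RHO = 0.3, 0.05, 0.03
K_HAT = (ALPHA / (LAM + RHO)) ** (1.0 / (1.0 - ALPHA))  # solves f'(k) = LAM + RHO


def classify(k0, x0, dt=0.01, horizon=150.0):
    """Follow the path from (k0, x0), k0 < K_HAT, by explicit Euler steps.

    Returns "C" if capital is exhausted, "A" if k overshoots K_HAT,
    and "B" if neither happens within the (finite) horizon.
    """
    k, x = k0, x0
    for _ in range(int(horizon / dt)):
        if k <= 0.0:
            return "C"
        if k > K_HAT:
            return "A"
        k_dot = k ** ALPHA - LAM * k - x                      # capital accumulation
        x_dot = x * (ALPHA * k ** (ALPHA - 1.0) - LAM - RHO)  # assumed Euler condition
        k += dt * k_dot
        x += dt * x_dot
    return "B"


def saddle_x0(k0, iters=40):
    """Bisect on initial consumption between a type A and a type C start."""
    lo, hi = 1e-6, k0 ** ALPHA   # too little / too much initial consumption
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        kind = classify(k0, mid)
        if kind == "A":
            lo = mid
        elif kind == "C":
            hi = mid
        else:
            return mid
    return 0.5 * (lo + hi)


x0_star = saddle_x0(1.0)
```

The value returned by `saddle_x0` is only an approximation to the initial consumption of the type B path, limited by the step size and the finite horizon.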
What about the type B path? If $\rho$ is positive, then the integral $J$ defined in (6)
along the type B path is clearly convergent, so that we obtain a unique eligible
path which is feasible and satisfies Euler's condition, for any positive initial $k_0$.15
The path approaches $[\hat{k}(\rho), \hat{x}(\rho)]$ monotonically as time extends without
limit.16 If $\rho$ is zero, then the integral $J$ defined in (6) along the type B path is not
convergent. However, the problem of divergence in this case can be avoided if we
redefine the target function as follows:

(7) $J_R \equiv \int_0^{\infty} [u(x_t) - u(\hat{x})]\,dt$, where $\hat{x} \equiv \hat{x}(0)$

Along the type B path, we can show that the integral $J_R$ is convergent; hence the
feasible Euler path of type B is eligible for the infinite horizon problem under
this new target function $J_R$.17 This Ramseyian device is also used by Koopmans
[22]. We now obtain the following theorem.

Theorem 5.D.1 (Ramsey-Koopmans-Cass): Under assumptions (A-1), (A-3),
(A-4), and (A-5), we have
(i) $\rho > 0$: Given an arbitrary initial value of $k$, an optimal feasible and eligible path
is unique and it converges to the path $[\hat{k}(\rho), \hat{x}(\rho)]$ monotonically. The optimality
is defined in the sense of maximizing the integral $J$, and this integral is
convergent for this optimal path.
(ii) $\rho = 0$: Given an arbitrary initial value of $k$, an optimal feasible and eligible path
is unique, and it converges to the path $[\hat{k}(0), \hat{x}(0)]$ monotonically. The optimality
is defined in the sense of maximizing the integral $J_R$, and this integral is convergent
for this optimal path.
REMARK: If $k_0 = \hat{k}(\rho)$, then the optimal feasible path is simply the path of
$[\hat{k}(\rho), \hat{x}(\rho)]$ for all $t \ge 0$. The target is the integral $J$ when $\rho > 0$ and $J_R$ when
$\rho = 0$.
REMARK: The path of $[\hat{k}(\rho), \hat{x}(\rho)]$ is the familiar "golden rule path" à la
Phelps, Robinson, and so on, when $\rho = 0$. We can, in general, call the path of
$[\hat{k}(\rho), \hat{x}(\rho)]$ with $\rho > 0$ the modified golden rule path.
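Footnote 15 observes that the limiting point of the theorem is a saddle point of the dynamic system, which can be confirmed by linearization. A minimal numerical check follows, under assumptions that are not in the text: $f(k) = k^{\alpha}$, logarithmic utility (so that the Euler condition is taken to be $\dot{x} = x[f'(k) - \lambda - \rho]$), and hypothetical parameter values. With $\dot{k} = f(k) - \lambda k - x$, the coefficient matrix at the stationary point has trace $\rho$ and determinant $\hat{x} f''(\hat{k}) < 0$, so its eigenvalues are real and of opposite signs.

```python
import math

# Illustrative assumptions: f(k) = k**ALPHA, u(x) = log(x), hypothetical parameters.
ALPHA, LAM, RHO = 0.3, 0.05, 0.03

k_hat = (ALPHA / (LAM + RHO)) ** (1.0 / (1.0 - ALPHA))   # f'(k) = LAM + RHO
x_hat = k_hat ** ALPHA - LAM * k_hat                      # f(k) - LAM * k

f1 = ALPHA * k_hat ** (ALPHA - 1.0)                       # f'(k_hat)
f2 = ALPHA * (ALPHA - 1.0) * k_hat ** (ALPHA - 2.0)       # f''(k_hat) < 0

# Jacobian of (k_dot, x_dot) with respect to (k, x) at the stationary point:
#   [ f'(k) - LAM          , -1                  ]
#   [ x * f''(k)           , f'(k) - LAM - RHO   ]
j11, j12 = f1 - LAM, -1.0
j21, j22 = x_hat * f2, f1 - LAM - RHO

trace = j11 + j22
det = j11 * j22 - j12 * j21
disc = trace ** 2 - 4.0 * det          # positive whenever det < 0

eig_stable = 0.5 * (trace - math.sqrt(disc))
eig_unstable = 0.5 * (trace + math.sqrt(disc))
```

The negative eigenvalue governs the speed at which the type B path approaches the stationary point in this linear approximation.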
The importance of the above theorem may be emphasized. It gives a
completely new significance to the golden rule path. As discussed in Section C,
the concept of the golden rule path can be severely criticized on the grounds
that it neglects the historically given stocks of capital and labor, and that its
choice set is restricted to the golden age paths. This means that even if the
historically given value of the capital:labor ratio happens to be on the golden
rule path, it only maximizes per capita consumption in the choice set which is
limited to the set of the golden age paths. Theorem 5.D.1 gives an answer to
both of these criticisms. In other words, it says that the path which maximizes the
"Ramsey sum" of utility over the infinite horizon (that is, $J_R$) converges to the
golden rule path regardless of the initial value of $k_0$, as long as it satisfies the
eligibility conditions. Here the choice set is not limited to the golden age paths,
so that $k_t$ can fluctuate over time (in fact, along this optimal attainable eligible
path, $k_t$ approaches $\hat{k}(\rho)$ monotonically; hence, in general, it is not constant). If
we have a positive discount factor ($\rho > 0$), the theorem says that the optimal
attainable eligible path converges monotonically to the "modified golden rule
path."

Figure 5.11. An Illustration of the Case with a Negative Discount Factor. (Labels
in the figure: $f'(k) = \lambda + \rho$ and $x = f(k) - \lambda k$.)

Koopmans [22] observed that the above theorem is, in general, no longer
true when the time discount factor $\rho$ is negative. The wicked character of the
optimal feasible path when $\rho < 0$ may be illustrated by the phase diagram of
Figure 5.11. In this case, there exists no feasible Euler path which is eligible for
the infinite horizon problem. Note also that if $\hat{k}(\rho)$ takes a value between $\hat{k}(0)$
and $\tilde{k}$, then, using a similar phase diagram, a proposition analogous to the previous
theorem should follow even if $\rho < 0$, since the intersection of the $\dot{k} = 0$ curve and
the $\dot{x} = 0$ line occurs for a positive value of $k$ (and $x$).

FOOTNOTES

1. This section was first presented by the author as a lecture at the University of
Minnesota in the spring of 1966. See Takayama [46]. For a recent survey of
the same problem, see Koopmans [24], for example. In the first reading of this
section, the reader may skip reading subsection b.
2. This point is discussed by Chakravarty [8].
3. There exists an extensive literature on this topic including recent textbook
expositions. The earlier contributions on this problem in addition to [35], [22], and [6],
include: Tinbergen [47] and [48], Goodwin [15], Black [4], Chakravarty [8]
and [9], Dasgupta [12], Horvat [18] and [19], Leontief [27], Meade [31],
Samuelson [36], Sen [38] and [39], Stone [45] and von Weizsäcker [52].
(Discussion on "investment criteria" in the 1950s by Sen, Eckstein, and others, especially
in the Quarterly Journal of Economics, also belongs to this category of problem.)
An extension to the multisector model has been attempted since the pioneering work
by Samuelson and Solow [37]. More recent turnpike theorems obviously belong
to this category. We take up this topic later (Chapter 7, Section A). See also a further
extension of this multisector growth model by Gale [14]. We also discuss this
later (Chapter 7, Section B). The extension to the two-sector optimization model is
attempted by Kurz [26], Srinivasan [43], Stoleru [44], Johansen [20], Uzawa
[50] and [51], Atsumi [3], and so on. See also J. Z. Drabicki and A. Takayama,
"On the Optimal Growth of the Two Sector Economy," Krannert Institute Paper,
No. 383, January 1973.
4. We implicitly assume that the economy can "eat up" the existing stock of capital:
that is, the economy can increase the amount of consumption by reducing the existing
stock of capital. Cass [6], and Arrow and Kurz [2] considered the optimal growth
problem by explicitly banning this possibility.
5. It is often referred to in connection with the "Harrod-Domar model." This
representation of a production function implicitly assumes that labor is not scarce.
Harrod's and Domar's original models are more sophisticated than the one with such
an assumption.
6. This also implies either that Y is defined as "net" (rather than gross) national
product or that the capital good is assumed to last forever.
7. See [11], [29], and [30].
8. Note that $v = -(x_t - \bar{x})u''/u'$, which signifies the elasticity of marginal utility. The
crux of such a specification of the utility function is the constancy of this elasticity.
9. Goodwin's framework is a little more complicated. He asks the question of
"transforming an economy characterized by the Old, low productivity technique into
one consisting entirely of the New, high productivity technique." And $T$ is determined
as the year of the completion of this transformation. However, so far as the
mathematical structure is concerned, his model is essentially the same as that of
Tinbergen and Chakravarty. Hence we treat these models together.
10. For the infinite horizon problem, the objective integral is the $J$ of (6) when $\rho > 0$ and
it is the $J_R$ of (7) when $\rho = 0$. Both $J$ and $J_R$ yield the same Euler's equation.
11. Such a condition would be a much sharper result than Brock's result (Theorem 3 in
[5]). However, this sharpness is obtained at quite a high price, that is, the
specification of the production function as a constant capital:output ratio.
12. Hence, if $k_0 > \tilde{k}$, $k_t < k_0$ for all $t > 0$, so that $f(k_t) < f(k_0)$ for all $t > 0$ for any
attainable path. Therefore $x_t$, hence also $u(x_t)$, is bounded from above for all $t$ if
$x_t$ is to come only from the current output. That is, $x_t \le f(k_0)$ for all $t \ge 0$. Also
if $k_0 \le \tilde{k}$, we can show that, for any attainable path, $k_t \le \tilde{k}$ for all $t \ge 0$ (by
setting $x_t = 0$ for all $t$, the path of pure accumulation). Hence again, $x_t$ and $u(x_t)$
are bounded from above for all $t \ge 0$. Therefore, $u(x_t)$ is bounded from above
regardless of whether $k_0 > \tilde{k}$ or $k_0 \le \tilde{k}$, so that the convergence of the integral
$J$ is guaranteed for any attainable path whose value of $J$ does not diverge to $-\infty$,
provided that $\rho > 0$. In other words, we can solve the convergence problem discussed
earlier without setting an upper bound on $u$, if $x_t$ is to come only from the current
output. For the other case, that is, when the capital can be "eaten up," the proof
of convergence is slightly more complicated; yet it can be handled analogously as
above by noting that $x_t$ cannot go up for an indefinite period of time.
13. See Samuelson [36], p. 490. A linear approximation of our system will yield a
catenary solution. We may note that our formulation completely avoids such a linear
approximation.
14. From (11) and (A-4), $x_t \to \infty$ as $k_t \to 0$ along the type C path. Hence from (8) there
exists a constant $\delta > 0$ such that $\dot{k}_t < -\delta$ from some time on (because $f$ is bounded
from above for all $k$). This shows that $k_t$ will eventually become negative. On the other
hand, if the capital cannot be "eaten up," then $x_t$ is bounded by the current output,
that is, $x_t \le f(k_t)$. In Figure 5.10, the reader can easily draw the picture of this bound
(called the boundary curve). Then along the type C path, $x_t$ will hit the boundary
curve, where $\dot{k}_t = -\lambda k_t$, and $x_t$ and $k_t$ will approach the origin along the boundary
curve. In this case $k_t$ will never become negative and the above eligibility test fails.
15. In the theory of differential equations, the point $[\hat{k}(\rho), \hat{x}(\rho)]$ is a saddle point, and
the type B path consists of its stable branches. This can be confirmed by linearizing the
dynamic equations (8) and (11) around $[\hat{k}(\rho), \hat{x}(\rho)]$, and showing that the
eigenvalues of the coefficient matrix are real and of opposite signs. See any standard
textbook on differential equations.
16. Therefore, we have shown that the feasible Euler path which is eligible must be
the type B path. The converse remains to be shown: that is, whether the feasible Euler
path which satisfies the end-point conditions $k_0$ and $\hat{k}(\rho)$ is indeed optimal compared
to any attainable path starting from $k_0$. But this can be done easily by using a
method analogous to the proof of Theorem 5.B.4 together with the end-point
conditions $k_0$ and $\hat{k}(\rho)$. The improper integrals which appear in the above proof
are bounded from above in view of footnote 12. Since any attainable path with
$x_t = 0$ for a certain interval of time cannot be optimal in view of (A-5), we can
delete such paths from our consideration. Hence the improper integrals which appear
in the proof are bounded from below also, and thus they are well-defined. Note also
that the uniqueness of the optimal path is provided for by the strict concavity of
the function $u$.
17. Our Euler condition (11), under this target function, can be transformed as
$u'(x_t)\dot{k}_t = u(\hat{x}) - u(x_t)$. This is the Keynes-Ramsey rule which says that "the net increase in
capital per worker multiplied by the marginal utility of consumption per worker at
any time equals the excess of the maximum sustainable utility level over the current
utility level." See Koopmans [22], p. 243, and also pp. 272-273. As we will show
in the next section, his equation (28) corresponds to our equation (11). As Koopmans
has shown, the time necessary to reach the golden rule path is infinity.

REFERENCES

1. Arrow, K. J., "Applications of Control Theory to Economic Growth," in Mathematics
of the Decision Sciences, Pt. 2, ed. by G. B. Dantzig and A. F. Veinott, Providence,
R. I., American Mathematical Society, 1968.
2. Arrow, K. J., and Kurz, M., "Optimal Growth with Irreversible Investment in a
Ramsey Model," Econometrica, 38, March 1970.
3. Atsumi, H., "Neoclassical Growth and the Efficient Program of Capital Accumula-
tion," Review of Economic Studies, XXXII, April 1965.
4. Black, J., "Optimum Savings Reconsidered, or Ramsey Without Tears," Economic
Journal, LXXII, June 1962.
5. Brock, W. A., "Sensitivity of Optimal Growth Paths with Respect to a Change in
Target Stocks," Zeitschrift für Nationalökonomie, Supp. 1, 1971 (originally presented
at the Purdue Meeting of the Kansas-Missouri Seminar on Quantitative Economics,
October 1969).
6. Cass, D., "Optimum Growth in an Aggregative Model of Capital Accumulation,"
Review of Economic Studies, XXXII, July 1965.
7. Cass, D., "Optimum Growth in an Aggregative Model of Capital Accumulation: A
Turnpike Theorem," Econometrica, 34, October 1966.
8. Chakravarty, S., "The Existence of an Optimum Savings Program," Econometrica,
30, January 1962.
9. Chakravarty, S., "Optimal Savings with Finite Horizon," International Economic
Review, 3, September 1962.
10. Chakravarty, S., "Optimal Investment and Technical Progress," Review of Economic
Studies, XXXI, June 1964.
11. Chakravarty, S., "Optimal Savings with Finite Horizon: A Reply," International
Economic Review, 7, January 1966.

12. Dasgupta, A., "A Note on Optimum Savings," Econometrica, 32, July 1964.
13. Farrell, M. J., and Hahn, F. H., ed., Infinite Programmes in Economics, Edinburgh,
Oliver & Boyd, 1967 (Review of Economic Studies, January 1967 issue).
14. Gale, D., "On Optimal Development in a Multi-Sector Economy," Review of
Economic Studies, XXXIV, January 1967.
15. Goodwin, R. M., "The Optimal Growth Path for an Underdeveloped Economy,"
Economic Journal, LXXI, December 1961.
16. Harrod, R. F., "Second Essay in Dynamic Theory," Economic Journal, LXX, June
1960.
17. Hicks, J. R., Capital and Growth, Oxford, Clarendon Press, 1965.
18. Horvat, B., "The Optimum Rate of Saving: A Note," Economic Journal, LXVII,
March 1958.
19. Horvat, B., "The Optimum Rate of Investment," Economic Journal, LXVIII, December
1958.
20. Johansen, L., "Saving and Growth in Long-Term Programming Models," in Econo-
metric Analysis for National Economic Planning, ed. by Hart, P. E., Mills, G., and
Whitaker, J. K., London, Butterworth, 1964.
21. Koopmans, T. C., "Stationary Ordinal Utility and Impatience," Econometrica, 28,
April 1960.
22. Koopmans, T. C., "On the Concept of Optimal Economic Growth," in The Econometric
Approach to Development Planning, Pontificiae Academiae Scientiarvm Scriptvm
Varia, Amsterdam, North-Holland, 1965 (also "Discussion," pp. 289-300).
23. Koopmans, T. C., "On Flexibility of Future Preferences," in Human Judgement and
Optimality, ed. by Bryan and Shelly, New York, Wiley, 1966.
24. Koopmans, T. C., "Objectives, Constraints and Outcomes in Optimal Growth Models,"
Econometrica, 35, January 1967.
25. Koopmans, T. C., Diamond, R. A., and Williamson, R. E., "Stationary Utility
and Time Perspective," Econometrica, 32, January-April 1964.
26. Kurz, M., "Optimal Paths of Capital Accumulation under Minimum Time
Objective," Econometrica, 33, January 1965.
27. Leontief, W., "Theoretical Note on Time Preference, Productivity of Capital,
Stagnation, and Economic Growth," American Economic Review, XLVIII, March
1958.
28. Leontief, W., "Time Preference and Economic Growth: A Reply," American Economic
Review, XLIX, December 1959.
29. Maneschi, A., "Optimal Savings with Finite Planning Horizon: A Note,"
International Economic Review, 7, January 1966.
30. Maneschi, A., "Optimal Savings with Finite Planning Horizon: A Rejoinder,"
International Economic Review, 7, January 1966.
31. Meade, J. E., Trade and Welfare: Mathematical Supplement, London, Oxford Uni-
versity Press, 1955.
32. Mirrlees, J., "Optimal Growth When Technology is Changing," Review of Economic
Studies, XXXIV, January 1967.
33. Phelps, E., "The Ramsey Problem and the Golden Rule of Accumulation," in Phelps,
Golden Rules of Economic Growth, New York, W. W. Norton, 1966.
34. Pontryagin, L. S., Boltyanskii, V. G., Gamkrelidze, R. V., and Mishchenko, E. R.,
The Mathematical Theory of Optimal Processes, New York, Interscience, 1962, (tr. by
Trirogoff and Neustadt from Russian).
35. Ramsey, F. P., "A Mathematical Theory of Saving," Economic Journal, XXXVIII,
December 1928.
36. Samuelson, P. A., "A Catenary Turnpike Involving Consumption and the Golden
Rule," American Economic Review, LV, June 1965.
37. Samuelson, P. A., and Solow, R. M., "A Complete Capital Model Involving Hetero-
geneous Capital Goods," Quarterly Journal of Economics, LXX, November 1956.
38. Sen, A. K., "A Note on Tinbergen on the Optimum Rate on Saving," Economic
Journal, LXVII, December 1957.
39. Sen, A. K., "On Optimising the Rate of Saving," Economic Journal, LXXI, September
1961.
40. Shell, K., "Applications of Pontryagin's Maximum Principle to Economics," in
Mathematical Systems, Theory and Economics, ed. by H. W. Kuhn and G. P. Szego,
Berlin, Springer-Verlag, 1969.
41. Solow, R. M., "A Contribution to the Theory of Economic Growth," Quarterly
Journal of Economics, LXX, February 1956.
42. Srinivasan, T. N., "Investment Criteria and Choice of Techniques of Production,"
Yale Economic Essays, 2, Spring 1962.
43. Srinivasan, T. N., "Optimal Savings in a Two-Sector Model of Growth," Econometrica,
32, July 1964.
44. Stoleru, L. G., "An Optimal Policy for Economic Growth," Econometrica, 33, April
1965.
45. Stone, R., "Misery and Bliss: A Comparison of the Effect of Certain Forms of Savings
Behaviour on the Standard of Living of a Growing Community," Economia
Internazionale, VIII, Febbraio 1955.
46. Takayama, A., "On the Structure of the Optimal Growth Problem," Krannert
Institute Paper, Purdue University, No. 178, June 1967.
47. Tinbergen, J., "The Optimum Rate of Saving," Economic Journal, LXVI, December
1956.
48. Tinbergen, J., "Optimum Savings and Utility Maximization over Time," Econometrica,
28, April 1960.
49. Tobin, J., "Economic Growth as an Objective of Government Policy," American
Economic Review, LIV, May 1964.
50. Uzawa, H., "Optimal Growth in a Two-Sector Model of Capital Accumulation,"
Review of Economic Studies, XXXI, January 1964.
51. Uzawa, H., "Optimal Technical Change in an Aggregative Model of Economic Growth,"
International Economic Review, 5, January 1965.
52. von Weizsäcker, C. C., "Existence of Optimal Programs of Accumulation for an
Infinite Time Horizon," Review of Economic Studies, XXXII, April 1965.
53. Westfield, F. M., "Time-Preference and Economic Growth: Comment," American
Economic Review, XLIX, December 1959.
54. Yaari, M. E., "On the Existence of an Optimal Plan in Continuous-time Allocation
Process," Econometrica, 32, October 1964.

Appendix to Section D: A Discrete Time Model of One-Sector Optimal Growth and
Sensitivity Analysis

a. INTRODUCTION
In Section D, we have assumed that time $t$ is a continuum or, more specifically,
that it is represented by real numbers. The purpose of this section is to construct
a one-sector optimal growth model when time $t$ is not a continuum but
discrete, that is, when it is represented by integers. Such an analysis in economics
is known as "period analysis" and it is used in many fields of economics other than
growth theory, such as the stability theory of competitive markets, business cycle
theory, macro theory, and so on. In period analysis, difference equations rather
than differential equations often become the main tool of analysis. In many cases,
differential equations are known to be the better tool for use by theoreticians,
because there are many more readily available theorems in the (mathematical)
theory of differential equations than in the theory of difference equations. How-
ever, in some cases, difference equations are an equally good or even better tool of
analysis. In fact, the present topic of optimal growth may provide such an example.
In any case, this topic enables us to compare the two optimization techniques, non-
linear programming and the calculus of variations. In many cases, topics which
can be analyzed by the calculus of variations can also be analyzed by the usual non-
linear programming technique. We will use period analysis in multisector growth
models (Chapters 6 and 7). In Chapter 6, Section B, the use of the difference equa-
tion technique is illustrated, and in Chapter 7, Section B, the use of the nonlinear
programming technique is illustrated. The present analysis may serve as a bridge to
these later models. We may also note that these later models can be formulated in
terms of differential equations and/or the calculus of variations. The rationale for
the use of period analysis does not particularly lie in the tools of analysis that one
employs. An important merit of period analysis is that this mode of analysis is often
very useful in making explicit the crucial roles of "periods" in certain economic
occurrences. For example, it is often noted that consumption may depend on in-
come of the previous "period." As is well known, the recognition of this pheno-
menon is an important starting point of modern business cycle theory. In this case,
one "period" is defined as the length of time in which a consumer's reaction is
delayed. Similarly, we can consider many cases in which "periods" may play im-
portant roles in economic analysis, such as the "gestation period" of investment,
the "duration period" of fixed capital, and the adjustment lag of the labor market
compared with other markets (say, the money market).
It should be noted that the point made in the previous paragraph has no
direct relevance to the fact that the time element in human economic activities is
often discrete in the sense that many offices open only during daytime, some
markets open only once a week, and so on. It is certainly possible to define "period"
by one day or one week depending upon such "realistic" considerations. But un-
less there are certain crucial economic meanings attached to such calendar
periods, one can often use differential equations (instead of difference equations)
by supposing that a day, week, or year shrinks to a point of time, and still obtain
meaningful results.1
The analysis of this Appendix does not provide an example of a case in which
the definition of a period is of crucial significance to the conclusion. The analysis
turns out to be merely a discrete analogue of the continuous time analysis of
Section D. We obtain results essentially similar to those obtained there. We should,
however, stress that one unit of a "period" is not arbitrarily defined. It is explicitly
defined as a unit of a production period. Such an explicit recognition of the meaning
of a "period" is very important, once one has decided to carry out his analysis
in terms of period analysis. In the literature, such a consideration is often missing,
so that there are several different period analyses for presumably the same prob-
lem, each obtaining a different conclusion.
In this Appendix, we obtain most of the results of the one-sector optimal
growth model in Section D, such as the optimality condition and the convergence
of the optimal attainable path to the modified golden rule path. However, our
emphasis here is on the following:

(i) A rigorous formulation of the discrete time model for the present topic.
(ii) The illustration of the use of nonlinear programming for the present topic.
(iii) The derivation of some important additional results, in particular the existence
and uniqueness of the optimal attainable path and Brock's theorem on
sensitivities [2].

The existence theorem is not a particularly easy topic when we use the cal-
culus of variations and differential equations. However, when we use nonlinear
programming, the simple Weierstrass theorem is often sufficient for this purpose.
We have already illustrated sensitivity analysis in Section D for the case of a
constant capital:output ratio. We will now record general results with the proofs.

b. MODEL
We define the notations $L_t$, $K_t$, $X_t$, $I_t$, and so on, as we have done in the two
previous sections, except that $t$ now refers to period $t$. The labor supply equation is
now written as

(1) $L_t = (1 + n)^t L_0$, or $L_{t+1}/L_t = 1 + n$, where $n \ge 0$

We write the basic equilibrium relation in the output market as follows:

(2) $X_{t+1} + I_{t+1} = F(L_t, K_t)$
The basic assumption involved here is that the unit period is chosen to be the
"production period." That is, it is assumed that production is not instantaneous but
takes a certain period of time, and that period is chosen to be a one unit time
period. It is assumed that the stock of capital does not depreciate within the period
but depreciates suddenly at the end of each period at the rate $\mu$, $0 \le \mu \le 1$ (when
the production of each period is completed). Hence we can write the production
function as $F(L_t, K_t)$; that is, the value $K_t$ is unchanged during the entire $t$th
period, and at the end of the $t$th period, the capital stock inherited from the
previous period suddenly declines from $K_t$ to $(K_t - \mu K_t)$.
However, at the beginning of the $(t + 1)$th period, a part of the output
produced during the $t$th period is now available for increasing the capital stock. It
is assumed that if $I_{t+1}$ is the amount to be invested in the $(t + 1)$th period, then the
entire amount of $I_{t+1}$ is invested at the beginning of the $(t + 1)$th period. Hence
$K_{t+1}$, the stock of capital available for production in the $(t + 1)$th period, is written
as

$K_{t+1} = (K_t - \mu K_t) + I_{t+1}$
or
$I_{t+1} = K_{t+1} - (K_t - \mu K_t)$

Combining this with (2), we obtain
(3) $X_{t+1} + [K_{t+1} - (K_t - \mu K_t)] = F(L_t, K_t)$
or

$X_{t+1} + (K_{t+1} - K_t) = F(L_t, K_t) - \mu K_t$

Consumption, unlike investment, does not have to take place all at once at the
beginning of the period. That is, the amount $X_{t+1}$ is consumed during the entire
$(t + 1)$th period.
In the literature, there does not seem to be a consensus on the form of the
basic output equilibrium relation such as (3), when time is discrete. For example,
Samuelson ([7], p. 273) writes the corresponding equation as
(4) $X_t + (K_{t+1} - K_t) = F(L_t, K_t) - \mu K_t$
Notice that $X_{t+1}$ in (3) is replaced by $X_t$ in (4). One interpretation of (4) is that
production takes place instantaneously, unlike our assumption concerning
production. We proceed with our analysis using (3).2
Assume again that $F$ is homogeneous of degree one, and write

(5) $F(L_t, K_t) = L_t f(k_t)$, where $k_t \equiv K_t / L_t$

Dividing (3) by $L_{t+1}$ and using (1) and (5), we obtain

(6) $x_{t+1} + k_{t+1} - \dfrac{k_t}{1+n} = \dfrac{f(k_t)}{1+n} - \dfrac{\mu k_t}{1+n}$
where $x_{t+1} \equiv X_{t+1}/L_{t+1}$ and $k_{t+1} \equiv K_{t+1}/L_{t+1}$

We can rewrite (6) as

(7) $x_{t+1} + k_{t+1} = g(k_t)$
where

(8) $g(k_t) \equiv \dfrac{1}{1+n}\,[f(k_t) + (k_t - \mu k_t)]$

We impose the following assumptions on $f$.
(A-1) $f(0) = 0$, $0 < f'(k_t) < \infty$ and $f''(k_t) \le 0$ for all $k_t < \infty$.
For the meaning of these assumptions, the reader is referred to Section C. Note
that $f''(k_t) = 0$ (for all $k_t$) corresponds to the case in which the capital:output
ratio is a constant. Note also that (A-1) implies the following:
(A-1') $g(0) = 0$, $0 < g'(k_t) < \infty$ and $g''(k_t) \le 0$ for all $k_t < \infty$.
This, among other things, implies that the function $g$ is concave.
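The passage from the properties of $f$ to those of $g$ can be checked on a concrete example. The sketch below assumes $f(k) = \ln(1 + k)$, which has $f(0) = 0$, a finite positive derivative everywhere, and a negative second derivative; this functional form, the parameter values, and the function names are illustrative assumptions and do not come from the text. Differentiating (8) gives $g'(k) = [f'(k) + 1 - \mu]/(1 + n)$ and $g''(k) = f''(k)/(1 + n)$, which the code evaluates on a grid.

```python
import math

# Illustrative assumptions: f(k) = log(1 + k) and hypothetical parameters.
MU = 0.1   # depreciation rate
N = 0.02   # labor growth rate


def f(k):
    return math.log1p(k)


def f_prime(k):
    return 1.0 / (1.0 + k)


def f_second(k):
    return -1.0 / (1.0 + k) ** 2


def g(k):
    """Equation (8): g(k) = [f(k) + (k - MU*k)] / (1 + N)."""
    return (f(k) + (1.0 - MU) * k) / (1.0 + N)


def g_prime(k):
    return (f_prime(k) + 1.0 - MU) / (1.0 + N)


def g_second(k):
    return f_second(k) / (1.0 + N)


grid = [0.5 * i for i in range(0, 101)]          # k from 0 to 50
slopes = [g_prime(k) for k in grid]
curvatures = [g_second(k) for k in grid]

# Cross-check the analytic derivative against a central difference at k = 2.
h = 1e-6
numeric_slope = (g(2.0 + h) - g(2.0 - h)) / (2.0 * h)
```

In this example the curvature is strictly negative, so it illustrates the strictly concave case rather than the constant capital:output ratio case.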
We assume that the economy is endowed with the stock of a good whose
per capita amount is equal to a. We also assume that the economy is required to
bequeath a stock of that good to the amount of b per capita at the end of the Tth
period. Thus we have the following conditions:
(9) x0 + k0 = a
and
(10) kT= b
If $a = 0$, then $k_0 = 0$ as well as $x_0 = 0$, which in turn implies that $k_t = 0$ and $x_t = 0$
for all $t$ in view of $g(0) = 0$ and (7).³ In order to avoid this uninteresting case, we
assume $a > 0$ and that
(A-2) (a) There exists a unique $\bar{k}$, $0 < \bar{k} < \infty$, such that $g(\bar{k}) - \bar{k} = 0$, or
(b) $g'' = 0$ for all $k_t > 0$ (and $\bar{k} = \infty$).
In terms of $f$, this can be expressed as⁴
(A-2') (a) There exists a unique $\bar{k}$, $0 < \bar{k} < \infty$, such that $f(\bar{k}) = \lambda\bar{k}$, where
$\lambda \equiv \mu + n$, or
(b) $f'' = 0$ for all $k_t > 0$ (and $\bar{k} = \infty$).⁵
Recall that an assumption similar to part (a) of (A-2') was imposed in the Cass-
Koopmans model which we discussed in Section D.
Now consider the problem of finding a path such that $k_t = k > 0$ and
$x_t = x > 0$, for all $t = 0, 1, \ldots, T$ ($k$, $x$ are constants). That is, we ask whether
there exists a nonzero balanced growth path.⁶ This problem is reduced to one of
finding $k > 0$, $x > 0$, such that
(11) $k + x = a$ and $x = g(k) - k$
It is clear from Figure 5.12 that such a path exists uniquely, if $a < \bar{k}$. We call such a
path the balanced growth (or the golden age) path with respect to $a$, and we denote
it by $\{k^*(a), x^*(a)\}$. Note that $\bar{k} > k^*(a) > 0$ and $x^*(a) > 0$. We henceforth assume
(A-3) (a) $a < \bar{k}$, when $g'' < 0$ for all $k_t > 0$, or
(b) $g(k_t) - k_t > 0$ for all $k_t > 0$, when $g'' = 0$ for all $k_t > 0$.
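The two conditions in (11) combine into the single equation g(k) = a, so the balanced growth path can be computed by a one-dimensional root search. The sketch below does this by bisection; the technology f(k) = A log(1 + k), the parameter values, and the helper name `bisect` are illustrative assumptions, not part of the text.

```python
import math

A, MU, N = 1.0, 0.05, 0.02   # hypothetical parameter values

def g(k):
    """Equation (8) for the assumed f(k) = A*log(1 + k)."""
    return (A * math.log1p(k) + (1.0 - MU) * k) / (1.0 + N)

def bisect(func, lo, hi, tol=1e-12):
    """Root of func on [lo, hi]; func(lo) and func(hi) must differ in sign."""
    f_lo = func(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# k_bar of (A-2)(a): the positive solution of g(k) = k.
K_BAR = bisect(lambda k: g(k) - k, 1e-6, 1e3)

def balanced_growth(a):
    """Return (k*(a), x*(a)) solving k + x = a and x = g(k) - k, for 0 < a < K_BAR."""
    k_star = bisect(lambda k: g(k) - a, 0.0, a)   # g(0) < a < g(a) when a < K_BAR
    return k_star, a - k_star
```

For example, `balanced_growth(2.0)` returns a pair with 0 < k*(a) < K_BAR and x*(a) > 0, in line with the inequalities noted after (11).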
It is important to note that this consideration implies that the economy is
capable of growing with strictly positive values of $X_t$ and $K_t$ (or $x_t$ and $k_t$),
as long as the initial condition satisfies (A-3), for we can then choose $x_t = x^*(a)$
and $k_t = k^*(a)$.

Figure 5.12. The Existence of the Balanced Growth Path. [Curve shown: $x = g(k) - k$.]

On the other hand, we may consider the path of pure accumulation
or the path of subsistence with respect to $a$, which is defined by
(12) $k_0 = a$, $k_{t+1} = g(k_t)$, $t = 0, 1, \ldots, T - 1$,
$x_t = 0$, $t = 0, 1, \ldots, T$
The path of pure accumulation is illustrated in Figure 5.13.
When $a > \bar{k}$ and $f'' < 0$ [so that (A-3) is violated], then $k_t$ monotonically
declines to $\bar{k}$ in the path defined by (12), as $t$ increases. This is an uninteresting case
and may be considered as another justification of (A-3).⁷

Figure 5.13. The Path of Pure Accumulation.
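The path of pure accumulation in (12) is a simple forward iteration of g. A minimal sketch, again assuming the illustrative technology f(k) = A log(1 + k) and hypothetical parameter values:

```python
import math

A, MU, N = 1.0, 0.05, 0.02   # hypothetical parameter values

def g(k):
    """Equation (8) for the assumed f(k) = A*log(1 + k)."""
    return (A * math.log1p(k) + (1.0 - MU) * k) / (1.0 + N)

def pure_accumulation(a, T):
    """Path (12): k_0 = a, k_{t+1} = g(k_t), with x_t = 0 throughout."""
    k = [a]
    for _ in range(T):
        k.append(g(k[-1]))
    return k

rising = pure_accumulation(2.0, 20)     # a below k_bar: k_t increases
falling = pure_accumulation(80.0, 20)   # a above k_bar: k_t decreases
```

With these assumed values the fixed point of g lies between 58 and 59, so the first path rises toward it and the second declines toward it, which is the behaviour described for a < k̄ and a > k̄ respectively.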


c. THE OPTIMAL ATTAINABLE PATHS

Consider now the following $T$-period optimization problem:
Maximize over $(k_t, x_t)$: $U \equiv \sum_{t=0}^{T} u(x_t)(1+\rho)^{-t}$
Subject to: $x_t + k_t = g(k_{t-1})$, $t = 1, 2, \ldots, T$, $x_0 + k_0 = a$, $k_T = b$
and $x_t \ge 0$, $k_t \ge 0$, $t = 0, 1, \ldots, T$
Here $\rho > 0$ is the discount rate and $u$ is the utility function with the following
assumption:
(A-4) $u(0) = -\infty$, $u'(x_t) > 0$, and $u''(x_t) < 0$ for all $x_t$.
The solution to the above problem is called an optimal (attainable) path starting
from $a$ and ending at $b$. We denote this by $[\hat{k}_t(a, b; T), \hat{x}_t(a, b; T)]$, or, unless
confusion might result, simply by $(\hat{k}_t, \hat{x}_t)$ or $[\hat{k}_t(b), \hat{x}_t(b)]$.
On the other hand, the path $(k_t, x_t)$, $t = 0, 1, \ldots, T$, which satisfies the above
constraints
(13) $x_t + k_t = g(k_{t-1})$, $t = 1, 2, \ldots, T$
(14) $x_0 + k_0 = a$, $k_T = b$

is called an attainable path starting from $a$ and ending at $b$. When only (13) is
imposed and (14) is disregarded, it is called a feasible path. Clearly the optimal
attainable path is the path which maximizes $U$ among the set of all attainable paths.
Note that the set of attainable paths can be empty, so that there may not exist a
solution for the above maximization problem. For example, $b$ may be so large that
the economy cannot attain it within the prescribed $T$ periods, even if $x_t = 0$
for all $t$ (the path of pure accumulation). We may denote the set of all the attainable
paths by $A(a, b; T)$. For the infinite horizon problem ($T \to \infty$), this set is denoted
by $A(a, \infty)$, or simply $A(a)$, where we do not impose the constraint such as
$\lim_{T \to \infty} k_T = b$.
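Since g is increasing, the largest terminal stock reachable in T periods is the one on the path of pure accumulation, so non-emptiness of A(a, b; T) can be checked by comparing b with that stock. The sketch below illustrates this check; the technology f(k) = A log(1 + k), the parameter values, and the function name `is_attainable` are assumptions made for illustration.

```python
import math

A, MU, N = 1.0, 0.05, 0.02   # hypothetical parameter values

def g(k):
    """Equation (8) for the assumed f(k) = A*log(1 + k)."""
    return (A * math.log1p(k) + (1.0 - MU) * k) / (1.0 + N)

def max_terminal_stock(a, T):
    """k_T on the path of pure accumulation (x_t = 0 for all t)."""
    k = a
    for _ in range(T):
        k = g(k)
    return k

def is_attainable(a, b, T):
    """True if some path with x_0 + k_0 = a and (13) can end at k_T = b >= 0."""
    return 0.0 <= b <= max_terminal_stock(a, T)
```

With a = 2, a target b = 10 is out of reach in T = 3 periods but reachable in T = 10 under these assumed values, which is the situation described in the text where b is too large for the prescribed horizon.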
We assume that the set of attainable paths is nonempty, for otherwise it is
meaningless to consider the problem. It can be shown that the attainable set
$A(a, b; T)$ is compact in the $(2T + 2)$ dimensional Euclidean space. To show this,
let $(k_t^q, x_t^q)$, $q = 1, 2, \ldots$, be a sequence such that
(15) $x_t^q + k_t^q = g(k_{t-1}^q)$, $t = 1, 2, \ldots, T$
(16) $x_0^q + k_0^q = a$, $k_T^q = b$
and

(17) $k_t^q \to k_t^*$ and $x_t^q \to x_t^*$

Then we have $x_t^* + k_t^* = g(k_{t-1}^*)$, $x_0^* + k_0^* = a$, and $k_T^* = b$, since $g$ is continuous.⁸
That is, the set $A(a, b; T)$ is a closed set. Since it is obviously bounded, it is
compact.⁹ Thus, the attainable set $A(a, b; T)$ is nonempty and compact. Hence,
in view of the Weierstrass theorem (Theorem 0.A.18) and the continuity of $u$,
there always exists a solution for the above nonlinear programming problem. That
is, the existence of an optimal attainable path is demonstrated.¹⁰
Next we consider the nonnegativity of the optimal attainable path. In fact,
we can show, under a certain assumption, that $\hat{k}_t > 0$, $\hat{x}_t > 0$, for all $t = 0, 1, \ldots, T$
(except possibly for $\hat{k}_T = b$, which can be zero), where $(\hat{k}_t, \hat{x}_t)$ denotes an optimal
attainable path. To consider this problem, first suppose that $b = k^*(a)$. Then it is
clear that the path $(k_t, x_t)$, in which $k_t = k^*(a) > 0$, $x_t = x^*(a) > 0$, $t = 0, 1, \ldots, T$,
is an attainable path. Then in view of the assumptions that $u(0) = -\infty$ and
$f(0) = 0$,¹¹ we have
(18) $\hat{k}_t > 0$, $\hat{x}_t > 0$, for all $t = 0, 1, \ldots, T$ (except possibly for $\hat{k}_T$)
That is, we have an "interior solution" for the above maximization problem. Now
suppose that $b < k^*(a)$. Then we can similarly conclude that we have an interior
solution [that is, (18) holds] since the path $(k_t, x_t)$ in which $k_t = k^*(a)$, $x_t = x^*(a)$,
$t = 0, 1, \ldots, T - 1$, and $k_T = b$, $x_T = g[k^*(a)] - b = a - b$, is an attainable path
and $k_t \ge 0$, $x_t > 0$, for all $t$ along this path. Henceforth we impose the following
assumption:
(A-5) $b \le k^*(a)$.
Note that if $b = 0$, then this condition is always satisfied. We leave it to the
interested readers to work out the implications of the case in which (A-5) is not
satisfied.
We now assert the uniqueness of an optimal path. Although we should be
able to assert this by way of the Lagrangian of the above maximization problem
and using assumptions such as $u'' < 0$ in (A-4) and $g' > 0$ in (A-1'), here we
will prove uniqueness directly from the problem because that method has
applications to some other problems; in particular, we will use it for the multisector case
(Chapter 7, Section B). First, for the sake of notational simplicity, write
$x \equiv (x_0, x_1, \ldots, x_T)$ and $k \equiv (k_0, k_1, \ldots, k_T)$, so that $U(x) = \sum_{t=0}^{T} u(x_t)(1+\rho)^{-t}$.
We first assert that the strict concavity of $u$ (that is, $u'' < 0$) implies that the
optimal consumption path $\hat{x}$ is unique. To show this, suppose that $(\hat{k}, \hat{x})$ and
$(k', x')$ are two optimal attainable paths with $\hat{x} \ne x'$. That is, $U(\hat{x}) = U(x')$ and
$a - \hat{x}_0 - \hat{k}_0 = 0$, $g(\hat{k}_{t-1}) - \hat{x}_t - \hat{k}_t = 0$, $t = 1, 2, \ldots, T$, $\hat{k}_T - b = 0$, $a - x'_0 - k'_0 = 0$,
$g(k'_{t-1}) - x'_t - k'_t = 0$, $t = 1, 2, \ldots, T$, $k'_T - b = 0$. Define a new path $(\tilde{k}_t, \tilde{x}_t)$ by

(19) $\tilde{x}_t \equiv \tfrac{1}{2}(\hat{x}_t + x'_t)$, $t = 0, 1, \ldots, T - 1$
(20) $\tilde{k}_0 \equiv a - \tilde{x}_0$, $\tilde{k}_t \equiv g(\tilde{k}_{t-1}) - \tilde{x}_t$, $t = 1, 2, \ldots, T$
(21) $\tilde{x}_T \equiv g(\tilde{k}_{T-1}) - b$
Then $(\tilde{k}_t, \tilde{x}_t)$ is an attainable path. Note that $\tilde{k}_0 = \tfrac{1}{2}(\hat{k}_0 + k'_0)$. Hence, from the
concavity of $g$, we obtain

(22) $\tilde{x}_1 + \tilde{k}_1 = g(\tilde{k}_0) \ge \tfrac{1}{2}g(\hat{k}_0) + \tfrac{1}{2}g(k'_0) = \tfrac{1}{2}(\hat{x}_1 + \hat{k}_1) + \tfrac{1}{2}(x'_1 + k'_1) = \tilde{x}_1 + \tfrac{1}{2}(\hat{k}_1 + k'_1)$

Here the equality holds if $\hat{k}_0 = k'_0$. Hence we obtain $\tilde{k}_1 \ge \tfrac{1}{2}(\hat{k}_1 + k'_1)$, with equality
holding when $\hat{k}_0 = k'_0$. This in turn implies
(23) $\tilde{x}_2 + \tilde{k}_2 = g(\tilde{k}_1) \ge g(\tfrac{1}{2}\hat{k}_1 + \tfrac{1}{2}k'_1) \ge \tfrac{1}{2}g(\hat{k}_1) + \tfrac{1}{2}g(k'_1) = \tfrac{1}{2}(\hat{x}_2 + \hat{k}_2) + \tfrac{1}{2}(x'_2 + k'_2) = \tilde{x}_2 + \tfrac{1}{2}(\hat{k}_2 + k'_2)$

Hence we obtain $\tilde{k}_2 \ge \tfrac{1}{2}(\hat{k}_2 + k'_2)$. Here the equality holds if $\hat{k}_0 = k'_0$ and $\hat{k}_1 = k'_1$.
Repeating the above argument, we obtain
(24) $\tilde{x}_T + b = g(\tilde{k}_{T-1}) \ge g(\tfrac{1}{2}\hat{k}_{T-1} + \tfrac{1}{2}k'_{T-1}) \ge \tfrac{1}{2}g(\hat{k}_{T-1}) + \tfrac{1}{2}g(k'_{T-1}) = \tfrac{1}{2}(\hat{x}_T + b) + \tfrac{1}{2}(x'_T + b) = \tfrac{1}{2}(\hat{x}_T + x'_T) + b$

Here the equality holds when $\hat{k}_t = k'_t$ for all $t = 0, 1, \ldots, T$. Therefore
$\tilde{x}_T \ge \tfrac{1}{2}(\hat{x}_T + x'_T)$ with equality holding when $\hat{k}_t = k'_t$ for all $t = 0, 1, 2, \ldots, T$. Then in
view of the monotonicity and the strict concavity of $u$, we obtain

(25) $U(\tilde{x}) \ge U(\tfrac{1}{2}\hat{x} + \tfrac{1}{2}x') > \tfrac{1}{2}U(\hat{x}) + \tfrac{1}{2}U(x') = U(\hat{x})$

which is a contradiction. Note that the above consideration does not preclude
the possibility that $\hat{k} \ne k'$. We now show that this is impossible. Suppose that
$(\hat{x}, \hat{k})$ and $(\hat{x}, k')$ are two optimal attainable paths. Then observe from the
attainability that
(26) $g(\hat{k}_{t-1}) - \hat{k}_t = g(k'_{t-1}) - k'_t$, $t = 1, 2, \ldots, T$
Since $\hat{k}_T = k'_T = b$, so $\hat{k}_{T-1} = k'_{T-1}$ as a result of the monotonicity of $g$.¹² Then
using the relation (26) successively, we obtain $\hat{k}_t = k'_t$, $t = 0, 1, 2, \ldots, T$, which
in turn is consistent with $a - \hat{k}_0 = a - k'_0$. Thus we obtain $\hat{k} = k'$. Note that in the
above proof the crucial assumption is $g' > 0$ and not $g'' < 0$. That is, the strict
concavity of $g$ (or $g'' < 0$) is not crucial.
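Having demonstrated the existence and the uniqueness of the optimal
attainable path, we now proceed to the characterization of such a path. To ease
the notation, we define the following $h$-functions within the constraints of the
above maximization problem:
(27) $h_0(k_0, x_0) \equiv a - x_0 - k_0$

(28) $h_t(k_{t-1}, k_t, x_t) \equiv g(k_{t-1}) - x_t - k_t$, $t = 1, 2, \ldots, T$
Note that $k_T$ may be replaced by $b$, so that it can be dropped from the list of the
control variables.¹³ In order to obtain the first-order characterization of the
above maximization problem, we next examine the rank constraint qualification.¹⁴
For this purpose, define the $(T + 1) \times (2T + 1)$ matrix $H$ as

(29) $H \equiv \begin{bmatrix}
\dfrac{\partial h_0}{\partial k_0} & \dfrac{\partial h_0}{\partial k_1} & \cdots & \dfrac{\partial h_0}{\partial k_{T-1}} & \dfrac{\partial h_0}{\partial x_0} & \cdots & \dfrac{\partial h_0}{\partial x_T} \\
\dfrac{\partial h_1}{\partial k_0} & \dfrac{\partial h_1}{\partial k_1} & \cdots & \dfrac{\partial h_1}{\partial k_{T-1}} & \dfrac{\partial h_1}{\partial x_0} & \cdots & \dfrac{\partial h_1}{\partial x_T} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
\dfrac{\partial h_T}{\partial k_0} & \dfrac{\partial h_T}{\partial k_1} & \cdots & \dfrac{\partial h_T}{\partial k_{T-1}} & \dfrac{\partial h_T}{\partial x_0} & \cdots & \dfrac{\partial h_T}{\partial x_T}
\end{bmatrix}$

Since the number of effective constraints for the above problem is $(T + 1)$, the
rank constraint qualification of this problem states that the rank of matrix $H$
should be equal to $(T + 1)$. It is easy to see that this condition is in general
satisfied by actually computing $H$ in view of (27) and (28) (and evaluating the
matrix along the optimal attainable path).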
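To make the rank computation concrete, the following block writes out $H$ by differentiating (27) and (28) with $k_T$ replaced by $b$. The block notation $[\,G \mid -I_{T+1}\,]$ is introduced here only for illustration.

```latex
\[
H \;=\; \bigl[\; G \;\big|\; -I_{T+1} \;\bigr],
\qquad
G \;=\;
\begin{bmatrix}
-1        & 0         & \cdots & 0 \\
g'(k_0)   & -1        & \cdots & 0 \\
0         & g'(k_1)   & \ddots & \vdots \\
\vdots    &           & \ddots & -1 \\
0         & \cdots    & 0      & g'(k_{T-1})
\end{bmatrix}
\quad \bigl((T+1) \times T\bigr).
\]
% The last T+1 columns come from \partial h_t / \partial x_s = -1 if s = t and 0 otherwise.
% They form -I_{T+1}, so rank H = T+1 for every value of k_0, ..., k_{T-1}.
```

Because the consumption block alone already has full row rank, the rank condition holds at any point, and in particular along the optimal attainable path.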
We now define the Lagrangian function $L$ of the above problem as

(30) $L \equiv \sum_{t=0}^{T} u(x_t)(1+\rho)^{-t} + \sum_{t=0}^{T} p_t h_t$

where $p_0, p_1, \ldots, p_T$ are the Lagrangian multipliers. Since the constraint
qualification is satisfied, the following first-order condition gives a set of necessary
conditions for an optimum (recall Chapter 1, Section D):

(31) $\dfrac{\partial L}{\partial k_t} = -p_t + p_{t+1}\,g'(\hat{k}_t) = 0$, $t = 0, 1, \ldots, T - 1$

(32) $\dfrac{\partial L}{\partial x_t} = u'(\hat{x}_t)(1+\rho)^{-t} - p_t = 0$, $t = 0, 1, 2, \ldots, T$

Note that we do not have the inequalities $\partial L/\partial k_t \le 0$ and $\partial L/\partial x_t \le 0$, for we
ruled out the corner solution (that is, $\hat{k}_t = 0$, $\hat{x}_t = 0$, for some $t$) by (A-5) and
the assumptions $u(0) = -\infty$, $f(0) = 0$. Note that we have $p_t > 0$ for all $t = 0$,
$1, 2, \ldots, T$ in view of (32) and the assumption $u'(x_t) > 0$ for all $x_t$. There are
$(2T + 1)$ conditions in (31) and (32). These together with the $(T + 1)$ constraints
(that is, $h_t = 0$, $t = 0, 1, \ldots, T$) determine the optimal value of the $(2T + 1) +
(T + 1)$ variables (that is, $\hat{k}_t$, $t = 0, 1, \ldots, T - 1$; $\hat{x}_t$, $p_t$, $t = 0, 1, \ldots, T$).
It is easy to show that (31) and (32) together with the constraints (13) and (14)
also give a set of sufficiency conditions for a (global) optimum in view of the
concavity of $u$ and $g$. Since the optimal attainable path is unique, these conditions give
a set of necessary and sufficient conditions for a unique global optimum. We now
turn to a study of these conditions. First we note that conditions (31) and (32) can
be rewritten as

(33) $u'(\hat{x}_t) = u'(\hat{x}_{t+1})\,\dfrac{g'(\hat{k}_t)}{1+\rho}$, $t = 0, 1, \ldots, T - 1$

There are $T$ conditions in (33), which, together with the $(T + 2)$ conditions in (13)
and (14), completely determine the value of the $(2T + 2)$ variables $\hat{k}_t$, $\hat{x}_t$, $t = 0,
1, \ldots, T$.
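Because (33), (13) and (14) pin down the whole path once the initial consumption is fixed, the optimal attainable path can be computed by "shooting" on x_0. The sketch below is one possible numerical approach, not a method taken from the text. It assumes u(x) = log x (so that u(0) = -∞), the illustrative technology f(k) = A log(1 + k), and hypothetical parameter values; the names `shoot` and `optimal_path` are likewise made up for this example.

```python
import math

A, MU, N, RHO = 1.0, 0.05, 0.02, 0.03   # hypothetical parameter values

def g(k):
    """Equation (8) for the assumed f(k) = A*log(1 + k)."""
    return (A * math.log1p(k) + (1.0 - MU) * k) / (1.0 + N)

def g_prime(k):
    return (A / (1.0 + k) + 1.0 - MU) / (1.0 + N)

def shoot(x0, a, T):
    """Roll (33) and (13) forward from a trial x0; None if some k_t turns negative."""
    k, x = [a - x0], [x0]
    for t in range(T):
        x_next = x[t] * g_prime(k[t]) / (1.0 + RHO)   # (33) with u = log
        k_next = g(k[t]) - x_next                      # (13)
        if k_next < 0.0:
            return None
        k.append(k_next)
        x.append(x_next)
    return k, x

def optimal_path(a, b, T, iters=200):
    """Bisect on x0: a larger x0 lowers every later k_t, hence k_T."""
    lo, hi = 0.0, a
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        path = shoot(mid, a, T)
        if path is None or path[0][-1] < b:
            hi = mid      # too much consumption: terminal stock falls short
        else:
            lo = mid
    return shoot(lo, a, T)

k_hat, x_hat = optimal_path(a=2.0, b=1.0, T=10)
```

The returned lists satisfy x_0 + k_0 = a, the feasibility condition (13), the Euler condition (33), and k_T = b up to floating-point tolerance, which by the sufficiency argument above is enough to identify the optimum under the stated assumptions.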
The economic meaning of (33) is easy to see. By reducing consumption by
one unit of the good, the loss of utility is $u'(\hat{x}_t)$ for the $t$th period. By investing
this one unit, there is a gain in "net" output¹⁵ by the amount of $g'(\hat{k}_t)$. This, in
turn, gives an increase in utility by the amount of $u'(\hat{x}_{t+1})\,g'(\hat{k}_t)/(1 + \rho)$. Hence
the equality (33) gives nothing but the competitive intertemporal arbitrage
condition. It is easy to rewrite (33) in the following equivalent form:

(33') $u'(\hat{x}_{t+1}) - u'(\hat{x}_t) = \dfrac{-u'(\hat{x}_{t+1})}{1+\rho}\,[g'(\hat{k}_t) - (1 + \rho)]$

It should be clear that this corresponds to the Euler equation obtained in Section
D of this chapter.
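For completeness, the step from (33) to (33') is a one-line substitution: replace $u'(\hat{x}_t)$ by the right-hand side of (33) and factor.

```latex
\begin{aligned}
u'(\hat{x}_{t+1}) - u'(\hat{x}_t)
  &= u'(\hat{x}_{t+1}) - \frac{u'(\hat{x}_{t+1})\, g'(\hat{k}_t)}{1+\rho} \\
  &= \frac{-u'(\hat{x}_{t+1})}{1+\rho}\,\bigl[\, g'(\hat{k}_t) - (1+\rho) \,\bigr].
\end{aligned}
```

Since $u' > 0$, the sign of $u'(\hat{x}_{t+1}) - u'(\hat{x}_t)$ is the opposite of the sign of $g'(\hat{k}_t) - (1+\rho)$.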
We summarize some of the results obtained above.

Theorem 5.D.2: Under assumptions (A-1), (A-2), (A-3), (A-4), and (A-5), we have
the following:

(i) The balanced growth path $[k^*(a), x^*(a)]$ starting from $a > 0$ exists, is unique,
and $k^*(a) > 0$, $x^*(a) > 0$.
(ii) The optimal attainable path $(\hat{k}_t, \hat{x}_t)$ starting from $a > 0$ and ending at $k_T = b \ge 0$
exists, is unique, and $\hat{k}_t > 0$, $\hat{x}_t > 0$ for all $t = 0, 1, 2, \ldots, T$, with the possible
exception that $\hat{k}_T = b = 0$.
(iii) A necessary and sufficient condition for the path to be optimal and attainable is
given by (33), (13), and (14).

We now define the concept of competitiveness.

Definition: The attainable path $(\hat{k}_t, \hat{x}_t)$ starting from $a$ and ending at $b$ is called
competitive if there exist nonnegative numbers ("prices") $p_t$ such that

(34) $u(\hat{x}_t)(1+\rho)^{-t} - p_t\hat{x}_t \ge u(x_t)(1+\rho)^{-t} - p_t x_t$
for all $x_t \ge 0$, $t = 0, 1, \ldots, T$
and

(35) $p_t\,g(\hat{k}_{t-1}) - p_{t-1}\hat{k}_{t-1} \ge p_t\,g(k_{t-1}) - p_{t-1}k_{t-1}$
for all $k_{t-1} \ge 0$, $t = 1, \ldots, T$
REMARK: Relation (34) in the definition of competitiveness implies the
well-known condition that consumers maximize utility subject to the
budget constraint; that is,
(36) $u(\hat{x}_t) \ge u(x_t)$ for all $x_t \ge 0$ such that $p_t x_t \le p_t\hat{x}_t$, $t = 0, 1, \ldots, T$
Condition (35) corresponds to the well-known profit maximization condition
for the producers.
We now prove the following important corollary of Theorem 5.D.2, which is
proved by Gale [4] in the multisector context.

Corollary: An attainable path is competitive if and only if it is optimal.

PROOF: In view of (iii) of Theorem 5.D.2, an attainable path is optimal if and
only if there exist $p_t$, $t = 0, 1, \ldots, T$, all $> 0$, such that

(37) $\sum_{t=0}^{T} u(\hat{x}_t)(1+\rho)^{-t} + p_0(a - \hat{x}_0 - \hat{k}_0) + \sum_{t=1}^{T} p_t[g(\hat{k}_{t-1}) - \hat{x}_t - \hat{k}_t]$
$\ge \sum_{t=0}^{T} u(x_t)(1+\rho)^{-t} + p_0(a - x_0 - k_0) + \sum_{t=1}^{T} p_t[g(k_{t-1}) - x_t - k_t]$

for all $k_t, x_t \ge 0$, $t = 0, 1, \ldots, T$. Set $k_t = \hat{k}_t$ for all $t$ and $x_t = \hat{x}_t$ for all
$t = 0, 1, \ldots, T$, except for $t = t_0$. Then we obtain (34), the first condition of
competitiveness, since the choice of $t_0$ is arbitrary. Next set $k_t = \hat{k}_t$ for all
$t = 1, \ldots, T$, except for $t = t_0$, and $x_t = \hat{x}_t$ for all $t = 0, 1, \ldots, T$. Since the
choice of $t_0$ is arbitrary, we establish (35), the second condition of competitiveness.
In other words, we established that optimality implies competitiveness.
To show the converse, first note the following simple identity:
(38) $p_0 a - p_T b = p_0 a - p_T b$, where $\hat{k}_T = k_T = b$
Then summing both sides of (34), (35), and (38), we obtain (37), which
establishes the converse. (Q.E.D.)
We now turn to the problem in which $t$ becomes "very large," which is
obviously meaningful when $T$ is large enough. Define $\hat{k}(\rho)$ and $\hat{x}(\rho)$ by

(39) $g'[\hat{k}(\rho)] = 1 + \rho$
and
(40) $\hat{x}(\rho) = g[\hat{k}(\rho)] - \hat{k}(\rho)$

Here we assume $g''(k) < 0$ for all $k$ and impose assumption (A-2), part (a). Then
$[\hat{k}(\rho), \hat{x}(\rho)]$ defines the modified golden rule path for the present discrete time
model. Note that $0 < \hat{k}(\rho) < \bar{k}$. Recall now the two basic equations of the
present model, that is, (7) for feasibility and (33') for optimality. Then, in view
of $g''(k_t) < 0$ for all $k_t$ and $u''(x_t) < 0$ for all $x_t$, we can easily conclude from
(33') that

(41) $\hat{x}_{t+1} - \hat{x}_t \gtreqless 0$ according to whether $\hat{k}_t \lesseqgtr \hat{k}(\rho)$

We rewrite condition (7) as

(7') $k_{t+1} - k_t = g(k_t) - k_t - x_{t+1}$
Hence we can easily conclude that

(42) $\hat{k}_{t+1} - \hat{k}_t \gtreqless 0$ according to whether $g(\hat{k}_t) - \hat{k}_t - \hat{x}_{t+1} \gtreqless 0$

We can now draw a phase diagram similar to the one used in Section D. Applying
our argument of the "eligibility condition," we can obtain the same conclusion we
obtained in Section D, which we list as another corollary of Theorem 5.D.2.

Corollary (Ramsey-Koopmans-Cass): As $t \to \infty$ (and $T \to \infty$), $\hat{x}_t \to \hat{x}(\rho)$ and
$\hat{k}_t \to \hat{k}(\rho)$, regardless of the initial stock $a$.
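The modified golden rule pair defined by (39) and (40) can be computed in closed form for a simple technology. The sketch below assumes f(k) = A log(1 + k), u(x) = log x, and hypothetical parameter values (none of which come from the text), and checks that the pair is a rest point of the two basic equations (7) and (33).

```python
import math

A, MU, N, RHO = 1.0, 0.05, 0.02, 0.03   # hypothetical parameter values

def g(k):
    """Equation (8) for the assumed f(k) = A*log(1 + k)."""
    return (A * math.log1p(k) + (1.0 - MU) * k) / (1.0 + N)

def g_prime(k):
    return (A / (1.0 + k) + 1.0 - MU) / (1.0 + N)

def modified_golden_rule():
    """Solve (39), g'(k) = 1 + rho, in closed form; then apply (40)."""
    denom = (1.0 + RHO) * (1.0 + N) - (1.0 - MU)   # must be in (0, A) for k > 0
    k_hat = A / denom - 1.0
    x_hat = g(k_hat) - k_hat
    return k_hat, x_hat

K_HAT, X_HAT = modified_golden_rule()

# One step of (33) with u = log, followed by (7), starting from the rule itself.
x_next = X_HAT * g_prime(K_HAT) / (1.0 + RHO)
k_next = g(K_HAT) - x_next
```

Starting from the pair, one step of the dynamics returns the same pair up to rounding, which is what makes it the candidate limit in the corollary.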

d. SENSITIVITY ANALYSIS: BROCK'S THEOREM¹⁶

Finally we turn to the sensitivity analysis. Following Brock [2], we consider
the "sensitivity" of the optimal attainable path with respect to the final stock $b$
and with respect to the planning horizon. Here we can allow the case in which
$g''(k_t) = 0$ for all $k_t$. Moreover, the subsequent analysis can allow for
"autonomous" changes in $u$ and $g$; that is, $u(x_t)$ can be replaced by $u(x_t, t)$ and $g(k_t)$ can be
written as $g(k_t, t)$. The second argument $t$ in these functions signifies the
autonomous shifts of these functions. For example, $u(x_t, t)$ can involve the case in which
there is a change in the discount rate as well as taste over time, and $g(k_t, t)$ can
mean technological progress. We may then rewrite our feasibility condition (7)
accordingly as
(7'') $x_{t+1} + k_{t+1} = g(k_t, t)$, $t = 0, 1, \ldots, T - 1$
It can be shown fairly easily, by repeating our earlier argument, that the optimality
condition (33) can be rewritten accordingly as

(33'') $u'(\hat{x}_t, t) = u'(\hat{x}_{t+1}, t+1)\,\dfrac{g'(\hat{k}_t, t)}{1+\rho}$

where prime (') obviously means the partial derivative with respect to the first
argument of the relevant functions. In fact, we can omit the discount factor
$(1 + \rho)^{-t}$ from the target function $U$ and simply rewrite $U = \sum_{t=0}^{T} u(x_t, t)$, for the
second argument $t$ of $u$ takes care of such a time discount. We may then omit
$(1 + \rho)$ from (33'').
Denote the $T$-period optimal attainable path starting from $a$ and ending at
$b_1$ and $b_2$, respectively, by $[\hat{k}_t(b_1), \hat{x}_t(b_1)]$ and $[\hat{k}_t(b_2), \hat{x}_t(b_2)]$, and let $b_1 > b_2$.
First assume $\hat{k}_0(b_1) \le \hat{k}_0(b_2)$, and we obtain a contradiction. $\hat{k}_0(b_1) \le \hat{k}_0(b_2)$
implies
(43) $\hat{x}_0(b_1) \ge \hat{x}_0(b_2)$
because $\hat{x}_0(b_1) + \hat{k}_0(b_1) = a = \hat{x}_0(b_2) + \hat{k}_0(b_2)$. This in turn implies that
$u'[\hat{x}_0(b_1), 0] \le u'[\hat{x}_0(b_2), 0]$. Write
(44) $P_t(b_1) \equiv u'[\hat{x}_t(b_1), t]$, $P_t(b_2) \equiv u'[\hat{x}_t(b_2), t]$
$g'_t(b_1) \equiv g'[\hat{k}_t(b_1), t]$, $g'_t(b_2) \equiv g'[\hat{k}_t(b_2), t]$
to save space (and similarly $g_t(b_i) \equiv g[\hat{k}_t(b_i), t]$). Then we have
(45-a) $P_0(b_1) \le P_0(b_2)$

(45-b) $g'_0(b_1) \ge g'_0(b_2)$

(45-c) $g_0(b_1) \le g_0(b_2)$

Hence, from (33'') and (45-b),

(46) $\dfrac{P_0(b_1)}{P_1(b_1)} = \dfrac{g'_0(b_1)}{1+\rho} \ge \dfrac{g'_0(b_2)}{1+\rho} = \dfrac{P_0(b_2)}{P_1(b_2)}$

Therefore, from (46) and (45-a),

(47) $\dfrac{P_1(b_1)}{P_1(b_2)} \le \dfrac{P_0(b_1)}{P_0(b_2)} \le 1$

Hence $P_1(b_1) \le P_1(b_2)$, which means $\hat{x}_1(b_1) \ge \hat{x}_1(b_2)$. Then, together with (45-c)
and feasibility, (13),
(48) $\hat{k}_1(b_1) - \hat{k}_1(b_2) = [g_0(b_1) - \hat{x}_1(b_1)] - [g_0(b_2) - \hat{x}_1(b_2)]$
$= [g_0(b_1) - g_0(b_2)] - [\hat{x}_1(b_1) - \hat{x}_1(b_2)] \le 0$

Repeating the above argument, we obtain
(49) $\hat{k}_t(b_1) \le \hat{k}_t(b_2)$ for all $t = 1, 2, \ldots, T$
which contradicts $\hat{k}_T(b_1) = b_1 > b_2 = \hat{k}_T(b_2)$. Therefore we have
(50) $\hat{k}_0(b_1) > \hat{k}_0(b_2)$
Repeating the above argument, we obtain
(51) $\hat{k}_t(b_1) > \hat{k}_t(b_2)$ for all $t = 0, 1, \ldots, T$, and
$\hat{x}_t(b_1) < \hat{x}_t(b_2)$ for all $t = 1, 2, \ldots, T$
That is, an increase in the final stock requirement increases $\hat{k}_t$ and decreases
$\hat{x}_t$ for each $t$ in the optimal attainable path.
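The comparative-statics result (51) can be illustrated numerically by solving the same problem for two terminal stocks. This is a sketch under stated assumptions rather than a reproduction of Brock's argument: it takes u(x) = log x, f(k) = A log(1 + k), hypothetical parameter values, and a simple shooting routine (the names `shoot` and `optimal_path` are invented for the example).

```python
import math

A, MU, N, RHO = 1.0, 0.05, 0.02, 0.03   # hypothetical parameter values

def g(k):
    return (A * math.log1p(k) + (1.0 - MU) * k) / (1.0 + N)

def g_prime(k):
    return (A / (1.0 + k) + 1.0 - MU) / (1.0 + N)

def shoot(x0, a, T):
    """Iterate (33) (with u = log) and (13) from x0; None if capital goes negative."""
    k, x = [a - x0], [x0]
    for t in range(T):
        x_next = x[t] * g_prime(k[t]) / (1.0 + RHO)
        k_next = g(k[t]) - x_next
        if k_next < 0.0:
            return None
        k.append(k_next)
        x.append(x_next)
    return k, x

def optimal_path(a, b, T, iters=200):
    """Bisection on x0 so that the terminal stock equals b."""
    lo, hi = 0.0, a
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        path = shoot(mid, a, T)
        if path is None or path[0][-1] < b:
            hi = mid
        else:
            lo = mid
    return shoot(lo, a, T)

# Same initial stock and horizon, two terminal requirements b1 > b2.
K1, X1 = optimal_path(2.0, 1.0, 10)
K2, X2 = optimal_path(2.0, 0.5, 10)
```

Under these assumptions the computed paths show the higher terminal requirement raising capital and lowering consumption in every period, in line with (51).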
Next we consider the effect of changing the planning horizon $T$. We denote
the optimal attainable path starting from $a$ and ending at $b$ for the $T$-period
problem by $[\hat{k}_t^T(b), \hat{x}_t^T(b)]$. Hence, for example, $\hat{k}_T^{T+1}(0)$ denotes the (per
capita) stock of capital in period $T$ in the $(T + 1)$-period optimal attainable path
starting from $a$ and ending at $k_{T+1} = 0$.

We now compare two optimal attainable paths, both starting from $a$ and
ending at 0 with the only difference being the planning horizon, $T$ for one and
$(T + 1)$ for the other. Then $\hat{k}_{T+1}^{T+1}(0) = \hat{k}_T^T(0) = 0$, but $\hat{k}_T^{T+1}(0) > 0$; for if
$\hat{k}_T^{T+1}(0) = 0$, then $\hat{x}_{T+1}^{T+1}(0) = 0$, which implies $u[\hat{x}_{T+1}^{T+1}, T + 1] = -\infty$.
Next observe that
(52) $\hat{k}_t^{T+1}(0) = \hat{k}_t^T(\hat{k}_T^{T+1}(0))$, $t = 0, 1, \ldots, T$
for otherwise one can always increase the value of $U$ for the $(T + 1)$-period
program with ($k_{T+1}^{T+1} = 0$) by following the path $\hat{k}_t^T(\hat{k}_T^{T+1}(0))$, $t = 0, \ldots, T$
(that is, up to the $T$th period). Since $\hat{k}_T^{T+1}(0) > 0$, (52) implies

(53) $0 = \hat{k}_T^T(0) < \hat{k}_T^T(\hat{k}_T^{T+1}(0))$

Hence, in view of (51) and (53), we obtain
(54) $\hat{k}_t^T(0) < \hat{k}_t^{T+1}(0)$ for each $t = 0, 1, \ldots, T$
and

$\hat{x}_t^T(0) > \hat{x}_t^{T+1}(0)$ for each $t = 0, 1, \ldots, T$

Repeating the argument, we have for each $t$,
(55) $\hat{k}_t^T(0) < \hat{k}_t^{T+1}(0) < \hat{k}_t^{T+2}(0) < \cdots$

and

$\hat{x}_t^T(0) > \hat{x}_t^{T+1}(0) > \hat{x}_t^{T+2}(0) > \cdots$

In other words, an increase in the planning horizon with zero terminal stocks
always increases the optimal (per capita) stock for each $t$ and decreases the
optimal (per capita) consumption for each $t$.
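The horizon result can also be checked numerically by solving the zero-terminal-stock problem for successive values of T. As before this is only a sketch: u(x) = log x, f(k) = A log(1 + k), the parameter values, and the shooting routine are illustrative assumptions, not part of the text.

```python
import math

A, MU, N, RHO = 1.0, 0.05, 0.02, 0.03   # hypothetical parameter values

def g(k):
    return (A * math.log1p(k) + (1.0 - MU) * k) / (1.0 + N)

def g_prime(k):
    return (A / (1.0 + k) + 1.0 - MU) / (1.0 + N)

def shoot(x0, a, T):
    """Iterate (33) (with u = log) and (13) from x0; None if capital goes negative."""
    k, x = [a - x0], [x0]
    for t in range(T):
        x_next = x[t] * g_prime(k[t]) / (1.0 + RHO)
        k_next = g(k[t]) - x_next
        if k_next < 0.0:
            return None
        k.append(k_next)
        x.append(x_next)
    return k, x

def optimal_path(a, b, T, iters=200):
    """Bisection on x0 so that the terminal stock equals b."""
    lo, hi = 0.0, a
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        path = shoot(mid, a, T)
        if path is None or path[0][-1] < b:
            hi = mid
        else:
            lo = mid
    return shoot(lo, a, T)

# Zero terminal stock, horizons T = 5, 6, 7, common initial stock a = 2.
PATHS = {T: optimal_path(2.0, 0.0, T) for T in (5, 6, 7)}
```

Comparing `PATHS[5]`, `PATHS[6]` and `PATHS[7]` period by period shows capital rising and consumption falling as the horizon lengthens, which is the ordering stated in (54) and (55) for this assumed economy.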
Next note that
(56) $\hat{k}_t^T(0) < \bar{k}_t$ for all $t = 0, 1, \ldots, T$; $T = 1, 2, \ldots$
where $\bar{k}_t$ denotes $k_t$ in the path of pure accumulation. Hence, for each $t$, $\hat{k}_t^T(0)$
is a monotone increasing sequence with respect to $T$, which is bounded from
above. Hence $\lim_{T \to \infty} \hat{k}_t^T(0)$ exists. Denote this by $\hat{k}_t$. Also $\hat{x}_t^T(0)$ is a monotone
decreasing sequence with respect to $T$, which is bounded from below by zero.
Hence $\lim_{T \to \infty} \hat{x}_t^T(0)$ exists. Denote this by $\hat{x}_t$. The path $(\hat{k}_t, \hat{x}_t)$ is attainable
in view of the continuity of $g$.
This convergence is the essential result in the sensitivity analysis with
respect to $T$, for it asserts that the distance between $\hat{k}_t^T(0)$ and $\hat{k}_t^{T'}(0)$ and the
distance between $\hat{x}_t^T(0)$ and $\hat{x}_t^{T'}(0)$ can be made arbitrarily small, at least for
certain initial periods when $T$ and $T'$ are sufficiently large. Also note that $(\hat{k}_t, \hat{x}_t)$
corresponds to the path obtained by Koopmans and Cass for the infinite horizon
problem (which also implies that $\hat{k}_t$ does not converge to 0 when $t \to \infty$).

We summarize the above results as follows:

Theorem 5.D.3 (Brock): Under (A-1) and (A-4), we have (51) and (55), and the
optimal attainable path for the $T$-period problem with no terminal stock requirements,
$[\hat{k}_t^T(0), \hat{x}_t^T(0)]$, converges (for each $t$) to a limit path $(\hat{k}_t, \hat{x}_t)$ as $T \to \infty$.
Can we assert that $[\hat{k}_t^T(b), \hat{x}_t^T(b)]$ also converges to $(\hat{k}_t, \hat{x}_t)$ when $b > 0$, as
$T \to \infty$? Brock [2] proved the following corollary (his theorem 3), which asserts,
in essence, that if $b$ is below a certain value, then such a convergence holds.

Corollary: If $b < \liminf_{t \to \infty} \hat{k}_t$, then $\lim_{T \to \infty} \hat{k}_t^T(b)$ exists and equals $\hat{k}_t$.

REMARK: Before we prove this corollary, we may have to explain the
mathematical concept $\underline{\lim}$ (or lim inf). Let $\{x_q\}$ be a sequence of real
numbers and let $X$ be the set of real numbers $x$ (including $\infty$ and $-\infty$) such that
$x_{q_s} \to x$ for some subsequence $\{x_{q_s}\}$ of $\{x_q\}$. Then $\overline{\lim}$ (or lim sup) and
$\underline{\lim}$ (or lim inf) are defined as¹⁷

(57) $\overline{\lim}_{q \to \infty}\, x_q \equiv$ lub of $X$

and

(58) $\underline{\lim}_{q \to \infty}\, x_q \equiv$ glb of $X$

Write $\underline{\lim}_{q \to \infty}\, x_q = \hat{x}$. Then we can show easily that $\hat{x} \in X$, and that if $a < \hat{x}$ then
there exists an integer $N$ such that $q \ge N$ implies $x_q > a$. (That is, for a given
$\varepsilon > 0$, there exists an $N$ such that $q \ge N$ implies $x_q > \hat{x} - \varepsilon$.) Similar results
hold for lim sup. We now turn to the proof of the above corollary.
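A small worked example may help fix the two definitions; the sequence below is chosen purely for illustration.

```latex
\[
x_q = (-1)^q\Bigl(1 + \tfrac{1}{q}\Bigr), \qquad q = 1, 2, \ldots
\]
% Even-indexed terms converge to 1 and odd-indexed terms converge to -1,
% so the set of subsequential limits is X = \{-1, 1\}.
\[
\limsup_{q \to \infty} x_q = \operatorname{lub} X = 1,
\qquad
\liminf_{q \to \infty} x_q = \operatorname{glb} X = -1 .
\]
% Take a = -1.1 < \liminf x_q. For odd q, x_q = -(1 + 1/q) > -1.1 exactly when q > 10,
% and every even-indexed term is positive, so q \ge 11 implies x_q > a.
```

The sequence itself has no limit, yet every value strictly below its lim inf is eventually exceeded, which is the only property of lim inf used in the proof that follows.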

PROOF: By assumption, $b < \liminf_{t \to \infty} \hat{k}_t$. Hence there exists a $T_0$ such that
(59) $\hat{k}_t > b$, for $t \ge T_0$
For $T \ge T_0$, choose $N$, which depends on $T$, such that

(60) $\hat{k}_T^T(b) = b < \hat{k}_T^N(0)$

which is possible since $\hat{k}_T^N(0)$ unconditionally converges to $\hat{k}_T$ by (55) and
$\hat{k}_T > b$ by (59). Then by (51) we have

(61) $\hat{k}_t^T(b) < \hat{k}_t^N(0)$, $t = 0, 1, \ldots, T$

Therefore, using (51) again, we obtain

(62) $\hat{k}_t^T(0) \le \hat{k}_t^T(b) < \hat{k}_t^N(0)$, $t = 0, 1, \ldots, T$

with equality when $b = 0$. Observe that $\hat{k}_t^T(0) \to \hat{k}_t$ as $T \to \infty$ and $\hat{k}_t^N(0) \to \hat{k}_t$ as
$N \to \infty$. Since $N \to \infty$ as $T \to \infty$, (62) implies that $\hat{k}_t^T(b) \to \hat{k}_t$ (for each $t$) as
$T \to \infty$. (Q.E.D.)
REMARK: This corollary shows that as long as $b$ is below the bound $\liminf_{t \to \infty} \hat{k}_t$,
the optimal attainable path is insensitive to changes in the terminal stock $b$,
for certain initial periods. In other words, as long as $b_1$ and $b_2$ are below the
bound, the distance between $\hat{k}_t^T(b_1)$ and $\hat{k}_t^T(b_2)$ can be made arbitrarily small
for certain initial periods by choosing $T$ sufficiently large.

FOOTNOTES

1. This does not preclude the importance of period analysis in the empirical contexts.
For example, in empirical econometric research one is often forced to use period
analysis since the time series data are tabulated at discrete time intervals (for example,
GNP). In optimization models, one may be able to change the policy (or control)
variables only at discrete time intervals as a result of practical considerations. Then
such a time interval may define the period.
2. We should note that our specification of the model is not the only way to produce
equation (3). Our point here is simply that we should make the specification explicit
to avoid possible misunderstanding.
3. We here adopt the convention that zero is the subsistence level of consumption.
Hence $x_t = 0$ for all $t$ means that the economy is at the subsistence level all the time.
4. In the literature, the following alternative assumptions are used in place of (A-2'),
part (a): $f'(0) = \infty$ and $f'(\infty) = 0$. See Sections C and D. Clearly this implies the
present assumption, (A-2'), part (a).
5. This is the case of a constant capital:output ratio. Obviously for such a case, (A-2),
part (a), cannot be imposed.
6. If $k_{t_0} = 0$ for some $t_0$, then $f(0) = 0$ implies $x_t = 0$, $k_t = 0$ for all $t > t_0$. This is an
uninteresting case of "balanced growth."
7. See Koopmans [6], p. 237, for example. When $a = \bar{k}$, then $k_t = \bar{k}$ and $x_t = 0$ for all $t$
except possibly for $t = T$ and $T - 1$. This is again a trivial and uninteresting case.
8. Recall the definition of continuous functions and note that the linear (affine)
functions are continuous everywhere in the domain. The function $g$ is continuous because
it is differentiable.
9. Each of $k_t$, $x_t$, $t = 0, 1, 2, \ldots, T$, is bounded from below by 0; $k_t$, $t = 0, 1, 2, \ldots, T$, are
bounded from above by the path of pure accumulation; $x_t$, $t = 0, 1, 2, \ldots, T$, are
bounded from above by $g'(\bar{k}_t) < \infty$ for all $\bar{k}_t < \infty$ and equation (7). Let $\bar{x}_t$, $\bar{k}_t$, $t = 0,
1, 2, \ldots, T$, be these upper bounds and consider rectangles $S_t$ in $R^2$ defined by $S_t \equiv
\{(x_t, k_t): 0 \le k_t \le \bar{k}_t,\ 0 \le x_t \le \bar{x}_t\}$, $t = 0, 1, \ldots, T$. Clearly the $S_t$'s are compact;
hence in view of Tychonoff's theorem $\bigotimes_{t=0}^{T} S_t$ is also compact in $R^{2T+2}$ with
respect to the product topology (Theorem 0.A.15). As a closed subset of the compact
set, $A(a, b; T)$ is compact.
10. Consider $S_t$, $t = 0, 1, 2, \ldots$, ad inf. Define $S \equiv \bigotimes_{t=0}^{\infty} S_t$. Then $S$ is again compact as
a result of the Tychonoff theorem. Hence $A(a)$ with $T \to \infty$ is compact as a closed
subset of $S$. Hence, using the Weierstrass theorem again, we can assert the existence of an
optimal attainable program for the infinite horizon problem ($T = \infty$), as long as
$U(x_0, x_1, \ldots) \equiv \sum_{t=0}^{\infty} u(x_t)(1+\rho)^{-t}$ remains continuous and bounded. This
argument is used to prove existence by Beale and Koopmans [1]. A different proof of
existence for a more general model which involves factor-augmenting technological
progress is provided by Brock and Gale [3]. They also showed the possible
nonexistence of an optimal attainable path in such an economy (if the discount rate $\rho$ is
below a certain critical value). See also Gale and Sutherland [5].
11. If, in $(k_t, x_t)$, $x_t = 0$ for some $t$, then this path becomes "infinitely worse" than $(k^*(a),
x^*(a))$, because $u(0) = -\infty$. If $k_t = 0$ for some $t$, then $f(0) = 0$ implies that $x_{t+1} =
k_{t+1} = \cdots = x_T = k_T = 0$.
12. That is, $g'(k_t) > 0$ for all $k_t > 0$.
13. It is certainly possible to take into account the constraint $k_T = b$ explicitly. We can
also modify this constraint to $k_T \ge b$. However, as long as $u' > 0$ everywhere (that is,
nonsatiation), $k_T \ge b$ will produce the same solution to the above maximization
problem as in the case in which $k_T = b$. The reader should be able to carry out the analysis
when the constraint $k_T = b$ or $k_T \ge b$ is explicitly taken into account.
14. We can certainly use other constraint qualifications. However, as long as every
function in the constraints is differentiable, we might as well use the classical rank
condition, for it is convenient to deal with the equality constraints (see Chapter 1, Section
D, in particular, Theorem 1.D.6).
15. "Net" means net of depreciation ($\mu$) and population growth ($n$). Recall (8).
16. We are heavily indebted to Brock [2] for the argument here. Our effort is simply
expository.
17. "lim inf" and "lim sup" are the abbreviations of limit inferior and limit superior,
respectively. Needless to say, lub stands for the least upper bound (supremum), and
glb stands for the greatest lower bound (infimum).

REFERENCES
1. Beale, R., and Koopmans, T. C., "Maximizing Stationary Utility in a Constant
Technology," SIAM Journal of Applied Mathematics, 17, September 1969.
2. Brock, W. A., "Sensitivity of Optimal Growth Paths with Respect to a Change in
Target Stocks," Zeitschrift für Nationalökonomie, Supp. 1, 1971 (originally presented
at the Purdue Meeting of the Kansas-Missouri Seminar on Quantitative
Economics, October 1969).
3. Brock, W. A., and Gale, D., "Optimal Growth under Factor Augmenting Progress,"
Journal of Economic Theory, vol. 1, October 1969.
4. Gale, D., "On Optimal Development in a Multisector Economy," Review of Economic
Studies, XXXIV, January 1967.
5. Gale, D., and Sutherland, W. R., "Analysis of a One Good Model of Economic
Development," in Mathematics of the Decision Sciences, Part 2, ed. by G. B. Dantzig
and A. F. Veinott, Providence, R. I., American Mathematical Society, 1968.
6. Koopmans, T. C., "On the Concept of Optimal Economic Growth," in The Econo-
metric Approach to Development Planning, Pontificiae Academiae Scientarum Varia,
Amsterdam, North-Holland, 1965.
7. Samuelson, P. A., "A Turnpike Refutation of the Golden Rule in a Welfare-Maximiz-
ing Many Year Plan," in Essays on the Theory of Optimal Economic Growth, ed. by K.
Shell, Cambridge, Mass., M.I.T. Press, 1967.
MULTISECTOR MODELS OF ECONOMIC
GROWTH

Section A
THE VON NEUMANN MODEL

a. INTRODUCTION
If we had to name the most important immediate forerunner of modern
mathematical economics, we would not hesitate to choose John von Neumann for
his model of economic growth presented in his 1937 paper [22]. Not only did this
paper provide the first explicit nonaggregate model in capital and growth theory,
but also it presented (1) the first explicit activity analysis model of production and
(2) the first abstract model of a competitive economy (together with Wald's model
in his papers published in 1935 and 1936), which led to the models of the 1950s (see
Chapter 2, Section E). In addition to the modern character of the model, the prob-
lems that von Neumann dealt with are also modern. In particular, he was con-
cerned with the path that gives the maximal rate of growth and the price implica-
tion of such a path. In the first part of this section, we present the model as
formulated now so as to convey its modern character. In subsection b, we prove his
major results in an elementary fashion.
Because of the innovative character of the paper, many papers have been
written on the von Neumann models, but we restrict ourselves to his growth model
as such and exclude its impact on other developments such as activity analysis and
the theory of competitive markets.
A major effort has been devoted to simplifying the proof of von Neumann's
major theorem in the original paper. To prove his theorem von Neumann used
Brouwer's fixed point theorem. Later, Loomis [15], Georgescu-Roegen [5],
Gale [3], and Karlin [11] all provided elementary proofs. The proof of the
existence of the balanced growth path with the maximal growth rate can be
separated from the proof of the price implication of such a path. The essence
of the proof of the existence is simply to utilize the compactness of the relevant
set of production. Our proof in subsection b also follows this line. In this con-
nection, we may note a formal similarity to the theory of nonnegative matrices in


which the existence of the Frobenius root can be proved by utilizing the compact-
ness of a certain set, although it can also be proved by using Brouwer's fixed
point theorem (see Chapter 4). For the proof of the price implication of the
maximal rate balanced growth path, Georgescu-Roegen, Gale, and Karlin made
direct use of the separation theorem of convex sets. We use, instead, the funda-
mental theorem of concave functions (Chapter 1, Section B). It is true that this
theorem is derived from the separation theorem, but the use of this fundamental
theorem will avoid many steps in the proof (which is hence much simpler) compared
with those necessary in the proof which uses the separation theorem directly.
Another major effort devoted to the von Neumann model is in the direction
of the generalization of the model and its results. Essentially there are three weak-
nesses in the original paper:

(i) There is no discussion of the irreducibility of the model, and there is no proof
that the value of output at each time period is positive.
(ii) The growth paths that von Neumann considered were restricted to the
"balanced growth paths," that is, paths in which all commodities grow at the
same rate.
(iii) There is no explicit treatment of consumption.

The major contributions to eliminating the first weakness were made by
Kemeny, Morgenstern, and Thompson [12], Thompson [21], and Gale [3].
The major breakthrough concerning the second point came in the form of the
turnpike theorem by DOSSO [2] and later followers such as Radner and
McKenzie. The turnpike theorem will be discussed in the next chapter. The
introduction of consumption was attempted by Morishima [18]. We discuss the
first and third weaknesses here under subsection c.
Consider an economy that transforms the stock of n commodities x, an n-
vector, into the stock of n commodities y, an n-vector, in one time period. This
time period can be considered as a unit "production period." This transformation
can be represented by (x, y), a point in $R^{2n}$. Here x is an "input" vector and y
is an "output" vector. There are certainly many input-output combinations that
are possible in the economy. The set of such input-output combinations, that
is, the collection of these (x, y)'s which are technically possible in the economy,
is called the technology set or the production set of the economy and is denoted by
T. Clearly, elements of x or y can be zero. The concept of T is essentially the
same as the concept of a production set in activity analysis (see Chapter 0, Section
C). A point of caution is that x and y here are explicitly vectors of the stocks of
commodities rather than the flows of commodities.
At first glance, the concept of a production process, especially the concept
of a production period, may seem absurd. There are two apparent difficulties. One
is the problem of the simultaneous inputs and outputs, and the other is the
problem of differences in production periods. Let us view these two difficulties
separately. The first difficulty is the assumption that in the beginning of each
period all the commodities are simultaneously fed into the process and at the
end of the period all the commodities are simultaneously produced. In reality, such
a simultaneous input or output rarely occurs. More commonly, various
commodities are fed into the process at different points in time and various
commodities are produced from time to time. However, such "successive" inputs and
outputs can be handled simply as follows. Consider a production process in which
inputs go into the production process at time $t_0$ and $t_1$ and outputs come out at
time $t_2$ and $t_3$. Then we can "decompose" such a production process into three
"steps." In other words, the commodities that are fed in at time $t_0$ produce
certain "intermediate commodities," and at $t_1$, new inputs, together with these
intermediate commodities, are fed in. At time $t_2$ certain commodities are
produced as final outputs together with the higher order intermediate goods. At time
$t_3$ only the final outputs are produced. Hence, including these intermediate
commodities in the classification of commodities, in each of the production periods
$(t_0, t_1)$, $(t_1, t_2)$, $(t_2, t_3)$, all the commodities are simultaneously fed in at the
beginning of the period and produced at the end of the period. We only consider
such "decomposed" processes.
The second apparent difficulty in the concept of a production period lies
in the difference in the actual time period from process to process. This can
simply be handled as follows. Suppose there are only three processes $z_1$, $z_2$, and
$z_3$ such that $z_1$ takes 30 days, $z_2$ takes 60 days, and $z_3$ takes 45 days. Take the
greatest common divisor of these three periods {30, 60, 45}, that is, 15, and
define this 15 days as a unit period of production. Then $z_1$ is decomposed into
two steps. In other words, at the end of the first period (that is, at the end of 15
days), the process produces certain "intermediate" (or unfinished) commodities,
and during the second period of production these intermediate commodities
are all transformed into the final commodities of the process $z_1$. Thus, at the end
of the second period (that is, at the end of 30 days), this process $z_1$ is completed.
Similarly, $z_2$ and $z_3$ are decomposed into four and three steps, respectively.
Hence, by choosing the unit of period properly and by properly including
the "intermediate" commodities in the list of commodities, we can avoid the two
difficulties and can proceed meaningfully to our analysis. Time in this economy
elapses with the succession of such production periods, and each production
process is in the technology set T. We assume that this set T is constant over
time. This implies, among other things, that there is no technological progress
in the economy.
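The choice of the unit period can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the 30/60/45-day example above; the dictionary name `durations` and the labels "z1", "z2", "z3" are our own illustrative choices.

```python
from functools import reduce
from math import gcd

# Durations (in days) of the three processes in the example above.
durations = {"z1": 30, "z2": 60, "z3": 45}

# The unit production period is the greatest common divisor of all durations.
unit_period = reduce(gcd, durations.values())

# Each process is split into (duration / unit period) one-period steps;
# every step except the last yields "intermediate" commodities only.
steps = {name: days // unit_period for name, days in durations.items()}

print(unit_period)  # 15
print(steps)        # {'z1': 2, 'z2': 4, 'z3': 3}
```

Each step of a decomposed process then counts as a separate one-period process, with the unfinished goods added to the list of commodities.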
In the original von Neumann presentation of the model, a concrete explanation
about the input-output vector (x, y) is given. Let $a_{ij}$ be the amount of the
ith commodity needed as an input in a one-unit operation of the jth process (or
"activity"). Let $b_{ij}$ be the amount of the ith commodity produced in a one-unit
operation of the jth process. Let there be n commodities and m processes in the
economy. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be n x m matrices whose entries are
nonnegative real numbers and possibly zero for some elements. Let $a^j$ be an n-vector
whose ith element is $a_{ij}$ and let $b^j$ be an n-vector whose ith element is $b_{ij}$. Some
of the $a_{ij}$'s and the $b_{ij}$'s can be zero. One unit operation of the jth process
transforms $a^j$ to $b^j$. Let z(t) be an m-vector whose jth element, $z_j(t)$, signifies the
level of operation of the jth process in period t; $z_j(t) \geq 0$ for all j and t. The
vector z(t) is called the activity level vector (or process level vector) in period t.
That x(t) (often denoted also as $x_t$) is an input vector in period t means that x(t)
can be written as

$$x(t) = \sum_{j=1}^{m} a^j z_j(t)$$

for some $z_j(t) \geq 0$ (j = 1, 2, ..., m). Similarly, that y(t) (or $y_t$) is an output vector
in period t means that it can be written as

$$y(t) = \sum_{j=1}^{m} b^j z_j(t)$$

for some $z_j(t) \geq 0$ (j = 1, 2, ..., m). The assumption that the technology set T is
constant implies that these $a_{ij}$'s and $b_{ij}$'s are constant over time. Allow "free
disposability"; that is, if the process (x, y) is in the technology set, then there is
a nonnegative m-vector z such that

$$x \geq A \cdot z \quad \text{and} \quad 0 \leq y \leq B \cdot z$$

Hence, the von Neumann technology set $T_N$ can be written as

$$T_N \equiv \{(x, y): x \geq A \cdot z,\ 0 \leq y \leq B \cdot z \text{ for some } z \geq 0\}$$

That $T_N$ is constant over time can be expressed by saying that A and B are constant
over time. Note that the set $T_N$ is a convex polyhedral cone with its vertex at
the origin.
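As a minimal sketch of this description of the technology set, the following Python fragment computes the input and output vectors generated by a given activity level vector and checks whether a pair (x, y) is certified by that vector under free disposability. The 2 x 2 matrices and the function names `mat_vec` and `in_technology_set` are hypothetical and serve only as an illustration.

```python
def mat_vec(M, z):
    """Multiply an n x m matrix (given as a list of rows) by an m-vector."""
    return [sum(m_ij * z_j for m_ij, z_j in zip(row, z)) for row in M]

def in_technology_set(x, y, A, B, z, tol=1e-12):
    """Check that z certifies (x, y): z >= 0, x >= A z and 0 <= y <= B z."""
    if any(z_j < 0 for z_j in z):
        return False
    need = mat_vec(A, z)   # inputs required at activity levels z
    made = mat_vec(B, z)   # outputs produced at activity levels z
    enough_input = all(x_i >= n_i - tol for x_i, n_i in zip(x, need))
    feasible_output = all(0 <= y_i <= m_i + tol for y_i, m_i in zip(y, made))
    return enough_input and feasible_output

# Hypothetical economy with two commodities and two processes.
A = [[1.0, 0.0],
     [0.0, 1.0]]
B = [[0.0, 2.0],
     [2.0, 0.0]]
z = [1.0, 3.0]

x = mat_vec(A, z)   # [1.0, 3.0]
y = mat_vec(B, z)   # [6.0, 2.0]
print(in_technology_set(x, y, A, B, z))                     # True
print(in_technology_set([2.0, 3.0], [5.0, 0.0], A, B, z))   # True (free disposal)
print(in_technology_set(x, [7.0, 2.0], A, B, z))            # False
```

Note that the function only verifies a given certificate z; deciding membership for an arbitrary (x, y) would require searching over z, for example by linear programming.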
The following assumptions are then imposed by von Neumann.
(AN-1) $a_{ij} \geq 0$, $b_{ij} \geq 0$ for all i and j.
(AN-2) A + B > 0, that is, $a_{ij} + b_{ij} > 0$ for all i and j.
The second assumption is rather unrealistic, for if (AN-2) holds, $a_{ij} = 0$ implies
$b_{ij} > 0$, and $b_{ij} = 0$ implies $a_{ij} > 0$, which means that every commodity is either
used as an input or produced as an output in every production process. It is quite
possible that certain commodities may not be involved in some processes either
as inputs or as outputs. About this point, Morishima remarked that "in the
process of producing sewing machines, a banana is neither produced as a by-
product, nor is it used as a raw material" ([19], p. 20).
Kemeny, Morgenstern, and Thompson [12] modified the above assumptions
as follows:
(A-i) $a_{ij} \geq 0$ and $b_{ij} \geq 0$ for all i and j.
(A-ii) For any j, there exists at least one i such that $a_{ij} > 0$.
(A-iii) For any i, there exists at least one j such that $b_{ij} > 0$.
Assumption (A-ii) means that every process uses some commodity as an in-
put, which implies "the impossibility of the land of Cockaigne." Assumption
(A-iii) means that every commodity is producible by some process. Assumptions
(A-ii) and (A-iii) modify (AN-2) in a significant way.
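The difference between (AN-2) and (A-i) to (A-iii) can be checked mechanically on a small example. The sketch below uses a hypothetical economy with three commodities and two processes in which the third commodity plays no role in the first process; the function names are our own.

```python
def satisfies_kmt(A, B):
    """Check (A-i) to (A-iii) for n x m matrices given as lists of rows."""
    n, m = len(A), len(A[0])
    nonnegative = all(A[i][j] >= 0 and B[i][j] >= 0
                      for i in range(n) for j in range(m))
    # (A-ii): every process j uses at least one commodity as an input.
    uses_input = all(any(A[i][j] > 0 for i in range(n)) for j in range(m))
    # (A-iii): every commodity i is produced by at least one process.
    producible = all(any(B[i][j] > 0 for j in range(m)) for i in range(n))
    return nonnegative and uses_input and producible

def satisfies_an2(A, B):
    """Check (AN-2): a_ij + b_ij > 0 for all i and j."""
    return all(a + b > 0 for row_a, row_b in zip(A, B)
               for a, b in zip(row_a, row_b))

# Hypothetical data: rows are commodities, columns are processes.
# Commodity 3 is neither used nor produced by process 1.
A = [[1, 0],
     [0, 1],
     [0, 1]]
B = [[0, 2],
     [2, 0],
     [0, 1]]

print(satisfies_an2(A, B))   # False
print(satisfies_kmt(A, B))   # True
```

The example fails (AN-2) only because of the single pair (i, j) = (3, 1), which is exactly the "banana and sewing machine" situation.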
Karlin [11], in his formulation of the von Neumann model, did not use
the von Neumann specification of the technology set in terms of the matrices
A and B. Hence he did not adopt the above specification of assumptions in terms
of the $a_{ij}$'s and the $b_{ij}$'s. Rather, he abstracted the essential nature of the von
Neumann technology by supposing that the technology set T of the economy is
specified by the following four assumptions.
(A-1) T is a closed convex cone in $\Omega^{2n}$, the nonnegative orthant of $R^{2n}$.
(A-2) (Free disposability) $(x, y) \in T$, $x' \geq x$, and $0 \leq y' \leq y$ imply
$(x', y') \in T$.
(A-3) (The impossibility of the land of Cockaigne) $(0, y) \in T$ implies y = 0.
(A-4) (The "productiveness") For any i, there exists an $(x, y) \in T$ such that
$y_i > 0$ (that is, every commodity is producible).
REMARK: As remarked in the exposition of activity analysis (Chapter 0,
Section C), "T is a convex cone" implies additivity, proportionality (that is,
complete divisibility and constant returns to scale), and the possibility of
inaction.
REMARK: In view of (A-1), (A-4) is equivalent to the following:
(A-4') There exists an $(x, y) \in T$ such that y > 0.
It is easy to see that the von Neumann technology set $T_N$ with (A-i) to (A-iii)
is a special case of the technology set T with (A-1) to (A-4). To see this, note that
(A-i) together with the definition of $T_N$ imply (A-1) and (A-2); (A-ii) implies
(A-3); and (A-iii) implies (A-4).
The economy transforms the stock of commodities x(t) at the beginning of period
t into the stock of commodities y(t) by spending one time period, such that
$(x(t), y(t)) \in T$, where T satisfies (A-1) to (A-4). Thus the movement of the economy
can be depicted schematically by Figure 6.1.
[Figure 6.1. Economic Growth in the von Neumann Economy. Schematic: in each of
the periods (t-1), t, and (t+1), the technology T transforms the input stock x(t)
into the output stock y(t), with $(x(t), y(t)) \in T$ for all t.]

Von Neumann restricted his consideration to the set of balanced growth
paths, that is, the paths of (x(t), y(t)) such that $(x(t), y(t)) \in T$ and
$y(t) = \alpha x(t)$ for some $\alpha > 0$. He allows $\alpha < 1$ so that the economy may decay
rather than grow. The question then is whether there exists a balanced growth path
that has the maximal rate of growth and whether such a path can be supported
under a certain economic system. The first question to consider is the problem
of maximizing $\alpha$ over the set of balanced growth paths, and the second is the
problem of the economic interpretation (in particular, the price implication) of
the balanced growth path which gives the maximal rate. The reader may realize
that this question is analogous to either of the following two problems in modern
economic literature:

(i) Activity analysis: the existence of an efficient point in a production set and
the assertion that every efficient point can be realized as a profit maximization
point (Theorem 0.C.3).
(ii) Welfare economics: the existence of a Pareto optimal point and the assertion
that every Pareto optimal point can be supported by competitive pricing
(Theorem 1.C.2).

b. MAJOR THEOREMS
We now proceed to the investigation of the two major problems stated above:
(1) the existence of the balanced growth path with a maximal rate, and (2) the
price implication of such a path.

Definition: Define a real-valued function $\alpha(x, y)$ on T as

$$\alpha(x, y) \equiv \max_{\alpha} \{\alpha: y \geq \alpha x\}, \quad \text{where } (x, y) \neq 0$$

The value of $\alpha(x, y)$ is called the rate of expansion of the process (x, y) in T.
REMARK: For (0, 0), $\alpha(x, y)$ cannot be defined. Note also that, owing to
(A-3), $(0, y) \notin T$ if $y \neq 0$. Hence $\alpha(x, y)$ is not defined on such points under
(A-3). Thus $\alpha(x, y)$ is not defined for (0, 0) and (0, y) with $y \neq 0$. Since
$x \geq 0$ and $y \geq 0$, this means that $\alpha(x, y)$ is defined only when $x \neq 0$. This
implies that $\alpha(x, y) \geq 0$. The concept of "rate of expansion" is illustrated
in Figure 6.2. As an example of $\alpha(x, y) = 0$, consider x = (0, 1) and y =
(2, 0). Note that $\alpha(x, y)$ may be less than 1; hence the process can produce
"decay" rather than "expansion." Writing $x = (x_1, x_2, \ldots, x_n)$ and
$y = (y_1, y_2, \ldots, y_n)$, we can also write the above definition of the expansion
rate as follows. Let $\alpha_i(x, y)$, i = 1, 2, ..., n, be defined as

$$\alpha_i(x, y) \equiv \begin{cases} y_i / x_i & \text{if } x_i > 0 \\ \infty & \text{if } x_i = 0 \text{ and } y_i > 0 \\ \text{undefined} & \text{if } x_i = 0 \text{ and } y_i = 0 \end{cases}$$

[Figure 6.2. An Illustration of the Rate of Expansion. Axes: Commodity 1 and
Commodity 2. If $y' = \alpha x$, then $\alpha = \alpha(x, y)$.]

Then

$$\alpha(x, y) = \min_i \alpha_i(x, y) \quad \text{for } x \neq 0$$

As is clear from Figure 6.2, the concept of the rate of expansion is that of
a "balanced growth path," that is, a ray from the origin. Given (x, y), if
y is not on the ray from the origin passing through x, y is brought into such
a ray as illustrated in Figure 6.2 ($y \to y'$).
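The componentwise form of the definition translates directly into code. The following Python sketch (the function name `expansion_rate` is ours) computes the rate of expansion as the minimum of the defined componentwise ratios, skipping the components for which the ratio is undefined, and reproduces the example x = (0, 1), y = (2, 0) of the remark.

```python
import math

def expansion_rate(x, y):
    """Rate of expansion: the largest alpha with y >= alpha * x componentwise.

    x and y are nonnegative vectors of equal length, with x not the zero vector.
    """
    if all(x_i == 0 for x_i in x):
        raise ValueError("the rate of expansion is undefined when x = 0")
    ratios = []
    for x_i, y_i in zip(x, y):
        if x_i > 0:
            ratios.append(y_i / x_i)
        elif y_i > 0:
            ratios.append(math.inf)   # x_i = 0 and y_i > 0: no restriction
        # x_i = 0 and y_i = 0: the ratio is undefined and the component is skipped
    return min(ratios)

print(expansion_rate((0, 1), (2, 0)))   # 0.0
print(expansion_rate((1, 2), (3, 4)))   # 2.0
print(expansion_rate((2, 4), (1, 1)))   # 0.25
```

The last call gives a rate below 1, that is, a process that produces "decay" rather than "expansion."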
Since $\alpha(x, y)$ is a function of (x, y), the value of $\alpha(x, y)$, the expansion
rate, varies from process to process. The following theorem asserts that there
exists a process $(\hat{x}, \hat{y})$ in T which gives the maximum rate of expansion. The
essential argument for this existence theorem is the standard compactness
argument if we assume that $\alpha(x, y)$ is continuous.

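Theorem 6.A.1: Under the assumptions (A-1) to (A-4), there exists an $(\hat{x}, \hat{y}) \in T$
such that $\hat{y} = \hat{\alpha}\hat{x}$, where $\hat{\alpha} = \alpha(\hat{x}, \hat{y})$, and $\hat{\alpha} \geq \alpha(x, y)$ for all $(x, y) \in T$ with
$x \neq 0$. Also we have $0 < \hat{\alpha} < \infty$.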

PROOF: Let $\bar{T}$ be the intersection of T with the unit sphere (with the center
at the origin) in $R^{2n}$. Since T is closed, $\bar{T}$ is also closed, which in turn implies
that $\bar{T}$ is compact. Suppose $\alpha(x, y)$ is continuous for all points of T (or
$\bar{T}$). Then $\alpha(x, y)$ achieves a maximum on $\bar{T}$ by Weierstrass' theorem
(Theorem 0.A.18). However, as Glycopantis [6] pointed out, $\alpha(x, y)$ can be
discontinuous (for such an example, see [6], p. 296). This, in essence, is
due to the fact that the domains of the functions $\alpha_i(x, y)$ are not identical.
To avoid this difficulty, define the subset $T_i$ of T as

$$T_i \equiv \{(x, y) \in T: \alpha(x, y) = \alpha_i(x, y)\}$$

Clearly, $T \setminus \{0\} = \bigcup_i T_i$. Let $\bar{T}_i \equiv T_i \cup \{0\}$, and let $(x^q, y^q)$ be a convergent
sequence in $T_i$ with limit $(x^0, y^0)$. Then, since $y_k^q / x_k^q \geq y_i^q / x_i^q$ for $(x^q, y^q) \neq 0$
implies that $y_k^0 / x_k^0 \geq y_i^0 / x_i^0$, we have $\alpha(x^0, y^0) = \alpha_i(x^0, y^0)$. That is,
$(x^0, y^0) \in \bar{T}_i$ so that the $\bar{T}_i$'s are closed cones. Also the functions $\alpha_i(x, y)$ are
continuous except when they are undefined. Let $T_i^*$ be the intersection of $\bar{T}_i$
with the unit sphere in $R^{2n}$. Then the $T_i^*$'s are compact. Hence, from
Weierstrass' theorem, $\alpha_i(x, y)$ achieves its maximum $b_i$ on $T_i^*$. Choose the maximum
of the $b_i$'s over i, which clearly exists. Thus we have shown that there exists
an $(x^*, y^*) \in \bar{T}$ such that $\alpha(x^*, y^*) \geq \alpha(x, y)$ for all $(x, y) \in \bar{T}$ (since
$\bigcup_i T_i = T \setminus \{0\}$). Write $\hat{\alpha} \equiv \alpha(x^*, y^*)$. Since T is a cone, for any $(x, y) \in T$ with
$x \neq 0$, there exists an $(\bar{x}, \bar{y})$ in $\bar{T}$ such that $(\lambda\bar{x}, \lambda\bar{y}) = (x, y)$ for some $\lambda > 0$,
$\lambda \in R$ [note $(\bar{x}, \bar{y}) = (x, y) / \|(x, y)\|$]. However, by the definition of $\alpha(x, y)$, we
have $\alpha(x, y) = \alpha(\lambda\bar{x}, \lambda\bar{y}) = \alpha(\bar{x}, \bar{y})$. Since $\alpha(x^*, y^*) \geq \alpha(\bar{x}, \bar{y})$ for
all $(\bar{x}, \bar{y}) \in \bar{T}$ with $\bar{x} \neq 0$, we then obtain that $\alpha(x^*, y^*) \geq \alpha(x, y)$ for all
$(x, y) \in T$ with $x \neq 0$. From the free disposability assumption (A-2),
we can find an $(\hat{x}, \hat{y})$ in T such that $\alpha(\hat{x}, \hat{y}) = \alpha(x^*, y^*)$ and $\hat{y} = \hat{\alpha}\hat{x}$. Thus
$\hat{\alpha} = \alpha(\hat{x}, \hat{y}) \geq \alpha(x, y)$ for all $(x, y) \in T$ with $x \neq 0$, and $\hat{y} = \hat{\alpha}\hat{x}$. Finally, we show
that $0 < \hat{\alpha} < \infty$. From (A-4) there exists an (x, y) in T such that y > 0.
Since $\alpha(x, y) > 0$ and $\hat{\alpha} \geq \alpha(x, y)$, $\hat{\alpha} > 0$. $\hat{\alpha} < \infty$ clearly follows from (A-3)
and $\hat{y} = \hat{\alpha}\hat{x}$. (Q.E.D.)
REMARK: In the above proof (which is in essence due to [6]), we observe
that the possible discontinuities of the function $\alpha(x, y)$ make it necessary
to complicate the "compactness" proof. Note that in the above proof the
continuity of $\alpha(x, y)$ is neither established nor utilized. The proof relies
on the continuity of $\alpha_i(x, y)$.
REMARK: We may recall that the "failure" of the "compactness" proof
(as a result of the lack of continuity) also appeared in the proof of the
Frobenius theorem (Theorem 4.B.1). The reader may, therefore, wish to
consider Theorem 6.A.1 and the Frobenius theorem under a unified frame-
work.
REMARK: For the case of n = 1, that is, a one commodity economy, the
above proof can be illustrated by Figure 6.3.
REMARK: The definition of $\alpha(x, y)$ reduces the rate of expansion to the
rate in the corresponding balanced growth path, as was illustrated in the
definition of $\alpha(x, y)$. Hence Theorem 6.A.1 simply asserts that there exists
a balanced growth path which maximizes the rate of expansion in the set of
all the balanced growth paths in the economy. We call such a path the
von Neumann path. It is important to notice that the von Neumann path is
not necessarily unique. In the above illustrations, the von Neumann path
was supposed to constitute a unique ray from the origin. But, as McKenzie
[16] emphasized in connection with the turnpike theorem, this is not
necessarily the case. The set of von Neumann paths, in general, constitutes a
facet.
[Figure 6.3. An Illustration of the Proof of Theorem 6.A.1.]

REMARK: Theorem 6.A.1 essentially asserts the existence of a solution to the
following nonlinear programming problem and states some properties of its
solution.

$$\max_{(x, y)}\ \alpha \quad \text{subject to } y - \alpha x \geq 0 \text{ and } (x, y) \in T$$

Since $\alpha$ is a function of (x, y), neither the constraints nor the maximand
function is linear. The solution of the above problem is denoted by $\hat{\alpha} =
\alpha(\hat{x}, \hat{y})$. If we introduce the von Neumann technology, $y = B \cdot z$ and $x =
A \cdot z$, explicitly, then the above problem is written:

$$\max_{z}\ \alpha \quad \text{subject to } [B - \alpha A] \cdot z \geq 0 \text{ and } z \geq 0,\ z \neq 0$$

As Gale noted ([4], p. 312), this problem yields the following problem,
which appears strikingly analogous to the dual problem of linear programming.

$$\min_{p}\ \beta \quad \text{subject to } p \cdot [B - \beta A] \leq 0 \text{ and } p \geq 0,\ p \neq 0$$

Then using exactly the same argument as in the proof of Theorem 6.A.1, we
show that there exists a solution $\hat{\beta}$ for the above problem. In general, we
cannot, however, show that $\hat{\beta} = \hat{\alpha}$, although it can be shown that $\hat{\beta} \leq \hat{\alpha}$. The
above "dual problem" can have various economic interpretations. If
$p \cdot a^j > 0$, the ratio $p \cdot b^j / p \cdot a^j$ is meaningful and signifies return divided by
cost, a kind of "profit ratio" of the jth activity. The inequality $p \cdot [B - \beta A]
\leq 0$ means $(p \cdot b^j) / (p \cdot a^j) \leq \beta$ whenever $p \cdot a^j > 0$. In other words, $\beta$ is the
maximum profit rate. In a competitive economy with free entry, competition
forces this to a minimum. A second interpretation of $\beta$ is as an interest
factor. Suppose each activity is financed by borrowing and suppose also that
one dollar borrowed at the beginning of the period is paid back by $\beta$ dollars.
Then $p \cdot b^j - \beta p \cdot a^j$ is the profit of the jth activity and thus $p \cdot [B - \beta A] \leq 0$
indicates that no activity will make a profit, the well-known principle in a
competitive economy.
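For a technology with only two processes, the maximization problem of this remark can be explored numerically by a crude grid search over normalized activity vectors z = (s, 1 - s). The sketch below is only an illustration under our own assumptions: the 2 x 2 matrices are hypothetical, the function names are ours, and a grid search is not the method of proof used in the text (a serious computation would use linear programming together with a search over the expansion factor).

```python
def mat_vec(M, z):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, z)) for row in M]

def balanced_rate(A, B, z):
    """Largest alpha with (B - alpha A) z >= 0 for a given activity vector z."""
    x, y = mat_vec(A, z), mat_vec(B, z)
    ratios = [y_i / x_i for x_i, y_i in zip(x, y) if x_i > 0]
    if not ratios:
        raise ValueError("the activity vector uses no input at all")
    return min(ratios)

def max_expansion_two_processes(A, B, grid=1000):
    """Grid search over z = (s, 1 - s), s in [0, 1], for the best expansion factor."""
    best_alpha, best_z = -1.0, None
    for k in range(grid + 1):
        s = k / grid
        z = [s, 1.0 - s]
        alpha = balanced_rate(A, B, z)
        if alpha > best_alpha:
            best_alpha, best_z = alpha, z
    return best_alpha, best_z

# Hypothetical technology: process 1 turns one unit of commodity 1 into two
# units of commodity 2, and process 2 does the reverse.
A = [[1.0, 0.0],
     [0.0, 1.0]]
B = [[0.0, 2.0],
     [2.0, 0.0]]

alpha_hat, z_hat = max_expansion_two_processes(A, B)
print(alpha_hat, z_hat)   # 2.0 [0.5, 0.5]
```

In this example the price vector p = (1, 1) gives a profit ratio of 2 for both processes, so the minimal interest factor of the "dual problem" coincides with the maximal expansion factor; as noted above, this equality need not hold in general.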
Returning to the original model a la Karlin, we prove the following theorem,
which gives the price implication of Theorem 6.A.1 and increases our understand-
ing of the above "dual problem."

Theorem 6.A.2: Under assumptions (A-1) to (A-4), there exists a p such that
$p \geq 0$, $p \neq 0$, and $p \cdot (y - \hat{\alpha}x) \leq 0$ for all $(x, y) \in T$.
PROOF: By definition of $\hat{\alpha}$, there exists no (x, y) in T such that $y - \hat{\alpha}x > 0$
[for if there exists an $(x', y') \in T$ such that $y' - \hat{\alpha}x' > 0$, then there exists
an $\epsilon > 0$ such that $y' - (\hat{\alpha} + \epsilon)x' > 0$, so that $\hat{\alpha}$ is not the maximum expansion
ratio, a contradiction]. Since T is convex and the function $f(x, y) \equiv y - \hat{\alpha}x$
is concave (in fact linear), we can apply the fundamental theorem of concave
functions, that is, Theorem 1.B.2. From this theorem, there exists a $p \geq 0$, $p \neq 0$,
such that $p \cdot f(x, y) \leq 0$ for all $(x, y) \in T$. (Q.E.D.)
REMARK: We can prove Theorem 6.A.2 directly from the separation
theorem by considering two convex sets: $\{y - \hat{\alpha}x: (x, y) \in T, \|(x, y)\| \leq 1\}$
and the interior of $\Omega^n$. For such a proof, see Karlin [11], pp. 339-340. We
note, however, that Theorem 6.A.2 follows immediately once we recall
the fundamental theorem (Theorem 1.B.2). The above proof illustrates the
usefulness of the fundamental theorem.
REMARK: As we discussed above in connection with the "dual problem,"
$(p \cdot y) / (p \cdot x)$ is the "profit ratio" of process (x, y), whenever $p \cdot x > 0$.
Theorem 6.A.2 states that this does not exceed the maximum expansion
rate. Note that this theorem also implies that $\hat{\beta} \leq \hat{\alpha}$. Gale ([4], p. 316)
gave an example of $\hat{\beta} < \hat{\alpha}$.
The following theorem, originally due to von Neumann [22], is now an
immediate corollary of Theorems 6.A.1 and 6.A.2.

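Theorem 6.A.3 (von Neumann): Let A and B respectively be the input and the output
matrices in the von Neumann technology. Then under assumptions (A-1) to (A-4)
there exist $\hat{\alpha} > 0$, $\hat{z} \geq 0$, and $p \geq 0$ (with $\hat{z} \neq 0$ and $p \neq 0$), where $\hat{\alpha} \in R$,
$\hat{z} \in R^m$, and $p \in R^n$, such that
(i) $[B - \hat{\alpha}A] \cdot \hat{z} \geq 0$
(ii) $p \cdot [B - \hat{\alpha}A] \leq 0$
(iii) $p \cdot [B - \hat{\alpha}A] \cdot \hat{z} = 0$

PROOF: Let $(\hat{x}, \hat{y})$ be the process which gives rise to the maximum expansion
rate $\hat{\alpha}$ in Theorem 6.A.1; then there exists a $\hat{z} \geq 0$ such that $\hat{y} = \hat{\alpha}\hat{x}$, $\hat{y} \leq B \cdot \hat{z}$,
and $\hat{x} \geq A \cdot \hat{z}$. Hence $B \cdot \hat{z} \geq \hat{\alpha}A \cdot \hat{z}$, or (i) is shown. Since $\hat{\alpha} > 0$ and $\hat{y} \neq 0$,
we have $\hat{z} \neq 0$. To show (ii) of Theorem 6.A.3, recall Theorem 6.A.2. Then we have
$p \cdot [B - \hat{\alpha}A] \cdot z \leq 0$ with $p \geq 0$, $p \neq 0$, for all $z \geq 0$, so that $p \cdot [B - \hat{\alpha}A] \leq 0$.
Thus (ii) is shown. To show (iii), note that (i) and $p \geq 0$ imply
$p \cdot [B - \hat{\alpha}A] \cdot \hat{z} \geq 0$, so that $p \cdot [B - \hat{\alpha}A] \cdot \hat{z} = 0$, in view of (ii).
(Q.E.D.)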
REMARK: The relation (iii) of Theorem 6.A.3 states that if the jth activity
yields negative profit, the corresponding activity level $\hat{z}_j$ is zero, and that if
the ith commodity is expanding at a rate greater than $\hat{\alpha}$, being "oversupplied,"
its price $p_i$ is zero.
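The three relations of Theorem 6.A.3 are easy to verify numerically for a candidate triplet. The sketch below checks them for a hypothetical two-commodity, two-process technology; the function name `check_von_neumann_triplet`, the tolerance, and the numbers are our own illustrative assumptions.

```python
def check_von_neumann_triplet(A, B, alpha, z, p, tol=1e-9):
    """Return the truth values of relations (i), (ii), (iii) of Theorem 6.A.3."""
    n, m = len(A), len(A[0])
    # Excess supply of commodity i: (B z)_i - alpha (A z)_i.
    excess = [sum((B[i][j] - alpha * A[i][j]) * z[j] for j in range(m))
              for i in range(n)]
    # Profit of process j at interest factor alpha: p.b^j - alpha p.a^j.
    profit = [sum(p[i] * (B[i][j] - alpha * A[i][j]) for i in range(n))
              for j in range(m)]
    cond_i = all(e >= -tol for e in excess)
    cond_ii = all(pi <= tol for pi in profit)
    cond_iii = abs(sum(p[i] * excess[i] for i in range(n))) <= tol
    return cond_i, cond_ii, cond_iii

# Hypothetical technology and candidate equilibrium.
A = [[1.0, 0.0],
     [0.0, 1.0]]
B = [[0.0, 2.0],
     [2.0, 0.0]]
z = [0.5, 0.5]
p = [0.5, 0.5]

print(check_von_neumann_triplet(A, B, 2.0, z, p))   # (True, True, True)
print(check_von_neumann_triplet(A, B, 3.0, z, p))   # (False, True, False)
print(check_von_neumann_triplet(A, B, 1.0, z, p))   # (True, False, False)
```

With an expansion factor that is too large, relation (i) fails because some commodity is in short supply; with one that is too small, relation (ii) fails because some process earns a positive profit.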
In order to increase our understanding of Theorem 6.A.3, let us consider
more closely the original von Neumann model [22] in terms of the matrix [A, B].
Let z(t) and p(t) denote the activity level vector in period t and the price vector
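in period t, respectively. Let $\beta(t)$ be the interest factor in period t. Then we have
the following "equilibrium" relations.
(i) $A \cdot z(t + 1) \leq B \cdot z(t)$
(ii) $p(t + 1) \cdot [B \cdot z(t) - A \cdot z(t + 1)] = 0$
(iii) $\beta(t) p(t) \cdot A \geq p(t + 1) \cdot B$
(iv) $[\beta(t) p(t) \cdot A - p(t + 1) \cdot B] \cdot z(t) = 0$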

The interpretation of the above four relations is as follows:

(i) It is impossible to consume more of each commodity (in the production
process) than is available.
(ii) If a commodity is in excess supply, its price becomes zero.
(iii) The production process will make no positive profit and the maximum profit
is zero.
(iv) If a process yields negative profits, it will not be used.

Now assume balanced growth in the sense that

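$z(t + 1) = \alpha z(t)$, p(t + 1) = p(t) = p (constant) for all t
and
$\beta(t) = \beta$ (constant) for all t

Then writing $z(0) \equiv z$, we can rewrite the above four relations as:

(i') $[B - \alpha A] \cdot z \geq 0$
(ii') $p \cdot [B - \alpha A] \cdot z = 0$
(iii') $p \cdot [B - \beta A] \leq 0$
(iv') $p \cdot [B - \beta A] \cdot z = 0$
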
We may call the quadruplet $[z, p, \alpha, \beta]$ which satisfies the above four relations
the von Neumann quadruplet. Theorem 6.A.3 asserts the existence of such a
quadruplet with $\alpha = \beta > 0$ and $z \geq 0$, $p \geq 0$. It is important to note that in this
interpretation of the model, $\alpha$ is not defined as the maximum expansion factor. An
interpretation in terms of the dual maximization and minimization problems that
Gale conceived is not intended here. Strictly speaking, $\beta$ here is not interpreted as
the solution of Gale's minimization problem. The model (i') to (iv') describes the
workings of a closed economy as interpreted above with the fundamental
assumption of balanced growth. That is, the interpretation here is that of a
descriptive model and not that of a planning model. However, it should also be noted that
Theorem 6.A.3 provides a "planning" interpretation. In other words, $\alpha$ in the
von Neumann quadruplet can be interpreted as the maximum expansion factor.
This means that if the economy is organized as described by (i') to (iv') with
the attached interpretations, it will maximize the expansion factor $\alpha$. This result
is analogous to a result in the theory of competitive equilibrium, namely, that
every competitive equilibrium realizes a Pareto optimum.

c. TWO REMARKS
Irreducibility
In Theorem 6.A.3, we proved the existence of $[\hat{\alpha}, p, \hat{z}]$ with $\hat{\alpha} > 0$, $p \geq 0$,
and $\hat{z} \geq 0$, such that

$$[B - \hat{\alpha}A] \cdot \hat{z} \geq 0, \quad p \cdot [B - \hat{\alpha}A] \leq 0, \quad p \cdot [B - \hat{\alpha}A] \cdot \hat{z} = 0$$

We may call this triplet $[\hat{\alpha}, p, \hat{z}]$ the von Neumann equilibrium of the [A, B]
economy. However, in the conclusion of Theorem 6.A.3, or the definition of
the von Neumann equilibrium, the possibility that $p \cdot B \cdot \hat{z} = 0$ is not precluded.
The condition $p \cdot B \cdot \hat{z} = 0$ means (intuitively) that the total value of all
commodities produced in the von Neumann equilibrium is zero. This is rather
annoying. Gale [3] apparently realized this and considered the "regular" von
Neumann model, where the "regularity" is defined in terms of $B \cdot \hat{z} > 0$. Clearly
$B \cdot \hat{z} > 0$, together with $p \geq 0$, $p \neq 0$, implies $p \cdot B \cdot \hat{z} > 0$. Thompson [21] and Kemeny,
Morgenstern, and Thompson [12] also realized this and explicitly introduced
the condition $p \cdot B \cdot \hat{z} > 0$ into the definition of the von Neumann equilibrium.
A natural question now is: What then is the condition which would guarantee
that $p \cdot B \cdot \hat{z} > 0$? Gale [3] and [4] introduced "irreducibility," the concept
which is analogous to the indecomposability of the Frobenius theorem. It turns
out that this concept of "irreducibility" plays an important role in the above
question.

Definition: The set of indices I, a subset of N = {1, 2, ..., n}, is called an
independent subset if it is possible to produce every commodity $i \in I$ without consuming
any commodity $i \in I'$, where $I' \equiv N \setminus I$. That is, I is independent if there exists J, a
subset of M = {1, 2, ..., m}, such that $a_{ij} = 0$ for $i \in I'$ and $j \in J$, and $b_{ij} > 0$
for all $i \in I$ and for some $j \in J$. The input-output matrix [A, B] is said to be
irreducible if $I' = \emptyset$ for every independent subset I, that is, if N itself is the only
independent subset.
REMARK: Hence, if the model is reducible, there is a certain permutation
of rows and columns of A (that is, renumbering of indices) such that A is
decomposed as

$$A = \begin{array}{c|cc} & J & J' \\ \hline I & A_{11} & A_{12} \\ I' & 0 & A_{22} \end{array}$$
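For small models, irreducibility in the sense of the above definition can be checked by brute force. The following Python sketch enumerates the proper nonempty subsets I of commodities; for each I it takes the largest admissible set J (all processes that use no commodity outside I) and tests whether every commodity in I is produced by some process in J. Indices are zero-based, and the function names and example matrices are our own illustrative assumptions.

```python
from itertools import combinations

def is_independent(I, A, B):
    """Check whether the index set I (zero-based commodities) is independent."""
    n, m = len(A), len(A[0])
    outside = [i for i in range(n) if i not in I]
    # Largest admissible J: processes that consume no commodity outside I.
    J = [j for j in range(m) if all(A[i][j] == 0 for i in outside)]
    # Every commodity in I must be produced by some process in J.
    return all(any(B[i][j] > 0 for j in J) for i in I)

def is_irreducible(A, B):
    """[A, B] is irreducible if no proper nonempty subset is independent."""
    n = len(A)
    for size in range(1, n):
        for I in combinations(range(n), size):
            if is_independent(set(I), A, B):
                return False
    return True

# Hypothetical example 1: each commodity is produced only from the other one.
A1 = [[1, 0], [0, 1]]
B1 = [[0, 2], [2, 0]]
print(is_irreducible(A1, B1))   # True

# Hypothetical example 2: commodity 1 can be reproduced from itself alone.
A2 = [[1, 1], [0, 1]]
B2 = [[2, 0], [0, 2]]
print(is_irreducible(A2, B2))   # False
```

Using the largest admissible J is without loss of generality, since enlarging J can only make it easier to find a producing process for each commodity in I. The enumeration is exponential in n and is meant only for small illustrative models.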

Theorem 6.A.4: Let A and B, respectively, be the input and the output matrices in
the von Neumann technology. Suppose there exist $\hat{\alpha} > 0$, $\hat{z} \geq 0$, $p \geq 0$ (with $\hat{z} \neq 0$
and $p \neq 0$), $\hat{\alpha} \in R$, $\hat{z} \in R^m$, and $p \in R^n$ such that (i), (ii), and (iii) of Theorem 6.A.3
hold. Then if the input-output matrix [A, B] is irreducible, we have $p \cdot B \cdot \hat{z} > 0$.
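PROOF: Renumber j and partition $\hat{z}$ such that $\hat{z} = (\hat{z}^0, \hat{z}^1)$, where $\hat{z}^0 > 0$
and $\hat{z}^1 = 0$. Let J be the set of indices where $\hat{z}^0 > 0$ (so that $J' = M \setminus J$ is
the set of indices where $\hat{z}^1 = 0$). Let $b_i$ be the m-vector whose jth element is
$b_{ij}$. Let I be the set of indices i such that $b_i \cdot \hat{z} > 0$ (so that $I' = N \setminus I$ is the
set of indices i where $b_i \cdot \hat{z} = 0$). Renumbering i and j, A and B respectively
can be partitioned as

$$A = \begin{array}{c|cc} & J & J' \\ \hline I & A_{11} & A_{12} \\ I' & A_{21} & A_{22} \end{array} \quad \text{and} \quad B = \begin{array}{c|cc} & J & J' \\ \hline I & B_{11} & B_{12} \\ I' & B_{21} & B_{22} \end{array}$$

By assumption of the theorem, $B \cdot \hat{z} \geq \hat{\alpha}A \cdot \hat{z}$, which, in turn, means $B_{11} \cdot \hat{z}^0 \geq
\hat{\alpha}A_{11} \cdot \hat{z}^0$ and $B_{21} \cdot \hat{z}^0 \geq \hat{\alpha}A_{21} \cdot \hat{z}^0$ (since $\hat{z}^1 = 0$). But $B_{21} \cdot \hat{z}^0 = 0$ by definition
of I and J (since $0 = B_{21} \cdot \hat{z}^0 + B_{22} \cdot \hat{z}^1 = B_{21} \cdot \hat{z}^0$). Hence $\hat{\alpha}A_{21} \cdot \hat{z}^0 \leq 0$, or
$A_{21} \cdot \hat{z}^0 \leq 0$ since $\hat{\alpha} > 0$. But $A_{21} \cdot \hat{z}^0 \geq 0$, since $A_{21} \geq 0$ and $\hat{z}^0 > 0$, so that
we must have $A_{21} \cdot \hat{z}^0 = 0$. This, in view of $A_{21} \geq 0$ and $\hat{z}^0 > 0$, implies that
$A_{21} = 0$. That is, $a_{ij} = 0$ for $i \in I'$, $j \in J$. Since [A, B] is irreducible by
assumption, the set I' must be empty. That is, I = N, so that $b_i \cdot \hat{z} > 0$ for all i =
1, 2, ..., n. Hence $B \cdot \hat{z} > 0$, which implies $p \cdot B \cdot \hat{z} > 0$.
(Q.E.D.)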
REMARK: The above proof is based on Gale [4], p. 315.
REMARK: If we have assumptions (A-i) through (A-iii), then we can prove
the above theorem without irreducibility. See Kemeny, Morgenstern, and
Thompson [12]. For alternative proofs, see Howe [10], p. 638, or Nikaido
[20], p. 146.
The von Neumann Model with Consumption (Morishima)

One basic criticism of the von Neumann model is that it has no explicit
treatment of consumption. It is assumed that "consumption of goods takes place
only through the processes of production which include necessities of life
consumed by workers" ([22], p. 2). Explicit introduction of consumption and
labor into the model was attempted by Morishima [18] and was reconsidered by
Haga and Otsuki [7] in terms of the duality theorem of linear programming.
Our purpose here is to sketch this Morishima model and understand the way in
which consumption is explicitly treated in the model.
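The starting point of Morishima's model is the following model, originally
due to Kemeny, Morgenstern, and Thompson [12].
(1) $A \cdot z(t + 1) \leq B \cdot z(t)$
(2) $\beta(t) p(t) \cdot A \geq p(t + 1) \cdot B$
(3) $p(t + 1) \cdot A \cdot z(t + 1) = p(t + 1) \cdot B \cdot z(t)$
(4) $\beta(t) p(t) \cdot A \cdot z(t) = p(t + 1) \cdot B \cdot z(t)$
(5) $p(t + 1) \cdot B \cdot z(t) > 0$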
As we remarked earlier, the addition of (5) is novel in Kemeny, Morgenstern, and
Thompson [13] and in Thompson [21]. The point of departure is the explicit
introduction of labor into the model. Let L be the m-vector whose jth element
is $l_j$, where $l_j$ is the amount of labor necessary for the one unit operation of
the jth production process. L is assumed to be constant for all t. Assume that there
is only one kind of homogeneous labor in the economy and that the wage rate of
this labor in period t is equal to w(t). Assume also that labor is indispensable for
every production process, that is, $l_j > 0$ for all j. The cost of production in
period t is no longer equal to $p(t) \cdot A$ but, instead, equals $[p(t) \cdot A + w(t)L]$,
assuming that wages are paid at the beginning of the production period. Hence
the above relations (2) and (4) may be rewritten as
(2') $\beta(t)[p(t) \cdot A + w(t)L] \geq p(t + 1) \cdot B$
(4') $\beta(t)[p(t) \cdot A + w(t)L] \cdot z(t) = p(t + 1) \cdot B \cdot z(t)$
If, on the other hand, the wage is paid at the end of the production period, then
$\beta(t)[p(t) \cdot A + w(t)L]$ in (2') and (4') should be modified accordingly. Morishima
called the former "Marx-von Neumann" and the latter "Walras-von Neumann."
Since the Walras-von Neumann model can be considered analogously to the
Marx-von Neumann model, we will stick to the above Marx-von Neumann
equations (2') and (4'). Let E(t) be the total "profit" (or "capitalists' income") in
period t. That is, we define E(t) by
(6) $E(t) \equiv [p(t) \cdot B - p(t - 1) \cdot A - w(t - 1)L] \cdot z(t - 1)$
Then, in view of (4'), we have
(7) $E(t) = [\beta(t - 1) - 1][p(t - 1) \cdot A + w(t - 1)L] \cdot z(t - 1)$

Let $d_i(t)$ be the capitalists' consumption of commodity i in period t, and let d(t)
be the n-vector whose ith element is $d_i(t)$. Let c be the average propensity to
consume of the capitalists, so that we have
(8) $cE(t) = p(t) \cdot d(t)$
where c is assumed to be a positive constant for all t as long as $E(t) \geq 0$. We
suppose that E(t) < 0 implies d(t) = 0, so that c = 0 whenever E(t) < 0. Let e(t)
be the workers' consumption vector in period t. Assuming that the workers
consume all their wage income, we have
(9) $W(t) = p(t) \cdot e(t)$
where $W(t) \equiv w(t) L \cdot z(t)$.
Equations (1) and (3) are now modified (in view of the above considerations)
as
(1') $B \cdot z(t - 1) \geq A \cdot z(t) + e(t) + d(t)$
(3') $p(t) \cdot B \cdot z(t - 1) = p(t) \cdot A \cdot z(t) + W(t) + cE(t)$
Morishima [18] supposed the functional relations $d_i(t) = \tilde{d}_i[p(t), E(t)]$ and
$e_i(t) = \tilde{e}_i[p(t), W(t)]$, which are assumed to be homogeneous of degree 0 with
respect to their respective arguments. It is further assumed that the Engel
elasticity of consumption is unity so that $\tilde{d}$ and $\tilde{e}$ can be written explicitly in the
following form:

(10) $d(t) = \dfrac{cE(t)}{\sum_{i=1}^{n} p_i(t)} f[q(t)]$ and $e(t) = \dfrac{W(t)}{\sum_{i=1}^{n} p_i(t)} g[q(t)]$

where $q(t) \equiv p(t) / (\sum_{i=1}^{n} p_i(t))$, f and g are n-vectors, and $\sum_{i=1}^{n} p_i(t) > 0$ for all
t is assumed. Note that E(t) < 0 implies f = 0 by assumption. Defining $\mu(t)$ by
$\mu(t) \equiv w(t) / (\sum_{i=1}^{n} p_i(t))$, e(t) can be further rewritten as $e(t) = \mu(t) g[q(t)] L \cdot z(t)$.
Here $\mu$ may be interpreted as the "real wage rate."
Morishima [18] retains the fundamental balanced growth assumption à la
von Neumann so that

(11) $z(t) = \alpha z(t-1)$, $p(t) = \text{constant} = p$, $w(t) = \text{constant} = w$, $\beta(t) = \text{constant} = \beta$

Also the unit of measurement of the commodities is chosen properly so that
$\sum_{i=1}^n p_i = 1$. Note that this convention implies $p = q$ and $\mu = w$. Then, using (7)
and (10), the model consisting of (1'), (2'), (3'), (4'), and (5) can be rewritten as
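(12) $B \cdot z \ge [\alpha A + \alpha w\, g(p) L + (\beta - 1) f(p)(p \cdot A + wL)] \cdot z$
(13) $\beta [p \cdot A + wL] \ge p \cdot B$
(14) $p \cdot B \cdot z = [\alpha + (\beta - 1)c][p \cdot A + wL] \cdot z$
(15) $\beta [p \cdot A + wL] \cdot z = p \cdot B \cdot z$
(16) $p \cdot B \cdot z > 0$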
Here $p$, $z$, $\alpha$, $\beta$, and $w$ are all constants. The value $[p^*, z^*, \alpha^*, \beta^*, w^*]$ which
satisfies (12) to (16) is the von Neumann equilibrium à la Morishima. Morishima
([18], pp. 356-359) showed the existence of $[p^*, z^*, \alpha^*, \beta^*]$ such that $p^* \ge 0$ and
$z^* \ge 0$, with a given value of $w > 0$, which satisfies (12) to (16) [with some additional
results, such as $\beta^* - 1 > 0$ implies that $\alpha^* - 1 = s(\beta^* - 1)$, where $s \equiv (1 - c)$,
and so on]. The basic assumptions used for the proof are (i) $a_{ij} \ge 0$, $b_{ij} \ge 0$ for all
i and j, and (ii) for every i, there exists at least one j such that $b_{ij} > 0$. Compare
these with the three assumptions (A-i) to (A-iii) introduced by Kemeny, Morgenstern,
and Thompson [12], and note that the assumption that for each j there
exists at least one i such that $a_{ij} > 0$ is dropped. The assumption $l_j > 0$ for all j
replaces this assumption, for this itself implies "no land of Cockaigne."
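The relation between the growth factor and the interest factor quoted above can be traced in a few lines. The sketch below is not taken from Morishima's proof. It assumes that (14) is the balanced-growth form of (3'), namely $p \cdot B \cdot z = [\alpha + (\beta - 1)c][p \cdot A + wL] \cdot z$, and that (15) is the balanced-growth form of (4'), namely $\beta[p \cdot A + wL] \cdot z = p \cdot B \cdot z$. Both readings are inferred from (3'), (4'), (7), and (11).

```latex
\begin{align*}
\beta\,[p \cdot A + wL] \cdot z
  &= p \cdot B \cdot z
   = [\alpha + (\beta - 1)c]\,[p \cdot A + wL] \cdot z
  && \text{by (15) and (14)} \\
\beta &= \alpha + (\beta - 1)c
  && \text{dividing by } [p \cdot A + wL] \cdot z > 0 \\
\alpha - 1 &= (\beta - 1) - (\beta - 1)c = (1 - c)(\beta - 1) = s(\beta - 1)
  && \text{with } s \equiv 1 - c
\end{align*}
```

The division in the second step is legitimate whenever $p \cdot B \cdot z > 0$ and $\beta > 0$, since (15) then gives $[p \cdot A + wL] \cdot z = p \cdot B \cdot z / \beta > 0$.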
One should note that this model is not complete. There is no mechanism
to determine w. One way to close the model is to introduce a labor supply equa-
tion and equilibrium in the labor market (= full employment) through fluctuations
in w. Another way is simply to assume the existence of (Marxian) underemploy-
ment so that w is constant at the subsistence level.
It should also be noted that there is no mechanism in the model that deter-
mines (absolute) money prices. To determine money prices, we have to introduce
an equation which describes the money market.
Finally, we should recall one fundamental assumption involved in the
Morishima model, that is, the balanced growth assumption in the sense of (11).
Nothing is said about what happens if the historically given stocks of commodi-
ties are not on such balanced growth paths and how a particular balanced growth
path [p*, z*, a*, /3*, w*] is chosen. On this point, Morishima inherits the basic
difficulty of von Neumann.
REFERENCES

1. Champernowne, D. G., "A Note on J. v. Neumann's Article on 'A Model of
Economic Equilibrium'," Review of Economic Studies, XIII, 1945-1946.
2. Dorfman, R., Samuelson, P. A., and Solow, R. M., Linear Programming and
Economic Analysis, New York, McGraw-Hill, 1958.
3. Gale, D., "The Closed Linear Model of Production," in Linear Inequalities and
Related Systems, ed. by H. W. Kuhn and A. W. Tucker, Princeton, N.J., Princeton
University Press, 1956.
4. Gale, D., The Theory of Linear Economic Models, New York, McGraw-Hill, 1960.
5. Georgescu-Roegen, N., "The Aggregate Linear Production Function and Its
Applications to von Neumann's Economic Model," in Activity Analysis of Production
and Allocation, ed. by T. C. Koopmans, New York, Wiley, 1951.
6. Glycopantis, D., "The Closed Linear Model of Production: a Note," Review of
Economic Studies, XXXVII, April 1970.
7. Haga, H., and Otsuki, M., "On a Generalized von Neumann Model," International
Economic Review, 6, January 1965.
8. Hahn, F. H., and Matthews, R. C. O., "The Theory of Economic Growth: A
Survey," Economic Journal, LXXIV, December 1964.
9. Hicks, J. R., Capital and Growth, Oxford, Clarendon Press, 1965.
10. Howe, C. W., "An Alternative Proof of the Existence of General Equilibrium in a
von Neumann Model," Econometrica, 28, July 1960.
11. Karlin, S., Mathematical Methods and Theory in Games, Programming and Economics,
Vol. 1, Reading, Mass., Addison-Wesley, 1959.
12. Kemeny, J. G., Morgenstern, O., and Thompson, G. L., "A Generalization of the
von Neumann Model of an Expanding Economy," Econometrica, 24, April 1956.
13. Koopmans, T. C., "Analysis of Production as an Efficient Combination of Activities,"
in Activity Analysis of Production and Allocation, ed. by T. C. Koopmans,
New York, Wiley, 1951.
14. Koopmans, T. C., "Economic Growth at a Maximal Rate," Quarterly Journal of Economics,
LXXVII, August 1964.
15. Loomis, L. H., "On a Theorem of von Neumann," Proceedings of National Academy
of Science, 32, 1946.
16. McKenzie, L. W., "Turnpike Theorems for a Generalized Leontief Model,"
Econometrica, 31, January-April 1963.
17. McKenzie, L. W., "Maximal Paths in the von Neumann Model," in Activity Analysis in the
Theory of Growth and Planning, ed. by E. Malinvaud and M. O. L. Bacharach,
London, Macmillan, 1967.
18. Morishima, M., "Economic Expansion and the Interest Rate in Generalized von
Neumann Models," Econometrica, 28, April 1960 (also chap. V of his Equilibrium,
Stability and Growth, A Multi-Sectoral Analysis, Oxford, Clarendon Press, 1964).
19. Morishima, M., "A Multi-Sectoral Analysis of Balanced Growth," in New Economic Analysis,
ed. by M. Morishima, M. Shinohara, and T. Uchida, Tokyo, Sobunsha, 1960 (in
Japanese).
20. Nikaido, H., Convex Structures and Economic Theory, New York, Academic Press,
1968.
21. Thompson, G. L., "On the Solution of a Game-theoretic Problem," in Linear
Inequalities and Related Systems, ed. by H. W. Kuhn and A. W. Tucker, Princeton, N.J.,
Princeton University Press, 1956.
22. von Neumann, J., "Über ein ökonomisches Gleichungssystem und eine
Verallgemeinerung des Brouwerschen Fixpunktsatzes," in Ergebnisse eines Mathematischen
Kolloquiums, ed. by K. Menger, no. 8, 1937 (tr. as "A Model of General Equilibrium,"
Review of Economic Studies, XIII, no. 1, 1945-1946).
23. Winter, S. G., "Some Properties of the Closed Linear Model of Production,"
International Economic Review, 6, May 1965.

Section B
THE DYNAMIC LEONTIEF MODEL

a. INTRODUCTION
The dynamic Leontief model is a natural extension of the static input-output
Leontief model to a dynamic case. As in the static case, the general equilibrium
interaction among various industries in an economy is explicitly taken into
account. Like the static model, the dynamic model is also used extensively for
empirical purposes to ascertain the industrial structure of particular economies
for forecasting, planning, and so on.
Theoretically speaking, the dynamic Leontief model can be considered a
special case of the von Neumann model, in which there is only one production
process available for the production of each good (the "fixed coefficient assump-
tion") and no joint output is allowed except that capital goods may be considered
as joint outputs in the sense that they are transferred from one period to the next.
The essential idea in dynamizing the static Leontief model seems to have
come from the Harrod-Domar model. Hence the assumption of fixed capital
coefficients is essential in the model.
The fixed coefficients assumption or the strict linearity of the model,
although a useful assumption for empirical purposes, causes serious theoretical
difficulties in the dynamic Leontief model because of its rigidity. The most notable
of such difficulties is known as causal indeterminacy. That is, unless the initial
output vector and stock vector are on a certain ray from the origin, it may so
happen that the output and the stock of at least one good may become negative
for sufficiently large t. Thus, waking up on a bright Monday morning, one may
find that the dynamic Leontief economy, which had started with a positive initial
stock of commodities, has realized a negative stock of some commodity!
There have been various attempts to rescue the dynamic Leontief model
from this difficulty. One useful concept in this connection turns out to be that
of "relative stability," developed by Solow and Samuelson [39]. That is, if the
coefficient matrices A and B satisfy certain conditions, then there exists a balanced
growth path in which all outputs (or stocks) grow at the same rate such that the
ratio between the output (or stock) of the balanced growth path and the actual
output (or stock) for each good converges to a certain positive constant, as time
extends without limit, regardless of the initial configuration of outputs and stocks.
(Incidentally, the definitions of coefficient matrices A and B will be given at the
beginning of subsection b. Here we simply proceed with our discussion without
worrying about the definitions of A and B.)

It turns out that the study of this question is concerned with a system of
first-order, linear, homogeneous difference equations of the form
$x(t+1) = M \cdot x(t)$
where M is an n x n constant matrix and x(t) [and x(t + 1)] is an n-vector.
In the (closed) dynamic Leontief model, it turns out that M is written as
$M = I + B^{-1}(I - A)$.
A necessary and sufficient condition for the relative stability of the above
system was discovered by Tsukui [42]. It states that the above system is relatively
stable if and only if there exists a positive integer m such that $M^m > 0$. Observe
that by iteration the above system of difference equations yields
$x(t) = M^t \cdot x(0)$
Thus Tsukui's theorem means that, for a sufficiently large t (say, m), the output
of every good becomes positive regardless of the initial point, $x(0) \ge 0$.
This is an interesting result even from a purely mathematical viewpoint, for a
system of difference equations such as $x(t+1) = M \cdot x(t)$ can appear in many
fields of economics other than the dynamic Leontief system. Hence it has many
potential applications.
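The iteration $x(t) = M^t \cdot x(0)$ and the condition $M^m > 0$ are easy to try out numerically. The following is a minimal sketch under stated assumptions: both matrices and the starting vector are hypothetical numbers chosen for illustration (they are not taken from the text), and the helper names are our own. The first matrix has no positive power and drives one output negative (causal indeterminacy); the second has $M^2 > 0$ and keeps every output nonnegative.

```python
from fractions import Fraction as Fr

def mat_vec(M, x):
    """Return the matrix-vector product M x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def mat_mul(P, Q):
    """Return the matrix product P Q."""
    cols = list(zip(*Q))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in P]

def first_positive_power(M, max_m=50):
    """Return the smallest m <= max_m with M^m > 0 elementwise, or None."""
    P = M
    for m in range(1, max_m + 1):
        if all(entry > 0 for row in P for entry in row):
            return m
        P = mat_mul(P, M)
    return None

def first_negative_period(M, x0, horizon=50):
    """Iterate x(t+1) = M x(t); return the first t with a negative output, or None."""
    x = x0
    for t in range(horizon + 1):
        if any(x_i < 0 for x_i in x):
            return t
        x = mat_vec(M, x)
    return None

# Hypothetical matrices (assumptions for illustration only).
M_bad = [[Fr(5, 3), Fr(-1, 3)], [Fr(-1, 3), Fr(5, 3)]]   # no power is positive
M_good = [[Fr(0), Fr(2)], [Fr(1), Fr(1)]]                 # M^2 > 0
x0 = [Fr(3), Fr(1)]                                       # positive initial outputs

print(first_positive_power(M_bad), first_negative_period(M_bad, x0))
print(first_positive_power(M_good), first_negative_period(M_good, x0))
```

The search for a positive power is capped at `max_m`, so a result of `None` only says that no positive power was found within that bound; for `M_bad` the off-diagonal entries of every power can be shown to be negative.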
Coming back to the dynamic Leontief system, suppose now that the
coefficient matrices A and B are such that we do not have relative stability. Then
we are back to the problem of causal indeterminacy. In general, there is nothing
which would guarantee the relative stability of the system. The answer to this
question of causal indeterminacy under these general circumstances can be sought
from two directions. One is to convert the (deterministic) dynamic Leontief model
into a planning model, in which case the problem of causal indeterminacy can
be avoided trivially by explicitly introducing the nonnegativity of the output and
the stock vectors (for all t) in the constraints. The nontrivial part of this conversion
procedure is to change the equalities in the Leontief system into inequalities.
The procedure of converting the deterministic Leontief model to a planning
model, thus avoiding the problem of causal indeterminacy, was developed by
Solow [38]. Since the coefficient matrices A and B are fixed so that the system
is linear, he obtained a linear programming model. Then considering the dual
problem of this linear programming model and interpreting the dual variables
as "price" variables, Solow obtained a remarkable result: the price implication
of the output system of the dynamic Leontief model.
The other direction preserves linearity in the sense of linear homogeneity of the
production processes but avoids the problem of causal indeterminacy by explicitly
introducing some sort of "nonlinearity" into the system. In particular, we may point
out the following three kinds of "nonlinearity" to avoid causal indeterminacy.
(i) Allow factor substitution in the production processes. In this case, the aij's
and bij's in the matrices A and B are no longer fixed but are functions of prices.
Moreover, labor can be introduced in this substitution mechanism.

(ii) Allow demand (of consumer) substitution. In the usual dynamic Leontief
model, the final demand vector c(t) is exogenously given; but we may allow
it to depend on prices.
(iii) Introduce a "floor" and "ceiling," just as Hicks [13] introduced them into
Samuelson's business cycle model. As Goodwin [8] observed, this essentially
amounts to introducing nonlinearity into the system.

It is easy to see that causal indeterminacy could be avoided by (i) and/or (ii).
If the stock of a certain good decumulates in a certain period(s) (too close to zero),
then the scarcity of this good would, in general, cause an increase in its marginal
productivity and/or an increase in its marginal utility. This would increase the
demand for the good, thus increasing its price. This, in turn, would cause an
increase in its supply, which would avoid the stock of the good decumulating to
zero.
That labor can be introduced in the mechanism of producer's substitution
in (i) has another important implication in the sense that it will avoid another
major difficulty in the dynamic Leontief model with fixed coefficients. We now
discuss this difficulty. Under full employment of capital, the output vector x(t)
may follow the law of motion described by a system of difference equations such
as $x(t+1) = M \cdot x(t) + d(t)$ [where d(t) is exogenously given owing to final
demand] in the open Leontief model in which labor is explicitly introduced. As
long as there is a fixed relation between the labor input and the output of each
good, the movement of x(t) will uniquely determine the labor requirement in the
economy. Let it be L(t). Suppose this L(t) does not correspond to the actual
supply of labor. There is no mechanism in the usual dynamic Leontief system
to eliminate the gap between L(t) and the supply of labor. For example, suppose
the supply of labor is given exogenously by population and the like, and grows
proportionately, so that it equals $L_0(1 + n)^t$ in period t. Suppose also that the output
determined by $x(t+1) = M \cdot x(t) + d(t)$ is given by $x(0)(1 + \mu)^t$ + (constant).
Then we have ever-expanding unemployment of labor if $n > \mu$. If, on the other
hand, $n < \mu$, then the output system such as $x(t+1) = M \cdot x(t) + d(t)$ is meaningless,
for such a growth of output is impossible due to the labor constraint. This
difficulty can be avoided by introducing labor in the mechanism of producer's
substitution. That is, if labor grows faster than is required, then the price of
labor will go down and encourage the use of labor in production vis-a-vis other
factors. This, in turn, will increase the labor requirement. In other words, the
labor coefficients (the amounts of labor necessary to produce one unit of each
good) are not fixed constants but functions of prices.
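The widening gap between labor required and labor supplied can be made concrete with a small calculation. The sketch below is a deliberately simplified, single-good version of the argument: output is assumed to grow exactly as $x(0)(1 + \mu)^t$ (the constant term is ignored), the labor coefficient is fixed, and all numbers and names are hypothetical.

```python
def labor_gap(L0, n, mu, labor_per_unit, x0, periods):
    """Return a list of (t, labor required, labor supplied) for t = 0..periods.

    Assumptions: output grows as x0 * (1 + mu)**t, labor supply grows as
    L0 * (1 + n)**t, and labor required is labor_per_unit * output.
    """
    rows = []
    for t in range(periods + 1):
        output = x0 * (1 + mu) ** t
        required = labor_per_unit * output      # fixed labor coefficient
        supplied = L0 * (1 + n) ** t
        rows.append((t, required, supplied))
    return rows

# Hypothetical numbers: labor supply grows at 3 percent, output at 2 percent (n > mu).
rows = labor_gap(L0=100.0, n=0.03, mu=0.02, labor_per_unit=1.0, x0=100.0, periods=50)
unemployment_rate = [1 - required / supplied for _, required, supplied in rows]

for t in (0, 10, 25, 50):
    print(t, round(unemployment_rate[t], 4))
```

With $n > \mu$ the unemployment rate starts at zero and rises in every period, since it equals $1 - [(1 + \mu)/(1 + n)]^t$ under these assumptions. Reversing the inequality makes the computed rate negative, which is the infeasible case described above.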
Morishima [26] introduced nonlinearity of type (i). However, he was appar-
ently misled by his desire to obtain the substitution theorem for the dynamic
Leontief system. He was concerned with reducing nonlinearity to linearity by
arguing that only one set of values of the aij's and bij's will be chosen regardless
of the value of final demand, rather than with the problem of causal indeterminacy
per se.
In spite of his difficulty in establishing the substitution theorem for the
dynamic case, his attempt to build the dynamic Leontief model with an explicit
recognition of producer's substitution and to prove the existence and the unique-
ness of the equilibrium values of the variables is very important. Besides, his
model has an interesting feature in that his treatment of prices and Solow's
assumption on prices represent two polar assumptions in the treatment of prices
in the dynamic Leontief model.
After all, the problem of finding a proper price system for the dynamic
Leontief model is not an easy one. In fact, Solow writes ([38], p. 30),
The price-valuation side of the dynamic Leontief system has been pretty
thoroughly neglected. So far as I know, a complete history of the literature
on this subject can be given in a paragraph.
This is because of the obvious reason that in the Leontief model there is no
explicit discussion of consumer's and producer's substitution in which prices can
play a vital role. As remarked above, consumer's substitution is neglected because
of the exogenous treatment of final demand, and producer's substitution is
neglected because of the fixed coefficient assumption. However, it is not correct
to say that there is no price formation in the dynamic Leontief system. We can
still follow the logical implication of the model as it stands by supposing that
each commodity is evaluated by certain prices in a systematic manner and asking
what the implications of such an evaluation are. In the literature, two examples
of the price system under the competitive framework are well known: Solow's
system and Morishima's system.
In subsection b, we sketch the output system of the usual dynamic Leontief
system. There we take up topics such as relative stability and causal indeterminacy.
This subsection will also be useful to introduce and illustrate an important
technique of economics, that is, difference equations. However, the complete
discussion of this technique is not attempted. The reader, if he wishes to do so,
should be able to recast the discussion of the present section in terms of
differential equations.
In subsection c, we sketch the price system of the dynamic Leontief model
based on Solow [38]. It should be understood that a competitive intertemporal
arbitrage relation under the assumption of perfect foresight is crucial in this price
system.
In subsection d, we take up Solow's conversion of the deterministic Leontief
model to a planning model and discuss the price implications based on such a
model, and finally, in subsection e, we deal with the Morishima model.
In the course of reading this section, it is hoped that the reader will become
aware of the difficulties inherent in the linear system with fixed coefficients.
Although such a system may be very useful for empirical purposes, it is, after all,
a linear approximation of a nonlinear system. The difficulty posed by causal
indeterminacy seems to be that of stretching properties that may hold only locally
(that is, in the neighborhood in which the linear approximation is a good
approximation) to a global domain, which is inevitable as $t \to \infty$.

b. THE OUTPUT SYSTEM

Let x(t) be an n-vector whose ith element is xi(t), the output of the ith
good in period t. Let A = [aij] be an n x n matrix, where aij denotes the current
input of the ith good used per unit of the jth good. Let B = [bij] be an n x n
matrix, where bij denotes the quantity of the ith good invested in the jth industry
in order to increase the output of that industry by one unit. Let ci(t) be the final
demand (such as consumption demand) of the ith good in period t. Then the total
demand for the ith good in period t is
(1) $\sum_{j=1}^n a_{ij} x_j(t) + \sum_{j=1}^n b_{ij}[x_j(t+1) - x_j(t)] + c_i(t)$

The second term of this expression can be understood by supposing that the
production of each good requires a stock of goods (such as "capital") as well
as current inputs (such as "raw materials"). In other words, suppose that, in the
production of one unit of the jth good, bij units of the stock of the ith good are
necessary as a capital good. Let Kij(t) be the amount of the stock of the ith good
required as capital in the jth industry in period t. Then we have
(2) bijxj(t) = Kij(t)
Let Ki(t) be the total stock of the ith good required as a capital good in the
economy in period t, that is, $K_i(t) \equiv \sum_{j=1}^n K_{ij}(t)$. Then we have [in view of (2)]
(3) $K_i(t) = \sum_{j=1}^n b_{ij} x_j(t)$

Assume that capital is freely transferable from one industry to another, and
assume also the full employment of capital so that Ki(t) in (3) also denotes the
supply of the ith capital as well as its demand (i = 1, 2, .. ., n). Then
(4) $I_i(t) \equiv \Delta K_i(t) \equiv K_i(t+1) - K_i(t) = \sum_{j=1}^n b_{ij}[x_j(t+1) - x_j(t)]$

where Ii(t) is the amount of the ith good demanded (and supplied) for "investment"
purposes.¹ Expression (1) may be interpreted as (demand as a current
input) + (investment demand) + (final demand) for the ith good.
Since xi(t) denotes the output of the ith good in period t, the basic supply =
demand equilibrium relation for the ith good can now be written as
(5) $x_i(t) = \sum_{j=1}^n a_{ij} x_j(t) + \sum_{j=1}^n b_{ij}[x_j(t+1) - x_j(t)] + c_i(t)$

or in matrix form,
(6) $x(t) = A \cdot x(t) + B \cdot [x(t+1) - x(t)] + c(t)$
where c(t) is an n-vector whose ith element is ci(t). We may also rewrite (6) as
(6') $x(t) = A \cdot x(t) + \Delta K(t) + c(t)$ and $K(t) = B \cdot x(t)$
where $\Delta K(t) \equiv K(t+1) - K(t)$. Here free transferability and full employment of
capital are assumed. Equation (5) or (6) [or (6')] denotes the basic output equation
of the dynamic Leontief system. The matrix A is called the current input coefficient
matrix and matrix B is called the capital coefficient matrix. The purpose of this
subsection is to study the behavior of x(t) over time, as described by equation (6).
It is assumed that
(A-1) $a_{ij} \ge 0$ and $b_{ij} \ge 0$ for all i, j = 1, 2, ..., n, and the aij's and the bij's are all
constant over time.
The constancy of the aij's and bij's signifies that there is no technological progress.
As remarked earlier, the above system of the dynamic Leontief model can be
considered a special case of the von Neumann growth model in which there is
only one production process in each industry ("fixed coefficients") and joint
output is allowed only in the sense that each production process uses stocks of
goods which are transferred from one period to the next.²
In order to facilitate further analysis, we impose the following assumption:³
(A-2) The matrix B is nonsingular.
Thus we can rewrite (6) as
(7) $x(t+1) = [I + B^{-1}(I - A)] \cdot x(t) - B^{-1} \cdot c(t)$
where I is the n-dimensional identity matrix.
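The step from (6) to (7) can be checked numerically. The sketch below computes $x(t+1)$ from the rearranged form $x(t+1) = x(t) + B^{-1}[(I - A) \cdot x(t) - c(t)]$, which is the same as (7), and then substitutes the result back into the balance equation $x(t) = A \cdot x(t) + B \cdot [x(t+1) - x(t)] + c(t)$. The two-good matrices, the output vector, and the final demand are hypothetical numbers, and the helper names are our own.

```python
from fractions import Fraction as Fr

def mat_vec(M, x):
    """Return the matrix-vector product M x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def inverse_2x2(M):
    """Return the inverse of a 2 x 2 matrix; fails if assumption (A-2) is violated."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular, so (A-2) does not hold")
    return [[d / det, -b / det], [-c / det, a / det]]

def next_output(A, B, x, c):
    """One step of (7): x(t+1) = x(t) + B^{-1} [(I - A) x(t) - c(t)]."""
    net = [x_i - ax_i - c_i for x_i, ax_i, c_i in zip(x, mat_vec(A, x), c)]
    growth = mat_vec(inverse_2x2(B), net)
    return [x_i + g_i for x_i, g_i in zip(x, growth)]

# Hypothetical two-good data (assumptions for illustration only).
A = [[Fr(1, 4), Fr(1, 2)], [Fr(1, 4), Fr(1, 4)]]   # current input coefficients
B = [[Fr(2), Fr(1)], [Fr(1), Fr(3)]]               # capital coefficients
x = [Fr(4), Fr(2)]                                 # outputs in period t
c = [Fr(1), Fr(0)]                                 # final demand in period t

x_next = next_output(A, B, x, c)

# Substitute back into (6): supply minus total demand should vanish.
investment = mat_vec(B, [new - old for new, old in zip(x_next, x)])
residual = [x_i - ax_i - inv_i - c_i
            for x_i, ax_i, inv_i, c_i in zip(x, mat_vec(A, x), investment, c)]
print(x_next, residual)
```

Exact rational arithmetic is used so that the residual of (6) is exactly zero rather than zero up to rounding.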
In order to sharpen our analysis, let us restrict our attention to the closed
(dynamic) Leontief system. In other words, we set c(t) = 0 for all t. Then (7)
can be rewritten as
(8) $x(t+1) = [I + B^{-1}(I - A)] \cdot x(t)$
or, in short,
(8') $x(t+1) = M \cdot x(t)$, where $M = [I + B^{-1}(I - A)]$
This is a system of n first-order, linear, homogeneous difference equations with
constant coefficients. Suppose that $\lambda_i$ is an eigenvalue of M; then it is known
that the following is a particular solution:
(9) $x(t) = \lambda_i^t x^i$

where $x^i$ is an eigenvector associated with $\lambda_i$. That this is a solution of (8) can
be checked easily by substituting this into (8) and noticing that (8) is reduced to
an identity. If all the eigenvalues of M are distinct, then the n particular solutions
in the form of (9) (i = 1, 2, ..., n) are linearly independent, and the general
solution of (8) can be written as
(10) $\hat{x}(t) = h_1 \lambda_1^t x^1 + \cdots + h_n \lambda_n^t x^n$

where $h_1, h_2, \ldots, h_n$ are determined by the n initial (boundary) conditions. The
fact that this is a solution of (8) can be checked easily by noticing that (10) reduces
(8) to an identity.
In general, the $\lambda_i$'s are complex numbers and the $x^i$'s can contain negative
elements, so that a solution (9) may not have any economic meaning. Suppose,
however, that one of the eigenvalues (say, $\lambda_1$) is a positive (real) number and that
an eigenvector associated with it (say, $x^1$) is a positive vector (that is, $x^1 > 0$);
then a solution
(11) $x^*(t) = \lambda_1^t x^1$
does make economic sense, because this tells us that if the initial output vector,
x(0), of the economy is $x^1$ (or its positive constant multiple, say, $h_1 x^1$), then the
economy is capable of "balanced growth" (at the rate of $\lambda_1$) for $\lambda_1 > 1$ or
"balanced decay" for $0 < \lambda_1 < 1$. We simply call the path such as (11) a balanced
growth path or a balanced growth solution. This is an interesting conclusion, for if
the initial output vector x(0) is $x^1$ or its (positive) constant multiple, then, in
the economy in which (8) holds, the output of every good grows (or decays) at the
same rate $\lambda_1$.
A natural question that follows from this consideration is: What are the
conditions which would guarantee the existence of a positive eigenvalue $\lambda_1$ and a
positive eigenvector $x^1$ associated with it? An immediate thought about this is
to consider the case where M is a nonnegative matrix. For if M is a nonnegative
indecomposable matrix, then owing to the Frobenius theorem (Theorem 4.B.1),
there exists a positive eigenvalue (called the Frobenius root) and a positive
eigenvector associated with it. That is, simply by taking the Frobenius root as $\lambda_1$
and the associated eigenvector as $x^1$, we have a balanced growth solution, (11).
However, there is one basic difficulty, namely, the question of how we can
guarantee that $M = [I + B^{-1}(I - A)]$ is a nonnegative matrix. In general, M will
not be nonnegative. To see this, consider the following example by DOSSO
([4], p. 297):

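EXAMPLE 1: Let

$A = \begin{bmatrix} 1/3 & 1/3 \\ 1/3 & 1/3 \end{bmatrix}$, $B = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

Then

$M = [I + B^{-1}(I - A)] = \begin{bmatrix} 5/3 & -1/3 \\ -1/3 & 5/3 \end{bmatrix}$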

However, the fact that M is not a nonnegative matrix may not preclude the
possibility that the system of difference equations, (8), contains a balanced growth
solution. Hence, we want to find a set of plausible assumptions under which there
exists a balanced growth solution of (8). To do this, first assume that the matrix
[I - A] has a dominant diagonal, that is,

(A-3) There exists an $x \ge 0$, $x \in R^n$, such that $(I - A) \cdot x > 0$.

Then, using Theorem 4.D.2, we can conclude that (I - A) is nonsingular and
$(I - A)^{-1} \ge 0$. Since $B \ge 0$ by (A-1), $(I - A)^{-1}B$ is also nonnegative. If, in
addition, A is indecomposable, $(I - A)^{-1} > 0$ so that $(I - A)^{-1}B > 0$ in view of
the nonsingularity of B.⁴ But A may not be indecomposable; hence we simply
assume that
(A-4) $(I - A)^{-1}B$ is indecomposable.
Then, owing to the Frobenius theorem (Theorem 4.B.1), there exists a maximal
eigenvalue $\nu > 0$ with which a positive eigenvector $\bar{x} > 0$ is associated.⁵ In other
words,
(12) $(I - A)^{-1}B \cdot \bar{x} = \nu \bar{x}$, where $\nu > 0$ and $\bar{x} > 0$
Then $\rho \equiv 1/\nu$ is an eigenvalue of $B^{-1}(I - A)$ and its associated eigenvector is $\bar{x}$,
since $B^{-1}(I - A) \cdot \bar{x} = (1/\nu)\bar{x}$ by (12). Therefore,
(13) $(1 + \rho)\bar{x} = [I + B^{-1}(I - A)] \cdot \bar{x} = M \cdot \bar{x}$
Since $1 + \rho > 0$ and $\bar{x} > 0$, we discover the desired result, namely, a positive
eigenvalue of M and a positive eigenvector associated with it. Thus setting $\lambda_1 = 1 + \rho$
and $x^1 = \bar{x}$, the economy is capable of balanced growth in the form of (11), even
if M is not nonnegative. That is,
(14) $x^*(t) = (1 + \rho)^t \bar{x}$
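The construction of the balanced growth factor from the Frobenius root of $(I - A)^{-1}B$ can be followed step by step in code. The sketch below is only an illustration under stated assumptions: the two-good matrices are hypothetical, the Frobenius root is approximated by plain power iteration (a swapped-in numerical method, not the argument of the text), and the helper names are our own. It then verifies that $M \cdot \bar{x} = (1 + \rho)\bar{x}$ with $\rho = 1/\nu$, even though M itself has negative entries.

```python
def mat_vec(M, x):
    """Return the matrix-vector product M x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def mat_mul(P, Q):
    """Return the matrix product P Q."""
    cols = list(zip(*Q))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in P]

def inverse_2x2(M):
    """Return the inverse of a nonsingular 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def frobenius_pair(G, iterations=200):
    """Approximate the dominant eigenvalue and eigenvector of a positive matrix G.

    The eigenvector is normalized so that its components sum to one.
    """
    n = len(G)
    x = [1.0 / n] * n
    nu = 0.0
    for _ in range(iterations):
        y = mat_vec(G, x)
        nu = sum(y)                  # equals the eigenvalue once x has converged
        x = [y_i / nu for y_i in y]
    return nu, x

# Hypothetical coefficients; (I - A)(1, 1)' > 0, so (A-3) holds for these numbers.
A = [[0.2, 0.3], [0.4, 0.1]]
B = [[1.0, 0.5], [0.2, 1.0]]
I_minus_A = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(2)] for i in range(2)]

G = mat_mul(inverse_2x2(I_minus_A), B)      # (I - A)^{-1} B, positive here
nu, x_bar = frobenius_pair(G)
rho = 1.0 / nu

F = mat_mul(inverse_2x2(B), I_minus_A)      # B^{-1} (I - A)
M = [[(1.0 if i == j else 0.0) + F[i][j] for j in range(2)] for i in range(2)]
residual = max(abs(mx - (1.0 + rho) * xb)
               for mx, xb in zip(mat_vec(M, x_bar), x_bar))
print(round(1.0 + rho, 4), [round(v, 4) for v in x_bar], residual)
```

Power iteration is adequate here because $(I - A)^{-1}B$ is strictly positive for these numbers; for a matrix that is merely indecomposable a different eigenvalue routine may be needed.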
We are now interested in the long-run character of the movement of $\hat{x}(t)$,
the solution of (8), when it starts from an arbitrary given initial vector $\hat{x}(0)$
instead of a particular one such as $\bar{x}$ (or $x^1$). For this purpose, the following
concept turns out to be important.

Definition (relative stability): Let $x(t+1) = M \cdot x(t)$ be a system of difference
equations where M is an n x n constant matrix. Suppose that $x^*(t) = \lambda^t \bar{x} > 0$
is a particular solution of this system of difference equations. Let $\hat{x}(t)$ be a
solution of this system starting from an arbitrary initial vector $\hat{x}(0) \ge 0$. Then
the balanced growth solution $x^*(t)$ is said to be relatively stable if

(15) $\lim_{t \to \infty} \hat{x}_i(t)/x_i^*(t) = \sigma$ exists such that $\infty > \sigma > 0$ and $\sigma$ is independent
of i, where i stands for the ith component
REMARK: The concept of relative stability is really independent of
whether the motion of x(t) is described by a system of linear difference
equations such as (8). Essentially, if $\hat{x}(t)$ behaves according to a certain law
of motion (which can be anything) starting from the initial value $\hat{x}(0)$, and if
there exists a reference path $x^*(t)$ [for example, $x^*(t) = \lambda^t \bar{x}$], which is
positive for all t, then a definition such as the one described by (15) can still be applied.
REMARK: One of the crucial features of the concept of relative stability is
that $\hat{x}(t)$ can start from any arbitrary initial point. That is, regardless of the
initial value $\hat{x}(0)$, relation (15) holds if $x^*(t)$ is relatively stable. Suppose
$\hat{x}(t)$ and $\hat{x}^\circ(t)$ are two solutions starting from $\hat{x}(0)$ and $\hat{x}^\circ(0)$, respectively,
such that $\hat{x}(t) > 0$ for all t. Then noting that
$\hat{x}_i^\circ(t)/\hat{x}_i(t) = [\hat{x}_i^\circ(t)/x_i^*(t)]/[\hat{x}_i(t)/x_i^*(t)]$, we can conclude that if there
exists a balanced growth path $x^*(t) > 0$ which is relatively stable, then
$\hat{x}_i^\circ(t)/\hat{x}_i(t)$ also converges, as $t \to \infty$, to a constant which is independent of i.
In Figure 6.4, we illustrate the concept of relative stability in such a way
that $\hat{x}(t)$ asymptotically approaches the balanced growth path $x^*(t)$. Contrary
to a common misunderstanding, this asymptotic convergence is not necessary
in the concept of relative stability. In other words, that $x^*(t)$ is relatively stable
does not necessarily imply that $\hat{x}(t) \to x^*(t)$ as $t \to \infty$. An example of such a case
is given by Nikaido ([32], section 22) as follows:

Figure 6.4. An Illustration of Relative Stability.

EXAMPLE 2: Let $M = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}$. Then the eigenvalues of M are 4 and 2
and the general solution can be written as

(16) $\hat{x}(t) = h_1 4^t \begin{bmatrix} 1 \\ 1 \end{bmatrix} + h_2 2^t \begin{bmatrix} 1 \\ -1 \end{bmatrix}$

where $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ are the eigenvectors associated with 4 and 2, respectively.
Clearly,
(17) $x_i^*(t) = 4^t$, i = 1, 2

is a balanced growth path of the economy. If the initial output vector is such that
$\hat{x}_1(0) = 2$ and $\hat{x}_2(0) = 0$, then the path of the output of each good, determined by
(16), is
(18) $\hat{x}_1(t) = 4^t + 2^t$ and $\hat{x}_2(t) = 4^t - 2^t$
Hence $\hat{x}_1(t)/x_1^*(t)$ and $\hat{x}_2(t)/x_2^*(t)$ both approach 1 as $t \to \infty$; that is, $x^*(t)$ is
relatively stable. But the Euclidean distance between $\hat{x}(t)$ and $x^*(t)$ in period t
is given by

(19) $\|\hat{x}(t) - x^*(t)\| = \sqrt{[\hat{x}_1(t) - x_1^*(t)]^2 + [\hat{x}_2(t) - x_2^*(t)]^2} = \sqrt{2} \cdot 2^t$,

which goes to $\infty$ as $t \to \infty$.
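The point of Example 2, that the output ratios converge while the distance to the balanced growth path explodes, can be reproduced directly. The sketch below takes M to be the symmetric matrix with rows (3, 1) and (1, 3); this is the matrix implied by the eigenvalues 4 and 2 and the eigenvectors (1, 1) and (1, -1), and it is stated here as an assumption. The function names are our own.

```python
import math

# Assumed matrix for Example 2: eigenvalues 4 and 2, eigenvectors (1, 1) and (1, -1).
M = [[3, 1], [1, 3]]

def step(x):
    """One period of x(t+1) = M x(t) for the 2 x 2 matrix above."""
    return [M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1]]

def simulate(x0, periods):
    """Return rows (t, x(t), ratios x_i(t) / 4^t, distance to the balanced path)."""
    rows = []
    x = list(x0)
    for t in range(periods + 1):
        balanced = 4 ** t                                   # x_i*(t) = 4^t
        ratios = (x[0] / balanced, x[1] / balanced)
        distance = math.hypot(x[0] - balanced, x[1] - balanced)
        rows.append((t, tuple(x), ratios, distance))
        x = step(x)
    return rows

rows = simulate([2, 0], 10)
for t, x, ratios, distance in rows[::5]:
    print(t, x, ratios, round(distance, 3))
```

Both ratios tend to 1 while the distance doubles every period, so relative stability holds without the path ever approaching $x^*(t)$ in the Euclidean sense.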

An important application of the concept of relative stability was given by
Solow and Samuelson [39] in their 1953 article, in which they considered the
following system of n first-order nonlinear difference equations:
(20) $x_i(t+1) = H^i[x_1(t), x_2(t), \ldots, x_n(t)]$, i = 1, 2, ..., n
or simply
$x(t+1) = H[x(t)]$
Here the $H^i$, i = 1, 2, ..., n, are assumed to be real-valued, continuously
differentiable,⁶ linear homogeneous functions defined on the nonnegative orthant of $R^n$.
The $H^i$'s are in general nonlinear functions. In particular, the system of linear difference
equations such as (8), or $x(t+1) = M \cdot x(t)$ where M is an n x n matrix, is an
example of the above system.
Solow and Samuelson [39] imposed the following assumption on the func-
tions Hi.
(SS) The partial derivatives of $H^i$ are all positive for all x(t), i = 1, 2, ..., n.
Under this assumption they proved that the system of difference equations, (20),
has a unique balanced growth solution that is relatively stable. This theorem is
known as the Solow-Samuelson theorem.
The assumption (SS) is very strong. Considering (20) as a dynamic system
of output growth in which $H^i$ describes the production relation⁷ (as in Solow and
Samuelson [39]), this assumption means not only that every good is useful in the
production of every other good, but also that if we have only one good, then we
can produce all the other goods. Our model (8)-that is, x(t + 1) = M x(t)-
does not, in general, satisfy this assumption for it requires that all the elements
of M be positive.
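A small numerical experiment illustrates what the Solow-Samuelson theorem asserts. The map H below is a hypothetical example, not one taken from Solow and Samuelson: it is linear homogeneous, nonlinear, and has strictly positive partial derivatives everywhere on the nonnegative orthant except the origin. Starting from very different initial vectors, including ones with only a single good, the output ratio settles at the same value, which for this particular H is $\sqrt{5}/2$.

```python
import math

def H(x):
    """Hypothetical production map satisfying the (SS)-type assumptions.

    H1 = x1 + x2 + sqrt(x1^2 + x2^2) and H2 = 2*x1 + x2 are both homogeneous of
    degree one, and all four partial derivatives are positive for x != 0.
    """
    x1, x2 = x
    return (x1 + x2 + math.hypot(x1, x2), 2.0 * x1 + x2)

def ratio_after(x0, periods):
    """Iterate x(t+1) = H[x(t)] and return x1/x2 after the given number of periods."""
    x = x0
    for _ in range(periods):
        x = H(x)
    return x[0] / x[1]

# Hypothetical starting points; the first two contain only one good.
starts = [(1.0, 0.0), (0.0, 1.0), (5.0, 1.0)]
limits = [ratio_after(x0, 40) for x0 in starts]
print(limits, math.sqrt(5) / 2)
```

The common limit follows from solving $r = [r + 1 + \sqrt{r^2 + 1}]/(2r + 1)$ for the balanced ratio $r = x_1/x_2$, which gives $r^2 = 5/4$. Only the ratio is compared here because the output levels themselves grow without bound.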
Attempts to simplify the proof and to generalize the basic result of the
Solow-Samuelson theorem have been made by Suits [40], Muth [30], Morishima
[27] and [28], and Nikaido [33].⁸ Since the Frobenius theorem plays an important
role in the linear system, such as (8) (as explained above), it can easily be
conjectured that the Frobenius theorem can be extended to a nonlinear case and can
be applied to the Solow-Samuelson model, (20). Morishima [28] and Nikaido
[33] substantiated such a conjecture.
Here we will not go into the exposition of the above attempts at the generalization
of the Solow-Samuelson theorem and the nonlinear extension of the
Frobenius theorem.⁹ Instead we will come back to the original closed Leontief
system (8), $x(t+1) = M \cdot x(t)$. As we remarked before, we cannot establish
the relative stability of a balanced growth solution for this system by directly
applying the original Solow-Samuelson theorem, since M can contain zero and
negative elements. However, Tsukui [42] proved that relative stability can be
established if there exists a positive integer m such that $M^m > 0$. We prove below
the essence of his result, which we call Tsukui's lemma. Clearly the application
of this lemma is not confined to the dynamic Leontief problem per se.¹⁰

Tsukui's Lemma: Let M = [1 + F] be an n x n matrix, where F is nonsingular

and F-' > 0.'' Then there exists a positive integer m such that M"' > 0 if and only
if there exists an eigenvalue A, of M such that Al > IAA where the A,i, i = 2, 3, ...,
n, are the other eigenvalues of M.
PROOF: Let $\lambda_i$, $\mu_i$, and $\nu_i$ ($i = 1, 2, \ldots, n$) be the eigenvalues for the matrices
$(I + F)$, $(I + F)^m$, and $F^{-1}$, respectively. Let $x^i$, $y^i$, $z^i$ be the corresponding
eigenvectors so that

(21) $(I + F) \cdot x^i = \lambda_i x^i$, $(I + F)^m \cdot y^i = \mu_i y^i$, $F^{-1} \cdot z^i = \nu_i z^i$

Note that $(I + F)^2 \cdot x^i = (I + F) \cdot (\lambda_i x^i) = \lambda_i^2 x^i$. Similarly, we have

(22) $(I + F)^m \cdot x^i = \lambda_i^m x^i$

Also note that $F^{-1} \cdot z^i = \nu_i z^i$ implies $(1/\nu_i) z^i = F \cdot z^i$, which in turn implies
$(I + F) \cdot z^i = (1 + 1/\nu_i) z^i$. Hence we have

(23) $\lambda_i^m = \mu_i$, $1 + \dfrac{1}{\nu_i} = \lambda_i$, and $x^i = y^i = z^i$, $i = 1, 2, \ldots, n$

We are now ready to prove the statement of the theorem.

(Necessity) Suppose $M^m > 0$ or $(I + F)^m > 0$, for some integer $m > 0$.
Then $M^m$ is indecomposable and primitive; hence there exists a positive
eigenvalue $\mu_1$ (Frobenius' root) such that $\mu_1 > |\mu_i|$, $i = 2, \ldots, n$, and an
eigenvector $y^1 > 0$ associated with $\mu_1$. This implies that $\lambda_1 > |\lambda_i|$,
$i = 2, \ldots, n$, and $x^1 > 0$, for $\lambda_i^m = \mu_i$ and $x^i = y^i$ by (23), which is the
desired result.
(Sufficiency) Let $\nu_1$ be the Frobenius root of $F^{-1} > 0$, so that
$F^{-1} \cdot x^1 = \nu_1 x^1$. Then $\nu_1 > 0$ and $x^1 > 0$. Moreover, $\lambda_1 = 1 + 1/\nu_1 > 0$.
Since the transpose of a matrix has the same set of eigenvalues as the original
matrix, we have $\nu_1 u^1 = u^1 \cdot F^{-1}$ for some row vector $u^1$, so that
$(1/\nu_1) u^1 = u^1 \cdot F$. Moreover, $u^1 > 0$, since $F^{-1} > 0$.12 Then note the following
relations:

(24) $[u^1 \cdot (I + F)] \cdot x^i = \left(1 + \dfrac{1}{\nu_1}\right)(u^1 \cdot x^i)$ and $u^1 \cdot [(I + F) \cdot x^i] = \lambda_i (u^1 \cdot x^i)$

Since $\lambda_1 = 1 + 1/\nu_1$, and also $\lambda_1 > |\lambda_i|$ for all $i = 2, \ldots, n$, by assumption,13
this implies

(25) $u^1 \cdot x^i = 0$, $i = 2, \ldots, n$
Let $e^i$ be the n-dimensional column vector whose ith component is 1 and all
other components are zero, and consider the system of difference equations
$x(t + 1) = M \cdot x(t)$.
Then, assuming that all the eigenvalues of M are distinct,14 the general
solution of this system of difference equations is [as discussed in connection
with (10)]

$x(t) = h_1 \lambda_1^t x^1 + \cdots + h_n \lambda_n^t x^n$

where the $h_i$, $i = 1, 2, \ldots, n$, are determined by the initial conditions. Hence
a path starting from $e^i$ must satisfy

(26) $x(0) = e^i = h_1 x^1 + \cdots + h_n x^n$

so that

(27) $M^t \cdot e^i = h_1 \lambda_1^t x^1 + \cdots + h_n \lambda_n^t x^n$

since the $\lambda_i$'s are the eigenvalues of M and the $x^i$'s are the eigenvectors of
M associated with the $\lambda_i$'s, $i = 1, 2, \ldots, n$. Also (26) implies

(28) $u^1 \cdot e^i = \sum_{r=1}^{n} h_r (u^1 \cdot x^r) = h_1 (u^1 \cdot x^1)$

in view of (25). Since $x^1 > 0$ and $u^1 > 0$, we have

(29) $h_1 = \dfrac{u^1 \cdot e^i}{u^1 \cdot x^1} > 0$

Since $\lambda_1 > |\lambda_i|$, $i = 2, 3, \ldots, n$, and $\lambda_1 > 0$, $x^1 > 0$ by assumption, (27)
implies that there exist (finite) positive integers $k_i$, $i = 1, 2, \ldots, n$, such that

(30) $M^t \cdot e^i > 0$ for $t \geq k_i$

Let $m_i$ be the smallest of such $k_i$'s and let $m \equiv \max\{m_1, m_2, \ldots, m_i, \ldots, m_n\}$.
Then $M^m \cdot e^i > 0$ for all $i = 1, 2, \ldots, n$, so that

(31) $M^m > 0$
as desired. (Q.E.D.)
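
The condition of the lemma can be checked numerically for a given matrix by forming successive powers. The following is a minimal sketch, not part of the original exposition: the helper names `mat_mul` and `first_positive_power`, the cutoff `max_power`, and the example matrix are all illustrative assumptions. The example matrix M has a zero entry, while F = M - I has the strictly positive inverse [[1, 2], [2, 2]], so the hypothesis of the lemma holds.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def first_positive_power(m_matrix, max_power=50):
    """Return the smallest m <= max_power with M^m > 0 elementwise, else None."""
    power = m_matrix
    for m in range(1, max_power + 1):
        if all(entry > 0 for row in power for entry in row):
            return m
        power = mat_mul(power, m_matrix)
    return None

# Hypothetical example (not from the text): M = I + F with
# F = [[-1, 1], [1, -1/2]], whose inverse [[1, 2], [2, 2]] is positive.
M_EXAMPLE = [[Fraction(0), Fraction(1)],
             [Fraction(1), Fraction(1, 2)]]
print(first_positive_power(M_EXAMPLE))
```

Exact rational arithmetic is used so that the sign test is not affected by rounding. A return value of None only says that no positive power was found up to the cutoff; it is not a proof that none exists.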
We are now ready to consider the relative stability property of a balanced
growth solution of the system of difference equations x(t + 1) = M x(t), where
$M \equiv (I + F)$. Consider the following particular solution, which is a (meaningful)
balanced growth solution if $\lambda_1 > 0$ and $x^1 > 0$:

(32) $x^*(t) = \lambda_1^t x^1$

Hence, assuming that all the eigenvalues of M are distinct,15 the solution $x^*(t)$
is relatively stable (by definition) if and only if

(33) $\dfrac{x_j(t)}{x_j^*(t)} = \dfrac{\sum_{i=1}^{n} h_i \lambda_i^t x_j^i}{\lambda_1^t x_j^1} \to \sigma > 0$ as $t \to \infty$, $j = 1, 2, \ldots, n$

where $x_j^i$ is the jth element of $x^i$, $i = 1, 2, \ldots, n$, and x(t) is the general solution
written in the form of (10), whose jth element is $x_j(t)$. It is clear then that the
balanced growth path $x^*(t)$ is relatively stable if and only if $\lambda_1 > |\lambda_i|$, $i = 2, \ldots, n$,
with $h_1 > 0$. Hence, in view of the previous lemma, we can conclude the following.

Theorem 6.B.1: Suppose there exists a positive integer m such that $M^m > 0$
with $F^{-1} > 0$; then the balanced growth solution $x^*(t)$ as defined in (32) is relatively
stable, where $\lambda_1^m$ is the Frobenius root of $M^m$. Conversely, if the solution $x^*(t)$ is
relatively stable and $F^{-1} > 0$, then there exists a positive integer m such that
$M^m > 0$.16
REMARK: Nikaido ([32], section 22) proved the first half of this theorem
for the case m = 1 (that is, M > 0), which is a special case of the Solow-
Samuelson theorem.
REMARK: The matrix M is not necessarily a nonnegative matrix in the
above theorem. However, if M is a nonnegative matrix, then $\lambda_1$ becomes
the Frobenius root of M. Moreover, if M is nonnegative and indecomposable
(but not necessarily M > 0), then from Theorem 4.B.4 there exists a positive
integer m such that $M^m > 0$ if and only if M is primitive. Hence we obtain the
following corollary.

Corollary: Let M be a nonnegative indecomposable matrix. Then if M is primitive,
the balanced growth solution $x^*(t)$ as defined in (32) is relatively stable.
Conversely, if the solution $x^*(t)$ is relatively stable and if $F^{-1} > 0$, then M is
primitive.
REMARK: The result of the first part of this corollary is also obtained in
Nikaido [32], pp. 110-113.
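
The definition of relative stability can be illustrated by iterating the system and watching the ratio of the actual path to the balanced growth path. The sketch below is only an illustration under stated assumptions: the positive matrix, the initial vector, and the function names are hypothetical. The matrix has eigenvalues 3 and 1 with Frobenius vector (1, 1), and the initial vector (1, 0) corresponds to weights of 1/2 on each eigenvector, so each ratio should approach 1/2.

```python
from fractions import Fraction

def step(m_matrix, x):
    """One period of x(t + 1) = M x(t)."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in m_matrix]

def ratios_to_balanced_path(m_matrix, x0, lam1, x1, periods):
    """Return x_j(t) / (lam1**t * x1_j) for each j at t = periods."""
    x = list(x0)
    for _ in range(periods):
        x = step(m_matrix, x)
    return [x_j / (lam1 ** periods * x1_j) for x_j, x1_j in zip(x, x1)]

# Hypothetical positive matrix: eigenvalues 3 and 1, Frobenius vector (1, 1).
M_POS = [[Fraction(2), Fraction(1)],
         [Fraction(1), Fraction(2)]]
X0 = [Fraction(1), Fraction(0)]   # equals (1/2)(1, 1) + (1/2)(1, -1)
RATIOS = ratios_to_balanced_path(M_POS, X0, Fraction(3), [1, 1], periods=30)
print([float(r) for r in RATIOS])
```

Both ratios converge to the same positive constant, which is the weight on the dominant eigenvector; the deviation shrinks like (1/3)^t because the second eigenvalue is 1.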

Above, we remarked that the balanced growth path $x^*(t)$ defined by (32) is
relatively stable if and only if $\lambda_1 > |\lambda_i|$, $i = 2, \ldots, n$. However, the examination
of (33) also reveals that the path $x^*(t)$ is relatively unstable if

(34) $\lambda_1 < |\lambda_i|$, $i = 2, \ldots, n$

Now suppose that $M^{-m} > 0$ for some positive integer m, where $M^{-m}$ is defined by
$(M^m)^{-1}$. Write the Frobenius root of $M^{-m}$ as $\rho_1$ and observe that $\rho_1 > |\rho_i|$,
$i = 2, \ldots, n$, and $1/\rho_i = \mu_i = \lambda_i^m$, $i = 1, 2, \ldots, n$, which in turn implies condition (34).
On the other hand, if $M \equiv I + F$, where $F^{-1} > 0$, and if (34) holds, we can prove
that there exists a positive integer m such that $M^{-m} > 0$. The proof is analogous
to the sufficiency proof of Tsukui's lemma. Therefore, we obtain the following
direct opposite to Tsukui's lemma, which is also due to Tsukui [44].

Theorem 6.B.2 (relative instability theorem): Let $M = [I + F]$ be an $n \times n$
matrix, where F is nonsingular and $F^{-1} > 0$. Then there exists a positive integer
m such that $M^{-m} > 0$ if and only if there exists an eigenvalue $\lambda_1$ of M such that
$\lambda_1 < |\lambda_i|$ [...] $M^{-m} > 0$ for some m is more
common in practice than $M^m > 0$ for some m. If this is the case, the output
system is relatively unstable.17
In the dynamic Leontief system where $M = I + B^{-1}(I - A)$, we may impose
conditions on A and B such as $A \geq 0$, $B \geq 0$, B is nonsingular, (I - A) is nonsingular
with $(I - A)^{-1} > 0$, and $0 < a_{ij} < 1$ for all i and j. However, these conditions
are not sufficient to guarantee that $M^m > 0$ for some positive integer m. In other
words, even setting aside its empirical validity, the condition $M^m > 0$ for some m
is not necessarily satisfied theoretically. To see this, recall Example 1 above (which
is due to DOSSO [4]). As we noted, we have

$M \equiv I + B^{-1}(I - A) = \begin{bmatrix} 5/3 & -1/3 \\ -1/3 & 5/3 \end{bmatrix}$

The eigenvalues of M can be computed easily as 2 and 4/3 with the corresponding
eigenvectors $(1, -1)'$ and $(1, 1)'$ (or their constant multiples). Hence, the general
solution of (8), $x(t + 1) = M \cdot x(t)$, can be written as

(35) $x_1(t) = h_1 2^t + h_2 (4/3)^t$, $x_2(t) = -h_1 2^t + h_2 (4/3)^t$

where $h_1$ and $h_2$ are to be determined by the boundary conditions, say, $x_1(0) =
h_1 + h_2$ and $x_2(0) = -h_1 + h_2$. If $x_1(0)$ and $x_2(0)$ are such that $h_1 = 0$, then
the economy is on a balanced growth path $x_i^*(t) = h_2 (4/3)^t$, i = 1, 2. On the other
hand, if $x_1(0)$ and $x_2(0)$ are such that $h_1 \neq 0$, then one of the outputs eventually
becomes negative. For example, if $h_1 > 0$, then $x_2(t) < 0$ for all $t > \bar{t}$ for some $\bar{t}$.
Note that the balanced growth solution $h_1 2^t$ is impossible for any $h_1 > 0$ under
the assumption that $x(0) \geq 0$, that is, $x_i(0) \geq 0$, i = 1, 2, with strict inequality for
at least one i.18 In other words, we have shown for the present dynamic Leontief
system that one of the outputs eventually becomes negative except when the
boundary conditions are such that the economy is actually on its only balanced
growth path $x_i^*(t) = h_2 (4/3)^t$, i = 1, 2. We may note that if $x_i(t)$ becomes negative
for $t > \bar{t}$, then $K_i(t)$, the stock of this ith good, also becomes negative for $t > \bar{t}$
in view of (3). Clearly, negative output and a negative stock of goods do not make
any economic sense in the present discussion. Such a possibility in a dynamic
Leontief system is called "causal indeterminacy."19
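
The appearance of a negative output can be reproduced by direct iteration. The sketch below is an illustration only. It assumes that the matrix of the example has diagonal entries 5/3 and off-diagonal entries -1/3 (the reading whose eigenvalues are 2 and 4/3 with eigenvectors (1, -1) and (1, 1)); the function name, the horizon, and the two initial vectors are hypothetical choices.

```python
from fractions import Fraction

# Assumed entries of M for the two-good example (see the lead-in).
M_DOSSO = [[Fraction(5, 3), Fraction(-1, 3)],
           [Fraction(-1, 3), Fraction(5, 3)]]

def first_negative_period(m_matrix, x0, horizon=100):
    """Iterate x(t + 1) = M x(t); return the first t with a negative output."""
    x = list(x0)
    for t in range(horizon + 1):
        if any(x_i < 0 for x_i in x):
            return t
        x = [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in m_matrix]
    return None

# Initial outputs off the balanced ray (h1 = 1/2, h2 = 3/2).
OFF_RAY = first_negative_period(M_DOSSO, [Fraction(2), Fraction(1)])
# Initial outputs on the balanced ray (h1 = 0, h2 = 1).
ON_RAY = first_negative_period(M_DOSSO, [Fraction(1), Fraction(1)])
print(OFF_RAY, ON_RAY)
```

Starting from (2, 1), the path is (3, 1), (14/3, 2/3), (68/9, -4/9), so the second output turns negative in period 3. Starting from (1, 1), the path stays on the ray (4/3)^t (1, 1) and no negative output appears within the horizon.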

Definition: If the relative configuration of the initial outputs (or the initial stock
of goods) does not coincide with that of any possible balanced growth path of
the economy, then the growth path may ultimately reach a situation at which the
output (and the stock) of at least one good becomes negative. If this happens,
then we say that we have causal indeterminacy.
Clearly this possibility of causal indeterminacy seriously undermines the
dynamic Leontief model. Note also that if the economy possesses a balanced
growth path which is relatively stable, then there is no causal indeterminacy in
such an economy. This is, as mentioned earlier, another point of crucial importance
in the concept of relative stability.
c. THE PRICE SYSTEM20
Assume that our economy is equipped with "money" which can be produced
at no cost and which functions as a medium of exchange as well as a unit of
account by which the price of each good is measured. Let p(t) be the price vector in
period t, whose ith element pi(t) denotes the price of the ith good in period t.
We assume that the production of all goods takes exactly one period and that
prices are constant throughout each period. It is assumed that no individual
can affect the price of any good that prevails in the market (the "competitiveness"
assumption).
Consider a bundle of goods denoted by the vector $b_j = (b_{1j}, b_{2j}, \ldots, b_{nj})$.
This bundle of goods gives the necessary configuration of capital equipment for
the production of one unit of the jth good. The value of this bundle is equal to
$v_j(t) \equiv p(t) \cdot b_j = \sum_{i=1}^{n} p_i(t) b_{ij}$. Consider an individual who has money in the
amount of $v_j(t)$ at the beginning of period t. Suppose he can either lend this (say to
a "bank") at the rate of interest r(t) or invest it in the production of the jth good.
We assume that no individual can affect the interest rate which prevails in
the economy and that r(t) is the rate which prevails in the economy throughout
period t. By lending at this rate in period t, he can obtain

(36) $[1 + r(t)] v_j(t)$

at the beginning of the (t + 1)th period.
Suppose that, instead of lending his money, he invests it in the production
of the jth good. Then he can buy the configuration of capital equipment which
is necessary for the production of the jth good in the amount of exactly one
unit with his money $v_j(t)$. Assume for the sake of simplicity that (homogeneous)
labor is the only primary factor. Let $a_{0j}$ be the amount of labor necessary to
produce one unit of the jth good. Let $p_0(t)$ be the price of labor ("wages") in period
t and assume that wages and the material cost are paid at the end of the period
(that is, at the beginning of the following period). Then the wage cost and the
material cost for the production of one unit of the jth good in period t are given,
respectively, by

$p_0(t + 1) a_{0j}$ and $\sum_{i=1}^{n} p_i(t + 1) a_{ij} = p(t + 1) \cdot a_j$

where $a_j = [a_{1j}, a_{2j}, \ldots, a_{nj}]$. The current profit for period t per unit production
of the jth good is thus given by

$\pi_j(t) = p_j(t + 1) - p_0(t + 1) a_{0j} - p(t + 1) \cdot a_j$

Since the configuration of capital equipment $b_j$ will be worth $p(t + 1) \cdot b_j$ at the
beginning of period (t + 1), the total value of his assets at the beginning of period
(t + 1) is given by

(37) $\pi_j(t) + p(t + 1) \cdot b_j$
Assuming the competitive arbitrage condition, it should be immaterial in equilibrium
whether one lends the $v_j(t)$ or invests it in the production of the jth good; the
above expression, (37), must be equal to the one given by (36). In other words,
we have

$\pi_j(t) + p(t + 1) \cdot b_j = [1 + r(t)] v_j(t)$

or

(38) $p_j(t + 1) - p_0(t + 1) a_{0j} - p(t + 1) \cdot a_j + p(t + 1) \cdot b_j = [1 + r(t)] p(t) \cdot b_j$

Recalling that $v_j(t) \equiv p(t) \cdot b_j$, and rearranging terms after dividing both sides of
(38) by $v_j(t)$, we obtain

(39) $\dfrac{v_j(t + 1) - v_j(t)}{v_j(t)} + \dfrac{\pi_j(t)}{v_j(t)} = r(t)$

This is the well-known equation of capital theory. We may consider $v_j(t)$ the price
of a unit capacity to produce the jth good. A usual way of interpreting (39) is as
follows: Suppose a person has one unit of money (say, a "dollar") with which he
can buy the capacity for the jth good in the amount of $1/v_j(t)$.21 One unit of this
capacity is worth $v_j(t + 1)$ in period (t + 1) so that $v_j(t + 1)/v_j(t)$ is the value of
the capacity of the original $1/v_j(t)$ units. Moreover, $1/v_j(t)$ units of capacity
yield current profits of $\pi_j(t)/v_j(t)$ in period t. On the other hand, one dollar, if it
is loaned to a "bank," will be worth [1 + r(t)] at the beginning of the (t + 1)th
period. Thus, in equilibrium, we have

(40) $\dfrac{v_j(t + 1)}{v_j(t)} + \dfrac{\pi_j(t)}{v_j(t)} = 1 + r(t)$

which is obviously equivalent to (39).
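
The arbitrage relation can be turned into a small calculation of the rate of interest implied by given prices. The sketch below is illustrative only: the function name `own_rate_of_interest`, its argument names, and all numerical values are assumptions made for the example. It computes the value of the capital bundle, the current profit, and the rate r(t) from the relation that capital gain plus profit per unit of value equals r(t).

```python
from fractions import Fraction

def dot(u, v):
    """Inner product of two vectors of equal length."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))

def own_rate_of_interest(p_now, p_next, wage_next, a0j, a_j, b_j, j):
    """Return (v_j(t), v_j(t + 1), pi_j(t), r(t)) for good j (0-based index)."""
    v_now = dot(p_now, b_j)                    # v_j(t) = p(t) . b_j
    v_next = dot(p_next, b_j)                  # v_j(t + 1) = p(t + 1) . b_j
    # pi_j(t) = p_j(t + 1) - p_0(t + 1) a_0j - p(t + 1) . a_j
    profit = p_next[j] - wage_next * a0j - dot(p_next, a_j)
    # Capital gain plus current profit, per unit of value, as in (39).
    rate = (v_next - v_now) / v_now + profit / v_now
    return v_now, v_next, profit, rate

# Hypothetical data: two goods, constant prices, good j = 0.
P = [Fraction(1), Fraction(2)]
V_NOW, V_NEXT, PROFIT, RATE = own_rate_of_interest(
    p_now=P, p_next=P, wage_next=Fraction(1), a0j=Fraction(1, 4),
    a_j=[Fraction(1, 4), Fraction(1, 8)], b_j=[Fraction(2), Fraction(1)], j=0)
print(V_NOW, PROFIT, RATE)
```

With constant prices there is no capital gain, so the implied rate is simply the profit of 1/4 divided by the value 4 of the bundle, that is, 1/16.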

Coming back to (38), this relation must hold for all j = 1, 2, ..., n. Hence,
recalling that $a_j$ is the jth column of A and $b_j$ is the jth column of B, we obtain

(41) $p(t + 1) \cdot [I - A + B] = [1 + r(t)] p(t) \cdot B + p_0(t + 1) a_0$

where $a_0$ is an n-vector whose jth element is $a_{0j}$. This is the basic price equation
of the dynamic Leontief system. This equation is the "dual" of the output equation
(7), which we may rewrite as

(7') $B \cdot x(t + 1) = [I - A + B] \cdot x(t) - c(t)$

In the above exposition of the derivation, for the sake of convenience, we
introduced such concepts as "money" and "bank" (or lending possibility). But
these are not essential concepts in the construction of the economy. "Money" is
introduced to measure prices, the pj(t)'s, and to facilitate transactions, and
"bank" is introduced solely to facilitate the concept of the rate of interest. If there
are neither "banks" nor any lending possibilities, then r(t) is considered the own
rate of interest in period t. Assuming that p(t) and p(t + 1) are given, we may
then consider equation (39) as the defining equation of the own rate of interest,
r(t).
Assuming again that B is nonsingular [assumption (A-2)], we can rewrite
(41) as

(42) $p(t + 1) = [1 + r(t)] p(t) \cdot [I + (I - A)B^{-1}]^{-1} + W$

where $W \equiv p_0(t + 1) a_0 \cdot [I - A + B]^{-1}$. This equation corresponds to (7) for the
output system. In obtaining (42), the nonsingularity of $[I + (I - A)B^{-1}]$ is
implicitly assumed.
In the closed dynamic Leontief system in which labor is subsumed as one of
the industries, the term W does not appear, so that (42) is simplified to

(43) $p(t + 1) = [1 + r(t)] p(t) \cdot [I + (I - A)B^{-1}]^{-1}$

This is again a system of n first-order, linear, homogeneous difference equations.
We should note the one fundamental assumption in the above procedure of
obtaining (43). That is, one has to know p(t + 1) in period t. Clearly, p(t + 1) is a
future price in period t; thus this is the assumption of perfect foresight.22
We may also note that this system of equations, (43), is not self-contained.
That is, there are (n + 1) unknown time profiles (the prices of n goods and the
interest rate), whereas (43) contains only n equations. If we specify r(t), then we
can solve this system of equations for $p_i(t)$, i = 1, 2, ..., n. An unspecified interest
rate is appropriate in the present model, for there is no mechanism in the model
which determines the interest rate. Alternatively, if we set one of the goods to
be the numeraire [say $p_1(t) = 1$ for all t], then (43) determines r(t) and $p_i(t)$,
i = 2, ..., n. But then there is no mechanism in the model which determines the
absolute prices. The banking system, demand for money, and the like, are not
discussed. In his treatment of the dynamic Leontief price system, Solow proposed
"to let the interest rate hang and treat it as an arbitrary function of time" ([38],
p. 36). Then, soon after this statement, he assumed r(t) constant for all t (p. 36).
Jorgenson [15], in proving his dual stability theorem, assumed that r(t) = 0 for
all t. Here let us also assume r(t) = 0 for all t, so that we now have

(44) $p(t + 1) = p(t) \cdot N$

where $N \equiv [I + (I - A)B^{-1}]^{-1}$. Let $\xi_i$, i = 1, 2, ..., n, be the eigenvalues of N
with corresponding eigenvectors $\hat{p}^i$, i = 1, 2, ..., n, that is,

(45) $\hat{p}^i \cdot N = \xi_i \hat{p}^i$, i = 1, 2, ..., n

If all the eigenvalues of N are distinct, then the general solution of (44) can be
written as

(46) $p(t) = g_1 \xi_1^t \hat{p}^1 + \cdots + g_n \xi_n^t \hat{p}^n$

where the $g_i$, i = 1, 2, ..., n, are constants which are determined by the initial
conditions. Hence again the behavior of p(t) over time depends on the eigenvalues
of N, the $\xi_i$'s.
In order to study the eigenvalues of N, let $\zeta_i$, i = 1, 2, ..., n, be the eigenvalues
of $(I - A)B^{-1}$ with corresponding eigenvectors $q^i$, i = 1, 2, ..., n. That is,

(47) $q^i \cdot (I - A)B^{-1} = \zeta_i q^i$, i = 1, 2, ..., n

Then

(48) $q^i \cdot [I + (I - A)B^{-1}] = (1 + \zeta_i) q^i$, i = 1, 2, ..., n

That is, the $(1 + \zeta_i)$'s are the eigenvalues of $[I + (I - A)B^{-1}]$. Then we have

(49) $q^i \cdot [I + (I - A)B^{-1}]^{-1} = \dfrac{1}{1 + \zeta_i} q^i$, i = 1, 2, ..., n

Hence in view of (45), the eigenvalues $\xi_i$ of N and the corresponding eigenvectors
$\hat{p}^i$ satisfy

(50) $\xi_i = \dfrac{1}{1 + \zeta_i}$ and $\hat{p}^i = q^i$, i = 1, 2, ..., n

Therefore, to study the eigenvalues of N, it suffices to study the eigenvalues of
$(I - A)B^{-1}$. But the eigenvalues of $(I - A)B^{-1}$ are equal to the eigenvalues $\mu_i$
of $B^{-1}(I - A)$, i = 1, 2, ..., n. Hence we have $\zeta_i = \mu_i$, i = 1, 2, ..., n, so that we
obtain

(51) $\xi_i = \dfrac{1}{\lambda_i}$, i = 1, 2, ..., n

since $\lambda_i = 1 + \mu_i$, i = 1, 2, ..., n.

The output system (8), x(t + 1) = M x(t), has a balanced growth path which
is relatively stable if and only if there exists a positive eigenvalue $\lambda_1$ of M such
that $\lambda_1 > |\lambda_i|$, i = 2, ..., n, with $x^1 > 0$, where $x^1$ is the eigenvector associated with
$\lambda_1 > 0$. The relevant balanced growth path which is relatively stable is $x^*(t) =
\lambda_1^t x^1$. But if $\lambda_1 > |\lambda_i|$, i = 2, ..., n, then we have, in view of (51), for the
eigenvalues $\xi_i$ of N,

(52) $\xi_1 < |\xi_i|$, i = 2, ..., n

The price equation which corresponds to the balanced growth path $x^*(t) =
\lambda_1^t x^1$ can be written as

$p^*(t) = \xi_1^t \hat{p}^1$

where $\xi_1 > 0$ and $\hat{p}^1 > 0$ as long as $\lambda_1 > 0$ and $x^1 > 0$. Hence if $x^*(t)$ is relatively
stable, then $p^*(t)$ is not relatively stable, for the ratio $p_i(t)/p_i^*(t)$, i = 1, 2, ..., n,
does not, in general, converge to a constant as $t \to \infty$ in view of (46). Conversely, if
there exists a $\xi_1 > 0$ and the corresponding eigenvector $\hat{p}^1 > 0$ such that $\xi_1 > |\xi_i|$,
i = 2, ..., n, then we have $\lambda_1 < |\lambda_i|$, i = 2, ..., n. Hence the balanced growth
path x*(t) is not relatively stable. We may call this result the (Solow-Jorgenson)
dual stability theorem in view of Solow [ 38] and Jorgenson [ 15] .23
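
The reciprocal relation between the eigenvalues of the output matrix and those of the price matrix, on which the dual stability result rests, can be checked for a small case. The sketch below is an illustration under stated assumptions: the 2 x 2 coefficient matrices are hypothetical, and the helper names are not from the text. For a 2 x 2 matrix the eigenvalues are fixed by the trace and the determinant, so it suffices to verify that det N = 1/det M and trace N = trace M / det M, where M = I + B^{-1}(I - A) and N is the inverse of I + (I - A)B^{-1}.

```python
from fractions import Fraction

def mul(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv(a):
    """Inverse of a nonsingular 2 x 2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def add_identity(a):
    """Return I + a for a 2 x 2 matrix a."""
    return [[a[i][j] + (1 if i == j else 0) for j in range(2)] for i in range(2)]

def trace_det(a):
    """Return (trace, determinant) of a 2 x 2 matrix."""
    return a[0][0] + a[1][1], a[0][0] * a[1][1] - a[0][1] * a[1][0]

# Hypothetical coefficient matrices (not taken from the text).
A = [[Fraction(1, 4), Fraction(1, 2)], [Fraction(1, 4), Fraction(1, 4)]]
B = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(3)]]
I_MINUS_A = [[(1 if i == j else 0) - A[i][j] for j in range(2)] for i in range(2)]

M = add_identity(mul(inv(B), I_MINUS_A))        # output system matrix
N = inv(add_identity(mul(I_MINUS_A, inv(B))))   # price system matrix, r(t) = 0

TR_M, DET_M = trace_det(M)
TR_N, DET_N = trace_det(N)
print(DET_N * DET_M, TR_N * DET_M == TR_M)
```

The two checks hold exactly because N equals B M^{-1} B^{-1}, so N and the inverse of M are similar matrices and share eigenvalues.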
Finally, let us examine whether it is possible for the price vector to be constant
for all t. Coming back to (41), put p(t + 1) = p(t) = constant = p,
r(t) = constant = r, and $p_0(t)$ = constant = $p_0$. Then we obtain

(53) $p \cdot [I - A - rB] = p_0 a_0$

In other words, if the initial price vectors p(0) and $p_0(0)$ happen to be such that
p(0) = p and $p_0(0) = p_0$ and that p and $p_0$ satisfy (53), then all the prices are
constant for all t. For the special case r = 0 and $p_0 a_0 = 0$, that is, the case we are
considering in connection with (44), we have p = 0, assuming (I - A) is nonsingular.
This can also be seen from (44) directly. This means that in the closed
Leontief system with r = 0, the only constant price case is the case in which all
the prices are zero. The constant price solution is the one that Morishima [26]
is concerned with. A discussion of this is given in subsection e.

d. INEQUALITIES AND OPTIMIZATION MODEL (SOLOW)

The phenomenon of causal indeterminacy brings up the following question:
Under what conditions can we guarantee the existence of nonnegative output and
stock vectors, x(t) and K(t), for all t, which satisfy a system of difference equations
such as (8)? This question, in a sense, resembles the Walras-Wald problem of
the existence of a competitive equilibrium as a solution of a system of simultaneous
equations. We recall (Chapter 2, Section E, subsection a) that this problem
is solved by allowing inequalities in the system. We may then naturally conjecture
that the problem of causal indeterminacy can also be avoided by allowing
inequalities in the system. This is precisely the route that Solow [38] investigated.
Following Solow, we consider the following system [instead of (6')]:

(54) $x(t) = A \cdot x(t) + \Delta K(t) + c(t)$

and

(55) $K(t) \geq B \cdot x(t)$, with $x(t) \geq 0$ and $K(t) \geq 0$

Note that the important feature is that inequalities are introduced as
$K(t) \geq B \cdot x(t)$. This, of course, means that we allow the possibility that output may fail
to use up all of the available capacity; that is, excess capacity may exist.24 The
introduction of these inequalities relaxes the "tightness" of the original equality
system such as (6') or (8). However, we can no longer define a unique output path
x(t) [ and a stock path K(t)] as a solution of a given system of difference equations
such as (8). The above relations (54) and (55) define only a "feasible" path of an
output vector and a stock vector, x(t) and K (t). Clearly there can be many x(t) and
K(t) which would satisfy the above relations (54) and (55), even if the initial
conditions, say K(0), are uniquely given. In other words, (54) and (55) define only
the set of feasible paths.
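
That relations (54) and (55) define a set of feasible paths rather than a single path can be seen by testing several candidate output vectors against the same initial stock. The sketch below is illustrative only: the function `next_stock`, the two-good coefficient matrices, the initial stock, and the consumption vector are all hypothetical. For a candidate x(t) it checks nonnegativity and the capacity constraint, and, if they hold, returns the stock K(t + 1) implied by the accounting equality.

```python
from fractions import Fraction

def mat_vec(a, x):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def next_stock(a, b, k, x, c):
    """Return K(t + 1) if x(t) is feasible given K(t), else None."""
    if any(x_i < 0 for x_i in x) or any(k_i < 0 for k_i in k):
        return None                              # nonnegativity fails
    required = mat_vec(b, x)                     # B x(t)
    if any(k_i < need for k_i, need in zip(k, required)):
        return None                              # capacity K(t) >= B x(t) fails
    ax = mat_vec(a, x)
    # Delta K(t) = x(t) - A x(t) - c(t), and K(t + 1) = K(t) + Delta K(t).
    return [k_i + x_i - ax_i - c_i for k_i, x_i, ax_i, c_i in zip(k, x, ax, c)]

# Hypothetical two-good data (not from the text).
A = [[Fraction(1, 4), Fraction(1, 4)], [Fraction(1, 4), Fraction(1, 4)]]
B = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
K0 = [Fraction(4), Fraction(4)]
C0 = [Fraction(1), Fraction(1)]

FULL = next_stock(A, B, K0, [Fraction(4), Fraction(4)], C0)    # full capacity
SLACK = next_stock(A, B, K0, [Fraction(2), Fraction(3)], C0)   # excess capacity
OVER = next_stock(A, B, K0, [Fraction(5), Fraction(4)], C0)    # exceeds capacity
print(FULL, SLACK, OVER)
```

Both the full-capacity and the excess-capacity outputs are feasible from the same K(0) and lead to different stocks K(1). A complete feasibility check for a whole path would also require each resulting K(t + 1) to be nonnegative, which this one-period helper verifies only when it is called again for the next period.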
What then can we say about the output path and the stock path which would
actually be chosen in the economy? The answer to this question should seem
obvious to any economics student, that is, that "it depends on the demand
conditions." But what demand conditions? The best way to understand this is probably
to consider the above dynamic system in terms of a diagram. Write
$d(t) \equiv A \cdot x(t) + c(t)$, which determines the demand for goods (net of accumulation)
in period t, once x(t) is determined. At each t, K(t) is given to the economy and the
possible values of output are determined by $K(t) \geq B \cdot x(t)$, $x(t) \geq 0$. For t = 0 this is
illustrated by the area inside (and including the boundary) OEFG in Figure 6.5.
Suppose that point P is chosen, which is an output vector in period 0, that is, x(0).
Then deducting $d(0) = [d_1(0), d_2(0)]$, we obtain point Q, which, in terms of (54),
defines the increase in the stock of goods in period 0, $\Delta K(0) = K(1) - K(0)$, from
which K(1), the stock vector in the next period, is determined.
It may be plausible to assume that x(0) is chosen such that x(0) is on the
frontier (that is, the kinked line EFG), for any other point (that is, a point inside) can
be improved upon.25 However, the choice of location of x(0) on the EFG line is a
matter of the demand conditions. In this connection, we should note that point Q
and point P are related in a definite way, for vector d(0) is uniquely specified once
x(0) is specified. Hence the choice of x(0) is uniquely related to the choice of
$\Delta K(0)$ if the equality (54) is to hold, so that K(1) is determined in a definite way.
In other words, the "demand condition" simultaneously determines the output
vector of the current period [here x(0)] and the stock vector of the next period
[here K(1)]. Once the stock vector of the next period is determined, we are ready
to consider the next period by taking the stock vector of this period as a new initial
condition.

Figure 6.5. Feasible Set.

Solow [ 38] then proposed to consider this demand condition as a solution of
an optimization model. Suppose, for example, that the demand condition is not
determined in a decentralized way as the sum of each individual's desires or tastes
but is rather determined by the central planning authority in such a fashion as to
optimize a certain target.
As such a target (following Solow), we choose a weighted sum of the terminal
capital stocks, $\sum_{i=1}^{n} a_i K_i(T)$, or $a \cdot K(T)$, where $a \geq 0$. Recalling our discussion
in Chapter 1, Section E, we may regard this as the vector maximum problem of
maximizing K(T). In any case, given the value of a, our problem now is a simple
linear programming problem of maximizing $a \cdot K(T)$ subject to

$(I - A) \cdot x(t) \geq c(t) + \Delta K(t)$, t = 0, 1, 2, ..., T - 1
$K(t) \geq B \cdot x(t)$, t = 0, 1, 2, ..., T - 1

and

$x(t) \geq 0$, $K(t) \geq 0$, t = 0, 1, 2, ..., T

Note that we now allow an excess supply of goods [that is, $x(t) > A \cdot x(t) +
\Delta K(t) + c(t)$] as well as excess capacity. The constraints $(I - A) \cdot x(t) \geq
c(t) + \Delta K(t)$ and $K(t) \geq B \cdot x(t)$ are written out as follows (here write $C \equiv I - A$):
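
$$
\left[\begin{array}{ccccc|ccccc}
I & 0 & 0 & \cdots & 0 & -C & 0 & 0 & \cdots & 0 \\
-I & I & 0 & \cdots & 0 & 0 & -C & 0 & \cdots & 0 \\
0 & -I & I & \cdots & 0 & 0 & 0 & -C & \cdots & 0 \\
\vdots & & & \ddots & \vdots & \vdots & & & \ddots & \vdots \\
0 & 0 & \cdots & -I & I & 0 & 0 & 0 & \cdots & -C \\ \hline
0 & 0 & 0 & \cdots & 0 & B & 0 & 0 & \cdots & 0 \\
-I & 0 & 0 & \cdots & 0 & 0 & B & 0 & \cdots & 0 \\
0 & -I & 0 & \cdots & 0 & 0 & 0 & B & \cdots & 0 \\
\vdots & & \ddots & & \vdots & \vdots & & & \ddots & \vdots \\
0 & 0 & \cdots & -I & 0 & 0 & 0 & 0 & \cdots & B
\end{array}\right]
\begin{bmatrix} K(1) \\ K(2) \\ K(3) \\ \vdots \\ K(T) \\ x(0) \\ x(1) \\ x(2) \\ \vdots \\ x(T-1) \end{bmatrix}
\leq
\begin{bmatrix} K(0) - c(0) \\ -c(1) \\ -c(2) \\ \vdots \\ -c(T-1) \\ K(0) \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
$$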

Clearly the choice variables here are K(t), t = 1, 2, ..., T, and x(t), t = 0, 1, 2, ...,
T - 1. It is possible to generalize the above maximization problem by adding
the primary resource constraints such as $a_0 \cdot x(t) \leq L(t)$, where L(t) is the
exogenous supply of labor in period t. Since such an analysis would be analogous to
the subsequent one, we shall leave it to the interested reader.
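In order to consider the price implications of the above problem, Solow considered
the dual of this linear programming problem, which can be easily obtained
by constructing the dual constraints from the above original constraints. That is,
the dual constraints are explicitly written out as follows:

$$
\left[\begin{array}{ccccc|ccccc}
I & -I & 0 & \cdots & 0 & 0 & -I & 0 & \cdots & 0 \\
0 & I & -I & \cdots & 0 & 0 & 0 & -I & \cdots & 0 \\
\vdots & & \ddots & \ddots & \vdots & \vdots & & & \ddots & \vdots \\
0 & 0 & \cdots & I & -I & 0 & 0 & 0 & \cdots & -I \\
0 & 0 & \cdots & 0 & I & 0 & 0 & 0 & \cdots & 0 \\ \hline
-C' & 0 & 0 & \cdots & 0 & B' & 0 & 0 & \cdots & 0 \\
0 & -C' & 0 & \cdots & 0 & 0 & B' & 0 & \cdots & 0 \\
0 & 0 & -C' & \cdots & 0 & 0 & 0 & B' & \cdots & 0 \\
\vdots & & & \ddots & \vdots & \vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & -C' & 0 & 0 & 0 & \cdots & B'
\end{array}\right]
\begin{bmatrix} u(0) \\ u(1) \\ \vdots \\ u(T-2) \\ u(T-1) \\ q(0) \\ q(1) \\ q(2) \\ \vdots \\ q(T-1) \end{bmatrix}
\geq
\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ a \\ 0 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
$$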

Here u(t), t = 0, 1, ..., T - 1, and q(t), t = 0, 1, ..., T - 1, are the dual
variables, and B' and C' are, respectively, the transposes of B and C. We thus
obtain the following set of constraints for the dual problem:

(56) $u(t) - u(t + 1) - q(t + 1) \geq 0$, t = 0, 1, 2, ..., T - 2

(57) $u(T - 1) \geq a$

and

(58) $-C' \cdot u(t) + B' \cdot q(t) \geq 0$, t = 0, 1, 2, ..., T - 1

as well as the nonnegativity constraints, $u(t) \geq 0$, $q(t) \geq 0$, t = 0, 1, ..., T - 1. The
objective of this dual problem is to minimize

$u(0) \cdot [K(0) - c(0)] - \sum_{t=1}^{T-1} u(t) \cdot c(t) + q(0) \cdot K(0) = [u(0) + q(0)] \cdot K(0) - \sum_{t=0}^{T-1} u(t) \cdot c(t)$

Let $\hat{K}(t)$, t = 1, 2, ..., T, and $\hat{x}(t)$, t = 0, 1, ..., T - 1, be a solution of the
original problem, and let $\hat{u}(t)$, t = 0, 1, 2, ..., T - 1, and $\hat{q}(t)$, t = 0, 1, ..., T - 1,
be a solution of the dual problem. Then applying the LP duality theorem (see
Theorem 1.F.1), we can easily conclude that

(59) $a \cdot \hat{K}(T) = [\hat{u}(0) + \hat{q}(0)] \cdot K(0) - \sum_{t=0}^{T-1} \hat{u}(t) \cdot c(t)$

(60) $\hat{K}(t) > 0$ implies $\hat{u}(t) - \hat{u}(t + 1) - \hat{q}(t + 1) = 0$, t = 0, 1, ..., T - 2

(61) $\hat{K}(T) > 0$ implies $\hat{u}(T - 1) = a$

(62) $\hat{x}(t) > 0$ implies $-C' \cdot \hat{u}(t) + B' \cdot \hat{q}(t) = 0$, t = 0, 1, 2, ..., T - 1

(63) $\hat{u}(t) > 0$ implies $\hat{x}(t) = A \cdot \hat{x}(t) + \Delta \hat{K}(t) + c(t)$, t = 0, 1, 2, ..., T - 1

(64) $\hat{q}(t) > 0$ implies $\hat{K}(t) = B \cdot \hat{x}(t)$, t = 0, 1, 2, ..., T - 1

(65) $\hat{K}_j(t) > [B \cdot \hat{x}(t)]_j$ implies $\hat{q}_j(t) = 0$, t = 1, 2, ..., T - 1
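
The equality of the primal and dual optimal values, and the complementary slackness relations, can be verified by hand in the smallest possible case. The sketch below is an illustration only: it takes one good and one period (n = 1, T = 1), and every number and variable name in it is a hypothetical choice. With a positive weight and C = 1 - A > 0, both primal constraints bind at the optimum, and the dual optimum sets the commodity price equal to the weight.

```python
from fractions import Fraction

# Hypothetical one-good, one-period data (n = 1, T = 1); not from the text.
A_COEF, B_COEF = Fraction(1, 2), Fraction(2)
C_COEF = 1 - A_COEF                     # C = I - A in the scalar case
K0, C0, WEIGHT = Fraction(4), Fraction(1, 2), Fraction(3)

# Primal: maximize WEIGHT * K(1) subject to
#   K(1) - C x(0) <= K(0) - c(0),  B x(0) <= K(0),  K(1) >= 0,  x(0) >= 0.
# With C > 0 and WEIGHT > 0 both constraints bind at the optimum.
X_HAT = K0 / B_COEF
K1_HAT = K0 - C0 + C_COEF * X_HAT
PRIMAL_VALUE = WEIGHT * K1_HAT

# Dual: minimize u(0) [K(0) - c(0)] + q(0) K(0) subject to
#   u(0) >= WEIGHT,  -C u(0) + B q(0) >= 0,  u(0) >= 0,  q(0) >= 0.
U_HAT = WEIGHT
Q_HAT = C_COEF * U_HAT / B_COEF
DUAL_VALUE = U_HAT * (K0 - C0) + Q_HAT * K0

print(PRIMAL_VALUE, DUAL_VALUE)
```

Both values equal 27/2. Since output and the terminal stock are positive, the corresponding dual constraints hold with equality, and since both dual prices are positive, the two primal constraints hold with equality.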
Following Solow [38], we interpret $\hat{u}(t)$ as the vector of commodity prices
in period t, discounted back to the present, and $\hat{q}(t)$ as the vector of stock rents
discounted back to the present. In terms of the previous notations

(66) $\hat{u}(t) = \dfrac{p(t)}{\prod_{\tau=0}^{t-1} [1 + r(\tau)]}$, $\hat{u}(0) = p(0)$ and $\hat{u}(t) \geq 0$

Let R(t) denote the undiscounted vector of stock rents. Then

(67) $\hat{q}(t) = \dfrac{R(t)}{\prod_{\tau=0}^{t-1} [1 + r(\tau)]}$, $\hat{q}(0) = R(0)$ and $\hat{q}(t) \geq 0$

Now assuming that all stocks are always positive, that is, $\hat{K}(t) > 0$ for all t, we
obtain from (60)

(68) $[1 + r(t)] p(t) = R(t + 1) + p(t + 1)$

from which we can easily obtain

(69) $\dfrac{p_j(t + 1) - p_j(t)}{p_j(t)} + \dfrac{R_j(t + 1)}{p_j(t)} = r(t)$, for all t

which corresponds to our old intertemporal-arbitrage equilibrium condition (39).
Note that $R_j(t + 1)$ corresponds to $\pi_j(t)$, which is the current profit (own rent) of
period t from a spectrum of equipment which has a capacity of producing one unit
of the jth good. On the other hand, $R_j(t)$ is the rent on a stock consisting of one unit
of the jth good. Also from (61) we obtain [by assuming $\hat{K}(T) > 0$]

(70) $a = \dfrac{p(T - 1)}{\prod_{\tau=0}^{T-2} [1 + r(\tau)]}$
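
The link between the discounted dual variables and the undiscounted arbitrage condition can be checked numerically. The sketch below is illustrative only: it assumes that the discount factor for period t is the product of [1 + r(tau)] over tau = 0, ..., t - 1 (equal to 1 for t = 0), and the scalar prices, rents, and interest rates are hypothetical. Rents are chosen to satisfy (68), and the code then confirms that the discounted variables satisfy the equality in (60).

```python
from fractions import Fraction

def discount_factors(rates, periods):
    """Return D(t) = product of [1 + r(tau)], tau = 0..t-1, for t = 0..periods."""
    factors = [Fraction(1)]
    for t in range(periods):
        factors.append(factors[-1] * (1 + rates[t]))
    return factors

# Hypothetical scalar data: constant price 1 and interest rate 1/10.
RATES = [Fraction(1, 10), Fraction(1, 10)]
PRICES = [Fraction(1), Fraction(1), Fraction(1)]
# Rents chosen to satisfy (68): R(t + 1) = [1 + r(t)] p(t) - p(t + 1).
RENTS = [None] + [(1 + RATES[t]) * PRICES[t] - PRICES[t + 1] for t in range(2)]

D = discount_factors(RATES, 2)
U = [PRICES[t] / D[t] for t in range(3)]          # discounted prices
Q = [None] + [RENTS[t] / D[t] for t in (1, 2)]    # discounted rents, t >= 1
# Left-hand side of the equality in (60): u(t) - u(t + 1) - q(t + 1).
SLACK = [U[t] - U[t + 1] - Q[t + 1] for t in range(2)]
print(SLACK)
```

The slack is zero in both periods, which is the statement that (68) and the equality in (60) are the same condition written in undiscounted and discounted terms, respectively.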
Using (66) and (67), the constraint $-C' \cdot u(t) + B' \cdot q(t) \geq 0$ can be written as

(71) $-C' \cdot p(t) + B' \cdot R(t) \geq 0$, or $p(t) \leq A' \cdot p(t) + B' \cdot R(t)$

Suppose $\hat{K}(t) > 0$ for all t so that (60) holds; then we have

(72) $(B + C)' \cdot p(t + 1) \leq [1 + r(t)] B' \cdot p(t)$, for all t

If $\hat{x}(t) > 0$, then (71) holds with equality for all t in view of (62). Hence (72) holds
with equality for all t, and we can easily rewrite it as the basic price equation (43)
in the dynamic Leontief system. [If we had incorporated the labor constraint
$a_0 \cdot x(t) \leq L(t)$ in the original maximization problem, we would obtain equation
(42) instead.]
If some stock is not held for some period, that is, if $\hat{K}_j(t) = 0$ for some j and
some t, then (69) does not necessarily hold with equality. In other words,

(73) $\dfrac{p_j(t + 1) - p_j(t)}{p_j(t)} + \dfrac{R_j(t + 1)}{p_j(t)} \leq r(t)$

The strict inequality here would induce a holder of the stock to get rid of it. We
henceforth assume $\hat{K}_j(t) > 0$ for all j and for all t.
Suppose that $a_j > 0$, that is, the terminal stock of the jth good is positively
weighted. Then we can show that $\hat{u}_j(t) > 0$ for all t. To see this, observe that, owing
to (60), $\hat{u}_j(t) = 0$ implies $\hat{u}_j(t + 1) = 0$ and $\hat{q}_j(t + 1) = 0$, so that $\hat{u}_j(t + 2) =
\hat{u}_j(t + 3) = \cdots = \hat{u}_j(T - 1) = 0$. But $\hat{u}_j(T - 1) = a_j > 0$. Hence $\hat{u}_j(t) = 0$ is
impossible for any t. Therefore, assuming a > 0, we can conclude that $\hat{u}(t) > 0$ for all t.
Hence, in view of (63), we obtain

(74) $\hat{x}(t) = A \cdot \hat{x}(t) + \Delta \hat{K}(t) + c(t)$, for all t

In terms of Figure 6.5 this means that the output must always be chosen on the
frontier, that is, the kinked line EFG, as we remarked earlier.
If the stock of the jth good has excess capacity in period t, then in view of (65),
we have $\hat{q}_j(t) = 0$, so that $R_j(t) = 0$. Then, owing to (69),

(75) $p_j(t + 1) = [1 + r(t)] p_j(t)$

In other words, the nominal price of the jth good will increase simply at the
compound rate of interest r(t).
What does this all add up to? It may be worthwhile to recapitulate some of
the results obtained above.

(i) The problem of causal indeterminacy can be avoided simply by converting the
equalities to inequalities, that is, by allowing excess capacity of capital together
with the nonnegativity constraint.
(ii) The dual variables u(t) and q(t) play the role of prices in the competitive mech-
anism. For example, the competitive intertemporal arbitrage equation (69) is
obtained by interpreting the dual variables as prices.
(iii) The values of the dual variables can be computed explicitly by solving the dual
linear programming problem.
(iv) There are certain important relations between prices [p(t) and R(t)] and the
real variables [x(t) and K(t)] implied by relations (59) to (65). Some of them are
obtained as (69), (71), (72), (73), (74), and (75). The possibility of zero prices
and of excess capacity is a novel feature in the present formulation.
(v) The price equation (43) holds if $\hat{K}(t) > 0$ and $\hat{x}(t) > 0$ for all t.

Finally, we should stress that the above model a la Solow is a planning model
and not a descriptive model of an economy. Although it can be interpreted as an
"optimal path" generated by a "competitive" mechanism, it does not describe the
mechanism nor the equilibrium of a competitive dynamic economy. This is in
marked contrast to Morishima's model, which we describe in the next subsection.
However, if we recall the treatment of a competitive (static) equilibrium in terms of
linear programming (with the duality theorem) by Kuhn [18] and DOSSO [4]
(see our Chapter 2, Section E, subsection a), we realize immediately that we can
construct a model for a competitive dynamic economy utilizing the above planning
model and prove various properties of the model such as existence, and so on. In
other words, we may consider the above planning model as a part of a descriptive
model.
There is another possible route of development in the above planning
model, and that is dropping the assumption of fixed coefficients and allowing
various production processes for the production of one or more goods, while still
retaining the basic planning character of the model. A natural question which
then arises is that of characterizations of the "optimal path." The turnpike
theorems that we discuss in the next chapter (Section A) are concerned with this
question, under the assumption that the planning horizon T is long enough. This
route is further investigated by Gale, and others, with a more satisfactory treat-
ment of consumption (Section B of Chapter 7).

e. MORISHIMA'S MODEL OF THE DYNAMIC LEONTIEF SYSTEM

Another important model of the dynamic Leontief system is provided by
Morishima [26] and [27]. The essential difference here from the usual dynamic
Leontief model is that the production coefficients such as the aij's and bij's are
no longer assumed to be constant. That is, these coefficients can now vary depend-
ing on changes in the prices of goods (which are used as factors of production) as
well as on changes in the wage rate.26
One of the important objectives of Morishima in the above-mentioned
papers was to prove that the coefficients, the aij's and bij's, are, in fact, constant.
Although these coefficients are allowed to vary, only fixed values are chosen in
equilibrium so that the usual analysis of the dynamic Leontief system with fixed
coefficients could be justified. This is an extension of the famous substitution
theorems27 of Samuelson from a static to a dynamic case.
Although we will argue that his dynamic substitution theorem holds only
for very limited cases, his introduction of the possibility of factor substitution
is a very important contribution. For example, as we remarked in subsection a,
it could enable us to avoid the difficulty of causal indeterminacy which is in-
herent in the dynamic Leontief model with fixed coefficients. In other words,
suppose that for certain period(s) of time, the stock of some good diminishes.
Then, under the usual circumstances, such a decumulation would increase the
marginal productivity of this good when it is used as a factor of production, which
may, in turn, encourage the production of this good. Hence the decumulation of
the stock of this good could be stopped, thus avoiding causal indeterminacy.28
We do not, however, attempt to prove this observation rigorously in this
subsection. We leave this to the interested reader. Here, instead, we try to build
a foundation for such an attempt by discussing Morishima's model critically.
This will enable the reader to understand how the model is to be constructed
when factor substitution is allowed. In this connection we may also point out an
interesting feature in Morishima's model, that is, he distinguishes capital goods
from noncapital goods explicitly, so there are goods that are never used as capital
goods (unlike the usual Leontief model in which the nonsingularity of the B
matrix is assumed).
Second, Morishima's model represents a polar assumption with regard to the
treatment of the price equation compared with our price equation (41) a la
Solow [38]. In other words, as we will discuss later, whereas (41) represents
the dynamic price equation with changing prices but perfect foresight, Mori-
shima's price equation is based on the assumption that entrepreneurs always
expect prices to remain constant.29

Substitution of goods as factors of production can be explicitly introduced
by considering the production function for each good in the usual neo-classical
way. In other words, omitting t for the sake of notational simplicity, we have

(76) $x_j = f_j(x_{0j}, x_{1j}, \ldots, x_{nj}), \quad j = 1, 2, \ldots, m$

(77) $x_k = f_k(x_{0k}, x_{1k}, \ldots, x_{nk}), \quad k = m + 1, \ldots, n$
where xij is the amount of the ith good used for the production of the jth (non-
capital) good and xik is the amount of the ith good used for the production of
the kth (capital) good. When i = 0, it refers to labor. It is assumed that (homo-
geneous) labor is the only primary factor of production. Assuming that these
production functions (the $f_j$'s and $f_k$'s) exhibit constant returns to scale (that is,
linear homogeneity), and then dividing both sides of the above equations by $x_j$ or
$x_k$, we obtain
(78) $1 = f_i(a_{0i}, \ldots, a_{mi}, b_{m+1,i}, \ldots, b_{ni}), \quad i = 1, 2, \ldots, n$
where $a_{ji}$ and $b_{ki}$ are now defined as $a_{ji} \equiv x_{ji}/x_i$, j = 0, 1, ..., m, and
$b_{ki} \equiv x_{ki}/x_i$, k = m + 1, ..., n (i = 1, 2, ..., n).
Unlike the usual dynamic Leontief system, these aji's and bki's are not
constant. They are assumed to depend on prices, thus reflecting the substitutability
of these goods as factors of production. That is,
(79) $a_{ji} = a_{ji}(p_0, p_1, \ldots, p_n)$, j = 0, 1, ..., m; i = 1, 2, ..., n
(80) $b_{ki} = b_{ki}(p_0, p_1, \ldots, p_n)$, k = m + 1, ..., n; i = 1, 2, ..., n
It is assumed that aji > 0 and bki > 0 for all i, j, and k. The aji's and bki's may
be chosen to minimize the cost of production subject to the production constraints
(76) and (77) [or (78)].
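To see how price-dependent coefficients of the kind in (79) and (80) can arise from cost minimization, the following minimal sketch assumes a Cobb-Douglas technology, which the text does not specify; the exponents `shares` and the price vector are hypothetical. For $f(x) = \prod_j x_j^{\beta_j}$ with $\sum_j \beta_j = 1$, the cost-minimizing input per unit of output is $\beta_j c(p)/p_j$, where $c(p) = \prod_j (p_j/\beta_j)^{\beta_j}$ is the unit cost.

```python
import math

def unit_coefficients(prices, shares):
    """Cost-minimizing inputs per unit of output for f(x) = prod(x_j ** shares_j)."""
    # Unit cost function of a constant-returns Cobb-Douglas technology.
    unit_cost = math.prod((p / s) ** s for p, s in zip(prices, shares))
    return [s * unit_cost / p for p, s in zip(prices, shares)]

shares = [0.5, 0.3, 0.2]   # hypothetical exponents, summing to one
prices = [1.0, 2.0, 4.0]   # hypothetical (p0, p1, p2): wage rate and two input prices

coeffs = unit_coefficients(prices, shares)
# Scaling every price by the same factor should leave the coefficients unchanged.
scaled = unit_coefficients([3.0 * p for p in prices], shares)
# The chosen coefficients should produce exactly one unit of output, as in (78).
output = math.prod(a ** s for a, s in zip(coeffs, shares))

print(coeffs)
print(scaled)
print(output)
```

Under this assumed technology the coefficients depend only on relative prices, and a rise in the wage rate alone lowers the labor coefficient, which is the substitution effect the model is meant to capture.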
The treatment of price equations in Morishima [26] is of the traditional
Walrasian type and is different from Solow's. Let $p_i$ (i = 1, 2, ..., m) be the price
of the noncapital good i, $p_k$ (k = m + 1, ..., n) be the price of capital service k,
and $p_0$ be the wage rate. Let $P_k$ be the price of new capital good k and let $\delta'_k P_k$
and $\delta''_k P_k$ be the depreciation charges and the insurance premium, respectively, to
be deducted from the gross income $p_k$ of one unit of capital good k (k =
m + 1, ..., n). Then the net price of capital service k is $p_k - (\delta'_k + \delta''_k)P_k$. Let r be
the rate of interest. Following Walras [46], Morishima supposed that the net price
of each capital service is equal to $rP_k$. That is, $p_k - (\delta'_k + \delta''_k)P_k = rP_k$, or30

(81) $P_k = \dfrac{p_k}{r + \delta_k}, \quad \delta_k \equiv \delta'_k + \delta''_k, \quad k = m + 1, \ldots, n$
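As a worked instance of (81), with purely hypothetical numbers, suppose that the price of the capital service is 0.30, the depreciation and insurance rates are 0.08 and 0.02, and the interest rate is 0.05. The last line checks that the net price of the capital service then equals $rP_k$, as Walras's condition requires.

```latex
\begin{align*}
\delta_k &= \delta'_k + \delta''_k = 0.08 + 0.02 = 0.10, \\
P_k &= \frac{p_k}{r + \delta_k} = \frac{0.30}{0.05 + 0.10} = 2, \\
p_k - \delta_k P_k &= 0.30 - 0.10 \times 2 = 0.10 = 0.05 \times 2 = r P_k .
\end{align*}
```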

Under a regime of perfect competition, owing to free entry and exit of firms,
profit is zero for all industries. Thus we have, for each period,
(82) $p_i = \sum_{j=0}^{m} a_{ji} p_j + \sum_{k=m+1}^{n} b_{ki} p_k, \quad i = 1, \ldots, m$

(83) $P_i = \sum_{j=0}^{m} a_{ji} p_j + \sum_{k=m+1}^{n} b_{ki} p_k, \quad i = m + 1, \ldots, n$

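Define the matrices $A_1$, $B_1$, $A_2$, $B_2$ by

$$
A_1 = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mm} \end{bmatrix}, \quad
B_1 = \begin{bmatrix} b_{m+1,1} & b_{m+1,2} & \cdots & b_{m+1,m} \\ b_{m+2,1} & b_{m+2,2} & \cdots & b_{m+2,m} \\ \vdots & \vdots & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nm} \end{bmatrix},
$$

$$
A_2 = \begin{bmatrix} a_{1,m+1} & a_{1,m+2} & \cdots & a_{1n} \\ a_{2,m+1} & a_{2,m+2} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m,m+1} & a_{m,m+2} & \cdots & a_{mn} \end{bmatrix}, \quad
B_2 = \begin{bmatrix} b_{m+1,m+1} & \cdots & b_{m+1,n} \\ b_{m+2,m+1} & \cdots & b_{m+2,n} \\ \vdots & & \vdots \\ b_{n,m+1} & \cdots & b_{nn} \end{bmatrix}
$$

Then writing $p^1 = (p_1, p_2, \ldots, p_m)$, $p^2 = (p_{m+1}, \ldots, p_n)$, $P^2 = (P_{m+1}, \ldots, P_n)$,
$l_1 = (a_{01}, \ldots, a_{0m})$, and $l_2 = (a_{0,m+1}, \ldots, a_{0n})$, we can rewrite (82) and (83) as

(84) $p^1 = p^1 \cdot A_1 + p^2 \cdot B_1 + p_0 l_1$

(85) $P^2 = p^1 \cdot A_2 + p^2 \cdot B_2 + p_0 l_2$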

Let $\delta$ be an (n - m) x (n - m) diagonal matrix whose kth diagonal element is
$\delta_{m+k}$ (where all the off-diagonal elements are zero). Then in view of (81), (85)
can be rewritten as

(86) $p^2 = [p^1 \cdot A_2 + p^2 \cdot B_2 + p_0 l_2]\,\delta + r\,[p^1 \cdot A_2 + p^2 \cdot B_2 + p_0 l_2]$

Let $p = (p_1, \ldots, p_n) = (p^1, p^2)$, and define A, B, and l by

(87) $A \equiv \begin{bmatrix} A_1 & A_2\delta \\ B_1 & B_2\delta \end{bmatrix}, \quad B \equiv \begin{bmatrix} 0_1 & A_2 \\ 0_2 & B_2 \end{bmatrix}, \quad l \equiv (l_1,\; l_2(rI + \delta))$

where $0_1$ and $0_2$ are, respectively, m x m and (n - m) x m matrices whose elements
are all zero. Then A and B are both n x n matrices, and l is an n-vector. With these
notations we can now rewrite (84) and (86) as
(88) $p = p \cdot A + r\,p \cdot B + p_0 l$
These matrices A and B correspond to A and B in the usual dynamic
Leontief model. As a result of the convention that there are goods which cannot be
used as capital goods, matrix B contains columns whose elements are all zero.
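The stacking in (87) can be checked numerically. The sketch below is only an illustration under assumptions that are not in the text: the coefficients are held fixed (abstracting from their dependence on prices), there are two noncapital goods and one capital good, and every number is hypothetical. It solves (88) by successive substitution, which converges here because every row sum of A + rB is below one, and then verifies the block equations (84) and (85) together with (81).

```python
def vec_mat(v, M):
    """Row vector v times matrix M (a list of rows)."""
    return [sum(v[j] * M[j][i] for j in range(len(v))) for i in range(len(M[0]))]

# Hypothetical fixed coefficients: m = 2 noncapital goods, one capital good (n = 3).
A1 = [[0.2, 0.1], [0.1, 0.2]]   # a_ji for j, i = 1, 2
B1 = [[0.1, 0.1]]               # b_3i for i = 1, 2
A2 = [[0.2], [0.1]]             # a_j3 for j = 1, 2
B2 = [[0.2]]                    # b_33
delta = [0.1]                   # diagonal of the delta matrix
l1, l2 = [0.3, 0.4], [0.5]      # labor coefficients a_0i
r, p0 = 0.05, 1.0
m, n = 2, 3

# Stack the blocks as in (87); right-multiplying by delta scales each column.
A = [left + [a * d for a, d in zip(right, delta)]
     for left, right in zip(A1 + B1, A2 + B2)]
B = [[0.0] * m + right for right in A2 + B2]
l = l1 + [a * (r + d) for a, d in zip(l2, delta)]

# Solve (88), p = p.A + r p.B + p0 l, by successive substitution.
p = [0.0] * n
for _ in range(500):
    pA, pB = vec_mat(p, A), vec_mat(p, B)
    p = [pA[i] + r * pB[i] + p0 * l[i] for i in range(n)]

# Check the block equations (84) and (85), with the service price recovered from (81).
p1, p2 = p[:m], p[m:]
rhs84 = [x + y + p0 * z for x, y, z in zip(vec_mat(p1, A1), vec_mat(p2, B1), l1)]
P2 = [x + y + p0 * z for x, y, z in zip(vec_mat(p1, A2), vec_mat(p2, B2), l2)]
rental = [(r + d) * P for d, P in zip(delta, P2)]

print(p)
print(P2)
```

In this example the first m columns of B are zero, which is the feature of B noted in the text.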
We can easily compare the above price equation of Morishima with the
Solow equation (41) by recalling that (41) is reduced to (53) when all prices are
constant, which corresponds to the above Morishima equation (88). This makes
us suspect that Morishima's equation is a special case of the Solow type equation,
(41), that is, the case in which all the prices are constant. Reexamination of equa-
tion (81) may strengthen this conviction, for this equation is true only when prices
are constant, so there are no capital gains or losses by holding a stock of goods.
However, to say that Morishima's equation is a special case of Solow's equa-
tion may be a bit too strong. The best stand with regard to this comparison can be
found in Solow's own writing, where he states ([38] , p. 32):
Morishima's model and mine can be reconciled by recognizing that they rep-
resent polar assumptions about price expectations. I assume that entre-
preneurs have perfect foresight and (correctly) expect prices to change, and
I ask what price movements are then logically consistent. Morishima assumes
that entrepreneurs always expect prices to remain constant although in
fact they do change from time to time in order to clear markets, and he asks
what set of constant prices (and interest rate) can actually be made to endure.
It's a toss-up which assumption does more violence to reality.
In other words, what Morishima [26] was concerned with is the possibility of
enduring constant prices. Since he wished to prove the dynamic substitution
theorem in which prices are uniquely chosen (thus constant), he must find the set
of prices (and interest rate) that can be kept constant. Therefore, his way of treat-
ment of the price equation is a natural consequence of his interest in proving his
dynamic substitution theorem.31
We now count the number of equations and the number of unknowns. For
this purpose, first note that the matrices A and B are completely specified once
the price vector $(p_0, p)$ is given. Hence we may write (88) as
(89) $p = p \cdot A(p_0, p) + r\,p \cdot B(p_0, p) + p_0 l$
which we may also write as
(89') $\Phi(p_0, p, r) = 0$
It is well known (and can be shown easily) that the $a_{ji}$'s and the $b_{ki}$'s are all
homogeneous of degree zero in $(p_0, p)$. Hence we have $A(\alpha p_0, \alpha p) = A(p_0, p)$
and $B(\alpha p_0, \alpha p) = B(p_0, p)$ for any positive number $\alpha$. Hence (89) implies
(90) $\alpha p = \alpha p \cdot A(\alpha p_0, \alpha p) + r(\alpha p) \cdot B(\alpha p_0, \alpha p) + \alpha p_0 l$
for any $\alpha > 0$. In other words,
(91) $\Phi(\alpha p_0, \alpha p, r) = 0$
for any $\alpha > 0$. In view of the homogeneity of $\Phi$ in $(p_0, p)$, we may impose the
following price normalization equation:32

(92) $\sum_{i=0}^{n} p_i = 1$
Equations (89) and (92) combined provide us with (n + 1) equations. There are
(n + 2) variables to be determined within the system, that is, $p_i$, i = 0, 1, ..., n, and
r. Hence if we can preassign the value of either $p_0$ or r, the system is completely
specified. This is schematically described by
(93-a) $r \to (p_0, p)$
(93-b) $p_0 \to (p, r)$
The values of (po, p) or (p, r) thus determined define the equilibrium values of the
system. Note that the mechanism described in (93) does not determine the
absolute (monetary) prices of the goods. This is because there are two degrees of
freedom in (89) and one of them is controlled by (92). Alternatively, we may
specify both r and po exogenously where (92) is not binding; then the absolute
prices of the goods can all be specified.
As we remarked in Chapter 1, Section E, subsection a, the procedure of
counting the number of equations and the number of unknowns merely checks the
consistency of the model and proves neither the existence nor the uniqueness
of the equilibrium values of the unknowns. The task of establishing the existence,
uniqueness, and nonnegativity of the equilibrium values was attempted by
Morishima [26]. The problem of existence is not a particularly difficult one. The
continuity of the linear maps $A(p_0, p)$ and $B(p_0, p)$ immediately establishes the
continuity of $\Phi(p_0, p, r)$. For the $r \to (p_0, p)$ determination, it then suffices to
consider a continuous map from a unit simplex $\{(p_0, p): \sum_{i=0}^{n} p_i = 1\}$ into itself
and to apply the Brouwer fixed point theorem (Theorem 2.E.2). The existence of
the equilibrium values for the $p_0 \to (p, r)$ specification can be proved
analogously.33 For an attempt to prove the uniqueness, the reader is referred to
Morishima [26]. Unfortunately, Morishima [26] apparently forgot to prove the
nonnegativity of the equilibrium values. Take the case of the $p_0 \to (p, r)$
specification, for example. Just looking at (89), it can immediately be seen that if $p_0$ is large
enough, r may have to be negative to preserve the nonnegativity of the $(p_0, p)$
vector by (92). Hence it is not surprising to find the brilliant example due to
Georgescu-Roegen [5], in which the value of r is negative when the value of $p_0$
is preassigned. Such a defect can be remedied if we can set an upper bound on the
value of $p_0$. Morishima and Murata [29] thus "remedied" this defect simply by
assuming such a bound.34
That the value of (po, p) is uniquely determined for a given value of r, or that
the value of (p, r) is uniquely determined for a given value of po, has a very
important implication; it means that the aji's and bki's remain constant regardless of
any change in the values of the final demand for goods, provided that these
changes do not disturb the preassigned values of r and po. This means, under
certain assumptions, that a perfectly competitive economy would choose to
produce each good by one process. Hence (as remarked before) the above result,
essentially obtained by Morishima [26] , is considered an extension of Samuel-
son's famous substitution theorem for the static Leontief system to the case of
intertemporal production (see, for example, Hahn and Matthews [9], p. 870).
Under what circumstances are the values of r and po determined in such a
way that they are undisturbed by changes in the final demand for goods? An
obvious case is the "Keynesian situation" in which the money rate of interest r is
fixed owing to the "liquidity trap" in the money market and/or the money wage
rate is fixed owing to its rigidity in the labor market.35 However, except for such
rather extreme cases, we cannot, in general, establish that the values of r and po
are undisturbed by changes in the final demand for goods. This obviously under-
mines the use of Morishima's substitution theorem in the dynamic Leontief system
described above, contrary to the belief by Morishima [26] and Hahn and
Matthews [9] .

To illustrate this point, consider the $p_0 \to (p, r)$ determination. In this case,
the equilibrium value of (p, r) is uniquely determined as long as the value of $p_0$
is given. But if the value of $p_0$, that is, the wage rate, changes, the values of p
and r also move, thus causing changes in the values of the $a_{ji}$'s and the $b_{ki}$'s.
This is not the case for the substitution theorem of the static Leontief model.36
Theoretically speaking, the (absolute) values of po and r are determined
under a broader general equilibrium system which incorporates the markets
ignored in the above consideration-that is, the markets for labor, money, bonds,
and so on-with the introduction of the store of value as a function of money as
well as of bonds. If we incorporate these markets into our analysis, it is more dif-
ficult, at least for this author, to accept Morishima's basic postulate of constant
prices, for constant prices presuppose certain assumptions with regard to the
supply of money and the like.37 Suppose now that prices change over time. Then
the basic equation (89) is no longer valid, for it lacks the term that signifies price
changes, which certainly affect the profit condition.38 This factor, among others,
may destroy the homogeneity of the function $\Phi$, which in turn makes the price
normalization equation (92) invalid.
In this connection, we may point out that within the profession an active
interest has arisen recently in incorporating money (which, among other things,
functions as a store of value) into a growth model (see Tobin [41] and the ap-
pendix to this section, for example).39 Although such an analysis in the literature
has so far been confined to the model in which there is only one commodity be-
sides labor, it is certainly possible to extend it to a multicommodity model by
using the dynamic Leontief system as discussed in this section.
Keeping this in mind, let us return to the constant price world to explore
further implications of such a model. Consider (89) and normalize the price vector
p by the wage rate po instead of the price normalization represented by equation
(92). Thus rewrite (89) as
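(94) $\bar{p} = \bar{p} \cdot A^*(\bar{p}) + r\,\bar{p} \cdot B^*(\bar{p}) + l$
where $\bar{p} \equiv (\bar{p}_1, \ldots, \bar{p}_n)$, $\bar{p}_i \equiv p_i/p_0$, i = 1, 2, ..., n, $A^*(\bar{p}) \equiv A(1, p/p_0)$, and
$B^*(\bar{p}) \equiv B(1, p/p_0)$. There are n equations in (94) with (n + 1) variables, $\bar{p}$ and r.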
Suppose that the value of r is preassigned, and ask the question whether it is
possible to have identical technology matrices $A^*$ and $B^*$ for two different values of
r. If it is possible, we call such a phenomenon reswitching of techniques.40 If the
relation between r and $\bar{p}$ is one-to-one, then "reswitching" is obviously
impossible.41 Note that the relation between r and $\bar{p}$ may not be one-to-one, even if
the functions $A^*(\bar{p})$ and $B^*(\bar{p})$ are continuous and one-to-one.
Actually we can also show that reswitching is impossible even in the absence
of a one-to-one relation between r and $\bar{p}$,42 if $\bar{p}$ is strictly positive for all relevant
changes in r. To prove this, let r and r' be two interest rates, and suppose that
reswitching is possible. In other words, suppose that we have the same matrices $A^*$
and $B^*$ for both r and r'. Since $A^*(\bar{p})$ and $B^*(\bar{p})$ are one-to-one, we have the same
$\bar{p}$ for r and r'. Therefore
(95-a) $\bar{p} = \bar{p} \cdot A^* + r\,\bar{p} \cdot B^* + l$
and
(95-b) $\bar{p} = \bar{p} \cdot A^* + r'\,\bar{p} \cdot B^* + l$
Hence
(96) $(r - r')\,\bar{p} \cdot B^* = 0$
Since $\bar{p} > 0$ by assumption, we obtain r = r' as long as at least one element of
$B^*$ is positive. This proves the impossibility of the reswitching of techniques.
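The step from (95) to (96) can be illustrated numerically. The following is a sketch under assumptions not made in the text: a two-good technology with hypothetical fixed matrices `A_star` and `B_star`, and a labor vector `l` treated as independent of r. Prices that satisfy the normalized price equation at r leave a residual of exactly $(r - r')\,\bar{p} \cdot B^*$ when inserted into the same equation at r', so they cannot satisfy both unless r = r'.

```python
def row_times(v, M):
    """Row vector v times matrix M (a list of rows)."""
    return [sum(v[j] * M[j][i] for j in range(len(v))) for i in range(len(M[0]))]

def normalized_prices(A, B, l, r, sweeps=500):
    """Solve pbar = pbar.A + r pbar.B + l by successive substitution."""
    pbar = [0.0] * len(l)
    for _ in range(sweeps):
        pA, pB = row_times(pbar, A), row_times(pbar, B)
        pbar = [pA[i] + r * pB[i] + l[i] for i in range(len(l))]
    return pbar

# Hypothetical technology held fixed across both interest rates.
A_star = [[0.2, 0.1], [0.1, 0.3]]
B_star = [[0.0, 0.3], [0.0, 0.2]]   # first column zero, as in the B of (87)
l = [0.4, 0.3]
r, r_prime = 0.05, 0.10

pbar = normalized_prices(A_star, B_star, l, r)
pA, pB = row_times(pbar, A_star), row_times(pbar, B_star)

# Residual of the equation at r_prime, evaluated at the prices that solve it at r.
residual = [pbar[i] - (pA[i] + r_prime * pB[i] + l[i]) for i in range(2)]
# The same quantity computed as in (96).
gap = [(r - r_prime) * x for x in pB]
# Prices that actually solve the equation at r_prime.
pbar_prime = normalized_prices(A_star, B_star, l, r_prime)

print(residual)
print(gap)
print(pbar, pbar_prime)
```

With a positive element in `B_star` and positive prices the residual is nonzero, so the two interest rates support different price vectors for the same technology.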
Let us now turn to the determination of outputs in the above system. Let
$x_j(t)$, j = 1, 2, ..., m, represent the output of noncapital good j and let $x_k(t)$,
k = m + 1, ..., n, represent the output of capital good k. Then we have

(97) $x_j(t) = \sum_{i=1}^{n} a_{ji} x_i(t) + c_j(t), \quad j = 1, 2, \ldots, m$

where $c_j(t)$, j = 1, 2, ..., m, represents the final demand for the jth noncapital
good. Here, unlike the usual dynamic Leontief system, the $c_j$'s as well as the $a_{ji}$'s are
functions of the prices $(p_0, p_1, \ldots, p_n)$.
Let $y_k(t)$ be the existing quantity of capital good k in period t. Assuming that
the existing capital goods are all fully employed, we have

(98) $y_k(t) = \sum_{i=1}^{n} b_{ki} x_i(t), \quad k = m + 1, \ldots, n$

Since the obsolete and destroyed portions of capital goods are $\delta_k y_k(t)$, k =
m + 1, ..., n, we obtain
(99) $x_k(t) = \delta_k y_k(t) + [y_k(t + 1) - y_k(t)] + c_k(t), \quad k = m + 1, \ldots, n$
where $c_k(t)$ is the final demand for capital good k. Hence, in view of (98), we obtain

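(100) $x_k(t) = \sum_{i=1}^{n} a_{ki} x_i(t) + \sum_{i=1}^{n} b_{ki}[x_i(t + 1) - x_i(t)] + c_k(t), \quad k = m + 1, \ldots, n$

where $a_{ki} \equiv \delta_k b_{ki}$, i = 1, 2, ..., n, and k = m + 1, ..., n. Defining the matrices
$\bar{A}$ and $\bar{B}$ by

(101) $\bar{A} \equiv \begin{bmatrix} A_1 & A_2 \\ \delta B_1 & \delta B_2 \end{bmatrix}, \quad \bar{B} \equiv \begin{bmatrix} 0_1 & 0_3 \\ B_1 & B_2 \end{bmatrix}$

where $0_1$ and $0_3$ are respectively m x m and m x (n - m) matrices whose elements
are all zero, we can write (97) and (100) as
(102) $x(t) = \bar{A} \cdot x(t) + \bar{B} \cdot [x(t + 1) - x(t)] + c(t)$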
which corresponds to our earlier output equation (6). Note that, unlike the case
in the usual dynamic Leontief system, $\bar{A}$ and $\bar{B}$, respectively, are different from
A and B, which appeared in the price equation. This is a result of the distinction
between capital goods and noncapital goods and the introduction of the $\delta_k$-factors.
However, we should also note that the determination of the aji's and bki's (by
relative prices) simultaneously determines A, B, $\bar{A}$, and $\bar{B}$. Note that the addition
of (102) to the system increases the number of equations by n, corresponding to the
addition of the n variables $x_i(t)$, i = 1, 2, ..., n. Hence we still have essentially two
degrees of freedom, as discussed earlier. In other words, we need to specify two
variables-say, r and po-to specify completely the equilibrium values of all
the variables. Specification of these variables requires the consideration of
markets other than the goods market, in particular the money market and the
labor market.
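The output relations (97) to (99) can be traced period by period in the simplest case. The sketch below is a minimal illustration, assuming one noncapital good and one capital good, coefficients held fixed, constant final demands, and hypothetical parameter values; the function name `simulate` is likewise an assumption. Given the capital stock y(t), full employment (98) and the static relation (97) determine both outputs, and (99) then gives y(t + 1).

```python
def simulate(y0, periods, a11=0.2, a12=0.1, b1=0.5, b2=0.4,
             delta=0.1, c1=10.0, c2=1.0):
    """Return a list of (x1, x2, y): noncapital output, capital-good output, stock."""
    path, y = [], y0
    for _ in range(periods):
        # (97) gives x1 = (a12*x2 + c1)/(1 - a11); substitute into (98) and solve for x2.
        x2 = (y - b1 * c1 / (1 - a11)) / (b1 * a12 / (1 - a11) + b2)
        x1 = (a12 * x2 + c1) / (1 - a11)
        path.append((x1, x2, y))
        # (99): capital-good output covers wear, net accumulation, and final demand.
        y = (1 - delta) * y + x2 - c2
    return path

path = simulate(y0=10.0, periods=5)
for x1, x2, y in path:
    print(round(x1, 3), round(x2, 3), round(y, 3))
```

The initial stock is chosen large enough for the capital-good output to be positive; with a smaller stock the same equations would call for a negative output, which is the kind of difficulty discussed above under causal indeterminacy.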
In the above model, as in the usual dynamic Leontief model, it is assumed
that the final demand vector c(t) is given exogenously; that is, c(t) is given as an
explicit function of t. This means, of course, that the usual treatment and the
above treatment of the dynamic Leontief model completely assume away two
important choices of consumers, namely, consumers' substitution among various
goods owing to changes in prices and the choice between present and future con-
sumption. Extensions of the dynamic Leontief model in these directions are
again left to the interested reader. We may, however, point out that this will add
more flexibility to the Leontief model, in the sense that it will help to avoid the
difficulty posed by causal indeterminacy.
FOOTNOTES

1. Interpreting $[K_i(t + 1) - K_i(t)]$ as a net increase in the stock of the ith good,
$I_i(t)$ is the net investment of the ith good. In other words, we abstract the depreciation
of the stock of goods by properly interpreting the $b_{ij}$'s.
2. That is, the production process of the ith industry produces the stock of the ith
good (for the next period) as a joint output.
3. In reality, many goods are never used as capital goods for the production of certain
goods. On this ground, it is often argued that (A-2) is a serious weakness of the
dynamic Leontief model. However, in the actual computation of B, items such as raw
materials and inventories should be included. Then (A-2) is quite realistic, contrary
to the widespread belief on this point. I am indebted to Jinkichi Tsukui for pointing
this out to me.
4. Since B > 0 is nonsingular, every row and every column of B contain at least one
positive element. Since $(I - A)^{-1} > 0$, this proves that $(I - A)^{-1}B > 0$.
5. That is, if $\omega$ is any other eigenvalue of $(I - A)^{-1}B$, then $\mu \geq |\omega|$. If A is
indecomposable so that $(I - A)^{-1}B > 0$, then (A-4) is automatically satisfied.
Moreover, $(I - A)^{-1}B$ for this case is primitive so that $\mu$ is unique, or $\mu > |\omega|$ (recall
Theorems 4.B.3 and 4.B.5).
6. The continuous differentiability of Hi (everywhere in the domain) means that Hi
possesses continuous partial derivatives (everywhere in the domain), for all
i = 1, 2, ..., n.
7. The function Hi is not necessarily the production function of the ith good. Given
the quantities of various goods in period t, x(t), the quantity of the ith good in
period t + 1, xi(t + 1), is given by Hi. Therefore, Hi can be a complex mixture of
many production processes in the economy. Hence this Solow-Samuelson model is
sometimes called the sausage machine model (see Hahn and Matthews [9], p. 872).
Note that joint output is allowed in the model, for Hi does not necessarily imply
a particular production process (for the production of the ith good). The quantity
represented by xi(t) can be either a "stock" or "flow" of the ith good, although
in Solow and Samuelson [39] it is taken as the "flow" ("output") of the ith good.
8. Hahn and Matthews' work [9] contains a simple exposition of the Solow-Samuelson
theorem for the two-good case (pp. 872-873).
9. There is an excellent exposition in Nikaido ([33], pp. 149-161) of the generalization
of the Solow-Samuelson theorem and the nonlinear extension of the Frobenius
theorem. The assumptions of the differentiability of the Hi's and of the positivity
of the partial derivatives of the Hi's in the Solow-Samuelson theorem are replaced
by the continuity and the "monotonicity" of the Hi's (see theorem 10.7 of [33],
pp. 160-161, especially).
10. Tsukui [43] and McKenzie [23], for example, applied this result to the proofs of
their turnpike theorems.
11. Define F by F = M - I. The dynamic Leontief system (8) is a special case of this in
which $F = B^{-1}(I - A)$ or $M = I + B^{-1}(I - A)$. Note that $F^{-1} > 0$ does not
necessarily imply M > 0. Also note (from the subsequent proof) that $F^{-1} > 0$ is
required only in the sufficiency part of the theorem. Furthermore, the assumption
of $F^{-1} > 0$ can be weakened so that $F^{-1}$ is nonnegative, indecomposable, and
primitive.
12. The eigenvector u1 is that associated with the Frobenius root $\mu_1$ for the
transpose of $F^{-1}$.
13. Since $\mu_1$ is the Frobenius root of $F^{-1}$, $\mu_1 > |\mu_i|$ (i = 2, ..., n). Moreover,
$1 + 1/\mu_1 > |1 + 1/\mu_i|$ (i = 2, ..., n) since $\lambda_1 > |\lambda_i|$ (i = 2, ..., n) by assumption. However,
$\mu_1 > |\mu_i|$ (i = 2, ..., n) and $1 + 1/\mu_1 > |1 + 1/\mu_i|$ (i = 2, ..., n) are not inconsistent
with each other. For some readers, an inconsistency may appear to occur, if, for
example, more than one eigenvalue of F is positive. But this is not true. The
assumption of $\lambda_1 > |\lambda_i|$ (i = 2, ..., n) precisely rules out such a possibility.
14. That the eigenvalues of M are all distinct is not essential in the subsequent arguments.
The reader can easily extend our analysis to the case in which there are multiple roots.
For such a case, the general solution is written as $x(t) = \sum_i h_i(t)\lambda_i^t$, where $h_i(t)$
is now a polynomial in t, the order of which is less than the multiplicity and where
these $\lambda_i$'s are distinct. (See any introductory treatment on difference equations.)
15. That the eigenvalues of M are all distinct is not essential in the subsequent arguments
and theorems. It essentially amounts to rewriting $\sum_{i=1}^{n} h_i \lambda_i^t x^i$ as $\sum_i h_i(t)\lambda_i^t$
where $h_i(t)$ is now a polynomial in t, the order of which is less than the multiplicity
of $\lambda_i$, and where these $\lambda_i$'s are distinct.
16. This follows from the sufficiency part of the proof of the previous lemma. The root
$\lambda_1^m$ is the Frobenius root of $M^m > 0$, since $\lambda_1^m > |\lambda_i^m|$, i = 2, ..., n [$\because \lambda_1 > |\lambda_i|$,
i = 2, ..., n]. Note that the assumption $F^{-1} > 0$ can be weakened so that $F^{-1}$ is
nonnegative, indecomposable, and primitive. Incidentally, the x1 in (32) is a positive
vector as is clear from the proof of Tsukui's lemma.
17. When the output system (6) is replaced by x(t) = A x(t) + B [x(t) - x(t - 1)] +
c(t), where the coefficients in B now can be interpreted similarly to the coefficients
in the usual acceleration principle, then relative stability is considered to be empiri-
cally more plausible. I owe this observation to Jinkichi Tsukui. In this system,
investment is assumed to be "passive," that is, it takes place only to supplement the
excess demand for capacity in the preceding period.
18. It suffices to show that we cannot have h2 = 0. To show this, suppose the contrary,
or h2 = 0. Then $x_1(0) = -x_2(0)\ (= h_1)$, which contradicts $x(0) \geq 0$.
19. Our Example 1, as an illustration of causal indeterminacy, is due to DOSSO [4].
20. The formulation of the price system here is due to Solow [38]. See also DOSSO [4]
and Samuelson [35].
21. Assume perfect divisibility of the capital good.
22. Because (43) is obtained by comparing only two periods, the assumption is also
known as that of "myopic perfect foresight."
23. Observing this, Jorgenson proved Solow's conjecture that in the closed Leontief
model, if the output system is relatively stable, then the price system is relatively
unstable and vice versa, provided that n > 2. A similar result is obtained by Uzawa
[45], M. Fukuoka, and H. Niida.
24. In other words, unlike the notation in subsection b, K(t) denotes only the supply of
capital and does not denote the demand for capital, whereas $\Delta K(t)$ denotes the
demand for an increase in the supply of capital.
25. Mathematically speaking, this means that although inequality may be allowed in
(54) [that is, $x(t) \geq A \cdot x(t) + \Delta K(t) + c(t)$], only the equality case is chosen.
Such a choice can be justified if for each good there exists an individual who is
not satiated with the good.
26. One difference between [26] and [27] is that in [26] such a neo-classical "smooth"
substitution with a continuum of activities is assumed, whereas in [27] a "discrete"
substitution with a finite number of activities is assumed. Although the latter
resembles "reality" more closely, there is little theoretical difference between the
two approaches. Hence we mostly adhere to the simpler neo-classical case [26].
27. Since the "substitution theorem" asserts that the input coefficients are in fact
fixed, it is often (perhaps more properly) called the nonsubstitution theorem.
28. If we allow such a price flexibility and factor substitution, then all the factors (here
labor and the stocks of goods = capital) are fully employed. Hence Morishima's
model is in sharp contrast to Solow's excess capacity model [38] described in
subsection d. It resembles more closely Solow's neo-classical revision [37] of the
Harrod-Domar model. Morishima's model also contrasts sharply with Jorgenson's
descriptive excess capacity model [16] which converts Solow's optimization model
[38] to a descriptive model, allowing excess capacities but sticking to the fixed
coefficients of production (as in Solow [38]). Jorgenson's model [16] thus also
aims to remove the difficulty of causal indeterminacy and the dual stability theorem,
but has unfortunately attracted severe criticism by McManus [25].
29. In other words, Morishima assumed static expectations. As we will discuss later,
he then asked: What set of constant prices can actually be made to endure so that
such an expectation is realized for each t (which also, in fact, implies perfect
foresight)? Such a state of constant prices is sustained, except for knife edge cases,
only on a balanced growth or decay path [often termed the steady state (growth)
path]. These two polar assumptions with regard to prices and expectation-that is,
perfect foresight with changing prices and static foresight with constant prices-are
quite common in growth theory literature. Both these assumptions are clearly
unrealistic. This rather unfortunate state of the theory is, among other things, due
to the fact that we do not have any elaborate theory with regard to future expectation
and uncertainty. See also the Appendix to this section.
30. As in Walras [46] , part V, (81) is crucial in Morishima in establishing the consistency
of the system with capital accumulation or decumulation under constant prices (that
is, a balanced growth or decay path). Note that (81) also signifies that the returns to
various capital goods are equal, or $[p_k - (\delta'_k + \delta''_k)P_k]/P_k$ is the same for all k
(and equal to r). This means that capital goods are freely transferable among
industries. In the absence of price changes, r is equal to the interest rate (or the own
rate of interest in the moneyless economy).
31. As we remarked earlier, the state of constant prices is sustained, except for knife
edge cases, only in a state of balanced growth or decay. It is not clear how the
economy reaches such a steady state starting from a historically given initial point.
This question, in spite of its great importance, was not investigated by Morishima,
thus undermining the significance of his work.
32. Choose $\alpha = 1/\sum_{i=0}^{n} p_i$ and let $\tilde{p}_i \equiv \alpha p_i$, i = 0, 1, ..., n. Then clearly $\sum_{i=0}^{n} \tilde{p}_i = 1$.
The imposition of (92) amounts to writing $p_i$ for each such $\tilde{p}_i$ and dropping the
homogeneity from (89) or (89').
33. There is a slight complication that we must take care of in the proof. That is, it is
easy to see from (89) that if r is large enough, it may not be possible to have a
positive po; hence it may be impossible to find a "fixed point" in the simplex.
34. However, it is not quite clear from Morishima and Murata [29] what is the mechan-
ism that sets the upper bound of the wage rate po.
35. Morishima [26] pointed out another situation, the "Ricardo-Marx" case, in which
the "real" wage rate is fixed as a result of the "reserve army" of labor and so on
([26], p. 69). Here the "real" wage rate means the "money" wage rate po deflated
by a certain price index (see [26], p. 66).
36. Unlike in Morishima [26], it seems more natural to emphasize the $r \to (p_0, p)$
determination as a dynamic substitution theorem. In this case, as long as r is given,
the aji's and the bki's are fixed regardless of p and po. Since r reflects intertemporal
choice (hence it is abstracted away from the static theory), the static substitution
theorem may be considered as a special case (that is, r = 0) of the $r \to (p_0, p)$
determination.
37. With the introduction of money, which functions as a store of value, the state of
balanced growth or decay of the real goods sector does not necessarily imply
constant absolute prices, although relative prices may be constant. This seems
to be an obvious point, but it is often forgotten in growth theory literature.
38. It will not be too difficult to modify (89) if we assume perfect foresight. The real
task, of course, is to modify (89) under a suitable assumption with regard to
expectations concerning future prices.
39. For a recent survey of such a theory, see, for example, Burmeister and Dobell [1],
chapter 6.
40. The reswitching of techniques has the following important implication: If it occurs,
then it is impossible to say that a lower interest rate implies (in steady-state equilibrium)
a more mechanized technology. Owing to the significance of this statement,
the problem of the reswitching of techniques has excited a part of the profession.
See, for example, the symposium on "Paradoxes in Capital Theory," Quarterly
Journal of Economics, LXXX, November 1966, which argues that reswitching is
possible under an activity analysis type technology. See also Burmeister and Dobell
[1], especially sections 8.6 and 9.2.
41. Although reswitching is possible in an activity analysis type model, it is not
possible under the smooth neo-classical technology (see theorem 5 of [1], p. 279).
For a recent discussion of this problem, see D. A. Starrett, "Switching and Reswitch-
ing in a General Production Model," Quarterly Journal of Economics, LXXXIII,
November 1969.
42. In the "smooth" neo-classical technology (as in Morishima [26]), it may be
natural to assume that A*(p) and B*(p) are continuous and single-valued (and
even one-to-one). However, in the activity analysis type "discrete" technology,
such an assumption would be absurd.

REFERENCES
1. Burmeister, E., and Dobell, A. R., Mathematical Theories of Economic Growth,
London, Macmillan, 1970.
2. Chakravarty, S., Capital and Development Planning, Cambridge, Mass., M.I.T.
Press, 1969.
3. Domar, E. D., Essays in the Theory of Growth, New York, Oxford University Press,
1957.
4. Dorfman, R., Samuelson, P. A., and Solow, R. M., Linear Programming and
Economic Analysis, New York, McGraw-Hill, 1958.
5. Georgescu-Roegen, N., "Book Review: Morishima, M., Equilibrium, Stability and
Growth-A Multi-Sectoral Analysis," American Economic Review, LV, March 1965.
6. Georgescu-Roegen, N., Analytical Economics, Cambridge, Mass., Harvard University Press, 1966.
7. Goodwin, R., "A Non-linear Theory of the Cycle," Review of Economics and Statistics,
XXXII, November 1950.
8. Goodwin, R., "The Non-linear Accelerator and the Persistence of Business Cycles,"
Econometrica, 19, January 1951.
9. Hahn, F. H., and Matthews, R. C. O., "The Theory of Economic Growth: A Survey,"
Economic Journal, LXXIV, December 1964.
10. Harrod, R. F., "An Essay in Dynamic Theory," Economic Journal, XLIX, March
1939.
11. Harrod, R. F., Towards a Dynamic Economics, London, Macmillan, 1948.
12. Harrod, R. F., "Domar and Dynamic Economics," Economic Journal, LXIX, September
1959.
13. Hicks, J. R., A Contribution to the Theory of the Trade Cycle, Oxford, Clarendon
Press, 1950.
14. Jorgenson, D. W., "On Stability in the Sense of Harrod," Economica, XXVII, August
1960.
15. Jorgenson, D. W., "A Dual Stability Theorem," Econometrica, 28, October 1960.
16. Jorgenson, D. W., "Stability of Dynamic Input-Output System," Review of Economic Studies,
XXVIII, February 1961.
17. Jorgenson, D. W., "The Structure of Multi-sector Dynamic Models," International Economic
Review, 2, September 1961.
18. Kuhn, H. W., "On a Theorem of Wald," in Linear Inequalities and Related Systems,
ed. by H. W. Kuhn and A. W. Tucker, Princeton, N.J., Princeton University Press,
1956.
19. Leontief, W. W., The Structure of American Economy, 1919-39, 2nd ed., New York,
Oxford University Press, 1951.
20. Leontief, W. W., "Structural Change," in Studies in the Structure of the American Economy,
by W. W. Leontief et al., New York, Oxford University Press, 1953.
21. Leontief, W. W., "Dynamic Analysis," in Studies in the Structure of the American Economy,
by W. W. Leontief et al., New York, Oxford University Press, 1953.
22. Leontief, W. W., Input-Output Economics, New York, Oxford University Press, 1966.
23. McKenzie, L. W., "Turnpike Theorems for a Generalized Leontief Model," Econo-
metrica, 31, January-April 1963.
24. McManus, M., "Self-Contradiction in Leontief's Dynamic Model," Yorkshire
Bulletin, 9, May 1957.
25. McManus, M., "Notes on Jorgenson's Model," Review of Economic Studies, XXX, June
1963.
26. Morishima, M., "A Dynamic Leontief System with Neo-Classical Production Function,"
chap. III in his Equilibrium, Stability and Growth: A Multi-Sectoral Analysis,
Oxford, Clarendon Press, 1965 (a revision of his paper in Econometrica, 26, July
1958).
27. Morishima, M., "An Alternative Dynamic System with a Spectrum of Technique," chap. IV
in his Equilibrium, Stability and Growth: A Multi-Sectoral Analysis, Oxford, Clarendon
Press, 1965 (a revision of his paper in Econometrica, 27, October 1959).
28. Morishima, M., "Generalization of the Frobenius-Wielandt Theorems for Non-negative
Square Matrices," Journal of London Mathematical Society, 36, 1961 (also in his
Equilibrium, Stability and Growth, appendix).
29. Morishima, M., and Murata, Y., "An Input-Output System Involving Nontransfer-
able Goods," Econometrica, 36, January 1968.
30. Muth, J. F., "A Note on Balanced Growth," Econometrica, 22, October 1954.
31. Neisser, H., "Balanced Growth under Constant Returns to Scale," Econometrica,
22, October 1954.
32. Nikaido, H., Introduction to Sets and Mappings in Modern Economics, tr. by K. Sato,
Amsterdam, North-Holland, 1970 (the Japanese original, Tokyo, 1960).
33. Nikaido, H., Convex Structures and Economic Theory, New York, Academic Press, 1968.
34. Samuelson, P. A., "Abstract of a Model Concerning Substitutability in Open
Leontief Models," in Activity Analysis of Production and Allocation, ed. by T. C.
Koopmans, New York, Wiley, 1951.
35. Samuelson, P. A., "Market Mechanisms and Maximization," in The Collected Scientific Papers
of Paul A. Samuelson, Vol. 1, Cambridge, Mass., M.I.T. Press, 1966.
36. Sargan, J. D., "The Instability of the Leontief Dynamic Model," Econometrica, 26,
July 1958.
37. Solow, R. M., "A Contribution to the Theory of Economic Growth," Quarterly
Journal of Economics, LXX, February 1956.
38. Solow, R. M., "Competitive Valuation in a Dynamic Input-Output System," Econometrica,
27, January 1959.
39. Solow, R. M., and Samuelson, P. A., "Balanced Growth under Constant Returns
to Scale," Econometrica, 21, July 1953.
40. Suits, D. B., "Dynamic Growth Under Diminishing Returns to Scale," Econometrica,
22, October 1954.
41. Tobin, J., "Money and Growth," Econometrica, 33, December 1965.
42. Tsukui, J., "On a Theorem of Relative Stability," International Economic Review,
2, May 1961.
43. Tsukui, J., "Efficient and Balanced Growth Paths in Dynamic Input-Output System-A
Turn-Pike Theorem," Economic Studies Quarterly, XIII, September 1962 (in
Japanese).
44. Tsukui, J., "Application of a Turn-Pike Theorem to Planning for Efficient Accumulation:
An Example for Japan," Econometrica, 36, January 1968.
45. Uzawa, H., "Causal Indeterminacy of the Leontief Dynamic Input-Output System,"
Economic Studies Quarterly, XII, September 1961.
46. Walras, L., Elements of Pure Economics, tr. by W. Jaffe, Homewood, Ill., Richard D.
Irwin, 1954.

Appendix to Section B: Some Problems in the Dynamic Leontief Model - The One-Industry Illustration¹

The purpose of this Appendix is to illustrate some of the difficulties inherent
in the dynamic Leontief model by taking a simple one-industry model as an
example. Clearly, by using such an example, we ignore some important aspects of
the model, such as the interrelatedness among the industries. Moreover, we can-
not discuss such an important concept as a "balanced growth path" in the one-
industry illustration, and hence we ignore one important difficulty in the dynamic
Leontief model: that of causal indeterminacy.
However, the simplicity of the one-industry model will serve to make some
important features of the model stand out as well as clearly illustrate some weak-
nesses of the model. In this Appendix, we point out these weaknesses and try to
show their repercussions on other markets such as the labor market and the
money market, which are often neglected in the dynamic Leontief model.
We should, however, note that the purpose of this Appendix is not to build
a complete model of the one-industry economy. It is rather to point out some of
the important difficulties in the usual dynamic Leontief model using a simpler
model. The task of constructing a more general dynamic Leontief model by taking
account of these criticisms is left to the interested reader.
In order to maintain connection with our discussion of the one-industry
growth model in Chapter 5, we use the notation adopted there. In other words,
$Y_t$ = output, $X_t$ = consumption, $K_t$ = capital stock, $L_t$ = labor, and $I_t$ = investment.
Since we adopt the difference equation formulation here, the subscript t
now refers to period t (instead of time t). The basic equation in the dynamic
Leontief model is the equilibrium relation in the goods market, stating that
supply is equal to demand. In the one-industry model, Yt is considered to be
the final output and the use of this good for intermediate purposes (such as "raw
material") is netted out. That is, Yt is the output after we deduct a portion of
the output used for intermediate purposes. Thus matrix A in the (dynamic)
Leontief model is not explicit in the model. In any case, we obtain the well-known
identity that the total demand for $Y_t$ consists of investment and consumption.
Hence the equilibrium relation in the goods market is, as is well known,
(1) $Y_t = I_t + X_t$
As in the usual Harrod-Domar model, $I_t$ is derived from the "acceleration
principle" in the sense that
(2) $\sigma I_t = Y_{t+1} - Y_t$
where $\sigma$ is assumed to be a positive constant and is called the relation or accelerator
coefficient. This equation can be derived from the following consideration. Let
$K_t$ be the stock of capital goods available in period t. Assuming away depreciation
and obsolescence, or taking $Y_t$ net of these, we have
(3) $K_{t+1} - K_t \equiv \Delta K_t = I_t$
That is, an investment in period t becomes a capacity increase in period t + 1.
Suppose now that the production function for the output can be written as
(4) $Y_t = \sigma K_t$
where $\sigma > 0$ is a constant. Note that $K_t$ here means the amount of capital employed
in period t. That the same notation $K_t$ is used for the supply of capital in period
t implies that we are assuming the full employment of capital. Thus (2) can be
obtained easily from (3) and (4), since $Y_{t+1} - Y_t = \sigma(K_{t+1} - K_t) = \sigma I_t$.
Equations (1) and (2) can be rewritten as
(5) $Y_t = \frac{1}{\sigma}[Y_{t+1} - Y_t] + X_t$
which corresponds to (6) in Section B. Here $1/\sigma$ corresponds to the B matrix, and
there is nothing which corresponds to the A matrix.
It is certainly possible to specify consumption, such as
(6) $X_t = (1 - s)Y_t + c$
where s denotes the marginal propensity to save (which is assumed to be constant
and 0 < s < 1), and $c \geq 0$ is a constant. In the Harrod-Domar model, the specification
of the behavior of consumption as above is made explicit. Moreover, it is
usually assumed that c = 0. The closed Leontief model in which consumption is
included corresponds to a special case of the above, that is, the case in which $X_t = 0$
for all t, or s = 1 and c = 0 for all t. The open Leontief model with a fixed bundle
of final consumption corresponds to a special case of the above, that is, the case in
which $X_t$ = constant, or s = 1 and $X_t = c > 0$. However, it is also to be noted that,
in the general open Leontief model, $X_t$ may not be constant over t but rather is
given exogenously as an explicit function of time t. In any case, as long as $X_t$ is given
either in the induced form such as (6) or in an exogenous manner as a function of t,
we can solve for $Y_t$ explicitly as a function of t.
We may consider the present model to consist of four equations, (1), (3),
(4), and (6), and four variables ($Y_t$, $K_t$, $X_t$, $I_t$) to be determined in the system. If $X_t$ is
given exogenously, then equation (6) drops out from this list and the variable $X_t$ is
also dropped accordingly. In the latter case, we simply obtain $Y_t$ from (5), and in
the former case, we obtain the following equation by combining (5) and (6):
(7) $Y_{t+1} = (1 + s\sigma)Y_t - c\sigma$
From this, $Y_t$ is obtained explicitly as a function of time. Then $K_t$ is obtained from
(4), $I_t$ is obtained from either (2) or (3), and $X_t$ is obtained from (5).
The solution of equation (7) is easily found to be
(8) $Y_t = (1 + s\sigma)^t (Y_0 - Y^*) + Y^*$
where $Y^* \equiv c/s$ and $Y_0$ is the initial output. Clearly, as long as $Y_0 > Y^*$, the
economy is capable of growth. In the usual Harrod-Domar model, c is assumed to
be zero (thus $Y^* = 0$) so that $Y_0 > Y^*$ will always hold, and the economy grows at a
constant rate, $s\sigma$, that is,
(9) $Y_t = Y_0(1 + s\sigma)^t$
When s = 1, the growth rate is equal to $\sigma$, which corresponds to the balanced
growth path in the closed dynamic Leontief model, and $1/\sigma$ corresponds to the
eigenvalue of the B matrix.
Although it is not emphasized in the literature, the above model gets into
trouble when c > 0 and $Y_0 < Y^*$, for then $Y_t$ becomes negative for sufficiently
large t.
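The recursion (7) and its closed-form solution (8) can be checked numerically. The following Python sketch is an illustration only: the function names and the parameter values (s = 0.2, a capital coefficient of 0.5, c = 10, so that Y* = 50) are assumptions chosen for the example and do not come from the text.

```python
def output_path(y0, s, sigma, c, periods):
    """Iterate equation (7): Y_{t+1} = (1 + s*sigma) * Y_t - c*sigma."""
    path = [y0]
    for _ in range(periods):
        path.append((1 + s * sigma) * path[-1] - c * sigma)
    return path


def closed_form(y0, s, sigma, c, t):
    """Equation (8): Y_t = (1 + s*sigma)**t * (Y_0 - Y*) + Y*, with Y* = c/s."""
    y_star = c / s
    return (1 + s * sigma) ** t * (y0 - y_star) + y_star


# Hypothetical parameter values; here Y* = c/s = 50.
S, SIGMA, C = 0.2, 0.5, 10.0

growing = output_path(60.0, S, SIGMA, C, 30)    # Y_0 > Y*: output keeps growing
shrinking = output_path(40.0, S, SIGMA, C, 30)  # Y_0 < Y*: output eventually turns negative

# First period in which output is negative when Y_0 < Y*.
first_negative = next(t for t, y in enumerate(shrinking) if y < 0)
```

With an initial output below Y*, the gap Y_0 - Y* is negative and is magnified each period by the factor (1 + sσ), which is why the path must eventually cross zero.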
Most of the discussions of the dynamic Leontief model, including ours,
avoid this problem by considering the closed Leontief model. In the one-sector
version, this amounts to assuming c = 0 (as well as s = 1) or $X_t = 0$ for all t. As
remarked earlier, there is no difficulty of causal indeterminacy in the present
model, for there is only one sector in the economy [whose growth is described by
(9) regardless of the initial point $Y_0$].
We are now ready to describe another difficulty inherent in the fixed coefficient
model. Let l be the amount of labor necessary to produce one unit of output.
Suppose this is a fixed constant. In order to avoid the above difficulty, assume
$Y_0 > Y^*$ so that $Y_t$ grows over time according to (8). As $Y_t$ grows over time, the
labor requirement, denoted by $L_t$, also grows according to $lY_t$, or
(10) $L_t = [(1 + s\sigma)^t (Y_0 - Y^*) + Y^*]\, l$
Clearly, the actual supply of labor may not grow at the same rate, and there is no
mechanism in the economy to equilibrate the supply and the requirement of labor.
To illustrate this point, assume c = 0 or $Y^* = 0$ following the usual Harrod-Domar
convention, so that (10) is rewritten as
(11) $L_t = Y_0(1 + s\sigma)^t l$
Suppose the supply of labor, $\bar{L}_t$, grows at a constant rate n, so that
(12) $\bar{L}_t = \bar{L}_0(1 + n)^t$

If $n > s\sigma$, then there is an ever-increasing unemployment of labor, and it is hard
to conceive of any society which can tolerate this. If $n < s\sigma$, on the other hand,
then the output growth described by (9) is impossible. The output can grow only
as much as labor grows, namely at the rate n. This, in turn, implies an ever-increasing
unemployment of capital, for the full employment of capital requires
the increase of output at the rate $s\sigma$ (> n by assumption). Here the fixity of the
capital coefficient $\sigma$ is crucial for this dilemma. It is certainly hard to conceive
that such an ever-increasing unemployment of capital is possible and that there is
any continuing investment (or capital construction) as required by (1) in such an
economy.
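The first half of this dilemma, in which labor supply grows faster than the labor requirement, can be illustrated with a few lines of Python. This is a minimal sketch under assumed values: the function names, the growth rates (n = 0.05 against a warranted rate of 0.02), and the starting point of full employment are all hypothetical.

```python
def labor_requirement(y0, s, sigma, l, t):
    """Equation (11): L_t = Y_0 * (1 + s*sigma)**t * l, assuming c = 0."""
    return y0 * (1 + s * sigma) ** t * l


def labor_supply(l0, n, t):
    """Equation (12): labor supply growing at the constant rate n."""
    return l0 * (1 + n) ** t


def unemployment_rate(y0, s, sigma, l, l0, n, t):
    """Share of the labor supply not required by production in period t."""
    return 1 - labor_requirement(y0, s, sigma, l, t) / labor_supply(l0, n, t)


# Hypothetical case n > s*sigma, starting from full employment (l * Y_0 = L_0).
rates = [
    unemployment_rate(100.0, 0.2, 0.1, 1.0, 100.0, 0.05, t)
    for t in range(0, 101, 25)
]
```

Because the ratio of requirement to supply shrinks geometrically, the unemployment rate in this sketch rises toward one without any force in the model to reverse it.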
Let us now turn to the price implication of the above system. Let $p_t$ be
the price of the good (which is also the capital good) in period t. Suppose one
has $p_t K_t$ dollars; then he can buy $K_t$ units of capital in period t, which are worth
$p_{t+1} K_t$ dollars in period (t + 1) (where depreciation and obsolescence are assumed
away). By employing $L_t$ units of labor, he can produce $Y_t$ units of the good, which
are worth $p_{t+1} Y_t$ in period (t + 1). Let $w_t$ be the wage rate in period t. Then the
current profit, in period t, can be computed as $[p_{t+1} Y_t - w_t L_t]$, so that the current
profit per unit dollar (denoted by $\pi_t$) is
(13) $\pi_t = \frac{p_{t+1} Y_t - w_t L_t}{p_t K_t} = \frac{\sigma}{p_t}[p_{t+1} - w_t l]$
where $\sigma \equiv Y_t/K_t$ and $l \equiv L_t/Y_t$. Here we assume (as in Section B, subsection c)
that the wage is paid at the end of each production period.
We can now write the intertemporal arbitrage equation (with perfect foresight)² as
(14) $p_{t+1} K_t + [p_{t+1} Y_t - w_t L_t] = (1 + r_t)(p_t K_t)$
which can be rewritten as
(14') $\frac{p_{t+1}}{p_t} + \frac{\sigma}{p_t}[p_{t+1} - w_t l] = 1 + r_t$
where $r_t$ is the interest rate in period t. This is also written as
(15) $\frac{p_{t+1} - p_t}{p_t} + \frac{\sigma}{p_t}[p_{t+1} - w_t l] = r_t$
Equation (14) or (15) corresponds to equation (39) in Section B.
Following Morishima [16], we now ask whether constant prices are possible
in (15). That is, setting $r_t = r$, $w_t = w$, and $p_t = p^*$, we obtain from (15)
(16) $p^* = \frac{r p^*}{\sigma} + wl$
which corresponds to Morishima's price equation (88) in Section B, where $1/\sigma$
corresponds to Morishima's B matrix. From (16), $p^*$ can be obtained as
(17) $p^* = \frac{\sigma w l}{\sigma - r}$
which is meaningful only when $\sigma > r$.
Following Solow [23], let us not assume that the price $p_t$ is constant, but
assume that both $r_t$ and $w_t$ are constant so that $r_t = r$ and $w_t = w$. Then solving the
above difference equation (14') or (15), we can easily obtain
(18) $p_t = (p_0 - p^*)\left(\frac{1 + r}{1 + \sigma}\right)^t + p^*$
where $p^*$ is defined in (17).
In the knife edge case ($\sigma = r$), $p_t = p_0$ for all t. For all practical purposes, $\sigma$
is likely to be larger than r. If this is the case, then regardless of the initial value
$p_0$, $p_t$ converges to $p^*$ as $t \to \infty$.³ If we assume r = 0 as in Solow [23] and Jorgenson
[14], this assumption of $\sigma > r$ obviously holds. As a matter of fact, if $\sigma < r$, then
something very strange will happen in the system, namely, $p_t \to \infty$ as $t \to \infty$ with
$p_0 > p^*$.
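The convergence claim can be verified by iterating the arbitrage equation directly. The Python sketch below rearranges (14') with constant r and w into a one-step recursion and compares the result with the stationary price in (17); the function names and the numerical values (r = 0.05, a capital coefficient of 0.5, w = l = 1) are assumptions made for illustration.

```python
def price_path(p0, r, sigma, w, l, periods):
    """Iterate (14') with constant r and w, solved for next period's price:
    p_{t+1} = ((1 + r) * p_t + sigma * w * l) / (1 + sigma)."""
    path = [p0]
    for _ in range(periods):
        path.append(((1 + r) * path[-1] + sigma * w * l) / (1 + sigma))
    return path


def stationary_price(r, sigma, w, l):
    """Equation (17): p* = sigma*w*l / (sigma - r), meaningful only if sigma > r."""
    if sigma <= r:
        raise ValueError("p* is meaningful only when sigma > r")
    return sigma * w * l / (sigma - r)


def closed_form_price(p0, r, sigma, w, l, t):
    """Equation (18)."""
    p_star = stationary_price(r, sigma, w, l)
    return (p0 - p_star) * ((1 + r) / (1 + sigma)) ** t + p_star


# Hypothetical values with sigma > r.
R, SIGMA, W, L = 0.05, 0.5, 1.0, 1.0
p_star = stationary_price(R, SIGMA, W, L)
from_below = price_path(0.2, R, SIGMA, W, L, 200)
from_above = price_path(5.0, R, SIGMA, W, L, 200)
```

Both paths approach the same stationary price because the factor (1 + r)/(1 + σ) is below one whenever σ > r.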
I hope that the above discussion has made clear the type of assumptions
buried in Morishima [16], Solow [23], and Jorgenson [14]. Their conclusions
depend crucially on these conditions. In general, neither $r_t$ nor $w_t$ is constant, and
$r_t$ is not necessarily equal to 0. The assumption of $\sigma > r$ is not explicit in the usual
discussions of the price system of the dynamic Leontief system. The assumption
of r = 0 is not crucial in proving the convergence of $p_t$ to $p^*$ as $t \to \infty$. It can be
relaxed to $\sigma > r$.
Morishima introduced factor substitution into the model so that coefficients
such as l and $\sigma$ are no longer fixed constants. The factor substitution in the present
model amounts to incorporating the following production function
(19) $Y_t = F(L_t, K_t)$
which was introduced by Solow [23] and others in connection with the one-sector
model (recall Chapter 5, Section C). Assuming the linear homogeneity of F and
dividing both sides of (19) by $Y_t$, we obtain
(20) $1 = F(l_t, 1/\sigma_t)$, where $l_t \equiv L_t/Y_t$ and $\sigma_t \equiv Y_t/K_t$
which corresponds to our equation (78) in Section B. Notice that subscript t is now
attached to l and $\sigma$. If we follow Morishima in [16], these coefficients $l_t$ and $\sigma_t$
are now chosen to minimize the unit cost $[w_t l_t + \hat{p}_t/\sigma_t]$ subject to (20),⁴ where $\hat{p}_t$
is the price of the capital service of the good. Then we may write the optimal
values of $l_t$ and $\sigma_t$ as
(21) $l_t = l(\hat{p}_t, w_t)$
(22) $\sigma_t = \sigma(\hat{p}_t, w_t)$

which correspond to (79) and (80) in Section B. Let $p_t$ = constant = $p^*$, $w_t$ =
constant = w, and $r_t$ = constant = r. Then we obtain (16), and moreover there is a
unique relation between $p^*$ and $\hat{p}^*$ such as⁵
(23) $\hat{p}^* = r p^*$

which corresponds to Morishima's (81) in Section B. The price equation (16) can
be rewritten [in view of (23)] as
(24) $p_t = p^* = \frac{r p^*}{\sigma_t} + w l_t$, for all t
which corresponds to Morishima's price equation (88) in Section B. Equations
(21), (22), (23), and (24) determine the values of $p^*$, $\hat{p}^*$, $l_t$, and $\sigma_t$ with a given
set of r and w, which is the essence of the dynamic substitution theorem of
Morishima, which we discussed in subsection e.
The analysis above is incomplete in the sense that there are still two degrees
of freedom (say, r and w) in the model. The model will become self-contained
after we explicitly introduce the money market and the labor market. It is to this
task that we now turn.
We take up the money market first. Let $M_t$ be the supply of money exogenously
controlled by the monetary authority and let M be the demand function
for money. We suppose that equilibrium in the money market is written as⁶
(25) $M_t = M(r_t, p_t Y_t, A_t)$, where $A_t \equiv M_t + p_t K_t$
The demand function M is taken to be homogeneous of degree one with respect
to $p_t Y_t$ and $A_t$, for otherwise it would not be independent of the monetary units
which are used to measure $p_t Y_t$ and $A_t$. Thus (25) can be rewritten as
(26) $\frac{M_t}{p_t} = M\left(r_t, Y_t, \frac{A_t}{p_t}\right)$
We assume labor is employed up to the point where the marginal productivity
of labor is equal to the real wage rate.⁷ Then we have
(27) $\frac{w_t}{p_t} = \frac{\partial}{\partial L_t} F(L_t, K_t)$
Assuming that the production function F is homogeneous of degree one, so that
we can write $F(L_t, K_t) = L_t f(k_t)$ where $k_t \equiv K_t/L_t$ and $f(k_t) \equiv F(1, k_t)$, we have
(28) $\frac{w_t}{p_t} = f(k_t) - k_t f'(k_t)$
where $f'(k_t) \equiv df(k_t)/dk_t$. We assume that the supply of labor is given exogenously
and that it grows at a constant rate n. Since, in this Appendix, $L_t$ denotes the
demand for labor, the following equation now signifies the equilibrium relation in
the labor market:
(29) $L_t = L_0(1 + n)^t$
We may suppose that the equilibrium in the labor market is achieved through
fluctuation in the real wage rate $w_t/p_t$. The earlier difficulty of ever-increasing
unemployment of labor can then be avoided. In fact, the full employment of labor,
namely, (29), can be guaranteed by assuming the flexibility of the real wage rate.
In other words, unemployment of labor lowers $w_t/p_t$, which, in view of (28), increases
the employment of labor with a given $K_t$. The reverse will happen with an
excess demand for labor.⁸
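The step from (27) to (28) can be checked numerically for a particular technology. The sketch below assumes a Cobb-Douglas production function with a capital exponent of 0.3; this functional form, the exponent, and the function names are illustrative assumptions, since the text only requires linear homogeneity.

```python
ALPHA = 0.3  # hypothetical capital exponent in F(L, K) = L**(1 - ALPHA) * K**ALPHA


def F(L, K):
    """Linearly homogeneous production function (assumed Cobb-Douglas)."""
    return L ** (1 - ALPHA) * K ** ALPHA


def f(k):
    """Per capita production function f(k) = F(1, k)."""
    return k ** ALPHA


def f_prime(k):
    return ALPHA * k ** (ALPHA - 1)


def real_wage(k):
    """Equation (28): w/p = f(k) - k f'(k)."""
    return f(k) - k * f_prime(k)


def marginal_product_of_labor(L, K, h=1e-6):
    """Central finite-difference approximation of dF/dL, as in equation (27)."""
    return (F(L + h, K) - F(L - h, K)) / (2 * h)
```

For a given capital stock, a lower real wage corresponds to a lower k and hence to higher employment, which is the adjustment mechanism described in the paragraph above.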
Our model now consists of equations (1), (3), (6), (14), (19), (26), (28), and
(29). If consumption $X_t$ is not to be specified, then equation (6) is to be dropped.
If depreciation is explicitly introduced into the model, then (3) can be
modified to
(3') $I_t = (K_{t+1} - K_t) + \delta K_t$
where $\delta$ is the rate of depreciation and $I_t$ now denotes gross investment instead of
net investment. Corresponding to this, $Y_t$ now denotes gross output instead of net
output. As a result of the explicit introduction of depreciation, (14') should be
modified to⁹
(14'') $\frac{p_{t+1}}{p_t}(1 - \delta) + \frac{\sigma_t}{p_t}[p_{t+1} - w_t l_t] = 1 + r_t$
where $\sigma_t \equiv Y_t/K_t$ and $l_t \equiv L_t/Y_t$.
Note that equations (1), (3'), and (19) are summarized as
(30) $F(L_t, K_t) = (K_{t+1} - K_t) + \delta K_t + X_t$
This together with (14''), (19), (26), (28), and (29) will determine the time paths of
$L_t$, $K_t$, $Y_t$, $p_t$, $w_t$, and $r_t$ once the consumption specification, such as (6), is made.
It is certainly possible to describe this model in terms of differential equations,
which can be written as
(31) $F(L_t, K_t) = \dot{K}_t + \mu K_t + X_t$
where $\mu$ is the instantaneous rate of depreciation,¹⁰
(32) $\dot{p}_t + p_t[(1 - \mu) + \sigma_t] = (1 + r_t)p_t + \sigma_t w_t l_t$
where $\sigma_t \equiv Y_t/K_t$ and $l_t \equiv L_t/Y_t$,
(33) $\frac{M_t}{p_t} = M\left[r_t, F(L_t, K_t), \frac{A_t}{p_t}\right]$
and
(34) $L_t = L_0 e^{nt}$
where n now denotes the instantaneous rate of labor growth. Assuming the linear
homogeneity of F, we can obtain the following equation from (31) and (34):
(35) $\dot{k}_t = f(k_t) - \lambda k_t - x_t$
where $k_t \equiv K_t/L_t$, $x_t \equiv X_t/L_t$, $\lambda \equiv n + \mu$, and $f(k_t) \equiv F(1, k_t)$. Since M is
homogeneous of degree one in $p_t Y_t$ and $A_t$, we have¹¹
(36) $\frac{m_t}{p_t} = M\left[r_t, f(k_t), k_t + \frac{m_t}{p_t}\right]$
where $m_t \equiv M_t/L_t$. We may also rewrite (32) as¹²
(37) $\frac{\dot{p}_t}{p_t}k_t + [(1 - \mu)k_t + f(k_t)] = (1 + r_t)k_t + \frac{w_t}{p_t}$
Equations (35), (36), (37), and (28) determine the time path of $[k_t, p_t, r_t, w_t]$,
once the behavior of per capita consumption $x_t$ is specified and the per capita
money supply $m_t$ is determined. The model can be complicated further (and
generalized) by introducing government expenditures and taxes explicitly. This
complication can be handled by reformulating the equilibrium relation of the
goods market (1) and by formulating explicitly the relation among government
expenditures, taxes, and the money supply. We leave this to the interested reader.
Note that equations (37) and (28) can be combined to yield¹³
(38) $r_t = [f'(k_t) - \mu] + \phi_t$
where $\phi_t \equiv \dot{p}_t/p_t$, the rate of inflation or deflation. Equation (38) implies that there
is inflation ($\phi_t > 0$) or deflation ($\phi_t < 0$) at time t, depending on whether the
money interest rate $r_t$ exceeds or falls short of the "net" marginal productivity of
capital, $[f'(k_t) - \mu]$, at time t. Price stability ($\phi_t = 0$) is achieved if and only if
$r_t = f'(k_t) - \mu$.
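That (37) and (28) together reduce to (38) can be confirmed numerically. The Python sketch below solves (37) for the interest rate, substitutes the real wage from (28), and compares the result with the right-hand side of (38). The Cobb-Douglas form of f, the parameter values, and the function names are assumptions for illustration only.

```python
ALPHA, MU = 0.3, 0.05  # hypothetical Cobb-Douglas exponent and depreciation rate


def f(k):
    return k ** ALPHA


def f_prime(k):
    return ALPHA * k ** (ALPHA - 1)


def interest_rate_from_37(k, phi):
    """Solve (37) for r_t, taking w_t/p_t from (28); phi is the inflation rate."""
    real_wage = f(k) - k * f_prime(k)
    left_side = phi * k + (1 - MU) * k + f(k)
    return (left_side - real_wage) / k - 1


def interest_rate_from_38(k, phi):
    """Right-hand side of (38): net marginal product of capital plus inflation."""
    return f_prime(k) - MU + phi
```

The two functions agree for any positive k because the terms in f(k) cancel once the real wage is substituted, leaving only k f'(k) on the left-hand side.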
Per capita consumption $x_t$ depends on the choice between present consumption
and future consumption (present savings) for each individual and on the
income distribution among people (say, between the capitalists and the workers).
Hence $x_t$ would, in general, depend on variables such as $r_t$, $p_t$, $w_t$, and so on. There
is one simple specification of $x_t$ which ignores all such considerations, namely,
the proportional savings behavior, $X_t = (1 - s)[Y_t - \mu K_t]$, or
(39) $x_t = (1 - s)[f(k_t) - \mu k_t]$
where 0 < s < 1 is assumed to be constant.¹⁴
If we adopt this specification of $x_t$, then (35) is simplified to
(40) $\dot{k}_t = sf(k_t) - (n + s\mu)k_t$
so that the time path of $k_t$ becomes independent of the other part of the model.
This is the case Solow was concerned with in his 1956 paper [22]. As we proved
in Chapter 5, Section C, we can show that, under somewhat plausible assumptions,
$k_t$ monotonically approaches a constant value $k_s$ (called Solow's path) as $t \to \infty$,¹⁵
where $k_s$ is defined by
(41) $sf(k_s) = (n + s\mu)k_s$
This occurs regardless of the specification of the other part of the model. For
example, the capital:labor ratio $k_t$ as specified by (40) and the per capita output
$y_t$ move independently of the money supply.
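The monotone approach to Solow's path can be illustrated by integrating (40) numerically. The sketch below uses simple Euler steps and a Cobb-Douglas f, for which (41) has a closed-form solution; the functional form, the parameter values, the step size, and the function names are all assumptions made for the example.

```python
ALPHA, S, N, MU = 0.3, 0.2, 0.02, 0.05  # hypothetical parameter values


def f(k):
    return k ** ALPHA


def solow_steady_state():
    """Solve (41), s f(k_s) = (n + s*mu) k_s, in closed form for f(k) = k**ALPHA."""
    return (S / (N + S * MU)) ** (1 / (1 - ALPHA))


def simulate(k0, dt=0.01, steps=60_000):
    """Euler steps for (40): k_dot = s f(k) - (n + s*mu) k."""
    k = k0
    path = [k]
    for _ in range(steps):
        k += dt * (S * f(k) - (N + S * MU) * k)
        path.append(k)
    return path


k_s = solow_steady_state()
rising = simulate(1.0)    # starts below k_s
falling = simulate(30.0)  # starts above k_s
```

No monetary variable enters `simulate`, which mirrors the statement that under (39) the path of the capital:labor ratio is independent of the money supply.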
Following Tobin's path-breaking works ([27] and [28]), there has arisen a
rather heated discussion on "money and growth." In view of the above considera-
tion, it is not surprising to observe that the main feature of these "money and
growth" models often lies in their departure from the consumption specification
such as (39). More sophisticated behavioral relations on consumption than (39)
can be imposed by recognizing that "income" arises also from a change in the real
value of cash balances as well as from production. Assuming that money consists
only of "outside money," Tobin [28] and others imposed
(42) $X_t = (1 - s)\left[F(L_t, K_t) - \mu K_t + \frac{d}{dt}\left(\frac{M_t}{p_t}\right)\right]$
where $M_t$ denotes outside money (government noninterest-bearing debt). The
introduction of a term such as $d(M_t/p_t)/dt$, which signifies capital gains or losses,
constitutes the major path through which money affects the working of the
economy (in the "money and growth" literature). Following Shell, Sidrauski, and
Stiglitz [20], we may call $[F(L_t, K_t) - \mu K_t + d(M_t/p_t)/dt]$ the real purchasing
power or the purchasing power in terms of the real good. If we recognize that the
price of money in terms of the real good is $1/p_t$ and if we denote it by $p_{mt}$, then
$d(M_t/p_t)/dt = \dot{p}_{mt}M_t + p_{mt}\dot{M}_t$, where $p_{mt} \equiv 1/p_t$. Also denote the "net output"
by $Q_t$; that is, $Q_t \equiv F(L_t, K_t) - \mu K_t$. Then (42) may also be written as¹⁶
(43) $X_t = (1 - s)[Q_t + \dot{p}_{mt}M_t + p_{mt}\dot{M}_t]$
Alternatively, we may assume that the money value of consumption is a
constant function of the money value of income. Then instead of (42) or (43), we
have
(44) $p_t X_t = (1 - s)[p_t Q_t + \dot{M}_t + \dot{p}_t K_t]$
The two specifications (43) and (44) are fundamentally different. As is well known,
(44) involves money illusion (see Burmeister and Dobell [1], pp. 166-167). We
proceed by using (43) or (42).
Write the per capita real cash balances as $z_t \equiv m_t/p_t$. Also write the rate of
change in the money supply as $\theta_t \equiv \dot{M}_t/M_t$. Then, dividing both sides of (42) by
$L_t$, we obtain¹⁷
(45) $x_t = (1 - s)[f(k_t) - \mu k_t + z_t(\theta_t - \phi_t)]$
Combining this with (35), we obtain
(46) $\dot{k}_t = sf(k_t) - (n + s\mu)k_t - (1 - s)z_t(\theta_t - \phi_t)$
Rewrite (36) as
(47) $z_t = M[r_t, f(k_t), k_t + z_t]$
Assume that $\partial M/\partial r_t < 0$, $\partial M/\partial f > 0$, and $1 > \partial M/\partial(k_t + z_t) > 0$.¹⁸ Then we
can show that (47) may be rewritten as
(48) $r_t = g(k_t, z_t)$
where $\partial g/\partial k_t > 0$ and $\partial g/\partial z_t < 0$.¹⁹ Equations (46) and (48), respectively, signify
equilibrium in the goods market and the money market. Combining (48) with the
intertemporal arbitrage equation (38) we obtain
(49) $\phi_t = -[f'(k_t) - \mu - g(k_t, z_t)]$
Using (49), we can rewrite (46) as
(50) $\dot{k}_t = sf(k_t) - (n + s\mu)k_t - (1 - s)z_t[\theta_t + \{f'(k_t) - \mu - g(k_t, z_t)\}]$
Next, noting that²⁰
(51) $\dot{z}_t = (\theta_t - n - \phi_t)z_t$
we obtain from (49)
(52) $\frac{\dot{z}_t}{z_t} = \Psi(k_t, z_t; \theta_t)$
where
(53) $\Psi(k_t, z_t; \theta_t) \equiv \theta_t - n + [f'(k_t) - \mu - g(k_t, z_t)]$

Equations (50) and (52) then define the equilibrium path of $(k_t, z_t)$ for a preassigned
value of $\theta_t$. Once the path of $(k_t, z_t)$ is determined, the rate of price
change $\phi_t$ is determined by (49). The dynamic behavior of $(k_t, z_t)$ can be studied
by constructing a phase diagram in the $(k_t, z_t)$-plane using (50) and (52). The
construction of the phase diagram also reveals the condition for the existence and
uniqueness of the steady state path in which $\dot{z}_t = 0$ and $\dot{k}_t = 0$. This task is left to
the interested reader. Such an analysis can be seen, for example, in Burmeister
and Dobell ([1], Chapter 6). In this connection, note that (46) and (51) imply
(54) $\dot{k}_t = sf(k_t) - (n + s\mu)k_t - (1 - s)(\dot{z}_t + nz_t)$
Hence, if the steady state path (k, z) is ever achieved, then
(55) $sf(k) = (n + s\mu)k + (1 - s)nz$
Therefore, assuming that $f'(k_t) > 0$ for all $k_t$ and k > 0, (55) implies
(56) $k < k_s$
where $k_s$ is the value of k in Solow's steady state path defined by (41). Equation (56)
implies that per capita output under the present steady state path is lower than that
under Solow's steady state path, that is, $f(k) < f(k_s)$. Note that these conclusions
are independent of the rate of change in the money supply, $\theta_t$.²¹ It is, however, to
be stressed that the convergence to the steady state path under the present model
does not necessarily hold (unlike in Solow's theorem in [22]).²² Hence the value
of any statement with regard to the steady state path under the present model is not
very great.
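The inequality (56) can be illustrated by solving (55) for a given level of real balances and comparing the result with Solow's value from (41). In the Python sketch below, z is treated as a given positive number, whereas in the full model it is determined jointly with k through the money market; the Cobb-Douglas form, the parameter values, and the function names are likewise assumptions. With this functional form (55) can have a second, smaller root, which also lies below $k_s$; the sketch locates only the larger one.

```python
ALPHA, S, N, MU = 0.3, 0.2, 0.02, 0.05  # hypothetical parameter values


def f(k):
    return k ** ALPHA


def excess_saving(k):
    """s f(k) - (n + s*mu) k, which must equal (1 - s) n z in (55)."""
    return S * f(k) - (N + S * MU) * k


def solow_k():
    """Solow's steady state from (41), where excess_saving is zero."""
    return (S / (N + S * MU)) ** (1 / (1 - ALPHA))


def peak_k():
    """Capital:labor ratio at which excess_saving is largest."""
    return (S * ALPHA / (N + S * MU)) ** (1 / (1 - ALPHA))


def monetary_steady_state(z, tol=1e-12):
    """Bisection for the root of (55) between peak_k() and solow_k(), given z > 0."""
    target = (1 - S) * N * z
    lo, hi = peak_k(), solow_k()
    if excess_saving(lo) < target:
        raise ValueError("no steady state: z is too large for these parameters")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # excess_saving is decreasing on [peak_k, solow_k]
        if excess_saving(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The larger the real balances held per capita, the further the steady-state capital:labor ratio falls below Solow's value in this sketch.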
Above we assumed that the rate of change of the money supply $\theta_t$ is exogenously
given. Alternatively, we may suppose that the monetary authority
manipulates $\theta_t$ so as to maintain price stability (that is, $\phi_t = 0$ for all t).²³ Imposing
$\phi_t = 0$, we obtain from (46), (51), and (49)
(57) $\dot{k}_t = sf(k_t) - (n + s\mu)k_t - (1 - s)z_t\theta_t$
(58) $\dot{z}_t = z_t(\theta_t - n)$
(59) $f'(k_t) = g(k_t, z_t) + \mu$
Note that (59) implies
(60) $\frac{dz_t}{dk_t} = \frac{f'' - g_k}{g_z}$
where $g_k \equiv \partial g/\partial k_t$ and $g_z \equiv \partial g/\partial z_t$. Hence, assuming that $g_k > 0$, $g_z < 0$, and
$f'' < 0$, we obtain $dz_t/dk_t > 0$.²⁴ Therefore, we may write
(61) $z_t = \kappa(k_t)$, where $\kappa' \equiv \frac{dz_t}{dk_t} > 0$
In order to facilitate our study of the above system, it is necessary to explore
the meaning of (47) further. We noted that (47) can be written as (48). Now note
that (47) can also be written as
(62) $z_t = h(k_t, r_t)$
where $\partial h/\partial k_t > 0$ and $\partial h/\partial r_t < 0$, assuming again $\partial M/\partial r_t < 0$, $\partial M/\partial y_t > 0$, and
$1 > \partial M/\partial(k_t + z_t) > 0$.²⁵ Assume that for a sufficiently large value $r = \bar{r}(k)$, the
demand for money is zero. Then we have
(63) $0 = h(k, \bar{r}(k))$
Assume further that the function $\bar{r}(k)$ satisfies
(64) $0 < \bar{r}(k) < \infty$, $\bar{r}'(k) > 0$, and $\lim_{k \to \infty} \bar{r}(k) = R > 0$
The fact that $\bar{r}(k) > 0$ means that the transaction demand for money is positive.
The shape of $\bar{r}(k)$ is illustrated in Figure 6.6.
Assuming that $f''(k) < 0$, $f'(0) = \infty$, and $f'(\infty) = 0$, the shape of $[f'(k) - \mu]$
is also illustrated in Figure 6.6. As is clear from the diagram, under the above
assumptions, there exists a unique value of $\bar{k} > 0$ which satisfies
(65) $f'(\bar{k}) = \bar{r}(\bar{k}) + \mu$
Now return to (61) or
(61') $z_t = \kappa(k_t) = h[k_t, f'(k_t) - \mu]$
Obviously $\bar{r}(k_t) > f'(k_t) - \mu$ by the definition of $\bar{r}(k_t)$, so that $k_t$ cannot be less
than $\bar{k}$, as is clear from Figure 6.6. The shape of $\kappa(k_t)$ is illustrated in Figure 6.7.²⁶
We are now ready to study the system consisting of (57), (58), and (59) [or
(61')]. Note that (57) and (58) yield an equation which is the same as (54), and note
that (61') implies that $\dot{z}_t = \kappa'\dot{k}_t$. Then we obtain
(66) $[1 + (1 - s)\kappa']\dot{k}_t = sf(k_t) - (n + s\mu)k_t - (1 - s)n\kappa(k_t)$

Figure 6.6. $\bar{r}(k)$ and $\bar{k}$.

Figure 6.7. An Illustration of $\kappa(k_t)$.

The dynamic behavior of $k_t$ can now be deduced easily from Figure 6.8.
It is clear from Figure 6.8 that there exists a unique $k^*$ with $0 < k^* < k_s$
which is defined by
(67) $sf(k^*) = (n + s\mu)k^* + (1 - s)n\kappa(k^*)$
Moreover, from (66) we can immediately conclude that $k_t$ converges to $k^*$
monotonically as $t \to \infty$, regardless of the initial value of $k_0$. In other words,
$k^*$ is globally stable. If $k_t$ converges to $k^*$ monotonically, then from (61') we can
also conclude that $z_t$ converges monotonically to $z^*$, where $z^* \equiv \kappa(k^*)$. Note
that at $(k^*, z^*)$, $\dot{k}_t = \dot{z}_t = 0$, so that (58) implies $\theta_t = n$; that is, the money supply
is increasing at the rate of population growth. Moreover, if $k_t < k^*$, then $z_t < z^*$,
and $z_t$ is monotonically increasing, so that (58) implies $\theta_t > n$. On the other hand,
if $k_t > k^*$, then we can similarly conclude that $\theta_t < n$. The precise formula for
$\theta_t$ can be computed from (57), (58), and (59). Some of the above conclusions may
be summarized as follows:

Figure 6.8. Dynamics of $k_t$.

Proposition: Under the above specification and the assumptions of the model, if
the monetary authority manipulates the money supply so as to maintain price stability,
then there exists a unique steady state path $(k^*, z^*)$, where $0 < k^* < k_s$, which is
globally stable. Along the stipulated price stability path of the money supply, $\theta_t \gtreqless n$
according to whether $k_t \lesseqgtr k^*$.
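The convergence claim can be checked numerically. The following is only a minimal sketch: the production function $f(k) = k^{0.3}$, the linear schedule $\kappa(k) = 0.2k$, and all parameter values are assumptions chosen for illustration (the linear $\kappa$ also ignores the lower bound on $k$ shown in Figure 6.6). It integrates (66) by forward Euler steps and locates $k^*$ of (67) by bisection.

```python
# Illustrative simulation of equation (66). The functional forms and the
# parameter values below are assumptions, not taken from the text.
ALPHA, S, N_RATE, MU, C = 0.3, 0.2, 0.02, 0.05, 0.2


def f(k):
    """Assumed production function f(k) = k**ALPHA (f' > 0, f'' < 0)."""
    return k ** ALPHA


def kappa(k):
    """Assumed real-balance schedule z = kappa(k), with kappa' = C > 0."""
    return C * k


def k_dot(k):
    """Equation (66) solved for dk/dt."""
    numerator = S * f(k) - (N_RATE + S * MU) * k - (1 - S) * N_RATE * kappa(k)
    return numerator / (1 + (1 - S) * C)


def steady_state(lo=1e-6, hi=1e3, tol=1e-10):
    """Bisection for k* in (67): k_dot > 0 below k*, k_dot < 0 above it."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if k_dot(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simulate(k0, dt=0.05, steps=20000):
    """Forward-Euler path of k_t starting from k0."""
    path = [k0]
    for _ in range(steps):
        path.append(path[-1] + dt * k_dot(path[-1]))
    return path


k_star = steady_state()
rising = simulate(1.0)    # starts below k*
falling = simulate(40.0)  # starts above k*
```

Under these assumed forms, both simulated paths approach `k_star` monotonically, one from below and one from above, which is the behavior the Proposition describes.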
The relation between money and growth in connection with the one-industry
model has recently attracted the attention of many economists since Tobin's
fundamental paper [28]. We have simply traced and developed some thoughts
along these lines. Since active research is still being done on this topic, we do
not go any further.²⁷ The above analysis is an exercise under the rather limited
conditions of perfect foresight, full employment, and no fiscal elements.²⁸ The
reader may wish to extend our analysis by realizing these limitations. However, the
purpose of this Appendix is fulfilled if the reader realizes some of the inherent
difficulties in the dynamic Leontief model.
FOOTNOTES

1. The major result on money and growth in this appendix is taken from Takayama
[26].
2. Note that intertemporal arbitrage is concerned only with two periods (t and t + 1).
Hence this perfect foresight assumption is often called myopic perfect foresight, as
mentioned before.
3. Recall that p* > 0 only when a > r.
4. This is clearly myopic, and can be justified under Morishima's assumption of a
stationary state. When we divert ourselves from the steady state (or balanced
growth) assumption, it is desirable to reconsider this decision rule. Here we simply
assume that such a "long-range" decision rule is reduced to the present myopic
rule.
5. Note that in (23), depreciation and obsolescence are assumed away.
6. Here $A_t$ signifies the money value of assets. An alternative formulation with regard
to monetary equilibrium is $M_t = M(r_t, p_t K_t, A_t)$. See Tobin [28], for example.
Following Tobin [28], we may assume that there are only two kinds of assets,
(outside) money and the stock of capital, which certainly justifies the definition of $A_t$
in (25). However, (25) can incorporate private (nongovernmental) bonds, and signify
a part of the portfolio equilibrium of the three types of assets, $M$, $K$, and private
bonds. The introduction of interest-yielding government bonds will complicate the
formulation, although the essence of the conclusions in this Appendix will still
remain. On the other hand, if there are no bonds, equation (15) [and also (14")
and (32), which will appear later as a modified version of (15)] becomes the defining
equation of the own rate of interest (or the money rate of return on the physical
capital) and does not show the intertemporal arbitrage relation. In this case, (25)
describes the portfolio equation of M and K alone.
7. The behavioral rule of cost minimization (for each period) is the major background
for this result.
8. Assume, for example, that $w_t$ adjusts the labor market. Then an excess demand
(resp. supply) of labor will increase (resp. lower) $w_t$ with a given $p_t$, thus increasing
(resp. lowering) $w_t/p_t$. It is usually assumed that $p_t$ adjusts the goods market. This
adjustment mechanism takes place within the framework of the Hicksian week. Here
it is assumed that such an equilibrium is achieved in order to focus attention on
the equilibrium path. Under the Keynesian framework, $w_t$ has downward rigidity;
that is, an excess supply of labor will not reduce $w_t$.
9. With $p_t K_t$ dollars, one can buy $K_t$ units of capital in period $t$, which are worth
$p_{t+1}(K_t - \delta K_t)$ dollars in period $t + 1$. The current profit in $t$ is $p_{t+1} Y_t - w_t L_t$.
Hence we have the intertemporal arbitrage equation $p_{t+1}(K_t - \delta K_t) + (p_{t+1} Y_t - w_t L_t) = (1 + r_t) p_t K_t$.
From this, we can deduce (14").
10. Equation (32) corresponds to (14"). Note that (32) cannot be obtained by simply
setting $p_{t+1} = p_t + \dot p_t$ in (32). To obtain (32), first observe the following
intertemporal arbitrage equation under perfect foresight for the continuous time case,
$p_t = \int_t^\infty \rho_\tau \, e^{-\mu(\tau - t)} \, e^{-\int_t^\tau r_s \, ds} \, d\tau$
where $\rho_\tau \equiv p_\tau a_\tau - w_\tau a_\tau / l_\tau$, and we assume that the integral converges. The term
inside the integral gives the present value of the quasi-rent for time $\tau$. Totally
differentiating the above equation with respect to $t$, we obtain
$\dot p_t = -[p_t a_t - w_t a_t / l_t] + \mu p_t + r_t p_t$, from which we obtain (32). In this derivation of (32), no
assumption is made with regard to the myopic nature of intertemporal arbitrage. It
is assumed that $r_\tau$ is known for all future time ($\tau$). However, (32) can also be obtained
by assuming the myopic nature of intertemporal arbitrage (that is, the arbitrage is
concerned only with the current time and the next instant of time).
11. Suppose that we have the alternative formulation $M_t = M[r_t, p_t K_t, A_t]$. Then
under a suitable set of assumptions we can conclude that $M_t/(p_t K_t)$ is a function of
$r_t$ alone and that it is a decreasing function.
12. Equation (37) is obtained from (32), which assumes perfect foresight in the
intertemporal arbitrage relation. In general, $(\dot p_t/p_t)$ should be replaced by $E(\dot p_t/p_t)$,
which denotes the expected rate of price change at $t$. However, if $(\dot p_t/p_t) \neq E(\dot p_t/p_t)$,
that is, if expectations turn out to be incorrect, then some sort of learning device
to correct such a mistake is necessary. One device which is used in the literature to
cope with this problem is the simple "adaptive expectations" postulate of the form
$\dot\pi_t = \epsilon(\dot p_t/p_t - \pi_t)$, where $\pi_t$ denotes $E(\dot p_t/p_t)$ and $\epsilon \geq 0$ signifies the speed of
adjustment in expectations. It says that the rate of adjustment in the current expected
rate of price change is linearly dependent on the error made in predicting the
current rate of change. No doubt this device can be useful. The reader can, if he is
interested, modify our analysis accordingly using this device. However, the
fundamental question with regard to the background of this device from the viewpoint
of rational behavior is still unclear.
13. Equation (38) says that the money rate of interest ($r_t$) is equal to the net marginal
productivity of capital $[f'(k_t) - \mu]$ plus the rate of inflation (or deflation) $\dot p_t/p_t$.
14. In (39), it is assumed that consumption is a constant fraction of net national product.
Alternatively, we may assume that consumption is a constant fraction of gross
national product. Then (39) is simply written as
(39') $x_t = (1 - s) f(k_t)$
15. This conclusion will be unaltered even if (39) is replaced by (39'). In that case,
the definition of $k_s$ needs to be modified to the one specified by
(41') $s f(k_s) = \lambda k_s$
16. Consider the possibility that real output is very low, or to dramatize the story,

consider the case in which $F(L_t, K_t) = 0$. The consumption specification (42) or
(43) then says that usual bounds on s such as 0 < s 0. Assuming that one does not "eat up" the capital that is already
invested, this is again absurd, for this then implies that one can live by paper money
alone. Hence we may naturally impose the constraint such as $0 \leq X_t \leq F(L_t, K_t)$.
17. Actually, several other alternative specifications are possible. See, for example,
Levhari and Patinkin [15] and Stein [25]. Note also that if $E(\dot p_t/p_t) \neq \dot p_t/p_t$, then
capital gains or losses due to miscalculation should also be introduced in defining
the purchasing power. This point seems to be ignored in the literature.
18. It is easy to see that $\partial M/\partial r_t < 0$ and $\partial M/\partial f > 0$ from the usual Keynesian hypothesis
on liquidity preference; $\partial M/\partial(k_t + z_t) > 0$ or $\partial M/\partial A_t > 0$ says that money is not
an inferior good; and $1 > \partial M/\partial(k_t + z_t)$ says that an extra dollar of wealth will not
all be held in the form of money.
19. The proof of this statement is left to the reader.
20. Equation (51) is obtained from $z_t \equiv m_t/p_t$. Equation (51) implies that in the steady
state path in which $\dot z_t = 0$, $\dot p_t/p_t = \theta_t - n$; that is, the rate of price change is equal to
the rate of change of the money supply minus the rate of population increase.
21. However, the actual values of $k$ and $z$, in general, depend on $\theta_t$. This is often
called the nonneutrality of money. The conclusion that "money matters" is rather
obvious in view of the change of the consumption specification from (39) to (45).
22. In various money and growth models, it has been established that the steady state
path under the present specifications is not globally (nor locally) stable. See, for
example, Burmeister and Dobell [1] and Nagatani [18].
23. To me, this is a much more acceptable hypothesis than the usual one in the money
and growth literature which assumes that the monetary authority keeps the rate of
money change ($\theta_t$) constant forever, regardless of the state of the economy.
24. In other words, if price stability is maintained, then $k_t$ and $z_t$ move together. Needless
to say, (59) signifies that if price stability is maintained, then the marginal physical
productivity of capital is equal to the rate of depreciation plus the rate of interest.
25. The proof is again left to the reader.
26. Such an illustration of $\kappa(k_t)$ is also seen in Burmeister and Dobell ([1], p. 169).
27. For a recent survey of the discussion on "money and growth," the reader is referred
to Stein [25] and Burmeister and Dobell [1].
28. Another major limitation is that we are concerned only with the equilibrium path
in which temporary (or momentary) equilibrium in all markets is achieved
(instantaneously). For pioneering studies in which "disequilibrium" is allowed in this
context, see, for example, Rose [19], Stein [24], and Tsiang [30]. In essence, they
assume the price of the good changes if and only if there is disequilibrium in the
goods market. Writing $I_t$ and $S_t$, respectively, for planned investment and planned
saving at $t$, they (somewhat arbitrarily) imposed that $\dot k_t = \eta I_t + (1 - \eta) S_t$, where
$\eta$ is a constant with $0 < \eta < 1$.

REFERENCES

1. Burmeister, E., and Dobell, A. R., Mathematical Theories of Economic Growth, New
York, Macmillan, 1970.
2. Domar, E. D., "Capital Expansion, Rate of Growth and Employment,"
Econometrica, 14, April 1946.
3. Domar, E. D., "Expansion and Employment," American Economic Review, XXXVII, March
1947.
4. Domar, E. D., Essays in the Theory of Growth, New York, Oxford University Press, 1957.
5. Dorfman, R., Samuelson, P. A., and Solow, R. M., Linear Programming and Eco-
nomic Analysis, New York, McGraw-Hill, 1958.
6. Georgescu-Roegen, N., "Book Review: Morishima, M., Equilibrium, Stability and
Growth: A Multi-Sectoral Analysis," American Economic Review, LV, March 1965.
7. Hahn, F. H., "On Money and Growth," Journal of Money, Credit and Banking, 1,
May 1969.
8. Hahn, F. H., and Matthews, R. C. O., "The Theory of Economic Growth: A Survey,"
Economic Journal, LXXIV, December 1964.
9. Harrod, R. F., "An Essay in Dynamic Theory," Economic Journal, XLIX, March
1939.
10. Harrod, R. F., Towards a Dynamic Economics, London, Macmillan, 1948.
11. Harrod, R. F., "Domar and Dynamic Economics," Economic Journal, LXIX, September
1959.
12. Hicks, J. R., A Contribution to the Theory of the Trade Cycle, Oxford, Clarendon
Press, 1950.
13. Johnson, H. G., "The Neoclassical One-Sector Growth Model: A Geometrical
Exposition and Extension to a Monetary Economy," Economica, XXXIII, August
1966.
14. Jorgenson, D., "A Dual Stability Theorem," Econometrica, XXVIII, October 1960.
15. Levhari, D., and Patinkin, D., "The Role of Money in a Simple Growth Model,"
American Economic Review, LVIII, September 1968.
16. Morishima, M., "A Dynamic Leontief System with Neo-Classical Production Func-
tion," chap. III in his Equilibrium, Stability and Growth: A Multi-Sectoral Analysis,
Oxford, Clarendon Press, 1965 (a revision of his paper in Econometrica, 26, July 1958).
17. Morishima, M., and Murata, Y., "An Input-Output System Involving Nontransfer-
able Goods," Econometrica, 36, January 1968.
18. Nagatani, K., "Professor Tobin on Money and Economic Growth," Econometrica,
38, January 1970.
19. Rose, H., "Real and Monetary Factors in the Business Cycle," Journal of Money,
Credit and Banking, 1, May 1969.
20. Shell, K., Sidrauski, M., and Stiglitz, J. E., "Capital Gains, Income and Saving,"
Review of Economic Studies, XXXVI, January 1969.
21. Sidrauski, M., "Inflation and Economic Growth," Journal of Political Economy,
LXXXIV, October 1966.
22. Solow, R. M., "A Contribution to the Theory of Economic Growth," Quarterly
Journal of Economics, LXX, February 1956.
23. Solow, R. M., "Competitive Valuation in a Dynamic Input-Output System,"
Econometrica, 27, January 1959.
24. Stein, J. L., "Neoclassical and Keynes-Wicksell Monetary Growth Models," Journal
of Money, Credit and Banking, 1, May 1969.
25. Stein, J. L., "Monetary Growth Theory in Perspective," American Economic Review, LX,
March 1970.

26. Takayama, A., "A Note on Money and Growth," Krannert Institute Paper No. 305,
Purdue University, March 1971.
27. Tobin, J., "A Dynamic Aggregative Model," Journal of Political Economy, LXIII,
April 1955.
28. Tobin, J., "Money and Growth," Econometrica, 33, December 1965.
29. Tobin, J., "The Neutrality of Money in Growth Models: A Comment," Economica,
XXXIV, February 1967.
30. Tsiang, S. C., "A Critical Note on the Optimum Supply of Money," Journal of
Money, Credit and Banking, 1, May 1969.
CHAPTER 7 MULTISECTOR OPTIMAL GROWTH MODELS
Section A
TURNPIKE THEOREMS

a. INTRODUCTION
Consider a trip from a suburb of Chicago to a suburb of New York City.
There are many routes that one could take. The fastest way is probably not the
route that is the shortest, that is, the route that is approximately a straight line
between the two suburbs. The fastest route is probably to get to the "turnpike"
as quickly as possible and travel on it until reaching an exit that leads to the
destination. This is true even if the "turnpike" appears to be a very roundabout
route compared to a straight line between the starting point and the terminal
point.
In the problem of economic growth, one may wonder whether or not there is
a path of growth that resembles a "turnpike," that is, a growth path on which an
economy should spend most of its time. This problem was considered by Dorfman,
Samuelson, and Solow (DOSSO) [2] with respect to the von Neumann type of
growth model. They conjectured that there is such a path and that it is none
other than the von Neumann growth path, that is, the path which maximizes the
growth rate among the set of balanced growth paths. This conjecture was first
proved rigorously by Morishima [22] and Radner [25] for the n-commodity case.
Since then, there have been many extensions and variations of the basic theorem.
For example, we list the following important papers: McKenzie [17], [18], [19],
and [20], Nikaido [23], Inada [10], Tsukui [28], [29], and [30], Drandakis [3],
Winter [35], and expository articles by Koopmans [14] and Hahn-Matthews [7]
for simpler cases. Because of the variations in these numerous papers, the theorem
is often referred to in the plural form as turnpike theorems.
Let us now describe the essence of these turnpike theorems more specifically.
The basic model is the von Neumann (or at least a von Neumann type) economy
with n-commodities. The vector of historically given initial stocks of the commodi-
ties is given arbitrarily. It is supposed that the economy wishes to maximize the


vector of the final stocks of commodities or the utility function which is defined
with regard to only the final stocks of commodities as its arguments. Then, in
terms of this optimality criterion, the "best" way for the economy to achieve
its goal is to spend "most" of its time "sufficiently close" to the von Neumann
growth path, regardless of the initial point, provided that the planning horizon
(that is, the terminal time) is sufficiently far away. In almost all versions of the
turnpike theorems, it is not advocated that the optimal path actually be on the von
Neumann path most of the time; it is only required that it spend most of the time
"sufficiently close" to the von Neumann path. Hence the above Chicago-New
York analogy is not quite accurate.
The above statement of the essence of the turnpike theorems may be illust-
rated by a simple diagram (see Figure 7.1). The turnpike theorems essentially
require that the "optimal" path arch toward the von Neumann path; it is in this
sense that the von Neumann path plays the role of the "turnpike." It is important
to note that in the statement above, optimality is defined with respect to the final
state only and that, unlike the analogy of the Chicago-New York trip, the terminal
point is not given whereas the time to reach the terminal state is specified. It is
possible to conjecture that the "turnpike property" of arching toward a certain
path holds for other types of models. Optimality may depend on the interim
states as well as on the final state, or the final state may be given and optimality
may be defined to minimize the time in reaching the final state.
The significance of the turnpike theorems for the von Neumann model and
the von Neumann theorem should now be clear. As we remarked in Chapter 6,
Section A, it saves the von Neumann theorem from its two basic criticisms: the
von Neumann path ignores the historically given stocks of commodities and
consideration is restricted to balanced growth paths. The situation is analogous
to the Ramsey-Koopmans-Cass theorem for the one-sector economy concerning
the golden age path (recall Chapter 5, Section D).
Figure 7.1. An Illustration of a Turnpike Theorem.

Aside from the number of commodities, there is, however, one important
difference between the turnpike theorems and the Ramsey-Koopmans-Cass
theorem. In the latter, consumption is explicit in the model and the optimality
depends on the interim states as well as on the terminal state. The extension of the
turnpike theorems in this direction was first achieved by Atsumi [1] for a
two-commodity economy and then by Gale [6] and McKenzie [21] for an
n-commodity economy. Such an extension will be discussed in the next section of
this chapter.¹
The purpose of this section is not to survey all the literature on the turnpike
theorems.² There is too much for such a limited space. In this section, we are
mainly concerned with giving an expository account of the turnpike theorem due
to Radner [25]. Radner's turnpike theorem has been criticized and extended by
many writers, but the elegance of the paper and the excellence of the method
of proof are generally agreed upon. In fact, his basic method of proof may also
be found in several papers which have extended his result (including the papers
which deal with consumption explicitly). See, for example, McKenzie [20].
In subsection b, we develop the basic model for the Radner turnpike
theorem. In subsection c, we digress from the turnpike theorem and develop a
"profit maximizing characterization" of the optimal feasible path, which is not
discussed by Radner but is very important since this characterization is implied
by a competitive market. Here the theorem is strictly analogous to the one in
activity analysis and the price implication of the Pareto optimum in the theory
of competitive markets. In subsection d, we proceed to the proof of the Radner
turnpike theorem. In the proof of this theorem, the lemma due to Radner is
essential.

b. THE BASIC MODEL AND OPTIMALITY

Consider an economy which produces n commodities. Let $(x_t, y_t)$ denote
the production process in period $t$, where $x_t$ and $y_t$, respectively, denote an input
vector and an output vector. Let $T$ be the set of such processes which are
technologically feasible in the economy. For the meaning of the production "period,"
the reader is reminded of our discussion in Chapter 6, Section A. We assume:

(A-1) The set $T$ is a closed cone in the nonnegative orthant of the 2n-dimensional
real space, $R^{2n}$.
(A-2) (No land of Cockaigne) $(0, y) \in T$ implies $y = 0$.

The model described above is the von Neumann type "closed" model of produc-
tion, in which there is no explicit treatment of consumption.

Definition (feasibility): Given $N$, the span of the programming periods, and given
the vector of initial commodity stocks, $\bar x_0$, a sequence $\{x_t\}$, $t = 0, 1, \ldots, N$, is
called a feasible path with respect to $\bar x_0$ if $(x_t, x_{t+1}) \in T$, $t = 0, 1, \ldots, N - 1$, and
$x_0 = \bar x_0$.
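The feasibility definition can be turned into a direct check. The sketch below assumes a hypothetical two-commodity technology, $T = \{(x, y) \geq 0 : \|y\| \leq \lambda\sqrt{2}\,\sqrt{x^1 x^2}\}$ with $\lambda = 1.5$, which is a closed cone satisfying (A-2) but is not taken from the text. The names `in_T` and `is_feasible_path` are likewise illustrative.

```python
import math

LAM = 1.5  # assumed maximal balanced growth factor of the toy technology


def in_T(x, y, tol=1e-9):
    """Toy technology (an assumption, not from the text):
    (x, y) >= 0 and ||y|| <= LAM * sqrt(2) * sqrt(x1 * x2)."""
    if min(x) < 0 or min(y) < 0:
        return False
    capacity = LAM * math.sqrt(2.0) * math.sqrt(x[0] * x[1])
    return math.hypot(y[0], y[1]) <= capacity + tol


def is_feasible_path(path, x0_bar, tol=1e-9):
    """Check x_0 = x0_bar and (x_t, x_{t+1}) in T for t = 0, ..., N - 1."""
    starts_right = all(abs(a - b) <= tol for a, b in zip(path[0], x0_bar))
    steps_ok = all(in_T(path[t], path[t + 1], tol) for t in range(len(path) - 1))
    return starts_right and steps_ok


balanced = [(1.0, 1.0), (1.5, 1.5), (2.25, 2.25)]  # grows by the factor LAM
too_fast = [(1.0, 1.0), (2.0, 2.0)]                # outgrows the technology
```

Here `balanced` is feasible from the initial stock (1, 1), while `too_fast` is not, because its second step asks for more output than the assumed technology allows.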

Definition (von Neumann path): A triplet $(\hat x, p, \lambda)$, where $\hat x$ and $p$ are nonzero
elements in the nonnegative orthant of $R^n$ and $\lambda \in R$ with $\lambda > 0$, is called a von
Neumann triplet or a von Neumann equilibrium, if
(i) $(\hat x, \lambda \hat x) \in T$
(ii) $p \cdot (y - \lambda x) \leq 0$ for all $(x, y) \in T$
We call the process $(\hat x, \lambda \hat x)$ a von Neumann process. The ray from the origin through
$\hat x$ is called the von Neumann ray (with respect to $\hat x$) or the von Neumann (growth)
path (with respect to $\hat x$). This ray is denoted by the set $\{x: x \in R^n, x = \alpha \hat x, \alpha > 0, \alpha \in R\}$.
In the above triplet, $p$ is called the (von Neumann) price, and $\lambda$ is called
the (von Neumann) interest factor [or sometimes the (von Neumann) growth
factor]. An evaluation of process $(x, y) \in T$ by $p \cdot (y - \lambda x)$, where $p$ is the von
Neumann price and $\lambda$ is the von Neumann interest factor, is called the von
Neumann value (or the von Neumann profit) of the process $(x, y)$.
REMARK: Set $y = \lambda \hat x$ and $x = \hat x$ in condition (ii) of the definition of the
von Neumann triplet. Then we have $p \cdot (y - \lambda x) = p \cdot (\lambda \hat x - \lambda \hat x) = 0$. In
other words, the von Neumann value of the von Neumann process $(\hat x, \lambda \hat x)$
is zero.
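A small numerical instance may help fix the definition. The sketch below assumes a hypothetical polyhedral technology generated by three activities (none of which comes from the text) and checks that $(\hat x, p, \lambda) = ((1,1), (1,1), 2)$ satisfies conditions (i) and (ii). Because the von Neumann value is linear in the process, it is enough to check condition (ii) on the generating activities.

```python
# Each activity is (input vector, output vector). T is assumed to be the cone
# of all nonnegative combinations of these activities (illustrative only).
ACTIVITIES = [
    ((1.0, 0.0), (0.0, 2.0)),  # one unit of good 1 yields two units of good 2
    ((0.0, 1.0), (2.0, 0.0)),  # one unit of good 2 yields two units of good 1
    ((1.0, 0.0), (1.0, 0.0)),  # storage of good 1
]


def dot(a, b):
    """Inner product of two vectors given as tuples."""
    return sum(ai * bi for ai, bi in zip(a, b))


def von_neumann_value(p, lam, x, y):
    """Von Neumann value p.(y - lam * x) of the process (x, y)."""
    return dot(p, y) - lam * dot(p, x)


def combine(levels):
    """Process obtained by running the activities at nonnegative levels."""
    x = tuple(sum(w * act[0][i] for w, act in zip(levels, ACTIVITIES)) for i in range(2))
    y = tuple(sum(w * act[1][i] for w, act in zip(levels, ACTIVITIES)) for i in range(2))
    return x, y


x_hat, p, lam = (1.0, 1.0), (1.0, 1.0), 2.0
vn_process = combine((1.0, 1.0, 0.0))  # equals (x_hat, lam * x_hat), condition (i)
values = [von_neumann_value(p, lam, x, y) for x, y in ACTIVITIES]
```

The von Neumann process has value zero, as the remark states, and no generating activity has a positive value, so condition (ii) holds on the whole cone.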
We now assume:
(A-3) There exists a von Neumann triplet.
REMARK: In Chapter 6, Section A, we proved the "von Neumann
theorem" which asserts the existence of a von Neumann triplet under (A-1),
(A-2), and the following:

(Convexity) $T$ is convex.
(Free disposability) $(x, y) \in T$, $x' \geq x$ and $0 \leq y' \leq y$ imply $(x', y') \in T$.
(Productiveness) There exists an $(x, y) \in T$ such that $y > 0$.

Radner imposed the following assumption which qualifies the von Neumann
triplet:
(A-4) Let $(\hat x, p, \lambda)$ be a von Neumann triplet. Then $p \cdot (y - \lambda x) < 0$ for all $(x, y)$'s
in $T$ that are not proportional to $(\hat x, \lambda \hat x)$, that is, those $(x, y)$'s in $T$ which are not on
the von Neumann ray with respect to $\hat x$.
REMARK: It is important to note that (A-4) guarantees the uniqueness of
a von Neumann ray. Radner remarked that (A-4) can be obtained from the
following assumption:
(A-4') The set $T$ has a nonempty interior and is a strictly convex cone, in
the sense that $z, z' \in T$, with $z'$ not proportional to $z$, implies $\theta z + (1 - \theta) z'$
is in the interior of $T$ for any $\theta$ where $0 < \theta < 1$.
That (A-4') implies (A-4) can be proved as follows. Suppose (A-4) does
not hold, so that there exists an $(x', y') \in T$, which is not proportional to
$(\hat x, \lambda \hat x)$, but $p \cdot (y' - \lambda x') = 0$. Let $(x, y) \equiv \frac{1}{2}(\hat x, \lambda \hat x) + \frac{1}{2}(x', y')$. Then
$p \cdot (y - \lambda x) = 0$. But $(x, y)$ is in the interior of $T$ from (A-4'). Hence for
sufficiently small $a$ and $b$, $(x + a, y + b) \in T$ and $p \cdot (b - \lambda a) > 0$. The last
inequality implies that $p \cdot [(y + b) - \lambda(x + a)] > 0$, which contradicts that
$(\hat x, p, \lambda)$ is a von Neumann triplet [see condition (ii) of the definition of the
von Neumann triplet].
REMARK: Assumption (A-4') is very restrictive for it precludes the case
in which T is a convex polyhedral cone. Morishima [22] assumed that T
is a convex polyhedral cone but imposed a quite restrictive assumption
which in fact implies the uniqueness of the von Neumann ray.
The economy has preferences among the feasible sequences $\{x_t\}_{t=0}^{N}$. It is
assumed that these preferences depend only upon the terminal state $x_N$ and that
the preference is represented by a real-valued function $u(x)$ defined on the
nonnegative orthant of $R^n$. Following Radner [25], we impose the following
assumptions on $u$:
(A-5) The function $u(x)$ is nonnegative and continuous and there exists an
$x > 0$, $x \in R^n$, such that $u(x) > 0$.
(A-6) The function $u(x)$ is homogeneous of degree one.
An obvious example is $u(x) = p^* \cdot x$, where $p^*$ is a nonzero n-vector whose
elements are all nonnegative. The vector $p^*$ can be interpreted as the price or the
weight vector associated with the terminal commodity vector. Another example
of $u(x)$ is the "Cobb-Douglas" utility function
$u(x) = \prod_{i=1}^{n} x_i^{\alpha_i}$, $\alpha_i > 0$, $\sum_{i=1}^{n} \alpha_i = 1$
We are now ready to define the optimal feasible path.

Definition (optimality): A feasible path $\{x_t\}$, $t = 0, 1, 2, \ldots, N$, starting from $\bar x_0$,
is said to be optimal if it maximizes $u(x_N)$ among the set of all the feasible paths
starting from $\bar x_0$.
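The two example utility functions given above can be checked against assumption (A-6) numerically. This is a minimal sketch: the function names, the test point, and the weights are illustrative assumptions, not part of the text.

```python
import math


def linear_utility(x, p_star):
    """u(x) = p* . x with a nonnegative, nonzero weight vector p*."""
    return sum(p * xi for p, xi in zip(p_star, x))


def cobb_douglas_utility(x, alpha):
    """u(x) = prod_i x_i ** alpha_i, with the exponents summing to one."""
    if abs(sum(alpha) - 1.0) > 1e-12:
        raise ValueError("exponents must sum to one")
    return math.prod(xi ** a for xi, a in zip(x, alpha))


def is_homogeneous_of_degree_one(u, x, scale=3.0):
    """Spot-check u(scale * x) == scale * u(x) at a single point x."""
    scaled = tuple(scale * xi for xi in x)
    return math.isclose(u(scaled), scale * u(x), rel_tol=1e-12)


x = (2.0, 5.0)
u_linear = lambda v: linear_utility(v, (1.0, 0.5))
u_cobb_douglas = lambda v: cobb_douglas_utility(v, (0.3, 0.7))
```

A single-point check of this kind does not prove homogeneity, but it catches a mis-specified exponent vector quickly.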

C. FREE DISPOSABILITY AND THE CONDITIONS FOR OPTIMALITY

If we allow free disposability, the optimal feasible path can then be
considered as a solution of the following nonlinear programming problem:
PROBLEM I:
Maximize: $u(y_N)$ over $\{(x_t, y_t)\}_{t=0}^{N}$
Subject to:
$x_t \leq y_{t-1}$, $t = 1, 2, \ldots, N$
$x_0 \leq \bar x_0$
and $(x_t, y_t) \in T$, $t = 0, 1, \ldots, N$
Here the inequalities presuppose free disposability. With free disposability, a
feasible path (starting from $\bar x_0$) is now defined as a sequence $\{(x_t, y_t)\}$, $t = 0, 1, \ldots, N$,
such that $x_t \leq y_{t-1}$, $t = 1, 2, \ldots, N$, $x_0 \leq \bar x_0$, and $(x_t, y_t) \in T$ for all $t$.
Now consider the following vector maximum problem:
PROBLEM II:
Maximize: $y_N$ over $\{(x_t, y_t)\}_{t=0}^{N}$
Subject to:
$x_t \leq y_{t-1}$, $t = 1, 2, \ldots, N$
$x_0 \leq \bar x_0$
and $(x_t, y_t) \in T$, $t = 0, 1, \ldots, N$
Here the inequalities again presuppose free disposability. Suppose, for example,
$u$ is concave and "Slater's condition" holds [that is, there exists an $(\tilde x, \tilde y) \in T$
such that $\tilde x < \bar x_0$ and $\tilde x < \tilde y$]; then the solution to Problem II may be regarded
as a solution to Problem I with the following particular utility function:
$u(y_N) = p^* \cdot y_N$, $p^* \geq 0$, $p^* \neq 0$

(Recall Theorem 1.E.4.) Clearly this utility function, as remarked before, satisfies
assumptions (A-5) and (A-6). Since it may be rather artificial to conceive of a
utility function for the economy in which the consumers are not explicit, the
formulation of optimality in terms of Problem II may be better than that in terms
of Problem I. In this subsection, we use the formulation in terms of Problem II.
REMARK: In the above, we assumed free disposability and used the in-
equality constraints. This was done to utilize the ready-made theorems
developed in Chapter 1, and hence to relate the present discussion to that
chapter. In general, the free disposability assumption is not essential, and
the results in the present subsection follow in the main without such an
assumption by using the equality constraints.
Here we may digress from Radner's discussion of the turnpike theorem and
characterize the optimal feasible path (a la Problem II). In other words, by apply-
ing a theorem on vector maximum (especially Theorem 1.E.4), we can assert the
following:

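Theorem 7.A.1: Suppose $\{(\hat x_t, \hat y_t)\}$, $t = 0, 1, \ldots, N$, is a solution to Problem II and
suppose that there exists an $(\tilde x, \tilde y) \in T$ such that $\tilde x < \bar x_0$ and $\tilde x < \tilde y$. Then there
exist $p^* \geq 0$, $p^* \neq 0$, and $p_t \geq 0$, $t = 0, 1, \ldots, N$, such that
$p_{t+1} \cdot \hat y_t - p_t \cdot \hat x_t \geq p_{t+1} \cdot y_t - p_t \cdot x_t$, $t = 0, 1, \ldots, N - 1$
for all $(x_t, y_t) \in T$, $t = 0, 1, \ldots, N - 1$,
$p^* \cdot \hat y_N - p_N \cdot \hat x_N \geq p^* \cdot y_N - p_N \cdot x_N$
for all $(x_N, y_N) \in T$, and
$p_t \cdot (\hat y_{t-1} - \hat x_t) = 0$, $t = 1, \ldots, N$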

REMARK: By interpreting the $p_t$'s as prices, the above inequalities signify
that profit in each period is maximized. A similar theorem is proved by
McKenzie [19] using the separation theorem. The result corresponds to
that of Malinvaud [15] and Koopmans [13].
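PROOF: Note that the saddle-point condition implies the above equalities
and
$p^* \cdot \hat y_N + \sum_{t=1}^{N} p_t \cdot (\hat y_{t-1} - \hat x_t) + p_0 \cdot (\bar x_0 - \hat x_0) \geq p^* \cdot y_N + \sum_{t=1}^{N} p_t \cdot (y_{t-1} - x_t) + p_0 \cdot (\bar x_0 - x_0)$
for all $(x_t, y_t)$'s in $T$. Then rewrite this inequality as
$(p^* \cdot \hat y_N - p_N \cdot \hat x_N) + \sum_{t=0}^{N-1} (p_{t+1} \cdot \hat y_t - p_t \cdot \hat x_t) + p_0 \cdot \bar x_0 \geq (p^* \cdot y_N - p_N \cdot x_N) + \sum_{t=0}^{N-1} (p_{t+1} \cdot y_t - p_t \cdot x_t) + p_0 \cdot \bar x_0$
Then set $(x_t, y_t) = (\hat x_t, \hat y_t)$ for all $t$ except $\tau$. Noting that the choice of $\tau$ is
arbitrary, we obtain the conclusion of the theorem. (Q.E.D.)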
REMARK: If the relations in the conclusion of the theorem hold with $p^* > 0$
and if $\{(\hat x_t, \hat y_t)\}$ is feasible, then $\{(\hat x_t, \hat y_t)\}$ is optimal. In other words, the
converse of the above theorem also holds if $p^* > 0$. This is easy to see by
recalling Theorem 1.E.5.
REMARK: If $\bar x_0$ is on the von Neumann ray with respect to $(\hat x, p, \lambda)$,
where $\hat x > 0$, $p > 0$, and $\lambda > 0$, then we can show that the von Neumann
path $\{(\hat x_t, \lambda \hat x_t)\}$ starting from $\bar x_0 \geq 0$ satisfies the conditions (that is,
conclusions) of the previous theorem, and hence is optimal because of the previous
remark.³ To see this, first note that $\bar x_0 = \alpha \hat x$ for some $\alpha > 0$, since $\bar x_0$ is on the
von Neumann ray. Without loss of generality, we may choose $\hat x$ so that $\bar x_0 = \hat x$.
Now set
$\hat x_t \equiv \lambda^t \hat x$, $\hat y_t \equiv \lambda^{t+1} \hat x$,
$p_t \equiv \lambda^{-t} p$, $t = 0, 1, 2, \ldots, N$
($\lambda$ is interpreted as the growth factor or the interest factor.) Then observe
that
$p_{t+1} \cdot \hat y_t - p_t \cdot \hat x_t = (\lambda^{-(t+1)} p) \cdot (\lambda^{t+1} \hat x) - (\lambda^{-t} p) \cdot (\lambda^t \hat x) = 0$, for all $t$
On the other hand, in view of condition (ii) of the definition of the von
Neumann equilibrium,
$p \cdot (y_t - \lambda x_t) \leq 0$, for all $(x_t, y_t) \in T$
That is,
$p_{t+1} \cdot y_t - p_t \cdot x_t \leq 0$, for all $(x_t, y_t) \in T$
Hence we obtain
$p_{t+1} \cdot \hat y_t - p_t \cdot \hat x_t \geq p_{t+1} \cdot y_t - p_t \cdot x_t$, for all $(x_t, y_t) \in T$, $t = 0, 1, \ldots, N$
where we set $p^* \equiv p_{N+1}$. It is elementary to see that the other conditions
set in Theorem 7.A.1 also hold.
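The construction in this remark can be traced numerically. The sketch below assumes the same kind of hypothetical technology as before, $\|y\| \leq \lambda\sqrt{2}\,\sqrt{x^1 x^2}$ with $\lambda = 1.5$, for which $((1,1), (1,1), \lambda)$ is a von Neumann triplet; none of these numbers comes from the text. It builds $\hat x_t = \lambda^t \hat x$, $\hat y_t = \lambda^{t+1} \hat x$, $p_t = \lambda^{-t} p$ and compares the discounted profit on the von Neumann path with the profit of sampled frontier processes.

```python
import math

LAM = 1.5           # assumed von Neumann growth (interest) factor
X_HAT = (1.0, 1.0)  # assumed von Neumann input proportions
P = (1.0, 1.0)      # assumed von Neumann price vector
N = 5               # planning horizon used for the check


def dot(a, b):
    """Inner product of two vectors given as tuples."""
    return sum(ai * bi for ai, bi in zip(a, b))


def price(t):
    """p_t = LAM**(-t) * p."""
    return tuple(LAM ** (-t) * pi for pi in P)


def vn_input(t):
    """x_hat_t = LAM**t * x_hat."""
    return tuple(LAM ** t * xi for xi in X_HAT)


def vn_output(t):
    """y_hat_t = LAM**(t + 1) * x_hat."""
    return tuple(LAM ** (t + 1) * xi for xi in X_HAT)


def profit(t, x, y):
    """Period-t profit p_{t+1}.y - p_t.x of the process (x, y)."""
    return dot(price(t + 1), y) - dot(price(t), x)


def frontier_process(x, angle):
    """A process on the frontier of the toy technology
    ||y|| <= LAM * sqrt(2) * sqrt(x1 * x2), with output direction `angle`."""
    r = LAM * math.sqrt(2.0) * math.sqrt(x[0] * x[1])
    return x, (r * math.cos(angle), r * math.sin(angle))


on_path = [profit(t, vn_input(t), vn_output(t)) for t in range(N + 1)]
sampled = [
    profit(t, *frontier_process(x, k * math.pi / 20))
    for t in range(N + 1)
    for x in [(1.0, 1.0), (2.0, 1.0), (1.0, 3.0)]
    for k in range(11)
]
```

Up to rounding error, the profit along the von Neumann path is zero in every period and no sampled process earns a positive profit, which is the profit-maximization condition of Theorem 7.A.1 for this assumed example.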
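REMARK: The above theorem is, in essence, concerned with the saddle-point
characterization of the optimal feasible program. Let us now consider
the quasi-saddle-point characterization. To do this, first write the
production set $T$ as
$T = \{(x, y): F(x, y) \geq 0, (x, y) \in \Omega^{2n}\}$
where the production function $F$ is assumed to be continuously differentiable
and concave. Consider Problem II and assume again that the Slater condition
holds for this problem [that is, there exists an $(\tilde x, \tilde y) \in \Omega^{2n}$ such that
$\tilde x < \bar x_0$, $\tilde x < \tilde y$, and $F(\tilde x, \tilde y) > 0$]. The Lagrangian of this problem can be written
as
$p^* \cdot y_N + \sum_{t=1}^{N} p_t \cdot (y_{t-1} - x_t) + p_0 \cdot (\bar x_0 - x_0) + \sum_{t=0}^{N} q_t F(x_t, y_t)$
Let $(\hat x_t, \hat y_t)$, $t = 0, 1, 2, \ldots, N$, be the solution for Problem II. Write the $i$th
element of $\hat x_t$, $\hat y_t$, $p_t$, and $p^*$ as $\hat x_t^i$, $\hat y_t^i$, $p_t^i$, and $p^{*i}$, respectively. Then,
assuming an interior solution for all $t$ (that is, $\hat x_t > 0$ and $\hat y_t > 0$ for all $t$) and
$p^* > 0$, the following quasi-saddle-point conditions are necessary and
sufficient for an optimum:
$-p_t^i + q_t \, \partial F(\hat x_t, \hat y_t)/\partial x_t^i = 0$, $i = 1, 2, \ldots, n$; $t = 0, 1, \ldots, N$
$p_{t+1}^i + q_t \, \partial F(\hat x_t, \hat y_t)/\partial y_t^i = 0$, $i = 1, 2, \ldots, n$; $t = 0, 1, \ldots, N - 1$
$p^{*i} + q_N \, \partial F(\hat x_N, \hat y_N)/\partial y_N^i = 0$, $i = 1, 2, \ldots, n$
$\hat y_{t-1} - \hat x_t \geq 0$, $p_t \cdot (\hat y_{t-1} - \hat x_t) = 0$, $t = 1, 2, \ldots, N$
$\bar x_0 - \hat x_0 \geq 0$, $p_0 \cdot (\bar x_0 - \hat x_0) = 0$
$F(\hat x_t, \hat y_t) \geq 0$, $q_t F(\hat x_t, \hat y_t) = 0$, $t = 0, 1, 2, \ldots, N$
$p^* \geq 0$, $p_t \geq 0$, $q_t \geq 0$, $t = 0, 1, 2, \ldots, N$
Here $\partial F(\hat x_t, \hat y_t)/\partial y_t^i$ and $\partial F(\hat x_t, \hat y_t)/\partial x_t^i$ denote that these partial derivatives
are evaluated at $(\hat x_t, \hat y_t)$. Assume $p_t > 0$ and $q_t > 0$ for all $t$. Then we have
$\hat y_{t-1} = \hat x_t$, $t = 1, 2, \ldots, N$, $\hat x_0 = \bar x_0$, and $F(\hat x_t, \hat y_t) = 0$, $t = 0, 1, 2, \ldots, N$.
Moreover, the first two sets of conditions yield
$\frac{\partial F(\hat x_{t-1}, \hat y_{t-1})/\partial y_{t-1}^i}{\partial F(\hat x_{t-1}, \hat y_{t-1})/\partial y_{t-1}^j} = \frac{\partial F(\hat x_t, \hat y_t)/\partial x_t^i}{\partial F(\hat x_t, \hat y_t)/\partial x_t^j}$, $t = 1, \ldots, N$, and $i, j = 1, \ldots, n$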

This is the famous intertemporal efficiency condition obtained by DOSSO
[2]. In other words, this corresponds to the following remark by them ([2],
p. 312):
A necessary condition for intertemporal efficiency is the following: the MRS
between any two goods regarded as outputs of the previous period must
equal their MRS as inputs for the next period.
This condition is illustrated in Figure 7.2 for the two-commodity case. The
location of the production isoquant for producing $y_t$ obviously depends on
the location of $y_t$. In Figure 7.2, the above (tangency) condition determines
the location of $y_t$.
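The step from the first two sets of quasi-saddle-point conditions to the intertemporal efficiency condition can be spelled out as follows. This is a sketch under the assumptions already made in the remark (an interior solution with $p_t > 0$ and $q_t > 0$); all derivatives are evaluated along the optimal path.

```latex
\begin{align*}
p_t^i &= q_t \,\frac{\partial F(\hat x_t, \hat y_t)}{\partial x_t^i}
  && \text{(first condition, period } t\text{)} \\
p_t^i &= -\,q_{t-1} \,\frac{\partial F(\hat x_{t-1}, \hat y_{t-1})}{\partial y_{t-1}^i}
  && \text{(second condition, period } t-1\text{)} \\
\frac{p_t^i}{p_t^j}
  &= \frac{\partial F(\hat x_{t-1}, \hat y_{t-1})/\partial y_{t-1}^i}
          {\partial F(\hat x_{t-1}, \hat y_{t-1})/\partial y_{t-1}^j}
   = \frac{\partial F(\hat x_t, \hat y_t)/\partial x_t^i}
          {\partial F(\hat x_t, \hat y_t)/\partial x_t^j}
  && t = 1, \ldots, N
\end{align*}
```

Dividing the expression for good $i$ by that for good $j$ cancels the multipliers $q_t$ and $q_{t-1}$, so the common price ratio $p_t^i/p_t^j$ equals both the MRS between the two goods as outputs of period $t - 1$ and their MRS as inputs of period $t$.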
In the vector maximum problem (that is, Problem II), $\hat y_N$ (and hence the path
leading to $\hat y_N$) are not, in general, unique. The value of $p^*$ depends on $\hat y_N$. However, in
Problem I, $\hat y_N$ is unique if the utility function $u$ is strictly quasi-concave. Then $p^*$ is
unique, and the path $(\hat x_t, \hat y_t)$ that leads to $\hat y_N$ can be unique (as can be seen from
Figure 7.2).

d. THE RADNER TURNPIKE THEOREM

We now proceed to Radner's turnpike theorem, returning to the model
developed in subsection b. His turnpike theorem states that any optimal feasible
path, regardless of the initial commodity vector $x_0$, spends most of the time sufficiently
"close" to the von Neumann ray. To define "closeness," we have to define "distance."

[Figure 7.2. An Illustration of the Intertemporal Optimality Condition. Axes: Commodity 1 (horizontal) and Commodity 2 (vertical). Curves: the isoquant for $y_t$ and the production possibility locus of $y_{t-1}$.]

Definition: The (Radner) distance between two vectors $z'$ and $z''$ is defined as

$$d(z', z'') = \left\| \frac{z'}{\|z'\|} - \frac{z''}{\|z''\|} \right\|$$

where $\| \cdot \|$ refers to the Euclidean norm, that is, $\|z\| = (z \cdot z)^{1/2}$.

REMARK: In this definition, it is not essential that $\|z\|$ be the Euclidean
norm. It can be replaced by other norms such as $\|z\| = \sum_{i=1}^{n} |z_i|$. The
essential point is that the distance between two vectors be measured in a
"normalized way." The above concept of distance may be illustrated in
Figure 7.3.

[Figure 7.3. An Illustration of Distance. The figure shows the unit circle.]
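
As a minimal computational sketch of this definition (the function name `radner_distance` and the sample vectors are illustrative assumptions, not part of the text), the distance can be computed by scaling each vector to unit Euclidean length and then taking the ordinary Euclidean distance between the scaled vectors:

```python
import math


def radner_distance(z1, z2):
    """Euclidean distance between z1 and z2 after each vector has been
    scaled to unit Euclidean length (hypothetical helper name)."""
    n1 = math.sqrt(sum(v * v for v in z1))
    n2 = math.sqrt(sum(v * v for v in z2))
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("the distance is undefined for the zero vector")
    return math.sqrt(sum((a / n1 - b / n2) ** 2 for a, b in zip(z1, z2)))


# Two vectors on the same ray: the distance is zero whatever their lengths.
same_ray = radner_distance([1.0, 2.0], [3.0, 6.0])

# Two orthogonal directions: the unit vectors are sqrt(2) apart.
orthogonal = radner_distance([2.0, 0.0], [0.0, 5.0])
```

Because both arguments are normalized first, multiplying either vector by a positive scalar leaves the distance unchanged, which is what makes "distance from a ray" meaningful.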

In proving the main theorem, Radner first proved the following lemma,
which is crucial to the proof of his main theorem. It is often referred to as Radner's
lemma. Here we do not assume free disposability.

Radner's Lemma: Suppose that (A-1), (A-2), (A-3), and (A-4) hold, and let
$(\hat{x}, p, \lambda)$ be a von Neumann triplet. Then for any $\epsilon > 0$, there exists a $\delta$, $0 < \delta < \lambda$,
such that $(x, y) \in T$ and $d(x, \hat{x}) \geq \epsilon$ imply $p \cdot y \leq (\lambda - \delta)(p \cdot x)$.
REMARK: If $(x, y) \in T$ is on a von Neumann ray, then $p \cdot y = \lambda(p \cdot x)$; that
is, the value of the output is $\lambda$ times the value of the input. Radner's lemma
asserts that whenever the distance from the process $(x, y) \in T$ to a given
von Neumann ray [or, equivalently, to $(\hat{x}, \lambda\hat{x})$] exceeds some number $\epsilon$,
then the value of the output falls short of $\lambda$ times the value of the input for
such a process $(x, y)$, by some proportion $\delta$, as long as $p \cdot x > 0$. In other
words, there is a certain "value loss" associated with such a process. It is easy
to see that the lemma is crucial in establishing the turnpike theorem. Suppose,
for example, that a feasible program $\{x_t\}$, $t = 0, 1, \ldots, N$, deviates from a
given von Neumann path in many periods. Then the sum of the values lost
may be excessive. If we could link the "value loss" to the optimality criterion,
we would be able to obtain a turnpike theorem. Note that the uniqueness of
the von Neumann ray appears to be essential, for if the von Neumann ray
were not unique, a process not on a certain von Neumann ray might be on
another von Neumann ray, thus causing no "value loss." The possibility of
multiple von Neumann rays, which is assumed away both in Radner [25]
and Morishima [22], is fully explored by McKenzie ([17] and [20]). It is
argued that the collection of von Neumann rays constitutes a facet of $T$ and
that the turnpike theorem is then concerned with the conditions under which
the optimal feasible path arches toward this facet (which can be n-dimensional).
Radner's theorem is concerned with a special case in which this facet
is one-dimensional. However, the proof for such a general case is analogous
to that for the present case, and, in fact, McKenzie's method is essentially
similar to Radner's method. Here, for the sake of simplicity, we assume the
uniqueness of the von Neumann ray by way of (A-4).
We now turn to the proof of Radner's lemma.
PROOF:
(i) Since $p \cdot (y - \lambda x) \leq 0$ for all $(x, y) \in T$, $p \cdot x = 0$ implies $p \cdot y = 0$
whenever $(x, y) \in T$. Thus the conclusion of the lemma follows trivially when
$p \cdot x = 0$. We may, henceforth, assume $p \cdot x \neq 0$, or $p \cdot x > 0$. Note that
$p \cdot x \neq 0$ also precludes the case in which $x = 0$. Hence we take $x \neq 0$.
[Note that the assertion of the lemma (with $p \cdot x > 0$) is that $(p \cdot y)/(p \cdot x) - \lambda \leq -\delta < 0$,
that is, $[(p \cdot y)/(p \cdot x) - \lambda]$ is bounded away from 0.]
(ii) Define the set $T_1$ by

$$T_1 = \{y : (x, y) \in T \text{ and } \|x\| = 1\}$$

We claim $T_1$ is bounded; otherwise there exists a sequence $(x^s, y^s)$ such
that $\|x^s\| = 1$ and $\|y^s\| \to \infty$ as $s \to \infty$. Since $x^s \neq 0$, we may choose
$y^s$ so that $y^s \neq 0$. Consider a sequence $(x^s/\|y^s\|, y^s/\|y^s\|)$ (since
$y^s \neq 0$, $\|y^s\| > 0$, so that the division by $\|y^s\|$ is possible). This is clearly
a bounded sequence; hence it contains a convergent subsequence. Let
$(\bar{x}, \bar{y})$ be its limit. Since $(x^s/\|y^s\|, y^s/\|y^s\|) \in T$ (because $T$ is a cone) and
since $T$ is a closed set by (A-1), $(\bar{x}, \bar{y}) \in T$. Since $\|y^s\| \to \infty$, $\bar{x} = 0$. But
$\|y^s/\|y^s\|\| = 1$ for all $s$ so that $\|\bar{y}\| = 1$, or $\bar{y} \neq 0$. In other words,
$(0, \bar{y}) \in T$ with $\bar{y} \neq 0$. This contradicts (A-2). Thus $T_1$ is bounded.
(iii) Now suppose that there is an $\epsilon > 0$ and a sequence $\{(x^q, y^q)\}$, $q = 1, 2, \ldots$,
with $(x^q, y^q) \in T$ for all $q$ for which $x^q \neq 0$ and $d(x^q, \hat{x}) \geq \epsilon > 0$
but for which $[(p \cdot y^q)/(p \cdot x^q) - \lambda] \to 0$ as $q \to \infty$. We will show that this
will lead to a contradiction by using the fact just established that $T_1$ is
bounded. Consider a sequence $\{(\tilde{x}^q, \tilde{y}^q)\}$ defined by $\tilde{x}^q = x^q/\|x^q\|$ and
$\tilde{y}^q = y^q/\|x^q\|$ (the division by $\|x^q\|$ is possible as $x^q \neq 0$). Since $T$ is a
cone by (A-1), $(\tilde{x}^q, \tilde{y}^q) \in T$ for all $q = 1, 2, \ldots$. Moreover, $\|\tilde{x}^q\| = 1$ and
$\tilde{y}^q \in T_1$ for all $q$. Hence $\{(\tilde{x}^q, \tilde{y}^q)\}$ is a bounded sequence. Therefore it contains
a convergent subsequence. Let $(\tilde{x}, \tilde{y})$ be the limit of this subsequence.
Because $T$ is closed by (A-1), $(\tilde{x}, \tilde{y}) \in T$. Since $\|\tilde{x}^q\| = 1$ for all $q$,
$\|\tilde{x}\| = 1$, or $\tilde{x} \neq 0$. Since $(p \cdot y^q)/(p \cdot x^q) = (p \cdot \tilde{y}^q)/(p \cdot \tilde{x}^q)$ and
$(p \cdot y^q)/(p \cdot x^q) \to \lambda$ as $q \to \infty$ by assumption, we have $(p \cdot \tilde{y})/(p \cdot \tilde{x}) = \lambda$ by the
continuity of the inner product. Therefore $p \cdot \tilde{y} = \lambda(p \cdot \tilde{x})$. But $d(\tilde{x}^q, \hat{x}) = d(x^q, \hat{x}) \geq \epsilon > 0$
by assumption, for all $q = 1, 2, \ldots$, so that $d(\tilde{x}, \hat{x}) \geq \epsilon > 0$,
which, in turn, implies $(\tilde{x}, \tilde{y})$ cannot be proportional to $(\hat{x}, \lambda\hat{x})$;
hence $p \cdot \tilde{y} < \lambda(p \cdot \tilde{x})$, from (A-4). This contradicts the above equality
$p \cdot \tilde{y} = \lambda(p \cdot \tilde{x})$. Hence $(p \cdot y^q)/(p \cdot x^q)$ cannot approach $\lambda$. Note that
$(p \cdot y^q)/(p \cdot x^q) - \lambda < 0$ for any $(x^q, y^q) \in T$ that is not proportional to
$(\hat{x}, \lambda\hat{x})$ (with $p \cdot x^q > 0$). This does not preclude the possibility that
$[(p \cdot y^q)/(p \cdot x^q) - \lambda]$ approaches 0. But we have just denied this possibility.
This proves that $[(p \cdot y)/(p \cdot x) - \lambda]$ is bounded away from zero
whenever $(x, y) \in T$ and $d(x, \hat{x}) \geq \epsilon > 0$. The fact that $\delta < \lambda$ is obvious,
for otherwise $p \cdot y < 0$, which is a contradiction. (Q.E.D.)

The next step of Radner's proof of his turnpike theorem is to construct a
reference path which coincides with the von Neumann ray except for the initial
state, and then to show that this reference path is better than any feasible path
that departs too far and for too long from the von Neumann ray (hence such a
path cannot be optimal). Note that this reference path itself may not be optimal.
Before we proceed to the statement and the proof of the Radner turnpike theorem,
we introduce the following additional assumptions:
(A-7) An initial commodity vector $x_0$ is given such that there exists a real number
$k > 0$ such that $(x_0, k\hat{x}) \in T$.
(A-8) There exists a real number $a > 0$ such that $u(x) \leq a(p \cdot x)$ for all commodity
vectors $x$.
(A-9) $u(\hat{x}) > 0$.

REMARK: Assumption (A-8) is satisfied if, for example, all the coordinates
of $p$ are positive. Assumption (A-7) is satisfied if, for example, there is free
disposability and $x_0$ provides positive amounts of all those commodities. This
assumption enables the economy to reach the von Neumann ray one period
after the initial time, starting from an arbitrary initial point $x_0$. Assumption
(A-7) can be slightly weakened as follows: An initial vector $x_0$ is given such
that there exists a feasible sequence $\{x_t\}$, $t = 0, 1, \ldots, N_0$ ($N_0 \geq 1$), starting
from the given value of $x_0$ at $t = 0$, such that $x_{N_0} = k\hat{x}$ for some $k > 0$. In other
words, the economy can reach the von Neumann ray within a finite number
of periods. Assumption (A-9) can be weakened as follows (as pointed out by
Radner [25]): For some integer $N_1 > 0$ and some commodity vector $y$ for
which $u(y) > 0$, there is a feasible sequence from $\hat{x}$ to $y$ in $N_1$ periods.

Theorem 7.A.2 (Radner's turnpike theorem): Let $(\hat{x}, p, \lambda)$ be a von Neumann triplet.
Suppose that assumptions (A-1) to (A-9) hold. Let $\{x_t\}$, $t = 0, 1, 2, \ldots, N$, be an optimal
feasible path with respect to $x_0$. Then, for any $\epsilon > 0$, there is a number $\bar{N}$ such that
the number of periods in which $d(x_t, \hat{x}) \geq \epsilon$ cannot exceed $\bar{N}$.

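PROOF: First we define a reference path $\{\tilde{x}_t\}$, $t = 0, 1, 2, \ldots, N$, as a feasible
path such that

$$\tilde{x}_0 = x_0, \quad \tilde{x}_1 = k\hat{x}$$

and

$$\tilde{x}_t = k\lambda^{t-1}\hat{x}, \quad t = 1, 2, \ldots, N$$

The existence of a $k > 0$ is guaranteed by (A-7). Let $\{x_t\}$, $t = 0, 1, \ldots, N$,
be an arbitrary feasible path which starts from a given initial point $x_0$.
Consider any $\epsilon > 0$. Then for any $t$ for which $d(x_t, \hat{x}) \geq \epsilon$, there exists a $\delta > 0$
such that

$$p \cdot x_{t+1} \leq (\lambda - \delta)(p \cdot x_t)$$

by Radner's lemma. Also

$$p \cdot x_{t+1} \leq \lambda(p \cdot x_t)$$

for any $(x_t, x_{t+1}) \in T$. Suppose that $d(x_t, \hat{x}) \geq \epsilon$ for $N'$ periods. Then we have

$$p \cdot x_N \leq (\lambda - \delta)^{N'} \lambda^{N - N'} (p \cdot x_0)$$

Then, by (A-8), there exists an $a > 0$ such that

$$u(x_N) \leq a(p \cdot x_N) \leq a(\lambda - \delta)^{N'} \lambda^{N - N'} (p \cdot x_0)$$

On the other hand, by the homogeneity of $u$, (A-6),

$$u(\tilde{x}_N) = k\lambda^{N-1} u(\hat{x})$$

Hence, in view of (A-9), we have

$$\frac{u(x_N)}{u(\tilde{x}_N)} \leq b\left(\frac{\lambda - \delta}{\lambda}\right)^{N'}, \quad \text{where } b \equiv \frac{a\lambda(p \cdot x_0)}{k u(\hat{x})}$$

Hence for $\{x_t\}$ to be an optimal feasible path with respect to $x_0$, it is necessary
that

$$b\left(\frac{\lambda - \delta}{\lambda}\right)^{N'} \geq 1$$

or

$$\log b + N' \log\left(\frac{\lambda - \delta}{\lambda}\right) \geq 0$$

Here it is essential to note that $\delta < \lambda$; for, otherwise, $\log[(\lambda - \delta)/\lambda]$ makes
no sense. The above inequality can be rewritten as

$$N' \leq \frac{\log b}{\log\left(\dfrac{\lambda}{\lambda - \delta}\right)}$$

Define $\bar{N}$ by

$$\bar{N} = \max\left\{1, \frac{\log b}{\log\left(\dfrac{\lambda}{\lambda - \delta}\right)}\right\}$$

(Q.E.D.)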
REMARK: The number $\bar{N}$ gives the maximum number of periods that any
optimal feasible path can remain at a distance exceeding $\epsilon$ from the von
Neumann ray. It is crucial to observe that $\bar{N}$ is independent of the planning
period $N$. Hence if $N$ is sufficiently large, $N$ becomes sufficiently larger than
$\bar{N}$ and any optimal feasible path starting from an arbitrary initial point
spends "most" of its time within the $\epsilon$-distance from the von Neumann ray.
Note also that Radner's theorem (just as several other versions of the turnpike
theorems) does not advocate that any optimal feasible path must be
on the von Neumann ray most of the time. It requires only that it must be
sufficiently close to the von Neumann ray most of the time.
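
The bound on the number of "off-turnpike" periods is easy to evaluate numerically. The sketch below assumes the constant $b = a\lambda(p \cdot x_0)/(k\,u(\hat{x}))$, which is what one obtains by dividing the bound on $u(x_N)$ by the utility $k\lambda^{N-1}u(\hat{x})$ of the reference path; the function name and the sample parameter values are hypothetical and serve only as an illustration.

```python
import math


def max_periods_off_turnpike(a, k, lam, delta, p_dot_x0, u_xhat):
    """Upper bound on the number of periods with d(x_t, x_hat) >= eps.

    Assumed inputs (all positive scalars): a from (A-8), k from (A-7),
    lam the von Neumann growth factor, delta the value-loss proportion
    from Radner's lemma, p_dot_x0 = p . x_0, and u_xhat = u(x_hat).
    """
    if not 0.0 < delta < lam:
        raise ValueError("the lemma requires 0 < delta < lam")
    b = a * lam * p_dot_x0 / (k * u_xhat)
    bound = math.log(b) / math.log(lam / (lam - delta))
    return max(1.0, bound)


# Hypothetical numbers: b = 2 * 1.1 * 1 / (0.5 * 1) = 4.4.
n_bar = max_periods_off_turnpike(a=2.0, k=0.5, lam=1.1, delta=0.1,
                                 p_dot_x0=1.0, u_xhat=1.0)

# When b < 1 the logarithm is negative and the bound defaults to 1.
n_bar_small_b = max_periods_off_turnpike(a=0.1, k=0.5, lam=1.1, delta=0.1,
                                         p_dot_x0=1.0, u_xhat=1.0)
```

The planning horizon $N$ does not appear among the arguments, which mirrors the observation that the bound is independent of $N$.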
One of the difficulties in Radner's turnpike theorem is that it does not
preclude the possibility that an optimal feasible path may run out of the neighboring
$\epsilon$-cone of the von Neumann ray around the halfway point of the entire programming
period. In other words, the optimal feasible path may enter and leave
the neighboring $\epsilon$-cone several times. In this sense, Radner's theorem is sometimes
referred to as a hop-skip-jumping turnpike theorem or a weak turnpike
theorem. This possibility can, with certain additional assumptions, be ruled out.
Such a theorem is often referred to as a strong turnpike theorem. In terms of the
Radner-type model, such a theorem is proved by Nikaido [23] and Inada [10].⁴
Nikaido [23] imposed the following assumptions in addition to the assumptions of
Radner's theorem.
(N-1) For any $x \geq 0$ there is some $y$ such that $(x, y) \in T$, where $y$ can be 0.
(N-2) $\hat{x} > 0$.
(N-3) The function $u(x)$ is such that $x > x' \geq 0$ implies $u(x) > u(x')$.
Assumption (N-1) is related to but weaker than the usual free disposability
assumption, which says that $(x, y) \in T$, $x' \geq x$, and $y' \leq y$ imply $(x', y') \in T$.
Assumption (N-3) is satisfied if, for example, $u(x) = p^* \cdot x$ with $p^* > 0$.
Under these additional assumptions, Nikaido's strong turnpike theorem
([23], p. 154) asserts the following:
For any $\epsilon > 0$, there is a number $N_1$ such that, for any $N$ and for any optimal
feasible program $\{x_t\}$, $t = 0, 1, \ldots, N$, starting from an arbitrarily given $x_0$, we have
$d(x_t, \hat{x}) < \epsilon$ for $N_1 \leq t \leq N - N_1$.

FOOTNOTES

1. The extension of the turnpike theorems to the model in which consumption is
allowed may be referred to as the neo-turnpike theorems.
2. For an excellent attempt to survey various turnpike theorems, see Turnovsky [32],
for example.
3. Obviously this is not necessarily true, if the prescribed initial stock $x_0$ is not on
the von Neumann ray. On the other hand, we can conclude that any balanced growth
path other than the von Neumann path is not optimal even if the initial stock $x_0$
is on such a path. This is owing to the observation made in the previous remark
that the converse of the above theorem also holds.
4. In this connection, we should point out Tsukui's contribution [28]. In a Leontief
type model with alternative techniques, he proved a strong turnpike theorem as well
as other results. The result of this paper overlaps with those in McKenzie [17],
Drandakis [3], and Tsukui [29]. As in [29], Tsukui in [28] also proved a "dual
theorem" which shows the turnpike behavior of the shadow prices of the efficient
path about the von Neumann price ray. Strikingly enough, his [28] was apparently
completed in February 1961 (as a Ph.D. thesis at the Hitotsubashi University), and it
appears to be independent even of pioneering works by Morishima [22] and Radner
[25]. Tsukui's contribution in [28] seems to be unduly ignored. In this connection,
the truly pioneering nature of the Japanese edition (published in 1957) of Furuya and
Inada [4] in the turnpike literature should be emphasized. Incidentally, Nikaido
[23] was apparently written under the stimulus of Tsukui [28] (see [23], p. 151).

REFERENCES

1. Atsumi, H., "Neoclassical Growth and the Efficient Program of Capital Accumulation,"
Review of Economic Studies, XXXII, April 1965.
2. Dorfman, R. A., Samuelson, P. A., and Solow, R. M., Linear Programming and
Economic Analysis, New York, McGraw-Hill, 1958, chap. 12.
3. Drandakis, E. M., "On Efficient Accumulation Paths in the Closed Production
Model," Econometrica, 34, April 1966.
4. Furuya, H., and Inada, K., "Balanced Growth and Intertemporal Efficiency in
Capital Accumulation," International Economic Review, 3, January 1962.
5. Gale, D., "The Closed Linear Model of Production," in Linear Inequalities and
Related Systems, ed. by H. W. Kuhn and A. W. Tucker, Princeton, N.J., Princeton
University Press, 1956.
6. Gale, D., "On Optimal Development in a Multi-Sector Economy," Review of Economic
Studies, XXXIV, January 1967.
7. Hahn, F. H., and Matthews, R. C. O., "The Theory of Economic Growth: A Survey,"
Economic Journal, LXXIV, December 1964.
8. Hicks, J. R., "The Story of a Mare's Nest," Review of Economic Studies, XXVIII,
February 1961.
9. Hicks, J. R., Capital and Growth, Oxford, Clarendon Press, 1965.
10. Inada, K., "Some Structural Characteristics of Turnpike Theorems," Review of
Economic Studies, XXXI, January 1964.
11. Karlin, S., Mathematical Methods and Theory in Games, Programming and Economics,
Vol. 1, Reading, Mass., Addison-Wesley, 1959.
12. Kemeny, J. G., Morgenstern, O., and Thompson, G. L., "A Generalization of the
von Neumann Model of an Expanding Economy," Econometrica, 24, April 1956.
13. Koopmans, T. C., "Analysis of Production as an Efficient Combination of Activities,"
in Activity Analysis of Production and Allocation, ed. by T. C. Koopmans, Cowles
Foundation Monograph, No. 13, New York, Wiley, 1951, chap. 3.
14. Koopmans, T. C., "Economic Growth at a Maximal Rate," Quarterly Journal of Economics,
LXXVIII, August 1964.
15. Malinvaud, E., "Capital Accumulation and Efficient Allocation of Resources,"
Econometrica, 21, April 1953.
16. Malinvaud, E., "Efficient Capital Accumulation: A Corrigendum," Econometrica, 30, July
1962.
17. McKenzie, L. W., "Turnpike Theorems for a Generalized Leontief Model,"
Econometrica, 31, January-April 1963.
18. McKenzie, L. W., "The Dorfman-Samuelson-Solow Turnpike Theorem," International
Economic Review, 4, January 1963.
19. McKenzie, L. W., "The Turnpike Theorem of Morishima," Review of Economic Studies, XXX,
October 1963.
20. McKenzie, L. W., "Maximal Paths in the von Neumann Model," in Activity Analysis in the
Theory of Growth and Planning, ed. by E. Malinvaud and M. O. L. Bacharach,
London, Macmillan, 1967.
21. McKenzie, L. W., "Accumulation Programs of Maximum Utility and the von Neumann Facet,"
in Value, Capital and Growth, Papers in Honour of Sir John Hicks, ed. by J. N. Wolfe,
Edinburgh, Edinburgh University Press, 1968.
22. Morishima, M., "Proof of a Turnpike Theorem: The 'No Joint Production' Case,"
Review of Economic Studies, XXVIII, February 1961.
23. Nikaido, H., "Persistence of Continual Growth Near the von Neumann Ray,"
Econometrica, 32, January 1964.
24. Radner, R., Notes on the Theory of Economic Planning, Athens, Greece, Center of
Economic Research, 1963.
25. Radner, R., "Paths of Economic Growth that are Optimal with Regard Only to Final
States: A Turnpike Theorem," Review of Economic Studies, XXVIII, February 1961.
26. Samuelson, P. A., "Efficient Paths of Capital Accumulation in Terms of the Calculus
of Variations," in Mathematical Methods in the Social Sciences, 1959, ed. by Arrow,
Karlin, and Suppes, Stanford, Calif., Stanford University Press, 1960.
27. Thompson, G. L., "On the Solution of a Game Theoretic Problem," in Linear
Inequalities and Related Systems, ed. by H. W. Kuhn and A. W. Tucker, Princeton,
N.J., Princeton University Press, 1956.
28. Tsukui, J., "Efficient and Balanced Growth Paths in a Dynamic Input-Output
System - A Turnpike Theorem," Economic Studies Quarterly, XIII, 1, 1962 (in
Japanese).
29. Tsukui, J., "Turnpike Theorem in a Generalized Dynamic Input-Output System,"
Econometrica, 34, April 1966.
30. Tsukui, J., "The Consumption and the Output Turnpike Theorems in a von Neumann
Type of Model - A Finite Term Problem," Review of Economic Studies, XXXIV,
January 1967.
31. Tsukui, J., "Application of a Turnpike Theorem to Planning for Efficient Accumulation:
An Example of Japan," Econometrica, 36, January 1968.
32. Turnovsky, S. J., "Turnpike Theorems and Efficient Economic Growth," Chapter
10 of Mathematical Theories of Economic Growth, by E. Burmeister and A. R. Dobell,
New York, Macmillan, 1970.
33. von Neumann, J., "A Model of General Economic Equilibrium," Review of Economic
Studies, XII, 1, 1945-46 (originally published in German, 1937).
34. Winter, S. G., "Some Properties of the Closed Linear Model of Production,"
International Economic Review, 6, May 1965.
35. Winter, S. G., "The Norm of a Closed Technology and the Straight-Down-the-Turnpike
Theorem," Review of Economic Studies, XXXIV, January 1967.

Section B
MULTISECTOR OPTIMAL GROWTH WITH CONSUMPTION

a. INTRODUCTION
In spite of all the excitement in the profession, the turnpike theory, at least
in its earlier versions, has one major weakness: It assumes that the utility function
is a function of the terminal stock of commodities only. This means that the
economy's concern about the intermediate periods is restricted only to their effect
on the terminal stock of commodities. The utility function in the (earlier) turn-
pike theory is defined only on the terminal stock of commodities and not on the
stock of commodities in any intermediate period. As Koopmans remarked, "the
purpose of economic activity is by implication assumed to be the fastest growth
rather than the enjoyment of life by all generations" ([9], p. 357).
Ramsey, Koopmans, and Cass have overcome these shortcomings of the
turnpike theory for a one-commodity economy. We have already discussed their
problem in Chapter 5, Section D. For a multisector model Gale [6], then
McKenzie [13], have made major progress and have provided an almost complete
solution of the problem. In their treatment of the problem, the utility function
depends on every intermediate state as well as on the terminal state. If $s_t$ represents
the state of period $t$ relevant to satisfaction, Gale's utility function is represented
as $\sum_{t=1}^{N} u(s_t)$ for an N-period program, or as $\sum_{t=1}^{\infty} u(s_t)$ for an infinite horizon
program. Gale and McKenzie are concerned primarily with the optimal program
when the time horizon is infinite. In this sense, they are addressing the same question
for the multisector optimal growth problem that Ramsey, Koopmans, Cass,
and so on, addressed for the one-sector optimal growth problem. Gale shows the
existence of an optimal path by actually constructing such a path, which at the
same time exhibits the basic characteristics of the optimal path. In arriving at
this major result, Gale utilizes Radner's procedure in proving his turnpike
theorem; hence a concept analogous to the von Neumann ray becomes essential in
his procedure. For this purpose, he defines the concept of an "optimal stationary
program"; then the "loss" associated with paths which deviate from this "optimal
stationary program" plays a crucial role in establishing his major theorem. He
confesses, "it may well be true that there is a more direct way of obtaining our
existence theorem," but "the facts we pick up along the way are of economic
interest in themselves describing properties of an 'optimal path'" ([6], p. 1).
Although in showing the existence and in characterizing the optimal path
we essentially follow Gale's procedure, our presentation is more expository. In
addition, it differs from Gale's presentation in the following respects:

(i) Gale assumed that the utility function is defined on the input-output process
adopted at each period. If $(x_t, y_t)$ denotes such a process, then $s_t = (x_t, y_t)$,
and $\sum u(x_t, y_t)$ represents his utility series. Here $(x_t, y_t)$ includes consumption
activities such as eating cakes as well as production activities such as producing
cakes. Although he claims that this is "the conceptually correct way" ([6],
p. 6), and although he may be right in his defense, this has the weakness of
obscuring the distinction between the production activity and the consumption
activity, with minor implications such as obscuring the role of consumers'
satiation. Certainly an activity of consuming cakes is essentially different from
an activity of producing cakes and in economics it is often very important to
make this distinction clear. It would be difficult to rewrite the entire theory
of competitive markets (such as described in Chapter 2) by adopting Gale's
procedure. Hence in this section we assume that the utility function is defined
on consumption vectors instead of on input-output vectors. Such a procedure
is certainly the case for the one-sector optimal growth theory à la Ramsey,
Koopmans, and Cass.
(ii) In the discussion of the "optimal stationary program," Gale [7] utilized his
new results in the theory of nonlinear programming and developed the Kuhn-
Tucker theorem. We show that we can do the same job by utilizing the ordinary
concave programming theory (Chapter 1, Section B) without any new result.
(iii) Brock [3] worries about Gale's assumption of strict concavity of the utility
function. His worry is mainly due to the fact that it does not include the "von
Neumann" economy. Although he followed Gale in defining a utility function
on input-output vectors, Brock simplified Gale's procedure on one important
point which we call "Brock's lemma." For a discussion of his other important
contribution on the "weakly maximal program," the reader is referred to his
paper [3].

A rough preview of this section is now in order. First, we may note that we
agree with Gale about the importance of appreciating various "sceneries" in
connection with the present problem. In subsection b, we formulate the basic
model of this section. Then in subsection c, we discuss the finite horizon problem.
There we show that every "competitive" program is "optimal" and that every
"optimal" program is "competitive" (Theorems 7.B.1 and 7.B.2). (As pointed out
by Malinvaud [11], similar theorems for the infinite horizon case would not hold
without an important modification, that is, "cost minimization.") In subsection d,
we switch to the infinite horizon problem. First, we introduce the concept of "optimal
stationary program" (O.S.P.). Theorem 7.B.3 asserts its existence and Theorem
7.B.4 asserts the price implications of the optimal stationary program. In the
corollaries of Theorem 7.B.4, we discuss (i) the consumers' nonsatiation condi-
tions which guarantee semipositiveness and strict positiveness of the price vector
associated with the O.S.P, and (ii) the conditions which guarantee the uniqueness
of the O.S.P. In subsection e, we compare an arbitrary "attainable" program (for
the infinite horizon problem) with the O.S.P. In Theorem 7.B.5, we assert that
no attainable program is "infinitely better" than the O.S.P. In Theorem 7.B.6, we
characterize the attainable paths which are not "infinitely worse" than the O.S.P.
(called the "eligible programs"). Theorem 7.B.7 establishes the existence of an
eligible attainable program, and in Theorem 7.B.8, we assert that every eligible
attainable program converges to the O.S.P. if the O.S.P. is unique. In subsection f,
we turn to the optimal program for the infinite horizon problem and, by Theorem
7.B.9, prove the crucial result of this section, the existence of the optimal attainable
program. Before we prove Theorem 7.B.9, we introduce Brock's lemma,
which is crucial to this theorem.

b. THE MODEL
Consider an economy with $n$ commodities. Let $(x_t, y_t)$ denote the production
process in period $t$ where $x_t$ and $y_t$, respectively, denote the (stock) input vector
and the (stock) output vector. Let $T$ be the set of such processes which are technologically
feasible in the economy. Thus $T$ is the technology set (or the production
set) of the economy. Let $c_t$ denote a consumption vector of the economy and let
$C$ be the set of all possible consumption vectors (not necessarily technologically
feasible) in the economy. We assume that:
(A-1) (i) The set $T$ is a nonempty, compact, and convex subset of $R^{2n}$, and
(ii) $C$ is a nonempty, compact, and convex subset of $R^n$.
The set $T$ is bounded because of some sort of resource limitation, which we
clarify in an example later in this subsection, and $C$ is bounded from below for
the obvious reason of subsistence, and so on. We may assume that $C$ is bounded
from above owing to the physiological limitation of personal consumption, for if
$C$ is not bounded from above and if the economy can "grow" indefinitely as time
extends without limit, then the above assumption would not hold. However, the
justification for the upper bound of $C$ may not be acceptable to some readers.
One way to avoid this question is to introduce consumers' satiation, which imposes
a practical upper bound on consumption; for example, recall Ramsey's "bliss"
in [16]. Another way is to suppose an upper bound on capital accumulation
resulting from capital satiation. The latter may be more acceptable. (See, for
example, McKenzie [13].) If there is an upper bound on capital accumulation,
then this, together with the lower bound on the consumption set, will practically
make the relevant "attainable" set bounded, which makes the relevant consumption
set bounded. In this connection, we may remind the reader of the procedure
in the theory of competitive equilibria, in which the attainable set is "compactified."
(See Debreu [4], pp. 76-78.)
In any case, we proceed with our analysis under the assumption that both
$T$ and $C$ are bounded. Since both $T$ and $C$ are compact, $T \otimes C$ is also compact.
That $T \otimes C$ is nonempty and convex follows from the fact that a product of nonempty
convex sets is nonempty and convex. Hence (A-1) implies that $T \otimes C$ is
nonempty, compact, and convex. Let $r_0$ denote the vector of the stock of commodities
made available at the beginning of period 0. It is called the initial resource
vector. Let $Z$ be the set of all possible initial resource vectors. We assume that
$Z$ is a nonempty bounded subset of $R^n$.
We assume that the welfare of the society in period $t$ can be represented by
the utility function $u(c_t)$ such that:

(A-2) The utility function $u(c_t)$ is continuous and concave on $C$.

Consider a sequence $\{(x_t, y_t, c_t)\}$, $t = 0, 1, 2, \ldots, N$, such that

(a) $(x_t, y_t) \in T$, $t = 0, 1, 2, \ldots, N$
(b) $c_t \in C$, $t = 0, 1, 2, \ldots, N$
(c) $x_0 + c_0 \leq r_0$, $r_0 \in Z$
(d) $x_t + c_t \leq y_{t-1}$, $t = 1, 2, \ldots, N$

We call such a sequence $\{(x_t, y_t, c_t)\}$ an N-period attainable program starting from
$r_0$. The set of all the N-period attainable programs starting from $r_0$ is denoted by
$A_N(r_0)$. When $N \to \infty$, we can analogously define the infinite horizon attainable
program starting from $r_0$. The movement of our economy, that is, a sequence
$\{(x_t, y_t, c_t)\}$ where $(x_t, y_t) \in T$ and $c_t \in C$, is described by Figure 7.4.
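
Conditions (a) to (d) translate directly into a mechanical check. The following is a minimal sketch only: the function `is_attainable`, the membership tests `in_T` and `in_C`, and the one-commodity numbers are all hypothetical and chosen purely for illustration.

```python
def is_attainable(program, r0, in_T, in_C, tol=1e-12):
    """Check conditions (a)-(d) for a finite program.

    program : list of triples (x_t, y_t, c_t), each a list of n floats
    r0      : initial resource vector (list of n floats)
    in_T    : assumed predicate, in_T(x, y) is True when (x, y) is in T
    in_C    : assumed predicate, in_C(c) is True when c is in C
    """
    for t, (x, y, c) in enumerate(program):
        # (a) and (b): technological and consumption feasibility.
        if not in_T(x, y) or not in_C(c):
            return False
        # (c) for t = 0, (d) for t >= 1: input plus consumption cannot
        # exceed what is available at the start of the period.
        supply = r0 if t == 0 else program[t - 1][1]
        if any(xi + ci > si + tol for xi, ci, si in zip(x, c, supply)):
            return False
    return True


# Hypothetical one-commodity economy: output is at most twice the input.
def in_T(x, y):
    return 0.0 <= x[0] <= 10.0 and 0.0 <= y[0] <= 2.0 * x[0]


def in_C(c):
    return 0.0 <= c[0] <= 5.0


r0 = [1.0]
good = [([0.6], [1.2], [0.4]), ([0.8], [1.6], [0.4])]
bad = [([0.6], [1.2], [0.4]), ([0.8], [1.6], [0.5])]  # 0.8 + 0.5 > 1.2
```

A small tolerance is used in the inequality test so that floating-point rounding does not reject a program that satisfies a constraint with equality.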
We now discuss an important example of the economy described above.
[Figure 7.4. An Illustration of the Movement of the Economy. For periods $(t-1)$, $t$, and $(t+1)$, the diagram shows each input $x_t$ being transformed through $T$ into the output $y_t$, which is then divided between $c_{t+1}$ and $x_{t+1}$.]

EXAMPLE: We consider an economy in which $n$ "commodities" and one
type of "labor" are involved. The essential characteristics of this "labor" are
that it is indispensable for any production process and that it grows at a constant
rate $\mu$. Let $a_{ij}$ be the amount of the $i$th commodity input per unit operation
of the $j$th process and let $b_{ij}$ be the amount of the $i$th commodity output
per unit operation of the $j$th process. Here the unit operation of each process
is measured by one unit input of "labor." We assume that $a_{ij} \geq 0$ and
$b_{ij} \geq 0$ for all $i$ and $j$. Let $A$ and $B$ be $n \times m$ matrices such that $A = [a_{ij}]$
and $B = [b_{ij}]$. The technology set may be defined as $\{(A \cdot v, B \cdot v): v \geq 0, v \in R^m\}$,
which is a convex polyhedral cone; hence it is closed but not
bounded. Let $L_t$ be the total labor available in period $t$. By assumption,
$L_t = (1 + \mu)L_{t-1}$. Assume $L_t > 0$ for all $t$. Let $C_t$ be the aggregate consumption
vector of the economy. Assume that the consumption set is the
entire nonnegative orthant of $R^n$. Let $R_0 \geq 0$ be the initial resources. Then
we have

PROGRAM $\alpha$

$$A \cdot v_0 + C_0 \leq R_0$$
$$A \cdot v_t + C_t \leq B \cdot v_{t-1}, \quad t = 1, 2, \ldots, N$$
$$u \cdot v_t \leq L_t, \quad v_t \geq 0 \text{ and } C_t \geq 0, \quad t = 0, 1, \ldots, N$$

where $u = (1, 1, \ldots, 1) \in R^m$.

Divide both sides of these inequalities by $L_t$ and set $c_t \equiv C_t/L_t$, $z_t \equiv v_t/L_t$,
and $r_0 \equiv R_0/L_0$. Then we obtain

$$A \cdot z_0 + c_0 \leq r_0$$
$$A \cdot z_t + c_t \leq \frac{B \cdot z_{t-1}}{1 + \mu}, \quad t = 1, 2, \ldots, N$$
$$u \cdot z_t \leq 1, \quad z_t \geq 0 \text{ and } c_t \geq 0, \quad t = 0, 1, \ldots, N$$

Write $x_t = A \cdot z_t$ and $y_{t-1} = B \cdot z_{t-1}/(1 + \mu)$. Let $T \equiv \{(x_t, y_t): x_t = A \cdot z_t,\ y_t = B \cdot z_t/(1 + \mu),\ z_t \geq 0,\ 0 \leq u \cdot z_t \leq 1\}$.
Clearly, $T$ is nonempty, compact, and
convex. We can now construct an attainable program.

PROGRAM $\beta$

$$(x_t, y_t) \in T, \quad c_t \in \Omega^n$$
$$x_t + c_t \leq y_{t-1}, \quad t = 1, 2, \ldots, N$$
$$x_0 + c_0 \leq r_0$$

Clearly program $\alpha$ and program $\beta$ are equivalent in the sense that there
exists a one-to-one correspondence by the rule defined above. Hence the
set of utility sequences $u(C_t/L_t)$ in program $\alpha$ and the set of utility sequences
$u(c_t)$ in program $\beta$ are identical. Therefore, it suffices to consider only
program $\beta$.
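
The equivalence of the two programs can be checked numerically. The sketch below assumes that the aggregate program consists of the constraints $A \cdot v_0 + C_0 \leq R_0$, $A \cdot v_t + C_t \leq B \cdot v_{t-1}$, and $u \cdot v_t \leq L_t$, and that the per-capita program is obtained by dividing through by $L_t$; the helper names and all numbers are hypothetical.

```python
def matvec(M, v):
    """Product of a matrix (given as a list of rows) with a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def fits(need, cons, supply, tol=1e-9):
    """True if need + cons <= supply holds in every coordinate."""
    return all(a + c <= s + tol for a, c, s in zip(need, cons, supply))


def feasible_alpha(A, B, R0, L, v, C, tol=1e-9):
    """Feasibility of the aggregate program (activity levels v_t, C_t)."""
    for t in range(len(v)):
        if min(v[t]) < 0 or min(C[t]) < 0 or sum(v[t]) > L[t] + tol:
            return False
        supply = R0 if t == 0 else matvec(B, v[t - 1])
        if not fits(matvec(A, v[t]), C[t], supply, tol):
            return False
    return True


def feasible_beta(A, B, r0, mu, z, c, tol=1e-9):
    """Feasibility of the per-capita program (z_t, c_t)."""
    for t in range(len(z)):
        if min(z[t]) < 0 or min(c[t]) < 0 or sum(z[t]) > 1 + tol:
            return False
        if t == 0:
            supply = r0
        else:
            supply = [s / (1 + mu) for s in matvec(B, z[t - 1])]
        if not fits(matvec(A, z[t]), c[t], supply, tol):
            return False
    return True


def normalize(L, R0, v, C):
    """Divide period-t quantities by L_t and initial resources by L_0."""
    z = [[x / L[t] for x in v[t]] for t in range(len(v))]
    c = [[x / L[t] for x in C[t]] for t in range(len(C))]
    r0 = [x / L[0] for x in R0]
    return r0, z, c


# Hypothetical data: two commodities, two processes, labor growing at 5%.
mu = 0.05
L = [100.0, 105.0]
A = [[0.5, 0.2], [0.1, 0.4]]
B = [[1.0, 0.6], [0.4, 1.2]]
R0 = [60.0, 50.0]
v = [[50.0, 50.0], [60.0, 45.0]]
C_ok = [[20.0, 20.0], [40.0, 50.0]]
C_bad = [[20.0, 20.0], [45.0, 50.0]]  # period-1 demand exceeds supply
```

Normalizing a feasible aggregate plan gives a feasible per-capita plan, and an infeasible one stays infeasible; the step that makes this work is $L_t = (1 + \mu)L_{t-1}$, which turns $B \cdot v_{t-1}/L_t$ into $B \cdot z_{t-1}/(1 + \mu)$.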

c. FINITE HORIZON: OPTIMALITY AND COMPETITIVENESS

We now consider the following maximization problem for the finite-period
problem:

Maximize: $\sum_{t=0}^{N} u(c_t)$ over $\{(x_t, y_t, c_t)\}$

Subject to: $\{(x_t, y_t, c_t)\} \in A_N(r_0)$

A solution $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\}$ of the above problem is called an optimal (attainable)
program with respect to $A_N(r_0)$. Note that a solution to the above problem may not
exist. First of all, $A_N(r_0)$ may be empty. Moreover, even if $A_N(r_0)$ is not empty, the
solution still may not exist. We now state the following assumptions:
(A-3) The set $A_N(r_0)$ is nonempty and compact.
(A-4) (Productiveness) Given $r_0$, there exist $(x_t, y_t) \in T$, $t = 0, 1, \ldots, N$, and
$c_t \in C$ such that $x_t + c_t < y_{t-1}$, $t = 1, 2, \ldots, N$, and $x_0 + c_0 < r_0$.
Although the nonemptiness of $T$ and $C$ in (A-1) can be considered as
preliminary to (A-3), it is not sufficient. For example, if $r_0$ is too small, then
there exists no $c \in C$ and $(x, y) \in T$ such that $x + c \leq r_0$. Assumption (A-4)
implies that the economy is capable of expansion starting from $r_0$. Note the strict
inequality in $x_t + c_t < y_{t-1}$.
We can now assert that if (A-3) holds and if $u$ is continuous by (A-2), then
there exists an optimal program for $A_N(r_0)$ as a result of the Weierstrass theorem
(Theorem 0.A.18). If $u$ is strictly concave, we can show that the sequence $\{\hat{c}_t\}$ is
unique. However, this does not imply that there exists a unique $(\hat{x}_t, \hat{y}_t) \in T$ corresponding
to this unique $\hat{c}_t$. In other words, there can be many $(x_t, y_t) \in T$ such
that $\{(x_t, y_t, \hat{c}_t)\} \in A_N(r_0)$ is optimal.
The problem thus stated very much resembles some problems which appear
in the theory of competitive markets (for example, the Pareto optimum problem),
and therefore we can obtain analogous results. The basic methodology is again
nonlinear programming. We first define the following concept.

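Definition (competitiveness): An N-period program $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\}$ starting from
$r_0$ with $(\hat{x}_t, \hat{y}_t) \in T$ and $\hat{c}_t \in C$ is called competitive if there exists a sequence of
n-vectors $p_t \ge 0$, t = 0, 1, 2, ..., N, N + 1 (called "price" vectors), such that

(i) $u(\hat{c}_t) - p_t \cdot \hat{c}_t \ge u(c_t) - p_t \cdot c_t$ for all $c_t \in C$, t = 0, 1, ..., N
(ii) $p_{t+1} \cdot \hat{y}_t - p_t \cdot \hat{x}_t \ge p_{t+1} \cdot y_t - p_t \cdot x_t$ for all $(x_t, y_t) \in T$, t = 0, 1, ..., N
(iii) $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\} \in A_N(r_0)$, $p_0 \cdot (r_0 - \hat{c}_0 - \hat{x}_0) = 0$, $p_t \cdot (\hat{y}_{t-1} - \hat{c}_t - \hat{x}_t) = 0$,
t = 1, 2, ..., N, and $p_{N+1} = 0$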

REMARK: We define competitiveness for an infinite horizon program in a

strictly analogous fashion.
REMARK: Condition (ii) is the condition for profit maximization and
condition (iii) says that a competitive program must be attainable and that if a
commodity is in excess supply for each period, its price becomes zero. Note
that condition (i) implies that $u(\hat{c}_t) \ge u(c_t)$ for all $c_t \in C$ with $p_t \cdot c_t \le p_t \cdot \hat{c}_t$,
t = 0, 1, 2, ..., N, that is, utility is maximized subject to the budget condition.
Note also that our definition is analogous to that of competitive equilibrium
(Chapter 1, Section F).
REMARK: Strictly speaking, the supposition of an aggregate utility func-
tion is a very uncomfortable assumption. Instead, we may consider a
society which consists of "consumers" who are immortal (immortal con-
sumers can be justified by conceiving of each consumer as a family unit) and
then define the utility function for each consumer. Many of the theorems
and definitions in this section would then follow with suitable changes (for
example, maximization of a real-valued function u is converted to vector
maximization of a vector-valued function). We leave this task to interested
readers. We should note, however, that there is a clear simplification in the
exposition as a result of the supposition of an aggregate utility function,
which also has the advantage of giving stronger results. Gale [6] , for
example, adopted this supposition of an aggregate utility function.
Our first theorem is analogous to the theorem that every competitive
equilibrium is a Pareto optimum.

Theorem 7.B.1: If a program $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\} \in A_N(r_0)$ is competitive, then it is
optimal.
PROOF: Let $\{(x_t, y_t, c_t)\}$ be any other program in $A_N(r_0)$. By conditions (i)
and (iii) of competitiveness, we have
(1) $u(\hat{c}_0) - u(c_0) \ge p_0 \cdot (\hat{c}_0 - c_0) \ge p_0 \cdot (r_0 - \hat{x}_0) - p_0 \cdot (r_0 - x_0) = p_0 \cdot x_0 - p_0 \cdot \hat{x}_0$

(2) $u(\hat{c}_t) - u(c_t) \ge p_t \cdot (\hat{c}_t - c_t) \ge p_t \cdot (\hat{y}_{t-1} - \hat{x}_t) - p_t \cdot (y_{t-1} - x_t)$, $1 \le t \le N$

(3) $0 = p_{N+1} \cdot \hat{y}_N - p_{N+1} \cdot y_N$

Summing from 0 to N and rearranging terms,
(4) $\sum_{t=0}^{N} u(\hat{c}_t) - \sum_{t=0}^{N} u(c_t) \ge \sum_{t=0}^{N} [(p_{t+1} \cdot \hat{y}_t - p_t \cdot \hat{x}_t) - (p_{t+1} \cdot y_t - p_t \cdot x_t)]$
Then in view of condition (ii) of competitiveness, the RHS of the above
inequality is nonnegative, which implies

$\sum_{t=0}^{N} u(\hat{c}_t) \ge \sum_{t=0}^{N} u(c_t)$ (Q.E.D.)
Next we prove the converse of the above theorem, which is analogous to the
theorem in the theory of competitive markets that every Pareto optimum point
can be achieved as a competitive equilibrium point.

Theorem 7.B.2: Under (A-2) and (A-4), if $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\}$ is an optimal program with
respect to $A_N(r_0)$, then it is competitive.
PROOF: By the hypothesis of the theorem, $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\}$ maximizes

$\sum_{t=0}^{N} u(c_t)$

subject to $r_0 \ge x_0 + c_0$, $y_{t-1} \ge x_t + c_t$, t = 1, 2, ..., N
and $(x_t, y_t) \in T$, $c_t \in C$, t = 0, 1, 2, ..., N
Since u is concave and since Slater's condition is satisfied from (A-4), we can
apply the Kuhn-Tucker-Uzawa theorem of concave programming (Theorem
1.B.3 and its corollary). In other words, there exist $p_t \ge 0$, t = 0, 1, 2, ..., N,
such that
(5) $\sum_{t=0}^{N} u(\hat{c}_t) + p_0 \cdot (r_0 - \hat{x}_0 - \hat{c}_0) + \sum_{t=1}^{N} p_t \cdot (\hat{y}_{t-1} - \hat{x}_t - \hat{c}_t)$
$\ge \sum_{t=0}^{N} u(c_t) + p_0 \cdot (r_0 - x_0 - c_0) + \sum_{t=1}^{N} p_t \cdot (y_{t-1} - x_t - c_t)$

for all $c_t \in C$, $(x_t, y_t) \in T$, t = 0, 1, 2, ..., N

and
(6) $p_0 \cdot (r_0 - \hat{x}_0 - \hat{c}_0) + \sum_{t=1}^{N} p_t \cdot (\hat{y}_{t-1} - \hat{x}_t - \hat{c}_t) = 0$

We set $p_{N+1} = 0$. Since $r_0 - \hat{x}_0 - \hat{c}_0 \ge 0$ and $\hat{y}_{t-1} - \hat{x}_t - \hat{c}_t \ge 0$, t = 1, 2, ..., N,
owing to $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\} \in A_N(r_0)$, condition (iii) of competitiveness follows
from (6) and $p_{N+1} = 0$. Condition (5) can be rewritten as follows:

(7) $\sum_{t=0}^{N} u(\hat{c}_t) - p_0 \cdot (\hat{x}_0 + \hat{c}_0) + \sum_{t=1}^{N} p_t \cdot (\hat{y}_{t-1} - \hat{x}_t - \hat{c}_t)$
$\ge \sum_{t=0}^{N} u(c_t) - p_0 \cdot (x_0 + c_0) + \sum_{t=1}^{N} p_t \cdot (y_{t-1} - x_t - c_t)$

for all $(x_t, y_t) \in T$ and $c_t \in C$, t = 0, 1, 2, ..., N

In particular, put $x_t = \hat{x}_t$, $y_t = \hat{y}_t$ for all t = 0, 1, 2, ..., N, and $c_t = \hat{c}_t$ for
all t = 0, 1, 2, ..., N, except for $t_0$. Then we obtain

(8) $u(\hat{c}_{t_0}) - p_{t_0} \cdot \hat{c}_{t_0} \ge u(c_{t_0}) - p_{t_0} \cdot c_{t_0}$ for all $c_{t_0} \in C$

Since the choice of $t_0$ is arbitrary, this establishes condition (i) of
competitiveness. Next put $c_t = \hat{c}_t$ in relation (7) for all t = 0, 1, ..., N, and
$x_N = \hat{x}_N$. Then we obtain

$\sum_{t=0}^{N-1} (p_{t+1} \cdot \hat{y}_t - p_t \cdot \hat{x}_t) \ge \sum_{t=0}^{N-1} (p_{t+1} \cdot y_t - p_t \cdot x_t)$

for all $(x_t, y_t) \in T$, t = 0, 1, ..., N - 1. Now set $x_t = \hat{x}_t$ and $y_t = \hat{y}_t$ for all
t = 0, 1, ..., N - 1, except for $t_0$; then we obtain
(9) $p_{t_0+1} \cdot (\hat{y}_{t_0} - y_{t_0}) \ge p_{t_0} \cdot (\hat{x}_{t_0} - x_{t_0})$ for all $(x_{t_0}, y_{t_0}) \in T$

Note that the choice of $t_0$ is arbitrary, so that (9) holds for any $t_0$ = 0, 1, ...,
N - 1. Next put $c_t = \hat{c}_t$, $x_t = \hat{x}_t$, and $y_t = \hat{y}_t$ (for t = 0, 1, ..., N - 1) in
relation (7), and obtain
$-p_N \cdot \hat{x}_N \ge -p_N \cdot x_N$
or

$p_{N+1} \cdot \hat{y}_N - p_N \cdot \hat{x}_N \ge p_{N+1} \cdot y_N - p_N \cdot x_N$

since $p_{N+1} = 0$. Thus we have established condition (ii) of competitiveness.

d. OPTIMAL STATIONARY PROGRAM

From now on we consider only infinite horizon programs, so we use such
words as "program" and "attainable program" without explicitly stating "infinite
horizon," unless it is desirable to make it explicit. A sequence $\{(x_t, y_t, c_t)\}$ is
a feasible program if $(x_t, y_t) \in T$, $c_t \in C$, and $x_t + c_t \le y_{t-1}$, t = 1, 2, .... If in
addition $x_0 + c_0 \le r_0$, then it is an attainable program starting from $r_0$. The set of
all the feasible programs will be denoted by A. The set of all the attainable
programs starting from $r_0$ will be denoted by $A(r_0)$. The crucial difference between
A and $A(r_0)$ is whether the initial condition ($x_0 + c_0 \le r_0$) is specified. In A, the
historically given initial vector $r_0$ is ignored. In other words,
$A(r_0) \equiv \{\{(x_t, y_t, c_t)\} \in A: x_0 + c_0 \le r_0\}$. We henceforth assume that the set $A(r_0)$ (and also A) is nonempty.
We first consider a program { (x, y, c)} in A such that each of the x, y, and c is
constant over time. Such a program, if it exists, is called a stationary program
or a golden age program. The concept is analogous to that of the "golden age path"
for the one-sector optimal program (Chapter 5, Section C) or the balanced growth
path (Chapter 6, Section A). Next consider the program which maximizes u(c)
among the set of stationary programs. Such a concept is analogous to that of
the golden rule path or the von Neumann path.

Definition: A constant sequence $\{(\hat{x}, \hat{y}, \hat{c})\} \in A$ is called an optimal stationary
program (O.S.P.) or a golden rule path if it is a solution to the following nonlinear
programming problem:
Maximize: $u(c)$ over $(x, y, c)$
Subject to: $x + c \le y$
$(x, y) \in T$ and $c \in C$
where x, y, and c are constant over time. Note that in the concept of O.S.P., the
initial condition $r_0$ is disregarded.
To consider the above problem we make the following assumption:
(A-5) There exist $(x, y) \in T$ and $c \in C$ such that $x + c < y$.
One distinction between (A-4) and (A-5) must be made clear; in (A-5), the
historically given initial resource vector is ignored. Under (A-1) and (A-5), the set
$\{(x, y, c): (x, y, c) \in T \otimes C,\ x + c \le y\}$ is nonempty and compact. Then the
continuity of u in (A-2) insures the existence of a solution to the above nonlinear
programming problem (that is, the existence of an O.S.P.). Hence we assert the
following theorem.

Theorem 7.B.3: Under (A-1), (A-2), and (A-5), there exists an O.S.P.
REMARK: In establishing this theorem the strict inequality ($<$) in (A-5) can
be weakened to the weak inequality ($\le$).
We now prove the following important theorem which is in Gale [6].

Theorem 7.B.4: If $\{(\hat{x}, \hat{y}, \hat{c})\}$ is an O.S.P. and if (A-2) and (A-5) are satisfied, then
there exists a $p \ge 0$, $p \in R^n$, such that
(10) $u(\hat{c}) + p \cdot (\hat{y} - \hat{x} - \hat{c}) \ge u(c) + p \cdot (y - x - c)$
for all $(x, y) \in T$ and $c \in C$
and

(11) $p \cdot (\hat{y} - \hat{x} - \hat{c}) = 0$


PROOF: By definition of O.S.P., the theorem can be obtained by applying the

theorem for the saddle-point characterization in concave programming (see
the corollary of Theorem 1.B.3). The "productiveness assumption," (A-5),
provides Slater's condition. (Q.E.D.)
REMARK: The converse of the above theorem holds and its proof is easy
(see Theorem 1.B.4).

REMARK: We call the p obtained in Theorem 7.B.4 the price vector asso-
ciated with O.S.P. It corresponds to the von Neumann price vector.
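
The O.S.P. and its associated price can be illustrated numerically. The sketch below is not from the text: it assumes a hypothetical one-good economy with $T = \{(x, y): 0 \le x \le 4,\ 0 \le y \le 2\sqrt{x}\}$, $C = [0, 4]$, and $u(c) = \sqrt{c}$. The helper names `f`, `grid`, `find_osp`, and `supports` are illustrative. A grid search locates the stationary maximizer, and a second grid check tests whether a candidate price p satisfies relations (10) and (11) on the sampled points only.

```python
import math

# Hypothetical one-good economy (an assumption, not taken from the text):
#   T = {(x, y): 0 <= x <= 4, 0 <= y <= 2*sqrt(x)},  C = [0, 4],  u(c) = sqrt(c)
def f(x):
    return 2.0 * math.sqrt(x)

def u(c):
    return math.sqrt(c)

def grid(lo, hi, n):
    """n + 1 equally spaced points on [lo, hi]."""
    return [lo + (hi - lo) * i / n for i in range(n + 1)]

def find_osp(n=400):
    """Grid search for max u(c) subject to x + c <= y, (x, y) in T, c in C."""
    best = None
    for x in grid(0.0, 4.0, n):
        y = f(x)                          # produce on the frontier of T
        c = min(max(y - x, 0.0), 4.0)     # largest c in C with x + c <= y
        if best is None or u(c) > best[3]:
            best = (x, y, c, u(c))
    return best

x_hat, y_hat, c_hat, u_hat = find_osp()

def supports(p, n=80, tol=1e-9):
    """Check (10) and (11) for price p on a finite grid of T x C."""
    lhs = u_hat + p * (y_hat - x_hat - c_hat)
    for x in grid(0.0, 4.0, n):
        for y in grid(0.0, f(x), 10):
            for c in grid(0.0, 4.0, n):
                if u(c) + p * (y - x - c) > lhs + tol:
                    return False
    return abs(p * (y_hat - x_hat - c_hat)) <= tol

price_ok = supports(0.5)     # candidate price u'(c_hat) = 1/2
zero_price_ok = supports(0.0)
```

In this example the search returns the point (1, 2, 1), and the candidate price 1/2 passes the grid check while p = 0 fails it, in line with the nonsatiation argument for $p \ne 0$. A grid check can only refute a candidate price; it does not prove that (10) holds on all of $T \otimes C$.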

REMARK: The above inequality holds, a fortiori, for all $(x, y) \in T$ and $c \in C$
such that $x + c \le r_0$ for a given $r_0 \in Z$ [provided that such an (x, y, c) exists].
If $\hat{c}$ in the above theorem is not a satiation consumption [that is, if there
exists a $c' \in C$ such that $u(c') > u(\hat{c})$], then we can also assert that $p \ne 0$. To see
this, suppose p = 0 in the inequality (10); then $u(\hat{c}) \ge u(c)$ for all $c \in C$, which
contradicts $\hat{c}$ being a nonsatiation consumption. Suppose we strengthen this
assumption of nonsatiation such that, for $\hat{c}$, there exists a ${}^{i}c' \equiv (\hat{c}_1, \hat{c}_2, \ldots, c_i', \ldots, \hat{c}_n) \in C$
such that $u({}^{i}c') > u(\hat{c})$, for all i = 1, 2, ..., n. This means that the
consumption vector $\hat{c}$ can be improved by changing the amount of
consumption of any commodity. In other words, at $\hat{c}$, the consumption of no commodity
reaches a satiation point. We call this assumption the strong nonsatiation
assumption, and we call the former assumption ["there exists a $c' \in C$ such that $u(c') >
u(\hat{c})$"] the weak nonsatiation assumption. If this strong nonsatiation assumption
holds, then we can assert that p is strictly positive. To see this, suppose the contrary,
so that $p_{i_0} = 0$ for some $i_0$ in {1, 2, ..., n}. Let $c_i = \hat{c}_i$, $x_i = \hat{x}_i$, and $y_i = \hat{y}_i$ for all i =
1, 2, ..., n, except for $i = i_0$ in inequality (10). Then we have $u(\hat{c}) \ge u({}^{i_0}c)$ for
all ${}^{i_0}c \in C$ where ${}^{i_0}c \equiv (\hat{c}_1, \hat{c}_2, \ldots, c_{i_0}, \ldots, \hat{c}_n)$. This contradicts the strong
nonsatiation assumption. We may summarize this as a corollary of Theorem 7.B.4.

Corollary 1: In Theorem 7.B.4,

(i) If the weak nonsatiation assumption holds, then $p \ge 0$ and $p \ne 0$.

(ii) If the strong nonsatiation assumption holds, then $p > 0$.

It should be clear that the O.S.P. is not necessarily unique. If u(c) is a strictly
concave function of c, then we can assert that $\hat{c}$ is unique. But this does not
guarantee that the associated input-output vector $(\hat{x}, \hat{y})$ is unique, unless T has
some special feature. Here we point out one such feature, that is, the strict
convexity of the set T.

Definition: The set T is said to be strictly convex if $(x, y) \in T$, $(x', y') \in T$, and
$(x, y) \ne (x', y')$ imply that there exists an $(\bar{x}, \bar{y}) \in T$ such that, for some $\theta$,
$0 < \theta < 1$,
$\bar{x} \le \theta x + (1 - \theta)x'$
and
$\bar{y} > \theta y + (1 - \theta)y'$

Corollary 2: Suppose that u is strictly concave and that both T and C are convex.
Let $(\hat{x}, \hat{y}, \hat{c})$ be an O.S.P. Then

(i) $\hat{c}$ is unique.
(ii) If in addition, the assumptions of Theorem 7.B.4 hold with $p \ge 0$, $p \ne 0$, and if T is
strictly convex, then the $(\hat{x}, \hat{y})$ associated with $\hat{c}$ is also unique. Thus $(\hat{x}, \hat{y}, \hat{c})$ is a
unique O.S.P.

PROOF:
(i) We first assert that the strict concavity of u implies the uniqueness of
$\hat{c}$. To prove this, suppose that $(\hat{x}, \hat{y}, \hat{c})$ and $(x', y', c')$ are two O.S.P.'s
such that $\hat{c} \ne c'$. Note that $u(\hat{c}) = u(c')$ by the definition of O.S.P. Let
$\bar{c} \equiv \frac{1}{2}(\hat{c} + c')$, $\bar{x} \equiv \frac{1}{2}(\hat{x} + x')$, and $\bar{y} \equiv \frac{1}{2}(\hat{y} + y')$. Clearly $\bar{x} + \bar{c} \le \bar{y}$. Also
$(\bar{x}, \bar{y}) \in T$ and $\bar{c} \in C$, resulting from the
convexity of T and C. Hence the
constraints for the defining programming problem are all satisfied for
$(\bar{x}, \bar{y}, \bar{c})$. But owing to the strict concavity we have
$u(\bar{c}) > \frac{1}{2}[u(\hat{c}) + u(c')] = u(\hat{c})$

which is a contradiction.
(ii) Next we show that the input-output vector $(\hat{x}, \hat{y})$ associated with $\hat{c}$ is
unique under the strict convexity of T. To show this, suppose the contrary
so that both $(\hat{x}, \hat{y}) \in T$ and $(x', y') \in T$ are associated with $\hat{c}$, where
$(\hat{x}, \hat{y}) \ne (x', y')$. In other words, both $(\hat{x}, \hat{y}, \hat{c})$ and $(x', y', \hat{c})$ are solutions
of the defining programming problem of O.S.P. Since the assumptions of
Theorem 7.B.4 hold, the necessary and sufficient condition for O.S.P.,
that is, the relations (10) and (11), holds. In view of (11) and the
assumption that both $(\hat{x}, \hat{y}, \hat{c})$ and $(x', y', \hat{c})$ are O.S.P.'s, we have

$p \cdot (\hat{y} - \hat{x}) = p \cdot (y' - x') = p \cdot \hat{c}$

But because of the strict convexity of T, $(\hat{x}, \hat{y}) \in T$, $(x', y') \in T$, and
$(\hat{x}, \hat{y}) \ne (x', y')$ imply that there exists an $(x^*, y^*) \in T$ such that, for
some $\theta$, $0 < \theta < 1$,
$x^* \le \theta \hat{x} + (1 - \theta)x'$
and
$y^* > \theta \hat{y} + (1 - \theta)y'$

Hence, using $p \ge 0$, $p \ne 0$, we obtain

$p \cdot (y^* - x^*) > \theta p \cdot (\hat{y} - \hat{x}) + (1 - \theta) p \cdot (y' - x') = p \cdot \hat{c}$
or
$p \cdot (y^* - x^* - \hat{c}) > 0$
Set $x = x^*$, $y = y^*$, and $c = \hat{c}$ in relation (10), and note (11). Then we
obtain
$p \cdot (y^* - x^* - \hat{c}) \le 0$
which is a contradiction. (Q.E.D.)
REMARK: As we remarked in subsection a, Gale [6] and Brock [3]
assumed that u is a function of (x, y) rather than of c and suppressed c from
their entire analysis. Hence in Gale [6], his assumption of the strict
concavity of u implies the uniqueness of his O.S.P., $(\hat{x}, \hat{y})$, without any
assumption such as the strict convexity of T. However, as Brock pointed out, the
strict concavity of u(x, y) precludes the von Neumann type model from the
analysis. Brock therefore assumed the concavity of u(x, y) instead of its
strict concavity to allow for the von Neumann model. To obtain some of his
major theorems, he also assumed that the O.S.P. is unique. We may wish
to obtain the conditions, other than the strict convexity of T and the strict
concavity of u, which would imply the uniqueness of the O.S.P. We leave
this to the interested reader.
REMARK: The uniqueness of the optimal attainable program
for the finite horizon problem can be established in a manner similar to
that in the above Corollary 2.

e. O.S.P. AND ELIGIBILITY

The relations (10) and (11) of Theorem 7.B.4 imply
(12) $u(\hat{c}) - u(c) \ge 0$ for all $(x, y) \in T$ and $c \in C$ such that $p \cdot (y - x - c) \ge 0$
Consider any attainable program $\{(x_t, y_t, c_t)\}$ with $y_{t-1} - x_t - c_t \ge 0$ starting from
an arbitrary $r_0$ in Z, and consider a particular O.S.P. Relation (12) means that
there can be no utility gains along such a path compared with the O.S.P. if we
ignore the initial condition of the program. This observation will be central in
establishing the following.

(i) The utility sequence of any attainable program cannot be infinitely better than
that of the O.S.P. (Theorem 7.B.5).
(ii) An attainable program can be "infinitely worse" than the O.S.P. by deviating
from it sufficiently. The characterization of such a path is given by Theorem
7.B.6.
(iii) Any attainable program that is not "infinitely worse" must converge to the
O.S.P. asymptotically (Theorem 7.B.8), if the O.S.P. is unique.

The situation is very much analogous to the procedure used by Radner,

who utilizes the "value loss" associated with any path deviating from the von

Neumann ray in establishing his turnpike theorem (see Section A of this chapter).
Following Gale [6], we now establish the above statements one by one. First
we prove the following.

Theorem 7.B.5: Suppose that $\{(\hat{x}, \hat{y}, \hat{c})\}$ is an O.S.P. and that (A-1), (A-2), and
(A-5) hold. Then, for any N and for any attainable program $\{(x_t, y_t, c_t)\}$ starting from
any given $r_0 \in Z$, there exists an M such that
(13) $\sum_{t=0}^{N} (u_t - \hat{u}) \le M$, where $u_t \equiv u(c_t)$ and $\hat{u} \equiv u(\hat{c})$
PROOF: Using (A-2) and (A-5), we apply Theorem 7.B.4. Then there exists a
$p \ge 0$ such that
(14) $\hat{u} \ge u(c) + p \cdot (y - x - c)$ for all $(x, y) \in T$ and $c \in C$
or
(15) $u(c_t) - \hat{u} \le p \cdot (x_t + c_t - y_t)$ for all $(x_t, y_t) \in T$ and $c_t \in C$,
t = 0, 1, 2, ..., N
Now suppose $\{(x_t, y_t, c_t)\} \in A$ so that $x_t + c_t \le y_{t-1}$, t = 1, 2, ..., N. Then
summing both sides of the above inequality (15) over t, we obtain the
following relation for $\{(x_t, y_t, c_t)\} \in A_N(r_0)$:
(16) $\sum_{t=0}^{N} (u_t - \hat{u}) \le \sum_{t=0}^{N} p \cdot (x_t + c_t - y_t)$
$= p \cdot (x_0 + c_0) + \sum_{t=1}^{N} p \cdot (x_t + c_t - y_{t-1}) - p \cdot y_N$
$\le p \cdot (x_0 + c_0) - p \cdot y_N$
$\le p \cdot r_0 - p \cdot y_N$
The last two inequalities hold because $x_t + c_t \le y_{t-1}$ and $x_0 + c_0 \le r_0$ with
$p \ge 0$. Since T and Z are bounded, there exists a real number M, independent
of N, such that
(17) $p \cdot r_0 - p \cdot y_N \le M$
Hence

$\sum_{t=0}^{N} (u_t - \hat{u}) \le M$ (Q.E.D.)

REMARK: Following the above convention, we henceforth write $u_t \equiv u(c_t)$
and $\hat{u} \equiv u(\hat{c})$.
REMARK: As remarked before, the above theorem establishes that there
is no attainable program which is infinitely better than the O.S.P.,
$\{(\hat{x}, \hat{y}, \hat{c})\}$. However, there is nothing discussed so far that precludes the
possibility that there exists an attainable program which is infinitely worse
than the O.S.P. To consider this problem, we introduce the concept of
"eligibility."

Definition: An infinite horizon attainable program $\{(x_t, y_t, c_t)\}$, starting from an
arbitrary given $r_0 \in Z$, is called eligible if its associated utility series is bounded
from below; that is, there exists a real number E such that
(18) $\sum_{t=0}^{N} (u_t - \hat{u}) \ge E$ for any N

Theorem 7.B.6: If an attainable program $\{(x_t, y_t, c_t)\}$ starting from any given
$r_0 \in Z$ is not eligible, and if (A-1), (A-2), and (A-5) hold, then
(19) $\sum_{t=0}^{N} (u_t - \hat{u}) \to -\infty$ as $N \to \infty$
PROOF: Since $\{(x_t, y_t, c_t)\}$ is not eligible, for any E there exists an $\bar{N}$
dependent on E such that

(20) $\sum_{t=0}^{\bar{N}} (u_t - \hat{u}) < E$
Also, in view of relation (15), we have
(21) $\sum_{t=\bar{N}+1}^{N} (u_t - \hat{u}) \le \sum_{t=\bar{N}+1}^{N} p \cdot (x_t + c_t - y_t)$
$= p \cdot (x_{\bar{N}+1} + c_{\bar{N}+1} - y_N) + \sum_{t=\bar{N}+2}^{N} p \cdot (x_t + c_t - y_{t-1})$

Since $x_t + c_t - y_{t-1} \le 0$ for all t for $\{(x_t, y_t, c_t)\} \in A(r_0)$, this implies

(22) $\sum_{t=\bar{N}+1}^{N} (u_t - \hat{u}) \le p \cdot (x_{\bar{N}+1} + c_{\bar{N}+1} - y_N) \le B$

where B is a bound independent of N and $\bar{N}$, and B is obtained by the
compactness of T and C. Hence combining (20) and (22), we obtain
(23) $\sum_{t=0}^{N} (u_t - \hat{u}) < E + B$ for $N > \bar{N}$

It follows that
(24) $\sum_{t=0}^{N} (u_t - \hat{u}) \to -\infty$ as $N \to \infty$

since E can be any (absolutely large) negative number. (Q.E.D.)


REMARK: Thus an attainable program is noneligible if and only if its
associated utility series diverges to $-\infty$ as $N \to \infty$. In other words, every
attainable program is either eligible or its associated utility series diverges
to $-\infty$ as $N \to \infty$.
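
The dichotomy can be seen on a finite horizon with a short computation. The sketch below is only an illustration under assumptions that are not in the text: utility is $u(c) = \sqrt{c}$, the O.S.P. consumption is $\hat{c} = 1$ (so $\hat{u} = 1$), one consumption path converges geometrically to $\hat{c}$, and another stays at c = 0.25 forever. Both paths are assumed to be attainable in some underlying economy; the function name `partial_sums` is hypothetical.

```python
import math
from itertools import accumulate

def partial_sums(utilities, u_hat):
    """Partial sums of (u_t - u_hat) for N = 0, 1, ..., len(utilities) - 1."""
    return list(accumulate(ut - u_hat for ut in utilities))

u_hat = 1.0          # assumed utility level of the O.S.P., u(c_hat) with c_hat = 1
lam = 0.9            # assumed rate of geometric convergence
horizon = 200

# Path converging to c_hat: c_t = 1 - 0.75 * lam**t
converging = [math.sqrt(1.0 - 0.75 * lam**t) for t in range(horizon)]
# Path that never approaches c_hat: c_t = 0.25 for all t
stagnant = [math.sqrt(0.25)] * horizon

s_conv = partial_sums(converging, u_hat)
s_stag = partial_sums(stagnant, u_hat)

lowest_conv = min(s_conv)    # stays above a fixed bound as the horizon grows
last_stag = s_stag[-1]       # falls by 0.5 every period
```

The partial sums of the converging path stay above -7.5 however long the horizon, while those of the stagnant path fall linearly. A finite computation of this kind is only a diagnostic: eligibility is a statement about all N.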
The next theorem in order is the one which asserts the asymptotic conver-
gence of every eligible attainable program to the O.S.P. (if it is unique). However,
here we need to make an important digression. We establish that there exists an
eligible attainable program; otherwise any discussion on the eligible attainable
program would be vacuous. The existence of an eligible attainable program is not
really obvious. For example, it may so happen that even an attainable program
(not necessarily eligible) starting from a given ro may not exist. This is the case
when the initial resource vector $r_0$ is so small that there does not exist any $c_0 \in C$
such that $x_0 + c_0 \le r_0$ for any $x_0$. This difficulty may be handled by assumption
(A-6) below. However this does not establish the existence of a program that is
eligible. To do this we impose assumption (A-7) below.
(A-6) There exist an $(x, y) \in T$ and a $c \in C$ such that $x + c < y$ and $x + c \le r_0$.
(A-7) The function u has a "bounded steepness" at $\hat{c}$. (The concept of "bounded
steepness" was introduced by Gale [6], and it is defined as follows.)

Definition: The function u(c) is said to have a bounded steepness at $\hat{c} \in C$ if there
exists a positive number $\alpha$, dependent on $\hat{c}$, such that
(25) $|u(c) - u(\hat{c})| \le \alpha \|c - \hat{c}\|$, for all $c \in C$
where $\|c - \hat{c}\|$ may be any convenient norm on $R^n$.
Assumption (A-6) is a slight modification of (A-5). Here the additional
restriction $x + c \le r_0$ is imposed. Note that if $x + c > r_0$, then this "productive"
process (x, y) does not have any meaning for the economy starting from $r_0$.
Assumption (A-6) is regarded as a restriction on the initial resource vector $r_0$,
given a certain productive technology. Assumption (A-7) corresponds to the
well-known Lipschitz condition in which the relation in (25) holds for any c and c' in C.
We now state and prove a theorem for the existence of an eligible attainable
program.

Theorem 7.B.7: Suppose that an O.S.P., $(\hat{x}, \hat{y}, \hat{c})$, exists. Then under (A-1), (A-6),
and (A-7), there exists an eligible attainable path starting from an arbitrary given
$r_0$ in Z.
PROOF: By (A-6), there exist an $(x, y) \in T$ and a $c \in C$ such that $x + c < y$
and $x + c \le r_0$. Define $x_1$, $y_1$, and $c_1$ as
(26) $x_1 \equiv (1 - \lambda)\hat{x} + \lambda x$, where $0 < \lambda < 1$

(27) $y_1 \equiv (1 - \lambda)\hat{y} + \lambda y$

(28) $c_1 \equiv (1 - \lambda)\hat{c} + \lambda c$

Note that $x_1 + c_1 = (1 - \lambda)(\hat{x} + \hat{c}) + \lambda(x + c) < (1 - \lambda)(\hat{x} + \hat{c}) + \lambda y$; hence
$x_1 + c_1 < y$ once $\lambda$ is chosen close enough to one, since $x_1 + c_1 \to x + c < y$
as $\lambda \to 1$. The convexity of T implies $(x_1, y_1) \in T$ and
the convexity of C implies $c_1 \in C$.
We now define the sequence $\{(x_t, y_t, c_t)\}$, t = 1, 2, ..., by the rule

(29) $x_t \equiv (1 - \lambda^t)\hat{x} + \lambda^t x = \hat{x} + \lambda^t(x - \hat{x})$

(30) $y_t \equiv (1 - \lambda^t)\hat{y} + \lambda^t y = \hat{y} + \lambda^t(y - \hat{y})$
and
(31) $c_t \equiv (1 - \lambda^t)\hat{c} + \lambda^t c = \hat{c} + \lambda^t(c - \hat{c})$
Next we note that
(32) $x_t = (1 - \lambda)\hat{x} + \lambda x_{t-1}$
because
$(1 - \lambda)\hat{x} + \lambda x_{t-1} = (1 - \lambda)\hat{x} + \lambda[(1 - \lambda^{t-1})\hat{x} + \lambda^{t-1}x] = (1 - \lambda^t)\hat{x} + \lambda^t x = x_t$
Similarly,
(33) $y_t = (1 - \lambda)\hat{y} + \lambda y_{t-1}$

(34) $c_t = (1 - \lambda)\hat{c} + \lambda c_{t-1}$
We now show that $x_t + c_t \le y_{t-1}$ and $(x_t, y_t) \in T$ for all t = 1, 2, .... First
note that this is true for t = 1 by putting $y_0 \equiv y$. Next, by mathematical
induction on t,
(35) $x_{t+1} + c_{t+1} = (1 - \lambda)\hat{x} + \lambda x_t + (1 - \lambda)\hat{c} + \lambda c_t$
$= (1 - \lambda)(\hat{x} + \hat{c}) + \lambda(x_t + c_t) \le (1 - \lambda)\hat{y} + \lambda y_{t-1} = y_t$
using $x_t + c_t \le y_{t-1}$. Also $(x_t, y_t) \in T$ means $(x_{t+1}, y_{t+1}) \in T$, owing to the
convexity of T. That $c_t \in C$ for all t = 1, 2, ..., can also be shown easily by
mathematical induction. First recall $c_1 \in C$. Then the convexity of C with
$\hat{c} \in C$ and $c_t \in C$ implies $c_{t+1} \in C$ in view of (34). Let $x_0 \equiv x$ and $c_0 \equiv c$.
Then $x_0 + c_0 \le r_0$. Hence $\{(x_t, y_t, c_t)\}$, t = 0, 1, 2, ..., constitutes an
attainable program.
To show that the program is eligible, note that u has a bounded
steepness by (A-7). In other words, there exists an $\alpha > 0$ such that

(36) $|u(c_t) - u(\hat{c})| = |u[\hat{c} + \lambda^t(c - \hat{c})] - u(\hat{c})| \le \alpha \lambda^t \|c - \hat{c}\|$, t = 0, 1, 2, ...

Summing this inequality over t and recalling $0 < \lambda < 1$, we obtain

(37) $\sum_{t=0}^{\infty} |u_t - \hat{u}| \le \dfrac{\alpha}{1 - \lambda} \|c - \hat{c}\|$ (Q.E.D.)
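
The construction in this proof is easy to trace numerically. The following sketch assumes a hypothetical one-good economy that is not in the text: $T = \{(x, y): 0 \le x \le 4,\ 0 \le y \le 2\sqrt{x}\}$, $u(c) = \sqrt{c}$, O.S.P. (1, 2, 1), initial resources $r_0 = 0.5$, the productive point (x, y, c) = (0.25, 1, 0.25) required by (A-6), $\lambda = 0.9$, and steepness constant $\alpha = 1$ (valid here because $|\sqrt{c} - 1| \le |c - 1|$). It builds the geometric path (29)-(31), then checks membership in T, attainability, and the bound (37) on a finite horizon.

```python
import math

def f(x):
    """Frontier of the assumed technology set T: y <= 2*sqrt(x)."""
    return 2.0 * math.sqrt(x)

# Assumed data for the illustration (hypothetical, not from the text)
x_hat, y_hat, c_hat = 1.0, 2.0, 1.0      # O.S.P. of the assumed economy
x0, y0, c0, r0 = 0.25, 1.0, 0.25, 0.5    # point satisfying (A-6)
lam, alpha = 0.9, 1.0                    # mixing weight and steepness bound

def program(t):
    """Relations (29)-(31): geometric approach to the O.S.P."""
    w = lam ** t
    return (x_hat + w * (x0 - x_hat),
            y_hat + w * (y0 - y_hat),
            c_hat + w * (c0 - c_hat))

path = [program(t) for t in range(300)]
tol = 1e-12

# (x_t, y_t) must lie in T for every t
in_T = all(0.0 <= x <= 4.0 and 0.0 <= y <= f(x) + tol for x, y, _ in path)

# x_0 + c_0 <= r_0 and x_t + c_t <= y_{t-1} for t >= 1
attainable = (path[0][0] + path[0][2] <= r0 + tol) and all(
    path[t][0] + path[t][2] <= path[t - 1][1] + tol
    for t in range(1, len(path))
)

# Left and right sides of (37), the left truncated at the finite horizon
total_gap = sum(abs(math.sqrt(c) - math.sqrt(c_hat)) for _, _, c in path)
bound = alpha / (1.0 - lam) * abs(c0 - c_hat)
```

With these numbers the attainability condition reduces to $1.5\lambda \ge 1$, which is why $\lambda$ must be taken close enough to one.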
We will now prove that every eligible attainable program starting from any
ro E Z converges to the O.S.P. Here we have to recall that the O.S.P. is not neces-
sarily unique. If this is the case, convergence to the O.S.P. has little meaning. One
way to avoid this problem is simply to assume the uniqueness of the O.S.P. As
remarked before, the strict concavity of u and the strict convexity of T will imply
the uniqueness of the O.S.P. The interested reader may attempt to find other
conditions which would guarantee the uniqueness of the O.S.P. Another way to
handle the situation is to consider the set of all the O.S.P.'s with the recognition
that the O.S.P. is not necessarily unique. Then the convergence to a unique
O.S.P. will be replaced by convergence to such a set. This procedure will be
analogous to the one that McKenzie [12] used to establish a turnpike theorem
in terms of the "von Neumann facet" instead of the unique von Neumann ray.
For such a study, see McKenzie [13]. With this remark, we now prove the
following theorem.

Theorem 7.B.8: Suppose that (A-2) and (A-5) hold and that $(\hat{x}, \hat{y}, \hat{c})$ is the unique
O.S.P. with strict convexity of T and strict concavity of u. Then if an attainable program
$\{(x_t, y_t, c_t)\}$ starting from $r_0$ is eligible, then $\{(x_t, y_t, c_t)\}$ converges to $(\hat{x}, \hat{y}, \hat{c})$ as
$t \to \infty$, regardless of the value of $r_0$ in Z.
PROOF: By the hypothesis of the present theorem, Theorem 7.B.4 holds, so
that relation (15) also follows. (See the proof of Theorem 7.B.5.) Rewriting
relation (15), we obtain
(38) $u_t - \hat{u} = p \cdot (x_t + c_t - y_t) - \beta_t$, t = 0, 1, 2, ...
for all $(x_t, y_t) \in T$ and $c_t \in C$, where $u_t \equiv u(c_t)$, $\hat{u} \equiv u(\hat{c})$, and $\beta_t \ge 0$ for all
t. Summing this, we obtain
(39) $\sum_{t=0}^{N} (u_t - \hat{u}) = \sum_{t=0}^{N} p \cdot (x_t + c_t - y_t) - \sum_{t=0}^{N} \beta_t$
$= p \cdot (x_0 + c_0 - y_N) + \sum_{t=1}^{N} p \cdot (x_t + c_t - y_{t-1}) - \sum_{t=0}^{N} \beta_t$, for all N
Since $x_t + c_t \le y_{t-1}$ for all t = 1, 2, ..., and $x_0 + c_0 \le r_0$ for $\{(x_t, y_t, c_t)\} \in A(r_0)$,
we have

(40) $\sum_{t=0}^{N} (u_t - \hat{u}) \le p \cdot (r_0 - y_N) - \sum_{t=0}^{N} \beta_t$, N = 0, 1, ...

Because T and Z are bounded, there exists an M such that $p \cdot (r_0 - y_N) \le M$.
But the eligibility of $\{(x_t, y_t, c_t)\}$ implies that there exists an E such that
$E \le \sum_{t=0}^{N} (u_t - \hat{u})$. Hence we have
$E \le \sum_{t=0}^{N} (u_t - \hat{u}) \le M - \sum_{t=0}^{N} \beta_t$
or

(41) $\sum_{t=0}^{N} \beta_t \le M - E$, N = 0, 1, ...

Since $\sum_{t=0}^{N} \beta_t$, N = 0, 1, 2, ..., is a monotone nondecreasing sequence
with an upper bound because of (41), $\beta_t \to 0$ as $t \to \infty$.
Since u is strictly concave, T is strictly convex, and Slater's condition
holds by (A-5), the O.S.P. $(\hat{x}, \hat{y}, \hat{c})$ is a unique choice of $(\hat{x}, \hat{y}) \in T$ and
$\hat{c} \in C$ which maximizes the function
(42) $\Phi(x, y, c) \equiv u(c) + p \cdot (y - x - c)$
where $(x, y, c) \in T \otimes C$, and $\Phi(\hat{x}, \hat{y}, \hat{c}) = u(\hat{c})$. But from (38)

$\Phi(x_t, y_t, c_t) = u(\hat{c}) - \beta_t$

and since $\beta_t \to 0$ as $t \to \infty$, it follows that
$\Phi(x_t, y_t, c_t) \to \Phi(\hat{x}, \hat{y}, \hat{c})$ as $t \to \infty$
Hence from the strict concavity of u, the strict convexity of T, and the
continuity of u, we have

$(x_t, y_t) \to (\hat{x}, \hat{y})$ and $c_t \to \hat{c}$ as $t \to \infty$ (Q.E.D.)

REMARK: For the case in which $(\hat{x}, \hat{y}, \hat{c})$ may not be a unique O.S.P., see
Brock ([3], his lemma 4).

Corollary: Suppose that the assumptions of the previous theorem hold. Then
$\sum_{t=0}^{N} (u_t - \hat{u})$ converges to a finite value as $N \to \infty$.

PROOF: Let $\{(x_t, y_t, c_t)\}$ be an eligible attainable program. Then
$(x_t, y_t, c_t) \to (\hat{x}, \hat{y}, \hat{c})$ as $t \to \infty$. From (39) and eligibility, we obtain

(43) $\sum_{t=1}^{N} p \cdot (x_t + c_t - y_{t-1}) = \sum_{t=0}^{N} (u_t - \hat{u}) + p \cdot (y_N - x_0 - c_0) + \sum_{t=0}^{N} \beta_t$
$\ge E + p \cdot (y_N - r_0) + \sum_{t=0}^{N} \beta_t$, for all N
Here a real number E which is independent of N exists owing to the eligibility
of the program. Since $\sum_{t=0}^{\infty} \beta_t$ is convergent (from the proof of Theorem
7.B.8) and $y_N \to \hat{y}$ as $N \to \infty$, the series $\sum_{t=1}^{\infty} p \cdot (x_t + c_t - y_{t-1})$ is bounded
from below by $k - p \cdot r_0$ where $k \equiv E + p \cdot \hat{y} + \sum_{t=0}^{\infty} \beta_t$, a fixed number.
(Since Z is bounded, $p \cdot r_0$ is also bounded.) Since $p \ge 0$ and $(x_t + c_t - y_{t-1}) \le 0$
for $\{(x_t, y_t, c_t)\} \in A(r_0)$, and $\sum_{t=1}^{\infty} p \cdot (x_t + c_t - y_{t-1})$ is bounded
from below, this series $\sum_{t=1}^{N} p \cdot (x_t + c_t - y_{t-1})$ is monotone nonincreasing
and converges to a finite value. Therefore, in view of equation (39) and
recalling again that $y_N \to \hat{y}$ as $N \to \infty$, $\sum_{t=0}^{N} (u_t - \hat{u})$ converges to a fixed
value. (Q.E.D.)

f. OPTIMAL PROGRAM FOR AN INFINITE HORIZON PROBLEM

Now we turn to the question of the (infinite horizon) "optimal program."
Clearly such a program must be an attainable program. In other words, the
optimality must be defined in the set of (infinite horizon) attainable programs
(starting from the same initial point). It is tempting to define the optimal program
as an attainable program $\{(x_t, y_t, c_t)\}$ which maximizes

(44) $\sum_{t=0}^{\infty} u(c_t)$

or

(45) $\sum_{t=0}^{\infty} \dfrac{u(c_t)}{(1 + \rho)^t}$, where $\rho \ge 0$ is a discount factor

Following Ramsey [16] and Gale [6], we assume that the discount factor is zero.
But we cannot simply adopt the target function such as (44), for such a target may
diverge to infinity in many attainable programs. To avoid such a situation for
infinite horizon programs, we define optimality as follows.

Definition: An attainable program $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\}$ starting from the initial resource
vector $r_0$ is said to be optimal if there exists an $\bar{N}$ such that, for any attainable
program $\{(x_t, y_t, c_t)\}$ starting from the same $r_0$,

(46) $\sum_{t=0}^{N} [u(\hat{c}_t) - u(c_t)] \ge 0$ for all $N \ge \bar{N}$

REMARK: In the literature relation (46) is often referred to as defining the
program $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\}$ that overtakes the program $\{(x_t, y_t, c_t)\}$. Hence the
above definition says that $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\}$ is optimal if it overtakes all the
attainable programs starting from the same $r_0$. This concept, which ingeniously
avoids the problem of divergence to infinity for infinite horizon programs,
is due to von Weizsacker [20] and Atsumi [2].
REMARK: It is possible to define and consider weaker concepts of
optimality than the one defined above. For such a treatment, see Brock [3].
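
The overtaking comparison in relation (46) can be sketched for two given utility sequences over a common finite horizon. The function name `overtaking_index` and the two sequences below are hypothetical illustrations, not part of the text; the check is a finite-horizon stand-in for a criterion that refers to all N.

```python
from itertools import accumulate

def overtaking_index(u_candidate, u_rival):
    """Smallest N_bar such that the cumulative utility gap in (46) is >= 0
    for every N >= N_bar within the common horizon; None if no such N_bar."""
    gaps = list(accumulate(a - b for a, b in zip(u_candidate, u_rival)))
    n_bar = None
    # Walk backward from the end of the horizon while the gap stays nonnegative
    for n in range(len(gaps) - 1, -1, -1):
        if gaps[n] < 0:
            break
        n_bar = n
    return n_bar

# Assumed utility streams: one sacrifices early utility, the other is flat
patient = [0.5, 1.0, 1.0, 1.0, 1.0]
steady = [0.75, 0.75, 0.75, 0.75, 0.75]

n_bar = overtaking_index(patient, steady)      # patient path overtakes from N = 1
reverse = overtaking_index(steady, patient)    # steady path never overtakes here
```

Working with partial-sum differences avoids comparing two sums that may each diverge, which is the point of the definition.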

The question that we wish to ask now is whether there exists an optimal
attainable program and, if so, what the characteristics of such a program are.
Gale answered both questions simultaneously by constructing such a program.
This is probably the most important although the most tedious part of his paper
[6]. Brock in a recent paper [3] has simplified this tedious procedure. In this
simplification, the following lemma, which we call Brock's lemma, plays a central
role.

Brock's Lemma: Suppose that (A-1), (A-2), and (A-5) hold. Let $\{(x_t, y_t, c_t)\}$ be an
attainable program starting from an arbitrary given $r_0 \in Z$. Assume that an eligible
attainable program exists starting from $r_0$. Let $(\hat{x}, \hat{y}, \hat{c})$ be an O.S.P. with an
associated price vector p. Then there exists a nonnegative sequence $\{\delta_t\}_{t=0}^{\infty}$ associated
with $\{(x_t, y_t, c_t)\}$ such that

(47) $\sum_{t=0}^{N} (u_t - \hat{u}) = p \cdot (r_0 - y_N) - \sum_{t=0}^{N} \delta_t$ for N = 0, 1, 2, ...

where $u_t \equiv u(c_t)$ and $\hat{u} \equiv u(\hat{c})$
Moreover, there exists an attainable program $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\}$ starting from $r_0$ such that
its associated series $\sum_{t=0}^{\infty} \hat{\delta}_t$ is minimal in the class of attainable programs starting
from a given $r_0$.
PROOF:

(i) Recall equation (38) and set

(48) $\delta_t \equiv -p \cdot (x_t + c_t - y_{t-1}) + \beta_t$, t = 1, 2, ...
and
$\delta_0 \equiv p \cdot (r_0 - x_0 - c_0) + \beta_0$
Then we obtain
$\sum_{t=0}^{N} (u_t - \hat{u}) = p \cdot (r_0 - y_N) - \sum_{t=0}^{N} \delta_t$, N = 0, 1, ...

Since $x_t + c_t \le y_{t-1}$, for all t = 1, 2, ..., and $x_0 + c_0 \le r_0$, for any
attainable $\{(x_t, y_t, c_t)\}$, we have $\delta_t \ge 0$ for all t = 0, 1, 2, .... This proves
the first statement of the lemma.
(ii) Consider an attainable program $\{(x_t, y_t, c_t)\}$ starting from an arbitrary
given $r_0$ which is eligible. Then for any N = 0, 1, 2, ..., there exist E
and B such that
(49) $E \le \sum_{t=0}^{N} (u_t - \hat{u}) = p \cdot (r_0 - y_N) - \sum_{t=0}^{N} \delta_t \le B - \sum_{t=0}^{N} \delta_t$
where the existence of a bound B, independent of N and $r_0 \in Z$, can
be asserted owing to the boundedness of T and Z. Therefore we have
(50) $\sum_{t=0}^{N} \delta_t \le B - E$, N = 1, 2, ....

Hence $\sum_{t=0}^{\infty} \delta_t < \infty$ (that is, bounded from above), for any eligible and
attainable $\{(x_t, y_t, c_t)\}$ starting from $r_0 \in Z$.
(iii) Given any $\{(x_t, y_t, c_t)\}$ starting from a given $r_0$, we can obtain the
associated sequence of $\delta_t \ge 0$, t = 0, 1, 2, ..., defined in (48). Let $a$ be
defined by

(51) $a \equiv \inf \{\sum_{t=0}^{\infty} \delta_t: \{\delta_t\}$ is associated with an attainable $\{(x_t, y_t, c_t)\}$
starting from $r_0\}$

Here the infimum is taken over the set of all attainable $\{(x_t, y_t, c_t)\}$
starting from a given $r_0$. Since an eligible program exists by assumption,
$a < \infty$ [see step (ii) of the proof]. Starting from $r_0$, there may not exist
any attainable program such that its associated series is equal to $a$. Our
task now is to show that there does exist such a program. That is, we wish
to find an attainable program $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\}$ starting from $r_0$ such that its
associated sequence $\{\hat{\delta}_t\}$ is such that
(52) $a = \sum_{t=0}^{\infty} \hat{\delta}_t$
In other words, we wish to find a program $\{(\hat{x}_t, \hat{y}_t, \hat{c}_t)\}$ such that its
associated series $\sum_{t=0}^{\infty} \hat{\delta}_t$ is minimal in the class of programs starting
from a given $r_0$.
(iv) By the definition of $\alpha$, for each $N$ there exists an attainable program $\{(x_t^N, y_t^N, c_t^N)\}$
starting from $r_0$ with its associated series $\sum_{t=0}^{\infty} \delta_t^N$ such that

(53) $\sum_{t=0}^{\infty} \delta_t^N \le \alpha + \frac{1}{N+1}, \quad N = 0, 1, 2, \ldots$

Now for a given t, consider $(x_t^N, y_t^N, c_t^N)$ as a sequence over N, where
N = 0, 1, 2, .... Then owing to the compactness of $T \otimes C$, it contains a
convergent subsequence whose limit is in $T \otimes C$. That is, there exists an
$\{N'\} \subset \{N\}$ such that

(54) $(x_t^{N'}, y_t^{N'}, c_t^{N'}) \to (\bar{x}_t, \bar{y}_t, \bar{c}_t)$ as $N' \to \infty$

where $(\bar{x}_t, \bar{y}_t) \in T$ and $\bar{c}_t \in C$. Note that $x_t^{N'} + c_t^{N'} \le y_{t-1}^{N'}$ for all t, so
that we have $\bar{x}_t + \bar{c}_t \le \bar{y}_{t-1}$. Hence $\{(\bar{x}_t, \bar{y}_t, \bar{c}_t)\} \in A(r_0)$. Owing to the
compactness of $T \otimes C$, the boundedness of Z, and the continuity of u,
the sequence $\{\delta_t^{N'}\}$ (sequence with respect to N') is bounded for each t
[recall equations (38) and (48)]. Hence for each t there exists a convergent
subsequence of $\{\delta_t^{N'}\}$; that is, there exists a subsequence $\{M\} \subset \{N'\}$
such that, for each t = 0, 1, 2, ...,

(55) $\delta_t^M \to \bar{\delta}_t$ as $M \to \infty$
It is clear from the continuity of u and the definition of $\delta_t$ that $\{\bar{\delta}_t\}$
corresponds to the program $\{(\bar{x}_t, \bar{y}_t, \bar{c}_t)\}$ [recall equations (38) and (48)].
Hence

(56) $\sum_{t=0}^{\infty} \bar{\delta}_t \ge \alpha$

by the definition of $\alpha$.
Write $\beta \equiv \sum_{t=0}^{\infty} \bar{\delta}_t$. Suppose $\beta > \alpha$. Then choose $r_1$ and $r_2$ such that

(57) $\beta > r_2 > r_1 > \alpha$

Choose $N_0$ large enough so that

(58) $\sum_{t=0}^{N_0} \bar{\delta}_t > r_2$

Next choose $M_0$, $M_0 \in \{M\}$, so that $M \ge M_0$ implies

(59) $\sum_{t=0}^{N_0} \delta_t^M \ge r_1$

which is possible because $\delta_t^M \to \bar{\delta}_t$ as $M \to \infty$. But in view of (53), we
also have

(60) $\alpha + \frac{1}{M+1} \ge \sum_{t=0}^{\infty} \delta_t^M \ge \sum_{t=0}^{N_0} \delta_t^M$

Then in view of (59), we have

(61) $\alpha + \frac{1}{M+1} \ge r_1$

This is a contradiction, for $\alpha < r_1$ implies that we can choose M large
enough so that $\alpha + 1/(M + 1) < r_1$. (Q.E.D.)
REMARK: The program $\{(\bar{x}_t, \bar{y}_t, \bar{c}_t)\}$ obtained in the above lemma is called
the program with minimal associated series ($\sum_{t=0}^{\infty} \bar{\delta}_t$). It is easy to see that
this program is an eligible program.

Theorem 7.B.9: Suppose that (A-1), (A-2), and (A-5) hold, that an eligible attainable
program exists starting from $r_0$, and that the O.S.P. is unique. Then there exists
an attainable program $\{(\bar{x}_t, \bar{y}_t, \bar{c}_t)\}$ starting from $r_0$ such that, for any attainable
program $\{(x_t, y_t, c_t)\}$ starting from the same $r_0$, there exists an $\bar{N}$ such that

(62) $\sum_{t=0}^{N} [u(\bar{c}_t) - u(c_t)] \ge 0$ for all $N \ge \bar{N}$

PROOF: Let $\{(\bar{x}_t, \bar{y}_t, \bar{c}_t)\}$ be the program with minimal associated series
$\sum_{t=0}^{\infty} \bar{\delta}_t$, which is obtained in the previous lemma. We claim that this is
the optimal program that is desired in Theorem 7.B.9. As remarked before,
this program is an eligible attainable program. In view of Theorem 7.B.6,
we need only compare eligible attainable programs.
From (47) of Brock's lemma, we obtain

(63) $\sum_{t=0}^{N} [u(\bar{c}_t) - u(c_t)] = \hat{p} \cdot [r_0 - r_0 + (y_N - \bar{y}_N)] + \sum_{t=0}^{N} \delta_t - \sum_{t=0}^{N} \bar{\delta}_t$

Also as a result of the eligibility of the two programs and the uniqueness
of the O.S.P., $y_N \to \hat{y}$ and $\bar{y}_N \to \hat{y}$ as $N \to \infty$. By definition, $\sum_{t=0}^{\infty} \bar{\delta}_t$ is the
minimal series so that $\sum_{t=0}^{\infty} \delta_t \ge \sum_{t=0}^{\infty} \bar{\delta}_t$. Hence the conclusion of the
theorem follows from (63). (Q.E.D.)

REFERENCES

1. Arrow, K. J., Hurwicz, L., and Uzawa, H., "Constraint Qualification in Maximiza-
tion Problems," Naval Research Logistics Quarterly, 8, June 1961.
2. Atsumi, H., "Neoclassical Growth and the Efficient Program of Capital Accumula-
tion," Review of Economic Studies, XXXII, April 1965.
3. Brock, W. A., "On Existence of Weakly Maximal Programmes in a Multi-Sector
Economy," Review of Economic Studies, XXXVII, April 1970.
4. Debreu, G., Theory of Value, Cowles Foundation Monograph, No. 17, New York,
Wiley, 1959.
5. Drandakis, E. M., "On Efficient Accumulation Paths in the Closed Production
Model," Econometrica, 34, April 1966.
6. Gale, D., "On Optimal Development in a Multi-Sector Economy," Review of
Economic Studies, XXXIV, January 1967 (also "Correction," Review of Economic
Studies, XXXVIII, July 1971).
7. Gale, D., "A Geometric Duality Theorem with Economic Applications," Review of
Economic Studies, XXXIV, January 1967.
8. Koopmans, T. C., "Analysis of Production as an Efficient Combination of Activ-
ities," in Activity Analysis of Production and Allocation, ed. by T. C. Koopmans,
Cowles Foundation Monograph, No. 13, New York, Wiley, 1951, chap. 3.
9. Koopmans, T. C., "Economic Growth at a Maximal Rate," Quarterly Journal of Economics,
LXXVIII, August 1964.
10. Koopmans, T. C., "On the Concept of Optimal Economic Growth," in The Econometric
Approach to Development Planning, Pontificiae Academiae Scientiarvm Scriptvm
Varia, Amsterdam, North-Holland, 1965.
11. Malinvaud, E., "Capital Accumulation and Efficient Allocation of Resources,"
Econometrica, 21, April 1953 (also "A Corrigendum," Econometrica, 30, July 1962).
12. McKenzie, L. W., "Turnpike Theorems for a Generalized Leontief Model," Econo-
metrica, 31, January-April 1963.
13. McKenzie, L. W., "Accumulation Programs of Maximum Utility and the von Neumann Facet,"
in Value, Capital and Growth, Papers in Honour of Sir John Hicks, ed. by J. N. Wolfe,
Edinburgh, Edinburgh University Press, 1968.

14. Nikaido, H., Convex Structures and Economic Theory, New York, Academic Press,
1968.
15. Radner, R., "Paths of Economic Growth that Are Optimal with Regard Only to
Final States: A Turnpike Theorem," Review of Economic Studies, XXVIII, February
1961.
16. Ramsey, F. P., "A Mathematical Theory of Saving," Economic Journal, XXXVIII,
December 1928.
17. Tsukui, J., "Turnpike Theorem in a Generalized Dynamic Input-Output System,"
Econometrica, 34, April 1966.
18. Tsukui, J., "The Consumption and the Output Turnpike Theorems in a von Neumann
Type of Model-A Finite Term Problem," Review of Economic Studies, XXXIV,
January 1967.
19. Uzawa, H., "The Kuhn-Tucker Theorem in Concave Programming," in Studies in
Linear and Non-Linear Programming, ed. by K. J. Arrow, L. Hurwicz, and H. Uzawa,
Stanford, Calif., Stanford University Press, 1958.
20. von Weizsacker, C. C., "Existence of Optimal Programs of Accumulation for
an Infinite Time Horizon," Review of Economic Studies, XXXII, April 1965.
CHAPTER 8
DEVELOPMENTS OF OPTIMAL CONTROL THEORY
AND ITS APPLICATIONS

Section A
PONTRYAGIN'S MAXIMUM PRINCIPLE

a. OPTIMAL CONTROL: A SIMPLE PROBLEM AND THE MAXIMUM PRINCIPLE
Consider the problem of shooting a guided missile to intercept an airplane.
The location of the missile at time t can be described by a three-dimensional
vector-valued function x(t). The problem is to obtain the "optimal" trajectory
x(t) so that it maximizes or minimizes a certain objective. For example, the
objective may be to minimize the time for the missile to reach the airplane. Clearly
x(t) can be "controlled" by a number of variables. Thus we may consider that the
trajectory of the missile x(t) is controlled by the fuel consumption of the missile
at time t and the angle between the direction of thrust and the "flat" of the earth at
time t.1 These variables are, in general, denoted by an r-dimensional vector-valued
function u(t). The function x(t) is, in general, an n-dimensional vector-valued
function, and hence the problem is, in general, to obtain the trajectory x(t) by
choosing a function u(t) so as to maximize or minimize a certain objective. This
problem is an optimal control problem and the theory for such a problem is called
optimal control theory. Examples of optimal control are vast and one can find many
such examples in everyday life. Some problems are very complex and some are
quite trivial (in the sense that the solution can be found easily). For example, a
trivial problem is to minimize the time required to fill a bathtub with water by
controlling the amount of water running from a faucet at each instant of time.
We can find many such problems in economics.2 The problem of optimal growth
as discussed in Chapter 5, Section D, is an example. In that problem, we were
concerned with finding the time path of per capita consumption so as to maximize
the discounted sum (or the integral) of the utilities obtained from future con-
sumptions. Corresponding to the optimal time path of per capita consumption, we
obtained the trajectory of the capital:labor ratio.
Not only has the (modern) optimal control theory revolutionized the

traditional control theory in various fields of engineering, but also it has attracted
attention throughout our society. It can be considered a mathematical theory with
applications extending to all of human activity. Mathematically, optimal control
theory is closely related to the calculus of variations, as can be suggested by the
problem of optimal growth. In fact, optimal control theory provides a link to the
vast literature on the calculus of variations.3 And, by contrast to the classical
calculus of variations, optimal control theory incorporates general constraints
imposed on the problem in a direct and natural way.4 The work by the famous
Russian mathematician L. S. Pontryagin and his associates [11] is chiefly
responsible for this new approach.5 Although pioneering works were done by
F. A. Valentine in 1937, E. J. McShane in 1939 and 1940, and M. R. Hestenes in
1949, this new approach attracted a huge audience of mathematicians,
engineers, economists, and so on, only after the publication (of the English
translation) of Pontryagin et al. [11]. Especially since the publication of this
work, the literature in the field of optimal control theory has been increasing very
rapidly,6 and already includes a number of good textbooks (for example, [2],
[8], and [9]).
The basic result of Pontryagin et al. [11] is called Pontryagin's maximum
principle, which is concerned with the necessary conditions for optimality.7 This
condition is analogous to the maximization of the Lagrangian in the classical
theory of nonlinear programming. Further results by Hestenes [5] and others
extended this condition to incorporate various kinds of constraints.
The purpose of this chapter is to give an expository account of this theory
and to illustrate it with some applications in economics. Since a rigorous exposi-
tion of optimal control theory requires a book, and since there are several such
books available, our exposition here is rather intuitive.
Consider a system of n first-order differential equations

(1) $\dot{x}_i(t) = f_i[x(t), u(t), t], \quad i = 1, 2, \ldots, n$

where $x(t) \equiv [x_1(t), x_2(t), \ldots, x_n(t)]$ and $u(t) \equiv [u_1(t), u_2(t), \ldots, u_r(t)]$. Here
the $f_i$'s, $x_i$'s, and $u_k$'s are all real-valued functions. The boundary conditions for
(1) are given by

(2) $x_i(t_0) = x_i^0, \quad i = 1, 2, \ldots, n$

If we specify the $u_k(t)$'s, say $u(t) = \bar{u}(t)$, then, assuming the existence and
the uniqueness of a solution, we can completely and uniquely specify the solution
path $x(t; x^0, t_0)$ of the above differential equations. The Cauchy-Peano theorem
discussed in Chapter 3, Section B, is a theorem on the (local) existence and the
uniqueness of a solution.
In optimal control theory, we do not specify u(t) a priori, but rather we
choose u(t) from a set of functions, say $\mathcal{U}$, in order to maximize (or minimize)
a certain target. In this sense, the vector-valued function u(t) is called a control.
The variables $u_k(t)$, k = 1, 2, ..., r, are called the control variables. The range of the
u(t)'s in $\mathcal{U}$ is denoted by U. The region U is called the control region and $\mathcal{U}$ is called
the set of admissible controls. When $u(t) \in \mathcal{U}$, u(t) is called an admissible control
(function). In this section, we assume that the control region U is independent of
x(t) and t. The case in which U is restricted by a constraint such as $g(x, u, t) \ge 0$
will be discussed in Section C. Throughout this chapter, we assume that $\mathcal{U}$ is
restricted to the set where u(t) is "piecewise continuous." By piecewise continuous
we mean that a function is continuous except possibly at a finite number of points.8
It is important to note that discontinuities are allowed for the control functions
(recall footnote 7). Notice that the control region U can be a closed set. In other
words, U can incorporate a constraint such as

$0 \le u(t) \le 1$ for all t

Such a bound may appear if, for example, u(t) is the propensity to save at time t.
The $f_i$'s are assumed to be continuous in each $x_j$, $u_k$, and t, and to possess
continuous partial derivatives with respect to each $x_j$ and t. The range of x(t)
is denoted by X, which is assumed to be an open connected subset of $R^n$. The
boundary point $(x^0, t_0)$ must be such that $x^0 \in X$ and $t_0 \in (t_1, t_2)$. It is required
that x(t) be continuous and have piecewise continuous derivatives.
We now set the target as follows (where T is a fixed constant):

(3) $S \equiv \sum_{i=1}^{n} c_i x_i(T)$, where $T \in (t_1, t_2)$

and consider the problem of choosing $u(t) \in U$ so as to

Maximize: $S$
Subject to: $\dot{x}_i(t) = f_i[x(t), u(t), t]$, and $x_i(0) = x_i^0$, $i = 1, 2, \ldots, n$

Once such a control function, denoted by $\hat{u}(t)$, is found, we should be able to find the
corresponding function $\hat{x}(t)$ as a solution of the system of differential equations
(1). The variables $x_i(t)$, i = 1, 2, ..., n, which are assumed to be continuous in t,
are called the state variables. It is important to note that the derivative of each state
variable is in the constraints, but no derivatives of the control functions are involved
in either the target function or the constraints. This is sometimes used to
distinguish the state variables from the control variables. Although t often refers to
time, in practical applications this does not have to be the case. See,
for example, El-Hodiri [4], pp. 122-126. However, following the usual convention,
we nickname t as "time" t.
We now state the most basic theorem in this chapter, which is concerned with
the above problem.

Theorem 8.A.1: Under the above specifications of the problem, in order that $\hat{u}(t)$ be a
solution of the above problem with the corresponding state variable $\hat{x}(t)$, it is necessary
that there exist a nonzero, continuous vector-valued function $p(t) \equiv [p_1(t), p_2(t), \ldots, p_n(t)]$
such that

(i) p(t) together with $\hat{u}(t)$ and $\hat{x}(t)$ solve the following Hamiltonian system:

(4) $\dot{\hat{x}}_i = \frac{\partial \hat{H}}{\partial p_i}, \quad \dot{p}_i = -\frac{\partial \hat{H}}{\partial x_i}, \quad i = 1, 2, \ldots, n$

where H is defined by

(5) $H[x(t), u(t), t, p(t)] \equiv \sum_{i=1}^{n} p_i(t) f_i[x(t), u(t), t]$

(which is called the Hamiltonian), and $\hat{H} \equiv H[\hat{x}(t), \hat{u}(t), t, p(t)]$

(ii) $H[\hat{x}(t), \hat{u}(t), t, p(t)] \ge H[\hat{x}(t), u(t), t, p(t)]$ for all $u(t) \in U$, that is, H is
maximized with respect to u(t)
(iii) $p_i(T) = c_i$, i = 1, 2, ..., n

REMARK: To avoid any misunderstanding, one notational remark is in
order. For example, the sentence, $\dot{p}_i = -\partial \hat{H}/\partial x_i$, where $\hat{H} \equiv H[\hat{x}(t), \hat{u}(t), t, p(t)]$,
should be taken to mean that $\dot{p}_i = -\partial H/\partial x_i$, where the partial derivative
$\partial H/\partial x_i$ is evaluated at $[\hat{x}(t), \hat{u}(t), t, p(t)]$. That is, $\partial \hat{H}/\partial x_i$ is not the
derivative of $\hat{H}$ with respect to $x_i$ (which is clearly meaningless).
REMARK: In view of the fact that H is to be maximized [condition (ii)], this
theorem is called the maximum principle by Pontryagin et al. [11]. Thus the
above theorem is known as Pontryagin's maximum principle. Theorem 8.A.1
gives the necessary conditions for $\hat{u}(t)$ to be optimal. It was later shown by
Mangasarian [10] and others that these conditions are also sufficient (for a
global optimum) if the $f_i$'s are concave in x and u. We discuss Mangasarian's
theorem in Section C. Note that the above necessary conditions do not
guarantee the existence of an optimal control $\hat{u}(t)$; they are only the conditions
that are implied by optimality, assuming the existence of an optimal
control $\hat{u}(t)$. As remarked above, the local existence of $\hat{x}(t)$ which satisfies
the system of differential equations (1), conditional upon the existence of
$\hat{u}(t)$, is guaranteed by the assumption that the $f_i$'s are continuously
differentiable in the $x_i$'s and t (the Cauchy-Peano theorem). An alert reader may
have realized the similarity between the above problem and the ordinary
nonlinear programming problem. The $p_i$'s correspond to the Lagrangian
multipliers, and H corresponds to the Lagrangian. The maximization of the
Lagrangian is now converted to the maximization of the Hamiltonian.
Pontryagin et al. called the $p_i(t)$'s the auxiliary variables. They are also called the
multipliers or the costate variables.
REMARK: It is important to note that in the above formulation of the
problem, T is a priori fixed and x(T) is not a priori specified. We determine
$\hat{x}(t)$ from the differential equation $\dot{x} = f[x(t), u(t), t]$, once $\hat{u}(t)$ is specified,
and we obtain $\hat{x}(T)$ from $\hat{x}(t)$. The third condition in the theorem, $p_i(T) = c_i$,
i = 1, 2, ..., n, is called the transversality condition and its role is to provide
the additional conditions required due to the fact that x(T) is not a priori
specified. For a good and relatively elementary discussion of the transversality
condition, see Pontryagin et al. [11], especially chapter 1, section 2.
REMARK: Each $\dot{\hat{x}}_i = \partial \hat{H}/\partial p_i$, i = 1, 2, ..., n, is reduced to the constraint
equation (1), that is, $\dot{\hat{x}}_i = f_i[\hat{x}(t), \hat{u}(t), t]$, i = 1, 2, ..., n. The equations
$\dot{p}_i = -\partial \hat{H}/\partial x_i$, i = 1, 2, ..., n, are rewritten as

$\dot{p}_i = -\sum_{j=1}^{n} p_j \frac{\partial \hat{f}_j}{\partial x_i}, \quad i = 1, 2, \ldots, n$

where $\hat{f}_j$ denotes $f_j[\hat{x}(t), \hat{u}(t), t]$. In other words, Theorem 8.A.1 produces
2n first-order differential equations with 2n boundary conditions $x_i(t_0) = x_i^0$
and $p_i(T) = c_i$, i = 1, 2, ..., n; hence the actual solution of the above
problem is reduced to solving a system of differential equations. Clearly this
system is unsolvable unless the function u(t) is specified. The choice of u(t)
depends upon condition (ii) (that is, the maximization of H). The triplet
$[\hat{x}(t), \hat{u}(t), p(t)]$ thus found in Theorem 8.A.1 is called the optimal triplet or
the solution triplet. The pair $[\hat{x}(t), \hat{u}(t)]$ is called the optimal pair or the
solution pair.
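
The way conditions (i) to (iii) pin down a solution can be illustrated numerically. The following is a minimal sketch under assumptions that are not in the text: a scalar problem with $f = ax + u$, control region $U = [0, 1]$, and the hypothetical values `A`, `C`, `T`, `X0` below. Because this $f$ is linear in the state, the costate equation can be solved in closed form; condition (ii) is then applied pointwise on a grid of candidate controls, and the state is integrated forward by the explicit Euler method.

```python
import math

# Hypothetical scalar example (not from the text):
#   maximize S = C*x(T)  subject to  dx/dt = A*x + u,  x(0) = X0,  0 <= u <= 1.
A, C, T, X0 = -1.0, 1.0, 2.0, 0.5

def costate(t):
    """Solves dp/dt = -dH/dx = -A*p with the transversality condition p(T) = C."""
    return C * math.exp(A * (T - t))

def hamiltonian(x, u, t):
    """H = p(t) * f(x, u, t) for f = A*x + u."""
    return costate(t) * (A * x + u)

def best_control(x, t, grid=101):
    """Condition (ii): pick the u in [0, 1] that maximizes H at the current (x, t)."""
    candidates = [k / (grid - 1) for k in range(grid)]
    return max(candidates, key=lambda u: hamiltonian(x, u, t))

def simulate(control, steps=2000):
    """Integrate dx/dt = A*x + control(x, t) forward with the explicit Euler method."""
    h, x = T / steps, X0
    for k in range(steps):
        t = k * h
        x += h * (A * x + control(x, t))
    return x

x_T_optimal = simulate(best_control)        # control chosen by maximizing H
x_T_half = simulate(lambda x, t: 0.5)       # an arbitrary admissible alternative
```

Since the costate is positive throughout in this example, the Hamiltonian is increasing in u and the maximizing control sits on the boundary of U, which is why no first-order condition in u is used here.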
REMARK: Note that the target function as described in (3) is more general
than it appears, as it includes the following target function:

(6) $I \equiv \int_0^T f_0[x(t), u(t), t] \, dt$

To see this, define $x_0(t)$ by $\dot{x}_0 = f_0[x(t), u(t), t]$ with $x_0(0) = 0$. Then $I = x_0(T)$,
which is clearly a special case of (3). Hence the problem of maximizing
I subject to (1) and (2) can be converted to the problem of simply maximizing
$x_0(T)$ subject to (1), (2), and $\dot{x}_0 = f_0[x(t), u(t), t]$ and $x_0(0) = 0$. We can then
immediately apply Theorem 8.A.1.
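
The augmentation described in this remark can be sketched in a few lines. The functions `f`, `f0`, the constant control, and the numerical values below are hypothetical choices for illustration only; the point is that the extra state $x_0$, started at zero and driven by $f_0$, ends at the value of the integral target.

```python
import math

def simulate_augmented(f, f0, control, x_init, T, steps=4000):
    """Euler-integrate dx/dt = f and dx0/dt = f0 together, starting from x0(0) = 0."""
    h, x, x0 = T / steps, x_init, 0.0
    for k in range(steps):
        t = k * h
        u = control(t)
        x0 += h * f0(x, u, t)   # the extra state accumulates the integrand
        x += h * f(x, u, t)     # the original state equation
    return x0, x

# Hypothetical data: dx/dt = -x + u, integrand f0 = x - u**2, constant control u = 0.5.
f = lambda x, u, t: -x + u
f0 = lambda x, u, t: x - u * u
x0_T, x_T = simulate_augmented(f, f0, lambda t: 0.5, x_init=0.0, T=1.0)

# Closed form of the integral for this particular choice, for comparison.
I_exact = 0.5 * math.exp(-1.0) - 0.25
```

With this choice the terminal value `x0_T` agrees with `I_exact` up to the discretization error of the Euler scheme.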
REMARK: In the above formulation of the problem, we defined the target
function by $S \equiv \sum_{i=1}^{n} c_i x_i(T)$. We noted in the above remark that an integral
target in the form of

$I \equiv \int_0^T f_0[x(t), u(t), t] \, dt$

can be converted into the form of S. The converse is also true. In other words,
the target in the form of S can be converted to the above integral form. To
see this, note that

$S \equiv \sum_{i=1}^{n} c_i x_i(T) = \int_0^T \sum_{i=1}^{n} c_i \dot{x}_i(t) \, dt + \sum_{i=1}^{n} c_i x_i(0)$

Hence if the $x_i(0)$'s are fixed, the maximization of S is equivalent to the
maximization of the integral

$J \equiv \int_0^T \sum_{i=1}^{n} c_i \dot{x}_i(t) \, dt$

Hence the maximization of S subject to $x_i(0) = x_i^0$, $\dot{x}_i = f_i[x(t), u(t), t]$,
i = 1, 2, ..., n, is reformulated as follows:

Maximize (with respect to u(t)): $\int_0^T f_0[x(t), u(t), t] \, dt$

Subject to: $\dot{x}_i = f_i[x(t), u(t), t]$ and $x_i(0) = x_i^0$, i = 1, 2, ..., n

where $f_0[x(t), u(t), t] \equiv \sum_{i=1}^{n} c_i f_i[x(t), u(t), t]$

REMARK: If $\hat{u}$ is in the interior of the control region U and if each $f_i$ is
continuously differentiable in u (so that H is continuously differentiable in
u), then condition (ii) implies

$\frac{\partial \hat{H}}{\partial u_k} = 0, \quad k = 1, 2, \ldots, r$

and $\hat{u}$ is in the interior of U if U is an open set. Note that $\partial \hat{H}/\partial u_k = 0$ for
all k means that the maximization of H usually implies r independent conditions.
Conversely, if H is a concave function in the $u_k$'s, then the equations
$\partial \hat{H}/\partial u_k = 0$ for all k imply condition (ii). It is important to note, however,
that the power of condition (ii) is specifically that U is not restricted to
being an open set; in fact, U can be a closed set. For example, if $u_k(t)$ is
restricted by $0 \le u_k(t) \le 1$ for all k, then U is the closed unit cube $[0, 1]^r$
(the closed interval $[0, 1]$ when r = 1), which is
a closed set. Since $u_k(t)$ can be any piecewise continuous function, $u_k(t)$ may
be such that, for each k,

$u_k(t) = 0$ for $t_0 \le t < \bar{t}$
$u_k(t) = 1$ for $\bar{t} \le t \le t_1$

Such a solution, as remarked before, is called the bang-bang solution1, and
is obtained very often in practical applications. For example, consider the
problem of filling a bathtub with water from a faucet in a minimal amount
of time. The amount of water that can be run from the faucet at each instant
of time is the control. Let the unit of the volume of water be chosen such
that the maximum rate at which water can be run into the tub is equal to
one. Thus the control u(t) is restricted by $0 \le u(t) \le 1$. The solution of
this bathtub problem is obviously $\hat{u}(t) = 1$, for $0 \le t < T$, and $\hat{u}(t) = 0$, for
t = T, where T is the point of time at which the tub is full. That is, we obtain
a kind of bang-bang solution. Note also that in this problem T is not specified
a priori, unlike in the problem stated for Theorem 8.A.1. We consider such
a case later.
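
The bathtub example can be checked by direct simulation. The sketch below is illustrative only: the function name `fill_time`, the step size, and the three candidate faucet policies are assumptions, with the tub volume normalized to one so that the fully open faucet fills it in one unit of time.

```python
def fill_time(control, volume=1.0, h=1e-3, t_max=100.0):
    """First time the tub holds `volume`, integrating dx/dt = u(t) with 0 <= u <= 1."""
    x, t = 0.0, 0.0
    while x < volume and t < t_max:
        u = min(1.0, max(0.0, control(t)))   # enforce the control region U = [0, 1]
        x += h * u
        t += h
    return t

t_full_open = fill_time(lambda t: 1.0)   # faucet fully open from the start
t_half_open = fill_time(lambda t: 0.5)   # faucet half open throughout
t_ramp = fill_time(lambda t: t)          # faucet opened gradually, fully open from t = 1
```

Among these three admissible controls the fully open faucet gives the shortest filling time, in line with the bang-bang solution stated in the remark; the comparison is of course not a proof of optimality over all admissible controls.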
REMARK: If we define M by

(7) $M[\hat{x}(t), t, p(t)] \equiv \sup_{u(t) \in U} H[\hat{x}(t), u(t), t, p(t)]$

then condition (ii) of Theorem 8.A.1 can be rewritten as

$H[\hat{x}(t), \hat{u}(t), t, p(t)] = M[\hat{x}(t), t, p(t)]$

REMARK: Those readers who are familiar with the calculus of variations
problem with differential equation constraints should be able to see that
Theorem 8.A.1 may be reduced to a well-known result in the calculus of
variations when $\hat{u}(t)$ is in the interior of U. To see this, note that the
problem is reduced to one of maximizing

$\int_0^T \sum_{i=1}^{n} c_i \dot{x}_i(t) \, dt$ subject to (1) and (2)

since the constant term $\sum_{i=1}^{n} c_i x_i(0)$ will not affect the solution $[\hat{x}(t), \hat{u}(t)]$.
Then form the "Lagrangian," $\Phi \equiv \sum_{i=1}^{n} c_i \dot{x}_i + \sum_{i=1}^{n} p_i (\dot{x}_i - f_i)$. Viewing
this as a calculus of variations problem, we can write Euler's conditions
here as

$\frac{d}{dt} \frac{\partial \Phi}{\partial \dot{x}_i} = \frac{\partial \Phi}{\partial x_i}$ and $\frac{d}{dt} \frac{\partial \Phi}{\partial \dot{u}_k} = \frac{\partial \Phi}{\partial u_k}$, for all the i's and k's

The first condition gives $\dot{p}_i = -\partial \hat{H}/\partial x_i$ [condition (i) of Theorem 8.A.1]
and the second condition gives condition (ii) of Theorem 8.A.1 if $\hat{u}(t)$ is in
the interior of U.
b. THE PROOF OF A SIMPLE CASE
We now give a proof of Theorem 8.A.1. Since the proof of Theorem 8.A.1
in general is quite complicated and takes a great deal of space, we give the proof
for a special case only. In particular, we consider the case in which the functions
$f_i$, i = 1, 2, ..., n, take the following special form:

(8) $f_i[x(t), u(t), t] \equiv \sum_{j=1}^{n} a_{ij}(t) x_j(t) + \phi_i[u(t), t], \quad i = 1, 2, \ldots, n$

In other words, the $f_i$'s are linear in the state variables, and the control variables
are separable from the state variables. Because of this, the proof is greatly
simplified and will enhance the reader's understanding of Theorem 8.A.1. The
proof for this simple case is based on Kopp [7]. The reader who is interested
in more general cases is referred to Leitmann [9], chapter 1, as well as to other
works on the topic, such as Hestenes [5].
In order to present the proof, we now repeat the problem with which
Theorem 8.A.1 is concerned.
PROBLEM:

Maximize (with respect to u(t)): $S \equiv \sum_{i=1}^{n} c_i x_i(T)$, where T is fixed

Subject to:
(9) $\dot{x}_i = f_i[x(t), u(t), t], \quad i = 1, 2, \ldots, n$
(10) $x_i(0) = x_i^0$ (fixed), $i = 1, 2, \ldots, n$
and
(11) $u(t) \in U$
where $x(t) \equiv [x_1(t), \ldots, x_n(t)]$ and $u(t) \equiv [u_1(t), \ldots, u_r(t)]$.
First we define the "auxiliary variables" $p_i(t)$, i = 1, 2, ..., n, by

(12) $\dot{p}_i(t) = -\sum_{j=1}^{n} p_j(t) \frac{\partial \hat{f}_j}{\partial x_i}, \quad p_i(T) = c_i, \quad i = 1, 2, \ldots, n$

Here $\partial \hat{f}_j/\partial x_i$ denotes $\partial f_j/\partial x_i$ evaluated at $[\hat{x}(t), \hat{u}(t), t]$. Define the function H by

(13) $H[x(t), u(t), t, p(t)] \equiv \sum_{i=1}^{n} p_i(t) f_i[x(t), u(t), t]$

where $p(t) \equiv [p_1(t), \ldots, p_n(t)]$; it is clear from the definition of the $p_i(t)$'s that

(14) $\dot{p}_i(t) = -\frac{\partial \hat{H}}{\partial x_i}$, where $\hat{H} \equiv H[\hat{x}(t), \hat{u}(t), t, p(t)]$

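We assume that an optimal control vector $\hat{u}(t)$ has been found and let $\hat{x}(t)$
be the corresponding state vector. We are concerned with the characterization of
this solution pair $[\hat{x}(t), \hat{u}(t)]$. Consider now a variation $\Delta u(t)$ from the optimal
control vector $\hat{u}(t)$ such that $\hat{u}(t) + \Delta u(t) \in U$, and let $\Delta x(t)$ be the resulting total
variation from the optimal state vector $\hat{x}(t)$. Then from (9), we have

(15) $\sum_{i=1}^{n} p_i \Delta \dot{x}_i = \sum_{i=1}^{n} p_i [f_i(\hat{x} + \Delta x, \hat{u} + \Delta u, t) - f_i(\hat{x}, \hat{u}, t)]$

Hence we obtain

(16) $\int_0^T \sum_{i=1}^{n} p_i \Delta \dot{x}_i \, dt = \int_0^T \sum_{i=1}^{n} p_i [f_i(\hat{x} + \Delta x, \hat{u} + \Delta u, t) - f_i(\hat{x}, \hat{u}, t)] \, dt$

Integration by parts of the LHS of the above equation yields

(17) $\int_0^T \sum_{i=1}^{n} p_i \Delta \dot{x}_i \, dt = \left[ \sum_{i=1}^{n} p_i \Delta x_i \right]_0^T - \int_0^T \sum_{i=1}^{n} \dot{p}_i \Delta x_i \, dt$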
Then, rewriting the second term of the RHS of (17) by utilizing (12), and using
(16) and (17), we obtain

(18) $\left[ \sum_{i=1}^{n} p_i \Delta x_i \right]_0^T = -\int_0^T \sum_{i=1}^{n} \sum_{j=1}^{n} p_i(t) \frac{\partial \hat{f}_i}{\partial x_j} \Delta x_j \, dt + \int_0^T \sum_{i=1}^{n} p_i [f_i(\hat{x} + \Delta x, \hat{u} + \Delta u, t) - f_i(\hat{x}, \hat{u}, t)] \, dt$

Since the initial state vector x(0) is assumed to be fixed, $\Delta x(0) = 0$. Note that in
(12), $p_i(T)$ is chosen such that $p_i(T) = c_i$. Hence we have

(19) $\left[ \sum_{i=1}^{n} p_i \Delta x_i \right]_0^T = \sum_{i=1}^{n} c_i \Delta x_i(T) = \Delta S$

where $\Delta S$ is the total variation of the payoff function S.

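To consider the RHS of (18), expand $f_i$ in a Taylor series about $(\hat{x}, \hat{u} + \Delta u, t)$
and obtain

(20) $f_i(\hat{x} + \Delta x, \hat{u} + \Delta u, t) = f_i(\hat{x}, \hat{u} + \Delta u, t) + \sum_{j=1}^{n} \frac{\partial f_i(\hat{x}, \hat{u} + \Delta u, t)}{\partial x_j} \Delta x_j + \frac{1}{2} \sum_{j=1}^{n} \sum_{k=1}^{n} \frac{\partial^2 f_i(\hat{x} + \xi \Delta x, \hat{u} + \Delta u, t)}{\partial x_j \partial x_k} \Delta x_j \Delta x_k$

where $0 < \xi < 1$. Here it is assumed that the first and second continuous partial
derivatives of $f_i$ exist. From (18), (19), and (20), we obtain

(21) $\Delta S = \int_0^T \sum_{i=1}^{n} p_i [f_i(\hat{x}, \hat{u} + \Delta u, t) - f_i(\hat{x}, \hat{u}, t)] \, dt$
$\quad + \int_0^T \sum_{i=1}^{n} \sum_{j=1}^{n} p_i \frac{\partial}{\partial x_j} [f_i(\hat{x}, \hat{u} + \Delta u, t) - f_i(\hat{x}, \hat{u}, t)] \Delta x_j \, dt$
$\quad + \frac{1}{2} \int_0^T \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} p_i \frac{\partial^2 f_i(\hat{x} + \xi \Delta x, \hat{u} + \Delta u, t)}{\partial x_j \partial x_k} \Delta x_j \Delta x_k \, dt$

Now recall our special form of the $f_i$'s, that is, equation (8). Under (8), $\partial f_i/\partial x_j = a_{ij}(t)$
does not depend on u and all second partial derivatives with respect to x are zero, so the
last two members of the RHS of (21) vanish and $\Delta S$ becomes

(22) $\Delta S = \int_0^T [H(\hat{x}, \hat{u} + \Delta u, t) - H(\hat{x}, \hat{u}, t)] \, dt$

A sufficient condition for a maximum of the payoff function S at $(\hat{x}, \hat{u}, t)$ is clearly
$\Delta S \le 0$, and a sufficient condition for $\Delta S \le 0$, in turn, is obtained from (22) as

(23) $H(\hat{x}, \hat{u} + \Delta u, t) - H(\hat{x}, \hat{u}, t) \le 0$

for all $0 \le t \le T$.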
To obtain a necessary condition for a maximum of S, a special variation of
the control vector is chosen, that is,

(24) $\Delta u = (0, \ldots, 0, \Delta u_i, 0, \ldots, 0)$

and $\Delta u_i = 0$ except in the interval $(t_1, t_2)$, where $t_1 < t_2$. Now suppose we have

(25) $H(\hat{x}, \hat{u} + \Delta u, t) - H(\hat{x}, \hat{u}, t) > 0$

for some interval between 0 and T. If the interval $(t_1, t_2)$ is chosen to coincide with
(or lie within) the interval over which relation (25) is satisfied, then the integrand of (22)
is positive on $(t_1, t_2)$ and zero elsewhere, so that $\Delta S > 0$. A similar argument
may be presented for all the control variables. But $\Delta S > 0$ contradicts the fact
that S achieves a maximum at $(\hat{x}, \hat{u}, t)$. Hence by denying (25), we obtain the
following necessary condition

(26) $H(\hat{x}, \hat{u} + \Delta u, t) - H(\hat{x}, \hat{u}, t) \le 0$

In other words, the maximum of S at $(\hat{x}, \hat{u}, t)$ implies that H is maximized at
$(\hat{x}, \hat{u}, t)$ with respect to the control vector u. Thus condition (ii) of Theorem 8.A.1
is proved. Conditions (i) and (iii) are obvious from our choice of the $p_i$'s in
equation (12). Thus the proof is complete.
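
Identity (22), on which the proof rests, can be verified numerically for a system of the special form (8). The sketch below uses a hypothetical scalar instance (the coefficient `A`, the function `phi`, and the two controls are assumptions, not taken from the text): the change in the payoff S caused by a variation of the control on an interval is computed once by simulating the state equation and once as the integral of the Hamiltonian difference along the reference path.

```python
import math

A, C, T, X0 = -1.0, 1.0, 2.0, 0.5        # hypothetical data; f = A*x + phi(u) as in (8)

def phi(u):
    return u - 0.5 * u * u                # the control enters separably

def p(t):
    return C * math.exp(A * (T - t))      # solves (12): dp/dt = -A*p, p(T) = C

def terminal_payoff(control, steps=4000):
    """S = C*x(T); u is held at its midpoint value on each step (RK4 in x)."""
    h, x = T / steps, X0
    for k in range(steps):
        g = phi(control((k + 0.5) * h))
        k1 = A * x + g
        k2 = A * (x + 0.5 * h * k1) + g
        k3 = A * (x + 0.5 * h * k2) + g
        k4 = A * (x + h * k3) + g
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return C * x

def hamiltonian_gap(control, base, steps=4000):
    """RHS of (22) by the midpoint rule; the A*x term cancels in the difference."""
    h, total = T / steps, 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += h * p(t) * (phi(control(t)) - phi(base(t)))
    return total

u_hat = lambda t: 1.0                               # maximizes phi, hence H, since p > 0
u_var = lambda t: 1.6 if 0.5 < t < 1.2 else 1.0     # variation on (t1, t2) only

delta_S = terminal_payoff(u_var) - terminal_payoff(u_hat)
rhs_22 = hamiltonian_gap(u_var, u_hat)
```

The two numbers agree up to discretization error, and both are negative because the reference control maximizes the Hamiltonian at every t, so any variation lowers the payoff.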

c. VARIOUS CASES
As already remarked, the above theorem is concerned with the case in which
the time horizon (T) is fixed and the end-point x(T) is not a priori fixed (it is deter-
mined from the solution of the problem). However, in many circumstances this
may not be the case. For example, if the target of the problem is to minimize
the time (T) to reach a certain target, then T is not a priori specified but it is
rather obtained as a solution of the problem. Such a problem is called the time
optimal problem. In general, we can formulate various problems depending, first,
on whether or not some (or all) coordinates of the state vector x(T) are a priori
fixed and, second, on whether or not the "final time" (T) is fixed.
A few examples are now in order. In the problem of minimizing the time to
fill a bathtub, the final state x(T) = 100 (%) is a priori fixed, but the final time T
is not specified; it is determined as a solution of the problem. In the problem of
shooting a missile to intercept an airplane in a minimum amount of time, both the
final time T and the final state x(T) are unspecified. They are determined as a
part of the solution. In the optimal growth problem of maximizing the discounted
sum (integral) of utilities over time [0, T] with fixed initial and terminal
capital:labor ratios, k0 and kT, both the final time T and the final state k(T) are
a priori fixed.
We now turn to a general consideration of such problems. First we discuss
the case in which m coordinates of the terminal value of the state vector, x(T),
are a priori fixed, where m < n or m = n. Next we consider the case in which the
final time T is not a priori fixed.
(i) The Right-Hand End-Point x(T) Partially Specified: In other words

(27) $x_i(T) = x_i^T, \quad i = 1, 2, \ldots, m$ (fixed), $m \le n$

and no conditions are specified for $x_i(T)$, i = m + 1, ..., n. In this case we
rewrite the transversality condition of Theorem 8.A.1 as follows:

(28) $p_i(T) = c_i + \lambda_i, \quad i = 1, 2, \ldots, m$
(29) $p_i(T) = c_i, \quad i = m + 1, \ldots, n$

Here the $\lambda_i$'s are unknown variables that are constant over time. Note that
in (27) we have m new equations, which correspond to m new variables, the
$\lambda_i$'s. The rest of Theorem 8.A.1 holds as it is. Clearly, Theorem 8.A.1 with
condition (iii) revised as above is a generalization of the original Theorem 8.A.1.
Note that the original theorem is concerned with the case in which m = 0
[that is, none of the $x_i(T)$'s are a priori specified]. A further generalization
of the theorem is obtained if, instead of specification (27), we have the following
functional specification on the right-hand end-point x(T):

(30) $F_j[x(T)] = 0, \quad j = 1, 2, \ldots, m$

where the $F_j$'s are real-valued differentiable functions. Clearly (27) is a special
case of (30) in which $F_j[x(T)] \equiv x_j(T) - x_j^T$, j = 1, 2, ..., m. In this case, the
transversality conditions (28) and (29) are rewritten as

(31) $p_i(T) = c_i + \sum_{j=1}^{m} \lambda_j \frac{\partial F_j}{\partial x_i}, \quad i = 1, 2, \ldots, n$

where the $\lambda_j$'s are unspecified variables which are constant over t.11 It should
be clear that (31) is a generalization of (28) and (29) in the sense that (28) and
(29) are obtained from (31) (not vice versa). Thus by replacing the transversality
condition (iii) by (31), we obtain a further generalization of Theorem 8.A.1.
(ii) Final Time Open: We now turn to the consideration of the case in which the
"terminal time" T is not a priori specified. Since T is not specified, we have
one additional degree of freedom in the system. Hence one additional equation
is required, which is written as follows:

(32) $\sum_{i=1}^{n} p_i(T) \dot{\hat{x}}_i(T) = 0$

In view of the constraint equations $\dot{\hat{x}}_i = f_i[\hat{x}(t), \hat{u}(t), t]$, i = 1, 2, ..., n,
equation (32) can be rewritten as

(33) $H[\hat{x}(T), \hat{u}(T), T, p(T)] \equiv \sum_{i=1}^{n} p_i(T) f_i[\hat{x}(T), \hat{u}(T), T] = 0$

Here $[\hat{x}(T), \hat{u}(T), p(T)]$ denotes the solution triplet at T. In terms of $M[\hat{x}(T), T, p(T)]$
as defined in (7), (33) can be rewritten as

(34) $M[\hat{x}(T), T, p(T)] = 0$

In the case of an autonomous system in which the $f_i$'s do not explicitly depend
on t (that is, $f_i[x(t), u(t)]$), (33) or (34) can be rewritten as

(35) $M[\hat{x}(t), p(t)] = H[\hat{x}(t), \hat{u}(t), p(t)] = 0$ for all t

To see this, first note

$\frac{d}{dt} M[\hat{x}(t), p(t)] = \sum_{i=1}^{n} \frac{\partial \hat{H}}{\partial x_i} \dot{\hat{x}}_i + \sum_{i=1}^{n} \frac{\partial \hat{H}}{\partial p_i} \dot{p}_i$

Hence, in view of condition (i) of Theorem 8.A.1, we have $dM[\hat{x}(t), p(t)]/dt = 0$,
or $M[\hat{x}(t), p(t)]$ = constant for all t. Thus (34) implies (35).
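
The constancy of the maximized Hamiltonian along the optimal path of an autonomous system can be checked on a small closed-form example. The scalar system and the values `A`, `C`, `T`, `X0` below are hypothetical; note also that the example has a fixed final time, so it illustrates only that M is constant in t, not that the constant is zero (which is the extra condition for an open final time).

```python
import math

# Hypothetical autonomous example: dx/dt = A*x + u, 0 <= u <= 1, maximize C*x(T).
A, C, T, X0 = -1.0, 1.0, 2.0, 0.5

def p(t):
    """Costate: dp/dt = -A*p with p(T) = C; positive, so the maximizing control is 1."""
    return C * math.exp(A * (T - t))

def x_hat(t):
    """State path under the control u = 1: solves dx/dt = A*x + 1 with x(0) = X0."""
    return -1.0 / A + (X0 + 1.0 / A) * math.exp(A * t)

def hamiltonian_on_path(t):
    """H evaluated along the candidate optimal triplet, with u = 1."""
    return p(t) * (A * x_hat(t) + 1.0)

values = [hamiltonian_on_path(t) for t in (0.0, 0.5, 1.0, 1.5, 2.0)]
```

All entries of `values` coincide up to rounding, as the derivative computation above predicts for an autonomous system.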
We are now ready to summarize the modifications considered in (i) and (ii).
FIXED VS. VARIABLE END POINTS [MODIFICATION OF TRANSVERSALITY
CONDITION (iii)]:
$x_i(T) = x_i^T$, $i = 1, 2, \ldots, m$ ($m \le n$): $p_i(T) = c_i + \lambda_i$, $i = 1, 2, \ldots, m$
$x_i(T)$ unspecified, $i = m + 1, \ldots, n$: $p_i(T) = c_i$, $i = m + 1, \ldots, n$
"FINAL TIME" OPEN:
autonomous: $H[\hat{x}(t), \hat{u}(t), p(t)] = 0$, for all t
nonautonomous: $H[\hat{x}(T), \hat{u}(T), T, p(T)] = 0$

FIXED "FINAL TIME": The two conditions given above for the case of open
final time are not required for the case of fixed final time.
Theorem 8.A.1 with the above modifications in (i) and (ii) may be called the
generalized Theorem 8.A.1. However, for the sake of simplicity, we will hence-
forth refer to this simply as Theorem 8.A.1. With these modifications, we are
ready to derive the results for various interesting cases as corollaries of Theorem
8.A.1. Not only will these corollaries give us some readily available results, they
will also enhance the reader's understanding of Theorem 8.A.1. In fact, the
reader will observe that all the theorems listed in chapter 1 of Pontryagin et al.
[ 11 ] are really special cases of this theorem.
(a) FIXED-TIME WITH FIXED-END POINTS PROBLEM: We consider the
following problem in which the final time T is fixed (with fixed end-points):

Maximize (with respect to u(t)): $\int_0^T f_0[x(t), u(t), t] \, dt$
Subject to: $\dot{x}_i = f_i[x(t), u(t), t]$, $i = 1, 2, \ldots, n$
and $x_i(0) = x_i^0$, $x_i(T) = x_i^T$, $i = 1, 2, \ldots, n$

Here T is fixed and both the initial and terminal end points, $x_i^0$ and $x_i^T$, are
also fixed. To consider this problem, define $x_0(t)$ by $\dot{x}_0 = f_0[x(t), u(t), t]$,
$x_0(0) = 0$. Then, as we noted earlier, the problem is converted to one of maximizing
$x_0(T)$ subject to $\dot{x}_i = f_i[x(t), u(t), t]$, i = 0, 1, 2, ..., n, and $x_0(0) = 0$,
$x_i(0) = x_i^0$, $x_i(T) = x_i^T$, i = 1, 2, ..., n. Applying Theorem 8.A.1, we obtain
the following theorem.

Theorem 8.A.2 (Pontryagin et al. [11], pp. 67-68): In the above problem, in order
that $[\hat{x}(t), \hat{u}(t)]$ be optimal, it is necessary that there exist a nonzero
(n + 1)-vector-valued continuous function $p(t) \equiv [p_0(t), p_1(t), \ldots, p_n(t)]$,
which has piecewise continuous derivatives, such that
(i) $\hat{x}_0(t)$, $\hat{x}(t)$, $\hat{u}(t)$, and p(t) solve the following Hamiltonian system:

$$\dot{\hat{x}}_i = \frac{\partial \hat{H}}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial \hat{H}}{\partial x_i}, \qquad i = 0, 1, 2, \ldots, n$$

where

$$H[x(t), u(t), t, p(t)] \equiv \sum_{i=0}^{n} p_i(t)\, f_i[x(t), u(t), t]$$

and

$$\hat{H} \equiv H[\hat{x}(t), \hat{u}(t), t, p(t)]$$

(ii) $H[\hat{x}(t), \hat{u}(t), t, p(t)] \geq H[\hat{x}(t), u(t), t, p(t)]$, for all $u(t) \in U$
(iii) $p_0(t)$ = constant $\geq 0$, for all t, $0 \leq t \leq T$
REMARK: Because of the fixed time assumption, conditions such as $\hat{H} = 0$
do not appear in the above theorem. Note that $\dot{p}_0 = -\partial \hat{H}/\partial x_0 = 0$; hence
$p_0(t)$ = constant for all t. The $p_i(T)$, i = 1, 2, ..., n, are left unspecified
because the $x_i(T) = x_{iT}$, i = 1, 2, ..., n, are fixed. Note that $p_0(T) = 1$ cannot
be concluded in general, since the solution pair $[\hat{x}(t), \hat{u}(t)]$ with $\hat{x}_i(0) = x_{i0}$
and $\hat{x}_i(T) = x_{iT}$, i = 1, 2, ..., n, will imply the specification of $f_0[\hat{x}(t), \hat{u}(t), t]$,
and hence its integral, $x_0(T)$. The proof of $p_0(T) \geq 0$ [which, in view of
$p_0(t)$ = constant for all t, implies the nonnegativity in condition (iii)]
requires a further consideration and it is omitted here. We may note, however,
that if we can show that $p_0(T) \neq 0$ [so that $p_0(T) > 0$], then we can
take $p_0(T) = 1$ (see footnote 12). The condition which guarantees $p_0(T) \neq 0$ is analogous
to the "normality condition" in nonlinear programming, which is discussed
in Chapter 1 (especially Sections B and D).
REMARK: In the ordinary nonlinear programming problem of maximizing
a real-valued function f(u) subject to $g_j(u) \geq 0$, j = 1, 2, ..., m, $u \in U$,
where U is a subset of a Euclidean space, we can obtain the following theorem
(Fritz John's theorem; recall also Theorem 1.B.3).
If $\hat{u}$ is a solution of the above problem, then there exist multipliers $p_0$,
$p_1, \ldots, p_m$ (all constants), not vanishing simultaneously, such that

$$p_0 f(\hat{u}) + \sum_{j=1}^{m} p_j g_j(\hat{u}) \geq p_0 f(u) + \sum_{j=1}^{m} p_j g_j(u), \quad \text{for all } u \in U$$

and

$$\sum_{j=1}^{m} p_j g_j(\hat{u}) = 0$$

Clearly condition (ii) of Theorem 8.A.2 corresponds to the first condition
in the above theorem for the optimization of the ordinary nonlinear programming
problem. As we remarked in Chapter 1, the condition which
guarantees $p_0 > 0$ (rather than $p_0 \geq 0$) is called the normality condition.
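To see why normality cannot be taken for granted, the following minimal sketch checks the Fritz John inequality on a toy problem that is not from the text: maximize f(u) = u subject to g(u) = -u^2 >= 0 with U the real line, so that the only feasible point is u = 0. The functions, the grid, and the helper names are illustrative assumptions. On this example the inequality can hold only with p_0 = 0.

```python
def f(u):
    """Objective of the toy problem (an assumption, not from the text)."""
    return u


def g(u):
    """Constraint function; g(u) >= 0 holds only at u = 0."""
    return -u * u


U_HAT = 0.0  # the unique feasible (hence optimal) point
GRID = [i / 1000.0 for i in range(-1000, 1001)]  # sample of U = R near u_hat


def saddle_inequality_holds(p0, p1, grid=GRID):
    """Check p0*f(u_hat) + p1*g(u_hat) >= p0*f(u) + p1*g(u) on the grid."""
    lhs = p0 * f(U_HAT) + p1 * g(U_HAT)
    return all(lhs >= p0 * f(u) + p1 * g(u) for u in grid)


def complementary_slackness(p1):
    """Check the second Fritz John condition, p1 * g(u_hat) = 0."""
    return p1 * g(U_HAT) == 0.0


if __name__ == "__main__":
    print("p0 = 0, p1 = 1:", saddle_inequality_holds(0.0, 1.0))
    for p1 in (0.0, 1.0, 10.0, 100.0):
        print("p0 = 1, p1 =", p1, ":", saddle_inequality_holds(1.0, p1))
```

With p_0 = 1 the right-hand side u - p_1 u^2 is positive for small positive u whatever p_1 is, which is why the check fails; a normality condition is exactly what rules such cases out.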
(β) FINAL TIME OPEN WITH FIXED END POINT (NONAUTONOMOUS CASE):
Consider the following problem.

Maximize (over u(t)): $\int_0^T f_0[x(t), u(t), t]\,dt$

Subject to: $\dot{x}_i = f_i[x(t), u(t), t]$, i = 1, 2, ..., n
and $x_i(0) = x_{i0}$, $x_i(T) = x_{iT}$, i = 1, 2, ..., n
Here both the initial and terminal end points, the $x_i(0)$'s and $x_i(T)$'s, are
fixed, but final time T is not a priori specified. To consider this problem,
define $x_0(t)$ by $\dot{x}_0 = f_0[x(t), u(t), t]$ and $x_0(0) = 0$ and utilize the generalized
Theorem 8.A.1 for the case of fixed end points with final time open.

Theorem 8.A.3 (Pontryagin et al. [11], pp. 60-61): In the above problem, in order
that $[\hat{x}(t), \hat{u}(t)]$ be optimal, it is necessary that there exist a nonzero
(n + 1)-vector-valued continuous function $p(t) \equiv [p_0(t), p_1(t), \ldots, p_n(t)]$,
which has piecewise continuous derivatives, such that
(i) $\hat{x}_0(t)$, $\hat{x}(t)$, $\hat{u}(t)$, and p(t) solve the following Hamiltonian system:

$$\dot{\hat{x}}_i = \frac{\partial \hat{H}}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial \hat{H}}{\partial x_i}, \qquad i = 0, 1, 2, \ldots, n$$

where

$$H[x(t), u(t), t, p(t)] \equiv \sum_{i=0}^{n} p_i(t)\, f_i[x(t), u(t), t]$$

and

$$\hat{H} \equiv H[\hat{x}(t), \hat{u}(t), t, p(t)]$$

(ii) $H[\hat{x}(t), \hat{u}(t), t, p(t)] \geq H[\hat{x}(t), u(t), t, p(t)]$, for all $u(t) \in U$
(iii) $H[\hat{x}(T), \hat{u}(T), T, p(T)] = 0$
(iv) $p_0(t)$ = constant $\geq 0$ for all t, $0 \leq t \leq T$
REMARK: Condition (iii) of the above theorem is necessary because the
final time is open (for a nonautonomous case). The fact that $p_0(t)$ = constant
for all t follows from $\dot{p}_0 = -\partial \hat{H}/\partial x_0 = 0$. The $p_i(T)$, i = 1, 2, ..., n, are left
unspecified because the $x_i(T)$, i = 1, 2, ..., n, are specified. Note that $p_0 = 0$
is possible in the above theorem.
(γ) TIME OPTIMAL PROBLEM (NONAUTONOMOUS CASE): Consider the
following problem.

Minimize (over u(t)): T

Subject to: $\dot{x}_i = f_i[x(t), u(t), t]$, i = 1, 2, ..., n

and $x_i(0) = x_{i0}$, $x_i(T) = x_{iT}$, i = 1, 2, ..., n
Here both the $x_i(0)$'s and the $x_i(T)$'s are fixed but T is open. The problem
is called the time optimal problem, for it is concerned with minimizing the
time for the transfer from a fixed point $x_{i0}$ to another fixed point $x_{iT}$ satisfying
the differential equation $\dot{x}_i = f_i[x(t), u(t), t]$, i = 1, 2, ..., n. Clearly this
is a special case of the above problem (β) with $f_0[x(t), u(t), t] \equiv -1$ for all t.
Hence defining the Hamiltonian $\tilde{H}$ by

$$\tilde{H}[x(t), u(t), t, p(t)] \equiv -p_0(t) + \sum_{i=1}^{n} p_i(t)\, f_i[x(t), u(t), t]$$

we can apply Theorem 8.A.3 and obtain the following theorem (here we
define $H[x(t), u(t), t, p(t)] \equiv \sum_{i=1}^{n} p_i(t) f_i[x(t), u(t), t]$ and note that
$\tilde{H} = -p_0 + H$).

Theorem 8.A.4 (Pontryagin et al. [11], p. 65): In the above problem, in order
that $[\hat{x}(t), \hat{u}(t)]$ be optimal, it is necessary that there exist a nonzero n-vector-valued
continuous function $p(t) \equiv [p_1(t), p_2(t), \ldots, p_n(t)]$, which has piecewise continuous
derivatives, such that
(i) $\hat{x}(t)$, $\hat{u}(t)$, and p(t) solve the following Hamiltonian system:

$$\dot{\hat{x}}_i = \frac{\partial \hat{H}}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial \hat{H}}{\partial x_i}, \qquad i = 1, 2, \ldots, n$$

where

$$H[\hat{x}(t), u(t), t, p(t)] \equiv \sum_{i=1}^{n} p_i(t)\, f_i[\hat{x}(t), u(t), t]$$

(ii) $H[\hat{x}(t), \hat{u}(t), t, p(t)] \geq H[\hat{x}(t), u(t), t, p(t)]$ for all $u(t) \in U$
(iii) $H[\hat{x}(T), \hat{u}(T), T, p(T)] \geq 0$
REMARK: The above condition (iii) follows from $p_0(T) \geq 0$ and
$\tilde{H}[\hat{x}(T), \hat{u}(T), T, p(T)] \equiv -p_0(T) + H[\hat{x}(T), \hat{u}(T), T, p(T)] = 0$.

(δ) FINAL TIME OPEN WITH FIXED END POINTS (AUTONOMOUS CASE):
Consider the following problem.

(a) Maximize (over u(t)): $\int_0^T f_0[x(t), u(t)]\,dt$

Subject to: $\dot{x}_i = f_i[x(t), u(t)]$, i = 1, 2, ..., n
and $x_i(0) = x_{i0}$, $x_i(T) = x_{iT}$, i = 1, 2, ..., n

Here T is not specified. This is a special case (autonomous case) of the
problem considered for Theorem 8.A.3. Here, owing to the autonomous
character of the problem, condition (iii) of Theorem 8.A.3 is modified as
(36) $H[\hat{x}(t), \hat{u}(t), p(t)] = 0$ for all t, $0 \leq t \leq T$
The rest of Theorem 8.A.3 holds as is.
Next consider the following time optimal problem for the autonomous
case.

(b) Minimize (over u(t)): T

Subject to: $\dot{x}_i = f_i[x(t), u(t)]$, i = 1, 2, ..., n

and $x_i(0) = x_{i0}$, $x_i(T) = x_{iT}$, i = 1, 2, ..., n

This is a special case (autonomous case) of the problem considered for
Theorem 8.A.4. Here, owing to the autonomous character of the problem,
condition (iii) of Theorem 8.A.4 is modified to the following condition:
(37) $H[\hat{x}(t), \hat{u}(t), p(t)] \geq 0$ and constant for all t, $0 \leq t \leq T$
Thus we obtain the following theorem.

Theorem 8.A.5 (Pontryagin et al. [11], p. 19 and pp. 20-21): The necessary
conditions for $[\hat{x}(t), \hat{u}(t)]$ to be a solution of problem (δ-a) are obtained by replacing
condition (iii) of Theorem 8.A.3 by condition (36). (The other conditions of Theorem
8.A.3 hold as they are.) The necessary conditions for $[\hat{x}(t), \hat{u}(t)]$ to be a solution
of problem (δ-b) are the same as those in Theorem 8.A.4, except that condition (iii) of
the theorem is replaced by condition (37).
(ε) FIXED-TIME WITH VARIABLE RIGHT-HAND END POINTS PROBLEM:
Now consider the following problem:

Maximize (over u(t)): $\int_0^T f_0[x(t), u(t), t]\,dt$

Subject to: $\dot{x}_i = f_i[x(t), u(t), t]$, i = 1, 2, ..., n

and $x_i(0) = x_{i0}$, i = 1, 2, ..., n

Here T is fixed but the $x_i(T)$, i = 1, 2, ..., n, are not fixed. Again defining $x_0(t)$
by $\dot{x}_0 = f_0[x(t), u(t), t]$, $x_0(0) = 0$, we can convert the above problem to one
of maximizing $x_0(T)$. Thus applying Theorem 8.A.1, we obtain the following
theorem.

Theorem 8.A.6 (Pontryagin et al. [11], p. 69): In the above problem, in order
that $[\hat{x}(t), \hat{u}(t)]$ be optimal it is necessary that there exist a nonzero
(n + 1)-vector-valued continuous function $p(t) \equiv [p_0(t), p_1(t), \ldots, p_n(t)]$,
which has piecewise continuous derivatives, such that
(i) $\hat{x}_0(t)$, $\hat{x}(t)$, $\hat{u}(t)$, and p(t) solve the following Hamiltonian system:

$$\dot{\hat{x}}_i = \frac{\partial \hat{H}}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial \hat{H}}{\partial x_i}, \qquad i = 0, 1, 2, \ldots, n$$

where

$$H[x(t), u(t), t, p(t)] \equiv \sum_{i=0}^{n} p_i(t)\, f_i[x(t), u(t), t]$$

and

$$\hat{H} \equiv H[\hat{x}(t), \hat{u}(t), t, p(t)]$$

(ii) $H[\hat{x}(t), \hat{u}(t), t, p(t)] \geq H[\hat{x}(t), u(t), t, p(t)]$, for all $u(t) \in U$

(iii) $p(T) = (1, 0, \ldots, 0)$ [that is, $p_0(T) = 1$ and $p_i(T) = 0$, i = 1, 2, ..., n]
(iv) $p_0(t) = 1$ for all t

REMARK: Owing to the transversality condition for the variable end points,
$p_i(T) = 0$ for all i = 1, 2, ..., n. Since $p(T) = [p_0(T), p_1(T), \ldots, p_n(T)]$
is a nonzero vector, this implies $p_0(T) \neq 0$, or $p_0(T) > 0$. Thus without loss of
generality we may take $p_0(T) = 1$. Hence we obtained condition (iii),
especially $p_0(T) = 1$, without mentioning anything about the normality condition.

REMARK: Note that condition (i) of the above theorem implies
$\dot{p}_0 = -\partial \hat{H}/\partial x_0 = 0$, so that $p_0(t)$ = constant for all t. Hence $p_0(t) = 1$ for all
t in view of condition (iii); that is, condition (iv) follows. Thus we may write
$H = f_0 + \sum_{i=1}^{n} p_i f_i$ (compare footnote 12).

REMARK: It may be of some interest to obtain Theorem 8.A.1 as a special
case of Theorem 8.A.6. To do this, recall the remark for Theorem 8.A.1
which noted that the problem of maximizing $S \equiv \sum_{i=1}^{n} c_i x_i(T)$ subject to
$\dot{x}_i = f_i[x(t), u(t), t]$, $x_i(0) = x_{i0}$, i = 1, 2, ..., n, can be converted to the
following problem:

Maximize (over u(t)): $\int_0^T f_0[x(t), u(t), t]\,dt$

Subject to: $\dot{x}_i = f_i[x(t), u(t), t]$, i = 1, 2, ..., n

and $x_i(0) = x_{i0}$, i = 1, 2, ..., n

where $f_0[x(t), u(t), t] \equiv \sum_{i=1}^{n} c_i f_i[x(t), u(t), t]$

Then apply Theorem 8.A.6. The Hamiltonian H is defined by
$H = p_0 f_0 + \sum_{i=1}^{n} p_i f_i$, which, in view of the definition of $f_0$, can also be written as
$H = \sum_{i=1}^{n} q_i f_i$, where $q_i(t) \equiv c_i p_0(t) + p_i(t)$, i = 1, 2, ..., n. Conditions
(i) and (ii) of Theorem 8.A.1 follow immediately from conditions (i) and (ii)
of Theorem 8.A.6. Condition (iii) of Theorem 8.A.1 follows from condition
(iii) of Theorem 8.A.6 by noting that $p_0(T) = 1$ and $p_i(T) = 0$, i = 1, 2, ..., n,
imply $q_i(T) = c_i$, i = 1, 2, ..., n.

d. AN ILLUSTRATIVE PROBLEM: THE OPTIMAL GROWTH PROBLEM
Consider the optimal growth problem discussed in Section D of Chapter 5.
Suppose that we are to choose x(t), the time path of per capita consumption,
so as to

Maximize: $\int_0^T u[x(t)]\, e^{-\rho t}\,dt$

Subject to: $\dot{k}(t) = f[k(t)] - \lambda k(t) - x(t)$
with $k(0) = k_0 > 0$, $k(T) = k_T > 0$, $k(t) \geq 0$, $x(t) \geq 0$
The notations used here are the same as those used in Chapter 5, Section D:
$\rho > 0$ is the discount factor, k(t) is the capital:labor ratio at time t, $\lambda$ is the
rate of population growth plus the rate of capital depreciation, f is the per capita
production function, and u is the utility function. We adopt assumptions similar
to the ones used in subsection c of Chapter 5, Section D. In particular, we assume
the following:
(A-1) $u'(x) > 0$ and $u''(x) < 0$ for all $x \geq 0$.
(A-2) $f(k) < \infty$, $f'(k) > 0$, and $f''(k) < 0$ for all $k \geq 0$.
(A-3) $f(0) = 0$, $f'(0) = \infty$, and $f'(\infty) = 0$.
One may notice that this problem is a special case of the general optimal
control problem. Here x(t) is the control variable and k(t) is the state variable.
The control region for this problem is the entire nonnegative region, which is
a closed set. Here we assume that the final time T is fixed and the final state k(T)
is fixed. Hence Theorem 8.A.2 is relevant to this problem. The Hamiltonian H
is defined by

(38) $H[k(t), x(t), t, p(t)] \equiv u[x(t)]\, e^{-\rho t} + p(t)\,[f[k(t)] - \lambda k(t) - x(t)]$

Strictly speaking, H should be

(39) $p_0 u[x(t)]\, e^{-\rho t} + p(t)\,[f[k(t)] - \lambda k(t) - x(t)]$
Then condition (ii) of Theorem 8.A.2 implies

(40) $p_0 u[x(t)]\, e^{-\rho t} + p(t)\,[f[\hat{k}(t)] - \lambda \hat{k}(t) - x(t)]$
     $\leq p_0 u[\hat{x}(t)]\, e^{-\rho t} + p(t)\,[f[\hat{k}(t)] - \lambda \hat{k}(t) - \hat{x}(t)]$
     for all $x(t) \geq 0$, $0 \leq t \leq T$
Now if $p_0 = 0$ [note that $p_0(t)$ = constant and $\geq 0$ from condition (iii) of Theorem
8.A.2], then
(41) $p(t)x(t) \geq p(t)\hat{x}(t)$ for all $x(t) \geq 0$, $0 \leq t \leq T$
Since $p_0$ and p(t) cannot vanish simultaneously, we obtain $p(t) \neq 0$ for all t.
If p(t) < 0, then (41) implies that $\hat{x}(t) \geq x(t)$ for all $x(t) \geq 0$; thus $\hat{x}(t)$ is unbounded.
This is impossible since $\hat{x}(t)$ must be bounded from above because of the feasibility
condition $\dot{\hat{k}}(t) = f[\hat{k}(t)] - \lambda \hat{k}(t) - \hat{x}(t)$. (Recall footnote 12 of Chapter 5,
Section D.) Hence p(t) > 0 for all t. Set x(t) = 0 in (41) and obtain
(42) $0 \geq p(t)\hat{x}(t)$ for all t, $0 \leq t \leq T$
Since p(t) > 0 for all t and $\hat{x}(t) \geq 0$, this implies that $\hat{x}(t) = 0$ for all t. If we
impose Koopmans' assumption
(43) $\lim_{x \to 0,\; x > 0} u(x) = -\infty$

then $\hat{x}(t) = 0$ (for all t) cannot be optimal. However, even without (43), we can
guarantee that $\hat{x}(t) = 0$ (for all t) is not optimal. To see this, notice that the
productivity conditions imposed by (A-2) and (A-3) permit the existence of x(t) >
0 for all t, which is technically feasible in the economy. Since $u'(x) > 0$ for all x,
the path with x(t) is better than the path with $\hat{x}(t)$; that is, $\hat{x}(t) = 0$ for all t cannot
be optimal. Hence (42) is a contradiction so that (41) is also a contradiction.
This then implies that $p_0 = 0$ is a contradiction. Therefore we obtain $p_0 > 0$. We
can now choose $p_0$ to be unity without loss of generality since all the necessary
conditions for optimality in the various theorems (such as Theorem 8.A.2) stated
so far in this section will not be affected by this choice. This then justifies our
definition of the Hamiltonian in (38). The above rather lengthy consideration,
which justifies $p_0 = 1$, is usually ignored in the literature.
Furthermore, note that with $p_0 = 1$, (40) may be rewritten as

(44) $u[\hat{x}(t)]\, e^{-\rho t} - u[x(t)]\, e^{-\rho t} \geq p(t)\hat{x}(t) - p(t)x(t)$
     for all $x(t) \geq 0$, $0 \leq t \leq T$

If we assume (43), then (44) implies

(45) $\hat{x}(t) > 0$ for all t, $0 \leq t \leq T$
so that we have an interior solution (that is, the optimal control is in the interior
of the control region). Later we will see that (45) can also be justified under an
alternative assumption such as (63).
The Hamiltonian system for the present problem is written as

(46) $\dot{\hat{k}}(t) = f[\hat{k}(t)] - \lambda \hat{k}(t) - \hat{x}(t) \quad \left(= \dfrac{\partial \hat{H}}{\partial p}\right)$

and

(47) $\dot{p}(t) = -p(t)\,[f'[\hat{k}(t)] - \lambda] \quad \left(= -\dfrac{\partial \hat{H}}{\partial k}\right)$

where the hat (^) over $\partial H/\partial p$ and $\partial H/\partial k$ signifies that these partial derivatives are
evaluated at $[\hat{k}(t), \hat{x}(t)]$. Assuming an interior solution (that is, $\hat{x}(t) > 0$ for all
t), condition (ii) of Theorem 8.A.2 can be rewritten as (see footnote 13)

(48) $\dfrac{\partial \hat{H}}{\partial x} = 0$

Equation (48) can be spelled out as

(49) $p(t) = u'[\hat{x}(t)]\, e^{-\rho t}$

Since $u'(x) > 0$ for all x, this implies that p(t) > 0 for all t. From (47) and (49) we obtain

(50) $\dot{\hat{x}}(t) = -\dfrac{u'}{u''}\,[f'[\hat{k}(t)] - (\lambda + \rho)]$

where $u' \equiv u'[\hat{x}(t)]$ and $u'' \equiv u''[\hat{x}(t)]$. This is exactly the same equation which
is obtained as the Euler equation in the calculus of variations formulation of this
optimal growth problem, discussed in Section D of Chapter 5. Hence combining
this with the feasibility equation, we can obtain exactly the same phase diagram
obtained there (if we assume $u' > 0$ for all x). The extension to the infinite horizon
problem can be carried out by examining the "eligibility conditions" as we did
there, and thus we can omit this from the present discussion.
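The step from (47) and (49) to (50) can be checked numerically. The sketch below is only an illustration under assumed functional forms that the text does not specify: u(x) = ln x (so that (49) gives x = e^{-rho t}/p and -u'/u'' = x), f(k) = k^alpha, and arbitrary parameter values. It integrates (46) and (47) for two short steps, differences the implied consumption path, and compares the result with the right-hand side of (50).

```python
import math

# Assumed parameter values and functional forms (not from the text).
ALPHA, LAM, RHO = 0.3, 0.05, 0.03


def f(k):
    """Per capita production function, assumed Cobb-Douglas."""
    return k ** ALPHA


def f_prime(k):
    return ALPHA * k ** (ALPHA - 1.0)


def x_hat(p, t):
    """Invert (49), p = u'(x) e^{-rho t}, for the assumed u(x) = ln x."""
    return math.exp(-RHO * t) / p


def rhs(t, k, p):
    """Right-hand sides of the Hamiltonian system (46) and (47)."""
    k_dot = f(k) - LAM * k - x_hat(p, t)
    p_dot = -p * (f_prime(k) - LAM)
    return k_dot, p_dot


def rk4_step(t, k, p, h):
    """One classical Runge-Kutta step for the pair (k, p)."""
    a1, b1 = rhs(t, k, p)
    a2, b2 = rhs(t + h / 2, k + h / 2 * a1, p + h / 2 * b1)
    a3, b3 = rhs(t + h / 2, k + h / 2 * a2, p + h / 2 * b2)
    a4, b4 = rhs(t + h, k + h * a3, p + h * b3)
    k_new = k + h / 6 * (a1 + 2 * a2 + 2 * a3 + a4)
    p_new = p + h / 6 * (b1 + 2 * b2 + 2 * b3 + b4)
    return k_new, p_new


def euler_equation_gap(k0=1.0, p0=1.0, h=1e-3):
    """|finite-difference x_dot - right-hand side of (50)| at t = h."""
    k1, p1 = rk4_step(0.0, k0, p0, h)
    k2, p2 = rk4_step(h, k1, p1, h)
    x_dot_fd = (x_hat(p2, 2 * h) - x_hat(p0, 0.0)) / (2 * h)
    x1 = x_hat(p1, h)
    x_dot_50 = x1 * (f_prime(k1) - (LAM + RHO))  # -(u'/u'') = x for ln x
    return abs(x_dot_fd - x_dot_50)


if __name__ == "__main__":
    print("gap between (50) and the simulated path:", euler_equation_gap())
```

The gap should be of the order of the finite-difference error, which is what one expects if (50) is just (47) rewritten by means of (49).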
Here we may note another important implication of the maximum principle.
The maximality of the Hamiltonian H implies the following inequality [set $p_0 = 1$
in (40)]:

(51) $u[\hat{x}(t)]\, e^{-\rho t} + p(t)\,[f[\hat{k}(t)] - \lambda \hat{k}(t) - \hat{x}(t)]$
     $\geq u[x(t)]\, e^{-\rho t} + p(t)\,[f[\hat{k}(t)] - \lambda \hat{k}(t) - x(t)]$
     for all $x(t) \geq 0$

This can be rewritten as

(52) $e^{-\rho t}\,\{u[\hat{x}(t)] - u[x(t)]\} \geq p(t)\,[\hat{x}(t) - x(t)]$
     for all $x(t) \geq 0$, $0 \leq t \leq T$
as already observed in (44). From (52) we obtain
(53) $u[\hat{x}(t)] - u[x(t)] \geq 0$
     for all $x(t) \geq 0$, $0 \leq t \leq T$, with $p(t)\,[\hat{x}(t) - x(t)] \geq 0$
In other words, the satisfaction from consumption is maximized along the optimal
path for each instant of time subject to the budget constraint. From (52) we also
obtain
(54) $p(t)\,[x(t) - \hat{x}(t)] \geq 0$
     for all $x(t) \geq 0$, $0 \leq t \leq T$, with $u[x(t)] - u[\hat{x}(t)] \geq 0$
In other words, the consumption expenditure [at "implicit prices" p(t)] reaches
its minimum along the optimal path at each instant of time in the set of paths with
utility equal to or exceeding that of the optimal path $[\hat{k}(t), \hat{x}(t)]$. Relations (52),
(53), and (54) and their interpretations are the generalizations of Koopmans' proposition
F ([6], pp. 245-246), for our propositions are concerned with each instant
of time. Note also that (49) and (52) yield
(55) $u[\hat{x}(t)] - u[x(t)] \geq u'[\hat{x}(t)]\,[\hat{x}(t) - x(t)]$
     for all $x(t) \geq 0$, $0 \leq t \leq T$
In other words, at any instant of time, the excess of $\hat{x}(t)$ over x(t) multiplied by the
marginal utility of the optimal consumption at t cannot exceed the excess of utility
at $\hat{x}(t)$ over that at x(t).
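Relation (55) is the familiar gradient inequality for a concave function. As a small illustration, the sketch below evaluates the gap between the two sides of (55) on a grid, using the assumed utility function u(x) = sqrt(x), which satisfies (A-1) for x > 0 but is not taken from the text; the grid and the function names are likewise illustrative.

```python
import math


def u(x):
    """Assumed utility function (concave, increasing); not from the text."""
    return math.sqrt(x)


def u_prime(x):
    return 0.5 / math.sqrt(x)


def gap_55(x_opt, x):
    """Left side of (55) minus its right side; should be nonnegative."""
    return (u(x_opt) - u(x)) - u_prime(x_opt) * (x_opt - x)


def min_gap(x_opt_grid, x_grid):
    """Smallest gap over all pairs of optimal and comparison consumption."""
    return min(gap_55(a, b) for a in x_opt_grid for b in x_grid)


if __name__ == "__main__":
    optimal_levels = [0.1 * i for i in range(1, 51)]    # candidate x_hat > 0
    alternatives = [0.1 * i for i in range(0, 101)]     # comparison x >= 0
    print("smallest gap in (55):", min_gap(optimal_levels, alternatives))
```

For this u the gap equals (sqrt(x_hat) - sqrt(x))^2 / (2 sqrt(x_hat)), so it vanishes only at x = x_hat.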
In fact, the above formulation (51) in terms of the maximum principle and
the subsequent implications discussed above should hold even if we replace
$u[x(t)]\, e^{-\rho t}$ and f[k(t)] by more general functions u[x(t), t] and f[k(t), t],
where u and f are continuously differentiable in t [as well as in x(t) and k(t)]. The
function u[x(t), t] allows the possibility of a nonconstant discount, and f[k(t), t]
allows the possibility of technological progress. We may rewrite relations (52),
(53), and (54) in terms of these new functions u and f as follows:

(52') $u[\hat{x}(t), t] - u[x(t), t] \geq p(t)\,[\hat{x}(t) - x(t)]$
      for all $x(t) \geq 0$, $0 \leq t \leq T$

(53') $u[\hat{x}(t), t] - u[x(t), t] \geq 0$
      for all $x(t) \geq 0$, $0 \leq t \leq T$, with $p(t)\,[\hat{x}(t) - x(t)] \geq 0$

(54') $p(t)\,[x(t) - \hat{x}(t)] \geq 0$
      for all $x(t) \geq 0$, $0 \leq t \leq T$, with $u[x(t), t] - u[\hat{x}(t), t] \geq 0$

The interpretations of these relations follow analogously.

In the above interpretations, it is clear that Pontryagin's "auxiliary variable"
p(t) plays the role of the "implicit (or shadow) price." Hence the Pontryagin maximum
principle which, among other things, asserts that the maximization of a
certain function or integral implies the existence of an auxiliary variable p(t), has,
in turn, a very important implication in economics. The situation is strictly analogous
to the one in the theory of nonlinear programming in which the maximization
implies the existence of the Lagrangian multipliers that play the role of "prices."
Koopmans [6] defines the price of output by "the present value of the marginal
(instantaneous) utility of consumption at time t." This corresponds precisely to our
relation (49), obtained from the maximum principle (note that Koopmans used
neither the maximum principle nor the calculus of variations). Define P(t) by
(56) $P(t) \equiv p(t)\,[f'[\hat{k}(t)] - \lambda]$
That is, P(t) is the (present) value of the "net" marginal productivity of capital at
time t. Using this relation, the optimality equation (47) can be rewritten in the
following simple form:
(57) $P(t) + \dot{p}(t) = 0$
This relation is the same as the one obtained by Koopmans [6] (his proposition
I).
Now suppose that we alter the above problem such that the terminal end
point of the state variable, k(T), is not a priori specified (with time T still fixed); then
we apply Theorem 8.A.6, a theorem for the variable end-points problem. With this
modification, the above analysis holds as it is (see footnote 14) except for two crucial
points: (1) We do not have to prove that $p_0 > 0$ (since $p_0 = 1$ for all t), and (2) we
have the following transversality condition:
(58) $p(T) = 0$
In view of (49), (58) is rewritten as
(59) $u'[\hat{x}(T)]\, e^{-\rho T} = 0$
where we assume that $\rho > 0$. Notice that (59) implies that $u'(x) = 0$ for some x if T
is finite. In other words, as long as T is finite, (A-1) looks like it should be modified
so as to allow satiation in consumption.
However, as Arrow ([1], p. 88) has shown, we can (and should) modify (58)
and hence (59) by recognizing the fact that the terminal state is constrained by the
condition $k(T) \geq 0$. In other words, (58) and (59) should, respectively, be rewritten
as follows. (For the derivation, see Arrow [1], or Section C of this chapter.)
(58') $p(T) \geq 0$ and $p(T)\hat{k}(T) = 0$
(59') $u'[\hat{x}(T)]\, e^{-\rho T} \geq 0$ and $u'[\hat{x}(T)]\,\hat{k}(T)\, e^{-\rho T} = 0$
Hence, if we have $u'(x) > 0$ for all x (no satiation) so that p(T) > 0, we must have
$\hat{k}(T) = 0$. This condition then replaces (58). Thus $k(0) = k_0$ and $k(T) = 0$ specify
the two boundary conditions for (46) and (47).
To analyze the present problem, we will consider a phase diagram which is
different from the one used in Chapter 5, Section D. For this purpose, first note
that (A-1) [especially $u''(x) < 0$ for all x] and (49) imply
(60) $\hat{x}(t) = g[q(t)]$
where

(61) $q(t) \equiv p(t)\, e^{\rho t} = u'(\hat{x}_t)$, $g \equiv (u')^{-1}$, $g' < 0$ for all q

In view of (60), (46) can be rewritten as
(62) $\dot{\hat{k}}(t) = f[\hat{k}(t)] - \lambda \hat{k}(t) - g[q(t)]$
The relation between q and $\hat{x}$ is illustrated in Figure 8.1.

Figure 8.1. The Relationship between q and $\hat{x}$.

Note that if we impose the following assumption à la Cass [3],

(63) $\lim_{x \to 0,\; x > 0} u'(x) = \infty$

we can again assert that the solution is an interior solution, that is, $\hat{x}(t) > 0$ for all
t (see footnote 15).
Now recalling the definition of q(t) in (61), we rewrite equation (47) as
(64) $\dot{q} = -q(t)\,[f'[\hat{k}(t)] - (\lambda + \rho)]$
Hence we obtain the phase diagram from equations (62) and (64) as in Figure 8.2.
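The direction of motion in the (k, q) plane can be read off (62) and (64) directly. The following sketch does this under assumed functional forms that are not in the text: u(x) = ln x, so that g(q) = 1/q, and f(k) = k^alpha with illustrative parameter values. It computes the rest point defined by f'(k) = lambda + rho and x = f(k) - lambda k (as in footnote 19) and evaluates the vector field around it.

```python
# Assumed parameter values and functional forms (not from the text).
ALPHA, LAM, RHO = 0.3, 0.05, 0.03


def f(k):
    """Per capita production function, assumed Cobb-Douglas."""
    return k ** ALPHA


def f_prime(k):
    return ALPHA * k ** (ALPHA - 1.0)


def g(q):
    """Inverse of u'(x) = 1/x, i.e. the assumed u(x) = ln x."""
    return 1.0 / q


def steady_state():
    """Rest point of (62) and (64): f'(k) = lambda + rho, x = f(k) - lambda k."""
    k_rho = (ALPHA / (LAM + RHO)) ** (1.0 / (1.0 - ALPHA))
    x_rho = f(k_rho) - LAM * k_rho
    q_rho = 1.0 / x_rho  # q = u'(x) with u = ln
    return k_rho, x_rho, q_rho


def vector_field(k, q):
    """(k_dot, q_dot) from equations (62) and (64)."""
    k_dot = f(k) - LAM * k - g(q)
    q_dot = -q * (f_prime(k) - (LAM + RHO))
    return k_dot, q_dot


if __name__ == "__main__":
    k_rho, x_rho, q_rho = steady_state()
    print("rest point (k, x, q):", k_rho, x_rho, q_rho)
    print("field left of k_rho: ", vector_field(0.5 * k_rho, q_rho))
    print("field right of k_rho:", vector_field(1.5 * k_rho, q_rho))
```

To the left of the vertical line k = k_rho the price q falls and to the right it rises, while k rises wherever g(q) lies below f(k) - lambda k; this is the information a phase diagram in the (k, q) plane encodes.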
The transversality condition (58') means that $q(T) \geq 0$ and $q(T)\hat{k}(T) = 0$.
Suppose that satiation is not allowed so that $u'(x) > 0$ for all x. Then q(T) cannot be
zero. Hence, as remarked earlier, the transversality condition is replaced by
$\hat{k}(T) = 0$. This means that it is always better to "eat up" the capital to increase
consumption for some period of time and leave nothing after the planning horizon.
This reflects the fact that k(T) is not a priori specified by $k(T) = k_T$. The optimal
attainable path would in general be unique up to the boundary conditions $k(0) = k_0$
and $k(T) = 0$. It is illustrated by a curve such as the CC' path in Figure 8.2. A
curve such as the AB path cannot be optimal, because at point B we have q = 0 so
that $u' = 0$, which violates the nonsatiation assumption. Note that a path such as
the DE path in Figure 8.2 cannot be optimal whether or not satiation in consumption
is allowed, because the transversality condition (58') cannot be satisfied in
any way.

Figure 8.2. An Alternative Phase Diagram for the Optimal Growth Problem.
What happens for the infinite horizon problem ($T = \infty$)? As long as we do not
specify the terminal stock k(T) when $T \to \infty$, the problem is identical with the usual
optimal growth problem, that is, the one discussed in Section D of Chapter 5,
except in one important aspect. How should the transversality condition be modified
for the infinite horizon problem? Note that when $T \to \infty$ the problem of satiation
discussed above does not arise, since $p(T) \to 0$ when $T \to \infty$ as long as $u'(x)$ is
bounded (by the bound on the movement of x). In other words, the condition in
the form of either (58) or (58') is satisfied as $T \to \infty$. The real question here is
whether such a condition indeed constitutes a condition for optimality. In other
words, the question we have to ask is: What is a transversality condition at infinity?
Mathematically speaking, the "transversality conditions" refer to the conditions
which require that the state variables be in a particular target set at the
terminal point (see, for example, Pontryagin et al. [11], p. 49). For the finite horizon
problem, the values of the state variables would have a definite meaning; but
the meaning of the limit of these values when $T \to \infty$ is ambiguous, for the limit
may not exist for all attainable paths (see footnote 16). Hence the phrase "transversality
conditions" is rather meaningless for the infinite horizon problem. Although for the
infinite horizon case the condition [$p(T) \to 0$ as $T \to \infty$] may appear to be a natural
extension of the transversality condition [p(T) = 0 for finite T], counterexamples
have been discovered where this is not true (see footnote 17). In general, appropriate
conditions for the infinite horizon problem, which replace the transversality conditions
for the finite horizon problem, are not known. However, for the present problem of
optimal growth, Arrow [1] pointed out that the following condition happens to be
necessary for optimality, as long as $\rho > 0$ (see footnote 18):
(65) $\lim_{T \to \infty} p(T) \geq 0$ and $\lim_{T \to \infty} p(T)\hat{k}(T) = 0$

In other words, the simple extension of the finite horizon transversality condition
(58') holds in this particular case.
Clearly any path in which x(t) approaches a finite value as $t \to \infty$ will satisfy
condition (65) so long as $\rho > 0$. It can be shown easily that of the paths illustrated
in Figure 8.2, only the FG path and the HG path satisfy condition (65)
[note that $\hat{k}(t) \to k_\rho$ implies $\hat{x}(t) \to x_\rho$ from (46)]. Hence assuming the existence of
a unique optimal attainable path, we have FG as the optimal attainable path if
$k_0 < k_\rho$ and we have HG as the optimal attainable path if $k_0 > k_\rho$. Here $[k_\rho, x_\rho]$ is
the modified golden rule path discussed in Chapter 5, Section D (see footnote 19).
In the above analysis, we assumed that $\rho > 0$ (positive discount factor).
Suppose now that $\rho = 0$. Then the condition such as $\lim_{T \to \infty} u'[\hat{x}(T)]\,\hat{k}(T) = 0$
which would correspond to (65) does not hold in general. That is, condition
(65) is false when $\rho = 0$. As Koopmans ([6], proposition C and lemma 3) has
shown, the following condition is necessary for the present problem, with $\rho = 0$:

(66) $\lim_{T \to \infty} u'[\hat{x}(T)] = u'(x_\rho)$ and $\lim_{T \to \infty} \hat{k}(T) = k_\rho$

(67) $\lim_{T \to \infty} p(T) = u'(x_\rho)$ and $\lim_{T \to \infty} p(T)\hat{k}(T) = u'(x_\rho)\,k_\rho \;(\neq 0)$

Thus in the case where $\rho = 0$, condition (65) is replaced by condition (66) or (67).
Condition (66) reconfirms our conclusion that the only optimal attainable
path is the one which converges to the golden rule path. Needless to say, the maximand
integral for the case of $\rho = 0$ should be changed to

$$\int_0^\infty \{u[x(t)] - u[x_\rho]\}\,dt$$

in order to handle the problem of the convergence of the maximand integral (see
Chapter 5, Section D).
Note that condition (67) is a counterexample to the conjecture that the transversality
condition for the finite horizon problem is simply extended to the infinite
horizon problem by setting $T \to \infty$. Such a conjecture would be false in general
regardless of whether there is a restriction on the final state (such as $k(T) \geq 0$).
When there are no restrictions on the final state, the transversality condition required
in Theorem 8.A.6 is $p_i(T) = 0$, i = 1, 2, ..., n. However, the condition that
$p_i(T) \to 0$, i = 1, 2, ..., n, as $T \to \infty$ may fail to hold for the infinite horizon problem.
A counterexample, which is due to H. Halkin, is reported by Arrow and
Kurz (see footnote 20). In view of the importance of the problem, we reproduce
Halkin's counterexample here.
Consider a control problem which maximizes

$$\int_0^\infty [1 - y(t)]\, v(t)\,dt$$

subject to $\dot{y}(t) = [1 - y(t)]\, v(t)$, $-1 \leq v(t) \leq 1$, and $y(0) = 0$, where $y(t) \in R$
denotes the state variable and $v(t) \in R$ denotes the control variable. Observe that

$$\int_0^\infty [1 - y(t)]\, v(t)\,dt = \int_0^\infty \dot{y}(t)\,dt = \lim_{t \to \infty} y(t)$$

But by direct integration, $y(t) = 1 - e^{-V(t)}$ where $V(t) \equiv \int_0^t v(\tau)\,d\tau$. Hence y(t) < 1
for all t. Hence any choice of v, $-1 \leq v \leq 1$, for which $\lim_{t \to \infty} V(t) = \infty$ is optimal.
For example, $v(t) = v_0$ (constant), where $0 < v_0 < 1$, is optimal. The Hamiltonian
for this problem is
$$H = [1 + p(t)]\,[1 - y(t)]\, v(t)$$
where p(t) is the auxiliary variable. Since $v(t) = v_0$ is a solution, it maximizes H.
Since $v_0$ is in the interior of [-1, 1], the control region, the maximality of H with
respect to v in turn implies that $\partial H/\partial v = [1 + p(t)]\,[1 - y(t)] = 0$. Hence $p(t) = -1$
for all t, since y(t) < 1 for all t. Owing to the continuity of p(t), $\lim_{t \to \infty} p(t) = -1$
and not 0.
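The pieces of Halkin's counterexample are easy to verify numerically. The sketch below is an illustration only: the constant control v_0 = 0.5, the horizon T = 50, the midpoint-rule integration, and all function names are assumptions made here. It checks that the payoff up to T equals y(T), that y(T) stays below 1, and that p = -1 both makes dH/dv vanish and is a rest point of the adjoint equation p' = -dH/dy = (1 + p) v.

```python
import math


def y_closed(t, v0):
    """State path y(t) = 1 - exp(-V(t)) with V(t) = v0 * t (constant control)."""
    return 1.0 - math.exp(-v0 * t)


def payoff(T, v0, n=100_000):
    """Midpoint-rule approximation of the integral of (1 - y) v over [0, T]."""
    h = T / n
    return sum((1.0 - y_closed((j + 0.5) * h, v0)) * v0 * h for j in range(n))


def dH_dv(p, y):
    """Partial derivative of H = (1 + p)(1 - y) v with respect to v."""
    return (1.0 + p) * (1.0 - y)


def p_dot(p, v):
    """Adjoint equation p' = -dH/dy = (1 + p) v."""
    return (1.0 + p) * v


if __name__ == "__main__":
    v0, T = 0.5, 50.0
    print("payoff up to T:", payoff(T, v0))
    print("y(T):          ", y_closed(T, v0))
    print("dH/dv at p=-1: ", dH_dv(-1.0, y_closed(T, v0)))
    print("p_dot at p=-1: ", p_dot(-1.0, v0))
```

Because p = -1 is stationary for the adjoint equation, the auxiliary variable never approaches 0, however large T is taken.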

FOOTNOTES

1. The above problem of shooting a guided missile is a favorite example in the literature
of optimal control theory. An expository account of the solution of this problem can
be found, for example, in Leitmann [9] , section 8, chapter 2, and in Saaty and Bram
[12].
2. See, for example, K. Shell, ed., Essays on the Theory of Optimal Economic Growth,
Cambridge, Mass., M.I.T. Press, 1967, as well as Arrow [1], El-Hodiri [4], and Shell
[13]. See also G. Hadley and M. C. Kemp, Variational Methods in Economics,
Amsterdam, North-Holland, 1971.
3. The development of the classical calculus of variations reached its culmination in the
1930s, especially at the University of Chicago.
4. The major results in optimal control theory and the relation between the calculus of
variations and optimal control theory are discussed in Hestenes [5] in a systematic
and unified way. The major content of this work was published in 1965 in the Journal
of SIAM Control.
5. It received the Lenin Prize in 1962.

6. See, for example, the Journal of SIAM Control.

7. In [ 11] , Pontryagin et al. emphasized that the range space of the control function can
be a closed set (hence a "corner solution" can be discussed in a satisfactory way). In
addition to this, they allowed the "control function" u(t) to be "piecewise continu-
ous," and thus they obtained a satisfactory treatment of the "bang-bang" solution
(the solution which jumps from one corner to another). The term "piecewise con-
tinuity" will be defined shortly.
8. Here the discontinuity is limited to the first kind; that is, the left-hand and the
right-hand limits are finite ($\lim_{t \to a,\; t < a} u(t)$ and $\lim_{t \to a,\; t > a} u(t)$ are both finite), although
they are not equal. Note that the definition of piecewise continuity does not exclude
the possibility of a function which is continuous over the entire interval.
9. The functions p(t) as well as x(t) are required to have piecewise continuous derivatives
(as well as to be continuous) on the interval (t1, t2). [The possible discontinuities
of $\dot{p}(t)$ and $\dot{x}(t)$ occur at the points of discontinuity of u(t).] Since p(t) is continuous
for all t in the closed finite interval, it must be bounded in the same interval. The
function $\hat{u}(t)$ is called the optimal control, and $\hat{x}(t)$ is called the optimal trajectory.
10. It is important to note that the bang-bang solution assumes that the jump in the
control is "costless" or "inertialess."
11. An intuitive way to understand (31) is to convert the problem of maximizing
$S \equiv \sum_i c_i x_i(T)$ subject to $F_j[x(T)] = 0$, j = 1, 2, ..., m, to one of maximizing
$\Phi \equiv S + \sum_j \lambda_j F_j$ and set $\partial \Phi/\partial x_i(T) = p_i(T)$. See Kopp [7], pp. 260-261.
12. If $p_0(T) = 1$, then $p_0(t) = 1$ for all t. Thus the Hamiltonian can be written as
$H = f_0 + \sum_{i=1}^{n} p_i f_i$. It is, however, important to note that there is a distinct possibility that
$p_0(t) = 0$ for all t. Note also that $p_0(T) = 0$ implies $p_0(t) = 0$ for all t since $p_0(t)$ =
constant for all t.
13. Notice that if we assume an interior solution in the first place, the proof of normality
(that is, the proof of $p_0 > 0$) can be greatly simplified. To see this, note that (48), in
view of (39), means $p_0 u' e^{-\rho t} = p(t)$ in the presence of $p_0$. Hence if $p_0 = 0$, we have
p(t) = 0 for all t. This contradicts that $p_0$ and p(t) do not vanish simultaneously for
any t. Since $p_0 \geq 0$ by (iii) of Theorem 8.A.2, this proves $p_0 > 0$. Notice that in the text
we first proved $p_0 > 0$ without using the interior solution assumption, and then
derived an interior solution in (45).
14. Hence all the relations such as (52), (53), (54), and (55) also hold.
15. This is due to the fact that Pontryagin's auxiliary variable p(t) is continuous and
hence bounded for all $0 \leq t \leq T$ [hence for the present problem q(t) is bounded].
More rigorously, when we do not assume the interior solution and allow the
possibility of a corner solution (that is, the possibility of $\hat{x} = 0$), then equation
(48) should be replaced by $\partial \hat{H}/\partial x \leq 0$ with equality if $\hat{x}(t) > 0$. Thus, instead of (49),
we will have $p(t) \geq u'[\hat{x}(t)]\, e^{-\rho t}$. If p(t) is bounded and if (63) holds, then $\hat{x}(t)$ cannot
be zero so that $\hat{x}(t) > 0$ for all t.
16. Denote the state vector by x(t) as we did earlier in this section. The transversality
conditions require that x(T) be in a certain set, where T is the terminal value of t. The
problem is that the limit of x(T) for T-> oo may not exist for all feasible paths x(t).
17. Counterexamples will be provided later in this section.
18. For a discussion that such a condition is necessary under a more general context, and
also for a useful discussion of the transversality condition, see W. A. Brock, "What Is
a Transversality Condition at Infinity?" University of Rochester, 1969 (unpublished).
19. That is, $k_\rho$ and $x_\rho$ are respectively defined by $f'(k_\rho) = \lambda + \rho$ and $x_\rho \equiv f(k_\rho) - \lambda k_\rho$.
20. See K. J. Arrow and M. Kurz, Public Investment, the Rate of Return, and Optimal
Fiscal Policy, Baltimore, Md., Johns Hopkins Press, 1970, p. 46. See also Shell [13].
SOME APPLICATIONS 627

REFERENCES
1. Arrow, K. J., "Applications of Control Theory to Economic Growth," in Mathe-
matics of the Decision Sciences, Part 2, ed. by G. B. Dantzig and A. F. Veinott,
Providence, R.I., American Mathematical Society, 1968.
2. Athans, M., and Falb, P. L., Optimal Control, New York, McGraw-Hill, 1966.
3. Cass, D., "Optimum Growth in an Aggregative Model of Capital Accumulation,"
Review of Economic Studies, XXXII, July 1965.
4. El-Hodiri, M. A., Constrained Extrema: Introduction to the Differentiable Case with
Economic Applications, Berlin, Springer-Verlag, 1971.
5. Hestenes, M. R., Calculus of Variations and Optimal Control Theory, New York, Wiley,
1966.
6. Koopmans, T. C., "On the Concept of Optimal Economic Growth," in The Econo-
metric Approach to Development Planning, Pontificiae Academiae Scientiarvm Scriptvm
Varia, Amsterdam, North-Holland, 1965.
7. Kopp, R. E., "Pontryagin Maximum Principle," in Optimization Techniques, ed. by
G. Leitmann, New York, Academic Press, 1962.
8. Lee, E. B., and Markus, L., Foundations of Optimal Control Theory, New York, Wiley,
1967.
9. Leitmann, G., An Introduction to Optimal Control, New York, McGraw-Hill, 1966.
10. Mangasarian, O. L., "Sufficient Conditions for the Optimal Control of Nonlinear
Systems," Journal of SIAM Control, vol. 4, February 1966.
11. Pontryagin, L. S., Boltyanskii, V. G., Gamkrelidze, R. V., and Mishchenko, E. F.,
The Mathematical Theory of Optimal Processes, New York, Interscience, 1962 (tr. by
K. N. Trirogoff from Russian original). (A translation by D. E. Brown was published
by Macmillan, 1964.)
12. Saaty, T. L., and Bram, J., Nonlinear Mathematics, New York, McGraw-Hill, 1964,
esp. chap. 5.
13. Shell, K., "Applications of Pontryagin's Maximum Principle to Economics," in
Mathematical Systems, Theory and Economics, ed. by H. W. Kuhn and G. P. Szego,
Berlin, Springer-Verlag, 1969.
14. Takayama, A., "On the Structure of the Optimal Growth Problem," Krannert In-
stitute Paper, No. 178, Purdue University, June 1967.

Section B
SOME APPLICATIONS

a. REGIONAL ALLOCATION OF INVESTMENT¹

Consider an economy consisting of two regions (1 and 2),² each producing
one and the same output, Y (called the "national product"). The output of
628 DEVELOPMENTS OF OPTIMAL CONTROL THEORY AND ITS APPLICATIONS

each region is an increasing function of the capital input; hence an increase in
the capital stock of each region will result in an increase in the output of that
region. The increase in the stock of capital in a region is due to an increase in
the investment in that region. Assume that the total investment funds are pooled
in a central agency and allocated to the two regions. Then the increase in the
investment in a region depends on the allocation of the total investment funds.
Assume that total investment funds come from the total savings of the people in
the economy. The question is: What is the "optimal" allocation of the total invest-
ment funds?
This was the question posed and analyzed by Rahman [10] in his Ph.D. dissertation
at Harvard. His analysis, which was in terms of dynamic programming,
was reformulated by Intriligator [6] in terms of Pontryagin's maximum principle.
Although the reformulation is ingenious, Intriligator's conclusions do not coincide
with Rahman's result. This is because of errors involved in Intriligator's
analysis. Rahman, in his rebuttal [11], commented on this but failed to come up
with a complete and precise analysis. The latter was provided by Takayama ([14]
and [15]).

The model presented in the Rahman-Intriligator studies is a very simple one.
A linear target is maximized subject to linear differential equations with constant
coefficients. Because of its simplicity, this model is very useful as an illustration of
Pontryagin's maximum principle. The reader, if he wishes, can always construct a
more general model and analyze it in a similar manner.
Assume that the output of each region, $Y_i$ ($i = 1, 2$), is produced with a fixed
capital:output ratio so that we have
$Y_i = b_i K_i, \quad i = 1, 2$
where $K_i$ denotes the stock of capital in region $i$ and $1/b_i > 0$ denotes the capital:
output ratio in the $i$th region. The variables such as $Y_i$ and $K_i$ are all functions of
time $t$ so that we may write $Y_i(t)$ and $K_i(t)$. However, for the sake of notational
simplicity, we omit the notation for time except where it might cause confusion to
do so. Since the investment funds for the two regions come from the saving done in
the whole economy, we have
(1) $\dot{K}_1 + \dot{K}_2 = s_1 Y_1 + s_2 Y_2$
where the dot refers to the total derivative with respect to $t$; that is, $\dot{K}_i = dK_i/dt$,
and so on. Here we assume that the propensity to save, $s_i$, of each region is constant
for all $t$ and that $0 < s_i < 1$. Defining $g_i$ by
$g_i = b_i s_i, \quad i = 1, 2$
we can rewrite (1) as follows:
(2) $\dot{K}_1 + \dot{K}_2 = g_1 K_1 + g_2 K_2$
Let $\beta = \beta(t)$ be the proportion of investment allocated to region 1. This $\beta$ may
be called the allocation parameter. Clearly $(1 - \beta)$ is the proportion of investment
allocated to region 2. Then we have the following set of equations:

(3-a) $\dot{K}_1 = \beta (g_1 K_1 + g_2 K_2)$
(3-b) $\dot{K}_2 = (1 - \beta)(g_1 K_1 + g_2 K_2)$
with arbitrarily given initial capital stocks $K_1(t_0) = K_{10} > 0$ and $K_2(t_0) = K_{20} > 0$.
It is obvious that
(4) $0 \leq \beta \leq 1$
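
The dynamics (3-a) and (3-b) are easy to explore numerically. The following Python sketch is an illustration only: the function name `simulate_capital`, the choice of a classical Runge-Kutta step, and the sample parameter values are assumptions made here and are not part of the original analysis.

```python
import math


def simulate_capital(k1_0, k2_0, g1, g2, beta, t0, t_end, steps=2000):
    """Integrate (3-a) and (3-b) with a classical fourth-order Runge-Kutta scheme.

    `beta` is a function of time giving the share of investment allocated
    to region 1; by (4) it must lie in the closed interval [0, 1].
    """
    def rhs(t, k1, k2):
        share = beta(t)
        if not 0.0 <= share <= 1.0:
            raise ValueError("beta(t) must satisfy 0 <= beta <= 1")
        total = g1 * k1 + g2 * k2          # total investment funds, as in (2)
        return share * total, (1.0 - share) * total

    h = (t_end - t0) / steps
    t, k1, k2 = t0, k1_0, k2_0
    for _ in range(steps):
        a1, a2 = rhs(t, k1, k2)
        m1, m2 = rhs(t + h / 2, k1 + h / 2 * a1, k2 + h / 2 * a2)
        n1, n2 = rhs(t + h / 2, k1 + h / 2 * m1, k2 + h / 2 * m2)
        d1, d2 = rhs(t + h, k1 + h * n1, k2 + h * n2)
        k1 += h / 6 * (a1 + 2 * m1 + 2 * n1 + d1)
        k2 += h / 6 * (a2 + 2 * m2 + 2 * n2 + d2)
        t += h
    return k1, k2


# Hypothetical numbers: all funds go to region 1 (beta = 1) for ten periods.
k1_T, k2_T = simulate_capital(1.0, 2.0, g1=0.3, g2=0.24,
                              beta=lambda t: 1.0, t0=0.0, t_end=10.0)
```

With $\beta = 1$ the stock $K_2$ stays at its initial level, and $K_1$ can be compared against the closed-form solution of the resulting linear equation, which gives a simple check on the integrator.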
The problem facing the economic planner is to choose $\beta(t)$ so as to maximize
some objective function. The objective considered by Rahman is to maximize income
at some given future terminal time $T$. In other words, his problem is to³
Maximize: $Y_1(T) + Y_2(T)$ $[= b_1 K_1(T) + b_2 K_2(T)]$
Subject to: conditions (3) and (4)
Following the maximum principle, especially Theorem 8.A.1, we first define
the Hamiltonian as follows:
(5) $H = p_1 \beta (g_1 K_1 + g_2 K_2) + (1 - \beta) p_2 (g_1 K_1 + g_2 K_2)$
where the $p_i$'s are "auxiliary variables" and satisfy the following transversality
conditions [see (iii) of Theorem 8.A.1]:

(6) $p_1(T) = b_1, \quad p_2(T) = b_2$

The Hamiltonian system consists of (3) and the following equations:

(7) $\dot{p}_i = -\partial H / \partial K_i, \quad i = 1, 2$

Henceforth we omit the hat (^), which denotes the optimality, for the sake of notational
simplicity.
According to the maximum principle we choose the control variables so as to
maximize $H$.⁴ Noting that $H = [\beta(p_1 - p_2) + p_2](g_1 K_1 + g_2 K_2)$, we obtain⁵
(8) $\beta = 1$ if $p_1 > p_2$ and $\beta = 0$ if $p_1 < p_2$
Equations (7) and (6) can be rewritten as
(7') $\dot{p}_1 = -[\beta(p_1 - p_2) + p_2] g_1, \quad p_1(T) = b_1$
     $\dot{p}_2 = -[\beta(p_1 - p_2) + p_2] g_2, \quad p_2(T) = b_2$
Hence, observing that $\dot{p}_1/\dot{p}_2 = g_1/g_2$ from the above, we obtain⁶

(9) $p_1(t) = \frac{g_1}{g_2} p_2(t) + \frac{b_1 b_2}{g_2}(s_2 - s_1), \quad p_1(T) = b_1, \quad p_2(T) = b_2$

or
(9') $p_1(t) - p_2(t) = \frac{g_1 - g_2}{g_2} p_2(t) + \frac{b_1 b_2}{g_2}(s_2 - s_1)$
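
As a numerical illustration of how (6), (7'), and (8) fit together, the costate equations can be integrated backward from the terminal date. The sketch below is an assumption-laden illustration, not the method of the text: the function name `costates_backward`, the Euler step, the tie-breaking choice $\beta = 0$ when $p_1 = p_2$, and the parameter values are all hypothetical.

```python
def costates_backward(b1, b2, s1, s2, T, t0=0.0, steps=10000):
    """Integrate (7') backward from t = T using the bang-bang rule (8).

    Returns a list of (t, p1, p2, beta) running from t = T down to t = t0.
    A plain Euler step is used; this is an illustration, not a production solver.
    """
    g1, g2 = b1 * s1, b2 * s2
    h = (T - t0) / steps
    t, p1, p2 = T, b1, b2                  # transversality conditions (6)
    path = []
    for _ in range(steps + 1):
        beta = 1.0 if p1 > p2 else 0.0     # rule (8); ties resolved as beta = 0
        path.append((t, p1, p2, beta))
        m = beta * (p1 - p2) + p2          # common factor in (7')
        p1 += h * m * g1                   # backward step: p(t - h) = p(t) - h * dp/dt
        p2 += h * m * g2
        t -= h
    return path


# Hypothetical parameters with g1 > g2, s1 > s2, and b2 > b1.
path = costates_backward(b1=1.0, b2=1.2, s1=0.3, s2=0.2, T=10.0)
```

Because both costates are driven by the same factor, the combination $p_1 - (g_1/g_2) p_2$ stays constant along the computed path, which is exactly the content of (9).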

If $g_1 > g_2$ and $s_2 > s_1$ (also for $g_1 = g_2$ and $s_2 > s_1$, or $g_1 > g_2$ and $s_2 = s_1$),
then $p_1 > p_2$ always. Hence, $\beta = 1$. In particular if the saving rates in both regions
are the same, we should obviously invest all the funds in that region where
productivity of capital, $b$, is higher. Similarly, if the productivities are the same
($b_1 = b_2$), we should invest all the funds in the region where the saving rate is
higher. In this case, therefore, there is no switch for our control variable $\beta$. Since
this is rather obvious, one may wonder why Intriligator was misled to conclude that
there is always a "switch" at the terminal date.
In order to understand the more difficult case $g_1 > g_2$, $s_1 > s_2$,⁸ we draw
a diagram for equation (9), as in Figure 8.3. Although in Figure 8.3, $\bar{p}_2$ is to
the right of $b_2$, in fact, it may be on either side of $b_2$. The value of $\bar{p}_2$ is obtained
by setting $p_1 = p_2$ in (9):

(10) $\bar{p}_2 = \frac{s_1 - s_2}{g_1 - g_2} b_1 b_2$ (which is positive by assumption)

We can easily show that:

(11) $\bar{p}_2 \gtrless b_2$ according to whether $b_2 \gtrless b_1$

CASE i: $b_2 > b_1$
This is the case depicted in Figure 8.3. Let $t^*$ be the point of time at which
$p_2(t)$ takes the value of $\bar{p}_2$, and let $t_0$ be the initial point of time. Since both $p_1$
and $p_2$ are monotone decreasing functions of time $t$ from equation (7'), the point
$t^*$ is unique and we can consider case i as composed of two subcases.
(i-a) $t_0 < t^*$

Figure 8.3. Rahman's Objective with $b_2 > b_1$.


There is a switch such that

$\beta = 1$ for $t_0 \leq t < t^*$
$\beta = 0$ for $t^* < t \leq T$
In other words, this is a case of "bang-bang" control.
(i-b) $t_0 \geq t^*$
In this case there is no switch and $\beta = 0$ all the time.
In order to find the value of $t^*$, we solve the following differential equation:
$\dot{p}_2 = -[\beta(p_1 - p_2) + p_2] g_2$ where $\beta = 0$ and $p_2(T) = b_2$
and we obtain

(12) $p_2(t) = b_2 e^{g_2 (T - t)}$

Equating $p_2(t)$ to $\bar{p}_2$, the value of which is given in (10), we obtain the exact
expression for the switching time $t^*$ as follows:

(13) $t^* = T - \frac{1}{g_2} \log\left(\frac{s_1 - s_2}{g_1 - g_2} b_1\right)$
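
Formula (13) translates directly into a few lines of code. The following Python sketch is illustrative only; the function names, the convention of returning `None` when no switch occurs, and the sample parameter values are assumptions introduced here.

```python
import math


def rahman_switch_time(b1, b2, s1, s2, T):
    """Switching time t* from (13), derived for the case g1 > g2 and s1 > s2.

    Returns None when b2 <= b1, where beta = 1 throughout and there is no switch.
    """
    g1, g2 = b1 * s1, b2 * s2
    if not (g1 > g2 and s1 > s2):
        raise ValueError("formula (13) is derived for g1 > g2 and s1 > s2")
    if b2 <= b1:
        return None
    ratio = (s1 - s2) / (g1 - g2) * b1     # equals p2_bar / b2 by (10); exceeds 1 here
    return T - math.log(ratio) / g2


def rahman_beta(t, t_star):
    """Bang-bang allocation: beta = 1 before t*, beta = 0 from t* onward."""
    return 1.0 if t_star is None or t < t_star else 0.0


# Hypothetical parameters with b2 > b1 (case i).
t_star = rahman_switch_time(b1=1.0, b2=1.2, s1=0.3, s2=0.2, T=10.0)
```

If the computed $t^*$ does not exceed the initial time $t_0$, `rahman_beta` returns 0 for every $t \geq t_0$, which corresponds to subcase (i-b).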
CASE ii: $b_2 < b_1$
This case is illustrated in Figure 8.4. From the diagram it is clear that
$p_1 > p_2$ always so that $\beta = 1$ always. In fact, this should be obvious. Region 1
has a higher propensity to save and a higher productivity.

Figure 8.4. Rahman's Objective with $b_1 > b_2$.


As an alternative objective for the planner, Intriligator proposed to maximize
per capita consumption over the planning period. Since consumption, $X$, is
given by $X = (1 - s_1) b_1 K_1 + (1 - s_2) b_2 K_2$, his target function can be written as
$\int_0^T \frac{X}{N} \, dt$
where $N$ is the population of the economy.⁹ Assuming that population grows
exponentially at a constant rate $n$ so that¹⁰ $N = e^{nt}$ and letting $\rho$ be the time
discount rate for the future consumption, we may rewrite our target function as
follows:

$\int_0^T e^{-\lambda t} [(1 - s_1) b_1 K_1 + (1 - s_2) b_2 K_2] \, dt$

where
$\lambda \equiv \rho + n$
and $\rho$ is assumed to be nonnegative (that is, $\rho \geq 0$). Our problem now is to choose
$\beta(t)$ so as to maximize the above target subject to conditions (3) and (4).
To apply Theorem 8.A.6, we first define the Hamiltonian as
$H \equiv e^{-\lambda t}(b_1 K_1 + b_2 K_2) + [\beta(p_1 - p_2) + p_2 - e^{-\lambda t}](g_1 K_1 + g_2 K_2)$
and the Hamiltonian system consists of (3) and the following equations:
(15) $\dot{p}_i = -\frac{\partial H}{\partial K_i} = -e^{-\lambda t} b_i - [\beta(p_1 - p_2) + p_2 - e^{-\lambda t}] g_i, \quad i = 1, 2$
with $p_1(T) = p_2(T) = 0$.¹¹ The value of $\beta$ which maximizes $H$ is again
(16) $\beta(t) = 1$ if $p_1(t) > p_2(t)$ and $\beta(t) = 0$ if $p_1(t) < p_2(t)$
From (15), we can easily obtain the following expression:

(17) $p_1(t) - p_2(t) = \frac{g_1 - g_2}{g_2} p_2(t) + \frac{b_1 b_2}{g_2 \lambda}(e^{-\lambda t} - e^{-\lambda T})(s_2 - s_1)$

Hence if $g_1 > g_2$ and $s_1 < s_2$¹² (instead of $s_1 > s_2$), then $p_1(t) - p_2(t) > 0$ for all
$t < T$, and the optimal policy is to invest the entire fund in the first region. However,
if $s_1 > s_2$ (together with $g_1 > g_2$ and $\lambda \neq 0$), then we cannot arrive at any
immediate conclusion about the optimal policy. This forces us to reconsider the
problem under Intriligator's target function from a completely fresh viewpoint.
To do this, we obtain from (15) the following equation:
(18) $\dot{p}_1 - \dot{p}_2 = -[\beta(p_1 - p_2) + p_2](g_1 - g_2) + [(1 - s_2) b_2 - (1 - s_1) b_1] e^{-\lambda t}$
If $\sigma \equiv (1 - s_2) b_2 - (1 - s_1) b_1$ is negative,¹³ then $\dot{p}_1 - \dot{p}_2 < 0$ for optimal values
of $\beta$, provided that $g_1 > g_2$.¹⁴ If $g_1 > g_2$ and $s_1 < s_2$, then $b_1 > b_2$, which is required
in order that $\sigma < 0$. Note, however, that $\sigma$ can be negative even if we have $g_1 > g_2$
and $s_1 > s_2$. Since $p_1(T) = p_2(T) = 0$, $\dot{p}_1(t) - \dot{p}_2(t) < 0$ for all $t < T$ implies
$p_1(t) - p_2(t) > 0$ for all $t < T$. Hence the optimal value of $\beta$ is equal to one. In other words,
if $g_1 > g_2$ and $\sigma < 0$, we have $\beta = 1$. The same conclusion holds when $g_1 > g_2$ and
$\sigma = 0$.
However, when $\sigma > 0$, $\dot{p}_1 - \dot{p}_2$ is not necessarily negative. In the subsequent
analysis, we assume $\sigma > 0$. First we define $q_i(t) \equiv p_i(t) e^{\lambda t}$, $i = 1, 2$. It should be
clear that, for all $t \leq T$, $q_1(t) \gtreqless q_2(t)$ according to whether $p_1(t) \gtreqless p_2(t)$. We
also note that $q_i(t) \gtreqless 0$ according to whether $p_i(t) \gtreqless 0$ for each $t$, $i = 1, 2$. From this
we can conclude that $q_1(T) = q_2(T) = 0$ and that $q_i(t) > 0$, $i = 1, 2$, for all $t < T$.
With this definition of $q_i(t)$, we immediately have the following equations:

(19) $\dot{p}_i = (\dot{q}_i - \lambda q_i) e^{-\lambda t}, \quad i = 1, 2$

We can also show that $\dot{q}_i(t) < 0$, $i = 1, 2$, for all $t < T$ for $\beta = 1$ or 0, if we assume
that $g_i > \lambda$, $i = 1, 2$.¹⁵ Using the definitions of the $q_i(t)$ and (19), we now consider
(18) for the two cases $\beta = 0$ and $\beta = 1$.
CASE i: $q_1 < q_2$ (that is, $\beta = 0$)
In this case equation (18) can be rewritten as¹⁶

(20) $\dot{q}_1 - \dot{q}_2 = -\left[-\lambda \frac{q_1 - q_2}{g_1 - g_2} + q_2\right](g_1 - g_2) + \sigma$

Hence $\dot{q}_1 - \dot{q}_2 = 0$ if and only if the values of $(q_1, q_2)$ satisfy the following equation:
(21) $-\lambda q_1 + (g_1 - g_2 + \lambda) q_2 = \sigma$
The locus of $(q_1, q_2)$ values which satisfy (21) is clearly a straight line in the
($q_1$-$q_2$)-plane and divides the entire plane into two parts, that is, the region
where $\dot{q}_1 - \dot{q}_2 > 0$ and the region where $\dot{q}_1 - \dot{q}_2 < 0$. We should also recall that we
are concerned only with the region where $q_1 < q_2$ with $q_i \geq 0$, $i = 1, 2$. This implies
that we are concerned only with the nonnegative region below the 45° line in the
($q_1$-$q_2$)-plane where $\dot{q}_1 - \dot{q}_2 > 0$.¹⁷ We call the region which satisfies these requirements
the "relevant region." The relevant region for the present case ($q_1 < q_2$) is
illustrated by the shaded triangle in Figure 8.5.
CASE ii: $q_1 > q_2$ (that is, $\beta = 1$)
In this case, (18) can be rewritten as

(22) $\dot{q}_1 - \dot{q}_2 = -\left[-\lambda \frac{q_1 - q_2}{g_1 - g_2} + q_1\right](g_1 - g_2) + \sigma$

Hence $\dot{q}_1 - \dot{q}_2 = 0$ if and only if the values of $(q_1, q_2)$ satisfy the following equation:

(23) $(g_1 - g_2 - \lambda) q_1 + \lambda q_2 = \sigma$

The locus of $(q_1, q_2)$ pairs which satisfy (23) is clearly a straight line in the
($q_1$-$q_2$)-plane and divides the entire plane into two parts, that is, the region
where $\dot{q}_1 - \dot{q}_2 > 0$ and the region where $\dot{q}_1 - \dot{q}_2 < 0$. We should recall that we
are concerned with the case in which $q_1 > q_2$; hence the "relevant region" in the
present case is the region above the 45° line in the ($q_1$-$q_2$)-plane where $\dot{q}_1 - \dot{q}_2 < 0$.¹⁸
We are again concerned with the nonnegative values of $q_1$ and $q_2$. The
sign of the slope of the straight line which satisfies (23) will differ according to
whether $g_1 - g_2 - \lambda > 0$ or $< 0$.¹⁹ We illustrate the relevant region for the case
$q_1 > q_2$ with $g_1 - g_2 - \lambda > 0$ by the shaded triangle in Figure 8.6.

Figure 8.5. Case a: $q_1 < q_2$ with $g_1 - g_2 + \lambda > 0$.

Figure 8.6. Case b: $q_1 > q_2$ with $g_1 - g_2 - \lambda > 0$.

Figure 8.7. Intriligator's Objective with $g_1 - g_2 - \lambda > 0$.

In Figure 8.7, we combine Figures 8.5 and 8.6, and we obtain the path of
$(q_1, q_2)$, which is illustrated by an arrowed line. In Figure 8.8, we illustrate the
case in which $g_1 - g_2 - \lambda < 0$, when $q_1 > q_2$. The optimal path of $(q_1, q_2)$ is again
indicated by an arrowed line. Note that in both cases (that is, $g_1 - g_2 - \lambda > 0$ and
$g_1 - g_2 - \lambda < 0$) there is a possibility of a switch of the optimal policy from
$\beta = 1$ to $\beta = 0$. For example, in the case of $g_1 - g_2 - \lambda > 0$, the optimal value of
$\beta$ is equal to one until $q_1(t^*) = q_2(t^*) = \sigma/(g_1 - g_2)$, and then it switches to zero
until the terminal point of time $T$. The same holds for the case of $g_1 - g_2 - \lambda < 0$.
Finally, let us obtain the switching time $t^*$. This can be done by noting that
$q_1(t^*) = q_2(t^*) = \sigma/(g_1 - g_2)$. The explicit expression for $q_1(t)$ with $\beta = 1$ is
written as²⁰

(24) $q_1(t) = \frac{(1 - s_1) b_1}{g_1 - \lambda}\left[e^{(g_1 - \lambda)(T - t)} - 1\right]$

Define $A$ as

(25) $A \equiv \frac{(g_1 - \lambda)\sigma}{(1 - s_1) b_1 (g_1 - g_2)}$

Then the switching time $t^*$ is obtained as
(26) $t^* = T - \frac{1}{g_1 - \lambda} \log (A + 1)$
If $g_1 > \lambda$, then $A > 0$. Hence if $T$ is sufficiently large so that the RHS of (26) is
positive, there is a switch of the optimal policy at $t = t^*$. If $T$ is not big enough,
there is no switch and the optimal policy will always be $\beta = 0$. If $g_1 < \lambda$, then $A < 0$;
hence as long as the difference between $g_1$ and $\lambda$ is not too large,²¹ $\log (A + 1) < 0$,
and there is again a possibility of a switch of the optimal policy at $t = t^*$ provided
$T$ is sufficiently large so that the RHS of (26) is positive.

Figure 8.8. Intriligator's Objective with $g_1 - g_2 - \lambda < 0$.
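
The switching time in (25) and (26) can be evaluated numerically as follows. This Python sketch is only an illustration: the function name, the convention of returning `None` when the formula does not apply, the error raised for $g_1 = \lambda$, and the parameter values are assumptions made here rather than part of the text.

```python
import math


def intriligator_switch_time(b1, b2, s1, s2, lam, T):
    """Switching time t* from (25) and (26) under Intriligator's objective.

    Applies to the case g1 > g2 and sigma > 0; returns None otherwise, or
    when A + 1 <= 0 so that the logarithm in (26) is undefined.
    """
    g1, g2 = b1 * s1, b2 * s2
    sigma = (1.0 - s2) * b2 - (1.0 - s1) * b1
    if not (g1 > g2 and sigma > 0.0):
        return None
    if g1 == lam:
        raise ValueError("formula (26) requires g1 != lambda")
    a = (g1 - lam) * sigma / ((1.0 - s1) * b1 * (g1 - g2))   # A in (25)
    if a + 1.0 <= 0.0:
        return None
    return T - math.log(a + 1.0) / (g1 - lam)


# Hypothetical parameters with g1 > g2 and sigma > 0.
t_star = intriligator_switch_time(b1=1.0, b2=1.2, s1=0.3, s2=0.2, lam=0.05, T=10.0)
```

A switch actually occurs only if the returned value exceeds the initial time; otherwise the policy is $\beta = 0$ throughout, as noted above.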
This finishes our analysis under Intriligator's target function. We summarize
the results as follows:
(i) $g_1 > g_2$, $s_1 < s_2$ (or $g_1 = g_2$, $s_1 < s_2$; $g_1 > g_2$, $s_1 = s_2$; or $g_1 > g_2$, $\lambda = 0$):
$\beta = 1$ always.
(ii) $g_1 > g_2$, $\sigma \leq 0$: $\beta = 1$ always.²²
(iii) $g_1 = g_2$, $\sigma > 0$: $\beta = 0$ always.²³
(iv) $g_1 > g_2$, $\sigma > 0$: possibility of a switch from $\beta = 1$ to $\beta = 0$.

Our results now can be summarized in the following table.²⁴

                                      $s_1 > s_2$                        $s_1 < s_2$ ($b_1 > b_2$)
Rahman's objective function           $b_1 < b_2$      $b_1 > b_2$
                                      switch           $\beta = 1$       $\beta = 1$
Intriligator's objective function     $\sigma > 0$     $\sigma \leq 0$
                                      switch           $\beta = 1$       $\beta = 1$

We may now discuss whether the model is really plausible or not. The first
question is whether we can assume that the $b_i$'s are kept constant, since the $b_i$'s
may decline, owing to the law of diminishing returns, as capital accumulates. One
reason the $b_i$'s may be kept from declining is that labor is freely available so that
the capital:labor ratio is kept constant. However, this is impossible in a full-
employment economy unless the total labor supply increases at the same rate as
capital. Even in an economy with an "unlimited" supply of labor, it is not easy
to conceive of a mechanism which would determine the total employment of labor
and the allocation of this labor to different regions such that the capital:labor
ratio would remain constant in each region.
Apart from this, a more important question is the implication of our optimal
policy which says that the planner should invest all the funds in one region only
(say, region 1). If the income is growing in region 1 while the income in region 2
is stagnant, there may be a migration of labor from region 2 to region 1. It is not
quite clear whether there should be a mechanism to stop this and whether we
should consider the effect of this migration on productivity and the propensity to
save of each region. In short, the question we ought to face is whether we can
keep labor implicit in our model.
ADDENDUM: Here we record the explicit expressions for $p_i(t)$ and $\dot{p}_i(t)$,
$i = 1, 2$, as functions of time, corresponding to the optimal values of $\beta$ under
Intriligator's objective function. They can be obtained by putting $\beta = 1$ or
0 in equation (15) and solving the linear differential equations thus obtained
subject to the boundary conditions $p_1(T) = p_2(T) = 0$.²⁵
CASE i: $\beta = 1$

(27) $p_1(t) = e^{-\lambda t} \frac{(1 - s_1) b_1}{g_1 - \lambda}\left[e^{(g_1 - \lambda)(T - t)} - 1\right]$

(28) $\dot{p}_1(t) = e^{-\lambda t} \frac{(1 - s_1) b_1}{g_1 - \lambda}\left[\lambda - g_1 e^{(g_1 - \lambda)(T - t)}\right]$

(29) $\dot{p}_2(t) = -e^{-\lambda t}\left\{(1 - s_2) b_2 + \frac{(1 - s_1) b_1 g_2}{g_1 - \lambda}\left[e^{(g_1 - \lambda)(T - t)} - 1\right]\right\}$

To obtain the expression for $p_2(t)$, we rewrite (17) as

(17') $p_2(t) = \frac{g_2}{g_1} p_1(t) + \frac{b_1 b_2}{g_1 \lambda}(s_1 - s_2)\left[1 - e^{-\lambda(T - t)}\right] e^{-\lambda t}$

Then substituting (27) into this, we immediately obtain the expression for
$p_2(t)$. Note that $e^{(g_1 - \lambda)(T - t)} \gtrless 1$ according to whether $g_1 \gtrless \lambda$, for all $t < T$.
Therefore from (27), (28), and (29) we obtain $p_1(t) > 0$, $\dot{p}_1(t) < 0$ and $\dot{p}_2(t) < 0$,
for all $t < T$. Since $p_2(T) = 0$, we also have $p_2(t) > 0$ for all $t < T$. Using
(19), (27), and (28), we can easily show that $\dot{q}_1(t) < 0$ for all $t < T$. If $g_1 > \lambda$,
we can also show that $\dot{q}_2(t) < 0$ for all $t < T$.²⁶

CASE ii: $\beta = 0$

(30) $p_2(t) = e^{-\lambda t} \frac{(1 - s_2) b_2}{g_2 - \lambda}\left[e^{(g_2 - \lambda)(T - t)} - 1\right]$

(31) $\dot{p}_1(t) = -e^{-\lambda t}\left\{(1 - s_1) b_1 + \frac{(1 - s_2) b_2 g_1}{g_2 - \lambda}\left[e^{(g_2 - \lambda)(T - t)} - 1\right]\right\}$

(32) $\dot{p}_2(t) = e^{-\lambda t} \frac{(1 - s_2) b_2}{g_2 - \lambda}\left[\lambda - g_2 e^{(g_2 - \lambda)(T - t)}\right]$

The expression for $p_1(t)$ can be obtained by substituting (30) into (17').
Using an argument analogous to the one above, we can show that $p_i(t) > 0$
and $\dot{p}_i(t) < 0$, $i = 1, 2$, for all $t < T$. Using (19), (30), and (32), we can show
that $\dot{q}_2(t) < 0$ for all $t < T$. Also if $g_2 > \lambda$, we can show that $\dot{q}_1(t) < 0$ for all
$t < T$.
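
The closed-form expressions for the case $\beta = 1$ can be checked against the differential equations (15) they are meant to solve. The short Python sketch below does this for (27), (28), and (29); the function names and the parameter values used in any check are hypothetical, and the sketch assumes $g_1 \neq \lambda$.

```python
import math


def case_i_costates(t, b1, b2, s1, s2, lam, T):
    """Evaluate (27), (28), and (29) for beta = 1 (assumes g1 != lam)."""
    g1, g2 = b1 * s1, b2 * s2
    c = (1.0 - s1) * b1 / (g1 - lam)
    growth = math.exp((g1 - lam) * (T - t))
    disc = math.exp(-lam * t)
    p1 = disc * c * (growth - 1.0)                               # (27)
    p1_dot = disc * c * (lam - g1 * growth)                      # (28)
    p2_dot = -disc * ((1.0 - s2) * b2 + c * g2 * (growth - 1.0)) # (29)
    return p1, p1_dot, p2_dot


def rhs_15(b_i, g_i, p1, t, lam):
    """Right-hand side of (15) when beta = 1, so that beta(p1 - p2) + p2 = p1."""
    disc = math.exp(-lam * t)
    return -disc * b_i - (p1 - disc) * g_i
```

Evaluating `case_i_costates` at any $t < T$ and comparing `p1_dot` with `rhs_15(b1, g1, p1, t, lam)`, and `p2_dot` with `rhs_15(b2, g2, p1, t, lam)`, should give agreement up to rounding error.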

b. OPTIMAL GROWTH WITH A LINEAR OBJECTIVE FUNCTION²⁷

Here again we take up the optimal growth problem. The only change we
make here is that the objective integral is now defined as

(33) $J = \int_0^\infty x_t e^{-\rho t} \, dt$, where $\rho > 0$

The constraints, which are exactly the same as before, are

(34) $\dot{k}_t = f(k_t) - \lambda k_t - x_t$, and $x_t \geq 0$, $k_t \geq 0$
The notations are also the same as before: $x_t$, per capita consumption; $k_t$,
capital:labor ratio; $\rho$, discount factor; $\lambda$, rate of population growth ($n$) + rate of
depreciation ($\mu$); and $f(k_t)$, per capita production function. The subscript $t$ refers
to time $t$. Once again it is assumed that $\lambda > 0$.
This change in the objective function implies a special form of the utility
function, that is, u(xt) = xt, so that the marginal utility is constant (and is equal
to 1). The objective (33), the discounted sum of the per capita consumption
stream, is in fact quite common in the literature. From the mathematical view-
point, the crucial change is that the function inside the objective integral is now
a linear function with respect to the control variable.
Let $s_t$ be the propensity to save at time $t$, that is,
(35) $s_t \equiv \frac{Y_t - X_t}{Y_t} = \frac{f(k_t) - x_t}{f(k_t)}$
so that $x_t = (1 - s_t) f(k_t)$. We rewrite the objective integral (33) and the constraint
(34) as follows:²⁸

(36) $J = \int_0^\infty (1 - s_t) f(k_t) e^{-\rho t} \, dt$

(37) $\dot{k}_t = s_t f(k_t) - \lambda k_t$, $s_t \leq 1$, $k_t \geq 0$

In order to emphasize the corner solution, we further assume that $s_t \geq 0$. Thus $s_t$
is assumed to be bounded in the unit closed interval
(38) $0 \leq s_t \leq 1$

The fact that $s_t \geq 0$ means that consumption does not exceed current income; that
is, gross investment $I_t$ is nonnegative. This signifies that we do not allow capital
to be "eaten up" (except for depreciation), which means that once the output is
invested as capital stock it is not used for the purpose of consumption. The
assumption of $I_t \geq 0$ is often called the irreversibility of investment (see Arrow [1],
for example).
Note three (mathematical) features in the present formulation of the optimal
growth problem: (1) the objective function is linear in the control (st); (2) the
RHS of the differential equation constraint (37) is again linear in st; and (3)
the control is in the closed region prescribed by (38). Under these features, it will
be observed that we obtain a "corner solution" as a usual case. Since it is typically
supposed in the classical calculus of variations that the control region is an open
set, the corner solution requires special consideration. However, the Pontryagin
maximum principle, in which the control region can be a closed set, is well suited
for the analysis of this problem. Moreover, the solution of the above problem is
such that there is a jump in the optimal control from a corner solution to an
interior solution. Hence the assumption of the piecewise continuity of the control
function is useful.
The problem is to choose the time path of $s_t$ so as to maximize $J$ defined
in (36) subject to (37) and (38) with a given $k_0$. The solution path is called the
optimal attainable path (with respect to $k_0$). For this problem $s_t$ is the control
variable and $k_t$ is the state variable. We consider this problem as the one with open
terminal end point and apply Theorem 8.A.6. The Hamiltonian for this problem
is

(39) $H[k_t, s_t, t, p_t] \equiv e^{-\rho t}(1 - s_t) f(k_t) + p_t [s_t f(k_t) - \lambda k_t]$

Omitting the hat (^), which indicates optimality, for the sake of notational simplicity, we
write the three necessary conditions from Theorem 8.A.6.²⁹

(i) The variables $k_t$, $s_t$, $p_t$ solve the Hamiltonian system which consists of (37)
and the following equation:
(40) $\dot{p}_t = -\{e^{-\rho t}(1 - s_t) f'(k_t) + p_t [s_t f'(k_t) - \lambda]\}$
(ii) The Hamiltonian $H$ is maximized with respect to $s_t$.
(iii) The right-hand end-point condition: $\lim_{t \to \infty} p_t = 0$.

In order to simplify the problem, define $q_t$ by

(41) $q_t \equiv p_t e^{\rho t}$, $t < \infty$

Then $e^{-\rho t} \dot{q}_t - \rho e^{-\rho t} q_t = \dot{p}_t$ so that (40) can be rewritten as

(42) $\dot{q}_t = (\lambda + \rho) q_t - \pi_t f'(k_t)$
where $\pi_t$ is defined as
(43) $\pi_t \equiv (1 - s_t) + s_t q_t = 1 + (q_t - 1) s_t$
In terms of $q_t$, condition (iii) can be rewritten as
(44) $\lim_{t \to \infty} q_t e^{-\rho t} = 0$
Note also that $H$ can be rewritten in terms of $q_t$ and $\pi_t$ as follows:
(45) $H = e^{-\rho t}[\pi_t f(k_t) - \lambda k_t q_t]$
Hence condition (ii), the maximization of $H$ with respect to $s_t$, is realized if and
only if $\pi_t$ is maximized with respect to $s_t$. Thus
(46) $s_t = 0$ if $q_t < 1$ and $s_t = 1$ if $q_t > 1$
Note also that
(47) $\pi_t = 1$ if $s_t = 0$ and $\pi_t = q_t$ if $s_t = 1$
In other words, the choice of $s_t$ depends on whether $q_t$ is greater or less than 1,
and corresponding to this choice of $s_t$, $\pi_t$ is specified in (47). Using such a choice
of $s_t$ and the specification of $\pi_t$ [which in turn is a result of condition (ii)], we
investigate the Hamiltonian system [condition (i)]. Therefore we consider our
problem by distinguishing the two cases ($q_t > 1$ and $q_t < 1$).
CASE i: $q_t > 1$
In this case $s_t = 1$ and $\pi_t = q_t$ in view of (46) and (47) so that (37) and (42)
can be rewritten, respectively, as
(48) $\dot{k}_t = f(k_t) - \lambda k_t$
(49) $\dot{q}_t = -[f'(k_t) - (\lambda + \rho)] q_t$
The phase diagram for these two differential equations is depicted in Figure
8.9 [assuming, as before, that $f'(k) > 0$, $f''(k) < 0$ for all $k \geq 0$, $f'(0) = \infty$,
and $f'(\infty) = 0$].
Here $\bar{k}$ and $k^*$ are respectively defined by the following equations:
(50) $f(\bar{k}) = \lambda \bar{k}$
(51) $f'(k^*) = \lambda + \rho$
Note that $k^*$ is the capital:labor ratio in the modified golden rule path, the
concept of which was already discussed in Chapter 5, Section D.
Hence, in Figure 8.9, if the initial capital:labor ratio, $k_0$, is less than
$k^*$, then there exists a path $(k_t, q_t)$ which starts from $[k_0, q(k_0)]$ and reaches
the state $(k^*, 1)$. It can be easily shown that such a path reaches the state
$(k^*, 1)$ within a finite amount of time (say, $T$) and along this path $q_t$ is
always greater than 1. Assuming that $k_0 < k^*$, we define another path
$(k_t', q_t')$, which is the same as the above path for the period $0 \leq t \leq T$ but
is $(k^*, 1)$ for $t > T$.

Figure 8.9. Phase Diagram When $q_t > 1$.
We may now examine whether the path $(k^*, 1)$ satisfies the system of
differential equations (37) and (42). To do this, first note that $q_t = 1$ implies
$\pi_t = 1$, and for the path $(k^*, 1)$, $\dot{k}_t = 0$ and $\dot{q}_t = 0$. Therefore, (37) and (42)
are reduced to
(52) $0 = s_t f(k^*) - \lambda k^*$
(53) $0 = (\lambda + \rho) - f'(k^*)$
Equation (53) is obviously satisfied by the definition of $k^*$ [see (51)].
Equation (52) is satisfied if and only if $s_t$ takes the value

(54) $s_t = s^* \equiv \frac{\lambda k^*}{f(k^*)}$ for all $t$

Note that $s^* > 0$, and that $k^* < \bar{k}$ implies $s^* < 1$. Hence we have $0 < s^* < 1$,
which satisfies (38). Thus $(k^*, 1)$ satisfies both (37) and (42).
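
To make (50), (51), and (54) concrete, consider a Cobb-Douglas form $f(k) = k^\alpha$ with $0 < \alpha < 1$. This functional form is an assumption introduced here purely for illustration (it satisfies the curvature and end-point conditions assumed for $f$); the function name and the numerical values are likewise hypothetical.

```python
def golden_rule_quantities(alpha, lam, rho):
    """Return (k_bar, k_star, s_star) for the assumed form f(k) = k**alpha.

    k_bar solves (50), k_star solves (51), and s_star is given by (54).
    """
    if not (0.0 < alpha < 1.0 and lam > 0.0 and rho > 0.0):
        raise ValueError("need 0 < alpha < 1, lam > 0, rho > 0")
    k_bar = (1.0 / lam) ** (1.0 / (1.0 - alpha))            # f(k) = lam * k
    k_star = (alpha / (lam + rho)) ** (1.0 / (1.0 - alpha)) # f'(k) = lam + rho
    s_star = lam * k_star / k_star ** alpha                 # (54)
    return k_bar, k_star, s_star


# Hypothetical parameter values.
k_bar, k_star, s_star = golden_rule_quantities(alpha=0.3, lam=0.05, rho=0.03)
```

Under this assumed form the saving ratio in (54) reduces to $s^* = \alpha\lambda/(\lambda + \rho)$, which lies strictly between 0 and 1, in line with the remark following (54).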
Hence the path $(k_t', q_t')$ defined above satisfies all three conditions of
the maximum principle for this problem, including condition (iii) [or (44)].
It can be easily shown that along this path $(k_t', q_t')$, the integral $J$ defined
by (36) also converges.
CASE ii: $q_t < 1$

In this case $s_t = 0$ and $\pi_t = 1$ in view of (46) and (47), so that (37) and
(42) can be rewritten respectively as
(55) $\dot{k}_t = -\lambda k_t$
(56) $\dot{q}_t = (\lambda + \rho) q_t - f'(k_t)$

Figure 8.10. Phase Diagram When qt < 1.

The phase diagram for these two differential equations is depicted in Figure
8.10.
If $k_0 > k^*$, there exists a path $(k_t, q_t)$, starting from $[k_0, q(k_0)]$, such
that it reaches the state $(k^*, 1)$. This is also illustrated by an alternative phase
diagram, Figure 8.11, which again can be obtained from (55) and (56). It can
be shown that it reaches $(k^*, 1)$ within a finite amount of time (say, $T$).
Assuming $k_0 > k^*$, we therefore define the path, denoted by $(k_t'', q_t'')$,
as the above path for the period $0 \leq t \leq T$ and $(k^*, 1)$ for $t > T$. Clearly
$(k_t'', q_t'')$ satisfies all three conditions of the maximum principle for the
present problem, including condition (44). Also, along this path, the integral
$J$ defined by (36) converges.
We may summarize the above discussion in the following theorem.

Figure 8.11. An Alternative Phase Diagram When qt < 1.


Theorem 8.B.1: For the above model, given an arbitrary initial value of $k$, there is
a unique optimal attainable path which is characterized as follows:

(i) $k_0 < k^*$: $s_t = 1$, and after $k_t$ reaches $k^*$, $k_t = k^*$ for all such $t$; that is, the path
$(k_t', q_t')$.
(ii) $k_0 > k^*$: $s_t = 0$, and after $k_t$ reaches $k^*$, $k_t = k^*$ for all such $t$; that is, the path
$(k_t'', q_t'')$.
(iii) $k_0 = k^*$: $k_t = k^*$ for all $t$ and $s_t = s^* \equiv \lambda k^*/f(k^*)$.
In other words, the optimal attainable path is the one that reaches the modified
golden rule path $(k^*, x^*)$ [where $x^* \equiv (1 - s^*) f(k^*)$] with a maximum speed and
stays thereafter on it. This optimal attainable path is illustrated in Figure 8.12.
Thus the solution path is such that if $k_0 < k^*$, the economy maximizes savings
from current income until time $T$ and after time $T$ maintains a constant saving
ratio $s^*$; if $k_0 > k^*$, the economy minimizes saving from current income until
time $T$ and after time $T$ maintains a constant saving ratio $s^*$. As is clear from
the above diagrams, the optimal saving ratio is a kind of bang-bang control or,
more precisely, "bang-off" (or "bang-coast") control.
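
The time $T$ at which the economy reaches $k^*$ can be computed in closed form if one is willing to assume the Cobb-Douglas form $f(k) = k^\alpha$. This is an assumption made here for illustration and is not part of the theorem. With $s_t = 1$, the variable $z = k^{1-\alpha}$ obeys the linear equation $\dot{z} = (1 - \alpha)(1 - \lambda z)$; with $s_t = 0$, $k_t$ simply decays at the rate $\lambda$. The function name and the sample numbers below are hypothetical.

```python
import math


def time_to_golden_rule(k0, alpha, lam, rho):
    """Return (initial saving ratio, T) for the assumed form f(k) = k**alpha.

    T is the time at which k_t first reaches k*; after T the saving ratio is s*.
    """
    z_star = alpha / (lam + rho)                 # equals k***(1 - alpha), from (51)
    k_star = z_star ** (1.0 / (1.0 - alpha))
    if k0 < k_star:
        # s_t = 1: z = k**(1 - alpha) satisfies dz/dt = (1 - alpha) * (1 - lam * z)
        z0 = k0 ** (1.0 - alpha)
        T = math.log((1.0 / lam - z0) / (1.0 / lam - z_star)) / ((1.0 - alpha) * lam)
        return 1.0, T
    if k0 > k_star:
        # s_t = 0: k_t = k0 * exp(-lam * t), from (55)
        return 0.0, math.log(k0 / k_star) / lam
    return lam * alpha / (lam + rho), 0.0        # already on the path: s_t = s*


# Hypothetical parameter values, one start below k* and one above it.
s_low, T_low = time_to_golden_rule(1.0, alpha=0.3, lam=0.05, rho=0.03)
s_high, T_high = time_to_golden_rule(10.0, alpha=0.3, lam=0.05, rho=0.03)
```

In both cases $T$ is finite, which is consistent with the claim that the path reaches $(k^*, 1)$ within a finite amount of time.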

Figure 8.12. An Illustration of the Solution Path (time paths of $s_t$ and $k_t$ plotted against $t$, with the switch at $t = T$).

FOOTNOTES

1. This subsection is taken from Takayama ([14] and [15]), which were originally
developed in his lectures at the University of Minnesota in the spring of 1966.
2. Our analysis can be modified to a two-sector economy (for example, agriculture and
industry). It can also be extended to an n-region or n-sector economy without too
much difficulty. An attempt for the two-sector economy is made by Bruno [3].
3. We can also consider the objective function in the form of c1Y1(T) + c2Y2(T)
where c; is some weight attached to the income of each region by the planner. The
analysis, in this case, will be analogous to the one which we develop below. It is
also possible to consider different propensities to save of each factor (labor, capital,
and so on). The analysis will be similar as long as we assume fixed coefficients of
production. For such variations, see, for example, Dorfman [5].
4. The maximum principle, as it is presented and proved by Pontryagin et al. [9],
gives necessary conditions for the optimum. Since the right-hand sides of equations
(3-a) and (3-b) are linear (hence concave) functions of $\beta$, $K_1$, and $K_2$, the maximum
principle is also sufficient for the optimum. See Mangasarian [8] and Section C
of this chapter.
5. We can interpret $p_i$ as the "shadow price" of investment in the $i$th region. Condition
(8) can be interpreted simply as investing the entire fund in the region where the
"shadow price" of investment is higher.
6. From (7'), it should be clear that $p_1(t) > 0$ and $p_2(t) > 0$, for $0 \leq t \leq T$, for the
optimal values of $\beta$ (0 or 1).
7. If $g_1 = g_2$ and $s_1 = s_2$, the two regions would look exactly the same to the planner,
so that the choice between the two would be indifferent.
8. Since the name of the region is arbitrary, this exhausts all the possible cases.
9. We can certainly extend our analysis to the case in which the target function is
more generally defined as fo (X/N)e-t'rdt + a1K1(T) + a2K2(T) where p is a time
discount factor for future consumption and a; is a weight attached by the planner
for the capital stock in the ith region at time T.
10. We choose the units of population properly so that the initial amount of pupulation
is equal to one.
11. We can show that $p_1(t)$ and $p_2(t)$ are positive, $0 < t < T$, if $\beta = 1$ or 0.
12. Or: (i) $g_1 = g_2$, $s_1 < s_2$; (ii) $g_1 > g_2$, $s_1 = s_2$; (iii) $g_1 > g_2$, A = 0.
13. If $g_1 > g_2$ and $s_1 > s_2$, then $[(1 - s_2)b_2 - (1 - s_1)b_1]$ is not necessarily negative.
14. This is because we can show that $p_1(t) > 0$ and $p_2(t) > 0$, for all $t < T$, if $\beta = 1$ or 0,
and that the optimal value of $\beta$ is either 0 or 1. In the Addendum to this subsection,
we show our proof for $p_1(t), p_2(t) > 0$, for all $t < T$, if $\beta = 1$ or 0. In the argument
that follows, we assume that $g_1 > g_2$. If $g_1 = g_2$ and $a > 0$, then $\dot{p}_1 - \dot{p}_2 > 0$ so
that $p_1 - p_2 < 0$ for all $t < T$. In other words, the optimal policy is to invest the
entire fund in the second region ($\beta = 0$). When $g_1 = g_2$ and $a < 0$, $\beta = 1$ always.
But this case is already covered by case i of footnote 12.
15. See the Addendum to this subsection.
16. We may recall that $g_1 > g_2$ by assumption.
17. This is due to the fact that $\dot{q}_1 - \dot{q}_2 > 0$ implies $q_1 - q_2 < 0$ for all $t < T$
since $q_1(T) = q_2(T) = 0$. If $\dot{q}_1 - \dot{q}_2 < 0$, then $q_1 > q_2$, which is a case that
should be excluded from the assumption of the present case ($q_1 < q_2$). We also
note that $q_i > 0$, $i = 1, 2$, for all $t < T$. Hence we are concerned with the
nonnegative orthant of the $(q_1$-$q_2)$-plane.
18. Again this is due to the fact that $\dot{q}_1 - \dot{q}_2 < 0$ implies $q_1 - q_2 > 0$ for all $t < T$ since
$q_1(T) = q_2(T) = 0$.
19. If we approximate the discount factor $\rho$ by the current market rate of interest, the
$g_i$'s may become much larger than $\lambda$ ($= n + \rho$), where $n$ is the rate of population
growth. Note also that if this is the case, both $q_i(t)$, $i = 1, 2$, decrease over time
for all $t < T$.
20. See the Addendum to this subsection, especially (27), and recall that pl(t) - Bi(t)e-Ar.
21. If the difference between gi and A is too big, then (A + 1) is negative and (26)
makes no sense. We may avoid such a possibility altogether by assuming that gi > A.
22. In this case, $b_1 > b_2$. Needless to say, $b_1 > b_2$ and $g_1 > g_2$ do not necessarily
imply $a < 0$.
23. In this case, $b_1 < b_2$.
24. Here we assume that $g_1 > g_2$.
25. Some of the computational procedure can be simplified by transforming the pi(t) to
qi(t) and noting (19).
26. Use (19), (29), (17'), and (27).
27. This part is also from my lectures at the University of Minnesota given in the spring
of 1966. This is a simplification of Uzawa's model [16], which involves the two
sectors, material output and knowledge. This simplification illuminates the signifi-
cance of a linear objective more dramatically.
28. The condition that $x_t \geq 0$ or, equivalently, $s_t \leq 1$, implicitly assumes that the
starvation level of consumption is zero. If we want to explicitly consider a positive
level of consumption as the starvation level, then we alter this condition to
$x_t \geq \bar{x} > 0$ or, equivalently, $s_t \leq \bar{s} < 1$, where $\bar{x}$ is the starvation level of
consumption and $\bar{s}$ is the corresponding propensity to save. However, this change will
not alter the subsequent analysis in any essential way as we simply assume $\bar{x} = 0$.
29. Using a proof similar to the one used in Mangasarian's theorem [8] , or in Theorem
8.C.5, we can show that these conditions are also sufficient for optimality. Note
also that condition (iii) below needs a proof, for Theorem 8.A.6 is concerned only
with the finite horizon problem.

REFERENCES

1. Arrow, K. J., "Optimal Capital Policy with Irreversible Investment," in Value,
Capital, and Growth, Papers in Honour of Sir John Hicks, ed. by J. N. Wolfe,
Edinburgh, Edinburgh University Press, 1968.
2. Arrow, K. J., "Applications of Control Theory to Economic Growth," in Mathematics of
the Decision Sciences, Part 2, ed. by G. B. Dantzig and A. F. Veinott, Providence,
R.I., American Mathematical Society, 1968.
3. Bruno, M., "Optimal Accumulation in Discrete Capital Models," in Essays on the
Theory of Optimal Economic Growth, ed. by K. Shell, Cambridge, Mass., M.I.T.
Press, 1967.
4. Cass, D., "Optimum Growth in an Aggregative Model of Capital Accumulation,"
Review of Economic Studies, XXXII, July 1965.
5. Dorfman, R., "Regional Allocation of Investment: Comment," Quarterly Journal
of Economics, LXXII, February 1963.
6. Intriligator, M. S., "Regional Allocation of Investment: Comment," Quarterly
Journal of Economics, LXXIII, November 1964.
7. Koopmans, T. C., "On the Concept of Optimal Economic Growth," in The Econometric
Approach to Development Planning, Pontificiae Academiae Scientiarvm
Scriptvm Varia, Amsterdam, North-Holland, 1965 (also "Discussion" on pp.
289-300).
8. Mangasarian, O. L., "Sufficient Conditions for the Optimal Control of Nonlinear
Systems," Journal of SIAM Control, vol. 4, February 1966.
9. Pontryagin, L. S., et al., The Mathematical Theory of Optimal Processes, tr. by
Trirogoff, New York, Interscience, 1962.
10. Rahman, M. A., "Regional Allocation of Investment," Quarterly Journal of Economics,
LXXII, February 1963.
11. Rahman, M. A., "Regional Allocation of Investment: Continuous Version," Quarterly Journal
of Economics, LXXV, February 1966.

12. Takayama, A., "On the Structure of the Optimal Growth Problem," Krannert
Institute Paper, No. 178, Purdue University, June 1967.
13. Takayama, A., "Per Capita Consumption and Growth: A Further Analysis," Western Economic
Journal, V, March 1967.
14. Takayama, A., "Regional Allocation of Investment: A Further Analysis," Quarterly Journal
of Economics, LXXXI, May 1967.
15. Takayama, A., "Regional Allocation of Investment: Corrigendum," Quarterly Journal of
Economics, LXXXII, August 1968 (also Krannert Institute Paper, No. 186, Purdue
University, August 1967).
16. Uzawa, H., "Optimal Technical Change in an Aggregative Model of Economic
Growth," International Economic Review, 5, January 1965.

Section C
FURTHER DEVELOPMENTS
IN OPTIMAL CONTROL THEORY

a. CONSTRAINT: $g[x(t), u(t), t] \geq 0$

In many applications of optimal control theory it is necessary to consider
explicitly constraints of the following form:
(1) $g_j[x(t), u(t), t] \geq 0, \quad j = 1, 2, \ldots, m$
or, in vector notation,
(2) $g[x(t), u(t), t] \geq 0$

We refer to constraints of the form (1) or (2) as the "g-constraints." In the

$g$-constraints, it is important to note that the control function $u(t)$ explicitly
enters the $g$-function. Throughout this section, our $g$-function contains $u(t)$.¹
With the above constraints, we consider the following control problem.
PROBLEM I:

Maximize: $I = \int_0^T f_0[x(t), u(t), t]\,dt$

Subject to: $\dot{x}_i = f_i[x(t), u(t), t], \quad i = 1, 2, \ldots, n$
$g_j[x(t), u(t), t] \geq 0, \quad j = 1, 2, \ldots, m$

Here $f_0$, the $f_i$'s, and the $g_j$'s are assumed to be continuously differentiable
in $(x, u, t)$-space. The functions $x_i(t)$, $i = 1, 2, \ldots, n$, $0 \leq t \leq T$, are continuous,
and the $u_i(t)$, $i = 1, 2, \ldots, r$, are piecewise continuous. In this
problem the final time $T$ is fixed and the terminal end points, the $x_i(T)$'s,
are not specified.

Here we shall not attempt a full exposition of the maximum principle with
the $g$-constraints (1). Instead, following Arrow [1], we simply give below a heuristic
explanation of the main result for Problem I.
First we consider the above problem without the constraints (1); the problem
is then reduced to the one discussed for Theorem 8.A.6. Hence we obtain the
necessary conditions described in that theorem. The essential part of Theorem
8.A.6 is the maximization of H with respect to u. That is, for each t,

(3) $H[\hat{x}(t), \hat{u}(t), t, p(t)] \geq H[\hat{x}(t), u(t), t, p(t)]$ for all $u(t) \in U$
where

(4) $H[x(t), u(t), t, p(t)] \equiv f_0 + \sum_{i=1}^{n} p_i f_i$

Now we add constraints (1) to the problem and consider the maximization of
H as a constrained maximum problem, that is, the problem of maximizing H
subject to the constraints (1). Thus (3) may now be replaced by
(5) $\hat{u}(t)$ maximizes $H[\hat{x}(t), u(t), t, p(t)]$ (for each $t$)
subject to $g_j[\hat{x}(t), u(t), t] \geq 0$, $j = 1, 2, \ldots, m$, and $u(t) \in U$
Let us now define the Lagrangian $L$ (or the generalized Hamiltonian) by
(6) $L[x(t), u(t), t, p(t), q(t)] \equiv H[x(t), u(t), t, p(t)] + \sum_{j=1}^{m} q_j(t)\, g_j[x(t), u(t), t]$

where H is defined by (4), and the qj(t)'s are the "multipliers" associated with
the g-constraints. Then, in view of our discussions in Chapter 1, the maximization
of $L$ implies the following conditions for each $t$, provided the "constraint qualification" holds:²

(7) $\frac{\partial \hat{L}}{\partial u_i} = 0$, $i = 1, 2, \ldots, r$, where $\hat{L} \equiv L[\hat{x}(t), \hat{u}(t), t, p(t), q(t)]$

(8) $q(t) \cdot g[\hat{x}(t), \hat{u}(t), t] = 0$ and $q(t) \geq 0$, where $q(t) \equiv [q_1(t), \ldots, q_m(t)]$
or
$q_j(t)\, g_j[\hat{x}(t), \hat{u}(t), t] = 0$ and $q_j(t) \geq 0$ for each $j = 1, 2, \ldots, m$
($\because g_j[\hat{x}(t), \hat{u}(t), t] \geq 0$ for all $j$)
The constraint qualification was discussed in detail in Chapter 1. Here we simply
remind the reader of some conditions of the Arrow-Hurwicz-Uzawa theorem
[2] (Theorem 1.D.4).

Lemma: Any one of the following conditions provides the constraint qualification:

(i) The functions $g_j[x, u, t]$, $j = 1, 2, \ldots, m$, are all convex functions in $u$.
(ii) The functions $g_j[x, u, t]$, $j = 1, 2, \ldots, m$, are all linear functions in $u$.
(iii) The functions $g_j[x, u, t]$, $j = 1, 2, \ldots, m$, are all concave functions in $u$ and there
exists a $\bar{u} \in U$ such that $g_j[\hat{x}, \bar{u}, t] > 0$, for all $j$.
(iv) The functions $g_j[x, u, t]$, $j = 1, 2, \ldots, m$, satisfy the rank condition for each $t$;
that is, the rank of $[\partial g_j / \partial u_i]_E$ [evaluated at $(\hat{x}, \hat{u})$, where $E$ denotes that $j$ is
taken from the effective constraints] must be equal to the number of effective
constraints in $g_j(\hat{x}, \hat{u}, t) \geq 0$, $j = 1, 2, \ldots, m$.'

We can now state the necessary conditions for Problem I.

Theorem 8.C.1: Assuming that the constraint qualification holds, in order that
$\hat{u}(t)$ be a solution of Problem I with the corresponding state variable $\hat{x}(t)$, it is
necessary that there exist vector-valued functions $p(t) \equiv [p_1(t), p_2(t), \ldots, p_n(t)]$
and $q(t) \equiv [q_1(t), q_2(t), \ldots, q_m(t)]$⁵, where the $p_i(t)$'s are continuous and have
piecewise continuous derivatives and the $q_j(t)$'s are piecewise continuous and continuous
at all points of continuity of $\hat{u}(t)$, such that
(i) The function $p(t)$ together with $\hat{u}(t)$ and $\hat{x}(t)$ solve the following Hamiltonian
system:

(9) $\dot{x}_i = \frac{\partial \hat{L}}{\partial p_i}$ and $\dot{p}_i = -\frac{\partial \hat{L}}{\partial x_i}$, $i = 1, 2, \ldots, n$,
for each interval on which $\hat{u}(t)$ is continuous
(ii) $H[\hat{x}(t), \hat{u}(t), t, p(t)] \geq H[\hat{x}(t), u(t), t, p(t)]$
for all $u(t) \in U$ such that $g_j[\hat{x}(t), u(t), t] \geq 0$, $j = 1, 2, \ldots, m$.
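(iii) $\frac{\partial \hat{L}}{\partial u_i} = 0$, $i = 1, 2, \ldots, r$, and

(10) $q_j(t)\, g_j[\hat{x}(t), \hat{u}(t), t] = 0$, $q_j(t) \geq 0$, $j = 1, 2, \ldots, m$

(iv) $\frac{d\hat{L}}{dt} = \frac{\partial \hat{L}}{\partial t}$

(v) $p_i(T) = 0$, $i = 1, 2, \ldots, n$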
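
The conditions of the theorem can be checked numerically on a small example. The sketch below is only an illustration: the problem (maximize $\int_0^T [x - u^2/2]\,dt$ subject to $\dot{x} = u$, $x(0) = 0$, and the single constraint $g = 1 - u \geq 0$), the horizon $T = 2$, and the helper name `objective` are all assumptions made here, not taken from the text. For this problem $L = x - u^2/2 + pu + q(1 - u)$, so conditions (i) and (v) give $p(t) = T - t$, and condition (iii) gives $\hat{u} = \min(p, 1)$ with $q = \max(p - 1, 0)$.

```python
import numpy as np

# Hypothetical problem (not from the text):
#   maximize  integral_0^T [x(t) - u(t)^2 / 2] dt
#   subject to  dx/dt = u,  x(0) = 0,  g = 1 - u >= 0,  T fixed, x(T) free.
# Here L = x - u^2/2 + p*u + q*(1 - u).
T = 2.0
t = np.linspace(0.0, T, 2001)

p = T - t                      # from dp/dt = -dL/dx = -1 and p(T) = 0
u_hat = np.minimum(p, 1.0)     # maximizes H = x - u^2/2 + p*u subject to u <= 1
q = np.maximum(p - 1.0, 0.0)   # multiplier attached to the constraint 1 - u >= 0


def objective(u):
    """Trapezoid-rule value of the objective for a control path u on the grid t."""
    dt = np.diff(t)
    # State path from dx/dt = u with x(0) = 0.
    x = np.concatenate(([0.0], np.cumsum(0.5 * (u[1:] + u[:-1]) * dt)))
    integrand = x - 0.5 * u**2
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * dt))


stationarity = -u_hat + p - q      # dL/du along the candidate path, cf. condition (iii)
slackness = q * (1.0 - u_hat)      # q * g along the candidate path, cf. condition (10)

J_hat = objective(u_hat)
# Three feasible constant controls used only as comparison paths.
J_alternatives = {c: objective(np.full_like(t, c)) for c in (0.0, 0.5, 1.0)}
```

Both `stationarity` and `slackness` vanish along the candidate path, and `J_hat` exceeds the value of each constant comparison control. Since $L$ is concave in $(x, u)$ here, the necessary conditions also pick out the optimum in this particular example.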

REMARK: If $L$ is a concave function in $u$, then condition (iii) implies
condition (ii).' The function $L$ is concave in $u$ if, for example, $f_0$, the $f_i$'s,
and the $g_j$'s are all concave in $u$ and the $p_i$'s are all nonnegative.
REMARK: A typical situation in economics is that the control $u(t)$ is
constrained by the nonnegativity condition, that is,
(11) $u_i(t) \geq 0$, $i = 1, 2, \ldots, r$
We can treat this as a special case of the $g$-constraints in which the constraint function
is $u(t)$ itself, that is, $u(t) \geq 0$. Then the $L$-function can be written as
(12) $L = H + q(t) \cdot g + \mu(t) \cdot u(t)$
where

$H[x(t), u(t), t, p(t)] \equiv f_0[x(t), u(t), t] + \sum_{i=1}^{n} p_i(t) f_i[x(t), u(t), t]$

and
$q(t) \equiv [q_1(t), \ldots, q_m(t)]$, $g \equiv [g_1, \ldots, g_m]$, $g_j \equiv g_j[x(t), u(t), t]$,
$j = 1, 2, \ldots, m$, $\mu(t) \equiv [\mu_1(t), \ldots, \mu_r(t)]$

Then condition (iii) [with (10)] of the above theorem implies'

(13) $\frac{\partial \tilde{L}}{\partial u_i} \leq 0$ and $\frac{\partial \tilde{L}}{\partial u_i}\,\hat{u}_i = 0$, $i = 1, 2, \ldots, r$
where

$\hat{H} \equiv H[\hat{x}(t), \hat{u}(t), t, p(t)]$ and $\tilde{L} \equiv \hat{H} + \sum_{j=1}^{m} q_j(t)\, g_j[\hat{x}(t), \hat{u}(t), t]$

Conversely, if (13) holds, then we can easily find multipliers $\mu_i(t) \geq 0$,
$i = 1, 2, \ldots, r$, so that condition (iii) of the above theorem is satisfied. An
alert reader may have realized that this procedure and condition (13) are
analogous to those discussed in connection with the nonnegative quasi-saddle-point
condition in Chapter 1, Section D.
REMARK: It should also be realized that Pontryagin's maximum principle
as discussed in Section A can be considered as a special case of Theorem
8.C.1, in which $g_j[x(t), u(t), t] \geq 0$ takes the form $g_j[u(t)] \geq 0$, $j = 1, 2, \ldots, m$,
or the constraint region $U$ is restricted by the $g$-constraint.
In many problems of economics, it may be desirable to impose the following
condition explicitly:'
(14) $x_i(T) \geq 0$, $i = 1, 2, \ldots, n$
To do so, we first alter the objective functional of Problem I as follows:

(15) $J \equiv \int_0^T f_0[x(t), u(t), t]\,dt + \sum_{i=1}^{n} c_i x_i(T)$

where $T$ is fixed, $x(T)$ is not specified, and the $c_i$, $i = 1, 2, \ldots, n$, are some fixed
constants. The transversality condition for Problem I with this change in the
objective is

(16) $p_i(T) = c_i$, $i = 1, 2, \ldots, n$

instead of condition (v) of the above theorem. Now suppose that each $c_i$ is chosen
such that it is a sufficiently large positive number if $x_i(T) < 0$ and is zero if
$x_i(T) \geq 0$. This choice of the $c_i$'s amounts to putting a prohibitively high penalty on
negative $x_i(T)$'s so that in the optimal program, negative $x_i(T)$'s are avoided. Thus
we can guarantee $x_i(T) \geq 0$ for all $i$. In view of this choice of the $c_i$'s, the
transversality condition (16) should be rewritten as¹⁰

(17) $p_i(T) \geq 0$ and $p_i(T)\hat{x}_i(T) = 0$, $i = 1, 2, \ldots, n$

Hence we obtain the following corollary of Theorem 8.C.1.

Corollary: Assuming that the constraint qualification is satisfied, in order that
$[\hat{x}(t), \hat{u}(t)]$ be a solution pair of Problem I, with the additional constraint $x_i(T) \geq 0$,
$i = 1, 2, \ldots, n$, it is necessary that there exist continuous (and bounded) vector-valued
functions $p(t)$ and $q(t)$ such that conditions (i), (ii), (iii), and (iv) of the previous theorem
hold and condition (17) holds instead of condition (v).
REMARK: If the objective functional for Theorem 8.C.1 is replaced by the
one with an infinite horizon,

(18) $I \equiv \int_0^{\infty} f_0[x(t), u(t), t]\,dt$

then assuming that an optimal policy exists, all the conclusions of the above
corollary except condition (17) and all the conclusions of Theorem 8.C.1
except condition (v) hold. As remarked at the end of Section A, the appropriate
conditions which replace the transversality conditions (17) or (v)
(of Theorem 8.C.1) for the infinite horizon problem are not yet known in a
general form. So far it is necessary to prove such conditions for each case.
In other words, conditions such as

$\lim_{t \to \infty} p(t) \geq 0$ and $\lim_{t \to \infty} p(t) \cdot \hat{x}(t) = 0$ [or $\lim_{t \to \infty} p(t) = 0$]

are, in general, false. Halkin's example, which we discussed at the end of
Section A, and (Koopmans) condition (66) of Section A constitute counterexamples.

b. HESTENES' THEOREM
In an important paper [5] and later in a book [6] that discusses the
relation between the classical calculus of variations and optimal control theory,
Hestenes presented a general formulation of the necessary conditions for optimality
in optimal control theory and the proofs of his major theorems. The formulation
is general enough to cover constraints of the type $g[x(t), u(t), t] \geq 0$, equality
constraints, integral constraints with both inequalities and equalities, as well as
ordinary differential equation constraints of the Pontryagin type. The formulation
also introduces the "control parameter."
An example of the integral constraint problem follows.
PROBLEM II:

Maximize (over $u(t) \in U$): $\int_0^T f_0[x(t), u(t), t]\,dt$

Subject to: $\dot{x}_i = f_i[x(t), u(t), t]$, $i = 1, 2, \ldots, n$

$x_i(0) = x_i^0$, and

(19) $J_k[x, u] \equiv \int_0^T h_k[x(t), u(t), t]\,dt \geq 0$, $k = 1, 2, \ldots, l$

Here $T$ is fixed and the $x_i(T)$'s are unspecified. The integral constraint in
the above problem is stated in the form of an inequality.¹¹
Let $[\hat{x}(t), \hat{u}(t)]$ be a solution pair of the above problem. Assuming all
functions, $f_0$, the $f_i$'s, and the $h_k$'s, are continuously differentiable with respect
to their arguments, we have the following theorem describing the necessary
conditions of optimality.

Theorem 8.C.2: Suppose $[\hat{x}(t), \hat{u}(t)]$ is a solution of the above problem. Assume
that the constraint qualification holds. Then there exist multipliers $p_0$, $p_i(t)$, $i = 1,
2, \ldots, n$, $\lambda_k$, $k = 1, 2, \ldots, l$, not vanishing simultaneously on $0 \leq t \leq T$, and a function
$H$,
(20) $H[x(t), u(t), t, p(t)] \equiv p_0 f_0[x(t), u(t), t] + \sum_{i=1}^{n} p_i f_i[x(t), u(t), t] + \sum_{k=1}^{l} \lambda_k h_k[x(t), u(t), t]$
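such that the following relations hold:

(i-a) The multipliers $p_0$, $\lambda_k$, $k = 1, 2, \ldots, l$, are constants and $\lambda_k \geq 0$, $k = 1, 2, \ldots, l$,
with $\lambda_k J_k[\hat{x}, \hat{u}] = 0$.
(i-b) The multipliers $p_i(t)$, $i = 1, 2, \ldots, n$, are continuous and have piecewise
continuous derivatives.
(ii) The functions $\hat{x}(t)$, $\hat{u}(t)$, $p(t)$ satisfy the equations

(21) $\dot{x}_i = \frac{\partial \hat{H}}{\partial p_i}$, $\dot{p}_i = -\frac{\partial \hat{H}}{\partial x_i}$, $i = 1, 2, \ldots, n$

where $\hat{H}$ denotes $H$ evaluated at $[\hat{x}(t), \hat{u}(t), t, p(t)]$. Moreover, we have

(22) $\frac{d\hat{H}}{dt} = \frac{\partial \hat{H}}{\partial t}$, on each interval in which $\hat{u}(t)$ is continuous

(iii) The relation
(23) $H[\hat{x}(t), \hat{u}(t), t, p(t)] \geq H[\hat{x}(t), u(t), t, p(t)]$, for all $u(t) \in U$
holds.
(iv) The transversality condition

(24) $p_i(T) = 0$, $i = 1, 2, \ldots, n$

holds.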

REMARK: If the terminal end points, the $x_i(T)$'s, are fixed such that $x_i(T) =
x_i^T$, $i = 1, 2, \ldots, n'$, where $n' \leq n$, then the above transversality condition
(iv) is replaced by
(25-a) $x_i(T) = x_i^T$, $i = 1, 2, \ldots, n'$
and
(25-b) $p_i(T) = 0$, $i = n' + 1, n' + 2, \ldots, n$
EXAMPLE: Consider a consumer who wishes to maximize the sum of his
satisfaction from consumption over his lifetime. Assume, for the sake of
simplicity, that he knows that his life span is $T$, and that he also knows
the time path of the price vector $p_t$ of his consumption bundle $c_t$ and his
income $y_t$, over his entire life span. Let $r$ be the market rate of interest
which is assumed to be a positive constant. Assume that this consumer is
"competitive" (that is, "small" enough relative to the economy) so that
his choice of $c_t$ for any $t$ will not affect the $p_t$ and $r$ that prevail in the
market. Let $M$ be his total (discounted) income; that is, $M \equiv \int_0^T e^{-rt} y_t\,dt$.
Let a differentiable real-valued function $u(c_t)$ represent his satisfaction
from the consumption vector $c_t$. Let $C$ be his consumption set. His problem
is to choose the time path of consumption $c_t$ from $C$ such as to maximize
his satisfaction over time subject to his budget constraint. That is,
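Maximize (over $c_t$): $\int_0^T e^{-\rho t} u(c_t)\,dt$

Subject to: $\int_0^T p_t \cdot c_t\, e^{-rt}\,dt \leq M$ [or $\int_0^T \left( \frac{M}{T} - p_t \cdot c_t\, e^{-rt} \right) dt \geq 0$]

and $c_t \in C$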
where it is assumed that the consumer is not interested in leaving any
bequest to his children.¹² Here $\rho$ is the discount factor, which is assumed
to be constant over time, and $\rho \geq 0$; $\rho > 0$ indicates a preference for present
consumption. In this problem, $c_t$ is the control variable and there is no
state variable. In order to apply Theorem 8.C.2, we define a function

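(26) $H[c_t] \equiv p_0 e^{-\rho t} u(c_t) + \lambda \left[ \frac{M}{T} - p_t \cdot c_t\, e^{-rt} \right]$

where $\lambda$ is the multiplier corresponding to the integral budget constraint.
Then in view of the above theorem, we obtain the following necessary
conditions for $\hat{c}_t$ to be a solution:

(27) $p_0 e^{-\rho t} u(\hat{c}_t) + \lambda \left( \frac{M}{T} - p_t \cdot \hat{c}_t\, e^{-rt} \right) \geq p_0 e^{-\rho t} u(c_t) + \lambda \left( \frac{M}{T} - p_t \cdot c_t\, e^{-rt} \right)$
for all $c_t \in C$, where $\lambda \geq 0$, and

(28) $\lambda \left[ M - \int_0^T p_t \cdot \hat{c}_t\, e^{-rt}\,dt \right] = 0$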
0
T

If $p_0 = 0$, then $\lambda > 0$ (the multipliers do not vanish simultaneously).
Then (27) is reduced to $p_t \cdot \hat{c}_t \leq p_t \cdot c_t$ for all $c_t \in C$. Assuming that there
exist $c_t \in C$ such that $p_t \cdot c_t < p_t \cdot \hat{c}_t$ (the "cheaper point assumption"),
this cannot happen so that $p_0 > 0$. We can then choose $p_0 = 1$. Thus rewrite
(27) as follows:

(27') $e^{-\rho t} u(\hat{c}_t) + \lambda \left( \frac{M}{T} - p_t \cdot \hat{c}_t\, e^{-rt} \right) \geq e^{-\rho t} u(c_t) + \lambda \left( \frac{M}{T} - p_t \cdot c_t\, e^{-rt} \right)$

for all $c_t \in C$, where $\lambda \geq 0$. If there exist $c_t \in C$ such that $u(c_t) > u(\hat{c}_t)$ for
all $t$ ("nonsatiation"), then $\lambda > 0$. For if $\lambda = 0$, then we obtain $u(\hat{c}_t) \geq u(c_t)$
for all $c_t \in C$ from condition (27'), which contradicts the above nonsatiation
assumption. If $\lambda > 0$, then (28) implies

(28') $\int_0^T p_t \cdot \hat{c}_t\, e^{-rt}\,dt = M$

In other words, all of his income is spent over his lifetime. This is certainly
a natural consequence under the nonsatiation assumption.
Now rewrite (27') as

(29) $e^{-\rho t}[u(\hat{c}_t) - u(c_t)] \geq \lambda [p_t \cdot \hat{c}_t - p_t \cdot c_t]\, e^{-rt}$ for all $c_t \in C$
Thus, for $\lambda > 0$,
(30) $u(\hat{c}_t) \geq u(c_t)$ for all $c_t \in C$ such that $p_t \cdot \hat{c}_t \geq p_t \cdot c_t$
and

(31) $p_t \cdot \hat{c}_t \leq p_t \cdot c_t$ for all $c_t \in C$ such that $u(c_t) \geq u(\hat{c}_t)$
Condition (30) says that this consumer maximizes his satisfaction at each
instant of time over those consumptions whose values do not exceed the
value of the optimal consumption $\hat{c}_t$. Condition (31) says that, for the optimal
consumption bundle, his consumption expenditure is minimized at each
instant of time over those consumption bundles which would give him
satisfaction that is higher than or equal to the satisfaction obtained from $\hat{c}_t$.
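
A numerical sketch may help to make conditions (28') and (29) concrete. Everything specific below is an assumption chosen for illustration and does not come from the text: a single good with $p_t = 1$ for all $t$, the utility function $u(c) = \log c$, the consumption set $C$ equal to the positive reals, and the parameter values. Under these assumptions, maximizing (26) pointwise with $p_0 = 1$ gives $e^{-\rho t}/\hat{c}_t = \lambda e^{-rt}$, and the budget equation fixes $\lambda = (1 - e^{-\rho T})/(\rho M)$.

```python
import numpy as np

# Illustrative assumptions (not from the text): one good, p_t = 1 for all t,
# u(c) = log(c), C = positive reals, and the parameter values below.
T, r, rho, M = 10.0, 0.05, 0.03, 100.0
t = np.linspace(0.0, T, 5001)


def integrate(y):
    """Trapezoid rule on the grid t."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))


# Pointwise maximization of H = e^{-rho t} log(c) + lam * (M/T - c e^{-rt})
# gives e^{-rho t} / c = lam * e^{-rt}; lam is pinned down by the budget equation.
lam = (1.0 - np.exp(-rho * T)) / (rho * M)
c_hat = np.exp((r - rho) * t) / lam

# Discounted lifetime expenditure along the candidate path, cf. (28').
spent = integrate(c_hat * np.exp(-r * t))

# Check inequality (29) against randomly perturbed consumption paths.
rng = np.random.default_rng(0)
gap_min = np.inf
for _ in range(100):
    c_alt = c_hat * rng.uniform(0.2, 3.0, size=t.size)
    lhs = np.exp(-rho * t) * (np.log(c_hat) - np.log(c_alt))
    rhs = lam * (c_hat - c_alt) * np.exp(-r * t)
    gap_min = min(gap_min, float(np.min(lhs - rhs)))
```

Here `spent` agrees with `M` up to discretization error, and `gap_min` is nonnegative up to rounding, so (29) holds at every grid point for every sampled alternative. With logarithmic utility this reduces to the elementary inequality $\log k \leq k - 1$ for $k = c_t/\hat{c}_t$.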
The control variable $u(t)$ is a function of time $t$. In many cases it may so
happen that we can choose a variable that does not depend on $t$. Such a variable is
called a control parameter. Let $b = [b_1, b_2, \ldots, b_a]$ be an $a$-dimensional vector
which denotes the control parameter. Let $B \subset R^a$ be the set to which $b$ is restricted.
Consider the following problem.

PROBLEM III:¹³
Maximize (over $u(t) \in U$, $b \in B$): $\psi(b) + \int_0^T f_0[x(t), u(t), t]\,dt$

Subject to: $\dot{x}_i = f_i[x(t), u(t), t]$, $i = 1, 2, \ldots, n$

$g_j[x(t), u(t), t] \geq 0$, $j = 1, 2, \ldots, m$
$x_i(0) = x_i^0$ (fixed), $x_i(T) = x_i^T(b)$, $i = 1, 2, \ldots, n$
$T = T(b)$
The problem we considered before, the one with final time open and variable
terminal end points, can be considered as a special case of the above problem
in which $b = T$. The time optimal problem can also be considered as a special
case when $T = b$, $\psi(b) = 0$, $f_0 = -1$, or with $b = T$, $\psi(b) = -b$, $f_0 = 0$.
The problem with the fixed terminal end points is reduced to the case in
which $x_i^T(b) = x_i^T$ (= constant), $i = 1, 2, \ldots, n$. In economics, the problem
of investing a large fixed capital may be a once-and-for-all choice (which
may be the case in the peak-load problem). Then such an investment can
be considered as a control parameter (see the next section).
Suppose there exists a solution $[\hat{u}(t), \hat{b}]$ of the above problem with the
corresponding state variable $\hat{x}(t)$; then we have the following theorem, which gives
a set of necessary conditions for an optimum.
Theorem 8.C.3: Suppose $[\hat{x}(t), \hat{u}(t), \hat{b}]$ is a solution of the above problem and the
constraint qualification holds. Then there exist multipliers $p_0$, $p_i(t)$, $i = 1, 2, \ldots, n$,
$q_j(t)$, $j = 1, 2, \ldots, m$, not vanishing simultaneously on $0 \leq t \leq T$, and a function $L$,

(32) $L[x(t), u(t), t, p(t), q(t)] \equiv H[x(t), u(t), t, p(t)] + \sum_{j=1}^{m} q_j(t)\, g_j[x(t), u(t), t]$

where

(33) $H[x(t), u(t), t, p(t)] \equiv p_0 f_0[x(t), u(t), t] + \sum_{i=1}^{n} p_i f_i[x(t), u(t), t]$

such that the following relations hold:

(i-a) The multiplier $p_0$ is a nonnegative constant¹⁴ and the multipliers $p_i(t)$, $i = 1,
2, \ldots, n$, are continuous and have piecewise continuous derivatives.
(i-b) The multipliers $q_j(t)$, $j = 1, 2, \ldots, m$, are piecewise continuous and are continuous
at each point of continuity of $\hat{u}(t)$. Moreover, for each $j$,
(34) $q_j(t) \geq 0$, and $q_j(t)\, g_j[\hat{x}(t), \hat{u}(t), t] = 0$
This may be written as $q \geq 0$ and $q \cdot \hat{g} = 0$, where
$q \equiv [q_1, q_2, \ldots, q_m]$, $\hat{g} \equiv [\hat{g}_1, \hat{g}_2, \ldots, \hat{g}_m]$,
$\hat{g}_j \equiv g_j[\hat{x}(t), \hat{u}(t), t]$, $j = 1, 2, \ldots, m$
(ii) The functions $\hat{x}(t)$, $\hat{u}(t)$, $p_i(t)$, $i = 1, 2, \ldots, n$, $q_j(t)$, $j = 1, 2, \ldots, m$, satisfy the
Hamiltonian system

(35) $\dot{x}_i = \frac{\partial \hat{L}}{\partial p_i}$, $\dot{p}_i = -\frac{\partial \hat{L}}{\partial x_i}$, $i = 1, 2, \ldots, n$

for the interval on which $\hat{u}(t)$ is continuous and

(36) $\frac{\partial \hat{L}}{\partial u_i} = 0$, $i = 1, 2, \ldots, r$

where

$\hat{L} \equiv L[\hat{x}(t), \hat{u}(t), t, p(t), q(t)]$

Moreover, we have

(37) $\frac{d\hat{L}}{dt} = \frac{\partial \hat{L}}{\partial t}$
on each interval of continuity of $\hat{u}(t)$ and the function $\hat{L}$ is continuous on $0 \leq t \leq T$.

(iii) $H[\hat{x}(t), \hat{u}(t), t, p(t)] \geq H[\hat{x}(t), u(t), t, p(t)]$

for all $u(t) \in U$ such that $g_j[\hat{x}(t), u(t), t] \geq 0$, $j = 1, 2, \ldots, m$
(iv) (transversality condition)

(38) $-p_0 \frac{\partial \psi}{\partial b} - L_T \frac{\partial T}{\partial b} + \sum_{i=1}^{n} p_i(T) \frac{\partial x_i^T}{\partial b} = 0$

where
$L_T \equiv L[\hat{x}(T), \hat{u}(T), T, p(T), q(T)]$

REMARK: The last condition, (iv), summarizes (or generalizes) the simple
transversality conditions discussed in Section A. In particular,
(a) $T$ is fixed and $b_i = x_i^T$: $p_i(T) = 0$, $i = 1, 2, \ldots, n$.
(b) $T$ is unspecified and the $x_i^T$'s are fixed: Set $b = T$ and obtain

$L[\hat{x}(T), \hat{u}(T), T, p(T), q(T)] = 0$

REMARK: It can be shown that (37) is a consequence of (34) and (35).
REMARK: Note that if $T$ is fixed and $b_i = x_i^T$, $i = 1, 2, \ldots, n$, then
Problem III is reduced to Problem I; hence Theorem 8.C.3 is reduced to
Theorem 8.C.1. In this case, $p_0 > 0$ so that we can choose $p_0 = 1$. To carry
out a proof of this, note the transversality condition $p_i(T) = 0$, $i = 1, 2, \ldots, n$,
and suppose, for example, that the rank constraint qualification holds at
$t = T$ so that the rank of the matrix $[\partial g_j / \partial u_i]$ for $j \in E$ (where $E$ is the
set of indices for the effective constraints at $t = T$) is equal to the number
of the effective constraints (for $t = T$). Now suppose that $p_0$ is not positive
so that $p_0 = 0$. Since $p_i(T) = 0$ for all $i$, we obtain, for $t = T$,
$L = \sum_{j=1}^{m} q_j(T)\, g_j[\hat{x}(T), u(T), T]$. Hence condition (36) implies that
$\partial L / \partial u_k = \sum_{j=1}^{m} q_j(T)\, g_{jk}^T = 0$, $k = 1, 2, \ldots, r$, where
$g_{jk}^T \equiv \partial g_j / \partial u_k$ evaluated at $[\hat{x}(T), \hat{u}(T), T]$. Let
$q_E$ be the vector whose $j$th element is $q_j(T)$, $j \in E$. Note that $q_j(T) = 0$ for
$j \notin E$ from relation (34). Let $G_E$ be the matrix whose $(j, k)$th element is $g_{jk}^T$,
$j \in E$, of the above relation; then we obtain $q_E \cdot G_E = 0$. But the rank
constraint qualification means that the rank of $G_E$ is equal to the dimension
of $q_E$ so that $q_E = 0$; hence $q_j(T) = 0$ for all $j$. Therefore we have $p_0 = 0$,
$p_i(T) = 0$, $i = 1, 2, \ldots, n$, $q_j(T) = 0$, $j = 1, 2, \ldots, m$, which contradicts the
condition that the multipliers do not vanish simultaneously; hence, $p_0 > 0$.
If we assume the Slater type condition [condition (iii) of the lemma] instead
of the rank condition, the proof is simpler. To see this, suppose that there
exists a $\bar{u}(T) \in U$ such that $g_j[\hat{x}(T), \bar{u}(T), T] > 0$ for all $j$. Then condition
(34) implies $q_j(T) = 0$ for all $j$. Thus again the multipliers vanish simultaneously
for $t = T$ under the assumption $p_0 = 0$.

We are now ready to proceed to a more general theorem which is due to
Hestenes [5]. Again let $x(t)$, $t_0 \leq t \leq t_1$, be a state variable, which is a continuous
$n$-vector-valued function, let $u(t)$, $t_0 \leq t \leq t_1$, be a control variable which is a piecewise
continuous $r$-vector-valued function, and let $b$ be a control parameter which
is an $a$-dimensional vector. Here $t_0$ and $t_1$ are not necessarily fixed but are functions
of the control parameter $b$. It may be convenient to refer to the arc of an $(n + r + a)$-
dimensional vector $[x(t), u(t), b]$, $t_0 \leq t \leq t_1$, by a single letter $z$, that is,
(39) $z$: $[x(t), u(t), b]$, $t_0 \leq t \leq t_1$

The general problem to be considered here is that of maximizing a function

(40) $I_0[z] \equiv \psi_0(b) + \int_{t_0}^{t_1} f_0[x(t), u(t), b, t]\,dt$

$z$: $[x(t), u(t), b]$, $t_0 \leq t \leq t_1$

satisfying the following conditions
(41) $\dot{x}_i = f_i[x(t), u(t), b, t]$, $i = 1, 2, \ldots, n$
(42) $g_j[x(t), u(t), b, t] \geq 0$, $j = 1, 2, \ldots, m'$
(43) $g_j[x(t), u(t), b, t] = 0$, $j = m' + 1, m' + 2, \ldots, m$
(44) $I_k(z) \geq 0$, $k = 1, 2, \ldots, l'$
(45) $I_k(z) = 0$, $k = l' + 1, l' + 2, \ldots, l$

where $I_k$ is defined as

(46) $I_k \equiv \psi_k(b) + \int_{t_0}^{t_1} h_k[x(t), u(t), b, t]\,dt$, $k = 1, 2, \ldots, l$

and
(47) $t_0 = t_0(b)$, $t_1 = t_1(b)$
(48) $x_i(t_0) = x_i^0(b)$, $x_i(t_1) = x_i^1(b)$, $i = 1, 2, \ldots, n$

The problem thus formulated is called the optimal control problem of Bolza-
Hestenes. The name Bolza is introduced because a similar calculus of variations
problem was discussed by O. Bolza. This problem may also be referred to as the
problem of Hestenes. The solution of the above problem is denoted by
$\hat{z}$: $[\hat{x}(t), \hat{u}(t), \hat{b}]$, $t_0 \leq t \leq t_1$
We assume
(A-1) All functions, $\psi_0$, $f_0$, $f_i$'s, $g_j$'s, $\psi_k$'s, $h_k$'s, $t_0$, $t_1$, $x_i^0$'s, and $x_i^1$'s, are
continuously differentiable on a set $X$ of points in the $(x, u, b, t)$-space.
Let $X_0$ be a set of all elements $(x, u, b, t)$ in $X$ satisfying $g_j[x, u, b, t] \geq 0$,
$j = 1, 2, \ldots, m'$, $g_j(x, u, b, t) = 0$, $j = m' + 1, \ldots, m$. This set $X_0$ is called the
set of admissible elements. We assume further that
(A-2) The matrix
(49) $\left( \frac{\partial g}{\partial u}, \; \delta_{ij} g_j \right)$
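has rank $m$ at each element $[\hat{x}(t), \hat{u}(t), \hat{b}, t]$ in $X_0$, where $\delta_{ij}$ is "Kronecker's delta"
defined by $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ if $i \neq j$, and $\partial g / \partial u$ is the Jacobian matrix of
$g$ with respect to $u$ evaluated at $[\hat{x}(t), \hat{u}(t), \hat{b}, t]$.
The matrix (49) can be written out as

(50) $\begin{pmatrix} \frac{\partial g_1}{\partial u_1} & \frac{\partial g_1}{\partial u_2} & \cdots & \frac{\partial g_1}{\partial u_r} & g_1 & 0 & \cdots & 0 \\ \frac{\partial g_2}{\partial u_1} & \frac{\partial g_2}{\partial u_2} & \cdots & \frac{\partial g_2}{\partial u_r} & 0 & g_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots & \vdots & \vdots & \ddots & \vdots \\ \frac{\partial g_m}{\partial u_1} & \frac{\partial g_m}{\partial u_2} & \cdots & \frac{\partial g_m}{\partial u_r} & 0 & 0 & \cdots & g_m \end{pmatrix}$

In other words, this is an $m \times (m + r)$ matrix. The rank of a rectangular matrix
is defined as the number of linearly independent rows, which is equal to the
number of linearly independent columns. The above matrix has rank $m$ if and
only if the matrix

(51) $\left( \frac{\partial g_E}{\partial u} \right)$

has rank $s$, where $E$ is the set of indices in which the $g_j$'s are effective, that is,
(52) $E \equiv \{ j : g_j[x, u, b, t] = 0 \}$
and $s$ is the number of these effective constraints. In other words, (A-2) says that
the rank of $(\partial g_E / \partial u)$ is equal to the number of the effective constraints. If all
the $g_j$-constraints are inequality constraints (so that $m' = m$), then (A-2) amounts
to the rank constraint qualification discussed in subsection a [see condition (iv) of
the lemma and Arrow-Hurwicz-Uzawa [2]].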
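
The equivalence between the two rank tests can be tried out on small matrices. The following is a minimal sketch using NumPy; the function name `a2_rank_check`, the tolerance `tol`, and the sample Jacobians and constraint values are all hypothetical choices made for illustration. It builds the $m \times (m + r)$ matrix (50) at one point, extracts the rows of $\partial g / \partial u$ belonging to the effective constraints as in (51) and (52), and compares the two rank conditions.

```python
import numpy as np


def a2_rank_check(G, g, tol=1e-10):
    """Compare the two rank tests at a single point (x, u, b, t).

    G : (m, r) array, the Jacobian dg/du at the point (assumed given).
    g : (m,) array, the constraint values g_j at the point.
    Returns a pair of booleans:
      (rank of [G, diag(g)] equals m,  rank of G_E equals s).
    """
    G = np.asarray(G, dtype=float)
    g = np.asarray(g, dtype=float)
    m = G.shape[0]
    big = np.hstack([G, np.diag(g)])      # the m x (r + m) matrix written out in (50)
    effective = np.abs(g) <= tol          # the index set E of (52)
    s = int(effective.sum())
    G_E = G[effective]                    # rows of dg/du with j in E, as in (51)
    rank_E = int(np.linalg.matrix_rank(G_E)) if s > 0 else 0
    return bool(np.linalg.matrix_rank(big) == m), bool(rank_E == s)


# Two effective constraints with independent gradients, one slack constraint.
ok = a2_rank_check([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 2.0])
# Two effective constraints with parallel gradients: both tests fail together.
bad = a2_rank_check([[1.0, 0.0], [2.0, 0.0], [1.0, 1.0]], [0.0, 0.0, 2.0])
```

In both sample cases the two tests agree. The reason is visible in (50): a slack constraint contributes a nonzero diagonal entry $g_j$ that makes its row independent of the others, so only the rows with $g_j = 0$ can cause a rank deficiency.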

Theorem 8.C.4 (Hestenes): Suppose that the arc

$\hat{z}$: $[\hat{x}(t), \hat{u}(t), \hat{b}]$, $t_0 \leq t \leq t_1$
is a solution of the above problem and suppose that (A-1) and (A-2) hold. Then there
exist multipliers
$p_0$, $p_i(t)$, $q_j(t)$, $\lambda_k$, $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m$; $k = 1, 2, \ldots, l$
not vanishing simultaneously on $t_0 \leq t \leq t_1$ and functions $L$ and $\Psi$ where

(53) $L[x(t), u(t), b, t, p(t), q(t)] \equiv H[x(t), u(t), b, t, p(t)] + \sum_{j=1}^{m} q_j(t)\, g_j[x(t), u(t), b, t]$

with $H$ defined as
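(54) $H[x(t), u(t), b, t, p(t)] \equiv p_0 f_0[x(t), u(t), b, t] + \sum_{i=1}^{n} p_i f_i[x(t), u(t), b, t] + \sum_{k=1}^{l} \lambda_k h_k[x(t), u(t), b, t]$
and

(55) $\Psi(b) \equiv p_0 \psi_0(b) + \sum_{k=1}^{l} \lambda_k \psi_k(b)$

such that the following relations hold:

(i-a) The multipliers $p_0$, $\lambda_k$, $k = 1, 2, \ldots, l$, are constants, and $\lambda_k \geq 0$, $k = 1,
2, \ldots, l'$, with

(56) $\lambda_k I_k[\hat{x}(t), \hat{u}(t), \hat{b}] = 0$, $k = 1, 2, \ldots, l'$

(i-b) The multipliers $p_i(t)$, $i = 1, 2, \ldots, n$, are continuous and have piecewise
continuous derivatives.
(i-c) The multipliers $q_j(t)$, $j = 1, 2, \ldots, m$, are piecewise continuous and are
continuous at each point of continuity of $\hat{u}(t)$. Moreover, for each $j$, $1 \leq j \leq m'$,
we have
(57) $q_j(t) \geq 0$, $q_j(t)\, g_j[\hat{x}(t), \hat{u}(t), \hat{b}, t] = 0$
The last equation may be rewritten as
(58) $q \cdot \hat{g} = 0$
where $q \equiv [q_1, q_2, \ldots, q_m]$ and $\hat{g} \equiv [\hat{g}_1, \hat{g}_2, \ldots, \hat{g}_m]$,
$\hat{g}_j \equiv g_j[\hat{x}(t), \hat{u}(t), \hat{b}, t]$, $j = 1, 2, \ldots, m$
(ii) The functions $\hat{x}(t)$, $\hat{u}(t)$, $p_i(t)$, $i = 1, 2, \ldots, n$, $q_j(t)$, $j = 1, 2, \ldots, m$, satisfy
the Euler-Lagrange-Hamiltonian equations

(59) $\dot{x}_i = \frac{\partial \hat{L}}{\partial p_i}$, $\dot{p}_i = -\frac{\partial \hat{L}}{\partial x_i}$, $i = 1, 2, \ldots, n$

(60) $\frac{\partial \hat{L}}{\partial u_i} = 0$, $i = 1, 2, \ldots, r$

where $\hat{L} \equiv L[\hat{x}(t), \hat{u}(t), \hat{b}, t, p(t), q(t)]$. Moreover, we have

(61) $\frac{d\hat{L}}{dt} = \frac{\partial \hat{L}}{\partial t}$
on each interval of continuity of $\hat{u}(t)$ and the function $\hat{L}$ is continuous on
$t_0 \leq t \leq t_1$.
(iii) The following formula holds:

(62) $H[\hat{x}(t), \hat{u}(t), \hat{b}, t, p(t)] \geq H[\hat{x}(t), u(t), \hat{b}, t, p(t)]$

for all $[\hat{x}(t), u(t), \hat{b}, t]$ in $X_0$. Or equivalently,
$H[\hat{x}(t), u(t), \hat{b}, t, p(t)]$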
is maximized with respect to $u$ subject to $[\hat{x}(t), u(t), \hat{b}, t] \in X$ and $g_j[\hat{x}(t),
u(t), \hat{b}, t] \geq 0$, $j = 1, 2, \ldots, m'$, $g_j[\hat{x}(t), u(t), \hat{b}, t] = 0$, $j = m' + 1, \ldots, m$.
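(iv) The transversality condition

(63) $-\frac{\partial \Psi}{\partial b_j} + \left[ -L_s \frac{\partial t_s}{\partial b_j} + \sum_{i=1}^{n} p_i(t_s) \frac{\partial x_i^s}{\partial b_j} \right]_{s=0}^{s=1} = \int_{t_0}^{t_1} \frac{\partial \hat{L}}{\partial b_j}\,dt$, $j = 1, 2, \ldots, a$

holds, where

$L_s \equiv L[\hat{x}(t_s), \hat{u}(t_s), \hat{b}, t_s, p(t_s), q(t_s)]$, $s = 0, 1$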

REMARK: If $f_0$, the $f_i$'s, $g_j$'s, and $h_k$'s do not contain $b$ explicitly (so that
$L$ does not contain $b$ explicitly), then the right-hand side of the transversality
condition is identically equal to zero. If, in addition, $t_0$ is fixed and does
not depend on $b$ and the $x_i^0$'s are fixed, the transversality condition is further
simplified to

(64) $-\frac{\partial \Psi}{\partial b_j} + \left[ -L_1 \frac{\partial t_1}{\partial b_j} + \sum_{i=1}^{n} p_i(t_1) \frac{\partial x_i^1}{\partial b_j} \right] = 0$, $j = 1, 2, \ldots, a$

Writing $t_1 = T$ and $x_i^1 = x_i^T$, we can obtain the transversality condition (38)
discussed in Theorem 8.C.3.

c. A SUFFICIENCY THEOREM
All the theorems we have discussed so far have been concerned with the
necessary conditions for optimality. A naturally important question is: Under
what conditions are these conditions also sufficient for optimality? In the case of
ordinary nonlinear programming and the calculus of variations, several important
sufficiency theorems exist; however, in each case there exists a simple but powerful
sufficiency theorem which implies optimality when the relevant functions are
concave. Here we prove such a theorem, which is a generalization of a theorem
due to Mangasarian [9].
We consider a problem in which $x(t)$ is the state variable and $u(t)$ is the
control variable. The function $x(t)$ is an $n$-dimensional vector-valued continuous
function and $u(t)$ is an $r$-dimensional vector-valued piecewise continuous function.
The problem is as follows:

Maximize: $I[x, u] \equiv \psi_0[x(0), x(T)] + \int_0^T f_0[x(t), u(t), t]\,dt$

Subject to:
(65) $\dot{x}_i = f_i[x(t), u(t), t]$, $i = 1, 2, \ldots, n$
(66) $g_j[x(t), u(t), t] \geq 0$, $j = 1, 2, \ldots, m$
(67) $\psi_k[x(0)] + \int_0^T h_k[x(t), u(t), t]\,dt \geq 0$, $k = 1, 2, \ldots, l'$
(68) $\phi_k[x(T)] + \int_0^T h_k[x(t), u(t), t]\,dt \ge 0$, $k = l' + 1, \ldots, l$
Note that in Mangasarian [9], there are no integral constraints. In this problem both the initial and the terminal time (that is, 0 and T) are assumed to be fixed. Regarding the vector [x(0), x(T)] as a control parameter b, we can apply the Hestenes theorem. Thus, under suitable assumptions, we have the following set of necessary conditions in order that $[\hat{x}(t), \hat{u}(t)]$ be optimal:

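(i) There exist multipliers $p_0$, $p(t) \equiv [p_1(t), \ldots, p_n(t)]$, $q(t) \equiv [q_1(t), \ldots, q_m(t)]$, and $\lambda \equiv [\lambda_1, \ldots, \lambda_l]$, such that
(i-a) $p_0$ and $\lambda$ are constants and $\lambda \ge 0$, with

(69) $\lambda \cdot \left[\hat{\phi} + \int_0^T \hat{h}\,dt\right] = 0$, where

$\hat{\phi} \equiv [\hat{\phi}_1, \ldots, \hat{\phi}_l]$, $\hat{\phi}_k \equiv \phi_k[\hat{x}(0)]$, $k = 1, 2, \ldots, l'$, $\hat{\phi}_k \equiv \phi_k[\hat{x}(T)]$, $k = l' + 1, \ldots, l$,

$\hat{h} \equiv [\hat{h}_1, \hat{h}_2, \ldots, \hat{h}_l]$, and $\hat{h}_k \equiv h_k[\hat{x}(t), \hat{u}(t), t]$, $k = 1, 2, \ldots, l$

(i-b) $p_i(t)$, $i = 1, 2, \ldots, n$, are continuous and have piecewise continuous derivatives.
(i-c) $q_j(t)$, $j = 1, 2, \ldots, m$, are piecewise continuous and are continuous at each point of continuity of $\hat{u}(t)$, and $q_j(t) \ge 0$ for all j with
(70) $q \cdot \hat{g} = 0$, where $\hat{g} \equiv [\hat{g}_1, \ldots, \hat{g}_m]$, $\hat{g}_j \equiv g_j[\hat{x}(t), \hat{u}(t), t]$, $j = 1, 2, \ldots, m$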
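(ii) The functions $\hat{x}(t)$, $\hat{u}(t)$, $p(t)$, and $q(t)$ satisfy the following differential equations:

(71) $\dot{\hat{x}}_i = \partial\hat{L}/\partial p_i$, $\dot{p}_i = -\partial\hat{L}/\partial x_i$, $i = 1, 2, \ldots, n$

(72) $\partial\hat{L}/\partial u_i = 0$, $i = 1, 2, \ldots, r$

where $\hat{L}$ is defined by
(73) $\hat{L} \equiv L[\hat{x}(t), \hat{u}(t), t, p(t), q(t)]$
and

(74) $L[x(t), u(t), t, p(t), q(t)] \equiv p_0 f_0[x(t), u(t), t] + \sum_{i=1}^{n} p_i(t) f_i[x(t), u(t), t] + \sum_{j=1}^{m} q_j(t) g_j[x(t), u(t), t] + \sum_{k=1}^{l} \lambda_k h_k[x(t), u(t), t]$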

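(iii) The following transversality conditions hold:

(75) $\partial\hat{\Psi}/\partial x_i(0) + p_i(0) = 0$, $i = 1, 2, \ldots, n$

(76) $\partial\hat{\Psi}/\partial x_i(T) - p_i(T) = 0$, $i = 1, 2, \ldots, n$
where

(77) $\Psi \equiv \phi_0[x(0), x(T)] + \lambda \cdot \phi$, where $\phi \equiv [\phi_1, \ldots, \phi_k, \ldots, \phi_l]$

and the derivatives in (75) and (76) are evaluated at $[\hat{x}(0), \hat{x}(T)]$.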

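Denote the gradient vector of $f_0$ with respect to x, evaluated at $[\hat{x}, \hat{u}]$, by $\hat{f}_{0x}$. Define $\hat{f}_{0u}$ similarly. Define the Jacobian matrix of $f \equiv [f_1, f_2, \ldots, f_n]$ (where $f \equiv f[x(t), u(t), t]$) with respect to x, evaluated at $(\hat{x}, \hat{u})$, by $\hat{f}_x$. Similarly, define $\hat{f}_u$, $\hat{g}_x$, $\hat{g}_u$. Denote the Jacobian matrix of h with respect to x (resp. u), evaluated at $(\hat{x}, \hat{u})$, by $\hat{h}_x$ (resp. $\hat{h}_u$). Also denote the Jacobian matrix of $\phi$ with respect to x(0) [resp. x(T)], evaluated at $[\hat{x}(0), \hat{x}(T)]$, by $\hat{\phi}_{x(0)}$ [resp. $\hat{\phi}_{x(T)}$]. Define $\hat{\phi}_{0x(0)}$ and $\hat{\phi}_{0x(T)}$ similarly. Then conditions (71) and (72) can be rewritten respectively as
(78) $\dot{\hat{x}} = \hat{f}$, $\dot{p} = -[p_0\hat{f}_{0x} + p \cdot \hat{f}_x + q \cdot \hat{g}_x + \lambda \cdot \hat{h}_x]$
(79) $p_0\hat{f}_{0u} + p \cdot \hat{f}_u + q \cdot \hat{g}_u + \lambda \cdot \hat{h}_u = 0$
Conditions (75) and (76) can be rewritten respectively as
(80) $\hat{\phi}_{0x(0)} + \lambda \cdot \hat{\phi}_{x(0)} + p(0) = 0$

(81) $\hat{\phi}_{0x(T)} + \lambda \cdot \hat{\phi}_{x(T)} - p(T) = 0$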

We now impose the following assumptions (here when we say that a vector-
valued function is concave, we mean that every component is concave):
(A-3) The functions $f_0[x, u, t]$, $f[x, u, t]$, $g[x, u, t]$, and $h[x, u, t]$ are all concave and differentiable in (x, u) for each $t \in [0, T]$.
(A-4) The functions $\phi_0[x(0), x(T)]$ and $\phi[x(0), x(T)]$ are concave and differentiable in x(0) and x(T).
Now we can state our theorem.

Theorem 8.C.5: For the above problem, if (A-3) and (A-4) hold, then all the necessary conditions (i), (ii), and (iii), stated above, are also sufficient for $[\hat{x}(t), \hat{u}(t)]$ to be a global optimum solution of the problem, provided that $p_0 = 1$ and the following additional condition holds:
(82) $p(t) \ge 0$ for all t
If the concavity in (A-3) and (A-4) is replaced by strict concavity, then the optimality is "unique."
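PROOF: Let x(t) and u(t) satisfy the constraints (65) to (68). The proof is carried out by writing a string of equalities and inequalities. For simplicity, denote $f_0[\hat{x}(t), \hat{u}(t), t]$ by $\hat{f}_0$ and $f_0[x(t), u(t), t]$ by $f_0$, and deal similarly with $f$, $\hat{f}$, $g$, $\hat{g}$, $h$, $\hat{h}$, $\phi$, $\hat{\phi}$, $\phi_0$, and $\hat{\phi}_0$.

$I[\hat{x}, \hat{u}] - I[x, u] = \int_0^T (\hat{f}_0 - f_0)\,dt + (\hat{\phi}_0 - \phi_0)$

(a) $\ge \int_0^T [(\hat{x} - x) \cdot \hat{f}_{0x} + (\hat{u} - u) \cdot \hat{f}_{0u}]\,dt + [\hat{x}(0) - x(0)] \cdot \hat{\phi}_{0x(0)} + [\hat{x}(T) - x(T)] \cdot \hat{\phi}_{0x(T)}$

(b) $= \int_0^T [-(\hat{x} - x) \cdot (p \cdot \hat{f}_x + q \cdot \hat{g}_x + \lambda \cdot \hat{h}_x + \dot{p}) - (\hat{u} - u) \cdot (p \cdot \hat{f}_u + q \cdot \hat{g}_u + \lambda \cdot \hat{h}_u)]\,dt - [\hat{x}(0) - x(0)] \cdot [p(0) + \lambda \cdot \hat{\phi}_{x(0)}] + [\hat{x}(T) - x(T)] \cdot [p(T) - \lambda \cdot \hat{\phi}_{x(T)}]$

(c) $= \int_0^T [-(\hat{x} - x) \cdot (p \cdot \hat{f}_x + q \cdot \hat{g}_x + \lambda \cdot \hat{h}_x) + p \cdot (\hat{f} - f) - (\hat{u} - u) \cdot (p \cdot \hat{f}_u + q \cdot \hat{g}_u + \lambda \cdot \hat{h}_u)]\,dt - [\hat{x}(0) - x(0)] \cdot (\lambda \cdot \hat{\phi}_{x(0)}) - [\hat{x}(T) - x(T)] \cdot (\lambda \cdot \hat{\phi}_{x(T)})$

(d) $\ge \int_0^T [p \cdot (f - \hat{f}) + q \cdot (g - \hat{g}) + \lambda \cdot (h - \hat{h}) + p \cdot (\hat{f} - f)]\,dt - [\hat{x}(0) - x(0)] \cdot (\lambda \cdot \hat{\phi}_{x(0)}) - [\hat{x}(T) - x(T)] \cdot (\lambda \cdot \hat{\phi}_{x(T)})$

(e) $\ge \int_0^T \lambda \cdot (h - \hat{h})\,dt - [\hat{x}(0) - x(0)] \cdot (\lambda \cdot \hat{\phi}_{x(0)}) - [\hat{x}(T) - x(T)] \cdot (\lambda \cdot \hat{\phi}_{x(T)})$

(f) $\ge \int_0^T \lambda \cdot (h - \hat{h})\,dt + \lambda \cdot (\phi - \hat{\phi})$

(g) $= \lambda \cdot \left[\int_0^T h\,dt + \phi\right]$

(h) $\ge 0$

Following are the reasons the above relations hold:
Inequality (a) by the differentiability and the concavity of $f_0$ and $\phi_0$.
Equation (b) by (78), (79), (80), and (81).
Equation (c) by integration by parts, (65), and the continuity of $x(t)$, $\hat{x}(t)$ and $p(t)$.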

Inequality (d) by the differentiability and concavity of f, g, and h, $q(t) \ge 0$, and (82) [note that this is the only step in the proof where (82) is used].
Inequality (e) by (70), $q(t) \ge 0$, and (66).
Inequality (f) by the concavity and differentiability of $\phi$, and by $\lambda \ge 0$.
Equation (g) by (69).
Inequality (h) by $\lambda \ge 0$, (67), and (68).

If the concavity of $f_0$, $\phi_0$, f, g, h, and $\phi$ is replaced by strict quasi-concavity, then the inequalities in steps (a), (d), and (f) are replaced by strict inequality for $[x, u] \ne [\hat{x}, \hat{u}]$. (Q.E.D.)
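The passage from step (b) to step (c) may be easier to follow when the integration by parts is written out. The following is only a sketch in the notation of the proof (hats denote evaluation along the candidate optimal path); it uses $\dot{\hat{x}} = \hat{f}$ and $\dot{x} = f$ from (65).

```latex
\begin{aligned}
-\int_0^T (\hat{x} - x)\cdot \dot{p}\,dt
  &= -\Big[(\hat{x} - x)\cdot p\Big]_0^T
     + \int_0^T p\cdot(\dot{\hat{x}} - \dot{x})\,dt \\
  &= -[\hat{x}(T) - x(T)]\cdot p(T)
     + [\hat{x}(0) - x(0)]\cdot p(0)
     + \int_0^T p\cdot(\hat{f} - f)\,dt .
\end{aligned}
```

The two boundary terms on the right cancel the $p(0)$ and $p(T)$ terms appearing in step (b), which leaves only the terms in $\lambda$ outside the integral in step (c).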
REMARK: The above proof is essentially the same as that in Mangasarian [9]. The idea of the proof is clearly due to Kuhn and Tucker [7].
REMARK: The assumption represented by condition (82) and the concavity of f can be replaced by the weaker condition that $p \cdot f$ is concave in (x, u). Note also that if $f_i(x, u, t)$ is linear in (x, u), then (82) is not needed for this i. [Recall step (d).]
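The role of condition (82) in step (d) can be illustrated numerically. The sketch below is not part of the original argument: it assumes a hypothetical concave function f(x, u) = sqrt(x) + sqrt(u) and checks the sign of the gap between the linearized term and p (f - f-hat) for a nonnegative and for a negative multiplier p.

```python
import math

# Hypothetical concave function (an assumption for illustration only).
def f(x, u):
    return math.sqrt(x) + math.sqrt(u)

def grad_f(x, u):
    # Partial derivatives of f with respect to x and u.
    return (0.5 / math.sqrt(x), 0.5 / math.sqrt(u))

def step_d_gap(p, z_hat, z):
    """p*[f_x*(x - x_hat) + f_u*(u - u_hat)] - p*(f(z) - f(z_hat)).

    Step (d) of the proof needs this quantity to be nonnegative.
    """
    fx, fu = grad_f(*z_hat)
    linear = fx * (z[0] - z_hat[0]) + fu * (z[1] - z_hat[1])
    return p * linear - p * (f(*z) - f(*z_hat))

z_hat = (1.0, 4.0)                              # candidate optimal point
points = [(0.25, 1.0), (2.0, 9.0), (4.0, 0.5)]  # arbitrary comparison points

gaps_pos = [step_d_gap(1.5, z_hat, z) for z in points]   # p >= 0
gaps_neg = [step_d_gap(-1.5, z_hat, z) for z in points]  # p < 0
```

With p nonnegative every gap is nonnegative, as step (d) requires; with p negative the sign flips, which is why (82), or the weaker concavity of p times f, is needed.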
REMARK: Any integral evaluated at a single point is obviously zero. Also, any integral evaluated on a set of countably many points is zero. Hence, for any (integrable) functions $y(t)$ and $y'(t)$, $\int_0^T y(t)\,dt = \int_0^T y'(t)\,dt$ if $y(t)$ and $y'(t)$ are different only for countably many points in [0, T], that is, if they are identical "almost everywhere." Therefore the "uniqueness" of an optimal solution in the optimal control problem only means the uniqueness almost everywhere. Recall that we made the same remark in connection with Theorem 5.B.4.
Suppose that the initial point is fixed as

$x(0) = x_0$

Then this condition replaces transversality condition (75). Similarly, if the right-hand end-point is fixed as $x(T) = x_T$, then this replaces transversality condition (76). With this remark, we can easily prove the following corollary.

Corollary:

(i) If $x(0) = x_0$ [or $x(T) = x_T$], and if $h_k \equiv 0$ for all k, then Theorem 8.C.5 holds with (69), (75) [or (76)], and $\lambda \ge 0$, and the vector $\lambda$, all deleted.
(ii) If $x(0) = x_0$ and $x(T) = x_T$, and if $h_k \equiv 0$ for all k, then Theorem 8.C.5 holds with (69), (75), (76), and $\lambda \ge 0$, and the vector $\lambda$, all deleted.

This completes our discussion on the theory of optimal control. Theorem 8.C.4 provides us with a set of necessary conditions for a quite general class of problems, and Theorem 8.C.5 guarantees the sufficiency of these necessary conditions for an important class of problems. However, there are still two important topics that we have left out. One is the problem of bounded state variables, which we mentioned already.¹⁶ And the other is the problem of the existence of an optimal control. Those interested in this topic of existence are referred to L. Cesari, "Existence Theorems for Optimal Solutions in Pontryagin and Lagrange Problems," Journal of SIAM Control, vol. 3, no. 3, 1966 (and perhaps his two articles in Transactions of American Mathematical Society, 124, September 1966).¹⁷

FOOTNOTES

1. When the g-function lacks the u(t) (the case of bounded state variables), the optimal control problems become quite difficult and tedious, and are beyond the scope of our exposition. The interested readers are referred to Hestenes ([6], chapter 8) and Russak [10], for example. See also footnote 16.
2. The constraint qualification is the qualification imposed on the constraint to guarantee "normality." In other words, if the constraint qualification does not hold, then L must be written as $L \equiv q_0 H + \sum_j q_j g_j$, where $q_0$ can be zero. The constraint qualification in this context can be interpreted as the qualification imposed on the constraint to guarantee $q_0 > 0$. (Note that if $q_0 > 0$, we can choose $q_0 = 1$, for we can always redefine the multipliers $q_j$ by $q_j/q_0$. Recall our discussion on the normality condition in Chapter 1.)
3. More precisely, the notation $\partial\hat{L}/\partial u_i$ means $\partial L/\partial u_i$ evaluated at $[\hat{x}(t), \hat{u}(t), t, p(t), q(t)]$ for each t. In the subsequent discussion, we use the notation $\partial\hat{L}/\partial x_i$, $\partial\hat{L}/\partial p_i$ in the same sense.
4. That constraint $g_j[x, u, t] \ge 0$ is effective means $g_j[\hat{x}, \hat{u}, t] = 0$. In the subsequent discussion we refer to condition (iv) as the rank constraint qualification.
5. The $p_i$'s and the $q_j$'s are often called multipliers. It is important to note that in the definition of the functions L and H, the multiplier corresponding to $f_0$ (that is, $p_0$) is set equal to one. This is due to the fact that the present problem corresponds to the one considered in Theorem 8.A.6. In other words, this is the case with variable right-hand end-points. As should be clear from our discussion in Section A, if the right-hand end-points are fixed, then in general we do not obtain $p_0 = 1$. In this case, H should be defined as $H \equiv p_0 f_0[x(t), u(t), t] + \sum_{i=1}^{n} p_i f_i[x(t), u(t), t]$ with $p_0 \ge 0$ (constant) and the definition of L should be modified accordingly.
6. More precisely, $\hat{L}$ is continuous along $[\hat{x}(t), \hat{u}(t), t]$ and has a piecewise continuous derivative given by $\partial\hat{L}/\partial t$ on each interval on which $\hat{u}(t)$ is continuous. From conditions (i) and (iii), we obtain $d\hat{L}/dt = (\partial\hat{L}/\partial x) \cdot \dot{\hat{x}} + (\partial\hat{L}/\partial u) \cdot \dot{\hat{u}} + (\partial\hat{L}/\partial t) + (\partial\hat{L}/\partial p) \cdot \dot{p} + (\partial\hat{L}/\partial q) \cdot \dot{q} = (\partial\hat{L}/\partial t) + (\partial\hat{L}/\partial q) \cdot \dot{q} = (\partial\hat{L}/\partial t) + \hat{g} \cdot \dot{q}$, where $\hat{g} \equiv g[\hat{x}(t), \hat{u}(t), t]$. If $\hat{g}_j = 0$, then $\hat{g}_j\dot{q}_j = 0$. On the other hand, if $\hat{g}_j > 0$ on some interval, then $q_j\hat{g}_j = 0$ [condition (iii)] means $q_j(t) = 0$ (= constant) for this interval so that $\dot{q}_j = 0$ for this interval. Thus we have $\hat{g}_j\dot{q}_j = 0$ for this interval. In other words, we have $\hat{g}_j\dot{q}_j = 0$ for all j so that $d\hat{L}/dt = \partial\hat{L}/\partial t$.
7. Clearly, if L is not concave in u, then condition (iii) does not necessarily imply
condition (ii). It is important to note that condition (ii) always implies condition (iii)
provided that the constraint qualification holds.
8. From condition (iii), $\partial\hat{L}/\partial u_i = 0$ and $\mu_i(t) \ge 0$, $\mu_i(t)\hat{u}_i(t) = 0$, $i = 1, 2, \ldots, r$. But $\partial\hat{L}/\partial u_i = \partial\hat{L}^{+}/\partial u_i + \mu_i(t)$. From this, condition (13) follows.

9. Clearly such a condition is unnecessary if we consider the fixed terminal end points
problem [that is, the xi(T)'s are all fixed] .
10. There is a slight inaccuracy in the above argument to show (17): that is, the $c_i$'s are to be a priori fixed, and should not be chosen after the signs of the $x_i(T)$'s are determined. For a more accurate argument, see Arrow [1]. There is still another kind of inaccuracy. Since p(T) = 0 is not necessarily true anymore, strictly speaking, H should now be redefined as $H \equiv p_0 f_0 + \sum_{i=1}^{n} p_i f_i[x(t), u(t), t]$ where $p_0 \ge 0$ (and L should be redefined accordingly). (See the proof of Theorem 8.A.6.) However, following Arrow [1], we implicitly assume here the normality $p_0 > 0$, so that we can take $p_0 = 1$. Actually, in many problems with the constraint $x(T) \ge 0$ we can find a solution by ignoring this constraint and then observe that this constraint is satisfied by $\hat{x}(T) > 0$. In such a case, we may solve the problem by replacing condition (17) by the usual condition p(T) = 0. [If $\hat{x}(T) > 0$ in (17), we have p(T) = 0.]
11. The integral constraint, especially with equality, is often called the isoperimetric
constraint, gaining this name from a famous isoperimetric problem in the calculus
of variations which is concerned with finding the curve enclosing the greatest area
among all closed curves of a given length. "Isoperimetric" means "with the same
perimeter." Such a constraint is expressed in the integral form.
12. This integral constraint contains the assumption that the consumer can borrow
or lend any amount at the fixed rate of interest r. An alternative assumption is that
there is a bound (say, zero) on the amount that he can borrow. For further discus-
sions of the problem of the lifetime allocation of consumption, see, for example,
M. E. Yaari, "On the Consumer's Lifetime Allocation Process," International
Economic Review, 5, September 1964, and K. Avio, "Age-Dependent Utility in the
Lifetime Allocation Problem," Krannert Institute Paper, No. 260, Purdue University,
November 1969. Recall that the same example was used in Section A of Chapter 1.
13. We can formulate this problem in such a way that the $f_i$'s, $g_j$'s, and $h_k$'s all contain the control parameter b explicitly as well as x(t), u(t), and t. However, without loss of generality, we can also assume that b does not appear in the $f_i$'s, $g_j$'s, and $h_k$'s, since it can be eliminated by introducing the new state variables $x_{n+i}(t)$ subject to the conditions $\dot{x}_{n+i} = 0$ and $x_{n+i}(t^1) = b_i$, $i = 1, 2, \ldots, \sigma$.
14. It is important to note the possibility of po = 0.
15. Recall Theorem 1.C.3.
16. It is not difficult to illustrate the basic method involved in such a problem by a simple example. Suppose that we have the well-known constraint in economics, $x(t) \ge 0$, in addition to the usual f-, g- and h-constraints discussed in this section. If $x_i(t) > 0$, then the constraint is ineffective and can be disregarded. If $x_i(t) = 0$, then we must have $\dot{x}_i(t) \ge 0$ to satisfy the constraint. Since $\dot{x}_i = f_i(x, u, t)$, this amounts to adding an additional constraint $f_i(x, u, t) \ge 0$. Let $v_i(t)$ be the multiplier corresponding to this constraint, and define the Lagrangian by $L \equiv H + q \cdot g + v \cdot f$, where $H \equiv p_0 f_0 + p \cdot f + \lambda \cdot h$, $f \equiv (f_1, \ldots, f_n)$ and $v \equiv (v_1, \ldots, v_n)$. Then it is easy to see that the necessary conditions for optimality consist of the usual conditions described in Theorem 8.C.4 and $v(t) \ge 0$, $v(t) \cdot f[\hat{x}(t), \hat{u}(t), t] = 0$ and $v(t) \cdot \hat{x}(t) = 0$. Needless to say, if $\hat{x}(t) > 0$ then $v(t) = 0$ and the problem is reduced to the one without the constraint $x(t) \ge 0$.
17. See also the following works: A. F. Filipov, "On Certain Questions in the Theory of Optimal Control," Journal of SIAM Control, vol. 1, no. 1, 1962; R. A. Gambill, "Generalized Curves and the Existence of Optimal Controls," Journal of SIAM Control, vol. 1, no. 3, 1963; and E. B. Lee and L. Markus, Foundations of Optimal Control Theory, New York, Wiley, 1967.
TWO ILLUSTRATIONS 667

REFERENCES

1. Arrow, K. J., "Applications of Control Theory to Economic Growth," in Mathematics of the Decision Sciences, Part 2, ed. by G. B. Dantzig and A. F. Veinott, Providence, R.I., American Mathematical Society, 1968.
2. Arrow, K. J., Hurwicz, L., and Uzawa, H., "Constraint Qualifications in Nonlinear Programming," Naval Research Logistics Quarterly, vol. 8, January 1961.
3. Berkovitz, L. D., "Variational Methods in Problems of Control and Programming," Journal of Mathematical Analysis and Applications, 3, August 1961.
4. Guinn, T., "Weakened Hypotheses for the Variational Problem Considered by Hestenes," Journal of SIAM Control, vol. 3, no. 3, 1965.
5. Hestenes, M. R., "On Variational Theory and Optimal Control Theory," Journal of SIAM Control, vol. 3, no. 1, 1965.
6. Hestenes, M. R., Calculus of Variations and Optimal Control Theory, New York, Wiley, 1966.
7. Kuhn, H. W., and Tucker, A. W., "Non-linear Programming," Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, ed. by J. Neyman, Berkeley, Calif., University of California Press, 1951.
8. Lee, E. B., "A Sufficient Condition in the Theory of Optimal Control," Journal of SIAM Control, vol. 1, no. 3, 1963.
9. Mangasarian, O. L., "Sufficient Conditions for the Optimal Control of Nonlinear Systems," Journal of SIAM Control, 4, February 1966.
10. Russak, B., "On Problems with Bounded State Variables," Journal of Optimization Theory and Applications, 5, February 1970.

Section D
TWO ILLUSTRATIONS:
THE CONSTRAINT
$g[x(t), u(t), t] \ge 0$ AND THE USE OF
THE CONTROL PARAMETER

In this section we illustrate the theorems developed in Section C by considering two problems which will enhance the reader's understanding of these theorems. In particular, we discuss the optimal growth problem and the peak-load problem. The peak-load problem is also useful as an illustration of the use of the control parameter.

a. OPTIMAL GROWTH ONCE AGAIN¹

We again consider the problem of optimal growth because familiarity with this subject will help the reader to understand the theory developed in Section C. Here we discuss optimal growth problems with explicit consideration given to the inequality g-constraint. The notations of the problem are the following: $N_t$, labor force; $K_t$, capital stock; $I_t$, gross investment; $X_t$, consumption; n, rate of population growth; $\mu$, rate of capital depreciation; $\rho$, discount factor; F, production function; u, utility function; subscript t, time t. (Refer to Chapter 5, Section C.) Then our problem can be written as follows:

Maximize (with respect to $X_t$, $I_t$): $\int_0^\infty u(X_t/N_t)\,e^{-\rho t}\,dt$, $\rho > 0$

Subject to:
(1) $F(N_t, K_t) - X_t - I_t \ge 0$
(2) $\dot{K}_t + \mu K_t = I_t$
(3) $N_t = N_0 e^{nt}$
(4) $K_t \ge 0$
(5) $X_t \ge 0$
It is important to note the explicit introduction of the inequality constraint (1). This inequality means that the goods can be in excess supply. Previously we stated (1) in the form of an equality, signifying demand = supply equilibrium in the goods market. By thus stating (1) in the form of an equality, we could solve (1) with respect to $I_t$ so that we obtained $\dot{K}_t + \mu K_t = F(N_t, K_t) - X_t$ from (1) and (2). Hence we were able to eliminate $I_t$ from the system. Now (1) is stated in the form of an inequality and, therefore, elimination of $I_t$ from the system is impossible. Note that $I_t$ (as well as $X_t$) is now considered as a control variable. Assuming the linear homogeneity of F so that $F(N_t, K_t) = N_t f(k_t)$, where $k_t \equiv K_t/N_t$ and $f(k_t) \equiv F(1, k_t)$, we can rewrite the above problem as follows:

Maximize (with respect to $x_t$, $i_t$): $J \equiv \int_0^\infty u(x_t)\,e^{-\rho t}\,dt$, $\rho > 0$

Subject to:
(6) $f(k_t) - x_t - i_t \ge 0$
(7) $\dot{k}_t = i_t - \lambda k_t$, where $\lambda \equiv n + \mu$
(8) $k_t \ge 0$, $k_0$ is given
(9) $x_t \ge 0$
where $x_t \equiv X_t/N_t$ and $i_t \equiv I_t/N_t$. Equation (7) is obtained from (2) and (3). We retain the assumptions made before for this problem: $f'(k) > 0$, $f''(k) < 0$ for all k, $f'(0) = \infty$, $f'(\infty) = 0$, $f(0) = 0$, and $u''(x) < 0$ for all x. Thus f is strictly concave in k, and u is strictly concave in x. Viewing this problem as an optimal control problem, $x_t$ and $i_t$ are the control variables and $k_t$ is the state variable. Here $k_\infty$ is not fixed. We first proceed with our analysis without explicit consideration of the state variable constraint (8). Introducing the multipliers $p_t$, $r_t$ and $v_t$, we define the function L as follows:
(10) $L \equiv L[k_t, x_t, i_t, t, p_t, r_t, v_t] \equiv u(x_t)e^{-\rho t} + p_t(i_t - \lambda k_t) + r_t[f(k_t) - x_t - i_t] + v_t x_t$
Note that the rank constraint qualification is trivially satisfied, for we can observe²

$\begin{vmatrix} \partial[f(k) - x - i]/\partial x & \partial[f(k) - x - i]/\partial i \\ \partial x/\partial x & \partial x/\partial i \end{vmatrix} = \begin{vmatrix} -1 & -1 \\ 1 & 0 \end{vmatrix} = 1 \ne 0$

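Then, in view of Theorem 8.C.1, the solution $[\hat{k}_t, \hat{x}_t, \hat{i}_t]$ of the above problem must satisfy the following conditions:³

(i) The variables $\hat{k}_t$, $\hat{x}_t$, $\hat{i}_t$, $p_t$, $r_t$, and $v_t$ must satisfy the Euler-Lagrange-Hamiltonian equations

(11) $\dot{\hat{k}}_t = \partial\hat{L}/\partial p_t$, $\dot{p}_t = -\partial\hat{L}/\partial k_t$

(12) $\partial\hat{L}/\partial x_t = 0$ and $\partial\hat{L}/\partial i_t = 0$, where $\hat{L} \equiv L[\hat{k}_t, \hat{x}_t, \hat{i}_t, t, p_t, r_t, v_t]$

(ii) The relations

(13) $r_t \ge 0$, $r_t[f(\hat{k}_t) - \hat{x}_t - \hat{i}_t] = 0$
and

(14) $v_t \ge 0$, $v_t\hat{x}_t = 0$
hold.

(iii) $H[\hat{k}_t, \hat{x}_t, \hat{i}_t, p_t] \ge H[\hat{k}_t, x_t, i_t, p_t]$

for all $[\hat{k}_t, x_t, i_t]$ which satisfy $f(\hat{k}_t) - x_t - i_t \ge 0$ and $x_t \ge 0$, where

(15) $H[k_t, x_t, i_t, p_t] \equiv u(x_t)e^{-\rho t} + p_t(i_t - \lambda k_t)$

(iv) The right-hand end-point condition
(16) $\lim_{t \to \infty} p_t \ge 0$ and $\lim_{t \to \infty} p_t\hat{k}_t = 0$

must hold. (See Arrow [1], pp. 92-93, and recall our discussion at the end of Section A.)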

Condition (11) can be written as equation (7) for the optimal path and

(17) $\dot{p}_t = \lambda p_t - r_t f'(\hat{k}_t)$

Also (12) can be written as

(18) $u'(\hat{x}_t)e^{-\rho t} - r_t + v_t = 0$
and
(19) $p_t - r_t = 0$
Clearly condition (iii) implies relations (12), (13), and (14) under the assumption that the constraint qualification holds. Moreover, the converse [that is, (12), (13), and (14) imply condition (iii)] also holds, because L is (strictly) concave in $x_t$ and $i_t$ in view of the (strict) concavity of u. Note also that (19), in view of $r_t \ge 0$, implies

(20) $p_t \ge 0$
Moreover, (19) and (17) imply
(21) $\dot{p}_t = -p_t[f'(\hat{k}_t) - \lambda]$
Define a new variable $q_t$ by

(22) $q_t \equiv p_t e^{\rho t}$ for $t < \infty$

Then $\dot{q}_t e^{-\rho t} - \rho e^{-\rho t}q_t = \dot{p}_t$, so that (21) can be rewritten as

(23) $\dot{q}_t = -q_t[f'(\hat{k}_t) - (\lambda + \rho)]$
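The step from (21) and (22) to (23) is a short substitution; the following sketch spells it out, with $\lambda \equiv n + \mu$ and $\rho$ the discount rate as defined above.

```latex
\begin{aligned}
p_t &= q_t e^{-\rho t}
  \;\Longrightarrow\;
  \dot{p}_t = \dot{q}_t e^{-\rho t} - \rho\, q_t e^{-\rho t}, \\
\dot{q}_t e^{-\rho t} - \rho\, q_t e^{-\rho t}
  &= -\,q_t e^{-\rho t}\,[\,f'(\hat{k}_t) - \lambda\,]
  \qquad \text{by (21)}, \\
\dot{q}_t &= \rho\, q_t - q_t\,[\,f'(\hat{k}_t) - \lambda\,]
  = -\,q_t\,[\,f'(\hat{k}_t) - (\lambda + \rho)\,].
\end{aligned}
```

Thus $q_t$ is the multiplier measured in current-value rather than present-value terms, and its law of motion no longer involves $e^{-\rho t}$ explicitly.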
In view of (14) and (19), (18) can be rewritten as

(24) $u'(\hat{x}_t)e^{-\rho t} \le p_t$
or
(24') $u'(\hat{x}_t) \le q_t$ for $t < \infty$

Then assuming $u'(x_t) > 0$ (nonsatiation) for all $x_t \ge 0$, or at least for the optimal per capita consumption path $\hat{x}_t$, (24) [or (24')] implies
(25) $p_t > 0$
or
(25') $q_t > 0$ for $t < \infty$
Then (13) combined with (19) implies
(26) $f(\hat{k}_t) - \hat{x}_t - \hat{i}_t = 0$

In other words, constraint (6) holds with equality. It is important to note that
the equality constraint is obtained as a result of the explicit recognition of the
nonsatiation assumption.
Combining (26) with (11), we now obtain

(27) $\dot{\hat{k}}_t = f(\hat{k}_t) - \lambda\hat{k}_t - \hat{x}_t$

If we further assume $\lim_{x \to 0, x > 0} u'(x) = \infty$ [or $\lim_{x \to 0, x > 0} u(x) = -\infty$], then from (24) [or condition (iii)] and the boundedness of the multiplier $p_t$, we must have
(28) $\hat{x}_t > 0$ for the solution path
Therefore, in view of (14), we must have $v_t = 0$ so that we obtain the following equation from (18) and (19):

(29) $u'(\hat{x}_t)e^{-\rho t} = p_t$
or
(29') $u'(\hat{x}_t) = q_t$
Combining (23), (27), and (29'), we can draw the phase diagram on either the (x-k)-plane or the (q-k)-plane. The rest of the analysis is the same as that carried out in Section A. Recall that the nonnegativity of the state variable, that is, condition (8), is satisfied along the optimal path which converges to the modified golden rule path. And along this path, the right-hand end-point condition (16) is satisfied, and the integral J converges.
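The rest point of the system (23), (27), (29') can be computed in closed form for particular functional forms. The sketch below is only an illustration under assumptions not made in the text: a Cobb-Douglas per capita production function f(k) = k^alpha with 0 < alpha < 1 and logarithmic utility u(x) = log x; the parameter values are hypothetical.

```python
def modified_golden_rule(alpha, n, mu, rho):
    """Stationary point of (23) and (27) for f(k) = k**alpha, u(x) = log(x).

    (23) with q-dot = 0 gives f'(k*) = lam + rho, where lam = n + mu.
    (27) with k-dot = 0 gives x* = f(k*) - lam * k*.
    (29') gives q* = u'(x*) = 1 / x*.
    """
    lam = n + mu
    k_star = (alpha / (lam + rho)) ** (1.0 / (1.0 - alpha))
    x_star = k_star ** alpha - lam * k_star
    q_star = 1.0 / x_star
    return k_star, x_star, q_star

def golden_rule_capital(alpha, n, mu):
    # Capital stock maximizing stationary consumption: f'(k) = n + mu.
    return (alpha / (n + mu)) ** (1.0 / (1.0 - alpha))

# Hypothetical parameter values, for illustration only.
alpha, n, mu, rho = 0.3, 0.01, 0.05, 0.03
k_star, x_star, q_star = modified_golden_rule(alpha, n, mu, rho)
k_gold = golden_rule_capital(alpha, n, mu)
```

Since rho > 0, the modified golden rule capital stock lies below the golden rule stock, and stationary consumption is positive, so (28) holds at the rest point.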
In the above analysis, it is not assumed that $i_t \ge 0$. In other words, $i_t$ or $I_t$ can be negative. This means $\dot{K}_t + \mu K_t$ can be negative. If $\dot{K}_t < 0$, then the economy may "eat up" the capital accumulated in the past. We may suppose that this is impossible. In other words, we introduce the assumption of the irreversibility of investment, that is, $I_t \ge 0$ (or $i_t \ge 0$). This means that investment once made in physical form cannot be converted into consumer goods; hence the economy cannot "eat up" the capital accumulated in the past. Such a problem is discussed by Arrow [2] and Arrow and Kurz [3], but we omit it here.

b. TWO PEAK-LOAD PROBLEMS⁴

Consider a monopoly which produces a single nonstorable good Y (say, electricity). The output $Y_t$ depends on the initial investment of fixed capital K and the vector of inputs of the variable factors L. We assume that there is a constant relation between $L_t$ and $Y_t$; that is, $L_t = aY_t$, where a is the vector of variable inputs required per unit of output. We assume that a is constant over t and Y. The subscript t refers to time, and hence it also signifies that the relevant variable is a function of time. We also assume that as $Y_t$ increases, the degree of utilization of the fixed capital increases. Let $bY_t$ denote the degree of such a utilization. Then there is a capacity limit given by the relation $bY_t \le K$. We assume that b is a positive constant. It is important to note that the suffix t is not attached to K; that is, K is not a function of time. The demand function for the output of the firm is given by $D(p_t, t)$, where $p_t$ is the price of the output at time t. Thus if $p_t$ is constant for all t (say, $p_t = p$), we can draw the time path of demand for the output as illustrated in Figure 8.13.
Suppose that the firm is required to produce an output that will meet the peak demand. If the firm builds a capacity which will meet the peak demand, then there will be an excess capacity during the nonpeak periods, for the output is nonstorable by the assumption of the peak-load problem. Such a loss of excess capacity can be reduced if the firm sets a higher price for a peak period, thus "flattening" the demand curve. One version of the peak-load problem is that of choosing the amount of initial investment K and the time path of price so as to minimize such a "loss."
Let w be the price vector of L and T be the planning horizon of the firm.⁵ For the sake of simplicity, we assume that the capital lasts for the period T with the same efficiency and w is constant over time and over the relevant range of output.⁶ We also assume that the initial purchase of capital stock costs the firm r dollars per unit of capital at each t.⁷ Assume that r is a positive number.
There are at least two types of targets that the firm might wish to achieve.
In one case, the firm wishes to maximize total social welfare over time. This may
be the case when the firm is owned by a public authority, for example. In the
other case, the firm wishes to maximize the total profit over time. This may be
the case when the firm is privately owned. The solution may be different in each
of the two cases; then the problem of optimal public regulation occurs.
First we consider the case in which the firm wishes to maximize social
welfare over time. This formulation of the peak-load problem seems to be more
common in the literature (see, for example, Williamson [11] and Steiner [8] ).
The definition of "social welfare" or at least its maximization will cause well-
known difficulties. We assume that the "optimum conditions" of production and
exchange are satisfied elsewhere in the economy in order to avoid the "second
best" digression. We also assume that the social welfare at each instant of time is measured by (total revenue) plus (consumers' surplus) minus (social cost). That is,

(30) $W_t = TR_t + S_t - TC_t$

where $W_t$ = social welfare, $TR_t$ = total revenue, $S_t$ = consumers' surplus, and $TC_t$ = total social cost, each at time t.

Figure 8.13. An Illustration of the Peak-Load Problem.
The demand function may be expressed as $Y_t = D(p_t, t)$. However, assuming $D_p \equiv \partial D/\partial p_t < 0$ for all $p_t$ and t, we can globally invert the function D, and we may write the demand function as⁸

(31) $p_t = P(Y_t, t)$, where $P_Y \equiv \partial P/\partial Y_t < 0$

Notice that the firm can select either the price policy of choosing the time path of
pt or the output policy of choosing the time path of Yt. However, in view of the
demand relation Yt = D(pt, t) or pt = P(Yt, t), the choice of one policy auto-
matically implies the choice of the other policy. In other words, it does not make
any difference whether we suppose the firm adopts the price policy or the output
policy. Thus if the firm adopts the price policy, it has a uniquely implied output
policy determined by Yt = D(pt, t). Here we suppose that the firm adopts the
output policy (that is, the policy of choosing the time path of Yt). The price policy
is then implied by pt = P(Yt, t). The demand function is illustrated in the tradi-
tional manner in Figure 8.14. Note that as t changes (say, from t1 to t2), the
demand curve shifts.
The total revenue plus consumers' surplus at time t, when $Y_t$ is chosen, is given by

(32) $TR_t + S_t = \int_0^{Y_t} P(y_t, t)\,dy_t$

We denote (32) as follows:

(33) $F(Y_t, t) \equiv \int_0^{Y_t} P(y_t, t)\,dy_t$

Figure 8.14. Welfare and Demand Curves in the Peak-Load Problem.

Assume again that w, r, a, and b are all constant over time and over the relevant range of output. Then total social cost at time t, that is, $TC_t$, is given by
(34) $TC_t = (w \cdot a)Y_t + rK$

Thus total social benefit over the period of time [0, T] is given by
(35) $W \equiv \int_0^T W_t\,dt = \int_0^T (TR_t + S_t - TC_t)\,dt = \int_0^T [F(Y_t, t) - (w \cdot a)Y_t - rK]\,dt$
The analysis with a positive future discount (that is, $W \equiv \int_0^T W_t e^{-\sigma t}\,dt$, $\sigma > 0$, where $\sigma$ is the social discount rate) is analogous to the subsequent analysis; hence it is left as an exercise to the interested reader.
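The construction in (32) to (35) can be made concrete with a simple numerical sketch. It assumes, purely for illustration, a time-invariant linear inverse demand P(y) = a - c y; the function names and parameter values are hypothetical and not taken from the text.

```python
def inverse_demand(y, a=10.0, c=1.0):
    # Hypothetical linear inverse demand P(y) = a - c*y.
    return a - c * y

def gross_benefit(Y, steps=10_000):
    """F(Y) = integral of P(y) from 0 to Y, as in (33), by the trapezoid rule."""
    h = Y / steps
    total = 0.5 * (inverse_demand(0.0) + inverse_demand(Y))
    for i in range(1, steps):
        total += inverse_demand(i * h)
    return total * h

def welfare_flow(Y, K, wa=2.0, r=1.0):
    """W_t = F(Y_t) - (w.a) Y_t - r K, the integrand of (35)."""
    return gross_benefit(Y) - wa * Y - r * K

# For linear demand, F(Y) = a*Y - c*Y**2/2, so F(4) = 32 when a = 10, c = 1.
F_at_4 = gross_benefit(4.0)

# Central difference check that F_Y equals the price P(Y).
eps = 1e-4
F_Y_at_4 = (gross_benefit(4.0 + eps) - gross_benefit(4.0 - eps)) / (2 * eps)
```

The finite-difference slope of F agrees with the price at that output, which is the property $F_Y = P(Y_t, t)$ used in the necessary conditions that follow.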
We are now ready to formulate the present version of the peak-load
problem.

PROBLEM I:

Maximize (with respect to $Y_t$, K): $W \equiv \int_0^T [F(Y_t, t) - (w \cdot a)Y_t - rK]\,dt$

Subject to:
(36) $K \ge bY_t$
and
(37) $Y_t \ge 0$

Viewing this as an optimal control problem, $Y_t$ is the control variable and K is the control parameter. There is no state variable in this problem. We assume, for the sake of simplicity, that K is perfectly divisible. In order to apply Hestenes' theorem, we define the following function L:
(38) $L \equiv L[Y_t, K, q_t, \mu_t] \equiv \phi_0[F(Y_t, t) - (w \cdot a)Y_t - rK] + q_t[K - bY_t] + \mu_t Y_t$
Here $\phi_0$, $q_t$ and $\mu_t$ are the multipliers. Using Hestenes' theorem, we now have the following necessary conditions for $\hat{Y}_t$ and $\hat{K}$ to be optimal:

(i) $\phi_0 \ge 0$ (constant), $q_t \ge 0$, $\mu_t \ge 0$, for all t, and $\phi_0$, $q_t$, and $\mu_t$ do not vanish simultaneously. Moreover, $\mu_t\hat{Y}_t = 0$, for all t.

(ii) $\partial\hat{L}/\partial Y_t = 0$, where $\hat{L} \equiv L[\hat{Y}_t, \hat{K}, q_t, \mu_t]$, for all t; that is,

(39) $\phi_0\{\hat{F}_Y - (w \cdot a)\} - bq_t \le 0$ and $[\phi_0\{\hat{F}_Y - (w \cdot a)\} - bq_t]\hat{Y}_t = 0$, for all t
where $\hat{F}_Y \equiv \partial F(\hat{Y}_t, t)/\partial Y_t$.
(iii) The following relations hold:
(40) $q_t(\hat{K} - b\hat{Y}_t) = 0$ and $\hat{K} \ge b\hat{Y}_t$, for all t
(iv) The following relation also holds:
(41) $\phi_0[F(\hat{Y}_t, t) - (w \cdot a)\hat{Y}_t - r\hat{K}] \ge \phi_0[F(Y_t, t) - (w \cdot a)Y_t - r\hat{K}]$
for all t, and for all $Y_t$ such that $\hat{K} \ge bY_t$ and $Y_t \ge 0$.
(v) The following transversality condition holds:
(42) $\int_0^T (-\phi_0 r + q_t)\,dt = 0$
Noting that $\hat{F}_Y = P(\hat{Y}_t, t)$ in view of (33), and assuming an interior solution for $\hat{Y}_t$ (that is, $\hat{Y}_t > 0$ for all t) so that $\mu_t = 0$ for all t, we can rewrite the relation (39) as follows:⁹
(43) $\phi_0[P(\hat{Y}_t, t) - (w \cdot a)] = bq_t$, for all t
Or writing $\hat{p}_t \equiv P(\hat{Y}_t, t)$, we have
(44) $\phi_0(\hat{p}_t - w \cdot a) = bq_t$, for all t
The rank constraint qualification for this problem is trivially satisfied, for we can observe
(45) $\partial(K - bY_t)/\partial Y_t = -b \ne 0$, for all t

Next we show $\phi_0 > 0$ so that we can take $\phi_0 = 1$. To see this, simply note the relation (44). If $\phi_0 = 0$, then (44) implies $q_t = 0$. Since we assumed $\hat{Y}_t > 0$ (the interior solution) (so that $\mu_t = 0$), this means that all the multipliers ($\phi_0$, $q_t$ and $\mu_t$) vanish simultaneously. This contradicts condition (i) in the above. Hence the relations (44), (41), and (42) can now be rewritten as follows:
(46) $P(\hat{Y}_t, t) - w \cdot a = bq_t$, for all t, or $q_t = [P(\hat{Y}_t, t) - w \cdot a]/b$, for all t
(47) $F(\hat{Y}_t, t) - (w \cdot a)\hat{Y}_t - r\hat{K} \ge F(Y_t, t) - (w \cdot a)Y_t - r\hat{K}$, for all t
or
(48) $F(\hat{Y}_t, t) - (w \cdot a)\hat{Y}_t \ge F(Y_t, t) - (w \cdot a)Y_t$, for all t
and for all $Y_t$ such that $\hat{K} \ge bY_t$, $Y_t \ge 0$. Also,

(49) $\int_0^T (q_t - r)\,dt = 0$, or $rT = \int_0^T q_t\,dt$

Relation (47) means that $W_t$ is maximized subject to the constraints at each instant of time.

Conditions (46), (47), (49), and (40) constitute necessary conditions for $(\hat{Y}_t, \hat{K})$ to be optimal. Assuming $\partial^2 F/\partial Y_t^2 = \partial P/\partial Y_t < 0$, $W_t$ is a concave function in $Y_t$ and K, so that condition (47) is implied from condition (46). Therefore, in view of Mangasarian's theorem (Theorem 8.C.5), conditions (40), (46), and (49) constitute a set of necessary and sufficient conditions for an optimum. In other words, conditions (40), (46), and (49) completely describe the solution of the problem, $\hat{Y}_t$, $q_t$, $\hat{K}$, $t \in [0, T]$. Notice that if $q_t > 0$ for all t, then $\hat{K} = b\hat{Y}_t$, $t \in [0, T]$ replaces (40). In general, $q_t$ can be zero for some t, although $q_t > 0$ holds over a certain period of time in view of (49).
If the firm has an existing stock of capital $\bar{K}$, then $\hat{K}$ is written as $\hat{K} \equiv \bar{K} + K_a$, where $K_a$ is the additional capital requirement. If $\hat{K} \ge \bar{K}$, then our analysis above follows word for word, except that it should be reinterpreted accordingly. If $\hat{K} < \bar{K}$, a slight modification of the analysis would be necessary, and r would presumably be zero. If r = 0, (49) implies $q_t = 0$ for "almost all" t (that is, for all t except for a countable number of isolated points in [0, T]), so that we have $P(\hat{Y}_t, t) = w \cdot a$ for almost all t. We proceed with our analysis for $\hat{K} \ge \bar{K}$.
From (46) and (49) we obtain

(50) $\int_0^T P(\hat{Y}_t, t)\,dt = T[w \cdot a + br]$

Let $A$ be a subset of $[0, T]$ in which $\hat{Y}_t = \bar{Y}$, where $\bar{Y} \equiv \hat{K}/b$. That is, $A$ is the set
of "top-peak" periods (the periods in which full capacity output is achieved).
Since $q_t = 0$ for $t \notin A$ in view of (40), we have
(51) $\hat{p}_t = w \cdot a$ for $t \notin A$
The implication is that optimal outputs are equal
and prices are unequal for $t \in A$, while optimal outputs are unequal and prices are
equal for $t \notin A$.
Using (46) and (49), we may rewrite (50) in the following form:

(52) $\int_A [P(\hat{Y}_t, t) - w \cdot a]\,dt = \int_A bq_t\,dt = brT$

where $\int_A$ denotes the integration in $t$ over the range of $A$.
The profit of the firm at time $t$ is written as $\pi_t \equiv (p_t - w \cdot a)Y_t - rK$. Then
under the above prescribed optimal policy, the total profit over the whole period
is computed as

(53) $\int_0^T [(\hat{p}_t - w \cdot a)\hat{Y}_t - r\hat{K}]\,dt = \int_0^T bq_t\hat{Y}_t\,dt - \int_0^T (r\hat{K})\,dt = \int_0^T q_t\hat{K}\,dt - rT\hat{K} = 0$

In other words, the profit over the whole planning horizon should be zero.
Now suppose that the demand function $P$ is such that $q_t > 0$ for all $t$ so that
$A = [0, T]$ (that is, full capacity output always occurs). Then $\hat{Y}_t = \hat{K}/b$ = constant
($\equiv \bar{Y}$) for all $t$, so that the value of $\bar{Y}$ is determined by (50) as follows:

(54) $\int_0^T P(\bar{Y}, t)\,dt = T[w \cdot a + br]$

The value of $\hat{K}$ can then be determined as $\hat{K} = b\bar{Y}$.

For the sake of illustration, suppose further that
(55-a) $P(Y_t, t) = P_1(Y_t)$ for $t \in T_1 \subset [0, T]$
(55-b) $P(Y_t, t) = P_2(Y_t)$ for $t \in T_2 \subset [0, T]$
where $T_1 \cap T_2 = \emptyset$ and $T_1 \cup T_2 = [0, T]$. We may call $T_1$ "day" and $T_2$ "night."
Let the length of periods in $T_1$ and $T_2$ be $\tau_1$ and $\tau_2$, respectively. Then (54) can be
rewritten as
(56) $\tau_1 P_1(\bar{Y}) + \tau_2 P_2(\bar{Y}) = T[w \cdot a + br]$
where $\tau_1 + \tau_2 = T$. The solution $\bar{Y}$ indicated in (56) can be illustrated by Figure
8.15, which corresponds to Steiner's solution for the "shifting-peak" case ([8],
p. 588), as generalized by Williamson [11].
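The shifting-peak condition (56) can be checked numerically. The following is only a minimal sketch under stated assumptions: the linear inverse demands `P_day` and `P_night`, all parameter values, and the function name `solve_full_capacity_output` are hypothetical and do not come from the text. It solves (56) by bisection, recovers the shadow prices from $p_t = w \cdot a + q_t b$, and confirms that total profit is zero and that the multipliers integrate to $rT$.

```python
def solve_full_capacity_output(tau1, tau2, P1, P2, wa, b, r, lo, hi, iters=200):
    """Solve tau1*P1(Y) + tau2*P2(Y) = T*(wa + b*r) for Y by bisection.

    Assumes both inverse demand curves are decreasing in Y, so the
    left-hand side minus the right-hand side changes sign once on [lo, hi].
    """
    T = tau1 + tau2
    target = T * (wa + b * r)

    def excess(Y):
        return tau1 * P1(Y) + tau2 * P2(Y) - target

    if not (excess(lo) > 0 > excess(hi)):
        raise ValueError("bracket [lo, hi] does not straddle the solution")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical "day" and "night" inverse demand curves (illustrative only).
P_day = lambda Y: 20.0 - Y
P_night = lambda Y: 16.0 - Y
tau1, tau2 = 12.0, 12.0        # lengths of day and night
wa, b, r = 2.0, 1.0, 3.0       # operating cost, capital coefficient, capital cost

Y_bar = solve_full_capacity_output(tau1, tau2, P_day, P_night, wa, b, r, lo=0.0, hi=16.0)
K_hat = b * Y_bar

# Shadow prices of capacity implied by p_t = w.a + q_t * b in each period.
q_day = (P_day(Y_bar) - wa) / b
q_night = (P_night(Y_bar) - wa) / b

# Total profit over [0, T]; the text shows this is zero at the optimum.
total_profit = (tau1 * (P_day(Y_bar) - wa) * Y_bar
                + tau2 * (P_night(Y_bar) - wa) * Y_bar
                - r * K_hat * (tau1 + tau2))
```

Both shadow prices must come out positive for the full-capacity assumption to be consistent; with other demand curves one of them may turn negative, in which case the firm-peak case applies instead.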
In general, we have periods in which full capacity is not achieved. An
extreme case is the case in which
(57-a) $P(Y_t, t) = P_3(Y_t)$ for $t \in A$
(57-b) $P(Y_t, t) = P_4(Y_t)$ for $t \notin A$
That is, the demand curve is fixed as long as $t \in A$ or $t \notin A$. Let $\alpha$ be the sum of
the lengths of the periods in $A$. Noting that (52) implies

(58) $\alpha P_3(\bar{Y}) = \alpha(w \cdot a) + brT$
for this case, we can illustrate the solution in Figure 8.16, which corresponds to
Steiner's solution for the "firm-peak" case ([8], p. 588), as generalized by
Williamson [11].

[Figure 8.15. An Illustration of the Solution When Full Capacity Is Achieved Always.]

[Figure 8.16. An Illustration of the Solution When Full Capacity Is Not Necessarily Achieved. The curve $P_3(Y)$ is drawn against output $Y$, with the price levels $w \cdot a$ and $w \cdot a + brT/\alpha$ and the output $\bar{Y}$ marked.]

It may be of some interest to consider the case in which there is no shift in
the demand function, in other words, where $P_t = P(Y_t)$. Then we may suppose that
$[0, T] = A$ (that is, the full capacity output is achieved always), so that $\hat{Y}_t = \bar{Y}$
(constant) for all $t$. Hence in view of (46), we obtain $q_t$ = constant. Then from (49)
we have
(59) $q_t = r$ for all $t$
Therefore, from (46), we have
(60) $\hat{p}_t = w \cdot a + rb$
We define the long-run marginal cost, $MC_t$, by
(61) $MC_t \equiv \frac{\partial}{\partial Y_t}[(w \cdot a)Y_t + rK]$

Evaluating $MC_t$ along the optimal path $\hat{Y}_t$ and $\hat{K}$ ($= b\hat{Y}_t$), we obtain

(62) $MC_t = w \cdot a + rb = \hat{p}_t$ for all $t$
which corresponds to the conventional rule of marginal cost pricing.
We now suppose that the firm wishes to maximize profit over time. Let
$\pi_t$ be the profit for time $t$. Then we have
(63) $\pi_t \equiv p_t Y_t - w \cdot L_t - rK = (p_t - w \cdot a)D(p_t, t) - rK$

Note that once $p_t$ is set, the firm knows the demand for output by $D(p_t, t)$ and
hence produces an amount $Y_t = D(p_t, t)$.
The firm is supposed to maximize

(64) $\Pi \equiv \int_0^T \pi_t e^{-\rho t}\,dt$

where $\rho > 0$ denotes the discount rate for the firm. We assume $\rho = 0$ for the sake
of simplicity. The analysis in which $\rho > 0$ is analogous to the subsequent analysis;
hence it is left as an exercise for the interested reader. We are now ready to state
our problem.

PROBLEM II:
Maximize (with respect to $p_t$, $K$): $\int_0^T [(p_t - w \cdot a)D(p_t, t) - rK]\,dt$
Subject to:

(65) $K - bD(p_t, t) \ge 0$

and
(66) $p_t \ge 0$

Viewing this as an optimal control problem, $K$ is the control parameter and $p_t$
is the control variable. There is no state variable. We again assume, for the sake
of simplicity, that capital $K$ is perfectly divisible.
Then, in view of the optimal control theorem, we first define the function
L by

(67) $L[p_t, K, q_t, \mu_t] \equiv \theta_0[(p_t - w \cdot a)D(p_t, t) - rK] + q_t[K - bD(p_t, t)] + \mu_t p_t$

Here $\theta_0$, $q_t$, and $\mu_t$ are multipliers. Although the same notation is used for these
multipliers (as well as $L$) as in the previous problem, their values can, of course,
be different from the corresponding ones in the previous problem. The same
notation is used purely for the sake of notational simplicity. Using Hestenes'
theorem, we now have the following necessary conditions for $p_t^*$ and $K^*$ to be
optimal:

(i) The multipliers $\theta_0$, $q_t$, and $\mu_t$ do not vanish simultaneously and $\theta_0 \ge 0$
(constant), $q_t \ge 0$, $\mu_t \ge 0$, for all $t$. Moreover, $\mu_t p_t^* = 0$, for all $t$.
(ii) $\partial L^*/\partial p_t = 0$, where $L^* \equiv L(p_t^*, K^*, q_t, \mu_t)$, for all $t$, that is,

(68) $\theta_0\{D^* + (p_t^* - w \cdot a)D_p^*\} - q_t bD_p^* \le 0$

and $[\theta_0\{D^* + (p_t^* - w \cdot a)D_p^*\} - q_t bD_p^*]\,p_t^* = 0$ for all $t$

where $D^* \equiv D(p_t^*, t)$ and $D_p^* \equiv \partial D(p_t^*, t)/\partial p_t$. Assuming the interior solution
for $p_t^*$, that is, $p_t^* > 0$, for all $t$ (so that $\mu_t = 0$ for all $t$), this relation holds with
equality. Thus we have$^{10}$
(69) $\theta_0[D^* + (p_t^* - w \cdot a)D_p^*] - q_t bD_p^* = 0$, for all $t$
(iii) The following relation holds:
(70) $q_t(K^* - bD^*) = 0$, for all $t$
(iv) The following relation also holds:
(71) $\theta_0[(p_t^* - w \cdot a)D(p_t^*, t) - rK^*] \ge \theta_0[(p_t - w \cdot a)D(p_t, t) - rK^*]$
for all $t$ and for all $p_t$ such that $K^* \ge bD(p_t, t)$ and $p_t \ge 0$.
(v) The following transversality condition holds:

(72) $\int_0^T (-\theta_0 r + q_t)\,dt = 0$

We may assume that

(73) $\frac{\partial}{\partial p_t}[K - bD(p_t^*, t)] \ne 0$, for all $t$

In other words, we may assume that the rank constraint qualification is satisfied
because
(74) $D_p^* \ne 0$, or $D_p^* < 0$
from the assumption on the demand function. This implies $\theta_0 > 0$. To see this,
suppose $\theta_0 = 0$. Then the relation (69) with condition (74) implies $q_t = 0$. Since
$\mu_t = 0$ from $p_t^* > 0$, all the multipliers vanish, contradicting condition (i) above.
Thus $\theta_0 > 0$, so that we may choose $\theta_0 = 1$. Conditions (69), (71), and (72) are now
simplified as follows:$^{11}$

(75) $D^* = -(p_t^* - w \cdot a - q_t b)D_p^*$, for all $t$

(76) $(p_t^* - w \cdot a)D(p_t^*, t) \ge (p_t - w \cdot a)D(p_t, t)$

for all $t$ and for all $p_t$ such that $K^* \ge bD(p_t, t)$ and $p_t \ge 0$.
(77) $\int_0^T (q_t - r)\,dt = 0$, or $rT = \int_0^T q_t\,dt$

Note that relation (76) means that the "current profit" (that is, profit except for
capital cost) as well as the total profit are to be maximized at each instant of time.

Conditions (75), (76), (77), and (70) constitute necessary conditions for an
optimum for the present problem. Moreover, if we assume $D_{pp} \equiv \partial^2 D/\partial p_t^2 < 0$
for all $p_t$, then our $\pi_t$ (hence $\Pi$ also) is a concave function so that these conditions
are also sufficient for an optimum (again in view of Mangasarian's theorem,
Theorem 8.C.5).
We now proceed to further characterizations of the above solution. First,
we define the elasticity of demand by

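(78) $\eta_t \equiv -\frac{p_t}{D}D_p > 0$, with $D_p < 0$

Then we can rewrite (75) as

(79) $(\eta_t^* - 1)p_t^* = \eta_t^*(w \cdot a + q_t b)$, for all $t$

where $\eta_t^* \equiv \eta(p_t^*, t)$. Since $(w \cdot a + q_t b) > 0$, (79) requires that

(80) $\eta_t^* > 1$, for all $t$
Define $\varepsilon_t^*$ by $\varepsilon_t^* \equiv \eta_t^*/(\eta_t^* - 1)$. Then
(81) $\varepsilon_t^* > 1$ for all $t$
Also (79) can be rewritten as$^{12}$
(82) $p_t^* = \varepsilon_t^*(w \cdot a + q_t b)$, for all $t$
Therefore
(83) $\int_0^T p_t^*\,dt = \int_0^T \varepsilon_t^*(w \cdot a + q_t b)\,dt$

Note that (81) and (82) imply

(84) $p_t^* - w \cdot a > p_t^* - \varepsilon_t^*(w \cdot a) = \varepsilon_t^* q_t b \ge 0$, for all $t$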
In other words, current profits $(p_t^* - w \cdot a)$ are always positive, whereas they are
zero for "non-top-peaks" (that is, $t \notin A$) for the welfare maximizing monopoly.
Next let $B$ be the subset of $[0, T]$ such that $bD(p_t^*, t) = K^*$. In other words,
$B$ is the set of "top peaks." Clearly, $B$ can be different from $A$. For $t \notin B$, we
have $bD(p_t^*, t) < K^*$. Hence $q_t = 0$ for $t \notin B$. Thus, in view of (82), we obtain
(85) $p_t^* = \varepsilon_t^*(w \cdot a)$, $t \notin B$
Since $\varepsilon_t^* > 1$, the optimal price $p_t^*$ exceeds the operational cost $w \cdot a$ for $t \notin B$.
Note also that $\varepsilon_t^*$, in general, changes from time to time. Hence $p_t^*$ is not necessarily
constant for $t \notin B$, while in the welfare maximization problem $\hat{p}_t$ is constant
(and equal to $w \cdot a$) during the non-top-peaks (that is, $t \notin A$).
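The pricing rule (82) and its off-peak special case (85) amount to a markup over the shadow marginal cost $w \cdot a + q_t b$. The sketch below is illustrative only: the function names and the numerical values are assumptions, not part of the text. It computes the markup factor $\varepsilon = \eta/(\eta - 1)$, the implied price, and the Lerner index, which should equal $1/\eta$.

```python
def markup_factor(eta):
    """Return epsilon = eta / (eta - 1); requires eta > 1 as in (80)."""
    if eta <= 1:
        raise ValueError("the elasticity of demand must exceed one")
    return eta / (eta - 1.0)


def monopoly_price(eta, wa, q, b):
    """Price from rule (82): epsilon times (operating cost + shadow capacity cost)."""
    return markup_factor(eta) * (wa + q * b)


def lerner_index(price, marginal_cost):
    """Relative gap between price and marginal cost."""
    return (price - marginal_cost) / price


# Hypothetical numbers: elasticity 2, operating cost 1, capital coefficient 1.
eta, wa, b = 2.0, 1.0, 1.0
peak_price = monopoly_price(eta, wa, q=0.5, b=b)      # capacity binds, q_t > 0
off_peak_price = monopoly_price(eta, wa, q=0.0, b=b)  # capacity slack, q_t = 0
peak_lerner = lerner_index(peak_price, wa + 0.5 * b)
```

With these numbers the off-peak price is twice the operating cost, which illustrates the remark that the monopoly price exceeds $w \cdot a$ even when capacity is not binding.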
Using (84), (81), and (77) successively, total profit over the entire period can
be shown to be positive. In other words

(86) $\Pi^* \equiv \int_0^T [(p_t^* - w \cdot a)D^* - rK^*]\,dt > \int_0^T [(\varepsilon_t^* q_t b)D^* - rK^*]\,dt \ge \int_0^T [q_t bD^* - rK^*]\,dt = \int_B q_t bD^*\,dt - rK^*T = 0$

where $\int_B$ denotes the integration in $t$ over the range of $B$. Recall that, in the case of
a welfare maximizing monopoly, total profit is zero.
Now for the sake of illustration, assume that $\eta_t^*$ is constant over $t$. An
example of a demand function that yields a constant $\eta_t^*$ is

(87) $D(p_t, t) = \delta(t)\,p_t^{-\eta^*}$

where $\eta^* > 1$ is some constant and $\delta(t)$ signifies the time shift of the demand
function (it is easy to check that $\eta^*$ satisfies the definition of the elasticity of
demand). If $\eta_t^*$ is constant, $\varepsilon_t^*$ is also constant so we denote it by $\varepsilon^*$. For this
case total profit is computed by using (82) and (77) as

(88) $\int_0^T [(p_t^* - w \cdot a)D^* - rK^*]\,dt = \int_0^T [\{(\varepsilon^* - 1)(w \cdot a) + \varepsilon^* q_t b\}D^* - rK^*]\,dt = (\varepsilon^* - 1)\left[(w \cdot a)\int_0^T D^*\,dt + rK^*T\right]$

where the last step uses $\int_0^T q_t bD^*\,dt = rK^*T$, which follows from (70) and (77).
That is, total profit is larger when $\varepsilon^*$ (or the degree of monopoly $1/\eta^*$) is larger.
When $\varepsilon^*$ is constant, (83) can be rewritten as
(89) $\int_0^T p_t^*\,dt = \varepsilon^*(w \cdot a + br)T$
Suppose further that the demand conditions are such that $q_t > 0$ for all $t$. Then
$Y^* = D(p_t^*, t)$ for all $t$, where $Y^* \equiv K^*/b$. Using this relation, we obtain
(90) $p_t^* = P(Y^*, t)$, for all $t$

For the sake of illustration, suppose also that relation (55) holds for the function
P. Then (89) and (90) yield
(91) $\tau_1 P_1(Y^*) + \tau_2 P_2(Y^*) = \varepsilon^* T(w \cdot a + br)$

The diagrammatical illustration of (91) is strictly analogous to that of (56).
Since $\varepsilon^* > 1$, $Y^* < \bar{Y}$, so that $K^* < \hat{K}$. That is, capacity for the profit maximizing
monopoly tends to be less than the socially optimum amount. It is easy to prove
that this conclusion also holds even if $\varepsilon_t^*$ changes over time.
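The comparison between the welfare-maximizing capacity from (56) and the profit-maximizing capacity from (91) has a closed form under the constant-elasticity demand (87). The following sketch is an illustration under assumptions that are not in the text: two periods with hypothetical shift parameters, a common elasticity, and capacity binding in every period for both problems. The helper names are invented for this example. With inverse demand $P_i(Y) = (\delta_i/Y)^{1/\eta}$, both conditions can be solved directly for the capacity output.

```python
def capacity_outputs(eta, shifts, lengths, wa, b, r):
    """Return (welfare output, monopoly output) when capacity binds in every period.

    Assumes demand D(p, t) = delta_i * p**(-eta) in period i, so that the
    inverse demand is P_i(Y) = (delta_i / Y)**(1/eta).
    """
    if eta <= 1:
        raise ValueError("the elasticity of demand must exceed one")
    T = sum(lengths)
    weighted = sum(tau * delta ** (1.0 / eta) for tau, delta in zip(lengths, shifts))
    unit_cost = wa + b * r
    eps = eta / (eta - 1.0)
    y_welfare = (weighted / (T * unit_cost)) ** eta          # solves (56)
    y_monopoly = (weighted / (eps * T * unit_cost)) ** eta   # solves (91)
    return y_welfare, y_monopoly


def shadow_prices(eta, shifts, Y, wa, b, markup):
    """Shadow price of capacity per period from p = markup * (wa + q*b)."""
    return [((delta / Y) ** (1.0 / eta) / markup - wa) / b for delta in shifts]


# Hypothetical day/night example.
eta, shifts, lengths = 2.0, [400.0, 100.0], [12.0, 12.0]
wa, b, r = 1.0, 1.0, 1.0
Y_welfare, Y_monopoly = capacity_outputs(eta, shifts, lengths, wa, b, r)
K_welfare, K_monopoly = b * Y_welfare, b * Y_monopoly

# The full-capacity assumption is only consistent if every shadow price is positive.
q_welfare = shadow_prices(eta, shifts, Y_welfare, wa, b, markup=1.0)
q_monopoly = shadow_prices(eta, shifts, Y_monopoly, wa, b, markup=eta / (eta - 1.0))
```

In this special case the ratio of the two capacities is $\varepsilon^{-\eta}$, so the monopoly capacity is one quarter of the socially optimal capacity when $\eta = 2$.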
If, on the other hand, the demand conditions are those specified by (57),
then (82) and (77) imply

(92) $\beta P_3(Y^*) = \varepsilon^*[\beta(w \cdot a) + brT]$

where $\beta$ is the size of $B$. The illustration of the solution $Y^*$ is strictly analogous
to that of (58). Again, $\varepsilon^* > 1$ implies that $Y^* < \bar{Y}$ and $K^* < \hat{K}$, provided that $\alpha = \beta$.

FOOTNOTES

1. This subsection relies heavily on Arrow [ 1] .

2. The examination of the constraint qualification is often neglected in the practical
application of optimal control theory. This is a bad practice.
3. Note that $p_0$ in the definition of the L-function, the multiplier attached to the
maximand function (inside the integral), which appears in Hestenes' theorem, is set equal
to one. To prove this, modify (18) and (29) with $p_0$. Then using (19) and the fact
that the multipliers do not vanish simultaneously, we obtain a contradiction by
supposing $p_0 = 0$. Notice also that the rank constraint qualification is satisfied for
this problem. Incidentally, condition (16) may be replaced by a more usual condition
$\lim_{t \to \infty} p_t = 0$ without changing the argument and the conclusions in essence.
4. This subsection is probably the first application of optimal control theory to the
peak-load problem. The problem can also be solved by using the ordinary nonlinear
programming technique. See Takayama [9] .
5. The choice of T is a difficult problem indeed. We omit the discussion of the choice
of a finite T. We may note that all the arguments in the literature assume that time
is discrete, and usually assume that there are only two periods to facilitate a diagram-
matical analysis. For an example of n-period analysis, see Steiner [8]. We may
also note that there is no literature so far that treats the problem with a continuum of
time.
6. That w is constant with respect to Y means that the firm is small in the (variable)
factor markets. That w is constant with respect to t is an assumption purely for the
sake of simplicity. This is the assumption adopted in the literature.
7. Consider, for example, that the firm borrows money to purchase the initial capital
stock $K$, and assume that this borrowing amounts to repaying $r$ dollars per unit
of capital $K$ at each instant of time for the period $[0, T]$.

8. We will assume that $p_t > 0$ implies $Y_t = D(p_t, t) > 0$ for all $t$.
9. In view of the assumption made in footnote 8 and $F_{YY} < 0$, $Y_t > 0$ implies
$p_t = F_Y(Y_t, t) > 0$.
10. In view of the assumption made in footnote 8, $p_t^* > 0$ implies $D^* = D(p_t^*, t) > 0$.
11. Equation (75) can be rewritten as $p_t^* + D^*/D_p^* = w \cdot a + q_t b$. But $D^* = Y_t^*$ and
$1/D_p^* = \partial P(Y_t^*, t)/\partial Y_t \equiv P_Y^*$. Hence the LHS of this equation is $p_t^* + P_Y^* Y_t^* =
\partial(p_t^* Y_t^*)/\partial Y_t$, which signifies the marginal revenue. The RHS of the equation,
$w \cdot a + q_t b$, will signify the "marginal cost." Therefore equation (75) may be
interpreted as the familiar rule, $MR_t = MC_t$.
12. Note that $\varepsilon_t^* = 1/(1 - 1/\eta_t^*)$, and that $1/\eta_t^*$ is the well-known degree of monopoly
à la Lerner. Notice that $\varepsilon_t^*$ is greater, the greater the degree of monopoly. From
footnote 11, $(w \cdot a + q_t b)$ is equal to the marginal revenue. Hence (82) signifies the
usual rule that the difference between the price and the marginal revenue increases
as the degree of monopoly increases.

REFERENCES

1. Arrow, K. J., "Applications of Control Theory to Economic Growth," in Mathematics
of the Decision Sciences, Part 2, ed. by G. B. Dantzig and A. F. Veinott,
Providence, R. I., American Mathematical Society, 1968.
2. Arrow, K. J., "Optimal Capital Policy with Irreversible Investment," in Value, Capital,
and Growth, Papers in Honour of Sir John Hicks, ed. by J. N. Wolfe, Edinburgh,
Edinburgh University Press, 1968.
3. Arrow, K. J., and Kurz, M., "Optimal Growth with Irreversible Investment in a
Ramsey Model," Econometrica, 38, March 1970.
4. Buchanan, J. M., "Peak Loads and Efficient Pricing: Comment," Quarterly Journal
of Economics, LXXX, August 1966.
5. Hirshleifer, J., "Peak Loads and Efficient Pricing: Comment," Quarterly Journal
of Economics, LXXII, August 1958.
6. Houthakker, H. S., "Electricity Tariffs in Theory and Practice," Economic Journal,
LXI, March 1951.
7. Houthakker, H. S., "Peak Loads and Efficient Pricing: Further Comment," Quarterly
Journal of Economics, LXXII, August 1958.
8. Steiner, P. O., "Peak Loads and Efficient Pricing," Quarterly Journal of Economics,
LXXI, November 1957.
9. Takayama, A., "On the Peak-Load Problem," Krannert Institute Paper, No. 251,
Purdue University, June 1969.
10. Turvey, R., "Peak-Load Pricing," Journal of Political Economy, 76, January-
February 1968.
11. Williamson, O. E., "Peak-Load Pricing and Optimal Capacity," American Economic
Review, LVI, September 1966.

Section E
THE NEO-CLASSICAL THEORY
OF INVESTMENT AND
ADJUSTMENT COSTS-
AN APPLICATION OF
OPTIMAL CONTROL THEORY$^1$

a. INTRODUCTION
The essence of the present treatment of the theory of investment is the
behavioral assumption that a firm maximizes the present value of net cash flows
subject to constraints such as a production function and a capital accumulation
equation. Hence it is a part of dynamic decision theory. Since the firm determines
both the demand for factors such as labor as well as the demand for investment, the
name "theory of investment" seems slightly inappropriate. Rather it should be
termed the dynamic theory of the firm.
Whatever we call it, there seems to be quite a bit of confusion in the theory
of investment. The purpose of this section is partly expository in the sense that we
attempt to correct these confusions and partly illustrative in the sense that we
present various theories in a unified and generalized fashion.
First there is the argument (Haavelmo [20] and Lerner [43], for example)
which says that there is no investment demand schedule for an individual firm.
Assuming that the firm is competitive and small enough and that all prices are
constant, the firm can and would adjust instantaneously to the desired stock of
capital, which is constant. In this case, investment is always equal to the amount
of depreciation and there is no investment function as such. Thus Haavelmo, for
example, concludes the following ([20], p. 216):
What we should reject is the naive reasoning that there is a `demand schedule'
for investment which would be derived from a classical scheme of producer's
behavior in maximizing profit.

The capital is adjusted to the desired level instantaneously at the initial time
and it will be kept constant over the whole planning horizon ([20], p. 163).
Jorgenson, being apparently distressed by this, argued that "it is possible
to derive a demand function for investment based on purely neoclassical con-
siderations" ([27], p. 133). The secret of Jorgenson's innovative procedure of
obtaining the investment demand schedule is to change prices, notably the price
of capital goods, over time ([27], p. 149). The amount of investment then changes
over time depending on the time path of the prices.
However, as Tobin ([61], p. 157) noticed, nothing basic is changed. If all
prices are assumed (or expected by the firm) to be constant, then the amount of
investment is also constant over time in Jorgenson's model. In other words, the
basic characteristic of the Haavelmo-Lerner treatment, the instantaneous adjustment
to the desired capital stock, is unchanged in Jorgenson's treatment.
The instantaneous adjustment to the desired stock of capital, in essence,
implies that the volume of investment It is unbounded. In other words, if we write

$I_{\min} \le I_t \le I_{\max}$

this assumption of unbounded investment means $I_{\min} = -\infty$ and $I_{\max} = \infty$. The
assumption of $I_{\max} = \infty$ is often justified under the assumption of a competitive
firm. I believe that this justification is quite confusing. Even though $I_{\max}$ may
be very large for each firm, it should still be finite.
A more serious difficulty, however, is in the assumption of $I_{\min} = -\infty$.
This is a silly assumption at the macro-economic level. But even at the micro-
economic level, it is difficult to accept this assumption because it means that
the firm is able to sell the capital good already installed in any amount at the price
of a newly produced capital good. This is obviously unrealistic. Here it may suffice
to quote Arrow ([7], p. 2):
From a realistic point of view, there will be many situations in which the sale
of capital goods cannot be accomplished at the same price as their purchase.
There are installation costs, which are added to the purchase price but cannot
be recovered on sale; indeed, there may on the contrary be additional costs
of detaching and moving machinery. Again sufficiently specialized machinery
and plants have little value to others. So resale prices may be substantially
below replacement costs.
Arrow, however, goes to the other extreme by assuming that resale of capital
goods is impossible, or by assuming Imin = 0; that is, the imposition of the con-
straint It > 0 for all t.
Mathematically speaking, both Haavelmo [20] and Jorgenson [27] utilized
the classical calculus of variations and thus implicitly ignored the constraint
$I_{\min} \le I_t \le I_{\max}$. A more plausible mathematical technique is optimal control
theory. Not only can this constraint be incorporated into the analysis in a satisfactory
manner by using the optimal control technique, but also this technique
enables us to realize the "bang-bang" nature of the solution and to understand
why instantaneous adjustment is optimal to the firm if Imax is sufficiently large
and if Imin is sufficiently small. These are not clear in the rather naive applications
of the calculus of variations as seen in Haavelmo [20] and Jorgenson [27]
(also [25], [28], and so on).
In subsection b, we formulate the problem explicitly as an optimal control
problem, and we conclude the following for the nonconstant returns to scale case:

1. The investment policy for the firm is to reach the "long-run" desired stock
of capital ($K^*$) as soon as possible (that is, $I_t = I_{\max}$ if $K_0 < K^*$, and $I_t = I_{\min}$
if $K_0 > K^*$), and after reaching $K^*$ to remain at $K^*$.
2. The "long-run" desired stock of capital, $K^*$, is determined by the usual marginal
productivity principle.
3. The above conclusions imply that the investment demand changes once over time
when the capital stock reaches $K^*$ (except for the case in which $K_0 = K^*$).
4. The (Lerner-Haavelmo-Jorgenson) conclusion of instantaneous adjustment
cannot occur for the continuous time model (see, for example, [20] and [27]),
regardless of the sizes of $I_{\max}$ and $I_{\min}$, as long as they are finite. This is simply
because the integral over a point of time, say at $t = 0$, is zero. For the discrete
time model, instantaneous adjustments can occur (see, for example, Takayama
[59]).
5. However, if the sizes of $I_{\max}$ and $|I_{\min}|$ are large enough, then the time required
to reach $K^*$ can be made very small; that is, an "almost" instantaneous adjustment
occurs.
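Conclusions 4 and 5 can be made concrete with the closed-form adjustment time obtained by integrating the accumulation equation $\dot{K}_t = I_{\max} - \delta K_t$ used in subsection b. The sketch below is illustrative only: the function name and the parameter values are assumptions. It shows that the time needed to reach $K^*$ from below is strictly positive for every finite $I_{\max}$ but shrinks toward zero as $I_{\max}$ grows.

```python
import math


def time_to_reach_target(K0, K_star, I_max, delta):
    """Time at which K_t, growing under I_t = I_max, first reaches K_star.

    Uses K_t = K0*exp(-delta*t) + (I_max/delta)*(1 - exp(-delta*t)),
    which requires K0 < K_star and I_max > delta*K_star.
    """
    if not K0 < K_star:
        raise ValueError("this formula covers the case K0 < K_star only")
    if I_max <= delta * K_star:
        raise ValueError("I_max is too small for K_star ever to be reached")
    return math.log((I_max - delta * K0) / (I_max - delta * K_star)) / delta


# Hypothetical numbers: the adjustment time falls as the investment bound rises.
K0, K_star, delta = 50.0, 100.0, 0.1
bounds = [20.0, 200.0, 2000.0]
times = [time_to_reach_target(K0, K_star, I_max, delta) for I_max in bounds]
```

The adjustment time never reaches zero for a finite bound, which is the sense in which the continuous time model admits only an "almost" instantaneous adjustment.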

These conclusions may appear to be intuitively obvious. However, this does
not negate the importance of deriving these results rigorously in mathematical
terms.$^2$ In particular, at the end of subsection b we also observe that the above
conclusions will be altered greatly if the production function exhibits constant
returns to scale.
Another confusion in connection with the neo-classical investment theory
is that the Keynesian theory of the marginal efficiency of capital is irrelevant to
the neo-classical theory and hence is dismissed (see, for example, Jorgenson [27] ).
In subsection b, we observe that the Keynesian rule of the marginal efficiency
of capital is, in essence, the same as the neo-classical marginal productivity rule.
As remarked earlier, the crucial feature of the Lerner-Haavelmo-Jorgenson
theory of investment is instantaneous and frictionless adjustment to the desired
stock of capital. Commenting on Jorgenson [27], Tobin remarked ([61],
p. 158):

Jorgenson's investment demand schedule cannot serve the analytical purposes
for which such a schedule is desired, and one must look elsewhere
for a determinate theory of investment. At the level of a single firm, this
may be derived from frictional or adjustment costs.

We then have an increasing literature on the investment function which introduces
adjustment costs explicitly (by economists such as Eisner and Strotz,
Treadway, Lucas, Gould, and Uzawa).
In subsection c, we discuss the theory of investment with adjustment costs.
In a typical treatment of this topic in the literature (see Eisner and Strotz [ 14],
Lucas [45], and Gould [ 18], for example), it is assumed that the adjustment
cost function is strictly convex and quadratic, and it is concluded that the optimal
investment (as well as the optimal capital stock) is uniquely determined and
constant over time and that the capital stock monotonically approaches the "long-
run" desired level as time extends without limit. Note that, in obtaining this result,
it is assumed in the literature that the adjustment cost function is quadratic. In
subsection c, we will observe that such an assumption is not essential.
Moreover in obtaining the above result, it is usually assumed that the
production function exhibits constant returns to scale. Although the constant
returns to scale assumption might be convenient for macro-economic analysis
in dealing with the exhaustion of the product problem, such an assumption
might not hold for an individual firm. In subsection c, we also discuss the case of
nonconstant returns to scale. We then conclude that the optimal investment is
no longer constant over time, but rather monotonically approaches a certain limit
as time extends without limit. In other words, we will observe that the constant
returns to scale assumption is crucial in obtaining the conclusion in which the
optimal investment is constant over time.
Subsection d contains three remarks. The first remark is concerned with the
"response function." If the optimal investment demand is determinate and
constant (say, $\bar{I}$) as in the case of adjustment costs with a constant returns to
scale production function, then we obtain an equation such as (see, for example,
Gould [18]):

$\dot{K}_t = \delta(\bar{K} - K_t)$
where $\bar{K} \equiv \bar{I}/\delta$ and $\delta$ is the rate of depreciation. Viewing $\bar{K}$ as the "long-run"
desired stock of capital, this equation seems to define the usual "response
function" which is often seen in the empirical literature. Our first remark in
subsection d is a critical note on such a claim.
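The step from a constant investment level to the response-function form is a one-line rearrangement of the accumulation equation of subsection b. Writing $\bar{I}$ for the constant investment level and $\bar{K}$ for $\bar{I}/\delta$, the derivation and the resulting path can be sketched as follows; the closed-form path is the standard solution of this linear differential equation and is added here only as an illustration:

```latex
\dot{K}_t = \bar{I} - \delta K_t
          = \delta\left(\frac{\bar{I}}{\delta} - K_t\right)
          = \delta\,(\bar{K} - K_t),
\qquad \bar{K} \equiv \frac{\bar{I}}{\delta},
\qquad\Longrightarrow\qquad
K_t = \bar{K} + (K_0 - \bar{K})\,e^{-\delta t}.
```

The capital stock therefore approaches $\bar{K}$ monotonically at the rate $\delta$, so the speed of response in this equation is pinned down by the depreciation rate alone.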
The second remark in subsection d is a critical summary of Uzawa's treat-
ment of adjustment costs, the "Penrose effect". The third remark is concerned
with a possible extension of investment theory. Among other things, we point
out that essentially the same results follow for the complete monopoly case.

b. THE CASE OF NO ADJUSTMENT COSTS

Consider a firm that wishes to maximize the sum of the present value of net
cash flows $W_t$ for all future time, $W$, where $W$ is defined by

(1) $W \equiv \int_0^\infty e^{-rt}W_t\,dt$

where
(2) $W_t \equiv p_tQ_t - w_tL_t - q_tI_t$

Here we use the following notations: $Q_t$, output; $L_t$, labor input; $I_t$, investment;
$p_t$, price of output; $w_t$, wage rate; $q_t$, price of capital goods; and $r$, discount rate.$^3$
There are three constraints in this maximization problem. The first is the production
function, which we write as
(3) $Q(L_t, K_t) - Q_t = 0$
where $K_t$ is the stock of capital. The second is the capital accumulation constraint,
which, following the literature, we write as
(4) $\dot{K}_t = I_t - \delta K_t$
where $\delta$ denotes the rate of depreciation ($0 < \delta < 1$).
It is important to note that (4) contains a crucial assumption with regard to
depreciation, that is, "depreciation by evaporation." Equation (4) together with
$I_t \ge 0$ and $K_0 > 0$ imply that the capital good continues to exist forever.$^4$
The third constraint is that the volume of investment is bounded, that is,
(5) $I_{\min} \le I_t \le I_{\max}$ for all $t$
A typical lower bound $I_{\min}$ is 0, or $I_t \ge 0$, which was introduced and called, as
mentioned earlier, the "irreversibility of investment" in the literature by Arrow
and so on. As we will see later, the constraint (5) is crucially significant in the
present problem in which the "adjustment costs" are ignored.
It is supposed that the firm chooses the time stream of $L_t$ and $I_t$ (hence also
$K_t$ and $Q_t$) so as to maximize the present value of all future profits, $W$, subject
to (3), (4), and (5), $L_t \ge 0$, $K_t \ge 0$ ($Q_t \ge 0$), and a given stock of initial capital $K_0$.
Such a maximization problem is considered by Jorgenson [27] and others, except
that the constraint (5) is often ignored.
Before embarking on solving the above maximization problem, the following
remarks may be useful in revealing the assumptions which are often implicit in
the literature.
REMARKS:
(i) It is assumed that the firm takes the prices ($p_t$, $w_t$, $q_t$) which prevail at
each $t$ as given data. The output can be sold in any quantity at time $t$
at the price $p_t$, and the firm's employment of labor $L_t$ and investment
$I_t$ do not affect the prices $w_t$ and $q_t$ for each time $t$.
(ii) It is assumed that the firm knows all future prices $p_t$, $w_t$, and $q_t$ for all
$t \ge 0$ with perfect certainty (perfect foresight). Alternatively, it is assumed
that the firm has a certain definite expectation for these prices for all
$t \ge 0$.$^5$ When their expectation turns out to be incorrect in the future, they
correct their program. If instead, these future prices are not expected
with probability one but rather they are expected with a certain probability
distribution, then the maximization problem should be altered. In other
words, the procedure of first solving the problem with a definite expecta-
tion (that is, expectation with probability one) with regard to future prices
and then of solving another problem when these prices turn out to be dif-
ferent from the originally expected value, does not, in general, give a truly
optimal solution for the firm.

We now solve the above maximization problem. To sharpen our analysis
and to highlight the problems involved, we consider a simple but important case
in which $p_t$, $w_t$, and $q_t$ are expected to be constants for all $t$. Therefore, we write
$p_t$, $w_t$, and $q_t$ as $p$, $w$, and $q$, respectively. The case in which these prices change over
time is left to the interested reader.
First we substitute (2) and (3) into (1) and rewrite the firm's maximization
problem as follows:
PROBLEM I: Choose the time path of $L_t$ and $I_t$ so as to

Maximize: $\int_0^\infty e^{-rt}[pQ(L_t, K_t) - wL_t - qI_t]\,dt$

Subject to: $\dot{K}_t = I_t - \delta K_t$
$I_{\min} \le I_t \le I_{\max}$
$L_t \ge 0$, $K_t \ge 0$
and a given value of $K_0$
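We view this as an optimal control problem in which $L_t$ and $I_t$ are the control
variables and $K_t$ is the state variable. To solve this problem, define the Hamiltonian
as$^6$

(6) $H \equiv e^{-rt}[pQ(L_t, K_t) - wL_t - qI_t] + \mu_t[I_t - \delta K_t]$

The set of necessary conditions for an optimum is written as$^7$
(7) $\frac{d\hat{K}_t}{dt} = \frac{\partial \hat{H}}{\partial \mu_t}$

(8) $\dot{\mu}_t = -\frac{\partial \hat{H}}{\partial K_t}$

(9) $e^{-rt}[pQ(\hat{L}_t, \hat{K}_t) - w\hat{L}_t - q\hat{I}_t] + \mu_t[\hat{I}_t - \delta\hat{K}_t] \ge e^{-rt}[pQ(L_t, \hat{K}_t) - wL_t - qI_t] + \mu_t[I_t - \delta\hat{K}_t]$
for all $L_t \ge 0$ and $I_t$ with $I_{\min} \le I_t \le I_{\max}$
(10) $\lim_{t \to \infty} \mu_t = 0$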

As remarked in Sections A and C, condition (10) can be replaced by "Arrow's
condition,"
(10') $\lim_{t \to \infty} \mu_t \ge 0$ and $\lim_{t \to \infty} \mu_tK_t = 0$

by explicitly introducing the constraint $\lim_{t \to \infty} K_t \ge 0$. But the analysis and the
conclusion would be the same as the present one. Condition (9) signifies the
maximization of the Hamiltonian $H$ with respect to the control variables $L_t$ and
$I_t$. Setting $L_t = \hat{L}_t$ for all $t$, condition (9) implies
(11) $(\mu_t - e^{-rt}q)\hat{I}_t \ge (\mu_t - e^{-rt}q)I_t$ for all $I_t$ with $I_{\min} \le I_t \le I_{\max}$
In other words,
(12-a) $\hat{I}_t = I_{\max}$ if $\mu_t > e^{-rt}q$
(12-b) $\hat{I}_t = I_{\min}$ if $\mu_t < e^{-rt}q$
(12-c) $\hat{I}_t \in [I_{\min}, I_{\max}]$ if $\mu_t = e^{-rt}q$
Now the significance of the constraint (5), $I_{\min} \le I_t \le I_{\max}$, should be apparent.
If there are no such bounds, $\hat{I}_t = \infty$ when $\mu_t > e^{-rt}q$ and $\hat{I}_t = -\infty$ when
$\mu_t < e^{-rt}q$; both of these conditions do not make too much sense either economically
or mathematically. The usual calculus of variations approach as seen in
Jorgenson (for example, [25], [27], [28], and so on) is thus rather inappropriate
for the present case. Note that the solution described in (12) arises from the fact
that the terms inside the objective integral and the constraint function are both
linear in $I_t$. This "bang-bang" characteristic described in (12) is ignored in the
literature.
By setting $I_t = \hat{I}_t$ for all $t$ in (9), we obtain$^8$
(13) $p\hat{Q}_L - w \le 0$ and $(p\hat{Q}_L - w)\hat{L}_t = 0$
where $\hat{Q}_L \equiv \partial Q/\partial L_t$ evaluated at $(\hat{L}_t, \hat{K}_t)$. Assuming $\hat{L}_t > 0$ for all $t$, we obtain
(14) $\hat{Q}_L = \frac{w}{p}$
which is the familiar marginal productivity rule with respect to labor.
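Conditions (7), (8), and (10) are respectively rewritten as$^9$
(15) $\dot{\hat{K}}_t = \hat{I}_t - \delta\hat{K}_t$
(16) $\dot{\lambda}_t = (r + \delta)\lambda_t - p\hat{Q}_K$
where

$\lambda_t \equiv \mu_te^{rt}$ and $\hat{Q}_K \equiv \partial Q/\partial K_t$ [evaluated at $(\hat{L}_t, \hat{K}_t)$]

and

(17) $\lim_{t \to \infty} \lambda_te^{-rt} = 0$
In terms of $\lambda_t$, (12) is rewritten as
(18-a) $\hat{I}_t = I_{\max}$ if $\lambda_t > q$
(18-b) $\hat{I}_t = I_{\min}$ if $\lambda_t < q$
(18-c) $\hat{I}_t \in [I_{\min}, I_{\max}]$ if $\lambda_t = q$
Therefore (15) is now rewritten as
(19-a) $\dot{\hat{K}}_t = I_{\max} - \delta\hat{K}_t$ if $\lambda_t > q$
(19-b) $\dot{\hat{K}}_t = I_{\min} - \delta\hat{K}_t$ if $\lambda_t < q$
(19-c) $\dot{\hat{K}}_t = \hat{I}_t - \delta\hat{K}_t$ if $\lambda_t = q$, where $\hat{I}_t \in [I_{\min}, I_{\max}]$
Note that if the function $Q$ is concave, conditions (14), (16), (17), and (19)
are sufficient as well as necessary for $(\hat{K}_t, \hat{L}_t, \hat{I}_t)$ to be optimal.$^{10}$
Assume that (14) can be rewritten as$^{11}$

(20) $\hat{L}_t = L(\hat{K}_t, w/p)$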
and write $pQ_K(\hat{L}_t, \hat{K}_t) = pQ_K[L(\hat{K}_t, w/p), \hat{K}_t] \equiv \phi(\hat{K}_t)$; that is,

(21) $\hat{Q}_K = \frac{\phi(\hat{K}_t)}{p}$

where it is recalled that $w/p$ is a constant.

Assume that $\phi' < 0$$^{12}$ for all $\hat{K}_t$ at each $t$. Then the phase diagram which
describes the time path of $(\hat{K}_t, \lambda_t)$ can now be constructed from (16) and (19) as
illustrated in Figure 8.17.
In Figure 8.17, it is assumed that $I_{\min} < 0$, and $K^*$ is defined from

(22) $q = \frac{\phi(K^*)}{r + \delta}$

From Figure 8.17, it is clear that the only path that is eligible$^{13}$ is the one that
approaches $K^*$, which is described by the heavy lines. Mathematically,$^{14}$
(23-a) $\dot{\hat{K}}_t = I_{\max} - \delta\hat{K}_t$ if $K_0 < K^*$
(23-b) $\dot{\hat{K}}_t = I_{\min} - \delta\hat{K}_t$ if $K_0 > K^*$
(23-c) $\hat{K}_t = K^*$ if $K_0 = K^*$
From (23-a), $\hat{K}_t$ is explicitly obtained for $K_0 < K^*$ as follows:

(23'-a) $\hat{K}_t = K_0e^{-\delta t} + \frac{I_{\max}}{\delta}(1 - e^{-\delta t})$

[Figure 8.17. The Dynamic Path of $\hat{K}_t$ and $\lambda_t$.]

THE NEO-CLASSICAL THEORY OF INVESTMENT 693

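Therefore assuming $I_{\min} = 0$, we can describe the optimal path of investment as
follows:
(24-a) ($K_0 < K^*$)
$\hat I_t = I_{\max}$ for all t, $0 \le t < T^*$
$\hat I_t = \delta K^* (\equiv I^*)$ for all $t \ge T^*$
(24-b) ($K_0 > K^*$)
$\hat I_t = 0$ for all t, $0 \le t < T^{**}$
$\hat I_t = I^* (= \delta K^*)$ for all $t \ge T^{**}$
(24-c) ($K_0 = K^*$)
$\hat I_t = I^* (= \delta K^*)$ for all $t \ge 0$
Here T* in (24-a) and T** in (24-b) are computed respectively from

$K^* = K_0 e^{-\delta T^*} + \dfrac{I_{\max}}{\delta}(1 - e^{-\delta T^*})$

and

$K^* = K_0 e^{-\delta T^{**}}$

In other words,¹⁵

(25-a) $T^* = \dfrac{1}{\delta} \log \dfrac{I_{\max} - \delta K_0}{I_{\max} - \delta K^*} > 0$

(25-b) $T^{**} = \dfrac{1}{\delta} \log \dfrac{K_0}{K^*} > 0$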
The optimal policy in (24) may be described as the one by which the firm
reaches K* as soon as possible, and after reaching K*, remains there. It is illus-
trated in Figure 8.18.
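
The switching times in (25-a) and (25-b) and the capital path in (23) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names and the parameter values (DELTA, I_MAX, K_STAR) are hypothetical and chosen only for illustration, and it assumes $I_{\min} = 0$ and $I_{\max} > \delta K^*$.

```python
import math

def time_to_reach(k0, k_star, i_max, delta):
    """Time at which capital first equals k_star under (24), with I_min = 0."""
    if k0 < k_star:
        # (25-a): invest at the ceiling; needs i_max > delta * k_star
        return math.log((i_max - delta * k0) / (i_max - delta * k_star)) / delta
    if k0 > k_star:
        # (25-b): invest nothing and let depreciation run down the stock
        return math.log(k0 / k_star) / delta
    return 0.0

def capital_path(t, k0, k_star, i_max, delta):
    """Capital stock at time t along the path described by (23)."""
    if t >= time_to_reach(k0, k_star, i_max, delta):
        return k_star                      # stay at K* with I = delta * K*
    decay = math.exp(-delta * t)
    if k0 < k_star:
        return k0 * decay + i_max / delta * (1.0 - decay)   # (23'-a)
    return k0 * decay                      # zero gross investment

# Hypothetical parameter values, chosen only for illustration.
DELTA, I_MAX, K_STAR = 0.1, 20.0, 100.0
t_up = time_to_reach(50.0, K_STAR, I_MAX, DELTA)
t_down = time_to_reach(150.0, K_STAR, I_MAX, DELTA)
print(t_up, t_down)
```

With these particular numbers both switching times equal $(1/\delta)\log 1.5$, which is a coincidence of the chosen values rather than a general property.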

Figure 8.18. The Time Path of $\hat K_t$.

We call K* the long-run desired stock of capital. If $\hat K_t$ = constant = K*, then
from (20), $\hat L_t$ = constant = L*. Recalling (14) and (22), the values of L* and K* are
determined by

(26-a) $Q_L(L^*, K^*) = \dfrac{w}{p}$

(26-b) $Q_K(L^*, K^*) = \dfrac{(r + \delta)q}{p}$

Here $(r + \delta)q$ signifies the rent for capital; hence (26-a) and (26-b) are the famous
marginal productivity rules. To see that $(r + \delta)q$ is the rent for capital, suppose
that one unit of the capital good is rented with rent c. The physical quantity of one
unit of capital good at time t will be decayed to $e^{-\delta t}$; hence the rental payment
will be $ce^{-\delta t}$. Therefore, if the capital good is rented for an infinite future, the
present value of rent over all future time will be

(27) $\int_0^\infty ce^{-\delta t}e^{-rt}\,dt = \int_0^\infty ce^{-(r+\delta)t}\,dt = \dfrac{c}{r + \delta}$

The intertemporal arbitrage relation will equate this with the price of a unit of
capital, so that¹⁶

(28) $q = \dfrac{c}{r + \delta}$ or $c = (r + \delta)q$

Relations (27) and (28) are also obtained by Jorgenson [27] (in a more complicated
manner).
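
The arbitrage relation can be illustrated by integrating the rental stream in (27) numerically and comparing it with the price q. This is only a sketch under stated assumptions: the function name, the truncation horizon, the step count, and the values of r, $\delta$, and q are all hypothetical.

```python
import math

def present_value_of_rent(c, r, delta, horizon=200.0, steps=100_000):
    """Trapezoidal approximation of the integral in (27), truncated at `horizon`."""
    h = horizon / steps

    def discounted_rent(t):
        # rent on the surviving fraction exp(-delta*t), discounted at rate r
        return c * math.exp(-(r + delta) * t)

    total = 0.5 * (discounted_rent(0.0) + discounted_rent(horizon))
    for i in range(1, steps):
        total += discounted_rent(i * h)
    return total * h

# Hypothetical values, for illustration only.
r, delta, q = 0.05, 0.10, 10.0
c = (r + delta) * q            # rent implied by (28)
pv = present_value_of_rent(c, r, delta)
print(pv)                      # should be close to q
```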
Suppose that the firm is in the long-run steady state (L*, K*). Then in
view of (27) and (28), we immediately realize that (26-b) can be rewritten as

(29) $q = \int_0^\infty pQ_K^* e^{-(\delta + r)t}\,dt$

where $Q_K^* \equiv \partial Q/\partial K_t$ evaluated at (L*, K*). This is the famous Keynesian rule
of the marginal efficiency of capital which states that the demand for the stock of
capital is determined by the equality between the unit price of capital and the
present value of all future income from an additional unit of capital. Since
equations (26-b) and (29) are equivalent, the Keynesian rule of the marginal
efficiency of capital coincides with the neo-classical marginal productivity rule,
if the firm is on the path (L*, K*). A similar observation with regard to the
equivalence between the marginal efficiency rule and the marginal productivity
rule can be made in terms of a discrete time model with depreciation by "sudden
death" (see Takayama [59]). Therefore, we disagree with the following view
which is taken by Jorgenson ([27], p. 152) as well as others: "Keynes' construction
of the demand function for investment must be dismissed as inconsistent with the
neoclassical theory of optimal capital accumulation."
Needless to say, Keynes does not explicitly impose some of the above
assumptions. In other words, in Keynes, prices may change over time, the firm
may not be a price taker, and the demand function that the firm (if monopolistic)
faces may change in the future. Therefore Keynes obtained a rule which is much
less explicit than (29); that is,

$q = \int_0^\infty R_t e^{-(\delta + r)t}\,dt$

where $R_t$ is the "expected rate of return" on capital at time t, that is, $R_t$ is the
expected revenue minus the expected operating cost (not including the depreciation
cost) per additional unit of capital.
It is important to observe that the values of L* and K* are equal to the ones
determined by maximizing the "short-run" (or instantaneous) profit
pQ(L, K) - wL - cK
In other words, the "long-run" solution (L*, K*) for the dynamic optimization
problem is reduced to the one for the static optimization problem. The myopic
rule is optimal after all from the long-run viewpoint.
With constant prices, the effect of changes in parameters such as p, w, q,
and r on L* and K* can easily be obtained from (26-a) and (26-b) by using the usual
comparative statics procedure. For this purpose, assume, for example, that
$Q^*_{LL}Q^*_{KK} - Q^{*2}_{LK} > 0$, $Q^*_{LL} < 0$, $Q^*_{KK} < 0$, and $Q^*_{LK} > 0$. Then it can be established,
for example, that $\partial L^*/\partial r < 0$, $\partial K^*/\partial r < 0$, $\partial L^*/\partial w < 0$, and $\partial K^*/\partial w < 0$. Since
$I^* = \delta K^*$, we also obtain $\partial I^*/\partial r < 0$ and $\partial I^*/\partial w < 0$.
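
These comparative statics signs can be illustrated with a specific production function. The sketch below assumes a Cobb-Douglas form $Q = L^a K^b$ with $a + b < 1$, which satisfies the sign conditions above; this functional form, the function name, and all numerical values are assumptions made for illustration and do not appear in the text. Taking logarithms turns (26-a) and (26-b) into two linear equations in $\log L^*$ and $\log K^*$.

```python
import math

def long_run_inputs(p, w, q, r, delta, a=0.3, b=0.4):
    """Solve (26-a) and (26-b) for (L*, K*) when Q = L**a * K**b, a + b < 1."""
    rent = (r + delta) * q                      # rent for capital, (28)
    log_wage_gap = math.log(w / (p * a))        # (a-1) log L + b log K
    log_rent_gap = math.log(rent / (p * b))     # a log L + (b-1) log K
    det = 1.0 - a - b                           # positive under decreasing returns
    log_l = ((b - 1.0) * log_wage_gap - b * log_rent_gap) / det
    log_k = ((a - 1.0) * log_rent_gap - a * log_wage_gap) / det
    return math.exp(log_l), math.exp(log_k)

# Hypothetical baseline and two perturbed parameter sets.
base_l, base_k = long_run_inputs(p=1.0, w=1.0, q=1.0, r=0.05, delta=0.1)
high_r_l, high_r_k = long_run_inputs(p=1.0, w=1.0, q=1.0, r=0.06, delta=0.1)
high_w_l, high_w_k = long_run_inputs(p=1.0, w=1.1, q=1.0, r=0.05, delta=0.1)
print(base_l, base_k)
```

Because (26-a) and (26-b) are also the first-order conditions for maximizing $pQ(L, K) - wL - cK$ with $c = (r + \delta)q$, the same function returns the solution of the static problem.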
Assuming that the values of K* and L* are uniquely determined by (26-a)
and (26-b), these values are constant as long as the prices (w, p, q, and r) are
constant. In this case, I* is also constant and equal to the amount of depreciation
$\delta K^*$. In other words, the firm's investment is constant after it reaches K*.
Jorgenson [27] obtained results in which $\hat I_t$ changes over time. This is due to
the fact that he allowed the price of capital q to vary over time, while he in the
main assumed all other prices (p, w, and r) constant.¹⁷ It is not quite clear why he
allowed this asymmetry with regard to the expectation of future prices.
Finally, let us consider the case in which the production function $Q(L_t, K_t)$
is homogeneous of degree one (constant returns to scale). In this case,
$Q_{LL}Q_{KK} - Q_{LK}^2 = 0$ for all $(L_t, K_t)$, and the above analysis should be modified. Note that,
in this constant returns to scale case, $Q_L(L_t, K_t)$ and $Q_K(L_t, K_t)$ are both
homogeneous of degree zero. Then in view of (14), $\hat K_t/\hat L_t$ is constant ($\equiv k$) for the fixed
value of w/p, as long as $Q_{LL} < 0$ and $Q_{KK} < 0$ for all $(L_t, K_t)$. Hence $Q_K =
Q_K(1, k)$ is also constant.¹⁸ Hence in view of (17), the $\lambda_t$ which satisfies (16) is
obtained as

(30) $\lambda_t = \dfrac{pQ_K}{r + \delta}$ = constant ($\equiv \lambda$)

provided that $\delta > 0$. Hence, assuming that $I_{\min} = 0$, the optimal investment is
obtained as

(24'-a) $\hat I_t = I_{\max}$ if $\lambda > q$

(24'-b) $\hat I_t = 0$ if $\lambda < q$
For the knife edge case in which $\lambda = q$, the optimal value of investment is
indeterminate. The results (24'-a) and (24'-b) correspond to the ones obtained by
Thompson and George [60] in a slightly more complicated situation [in particular,
they set $I_{\max}$ as a known function of time, say, $I_{\max} \equiv M(t)$]. Since $\lambda$ can
be interpreted as the (shadow) demand price of capital and q is the (market)
supply price of capital, the economic interpretation of (24'-a) and (24'-b) should
be self-evident. Note also that we have (by definition of $\lambda$)

$Q_K \gtreqless \dfrac{(r + \delta)q}{p}$ according to whether $\lambda \gtreqless q$

The marginal productivity rule with respect to capital as defined by (26-b) holds
only for the "knife edge" case in which $\lambda = q$. In Figure 8.19, we illustrate the
case in which $\lambda > q$ (with $0 < r + \delta < 1$).

Figure 8.19. An Illustration of $\lambda > q$. [Axes: $k \equiv K/L$ (horizontal), $pQ_K(1, k)$ (vertical).]

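The time path of the optimal stock of capital is obtained from (15) and
(24') as

(23'-a) $\hat K_t = K_0 e^{-\delta t} + \dfrac{I_{\max}}{\delta}(1 - e^{-\delta t})$ if $\lambda > q$

(23'-b) $\hat K_t = K_0 e^{-\delta t}$ if $\lambda < q$

Clearly,

$\lim_{t \to \infty} \hat K_t = \dfrac{I_{\max}}{\delta}$ if $\lambda > q$

$\lim_{t \to \infty} \hat K_t = 0$ if $\lambda < q$

The time path of $\hat K_t$ for the constant returns to scale case is illustrated in Figure
8.20.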
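
The decision rule (24') and the limits above can be sketched for a concrete constant returns technology. The example assumes $Q = L^a K^{1-a}$, so that $Q_L = a k^{1-a}$ and $Q_K = (1 - a)k^{-a}$ with $k \equiv K/L$; the functional form, the function names, and the numbers are hypothetical and serve only as an illustration.

```python
def shadow_price(p, w, r, delta, a=0.5):
    """lambda in (30) for the assumed technology Q = L**a * K**(1 - a)."""
    k = (w / (p * a)) ** (1.0 / (1.0 - a))   # K/L ratio from Q_L = w/p, as in (14)
    q_k = (1.0 - a) * k ** (-a)              # Q_K(1, k), constant over time
    return p * q_k / (r + delta)

def long_run_capital(lam, q, i_max, delta):
    """Limit of the capital path under (24'), with I_min = 0."""
    if lam > q:
        return i_max / delta     # invest at the ceiling forever
    if lam < q:
        return 0.0               # never invest; capital depreciates away
    return None                  # knife edge: investment is indeterminate

# Hypothetical values, for illustration only.
lam = shadow_price(p=1.0, w=0.5, r=0.05, delta=0.1)
print(lam, long_run_capital(lam, q=2.0, i_max=20.0, delta=0.1))
```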

Figure 8.20. The Time Path of $\hat K_t$ (the Constant Returns to Scale Case).

We have remarked that $\lambda = q$ is the "knife edge" case. Although this may be
true for the behavior of an individual firm, the situation $\lambda \neq q$ cannot continue forever,
if every firm behaves under the rule described above and if every firm has
more or less the same production function (that is, the same technical efficiency).
For example, if $\lambda > q$, the total demand for capital for the market as a whole
would exceed its supply and the price of the capital good (q) would sooner or
later rise to the point at which $\lambda = q$. Notice that in the meantime, the demand
for labor would increase as $\hat K_t$ increases in order to keep k constant, which
might push up the real wage rate. Then the value of k would decrease to keep the
relation $Q_L = w/p$, which in turn would increase $\lambda$. In other words, the equilibrium
would be realized at an increased level of $\lambda$. In any case, if $\lambda = q$ is to be achieved
sooner or later, then $\lambda = q$ is not really a knife edge case. Notice also that under
$\lambda = q$, the marginal productivity rule with respect to capital, (26-b), is also
realized, although the volumes of optimal investment and capital stock become
indeterminate.

c. THE CASE WITH ADJUSTMENT COSTS

Introduction
In the above analysis, it is assumed that the firm can obtain any amount of
investment $I_t$ without affecting the investment price $q_t$. This is true for each t,
and it does not matter whether $q_t$ is constant over time or not. This assumption
has recently been criticized on the basis of the "fixity" of capital. Fixity of capital
was considered to be the basis of Marshall's well-known distinction between
"short-run" and "long-run" analyses.
What then is the "fixity" of capital? Although there are a number of ways
to introduce this concept into the firm's maximization problem, here we consider
it as the cost per unit of gross investment rising with the investment rate. This
cost behavior can be rationalized, for example, (1) by postulating a monopsonistic
capital goods market, or (2) by introducing internal costs of investment which
are the sum of purchase costs (with either perfect or imperfect factor markets)
and installation costs. What then is the mathematical specification of such adjustment
costs of capital? Eisner and Strotz [14], Lucas [45], and Gould [18]
suggested the following form to replace $q_tI_t$:

(31) $C_t = C(I_t)$
where $C(I_t) > 0$, $C'(I_t) > 0$, $C''(I_t) > 0$ for all $I_t > 0$, $C'(0) > 0$, and $C(0) = 0$.¹⁹
The condition $C''(I_t) > 0$ means that adjustment costs will be greater, on the
average, the greater the rate of investment.
On the other hand, Lucas [47], viewing the adjustment cost as the internal
cost of the output foregone, introduced adjustment costs by altering the usual
production function (3) to the following form:
(32) $Q_t = Q(L_t, K_t, I_t)$
where it is assumed that $\partial Q/\partial I_t < 0$ and $\partial^2 Q/\partial I_t^2 < 0$ for all $(L_t, K_t, I_t) > 0$.²⁰
Clearly, the choice of the mathematical formulation may affect the conclusion.
Such a choice would depend on empirical considerations and will vary
between industries. Here we simply adopt the form presented in (31).
The firm's maximization problem is now slightly altered by this modification;
$q_t$ is no longer exogenous to the problem. In the definition of H in (6), we should
replace $q_tI_t$ by $C(I_t)$. In other words, we rewrite the firm's problem as follows.
PROBLEM II: Choose the time path of $L_t$ and $I_t$ so as to

Maximize: $\int_0^\infty e^{-rt}[pQ(L_t, K_t) - wL_t - C(I_t)]\,dt$

Subject to: $\dot K_t = I_t - \delta K_t$

$I_{\min} \le I_t \le I_{\max}$
$L_t \ge 0$, $K_t \ge 0$
and a given value of $K_0$
Write the Hamiltonian H now as²¹

(6') $H \equiv e^{-rt}[pQ(L_t, K_t) - wL_t - C(I_t)] + \mu_t[I_t - \delta K_t]$

and define $\lambda_t$ again by²²

$\lambda_t \equiv \mu_t e^{rt}$

Denote again the optimal path by $(\hat K_t, \hat L_t, \hat I_t)$. Assuming that the function Q is
concave and noting that C is (strictly) convex, the following conditions are
sufficient as well as necessary for an optimum:
(33) $\dot{\hat K}_t = \hat I_t - \delta \hat K_t$

(34) $\dot\lambda_t = (r + \delta)\lambda_t - pQ_K$

where $Q_K \equiv \partial Q/\partial K_t$, evaluated at $(\hat L_t, \hat K_t)$;

(35) $Q_L = \dfrac{w}{p}$

where $Q_L \equiv \partial Q/\partial L_t$, evaluated at $(\hat L_t, \hat K_t)$;
(36) $\lambda_t = C'(\hat I_t)$
and²³

(37) $\lim_{t \to \infty} \lambda_t e^{-rt} = 0$

In obtaining (35), it is assumed that $\hat L_t > 0$. In obtaining (36), it is assumed that
$I_{\min} < \hat I_t < I_{\max}$. It is important to note that the introduction of adjustment costs
by the function C enables us to assume the existence of an interior solution.²⁴ In
other words, the "bang-bang" characteristic of the optimal investment in the
previous section disappears with the introduction of adjustment costs.
Nonconstant Returns to Scale
In the literature with adjustment costs, it is usually assumed that the production
function Q is homogeneous of degree one. However, unless we consider
the investment problem on the macro-economic level and worry about the exhaustion
of the product, there seems to be no need to assume constant returns
to scale. Here we consider the nonconstant returns to scale case. Following the
previous section, we specifically assume that $Q(L_t, K_t)$ satisfies the following
condition:
(38) $Q_{LL} < 0$, $Q_{KK} < 0$, $Q_{LK} > 0$, and $Q_{LL}Q_{KK} - Q_{LK}^2 > 0$
for the "relevant" neighborhood of the optimal path $(\hat L_t, \hat K_t)$.²⁵
Then, as we did in the previous subsection, we can obtain $\hat L_t$ from (35) as
(39) $\hat L_t = L(\hat K_t, w/p)$
and write
(40) $pQ_K = \phi(\hat K_t)$

where $\phi' < 0$ for all $\hat K_t$ at each t.

Since $C''(I_t) > 0$ for all $I_t$, we may write, in view of (36),
(41) $\hat I_t = g(\lambda_t)$, for all $\lambda_t > C'(0)$
where $g' > 0$ for all $\lambda_t$. Using (33), (34), (40), and (41), we can construct the phase
diagram shown in Figure 8.21 to describe the dynamic path of $\lambda_t$ and $\hat K_t$.
The equation for the ($\dot\lambda_t = 0$)-curve is

(42-a) $\lambda_t = \dfrac{\phi(\hat K_t)}{r + \delta} \left(= \dfrac{pQ_K}{r + \delta}\right)$

and the equation for the ($\dot{\hat K}_t = 0$)-curve is

(42-b) $\hat K_t = \dfrac{g(\lambda_t)}{\delta}$

Figure 8.21. The Dynamic Path of $(\lambda_t, \hat K_t)$.

The values of $\lambda_t$ and $\hat K_t$ at the intersection of these two curves are denoted by $\lambda^*$
and K*, respectively, and are defined by the following equations:

(43-a) $\lambda^* = \dfrac{\phi(K^*)}{r + \delta}$

(43-b) $K^* = \dfrac{g(\lambda^*)}{\delta}$

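In order to have K* > 0, we require that

(44) $\lambda^* > C'(0)$
Moreover, from (43) we can easily see that
(45) $\dfrac{\partial K^*}{\partial r} < 0$, $\dfrac{\partial \lambda^*}{\partial r} < 0$, $\dfrac{\partial K^*}{\partial p} > 0$, $\dfrac{\partial \lambda^*}{\partial p} > 0$
Hence, writing $I^* \equiv g(\lambda^*)$, we obtain
(45') $\dfrac{\partial K^*}{\partial r} < 0$, $\dfrac{\partial I^*}{\partial r} < 0$, $\dfrac{\partial K^*}{\partial p} > 0$, $\dfrac{\partial I^*}{\partial p} > 0$

In other words, an increase in the interest rate (resp. the price of output) lowers
(resp. increases) the "long-run" desired stock of capital (K*) and investment (I*).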
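
The steady state defined by (43) and the signs in (45) can be illustrated numerically. The sketch below assumes $Q = L^a K^b$ with $a + b < 1$, so that $\phi(K) = pQ_K[L(K, w/p), K]$ is decreasing in K, and a quadratic adjustment cost $C(I) = c_0 I + \tfrac{1}{2}c_1 I^2$, so that $g(\lambda) = (\lambda - c_0)/c_1$. Both functional forms, the function names, and all numbers are assumptions made for illustration. The root of $g(\phi(K)/(r + \delta)) = \delta K$ is located by bisection on a logarithmic scale.

```python
def phi(k, p, w, a=0.3, b=0.4):
    """p * Q_K with labor chosen from Q_L = w/p, for the assumed Q = L**a * K**b."""
    labor = (a * p * k ** b / w) ** (1.0 / (1.0 - a))   # (39)
    return p * b * labor ** a * k ** (b - 1.0)          # (40)

def g(lam, c0=1.0, c1=0.5):
    """Inverse of C'(I) = c0 + c1 * I for the assumed quadratic cost, as in (41)."""
    return (lam - c0) / c1

def steady_state_with_adjustment_cost(p, w, r, delta, lo=1e-9, hi=1e9):
    """Return (K*, lambda*) solving (43-a) and (43-b) by geometric bisection."""
    def excess(k):
        # investment implied by (42-a) minus depreciation; decreasing in k
        return g(phi(k, p, w) / (r + delta)) - delta * k
    for _ in range(200):
        mid = (lo * hi) ** 0.5
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    k_star = (lo * hi) ** 0.5
    return k_star, phi(k_star, p, w) / (r + delta)

# Hypothetical baseline and perturbations.
k_base, lam_base = steady_state_with_adjustment_cost(p=1.0, w=1.0, r=0.05, delta=0.1)
k_high_r, lam_high_r = steady_state_with_adjustment_cost(p=1.0, w=1.0, r=0.06, delta=0.1)
k_high_p, lam_high_p = steady_state_with_adjustment_cost(p=1.1, w=1.0, r=0.05, delta=0.1)
print(k_base, lam_base)
```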
There are three types of $(\hat K_t, \lambda_t)$-paths in Figure 8.21: (1) the path in which
both $\hat K_t$ and $\lambda_t$ decrease over time; (2) the path in which $\lambda_t \to \infty$; (3) the path in
which $\hat K_t \to K^*$ as $t \to \infty$. Clearly only the third type of path is eligible.²⁶
In the third path, $\hat K_t$ monotonically approaches K*, as indicated in Figure
8.21. Notice that at K*
(46) $\hat I_t = I^* = g(\lambda^*) = \delta K^*$
as we can see from (43-b). In other words, the optimal investment at $(K^*, \lambda^*)$ is
just equal to the amount of depreciation. Now observe that
(47) $\hat I_t - \delta \hat K_t = \hat I_t - I^* - (\delta \hat K_t - \delta K^*)$
$= [g(\lambda_t) - g(\lambda^*)] - \delta(\hat K_t - K^*)$
In the third path, $\lambda_t$ approaches $\lambda^*$ as $\hat K_t$ approaches K*. Therefore, from (47),
$\hat I_t - \delta \hat K_t$ approaches zero as $\hat K_t$ approaches K*. This implies that it takes an infinite
amount of time to reach the steady state $(K^*, \lambda^*)$. We thus conclude the
following:
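(48-a) $K_0 < K^*$: $\dfrac{d\hat K_t}{dt} > 0$, $\dfrac{d\hat I_t}{dt} < 0$ for all $t \ge 0$, $\lim_{t \to \infty} \hat K_t = K^*$, $\lim_{t \to \infty} \hat I_t = I^*$

(48-b) $K_0 > K^*$: $\dfrac{d\hat K_t}{dt} < 0$, $\dfrac{d\hat I_t}{dt} > 0$ for all $t \ge 0$, $\lim_{t \to \infty} \hat K_t = K^*$, $\lim_{t \to \infty} \hat I_t = I^*$

(48-c) $K_0 = K^*$: $\hat K_t = K^*$ and $\hat I_t = I^*$ for all $t \ge 0$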
In Figure 8.22, we illustrate the time path of optimal investment $\hat I_t$.
Notice also that
(49) $\dfrac{d\hat K_t}{dt} = \hat I_t - \delta \hat K_t = (\hat I_t - I^*) + \delta(K^* - \hat K_t)$
Since $\hat I_t \neq I^*$ for all $t \ge 0$ if $K_0 \neq K^*$, we have
(50) $\dfrac{d\hat K_t}{dt} \neq \delta(K^* - \hat K_t)$ for any t

In other words, the usual response function which appears in the literature cannot
hold for any t (see subsection d).
Figure 8.22. The Time Path of $\hat I_t$.
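
The asymptotic approach to $(K^*, \lambda^*)$ and the failure of the simple response function in (50) can be seen from a local approximation. The following is a sketch only: it assumes that g and $\phi$ are differentiable at the steady state, and the symbols J, $\rho_1$, and $\rho_2$ are introduced here for illustration and are not used elsewhere in the text.

```latex
% Linearization of (33)-(34), with \hat I_t = g(\lambda_t) and pQ_K = \phi(\hat K_t),
% around the steady state (K^*, \lambda^*).
\begin{aligned}
\frac{d}{dt}
\begin{pmatrix} \hat K_t - K^* \\ \lambda_t - \lambda^* \end{pmatrix}
&\approx
J \begin{pmatrix} \hat K_t - K^* \\ \lambda_t - \lambda^* \end{pmatrix},
\qquad
J \equiv \begin{pmatrix} -\delta & g'(\lambda^*) \\ -\phi'(K^*) & r + \delta \end{pmatrix}, \\
\det J &= -\delta(r + \delta) + g'(\lambda^*)\,\phi'(K^*) < 0, \\
\rho_{1,2} &= \tfrac{1}{2}\Bigl[\, r \mp \sqrt{(r + 2\delta)^2 - 4\,g'(\lambda^*)\,\phi'(K^*)}\,\Bigr],
\qquad \rho_1 < -\delta < 0 < \rho_2, \\
\hat K_t - K^* &\approx (K_0 - K^*)\,e^{\rho_1 t}
\quad \text{along the convergent path.}
\end{aligned}
```

Since $\det J < 0$, the steady state is a saddle point, and along the convergent path the gap $\hat K_t - K^*$ shrinks exponentially without vanishing in finite time. Near the steady state the path behaves like $d\hat K_t/dt \approx |\rho_1|(K^* - \hat K_t)$ with $|\rho_1| > \delta$ whenever $g'\phi' < 0$, which agrees with (50).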


Constant Returns to Scale

In this case, $Q_{LL}Q_{KK} - Q_{LK}^2 = 0$ for all $(L_t, K_t)$ and $\phi' = 0$. On the other hand,
equation (35) uniquely determines the value of the capital:labor ratio k for each
fixed w/p, as long as $Q_{LL} < 0$ and $Q_{KK} < 0$ for all $(L_t, K_t)$. Therefore the value of $Q_K$
becomes constant for all t.²⁷ Write $pQ_K \equiv c$. Then from (34) we have

(51) $\dot\lambda_t = (r + \delta)\lambda_t - c$

This is a linear differential equation, and its solution is obtained as

(52) $\lambda_t = \dfrac{c}{r + \delta} + Ae^{(r + \delta)t}$

where A is the integrating constant. In view of the right end-point condition (37), A
must be zero. Hence

(53) $\lambda_t = \dfrac{c}{r + \delta}$ ($\equiv \lambda$)

which corresponds to (30). Then in view of (36) with $C'' > 0$, and hence in view
of (41), we obtain

(54) $\hat I_t = g\left[\dfrac{c}{r + \delta}\right] \equiv \bar I$ (= a positive constant)

for all t, where it is assumed that

(55) $\dfrac{c}{r + \delta} > C'(0)$

Therefore we conclude that the optimal investment is constant and positive for all
t. Notice that we obtained this result without assuming the quadratic approximation
of the adjustment cost function which is the usual convention in the
literature.²⁸
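Since the function g is monotone increasing, it is easy to conclude

(56) $\dfrac{\partial \bar I}{\partial r} < 0$, $\dfrac{\partial \bar I}{\partial p} > 0$, $\dfrac{\partial \bar I}{\partial \delta} < 0$

in view of (54).
The path of the optimal capital stock, $\hat K_t$, is easily obtained from (33) as

(57) $\dfrac{d\hat K_t}{dt} = \bar I - \delta \hat K_t$

The solution of this linear differential equation is easily obtained as

(58) $\hat K_t = \dfrac{\bar I}{\delta}(1 - e^{-\delta t}) + K_0 e^{-\delta t}$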

Notice that $\hat K_t$ monotonically approaches $\bar K$ as $t \to \infty$ regardless of the values of
$K_0$, where $\bar K$ is defined by

(59) $\bar K \equiv \dfrac{\bar I}{\delta}$

Notice also that

$\dfrac{\bar I}{\delta}(1 - e^{-\delta t}) = \int_0^t \bar I e^{-\delta\tau}\,d\tau$

Thus the first term on the RHS of (58) signifies the investment accumulated up to
time t. The second term of the RHS of (58) is the initial capital stock left over
at time t. The time path of $\hat K_t$ is illustrated in Figure 8.23. It takes an infinite
amount of time to reach $\bar K$ if $K_0$ is different from $\bar K$.
Figure 8.23. The Time Path of $\hat K_t$.
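
The constant investment level in (54) and the capital path in (58) can be sketched for a quadratic adjustment cost $C(I) = c_0 I + \tfrac{1}{2}c_1 I^2$, for which $g(\lambda) = (\lambda - c_0)/c_1$. The cost function, the function names, and the parameter values are assumptions made for illustration; the text itself does not require a quadratic form.

```python
import math

def constant_investment(c, r, delta, c0=1.0, c1=0.5):
    """I-bar from (54) for the assumed cost C(I) = c0*I + 0.5*c1*I**2."""
    lam = c / (r + delta)                  # (53)
    if lam <= c0:
        raise ValueError("condition (55) fails: c/(r + delta) must exceed C'(0)")
    return (lam - c0) / c1                 # g(lambda), the inverse of C'

def capital(t, k0, i_bar, delta):
    """Closed-form capital path (58)."""
    decay = math.exp(-delta * t)
    return i_bar / delta * (1.0 - decay) + k0 * decay

# Hypothetical values, for illustration only.
R, DELTA, C_MARGINAL = 0.05, 0.1, 0.3      # C_MARGINAL plays the role of c = p * Q_K
i_bar = constant_investment(C_MARGINAL, R, DELTA)
k_bar = i_bar / DELTA                      # (59)
for t in (0.0, 10.0, 50.0):
    k_t = capital(t, 5.0, i_bar, DELTA)
    # along (58), the growth of capital equals delta * (k_bar - k_t)
    print(t, k_t, DELTA * (k_bar - k_t))
```

The last printed column is the right-hand side of (57) rewritten with $\bar I = \delta \bar K$, so the path behaves like a partial adjustment toward $\bar K$ at speed $\delta$.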

In view of (56) and (58), we also obtain

(60) $\dfrac{\partial \hat K_t}{\partial r} < 0$ and $\dfrac{\partial \hat K_t}{\partial p} > 0$ for all t

Notice that the strict convexity of the function C (that is, $C'' > 0$) guarantees
the uniqueness of the optimal investment as a result of (54), and thus guarantees
the uniqueness of $\hat K_t$ from (58), which in turn implies the uniqueness of $\hat L_t$ since
$\hat K_t/\hat L_t = k$ = constant. In other words, the strict convexity guarantees the uniqueness
of the solution $(\hat K_t, \hat L_t, \hat I_t)$ (almost everywhere), in spite of the fact that Q is
homogeneous of degree one, and hence not strictly concave.

d. SOME REMARKS

The Response Function

In subsection b, we pointed out that the firm adjusts very quickly to its "long-run"
desired stock of capital K* if the adjustment is cost free and frictionless and if
$I_{\max}$ is large enough or $I_{\min}$ is small enough. This more or less corresponds to the
Lerner-Haavelmo observation. If all the prices, p, $w_t$, $q_t$, and r, are constant, then
the desired stock of capital K*, whose value depends on the parameters p, w, q, and
r, would be constant. In Jorgenson's study [27], these prices are not constant;
hence the desired stock of capital is not constant, and is hence denoted by $K_t^*$.

Jorgenson and his associates then insisted that the actual stock of capital $K_t$ is
in general different from the desired level. In "reality," the firm cannot adjust its
stock of capital to the desired level instantaneously and frictionlessly. Hence the
investment demand at time t will be determined in such a way as to accommodate
this adjustment.
Suppose that the response mechanism is represented in such a way that
actual capital is a weighted average of all past levels of desired capital with
geometrically declining weight. That is
(61) $K_t = \sum_{\tau=1}^{\infty} a_\tau K^*_{t-\tau}$
where²⁹

(62) $\sum_{\tau=1}^{\infty} a_\tau = 1$ and $a_\tau \ge 0$, $\tau = 1, 2, \ldots$

In the continuous time model, such a distributed lag can be represented by

(63) $K_t = \int_0^\infty a_\tau K^*_{t-\tau}\,d\tau$

where

(64) $\int_0^\infty a_\tau\,d\tau = 1$ and $a_\tau \ge 0$ for all $\tau \ge 0$
If the lag function $a_\tau$ is a simple exponential lag
(65) $a_\tau = ae^{-a\tau}$, $a > 0$
then it is well known that (63) can be rewritten as³⁰
(66) $\dot K_t = a(K_t^* - K_t)$
where a may be termed the response parameter or the speed of adjustment.
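
The discrete-time counterpart of this result can be checked directly. With geometric weights $a_\tau = (1 - \rho)\rho^{\tau - 1}$, which satisfy (62), the distributed lag (61) is equivalent to the recursion $K_{t+1} = \rho K_t + (1 - \rho)K_t^*$, a partial adjustment rule with speed $1 - \rho$. The sketch below is illustrative only: the function name, the parameter $\rho$, the assumption of a constant desired stock before date 0, and the sample series are all hypothetical.

```python
def lagged_capital(desired, rho, prehistory):
    """K_t from (61) with weights a_tau = (1 - rho) * rho**(tau - 1).

    desired[s] is the desired stock K*_s for s = 0, 1, ...; for s < 0 the
    desired stock is assumed constant and equal to `prehistory`.
    Returns K_t for t = 0, ..., len(desired).
    """
    path = []
    for t in range(len(desired) + 1):
        recent = sum((1.0 - rho) * rho ** (tau - 1) * desired[t - tau]
                     for tau in range(1, t + 1))
        tail = rho ** t * prehistory     # closed form of the remaining geometric terms
        path.append(recent + tail)
    return path

# Hypothetical desired-capital series, for illustration only.
RHO = 0.6
desired_stock = [100.0, 120.0, 90.0, 110.0, 110.0]
actual_stock = lagged_capital(desired_stock, RHO, prehistory=100.0)
for t, k_star in enumerate(desired_stock):
    # partial adjustment form: both columns should agree
    print(actual_stock[t + 1], RHO * actual_stock[t] + (1.0 - RHO) * k_star)
```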
It is possible to select a lag function other than the above simple exponential
lag, and the specification of the lag function would, in general, affect the results.
For example, in their empirical studies of comparing alternative theories of
investment, Jorgenson and Siebert ([29], p. 688) remarked the following:
Misspecification of the lag distribution for a given theory of investment
behavior may bias the results of our comparison. Accordingly, we choose the
best lag distribution for each alternative specification of desired capital from
among the class of general Pascal distributed lag functions.³¹
In a number of empirical studies of the investment function, Jorgenson and his
associates have obtained good statistical fits for Jorgenson's "neo-classical"
theory. (See, for example, [24], [25], and [29]-[37].)
However, there is a serious difficulty in the above procedure. In obtaining
the desired stock of capital, it is assumed that the adjustment is instantaneous
and frictionless. Hence the imposition of a response function such as (61) or (66)
for empirical estimation is a serious inconsistency in theory (regardless of the
specification of the lag function), for it implies that the adjustment to the desired
stock of capital is neither instantaneous nor frictionless. If the adjustment is not
instantaneous and involves some friction, then this should be incorporated in the
maximization behavior of the firm. Thus, with regard to the treatment of the
investment function by Jorgenson and his associates, Uzawa [65] comes to the
following conclusion:

It goes without saying that the investment function ... which is derived
from logically completely inconsistent assumptions, cannot have any
economic and empirical significance, no matter how good the statistical
fit of that function may be. (Translation is mine.)

In other words, a response mechanism such as (61) or (66) affects the profits, and
hence affects the desired stock of capital.
The introduction of adjustment costs by such authors as Treadway [62],
Lucas [45] and [47], and Gould [18], as described in subsection c is, no doubt,
based on serious skepticism about the above difficulty in the studies by Jorgenson
and his associates.
However, the result they obtained is remarkable. As we can observe from
our discussion in subsection c [especially from equations (57) and (59)], if the
production function exhibits constant returns to scale with respect to labor and
capital and if the adjustment cost function is strictly convex with respect to
investment ($I_t$), then we obtain a result which states that $\hat I_t = \bar I$ (constant) for all
t and

(67) $\dfrac{d\hat K_t}{dt} = \delta(\bar K - \hat K_t)$

where $\delta$ is the rate of depreciation. From (67) it is concluded that it is optimal for
the firm to adjust in the way described by (66) where $a = \delta$.
Now we should comment on this result. As remarked earlier, this result
depends crucially on the assumption of constant returns to scale. As we showed
in subsection c, we cannot obtain a nice response function such as (66) for the case
of nonconstant returns to scale.
Secondly, one crucial assumption in the formulation by Gould, Treadway,
Lucas, and so on, is that there is no lag between the investment decision and the
realization of the decision. This is in sharp contrast to the usual rationalization
of a time lag. For example, Jorgenson made the following (empirically) very sound
remark ([25], p. 2):
... we divide the investment process into separate stages. The first stage of the
process is a change in the demand for capital services. Subsequent to an
alteration in demand for capital services, architectural and engineering plans
must be drawn up, cost estimates prepared, funds appropriated and funds
committed through the issuing of orders for equipment or the letting of
contracts for construction. Actual investment expenditure is the final stage
in the investment process. Only after a given investment project has passed
through each of the intermediate stages can actual investment expenditure
take place.

Clearly this seems to be the motivation of the empirical studies by Jorgenson
and his associates. In the adjustment cost study as described in subsection c, such
a time lag is ignored.
It is possible to argue, at least in a purely theoretical framework, that as
long as all future prices are known to the firm and as long as the firm remains
competitive (that is, a pure price taker), the firm would lay out the investment
plans for all future time. Hence, except for the initial period, the time lag as
described by Jorgenson does not matter. If, for example, there is a construction
lag of $\theta$, the order will be given out $\theta$ periods in advance.
Obviously this is unrealistic, as the firm usually neither knows all future
prices, nor, as assumed in subsection c, expects constant prices. However, the
basic criticism of the studies by Jorgenson and his associates still stands: If there
is such a time lag, it ought to be incorporated into the optimization procedure of
the firm.
The introduction of this time lag into the maximization problem with
changing future prices together with the cost of adjustments is left to the interested
reader. One problem remains: The response function, such as (66), is not likely to
come out as a criterion function for the maximization anymore. The present study,
as a critical summary of the existing theory, is only intended to serve as a frame-
work for such a study in the future.
Uzawa on the "Penrose Effect"
Uzawa in a number of papers proposed to take a closer look at the actual
behavior of firms, and in particular he argued that we should utilize the study by
Penrose [54].³²
He summarizes the Penrose effect as follows ([64], p. 4):
The managerial and administrative abilities required by a firm in the process
of growth are basically different from those which are needed in the mere
management of the existing administrative structure of the firm. The nature
of such a process may be conveniently summarized by a schedule relating
the level of investment ... with the rate by which the stock of real capital
available to the firm grows.

Uzawa postulates the following functional relation

(68) $\dfrac{I_t}{K_t} = \pi\left(\dfrac{\dot K_t}{K_t}\right)$

where $\pi' > 0$ and $\pi'' > 0$; $\pi'' > 0$ signifies that the marginal effect of investment
upon the growth process is diminishing.³³ The $\pi$-schedule depends on how the
managerial and administrative resources are accumulated in the course of the
growth of the firm. Notice that $K_t$ in (68) is not the usual physical capital stock,
but rather it incorporates the scarcity of managerial and administrative resources.
In essence, the Penrose effect seems to treat the managerial and administra-
tive resources as another factor of production. It is assumed that such resources
are directly related to the accumulated physical stock of capital. Moreover,
instead of directly relating $\dot K_t$ to $I_t$, Uzawa's Penrose function relates $\dot K_t/K_t$ to
$I_t/K_t$. These conventions enable him to avoid the problem of how to measure
managerial and administrative resources,³⁴ to simplify mathematical deduction
considerably, and thus to obtain some definite conclusions.
The Penrose function is illustrated in Figure 8.24 and it is called the Penrose
curve.

Figure 8.24. The Penrose Curve.

It is important to realize that the administrative and managerial resources
which define the Penrose curve specify a kind of adjustment cost.³⁵ In other words,
Uzawa's formulation of the investment function is in the same line of development
as the investment theory which takes account of adjustment costs such as that of
Treadway, Lucas, Gould, and so on; however, Uzawa's theory is perhaps more
sophisticated than theirs.
Note also that by the specification of the Penrose function, the usual
depreciation relation
(69) $\dot K_t = I_t - \delta K_t$
is already included. This relation may be considered as the case in which the
Penrose curve takes a special form:

(69') $\dfrac{I_t}{K_t} = \dfrac{\dot K_t}{K_t} + \delta$

The problem of the firm is again to maximize the present value of all future
profits subject to the constraints

$Q(L_t, K_t) - Q_t = 0$ and $\dfrac{I_t}{K_t} = \pi\left(\dfrac{\dot K_t}{K_t}\right)$

It is again assumed that all prices, p, w, q, and r, are kept constant.

Notice that the usual depreciation equation or equation (69') is replaced
by equation (68), which denotes the Penrose effect. Equation (69) should be
dismissed, for under the present circumstances it misses the essence of the Penrose
effect, that is, the scarcity of managerial and administrative resources. On the
other hand, if we assume that the physical capital is a single entity with units
measurable in a normal way, then an equation such as (69) or variants thereof
cannot be avoided. This then causes an inconsistency with the Penrose function.
However, Uzawa's Kt is not the usual physical capital stock, as remarked above.
Instead, it incorporates the scarcity of managerial and administrative resources.
Moreover, Uzawa supposed that a spectrum of different capital goods is used. In
order to avoid the difficulty involved in measuring the (real) unit of capital³⁶ and
to incorporate the Penrose effect into his analysis, he invented an index to
represent the unit. We omit such a discussion by simply referring the reader to
his [66], pp. 637-639.³⁷
We are now ready to state Uzawa's problem as an optimal control problem.
PROBLEM III: Choose $L_t$ and $Z_t$ so as to³⁸

Maximize: $\int_0^\infty e^{-rt}\left[pQ(L_t, K_t) - wL_t - q\pi\left(\dfrac{Z_t}{K_t}\right)K_t\right]dt$

Subject to: $\dot K_t = Z_t$
$Z_{\min} \le Z_t \le Z_{\max}$
$K_t \ge 0$, $L_t \ge 0$
and a given value of $K_0$

Here $K_t$ is the state variable and $Z_t$ and $L_t$ are the control variables. The
introduction of a new control variable like $Z_t$ is a standard practice in control theory.
We omit the discussion of the procedure of obtaining the solution to this
problem, for it is already described in Uzawa ([64] and [66]), and the reader,
if he so wishes, should easily be able to obtain it by himself.³⁹
Assuming the linear homogeneity⁴⁰ of the function Q, Uzawa obtained the
following results:
(70-a) $\hat z_t$ = constant ($\equiv z$) for all t
(70-b) $\hat k_t$ = constant ($\equiv k$) for all t
(70-c) $\lambda_t$ = constant ($\equiv \lambda$) for all t
where $z_t \equiv Z_t/K_t$ and $k_t \equiv K_t/L_t$. The value of k is determined by $Q_L = w/p$ as
a result of the homogeneity assumption. The value of z is obtained as a solution of
the problem of maximizing
(71) $v \equiv \dfrac{c - q\pi(z)}{r - z}$, where $c \equiv pQ_K$
with respect to z.⁴¹
Since $\dot K_t/K_t = z_t$ and $I_t/K_t = \pi(z_t)$, the above solution implies

(72) $\hat K_t = K_0 e^{zt}$

(73) $\hat I_t = I_0 e^{zt}$
where $I_0 \equiv K_0\pi(z)$, and

(74) $\hat L_t = L_0 e^{zt}$

where $L_0 \equiv K_0/k$. In other words, once z is determined to be a positive constant,
the firm's demand for labor and capital (and investment) grows exponentially.
This solution is in sharp contrast to the solutions obtained in subsections b and c,
in which there exist stationary values for optimal labor, capital, and investment.
My concern with Uzawa's solution is simply that this can be in contradiction to the
assumption that the firm is competitive and a pure price taker. If the firm keeps
growing according to Uzawa's solution, then it may become large compared with
the rest of the economy. Certainly the entire economy may be growing at the
same time so that the relative position of the firms may be kept small enough
to be competitive. But this is something which remains to be shown.
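
The choice of the growth rate z in (71) can be illustrated numerically. The sketch below assumes a hypothetical Penrose schedule $\pi(z) = \delta + z + \tfrac{1}{2}\kappa z^2$, which satisfies $\pi' > 0$ and $\pi'' > 0$ for $z \ge 0$ and reduces to (69') when $\kappa = 0$. The schedule, the function names, and the values of c, q, and r are assumptions made for illustration; the maximizer is located by a ternary search on $[0, r)$, which presumes that v is single-peaked there (as it is for these numbers).

```python
import math

def pi_curve(z, delta=0.05, kappa=20.0):
    """Hypothetical Penrose schedule: investment per unit of capital needed to grow at rate z."""
    return delta + z + 0.5 * kappa * z * z

def value_per_unit_capital(z, c, q, r):
    """The maximand v in (71); requires z < r for the integral to converge."""
    return (c - q * pi_curve(z)) / (r - z)

def best_growth_rate(c, q, r, tol=1e-10):
    """Ternary search for the maximizer of v on [0, r)."""
    lo, hi = 0.0, r - 1e-6
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if value_per_unit_capital(m1, c, q, r) < value_per_unit_capital(m2, c, q, r):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

# Hypothetical values, for illustration only.
C_RETURN, Q_PRICE, R_RATE = 0.2, 1.0, 0.1
z_hat = best_growth_rate(C_RETURN, Q_PRICE, R_RATE)
capital_after_ten_years = 100.0 * math.exp(z_hat * 10.0)    # (72) with K_0 = 100
print(z_hat, value_per_unit_capital(z_hat, C_RETURN, Q_PRICE, R_RATE))
```

For these particular numbers the first-order condition $v = q\pi'(z)$ reduces to $10z^2 - 2z + 0.05 = 0$, whose root in $[0, r)$ is $z = (2 - \sqrt{2})/20$.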
However, Uzawa's theory is apparently aiming at a theory of aggregate
investment, which is clear from his various papers. Then the above inconsistency
may not be of any serious consequence. One natural question at this point is
whether or not the supply of labor grows exponentially at the rate z as its demand
prescribes. This is obviously highly unlikely. Then there is a possible gap between
demand for and supply of labor at the aggregate level, and the real wage rate will
increase or decrease depending on whether there is an excess demand for or an
excess supply of labor. The firm then changes the plan accordingly. A similar
situation exists with respect to the supply of capital. The supply of capital is
obviously funded from saving. Thus there is a possible gap between the demand
for and supply of capital. Uzawa did realize these points. Assuming that the supply
of labor grows at a constant rate and devising a theory of saving, he concluded
that there is a steady state solution in the economy which is dynamically stable.
It is assumed that the firm makes its decision on the assumption that the current
interest rate and prices continue to remain the same for all future times. Clearly,
the firm realizes its expectations are wrong in every case, for these prices and
the interest rate change continuously over time. It is assumed that the firm never
learns from past experience, which, to me, is highly dubious.
In this connection, the following confession by Uzawa with regard to the
weakness of his theory seems to be pertinent ([66], pp. 651-652):
However, the most serious limitation of the present analysis is the hypothesis
that the aggregative behavior of each of two major sectors of the national
economy may be explained in terms of the representative unit which behaves
itself in a way similar to each individual unit. It might be less objectionable for
a static or stationary analysis, but an economic model which is purportedly
analyzing the mechanism of a growing economy would be deemed question-
able if enough attention were not paid to the process of aggregation.
Needless to say, if the economy is in the stationary state, all the prices and the
interest rate will presumably be constant; hence there should be no problem
of aggregation nor should there be a problem of not learning from past experience.
However, in Uzawa's investment theory, the firm wishes to expand exponentially
without any limit; hence it cannot be a static theory. In other words, it seems
that his theory also contains an inconsistency.

Some Extensions

As remarked earlier, the purpose of the present study is partly expository;
hence several heroic assumptions are made to highlight some of the problems
involved and to obtain the basic framework for analysis in investment theory.
Although these assumptions are also seen in the literature, one may wish to
weaken or modify some or all of them.
One assumption that may bother the reader is that of constant prices. If
the economy is in a steady state or on a balanced growth path, then this assumption
may be acceptable. However, if the economy is not in such a state, prices are
not constant and may in fact be constantly changing. Hence a firm's investment
program made under the assumption that the present prices are expected to
continue forever is bound to be revised.
As is well known, this is a difficult problem in any theory which involves
time explicitly, and actual trading occurs from time to time. A well-known
assumption, alternative to the above static foresight assumption, is the one of
perfect foresight. There it is assumed that the economic agent (in our case the
firm) knows all future prices with perfect certainty. Under this assumption, the
economic agent never makes a mistake with regard to future prices, and hence
the agent does not have to revise his program because of past mistakes in fore-
casting nor does he have to learn from past mistakes. This perfect foresight
assumption is thus useful in building a theoretically consistent model. But it is
obviously an unrealistic assumption to make for a theory intended to explain the
behavior of the firm. The firm does make mistakes with regard to the forecasting
of future prices; hence it has to revise its program from time to time.
An obvious generalization here is to introduce an element of risk into fore-
casting and incorporate the firm's utility function with regard to risks into the
analysis. Commenting on Jorgenson, Karl Borch has made the following remark
([24], p. 273):
There are at least two generalizations which immediately suggest themselves
as desirable: to replace the constant discount rate by a utility function which
expresses a preference for the timing of payments and to introduce a prob-
ability distribution or a stochastic process to allow for the uncertainty
associated with future payments.
But then he immediately admits (p. 273):42
If it should be possible to construct a realistic theory of investment along
these or other lines, it will be a very complex model, and it may contain so
many parameters that it will be virtually impossible to test the theory against
observations.
Another assumption that one may object to is that the firm is a pure price
taker. Two interesting grounds for investigation are the behavior of an oligo-
polistic firm and that of a completely monopolistic firm. As is well known, the
theory of the oligopolistic firm entails difficult problems. However, the theory of
the complete monopoly, about which we can make a few remarks, is not difficult.
If the firm is a complete monopoly, it is no longer a price taker but rather
a price setter. The firm has a demand function $Q_t = D(p_t)$ for its output (where
$D' < 0$ for all $p_t$). Assuming that this can be rewritten in the inverse form
$p_t = p(Q_t)$, the firm's revenue at time t may be expressed as $p(Q_t)Q_t$, which in turn
can be expressed as a function of $L_t$ and $K_t$. Therefore the firm's profit at time t,
$W_t$, is written as43

(75-a) (with no adjustment costs) $W_t = G(L_t, K_t) - wL_t - qI_t$

(75-b) (with adjustment costs) $W_t = G(L_t, K_t) - wL_t - C(I_t)$
where
(76) $G(L_t, K_t) = p(Q_t)Q_t$ with $Q_t = Q(L_t, K_t)$

It can then be immediately seen that nothing basic is changed mathematically. The
function $pQ(L_t, K_t)$ is replaced by $G(L_t, K_t)$. If we replace the concavity assump-
tion of Q with that of G, our analysis in subsections b and c follows almost word
for word. Notice that the choice of the path $(L_t, K_t)$ implies the choice of the
output path $Q_t$, which in turn implies the choice of the price by $p_t = p(Q_t)$.
In the case of no adjustment costs, the firm will adjust to the long-run
optimal stock of capital and labor ($K^*$ and $L^*$) very quickly as long as $I_{\max}$ is
large enough or $I_{\min}$ is small enough. Both $K^*$ and $L^*$ are determined by the
famous rule that marginal revenue equals marginal cost or, equivalently, that the
marginal revenue product equals the factor price. In other words, $L^*$ and $K^*$
are determined by44
(77-a) $G_L(L^*, K^*) = w$
(77-b) $G_K(L^*, K^*) = (r + \delta)q$
where $G_L = \partial G/\partial L$ and $G_K = \partial G/\partial K$. Hence the optimal price $p^*$ is given by

(78) $p^* = p(Q^*)$, where $Q^* = Q(L^*, K^*)$

Note that the optimal price is constant over time.
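
The marginal revenue reading of (77-a) and (77-b) can be made explicit. The following is only a sketch, assuming that p and Q are differentiable; the symbols MR (marginal revenue) and $\eta$ (the price elasticity of demand) are introduced here for illustration and are not part of the notation of the text.

```latex
% Sketch: marginal revenue product form of (77-a) and (77-b).
% MR and \eta are illustrative symbols, not notation from the text.
\begin{align*}
G(L, K) &= p(Q)\,Q, \qquad Q = Q(L, K), \\
G_L &= \bigl[\,p(Q) + p'(Q)\,Q\,\bigr]\,Q_L = MR \cdot Q_L, \\
G_K &= \bigl[\,p(Q) + p'(Q)\,Q\,\bigr]\,Q_K = MR \cdot Q_K, \\
MR &\equiv p(Q) + p'(Q)\,Q = p\left(1 - \frac{1}{\eta}\right),
\qquad \eta \equiv -\frac{p}{Q}\,D'(p) > 0 .
\end{align*}
% Evaluated at (L^*, K^*), conditions (77-a) and (77-b) then read
\begin{align*}
MR \cdot Q_L(L^*, K^*) &= w, \\
MR \cdot Q_K(L^*, K^*) &= (r + \delta)\,q .
\end{align*}
```

If the firm were a price taker, so that $p'(Q) = 0$, then MR would equal p and these conditions would reduce to $pQ_L = w$ and $pQ_K = (r + \delta)q$.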
The adjustment costs case can be analyzed analogously and is left to the
interested reader. Here we only point out one difficult problem. The optimal
capital stock, in this case, is in general not constant but changes over time. Here
the optimal output price changes over time. For example, if it increases mono-
tonically while the optimal path of labor employment is constant, then the
optimal output increases over time, so that the optimal price decreases over time.
The consumer eventually knows about this and may postpone present consump-
tion of the good. In other words, the demand for output shifts over time. Or,
more appropriately, the demand function in the form of $Q_t = D(p_t)$ or $p_t = p(Q_t)$
is inappropriate for the analysis; the demand for the output at each t depends
on the time path of $p_t$. In other words, the firm's maximization problem must be
based on such a demand function rather than on the simple one $Q_t = D(p_t)$.

FOOTNOTES

1. This note is a revised version of my lecture given at Purdue University, May 1971,
which is recorded in A. Takayama, "The Neoclassical Theory of Investment and
Adjustment Costs," Krannert Institute Paper, no. 349, Purdue University, April 1972.
2. A similar attempt is made by Arrow [7], for example, in the model in which labor
is not explicitly introduced.
3. It is assumed that r is constant for all t. It can be interpreted as the current interest
rate, if the firm can borrow or lend any amount at this rate r.
4. This assumption on depreciation contrasts, in the present analysis, to the so-called
"depreciation by sudden death" assumption, which means that the capital goods last
for a finite period of time with a constant efficiency and then "die" suddenly at the
end of that period (that is, they are scrapped with zero value). The assumption of
depreciation by evaporation ("radioactive" decay) is a very common assumption
in the literature dealing with the continuous time model.
5. In the classical stationary state where everything is repeated over and over again,
all prices are constant. Hence everybody expects all prices to remain constant and
their expectations are always correct (perfect foresight). This makes the classical
theory simple and elegant. But in a dynamic economy, the problem of future expecta-
tions causes very difficult problems.
6. Rigorously speaking, the Hamiltonian should be defined as

$H \equiv v e^{-rt}[pQ(L_t, K_t) - wL_t - qI_t] + \mu_t[I_t - \delta K_t]$
where $v$ is a multiplier as well as $\mu_t$, and $v$ is constant for all t and nonnegative.
But it is possible that $v = 0$. Under certain plausible assumptions we can prove $v > 0$,
and thus we can set $v = 1$. Recall Theorem 8.A.6.
7. The optimal values of $K_t$, $L_t$, and $I_t$ are denoted respectively by $\hat{K}_t$, $\hat{L}_t$, and $\hat{I}_t$.
The circumflex in the partial derivatives $\partial\hat{H}/\partial\mu_t$ and $\partial\hat{H}/\partial K_t$ denotes that they
are evaluated at $(\hat{K}_t, \hat{L}_t, \hat{I}_t)$.
8. In other words, we obtain
$pQ(\hat{L}_t, \hat{K}_t) - w\hat{L}_t \geq pQ(L_t, \hat{K}_t) - wL_t$
for all $L_t > 0$, which means the maximization of $pQ(L_t, \hat{K}_t) - wL_t$ with respect to
$L_t$. This in turn implies (13). It is assumed that this maximum is achieved at a finite
value of $L_t$. Denote $Q(L_t, \hat{K}_t)$ by $Q(L_t)$ and assume $Q' > 0$. Assume also $Q'' < 0$ for
all $L_t$ and $Q(0) = 0$, $Q'(0) > w/p$, $Q'(\infty) = 0$. Then we can easily prove that a finite
$\hat{L}_t$ exists and that $\hat{L}_t$ is unique and strictly positive. Moreover, (13) gives a sufficient
(as well as necessary) condition for $\hat{L}_t$ to maximize $pQ(L_t, \hat{K}_t) - wL_t$.
9. From (8), we obtain $\dot{\mu}_t = \mu_t\delta - e^{-rt}pQ_K$. Then observing that $\dot{\lambda}_t = \dot{\mu}_t e^{rt} + r\lambda_t$, we
obtain (16).
10. This is a result of Mangasarian's theorem. See his [49], or our Theorem 8.C.5.
11. Assume that the function $Q(L_t, K_t)$ satisfies the following conditions for some
"relevant" neighborhood of the path $(L_t, K_t)$: $Q_{LL} < 0$, $Q_{KK} < 0$, $Q_{LL}Q_{KK} - Q_{LK}^2 > 0$
(which implies the strict concavity of Q in this neighborhood), and $Q_{LK} > 0$, where
$Q_{LL} = \partial^2 Q/\partial L^2$, $Q_{LK} = \partial^2 Q/\partial L\,\partial K$, $Q_{KK} = \partial^2 Q/\partial K^2$. The condition $Q_{LK} > 0$ is
referred to as the "normal" case by Trout Rader (see Section D, Chapter 4). We cannot
impose this condition for all $L_t > 0$ and $K_t > 0$, for then the strict concavity of
Q for the entire domain, $L_t \geq 0$ and $K_t \geq 0$, together with $Q(0, 0) = 0$, implies
diminishing returns to scale, which forces the scale of operation $Q_t$ to zero under
the competitive market (in the traditional sense of Viner [69]). In the literature,
it is sometimes observed that the assumption of the strict concavity of Q is imposed
together with $Q(0, 0) = 0$, hence causing a slight inconsistency. This inconsistency
was pointed out by Proctor [56]. The "relevant" neighborhood of $(L_t, K_t)$ is
slightly ambiguous phraseology, but the meaning should be clear from the usual
Knightian cost function which appears in standard textbooks of intermediate price
theory. A more precise specification of this is left to the interested reader.
12. We can obtain $\phi' = p[Q_{LL}Q_{KK} - Q_{LK}^2]/Q_{LL}$, which is negative under the as-
sumptions made in the previous footnote.
13. There are in essence three types of paths: (1) $K_t < 0$ within a finite t; (2) $K_t \to \infty$
with $\dot{\lambda}_t = (r + \delta)\lambda_t -$ (constant); (3) $K_t \to K^*$. Clearly the first type of path violates
the restriction $K_t \geq 0$, and the second type violates condition (17). Hence only the
third type of path is eligible. In the above discussion of type (1) path, it is implicitly
assumed that $I_{\min} < 0$. When $I_{\min} \geq 0$, the discussion should be modified.
14. Equation (23-a) may be rewritten as $\dot{K}_t = \delta(K_{\max} - K_t)$, where $K_{\max} \equiv I_{\max}/\delta$.
This may be considered as the response equation for the period in which $K_0 < K^*$. A
similar response equation can be obtained for the period in which $K_0 > K^*$ by replac-
ing $K_{\max}$ by $K_{\min} \equiv I_{\min}/\delta$.
15. We assume that $I_{\max} > \delta K^*$, that is, $I_{\max} > I^*$. Note that $T^* > 0$ for any large $I_{\max}$
(except for $+\infty$). In other words, the firm does not adjust instantaneously, although
$T^*$ can be very small, for $I_{\max}$ can be very large. In the limiting case of $I_{\max} = +\infty$,
we have $T^* = 0$.
16. If the capital good is rented for a finite period of time (say, T), then the inter-
temporal arbitrage relation is

$q = \int_0^T c\,e^{-(\delta + r)t}\,dt + q\,e^{-(\delta + r)T}$

This yields the same relation as before; that is, $c = (r + \delta)q$.

17. He seems to have failed to observe the adjustment mechanism required to reach the
"long-run" state (for our case K*), since he relied on a rather crude use of the
calculus of variations instead of the optimal control technique. In other words, he
considered only the case in which the desired stock of capital is equal to the "long-
run" desired level.
18. We can also observe that $\phi' = 0$. Note also that $Q_{LL} < 0$ if and only if $Q_{KK} < 0$ for the
constant returns to scale case with $Q_{LK} \neq 0$.
19. As a specific form of $C(I_t)$, we may consider $q(I_t)I_t$, where $q(I_t) > 0$, $q'(I_t) > 0$,
and $q''(I_t) > 0$ for all $I_t > 0$. It is easy to check that the conditions for the function
C are satisfied by these specifications for the function q.
20. As a simple form of (32), we may consider $Q_t = Q(L_t, K_t) - C(I_t)$ with $C' > 0$,
$C'' > 0$ and $C(0) = 0$. Such a specification is considered in Treadway ([62], p. 67,
his equation (9)).
21. As noted in footnote 6, we should have the multiplier $v$ attached to the term
$e^{-rt}[pQ(L_t, K_t) - wL_t - C(I_t)]$. We again omit the discussion concerning the as-
sumptions which lead to $v = 1$.
22. Clearly the values of $\mu_t$ and $\lambda_t$ will in general be different from the ones in the
previous section. However, for the sake of notational simplicity, we use the same
notations for the multipliers.
23. Again it is possible to replace (37) by "Arrow's conditions," $\lim_{t\to\infty}\mu_t \geq 0$ and
$\lim_{t\to\infty}\mu_t K_t = 0$ (or equivalent conditions in terms of $\lambda_t$), and this will yield the same
analysis and conclusions as the present one.
24. The problem is to find the assumptions to guarantee the existence of $I_t$ which maxi-
mizes $[-e^{-rt}C(I_t) + \mu_t I_t]$ or, equivalently, $[\lambda_t I_t - C(I_t)]$ for each t such that
$I_{\min} < I_t < I_{\max}$. Assuming that $C'(I_t) > 0$ and $C''(I_t) > 0$ for all $I_t > 0$, $C(0) = 0$,
and that $C'(0)$ is small enough, the existence of a unique $I_t > 0$ can be proved easily
if $\lambda_t > 0$. Here we also assume that $I_{\min} < 0$ and that $I_{\max}$ is sufficiently large. Note
that (36) implies $\lambda_t > 0$.
25. Recall our discussion in footnote 11.
26. The first type of path violates the conditions $K_t \geq 0$ and $\lambda_t > C'(0)$. Using (34), we
can show that the second type of path violates condition (37).
27. We denote again the optimal path for the present case by $(\hat{K}_t, \hat{L}_t, \hat{I}_t)$; $\hat{Q}_K$ denotes
$\partial Q/\partial K$ evaluated at $(\hat{L}_t, \hat{K}_t)$.
28. Gould and others assumed the quadratic approximation of the adjustment cost
function, that is, $C(I_t) = aI_t + bI_t^2$ where $a > 0$ and $b > 0$. See, for example, [18], p. 48.
29. If $K_{t-1} = K_{t-2} = \cdots$ and if they are all equal to $\bar{K}$, then the RHS of (61) gives
$(a_1 + a_2 + \cdots)\bar{K}$. On the other hand, in this case, the LHS of (61) should be equal to
$\bar{K}$. Therefore $a_1 + a_2 + \cdots = 1$.
30. See, for example, Allen [4], pp. 88-89.
31. Solow has proposed the Pascal probability distribution for the lag function. Jorgen-
son [26] generalized this function. For a survey of distributed lags, see, for
example, Griliches [17]. In connection with the above quotation from Jorgenson and
Siebert [29], J. A. Swanson made the following remark to me: "J & S choose the `best'
lag in an inappropriate manner; Viz., by choosing that lag structure which minimizes
sample residual variance. This procedure, developed by Theil, is appropriate only for
non-stochastic regressors, but J & S have It, as regressors!" By "J & S," Swanson
seems to have been referring to Jorgenson and Stephenson [32].
32. See, for example, Uzawa [64], [65], [66], [67], and [68].
33. A powerful outcome of the assumption $\pi'' > 0$ is that the uniqueness of the optimal
path of capital accumulation $K_t$ is obtained, even if the production function is
homogeneous of degree one. See Uzawa [66], p. 642. In addition to the above re-
strictions on the function $\pi$, Uzawa ([64], p. 4; [66], p. 641) also imposed the
following conditions: $\pi(0) = 0$ and $\pi'(0) = 1$. These conditions are assumed to hold
for mathematical convenience and do not impair the generality of his argument.
34. The crucial feature of managerial and administrative resources is that they are not
usually bought and sold in the market; hence the market prices of these resources
usually do not exist. This then creates the problem of how to measure them.
35. In other words, Uzawa claims that the crucial feature of the "fixity of capital" is in
the limitational character of administrative and managerial resources.
36. Notice that, in Problems I and II, there is really no problem of how to measure the
units of capital, output, labor, and cost of adjustment. In the profit function, $pQ_t$,
$wL_t$, $qI_t$, and $C(I_t)$ all enter in dollar terms. Notice also that as long as all prices are
constant, the maximization of $\int_0^\infty e^{-rt}[pQ_t - wL_t - qI_t]\,dt$ will give the same result
as that of the maximization of the present value of real profit
$\int_0^\infty e^{-rt}[Q_t - wL_t/p - qI_t/p]\,dt$.
37. This does not deny the importance of such a treatment. In fact, this is one of the
most important features of Uzawa's theory, reflecting clearly the influence of Joan
Robinson.
38. Uzawa maximizes the present value of real profit
$\int_0^\infty e^{-rt}[Q(L_t, K_t) - wL_t/p - \pi(Z_t/K_t)K_t]\,dt$, where his convention with regard to the
measurement of units (mathematically) amounts to setting $p = q$.
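39. Assuming the concavity of Q and an interior solution, the following conditions are
necessary and sufficient for an optimum: $\dot{\hat{K}}_t = \hat{Z}_t$;
$\dot{\lambda}_t = r\lambda_t - [p\hat{Q}_K - q(\hat{\pi} - \hat{\pi}'\hat{Z}_t/\hat{K}_t)]$; $\hat{Q}_L = w/p$; $\lambda_t = q\hat{\pi}'$;
$\lim_{t\to\infty} e^{-rt}\lambda_t = 0$. Here the optimal path is again
denoted by $(\hat{K}_t, \hat{L}_t, \hat{Z}_t)$. Also $\hat{\pi}'$ denotes $d\pi/dz_t$ evaluated at $\hat{z}_t$ where $z_t \equiv Z_t/K_t$.
Similarly, $\hat{\pi} \equiv \pi(\hat{z}_t)$, $\hat{Q}_K \equiv Q_K(\hat{L}_t, \hat{K}_t)$ and $\hat{Q}_L \equiv Q_L(\hat{L}_t, \hat{K}_t)$.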
40. The reader can easily analyze the case of nonconstant returns to scale. The analysis
will be analogous to the one in subsection c. The results are different from Uzawa's.
41. The range of the variation of z is assumed to be restricted to 0 < z < r and 0 < qn(z)
!5e. It is shown then that 0 < 2 < r.
42. There is a more difficult problem involved. The firm may not be able to ascertain the
probability distribution of its future prices or payments. Recall the famous Knightian
distinction between risk and uncertainty.
43. It is assumed that there is no monopsony in the labor market. In the case of no
adjustment costs, it is further assumed that there is no monopsony in the capital
market.
44. Compare (77) with equations (26-a) and (26-b).

REFERENCES

1. Alchian, A. A., "The Rate of Interest, Fisher's Rate of Return Over Costs and
Keynes' Internal Rate of Return," in The Management of Corporate Capital, ed. by
E. Solomon, Glencoe, Ill., Free Press, 1959.
2. Alchian, A. A., "The Basis of Some Recent Advances in the Theory of Management of the
Firm," Journal of Industrial Economics, XIV, November 1965.
3. Allen, R. G. D., Mathematical Economics, 2nd ed., London, Macmillan, 1965.
4. Allen, R. G. D., Macro-Economic Theory, London, Macmillan, 1969.
5. Arrow, K. J., "Optimal Capital Policy, the Cost of Capital and Myopic Decision
Rules," Annals of the Institute of Statistical Mathematics, 16, 1964 (Tokyo).
6. Arrow, K. J., "Optimal Capital Adjustment," in Studies in Applied Probability and Manage-
ment Science, ed. by K. J. Arrow, S. Karlin, and H. Scarf, Stanford, Calif., Stanford
University Press, 1962.
7. Arrow, K. J., "Optimal Capital Policy with Irreversible Investment," in Value, Capital
and Growth, Papers in Honour of Sir John Hicks, ed. by J. N. Wolfe, Edinburgh,
Edinburgh University Press, 1968.
8. Arrow, K. J., Beckman, M. J., and Karlin, S., "The Optimal Expansion of the
Capacity of a Firm," in Studies in the Mathematical Theory of Inventory and Produc-
tion, ed. by K. J. Arrow, S. Karlin, and H. Scarf, Stanford, Calif., Stanford University
Press, 1958.
9. Arrow, K. J., and Kurz, M., Public Investment, the Rate of Return and Optimal Fiscal
Policy, Baltimore, Md., Johns Hopkins Press, 1970.
10. Bailey, M. J., "Formal Criteria for Investment Decisions," Journal of Political
Economy, 67, October 1959.
11. Bailey, M. J., National Income and the Price Level, 2nd ed., New York, McGraw-Hill, 1971,
Chap. 8.
12. Chenery, H. B., "Overcapacity and the Acceleration Principle," Econometrica, 20,
January 1952.
13. Eisner, R., "A Distributed Lag Investment Function," Econometrica, 28, January
1960.
14. Eisner, R., and Strotz, R. H., "Determinants of Business Investment," in Impacts of
Monetary Policy, by D. B. Suits et al., Englewood Cliffs, N.J., Prentice-Hall, 1963.
15. Eisner, R., and Nadiri, M. I., "Investment Behavior and the Neo-Classical Theory,"
Review of Economics and Statistics, L, August 1968.
16. Fisher, I. N., The Theory of Interest, New York, Macmillan, 1930.
17. Griliches, Z., "Distributed Lags: A Survey," Econometrica, 35, January 1967.
18. Gould, J. P., "Adjustment Costs in the Theory of Investment of the Firm," Review
of Economic Studies, XXXV, January 1968.
19. Gould, J. P., "The Use of Endogenous Variables in Dynamic Models of Investment,"
Quarterly Journal of Economics, LXXXIII, November 1969.
20. Haavelmo, T., A Study in the Theory of Investment, Chicago, Ill., University of
Chicago Press, 1961.
21. Hestenes, M. R., Calculus of Variations and Optimal Control Theory, New York,
Wiley, 1966.
22. Hirshleifer, J., "On the Theory of Optimal Investment Decision," in The Management
of Corporate Capital, ed. by E. Solomon, Glencoe, Ill., Free Press, 1959.
23. Hirshleifer, J., Investment, Interest and Capital, Englewood Cliffs, N.J., Prentice-Hall, 1970.
24. Jorgenson, D. W., "Capital Theory and Investment Behavior," American Economic
Review, LIII, May 1963. (Also "Discussion" by C. F. Christ, E. Mansfield, and K.
Borch.)
25. Jorgenson, D. W., "Anticipations and Investment Behavior," in Brookings Quarterly Econo-
metric Model of the United States, ed. by E. Kuh, G. Fromm, and L. R. Klein,
Amsterdam, North-Holland, 1965.
26. Jorgenson, D. W., "Rational Distributed Lag Functions," Econometrica, 34, January 1966.
27. Jorgenson, D. W., "The Theory of Investment Behavior," in Determinants of Investment Be-
havior, ed. by R. Ferber, New York, NBER, 1967 (reprinted in Macroeconomic
Theory, Selected Readings, ed. by H. R. Williams and J. D. Huffnagle, New York,
Appleton-Century-Crofts, 1969).
28. Jorgenson, D. W., "The Demand for Capital Services," in Economic Models, Estimations and
Risk Programming: Essay in Honor of Gerhard Tintner, ed. by K. A. Fox, J. K.
Sengupta, and G. V. L. Narasimham, Berlin, Springer-Verlag, 1969.
29. Jorgenson, D. W., and Siebert, C. D., "A Comparison of Alternative Theories of
Corporate Investment Behavior," American Economic Review, LVIII, September
1968.
30. Jorgenson, D. W., and Siebert, C. D., "Optimal Capital Accumulation and Corporate Investment
Behavior," Journal of Political Economy, 76, November/December 1968.
31. Jorgenson, D. W., and Stephenson, J. A., "The Time Structure of Investment
Behavior in U.S. Manufacturing, 1947-60," Review of Economics and Statistics, XLIX,
February 1967.
32. Jorgenson, D. W., and Stephenson, J. A., "Investment Behavior in U.S. Manufacturing,
1947-60," Econometrica, 35, April 1967.
33. Jorgenson, D. W., and Stephenson, J. A., "Anticipations and Investment Behavior in U.S.
Manufacturing, 1947-60," Journal of American Statistical Association, 64, March 1969.
34. Jorgenson, D. W., and Stephenson, J. A., "Issues in the Development of the Neo-Classical
Theory of Investment Behavior," Review of Economics and Statistics, LI, August 1969.
35. Jorgenson, D. W., Hunter, J., and Nadiri, M. I., "A Comparison of Alternative
Economic Models of Quarterly Investment Behavior," Econometrica, 38, March
1970.
36. Jorgenson, D. W., Hunter, J., and Nadiri, M. I., "The Predictive Performance of Econometric
Models of Quarterly Investment Behavior," Econometrica, 38, March 1970.
37. Jorgenson, D. W., and Handel, S. S., "Investment Behavior in U.S. Regulated
Industries," Bell Journal of Economics and Management Science, 2, Spring 1971.
38. Jorgenson, D. W., McCall, J. J., and Radner, R., Optimal Replacement Policy,
Amsterdam, North-Holland, 1967.
39. Keynes, J. M., General Theory of Employment, Interest and Money, London, Mac-
millan, 1936.
40. Klein, L. R., "Studies in Investment Behavior," in Conferences on Business Cycles,
New York, NBER, 1951.
41. Koyck, L. M., Distributed Lags and Investment Analysis, Amsterdam, North-Holland,
1954.
42. Kuh, E., "Theory and Institutions in the Study of Investment Behavior," American
Economic Review, LIII, May 1963.
43. Lerner, A. P., The Economics of Control; Principles of Welfare Economics, New York,
Macmillan, 1944.
44. Lerner, A. P., "On Some Recent Developments in Capital Theory," American Economic
Review, LV, May 1965.
45. Lucas, R. E., "Optimal Investment Policy and the Flexible Accelerator," Inter-
national Economic Review, 8, February 1967.
46. Lucas, R. E., "Tests of a Capital Theoretic Model of Technological Change," Review of
Economic Studies, XXXIV, April 1967.
47. Lucas, R. E., "Adjustment Costs and the Theory of Supply," Journal of Political Economy,
75, August 1967.
48. Lutz, F., and Lutz, V., The Theory of Investment of the Firm, Princeton, N.J., Prince-
ton University Press, 1951.
49. Mangasarian, O. L., "Sufficient Conditions for the Optimal Control of Nonlinear
Systems," Journal of SIAM Control, 4, February 1966.
50. Masse, P. B. D., Optimal Investment Decisions, Englewood Cliffs, N.J., Prentice-Hall,
1962.
51. Nadiri, M. I., and Rosen, S., "Interrelated Factor Demand Functions," American
Economic Review, LIX, September 1969.
52. Nerlove, M., Distributed Lags and Demand Analysis, USDA, Handbook No. 141,
Washington, D. C., 1958.
53. Newlyn, W. T., Theory of Money, Oxford, Clarendon Press, 1962, esp. chap. 8.
54. Penrose, E. T., The Theory of the Growth of the Firm, Oxford, Blackwell, 1959.
55. Pontryagin, L. S., Boltyanskii, V. G., Gamkrelidze, R. V., and Mischchenko, E. F.,
The Mathematical Theory of Optimal Processes, tr. by K. N. Trirogoff, New York,
Interscience, 1962.
56. Proctor, M. S., "Two Equivalence Theorems for Fisherian and Neoclassical Invest-
ment Criteria," Purdue University, 1971.
57. Ramsey, J. B., "The Marginal Efficiency of Capital, the Internal Rate of Return,
and Net Present Value: An Analysis of Investment Criteria," Journal of Political
Economy, 78, September/October 1970.
58. Samuelson, P. A., "Some Aspects of the Pure Theory of Capital," Quarterly Journal
of Economics, LI, May 1937.
59. Takayama, A., "A Note on Marginal Efficiency of Capital and Marginal Producti-
vity of Capital," Purdue University, February 1971.
60. Thompson, R. G., and George, M. D., "Optimal Operations and Investment of the
Firm," Management Science, 15, September 1968.
61. Tobin, J., "Comment," in Determinants of Investment Behavior, ed. by R. Ferber,
New York, NBER, 1967.
62. Treadway, A. B., "What is Output? Problems of Concept and Measurement," in
Production and Productivity in the Service Industries, ed. by V. Fuchs, New York,
Columbia University Press, 1969.
63. Treadway, A. B., "On the Rational Multivariate Flexible Accelerator," Econometrica, 39,
September 1971.
64. Uzawa, H., "The Penrose Effect and Optimum Growth," Economic Studies Quarterly,
XIX, March 1968.
65. Uzawa, H., "A New Theory of the Investment Function," Nihon Keizai Shimbun, July
1969 (in Japanese).
66. Uzawa, H., "Time Preference and the Penrose Effect in a Two Class Model of Economic
Growth," Journal of Political Economy, 77, July/August 1969.
67. Uzawa, H., "Towards a Keynesian Model of Monetary Growth," IEA Conference on
the Theory of Economic Growth, 1970.
68. Uzawa, H., "Diffusion of Inflationary Process in a Dynamic Model of International
Trade," presented at the Far Eastern Meeting of the Econometric Society, June
1970.
69. Viner, J., "Cost Curves and Supply Curves," Zeitschrift für Nationalökonomie, 1931
[reprinted in Readings in Price Theory, ed. by G. J. Stigler, and K. E. Boulding, with
"Supplementary Note (1950)," Chicago, Ill., Irwin, 1952].
70. Witte, J., "The Microfoundations of the Social Investment Function," Journal of
Political Economy, 71, October 1963.
71. Wright, J. F., "Notes on the Marginal Efficiency of Capital," Oxford Economic
Papers, n.s., 15, June 1963.
Name Index

Abadie, J., 100, 101
Afriat, S. N., 166
Akerman, G., 46
Allais, M., 441
Allen, R. G. D., 412, 714
Amano, A., 301
Arrow, K. J., xx, xxiii, 60, 93, 94, 100, 103, 110, 111, 123, 150, 174, 186,
    198, 199, 201, 229, 232, 235, 260, 261, 264, 275, 276, 277, 288, 292,
    301, 317, 318, 319, 325, 327, 328, 330, 331, 337, 343, 345, 347, 353,
    400, 446, 464, 621, 624, 625, 626, 647, 658, 666, 669, 671, 683, 686,
    689, 690, 712, 714
Athans, M., 309
Atsumi, H., 464, 561, 594
Aumann, R. J., 206, 225, 226, 229, 231
Avio, K., 61, 666

Barone, E., 186
Bator, F. M., 232
Bausch, A. F., 186
Bear, D. V. T., 408
Bellman, R., 310
Benewitz, M., 407
Berge, C., 19, 26, 31, 32, 33, 38, 41, 44, 65, 67, 68, 69, 73, 235, 240, 250,
    252, 254, 255, 293
Bernouilli, James, 411
Bernouilli, John, 411
Bertram, J. E., 349, 352, 353, 357, 402
Billera, L., 211
Birkoff, G., 310
Black, J., 464
Bliss, G. A., 97, 101, 108
Block, H. D., xx, 301, 317, 318, 319, 325, 327, 328, 330, 347, 353
Bohm-Bawerk, E., 47
Bohnenblast, F., 68
Bolza, O., 657
Bolzano, B., 28
Borch, Karl, 711
Borel, E., 29
Bowley, A. L., 187, 193, 198, 204
Brahe, Tycho, xiii
Bram, J., 625
Brauer, A., 364, 388
Brock, W. A., 451, 456, 465, 480, 483, 485, 576, 587, 593, 594, 595, 626
Brouwer, L. E. J., 260, 277
Brown, A. L., 85
Brown, G. W., 337
Bruno, M., 644
Burmeister, E., 538, 539, 551, 556
Buttrick, J., 442

Camacho, A., 346
Canon, M., 61, 420
Caratheodory, C., 97, 101, 108, 166
Carlson, S., 136
Cass, D., 448, 450, 459, 462, 464, 480, 482, 575, 576, 622
Cassel, G., 258, 275, 358
Cauchy, A. L., 44, 101, 305
Cesari, L., 665
Chakravarty, S., 450, 451, 452, 453, 456, 464, 465
Chenery, H. B., 360
Chipman, J. S., 149, 181, 184, 364, 365, 397
Clark, P. G., 360
Cobb, C. W., 27, 220, 231, 338, 433, 442, 563
Coddington, E. A., 306, 310
Cohn, A., 319
Cournot, A. A., 159
Cullum, C., 61, 420

Dantzig, G. B., 55, 132
Dasgupta, A., 464
Davis, D. G., 301
Deardorf, A. V., 438
Debreu, G., 38, 126, 174, 180, 181, 184, 186, 201, 202, 205, 206, 211, 213,
    215, 224, 225, 226, 229, 235, 248, 254, 255, 261, 263, 264, 265, 275,
    276, 277, 292, 364, 374, 379, 578
Desrousseaux, J., 441
Deusenberry, J., 442
Diamond, R. A., 447
Dobell, A. R., 538, 539, 551, 556
Domar, E. D., 436, 459, 464, 503, 537, 542, 543
Dorfman, R., 644
DOSSO (Dorfman, R., Samuelson, P. A., Solow, R. M.), 145, 150, 258, 259,
    261, 276, 366, 487, 509, 516, 527, 537, 559, 567
Douglas, A., 61
Douglas, P. H., 27, 220, 231, 338, 433, 442, 563
Drabicki, J. Z., 464
Drandakis, E. M., 559, 573

Eckstein, O., 464
Edgeworth, F. T., 187, 193, 198, 204, 205, 206, 215, 229, 344
Eisenberg, E., 231, 338
Eisner, R., 687, 698
El-Hodiri, M. A., 61, 100, 113, 125, 291, 602, 625
Engel, E., 159, 500
Enthoven, A. C., 110, 111, 123, 150
Euler, L., 58, 61, 108, 136, 413, 414, 426, 428, 659
Evans, G. C., xxi, 100, 101, 412

Falb, P. L., 309
Fan, K., 67, 92
Farkas, J., 42
Fenchel, W., 33, 65, 109, 248
Fermat, P. de, 412
Filipov, A. F., 666
Fine, H. B., 167
Fisher, F. M., 388, 389, 390
Fleming, W. H., 65, 85, 407
Fomin, S. V., 32, 420, 421
Frechet, M., 77
Frisch, R., 452
Frobenius, G., 364, 372, 375
Fromovitz, S., 100, 101
Fukuchi, T., 360
Fukuoka, M., 537
Furuya, H., 573

Gale, D., xxi, 44, 101, 174, 261, 262, 263, 281, 335, 338, 446, 464, 479,
    485, 486, 487, 494, 497, 498, 527, 561, 575, 576, 581, 584, 587, 588,
    590, 594, 595
Galilei, Galileo, xv, 411
Gambill, R. A., 666
Gantmacher, F. R., 119, 310, 364, 374, 378, 379, 385
George, M. D., 696
Georgescue-Roegen, N., 181, 390, 486, 487, 532
Ghouila-Houri, A., 26, 33, 38, 41, 67, 68
Gillies, D. B., 205
Glicksberg, I., 67, 69
Glycopantis, D., 379, 492
Goldman, A. J., 44, 132
Goodwin, R. M., 364, 365, 366, 450, 451, 452, 453, 456, 464, 465, 505
Gould, F. J., 100, 101
Gould, J. P., 687, 688, 698, 705, 707, 714
Graham, F., 276
Green, J., 231
Griliches, Z., 714

Haavelmo, T., 685, 686, 687
Hadamard, J. S., 381
Hadley, G., 625
Haga, H., 499
Hahn, F. H., 229, 260, 317, 344, 345, 401, 442, 532, 533, 536, 559
Halkin, H., 625, 651
Halmos, P. R., 11, 14, 32
Hamilton, W. R., 412, 603, 659
Harrod, R. F., 436, 459, 464, 503, 537, 542, 543
Hausdorff, F., 30
Hawkins, D., 360, 383, 390, 583
Heine, H. E., 29 Keynes, J. M., xv, xvi, xxii, xxiii, 397,

Helly, E., 67 466, 532, 555, 556, 687, 694, 695
Herstein, I. N., 364, 374, 379 Kirman, A., 231
Hestenes, M. R., xxii, 27, 44, 61, 100, Knight, F. H., 713, 715
119, 125, 420, 601, 606, 625, 651, Kolmogorov, A. N., 32, 421
656, 657, 658, 665 Koopmans, T. C., 48, 186, 189, 199, 200,
Hicks, J. R., 46, 126, 128, 133, 134, 135, 201, 202, 261, 277, 341, 446, 447,
136, 138, 139, 151, 158, 165, 174, 448, 450, 459, 460, 462, 464, 466,
186, 234, 235, 246, 247, 265, 275, 480, 482, 484, 485, 559, 565, 575, 576,
281, 298, 299, 300, 314, 315, 316, 618, 620, 621, 624, 651
317, 321, 338, 405, 505, 555 Kopp, R. E., 606, 626
Hildenbrand, W., 231 Kose, T., 337
Hobson, E. W., 86 Kotelyanskii, D. M., 385
Hoffman, A. J., 67, 69 Krein, M. G., 364
Horvat, B., 464 Kuga, K., 324
Hotaka, R., 325, 327, 330 Kuhn, H. W., 44, 56, 60, 61, 68, 87, 89,
Houthakker, H. S., 234, 235 90, 91, 93, 100, 115, 258, 259, 260,
Howe, C. W., 498 261, 276, 292, 527, 582, 664
Hurwicz, L., xx, xxiii, 60, 61, 71, 72, 73, Kurz, M., 464, 625, 626, 671
93, 94, 100, 102, 103, 107, 113, 127,
170, 174, 235, 245, 301, 317, 318,
319, 321, 325, 327, 328, 330, 331, Lagrange, J. L., 58, 61, 70, 87, 97, 99,
337, 343, 346, 347, 353, 401, 420, 658 108, 136, 659
Hurwitz, A., 310, 353 Lancaster, K., 73, 128, 183
Landau, E., 77
Inada, K., 276, 284, 438, 440, 559, 572, Lange, O., 186, 201, 316
573 LaSalle, J., 352
Intriligator, M. S., 628, 630, 632, 635, Lebesgue, H., 225
636 Lee, E. B., 666
Lefschetz, S., 352
Jacobi, C. G. J., 428 Legendre, A. M., 428
Jaffe, W., 274 Leibenstein, H., 442
Johansen, L., 464 Leitmann, G., 606, 625
John, Fritz, 93, 100, 106, 107, 612 Leontief, W. W., 319, 359, 360, 364, 380,
Jones, R. W., 301, 302 397, 398, 399, 464, 503, 507, 508,
Jorgenson, D. W., 520, 521, 537, 545, 517, 519, 527, 541, 573
685, 686, 687, 689, 691, 694, 695, Lerner, A. P., 186, 684, 685, 686, 687
704, 705, 706, 714, 715 Lester, R. A., 408
Levhari, D., 556
Kakutani, S., 259 Levinson, N., 306, 310
Kaldor, N., 357 Lewis, J. P., 319
Kalecki, M., 357 L'Hospital, G. F. A. M. de, 411
Kalman, R. E., 349, 352, 353, 357, 402 Liapunov, A. M., xx, 347, 348, 351,
Kantorovich, L., 55 352, 354, 356
Karlin, S., xxiii, 65, 69, 115, 246, 255, Liénard, A., 357
292, 364, 486, 487, 490, 495 Lipschitz, R., 306
Karush, W., 61, 73, 100 Little, I. M. D., 201
Katzner, D. W., 248 Loomis, L. H., 486
Kelley, J. L., 11, 23, 24, 26, 27, 29, 31, Lucas, R. E., 687, 698, 705, 707
33, 34 Luenberger, D. G., 166
Kemeny, J. G., 487, 489, 497, 498, 499,
501 Machlup, F., 408
Kemp, M. C., 625 Malinvaud, E., 47, 565, 576
Kepler, Johannes, xv Maneschi, A., 451

Mangasarian, O. L., xxii, 61, 99, 100, 277, 281, 284, 292, 318, 324, 337,
101, 603, 644, 645, 660, 661, 713 364, 365, 371, 374, 377, 378, 381,
Markus, L., 666 383, 385, 390, 408, 498, 511, 512,
Marschak, T., 346 513, 515, 536, 559, 572, 573
Marshall, A., 174, 266, 280, 297, 298,
300, 301, 321, 330, 697 Occam, W., 350
Marton, B., 127 Ohyama, M., 338
Marx, K., 499, 538 Okamoto, T., 438, 440
Massera, J. L., 357 Olmstead, J. M. H., 167
Matthews, R. C. O., 442, 532, 533, 536, Otani, Y., 165, 166
559 Otsuki, M., 499
May, K. O., 183
McKenzie, L. W., 174, 235, 242, 248, Page, A., 85
263, 264, 265, 276, 277, 292, 347, Pareto, V., 187, 188, 190, 201, 208, 286
353, 364, 365, 381, 390, 402, 487, Pascal, B., 704, 714
493, 536, 559, 561, 565, 569, 573, Patinkin, D., 345, 346, 556
575, 577, 592 Peano, G., 305
McManus, M., 537 Peleg, B., 206, 229
McShane, E. J., 601 Penrose, E. T., 706
Meade, J. E., 464 Perron, O., 364, 365
Menger, Karl, 258, 275 Peterson, J. M., 408
Metzler, L. A., 316, 317, 364, 365, 366, Phelps, E. S., 441
397, 401 Planck, Max, xv
Mill, J. S., 142, 146, 147, 148, 330 Polak, E., 61, 420
Minkowski, H., 40, 41, 42 Ponstein, J., 127
Moore, E. H., 34 Pontryagin, L. S., xxii, 61, 420, 600,
Moore, J. C., 71, 72, 73, 74, 95, 186, 601, 603, 604, 611, 612, 613, 614,
201, 202, 254, 255, 277, 293 615, 616, 623, 626, 644, 651
Morgenstern, O., 38, 206, 487, 489, 497, Proctor, M. S., 713
498, 499, 501 Quirk, J., 202, 314, 319
Morishima, M., 55, 274, 301, 316, 337,
338, 344, 364, 366, 397, 398, 407, Rader, T., 180, 184, 232, 277, 291, 406,
408, 487, 489, 499, 500, 501, 505, 408, 713
506, 512, 513, 521, 527, 528, 529, Radner, R., 346, 487, 559, 561, 562,
530, 531, 532, 533, 537, 538, 539, 563, 564, 567, 568, 569, 570, 573, 587
545, 546, 559, 563, 569, 573 Rahman, M. A., 628, 629, 630, 631, 636
Mundell, R. A., 407 Ramsey, F. P., 412, 447, 448, 462, 463,
Murata, Y., 532, 538 466, 480, 575, 576, 577, 594
Muth, J. F., 512 Reiter, S., 346
Reymond, du Bois, 85
Nagatani, K., 556 Ricardo, D., 142, 144, 145, 186, 538
Negishi, T., 174, 285, 289, 291, 292, Richter, M. K., 235
293, 301, 317, 319, 325, 327, 344, Ritter, K., 92
345, 353, 401, 407 Robinson, J., 441, 443, 715
Nelson, R. R., 442 Roos, C. F., xxi, 412
Nerlove, M., 318 Rose, H., 556
Newman, P., 299, 301, 345 Rota, C. C., 310
Newton, Isaac, xv, 411 Routh, E. J., 310
Niehans, J., 442 Rudin, W., 23, 26, 32, 85
Niida, H., 537 Ruppert, R. W., 346
Nikaido, H., 23, 27, 28, 32, 33, 62, 131, Russak, B., 665
174, 229, 260, 261, 262, 263, 265, Russel, R. R., 346

Saaty, T. L., 625 Swan, T. W., 441, 442

Samuelson, P. A., xvi, 126, 128, 151, Swanson, J. A., 714, 715
165, 166, 167, 174, 186, 209, 234,
235, 281, 283, 300, 310, 315, 316, Takayama, A., 148, 167, 230, 291, 341,
317, 318, 319, 328, 330, 345, 400, 342, 345, 388, 389, 408, 430, 464,
407, 461, 464, 465, 471, 503, 512, 554, 628, 644, 683, 687, 694, 712
528, 532, 536, 537 Taylor, Brook, 312, 315, 399, 428
Saposnik, R., 202, 314, 319 Theil, H., 715
Sato, K., 442 Thompson, R. G., 487, 489, 497, 498,
Sato, R., 442 499, 501, 696
Scarf, H., xx, xxiii, 205, 206, 210, 211, Tinbergen, J., 450, 451, 452, 453, 456,
213, 215, 224, 225, 226, 227, 229, 464, 465
230, 231, 260, 318, 333, 335, 338 Tobin, J., 442, 533, 549, 554, 685, 687
Schlesinger, K., 258, 260 Tolle, J. W., 100, 101
Schur, J., 319 Tompkins, C. B., 276
Schwartz, H. A., 44 Treadway, A. B., 687, 705, 707, 714
Scitovsky, T., 48 Tsiang, S. C., 556

Sen, A. K., 464 Tsukui, J., 504, 513, 516, 535, 536, 537,
Shapley, L., 68, 206, 211, 212, 227, 229, 559, 573
232 Tucker, A. W., 44, 56, 60, 61, 68, 87, 89,
Shell, K., 549, 625, 626 90, 91, 93, 100, 115, 132, 292, 582,
Shephard, R. W., 167 664
Shilov, G. Y., 420 Turnovsky, S. J., 573
Shubik, M., 205, 206, 211, 212, 227, Tychonoff, A. N., 29
229, 232
Sidrauski, M., 549 Uekawa, Y., 284
Siebert, C. D., 704, 714 Uzawa, H., 60, 68, 69, 72, 73, 74, 93, 94,
Simmons, G. F., 24, 25, 27, 29, 31, 32, 100, 103, 161, 174, 235, 277, 292,
33 312, 318, 330, 337, 344, 345, 353,
Simon, H. A., 360, 383, 384, 390 354, 355, 357, 464, 537, 582, 645,
Slater, M., 69, 93 658, 687, 688, 705, 706, 707, 708,
Slutsky, E. E., 158, 247 709, 710, 715
Smith, Adam, 186
Smith, H. L., 34 Valentine, F. A., 61, 107, 601
Solow, R. M., 364, 378, 388, 396, 436, van der Pol, B., 35, 357
438, 442, 464, 503, 504, 505, 506, van der Waerden, B. L., 85
512, 520, 521, 522, 523, 524, 525, Vind, K., 225, 231
527, 528, 529, 530, 531, 536, 537, Viner, J., 162, 713
545, 546, 549, 551 von Neumann, J., 38, 68, 206, 276, 337,
Sonnenschein, H. F., 182, 183, 248, 276 486, 488, 489, 490, 493, 494, 495,
Srinivasan, T. N., 464 497, 499, 500, 559, 562
Starrett, D. A., 539 von Weizsacker, C. C., 441, 450, 464, 594
Stein, J. L., 556
Steiner, P. O., 672, 677, 678, 683 Wald, Abraham, 256, 258, 259, 260, 275,
Stephenson, J. A., 715 276, 280, 283, 330, 486, 522
Stiglitz, J. E., 549 Walras, L., xvii, 174, 256, 257, 258, 274,
Stoleru, L. G., 464 275, 277, 280, 297, 298, 299, 300,
Stone, R., 464 301, 314, 340, 342, 345, 359, 395,
Strotz, R. H., 687, 698 396, 499, 522, 529, 538
Suits, D. B., 512 Weierstrass, K., 28, 29, 78, 85, 428
Suppes, P., xxiii Weintraub, R. E., 407
Sutherland, W. R., 485 Wicksell, K., 46, 47, 186, 408

Widder, D. V., 167 Yaari, M. E., 61, 666

Wielandt, H., 372, 373 Yasui, T., 366
Wilansky, A., 11, 31, 32 Yokoyama, T., 235
Williamson, O. E., 447, 672, 677, Yoshizawa, T., 352, 356, 357
678
Wilson, E. B., 167 Zeuthen, F., 258, 275
Wong, Y. K., 162 Zorn, M., 11
Subject Index

Abelian group, 6 Arrow-Hurwicz-Uzawa (A-H-U) condi-

Accelerator coefficient, 542 tions, 94, [see Condition (A-H-U)]
Accumulation point, 21 Arrow-Hurwicz-Uzawa (A-H-U) theorem,
Activity, 46 93, 95, 96, 98, 102, 103, 105, 106,
level, 48, 489 111, 131, 132, 139, 147, 150, 648
Activity analysis, xix, 45-54, 140-142, further note, 102-108
486, 491 stated, 93-94
two fundamental theorems of, 51-53 Associative law, 5
Acyclic matrix, 376, 377, 378 Asymptotically, 356
Addition, 5 Autonomous system, 304, 306, 348, 349,
Adjustment costs, 687, 688, 689, 697, 350, 353, 610, 611, 615
698, 699, 711 conditions for global stability,
no, 688, 711, 715 350-351
quadratic approximation of, 702, 714 conditions for quasi-stability, 354
Adjustment process conditions for uniform global stability,
output, 297, 299, 300, 301 351
price, 297, 299, 300, 301 defined, 304
simultaneous, 300, 301 equilibrium point (defined), 306
Walrasian, 299 Auxiliary variables, 603, 621, 626, 629
Admissible control, 602 Average cost, 167
set of, 602 Average propensity to consume, 433
Admissible function, 413, 602 Average propensity to save, 433
Affine function, 15 (see Linear (affine) Axiom of extension, 1
function) Axiom of specification, 1
Allocation, 207
feasible, 207 Balanced growth path, 398, 399, 438,
Arithmetic mean theorem, 127 440, 446, 459, 472, 478, 484, 486,
Arrow-Block-Hurwicz theorem, 328 487, 491, 492, 493, 500, 503, 509,
Arrow-Enthoven theorem, 110, 133, 139, 515, 516, 521, 538, 543, 584
147 Banach space, 92, 101, 430
relevant variable, 110, 150 defined, 101
stated, 110-111 Bang-bang control, 605, 626, 631, 643,


Bang-bang control (continued) sufficiency theorem, 426, 429-430

686, 691, 699 uniqueness of solution, 430
Bang-coast (-off) control, 643 Capital, 46, 47, 432, 444, 470, 505, 535,
Barter process, 344 554
Basic activities, 48 durability of, 47
Bathtub problem, 600, 605, 609 expected rate of return on, 695
"Big-push" thesis, 442 fixity of, 697, 715
Bijection, 3 freely transferable, 507, 508, 538
Bilinear function, 118 full employment of, 433, 507, 526, 542
Binary relation, 177 (see Relation) marginal efficiency of, 687, 694
Blocking coalition, 206, 207, 208, 209, marginal physical product of, 433, 434
228 measurement of the unit of, 708, 715
block superior, 208 Capital:labor ratio, 448
defined, 208 constant, 459
domination by coalition, 208 Capital:output ratio, 450
Bolzano-Weierstrass property, 28 constant, 450, 451, 458, 465, 484
Bond, 533, 554 Cartesian product, 2
Bordered Hessian, 123, 125, 136 Cauchy-Peano theorem, 303, 305, 306,
defined, 123 337, 601, 603
Bordered Hessian condition, 135, 139, stated, 305
153 [see Condition (BHC)] Cauchy-Schwartz inequality, 44
Bordered principal minors, 126 Cauchy sequence, 101
Boundary, 24, 38, 272 Causal indeterminacy, 503, 504, 505,
defined, 24 506, 517, 527, 528, 537, 541,543
point, 24 defined, 503, 517
Boundary curve, 465 Chain rule, 79, 167, 424 (see Composite
Bounded from above, 3 function theorem)
Bounded from below, 4, 41 Characteristic equation, 367
Bounded set, 34 Characteristic function, 209
Bounded state variables, 665, 666 Characteristic vector, 367
Bounded steepness, 590 Characteristic vector, 367
Brachistochrone problem, 411, 416 Cheaper point assumption, 195, 215,
Brauer-Solow condition, 364 239, 241, 243, 264, 267, 286, 287,
Brock's lemma, 576, 595-597, 598 653
stated, 595 stated, 195, 286
Brock's theorem, 470, 480-484 Chosen point, 190, 191
stated, 483 Classical saving behavior, 439, 440
Brouwer's fixed point theorem, 259, 260, Classical theory of maximization, 82-85,
276, 277, 375, 487, 532 97-99, 123-124, 151-153
stated, 260 Lagrange theorem, 97
Brown-von Neumann differential equations, 337 Closed ball, 24
Closed function, 251, 252
Budget Closed half space, 36
constraint, 172, 262, 274, 275 Closed set, 21, 24, 25, 27, 32, 36, 39, 40,
function, 236, 241 41, 48, 265, 578, 602, 605, 626, 639
set, 171, 236, 266 defined, 21
Closure, 21
Calculus of variations, xxi, 61, 101, Coalition, 206, 207, 228, 230
410-431, 448, 449, 460, 469, 601, balanced set of, 230
606, 619, 625, 639, 557, 691, 714 defined, 207
elements of, 410-419 Cobb-Douglas function, 27
fundamental lemma of, 414, 428 production function, 433, 442, 443
regular problem of, 415 utility function, 220, 231, 338, 563

Cobweb model, 299 defined, 63

Column dominance, 381 fundamental theorem of, 66, 115, 132,
Commodities, 46, 53, 169 487, 495
characteristics of, 183 Hessian, 121
dating, 46, 47, 171, 174 Concave programming, 62-74, 111, 132,
Giffen, 166 135, 136, 202
inferior, 166 defined, 70
intermediate, 51, 458 Condition (A-H-U), 94, 96, 117 (see
primary, 51 Arrow-Hurwicz-Uzawa conditions)
Commutative law, 5 Condition (AHU), 103, 104, 105, 106
Compact set, 28-30, 41, 209, 236, 238, Condition (BHC), 153, 155 (see Bordered
239, 240, 248, 252, 259, 260, 263, Hessian condition)
265, 270, 276, 288, 292, 354, 367, Condition (BHC'), 153
373, 484, 492, 577, 578 Condition (Conc.), 96
defined, 28 Condition (FOC), 124, 125, 152, 153,
Compactness argument (proof), 379, 492, 155, 156, 160, 166 (see First-order
495 condition)
Comparative statics, 154-155, 403-406, Condition (H-S), 360, 383, 384, 390 (see
407, 695 Condition (H-S), 360, 383, 384, 390 (see
fundamental equation of, 155, 404 Condition (h-s), 385, 390
Hicksian law of, 405, 407 Condition (KTCQ), 89, 90, 92, 93, 96,
second order conditions, 155-156 102, 107, 117 (see Kuhn-Tucker con-
Compensated change in income, 166 straint qualification)
Compensated demand function, 244-247 defined, 89
defined, 244 Condition (LM), 90, 92, 93, 96, 98, 101,
Compensation principle, 230 103, 124, 125, 152, 153, 154, 155,
Competitive equilibrium, 170, 188, 190, 156, 160 (see local maximality condi-
192, 193, 197, 198, 202, 205, 206, tion)
213, 214, 216, 224, 226, 227, 262, Condition (LVM), 116, 117
268, 269, 273, 286, 287, 291, 313, Condition (M), 86, 88, 96 (see Maxi-
342, 497, 581, 582 mality condition)
defined, 190, 213, 262, 268-269 Condition (M'), 110
irreducibility, 277 Condition (QSP), 87, 88, 90, 92, 93, 95,
Schlesinger's formulation, 258 Condition (QSP), 87, 88, 90, 92, 93, 95,
uniqueness of, 259, 260, 280-284, (see Quasi-saddle-point condition)
323, 326, 328, 330, 356 Condition (QSP'), 88, 107, 138, 139,
Competitive markets, 169, 174 147, 150, 153 (see Nonnegative
exposition of the theory, 169-294 quasi-saddle point condition)
theory of, xvi, xviii, xix, 174-175 Condition (R), 152, 153, 154, 155, 160,
Competitive price mechanism, 226, 229 166 (see Rank condition)
Complements, 166 Condition (S), 69, 93, 96, 97 (see Slater's
Complete quasi-ordering, 177 condition)
Complete specialization, 146 Condition (SONC), 152, 153, 155, 160
Composite function theorem, 79-80 (see Chain rule)
(see Second-order necessary condition)
Concave function, 60, 63, 64, 65, 66, Condition (SOSC), 152, 153, 155, 166
68, 71, 72, 73, 80, 82, 83, 87, 88, 93, (see Second-order sufficient condition)
94, 96, 97, 105, 109, 112, 113, 117,
121, 122, 133, 137, 138, 146, 218, Condition (SP), 69, 86, 87, 96, 141, 142
246, 248, 286, 291, 292, 429, 564, (see Saddle-point condition)
566, 578, 605, 648, 649, 662, 664, Condition (VM), 116, 117
665, 670, 676, 691, 698 Condition (VQSP), 116, 117
continuity of, 65, 173, 291 Cone, 17, 18, 41

Cone (continued) 146, 161, 350, 351, 352, 407, 413,

closed, 561 419, 426, 431, 512, 536, 566, 605,
(vertex at the origin), 17 647, 651, 657
Connected set, 180, 305 defined, 78
Constant returns to scale, 48, 264, 265, Continuum of traders, 225
433, 435, 448, 490, 687, 688, 695, Contract curve, 188, 205, 231
696, 699, 702, 705 Control (function), 601, 602, 626, 639,
non-, 699, 715 647, 660
Constraint qualification, 97, 485, 648, Control parameter, 651, 654, 656, 661,
650, 651, 655, 665, 683 666, 674, 679
Constraint set, 57, 111, 267 defined, 654
Constraints Control region, 602, 639
active, 58 Control variables, 601, 602, 617, 629,
distinct, 101 638, 656, 668, 674, 679, 708
effective, 58, 85, 94, 103, 658 defined, 601
equality, 58, 65, 71, 94, 97, 101, 290 Convergence, 22, 34
inactive, 58 Convergent sequence, 22
inconsistent, 56, 58 Convex combination, 16
inequality, 71, 100, 658, 668 Convex cone, 18, 19, 50, 72, 73, 266,
integral, 651, 666 490
isoperimetric, 666 defined, 18
mixed, 98 generated by S, 18
nonnegativity, 58, 612, 649, 666 spanned by S, 18
Consumers, 169, 170, 175 (vertex at the origin), 18
of the same type, 214 Convex function, 60, 63, 67, 83, 84, 93,
Consumer's lifetime allocation process, 105, 112, 122, 648
59, 652-654, 666 defined, 63
Consumers' surplus, 673 Hessian, 122
Consumption set, 171, 174-176, 183, Convex hull, 33, 230
188, 189, 198, 207, 236, 265, 277, Convex polyhedral cone, 18, 27, 48, 50,
286 51, 52, 53, 489, 563, 578
aggregate, 189, 286 defined, 18
compactness of, 236, 248, 265, 288, Convex polyhedron, 33
292, 577-578 Convex polytope, 33
defined, 175 Convex set, 16-19, 36, 39, 50, 61, 64,
Consumption theory 65, 67, 72, 83, 84, 87, 88, 93, 94, 96,
as an application of nonlinear program- 99, 103, 111, 112, 113, 117, 122,
ming, 133-136 175, 195, 209, 214, 216, 239, 243,
Consumption turnpike theorem, 461 259, 260, 262, 263, 264, 265, 270,
Continuous function, 25, 26, 27, 73, 78, 272, 286, 291, 292, 487, 495, 562,
252, 254, 255, 259, 260, 271, 286, 577, 578, 586
304, 305, 340, 352, 354, 414, 410, defined, 16
421, 492, 563, 578, 602, 612, 613, Convexity
614, 616, 647, 650, 652, 655, 656, of consumption set, 175, 176, 181,
659, 661, 665 188, 195, 209, 214, 236, 239, 265,
defined, 25, 240 577
examples of, 26 of preferences, 181-183, 194-195,
Continuous functional, 421 212, 214, 232, 242, 262, 264, 265,
Continuous at a point, 25, 239, 251, 421 276, 277, 286, 292, 335, 578, 587
Continuous but nowhere differentiable of production set, 50, 52, 195, 232,
function, 78, 85, 264, 266, 277, 286, 291, 292, 490,
Continuous switching, 337 562, 577
Continuously differentiable function, 78, Coordinate, 2

set, 3 defined, 76
system, 14 Differential, 76, 77, 79, 424
Core, 170, 174, 204, 205, 206, 208, 209, Differential equations, 302, 315, 347,
211, 213, 214, 215, 216, 217, 221, 465,548
223, 224, 225, 226, 227 continuation of solutions, 312, 336, 338
defined, 208 control function, 305
theory of, 204-234 existence of a solution, 305, 312
Corner solution, 134, 136, 626, 639 first order, 303
Correspondence, 3 forcing function, 305
Correspondence principle, 311, 345-346 global existence of a solution, 312
Cost minimization, 163 homogeneous, 305
Costate variables, 603 initial condition, 304
Costs of coalitions, 226, 228 initial value problem, 312
Countable set, 32 linear, 305
Countably infinite set, 32 local existence of a solution, 306, 312
Cournot aggregation property, 159 n-th order, 303
Cyclic matrix, 376, 377 ordinary, 311
solution of, 303, 304, 309
theory of, 302-313
uniqueness of a solution, 305, 312
Debreu's theorem, 180, 184 with constant coefficients, 305
Decentralized decision-making, 202, 342 Differentiation, 75-82, 422-423
Decomposable matrix, 370, 371, 375, Diminishing returns, 433, 435
377, 378, 379, 390 Diminishing returns to scale, 265, 266
completely, 370 Discontinuity of the first kind, 626
defined, 370 Discount factor, 446, 463, 594, 624
Degree of monopoly, 682, 684 negative, 463
Demand function, xix, 234, 235, 236, nonconstant, 620
237, 241, 271 Discounting the future, 447
continuity of, 235, 264 Discrete time, 468, 469, 470, 484
homogeneity of, 237, 271 Discrete topology, 20
single-valuedness of, 271 Disjoint set, 2
Demand theory, 234-249 Distance, 8 (see Metric)
Depreciation, 432, 442, 444, 470, 535, Euclidian, 8, 329
542, 547, 548, 688, 712 Distributed lag, 704
by evaporation, 689, 712 Distributive law, 5
by sudden death, 712 Divisibility, 48, 50, 179, 184, 188, 194,
Derivative, 75, 76 239, 490
Derived set, 21 Dominant diagonal (d.d.), xx, 359, 364,
Descriptive model, 497, 527 365, 380, 381, 382, 383, 389, 390,
Desired goods, 51 394, 407, 509
Diagrams defined, 381
use of, xxii, 189, 201 DOSSO, 150
Difference equations, 469, 504, 508, 510, Dual stability theorem, 520, 521, 537
512, 515, 519, 522, 601, 604, 606 Dynamic adjustment, 321, 400
Differentiable at a point, 75, 76, 77, 78, Dynamic Leontief model (system), xxi,
79, 80, 85, 422 399, 503-540, 541, 542, 554
two definitions of, 85 basic output equation of, 508
Differentiable function, 76, 78, 80, 84, basic price equation of, 519
87, 88, 90, 94, 95, 103, 116, 117, capital coefficient matrix, 508
133, 137, 244, 245, 246, 247, 281, closed, 508, 519, 543
355, 410, 413, 414, 419, 422, 424, current input coefficient, 508
427, 429, 430, 662 Morishima's model, 527-535

Dynamic Leontief model (continued) defined, 306

and nonlinearity, 504-505 isolated, 312, 348
optimization model (Solow), 522-527 Equilibrium price vector, 280, 314, 322,
output system of, 507-517 326, 327, 328, 346 (see also Competi-
price system of, 517-521 tive equilibrium)
price valuation side of, 506 Equilibrium values, 275, 403
Ricardo-Marx case, 538 Equivalence relation, 177, 183
simple, 398-399 Euclidian space, R^n, 5, 9-11, 21, 32, 34
some problems in, 541-558 (see Real space)
Dynamic programming, 628 structure of, 9-10
Dynamic substitution theorem, 528, 531, usual topology, 21, 34
538, 546 Euler-Lagrange-Hamiltonian equations,
659, 669
Economics, Euler's equation (condition), xxi,
future of, xvii 413-415, 426-429, 430, 448, 449,
method of, xvi 451, 460, 606, 619
Edgeworth-Bowley (box) diagram, 187, Euler's equation for homogeneous functions, 283, 322, 326
193, 198, 204
Edgeworth process, 344 tions, 283, 322, 326
Efficient manager, 45, 137 Excess capacity, 522, 526, 537, 672
Efficient point, 51, 52, 53, 113, 141, Exhaustion of the product problem, 688
185, 491 Existence of competitive equilibrium,
defined, 51 177, 255-279, 285, 288, 289, 522
Eigen equation, 367 historical background of, 255-265
Eigenvalue, 309, 310, 312, 316, 367, proof by Debreu, Gale, and Nikaido,
372, 375, 378, 380, 382, 386, 392, 261-263
393, 400, 407, 508, 510, 513, 515, proof by DOSSO, and Kuhn, 259-261
516, 520, 521 proof by McKenzie, 265-274
defined, 367 proof by Takayama and El-Hodiri,
simple, 368 288-291
Eigenvector, 367, 372, 375, 510, 520, Wald's assumptions, 259
521 Exogenous variables, 275, 403
defined, 367 Expectation, 318, 407, 689, 695, 710,
Elasticity of demand, 682 712
Eligibility, 446, 455, 465, 480, 587, 619, adaptive, 318, 555
692, 713 extrapolative, 318
Empty core, 211-213, 226 Expenditure lag input-output analysis,
convexity of preferences, 212-213 396-397
Empty set, 2 Explicitly quasi-concave function, 111,
Endogenous variables, 275 112, 127, 182, 291
Engel aggregation property, 159 defined, 111
Entrepreneurship, 265 and preference ordering, 182
Envelope, 161, 162, 165, 167 Explicitly quasi-convex function, 112,
of the short-run cost curves, 164 127
Envelope theorem, 160-165 Exponential function, 312
applications, 162-165 Exponential lag, 704
stated, 160 Externalities, 170, 174, 202, 232, 261,
e-core, 212-213, 232 262, 265, 286
Equal treatment, 229 Extremum, 82
theorem, 224, 230 global, 423, 424
stated, 214 local, 84, 423, 424
Equilibrium point (state) of the system of unique global, 423
differential equations, 306, 307, 310, unique local, 423
312, 349, 350, 351, 354, 357 Factor substitution, 528, 546

Feasibility, 187, 189, 207, 286, 444, 522, cooperative, 206, 207, 229
561 garbage, 227, 232
Feasible set, 57 noncooperative, 229, 261
Fermat's principle, 412 side-payment, 206, 229
Final settlement, 205, 229 theory of, xxiii
Finite horizon, 623, 624 General equilibrium, 255, 256
Finite set, 32 analysis, xvi
First axiom of countability, 34, 254, 255 General production set, 45, 49, 51, 52
First derivative, 120, 422 (see also Production set)
First differential, 120, 422, 425 Generalized Hamiltonian, 647
First-order condition, 84, 87, 96, 124, Golden age path, 440, 442, 463, 472
152, 153, 154, 155, 156, 160 [see defined, 440
Condition (FOC)] golden age program, 584
First-order partial derivative, 98 Golden rule path, 441, 442, 463, 466,
First variation, 422 624
Fixed coefficient assumption, 503, 506, defined, 441
508, 527, 528 golden rule program, 584
Frechet differential, 77, 79 Goldman-Tucker theorem, 71, 130, 132,
Free disposability, 50, 137, 174, 190, 142, 145
201, 264, 266, 269, 290, 489, 490, stated, 132
562, 563, 564, 568 Gradient vector, 78, 84
Fritz John theorem, 106-107 Greatest lower bound (glb), 4, 483, 485
Frobenius root, 372, 375, 376, 378, 386, (see Infimum)
387, 388, 392, 393, 395, 396, 398, Gross complements, 406
399, 487, 509, 513, 515, 516, 536, Gross substitutability, xvii, 282, 283, 317,
537 318, 321, 325, 326, 327, 328, 329,
defined, 372, 375 330, 335, 338, 345, 355, 401, 404,
Frobenius theorem, xx, 359, 366, 406
367-380, 386, 493, 497, 509, 510, example of, 331-332
512, 536 in the finite incremental form, 330
theorem I, stated and proved, 372-375 weak, 401
theorem II, stated, 375 Guided missile problem, 600, 625
Function(s), 3, 32
affine, 15 Hahn-Negishi process, 344
composite, 3 Hahn-Negishi theorem, 402, 407
constant, 3 Halkin's counterexample, 625, 651
domain of, 3 Hamiltonian, 603, 614, 617, 618, 619,
linear affine, 15 625, 626, 629, 632, 639, 690, 698,
multivalued, 3 713
single-valued, 3, 12, 32 Hamiltonian system, 603, 612, 613, 614,
space of, 419, 421 616, 629, 632, 639, 640, 655
value of, 3 Hamilton's principle, 412
Functional, 12, 421 Harrod-Domar model, 459, 464, 503,
Functional analysis, 420 537, 542, 543
Functions of class C1, 78 Hausdorff space, 30, 31, 34, 255
defined, 30
G-closed function, 251, 255 Hawkins-Simon condition, 360, 361, 362,
g-constraint, 646, 647, 650, 668 363, 365, 380, 384, 385 [see Condi-
defined, 646 tion (H-S)]
Gain function, 253 defined, 360, 383-384
Gale-Nikaido lemma, 263, 277 Hawkins-Simon theorem, 383-384
Gale-Nikaido theorem, 408 Heine-Borel theorem, 28, 29, 30
Game stated, 29
balanced, 211, 230 Helly's theorem, 67

Hessian, 117, 121, 122, 123, 125, 136, 458, 459, 460, 463, 474, 482, 484,
138, 153, 248, 405, 406, 408 578, 589, 594, 623, 650
defined, 121 justifications of, 446
Hestenes' theorem, 651-660, 661, 674, Infinite set, 32
679, 683 Infinitesimal, 77, 423, 427
stated, 658-660 Information (cost), 227-228, 232
Hicks' condition for stability, 314, 316, Injection, 3
317, 365, 401 Inner product, 7, 44
stated, 314 axiom of, 7
Hicks-Slutsky equation, xix, 135, 150, Euclidian, 7
156-160, 234, 235, 236, 242, 247 usual, 7
defined, 156 Inner product space, 7, 9
Hicksian matrix, 281, 282, 283, 315, Input-output analysis, 319, 360, 366, 394
316, 393, 401, 406, 408 Input-output matrix, 359, 360, 363, 364,
defined, 281, 315 370, 380, 394
Hicksian method of stability, 314 defined, 359
Hicksian week, 555 Insurance premium, 228
Homogeneity, 262, 284, 321, 326, 327, Integrability problem, 234, 235
328, 330, 355, 400, 402, 404 Interest rate, 517, 520, 527, 529, 531,
Homogeneous of degree one, 164, 218, 532, 538, 545, 549, 554, 555, 645,
246, 471, 547, 563 712
Homogeneous of degree zero, 237, 246, Interior, 24, 72, 85, 271, 562, 605, 619
247, 262, 271, 283, 321, 326, 400, defined, 24
500, 531 Interior point, 24, 75, 76, 77, 100, 150,
Hyperplane, 35, 37 239, 267, 292
bounding for a set, 36 defined, 24
separated by, 35, 36, 37 Interior point assumption, 215, 239, 243
strictly separated by, 36 Interior solution, 134, 138, 231, 619,
supporting, 36, 38 622, 626, 675, 680, 699, 715
Intertemporal arbitrage, 526, 527, 544,
Image, 3, 32 555, 694, 714
inverse, 3 Inverse demand function, 258, 261, 2776
Imperfect stability, 314 Irreversibility of investment, 639, 671,
Implicit function theorem, 101, 157, 689
165, 275, 404, 407 Irreversibility of production, 48, 50, 264,
stated, 165, 407 266, 277
Imprimitive matrix, 376, 377 Isolated point, 21
Income, 171 Isomorphic correspondence, 14
elasticity, 159 Isoperimetric problem, 666
Increasing returns to scale, 50, 277
non-, 50 Jacobian, 79, 165, 275, 281, 658, 662
Indecomposable matrix, 365, 370, 371, defined, 79
372, 375, 377, 378, 379, 387, 388, Journal of SIAM Control, 625, 626
389, 392, 393, 395, 399, 509, 510,
513, 515 K-concavity, 71, 73
defined, 370 K-convexity, 72, 73
Indifference curve, 66, 179 Kakutani's fixed point theorem, 259,
Indifference set of z, 179 260, 263, 273, 276, 290
Indifferent-to-z set, 179 stated, 259
Indiscrete topology, 20 Kalecki-Kaldor model of business cycles,
Infimum, 4, 485 (see Greatest lower 357
bound) Karlin's condition, 69
Infinite horizon, 446, 447, 455, 456, Keynes-Ramsey rule, 466

Keynesian multiplier, 397 Limit cycle, 356, 357

Kotelyanskii theorem, 385 Limit point, 21, 22, 23, 24, 28, 32, 33,
Kronecker's delta, 658 34, 307, 354
Kuhn-Tucker constraint qualification, 69, and limit, 22-23
89, 96, 100, 290 [see Condition defined, 21
(KTCQ)] Limit theorem, 215, 220, 225, 226
defined, 89 discussions on, 213-217
Kuhn-Tucker irregularity, 91 proved, 216-217
Kuhn-Tucker-Lagrange condition, 87 stated, 216
Kuhn-Tucker regular, 90, 93 Linear (affine) function, 15, 72, 94, 96,
Kuhn-Tucker's main theorem, 90-92, 97, 105, 275, 648
132, 420 defined, 15
stated, 90 Linear approximation stability, 311, 313,
316, 401
Labor, 142, 432, 444, 470, 505, 518, defined, 311
529, 579 Linear approximation system, 311
full employment of, 433, 459, 505, Linear dependence, 10-11
544, 547 Linear form, 12
marginal physical product of, 433 Linear function(s), 10-15, 367
Lagrange's theorem, 97 defined, 12
Lagrangian (function), 70, 98, 99, 124, examples of, 12
447, 601, 603 inverse of, 13
defined, 70 invertible, 13
for the optimal control problem, 647 multiplication of, 12
Lagrangian multiplier, 99, 162, 603, 621 nonsingular, 13
meaning of, 162 singular, 13
Land of Cockaigne, 48, 50, 259, 266, Linear functional, 12, 77, 121, 421, 422
490, 501, 561 examples of, 12
Landau's o-symbol, 77 Linear homogeneous function, 128, 218,
Law of supply and demand, 319 219, 231, 546
Least upper bound (lub), 4, 483, 485 (see and strict concavity, 218, 231
Supremum) Linear independence, 10-11, 101, 107,
Left-hand derivative, 75, 76, 285, 292 165
Legendre condition, 428 defined, 10
Leontief matrix, 380, 383, 389, 390 Linear manifold, 36
defined, 380 Linear objective function, 638
Leontief system (static) Linear programming (LP), 15, 44, 53, 55,
convergence problem, 362, 363 56, 61, 71, 93, 99, 129, 130-132,
existence problem, 360, 363, 394, 395 142, 145, 259, 276
nonsingularity problem, 360, 363, 374, dual problem, 130, 504, 524, 525, 527
395 duality theorem, 130, 131, 132, 258,
Lexicographic ordering, 181-182 260, 276, 499, 525, 527
Liapunov function, 351, 352 duality theorem, stated, 130
evaluating vector, 130
Liapunov's second method, 301, 319,
346, 347-380, 402 Linear space, 5, 6, 11, 72, 176, 419, 420
principal idea, 357
axiom of, 5, 421
basis of, 10
Lienard form, 357 complex, 7
Lim inf, 483, 485 examples of, 6
Lim sup, 483, 485 finite dimensional, 11, 85, 176, 419,
Limit, 22, 23, 28, 30, 34 424, 425, 430
and limit point, 22-23 Hamel basis of, 10
defined, 22 inequalities in, 72-73

Linear space (continued) relative, 82

infinite dimensional, 11, 85, 176, 419, strong global, 82
420, 424 strong local, 82
of linear functions, 12, 19 unique global, 82, 405, 423, 429
Linear subspace, 6 unique local, 82, 423
Linear sum of sets, 7 Maximum principle, 600, 603, 619, 621,
Lipschitz condition, 306, 312, 590 629, 640, 642, 644, 647 (see
Lipschitzian, 352 Pontryagin's maximum principle)
Liquidity trap, 533 Maximum theorem, xx, 235, 249,
Local maximality (LM) condition, 90, 253-255
124, 152 [see Condition (LM)] stated, 254
Local nonsatiation, 190, 191, 192, 202, Measurability, 183
238 of individual utility, 201
Logically equivalent, 4 Metric, 8
Long-run cost curve, 165 axiom of, 8
Long-run marginal cost, 163, 678 induced from norm, 9
Lower bound, 4 Metric space, 8, 27, 30, 34, 422
Lower contour set, 178 complete, 107, 430
Lower inverse, 250, 255 examples of, 8
Lower semicontinuous function, 236, Metzler matrix, 366
239-241, 248, 252-253, 255 Mill-Marshall diagram, 300, 330
defined, 240, 252 Mill's problem, 146-150
Lower semicontinuous at a point, 239, Mill's condition, 148
251 Minimum, 4
global, 82, 83, 423, 424, 430
Managerial and administrative resources, local, 82, 83, 423, 424, 425
706, 707, 708, 715 strong global, 82
Mangasarian theorem, 430, 645, 676, 681 strong local, 82
Map, 3 unique global, 423, 425
Marginal cost, 167, 683, 712 unique local, 423, 425
Marginal cost pricing, 678 Minimum distance problem, 410, 411,
Marginal productivity, 415
rule, 687, 691, 694, 697 Minimum expenditure function, 238,
theory, 408 242-247, 248
Marginal revenue product, 712 defined, 238
Marginal utility, 184 Minimum wage regulation (MWR), 405,
elasticity of, 464 406, 407-408
of income, 162 Minimum wealth assumption, 195, 267
Market failures, 227, 232 (see Cheaper point assumption)
Marshallian external economies and dis- Minkowski-Farkas lemma, 41-44, 68, 92,
economies, 266 99, 104, 131, 132
Marshallian "long-run," 697 stated, 42
Marshallian "short-run," 697 Minkowski's separation theorem, 40-41,
Matrix, 13, 15, 367 107, 217
defined, 13 stated, 40
Maximality (M) condition, 86 [See Condi- Modified golden rule path, 450, 459, 460,
tion (M)] 463, 640, 643
Maximum, 4 defined, 463
absolute, 82 Modified Liapunov function, 354, 355,
global, 58, 82, 83, 84, 85, 112, 127, 357
135, 138, 423 Money, 171, 229, 442, 517, 518, 519,
local, 58, 82, 83, 85, 103, 112, 123, 520, 533, 538, 546, 549, 554, 556
124, 127, 152, 166, 423, 425 nonneutrality of, 556

Monopoly, 228, 299, 412, 417, 672, 679, Solow's path, 438, 549
711 Neo-classical theory of investment,
Moore-Smith convergence theory, 34 685-715
Moore's theorem, 72, 73 case of no adjustment costs, 688-697
Multicountry income flows, 397-398 case with adjustment costs, 697-703
Multiplier, 603, 612, 647, 651, 655, 656, complete monopoly, 711-712
658, 659, 661, 665, 679, 680, 683, critiques of Jorgenson's theory, 687,
714 705, 714-715
Multisector model of economic growth, lag distribution, 704
486-541 long-run desired stock of capital, 686,
dynamic Leontief model, 503-541 687, 693, 700, 703, 714
von Neumann model, 486-502 response function, 703-706
Multisector optimal growth model with response mechanism, 704
consumption, 575-599 response parameter, 704
attainable program (finite horizon), 578 speed of adjustment, 704
attainable program (infinite horizon), Uzawa on the Penrose effect, 706-710
578, 583, 587, 588, 589, 590, 594, Neo-turnpike theorem, xxi, 572
595 (Net) substitution term, 166, 246
competitive program, 577, 581, 582 No-worse-than-z set, 178, 238
eligible attainable program, 577, 589, Nonautonomous system, 304, 306, 348,
590, 592, 595, 598 352, 611, 613, 614
feasible program, 583 conditions for strong uniform global
finite horizon problem, 580-583 stability, 352
golden age program, 584 defined, 304
golden rule program, 584 equilibrium point (defined), 306
initial resource vector, 578 Nonlinear programming, xix, xx, 44, 55,
optimal (attainable) program (finite 56, 59, 61, 103, 285, 419, 420, 469,
horizon), 581, 582 470, 475, 494, 563, 580, 601, 603,
optimal (attainable) program (infinite 612, 613, 621, 683
horizon), 594-598 exposition of, 55-168
defined, 594 feasible point, 60
optimal stationary program (O.S.P.), maximand function, 61
583-587 (see also Optimal stationary maximum point, 61
program) objective function, 61
O.S.P. and eligibility, 587-594 optimal program, 61
stationary program, 584 optimal solution, 61
solution, 60, 61
Necessary condition, 4 uniqueness of solution, 60, 84, 112,
Necessary and sufficient condition, 4 127
Negative definite matrix, 118-123, 128, Nonnegative matrix, 364, 368, 372, 375,
166, 316, 406, 407 378, 385, 387, 388, 392, 509, 510,
defined, 118 515, (see also p. 121)
Negative prices, 135, 136, 269 defined, 368
Negative semidefinite matrix, 118-123, Nonnegative quasi-saddle point (QSP')
124, 125, 156, 158, 247, 248 condition, 88, 100, 649 [see Condi-
defined, 118 tion (QSP')]
Neo-classical aggregate growth model, Nonnormalized system, 318, 319, 325,
432-444, 546-554 402
attainable path, 436, 440 defined, 318
classical path, 440 Nonsatiation, 136, 195, 215, 264, 267,
feasible path, 435, 436, 440 268, 287, 289, 485, 585, 623, 653,
fundamental equation of, 435 670, 671
with money, 546-554, 556 defined, 195

Nonsatiation (continued) Optimal (attainable) path, 448, 450, 458,

strong, 585 462, 465, 474, 479, 480, 481, 482,
weak, 585 483, 484, 485, 543, 623, 624, 639,
Nonsingular matrix, 13, 126, 154, 275, 643
360, 383, 387, 392, 393, 394, 396, characterization of, 476
398, 508, 510, 513, 519 existence of, 475, 476, 478
Nonsubstitution theorem, 537 (see Sub- nonnegativity of, 475
stitution theorem) uniqueness of, 466, 475, 476, 478
Non-tâtonnement process, xx, 318, 339, Optimal control, 607, 619, 626, 686, 714
341-345 existence of, 603, 665, 666
equilibrium state of, 343 uniqueness of, 662, 664
three kinds, 343-345 Optimal control problem, 668, 674, 679
Norm, 9, 76, 77, 85, 329, 421, 422, 423, of Bolza-Hestenes, 657
424, 430 final time fixed, 609, 611, 617
axiom of, 9, 421 final time open, 609, 610, 611, 613,
Euclidian, 9, 76, 77, 329 615, 654
examples of, 9 fixed end point, 609, 611, 613, 615
induced from inner product, 9 fixed time, 611, 615
maximum, 329 of Hestenes, 657
"Normal" factors of production, 406, time optimal problem, 609, 614, 615,
408, 713 654
Normal space, 30 two illustrations of, 667-684
Normality condition, 69, 92, 93, 100, variable end points, 609, 611, 616, 621
101, 612, 613, 616, 665, 666 variable right end point, 609, 615, 654
Normalized system, 318, 319, 330, 401, Optimal control theory, xxii, 61, 420,
403 600, 601, 625, 646, 651, 683
defined, 318 exposition of, 600-719
Normed linear space, 9, 49, 77, 120, 421, optimal pair, 604
422, 423, 424, 430 optimal trajectory, 626
defined, 9 optimal triplet, 604
Normed vector space, 9 (see Normed set of admissible elements, 657
linear space) solution pair, 604, 607, 612
Not-better-than-z set, 178 solution triplet, 604
Nowhere differentiable function, 86 sufficiency theorem, 660-664
Numeraire, 262, 280, 283, 284, 318, 319, Optimal growth of an aggregate economy,
400, 404, 405, 407 444-485, 617-625, 638-643,
667-671
Occam's razor, 201, 206, 242 as an application of optimal control
Offer curves, 300, 330 theory, 617-625, 638-643,
One to one, 3 667-671
Onto, 32 attainable path, 448, 474, 475, 479
Open ball, 19, 76, 82, 96, 125, 190, 238, attainable set, 474
243 case of constant capital: output ratio
defined, 19 450-459
Open base, 33, 34, 254 competitive path, 478-479
Open cover, 28 continuous time model, 444-468
Open cube, 33 convergence question, 447, 465
Open kernel, 24 (see Interior) discrete time model, 468-485
Open set, 20, 24, 25, 34, 84, 87, 88, 90, eligible Euler path, 449, 455, 461, 462,
93, 103, 116, 117, 120, 122, 252, 465
605, 639 feasible path, 448, 474
defined, 20 with inequality constraints, 667-671
Operator, 3 limit path, 455

with linear objective function, Perfect foresight, 520, 528, 531, 538,
638-643 544, 555, 689, 710, 711, 712
optimal (attainable) path [see Optimal and intertemporal arbitrage, 555
(attainable) path] myopic, 537, 554, 555
sensitivity analysis, 456-458, 480-484 Perfect stability, 314
solution path, 449, 455 Period, 398, 469, 487, 488, 561
Optimal stationary program (O.S.P.), of production, 47, 398, 470, 487, 488
576, 577, 583, 584, 586, 587, 588, Period analysis, 469, 470, 484
589, 590, 592, 593, 597 Permutation, 368
defined, 584 matrix, 368, 369, 376
price vector associated with, 585 Phase diagram, 323, 324, 448, 461, 622,
uniqueness of, 585, 586, 587, 597 623, 640, 641, 642, 671, 692
Optimum tariff argument, 150 orbit, 325
Ordering, 177 phase space, 325
Origin, 6 solution path, 325
Overtaking criterion, 450, 594 Phase diagram technique, 309, 321, 322,
Own rate of interest, 519, 554 325
applied to the stability of competitive
Parameterizability condition, 89 equilibrium, 321-325
Pareto optimum, xix, 113, 185, 186, 187, Piecewise continuous derivatives, 602,
188, 190, 192, 193, 195, 197, 198, 612, 613, 614, 616, 626, 648, 652,
201, 202, 204, 205, 206, 208, 219, 655, 659, 661, 665
220, 221, 229, 285, 286, 287, 288, Piecewise continuous function, 305, 602,
291, 342, 491, 497, 561, 580, 581, 605, 626, 639, 647, 648, 655, 659,
582 661
Arrow's anomalous case, 199 defined, 305, 602
and core, 208 Planning horizon, 445, 446, 458, 480,
defined, 190, 286 481, 482, 527, 672, 685
Koopmans' example, 200 Planning model, 497, 506, 527
Parity theorem (core), 230 Pointwise convergence, 430
stated, 214 Polar cone, 269
Partial derivative, 77-78, 305, 413, 426, negative, 53
512, 536 nonnegative, 23
defined, 77-78 normalized, 269
Partial equilibrium, 255, 256 Pontryagin's maximum principle,
Partial ordering, 177 600-627, 628, 649 (see Maximum
Partial preordering, 177 principle)
Partial quasi-ordering, 177
basic theorem, 602-603
proof of a simple case, 606-609
Pascal distributed lag function, 704 stated, 603
Pascal probability distribution, 714 various cases, 609-617
Passive investment, 433 Positive definite function, 352, 356
Path of pure accumulation, 465, 473, 482 defined, 356
Peak-load problems, 654, 667, 671-684 Positive definite matrix, 118-120, 122
firm peak case, 678 Positive matrix, 368
full capacity, 677, 678 Positive semidefinite matrix, 118-120,
nonpeak periods, 672 122
peak demand, 672 defined, 118
shifting peak case, 677 Possibility of inaction, 48, 289, 490
social welfare, 672-673 Preference ordering, 176-179, 180,
top-peak periods, 676 181-183, 184, 190, 194
Penrose curve, 707, 708 closed, 180
Penrose effect, 688, 706, 707, 708 connected, 180

Preference ordering (continued) positive definite, 118-120, 122

continuous, 195, 237, 262, 265, 266 positive semidefinite, 118-120, 122
convex, 181, 182, 184, 194, 212, 264, strongly positive definite, 425
276, 277 Quasi-concave function, 109, 111, 117,
individualistic, 170, 174, 178, 183, 202 123, 133, 135, 139, 146, 150, 182,
intransitive, 183 210
locally nonsaturating, 191 defined, 109
nonconvex, 211, 226, 232 and preference ordering, 182
representative, 179, 180, 182, 229, 235 which is not concave, 110
selfish, 170, 174, 178, 202 Quasi-concave programming, 109-112,
strictly convex, 181, 242, 265-266 135, 136, 147
transitive, 182, 183, 184, 248, 276, 291 defined, 111
weakly convex, 181 Quasi-convex function, 109, 110
Preference relation, 177, 178, 235, 264 Quasi-negative definite matrix, 316
Preferred-to-z set, 179 Quasi-ordering, 176
Price elasticity, 159 Quasi-saddle-point characterization,
Price stability, 551, 554 86-102, 285, 566
Primitive matrix, 376, 377, 378, 513, 515 Quasi-saddle-point (QSP) condition, 87,
defined, 376 96, 103, 124 [see Condition (QSP)]
Principal minor, 119, 385, 387 Quasi-stability, 307, 308, 312, 330, 354
defined, 119 conditions for, 354
Private ownership economy, 194 defined, 354
Producer, 169, 173 Quasi upper semicontinuous function,
Product topology, 27, 29, 34, 485 251, 255
Production frontier, 53, 136
Production function, 45, 136 R-related to, 177
(Production) process, 46, 171, 487, 488 Radner distance, 568
additivity of, 47 Radner lemma, 568-570
level, 489 stated, 568
proportionality of, 47 Radner turnpike theorem, 561, 567,
Production set, 46, 173, 189, 264, 267, 570-572
286, 292, 487, 577 (see Technology reference path, 570, 571
set and also General production set) stated, 570
aggregate, 173, 189, 202, 232, 264, Ramsey-Koopmans-Cass theorem,
265, 266, 277, 286, 291, 292 462-463, 480, 560, 561
Production theory stated, 462-463
as an illustration of nonlinear program- Ramsey sum, 463
. ming, 136-142 Rank (R) condition (constraint qualifica-
Productiveness, 48, 490, 562, 580 tion), 93, 94, 96, 124, 125, 127, 139,
Profit, 49, 173, 189 150, 290, 476, 477, 485, 648, 656,
condition, 257 665, 675, 680, 683 [see Condition
maximization, 53, 142, 190, 269, 479, (R)]
581 Rank of a rectangular matrix, 107, 165
Projection, 27 Rank theorem, 101, 107
Proportional saving behavior, 433, 436 Rational consumer, 187
Public goods, 232 Real space, Rn, 5, 21 (see Euclidian
space)
Quadratic form, 117-120, 128 Recontracting, 205, 229, 345
defined, 117-118 Region, 281
real, 119 rectangular, 76
Quadratic functional, 118, 121, 423 Regional allocation of investment,
negative definite, 118-120 627-638
negative semidefinite, 118-120 allocation parameter, 628

Intriligator's objective function, 632, 636 428 [see Condition (SONC)]

Rahman's objective function, 629, 636 Second-order partial derivative, 78
relevant region, 623 Second-order sufficient condition, 124,
switching time, 631, 635 125, 126, 128, 152, 153, 428, [see
Relation, 177 Condition (SOSC)]
reflexivity, 177 Second variation, 423
symmetry, 177 Semicontinuity
transitivity, 177 various concepts of, 249-253
Relative instability theorem, 516 Semipositive matrix, 368
Relative stability, 503, 504, 506, 510, Separation properties, 30
511, 512, 515, 517, 537 Separation theorem(s), xix, 35-45, 99,
defined, 510 115, 132, 202, 230, 487, 495, 565
Relative topology, 31-32 Sequence, 22, 33
Relevant variable, 110, 133, 150 of real numbers, 33
Remainder of the differential, 77, 79 values of, 33
Resource constraint, 50, 51 Sequential compactness, 28
Reswitching of technique, 533, 539 Set(s), 1
Revealed preference, 234 complement of, 1
weak axiom of, 234, 281, 283, 328, difference of, 2
329, 330 equality of, 1
Ricardo's theory of comparative advan- inclusion of, 1
tage, 142-146 intersection of, 1
Ricardo's condition, 144 union of, 2
Ricardo's point, 145, 148, 149 Shadow price, 163, 573, 621, 644, 696
Right-hand derivative, 75, 76, 285, 292 Shephard-Samuelson theorem, 167
Routh-Hurwitz condition, 310, 311, 316, Shift parameters, 152, 403
319, 353 Short-run cost curve, 165
defined, 310 Simple root, 368, 372
Routh-Hurwitz theorem, 310 Simplex method, 53, 56, 132
Row dominance, 381 Slater's condition, 69, 70, 71, 73, 87, 88,
Royalties, 228 93, 94, 105, 113, 115, 117, 137, 141,
144, 148, 149, 292, 405, 564, 566,
Saddle branches, 465 582, 596, 656 [see Condition (S)]
Saddle point, 62, 69, 70, 71, 72, 86, 95, defined, 69
96, 113, 116, 132, 291, 292 generalized, 94
defined, 62 Slater's example, 73, 93
nonsaddle-like, 63 Snob effect, 170
Saddle-point characterization, 62-74, Solow condition, 364
285, 566 Solow-Samuelson theorem, 512, 513,
defined, 70 515, 536
Saddle-point (SP) condition, 86 [see Con- Solow's theorem, 436, 437, 438, 439,
dition (SP)] 440, 551
Satiation, 264, 286 (see also Non- stated, 436
satiation) Speed of adjustment, 297, 315, 316, 318,
Sausage machine model, 536 319, 325
Scalar multiplication, 5 units of measurement, 319
Schur-Cohn condition, 319 Stability
Second differential, 120, 423, 425 (asymptotic) global, 307, 31 J,
Second-order conditions, 84, 117, 122, 311, 315, 321, 325, 347, 349, 350
123-127, 135, 136, 152-155 (asymptotic) local, 307, 315, 317, 349,
and comparative statics, 155-156 350, 356
Second-order necessary condition, 124, in the large, 356
128, 152, 153, 154, 155, 160, 425, Liapunov, 348, 349, 350, 351, 352, 356

Stability (continued) Subsequence, 33

Marshallian, 297, 298, 299, 300 Subset, 1
in the small, 356 Subsistence, 175, 183, 277, 291, 452,
strongly uniform (asymptotic) global, 455, 473, 484
352 Subspace, 31
uniformly (asymptotic) global, 307, Substitutes, 166
349, 350, 351, 356 Substitution matrix
uniformly (asymptotic) local, 307, 349, properties of, 247
350, 356 Substitution theorem, 505, 506, 528,
uniformly Liapunov, 350, 356 532, 533, 537
Walrasian, 297, 298, 299 Successive principal minors, 119, 122,
Stability of competitive equilibrium, 294, 155, 360, 365, 380, 383, 392, 393
295-358, 365, 392, 399-403 defined, 119
consistency of various assumptions, Sufficient condition, 4
335-336 Supremum, 4, 485 (see Least upper
historical background of, 313-320 bound)
nonnegative prices, 318, 336-338 Survival problem, 264, 267, 289
proof of global stability, 325-330, Symmetric matrix, 117, 121, 126, 128,
402-403 247
proof of local stability, 399-402
Scarf's counterexample, 333-335 T1-space, 30
use of phase diagram technique, T2-space, 30
321-325 T4-space, 30
Walrasian stability vs. Marshallian Tâtonnement process, xvii, xx, 227,
stability, 297-300 339-347
State variables, 602, 617, 660, 666, 668, behavioral background, 340-341
671, 674, 679, 708 importance of, 342-343, 346
defined, 602 intermediate purchases, 341-342, 345
Static expectation, 537-538 market manager, 340-341, 345
Steady state (path), 538, 551, 554, 556, simultaneous, 340
694, 701, 710, 712 stable, 340
Strictly concave function, 63, 81, 84, successive, 340, 345
122, 128, 218, 219, 222, 231, 429, ticket, 340, 346
460, 475, 580, 586, 592, 593, 662, Taylor expansion (theorem), 312, 215,
703, 713 399, 428
defined, 63 (Technological) external economies and
Hessian, 122, 128 diseconomies, 48, 170, 202, 227,
linear homogeneity, 128, 218, 231 232, 261, 266
Strictly convex cone, 562 Technology set, 487, 490, 577 (see Pro-
Strictly convex function, 63, 82, 703 duction set)
defined, 63 von Neumann, 489, 490, 494
Hessian, 122, 128 Theory of conflicts and interactions, xvii
Strictly convex set, 585, 587, 592, 593 Theory of growth, xvi, xviii
defined, 585 Theory of social systems and organiza-
Strictly positive matrix, 368 tions, xvii
Strictly quasi-concave function, 109, 112, Topological space, 20, 25, 28, 29, 30, 31,
127, 128, 150, 182, 214, 230, 567, 34
664 axiom of, 20
and preference ordering, 182 Topology, 19-32, 33, 34
defined, 109 defined, 20
Strictly quasi-convex function, 109, 127 induced from metric, 21, 25, 34
Strong solvability condition, 383 usual, 21
Subcover, 28 Total ordering, 177

Total quasi-ordering, 177, 180, 235, 265 Utility function, 109, 150, 179-181,
defined, 177 184, 188, 207, 229, 234, 264, 338
Transformation, 3 aggregate, 146, 581
Transversality condition, 603, 610, 611, defined, 179
616, 621, 622, 623, 624, 625, 626, existence of, 180
629, 650, 652, 655, 656, 660, 662, indirect, 162
664, 675, 680 Utility index, 179
at infinity, 623-625, 626, 650-651 Utility possibility set, 209
stated in the most general form, 660
Triangular inequality, 8, 9 Value-added, 363, 395
True dynamic stability, 315, 316 van der Pol equation, 351
Truncated production cone, 50, 51 Vector(s), 6, 15
Tsukui's lemma, 513, 516, 537 linear combination of, 10
stated, 513 nonnegative linear combination of, 17,
Turnpike property, 560 111
Turnpike theorem, xxi, 464, 527, Vector local maximum, 113
559-575 Vector maximum, 112-113, 115, 116,
feasible path, 561, 564 209, 289, 291, 564, 567
free disposability and optimality, defined, 112-113
563-567 problem, 73, 112-117, 128, 141, 142,
hop-skip-jumping, 572 144
intertemporal efficiency condition, 567 Vector space, 6, 420 (see Linear space)
optimal path, 563 Vector subspace, 6 (see linear subspace)
strong, 572 von Neumann
value loss in, 568, 569 equilibrium, 497, 501, 562
weak, 572 facet, 592
Twice continuously differentiable func- growth factor, 562
tion, 79, 122, 123, 128, 152, 246 interest factor, 562
defined, 79 path, 379, 493, 560, 562, 573, 584
Twice differentiable function, 120, 121,
price, 562
414, 423, 424, 427, 428, 429
Twice differentiable at a point, 120, 422 process, 562
defined, 120 profit, 562
Tychonoff's theorem, 29, 484 quadruplet, 497
stated, 29 triplet, 562, 568, 570
value, 560
Unconstrained maximum, 75, 82-85, von Neumann (growth) model, xxi, 276,
123-124 486-502, 508, 560
Uncountable set, 32 dual problem, 494-495
Uniform convergence, 430 existence of maximum rate of expan-
Uniform norm, 430 sion, 492
Uniformly bounded, 350, 354 existence of price vector, 495
Upper bound, 4 independent subset, 497
interest factor, 495
Upper contour set, 66, 178, 267
irreducibility, 497-498
Upper inverse, 250, 255 maximum profit rate, 494
Upper semicontinuous function, rate of expansion, 491, 493, 496
239-242, 250-254, 259, 261, 262, regular, 497
263, 276, 293 von Neumann theorem, 495-496, 560
defined, 240, 251 von Neumann model with consumption
Upper semicontinuous at a point, 239, Marx-von Neumann model, 499
250 Morishima's treatment of, 499-501
Util, 206 Walras-von Neumann model, 499

von Neumann ray, 562, 563, 565, 567, Weierstrass theorem, 29, 53, 59, 288,
568, 569, 570, 572, 573, 576, 588 292, 373, 374, 470, 475, 484, 492,
uniqueness of, 562, 563, 569 495, 580
proved, 30
Walras-Cassel system, 258, 266, 275, 282 stated, 29
Walras' Law, 259, 262, 274, 318, 319, Welfare economics, 185, 187, 491
321, 326, 327, 328, 330, 335, 341, two classical propositions of, 185-204
400, 402 Wicksell's Law, 408
in the general sense, 276 Wong-Viner envelope theorem, 162
in the narrow sense, 276 World efficient frontier, 146
Walrasian "long run," 396 World production set, 144
Weak solvability condition, 383 Worse-than-z set, 179
Wealth, 171
Zorn's lemma, 11
