Subhash C. Ray
Robert G. Chambers
Subal C. Kumbhakar
Editors

Handbook of
Production
Economics
With 108 Figures and 20 Tables

Editors
Subhash C. Ray
Department of Economics
University of Connecticut
Storrs, CT, USA

Robert G. Chambers
Department of Agricultural and Resource Economics
University of Maryland
College Park, MD, USA

Subal C. Kumbhakar
Department of Economics
Binghamton University
Binghamton, NY, USA

ISBN 978-981-10-3454-1 ISBN 978-981-10-3455-8 (eBook)

ISBN 978-981-10-3456-5 (print and electronic bundle)
https://doi.org/10.1007/978-981-10-3455-8

© Springer Nature Singapore Pte Ltd. 2022

All rights are reserved by the Publisher, whether the whole or part of the material is concerned,
specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction
on microfilms or in any other physical way, and transmission or information storage and retrieval,
electronic adaptation, computer software, or by similar or dissimilar methodology now known or
hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Preface

In recent years, the neoclassical theory of production seems to have lost its appeal
among academics and graduate students in microeconomic theory courses. Students
in standard economics doctoral programs only receive the minimal exposure to
production and cost functions necessary for an exposition of the theory of markets
en route to the ultimate goal of game theory, experimental economic issues, and
strategic behavior. For example, only 40 of the 971 pages of the Microeconomic
Theory book by Mas-Colell, Whinston, and Green (1995) are devoted to production,
cost minimization, and profit maximization. While a student admittedly has learnt
the basic theory of producer behavior in their “Intermediate Micro Theory” courses,
more advanced concepts like Allen-Uzawa partial elasticities of substitution are
not covered either at undergraduate or at graduate level. An average student
never sees a transcendental logarithmic (Translog) or a Generalized Leontief cost
function in class. Yet, the latter half of the twentieth century was an era of
spectacular development in production theory within economics. The 1951 Cowles
Foundation anthology Activity Analysis of Production and Allocation edited by
Koopmans remains one of the richest collections of essays in economic theory.
Appearing at about the same time, the duality theory of Hotelling, Roy, Hicks,
Samuelson, and Shephard opened up novel ways of analyzing the production
technology through cost, revenue, and profit functions. These topics are rarely
covered in microeconomics courses, although they are treated in the two-volume
Production Economics: A Dual Approach to Theory and Applications edited
by Fuss and McFadden (1978). In the meantime, Nerlove used the dual cost function
to empirically estimate the parameters of a Cobb-Douglas production function using
data for electric utilities in the USA (1965). The emergence of generalized cost functions
(like the Translog, the Generalized Leontief, and the Generalized CES) liberated
the empirical analyst from the confines of the Cobb-Douglas, Leontief, or CES
specifications and enriched both economic theory and econometric analysis in equal
measure. These seem to be history now. By the last decade of the past century,
interest in production theory had clearly waned. The recent resurgence of work on
identification of the production function mostly focuses on the primal Cobb-Douglas
production function, completely bypassing the duality literature.
Papers included in this three-volume handbook focus on both theoretical con-
cepts and empirical issues from neoclassical production economics. Each of the
chapters is intended to provide a state-of-the-art survey on a specific topic in
production economics. The objective is to serve as a single unified source of
reference for the serious scholar seeking in-depth knowledge of the underlying
theory behind the sophisticated empirical analysis appearing in applied papers.
The chapters in volumes 1 and 2 of the handbook are devoted exclusively to
theory and different analytical methodologies for empirical estimation. By contrast,
every chapter in volume 3 offers an overview of empirical applications in the
accepted literature that employ the theoretical framework described in volume 1 to
analyze the technical and behavioral relations between relevant variables in various
industries ranging from banking or air transportation to education or professional
sports.
Putting together the 45 chapters of the handbook, contributed by more than
twice as many authors, each of whom gave valuable time to write the
chapters within busy schedules already full of numerous commitments, has,
naturally, been a long-drawn effort lasting several years. On top of that, the upheaval
brought about by the Covid-19 pandemic put the viability of the entire project
in jeopardy. Fortunately, however, through the collective effort and cooperation of
the contributing authors and the editorial staff at Springer Nature, we managed to
overcome all hurdles and complete the project.
We are grateful to the editorial staff at Springer Nature for their help and par-
ticularly thank Sagarika Ghosh, Nupoor Singh, Audrey Wong-Hillman, Mokshika
Gaur, and Salmanul Faris Nedum Palli for their valiant effort to keep the publication
on track as much as possible.
At the present moment, rapid and sweeping developments in information tech-
nology are changing the fundamental character of production in many industries,
prompting serious researchers to wonder if there will be any workers left in the
workplace in the near future. We hope that the handbook will help to revive interest
in production economics and inspire a new generation of scholars to revisit and
extend the theory. Only that will make editing this handbook worthwhile.

May 2022 Subhash C. Ray

Robert G. Chambers
Subal C. Kumbhakar
Contents

Volume 1

Part I Theory

1 Neoclassical Production Economics: An Introduction
Robert G. Chambers and Subhash C. Ray
2 Reminiscences of “Returns to Scale in Electricity Supply”
Marc Nerlove
3 Duality in Production
W. Erwin Diewert
4 Multiproduct Technologies
Rolf Färe, Daniel Primont, and W. L. Weber
5 Functional Structure and Aggregation
Daniel Primont
6 Elasticities of Substitution
R. Robert Russell
7 Distance Functions in Production Economics
Robert G. Chambers and Rolf Färe
8 Stochastic Frontier Analysis: Foundations and Advances I
Subal C. Kumbhakar, Christopher F. Parmeter, and Valentin Zelenyuk
9 Stochastic Frontier Analysis: Foundations and Advances II
Subal C. Kumbhakar, Christopher F. Parmeter, and Valentin Zelenyuk
10 Data Envelopment Analysis: A Nonparametric Method of Production Analysis
Subhash C. Ray
11 Activity Analysis in Production Economics
Thijs ten Raa
12 Bad Outputs
Sushama Murty and R. Robert Russell

Volume 2

13 Market Structures in Production Economics
Devin Garcia, Levent Kutlu, and Robin C. Sickles
14 Production Under Uncertainty
Robert G. Chambers
15 Dynamic Analysis of Production
Spiro E. Stefanou
16 Cost, Revenue, and Profit Function Estimates
Levent Kutlu, Shasha Liu, and Robin C. Sickles
17 Scale Elasticity and Returns to Scale
Victor V. Podinovski and Finn R. Førsund
18 Nonconvexity in Production and Cost Functions: An Exploratory and Selective Review
Walter Briec, Kristiaan Kerstens, and Ignace Van de Woestyne
19 Index Numbers and Productivity Measurement
D. S. Prasada Rao
20 Conceptualization and Measurement of Productivity Growth and Technical Change: A Nonparametric Approach
Subhash C. Ray
21 Modeling Technical Change: Theory and Practice
Subal C. Kumbhakar
22 Economics of Externalities: An Overview
Jean-Paul Chavas
23 Shadow Pricing in Production Economics
Rolf Färe, Shawna Grosskopf and Dimitris Margaritis
24 Capacity and Capacity Utilization in Production Economics
Dale Squires and Kathleen Segerson
25 Aggregation of Efficiency and Productivity: From Firm to Sector and Higher Levels
Valentin Zelenyuk

Volume 3

Part II Applications

26 Choice of Inputs and Outputs for Production Analysis
Subhash C. Ray
27 Airline Economics: A Survey of Applied Issues in the Performance of the US and International Airline Industry
Levent Kutlu, Daniel Prudencio, and Robin C. Sickles
28 Globalization, Innovation, and Productivity
Shunan Zhao and Man Jin
29 Empirical Analysis of Production Economics: Applications to Banking
Stephen M. Miller
30 Applications of Production Economics in Education
Jill Johnes
31 Dairy Farming from a Production Economics Perspective: An Overview of the Literature
Boris E. Bravo-Ureta, Alan Wall, and Florian Neubauer
32 Performance Evaluation of Mutual Funds Using Frontier Methods
Subrata Sarkar
33 Performance of Microfinance Institutions: A Review
Christopher F. Parmeter and Valentina Hartarska
34 The Economics of Production in Marine Fisheries
Dale Squires and John Walden
35 Production Economics in Spatial Analysis
Luis Orea and Inmaculada C. Álvarez
36 Technical Efficiency and Its Determinants in the Manufacturing Sector: What We Know and What We Should Know
Sumon Kumar Bhaumik
37 Application of Production Economics in the Electricity Distribution Sector
Ørjan Mydland and Gudbrand Lien
38 Production and the Environment
Moriah Bostian and Tommy Lundgren
39 Applications of Production Theory in Transportation
Phill Wheat, Kristofer Odolinski, and Andrew Smith
40 Productivity in Global Aquaculture
Frank Asche, Ruth Beatriz Mezzalira Pincinato, and Ragnar Tveteras
41 Benchmarking in the European Water Sector
Alan Horncastle, Joseph Duffy, Chien Xen Ng, and Peter Krupa
42 The Economics of Sports
Joshua Congdon-Hohman and Victor Matheson
43 The Effects of Management on Production: A Survey of Empirical Studies
Alecos Papadopoulos
44 Production Economics in the Telecommunications Industry
Arun Bhattacharyya
45 Cost Assessment of (Un)bundling: Separation of Vertically Integrated Public Utilities
Pablo Arocena and Subal C. Kumbhakar

Index
About the Editors

Subhash C. Ray is Professor of Economics at the
University of Connecticut, USA. His principal area
of research is nonparametric measurement of produc-
tivity and efficiency using Data Envelopment Analy-
sis (DEA). His reference textbook Data Envelopment
Analysis: Theory and Techniques for Economics and
Operations Research (Cambridge University Press)
was published in 2004. He is an associate editor of
the Journal of Productivity Analysis. He has served as
guest editor of special issues of the Journal of Produc-
tivity Analysis and Indian Economic Review. He was
a member of the editorial board of Indian Economic
Review. He has lectured and conducted workshops on
DEA in different countries including China, India,
Korea, England, Brazil, Peru, Germany, Malaysia, and
Turkey, among others. He received the W.W. Cooper
Lifetime Contribution Award from International DEA
Society in 2016.

Robert G. Chambers was born in Washington,
DC and raised in nearby Rockville, Maryland. He
received his undergraduate training at Georgetown
University, his MS degree from the University of Mary-
land, and his PhD from the University of California
(Berkeley). He joined the faculty at the University of
Maryland in 1979 and has been there ever since apart
from leave to serve as senior economist at the US Presi-
dent’s Council of Economic Advisers. He is a fellow of
the Agricultural and Applied Economics Association.
His areas of interest include production economics,
microeconomic theory, decision-making under uncer-
tainty, and agricultural economics. He is married and
has three sons, Christopher, Geoffrey, and Timothy. He
currently resides in Maryland, New York, and New
Mexico with his wife Michelle, his youngest son Tim,
and their Portuguese Water Dogs, Nelson and Skipper.

Professor Subal C. Kumbhakar (http://bingweb.binghamton.edu/~kkar/)
is a University Distinguished
Research Professor of Economics at the State Uni-
versity of New York at Binghamton. His main area
of research is applied microeconomics with a focus
on estimation of efficiency in production using cross-
sectional and panel data.
Professor Kumbhakar is a fellow of the Journal
of Econometrics (1998) and a distinguished author
of Journal of Applied Econometrics (2017). He holds
an honorary doctorate degree (Doctor Honoris Causa)
from Gothenburg University, Sweden (1997).
Professor Kumbhakar is currently a co-editor of
Empirical Economics, associate editor of Empirical
Economics since 2001, and former associate editor
of the American Journal of Agricultural Economics
(1997–1999). He has served on the boards of editors of
the Journal of Productivity Analysis since 1998;
Technological Forecasting and Social Change: An
International Journal since 1991; International Journal of
Business and Economics since 2002; Macroeconomics and
Finance in Emerging Market Economies since 2007;
Applied Econometrics, http://appliedeconometrics.cemi.rssi.ru/AppEc_en.html,
since 2016; and Ecos de Economía: A Latin American Journal of Applied
Economics, http://publicaciones.eafit.edu.co/index.php/ecos-economia/index,
since 2016. He has been a board member
of the Journal of Regulatory Economics since 2015.
Professor Kumbhakar is the co-author (with Knox
Lovell) of Stochastic Frontier Analysis (2000) and (with
Hung-Jen Wang and A. Horncastle) of A Practitioner’s
Guide to Stochastic Frontier Analysis Using Stata (2015),
both published by Cambridge University Press.
Google Scholar Citations: https://scholar.google.com/citations?user=-rB5HVsAAAAJ&hl=en
Wikipedia: https://en.wikipedia.org/wiki/Subal_Kumbhakar
Personal webpage: https://sites.google.com/binghamton.edu/subalckumbhakar/homes
Contributors

Inmaculada C. Álvarez Oviedo Efficiency Group, Department of Economics,
Universidad Autónoma de Madrid, Madrid, Spain
Pablo Arocena Universidad Pública de Navarra (UPNA), Institute for Advanced
Research in Business and Economics (INARBE), Pamplona, Navarra, Spain
Frank Asche School of Forest, Fisheries and Geomatics Sciences, Institute for
Sustainable Food Systems and Fisheries and Aquatic Sciences, University of
Florida, Gainesville, FL, USA
Arun Bhattacharyya Independent Researcher. Formerly, Director of Strategic
Forecasting at Pfizer Inc. NYC., New York, NY, USA
Sumon Kumar Bhaumik Sheffield University Management School, University of
Sheffield, Sheffield, UK
IZA – Institute of Labor Economics, Bonn, Germany
Global Labor Organization, Geneva, Switzerland
Moriah Bostian Department of Economics, Lewis & Clark College, Portland, OR,
USA
Department of Economics, Centre for Environmental and Resource Economics
(CERE), Umeå University, Umeå, Sweden
Boris E. Bravo-Ureta Agricultural and Resource Economics, University of Con-
necticut, Storrs, CT, USA
Walter Briec University of Perpignan, LAMPS, Perpignan, France
Robert G. Chambers Department of Agricultural and Resource Economics,
University of Maryland, College Park, MD, USA
Jean-Paul Chavas University of Wisconsin, Madison, WI, USA
Joshua Congdon-Hohman College of the Holy Cross, Worcester, MA, USA

W. Erwin Diewert Vancouver School of Economics, University of British
Columbia, Vancouver, BC, Canada
School of Economics, UNSW, Sydney, NSW, Australia
Joseph Duffy Oxera Consulting, Oxford, UK
Rolf Färe Department of Economics and Department of Agricultural and Resource
Economics, Oregon State University, Corvallis, OR, USA
Department of Economics and Department of Applied Economics, School of Public
Policy, Oregon State University, Corvallis, OR, USA
Department of Agricultural Economics, University of Maryland, College Park, MD,
USA
Finn R. Førsund Department of Economics, University of Oslo, Oslo, Norway
Devin Garcia Ernst and Young, LLP, Houston, TX, USA
Shawna Grosskopf Department of Economics, School of Public Policy, Oregon
State University, Corvallis, OR, USA
Valentina Hartarska Auburn University, Auburn, AL, USA
Alan Horncastle Oxera Consulting, Oxford, UK
Man Jin Department of Economics, Oakland University, Rochester, MI, USA
Jill Johnes Huddersfield Business School, University of Huddersfield, Hudders-
field, UK
Kristiaan Kerstens IESEG School of Management, CNRS, Université de Lille,
UMR 9221-LEM, Lille, France
Peter Krupa Oxera Consulting, Oxford, UK
Subal C. Kumbhakar Department of Economics, State University of New York at
Binghamton, Binghamton, NY, USA
Inland Norway University of Applied Sciences, Lillehammer, Norway
Levent Kutlu Department of Economics and Finance, University of Texas Rio
Grande Valley, Edinburg, TX, USA
Gudbrand Lien Inland School of Business and Social Sciences, Inland Norway
University of Applied Sciences, Lillehammer, Norway
Shasha Liu Enterprise Model Risk, Freddie Mac, McLean, VA, USA
Tommy Lundgren Department of Economics, Centre for Environmental and
Resource Economics (CERE), Umeå University, Umeå, Sweden
Dimitris Margaritis Department of Accounting and Finance, University of Auck-
land Business School, Auckland, New Zealand
Victor Matheson College of the Holy Cross, Worcester, MA, USA
Ruth Beatriz Mezzalira Pincinato UiS Business School, University of Stavanger,
Stavanger, Norway
Stephen M. Miller Department of Economics, Lee Business School, University of
Nevada, Las Vegas, Las Vegas, NV, USA
Sushama Murty Centre for International Trade and Development, School of
International Studies, Jawaharlal Nehru University, New Delhi, India
Ørjan Mydland Inland School of Business and Social Sciences, Inland Norway
University of Applied Sciences, Lillehammer, Norway
Marc Nerlove Department of Agricultural and Resource Economics, College of
Agriculture and Natural Resources, University of Maryland, College Park, MD,
USA
Florian Neubauer Agricultural and Resource Economics, University of Connecti-
cut, Storrs, CT, USA
Kristofer Odolinski Institute for Transport Studies, University of Leeds, Leeds,
UK
Society, Environment, and Transport, The Swedish National Road and Transport
Research Institute (VTI), Stockholm, Sweden
Luis Orea Oviedo Efficiency Group, Department of Economics, University of
Oviedo, Oviedo, Spain
Alecos Papadopoulos Athens University of Economics and Business, Athens,
Greece
Christopher F. Parmeter Department of Economics, University of Miami, Miami,
FL, USA
Victor V. Podinovski School of Business and Economics, Loughborough Univer-
sity, Loughborough, UK
Daniel Primont Department of Economics, Southern Illinois University-
Carbondale, Carbondale, IL, USA
Daniel Prudencio Department of Economics, Rice University, Houston, TX, USA
D. S. Prasada Rao School of Economics, The University of Queensland, Brisbane
St. Lucia, QLD, Australia
Subhash C. Ray Department of Economics, University of Connecticut, Storrs, CT,
USA
R. Robert Russell Department of Economics, University of California, Riverside,
Riverside, CA, USA
Subrata Sarkar Indira Gandhi Institute of Development Research, Mumbai, India
Kathleen Segerson Department of Economics, University of Connecticut, Storrs,
CT, USA
Robin C. Sickles Department of Economics, Rice University, Houston, TX, USA
Andrew Smith Society, Environment, and Transport, The Swedish National Road
and Transport Research Institute (VTI), Stockholm, Sweden
Dale Squires NMFS, Southwest Fisheries Science Center, La Jolla, CA, USA
Department of Economics, University of California San Diego, La Jolla, CA, USA
Spiro E. Stefanou Food and Resource Economics Department, University of
Florida, Gainesville, FL, USA
Wageningen University, Wageningen, Netherlands
Thijs ten Raa Utrecht School of Economics, Utrecht University, Utrecht, The
Netherlands
Ragnar Tveteras UiS Business School, University of Stavanger, Stavanger, Nor-
way
Ignace Van de Woestyne Research Unit MEES, KU Leuven, Brussel, Belgium
John Walden NMFS, Northeast Fisheries Science Center, Woods Hole, MA, USA
Alan Wall Department of Economics, University of Oviedo, Oviedo, Spain
W. L. Weber Department of Accounting, Economics and Finance, Southeast
Missouri State University, Cape Girardeau, MO, USA
Phill Wheat Institute for Transport Studies, University of Leeds, Leeds, UK
Chien Xen Ng Oxera Consulting, Oxford, UK
Valentin Zelenyuk School of Economics and Centre for Efficiency and Productiv-
ity Analysis (CEPA), The University of Queensland, Brisbane, QLD, Australia
Shunan Zhao Department of Economics, Oakland University, Rochester, MI, USA
Part I
Theory

1 Neoclassical Production Economics: An Introduction
Robert G. Chambers and Subhash C. Ray

Contents
Introduction
An Overview of Neoclassical Production Theory
The Primal Perspective
The Dual Perspective
Restricted Profit Functions
The Search for a Practical Production Function
The Cobb-Douglas Production Function
The Constant Elasticity of Substitution (CES) Production Function
Homothetic and Non-homothetic CES Production Functions
Additive Implicit Multiple Input Production Functions
Constant Ratio of Elasticities of Substitution (CRES) Production Functions
Indirect Production Function: An Aside
Additive Implicit Indirect Production Functions
Flexible Functional Forms
Appendix 1
Appendix 2
Appendix 3 Elasticity of Substitution Derived from the Dual Cost Function
Cross-References
References

We thank Chuang Li for his technical assistance in preparing the final version of this manuscript.
R. G. Chambers
Department of Agricultural and Resource Economics, University of Maryland,
College Park, MD, USA
S. C. Ray
Department of Economics, University of Connecticut, Storrs, CT, USA

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_18

Abstract

To emphasize the nexus between the theory and the empirics of production, this
chapter is split into two parts. The first presents a brief overview of the state of
neoclassical production theory as it exists in the third decade of the twenty-first
century. The second part presents an overview of the history of the development
of functional forms for the production function.

Keywords

Production · Primal · Dual · Profit-maximization

Introduction

This handbook is divided into two parts. The first part focuses on theoretical
issues of production economics. The second part focuses on empirical applications
of the theories to applied production analysis. This split, we hope, clarifies the
presentation. In practice, however, no such clear separation exists. Throughout its
history, production theory has responded to empirical exigency. An early exemplar
is von Thünen’s [25] induction of the principle of diminishing marginal returns from
records for his farming estate. So, too, are Cobb and Douglas’s [6] development
of their production function to fit observed trends in US macroeconomic data and
Gorman’s [10] theorem on the aggregation of fixed factors of production.
To emphasize this nexus between production theory and production empirics,
we split this volume’s introductory chapter into two parts. The first presents a brief
overview of the state of neoclassical production theory as it exists in the third decade
of the twenty-first century. The second part presents an overview of the history of
the development of functional forms for the production function.

An Overview of Neoclassical Production Theory

Neoclassical theory gradually emerged from classical economics during the last
quarter of the nineteenth century. And, as illustrated by Marshall’s Principles, the
early twentieth-century discussions of producer behavior were heavily sprinkled
with neoclassical notions such as marginal productivity (returns), marginal cost,
and diminishing marginal returns. Often, these ideas were not wholly original to
neoclassical thinkers. For example, classical writers had recognized the principle
of diminishing returns. But they often attributed it to different causes (e.g.,
deteriorating quality of inputs) than the neoclassical school. What distinguished the
early neoclassical writers was their emphasis on the marginal principle defined in
terms of identical units of inputs and outputs.
As the economic analysis of production systems developed, it became increasingly
evident that identical was to be interpreted in the narrowest possible terms
to mean identical in all aspects (e.g., quality, time, place, and state of nature).
An example from early empirical studies of economic growth illustrates the point. Growth
accounting seeks to attribute output growth to its basic causes. In the neoclassical
framework, that means attributing output growth to increased resource use and
technical advances in know-how. Abramovitz [1] observed that, according to
then-current measurement techniques, the primary source of US national product growth
in the preceding eight decades was not measured resource growth. Instead, it was
an increase in total factor productivity defined as the residual between measured
output growth and input growth. This challenged received neoclassical production
theory because it implied that the neoclassical model of rational producers reacting
to a technology could only explain observed growth patterns in a deus ex machina
fashion as unexplained shifts in production frontiers. Jorgenson and Griliches [14]
showed, however, that once mismeasurements of inputs, outputs, and prices were
eliminated to ensure a closer concordance between measured variates and their
theoretical counterparts, all but a small residual of output growth could be explained
by input adjustments along a given production frontier.1
The essential point is that neoclassical production theory works with precisely
defined mathematical variates, while the extramathematical reasoning [8] that we
attach to it often does not. That theory is a way of organizing our thinking about
how producers behave in real life. Or it’s a set of formal stories about producers
based on a model of real-world decision settings. If the mathematics are correct, so
too must be those formal stories. But because models are not replicas, those formal
stories can and will differ from reality. The ultimate test of a model and its derived
theory is whether they allow us to say something useful about reality.
Ideally, models would allow us to predict exactly observed or measured behavior.
But that seems beyond our reach. So while confronting theoretical models with data
is crucial, one needs to remember that observed data are only truly informative about
a model’s accuracy to the extent that measured variates correspond to the model’s
theoretical variates. Of course, the converse is also true; a model’s usefulness is
circumscribed by the correspondence between the measured variates of interest in
real-world settings and the model’s theoretic variates. The ultimate challenge is to
strike the proper balance.
Our theoretical overview is broken into two parts that roughly accord with dif-
ferent analytic approaches popularly known as the primal and the dual approaches.
This terminology stems from four sources: the mathematical notion of a dual space,
the economist’s choice of market value as the producer’s objective function, the
economist’s expression of behavioral relations for quantities as functions of prices
with both treated as objects falling in the same dimensional real space, and the
different perspectives adopted in analyzing these relations.

1 At roughly the same time that Abramovitz [1] wrote, Schultz [21] articulated the idea that an ideal
input-output formula would have measured output growth completely explained by measured input
growth. If that could be achieved, it would suggest that a complete economic explanation of output
growth had been accomplished. Schultz [21], in a footnote, attributed the idea to Zvi Griliches,
who was a graduate student at the time.
6 R. G. Chambers and S. C. Ray

For an arbitrary space, $X$, its dual space, $X^*$, is defined as the space of linear
operators for $X$. An important property of finite dimensional real spaces, denoted by
$\mathbb{R}^M$, is that they are self-dual in the sense that $\mathbb{R}^{M*} = \mathbb{R}^M$ and
$\mathbb{R}^{M**} = \mathbb{R}^M$. Thus, as purely mathematical objects, an $M$-dimensional vector of
quantities, denoted $z \in \mathbb{R}^M$, and an $M$-dimensional vector of prices, $q \in \mathbb{R}^M$,
are both recognizable as linear operators. The market value of bundle $z \in \mathbb{R}^M$ is
given by the linear function of quantities $\sum_m q_m z_m$ for prices $q \in \mathbb{R}^{M*}$. But
because the dual space of $\mathbb{R}^{M*}$ is $\mathbb{R}^M$, the same market value of $z \in \mathbb{R}^M$ can
also be interpreted as the linear function of prices $\sum_m z_m q_m$ for quantities
$z \in \mathbb{R}^{M**}$.
Originally, economists focused their attention on value maximization viewed as
a problem of choosing quantities to optimize market values of commodity bundles.
That placed the focus on $\sum_m q_m z_m$ viewed as a linear function of quantities. It was
only with the development of the theory of optimization in the decades following
World War II that researchers apprehended and exploited the analytic advantages of
viewing $\sum_m z_m q_m$ as a linear function of prices. The quantity-based approach was
identified with the primal terminology and the price-based approach with the dual
approach.
It is to be emphasized that the analytic difference between the two approaches is
one of perspective and technique. The same substantive results are available from
either approach. However, the dual approach quickly proved particularly popular
because it offered an econometrically advantageous way to model production
systems.
The discussion that follows first considers the primal approach. Then the model
is generalized and analyzed from a dual perspective. A brief demonstration of how
the general model can accommodate both long-run and short-run producer behavior
follows.

The Primal Perspective

Prior to roughly 1970, the approach that economists used to analyze productive sys-
tems closely followed Samuelson’s [18] classic treatment of the profit-maximizing
firm. Competitive profit-maximizing producers were modeled as facing a technolog-
ical constraint treated typically as a smooth transformation or production function,
and programming techniques were used to characterize profit-maximizing solutions
and to make inferences about producer responses to price perturbations.
Let $x \in \mathbb{R}^N_{+}$, $y \in \mathbb{R}^M_{+}$, $w \in \mathbb{R}^N_{++}$, and $p \in \mathbb{R}^M_{++}$ denote, respectively, a vector
of inputs, a vector of outputs, the input price vector, and the output price vector.
We denote by t (x, y) a function of inputs and outputs that is nondecreasing in y
and nonincreasing in x such that x can produce y if and only if t (x, y) ≤ 0. Posed
formally, the producer’s problem is to choose (x, y) to

$$\max \left\{ p^{\top} y - w^{\top} x : t(x, y) \le 0 \right\}. \tag{1}$$

(Here, and in what follows, $x^{\top} y$ for $x, y \in \mathbb{R}^N$ denotes the usual inner product,
$\sum_n x_n y_n$.) The Lagrangian associated with this problem is written

$$L(x, y, w, p, \lambda) = p^{\top} y - w^{\top} x - \lambda t(x, y)$$

where $\lambda$ is a nonnegative Lagrangian multiplier, and the first-order conditions for an
optimal interior solution are

$$\begin{aligned}
\nabla_y L &= p - \lambda \nabla_y t(x, y) = 0, \\
\nabla_x L &= -w - \lambda \nabla_x t(x, y) = 0, \\
\nabla_\lambda L &= -t(x, y) = 0.
\end{aligned}$$

Here ∇h f (h) denotes the gradient of f with respect to the argument h.

These conditions are familiarly interpretable as: output price ratios, $(p_k/p_j)$, are
equated to marginal rates of transformation between $y_k$ and $y_j$,
$\frac{\partial t}{\partial y_k} / \frac{\partial t}{\partial y_j}$; input price ratios, $(w_n/w_i)$, are equated to
marginal rates of technical substitution, $\frac{\partial t}{\partial x_n} / \frac{\partial t}{\partial x_i}$; the
marginal product of $x_n$ in producing $y_m$, $-\frac{\partial t}{\partial x_n} / \frac{\partial t}{\partial y_m}$, is
equated to $(w_n/p_m)$; and $x$ and $y$ are technically efficient (i.e., fall on the
relevant frontiers). Figure 1a–c illustrate visually.
Fig. 1 (a): Input equilibrium (b): Output equilibrium (c): Efficient production

The associated second-order conditions for a local maximum (under the assumption
that $\lambda > 0$) require that

$$H = -\begin{bmatrix}
\nabla_{yy} t(x, y) & \nabla_{yx} t(x, y) & \nabla_y t(x, y) \\
\nabla_{xy} t(x, y) & \nabla_{xx} t(x, y) & \nabla_x t(x, y) \\
\nabla_y t(x, y)^{\top} & \nabla_x t(x, y)^{\top} & 0
\end{bmatrix}$$

be negative semi-definite. Then under the assumptions that (a) the first-order
conditions are satisfied, (b) the second-order conditions are satisfied, and (c) the
conditions of the implicit function theorem are met, optimal producer behavior is
differentially characterized by
$$\begin{bmatrix} dy(p, w) \\ dx(p, w) \\ \hat{\lambda}(p, w) \end{bmatrix}
= -H^{-1} \begin{bmatrix} dp \\ -dw \\ 0 \end{bmatrix}, \tag{2}$$

where $y(p, w)$ and $x(p, w)$ denote, respectively, optimal (profit-maximizing)
supply and derived demand (for inputs) vectors, $\lambda(p, w)$ denotes the optimal value
of the Lagrangian multiplier, and $\hat{\lambda}(p, w) = d \ln \lambda(p, w)$.
Using assumptions (a)–(c) and (2) establishes that:
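Behavioral Prediction (1) $\frac{\partial y_m(p, w)}{\partial p_m} \ge 0$, $m = 1, 2, \ldots, M$;
Behavioral Prediction (2) $\frac{\partial x_n(p, w)}{\partial w_n} \le 0$, $n = 1, 2, \ldots, N$;
Behavioral Prediction (3) $\frac{\partial y_m(p, w)}{\partial p_k} = \frac{\partial y_k(p, w)}{\partial p_m}$, $k, m = 1, 2, \ldots, M$;
Behavioral Prediction (4) $\frac{\partial x_n(p, w)}{\partial w_j} = \frac{\partial x_j(p, w)}{\partial w_n}$, $j, n = 1, 2, \ldots, N$; and
Behavioral Prediction (5) $\frac{\partial y_m(p, w)}{\partial w_j} = -\frac{\partial x_j(p, w)}{\partial p_m}$, $m = 1, 2, \ldots, M$, $j = 1, 2, \ldots, N$.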
Because the objective function for the profit-maximizing firm is linear in (p, w)
and t (x, y) is independent of (p, w) , any solution to the problem, y (p, w) and
x (p, w) , for (p, w) must also be a solution for (μp, μw) where μ > 0, whence
y (μp, μw) = y (p, w) and x (μp, μw) = x (p, w) . Combined with Behavioral
Predictions (1) − (5), zero-degree homogeneity summarizes the core results for
neoclassical production theory. Producers do not suffer from money illusion, profit-
maximizing supplies slope upward in their own prices, profit-maximizing demands
slope downward in their prices, and in a smooth world differential supply and
demand adjustments possess an essential symmetry. A number of related results,
for example, the Le Chatelier principle relating long- and short-run supply respon-
siveness, follow from suitable modifications of these basic techniques.
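
The behavioral predictions and zero-degree homogeneity can be checked numerically in a simple special case. The sketch below is only an illustration under stated assumptions: it takes a hypothetical one-input, one-output technology $t(x, y) = y - x^{a}$ with $0 < a < 1$ (not a specification used in the chapter), uses the closed-form profit-maximizing solution for that case, and approximates the price derivatives by central finite differences. The function names are invented for the example.

```python
def solve(p, w, a=0.5):
    """Closed-form profit-maximizing (y, x) for the assumed technology
    t(x, y) = y - x**a <= 0 with 0 < a < 1 (hypothetical example)."""
    x = (a * p / w) ** (1.0 / (1.0 - a))  # from the first-order condition p*a*x**(a-1) = w
    y = x ** a                            # technical efficiency: t(x, y) = 0
    return y, x

def supply(p, w):
    return solve(p, w)[0]

def demand(p, w):
    return solve(p, w)[1]

def partial(f, p, w, wrt, h=1e-6):
    """Central finite-difference derivative of f(p, w) with respect to 'p' or 'w'."""
    if wrt == "p":
        return (f(p + h, w) - f(p - h, w)) / (2 * h)
    return (f(p, w + h) - f(p, w - h)) / (2 * h)

p0, w0 = 2.0, 1.0
dy_dp = partial(supply, p0, w0, "p")  # Behavioral Prediction (1): should be >= 0
dx_dw = partial(demand, p0, w0, "w")  # Behavioral Prediction (2): should be <= 0
dy_dw = partial(supply, p0, w0, "w")  # Behavioral Prediction (5): dy/dw = -dx/dp
dx_dp = partial(demand, p0, w0, "p")

# Zero-degree homogeneity: scaling all prices by mu > 0 leaves choices unchanged.
mu = 3.0
y0, x0 = solve(p0, w0)
y1, x1 = solve(mu * p0, mu * w0)

print(dy_dp, dx_dw, dy_dw + dx_dp, y1 - y0, x1 - x0)
```

With a single output and a single input, Behavioral Predictions (3) and (4) hold trivially, so the sketch only exercises (1), (2), (5), and homogeneity.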
Some comments are relevant: First, the setting of the problem and the assumed
motivation behind producer behavior are key to its analysis. Assuming that produc-
ers are price takers and small relative to the market eliminates the possibility for
strategic interactions that would complicate analyses. Second, assuming producers
are profit seekers lets them be modeled “as if” they solve a maximization problem
with a clearly articulated objective function and constraints. Optimal or rational
behavior is then identified with conditions required for profit maximization. And,
more suggestively, producers are said to be “in equilibrium” if their behavior is
consistent with that optimum. As Samuelson [18] explained, the resemblance
to physical systems being “in equilibrium” when entropy is maximized is not
accidental. Neither is the fact that the tools of analysis, variational techniques
applied to differentiably smooth systems, closely parallel those used in classical
thermodynamics.
Consequently, equilibrium behavior is described primarily in terms of condi-
tions which the physical technology must satisfy in an optimum. Marginal rates
of transformation and marginal rates of substitution, which characterize trade-
offs between inputs and outputs internal to the technology, must be equated
to real market prices. Moreover, maxima (at least local) are distinguished from
inflection points or minima by conditions that the technology must satisfy (see
the second-order conditions). Again the resemblance to classical mechanics and
thermodynamics is not accidental. And once these conditions are assured, producers
can be shown to behave in a manner that accords with the most familiar parables of
microeconomic theory as captured by Behavioral Predictions (1) and (2) and zero-
degree homogeneity.
Third, optimal producer behavior is characterized in infinitesimal terms. To
be sure, directional results are obtained, but they only strictly apply in tiny
neighborhoods of the identified equilibria. Individuals are modeled “as if” they will
perceive and respond (smoothly) to even the tiniest perturbations in market prices.
And to make inferences about how individuals respond to discrete changes, these
differential results must be augmented by a combination of integral analysis and the
correspondence principle.

The Dual Perspective

In this section, we continue to treat producers who are profit maximizers and face
given prices and a technological constraint. But we alter the framing of the problem
and its mathematical analysis.
We now model the producer’s technological constraints as a closed and nonempty
subset of M-dimensional real space, T ⊂ RM , that we shall refer to interchangeably
as the technology set or the technology. We relax the distinction between inputs and
outputs and work instead with net outputs (netputs for short) denoted as z ∈ RM .
Using netputs accommodates the possibility that in differing circumstances the same
commodity can function variously as an input or an output.
The technology set, $T \subset \mathbb{R}^M$, is defined as

$$T = \left\{ z \in \mathbb{R}^M : z \text{ is technically feasible} \right\}.$$

In principle, the technology subsumes all feasible productive activities.2 The
convention is that $z_k < 0$ denotes a netput functioning as an input and $z_k > 0$

2 This raises a semantic point. One often reads or hears references to individuals or firms facing
different technologies. For example, a hand-push, reel lawn mower and a self-propelled lawn
mower might be referred to as two different technologies for cutting grass. This is not our
interpretation of T , which we take to encompass all technically feasible activities. In what follows,
we shall discriminate between different productive activities (e.g., growing wheat as opposed to
producing steel) not as different technologies but as different production processes that fall in T .
Fig. 2 Netput technology

one functioning as an output. That role can change. Figure 2 illustrates one possible
$T$. Netput prices are denoted as $q \in \mathbb{R}^M_{++}$.
We now show that the behavioral predictions for producers derived indirectly
from t (x, y) via H and the Lagrangian approach can be deduced directly from the
following postulates:
Postulate (a) Producers face competitive pricing for netputs.
Postulate (b) Producers are profit maximizers.
Postulate (c) Finite solutions exist for

$$\pi(q) \equiv \max_{z} \left\{ q^{\top} z : z \in T \right\} \tag{3}$$

for all $q \in \mathbb{R}^M_{++}$.

In what follows, we refer to $\pi(q)$ as the profit function. More formally, it is the
support function for $T$ [17, p. 28]. Let

$$Z(q) = \arg\max \left\{ q^{\top} z : z \in T \right\}$$

denote the correspondence3 giving the profit-maximizing solutions to (3) and
$z(q) \in \mathbb{R}^M$ denote a particular element of $Z(q)$.
Because $q^{\top} z$ is linear in $q$ and $T$ is independent of $q$, any solution for (3) for $q$
is also a solution for $\mu q$ with $\mu > 0$. Hence, for $\mu > 0$, $Z(\mu q) = Z(q)$ (optimal
netput supplies are homogeneous of degree zero in $q$) and $\pi(\mu q) = \mu \pi(q)$ ($\pi(q)$
is positively homogeneous in $q$). The homogeneity properties of $Z(q)$ and $\pi(q)$

3 A correspondence represents a point to set mapping. Thus, $Z(q) \subset \mathbb{R}^M$ denotes the set of profit-
maximizing solutions associated with the point $q \in \mathbb{R}^M$. We use the correspondence notation to
remind the reader that profit maximization problems may have multiple, global solutions.
manifest the familiar adage that “only real prices matter” in making economic
choices.
Because $z(q^o), z(q^*) \in T$, the definition of $\pi(q)$ ensures that
$\pi(q^o) = q^{o\top} z(q^o) \ge q^{o\top} z(q^*)$ and $\pi(q^*) = q^{*\top} z(q^*) \ge q^{*\top} z(q^o)$. These inequalities,
which characterize maxima, are the fundamental source of the behavioral results
that follow. Geometrically, they require for all $q \in \mathbb{R}^M_{++}$ that $z(q) \in T$ and that
$T$ fall in the half-space $\{ z \in \mathbb{R}^M : q^{\top} z \le q^{\top} z(q) \}$ generated by the hyperplane with
normal $q$ that passes through $z(q)$. Visually, that translates into a hyperplane with
normal $q$ being tangent to $T$ (from above) at $z(q)$.
Adding the inequalities and rewriting gives4

$$\left( z(q^o) - z(q^*) \right)^{\top} \left( q^o - q^* \right) \ge 0. \tag{4}$$

Formally, expression (4) says that $Z(q)$ is (positively) cyclically monotone in $q$
[17, p. 228]. Cyclical monotonicity represents one multidimensional generalization
of univariate monotonicity.5 As an example, when $M = 1$, cyclical monotonicity
requires $z(q)$ to be nondecreasing in $q$. More generally, setting $q_k^o = q_k^*$ for all
$k \ne m$ in (4) gives

$$\left( z_m(q^o) - z_m(q^*) \right) \left( q_m^o - q_m^* \right) \ge 0.$$

The economic interpretation of this basic characteristic of maxima is that optimal
netput supplies must be nondecreasing in their own prices. This condition, which
applies for discrete as well as infinitesimal price changes, is to be compared to
Behavioral Predictions (1) and (2).
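
Because inequality (4) holds for discrete price changes and requires no structure on T beyond the existence of a maximum, it can be illustrated with a deliberately crude technology. The sketch below is an assumption-laden illustration, not part of the chapter: it uses a hypothetical finite (hence closed and nonconvex) set of two-dimensional netputs, picks one maximizer from Z(q) at each price, and evaluates the left-hand side of (4) on randomly drawn price pairs. All names and numbers are invented for the example.

```python
import random

# Hypothetical technology: a finite (closed, nonconvex) set of netputs (output, -input).
T = [(0.0, 0.0), (1.0, -1.0), (1.5, -2.0), (3.0, -3.0), (3.2, -5.0)]

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def supply(q):
    """Return one profit-maximizing netput z(q), an element of Z(q)."""
    return max(T, key=lambda z: dot(q, z))

def monotonicity_gap(q_o, q_s):
    """Left-hand side of (4): (z(q_o) - z(q_s))'(q_o - q_s)."""
    z_o, z_s = supply(q_o), supply(q_s)
    dz = [a - b for a, b in zip(z_o, z_s)]
    dq = [a - b for a, b in zip(q_o, q_s)]
    return dot(dz, dq)

rng = random.Random(0)  # fixed seed so the sampled price pairs are reproducible
gaps = []
for _ in range(1000):
    q_o = (rng.uniform(0.1, 5.0), rng.uniform(0.1, 5.0))
    q_s = (rng.uniform(0.1, 5.0), rng.uniform(0.1, 5.0))
    gaps.append(monotonicity_gap(q_o, q_s))

# Smallest value of the left-hand side of (4) over the sample (allowing for rounding).
print(min(gaps) >= -1e-12)
# Zero-degree homogeneity of the selected maximizer: doubling all prices changes nothing.
print(supply((2.0, 1.0)) == supply((4.0, 2.0)))
```

The technology here is neither convex nor freely disposable, which is the point: the inequality comes from maximization itself rather than from properties of T.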
Now observe that for all $q^o, q^*$

$$\begin{aligned}
\pi(q^o) &\ge q^{o\top} z(q^*) \\
&\Downarrow \\
\pi(q^o) &\ge \pi(q^*) + z(q^*)^{\top} \left( q^o - q^* \right),
\end{aligned} \tag{5}$$

where the second inequality follows by adding zero in the form of $\pi(q^*) - q^{*\top} z(q^*)$
to the right-hand side of the first inequality. In words, expression (5) requires that
any element of $Z(q)$ must belong to the subdifferential correspondence for $\pi(q)$,
which we denote by $\partial \pi(q) \subset \mathbb{R}^M$, at $q^*$ [13, p. 220].

4 Samuelson [18, pp. 80–1] established an equivalent result in the one-output, multiple-input case.
5 But not the only one. Another stronger notion of monotonicity is that $q' \ge q \Rightarrow z(q') \ge z(q)$.
This version implies cyclical monotonicity but is not implied by cyclical monotonicity. One
intuitive way to discriminate between the two notions of monotonicity is that cyclical monotonicity
means that price and quantity movements are positively correlated. The stronger notion requires
that any price increase be matched by all quantities at least weakly increasing.

The subdifferential notion generalizes the more familiar gradient to accommodate
nonsmooth (nondifferentiable) functions. Where a gradient is interpreted
geometrically as the unique normal of a hyperplane that is tangent to the graph of
a function, the subdifferential is interpreted as the set of normals to the hyperplanes
that support the graph of the function from below. Panels (a) and (b) of Fig. 3
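illustrate. Thus, $h \in \mathbb{R}^M$ belongs to the subdifferential of $f(v)$ for $f : \mathbb{R}^M \to \mathbb{R}$
at $v^o$ if $f(v) \ge f(v^o) + h^{\top}(v - v^o)$ for all $v$. For a positively homogeneous $f$,
such as $\pi(q)$, this is equivalent to the hyperplane $h^{\top} v$ satisfying $h^{\top} v^o = f(v^o)$
and $f(v) \ge h^{\top} v$ for all $v$, so that the graph of $f(v)$ lies above that of $h^{\top} v$ while
sharing a point in common. A convex function is differentiable at $v$ if and only if
$\partial f(v)$ is a singleton set.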
Intuitively, therefore, expression (5) says that tangent hyperplanes that support
the graph of π (q) from below at q ∗ must belong to Z (q ∗ ). Figure 4 illustrates.
This is a general version of Hotelling’s lemma, which says that profit-maximizing
netput supplies correspond to partial derivatives of the profit function, $\pi(q)$,

$$z_m(q) = \frac{\partial \pi(q)}{\partial q_m}, \quad m = 1, 2, \ldots, M$$

when $\pi(q)$ is smooth (differentiable). Conversely, $\pi(q)$ is differentiable in $q_m$ only
if there is a unique optimal solution $z_m$ for (3).
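
Hotelling’s lemma is easy to verify numerically when the profit function is smooth. The following sketch is a hedged illustration only: it assumes a hypothetical two-netput technology in which output $z_1$ satisfies $z_1 \le \sqrt{-z_2}$ (with $-z_2$ the input), for which the profit function works out to $\pi(q) = q_1^2 / (4 q_2)$. It then compares a finite-difference gradient of $\pi$ with the closed-form profit-maximizing netputs. The technology and function names are assumptions made for the example.

```python
def profit(q1, q2):
    """Profit function for the assumed technology z1 <= sqrt(-z2): pi = q1**2 / (4*q2)."""
    return q1 ** 2 / (4.0 * q2)

def netput_supply(q1, q2):
    """Profit-maximizing netputs (output, -input) for the same assumed technology."""
    x = (q1 / (2.0 * q2)) ** 2      # input use from the first-order condition
    return (x ** 0.5, -x)           # output is sqrt(x); the input enters with a negative sign

def gradient(f, q1, q2, h=1e-6):
    """Central finite-difference gradient of f at (q1, q2)."""
    d1 = (f(q1 + h, q2) - f(q1 - h, q2)) / (2 * h)
    d2 = (f(q1, q2 + h) - f(q1, q2 - h)) / (2 * h)
    return (d1, d2)

q = (2.0, 1.0)
grad = gradient(profit, *q)   # numerical gradient of the profit function
z = netput_supply(*q)         # optimal netputs computed directly

# Hotelling's lemma: the two vectors should agree up to finite-difference error.
print(grad, z)
```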
A familiar geometric characterization of smooth convex functions is that they
are always underapproximated by first-order Taylor series around the point of
approximation. Cyclical monotonicity of subdifferentials is the generalization of
that characterization that covers nonsmooth convex functions. Hence, cyclical
monotonicity of ∂π (q) ensures that π (q) is convex as a function of q [17,
Theorems 24.8 and 24.9]. Or as Fig. 4 illustrates, all of π (q)’s tangent hyperplanes
support it from below for all q. (One can show directly that π (q) is a convex
function of q. Versions of that argument are presented in several of the chapters
that follow, and so we do not repeat it here.)
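
For readers who want the direct argument at hand, one standard version runs as follows. It is offered as a sketch in the notation of expression (3), and is not necessarily the form used in the later chapters. Take $q^o, q^* \in \mathbb{R}^M_{++}$ and $\lambda \in [0, 1]$:

```latex
\begin{aligned}
\pi\left(\lambda q^o + (1-\lambda) q^*\right)
  &= \max_{z} \left\{ \lambda q^{o\top} z + (1-\lambda) q^{*\top} z : z \in T \right\} \\
  &\le \lambda \max_{z} \left\{ q^{o\top} z : z \in T \right\}
     + (1-\lambda) \max_{z} \left\{ q^{*\top} z : z \in T \right\} \\
  &= \lambda \pi(q^o) + (1-\lambda) \pi(q^*).
\end{aligned}
```

The inequality holds because a single $z$ chosen to maximize the weighted sum can do no better than choosing a separate maximizer for each term. No property of $T$ beyond Postulate (c) is used.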

Fig. 3 (a): Gradients as unique tangents for smooth function (b): Elements of subdifferential for
nonsmooth function
Fig. 4 Subdifferential as profit-maximizing netput

Having demonstrated zero-degree homogeneity of z (q) and the discrete generalization
of Behavioral Predictions (1) and (2), all that remains is to derive the netput
analogues of Behavioral Predictions (3)–(5). This is done easily by assuming that,
in addition to being convex, π (q) is twice-continuously differentiable and applying
Hotelling’s lemma and the Schwarz-Young symmetry theorem.
Thus, Postulates (a)–(c) suffice to establish generalized versions of the results
established using techniques borrowed from classical thermodynamics. While the
robustness of those results is to be remarked, it is essential to recognize that they
derive not from conditions placed upon T but from Postulates (a) and (b) that require
that producers maximize profit in a competitive market setting.
As long as T admits a maximum, the behavioral predictions are robust to T’s
actual structure. Thus, they are best recognized as extramathematical reasoning
applied to fundamentally mathematical results about solutions to optimization
problems. So, what we interpret as an economic result involving upward-sloping
supply curves is a mathematical characteristic of optimal solutions to a class
of optimization problems, cyclical monotonicity, that has intuitively plausible
economic implications.
The powerful corollary is that models of producer behavior need not be con-
structed from the “ground up,” so to speak, by starting with t (x, y) and then laboring
through first- and second-order conditions to obtain behavioral economic relations.
Instead, if we accept Postulates (a)–(c), a direct specification of behavioral models
in terms of π (q) and ∂π (q) is available that provides observationally equivalent
behavioral predictions to those derived via t (x, y).
A natural consequence is that production economists have become less focused
on the technical aspects of the underlying technology. Beyond manifesting the theo-
rem of comparative advantage relative to more physically oriented disciplines, such
as engineering and the biological sciences, that change in focus also emphasizes the
true role that T and its representations play in economic analysis. Clearly, producers
must understand the details of T if they are to prove successful. On the other hand,
production economists are not themselves producers. Their job is to develop models
that accurately depict producer behavior. And that requires an understanding of the
characteristics of T that play an essential role in conditioning that behavior but not
of all of its technical details.
Our arguments have shown that, as a mathematical object, $\pi(q)$ is positively
homogeneous and convex (sublinear) as a function of $q \in \mathbb{R}^M_{++}$. Under weak
continuity restrictions, those properties ensure that $\pi(q)$ is the support function
for the closed, convex subset $\bar{T}$ of $\mathbb{R}^M$ [17, Theorems 13.1 and 13.2], [13, Theorem
C.3.1.1] given by

$$\bar{T} = \left\{ z \in \mathbb{R}^M : q^{\top} z \le \pi(q) \text{ for all } q \in \mathbb{R}^M_{++} \right\}.$$

Recall that a set $B \subset \mathbb{R}^M$ is convex if $b^0, b^1 \in B$ implies $\lambda b^0 + (1 - \lambda) b^1 \in B$ for
all $0 < \lambda < 1$.
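The construction of $\bar{T}$ is illustrated in Fig. 5. There the hyperplanes labeled
$\bar{q}^{\top} z = \pi(\bar{q})$ and $\hat{q}^{\top} z = \pi(\hat{q})$, respectively, represent the upper boundaries for
the closed, convex half-spaces $\{ z \in \mathbb{R}^M : \bar{q}^{\top} z \le \pi(\bar{q}) \}$ and
$\{ z \in \mathbb{R}^M : \hat{q}^{\top} z \le \pi(\hat{q}) \}$ generated by $\bar{q} \in \mathbb{R}^M_{++}$ and $\hat{q} \in \mathbb{R}^M_{++}$. $\bar{T}$ must be
contained in the intersection

$$\left\{ z \in \mathbb{R}^M : \bar{q}^{\top} z \le \pi(\bar{q}) \right\} \cap \left\{ z \in \mathbb{R}^M : \hat{q}^{\top} z \le \pi(\hat{q}) \right\}$$

and in all other half-spaces $\{ z \in \mathbb{R}^M : q^{\top} z \le \pi(q) \}$ generated by the remaining
$q \in \mathbb{R}^M_{++}$. Because this intersection, as Fig. 5 illustrates, and these other half-spaces
are closed and convex, $\bar{T}$ must be as well. Moreover, as is visually apparent from
Fig. 5, $\tilde{z} \le z \in \bar{T}$ implies $\tilde{z} \in \bar{T}$ (free disposability of netputs) because $\tilde{z} \le z$
requires $q^{\top} \tilde{z} \le q^{\top} z$ for all $q \in \mathbb{R}^M_{++}$.

Fig. 5 Constructing $\bar{T}$
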
These observations show that $\bar{T}$ cannot correspond to technology sets that are
nonconvex or that do not exhibit free disposability of netputs. (Recall the only
restrictions which have been placed upon $T$ are that it be nonempty and closed.)
Convex technologies exhibit what economists refer to as diminishing marginal
returns and decreasing returns to scale. (Strictly speaking, diminishing and decreasing
should be replaced with nonincreasing.) Free disposability of netputs, on the
other hand, is the mathematical generalization of $t(x, y)$ nondecreasing in $y$ and
nonincreasing in $x$ (nonnegative marginal productivities).
Therefore, as a general rule, $\bar{T} \ne T$. Nevertheless, given Postulates (a)–(c),
$T \subset \bar{T}$ necessarily. This claim follows from the observation that $z \in T$ only if
$q^{\top} z \le \pi(q)$ for all $q \in \mathbb{R}^M_{++}$. (If this were not true, there would exist a $z \in T$ such
that $q^{\top} z > \pi(q)$ for some $q$, which violates the definition of $\pi(q)$.) On the other
hand, as z̃ in Fig. 6 illustrates, there can exist points falling in T̄ that do not belong to
T . Thus, T̄ represents an over or outer approximation to T . All technically feasible
points necessarily fall in T̄ , but T̄ can contain technically infeasible netputs.
For example, in Fig. 6, it is visually obvious that $Z(\hat{q}) = \{ z^1(\hat{q}), z^2(\hat{q}) \}$
and $\tilde{z} \le \mu z^1(\hat{q}) + (1 - \mu) z^2(\hat{q})$ for all $\mu \in [0, 1]$. In words, for $\hat{q} \in \mathbb{R}^M_{++}$,
multiple solutions exist for the profit maximization problem, and there exist points
dominated by all the convex combinations of those multiple solutions that fall in $\bar{T}$
but not in $T$. The existence of such points manifests the nonconvex nature of the
boundary of $T$ between the points $z^1(\hat{q})$ and $z^2(\hat{q})$. These nonconvexities can be
interpreted, respectively, as the technology exhibiting increasing returns in $z_1$ and
$z_2$ in the neighborhood of the origin. It is well known, however, that Postulates (a)
and (b) guarantee that points falling between $z^1(\hat{q})$ and $z^2(\hat{q})$ on the nonconvex
portion of the boundary of $T$ will never be utilized by profit-maximizing producers
(e.g., [15, 18]). One sees this visually in Fig. 6 by noting that any hyperplane with
a strictly positive normal cannot support $T$ from above in a region where it is
nonconvex. In more traditional terms, profit maximizers never operate in region 1
of the production function.

Fig. 6 Non-convex T

Thus, while T̄ may not characterize T if the latter is nonconvex, the information
lost in going from T to T̄ is economically superfluous under Postulates (a) and (b)
because T and T̄ yield the same π (q) . Put another way, π (q) derived from T and T̄
are observationally equivalent to one another in the sense that the profit-maximizing
choices made by an individual facing T cannot be distinguished from those of an
individual facing T̄ .
In more formal terms, T̄ is the free disposal convex hull of T . That is, T̄ is the
smallest convex set consistent with free disposability for which T ⊂ T̄ . It consists
of all convex combinations of T and all elements of RM dominated (in the sense of
≤) by those convex combinations of T [17, Theorem 2.3].
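
The observational equivalence of T and its free disposal convex hull can be seen in a small numerical experiment. The sketch below is illustrative only and rests on invented data: a hypothetical finite, nonconvex set of netputs stands in for T, and membership in the outer approximation is checked on a finite grid of prices, which gives a necessary condition for membership rather than a proof. It confirms that a convex combination of two points of T and a point dominated by it both pass the half-space test, that a point with more output from the same input fails it, and that adding the convex combination to T leaves profit unchanged at every grid price.

```python
# Hypothetical finite, nonconvex technology of netputs (output, -input).
T = [(0.0, 0.0), (1.0, -1.0), (1.5, -2.0), (3.0, -3.0), (3.2, -5.0)]

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def profit(q, tech):
    """Support function: the maximum of q'z over a finite technology."""
    return max(dot(q, z) for z in tech)

# Grid of strictly positive price vectors (multiples of 0.25 in each coordinate).
prices = [(a / 4.0, b / 4.0) for a in range(1, 21) for b in range(1, 21)]

def in_outer_approximation(z):
    """True if q'z <= pi(q) at every grid price (necessary for z to lie in T-bar)."""
    return all(dot(q, z) <= profit(q, T) for q in prices)

midpoint = (2.0, -2.0)    # convex combination of (1, -1) and (3, -3); not an element of T
dominated = (1.8, -2.5)   # weakly smaller than the midpoint in every coordinate
outside = (3.5, -3.0)     # more output than (3, -3) from the same input

# Adding the midpoint to T does not change profit at any grid price.
same_profit = all(profit(q, T) == profit(q, T + [midpoint]) for q in prices)

print(in_outer_approximation(midpoint), in_outer_approximation(dominated),
      in_outer_approximation(outside), same_profit)
```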
The practical import is that choosing to model producer behavior in terms of
a positively homogeneous and convex π (q) whose subdifferentials depict profit-
maximizing netput supplies is mathematically equivalent to assuming that producers
solve (3) for convex T satisfying free disposability of netputs. The importance of
this observation is hard to overestimate.
Few economists would argue that either convexity of the technology or
free disposability of netputs is realistic. As just one example, intermediate
micro students are routinely introduced to lazy-S-shaped production functions
that violate both convexity and free disposability. Nevertheless, in the form
of convexity and monotonicity assumptions on t (x, y) , these properties are
routinely imposed in more formal analyses of the profit maximization problem
because they provide a mathematical foundation for using calculus tools and
Kuhn-Tucker theory. What our arguments show is that Postulates (a)–(c) ensure
that observed profit and netput supply functions can be treated “as if” they
come from a T satisfying these conditions, even if the true T does not exhibit
these “nice” regularity conditions, without losing any economically relevant
information.
When translated from netput space to more familiar input-output space, a convex
T exhibiting free disposability would resemble the area on or below the graph of
t (x, y) illustrated in Fig. 1c. Figure 1c represents a technology where complete
inaction is possible, so that if economic circumstances dictate, the individual is free
to operate at (0, 0) . We have not yet endowed T with a parallel property, and as
Fig. 7 illustrates, it is possible to specify closed, convex $T$ for which $\pi(q^*) < 0$ for
some $q^* \in \mathbb{R}^M_{++}$. If one were to confront such a $T$ and $q^*$ in the real world, common
sense would dictate that, in a free market economy, the best long-run solution is not
to produce. (Of course, one can easily imagine settings in which short-run
circumstances would dictate rationally accepting short-term negative profits.) Our
Postulate (b), properly interpreted, allows for such decisions. But it is essential
to recognize that in formal terms maximizing profit is not the same thing as
mechanically solving (3). On the other hand, if $T$ is allowed to permit inaction,
formally $0_M \in T$ where $0_M$ denotes the traditional origin in $M$-dimensional real
space, then $\pi(q) \ge 0$ for all $q \in \mathbb{R}^M_{++}$.
Fig. 7 Negative π (q)

Our discussion motivates the following definition.

Definition 1. A technology set, $T \subset \mathbb{R}^M$, is neoclassical if:

(a) $T$ is closed and nonempty.
(b) $z \in T \Rightarrow z' \in T$ for all $z' \le z$ (free disposability of netputs).
(c) $z^0, z^1 \in T \Rightarrow \lambda z^0 + (1 - \lambda) z^1 \in T$ for all $0 < \lambda < 1$ (convexity).
(d) $0_M \in T$ (inaction is possible).

The following theorem presents the basic duality results that establish the
equivalence of approaching the firm’s problem either via π (q) or via T and then
developing π (q).

Theorem 1 (Fundamental Duality). If $T$ is neoclassical, $\pi(q) \ge 0$ for all
$q \in \mathbb{R}^M_{++}$, $\pi(q)$ is positively homogeneous and convex as a function of $q$, and

$$T = \left\{ z \in \mathbb{R}^M : q^{\top} z \le \pi(q) \text{ for all } q \in \mathbb{R}^M_{++} \right\}.$$

In many practical settings, working with set-based concepts can prove challenging.
And cardinal functional representations of $T$ then prove attractive. Define

$$\gamma(z) = \inf \left\{ \lambda > 0 : \frac{z}{\lambda} \in T \right\}$$

as the gauge function for $T$. Figure 8 illustrates $\gamma(z)$ visually as the maximal radial
expansion of $z$ that is consistent with $z/\lambda$ remaining technically feasible. If $z \in T$,
it’s geometrically obvious from Fig. 8 that $\gamma(z) \le 1$. Conversely, if $\gamma(z) \le 1$,
then by the definition of $\gamma(z)$, $z \le z/\gamma(z) \in T$ under free disposability of netputs.
Hence, if $T$ is neoclassical, $\gamma(z)$ is a complete function representation of $T$ in the
sense that

$$z \in T \Leftrightarrow \gamma(z) \le 1. \quad \text{(Indication property)}$$

Fig. 8 Gauge function for neoclassical T

Knowing γ (z) is mathematically equivalent to knowing neoclassical T . Moreover,

a basic property of gauge functions [13, Theorem C.1.2.5] ensures that if T is
neoclassical, then γ (z) is nondecreasing, positively homogeneous, and convex
(sublinear) as a function of z .
There are a variety of ways to interpret γ (z) . For example, as Fig. 8 illustrates,
if a point, z∗ , falls in the interior of T , then γ (z∗ ) < 1. In a production context,
that means that z∗ does not lie on the frontier of T and, as such, is not technically
efficient. Thus, γ (z) can be viewed as a measure of technical efficiency. (In
particular, it is closely related to the distance-function concept discussed in detail
in Chambers and Färe, this volume).
Another natural interpretation of γ (z) is as a formally derived version of
the transformation function used in our Lagrangian formulation of the profit
maximization problem. Slightly abusing notation, this can be seen by defining

t (z) ≡ γ (z) − 1.

Adopting this interpretation and applying the indication property, problem (3) can
be reformulated in equivalent mathematical programming terms as

$$\pi(q) = \max_{z} \left\{ q^{\top} z : \gamma(z) \le 1 \right\}.$$

If $T$ is neoclassical, it follows from standard Kuhn-Tucker theory in the smooth
case that the first- and second-order conditions developed for the Lagrangian
formulation are necessary and sufficient conditions for a global solution to the profit
maximization problem.
Given Theorem 1, it is natural to suspect that the existence of a positively
homogeneous and convex π (q) implies the existence of a γ (z) consistent with
a neoclassical $T$. A simple perspective on that relationship is offered by observing
the definition of $\gamma(z)$ requires that $z/\gamma(z) \in T$, whence

$$\frac{q^{\top} z}{\gamma(z)} \le \pi(q) \quad \text{for all } q \in \mathbb{R}^M_{++},\ z \in \mathbb{R}^M.$$

Therefore,

$$\frac{q^{\top} z}{\pi(q)} \le \gamma(z) \quad \text{for all } q \in \mathbb{R}^M_{++}$$

so long as $\gamma(z)$ and $\pi(q) > 0$. (When $T$ exhibits constant returns to scale, $T$ is a
cone, so that both $\gamma(z)$ and $\pi(q)$ equal zero.) Thus, if $Z(q) \ne \emptyset$, it must be true
that there exists $z(q)$ such that $q^{\top} z(q) = \pi(q) \gamma(z)$ so that the upper bound of
$\frac{q^{\top} z}{\pi(q)}$ in the inequality must be achieved. Hence,

$$\begin{aligned}
\gamma(z) &= \max_{q \in \mathbb{R}^M_{++}} \frac{q^{\top} z}{\pi(q)} \\
&= \max_{q \in \mathbb{R}^M_{++}} \left\{ q^{\top} z : \pi(q) \le 1 \right\},
\end{aligned} \tag{6}$$

where the normalization after the second equality follows from the homogeneity of
degree zero in $q$ of the objective function. Thus,

Theorem 2 (Profit-Gauge Duality). If $T$ is neoclassical,

$$\pi(q) = \max_{z \in \mathbb{R}^M} \left\{ q^{\top} z : \gamma(z) \le 1 \right\}$$

and

$$\gamma(z) = \max_{q \in \mathbb{R}^M_{++}} \left\{ q^{\top} z : \pi(q) \le 1 \right\}.$$

Figure 9 illustrates the solutions to the two programming problems posed in
Theorem 2. For ease of illustration, we have assumed that $z_2$ always acts as an input
in T and that the boundary of T is smooth. Otherwise, the visual demonstration is
general. From this perspective and the fact that π (q) is the support function for T ,
it is apparent that an alternative interpretation of γ (z) is as a support function for
the dual technology set

T ∗ = q ∈ RM
++ : π (q) ≤ 1 .

And, although we do not present a formal demonstration, it also follows that π (q)
has a natural interpretation as the gauge function for T ∗ that is associated with a
neoclassical T .

Fig. 9 Support function for T ∗

Modifying earlier arguments, Theorem 2 allows us to establish the following
generalized version of Shephard’s lemma for neoclassical technologies (Hotelling’s
lemma is a special case):

z ∈ ∂π (q) ⇔ q ∈ ∂γ (z) (Shephard’s lemma) (7)

As was shown earlier, z ∈ ∂π (q) implies z ∈ Z (q), so that z ∈ Z (q) requires that
q support the graph of γ from below at that z. (In other words, market price ratios
are equated to marginal rates of substitution (transformation).) Going the other way,
Shephard’s lemma shows that if market price ratios are equated to marginal rates of
substitution (transformation), then z ∈ Z (q) . This is a generalization of the familiar
envelope theorem that can be cast more formally as saying that Z (q) and

\[ Q(z) = \arg\max \{\, q'z : \pi(q) \le 1 \,\}, \]

the set of price-dependent netput supplies, are (lower) inverses of one another.
Shephard’s lemma links multiplicity of solutions to the profit-maximizing
problem to the smoothness properties of γ (z) and π (q) . To illustrate, suppose
that ∂π (q ∗ ) is a nonsingleton set so that at q ∗ , π (q ∗ ) possesses a continuum of
supporting hyperplanes. Geometrically, this is manifested by π (q ∗ ) being kinked
as illustrated in Fig. 10. By Shephard’s lemma, that implies that q ∗ must be a
supporting hyperplane to γ (z) for all z ∈ ∂π (q ∗ ) . That can be true only if γ (z) is
linear over that continuum of z. Conversely, if γ (z) is kinked at z∗ , z∗ must provide
a supporting hyperplane to π (q) over a continuum of q. In short, flats in primal
space map into kinks in dual space, and kinks in primal space map into flats in dual
space, and vice versa. It follows that γ(z) is strictly convex in z if and only if π(q)
is strictly convex in q.

Fig. 10 Kinks and flats

Apart from the slight difference in their domains, $\mathbb{R}^M$ versus $\mathbb{R}^M_{++}$, γ(z) and
π (q) , as mathematical objects, are both positively homogeneous and convex (sub-
linear) functions. That they should be derivable from similar calculating formulae
should not be surprising. The recognition, however, that they are essentially natural
inverses of one another for neoclassical technologies is fundamental and reflects the
mathematical principles behind Theorem 1. In fact, the first formal demonstration
of a duality between a dual economic object (a cost function) and a primal
representation of the technology (a distance function) due to Shephard [22] assumed
virtually the same structure as Theorem 2. Ultimately, however, the analytic message
is the same. If there exists a netput supply system, z (q) , which satisfies the usual
neoclassical postulates of homogeneity of degree zero (absence of money illusion),
upward-sloping output supplies, and downward-sloping demand, there must exist a
neoclassical technology that is consistent with it.
We close this section with Fig. 11 that illustrates the triadic relationship between
T , π (q) , and γ (z).

Restricted Profit Functions

To this point, all M netputs have been treated as freely variable in solving the
profit maximization problem. Thus, π (q) represents a long-run profit function.
Economists, however, are frequently interested in examining short-run behavior that
is characterized by some netputs being fixed. Two important cases in analyses that
maintain a split between inputs and outputs are offered by the cost function

\[ c(w, y) = \min_{x} \{\, w'x : (x, y) \in T \,\} \]

and the revenue function

\[ R(p, x) = \max_{y} \{\, p'y : (x, y) \in T \,\}. \]

Fig. 11 A triad of production relations

Both the cost and the revenue function are manifestations of the more general
notion of a restricted (short-run) profit function. Restricted profit functions give the
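maximal profit available conditional on holding a subvector of z fixed. Partition z
as $z = (z^0, z^1)$, where $z^0 \in \mathbb{R}^N$ with N < M and $z^1 \in \mathbb{R}^{M-N}$. We refer to $z^0$ as
variable netputs and $z^1$ as (potentially) fixed netputs. Given this partition

\[
\begin{aligned}
\pi(q) &= \max_{z^0, z^1} \{\, q^{0\prime} z^0 + q^{1\prime} z^1 : (z^0, z^1) \in T \,\} \\
&= \max_{z^1} \Big\{ \max_{z^0} \{\, q^{0\prime} z^0 : (z^0, z^1) \in T \,\} + q^{1\prime} z^1 \Big\},
\end{aligned}
\]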

where the second equality follows by Bellman’s Principle, which in the current
setting simply means that long-run profit maximization always implies variable
profit maximization. Defining the variable profit function as

\[ \pi^0(q^0, z^1) \equiv \max_{z^0} \{\, q^{0\prime} z^0 : (z^0, z^1) \in T \,\}, \]

it is easy to identify c(w, y) with the case where $z^0 = -x$ and $z^1 = y$ and R(p, x)
with $z^0 = y$ and $z^1 = x$.
It is immediate from preceding developments that $\pi^0(q^0, z^1)$ is positively
homogeneous and convex in $q^0$ and that variable-profit-maximizing netputs $z^0(q^0, z^1)$
satisfy $z^0(q^0, z^1) \in \partial\pi^0(q^0, z^1)$ (where the subdifferential is understood to be
in terms of $q^0$). Moreover, a suitably modified version of Theorem 1 implies the
existence of a closed convex $T^0(z^1) \subset \mathbb{R}^N$

\[ T^0(z^1) = \{\, z^0 \in \mathbb{R}^N : q^{0\prime} z^0 \le \pi^0(q^0, z^1) \ \text{for all } q^0 \in \mathbb{R}^N_{++} \,\}. \]

Making these arguments only requires modifying our previous notation, and so we
do not pursue it here. Straightforward corollaries, therefore, are as follows: c (w, y)
is positively homogeneous and concave in w, the subdifferentials of −c (w, y) in
w represent minus cost-minimizing demands, and those demands are downward
sloping in their own input prices; R (p, x) is positively homogeneous and convex
in p, the subdifferentials of R (p, x) represent revenue-maximizing supplies, and
those supplies are upward sloping in their own prices.
From the functional equation

\[ \pi(q) = \max_{z^1} \{\, \pi^0(q^0, z^1) + q^{1\prime} z^1 \,\}, \]

we can infer for all $\hat q$, $\tilde q$ that

\[ \pi(\hat q) \ge \hat q^{0\prime} z^0(\hat q^0, z^1(\tilde q)) + \hat q^{1\prime} z^1(\tilde q) \]

whence

\[ \pi(\hat q) - \pi(\tilde q) \ge \pi^0(\hat q^0, z^1(\tilde q)) - \pi^0(\tilde q^0, z^1(\tilde q)) + (\hat q^1 - \tilde q^1)' z^1(\tilde q). \tag{8} \]

Figure 12 illustrates. Expression (8) confirms that $z^1(q)$ belongs to the subdifferential
of π(q) in the subvector $q^1$ and that the short-run profit function and π(q) are
tangent to one another when $z^1$ is evaluated at $z^1(q)$.
Setting $\hat q^1$ equal to $\tilde q^1$, so that only variable netput prices change, gives

\[ \pi(\hat q) - \pi(\tilde q) \ge \pi^0(\hat q^0, z^1(\tilde q)) - \pi^0(\tilde q^0, z^1(\tilde q)). \]

This expression manifests the Le Chatelier principle that characterizes optima. As
commonly interpreted in economics, that principle requires unconstrained (long-
run) optima to respond more to parametric changes than constrained (short-run)
optima. A direct corollary from it and Hotelling’s lemma is that long-run optimal
netput supplies are more responsive to own price changes than their short-run
counterparts.
Intuitively, the Le Chatelier principle simply reflects the fact that
$\pi^0(\hat q^0, z^1(\tilde q)) - \pi^0(\tilde q^0, z^1(\tilde q))$ is the producer’s best response to the price change holding the
fixed netput constant at $z^1(\tilde q)$. Because this alternative is always available to the
producer, her optimal long-run response must at least weakly dominate that strategy.

Fig. 12 Le Chatelier principle
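
The inequality behind the Le Chatelier principle can be checked numerically. The sketch below is only an illustration under assumptions that are not in the text: a single-output technology y = x1^a x2^b with hypothetical exponents a = 0.3 and b = 0.4, input x2 playing the role of the fixed netput, and only the output price changing. The function names are invented for this example.

```python
def long_run_plan(p, w1, w2, a=0.3, b=0.4):
    """Profit, x1, x2 for y = x1**a * x2**b with both inputs freely variable."""
    y = ((a * p / w1) ** a * (b * p / w2) ** b) ** (1.0 / (1.0 - a - b))
    x1, x2 = a * p * y / w1, b * p * y / w2
    return p * y - w1 * x1 - w2 * x2, x1, x2


def variable_profit(p, w1, x2, a=0.3, b=0.4):
    """Short-run (variable) profit: x2 is held fixed and only x1 adjusts."""
    x1 = (a * p * x2 ** b / w1) ** (1.0 / (1.0 - a))
    return p * x1 ** a * x2 ** b - w1 * x1


p0, w1, w2 = 1.0, 1.0, 1.0                      # baseline prices (q-tilde)
profit0, _, x2_fixed = long_run_plan(p0, w1, w2)

gaps = []
for p_hat in (0.5, 0.8, 1.0, 1.25, 2.0):        # new output prices (q-hat)
    long_change = long_run_plan(p_hat, w1, w2)[0] - profit0
    short_change = (variable_profit(p_hat, w1, x2_fixed)
                    - variable_profit(p0, w1, x2_fixed))
    gaps.append(long_change - short_change)     # should never be negative
    print(f"p_hat={p_hat:4.2f}  long-run change={long_change:+.5f}  "
          f"short-run change={short_change:+.5f}")
```

In this example the long-run change in profit is at least as large as the short-run change for every price tried, with equality only when the price does not move.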

The Search for a Practical Production Function

The Cobb-Douglas Production Function

The Cobb-Douglas production function remains a classic example of empirical
evidence inspiring a theoretical formulation of a production function that has served
as the gold standard in neoclassical production economic theory for decades and has
retained much of its popular appeal despite the advent of more flexible functional
forms even as it nears its centenary.
It started from the remarkable observation by Douglas that when plotted on
semi-log paper with output (Y), labor (L), and capital (K) measured on the
logarithmic scale along the vertical axis against time (t) measured along the
horizontal axis, over the years 1899 through 1922, the time series plot of the
three series exhibited the tendency that the distance between log(Y) and log(L) was
approximately one quarter of the distance between log(L) and log(K). At Douglas’s
request, Cobb (a mathematician at Amherst College) came up with the specification
$Y = A L^{3/4} K^{1/4}$; A = 1.01. All of its theoretical properties including constant factor
shares, constant returns to scale, diminishing marginal productivities, and unitary
elasticity of substitution between the inputs were validated rather than imposed as
prior restrictions on the technology.
Samuelson [19] offers the following direct derivation of the Cobb-Douglas form
as a simple back-of-the-envelope calculation:

\[
\frac{\ln Y_t - \ln L_t}{\ln K_t - \ln L_t} = \frac{1}{4}; \quad 1899 \le t \le 1922
\;\Rightarrow\; \ln Y_t = \frac{3}{4} \ln L_t + \frac{1}{4} \ln K_t \tag{9}
\]

Using the initial conditions $\ln Y_t = \ln L_t = \ln K_t = 0$ (t = 1899), one gets
$Y_t = A L_t^{3/4} K_t^{1/4}$; A = 1.0. Deviation of the value of the constant (A) from 1 in
the formula fitted by Cobb and Douglas can be attributed to measurement errors.
As noted in Samuelson [19], one could easily have started from the profit-
maximizing behavior of a competitive firm where at the optimal input-output choice,
factor prices are equated to their respective values of the marginal product. Denoting
the prices of the output, labor, and capital by p, w, and r, respectively, the shares of
the two inputs are

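\[ s_L = \frac{wL}{pY} = \frac{\partial Y}{\partial L}\,\frac{L}{Y} = \frac{\partial \ln Y}{\partial \ln L} \quad \text{for labor,} \]

and

\[ s_K = \frac{rK}{pY} = \frac{\partial Y}{\partial K}\,\frac{K}{Y} = \frac{\partial \ln Y}{\partial \ln K} \quad \text{for capital.} \]

Further, zero normal profit in a competitive market implies $s_L + s_K = 1$. Now,
if the factor shares remain constant, we can set $s_L = \alpha$ and $s_K = \beta$; α + β = 1.
Hence, a solution of the partial differential equations (i) $\partial \ln Y / \partial \ln L = \alpha$ and (ii)
$\partial \ln Y / \partial \ln K = \beta$ yields

\[
\ln Y = \gamma + \alpha \ln L + \beta \ln K
\;\Rightarrow\; Y = A L^{\alpha} K^{\beta}; \quad A = e^{\gamma},\ \alpha + \beta = 1. \tag{10}
\]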

Elasticity of Substitution
One of the most important characteristics of the production technology is the degree
of substitutability between inputs allowing the producer to change input proportions
in response to changes in relative prices of inputs.6 In a two-input case, it relates to
the curvature of the isoquant.
At one extreme is the Leontief production function

Y = min {aL; bK} (11)

The cost-minimizing input bundle is $(L^*, K^*) = (Y/a,\ Y/b)$, which depends only on
the output level and does not change with changes in the relative price of inputs.
The optimal capital-labor ratio is $K^*/L^* = a/b$, and the isoquant is L-shaped with zero
substitutability between inputs.

The other extreme is the linear production function

Y = aL + bK (12)

6 See “Elasticity of Substitution” in the chapter by Russell, this volume.

where the marginal rate of substitution along the isoquant is $-\frac{dK}{dL} = \frac{a}{b}$, which is
constant at every point on the isoquant.
The elasticity of substitution between the two inputs is

\[
\sigma = \frac{d \ln(K/L)}{d \ln(MRTS)} = \frac{d \ln(K/L)}{d \ln(F_L/F_K)} = \frac{F_K F_L}{F\,F_{KL}} \tag{13}
\]

where

\[
Y = F(K, L), \quad F_L = \frac{\partial F}{\partial L}, \quad F_K = \frac{\partial F}{\partial K}, \quad F_{KL} = \frac{\partial^2 F}{\partial K \partial L} \tag{14}
\]

The last equality in (13) holds when F is homogeneous of degree 1.
For the Cobb-Douglas production function, σ equals 1 along the entire isoquant. An
implication of this unitary elasticity of substitution is that for every 1% increase in
w relative to r, the capital-labor ratio will always increase by 1%!
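
The definition σ = d ln(K/L) / d ln(MRTS) can be evaluated numerically for any smooth production function. The sketch below is a rough illustration, not part of the original exposition: it uses central finite differences, fixes L = 1 while K/L varies (which is harmless only because both example technologies are homothetic), and takes the CES parameters 0.4, 0.6, and ρ = 0.5 as hypothetical values. All function names are invented for the example.

```python
import math


def partial(f, K, L, wrt, h=1e-6):
    """Central-difference partial derivative of f(K, L) with respect to K or L."""
    if wrt == "K":
        return (f(K * (1 + h), L) - f(K * (1 - h), L)) / (2 * K * h)
    return (f(K, L * (1 + h)) - f(K, L * (1 - h))) / (2 * L * h)


def mrts(f, K, L):
    """Marginal rate of technical substitution F_L / F_K."""
    return partial(f, K, L, "L") / partial(f, K, L, "K")


def sigma(f, k, step=1e-3):
    """d ln(K/L) / d ln(MRTS), evaluated at K/L = k with L fixed at 1."""
    hi, lo = k * math.exp(step), k * math.exp(-step)
    d_ln_mrts = math.log(mrts(f, hi, 1.0)) - math.log(mrts(f, lo, 1.0))
    return 2 * step / d_ln_mrts


def cobb_douglas(K, L):
    return 1.01 * L ** 0.75 * K ** 0.25


rho = 0.5  # hypothetical substitution parameter


def ces(K, L):
    return (0.4 * K ** -rho + 0.6 * L ** -rho) ** (-1 / rho)


sigma_cd = sigma(cobb_douglas, 2.0)
sigma_ces = sigma(ces, 2.0)
print(f"Cobb-Douglas sigma ~ {sigma_cd:.4f}")
print(f"CES sigma ~ {sigma_ces:.4f}  (1/(1+rho) = {1 / (1 + rho):.4f})")
```

For the Cobb-Douglas case the computed value is close to 1 at any capital-labor ratio, while for the CES example it is close to 1/(1 + ρ).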

The Constant Elasticity of Substitution (CES) Production Function

Arrow, Chenery, Minhas, and Solow (ACMS) [2] questioned the validity of the
Cobb-Douglas production function across all industries within the manufacturing
sector. The basis of their disagreement was the empirical evidence that value added
per worker (y = Y/L) in any specific industry varied with the real wage rate (w)
widely across countries. They used country-level cross-sectional data on different
two-digit manufacturing industries to estimate the regression

ln yi = a + b ln wi + ui (15)

The estimated value of the parameter b representing the elasticity of value added
per worker with respect to the wage rate differed widely across industries and in
most cases was statistically significantly different from both 0 and 1. This prompted
them to question the universal validity of both the Leontief (fixed input proportions)
production function and the Cobb-Douglas production function exhibiting unitary
elasticity of substitution everywhere on the isoquant.
Equivalence between the parameter b in (15) and the elasticity of substitution
σKL in (13) is not apparent. ACMS [2] first show that the elasticity of the output per
worker does, indeed, measure the elasticity of substitution between the two inputs
and use their results to develop the new CES production function.
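Under the assumption of constant returns to scale, the production function Y =
F(K, L) is homogeneous of degree 1 so that y = Y/L = F(k, 1) ≡ f(k) where
k = K/L is the capital-labor ratio. Under competitive profit maximization

\[ w = F_L = f(k) - k f'(k) = y - k \frac{dy}{dk}. \tag{16} \]

This may be inverted to obtain

\[ y = \varphi(w) = \varphi\!\left(y - k \frac{dy}{dk}\right) \tag{17} \]

From (16)

\[
\begin{aligned}
\frac{d\,(y - k f'(k))}{dw} &= \frac{dy}{dw} - \frac{d\,(k f'(k))}{dw} \\
&= \frac{df(k)}{dk}\frac{dk}{dw} - f'(k)\frac{dk}{dw} - k f''(k)\frac{dk}{dw} \\
&= -k f''(k)\,\frac{dk}{dy}\,\frac{dy}{dw}
\end{aligned} \tag{18}
\]

But

\[ \frac{dk}{dy} = \frac{1}{dy/dk} = \frac{1}{f'(k)} \]

Hence, because the left-hand side of (18) is dw/dw = 1,

\[
-k f''(k)\,\frac{dk}{dy}\,\frac{dy}{dw} = -\frac{k f''(k)}{f'(k)}\,\frac{dy}{dw} = 1
\;\Rightarrow\; \frac{dy}{dw} = -\frac{f'(k)}{k f''(k)} \tag{19}
\]

Therefore,

\[
\frac{dy}{dw}\,\frac{w}{y} = \frac{d \ln y}{d \ln w} = -\frac{f'(k)}{k f''(k)} \cdot \frac{f(k) - k f'(k)}{f(k)} \tag{20}
\]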

Finally, as shown in Appendix 1, the expression on the right-hand side of (20) is the
elasticity of substitution σKL . Combining (15) and (20), one gets

\[ \frac{d \ln y}{d \ln w} = b = \sigma. \tag{21} \]

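Substituting (16) and (21) into (15), one gets

\[ \ln y = \ln a + \sigma \ln\!\left(y - k \frac{dy}{dk}\right) \tag{22} \]

Thus,

\[
\begin{aligned}
& y = a \left(y - k \frac{dy}{dk}\right)^{b} \\
&\Rightarrow\; y^{1/b} = a^{1/b} \left(y - k \frac{dy}{dk}\right) \\
&\Rightarrow\; \frac{dy}{dk} = \frac{a^{1/b} y - y^{1/b}}{k\, a^{1/b}} = \frac{y - a^{-1/b} y^{1/b}}{k} = \frac{y \left(1 - a^{-1/b} y^{\frac{1}{b} - 1}\right)}{k}
\end{aligned} \tag{23}
\]

Defining $a^{-1/b} = \alpha$ and $\frac{1}{b} - 1 = \rho$,

\[
\begin{aligned}
& \frac{dy}{dk} = \frac{y\,(1 - \alpha y^{\rho})}{k}
\;\Rightarrow\; \frac{dk}{k} = \frac{dy}{y\,(1 - \alpha y^{\rho})} = \frac{dy}{y} + \frac{\alpha y^{\rho - 1}\, dy}{1 - \alpha y^{\rho}} \\
&\Rightarrow\; d \ln k = d \ln y - \frac{1}{\rho}\, d \ln(1 - \alpha y^{\rho})
\end{aligned} \tag{24}
\]

Upon integration, (24) leads to

\[ \ln k = \ln y + \ln (1 - \alpha y^{\rho})^{-1/\rho} + \ln B \tag{25} \]

Setting the constant of integration $\ln B = \frac{1}{\rho} \ln \beta$ yields

\[ k^{\rho} = \frac{\beta y^{\rho}}{1 - \alpha y^{\rho}}. \tag{26} \]

Thus,

\[
\begin{aligned}
& k^{\rho} = \frac{\beta y^{\rho}}{1 - \alpha y^{\rho}} \\
&\Rightarrow\; y^{\rho} (\beta + \alpha k^{\rho}) = k^{\rho} \\
&\Rightarrow\; y = k\,(\beta + \alpha k^{\rho})^{-1/\rho} = (\beta k^{-\rho} + \alpha)^{-1/\rho}
\end{aligned} \tag{27}
\]

In terms of the aggregate variables, and relabeling the two constants so that α
multiplies the capital term, (27) can be expressed as

\[
\begin{aligned}
& \frac{Y}{L} = \left[\alpha \left(\frac{K}{L}\right)^{-\rho} + \beta\right]^{-1/\rho} \\
&\Rightarrow\; Y = \left[\alpha K^{-\rho} + \beta L^{-\rho}\right]^{-1/\rho} \\
&\Rightarrow\; Y^{-\rho} = \alpha K^{-\rho} + \beta L^{-\rho}
\end{aligned} \tag{28}
\]

Finally, defining $\gamma = (\alpha + \beta)^{-1/\rho}$ and $\delta = \frac{\alpha}{\alpha + \beta}$, (28) can be expressed as

\[ Y = \gamma \left[\delta K^{-\rho} + (1 - \delta) L^{-\rho}\right]^{-1/\rho} \tag{29} \]

ACMS describe ρ as the substitution parameter, δ as the distribution parameter, and
γ as the efficiency parameter.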
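
The ACMS result that the slope of ln y on ln w equals the elasticity of substitution can be illustrated with synthetic data. The sketch below is an illustration only: it assumes the CES form (29) with γ normalized to 1 and hypothetical values ρ = 0.5 and δ = 0.4, computes the competitive wage from (16) by finite differences, and then runs the regression (15) without an error term. Function and variable names are invented for the example.

```python
import math

rho, delta = 0.5, 0.4   # hypothetical parameters; gamma is normalised to 1


def f(k):
    """Intensive-form CES output per worker, y = f(k) with k = K/L."""
    return (delta * k ** -rho + (1 - delta)) ** (-1 / rho)


def wage(k, h=1e-6):
    """Competitive wage w = f(k) - k f'(k), with f' by central difference."""
    slope = (f(k * (1 + h)) - f(k * (1 - h))) / (2 * k * h)
    return f(k) - k * slope


# Synthetic "cross-section": the same technology observed at different K/L.
ks = [0.5 + 0.25 * i for i in range(20)]
ln_w = [math.log(wage(k)) for k in ks]
ln_y = [math.log(f(k)) for k in ks]

# OLS slope of ln y on ln w.
mean_w = sum(ln_w) / len(ln_w)
mean_y = sum(ln_y) / len(ln_y)
b_hat = (sum((x - mean_w) * (y - mean_y) for x, y in zip(ln_w, ln_y))
         / sum((x - mean_w) ** 2 for x in ln_w))

print(f"estimated b = {b_hat:.6f}, 1/(1+rho) = {1 / (1 + rho):.6f}")
```

Because the data are generated without noise, the fitted slope reproduces σ = 1/(1 + ρ) up to finite-difference error; with real data the estimate would of course carry sampling error.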

Homothetic and Non-homothetic CES Production Functions

A function y = f(x); $x \in \mathbb{R}^n_+$ is said to be homothetic if it can be expressed as

\[ y = g(h(x)) \tag{30} \]

where g is strictly increasing and h(x) is homogeneous of degree r. An important
property of a homothetic production function is that the marginal rate of substitution
between a pair of inputs depends only on the input proportion and is independent
of the level of output. This implies that for homothetic technologies, isoquants are
radially parallel and output expansion paths are straight lines.7

Homotheticity of the CES Function

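From (28),

\[ -\rho Y^{-(\rho+1)} \frac{\partial Y}{\partial K} = -\alpha \rho K^{-(\rho+1)} \]

and

\[ -\rho Y^{-(\rho+1)} \frac{\partial Y}{\partial L} = -\beta \rho L^{-(\rho+1)} \]

Hence,

\[ \frac{\partial Y / \partial L}{\partial Y / \partial K} = \frac{\beta}{\alpha} \left(\frac{K}{L}\right)^{\rho+1} \tag{31} \]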

Thus, the marginal rate of substitution remains unchanged so long as the capital-
labor ratio remains constant even as the output level changes.

Non-homothetic CES Function

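Sato [20] considered two implicit production functions:

\[ F(K, L, Y) = K^{-\rho} + C(Y) L^{-\rho} - H(Y) = 0 \quad \text{for } \sigma \ne 1 \tag{32} \]

and

\[ F(K, L, Y) = \ln K + C(Y) \ln L - H(Y) = 0 \quad \text{for } \sigma = 1 \tag{33} \]

Both C(Y) and H(Y) are monotone functions of Y. The function (32) is the non-
homothetic CES (NH-CES), and (33) is the non-homothetic Cobb Douglas (NH-
CD) production function. For the NH-CES function in (32),

\[
\begin{aligned}
& \frac{\partial Y}{\partial K} = \frac{\rho K^{-(1+\rho)}}{C'(Y) L^{-\rho} - H'(Y)}; \quad
\frac{\partial Y}{\partial L} = \frac{\rho L^{-(1+\rho)} C(Y)}{C'(Y) L^{-\rho} - H'(Y)}; \\
&\Rightarrow\; \frac{\partial Y / \partial L}{\partial Y / \partial K} = C(Y) \left(\frac{K}{L}\right)^{1+\rho}
\end{aligned} \tag{34}
\]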

7 See Appendix 2 for proof.

Thus, for this production function, the marginal rate of substitution depends on both
the level of output (Y) and the capital-labor ratio K/L. Hence, the expansion path
of the firm is not a straight line. But the elasticity of substitution is still a constant.

Non-homothetic Cobb-Douglas Production Function

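For (33)

\[
\begin{aligned}
& \frac{\partial Y}{\partial K} = \frac{1}{K \left(H'(Y) - C'(Y) \ln L\right)}; \quad
\frac{\partial Y}{\partial L} = \frac{C(Y)}{L \left(H'(Y) - C'(Y) \ln L\right)}; \\
&\Rightarrow\; \frac{\partial Y / \partial L}{\partial Y / \partial K} = C(Y)\, \frac{K}{L}
\end{aligned} \tag{35}
\]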

For the NH-CD even though the elasticity of substitution remains 1 (as in the case
of the standard Cobb Douglas function), the marginal rate of substitution between
capital and labor declines as one moves to a higher isoquant even as the capital-labor
ratio is unchanged.
An example of the NH-CD (implicit) production function is
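\[ \ln K + \sqrt{Y}\, \ln L - Y^2 = 0 \tag{36} \]

It can be seen that for (36)

\[
\begin{aligned}
& \frac{\partial Y}{\partial K} = \frac{1}{K \left(2Y - \dfrac{\ln L}{2\sqrt{Y}}\right)}; \quad
\frac{\partial Y}{\partial L} = \frac{\sqrt{Y}}{L \left(2Y - \dfrac{\ln L}{2\sqrt{Y}}\right)}; \\
&\Rightarrow\; \frac{\partial Y / \partial L}{\partial Y / \partial K} = \sqrt{Y}\, \frac{K}{L}
\end{aligned} \tag{37}
\]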
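
The non-homotheticity of the example in (36) can be seen numerically. The following sketch is a hedged illustration: it solves the implicit equation for Y by bisection (assuming K > 1 and L > 1 so that a unique positive root exists), approximates the two marginal products by central differences, and compares their ratio (∂Y/∂L)/(∂Y/∂K) with the closed form √Y · K/L. The function names are invented for the example.

```python
import math


def output(K, L):
    """Solve ln K + sqrt(Y) ln L - Y**2 = 0 for Y by bisection (K, L > 1)."""
    def g(Y):
        return math.log(K) + math.sqrt(Y) * math.log(L) - Y ** 2

    lo, hi = 0.0, 1.0
    while g(hi) > 0:            # expand until the root is bracketed
        hi *= 2
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def marginal_product_ratio(K, L, h=1e-5):
    """(dY/dL) / (dY/dK) by central differences on the implicit solution."""
    dY_dL = (output(K, L + h) - output(K, L - h)) / (2 * h)
    dY_dK = (output(K + h, L) - output(K - h, L)) / (2 * h)
    return dY_dL / dY_dK


# Two input bundles with the same capital-labor ratio K/L = 2.
low = marginal_product_ratio(4.0, 2.0)
high = marginal_product_ratio(8.0, 4.0)
print(f"Y(4,2) = {output(4.0, 2.0):.4f}, ratio = {low:.4f}")
print(f"Y(8,4) = {output(8.0, 4.0):.4f}, ratio = {high:.4f}")
```

The two bundles share the same capital-labor ratio but lie on different isoquants, and the computed ratios differ, which is exactly what rules out a straight-line expansion path.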

Additive Implicit Multiple Input Production Functions

Consider an additively separable implicit production function

\[ F(Y, x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} F^i(Y, x_i) = 1 \tag{38} \]

Minimizing cost, $\sum_{i=1}^{n} w_i x_i$, for this production function yields the first-order
conditions

\[ w_i = \lambda \frac{\partial F^i}{\partial x_i} \quad (i = 1, 2, \ldots, n), \tag{39} \]

\[ 1 - \sum_{i=1}^{n} F^i(Y, x_i) = 0, \tag{40} \]

where λ is a nonnegative Lagrangian multiplier. From (40)

\[ \sum_{i=1}^{n} F^i_i \frac{\partial x_i}{\partial w_j} = 0 \quad (j = 1, 2, \ldots, n) \tag{41} \]

Here, $F^i_i = \frac{\partial F^i}{\partial x_i}$. Similarly, from (39),

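\[
\begin{aligned}
& \lambda F^i_{ii} \frac{\partial x_i}{\partial w_j} + F^i_i \frac{\partial \lambda}{\partial w_j} = 0
\;\Rightarrow\; \frac{\partial x_i}{\partial w_j} = -\frac{1}{\lambda} \frac{F^i_i}{F^i_{ii}} \frac{\partial \lambda}{\partial w_j}; \quad (i \ne j) \\
& \lambda F^j_{jj} \frac{\partial x_j}{\partial w_j} + F^j_j \frac{\partial \lambda}{\partial w_j} = 1
\;\Rightarrow\; \frac{\partial x_j}{\partial w_j} = \frac{1}{\lambda F^j_{jj}} - \frac{1}{\lambda} \frac{F^j_j}{F^j_{jj}} \frac{\partial \lambda}{\partial w_j}
\end{aligned} \tag{42}
\]

Here $F^i_{ii} = \frac{\partial F^i_i}{\partial x_i} = \frac{\partial^2 F^i}{\partial x_i^2}$.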
Substitution of (42) into (41) yields

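\[
\begin{aligned}
& -\frac{1}{\lambda} \frac{\partial \lambda}{\partial w_j} \sum_{k=1}^{n} \frac{(F^k_k)^2}{F^k_{kk}} + \frac{1}{\lambda} \frac{F^j_j}{F^j_{jj}} = 0 \\
&\Rightarrow\; \frac{\partial \lambda}{\partial w_j} = \frac{F^j_j}{F^j_{jj}} \left( \frac{1}{\sum_{k=1}^{n} (F^k_k)^2 / F^k_{kk}} \right)
\end{aligned} \tag{43}
\]

Hence,

\[
\frac{\partial x_i}{\partial w_j} = -\frac{1}{\lambda} \frac{F^i_i}{F^i_{ii}} \frac{\partial \lambda}{\partial w_j}
= -\frac{1}{\lambda} \left( \frac{F^i_i F^j_j}{F^i_{ii} F^j_{jj}} \right) \left( \frac{1}{\sum_{k=1}^{n} (F^k_k)^2 / F^k_{kk}} \right) \tag{44}
\]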

Define

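\[ a_k \equiv -\frac{F^k_k}{x_k F^k_{kk}} \quad (k = 1, 2, \ldots, n) \tag{45} \]

Using (39), the share of input k in the total cost can be written as

\[ s_k = \frac{w_k x_k}{\sum_{i=1}^{n} w_i x_i} = \frac{F^k_k x_k}{\sum_{i=1}^{n} F^i_i x_i} \tag{46} \]

Thus,

\[
a_k s_k = -\frac{(F^k_k)^2}{F^k_{kk} \sum_{i=1}^{n} x_i F^i_i}
\;\Rightarrow\; \sum_{k=1}^{n} a_k s_k = -\frac{\sum_{k=1}^{n} (F^k_k)^2 / F^k_{kk}}{\sum_{i=1}^{n} x_i F^i_i} \tag{47}
\]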
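Hence,8

\[
\begin{aligned}
\frac{a_i a_j}{\sum_{k=1}^{n} a_k s_k}
&= \frac{\dfrac{F^i_i}{x_i F^i_{ii}}\, \dfrac{F^j_j}{x_j F^j_{jj}}}{-\dfrac{\sum_{k=1}^{n} (F^k_k)^2 / F^k_{kk}}{\sum_{m=1}^{n} x_m F^m_m}} \\
&= -\frac{\sum_{m=1}^{n} x_m F^m_m}{x_i x_j}\, \frac{F^i_i F^j_j}{F^i_{ii} F^j_{jj}} \left( \frac{1}{\sum_{k=1}^{n} (F^k_k)^2 / F^k_{kk}} \right) \\
&= -\frac{\sum_{m=1}^{n} w_m x_m}{x_i x_j}\, \frac{1}{\lambda}\, \frac{F^i_i F^j_j}{F^i_{ii} F^j_{jj}} \left( \frac{1}{\sum_{k=1}^{n} (F^k_k)^2 / F^k_{kk}} \right) \\
&= \frac{\sum_{m=1}^{n} w_m x_m}{x_i x_j}\, \frac{\partial x_i}{\partial w_j}
= \frac{C\, \dfrac{\partial^2 C}{\partial w_i \partial w_j}}{\dfrac{\partial C}{\partial w_i}\, \dfrac{\partial C}{\partial w_j}} = \sigma_{ij}.
\end{aligned}
\]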

Constant Ratio of Elasticities of Substitution (CRES) Production Functions

Consider the implicit production function

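\[
F(Y, x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} F^i(Y, x_i) = 1; \quad
F^i(Y, x_i) = D_i Y^{-e_i d_i} x_i^{d_i}; \quad (i = 1, 2, \ldots, n) \tag{48}
\]

The Lagrangian for the constrained cost minimization problem is

\[ \mathcal{L} = w'x + \lambda \left(1 - \sum_{i=1}^{n} D_i Y^{-e_i d_i} x_i^{d_i}\right), \]

and the first-order conditions are

\[ w_i = \lambda D_i d_i x_i^{(d_i - 1)} Y^{-e_i d_i} \quad (i = 1, 2, \ldots, n) \tag{49} \]

Hence

\[ \frac{x_j^{1 - d_j}}{x_i^{1 - d_i}} = \frac{D_j d_j}{D_i d_i}\, Y^{(e_i d_i - e_j d_j)}\, \frac{w_i}{w_j} \tag{50} \]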

8 See Appendix 3 Elasticity of Substitution Derived from the Dual Cost Function.
Thus

\[
x_j = \left(\frac{D_j d_j}{D_i d_i}\right)^{\frac{1}{1 - d_j}} Y^{\frac{e_i d_i - e_j d_j}{1 - d_j}} \left(\frac{w_i}{w_j}\right)^{\frac{1}{1 - d_j}} (x_i)^{\frac{1 - d_i}{1 - d_j}} \tag{51}
\]

Define

\[ A = \frac{1}{1 - d_j} \ln \frac{D_j d_j}{D_i d_i}; \quad h = \frac{e_i d_i - e_j d_j}{1 - d_j}; \]

Further, from (48)

\[ a_i = -\frac{F^i_i}{x_i F^i_{ii}} = \frac{1}{1 - d_i}; \quad a_j = -\frac{F^j_j}{x_j F^j_{jj}} = \frac{1}{1 - d_j}. \]

Then (51) reduces to

\[ \ln x_j = A + h \ln Y + a_j \ln \frac{w_i}{w_j} + \frac{a_j}{a_i} \ln x_i \tag{52} \]

The equation in (52) can be estimated econometrically although the presence of the
endogenous variable $\ln x_i$ on the right-hand side rules out using OLS.
It is clear from (48) that

\[ \frac{\sigma_{ij}}{\sigma_{ik}} = \frac{a_i a_j}{a_i a_k} = \frac{a_j}{a_k} = \frac{1 - d_k}{1 - d_j}. \tag{53} \]

Because the ratio of the elasticities of the substitution between inputs i and j and
between i and k depends on the parameters dj and dk , the ratio is a constant.
The implicit production function in (48) can be described as the constant ratio of
elasticity of substitution (CRES) production function [12].9
For an intuitive interpretation of (53) consider a production technology with four
inputs: unskilled labor (x1), skilled labor (x2), energy (x3), and capital (x4). Suppose
that σ13 = 2σ14. That is, the elasticity of substitution between unskilled labor and
energy is twice as large as the elasticity of substitution between unskilled labor and
capital. Then, by virtue of (53) it must be true that σ23 = 2σ24. That is, exactly the
same proportionality must hold for elasticities of substitution between skilled labor
and energy and between skilled labor and capital. For this to be true for any triplet
of inputs (i, j, k), the function must be additively separable in every input.
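
The constant-ratio property can be made concrete with a small calculation. The sketch below is illustrative only: it uses the formula σij = ai aj / Σk ak sk with ak = 1/(1 − dk), and the four values of dk and the cost shares are hypothetical numbers chosen so that the energy-to-capital ratio equals 2. Inputs 1 to 4 in the text correspond to indices 0 to 3 in the code.

```python
def cres_sigma(d, s):
    """Off-diagonal sigma_ij = a_i a_j / sum_k a_k s_k, with a_k = 1/(1 - d_k)."""
    a = [1.0 / (1.0 - dk) for dk in d]
    denom = sum(ak * sk for ak, sk in zip(a, s))
    n = len(d)
    return [[a[i] * a[j] / denom if i != j else None for j in range(n)]
            for i in range(n)]


# Hypothetical parameters: unskilled labor, skilled labor, energy, capital.
d = [0.2, -0.5, 0.75, 0.5]
s = [0.3, 0.3, 0.1, 0.3]          # hypothetical cost shares summing to one

sig = cres_sigma(d, s)
ratio_unskilled = sig[0][2] / sig[0][3]   # sigma_13 / sigma_14
ratio_skilled = sig[1][2] / sig[1][3]     # sigma_23 / sigma_24
print(ratio_unskilled, ratio_skilled)
```

Changing the cost shares alters every σij but leaves both ratios unchanged, since the shares enter only through the common denominator.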
We may now consider some special cases with appropriate restrictions on the
parameters ei and di .

9 See also Mukerji [16], Gorman [9], and Hanoch [11].

Case 1: $d_i = d$, $e_i = e$; (i = 1, 2, . . . , n)
In this case, (48) becomes

\[
\begin{aligned}
& \sum_{i=1}^{n} D_i Y^{-ed} x_i^{d} = 1 \\
&\Rightarrow\; Y = \left(\sum_{i=1}^{n} D_i x_i^{d}\right)^{\frac{1}{ed}} \\
&\Rightarrow\; Y^{d} = \left(\sum_{i=1}^{n} D_i x_i^{d}\right)^{\frac{1}{e}}
\end{aligned} \tag{54}
\]

Further, if e = 1, we get the ACMS/CES production function with constant returns to
scale (with ρ = −d in the notation of (29)). In this formulation, d is the substitution
parameter and e is the returns to scale parameter while the $D_i$s are the distribution
parameters.

Case 2: di = d but ei not constant

In this case, (48) is

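\[ \sum_{i=1}^{n} D_i Y^{-e_i d} x_i^{d} = 1. \tag{55} \]

Cost minimization subject to (55) as a constraint leads to

\[
\frac{w_i}{C} = D_i Y^{-e_i d} x_i^{(d-1)}
\;\Rightarrow\; x_i = \left(D_i Y^{-e_i d}\, \frac{C}{w_i}\right)^{\frac{1}{1-d}} \tag{56}
\]

Thus,

\[
C = \left[\sum_{i=1}^{n} D_i^{\frac{1}{1-d}}\, Y^{-\frac{e_i d}{1-d}}\, w_i^{-\frac{d}{1-d}}\right]^{-\frac{1-d}{d}} \tag{57}
\]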

This cost function is not multiplicatively separable in output and input prices and,
hence, the underlying technology is not homothetic. At the same time, because di =
d (i = 1, 2, . . . , n) it is a CES function. Hence, (55) above is a non-homothetic CES
production function.

Case 3: Consider the special case

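\[ F^i(Y, x_i) = D_i \ln\!\left(x_i Y^{-e_i}\right); \quad \sum_{i=1}^{n} F^i(Y, x_i) = 1 \tag{58} \]

Thus

\[
\begin{aligned}
& \sum_{i=1}^{n} D_i \ln x_i - \ln Y \sum_{i=1}^{n} e_i D_i = 1 \\
&\Rightarrow\; \ln Y = -\frac{1}{\sum_{i=1}^{n} e_i D_i} + \sum_{i=1}^{n} \frac{D_i}{\sum_{k=1}^{n} e_k D_k} \ln x_i
\end{aligned} \tag{59}
\]

Define

\[ \alpha = -\frac{1}{\sum_{i=1}^{n} e_i D_i}; \quad \beta_i = \frac{D_i}{\sum_{k=1}^{n} e_k D_k}, \quad (i = 1, 2, \ldots, n) \tag{60} \]

Then (59) reduces to the usual Cobb-Douglas production function

\[ \ln Y = \alpha + \sum_{i=1}^{n} \beta_i \ln x_i. \tag{61} \]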

Case 4: Next consider

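\[ F^i(Y, x_i) = D_i Y^{-e_i} \ln x_i; \quad \sum_{i=1}^{n} F^i(Y, x_i) = 1. \tag{62} \]

Now

\[
\begin{aligned}
& F^i_i = \frac{D_i Y^{-e_i}}{x_i}; \quad F^i_{ii} = -\frac{D_i Y^{-e_i}}{x_i^2}; \quad x_i F^i_{ii} = -\frac{D_i Y^{-e_i}}{x_i} \\
&\Rightarrow\; a_i = -\frac{F^i_i}{x_i F^i_{ii}} = 1, \quad (i = 1, 2, \ldots, n)
\end{aligned} \tag{63}
\]

This time

\[ \sigma_{ij} = \frac{a_i a_j}{\sum_{k=1}^{n} a_k s_k} = \frac{1}{\sum_{k=1}^{n} s_k} = 1. \tag{64} \]

Thus it is the Cobb Douglas technology.

However

\[
\frac{w_i}{C} = \frac{F^i_i}{\sum_{k=1}^{n} x_k F^k_k} = \frac{1}{x_i}\, \frac{D_i Y^{-e_i}}{\sum_{k=1}^{n} D_k Y^{-e_k}}
\;\Rightarrow\; x_i = \frac{C}{w_i}\, \frac{D_i Y^{-e_i}}{\sum_{k=1}^{n} D_k Y^{-e_k}} \tag{65}
\]

Similarly,

\[ x_j = \frac{C}{w_j}\, \frac{D_j Y^{-e_j}}{\sum_{k=1}^{n} D_k Y^{-e_k}} \tag{66} \]

Thus,

\[ \frac{x_j}{x_i} = \frac{D_j}{D_i}\, \frac{w_i}{w_j}\, Y^{-(e_j - e_i)} \tag{67} \]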

Clearly, the input proportions depend on both the marginal rate of substitution (i.e.,
the input price ratio) and the output level. Hence, the technology is non-homothetic,
even though the constant elasticity of substitution equals unity. This is an example
of a non-homothetic Cobb-Douglas production function.

Indirect Production Function: An Aside

An alternative specification of the production technology is in terms of the indirect
production function that defines the maximum output producible by a firm with
a given amount of budgeted expenditure (C) facing a specific vector of input
prices (w). This is particularly relevant for nonprofit organizations and public sector
enterprises which operate under hard budget constraints.
Define the normalized input price vector q = w/C. The indirect production
function can be expressed as

\[ Y^* = G(q) = \max_{x} F(x) \quad \text{s.t. } q'x \le 1. \tag{68} \]

Consider the Lagrangian

\[ \mathcal{L} = F(x) + \lambda (1 - q'x) \]

So long as $\frac{\partial F}{\partial x_i} > 0$ for at least one input, output is not maximized until the budgeted
expenditure is completely spent. Hence, the (normalized) budget restriction will be
strictly binding, and the first-order conditions for a maximum will be

\[ \frac{\partial F}{\partial x_i} = \lambda q_i; \quad (i = 1, 2, \ldots, n); \qquad q'x = 1 \tag{69} \]

This implies

\[ q_i = \frac{w_i}{C} = \frac{\partial F / \partial x_i}{\sum_{k=1}^{n} x_k\, \partial F / \partial x_k} \quad (i = 1, 2, \ldots, n) \tag{70} \]

The relation in (70) is known as Wold’s theorem.

One can solve the system of n equations above for the optimal demand functions

\[ x_i^* = x_i(q) \quad (i = 1, 2, \ldots, n) \tag{71} \]

Therefrom, the indirect production function is obtained as

\[ Y^* = F(x(q)) = g(q). \tag{72} \]

Roy’s Identity
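One of the duality relations in production theory is between the indirect and the
direct production functions10

\[
\begin{aligned}
F(x) &= \min_{q}\, g(q) \quad \text{s.t. } q'x = 1 \\
&\quad \downarrow\uparrow \\
g(q) &= \max_{x}\, F(x) \quad \text{s.t. } q'x = 1
\end{aligned} \tag{73}
\]

The Lagrangian for the constrained minimization problem in (73) is

\[ \mathcal{L} = g(q) + \theta (1 - q'x) \tag{74} \]

The first-order conditions for a minimum are

\[ \frac{\partial g}{\partial q_i} = \theta x_i \quad (i = 1, 2, \ldots, n); \qquad q'x = 1 \tag{75} \]

Thus

\[ \sum_{i=1}^{n} q_i \frac{\partial g}{\partial q_i} = \theta \sum_{i=1}^{n} q_i x_i = \theta \tag{76} \]

and

\[ x_i = \frac{\partial g / \partial q_i}{\sum_{k=1}^{n} q_k\, \partial g / \partial q_k} \quad (i = 1, 2, \ldots, n) \tag{77} \]

The relation in (77) is an example of Roy’s identity.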
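
Roy's identity can be verified numerically in a case where the demands are known in closed form. The sketch below is an illustration under assumptions not made in the text: a Cobb-Douglas direct production function with hypothetical exponents, for which maximizing output subject to q'x = 1 gives x_i = β_i / (B q_i) with B the sum of the exponents. The gradient of g is approximated by central differences, and all names are invented for the example.

```python
import math

beta = [0.2, 0.3, 0.4]            # hypothetical Cobb-Douglas exponents


def g(q):
    """Indirect production function for F(x) = prod x_i**beta_i under q'x = 1."""
    total = sum(beta)
    return math.prod((b / (total * qi)) ** b for b, qi in zip(beta, q))


def roy_demands(q, h=1e-6):
    """x_i = (dg/dq_i) / sum_k q_k (dg/dq_k), with numerical derivatives."""
    grads = []
    for i in range(len(q)):
        up, down = list(q), list(q)
        up[i] *= 1 + h
        down[i] *= 1 - h
        grads.append((g(up) - g(down)) / (2 * q[i] * h))
    denom = sum(qk * gk for qk, gk in zip(q, grads))
    return [gi / denom for gi in grads]


q = [0.5, 1.0, 2.0]               # normalized input prices w / C
x_roy = roy_demands(q)
x_direct = [b / (sum(beta) * qi) for b, qi in zip(beta, q)]
print("Roy's identity:", [round(x, 6) for x in x_roy])
print("Direct solution:", [round(x, 6) for x in x_direct])
```

The two demand vectors agree up to finite-difference error, and the demands recovered from g exhaust the normalized budget.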

Additive Implicit Indirect Production Functions

Hanoch [12] introduced a class of additive implicit production functions of the form

10 For an intuitive explanation of a similar duality between the direct and the indirect utility
functions see Varian [24] page 129–130.
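\[ G(q, Y) = \sum_{i=1}^{n} G^i(Y, q_i) = 1 \tag{78} \]

Define $G^i_i = \frac{\partial G^i}{\partial q_i}$ and $G^i_{ii} = \frac{\partial^2 G^i}{\partial q_i^2}$. Then, by Roy’s identity,

\[ x_i = \frac{G^i_i}{\sum_{k=1}^{n} q_k G^k_k}, \quad (i = 1, 2, \ldots, n) \tag{79} \]

Next, consider

\[ \frac{\partial \ln x_i}{\partial w_j} = \frac{\partial \ln G^i_i}{\partial w_j} - \frac{\partial \ln\!\left(\sum_{k=1}^{n} q_k G^k_k\right)}{\partial w_j} \tag{80} \]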

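Now, for i ≠ j,

\[
\begin{aligned}
\frac{\partial \ln G^i_i}{\partial w_j} &= \frac{\partial \ln G^i_i}{\partial q_i}\, \frac{\partial q_i}{\partial w_j} = \frac{G^i_{ii}}{G^i_i}\, \frac{\partial (w_i / C)}{\partial w_j} \\
&= -\frac{G^i_{ii}}{G^i_i}\, \frac{w_i}{C}\, \frac{x_j}{C} = -\frac{G^i_{ii}}{G^i_i}\, q_i\, \frac{x_j}{C}
\end{aligned} \tag{81}
\]

Also,

\[ \frac{\partial \ln\!\left(\sum_{k=1}^{n} q_k G^k_k\right)}{\partial w_j} = \frac{\partial \left(\sum_{k=1}^{n} q_k G^k_k\right) / \partial w_j}{\sum_{k=1}^{n} q_k G^k_k} \tag{82} \]

\[
\begin{aligned}
\frac{\partial \left(\sum_{k=1}^{n} q_k G^k_k\right)}{\partial w_j}
&= \sum_{k=1}^{n} q_k \frac{\partial G^k_k}{\partial w_j} + \sum_{k=1}^{n} G^k_k \frac{\partial q_k}{\partial w_j} \\
&= \sum_{k=1}^{n} q_k \frac{\partial G^k_k}{\partial q_k} \frac{\partial q_k}{\partial w_j} + \sum_{k=1}^{n} G^k_k \frac{\partial q_k}{\partial w_j} \\
&= \sum_{k=1}^{n} \left(q_k G^k_{kk} + G^k_k\right) \frac{\partial q_k}{\partial w_j}
\end{aligned} \tag{83}
\]

Further,

\[
\frac{\partial q_k}{\partial w_j} = \frac{C \dfrac{\partial w_k}{\partial w_j} - w_k \dfrac{\partial C}{\partial w_j}}{C^2} = \frac{\delta_{jk}}{C} - \frac{w_k x_j}{C^2} \tag{84}
\]

In (84), $\delta_{jk} = 1$ for j = k and 0 otherwise.

Thus,

\[
\begin{aligned}
\frac{\partial \left(\sum_{k=1}^{n} q_k G^k_k\right)}{\partial w_j}
&= \sum_{k=1}^{n} \left(q_k G^k_{kk} + G^k_k\right) \left(\frac{\delta_{jk}}{C} - \frac{w_k x_j}{C^2}\right) \\
&= -\frac{x_j \sum_{k=1}^{n} q_k^2 G^k_{kk}}{C} - \frac{x_j \sum_{k=1}^{n} q_k G^k_k}{C} + \frac{q_j G^j_{jj}}{C} + \frac{G^j_j}{C}
\end{aligned} \tag{85}
\]

But

\[ x_j = \frac{G^j_j}{\sum_{k=1}^{n} q_k G^k_k} \;\Rightarrow\; \frac{G^j_j}{C} = \frac{x_j \sum_{k=1}^{n} q_k G^k_k}{C} \tag{86} \]

Hence,

\[ \frac{\partial \left(\sum_{k=1}^{n} q_k G^k_k\right)}{\partial w_j} = \frac{q_j G^j_{jj}}{C} - \frac{x_j \sum_{k=1}^{n} q_k^2 G^k_{kk}}{C} \tag{87} \]

Substitution of (81) and (87) into (80) and multiplying both sides by $C / x_j$ give

\[
\begin{aligned}
\frac{C}{x_j} \frac{\partial \ln x_i}{\partial w_j} &= \frac{C\, \dfrac{\partial^2 C}{\partial w_i \partial w_j}}{x_i x_j} = \sigma_{ij} \\
&= -\frac{q_i G^i_{ii}}{G^i_i} - \frac{q_j G^j_{jj}}{G^j_j} + \frac{\sum_{k=1}^{n} q_k^2 G^k_{kk}}{\sum_{k=1}^{n} q_k G^k_k}
\end{aligned} \tag{88}
\]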
k=1 qk Gk

Define

\[ a_k = -\frac{q_k G^k_{kk}}{G^k_k} \tag{89} \]

Then,

\[ \frac{a_k G^k_k}{\sum_{j=1}^{n} q_j G^j_j} = -\frac{q_k G^k_{kk}}{\sum_{j=1}^{n} q_j G^j_j} \tag{90} \]

Hence, by (79),

\[
\begin{aligned}
& a_k x_k = -\frac{q_k G^k_{kk}}{\sum_{j=1}^{n} q_j G^j_j}
\;\Rightarrow\; a_k x_k q_k = a_k \frac{w_k x_k}{C} = a_k s_k = -\frac{q_k^2 G^k_{kk}}{\sum_{j=1}^{n} q_j G^j_j} \\
&\Rightarrow\; \sum_{k=1}^{n} a_k s_k = -\frac{\sum_{k=1}^{n} q_k^2 G^k_{kk}}{\sum_{j=1}^{n} q_j G^j_j}
\end{aligned} \tag{91}
\]

Hence, (88) can be expressed as

\[ \sigma_{ij} = a_i + a_j - \sum_{k=1}^{n} a_k s_k, \quad (i \ne j) \tag{92} \]

An implication of (92) is that for the additive indirect production function
considered above, $\sigma_{ij} - \sigma_{ik} = a_j - a_k$, which is independent of the input i.
Hanoch [12] described this as the constant difference of elasticity of substitution
(CDE) production function.
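
The constant-difference property is easy to see with numbers. The sketch below is purely illustrative: it evaluates the off-diagonal elasticities σij = ai + aj − Σk ak sk for hypothetical values of ak and hypothetical cost shares, and then compares σij − σik across different choices of i. The function name is invented for the example.

```python
def cde_sigma(a, s):
    """Off-diagonal sigma_ij = a_i + a_j - sum_k a_k s_k."""
    weighted = sum(ak * sk for ak, sk in zip(a, s))
    n = len(a)
    return [[a[i] + a[j] - weighted if i != j else None for j in range(n)]
            for i in range(n)]


a = [0.5, 1.5, 0.8, 1.2]     # hypothetical a_k
s = [0.4, 0.3, 0.2, 0.1]     # hypothetical cost shares summing to one

sig = cde_sigma(a, s)
diff_from_input_0 = sig[0][1] - sig[0][2]   # sigma_12 - sigma_13 in 1-based terms
diff_from_input_3 = sig[3][1] - sig[3][2]   # sigma_42 - sigma_43 in 1-based terms
print(diff_from_input_0, diff_from_input_3, a[1] - a[2])
```

Both differences equal the gap between the two a-values regardless of which input i is used as the reference, and regardless of the cost shares.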
Consider the specification

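\[
G(q, Y) = \sum_{i=1}^{n} G^i(q_i, Y) = 1; \quad
G^i(q_i, Y) = B_i q_i^{b_i} Y^{e_i b_i} \quad (i = 1, 2, \ldots, n) \tag{93}
\]

Then, by Roy’s identity,

\[
x_i = \frac{G^i_i}{\sum_{k=1}^{n} q_k G^k_k} = \frac{B_i b_i q_i^{(b_i - 1)} Y^{e_i b_i}}{\sum_{k=1}^{n} B_k b_k q_k^{b_k} Y^{e_k b_k}} \quad (i = 1, 2, \ldots, n) \tag{94}
\]

Hence,

\[ \ln \frac{x_i}{x_j} = \ln \frac{B_i b_i}{B_j b_j} + (e_i b_i - e_j b_j) \ln Y + (b_i - 1) \ln q_i - (b_j - 1) \ln q_j \tag{95} \]

Similarly,

\[ \ln \frac{x_k}{x_j} = \ln \frac{B_k b_k}{B_j b_j} + (e_k b_k - e_j b_j) \ln Y + (b_k - 1) \ln q_k - (b_j - 1) \ln q_j \tag{96} \]

Combining (95) and (96) to eliminate ln Y, we get

\[ \ln \frac{x_i}{x_j} = (K_{ij} - \varepsilon K_{kj}) + \varepsilon \ln \frac{x_k}{x_j} + a_i \ln q_i - a_k \varepsilon \ln q_k - a_j (1 - \varepsilon) \ln q_j \tag{97} \]

where

\[ K_{ij} = \ln \frac{B_i b_i}{B_j b_j}; \quad \varepsilon = \frac{e_i b_i - e_j b_j}{e_k b_k - e_j b_j}; \quad a_i = b_i - 1 \quad (i = 1, 2, \ldots, n) \tag{98} \]

This is a system of (n − 1) simultaneous equations which are nonlinear in parameters.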
Special Cases:
Case 1: $b_i = b$ (i = 1, 2, . . . , n). In this case,

\[
G(q, Y) = \sum_{i=1}^{n} B_i q_i^{b} Y^{e_i b} = 1
\;\Rightarrow\; C = \left(\sum_{i=1}^{n} B_i w_i^{b} Y^{e_i b}\right)^{\frac{1}{b}} \tag{99}
\]

The cost function is not multiplicatively separable in output and input prices, which
implies non-homotheticity of the technology. However,

\[ \sigma_{ij} - \sigma_{ik} = b_k - b_j = 0 \;\Rightarrow\; \sigma_{ij} = \sigma_{ik} = \sigma. \tag{100} \]

Hence, it is a NH-CES production function.

Case 2: $b_i = b$; $e_i = e$; (i = 1, 2, . . . , n) This time,

\[
G(q, Y) = \sum_{i=1}^{n} B_i q_i^{b} Y^{eb} = 1
\;\Rightarrow\; C = Y^{e} \left(\sum_{i=1}^{n} B_i w_i^{b}\right)^{\frac{1}{b}} \tag{101}
\]

This is the dual cost function for the homothetic CES production function. Returns
to scale depend, however, on the parameter e.
Case 3: $e_i = e = 1$; $b_i = b = 1$; (i = 1, 2, . . . , n)
Now,

\[
G(q, Y) = \sum_{i=1}^{n} B_i q_i Y = 1
\;\Rightarrow\; C = Y \sum_{i=1}^{n} B_i w_i \tag{102}
\]

This is the cost function for the fixed coefficients Leontief production function.
Case 4: $G^i(q_i, Y) = B_i \ln\!\left(q_i Y^{e_i}\right)$; i = 1, 2, . . . , n
Thus,

\[
\begin{aligned}
& G^i(q_i, Y) = B_i \ln q_i + B_i e_i \ln Y \\
&\Rightarrow\; G^i_i = \frac{B_i}{q_i}, \quad G^i_{ii} = -\frac{B_i}{q_i^2} \\
&\Rightarrow\; a_i = -\frac{q_i G^i_{ii}}{G^i_i} = 1; \quad i = 1, 2, \ldots, n
\end{aligned} \tag{103}
\]

Hence,

\[ \sigma_{ij} = a_i + a_j - \sum_{k=1}^{n} a_k s_k = 1 \tag{104} \]

This corresponds to the Cobb-Douglas technology.

Flexible Functional Forms

Although the implicit additive direct and indirect production functions allow more
flexibility in respect of input substitution compared to the Cobb-Douglas and the
ACMS functions, they still impose considerable structure on the technology a priori.
The parallel development of the flexible functional forms, mainly the transcendental
logarithmic (translog) production and cost functions developed in [3–5] and the
generalized Leontief cost function by Diewert [7], has revolutionized empirical
analysis of production by allowing the use of the cost or profit function as the
primary analytical format.
The main properties of the production function like the level of the output,
the marginal productivities of inputs, and the pairwise substitution elasticities
can be derived from the function itself, its vector of partial derivative, and the
Hessian matrix of second and cross partial derivatives. The flexible functional forms
permit the analyst to extract these properties from a second-order Taylor’s series
approximation of an unspecified underlying production, cost, or profit function.
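Consider, for example, the function y = f(x) and its second-order approximation
at some point $x^0$. Thus,

\[
\begin{aligned}
f(x) \approx g(x) &= f(x^0) + (x - x^0)' \left.\frac{\partial f}{\partial x}\right|_{x = x^0} + \frac{1}{2} (x - x^0)' \left.\frac{\partial^2 f}{\partial x \partial x'}\right|_{x = x^0} (x - x^0) \\
&= \left[ f(x^0) - x^{0\prime} \left.\frac{\partial f}{\partial x}\right|_{x = x^0} + \frac{1}{2} x^{0\prime} \left.\frac{\partial^2 f}{\partial x \partial x'}\right|_{x = x^0} x^0 \right]
+ \left[ \left.\frac{\partial f}{\partial x}\right|_{x = x^0} - \left.\frac{\partial^2 f}{\partial x \partial x'}\right|_{x = x^0} x^0 \right]' x \\
&\quad + \frac{1}{2} x' \left.\frac{\partial^2 f}{\partial x \partial x'}\right|_{x = x^0} x \\
&\equiv a_0 + a'x + \frac{1}{2} x'Ax
\end{aligned} \tag{105}
\]

Here

\[
\begin{aligned}
a_0 &= f(x^0) - x^{0\prime} \left.\frac{\partial f}{\partial x}\right|_{x = x^0} + \frac{1}{2} x^{0\prime} \left.\frac{\partial^2 f}{\partial x \partial x'}\right|_{x = x^0} x^0; \\
a &= \left.\frac{\partial f}{\partial x}\right|_{x = x^0} - \left.\frac{\partial^2 f}{\partial x \partial x'}\right|_{x = x^0} x^0; \\
A &= \left.\frac{\partial^2 f}{\partial x \partial x'}\right|_{x = x^0}
\end{aligned} \tag{106}
\]

It can be verified that at the point of approximation $x^0$

\[ f(x) = g(x); \quad \frac{\partial f}{\partial x} = \frac{\partial g}{\partial x}; \quad \frac{\partial^2 f}{\partial x \partial x'} = \frac{\partial^2 g}{\partial x \partial x'} \tag{107} \]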

Translog Cost Function

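The m-output n-input translog cost function is

\[
\begin{aligned}
\ln C = \alpha_0 &+ \sum_{r=1}^{m} \alpha_r \ln Y_r + \frac{1}{2} \sum_{r=1}^{m} \sum_{s=1}^{m} \alpha_{rs} \ln Y_r \ln Y_s \\
&+ \sum_{i=1}^{n} \beta_i \ln w_i + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \beta_{ij} \ln w_i \ln w_j + \sum_{r=1}^{m} \sum_{j=1}^{n} \gamma_{rj} \ln Y_r \ln w_j
\end{aligned} \tag{108}
\]

Using Shephard’s lemma, one can get

\[
s_i = \frac{w_i x_i}{C} = \frac{\partial \ln C}{\partial \ln w_i} = \beta_i + \sum_{j=1}^{n} \beta_{ij} \ln w_j + \sum_{r=1}^{m} \gamma_{ri} \ln Y_r; \quad (i = 1, 2, \ldots, n) \tag{109}
\]

Now,

\[ \frac{\partial s_i}{\partial w_j} = \frac{\partial s_i}{\partial \ln w_j}\, \frac{\partial \ln w_j}{\partial w_j} = \frac{\beta_{ij}}{w_j} \tag{110} \]

while, for i ≠ j,

\[
\frac{\partial \left(\dfrac{w_i x_i}{C}\right)}{\partial w_j} = w_i \left( \frac{C \dfrac{\partial x_i}{\partial w_j} - x_i \dfrac{\partial C}{\partial w_j}}{C^2} \right) = w_i\, \frac{C C_{ij} - x_i x_j}{C^2} \tag{111}
\]

Hence,

\[
\begin{aligned}
& \beta_{ij} = \frac{w_i x_i}{C}\, \frac{w_j x_j}{C} \left( \frac{C C_{ij}}{x_i x_j} - 1 \right) \\
&\Rightarrow\; \sigma_{ij} = \frac{C C_{ij}}{x_i x_j} = 1 + \frac{\beta_{ij}}{s_i s_j}.
\end{aligned} \tag{112}
\]

It should be noted that the pairwise substitution elasticities vary across data points
because the cost shares of the inputs depend on both the input prices and the output
quantities.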
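
The dependence of the translog elasticities on the data point can be illustrated with a small calculation. The sketch below is an assumption-laden illustration: the parameter values are hypothetical, chosen so that the β_i sum to one and each row of the symmetric β_ij matrix sums to zero (linear homogeneity in prices), and the output terms of the share equations are left out for brevity. Function names are invented for the example.

```python
import math

# Hypothetical translog parameters (three inputs, output terms omitted).
beta = [0.5, 0.3, 0.2]
B = [[0.10, -0.06, -0.04],
     [-0.06, 0.08, -0.02],
     [-0.04, -0.02, 0.06]]


def shares(w):
    """Cost shares s_i = beta_i + sum_j beta_ij ln w_j."""
    ln_w = [math.log(wi) for wi in w]
    return [b + sum(bij * lw for bij, lw in zip(row, ln_w))
            for b, row in zip(beta, B)]


def sigma(w, i, j):
    """Off-diagonal elasticity sigma_ij = 1 + beta_ij / (s_i s_j)."""
    s = shares(w)
    return 1.0 + B[i][j] / (s[i] * s[j])


unit_prices = [1.0, 1.0, 1.0]
other_prices = [2.0, 1.0, 1.0]
sigma_unit = sigma(unit_prices, 0, 1)
sigma_other = sigma(other_prices, 0, 1)
print(f"sigma_12 at unit prices:  {sigma_unit:.4f}")
print(f"sigma_12 at other prices: {sigma_other:.4f}")
```

With the same parameters, the elasticity between inputs 1 and 2 changes when the first input price doubles, because the cost shares in the denominator change.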

Generalized Leontief Cost Function

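Another popular flexible functional form is the generalized Leontief cost function
introduced by Diewert [7]:

$$
C(w, Y) = \left( \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} \sqrt{w_i w_j} \right) Y
\tag{113}
$$

Using Shephard’s lemma (with $b_{ij} = b_{ji}$), one gets the conditional input demand functions:

$$
x_i = \left( b_{ii} + \sum_{j=1,\, j \neq i}^{n} b_{ij} \sqrt{\frac{w_j}{w_i}} \right) Y
\tag{114}
$$

It can be seen that if $b_{ij} = 0$ for all $i \neq j$, then $x_i / Y = b_{ii}$, and we get the standard Leontief
technology.
From (114),

$$
C_{ij} = \frac{\partial x_i}{\partial w_j} = \frac{b_{ij}}{2 \sqrt{w_i w_j}} \, Y; \quad (i \neq j)
\tag{115}
$$

Hence,

$$
\sigma_{ij} = \frac{C C_{ij}}{x_i x_j} = \frac{\left( \sum_{k=1}^{n} \sum_{l=1}^{n} b_{kl} \sqrt{w_k w_l} \right) b_{ij}}{2 \sqrt{w_i w_j} \left( b_{ii} + \sum_{k \neq i} b_{ik} \sqrt{\frac{w_k}{w_i}} \right) \left( b_{jj} + \sum_{k \neq j} b_{jk} \sqrt{\frac{w_k}{w_j}} \right)}
\tag{116}
$$

Clearly, $\sigma_{ij} = 0$ if $b_{ij} = 0$.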
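
The following sketch evaluates (113) to (116) for a small example. The coefficient matrix, prices, and output level are hypothetical values picked so that the square roots are exact; the function names are likewise assumptions. It checks that the Shephard demands exhaust cost and that a zero off-diagonal coefficient implies a zero elasticity of substitution.

```python
import math

def gl_cost(w, Y, b):
    """Generalized Leontief cost: Y * sum_i sum_j b_ij sqrt(w_i w_j)."""
    n = len(w)
    return Y * sum(b[i][j] * math.sqrt(w[i] * w[j])
                   for i in range(n) for j in range(n))

def gl_demand(i, w, Y, b):
    """Conditional demand: x_i = (b_ii + sum_{j != i} b_ij sqrt(w_j / w_i)) Y."""
    off_diag = sum(b[i][j] * math.sqrt(w[j] / w[i])
                   for j in range(len(w)) if j != i)
    return (b[i][i] + off_diag) * Y

def gl_allen_elasticity(i, j, w, Y, b):
    """sigma_ij = C * C_ij / (x_i x_j) for i != j."""
    if i == j:
        raise ValueError("only the cross elasticity (i != j) is implemented")
    C_ij = b[i][j] * Y / (2.0 * math.sqrt(w[i] * w[j]))
    return gl_cost(w, Y, b) * C_ij / (gl_demand(i, w, Y, b) * gl_demand(j, w, Y, b))

# Hypothetical symmetric coefficients; b[0][2] = 0 rules out substitution
# between inputs 1 and 3.
B_GL = [[1.0, 0.5, 0.0],
        [0.5, 2.0, 0.3],
        [0.0, 0.3, 1.5]]
W = [1.0, 4.0, 9.0]
Y_LEVEL = 10.0

if __name__ == "__main__":
    x = [gl_demand(i, W, Y_LEVEL, B_GL) for i in range(3)]
    print("cost:", gl_cost(W, Y_LEVEL, B_GL), "demands:", x)
    print("sigma_12:", gl_allen_elasticity(0, 1, W, Y_LEVEL, B_GL))
    print("sigma_13:", gl_allen_elasticity(0, 2, W, Y_LEVEL, B_GL))
```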

The homothetic and CRS version of the cost function may be generalized further
by allowing the input ratios to depend on both the output level and the input prices.
One such formulation that is both non-homothetic and allows nonconstant returns
to scale is
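$$
C(w, Y) = \left( \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} \sqrt{w_i w_j} \right) Y + \left( \sum_{i=1}^{n} d_i w_i \right) Y^2
\tag{117}
$$

This time, the conditional input demands are

$$
x_i = \left( b_{ii} + \sum_{j=1,\, j \neq i}^{n} b_{ij} \sqrt{\frac{w_j}{w_i}} \right) Y + d_i Y^2; \quad (i = 1, 2, \ldots, n)
\tag{118}
$$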

Now at the cost-minimizing bundle, the input proportions depend both on input
price ratios and the level of output.
While the flexible functional forms impose few prior restrictions on the nature
of the technology, instead allowing the empirical analyst to test such restrictions
statistically, in applied research one too often encounters violations of regularity
conditions such as positive marginal cost, nonnegative factor demands/shares, negative
own-price elasticities, concavity of the cost function, and so on. Unfortunately,
the level of such violations at which the estimated model becomes unacceptable
remains a judgment call for the analyst.

Appendix 1
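$$
y = \frac{Y}{L} \;\Rightarrow\; Y = Ly = L f(k) = L f\!\left(\frac{K}{L}\right)
$$

$$
F_K = \frac{\partial Y}{\partial K} = L \frac{\partial y}{\partial K} = L \frac{\partial y}{\partial k} \frac{\partial k}{\partial K} = L f'(k) \frac{1}{L} = f'(k)
$$

$$
F_L = \frac{\partial Y}{\partial L} = y + L \frac{\partial y}{\partial L} = y + L \frac{\partial y}{\partial k} \frac{\partial k}{\partial L} = f(k) - L f'(k) \frac{K}{L^2} = f(k) - \frac{K}{L} f'(k) = f(k) - k f'(k)
$$

$$
F_{KL} = f''(k) \frac{\partial k}{\partial L} = -\frac{1}{L}\, k f''(k)
$$

$$
\Rightarrow\; \sigma_{KL} = \frac{F_K F_L}{F \cdot F_{KL}} = -\frac{f'(k) \left[ f(k) - k f'(k) \right]}{k f(k) f''(k)}
$$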
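
The final expression can be checked against cases where the elasticity of substitution is known. The sketch below is an illustration under stated assumptions: it approximates the first and second derivatives of the intensive-form function by central differences (a numerical shortcut, not part of the appendix), and the two test functions and their parameters are chosen only for the example.

```python
def sigma_from_intensive_form(f, k, h=1e-3):
    """Evaluate sigma = -f'(k) [f(k) - k f'(k)] / (k f(k) f''(k)) numerically.

    f is the intensive-form production function y = f(k), with k = K/L.
    Derivatives are approximated by central differences with step h.
    """
    f0 = f(k)
    f1 = (f(k + h) - f(k - h)) / (2.0 * h)
    f2 = (f(k + h) - 2.0 * f0 + f(k - h)) / (h * h)
    return -f1 * (f0 - k * f1) / (k * f0 * f2)

def cobb_douglas(k, alpha=0.3):
    """Intensive form of Y = K^alpha L^(1 - alpha); sigma should be 1."""
    return k ** alpha

def ces(k, delta=0.4, rho=-1.0):
    """Intensive form of a CES function; sigma should be 1 / (1 - rho)."""
    return (delta * k ** rho + (1.0 - delta)) ** (1.0 / rho)

if __name__ == "__main__":
    print("Cobb-Douglas sigma:", sigma_from_intensive_form(cobb_douglas, 2.0))
    print("CES sigma (rho = -1):", sigma_from_intensive_form(ces, 2.0))
```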

Appendix 2

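Consider the production function

$$
z = h(x_1, x_2, \ldots, x_n), \qquad y = g(z); \quad g'(z) > 0
$$

Then,

$$
\frac{\partial y}{\partial x_i} = g'(z) \frac{\partial h}{\partial x_i}; \quad \frac{\partial y}{\partial x_j} = g'(z) \frac{\partial h}{\partial x_j}
\;\Rightarrow\; \frac{\partial y / \partial x_i}{\partial y / \partial x_j} = \frac{\partial h / \partial x_i}{\partial h / \partial x_j}
$$

But due to homogeneity of degree r, $h(tx) = t^r h(x)$, so the partial derivatives of h
are homogeneous of degree $r - 1$. Now suppose that $g(h(x)) = y_0$ and
$g(h(tx)) = y_1$. Then, along the isoquant for output $y_1$, with $g'$ evaluated at
$h(tx)$ and the derivatives of h evaluated at x,

$$
\frac{\partial y}{\partial x_i} = g' \, t^{r-1} \frac{\partial h}{\partial x_i}; \quad \frac{\partial y}{\partial x_j} = g' \, t^{r-1} \frac{\partial h}{\partial x_j}
$$

$$
\Rightarrow\; \left( \frac{\partial y / \partial x_i}{\partial y / \partial x_j} \right)_{y = y_1} = \frac{\partial h / \partial x_i}{\partial h / \partial x_j} = \left( \frac{\partial y / \partial x_i}{\partial y / \partial x_j} \right)_{y = y_0}
$$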

The input vectors x and tx have the same ratio of any two inputs i and j. Hence, the
marginal rate of substitution between the inputs remains unchanged so long as the
input proportions remain the same irrespective of the output level.

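Appendix 3 Elasticity of Substitution Derived from the Dual Cost Function

This appendix is based on McFadden [15]. We first consider the two-input case:
$y = f(x_1, x_2)$. Dual cost function:

$$
C(w_1, w_2, y) = w_1 x_1(w_1, w_2, y) + w_2 x_2(w_1, w_2, y)
$$

Recall the definition of the elasticity of substitution

$$
\sigma_{12} = \frac{d \ln (x_2 / x_1)}{d \ln (f_1 / f_2)}
$$

By Shephard’s lemma, $x_i = \partial C / \partial w_i = C_i$ $(i = 1, 2)$. Also, by the FOC for cost
minimization,

$$
\frac{f_1}{f_2} = \frac{w_1}{w_2}
$$

Hence,

$$
\sigma_{12} = \frac{d \ln (x_2 / x_1)}{d \ln (f_1 / f_2)} = \frac{d \ln (C_2 / C_1)}{d \ln (w_1 / w_2)}
= \frac{d \ln C_2 - d \ln C_1}{d \ln w_1 - d \ln w_2}
= \frac{\dfrac{dC_2}{C_2} - \dfrac{dC_1}{C_1}}{\dfrac{dw_1}{w_1} - \dfrac{dw_2}{w_2}} \equiv \frac{A}{B}
$$

Now,

$$
dC_2 = C_{21}\, dw_1 + C_{22}\, dw_2
$$

$$
dC_1 = C_{11}\, dw_1 + C_{12}\, dw_2
$$

Hence,

$$
A = \frac{C_{21}\, dw_1 + C_{22}\, dw_2}{C_2} - \frac{C_{11}\, dw_1 + C_{12}\, dw_2}{C_1}
= \left( \frac{C_{12}}{C_2} - \frac{C_{11}}{C_1} \right) dw_1 + \left( \frac{C_{22}}{C_2} - \frac{C_{12}}{C_1} \right) dw_2
$$

Now, because $x_i(w_1, w_2, y) = C_i(w_1, w_2, y)$ $(i = 1, 2)$ is homogeneous of degree
0 in w,

$$
C_{11} w_1 + C_{12} w_2 = 0 \;\Rightarrow\; C_{11} = -C_{12} \frac{w_2}{w_1}
$$

$$
C_{21} w_1 + C_{22} w_2 = 0 \;\Rightarrow\; C_{22} = -C_{12} \frac{w_1}{w_2}
$$

Hence,

$$
\begin{aligned}
A &= \left( \frac{C_{12}}{C_2} + \frac{C_{12} w_2}{C_1 w_1} \right) dw_1 + \left( -\frac{C_{12} w_1}{C_2 w_2} - \frac{C_{12}}{C_1} \right) dw_2 \\
&= \left( \frac{C_{12}}{C_2} + \frac{C_{12} w_2}{C_1 w_1} \right) dw_1 - \left( \frac{C_{12} w_1}{C_2 w_2} + \frac{C_{12}}{C_1} \right) dw_2 \\
&= \frac{C_{12} (C_1 w_1 + C_2 w_2)}{C_1 C_2} \frac{dw_1}{w_1} - \frac{C_{12} (C_1 w_1 + C_2 w_2)}{C_1 C_2} \frac{dw_2}{w_2} \\
&= \frac{C_{12} (C_1 w_1 + C_2 w_2)}{C_1 C_2} \left( \frac{dw_1}{w_1} - \frac{dw_2}{w_2} \right)
\end{aligned}
$$

But $C = C(w_1, w_2, y)$ is homogeneous of degree 1 in $(w_1, w_2)$.
Hence, $C_1 w_1 + C_2 w_2 = C$.
Therefore,

$$
A = \frac{C\, C_{12}}{C_1 C_2} \left( \frac{dw_1}{w_1} - \frac{dw_2}{w_2} \right)
$$

Hence,

$$
\sigma_{12} = \frac{A}{B} = \frac{C\, C_{12}}{C_1 C_2}.
$$

For the multiple-input case,

$$
\sigma_{ij} = \frac{C\, C_{ij}}{C_i C_j}
$$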

Uzawa [23] has shown this to be equivalent to the Allen elasticity of substitution
defined above.
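
The two-input result can be verified numerically for a cost function whose elasticity of substitution is known. The sketch below is an illustration only: the CES cost function, its parameter values, and the finite-difference step are assumptions made for the example, and the derivatives of C are approximated by central differences rather than computed analytically.

```python
def ces_cost(w1, w2, y, a1=0.4, a2=0.6, s=0.5):
    """CES cost function with elasticity of substitution s (s != 1)."""
    unit_cost = (a1 * w1 ** (1.0 - s) + a2 * w2 ** (1.0 - s)) ** (1.0 / (1.0 - s))
    return y * unit_cost

def allen_from_cost(cost, w1, w2, y, h=1e-3):
    """Approximate sigma_12 = C * C_12 / (C_1 * C_2) by central differences."""
    C = cost(w1, w2, y)
    C1 = (cost(w1 + h, w2, y) - cost(w1 - h, w2, y)) / (2.0 * h)
    C2 = (cost(w1, w2 + h, y) - cost(w1, w2 - h, y)) / (2.0 * h)
    C12 = (cost(w1 + h, w2 + h, y) - cost(w1 + h, w2 - h, y)
           - cost(w1 - h, w2 + h, y) + cost(w1 - h, w2 - h, y)) / (4.0 * h * h)
    return C * C12 / (C1 * C2)

if __name__ == "__main__":
    print("sigma_12 with s = 0.5:", allen_from_cost(ces_cost, 2.0, 3.0, 10.0))
    elastic = lambda w1, w2, y: ces_cost(w1, w2, y, s=2.0)
    print("sigma_12 with s = 2.0:", allen_from_cost(elastic, 2.0, 3.0, 10.0))
```

The recovered value should not depend on the prices or the output level at which it is evaluated, since the CES elasticity is constant.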

Cross-References

Elasticities of Substitution

References
1. Abramovitz M (1956) Resource and output trends in the United States since 1870. Am Econ
Rev 46:5–23
2. Arrow KJ, Chenery HB, Minhas BS, Solow RM (1961) Capital-labor substitution and
economic efficiency. Rev Econ Stat 43(3):225–250
3. Berndt E, Christensen L (1973) The translog function and the substitution of equipment,
structures and labor in U.S. manufacturing 1929–1968. J Econ 1(1):81–114
4. Christensen LR, Jorgenson D, Lau L (1971) Conjugate duality and the transcendental
logarithmic production function. Econometrica 39(4):255–256
5. Christensen LR, Jorgenson D, Lau L (1973) Transcendental logarithmic production frontiers.
Rev Econ Stat 55:28–45
6. Cobb CW, Douglas PH (1928) A theory of production. Am Econ Rev 18(1):139–165. Papers
and Proceedings (Mar)
7. Diewert WE (1971) An application of the Shephard duality theorem: a generalized Leontief
production function. J Polit Econ 79(3):481–507
8. Fishburn PC (1972) Mathematics of decision theory. Mouton, The Hague
9. Gorman WM (1965) Production functions in which the elasticities of substitution stand in fixed
proportions to each other. Rev Econ Stud 32:217–224
10. Gorman WM (1968) Measuring the quantities of fixed factors. In: Wolfe JN (ed) Value, capital
and growth. Edinburgh University Press, Edinburgh
11. Hanoch G (1971) CRESH production functions. Econometrica 39(5):695–712
12. Hanoch G (1975) Production and demand models with direct and indirect implicit additivity.
Econometrica 43(3):395–419
13. Hiriart-Urruty J-B, LeMaréchal C (2001) Fundamentals of convex analysis. Springer,
Heidelberg/Berlin
14. Jorgenson DW, Griliches Z (1967) The explanation of productivity change. Rev Econ Stud
34(99):249–283
15. McFadden D (1978) Cost, revenue, and profit functions. In: Fuss M, McFadden D (eds)
Production economics: a dual approach to theory and applications, vol I. North-Holland,
Amsterdam, pp 3–109
16. Mukerji V (1963) A generalized S.M.A.C. function with constant ratios of elasticity of
substitution. Rev Econ Stud 30:273–284
17. Rockafellar RT (1970) Convex analysis. Princeton University Press, Princeton
18. Samuelson PA (1947) Foundations of economic analysis. Harvard University Press,
Cambridge, MA
19. Samuelson PA (1979) Paul Douglas’s measurement of production functions and marginal
productivities. J Polit Econ 87(5):923–939, Part 1
20. Sato R (1977) Homothetic and non-homothetic CES production functions. Am Econ Rev
67(4):559–569
21. Schultz TW (1958) Output-input relationships revisited. J Farm Econ 40:924–932
22. Shephard RW (1953) Cost and production functions. Princeton University Press, Princeton
23. Uzawa H (1962) Production functions with constant elasticity of substitution. Rev Econ Stud
29(4):291–299
24. Varian H (1992) Microeconomic analysis, 3rd edn. Norton, New York
25. von Thünen J (1826) Der isolirte Staat in Beziehung auf Landwirtschaft und National-
Ökonomie. Wirtschaft & Finan

2 Reminiscences of “Returns to Scale in Electricity Supply”
Marc Nerlove

Contents
Introduction: Genesis
Development
Conclusion: Loose Ends and the Aftermath
References

Abstract

The origins and further development of a very early empirical application of
Shephard’s duality theorem are discussed. The investigation of the “Returns to
Scale in Electricity Supply” was initially based on a cost function derived from a
Cobb-Douglas production function. The function was modified to allow for
variable “returns to scale.” Duality between cost and production then showed the
modified Cobb-Douglas production function behind the cost function which was
initially estimated.

Keywords

Cost/production functions · Duality · Returns to scale · Modified
Cobb-Douglas · Electricity generation

M. Nerlove
Department of Agricultural and Resource Economics, College of Agriculture and Natural
Resources, University of Maryland, College Park, MD, USA

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_2

My paper “Returns to Scale in Electricity Supply” was first issued in 1961 as a
Technical Report at Stanford University.1 Later that year, my friend and former
office-mate at the University of Chicago, Yehuda Grunfeld, tragically drowned
while swimming off the beach at Tel-Aviv, Israel. Carl Christ and Don Patinkin put
together a distinguished list of contributors for a volume of essays in his memory.
I was honored to be asked to contribute and could not but give it my best
work at the time. The volume appeared in 1963.2 Soon thereafter, the paper was
reprinted in a compendium of papers in econometrics edited by Arnold Zellner.3
After the paper was made more widely accessible, it was discussed in econometric
texts, for example, Berndt4 and Hayashi.5 My 1955 data set was augmented for 1970
by Christensen and Greene, who used a translog cost function to estimate returns to
scale in electricity supply in the two years.6

Introduction: Genesis

How did I come to write this paper? What were my sources of ideas, in particular
the idea of estimating the parameters of a production function from a cost function,
without, at the time, knowing of Shephard’s now famous result on the duality of
cost and production functions?7
I served in the US Army from April 1, 1957, to February 23, 1959. After basic
training, I was assigned to the Program Coordination Office of the labs at Fort
Detrick, Maryland. There I met a young private, another economist, Max D. Steuer,
who had been studying at the London School of Economics.8 We had some time
on our hands and decided to do a research project together; we thought for various
reasons that it would be useful to study the roles of capital, labor, and fuel in US
production of electric power. A ready source of data was at hand in the annual
reports of the Federal Power Commission.9 Steuer left the Army shortly thereafter
to return to London and I was left to continue on my own. We never resumed our
collaboration.

1 Technical Report No. 96, May 25, 1961, of the Institute for Mathematical Studies in the Social
Sciences, Stanford University.
2 Carl F. Christ [1]
3 Zellner [2]
4 Berndt [3]
5 Hayashi [4]
6 Christensen and Greene [5]. “Cross-section data for 1955 and 1970 are analyzed using the translog
cost function. We find that in 1955 there were significant scale economies available to nearly all
firms. By 1970, however, the bulk of U.S. electricity generation was by firms operating in the
essentially flat area of the average cost curve. We conclude that a small number of extremely large
firms are not required for efficient production and that policies designed to promote competition in
electric power generation cannot be faulted in terms of sacrificing economies of scale” (p. 655).
7 Shephard [6]
8 He was the grandson of the famous lawyer for the defense in the Triangle Shirtwaist Factory Fire
case – in case you wanted to know. Steuer, Max David [7].

During my last year in the Army, I was asked by George Heberton Evans, then the
chairman of the Department of Political Economy of the Johns Hopkins University,
to teach the course in “Mathematical Economics.” This was, at the time, essentially
a course in calculus with economic examples. I sought and received permission
from my commanding officer to teach the course on Saturday mornings through the
fall semester. I used R. G. D. Allen’s text supplemented by many exercises of my
own devising.10 Although teaching this course did not leave me time to return to
research on the production of electric power in the United States, it proved to be a
crucial element in my thinking when I later returned to the project after leaving the
Army to take a position at the University of Minnesota.
Toward the end of my course at Hopkins, I dealt with the maximization
(minimization) of functions of several variables subject to one or more constraints
(Allen, op. cit., Chap. 19). We covered utility maximization subject to an income
(expenditure) constraint and total cost minimization for given output and input
prices subject to a production function constraint. In one exercise, Allen (Ex.15,
p. 519) uses a Cobb-Douglas utility function to show that demands for the various
goods have unitary income and own-price elasticities while having zero cross-price
elasticities. I turned this exercise upside down: Instead of maximizing
a nonlinear function (Cobb-Douglas utility function) subject to a linear total
expenditure constraint, I asked my students to minimize a total linear cost function
subject to a nonlinear Cobb-Douglas production function constraint, to solve the
first-order conditions for the derived demands for factors of production, and then
substitute these derived demand functions into the total cost function. The total cost
was thus obtained as a function of output and factor prices. The coefficients in the
original Cobb-Douglas production function, including the degree of returns to scale,
were uniquely recoverable from the coefficients in this total cost function. My
students and I thus stumbled upon a special case of the Shephard duality theorem, but I
was not to know this until later.
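
The classroom exercise can be sketched in a few lines. The code below is an illustration, not a reproduction of the original notes: it uses the standard closed form of the Cobb-Douglas cost function, and the function names, exponents, prices, and output level are all hypothetical. In logs the cost function is linear in log output and log prices, with slopes 1/r and a_i/r, where r is the sum of the exponents, so r and the a_i can be read back from the slopes.

```python
import math

def cobb_douglas_cost(y, p, A, a):
    """Minimum cost of producing y when y = A * prod(x_i ** a_i), prices p.

    Standard closed form: C = r * (A * prod(a_i ** a_i)) ** (-1/r)
                              * y ** (1/r) * prod(p_i ** (a_i / r)),
    where r = sum(a) is the degree of returns to scale.
    """
    r = sum(a)
    scale = A * math.prod(ai ** ai for ai in a)
    price_term = math.prod(pi ** (ai / r) for pi, ai in zip(p, a))
    return r * scale ** (-1.0 / r) * y ** (1.0 / r) * price_term

def conditional_demands(y, p, A, a):
    """Cost-minimizing inputs: each cost share p_i x_i / C equals a_i / r."""
    r = sum(a)
    C = cobb_douglas_cost(y, p, A, a)
    return [ai * C / (r * pi) for ai, pi in zip(a, p)]

def cost_function_slopes(a):
    """Slopes of ln C on ln y and on each ln p_i."""
    r = sum(a)
    return 1.0 / r, [ai / r for ai in a]

def recover_production_parameters(slope_y, slopes_p):
    """Invert the slopes: r = 1 / slope_y and a_i = slope_i * r."""
    r = 1.0 / slope_y
    return r, [b * r for b in slopes_p]

if __name__ == "__main__":
    a = [0.2, 0.3, 0.6]          # hypothetical exponents (three inputs)
    slope_y, slopes_p = cost_function_slopes(a)
    print("cost-function slopes:", slope_y, slopes_p)
    print("recovered (r, a):", recover_production_parameters(slope_y, slopes_p))
```

The price slopes sum to one, which reflects linear homogeneity of cost in input prices, and is the restriction one would impose when estimating such a function.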

Development

I left the Army at the end of February 1959 to accept a teaching and research position
at the University of Minnesota. I taught the same course I had taught at Hopkins and
John Chipman and I taught a joint Workshop on Econometrics. Among our students
in the Workshop was Daniel McFadden, who also served as an unofficial research
assistant. For the Workshop, I prepared a set of lectures on the estimation and
identification of Cobb-Douglas production functions, which with the addition of two
chapters based on my related work was later to serve as a short book.11 We discussed,
among other things, Klein’s early work on railroad production functions,12 and John
Chipman’s reestimation using a different approach.13

9 US Federal Power Commission, Steam Electric Plants: Construction Costs and Annual
Production Expenses, Washington, D.C.: annually; and Statistics of Electric
Utilities in the United States, Classes A and B Privately Owned Companies, Washington, D.C.:
annually.
10 Allen [8]. I am indebted to Sara Seten Berghausen, Associate Curator of Collections, David M.
Rubenstein Rare Book & Manuscript Library, who retrieved my class notes and exercises for my
course at Hopkins.

Klein assumes a two-output asymmetric Cobb-Douglas production function.
He assumes three inputs with exogenously varying prices, all for a cross-section
sample of firms. By asymmetric I mean he normalizes the exponent of one of
the output variables to be −1 and so treats it as the “dependent” variable in
the production function. He argues that the cross-section data represent short-
run adjustments, and because the industry is regulated, outputs for each firm are
exogenous. From the first-order conditions for cost minimization subject to the
Cobb-Douglas production function constraint, given outputs and input prices, Klein
then derives estimates for the ratios of the production function exponents of two
pairs of inputs by taking geometric means, implicitly assuming errors in cost
minimization or random differences in the production function exponents. To arrive
at estimates of the individual exponents in the production function, Klein constructs
a synthetic dependent variable in a logarithmic regression with the two output
variables as independent, on the assumption that these are exogenously determined
for each railroad. If all this seems somewhat convoluted, it is because it is.14
Chipman’s idea was simple and straightforward: If one is going to treat the
two output variables and input prices as exogenous, why not estimate the cost-
minimizing demand for inputs derived from the first-order conditions? Cost min-
imization under a Cobb-Douglas constraint leads thus to a system of three derived
demand equations linear in the logs of factor prices and the two outputs.15 From
the coefficients in each of the three equations one can derive a different estimate of
returns to scale, and thence estimates of the individual elasticities of input levels.
The problem is thus to impose appropriate constraints in the estimation for the three
equations. This is where Chipman stopped in 1957.16 As far as I know only one
person picked up the thread of Chipman’s approach, using a restricted system of
derived demand equations much later.17

11 Nerlove [9]
12 Klein [10]. This paper is briefly summarized in Klein [11]. There is a more extensive discussion
in Nerlove, op. cit., Chap. 4, pp. 61–85.
13 Chipman [12]
14 See Nerlove [9], loc. cit., pp. 78–79, for a more precise exposition of what Klein does. I also
argue there that Klein’s estimates of returns to scale are neither unbiased nor consistent.
15 See Nerlove [9], loc. cit., pp. 80–81.
16 Chipman, op. cit.
17 Hasenkamp [13, 14].


From the exercise I had given to my students the previous year, namely, to derive
the minimum-cost function for a single-output firm, I knew what to do. The
method also worked for the case of multiple outputs. Of course, in the case of
electricity supply, I had only one output. That year, 1959, with our research assistant
Dan McFadden, I gathered up the data on steam-electric generating plants, made
initial estimates of factor prices, and estimated the logarithmic cross-section cost
functions.
“Construction” of the basic cost, input, and above all price data was one of the
most difficult and time-consuming tasks. Details of how I did this for 145 firms
in 1955 are contained in Appendix B (pp. 190–192). Capital input and capital
costs were the most problematic. In the end I never got very good results for the
elasticity of output with respect to capital input which depends upon estimates of
capital prices in the cost function. Once I had estimates for total costs and the
three input prices, and of course total annual output, I could estimate a total cost
function. If I assumed that firms had one plant each and minimized cost subject to
a Cobb-Douglas production function with fixed input prices and given output, I had
estimates of the input elasticities and returns to scale. I wrote these results up and
presented them at our workshop, and at seminars at Purdue and Stanford. During the
Purdue seminar, Hirofumi Uzawa was present and sitting in the back of the room. He
asked a seemingly innocent question, “You can always do this?” To which I replied,
“I don’t know if I can always do this but for a Cobb-Douglas production function I
can.” He replied, “Not a question,” and called our attention to the Shephard Duality
theorem, which was apparently not well-known by economists at the time.
But that was not the end of the story. In the summer of 1960 I moved to Stanford
University. While working on a submittable draft of the paper, I discovered a number
of loose ends. First, I was not happy with the finding that firms in this industry
were inefficiently small. But then I realized that increasing returns might prevail
in the production of electric power but that nonetheless there might be constant or
diminishing returns to scale in the supply of electric power because of transmission
losses. I worked all this out in an appendix to the paper. There was another problem
with the results: When I plotted the residuals from the logarithmic cost regression,
they should have had the usual lens shape about a straight line; they did not. Instead
the residuals were more often positive for lower than average outputs and higher
than average outputs and correspondingly negative for those in between, that is,
they were just the opposite of the lens shape I expected to find. I cudgeled my
brain for an explanation until I came upon the obvious and simple one: Returns
to scale were not constantly increasing nor constantly diminishing as implied by a
logarithmic linear cost function, but were first strongly increasing, then less so, and
finally nearly constant or slowly decreasing. I broke the 145 firms into five groups
of 29 on the basis of output, smallest in the first group, largest in the last, and so
on. Those five regressions enabled me to test the hypothesis of variable returns to
scale. The test statistic was highly significant. Looking at the results for the separate
cost regressions, it occurred to me that if I added the square of the log of output
to the overall regression of total cost on log output and log factor prices I could
get a good fit and the normal lens-shaped plot of residuals on log output. I did so
and by Shephard’s duality theorem I had a new production function with variable
returns to scale. The logarithmic regression including the square of log output fit
extremely well and had overall residuals plotted against log output of the traditional
shape. Returns to scale were always increasing but variable and decreasing with log
output. I was not able to find a closed parametric form for the production function
corresponding to my new cost function, though by Shephard’s duality theorem I
knew that one existed and I could find the elasticities of output with respect to each
of the three inputs as they varied with scale. That was the end of my work on the
paper but not of the questions I had.
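
To make the role of the squared term concrete: if log cost is regressed on log output, its square, and log factor prices, the elasticity of cost with respect to output is the first-order coefficient plus twice the second-order coefficient times log output, and returns to scale at a given output are the reciprocal of that elasticity. The sketch below illustrates this relationship only; the coefficient names and values are hypothetical and are not the estimates from the paper.

```python
def returns_to_scale(log_y, beta_y, beta_yy):
    """Returns to scale implied by ln C = ... + beta_y ln y + beta_yy (ln y)^2.

    The cost elasticity with respect to output is beta_y + 2 * beta_yy * ln y;
    returns to scale are its reciprocal.
    """
    cost_elasticity = beta_y + 2.0 * beta_yy * log_y
    return 1.0 / cost_elasticity

# Hypothetical coefficients (assumptions for illustration only).
BETA_Y, BETA_YY = 0.15, 0.05

if __name__ == "__main__":
    for log_y in (2.0, 4.0, 6.0, 8.0):
        print(log_y, returns_to_scale(log_y, BETA_Y, BETA_YY))
```

With a positive coefficient on the squared term, the implied returns to scale fall as output grows, which is the qualitative pattern described above; with that coefficient set to zero the function collapses to the constant value implied by the log-linear cost function.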

Conclusion: Loose Ends and the Aftermath

After I had finished the paper in May of 1961, I still had some unanswered questions:

First, the Cobb-Douglas form with which I had started and the variable-returns-to-scale
production function at which I had finally arrived had elasticities
of substitution equal to one between any pair of factors. This did not square
with my intuition about how electric power was produced in 1955. In particular,
I thought that labor played a rather minimal role in steam electric generation
and therefore could not easily substitute for fuel. An alternative, the multifactor
CES production function, had an associated cost function that was relatively
easy to estimate and the parameters of which might be easily interpretable.
Unfortunately, the multifactor CES had elasticities all equal, although different
from one. This did not seem helpful.
Second, although the CES and its limiting cases, the Cobb-Douglas and the
Leontief, had associated cost functions easy to estimate and easy to interpret
in terms of the underlying production functions, I was curious as to the most
general cases in which such duals existed.

McFadden, who was then a graduate student at the University of Minnesota,
came out to Stanford to work with me on his dissertation in the summer of 1961.18
Uzawa, who had noted at the Purdue seminar that one could always determine
the production function from the cost function, had returned from UC Berkeley
as associate professor in 1961. McFadden was also keen to work with him. I raised
my concerns with them about generalizations of CES production and cost functions
and about duality more generally.
Uzawa realized the importance of the Shephard duality theorem and provided a
more accessible proof of the result in his 1961 paper.19 In response to my question
about a generalization of the usual constant elasticity of substitution production
function developed by Arrow et al.,20 Uzawa showed that the most general form
for a CES production function was one for which the indices of the input levels
could be partitioned, one partition such that the elasticities of substitution were one,
and the other for which the elasticities were a constant for any pair other than one,
that is, a Cobb-Douglas of two-factor CES functions.21

18 McFadden [15]. Much of the core of this dissertation was later published as McFadden [16], in
which McFadden generalizes the elasticity of substitution for the multifactor case.
19 Uzawa [17]

The difficulties with extending constant elasticities of substitution production
functions and duality theory spawned a whole series of subsequent papers mod-
ifying and extending production function forms among which are Christensen,
Jorgenson, and Lau, and Diewert.22 I hope I may take at least a small part of the
credit for this renaissance of production theory and practice, but I fear that the credit
belongs largely to Daniel McFadden.23

20 Arrow et al. [18]
21 Uzawa [19]
22 Christensen et al. [20]; Diewert [21]
23 Fuss and McFadden [22]

References
1. Christ CF et al (1963) Measurement in economics: studies in mathematical economics
& econometrics in memory of Yehuda Grunfeld. Stanford University Press, Stanford, pp
167–198
2. Zellner A (ed) (1968) Readings in economic statistics and econometrics. Little Brown and Co,
Boston, pp 409–439
3. Berndt ER (1990) Chapter 3, Costs, learning curves and scale economies. In: The practice of
econometrics: classic and contemporary. Addison-Wesley, Reading, pp 60–101
4. Hayashi F (2000) Chapter 1, Sec. 7 Application: returns to scale in electricity supply. In:
Econometrics. Princeton University Press, pp 60–70
5. Christensen LR, Greene WH (1976) Economies of scale in U.S. electric power generation.
J Polit Econ 84(4, Part 1):655–678
6. Shephard RW (1953) Cost and production functions. Princeton University Press, Princeton
7. Steuer MD (1951) Who was Who in America, vol IV. Marquis-Who’s Who, Inc., Chicago
8. Allen RGD (1938) Mathematical analysis for economists. Macmillan, London. Reprinted 1953
9. Nerlove M (1965) Estimation and identification of Cobb-Douglas production functions.
Rand-McNally & Co, Chicago
10. Klein LR (1947) The use of cross-section data in econometrics with application to a study
of production of railroad services in the United States. Mimeograph. National Bureau of
Economic Research
11. Klein LR (1953) A textbook of econometrics. Row Peterson, Evanston, pp 226–236
12. Chipman JS (1957) Returns to scale in the railroad industry: a reinterpretation of Klein’s data,
(abstract). Econometrica 25:607
13. Hasenkamp G (1976) Specification and estimation of multiple output production functions.
Lecture notes in economics and mathematical systems, No. 120. Springer, Berlin
14. Hasenkamp G (1976) A study of multiple-output production functions: Klein’s railroad study
revisited. J Econ 4(3):253–262
15. McFadden D (1962) Factor substitution in the economic analysis of production. PhD thesis,
University of Minnesota
16. McFadden D (1963) Constant elasticity of substitution production functions. Rev Econ Stud
30(2):73–83
17. Uzawa H (1964) Duality principles in the theory of cost and production. Int Econ Rev
5(2):216–220
18. Arrow KJ, Chenery HB, Minhas BS, Solow RM (1961) Capital-labor substitution and
economic efficiency. Rev Econ Stat 43:225–250
19. Uzawa H (1962) Production functions with constant elasticities of substitution. Rev Econ Stud
29(4):291–299
20. Christensen LR, Jorgenson DW, Lau LJ (1971) Conjugate duality and the transcendental
logarithmic functions, (abstract). Econometrica 39(4):255–256
21. Diewert WE (1971) An application of the Shephard Duality Theorem: a Generalized Leontief
production function. J Polit Econ 79(3):481–507
22. Fuss M, McFadden D (eds) (1978) Production economics: a dual approach to theory and
applications. North Holland Publishing, Amsterdam

3 Duality in Production
W. Erwin Diewert

Contents
Introduction
Cost Functions: The One Output Case
The Duality Between Cost and Production Functions
The Derivative Property of the Cost Function
The Comparative Statics Properties of Input Demand Functions
The Duality Between Constant Returns to Scale Production Functions and Their Unit Cost Functions
The Constant Elasticity of Substitution Production Function
Flexible Functional Forms for Cost Functions: The Generalized Leontief Functional Form
The Translog Functional Form
The Normalized Quadratic Unit Cost Function
The Konüs Byushgens Fisher Unit Cost Function
Semiflexible Functional Forms
The Use of Splines for Modeling Technical Progress
Allowing for Flexibility at Two Sample Points
National Product or Variable Profit Functions
The Comparative Statics Properties of Net Supply and Fixed Input Demand Functions
The Translog Variable Profit Function
The Normalized Quadratic Variable Profit Function
The KBF Variable Profit Function
Joint Cost Functions
Flexible Functional Forms for Joint Cost Functions
Applications of Joint Cost Functions
Problems that Require Additional Research
References

W. E. Diewert
Vancouver School of Economics, University of British Columbia, Vancouver, BC, Canada
School of Economics, UNSW, Sydney, NSW, Australia

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_21

Abstract

The chapter reviews the application of duality theory in production theory.
Duality theory turns out to be a useful tool for two reasons: (i) it leads to
relatively easy characterizations of the properties of systems of producer derived
demand functions for inputs and producer supply functions for outputs and (ii)
it facilitates the generation of flexible functional forms for producer demand and
supply functions that can be estimated using econometrics. The chapter focuses
on describing the properties of five functional forms that have been used in
the production literature: (i) the constant elasticity of substitution (CES), (ii) the
generalized Leontief, (iii) the translog, (iv) the normalized quadratic, and (v) the
Konüs Byushgens Fisher functional forms. The applications of GDP functions
and joint cost functions to various areas of applied economics are explained.

Keywords

Production theory · Duality theory · Cost functions · Production functions ·

Joint cost functions · National product functions · GDP functions · Variable
profit functions · Properties of producer demand and supply functions ·
Shephard’s Lemma · Hotelling’s Lemma · Samuelson’s Lemma · Flexible
functional forms · Estimation of technical progress · The valuation of public
sector outputs · Modeling monopolistic behavior · Sunk costs

JEL Classification Numbers

C02, C32, C43, D24, D42, D92, E01, E22, F11, H44, L51, M40, O47

Introduction

Duality theory is a very useful tool for estimating production functions or, more
generally, for estimating production possibilities sets. It also allows one to
derive the theoretical properties that differentiable derived input demand functions
and output supply functions must satisfy if the producer is maximizing
profits or minimizing costs. This chapter will illustrate these advantages of duality
theory in the producer context.
Sections “Cost Functions: The One Output Case,” “The Duality Between Cost
and Production Functions,” “The Derivative Property of the Cost Function,”
“The Comparative Statics Properties of Input Demand Functions” and “The
Duality Between Constant Returns to Scale Production Functions and Their
Unit Cost Functions” below will focus on the case of one output, N input
technologies. The multiple output and multiple input case will be considered
in sections “National Product or Variable Profit Functions,” “The Comparative
Statics Properties of Net Supply and Fixed Input Demand Functions,”
“The Translog Variable Profit Function,” “The Normalized Quadratic Variable
Profit Function,” “The KBF Variable Profit Function,” “Joint Cost Functions,”
“Flexible Functional Forms for Joint Cost Functions,” and “Applications of Joint
Cost Functions.”¹
The one output, many input cost function is defined in section “Cost Functions:
The One Output Case” and in section “The Duality Between Cost and Production
Functions,” the conditions on the production function that allow the cost function
to completely describe the underlying technology are listed: this establishes the
Shephard [81] Duality Theorem between cost and production functions. Section
“The Derivative Property of the Cost Function” explains Shephard’s Lemma; that
is, it shows why differentiating a cost function with respect to input prices generates
the vector of cost minimizing input demand functions. If the cost function is
twice continuously differentiable with respect to input prices, then Section “The
Comparative Statics Properties of Input Demand Functions” derives the properties
that the system of cost minimizing input demand functions must satisfy. Section
“The Duality Between Constant Returns to Scale Production Functions and Their
Unit Cost Functions” looks at the duality between cost and production functions if
production is subject to constant returns to scale, that is, if the production function
is homogeneous of degree one in inputs.
Sections “The Constant Elasticity of Substitution Production Function,” “Flexible
Functional Forms for Cost Functions: The Generalized Leontief Functional
Form,” “The Translog Functional Form,” “The Normalized Quadratic Unit Cost
Function,” and “The Konüs Byushgens Fisher Unit Cost Function” look at specific
functional forms for the cost function. The five functional forms that are studied are
(i) the constant elasticity of substitution (CES), (ii) the generalized Leontief, (iii) the
translog, (iv) the normalized quadratic, and (v) the Konüs Byushgens Fisher (KBF)
functional forms. The last four functional forms are flexible functional forms; that
is, they can provide a second order approximation to an arbitrary twice continuously
differentiable unit cost function at any arbitrary price point.2 A major problem with

1 In sections “Cost Functions: The One Output Case,” “The Duality Between Cost and Production
Functions,” “The Derivative Property of the Cost Function,” “The Comparative Statics Properties
of Input Demand Functions,” “The Duality Between Constant Returns to Scale Production
Functions and Their Unit Cost Functions,” “The Constant Elasticity of Substitution Production
Function,” “Flexible Functional Forms for Cost Functions: The Generalized Leontief Functional
Form,” “The Translog Functional Form,” “The Normalized Quadratic Unit Cost Function,” “The
Konüs Byushgens Fisher Unit Cost Function,” “Semiflexible Functional Forms,” “The Use of
Splines for Modeling Technical Progress,” “Allowing for Flexibility at Two Sample Points,”
“National Product or Variable Profit Functions,” “The Comparative Statics Properties of Net
Supply and Fixed Input Demand Functions,” “The Translog Variable Profit Function,” “The
Normalized Quadratic Variable Profit Function,” “The KBF Variable Profit Function,” “Joint Cost
Functions,” and “Flexible Functional Forms for Joint Cost Functions,” it will be assumed that
the producer takes prices as given constants in each period. Section “Applications of Joint Cost
Functions” extends the analysis to the case of monopolistic behavior.
2 Diewert and Wales ([45], 89–92) discuss some additional flexible functional forms that are not

discussed here. These alternative functional forms have various problems.


flexible functional forms is the curvature problem; that is, an estimated flexible
functional form for a unit cost function may violate the concavity in prices property
that cost functions must satisfy. It turns out that the normalized quadratic and KBF
functional forms are such that the correct curvature conditions can be imposed
without destroying the flexibility of the functional form.3
Section “Semiflexible Functional Forms” introduces the concept of a semiflexible
functional form. A major problem with the use of a flexible functional form is that
it requires the estimation of roughly N^2/2 parameters if there are N inputs. The
semiflexible concept reduces this large number of parameters in a sensible way.
Section “The Use of Splines for Modeling Technical Progress” shows how piecewise
linear functions of time can be used to model technical progress in a more
general manner than just using linear time trends in the demand functions. Section
“Allowing for Flexibility at Two Sample Points” shows how a flexible functional
form can be generalized to achieve the second order approximation property at two
sample points if we are estimating production functions in the time series context.4
Section “National Product or Variable Profit Functions” introduces Samuelson’s
[79] National Product Function or the variable profit function. This function
conditions on a vector of fixed inputs and maximizes the value of outputs less
variable inputs. The comparative statics properties of this function are developed in
section “The Comparative Statics Properties of Net Supply and Fixed Input Demand
Functions.” Sections “The Translog Variable Profit Function,” “The Normalized
Quadratic Variable Profit Function,” and “The KBF Variable Profit Function” look at

3 On a personal note, I did my thesis on flexible functional forms and, with the help of Daniel
McFadden (my thesis advisor), I came up with the Generalized Leontief cost function as my first
attempt at finding a “perfect” functional form that was flexible, parsimonious (i.e., had the minimal
number of parameters to be estimated that would enable it to be flexible) and generated derived
demand (or supply) functions that were either linear or close to linear in the unknown parameters
in order to facilitate econometric estimation. I was a graduate student at Berkeley at the time
(1964–1968) and I met frequently with Dale Jorgenson. He and his student at the time, Lawrence
Lau, realized that instead of taking a quadratic form in the square roots of input prices, one could
take a quadratic form in the logarithms of prices as a functional form for the logarithm of the
cost function and the translog functional form was born. However, empirical applications of these
functional forms soon showed that these functional forms had a drawback: it was not possible to
impose the correct concavity or convexity properties on these flexible functional forms without
destroying the flexibility of the functional form. In the 1980s, Diewert and Wales came up with
the normalized quadratic functional form which was flexible, parsimonious and had the property
that the correct curvature conditions could be imposed without impairing the flexibility property.
However, in order to preserve the parsimony property, one had to pick a more or less arbitrary
alpha vector and imbed it into the functional form as we will see later in this chapter. But different
choices of alpha could generate perhaps substantially different estimates for demand and supply
elasticities. The last flexible functional form that we will discuss in this chapter, the KBF functional
form, overcomes this difficulty and hence completes our quest for the “perfect” flexible functional
form.
4 It should be noted that our analysis is geared to the time series context. Much of our analysis can

be translated to the cross sectional context.


three flexible functional forms for this function: (i) the translog, (ii) the normalized
quadratic, and (iii) the KBF variable profit functions. The systems of estimating
equations that these functional forms generate are also exhibited.
Sections “Joint Cost Functions,” “Flexible Functional Forms for Joint Cost
Functions,” and “Applications of Joint Cost Functions” develop the properties of
joint cost functions; that is, these functions generalize the one output cost function
to a cost function for multiple output producers. Section “Flexible Functional Forms
for Joint Cost Functions” looks at three flexible functional forms for this function:
(i) the translog, (ii) the normalized quadratic, and (iii) the KBF joint cost functions.
The latter two functions have the property that the correct curvature conditions
can be imposed on them without destroying their flexibility properties. Section
“Applications of Joint Cost Functions” looks at applications of joint cost functions
to: (i) problems associated with the measurement of the outputs of public sector
producers in the System of National Accounts, (ii) the measurement of the efficiency
of regulated utilities, and (iii) the estimation of technology sets when producers have
some monopoly power.
Section “Problems that Require Additional Research” concludes with a listing of
three problems that are not addressed in this chapter and require further research.
It may be useful to use this chapter as part of a course in microeconomic
theory or in production theory. To facilitate this use, the author has added many
straightforward problems that the instructor can assign to students. These problems
are also an efficient way of extending the results presented in the main text.

Cost Functions: The One Output Case

The production function and the corresponding cost function play a central role
in many economic applications. In the following section, we will show that under
certain conditions, the cost function is a sufficient statistic for the corresponding
production function; that is, if we know the cost function of a producer, then this
cost function can be used to generate the underlying production function.
Let the producer’s production function f(x) denote the maximum amount of
output that can be produced in a given time period, given that the producer has
access to the nonnegative vector of inputs, x ≡ [x1, ..., xN] ≥ 0N.⁵ If the production
function satisfies the minimal regularity condition of continuity from above,⁶ then
given any positive output level y that the technology can produce and any strictly

5 Notation: x ≥ 0N means each component of the vector x is nonnegative, x > 0N means x ≥ 0N
and x ≠ 0N, and x >> 0N means each component of x is positive. p^T x ≡ Σ_{n=1}^N p_n x_n. Vectors are
understood to be column vectors when it matters.
6 We require that f be continuous from above for the minimum to the cost minimization problem

to exist; i.e., for every output level y that can be produced by the technology (so that y ∈ Range
f), we require that the set of x’s that can produce at least output level y (this is the upper level set
L(y) ≡ {x : f(x) ≥ y}) is a closed set in RN .
positive vector of input prices p ≡ [p1, ..., pN] >> 0N, we can calculate the
producer’s cost function C(y,p) as the solution value to the following constrained
minimization problem:

$$C(y, p) \equiv \min_x \{ p^T x : f(x) \ge y;\ x \ge 0_N \}. \tag{1}$$

It turns out that the cost function C will satisfy the following seven properties,
provided that the production function is continuous from above:⁷

Theorem 1 Diewert ([27], 107–114):⁸ Suppose f is continuous from above. Then
C defined by Eq. (1) has the following properties:

Property 1 C(y,p) is a nonnegative function.

Property 2 C(y,p) is positively linearly homogeneous in p for each fixed y; that is,

C (y, λp) = λC (y, p) (2)

for all λ > 0, p >> 0N and y∈Range f (i.e., y is an output level that is producible by
the production function f).

Property 3 C(y,p) is nondecreasing in p for each fixed y ∈ Range f; that is,

$$y \in \text{Range } f,\ 0_N \ll p^1 < p^2 \ \text{ implies } \ C(y, p^1) \le C(y, p^2). \tag{3}$$

Property 4 C(y,p) is a concave function of p for each fixed y ∈ Range f; that is,

$$y \in \text{Range } f,\ p^1 \gg 0_N;\ p^2 \gg 0_N;\ 0 < \lambda < 1 \ \text{ implies } \ C(y, \lambda p^1 + (1-\lambda) p^2) \ge \lambda C(y, p^1) + (1-\lambda) C(y, p^2). \tag{4}$$

Property 5 C(y,p) is a continuous function of p for each fixed y ∈ Range f.

Property 6 C(y,p) is nondecreasing in y for fixed p; that is,

$$p \gg 0_N,\ y^1 \in \text{Range } f,\ y^2 \in \text{Range } f,\ y^1 < y^2 \ \text{ implies } \ C(y^1, p) \le C(y^2, p). \tag{5}$$

7 Note that this minimal regularity condition cannot be contradicted using a finite data set.
8 For the history of closely related results, see Diewert ([22], 116–120).
Property 7 For every p >> 0N, C(y,p) is continuous from below in y; that is,

$$y^* \in \text{Range } f,\ y^n \in \text{Range } f \text{ for } n = 1, 2, \ldots,\ y^n \le y^{n+1},\ \lim_{n\to\infty} y^n = y^* \ \text{ implies } \ \lim_{n\to\infty} C(y^n, p) = C(y^*, p). \tag{6}$$
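Definition (1) and several of the properties in Theorem 1 can be checked numerically on a simple example. The sketch below is an illustration only: it assumes a hypothetical two-input technology f(x) = (x1 x2)^(1/2), which is not taken from the chapter, and the function name `cost` and the search bounds are also assumptions. Because this f is increasing, the output constraint binds, so the problem reduces to a one-dimensional search along the isoquant. For this particular technology the closed form is C(y, p) = 2y(p1 p2)^(1/2), which gives an independent check on the search.

```python
import math


def cost(y, p1, p2, iters=200):
    """Approximate C(y, p) = min {p.x : f(x) >= y, x >= 0} for y > 0, p >> 0.

    Assumed technology (hypothetical): f(x1, x2) = sqrt(x1 * x2).
    On the isoquant f(x) = y we have x2 = y**2 / x1, so we minimise
    p1*x1 + p2*y**2/x1 over t = log(x1) by golden-section search.
    The objective is convex in t; the bracket [-20, 20] is an assumption
    that is wide enough for moderate values of y and p.
    """
    def objective(t):
        x1 = math.exp(t)
        return p1 * x1 + p2 * y * y / x1

    lo, hi = -20.0, 20.0
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        a = hi - ratio * (hi - lo)
        b = lo + ratio * (hi - lo)
        if objective(a) < objective(b):
            hi = b  # the minimiser lies in [lo, b]
        else:
            lo = a  # the minimiser lies in [a, hi]
    return objective(0.5 * (lo + hi))


y, p = 2.0, (3.0, 5.0)
base = cost(y, *p)

# Property 2: linear homogeneity in prices, C(y, 2.5 p) = 2.5 C(y, p).
print("homogeneity gap:", abs(cost(y, 2.5 * p[0], 2.5 * p[1]) - 2.5 * base))

# Property 3: raising one input price cannot lower cost.
print("nondecreasing in p:", cost(y, 3.0, 6.0) >= base)

# Property 6: raising the output target cannot lower cost.
print("nondecreasing in y:", cost(3.0, *p) >= base)

# Closed form for this technology only: 2 * y * sqrt(p1 * p2).
print("closed-form gap:", abs(base - 2.0 * y * math.sqrt(p[0] * p[1])))
```

A numerical check of this kind does not prove any of the properties; it only confirms that they hold at the sampled points for the assumed technology.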

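Proof of Property 1 Let y ∈ Range f and p >> 0N. Then

$$\begin{aligned}
C(y, p) &\equiv \min_x \{ p^T x : f(x) \ge y;\ x \ge 0_N \} \\
&= p^T x^* && \text{where } x^* \ge 0_N \text{ and } f(x^*) \ge y \\
&\ge 0 && \text{since } p \gg 0_N \text{ and } x^* \ge 0_N.
\end{aligned}$$

Proof of Property 2 Let y ∈ Range f, p >> 0N and λ > 0. Then

$$\begin{aligned}
C(y, \lambda p) &\equiv \min_x \{ \lambda p^T x : f(x) \ge y;\ x \ge 0_N \} \\
&= \lambda \min_x \{ p^T x : f(x) \ge y;\ x \ge 0_N \} && \text{since } \lambda > 0 \\
&= \lambda C(y, p) && \text{using the definition of } C(y, p).
\end{aligned}$$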

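Proof of Property 3 Let y ∈ Range f, 0N << p^1 < p^2. Then

$$\begin{aligned}
C(y, p^2) &\equiv \min_x \{ p^{2T} x : f(x) \ge y;\ x \ge 0_N \} \\
&= p^{2T} x^* && \text{where } f(x^*) \ge y \text{ and } x^* \ge 0_N \\
&\ge p^{1T} x^* && \text{since } x^* \ge 0_N \text{ and } p^2 > p^1 \\
&\ge \min_x \{ p^{1T} x : f(x) \ge y;\ x \ge 0_N \} && \text{since } x^* \text{ is feasible for this problem} \\
&\equiv C(y, p^1).
\end{aligned}$$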

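Proof of Property 4 Let y ∈ Range f, p^1 >> 0N; p^2 >> 0N; 0 < λ < 1. Then

$$\begin{aligned}
C(y, \lambda p^1 + (1-\lambda) p^2) &\equiv \min_x \{ [\lambda p^1 + (1-\lambda) p^2]^T x : f(x) \ge y;\ x \ge 0_N \} \\
&= [\lambda p^1 + (1-\lambda) p^2]^T x^* \quad \text{where } x^* \ge 0_N \text{ and } f(x^*) \ge y \\
&= \lambda p^{1T} x^* + (1-\lambda) p^{2T} x^* \\
&\ge \lambda \min_x \{ p^{1T} x : f(x) \ge y;\ x \ge 0_N \} + (1-\lambda) p^{2T} x^* \\
&= \lambda C(y, p^1) + (1-\lambda) p^{2T} x^* \\
&\ge \lambda C(y, p^1) + (1-\lambda) \min_x \{ p^{2T} x : f(x) \ge y;\ x \ge 0_N \} \\
&= \lambda C(y, p^1) + (1-\lambda) C(y, p^2).
\end{aligned}$$

The first inequality holds since x* is feasible for the cost minimization problem
that uses the price vector p^1 and since λ > 0; the equality that follows it uses the
definition of C(y, p^1). The second inequality holds since x* is feasible for the cost
minimization problem that uses the price vector p^2 and since 1 − λ > 0; the final
equality uses the definition of C(y, p^2).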

Figure 1 below illustrates why this concavity property holds.

In Fig. 1, the isocost line {x : p^{1T} x = C(y, p^1)} is tangent to the production
possibilities set L(y) ≡ {x : f(x) ≥ y, x ≥ 0N} at the point x^1 and the isocost line
{x : p^{2T} x = C(y, p^2)} is tangent to the production possibilities set L(y) at the point x^2.
Note that the point x** belongs to both of these isocost lines. Thus, x** will belong
to any weighted average of the two isocost lines. The λ and 1 − λ weighted average
isocost line is the set {x : [λp^1 + (1 − λ)p^2]^T x = λC(y, p^1) + (1 − λ)C(y, p^2)}
and this set is the dotted line through x** in Fig. 1. This dotted line lies below⁹
the parallel dotted line that is just tangent to L(y), which is the isocost line
{x : [λp^1 + (1 − λ)p^2]^T x = [λp^1 + (1 − λ)p^2]^T x* = C(y, λp^1 + (1 − λ)p^2)}
and it is this fact that gives us the concavity inequality (4).
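The geometric argument can be mimicked with a technology that has only finitely many ways of producing y. The sketch below is a hedged illustration: the list `TECHNIQUES` and the function names are hypothetical and do not come from the chapter. With a finite set of techniques, the cost function at fixed y is the minimum of finitely many linear functions of p, so inequality (4) can be checked directly.

```python
# Hypothetical technology: each tuple is an input bundle that can produce the
# fixed output level y. These numbers are illustrative assumptions.
TECHNIQUES = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]


def cost(p, techniques=TECHNIQUES):
    """C(y, p) at the fixed y: the cheapest technique at input prices p."""
    return min(p[0] * x[0] + p[1] * x[1] for x in techniques)


def mix(p1, p2, lam):
    """The weighted average price vector lam * p1 + (1 - lam) * p2."""
    return tuple(lam * a + (1.0 - lam) * b for a, b in zip(p1, p2))


def concavity_gap(p1, p2, lam):
    """Left side minus right side of inequality (4); it should be >= 0."""
    return cost(mix(p1, p2, lam)) - (lam * cost(p1) + (1.0 - lam) * cost(p2))


p1, p2 = (1.0, 5.0), (5.0, 1.0)
for lam in (0.25, 0.5, 0.75):
    print(lam, concavity_gap(p1, p2, lam))
```

The gap is strictly positive in this example because the technique that is cheapest at the averaged prices differs from the ones chosen at p^1 and p^2; it would be zero if the same bundle were cost minimizing at both price vectors.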

9 It can happen that the two dotted lines coincide.


Fig. 1 The concavity in prices property of the cost function

Proof of Property 5 Since C(y,p) is a concave function of p defined over the open
set of p’s, Ω ≡ {p : p >> 0N}, it follows that C(y,p) is also continuous in p over this
domain of definition set for each fixed y ∈ Range f.¹⁰

Proof of Property 6 Let p >> 0N, y^1 ∈ Range f, y^2 ∈ Range f, y^1 < y^2. Then

$$\begin{aligned}
C(y^2, p) &\equiv \min_x \{ p^T x : f(x) \ge y^2;\ x \ge 0_N \} \\
&\ge \min_x \{ p^T x : f(x) \ge y^1;\ x \ge 0_N \} \\
&\equiv C(y^1, p),
\end{aligned}$$

where the inequality holds since, if y^1 < y^2, the set {x : f(x) ≥ y^2} is a subset of the
set {x : f(x) ≥ y^1} and the minimum of a linear function over a bigger set cannot
increase.

Proof of Property 7 The proof is rather technical and may be found in Diewert
([27], 113–114).

10 See Fenchel ([52], 75) or Rockafellar ([77], 82).


Problems
1. In industrial organization,¹¹ it once was fairly common to assume that a firm’s
cost function had the following linear functional form: C(y, p) ≡ α + β^T p + γy
where α and γ are scalar parameters and β is a vector of parameters to be
estimated econometrically. What are sufficient conditions on these N + 2
parameters for this cost function to satisfy properties 1–7 above? Is the resulting
cost function very realistic?
2. Suppose a producer’s production function, f(x), defined for x ∈ S where
S = {x : x ≥ 0N} satisfies the following conditions:
(i) f is continuous over S
(ii) f(x) > 0 if x >> 0N and
(iii) f is positively linearly homogeneous over S; that is, for every x ≥ 0N and
λ > 0, f(λx) = λf(x)
Define the producer’s unit cost function c(p) for p >> 0N as follows:
(iv) c(p) ≡ C(1, p) ≡ min_x {p^T x : f(x) ≥ 1; x ≥ 0N}; that is, c(p) is the minimum
cost of producing one unit of output if the producer faces the positive input
price vector p. For y > 0 and p >> 0N, show that
(v) C(y, p) = c(p)y
Note: A production function f that satisfies property (iii) is said to exhibit
constant returns to scale. The interpretation of (v) is that if a production
function exhibits constant returns to scale, then total cost is equal to unit cost
times the output level.¹²
3. Shephard ([81], 4) defined a production function F to be homothetic if it could
be written as
(i) F(x) = g[f(x)]; x ≥ 0N
where f satisfies conditions (i)–(iii) in Problem 2 above and g(z), defined
for all z ≥ 0, satisfies the following regularity conditions:
(ii) g(z) is positive if z > 0
(iii) g is a continuous function of one variable and
(iv) g is monotonically increasing; that is, if 0 ≤ z1 < z2, then g(z1) < g(z2).
Let C(y,p) be the cost function that corresponds to F(x). Show that under the
above assumptions, for y > 0 and p >> 0N, we have
(v) C(y, p) = g^{-1}(y)c(p)
where c(p) is the unit cost function that corresponds to the linearly
homogeneous f and g^{-1} is the inverse function for g; that is, g^{-1}[g(z)] = z for
all z ≥ 0. Note that g^{-1}(y) is a monotonically increasing continuous function
of one variable.

11 For example, see Walters [85].
12 We will study the unit cost function in more detail in section “The Duality Between Constant
Returns to Scale Production Functions and Their Unit Cost Functions” below.

The Duality Between Cost and Production Functions

The material in the previous section shows how the cost function can be determined
from knowledge of the production function. We now ask whether knowledge of
the cost function is sufficient to determine the underlying production function. The
answer to this question is yes, but with some qualifications.
To see how we might use a given cost function (satisfying the seven regularity
conditions listed in the previous section) to determine the production function that
generated it, pick an arbitrary feasible output level y > 0 and an arbitrary vector of
positive prices, p^1 >> 0N, and use the given cost function C to define the following
isocost surface: {x : p^{1T} x = C(y, p^1)}. This isocost surface must be tangent to the
set of feasible input combinations x that can produce at least output level y, which
is the upper level set, L(y) ≡ {x : f(x) ≥ y; x ≥ 0N}. It can be seen that this isocost
surface and the set lying above it must contain the upper level set L(y); that is, the
following halfspace M(y,p^1) contains L(y):

$$M(y, p^1) \equiv \{ x : p^{1T} x \ge C(y, p^1) \}. \tag{7}$$

Pick another positive vector of prices, p^2 >> 0N, and it can be seen, repeating
the above argument, that the halfspace M(y, p^2) ≡ {x : p^{2T} x ≥ C(y, p^2)} must also
contain the upper level set L(y). Thus, L(y) must belong to the intersection of the
two halfspaces M(y,p^1) and M(y,p^2). Continuing to argue along these lines, it can
be seen that L(y) must be contained in the following set, which is the intersection of
all of the supporting halfspaces to L(y):

$$M(y) \equiv \bigcap_{p \gg 0_N} M(y, p). \tag{8}$$

Note that M(y) is defined using just the given cost function, C(y,p). Note also
that since each of the sets in the intersection, M(y,p), is a convex set, then M(y) is
also a convex set. Since L(y) is a subset of each M(y,p), it must be the case that L(y)
is also a subset of M(y); that is, we have

L (y) ⊂ M (y) . (9)
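The inclusion (9) can be strict, and a small numerical sketch may make this concrete. Everything in the block below is a hypothetical illustration rather than the chapter's own example: the two-technique technology, the function names, and the price grid are assumptions. The upper level set is taken to be every input vector that weakly dominates one of two techniques, which is a nonconvex set, and membership in M(y) is tested by checking the halfspace inequality in (7) on a grid of normalized prices. Because C(y,p) and p^T x are both linearly homogeneous in p, restricting prices to sum to one loses nothing.

```python
# Hypothetical techniques that can each produce the fixed output level y.
TECHNIQUES = [(1.0, 3.0), (3.0, 1.0)]


def in_L(x):
    """x is in L(y) if it weakly dominates some technique (free disposal)."""
    return any(x[0] >= t[0] and x[1] >= t[1] for t in TECHNIQUES)


def cost(p):
    """C(y, p) at the fixed y for this finite technology."""
    return min(p[0] * t[0] + p[1] * t[1] for t in TECHNIQUES)


def in_M(x, grid=200, tol=1e-12):
    """Approximate test of x in M(y): p.x >= C(y, p) on a grid of prices.

    Prices are p = (w, 1 - w) with 0 < w < 1. Passing the grid test is only
    evidence of membership, since a finite grid cannot cover every p >> 0.
    """
    for k in range(1, grid):
        w = k / grid
        p = (w, 1.0 - w)
        if p[0] * x[0] + p[1] * x[1] < cost(p) - tol:
            return False
    return True


for x in [(1.0, 3.0), (2.0, 2.0), (1.0, 1.0)]:
    print(x, "in L(y):", in_L(x), " in M(y):", in_M(x))
```

In this sketch the point (2, 2) is an average of the two techniques, so it passes every halfspace test even though it cannot produce y; it lies in the region that M(y) adds when it convexifies L(y).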

Is it the case that L(y) is equal to M(y)? In general, the answer is no; M(y)
forms an outer approximation to the true production possibilities set L(y). To see
why this is, see Fig. 1 above. The boundary of the set M(y) partly coincides with
the boundary of L(y) but it encloses a bigger set: the backward bending parts of
the isoquant {x : f(x) = y} are replaced by the dashed lines that are parallel to
the x1 axis and the x2 axis and the inward bending part of the true isoquant is
replaced by the dashed line that is tangent to the two regions where the boundary
of M(y) coincides with the boundary of L(y). However, if the producer is a price
taker in input markets, then it can be seen that we will never observe the producer’s
nonconvex portions or backwards bending parts of the isoquant.13 Thus, under the
assumption of competitive behavior in input markets, there is no loss of generality
in assuming that the producer’s production function is nondecreasing (this will
eliminate the backward bending isoquants) or in assuming that the upper level sets of
the production function are convex sets (this will eliminate the nonconvex portions
of the upper level sets). A function has convex upper level sets if and only if it is
quasiconcave.14
Putting the above material together, we see that conditions on the production
function f(x) that are necessary for the sets M(y) and L(y) to coincide are:¹⁵

f(x) is defined for x ≥ 0N and is continuous from above over this domain of definition set; (10)

f is nondecreasing and (11)

f is quasiconcave. (12)

Theorem 2 Shephard Duality Theorem:¹⁶ If f satisfies Eqs. (10), (11), and (12),
then the cost function C defined by Eq. (1) satisfies the properties listed in
Theorem 1 above and the upper level sets M(y) defined by Eq. (8) using only the
cost function coincide with the upper level sets L(y) defined using the production
function; that is, under these regularity conditions, the production function and the
cost function determine each other.
We now consider how an explicit formula for the production function in terms of the
cost function can be obtained. Suppose we have a given cost function, C(y,p), and we
are given a strictly positive input vector, x >> 0N, and we ask what is the maximum
output that this x can produce. It can be seen that

13 Hotelling ([62], 74) made this point many years ago.

14 f is a quasiconcave function defined over a convex subset S of RN if f has the following property:
x1 ∈ S, x2 ∈ S, 0 < λ < 1 implies f(λx1 + (1 − λ)x2 ) ≥ min {f(x1 ), f(x2 )}; see Fenchel ([52], 117).
15 Since each of the sets M(y,p) in the intersection set M(y) defined by Eq. (8) are closed, it can be

shown that M(y) is also a closed set. Hence if M(y) is to coincide with L(y), we need the upper
level sets of f to be closed sets and this will hold if and only if f is continuous from above.
16 Shephard [81, 82] was the pioneer in establishing various duality theorems between cost and

production functions. See also Samuelson [79], Uzawa [84], McFadden [72, 73], Diewert ([20,
22], 116–118) and Blackorby et al. [6] for various duality theorems under alternative regularity
conditions. Our exposition follows that of Diewert ([27], 107–117). These duality theorems
are global in nature; i.e., the production and cost functions satisfy their appropriate regularity
conditions over their entire domains of definition. However, it is also possible to develop duality
theorems that are local rather than global; see Blackorby and Diewert [5].
$$\begin{aligned}
f(x) &= \max_y \{ y : x \in M(y) \} \\
&= \max_y \{ y : C(y, p) \le p^T x \text{ for every } p \gg 0_N \} && \text{using definitions (7) and (8)} \\
&= \max_y \{ y : C(y, p) \le 1 \text{ for every } p \gg 0_N \text{ such that } p^T x = 1 \}
\end{aligned} \tag{13}$$

where the last equality follows using the fact that C(y,p) is linearly homogeneous in
p as is the function pT x and hence we can normalize the prices so that pT x = 1.
We now consider the continuity properties of C(y,p) with respect to p. We have
defined C(y,p) for all strictly positive price vectors p and since this domain of
definition set is open, we know that C(y,p) is also continuous in p over this set,
using the concavity in prices property of C. We would like to extend the domain of
definition of C(y,p) from the strictly positive orthant of prices, Ω ≡ {p : p >> 0N}, to
the nonnegative orthant, Clo Ω ≡ {p : p ≥ 0N}, which is the closure of Ω. It turns
out that it is possible to do this if we make use of some theorems in convex analysis.

Theorem 3 Continuity from above of a concave function using the Fenchel closure
operation: Fenchel ([52], 78): Let f(x) be a concave function of N variables defined
over the open convex subset S of RN . Then there exists a unique extension of f to
Clo S, the closure of S, which is concave and continuous from above.

Proof Using one of Fenchel’s ([52], 57) characterizations of concavity, the hypograph
of f, H ≡ {(y, x) : y ≤ f(x); x ∈ S}, is a convex set in R^{N+1}. Hence, the closure
of H, Clo H, is also a convex set. Hence, the following function f* defined over Clo
S is also a concave function:

$$\begin{aligned}
f^*(x) &= \max_y \{ y : (y, x) \in \text{Clo } H \}; && x \in \text{Clo } S \\
&= f(x) && \text{for } x \in S.
\end{aligned} \tag{14}$$

Since Clo H is a closed set, it turns out that f* is continuous from above.
To see that the extension function f* need not be continuous, consider the
following example, where the domain of definition set is S ≡ {(x1, x2) : x2 ∈ R^1, x1 ≥
x2^2} in R^2:

$$f(x_1, x_2) \equiv \begin{cases} -x_2^2 / x_1 & \text{if } x_2 \ne 0,\ x_1 \ge x_2^2; \\ 0 & \text{if } x_1 = 0 \text{ and } x_2 = 0. \end{cases} \tag{15}$$

It is possible to show that f is concave and hence continuous over the interior of
S; see Problem 4 below. However, it can be shown that f is not continuous at (0,0).
Let (x1,x2) approach (0,0) along the line x1 = x2 > 0. Then

$$\lim_{x_1 \to 0} f(x_1, x_2) = \lim_{x_1 \to 0} [-x_1^2 / x_1] = \lim_{x_1 \to 0} [-x_1] = 0. \tag{16}$$

Now let (x1,x2) approach (0,0) along the parabolic path x2 > 0 and x1 = x2^2.
Then

$$\lim_{x_2 \to 0;\ x_1 = x_2^2} f(x_1, x_2) = \lim_{x_2 \to 0} [-x_2^2 / x_2^2] = -1. \tag{17}$$

Thus, f is not continuous at (0,0). It can be verified that restricting f to Int S and
then extending f to the closure of S (which is S) leads to the same f* as is defined by
Eq. (15). Thus, the Fenchel closure operation does not always result in a continuous
concave function.
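The two limits in Eqs. (16) and (17) are easy to reproduce numerically. The short sketch below evaluates the function in Eq. (15) along the two paths used in the text; the function name `f` and the sample step sizes are illustrative choices, and the function is taken to equal zero whenever x2 = 0, which agrees with the formula −x2^2/x1 when x1 > 0.

```python
def f(x1, x2):
    """The example in Eq. (15): -x2**2 / x1 on S, and 0 at the origin."""
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return -(x2 * x2) / x1


for t in (1e-1, 1e-3, 1e-6):
    along_line = f(t, t)          # path x1 = x2 = t, as in Eq. (16)
    along_parabola = f(t * t, t)  # path x1 = x2**2, as in Eq. (17)
    print(t, along_line, along_parabola)
```

Along the line the values shrink toward 0 as t falls, while along the parabola they stay at −1 however small t becomes, which is the discontinuity at (0,0) described above.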
Theorem 4 below states sufficient conditions for the Fenchel closure of a concave
function defined over an open domain of definition set to be continuous over the
closure of the original domain of definition. Fortunately, the hypotheses of this
Theorem are weak enough to cover most economic applications. Before stating the
Theorem, we need an additional definition.

Definition A set S in RN is a polyhedral set iff S is equal to the intersection of a

finite number of halfspaces.

Theorem 4 Continuity of a concave function using the Fenchel closure operation;

Gale et al. [57], Rockafellar ([77], 85): Let f be a concave function of N variables
defined over an open convex polyhedral set S. Suppose f is bounded from below over
every bounded subset of S. Then the Fenchel closure extension of f to the closure of
S results in a continuous concave function defined over Clo S.
The proof of this result is too involved to reproduce here but we can now apply
this result.
Applying Theorem 4, we can extend the domain of definition of C(y,p) from strictly
positive price vectors p to nonnegative price vectors using the Fenchel closure
operation and hence C(y,p) will be continuous and concave in p over the set
{p : p ≥ 0N} for each y in the interval of feasible outputs.¹⁷
Now return to the problem where we have a given cost function, C(y,p), we are
given a strictly positive input vector, x >> 0N, and we ask what is the maximum
output that this x can produce. Repeating the analysis in Eq. (13), we have

17 If f(0N) = 0 and f(x) tends to plus infinity as the components of x tend to plus infinity, then the
feasible y set will be y ≥ 0 and C(y,p) will be defined for all y ≥ 0 and p ≥ 0N.

$$\begin{aligned}
f(x) &= \max_y \{ y : x \in M(y) \} \\
&= \max_y \{ y : C(y, p) \le p^T x \text{ for every } p \gg 0_N \} \\
&= \max_y \{ y : C(y, p) \le 1 \text{ for every } p \gg 0_N \text{ such that } p^T x = 1 \} \\
&= \max_y \{ y : C(y, p) \le 1 \text{ for every } p \ge 0_N \text{ such that } p^T x = 1 \} \\
&= \max_y \{ y : G(y, x) \le 1 \}
\end{aligned} \tag{18}$$

The second line uses definitions (7) and (8). The third line uses the linear
homogeneity in prices property of C. In the fourth line, we have extended the domain
of definition of C(y,p) to nonnegative prices from positive prices and used the
continuity of the extension function over the set of nonnegative prices. In the last
line, the function G(y,x) is defined as follows:

$$G(y, x) \equiv \max_p \{ C(y, p) : p \ge 0_N \text{ and } p^T x = 1 \}. \tag{19}$$

Note that the maximum in Eq. (19) will exist since C(y,p) is continuous in p and
the feasible region for the maximization problem, {p : p ≥ 0N and p^T x = 1}, is a
closed and bounded set.¹⁸ Property 7 on the cost function C(y,p) will imply that
the maximum in the last line of Eq. (18) will exist. Property 6 on the cost function
will imply that for fixed x, G(y,x) is nondecreasing in y. Typically, G(y,x) will be
continuous in y for a fixed x and so the maximum y that solves Eq. (18) will be the
y* that satisfies the following equation:¹⁹

$$G(y^*, x) = 1. \tag{20}$$

Thus, Eqs. (19) and (20) implicitly define the production function y* = f(x) in
terms of the cost function C.
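The recipe in Eqs. (18), (19), and (20) can be sketched numerically for two inputs. The block below is only an illustration under stated assumptions: the cost function C(y, p) = 2y^2(p1 p2)^(1/2) is a hypothetical example, and the function names, grid size, and upper bound on y are all illustrative choices. The maximization in Eq. (19) is done over the budget share w = p1 x1, so that p1 = w/x1 and p2 = (1 − w)/x2 trace out every nonnegative price vector with p^T x = 1, and Eq. (20) is solved by bisection using the fact that G(y,x) is nondecreasing in y.

```python
import math


def cost(y, p1, p2):
    """Hypothetical cost function: C(y, p) = 2 * y**2 * sqrt(p1 * p2)."""
    return 2.0 * y * y * math.sqrt(p1 * p2)


def G(y, x1, x2, grid=2000):
    """Eq. (19): max of C(y, p) over p >= 0 with p1*x1 + p2*x2 = 1.

    Requires x1 > 0 and x2 > 0. The feasible prices are parametrised by the
    share w in [0, 1], and the maximum is approximated on a grid of w values.
    """
    best = 0.0
    for k in range(grid + 1):
        w = k / grid
        best = max(best, cost(y, w / x1, (1.0 - w) / x2))
    return best


def recovered_output(x1, x2, y_hi=1e3, iters=100):
    """Eqs. (18) and (20): the largest y with G(y, x) <= 1, by bisection.

    y_hi is an assumed upper bound; it must satisfy G(y_hi, x) > 1.
    """
    lo, hi = 0.0, y_hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if G(mid, x1, x2) <= 1.0:
            lo = mid
        else:
            hi = mid
    return lo


print(recovered_output(4.0, 4.0))
print(recovered_output(1.0, 16.0))
```

For this assumed cost function the maximum in Eq. (19) occurs at w = 1/2, which gives G(y, x) = y^2/(x1 x2)^(1/2), so the recovered production function is (x1 x2)^(1/4). Both sample input vectors have x1 x2 = 16, so both calls should return a value close to 2.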

Problems
4. Show that the f(x1 ,x2 ) defined by Eq. (15) above is a concave function over the
interior of the domain of definition set S. You do not have to show that S is a
convex set.

18 Here is where we use the assumption that x >> 0N in order to obtain the boundedness of this set.
19 This method for constructing the production function from the cost function may be found in
Diewert ([22], 119).

5. In the case where the technology is subject to constant returns to scale, the
cost function has the following form: C(y, p) = yc(p) where c(p) is a unit cost
function. For x >> 0N, define the function g(x) as follows:
(i) g(x) ≡ max_p {c(p) : p^T x = 1; p ≥ 0N}.
Show that in this constant returns to scale case, the function G(y,x)
defined by Eq. (19) reduces to
(ii) G(y, x) = yg(x).
Show that in this constant returns to scale case, the production function
that is dual to the cost function has the following explicit formula for
x >> 0N:
(iii) f(x) = 1/g(x).

6. Let x ≥ 0 be input (a scalar number) and let y = f(x) ≥ 0 be the maximum output
that could be produced by input x, where f is the production function. Suppose
that f is defined as the following step function:

$$
f(x) \equiv
\begin{cases}
0 & \text{for } 0 \le x < 1; \\
1 & \text{for } 1 \le x < 2; \\
2 & \text{for } 2 \le x < 3;
\end{cases}
\tag{i}
$$

and so on. Thus, the technology cannot produce fractional units of output
and it takes one full unit of input to produce each unit of output. It can be
verified that this production function is continuous from above.
(a) Calculate the cost function C(y,1) that corresponds to this production
function; that is, set the input price equal to one and try to determine
the corresponding total cost function C(y,1). It will turn out that this cost
function is continuous from below in y.
(b) Graph both the production function y = f(x) and the cost function
c(y) ≡ C(y, 1).
7. Suppose that a producer’s cost function is defined as follows for y ≥ 0, p1 > 0
and p2 > 0:
$$
C(y, p_1, p_2) \equiv \left[b_{11}p_1 + 2b_{12}(p_1p_2)^{1/2} + b_{22}p_2\right] y \tag{i}
$$

where the bij parameters are all positive.


(a) Show that this cost function is concave in the input prices p1 ,p2 . Note: this
is the two input case of the generalized Leontief cost function defined by
Diewert [20].
(b) Calculate an explicit functional form for the corresponding production
function f(x1 ,x2 ) where we assume that x1 > 0 and x2 > 0. Hint: This
part of the problem is not completely straightforward. You will obtain a
quadratic equation but which root is the right one?

The Derivative Property of the Cost Function

Theorem 2, the Shephard Duality Theorem, is mainly of academic interest: if the
production function f satisfies properties (10), (11), and (12), then the corresponding
cost function C defined by Eq. (1) satisfies the properties listed in Theorem 1 above
and moreover completely determines the production function. However, it is the
next property of the cost function that makes duality theory so useful in applied
economics.

Theorem 5 Shephard’s ([81], 11) Lemma: If the cost function C(y,p) satisfies the
properties listed in Theorem 1 above and in addition is once differentiable with
respect to the components of input prices at the point (y*,p*) where y* is in the
range of the production function f and p* ≫ 0N, then

$$
x^{*} = \nabla_p C(y^{*}, p^{*}) \tag{21}
$$

where ∇pC(y*,p*) is the vector of first order partial derivatives of cost with respect
to input prices, [∂C(y*,p*)/∂p1, . . . , ∂C(y*,p*)/∂pN]T, and x* is any solution to the
cost minimization problem

$$
\min_x \{p^{*T}x : f(x) \ge y^{*}\} \equiv C(y^{*}, p^{*}). \tag{22}
$$

Under these differentiability hypotheses, it turns out that the x* solution to Eq.
(22) is unique.

Proof Let x* be any solution to the cost minimization problem (22). Since x* is
feasible for the cost minimization problem when the input price vector is changed
to an arbitrary p ≫ 0N, it follows that

$$
p^{T}x^{*} \ge C(y^{*}, p) \quad \text{for every } p \gg 0_N. \tag{23}
$$

Since x* is a solution to the cost minimization problem (22) when p = p*, we
must have

$$
p^{*T}x^{*} = C(y^{*}, p^{*}). \tag{24}
$$

But Eqs. (23) and (24) imply that the function of N variables, g(p) ≡ pTx* − C(y*,p),
is nonnegative for all p ≫ 0N with g(p*) = 0. Hence, g(p) attains a global
minimum at p = p* and since g(p) is differentiable with respect to the input prices
p at this point, the following first order necessary conditions for a minimum must
hold at this point:

$$
\nabla_p g(p^{*}) = x^{*} - \nabla_p C(y^{*}, p^{*}) = 0_N. \tag{25}
$$

Now note that Eq. (25) is equivalent to Eq. (21). If x** is any other solution to
the cost minimization problem (22), then repeat the above argument to show that

$$
x^{**} = \nabla_p C(y^{*}, p^{*}) = x^{*} \tag{26}
$$

where the second equality follows using Eq. (25). Hence, x** = x* and the solution
to Eq. (22) is unique.
The above result has the following implication: postulate a differentiable functional
form for the cost function C(y,p) that satisfies the regularity conditions listed
in Theorem 1 above. Then differentiating C(y,p) with respect to the components of
the input price vector p generates the firm’s system of cost minimizing input demand
functions, x(y, p) ≡ ∇ p C(y, p).
Shephard [81] was the first person to establish the above result starting with
just a cost function satisfying the appropriate regularity conditions.20 However,
Hotelling ([61], 594) stated a version of the result in the context of profit functions
and Hicks ([60], 331) and Samuelson ([79], 15–16) established the result starting
with a differentiable utility or production function.
One application of the above result is its use as an aid in generating systems
of cost minimizing input demand functions that are linear in the parameters that
characterize the technology. For example, suppose that the cost function had the
following generalized Leontief functional form 21 :
$$
C(y,p) \equiv \sum_{i=1}^{N}\sum_{k=1}^{N} b_{ik}\, p_i^{1/2} p_k^{1/2}\, y; \quad b_{ik} = b_{ki} \text{ for } 1 \le i < k \le N \tag{27}
$$

where the N(N + 1)/2 independent bik parameters are all nonnegative. With
these nonnegativity restrictions, it can be verified that the C(y,p) defined by Eq.
(27) satisfies properties 1–7 listed in Theorem 1.22 Applying Shephard’s Lemma
shows that the system of cost minimizing input demand functions that corresponds
to this functional form is given by:

$$
x_i(y,p) = \partial C(y,p)/\partial p_i = \sum_{k=1}^{N} b_{ik}\,(p_k/p_i)^{1/2}\, y; \quad i = 1, 2, \dots, N. \tag{28}
$$

20 This is why Diewert ([22], 112) called the result Shephard’s Lemma. See also Fenchel ([52],
104). We have used the technique of proof used by McKenzie [74].
21 See Diewert [20].
22 Using problem 7 above, it can be seen that if the bik are nonnegative and y is positive, then
the functions bik pi^(1/2) pk^(1/2) y are concave in the components of p. Hence, since a sum of concave
functions is concave, it can be seen that the C(y,p) defined by Eq. (27) is concave in the components
of p.

Errors can be added to the system of Eq. (28) and the parameters bik can be
estimated using linear regression techniques if we have time series or cross sectional
data on output, inputs and input prices.23 If all of the bij equal zero for i ≠ j, then
the demand functions become:

$$
x_i(y,p) = \partial C(y,p)/\partial p_i = b_{ii}\, y; \quad i = 1, 2, \dots, N. \tag{29}
$$

Note that input prices do not appear in the system of input demand functions
defined by Eq. (29) so that input quantities do not respond to changes in the relative
prices of inputs. The corresponding production function is known as the Leontief
[71] production function.24 Hence, it can be seen that the production function that
corresponds to Eq. (28) is a generalization of this production function.
We will consider additional functional forms for a cost function in subsequent
sections.
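
The way Shephard’s Lemma turns a cost function into a system of input demand functions can be checked numerically. The Python sketch below evaluates the generalized Leontief cost function (27), compares a finite-difference price gradient with the closed form demands in Eq. (28), and confirms that the demands exhaust cost. The parameter matrix B_PARAMS, the output level Y, the price vector P, and the step size h are hypothetical values chosen only for illustration.

```python
import math

def gl_cost(y, p, b):
    """Generalized Leontief cost (27): C(y,p) = y * sum_i sum_k b[i][k] * sqrt(p[i]*p[k])."""
    n = len(p)
    return y * sum(b[i][k] * math.sqrt(p[i] * p[k]) for i in range(n) for k in range(n))

def gl_demand(y, p, b):
    """Input demands (28): x_i(y,p) = y * sum_k b[i][k] * sqrt(p[k]/p[i])."""
    n = len(p)
    return [y * sum(b[i][k] * math.sqrt(p[k] / p[i]) for k in range(n)) for i in range(n)]

def numerical_gradient(y, p, b, h=1e-6):
    """Central finite-difference approximation to the price gradient of C."""
    grad = []
    for i in range(len(p)):
        up, down = list(p), list(p)
        up[i] += h
        down[i] -= h
        grad.append((gl_cost(y, up, b) - gl_cost(y, down, b)) / (2.0 * h))
    return grad

# Hypothetical symmetric, nonnegative parameters and an arbitrary data point.
B_PARAMS = [[1.0, 0.5, 0.2],
            [0.5, 2.0, 0.3],
            [0.2, 0.3, 1.5]]
Y, P = 4.0, [1.0, 2.0, 3.0]

if __name__ == "__main__":
    x = gl_demand(Y, P, B_PARAMS)
    print("demands from Eq. (28)     :", x)
    print("finite-difference gradient:", numerical_gradient(Y, P, B_PARAMS))
    print("p'x =", sum(pi * xi for pi, xi in zip(P, x)), " C =", gl_cost(Y, P, B_PARAMS))
```

Setting the off-diagonal entries of the parameter matrix to zero reproduces the price-independent demands of Eq. (29).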

The Comparative Statics Properties of Input Demand Functions

Before we develop the main result in this section, it will be useful to establish
some results about the derivatives of a twice continuously differentiable linearly
homogeneous function of N variables. We say that f(x), defined for x ≫ 0N, is
positively homogeneous of degree α iff f has the following property:

$$
f(\lambda x) = \lambda^{\alpha} f(x) \quad \text{for all } x \gg 0_N \text{ and } \lambda > 0. \tag{30}
$$

A special case of the above definition occurs when the number α in the above
definition equals 1. In this case, we say that f is (positively) linearly homogeneous25
iff

$$
f(\lambda x) = \lambda f(x) \quad \text{for all } x \gg 0_N \text{ and } \lambda > 0. \tag{31}
$$

23 Note that b12 will appear in the first input demand equation and in the second as well using
the cross equation symmetry condition, b21 = b12. There are N(N − 1)/2 such cross equation
symmetry conditions and we could test for their validity or impose them in order to save degrees of
freedom. The nonnegativity restrictions that ensure global concavity of C(y,p) in p can be imposed
if we replace each parameter bik by a squared parameter, (aik)2. However, the resulting system of
estimating equations is no longer linear in the unknown parameters.
24 The Leontief production function can be defined as f(x1, . . . , xN) ≡ min_i {xi/bii : i = 1, . . . , N}.
It is also known as the no substitution production function. Note that this production function is
not differentiable even though its cost function is differentiable.
25 Usually in economics, we omit the adjective “positively” but it is understood that the λ which
appears in definitions (30) and (31) is restricted to be positive.

Theorem 6 Euler’s Theorems on Differentiable Homogeneous Functions: Let f(x)
be a (positively) linearly homogeneous function of N variables, defined for x ≫ 0N.
Part 1: If the first order partial derivatives of f exist, then they satisfy the
following equation:

$$
f(x) = \sum_{n=1}^{N} x_n\, \partial f(x_1,\dots,x_N)/\partial x_n = x^{T}\nabla f(x) \quad \text{for all } x \gg 0_N. \tag{32}
$$

Part 2: If the second order partial derivatives of f exist, then they satisfy the
following equations:

$$
\sum_{k=1}^{N} \left[\partial^{2} f(x_1,\dots,x_N)/\partial x_n \partial x_k\right] x_k = 0 \quad \text{for all } x \gg 0_N \text{ and } n = 1,\dots,N. \tag{33}
$$

The N equations in (33) can be written using matrix notation in a much more
compact form as follows:

$$
\nabla^{2} f(x)\, x = 0_N \quad \text{for all } x \gg 0_N. \tag{34}
$$

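Proof of Part 1 Let x ≫ 0N and λ > 0. Differentiating both sides of Eq. (31) with
respect to λ leads to the following equation using the composite function chain rule:

$$
\begin{aligned}
f(x) &= \sum_{n=1}^{N} \left[\partial f(\lambda x_1,\dots,\lambda x_N)/\partial(\lambda x_n)\right]\left[\partial(\lambda x_n)/\partial\lambda\right] \\
&= \sum_{n=1}^{N} \left[\partial f(\lambda x_1,\dots,\lambda x_N)/\partial(\lambda x_n)\right] x_n.
\end{aligned}
\tag{35}
$$

Now evaluate Eq. (35) at λ = 1 and we obtain Eq. (32).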

Proof of Part 2 Let x ≫ 0N and λ > 0. For n = 1, . . . , N, differentiate both sides
of Eq. (31) with respect to xn and we obtain the following N equations:

$$
\begin{aligned}
f_n(\lambda x_1,\dots,\lambda x_N)\,\partial(\lambda x_n)/\partial x_n &= \lambda f_n(x_1,\dots,x_N) && \text{for } n = 1,\dots,N \text{ or} \\
f_n(\lambda x_1,\dots,\lambda x_N)\,\lambda &= \lambda f_n(x_1,\dots,x_N) && \text{for } n = 1,\dots,N \text{ or} \\
f_n(\lambda x_1,\dots,\lambda x_N) &= f_n(x_1,\dots,x_N) && \text{for } n = 1,\dots,N
\end{aligned}
\tag{36}
$$

where the nth first order partial derivative function is defined as fn(x1, . . . , xN) ≡
∂f(x1, . . . , xN)/∂xn for n = 1, . . . , N.26 Now differentiate both sides of the last set
of equations in (36) with respect to λ and we obtain the following N equations:

$$
\begin{aligned}
0 &= \sum_{k=1}^{N} \left[\partial f_n(\lambda x_1,\dots,\lambda x_N)/\partial x_k\right]\left[\partial(\lambda x_k)/\partial\lambda\right] \quad \text{for } n = 1,\dots,N \\
&= \sum_{k=1}^{N} \left[\partial f_n(\lambda x_1,\dots,\lambda x_N)/\partial x_k\right] x_k.
\end{aligned}
\tag{37}
$$

Now evaluate Eq. (37) at λ = 1 and we obtain the N equations in (33).

26 Using definition (30) for the case where α = 0, it can be seen that the last set of equations in
(36) shows that the first order partial derivative functions of a linearly homogeneous function are
homogeneous of degree 0.
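
Both parts of Euler’s Theorem are easy to check numerically. The Python sketch below uses a hypothetical linearly homogeneous test function (a three input Cobb-Douglas function whose exponents sum to one), approximates its gradient and Hessian by central finite differences, and evaluates the two sides of Eqs. (32) and (34). The test function, the evaluation point X, and the step sizes are assumptions chosen for illustration; the finite-difference derivatives are only approximations, so the identities hold up to a small numerical error.

```python
def f(x):
    """Hypothetical linearly homogeneous test function: x1^0.2 * x2^0.3 * x3^0.5."""
    return x[0] ** 0.2 * x[1] ** 0.3 * x[2] ** 0.5

def gradient(func, x, h=1e-5):
    """Central finite-difference gradient of func at x."""
    g = []
    for n in range(len(x)):
        up, down = list(x), list(x)
        up[n] += h
        down[n] -= h
        g.append((func(up) - func(down)) / (2.0 * h))
    return g

def hessian(func, x, h=1e-4):
    """Finite-difference Hessian: difference the numerical gradient once more."""
    size = len(x)
    rows = []
    for n in range(size):
        up, down = list(x), list(x)
        up[n] += h
        down[n] -= h
        g_up, g_down = gradient(func, up), gradient(func, down)
        rows.append([(g_up[k] - g_down[k]) / (2.0 * h) for k in range(size)])
    return rows

X = [1.5, 2.0, 0.7]  # arbitrary strictly positive point

if __name__ == "__main__":
    g = gradient(f, X)
    H = hessian(f, X)
    print("f(x)              :", f(X))
    print("x' grad f(x)      :", sum(xn * gn for xn, gn in zip(X, g)))       # Eq. (32)
    print("rows of Hessian*x :", [sum(H[n][k] * X[k] for k in range(3)) for n in range(3)])  # Eq. (34)
```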

The above results can be applied to the cost function, C(y,p). From Theorem 1,
C(y,p) is linearly homogeneous in p. Hence, by part 2 of Euler’s Theorem, if the
second order partial derivatives of the cost function with respect to the components
of the input price vector p exist, then these derivatives satisfy the following
restrictions:

$$
\nabla^{2}_{pp} C(y,p)\, p = 0_N. \tag{38}
$$

Theorem 7 Diewert ([27], 148–150): Suppose the cost function C(y,p) satisfies
the properties listed in Theorem 1 and in addition is twice continuously
differentiable with respect to the components of its input price vector at some
point, (y,p). Then the system of cost minimizing input demand equations,
x(y, p) ≡ [x1 (y, p), . . . , xN (y, p)]T , exists at this point and these input demand
functions are once continuously differentiable. Form the N by N matrix of
input demand derivatives with respect to input prices, B ≡ [∂xi (y, p)/∂pj ], which
has ij element equal to ∂xi (y, p)/∂pj . Then the matrix B has the following
properties27,28,29 :

$$
B = B^{T} \text{ so that } \partial x_i(y,p)/\partial p_k = \partial x_k(y,p)/\partial p_i \text{ for all } i \ne k; \tag{39}
$$

$$
B \text{ is negative semidefinite and} \tag{40}
$$

$$
Bp = 0_N. \tag{41}
$$

Proof
Shephard’s Lemma implies that the firm’s system of cost minimizing input
demand equations, x(y, p) ≡ [x1(y, p), . . . , xN(y, p)]T, exists and is equal to

$$
x(y,p) = \nabla_p C(y,p). \tag{42}
$$

27 These are the Hicks ([60], 311) and Samuelson ([78], 69) symmetry restrictions. Hotelling ([61],
549) obtained analogues to these symmetry conditions in the profit function context.
28 Hicks ([60], 311) and Samuelson ([78], 69) also obtained versions of this result by starting with
the production (or utility) function f(x), assuming that the first order conditions for solving the cost
minimization problem held and that the strong second order sufficient conditions for the primal
cost minimization problem also held. Thus using duality theory, we obtain the same results under
weaker regularity conditions.
29 Hicks ([60], 331) and Samuelson ([78], 69) also obtained this result using their primal technique.

Differentiating both sides of Eq. (42) with respect to the components of p gives
us

$$
B \equiv \left[\partial x_i(y,p)/\partial p_k\right] = \nabla^{2}_{pp} C(y,p). \tag{43}
$$

Now property (39) follows from Young’s Theorem in calculus. Property (40)
follows from Eq. (43) and the fact that C(y,p) is concave in p and the fourth
characterization of concavity. Finally, property (41) follows from the fact that the
cost function is linearly homogeneous in p and hence Eq. (38) holds.
Note that property (40) implies the following properties on the input demand
functions:

$$
\partial x_n(y,p)/\partial p_n \le 0 \quad \text{for } n = 1,\dots,N. \tag{44}
$$

Property (44) means that input demand curves cannot be upward sloping.
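
Properties (39), (40), (41), and (44) can be illustrated for the generalized Leontief cost function (27), whose price Hessian is available in closed form. The Python sketch below builds the matrix B, checks symmetry and Bp = 0N directly, and probes negative semidefiniteness by evaluating the quadratic form zTBz at random vectors z. The parameter values, the data point, and the random probing (which can refute but cannot prove semidefiniteness) are assumptions made for illustration only.

```python
import math
import random

def gl_price_hessian(y, p, b):
    """B = Hessian in prices of the generalized Leontief cost (27), b symmetric."""
    n = len(p)
    B = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            if i != k:
                # Cross derivative of x_i = y * sum_k b_ik * sqrt(p_k / p_i) with respect to p_k.
                B[i][k] = 0.5 * y * b[i][k] / math.sqrt(p[i] * p[k])
                # Contribution of the same term to the own price derivative.
                B[i][i] -= 0.5 * y * b[i][k] * math.sqrt(p[k]) / p[i] ** 1.5
    return B

def max_random_quadratic_form(B, draws=1000, seed=0):
    """Largest value of z'Bz over random z; it should not be positive if B is NSD."""
    rng = random.Random(seed)
    n = len(B)
    worst = -math.inf
    for _ in range(draws):
        z = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        q = sum(z[i] * B[i][k] * z[k] for i in range(n) for k in range(n))
        worst = max(worst, q)
    return worst

# Hypothetical symmetric, nonnegative parameters and an arbitrary data point.
B_PARAMS = [[1.0, 0.5, 0.2],
            [0.5, 2.0, 0.3],
            [0.2, 0.3, 1.5]]
Y, P = 4.0, [1.0, 2.0, 3.0]

if __name__ == "__main__":
    B = gl_price_hessian(Y, P, B_PARAMS)
    print("B p             :", [sum(B[i][k] * P[k] for k in range(3)) for i in range(3)])
    print("own price slopes:", [B[i][i] for i in range(3)])
    print("max z'Bz sampled:", max_random_quadratic_form(B))
```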
If the cost function is also differentiable with respect to the output variable y, then
we can deduce an additional property about the first order derivatives of the input
demand functions. The linear homogeneity property of C(y,p) in p implies that the
following equation holds for all λ > 0:

$$
C(y, \lambda p) = \lambda C(y,p) \quad \text{for all } \lambda > 0 \text{ and } p \gg 0_N. \tag{45}
$$

Partially differentiating both sides of Eq. (45) with respect to y leads to the
following equation:

$$
\partial C(y, \lambda p)/\partial y = \lambda\, \partial C(y,p)/\partial y \quad \text{for all } \lambda > 0 \text{ and } p \gg 0_N. \tag{46}
$$

But Eq. (46) implies that the function ∂C(y, p)/∂y is linearly homogeneous in p
and hence part 1 of Euler’s Theorem applied to this function gives us the following
equation:
$$
\partial C(y,p)/\partial y = \sum_{n=1}^{N} p_n\, \partial^{2} C(y,p)/\partial y \partial p_n = p^{T}\nabla^{2}_{yp} C(y,p). \tag{47}
$$

But using Eq. (42), it can be seen that Eq. (47) is equivalent to the following
equation30 :
$$
\partial C(y,p)/\partial y = \sum_{n=1}^{N} p_n\, \partial x_n(y,p)/\partial y. \tag{48}
$$

30 This method of deriving these restrictions is due to Diewert ([27], 150) but these restrictions

were originally derived by Samuelson ([78], 66) using his primal cost minimization method.

Problems
8. For i ≠ k, the inputs i and k are said to be substitutes if ∂xi(y, p)/∂pk = ∂xk(y, p)/
∂pi > 0, unrelated if ∂xi(y, p)/∂pk = ∂xk(y, p)/∂pi = 0 31 and complements if
∂xi(y, p)/∂pk = ∂xk(y, p)/∂pi < 0. (a) If N = 2, show that the two inputs cannot
be complements. (b) If N = 2 and ∂x1(y, p)/∂p1 = 0, then show that all of
the remaining input demand price derivatives are equal to 0; that is, show that
∂x1(y, p)/∂p2 = ∂x2(y, p)/∂p1 = ∂x2(y, p)/∂p2 = 0. (c) If N = 3, show that at
most one pair of inputs can be complements.32
9. Let N ≥ 3 and suppose that ∂x1 (y, p)/∂p1 = 0. Then show that ∂x1 (y, p)/∂pn = 0
as well for n = 2, 3, . . . , N. Hint: You may need to use the definition of negative
semidefiniteness in a strategic way. This problem shows that if the own input
elasticity of demand for an input is 0, then that input is unrelated to all other
inputs.
10. Recall the definition (27) of the generalized Leontief cost function where the
parameters bij were all assumed to be nonnegative. Show that under these
nonnegativity restrictions, every input pair is either unrelated or substitutes.
Hint: Simply calculate ∂²C(y, p)/∂pi∂pk for i ≠ k and look at the resulting
formula. Comment: This result shows that if we impose the nonnegativity
conditions bik ≥ 0 for i ≠ k on this functional form in order to ensure that
it is globally concave in prices, then we have a priori ruled out any form of
complementarity between the inputs. This means if the number of inputs N is
greater than 2, this nonnegativity restricted functional form cannot be a flexible
functional form 33 for a cost function; that is, it cannot attain an arbitrary pattern
of demand derivatives that are consistent with microeconomic theory, since the
nonnegativity restrictions rule out any form of complementarity.
11. Suppose that a producer’s three input production function has the following
Cobb Douglas [17] functional form:
$$
f(x_1, x_2, x_3) \equiv x_1^{\alpha_1} x_2^{\alpha_2} x_3^{\alpha_3} \tag{a}
$$

where α1 > 0, α2 > 0, α3 > 0 and α1 + α2 + α3 = 1. Let the positive input prices
p1 > 0, p2 > 0, p3 > 0 and the positive output level y > 0 be given.
(i) Calculate the producer’s cost function, C(y,p1 ,p2 ,p3 ) along with the
three input demand functions, x1 (y,p1 ,p2 ,p3 ), x2 (y,p1 ,p2 ,p3 ), and
x3(y,p1,p2,p3). Hint: Use the usual Lagrangian technique for solving
constrained minimization problems. You do not need to check the
second order conditions for the problem. The positive constant
k ≡ α1^(−α1) α2^(−α2) α3^(−α3) will appear in the cost function.

31 Pollak ([76], 67) used the term “unrelated” in a similar context.
32 This result is due to Hicks ([60], 311–312): “It follows at once from Rule (5) that, while it is
possible for all other goods consumed to be substitutes for x1, it is not possible for them all to be
complementary with it.”
33 Diewert ([22], 115) introduced the term “flexible functional form” to describe a functional
form for a cost function (or production function) that could approximate an arbitrary cost
function (consistent with microeconomic theory) to the second order around any given point.
The Generalized Leontief cost function defined by Eq. (27) above is flexible for the class of cost
functions that are dual to linearly homogeneous production functions if we do not impose any
restrictions on the parameters bij; see Diewert [20] or section “Flexible Functional Forms for Cost
Functions: The Generalized Leontief Functional Form” below for a proof of this fact.
(ii) Calculate the input one demand elasticity with respect to output
[∂x1 (y, p1 , p2 , p3 )/∂y][y/x1 (y, p1 , p2 , p3 )] and the three input one
demand elasticities with respect to input prices [∂x1 (y, p1 , p2 , p3 )/∂pn ]
[pn /x1 (y, p1 , p2 , p3 )] for n = 1, 2, 3.
(iii) Show that −1 < [∂x1 (y, p1 , p2 , p3 )/∂p1 ][p1 /x1 (y, p1 , p2 , p3 )] < 0.
(iv) Show that 0 < [∂x1 (y, p1 , p2 , p3 )/∂p2 ][p2 /x1 (y, p1 , p2 , p3 )] < 1.
(v) Show that 0 < [∂x1 (y, p1 , p2 , p3 )/∂p3 ][p3 /x1 (y, p1 , p2 , p3 )] < 1.
(vi) Can any pair of inputs be complementary if the technology is a three
input Cobb Douglas?
Comment: The Cobb Douglas functional form is widely used in macroeconomics
and in applied general equilibrium models. However, this problem
shows that it is not satisfactory if N ≥ 3. Even in the N = 2 case where
analogues to (iii) and (iv) above hold, it can be seen that this functional form
is not consistent with technologies where the degree of substitution between
inputs is very high or very low.
12. Suppose that the second order partial derivatives with respect to input prices
of the cost function C(y,p) exist so that the nth cost minimizing input demand
function xn (y, p) = ∂C(y, p)/∂pn > 0 exists for n = 1, . . . , N. Define the input n
elasticity of demand with respect to input price k as follows:
$$
e_{nk}(y,p) \equiv \left[\partial x_n(y,p)/\partial p_k\right]\left[p_k/x_n(y,p)\right] \quad \text{for } n = 1,\dots,N \text{ and } k = 1,\dots,N. \tag{a}
$$

Show that for each n, $\sum_{k=1}^{N} e_{nk}(y,p) = 0$.
13. Let the producer’s cost function be C(y,p), which satisfies the regularity
conditions in Theorem 1 and, in addition, is once differentiable with respect
to the components of the input price vector p. Then the nth input demand
function is xn (y, p) ≡ ∂C(y, p)/∂pn for n = 1, . . . , N. Input n is defined to
be normal at the point (y,p) if ∂xn (y, p)/∂y = ∂ 2 C(y, p)/∂pn ∂y > 0; that is, if
the cost minimizing demand for input n increases as the target output level y
increases. On the other hand, input n is defined to be inferior at the point (y,p) if
∂xn (y, p)/∂y = ∂ 2 C(y, p)/∂pn ∂y < 0. Prove that not all N inputs can be inferior
at the point (y,p). Hint: Make use of Eq. (48).
14. If the production function f dual to the differentiable cost function C(y,p)
exhibits constant returns to scale so that f(λx) = λf(x) for all x ≥ 0N and
all λ > 0, then show that for each n, the input n elasticity of demand with
respect to the output level y is 1; that is, show that for n = 1, . . . , N,
[∂xn (y, p)/∂y][y/xn (y, p)] = 1.

15. Let C(y,p) be a twice continuously differentiable cost function that satisfies the
regularity conditions listed in Theorem 1 in section “Cost Functions: The One
Output Case” above. By Shephard’s Lemma, the input demand functions are
given by
$$
x_n(y,p) = \partial C(y,p)/\partial p_n > 0; \quad n = 1,\dots,N. \tag{i}
$$

The Allen ([1], 504), Uzawa [83] elasticity of substitution σnk between
inputs n and k is defined as follows for 1 ≤ n, k ≤ N:

$$
\sigma_{nk}(y,p) \equiv \frac{C(y,p)\,\{\partial^{2} C(y,p)/\partial p_n \partial p_k\}}{\{\partial C(y,p)/\partial p_n\}\{\partial C(y,p)/\partial p_k\}}. \tag{ii}
$$

Define Σ = [σnk(y, p)] as the N by N matrix of elasticities of substitution.
(a) Show that Σ has the following properties:

$$
\Sigma = \Sigma^{T}; \tag{iii}
$$

$$
\Sigma \text{ is negative semidefinite and} \tag{iv}
$$

$$
\Sigma s = 0_N \tag{v}
$$

where s ≡ [s1, . . . , sN]T is the vector of cost shares; that is,
sn ≡ pn xn(y, p)/C(y, p) for n = 1, . . . , N. Now define the N by N matrix
of cross price elasticities of demand E in a manner analogous to definition
(ii) above:

$$
\begin{aligned}
E &\equiv [e_{nk}], \quad n = 1,\dots,N;\ k = 1,\dots,N \\
&= \left[(p_k/x_n)\, \partial x_n(y,p)/\partial p_k\right] \\
&= \left[(p_k/x_n)\, \partial^{2} C(y,p)/\partial p_n \partial p_k\right] \quad \text{using (i)} \\
&= \hat{x}^{-1}\, \nabla^{2}_{pp} C(y,p)\, \hat{p}.
\end{aligned}
\tag{vi}
$$

(b) Show that E = Σŝ where ŝ is an N by N diagonal matrix with the
elements of the share vector s running down the main diagonal.

The Duality Between Constant Returns to Scale Production Functions and Their Unit Cost Functions

In this section, we will add more structure to the production function: we will
assume that f(x) is subject to constant returns to scale so that f(λx) = λf(x) for
every nonnegative input vector x ≥ 0N and nonnegative scalar λ ≥ 0.
In many areas of applied economics, constant returns to scale in production are
assumed. Samuelson [80] justified this assumption as an approximation to reality
by using a plant replication argument. He assumed that there was a plant size that
minimized average cost and showed if this optimal plant size output level was
small relative to the size of the market, then by replicating optimal size plants,
the industry production function would approximate a constant returns to scale
production function.34 Thus, in this section, we will assume constant returns to
scale in production and see what additional properties the resulting cost function
must satisfy.
Before we develop a formal duality theorem, it is necessary to prove a useful
mathematical result.

Theorem 8 Berge ([3], 208): If f is a positive, linearly homogeneous and quasiconcave
function defined over the positive orthant in RN, Ω, then f is also concave
over Ω.

Proof Let x¹ ≫ 0N, x² ≫ 0N and 0 < λ < 1. We need to show that:

$$
f(\lambda x^{1} + (1-\lambda)x^{2}) \ge \lambda f(x^{1}) + (1-\lambda) f(x^{2}). \tag{49}
$$

Without loss of generality, we can assume 0 < f(x¹) ≤ f(x²). Let μ > 0 be the
scalar that causes f(μx²) to equal f(x¹). Using the constant returns to scale property
of f, μ can be defined as follows:

$$
\mu \equiv f(x^{1})/f(x^{2}) > 0. \tag{50}
$$

Points on the line segment joining the point x¹ to μx² can be represented by
αx¹ + (1 − α)μx² where 0 ≤ α ≤ 1. The quasiconcavity property of f implies that
the following inequality holds for all α such that 0 ≤ α ≤ 1:

$$
f(x^{1}) \le f(\alpha x^{1} + (1-\alpha)\mu x^{2}). \tag{51}
$$

Define β > 0 as the proportionality factor that deflates the point λx¹ + (1 − λ)x²
onto the line segment joining the point x¹ to μx². Thus, we have:

$$
\beta\left[\lambda x^{1} + (1-\lambda)x^{2}\right] = \alpha x^{1} + (1-\alpha)\mu x^{2}. \tag{52}
$$

34 Diewert [26] elaborated on Samuelson’s results.

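Thus, the unknown α and β must be the solution to the following two equations:

$$
\beta\lambda = \alpha; \quad \beta(1-\lambda) = (1-\alpha)\mu. \tag{53}
$$

The solution to Eq. (53) is β = μ/(1 − λ + λμ) and α = λμ/(1 − λ + λμ).
It is straightforward to show that the solution satisfies β > 0 and 0 ≤ α ≤ 1. Now
substitute Eq. (52) into (51) and we obtain the following inequality:

$$
\begin{aligned}
f(x^{1}) &\le f(\alpha x^{1} + (1-\alpha)\mu x^{2}) \\
&= f(\beta[\lambda x^{1} + (1-\lambda)x^{2}]) && \text{using (52)} \\
&= \beta f(\lambda x^{1} + (1-\lambda)x^{2}) && \text{using the linear homogeneity of } f \\
&= [\mu/(1-\lambda+\lambda\mu)]\, f(\lambda x^{1} + (1-\lambda)x^{2}).
\end{aligned}
\tag{54}
$$

Thus, Eq. (54) implies:

$$
\begin{aligned}
f(\lambda x^{1} + (1-\lambda)x^{2}) &\ge \mu^{-1}(1-\lambda+\lambda\mu)\, f(x^{1}) \\
&= \mu^{-1}(1-\lambda) f(x^{1}) + \lambda f(x^{1}) \\
&= \lambda f(x^{1}) + (1-\lambda) f(x^{2}) && \text{using definition (50).}
\end{aligned}
\tag{55}
$$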
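
The construction in the proof can be traced numerically. The Python sketch below takes a hypothetical positive, linearly homogeneous, quasiconcave function (a two input Cobb-Douglas function), computes μ from Eq. (50) and the solution (α, β) of Eq. (53), checks that Eq. (52) holds, and compares the two sides of the concavity inequality (49). The test function and the sample points are assumptions chosen for illustration.

```python
def f(x):
    """Hypothetical positive, linearly homogeneous, quasiconcave function."""
    return x[0] ** 0.4 * x[1] ** 0.6

def berge_check(x1, x2, lam):
    """Trace Eqs. (49)-(53) for the points x1, x2 and the weight lam in (0, 1)."""
    if f(x1) > f(x2):
        # Relabel so that f(x1) <= f(x2), as assumed without loss of generality.
        x1, x2, lam = x2, x1, 1.0 - lam
    mu = f(x1) / f(x2)                                   # Eq. (50)
    denom = 1.0 - lam + lam * mu
    beta, alpha = mu / denom, lam * mu / denom           # solution of Eq. (53)
    mix = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    # Largest componentwise gap between the two sides of Eq. (52).
    residual = max(abs(beta * m - (alpha * a + (1.0 - alpha) * mu * b))
                   for m, a, b in zip(mix, x1, x2))
    return {
        "mu": mu, "alpha": alpha, "beta": beta,
        "eq52_residual": residual,
        "lhs": f(mix),                                   # left side of (49)
        "rhs": lam * f(x1) + (1.0 - lam) * f(x2),        # right side of (49)
    }

if __name__ == "__main__":
    print(berge_check([1.0, 4.0], [3.0, 2.0], 0.3))
```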

The above result will prove to be useful in what follows. Recall that in section
“Cost Functions: The One Output Case” above, we initially assumed that the
production function f(x) only satisfied continuity from above. We continue to make
this very weak regularity assumption but we now assume that in addition, f satisfies
the following linear homogeneity property:

$$
f(\lambda x) = \lambda f(x) \quad \text{for all } \lambda \ge 0 \text{ and } x \ge 0_N. \tag{56}
$$

We also assume that there exists an x∗ > 0N such that y∗ ≡ f(x∗ ) > 0; that is, there
exists a nonnegative, nonzero input vector x* which can produce a positive output.
This assumption along with the constant returns to scale assumption (56) means that
the technology can produce any positive output level.

Let y > 0 and p ≫ 0N. We can define the total cost function that corresponds
to our homogeneous production function using definition (1) again; that is, define
C(y,p) as follows:

$$
\begin{aligned}
C(y,p) &\equiv \min_x \{p^{T}x : f(x) \ge y;\ x \ge 0_N\} \\
&= \min_x \{p^{T}x : y^{-1}f(x) \ge 1;\ x \ge 0_N\} \\
&= \min_x \{p^{T}x : f(x/y) \ge 1;\ x \ge 0_N\} && \text{using (56)} \\
&= \min_x \{y\,p^{T}(x/y) : f(x/y) \ge 1;\ x/y \ge 0_N\} \\
&= y \min_z \{p^{T}z : f(z) \ge 1;\ z \ge 0_N\} && \text{letting } z = x/y \\
&= y\,c(p)
\end{aligned}
\tag{57}
$$

where c(p) is the unit cost function that corresponds to f, defined as follows:

$$
c(p) \equiv \min_z \{p^{T}z : f(z) \ge 1;\ z \ge 0_N\}. \tag{58}
$$

We can use the input price properties of the total cost function C(y,p) that were
implied by Theorem 1 in section “Cost Functions: The One Output Case” in order to
derive the properties of the unit cost function, c(p). Thus, Theorem 1 tells us that c(p)
is well defined as a minimum for p ≫ 0N and it is nonnegative, positively linearly
homogeneous, nondecreasing, and concave in p over the positive orthant. In fact,
the continuity from above property of f along with the assumption that f is linearly
homogeneous will imply that f(0N) = 0 and this in turn will imply that c(p) > 0 for
p ≫ 0N. Since c(p) is concave over the positive orthant, we can also deduce that it
is continuous over this domain of definition. The domain of definition of c(p) can
be extended to the nonnegative orthant using the Fenchel closure operation as was
done in section “Cost Functions: The One Output Case.” The resulting c(p) will be
continuous over the nonnegative orthant. Thus, there is no problem in going from
the production function to its unit cost function.
Can we use the unit cost function to recover the underlying production function?
We can get an outer approximation to the true technology using the algebra in
section “Cost Functions: The One Output Case.” Let x > 0N be an arbitrary nonzero,
nonnegative input vector. The maximum output y that is consistent with using the
outer approximation technology and the input vector x must satisfy the inequalities
yc(p) ≤ pT x for every p > 0N . Thus, we want the maximum y such that y ≤ pT x/c(p)
for every p > 0N. Now the functions pTx and c(p) are both linearly homogeneous so
we can normalize one of these functions and minimize or maximize the remaining
function to obtain y = f*(x), where f*(x) is the production function that corresponds
to the outer approximation technology. If we set pTx = 1, then we want to minimize
1/c(p) subject to the constraint pTx = 1 and so in this case, f*(x) is defined as
follows:

$$
\begin{aligned}
f^{*}(x) &\equiv \min_p \{1/c(p) : p^{T}x = 1;\ p \ge 0_N\} \\
&= 1/\max_p \{c(p) : p^{T}x = 1;\ p \ge 0_N\}.
\end{aligned}
\tag{59}
$$
T

Note that the maximization problem in Eq. (59) is a concave programming problem.
On the other hand, we could set c(p) = 1. In this case, f*(x) is (equivalently)
defined as follows:

$$
\begin{aligned}
f^{*}(x) &\equiv \min_p \{p^{T}x : c(p) = 1;\ p \ge 0_N\} \\
&= \min_p \{p^{T}x : c(p) \ge 1;\ p \ge 0_N\}.
\end{aligned}
\tag{60}
$$
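
To see Eq. (59) at work, the Python sketch below applies it to a hypothetical linear unit cost function c(p) = b1p1 + b2p2, which is the unit cost version of Eq. (27) with the off-diagonal parameter set to zero. Because c(p) is linear and the normalized price set is a line segment, the maximum in Eq. (59) sits at a corner, and the recovered f*(x) is the no substitution function min{x1/b1, x2/b2}. The parameter values and the grid search are illustrative assumptions.

```python
def unit_cost(p, b):
    """Hypothetical linear unit cost function c(p) = b[0]*p[0] + b[1]*p[1]."""
    return b[0] * p[0] + b[1] * p[1]

def f_star(x, b, grid=1000):
    """Eq. (59): f*(x) = 1 / max_p {c(p) : p'x = 1, p >= 0}, by grid search."""
    best = 0.0
    for i in range(grid + 1):
        share = i / grid                          # part of p'x = 1 carried by input 1
        p = (share / x[0], (1.0 - share) / x[1])  # nonnegative prices with p'x = 1
        best = max(best, unit_cost(p, b))
    return 1.0 / best

if __name__ == "__main__":
    x, b = (4.0, 3.0), (2.0, 1.0)
    print("f*(x) from Eq. (59):", f_star(x, b))
    print("min(x1/b1, x2/b2)  :", min(x[0] / b[0], x[1] / b[1]))
```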

In order to recover the original production function, f(x), by using the formulae
on the right hand sides of Eq. (59) or (60), we need to assume that f is nondecreasing
and quasiconcave, as in section “Cost Functions: The One Output Case.” However,
using Berge’s Theorem 8 above, it can be seen that when f is linearly homogeneous
and quasiconcave (and positive) over the positive orthant, then f is also a concave
function over the positive orthant. If in addition, f is continuous over the nonnegative
orthant, then f will also be concave over the nonnegative orthant. Thus, f and c
satisfy exactly the same regularity conditions, with respect to x and p, respectively,
if we assume that f is nondecreasing and quasiconcave. Moreover, the underlying
technology can be represented by using either the linearly homogeneous production
function or its dual unit cost function. Samuelson ([79], 15) and Shephard [81] were
the first to obtain versions of this duality theorem for the homogeneous case.35
In the following sections, we will exhibit various explicit functional forms for a
linearly homogeneous f or its dual unit cost function.

The Constant Elasticity of Substitution Production Function

The constant elasticity of substitution (CES) production function, f(x), is defined as
follows:

$$
f(x_1,\dots,x_N) \equiv \left[\sum_{n=1}^{N} \beta_n x_n^{s}\right]^{1/s} \tag{61}
$$

where the parameters βn are positive and s is a parameter which satisfies s ≠ 0
and the inequality s ≤ 1. The two input case of this functional form was introduced
into the economics literature by Arrow et al. ([2], 230).36 The problems below
show that the CES production function is a well-behaved constant returns to scale
production function which satisfies the regularity conditions that were developed in
the previous section, provided that s ≤ 1.

35 See also Diewert ([22], 110–112) for a duality theorem along the present lines.

Problems
16. Let s ≠ 0 and rewrite the f(x) defined by Eq. (61) as f(x) = γfs(x) where
$f_s(x) \equiv [\sum_{n=1}^{N} \beta_n^{*} x_n^{s}]^{1/s}$, $\beta_n^{*} \equiv \beta_n/\sum_{i=1}^{N} \beta_i$ for n = 1, . . . , N and
$\gamma \equiv [\sum_{i=1}^{N} \beta_i]^{1/s}$. Show that $\lim_{s \to 0} \ln f_s(x_1,\dots,x_N) = \sum_{n=1}^{N} \beta_n^{*} \ln x_n$.
Thus the CES production function defined by Eq. (61) tends to a Cobb-Douglas
production function as the parameter s tends to 0. Hint: Write ln fs(x1, . . . , xN)
as g(s)/h(s) where $g(s) \equiv \ln[\sum_{n=1}^{N} \beta_n^{*} x_n^{s}]$ and h(s) ≡ s. Let s tend to 0 and
apply l’Hospital’s Rule. Note that g(0) = h(0) = 0.
17. Let βT ≡ [β1, . . . , βN] where βn > 0 for n = 1, . . . , N and the βn sum to one
(as the normalized weights βn* in Problem 16 do). Define β̂ as the N by
N diagonal matrix with the elements of the vector β running down the main
diagonal. Show that the N by N matrix −β̂ + ββT is a negative semidefinite
matrix. Hint: Show that the inequality zT[−β̂ + ββT]z ≤ 0 for all vectors z
is equivalent to the Cauchy-Schwarz inequality (xTy)² ≤ (xTx)(yTy) with x ≡
β̂^(1/2) 1N; y ≡ β̂^(1/2) z where 1N is a vector of ones of dimension N and β̂^(1/2) is
a diagonal matrix with the positive square roots of the elements of β running
down the main diagonal.
18. Show that the CES production function f(x) defined by Eq. (61) above is
homogeneous of degree one in the components of x.
19. Show that the CES production function f(x) defined by Eq. (61) above is a
concave function of x if s ≠ 0 and s ≤ 1 and is a convex function of x if s ≥ 1.
Hint: Calculate the matrix of second order partial derivatives of f, ∇²xx f(x), for
x ≫ 0N and show it is negative semidefinite if s ≤ 1 and positive semidefinite
if s ≥ 1. Problem 17 will be useful.
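
Before attempting Problems 16 and 18, it may help to look at the claims numerically. The Python sketch below evaluates the CES function (61) for weights that sum to one, compares it with the Cobb-Douglas function for a value of s close to 0, and checks homogeneity of degree one. The weights, the input vector, and the particular values of s are hypothetical; a numerical check of this kind illustrates the results but does not replace the proofs asked for in the problems.

```python
import math

def ces(x, beta, s):
    """CES function (61): f(x) = [sum_n beta_n * x_n**s]**(1/s), for s != 0."""
    return sum(b * xn ** s for b, xn in zip(beta, x)) ** (1.0 / s)

def cobb_douglas(x, beta):
    """Limiting form in Problem 16 for weights summing to one: prod_n x_n**beta_n."""
    return math.exp(sum(b * math.log(xn) for b, xn in zip(beta, x)))

BETA = [0.2, 0.3, 0.5]   # hypothetical weights that sum to one
X = [1.5, 2.0, 0.7]      # arbitrary strictly positive input vector

if __name__ == "__main__":
    for s in (0.5, 0.1, 0.001, 1e-6):
        print("s =", s, " CES =", ces(X, BETA, s))
    print("Cobb-Douglas    =", cobb_douglas(X, BETA))
    doubled = [2.0 * xn for xn in X]
    print("f(2x) / f(x)    =", ces(doubled, BETA, 0.5) / ces(X, BETA, 0.5))
```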
We now want to determine the cost minimizing system of input demand
functions. We will first calculate the unit cost function that corresponds to the CES
production function defined by Eq. (61). We assume that the producer faces the
positive input prices $p \equiv [p_1, \dots, p_N] \gg 0_N$. The unit cost minimization problem is
the following one:

36 These authors wrote the CES functional form defined by Eq. (61) as
$f(x) = \gamma\big[\sum_{n=1}^{N} \beta_n^{*} x_n^{s}\big]^{1/s}$ where the $\beta_n^{*}$ now sum up to one and
$\gamma \equiv \big[\sum_{n=1}^{N} \beta_n\big]^{1/s}$ is a positive efficiency parameter. They noted that the function
of x that is defined by $\big[\sum_{n=1}^{N} \beta_n^{*} x_n^{s}\big]^{1/s}$ is a mean of order s of the inputs,
x1 , . . . , xN and they referred to Hardy et al. ([59], 13) for the mathematical properties of this class
of means.

$$\min_x \Big\{ \sum_{n=1}^{N} p_n x_n : \Big[\sum_{n=1}^{N} \beta_n x_n^{s}\Big]^{1/s} = 1;\ x \ge 0_N \Big\}. \tag{62}$$

Ignoring the nonnegativity constraints, $x \ge 0_N$, and assuming that $s < 1$ and
$s \neq 0$, the Lagrangian first order conditions for an interior solution for Eq. (62) are
equivalent to the following conditions:

$$p_n = \lambda \beta_n x_n^{s-1}; \quad n = 1, \dots, N; \tag{63}$$

$$1 = \sum_{n=1}^{N} \beta_n x_n^{s} \tag{64}$$

where the unknowns in Eqs. (63) and (64) are x1 , . . . , xN and the Lagrange
multiplier λ. The solution to Eqs. (63) and (64) turns out to be the following one
(remember, $s \neq 0$ and $s \neq 1$)37 :

$$x_n^{*}(p) \equiv p_n^{1/(s-1)} \beta_n^{1/(1-s)} \Big/ \Big[\sum_{i=1}^{N} \beta_i^{1/(1-s)} p_i^{s/(s-1)}\Big]^{1/s}; \quad n = 1, \dots, N. \tag{65}$$

Once the unit output demand functions have been calculated, the unit cost
function, c(p), can be calculated:
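$$\begin{aligned}
c(p) &\equiv \sum_{n=1}^{N} p_n x_n^{*}(p) \\
&= \Big[\sum_{n=1}^{N} \beta_n^{1/(1-s)} p_n^{s/(s-1)}\Big]^{(s-1)/s} \quad \text{using (65)} \\
&= \Big[\sum_{n=1}^{N} \alpha_n p_n^{r}\Big]^{1/r}
\end{aligned} \tag{66}$$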

where the new parameters r and α1 , . . . , αN are defined as follows38 :

$$r \equiv s/(s-1); \quad \alpha_n \equiv \beta_n^{1/(1-s)}; \quad n = 1, \dots, N. \tag{67}$$
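
The algebra leading from Eq. (62) to Eq. (67) can be verified numerically. The sketch below is an illustration under assumed parameter values; the function names are hypothetical. It evaluates the unit demands of Eq. (65), confirms that they produce one unit of output, and compares their cost with the closed form in Eq. (66).

```python
def ces_unit_demands(p, beta, s):
    """Unit-output cost-minimizing inputs, Eq. (65); requires s < 1 and s != 0."""
    denom = sum(b ** (1 / (1 - s)) * pi ** (s / (s - 1))
                for b, pi in zip(beta, p)) ** (1 / s)
    return [pn ** (1 / (s - 1)) * b ** (1 / (1 - s)) / denom
            for b, pn in zip(beta, p)]

def ces_unit_cost(p, beta, s):
    """Unit cost in the (r, alpha) form of Eqs. (66) and (67)."""
    r = s / (s - 1)
    alpha = [b ** (1 / (1 - s)) for b in beta]
    return sum(a * pn ** r for a, pn in zip(alpha, p)) ** (1 / r)

beta, s = [0.3, 0.7, 1.2], -0.5          # illustrative values only
p = [1.0, 2.0, 4.0]
x_star = ces_unit_demands(p, beta, s)
output = sum(b * xn ** s for b, xn in zip(beta, x_star)) ** (1 / s)
cost = sum(pn * xn for pn, xn in zip(p, x_star))
print(output)                            # should be close to 1
print(cost, ces_unit_cost(p, beta, s))   # the two numbers should agree
```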

37 When s = 1, we have a linear production function. Usually, an interior solution to the cost
minimization problem defined by Eq. (62) will not occur; i.e., in this case, we have a linear
programming problem and the solution will normally be a corner solution.
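38 Note that $c(p) \equiv \big[\sum_{n=1}^{N} \alpha_n p_n^{r}\big]^{1/r}$ can be rewritten as $\gamma^{*}\big[\sum_{n=1}^{N} \alpha_n^{*} p_n^{r}\big]^{1/r}$ where
$\alpha_n^{*} \equiv \alpha_n / \sum_{i=1}^{N} \alpha_i$ and $\gamma^{*} \equiv \big[\sum_{i=1}^{N} \alpha_i\big]^{1/r}$. Thus c(p) is equal to an efficiency parameter $\gamma^{*}$
times a mean of order r.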

When s takes on the values between 1 and −∞, r = s/(s − 1) goes from −∞
to 1.39 Thus, the range of r and s is the same, but they travel in opposite
directions. Hence, the CES unit cost function c(p) defined by Eq. (66) will be
a linearly homogeneous, concave, and nondecreasing function and have the same
mathematical properties as the CES production function f(x) defined by Eq. (61).
Once the CES unit cost function has been defined, the CES total cost function
is defined as C(y, p) ≡ yc(p) where c(p) is defined by Eq. (66). Using Shephard’s
Lemma, the CES system of cost minimizing demand functions is the following one:
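$$\begin{aligned}
x_n(y, p) &= y\, \alpha_n p_n^{r-1} \Big[\sum_{i=1}^{N} \alpha_i p_i^{r}\Big]^{(1/r)-1} \\
&= C(y, p)\, \alpha_n p_n^{r-1} \Big/ \sum_{i=1}^{N} \alpha_i p_i^{r}; \quad n = 1, \dots, N.
\end{aligned} \tag{68}$$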
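
Shephard’s Lemma says that the demands in Eq. (68) are the price derivatives of C(y, p). A minimal numerical sketch of that statement follows; the helper names and the parameter values are assumptions made for illustration, and the derivative is approximated by central differences.

```python
def ces_cost(y, p, alpha, r):
    """CES total cost C(y, p) = y * c(p), with c(p) from Eq. (66)."""
    return y * sum(a * pn ** r for a, pn in zip(alpha, p)) ** (1 / r)

def ces_demands(y, p, alpha, r):
    """Cost-minimizing input demands, first line of Eq. (68)."""
    agg = sum(a * pi ** r for a, pi in zip(alpha, p))
    return [y * a * pn ** (r - 1) * agg ** (1 / r - 1)
            for a, pn in zip(alpha, p)]

def numerical_gradient(f, p, h=1e-6):
    """Central-difference approximation to the gradient of f at p."""
    grad = []
    for n in range(len(p)):
        up, dn = list(p), list(p)
        up[n] += h
        dn[n] -= h
        grad.append((f(up) - f(dn)) / (2 * h))
    return grad

alpha, r, y = [0.2, 0.3, 0.5], -1.0, 3.0     # illustrative values only
p = [1.0, 2.0, 4.0]
x = ces_demands(y, p, alpha, r)
grad = numerical_gradient(lambda q: ces_cost(y, q, alpha, r), p)
print(x)
print(grad)                                   # should match x closely
```

Because C is linearly homogeneous in p, the inner product of p with these demands also reproduces C(y, p).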

Problem
20. Recall problem 15 above which defined the Allen Uzawa elasticity of substitution
$\sigma_{nk}$ between inputs n and k. Show that if C(y,p) is the CES total cost
function, then $\sigma_{nk}(y, p) = 1 - r$ for all input pairs n,k such that $n \neq k$. Thus
every elasticity of substitution between any two distinct inputs is equal to the
same constant.
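
The result of Problem 20 can be illustrated numerically. The sketch below assumes that the Allen Uzawa elasticity from problem 15 has the standard form σnk = C·Cnk/(Cn·Ck), where subscripts on C denote partial derivatives with respect to prices; that definition is not restated in this section, so treat it as an assumption. Function names and parameter values are illustrative.

```python
def ces_demands(y, p, alpha, r):
    """CES input demands from Eq. (68); by Shephard's Lemma these equal C_n."""
    agg = sum(a * pi ** r for a, pi in zip(alpha, p))
    return [y * a * pn ** (r - 1) * agg ** (1 / r - 1)
            for a, pn in zip(alpha, p)]

def allen_uzawa(y, p, alpha, r, n, k, h=1e-6):
    """sigma_nk = C * C_nk / (C_n * C_k), with C_nk taken as a central
    difference of the demand for input n with respect to price k."""
    up, dn = list(p), list(p)
    up[k] += h
    dn[k] -= h
    c_nk = (ces_demands(y, up, alpha, r)[n]
            - ces_demands(y, dn, alpha, r)[n]) / (2 * h)
    x = ces_demands(y, p, alpha, r)
    cost = sum(pi * xi for pi, xi in zip(p, x))   # equals C(y, p)
    return cost * c_nk / (x[n] * x[k])

alpha, r, y = [0.2, 0.3, 0.5], -1.5, 2.0          # illustrative values only
p = [1.0, 2.0, 4.0]
sigmas = [allen_uzawa(y, p, alpha, r, n, k)
          for n in range(3) for k in range(3) if n != k]
print(sigmas, 1 - r)    # every entry should be close to 1 - r
```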

The above problem shows why the CES functional form is unsatisfactory if the
number of inputs N exceeds two, since it is a priori unlikely that all elasticities of
substitution between every pair of inputs would equal the same number. Thus, in
the following sections, we will look for functional forms for the production or cost
function that allow for more flexible patterns of substitution between inputs.
We conclude this section by listing some possible methods for estimating the
elasticity of substitution if the underlying technology can be adequately described
by the CES functional form.
We will first look at estimating equations where input prices are exogenous
variables and input quantities (and hence output) are endogenous variables. Take
logarithms of both sides of the CES input demand functions defined by Eq. (68).
Add error terms to each equation, say $e_n^t$ for equation n in period t.40 Subtract the
logarithm of the first input demand function from the other N − 1 equations. Suppose that
there are data on inputs, output, and input prices for T periods and the period t data
are $x^t \equiv [x_1^t, \dots, x_N^t]$, $y^t$ and $p^t \equiv [p_1^t, \dots, p_N^t]$ for t = 1, . . . , T. We obtain the

39 Note that when s = 0, r will also equal 0. Rewrite the c(p) defined by the last line in Eq. (66)
as $c(p) = \gamma^{*} c_r(p)$ where $\gamma^{*} \equiv \big[\sum_{i=1}^{N} \alpha_i\big]^{1/r}$ and $c_r(p) \equiv \big[\sum_{n=1}^{N} \alpha_n^{*} p_n^{r}\big]^{1/r}$. Using the results
of problem 16, it can be seen that the limiting case for $c_r(p)$ as r tends to 0 is the Cobb-Douglas
unit cost function which has the logarithm equal to $\sum_{n=1}^{N} \alpha_n^{*} \ln p_n$ where $\alpha_n^{*} = \alpha_n / \sum_{i=1}^{N} \alpha_i$ for
n = 1, . . . , N.
40 The errors in our models can be due to measurement errors in the prices and quantities, the

assumption of incorrect functional forms and errors in optimization.


following estimating equations41 :

$$\ln\big(x_n^t / x_1^t\big) = \ln \alpha_n - \ln \alpha_1 + (r - 1) \ln\big(p_n^t / p_1^t\big) + e_n^t - e_1^t; \quad n = 2, \dots, N;\ t = 1, \dots, T. \tag{69}$$

The above equations are linear in the unknown parameters, the $\ln \alpha_n$ and
$r - 1 \equiv -\sigma$. However, not all of the $\ln \alpha_n$ can be identified. This may not matter
if the focus is on the estimation of r (or on the elasticity of substitution, σ). In
order to identify all of the parameters, we can add a unit cost function equation
to the system defined by Eq. (69). Thus, define observed unit cost in period t as
$c^t \equiv \big(\sum_{n=1}^{N} p_n^t x_n^t\big)/y^t$ for t = 1, . . . , T. Add the following estimating equations to
Eq. (69) where $e_0^t$ is the period t error term42 :

$$\ln c^t = (1/r) \ln\Big[\sum_{n=1}^{N} \alpha_n \big(p_n^t\big)^{r}\Big] + e_0^t; \quad t = 1, \dots, T. \tag{70}$$

Of course, the estimating equations in (70) are nonlinear in the unknown

parameters so nonlinear regression techniques will have to be used.
If the focus is on estimating the elasticity of substitution, Eq. (69) can be
differenced again, this time with respect to time. Thus, define the double differenced
logarithmic input quantity and price variables, $dx_n^t$ and $dp_n^t$, as follows for n = 2,
. . . , N; t = 2, . . . , T:

$$dx_n^t \equiv \ln\big(x_n^t / x_1^t\big) - \ln\big(x_n^{t-1} / x_1^{t-1}\big) = \ln x_n^t - \ln x_1^t - \ln x_n^{t-1} + \ln x_1^{t-1}; \tag{71}$$

$$dp_n^t \equiv \ln\big(p_n^t / p_1^t\big) - \ln\big(p_n^{t-1} / p_1^{t-1}\big) = \ln p_n^t - \ln p_1^t - \ln p_n^{t-1} + \ln p_1^{t-1}. \tag{72}$$

41 Much of the literature on estimating CES unit cost functions deals with the application of
this functional form in the consumer context when aggregating over similar products; e.g., see
Broda and Weinstein [8], Bernard et al. [4] and Gábór-Toth and Vermeulen [56]. Almost all of
the estimating equations discussed in this section can be applied to the consumer context; i.e.,
replace the period t output level yt by the period t utility level ut and interpret xt as a vector of
cost minimizing consumer demands. Estimating equations which involve yt cannot be used in the
consumer context since the utility level ut is not observable.
42 This equation cannot be estimated in the consumer context because unit cost ct is not observable.

The double differenced counterparts to Eq. (69) are now the following equations43 :

$$dx_n^t = (r - 1)\, dp_n^t + e_n^t - e_1^t - e_n^{t-1} + e_1^{t-1}; \quad n = 2, \dots, N;\ t = 2, \dots, T \tag{73}$$

where r − 1 = −σ . There are (N − 1)(T − 1) estimating equations in the system of

equations defined by Eq. (73) and only one economic parameter to estimate, namely,
−σ = r − 1. Note that the only exogenous variables in Eqs. (69), (70), and (73) are
input prices. Thus, to prevent biased estimates, it is important that these prices be
measured with minimal measurement error.
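
To make the double differencing procedure concrete, the following sketch simulates CES demands and recovers σ from Eq. (73). Everything in it is an assumption made for illustration: the data are simulated, the function names are hypothetical, and pooled least squares without an intercept is used as the simplest possible estimator, which ignores the correlation across equations and periods that the differenced errors in Eq. (73) carry.

```python
import math
import random

def ces_demands(y, p, alpha, r):
    """CES input demands from Eq. (68)."""
    agg = sum(a * pi ** r for a, pi in zip(alpha, p))
    return [y * a * pn ** (r - 1) * agg ** (1 / r - 1)
            for a, pn in zip(alpha, p)]

def double_difference(series_n, series_1):
    """Eqs. (71)-(72): log ratio to the numeraire, then first difference in t."""
    logratio = [math.log(a / b) for a, b in zip(series_n, series_1)]
    return [logratio[t] - logratio[t - 1] for t in range(1, len(logratio))]

def estimate_sigma(x, p):
    """Pooled no-intercept least squares on Eq. (73); x[t][n], p[t][n]."""
    num = den = 0.0
    for n in range(1, len(p[0])):
        dx = double_difference([xt[n] for xt in x], [xt[0] for xt in x])
        dp = double_difference([pt[n] for pt in p], [pt[0] for pt in p])
        num += sum(a * b for a, b in zip(dx, dp))
        den += sum(b * b for b in dp)
    return -num / den            # the slope is r - 1 = -sigma

def simulate(T, alpha, r, noise_sd, seed=0):
    """Simulated prices and (possibly noisy) CES input quantities."""
    rng = random.Random(seed)
    p = [[math.exp(rng.gauss(0, 0.3)) for _ in alpha] for _ in range(T)]
    y = [1.0 + rng.random() for _ in range(T)]
    x = [[xn * math.exp(rng.gauss(0, noise_sd))
          for xn in ces_demands(y[t], p[t], alpha, r)] for t in range(T)]
    return x, p

x_data, p_data = simulate(T=40, alpha=[0.2, 0.3, 0.5], r=-1.0, noise_sd=0.02)
print(estimate_sigma(x_data, p_data))   # true sigma is 1 - r = 2
```

With `noise_sd=0.0` the estimator returns σ up to rounding error, which is a useful check that the differencing has removed the αn terms and the output level.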
There is a problem with the systems of estimating equations defined by Eqs. (69)
and (73) and that is that these equations are dependent on the choice of the numeraire
input, which in the above algebra is input 1. Looking at the estimating equations, it is
evident that it is probably best to choose the numeraire commodity as one where the
original error terms, the en t , have means close to 0 and small variances. In practice,
it may be difficult to choose the “best” numeraire commodity.44
There is a way to avoid asymmetry in the estimating equations and
that is to shift from estimating systems of input demand functions to estimating
systems of share equations. From Eq. (68), it can be seen that
$s_n(y, p) \equiv p_n x_n(y, p)/C(y, p) = \alpha_n p_n^{r} / \sum_{i=1}^{N} \alpha_i p_i^{r}$ for n = 1, . . . , N. Define the
nth input share of cost in period t as $s_n^t \equiv p_n^t x_n^t / \sum_{i=1}^{N} p_i^t x_i^t$ for n = 1, . . . , N
and t = 1, . . . , T. Adding error terms to the above cost share equations leads to the
following nonlinear system of estimating equations:

43 The double differencing methodology originated in Feenstra ([49], 163). Equation (73) can be

converted into double differenced log input shares equal to a constant times double differenced
log input prices plus error terms; see Broda and Weinstein ([7], 564, [8], 714) and Gábór-Toth
and Vermeulen [56] and Eq. (75) below for these share equations. The present analysis follows
the material in Diewert and Feenstra ([32], 14). Diewert and Feenstra ([32], 76–79) worked out
the analogous estimating equations for a CES direct aggregator function where double differenced
log shares were equal to double differenced log quantities plus error terms. A potential cost of the
double differencing technique is that the variance of the error terms in the system of estimating
equations (73) can be much larger than the variances in the system of equations defined by Eq. (69) or in a
system that just used $x_n^t$ or $\ln x_n^t$ as the dependent variable for input n in period t. However, the
standard error for σ was very small when Diewert and Feenstra used the very simple system of
estimating equations defined by Eq. (76).
44 Here is a possible strategy for choosing the numeraire input. Take logs of both sides of Eq. (68)
and add the error term (with 0 mean) $e_n^t$ to equation n for period t. Run a preliminary systems
nonlinear regression in order to obtain estimates for the variance-covariance matrix of the
vector of errors, $[e_1^t, \dots, e_N^t]$, which is assumed to be distributed independently over time. Use
the estimated variance-covariance matrix, $\Sigma^{*}$ say, to solve the convex programming problem,
$\min_w \{w^T \Sigma^{*} w : w^T w = 1\}$. The $w^{*}$ solution to this problem will be a normalized eigenvector
that corresponds to the smallest eigenvalue of $\Sigma^{*}$. Make a further normalization of $w^{*}$ so that
the resulting vector, $w^{**}$, satisfies the constraint $w^{**T} 1_N = 1$. If $\Sigma^{*}$ is a diagonal matrix, this
methodology will pick the numeraire input to be the input which has the smallest error variance.

$$s_n^t = \alpha_n \big(p_n^t\big)^{r} \Big/ \sum_{i=1}^{N} \alpha_i \big(p_i^t\big)^{r} + e_n^t; \quad n = 1, \dots, N;\ t = 1, \dots, T. \tag{74}$$

If we sum Eq. (74) over n for a fixed t, we find that $\sum_{n=1}^{N} e_n^t = 0$ for
t = 1, . . . , T. Thus, within each time period, the errors cannot be distributed
independently. Thus to prevent exact collinearity, one of the N estimating equations
must be dropped. Furthermore, it can be seen that not all of the $\alpha_n$ parameters can
be identified. Thus, we require a normalization on the $\alpha_n$ such as $\sum_{i=1}^{N} \alpha_i = 1$ or
$\alpha_1 = 1$. Alternatively, Eq. (70) can be added to the (N − 1)T independent estimating
equations in (74) as additional estimating equations which will enable all of the $\alpha_n$
to be identified.
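
Both observations in the preceding paragraph, the adding-up property and the identification problem, can be seen in a few lines. The sketch below is illustrative only; `ces_shares` and the parameter values are assumptions.

```python
def ces_shares(p, alpha, r):
    """Deterministic part of the share equations in Eq. (74)."""
    weights = [a * pn ** r for a, pn in zip(alpha, p)]
    total = sum(weights)
    return [w / total for w in weights]

alpha, r = [0.2, 0.3, 0.5], -1.0      # illustrative values only
p = [1.0, 2.0, 4.0]
shares = ces_shares(p, alpha, r)
print(shares, sum(shares))            # predicted shares sum to one

# Scaling every alpha_n by a common factor leaves the shares unchanged,
# so the share system alone cannot identify the scale of the alpha_n.
scaled = ces_shares(p, [3.0 * a for a in alpha], r)
print(scaled)
```

Since observed shares also sum to one, the errors in Eq. (74) must sum to zero across n in each period, which is why one equation is dropped.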
An alternative stochastic specification can be obtained if we take logarithms of
both sides of the equations $s_n^t = \alpha_n (p_n^t)^{r} / \sum_{i=1}^{N} \alpha_i (p_i^t)^{r}$ and add error terms $e_n^{t*}$
to the resulting equations. Choose input 1 as a numeraire input and consider the
following estimating equations:

$$\ln\big(s_n^t / s_1^t\big) = \ln \alpha_n - \ln \alpha_1 + r \ln\big(p_n^t / p_1^t\big) + e_n^{t*} - e_1^{t*}; \quad n = 2, \dots, N;\ t = 1, \dots, T. \tag{75}$$

If the focus is on estimating the elasticity of substitution, σ = 1 − r, then

Eq. (75) can be differenced with respect to time and we obtain the following system
of estimating equations:

$$ds_n^t = r\, dp_n^t + e_n^{t*} - e_1^{t*} - e_n^{t-1*} + e_1^{t-1*}; \quad n = 2, \dots, N;\ t = 2, \dots, T \tag{76}$$

where the double differenced log price $dp_n^t$ is defined by Eq. (72) and the double
differenced log share $ds_n^t$ is defined as $\ln s_n^t - \ln s_1^t - \ln s_n^{t-1} + \ln s_1^{t-1}$. Note
that $dp_n^t$ appears as an exogenous variable on the right hand sides of equation n,t in
(73) and (76).
We conclude this section by considering the estimation of a system of CES
inverse demand functions; that is, we assume that prices are the endogenous
variables and output and input quantities are the exogenous variables. Thus, the
input prices are regarded as the prices that rationalize the observed choice of inputs,
assuming that the CES production function is the “true” production function.45 This
may seem to be an odd thing to do but it can turn out that estimating the CES system

45 This was the methodological approach taken by Arrow et al. [2] in their pioneering study on

the estimation of CES functional forms. If the CES unit cost function model fits the observed data
perfectly, then it will turn out that estimating the direct CES production function using a system of
inverse demand functions will also fit the data perfectly.

of inverse demand functions can lead to a much better fitting model than estimating
the CES system of direct input demand functions as was done above.46
Let y > 0 and $p \gg 0_N$ and suppose that the technology can be described by the CES production
function defined by Eq. (61); that is, $f(x_1, \dots, x_N) \equiv \big[\sum_{n=1}^{N} \beta_n x_n^{s}\big]^{1/s}$ where $s < 1$,
$s \neq 0$ and $\beta_n > 0$ for n = 1, . . . , N. Then the producer’s cost minimization problem
is equivalent to the following constrained minimization problem:

$$\min_x \Big\{ \sum_{n=1}^{N} p_n x_n : \sum_{n=1}^{N} \beta_n x_n^{s} = y^{s};\ x \ge 0_N \Big\}. \tag{77}$$

The first order necessary (and sufficient) conditions for solving Eq. (77) are
equivalent to the following conditions:

$$p_n = \lambda \beta_n x_n^{s-1}; \quad n = 1, \dots, N; \tag{78}$$

$$y^{s} = \sum_{n=1}^{N} \beta_n x_n^{s}. \tag{79}$$

Multiply both sides of equation n in Eq. (78) by xn and sum the resulting
equations. We obtain the following equation:
$$\sum_{n=1}^{N} p_n x_n = \lambda \sum_{n=1}^{N} \beta_n x_n^{s} = \lambda y^{s} \tag{80}$$

where the second equation follows using Eq. (79). Use the second equation in (80)
to solve for $\lambda = \sum_{n=1}^{N} p_n x_n / y^{s}$ and substitute this equation back into Eq. (78). The
resulting equations evaluated at the period t data are Eq. (81) below. As usual, the
period t data are $x^t \equiv [x_1^t, \dots, x_N^t]$, $y^t$ and $p^t \equiv [p_1^t, \dots, p_N^t]$ for t = 1, . . . , T.
We obtain the following equations:

$$p_n^t \Big/ \sum_{i=1}^{N} p_i^t x_i^t = \beta_n \big(x_n^t\big)^{s-1} \Big/ \big(y^t\big)^{s}; \quad n = 1, \dots, N;\ t = 1, \dots, T. \tag{81}$$

Take logarithms of both sides of Eq. (81) and add the error term en t to the
resulting equations.47 We obtain the following system of estimating equations:

46 This was the case in the empirical study of CES estimation undertaken by Diewert and Feenstra
[32].
47 These error terms are different from the error terms defined previously.

$$\ln\Big[p_n^t \Big/ \sum_{i=1}^{N} p_i^t x_i^t\Big] = \ln \beta_n + (s - 1) \ln x_n^t - s \ln y^t + e_n^t; \quad n = 1, \dots, N;\ t = 1, \dots, T. \tag{82}$$

Choose input 1 as the numeraire input and form the differenced Eq. (83):

$$\ln\big(p_n^t / p_1^t\big) = \ln \beta_n - \ln \beta_1 + (s - 1) \ln\big(x_n^t / x_1^t\big) + e_n^t - e_1^t; \quad n = 2, 3, \dots, N;\ t = 1, \dots, T. \tag{83}$$

Not all of the parameters $\beta_n$ can be identified using the (N − 1)T equations
in (83). In order to identify all of the $\beta_n$, we could make y an endogenous
variable that is explained by the exogenous $x_n$, using the production function,
$y = \big[\sum_{n=1}^{N} \beta_n x_n^{s}\big]^{1/s}$. Thus, we could add the following estimating equations (84) to
Eq. (83):

$$\ln y^t = (1/s) \ln\Big[\sum_{n=1}^{N} \beta_n \big(x_n^t\big)^{s}\Big] + e_0^t; \quad t = 1, \dots, T. \tag{84}$$

If the focus is on estimating the elasticity of substitution, then we can time

difference Eq. (83) and obtain the following estimating equations:

$$dp_n^t = (s - 1)\, dx_n^t + e_n^t - e_1^t - e_n^{t-1} + e_1^{t-1}; \quad n = 2, \dots, N;\ t = 2, \dots, T \tag{85}$$

where the double log differenced variables $dx_n^t$ and $dp_n^t$ are defined by Eqs. (71)
and (72). Recall the r which appeared in the CES cost function. The elasticity of
substitution that corresponds to r is σ = 1 − r. The s which appears in Eq. (85)
corresponds to r = s/(s − 1). Thus $s - 1 = -\sigma^{-1}$. Our previous system of
estimating equations (73) for r can be written as $dx_n^t = (r - 1)\,dp_n^t = -\sigma\, dp_n^t$, where we
have omitted the error terms. Our new system of estimating equations for s, Eq. (85),
can be written as $dp_n^t = (s - 1)\,dx_n^t = -\sigma^{-1} dx_n^t$ where we have again omitted the
error terms. Thus, if either CES model fits the data perfectly, then the other model
will fit the data perfectly and the two estimates for σ will be identical. Note that the
two systems of estimating equations both have (N − 1)(T − 1) degrees of freedom
and only one (nonvariance) parameter, σ, to estimate.
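
The equivalence of the two systems when the model fits exactly can be illustrated with error-free data. This is a sketch under assumed values (two inputs, four periods, hypothetical prices), not an empirical result: the direct regression of Eq. (73) and the inverse regression of Eq. (85) are run on the same double differenced data and both imply the same σ.

```python
import math

s = 0.5                                   # illustrative production parameter
r = s / (s - 1)                           # corresponding cost parameter, Eq. (67)
beta = [1.0, 2.0]
alpha = [b ** (1 / (1 - s)) for b in beta]

prices = [[1.0, 1.0], [1.0, 1.5], [1.2, 0.9], [0.8, 1.1]]   # hypothetical data

def unit_demands(p):
    """Error-free CES demands for one unit of output, Eq. (68) with y = 1."""
    agg = sum(a * pi ** r for a, pi in zip(alpha, p))
    return [a * pn ** (r - 1) * agg ** (1 / r - 1) for a, pn in zip(alpha, p)]

log_x = [math.log(x[1] / x[0]) for x in map(unit_demands, prices)]
log_p = [math.log(p[1] / p[0]) for p in prices]
dx = [log_x[t] - log_x[t - 1] for t in range(1, len(prices))]   # Eq. (71)
dp = [log_p[t] - log_p[t - 1] for t in range(1, len(prices))]   # Eq. (72)

cross = sum(a * b for a, b in zip(dx, dp))
slope_direct = cross / sum(b * b for b in dp)    # estimates r - 1 in Eq. (73)
slope_inverse = cross / sum(a * a for a in dx)   # estimates s - 1 in Eq. (85)

sigma_direct = -slope_direct
sigma_inverse = -1.0 / slope_inverse
print(sigma_direct, sigma_inverse, 1 / (1 - s))  # all three should agree
```

Once errors are added the two regressions minimize different sums of squares, so the two estimates of σ will generally differ.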
It is useful to obtain a different system of estimating equations. Recall the first
order condition Eq. (79) above. If we evaluate these equations using the period t
data, we obtain the following equations which will hold if there are no errors in the
CES cost minimization model:

$$\big(y^t\big)^{s} = \sum_{n=1}^{N} \beta_n \big(x_n^t\big)^{s}; \quad t = 1, \dots, T. \tag{86}$$

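Recall our earlier first order condition Eq. (81). Multiply equation n,t by $x_n^t$ and
we obtain the following system of equations after adding error terms, $e_n^t$:

$$\begin{aligned}
s_n^t &\equiv p_n^t x_n^t \Big/ \sum_{i=1}^{N} p_i^t x_i^t; \quad n = 1, \dots, N;\ t = 1, \dots, T \\
&= \beta_n \big(x_n^t\big)^{s} \Big/ \big(y^t\big)^{s} + e_n^t \quad \text{using equations (81)} \\
&= \beta_n \big(x_n^t\big)^{s} \Big/ \sum_{i=1}^{N} \beta_i \big(x_i^t\big)^{s} + e_n^t \quad \text{using equations (86).}
\end{aligned} \tag{87}$$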

If we sum Eq. (87) over n for a fixed t, we find that $\sum_{n=1}^{N} e_n^t = 0$ for t = 1,
. . . , T. Thus, within each time period, the errors cannot be distributed independently.
To prevent exact collinearity, one of the N estimating equations must be dropped.
Furthermore, it can be seen that not all of the $\beta_n$ parameters can be identified. Thus
we require a normalization on the $\beta_n$ such as $\sum_{i=1}^{N} \beta_i = 1$ or $\beta_1 = 1$. Alternatively,
Eq. (84) can be added to the (N − 1)T independent estimating equations in (87) as
additional estimating equations which will enable all of the $\beta_n$ to be identified.
Note that the dependent variables in Eq. (87) are exactly the same as the
dependent variables in our earlier nonlinear system of share estimating equations,
Eq. (74). In Eq. (87), input quantities xn t are the explanatory variables, whereas
in Eq. (74), input prices pn t were the explanatory variables. In actual empirical
applications of the CES model, the fit in the two systems can differ enormously.48
This explains why we developed the algebra for the estimation of either system.
An alternative stochastic specification can be obtained if we take logarithms of
both sides of the equations $s_n^t = \beta_n (x_n^t)^{s} / \sum_{i=1}^{N} \beta_i (x_i^t)^{s}$ and add error terms $e_n^{t*}$
to the resulting equations. Choose input 1 as a numeraire input and consider the
following estimating equations:

$$\ln\big(s_n^t / s_1^t\big) = \ln \beta_n - \ln \beta_1 + s \ln\big(x_n^t / x_1^t\big) + e_n^{t*} - e_1^{t*}; \quad n = 2, \dots, N;\ t = 1, \dots, T. \tag{88}$$

If the focus is on estimating the elasticity of substitution, σ = 1/(1 − s), then Eq.
(88) can be differenced with respect to time and we obtain the following system of
estimating equations:

$$ds_n^t = s\, dx_n^t + e_n^{t*} - e_1^{t*} - e_n^{t-1*} + e_1^{t-1*}; \quad n = 2, \dots, N;\ t = 2, \dots, T \tag{89}$$

where the double differenced log input quantity $dx_n^t$ is defined by Eq. (71) and the
double differenced log share $ds_n^t$ is defined as $\ln s_n^t - \ln s_1^t - \ln s_n^{t-1} + \ln s_1^{t-1}$.

48 See Diewert and Feenstra [32]. The system (87) fit their data much better than the corresponding

system (74).

Note that dxn t appears as an exogenous variable on the right hand sides of equation
n,t in (85) and (89).

Flexible Functional Forms for Cost Functions: The Generalized Leontief Functional Form

From the previous section, it can be seen that the CES functional form is not
suitable for economic applications where elasticities of substitution are allowed to
be different between different pairs of inputs. This leads us to define formally the
concept of a flexible functional form. We will define this concept first for a unit cost
function c(p) and then for a general cost function C(y,p).
Let c* (p) be an arbitrary unit cost function that satisfies the appropriate regularity
conditions on unit cost functions and in addition, is twice continuously differentiable
around a point $p^{*} \gg 0_N$. Then we say that a unit cost function c(p) that is also
twice continuously differentiable around the point p* is flexible if it has enough free
parameters so that the following $1 + N + N^2$ equations can be satisfied49 :

$$c(p^{*}) = c^{*}(p^{*}); \tag{90}$$

$$\nabla c(p^{*}) = \nabla c^{*}(p^{*}); \tag{91}$$

$$\nabla^2 c(p^{*}) = \nabla^2 c^{*}(p^{*}). \tag{92}$$

Thus c(p) is a flexible functional form if it has enough free parameters to provide
a second order Taylor series approximation to an arbitrary unit cost function.
At first glance, it looks like c(p) will have to have at least $1 + N + N^2$ independent
parameters in order to be able to satisfy all of the Eqs. (90), (91), and (92). However,
since both c and c* are assumed to be twice continuously differentiable, Young’s
Theorem in calculus implies that $\partial^2 c(p^{*})/\partial p_i \partial p_k = \partial^2 c(p^{*})/\partial p_k \partial p_i$ for all $i \neq k$
(and of course, the same equations hold for the second order partial derivatives of
c* (p) when evaluated at $p = p^{*}$). Thus, the $N^2$ equations in (92) can be replaced with
the following N(N + 1)/2 equations:

$$\partial^2 c(p^{*})/\partial p_i \partial p_k = \partial^2 c^{*}(p^{*})/\partial p_i \partial p_k \quad \text{for } 1 \le i \le k \le N. \tag{93}$$

Another property that both unit cost functions must have is homogeneity of
degree one in the components of p. By part 1 of Euler’s Theorem on homogeneous
functions, c and c* satisfy the following equations:

49 Diewert [20] introduced the concept of a flexible functional form. The actual term “flexible” was
introduced in Diewert ([22], 133).

$$c(p^{*}) = p^{*T} \nabla c(p^{*}) \quad \text{and} \quad c^{*}(p^{*}) = p^{*T} \nabla c^{*}(p^{*}). \tag{94}$$

Thus, if c and c* satisfy Eq. (91), then using Eq. (94), we see that c and c*
automatically satisfy Eq. (90). By part 2 of Euler’s Theorem on homogeneous
functions, c and c* satisfy the following equations:

$$\nabla^2 c(p^{*})\, p^{*} = 0_N \quad \text{and} \quad \nabla^2 c^{*}(p^{*})\, p^{*} = 0_N. \tag{95}$$

This means that if we have $\partial^2 c(p^{*})/\partial p_i \partial p_k = \partial^2 c^{*}(p^{*})/\partial p_i \partial p_k$ for all $i \neq k$, then
Eq. (95) will imply that $\partial^2 c(p^{*})/\partial p_i \partial p_i = \partial^2 c^{*}(p^{*})/\partial p_i \partial p_i$ as well, for i = 1, . . . , N.
Summarizing the above material, if c(p) is linearly homogeneous, then in order
for it to be flexible, c(p) needs to have only enough parameters so that the N
equations in (91) can be satisfied and so that the following N(N − 1)/2 equations
can be satisfied:

$$\partial^2 c(p^{*})/\partial p_i \partial p_k = \partial^2 c^{*}(p^{*})/\partial p_i \partial p_k \equiv c_{ik}^{*}; \quad \text{for } 1 \le i < k \le N. \tag{96}$$

Thus, in order to be flexible, c(p) must have at least N+N(N−1)/2 = N(N+1)/2

independent parameters.
Recall that the generalized Leontief cost function was introduced in section “The
Derivative Property of the Cost Function.” The unit cost function that corresponds
to this functional form is defined as follows50 :

$$c(p) \equiv \sum_{i=1}^{N} \sum_{k=1}^{N} b_{ik}\, p_i^{1/2} p_k^{1/2} \tag{97}$$

where $b_{ik} = b_{ki}$ for all i and k. Note that there are exactly N(N + 1)/2 independent
$b_{ik}$ parameters in the c(p) defined by Eq. (97). For this functional form, the N
equations in (91) become:

$$\partial c(p^{*})/\partial p_n = \sum_{k=1}^{N} b_{nk} \big(p_k^{*}/p_n^{*}\big)^{1/2} = \partial c^{*}(p^{*})/\partial p_n \equiv c_n^{*}; \quad n = 1, \dots, N. \tag{98}$$

The N(N − 1)/2 equations in (96) become:

$$\tfrac{1}{2}\, b_{ik} \big/ \big(p_i^{*} p_k^{*}\big)^{1/2} = c_{ik}^{*}; \quad 1 \le i < k \le N. \tag{99}$$

However, it is easy to solve Eq. (99) for the $b_{ik}$:

$$b_{ik} = 2 c_{ik}^{*} \big(p_i^{*} p_k^{*}\big)^{1/2}; \quad 1 \le i < k \le N. \tag{100}$$

50 We no longer restrict the bij to be nonnegative.


Once the bik for i < k have been determined using Eq. (100), we set bki = bik for
i < k and finally the bii are determined using the N equations in (98).
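
The two-step solution just described can be written out directly. The sketch below is an illustration rather than anything from the original text: the target c*(p) is taken to be a Cobb-Douglas unit cost function with hypothetical exponents, and the function names are assumptions. The off-diagonal bik come from Eq. (100) and the diagonal bii from Eq. (98).

```python
import math

def fit_generalized_leontief(p_star, grad_star, hess_star):
    """Solve Eqs. (100) and (98) for the symmetric matrix of b_ik."""
    n = len(p_star)
    b = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            if i != k:   # Eq. (100): off-diagonal terms from cross partials
                b[i][k] = 2.0 * hess_star[i][k] * math.sqrt(p_star[i] * p_star[k])
    for i in range(n):   # Eq. (98): diagonal terms from first derivatives
        off = sum(b[i][k] * math.sqrt(p_star[k] / p_star[i])
                  for k in range(n) if k != i)
        b[i][i] = grad_star[i] - off
    return b

def gl_unit_cost(p, b):
    """Generalized Leontief unit cost function, Eq. (97)."""
    n = len(p)
    return sum(b[i][k] * math.sqrt(p[i] * p[k])
               for i in range(n) for k in range(n))

# Hypothetical target: Cobb-Douglas unit cost c*(p) = prod_i p_i**a_i.
a = [0.2, 0.3, 0.5]
p_star = [1.0, 2.0, 4.0]
c_star = math.prod(pi ** ai for pi, ai in zip(p_star, a))
grad_star = [ai * c_star / pi for ai, pi in zip(a, p_star)]
hess_star = [[a[i] * (a[k] - (i == k)) * c_star / (p_star[i] * p_star[k])
              for k in range(3)] for i in range(3)]

b = fit_generalized_leontief(p_star, grad_star, hess_star)
print(gl_unit_cost(p_star, b), c_star)   # levels agree, as Eq. (94) implies
```

Only the gradient and the off-diagonal second derivatives are matched explicitly; the level and the diagonal second derivatives then agree automatically because both functions are linearly homogeneous.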
The above material shows how we can find a flexible functional form for a unit
cost function.51 We now turn our attention to finding a flexible functional form for a
general cost function C(y,p). Let C* (y,p) be an arbitrary cost function that satisfies
the appropriate regularity conditions on cost functions listed in Theorem 1 above
and in addition is twice continuously differentiable around a point (y* ,p* ) where
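$y^{*} > 0$ and $p^{*} \gg 0_N$. Then we say that a given cost function C(y,p) that is also
twice continuously differentiable around the point (y* ,p* ) is flexible if it has enough
free parameters so that the following $1 + (N + 1) + (N + 1)^2$ equations can be
satisfied:

$$C(y^{*}, p^{*}) = C^{*}(y^{*}, p^{*}); \quad \text{(1 equation)} \tag{101}$$

$$\nabla_p C(y^{*}, p^{*}) = \nabla_p C^{*}(y^{*}, p^{*}); \quad \text{(N equations)} \tag{102}$$

$$\nabla^2_{pp} C(y^{*}, p^{*}) = \nabla^2_{pp} C^{*}(y^{*}, p^{*}); \quad (N^2 \text{ equations}) \tag{103}$$

$$\nabla_y C(y^{*}, p^{*}) = \nabla_y C^{*}(y^{*}, p^{*}); \quad \text{(1 equation)} \tag{104}$$

$$\nabla^2_{py} C(y^{*}, p^{*}) = \nabla^2_{py} C^{*}(y^{*}, p^{*}); \quad \text{(N equations)} \tag{105}$$

$$\nabla^2_{yp} C(y^{*}, p^{*}) = \nabla^2_{yp} C^{*}(y^{*}, p^{*}); \quad \text{(N equations)} \tag{106}$$

$$\nabla^2_{yy} C(y^{*}, p^{*}) = \nabla^2_{yy} C^{*}(y^{*}, p^{*}); \quad \text{(1 equation)}. \tag{107}$$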

Equations (101), (102), (103), (104), (105), (106), and (107) are the counterparts
to our earlier unit cost Eqs. (90), (91), and (92). As was the case with unit cost
functions, Eq. (101) is implied by Eq. (102), the linear homogeneity in prices of the cost
functions and Part 1 of Euler’s Theorem on homogeneous functions. Young’s
Theorem on the symmetry of cross partial derivatives means that the lower triangle
of equations in (103) is implied by the equalities in the upper triangle of both
matrices of partial derivatives. Part 2 of Euler’s Theorem on homogeneous functions
implies that if all the off diagonal elements in both matrices in Eq. (103) are equal,
then so are the diagonal elements. Hence, in order to satisfy all of the equations
in (101), (102), and (103), we need only satisfy the N equations in (102) and the

51 This material can be adapted to the case where we want a flexible functional form for a linearly
homogeneous utility or production function f(x): just replace p by x and c(p) by f(x).

N(N − 1)/2 equations in the upper triangle of the $N^2$ equations in (103). Young’s
Theorem implies that if Eqs. (105) are satisfied, then so are Eqs. (106). However,
Euler’s Theorem on homogeneous functions implies that

$$\partial C(y^{*}, p^{*})/\partial y = p^{*T} \nabla^2_{py} C(y^{*}, p^{*}) = p^{*T} \nabla^2_{py} C^{*}(y^{*}, p^{*}) = \partial C^{*}(y^{*}, p^{*})/\partial y. \tag{108}$$

Hence, if Eq. (105) are satisfied, then so is the single Eq. (104). Putting this all
together, we see that in order for C to be flexible, we need enough free parameters
in C so that the following equations can be satisfied:

• Equation (102); N equations
• The upper triangle in Eq. (103); N(N − 1)/2 equations
• Equation (105); N equations
• Equation (107); 1 equation

Hence, in order for C to be a flexible functional form, it will require a minimum of
2N + N(N − 1)/2 + 1 = N(N + 1)/2 + N + 1 parameters. Thus, a fully flexible cost
function, C(y,p), will require N + 1 additional parameters compared to a flexible
unit cost function, c(p).
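
The parameter counts above are easy to tabulate. The small sketch below only restates the arithmetic in the text; the function names are hypothetical.

```python
def min_params_unit_cost(n_inputs):
    """Minimum free parameters for a flexible unit cost function: N(N+1)/2."""
    return n_inputs * (n_inputs + 1) // 2

def min_params_total_cost(n_inputs):
    """Minimum for a flexible total cost function: 2N + N(N-1)/2 + 1."""
    return 2 * n_inputs + n_inputs * (n_inputs - 1) // 2 + 1

for n_inputs in (2, 3, 5, 10):
    unit = min_params_unit_cost(n_inputs)
    total = min_params_total_cost(n_inputs)
    print(n_inputs, unit, total, total - unit)   # last column equals N + 1
```

The quadratic growth in N is one reason why degrees of freedom become scarce when many inputs are distinguished.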
In the following Sections, we will define several flexible functional forms for unit
cost functions c(p). Once we have a flexible functional form for a unit cost function
c(p), then the algebra below shows how we can modify c(p) to obtain a flexible total
cost function C(y,p).52
Suppose the unit cost function is the generalized Leontief unit cost function c(p)
defined by Eq. (97) above. We now show how terms can be added to it in order to
make it a fully flexible cost function. Thus, define C(y,p) as follows:

$$C(y, p) \equiv y\, c(p) + b^T p + (1/2)\, a_0\, \alpha^T p\, y^2 \tag{109}$$

where $b \equiv [b_1, \dots, b_N]$ is an N dimensional vector of new parameters, $a_0$ is a new
parameter and $\alpha \equiv [\alpha_1, \dots, \alpha_N] > 0_N$ is a vector of predetermined parameters.53
Using Eq. (109) as our candidate for a flexible (total) cost function C, Eqs. (102),
(103), (105), and (107) become:

$$y^{*} \nabla_p c(p^{*}) + b + (1/2)\, a_0\, \alpha\, y^{*2} = \nabla_p C^{*}(y^{*}, p^{*}); \tag{110}$$

$$y^{*} \nabla^2_{pp} c(p^{*}) = \nabla^2_{pp} C^{*}(y^{*}, p^{*}); \tag{111}$$

52 The algebra for converting the translog unit cost function into the translog cost function is
different.
53 We have defined the cost function C in this manner so that it has the minimal number of

parameters required in order to be a flexible functional form. Thus it is a parsimonious flexible

functional form.

$$\nabla_p c(p^{*}) + a_0\, \alpha\, y^{*} = \nabla^2_{py} C^{*}(y^{*}, p^{*}); \tag{112}$$

$$a_0\, \alpha^T p^{*} = \nabla^2_{yy} C^{*}(y^{*}, p^{*}). \tag{113}$$

Use Eq. (111) in order to determine the $b_{ik}$ for $i \neq k$. Use Eq. (113) in order
to determine the single parameter $a_0$. Use Eq. (112) in order to determine the $b_{ii}$.
Finally, use Eq. (110) in order to determine the parameters $b_n$ in the b vector. Thus,
the cost function C(y,p) defined by Eq. (109), which uses the generalized Leontief
unit cost function c(p) defined by Eq. (97) as a building block, is a parsimonious
flexible functional form for a general cost function.
In fact, it is not necessary to use the generalized Leontief unit cost function in
definition (109) in order to convert a flexible functional form for a unit cost function
into a flexible functional form for a general cost function. Let c(p) be any flexible
functional form for a unit cost function and define C(y,p) by Eq. (109). Use Eq.
(113) to determine the parameter a0 . Once a0 has been determined, Eqs. (111) and
(112) can be used to determine the parameters in the unit cost function c(p). Finally,
Eq. (110) can be used to determine the parameters in the vector b.
Differentiating Eq. (109) leads to the following system of estimating equations,
where x(y, p) = ∇ p C(y, p) is the producer’s system of cost minimizing input demand
functions:

$$x(y, p) = y \nabla c(p) + b + (1/2)\, a_0\, \alpha\, y^2. \tag{114}$$

If the generalized Leontief unit cost function is used as the c(p) in Eq. (114),
then the N estimating equations will be linear in the unknown parameters. This will
facilitate econometric estimation. The cross equation symmetry restrictions could
be tested or imposed.
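
The input demand system in Eq. (114) can be checked against the cost function in Eq. (109) by numerical differentiation. The following is a minimal sketch with assumed parameter values; the names `B`, `b_vec`, `a0`, and `alpha_vec` are hypothetical labels for the bik matrix, the vector b, the scalar a0, and the predetermined vector α.

```python
import math

def gl_unit_cost(p, B):
    """Generalized Leontief unit cost function, Eq. (97)."""
    n = len(p)
    return sum(B[i][k] * math.sqrt(p[i] * p[k])
               for i in range(n) for k in range(n))

def total_cost(y, p, B, b_vec, a0, alpha_vec):
    """Flexible total cost function, Eq. (109)."""
    linear = sum(bn * pn for bn, pn in zip(b_vec, p))
    quad = 0.5 * a0 * sum(an * pn for an, pn in zip(alpha_vec, p)) * y ** 2
    return y * gl_unit_cost(p, B) + linear + quad

def input_demands(y, p, B, b_vec, a0, alpha_vec):
    """Eq. (114), with the gradient of c(p) taken from Eq. (98).
    Each demand is linear in the B[i][k], b_vec[i], and a0."""
    n = len(p)
    return [y * sum(B[i][k] * math.sqrt(p[k] / p[i]) for k in range(n))
            + b_vec[i] + 0.5 * a0 * alpha_vec[i] * y ** 2
            for i in range(n)]

B = [[1.0, 0.5, 0.2], [0.5, 2.0, 0.3], [0.2, 0.3, 1.5]]   # symmetric, illustrative
b_vec, a0, alpha_vec = [0.1, -0.2, 0.3], 0.05, [1.0, 1.0, 1.0]
y, p = 2.0, [1.0, 2.0, 3.0]

x = input_demands(y, p, B, b_vec, a0, alpha_vec)
h = 1e-6
grad = []
for n in range(3):
    up, dn = list(p), list(p)
    up[n] += h
    dn[n] -= h
    grad.append((total_cost(y, up, B, b_vec, a0, alpha_vec)
                 - total_cost(y, dn, B, b_vec, a0, alpha_vec)) / (2 * h))
print(x)
print(grad)    # should match x closely
```

In an estimation setting the regressors for equation i would be the terms multiplying each parameter: y(pk/pi)^(1/2), a constant, and αi·y²/2.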
In empirical applications, if we use the generalized Leontief functional form
when there are more than two inputs, a problem can occur: one or more of the
estimated bik can turn out to be negative numbers (so that inputs i and k are
complements). Under these conditions, the estimated cost function can fail to be
concave at the observed data points and it will not be globally concave over all
positive input prices. Global concavity can be imposed by replacing the off diagonal
bik parameters by their squares54 but if this is done, then all pairs of inputs will
be either substitutes or be unrelated. Global concavity can be imposed but at the
cost of destroying the flexibility of the functional form.55 Thus, the generalized
Leontief functional form is not a “perfect” flexible functional form. Finding flexible
functional forms where the restrictions implied by microeconomic theory can be

54 The resulting estimating equations become nonlinear in the parameters when we square the bik .
Typically, this does not create any problems: just use a nonlinear estimation method.
55 If there are more than 4 inputs and we allow for complementarity, then experience has shown

that complementary input pairs show up almost always.


imposed on the functional form without destroying its flexibility is a nontrivial

task which we will address later in sections “The Normalized Quadratic Unit Cost
Function” and “The Konüs Byushgens Fisher Unit Cost Function” below.

The Translog Functional Form

The translog unit cost function, c(p), is defined as follows56 :

$$\ln c(p) \equiv \alpha_0 + \sum_{i=1}^{N} \alpha_i \ln p_i + (1/2) \sum_{i=1}^{N} \sum_{k=1}^{N} \gamma_{ik} \ln p_i \ln p_k \tag{115}$$

where the parameters $\alpha_i$ and $\gamma_{ik}$ satisfy the following restrictions:

$$\gamma_{ik} = \gamma_{ki}; \quad 1 \le i < k \le N; \quad (N(N-1)/2 \text{ symmetry restrictions}) \tag{116}$$

$$\sum_{i=1}^{N} \alpha_i = 1; \quad \text{(1 restriction)} \tag{117}$$

$$\sum_{k=1}^{N} \gamma_{ik} = 0; \quad i = 1, \dots, N \quad \text{(N restrictions)}. \tag{118}$$

Note that the symmetry restrictions (116) and the restrictions (118) imply the
following restrictions:

$$\sum_{i=1}^{N} \gamma_{ik} = 0; \quad k = 1, \dots, N. \tag{119}$$

There are 1 + N αi parameters and N2 γik parameters. However, the restrictions

(116), (117), (118), and (119) mean that there are only N independent αi parameters
and N(N − 1)/2 independent γik parameters, which is the minimal number of
parameters required for a unit cost function to be flexible.
We now show that the translog unit cost function c(p) defined by Eqs. (115), (116), (117), and (118) is linearly homogeneous; that is, we need to show that c(λp) = λc(p) for λ > 0 and $p \gg 0_N$. Thus, we need to show that

$$\ln c(\lambda p) = \ln[\lambda c(p)] = \ln\lambda + \ln c(p); \quad \lambda > 0 \text{ and } p \gg 0_N. \tag{120}$$

56 This functional form is due to Christensen et al. [14–16]. The material in this Section is due to these authors.

Using definition (115), we have

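$$
\begin{aligned}
\ln c(\lambda p_1, \ldots, \lambda p_N)
&= \alpha_0 + \sum_{i=1}^{N} \alpha_i \ln(\lambda p_i) + (1/2) \sum_{i=1}^{N} \sum_{k=1}^{N} \gamma_{ik} \ln(\lambda p_i) \ln(\lambda p_k) \\
&= \alpha_0 + \sum_{i=1}^{N} \alpha_i [\ln\lambda + \ln p_i] + (1/2) \sum_{i=1}^{N} \sum_{k=1}^{N} \gamma_{ik} [\ln\lambda + \ln p_i][\ln\lambda + \ln p_k] \\
&= \alpha_0 + \sum_{i=1}^{N} \alpha_i [\ln\lambda] + \sum_{i=1}^{N} \alpha_i \ln p_i + (1/2) \sum_{i=1}^{N} \sum_{k=1}^{N} \gamma_{ik} [\ln\lambda + \ln p_i][\ln\lambda + \ln p_k] \\
&= \alpha_0 + 1[\ln\lambda] + \sum_{i=1}^{N} \alpha_i \ln p_i + (1/2) \sum_{i=1}^{N} \sum_{k=1}^{N} \gamma_{ik} [\ln\lambda + \ln p_i][\ln\lambda + \ln p_k] \quad \text{using (117)} \\
&= \ln\lambda + \alpha_0 + \sum_{i=1}^{N} \alpha_i \ln p_i + (1/2) \sum_{i=1}^{N} \sum_{k=1}^{N} \gamma_{ik} [\ln\lambda][\ln\lambda] + (1/2) \sum_{i=1}^{N} \sum_{k=1}^{N} \gamma_{ik} [\ln\lambda] \ln p_k \\
&\qquad + (1/2) \sum_{i=1}^{N} \sum_{k=1}^{N} \gamma_{ik} \ln p_i [\ln\lambda] + (1/2) \sum_{i=1}^{N} \sum_{k=1}^{N} \gamma_{ik} \ln p_i \ln p_k \\
&= \ln\lambda + \alpha_0 + \sum_{i=1}^{N} \alpha_i \ln p_i + (1/2) \sum_{i=1}^{N} \Big[\sum_{k=1}^{N} \gamma_{ik}\Big] [\ln\lambda][\ln\lambda] + (1/2) \sum_{k=1}^{N} \Big[\sum_{i=1}^{N} \gamma_{ik}\Big] \ln p_k [\ln\lambda] \\
&\qquad + (1/2) \sum_{i=1}^{N} \Big[\sum_{k=1}^{N} \gamma_{ik}\Big] \ln p_i [\ln\lambda] + (1/2) \sum_{i=1}^{N} \sum_{k=1}^{N} \gamma_{ik} \ln p_i \ln p_k \\
&= \ln\lambda + \alpha_0 + \sum_{i=1}^{N} \alpha_i \ln p_i + (1/2) \sum_{i=1}^{N} [0][\ln\lambda][\ln\lambda] + (1/2) \sum_{k=1}^{N} [0] \ln p_k [\ln\lambda] + (1/2) \sum_{i=1}^{N} [0] \ln p_i [\ln\lambda] \\
&\qquad + (1/2) \sum_{i=1}^{N} \sum_{k=1}^{N} \gamma_{ik} \ln p_i \ln p_k \quad \text{using (118) and (119)} \\
&= \ln\lambda + \alpha_0 + \sum_{i=1}^{N} \alpha_i \ln p_i + (1/2) \sum_{i=1}^{N} \sum_{k=1}^{N} \gamma_{ik} \ln p_i \ln p_k \\
&= \ln\lambda + \ln c(p) \quad \text{using definition (115)}
\end{aligned}
\tag{121}
$$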

which establishes the linear homogeneity property (120). Thus the restrictions (116),
(117), and (118) imply the linear homogeneity of the translog unit cost function.
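
The homogeneity property (120) can also be checked numerically. The following is a minimal sketch, not part of the original derivation: the parameter values, the price vector and the scale factor are hypothetical and were chosen only so that restrictions (116), (117), and (118) hold.

```python
import math

def translog_log_cost(p, alpha0, alpha, gamma):
    """ln c(p) from definition (115)."""
    n = len(p)
    lp = [math.log(pi) for pi in p]
    first = sum(alpha[i] * lp[i] for i in range(n))
    second = 0.5 * sum(gamma[i][k] * lp[i] * lp[k]
                       for i in range(n) for k in range(n))
    return alpha0 + first + second

# Hypothetical parameter values for N = 3 (illustrative only).
alpha0 = 0.2
alpha = [0.5, 0.3, 0.2]            # sums to 1, restriction (117)
gamma = [[-0.10, 0.04, 0.06],
         [0.04, -0.07, 0.03],
         [0.06, 0.03, -0.09]]      # symmetric (116), each row sums to 0 (118)

p = [1.3, 0.8, 2.1]                # hypothetical positive prices
lam = 2.5                          # hypothetical scale factor
lhs = translog_log_cost([lam * pi for pi in p], alpha0, alpha, gamma)
rhs = math.log(lam) + translog_log_cost(p, alpha0, alpha, gamma)
homogeneity_gap = abs(lhs - rhs)   # should be zero up to rounding error
```

If one of the rows of `gamma` is changed so that it no longer sums to zero, the gap stops being negligible, which is a quick way to see that restriction (118) is doing the work in the derivation.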
To establish the flexibility of the translog unit cost function c(p) defined by Eqs.
(115), (116), (117), and (118), we need only solve the following system of equations,
which is equivalent to the N(N + 1)/2 equations defined by Eqs. (91) and (93):

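$$\ln c(p^*) = \ln c^*(p^*); \quad 1 \text{ equation} \tag{122}$$

$$\partial \ln c(p^*)/\partial \ln p_i = \partial \ln c^*(p^*)/\partial \ln p_i; \quad i = 1, 2, \ldots, N-1; \quad N-1 \text{ equations} \tag{123}$$

$$\partial^2 \ln c(p^*)/\partial \ln p_i \, \partial \ln p_k = \partial^2 \ln c^*(p^*)/\partial \ln p_i \, \partial \ln p_k; \quad 1 \le i < k \le N; \quad N(N-1)/2 \text{ equations}. \tag{124}$$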

Upon differentiating the translog unit cost function defined by Eq. (115), we see
that Eq. (123) are equivalent to the following equations:
$$\alpha_i + \sum_{k=1}^{N} \gamma_{ik} \ln p_k^* = \partial \ln c^*(p^*)/\partial \ln p_i; \quad i = 1, 2, \ldots, N-1. \tag{125}$$

Differentiating the translog unit cost function again, we find that Eq. (124) are
equivalent to the following equations:

$$\gamma_{ik} = \partial^2 \ln c^*(p^*)/\partial \ln p_i \, \partial \ln p_k; \quad 1 \le i < k \le N. \tag{126}$$

Now use Eq. (126) to determine the γik for 1 ≤ i < k ≤ N. Use the symmetry restrictions (116) to determine the γik for 1 ≤ k < i ≤ N. Use Eq. (118) to determine the γii for i = 1, 2, . . . , N. With the entire N by N matrix of the γik now determined, use Eq. (125) to determine the αi for i = 1, 2, . . . , N − 1. Now use Eq. (117) to determine αN. Finally, use Eq. (122) to determine α0.
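
The recursive construction just described can be sketched in code. This is only an illustration under stated assumptions: the function name `translog_from_targets`, the reference prices, the target level, the target shares (which stand in for the first order log derivatives of c* at p*) and the target cross derivatives are all hypothetical.

```python
import math

def translog_from_targets(p_star, c_star, shares_star, cross_star):
    """Choose translog parameters that match a target level, target shares and
    target cross second order log derivatives at p_star.

    cross_star[(i, k)] holds the target value for i < k (0-based indices)."""
    n = len(p_star)
    lp = [math.log(x) for x in p_star]
    gamma = [[0.0] * n for _ in range(n)]
    for (i, k), value in cross_star.items():
        gamma[i][k] = value                      # Eq. (126)
        gamma[k][i] = value                      # symmetry restrictions (116)
    for i in range(n):
        # Diagonal element from the row sum restriction (118).
        gamma[i][i] = -sum(gamma[i][k] for k in range(n) if k != i)
    alpha = [shares_star[i] - sum(gamma[i][k] * lp[k] for k in range(n))
             for i in range(n - 1)]              # Eq. (125)
    alpha.append(1.0 - sum(alpha))               # Eq. (117)
    quad = 0.5 * sum(gamma[i][k] * lp[i] * lp[k]
                     for i in range(n) for k in range(n))
    # Level equation (122) pins down alpha_0.
    alpha0 = math.log(c_star) - sum(a * l for a, l in zip(alpha, lp)) - quad
    return alpha0, alpha, gamma

# Hypothetical targets for N = 3.
p_star = [1.2, 0.9, 2.0]
shares_star = [0.5, 0.3, 0.2]
cross_star = {(0, 1): 0.04, (0, 2): 0.06, (1, 2): 0.03}
alpha0, alpha, gamma = translog_from_targets(p_star, 1.7, shares_star, cross_star)

lp = [math.log(x) for x in p_star]
implied_last_share = alpha[2] + sum(gamma[2][k] * lp[k] for k in range(3))
implied_log_cost = (alpha0 + sum(a * l for a, l in zip(alpha, lp))
                    + 0.5 * sum(gamma[i][k] * lp[i] * lp[k]
                                for i in range(3) for k in range(3)))
```

Only N − 1 share equations are imposed directly; the Nth share is matched automatically because the target shares sum to one and the γik sum to zero across rows and columns.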
We turn our attention to the problems involved in obtaining estimates for the
unknown parameters αi and γik , which occur in the definition of the translog unit
cost function, c(p) defined by Eq. (115). The total cost function C(y,p) is defined in
terms of the unit cost function c(p) as follows:

C (y, p) ≡ yc (p) . (127)

Taking logarithms on both sides of Eq. (127) yields, after some rearrangement:

$$\ln[C(y, p)/y] = \ln c(p) = \alpha_0 + \sum_{i=1}^{N} \alpha_i \ln p_i + (1/2) \sum_{i=1}^{N} \sum_{k=1}^{N} \gamma_{ik} \ln p_i \ln p_k \tag{128}$$

where we have replaced lnc(p) using Eq. (115). The corresponding system of
cost minimizing input demand functions x(y,p) is obtained using Shephard’s
Lemma:

x (y, p) ≡ ∇p C (y, p) = y∇p c (p) . (129)

Suppose that we have data for a production unit on output in period t, $y^t$, inputs $x^t \equiv [x_1^t, \ldots, x_N^t]$ and input prices $p^t \equiv [p_1^t, \ldots, p_N^t]$ for t = 1, . . . , T. Thus, the period t observed unit cost is:

$$c^t \equiv p^{tT} x^t / y^t \equiv \sum_{i=1}^{N} p_i^t x_i^t / y^t; \quad t = 1, \ldots, T. \tag{130}$$

Evaluate Eq. (128) at the period t data and add an error term, $e_0^t$. Using Eq. (130), Eq. (128) evaluated at the period t data becomes the following estimating equation:

$$\ln c^t = \alpha_0 + \sum_{i=1}^{N} \alpha_i \ln p_i^t + (1/2) \sum_{i=1}^{N} \sum_{k=1}^{N} \gamma_{ik} \ln p_i^t \ln p_k^t + e_0^t; \quad t = 1, \ldots, T. \tag{131}$$

Note that Eq. (131) is linear in the unknown parameters.

In order to obtain additional estimating equations, we have to use the input
demand functions, xi (y, p) ≡ y∂c(p)/∂pi for i = 1, . . . , N (see Eq. (129) above).
The ith input share function, si (y,p), is defined as:

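$$
\begin{aligned}
s_i(y, p) &\equiv p_i x_i(y, p)/C(y, p); \quad i = 1, \ldots, N \\
&= p_i y [\partial c(p)/\partial p_i]/C(y, p) \quad \text{using (129)} \\
&= p_i y [\partial c(p)/\partial p_i]/[y c(p)] \quad \text{using (127)} \\
&= p_i [\partial c(p)/\partial p_i]/c(p) \\
&= \partial \ln c(p)/\partial \ln p_i \\
&= \alpha_i + \sum_{k=1}^{N} \gamma_{ik} \ln p_k
\end{aligned}
\tag{132}
$$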

where the last equation follows upon differentiating the c(p) defined by
Eq. (115).
Now evaluate both sides of Eq. (132) at the period t data and add error terms $e_i^t$ to obtain the following system of estimating equations:

$$s_i^t \equiv p_i^t x_i^t / C^t = \alpha_i + \sum_{j=1}^{N} \gamma_{ij} \ln p_j^t + e_i^t; \quad i = 1, \ldots, N; \quad t = 1, \ldots, T. \tag{133}$$

Note that Eqs. (133) are also linear in the unknown parameters.57 Obviously, the N estimating equations in (133) could be added to the single estimating Eq. (131) in order to obtain N + 1 estimating equations with cross equation equality constraints on the parameters αi and γij. However, since total cost in any period t, $C^t$, equals the sum of the individual expenditures on the inputs, $\sum_{i=1}^{N} p_i^t x_i^t$, the observed input shares $s_i^t \equiv p_i^t x_i^t / C^t$ will satisfy the following constraint for each period t:

$$\sum_{i=1}^{N} s_i^t = 1; \quad t = 1, \ldots, T. \tag{134}$$

Thus, the stochastic error terms ei t in Eq. (133) cannot all be independent. Hence
we must drop one estimating equation from (133). Thus, Eq. (131) and any N − 1
of the N equations in (133) may be used as a system of estimating equations in order
to determine the parameters of the translog unit cost function.58
We now turn our attention to the problem of deriving a formula for the price elasticities of demand, $\partial \ln x_i(y, p)/\partial \ln p_k$, given that the unit cost function has the translog functional form defined by Eqs. (115), (116), (117), and (118). Recall Eq. (132) above. For k ≠ i, differentiate the ith equation in (132) with respect to the log of pk to obtain the following equations for all k ≠ i:

$$\partial s_i(y, p)/\partial \ln p_k = p_i \, \partial[x_i(y, p)/C(y, p)]/\partial \ln p_k = \gamma_{ik}. \tag{135}$$

Hence upon noting that si(y, p) = pi xi(y, p)/C(y, p) and using Eq. (135), we have for k ≠ i:

57 Note also that the cross equation symmetry conditions, γik = γki , could be tested or imposed.
58 In situations where N is large relative to the number of observations T, maximum likelihood
estimation of Eq. (131) and N − 1 of the Eq. (133) can fail if a general variance covariance
matrix is estimated for the error terms in these equations. The problem is that all of the unknown
economic parameters are contained in Eq. (131) and as a result, the estimated squared residuals in
this equation will tend to be small relative to the estimated squared residuals in Eq. (133), where
each equation has only a few unknown economic parameters. Hence Eq. (131) can suffer from
multicollinearity problems and the small apparent variance of the residuals in this equation can lead
to the maximum likelihood estimation procedure giving too much weight to the unit cost function
equation relative to the other equations. Under these conditions, the resulting elasticities may be
erratic and they may not satisfy the appropriate curvature conditions. Note that the estimation of the
Generalized Leontief unit cost function did not suffer from this problem of having every unknown
parameter in a single equation.

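$$
\begin{aligned}
\gamma_{ik} &= p_i \, \partial[x_i(y, p)/C(y, p)]/\partial \ln p_k \\
&= p_i p_k \, \partial[x_i(y, p)/C(y, p)]/\partial p_k \\
&= p_i p_k \big\{ [1/C(y, p)] \, \partial x_i(y, p)/\partial p_k - x_i(y, p) [1/C(y, p)]^2 \, \partial C(y, p)/\partial p_k \big\} \\
&= [p_i x_i(y, p)/C(y, p)] \, \partial \ln x_i(y, p)/\partial \ln p_k - [p_i x_i(y, p)/C(y, p)][p_k x_k(y, p)/C(y, p)] \\
&\qquad \text{using Shephard's Lemma, } x_k(y, p) = \partial C(y, p)/\partial p_k \\
&= s_i(y, p) \, \partial \ln x_i(y, p)/\partial \ln p_k - s_i(y, p) s_k(y, p).
\end{aligned}
\tag{136}
$$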

Equation (136) can be rearranged to give us the following formula for the cross price elasticities of input demand for all i ≠ k:

$$\partial \ln x_i(y, p)/\partial \ln p_k = [s_i(y, p)]^{-1} \gamma_{ik} + s_k(y, p). \tag{137}$$

Now differentiate the ith equation in (132) with respect to the logarithm of pi to get the following equations:

$$
\begin{aligned}
\gamma_{ii} &= p_i \, \partial[p_i x_i(y, p)/C(y, p)]/\partial p_i; \quad i = 1, \ldots, N \\
&= p_i x_i(y, p)/C(y, p) + [p_i^2/C(y, p)] \, \partial x_i(y, p)/\partial p_i - [p_i^2 x_i(y, p)/C(y, p)^2] \, \partial C(y, p)/\partial p_i \\
&= p_i x_i(y, p)/C(y, p) + [p_i^2/C(y, p)] \, \partial x_i(y, p)/\partial p_i - [p_i^2 x_i(y, p)/C(y, p)^2] \, x_i(y, p) \\
&\qquad \text{using Shephard's Lemma, } x_i(y, p) = \partial C(y, p)/\partial p_i \\
&= p_i x_i(y, p)/C(y, p) + [p_i x_i(y, p)/C(y, p)] \, \partial \ln x_i(y, p)/\partial \ln p_i - [p_i x_i(y, p)/C(y, p)]^2 \\
&= s_i(y, p) + s_i(y, p) \, \partial \ln x_i(y, p)/\partial \ln p_i - s_i(y, p)^2.
\end{aligned}
\tag{138}
$$

Equation (138) can be rearranged to give us the following formula for the own
price elasticities of input demand:
$$\partial \ln x_i(y, p)/\partial \ln p_i = [s_i(y, p)]^{-1} \gamma_{ii} + s_i(y, p) - 1; \quad i = 1, \ldots, N. \tag{139}$$

Thus, given econometric estimates for the αi and γij, which we denote by $\alpha_i^*$ and $\gamma_{ij}^*$, the estimated or fitted shares in period t, $s_i^{t*}$, are defined using these estimates and Eq. (133) evaluated at the period t data:

$$s_i^{t*} \equiv \alpha_i^* + \sum_{j=1}^{N} \gamma_{ij}^* \ln p_j^t; \quad i = 1, \ldots, N; \quad t = 1, \ldots, T. \tag{140}$$

Now use Eq. (137) evaluated at the period t data and econometric estimates to obtain the following formula for the period t cross elasticities of demand, $E_{ik}^t$:

$$E_{ik}^t \equiv \partial \ln x_i(y^t, p^t)/\partial \ln p_k = [s_i^{t*}]^{-1} \gamma_{ik}^* + s_k^{t*}; \quad i \ne k; \quad t = 1, \ldots, T. \tag{141}$$

Similarly, use Eq. (139) evaluated at the period t data and econometric estimates to obtain the following formula for the period t own elasticities of demand, $E_{ii}^t$:

$$E_{ii}^t \equiv \partial \ln x_i(y^t, p^t)/\partial \ln p_i = [s_i^{t*}]^{-1} \gamma_{ii}^* + s_i^{t*} - 1; \quad i = 1, \ldots, N; \quad t = 1, \ldots, T. \tag{142}$$
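
Formulas (140), (141), and (142) are straightforward to evaluate once parameter estimates are available. The sketch below is illustrative only: the "estimates" `alpha` and `gamma` and the period t prices are hypothetical numbers that satisfy restrictions (116), (117), and (118), not estimates from any data set. It also checks a property implied by those restrictions, namely that each row of the elasticity matrix sums to zero.

```python
import math

def fitted_shares(log_p, alpha, gamma):
    """Fitted shares from (140): s_i = alpha_i + sum_j gamma_ij * ln p_j."""
    n = len(alpha)
    return [alpha[i] + sum(gamma[i][j] * log_p[j] for j in range(n))
            for i in range(n)]

def elasticities(shares, gamma):
    """Elasticity matrix: (141) off the diagonal, (142) on the diagonal."""
    n = len(shares)
    E = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            if i == k:
                E[i][k] = gamma[i][i] / shares[i] + shares[i] - 1.0
            else:
                E[i][k] = gamma[i][k] / shares[i] + shares[k]
    return E

# Hypothetical parameter values and period t prices (illustrative only).
alpha = [0.5, 0.3, 0.2]
gamma = [[-0.10, 0.04, 0.06],
         [0.04, -0.07, 0.03],
         [0.06, 0.03, -0.09]]
log_p = [math.log(x) for x in [1.3, 0.8, 2.1]]

s = fitted_shares(log_p, alpha, gamma)
E = elasticities(s, gamma)
share_sum = sum(s)                      # equals 1 under (117) and (119)
row_sums = [sum(row) for row in E]      # each equals 0 under (118)
```

The zero row sums reflect the fact that input demands are homogeneous of degree zero in prices, so a failure of this check in applied work usually signals that the parameter restrictions were not imposed.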

We can also obtain an estimated or fitted period t unit cost, $c^{t*}$, by using our econometric estimates for the parameters and by exponentiating the right hand side of equation t in (131):

$$c^{t*} \equiv \exp\Big[\alpha_0^* + \sum_{i=1}^{N} \alpha_i^* \ln p_i^t + (1/2) \sum_{i=1}^{N} \sum_{k=1}^{N} \gamma_{ik}^* \ln p_i^t \ln p_k^t\Big]; \quad t = 1, \ldots, T. \tag{143}$$

Finally, our fitted period t shares $s_i^{t*}$ defined by Eq. (140) and our fitted period t unit costs $c^{t*}$ defined by Eq. (143) can be used in order to obtain estimated or fitted period t input demands, $x_i^{t*}$, as follows:

$$x_i^{t*} \equiv y^t c^{t*} s_i^{t*} / p_i^t; \quad i = 1, \ldots, N; \quad t = 1, \ldots, T. \tag{144}$$

Given the matrix of period t estimated input price elasticities of demand, $[E_{ik}^t]$, we can readily calculate the matrix of period t estimated input price derivatives, $\nabla_p x(y^t, p^t) = \nabla^2_{pp} C(y^t, p^t) = y^t \nabla^2_{pp} c(p^t)$. The estimate for element ik of $\nabla^2_{pp} C(y^t, p^t)$ is:

$$C_{ik}^{t*} \equiv E_{ik}^t x_i^{t*} / p_k^t; \quad i, k = 1, \ldots, N; \quad t = 1, \ldots, T \tag{145}$$

where the estimated period t elasticities Eik t are defined by Eqs. (141) and (142) and
the fitted period t input demands xi t* are defined by Eq. (144). Once the estimated
input price derivative matrices [Cik t* ] have been calculated for period t, then we
may check whether it is negative semidefinite using determinantal conditions or by
checking if all of the eigenvalues of each matrix are zero or negative for t = 1, . . . , T.
Unfortunately, very frequently these negative semidefiniteness conditions will fail to

be satisfied for both the translog and generalized Leontief functional forms. Thus,
the translog and generalized Leontief functional forms both suffer from the same
problem: in general, it is not possible to impose concavity on these functional forms
without destroying their flexibility property. Hence, in the following two sections,
we study functional forms where these curvature conditions can be imposed without
destroying the flexibility of the functional form.

The Normalized Quadratic Unit Cost Function

The normalized quadratic unit cost function c(p) is defined as follows for $p \gg 0_N$:59

$$c(p) \equiv b^T p + (1/2) p^T B p / \alpha^T p \tag{146}$$

where $b^T \equiv [b_1, \ldots, b_N]$ and $\alpha^T \equiv [\alpha_1, \ldots, \alpha_N]$ are parameter vectors and $B \equiv [b_{ik}]$ is a matrix of parameters. The vector α and the matrix B satisfy the following restrictions:

$$\alpha > 0_N; \tag{147}$$

$$B = B^T; \text{ i.e., the matrix } B \text{ is symmetric}; \tag{148}$$

$$B p^* = 0_N \text{ for some } p^* \gg 0_N. \tag{149}$$

In most empirical applications, the vector of nonnegative but nonzero parameters α is fixed a priori. The two most frequent a priori choices for α are $\alpha \equiv 1_N$, a vector of ones, or $\alpha \equiv (1/T) \sum_{t=1}^{T} x^t$, the sample mean of the observed input vectors. The two most frequent choices for the reference price vector p* are $p^* \equiv 1_N$ or $p^* \equiv p^t$ for some period t; that is, in this second choice, we simply set p* equal to the observed period t price vector.
Assuming that α has been predetermined, there are N unknown parameters in the b vector and N(N − 1)/2 unknown parameters in the B matrix, taking into account the symmetry restrictions (148) and the N linear restrictions in (149). Note that the c(p) defined by Eq. (146) is linearly homogeneous in the components of the input price vector p.
Another possible way of defining the normalized quadratic unit cost function is
as follows:

59 This functional form was introduced by Diewert and Wales ([42], 53) where it was called the Symmetric Generalized McFadden functional form. It is a generalization of a functional form due to McFadden ([73], 279). Additional material on this functional form can be found in Diewert and Wales [43–45].

$$c(p) \equiv (1/2) p^T A p / \alpha^T p \tag{150}$$

where the parameter matrix A is symmetric; that is, A = AT ≡ [aik ] and α > 0N as
before. Assuming that the vector of parameters α has been predetermined, the c(p)
defined by Eq. (150) has N(N + 1)/2 unknown aik parameters.
Comparing Eq. (146) with (150), it can be seen that Eq. (150) has dropped the b
vector but has also dropped the N linear constraints (149). It can be shown that the
model defined by Eq. (146) is a special case of the model defined by Eq. (150). To
show this, given Eq. (146), define the matrix A in terms of B, b and α as follows:

A ≡ B + bαT + αbT . (151)

Substituting Eq. (151) into (150), Eq. (150) becomes:

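$$
\begin{aligned}
c(p) &= (1/2) p^T [B + b\alpha^T + \alpha b^T] p / \alpha^T p \\
&= (1/2) p^T B p / \alpha^T p + (1/2) p^T [b\alpha^T + \alpha b^T] p / \alpha^T p \\
&= (1/2) p^T B p / \alpha^T p + (1/2) [p^T b \alpha^T p + p^T \alpha b^T p] / \alpha^T p \\
&= (1/2) p^T B p / \alpha^T p + (1/2) [2 p^T b \alpha^T p] / \alpha^T p \\
&= (1/2) p^T B p / \alpha^T p + p^T b
\end{aligned}
\tag{152}
$$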

which is the same functional form as (146). However, it is preferable to work with
the model (146) rather than with the seemingly more general model (150) for three
reasons:

• The c(p) defined by Eq. (146) clearly contains the no substitution Leontief
functional form as a special case (simply set B = 0N × N )
• The estimating equations that correspond to Eq. (146) will contain constant
terms.
• It is easier to establish the flexibility property for (146) than for (150).
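
The equivalence between (146) and (150) under definition (151) can be confirmed numerically. The following is a minimal sketch with hypothetical values for b, B, α and p (with N = 2, α equal to a vector of ones and B chosen so that B times a vector of ones is zero); it is meant only to illustrate the algebra in (152).

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def quad_form(p, M):
    """Quadratic form p'Mp for a square matrix stored as nested lists."""
    n = len(p)
    return sum(p[i] * M[i][k] * p[k] for i in range(n) for k in range(n))

def cost_146(p, b, B, alpha):
    """c(p) = b'p + (1/2) p'Bp / alpha'p, definition (146)."""
    return dot(b, p) + 0.5 * quad_form(p, B) / dot(alpha, p)

def cost_150(p, A, alpha):
    """c(p) = (1/2) p'Ap / alpha'p, definition (150)."""
    return 0.5 * quad_form(p, A) / dot(alpha, p)

# Hypothetical parameter values (illustrative only).
b = [0.6, 0.4]
alpha = [1.0, 1.0]
B = [[-0.2, 0.2],
     [0.2, -0.2]]                       # symmetric, and B times (1, 1) is zero

# Definition (151): A = B + b alpha' + alpha b'.
A = [[B[i][k] + b[i] * alpha[k] + alpha[i] * b[k] for k in range(2)]
     for i in range(2)]

p = [1.5, 0.7]
equivalence_gap = abs(cost_146(p, b, B, alpha) - cost_150(p, A, alpha))
```

The gap is zero up to rounding for any positive price vector, since the two extra terms in A each contribute (1/2) b'p after division by α'p.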

The first and second order partial derivatives of the normalized quadratic unit
cost function defined by Eq. (146) are given by:

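$$\nabla_p c(p) = b + (\alpha^T p)^{-1} B p - (1/2) (\alpha^T p)^{-2} p^T B p \, \alpha; \tag{153}$$

$$\nabla^2_{pp} c(p) = (\alpha^T p)^{-1} B - (\alpha^T p)^{-2} B p \alpha^T - (\alpha^T p)^{-2} \alpha p^T B + (\alpha^T p)^{-3} p^T B p \, \alpha \alpha^T. \tag{154}$$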

We now prove that the c(p) defined by Eqs. (146), (147), (148), and (149) (with α predetermined) is a flexible functional form at the point p*. Using the restrictions (149), $B p^* = 0_N$, we have $p^{*T} B p^* = p^{*T} 0_N = 0$. Thus, evaluating Eqs. (153) and (154) at p = p* yields the following equations:

$$\nabla_p c(p^*) = b; \tag{155}$$

$$\nabla^2_{pp} c(p^*) = (\alpha^T p^*)^{-1} B. \tag{156}$$

We need to satisfy Eqs. (91) and (92) above to show that the c(p) defined by Eqs. (146), (147), (148), and (149) is flexible at p*. Using Eq. (155), we can satisfy Eq. (91) if we choose b as follows:

$$b \equiv \nabla c^*(p^*). \tag{157}$$

Using Eq. (156), we can satisfy Eq. (92) by choosing B as follows:

$$B \equiv (\alpha^T p^*) \nabla^2 c^*(p^*). \tag{158}$$

Since $\nabla^2 c^*(p^*)$ is a symmetric matrix, B will also be a symmetric matrix and so the symmetry restrictions (148) will be satisfied for the B defined by Eq. (158). Moreover, since c*(p) is assumed to be a linearly homogeneous function, Euler's Theorem implies that

$$\nabla^2 c^*(p^*) p^* = 0_N. \tag{159}$$

Equations (158) and (159) imply that the B defined by Eq. (158) satisfies the
linear restrictions (149). This completes the proof of the flexibility property for the
normalized quadratic unit cost function.
It is convenient to define the vector of normalized input prices, vT ≡ [v1 , . . . , vN ]
as follows:
$$v \equiv (p^T \alpha)^{-1} p. \tag{160}$$

The system of input demand functions x(y,p) that corresponds to the normalized
quadratic unit cost function c(p) defined by Eq. (146) can be obtained using
Shephard’s Lemma in the usual way:

x (y, p) = y∇c (p) . (161)

Using Eq. (161) and definition (146) evaluated at the period t data, we obtain the following system of estimating equations:

$$x^t / y^t = b + B v^t - (1/2) v^{tT} B v^t \alpha + e^t; \quad t = 1, \ldots, T \tag{162}$$

where $x^t$ is the observed period t input vector, $y^t$ is the period t output, $v^t \equiv p^t / \alpha^T p^t$ is the vector of period t normalized input prices and $e^t \equiv [e_1^t, \ldots, e_N^t]^T$ is a vector of stochastic error terms. Equations (162) can be used in order to statistically estimate the parameters in the b vector and the B matrix. Note that Eqs. (162) are linear in the unknown parameters. Note also that the symmetry restrictions (148) can be imposed when estimating the system of Eqs. (162) or their validity can be tested.
Once estimates for b and B have been obtained (denote these estimates by b* and B* respectively), then Eq. (162) can be used in order to generate a period t vector of fitted input demands, $x^{t*}$ say:

$$x^{t*} \equiv y^t \big[ b^* + B^* v^t - (1/2) v^{tT} B^* v^t \alpha \big]; \quad t = 1, \ldots, T. \tag{163}$$

Equations (154) and (161) may be used in order to calculate the matrix of period t estimated input price derivatives, $\nabla_p x(y^t, p^t) = \nabla^2_{pp} C(y^t, p^t)$. The estimated matrix of second order partial derivatives $\nabla^2_{pp} C(y^t, p^t)$ for t = 1, . . . , T is the following one:

$$[C_{ij}^{t*}] \equiv y^t \big[ (\alpha^T p^t)^{-1} B^* - (\alpha^T p^t)^{-2} B^* p^t \alpha^T - (\alpha^T p^t)^{-2} \alpha p^{tT} B^* + (\alpha^T p^t)^{-3} p^{tT} B^* p^t \, \alpha \alpha^T \big]. \tag{164}$$

Equations (163) and (164) may be used in order to obtain estimates for the matrix of period t input demand price elasticities, $[E_{ij}^t]$:

$$E_{ij}^t \equiv \partial \ln x_i(y^t, p^t)/\partial \ln p_j = p_j^t C_{ij}^{t*} / x_i^{t*}; \quad i, j = 1, \ldots, N; \quad t = 1, \ldots, T \tag{165}$$

where $x_i^{t*}$ is the ith component of the vector of fitted demands $x^{t*}$ defined by Eq. (163).
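
The chain from fitted demands (163) to the derivative matrix (164) and the elasticities (165) can be sketched as follows. The values used for b*, B*, α, the prices and the output level are hypothetical and serve only as an illustration for N = 2.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def matvec(M, v):
    """Matrix-vector product for a matrix stored as nested lists."""
    return [dot(row, v) for row in M]

def fitted_demand(y, p, b, B, alpha):
    """x = y [b + B v - (1/2)(v'Bv) alpha] with v = p / alpha'p, as in (163)."""
    ap = dot(alpha, p)
    v = [pi / ap for pi in p]
    Bv = matvec(B, v)
    vBv = dot(v, Bv)
    return [y * (b[i] + Bv[i] - 0.5 * vBv * alpha[i]) for i in range(len(p))]

def cost_hessian(y, p, B, alpha):
    """The matrix in (164): y times the second derivative matrix (154)."""
    n = len(p)
    ap = dot(alpha, p)
    Bp = matvec(B, p)
    pBp = dot(p, Bp)
    return [[y * (B[i][j] / ap
                  - Bp[i] * alpha[j] / ap ** 2
                  - alpha[i] * Bp[j] / ap ** 2     # uses symmetry of B
                  + pBp * alpha[i] * alpha[j] / ap ** 3)
             for j in range(n)] for i in range(n)]

# Hypothetical estimates and data (illustrative only).
b = [0.6, 0.4]
alpha = [1.0, 1.0]
B = [[-0.2, 0.2],
     [0.2, -0.2]]                 # symmetric, negative semidefinite, B 1_2 = 0_2
p = [1.5, 0.7]
y = 10.0

x = fitted_demand(y, p, b, B, alpha)
C = cost_hessian(y, p, B, alpha)
# Elasticities (165): E_ij = p_j C_ij / x_i.
E = [[p[j] * C[i][j] / x[i] for j in range(2)] for i in range(2)]
row_sums = [sum(row) for row in E]
```

Because the matrix in (154) annihilates the price vector p, each row of `E` sums to zero, and with a negative semidefinite B the own price elasticities on the diagonal are nonpositive.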
There is one important additional topic that we have to cover in our discussion of
the normalized quadratic functional form: what conditions on b and B are necessary
and sufficient to ensure that c(p) defined by Eqs. (146), (147), (148), and (149) is
concave in the components of the price vector p?
The function c(p) will be concave in p if and only if $\nabla^2 c(p)$ is a negative semidefinite matrix for each p in the domain of definition of c. Evaluating Eq. (154) at p = p* and using the restrictions (149) yields:

$$\nabla^2 c(p^*) = (\alpha^T p^*)^{-1} B. \tag{166}$$

Since $\alpha > 0_N$ and $p^* \gg 0_N$, $\alpha^T p^* > 0$. Thus, in order for c(p) to be a concave function of p, the following necessary condition must be satisfied:

$$B \text{ is a negative semidefinite matrix}. \tag{167}$$

We now show that the necessary condition (167) is also sufficient to imply that c(p) is concave over the set of p such that $p \gg 0_N$. Unfortunately, the proof is somewhat involved.60
Let $p \gg 0_N$. We assume that B is negative semidefinite and we want to show that $\nabla^2 c(p)$ is negative semidefinite or equivalently, that $-\nabla^2 c(p)$ is positive semidefinite. Thus, for any vector z, we want to show that $-z^T \nabla^2 c(p) z \ge 0$. Using Eq. (154), this inequality is equivalent to:

$$-(\alpha^T p)^{-1} z^T B z + (\alpha^T p)^{-2} z^T B p \, \alpha^T z + (\alpha^T p)^{-2} z^T \alpha \, p^T B z - (\alpha^T p)^{-3} p^T B p \, z^T \alpha \alpha^T z \ge 0 \tag{168}$$

or

$$-(\alpha^T p)^{-1} z^T B z - (\alpha^T p)^{-3} p^T B p \, (\alpha^T z)^2 \ge -2 (\alpha^T p)^{-2} z^T B p \, \alpha^T z \quad \text{using } B = B^T. \tag{169}$$

Define A ≡ −B. Since B is symmetric and negative semidefinite by assumption, A is symmetric and positive semidefinite. Thus there exists an orthonormal matrix U such that

$$U^T A U = \Lambda; \tag{170}$$

$$U^T U = I_N \tag{171}$$

where $I_N$ is the N by N identity matrix and Λ is a diagonal matrix with the nonnegative eigenvalues of A, $\lambda_i$, i = 1, . . . , N, running down the main diagonal. Now premultiply both sides of Eq. (170) by U and postmultiply both sides by $U^T$. Using Eq. (171), $U^T = U^{-1}$, and the transformed Eq. (170) becomes the following equation:

$$
\begin{aligned}
A &= U \Lambda U^T \\
&= U \Lambda^{1/2} \Lambda^{1/2} U^T \\
&= U \Lambda^{1/2} U^T U \Lambda^{1/2} U^T \quad \text{since } U^T U = I_N \\
&= S S
\end{aligned}
\tag{172}
$$

60 The proof is due to Diewert and Wales ([42], 66).


where $\Lambda^{1/2}$ is the diagonal matrix that has the nonnegative square roots $\lambda_i^{1/2}$ of the eigenvalues of A running down the main diagonal and the symmetric square root of A matrix S is defined as

$$S \equiv U \Lambda^{1/2} U^T. \tag{173}$$

If we replace −B in Eq. (169) with A, the inequality that we want to establish becomes

$$2 (\alpha^T p)^{-1} z^T A p \, \alpha^T z \le z^T A z + (\alpha^T p)^{-2} p^T A p \, (\alpha^T z)^2 \tag{174}$$

where we have also multiplied both sides of Eq. (169) by the positive number $\alpha^T p$ in order to derive Eq. (174) from (169).
Recall the Cauchy-Schwarz inequality for two vectors, x and y:

$$x^T y \le (x^T x)^{1/2} (y^T y)^{1/2}. \tag{175}$$

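Now we are ready to establish the inequality (174). Using Eq. (172), we have:

$$
\begin{aligned}
(\alpha^T p)^{-1} z^T A p \, \alpha^T z &= (\alpha^T p)^{-1} z^T S S p \, \alpha^T z \\
&\le (z^T S S^T z)^{1/2} \big[ (\alpha^T p)^{-2} (\alpha^T z)^2 p^T S^T S p \big]^{1/2} \\
&\qquad \text{using (175) with } x^T \equiv z^T S \text{ and } y \equiv (\alpha^T p)^{-1} (\alpha^T z) S p \\
&= (z^T S S z)^{1/2} \big[ (\alpha^T p)^{-2} (\alpha^T z)^2 p^T S S p \big]^{1/2} \quad \text{using } S = S^T \\
&= (z^T A z)^{1/2} \big[ (\alpha^T p)^{-2} (\alpha^T z)^2 p^T A p \big]^{1/2} \quad \text{using (172), } A = SS \\
&\le (1/2) z^T A z + (1/2) (\alpha^T p)^{-2} (\alpha^T z)^2 p^T A p
\end{aligned}
\tag{176}
$$

where the last inequality follows using the nonnegativity of $z^T A z$, $p^T A p$, and $(\alpha^T z)^2$, the positivity of $\alpha^T p$, and the Theorem of the Arithmetic and Geometric Mean.61 Multiplying both sides of the inequality (176) by 2 gives the desired inequality (174).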
Thus, the normalized quadratic unit cost function defined by Eqs. (146), (147),
(148), and (149) will be concave over the set of positive prices if and only if the

61 This proof is due to Diewert and Wales [42].


symmetric matrix B is negative semidefinite. Thus, after econometric estimates of

the elements of B have been obtained using the system of estimating Eq. (162), we
need only check that the resulting estimated B* matrix is negative semidefinite.
However, suppose that the estimated B* matrix is not negative semidefinite. How
can one reestimate the model, impose negative semidefiniteness on B, but without
destroying the flexibility of the normalized quadratic functional form?
The desired imposition of negative semidefiniteness can be accomplished using
a technique due to Wiley et al. [86]: simply replace the matrix B by

B ≡ −AAT (177)

where A is an N by N lower triangular matrix; that is, aij = 0 if i < j.62

We also need to take into account the restrictions (149), Bp∗ = 0N . These
restrictions on B can be imposed if we impose the following restrictions on A:

AT p∗ = 0N . (178)

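To show how this curvature imposition technique works, let $p^* = 1_N$ and consider the case N = 2. In this case, we have:

$$A \equiv \begin{bmatrix} a_{11} & 0 \\ a_{21} & a_{22} \end{bmatrix} \quad \text{and} \quad A^T = \begin{bmatrix} a_{11} & a_{21} \\ 0 & a_{22} \end{bmatrix}.$$

The restrictions (178) become:

$$A^T 1_2 = \begin{bmatrix} a_{11} + a_{21} \\ a_{22} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

and hence we must have $a_{21} = -a_{11}$ and $a_{22} = 0$. Thus, in this case,

$$B \equiv -A A^T = -\begin{bmatrix} a_{11} & 0 \\ -a_{11} & 0 \end{bmatrix} \begin{bmatrix} a_{11} & -a_{11} \\ 0 & 0 \end{bmatrix} = -\begin{bmatrix} a_{11}^2 & -a_{11}^2 \\ -a_{11}^2 & a_{11}^2 \end{bmatrix} = a_{11}^2 \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix}. \tag{179}$$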

Equation (179) shows how the elements of the B matrix can be defined in terms
of the single parameter, a11 2 . Note that with this reparameterization of the B matrix,
it will be necessary to use nonlinear regression techniques rather than modifications
of linear regression techniques. This turns out to be the cost of imposing the correct
curvature conditions on the unit cost function.
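
The same reparameterization can be written down for general N. The sketch below assumes $p^* = 1_N$, so that restrictions (178) simply require every column of the lower triangular matrix A to sum to zero; the helper names `build_A` and `neg_AAt` and the free parameter values are hypothetical and are used only for illustration.

```python
def build_A(free, n):
    """Lower triangular A whose columns each sum to zero.

    free[j] lists the below-diagonal entries of column j (rows j+1, ..., n-1).
    The diagonal entry is set so that the column sums to zero, which gives
    A'1_N = 0_N, i.e. restrictions (178) with p* equal to a vector of ones."""
    A = [[0.0] * n for _ in range(n)]
    for j in range(n):
        below = free[j]
        for offset, value in enumerate(below):
            A[j + 1 + offset][j] = value
        A[j][j] = -sum(below)
    return A

def neg_AAt(A):
    """B = -A A', definition (177)."""
    n = len(A)
    return [[-sum(A[i][m] * A[k][m] for m in range(n)) for k in range(n)]
            for i in range(n)]

def quad_form(z, M):
    """Quadratic form z'Mz."""
    n = len(z)
    return sum(z[i] * M[i][k] * z[k] for i in range(n) for k in range(n))

# Hypothetical free parameters for N = 3: N(N - 1)/2 = 3 of them.
free = [[0.4, -0.1], [0.3], []]
A = build_A(free, 3)
B = neg_AAt(A)

row_sums = [sum(row) for row in B]                 # B 1_N should be 0_N
test_vectors = [[1.0, 0.0, 0.0], [1.0, -2.0, 0.5], [0.3, 0.3, -1.0]]
forms = [quad_form(z, B) for z in test_vectors]    # each should be <= 0
```

The number of free entries is N(N − 1)/2, which matches the number of independent parameters in B noted earlier, so the curvature restriction is imposed without giving up any of the parameters needed for flexibility.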

62 Since zT AAT z = (AT z)T (AT z) = yT y ≥ 0 for all vectors z, AAT is positive semidefinite and
hence − AAT is negative semidefinite. Diewert and Wales ([42], 53) showed that any positive
semidefinite matrix can be written as AAT where A is lower triangular. Hence, it is not restrictive
to reparameterize an arbitrary negative semidefinite matrix B as − AAT .

The Konüs Byushgens Fisher Unit Cost Function

Define the KBF unit cost function, c(p), as follows:63

$$c(p) \equiv (p^T B p)^{1/2}; \quad B = B^T \tag{180}$$

where B is an N by N symmetric matrix which has one positive eigenvalue (with a strictly positive eigenvector) and the remaining N − 1 eigenvalues are negative or zero. The vector of first order partial derivatives of this unit cost function, ∇c(p), and the matrix of second order partials, $\nabla^2 c(p)$, are equal to the following expressions:

$$\nabla c(p) = B p / (p^T B p)^{1/2}; \tag{181}$$

$$\nabla^2 c(p) = (p^T B p)^{-1/2} \big[ B - B p (p^T B p)^{-1} p^T B \big]. \tag{182}$$

At this point, we need to determine the region of price space where the c(p) defined by Eq. (180) is a concave function. In general, the unit cost function defined by Eq. (180) will not be concave for all strictly positive price vectors p.64 In order for a unit cost function to provide a valid global representation of homothetic preferences, it must be a nondecreasing, linearly homogeneous and concave function over the positive orthant. However, in order for c to provide a valid local representation of preferences, we need only require that c(p) be positive, nondecreasing, linearly homogeneous, and concave over a convex subset of prices, say S, where S has a nonempty interior.65 It is obvious that c(p) defined by Eq. (180) is linearly homogeneous. The nondecreasing property will hold over S if the gradient vector ∇c(p) defined by Eq. (181) is strictly positive for p ∈ S and the concavity property will hold if $\nabla^2 c(p)$ defined by Eq. (182) is a negative semidefinite matrix for p ∈ S. We will show how the regularity region S can be determined shortly but first, we will indicate why the c(p) defined by Eq. (180) is a flexible functional form66 since this explanation will help us to define an appropriate region of regularity.

63 This is a special case of a functional form due to Denny [18], which Diewert ([24], 131) called the quadratic mean of order r unit cost function. This functional form with r = 2 was introduced into the economics literature by Konüs and Byushgens ([69], 168) and its connection to the Fisher [53] ideal price index was explained by these authors and Diewert [24]. See Problem 22 below.
64 The following analysis of the regularity conditions for the c(p) defined by Eq. (180) is due to Diewert and Hill [38].
65 See Blackorby and Diewert [5] for more details on local representations of preferences using duality theory.
66 Diewert ([24], 130) established the flexibility of c(p) defined by Eq. (180) as part of a more general result.

Let $p^* \gg 0_N$ be a strictly positive reference price vector and suppose that we are given an arbitrary unit cost function c*(p) that is twice continuously differentiable in a neighborhood around p*.67 Let $x^* \equiv \nabla c^*(p^*) \gg 0_N$ be the strictly positive vector of first order partial derivatives of c* evaluated at p* and let $S^* \equiv \nabla^2 c^*(p^*)$ be the negative semidefinite symmetric matrix of second order partial derivatives of c* evaluated at p*. Euler's Theorem on homogeneous functions implies that S* satisfies the following matrix equation:

$$S^* p^* = 0_N. \tag{183}$$

In order to establish the flexibility of the KBF c defined by Eq. (180), we need
only show that there are enough free parameters in the B matrix so that the following
equations are satisfied:

$$\nabla c(p^*) = x^*; \tag{184}$$

$$\nabla^2 c(p^*) = S^*. \tag{185}$$

In order to prove the flexibility of c, it is convenient to reparameterize the B matrix. Thus we now set B equal to:

$$B = b b^T + A \tag{186}$$

where $b \gg 0_N$ is a positive vector and A is a negative semidefinite matrix which has rank equal to at most N − 1 and it satisfies the following restrictions:

$$A p^* = 0_N. \tag{187}$$

Note that $b b^T$ is a rank one positive semidefinite matrix with $p^{*T} b b^T p^* = (b^T p^*)^2 > 0$ and A is a negative semidefinite matrix that satisfies $p^{*T} A p^* = 0$. Thus, it can be seen that B is a matrix with one positive eigenvalue and the other eigenvalues are negative or zero.
Substitute Eq. (181) into (184) in order to obtain the following equation:

67 Of course, in addition, we assume that c* satisfies the appropriate regularity conditions for a
unit cost function. Using Euler’s Theorem on homogeneous functions, the fact that c* is linearly
homogeneous and differentiable at p* means that the derivatives of c* satisfy the following
restrictions: c∗ (p∗ ) = p∗ T ∇ c∗ (p∗ ) and ∇ 2 c∗ (p∗ )p∗ = 0N . The unit cost function c defined by
Eq. (180) satisfies analogous restrictions at p = p∗ . These restrictions simplify the proof of the
flexibility of c at the point p* .

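$$
\begin{aligned}
x^* &= B p^* / (p^{*T} B p^*)^{1/2} \\
&= [b b^T + A] p^* / (p^{*T} [b b^T + A] p^*)^{1/2} \quad \text{using (186)} \\
&= b b^T p^* / (p^{*T} b b^T p^*)^{1/2} \quad \text{using (187)} \\
&= b.
\end{aligned}
\tag{188}
$$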

Thus, if we choose b equal to x*, Eq. (184) will be satisfied. Now substitute Eq. (182) into (185) and obtain the following equation:

$$
\begin{aligned}
S^* &= (p^{*T} B p^*)^{-1/2} \big[ B - B p^* (p^{*T} B p^*)^{-1} p^{*T} B \big] \\
&= (p^{*T} b b^T p^*)^{-1/2} \big[ b b^T + A - b b^T p^* (p^{*T} b b^T p^*)^{-1} p^{*T} b b^T \big] \quad \text{using (186) and (187)} \\
&= (b^T p^*)^{-1} A \quad \text{using } b^T p^* > 0.
\end{aligned}
\tag{189}
$$

Thus, if we choose A equal to $(b^T p^*) S^*$, Eq. (185) will be satisfied and the flexibility of c defined by Eq. (180) is established.68
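
The construction in (186) to (189) can be illustrated numerically for N = 2. In the sketch below the reference prices, the target gradient x* and the target substitution matrix S* are hypothetical; S* was chosen to be symmetric, negative semidefinite and to satisfy (183).

```python
import math

def matvec(M, v):
    """Matrix-vector product for a matrix stored as nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def kbf_cost(p, B):
    """c(p) = (p'Bp)^(1/2), definition (180)."""
    return math.sqrt(sum(pi * bi for pi, bi in zip(p, matvec(B, p))))

def kbf_gradient(p, B):
    """Gradient Bp / (p'Bp)^(1/2), Eq. (181)."""
    c = kbf_cost(p, B)
    return [v / c for v in matvec(B, p)]

def kbf_hessian(p, B):
    """Hessian (p'Bp)^(-1/2) [B - Bp (p'Bp)^(-1) p'B], Eq. (182)."""
    n = len(p)
    c = kbf_cost(p, B)
    Bp = matvec(B, p)
    return [[(B[i][j] - Bp[i] * Bp[j] / c ** 2) / c for j in range(n)]
            for i in range(n)]

# Hypothetical targets at the reference prices (illustrative only).
p_star = [2.0, 1.0]
x_star = [0.6, 0.4]
S_star = [[-0.1, 0.2],
          [0.2, -0.4]]            # symmetric, NSD, and S* p* = 0 as in (183)

b = x_star                                         # choice implied by (188)
scale = sum(bi * pi for bi, pi in zip(b, p_star))  # b'p*
# B = bb' + A with A = (b'p*) S*, following (186) and (189).
B = [[b[i] * b[j] + scale * S_star[i][j] for j in range(2)] for i in range(2)]

grad = kbf_gradient(p_star, B)     # should reproduce x*
hess = kbf_hessian(p_star, B)      # should reproduce S*
```

Both targets are reproduced up to rounding, and the unit cost at p* equals b'p*, which is consistent with Euler's Theorem for a linearly homogeneous function.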
Now we are ready to define the region of regularity for c defined by Eq. (180).69 Consider the following set of prices:

$$S \equiv \{p : p \gg 0_N; \; B p \gg 0_N\}. \tag{190}$$

If p ∈ S, then it can be seen that $c(p) = (p^T B p)^{1/2} > 0$ and using Eq. (181), $\nabla c(p) \gg 0_N$. However, it is more difficult to establish the concavity of c(p) over the set S. We first consider the case where the matrix B has full rank so that it has one positive eigenvalue and N − 1 negative eigenvalues. Let p ∈ S and using Eq. (182), we see that $\nabla^2 c(p)$ will be negative semidefinite if and only if the matrix M defined as:

$$M \equiv B - B p (p^T B p)^{-1} p^T B \tag{191}$$

68 We need to check that A is negative semidefinite (which it is since it is a positive multiple of the
negative semidefinite substitution matrix S* ) and that A satisfies the restrictions in Eq. (187), since
we used these restrictions to derive Eq. (188) and the second line in Eq. (189). But A does satisfy
Eq. (187) since A satisfies Eq. (183).
69 The region of regularity can be extended to the closure of the set S.

is negative semidefinite. Note that M is equal to the matrix B plus the rank 1 negative semidefinite matrix $-B p (p^T B p)^{-1} p^T B$. B has one positive eigenvalue and the remaining eigenvalues are 0 or negative. Since M is B plus a negative semidefinite matrix, the eigenvalues of M cannot be greater than the eigenvalues of B. Now consider two cases: the first case where B has one positive and N − 1 negative eigenvalues and the second case where B has N − 1 negative or zero eigenvalues in addition to its positive eigenvalue. Consider case 1, let p ∈ S and calculate Mp:

$$M p = \big[ B - B p (p^T B p)^{-1} p^T B \big] p = 0_N. \tag{192}$$

The above equation shows that $p \ne 0_N$ is an eigenvector of M that corresponds to a 0 eigenvalue. Now the addition of a negative semidefinite matrix to B can only make the N − 1 negative eigenvalues of B more negative (or leave them unchanged) so we conclude that the addition of the negative semidefinite matrix $-B p (p^T B p)^{-1} p^T B$ to B has converted the positive eigenvalue of B into a zero eigenvalue and hence M is negative semidefinite.
Case 2 follows using a perturbation argument.
Thus, we have shown that the KBF unit cost function c(p) defined by Eq. (180)
is positive, increasing in the components of p and concave in p over the region of
prices S defined by Eq. (190).
It is useful to show if c(p) ≡ (pT Bp)1/2 is defined by Eq. (180), then we
can decompose the matrix B into bbT + A where b 0N and A is a negative
semidefinite matrix with Ap∗ = 0N for some p∗ 0N . Recall that definition (180)
specified that c(p) ≡ (pT Bp)1/2 where B is an N by N symmetric matrix which
has one positive eigenvalue (with a strictly positive eigenvector) and the remaining
N − 1 eigenvalues are negative or zero. Let λ1 > 0 and λi ≤ 0 for i = 2, 3, . . . , N be
the eigenvalues of B and let the column vectors ui be the corresponding eigenvectors,
which are orthonormal to each other; that is, uiT ui = 1 for i = 1, . . . , N and uiT uj = 0
for all i = j. Then it is well known that the matrix B has the following representation:

$$B = \sum_{i=1}^{N} \lambda_i u_i u_i^{T}. \tag{193}$$

Using the regularity conditions in definition (180), it can be seen that the first
eigenvector u1 is strictly positive. Make the following definitions:

$$p^{*} \equiv u_1 \gg 0_N; \quad b \equiv (\lambda_1)^{1/2}u_1; \quad A \equiv \sum_{i=2}^{N} \lambda_i u_i u_i^{T}. \tag{194}$$

It can be seen that A is a negative semidefinite matrix. Since u1 = p∗ is orthogonal
to u2 , . . . , uN , Ap∗ = 0N . Thus, we have B = bbT + A where b is a positive vector
and A is negative semidefinite with Ap∗ = 0N .
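
The decomposition in (193) and (194) can be sketched numerically as follows. The matrix B used here is a hypothetical example with eigenvalues 5, −1, and −1, and the function name `split_kbf_matrix` is an assumption introduced for illustration.

```python
import numpy as np

def split_kbf_matrix(B):
    """Split symmetric B into b b' + A using the spectral form (193)-(194)."""
    lam, U = np.linalg.eigh(B)            # eigenvalues ascending, orthonormal columns
    u1 = U[:, -1]                         # eigenvector of the largest eigenvalue
    if u1.sum() < 0:                      # eigenvectors are only defined up to sign
        u1 = -u1
    b = np.sqrt(lam[-1]) * u1
    # A = sum over the remaining eigenvalues of lambda_i u_i u_i'
    A = (U[:, :-1] * lam[:-1]) @ U[:, :-1].T
    return b, A, u1

# Hypothetical B with one positive eigenvalue and a strictly positive first eigenvector.
B = 2.0 * np.ones((3, 3)) - np.eye(3)
b, A, p_star = split_kbf_matrix(B)

print("b:", b)
print("eigenvalues of A:", np.linalg.eigvalsh(A))
print("A @ p_star:", A @ p_star)
```

The sign flip is needed because a numerical eigensolver may return −u1 instead of u1; both are valid unit eigenvectors, but only one of them is strictly positive.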
The following problems show the connection of the KBF functional form with
Irving Fisher’s [53] ideal index number formula.

Problems
21. Suppose that a producer’s unit cost function is defined by Eq. (180). Assume
cost minimizing behavior on the part of the producer for periods 1 and 2 so that
using Shephard’s Lemma, we have:

$$x^{t} = \nabla c(p^{t})\,y^{t}; \quad t = 1, 2 \tag{i}$$

where pt and xt are the period t input price and quantity vectors and
yt is the period t output level for t = 1, 2.
(a) Show that

$$x^{t}/p^{tT}x^{t} = \nabla c(p^{t})/c(p^{t}); \quad t = 1, 2. \tag{ii}$$

(b) Show that we also have the following equations:

$$x^{t}/p^{tT}x^{t} = Bp^{t}/c(p^{t})^{2}; \quad t = 1, 2. \tag{iii}$$

22. Continuation of 21: The Fisher [53] ideal input price index PF is defined as the
following function of the observed input price and quantity vectors for periods
1 and 2:

$$P_F(p^{1}, p^{2}, x^{1}, x^{2}) \equiv \left[\frac{p^{2T}x^{1}\; p^{2T}x^{2}}{p^{1T}x^{1}\; p^{1T}x^{2}}\right]^{1/2}. \tag{i}$$

Assume that p1 , p2 , x1 , x2 satisfy equations (i) in Problem 21 where the
KBF unit cost function c(p) is defined by Eq. (180). Show that

$$P_F(p^{1}, p^{2}, x^{1}, x^{2}) = c(p^{2})/c(p^{1}). \tag{ii}$$

Hint: Note that the inner products of p2 with x1 /p1T x1 and p1 with x2 /p2T x2
appear in the formula (i) above for PF (p1 ,p2 ,x1 ,x2 ). Apply part (b) of Problem 21.
Comment: The ratio of unit costs, c(p2 )/c(p1 ), can be interpreted as a theoretical
input price index (due originally to Konüs [68] in the consumer context). Equation
(ii) above tells us that this theoretical input price index can be calculated using just
observed input price and quantity data for the two periods under consideration using
the Fisher index provided that the producer is cost minimizing in the two periods
and has the production function that is dual to the unit cost function defined by
Eq. (180). Thus, no econometric estimation is necessary in order to construct the
ratio of unit costs.70
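
The exactness result stated in Problem 22 can be checked numerically. The sketch below does not replace the proof asked for in the problem; it assumes a hypothetical matrix B, hypothetical prices and output levels, and illustrative function names, generates cost minimizing input demands from Shephard's Lemma, and compares the Fisher index with the ratio of unit costs.

```python
import numpy as np

def kbf_unit_cost(p, B):
    """KBF unit cost c(p) = (p'Bp)^{1/2}."""
    return np.sqrt(p @ B @ p)

def kbf_input_demand(p, y, B):
    """Shephard's Lemma for the KBF form: x = y * Bp / c(p)."""
    return y * (B @ p) / kbf_unit_cost(p, B)

def fisher_price_index(p1, p2, x1, x2):
    """Geometric mean of the Laspeyres and Paasche input price indexes."""
    laspeyres = (p2 @ x1) / (p1 @ x1)
    paasche = (p2 @ x2) / (p1 @ x2)
    return np.sqrt(laspeyres * paasche)

# Hypothetical data: B has one positive eigenvalue and Bp > 0 at both price vectors.
B = 2.0 * np.ones((3, 3)) - np.eye(3)
p1 = np.array([1.0, 1.0, 1.0])
p2 = np.array([1.3, 0.8, 1.1])
x1 = kbf_input_demand(p1, 10.0, B)       # period 1 output level is 10
x2 = kbf_input_demand(p2, 12.0, B)       # period 2 output level is 12

fisher = fisher_price_index(p1, p2, x1, x2)
cost_ratio = kbf_unit_cost(p2, B) / kbf_unit_cost(p1, B)
print(fisher, cost_ratio)                # the two numbers agree up to rounding
```

Note that the Fisher index is computed from prices and quantities only; the matrix B enters the check solely through the simulated data and the benchmark ratio.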

We conclude this section by looking at the problems associated with estimating
the unknown parameters in the symmetric B matrix, assuming that we have data on
a production unit producing one output and using N inputs for T time periods. Using
Eq. (181), Shephard’s Lemma and definition (180) evaluated at the period t data, we
obtain the following system of estimating equations:

$$x^{t}/y^{t} = Bp^{t}/(p^{tT}Bp^{t})^{1/2} + e^{t}; \quad t = 1, \dots, T \tag{195}$$

where xt is the observed period t input vector, yt is the period t output, pt is the
vector of period t input prices and et ≡ [e1t , . . . , eNt ]T is a vector of stochastic error
terms with 0 means. Equation (195) can be used in order to statistically estimate
the N(N + 1)/2 independent bij parameters in the B matrix. However, the system
of equations defined by Eq. (195) is nonlinear in the unknown parameters. Define
period t unit cost by ct ≡ ptT xt /yt . In theory, ct should equal (ptT Bpt )1/2 plus an error
term. Thus, the system of estimating equations (195) can be replaced by the following
system:

$$c^{t}x^{t}/y^{t} = Bp^{t} + e^{t*}; \quad t = 1, \dots, T \tag{196}$$

where et∗ ≡ [e1 t∗ , . . . , eN t∗ ]T is a new vector of stochastic error terms with 0 means.
Note that the new system of estimating equations defined by Eq. (196) is linear in
the unknown bij .71
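
Because system (196) is linear in the bij , it can be fitted by ordinary least squares once the symmetry of B is built into the regressors. The following is a minimal sketch under stated assumptions: the data are simulated without noise from a hypothetical B, the equations are simply stacked (so any cross-equation error covariance is ignored), and the function names are illustrative.

```python
import numpy as np

def symmetric_design(p):
    """Regressors for B @ p written in terms of the upper triangle of symmetric B."""
    n = p.size
    rows, cols = np.triu_indices(n)
    X = np.zeros((n, rows.size))
    for k, (i, j) in enumerate(zip(rows, cols)):
        X[i, k] += p[j]                   # b_ij enters equation i with weight p_j
        if i != j:
            X[j, k] += p[i]               # and equation j with weight p_i
    return X

def estimate_B(prices, quantities, outputs):
    """Least squares fit of c_t x_t / y_t = B p_t, stacking all N equations."""
    n = prices.shape[1]
    unit_cost = np.einsum("tn,tn->t", prices, quantities) / outputs   # c_t
    lhs = (unit_cost[:, None] * quantities / outputs[:, None]).ravel()
    X = np.vstack([symmetric_design(p) for p in prices])
    theta, *_ = np.linalg.lstsq(X, lhs, rcond=None)
    B_hat = np.zeros((n, n))
    B_hat[np.triu_indices(n)] = theta     # fill the upper triangle
    return B_hat + np.triu(B_hat, 1).T    # mirror it to obtain a symmetric matrix

# Simulated, noise-free data generated from a hypothetical B.
rng = np.random.default_rng(0)
B_true = 2.0 * np.ones((3, 3)) - np.eye(3)
prices = 1.0 + rng.random((6, 3))
outputs = 5.0 + rng.random(6)
quantities = np.array([y * (B_true @ p) / np.sqrt(p @ B_true @ p)
                       for p, y in zip(prices, outputs)])

B_hat = estimate_B(prices, quantities, outputs)
print(np.round(B_hat, 6))
```

With noisy data the estimated matrix need not have one positive and N − 1 nonpositive eigenvalues, which is the regularity problem taken up next in the text.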
As was the case when estimating the normalized quadratic unit cost function,
it will often turn out that the estimated B matrix will not satisfy the regularity
conditions that are associated with definition (180). As we have seen above, B can
be written equivalently as bbT + A where b is a strictly
positive vector and A is a symmetric negative semidefinite matrix with Ap∗ = 0N for
some strictly positive reference vector p* . Thus, we need only set A = − CCT where
C is a lower triangular matrix with CT p∗ = 0N and the correct curvature conditions

70 This result is much more important in the consumer context where we interpret f(x) as a utility
function defined over consumption vectors x and c(p) is the dual unit expenditure function. Note
that utility cannot be observed whereas output can be observed.
71 In the consumer context where output yt is replaced by (unobservable) utility ut and xt is
the period t consumption vector, rewrite Eq. (195) as xt = ut Bpt /(ptT Bpt )1/2 where we have
dropped the error terms. Total period t expenditure is ptT xt = ut (ptT Bpt )1/2 . Thus we obtain
xt /ptT xt = Bpt /ptT Bpt . Premultiply both sides of equation n by pnt and we obtain the following
system of estimating equations:

$$s_n^{t} \equiv p_n^{t}x_n^{t}/p^{tT}x^{t} = p_n^{t}\sum_{i=1}^{N} b_{ni}\,p_i^{t}\big/p^{tT}Bp^{t} + e_n^{t}; \quad n = 1, \dots, N;\ t = 1, \dots, T.$$

We need to impose a normalization on the elements of the B matrix such as
b11 = 1 and since we have share equations, we need to drop one of these share equations in the
nonlinear estimation procedure. For an example of this methodology in the consumer context, see
Diewert and Feenstra [32].

will be imposed on the resulting functional form for the unit cost function defined
as follows:

$$c(p) \equiv \left[p^{T}\left(bb^{T} - CC^{T}\right)p\right]^{1/2} = \left[p^{T}\Big(bb^{T} - \sum_{i=1}^{N-1} c_i c_i^{T}\Big)p\right]^{1/2} \tag{197}$$

where $c_1^{T} \equiv [c_{11}, c_{21}, \dots, c_{N1}]$, $c_2^{T} \equiv [0, c_{22}, \dots, c_{N2}]$, $c_3^{T} \equiv [0, 0, c_{33}, \dots, c_{N3}]$,
. . . , $c_{N-1}^{T} \equiv [0, \dots, 0, c_{N-1,N-1}, c_{N,N-1}]$ and $c_n^{T}p^{*} = 0$ for n = 1, 2, . . . ,
N − 1.
We have considered four flexible functional forms for a unit cost function: the
generalized Leontief, the translog, the normalized quadratic, and the KBF functional
forms. The last two functional forms have the advantage that concavity can be
imposed on these functional forms without destroying the flexibility of the resulting
functions. The normalized quadratic functional form has the disadvantage that it is
usually necessary to choose the vector α72 , whereas all of the parameters for the
KBF functional form can be estimated endogenously.

Semiflexible Functional Forms

In models where the number of commodities N is large, it can be difficult to
estimate all of the parameters for a flexible functional form. Thus, when estimating
the parameters for the normalized quadratic defined by Eq. (146) above, it was
necessary to estimate the elements of the N by N symmetric matrix B and for
the KBF functional form, it was necessary to estimate the elements of the N by N
symmetric matrix A in Eq. (186). If we impose concavity on these functional forms,
then in both of these cases, the B and A matrices are replaced by – CCT where C is
lower triangular and CT p∗ = 0N for a reference positive price vector p* . An effective
way to estimate the C matrix is to estimate it one column at a time. Thus, consider
our estimating equations (162) for the normalized quadratic unit cost function. Replace
the B matrix in these equations by – CCT where C is lower triangular and CT p∗ = 0N
and we obtain the following system of equations:

$$x^{t}/y^{t} = b - CC^{T}v^{t} + (1/2)\,v^{tT}CC^{T}v^{t}\,\alpha + e^{t}; \quad t = 1, \dots, T. \tag{198}$$

In Stage 1, we set C = 0N×N and use the resulting equations in (198) in
order to estimate the vector of parameters b. In Stage 2, set CCT = c1 c1T where
c1T ≡ [c11 , c21 , . . . , cN1 ] and c1T p∗ = 0. Equation (198) now becomes a nonlinear
regression model. For starting parameter values, use the b vector that was estimated
in Stage 1 and set the vector c1 = 0N . In Stage 3, set CCT = c1 c1T + c2 c2T where
c1T ≡ [c11 , c21 , . . . , cN1 ], c2T ≡ [0, c22 , . . . , cN2 ] and ciT p∗ = 0 for i = 1, 2. For

72 Thus different choices for the α vector could lead to different estimates for elasticities of demand.

N − 1 components of the α vector could be estimated along with the remaining parameters but then
we would not have a parsimonious flexible functional form.

starting parameter values, use the b and c1 vectors that were estimated in Stage 2 and
set the vector c2 = 0N . This procedure of gradually adding nonzero columns of the
lower triangular C matrix can be continued until the full number of N − 1 nonzero
columns have been added, provided that the number of time series observations T is
large enough compared to N, the number of commodities in the model.73 However,
in models where T is small relative to N, the above procedure of adding nonzero
columns to C will have to be stopped well before the maximum number of N − 1
nonzero columns has been added, due to the lack of degrees of freedom. Suppose
that we stop the above procedure after K < N − 1 nonzero columns have been
added. Then Diewert and Wales ([43], 330) called the resulting normalized quadratic
functional form a flexible of degree K functional form or a semiflexible functional
form. A flexible of degree K functional form for a cost function can approximate
an arbitrary twice continuously differentiable functional form to the second order at
some point, except the matrix of second order partial derivatives of the functional
form with respect to prices is restricted to have maximum rank K instead of the
maximum possible rank, N − 1.
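
The restricted matrix C that underlies a flexible of degree K form can be sketched as follows. This is only an illustration of the parameterization, not of the staged estimation itself: the reference prices, the free parameter values, and the function name `build_C` are assumptions. Each column has leading zeros (lower triangularity) and its last element is solved from the restriction that the column is orthogonal to p* .

```python
import numpy as np

def build_C(free_params, p_star, K):
    """Lower triangular N x K matrix C whose columns satisfy c_k' p* = 0."""
    n = p_star.size
    C = np.zeros((n, K))
    pos = 0
    for k in range(K):
        m = n - 1 - k                          # free entries sit in rows k, ..., n-2
        C[k:n - 1, k] = free_params[pos:pos + m]
        # the last entry is solved from the restriction c_k' p* = 0
        C[n - 1, k] = -(C[k:n - 1, k] @ p_star[k:n - 1]) / p_star[n - 1]
        pos += m
    return C

# Hypothetical example with N = 4 commodities and K = 2 nonzero columns.
p_star = np.array([1.0, 2.0, 0.5, 1.5])
free = np.array([0.3, -0.2, 0.5, 0.4, 0.1])    # 3 free entries in column 1, 2 in column 2
C = build_C(free, p_star, K=2)
B = -C @ C.T                                   # negative semidefinite, rank at most K

print(C)
print("rank of B:", np.linalg.matrix_rank(B))
print("B @ p_star:", B @ p_star)
```

In a staged estimation, the entries of `free` would be the parameters handed to a nonlinear least squares or maximum likelihood routine, with one more column of C added at each stage.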
The cost of using a semiflexible functional form of degree K where K is less than
N − 1 is that we will miss out on the part of CCT that corresponds to the smallest
eigenvalues of this matrix. In many situations, this cost will be very small; that is, as
we go through the various stages of estimating C by adding an extra nonzero column
to C at each stage, we can monitor the increase in the final log likelihood (if we use
maximum likelihood estimation) and when the increase in Stage k + 1 over Stage k
is “small,” we can stop adding extra columns, secure in the knowledge that we are
not underestimating the size of CCT by a large amount.
This semiflexible technique has not been widely applied but it would seem to
offer some big advantages in estimating substitution matrices in situations where
there are a large number of commodities in the model.74

The Use of Splines for Modeling Technical Progress

Recall the definitions for the generalized Leontief, normalized quadratic, and KBF
unit cost functions c(p) given by Eqs. (97), (146), and (180). If these functions are
estimated in the time series context for a production unit for say T periods, then a
problem will often occur: these functional forms make no allowance for technical
progress that may have taken place over the sample time period. This problem
can be solved if we add the function dT pt to the unit cost function c(p) where

73 In empirical applications, typically a final stage K < N − 1 will be reached where the addition of

another column to the CCT matrix leads to no increase in log likelihood and the last column cK is
a column of zeros.
74 Diewert and Lawrence in some unpublished work have successfully estimated semiflexible

models for profit functions for 40 commodities. Neary [75] used semiflexible functional forms
for 11 commodity groups.

dT ≡ [d1 , . . . , dN ] is an N dimensional vector of technical progress parameters and
t is a scalar time variable which takes on the value t for time period t. Thus, choose
a flexible functional form for the unit cost function c(p) and add the function dT pt
to it. Using our usual notation for a data set on inputs xt , input prices pt and output
levels yt for period t, we obtain the following system of estimating equations using
Shephard’s Lemma:

$$x^{t}/y^{t} = \nabla c(p^{t}) + d\,t + e^{t}; \quad t = 1, \dots, T \tag{199}$$

where et is a suitable error vector. If we choose c(p) to be the normalized quadratic
unit cost function, then the resulting estimating equations (199) will be linear in the
unknown parameters.75
However, in many applications of this model, the results may not be satisfactory.
The problem with the model defined by Eq. (199) is that the resulting measures
of technical progress are too smooth; that is, typically if one looks at the residuals
generated by the model, substantial amounts of autocorrelation will be present in
the estimating equations. This is an indication that rates of technical progress are not
constant over the sample time period. Under these circumstances, it will be useful to
replace the simple linear function dT pt by the following piece-wise linear
spline function, τ(p, t), defined as follows:

$$\tau(p, t) \equiv \begin{cases} d_1^{T}p\,t & \text{if } 1 \le t \le t_1^{*};\\ d_1^{T}p\,t_1^{*} + d_2^{T}p\,(t - t_1^{*}) & \text{if } t_1^{*} \le t \le t_2^{*};\\ d_1^{T}p\,t_1^{*} + d_2^{T}p\,(t_2^{*} - t_1^{*}) + d_3^{T}p\,(t - t_2^{*}) & \text{if } t_2^{*} \le t \le T \end{cases} \tag{200}$$

where d1 , d2 , and d3 are N dimensional technical progress parameters and t1* and
t2∗ > t1∗ are two time periods where the piece-wise linear function of time t, τ(p, t),
changes from one set of rates of technical progress to another set.76 The estimating
equations are now the following ones77 :

75 However, if we impose concavity on the normalized quadratic functional form, then the resulting

estimating equations will be nonlinear in the unknown parameters. For a worked example of this
methodology for modelling technical progress, see Diewert and Wales [42].
76 The break points t1* and t2* can be chosen by running a preliminary regression of the form

(199) and examining the regression residuals to see when these turning points occur. In our
example, we have three time periods where the rates of technical progress are linear in time.
If necessary, additional break points can be added at the cost of having to estimate additional
parameter vectors di .
77 If the unit cost function is translog, then the estimating equations will be somewhat different.

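$$x^{t}/y^{t} = \begin{cases} \nabla c(p^{t}) + d_1 t + e^{t}; & 1 \le t \le t_1^{*};\\ \nabla c(p^{t}) + d_1 t_1^{*} + d_2\,(t - t_1^{*}) + e^{t}; & t_1^{*} < t \le t_2^{*};\\ \nabla c(p^{t}) + d_1 t_1^{*} + d_2\,(t_2^{*} - t_1^{*}) + d_3\,(t - t_2^{*}) + e^{t}; & t_2^{*} < t \le T. \end{cases} \tag{201}$$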

If we choose c(p) to be the normalized quadratic unit cost function, then, assuming
that it is not necessary to impose concavity, the above estimating equations will be
linear in the unknown parameters. For an example of the use of the above spline
methodology, see Fox [54].78
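
The time terms that multiply d1 , d2 , and d3 in (201) can be generated mechanically, which is convenient when building regressors. The following is a minimal sketch; the break points and parameter vectors are hypothetical and the function names are illustrative.

```python
import numpy as np

def spline_time_terms(t, t1, t2):
    """Return (s1, s2, s3), the multipliers of d1, d2, d3 in (201) for period t."""
    s1 = min(t, t1)                       # time spent in the first segment
    s2 = min(max(t - t1, 0), t2 - t1)     # time spent in the second segment
    s3 = max(t - t2, 0)                   # time spent in the third segment
    return s1, s2, s3

def technical_progress_shift(t, t1, t2, d1, d2, d3):
    """Vector added to the gradient of c(p) in period t: d1*s1 + d2*s2 + d3*s3."""
    s1, s2, s3 = spline_time_terms(t, t1, t2)
    return d1 * s1 + d2 * s2 + d3 * s3

# Hypothetical break points and technical progress vectors for N = 2 inputs.
t1, t2 = 5, 10
d1 = np.array([-0.02, -0.01])
d2 = np.array([-0.01, 0.00])
d3 = np.array([-0.03, -0.02])

for t in (3, 7, 12):
    print(t, spline_time_terms(t, t1, t2), technical_progress_shift(t, t1, t2, d1, d2, d3))
```

The three terms always sum to t, so setting d1 = d2 = d3 = d recovers the simple linear trend model (199).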
The above spline methodology for modeling technical progress can be modified
to model nonconstant returns to scale technologies; see Fox and Grafton [55].
The above linear spline model has the disadvantage that rates of technical
progress will typically jump in a discontinuous manner as we move from one linear
spline segment to the following one. This problem can be remedied (at the cost
of a more complicated set of estimating equations) if the linear splines in time t
are replaced with quadratic splines in t. For an example of the quadratic spline
approach, see Diewert and Wales [44].

Allowing for Flexibility at Two Sample Points

There can be a problem with our two flexible functional forms for unit cost
functions where the correct curvature conditions can be imposed (the normalized
quadratic and the KBF unit cost functions): the elasticities of input demand that
these functions generate in the time series context can exhibit substantial trends.
We need to derive a formula for the elasticity of demand for input n with respect
to a change in the price of input k, say Enk (y,p) where y is output and p is an input
price vector. Recall that the normalized quadratic unit cost function was defined by
c(p) ≡ bT p + (1/2)pT Bp/αT p where α is predetermined and B is a symmetric matrix
which satisfies Bp∗ = 0N .79 The vector of first order partial derivatives and the
matrix of second order partial derivatives of this unit cost function are as follows:

$$\nabla c(p) = b + Bv - (1/2)\,v^{T}Bv\,\alpha; \tag{202}$$

$$\nabla^{2}c(p) = (\alpha^{T}p)^{-1}\left[B - Bv\alpha^{T} - \alpha v^{T}B + v^{T}Bv\,\alpha\alpha^{T}\right] \tag{203}$$

78 Fox used a more scientific method to pick the break points (cross validation).
79 See Eqs. (146), (147), (148), and (149) above. We also require that B be negative semidefinite, a
property which can be imposed as was explained in section “The Normalized Quadratic Unit Cost
Function” above.

where v ≡ p/αT p is a vector of normalized input prices. The system of input
demand functions that is generated by this functional form is x(y, p) ≡ y ∇ c(p)
and the N by N matrix of input demand derivatives with respect to input
prices is ∇ p x(y, p) ≡ y∇ 2 c(p). Using Eq. (203), we see that the elasticity
Enk (y, p) ≡ [pk /xn ]∂xn (y, p)/∂pk is equal to the following expression:

$$E_{nk}(y, p) = \left[p_k/\alpha^{T}p\right]\left[y/x_n\right]\left[b_{nk} - B_{n\cdot}v\,\alpha_k - B_{k\cdot}v\,\alpha_n + v^{T}Bv\,\alpha_n\alpha_k\right]; \quad n, k = 1, \dots, N \tag{204}$$

where bnk is the nkth element of the matrix B, Bi· denotes the ith row of the B
matrix for i = 1, . . . , N and v ≡ p/αT p is the vector of normalized prices; that
is, the components of the input price vector p are divided by αT p. Note that when
p = p∗ , the restrictions imply that v∗ T Bv∗ = 0 and Bi· v∗ = 0 for i = 1, . . . , N where
v∗ ≡ p∗ /αT p∗ . Thus

$$E_{nk}(y, p^{*}) = \left[p_k^{*}/\alpha^{T}p^{*}\right]\left[y/x_n\right]b_{nk}; \quad n, k = 1, \dots, N. \tag{205}$$

The reference price vector p* will usually be a representative input price vector
for the sample under consideration. Thus, the price elasticity of input demand when
evaluated at these reference prices, Enk (y,p* ), will be equal to the constant term bnk
times the price ratio term pk ∗ /αT p∗ times the quantity ratio term y/xn . The remaining
three terms on the right hand side of Eq. (204) will be equal to zero when p = p∗ .
Thus, the first term will generally be the most significant term that defines Enk (y,p)
for a general input price vector. If there are substantial divergent trends in either
input prices p or input quantities x, it can be seen that [pk /αT p][y/xn ]bnk will also
have substantial trends and hence Enk (yt ,pt ) will, in general, also exhibit substantial
trends under these conditions.
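
Formulas (202) to (205) can be checked numerically. The sketch below assumes hypothetical values for b, B, α, and p (with B = − CCT and Bp* = 0N imposed by construction) and illustrative function names. It compares the analytic Hessian (203) with a finite difference approximation of the gradient (202) and evaluates the elasticity matrix (204).

```python
import numpy as np

def nq_gradient(p, b, B, alpha):
    """Gradient (202): b + Bv - (1/2) v'Bv alpha, with v = p / alpha'p."""
    v = p / (alpha @ p)
    return b + B @ v - 0.5 * (v @ B @ v) * alpha

def nq_hessian(p, B, alpha):
    """Hessian (203): (alpha'p)^{-1} [B - Bv alpha' - alpha v'B + v'Bv alpha alpha']."""
    v = p / (alpha @ p)
    Bv = B @ v
    inner = (B - np.outer(Bv, alpha) - np.outer(alpha, Bv)
             + (v @ Bv) * np.outer(alpha, alpha))
    return inner / (alpha @ p)

def nq_elasticities(p, y, b, B, alpha):
    """Elasticities (204): E[n, k] = (p_k / x_n) * dx_n/dp_k with x = y * gradient."""
    x = y * nq_gradient(p, b, B, alpha)
    return y * nq_hessian(p, B, alpha) * p[None, :] / x[:, None]

# Hypothetical parameters; the columns of C sum to zero so that B p* = 0 at p* = (1, 1, 1).
p_star = np.ones(3)
C = np.array([[1.0, 0.0], [-0.4, 0.7], [-0.6, -0.7]])
B = -C @ C.T
b = np.array([1.0, 2.0, 0.5])
alpha = np.array([0.3, 0.3, 0.4])
p = np.array([1.2, 0.9, 1.1])

h = 1e-6   # step for central differences of the gradient
hess_fd = np.column_stack([
    (nq_gradient(p + h * e, b, B, alpha) - nq_gradient(p - h * e, b, B, alpha)) / (2 * h)
    for e in np.eye(3)
])
E = nq_elasticities(p, 10.0, b, B, alpha)
print(np.round(E, 4))
```

Each row of the elasticity matrix sums to zero because input demands are homogeneous of degree zero in prices, which gives a quick consistency check on any implementation of (204).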
What can be done to remedy this problem of trending elasticities? If the number
of observations τ + 1 is relatively large compared to the number of inputs N, then
we can set the unit cost function equal to the following function of time t:

$$c(p, t) \equiv (1 - \tau^{-1}t)\,b_1^{T}p + \tau^{-1}t\,b_2^{T}p + (1/2)\,p^{T}\left[(1 - \tau^{-1}t)B_1 + \tau^{-1}tB_2\right]p\big/\alpha^{T}p; \quad t = 0, 1, 2, \dots, \tau \tag{206}$$

where B1 p0 = 0N and B2 pτ = 0N .80 Thus the resulting unit cost function evaluated
at period 0 is c(p, 0) ≡ b1T p + (1/2)pT B1 p/αT p and evaluated at period τ is

80 We require that B1 and B2 be symmetric negative semidefinite matrices. If the estimated
matrices fail to be negative semidefinite, then we can impose negative semidefiniteness by setting
Bi = − Ci CiT for i = 1, 2 where each Ci is an arbitrary lower triangular matrix satisfying
C1T p0 = 0N and C2T pτ = 0N .

c(p, τ) ≡ b2T p + (1/2)pT B2 p/αT p; that is, the resulting unit cost function is flexible
at two data points. If there are trends in input demand elasticities using this
functional form, then these trends are implied by the data rather than by the choice
of the functional form.81 Note that the unit cost function defined by Eq. (206) allows
for biased technical change over the sample period; that is, it allows for trends in the
b ≡ (1 − τ−1 t)b1 + τ−1 tb2 vector.82
It is possible to generalize the KBF unit cost function in a similar manner. Recall
that this unit cost function was defined by Eq. (180): c(p) ≡ (pT Bp)1/2 where
B ≡ bbT + A and A is a negative semidefinite symmetric matrix which satisfies
Ap∗ = 0N . The vector of first order partial derivatives was defined by Eq. (181).
Using this equation and Shephard’s Lemma, we have x(y, p) = y ∇ c(p) and so
x/y = ∇ c(p). Thus, using (181), we obtain the following equations:

$$x(y, p) = y\nabla c(p) = yBp/(p^{T}Bp)^{1/2}. \tag{207}$$

When p = p∗ , using B ≡ bbT + A and Ap∗ = 0N , it can be seen that

$$x^{*} \equiv x(y, p^{*}) = yb. \tag{208}$$

The matrix of input demand derivatives with respect to input prices is
∇p x(y, p) = y∇2 c(p). The matrix of second order partial derivatives of the unit
cost function was defined by Eq. (182). Thus, we have:

$$\begin{aligned}
\nabla_p x(y, p) &= y\nabla^{2}c(p)\\
&= y\,(p^{T}Bp)^{-1/2}\left[B - Bp\,(p^{T}Bp)^{-1}p^{T}B\right] && \text{using (182)}\\
&= y\,c(p)^{-1}\left[B - Bp\,(p^{T}Bp)^{-1}p^{T}B\right] && \text{using } c(p) \equiv (p^{T}Bp)^{1/2}\\
&= y\,c(p)^{-1}\left[B - y^{-2}x(y, p)\,x(y, p)^{T}\right] && \text{using (207)}\\
&= y\,c(p)^{-1}\left[bb^{T} + A - y^{-2}x(y, p)\,x(y, p)^{T}\right] && \text{using } B = bb^{T} + A.
\end{aligned} \tag{209}$$

Now evaluate Eq. (209) when p = p∗ . We find that:

81 This technique of imposing price flexibility at two points is due to Diewert and Lawrence [40].
82 If the residuals in the final model exhibit substantial autocorrelation, then it is possible to replace

the b vector by a piece-wise linear function of time as was done in the previous section. This will
allow for a more general pattern of technical change.

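$$\begin{aligned}
\nabla_p x(y, p^{*}) &= y\,c(p^{*})^{-1}\left[A + bb^{T} - y^{-2}x(y, p^{*})\,x(y, p^{*})^{T}\right]\\
&= y\,c(p^{*})^{-1}\left[A + bb^{T} - bb^{T}\right] && \text{using (208)}\\
&= y\,c(p^{*})^{-1}A.
\end{aligned} \tag{210}$$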

Using Eq. (209), we see that the elasticity Enk (y, p) ≡ [pk /xn ]∂xn (y, p)/∂pk is
equal to the following expression:

$$E_{nk}(y, p) = \left[p_k/c(p)\right]\left[y/x_n(y, p)\right]\left[a_{nk} + b_nb_k - y^{-2}x_n(y, p)\,x_k(y, p)\right]; \quad n, k = 1, \dots, N \tag{211}$$

where ank is the nkth element of the negative semidefinite matrix A (which satisfies
Ap∗ = 0N ), bn is the nth element of the vector b and xn (y,p) is the nth element of
the cost minimizing input vector x(y,p) defined by Eq. (207). Using Eq. (210), we
see that when p = p∗ , Enk (y, p∗ ) = [pk∗ /c(p∗ )][y/xn (y, p∗ )]ank so that the last two
terms on the right hand side of Eq. (211) sum to zero when p = p∗ . Thus, the first
term associated with ank will generally be the most significant term that defines
Enk (y,p) for a general input price vector. If there are substantial divergent trends in
either input prices p or input quantities x, it can be seen that [pk /c(p)][y/xn (y,p)]ank
will also have substantial trends and hence Enk (yt ,pt ) will, in general, also exhibit
substantial trends under these conditions.
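
The evaluation of the KBF demand derivatives at the reference prices can be verified numerically. The sketch below assumes hypothetical values for b, C, p* , and y, with A = − CCT constructed so that Ap* = 0N ; the function name is illustrative. It implements the last line of (209) and checks the results (208) and (210).

```python
import numpy as np

def kbf_demand_jacobian(p, y, b, A):
    """Price derivatives of input demand: y c(p)^{-1} [bb' + A - y^{-2} x x']."""
    B = np.outer(b, b) + A
    c = np.sqrt(p @ B @ p)                    # unit cost c(p)
    x = y * (B @ p) / c                       # input demands (207)
    jac = (y / c) * (B - np.outer(x, x) / y**2)
    return jac, x, c

# Hypothetical parameters; the columns of C sum to zero so that A p* = 0 at p* = (1, 1, 1).
b = np.array([1.0, 2.0, 0.5])
C = np.array([[1.0, 0.0], [-0.4, 0.7], [-0.6, -0.7]])
A = -C @ C.T
p_star = np.ones(3)
y = 10.0

jac_star, x_star, c_star = kbf_demand_jacobian(p_star, y, b, A)
E_star = jac_star * p_star[None, :] / x_star[:, None]   # elasticities (211) at p*

print("x(y, p*):", x_star)                    # equals y * b
print("c(p*):", c_star)                       # equals b'p*
print(np.round(E_star, 4))
```

At the reference prices the bbT term cancels, so the whole substitution matrix is driven by A, which is why the ank term dominates the elasticities in a neighborhood of p* .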
Again, if the number of observations τ + 1 is relatively large compared to
the number of inputs N, then we can set the KBF unit cost function equal to the
following function of time t:

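$$c(p, t) \equiv \left(\left[(1 - \tau^{-1}t)\,b_1^{T}p + \tau^{-1}t\,b_2^{T}p\right]^{2} + p^{T}\left[(1 - \tau^{-1}t)A_1 + \tau^{-1}tA_2\right]p\right)^{1/2}; \quad t = 0, 1, 2, \dots, \tau \tag{212}$$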

where A1 p0 = 0N and A2 pτ = 0N .83 Thus, the resulting unit cost function
evaluated at period 0 is c(p, 0) ≡ (pT [b1 b1T + A1 ]p)1/2 and evaluated at period τ
is c(p, τ) ≡ (pT [b2 b2T + A2 ]p)1/2 ; that is, the resulting unit cost function is flexible
at two data points. As was the case for the normalized quadratic, if there are trends
in input demand elasticities using this functional form, then these trends are implied
by the data rather than by the choice of the functional form. Again, the unit cost

83 We require that A1 and A2 be symmetric negative semidefinite matrices. Again, if the estimated
matrices fail to be negative semidefinite, then we can impose negative semidefiniteness by setting
Ai = − Ci CiT for i = 1, 2 where each Ci is an arbitrary lower triangular matrix satisfying
C1T p0 = 0N and C2T pτ = 0N .

function defined by Eq. (212) allows for biased technical change over the sample
period; that is, it allows for trends in the b ≡ (1 − τ−1 t)b1 + τ−1 tb2 vector.
We turn our attention to multiple input and multiple output technologies.

National Product or Variable Profit Functions

Up to now, we have only considered technologies that produce one output. In reality,
production units (firms or industries) usually produce many outputs.84 Hence, in
this section, we consider technologies that produce many outputs while using many
inputs.
Let S denote the technology set of a production unit. We decompose the
inputs and outputs of the firm into two sets of commodities: variable and fixed.
Let y ≡ [y1 , . . . , yM ] denote a vector of variable net outputs (if ym > 0, then
commodity m is an output while if ym < 0, then commodity m is an input) and
let x ≡ [x1 , . . . , xN ] denote a nonnegative vector of “fixed” inputs.85 Thus, the
technology set S is a set of feasible variable net output and fixed input vectors,
(y,x).
Let p ≫ 0M be a strictly positive vector of variable net output prices that the
firm faces during a production period. Then conditional on a given vector of fixed
inputs x ≥ 0N , we assume that the firm attempts to solve the following conditional
or variable profit maximization problem:

$$\max_{y}\,\{p^{T}y : (y, x) \in S\} \equiv \pi(p, x). \tag{213}$$

The optimized objective function, π(p, x), has been called many names,86
depending on the context. Alternative names for this function are the national
product function Samuelson ([79], 10), the gross profit function Gorman [58], the
conditional profit function McFadden [72, 73], the variable profit function Diewert
[21], the GDP function Kohli [65, 67], and the value added function Diewert [25].
If there are no intermediate inputs or imports in the outputs, then π(p, x) becomes
the revenue function Diewert [23]. Some regularity conditions on the technology set
S are required in order to ensure that the maximum in Eq. (213) exists. A simple set
of sufficient conditions are87 : (i) S is a closed set in $R^{M+N}$ and (ii) for each x ≥ 0N ,

84 See Bernard et al. [4]. In the sample of US firms considered by Hottman et al. ([63], 1301),

the mean number of products (measured by distinct barcodes) was 13 per firm and the maximum
number was 388.
85 These “fixed” inputs may only be fixed in the short run. Or we may simply decide to allow

outputs and intermediate inputs to be variable and condition on an x vector of primary inputs.
86 The concept of this function is due to Samuelson [79].
87 Let x ≥ 0N and p ≫ 0M . Then by (ii), there exists yx such that (yx , x) ∈ S. Define the closed
and bounded set B(x, p) ≡ {y : y ≤ b(x)1M ; pT y ≥ pT yx } where b(x) > 0 is an upper bound on all
possible net output vectors that can be produced by the technology if the vector of fixed inputs x is

there exists a y such that (y, x) ∈ S and the set of such y vectors is bounded from
above. We will call these conditions the minimal regularity conditions on S.
Note that π(p, x) is equal to the optimized objective function in Eq. (213) and is
regarded as a function of the net output prices for variable commodities that the firm
faces, p, as well as a function of the vector of fixed inputs, x, that the firm has at
its disposal. Just as in section “Cost Functions: The One Output Case” above where
we showed that the cost function C(y,p) satisfied a number of regularity conditions
without assuming much about the production function, we can now show that the
profit function π(p, x) satisfies some regularity conditions without assuming much
about the technology set S.

Theorem 9 McFadden [72, 73], Gorman [58], Diewert [21]: Suppose the technol-
ogy set S satisfies the minimal regularity conditions (i) and (ii) above. Then the
variable profit function π(p, x) defined by Eq. (213) has the following properties
with respect to p for each x ≥ 0N :

Property 1 π(p, x) is positively linearly homogeneous in p for each fixed x ≥ 0N ;

that is,

$$\pi(\lambda p, x) = \lambda\pi(p, x) \quad \text{for all } \lambda > 0,\ p \gg 0_M \text{ and } x \ge 0_N. \tag{214}$$

Property 2 π(p, x) is a convex function of p for each x ≥ 0N ; that is,

$$x \ge 0_N,\ p^{i} \gg 0_M,\ i = 1, 2;\ 0 < \lambda < 1 \ \text{implies}\ \ \pi(\lambda p^{1} + (1 - \lambda)p^{2}, x) \le \lambda\pi(p^{1}, x) + (1 - \lambda)\pi(p^{2}, x). \tag{215}$$

Problem
23. Prove Theorem 9. Hint: Properties 1 and 2 above for π(p, x) are analogues to
Properties 2 and 4 for the cost function C(y,p) in Theorem 1 above and can be
proven in the same manner.
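
The two properties in Theorem 9 can be illustrated, although not proven, with a small numerical example. The sketch below assumes a hypothetical finite set of feasible net output vectors for one fixed input vector x (three variable commodities, the third being an input); the data and the function name are assumptions made only for the illustration.

```python
import numpy as np

def variable_profit(p, feasible_y):
    """pi(p, x): the maximum of p'y over the rows y of feasible_y, for one fixed x."""
    return float(np.max(feasible_y @ p))

# Hypothetical finite set S(x): two outputs and one variable input (negative entry).
feasible_y = np.array([[4.0, 1.0, -2.0],
                       [3.0, 3.0, -2.5],
                       [1.0, 4.0, -2.0],
                       [2.0, 2.0, -1.0]])
p1 = np.array([1.0, 2.0, 1.5])
p2 = np.array([3.0, 0.5, 1.0])
lam = 0.3

# Property 1: positive linear homogeneity in prices.
scaled = variable_profit(2.5 * p1, feasible_y)
# Property 2: convexity in prices.
mixed = variable_profit(lam * p1 + (1 - lam) * p2, feasible_y)
bound = lam * variable_profit(p1, feasible_y) + (1 - lam) * variable_profit(p2, feasible_y)

print(scaled, 2.5 * variable_profit(p1, feasible_y))
print(mixed, "<=", bound)
```

Both properties follow here because π(p, x) is a maximum of functions that are linear in p, which is the same mechanism used in the general proof.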

We now ask whether a knowledge of the profit function π(p, x) is sufficient to
determine the underlying technology set S. As was the case in section “The Duality
Between Cost and Production Functions” above, the answer to this question is yes,
but with some qualifications.
To see how a given profit function π(p, x) can be used to determine the
technology set that generated it, pick an arbitrary vector of fixed inputs x ≥ 0N and
an arbitrary vector of positive prices, p1 ≫ 0M . Now use the given profit function π
to define the following isoprofit surface: {y : p1T y = π(p1 , x)}. This isoprofit surface
must be tangent to the set of net output combinations y that are feasible, given that

available to the producer. It can be seen that the constraint (y, x) ∈ S in Eq. (213) can be replaced
by the constraint (y, x) ∈ S ∩ B(x, p). Using (ii), S ∩ B(x, p) is a closed and bounded set so that the
maximum in (213) will exist.

the vector of fixed inputs x is available to the firm, which is the conditional on x
production possibilities set, S(x) ≡ {y : (y, x) ∈ S}. It can be seen that this isoprofit
surface and the set lying below it must contain the set S(x); that is, the following
halfspace M(x,p1 ) contains S(x):

$$M(x, p^{1}) \equiv \{y : p^{1T}y \le \pi(p^{1}, x)\}. \tag{216}$$

Pick another positive vector of prices, p2 ≫ 0M , and it can be seen, repeating
the above argument, that the halfspace M(x, p2 ) ≡ {y : p2T y ≤ π(p2 , x)} must also
contain the conditional on x production possibilities set S(x). Thus, S(x) must belong
to the intersection of the two halfspaces M(x,p1 ) and M(x,p2 ). Continuing to argue
along these lines, it can be seen that S(x) must be contained in the following set,
which is the intersection over all p ≫ 0M of all of the supporting halfspaces to S(x):

$$M(x) \equiv \bigcap_{p \gg 0_M} M(x, p). \tag{217}$$

Note that M(x) is defined using just the given profit function, π(p, x). Note also
that since each of the sets in the intersection, M(x,p), is a convex set, then M(x) is
also a convex set. Since S(x) is a subset of each M(x,p), it must be the case that S(x)
is also a subset of M(x); that is, we have

$$S(x) \subset M(x). \tag{218}$$

Is it the case that S(x) is equal to M(x)? In general, the answer is no; M(x)
forms an outer approximation to the true conditional production possibilities set
S(x). Suppose that there are only two outputs and for a given input vector x, the
output production possibilities set is the heart shaped region in Fig. 2. The boundary
of the set M(x) partly coincides with the boundary of S(x) but it encloses a bigger
set: the backward bending parts of the true production frontier are replaced by the
dashed lines that are parallel to the y1 axis and the y2 axis and the inward bending
part of the true production frontier is replaced by the dashed line that is tangent to
the two regions where the boundary of M(x) coincides with the boundary of S(x).
However, if the producer is a price taker in the two output markets, then it can
be seen that we will never observe the producer’s nonconvex or backward bending
parts of the production frontier.
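
The outer approximation idea can be illustrated with a deliberately simple nonconvex example. The sketch below assumes a hypothetical set S(x) consisting of three points and tests a candidate point against the halfspaces M(x, p) on a finite grid of positive prices; a grid can only sample the intersection in (217), so this is an illustration rather than a membership test for M(x).

```python
import numpy as np

def support_value(p, feasible_y):
    """pi(p, x) for a finite set: the largest value of p'y over the rows of feasible_y."""
    return float(np.max(feasible_y @ p))

# Hypothetical nonconvex S(x) with two outputs: two extreme points and one interior point.
feasible_y = np.array([[4.0, 0.0],
                       [0.0, 4.0],
                       [1.0, 1.0]])
candidate = np.array([2.0, 2.0])          # midpoint of the two extreme points

grid = np.linspace(0.05, 5.0, 100)        # sampled positive prices for each output
in_every_halfspace = all(
    candidate @ np.array([a, c]) <= support_value(np.array([a, c]), feasible_y) + 1e-12
    for a in grid for c in grid
)
in_S = any(np.array_equal(candidate, y) for y in feasible_y)

print("candidate satisfies every sampled halfspace:", in_every_halfspace)
print("candidate belongs to S(x):", in_S)
```

The candidate point passes every sampled halfspace test even though it is not feasible, and the point (1, 1) is never profit maximizing at any of the sampled prices; both observations mirror the statement that a price taking producer never reveals the nonconvex parts of the frontier.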
What are conditions on the technology set S (and hence on the conditional
technology sets S(x)) that will ensure that the outer approximation sets M(x),
constructed using the variable profit function π(p, x), will equal the true technology
sets S(x)? It can be seen that the following two conditions on S (in addition to the
minimal regularity conditions (i) and (ii)) are the required conditions 88 :

88 If N = 1 so that there is only one fixed input, then given a producible net output vector y ∈ RM ,
we can define the (fixed) input requirements function that corresponds to the technology set S
as g(y) ≡ minx {x : (y, x) ∈ S}. In this case, condition (220) becomes the following condition: the

[Figure showing the sets S(x) and M(x); horizontal axis y1]

Fig. 2 The geometry of the two output maximization problem

For every x ≥ 0N , the set S(x) ≡ {y : (y, x) ∈ S} has the following free disposal
property: y1 ∈ S(x), y2 ≤ y1 implies y2 ∈ S(x); (219)

For every x ≥ 0N , the set S(x) ≡ {y : (y, x) ∈ S} is convex. (220)

Conditions (219) and (220) are the conditions on the technology set S that
are counterparts to the two regularity conditions of nondecreasingness and
quasiconcavity89 that were made on the production function, f(x), in section “The
Duality Between Cost and Production Functions” above in order to obtain a duality
between cost and production functions. If the firm is behaving as a price taker
in variable commodity markets, it can be seen that it is not restrictive from an
empirical point of view to assume that S satisfies conditions (219) and (220), just
as it was not restrictive to assume that the production function was nondecreasing
and quasiconcave in the context of the producer’s (competitive) cost minimization
problem studied earlier.
The next result provides a counterpart to Shephard’s Lemma, Theorem 5 in
section “The Derivative Property of the Cost Function” above.

Theorem 10 Hotelling’s ([61], 594) Lemma⁹⁰: If the profit function π(p, x) satisfies
the properties listed in Theorem 9 above and in addition is once differentiable
with respect to the components of the variable commodity prices at the point (p*, x*)

input requirements function g(y) is quasiconvex in y. For additional material on this one fixed input
model, see Diewert [23].
89 Recall conditions (11) and (12) in section “The Duality Between Cost and Production Func-

tions.”
90 See also Gorman [58] and Diewert ([22], 137).
3 Duality in Production 131

where x* ≥ 0N and p* ≫ 0M, then

$$y^* = \nabla_p \pi(p^*, x^*) \tag{221}$$

where ∇pπ(p*, x*) is the vector of first order partial derivatives of variable profit
with respect to variable commodity prices and y* is any solution to the profit
maximization problem

$$\max_y \{p^{*T} y : (y, x^*) \in S\} \equiv \pi(p^*, x^*). \tag{222}$$

Under these differentiability hypotheses, it turns out that the y* solution to Eq.
(222) is unique.

Proof Let y* be any solution to the profit maximization problem (222). Since y*
is feasible for the profit maximization problem when the variable commodity price
vector is changed to an arbitrary p ≫ 0M, it follows that

$$p^T y^* \le \pi(p, x^*) \text{ for every } p \gg 0_M. \tag{223}$$

Since y* is a solution to the profit maximization problem (222) when p = p*, we
must have

$$p^{*T} y^* = \pi(p^*, x^*). \tag{224}$$

But Eqs. (223) and (224) imply that the function of M variables, g(p) ≡ pT y* −
π(p, x*), is nonpositive for all p ≫ 0M with g(p*) = 0. Hence, g(p) attains a global
maximum at p = p* and since g(p) is differentiable with respect to the variable
commodity prices p at this point, the following first order necessary conditions for
a maximum must hold at this point:

$$\nabla_p g(p^*) \equiv y^* - \nabla_p \pi(p^*, x^*) = 0_M. \tag{225}$$

Now note that Eq. (225) is equivalent to Eq. (221). If y** is any other solution to
the profit maximization problem (222), then repeat the above argument to show that
y** = ∇pπ(p*, x*), which in turn is equal to y*.
Hotelling’s Lemma may be used in order to derive systems of variable commod-
ity output supply and input demand functions just as we used Shephard’s Lemma to
generate systems of cost minimizing input demand functions; for examples of this
use of Hotelling’s Lemma, see Diewert ([22], 137–139) and sections “The Translog
Variable Profit Function,” “The Normalized Quadratic Variable Profit Function,”
and “The KBF Variable Profit Function” below.
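Hotelling’s Lemma can be checked numerically on a small example. The sketch below is only an illustration under an assumed technology that does not appear in the chapter: one output y1, one variable input L = −y2 and one fixed input x, with y1 ≤ (Lx)^(1/2). For this assumed technology the variable profit function works out to π(p, x) = p1² x/(4 p2). The function names and the numerical values are hypothetical.

```python
import math

def variable_profit(p1, p2, x):
    """Closed-form variable profit for the assumed technology y1 <= sqrt(L * x)."""
    return p1 ** 2 * x / (4.0 * p2)

def net_supplies(p1, p2, x):
    """Profit maximizing net supplies: output y1 >= 0 and variable input y2 = -L <= 0."""
    labour = (p1 / (2.0 * p2)) ** 2 * x  # solves the first order condition for L
    return math.sqrt(labour * x), -labour

def price_gradient(p1, p2, x, h=1e-6):
    """Central finite-difference approximation to the price gradient of variable profit."""
    d1 = (variable_profit(p1 + h, p2, x) - variable_profit(p1 - h, p2, x)) / (2 * h)
    d2 = (variable_profit(p1, p2 + h, x) - variable_profit(p1, p2 - h, x)) / (2 * h)
    return d1, d2

p1, p2, x = 3.0, 2.0, 5.0
y1, y2 = net_supplies(p1, p2, x)
g1, g2 = price_gradient(p1, p2, x)
print("direct solution :", y1, y2)
print("price gradient  :", g1, g2)
```

Up to finite-difference error, the two printed vectors should agree, which is the content of Eq. (221) for this particular example.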
If we are willing to make additional assumptions about the underlying firm
production possibilities set S, then we can deduce that π(p, x) satisfies some
additional properties. One such additional property is the following one: S is subject
to the free disposal of fixed inputs if it has the following property:

$$x^2 > x^1 \ge 0_N \text{ and } (y, x^1) \in S \text{ implies } (y, x^2) \in S. \tag{226}$$

The above property means if the vector of fixed inputs x1 is sufficient to produce
the vector of variable inputs and outputs y and if we have at our disposal a
bigger vector of fixed inputs x2 , then y is still producible by the technology that
is represented by the set S.

Theorem 11⁹¹ Suppose the technology set S satisfies the weak regularity conditions
(i) and (ii) above.

(a) If in addition, S has the following property⁹²:

$$\text{For every } x \ge 0_N,\; (0_M, x) \in S; \tag{227}$$

then for every p ≫ 0M and x ≥ 0N, π(p, x) ≥ 0; that is, the variable profit
function is nonnegative if Eq. (227) holds.
(b) If S is a convex set, then for each p ≫ 0M, π(p, x) is a concave function of
x over the set Ω ≡ {x : x ≥ 0N}.
(c) If S is a cone so that the technology is subject to constant returns to scale, then
π(p, x) is (positively) homogeneous of degree one in the components of x.
(d) If S is subject to the free disposal of fixed inputs, property (226), then

$$p \gg 0_M,\; x^2 > x^1 \ge 0_N \text{ implies } \pi(p, x^2) \ge \pi(p, x^1); \tag{228}$$

that is, π(p, x) is nondecreasing in the components of x.

Proof of (a) Let p ≫ 0M and x ≥ 0N. Then

$$\begin{aligned}
\pi(p, x) &\equiv \max_y \{p^T y : (y, x) \in S\} \\
&\ge p^T 0_M \quad \text{since by (227), } (0_M, x) \in S \text{ and hence is feasible for the problem} \\
&= 0.
\end{aligned} \tag{229}$$

Proof of (b) Let p ≫ 0M, x1 ≥ 0N, x2 ≥ 0N and 0 < λ < 1. Then

91 The results in this Theorem are essentially due to Samuelson ([79], 20), Gorman [58], McFadden

(1968) and Diewert ([21, 22], 136) but they are packaged in a somewhat different form in this
chapter.
92 This property says that the technology can always produce no variable outputs and utilize no

variable inputs given any vector of fixed inputs x.

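$$\pi(p, x^1) \equiv \max_y \{p^T y : (y, x^1) \in S\} = p^T y^1 \text{ where } (y^1, x^1) \in S; \tag{230}$$

$$\pi(p, x^2) \equiv \max_y \{p^T y : (y, x^2) \in S\} = p^T y^2 \text{ where } (y^2, x^2) \in S. \tag{231}$$

Since S is assumed to be a convex set, we have

$$\lambda (y^1, x^1) + (1 - \lambda)(y^2, x^2) = (\lambda y^1 + (1 - \lambda) y^2,\; \lambda x^1 + (1 - \lambda) x^2) \in S. \tag{232}$$

Using the definition of π, we have:

$$\begin{aligned}
\pi(p, \lambda x^1 + (1 - \lambda) x^2) &\equiv \max_y \{p^T y : (y, \lambda x^1 + (1 - \lambda) x^2) \in S\} \\
&\ge p^T [\lambda y^1 + (1 - \lambda) y^2] \quad \text{since by (232), } \lambda y^1 + (1 - \lambda) y^2 \text{ is feasible for the problem} \\
&= \lambda p^T y^1 + (1 - \lambda) p^T y^2 \\
&= \lambda \pi(p, x^1) + (1 - \lambda) \pi(p, x^2) \quad \text{using (230) and (231).}
\end{aligned} \tag{233}$$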

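Proof of (c) Let p ≫ 0M, x* ≥ 0N and λ > 0. Then

$$\pi(p, x^*) \equiv \max_y \{p^T y : (y, x^*) \in S\} = p^T y^* \text{ where } (y^*, x^*) \in S. \tag{234}$$

Since S is a cone and since (y*, x*) ∈ S, then we have (λy*, λx*) ∈ S as well.
Hence, using a feasibility argument:

$$\begin{aligned}
\pi(p, \lambda x^*) &\equiv \max_y \{p^T y : (y, \lambda x^*) \in S\} \\
&\ge p^T \lambda y^* \quad \text{since } (\lambda y^*, \lambda x^*) \in S \text{ and hence } \lambda y^* \text{ is feasible for the problem} \\
&= \lambda p^T y^*.
\end{aligned} \tag{235}$$

Now suppose that the strict inequality in Eq. (235) held so that

$$\pi(p, \lambda x^*) \equiv \max_y \{p^T y : (y, \lambda x^*) \in S\} = p^T y^{**} > \lambda p^T y^* \text{ where } (y^{**}, \lambda x^*) \in S. \tag{236}$$

Since S is a cone, λ > 0, and (y**, λx*) ∈ S, then we have (λ⁻¹y**, x*) ∈ S as well.
Thus, λ⁻¹y** is feasible for the maximization problem (234) that defined π(p, x*)
and so

$$\begin{aligned}
p^T y^* &= \max_y \{p^T y : (y, x^*) \in S\} \quad \text{using (234)} \\
&\ge p^T \lambda^{-1} y^{**} \quad \text{since } \lambda^{-1} y^{**} \text{ is feasible for the problem} \\
&= \lambda^{-1} p^T y^{**}
\end{aligned} \tag{237}$$

or since λ > 0, Eq. (237) is equivalent to

$$\lambda p^T y^* \ge p^T y^{**} > \lambda p^T y^* \quad \text{using (236).} \tag{238}$$

But Eq. (238) implies that λpT y* > λpT y*, which is impossible and hence our
supposition is false. Thus (235) holds as an equality, π(p, λx*) = λπ(p, x*), which is the desired result.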

Proof of (d) Let p ≫ 0M and x2 > x1 ≥ 0N. Using the definition of π(p, x1), we have

$$\pi(p, x^1) \equiv \max_y \{p^T y : (y, x^1) \in S\} = p^T y^1 \text{ where } (y^1, x^1) \in S. \tag{239}$$

Using the free disposal property (226) for S, since (y1, x1) ∈ S and x2 > x1, we
have

$$(y^1, x^2) \in S. \tag{240}$$

Using the definition of π(p, x2), we have

$$\begin{aligned}
\pi(p, x^2) &\equiv \max_y \{p^T y : (y, x^2) \in S\} \\
&\ge p^T y^1 \quad \text{since by (240), } (y^1, x^2) \text{ is feasible} \\
&= \pi(p, x^1) \quad \text{using (239).}
\end{aligned} \tag{241}$$

If the technology set S satisfies the minimal regularity conditions (i) and (ii)
plus all of the additional conditions that are listed in Theorem 11 above (we shall
call such a technology set a regular technology set), then the associated variable
profit function π(p, x) will have all of the regularity conditions with respect to

its fixed input vector x that a nonnegative, nondecreasing, concave, and linearly
homogeneous production function f(x) possesses with respect to its input vector x.
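The four properties in Theorem 11 can be spot checked on a concrete functional form. The following sketch assumes, purely for illustration, the variable profit function π(p, x) = p1² (x1 x2)^(1/2)/(4 p2) with two fixed inputs; this form is not taken from the chapter and the test points are arbitrary. A numerical check at a few points is of course not a proof.

```python
import math

def variable_profit(p, x):
    """Assumed closed form: pi(p, x) = p1^2 * sqrt(x1 * x2) / (4 * p2)."""
    return p[0] ** 2 * math.sqrt(x[0] * x[1]) / (4.0 * p[1])

p = (3.0, 2.0)
xa = (1.0, 4.0)
xb = (9.0, 1.0)
lam = 0.3

# (a) nonnegativity of variable profit
nonnegative = variable_profit(p, xa) >= 0.0 and variable_profit(p, xb) >= 0.0

# (b) concavity in x: profit at a convex combination is at least the combination of profits
x_mix = tuple(lam * a + (1 - lam) * b for a, b in zip(xa, xb))
concave = variable_profit(p, x_mix) >= (
    lam * variable_profit(p, xa) + (1 - lam) * variable_profit(p, xb)
)

# (c) linear homogeneity in x: scaling all fixed inputs scales profit by the same factor
scale = 2.5
homogeneous = math.isclose(
    variable_profit(p, tuple(scale * v for v in xa)),
    scale * variable_profit(p, xa),
)

# (d) monotonicity in x: more of every fixed input cannot lower profit
x_bigger = (xa[0] + 1.0, xa[1] + 0.5)
nondecreasing = variable_profit(p, x_bigger) >= variable_profit(p, xa)

print(nonnegative, concave, homogeneous, nondecreasing)
```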
Hotelling’s Lemma enabled us to interpret the vector of first order partial
derivatives of the variable profit function with respect to the components of the
variable commodity price vector p, ∇ p π(p, x), as the producer’s vector of variable
profit maximizing output supply (and the negative of variable input demand)
functions, y(p,x), provided that the derivatives existed. If the first order partial
derivatives of the variable profit function π(p, x) with respect to the components
of the fixed input vector x exist, then this vector of derivatives, ∇ x π(p, x), can
also be given an economic interpretation as a vector of shadow prices or imputed
contributions to profit of adding marginal units of fixed inputs. The following result
also shows that these derivatives can be interpreted as competitive input prices for
the “fixed” factors if they are allowed to become variable.

Theorem 12; Samuelson’s Lemma⁹³ Suppose the technology set S satisfies the
minimal regularity assumptions (i) and (ii) above and in addition is a convex set.
Suppose in addition that p* ≫ 0M, x* ≥ 0N and that the vector of derivatives,
∇xπ(p*, x*) ≡ w*, exists. Then x* is a solution to the following long run profit
maximization problem that allows the “fixed” inputs x to be variable:

$$\max_x \{\pi(p^*, x) - w^{*T} x : x \ge 0_N\}. \tag{242}$$

Proof Part (b) of Theorem 11 above implies that π(p*, x) is a concave function of
x over the set Ω ≡ {x : x ≥ 0N}. The function −w*T x is linear in x and hence is also
a concave function of x over Ω. Hence, f(x) defined for x ≥ 0N as

$$f(x) \equiv \pi(p^*, x) - w^{*T} x \tag{243}$$

is also a concave function of x over the set Ω. Since x* ≥ 0N, x* ∈ Ω. Hence, using
the fact that a differentiable concave function has a first order Taylor series approximation that
provides an upper bound to the function around any point x* where the function is
differentiable, we have the following inequality:

$$\begin{aligned}
f(x) &\le f(x^*) + \nabla_x f(x^*)^T (x - x^*) \quad \text{for all } x \ge 0_N \\
&= \pi(p^*, x^*) - w^{*T} x^* + 0_N^T (x - x^*) \quad \text{since } \nabla_x f(x^*) = \nabla_x \pi(p^*, x^*) - w^* = 0_N \\
&= \pi(p^*, x^*) - w^{*T} x^*.
\end{aligned} \tag{244}$$

93 Samuelson’s National Product function, N(p,v), is the counterpart to our π(p, x) where his v is
a vector of primary inputs. Samuelson ([79], 10) derived the equations w = ∇ v N(p, v). Our proof
follows that of Diewert ([22], 140).

But Eqs. (243) and (244) show that x* solves the profit maximization problem
(242).
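A small numerical illustration of Samuelson’s Lemma follows. It is a sketch under an assumed functional form, π(p, x) = (2/3) p1^1.5 x^0.5 (3 p2)^(−0.5) with a single fixed input, which is concave in x and linearly homogeneous in p; neither this form nor the numbers come from the chapter. The shadow price w* is computed by a finite difference at x*, and a grid search then looks for the maximizer of π(p*, x) − w* x.

```python
import math

def variable_profit(p1, p2, x):
    """Assumed variable profit function, strictly concave in the single fixed input x."""
    return (2.0 / 3.0) * p1 ** 1.5 * math.sqrt(x) / math.sqrt(3.0 * p2)

def shadow_price(p1, p2, x, h=1e-6):
    """Central finite-difference approximation to d pi / d x."""
    return (variable_profit(p1, p2, x + h) - variable_profit(p1, p2, x - h)) / (2 * h)

p1, p2, x_star = 3.0, 2.0, 4.0
w_star = shadow_price(p1, p2, x_star)

def long_run_objective(x):
    """Objective of problem (242): variable profit minus the cost of the 'fixed' input."""
    return variable_profit(p1, p2, x) - w_star * x

# Grid of candidate fixed input levels from 0 to 20 in steps of 0.01.
grid = [0.01 * i for i in range(0, 2001)]
best_x = max(grid, key=long_run_objective)
print("shadow price w*:", w_star)
print("grid maximizer :", best_x)
```

If the lemma applies, the grid maximizer should coincide with x* = 4, up to the resolution of the grid.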

Corollary If in addition to the above assumptions, π(p, x) is differentiable with
respect to the components of p at the point (p*, x*), so that y* ≡ ∇pπ(p*, x*) exists,
then (y*, x*) solves the following long run profit maximization problem:

$$\Pi(p^*, w^*) \equiv \max_{y,x} \{p^{*T} y - w^{*T} x : (y, x) \in S\}. \tag{245}$$

Proof Using Hotelling’s Lemma, we know that y* solves the following variable
profit maximization problem:

$$\pi(p^*, x^*) \equiv \max_y \{p^{*T} y : (y, x^*) \in S\} = p^{*T} y^*. \tag{246}$$

Now look at the long run profit maximization problem defined by Eq. (245):

$$\begin{aligned}
\Pi(p^*, w^*) &\equiv \max_{y,x} \{p^{*T} y - w^{*T} x : (y, x) \in S\} \\
&= \max_x \{\max_y \{p^{*T} y : (y, x) \in S\} - w^{*T} x\} \quad \text{rewriting the problem as a two stage maximization problem} \\
&= \max_x \{\pi(p^*, x) - w^{*T} x\} \quad \text{using the definition of } \pi(p^*, x) \\
&= \pi(p^*, x^*) - w^{*T} x^* \quad \text{using Theorem 12.}
\end{aligned} \tag{247}$$

Hence, with x = x* being an x solution to Eq. (247), we must have

$$\begin{aligned}
\Pi(p^*, w^*) &\equiv \max_{y,x} \{p^{*T} y - w^{*T} x : (y, x) \in S\} \\
&= \max_y \{p^{*T} y : (y, x^*) \in S\} - w^{*T} x^* \quad \text{letting } x = x^* \\
&= p^{*T} y^* - w^{*T} x^* \quad \text{using (246).}
\end{aligned} \tag{248}$$

Hotelling’s Lemma and Samuelson’s Lemma can be used as a convenient method
for obtaining econometric estimating equations for determining the parameters that
characterize a producer’s technology set S. Assuming that S satisfies the minimal
regularity conditions on S, we need only postulate a differentiable functional form
for the producer’s variable profit function, π(p, x), that is linearly homogeneous
and convex in p. Suppose that we have collected data on the fixed input vectors
used by the production unit in period t, xt, and the net supply vectors for variable
commodities produced in period t, yt, for t = 1, . . . , T time periods as well as
the corresponding variable commodity price vectors pt. Then the following MT
equations can be used in order to estimate the unknown parameters in π(p, x):

$$y^t = \nabla_p \pi(p^t, x^t) + u^t; \quad t = 1, \dots, T \tag{249}$$

where ut is a vector of errors. If in addition, S is a convex set and it can be assumed
that the production unit is optimizing with respect to its vector of “fixed” inputs in
each period, where it faces the “fixed” input price vector wt in period t, then the
following N equations for each period can be added to Eq. (249) as additional estimating equations:

$$w^t = \nabla_x \pi(p^t, x^t) + v^t; \quad t = 1, \dots, T \tag{250}$$

where vt is a vector of errors.⁹⁴ We will look at some specific functional forms
for π(p, x) and their econometric estimating equations in the final sections of this
chapter.

The Comparative Statics Properties of Net Supply and Fixed Input Demand Functions

From Theorem 9 above, we know that the firm’s variable profit function π(p, x)
is convex and linearly homogeneous in the components of the vector of variable
commodity prices p for each fixed input vector x. Thus, if π(p, x) is twice
continuously differentiable with respect to the components of p at some point (p, x),
then using Hotelling’s Lemma, we can prove the following counterpart to Theorem
7 for the cost function.

Theorem 13 Hotelling ([61], 597), Hicks ([60], 321), Samuelson ([79], 10),
Diewert ([22], 142–146): Suppose the variable profit function π(p, x) is linearly
homogeneous and convex in p and in addition is twice continuously differentiable
with respect to the components of p at some point, (p,x). Then the system of variable
profit maximizing net supply functions, y(p, x) ≡ [y1 (p, x), . . . , yM (p, x)]T , exists at
this point and these net supply functions are once continuously differentiable. Form
the M by M matrix of net supply derivatives with respect to variable commodity
prices, B ≡ [∂ym (p, x)/∂pk ], which has mk element equal to ∂ym (p, x)/∂pk . Then
the matrix B has the following properties95 :

94 If in addition, the technology set S is subject to constant returns to scale and the data reflect this
fact by “adding up” (so that ptT yt = wtT xt for t = 1, . . . , T), then the error vectors ut and vt in Eqs.
(249) and (250) cannot be statistically independent. Hence, under these circumstances, one of the
M + N equations in (249) and (250) must be dropped from the system of estimating equations.
95 These are the Hotelling ([61], 549) and Hicks ([60], 321) symmetry restrictions on supply

functions.
$$B = B^T \text{ so that } \partial y_m(p, x)/\partial p_k = \partial y_k(p, x)/\partial p_m \text{ for all } m \ne k; \tag{251}$$

$$B \text{ is positive semidefinite and} \tag{252}$$

$$Bp = 0_M. \tag{253}$$

Proof Hotelling’s Lemma implies that the firm’s system of variable profit maximizing
net supply functions, y(p, x) ≡ [y1(p, x), . . . , yM(p, x)]T, exists and is equal to

$$y(p, x) = \nabla_p \pi(p, x). \tag{254}$$

Differentiating both sides of Eq. (254) with respect to the components of p gives
us

$$B \equiv [\partial y_m(p, x)/\partial p_k] = \nabla^2_{pp} \pi(p, x). \tag{255}$$

Property (251) follows from Young’s Theorem in calculus. Property (252)
follows from Eq. (255) and the fact that π(p, x) is convex and twice differentiable
in p and hence the matrix of second order partial derivatives ∇²ppπ(p, x) must be
positive semidefinite. Finally, property (253) follows from the fact that the profit
function is linearly homogeneous in p and hence, using Part 2 of Euler’s Theorem
on homogeneous functions, (253) holds.
Note that property (252) implies the following properties on the net supply
functions:

$$\partial y_m(p, x)/\partial p_m \ge 0 \text{ for } m = 1, \dots, M. \tag{256}$$

Property (256) means that output supply curves cannot be downward sloping.
However, if variable commodity m is an input, then ym(p, x) is negative. If we define
the positive input demand function as

$$d_m(p, x) \equiv -y_m(p, x) \ge 0, \tag{257}$$

then the restriction (256) translates into ∂dm(p, x)/∂pm ≤ 0, which means that
variable input demand curves cannot be upward sloping.
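Properties (251), (252), and (253) can be illustrated with a finite-difference Hessian. The sketch below assumes the hypothetical variable profit function π(p, x) = p1² x/(4 p2) (one output, one variable input, one fixed input), which is not taken from the chapter. Positive semidefiniteness is only spot checked through a few quadratic forms zT B z.

```python
def variable_profit(p, x):
    """Assumed closed form pi(p, x) = p1^2 * x / (4 * p2) with one fixed input x."""
    return p[0] ** 2 * x / (4.0 * p[1])

def price_hessian(p, x, h=1e-4):
    """Finite-difference estimate of B, the Hessian of pi with respect to prices."""
    size = len(p)
    hessian = [[0.0] * size for _ in range(size)]
    for m in range(size):
        for k in range(size):
            def shifted(dm, dk):
                q = list(p)
                q[m] += dm * h
                q[k] += dk * h
                return variable_profit(q, x)
            hessian[m][k] = (shifted(1, 1) - shifted(1, -1)
                             - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)
    return hessian

p, x = [3.0, 2.0], 5.0
B = price_hessian(p, x)

# Property (253): B p should be (numerically) the zero vector.
Bp = [sum(B[m][k] * p[k] for k in range(2)) for m in range(2)]

# Property (252): quadratic forms z' B z should be nonnegative for any z.
test_vectors = ([1.0, 0.0], [0.0, 1.0], [1.0, -1.0], [2.0, 3.0])
quadratic_forms = [
    sum(z[m] * B[m][k] * z[k] for m in range(2) for k in range(2))
    for z in test_vectors
]

print("B  =", B)
print("Bp =", Bp)
print("z'Bz =", quadratic_forms)
```

In this two commodity example the off-diagonal element of B is negative, so the output and the variable input are substitutes in the sense used in the Problems at the end of this section.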
Obviously, if the technology set is a convex cone, then the firm’s competitive
fixed input price functions (or inverse demand functions), w(p, x) ≡ ∇ x π(p, x),
will satisfy properties analogous to the properties of cost minimizing input demand
functions in Theorem 7.

Theorem 14 Samuelson ([79], 10), Diewert ([22], 144–146): Suppose that the
production unit’s technology set S is regular. Define the variable profit function
π(p, x) by Eq. (213). Suppose that π(p, x) is twice continuously differentiable with
respect to the components of x at some point (p, x) where p ≫ 0M and x ≥ 0N. Then
the system of input price functions, w(p, x) ≡ [w1(p, x), . . . , wN(p, x)]T, exists at
this point⁹⁶ and these input price functions are once continuously differentiable.
Form the N by N matrix of input price derivatives with respect to the “fixed” inputs,
C ≡ [∂wi(p, x)/∂xk], which has ik element equal to ∂wi(p, x)/∂xk. Then the matrix
C has the following properties:

$$C = C^T \text{ so that } \partial w_i(p, x)/\partial x_k = \partial w_k(p, x)/\partial x_i \text{ for all } i \ne k; \tag{258}$$

$$C \text{ is negative semidefinite and} \tag{259}$$

$$Cx = 0_N. \tag{260}$$

Proof Using Samuelson’s Lemma, the firm’s system of fixed input price functions,
w(p, x) ≡ [w1(p, x), . . . , wN(p, x)]T, exists and is equal to

$$w(p, x) \equiv \nabla_x \pi(p, x). \tag{261}$$

Differentiating both sides of Eq. (261) with respect to the components of x gives
us

$$C \equiv [\partial w_i(p, x)/\partial x_k] = \nabla^2_{xx} \pi(p, x). \tag{262}$$

Now property (258) follows from Young’s Theorem in calculus. Property (259)
follows from Eq. (262) and the fact that π(p, x) is concave in x.⁹⁷ Finally, property
(260) follows from the fact that the profit function is linearly homogeneous in x⁹⁸
and hence, using Part 2 of Euler’s Theorem on homogeneous functions, (260) holds.
Note that property (259) implies the following properties on the fixed input price
functions:

$$\partial w_n(p, x)/\partial x_n \le 0; \quad n = 1, \dots, N. \tag{263}$$

Property (263) means that the inverse fixed input demand curves cannot be
upward sloping.
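The restrictions (258), (259), (260), and (263) on the matrix C can be illustrated in the same spirit. The sketch below assumes the hypothetical variable profit function π(p, x) = p1² (x1 x2)^(1/2)/(4 p2), whose gradient in x gives the closed-form input price functions coded in `input_prices`; the function names and numbers are illustrative assumptions, not part of the chapter.

```python
import math

def input_prices(p, x):
    """Assumed shadow prices w(p, x) = grad_x pi for pi = p1^2 * sqrt(x1 * x2) / (4 * p2)."""
    scale = p[0] ** 2 / (8.0 * p[1])
    return [scale * math.sqrt(x[1] / x[0]), scale * math.sqrt(x[0] / x[1])]

def jacobian_in_x(p, x, h=1e-6):
    """Central finite-difference estimate of C with C[i][k] = d w_i / d x_k."""
    n = len(x)
    columns = []
    for k in range(n):
        up, down = list(x), list(x)
        up[k] += h
        down[k] -= h
        w_up, w_down = input_prices(p, up), input_prices(p, down)
        columns.append([(w_up[i] - w_down[i]) / (2.0 * h) for i in range(n)])
    # columns[k][i] holds d w_i / d x_k, so transpose to obtain C[i][k].
    return [[columns[k][i] for k in range(n)] for i in range(n)]

p, x = [3.0, 2.0], [4.0, 9.0]
C = jacobian_in_x(p, x)
Cx = [sum(C[i][k] * x[k] for k in range(2)) for i in range(2)]

print("C  =", C)   # symmetric, with nonpositive diagonal elements as in (263)
print("Cx =", Cx)  # approximately the zero vector, as in (260)
```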

96 The assumption that S is regular implies that S has the free disposal property in fixed inputs,
property (226), which implies by part (d) of Theorem 11 that π(p, x) is nondecreasing in x and this
in turn implies that w(p, x) ≡ ∇xπ(p, x) is nonnegative.
97 The assumption that S is regular implies that S is a convex set and this in turn implies that π(p, x)

is concave in x. Concavity in x plus our differentiability assumption implies that ∇ 2 xx π(p, x) is

negative semidefinite.
98 The assumption that S is regular implies that S is a cone and this in turn implies that π(p, x) is

linearly homogeneous in x.
If the firm’s production possibilities set S is regular and if the corresponding
variable profit function π(p, x) is twice continuously differentiable with respect to
all of its variables, then there will be additional restrictions on the derivatives of the
variable net output supply functions y(p, x) = ∇pπ(p, x) and on the derivatives of
the fixed input price functions w(p, x) = ∇xπ(p, x). Define the M by N matrix of
derivatives of the net output supply functions y(p, x) with respect to the components
of the vector of fixed inputs x as follows:

$$D \equiv [\partial y_m(p, x)/\partial x_n] = \nabla^2_{px} \pi(p, x); \quad m = 1, \dots, M;\; n = 1, \dots, N, \tag{264}$$

where the equalities in Eq. (264) follow by differentiating both sides of the
Hotelling’s Lemma relations, y(p, x) = ∇pπ(p, x), with respect to the components
of x. Similarly, define the N by M matrix of derivatives of the fixed input price
functions w(p, x) with respect to the components of the vector of variable commodity
prices p as follows:

$$E \equiv [\partial w_n(p, x)/\partial p_m] = \nabla^2_{xp} \pi(p, x); \quad n = 1, \dots, N;\; m = 1, \dots, M, \tag{265}$$

where the equalities in Eq. (265) follow by differentiating both sides of the
Samuelson’s Lemma relations, w(p, x) = ∇xπ(p, x), with respect to the components
of p.

Theorem 15 Samuelson ([79], 10), Diewert ([22], 144–146): Suppose that the
production unit’s technology set S is regular. Define the variable profit function
π(p, x) by Eq. (213). Suppose that π(p, x) is twice continuously differentiable with
respect to the components of p and x at some point (p, x) where p ≫ 0M and x ≥ 0N and
define the matrices of derivatives D and E by Eqs. (264) and (265), respectively.
Then these matrices have the following properties:

$$D = E^T \text{ so that } \partial y_m(p, x)/\partial x_n = \partial w_n(p, x)/\partial p_m \text{ for } m = 1, \dots, M \text{ and } n = 1, \dots, N; \tag{266}$$

$$w(p, x) = Ep \ge 0_N; \tag{267}$$

$$y(p, x) = Dx. \tag{268}$$

Proof The symmetry restrictions (266) follow from definitions (264) and (265) and
Young’s Theorem in calculus.
Since π(p, x) is linearly homogeneous in the components of p, we have

$$\pi(\lambda p, x) = \lambda \pi(p, x) \text{ for all } \lambda > 0. \tag{269}$$

Partially differentiate both sides of Eq. (269) with respect to xn and we obtain:

$$\partial \pi(\lambda p, x)/\partial x_n = \lambda\, \partial \pi(p, x)/\partial x_n \text{ for all } \lambda > 0 \text{ and } n = 1, \dots, N. \tag{270}$$

But Eq. (270) implies that the functions wn(p, x) ≡ ∂π(p, x)/∂xn are homogeneous
of degree one in p. Hence, we can apply Part 1 of Euler’s Theorem on
homogeneous functions to these functions wn(p, x) and conclude that

$$w_n(p, x) = \sum_{m=1}^{M} [\partial w_n(p, x)/\partial p_m]\, p_m; \quad n = 1, \dots, N. \tag{271}$$

But Eqs. (271) are equivalent to the equations in (267). The inequality in (267)
follows from w(p, x) = ∇xπ(p, x) ≥ 0N, which in turn follows from the fact that
regularity of S implies that π(p, x) is nondecreasing in the components of x.
Since S is regular, part (c) of Theorem 11 implies that π(p, x) is linearly
homogeneous in x, so that

$$\pi(p, \lambda x) = \lambda \pi(p, x) \text{ for all } \lambda > 0. \tag{272}$$

Partially differentiate both sides of Eq. (272) with respect to pm and we obtain:

$$\partial \pi(p, \lambda x)/\partial p_m = \lambda\, \partial \pi(p, x)/\partial p_m \text{ for all } \lambda > 0 \text{ and } m = 1, \dots, M. \tag{273}$$

But Eq. (273) implies that the functions ym(p, x) ≡ ∂π(p, x)/∂pm are homogeneous
of degree one in x. Hence, we can apply Part 1 of Euler’s Theorem on
homogeneous functions to these functions ym(p, x) and conclude that

$$y_m(p, x) = \sum_{n=1}^{N} [\partial y_m(p, x)/\partial x_n]\, x_n; \quad m = 1, \dots, M. \tag{274}$$

But Eqs. (274) are equivalent to Eq. (268).
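The cross restrictions (266), (267), and (268) can also be illustrated numerically. The sketch below again assumes the hypothetical variable profit function π(p, x) = p1² (x1 x2)^(1/2)/(4 p2), which is linearly homogeneous in p and in x separately; all first and second derivatives are approximated by nested central differences, so the comparisons hold only up to a small numerical error.

```python
import math

def variable_profit(p, x):
    """Assumed closed form: pi(p, x) = p1^2 * sqrt(x1 * x2) / (4 * p2)."""
    return p[0] ** 2 * math.sqrt(x[0] * x[1]) / (4.0 * p[1])

def partial(f, p, x, which, index, h=1e-4):
    """Central difference of f(p, x) in p[index] (which=0) or x[index] (which=1)."""
    up = [list(p), list(x)]
    down = [list(p), list(x)]
    up[which][index] += h
    down[which][index] -= h
    return (f(up[0], up[1]) - f(down[0], down[1])) / (2.0 * h)

def net_supply(m):
    """y_m(p, x) = d pi / d p_m (Hotelling's Lemma)."""
    return lambda p, x: partial(variable_profit, p, x, 0, m)

def input_price(n):
    """w_n(p, x) = d pi / d x_n (Samuelson's Lemma)."""
    return lambda p, x: partial(variable_profit, p, x, 1, n)

p, x = [3.0, 2.0], [4.0, 9.0]
M, N = len(p), len(x)

y = [net_supply(m)(p, x) for m in range(M)]
w = [input_price(n)(p, x) for n in range(N)]
D = [[partial(net_supply(m), p, x, 1, n) for n in range(N)] for m in range(M)]
E = [[partial(input_price(n), p, x, 0, m) for m in range(M)] for n in range(N)]

Dx = [sum(D[m][n] * x[n] for n in range(N)) for m in range(M)]
Ep = [sum(E[n][m] * p[m] for m in range(M)) for n in range(N)]

print("y  =", y, " Dx =", Dx)   # property (268)
print("w  =", w, " Ep =", Ep)   # property (267)
print("D  =", D)
print("E' =", [[E[n][m] for n in range(N)] for m in range(M)])  # property (266)
```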

Following up on the pioneering work of Samuelson [79], Diewert and Woodland
([46], 383–390) developed additional comparative statics properties for a consoli-
dated production sector consisting of a finite number of constant returns to scale
production units. For additional applications of the National Product Function to the
theory of international trade, see Kohli [65, 67], Dixit and Norman [48], Woodland
[87] and Feenstra [50].

Problems
24. Under the hypotheses of Theorem 15, show that y(p, x) and w(p, x) satisfy the
following equation:
(i) pT y(p, x) = xT w(p, x).

25. Let S be a technology set that satisfies the minimal regularity assumptions and
let π(p, x) be the corresponding differentiable variable profit function defined
by Eq. (213). Variable commodities m and k (where m ≠ k) are said to be
substitutes if (i) below holds, unrelated if (ii) below holds and complements if
(iii) below holds:
(i) ∂ym(p, x)/∂pk < 0;
(ii) ∂ym(p, x)/∂pk = 0;
(iii) ∂ym(p, x)/∂pk > 0.
(a) If the number of variable commodities M = 2, then show that the two
variable commodities cannot be complements.
(b) If M = 2 and the two variable commodities are unrelated, then show
that:
(iv) ∂y1(p, x)/∂p1 = ∂y2(p, x)/∂p2 = 0.
(c) If M = 3, then show that at most one pair of variable commodities can
be complements.⁹⁹
26. Let S be a regular technology set and let π(p, x) be the corresponding
twice continuously differentiable variable profit function defined by Eq. (213).
Variable commodity m and fixed input n are said to be normal if (i) below
holds, unrelated if (ii) below holds and inferior if (iii) below holds (we assume
p ≫ 0M and x ≫ 0N):
(i) ∂ym(p, x)/∂xn = ∂wn(p, x)/∂pm > 0;
(ii) ∂ym(p, x)/∂xn = ∂wn(p, x)/∂pm = 0;
(iii) ∂ym(p, x)/∂xn = ∂wn(p, x)/∂pm < 0.

99 This type of argument (that substitutability tends to be more predominant than complementarity)

is again due to Hicks ([60], 322–323) but we have not followed his terminology exactly.

(a) If wn (p, x) > 0, then there exists at least one variable commodity m
such that commodity m and fixed input n are normal.
(b) If wn (p, x) ≥ 0, then there exists at least one variable commodity
m such that commodity m and fixed input n are either normal or
unrelated.
(c) If ym (p, x) > 0, then there exists at least one fixed input n such that
commodity m and fixed input n are normal.
(d) If ym (p, x) < 0, then there exists at least one fixed input n such that
commodity m and fixed input n are inferior.
In the following three sections, we will look at the properties of some specific
functional forms for a variable profit function. We will assume that these profit
functions are dual to a regular technology.

The Translog Variable Profit Function

Assume that the log of the variable profit function for a regular technology, ln π(p, x),
has the following translog functional form¹⁰⁰:

$$\begin{aligned}
\ln \pi(p, x) \equiv {} & a_0 + \sum_{m=1}^{M} a_m \ln p_m + (1/2) \sum_{m=1}^{M} \sum_{k=1}^{M} a_{mk} \ln p_m \ln p_k \\
& + \sum_{n=1}^{N} b_n \ln x_n + (1/2) \sum_{n=1}^{N} \sum_{i=1}^{N} b_{ni} \ln x_n \ln x_i \\
& + \sum_{m=1}^{M} \sum_{n=1}^{N} c_{mn} \ln p_m \ln x_n.
\end{aligned} \tag{275}$$

The coefficients must satisfy the following restrictions in order for π(p, x) to be
linearly homogeneous in the components of p as well as the components of x:¹⁰¹

$$\sum_{m=1}^{M} a_m = 1; \tag{276}$$

$$\sum_{n=1}^{N} b_n = 1; \tag{277}$$

100 This functional form was suggested by Diewert ([22], 139) as a generalization of the translog

functional form introduced by Christensen et al. [14]. Diewert ([22], 139) indicated that this
functional form was flexible for regular technologies. For applications of this functional form to
international trade theory, see Kohli [65, 67]. For applications to index number theory and the
measurement of productivity, see Caves et al. [11, 12], Diewert and Morrison [41], Kohli [66],
Feenstra et al. [51] and Inklaar and Diewert [64].
101 There are additional restrictions on the parameters which are necessary to ensure that π(p, x) is

convex in p and concave in x.

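$$a_{mk} = a_{km} \text{ for all } k, m; \tag{278}$$

$$b_{ni} = b_{in} \text{ for all } n, i; \tag{279}$$

$$\sum_{k=1}^{M} a_{mk} = 0 \text{ for } m = 1, \dots, M; \tag{280}$$

$$\sum_{i=1}^{N} b_{ni} = 0 \text{ for } n = 1, \dots, N; \tag{281}$$

$$\sum_{n=1}^{N} c_{mn} = 0 \text{ for } m = 1, \dots, M; \tag{282}$$

$$\sum_{m=1}^{M} c_{mn} = 0 \text{ for } n = 1, \dots, N. \tag{283}$$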

If some of the variable outputs are actually inputs, then the domain of definition
of p and x needs to be restricted to p and x such that π(p, x) > 0, since we cannot
take the logarithm of a nonpositive number. The proof that the translog profit
function defined by Eqs. (275), (276), (277), (278), (279), (280), (281), (282), and
(283) is linearly homogeneous in p follows our earlier proof that the translog unit
cost function c(p) defined in section “The Translog Functional Form” was linearly
homogeneous in p. The proof that π(p, λx) = λπ(p, x) for all λ > 0 follows in an
analogous manner.
Note that using Hotelling’s Lemma, we have ∂ln π(p, x)/∂ln pm = [pm/π(p,
x)]∂π(p, x)/∂pm = [pm/π(p, x)]ym(p, x) ≡ sm(p, x) where ym(p, x) is the profit
maximizing conditional net supply function for net output m and sm(p, x) is the
share of net output m in total variable profits. Thus, differentiating the logarithm
of π(p, x) defined by Eq. (275) with respect to the logarithm of pm leads to the
following system of net variable output share equations:

$$s_m(p, x) = a_m + \sum_{k=1}^{M} a_{mk} \ln p_k + \sum_{n=1}^{N} c_{mn} \ln x_n; \quad m = 1, \dots, M. \tag{284}$$
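The translog definition (275), the restrictions (276) to (283) and the share equations (284) can be sketched in a few lines of code. All parameter values below are hypothetical and were chosen only so that the restrictions hold for M = 2 and N = 2 (the second variable commodity is treated as an input, so its share is negative); nothing here has been estimated from data.

```python
import math

# Hypothetical parameters satisfying restrictions (276)-(283) for M = 2, N = 2.
A0 = 0.2
A_LIN = [1.5, -0.5]                        # sums to 1, restriction (276)
A_QUAD = [[0.1, -0.1], [-0.1, 0.1]]        # symmetric, rows sum to 0
B_LIN = [0.3, 0.7]                         # sums to 1, restriction (277)
B_QUAD = [[-0.05, 0.05], [0.05, -0.05]]    # symmetric, rows sum to 0
C_MIX = [[0.02, -0.02], [-0.02, 0.02]]     # rows and columns sum to 0

def log_profit(p, x):
    """Translog ln pi(p, x) as in definition (275)."""
    lp = [math.log(v) for v in p]
    lx = [math.log(v) for v in x]
    M, N = len(p), len(x)
    value = A0 + sum(A_LIN[m] * lp[m] for m in range(M))
    value += 0.5 * sum(A_QUAD[m][k] * lp[m] * lp[k] for m in range(M) for k in range(M))
    value += sum(B_LIN[n] * lx[n] for n in range(N))
    value += 0.5 * sum(B_QUAD[n][i] * lx[n] * lx[i] for n in range(N) for i in range(N))
    value += sum(C_MIX[m][n] * lp[m] * lx[n] for m in range(M) for n in range(N))
    return value

def net_output_shares(p, x):
    """Net variable output shares s_m(p, x) as in Eq. (284)."""
    lp = [math.log(v) for v in p]
    lx = [math.log(v) for v in x]
    return [
        A_LIN[m]
        + sum(A_QUAD[m][k] * lp[k] for k in range(len(p)))
        + sum(C_MIX[m][n] * lx[n] for n in range(len(x)))
        for m in range(len(p))
    ]

p, x = [3.0, 2.0], [4.0, 9.0]
shares = net_output_shares(p, x)
print("shares:", shares, "sum:", sum(shares))

# Linear homogeneity in p: doubling all prices should add ln 2 to ln pi.
doubled = [2.0 * v for v in p]
print(log_profit(doubled, x) - log_profit(p, x), math.log(2.0))
```

The shares sum to one because of restrictions (276), (278), (280), and (283), which is why one share equation is redundant in estimation.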

Thus, if we have data on the net outputs for period t, yt, the corresponding net
output prices pt ≫ 0M and fixed inputs used in period t, xt ≫ 0N, by a production
unit for t = 1, . . . , T, then we can form observed variable profits for period t,
πt ≡ ptT yt > 0, and the period t net variable output shares smt ≡ pmt ymt/πt for
m = 1, . . . , M and t = 1, . . . , T. A set of econometric estimating equations is the
following very simple system of equations:

$$\begin{aligned}
\ln \pi^t = {} & a_0 + \sum_{m=1}^{M} a_m \ln p_m^t + (1/2) \sum_{m=1}^{M} \sum_{k=1}^{M} a_{mk} \ln p_m^t \ln p_k^t \\
& + \sum_{n=1}^{N} b_n \ln x_n^t + (1/2) \sum_{n=1}^{N} \sum_{j=1}^{N} b_{nj} \ln x_n^t \ln x_j^t \\
& + \sum_{m=1}^{M} \sum_{n=1}^{N} c_{mn} \ln p_m^t \ln x_n^t + e_0^t; \quad t = 1, \dots, T;
\end{aligned} \tag{285}$$

$$s_m^t = a_m + \sum_{k=1}^{M} a_{mk} \ln p_k^t + \sum_{n=1}^{N} c_{mn} \ln x_n^t + e_m^t; \quad m = 1, \dots, M;\; t = 1, \dots, T \tag{286}$$

where the emt are error terms with 0 means for m = 0, 1, . . . , M and t = 1, . . . , T.
Note that these equations are linear in the unknown parameters. The cross equation
symmetry restrictions, amk = akm for 1 ≤ m < k ≤ M, could be imposed on the above
equations or these conditions could be tested.¹⁰²
Suppose now that we have reason to believe that the producer is optimizing
with respect to the vector of “fixed” inputs x. Using Samuelson’s Lemma, we
have ∂ln π(p, x)/∂ln xn = [xn/π(p, x)]∂π(p, x)/∂xn = [xn/π(p, x)]wn(p, x) ≡ Sn(p, x)
where wn(p, x) is the profit maximizing inverse demand function for “fixed” input
n and Sn(p, x) is the share of “fixed” input n in total “fixed” input cost.¹⁰³ Thus,
differentiating the logarithm of π(p, x) defined by Eq. (275) with respect to the
logarithm of xn leads to the following system of input cost share equations:

$$S_n(p, x) = b_n + \sum_{j=1}^{N} b_{nj} \ln x_j + \sum_{m=1}^{M} c_{mn} \ln p_m; \quad n = 1, \dots, N. \tag{287}$$

Thus if we have data on “fixed” input prices for the T periods in addition to the
already mentioned data, then we can form the observed cost shares for “fixed” input
n in period t, Snt ≡ wnt xnt/πt for n = 1, . . . , N and t = 1, . . . , T. Thus, we can add the following set
of estimating equations to the estimating equations defined by Eqs. (285) and (286):

$$S_n^t = b_n + \sum_{j=1}^{N} b_{nj} \ln x_j^t + \sum_{m=1}^{M} c_{mn} \ln p_m^t + u_n^t; \quad n = 1, \dots, N;\; t = 1, \dots, T \tag{288}$$

where the unt are error terms with 0 means.¹⁰⁴

102 Since the shares smt sum to 1 over m for each t, the Eqs. (286) cannot have independent error
terms and hence one of the M equations in (286) should be dropped when estimating the unknown
parameters.
103 From Problem 24, we know that pT y(p, x) = xT w(p, x) = π(p, x) since we have assumed that
the underlying production possibilities set is regular.
104 Since the shares Snt sum to one over n for each t, one of the N estimating equations in (288)
should be dropped. Typically, the cross equation parameter restrictions defined by Eqs. (278),
(279), (280), (281), (282), and (283) would be imposed but in principle, they could be tested.

The simplicity of the estimating equations given by Eqs. (285), (286), and (288)
means that it is relatively easy to estimate the translog variable profit function.
However, there are two disadvantages associated with the translog functional
form:

• Not all of the parameters of the translog π(p, x) can be estimated unless Eqs. (285)
are included in the estimation procedure. But every parameter is included in each
of these equations and this can lead to singularity problems if N + M is large and
T is small.¹⁰⁵
• It is not possible to impose the convexity in p and concavity in x properties on the
translog functional form without destroying the flexibility of the functional form.

Thus, in the following two sections, we look at functional forms for a regular
technology variable profit function where we can impose the correct concavity and
convexity properties.
Another problem with the translog π(p, x) defined by Eq. (275) is that this
functional form does not allow for technical progress. This problem can be readily
remedied: simply add the following terms to the right hand side of definition (275):

$$\alpha_0 t + \sum_{m=1}^{M} \alpha_m t \ln p_m + \sum_{n=1}^{N} \beta_n t \ln x_n$$

where t is a scalar time variable and the new parameters αm and βn satisfy the
additional restrictions

$$\sum_{m=1}^{M} \alpha_m = 0 \text{ and } \sum_{n=1}^{N} \beta_n = 0.$$

These restrictions¹⁰⁶ will ensure that the resulting translog π(p, x)
is linearly homogeneous in p and x separately.¹⁰⁷

The Normalized Quadratic Variable Profit Function

At this point, it will be useful to list the equations that a twice continuously
differentiable functional form for a variable profit function π(p, x) that is dual to a
regular technology must satisfy in order to be a flexible functional form at the point
p* ≫ 0M and x* ≫ 0N. Let π*(p, x) be an arbitrary variable profit function that
is dual to a regular technology set and suppose that π*(p, x) is twice continuously
differentiable at (p*, x*). For π(p, x) to be a flexible functional form, it must have
enough free parameters so that it can provide a second order approximation to
π*(p, x) at the point (p*, x*). Thus, the candidate function π must have enough
parameters so that it can satisfy the following 1 + M + N + (M + N)² equations:

105 This problem is perhaps not too serious; if Eqs. (286) and (288) are estimated, then all of the

parameters that appear in definition (275) can be identified except the parameter a0 . This parameter
could be estimated in a second stage where Eq. (285) is used to solve for a0 in terms of lnπt and
the fitted values from the first stage for the right hand side of Eq. (285) omitting the term a0 .
106 This extension of the translog GDP function to allow for technical progress is due to

Kohli [65] in a model with four outputs and two inputs. Feenstra ([50], 423) noted these restrictions
in the general M outputs and N inputs model.
107 More general specification of technical progress can be made using linear or quadratic splines

in the time variable.


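$$ \pi(p^*, x^*) = \pi^*(p^*, x^*); \quad \text{1 equation}; \tag{289} $$

$$ \nabla_p \pi(p^*, x^*) = \nabla_p \pi^*(p^*, x^*); \quad M \text{ equations}; \tag{290} $$

$$ \nabla_x \pi(p^*, x^*) = \nabla_x \pi^*(p^*, x^*); \quad N \text{ equations}; \tag{291} $$

$$ \nabla^2_{pp} \pi(p^*, x^*) = \nabla^2_{pp} \pi^*(p^*, x^*); \quad M^2 \text{ equations}; \tag{292} $$

$$ \nabla^2_{xx} \pi(p^*, x^*) = \nabla^2_{xx} \pi^*(p^*, x^*); \quad N^2 \text{ equations}; \tag{293} $$

$$ \nabla^2_{px} \pi(p^*, x^*) = \nabla^2_{px} \pi^*(p^*, x^*); \quad MN \text{ equations}; \tag{294} $$

$$ \nabla^2_{xp} \pi(p^*, x^*) = \nabla^2_{xp} \pi^*(p^*, x^*); \quad NM \text{ equations}. \tag{295} $$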

However, because π(p, x) and π∗ (p, x) are both linearly homogeneous in p and x
separately and both are assumed to be twice continuously differentiable at (p* ,x* ),
not all of the equations in (289), (290), (291), (292), (293), (294), and (295) are
independent. Equation (289) is implied by the first part of Euler’s Theorem on
homogeneous functions and Eq. (290) or (291). Thus, Eq. (289) can be dropped
from the list of equations that π(p, x) must satisfy since it will be satisfied if
either Eq. (290) or (291) is satisfied. Since $p^{*T} \nabla_p \pi(p^*, x^*) = x^{*T} \nabla_x \pi(p^*, x^*)$ and
$p^{*T} \nabla_p \pi^*(p^*, x^*) = x^{*T} \nabla_x \pi^*(p^*, x^*)$, any one of the M + N equations in (290) and
(291) can also be dropped. Young’s Theorem from calculus and the second part of
Euler’s Theorem on homogeneous functions imply that if the M(M − 1)/2 equations
in the upper triangle of Eq. (292) hold, then all M2 equations in (292) will hold.
Similarly, if the N(N − 1)/2 equations in the upper triangle of Eq. (293) hold, then all
N2 equations in (293) will hold. Young’s Theorem implies that if the MN equations
in (294) hold, then the NM equations in (295) will also hold. Recall Eqs. (254),
(261), (267), and (268) in section “The Comparative Statics Properties of Input
Demand Functions.” These equations imply that ∇ 2 px π(p∗ , x∗ )x∗ = ∇ p π(p∗ , x∗ )
and p∗ T ∇ 2 px π(p∗ , x∗ ) = ∇ x π(p∗ , x∗ )T . The same equations will apply to the
corresponding partial derivatives of π∗ (p∗ , x∗ ). Thus we need only satisfy Eq. (294)
for the (M − 1) by (N − 1) submatrix of the M by N matrix $\nabla^2_{px} \pi(p^*, x^*)$ that drops
the last row and column of this matrix. Thus, for π(p, x) to be flexible at (p* ,x* ), we
need to satisfy M + N − 1 of the equations in (290) and (291), the M(M − 1)/2
equations in the upper triangle of Eq. (292), the N(N − 1)/2 in the upper triangle
of Eq. (293) and the (M − 1)(N − 1) equations in (294) that drop the equations for
one row and one column of the matrix equation involving M rows and N columns.
Thus, a flexible functional form for a regular variable profit function must have at
least M + N − 1 + M(M − 1)/2 + N(N − 1)/2 + (M − 1)(N − 1) independent
parameters.
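The parameter count above can be tabulated with a short Python sketch. The function name `min_flexible_parameters` is a hypothetical helper introduced here for illustration only; it simply adds up the four groups of independent equations listed in the preceding paragraph.

```python
def min_flexible_parameters(M: int, N: int) -> int:
    """Minimum number of independent parameters needed for a flexible
    variable profit function with M net outputs and N fixed inputs."""
    first_order = M + N - 1            # Eqs. (290) and (291), one equation dropped
    pp_block = M * (M - 1) // 2        # upper triangle of Eq. (292)
    xx_block = N * (N - 1) // 2        # upper triangle of Eq. (293)
    px_block = (M - 1) * (N - 1)       # Eq. (294) minus one row and one column
    return first_order + pp_block + xx_block + px_block


if __name__ == "__main__":
    for M, N in [(1, 1), (2, 2), (3, 2), (4, 2)]:
        print(M, N, min_flexible_parameters(M, N))
```

The count grows quadratically in M and N, which is one reason parsimonious (semiflexible) variants are of practical interest.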

Recall that the normalized quadratic unit cost function was defined by Eqs. (146),
(147), (148), and (149) in section “The Normalized Quadratic Unit Cost Function”
above. We will adapt this functional form to our present context. Define the function
r(p) for p > 0M as follows:

$$ r(p) \equiv b^T p + \tfrac{1}{2}\, p^T B p / \beta^T p \tag{296} $$

where β > 0M is a predetermined vector, b is a parameter vector and B is a symmetric
positive semidefinite parameter matrix that satisfies:

$$ B p^* = 0_M. \tag{297} $$

Use the normalized quadratic functional form to define the following function
f(x) for x > 0N :

$$ f(x) \equiv a^T x + \tfrac{1}{2}\, x^T A x / \alpha^T x \tag{298} $$

where α > 0N is a predetermined vector, a is a parameter vector and A is a symmetric
negative semidefinite parameter matrix that satisfies:

$$ A x^* = 0_N. \tag{299} $$

Normalize α and β so that they satisfy the following restrictions:

$$ \alpha^T x^* = 1; \quad \beta^T p^* = 1. \tag{300} $$

Use the f(x) and r(p) defined above in the following definition for the normalized
quadratic variable profit function 108 :

π (p, x) ≡ r (p) f (x) + pT Cx (301)

where C is an M by N parameter matrix. Using the restrictions defined by Eqs.
(297), (299), and (300), the level and first and second order partial derivatives of the
π(p, x) defined by Eq. (301) evaluated at (p∗, x∗) are set equal to the corresponding
level and derivatives of an exogenously given π∗(p∗, x∗):

$$ \pi^*(p^*, x^*) = a^T x^* \, b^T p^* + p^{*T} C x^*; \tag{302} $$

$$ \nabla_p \pi^*(p^*, x^*) = b\, a^T x^* + C x^*; \tag{303} $$

108 An alternative functional form for a variable profit function that used the r(p) and f(x) defined
by Eqs. (296) and (298) as building blocks appeared in Diewert and Fox [36]. Note that net outputs
y and fixed inputs x are separable if C = OM × N , an M by N matrix of 0’s. See Blackorby et al.
[6] on separability concepts.

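$$ \nabla_x \pi^*(p^*, x^*) = a\, b^T p^* + C^T p^*; \tag{304} $$

$$ \nabla^2_{pp} \pi^*(p^*, x^*) = B\, a^T x^*; \tag{305} $$

$$ \nabla^2_{xx} \pi^*(p^*, x^*) = A\, b^T p^*; \tag{306} $$

$$ \nabla^2_{px} \pi^*(p^*, x^*) = b a^T + C. \tag{307} $$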

We show that there is an a, b, A, B and C solution to the above equations.

Tentatively assume that:

aT x∗ = 1; Cx∗ = 0M and p∗T C = 0N T . (308)

Substitute Eq. (308) into (303) and solve for b = ∇ p π∗ (p∗ , x∗ ). This
implies that p∗ T b = π∗ (p∗ , x∗ ). Substitute Eq. (308) into (304) and solve for
a = ∇ x π∗ (p∗ , x∗ )/bT p∗ = ∇ x π∗ (p∗ , x∗ )/π∗ (p∗ , x∗ ). Since x∗ T ∇ x π∗ (p∗ , x∗ ) =
π∗ (p∗ , x∗ ), it can be seen that aT x∗ = 1. Substitute this equation into (305)
and solve for B = ∇ 2 pp π∗ (p∗ , x∗ ), a symmetric positive semidefinite matrix
that satisfies Bp∗ = 0M using the linear homogeneity of π∗ (p, x) in p. Using
p∗ T b = π∗ (p∗ , x∗ ), Eq. (306) implies that A = [π∗ (p∗ , x∗ )]−1 ∇ 2 xx π∗ (p∗ , x∗ ).
Thus, A is a negative semidefinite matrix that satisfies Ax∗ = 0N . Finally, define
$C \equiv \nabla^2_{px} \pi^*(p^*, x^*) - b a^T = \nabla^2_{px} \pi^*(p^*, x^*) - [\pi^*(p^*, x^*)]^{-1} \nabla_p \pi^*(p^*, x^*) \nabla_x \pi^*(p^*, x^*)^T$.
Using $\nabla^2_{px} \pi^*(p^*, x^*) x^* = \nabla_p \pi^*(p^*, x^*)$, $p^{*T} \nabla^2_{px} \pi^*(p^*, x^*) = \nabla_x \pi^*(p^*, x^*)^T$
and $x^{*T} \nabla_x \pi^*(p^*, x^*) = \pi^*(p^*, x^*) = p^{*T} \nabla_p \pi^*(p^*, x^*)$, it can be seen that $C x^* = 0_M$
and $p^{*T} C = 0_N^T$. Thus, the normalized quadratic profit function defined by Eq.
(301) is a flexible functional form.
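As a rough numerical check of the derivative formulas (303) and (304), the following Python sketch evaluates the normalized quadratic profit function (301) at a reference point and compares finite-difference gradients with the closed forms. All numerical values (p_star, x_star, a, b, A, B, C, alpha, beta) are made-up illustrative choices that satisfy restrictions (297), (299), and (300); the helper names `nq_profit` and `central_gradient` are assumptions of this sketch, not part of the original text.

```python
import numpy as np


def nq_profit(p, x, a, b, A, B, C, alpha, beta):
    """Normalized quadratic variable profit function, Eq. (301)."""
    r = b @ p + 0.5 * (p @ B @ p) / (beta @ p)    # Eq. (296)
    f = a @ x + 0.5 * (x @ A @ x) / (alpha @ x)   # Eq. (298)
    return r * f + p @ C @ x


def central_gradient(fun, z, h=1e-6):
    """Central finite-difference gradient of a scalar function."""
    grad = np.zeros_like(z)
    for i in range(z.size):
        step = np.zeros_like(z)
        step[i] = h
        grad[i] = (fun(z + step) - fun(z - step)) / (2.0 * h)
    return grad


# Illustrative (hypothetical) reference point and parameters, M = N = 2.
p_star = np.array([1.0, 2.0])
x_star = np.array([2.0, 1.0])
beta = np.array([0.5, 0.25])      # beta' p_star = 1, Eq. (300)
alpha = np.array([0.25, 0.5])     # alpha' x_star = 1, Eq. (300)
b = np.array([1.0, 0.5])
a = np.array([0.3, 0.4])
v = np.array([2.0, -1.0])         # orthogonal to p_star
B = np.outer(v, v)                # positive semidefinite, B p_star = 0
u = np.array([1.0, -2.0])         # orthogonal to x_star
A = -np.outer(u, u)               # negative semidefinite, A x_star = 0
C = np.array([[0.1, -0.2], [0.05, 0.3]])

args = (a, b, A, B, C, alpha, beta)
grad_p_numeric = central_gradient(lambda p: nq_profit(p, x_star, *args), p_star)
grad_x_numeric = central_gradient(lambda x: nq_profit(p_star, x, *args), x_star)

grad_p_formula = b * (a @ x_star) + C @ x_star      # Eq. (303)
grad_x_formula = a * (b @ p_star) + C.T @ p_star    # Eq. (304)

if __name__ == "__main__":
    print(grad_p_numeric, grad_p_formula)
    print(grad_x_numeric, grad_x_formula)
```

The check does not impose the tentative restrictions (308) on C, since formulas (303) and (304) hold for any M by N matrix C.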
Given data on net outputs yt , “fixed” inputs xt and their prices pt and wt for
t = 1, . . . , T, econometric estimating equations for a production unit whose
technology is (approximately) dual to the profit function π(p, x) defined by Eq. (301)
can be obtained by using Hotelling’s Lemma and Samuelson’s Lemma to generate
the following nonlinear estimating equations for t = 1, . . . , T:
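$$ y_t = \left[ b + (\beta^T p_t)^{-1} B p_t - \tfrac{1}{2} (\beta^T p_t)^{-2} (p_t^T B p_t)\, \beta \right] \left[ a^T x_t + \tfrac{1}{2}\, x_t^T A x_t / \alpha^T x_t \right] + C x_t + u_t; \tag{309} $$

$$ w_t = \left[ a + (\alpha^T x_t)^{-1} A x_t - \tfrac{1}{2} (\alpha^T x_t)^{-2} (x_t^T A x_t)\, \alpha \right] \left[ b^T p_t + \tfrac{1}{2}\, p_t^T B p_t / \beta^T p_t \right] + C^T p_t + v_t \tag{310} $$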

where the error vectors ut and vt have zero means. The disadvantage of these esti-
mating equations is that they are more complicated than the rather straightforward
comparable translog estimating equations that were obtained in the previous section.
However, this functional form has the advantage that the appropriate curvature
conditions can be imposed; that is, the matrices A and B that appear in the above
equations can be replaced by A = − A∗ A∗ T and B = B∗ B∗ T where A* and B* are
lower triangular matrices with A∗ T x∗ = 0N and B∗ T p∗ = 0M .109 These substitutions
will not destroy the flexibility of the resulting functional form. Semiflexible versions
of the A and B matrices can also be estimated in order to conserve on the number
of parameters in the model. Finally, technical progress can easily be accommodated
in the above model: simply add the time trend vector a∗t to the a vector and add
the time trend vector b∗t to the vector b in the estimating Eqs. (309) and (310) for
period t.110
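One way to build a matrix of the form B = B∗B∗ᵀ with B∗ lower triangular and B∗ᵀp∗ = 0M is sketched below in Python. The construction is an assumption of this sketch rather than a procedure taken from the text: the strictly lower triangular entries are treated as free parameters and each diagonal entry is solved for so that the corresponding column of B∗ is orthogonal to p∗. The helper name `constrained_lower_triangular` and the numerical values are hypothetical.

```python
import numpy as np


def constrained_lower_triangular(free, p_star):
    """Lower triangular L with L.T @ p_star = 0.

    `free` holds the M(M-1)/2 strictly-lower-triangular entries (row-major);
    each diagonal entry is chosen so that its column is orthogonal to p_star.
    """
    M = p_star.size
    L = np.zeros((M, M))
    L[np.tril_indices(M, k=-1)] = free
    for j in range(M):
        # Column j has nonzeros only in rows j..M-1; solve for L[j, j].
        L[j, j] = -(L[j + 1:, j] @ p_star[j + 1:]) / p_star[j]
    return L


p_star = np.array([1.0, 2.0, 0.5])          # illustrative reference prices
free = np.array([0.3, -0.7, 1.1])           # M(M-1)/2 = 3 free parameters
B_chol = constrained_lower_triangular(free, p_star)
B = B_chol @ B_chol.T                       # positive semidefinite, B p_star = 0
A_like = -B                                 # same device with a minus sign gives A

if __name__ == "__main__":
    print(np.round(B @ p_star, 12))
    print(np.linalg.eigvalsh(B))
```

The number of free entries, M(M − 1)/2, matches the number of independent equations in the upper triangle of Eq. (292), which is consistent with the claim that the substitution does not destroy flexibility.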

The KBF Variable Profit Function

In section “The Konüs Byushgens Fisher Unit Cost Function” of this chapter, we
studied the KBF unit cost function. This functional form can be used as a basic
building block to obtain a flexible functional form for a variable profit function that
is dual to a regular production possibilities set.111 Thus define the function r(p) for
p > 0M as follows:
$$ r(p) \equiv \left( p^T \left[ b b^T + B \right] p \right)^{1/2} \tag{311} $$

where b is a parameter vector and B is a symmetric positive semidefinite parameter
matrix that satisfies:

$$ B p^* = 0_M. \tag{312} $$

We also use the KBF functional form to define the following function of f(x) for
x > 0N :
$$ f(x) \equiv \left( x^T \left[ a a^T + A \right] x \right)^{1/2} \tag{313} $$

109 After making these substitutions for A and B, the resulting π(p, x) will satisfy the convexity and

concavity conditions at the point (p,x) provided that p > 0M , x > 0N , r(p) > 0 and f(x) > 0.
110 For identification, add the constraint a∗ T 1N = 0. Of course, to achieve additional flexibility,
linear or quadratic splines in time could be added to the a and b vectors; see Fox [54] or Fox and
Grafton [55] for empirical examples using the normalized quadratic functional form and piece-wise
linear splines to model technical progress.
111 The advantage of using this functional form over using the normalized quadratic as a basic

building block is that when using the KBF functional form, we do not have to specify the exogenous
vectors α and β which appeared in the normalized quadratic functional form.

where a is a parameter vector and A is a symmetric negative semidefinite parameter
matrix that satisfies:

$$ A x^* = 0_N. \tag{314} $$

We use the f(x) and r(p) defined above in the following definition for the KBF
variable profit function112 :

π (p, x) ≡ r (p) f (x) + pT Cx (315)

where C is an M by N parameter matrix. Using the restrictions defined by Eqs.

(312) and (314), the level and first and second order partial derivatives of the π(p, x)
defined by Eq. (315) evaluated at (p* ,x* ) are set equal to the corresponding level
and derivatives of an exogenously given π∗ (p∗ , x∗ )

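$$ \pi^*(p^*, x^*) = a^T x^* \, b^T p^* + p^{*T} C x^*; \tag{316} $$

$$ \nabla_p \pi^*(p^*, x^*) = b\, a^T x^* + C x^*; \tag{317} $$

$$ \nabla_x \pi^*(p^*, x^*) = a\, b^T p^* + C^T p^*; \tag{318} $$

$$ \nabla^2_{pp} \pi^*(p^*, x^*) = B\, a^T x^* / b^T p^*; \tag{319} $$

$$ \nabla^2_{xx} \pi^*(p^*, x^*) = A\, b^T p^* / a^T x^*; \tag{320} $$

$$ \nabla^2_{px} \pi^*(p^*, x^*) = b a^T + C. \tag{321} $$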

It can be seen that these equations are identical to Eqs. (302), (303), (304),
(305), (306), and (307) in the previous section except that Eqs. (319) and (320)
are slightly different from the corresponding Eqs. (305) and (306). It turns out that
this difference does not affect the proof that there is an a, b, A, B, and C solution
to the above equations. Thus, it is straightforward to establish that the KBF variable
profit function is a flexible functional form.
Given data on net outputs yt , “fixed” inputs xt and their prices pt and wt for t = 1,
. . . , T, econometric estimating equations for a production unit whose technology
is (approximately) dual to the profit function π(p, x) defined by Eq. (315) can be
obtained by using Hotelling’s Lemma and Samuelson’s Lemma to generate the
following nonlinear estimating equations for t = 1, . . . , T:

112 Net outputs y will be separable from inputs x if C = OM × N .


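$$ y_t = \left[ b b^T p_t + B p_t \right] \left( p_t^T \left[ b b^T + B \right] p_t \right)^{-1/2} \left( x_t^T \left[ a a^T + A \right] x_t \right)^{1/2} + C x_t + u_t; \tag{322} $$

$$ w_t = \left[ a a^T x_t + A x_t \right] \left( x_t^T \left[ a a^T + A \right] x_t \right)^{-1/2} \left( p_t^T \left[ b b^T + B \right] p_t \right)^{1/2} + C^T p_t + v_t \tag{323} $$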

where the error vectors ut and vt have zero means. Again, the disadvantage of
these estimating equations is that they are a lot more complicated than the rather
straightforward comparable translog estimating equations that were obtained for the
translog functional form. However, as was the case with the normalized quadratic
profit function, this functional form has the advantage that the appropriate curvature
conditions can be imposed without destroying the flexibility of the functional form;
that is, the matrices A and B that appear in the above equations can be replaced by
A = − A∗ A∗ T and B = B∗ B∗ T where A* and B* are lower triangular matrices with
A∗ T x∗ = 0N and B∗ T p∗ = 0M .113 As usual, semiflexible versions of the A and B
matrices can also be estimated in order to conserve on the number of parameters
in the model. And again as usual, flexible forms of technical progress can easily
be accommodated in the above model by adding the time trend vector a∗t to the a
vector and the time trend vector b∗t to the vector b in the estimating Eqs. (322)
and (323) for period t.114
The KBF functional form developed in this section is very similar to the
normalized quadratic functional form that was developed in the previous section.
However, the KBF functional form has the advantage that it is not necessary to
specify an α and β vector a priori as was the case for the normalized quadratic profit
function. The KBF functional form seems to be the most promising parsimonious
functional form that has been developed up to the present.
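The following Python sketch evaluates the KBF variable profit function (315) together with the systematic parts of the estimating equations (322) and (323), and checks the Euler identities pᵀy = π and xᵀw = π that follow from linear homogeneity in p and in x. The function names and all numerical values are hypothetical choices made for illustration; A is scaled so that xᵀ[aaᵀ + A]x stays positive at the evaluation point.

```python
import numpy as np


def kbf_profit(p, x, a, b, A, B, C):
    """KBF variable profit function, Eq. (315)."""
    r = np.sqrt(p @ (np.outer(b, b) + B) @ p)   # Eq. (311)
    f = np.sqrt(x @ (np.outer(a, a) + A) @ x)   # Eq. (313)
    return r * f + p @ C @ x


def kbf_net_supply(p, x, a, b, A, B, C):
    """Systematic part of Eq. (322): gradient of the profit function in p."""
    Qb = np.outer(b, b) + B
    r = np.sqrt(p @ Qb @ p)
    f = np.sqrt(x @ (np.outer(a, a) + A) @ x)
    return (Qb @ p) / r * f + C @ x


def kbf_input_prices(p, x, a, b, A, B, C):
    """Systematic part of Eq. (323): gradient of the profit function in x."""
    Qa = np.outer(a, a) + A
    r = np.sqrt(p @ (np.outer(b, b) + B) @ p)
    f = np.sqrt(x @ Qa @ x)
    return (Qa @ x) / f * r + C.T @ p


# Illustrative (hypothetical) parameters with M = N = 2.
b = np.array([1.0, 0.5])
a = np.array([0.3, 0.4])
B = np.outer([2.0, -1.0], [2.0, -1.0])          # positive semidefinite
A = -0.1 * np.outer([1.0, -2.0], [1.0, -2.0])   # negative semidefinite
C = np.array([[0.1, -0.2], [0.05, 0.3]])

p_t = np.array([1.3, 0.7])
x_t = np.array([1.5, 1.2])
profit = kbf_profit(p_t, x_t, a, b, A, B, C)
y_hat = kbf_net_supply(p_t, x_t, a, b, A, B, C)
w_hat = kbf_input_prices(p_t, x_t, a, b, A, B, C)

if __name__ == "__main__":
    print(profit, p_t @ y_hat, x_t @ w_hat)
```

Unlike the normalized quadratic sketch, no predetermined α and β vectors appear among the arguments, which reflects the advantage noted in the text.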

Joint Cost Functions

Instead of maximizing profits with respect to variable inputs and outputs, in this
section we minimize cost subject to producing a specified vector of outputs. Thus
consider a production unit that produces the output vector y ≥ 0M using an input
vector x ≥ 0N . The set of feasible output and input vectors (y,x) is a set S which
satisfies the following minimal regularity condition115 :

113 After making these substitutions for A and B, the resulting π(p, x) will satisfy the convexity and

concavity conditions at the point (p,x) provided that p > 0M , x > 0N , r(p) > 0 and f(x) > 0.
114 Again, in order to identify all of the parameters, add the constraint a∗ T 1N = 0. To achieve
additional flexibility, linear or quadratic splines in time could be added to the a and b vectors.
115 Note that in this section, y is a vector of outputs rather than a vector of net outputs as in previous

sections.

S is a closed subset of M + N space such that for every output vector y ≥ 0M , there
exists an input vector x ≥ 0N such that (y, x) ∈ S. (324)

Let w ≫ 0N be a strictly positive vector of input prices and let y ≥ 0M be an
output vector. Define the producer’s joint cost function C(y,w) as follows:

$$ C(y, w) \equiv \min_x \{ w^T x : (y, x) \in S \}. \tag{325} $$

The regularity conditions (324) on S and the assumption that w ≫ 0N imply that
the minimum in Eq. (325) will exist.
It is frequently useful to assume that S satisfies free disposability of inputs,
property (326) below, and/or free disposability of outputs, property (327)
below.

$$ y \ge 0_M,\ 0_N \le x^1 < x^2 \text{ and } (y, x^1) \in S \text{ implies } (y, x^2) \in S. \tag{326} $$

$$ 0_M \le y^1 < y^2 \text{ and } (y^2, x) \in S \text{ implies } (y^1, x) \in S. \tag{327} $$

Problems
27. Theorem 15: Suppose S satisfies conditions (324) and define C(y,w) by
Eq. (325) for y ≥ 0M and w ≫ 0N . Show that C(y,w) has the following
properties:
(i) C(y,w) is a nonnegative function; that is, C(y, w) ≥ 0 for y ≥ 0M and
w ≫ 0N .
(ii) C(y,w) is positively linearly homogeneous in w for each fixed y; that is,
C(y, λw) = λC(y, w) for all λ > 0, w ≫ 0N and y ≥ 0M .
(iii) C(y,w) is nondecreasing in w for each fixed y; that is, $C(y, w^1) \le C(y, w^2)$
for y ≥ 0M and $w^2 > w^1 \gg 0_N$.
(iv) C(y,w) is a concave function of w for each fixed y; that is,
$C(y, \lambda w^1 + (1 - \lambda) w^2) \ge \lambda C(y, w^1) + (1 - \lambda) C(y, w^2)$ for y ≥ 0M , $w^1 \gg 0_N$, $w^2 \gg 0_N$
and 0 < λ < 1.
(v) C(y,w) is a continuous function of w for each fixed y ≥ 0M .
Hint: Adapt the proof of Theorem 1 in section “Cost Functions: The One
Output Case” above.
28. Continuation of 27: Suppose S satisfies the free disposability of outputs
property (327) in addition to the minimal regularity conditions (324). Show
that C(y,w) is nondecreasing in y for fixed w; that is, show that w ≫ 0N ,
$0_M \le y^1 < y^2$ and $(y^2, x) \in S$ implies $C(y^1, w) \le C(y^2, w)$. Hint: Use a feasibility
argument.

Thus, the joint cost function C(y,w) has much the same properties with respect to
input prices as the single output cost function that was studied in section “Cost
Functions: The One Output Case” above. In particular, C(y,w) must be linearly
homogeneous and concave in w for fixed y.
Under what conditions can a knowledge of the joint cost function, C(y,w),
be sufficient to determine the underlying technology set S? We now address
this question. Suppose S satisfies the minimal regularity conditions (324). For
each y ≥ 0M , define the set of inputs that can produce at least y, L(y), as
follows:

L (y) ≡ {x : (y, x) ∈ S} . (328)

If we are given the family of upper level sets, L(y) for every y ≥ 0M , then S can
be recovered using S = {(y, x) : y ≥ 0M and x ∈ L(y)}. Thus, the above question can
be reduced to the equivalent question: under what assumptions on L(y) can the joint
cost function be used to determine L(y) for each y ≥ 0M ? We can use the method
explained in section “The Duality Between Cost and Production Functions” above
to answer this question.
Let y ≥ 0M and w ≫ 0N . Use the given joint cost function C(y,w) to define the
following half space of inputs:

$$ M(y, w) \equiv \{ x : w^T x \ge C(y, w) \}. \tag{329} $$

The above half space must contain the level set L(y). Thus, L(y) must be
contained in the following set, which is the intersection of all of the supporting
halfspaces to L(y):

$$ M(y) \equiv \bigcap_{w \gg 0_N} M(y, w). \tag{330} $$

Since each of the sets in the intersection, M(y,w), is a convex set, then M(y) is
also a convex set. Since L(y) is a subset of each M(y,w), it must be the case that
L(y) is also a subset of M(y); that is, we have L(y) ⊂ M(y). As was the case in
section “The Duality Between Cost and Production Functions,” in order to ensure
that M(y) = L(y), we need to add the following two conditions on the family of
level sets L(y):

$$ \text{For each } y \ge 0_M,\ L(y) \text{ satisfies free disposability of inputs; i.e., } x^1 \in L(y),\ x^2 \ge x^1 \text{ implies } x^2 \in L(y). \tag{331} $$

For each y ≥ 0M , L (y) is a convex set. (332)


Condition (331) on the family of input level sets L(y) is equivalent to condition
(326) on the production possibilities set S. As in section “The Duality Between Cost
and Production Functions,” assumptions (331) and (332) rule out backward bending
and nonconvex input production possibilities sets L(y).
As was the case in section “The Duality Between Cost and Production Func-
tions,” if the producer is a price taker in input markets, then it is not necessary
to assume properties (331) and (332) when estimating a joint cost function: a cost
minimizing producer will never choose an input vector that belongs to a nonconvex
or backward bending upper level set L(y). Thus, an estimated joint cost function
can be used to form the upper level sets M(y) and these sets can provide an adequate
approximation to the true L(y) for most purposes.
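The outer approximation M(y) in Eq. (330) can be illustrated numerically. The Python sketch below assumes a hypothetical one-output, two-input technology y ≤ (x₁x₂)^{1/2}, whose joint cost function is C(y, w) = 2y(w₁w₂)^{1/2}, and tests membership in M(y) by checking the half space inequality (329) over a finite grid of normalized input prices. The technology, the grid, and the function names are assumptions of this sketch; a finite grid can only approximate the intersection over all w ≫ 0N.

```python
import numpy as np


def joint_cost(y, w):
    """Cost function for the hypothetical technology y <= sqrt(x1 * x2)."""
    return 2.0 * y * np.sqrt(w[0] * w[1])


def in_outer_approximation(x, y, price_grid, tol=1e-12):
    """True if x lies in every half space M(y, w) of Eq. (329) on the grid."""
    return all(w @ x >= joint_cost(y, w) - tol for w in price_grid)


# Input prices normalized to sum to one (C is linearly homogeneous in w,
# so the normalization loses no half spaces).
price_grid = [np.array([t, 1.0 - t]) for t in np.linspace(0.01, 0.99, 99)]

y = 1.0
feasible_x = np.array([4.0, 0.25])     # sqrt(4 * 0.25) = 1 >= y
boundary_x = np.array([1.0, 1.0])      # sqrt(1 * 1) = 1 >= y
infeasible_x = np.array([0.5, 1.0])    # sqrt(0.5) < y

if __name__ == "__main__":
    for x in (feasible_x, boundary_x, infeasible_x):
        print(x, in_outer_approximation(x, y, price_grid))
```

Because the assumed level sets are convex and satisfy free disposability of inputs, M(y) coincides with L(y) here, so the grid test classifies the three points in the same way as the production function does.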
If the joint cost function C(y,w) satisfies the conditions listed in Theorem 15 and
is differentiable with respect to input prices w, then we can show that Shephard’s
Lemma still holds; that is, the producer’s system of cost minimizing input demand
functions is equal to x(y, w) ≡ ∇w C(y, w) for y ≥ 0M and w ≫ 0N .116
If the production possibilities set S has additional properties, then we can deduce
that the joint cost function C(y,w) has additional properties. Two familiar additional
properties for S are the following ones:

$$ S \text{ is a convex set; i.e., } (y^1, x^1) \in S,\ (y^2, x^2) \in S \text{ and } 0 < \lambda < 1 \text{ implies } (\lambda y^1 + (1 - \lambda) y^2,\ \lambda x^1 + (1 - \lambda) x^2) \in S. \tag{333} $$

S is a cone; i.e., if (y, x) ∈ S and λ > 0, then (λy, λx) ∈ S. (334)

The cone assumption (334) means that production is subject to constant returns to
scale. The convexity assumption rules out technologies that are subject to increasing
returns to scale. Some of the implications of these assumptions are listed in the
following problems.

Problems
29. Assume S satisfies Eq. (324) and the convexity assumption (333). (i) Show that
L(y) ≡ {x : (y, x) ∈ S} is a convex set for each y ≥ 0M . (ii) Show that C(y,w)
defined by Eq. (325) is a convex function of y for fixed w ≫ 0N . Hint: Look at
the proof of part (b) of Theorem 11.
30. Assume S satisfies Eq. (324) and the output free disposability assumption (327).
Show that C(y,w) defined by Eq. (325) is a nondecreasing function of y for fixed
w ≫ 0N . Hint: Use a feasibility argument.

116 The proof of Theorem 5 in section “The Derivative Property of the Cost Function” can be
adapted to prove this result.

31. Assume S satisfies Eq. (324) and the cone assumption (334). Show that C(y,w)
defined by Eq. (325) is a linearly homogeneous function of y for fixed w ≫ 0N .
Hint: Modify the proof of part (c) of Theorem 11.

The above problems show that if S satisfies the output free disposal assumption
(327) and the convexity and constant returns to scale assumptions (333) and (334),
then the corresponding joint cost function C(y,w) will be a nondecreasing, linearly
homogeneous, and convex function of y for fixed w.
Assume that C(y,w) is differentiable with respect to y and w. Shephard’s Lemma
enables us to interpret the vector of first order partial derivatives of the joint cost
function with respect to the input price vector w, ∇ w C(y, w), as the producer’s
vector of input demand functions, x(y,w). The vector of first order partial derivatives
of the joint cost function with respect to y, ∇ y C(y, w), is obviously the vector of
marginal costs for each output. However, if S satisfies the convexity assumption
(333), then p = ∇ y C(y, w) can be interpreted as the producer’s system of inverse
supply functions; that is, if the producer faced the output price vector p and the
input price vector w, then an output vector y which satisfied the system of equations
p = ∇ y C(y, w) and the x = ∇ w C(y, w) would be a solution to the following
producer’s profit maximization problem:

$$ \max_{y,x} \{ p^T y - w^T x : (y, x) \in S \}. \tag{335} $$

Theorem 16 Suppose the technology set S satisfies the minimal regularity assumptions
(324) plus (326) (free disposability of inputs), (327) (free disposability of
outputs) and (333) (convexity). Let y∗ ≥ 0M and w∗ ≫ 0N . Suppose that C(y,w)
is differentiable at (y∗, w∗). Define x∗ ≡ ∇w C(y∗, w∗) and p∗ ≡ ∇y C(y∗, w∗). Then
(y∗, x∗) is a solution to the following profit maximization problem:

$$ \max_{y,x} \{ p^{*T} y - w^{*T} x : (y, x) \in S \}. \tag{336} $$

Proof The free disposability assumptions imply that p∗ ≥ 0M and x∗ ≥ 0N . The
convexity assumption on S implies that C(y,w∗) is a convex function of y. Thus, the
function f(y) ≡ C(y, w∗ ) − p∗ T y is also a convex function of y for all y ≥ 0M .
Note that ∇f(y∗ ) = ∇ y C(y∗ , w∗ ) − p∗ = 0M using the definition of p* . Since
f(y) is a convex function and differentiable at y = y∗ , its first order Taylor series
approximation around this point will lie below (or be coincident with) f(y). Thus,
we have for all y ≥ 0M :
$$ f(y) \ge f(y^*) + \nabla f(y^*)^T (y - y^*) = f(y^*) \tag{337} $$

where the equality follows since ∇f(y∗) = 0M . Thus, f(y) attains a global minimum
at y* . Using the definition of f, we see that y* is a solution to the following

minimization problem:

$$ \min_y \{ C(y, w^*) - p^{*T} y : y \ge 0_M \} = C(y^*, w^*) - p^{*T} y^* = w^{*T} \nabla_w C(y^*, w^*) - p^{*T} y^* = w^{*T} x^* - p^{*T} y^* \tag{338} $$

where the second equality follows from the linear homogeneity of C(y,w) in w and
the third equality follows from the definition of x∗ ≡ ∇ w C(y∗ , w∗ ).
It can be verified that solving the profit maximization problem defined by Eq.
(336) is equivalent to solving the following (net) cost minimization problem:

$$ \begin{aligned} \min_{y,x} \{ w^{*T} x - p^{*T} y : (y, x) \in S \} &= \min_y \{ \min_x \{ w^{*T} x : (y, x) \in S \} - p^{*T} y \} \\ &= \min_y \{ C(y, w^*) - p^{*T} y : y \ge 0_M \} && \text{using (325)} \\ &= w^{*T} x^* - p^{*T} y^* && \text{using (338).} \end{aligned} \tag{339} $$

The above result is a joint cost function counterpart to Samuelson’s Lemma,
Theorem 12 above. It says that if producers take prices as given on both input and
output markets and the technology set is convex, then the producer’s system of
inverse supply functions, p(y,w), is equal to ∇y C(y,w), the producer’s system of
marginal cost functions.
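Theorem 16 can be illustrated with a small numerical example. The Python sketch below assumes a hypothetical convex technology with one output and two inputs, y ≤ (x₁x₂)^{1/4}, for which the joint cost function is C(y, w) = 2(w₁w₂)^{1/2}y². It sets p∗ equal to marginal cost at a chosen y∗ and then checks on a grid that y∗ maximizes p∗y − C(y, w∗). The technology, the numbers, and the function names are all assumptions made for this illustration.

```python
import numpy as np


def joint_cost(y, w):
    """Cost function for the hypothetical technology y <= (x1 * x2) ** 0.25."""
    return 2.0 * np.sqrt(w[0] * w[1]) * y ** 2


def marginal_cost(y, w):
    """Derivative of the cost function with respect to output y."""
    return 4.0 * np.sqrt(w[0] * w[1]) * y


def input_demand(y, w):
    """Shephard's Lemma: gradient of the cost function with respect to w."""
    return y ** 2 * np.array([np.sqrt(w[1] / w[0]), np.sqrt(w[0] / w[1])])


w_star = np.array([1.0, 4.0])
y_star = 1.5
p_star = marginal_cost(y_star, w_star)     # inverse supply price
x_star = input_demand(y_star, w_star)      # cost minimizing inputs

# Profit at the candidate solution versus profit at other output levels.
profit_star = p_star * y_star - w_star @ x_star
y_grid = np.linspace(0.0, 4.0, 401)
profit_grid = p_star * y_grid - joint_cost(y_grid, w_star)
best_y = y_grid[np.argmax(profit_grid)]

if __name__ == "__main__":
    print(p_star, x_star, profit_star, best_y)
```

The grid search only varies y because, for each y, the cost function already gives the minimized input cost, which is the decomposition used in Eq. (339).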
If the production possibilities set S satisfies all of the regularity conditions on S
that are listed in this section (free disposability of inputs and outputs, convexity, and
constant returns to scale), we say that S is a regular production possibilities set.

Problems
32. Suppose S satisfies the minimal regularity conditions (324). Define the cor-
responding joint cost function C(y,w) by Eq. (325). Suppose C(y,w) is twice
continuously differentiable with respect to w at some point y > 0M and
w ≫ 0N . Then the system of cost minimizing input demand functions is
x(y, w) = ∇w C(y, w) and the N by N matrix of demand derivatives with respect
to input prices, $B \equiv [\partial x_n(y, w)/\partial w_i] = \nabla^2_{ww} C(y, w)$ exists. Show that the matrix
B has the following properties:
(i) B = BT .
(ii) B is negative semidefinite.
(iii) Bw = 0N .
Hint: Adapt the proof of Theorem 13.
33. Suppose S is a regular production possibilities set and the corresponding
C(y,w) is twice continuously differentiable at the point y ≫ 0M and w ≫ 0N .
Then the system of inverse supply functions, p(y, w) = ∇ y C(y, w) and the
M by M matrix of partial derivatives with respect to output quantities,

A ≡ [∂pm (y, w)/∂yk ] = ∇ 2 yy C(y, w) exist. Show that the matrix A has the
following properties:
(i) A = AT so that $\partial p_m(y, w)/\partial y_k = \partial p_k(y, w)/\partial y_m$ for all m ≠ k.
(ii) A is positive semidefinite.
(iii) Ay = 0M .
Hint: Adapt the proof of Theorem 14.
34. Suppose S is a regular production possibilities set and the corresponding C(y,w)
is twice continuously differentiable at the point y ≫ 0M and w ≫ 0N . Then
the system of inverse supply functions is p(y, w) = ∇y C(y, w) and the M
by N matrix of partial derivatives of supply prices with respect to input prices,
D ≡ [∂pm (y, w)/∂wn ] = ∇ 2 yw C(y, w) exists. The system of cost minimizing
input demand functions is x(y, w) = ∇ w C(y, w) and the N by M matrix
of partial derivatives of input quantities with respect to output quantities,
E ≡ [∂xn (y, w)/∂ym ] = ∇ 2 wy C(y, w) exists. Show that the matrices D and E
have the following properties:
(i) D = ET ;
(ii) p(y, w) = Dw ≥ 0M ;
(iii) x(y, w) = Ey ≥ 0N .

Hint: Adapt the proof of Theorem 15.

Shephard’s Lemma and Theorem 16 can be used as a convenient method for
obtaining econometric estimating equations for determining the parameters that
characterize a producer’s technology set S. Assuming that S satisfies the minimal
regularity conditions on S, we need only postulate a differentiable functional form
for the producer’s joint cost function, C(y,w), that is linearly homogeneous and
concave in w. Suppose that we have collected data on the input vectors used by the
unit in period t, xt , and the outputs produced in period t, yt , for t = 1, . . . , T time
periods as well as the corresponding input price vectors wt . Then the following NT
equations can be used in order to estimate the unknown parameters in C(y,w):

$$ x_t = \nabla_w C(y_t, w_t) + u_t; \quad t = 1, \ldots, T \tag{340} $$

where ut is a vector of errors. If in addition, S is a convex set and the firm is
maximizing profits facing the fixed output and input price vectors, pt and wt ,
respectively, in period t, then the following MT equations can be added to Eq. (340)
as additional estimating equations:

$$ p_t = \nabla_y C(y_t, w_t) + v_t; \quad t = 1, \ldots, T \tag{341} $$

where vt is a vector of errors.117

Flexible Functional Forms for Joint Cost Functions

Specific functional forms for C(y,w) can be found by adapting the functional forms
explained in sections “The Translog Variable Profit Function,” “The Normalized
Quadratic Variable Profit Function,” and “The KBF Variable Profit Function” above.
Adapting the material in section “The Translog Variable Profit Function,” we could
assume the log of the joint cost function for a regular technology, lnC(y,w), has the
following translog functional form118 :
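$$ \ln C(y, w) \equiv a_0 + \sum_{m=1}^{M} a_m \ln y_m + \tfrac{1}{2} \sum_{m=1}^{M} \sum_{k=1}^{M} a_{mk} \ln y_m \ln y_k + \sum_{n=1}^{N} b_n \ln w_n + \tfrac{1}{2} \sum_{n=1}^{N} \sum_{j=1}^{N} b_{nj} \ln w_n \ln w_j + \sum_{m=1}^{M} \sum_{n=1}^{N} c_{mn} \ln y_m \ln w_n. \tag{342} $$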

The unknown coefficients in Eq. (342) must satisfy the restrictions (276), (277),
(278), (279), (280), (281), (282), and (283) listed in section “The Translog Variable
Profit Function” if S is a regular production possibilities set.
Note that using Shephard’s Lemma, we have $\partial \ln C(y, w)/\partial \ln w_n = [w_n/C(y, w)]\, \partial C(y, w)/\partial w_n = [w_n/C(y, w)]\, x_n(y, w) \equiv S_n(y, w)$
where xn(y,w) is the cost minimizing demand function for input n and Sn(y,w) is the
share of input n in total cost.
Assuming that the producer minimizes cost and S is dual to the translog joint cost
function defined by Eq. (342), then differentiating the logarithm of C(y,w) defined
by Eq. (342) with respect to the logarithm of wn leads to the following system of
input share equations:

$$ S_n(y, w) = b_n + \sum_{j=1}^{N} b_{nj} \ln w_j + \sum_{m=1}^{M} c_{mn} \ln y_m; \quad n = 1, \ldots, N. \tag{343} $$

117 If in addition, the technology set S is subject to constant returns to scale and the data reflect this
fact by satisfying ptT yt = wtT xt for t = 1, . . . , T, then the error vectors ut and vt in Eqs. (340) and
(341) cannot be statistically independent. Hence one of the M + N equations in (340) and (341)
must be dropped from the system of estimating equations.
118 This functional form is due to Burgess [9] who applied it to international trade theory. For

applications of this functional form to index number theory, see Diewert and Morrison [41] and
Diewert and Fox [34].

Equations (342) and (343) can be used as estimating equations if the production
unit is minimizing costs. Note that these equations are linear in the unknown
parameters.119
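Because the share equations (343) are linear in the unknown parameters, each one can be fitted by ordinary least squares. The Python sketch below generates noise-free synthetic data for the first input share with M = N = 2 and recovers the parameters with `numpy.linalg.lstsq`. The parameter values are made up, and the adding-up restrictions used to pick them (the b_n sum to one, the rows of the b_nj matrix and the rows of the c_mn matrix sum to zero) are assumed here to correspond to the restrictions referenced in the text, which are not reproduced in this section.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameter values chosen to satisfy the assumed restrictions.
b_vec = np.array([0.4, 0.6])                       # sums to one
b_mat = np.array([[0.1, -0.1], [-0.1, 0.1]])       # symmetric, rows sum to zero
c_mat = np.array([[0.05, -0.05], [-0.02, 0.02]])   # M by N, rows sum to zero


def input_shares(ln_y, ln_w):
    """Systematic part of the input share equations (343)."""
    return b_vec + b_mat @ ln_w + c_mat.T @ ln_y


# Synthetic (noise-free) sample of log outputs and log input prices.
T = 50
ln_y_sample = rng.normal(size=(T, 2))
ln_w_sample = rng.normal(size=(T, 2))
shares = np.array([input_shares(ln_y_sample[t], ln_w_sample[t]) for t in range(T)])

# Regress the first input share on a constant, ln w, and ln y.
design = np.column_stack([np.ones(T), ln_w_sample, ln_y_sample])
estimate, *_ = np.linalg.lstsq(design, shares[:, 0], rcond=None)
true_values = np.array([b_vec[0], b_mat[0, 0], b_mat[0, 1], c_mat[0, 0], c_mat[1, 0]])

if __name__ == "__main__":
    print(np.round(estimate, 6))
    print(true_values)
```

In applied work the shares sum to one, so one share equation is typically dropped and the cross-equation restrictions are imposed in a system estimator; the single-equation regression above is only meant to show the linearity.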
Suppose that in addition to the assumption that the production unit is minimizing
costs, we assume that the technology set is regular and the producer is maximizing
profits. Using Theorem 16, $\partial \ln C(y, w)/\partial \ln y_m = [y_m/C(y, w)]\, \partial C(y, w)/\partial y_m = [y_m/C(y, w)]\, p_m(y, w) \equiv s_m(y, w)$
where pm(y,w) is the profit maximizing inverse supply function for output m and
sm(y,w) is the share of output m in total profit maximizing revenue (which equals
total cost under constant returns to scale). Assuming that the producer
maximizes profit and S is dual to the translog joint cost function C(y,w) defined by
Eq. (342), then differentiating the logarithm of C(y,w) with respect to the logarithm
of ym leads to the following system of revenue share equations:

$$ s_m(y, w) = a_m + \sum_{k=1}^{M} a_{mk} \ln y_k + \sum_{n=1}^{N} c_{mn} \ln w_n; \quad m = 1, \ldots, M. \tag{344} $$

Equations (342), (343), and (344) can be used as estimating equations if the
production unit is maximizing profits and has a regular translog technology.
The above functional form for the logarithm of joint cost does not allow for
technical progress. To remedy this problem, simply add the following terms to the
right hand side of definition (342): $\alpha_0 t + \sum_{m=1}^{M} \alpha_m t \ln y_m + \sum_{n=1}^{N} \beta_n t \ln w_n$
where t is a scalar time variable and the new parameters $\alpha_m$ and $\beta_n$ satisfy
$\sum_{m=1}^{M} \alpha_m = 0$ and $\sum_{n=1}^{N} \beta_n = 0$.120
A problem with the translog joint cost function is that it is not possible to impose
concavity in w (and convexity in y if the dual S satisfies convexity) over the region
spanned by the sample input prices wt (and the region spanned by the sample output
vectors yt if S is a convex set) without impairing the flexibility of the functional
form. In order to impose these curvature conditions without destroying the flexibility
property, we turn to the functional forms defined in sections “The Normalized
Quadratic Variable Profit Function” and “The KBF Variable Profit Function.”
Define the normalized quadratic joint cost function C(y,w) for y > 0M and w > 0N
as follows:

C (y, w) ≡ g (y) c (w) + yT Ew (345)

where $g(y) \equiv b^T y + \tfrac{1}{2}\, y^T B y / \beta^T y$, β > 0M is a predetermined vector that satisfies
$\beta^T y^* = 1$, b > 0M is a parameter vector, B is a symmetric positive semidefinite
parameter matrix that satisfies $B y^* = 0_M$, $c(w) \equiv a^T w + \tfrac{1}{2}\, w^T A w / \alpha^T w$, α > 0N is
a predetermined vector that satisfies $\alpha^T w^* = 1$, a is a parameter vector that satisfies

119 If we do not impose constant returns to scale and convexity on S, then the parameter restrictions

(277) and (281), (282), and (283) do not have to be imposed. These restrictions should be imposed
if we assume constant returns to scale and convexity.
120 Linear or quadratic spline functions in time can also be added to the estimating equations to

better approximate variable rates of technical progress over time.

3 Duality in Production 161

aT w∗ = 1, A is symmetric negative semidefinite parameter matrix that satisfies

Aw∗ = 0N and E is an M by N parameter matrix.
Define the KBF joint cost function $C(y,w)$ using Eq. (345) where $E$ is
again an $M$ by $N$ parameter matrix. However, redefine $g(y)$ and $c(w)$ as
follows: $g(y) \equiv (y^T[bb^T + B]y)^{1/2}$, where $b > 0_M$ is a parameter vector, $B$
is a symmetric positive semidefinite parameter matrix that satisfies $By^* = 0_M$,
$c(w) \equiv (w^T[aa^T + A]w)^{1/2}$, $a$ is a parameter vector that satisfies $a^T w^* = 1$ and
$A$ is a symmetric negative semidefinite parameter matrix that satisfies $Aw^* = 0_N$.
For both of these joint cost functions, the vector of cost minimizing input demand
functions $x(y,w)$ can be obtained by calculating the vector of first order partial
derivatives, $\nabla_w C(y, w)$. The concavity in input prices property for the joint cost
function can be imposed by setting $A = -A^* A^{*T}$ with $A^*$ lower triangular and
$A^{*T} w^* = 0_N$. In the case where the underlying production possibilities set $S$ is
convex, the vector of profit maximizing output prices $p(y,w)$ that is consistent with
the production of the vector $y$ of outputs can be obtained by calculating the vector of
first order partial derivatives, $\nabla_y C(y, w)$. The convexity property in output quantities
for $C(y,w)$ can be imposed by setting $B = B^* B^{*T}$ with $B^*$ lower triangular and
$B^{*T} y^* = 0_M$.121
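
The curvature reparameterization described above can be sketched numerically. The following is a minimal illustration, not the chapter's estimation code: the function name `curvature_matrix`, the reference vectors and the free parameter values are all hypothetical. It builds a lower triangular factor whose columns are orthogonal to a strictly positive reference vector by solving for each diagonal element, so that the resulting matrices satisfy $Aw^* = 0_N$ and $By^* = 0_M$ by construction.

```python
import numpy as np

def curvature_matrix(free, ref):
    """Return (S, L): L is lower triangular with L.T @ ref = 0, and S = L @ L.T.

    `free` holds the strictly-lower-triangular entries of L (row by row);
    each diagonal entry is solved for so that the column is orthogonal to `ref`.
    Assumes every element of `ref` is nonzero (hypothetical helper).
    """
    ref = np.asarray(ref, dtype=float)
    n = ref.size
    L = np.zeros((n, n))
    L[np.tril_indices(n, k=-1)] = free
    for j in range(n):
        # column j of L must satisfy sum_i L[i, j] * ref[i] = 0
        L[j, j] = -(L[j + 1:, j] @ ref[j + 1:]) / ref[j]
    return L @ L.T, L

# Illustrative reference points and free parameters (assumed values).
w_star = np.array([1.0, 2.0, 0.5])
S_w, A_factor = curvature_matrix([0.3, -0.2, 0.7], w_star)
A = -S_w                      # negative semidefinite: concavity in w

y_star = np.array([1.0, 1.0])
B, B_factor = curvature_matrix([0.4], y_star)   # positive semidefinite: convexity in y

print(np.linalg.eigvalsh(A))  # all eigenvalues should be <= 0 up to rounding
print(np.linalg.eigvalsh(B))  # all eigenvalues should be >= 0 up to rounding
```

In an econometric application the entries of `free` would be the estimated parameters; the curvature conditions then hold for any values the optimizer visits.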
The normalized quadratic and KBF joint cost functions as defined above do not
allow for technical progress. This problem can be remedied by adding the term
$\left(\sum_{m=1}^{M} \gamma_m y_m t\right)\left(\sum_{n=1}^{N} \delta_n w_n t\right)$ to the right hand
side of definitions (345) where the $\gamma_m$ and $\delta_n$ are technical progress parameters
and $t$ is a time trend.122 These additional technical progress terms may not capture
the trends in technical progress in the time series context if the sample period is
long. In this case, the terms $\sum_{m=1}^{M} \gamma_m y_m t$ and $\sum_{n=1}^{N} \delta_n w_n t$ can be
replaced by piece-wise linear spline functions as was done in section “The Use of
Splines for Modeling Technical Progress” above; see Eq. (200).

121 After making these substitutions for $A$ and $B$, the resulting $C(y,w)$ will satisfy the convexity
and concavity conditions at the point $(y,w)$ provided that $y > 0_M$, $w > 0_N$, $g(y) > 0$ and
$c(w) > 0$. The proof of the flexibility of the normalized quadratic and KBF joint cost functions in
the case of a regular technology is entirely analogous to the corresponding proofs for the normalized
quadratic and KBF variable profit functions that were discussed in sections “The Normalized
Quadratic Variable Profit Function” and “The KBF Variable Profit Function.”
122 In order to identify all of these technical progress parameters, we need to impose a normalization
on them such as $\sum_{m=1}^{M} \gamma_m = 1$.

Applications of Joint Cost Functions

In this section, we discuss three areas of research where joint cost functions play
important roles.
Many government outputs are produced in a nonmarket context. The output
quantities can usually be measured but typically, there are no market prices for
the outputs that are produced by many government production units. However,
government producers still have an incentive to minimize costs. If the public sector
production unit is minimizing costs and the technology set can be approximated by
a constant returns to scale production possibilities set S and econometric estimation
of a differentiable dual joint cost function C(y,w) is possible (using just the input
demand functions as estimating equations), then approximate output prices can be
obtained as the vector of marginal costs, p ≡ ∇ y C(y, w). If production is subject
to constant returns to scale, then the resulting output price vector p will have the
property that pT y = C(y, w) = wT x; that is, the resulting value of outputs will
equal the value of inputs.123 This result is useful in the national income accounting
context where government statisticians have to find methods for valuing public
sector outputs. Using marginal cost prices is also useful when economists want to
measure the productivity performance of public sector production units.124
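
The adding-up property $p^T y = C(y,w) = w^T x$ can be checked numerically for the normalized quadratic joint cost function of Eq. (345), which is linearly homogeneous in $y$ and in $w$. The sketch below is only an illustration under assumed parameter values (all numbers, the dictionary `par` and the helper `num_grad` are hypothetical); gradients are approximated by central finite differences rather than derived analytically.

```python
import numpy as np

def g(y, b, B, beta):
    # g(y) = b'y + (1/2) y'By / beta'y
    return b @ y + 0.5 * (y @ B @ y) / (beta @ y)

def c(w, a, A, alpha):
    # c(w) = a'w + (1/2) w'Aw / alpha'w
    return a @ w + 0.5 * (w @ A @ w) / (alpha @ w)

def joint_cost(y, w, par):
    # C(y, w) = g(y) c(w) + y'Ew, as in Eq. (345)
    return (g(y, par["b"], par["B"], par["beta"])
            * c(w, par["a"], par["A"], par["alpha"])
            + y @ par["E"] @ w)

def num_grad(f, v, h=1e-6):
    # central finite-difference gradient (illustrative helper)
    v = np.asarray(v, dtype=float)
    grad = np.zeros_like(v)
    for i in range(v.size):
        step = np.zeros_like(v)
        step[i] = h
        grad[i] = (f(v + step) - f(v - step)) / (2.0 * h)
    return grad

# Assumed parameters with reference points y* = (1, 1) and w* = (1, 1).
par = {
    "b": np.array([1.0, 2.0]),
    "B": 0.5 * np.array([[1.0, -1.0], [-1.0, 1.0]]),    # PSD, B y* = 0
    "beta": np.array([0.5, 0.5]),                       # beta' y* = 1
    "a": np.array([0.4, 0.6]),                          # a' w* = 1
    "A": -0.3 * np.array([[1.0, -1.0], [-1.0, 1.0]]),   # NSD, A w* = 0
    "alpha": np.array([0.5, 0.5]),                      # alpha' w* = 1
    "E": np.array([[0.1, 0.0], [0.0, 0.2]]),
}
y = np.array([2.0, 3.0])
w = np.array([1.5, 0.8])

C_val = joint_cost(y, w, par)
p = num_grad(lambda yy: joint_cost(yy, w, par), y)   # marginal cost prices
x = num_grad(lambda ww: joint_cost(y, ww, par), w)   # input demands (Shephard's lemma)

print(C_val, p @ y, w @ x)   # the three values should agree up to numerical error
```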
A second application for the estimation of joint cost functions is in the context of
the regulation of utilities that deliver electricity, water, and communications services
via networks. Regulators are interested in using marginal costs to aid them in setting
utility prices. Utilities may be forced to sell their outputs at regulated prices that do
not reflect marginal costs but regulated utility firms will still have an incentive to
minimize costs. In this case, joint cost functions can be estimated and the resulting
estimates can be used to measure technical progress as well as the total factor
productivity of the regulated firms.125
A third area where joint cost functions play an important role is in modeling
monopolistic behavior. Typically producers take input prices as fixed and beyond
their control. However, they may have some pricing power over their outputs. Recall
Eq. (245) which defined a producer’s competitive profit maximization problem. A
monopolistic counterpart to this problem is the following problem:

$$\max_{y,x}\Big\{\sum_{m=1}^{M} f_m(y_m)\,y_m - w^T x : (y, x) \in S\Big\}
= \max_{y}\Big\{\sum_{m=1}^{M} f_m(y_m)\,y_m - C(y, w)\Big\} \tag{346}$$

123 In practice, the vector of marginal costs may have to be approximated by average costs of
production, which in turn will usually require many accounting imputations.
124 See Diewert [29–31] on this topic.
125 For examples of the use of joint cost functions in a regulatory context, see Denny et al. [19],

Lawrence and Diewert [70] and Diewert et al. [47].

where $w \gg 0_N$ is a positive input price vector, $y \equiv (y_1, \dots, y_M) \ge 0_M$ is an output
vector, $S$ is the producer’s production possibilities set, $C(y,w)$ is the producer’s joint
cost function defined by Eq. (325) and $p_m = f_m(y_m)$ is the (downward sloping)
inverse demand function for output $m$ that the producer faces for $m = 1, \dots, M$.
If the inverse demand functions $f_m(y_m)$ and the joint cost function $C(y,w)$ are once
differentiable when evaluated at the period $t$ data, then under appropriate regularity
conditions on the $f_m(y_m)$ and $S$, the following equations will be satisfied by a profit
maximizing monopolist using the observed period $t$ data:

$$\hat{p}^t\,(1_M - \mu^t) = \nabla_y C(y^t, w^t); \quad t = 1, \dots, T; \tag{347}$$

$$x^t = \nabla_w C(y^t, w^t); \quad t = 1, \dots, T \tag{348}$$

where $1_M$ is an $M$ dimensional vector of ones, $\mu^t \equiv [\mu_1^t, \dots, \mu_M^t]^T \ge 0_M$ is a
period $t$ markup vector where $\mu_m^t \equiv -[y_m^t/p_m^t][\partial f_m(y_m^t)/\partial y_m]$ is the markup of
price over marginal cost for output $m$ in period $t$, $y^t$ and $x^t$ are the observed quantity
vectors for outputs and inputs in period $t$, $p^t$ and $w^t$ are the corresponding observed
output and input price vectors for period $t = 1, \dots, T$ and $\hat{p}^t$ is an $M$ by $M$ diagonal
matrix with the elements of the vector $p^t$ on the main diagonal. If the markups
are constant over time, given a suitable functional form for the joint cost function
$C(y,w)$, Eqs. (347) and (348) can be used as econometric estimating equations.126
Thus again, joint cost functions play a crucial role in this area of economics.127
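
The step from problem (346) to Eq. (347) can be sketched as follows. This is an illustrative derivation that assumes an interior solution (every $y_m > 0$) and differentiability of the $f_m$ and of $C$; period superscripts are suppressed and $f_m'$ denotes the derivative of the inverse demand function.

```latex
\begin{align*}
\frac{\partial}{\partial y_m}\Big[\sum_{k=1}^{M} f_k(y_k)\,y_k - C(y,w)\Big]
  &= f_m(y_m) + f_m'(y_m)\,y_m - \frac{\partial C(y,w)}{\partial y_m} = 0 \\
\Rightarrow\quad
p_m\Big[1 + \frac{y_m}{p_m}\,f_m'(y_m)\Big]
  &= p_m\,(1 - \mu_m) = \frac{\partial C(y,w)}{\partial y_m},
  \qquad \mu_m \equiv -\frac{y_m}{p_m}\,f_m'(y_m).
\end{align*}
```

Stacking these $M$ conditions gives $\hat{p}\,(1_M - \mu) = \nabla_y C(y,w)$, and the input demand equations follow from differentiating the joint cost function with respect to input prices (Shephard's lemma).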

Problems that Require Additional Research

We conclude this chapter with some comments on three problem areas that have not
been addressed in the above sections.
The first problem area is the difficulty of distinguishing increasing returns to
scale from technical progress if there is general growth of all inputs and outputs for
the production unit that is under consideration. Multicollinearity problems usually
arise in this situation: the two effects typically cannot be reliably determined using
just time series data.
The second problem area is the fact that many inputs cannot be varied in the short
run and thus producers are not necessarily producing outputs and utilizing inputs on

126 If the markups are not constant, then linear (or piece-wise linear) trends in the markups could be

introduced into the model. See Diewert and Fox [33] for an econometric application of this model
and Diewert and Fox [34] for an application of this model to index number theory.
127 If the monopolist provides some goods and services on a competitive basis (i.e., at marginal

cost), then the markup for this commodity can be set equal to zero. Alternatively, this commodity
could be removed from the y vector and be placed with the x inputs, except the quantity would be
indexed with a negative sign in the input demand equations. The resulting input cost would become
input cost less the revenue from the sales of goods and services provided at marginal cost.

the frontiers of their production possibilities sets. For example, suppose a recession
occurs in the economy so that demand for the outputs of production units declines.
Producers can reduce the demand for their variable inputs but they are more or less
stuck with their structure inputs and with other durable capital investments that are
“bolted down.”128 Thus, producers end up being in the interior of their production
possibilities sets.129 When a producer makes an irreversible investment, the total
cost of the investment should not be charged to the period when the investment
was made but this cost should be allocated over the useful life of the investment.
But how exactly should this cost be allocated? This is the fundamental problem
of accounting.130 Note that in addition to structure and network capital inputs, a
successful R&D project is another example of a fixed cost input whose input cost
must be allocated over time in some manner. If there is only a single sunk cost
input (or we aggregate all sunk cost inputs into a single input), then it is possible
to set up an intertemporal profit maximization problem that justifies the purchase
of the fixed input. The price of this fixed asset at a particular point in time is the
discounted net revenue generated by the project over its remaining useful life and if
this information on discounted net revenues can be forecasted, then the initial cost
of the asset can be amortized in a manner that is proportional to the forecasted net
revenues by period.131
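
One simple reading of this amortization rule can be sketched in a few lines. The example below is an assumption-laden illustration rather than the method of the cited papers: it allocates the initial cost across periods in proportion to each period's share of forecasted discounted net revenue, and the function name, the discount rate and the revenue forecasts are all hypothetical.

```python
def amortization_schedule(initial_cost, forecast_net_revenues, discount_rate):
    """Allocate `initial_cost` over periods in proportion to discounted
    forecast net revenues (illustrative assumption, not a definitive rule)."""
    discounted = [
        r / (1.0 + discount_rate) ** (t + 1)
        for t, r in enumerate(forecast_net_revenues)
    ]
    total = sum(discounted)
    if total <= 0:
        raise ValueError("forecast discounted net revenues must be positive")
    return [initial_cost * d / total for d in discounted]

# Hypothetical asset: cost 1000, four periods of declining net revenue, 5% discount rate.
charges = amortization_schedule(1000.0, [400.0, 300.0, 200.0, 100.0], 0.05)
print(charges)       # per-period amortization charges
print(sum(charges))  # the charges exhaust the initial cost
```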
The final problem area that has not been addressed in this survey of the
applications of duality theory in production theory is the new goods problem
and the problem of quality change. Modern economies are subject to tremendous
product churn, and in addition, revolutionary new products are constantly being
developed.132 Up to this point, we have assumed that the production unit is
producing M outputs and using N inputs and this set of outputs and inputs remains
constant over time (if we are in the time series context) or it remains constant over
different production units in the same industry (in the cross sectional context). If the
underlying technology set St for a production unit does not change very much when
new outputs appear and some old outputs disappear in period t, then the various
econometric models proposed above could in theory deal with this problem if we
allow for technical change. But if there are many such changes over many periods,

128 Some labour hoarding may also occur; i.e., the costs of firing and then rehiring workers after
the recession is over may be higher than just keeping the workers employed.
129 This inefficiency problem will be addressed in other chapters in this Handbook using
nonparametric production analysis or Chap. 10, “Data Envelopment Analysis: A Nonparametric Method
of Production Analysis”; see Charnes and Cooper [13]. Most of the research in this area is applied
to cross sectional or panel data. For an application of the nonparametric approach to production
theory and the measurement of efficiency in the time series context, see Diewert and Fox [37].
130 See Cairns [10].
131 For examples of this methodology, see Diewert [28], Diewert et al. [47], Diewert and Huang

[39], Cairns [10] and Diewert and Fox [35].

132 See Broda and Weinstein [7, 8], Bernard et al. ([4], 82) and Hottman et al. ([63], 1300) for

information on the number of products sold in the US (at least 1.6 million). The last three papers
have information on the frequency of product entry and exit in the US (about 2% per month).

obviously, we will not be able to estimate flexible functional forms due to the
proliferation of parameters. Even if output changes are infrequent, the production
of a new output and the discontinuance of an existing output could lead to a radical
change in the use of inputs as the newer technology replaces the existing one and
again, we will have a proliferation of parameters, a lack of degrees of freedom and
our suggested econometric approaches will fail. Thus, there is a need for further
research to address these problems.

Acknowledgments The author thanks Robert Cairns, Kevin Fox, John Hartwick, Robert Inklaar,
Peter Neary, Subhash Ray, Stephen Redding, Philip Vermeulen and Valentin Zelenyuk for helpful
comments and the SSHRC of Canada for financial support.

References
1. Allen RGD (1938) Mathematical analysis for economists. Macmillan, London
2. Arrow KJ, Chenery HB, Minhas BS, Solow RM (1961) Capital-labor substitution and
economic efficiency. Rev Econ Stat 43:225–250
3. Berge C (1963) Topological spaces. Macmillan, New York
4. Bernard AB, Redding SJ, Schott PK (2010) Multiple-product firms and product switching. Am
Econ Rev 100:70–97
5. Blackorby C, Diewert WE (1979) Expenditure functions, local duality and second order
approximations. Econometrica 47:579–601
6. Blackorby C, Primont D, Russell RR (1978) Duality, separability and functional structure:
theory and economic applications. North-Holland, New York
7. Broda C, Weinstein DE (2006) Globalization and the gains from variety. Q J Econ 121:541–586
8. Broda C, Weinstein DE (2010) Product creation and destruction: evidence and price
implications. Am Econ Rev 100(3):691–723
9. Burgess D (1974) A cost minimization approach to import demand equations. Rev Econ Stat
56:224–234
10. Cairns RD (2013) The fundamental problem of accounting. Can J Econ 46:634–655
11. Caves DW, Christensen LR, Diewert WE (1982a) The economic theory of index numbers and
the measurement of input, output and productivity. Econometrica 50:1393–1414
12. Caves DW, Christensen LR, Diewert WE (1982b) Multilateral comparisons of output, input
and productivity using superlative index numbers. Econ J 92:73–86
13. Charnes A, Cooper WW (1985) Preface to topics in data envelopment analysis. Ann Oper Res
2:59–94
14. Christensen LR, Jorgenson DW, Lau LJ (1971) Conjugate duality and the transcendental
logarithmic production function. Econometrica 39:255–256
15. Christensen LR, Jorgenson DW, Lau LJ (1973) Transcendental logarithmic production
frontiers. Rev Econ Stat 55:28–45
16. Christensen LR, Jorgenson DW, Lau LJ (1975) Transcendental logarithmic utility functions.
Am Econ Rev 65:367–383
17. Cobb C, Douglas PH (1928) A theory of production. Am Econ Rev Suppl 18:139–165
18. Denny M (1974) The relationship between functional forms for the production system. Can J
Econ 7:21–31
19. Denny M, Fuss M, Waverman L (1981) The measurement and interpretation of total factor
productivity in regulated industries with an application to Canadian telecommunications. In:
Cowing TG, Stevenson RE (eds) Productivity measurement in regulated industries. Academic,
New York, pp 179–218
20. Diewert WE (1971) An application of the Shephard duality theorem: a generalized Leontief
production function. J Polit Econ 79:481–507

21. Diewert WE (1973) Functional forms for profit and transformation functions. J Econ Theory
6:284–316
22. Diewert WE (1974a) Applications of duality theory. In: Intriligator MD, Kendrick DA (eds)
Frontiers of quantitative economics, vol 2. North-Holland, Amsterdam, pp 106–171
23. Diewert WE (1974b) Functional forms for revenue and factor requirements functions. Int Econ
Rev 15:119–130
24. Diewert WE (1976) Exact and superlative index numbers. J Econ 4:114–145
25. Diewert WE (1978) Hicks’ aggregation theorem and the existence of a real value added
function. In: Fuss M, McFadden D (eds) Production economics: a dual approach to theory
and applications, vol 2. North-Holland, Amsterdam, pp 17–51
26. Diewert WE (1981) The comparative statics of industry long run equilibrium. Can J Econ
14:78–92
27. Diewert WE (1993) Duality approaches to microeconomic theory. In: Diewert WE, Nakamura
AO (eds) Essays in index number theory, vol 1. North-Holland, Amsterdam, pp 105–175
28. Diewert WE (2009) The aggregation of capital over vintages in a model of embodied technical
progress. J Prod Anal 32:1–19
29. Diewert WE (2011) Measuring productivity in the public sector: some conceptual problems.
J Prod Anal 36:177–191
30. Diewert WE (2012) The measurement of productivity in the nonmarket sector. J Prod Anal
37:217–229
31. Diewert WE (2018) Productivity measurement in the public sector, forthcoming. In: Grifell-
Tatjé E, Lovell CAK, Sickles RC (eds) The Oxford handbook of productivity analysis. Oxford
University Press, New York
32. Diewert WE, Feenstra R (2017) Estimating the benefits and costs of new and disappearing
products. Discussion paper 17-10. Vancouver School of Economics, University of British
Columbia, Vancouver
33. Diewert WE, Fox KJ (2008) On the estimation of returns to scale, technical progress and
monopolistic markups. J Econ 145:174–193
34. Diewert WE, Fox KJ (2010) Malmquist and Törnqvist productivity indexes: returns to scale
and technical progress with imperfect competition. J Econ 101:73–95
35. Diewert WE, Fox KJ (2016) Sunk costs and the measurement of commercial property
depreciation. Can J Econ 49:1340–1366
36. Diewert WE, Fox KJ (2017) The difference approach to productivity measurement and exact
indicators. Discussion paper 17-05. Vancouver School of Economics, University of British
Columbia, Vancouver
37. Diewert WE, Fox KJ (2018) Decomposing value added growth into explanatory factors,
forthcoming. In: Grifell-Tatjé E, Lovell CAK, Sickles RC (eds) The Oxford handbook of
productivity analysis. Oxford University Press, New York
38. Diewert WE, Hill RJ (2010) Alternative approaches to index number theory. In: Diewert WE,
Balk BM, Fixler D, Fox KJ, Nakamura AO (eds) Price and productivity measurement. Trafford
Press, Victoria, pp 263–278
39. Diewert WE, Huang N (2011) Capitalizing R&D expenditures. Macroecon Dyn 15:537–564
40. Diewert WE, Lawrence D (2002) The deadweight costs of capital taxation in Australia. In: Fox
KJ (ed) Efficiency in the public sector. Kluwer, Boston, pp 103–167
41. Diewert WE, Morrison CJ (1986) Adjusting output and productivity indexes for changes in the
terms of trade. Econ J 96:659–679
42. Diewert WE, Wales TJ (1987) Flexible functional forms and global curvature conditions.
Econometrica 55:43–68
43. Diewert WE, Wales TJ (1988) A normalized quadratic semiflexible functional form. J Econ
37:327–342
44. Diewert WE, Wales TJ (1992) Quadratic spline models for producer’s supply and demand
functions. Int Econ Rev 33:705–722
45. Diewert WE, Wales TJ (1993) Linear and quadratic spline models for consumer demand
functions. Can J Econ 26:77–106

46. Diewert WE, Woodland AD (1977) Frank Knight’s theorem in linear programming revisited.
Econometrica 45:375–398
47. Diewert E, Lawrence D, Fallon J (2009) The theory of network regulation in the presence of
sunk costs. Technical report prepared for the New Zealand Commerce Commission. Available at:
https://econ.sites.olt.ubc.ca/files/2013/06/pdf_paper_erwin-diewert-theory-network-regulation.pdf
48. Dixit A, Norman V (1980) Theory of international trade: a dual, general equilibrium approach.
Cambridge University Press, Cambridge, UK
49. Feenstra RC (1994) New product varieties and the measurement of international prices. Am
Econ Rev 84(1):157–177
50. Feenstra RC (2004) Advanced international trade: theory and evidence. Princeton University
Press, Princeton
51. Feenstra RC, Inklaar R, Timmer MP (2015) The next generation of the Penn World Table. Am
Econ Rev 105:3150–3182
52. Fenchel W (1953) Convex cones, sets and functions. Lecture notes at Princeton University,
Department of Mathematics, Princeton
53. Fisher I (1922) The making of index numbers. Houghton-Mifflin, Boston
54. Fox KJ (1998) Non-parametric estimation of technical progress. J Prod Anal 10:235–250
55. Fox KJ, Grafton RQ (2000) Nonparametric estimation of returns to scale: method and
application. Can J Agric Econ 48:341–354
56. Gábór-Toth E, Vermeulen P (2017) The relative importance of taste shocks and price
movements in the variation of cost-of-living: evidence from scanner data. Paper presented at
the 15th meeting of the Ottawa Group, Eltville am Rhein
57. Gale D, Klee VL, Rockafellar RT (1968) Convex functions on convex polytopes. Proc Am
Math Soc 19:867–873
58. Gorman WM (1968) Measuring the quantities of fixed factors. In: Wolfe JN (ed) Value, capital
and growth: papers in honour of Sir John Hicks. Aldine, Chicago, pp 141–172
59. Hardy GH, Littlewood JE, Polya G (1934) Inequalities. Cambridge University Press,
Cambridge, UK
60. Hicks JR (1946) Value and capital, 2nd edn. Clarendon Press, Oxford
61. Hotelling H (1932) Edgeworth’s taxation paradox and the nature of demand and supply
functions. J Polit Econ 40:577–616
62. Hotelling H (1935) Demand functions with limited budgets. Econometrica 3:66–78
63. Hottman CJ, Redding SJ, Weinstein DE (2016) Quantifying the sources of firm heterogeneity.
Q J Econ 131:1291–1364
64. Inklaar R, Diewert WE (2016) Measuring industry productivity and cross-country convergence.
J Econ 191:426–433
65. Kohli URJ (1978) A gross national product function and the derived demand for imports and
supply of exports. Can J Econ 11:167–182
66. Kohli U (1990) Growth accounting in the open economy: parametric and nonparametric
estimates. J Econ Soc Meas 16:125–136
67. Kohli U (1991) Technology, duality and foreign trade: the GNP function approach to modelling
imports and exports. University of Michigan Press, Ann Arbor
68. Konüs AA (1924) The problem of the true index of the cost of living. Econometrica
7(1939):10–29
69. Konüs AA, Byushgens SS (1926) K probleme pokupatelnoi cili deneg. Voprosi Konyunkturi
2:151–172
70. Lawrence D, Diewert E (2006) Regulating electricity networks: the ABC of setting X in New
Zealand. In: Coelli T, Lawrence D (eds) Performance measurement and regulation of network
utilities. Edward Elgar Publishing, Cheltenham, pp 207–241
71. Leontief WW (1941) The structure of the American economy 1919–1929. Harvard University
Press, Cambridge, MA
72. McFadden D (1966) Cost, revenue and profit functions: a cursory review. Institute for Business
and Economic Research working paper no. 86. University of California, Berkeley

73. McFadden D (1978) Cost, revenue and profit functions. In: Fuss M, McFadden D (eds)
Production economics: a dual approach, vol 1. North-Holland, Amsterdam, pp 3–109
74. McKenzie LW (1956–1957) Demand theory without a utility index. Rev Econ Stud 24:
184–189
75. Neary JP (2004) Rationalizing the Penn World Table: true multilateral indices for international
comparisons of real income. Am Econ Rev 94:1411–1428
76. Pollak RA (1969) Conditional demand functions and consumption theory. Q J Econ 83:60–78
77. Rockafellar RT (1970) Convex analysis. Princeton University Press, Princeton
78. Samuelson PA (1947) Foundations of economic analysis. Harvard University Press,
Cambridge, MA
79. Samuelson PA (1953) Prices of factors and goods in general equilibrium. Rev Econ Stud 21:
1–20
80. Samuelson PA (1967) The monopolistic competition revolution. In: Kuenne RE (ed)
Monopolistic competition theory: studies in impact. Wiley, New York, pp 105–138
81. Shephard RW (1953) Cost and production functions. Princeton University Press, Princeton
82. Shephard RW (1970) Theory of cost and production functions. Princeton University Press,
Princeton
83. Uzawa H (1962) Production functions with constant elasticities of substitution. Rev Econ Stud
29:291–299
84. Uzawa H (1964) Duality principles in the theory of cost and production. Int Econ Rev 5:
216–220
85. Walters AA (1963) Production and cost functions: an econometric survey. Econometrica 31:
1–66
86. Wiley DE, Schmidt WH, Bramble WJ (1973) Studies of a class of covariance structure models.
J Am Stat Assoc 68:317–323
87. Woodland AD (1982) International trade and resource allocation. North Holland, Amsterdam
4 Multiproduct Technologies

Rolf Färe, Daniel Primont, and W. L. Weber

Contents
Introduction
The Production Technology
  Set Representations of Technologies
  Cost and Revenue Indirect Technologies
Functional Representations of the Technology
  Radial Distance Functions
  A Joint Production Function
  Directional Distance Functions
  A Distance Function Tree
  Cost and Revenue Indirect Distance Functions
Optimization
  Profit Maximization
  Cost Minimization
  Revenue Maximization
  Efficiency Analysis
Duality Theory
  Cost Function Dualities
  Revenue Function Dualities
  Profit Function Dualities
Calculus
  Shadow Pricing
  Scale Elasticities
  Elasticities of Substitution
Appendix: Production Axioms
Cross-References
References

R. Färe
Department of Economics and Department of Agricultural and Resource Economics, Oregon
State University, Corvallis, OR, USA
Department of Economics and Department of Applied Economics, School of Public Policy,
Oregon State University, Corvallis, OR, USA
Department of Agricultural Economics, University of Maryland, College Park, MD, USA
D. Primont
Department of Economics, Southern Illinois University-Carbondale, Carbondale, IL, USA
W. L. Weber
Department of Accounting, Economics and Finance, Southeast Missouri State University, Cape
Girardeau, MO, USA

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_5

Introduction

This chapter lays out the theory of multiproduct technologies where inputs are
transformed into outputs by producers. The technology underlying production can
be represented by various sets as shown in section “The Production Technology”,
for instance, the input requirement set which includes all input combinations
that produce given outputs or the output set which includes all feasible output
combinations that can be produced from given inputs. If input prices or output
prices are known, cost or revenue indirect sets can be useful representations of
the technology. Section “Functional Representations of the Technology” shows
how functions can be used as representations of the various technology sets with
the functions inheriting their properties from the sets. Producers pursue various
objectives such as maximum profits, minimum costs, and maximum revenues. The
optimization of these objectives is covered in section “Optimization” as well as
a comparison between observed outcomes of the production process and optimal
outcomes in what is known as efficiency measurement. When the technology set is
convex, one can move between functions defined on the quantity space and functions
defined on the price space, and various duality results are shown in section “Duality
Theory”. When differentiable, the functional representations of the technology can
be used to recover shadow prices of inputs or outputs and scale elasticities or
substitution elasticities as shown in section “Calculus”.

The Production Technology

A production technology describes how inputs are transformed into outputs. Sets
or functions can be used to represent the technology. In this section, we begin with
set representations of the technology and discuss various axioms on the technology
sets. These sets serve as the foundation for describing multiproduct technologies,
i.e., technologies where multiple inputs are transformed into multiple outputs. Sets
describing how input quantities are transformed into output quantities are first
described. Then, we show how various technologies can be described by prices and
values, for instance, output sets that depend on input prices and costs or input sets
that depend on output prices and a target level of revenue. The various alternatives
allow the empirical economist who is confronted with limited data a meaningful
way to represent the technology.

Set Representations of Technologies

In this section, we represent the technologies as sets, including the output, input, and
technology sets. Let $x \in \mathbb{R}^N_+$ denote inputs and $y \in \mathbb{R}^M_+$ outputs. The technology
set is given by

$$T = \{(x, y) : x \text{ can produce } y\}. \tag{1}$$

The input requirement set is denoted by

L(y) = {x : x can produce y} = {x : (x, y) ∈ T } (2)

and the output set is

P (x) = {y : x can produce y} = {y : (x, y) ∈ T }. (3)

From these definitions, it follows that

(x, y) ∈ T ⇔ x ∈ L(y) ⇔ y ∈ P (x), (4)

i.e., each set is a complete representation of a technology and is often associated

with a different optimization problem. Profit maximization is best modeled with the
technology set, revenue maximization with the output set, and cost minimization
with the input set.
Activity Analysis¹ (AA) or Data Envelopment Analysis² (DEA) models are
examples of a specific set representation. Let $(x^k, y^k)$, $k = 1, \dots, K$, represent
the K observations of input and output vectors. The AA/DEA model constructed
from these data is

\[
\begin{aligned}
T = \Big\{ (x, y) :\ & \sum_{k=1}^{K} z_k x_{kn} \le x_n, \quad n = 1, \dots, N, \\
& \sum_{k=1}^{K} z_k y_{km} \ge y_m, \quad m = 1, \dots, M, \\
& z_k \ge 0, \quad k = 1, \dots, K \Big\}
\end{aligned} \tag{5}
\]

1 John von Neumann [26] introduced activity analysis.

2 Charnes et al. [6] coined the term Data Envelopment Analysis.
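where $z_k \ge 0$, $k = 1, \dots, K$, are the intensity variables. A corresponding output
set is

\[
\begin{aligned}
P(x) = \Big\{ y :\ & \sum_{k=1}^{K} z_k x_{kn} \le x_n, \quad n = 1, \dots, N, \\
& \sum_{k=1}^{K} z_k y_{km} \ge y_m, \quad m = 1, \dots, M, \\
& z_k \ge 0, \quad k = 1, \dots, K \Big\}
\end{aligned} \tag{6}
\]

and the corresponding input set is

\[
\begin{aligned}
L(y) = \Big\{ x :\ & \sum_{k=1}^{K} z_k x_{kn} \le x_n, \quad n = 1, \dots, N, \\
& \sum_{k=1}^{K} z_k y_{km} \ge y_m, \quad m = 1, \dots, M, \\
& z_k \ge 0, \quad k = 1, \dots, K \Big\}.
\end{aligned} \tag{7}
\]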
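
The sets in (5) to (7) can be illustrated with a minimal sketch for the special case of one input and one output (N = M = 1). In that case nonnegative intensity variables can produce at most x times the largest observed output-input ratio, so membership reduces to a single inequality. The data and the function names below are hypothetical and are not taken from the chapter; with several inputs or outputs, a linear programming solver would be needed instead.

```python
# Hypothetical data: K = 3 observed DMUs with one input and one output each.
OBSERVATIONS = [(2.0, 1.0), (4.0, 3.0), (5.0, 2.0)]  # (x_k, y_k), all x_k > 0


def best_ratio(data):
    """Largest observed output per unit of input, max_k y_k / x_k."""
    return max(y_k / x_k for x_k, y_k in data)


def in_technology(x, y, data=OBSERVATIONS):
    """True if (x, y) lies in the CRS AA/DEA set T of (5) when N = M = 1.

    With z_k >= 0 and sum_k z_k x_k <= x, the largest attainable
    sum_k z_k y_k equals x * max_k (y_k / x_k).
    """
    if x < 0.0 or y < 0.0:
        return False
    return y <= x * best_ratio(data)


def max_output(x, data=OBSERVATIONS):
    """Upper end of the output set P(x) = [0, x * best_ratio]."""
    return x * best_ratio(data)


def min_input(y, data=OBSERVATIONS):
    """Lower end of the input set L(y) = [y / best_ratio, infinity)."""
    return y / best_ratio(data)


if __name__ == "__main__":
    print(in_technology(4.0, 3.0))   # an observed point is feasible
    print(in_technology(2.0, 1.6))   # above the frontier, hence infeasible
    print(max_output(4.0), min_input(3.0))
```

The three functions answer the same question from the three viewpoints in (4): whether (x, y) is in T, whether y is in P(x), and whether x is in L(y).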

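Following [20], we impose the following conditions on the data $(x^k, y^k)$, $k = 1, \dots, K$:

\[
\begin{aligned}
\text{I.}\quad & \sum_{k=1}^{K} x_{kn} > 0, \quad n = 1, \dots, N \\
\text{II.}\quad & \sum_{n=1}^{N} x_{kn} > 0, \quad k = 1, \dots, K \\
\text{III.}\quad & \sum_{k=1}^{K} y_{km} > 0, \quad m = 1, \dots, M \\
\text{IV.}\quad & \sum_{m=1}^{M} y_{km} > 0, \quad k = 1, \dots, K.
\end{aligned} \tag{8}
\]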

Condition I states that each input, n = 1, ..., N, is used by at least one DMU
(decision-making unit). Condition II requires that each DMU use a positive amount
of at least one input. On the output side, condition III says that each output,
m = 1, ..., M, is produced by at least one DMU, and finally, condition IV states
that each DMU produces a positive amount of at least one output.
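
As a small illustration, conditions I to IV in (8) can be checked directly on a data set. The sketch below assumes nonnegative data stored as nested lists, with `X[k][n]` the amount of input n used by DMU k and `Y[k][m]` the amount of output m it produces; this layout and the function name are assumptions made for the example.

```python
def check_data_conditions(X, Y):
    """Check conditions I-IV of (8) for nonnegative input data X and output data Y.

    X[k][n] is input n of DMU k; Y[k][m] is output m of DMU k.
    Returns a dict mapping the condition label to True or False.
    """
    K = len(X)
    N = len(X[0])
    M = len(Y[0])
    return {
        # I: every input is used by at least one DMU (column sums positive).
        "I": all(sum(X[k][n] for k in range(K)) > 0 for n in range(N)),
        # II: every DMU uses at least one input (row sums positive).
        "II": all(sum(X[k]) > 0 for k in range(K)),
        # III: every output is produced by at least one DMU.
        "III": all(sum(Y[k][m] for k in range(K)) > 0 for m in range(M)),
        # IV: every DMU produces at least one output.
        "IV": all(sum(Y[k]) > 0 for k in range(K)),
    }


if __name__ == "__main__":
    X = [[1.0, 0.0], [0.0, 2.0]]
    Y = [[1.0], [0.0]]          # the second DMU produces nothing
    print(check_data_conditions(X, Y))
```

In this example the second DMU violates condition IV, so the result flags that condition while the other three hold.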
If the conditions in (8) hold, one can prove that T is a closed set, i.e., if $(x^l, y^l) \in T$
for all $l$ and $(x^l, y^l)$ converges to $(x^o, y^o)$, then $(x^o, y^o) \in T$.

We note that if T is closed, so are P(x) and L(y). These results follow by
considering the sequences

\[ (x^o, y^l) \in T \quad \text{and} \quad (x^l, y^o) \in T. \tag{9} \]

Since $(x^o, y^l) \in T \Leftrightarrow y^l \in P(x^o)$ and since closedness of T gives $(x^o, y^o) \in T$
whenever $y^l$ converges to $y^o$, it follows that $y^o \in P(x^o)$, proving that P(x) is a
closed set. The same logic applies to show that L(y) is a closed set.
The above conditions also imply that P (x) is a bounded set, a condition often
referred to as scarcity. In addition one can prove a no free lunch condition: that is, if

y ∈ P (0) then y = 0. (10)

The M inequalities (≥) for outputs in (6) allow outputs to be freely disposable,
i.e., (in terms of the output set)

\[ y \in P(x),\ y' \le y \Rightarrow y' \in P(x). \tag{11} \]

With respect to inputs, the N inequalities (≤) in (7) make inputs freely disposable,
i.e.,

\[ x \in L(y),\ x' \ge x \Rightarrow x' \in L(y). \tag{12} \]

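The intensity variables $z_k$, $k = 1, \dots, K$, are restricted only to be nonnegative, which
implies that the technology exhibits constant returns to scale:

\[
\begin{aligned}
& \lambda T = T, \quad \lambda > 0, \\
\text{or}\quad & P(\lambda x) = \lambda P(x), \quad \lambda > 0, \\
\text{or}\quad & L(\lambda y) = \lambda L(y), \quad \lambda > 0.
\end{aligned} \tag{13}
\]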

Since the intensity variables are nonnegative, if we consider two vectors of these
variables, $z^0$ and $z^1$, then their convex combination, $z = \lambda z^0 + (1 - \lambda) z^1$, $0 \le \lambda \le 1$,
is also nonnegative. This implies that T is convex, i.e.,

\[
\text{if } (x^0, y^0) \in T \text{ and } (x^1, y^1) \in T \text{ then }
(\lambda x^0 + (1 - \lambda) x^1,\ \lambda y^0 + (1 - \lambda) y^1) \in T, \quad 0 \le \lambda \le 1. \tag{14}
\]

Convexity of T implies that P(x) is convex. To see this result, just take $x^0 = x^1$.
Similarly, if we take $y^0 = y^1$, then it follows that L(y) is convex. However, the
converse does not hold. Figure 1 illustrates the technology set T, the output sets
$P(x^1)$ and $P(x^2)$, and the input sets $L(y^1)$ and $L(y^2)$. One can easily see that for
each fixed x, the set P(x) is convex and for each fixed y, the set L(y) is convex.
However, T is not convex.
When we extend our model to include bad or undesirable outputs, additional
axioms must be introduced. Let $b \in \mathbb{R}^J_+$ represent undesirable or bad outputs, and
let the output set take the form

\[ P(x) = \{(y, b) : x \text{ can produce } (y, b)\}. \tag{15} \]

Fig. 1 Nonconvexity of T (horizontal axis x, vertical axis y; the figure marks the output sets $P(x^1)$ and $P(x^2)$ and the input sets $L(y^1)$ and $L(y^2)$)

When good and bad outputs are jointly produced, the assumption of free
disposability does not apply; one cannot just throw away the bads. To model
disposability in this case, we say that outputs are weakly disposable if radial
contractions are feasible, i.e.,

(y, b) ∈ P (x), 0 ≤ θ ≤ 1 ⇒ (θy, θ b) ∈ P (x). (16)

To model the condition that y and b are jointly produced, we say that these outputs
are null-joint, i.e.,

if (y, b) ∈ P (x) and b = 0, then y = 0. (17)

In words, no fire without smoke.

We may extend the AA/DEA model to include “bads.” This model now becomes

\[
\begin{aligned}
P(x) = \Big\{ (y, b) :\ & \sum_{k=1}^{K} z_k x_{kn} \le x_n, \quad n = 1, \dots, N, \\
& \sum_{k=1}^{K} z_k y_{km} \ge y_m, \quad m = 1, \dots, M, \\
& \sum_{k=1}^{K} z_k b_{kj} = b_j, \quad j = 1, \dots, J, \\
& z_k \ge 0, \quad k = 1, \dots, K \Big\}.
\end{aligned} \tag{18}
\]

In this model, a third set of linear restrictions is added, and we note that these
restrictions are equalities. One can prove that these equality constraints for the bad
outputs and inequality constraints for the “good” outputs ($y_m$) make (y, b) weakly
disposable.
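To model the null-jointness of y and b in the AA/DEA framework, we require
that two conditions hold, namely,

\[
\begin{aligned}
\text{V.}\quad & \sum_{k=1}^{K} b_{kj} > 0, \quad j = 1, \dots, J, \text{ and} \\
\text{VI.}\quad & \sum_{j=1}^{J} b_{kj} > 0, \quad k = 1, \dots, K.
\end{aligned} \tag{19}
\]

The first condition says that each bad output is produced by some DMU k, and the
second condition states that each DMU k produces some bad output. To verify that
these conditions make good and bad outputs null-joint, set $b_j = 0$, $j = 1, \dots, J$, in (18).
Since all $z_k$ and $b_{kj}$ are nonnegative, the equality constraints then require
$z_k b_{kj} = 0$ for all k and j. By condition VI, each k has some j with $b_{kj} > 0$, so each
intensity variable must satisfy $z_k = 0$, $k = 1, \dots, K$. The output constraints then
give $0 \ge y_m$, and thus $y_m = 0$, $m = 1, \dots, M$.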
For a list of axioms, see the Appendix: Production Axioms.

Cost and Revenue Indirect Technologies

We now turn to technologies where firms operate under a cost or revenue constraint.
These models are often referred to as indirect technologies.
Let $w \in \mathbb{R}^N_+$ be an input price vector and $c \in \mathbb{R}_+$ represent total or allowed cost.
The budget or cost constraint is the linear inequality

\[ c \ge wx = \sum_{n=1}^{N} w_n x_n. \tag{20} \]

The cost indirect output set, I P (w/c), is the union of all output sets P (x) where the
input vector does not cost more than c. Formally,

I P (w/c) = {y : y ∈ P (x), wx ≤ c}
= {y : y ∈ P (x), (w/c)x ≤ 1}. (21)

Like the indirect output set, the indirect input requirement set can also be
obtained. Let $p \in \mathbb{R}^M_+$ represent an output price vector and $r \in \mathbb{R}_+$ be the target or
minimum revenue to be attained. The target revenue constraint is

\[ r \le py = \sum_{m=1}^{M} p_m y_m. \tag{22} \]

The revenue indirect input requirement set, IL(p/r), is the union of all input sets
L(y) such that y generates at least revenue r, i.e.,

\[
\begin{aligned}
IL(p/r) &= \{x : x \in L(y),\ py \ge r\} \\
&= \{x : x \in L(y),\ (p/r)y \ge 1\}.
\end{aligned} \tag{23}
\]

The indirect output set IP(w/c) is useful in modeling production where the
producer faces a budget constraint such as in the public sector. Here the producer
might be interested in knowing the different combinations of outputs that can
be produced given the budget and input prices. On the revenue side, the indirect
input requirement set IL(p/r) might be useful to a producer with multiple outlets
or divisions where each outlet has a revenue target. In this case, the producer gains
knowledge of the different input combinations that can meet the revenue target. As
we show in section “Cost and Revenue Indirect Distance Functions”, these indirect
sets can be combined with behavioral assumptions to construct indirect distance
functions.

Functional Representations of the Technology

While sets are useful models of the production technology, functional representa-
tions of the technology are more generally used in empirical work. In this section,
we examine functions that can be used to represent multiproduct technologies. In
turn, these functions can be used to measure efficiency and returns to scale. In
addition, functions can be used in empirical work to recover shadow or support
prices of nonmarket outputs and inputs and to estimate various elasticities.

Radial Distance Functions

Next we study radial distance functions which were introduced into production
economics by [28, 29]. We start with the input distance function.
The input requirement set L(y) consists of all input vectors that can produce y,

\[ L(y) = \{x : x \text{ can produce } y\}, \quad y \in \mathbb{R}^M_+, \tag{24} \]

the isoquants of which are defined as

\[ \text{Isoq}\, L(y) = \{x : x \in L(y),\ \lambda < 1 \Rightarrow \lambda x \notin L(y)\}. \tag{25} \]

The input distance function is defined by

\[ D_i(y, x) = \sup_{\lambda} \{\lambda > 0 : (x/\lambda) \in L(y)\}, \quad y \in \mathbb{R}^M_+. \tag{26} \]

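One property of the input distance function is that it is homogeneous of degree +1
in inputs (the scaled vector), i.e., $D_i(y, \mu x) = \mu D_i(y, x)$. This property can be
easily proved as follows:

\[
\begin{aligned}
D_i(y, \mu x) &= \sup_{\lambda} \Big\{ \lambda : \frac{\mu x}{\lambda} \in L(y) \Big\} \\
&= \mu \sup_{\lambda} \Big\{ \frac{\lambda}{\mu} : \frac{\mu x}{\lambda} \in L(y) \Big\} \\
&= \mu \sup_{\lambda/\mu} \Big\{ \frac{\lambda}{\mu} : \frac{x}{(\lambda/\mu)} \in L(y) \Big\} \\
&= \mu D_i(y, x), \quad \mu > 0.
\end{aligned} \tag{27}
\]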
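
A minimal numerical sketch of definition (26) and of the homogeneity property (27) follows. It assumes a hypothetical single-output technology in which x can produce y when $y \le x_1^{a} x_2^{1-a}$, and it finds the supremum by bisection; the technology, the parameter `a`, and the function names are illustrative assumptions, and the routine expects y > 0.

```python
def f(x1, x2, a=0.5):
    """Hypothetical production function: x can produce y when y <= f(x)."""
    return x1 ** a * x2 ** (1.0 - a)


def in_input_set(x, y):
    """True if x belongs to L(y) for the hypothetical technology."""
    return f(x[0], x[1]) >= y


def input_distance(y, x, tol=1e-12):
    """D_i(y, x) = sup{lam > 0 : x / lam in L(y)}, located by bisection (y > 0)."""
    lo, hi = 0.0, 1.0
    # Double hi until the scaled-down input vector x / hi is no longer feasible.
    while in_input_set((x[0] / hi, x[1] / hi), y):
        lo, hi = hi, 2.0 * hi
    # Shrink the bracket [lo, hi] around the largest feasible lam.
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if in_input_set((x[0] / mid, x[1] / mid), y):
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    d = input_distance(1.0, (4.0, 1.0))
    print(d)                                   # close to f(4, 1) / 1
    print(input_distance(1.0, (8.0, 2.0)))     # about twice d, as (27) predicts
```

Doubling the input vector doubles the computed distance, which is the degree +1 homogeneity shown in (27). A value below one signals that x cannot produce y.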

Now, assuming weak disposability of inputs (i.e., if x ∈ L(y) and λ ≥ 1,

then λx ∈ L(y)), then and only then is the input distance function a functional
representation of the input sets, i.e.,

Di (y, x) ≥ 1 if and only if x ∈ L(y)

or
L(y) = {x : Di (y, x) ≥ 1}. (28)

Formally:
Proposition: Inputs are weakly disposable if and only if L(y) = {x : Di (y, x) ≥ 1}.
Proof:³ (⇐) : Let x ∈ L(y) and λ ≥ 1. Then Di (y, x) ≥ 1 ⇒ Di (y, λx) = λDi (y, x) ≥ λ ≥ 1
(using homogeneity of Di in x) ⇒ λx ∈ L(y).
(⇒) : x ∈ L(y) ⇒ λ = 1 is feasible in the definition of Di . Thus, Di (y, x) ≥ 1.
On the other hand, Di (y, x) ≥ 1 ⇒ (x/λ) ∈ L(y) for some λ ≥ 1. Then, using
A.7, λ(x/λ) ∈ L(y), i.e., x ∈ L(y). Q.E.D.
Also note that Di (y, x) = 1 if and only if x ∈ I soq L(y).
From the definition of Di (y, x) and the above proposition, it follows that the
production axioms have an equivalent representation in terms of Di (y, x).
The radial output distance function is defined in terms of the output set

P (x) = {y : x can produce y}, (29)

as

\[ D_o(x, y) = \inf_{\lambda} \{\lambda : y/\lambda \in P(x)\}, \quad x \in \mathbb{R}^N_+. \tag{30} \]

3 This proof is from [15, p.22].

From its definition, it follows that

Do (x, λy) = λDo (x, y), λ > 0 (31)

i.e., it is homogeneous of degree +1 in the (scaled) output vector.4

Using homogeneity and weak output disposability (A.5), one can prove that the
output distance function has the representation property,

\[ P(x) = \{y : D_o(x, y) \le 1\}, \quad x \in \mathbb{R}^N_+. \tag{32} \]

Let the output isoquant be defined as

\[ \text{Isoq}\, P(x) = \{y : y \in P(x),\ \lambda > 1 \Rightarrow \lambda y \notin P(x)\}, \quad x \in \mathbb{R}^N_+. \tag{33} \]

Then, one can prove that

Do (x, y) = 1 if and only if y ∈ I soq P (x). (34)

Unlike the production function where multiple inputs produce a single output,
Shephard distance functions allow a functional representation of technologies
where multiple outputs are produced by multiple inputs. The Shephard distance
functions have been widely used as measures of technical efficiency, and we
outline these efficiency measures in section “Efficiency Analysis”. Furthermore,
in section “Duality Theory”, we show that these distance functions have a dual
representation in price space as the cost function and revenue function.

A Joint Production Function

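Following [29, p. 212],⁵ we define the joint production function
$J : \mathbb{R}^N_+ \times \mathbb{R}^M_+ \to \mathbb{R}_+$ such that:

\[
\begin{aligned}
\text{(a)}\quad & \text{for } y \ge 0,\ L(y) \ne \emptyset, \quad \text{Isoq}\, L(y) = \{x : J(x, y) = 0\} \\
& \text{and} \\
\text{(b)}\quad & \text{for } x \ge 0,\ P(x) \ne \emptyset, \quad \text{Isoq}\, P(x) = \{y : J(x, y) = 0\}.
\end{aligned} \tag{35}
\]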

Two questions are of interest with respect to J (x, y). First, does it exist? Second,
how can it be represented? Regarding the second question, distance functions may be
used as its representation, namely,

\[ J(x, y) = D_i(y, x) - D_o(x, y). \tag{36} \]

Regarding the existence of J (x, y), we refer to the proof in [10, p. 39]. The
conditions required are

\[
\begin{aligned}
& \text{Isoq}\, P(x) \cap \text{Isoq}\, P(\lambda x) = \emptyset, \quad \lambda \ne 1, \\
\text{and}\quad & \text{Isoq}\, L(y) \cap \text{Isoq}\, L(\theta y) = \emptyset, \quad \theta \ne 1.
\end{aligned} \tag{37}
\]

4 The proof is similar to that of homogeneity of inputs in the input distance function, and we omit
it here.
5 See also [10], pp. 38–40.

Directional Distance Functions

Luenberger [22, 23], building on [1], introduced directional distance functions under
the names shortage function and benefit function. These functions are additive in
structure and generalize the radial Shephard distance functions. Here we follow [4]
and term them directional distance functions.
Given the technology set

T = {(x, y) : x can produce y} (38)

the directional technology distance function (shortage function) is defined as

\[ \vec{D}_T(x, y; g_x, g_y) = \sup_{\beta} \{\beta : (x - \beta g_x,\ y + \beta g_y) \in T\} \tag{39} \]

where $g = (g_x, g_y) \in \mathbb{R}^N_+ \times \mathbb{R}^M_+$, $g \ne 0$, is the directional vector that indicates
the direction along which (x, y) is projected onto the boundary of T. This function
simultaneously contracts x along $g_x$ and expands y along $g_y$. This contraction in
inputs and expansion in outputs is in line with profit maximization where it is
desirable to use fewer inputs to produce more outputs. If inputs and outputs are
strongly disposable, i.e.,

\[ \text{if } (x, y) \in T \text{ and } x' \ge x,\ y' \le y, \text{ then } (x', y') \in T, \tag{40} \]

then the technology distance function has the representation property⁶

\[ \vec{D}_T(x, y; g_x, g_y) \ge 0 \text{ if and only if } (x, y) \in T. \tag{41} \]

This property allows us to express the production axioms in terms of the directional
technology distance function.
From its definition, this distance function satisfies the translation property:

\[ \vec{D}_T(x - \alpha g_x,\ y + \alpha g_y;\ g_x, g_y) = \vec{D}_T(x, y; g_x, g_y) - \alpha, \quad \alpha \in \mathbb{R}. \tag{42} \]

6 A weaker condition called $g = (g_x, g_y)$ disposability will suffice for this property to hold.

The directional output distance function holds inputs constant ($g_x = 0$) and
projects outputs onto the boundary of P(x):

\[ \vec{D}_o(x, y; g_y) = \sup_{\beta} \{\beta : (y + \beta g_y) \in P(x)\}. \tag{43} \]

Similarly, the directional input distance function holds outputs constant ($g_y = 0$)
and projects inputs onto the boundary of L(y):

\[ \vec{D}_i(y, x; g_x) = \sup_{\beta} \{\beta : (x - \beta g_x) \in L(y)\}. \tag{44} \]

Each of these functions has properties similar to those of the directional technology
distance function, i.e., representation and translation, now of course with respect to
the output set P (x) and input set L(y).
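
The definitions in (39), (43), and (44) and the translation property (42) can be checked numerically. The sketch below assumes a hypothetical one-input, one-output technology T = {(x, y) : 0 ≤ y ≤ √x} and computes the supremum by bisection for points that start inside T; the technology and the function names are assumptions made for illustration only.

```python
import math


def in_T(x, y):
    """Hypothetical technology set: T = {(x, y) : x >= 0, 0 <= y <= sqrt(x)}."""
    return x >= 0.0 and 0.0 <= y <= math.sqrt(x)


def directional_distance(x, y, gx, gy, tol=1e-12):
    """sup{beta : (x - beta*gx, y + beta*gy) in T} for a feasible (x, y).

    The direction (gx, gy) must be nonnegative and nonzero. Setting gx = 0
    gives the directional output distance function (43); setting gy = 0 gives
    the directional input distance function (44).
    """
    if not in_T(x, y):
        raise ValueError("this sketch only handles points inside T")
    lo, hi = 0.0, 1.0
    # Double hi until the projected point leaves T.
    while in_T(x - hi * gx, y + hi * gy):
        lo, hi = hi, 2.0 * hi
    # Bisect on the boundary between feasible and infeasible beta.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_T(x - mid * gx, y + mid * gy):
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    base = directional_distance(4.0, 1.0, 1.0, 1.0)
    alpha = 0.5
    moved = directional_distance(4.0 - alpha, 1.0 + alpha, 1.0, 1.0)
    print(base, moved, base - alpha)                 # translation property (42)
    print(directional_distance(4.0, 1.0, 0.0, 1.0))  # output direction only
    print(directional_distance(4.0, 1.0, 1.0, 0.0))  # input direction only
```

Moving the point by α along the direction vector lowers the computed distance by α, as (42) states.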

A Distance Function Tree

To relate the five distance functions introduced so far, we form a tree (see Fig. 2)
whose root is the directional technology distance function. Setting $g_x = 0$ yields
the directional output distance function, and setting $g_y = 0$ yields the directional
input distance function. Deriving the radial distance functions requires an
argument. Recall that these functions meet the representation property. Thus, e.g.,
the radial output distance function may be used in the definition of $\vec{D}_o(x, y; g_y)$:

\[
\begin{aligned}
\vec{D}_o(x, y; g_y) &= \sup_{\beta} \{\beta : (y + \beta g_y) \in P(x)\} \\
&= \sup_{\beta} \{\beta : D_o(x, y + \beta g_y) \le 1\}.
\end{aligned} \tag{45}
\]

Fig. 2 Distance function tree (root: $\vec{D}_T(x, y; g_x, g_y)$; setting $g_x = 0$ gives $\vec{D}_o(x, y; g_y)$ and setting $g_y = 0$ gives $\vec{D}_i(y, x; g_x)$; setting $g_y = y$ gives $D_o(x, y)$ and setting $g_x = x$ gives $D_i(y, x)$)

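Now, set $g_y = y$ and recall that $D_o(x, y)$ is homogeneous of degree one in y. Then

\[
\begin{aligned}
\vec{D}_o(x, y; y) &= \sup_{\beta} \{\beta : D_o(x, (1 + \beta) y) \le 1\} \\
&= -1 + \sup_{\beta} \{(1 + \beta) : D_o(x, y)(1 + \beta) \le 1\} \\
&= -1 + \sup_{\beta} \Big\{ (1 + \beta) : (1 + \beta) \le \frac{1}{D_o(x, y)} \Big\} \\
&= -1 + \frac{1}{D_o(x, y)}
\end{aligned} \tag{46}
\]

which can be rearranged as

\[ D_o(x, y) = \frac{1}{1 + \vec{D}_o(x, y; y)}. \tag{47} \]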

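A similar argument can be used to establish the relation between $D_i(y, x)$ and
$\vec{D}_i(y, x; g_x)$. Setting $g_x = x$ and using homogeneity of degree one of $D_i(y, x)$
in x, we can write

\[
\begin{aligned}
\vec{D}_i(y, x; x) &= \sup_{\beta} \{\beta : D_i(y, x - \beta x) \ge 1\} \\
&= \sup_{\beta} \{\beta : D_i(y, (1 - \beta) x) \ge 1\} \\
&= \sup_{\beta} \{\beta : (1 - \beta) D_i(y, x) \ge 1\} \\
&= 1 - \inf_{1 - \beta} \Big\{ 1 - \beta : 1 - \beta \ge \frac{1}{D_i(y, x)} \Big\} \\
&= 1 - \frac{1}{D_i(y, x)}
\end{aligned} \tag{48}
\]

which can be rearranged as

\[ D_i(y, x) = \frac{1}{1 - \vec{D}_i(y, x; x)}. \tag{49} \]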

The results in (46) and (48) show that the Shephard output and input distance
functions are special cases of the directional output and directional input distance
functions when the directional vectors are chosen as $g_y = y$ and $g_x = x$. Since
other directional vectors may be chosen, the directional distance function provides
a more general representation of the technology.
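
The relations (47) and (49) can be verified numerically. The sketch below assumes the hypothetical technology T = {(x, y) : 0 ≤ y ≤ √x} with one input and one output, computes each distance function from its definition with a generic bisection helper, and compares the two sides. The helper `sup_feasible` and all other names are illustrative assumptions.

```python
import math


def in_T(x, y):
    """Hypothetical technology set: T = {(x, y) : x >= 0, 0 <= y <= sqrt(x)}."""
    return x >= 0.0 and 0.0 <= y <= math.sqrt(x)


def sup_feasible(pred, lo, tol=1e-12):
    """Largest t >= lo with pred(t) true, assuming pred(lo) holds and pred
    switches from true to false exactly once as t grows."""
    step = 1.0
    hi = lo + step
    while pred(hi):
        lo, step = hi, 2.0 * step
        hi = lo + step
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pred(mid):
            lo = mid
        else:
            hi = mid
    return lo


def radial_output_distance(x, y):
    # inf{theta : y/theta in P(x)} equals 1 / sup{s : s*y in P(x)}.
    return 1.0 / sup_feasible(lambda s: in_T(x, s * y), 1.0)


def radial_input_distance(y, x):
    # sup{lam : x/lam in L(y)}; lam = 1 is feasible for a point in T.
    return sup_feasible(lambda lam: in_T(x / lam, y), 1.0)


def directional_output_distance(x, y, gy):
    return sup_feasible(lambda b: in_T(x, y + b * gy), 0.0)


def directional_input_distance(y, x, gx):
    return sup_feasible(lambda b: in_T(x - b * gx, y), 0.0)


if __name__ == "__main__":
    x, y = 4.0, 1.0
    print(radial_output_distance(x, y),
          1.0 / (1.0 + directional_output_distance(x, y, y)))   # relation (47)
    print(radial_input_distance(y, x),
          1.0 / (1.0 - directional_input_distance(y, x, x)))    # relation (49)
```

Each printed pair should agree up to the bisection tolerance, which illustrates the two upper branches of the tree in Fig. 2.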

Cost and Revenue Indirect Distance Functions

All distance functions in sections “Radial Distance Functions”, “A Joint Production
Function”, “Directional Distance Functions”, and “A Distance Function Tree” were
defined on the “direct” technologies T, P(x), or L(y). Here we turn our attention
to distance functions defined on the indirect technologies, IP(w/c) and IL(p/r).
Again we study radial/Shephard and directional distance functions defined on those
indirect technologies.
Recall that the cost indirect output set is

I P (w/c) = {y : y ∈ P (x), wx ≤ c} = {y : y ∈ P (x), (w/c)x ≤ 1} (50)

where $w \in \mathbb{R}^N_+$ is an input price vector, c is total cost, and P(x) is the “direct” output
set. The corresponding (radial) cost indirect output distance function is defined as

I Do (w/c, y) = inf{θ : (y/θ ) ∈ I P (w/c)}. (51)

This function is homogeneous of degree +1 in outputs (y):

I Do (w/c, λy) = λI Do (w/c, y), λ > 0. (52)

Under strong disposability of outputs,⁷ the indirect output distance function meets
the representation condition

I Do (w/c, y) ≤ 1 ⇔ y ∈ I P (w/c)
or
I P (w/c) = {y : I Do (w/c, y) ≤ 1}. (53)

We consider two special cases for $ID_o(w/c, y)$: first when y is a scalar and
second when the technology exhibits constant returns to scale (CRS). When output
is a scalar, the distance function can be written as

\[ ID_o(w/c, y) = y \cdot ID_o(w/c, 1). \tag{54} \]

To see this result, recall the homogeneity property given in (52), and let λ = 1/y,
which gives $ID_o(w/c, 1) = ID_o(w/c, y)/y$. Now, if y is efficient, we have
$ID_o(w/c, y) = 1$ so that (54) can be written as

\[
\begin{aligned}
& 1 = y \cdot ID_o(w/c, 1) \\
\text{or}\quad & y = 1 / ID_o(w/c, 1)
\end{aligned} \tag{55}
\]

which is the indirect production function or, in consumer theory, the indirect utility
function.

7 Weaker conditions can be established.
Next we show that under CRS the cost function and $ID_o(w/c, y)$ are equivalent.
The cost function is defined as

\[ C(y, w) = \min_{x} \{wx : x \in L(y)\}. \tag{56} \]

If input prices are strictly positive, i.e., $w_n > 0$, $n = 1, \dots, N$, then one can prove
that⁸

\[ IP(w/c) = \{y : C(y, w) \le c\}. \tag{57} \]

If the technology exhibits CRS, the cost function is homogeneous of degree +1
in outputs

\[ C(\lambda y, w) = \lambda C(y, w), \quad \lambda > 0. \tag{58} \]

Using (57) and (58), together with the homogeneity of degree +1 of the cost function
in input prices, we show that the cost indirect output distance function
equals the cost function:⁹

\[
\begin{aligned}
ID_o(w/c, y) &= \inf_{\theta} \{\theta : (y/\theta) \in IP(w/c)\} \\
&= \inf_{\theta} \{\theta : C(y/\theta, w) \le c\} \\
&= \inf_{\theta} \{\theta : C(y, w/c) \le \theta\} \\
&= C(y, w/c)
\end{aligned} \tag{59}
\]

establishing the claim.
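
The equality in (59) can be illustrated with a minimal numerical sketch. It assumes a hypothetical CRS technology with two inputs and one output, $y \le \sqrt{x_1 x_2}$, for which the cost function has the closed form $C(y, w) = 2y\sqrt{w_1 w_2}$. The indirect output distance function is computed separately by maximizing output along the budget line with a ternary search, which relies on output being nondecreasing in inputs so that the whole budget is spent. All names and numbers are illustrative assumptions.

```python
import math


def production(x1, x2):
    """Hypothetical CRS production function f(x) = sqrt(x1 * x2)."""
    return math.sqrt(x1 * x2)


def max_output_under_budget(w, c, iterations=200):
    """Largest output in IP(w/c): maximize f(x) subject to w.x = c.

    Uses ternary search over x1 in [0, c / w1]; output along the budget
    line is unimodal in x1 for this technology.
    """
    def output_on_budget_line(x1):
        x2 = max(0.0, (c - w[0] * x1) / w[1])
        return production(x1, x2)

    lo, hi = 0.0, c / w[0]
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if output_on_budget_line(m1) < output_on_budget_line(m2):
            lo = m1
        else:
            hi = m2
    return output_on_budget_line(0.5 * (lo + hi))


def indirect_output_distance(w, c, y):
    """ID_o(w/c, y) = inf{theta : y/theta in IP(w/c)} for a scalar output."""
    return y / max_output_under_budget(w, c)


def cost_function(y, w):
    """Closed-form minimum cost for f(x) = sqrt(x1 * x2)."""
    return 2.0 * y * math.sqrt(w[0] * w[1])


if __name__ == "__main__":
    w, c, y = (1.0, 4.0), 8.0, 1.0
    normalized_prices = (w[0] / c, w[1] / c)
    print(indirect_output_distance(w, c, y))
    print(cost_function(y, normalized_prices))
```

Both printed values should coincide, which is the content of (59) for this technology.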

One may ask, what is the relation between the “direct” and indirect output
distance functions $D_o(x, y)$ and $ID_o(w/c, y)$? They differ in their first arguments:
the direct distance function is defined in primal/quantity space and the indirect
function is defined in dual/price space; hence, they are dual to each other in their
first argument, $x \in \mathbb{R}^N_+$ and $w/c \in \mathbb{R}^N_+$.¹⁰

8 Färe and Primont [15], p. 95

9 Färe and Primont [15], pp. 83–84

Although we devote a section to duality theory, this nontraditional relation is
included here. If L(y) is convex, we can prove that

\[
\begin{aligned}
& ID_o(w/c, y) = \inf_{x} \{D_o(x, y) : wx \le c\} \\
\text{and}\quad & D_o(x, y) = \sup_{w/c} \{ID_o(w/c, y) : wx \le c\}.
\end{aligned} \tag{60}
\]

Hence, the two distance functions are dual with respect to inputs (x) and normalized
input prices (w/c).
The indirect directional output distance function is

\[ \overrightarrow{ID}_o(w/c, y; g_y) = \sup_{\beta} \{\beta : (y + \beta g_y) \in IP(w/c)\} \tag{61} \]

and it meets the standard translation and representation properties.

Again, we may ask what the relation between the direct and indirect directional
output distance functions is. The following duality theorem illustrates the relation:¹¹

\[
\begin{aligned}
\overrightarrow{ID}_o(w/c, y; g_y) &= \sup_{x} \{\vec{D}_o(x, y; g_y) : wx \le c\} \\
\vec{D}_o(x, y; g_y) &= \inf_{w/c} \{\overrightarrow{ID}_o(w/c, y; g_y) : wx \le c\}.
\end{aligned} \tag{62}
\]

Therefore, under convexity of P (x), the two distance functions model the same
technology.
Turning to input representations of the technology, recall that the revenue indirect
input requirement set is given by

I L(p/r) = {x : x ∈ L(y), r ≤ py} = {x : x ∈ L(y), 1 ≤ (p/r)y} (63)

where $p \in \mathbb{R}^M_+$ is a vector of output prices, r is total revenue, and L(y) is the
“direct” input requirement set. The radial indirect distance function defined on this
set is

I Di (p/r, x) = sup{λ : x/λ ∈ I L(p/r)}. (64)

10 Note that RN and its dual (RN )∗ are equal, RN = (RN )∗ ; see, e.g., [19], p. 80–81.
11 See [16, p. 244] for a proof.

This distance function is homogeneous of degree +1 in inputs

I Di (p/r, λx) = λI Di (p/r, x), λ > 0 (65)

and satisfies the representation condition under strong disposability of inputs, i.e.,

\[
\begin{aligned}
& ID_i(p/r, x) \ge 1 \Leftrightarrow x \in IL(p/r) \\
\text{or}\quad & IL(p/r) = \{x : ID_i(p/r, x) \ge 1\}.
\end{aligned} \tag{66}
\]

The maximum revenue function is defined as

R(x, p) = max{py : y ∈ P (x)}. (67)

Using the revenue function and CRS, we can show that

I Di (p/r, x) = R(x, p/r). (68)

First we note that if output prices are positive, pm > 0, m = 1, . . . , M, the indirect
input set may be written as

I L(p/r) = {x : R(x, p/r) ≥ 1}. (69)

Second, under CRS, the revenue function is homogeneous of degree +1 in inputs,

R(λx, p/r) = λR(x, p/r), λ > 0. (70)

Using the facts in (69) and (70) and knowing that the revenue function is
homogeneous of degree +1 in output prices, the claim in (68) follows:

\[
\begin{aligned}
ID_i(p/r, x) &= \sup_{\lambda} \{\lambda : x/\lambda \in IL(p/r)\} \\
&= \sup_{\lambda} \{\lambda : R(x/\lambda, p) \ge r\} \\
&= \sup_{\lambda} \{\lambda : R(x/\lambda, p/r) \ge 1\} \\
&= \sup_{\lambda} \{\lambda : R(x, p/r) \ge \lambda\} \\
&= R(x, p/r).
\end{aligned} \tag{71}
\]

Contrasting the direct and indirect distance functions, $D_i(y, x)$ and $ID_i(p/r, x)$,
we see that their first arguments differ. The direct distance function depends on outputs
$y \in \mathbb{R}^M_+$, while the indirect distance function depends on revenue-normalized output
prices, $p/r \in \mathbb{R}^M_+$. These two concepts are dual to each other, and in particular we
have

\[ y(p/r) = 1. \tag{72} \]

Thus, the two distance functions are dual to each other, and when the output set
P(x) is convex, we can prove that

\[
\begin{aligned}
ID_i(p/r, x) &= \sup_{y} \{D_i(y, x) : py \ge r\} \\
D_i(y, x) &= \inf_{p/r} \{ID_i(p/r, x) : py \ge r\}.
\end{aligned} \tag{73}
\]

Let us define the revenue constrained directional input distance function as

\[ \overrightarrow{ID}_i(p/r, x; g_x) = \sup_{\beta} \{\beta : x - \beta g_x \in IL(p/r)\}, \tag{74} \]

and note that it meets the translation and representation properties (among others).
Comparing $\overrightarrow{ID}_i(p/r, x; g_x)$ and $\vec{D}_i(y, x; g_x)$, we again observe that they have
dual first vectors, p/r and y, respectively. Thus, the two functions are dual with
respect to these vectors. In particular, if P(x) is convex, we can prove the duality
theorem¹²

\[
\begin{aligned}
\overrightarrow{ID}_i(p/r, x; g_x) &= \sup_{y} \{\vec{D}_i(y, x; g_x) : (p/r)y \ge 1\} \\
\vec{D}_i(y, x; g_x) &= \inf_{p/r} \{\overrightarrow{ID}_i(p/r, x; g_x) : (p/r)y \ge 1\}.
\end{aligned} \tag{75}
\]

Optimization

The technology can also be represented as the result of optimizing behavior on

the part of the producer. In this section, we consider the profit function, cost
function, and revenue function as dual representations of the technology, each of
which inherits its properties from the technology T , L(y), or P (x). We make these
relations clear when discussing duality.

12 Färe and Primont [16]

Profit Maximization

We first turn our attention to the profit function which is the result of maximizing
behavior and is a dual representation of the technology, T .
The existence of a maximum, i.e., that there is an input/output vector $(x^*, y^*) \in T$ such
that

\[ py^* - wx^* \ge py - wx \quad \text{for all } (x, y) \in T, \tag{76} \]

needs some consideration that we bring up later.

Given that (x ∗ , y ∗ ) maximizes profits, it then follows that x ∗ minimizes costs of
producing outputs y ∗ given input prices w and y ∗ maximizes revenue given inputs
x ∗ and output prices p. To verify the assertion that profit maximization implies cost
minimization, assume that x ∗ does not minimize costs. Then, there must exist a
feasible input vector, x̂, i.e., (x̂, y ∗ ) ∈ T such that

w x̂ < wx ∗ . (77)

However, the condition above would then mean that

py ∗ − wx ∗ < py ∗ − w x̂, (78)

which contradicts the assumption that $(x^*, y^*)$ maximizes profit. Hence, profit
maximization implies cost minimization given input prices and $y^*$.
To confirm that profit maximization also implies revenue maximization, assume
not. Then there must exist a feasible output vector, ŷ, such that (x ∗ , ŷ) ∈ T and

py ∗ < pŷ. (79)

By the condition above, then it must be that

py ∗ − wx ∗ < pŷ − wx ∗ (80)

contradicting the condition that (x ∗ , y ∗ ) maximizes profit. Thus, profit maximiza-

tion implies revenue maximization.
In addition to defining the profit function in terms of a set, we next show how
it can be expressed using distance functions. These formulations are important for
duality theory.
Recall the following equivalences:

(x, y) ∈ T ⇔ y ∈ P (x) ⇔ x ∈ L(y), (81)

where P(x) is an output set and L(y) is an input set. We know that the directional
technology distance function

\[ \vec{D}_T(x, y; g_x, g_y) = \sup_{\beta} \{\beta : (x - \beta g_x,\ y + \beta g_y) \in T\} \tag{82} \]

meets representation, i.e.,

\[ T = \{(x, y) : \vec{D}_T(x, y; g_x, g_y) \ge 0\}. \tag{83} \]

Hence, we may use this function to define profit maximization

\[ \pi(w, p) = \max_{y, x} \{py - wx : \vec{D}_T(x, y; g_x, g_y) \ge 0\}. \tag{84} \]

Moreover, we are aware of the fact that

y ∈ P (x) ⇔ Do (x, y) ≤ 1. (85)

Thus, we have

P (x) = {y : Do (x, y) ≤ 1}. (86)

and since

(x, y) ∈ T ⇔ y ∈ P (x), (87)

we have the following profit maximization formulation:

\[ \pi(w, p) = \max_{y, x} \{py - wx : D_o(x, y) \le 1\}. \tag{88} \]

With respect to the input distance function, Di (y, x), we have

x ∈ L(y) ⇔ Di (y, x) ≥ 1, (89)

where L(y) = {x : Di (y, x) ≥ 1}. Now, since (x, y) ∈ T ⇔ x ∈ L(y), we can

maximize profit with the input distance function as its constraint

\[ \pi(w, p) = \max_{y, x} \{py - wx : D_i(y, x) \ge 1\}. \tag{90} \]

In our duality section, we will make use of all these distance function formulations
of profit maximization.

Cost Minimization

Let $w \in \mathbb{R}^N_+$ be a vector of input prices and

L(y) = {x : x can produce y} (91)

be an input requirement set. The cost minimization problem is

min wx subject to x ∈ L(y). (92)

When a solution to the cost minimization problem exists, the value function

C(y, w) = min{wx : x ∈ L(y)}, (93)

is referred to as the cost function. We note that

C(y, w) ≤ wx for all x ∈ L(y) (94)

since C(y, w) is a minimum. Since the input distance function has the representation
property x ∈ L(y) ⇔ Di (y, x) ≥ 1, the cost minimization problem may be written
as

C(y, w) = min{wx : Di (y, x) ≥ 1} (95)

which has the Lagrangian formulation

C(y, w) = wx − μ(Di (y, x) − 1), (96)

where μ is the Lagrangian multiplier.
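
As a hedged worked step, the Lagrangian in (96) can be used to interpret the multiplier. The derivation below assumes that $D_i(y, x)$ is differentiable in x and that the minimum is attained at an interior point $x^*$; these assumptions are added for the illustration and are not stated at this point in the text.

```latex
% First-order conditions of the Lagrangian in (96), assuming differentiability:
\[
w_n = \mu \, \frac{\partial D_i(y, x^*)}{\partial x_n}, \qquad n = 1, \dots, N.
\]
% Multiply by x_n^*, sum over n, and apply Euler's theorem, since D_i is
% homogeneous of degree +1 in x:
\[
w x^* = \mu \sum_{n=1}^{N} \frac{\partial D_i(y, x^*)}{\partial x_n} \, x_n^*
      = \mu \, D_i(y, x^*) = \mu ,
\]
% because the constraint binds, D_i(y, x^*) = 1. Hence the multiplier equals
% minimum cost and input prices are proportional to the gradient of D_i:
\[
\mu = C(y, w), \qquad
w_n = C(y, w) \, \frac{\partial D_i(y, x^*)}{\partial x_n}.
\]
```

Under these assumptions the gradient of the input distance function gives cost-normalized shadow prices of inputs.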

We have also indicated that the directional input distance function

\[ \vec{D}_i(y, x; g_x) = \sup_{\beta} \{\beta : (x - \beta g_x) \in L(y)\} \tag{97} \]

meets the representation condition

\[ x \in L(y) \Leftrightarrow \vec{D}_i(y, x; g_x) \ge 0. \tag{98} \]

Therefore, we may write the cost minimization problem as

\[ C(y, w) = \min_{x} \{wx : \vec{D}_i(y, x; g_x) \ge 0\} \tag{99} \]

with the Lagrangian formulation

\[ C(y, w) = wx - \mu (\vec{D}_i(y, x; g_x) - 0), \tag{100} \]

where μ is the Lagrangian multiplier. Of course we may also formulate the

minimization problem with other distance functions that meet the representation
property. We leave that exercise to the reader.
In the previous section, we proved that profit maximization

π(w, p) = py ∗ − wx ∗ (101)

implied that wx ∗ was the minimum cost of producing y ∗ , i.e.,

C(y ∗ , w) = wx ∗ = min{wx : x ∈ L(y ∗ )}. (102)

Thus, we may use the cost function, C(y, w), as part of the profit maximization
problem

π(w, p) = max{py − C(y, w)}. (103)

This formulation holds provided the cost function represents the technology,
i.e., if C(y, w) is dual to it. As we show in section “Cost Function Dualities”,
representation is true under certain conditions, like the input requirement set L(y)
being closed, nonempty, and convex.

Revenue Maximization

Denote the output price vector as $p \in \mathbb{R}^M_+$ and the output set by

P (x) = {y : x can produce y}. (104)

The optimization problem consists of firms choosing a feasible output vector so as

to maximize revenues given inputs and output prices. The resulting value function

R(x, p) = max{py : y ∈ P (x)} (105)

is referred to as the revenue function. Given that P (x) is closed and bounded, i.e.,
compact, the revenue function exists. We would like to express our duality theorems
in terms of functions; hence, we need to express revenue maximization with function
constraints, rather than constraints on sets, in this case the output set.
Consider first the radial/Shephard output distance function

\[ D_o(x, y) = \inf_{\lambda} \{\lambda : y/\lambda \in P(x)\}. \tag{106} \]

This function meets the representation condition

y ∈ P (x) ⇔ Do (x, y) ≤ 1, (107)

i.e., we have

P (x) = {y : Do (x, y) ≤ 1}. (108)

Thus, we may substitute the output distance function into the revenue maximization
problem so that

R(x, p) = max{py : Do (x, y) ≤ 1}. (109)

This revenue function may be written as a Lagrangian problem as

R(x, p) = py − μ(Do (x, y) − 1), (110)

where μ is the Lagrangian multiplier.

We also know that the directional output distance function meets the
representation condition, i.e.,

\[ P(x) = \{y : \vec{D}_o(x, y; g_y) \ge 0\}, \tag{111} \]

so that an alternative constraint in the revenue maximization problem can represent
the technology, i.e.,

\[ R(x, p) = \max_{y} \{py : \vec{D}_o(x, y; g_y) \ge 0\}. \tag{112} \]

We use (112) in section “Revenue Function Dualities” on duality theory.

We proved in section “Profit Maximization” that profit maximization implies
revenue maximization, i.e., π(w, p) = py ∗ − wx ∗ . Thus we have

R(x ∗ , p) = py ∗ = max {py : y ∈ P (x ∗ )} (113)

where input is restricted to equal the optimal input under profit maximization, x ∗ .
Now, if R(x, p) represents the technology, i.e., it is dual to the output distance
function, then the profit maximization problem may be formulated as

\[ \pi(w, p) = \max_{x} \{R(x, p) - wx\}. \tag{114} \]

Efficiency Analysis

In sections “Functional Representations of the Technology” and “Optimization”,
we have discussed distance functions and value functions. Merging these concepts
allows us to study efficiency in production, which consists of a technical component
(distance functions) and an overall component (value functions). We first establish
the cost efficiency model of [18] and proceed with the corresponding directional
distance function analog.
Recall that under cost minimization the following inequality holds:

C(y, w) ≤ wx for all x ∈ L(y). (115)

From the discussions of distance functions, we know that

\[ D_i(y, x) = \sup_{\lambda} \{\lambda : x/\lambda \in L(y)\} = \lambda^* \tag{116} \]

where $\lambda^*$ is the maximal (supremal) contraction factor of the input vector x that
still allows the output vector y to be produced. When this contraction in inputs is
achieved, we have

\[ \frac{x}{\lambda^*} = \frac{x}{D_i(y, x)} \in L(y), \tag{117} \]

i.e., $x / D_i(y, x)$ belongs to the input set L(y) and is thus feasible. Combining this
condition with the cost minimization inequality from (115) yields

\[ C(y, w) \le \frac{wx}{D_i(y, x)}. \tag{118} \]

If wx is interpreted as observed cost, the right-hand side of (118) can be interpreted
as the cost of production if the observed inputs are contracted to the input isoquant,
Isoq L(y).¹³ Rearranging (118) yields

\[ \frac{C(y, w)}{wx} \le \frac{1}{D_i(y, x)} \tag{119} \]

where the left-hand side is the ratio of minimum costs to actual costs, which
measures overall cost efficiency (OCE), and the right-hand side equals input
technical efficiency ($TE_i$). To close the inequality, we introduce an allocative
efficiency component, $AE_i$, as a residual.¹⁴ That is,

\[
\begin{aligned}
& \frac{C(y, w)}{wx} = \frac{1}{D_i(y, x)} \times AE_i \\
\text{or}\quad & OCE = TE_i \times AE_i.
\end{aligned} \tag{120}
\]

13 The input isoquant is defined as Isoq L(y) = {x : x ∈ L(y), λ < 1 ⇒ λx ∉ L(y)}.
14 See [3] for an approach in which AEi is not a residual.

That is, overall cost efficiency equals the product of input technical efficiency and
allocative efficiency.
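
The decomposition in (120) can be illustrated with a small worked example. The sketch assumes a hypothetical CRS technology $y \le \sqrt{x_1 x_2}$, for which the input distance function is $D_i(y, x) = \sqrt{x_1 x_2}/y$ and the cost function is $C(y, w) = 2y\sqrt{w_1 w_2}$. The technology, the observation, and the function names are illustrative assumptions.

```python
import math


def production(x):
    """Hypothetical CRS production function f(x) = sqrt(x1 * x2)."""
    return math.sqrt(x[0] * x[1])


def input_distance(y, x):
    """D_i(y, x) = f(x) / y, valid here because f is homogeneous of degree 1."""
    return production(x) / y


def cost_function(y, w):
    """Closed-form minimum cost of producing y at input prices w."""
    return 2.0 * y * math.sqrt(w[0] * w[1])


def farrell_decomposition(y, x, w):
    """Return (OCE, TE_i, AE_i) as defined in (119) and (120)."""
    observed_cost = w[0] * x[0] + w[1] * x[1]
    oce = cost_function(y, w) / observed_cost   # minimum cost over actual cost
    te = 1.0 / input_distance(y, x)             # input technical efficiency
    ae = oce / te                               # allocative efficiency as a residual
    return oce, te, ae


if __name__ == "__main__":
    oce, te, ae = farrell_decomposition(1.0, (4.0, 1.0), (1.0, 1.0))
    print(oce, te, ae)
```

For this observation, half of the inputs could be removed radially, and the remaining gap to minimum cost is attributed to the input mix.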
Turning to the directional input distance function

$$\vec{D}_i(y, x; g_x) = \sup_{\beta}\{\beta : (x - \beta g_x) \in L(y)\} = \beta^*, \tag{121}$$

where β* is the supremum, we note that

$$x - \beta^* g_x = x - \vec{D}_i(y, x; g_x)\, g_x \in L(y). \tag{122}$$

Combining (122) with cost minimization yields the inequality

$$C(y, w) \le w\bigl(x - \vec{D}_i(y, x; g_x)\, g_x\bigr) = wx - \vec{D}_i(y, x; g_x)\, w g_x, \tag{123}$$

which we may rearrange as

$$\frac{wx - C(y, w)}{w g_x} \ge \vec{D}_i(y, x; g_x), \tag{124}$$

where the left-hand side measures the difference between actual costs and minimum costs normalized by wgx and serves as a measure of overall cost inefficiency ($\overrightarrow{OCI}$), and the right-hand side measures input technical inefficiency ($\overrightarrow{TI}_i$). Following [9], we refer to these measures of inefficiency as indicators, while the radial measures of efficiency are termed indexes.15

To close the gap, we introduce allocative inefficiency ($\overrightarrow{AI}_i$) additively; i.e.,

$$\frac{wx - C(y, w)}{w g_x} = \vec{D}_i(y, x; g_x) + \overrightarrow{AI}_i \quad \text{or} \quad \overrightarrow{OCI} = \overrightarrow{TI}_i + \overrightarrow{AI}_i \tag{125}$$

is the decomposition of the cost inefficiency indicator ($\overrightarrow{OCI}$) into technical inefficiency ($\overrightarrow{TI}_i$) and allocative inefficiency ($\overrightarrow{AI}_i$). A producer exhibits no technical inefficiency if $\overrightarrow{TI}_i = 0$ and no allocative inefficiency if $\overrightarrow{AI}_i = 0$. When both components equal zero, the producer exhibits no cost inefficiency, i.e., $\overrightarrow{OCI} = 0$.

We note that in the radial case, allocative efficiency is introduced multiplicatively and in the directional case allocative inefficiency is introduced additively. Higher values for the efficiency indexes (OCE, TEi, and AEi) indicate greater efficiency in the use and allocation of inputs given outputs and input prices. In contrast, higher values of the inefficiency indicators ($\overrightarrow{OCI}$, $\overrightarrow{TI}_i$, and $\overrightarrow{AI}_i$) indicate greater inefficiency in the use and allocation of inputs given outputs, input prices, and the directional vector (gx).

15 See also [13] for difference indicators using a profit function.
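The additive decomposition (125) can be sketched numerically under the same kind of assumption as before: a two-input Leontief technology y = min{x1, x2}, for which the directional input distance function reduces to the smallest slack ratio (x_n − y)/g_n. All names and numbers below are hypothetical.

```python
# Directional cost inefficiency OCI = TI_i + AI_i for an ASSUMED Leontief
# technology y = min(x1, x2); names and data are illustrative only.

def directional_input_distance(y, x, g):
    """sup{beta : x - beta*g in L(y)} with L(y) = {x : min(x) >= y}."""
    return min((xn - y) / gn for xn, gn in zip(x, g) if gn > 0)

def cost_inefficiency(y, x, w, g):
    wx = sum(wn * xn for wn, xn in zip(w, x))
    wg = sum(wn * gn for wn, gn in zip(w, g))
    min_cost = y * sum(w)                     # Leontief cost function
    oci = (wx - min_cost) / wg                # left-hand side of (124)
    ti = directional_input_distance(y, x, g)  # right-hand side of (124)
    ai = oci - ti                             # additive residual in (125)
    return oci, ti, ai

oci, ti, ai = cost_inefficiency(y=1.0, x=(2.0, 4.0), w=(1.0, 1.0), g=(1.0, 1.0))
print(f"OCI={oci:.4f}  TI_i={ti:.4f}  AI_i={ai:.4f}")
```

Unlike the radial indexes, these indicators depend on the chosen direction gx; a different gx would rescale all three values.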
Next we study revenue efficiency with both radial and directional distance
functions. Since this analysis closely parallels the cost notions, we keep it to a
minimum. Recall the following revenue inequality

R(x, p) ≥ py for all y ∈ P (x) (126)

and the two distance functions

Do(x, y) = inf{λ : y/λ ∈ P(x)}, (127)

with y/Do(x, y) ∈ P(x), and

$$\vec{D}_o(x, y; g_y) = \sup_{\beta}\{\beta : y + \beta g_y \in P(x)\}, \tag{128}$$

with $y + \vec{D}_o(x, y; g_y)\, g_y \in P(x)$.
To construct a measure of overall revenue efficiency (ORE), the revenue inequality is rearranged and then closed by multiplying the output distance function by an index of output allocative efficiency (AEo), i.e.,

$$\begin{aligned}
R(x, p) &\ge \frac{py}{D_o(x, y)} \\
\frac{py}{R(x, p)} &= D_o(x, y) \times AE_o \\
ORE &= TE_o \times AE_o
\end{aligned} \tag{129}$$

where TEo is output technical efficiency as measured by the output distance function. The indexes ORE, TEo, and AEo take a maximum value of one when the producer is efficient in the particular index.
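A small numerical sketch of (129) follows. It assumes a hypothetical single-input, two-output technology with output set P(x) = {y ≥ 0 : y1² + y2² ≤ x}, so that Do(x, y) = ‖y‖/√x and R(x, p) = √x ‖p‖ for nonnegative prices. The technology, names, and data are assumptions made for illustration.

```python
import math

# Revenue efficiency decomposition ORE = TE_o * AE_o for an ASSUMED output set
# P(x) = {y >= 0 : y1^2 + y2^2 <= x}; names and data are illustrative only.

def output_distance(x, y):
    """Shephard output distance D_o(x, y) = inf{lam : y/lam in P(x)}."""
    return math.hypot(y[0], y[1]) / math.sqrt(x)

def revenue_function(x, p):
    """Maximum revenue R(x, p) = sqrt(x) * ||p|| on the quarter disc."""
    return math.sqrt(x) * math.hypot(p[0], p[1])

def revenue_efficiency(x, y, p):
    observed_revenue = p[0] * y[0] + p[1] * y[1]
    ore = observed_revenue / revenue_function(x, p)  # py / R(x, p)
    te = output_distance(x, y)                       # TE_o = D_o(x, y)
    ae = ore / te                                    # residual closing (129)
    return ore, te, ae

ore, te_o, ae_o = revenue_efficiency(x=25.0, y=(2.0, 1.5), p=(3.0, 4.0))
print(f"ORE={ore:.4f}  TE_o={te_o:.4f}  AE_o={ae_o:.4f}")
```

In this example the producer is halfway to the frontier (TEo = 0.5), and the output mix is slightly off the revenue-maximizing direction (AEo = 0.96).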
Similar to overall cost inefficiency, an indicator of overall revenue inefficiency ($\overrightarrow{ORI}$) can be constructed. Overall revenue inefficiency equals the normalized difference between maximum revenues and actual revenues. To derive this indicator, start with the revenue inequality using the directional output distance function, which serves as the indicator of output technical inefficiency ($\overrightarrow{TI}_o$). Then rearrange and, finally, add an indicator of output allocative inefficiency ($\overrightarrow{AI}_o$):

$$\begin{aligned}
R(x, p) &\ge p\bigl(y + \vec{D}_o(x, y; g_y)\, g_y\bigr) \\
\frac{R(x, p) - py}{p g_y} &= \vec{D}_o(x, y; g_y) + \overrightarrow{AI}_o \\
\overrightarrow{ORI} &= \overrightarrow{TI}_o + \overrightarrow{AI}_o.
\end{aligned} \tag{130}$$

Efficient producers have $\overrightarrow{ORI} = 0$, $\overrightarrow{TI}_o = 0$, and $\overrightarrow{AI}_o = 0$, with higher values indicating greater inefficiency in the particular component.
Above in this section we showed how the radial/Shephard distance functions
could be used as constraints in the profit maximization problem. First, recall that
x ∈ L(y) ⇔ (x, y) ∈ T . Second, consider the following expressions:

Di (y, x) = sup{λ : x/λ ∈ L(y)}

= sup{λ : (x/λ, y) ∈ T }. (131)

Now, since Di(y, x) meets the representation conditions16 of L(y) = {x : Di(y, x) ≥ 1} and T = {(x, y) : Di(y, x) ≥ 1}, we have shown that the input distance function may be used as the technology constraint for the profit maximization problem. We follow [5] and make use of these representation properties in developing Nerlovian indicators of profit inefficiency.
When a profit maximum exists, maximum profits are no less than actual profits

π(w, p) ≥ py − wx for all (x, y) ∈ T . (132)

From the above analysis, we know that (x/Di(y, x), y) ∈ T. Thus

$$\pi(w, p) \ge py - \frac{wx}{D_i(y, x)}. \tag{133}$$

By adding and subtracting wx on the right-hand side and rearranging, one obtains

$$\begin{aligned}
\pi(w, p) &\ge py - wx + wx - \frac{1}{D_i(y, x)}\, wx \\
\frac{\pi(w, p) - (py - wx)}{wx} &\ge 1 - \frac{1}{D_i(y, x)} \\
\frac{\pi(w, p) - (py - wx)}{wx} &\ge \vec{D}_i(y, x; g_x)
\end{aligned} \tag{134}$$

where $\vec{D}_i(y, x; x) = 1 - \frac{1}{D_i(y, x)}$ when gx = x, as shown in (48). Adding a directional allocative inefficiency component ($\overrightarrow{AI}_i$) to the right-hand side of (134) closes the inequality and yields an additive decomposition of overall profit inefficiency into technical inefficiency and allocative inefficiency:

$$\frac{\pi(w, p) - (py - wx)}{wx} = \vec{D}_i(y, x; x) + \overrightarrow{AI}_i \tag{135}$$

where the left-hand side of (135) measures overall profit inefficiency and is equal to the amount by which maximum profits exceed actual profits, normalized by actual costs: [π(w, p) − (py − wx)]/wx.

16 Say under strong disposability of inputs.
As an alternative to (135), one can also develop a multiplicative index. Again, starting from the profit inequality (132), we also know that if (x, y*) ∈ T, then π(w, p) ≥ py* − wx. Now, since (x, y*) ∈ T, then (x/Di(y*, x), y*) ∈ T, which means

$$\begin{aligned}
\pi(w, p) &\ge py^* - \frac{wx}{D_i(y^*, x)} \\
\pi(w, p) - py^* &\ge -\frac{wx}{D_i(y^*, x)}.
\end{aligned} \tag{136}$$

Since π(w, p) = py* − wx*, we have

$$\begin{aligned}
py^* - wx^* - py^* &\ge -\frac{wx}{D_i(y^*, x)} \\
-wx^* &\ge -\frac{wx}{D_i(y^*, x)} \\
\frac{wx^*}{wx} &\le \frac{1}{D_i(y^*, x)}.
\end{aligned} \tag{137}$$

The inequality can be closed by multiplying the right-hand side by an index of input allocative efficiency, AEi. Thus, overall cost efficiency equals the product of input technical efficiency and allocative efficiency:

$$\begin{aligned}
\frac{wx^*}{wx} &= \frac{1}{D_i(y^*, x)} \times AE_i \\
OCE_i &= TE_i \times AE_i.
\end{aligned} \tag{138}$$

The decomposition of OCEi can also be done for values of y other than y*.
One can also develop a Nerlovian indicator of profit inefficiency by exploiting the representation properties of either the Shephard output distance function or the directional output distance function. We start with the output distance function. Recall that y ∈ P(x) ⇔ (x, y) ∈ T. The representation property yields P(x) = {y : Do(x, y) ≤ 1} and T = {(x, y) : Do(x, y) ≤ 1}. When a profit maximum exists, maximum profits are no less than actual profits: π(w, p) ≥ py − wx for all (x, y) ∈ T. Thus, since (x, y) ∈ T ⇒ (x, y/Do(x, y)) ∈ T, we can derive a second Nerlovian profit efficiency indicator as

$$\begin{aligned}
\pi(w, p) &\ge \frac{py}{D_o(x, y)} - wx + (py - py) \\
\frac{\pi(w, p) - (py - wx)}{py} &\ge \frac{1}{D_o(x, y)} - 1 \\
\frac{\pi(w, p) - (py - wx)}{py} &\ge \vec{D}_o(x, y; y) \\
\frac{\pi(w, p) - (py - wx)}{py} &= \vec{D}_o(x, y; y) + \overrightarrow{AI}_o
\end{aligned} \tag{139}$$

where $\vec{D}_o(x, y; y) = \frac{1}{D_o(x, y)} - 1$ when gy = y. Here $\vec{D}_o(x, y; y)$ is an indicator of output technical inefficiency, and $\overrightarrow{AI}_o$ is an indicator of output allocative inefficiency. In this case, overall profit inefficiency equals the amount by which maximum profits exceed actual profits, normalized by actual revenues: [π(w, p) − (py − wx)]/py. We leave the derivation of the multiplicative index of output efficiency along the same lines as (138) to the reader.
In general, the indicator of Nerlovian profit inefficiency can be decomposed as the sum of an indicator of technical inefficiency measured by $\vec{D}_T(x, y; g_x, g_y)$ and an indicator of allocative inefficiency $\overrightarrow{AI}_T$. Starting with the profit inequality (132) and making use of the representation property (x, y) ∈ T ⇔ $\vec{D}_T(x, y; g_x, g_y) \ge 0$ and the definition of $\vec{D}_T(x, y; g_x, g_y)$, the Nerlovian profit inefficiency indicator is derived:

$$\begin{aligned}
\pi(w, p) &\ge py - wx \\
\pi(w, p) &\ge p\bigl(y + \vec{D}_T(x, y; g_x, g_y)\, g_y\bigr) - w\bigl(x - \vec{D}_T(x, y; g_x, g_y)\, g_x\bigr) \\
\frac{\pi(w, p) - (py - wx)}{p g_y + w g_x} &\ge \vec{D}_T(x, y; g_x, g_y) \\
\frac{\pi(w, p) - (py - wx)}{p g_y + w g_x} &= \vec{D}_T(x, y; g_x, g_y) + \overrightarrow{AI}_T
\end{aligned} \tag{140}$$

where the inequality is closed by adding the allocative inefficiency component $\overrightarrow{AI}_T$. Clearly, when gx = x and gy = 0 the indicator of Nerlovian profit inefficiency collapses to (135). Similarly, when gx = 0 and gy = y, the profit inefficiency indicator collapses to (139).
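The decomposition (140) can be sketched for a hypothetical single-input, single-output technology T = {(x, y) : 0 ≤ y ≤ √x}, for which the profit function is π(w, p) = p²/(4w). The directional technology distance function is found by bisection on β. The technology, the function names, and the data are assumptions made for illustration, not part of the chapter.

```python
import math

# Nerlovian profit inefficiency (140) for an ASSUMED technology
# T = {(x, y) : y <= sqrt(x)}; names and data are illustrative only.

def directional_technology_distance(x, y, gx, gy, tol=1e-12):
    """sup{beta : (x - beta*gx, y + beta*gy) in T}, found by bisection."""
    def feasible(beta):
        x_new = x - beta * gx
        return x_new >= 0 and y + beta * gy <= math.sqrt(x_new)
    lo = 0.0                                   # (x, y) is assumed to lie in T
    hi = x / gx if gx > 0 else (math.sqrt(x) - y) / gy
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo

def nerlovian(x, y, w, p, gx, gy):
    max_profit = p * p / (4.0 * w)             # profit function of this T
    overall = (max_profit - (p * y - w * x)) / (p * gy + w * gx)
    technical = directional_technology_distance(x, y, gx, gy)
    allocative = overall - technical           # residual closing (140)
    return overall, technical, allocative

overall, technical, allocative = nerlovian(x=4.0, y=1.0, w=1.0, p=2.0, gx=1.0, gy=1.0)
print(f"overall={overall:.4f}  technical={technical:.4f}  allocative={allocative:.4f}")
```

Setting gx = 0 or gy = 0 in the call gives the output-oriented or input-oriented special cases, provided the remaining direction is strictly positive.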

Duality Theory

This section is devoted to duality theory. This theory has its roots in the fact that a
convex set can be modeled in two ways: one as the set of all convex combinations of
points in the set and two as the intersection of all half-spaces containing it. Shephard
[28] introduced this theory into economics as a way to make use of price/cost data
to model a technology.
Statistical studies of cost functions are generally more accessible than corresponding
empirical investigations of production functions, because economic data are most frequently
in price and monetary terms. [28, p. 28]

In order to illustrate the idea, we make use of a paper by [30]. Assume we are
given a Leontief production function with two inputs, y = min{x1 , x2 }. The cost
function associated with this production function is

                        C(y, w) = min{w1 x1 + w2 x2 : min{x1 , x2 } ≥ y}                       (141)

and can be derived as C(y, w) = y(w1 + w2 ).
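The closed form C(y, w) = y(w1 + w2) can be checked by brute force. The sketch below searches a grid of input bundles for the cheapest one that satisfies min{x1, x2} ≥ y; the grid bounds, step count, and function name are arbitrary choices made for this illustration.

```python
# Brute-force check of the Leontief cost function C(y, w) = y * (w1 + w2).
# Grid bounds and resolution are arbitrary assumptions for illustration.

def leontief_cost_grid(y, w, upper=5.0, steps=500):
    """Cheapest grid point x with min(x1, x2) >= y, as in (141)."""
    best = float("inf")
    for i in range(steps + 1):
        x1 = upper * i / steps
        for j in range(steps + 1):
            x2 = upper * j / steps
            if min(x1, x2) >= y:
                best = min(best, w[0] * x1 + w[1] * x2)
    return best

y, w = 2.0, (3.0, 1.0)
grid_cost = leontief_cost_grid(y, w)
closed_form = y * (w[0] + w[1])
print(grid_cost, closed_form)
```

The grid minimum is attained at the kink x1 = x2 = y, which is why relative prices do not affect the cost-minimizing input mix for this technology.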

Our illustration of the duality between the Leontief production function and cost function is represented in Fig. 3, which has a primal space (x1, x2) ≥ 0 and a dual space (w1/c, w2/c). The input requirement set is illustrated in the primal space, and the cost function is illustrated in the dual space. In the primal space, the horizontal and vertical intercepts (c/w1, c/w2) of the relative price lines give the maximum input quantities that could be purchased if the entire budget was spent on the specific input. The figure shows how the Leontief technology is reflected through the asymptotes w1x1/c = 1 and w2x2/c = 1 from the primal quantity space into the price space, where it is a straight line.
Of course, one may also start in the price space with the cost function and derive
the production function in the quantity space. We note that duality does not preserve
differentiability, but might create it.

Cost Function Dualities

We start with the duality theory of [28] between the cost function and the radial/Shephard input distance function and end this section with a duality statement between the cost function and the directional input distance function.
[Fig. 3 Cost and Production Function Duality. Primal space: axes x1 and x2, input set L(y), asymptotes (w1x1)/c = 1 and (w2x2)/c = 1. Dual space: axes w1/c and w2/c, cost function C(w/c, y).]

Recall that the cost function is defined as

                      C(y, w) = min{wx : x ∈ L(y)}, w > 0 and y ∈ Dom L                       (142)

where y ∈ $\mathbb{R}^M_+$ is an output vector, w ∈ $\mathbb{R}^N_+$ an input price vector, x ∈ $\mathbb{R}^N_+$ an input vector, and Dom L = {y : L(y) ≠ ∅}. In general we drop the assumption that y ∈ Dom L. One may show that the cost function meets the following conditions:17

C.1 C(y, w) is nonnegative and non-decreasing in w.

C.2 C(y, w) is homogeneous of degree +1 in w.
C.3 C(y, w) is concave and continuous in w.

If strong disposability and convexity are imposed on the input requirement set L(y),
one can prove that

L(y) = {x : wx ≥ C(y, w) for all w > 0}. (143)

Hence, we can establish a duality between L(y) and C(y, w) in the sense that each
can be derived from the other:

C(y, w) = min{wx : x ∈ L(y)}

and
L(y) = {x : wx ≥ C(y, w) for all w > 0}. (144)

17 See [15].

Earlier we have shown that the radial input distance function, Di (y, x) = sup{λ >
0 : (x/λ) ∈ L(y)}, represents the input requirement set, i.e., L(y) = {x :
Di (y, x) ≥ 1}. Thus the cost function duality may use the representation property
to yield

C(y, w) = min{wx : Di (y, x) ≥ 1}. (145)

The second part above may be written in terms of an optimization condition18 which
gives us a second duality theorem.

C(y, w) = min{wx : Di (y, x) ≥ 1}

Di (y, x) = inf{wx : C(y, w) ≥ 1} (146)

This duality theorem links the two functions directly, which makes it handy for empirical work on pricing. Both parts are constrained optimization problems, and next we transform these into unconstrained problems,19 i.e.,

$$\begin{aligned}
C(y, w) &= \min_{x} \frac{wx}{D_i(y, x)}, \quad w > 0 \\
D_i(y, x) &= \inf_{w} \frac{wx}{C(y, w)}, \quad x \in \mathbb{R}^N_+.
\end{aligned} \tag{147}$$

These dual relationships play an important role in efficiency calculation where, for example, C(y, w)/wx is the [18] measure of cost efficiency and 1/Di(y, x) is the measure of input technical efficiency.
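The second line of (147) can be illustrated with the Leontief example, where C(y, w) = y(w1 + w2) and the direct formula is Di(y, x) = min{x1, x2}/y. Because wx/C(y, w) is homogeneous of degree zero in w, the sketch searches only over prices normalized to w1 + w2 = 1. The grid includes the boundary of the price simplex, where the infimum is attained in this example; with strictly positive prices it is approached only in the limit. Names and data are hypothetical.

```python
# Recovering the input distance function from the cost function via (147),
# D_i(y, x) = inf_w wx / C(y, w), for the ASSUMED Leontief cost function.

def distance_from_cost(y, x, steps=1000):
    """Grid search over normalized prices w1 + w2 = 1 (boundary included)."""
    best = float("inf")
    for k in range(steps + 1):
        w1 = k / steps
        w2 = 1.0 - w1
        cost = y * (w1 + w2)                   # Leontief C(y, w)
        best = min(best, (w1 * x[0] + w2 * x[1]) / cost)
    return best

y, x = 1.0, (2.0, 4.0)
dual_value = distance_from_cost(y, x)
direct_value = min(x) / y                      # primal D_i(y, x)
print(dual_value, direct_value)
```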
Turning to the directional input distance function, we have shown that it meets the representation condition, i.e.,

$$L(y) = \{x : \vec{D}_i(y, x; g_x) \ge 0\} \tag{148}$$

given that inputs are freely disposable, where gx is the directional vector showing how the input x is contracted onto the boundary of the input set L(y). The above fact (148) yields an unconstrained duality condition.

$$\begin{aligned}
C(y, w) &= \min_{x}\{wx - \vec{D}_i(y, x; g_x)\, w g_x\} \\
\vec{D}_i(y, x; g_x) &= \inf_{w} \frac{wx - C(y, w)}{w g_x}.
\end{aligned} \tag{149}$$

18 See [15].
19 See [15].
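The second line of (149) can be checked in the same hypothetical Leontief setting, where the directional input distance function has the direct form min over n of (x_n − y)/g_n. The direction g = (2, 1), the price normalization, and the names below are assumptions for illustration; the grid again includes the boundary of the price simplex.

```python
# Recovering the directional input distance function from the cost function
# via (149) for the ASSUMED Leontief cost function C(y, w) = y * (w1 + w2).

def directional_distance_from_cost(y, x, g, steps=1000):
    """inf over normalized prices of (wx - C(y, w)) / (w g)."""
    best = float("inf")
    for k in range(steps + 1):
        w1 = k / steps
        w2 = 1.0 - w1
        wx = w1 * x[0] + w2 * x[1]
        wg = w1 * g[0] + w2 * g[1]
        cost = y * (w1 + w2)
        best = min(best, (wx - cost) / wg)
    return best

y, x, g = 1.0, (2.0, 4.0), (2.0, 1.0)
dual_value = directional_distance_from_cost(y, x, g)
direct_value = min((xn - y) / gn for xn, gn in zip(x, g))
print(dual_value, direct_value)
```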

Revenue Function Dualities

Next we develop duality theorems for the revenue function20 which is defined as

R(x, p) = max{py : y ∈ P (x)} (150)

and the output distance function which is defined as

Do (x, y) = inf{λ : y/λ ∈ P (x)}. (151)

Our first theorem is between the revenue function and the output distance function,
and it states that

R(x, p) = max{py : Do (x, y) ≤ 1}

Do (x, y) = sup{py : R(x, p) ≤ 1}. (152)

These two constrained optimization problems can be transformed into unconstrained problems as

$$\begin{aligned}
R(x, p) &= \max_{y} \frac{py}{D_o(x, y)} \\
D_o(x, y) &= \sup_{p} \frac{py}{R(x, p)}.
\end{aligned} \tag{153}$$

Recall that the directional output distance function

$$\vec{D}_o(x, y; g_y) = \sup_{\beta}\{\beta : (y + \beta g_y) \in P(x)\} \tag{154}$$

also completely characterizes the output set: P(x) = {y : $\vec{D}_o(x, y; g_y) \ge 0$}. Thus, the revenue maximization problem may be formulated as

$$R(x, p) = \max_{y}\{py : \vec{D}_o(x, y; g_y) \ge 0\}. \tag{155}$$
y

Given appropriate conditions on P(x), namely, convexity and free disposability of outputs, the unconstrained duality conditions between the revenue function and the directional output distance function are

$$\begin{aligned}
R(x, p) &= \max_{y}\{py + \vec{D}_o(x, y; g_y)\, p g_y\} \\
\vec{D}_o(x, y; g_y) &= \inf_{p} \frac{R(x, p) - py}{p g_y}.
\end{aligned} \tag{156}$$

20 This section mimics section “Cost Function Dualities”, but is terse.

Profit Function Dualities

In this section, we illustrate three profit function dualities related to the radial input
and output distance functions and the directional technology distance function.21
Recall that these distance functions are complete representations of the technology.
In particular we have

$$\begin{aligned}
T &= \{(x, y) : \vec{D}_T(x, y; g_x, g_y) \ge 0\} \\
P(x) &= \{y : D_o(x, y) \le 1\} \\
L(y) &= \{x : D_i(y, x) \ge 1\}
\end{aligned} \tag{157}$$

and that

(x, y) ∈ T ⇔ y ∈ P (x) ⇔ x ∈ L(y). (158)

Then, given a convex technology with inputs and outputs freely disposable, the following duality relation can be established:22

$$\begin{aligned}
\pi(w, p) &= \sup_{x, y}\{py - wx + \vec{D}_T(x, y; g_x, g_y)(p g_y + w g_x)\} \\
\vec{D}_T(x, y; g_x, g_y) &= \inf_{p, w} \frac{\pi(w, p) - (py - wx)}{p g_y + w g_x}.
\end{aligned} \tag{159}$$

With respect to the radial input distance function, [15] established the duality that

$$\begin{aligned}
\pi(w, p) &= \sup_{x, y}\left\{py - \frac{wx}{D_i(y, x)}\right\} \\
D_i(y, x) &= \inf_{p, w} \frac{wx}{py - \pi(w, p)}
\end{aligned} \tag{160}$$

and for the radial output distance function, we have

$$\begin{aligned}
\pi(w, p) &= \sup_{x, y}\left\{\frac{py}{D_o(x, y)} - wx\right\} \\
D_o(x, y) &= \sup_{p, w} \frac{py}{wx + \pi(w, p)}.
\end{aligned} \tag{161}$$

21 Of course we could also show dualities between the profit function and directional input distance function and directional output distance function.
22 See [22] for a proof.

Calculus

A famous calculus statement in economics is Shephard’s lemma [28], which derives the conditional demand function for inputs through calculus. In particular we have

                                      ∇w C(y, w) = x(y, w),                                   (162)

where x(y, w) is the input demand function and the cost function is C(y, w) = min_x {wx : x ∈ L(y)} for inputs x ∈ $\mathbb{R}^N_+$. It is important to note that by applying calculus in price space, one ends up in quantity space, i.e., calculus moves you from one space into its dual.
    In economics we frequently think of quantity space, say $\mathbb{R}^Q$, as primal and price space $\mathbb{R}^P$ as its dual (Q = P), where $(\mathbb{R}^Q)^* = \mathbb{R}^P$, i.e., the dual $(\mathbb{R}^Q)^*$ is the price space. One can prove that $(\mathbb{R}^Q)^* = \mathbb{R}^Q$, an identification that of course does not hold for spaces in general; see, e.g., [21]. Although $\mathbb{R}^Q$ and its dual $(\mathbb{R}^Q)^*$ are equal, in economics one can say that you can eat a donut, but not its price. Thus one needs to distinguish between the two spaces.
    In order to formalize the idea of Shephard’s lemma, let, for simplicity, F : R → R be a function from the real numbers into the real numbers. Then we want to show that

$$\frac{dF}{dx} \in (\mathbb{R})^*, \tag{163}$$

i.e., it is in the dual space of R.

    A function F(x) is differentiable at x if there exists a linear functional L ∈ (R)* such that

$$\lim_{h \to 0} \frac{F(x + h) - F(x) - L(h)}{h} = 0. \tag{164}$$

Since L is linear, L(h) = hL(1). Thus,

$$\lim_{h \to 0} \left[\frac{F(x + h) - F(x)}{h} - \frac{hL(1)}{h}\right] = 0 \tag{165}$$

or

$$\lim_{h \to 0} \frac{F(x + h) - F(x)}{h} = L(1) \tag{166}$$

where

$$\lim_{h \to 0} \frac{F(x + h) - F(x)}{h} = \frac{dF}{dx}. \tag{167}$$

Thus,

$$\frac{dF}{dx} = L(1) \in (\mathbb{R})^*, \tag{168}$$

justifying our claim that the derivative belongs to the dual of the space on which the function is defined. Of course, this generalizes to $\mathbb{R}^Q$, Q > 1, and hence Shephard’s lemma has a natural explanation, i.e., the gradient of the cost function with respect to input prices yields input quantities.
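Shephard’s lemma (162) can be checked numerically. The sketch below assumes a two-input Cobb-Douglas technology y = x1^a x2^(1−a), whose cost function and conditional input demands are known in closed form, and compares a central finite-difference gradient of the cost function with those demands. The functional form, parameter value, and names are assumptions for illustration.

```python
# Numerical check of Shephard's lemma, grad_w C(y, w) = x(y, w), for an
# ASSUMED Cobb-Douglas technology y = x1**a * x2**(1 - a).

def cobb_douglas_cost(y, w, a=0.3):
    return y * (w[0] / a) ** a * (w[1] / (1.0 - a)) ** (1.0 - a)

def conditional_demand(y, w, a=0.3):
    cost = cobb_douglas_cost(y, w, a)
    return (a * cost / w[0], (1.0 - a) * cost / w[1])

def numerical_gradient(f, w, h=1e-6):
    """Central finite differences, one price at a time."""
    grad = []
    for n in range(len(w)):
        up, down = list(w), list(w)
        up[n] += h
        down[n] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad

y, w = 2.0, (1.5, 4.0)
gradient = numerical_gradient(lambda prices: cobb_douglas_cost(y, prices), w)
demand = conditional_demand(y, w)
print(gradient, demand)
```

The gradient is taken in price space, yet its entries are input quantities, which is the passage from a space to its dual described above.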
    Next one may ask what conditions on the input requirement set L(y) yield a differentiable cost function (value function). Mas-Colell et al. [24, p. 141] show that the cost function is differentiable when L(y) is strictly convex, i.e.,

      for $x^0, x^1$ ∈ L(y) with $x^0 \ne x^1$, $\lambda x^0 + (1 - \lambda)x^1$ ∈ Interior L(y), 0 < λ < 1.                  (169)

From our duality section, it is clear that convexity is just a sufficient condition
for differentiability. For a necessary and sufficient condition, see [15, 16, 18]. Next
we turn to some applications of calculus.

Shadow Pricing

Our first application of calculus is shadow pricing. The hints from an earlier section
suggest that we use input/output data to estimate a distance function and then derive
the shadow prices using calculus.
Here we restrict ourselves to Shephard’s input distance function23 and derive
expressions to estimate input prices.
    Let the input distance function be

$$D_i(y, x) = \sup_{\lambda}\{\lambda > 0 : x/\lambda \in L(y)\}, \quad y \in \mathbb{R}^M_+ \tag{170}$$

and recall that it is homogeneous of degree +1 in inputs, Di(y, μx) = μDi(y, x), μ > 0, and meets the representation property Di(y, x) ≥ 1 if and only if x ∈ L(y). Let w ∈ $\mathbb{R}^N_+$ be a vector of input prices. Then

                              C(y, w) = min{wx : Di (y, x) ≥ 1}                                    (171)

is a cost function. The Lagrangian problem for this cost function is

                              C(y, w) = wx − μ(Di (y, x) − 1),                                     (172)

where μ is the Lagrangian multiplier. The first-order conditions (FOCs) associated with it are

                                     w − μ∇x Di (y, x) = 0.                                        (173)

23 See [11] in this volume for a comprehensive discussion of the topic.

Suppose that one input price, say w1, is known in addition to x ∈ $\mathbb{R}^N_+$ and y ∈ $\mathbb{R}^M_+$. Then the shadow prices for the other n = 2, . . . , N inputs can be determined as

$$w_n = w_1\, \frac{\partial D_i(y, x)/\partial x_n}{\partial D_i(y, x)/\partial x_1} \tag{174}$$

so that total cost can be computed as

$$c = w_1 x_1 + w_1 \sum_{n=2}^{N} \frac{\partial D_i(y, x)/\partial x_n}{\partial D_i(y, x)/\partial x_1}\, x_n. \tag{175}$$

    The next model assumes that total cost, $c = \sum_{n=1}^{N} w_n x_n$, is known together with x ∈ $\mathbb{R}^N_+$ and y ∈ $\mathbb{R}^M_+$, but no individual price $w_n$ is known. In this case, we can derive the pricing model

$$w = c\, \frac{\nabla_x D_i(y, x)}{D_i(y, x)}. \tag{176}$$

To derive this pricing model, we need to interpret the Lagrangian multiplier μ. Thus, consider

                             C̃(y, w, α) = wx − μ(Di (y, x) − α)                              (177)

as a perturbation of the distance function constraint. By homogeneity we have

$$\begin{aligned}
\tilde{C}(y, w, \alpha) &= wx - \alpha\mu\bigl(D_i(y, x/\alpha) - 1\bigr) \\
&= \alpha w\hat{x} - \alpha\mu\bigl(D_i(y, \hat{x}) - 1\bigr) \\
&= \alpha C(y, w)
\end{aligned} \tag{178}$$

where x̂ = x/α. The derivative is then ∂C̃(y, w, α)/∂α = μ from (177), and from the last line in (178), we have ∂C̃(y, w, α)/∂α = C(y, w). Thus, the optimal Lagrangian multiplier equals the cost function: μ = C(y, w). Inserting C(y, w) into the first-order conditions yields

                                     w = C(y, w)∇x Di (y, x).                                 (179)

Now, multiplying both sides of (179) by x and using Euler’s theorem24 yields

                                  c = wx = C(y, w)Di (y, x).                                  (180)

Thus, C(y, w) = c/Di(y, x), and substituting this expression into (179) yields the desired result:

$$w = C(y, w)\nabla_x D_i(y, x) = \frac{c\,\nabla_x D_i(y, x)}{D_i(y, x)}. \tag{181}$$
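The pricing rule (181) can be sketched numerically. The example assumes a Cobb-Douglas input distance function Di(y, x) = x1^a x2^(1−a)/y, which is homogeneous of degree +1 in inputs, and approximates its gradient by central finite differences. The functional form, the parameter a, the data, and the names are assumptions for illustration.

```python
# Shadow input prices w = c * grad_x D_i(y, x) / D_i(y, x), as in (181),
# for an ASSUMED Cobb-Douglas input distance function.

def input_distance(y, x, a=0.3):
    return x[0] ** a * x[1] ** (1.0 - a) / y

def shadow_prices(y, x, total_cost, h=1e-6):
    """Apply (181) with a finite-difference gradient of D_i in x."""
    distance = input_distance(y, x)
    prices = []
    for n in range(len(x)):
        up, down = list(x), list(x)
        up[n] += h
        down[n] -= h
        partial = (input_distance(y, up) - input_distance(y, down)) / (2.0 * h)
        prices.append(total_cost * partial / distance)
    return prices

x, total_cost = (2.0, 4.0), 10.0
prices = shadow_prices(y=1.0, x=x, total_cost=total_cost)
implied_cost = sum(wn * xn for wn, xn in zip(prices, x))
print(prices, implied_cost)
```

By Euler’s theorem the recovered prices reproduce the known total cost, so only relative shadow prices are identified by the technology; the level comes from c.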

Assuming that inputs are technically efficient in the sense of [18], i.e.,
Di (y, x) = 1, and that prices are cost deflated, ŵ = w/c, then our pricing model
becomes

ŵ = ∇x Di (y, x). (182)

In the case of a single output, y ∈ R+ , the pricing formula can be found in [28, p.
19] and [29, p. 171].
    Finally, suppose we know total revenue r, output prices p ∈ $\mathbb{R}^M_+$, and inputs x ∈ $\mathbb{R}^N_+$. Then we can estimate the revenue function, R(x, p), and hence we can price inputs by choosing inputs x to maximize profits

π(w, p) = max{R(x, p) − wx} (183)

and then obtain the pricing rule via the first-order conditions

w = ∇x R(x, p). (184)

Scale Elasticities

In production theory, as opposed to consumer theory, size matters. A measure of size is scale elasticity. Here we follow [15] in our discussion. As a second application of calculus, we discuss a primal and dual formulation of scale elasticities. Let

                                    F (x) = max{y : (x, y) ∈ T }                                       (185)

be a production function. Then scale elasticity is defined as

$$\varepsilon(x) = \left.\frac{\partial \ln F(\lambda x)}{\partial \lambda}\right|_{\lambda = 1} = \frac{\nabla_x F(x)\, x}{F(x)}. \tag{186}$$

24 For a homogeneous function of degree +1 in x, Euler’s theorem states that x∇x Di(y, x) = Di(y, x).
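Definition (186) can be evaluated numerically by differentiating ln F(λx) with respect to λ at λ = 1. The sketch assumes a hypothetical production function F(x) = x1^0.3 x2^0.5, which is homogeneous of degree 0.8, so the scale elasticity should equal 0.8 at every input bundle. The functional form and names are assumptions for illustration.

```python
import math

# Scale elasticity (186) by central finite differences in lambda at lambda = 1,
# for an ASSUMED production function F(x) = x1**0.3 * x2**0.5.

def production(x):
    return x[0] ** 0.3 * x[1] ** 0.5

def scale_elasticity(f, x, h=1e-6):
    """d ln F(lambda * x) / d lambda evaluated at lambda = 1."""
    up = math.log(f([(1.0 + h) * xn for xn in x]))
    down = math.log(f([(1.0 - h) * xn for xn in x]))
    return (up - down) / (2.0 * h)

elasticity = scale_elasticity(production, (3.0, 7.0))
print(elasticity)
```

A value below one indicates decreasing returns to scale at the evaluated point, a value of one constant returns, and a value above one increasing returns.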

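   In the multi-output case, scale elasticity generalizes as

$$\varepsilon_o(x, y) = \left.\frac{\partial \ln \theta}{\partial \ln \lambda}\right|_{\theta = \lambda = 1} \tag{187}$$

where Do(λx, θy) = 1 and

$$\begin{aligned}
\varepsilon_o(x, y) &= -\frac{\sum_{n=1}^{N} \frac{\partial D_o(x, y)}{\partial x_n}\, x_n}{\sum_{m=1}^{M} \frac{\partial D_o(x, y)}{\partial y_m}\, y_m} \\
&= -\frac{\nabla_x D_o(x, y)\, x}{D_o(x, y)} \quad \text{applying Euler's theorem} \\
&= -\nabla_x D_o(x, y)\, x
\end{aligned} \tag{188}$$

since Do(x, y) = 1 when λ = θ = 1.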

   Let p ∈ $\mathbb{R}^M_+$ be an output price vector. Then, recall that the revenue function is dual to the output distance function, as represented by the duality

$$\begin{aligned}
R(x, p) &= \max_{y} \frac{py}{D_o(x, y)} \\
D_o(x, y) &= \sup_{p} \frac{py}{R(x, p)}.
\end{aligned} \tag{189}$$

Hence, one may derive an expression for scale elasticity in terms of the revenue function as

$$\varepsilon_R(x, p) = \frac{\nabla_x R(x, p)\, x}{R(x, p)} \tag{190}$$

where

$$\frac{\nabla_x R(x, p)\, x}{R(x, p)} = -\frac{\nabla_x D_o(x, y)\, x}{D_o(x, y)}, \quad \text{i.e.,} \quad \varepsilon_R(x, p) = \varepsilon_o(x, y). \tag{191}$$

Elasticities of Substitution

The curvature of input isoquants and output possibility frontiers reveals information about the degree of substitutability between inputs or between outputs.
In a consumer context, [7] and [8] show the relation between the Slutsky
matrix of compensated price effects derived from an expenditure function and the
Antonelli matrix of compensated quantity effects derived from distance functions.
To investigate the curvature requires value functions or distance functions to be
twice differentiable which we assume. Knowledge of the elasticity of substitution
(transformation) reveals how input cost (revenue) shares change as relative input
(output) prices change. Dual elasticities of substitution and transformation are
derived from distance functions.25 These dual elasticities measure the percent
change in relative prices that support a percent change in relative quantities. For
the curvature of the input requirement set, we examine elasticities of substitution
using the cost function and input distance function. These elasticities of substitution
are based on the work of [25] and [2]. Then, we derive output elasticities of
transformation from the revenue function and their dual measures from the output
distance function and directional output distance function.
    We begin with the cost function. Input demand functions are derived from Shephard’s lemma: ∇w C(y, w) = x(y, w). The ratio of two input demands, i and j, then equals the ratio of the corresponding first derivatives of the cost function

$$\frac{\partial C(y, w)/\partial w_i}{\partial C(y, w)/\partial w_j} = \frac{x_i(y, w)}{x_j(y, w)}. \tag{192}$$

Taking the natural log of both sides of (192) and then the partial derivative with respect to log relative input prices yields the Morishima elasticity of substitution (MES):

$$\begin{aligned}
MES_{ij} &= -\frac{\partial \ln\left[\frac{\partial C(y, w)/\partial w_i}{\partial C(y, w)/\partial w_j}\right]}{\partial \ln (w_i/w_j)} = -\frac{\partial \ln\left[\frac{x_i(y, w)}{x_j(y, w)}\right]}{\partial \ln (w_i/w_j)} \\
&= w_i\left[\frac{\partial^2 C(y, w)/\partial w_i \partial w_j}{\partial C(y, w)/\partial w_j} - \frac{\partial^2 C(y, w)/\partial w_i^2}{\partial C(y, w)/\partial w_i}\right] = w_i\left[\frac{\partial x_j(y, w)/\partial w_i}{x_j(y, w)} - \frac{\partial x_i(y, w)/\partial w_i}{x_i(y, w)}\right] \\
&= \varepsilon_{ji}(y, w) - \varepsilon_{ii}(y, w)
\end{aligned} \tag{193}$$

where $\varepsilon_{ji}(y, w)$ is the cross price elasticity between the jth input demand and the ith input price and $\varepsilon_{ii}$ is the own price elasticity of demand for input i. As shown by [2], the Morishima elasticity is asymmetric in terms of whether the relative price change is due to the price of input i changing or due to the price of input j changing. Furthermore, the elasticity of the relative cost shares of i and j due to a relative price change of input i is 1 − MESij. Thus, if MESij is greater (less) than 1, the cost share of input i falls (rises) relative to j due to an increase in the price of input i.

25 See the [27] chapter in this volume on Elasticities of Substitution by R.R. Russell for a detailed discussion of alternative definitions and forms of substitution elasticities.
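The last line of (193) can be evaluated with log finite differences of the conditional input demands. The sketch assumes Cobb-Douglas demands, for which the Morishima elasticity equals one for every pair of inputs, so the result provides a simple check. The functional form, parameter, and names are assumptions for illustration.

```python
import math

# Morishima elasticity MES_ij = eps_ji - eps_ii from (193), using log central
# differences in w_i and ASSUMED Cobb-Douglas conditional input demands.

def demand(w, y=1.0, a=0.3):
    cost = y * (w[0] / a) ** a * (w[1] / (1.0 - a)) ** (1.0 - a)
    return (a * cost / w[0], (1.0 - a) * cost / w[1])

def morishima(i, j, w, h=1e-6):
    up, down = list(w), list(w)
    up[i] *= math.exp(h)      # ln w_i increases by h
    down[i] *= math.exp(-h)   # ln w_i decreases by h
    x_up, x_down = demand(up), demand(down)
    eps_ji = (math.log(x_up[j]) - math.log(x_down[j])) / (2.0 * h)
    eps_ii = (math.log(x_up[i]) - math.log(x_down[i])) / (2.0 * h)
    return eps_ji - eps_ii

mes_12 = morishima(0, 1, (2.0, 5.0))
mes_21 = morishima(1, 0, (2.0, 5.0))
print(mes_12, mes_21)
```

With MES equal to one, relative cost shares do not respond to relative price changes, consistent with the 1 − MESij rule stated above. For other technologies the two directions generally differ, which is the asymmetry noted by [2].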
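    Next, we turn to the dual Morishima elasticity of substitution ($\widetilde{MES}_{ij}$), which measures the percent change in relative shadow prices of inputs i and j due to a percent change in the relative input quantities. We start with the dual Shephard’s lemma found in (179)

$$\nabla_x D_i(y, x) = \frac{w(y, x)}{C(y, w)} \tag{194}$$

which for two inputs i and j is

$$\frac{\partial D_i(y, x)/\partial x_i}{\partial D_i(y, x)/\partial x_j} = \frac{w_i(y, x)}{w_j(y, x)}. \tag{195}$$

Again, we take the natural logarithms of both sides of (195) and then take the partial derivative with respect to log changes in the ratio of the ith input quantity to the jth input quantity

$$\begin{aligned}
\widetilde{MES}_{ij} &= -\frac{\partial \ln\left[\frac{\partial D_i(y, x)/\partial x_i}{\partial D_i(y, x)/\partial x_j}\right]}{\partial \ln (x_i/x_j)} = -\frac{\partial \ln\left[\frac{w_i(y, x)}{w_j(y, x)}\right]}{\partial \ln (x_i/x_j)} \\
&= x_i\left[\frac{\partial^2 D_i(y, x)/\partial x_i \partial x_j}{\partial D_i(y, x)/\partial x_j} - \frac{\partial^2 D_i(y, x)/\partial x_i^2}{\partial D_i(y, x)/\partial x_i}\right] = x_i\left[\frac{\partial w_j(y, x)/\partial x_i}{w_j(y, x)} - \frac{\partial w_i(y, x)/\partial x_i}{w_i(y, x)}\right] \\
&= \tilde{\varepsilon}_{ji}(y, x) - \tilde{\varepsilon}_{ii}(y, x)
\end{aligned} \tag{196}$$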

    The Morishima output elasticities of transformation (substitution) provide information on the curvature of the frontier of the output possibility set. The revenue
function is used to derive the output elasticity of transformation, and either the
Shephard or directional output distance function can be used to derive dual
elasticities of transformation in price space. We begin by deriving the output
elasticity of transformation from the revenue function and its dual from the output
distance function. Then, we reintroduce jointly produced undesirable outputs back
into our framework and derive the dual output elasticity of transformation from the
directional output distance function.
The revenue function is R(x, p) = maxy {py : y ∈ P (x)}. Using the
representation property, we can write this revenue function using the output distance
function as
                               R(x, p) = max{py : Do (x, y) ≤ 1}.                               (197)

As shown in (153), the revenue function can be written as an unconstrained maximization problem

$$R(x, p) = \max_{y} \frac{py}{D_o(x, y)} \tag{198}$$

from which the output supply functions can be derived

$$\nabla_p R(x, p) = \frac{y(x, p)}{D_o(x, y)} \tag{199}$$

where y(x, p) ∈ $\mathbb{R}^M_+$ is the vector of output supply functions. For two outputs, s and t, we then have

$$\frac{\partial R(x, p)/\partial p_s}{\partial R(x, p)/\partial p_t} = \frac{y_s(x, p)}{y_t(x, p)}. \tag{200}$$

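The Morishima elasticity of transformation between outputs s and t is

$$\begin{aligned}
MET_{st} &= -\frac{\partial \ln\left[\frac{\partial R(x, p)/\partial p_s}{\partial R(x, p)/\partial p_t}\right]}{\partial \ln (p_s/p_t)} = -\frac{\partial \ln\left[\frac{y_s(x, p)}{y_t(x, p)}\right]}{\partial \ln (p_s/p_t)} \\
&= p_s\left[\frac{\partial^2 R(x, p)/\partial p_s \partial p_t}{\partial R(x, p)/\partial p_t} - \frac{\partial^2 R(x, p)/\partial p_s^2}{\partial R(x, p)/\partial p_s}\right] = p_s\left[\frac{\partial y_t(x, p)/\partial p_s}{y_t(x, p)} - \frac{\partial y_s(x, p)/\partial p_s}{y_s(x, p)}\right] \\
&= \delta_{ts}(x, p) - \delta_{ss}(x, p)
\end{aligned} \tag{201}$$

where $\delta_{ts}$ is the cross price elasticity of supply between output t and output price s, while $\delta_{ss}$ is the own price elasticity of supply for output s.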
As our last application, we account for desirable and jointly produced undesirable
outputs in the technology and derive the dual Morishima elasticity of transforma-
tion.26 This elasticity measures the percent change in the shadow price ratio due
to a percent change in the ratio of the desirable to undesirable output. Here the
directional output distance function provides a way of modeling the objective of
simultaneously expanding desirable outputs while contracting undesirable outputs
given the technology as represented by the output possibility set.
    The output possibility set is P(x) = {(y, b) : (y, b) can be produced by x}. The directional output distance function is defined on this set as

$$\vec{D}_o(x, y, b; g_y, g_b) = \sup_{\beta}\{\beta : (y + \beta g_y, b - \beta g_b) \in P(x)\}. \tag{202}$$

26 This discussion follows [12].
β

The undesirable output price vector is q ∈ $\mathbb{R}^J_+$, and the revenue function that accounts for the charges or fines that the producer must incur from producing undesirable output is

$$R(x, p, q) = \max_{y, b}\{py - qb : (y, b) \in P(x)\} \tag{203}$$

which the representation property allows us to write as

$$R(x, p, q) = \max_{y, b}\{py - qb : \vec{D}_o(x, y, b; g_y, g_b) \ge 0\}. \tag{204}$$

The duality between the revenue function and the directional output distance function allows the directional output distance function to be written as an unconstrained minimization problem

$$\vec{D}_o(x, y, b; g_y, g_b) = \inf_{p, q} \frac{R(x, p, q) - (py - qb)}{p g_y + q g_b}. \tag{205}$$

Differentiating with respect to y and b yields

$$\nabla_y \vec{D}_o(x, y, b; g_y, g_b) = -\frac{p}{p g_y + q g_b} \quad \text{and} \quad \nabla_b \vec{D}_o(x, y, b; g_y, g_b) = \frac{q}{p g_y + q g_b}. \tag{206}$$

Thus, for desirable output s and undesirable output j, we have

$$\frac{\partial \vec{D}_o(x, y, b; g_y, g_b)/\partial b_j}{\partial \vec{D}_o(x, y, b; g_y, g_b)/\partial y_s} = -\frac{q_j(x, y, b)}{p_s(x, y, b)} \tag{207}$$

where q(x, y, b) is the vector of inverse (price) supply functions for the undesirable output and p(x, y, b) is the vector of inverse supply functions for the desirable output.
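    The dual Morishima elasticity of transformation can be derived as27

$$\begin{aligned}
\widetilde{MET}_{js} &= \frac{\partial \ln\left[\frac{q_j(x, y, b)}{p_s(x, y, b)}\right]}{\partial \ln b_j} \\
&= b_j\left[\frac{\partial^2 \vec{D}_o(x, y, b; g_y, g_b)/\partial y_s \partial b_j}{\partial \vec{D}_o(x, y, b; g_y, g_b)/\partial y_s} - \frac{\partial^2 \vec{D}_o(x, y, b; g_y, g_b)/\partial b_j \partial b_j}{\partial \vec{D}_o(x, y, b; g_y, g_b)/\partial b_j}\right] \\
&= b_j\left[\frac{\partial p_s(x, y, b)/\partial b_j}{p_s(x, y, b)} - \frac{\partial q_j(x, y, b)/\partial b_j}{q_j(x, y, b)}\right] \\
&= \tilde{\delta}_{js}(x, y, b) - \tilde{\delta}_{ss}(x, y, b)
\end{aligned} \tag{208}$$

where $\tilde{\delta}_{js}(x, y, b)$ is the dual cross elasticity of transformation and $\tilde{\delta}_{ss}(x, y, b)$ is the dual own elasticity of transformation.

27 We state the dual Morishima elasticity of transformation in terms of second derivatives of $\vec{D}_o(x, y, b; g_y, g_b)$ since the natural log of $\vec{D}_o(x, y, b; g_y, g_b) = 0$ is not well-defined.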

Appendix: Production Axioms

In this appendix, we list axioms on the technology sets, in no particular order. Since we have that

(x, y) ∈ T ⇔ y ∈ P (x) ⇔ x ∈ L(y) (209)

each axiom may be given a representation in each of these sets.

A.1 T is a closed set, i.e., if $(x^l, y^l)$ ∈ T for all l = 1, . . . , ∞ and $(x^l, y^l) \to (x^o, y^o)$, then $(x^o, y^o)$ ∈ T.
A.2 P(x) is a closed set, i.e., if $y^l$ ∈ P(x) for all l = 1, . . . , ∞ and $y^l \to y^o$, then $y^o$ ∈ P(x).
A.3 L(y) is a closed set, i.e., if $x^l$ ∈ L(y) for all l = 1, . . . , ∞ and $x^l \to x^o$, then $x^o$ ∈ L(y).
    Comment: A.1 implies A.2 and A.3.
A.4 Strong or free disposability of outputs: if $y^o$ ∈ P(x) and y ≤ $y^o$, then y ∈ P(x).
A.5 Weak disposability of outputs: if $y^o$ ∈ P(x) and 0 ≤ θ ≤ 1, then $\theta y^o$ ∈ P(x).
A.6 Strong or free disposability of inputs: if $x^o$ ∈ L(y) and x ≥ $x^o$, then x ∈ L(y).
A.7 Weak disposability of inputs: if $x^o$ ∈ L(y) and θ ≥ 1, then $\theta x^o$ ∈ L(y).
    Comment: A.4 implies A.5 and A.6 implies A.7.
A.8 Scarcity: P(x) is bounded for each x ∈ $\mathbb{R}^N_+$.
A.9 Possibility of inaction: (0, 0) ∈ T.
A.10 Possibility of inaction: 0 ∈ P(x) for each x ∈ $\mathbb{R}^N_+$.
    Comment: A.10 implies A.9.
A.11 T is a convex set, i.e., if $(x^0, y^0)$ ∈ T and $(x^1, y^1)$ ∈ T, then $(\theta x^0 + (1 - \theta)x^1, \theta y^0 + (1 - \theta)y^1)$ ∈ T, 0 ≤ θ ≤ 1.
A.12 P(x) is a convex set, i.e., if $y^0$ and $y^1$ ∈ P(x), then $(\theta y^0 + (1 - \theta)y^1)$ ∈ P(x), 0 ≤ θ ≤ 1.
A.13 L(y) is a convex set, i.e., if $x^0$ and $x^1$ ∈ L(y), then $(\theta x^0 + (1 - \theta)x^1)$ ∈ L(y), 0 ≤ θ ≤ 1.
    Comment: A.11 implies A.12 and A.13.
A.14 Constant returns to scale: T = θT, θ > 0.
A.15 Non-increasing returns to scale: (x, y) ∈ T and 0 ≤ θ ≤ 1 imply (θx, θy) ∈ T.
A.16 Non-decreasing returns to scale: (x, y) ∈ T and θ ≥ 1 imply (θx, θy) ∈ T.
A.17 Null-jointness: if (y, b) ∈ P(x) and b = 0, then y = 0.

Cross-References

Shadow Pricing in Production Economics

References
1. Allais M (1943) A la Recherche d’une Discipline Economique Premiére Partie, l’Economie
Pure Paris: Atliers Industria, vol 1
2. Blackorby C, Russell RR (1989) Will the real elasticity of substitution please stand up? Am
Econ Rev 79(4):882–888
3. Bogetoft P, Färe R, Obel B (2006) Allocative efficiency of technically inefficient production
units. Eur J Oper Res 168(2):450–462
4. Chambers RG, Chung Y, Färe R (1996) Benefit and distance functions. J Econ Theory
70(2):407–419
5. Chambers RG, Chung Y, Färe R (1998) Profit, directional distance functions, and Nerlovian
efficiency. J Optim Theory Appl 98(2):351–364
6. Charnes A, Cooper WW, Rhodes E (1978) Measuring the efficiency of decision making units.
Eur J Oper Res 2:429–444
7. Deaton A (1979) The distance function in consumer behaviour with applications to index
numbers and optimal tax theory. Rev Econ Stud 46(3):391–405
8. Deaton A, Muellbauer J (1980) Economics and consumer behavior. Cambridge University
Press, Cambridge
9. Diewert WE (1998) Index number theory using differences rather than ratios, Department of
Economics, University of British Columbia, Discussion paper No.98–10
10. Färe R (1988) Fundamentals of production theory. Springer, Berlin
11. Färe R, Grosskopf S, Margaritis D (2017) Shadow Pricing in Production Economics
12. Färe R, Grosskopf S, Noh D-W, Weber WL (2005) Characteristics of a polluting technology:
theory and practice. J Econ 126:469–492
13. Färe R, Grosskopf S, Ray SC, Miller SM, Mukherjee K (2000) Difference measures of profit
inefficiency: an application to U.S. Banks, Conference on Banking and Finance, Miguel
Hernadez University, Elche, May 2000
14. Fare R, Primont D (1986) On differentiability of cost functions. J Econ Theory 38:233–237
15. Färe R, Primont D (1995) Multi-output production and duality: theory and applications. Kluwer
Academic Publishers, Boston
214 R. Färe et al.

16. Färe R, Primont D (2006) Directional duality theory. Economic Theory 29(1):239–247
17. Färe R, Primont D, Samuelson L (1990) On differentiability of cost functions: corrigendum. J
Econ Theory 52:237
18. Farrell MJ (1957) The measurement of productive efficiency. J R Stat Soc Ser A General
125:(Part 2):252–267
19. Fleming W (1977) Functions of several variables. Springer, New York
20. Kemeny JG, Morgenstern O, Thompson GL (1956) A generalization of the von Neumann
Model of an expanding economy. Econometrica 24:115–135
21. Luenberger DG (1969) Optimization by vector space methods. Wiley, New York
22. Luenberger DG (1992) Benefit functions and duality. J Math Econ 21:461–481
23. Luenberger DG (1995) Microeconomic theory. McGraw-Hill, New York
24. Mas-Colell A, Whinston MD, Green JR (1995) Microeconomic theory. Oxford University
Press, New York
25. Morishima M (1967) A few suggestions on the theory of elasticity, (in Japanese). Kezai Hyoron
(Economic Review) 16:144–150
26. von Neumann J (1937, 1945/1946), Über ein ökonomisches gleichungssytem und eine
verallgemeinerung des Brouwerschen Fixpunksatzes. In: Menger K (ed) Ergebnisses eines
Mathematischen Kolloquiums reprinted as A Model of General Economic Equilibrium Rev
Econ Stud 13(1):1–9
27. Russell R (2020) Elasticities of substitution. In: Chambers R, Kumbhakar S, Ray SC (eds)
Handbook of production economics, vol 1. Springer, Singapore
28. Shephard RW (1953) Cost and production functions. Princeton University Press, Princeton
29. Shephard RW (1970) Theory of cost and production functions. Princeton University Press,
Princeton
30. Weymark JA (1980) Duality results in demand theory. Eur Econ Rev 14:377–395
5 Functional Structure and Aggregation
Daniel Primont

Contents
Introduction
  Example: Intermediate Inputs
  Example: House Building
Functional Structure with Two Sectors
  Defining Separability
  Separability and Functional Structure
  Functional-Structure Equivalences
Functional Structure with More Than Two Sectors
  Separability of Dual Representations of Technology
  Homothetic Separability
Additive Functional Structure
Recursive Functional Structure
  Application
Multioutput Technologies
References

Abstract

A production function can involve dozens or even hundreds of inputs that are
combined to produce a single output. Economists can simplify these complex
processes by positing separability restrictions which, in turn, yield (1) a particular
functional structure of the production function and (2) aggregates of subsets
of the inputs that can be viewed as intermediate inputs. This has the effect of
reducing the number of variables that enter the economic analysis.

D. Primont
Department of Economics, Southern Illinois University-Carbondale, Carbondale, IL, USA
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_22

Various combinations of separability assumptions are considered, and the
resulting forms of functional structure are characterized. This survey considers
not only production functions but also dual representations such as cost functions
and indirect output functions thereby giving rise to aggregates of both input
quantities and input prices. The extension to the case of multiple outputs is briefly
considered.

Keywords

Separability · Functional structure · Aggregation · Production function · Cost function · Distance function

Introduction

Consider a production technology in which three inputs are used to produce a single
output. Such a technology is typically represented by a production function written as

$$y = f(x_1, x_2, x_3), \tag{1}$$

where $x_1$, $x_2$, and $x_3$ denote the quantities of the three inputs used and $y$ denotes
the quantity of output. Suppose it is possible to aggregate inputs 2 and 3 into an
intermediate input denoted by $\phi(x_2, x_3)$. Moreover, suppose we could rewrite the
production function in (1) as

$$y = f(x_1, x_2, x_3) = \hat{f}(x_1, \phi(x_2, x_3)). \tag{2}$$

In other words, we can also represent the technology as one in which the output
is produced using $x_1$ and an intermediate input, $\phi(x_2, x_3)$. This is an
example of what is meant by the term “functional structure.”
One can ask the following question: What assumption about the production
technology would justify the functional structure expressed in
equation (2)? To answer this question, we compute the marginal rate of substitution
between inputs 2 and 3 using equation (2). We get

$$\frac{\partial y/\partial x_2}{\partial y/\partial x_3}
= \frac{\left(\partial \hat{f}(x_1, z)/\partial z\right)\left(\partial \phi(x_2, x_3)/\partial x_2\right)}{\left(\partial \hat{f}(x_1, z)/\partial z\right)\left(\partial \phi(x_2, x_3)/\partial x_3\right)}
= \frac{\partial \phi(x_2, x_3)/\partial x_2}{\partial \phi(x_2, x_3)/\partial x_3},$$

where $z = \phi(x_2, x_3)$ is the quantity of the intermediate input. We see that the
marginal rate of substitution between inputs 2 and 3 does not depend on the quantity
of input 1, i.e.,

$$\frac{\partial}{\partial x_1}\left(\frac{\partial y/\partial x_2}{\partial y/\partial x_3}\right) = 0. \tag{3}$$

We may conclude that the functional structure expressed in equation (2) implies
the restriction in equation (3). What is remarkable is that the restriction in (3) implies
the functional structure in (2), i.e., (2) and (3) are equivalent. This equivalence
was demonstrated in pioneering papers by Leontief [19, 20] and Sono [26, 27].
Stated in words, the restriction in (3) says that inputs 2 and 3 are separable from
1. Hence, the notion of functional structure is intimately related to the concept of
separability.
For the remainder of this section, we will look at some specific examples of
functional structure that will, hopefully, motivate our study. In section “Functional
Structure with Two Sectors” we consider functional structure for the simple case of
two subsets, or sectors, of inputs for which the inputs in sector 2 are separable from
the inputs in sector 1. In section “Functional Structure with More Than Two Sectors”
we consider a more general case for which two or more sectors are separable from all
inputs in the other sectors. The examples of steel production and personal computer
production given below pertain to section “Functional Structure with More Than
Two Sectors”. In section “Additive Functional Structure” we consider a functional
structure that involves additivity. In section “Recursive Functional Structure” we
study functional structure that is completely recursive. Complete recursivity is
another way that the results in section “Functional Structure with Two Sectors” may
be generalized. The house building example given below illustrates a completely
recursive production function. Finally, in section “Multioutput Technologies” we
consider how the results for single-output production may be extended to multiout-
put production.
All applications considered here will involve functional structure of various
representations of a production technology. For applications of functional structure
to utility or welfare functions, the reader is referred to the survey by Blackorby
et al. [4].

Example: Intermediate Inputs

One of the uses of functional structure is the modelling of intermediate inputs in
production. This was, in fact, the example used by Leontief [19], who considered
the production of steel using the intermediate inputs, coal and iron ore, along with
a labor input. In this simplified example, inputs 1, 2, and 3 are used to produce coal;
inputs 4, 5, 6, and 7 are used to produce iron ore; and input 8 is the labor input that
runs the blast furnaces. This results in the following system:

$$y_1 = F^1(x_1, x_2, x_3) \quad \text{(coal production)} \tag{4}$$

$$y_2 = F^2(x_4, x_5, x_6, x_7) \quad \text{(iron ore production)} \tag{5}$$

$$y = \mathcal{F}(y_1, y_2, x_8) \quad \text{(steel production)} \tag{6}$$

Substituting (4) and (5) into (6) yields the following functional structure:

$$y = \mathcal{F}\left(F^1(x_1, x_2, x_3), F^2(x_4, x_5, x_6, x_7), x_8\right). \tag{7}$$

Here is another example. A simple personal computer can be assembled from the
following components.

1. Case
2. Power supply
3. Motherboard
4. Central processing unit
5. Cooling system
6. Random-access memory
7. Hard drive

Since these components are numbered from 1 to 7, we denote the quantities of
each component by the variables $y_1, y_2, \ldots, y_7$. Other inputs are denoted by $x^0$ and
would include the labor and tools used to assemble the personal computers. Let $y$ be
the number of computers assembled; clearly $y$ is a function of $(x^0, y_1, y_2, \ldots, y_7)$,
and we can write the production function for personal computers as

$$y = \mathcal{F}(x^0, y_1, y_2, \ldots, y_7). \tag{8}$$

Of course, each of the components must itself be produced before it
can be assembled. Let the production function for the $r$-th component be given by
$y_r = F^r(x^r)$, where $x^r$ is the vector of inputs used to produce $y_r$, $r = 1, 2, \ldots, 7$.
Assume that none of the inputs that produce component $r$ are used in the production
of component $s$ where $s \neq r$. For example, the labor that is used to produce the
power supplies is specialized in that use and is not used in the production of any
other component. Substitution of the component production functions into (8) yields

$$y = \mathcal{F}\left(x^0, F^1(x^1), F^2(x^2), \ldots, F^7(x^7)\right). \tag{9}$$

The functional structure exhibited in equation (9) implies that, for each $r$, the inputs used to
produce $y_r$ are separable from all other inputs.

Example: House Building

Building a house involves a production process with many inputs of capital, labor,
and materials. How might an economist model such a complex undertaking?
One answer involves breaking down this large problem into a number of smaller
subproblems. Consider the following steps in building a house.

1. Prepare site.
2. Pour foundation.
3. Frame floor, walls, and roof.
4. Install plumbing, electrical system, heating, ventilation, air conditioning, and so
on.

Each step can be modelled by a subproduction function that depends on
specialized inputs of labor, capital, and materials. Let $y_1$ be the site preparation
output and let $x^1$ be the vector of inputs used in site preparation. Then we can write
the production function for this intermediate output as $y_1 = F^1(x^1)$. Foundation
output will be a function of prepared sites, $y_1$, and inputs that are used to pour
foundations, $x^2$. We get $y_2 = F^2(y_1, x^2)$. Next, let $y_3$ be the framing output and $x^3$
the vector of inputs used in framing. The production function for this intermediate
output is $y_3 = F^3(y_2, x^3)$, and so on.
For expositional simplicity, suppose that all production steps beyond step 3 are
lumped together into a final step 4, viz., finish building the house. Then the final output,
$y$, will equal step 4 output, $y_4$, and we have

$$y = y_4 = F^4\left(F^3\left(F^2\left(F^1(x^1), x^2\right), x^3\right), x^4\right).$$

The inputs in $x^1$ are separable from the inputs in $(x^2, x^3, x^4)$, the inputs in
$(x^1, x^2)$ are separable from the inputs in $(x^3, x^4)$, and the inputs in $(x^1, x^2, x^3)$
are separable from the inputs in $x^4$.
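
The recursive nesting can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the Cobb-Douglas stage functions, the share parameter, and the names `stage`, `house`, and `mrs_within_x1` are hypothetical and do not come from the text. It checks that the marginal rate of substitution between the two site-preparation inputs in $x^1$ is unaffected by the inputs used in later stages.

```python
import math

def geometric_mean(inputs):
    """Hypothetical constant-returns aggregate of a vector of inputs."""
    return math.prod(v ** (1.0 / len(inputs)) for v in inputs)

def stage(prev_output, inputs, share=0.5):
    """Hypothetical stage function F^r(y_{r-1}, x^r): previous output plus new inputs."""
    return prev_output ** share * geometric_mean(inputs) ** (1.0 - share)

def house(x1, x2, x3, x4):
    """y = F^4(F^3(F^2(F^1(x^1), x^2), x^3), x^4) with the assumed stage functions."""
    y1 = geometric_mean(x1)   # site preparation, F^1(x^1)
    y2 = stage(y1, x2)        # foundation
    y3 = stage(y2, x3)        # framing
    return stage(y3, x4)      # finishing

def mrs_within_x1(x1, x2, x3, x4, h=1e-6):
    """Central-difference MRS between the two inputs in x^1."""
    a, b = x1
    da = (house((a + h, b), x2, x3, x4) - house((a - h, b), x2, x3, x4)) / (2 * h)
    db = (house((a, b + h), x2, x3, x4) - house((a, b - h), x2, x3, x4)) / (2 * h)
    return da / db

# Same x^1, two very different sets of later-stage inputs.
mrs_a = mrs_within_x1((2.0, 3.0), (1.0, 2.0), (4.0,), (1.0, 1.0, 1.0))
mrs_b = mrs_within_x1((2.0, 3.0), (9.0, 0.5), (0.2,), (7.0, 3.0, 2.0))
```

Under these assumed forms both values agree with the analytical ratio $b/a = 1.5$ for $x^1 = (2, 3)$, up to finite-difference error.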

Functional Structure with Two Sectors

In this section, functional structure and its associated property, separability, are
examined in the simple case of two sectors, numbered 1 and 2. In this setting,
functional structure arises from the condition that sector 2 is separable from sector
1 but sector 1 is not necessarily separable from sector 2.

Defining Separability

Let $\mathbb{R}^N_+$ be the $N$-dimensional nonnegative Euclidean orthant, i.e.,

$$\mathbb{R}^N_+ = \left\{x : x = (x_1, \ldots, x_N),\ x_n \ge 0,\ n = 1, \ldots, N\right\}.$$

A production function is a real-valued function defined on the domain $\mathbb{R}^N_+$ with
image $y = F(x)$. It is assumed that $y = F(x)$ is the maximum output that can be
produced using the input vector $x$. The range of $F$ is denoted by $\mathcal{R}(F)$.
Let $I = \{1, 2, \ldots, N\}$ be the set of integers that identify the inputs for which the
production function is defined. Sort the set of inputs into two groups, i.e., let

$$\bar{I} = \left\{I^1, I^2\right\}$$

be a binary partition of $I$. In other words,

$$I^1 \cup I^2 = I, \quad I^1 \cap I^2 = \emptyset, \quad I^1 \neq \emptyset, \quad I^2 \neq \emptyset.$$

Assume that $I$ has been conveniently ordered so that

$$I^1 = \{1, 2, \ldots, n_1\} \quad \text{and} \quad I^2 = \{n_1 + 1, n_1 + 2, \ldots, N\}.$$

Let $n_2 = N - n_1$. The production function is equivalently defined on the domain
$\mathbb{R}^N_+ = \mathbb{R}^{n_1}_+ \times \mathbb{R}^{n_2}_+$. Moreover, the input vector can be written as $x = (x^1, x^2)$.
Output is now given by $y = F(x^1, x^2)$. According to the Leontief-Sono definition,
$I^2$ is separable from $I^1$ if and only if

$$\frac{\partial}{\partial x_k}\left(\frac{\partial y/\partial x_i}{\partial y/\partial x_j}\right) = 0 \quad \text{for all } i, j \in I^2,\ k \in I^1.$$

The economic content of this condition is this: Inputs in sector $I^2$ are separable
from inputs in sector $I^1$ if and only if marginal rates of substitution between inputs
in sector $I^2$ are independent of input quantities in sector $I^1$.
For example, suppose there are only three inputs. Let $x^1 = x_1$ and $x^2 = (x_2, x_3)$.
Further suppose that the production function is given by

$$y = F(x^1, x^2) = x_1^{1/2} x_2^{1/4} x_3^{1/4} + x_2^{1/2} x_3^{1/2}.$$

Then

$$\frac{\partial y/\partial x_2}{\partial y/\partial x_3}
= \frac{(1/4)\, x_1^{1/2} x_2^{-3/4} x_3^{1/4} + (1/2)\, x_2^{-1/2} x_3^{1/2}}{(1/4)\, x_1^{1/2} x_2^{1/4} x_3^{-3/4} + (1/2)\, x_2^{1/2} x_3^{-1/2}}$$

$$= \frac{x_3}{x_2} \cdot \frac{x_2}{x_3}\left(\frac{(1/4)\, x_1^{1/2} x_2^{-3/4} x_3^{1/4} + (1/2)\, x_2^{-1/2} x_3^{1/2}}{(1/4)\, x_1^{1/2} x_2^{1/4} x_3^{-3/4} + (1/2)\, x_2^{1/2} x_3^{-1/2}}\right)$$

$$= \frac{x_3}{x_2}\left(\frac{(1/4)\, x_1^{1/2} x_2^{1/4} x_3^{1/4} + (1/2)\, x_2^{1/2} x_3^{1/2}}{(1/4)\, x_1^{1/2} x_2^{1/4} x_3^{1/4} + (1/2)\, x_2^{1/2} x_3^{1/2}}\right)
= \frac{x_3}{x_2}.$$

Thus the marginal rate of technical substitution between inputs 2 and 3 does not
depend on the value of input 1 since

$$\frac{\partial}{\partial x_1}\left(\frac{x_3}{x_2}\right) = 0,$$

and hence inputs 2 and 3 are separable from input 1.
It can be verified that inputs 1 and 2 are not separable from input 3 and that inputs
1 and 3 are not separable from input 2.
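
The claims in this example can be checked numerically. The following is a minimal sketch that uses central finite differences in place of analytical derivatives; the helper name `mrs`, the step size, and the evaluation points are illustrative choices, not part of the text.

```python
def F(x1, x2, x3):
    """The example production function from the text."""
    return x1 ** 0.5 * x2 ** 0.25 * x3 ** 0.25 + x2 ** 0.5 * x3 ** 0.5

def mrs(f, i, j, x, h=1e-6):
    """Central-difference estimate of (df/dx_i) / (df/dx_j) at x (0-based indices)."""
    def partial(k):
        up, dn = list(x), list(x)
        up[k] += h
        dn[k] -= h
        return (f(*up) - f(*dn)) / (2 * h)
    return partial(i) / partial(j)

# MRS between inputs 2 and 3 at (x2, x3) = (2, 5) for several values of x1.
mrs_23 = [mrs(F, 1, 2, (x1, 2.0, 5.0)) for x1 in (0.5, 1.0, 4.0)]

# MRS between inputs 1 and 2 at (x1, x2) = (1, 2) for two values of x3.
mrs_12 = [mrs(F, 0, 1, (1.0, 2.0, x3)) for x3 in (1.0, 5.0)]
```

Every entry of `mrs_23` agrees with $x_3/x_2 = 2.5$ up to finite-difference error, whereas the two entries of `mrs_12` differ noticeably, which is consistent with inputs 1 and 2 not being separable from input 3.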
Another way to understand the separability condition is to define a conditional
isoquant in $\mathbb{R}^{n_2}_+$. It is given by the set of group 2 input vectors that yield constant
output conditional on group 1 inputs, i.e.,

$$\left\{x^2 : F(x^1, x^2) = k = \text{constant}\right\}.$$

Separability implies the invariance of marginal rates of substitution in group 2 when
group 1 inputs change. This will be guaranteed if conditional isoquants do not
depend on group 1 inputs. This means that conditional upper level sets in $\mathbb{R}^{n_2}_+$,
given by

$$\left\{x^2 : F(x^1, x^2) \ge k = \text{constant}\right\},$$

do not depend on group 1 inputs.
We can now state a definition of separability, due to Stigum [28], that does not
depend on differentiability. Define the conditional upper level set by

$$\beta(x^1, x^2) = \left\{\hat{x}^2 : \hat{x}^2 \in \mathbb{R}^{n_2}_+,\ F(x^1, \hat{x}^2) \ge F(x^1, x^2)\right\}.$$

Inputs in group 2 are separable from inputs in group 1 if and only if $\beta(x^1, x^2)$ is
independent of $x^1$ for all $(x^1, x^2)$ in $\mathbb{R}^{n_1}_+ \times \mathbb{R}^{n_2}_+$. In this case,

$$\beta(x^1, x^2) = \beta(\tilde{x}^1, x^2),$$

for all $x^1 \in \mathbb{R}^{n_1}_+$, $\tilde{x}^1 \in \mathbb{R}^{n_1}_+$, $x^2 \in \mathbb{R}^{n_2}_+$.
There is another way of stating the definition of separability. Inputs in group 2
are separable from inputs in group 1 if and only if

$$F(x^1, \hat{x}^2) \ge F(x^1, x^2) \iff F(\tilde{x}^1, \hat{x}^2) \ge F(\tilde{x}^1, x^2),$$

for all $x^1, \tilde{x}^1 \in \mathbb{R}^{n_1}_+$, $x^2, \hat{x}^2 \in \mathbb{R}^{n_2}_+$.
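
The ordering form of the definition lends itself to a sampling check that needs no derivatives. The sketch below applies it to the three-input example considered earlier; the sampling range, the seed, the tie tolerance, and the name `ranking_is_invariant` are assumptions made for illustration. A sampling test can only fail to find a violation; it does not prove separability.

```python
import random

def F(x1, x2, x3):
    """Example production function with inputs 2 and 3 separable from input 1."""
    return x1 ** 0.5 * x2 ** 0.25 * x3 ** 0.25 + x2 ** 0.5 * x3 ** 0.5

def ranking_is_invariant(trials=2000, seed=0, tol=1e-9):
    """Check that the ranking of two group-2 bundles does not change with x1."""
    rng = random.Random(seed)

    def draw():
        return rng.uniform(0.1, 10.0)

    for _ in range(trials):
        bundle_a = (draw(), draw())      # candidate for x-hat^2
        bundle_b = (draw(), draw())      # candidate for x^2
        x1, x1_alt = draw(), draw()      # x^1 and x-tilde^1
        gap = F(x1, *bundle_a) - F(x1, *bundle_b)
        gap_alt = F(x1_alt, *bundle_a) - F(x1_alt, *bundle_b)
        if abs(gap) < tol or abs(gap_alt) < tol:
            continue                     # skip near ties, where rounding could decide the sign
        if (gap > 0) != (gap_alt > 0):
            return False                 # ranking reversed: separability violated
    return True
```

For this function the ranking depends only on the product $x_2 x_3$, so no reversal should occur.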

Separability and Functional Structure

The Production Function

The following theorem is the basic result that links separability with functional
structure.

Theorem 1 ([7, 14]). Assume that the production function is continuous. Given the
partition $\{I^1, I^2\}$, inputs in group 2 are separable from inputs in group 1 if and
only if there exist continuous functions $F^2$ and $\mathcal{F}$ such that

$$F(x^1, x^2) = \mathcal{F}\left(x^1, F^2(x^2)\right), \tag{10}$$

for all $(x^1, x^2) \in \mathbb{R}^N_+$, where $\mathcal{F}$ is increasing in its last argument, $F^2(x^2)$.

The following sketch of the proof of Theorem 1 is short and may yield
further insight. To show that the given functional structure implies the separability
condition, note that

$$\beta(x^1, x^2) = \left\{\hat{x}^2 : \hat{x}^2 \in \mathbb{R}^{n_2}_+,\ F(x^1, \hat{x}^2) \ge F(x^1, x^2)\right\}$$

$$= \left\{\hat{x}^2 : \hat{x}^2 \in \mathbb{R}^{n_2}_+,\ \mathcal{F}\left(x^1, F^2(\hat{x}^2)\right) \ge \mathcal{F}\left(x^1, F^2(x^2)\right)\right\}$$

$$= \left\{\hat{x}^2 : \hat{x}^2 \in \mathbb{R}^{n_2}_+,\ F^2(\hat{x}^2) \ge F^2(x^2)\right\},$$

where the last equality is implied by the condition that $\mathcal{F}$ is increasing in its
last argument. This shows that $\beta(x^1, x^2)$ is independent of $x^1$ for all $(x^1, x^2)$ in
$\mathbb{R}^{n_1}_+ \times \mathbb{R}^{n_2}_+$.
Conversely, suppose the separability condition holds. Let

$$F^2(x^2) = F(O^1, x^2) \quad \text{for all } x^2 \in \mathbb{R}^{n_2}_+ \tag{11}$$

for some arbitrarily chosen vector $O^1$ in $\mathbb{R}^{n_1}_+$. Next, define

$$\mathcal{F}\left(x^1, F^2(x^2)\right) = F(x^1, x^2) \quad \text{for all } (x^1, x^2) \in \mathbb{R}^{n_1}_+ \times \mathbb{R}^{n_2}_+.$$

We need to show that $\mathcal{F}$ is increasing in its last argument. First, note that

$$F^2(x^2) \ge F^2(\hat{x}^2)$$

if and only if

$$F(O^1, x^2) \ge F(O^1, \hat{x}^2) \quad \text{for all } O^1 \in \mathbb{R}^{n_1}_+$$

if and only if

$$\mathcal{F}\left(x^1, F^2(x^2)\right) \ge \mathcal{F}\left(x^1, F^2(\hat{x}^2)\right).$$

Hence, we have shown that

$$F^2(x^2) \ge F^2(\hat{x}^2) \quad \text{if and only if} \quad \mathcal{F}\left(x^1, F^2(x^2)\right) \ge \mathcal{F}\left(x^1, F^2(\hat{x}^2)\right),$$

which implies that $\mathcal{F}$ is increasing in $y_2 = F^2(x^2)$.
We will refer to the function $F^2$ in (10) as an aggregator function since it
aggregates the inputs in group 2 into a scalar measure, e.g., an intermediate input.
The function $\mathcal{F}$ will be called a macro function since it maps $x^1$ and $y_2 = F^2(x^2)$
into final output, $y$. The original production function $F$ will be referred to as the
master function.
The functional structure,

$$F(x^1, x^2) = \mathcal{F}\left(x^1, F^2(x^2)\right),$$

may be given a schematic representation as in Fig. 1.
Although we began with a one-period production function, Fig. 1 suggests that
production takes place over two time periods. In the first period, group 2 inputs are
used to produce an intermediate input, $y_2$. In the second period, group 1 inputs and
the intermediate input, $y_2$, are used to produce final output $y$. This can be interpreted
as a network production model. See Färe and Grosskopf [9] and the discussion that
begins on page 4.
So far we have only assumed that the master function $F$ is continuous. Theorem 1
tells us that in this case, both the aggregator function $F^2$ and the macro function
$\mathcal{F}$ can be chosen to be continuous. Put another way, continuity of the aggregator
function and continuity of the macro function are inherited from the master function.

Fig. 1 One Separable Sector (schematic: $x^2$ enters $F^2$ to give $y_2$; $x^1$ and $y_2$ enter $\mathcal{F}$ to give $y$)

Are there other properties of the master function that are inherited by the
aggregator function and the macro function? The answer is generally yes. Here is a
list of some of these inherited properties: (i) quasiconcavity, (ii) strict monotonicity,
(iii) homotheticity, (iv) homogeneity of degree one, and (v) concavity. Proofs of
these claims are somewhat scattered in the literature. For example, proofs for
properties (i) and (iv) can be found in Solow [25]. A comprehensive set of proofs
for all five properties (plus more) is given in Theorem 3.5, page 78, in Blackorby
et al. [3].
A simple example that illustrates the construction of $F^2$ in equation (11) is

$$F(x_1, x_2, x_3) = x_1^{1/3} x_2^{1/3} x_3^{1/3}.$$

It is straightforward to show that inputs 2 and 3 are separable from input 1. Let
the reference vector, $O^1$, be $x_1 = 1$. Then an aggregator function for inputs
2 and 3 is just $F^2(x_2, x_3) = x_2^{1/3} x_3^{1/3}$. A correct choice of $\mathcal{F}$ in this case
is $\mathcal{F}\left(x_1, F^2(x_2, x_3)\right) = x_1^{1/3} F^2(x_2, x_3)$. However, these choices of $\mathcal{F}$ and $F^2$
are not unique. Another correct choice would be $F^2(x_2, x_3) = x_2^{1/2} x_3^{1/2}$ and
$\mathcal{F}\left(x_1, F^2(x_2, x_3)\right) = x_1^{1/3} \left(F^2(x_2, x_3)\right)^{2/3}$. Note that this latter choice entails an
aggregator function that exhibits constant returns to scale in inputs 2 and 3.
An additional example, which was considered previously, is

$$y = F(x^1, x^2) = x_1^{1/2} x_2^{1/4} x_3^{1/4} + x_2^{1/2} x_3^{1/2},$$

where $x^1 = x_1$ and $x^2 = (x_2, x_3)$. Let

$$F^2(x^2) = x_2^{1/2} x_3^{1/2}.$$

Then

$$\mathcal{F}\left(x^1, F^2(x^2)\right) = x_1^{1/2} \left(F^2(x^2)\right)^{1/2} + F^2(x^2).$$
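
The non-uniqueness of the aggregator and macro functions in the first example can be confirmed with a few lines of arithmetic. The sketch below encodes both decompositions given in the text; the function names and the evaluation point are illustrative only.

```python
import math

def master(x1, x2, x3):
    """Master function F(x1, x2, x3) = (x1 * x2 * x3)^(1/3)."""
    return (x1 * x2 * x3) ** (1 / 3)

# First choice: aggregator built from the reference vector x1 = 1.
def aggregator_a(x2, x3):
    return (x2 * x3) ** (1 / 3)

def macro_a(x1, z):
    return x1 ** (1 / 3) * z

# Second choice: aggregator with constant returns to scale.
def aggregator_b(x2, x3):
    return (x2 * x3) ** 0.5

def macro_b(x1, z):
    return x1 ** (1 / 3) * z ** (2 / 3)

def both_reproduce_master(x1, x2, x3):
    """True if both (macro, aggregator) pairs give back the master function."""
    target = master(x1, x2, x3)
    via_a = macro_a(x1, aggregator_a(x2, x3))
    via_b = macro_b(x1, aggregator_b(x2, x3))
    return (math.isclose(via_a, target, rel_tol=1e-12)
            and math.isclose(via_b, target, rel_tol=1e-12))
```

Doubling both arguments of `aggregator_b` doubles its value, which is the constant-returns property noted in the text; `aggregator_a` does not have it.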

Duality
In this section we review some basic results from duality theory. More thorough
treatments can be found in Shephard [24], Blackorby et al. [3], Diewert [8], and
Färe and Primont [10].
It is assumed that the production function (i) is continuous and nondecreasing in
$x$, i.e.,

$$x \ge x' \Rightarrow F(x) \ge F(x') \quad \text{for all } x, x' \text{ in } \mathbb{R}^N_+;$$

(ii) satisfies a no-local-maxima condition, i.e., given any input vector $x$ in $\mathbb{R}^N_+$ and
any $\varepsilon > 0$, there exists another input vector $x'$ with $\|x - x'\| < \varepsilon$ such that
$F(x') > F(x)$; and (iii) is quasiconcave in $x$, i.e.,

$$F(x) \ge F(x') \Rightarrow F\left(\alpha x + (1 - \alpha) x'\right) \ge F(x'),$$

for all $\alpha$ such that $0 \le \alpha \le 1$ and for all $x, x'$ in $\mathbb{R}^N_+$. The no-local-maxima
condition implies that isoquants are not “thick”. It is the production theory analog
to the assumption of local nonsatiation in utility theory.
There are several dual representations of the technology. They include the cost
function, the input distance function, and the indirect production function. Each
of these is defined below. In a subsequent section, we will look at multioutput
technologies and discuss an alternative representation, namely, the output distance
function.
The cost function is defined by

$$C(y, w) = \min_x \left\{w \cdot x : F(x) \ge y\right\}, \quad w \in \mathbb{R}^N_{++},$$

where $\mathbb{R}^N_{++} = \{w = (w_1, \ldots, w_N) : w_n > 0,\ n = 1, \ldots, N\}$ is the set of input price
vectors whose components are strictly positive. The cost function is (i) jointly
continuous in $(y, w)$, (ii) increasing in $y$, and (iii) nondecreasing, concave, and
homogeneous of degree one in $w$. If a function $C(y, w)$ satisfies (i)–(iii), then one
may recover the production function that would generate $C(y, w)$ through the cost
minimization problem.
The input distance function is defined by

$$D(y, x) = \max_{\lambda > 0} \left\{\lambda : F(x/\lambda) \ge y\right\}.$$

It is (i) jointly continuous in $(y, x)$, (ii) decreasing in $y$, and (iii) nondecreasing,
concave, and homogeneous of degree one in $x$. On the other hand, if we start
with a function $D(y, x)$ that satisfies (i)–(iii), then one can find the corresponding
production function by solving

$$F(x) = \max \left\{y : D(y, x) \ge 1\right\}.$$

As a consequence, the input distance function is an implicit representation of the
production function in the sense that

$$y = F(x) \iff D(y, x) = 1.$$
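
The definition of the input distance function can be evaluated directly by a one-dimensional search over $\lambda$. The sketch below is illustrative only: the Cobb-Douglas technology, the search bounds, and the name `input_distance` are assumptions, and the bisection relies on $F$ being nondecreasing so that feasibility of $x/\lambda$ is monotone in $\lambda$.

```python
def F(x):
    """Hypothetical constant-returns technology F(x1, x2) = x1^(1/2) * x2^(1/2)."""
    x1, x2 = x
    return x1 ** 0.5 * x2 ** 0.5

def input_distance(y, x, lo=1e-9, hi=1e9, iters=200):
    """D(y, x) = max{lam > 0 : F(x / lam) >= y}, located by geometric bisection.

    Assumes F(x / lo) >= y and F(x / hi) < y, so the answer lies in [lo, hi].
    """
    def feasible(lam):
        return F([v / lam for v in x]) >= y

    for _ in range(iters):
        mid = (lo * hi) ** 0.5
        if feasible(mid):
            lo = mid      # x / mid still produces y: the maximum is at least mid
        else:
            hi = mid      # x / mid falls short of y: the maximum is below mid
    return lo

x = [4.0, 9.0]                       # F(x) = 6
d_half = input_distance(3.0, x)      # inputs can be halved and still produce y = 3
d_frontier = input_distance(F(x), x) # x is on the isoquant for y = F(x)
```

For a technology that is homogeneous of degree one, as assumed here, $D(y, x) = F(x)/y$, so the two values should be close to 2 and 1 respectively; the second illustrates $y = F(x) \iff D(y, x) = 1$.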

The indirect production function is defined by

$$G(w/c) = \max_x \left\{F(x) : (w/c) \cdot x \le 1\right\}, \quad c > 0 \text{ and } w \in \mathbb{R}^N_{++}.$$

This is the production counterpart to the indirect utility function of consumer
theory. It is (i) continuous, (ii) nonincreasing, and (iii) quasiconvex in $w/c$. ($G$
is quasiconvex if $-G$ is quasiconcave.) The (direct) production function may be
recovered from $G$; it will be the production function that would generate $G$ through
the output maximization problem. Details of these results can be found in Chapter 2
and the Appendix in Blackorby et al. [3] and in Diewert [8].
The relationship between the cost function and the indirect production function is
analogous to the relationship between the input distance function and the production
function. In particular,

$$C(y, w) = \min_{c > 0} \left\{c : G(w/c) \ge y\right\}$$

and

$$G(w/c) = \min_y \left\{y : C(y, w/c) \ge 1\right\}.$$

It follows that the cost function is an implicit representation of the indirect
production function, i.e.,

$$y = G(w/c) \iff C(y, w/c) = 1.$$
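
A closed-form case makes the implicit relationship concrete. The sketch below assumes a hypothetical constant-returns Cobb-Douglas technology $F(x) = x_1^{a} x_2^{1-a}$, for which the cost function is $y$ times a unit cost function and the indirect production function is the reciprocal of unit cost at normalized prices; the parameter value and function names are illustrative.

```python
import math

A = 0.3  # hypothetical output elasticity of input 1

def unit_cost(v):
    """Unit cost for F(x) = x1^A * x2^(1-A) at prices v."""
    return (v[0] / A) ** A * (v[1] / (1 - A)) ** (1 - A)

def cost(y, v):
    """C(y, v) = y * unit_cost(v) under constant returns to scale."""
    return y * unit_cost(v)

def indirect_output(v):
    """G(v) for normalized prices v = w / c: the most output a budget of 1 buys."""
    return 1.0 / unit_cost(v)

w = [1.5, 4.0]
y = 2.0
c = cost(y, w)                                   # minimum expenditure for output y
normalized = [wi / c for wi in w]                # w / c
output_at_budget = indirect_output(normalized)   # should give back y
cost_at_g = cost(indirect_output(w), w)          # should equal 1
```

The first check is the statement that $C(y, w)$ is the smallest budget $c$ with $G(w/c) \ge y$; the second is $y = G(w/c) \iff C(y, w/c) = 1$, evaluated with $c = 1$.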

There are two important duality relationships. The first of these involves the input
distance function and the cost function. The main result is:

$$C(y, w) = \min_x \left\{w \cdot x : D(y, x) \ge 1\right\} \tag{12}$$

if and only if

$$D(y, x) = \min_w \left\{w \cdot x : C(y, w) \ge 1\right\}. \tag{13}$$

Equation (12) follows from the fact that $F(x) \ge y \iff D(y, x) \ge 1$. If the solution
to (12) is unique, denote it by $x^* = \zeta(y, w)$. It is the vector of cost minimizing
or Hicksian input demand functions. (If the solution is not unique, then we can
reinterpret $\zeta(y, w)$ as an arbitrary selection from the set of solutions.) The Hicksian
demand functions are homogeneous of degree zero in $w$. The minimized value of
cost is $C(y, w) = w \cdot \zeta(y, w)$.
Similarly, let $w^* = \delta(y, x)$ be the solution to (13) (or an arbitrary selection from
the set of solutions). It is the vector of output-constant price-demand functions or,
alternatively, the vector of shadow prices of the input vector $x$. The shadow price
functions are homogeneous of degree zero in $x$. The shadow cost of $x$ is
$D(y, x) = \delta(y, x) \cdot x$.
The following result is attributed to Hotelling [18]: If $C$ is differentiable at
$(y, w)$, then

$$\zeta_i(y, w) = \frac{\partial C(y, w)}{\partial w_i}, \quad i = 1, \ldots, N. \tag{14}$$

A result that is dual to (14) is attributed to Shephard [24]: If $D$ is differentiable at
$(y, x)$, then

$$\delta_i(y, x) = \frac{\partial D(y, x)}{\partial x_i}, \quad i = 1, \ldots, N. \tag{15}$$
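
Results (14) and (15) can be illustrated with numerical derivatives. The sketch below assumes a hypothetical Cobb-Douglas technology $F(x) = x_1^{a} x_2^{1-a}$, whose cost and input distance functions have the closed forms coded here; the parameter value, the prices, and the helper `gradient` are illustrative and not taken from the text.

```python
A = 0.3  # hypothetical output elasticity of input 1

def cost(y, w):
    """C(y, w) for F(x) = x1^A * x2^(1-A)."""
    return y * (w[0] / A) ** A * (w[1] / (1 - A)) ** (1 - A)

def distance(y, x):
    """D(y, x) = F(x) / y for the same constant-returns technology."""
    return x[0] ** A * x[1] ** (1 - A) / y

def gradient(f, y, v, h=1e-6):
    """Central-difference gradient of f(y, v) with respect to the vector v."""
    grad = []
    for i in range(len(v)):
        up, dn = list(v), list(v)
        up[i] += h
        dn[i] -= h
        grad.append((f(y, up) - f(y, dn)) / (2 * h))
    return grad

y, w = 2.0, [1.5, 4.0]
zeta = gradient(cost, y, w)          # Hicksian demands from (14)
delta = gradient(distance, y, zeta)  # shadow prices from (15), evaluated at x = zeta

minimized_cost = sum(wi * xi for wi, xi in zip(w, zeta))   # w . zeta(y, w)
shadow_cost = sum(di * xi for di, xi in zip(delta, zeta))  # delta(y, x) . x
```

Here `minimized_cost` matches `cost(y, w)`, the bundle `zeta` lies on the isoquant so `distance(y, zeta)` and `shadow_cost` are both close to one, and the shadow prices are proportional to the market prices, `delta[i]` being close to `w[i] / cost(y, w)`.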

Another duality relationship exists for the production function and the indirect
production function. This is given by:

$$G(w/c) = \max_x \left\{F(x) : (w/c) \cdot x \le 1\right\} \tag{16}$$

if and only if

$$F(x) = \min_{w/c} \left\{G(w/c) : (w/c) \cdot x \le 1\right\}. \tag{17}$$

Denote the solution to (16) by $x^* = \phi(w/c)$. It is the vector of output-maximizing
input demands when the firm is cost-constrained. (If the solution is not unique, then
let $\phi(w/c)$ be an arbitrary selection from the set of solutions.) It is analogous to the
vector of Marshallian demand functions in the theory of the consumer; that label
can be carried over to the theory of the firm. The value of maximized output is given
by $G(w/c) = F(\phi(w/c))$.
Similarly, let $(w/c)^* = \xi(x)$ be the solution (or one of the solutions) to (17). It
is the vector of cost-normalized price-demand functions or, alternatively, the vector
of shadow prices of the input vector $x$.
The following result is attributed to Roy [22, 23]: If $G$ is differentiable at $w/c$,
then

$$\phi_i(w/c) = \frac{\dfrac{\partial G(w/c)}{\partial (w_i/c)}}{\displaystyle\sum_{j=1}^{N} \dfrac{\partial G(w/c)}{\partial (w_j/c)} \cdot (w_j/c)}, \quad i = 1, \ldots, N. \tag{18}$$

A result that is dual to (18) is attributed to Wold [29]: If $F$ is differentiable at $x$, then

$$\xi_i(x) = \frac{\dfrac{\partial F(x)}{\partial x_i}}{\displaystyle\sum_{j=1}^{N} \dfrac{\partial F(x)}{\partial x_j}\, x_j}, \quad i = 1, \ldots, N. \tag{19}$$
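
The two ratio formulas (18) and (19) can be checked against each other in a case with known solutions. The sketch below again assumes a hypothetical Cobb-Douglas technology $F(x) = x_1^{a} x_2^{1-a}$, for which the cost-constrained demands are $a/v_1$ and $(1-a)/v_2$ at normalized prices $v = w/c$; the parameter value, the prices, and the helper names are illustrative.

```python
A = 0.3  # hypothetical output elasticity of input 1

def F(x):
    """Direct production function."""
    return x[0] ** A * x[1] ** (1 - A)

def G(v):
    """Indirect production function for F at normalized prices v = w / c."""
    return (A / v[0]) ** A * ((1 - A) / v[1]) ** (1 - A)

def gradient(f, v, h=1e-6):
    """Central-difference gradient of f at the vector v."""
    grad = []
    for i in range(len(v)):
        up, dn = list(v), list(v)
        up[i] += h
        dn[i] -= h
        grad.append((f(up) - f(dn)) / (2 * h))
    return grad

def normalized_gradient(f, v):
    """Gradient of f divided by the inner product of the gradient with v."""
    g = gradient(f, v)
    denom = sum(gj * vj for gj, vj in zip(g, v))
    return [gi / denom for gi in g]

v = [0.5, 0.25]
phi = normalized_gradient(G, v)    # cost-constrained demands from (18)
xi = normalized_gradient(F, phi)   # shadow prices from (19), evaluated at x = phi
budget_used = sum(vi * xj for vi, xj in zip(v, phi))
```

With these numbers `phi` is close to `[0.6, 2.8]`, the budget constraint binds (`budget_used` is close to one), and applying (19) at `phi` returns the normalized prices `v`, as the duality between (16) and (17) suggests.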

The Cost Function

For the partition $\{I^1, I^2\}$, we define separability of the cost function by first
defining the conditional upper level sets

$$\gamma^2(y, w^1, w^2) = \left\{\hat{w}^2 \in \mathbb{R}^{n_2}_+ : C(y, w^1, \hat{w}^2) \ge C(y, w^1, w^2)\right\},$$

for each $y \in \mathcal{R}(F)$, $(w^1, w^2) \in \mathbb{R}^{n_1}_+ \times \mathbb{R}^{n_2}_+$. Then we say that the input prices in $I^2$
are separable from the input prices in $I^1$ if and only if $\gamma^2(y, w^1, w^2)$ is independent
of $w^1$, i.e.,

$$\gamma^2(y, w^1, w^2) = \gamma^2(y, \tilde{w}^1, w^2),$$

for each $y \in \mathcal{R}(F)$, for all $(w^1, w^2) \in \mathbb{R}^{n_1}_+ \times \mathbb{R}^{n_2}_+$, and for any $\tilde{w}^1 \in \mathbb{R}^{n_1}_+$. It is
important to note that while $I^2$ is separable from $I^1$, it is not necessarily separable
from the output variable $y$. (This case will be considered later.)
If $C$ is twice differentiable with positive first partial derivatives, then we can
apply the Leontief-Sono condition for separability. In this case $I^2$ is separable from
$I^1$ if and only if

$$\frac{\partial}{\partial w_k}\left(\frac{\partial C(y, w)/\partial w_i}{\partial C(y, w)/\partial w_j}\right) = \frac{\partial}{\partial w_k}\left(\frac{\zeta_i(y, w)}{\zeta_j(y, w)}\right) = 0,$$

for all $i, j \in I^2$, $k \in I^1$ and for all $(y, w) \in \mathcal{R}(F) \times \mathbb{R}^{n_1}_+ \times \mathbb{R}^{n_2}_+$. This condition
says that ratios of Hicksian demand functions for inputs in sector $I^2$ do not depend
on input prices in sector $I^1$.
Next is a statement of the functional structure of the cost function that obtains
under the above separability conditions.

Theorem 2. Assume that $C$ is continuous and nondecreasing in $w$. Then $I^2$ is
separable in $C$ from $I^1$ if and only if there exist functions $C^2$ and $\mathcal{C}$ such that

$$C(y, w) = \mathcal{C}\left(y, w^1, C^2(y, w^2)\right), \tag{20}$$

for all $(y, w) \in \mathcal{R}(F) \times \mathbb{R}^N_+$, where $\mathcal{C}$ is increasing in its last argument, $C^2(y, w^2)$.
In addition, if $C$ is continuous, nondecreasing, homogeneous of degree one, and
concave in $w$, jointly continuous in $(y, w)$, and increasing in $y$, then $\mathcal{C}$ is continuous,
nondecreasing, homogeneous of degree one, and concave in $\left(w^1, C^2(y, w^2)\right)$,
and $C^2$ is continuous, nondecreasing, homogeneous of degree one, and concave in
$w^2$ and is jointly continuous in $(y, w^2)$ and increasing in $y$.
The proof of this theorem is contained in the proof of Theorem 3.4, page 70,
in Blackorby, Primont, and Russell [3]. As stated, both the macro function $\mathcal{C}$ and
the aggregator function $C^2$ have the properties of the master function $C$. It is thus
tempting to think of $C^2(y, w^2)$ as a price index for inputs in sector $I^2$. However,
this is somewhat problematic since $C^2$ depends on the output level $y$. We will see
later what we get when $y$ is freed from $C^2$.

The Input Distance Function

To define separability in the context of the input distance function, define the
conditional upper level sets

$$\eta^2(y, x^1, x^2) = \left\{\hat{x}^2 : \hat{x}^2 \in \mathbb{R}^{n_2}_+,\ D(y, x^1, \hat{x}^2) \geq D(y, x^1, x^2)\right\},$$

for each $y \in R(F)$ and for each $(x^1, x^2) \in \mathbb{R}^{n_1}_+ \times \mathbb{R}^{n_2}_+$. The set of inputs in $I^2$ is separable from the set of inputs in $I^1$ if and only if $\eta^2(y, x^1, x^2)$ is independent of $x^1$, i.e.,

$$\eta^2(y, x^1, x^2) = \eta^2(y, \tilde{x}^1, x^2),$$

for each $y \in R(F)$, for all $(x^1, x^2) \in \mathbb{R}^{n_1}_+ \times \mathbb{R}^{n_2}_+$, and for any $\tilde{x}^1 \in \mathbb{R}^{n_1}_+$. Once again it is important to note that while $I^2$ is separable from $I^1$, it is not necessarily separable from the output variable $y$.
If $D$ is twice differentiable with positive first partial derivatives, then the Leontief-Sono definition is: $I^2$ is separable from $I^1$ if and only if

$$\frac{\partial}{\partial x_k}\left(\frac{\partial D(y,x)/\partial x_i}{\partial D(y,x)/\partial x_j}\right) = \frac{\partial}{\partial x_k}\left(\frac{\delta_i(y,x)}{\delta_j(y,x)}\right) = 0,$$

for all $i, j \in I^2$, $k \in I^1$ and for all $(y,x) \in R(F) \times \mathbb{R}^N_+$. In words, the ratios of shadow prices for inputs in sector 2 are independent of the quantities of inputs in sector 1. Note that this ratio is not independent of the level of output $y$.
We now state:

Theorem 3. Assume $D$ is continuous and nondecreasing. $I^2$ is separable from $I^1$ in $D$ if and only if there exist functions $D^2$ and $\hat{D}$ such that

$$D(y,x) = \hat{D}\left(y, x^1, D^2(y, x^2)\right), \tag{21}$$

for all $(y,x) \in R(F) \times \mathbb{R}^N_+$, where $\hat{D}$ is increasing in its last argument, $D^2(y, x^2)$.

The proof of Theorem 3 parallels that of Theorem 2 since $C(y,w)$ and $D(y,x)$ have the same properties (except the monotonicity property in $y$).

The Indirect Production Function

The definition of the separability of a sector of normalized price vectors in the indirect production function is based on the lower level sets given by

$$\alpha^2\left(\frac{w^1}{c}, \frac{w^2}{c}\right) = \left\{\frac{\hat{w}^2}{c} \in \mathbb{R}^{n_2}_+ : G\left(\frac{w^1}{c}, \frac{\hat{w}^2}{c}\right) \leq G\left(\frac{w^1}{c}, \frac{w^2}{c}\right)\right\}.$$

We say that the normalized input prices in $I^2$ are separable from the normalized input prices in $I^1$ if and only if $\alpha^2(w^1/c, w^2/c)$ is independent of $w^1/c$, i.e.,

$$\alpha^2\left(\frac{w^1}{c}, \frac{w^2}{c}\right) = \alpha^2\left(\frac{\tilde{w}^1}{c}, \frac{w^2}{c}\right),$$

for all $(w^1/c, w^2/c) \in \mathbb{R}^N_+$ and for all $\tilde{w}^1/c \in \mathbb{R}^{n_1}_+$. If $G$ is twice differentiable with negative first partial derivatives, then the Leontief-Sono separability condition is written as:

$$\frac{\partial}{\partial (w_k/c)}\left(\frac{\partial G(w/c)/\partial (w_i/c)}{\partial G(w/c)/\partial (w_j/c)}\right) = \frac{\partial}{\partial (w_k/c)}\left(\frac{\phi_i(w/c)}{\phi_j(w/c)}\right) = 0,$$

for all $(w/c) \in \mathbb{R}^N_+$, for all $i, j \in I^2$, and for all $k \in I^1$, where the second equality follows from Roy's theorem (18). In this case the ratios of cost-constrained output maximizers (Marshallian demand functions) in $I^2$ are independent of normalized prices in $I^1$.

Theorem 4. Assume $G$ is continuous and nonincreasing. Sector $I^2$ is separable in $G$ from $I^1$ if and only if there exist continuous functions $G^2$ and $\hat{G}$ such that

$$G(w/c) = \hat{G}\left(w^1/c, G^2(w^2/c)\right), \tag{22}$$

for all $(w/c) \in \mathbb{R}^N_+$, where $\hat{G}$ is decreasing in $G^2(w^2/c)$.

If, in addition, $G$ is quasiconvex, then $\hat{G}$ is nonincreasing and quasiconvex, and $G^2$ is nondecreasing and quasiconcave.

Functional-Structure Equivalences

Cost and Input Distance Functions

Separability of sector I 2 from I 1 in F is not equivalent to separability of sector I 2
from I 1 in D or in C or in G. Put another way, separability of I 2 from I 1 imposes a
different restriction on the production technology than does separability of I 2 from
I 1 in D or in C or in G. This statement is generally true for any pair of these four

representations except for one pair, namely, the input distance function and the cost
function. The following is proved in Blackorby et al. ( [3], Theorem 3.6, page 83).
Their proof is based on Gorman [15, 17]. It is also proved by McFadden [21].
Before stating the theorem notice that we must assume that the production
function, F, is continuous, nondecreasing, and quasiconcave in x and satisfies
the no-local-maxima condition. This assumption implies that the cost function
is jointly continuous in (y, w) , increasing in y, and nondecreasing, concave,
and homogeneous of degree one in w and that the input distance function is
jointly continuous in (y, x) , decreasing in y, and nondecreasing, concave, and
homogeneous of degree one in x. Thus, the duality relationship between C and
D holds.

Theorem 5. Assume that the production function is continuous, nondecreasing, and quasiconcave in $x$. Then sector $I^2$ is separable from sector $I^1$ in the input distance function, i.e.,

$$D(y,x) = \hat{D}\left(y, x^1, D^2(y, x^2)\right)$$

if and only if sector $I^2$ is separable from sector $I^1$ in the cost function, i.e.,

$$C(y,w) = \hat{C}\left(y, w^1, C^2(y, w^2)\right).$$

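Proof. “only if”:

$$
\begin{aligned}
C(y,w) &= \min_{x}\left\{w^1 \cdot x^1 + w^2 \cdot x^2 : \hat{D}\left(y, x^1, D^2(y,x^2)\right) \geq 1\right\} \\
&= \min_{x^1, d_2}\left\{w^1 \cdot x^1 + \min_{x^2}\left\{w^2 \cdot x^2 : D^2(y,x^2) \geq d_2\right\} : \hat{D}(y, x^1, d_2) \geq 1\right\} \\
&= \min_{x^1, d_2}\left\{w^1 \cdot x^1 + d_2 \min_{x^2/d_2}\left\{w^2 \cdot x^2/d_2 : D^2(y, x^2/d_2) \geq 1\right\} : \hat{D}(y, x^1, d_2) \geq 1\right\} \\
&= \min_{x^1, d_2}\left\{w^1 \cdot x^1 + d_2\, C^2(y, w^2) : \hat{D}(y, x^1, d_2) \geq 1\right\} \\
&= \hat{C}\left(y, w^1, C^2(y, w^2)\right).
\end{aligned}
$$

“if”: Because of the symmetry between $C$ and $D$, one can repeat the “only if” proof swapping the roles of $C$ and $D$, $\hat{C}$ and $\hat{D}$, and $C^2$ and $D^2$ and minimizing over input prices rather than input quantities.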
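The step in the “only if” part that pulls $d_2$ outside the inner minimization relies on homogeneity of the aggregator. The following worked step spells this out under the assumption that $D^2$ is homogeneous of degree one in $x^2$ and that $d_2 > 0$; the change of variable $z = x^2/d_2$ is introduced here only for exposition.

```latex
% Assumption: D^2(y, \lambda x^2) = \lambda D^2(y, x^2) for all \lambda > 0, and d_2 > 0.
% Substitute z = x^2 / d_2, so that x^2 = d_2 z.
\begin{aligned}
\min_{x^2}\left\{ w^2 \cdot x^2 : D^2(y, x^2) \geq d_2 \right\}
  &= \min_{z}\left\{ d_2\, w^2 \cdot z : D^2(y, d_2 z) \geq d_2 \right\} \\
  &= \min_{z}\left\{ d_2\, w^2 \cdot z : d_2\, D^2(y, z) \geq d_2 \right\} \\
  &= d_2 \min_{z}\left\{ w^2 \cdot z : D^2(y, z) \geq 1 \right\} \\
  &= d_2\, C^2(y, w^2).
\end{aligned}
```

The last line simply names the sectoral minimum value $C^2(y, w^2)$; without degree-one homogeneity of $D^2$ the factor $d_2$ could not be separated from the constraint in this way.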
It is interesting to ask what it is about the $D$–$C$ pair that distinguishes it from the other possible pairings of technology representations. The answer seems to be that only the $D$–$C$ pair has the property that both representations are homogeneous of degree one in their variables (except $y$). This means that $D^2$ ($C^2$) can be chosen to be homogeneous of degree one in $x^2$ ($w^2$). This suggests that adding the assumption that the aggregator functions are homothetic (and so can be chosen to be homogeneous of degree one) would extend the result of Theorem 5 to all possible pairs of representations. This turns out to be the case as we will see in the next subsubsection.

Homothetic Separability
The production function is homothetically separable if it has the functional structure

$$F(x^1, x^2) = \hat{F}\left(x^1, F^2(x^2)\right),$$

where $\hat{F}$ is increasing in the intermediate input $y_2 = F^2(x^2)$ and $F^2$ is a homothetic function. We want to examine the implications of a homothetically separable production function for the cost, input distance, and indirect production functions. As a preliminary step, we will first consider a special case in which the parent function, $F$, is homothetic in its arguments $(x^1, x^2)$.

Theorem 6. Assume that the production function is continuous, nondecreasing, and quasiconcave in $x$. Then the following four statements are equivalent:

(i) The production function $F$ is homothetic, i.e.,

$$F(x) = h\left(\bar{F}(x)\right),$$

where $h$ is an increasing function of its single argument and $\bar{F}$ is homogeneous of degree one in $x$.
(ii) There exist functions $\hat{C}$ and $\bar{C}$ such that

$$C(y,w) = \hat{C}\left(y, \bar{C}(w)\right),$$

where $\hat{C}$ is increasing in $c = \bar{C}(w)$ and $\bar{C}(w)$ is homogeneous of degree one in $w$. In this case

$$\hat{C}\left(y, \bar{C}(w)\right) = h^{-1}(y)\, \bar{C}(w).$$

(iii) The indirect production function is negatively homothetic, i.e.,

$$G(w/c) = h\left(\frac{1}{\bar{C}(w/c)}\right),$$

where $h$ is increasing (as in (i)) and $1/\bar{C}(w/c)$ is homogeneous of degree minus one in $w/c$.
(iv) There exist functions $\hat{D}$ and $\bar{D}$ such that

$$D(y,x) = \hat{D}\left(y, \bar{D}(x)\right),$$

where $\hat{D}$ is increasing in its last argument and $\bar{D}$ is homogeneous of degree one in $x$. In this case

$$D(y,x) = \frac{\bar{F}(x)}{h^{-1}(y)}.$$

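Proof. (i) ⇒ (ii):

$$
\begin{aligned}
C(y,w) &= \min_{x}\left\{w \cdot x : F(x) \geq y\right\} \\
&= \min_{x}\left\{w \cdot x : h\left(\bar{F}(x)\right) \geq y\right\} \\
&= \min_{x}\left\{w \cdot x : \bar{F}(x) \geq h^{-1}(y)\right\} \\
&= \min_{x}\left\{w \cdot x : \bar{F}(x) \geq \bar{y}\right\}, \quad \bar{y} = h^{-1}(y) \\
&= \bar{y} \min_{x/\bar{y}}\left\{w \cdot (x/\bar{y}) : \bar{F}(x/\bar{y}) \geq 1\right\} \\
&= \bar{y}\, \bar{C}(w) \\
&= h^{-1}(y)\, \bar{C}(w),
\end{aligned}
$$

where $\bar{C}(w)$ denotes the unit cost function of $\bar{F}$, i.e., the minimum cost of attaining $\bar{F} \geq 1$.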

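(ii) ⇒ (iii):

$$
\begin{aligned}
G(w/c) &= \max_{y}\left\{y : C(y,w) \leq c\right\} \\
&= \max_{y}\left\{y : h^{-1}(y)\, \bar{C}(w) \leq c\right\} \\
&= \max_{y}\left\{y : h^{-1}(y)\, \bar{C}(w/c) \leq 1\right\} \\
&= \max_{y}\left\{y : h^{-1}(y) \leq \frac{1}{\bar{C}(w/c)}\right\} \\
&= \max_{y}\left\{y : y \leq h\left(\frac{1}{\bar{C}(w/c)}\right)\right\} \\
&= h\left(\frac{1}{\bar{C}(w/c)}\right).
\end{aligned}
$$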

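(iii) ⇒ (iv):

$$
\begin{aligned}
D(y,x) &= \min_{w/c}\left\{(w/c) \cdot x : G(w/c) \leq y\right\} \\
&= \min_{w/c}\left\{(w/c) \cdot x : h\left(\frac{1}{\bar{C}(w/c)}\right) \leq y\right\} \\
&= \min_{w/c}\left\{(w/c) \cdot x : \frac{1}{\bar{C}(w/c)} \leq h^{-1}(y)\right\} \\
&= \min_{w/c}\left\{(w/c) \cdot x : \frac{1}{\bar{C}(w/c)} \leq \bar{y}\right\}, \quad \bar{y} = h^{-1}(y), \\
&= \frac{1}{\bar{y}} \min_{\bar{y} w/c}\left\{\bar{y}\,(w/c) \cdot x : \frac{1}{\bar{C}(\bar{y} w/c)} \leq 1\right\} \\
&= \frac{1}{\bar{y}}\, \bar{F}(x) \\
&= \frac{\bar{F}(x)}{h^{-1}(y)}.
\end{aligned}
$$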

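(iv) ⇒ (i):

$$
\begin{aligned}
F(x) &= \max_{y}\left\{y : D(y,x) \geq 1\right\} \\
&= \max_{y}\left\{y : \frac{\bar{F}(x)}{h^{-1}(y)} \geq 1\right\} \\
&= \max_{y}\left\{y : \bar{F}(x) \geq h^{-1}(y)\right\} \\
&= \max_{y}\left\{y : h\left(\bar{F}(x)\right) \geq y\right\} \\
&= h\left(\bar{F}(x)\right).
\end{aligned}
$$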

See Blackorby et al. [3], pages 91–93. It is apparent from the theorem that
homotheticity of the production function is equivalent to separability of all input
prices from output in the cost function and equivalent to separability of all inputs
from output in the input distance function.
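The multiplicative form $C(y,w) = h^{-1}(y)\,\bar{C}(w)$ in statement (ii) can be illustrated numerically. The sketch below is a minimal check under assumed functional forms that are not taken from the chapter: $\bar{F}(x) = x_1^{a} x_2^{1-a}$ with $a = 0.3$ and $h(z) = z^2$. It minimizes cost along the isoquant by golden-section search and compares the result with the closed-form unit cost of the Cobb-Douglas function scaled by $h^{-1}(y)$. All function names are hypothetical.

```python
import math

A = 0.3  # hypothetical Cobb-Douglas share parameter

def h_inverse(y):
    # Assumed transform h(z) = z**2, so h^{-1}(y) = sqrt(y).
    return math.sqrt(y)

def unit_cost(w1, w2):
    # Closed-form unit cost function of Fbar(x) = x1**A * x2**(1 - A).
    return (w1 / A) ** A * (w2 / (1 - A)) ** (1 - A)

def numeric_cost(y, w1, w2, iters=200):
    # Minimize w.x on the isoquant Fbar(x) = h^{-1}(y) by golden-section
    # search over t = log(x1); x2 is pinned down by the isoquant.
    ybar = h_inverse(y)

    def spend(t):
        x1 = math.exp(t)
        x2 = (ybar / x1 ** A) ** (1.0 / (1 - A))
        return w1 * x1 + w2 * x2

    lo, hi = -10.0, 10.0
    g = (math.sqrt(5) - 1) / 2
    for _ in range(iters):
        a = hi - g * (hi - lo)  # left interior point
        b = lo + g * (hi - lo)  # right interior point
        if spend(a) < spend(b):
            hi = b
        else:
            lo = a
    return spend((lo + hi) / 2)

print(numeric_cost(4.0, 2.0, 3.0), h_inverse(4.0) * unit_cost(2.0, 3.0))
```

Because expenditure along the isoquant is convex in $\log x_1$ for this technology, a one-dimensional bracketing search suffices; a general technology would need a constrained optimizer instead.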
The following is a generalization of Theorem 6.

Theorem 7. Assume that the production function is continuous, nondecreasing, and quasiconcave in $x$. Then the following four statements are equivalent.

(i) The production function is homothetically separable, i.e., it has the functional structure

$$F(x^1, x^2) = \hat{F}\left(x^1, \bar{F}^2(x^2)\right),$$

where $\hat{F}$ is increasing in $y_2 = \bar{F}^2(x^2)$ and $\bar{F}^2$ is a homothetic function.
(ii) The cost function has the following functional structure

$$C(y,w) = \hat{C}\left(y, w^1, \bar{C}^2(w^2)\right),$$

where $\hat{C}$ is increasing in $c_2 = \bar{C}^2(w^2)$ and $\bar{C}^2$ is homogeneous of degree one in $w^2$.
(iii) The input distance function has the following functional structure

$$D(y,x) = \hat{D}\left(y, x^1, \bar{D}^2(x^2)\right),$$

where $\hat{D}$ is increasing in $d_2 = \bar{D}^2(x^2)$ and $\bar{D}^2$ is homogeneous of degree one in $x^2$.
(iv) The indirect production function is homothetically separable, i.e., it has the following functional structure

$$G(w/c) = \hat{G}\left(w^1/c, \bar{G}^2(w^2/c)\right),$$

where $\hat{G}$ is increasing in $y_2 = \bar{G}^2(w^2/c)$ and $\bar{G}^2$ is homogeneous of degree minus one in $w^2/c$.

The above theorem is proved in Blackorby et al. ( [3], pages 94–97).

Two Sector Applications

It has already been noted in the Introduction that various functional structures may be used to model intermediate inputs in production. Another application involves the study of real value added. Suppose, for simplicity, that the production function may be written as

$$y = F(m, K, L) = \hat{F}\left(m, \tilde{F}(K, L)\right),$$

where $m$ is the input of raw materials, $K$ is the capital input, and $L$ is the labor input. The above functional structure embodies the assumption that capital and labor are separable from materials. Then $\tilde{F}(K, L)$ is interpreted as real value added.
This is the starting point of a paper by Arrow [2]. He assumes that $F$ is homogeneous of degree one and concave in its arguments. He proves that both $\hat{F}$ and $\tilde{F}$ can be chosen to be homogeneous of degree one and concave. He then goes on to illustrate two econometric models that could be used to estimate this model.
One can also use functional structure results to study the aggregation of inputs. An early example of this appears in Solow [25]. He begins with a simple production function

$$Q = F(L, C_1, C_2),$$

where $L$ is labor and $C_1$ and $C_2$ are two types of capital. He then asks the question: When can we write the production function as

$$Q = F(L, C_1, C_2) = H(L, K),$$

where

$$K = \phi(C_1, C_2)$$

is a measure of aggregate capital? The topic of capital aggregation is further pursued in Fisher [11–13], Stigum [28], and Blackorby and Schworm [5, 6].

Functional Structure with More Than Two Sectors

Theorem 1 can be generalized to the case of several separable subsets of inputs. Suppose the set of input indices, $I = \{1, 2, \ldots, N\}$, is partitioned into $R + 1$ subsets and let this partition be given by

$$\left\{I^0, I^1, \ldots, I^R\right\}.$$

Corresponding to this partition, the input vector may be rewritten as

$$x = \left(x^0, x^1, \ldots, x^R\right),$$

where input indices have been renumbered appropriately. The following theorem formally states the desired generalization.

Theorem 8 ([7, 14]). Suppose the production function is continuous, nondecreasing, and quasiconcave in $x$. For the partition $\{I^0, I^1, \ldots, I^R\}$, group $r$, $r = 1, \ldots, R$, is separable from all of the other inputs in $I$ if and only if there exist functions, $\hat{F}, F^1, \ldots, F^R$, which are continuous, nondecreasing, and quasiconcave in $x$ and

$$F(x) = \hat{F}\left(x^0, F^1(x^1), \ldots, F^R(x^R)\right), \tag{23}$$

where the macro function, $\hat{F}$, is increasing in its last $R$ arguments,

$$y_r = F^r(x^r), \quad r = 1, \ldots, R.$$

The proof of Theorem 8 follows from a repeated application of Theorem 1. (An example of this functional structure was given in (9), the production of personal computers.) The subset of inputs, $I^0$, is often referred to as a free sector, i.e., a nonseparable sector. Since (23) is a direct representation of the technology (as opposed to the indirect production function), we refer to the condition in (23) as direct separability.
For $R = 3$, the functional structure in (23) has the following schematic representation (Fig. 2).

Fig. 2 Several Separable Sectors. [Schematic: each input vector $x^r$ enters the aggregator $F^r$ to produce $y_r$, $r = 1, 2, 3$; then $x^0$, $y_1$, $y_2$, and $y_3$ enter the macro function to produce $y$.]

Again, one can think of this as a two-period production process. In period 1 inputs
x 1 , x 2 , and x 3 are used to produce intermediate inputs y1 , y2 , and y3 , respectively.
In period 2, x 0 , y1 , y2 , and y3 are used to produce final output, y. This is another
example of a network production model [9].
A special case of the above arises when $I^0$ is empty and, thus, the relevant partition is written as $\{I^1, \ldots, I^R\}$. For that partition each of the $R$ subsets is separable from all of the other inputs if and only if there exist continuous functions, $\hat{F}, F^1, \ldots, F^R$, such that

$$F(x) = \hat{F}\left(F^1(x^1), \ldots, F^R(x^R)\right),$$

where the macro function, $\hat{F}$, is increasing in its $R$ arguments. In this case it is often said that the production function is “weakly separable”.
Again, the aggregator functions and the macro functions will inherit properties of the master function. See pages 108–110 in Blackorby et al. [3].

Separability of Dual Representations of Technology

Separability results will be presented for the partition $\{I^0, I^1, \ldots, I^r, \ldots, I^R\}$ where $I^0$ is a free sector. We first consider separability of the cost function.

Theorem 9. Assume that the cost function is continuous, nondecreasing, homogeneous of degree one, and concave in $w$, jointly continuous in $(y, w)$, and increasing in $y$. Then the cost function is separable in the partition $\{I^0, I^1, \ldots, I^R\}$ if and only if it has the functional structure given by

$$C(y,w) = \hat{C}\left(y, w^0, C^1(y, w^1), \ldots, C^R(y, w^R)\right), \tag{24}$$

where $\hat{C}$ is increasing in its last $R$ arguments,

$$c_r = C^r(y, w^r), \quad r = 1, \ldots, R.$$

In addition, the macro function, $\hat{C}$, is continuous and increasing in $(c_1, \ldots, c_R)$, and each aggregator function, $C^r$, is continuous, nondecreasing, concave, and homogeneous of degree one in $w^r$, $r = 1, \ldots, R$. Moreover, each $C^r$ is jointly continuous in $(y, w^r)$ and increasing in $y$. (See Corollary 4.1.4 on page 112 in Blackorby et al. [3].)
If the input distance function is increasing in $x$ and twice differentiable, then, according to the Leontief-Sono definition of separability, sector $r$ is separable from all of the other sectors if and only if

$$\frac{\partial}{\partial x_k}\left(\frac{\partial D(y,x)/\partial x_i}{\partial D(y,x)/\partial x_j}\right) = 0,$$

for all $i, j \in I^r$ and for all $k \notin I^r$. A representation theorem that is analogous to Theorem 8 can be proved for the input distance function.

Theorem 10. Assume that the input distance function is continuous, nondecreasing, concave, and homogeneous of degree one in $x$, jointly continuous in $(y, x)$, and decreasing in $y$. Then the input distance function is separable in the partition $\{I^0, I^1, \ldots, I^R\}$ if and only if it has the following functional structure:

$$D(y,x) = \hat{D}\left(y, x^0, D^1(y, x^1), \ldots, D^R(y, x^R)\right), \tag{25}$$

where $\hat{D}$ is increasing in its last $R$ arguments,

$$d_r = D^r(y, x^r), \quad r = 1, \ldots, R.$$

In addition, the macro function, $\hat{D}$, is continuous and increasing in $(d_1, \ldots, d_R)$, and each aggregator function, $D^r$, is continuous, nondecreasing, concave, and homogeneous of degree one in $x^r$, $r = 1, \ldots, R$. Moreover, each $D^r$ is jointly continuous in $(y, x^r)$ and decreasing in $y$.
Is the functional structure depicted in equation (23) for the production function equivalent to the functional structure depicted in (25) for the input distance function? In general the answer to this question is no. One way to see why this is so is to note that the aggregator functions, $F^r$, $r = 1, \ldots, R$, in (23) depend only on $x^r$, $r = 1, \ldots, R$, while the functions $D^r$, $r = 1, \ldots, R$, in (25) depend on $x^r$ and $y$, $r = 1, \ldots, R$. Another way to see this is to calculate the marginal rate of technical substitution between two inputs in sector $r$ by applying the implicit function theorem to the input distance function. We get

$$
\begin{aligned}
\frac{\partial y/\partial x_i}{\partial y/\partial x_j}
&= \frac{-\dfrac{\partial D(y,x)/\partial x_i}{\partial D(y,x)/\partial y}}{-\dfrac{\partial D(y,x)/\partial x_j}{\partial D(y,x)/\partial y}}, && i, j \in I^r \\
&= \frac{\partial D(y,x)/\partial x_i}{\partial D(y,x)/\partial x_j}, && i, j \in I^r \\
&= \frac{\partial D^r(y, x^r)/\partial x_i}{\partial D^r(y, x^r)/\partial x_j}, && i, j \in I^r \\
&= \frac{\partial D^r(F(x), x^r)/\partial x_i}{\partial D^r(F(x), x^r)/\partial x_j}, && i, j \in I^r,
\end{aligned}
$$

for $r = 1, \ldots, R$. The marginal rate of technical substitution depends, in general, on the entire input vector $x$ and thus fails to satisfy the Leontief-Sono condition in the production function. We conclude that a production function that is separable in the partition $\{I^0, I^1, \ldots, I^R\}$ will, in general, correspond to a technology that is different from one that corresponds to an input distance function that is separable in the same partition.
If the indirect production function is decreasing in $w/c$ and twice differentiable, then, according to the Leontief-Sono definition of separability, sector $r$ is separable from all of the other sectors if

$$\frac{\partial}{\partial (w_k/c)}\left(\frac{\partial G(w/c)/\partial (w_i/c)}{\partial G(w/c)/\partial (w_j/c)}\right) = 0,$$

for all $i, j \in I^r$ and for all $k \notin I^r$. A representation theorem that is analogous to Theorem 8 can be proved for the indirect production function.

Theorem 11. Assume that the indirect production function is continuous, nonincreasing, and quasiconvex in $(w/c)$. Then the indirect production function is separable in the partition $\{I^1, \ldots, I^R\}$ if and only if it has the functional structure given by

$$G(w/c) = \hat{G}\left(G^1(w^1/c), \ldots, G^R(w^R/c)\right) \tag{26}$$

where $\hat{G}$ is increasing in its $R$ arguments.

In addition, the macro function $\hat{G}$ is continuous and quasiconvex, and the aggregator functions $G^1, \ldots, G^R$ are continuous, nonincreasing, and quasiconcave. We refer to the condition in (26) as indirect separability.
Since the cost function can be written as an implicit representation of the indirect production function, one can apply the implicit function theorem to

$$C(y, w/c) = 1$$

to calculate the derivatives of $G$. Thus, one can show that separability of the cost function in the partition $\{I^0, I^1, \ldots, I^R\}$ does not necessarily imply separability of the indirect production function in the partition $\{I^1, \ldots, I^R\}$. In particular,

$$
\begin{aligned}
\frac{\partial y/\partial (w_i/c)}{\partial y/\partial (w_j/c)}
&= \frac{-\dfrac{\partial C(y, w/c)/\partial (w_i/c)}{\partial C(y, w/c)/\partial y}}{-\dfrac{\partial C(y, w/c)/\partial (w_j/c)}{\partial C(y, w/c)/\partial y}}, && i, j \in I^r \\
&= \frac{\partial C(y, w/c)/\partial (w_i/c)}{\partial C(y, w/c)/\partial (w_j/c)}, && i, j \in I^r \\
&= \frac{\partial C^r(y, w^r/c)/\partial (w_i/c)}{\partial C^r(y, w^r/c)/\partial (w_j/c)}, && i, j \in I^r \\
&= \frac{\partial C^r(G(w/c), w^r/c)/\partial (w_i/c)}{\partial C^r(G(w/c), w^r/c)/\partial (w_j/c)}, && i, j \in I^r,
\end{aligned}
$$

for $r = 1, \ldots, R$. Thus ratios of partial derivatives of $G$ with respect to normalized prices in group $r$ will, in general, depend on the entire vector of normalized prices, $w/c$, and hence fail to satisfy the Leontief-Sono conditions for separability of the indirect production function.
The production function, the cost function, the input distance function, and the
indirect production function form six possible pairs of technology representations.
Only one pair exhibits the property that separability in a given partition is equivalent
for the two representations. We can state:

Theorem 12. Assume that the production function is continuous, nondecreasing, and quasiconcave in $x$. Then each sector $r$, $r = 1, \ldots, R$, is separable in the input distance function from its complement in $I = \{1, \ldots, N\}$, i.e.,

$$D(y,x) = \hat{D}\left(y, x^0, D^1(y, x^1), \ldots, D^R(y, x^R)\right)$$

if and only if each sector $r$, $r = 1, \ldots, R$, is separable in the cost function from its complement in $I = \{1, \ldots, N\}$, i.e.,

$$C(y,w) = \hat{C}\left(y, w^0, C^1(y, w^1), \ldots, C^R(y, w^R)\right).$$

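Here is a sketch of the proof of the “if” part of this statement.

$$
\begin{aligned}
D(y,x) &= \min_{w}\left\{w \cdot x : C(y,w) \geq 1\right\} && (27) \\
&= \min_{w}\left\{w^0 \cdot x^0 + \sum_{r=1}^{R} w^r \cdot x^r : \hat{C}\left(y, w^0, C^1(y,w^1), \ldots, C^R(y,w^R)\right) \geq 1\right\} && (28) \\
&= \min_{w^0, c_1, \ldots, c_R}\left\{w^0 \cdot x^0 + \sum_{r=1}^{R} \min_{w^r}\left\{w^r \cdot x^r : C^r(y, w^r) \geq c_r\right\} : \hat{C}\left(y, w^0, c_1, \ldots, c_R\right) \geq 1\right\} && (29) \\
&= \min_{w^0, c_1, \ldots, c_R}\left\{w^0 \cdot x^0 + \sum_{r=1}^{R} c_r \min_{w^r/c_r}\left\{(w^r/c_r) \cdot x^r : C^r(y, w^r/c_r) \geq 1\right\} : \hat{C}\left(y, w^0, c_1, \ldots, c_R\right) \geq 1\right\} && (30) \\
&= \min_{w^0, c_1, \ldots, c_R}\left\{w^0 \cdot x^0 + \sum_{r=1}^{R} c_r\, D^r(y, x^r) : \hat{C}\left(y, w^0, c_1, \ldots, c_R\right) \geq 1\right\} && (31) \\
&= \hat{D}\left(y, x^0, D^1(y, x^1), \ldots, D^R(y, x^R)\right). && (32)
\end{aligned}
$$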

The complete proof of this statement can be found in Gorman [15, 17], McFadden [21], and Blackorby et al. ([3], Theorem 4.2). All of these proofs exploit the fact that each $D^r$ is homogeneous of degree one in $x^r$, $r = 1, \ldots, R$, and that each $C^r$ is homogeneous of degree one in $w^r$, $r = 1, \ldots, R$. (Note what happens when going from equation (29) to equation (30).) This suggests, for example, that if the aggregator functions $F^r$, $r = 1, \ldots, R$, are homogeneous of degree one in $x^r$, then the aggregator functions $G^r$, $r = 1, \ldots, R$, will be homogeneous of degree minus one in $w^r/c$ and, moreover, that direct and indirect separability are equivalent. We shall see that this is so in the next subsection.

Homothetic Separability

The production technology satisfies homothetic separability in the partition

$$\left\{I^0, I^1, \ldots, I^R\right\}$$

if the production function can be written as in (23) and each of the aggregator functions is homothetic. In other words

$$F(x) = \hat{F}\left(x^0, F^1(x^1), \ldots, F^R(x^R)\right), \tag{33}$$

where $\hat{F}$ is continuous and increasing in its last $R$ arguments and each $F^r$ is homothetic. If $F^r$ is homothetic, then there exists a continuous, increasing function $h^r$ and a function $\bar{F}^r$ that is homogeneous of degree one in $x^r$ such that $F^r(x^r) = h^r\left(\bar{F}^r(x^r)\right)$, $r = 1, \ldots, R$. Then

$$
\begin{aligned}
F(x) &= \hat{F}\left(x^0, h^1\left(\bar{F}^1(x^1)\right), \ldots, h^R\left(\bar{F}^R(x^R)\right)\right) \\
&= \tilde{F}\left(x^0, \bar{F}^1(x^1), \ldots, \bar{F}^R(x^R)\right),
\end{aligned}
$$

where each $h^r$ is, in effect, absorbed into $\hat{F}$ to yield $\tilde{F}$. Since each $h^r$ is continuous and increasing, $\tilde{F}$ is continuous and increasing in its last $R$ arguments. Thus, whenever homothetic separability is assumed, we may also assume, without loss of generality, that the aggregator functions have been normalized to be homogeneous of degree one in (33). If we think of $F^r$ as the subproduction function for some intermediate good, $y_r = F^r(x^r)$, then the isoquants for $y_r$ have been monotonically renumbered in such a way that $F^r$ is homogeneous of degree one.

Theorem 13. Assume that the production function is continuous, nondecreasing, and quasiconcave in $x$. Let the output variable $y$ be indexed by 0. Then the following four statements are equivalent.

1. The production function is homothetically separable in the partition

$$\left\{I^0, I^1, \ldots, I^R\right\},$$

i.e.,

$$F(x) = \hat{F}\left(x^0, F^1(x^1), \ldots, F^R(x^R)\right),$$

where $\hat{F}$ is increasing in its last $R$ arguments and each $F^r$, $r = 1, \ldots, R$, is homothetic.
2. The cost function is separable in the partition

$$\left\{\{0\} \cup I^0, I^1, \ldots, I^R\right\},$$

i.e.,

$$C(y,w) = \hat{C}\left(y, w^0, \Pi^1(w^1), \ldots, \Pi^R(w^R)\right),$$

where $\hat{C}$ is increasing in its last $R$ arguments and each $\Pi^r$, $r = 1, \ldots, R$, is homogeneous of degree one.
3. The input distance function is separable in the partition

$$\left\{\{0\} \cup I^0, I^1, \ldots, I^R\right\},$$

i.e.,

$$D(y,x) = \hat{D}\left(y, x^0, X^1(x^1), \ldots, X^R(x^R)\right),$$

where $\hat{D}$ is increasing in its last $R$ arguments and each $X^r$, $r = 1, \ldots, R$, is homogeneous of degree one.
4. The indirect production function is homothetically separable in the partition

$$\left\{I^0, I^1, \ldots, I^R\right\},$$

i.e.,

$$G(w/c) = \hat{G}\left(w^0/c, G^1(w^1/c), \ldots, G^R(w^R/c)\right),$$

where $\hat{G}$ is increasing in its last $R$ arguments and each $G^r$, $r = 1, \ldots, R$, is negatively homothetic in $w^r/c$.

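A sketch of the proof is provided here.

1 ⇒ 2: Assume, without loss of generality, that each $F^r$ is homogeneous of degree one. The $r$th intermediate product is $y_r = F^r(x^r)$, and it is produced under constant returns to scale. As is well known, the corresponding cost function for sector $r$,

$$C^r(y_r, w^r) = \min_{x^r}\left\{w^r \cdot x^r : F^r(x^r) \geq y_r\right\},$$

has the form

$$
\begin{aligned}
C^r(y_r, w^r) &= C^r(1, w^r)\, y_r && (34) \\
&= \Pi^r(w^r)\, y_r, && (35)
\end{aligned}
$$

where $\Pi^r(w^r)$ is the constant marginal and average cost of producing $y_r$. Now we derive the form of the overall cost function.

$$
\begin{aligned}
C(y,w) &= \min_{x}\left\{w \cdot x : F(x) \geq y\right\} \\
&= \min_{x^0, \ldots, x^R}\left\{w^0 \cdot x^0 + \sum_{r=1}^{R} w^r \cdot x^r : \hat{F}\left(x^0, F^1(x^1), \ldots, F^R(x^R)\right) \geq y\right\} \\
&= \min_{x^0, y_1, \ldots, y_R}\left\{w^0 \cdot x^0 + \sum_{r=1}^{R} \min_{x^r}\left\{w^r \cdot x^r : F^r(x^r) \geq y_r\right\} : \hat{F}\left(x^0, y_1, \ldots, y_R\right) \geq y\right\} \\
&= \min_{x^0, y_1, \ldots, y_R}\left\{w^0 \cdot x^0 + \sum_{r=1}^{R} \Pi^r(w^r)\, y_r : \hat{F}\left(x^0, y_1, \ldots, y_R\right) \geq y\right\} \quad \text{(using (35))} \\
&= \hat{C}\left(y, w^0, \Pi^1(w^1), \ldots, \Pi^R(w^R)\right).
\end{aligned}
$$
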

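2 ⇒ 3: This is easily accomplished by substituting equation (35) into the sketched proof of Theorem 12 that separability of $C$ implies separability of $D$.
3 ⇒ 4: This will consist of two steps. Step 1 is to show that 3 ⇒ 2, and Step 2 is to show that 2 ⇒ 4. To prove Step 1, we derive the cost function from the input distance function as follows.

$$
\begin{aligned}
C(y,w) &= \min_{x}\left\{w \cdot x : D(y,x) \geq 1\right\} \\
&= \min_{x}\left\{w^0 \cdot x^0 + \sum_{r=1}^{R} w^r \cdot x^r : \hat{D}\left(y, x^0, X^1(x^1), \ldots, X^R(x^R)\right) \geq 1\right\} \\
&= \min_{x^0, d_1, \ldots, d_R}\left\{w^0 \cdot x^0 + \sum_{r=1}^{R} \min_{x^r}\left\{w^r \cdot x^r : X^r(x^r) \geq d_r\right\} : \hat{D}\left(y, x^0, d_1, \ldots, d_R\right) \geq 1\right\} \\
&= \min_{x^0, d_1, \ldots, d_R}\left\{w^0 \cdot x^0 + \sum_{r=1}^{R} d_r \min_{x^r/d_r}\left\{w^r \cdot (x^r/d_r) : X^r(x^r/d_r) \geq 1\right\} : \hat{D}\left(y, x^0, d_1, \ldots, d_R\right) \geq 1\right\} \\
&= \min_{x^0, d_1, \ldots, d_R}\left\{w^0 \cdot x^0 + \sum_{r=1}^{R} d_r\, \Pi^r(w^r) : \hat{D}\left(y, x^0, d_1, \ldots, d_R\right) \geq 1\right\} \\
&= \hat{C}\left(y, w^0, \Pi^1(w^1), \ldots, \Pi^R(w^R)\right).
\end{aligned}
$$

Thus 3 ⇒ 2. Now we show that 2 ⇒ 4. Since the indirect production function gives the maximal output attainable when normalized cost does not exceed one,

$$
\begin{aligned}
G(w/c) &= \max_{y}\left\{y : C(y, w/c) \leq 1\right\} \\
&= \max_{y}\left\{y : \hat{C}\left(y, w^0/c, \Pi^1(w^1/c), \ldots, \Pi^R(w^R/c)\right) \leq 1\right\} \\
&= \hat{G}\left(w^0/c, \Pi^1(w^1/c), \ldots, \Pi^R(w^R/c)\right).
\end{aligned}
$$

We can now conclude that 3 ⇒ 4.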

4 ⇒ 1: Note that

$$
\begin{aligned}
F(x) &= \min_{w/c}\left\{G(w/c) : (w/c) \cdot x \leq 1\right\} \\
&= -\max_{w/c}\left\{-G(w/c) : (w/c) \cdot x \leq 1\right\}.
\end{aligned}
$$

We see that the problem

$$\max_{w/c}\left\{-G(w/c) : (w/c) \cdot x \leq 1\right\}$$

is mathematically equivalent to the problem

$$\max_{x}\left\{F(x) : (w/c) \cdot x \leq 1\right\},$$

with the roles of $x$ and $w/c$ reversed. Since we have shown that 1 ⇒ 4, then this mathematical equivalence implies that 4 ⇒ 1.

Application
Theorem 13 suggests a straightforward way that the firm can decentralize its decision-making. Each firm sector has a constant-returns-to-scale production function given by $y_r = F^r(x^r)$, $r = 1, \ldots, R$. A cost function for sector $r$ is then derived and is given by $C^r(y_r, w^r) = C^r(1, w^r)\, y_r = \Pi^r(w^r)\, y_r$, $r = 1, \ldots, R$. This price of the intermediate good, $\Pi^r(w^r)$, is then reported to the manager of the firm who now solves

$$\min_{x^0, y_1, \ldots, y_R}\left\{w^0 \cdot x^0 + \sum_{r=1}^{R} \Pi^r(w^r)\, y_r : \hat{F}\left(x^0, y_1, \ldots, y_R\right) \geq y\right\}$$

to derive the firm's demand for the inputs in $I^0$ and for the intermediate inputs, $(y_1, \ldots, y_R)$. The resulting cost function is

$$C(y,w) = \hat{C}\left(y, w^0, \Pi^1(w^1), \ldots, \Pi^R(w^R)\right),$$

which can be differentiated to yield all of the other input demands via Shephard's lemma.
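The two-stage procedure can be illustrated with a small closed-form example. The sketch below is not from the chapter: it assumes one free input $x_0$ and one separable sector with inputs $(x_1, x_2)$, a Cobb-Douglas sector aggregator with equal shares, and a Cobb-Douglas macro function with equal shares. All function names are hypothetical. It compares the cost obtained by first reporting the sector's unit cost and then solving the manager's problem with the cost of the one-shot problem over all three inputs.

```python
def cobb_douglas_cost(y, prices, shares):
    # Cost function of a constant-returns Cobb-Douglas technology
    # y = prod(x_i ** a_i) with the shares a_i summing to one.
    c = y
    for p, a in zip(prices, shares):
        c *= (p / a) ** a
    return c

def two_stage_cost(y, w0, w_sector):
    # Stage 1: the sector manager reports the unit cost (price) of y_1.
    p1 = cobb_douglas_cost(1.0, w_sector, [0.5, 0.5])
    # Stage 2: the firm manager treats y_1 as an input priced at p1.
    return cobb_douglas_cost(y, [w0, p1], [0.5, 0.5])

def direct_cost(y, w0, w_sector):
    # One-shot problem for F(x) = x0**0.5 * (x1**0.5 * x2**0.5)**0.5,
    # i.e. Cobb-Douglas with shares (0.5, 0.25, 0.25).
    prices = [w0, w_sector[0], w_sector[1]]
    return cobb_douglas_cost(y, prices, [0.5, 0.25, 0.25])

print(two_stage_cost(3.0, 2.0, [1.0, 4.0]), direct_cost(3.0, 2.0, [1.0, 4.0]))
```

The two numbers coincide because the sector aggregator is homogeneous of degree one, so its unit cost acts as a price for the intermediate good that does not depend on how much of it is produced.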

Additive Functional Structure

Suppose the set of variable indices, $I = \{1, 2, \ldots, N\}$, is partitioned into $R$ subsets or sectors. This partition is denoted by $\{I^1, I^2, \ldots, I^R\}$. In this section we will examine the conditions under which the production function has the following functional structure:

$$F(x) = \hat{F}\left(\sum_{r=1}^{R} F^r(x^r)\right),$$

where $\hat{F}$ is an increasing function of its single argument. The separability condition that yields this additive structure is called complete separability. A production function is completely separable in the partition $\{I^1, I^2, \ldots, I^R\}$ if every union of sectors is separable from all of the inputs in the sectors outside the union.
We start with a result derived from the theory of functional equations [1] that was effectively used in Gorman [14]. The result is that if there are two separable subsets of $I$ in $F$ that have a nonempty intersection (i.e., are overlapping), then $F$ must have an additive structure. Let us state this more precisely.
Suppose there are four sectors, $I^1$, $I^2$, $I^3$, and $I^4$, that comprise a partition of $I$. Let $I^c$ be the union of sectors 2, 3, and 4, i.e.,

$$I^c = I^2 \cup I^3 \cup I^4.$$

In other words, $I^c$ is the complement of $I^1$ in $I$. Let $x^c = (x^2, x^3, x^4)$. In order to avoid certain degenerate cases, we define essentiality and strict essentiality as follows.
The sector $I^1$ is said to be essential if, for every $\hat{x}^1 \in \mathbb{R}^{n_1}_+$, the set

$$\left\{x^1 : F(x^1, x^c) \neq F(\hat{x}^1, x^c)\right\}$$

is nonempty for at least one point $x^c \in \mathbb{R}^{n_2}_+ \times \mathbb{R}^{n_3}_+ \times \mathbb{R}^{n_4}_+$, where $n_2$, $n_3$, and $n_4$ are the number of elements of $I^2$, $I^3$, and $I^4$, respectively. Sector $I^1$ is said to be strictly essential if it is essential for all points $x^c \in \mathbb{R}^{n_2}_+ \times \mathbb{R}^{n_3}_+ \times \mathbb{R}^{n_4}_+$. Essentiality and strict essentiality of each of the other three sectors are defined in the same way.
Example. Consider the production structure:

$$F(x) = \min\left\{\min\{x_1, x_2\}, F^c(x^c)\right\},$$

where $I^1 = \{1, 2\}$. Then $I^1$ is essential since $F$ is increasing in its first argument, $\min\{x_1, x_2\}$, for points $x^c$ such that $\min\{x_1, x_2\} < F^c(x^c)$. However, $F(x^1, x^c) = F(\hat{x}^1, x^c)$ for all $x^c$ such that $\min\{x_1, x_2\} \geq F^c(x^c)$. Thus $I^1$ is not strictly essential.
Let $I^r = I^1 \cup I^2$ and $I^s = I^2 \cup I^3$. Set differences are given by $I^r - I^s = \{z : z \in I^r, z \notin I^s\} = I^1$ and $I^s - I^r = I^3$. The sectors $I^r$ and $I^s$ are overlapping separable sectors.

Theorem 14 ([14]). Assume that the production function is continuous and that all four sectors $I^1$, $I^2$, $I^3$, and $I^4$ are essential. Consider the following three conditions.

(a) $I^r$ and $I^s$ are nonempty and separable from their complements, $I^3 \cup I^4$ and $I^1 \cup I^4$, respectively; $I^r \cap I^s$, $I^r - I^s$ and $I^s - I^r$ are nonempty; and $I^s - I^r$ is strictly essential.
(b) Each of the following sets is separable from its complement:

$$I^r \cap I^s, \quad I^r - I^s, \quad I^s - I^r \quad \text{and} \quad (I^r - I^s) \cup (I^s - I^r).$$

(c) There exist continuous functions, $\hat{F}, F^1, F^2, F^3$, such that

$$F(x) = \hat{F}\left(F^1(x^1) + F^2(x^2) + F^3(x^3), x^4\right),$$

where $\hat{F}$ is increasing in its first argument.

Then (a) ⇒ (b), (a) ⇒ (c), (c) ⇒ (a) and (c) ⇒ (b).

The theorem can be extended to $R$ sectors.

Theorem 15 ([14]). Assume that the production function $F$ is continuous. Then $F$ is completely separable in the partition $\{I^1, I^2, \ldots, I^R\}$, $R \geq 3$, if and only if there exist functions $\hat{F}, F^1, \ldots, F^R$ such that

$$F(x) = \hat{F}\left(\sum_{r=1}^{R} F^r(x^r)\right),$$

where $\hat{F}$ is an increasing function of its single argument.

When the production function is twice differentiable with positive first partial derivatives, complete separability can be defined (locally) as: $F$ is completely separable in the partition $\{I^1, I^2, \ldots, I^R\}$, $R \geq 3$, if

$$\frac{\partial}{\partial x_k}\left(\frac{\partial F(x)/\partial x_i}{\partial F(x)/\partial x_j}\right) = 0,$$

for all $i \in I^r$, $j \in I^s$, $k \in I^t$, $r, s, t = 1, \ldots, R$, $r \neq t$, $s \neq t$. An analysis of the differentiable case can be found in Leontief [19, 20].
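As a worked illustration of the local condition, consider a CES production function with three single-input sectors. The example and the parameter restriction $0 < \rho \leq 1$ are assumptions made here for exposition and are not drawn from the chapter.

```latex
% Assumed example: F(x) = (x_1^\rho + x_2^\rho + x_3^\rho)^{1/\rho}, 0 < \rho \le 1,
% which has the additive form with \hat{F}(z) = z^{1/\rho} and F^r(x_r) = x_r^\rho.
\begin{aligned}
\frac{\partial F(x)}{\partial x_i}
  &= \left(x_1^\rho + x_2^\rho + x_3^\rho\right)^{\frac{1-\rho}{\rho}} x_i^{\rho - 1}
   = F(x)^{1-\rho}\, x_i^{\rho-1}, \\
\frac{\partial F(x)/\partial x_i}{\partial F(x)/\partial x_j}
  &= \left(\frac{x_i}{x_j}\right)^{\rho - 1}, \\
\frac{\partial}{\partial x_k}\left(\frac{\partial F(x)/\partial x_i}{\partial F(x)/\partial x_j}\right)
  &= 0 \quad \text{for } k \neq i,\ k \neq j.
\end{aligned}
```

The common factor $F(x)^{1-\rho}$ cancels in the ratio, which is the mechanism by which any function of the additive form satisfies the condition.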
Theorems 14 and 15 are valid when there are $R \geq 3$ sectors. What can be said about the case in which there are only two sectors ($R = 2$)? Put another way, do there exist separability conditions that yield the following functional structure

$$F(x^1, x^2) = \hat{F}\left(F^1(x^1) + F^2(x^2)\right)? \tag{36}$$

The short answer to this question is no. While it is true that the functional structure in (36) implies separability in the partition $\{I^1, I^2\}$, the converse is not true since separability in the partition $\{I^1, I^2\}$ only implies that

$$F(x^1, x^2) = \hat{F}\left(F^1(x^1), F^2(x^2)\right).$$

Thus, alternative conditions are needed to provide an equivalent characterization of (36).
In the special case for which $I^1$ consists of a single variable index, Sono [26, 27] provided a solution to this problem using a condition called independence. Blackorby et al. [3] extended Sono's proof to the case for which $I^1$ may include two or more variable indices. We will summarize their result here.
First, a definition. Assume that the production function is twice continuously differentiable. The sector $I^1$ is independent of $I^2$ if and only if there exist functions $\psi_{ji}$, $i, j \in I^1$, such that

$$\frac{\partial}{\partial x_i} \ln\left(\frac{\partial F(x)/\partial x_k}{\partial F(x)/\partial x_j}\right) = \frac{\partial}{\partial x_i} \ln\left(\frac{\partial F(x)/\partial x_\ell}{\partial F(x)/\partial x_j}\right) = \psi_{ji}(x^1),$$

for all $i, j \in I^1$ and for all $k, \ell \in I^2$.

Theorem 16. Assume that the production function, $F$, is twice continuously differentiable. Then $I^1$ is independent of $I^2$ and is separable from $I^2$ if and only if there exist functions $\hat{F}$, $F^1$, and $F^2$ such that

$$F(x) = \hat{F}\left(F^1(x^1) + F^2(x^2)\right).$$

For a proof and more details, see Blackorby et al. ([3], pp. 159–165).

Recursive Functional Structure

When a production process takes place over several time periods, it may be
appropriate to model this process with an intertemporal production function. The
input vector may be written as $x = (x^1, x^2, \ldots, x^t, \ldots, x^T)$, where $x^t$ is the
input vector used at time $t$ and production takes place over $T$ time periods. Final
production is written as

\[
y = F(x^1, x^2, \ldots, x^t, \ldots, x^T).
\]

The intertemporal cost function has the usual definition:

\[
C(y, w^1, \ldots, w^t, \ldots, w^T)
= \min_x \left\{ \sum_{t=1}^{T} w^t \cdot x^t : F(x^1, x^2, \ldots, x^t, \ldots, x^T) \geq y \right\},
\]

where $(w^1, \ldots, w^t, \ldots, w^T)$ is a vector of time-discounted input prices.

If we are willing to impose separability restrictions on the production function,
the cost minimization problem can be broken down into a series of smaller steps.
Such separability assumptions might be justified for a production process like
the “building a house” example given earlier. In the general case, we have $T$
intermediate steps in the production process. We get the following system of
intermediate production functions:

\[
\begin{aligned}
y_1 &= F^1(x^1) \\
y_2 &= F^2(y_1, x^2) \\
&\;\;\vdots \\
y_t &= F^t(y_{t-1}, x^t) \\
&\;\;\vdots \\
y_T &= F(x) = F^T(y_{T-1}, x^T).
\end{aligned} \tag{37}
\]

This functional structure is equivalent to the condition that the inputs in
$I^1, \ldots, I^{t-1}$ are separable from the inputs in $I^t, \ldots, I^T$, $t = 2, \ldots, T$. These
separability assumptions are collectively referred to as complete recursivity. Also
note that, since $y_T$ is the output of the final step, we have $y = y_T$.
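The recursion in (37) amounts to folding the stage functions over the sequence of period inputs. The sketch below shows that computational reading only; the three stage technologies and the input values are hypothetical and are not taken from the text.

```python
import math

def recursive_output(stage_functions, stage_inputs):
    """Evaluate y_T by chaining y_1 = F^1(x^1) and y_t = F^t(y_{t-1}, x^t)."""
    first, *rest = stage_functions
    y = first(stage_inputs[0])
    for F_t, x_t in zip(rest, stage_inputs[1:]):
        y = F_t(y, x_t)  # each stage sees earlier inputs only through y_{t-1}
    return y

# Hypothetical stage technologies (illustrative assumptions).
def F1(x1):
    return math.sqrt(x1[0] * x1[1])

def F2(y1, x2):
    return math.sqrt(y1 * x2[0])

def F3(y2, x3):
    return y2 + 0.5 * x3[0]

y = recursive_output([F1, F2, F3], [(4.0, 9.0), (6.0,), (2.0,)])
print(y)
```

Because stage $t$ depends on earlier inputs only through the scalar $y_{t-1}$, changing the mix of $x^1$ while holding $y_1$ fixed leaves every later stage unaffected, which is the separability property stated above.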
It will be convenient to refer the reader to certain proofs in Blackorby et al. ([3],
Chapter 6). There is a slight, but easily surmountable, problem here. Their results
pertain to the system of subutility functions given by

\[
\begin{aligned}
u_R &= U^R(x^R) \\
u_{R-1} &= U^{R-1}(u_R, x^{R-1}) \\
&\;\;\vdots \\
u_r &= U^r(u_{r+1}, x^r) \\
&\;\;\vdots \\
u_1 &= U(x) = U^1(u_2, x^1).
\end{aligned}
\]

This looks just like the production model above except that the numbering of sectors
is reversed. This is because in the intertemporal utility model, a useful
assumption is that variables in $I^r, I^{r+1}, \ldots, I^R$ are separable from variables in
$I^1, I^2, \ldots, I^{r-1}$, $r = 2, \ldots, R$. (The assumption is useful because it paves
the way to consistent intertemporal decision-making. For a discussion of this, see
Chapter 10 in Blackorby et al. [3].) Nevertheless, we can still appeal to their results
for intertemporal utility functions as long as we mentally renumber the sectors.
Complete recursivity of the indirect production function is characterized in an
analogous fashion (Blackorby et al. [3], p. 225). However, complete recursivity of
the production function is not generally equivalent to complete recursivity of the
indirect production function.
Complete recursivity of the cost function has the functional structure of the
following system of functions:

\[
\begin{aligned}
c_1 &= C^1(y, w^1) \\
c_2 &= C^2(y, c_1, w^2) \\
&\;\;\vdots \\
c_t &= C^t(y, c_{t-1}, w^t) \\
&\;\;\vdots \\
c_T &= C(y, w) = C^T(y, c_{T-1}, w^T),
\end{aligned} \tag{38}
\]

where each $C^t$ is increasing in $c_{t-1}$, $t = 2, \ldots, T$. This functional structure is
equivalent to the separability conditions: input prices in $I^1, \ldots, I^{t-1}$ are separable
from input prices in $I^t, \ldots, I^T$, $t = 2, \ldots, T$. Moreover, each $C^t$ inherits
concavity and homogeneity of degree one in $(c_{t-1}, w^t)$. (See Corollary 6.1.3,
page 226 in Blackorby et al. [3].)
Similarly, the functional structure of a completely recursive input distance
function is given by the system of functions

\[
\begin{aligned}
d_1 &= D^1(y, x^1) \\
d_2 &= D^2(y, d_1, x^2) \\
&\;\;\vdots \\
d_t &= D^t(y, d_{t-1}, x^t) \\
&\;\;\vdots \\
d_T &= D(y, x) = D^T(y, d_{T-1}, x^T),
\end{aligned} \tag{39}
\]

where each $D^t$ is increasing in $d_{t-1}$, $t = 2, \ldots, T$. This functional structure is
equivalent to the separability conditions: inputs in $I^1, \ldots, I^{t-1}$ are separable
from inputs in $I^t, \ldots, I^T$, $t = 2, \ldots, T$. In addition, each $D^t$ inherits concavity
and homogeneity of degree one in $(d_{t-1}, x^t)$. (See Corollary 6.1.4, page 226 in
Blackorby et al. [3].)
Earlier we saw that separability of the cost function is equivalent to separability
of the input distance function. An analogous argument establishes that the word
“separability” may be replaced by “complete recursivity”. Formally stated:

Theorem 17. The cost function has a completely recursive functional structure as
in (38) if and only if the input distance function has a completely recursive functional
structure as in (39).

In the absence of further assumptions, this equivalence result only holds for
the cost function – input distance function pair of representations. However, when
an assumption of homotheticity is added to the mix, we get equivalence results
analogous to those for symmetric structures. First, some formal definitions are in
order.

A production function is homothetically completely recursive if it is completely
recursive (37) and each of the following functions, $F_1, F_2, \ldots, F_t, \ldots, F_{T-1}$, is a
homothetic function:

\[
\begin{aligned}
y_1 &= F^1(x^1) = F_1(x^1) \\
y_2 &= F^2(y_1, x^2) = F_2(x^1, x^2) \\
&\;\;\vdots \\
y_t &= F^t(y_{t-1}, x^t) = F_t(x^1, x^2, \ldots, x^t) \\
&\;\;\vdots \\
y_{T-1} &= F^{T-1}(y_{T-2}, x^{T-1}) = F_{T-1}(x^1, x^2, \ldots, x^{T-1}).
\end{aligned} \tag{40}
\]

Notice that this definition says nothing about the function
$y_T = F^T(y_{T-1}, x^T) = F(x^1, x^2, \ldots, x^T)$, i.e., it does not include the assumption of
(overall) homotheticity of the production function. According to Theorem 6.3, page 228
in Blackorby et al. [3], if the production function is homothetically completely recursive,
then there exists a completely recursive representation in which the functions
$F^1, F^2, \ldots, F^t, \ldots, F^{T-1}$ are homogeneous of degree one in their arguments.
Given a cost function, $C(y, w^1, \ldots, w^t, \ldots, w^T)$, suppose input prices in
$I^1, \ldots, I^{t-1}$ are separable from input prices in $I^t, \ldots, I^T$, $t = 2, \ldots, T$,
and from output, $y$, in $C$. Then the resulting functional form of $C$ will satisfy the
following system of equations:

\[
\begin{aligned}
c_1 &= C^1(w^1) \\
c_2 &= C^2(c_1, w^2) \\
&\;\;\vdots \\
c_t &= C^t(c_{t-1}, w^t) \\
&\;\;\vdots \\
c_T &= C(y, w) = C^T(y, c_{T-1}, w^T).
\end{aligned} \tag{41}
\]

The converse is also true (Theorem 6.4, page 230 in Blackorby et al. [3]).
An analogous result holds for input distance functions. Inputs in $I^1, \ldots, I^{t-1}$
are separable from inputs in $I^t, \ldots, I^T$, $t = 2, \ldots, T$, and from output, $y$, in $D$
if and only if $D$ has a functional structure given by

\[
\begin{aligned}
d_1 &= D^1(x^1) \\
d_2 &= D^2(d_1, x^2) \\
&\;\;\vdots \\
d_t &= D^t(d_{t-1}, x^t) \\
&\;\;\vdots \\
d_T &= D(y, x) = D^T(y, d_{T-1}, x^T).
\end{aligned} \tag{42}
\]

The systems (40), (41), and (42) are related in the following theorem.

Theorem 18. The following three functional structures are equivalent:

(i) The production function is homothetically completely recursive, i.e., (40).
(ii) Input prices in $I^1, \ldots, I^{t-1}$ are separable from input prices in $I^t, \ldots, I^T$,
$t = 2, \ldots, T$, and from output, $y$, in $C$, i.e., (41).
(iii) Inputs in $I^1, \ldots, I^{t-1}$ are separable from inputs in $I^t, \ldots, I^T$,
$t = 2, \ldots, T$, and from output, $y$, in $D$, i.e., (42).

Application

Theorem 18 suggests that, for example, the firm’s cost minimization problem can
be broken down into a series of steps described by the following algorithm.
Step 1: Solve

\[
C^1(w^1)\, y_1 = \min_{x^1} \left\{ w^1 \cdot x^1 : F^1(x^1) \geq y_1 \right\},
\]

where the multiplicative form on the left-hand side is implied by the homogeneity of
degree one of $F^1$ in $x^1$. Moreover, $C^1$ is homogeneous of degree one in $w^1$. This is
a well-behaved price aggregate for the intermediate output $y_1$. It is used in the next
step in minimizing the cost of intermediate input $y_2$.
Step 2: Solve

\[
\begin{aligned}
\min_{y_1, x^2} &\left\{ C^1(w^1)\, y_1 + w^2 \cdot x^2 : F^2(y_1, x^2) \geq y_2 \right\} \\
&= C^2\left(C^1(w^1), w^2\right) y_2 \\
&= C^2(c_1, w^2)\, y_2,
\end{aligned}
\]

where $c_1 = C^1(w^1)$. Note that $C^2$ is homogeneous of degree one in $(c_1, w^2)$ and,
hence, serves as a well-behaved price aggregate in step 3 when minimizing the cost
of intermediate output $y_3$.
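However, instead of moving to step 3, we will move to step $t$ for $t = 3, \ldots, T-1$.
Step $t$: Solve

\[
\begin{aligned}
\min_{y_{t-1}, x^t} &\left\{ C^{t-1}(c_{t-2}, w^{t-1})\, y_{t-1} + w^t \cdot x^t : F^t(y_{t-1}, x^t) \geq y_t \right\} \\
&= C^t\left(C^{t-1}(c_{t-2}, w^{t-1}), w^t\right) y_t \\
&= C^t(c_{t-1}, w^t)\, y_t,
\end{aligned}
\]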

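Fig. 3 Complete recursivity. [Network diagram: $x^1$ enters $F^1$, which produces $y_1$; $y_1$ and $x^2$ enter $F^2$, which produces $y_2$; $y_2$ and $x^3$ enter $F^3$, which produces $y$.]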

where $c_{t-1} = C^{t-1}(c_{t-2}, w^{t-1})$.
At the final step ($T$), the cost of final output $y$ is minimized using the price
aggregate $C^{T-1}(c_{T-2}, w^{T-1})$ that was computed in step $T-1$.
Step $T$: Solve

\[
\begin{aligned}
\min_{y_{T-1}, x^T} &\left\{ C^{T-1}(c_{T-2}, w^{T-1})\, y_{T-1} + w^T \cdot x^T : F^T(y_{T-1}, x^T) \geq y_T = y \right\} \\
&= C^T\left(y, C^{T-1}(c_{T-2}, w^{T-1}), w^T\right) \\
&= c_T = C(y, w) = C^T(y, c_{T-1}, w^T),
\end{aligned}
\]

where $c_{T-1} = C^{T-1}(c_{T-2}, w^{T-1})$.
Note that the resulting functional structure of $C(y, w)$ is precisely the structure
given in (41). The above algorithm is a proof that (i) implies (ii) in Theorem 18.
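The stepwise algorithm can be sketched in a few lines for a special case. The example below assumes, purely for illustration, that every stage (including the final one) is Cobb-Douglas with exponents summing to one, so that each $C^t$ is the familiar Cobb-Douglas unit cost function and final cost is $c_T \, y$. The function names, prices, and exponents are hypothetical, and the theory above does not require the final stage to be homogeneous.

```python
import math

def cobb_douglas_unit_cost(prices, shares):
    """Unit cost of a Cobb-Douglas technology whose exponents (`shares`) sum to one."""
    assert abs(sum(shares) - 1.0) < 1e-12
    return math.prod((p / a) ** a for p, a in zip(prices, shares))

def recursive_cost(y, stage_prices, stage_shares):
    """Build c_1, ..., c_T stage by stage and return c_T * y.

    stage_shares[0] holds the exponents on x^1. For t >= 2 the first exponent
    belongs to the intermediate output y_{t-1} and the rest to x^t.
    Assumes every stage, including the last, is homogeneous of degree one.
    """
    c = cobb_douglas_unit_cost(stage_prices[0], stage_shares[0])  # c_1 = C^1(w^1)
    for w_t, a_t in zip(stage_prices[1:], stage_shares[1:]):
        c = cobb_douglas_unit_cost([c, *w_t], a_t)                # c_t = C^t(c_{t-1}, w^t)
    return c * y

# Hypothetical two-stage example: y_1 = x^1 and y = sqrt(y_1 * x^2).
total = recursive_cost(3.0, stage_prices=[[1.0], [4.0]], stage_shares=[[1.0], [0.5, 0.5]])
print(total)
```

In this example the price aggregate $c_1$ is simply $w^1 = 1$, and the second stage treats it as the price of the intermediate output. A direct search over $y_1$ in the one-step problem, minimizing $y_1 + 4 x^2$ subject to $\sqrt{y_1 x^2} \geq 3$, should give the same total cost.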
Here is another application. Consider the following completely recursive production
function:

\[
\begin{aligned}
y_1 &= F^1(x^1) \\
y_2 &= F^2(y_1, x^2) \\
y &= F^3(y_2, x^3).
\end{aligned}
\]

This model of production can be depicted schematically by a directed network
production diagram (Fig. 3). In period 1, the input vector $x^1$ is used to produce
$y_1$. In period 2, $y_1$ and $x^2$ are used to produce $y_2$. In the third period, $y_2$ and $x^3$ are
used to produce the final output $y$.
This conceptual approach to a multiperiod production process has been utilized
by Färe and Grosskopf [9], who name their approach dynamic DEA since they
model production using Data Envelopment Analysis.

Multioutput Technologies

To examine functional structure for multioutput technologies, we will dispense with
the use of single-output production and indirect production functions. However, we
can still utilize cost and input distance functions when we replace the single output
$y$ with a vector of outputs. Assume there are $N$ inputs and $M$ outputs. Input vectors
are denoted as

\[
x \in \mathbb{R}^N_+ = \left\{ (x_1, \ldots, x_n, \ldots, x_N) : x_n \geq 0,\ n = 1, \ldots, N \right\}
\]

and output vectors are denoted as

\[
y \in \mathbb{R}^M_+ = \left\{ (y_1, \ldots, y_m, \ldots, y_M) : y_m \geq 0,\ m = 1, \ldots, M \right\}.
\]

We can define the technology set as

\[
S = \left\{ (x, y) : x \in \mathbb{R}^N_+,\ y \in \mathbb{R}^M_+,\ x \text{ can produce } y \right\}.
\]

It is sometimes convenient to work with input (requirement) sets defined as

\[
L(y) = \{x : (x, y) \in S\}
\]

or with output (possibility) sets defined by

\[
P(x) = \{y : (x, y) \in S\}.
\]

It seems reasonable that these three sets should be related by

\[
(x, y) \in S \iff x \in L(y) \iff y \in P(x),
\]

and, indeed, they are.

There are some standard axioms that $L(y)$ and $P(x)$ should satisfy. See Färe
and Primont [10] for the details. For our purposes we will want to make sure that
both the cost function and the input distance function are well-defined and dual to
each other.
We assume that inputs are at least weakly disposable, i.e.,

\[
x \in L(y) \Rightarrow \lambda x \in L(y) \quad \text{for all } \lambda \geq 1,
\]

for all $x \in \mathbb{R}^N_+$. A stronger assumption is that inputs are strongly disposable, i.e.,

\[
x \in L(y) \Rightarrow x' \in L(y) \quad \text{for all } x' \geq x,
\]

for all $x \in \mathbb{R}^N_+$. Outputs are weakly disposable if

\[
y \in P(x) \Rightarrow \theta y \in P(x) \quad \text{for all } 0 \leq \theta \leq 1,
\]

for all $y \in \mathbb{R}^M_+$. The stronger assumption here is that outputs are strongly
disposable, i.e.,

\[
y \in P(x) \Rightarrow y' \in P(x) \quad \text{for all } 0 \leq y' \leq y,
\]

for all $y \in \mathbb{R}^M_+$.

We are now in a position to extend the definition of the input distance function in
the single-output case to one in which there are multiple outputs. A formal definition
of an input distance function is

\[
D_i(y, x) = \sup_{\lambda} \left\{ \lambda > 0 : (x/\lambda) \in L(y) \right\} \quad \text{for all } x \in \mathbb{R}^N_+,\ y \in \mathbb{R}^M_+.
\]

We have made a slight notational change here. For the single-output case, we used
the symbol $D$ for the input distance function; now we use the symbol $D_i$ in the
multioutput case. This is necessary to distinguish the input distance function from
the output distance function, which is defined by

\[
D_o(x, y) = \inf_{\theta} \left\{ \theta > 0 : y/\theta \in P(x) \right\} \quad \text{for all } x \in \mathbb{R}^N_+,\ y \in \mathbb{R}^M_+.
\]

Weak disposability of inputs is necessary and sufficient for the following
relationship between the input set and the input distance function ([10], page 22):

\[
L(y) = \{x : D_i(y, x) \geq 1\}.
\]

Similarly, weak disposability of outputs is necessary and sufficient for the following
relationship between the output set and the output distance function ([10], page 15):

\[
P(x) = \{y : D_o(x, y) \leq 1\}.
\]
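The two distance functions can be computed directly from their definitions whenever membership in $L(y)$ or $P(x)$ can be tested. The sketch below does so by bisection for a hypothetical single-output technology, $y \leq \sqrt{x_1 x_2}$. The technology, the bracketing bounds, and the function names are assumptions made for illustration, and the bisection relies on weak disposability so that membership is monotone in the scaling factor.

```python
import math

def f(x):
    """Hypothetical technology: maximal output from inputs (x1, x2)."""
    return math.sqrt(x[0] * x[1])

def in_L(x, y):
    """Membership test for the input set: x in L(y)."""
    return f(x) >= y

def in_P(y, x):
    """Membership test for the output set: y in P(x)."""
    return y <= f(x)

def input_distance(y, x, lo=1e-9, hi=1e9, iters=200):
    """sup{lam > 0 : x/lam in L(y)} by bisection on a log scale.

    Assumes x/lo is in L(y) and x/hi is not."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if in_L([xi / mid for xi in x], y):
            lo = mid
        else:
            hi = mid
    return lo

def output_distance(x, y, lo=1e-9, hi=1e9, iters=200):
    """inf{theta > 0 : y/theta in P(x)} by bisection on a log scale.

    Assumes y/hi is in P(x) and y/lo is not."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if in_P(y / mid, x):
            hi = mid
        else:
            lo = mid
    return hi

x, y = (4.0, 9.0), 2.0
Di = input_distance(y, x)
Do = output_distance(x, y)
print(Di, Do, in_L(x, y) == (Di >= 1.0), in_P(y, x) == (Do <= 1.0))
```

For this constant-returns example the closed forms are $D_i(y, x) = \sqrt{x_1 x_2}/y$ and $D_o(x, y) = y/\sqrt{x_1 x_2}$, so the two computed values should be approximate reciprocals of each other.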

In the multioutput framework, the cost function is defined as

\[
\begin{aligned}
C(y, w) &= \min_x \{w \cdot x : x \in L(y)\} \\
&= \min_x \{w \cdot x : D_i(y, x) \geq 1\}.
\end{aligned}
\]

The cost function is nonnegative and nondecreasing, homogeneous of degree one,
concave, and continuous in $w$. We have already seen that separability of the cost
function is equivalent to separability of the input distance function, i.e., each sector
$r$, $r = 1, \ldots, R$, is separable in the input distance function, that is,

\[
D_i(y, x) = \hat{D}_i\left(y, x^0, D_i^1(y, x^1), \ldots, D_i^R(y, x^R)\right),
\]

if and only if each sector $r$, $r = 1, \ldots, R$, is separable in the cost function, that is,

\[
C(y, w) = \hat{C}\left(y, w^0, C^1(y, w^1), \ldots, C^R(y, w^R)\right)
\]

(Theorem 12). This result was proved treating $y$ as a single output. However, the
proof would be unchanged if $y$ is viewed as a vector, an observation made by
Gorman [16]. Similarly, Theorem 13 would continue to hold for the cost function
and the input distance function if the scalar $y$ is replaced by a vector $y$.
Now define the revenue function by

\[
\begin{aligned}
R(x, p) &= \max_y \{p \cdot y : y \in P(x)\} \\
&= \max_y \{p \cdot y : D_o(x, y) \leq 1\}.
\end{aligned}
\]

The revenue function is nonnegative and nondecreasing, homogeneous of degree
one, convex, and continuous in $p$. The output distance function can be found by solving

\[
D_o(x, y) = \sup_p \{p \cdot y : R(x, p) \leq 1\}.
\]
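The duality between the revenue function and the output distance function can be illustrated numerically. The sketch below assumes a hypothetical one-input, two-output technology with $P(x) = \{y \geq 0 : y_1^2 + y_2^2 \leq x\}$, for which $R(x, p) = \sqrt{x}\,\lVert p \rVert$ and $D_o(x, y) = \lVert y \rVert / \sqrt{x}$. Both optimizations are approximated on a coarse angular grid, so the results are approximate, and all names are illustrative.

```python
import math

GRID = 401  # number of directions in the nonnegative quadrant (an arbitrary choice)

def directions(n=GRID):
    """Unit vectors at evenly spaced angles between 0 and pi/2."""
    return [(math.cos(a), math.sin(a))
            for a in (0.5 * math.pi * k / (n - 1) for k in range(n))]

def revenue(x, p):
    """R(x, p) = max{p.y : y in P(x)}, searching over the efficient frontier of P(x)."""
    r = math.sqrt(x)
    return max(p[0] * r * d1 + p[1] * r * d2 for d1, d2 in directions())

def output_distance_from_revenue(x, y):
    """D_o(x, y) = sup{p.y : R(x, p) <= 1}, searching over price directions."""
    best = 0.0
    for d in directions():
        # R is homogeneous of degree one in p, so rescaling d gives R(x, p) = 1.
        scale = 1.0 / revenue(x, d)
        best = max(best, scale * (d[0] * y[0] + d[1] * y[1]))
    return best

x, y = 100.0, (3.0, 4.0)
Do_dual = output_distance_from_revenue(x, y)
Do_direct = math.hypot(*y) / math.sqrt(x)  # closed form for this technology
print(Do_dual, Do_direct)
```

The two printed values should agree up to the grid error, which is consistent with the formula above for this particular technology.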

An output-revenue analog of Theorem 12 is given by:

Theorem 19. Each sector $r$, $r = 1, \ldots, R$, is separable in the output distance
function, i.e.,

\[
D_o(x, y) = \hat{D}_o\left(x, y^0, D_o^1(x, y^1), \ldots, D_o^R(x, y^R)\right),
\]

if and only if each sector $r$, $r = 1, \ldots, R$, is separable in the revenue function, i.e.,

\[
R(x, p) = \hat{R}\left(x, p^0, R^1(x, p^1), \ldots, R^R(x, p^R)\right).
\]

The proof of this theorem is analogous to the proof of Theorem 12 generalized to
the multioutput case.

References
1. Aczél J (1966) Lectures on functional equations and their applications. Academic, New York
2. Arrow K (1974) The measurement of real value added. In: David PA, Reder MW (eds) Trade,
stability, and macroeconomics. Academic, New York, pp 181–202
3. Blackorby C, Primont D, Russell RR (1978) Duality, separability, and functional structure:
theory and economic applications. Elsevier North-Holland, New York
4. Blackorby C, Primont D, Russell RR (1998) Separability: a survey. In: Barbarà S, Hammond
PJ, Seidl C (eds) Handbook of utility theory, volume I. Kluwer Academic Publishers, Boston,
pp 49–92
5. Blackorby C, Schworm W (1984) The structure of economies with aggregate measures of
capital: a complete characterization. Rev Econ Stud 51(4):633–650
6. Blackorby C, Schworm W (1988) The existence of input and output aggregates in aggregate
production functions. Econometrica 56(3):613–643

7. Debreu G (1960) Topological methods in cardinal utility theory. In: Arrow K, Karlin S, Suppes
P (eds) Mathematical methods in the social sciences. Stanford University Press, Stanford,
pp 16–26
8. Diewert WE (1982) Duality approaches to microeconomic theory. In: Arrow KJ, Intrili-
gator MD (eds) Handbook of mathematical economics, vol 2. North-Holland, Amsterdam,
pp 535–599
9. Färe RG, Grosskopf S (1996) Intertemporal production frontiers: with dynamic DEA.
Kluwer Academic Publishers, Boston
10. Färe RG, Primont D (1995) Multi-output production and duality: theory and applications.
Kluwer Academic Publishers, Boston
11. Fisher FM (1965) Embodied technical change and the existence of an aggregate capital stock.
Rev Econ Stud 32:263–288
12. Fisher FM (1968) Embodied technology and the existence of labor and output aggregates. Rev
Econ Stud 35:391–412
13. Fisher FM (1968) Embodied technology and the aggregation of fixed and moveable capital
goods. Rev Econ Stud 35:417–428
14. Gorman WM (1968) The structure of utility functions. Rev Econ Stud 35:367–390
15. Gorman WM (1970) Quasi-separable preferences, costs, and technologies. Unpublished
manuscript, published as pp 104–114 in Gorman WM (1995)
16. Gorman WM (1987) Separability. In: Eatwell J, Milgate M, Newman P (eds) The new palgrave:
a dictionary of economics. Macmillan Press, London. Reprinted in Gorman W.M. (1995)
17. Gorman WM (1995) The collected works of W.M. Gorman, vol 1, Blackorby C, Shorrocks A
(eds). Oxford, Oxford University Press
18. Hotelling H (1932) Edgeworth’s taxation paradox and the nature of supply and demand
functions. J Polit Econ 40:577–616
19. Leontief WW (1947) A note on the interrelation of subsets of independent variables of a
continuous function with continuous first derivatives. Bull Am Math Soc 53:343–350
20. Leontief WW (1947) Introduction to a theory of the internal structure of functional relation-
ships. Econometrica 15:361–373
21. McFadden D (1978) Cost, revenue, and profit functions. In: Fuss M, McFadden D (eds)
Production economics: a dual approach to theory and applications. Amsterdam, North-Holland,
pp 3–110
22. Roy R (1942) De l’utilité. Hermann, Paris
23. Roy R (1947) La distribution du revenu entre les divers biens. Econometrica 15:205–225
24. Shephard RW (1970) Theory of cost and production functions. Princeton University Press,
Princeton
25. Solow RM (1955) The production function and the theory of capital. Rev Econ Stud 23:
101–108
26. Sono M (1945) The effect of price changes on the demand and supply of separable goods (in
Japanese). Kokumin Keizai Zasshi 74:1–51. English translation in Sono (1961)
27. Sono M (1961) The effect of price changes on the demand and supply of separable goods. Int
Econ Rev 2:239–271
28. Stigum B (1967) On certain problems of aggregation. Int Econ Rev 8(3):349–367
29. Wold HOA (1943) A synthesis of pure demand analysis, I, II. Skandinavisk Aktuarietidskrift
26:85–118, 220–263
6 Elasticities of Substitution
R. Robert Russell

Contents
Introduction
Two-Input Elasticity of Substitution: Early Formulations and Characterizations
  Definition
  Comparative Statics of Income Shares
  Constant Elasticity of Substitution
Digression: Dual Representations of Multiple-Input, Multiple-Output Technologies
Allen and Morishima Elasticities of Substitution
  Allen Elasticities of Substitution (AES)
  Morishima Elasticities of Substitution (MES)
  AES and MES and the Comparative Statics of Income Shares
  Constancy of the Allen and Morishima Elasticities
  Non-homothetic Technologies
Dual Elasticities of Substitution
  Two-Input Elasticity of Substitution Redux
  Dual Morishima and Allen Elasticities of Substitution
  Symmetric Elasticity of Complementarity
Gross Elasticities of Substitution
Elasticities of Substitution and Separability
  Separability and Functional Structure
  Elasticity Identities and Functional Structure
Concluding Remarks
Cross-References
References

R. R. Russell
Department of Economics, University of California, Riverside, Riverside, CA, USA
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_10

Abstract

This chapter lays out the theoretical foundation of the measurement of the degree
of substitutability among inputs utilized in a production process. It proceeds from
the well-settled (Hicksian) notion of this measure for two inputs (typically labor
and capital) to the more challenging conceptualization for technologies with
more than two inputs (most notably, Allen-Uzawa and Morishima elasticities).
Dual elasticities of substitution (also called elasticities of complementarity) and
gross elasticities of substitution (measuring substitutability for non-homothetic
technologies taking account of output changes) are also covered. Also analyzed
are functional representations of two-input technologies with constant elasticity
of substitution (CES) and of n-input technologies with constant and identical
elasticities for all pairs of inputs. Finally, the chapter explores the relationship
between elasticity values and the comparative statics of factor income shares and
the relationships between certain elasticity identities and separability conditions
rationalizing consistent aggregation of subsets of inputs.

Keywords

Duality theory · Income shares · Separability and Functional structure

Introduction

In his classic book on the Theory of Wages, the Oxford University economist
John R. Hicks [42] introduced two concepts that persist to this day as important
components of both microeconomic and macroeconomic analysis: (1) elasticity
of substitution¹ and (2) input neutrality (alternatively, input bias) of technological
change. Each of these constructs is fundamental to the analysis of changing factor
income shares as an economy (or other production unit) expands (or, for that
matter, contracts). This chapter focuses on the first of these concepts; the second is
the subject of Chap. 20, “Conceptualization and Measurement of Productivity Growth
and Technical Change: A Nonparametric Approach”.
As noted by Blackorby and Russell [14, p. 882] in their discussion of the
elasticity of substitution, “Hicks’ key insight was to note that [in a two-factor
economy] the effect of changes in the capital/labor ratio (or the factor price ratio)
on the distribution of income (for a given output) can be completely characterized
by a scalar measure of curvature of the isoquant.”² This measure, the (two-input)
elasticity of substitution, is a logarithmic derivative of the input-quantity ratio with
respect to the technical rate of substitution between the two inputs, holding output
constant. The elasticity of substitution and its relationship to the comparative statics
of income shares are exposited in section “Two-Input Elasticity of Substitution: Early
Formulations and Characterizations”, where the famous SMAC (Arrow, Chenery,
Minhas, and Solow [4]) theorem on the functional characterization of constancy
of the elasticity of substitution (CES) is also discussed.

¹ This concept was independently formulated by Cambridge University economist Joan Robinson
[69] in her comparably classic book on The Economics of Imperfect Competition. Abba Lerner
[60] and A. C. Pigou [68] also contributed to the understanding of the concept at its genesis.
² While the use of the word “curvature” in this quote conveys the appropriate intuition, it is
nevertheless technically incorrect, in part because curvature, formally defined, is a unit-dependent
mathematical concept. See de la Grandville [29] for a clear exposition of this point.
Generalization of the elasticity of substitution to allow for more than two inputs
began with suggestions by Hicks and Allen [45]. One suggestion was to employ
the constructions defining the original Hicksian notion for any two inputs, holding
the other input quantities fixed. This idea was further explored by McFadden [61],
but since then it has faded from the picture, largely because its failure to allow
for optimal adjustment of other inputs means that it generally fails to provide
information about the comparative statics of relative income shares of any two
inputs.³
The other generalization formulated by Hicks and Allen [45] – and further
analyzed by Hicks (1938), Allen [2], and Uzawa [79] – is now known as the
Allen elasticity of substitution (AES) or the Allen-Uzawa elasticity of substitu-
tion (AUES). This elasticity is a share-weighted (constant-output) cross elasticity
of demand. An alternative generalization, first formulated by Morishima [62]
(in Japanese and unfortunately never translated into English) and independently
discovered by Blackorby and Russell [11], is a constant-output cross elasticity
of demand minus a constant-output own price elasticity of demand.⁴ Blackorby
and Russell [13] named this concept the “Morishima elasticity of substitution”
(MES) and argued that it, unlike the AES, preserves the salient properties of the
original Hicksian notion when the number of inputs is expanded to more than
two. As these (and other) elasticity concepts are most evocatively described using
dual representations of the technology, section “Digression: Dual Representations
of Multiple-Input, Multiple-Output Technologies” presents some useful duality
constructs. Features of the AES and the MES are then explored in section “Allen
and Morishima Elasticities of Substitution”, particularly their relationships to the
comparative statics of factor income shares and functional representations of the
technology when the elasticities are invariant with respect to changes in input
quantities.

³ As pointed out by Blackorby and Russell [14, p. 882], “[O]nly if the two variables were
separable from all other variables would [this elasticity] provide information about shares; if we
were to require all pairs to have this property, the production function would be additive. When
combined with homotheticity (an assumption maintained in all these studies . . . ) this implies that
the production function is CES, in which case [the elasticities] are constant for all pairs of inputs.”
⁴ The Morishima elasticity is a generalization of Robinson’s [69] characterization of the two-input
elasticity and for this reason is called the “Robinson elasticity of substitution” by Kuga and Murota
[54] and Kuga [53].

The curvature of a two-input isoquant can be equivalently (and again informally⁵)
represented by the inverse of the Hicksian elasticity, the logarithmic
derivative of the technical rate of substitution – i.e., the shadow price ratio – with
respect to the quantity ratio. These concepts are dual to one another. Of course, in the
case of only two variables, these two elasticities are simple inverses of one another
(and have inverse implications for the curvature of the isoquant). With more than two
inputs, however, the analogous dual concepts – duals to the Allen and Morishima
elasticities – are not simple inverses of one another. This dual structure, developed
by Blackorby and Russell [11, 13], is examined in section “Dual Morishima and
Allen Elasticities of Substitution”.
Stern [75] points out that the dual Morishima elasticities do not reflect differential
movements along an isoquant (essentially because only one input is varied in
the calculation). He proposes an elasticity, named the symmetric elasticity of
complementarity, that constrains differential changes in quantities to be contained
in an isoquant. This elasticity, unlike the dual Morishima elasticity, is symmetric.⁶
It is presented in section “Symmetric Elasticity of Complementarity”.
The Allen and Morishima elasticities of substitution are calculated for differen-
tial movements along a constant-output surface. If the technology is homothetic,
this is not a restriction, but in general the comparative-static calculations on, say,
income shares hold only for cases where outputs are exogenous. The extension of
the Allen elasticity to incorporate output effects was broached by Mundlak [63]
and formulated in the dual (using the profit function) by Lau [56]. Extensions
of these results to Morishima elasticities can be found in Davis and Shumway
[28] and Blackorby et al. [16]. In line with the latter, I refer to these concepts as
gross elasticities and describe and analyze them in section “Gross Elasticities of
Substitution”.
In a widely cited paper, Berndt and Christensen [6] were the first to notice that
identities among certain pairs of Allen elasticities are equivalent to corresponding
separability restrictions on the production function. Essentially, a set of inputs is
separable from a distinct input if the technical rates of substitution among inputs
in the set are independent of the quantity of the excluded input. Separability is a
powerful concept in part because it has implications for the possibility of consistent
aggregation of inputs (e.g., aggregation across different types of labor inputs to
form an aggregate input in the functional representation of the technology).⁷ If
a subset of inputs is separable from all inputs excluded from the subset, there
exists an aggregator over the inputs in the subset, which then is an aggregate input
into the production function. Berndt and Christensen discovered that a subset of
inputs is separable from a distinct input if and only if the Allen elasticities of
substitution between the excluded input and each of the inputs in the separable subset
are identical. Russell [70] and Blackorby and Russell [12, 13] later generalized
these results and extended them to comparable identity restrictions for Morishima
elasticities of substitution. These relationships between certain elasticity identities
and functional structure are covered in section “Elasticities of Substitution and
Separability”.

⁵ See Footnote 2 above.
⁶ It is dual, not to the Morishima elasticity, but to McFadden’s [61] shadow elasticity of substitution.
⁷ Separability is also a necessary condition for decentralization of an optimization problem (as
in, e.g., two-stage budgeting). See Blackorby et al. [15, Ch. 5] for a thorough exposition of the
connection between separability and decentralized decision-making.
The conclusion contains a brief discussion of the extensive theoretical
and empirical literature in which the elasticity of substitution plays a salient
role.

Two-Input Elasticity of Substitution: Early Formulations and Characterizations

Definition

Hicks was particularly interested in the substitutability between labor and capital
and more particularly in the relative income shares of these two inputs. Let us
therefore denote the input quantity vector, in an obvious notation, by $\langle x_\ell, x_k \rangle \in \mathbb{R}^2_+$
and the (scalar) output quantity by $y \in \mathbb{R}_+$. The production function, $F : \mathbb{R}^2_+ \to \mathbb{R}_+$,
is assumed to be increasing, strictly quasi-concave, and homothetic.8 For
convenience, we restrict our analysis to the interior of quantity space, $\mathbb{R}^2_{++}$, and
assume that $F$ is continuously twice differentiable on this space.9
Homotheticity of $F$ implies that the technical rate of substitution between labor
and capital,10

$$trs_{\ell,k} = \frac{F_\ell(x_\ell, x_k)}{F_k(x_\ell, x_k)} =: TRS_{\ell,k}(x_\ell, x_k),$$

is homogeneous of degree zero, so that

$$trs_{\ell,k} = TRS_{\ell,k}(1, x_k/x_\ell) =: t(x_k/x_\ell)$$

and

$$\ln trs_{\ell,k} = \ln t(x_k/x_\ell) =: \theta\big(\ln(x_k/x_\ell)\big).$$

8 As we shall see in section “Digression: Dual Representations of Multiple-Input, Multiple-Output
Technologies”, the homotheticity assumption can be dropped when the elasticity concept is formulated
in the dual.
9 We could extend our analysis to all of $\mathbb{R}^2_+$ by employing directional derivatives at the boundary
but instead leave this technical detail to the interested reader.
10 Subscripts on functions indicate differentiation with respect to the specified variable. The
relation, =, should be interpreted as an identity throughout this chapter (i.e., as holding for all
allowable values of the variables). Also, A := B means the relation defines A, and A =: B means
the relation defines B.

Owing to strict quasi-concavity of $F$, $t$ is strictly monotonic, hence invertible, and
we can write

$$\frac{x_k}{x_\ell} = t^{-1}(trs_{\ell,k})$$

or

$$\ln \frac{x_k}{x_\ell} =: \phi(\ln trs_{\ell,k}). \tag{1}$$

The (two-input) elasticity of substitution is defined as the log derivative of the
input-quantity ratio with respect to the technical rate of substitution,

$$\sigma = \phi'(\ln trs_{\ell,k}),$$

or, equivalently, as the inverse of the log derivative of the technical rate of
substitution with respect to the input-quantity ratio,

$$\sigma = \frac{1}{\theta'\big(\ln(x_k/x_\ell)\big)}.$$

As the production function is strictly quasi-concave, the elasticity of substitution

lies in the open interval, (0, ∞). Relatively large values of σ indicate that the rate at
which one input can be substituted for the other is relatively insensitive to changes
in the input ratio: in the vernacular, substitution is “relatively easy” (isoquants
are “relatively flat”). Conversely, lower values of σ reflect “relatively difficult”
substitution (and “strong curvature” of isoquants). As σ → ∞, the isoquants
converge to (parallel) linear line segments (perfect substitution), and as σ → 0,
the isoquants converge to Leontief (fixed proportions) isoquants.

Comparative Statics of Income Shares

As shown by Hicks [42], the value of the elasticity of substitution has unambiguous
implications for the effects of changes in relative factor prices or in relative factor
quantities on relative factor shares in a competitive (price taking) economy. In
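an obvious notation for factor prices, the share of capital relative to labor is
$s = p_k x_k / p_\ell x_\ell$. In a competitive economy, where $trs_{\ell,k} = p_\ell/p_k$,

$$\begin{aligned}
\ln s &= \ln(x_k/x_\ell) - \ln(p_\ell/p_k) \\
&= \phi\big(\ln(p_\ell/p_k)\big) - \ln(p_\ell/p_k) =: \hat{S}\big(\ln(p_\ell/p_k)\big) \\
&= \ln(x_k/x_\ell) - \theta\big(\ln(x_k/x_\ell)\big) =: \tilde{S}\big(\ln(x_k/x_\ell)\big),
\end{aligned}$$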

so that the proportional effect on relative shares of a change in the price ratio is

$$\hat{S}'\big(\ln(p_\ell/p_k)\big) = \phi'\big(\ln(p_\ell/p_k)\big) - 1 = \sigma - 1$$

and the proportional effect on relative shares of a change in the quantity ratio is

$$\tilde{S}'\big(\ln(x_k/x_\ell)\big) = 1 - \theta'\big(\ln(x_k/x_\ell)\big) = 1 - \frac{1}{\sigma}.$$
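Thus, the comparative statics of functional income shares when σ is constant is
encapsulated in

$$\frac{d \hat{S}\big(\ln(p_\ell/p_k)\big)}{d \ln(p_\ell/p_k)} \begin{pmatrix} < \\ = \\ > \end{pmatrix} 0 \iff \sigma \begin{pmatrix} < \\ = \\ > \end{pmatrix} 1$$

or

$$\frac{d \tilde{S}\big(\ln(x_k/x_\ell)\big)}{d \ln(x_k/x_\ell)} \begin{pmatrix} < \\ = \\ > \end{pmatrix} 0 \iff \sigma \begin{pmatrix} < \\ = \\ > \end{pmatrix} 1,$$

since the first derivative equals $\sigma - 1$ and the second equals $1 - 1/\sigma$ with $\sigma > 0$.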

Constant Elasticity of Substitution

Implicit in Hicks’s analysis of substitutability is the assumption that the elasticity
is constant for all values of the input vector. This raises the question of the
restrictiveness of this assumption, a question that later was answered by Arrow
et al. [4] (affectionately known as SMAC in much of the literature). Assuming
that the technology is convex and homogeneous of degree one, they showed that the
elasticity of substitution is constant (independent of $x_k/x_\ell$) if and only if

$$F(x_\ell, x_k) = \big(\alpha_\ell x_\ell^{\rho} + \alpha_k x_k^{\rho}\big)^{1/\rho}, \quad 1 \geq \rho \neq 0, \tag{CES$_2$}$$

or

$$F(x_\ell, x_k) = \gamma\, x_\ell^{\alpha} x_k^{1-\alpha}, \quad \gamma > 0, \quad 1 > \alpha > 0. \tag{CD$_2$}$$

Thus, constancy of the elasticity of substitution implies that the production function
must belong to the CES family of technologies, (CES$_2$) or (CD$_2$).11 Moreover,

11 The Cobb-Douglas production function was well-known at the time of the SMAC derivation,
having been proposed much earlier [25]. The (CES2 ) production function made its first appearance
in Solow’s [73] classic economic growth paper, but the functional form had appeared much earlier
in the context of utility theory: Bergson (Burk) [20] proved that additivity of the utility function
and linear Engel curves (expenditures on individual goods proportional to income for given prices)

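the fixed-proportions technology (Leontief [59]) is generated as a limiting case of
(CES$_2$):

$$\lim_{\rho \to -\infty} \big(\alpha_\ell x_\ell^{\rho} + \alpha_k x_k^{\rho}\big)^{1/\rho} = \min\{\alpha_\ell x_\ell,\ \alpha_k x_k\}. \tag{L$_2$}$$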

Some simple calculations establish the following equivalences: (CES$_2$) $\iff \sigma = 1/(1 - \rho)$;
(CD$_2$) $\iff \sigma = 1$; and (L$_2$) $\iff \sigma \to 0$.12
I need not bring to the attention of the reader the extent to which the CES family
of production functions has been, over the years, a perennial workhorse in both
theoretical and empirical research employing production functions.
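
The first two of these equivalences can be checked numerically from the primal definition alone. The sketch below is illustrative only: the function names, the parameter values ($\alpha_\ell = 0.4$, $\alpha_k = 0.6$, $\rho = 0.5$, $\alpha = 0.3$), and the finite-difference step sizes are assumptions introduced here, not part of the SMAC derivation. It approximates the log derivative of the technical rate of substitution with respect to the input ratio and takes its reciprocal, normalizing $x_\ell = 1$ (which homotheticity permits).

```python
import math

def ces(x_l, x_k, a_l=0.4, a_k=0.6, rho=0.5):
    """Two-input CES production function (CES_2); parameters are illustrative."""
    return (a_l * x_l**rho + a_k * x_k**rho) ** (1.0 / rho)

def cobb_douglas(x_l, x_k, gamma=1.0, alpha=0.3):
    """Two-input Cobb-Douglas production function (CD_2); parameters are illustrative."""
    return gamma * x_l**alpha * x_k ** (1.0 - alpha)

def log_trs(f, log_ratio, h=1e-6):
    """ln(F_l / F_k) at x_l = 1, x_k = exp(log_ratio), by central differences."""
    x_l, x_k = 1.0, math.exp(log_ratio)
    f_l = (f(x_l + h, x_k) - f(x_l - h, x_k)) / (2.0 * h)
    f_k = (f(x_l, x_k + h) - f(x_l, x_k - h)) / (2.0 * h)
    return math.log(f_l / f_k)

def sigma(f, log_ratio, step=1e-4):
    """Elasticity of substitution: reciprocal of d ln(trs) / d ln(x_k / x_l)."""
    slope = (log_trs(f, log_ratio + step) - log_trs(f, log_ratio - step)) / (2.0 * step)
    return 1.0 / slope

print(sigma(ces, 0.0), sigma(ces, 1.0))   # both close to 1 / (1 - 0.5) = 2
print(sigma(cobb_douglas, 0.5))           # close to 1
```

Evaluating `sigma` at several input ratios gives a quick check that the computed elasticity does not vary along the isoquant, which is the defining property of the CES family.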

Digression: Dual Representations of Multiple-Input, Multiple-Output Technologies

The Allen and Morishima elasticities, originally formulated by Hicks and Allen [45]
and Morishima [62] in the (primal) context of a single-output production function,
are more generally and evocatively exposited in the dual (using the cost function) as
first shown, respectively, by Uzawa [79]13 and by Blackorby and Russell [11, 13],
Kuga [53], and Kuga and Murota [54]. The dual approach also allows us to move
seamlessly from technologies with a single output to those with multiple outputs.
Finally, duality theory is needed for the development of the dual elasticities of
substitution in section “Dual Elasticities of Substitution”. This section lays out the
requisite duality theory.14
Denote the ordered set of inputs by $N = \langle 1, \ldots, n \rangle$ and the ordered set of outputs
by $M = \langle 1, \ldots, m \rangle$. Input and output quantity vectors are denoted $x \in \mathbb{R}^n_+$ and
$y \in \mathbb{R}^m_+$, respectively. The technology set is the set of all feasible ⟨input, output⟩
combinations:

$$T := \{\langle x, y \rangle \in \mathbb{R}^{n+m}_+ \mid x \text{ can produce } y\}.$$

implies that the utility function belongs to the CES family. As the SMAC authors point out, the
function (CES2 ) itself was long known in the functional-equation literature (see Hardy et al. [41,
p. 13]) as the “mean value of order ρ.”
12 The SMAC theorem is easily generalized to homothetic technologies, in which case the

production function is a monotonic transformation of (CES2 ) or (CD2 ); in the limiting case as

σ → 0 it is a monotonic transformation of (L2 ).
13 Because the Allen elasticities are now typically exposited in the dual, they are often called the

“Allen-Uzawa elasticities,” and I resort to that nomenclature on occasion as well. Uzawa’s approach
was later extended by Blackorby and Russell [11, 13].
14 Thorough expositions of duality theory can be found in, e.g., Blackorby et al. [15], Chambers

[22], Cornes [26], Diewert [30, 31], Färe and Primont [36], Fuss and McFadden [37], and Russell
[71].

While the nomenclature suggests that feasibility is a purely technological notion,

a more expansive interpretation is possible: feasibility could incorporate notions of
institutional and political constraints, especially when we consider entire economies
as the basic production unit.
An input requirement set for a fixed output vector $y$ is

$$L(y) := \{x \in \mathbb{R}^n_+ \mid \langle x, y \rangle \in T\}.$$

We assume throughout that, for all $y \in \mathbb{R}^m_+$, $L(y)$ is closed and strictly convex
(relative to $\mathbb{R}^n_+$)15 and satisfies strong input disposability,

$$L(y) = L(y) + \mathbb{R}^n_+ \quad \forall\, y \in \mathbb{R}^m_+,$$

output monotonicity,16

$$\bar{y} > y \implies L(\bar{y}) \subset L(y),$$

and “no free lunch,”

$$y \neq 0^{(m)} \implies 0^{(n)} \notin L(y).$$

The (input) distance (gauge) function, a mapping from17

$$Q := \{\langle x, y \rangle \in \mathbb{R}^{n+m}_+ \mid y \neq 0^{(m)} \wedge x \neq 0^{(n)} \wedge L(y) \neq \emptyset\}$$

into the positive real line (where $0^{(n)}$ is the null vector of $\mathbb{R}^n_+$), is defined by

$$D(x, y) := \max\{\lambda \mid x/\lambda \in L(y)\}.$$

Under the above assumptions, D is well defined on this restricted domain and
satisfies homogeneity of degree one, positive monotonicity, concavity, and conti-
nuity in x and negative monotonicity in y. (See, e.g., Färe and Primont [36] for
proofs of these properties and most of the duality results that follow.18 ) Assume, in

15 These assumptions are stronger than needed for much of the conceptual development that
follows, but in the interest of simplicity I maintain them throughout.
16 Vector notation: $\bar{y} \geq y$ if $\bar{y}_j \geq y_j$ for all $j$; $\bar{y} > y$ if $\bar{y}_j \geq y_j$ for all $j$ and $\bar{y} \neq y$; and $\bar{y} \gg y$ if
$\bar{y}_j > y_j$ for all $j$.
17 We restrict the domain of the distance function to assure that it is globally well defined. An

alternative approach (e.g., Färe and Primont [36]) is to define D on the entire non-negative (n+m)-
dimensional Euclidean space and replace “max” with “sup” in the definition. See Russell [71,
footnote 12] for a comparison of these approaches.
18 Whatever is not there can be found in Diewert [30] or the Fuss/McFadden [37] volume.

addition, that $D$ is continuously twice differentiable in $x$. The distance function is a
representation of the technology, since (under our assumptions)

$$\langle x, y \rangle \in T \iff D(x, y) \geq 1.$$

In the single-output case ($m = 1$), where the technology can be represented
by a production function, $F : \mathbb{R}^n_+ \to \mathbb{R}_+$, $D\big(x, F(x)\big) = 1$ and the production
function is recovered by inverting $D(x, y) = 1$ in $y$. If (and only if) the technology
is homogeneous of degree one (constant returns to scale),

$$D(x, y) = \frac{F(x)}{y}.$$
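
The definition of the distance function can be made concrete with a small numerical sketch. The code below is an illustration under stated assumptions, not a general implementation: it takes the single-output case, uses a hypothetical linearly homogeneous CES production function (the names `ces` and `distance` and all parameter values are introduced here), and finds the largest $\lambda$ with $F(x/\lambda) \geq y$ by bisection. Under constant returns the result should coincide with $F(x)/y$.

```python
import math

def ces(x, alphas=(0.3, 0.7), rho=0.5):
    """Linearly homogeneous CES production function; parameters are illustrative."""
    return sum(a * xi**rho for a, xi in zip(alphas, x)) ** (1.0 / rho)

def distance(f, x, y, lo=1e-9, hi=1e9, iters=200):
    """D(x, y) = max{lam : f(x / lam) >= y}, located by bisection on a log scale.

    Assumes f is increasing, that x / lo can produce y, and that x / hi cannot.
    """
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if f([xi / mid for xi in x]) >= y:
            lo = mid      # x / mid is still in L(y): the maximum is at least mid
        else:
            hi = mid      # x / mid cannot produce y: the maximum is below mid
    return lo

x, y = [2.0, 3.0], 1.5
print(distance(ces, x, y), ces(x) / y)   # the two values should agree closely
```

A value of `distance(ces, x, y)` at or above one signals that `x` lies in the input requirement set for `y`, which is the representation property stated above.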

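The cost function, $C : \mathbb{R}^n_{++} \times Y \to \mathbb{R}_+$, where

$$Y = \{y \mid \langle x, y \rangle \in Q \text{ for some } x\},$$

is defined by

$$C(p, y) = \min_x \{p \cdot x \mid x \in L(y)\}$$

or, equivalently, by

$$C(p, y) = \min_x \{p \cdot x \mid D(x, y) \geq 1\}. \tag{2}$$

Under our maintained assumptions, $D$ is recovered from $C$ by

$$D(x, y) = \inf_p \{p \cdot x \mid C(p, y) \geq 1\}, \tag{3}$$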

and C has the same properties in p as D has in x. This establishes the duality
between the distance and the cost function. On the other hand, C is positively
monotonic in y. We also assume that C is twice continuously differentiable in p.
By Shephard’s Lemma (application of the envelope theorem to (2)), the (vector-valued,
constant-output) input demand function, $\delta : \mathbb{R}^n_{++} \times Y \to \mathbb{R}^n_+$, is generated
by first-order differentiation of the cost function19:

$$\delta(p, y) = \nabla_p C(p, y).$$

Of course, $\delta$ is homogeneous of degree zero in $p$. The (normalized) shadow-price
vector, $\rho : Q \to \mathbb{R}^n_+$, is obtained by applying the envelope theorem to (3):

19 $\nabla_p C(p, y) := \langle \partial C(p, y)/\partial p_1, \ldots, \partial C(p, y)/\partial p_n \rangle$.

ρ(x, y) = ∇x D(x, y). (4)

As is apparent from the re-writing of (3) (using homogeneity of $C$ in $p$) as

$$D(x, y) = \inf_{p/c} \Big\{ \frac{p}{c} \cdot x \;\Big|\; C(p/c, y) \geq 1 \Big\} = \inf_{p/c} \Big\{ \frac{p}{c} \cdot x \;\Big|\; C(p, y) \geq c \Big\}, \tag{5}$$

where $c$ can be interpreted as (minimal) expenditure (to produce output $y$), the
vector $\rho(x, y)$ in (4) can be interpreted as shadow prices normalized by minimal
cost.20 In other words, under the assumption of cost-minimizing behavior,

$$\rho\big(\delta(p, y), y\big) = \frac{p}{C(p, y)}.$$

Clearly, $\rho$ is homogeneous of degree zero in $p$.

Allen and Morishima Elasticities of Substitution

Allen Elasticities of Substitution (AES)

The Allen elasticity of substitution between inputs $i$ and $j$ is given by

\begin{align}
\sigma^A_{ij}(p, y) &:= \frac{C(p, y)\, C_{ij}(p, y)}{C_i(p, y)\, C_j(p, y)} \tag{6} \\
&= \frac{\varepsilon_{ij}(p, y)}{s_j(p, y)}, \quad \forall\, \langle i, j \rangle \in N \times N, \tag{7}
\end{align}

where the subscripts on the cost function $C$ indicate differentiation with respect to
the indicated variable(s);

$$\varepsilon_{ij}(p, y) := \frac{\partial \ln \delta_i(p, y)}{\partial \ln p_j} = \frac{p_j\, C_{ij}(p, y)}{C_i(p, y)}$$

is the (constant-output) elasticity of demand for input $i$ with respect to a change in
the price of input $j$; and

$$s_j(p, y) = \frac{p_j\, \delta_j(p, y)}{C(p, y)} = \frac{p_j\, C_j(p, y)}{C(p, y)}$$

20 See Färe and Grosskopf [34] and Russell [71] for analyses of the distance function and associated
shadow prices.

is the cost share of input $j$. Thus, the Allen elasticity is simply a share-weighted
cross ($i \neq j$) or own ($i = j$) demand elasticity. It collapses to the Hicksian elasticity
when $n = 2$.

Morishima Elasticities of Substitution (MES)

To define the Morishima elasticities, let $p^i$ be the $(n - 1)$-dimensional vector of
price ratios with $p_i$ in the denominator. Zero-degree homogeneity of $\delta$ in $p$ allows
us to write

$$\hat{\delta}(p^i, y) := \delta(p, y).$$

The Morishima elasticity of substitution of input $i$ for input $j$ is defined directly as

$$\sigma^M_{ij}(p, y) := \frac{\partial \ln \big( \hat{\delta}_i(p^i, y) / \hat{\delta}_j(p^i, y) \big)}{\partial \ln(p_j/p_i)}. \tag{8}$$

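Note that any variation in a single component of $p^i$, say $p_j/p_i$, holding other
components ($j' \neq j$) constant, must be entirely manifested in variation in $p_j$ alone.
Hence, for all pairs $\langle i, j \rangle$,

$$\frac{\partial \hat{\delta}_i(p^i, y)}{\partial (p_j/p_i)} = \frac{\partial \delta_i(p, y)}{\partial p_j}\, p_i,$$

or, in logarithmic form, $\partial \ln \hat{\delta}_i(p^i, y) / \partial \ln(p_j/p_i) = \partial \ln \delta_i(p, y) / \partial \ln p_j$.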

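Using this fact, along with Shephard’s Lemma, the MES can be re-written in the
dual as

\begin{align}
\sigma^M_{ij}(p, y) &= p_j \left[ \frac{C_{ij}(p, y)}{C_i(p, y)} - \frac{C_{jj}(p, y)}{C_j(p, y)} \right] \tag{9} \\
&= \varepsilon_{ij}(p, y) - \varepsilon_{jj}(p, y). \tag{10}
\end{align}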

Thus, the MES is simply the difference between the appropriate (constant output)
cross price elasticity of demand and the (constant output) own elasticity of demand
for the input associated with the axis along which the price ratio is being varied.
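
Both elasticities can be computed directly from any twice-differentiable cost function using (7) and (10). The sketch below is a minimal illustration, assuming a hypothetical three-input CES cost function with exponent $r = -1$ (so that the common elasticity is $1 - r = 2$) and finite-difference derivatives; the helper names, parameter values, and step size are assumptions made for this example.

```python
def ces_cost(p, y, a=(0.2, 0.3, 0.5), r=-1.0):
    """CES cost function y * (sum_i a_i p_i**r) ** (1/r); a and r are illustrative."""
    return y * sum(ai * pi**r for ai, pi in zip(a, p)) ** (1.0 / r)

def gradient(cost, p, y, rel=1e-4):
    """Central-difference approximation of C_i = dC/dp_i."""
    g = []
    for i in range(len(p)):
        h = rel * p[i]
        up, dn = list(p), list(p)
        up[i] += h
        dn[i] -= h
        g.append((cost(up, y) - cost(dn, y)) / (2.0 * h))
    return g

def hessian(cost, p, y, rel=1e-4):
    """Central-difference approximation of C_ij = d2C / (dp_i dp_j)."""
    n = len(p)

    def shifted(i, si, j, sj):
        q = list(p)
        q[i] += si * rel * p[i]
        q[j] += sj * rel * p[j]
        return cost(q, y)

    return [[(shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
              - shifted(i, -1, j, 1) + shifted(i, -1, j, -1))
             / (4.0 * rel * p[i] * rel * p[j])
             for j in range(n)] for i in range(n)]

def allen_morishima(cost, p, y):
    """Return the matrices of Allen (7) and Morishima (10) elasticities."""
    n = len(p)
    C, g, H = cost(p, y), gradient(cost, p, y), hessian(cost, p, y)
    eps = [[p[j] * H[i][j] / g[i] for j in range(n)] for i in range(n)]   # demand elasticities
    share = [p[j] * g[j] / C for j in range(n)]                            # cost shares s_j
    aes = [[eps[i][j] / share[j] for j in range(n)] for i in range(n)]
    mes = [[eps[i][j] - eps[j][j] for j in range(n)] for i in range(n)]
    return aes, mes

aes, mes = allen_morishima(ces_cost, [1.0, 2.0, 3.0], 1.0)
print(aes[0][1], mes[0][1], mes[1][0])   # all close to 2 for this CES example
```

For the CES example the two concepts coincide off the diagonal. Passing a different cost function to `allen_morishima` (any callable with the same `(p, y)` signature) is the way to explore cases in which `mes[i][j]` and `mes[j][i]` differ.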
The Morishima elasticity, unlike the Allen elasticity, is non-symmetric, since
the value depends on the normalization adopted in (8) – that is, on the coordinate
direction in which the prices are varied to change the price ratio, $p_j/p_i$.21 Of course,
if there are only two inputs, there is no difference between changing pi and changing

21 See Blackorby and Russell [11, 13, 14] for a discussion of this asymmetry, which (as pointed out

by Stern [76]) was recognized much earlier by Pigou [68] in his analysis of Robinson’s [69]
less-formal characterization of this elasticity concept.

pj , and the MES, like the AES, collapses to the standard Hicksian elasticity. But in
the general n-input case, the interactions between these two inputs are dependent on
whether it is one or the other that is varied.22

AES and MES and the Comparative Statics of Income Shares

If $\sigma^A_{ij}(p, y) > 0$ (i.e., if increasing the $j$th price increases the optimal quantity of
input $i$), we say that inputs $i$ and $j$ are Allen-Uzawa substitutes; if $\sigma^A_{ij}(p, y) < 0$,
they are Allen-Uzawa complements. Similarly, if $\sigma^M_{ij}(p, y) > 0$ (i.e., if increasing
the $j$th price increases the optimal quantity of input $i$ relative to the optimal quantity
of input $j$), we say that input $j$ is a Morishima substitute for input $i$; if $\sigma^M_{ij}(p, y) < 0$,
input $j$ is a Morishima complement to input $i$. As the Morishima elasticity of
substitution is non-symmetric, so is the taxonomy of Morishima substitutes and
complements.23
The conceptual foundations of Allen-Uzawa and Morishima taxonomies of
substitutes and complements are, of course, quite different. The Allen-Uzawa
taxonomy classifies a pair of inputs as substitutes (complements) if an increase
in the price of one causes an increase (decrease) in the quantity demanded of the
other. This is the standard textbook definition of net substitutes (and complements).
The Morishima concept, on the other hand, classifies a pair of inputs as substitutes
(complements) if an increase in the price of one causes the quantity of the other to
increase (decrease) relative to the quantity of the input for which price has changed.
For this reason, the Morishima taxonomy leans more toward substitutability (since
the theoretically necessary decrease in the denominator of the quantity ratio in (8)
(owing to $\varepsilon_{jj}(p, y) < 0$) helps the ratio to rise when the price of the input in the
denominator increases).
Put differently, if two inputs are substitutes according to the Allen-Uzawa cri-
terion, theoretically they must be substitutes according to the Morishima criterion,
but if two inputs are complements according to the Allen-Uzawa criterion, they can
be either complements or substitutes according to the Morishima criterion. This
relationship can be seen algebraically from (7) and (10). If $i$ and $j$ are Allen-Uzawa
substitutes, in which case $\varepsilon_{ij}(p, y) > 0$, concavity of the cost function
(and hence negative semi-definiteness of the corresponding Hessian) implies that
$\varepsilon_{ij}(p, y) - \varepsilon_{jj}(p, y) > 0$, so that $j$ is a Morishima substitute for $i$. Similar algebra
establishes that two inputs can be Morishima substitutes when they are Allen-Uzawa
complements.
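
A small worked example may help fix the second case. The numbers below are purely hypothetical (they are chosen for arithmetic convenience and are not taken from any estimated technology); $\varepsilon_{ij}$ denotes the constant-output cross-price elasticity, $\varepsilon_{jj}$ the own-price elasticity, and $s_j$ the cost share of input $j$.

```latex
\begin{aligned}
&\text{Hypothetical values: } s_j = 0.25, \quad \varepsilon_{ij} = -0.10, \quad \varepsilon_{jj} = -0.50, \\
&\sigma^A_{ij} = \frac{\varepsilon_{ij}}{s_j} = \frac{-0.10}{0.25} = -0.40 < 0
  \quad \text{(Allen-Uzawa complements)}, \\
&\sigma^M_{ij} = \varepsilon_{ij} - \varepsilon_{jj} = -0.10 + 0.50 = 0.40 > 0
  \quad \text{(input $j$ is a Morishima substitute for input $i$)}.
\end{aligned}
```

The own-price term dominates the negative cross-price effect, so the quantity ratio of input $i$ to input $j$ rises even though the quantity of input $i$ itself falls.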

22 “Own” Morishima elasticities are identically equal to zero and hence uninteresting, as one might

expect to be the case for a sensible elasticity of substitution.

23 These notions are referred to as “p-substitutes” and “p-complements” in much of the literature
(see Stern [76] and the papers cited there), as distinguished from “q-substitutes” and “q-
complements,” which I call dual substitutes and complements in section “Dual Elasticities of
Substitution”.

Note that, for $i \neq j$,

$$\frac{\partial \ln s_i(p, y)}{\partial \ln p_j} = \varepsilon_{ij}(p, y) - s_j(p, y) = s_j(p, y) \big( \sigma^A_{ij}(p, y) - 1 \big),$$

so that an increase in $p_j$ increases the absolute cost share of input $i$ if and only
if $\sigma^A_{ij}(p, y) > 1$: that is, if and only if inputs $i$ and $j$ are sufficiently strong
net substitutes. Thus, the Allen-Uzawa elasticities provide immediate qualitative
comparative-static information about the effect of price changes on absolute shares.
To obtain quantitative comparative-static information, one needs to know the share
of the j th input as well as the Allen-Uzawa elasticity of substitution.
The Morishima elasticities immediately yield both qualitative and quantitative
information about the effect of price changes on relative input shares:

$$\frac{\partial \ln \big( \hat{s}_i(p^i, y) / \hat{s}_j(p^i, y) \big)}{\partial \ln(p_j/p_i)} = \varepsilon_{ij}(p, y) - \varepsilon_{jj}(p, y) - 1 = \sigma^M_{ij}(p, y) - 1,$$

where (with the use of zero-degree homogeneity of $s_i$ in $p$) $\hat{s}_i(p^i, y) := s_i(p, y)$ for
all $i$. Thus, an increase in $p_j$ increases the share of input $i$ relative to input $j$ if and
only if

$$\sigma^M_{ij}(p, y) > 1:$$

that is, if and only if inputs i and j are sufficiently substitutable in the sense of
Morishima. Moreover, the degree of departure of the Morishima elasticity from
unity provides immediate quantitative information about the effect on the relative
factor shares.
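
Both share results follow in a few lines from the definition of the cost share. The derivation below is a sketch that uses only Shephard’s Lemma and the definitions in (7) and (10), with $\varepsilon_{ij}$ denoting the constant-output elasticity of demand for input $i$ with respect to $p_j$; only $p_j$ is varied, so $\ln(p_j/p_i)$ moves one-for-one with $\ln p_j$.

```latex
\begin{aligned}
\ln s_i(p, y) &= \ln p_i + \ln C_i(p, y) - \ln C(p, y), \\
\frac{\partial \ln C(p, y)}{\partial \ln p_j} &= \frac{p_j\, C_j(p, y)}{C(p, y)} = s_j(p, y), \\
\frac{\partial \ln s_i(p, y)}{\partial \ln p_j} &= \varepsilon_{ij} - s_j
   = s_j \big( \sigma^A_{ij} - 1 \big) \qquad (i \neq j), \\
\frac{\partial \ln s_j(p, y)}{\partial \ln p_j} &= 1 + \varepsilon_{jj} - s_j, \\
\frac{\partial \ln \big( s_i(p, y) / s_j(p, y) \big)}{\partial \ln p_j}
   &= \varepsilon_{ij} - \varepsilon_{jj} - 1 = \sigma^M_{ij} - 1 .
\end{aligned}
```

The share term $s_j$ cancels in the last line, which is why the Morishima elasticity alone determines the response of relative shares, whereas the Allen-Uzawa elasticity must be combined with $s_j$ to quantify the response of an absolute share.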

Constancy of the Allen and Morishima Elasticities

In addition to his dual reformulation of the elasticity proposed by Hicks and Allen
[45], Uzawa [79] also extended the SMAC theorem on constancy of the elasticity
of substitution to encompass more than two inputs. He conjectured (p. 293) that the
“production function which extends the Arrow-Chenery-Minhas-Solow function to
the n-factor case may be the following type:”
$$F(x) = \Big( \sum_{i \in N} \alpha_i x_i^{\rho} \Big)^{1/\rho}, \quad \alpha_i > 0 \ \forall\, i, \quad 1 \geq \rho \neq 0. \tag{CES}$$

This structure does indeed yield a constant Allen-Uzawa elasticity, $\sigma = 1/(1 - \rho)$,
for all pairs of inputs and moreover converges to

$$F(x) = \alpha \prod_{i \in N} x_i^{\beta_i}, \quad \alpha > 0, \quad \beta_i > 0 \ \forall\, i, \tag{CD}$$

with $\sigma = 1$ as $\rho \to 0$. Moreover, Uzawa shows that this structure is necessary as
well as sufficient for constancy of the Allen-Uzawa elasticities, but it turns out not
to be necessary for the elasticities to be identical for all pairs of inputs.
Uzawa went on to establish necessary (as well as sufficient) conditions for
constancy and uniformity of the AES for a PLH production function. To explicate
these conditions, consider a partition of the set of inputs into $S$ subsets, $\mathcal{N} =
\{N^1, \ldots, N^S\}$, with $n_s$ inputs in subset $s$ for each $s$. Uzawa showed that the AES
are constant and identical if and only if, for some partition $\mathcal{N}$, the (PLH) production
function is given by

$$F(x) = \prod_{s=1}^{S} F^s(x^s)^{\beta_s}, \quad \beta_s > 0 \ \forall\, s \ \text{ and } \ \sum_{s=1}^{S} \beta_s = 1, \tag{11}$$

where, for all $s$,

$$F^s(x^s) = \Big( \sum_{i \in N^s} \alpha_i x_i^{\rho_s} \Big)^{1/\rho_s}, \quad 0 \neq \rho_s < 1, \quad \alpha_i > 0 \ \forall\, i \in N^s. \tag{12}$$

That is, the production function can be written as a Cobb-Douglas function of CES
aggregator functions. Note that the structure (11)–(12) collapses to the (CES) case
when S = 1 and to the (CD) case when |N s | = 1 for all s. Moreover, when n = 2,
(11) collapses to (CD2 ) and (12) collapses to (CES2 ).24
Analogous results for constancy and uniformity of the Morishima elasticities
were established by Blackorby and Russell [11] and Kuga [53] (generalizing
the three-input case proved by Murota [66]). Again maintaining positive linear
homogeneity of the production function, the MES are constant if and only if the
production function takes the form (CES) or (CD) above.

24 McFadden [61] showed that his direct elasticities of substitution are constant and identical if and
only if

S 1/ρ
F (x) = αs F s (x s ) , 0 = ρ < 1/n∗ , 0 ≤ ρ < 1/n∗ , αs > 0 ∀ s,
s=1

S
∗
F (x) = α0 F s (x s )αs /n , αs > 0 ∀ s, and n∗ = max{ns }
s
s=1

where

ns
F s (x s ) = xi , s = 1, . . . , m :
i=1

that is, if and only if the production function can be written as a CES or Cobb-Douglas function of
(specific) Cobb-Douglas aggregator functions.

Blackorby and Russell [14, p. 888] point to the contrast between these represen-
tations for the AES and the MES to support their view that the latter is the natural
generalization of the Hicks elasticity to encompass more than two inputs: if equation
(CES) “is the CES production function, then the MES – and not the AES – is the
elasticity of substitution.”
Using standard duality theory, constancy of the elasticity of substitution can also
be characterized in terms of the structure of the cost and distance functions. For PLH
production functions, the Morishima elasticities are globally constant and identical
if and only if the cost function takes the form25

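$$C(p, y) = y \Big( \sum_{i=1}^{n} \hat{\alpha}_i\, p_i^{\hat{\rho}} \Big)^{1/\hat{\rho}}, \quad \hat{\alpha}_i = \alpha_i^{1/(1-\rho)} \ \forall\, i, \quad \hat{\rho} = \rho/(\rho - 1), \tag{13}$$

or

$$C(p, y) = y \prod_{i=1}^{n} \hat{\alpha}_i\, p_i^{\beta_i}, \quad \hat{\alpha}_i = \alpha_i^{\alpha_i} \ \forall\, i. \tag{14}$$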

The distance function dual to (13) and (14), formally derived using (3), has the
following alternative images:

$$D(x, y) = y^{-1} \Big( \sum_{i=1}^{n} \alpha_i x_i^{\rho} \Big)^{1/\rho} \tag{15}$$

or

$$D(x, y) = y^{-1} \alpha \prod_{i=1}^{n} x_i^{\alpha_i}. \tag{16}$$

Obviously, setting D(x, y) = 1 in (15) and (16) and inverting in y yields the explicit
production function (CES) and (CD).
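
The correspondence between the CES production function and the cost function (13) can be checked numerically in the two-input case. The sketch below is illustrative and rests on several assumptions introduced here: two inputs, $0 < \rho < 1$ (so that the isoquant meets both axes), hypothetical parameter values, and a ternary search along the isoquant as the minimization method. It compares the closed form (13) with the minimized outlay from definition (2).

```python
def ces_cost_closed_form(p, y, alphas, rho):
    """Equation (13): y * (sum_i alpha_hat_i * p_i**rho_hat) ** (1 / rho_hat)."""
    rho_hat = rho / (rho - 1.0)
    alpha_hat = [a ** (1.0 / (1.0 - rho)) for a in alphas]
    return y * sum(ah * pi**rho_hat for ah, pi in zip(alpha_hat, p)) ** (1.0 / rho_hat)

def ces_cost_by_search(p, y, alphas, rho, iters=200):
    """Minimize p.x on the y-isoquant of a two-input CES function, assuming 0 < rho < 1."""
    (a1, a2), (p1, p2) = alphas, p
    x1_max = (y**rho / a1) ** (1.0 / rho)           # isoquant meets the x1 axis here

    def outlay(x1):
        residual = max(0.0, y**rho - a1 * x1**rho)  # guard against rounding below zero
        x2 = (residual / a2) ** (1.0 / rho)         # x2 that keeps (x1, x2) on the isoquant
        return p1 * x1 + p2 * x2

    lo, hi = 0.0, x1_max
    for _ in range(iters):                          # outlay is convex in x1 along the isoquant
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if outlay(m1) < outlay(m2):
            hi = m2
        else:
            lo = m1
    return outlay(0.5 * (lo + hi))

p, y, alphas, rho = (2.0, 3.0), 1.5, (0.4, 0.6), 0.5
print(ces_cost_closed_form(p, y, alphas, rho))   # 7.5 up to rounding
print(ces_cost_by_search(p, y, alphas, rho))     # should agree closely
```

With these values $\hat{\rho} = -1$ and $\hat{\alpha} = (0.16, 0.36)$, so the closed form reduces to $1.5 / (0.16/2 + 0.36/3) = 7.5$, attained at $x = (1.5, 1.5)$.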
These dual structures, necessary and sufficient for constancy of the MES,
are easily modified to encompass the cases where the production function is
homogeneous but does not satisfy constant returns to scale: virtually the same proof
goes through if we simply replace $y$ by $y^{1/\alpha}$, where $\alpha = \sum_i \alpha_i$, in (13), (14),
(15), and (16). The results are similarly extended to homothetic technologies if we
replace $y$ with $\Gamma(y)$ in (13), (14), (15), and (16), where $\Gamma$ is an increasing function.
In fact, this structure also suffices when there are multiple outputs, in which case
$\Gamma$ is a mapping from $\mathbb{R}^m_+$ into $\mathbb{R}_+$, increasing in each output quantity. Intuitively
25 Note the “self-duality” of this structure, a concept formulated by Houthakker [46] in the context
of dual consumer preferences: the cost-function structure in prices mirrors the CES/Cobb-Douglas
structure of the production function in input quantities.

these simple extensions of the basic result reflect the fact that all of the structure
implied by constancy of the elasticity of substitution is imbedded in the “shape”
(or “curvature”) of an isoquant, and as long as there is no change in its shape as
we move from one isoquant to another, the basic structure is preserved.26 The next
subsection examines the possibilities when the shape of the isoquant does change
when outputs change – i.e., when the technology is not homothetic.

Non-homothetic Technologies

Blackorby and Russell [11, 13] generalized the representation result for MES to
encompass the case of non-homothetic technologies. Their proof first characterizes
the constancy of the MES in terms of the structure of the cost function. In particular,
the MES are constant if and only if the cost function has the following structure:

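$$C(p, y) = \Gamma(y) \Big( \sum_{i=1}^{n} \alpha_i(y)\, p_i^{\hat{\rho}} \Big)^{1/\hat{\rho}}, \quad 0 \neq \hat{\rho} \leq 1, \tag{17}$$

or

$$C(p, y) = \Gamma(y) \prod_{i=1}^{n} p_i^{\beta_i}, \quad \beta_i > 0 \ \forall\, i, \quad \sum_{i=1}^{n} \beta_i = 1, \tag{18}$$

where, denoting the range of $F$ by $R(F)$, $\Gamma : R(F) \to \mathbb{R}_{++}$ and $\alpha_i : R(F) \to
\mathbb{R}_{++}$, $i = 1, \ldots, n$, are increasing functions.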
Thus, the basic CES/Cobb-Douglas structure is preserved when we expand
the set of allowable technologies to be non-homothetic. An important difference,
however, is the dependence on output y of the “distribution coefficients,” αi (y), i =
1, . . . , n in the CES structure in (17).27 This additional flexibility allows the
isoquants to “bend” differently for different output vectors while keeping constant
the curvature of the isoquant.
The distance function dual to (17) is

$$D(x, y) = \Gamma(y)^{-1} \Big( \sum_{i=1}^{n} \alpha_i(y)^{-1} x_i^{\rho} \Big)^{1/\rho}, \quad \rho = \frac{\hat{\rho}}{\hat{\rho} - 1}. \tag{19}$$

26 I am unaware of similar explorations of possible generalizations of the results on constancy of

the Allen-Uzawa elasticities, but intuition suggests that similar results would go through there as
well.
27 Blackorby and Russell [13] proved that the dependence on y of the corresponding coefficients,

βi , i = 1, . . . n, in (18) leads to a violation of positive monotonicity of the cost function in y. Thus,

generalization to non-homothetic technologies does not expand the Cobb-Douglas technologies
consistent with constancy of the MES.

Owing to the dependence of the distribution coefficients, $\alpha_i(y)$, $i = 1, \ldots, n$, on $y$,
the explicit production function cannot be derived in closed form in the case of a
single output. In the Cobb-Douglas case, however, the dual distance function is

$$D(x, y) = \Gamma(y)^{-1} \alpha \prod_{i=1}^{n} x_i^{\alpha_i}, \tag{20}$$

which in the case of a single output can be set equal to one and inverted in $y$ to
obtain the explicit production function:

$$y = \Gamma^{-1} \Big( \alpha \prod_{i=1}^{n} x_i^{\alpha_i} \Big).$$

As pointed out by Blackorby and Russell [13], if the cost structure is given by
(17)–(18), the Allen-Uzawa elasticities are constant and equal to the Morishima
elasticities. As far as I know, however, necessary structural conditions for constancy
of the Allen-Uzawa elasticities have not been worked out, though one might expect
some variation on (17)–(18).

Dual Elasticities of Substitution

Two-Input Elasticity of Substitution Redux

Let us return briefly to the discussion of the two-variable elasticity of substitution

in section “Two-Input Elasticity of Substitution: Early Formulations and Charac-
terizations”, where the elasticity is defined as the log derivative of the quantity
ratio with respect to the technical rate of substitution: $\sigma = \phi'(\ln trs_{\ell,k})$. The
inverse of $\phi'$, with image $\sigma^d := \theta'\big(\ln(x_k/x_\ell)\big)$, is the log derivative of a technical
rate of substitution with respect to a quantity ratio. This is also an elasticity, one
that is dual to $\sigma$. In contrast to the (direct) elasticity of substitution $\sigma$, large
values of $\sigma^d$ reflect “difficult” substitution, or strong complementarity, whereas
low values reflect “easy” substitution, or weak complementarity. Following the lead
of Sato and Koizumi [72], several papers refer to this concept as an elasticity of
complementarity.28 In what follows, I use the terms “dual elasticity of substitution”
and “elasticity of complementarity” interchangeably.29
Of course, in the two-input case, σ and σ d convey the same information about
the curvature of the isoquant and the degree of substitutability (or complementarity)
between the two inputs. In the remainder of this section, I extend this dual concept

28 See Bertoletti [10], Kim [50], and Stern [76].
29 Yet another possible assignation is “shadow elasticity of substitution,” since this dual concept is
formulated in terms of shadow prices.

to technologies with more than two inputs, paralleling the development of Allen
and Morishima (direct) elasticities in the previous section. The development is
facilitated by the use of duality theory, since the $trs_{\ell,k}$ can be interpreted as the
relative shadow price of inputs $\ell$ and $k$ and in fact is equal to the market price ratio,
$p_\ell/p_k$, under conditions of competitive market pricing.

Dual Morishima and Allen Elasticities of Substitution

The dual Morishima elasticity of substitution (Blackorby and Russell [11, 13]) is
given by
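\begin{align}
\sigma^{DM}_{ij}(x, y) &:= \frac{\partial \ln \big( \hat{\rho}_i(x^i, y) / \hat{\rho}_j(x^i, y) \big)}{\partial \ln(x_j/x_i)} \tag{21} \\
&= x_j \left[ \frac{D_{ij}(x, y)}{D_i(x, y)} - \frac{D_{jj}(x, y)}{D_j(x, y)} \right] \tag{22} \\
&= \varepsilon^D_{ij}(x, y) - \varepsilon^D_{jj}(x, y), \tag{23}
\end{align}

where $x^i$ is the $(n - 1)$-dimensional vector of input quantity ratios with $x_i$ in the
denominator and

$$\varepsilon^D_{ij}(x, y) = \frac{\partial \ln \rho_i(x, y)}{\partial \ln x_j}$$

is the (constant-output) elasticity of the shadow price of input $i$ with respect to
changes in the quantity of input $j$.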
Analogously, Blackorby and Russell [13] proposed the Allen elasticity of
complementarity (alternatively, the dual Allen elasticity of substitution):

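\begin{align}
\sigma^{DA}_{ij}(x, y) &= \frac{D(x, y)\, D_{ij}(x, y)}{D_i(x, y)\, D_j(x, y)} \tag{24} \\
&= \frac{\varepsilon^D_{ij}(x, y)}{s^D_j(x, y)}, \tag{25}
\end{align}

where

$$s^D_j(x, y) = \rho_j(x, y) \cdot x_j$$

is the cost share of input $j$ (assuming cost-minimizing behavior).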

If $\sigma^{DA}_{ij}(x, y) < 0$ (i.e., if increasing the $j$th quantity decreases the shadow
price of input $i$), we say that inputs $i$ and $j$ are Allen-Uzawa dual substitutes;
if $\sigma^{DA}_{ij}(x, y) > 0$, they are Allen-Uzawa dual complements. Similarly, if
$\sigma^{DM}_{ij}(x, y) < 0$ (i.e., if increasing the $j$th quantity decreases the shadow price
of input $i$ relative to the shadow price of input $j$), we say that input $j$ is a dual
Morishima substitute for input $i$; if $\sigma^{DM}_{ij}(x, y) > 0$, input $j$ is a dual Morishima
complement to input $i$.
Recall from section “Two-Input Elasticity of Substitution Redux” that in the two-
input case the elasticities of substitution and complementarity are simple inverses
of one another. This is clearly not the case when n > 2.30 Interestingly, since
the distance function is concave in $x$, and hence $\varepsilon^D_{jj}(x, y)$ in (23) is non-positive,

the Morishima elasticity leans more toward dual complementarity than does the
Allen elasticity (in sharp contrast to the primal taxonomy in section “AES and MES
and the Comparative Statics of Income Shares”). Similarly, if two inputs are dual
Allen-Uzawa complements, they must be dual Morishima complements, whereas
two inputs can be dual Allen-Uzawa substitutes but dual Morishima complements.
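
The dual elasticities can be evaluated from a distance function in the same way that the direct elasticities are evaluated from a cost function. The sketch below is illustrative only: it assumes a hypothetical three-input, linearly homogeneous CES technology, so that $D(x, y) = F(x)/y$, and uses finite differences for the derivatives in (22) and (24); the function names, parameters, and step size are assumptions made for this example. For this technology both dual elasticities should equal $1 - \rho$, the reciprocal of the direct elasticity $1/(1 - \rho)$.

```python
def ces_distance(x, y, a=(0.2, 0.3, 0.5), rho=0.5):
    """Distance function D(x, y) = F(x) / y for a linearly homogeneous CES F."""
    return sum(ai * xi**rho for ai, xi in zip(a, x)) ** (1.0 / rho) / y

def first(D, x, y, i, rel=1e-4):
    """Central-difference approximation of D_i."""
    h = rel * x[i]
    up, dn = list(x), list(x)
    up[i] += h
    dn[i] -= h
    return (D(up, y) - D(dn, y)) / (2.0 * h)

def second(D, x, y, i, j, rel=1e-4):
    """Central-difference approximation of D_ij."""
    hi, hj = rel * x[i], rel * x[j]

    def at(si, sj):
        z = list(x)
        z[i] += si * hi
        z[j] += sj * hj
        return D(z, y)

    return (at(1, 1) - at(1, -1) - at(-1, 1) + at(-1, -1)) / (4.0 * hi * hj)

def dual_allen(D, x, y, i, j):
    """Allen elasticity of complementarity, equation (24)."""
    return D(x, y) * second(D, x, y, i, j) / (first(D, x, y, i) * first(D, x, y, j))

def dual_morishima(D, x, y, i, j):
    """Dual Morishima elasticity, equation (22)."""
    return x[j] * (second(D, x, y, i, j) / first(D, x, y, i)
                   - second(D, x, y, j, j) / first(D, x, y, j))

x, y = [1.0, 2.0, 3.0], 1.0
print(dual_allen(ces_distance, x, y, 0, 1))      # close to 1 - 0.5 = 0.5
print(dual_morishima(ces_distance, x, y, 0, 1))  # close to 0.5
print(dual_morishima(ces_distance, x, y, 1, 0))  # close to 0.5
```

Both values are positive here, so under the taxonomy above every pair of inputs in this CES example is classified as dual complements.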
There exist, of course, dual comparative-static results linking factor cost shares
and elasticities of complementarity.31 Consider first the effect of quantity changes
on absolute shares (for $i \neq j$):

$$\frac{\partial \ln s^D_i(x, y)}{\partial \ln x_j} = \varepsilon^D_{ij}(x, y) = \sigma^{DA}_{ij}(x, y)\, s^D_j(x, y),$$

so that an increase in $x_j$ increases the absolute share of input $i$ if and only
if $\varepsilon^D_{ij}(x, y) > 0$ or, equivalently, $\sigma^{DA}_{ij}(x, y) > 0$: that is, if and only if
inputs i and j are dual Allen-Uzawa complements. Thus, the dual elasticities
provide immediate qualitative comparative-static information about the effect of
quantity changes on (absolute) shares. To obtain quantitative comparative-static
information, one needs to know the share of the j th input as well as the Allen-Uzawa
elasticity of complementarity. Of course, the (constant-output) elasticity derived
from the distance function, $\varepsilon^D_{ij}(x, y)$, yields the same (qualitative and quantitative)
comparative-static information.
Comparative-static information about relative income shares in the face of
quantity changes can be extracted from the Morishima elasticity. As the share
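functions, $s^D_i$, $i = 1, \ldots, n$, are homogeneous of degree zero in quantities, we can
re-write their images as $\tilde{s}^D_i(x^i, y) := s^D_i(x, y)$. We then obtain

$$\frac{\partial \ln \big( \tilde{s}^D_i(x^i, y) / \tilde{s}^D_j(x^i, y) \big)}{\partial \ln(x_j/x_i)} = \varepsilon^D_{ij}(x, y) - \varepsilon^D_{jj}(x, y) - 1 = \sigma^{DM}_{ij}(x, y) - 1.$$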
∂ ln(xj /xi )

30 Of course, the Allen and Morishima elasticities of complementarity are identical when n = 2, as
is the case with Allen and Morishima elasticities of substitution.
31 While shadow prices and dual elasticities are well defined even if the input requirement sets are

not convex, the comparative statics of income shares using these elasticities requires convexity (as
well, of course, as price-taking, cost-minimizing behavior), which implies concavity of the distance
function in x. By way of contrast, convexity of input requirement sets is not required for the
comparative statics of income shares using the direct elasticities of substitution, since the cost function is necessarily
concave in prices. See Russell [71] for a discussion of these issues.

Thus, an increase in $x_j$ increases the share of input i relative to input j if and only
if

$$\sigma^{DM}_{ij}(x, y) > 1:$$

that is, if and only if inputs i and j are sufficiently complementary in terms of the
dual Morishima elasticity of complementarity. Moreover, the degree of departure
from unity provides immediate quantitative information about the effect on the
relative factor share. Thus, the dual Morishima elasticities provide immediate
quantitative and qualitative comparative-static information about the effect of
quantity changes on relative shares.
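The relative-share result can be illustrated numerically. The following is a minimal sketch, assuming a single-output CES production function with hypothetical parameter values (`a`, `rho`) that are not taken from the chapter. In the single-output case the shadow-price ratio equals the ratio of marginal products, so the dual Morishima elasticity can be computed by a finite difference in $\ln x_j$ and compared with the response of the relative share.

```python
import math

def shadow_price_ratio(x, a, rho, i, j):
    """Ratio of marginal products F_i/F_j for F(x) = (sum_k a_k x_k^rho)^(1/rho)."""
    return (a[i] * x[i] ** (rho - 1)) / (a[j] * x[j] ** (rho - 1))

def log_derivative_wrt_xj(g, x, j, h=1e-6):
    """Central difference of ln g(x) with respect to ln x_j (only x_j varies)."""
    up = list(x); up[j] *= math.exp(h)
    dn = list(x); dn[j] *= math.exp(-h)
    return (math.log(g(up)) - math.log(g(dn))) / (2 * h)

# Hypothetical parameters: rho = -1 corresponds to a primal elasticity of 1/2.
a, rho = [0.3, 0.7], -1.0
x = [2.0, 5.0]
i, j = 0, 1

# Dual Morishima elasticity: response of the shadow-price ratio to ln x_j.
sigma_dm = log_derivative_wrt_xj(lambda q: shadow_price_ratio(q, a, rho, i, j), x, j)

# Response of the relative share s_i/s_j = (F_i x_i)/(F_j x_j) to ln x_j.
share_response = log_derivative_wrt_xj(
    lambda q: shadow_price_ratio(q, a, rho, i, j) * q[i] / q[j], x, j)

print(sigma_dm, share_response)
```

Under these assumptions the dual Morishima elasticity equals $1 - \rho = 2$, which exceeds one, and the relative share of input i rises with $x_j$ at the rate $\sigma^{DM}_{ij} - 1$.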
As pointed out by Blackorby and Russell [13, p. 153], constancy of the dual
Morishima elasticities of complementarity entails precisely the same restrictions on the
production technology as does constancy of the MES. This is because
the required structure of the cost function and the distance function is self-dual, as
can be seen by inspection of (13)–(14) and (19)–(20).

Symmetric Elasticity of Complementarity

Stern [75] points out that the dual Morishima elasticity does not reflect the curvature
of the isoquant. While the log derivative in (21) holds output quantities y constant
in assessing the effect on the shadow-price ratio of a change in the quantity ratio
(changing only the jth quantity), it does not maintain $D(x, y) = 1$. Consequently,
the direction of the differential change in the quantity ratio is not consistent with
containment in the y-isoquant.
Stern defines the symmetric elasticity of complementarity as follows:

$$\sigma^{SEC}_{ij}(x, y) = \left.\frac{\partial \ln\left(\hat{\rho}_i(x^i, y)/\hat{\rho}_j(x^i, y)\right)}{\partial \ln(x_j/x_i)}\right|_{D(x,y)=1} \tag{26}$$

$$= \frac{\Psi(x, y)}{\dfrac{1}{D_i(x, y)\,x_i} + \dfrac{1}{D_j(x, y)\,x_j}}, \tag{27}$$

where

$$\Psi(x, y) = -\frac{D_{ii}(x, y)}{D_i(x, y)^2} + 2\,\frac{D_{ij}(x, y)}{D_i(x, y)\,D_j(x, y)} - \frac{D_{jj}(x, y)}{D_j(x, y)^2}.$$

This elasticity is symmetric, reflecting the required (differential) movement along
the isoquant. Moreover, it can be expressed as a share-weighted average of the dual
Morishima elasticities [75],

$$\sigma^{SEC}_{ij}(x, y) = \frac{s_i^D(x, y)}{s_i^D(x, y) + s_j^D(x, y)}\,\sigma^{DM}_{ij}(x, y) + \frac{s_j^D(x, y)}{s_i^D(x, y) + s_j^D(x, y)}\,\sigma^{DM}_{ji}(x, y), \tag{28}$$

or the dual Allen elasticities (Stern [76]),

$$\sigma^{SEC}_{ij}(x, y) = \frac{s_i^D(x, y)}{s_i^D(x, y) + s_j^D(x, y)}\,\sigma^{DA}_{ij}(x, y) + \frac{s_j^D(x, y)}{s_i^D(x, y) + s_j^D(x, y)}\,\sigma^{DA}_{ji}(x, y). \tag{29}$$

The symmetric elasticity of complementarity is dual to the shadow elasticity of
substitution (McFadden [61] and Mundlak [63]), which is derived by evaluating
the derivative of a quantity ratio with respect to a price ratio along a constant-cost
frontier (that is, for a fixed cost level c):

$$\sigma^{SES}_{ij}(p, y) = \left.\frac{\partial \ln\left(C_i(p, y)/C_j(p, y)\right)}{\partial \ln(p_j/p_i)}\right|_{C(p,y)=c} = \frac{\tilde{\Psi}(p, y)}{\dfrac{1}{C_i(p, y)\,p_i} + \dfrac{1}{C_j(p, y)\,p_j}},$$

where

$$\tilde{\Psi}(p, y) = -\frac{C_{ii}(p, y)}{C_i(p, y)^2} + 2\,\frac{C_{ij}(p, y)}{C_i(p, y)\,C_j(p, y)} - \frac{C_{jj}(p, y)}{C_j(p, y)^2}.$$

As shown by Chambers [22] and Stern [76], respectively, the shadow elasticity
of substitution can be written as a share-weighted average of Morishima or Allen
elasticities, as in the corresponding relationships for the dual elasticities of complementarity,
(28) and (29).
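The Morishima version of this relationship is an algebraic identity and can be checked numerically. The sketch below assumes a three-input Generalized Leontief unit cost function, $C(p) = \sum_i \sum_j b_{ij} (p_i p_j)^{1/2}$, with hypothetical coefficients; this functional form and the names `B`, `c1`, and `c2` are illustrative choices, not part of the chapter. It compares the shadow elasticity computed from the second-derivative formula with the share-weighted average of the two Morishima elasticities, $\sigma^{M}_{ij} = p_j\left(C_{ij}/C_i - C_{jj}/C_j\right)$.

```python
import math

# Hypothetical symmetric coefficients b_ij (illustrative values only).
B = [[1.0, 0.5, 0.2],
     [0.5, 2.0, 0.8],
     [0.2, 0.8, 1.5]]
N = len(B)

def cost(p):
    """Generalized Leontief unit cost function."""
    return sum(B[i][j] * math.sqrt(p[i] * p[j]) for i in range(N) for j in range(N))

def c1(p, i):
    """First derivative C_i (the constant-output demand, by Shephard's Lemma)."""
    return sum(B[i][j] * math.sqrt(p[j] / p[i]) for j in range(N))

def c2(p, i, j):
    """Second derivative C_ij."""
    if i != j:
        return 0.5 * B[i][j] / math.sqrt(p[i] * p[j])
    off = sum(B[i][k] * math.sqrt(p[k]) for k in range(N) if k != i)
    return -0.5 * off / p[i] ** 1.5

def morishima(p, i, j):
    """Cross elasticity of input i minus own elasticity of input j (price j varies)."""
    return p[j] * (c2(p, i, j) / c1(p, i) - c2(p, j, j) / c1(p, j))

def shadow_es(p, i, j):
    """Shadow elasticity of substitution from the second-derivative formula."""
    ci, cj = c1(p, i), c1(p, j)
    psi = -c2(p, i, i) / ci**2 + 2 * c2(p, i, j) / (ci * cj) - c2(p, j, j) / cj**2
    return psi / (1 / (ci * p[i]) + 1 / (cj * p[j]))

def share_weighted_mes(p, i, j):
    """Cost-share-weighted average of the two Morishima elasticities."""
    si = p[i] * c1(p, i) / cost(p)
    sj = p[j] * c1(p, j) / cost(p)
    return (si * morishima(p, i, j) + sj * morishima(p, j, i)) / (si + sj)

p = [1.0, 2.0, 3.0]
ses = shadow_es(p, 0, 1)
avg = share_weighted_mes(p, 0, 1)
print(ses, avg)
```

The two numbers agree up to rounding error, and `shadow_es(p, 0, 1)` equals `shadow_es(p, 1, 0)`, reflecting the symmetry of the shadow elasticity.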

Gross Elasticities of Substitution

The (primal and dual) Allen and Morishima elasticities of substitution are formu-
lated in terms of constant-output demand functions. Their immediate usefulness in
studies of the comparative statics of factor incomes is limited to firms that are output
constrained or to firms with homothetic technologies (in which case the elasticities
are independent of output quantities). This limitation prompted the formulation of
elasticities of substitution that incorporate the effects of optimal output adjustments
as input prices change. Following Blackorby et al. [16], I refer to these measurement
concepts as gross elasticities of substitution, contrasting them to the standard Allen
and Morishima elasticities, which assess net input quantity changes, that is, changes
that abstract from the effects of output changes.
The gross analogue of the Allen elasticity, first formulated by Mundlak [63]
using primal production theory methods, was formulated in the dual by Lau [56] and
referred to by Bertoletti [9, 10] as the Hotelling-Lau elasticity in his resurrection of
this concept. The gross analogue of the Morishima elasticity was proposed by Davis
and Shumway [28] and contrasted with the Hotelling-Lau elasticity by Blackorby
et al. [16] and Syrquin and Hollender [77].32 Both of these gross elasticities are most
evocatively expressed in terms of the profit function (and in fact are formulated by
simply substituting the profit function for the cost function in the Allen elasticity).
Let $r \in \mathbb{R}^m_{++}$ be the vector of output prices, indexed by $k = 1, \ldots, m$. The
profit function, $\Pi$, with domain $\mathbb{R}^n_{++} \times \mathbb{R}^m_{++}$, is defined by

$$\Pi(p, r) = \max_{x, y}\,\{r \cdot y - p \cdot x \mid (x, y) \in T\} \tag{30}$$

$$= r \cdot \phi(p, r) - p \cdot \zeta(p, r), \tag{31}$$

where $\phi$ and $\zeta$ are the (vector-valued) output-supply and input-demand functions,
respectively. The profit function is nondecreasing in r, nonincreasing in p, and
convex, jointly continuous, and homogeneous of degree one in (r, p). We assume
in addition that it is twice continuously differentiable in all prices. The vector of
supply functions and the vector of input demand functions are derived from the
profit function using the envelope theorem (often termed Hotelling's Lemma in
this context):

$$\phi_k(p, r) = \Pi_k(p, r) \quad \forall\, k$$

and

$$\zeta_i(p, r) = -\Pi_i(p, r) \quad \forall\, i,$$

where subscripts on the profit function indicate differentiation with respect to the
indicated output or input price.
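The extension of the Allen elasticity to encompass output-quantity changes as
formulated by Lau [56], the Hotelling-Lau elasticity (HLES), is given, for inputs
i and j, by

$$\sigma^{HL}_{ij}(p, r) = \frac{\Pi(p, r)\,\Pi_{ij}(p, r)}{\Pi_i(p, r)\,\Pi_j(p, r)} = -\frac{\Pi(p, r)}{p_j x_j}\,\frac{\partial \ln \zeta_i(p, r)}{\partial \ln p_j},$$

where $x_j = \zeta_j(p, r)$ and the minus sign in the last expression reflects Hotelling's Lemma ($\Pi_i = -\zeta_i$, so that $\Pi_{ij} = -\partial \zeta_i/\partial p_j$).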

32 See also Hicks [44], Sato and Koizumi [72], and Stern [76] for discussions of these issues.

Blackorby et al. [16] demonstrate that the HLES inherits the inadequacies of the
AES, from which it is constructed by analogy. They summarize their evaluation as
follows (page 206): The HLES “is not a logarithmic derivative of a quantity ratio
with respect to a price ratio—allowing output to change, and it does not provide
comparative static content about relative factor incomes. In fact it is not even a
generalization of the AUES in any meaningful sense, since it does not reduce to the
latter under the assumption of homotheticity.”
Blackorby et al. go on to construct a gross version of the MES that rectifies the
problems with the HLES. Note that the optimal input ratio can be written as

$$\ln \frac{\zeta_i(p, r)}{\zeta_j(p, r)} = \ln \frac{-\Pi_i(p, r)}{-\Pi_j(p, r)} = \ln \frac{\Pi_i(p, r)}{\Pi_j(p, r)}, \tag{32}$$

where the second expression is an application of Hotelling's Lemma. To differentiate the
ratio in (32) with respect to the log of $p_i/p_j$, note that, using homogeneity of degree
one of $\Pi$ in all prices,

$$\Pi(p, r) = p_j\,\hat{\Pi}^{j}(p^{-j}/p_j, r/p_j), \tag{33}$$

where $p^{-j}$ is the (n − 1)-dimensional vector obtained by purging $p_j$ from p (so that $p^{-j}/p_j$ is a vector of price ratios).
Similarly, owing to homogeneity of degree zero of the demand functions,

$$\zeta_i(p, r) = \hat{\zeta}_i(p^{-j}/p_j, r/p_j) \quad \forall\, i.$$

Application of Hotelling's Lemma to (33) yields the Morishima gross elasticity
of substitution:

$$\sigma^{MG}_{ij}(p, r) = \frac{\partial \ln\left(\hat{\zeta}_i(p^{-j}/p_j, r/p_j)/\hat{\zeta}_j(p^{-j}/p_j, r/p_j)\right)}{\partial \ln(p_i/p_j)} \tag{34}$$

$$= p_j \left[\frac{\Pi_{ij}(p, r)}{\Pi_i(p, r)} - \frac{\Pi_{jj}(p, r)}{\Pi_j(p, r)}\right] \tag{35}$$

$$= \epsilon^{*}_{ij}(p, r) - \epsilon^{*}_{jj}(p, r), \tag{36}$$

where $\epsilon^{*}_{ij}(p, r)$ is the (gross) cross elasticity of demand for input i with respect to
the jth price and $\epsilon^{*}_{jj}(p, r)$ is the own (gross) price elasticity of demand for input
j. Thus, analogous to the MES, the MGES is simply the difference between the
appropriate (gross) cross elasticity of input demand and the (gross) own elasticity
of the input associated with the jth axis, along which the price ratio is being varied.
By construction, the MGES is a derivative of an optimal input-quantity ratio
with respect to the relevant input price ratio when outputs as well as inputs are
allowed to adjust. This elasticity, moreover, provides immediate information about
the comparative statics of factor income shares. Define the relative shares of inputs
i and j:

$$s_{ij}(p, r) = \frac{p_i\,\zeta_i(p, r)}{p_j\,\zeta_j(p, r)}$$

or, in terms of price ratios,

$$\hat{s}_{ij}(p^{-j}/p_j, r/p_j) = \frac{p_i\,\hat{\zeta}_i(p^{-j}/p_j, r/p_j)}{p_j\,\hat{\zeta}_j(p^{-j}/p_j, r/p_j)}.$$

Some tedious but straightforward calculations yield

$$\frac{\partial \ln \hat{s}_{ij}(p^{-j}/p_j, r/p_j)}{\partial \ln(p_i/p_j)} = 1 - \sigma^{MG}_{ij}(p, r).$$

That is, an increase in the price of input i relative to the price of input j (actually,
holding $p_j$ fixed) increases the share of input i relative to input j if and only
if $\sigma^{MG}_{ij}(p, r) < 1$. Thus, the Morishima gross elasticity, unlike the Hotelling-Lau
elasticity, yields immediate (qualitative and quantitative) comparative static
information about the effect of changes in relative input prices on the relative
factor income shares. As shown by Blackorby et al. [16], the MGES reduces to
the MES when the technology is homothetic, whereas the Hotelling-Lau elasticity does
not collapse to the Allen elasticity under this restriction.
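As a minimal numerical sketch of the gross elasticities in (36), consider a price-taking firm with a decreasing-returns Cobb-Douglas technology, $y = x_1^{a} x_2^{b}$ with $a + b < 1$. The technology, the parameter values, and the function names below are illustrative assumptions, not taken from the chapter. The profit-maximizing demands follow from the first-order conditions $p_i x_i = a_i r y$, and the gross elasticities are obtained by central differences in logs.

```python
import math

A, B = 0.3, 0.5  # hypothetical output elasticities, with A + B < 1

def demands(p1, p2, r):
    """Profit-maximizing input demands for y = x1^A * x2^B at output price r."""
    theta = 1.0 / (1.0 - A - B)
    y = ((A * r / p1) ** A * (B * r / p2) ** B) ** theta  # optimal output
    return A * r * y / p1, B * r * y / p2

def gross_elasticity(i, j, p, r, h=1e-6):
    """d ln x_i / d ln p_j with output free to adjust (central difference)."""
    up = list(p); up[j] *= math.exp(h)
    dn = list(p); dn[j] *= math.exp(-h)
    return (math.log(demands(*up, r)[i]) - math.log(demands(*dn, r)[i])) / (2 * h)

def mges(i, j, p, r):
    """Gross cross elasticity minus gross own elasticity, as in (36)."""
    return gross_elasticity(i, j, p, r) - gross_elasticity(j, j, p, r)

p, r = [2.0, 3.0], 10.0
cross = gross_elasticity(0, 1, p, r)
own = gross_elasticity(1, 1, p, r)
sigma_mg = mges(0, 1, p, r)
print(cross, own, sigma_mg)
```

With these assumed parameters the gross cross and own elasticities are −2.5 and −3.5, so the difference in (36) is one: relative factor shares do not respond to relative input prices, which matches the unitary Morishima elasticity of the (homothetic) Cobb-Douglas technology.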

Elasticities of Substitution and Separability

Technological separability – independence of technical rates of substitution of a

subset of pairs of inputs or outputs from the quantities of inputs or outputs not
belonging to this subset – is a powerful restriction rationalizing the existence of
aggregate input or output quantities and the decentralization of optimization prob-
lems (e.g., output-constrained cost minimization).33 Applications of the concept to
dual representations of the technology – independence of a set of dual marginal rates
of substitution (or complementarity) from price levels of inputs or outputs not in the
set – have dual implications for (price) aggregation and decentralization.34
As was first noticed by Berndt and Christensen [6], certain identity restrictions
on the Allen elasticities of substitution are equivalent, under some strong regularity
conditions (principally homotheticity), to some corresponding separability restric-
tions on the technology. Russell [70] and Blackorby and Russell [12] generalized
these results for the AES and then extended them to encompass the Morishima
elasticities in Blackorby and Russell [13]. The requisite technological restrictions

33 The concept was independently conceived by Leontief [57, 58] and Sono [74]. See Blackorby
et al. [15] for a comprehensive development of the concept and its applications and for citations to
the literature extending the Leontief-Sono concept.
34 In fact, the concept is abstract: it can be applied to any (multiple variable) function.

take the form of separability conditions for the dual cost function (or, equivalently,
the distance function).

Separability and Functional Structure

Inputs i and j, say, are separable from input k in the distance function if

$$\frac{\partial}{\partial x_k}\left(\frac{D_i(x, y)}{D_j(x, y)}\right) = \frac{\partial}{\partial x_k}\left(\frac{\rho_i(x, y)}{\rho_j(x, y)}\right) = 0. \tag{37}$$

That is, the technical rate of substitution of input i for input j, given the output level
y, is independent of the quantity employed of input k. In the single-output case, this
condition is equivalent to35

$$\frac{\partial}{\partial x_k}\left(\frac{F_i(x)}{F_j(x)}\right) = 0. \tag{38}$$

Similarly, under the assumption of differentiability of the cost function, input prices
i and j are separable from input price k in C if

$$\frac{\partial}{\partial p_k}\left(\frac{C_i(p, y)}{C_j(p, y)}\right) = \frac{\partial}{\partial p_k}\left(\frac{\delta_i(p, y)}{\delta_j(p, y)}\right) = 0. \tag{39}$$

That is, the ratio of constant-output demand-function images i and j is independent
of input price k for given output y.
Now partition the set of input variable indices $I = \{1, \ldots, n\}$ into subsets, $\mathcal{I} = \{I^1, \ldots, I^S\}$. The corresponding decompositions of the vectors x and p are $x = (x^1, \ldots, x^S)$ and $p = (p^1, \ldots, p^S)$. Define the sets of triples,

$$\mathcal{I}_S = \{(i, j, k) \mid (i, j) \in I^r \times I^r \wedge k \in I^s,\ r \neq s\}$$

and

$$\mathcal{I}_C = \{(i, j, k) \mid i \in I^r,\ j \in I^s,\ k \notin I^s \cup I^r\}.$$

We say that D, F, or C is separable in the partition $\mathcal{I}$ if (quantity or price)
variables i and j are separable from k for all $(i, j, k) \in \mathcal{I}_S$ and completely separable
in the partition $\mathcal{I}$ if variables i and j are separable from k for all $(i, j, k) \in \mathcal{I}_C$. That
is, D, F, or C is separable in $\mathcal{I}$ if ratios of derivatives of (i.e., trade-offs between)
variables in any set belonging to $\mathcal{I}$ are independent of values of variables outside that
set, and D, F, or C is completely separable in $\mathcal{I}$ if ratios of derivatives between any
two variables in I are independent of values of variables outside the sets containing
the variables in the ratio.

35 In the case where m = 1, D(x, F(x)) = 1 on the isoquant for output F(x). Differentiate this
identity with respect to $x_i$ and $x_j$ and take the ratio to obtain this equivalence.
Positing these separability conditions imposes structural restrictions on the
functions representing the technology. In particular, separability of F in the partition
$\mathcal{I}$ holds if and only if the production-function image (in the single-output case) can
be written as

$$F(x) = \hat{F}\left(F^1(x^1), \ldots, F^S(x^S)\right). \tag{40}$$

The “aggregator” functions, $F^1, \ldots, F^S$, inherit the curvature and monotonicity
properties of F and are interpreted as aggregate input quantities. The “macro”
function $\hat{F}$ is increasing in its arguments.
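The link between the structure (40) and the separability condition (38) can be checked numerically. The following minimal sketch assumes a three-input production function in which inputs 1 and 2 enter through a hypothetical CES aggregator and input 3 enters the macro function directly; all functional forms and parameter values are illustrative assumptions. The technical rate of substitution between inputs 1 and 2 should not vary with input 3, whereas the rate between inputs 1 and 3 does vary with input 2.

```python
def inner(x1, x2):
    """Hypothetical aggregator F^1(x1, x2) for the first group of inputs."""
    return (0.4 * x1 ** 0.5 + 0.6 * x2 ** 0.5) ** 2

def production(x):
    """F(x) = F_hat(F^1(x1, x2), x3) with a Cobb-Douglas macro function."""
    return inner(x[0], x[1]) ** 0.7 * x[2] ** 0.3

def marginal_product(f, x, i, h=1e-6):
    """Central-difference approximation of the partial derivative F_i(x)."""
    up = list(x); up[i] += h
    dn = list(x); dn[i] -= h
    return (f(up) - f(dn)) / (2 * h)

def trs(f, x, i, j):
    """Technical rate of substitution F_i(x) / F_j(x)."""
    return marginal_product(f, x, i) / marginal_product(f, x, j)

# Inputs 1 and 2 (indices 0, 1) are separable from input 3 (index 2):
within_low = trs(production, [1.0, 2.0, 3.0], 0, 1)
within_high = trs(production, [1.0, 2.0, 7.0], 0, 1)

# Inputs 1 and 3 are not separable from input 2:
across_low = trs(production, [1.0, 2.0, 3.0], 0, 2)
across_high = trs(production, [1.0, 5.0, 3.0], 0, 2)

print(within_low, within_high, across_low, across_high)
```

In this sketch the first two numbers coincide (up to finite-difference error) while the last two differ, which is the pattern required by separability in the partition {{1, 2}, {3}} but not by complete separability.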
The cost function is separable in the partition $\mathcal{I}$ if and only if

$$C(p, y) = \hat{C}\left(y, C^1(p^1, y), \ldots, C^S(p^S, y)\right) \tag{41}$$

and the distance function is separable in the partition $\mathcal{I}$ if and only if

$$D(x, y) = \hat{D}\left(y, D^1(x^1, y), \ldots, D^S(x^S, y)\right). \tag{42}$$

The sectoral distance and cost functions, $D^1, \ldots, D^S$ and $C^1, \ldots, C^S$, inherit
the properties of D and C; the macro functions, $\hat{D}$ and $\hat{C}$, are increasing in the
aggregator-function images. The structures, (41) and (42), are self-dual: that is, the
structure (41) holds if and only if (42) holds.36
If the technology is homothetic, the aggregator functions in (40) can be normal-
ized to be homogeneous of degree one. Moreover, the dual representations, (41) and
(42), simplify as follows:

C(p, y) = (y) C̃ 1 (p), . . . , S (p) (43)

and

D(x, y) = (y) D̃ 1 (x), . . . , S (x) , (44)

where the aggregator functions are homogeneous of degree one and can be
interpreted as sectoral price and quantity indexes, is an increasing function, and
is a decreasing function.
Finally, if the production function satisfies first-degree homogeneity, (43) and
(44) simplify to

36 Proofs of these and other results in this section can be found in Blackorby et al. [15].

C(p, y) = y C̃ 1 (p), . . . , S (p) (45)

and

D(x, y) = y −1 D̃ 1 (x), . . . , S (x) . (46)

Note that, on the frontier, D(x, y) = 1 and inversion of (46) in y yields the
production function (40).
Recall that D, F, or C is completely separable in the partition $\mathcal{I}$ if (quantity or
price) variables i and j are separable from k for all $(i, j, k) \in \mathcal{I}_C$.
Assume that the partition of the price and quantity variables I contains more than
two groups (S > 2).37 Then the (symmetrically dual) cost and distance functions
have the following images if and only if they satisfy complete separability in the
partition I 38 :
S
C(p, y) = C̄ y, C s (ps , y) (47)
s=1

S 1/ρ̂(y)
= (y) Ĉ C s (ps , y)ρ̂(y) , 0 = ρ̂(y) ≤ 1, (48)
s=1
or

S
(y) C s (ps , y)βs (y) , βs (y) > 0 ∀ s,
s=1

S
βs (y) = 1, (49)
s=1

and
S
D(x, y) = D̄ y, s (x s , y) (50)
s=1

S 1/ρ(y)
= (y)−1 D̂ s (x s , y)ρ(y) , 0 = ρ(y) ≤ 1, (51)
s=1

37 Don’t ask. Or if you can’t resist, I refer you to Section 4.6 of Blackorby et al. [15] on “Sono
independence” and additivity in a binary partition.

38 Analogous to the case (13), $\rho(y) = \hat{\rho}(y)/\left(\hat{\rho}(y) - 1\right)$.

or

S
(y)−1 s (x s , y)βs (y) βs (y) > 0 ∀ s,
s=1

S
βs (y) = 1. (52)
s=1

Thus, complete separability results in a dual structure for the cost and distance
functions that is CES in the aggregator-function images, $C^1(p^1, y), \ldots, C^S(p^S, y)$
and $D^1(x^1, y), \ldots, D^S(x^S, y)$. These aggregator-function images cannot be interpreted,
however, as sectoral price and quantity indexes, since they depend on
the value of the output vector as well as sector-specific prices and quantities. If,
however, we conjoin complete separability and the assumption of homotheticity of
the technology, the above structure simplifies to

S 1/ρ̂
C(p, y) = (y) s (ps )ρ̂ , 0 = ρ̂ ≤ 1, (53)
s=1
or

S
(y) s (ps )βs , βs > 0 ∀ s,
s=1

S
βs = 1, (54)
s=1

and

S 1/ρ
−1
D(p, y) = (y) s
Λ (x ) s ρ
, 0 = ρ ≤ 1, (55)
s=1
or

S
(y)−1 Λs (x s )βs , βs > 0 ∀ s,
s=1

S
βs = 1. (56)
s=1

The functions, s (ps ) and s (x s ), s = 1, . . . , S, satisfy the salient (monotonicity

and homogeneity) properties of price and quantity indexes, respectively.
If m = 1 in (55) and (56), inversion of D(x, y) = 1 yields the production
function,

S 1/ρ
y = F (x) = s
Λ (x ) s ρ
, 0 = ρ ≤ 1, (57)
s=1
or

S
Λs (x s )βs , βs > 0 ∀ s,
s=1

S
βs = 1. (58)
s=1

If, in addition, the production function is homogeneous of degree one, (y) = y

and

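$$F(x) = \left(\sum_{s=1}^{S} \Lambda^s(x^s)^{\rho}\right)^{1/\rho}, \quad 0 \neq \rho \leq 1, \tag{59}$$

or

$$F(x) = \prod_{s=1}^{S} \Lambda^s(x^s)^{\beta_s}, \quad \beta_s > 0\ \ \forall\, s, \quad \sum_{s=1}^{S} \beta_s = 1. \tag{60}$$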

Elasticity Identities and Functional Structure

Berndt and Christensen [6] were the first to notice a relationship between functional
structure and certain restrictions on the values of (Allen) elasticities of substitution.
Maintaining linear homogeneity of a single-output production function and n > 2,
they showed that

$$\sigma^{A}_{ki}(p, y) = \sigma^{A}_{kj}(p, y) \quad \forall\, (i, j, k) \in \mathcal{I}_S$$

if and only if F is separable in the partition I. This result was generalized by

Diewert [30], Russell [70], and Blackorby and Russell [12], and the latter results
were extended to Morishima elasticities by Blackorby and Russell [13]. These
results can be summarized as follows:
The following conditions are equivalent (under the maintained assumption that
n > 2):

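(i) C is separable in the partition $\mathcal{I}$ (structure (43)).
(ii) D is separable in the partition $\mathcal{I}$ (structure (44)).
(iii) $\sigma^{M}_{ki}(p, y) = \sigma^{M}_{kj}(p, y)\ \ \forall\, (i, j, k) \in \mathcal{I}_S$.
(iv) $\sigma^{A}_{ki}(p, y) = \sigma^{A}_{kj}(p, y)\ \ \forall\, (i, j, k) \in \mathcal{I}_S$.
(v) $\sigma^{DM}_{ki}(x, y) = \sigma^{DM}_{kj}(x, y)\ \ \forall\, (i, j, k) \in \mathcal{I}_S$.
(vi) $\sigma^{DA}_{ki}(x, y) = \sigma^{DA}_{kj}(x, y)\ \ \forall\, (i, j, k) \in \mathcal{I}_S$.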

That is, separability of C or D is equivalent to identity of both Allen and Morishima

elasticities between all variables within a separable sector and all variables outside
that sector.
The following conditions are equivalent (under the maintained assumption that
n > 2):

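(i) C is completely separable in the partition $\mathcal{I}$ (structure (48) if S > 2).
(ii) D is completely separable in the partition $\mathcal{I}$ (structure (51) if S > 2).
(iii) $\sigma^{M}_{ki}(p, y) = \sigma^{M}_{kj}(p, y)\ \ \forall\, (i, j, k) \in \mathcal{I}_C$.
(iv) $\sigma^{A}_{ki}(p, y) = \sigma^{A}_{kj}(p, y)\ \ \forall\, (i, j, k) \in \mathcal{I}_C$.
(v) $\sigma^{DM}_{ki}(x, y) = \sigma^{DM}_{kj}(x, y)\ \ \forall\, (i, j, k) \in \mathcal{I}_C$.
(vi) $\sigma^{DA}_{ki}(x, y) = \sigma^{DA}_{kj}(x, y)\ \ \forall\, (i, j, k) \in \mathcal{I}_C$.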

That is, complete separability of C or D is equivalent to identity of both Allen and
Morishima elasticities between all variables in any two sectors and all variables
outside those sectors.
These results provide powerful tools for hypothesis testing because tests for
separability – i.e., for aggregate inputs or outputs – are equivalent to tests for certain
equality conditions for pairs of elasticities.
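The equality restriction on Allen elasticities can be illustrated with a small numerical check. The sketch below assumes a nested CES unit cost function in which input prices 1 and 2 form a separable sector and input price 3 lies outside it; the functional form, the parameter values, and the finite-difference helpers are illustrative assumptions rather than anything estimated in the literature cited here. Allen elasticities are computed as $C\,C_{ij}/(C_i C_j)$ from numerical derivatives.

```python
def cost(p, sigma_in=0.5, sigma_out=2.0):
    """Hypothetical nested CES unit cost: prices 1 and 2 form a sector, price 3 is outside it."""
    agg = (0.5 * p[0] ** (1 - sigma_in) + 0.5 * p[1] ** (1 - sigma_in)) ** (1 / (1 - sigma_in))
    return (0.6 * agg ** (1 - sigma_out) + 0.4 * p[2] ** (1 - sigma_out)) ** (1 / (1 - sigma_out))

def d1(f, p, i, h=1e-4):
    """Central-difference first derivative with respect to p_i."""
    up = list(p); up[i] += h
    dn = list(p); dn[i] -= h
    return (f(up) - f(dn)) / (2 * h)

def d2(f, p, i, j, h=1e-4):
    """Central-difference second derivative with respect to p_i and p_j."""
    def shifted(di, dj):
        q = list(p)
        q[i] += di * h
        q[j] += dj * h
        return f(q)
    return (shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)

def allen(f, p, i, j):
    """Allen elasticity of substitution C * C_ij / (C_i * C_j)."""
    return f(p) * d2(f, p, i, j) / (d1(f, p, i) * d1(f, p, j))

p = [1.0, 2.0, 1.5]
sigma_31 = allen(cost, p, 2, 0)  # outside input 3 versus sector input 1
sigma_32 = allen(cost, p, 2, 1)  # outside input 3 versus sector input 2
sigma_12 = allen(cost, p, 0, 1)  # within-sector pair, not restricted
print(sigma_31, sigma_32, sigma_12)
```

Under these assumptions the two elasticities involving the outside input coincide (both equal the outer parameter, 2), while the within-sector elasticity is free to differ; a test of separability would examine exactly this equality.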

Concluding Remarks

The elasticity of substitution concept grew out of the interest of prominent English
economic theorists, at the time of the Great Depression, in the distribution of income
between capital and labor. The concept surged to prominence with the SMAC
characterization of constant elasticity of substitution production functions and the
emergence of modern growth theory in the 1960s. The elasticity of substitution
between labor and capital turns out to be fundamental to many theoretical aspects
of economic growth, including the possibility of perpetual growth or decline, the
growth of per capita income, and the speed of convergence to an equilibrium growth
path. The elasticity of substitution is especially salient in the insightful analysis of
the growth process by Klump and de la Grandville [51]. See Chirinko [23] for a
discussion of these issues and references to the relevant literature.
Most of the growth theory literature, relying originally on the historical constancy
of labor and capital income shares, features the Cobb-Douglas production func-
tion.39 But beginning about 1980, the labor share began to fall, and an accumulation

39 In fact, as first pointed out by Antras [3], the pre-1980 constancy of income shares does not imply
unitary elasticity of substitution when one takes into account the empirical evidence of aggregate

of empirical evidence has indicated that the (aggregate) elasticity of substitution is

significantly less than one (see Chirinko [23] and Chirinko and Mallick [24]).40
Although the CES has only one more parameter than the Cobb-Douglas, its greater
flexibility seems to be more attuned to the evidence.
The first empirical estimation of elasticities of substitution for more than two
inputs (to my knowledge41 ) is that of Griliches [38], who estimated Allen elasticities
to assess the effect of increases in human capital (reflected in years of educational
attainment) on the relative wages of skilled and unskilled labor, with capital
as a third important input.42 Shortly thereafter, Parks [67] estimated a translog
production function to obtain estimated Allen elasticities for five inputs (capital,
labor, and three material inputs).
Another early estimation of Allen elasticities, as well as implementation of the
aggregation theorems in section “Elasticities of Substitution and Separability” and
Berndt and Christensen [6], is the test in Berndt and Christensen [7] for the existence
of an aggregate capital stock comprising equipment and structures in a production
technology using labor as well as the two types of capital.
Allen elasticities for technologies with more than two inputs play an important
role in the research on energy economics, beginning with the classic KLEM (capital,
labor, energy, and materials) paper of Berndt and Wood [8]. Their paper employs
the theorems described in section “Elasticities of Substitution and Separability” to
test for the existence of a value-added production function – that is, for separability
of labor and capital inputs from material inputs. Thompson and Taylor [78] follow
up on the analysis of these issues using Morishima elasticities.
Elasticities of substitution with more than two inputs have also played an
important role in the assessment of the substitutability of (multiple) monetary assets.
Barnett et al. [5] estimated Allen elasticities, whereas Davis and Gauger [27] and
Ewis and Fischer [33] each employed Morishima elasticities.
Finally, Allen elasticities play an important role in the empirical study of the
effect of immigration on the relative wages of domestic and immigrant labor
(Grossman [39], Borjas [17], and Borjas, Freeman, and Katz [18, 19]) and of the
effect of the increase in the number of guest workers on
resident and non-resident labor (Kohli [49]).
The only empirical estimation of dual elasticities of substitution of which I am
aware is in the study by Mundra [64] of the substitutability of resident and non-
resident (guest) labor (and other inputs). The paper compares estimates of primal
and dual elasticities.

labor-saving technological change, which would tend to increase the share of capital, offsetting its
declining share owing to increasing capital intensity and an elasticity of substitution below one.
40 Karabarbounis and Neiman [48] estimate an elasticity of substitution greater than one, but
Acemoglu and Robinson [1] argue that their use of cross-country data makes their estimates
more likely to correspond to endogenous-technology elasticities.
41 Some of this discussion is based on a working paper by Mundra and Russell [65].
42 Follow-ups of the Griliches study can be found in Johnson [47], Kugler et al. [55], and Welch

[80].

Without doubt, many other empirical applications of elasticities of substitution

have escaped my attention.
The focus of this chapter has been on the elasticity of substitution between (or
among) inputs, but the concept is equally relevant to output substitution. That is, for
each of the elasticities defined in sections “Two-Input Elasticity of Substitution:
Early Formulations and Characterizations”, “Allen and Morishima Elasticities
of Substitution”, and “Dual Elasticities of Substitution”, one can formulate a
corresponding (primal or dual) output elasticity of substitution – a characterization
of the curvature of the output possibility curve (or surface).43
The elasticity of substitution also shows up in utility theory, where it reflects the
ease of substitution between consumer goods (and characterizes the curvature of
indifference surfaces). In fact, it was in utility theory that the CES function made its
first appearance in the economics literature, when Burk (Bergson) [20] showed that
additivity of the utility function and linearity of Engel curves implies that the utility
function belongs to the CES family, referred to as the “Bergson family” in consumer
theory. The elasticity of substitution in intertemporal utility functions plays an
important role in macroeconomic theory (Hall [40]) and in optimal growth theory
(Cass [21] and Koopmans [52]). Finally, the CES utility function has proved useful
in the study of optimal product diversity in the context of monopolistic competition
(Dixit and Stiglitz [32]).
Consistent with the theme of this volume, the chapter has focused primarily on
the theoretical development of the measurement of substitutability: primal and dual
characterizations and their close relationships to separable sectors of a production or
utility function. The taxonomy for n-variable elasticities implicit in the discussions
in sections “Allen and Morishima Elasticities of Substitution”, “Dual Elasticities
of Substitution”, and “Gross Elasticities of Substitution” dichotomizes elasticity-of-
substitution concepts along the following lines: partial vs. ratio elasticities (Allen
vs. Morishima elasticities), direct vs. dual elasticities (quantities vs. (shadow)
prices as “endogenous” variables), and net vs. gross elasticities (fixed output [or
technological homotheticity] vs. variable output). Other elasticity-of-substitution
concepts have been proposed, but I see them as variations on these themes.44

43 See, e.g., the analysis of the substitutability between a “good” and a “bad” output (in this case,
electricity and sulfur dioxide) in Färe et al. [35].
44 This may be an unfair oversimplification: Stern [76], building on Mundlak [63], proposes a

related but somewhat different and more comprehensive taxonomy of the elasticities. (Neverthe-
less, I’m reminded of a (private) comment made by a prominent social choice theorist back in the
heyday of research in his area: “The problem with social choice theory is that there are more axioms
than there are ideas.” Well, perhaps we have reached the point where there are more elasticity-of-
substitution concepts than there are ideas.)

Cross-References

Distance Functions in Production Economics

Duality in Production
Functional Structure and Aggregation
Multiproduct Technologies

References
1. Acemoglu D and Robinson JA (2015) The rise and decline of general laws of capitalism. J
Econ Perspect 29:3–28
2. Allen RGD (1938) Mathematical analysis for economists. Macmillan, London
3. Antras P (2004) Is the U.S. aggregate production function Cobb-Douglas? New estimates of
the elasticity of substitution. Contrib Macroecon 4:1–34
4. Arrow KJ, Chenery HB, Minhas BS, Solow RM (SMAC) (1961) Capital-labor substitution
and economic efficiency. Rev Econ Stat 43:225–250
5. Barnett WM, Fisher D, Serletis A (1992) Consumer theory and the demand for money. J Econ
Lit 30:2086–2119
6. Berndt ER, Christensen L (1973) The internal structure of functional relationships: separability,
substitution, and aggregation. Rev Econ Stud 40:403–410
7. Berndt ER, Christensen L (1973) The translog function and the substitution of equipment,
structures, and labor in U.S. manufacturing 1929–68. J Econ 1:81–114
8. Berndt ER, Wood DO (1975) Technology, prices, and the derived demand for energy. Rev Econ
Stat 57:259–268
9. Bertoletti P (2001) The Allen/Uzawa elasticity of substitution as a measure of gross input
substitutability. Rivista Italiana Degli Economisti 6:87–94
10. Bertoletti P (2005) Elasticities of substitution and complementarity a synthesis. J Prod Anal
24:183–196
11. Blackorby C, Russell RR (1975) The partial elasticity of substitution. Discussion Paper No.
75-1, Department of Economics, University of California, San Diego
12. Blackorby C, Russell RR (1976) Functional structure and the Allen partial elasticities of
substitution: an application of duality theory. Rev Econ Stud 43:285–292
13. Blackorby C, Russell RR (1981) The Morishima elasticity of substitution: symmetry, con-
stancy, separability, and its relationship to the Hicks and Allen elasticities. Rev Econ Stud
48:147–158
14. Blackorby C, Russell RR (1989) Will the real elasticity of substitution please stand up? A
comparison of the Allen/Uzawa and Morishima elasticities. Am Econ Rev 79:882–888
15. Blackorby C, Primont D, Russell RR (1978) Duality, separability, and functional structure:
theory and economic applications. North-Holland, New York
16. Blackorby C, Primont D, Russell RR (2007) The Morishima gross elasticity of substitution. J
Product Anal 28:203–208
17. Borjas GJ (1994) The economics of immigration. J Econ Lit 32:1667–1717
18. Borjas GJ, Freeman RB, Katz LF (1992) On the labor market effects of immigration and trade.
In: Borjas G, Freeman R (eds) Immigration and the work force. University of Chicago Press,
Chicago
19. Borjas GJ, Freeman RB, Katz LF (1996) Searching for the effect of immigration on the labor
market. AEA Pap Proc 8:246–251
20. Burk (Bergson) A (1936) Real income, expenditure proportionality, and Frisch’s ‘new methods
of measuring marginal utility’. Rev Econ Stud 4:33–52
21. Cass D (1965) Optimum growth in an aggregative model of capital accumulation. Rev Econ
Stud 32:233–240

22. Chambers R (1988) Applied production analysis. Cambridge University Press, Cambridge
23. Chirinko RS (2008) σ : the long and short of it. J Macroecon 30:671–686
24. Chirinko RS, Mallick D (2016) The substitution elasticity, factor shares, and the low-frequency
panel model. Am Econ J Macroecon 9:225–253
25. Cobb CW, Douglas PH (1928) A theory of production. Am Econ Rev 18:23–34
26. Cornes R (1992) Duality and modern economics. Cambridge University Press, Cambridge
27. Davis GC, Gauger J (1996) Measuring substitution in monetary-asset demand systems. J Bus
Econ Stat 14:203–209
28. Davis GC, Shumway CR (1996) To tell the truth about interpreting the Morishima elasticity of
substitution. Can J Agric Econ 44:173–182
29. de la Grandville O (1997) Curvature and the elasticity of substitution: straightening it out. J
Econ 66:23–34
30. Diewert WE (1974) Applications of duality theory. In: Intriligator M, Kendrick D (eds)
Frontiers of quantitative economics, vol 2. North-Holland, Amsterdam
31. Diewert WE (1982) Duality approaches to microeconomic theory. In: Arrow K, Intriligator M
(eds) Handbook of mathematical economics, vol II. North-Holland, New York
32. Dixit AK, Stiglitz JE (1977) Monopolistic competition and optimum product diversity. Am
Econ Rev 67:297–308
33. Ewis NA, Fischer D (1984) The translog utility function and the demand for money in the
United States. J Money Credit Bank 16:34–52
34. Färe R, Grosskopf S (1990) A distance function approach to price efficiency. J Public Econ
43:123–126
35. Färe R, Grosskopf S, Noh D-W, Weber W (2005) Characteristics of a polluting technology:
theory and practice. J Econ 126:469–492
36. Färe R, Primont D (1995) Multi-output production and duality theory: theory and applications.
Kluwer Academic Press, Boston
37. Fuss M, McFadden D (eds) (1978) Production economics; a dual approach to theory and
applications. North-Holland, Amsterdam
38. Griliches Z (1969) Capital-skill complementarity. Rev Econ Stat 51:465–468
39. Grossman J (1982) The substitutability of natives and immigrants in production. Rev Econ Stat
64:596–603
40. Hall RE (1988) Intertemporal substitution in consumption. J Polit Econ 96:339–357
41. Hardy GH, Littlewood JE, Pólya G (1934) Inequalities. Cambridge University Press, Cam-
bridge
42. Hicks JR (1932) The theory of wages. MacMillan Press, London
43. Hicks JR (1936) Distribution and economic progress: A revised version. Rev Econ Stud 4:1–12
44. Hicks JR (1970) Elasticity of substitution again: substitutes and complements. Oxford Econ
Pap 22:289–296
45. Hicks JR, Allen RGD (1934) A reconsideration of the theory of value, part II. Economica
1:196–219. N.S
46. Houthakker HS (1965) A note on self-dual preferences. Econometrica 33:797–801
47. Johnson GE (1970) The demand for labor by educational category. South Econ J 37:
190–204
48. Karabarbounis L, Neiman B (2014) The global decline of the labor share. Q J Econ 129:61–103
49. Kohli U (1999) Trade and migration: a production theory approach. In: Faini R, de Melo J,
Zimmermann KF (eds) Migration: the controversies and the evidence. Cambridge University
Press, Cambridge
50. Kim HY (2000) The Antonelli versus Hicks elasticity of complementarity and inverse input
demand systems. Aust Econ Pap 39:245–261
51. Klump R, de la Grandville O (2000) Economic growth and the elasticity of substitution: two
theorems and some suggestions. Am Econ Rev 90:282–291
52. Koopmans TC (1965) On the concept of optimal economic growth. In: The econometric
approach to development planning. North Holland (for Pontificia Academic Science), Ams-
terdam
53. Kuga K (1979) On the symmetry of Robinson elasticities of substitution: the general case. Rev
Econ Stud 46:527–531
54. Kuga K, Murota T (1972) A note on definition of elasticity of substitution. Macroeconomica
24:285–290
55. Kugler P, Müller U, Sheldon G (1989) Non-neutral technical change, capital, white-collar and
blue-collar labor. Econ Lett 31:91–94
56. Lau L (1978) Applications of profit functions. In: Fuss M, McFadden D (eds) Production
economics: a dual approach to theory and applications. North-Holland, Amsterdam, pp 133–
216
57. Leontief WW (1947) A note on the interrelation of subsets of independent variables of a
continuous function with continuous first derivatives. Bull Am Math Soc 53:343–350
58. Leontief WW (1947) Introduction to a theory of the internal structure of functional relation-
ships. Econometrica 15:361–373
59. Leontief WW (1953) Domestic production and foreign trade: the American capital position
re-examined. Proc Am Philos Soc 97:331–349
60. Lerner AP (1933) Notes on the elasticity of substitution II: the diagrammatical representation.
Rev Econ Stud 1:68–70
61. McFadden D (1963) Constant elasticity of substitution production functions. Rev Econ Stud
30:73–83
62. Morishima M (1967) A few suggestions on the theory of elasticity (in Japanese). Keizai Hyoron
(Econ Rev) 16:144–150
63. Mundlak Y (1968) Elasticities of substitution and the theory of derived demand. Rev Econ
Stud 35:225–236
64. Mundra K (2013) Direct and dual elasticities of substitution under non-homogeneous technol-
ogy and nonparametric distribution. Indian Growth Dev Rev 6:204–218
65. Mundra K, Russell RR (2004) Dual elasticities of substitution. Discussion Paper 01-26,
Department of Economics, University of California, Riverside
66. Murota T (1977) On the symmetry of Robinson elasticities of substitution: a three-factor case.
Rev Econ Stud 42:79–85
67. Parks RW (1971) Price responsiveness of factor utilization in Swedish manufacturing, 1870–
1950. Rev Econ Stud 53:129–139
68. Pigou AC (1934) The elasticity of substitution. Econ J 44:23–241
69. Robinson J (1933) Economics of imperfect competition. MacMillan, London
70. Russell RR (1975) Functional separability and partial elasticities of substitution. Rev Econ
Stud 42:79–85
71. Russell RR (1997) Distance functions in consumer and producer theory. Essay 1. In: Färe R,
Grosskopf S (eds) Index number theory: essays in honor of Sten Malmquist. Kluwer Academic
Publishers, Boston, pp 7–90
72. Sato R, Koizumi T (1973) On the elasticities of substitution and complementarity. Oxford Econ
Pap 25:44–56
73. Solow RM (1956) A contribution to the theory of economic growth. Q J Econ 65:65–94
74. Sono M (1945, 1961) The effect of price changes on the demand and supply of separable
goods. Int Econ Rev 2:239–271 (Originally published in Japanese In: Kokumin Keisai Zasshi
74:1–51)
75. Stern DI (2010) Derivation of the Hicks, or direct, elasticity of substitution from the input
distance function. Econ Lett 108:349–351
76. Stern DI (2011) Elasticities of substitution and complementarity. J Prod Anal 36:79–89
77. Syrquin M, Hollender G (1982) Elasticities of substitution and complementarity: the general
case. Oxford Econ Pap 34:515–519
78. Thompson P, Taylor TG (1995) The capital-energy substitutability debate: a new look. Rev
Econ Stat 77:565–569
79. Uzawa H (1962) Production functions with constant elasticities of substitution. Rev Econ Stud
29:291–299
80. Welch F (1970) Education in production. J Polit Econ 78:35–59
7 Distance Functions in Production Economics
Robert G. Chambers and Rolf Färe

Contents
Intuitive Background
Basic Assumptions
Distance Functions Defined and Their Properties
Differential Properties of Distance Functions
Distance Functions at Work
  Duality Theory
  Efficiency Analysis
  Index Numbers and Productivity Measurement
Empirical Implementation of Distance Functions
Commentary
Cross-References
References

Abstract

This chapter treats distance functions used in production economics and operations
research. The economic intuition behind the different notions of distance
functions is discussed to set the stage for a more formal analysis. A minimal set
of regularity conditions needed to ensure the existence of well-behaved distance
functions is presented, distance functions are defined, and the uses of distance
functions in a variety of settings are surveyed.

R. G. Chambers
Department of Agricultural and Resource Economics, University of Maryland,
College Park, MD, USA
R. Färe
Department of Economics and Department of Agricultural and Resource Economics, Oregon
State University, Corvallis, OR, USA

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_14

Keywords

Distance functions · Production · Technology · Measurement

JEL codes

D21, D22, D24

Modern production economics, following Debreu [11] and Shephard [22], is
typically cast in set-theoretic terms. Even the most casual reader of that literature is
confronted with concepts such as technology sets, input sets, and output sets that
are far removed from the production functions and input-requirement functions
familiar from intermediate microeconomics. The set-theoretic approach conveys
clarity, precision, and the ability to generalize. But its abstractness often proves
problematic when examining more mundane situations involving observed data. A
cardinal bridge is needed to connect abstract set-theoretic analysis and actual data.
In intermediate microeconomics, the production function and the input-
requirement function are the cardinal bridges to which we introduce our students.
The former is defined as the maximum obtainable output for a given input
bundle, and the latter as the minimal amount of an input needed to produce a
given output. That’s clear enough and easily illustrated using Fig. 1. There the
“lazy S-shaped” curve emanating from the origin, labelled f (x), traces out a
traditional version of the graph of the production function. The vertical axis is
denominated in units of output, y, the horizontal axis is denominated in units of
input, x, and the technically feasible input-output combinations are those falling
on or below f(x). Thus, for input use of $x^*$, the maximum output attainable is
$f(x^*)$, which corresponds to the point on f(x) directly above $x^*$. On the other
hand, if the goal is to produce $y^*$, only $e(y^*)$, which corresponds to the point
on f(x) at the same vertical height as $y^*$, is needed. Thus, depending upon
your perspective, f(x) illustrates both the production function and the
input-requirement function. Figure 1 depicts x as the independent variable and y as
the dependent variable, reflecting the idea that the production function gives the
maximum amount that can be produced given x. Switching axes and measuring
x vertically and y horizontally, however, does not change the substance of the
relationship.

[Fig. 1 Production, input-requirement, and distance functions. Horizontal axis: x, with e(y*) and x* marked; vertical axis: y, with y* and f(x*) marked; the curve f(x) and the points A, B, and C are shown.]
There are several things to note. First, the production function is measured as
the vertical distance between, say, $(x^*, 0)$ and $(x^*, f(x^*))$ and is denominated in
output units. Second, the input-requirement function is measured as the horizontal
distance between $(0, y^*)$ and $(e(y^*), y^*)$ and is denominated in input units. Finally, in the
case illustrated, $e(y) = f^{-1}(y)$ and $f(x) = e^{-1}(x)$, with

\[ f^{-1}(y) = \{x : f(x) = y\} \]

and

\[ e^{-1}(x) = \{y : e(y) = x\}. \]

Unfortunately, real technologies are not describable in terms of a single input or
a single output. In general, production technologies involve using multiple inputs to
produce multiple outputs. So, despite their intuitive value, production functions and
the input-requirement function are often lacking in empirical settings. Production
economists and operations researchers have accommodated this difficulty by devel-
oping cardinal representations of the set-theoretic representations of production
technologies. Those cardinal representations are commonly referred to as distance
functions. Like the production function and the input-requirement function, they
characterize the technology in terms of a measured distance (hence the name)
between some point in input-output space and a frontier of the feasible set of
production outcomes. Unlike the production function and the input requirement
function, however, they can be adapted to both multiple input and multiple output
settings. They are the subject matter of this chapter.
This chapter proceeds as follows. First, the basic intuition behind the different
notions of distance functions is presented. Then we present and discuss intuitively
a minimal set of assumptions that are needed to ensure a meaningful representation
of a production technology. A formal definition of a distance function is then
offered, and its basic mathematical properties are discussed. Then we turn to a
discussion of the smoothness (differential) properties of distance functions, and we
show the connection between standard notions of differentiability and a slightly
generalized notion needed to ensure that points are associated with potentially
meaningful economic equilibria. The penultimate section discusses different uses
(in production contexts) to which distance functions have been applied. The
final section offers a brief overview of the history of distance functions in
economic analysis.
Intuitive Background

The typical economist conceptualizes the technology using a production function.
Specialists in production economics are more prone to think of it in terms of a
technology set, T, that is a subset of $\mathbb{R}^{N+M}$ and is defined by

\[ T = \{(x, y) \in \mathbb{R}^{N+M} : x \text{ can produce } y\}, \]

where $x \in \mathbb{R}^N$ denotes an N-dimensional input vector, and $y \in \mathbb{R}^M$ an
M-dimensional output vector.¹
If x and y are scalars, the connection between the production function and T is
obvious. Measuring x on the horizontal axis and y on the vertical axis as in Fig. 1,
the production function represents the upper boundary of T. One thus writes²

\[ f(x) = \max\{y : (x, y) \in T\} \]

to define the production function $f : \mathbb{R} \to \mathbb{R}$.

Although production functions are usually treated as primitives by economists,
viewing them from the perspective of T clarifies that deriving them formally involves
solving an optimization problem. And the type of optimization problem reveals
much about the way economists often view production economics problems.
Defining f (x) as the largest possible output attainable from x is a natural way of
looking at things if you are the individual who keeps the returns from producing y.
Not all individuals, however, get to keep what they produce. Many times, contracting
works the other way around. Individuals are hired to produce a given y but
have no claim on y. Intuition would then suggest that they would concentrate on
economizing on the use of x leading one to solve

\[ e(y) = \min\{x \in \mathbb{R} : (x, y) \in T\}, \]

where $e : \mathbb{R} \to \mathbb{R}$ is the input-requirement function. If we were to depict this
optimization problem formally using traditional mathematical conventions, then,
unlike Fig. 1, y as the independent variable would be measured along the horizontal
axis and x as the dependent variable would be measured on the vertical axis.

¹ Distinguishing between inputs and outputs is unnecessary and often arbitrary because what are
perceived as outputs in one context can be inputs in another. For example, milk produced from
cows can be converted by the producer into cream, cheese, and other milk products.
A more general representation, which is especially common in general-equilibrium models, is
obtained by using the concept of a netput, which allows commodities to play either role. When a
commodity acts as an input in a particular process, it enters with a negative sign, and when it acts
as an output, it enters with a positive sign. In that case, the technology is written

\[ T = \{z \in \mathbb{R}^{N+M} : z \text{ is technically feasible}\}. \]

Although this level of generality is possible, throughout our presentation we maintain the artificial
distinction between inputs and outputs because of its familiarity and its continuing prevalence in
applied production analysis.
² Subject to suitable regularity conditions that we will discuss below.
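The two optimization problems can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical one-input, one-output technology in which (x, y) is feasible when x ≥ 0 and y ≤ 2√x, evaluated on a small integer grid. The feasibility rule, the grid, and the names `in_T`, `f`, and `e` are illustrative assumptions and are not taken from the chapter.

```python
# Hypothetical technology (an assumption for illustration): (x, y) belongs to T
# when x >= 0 and y <= 2*sqrt(x). For y >= 0 this is the same as y*y <= 4*x,
# which keeps the membership test in exact integer arithmetic.
def in_T(x, y):
    return x >= 0 and (y <= 0 or y * y <= 4 * x)


GRID = range(0, 11)  # integer input and output levels 0, 1, ..., 10


def f(x):
    """Production function on the grid: largest y with (x, y) in T."""
    return max(y for y in GRID if in_T(x, y))


def e(y):
    """Input-requirement function on the grid: smallest x with (x, y) in T.

    Raises ValueError when no grid point can produce y.
    """
    return min(x for x in GRID if in_T(x, y))


print(f(4), e(4))  # 4 4
print(f(9), e(6))  # 6 9
```

Both functions are read off the same set T; only the direction of the search (over outputs for given x, or over inputs for given y) differs.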
The production function and the input-requirement function portray two different
ways of looking at the same T . Residual claimants for y view T via f (x), while
residual claimants for x via e (y). One way to portray the intuitive connection
between them and distance functions, which are the subject of this chapter, is to
think in terms of an arbitrary (x, y) and then ask if (x, y) represents the best input-
output combination available from T ? Answering that question properly requires
one to be precise about what best means. One possible sense of best would be
in terms of whether that y represented the maximal output that could be had if x
were committed to production. One could then compare y to f(x) if the latter were
known or, more directly, solve:

\[
\begin{aligned}
o(y, x) &= \min\{\beta \in \mathbb{R} : (x, y - \beta) \in T\} \\
        &= y + \min\{\beta - y \in \mathbb{R} : (x, y - \beta) \in T\} \\
        &= y - \max\{y - \beta \in \mathbb{R} : (x, y - \beta) \in T\} \\
        &= y - f(x).
\end{aligned}
\]

Alternatively, another possible way to define best would be to determine whether y
could be produced using something smaller than x. That would lead one to solve the
following optimization problem:

\[
\begin{aligned}
i(x, y) &= \max\{\alpha \in \mathbb{R} : (x - \alpha, y) \in T\} \\
        &= x + \max\{\alpha - x \in \mathbb{R} : (x - \alpha, y) \in T\} \\
        &= x - \min\{x - \alpha \in \mathbb{R} : (x - \alpha, y) \in T\} \\
        &= x - e(y).
\end{aligned}
\]

The functions o(y, x) and i(x, y) are examples of distance functions. The first,
o(y, x), measures the minimal shrinkage of y required to make the “shrunken
output” producible by x.³ The second, i(x, y), measures the maximal shrinkage
of x that will allow the “shrunken input” to produce y. Figure 1 again illustrates.
For the input-output pair $(x^*, y^*)$ illustrated by point A, $o(y^*, x^*)$ is given
by (minus) the vertical distance AB, and $i(x^*, y^*)$ is given by the horizontal
distance CA.

³ One can also write this problem as determining the maximal expansion of y possible given x.
Mathematically,

\[
\begin{aligned}
\hat{o}(y, x) &= \max\{\hat{\beta} : (x, y + \hat{\beta}) \in T\} \\
              &= \max\{y + \hat{\beta} : (x, y + \hat{\beta}) \in T\} - y \\
              &= f(x) - y \\
              &= -o(y, x),
\end{aligned}
\]

so that apart from the sign difference, identical results will be obtained. A similar argument shows
that i(x, y) can also be recast as a minimization problem without changing its true nature.
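The identities o(y, x) = y − f(x) and i(x, y) = x − e(y) can be checked numerically. The sketch below assumes a hypothetical technology with f(x) = 2√x and e(y) = y²/4 for nonnegative x and y, and compares the closed forms with a brute-force search over a grid of candidate shrinkage amounts. The technology, the grid `STEPS`, and all function names are illustrative assumptions.

```python
import math


# Assumed technology: T = {(x, y) : x >= 0 and y <= 2*sqrt(x)}.
def in_T(x, y):
    return x >= 0 and y <= 2 * math.sqrt(x)


def f(x):
    return 2 * math.sqrt(x)        # production function of the assumed T


def e(y):
    return max(y, 0) ** 2 / 4      # input-requirement function of the assumed T


# Candidate shrinkage amounts: -10.0, -9.5, ..., 10.0
STEPS = [k / 2 for k in range(-20, 21)]


def o_search(y, x):
    """min{beta : (x, y - beta) in T}, by search over STEPS."""
    return min(b for b in STEPS if in_T(x, y - b))


def i_search(x, y):
    """max{alpha : (x - alpha, y) in T}, by search over STEPS."""
    return max(a for a in STEPS if in_T(x - a, y))


x_star, y_star = 4.0, 2.0          # an interior point, like A in Fig. 1
print(o_search(y_star, x_star), y_star - f(x_star))  # -2.0 -2.0
print(i_search(x_star, y_star), x_star - e(y_star))  # 3.0 3.0
```

The negative value of o and the positive value of i both signal that the chosen point lies strictly inside the assumed technology.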
Before we move on to a more detailed discussion, it’s important to emphasize
that the different representations of the technology, even in a two-dimensional
setting, result from looking at the constraints imposed by the technology from
different economic perspectives. One emphasizes output considerations, and the
other emphasizes input considerations. Those perspectives echo throughout even
the most basic economic discussion of production problems. Visually, we often
think of producers moving onto or along isoquants (an input perspective) or
onto or along transformation curves (an output perspective). There are many
reasons for this, and not the least is that in a multiple input-multiple output
setting, graphical analysis of technologies requires that something be held
constant. It’s also true that those different ways of looking at things reflect the
different ways producers think about outputs and inputs: maximization versus
minimization.

Basic Assumptions

Our most basic assumption is that there exists a nonempty subset of $\mathbb{R}^{N+M}$ that we
denote as T and define by

\[ T = \{(x, y) \in \mathbb{R}^{N+M} : x \text{ can produce } y\}. \]

Maintaining the conventional split between inputs and outputs, T represents the
technically feasible bundles of inputs and outputs, (x, y). We treat T as a physical
datum that is not subject to manipulation by individual producers. Individual
producers view T as a constraint to which they react in designing their economic
activities. The assumption that T is a subset of real space carries the implicit
assumption that both inputs and outputs can be divided into arbitrarily small units.
The jargon for this is input and output divisibility. Everyday experience, however,
teaches us that many commodities are not so divisible. So, for example, where a tiny
scoop of ice cream remains a tiny amount of ice cream, a tiny slice of the ice-cream
scooper is not a scooper but a metal shard. It’s important to remember, therefore,
that x and y do not correspond to everyday notions of commodities that do not
distinguish between stocks and flows but that they refer exclusively to the economic
flows from the commodities, and it is those flows that are arbitrarily divisible and
not the stocks.
From T we derive two correspondences. The first,⁴ $V : \mathbb{R}^M \rightrightarrows \mathbb{R}^N$, called the
input correspondence, is defined

\[ V(y) = \{x \in \mathbb{R}^N : (x, y) \in T\}, \quad y \in \mathbb{R}^M, \]

and its image V(y) is referred to as the input set for y, or more simply the input
set. It represents the set of input bundles that can produce the output bundle y. The
second, $Y : \mathbb{R}^N \rightrightarrows \mathbb{R}^M$, called the output correspondence, is defined

\[ Y(x) = \{y \in \mathbb{R}^M : (x, y) \in T\}, \quad x \in \mathbb{R}^N, \]

and its image Y(x) is referred to as the output set for x. It represents the output bundles
that can be produced from the input bundle x. Y and V are lower inverses (in the
sense of Berge 1963) to one another.
Figure 1 represents both concepts in the two-dimensional case. There T continues
to be represented by all (x, y) pairs falling on or below the lazy S-shaped curve
emanating from the origin. Therefore, for the output y ∗ , V (y ∗ ) consists of all
input quantities (measured on the horizontal axis) greater than or equal to e (y ∗ ).
Conversely, Y (x ∗ ) consists of the output quantities less than or equal to f (x ∗ ).
Hence, it’s natural to envision Y (x) and V (y) as generalizations of f (x) and e (y),
respectively.
Figure 1 also nicely summarizes regularity conditions sufficient to ensure the
existence of well-defined function representations of the technology. Recall that we
represented the production function as the solution to a pointwise, that is for given
x, optimization problem. In terms of output correspondences, that definition can be
rewritten as

\[ f(x) = \max\{y : y \in Y(x)\}. \]

As we have drawn Fig. 1, this optimization problem is well defined over x ≥ 0
because each Y(x) is bounded from above there by the lazy S-shaped curve
emanating from the origin. But elsewhere, things are not so clear. For example,
f is not really defined for x < 0.
We assume that there exists a closed, convex subset of $\mathbb{R}^{N+M}$ that we shall refer
to as commodity space, and denote by Z. We require that T ⊆ Z and that T be closed. To
economize on notation, we assume in all that follows that vectors or sets resulting
from Minkowski addition or scalar multiplication of sets belong to Z. To cover
what happens in cases such as Y(x) = ∅ for the production function definition, for
example, we set f(x) = −∞ and use $\bar{\mathbb{R}}$ to denote the set of extended real numbers formed
by appending −∞ and +∞ to $\mathbb{R}$.

⁴ Notationally, A ⇒ B denotes a point-to-set mapping from a point in set A to a set in B. Some
writers use $A \to 2^B$, where $2^B$ is the power set of B, to denote the same correspondence. It is to be
distinguished from A → B, which denotes a point-to-point mapping.
We’ve isolated some mathematically convenient assumptions that facilitate
our discussion of production functions, input-requirement functions, and distance
functions. Next, we consider the mathematical requirements that enable us to reverse
course, that is, the assumptions that allow us to generate T from knowledge of o(y, x)
or i(x, y). Figure 1 again illustrates. One sees immediately that if one had complete
knowledge of f, then T as depicted could be recaptured from the following operation:

\[ T = \{(x, y) \in \mathbb{R}^2 : f(x) \ge y\}. \tag{1} \]

In formal mathematical terms, T in this instance is the hypograph of the
production function. In everyday English, T is everything falling on or below the
graph of the production function. One can also recapture T as

\[ T = \{(x, y) \in \mathbb{R}^2 : x \ge e(y)\}, \tag{2} \]

which corresponds to the epigraph of the input-requirement function, that is,
everything lying on or above the graph of the input-requirement function. It follows
quickly from their definitions that equivalently

\[ T = \{(x, y) \in \mathbb{R}^2 : o(y, x) \le 0\}, \tag{3} \]

and

\[ T = \{(x, y) \in \mathbb{R}^2 : i(x, y) \ge 0\}. \tag{4} \]
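Representations (1) to (4) can be compared directly. As a hedged illustration, again assuming the hypothetical technology f(x) = 2√x with e(y) = y²/4 and restricting attention to nonnegative integer points, the sketch below checks that the four membership tests select exactly the same set. The functional forms and the grid are assumptions made for this example only.

```python
import math


def f(x):
    return 2 * math.sqrt(x)   # assumed production function


def e(y):
    return y * y / 4          # its inverse: the input-requirement function


def o(y, x):
    return y - f(x)           # two-dimensional output distance function


def i(x, y):
    return x - e(y)           # two-dimensional input distance function


points = [(x, y) for x in range(0, 10) for y in range(0, 7)]

T1 = {(x, y) for x, y in points if f(x) >= y}      # (1) hypograph of f
T2 = {(x, y) for x, y in points if x >= e(y)}      # (2) epigraph of e
T3 = {(x, y) for x, y in points if o(y, x) <= 0}   # (3) o(y, x) <= 0
T4 = {(x, y) for x, y in points if i(x, y) >= 0}   # (4) i(x, y) >= 0

print(T1 == T2 == T3 == T4)  # True
```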

So, as drawn, T , f (x), e (y), o (y, x), and i (x, y) all summarize the same
information. The visual reason that this occurs is that once a point on the “boundary”
of T is isolated using the appropriate function representation, all that remains is to
identify the remaining points by looking to one side of that boundary. The standard
production economics jargon for the requisite property is disposability of outputs
and/or inputs.
That jargon, however, masks the role of the numeraire. The production function
and o(y, x) share the common numeraire y. Viewed in these terms, the property that
permits algorithms (1) and (3) to be applied is that once a boundary point, $(x^o, y^o)$,
is isolated, adding or subtracting units of the numeraire from $y^o$ suffices to identify
T. Similarly, algorithms (2) and (4) work by isolating a boundary point, $(x^o, y^o)$,
and then adding or subtracting units of the numeraire for e(y) and i(x, y), which is
x, from $x^o$.
Thus, what’s crucial to developing the algorithms is the ability of the numeraire
to identify where T lies relative to points on its frontier. We refer to this property
as goodness of the numeraire. Stating this precisely requires a definition of the
numeraire. Our focus in this chapter is on two types of commodities, inputs
and outputs, and thus on two specific types of numeraire. We denote the output
numeraire by $\omega \in \mathbb{R}^M_+ \setminus \{0\}$ and the input numeraire by $\iota \in \mathbb{R}^N_+ \setminus \{0\}$. Our formal
criteria for goodness are:⁵

Definition 1. T satisfies goodness of the output numeraire (GON) if

Y (x) − εω ⊆ Y (x) ,

for ε > 0.

Definition 2. T satisfies goodness of the input numeraire (GIN) if

V (y) + θ ι ⊆ V (y) ,

for θ > 0.
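As a quick check of Definition 1 in the scalar setting of Fig. 1, suppose (purely as an illustrative assumption) that Y(x) = {y ∈ ℝ : y ≤ f(x)} with f(x) finite and that the output numeraire is ω = 1. Then, for every ε > 0,

```latex
\[
Y(x) - \varepsilon\omega
  = \{\, y - \varepsilon : y \le f(x) \,\}
  = \{\, y' : y' \le f(x) - \varepsilon \,\}
  \subseteq \{\, y' : y' \le f(x) \,\}
  = Y(x),
\]
```

so this technology satisfies GON. The analogous computation with V(y) = {x ∈ ℝ : x ≥ e(y)} and ι = 1 gives GIN.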

Remark 1. As traditionally defined, free disposability of output (FDO) requires that

\[ y \in Y(x) \Rightarrow y^o \in Y(x) \text{ for } y^o \le y. \]

Free disposability of input (FDI) requires that

\[ x \in V(y) \Rightarrow x^o \in V(y) \text{ for } x^o \ge x. \]

Weak disposability of output (WDO) requires that

\[ y \in Y(x) \Rightarrow \mu y \in Y(x) \text{ for } \mu \in (0, 1], \]

and weak disposability of input (WDI) requires

\[ x \in V(y) \Rightarrow \mu x \in V(y) \text{ for } \mu \ge 1. \]

The weaknesses and strengths of each of these disposability assumptions are
addressed elsewhere in the handbook (see, e.g., Chap. 12, “Bad Outputs”).
For our purposes, it suffices to note that each represents a strengthening of GON or
GIN. FDO requires that T satisfies GON for all $\omega \in \mathbb{R}^M_+ \setminus \{0\}$, while FDI requires
that T satisfies GIN for all $\iota \in \mathbb{R}^N_+ \setminus \{0\}$. WDO and WDI narrow things down a bit,
but they still require, respectively, that for any x, T satisfies GON for all y ∈ Y(x)
and that for any y, T satisfies GIN for all x ∈ V(y).

⁵ If one operated in terms of netputs,

\[ T = \{z \in \mathbb{R}^{N+M} : z \text{ is technically feasible}\}, \]

and a numeraire $\gamma \in \mathbb{R}^{N+M}_+ \setminus \{0\}$, the parallel notion of goodness in the netput numeraire (GNN)
would require that

\[ z \in T \Rightarrow z - \varepsilon\gamma \in T \]

for all ε > 0.

Distance Functions Defined and Their Properties

Once we admit the possibility that production processes involve multiple inputs
and multiple outputs, it becomes obvious that representing technologies in terms
of either production functions or input-requirement functions may be unnecessar-
ily restrictive. Thus, few economists would blink if confronted by technologies
represented pictorially by either isoquants or transformation curves (production
possibilities frontiers). One would understand that the axes for isoquants involve
inputs (holding output constant), that the axes for transformation curves are
outputs (holding inputs constant), and that these curves shift as outputs and inputs,
respectively, change.
Isoquants and transformation curves are usually represented in mathematical
terms as level sets of a numerical function describing the technology. Many writers
call that numerical function a transformation function and denote it as t (x, y). We’ll
honor that tradition. The exact origins of t (x, y) often remain unspecified, even
though it is often endowed with extremely strong (and convenient) mathematical
properties. And in very many instances, t is used to define the technology via

T ≡ {(x, y) : t (x, y) ≤ 0} .

In other words, t is treated as the primitive, and T is derived from it and not
the other way around. Properties of T thus flow from functional structure and
restrictions placed on t. This section shows how the process can be reversed so that
T is treated as the primitive and functional representations, such as this ambiguously
defined t, are induced from assumptions on T . Distance functions provide the link.6
Before we offer a precise definition of a distance function, it’s important to
say a word about the perspective that we take. When distance-type measures were
originally introduced into economics by Konüs [15], Malmquist [18], and Shephard
[21], they were defined in radial terms. By that we mean that for any arbitrary
(x, y), input-based measures took that x as the numeraire and then moved toward
the frontier in input space in the direction of that x. Visually, that corresponds to
radially contracting x. Similarly, output-based measures used y as the numeraire.
We don’t take that approach. But as we demonstrate below, the radial approach is
an important, but still special, case of our general approach.

⁶ Distance functions are one link, but not the only one. For example, one can always define what’s
known as an indicator function for T as follows:

\[
\delta(x, y) =
\begin{cases}
0 & \text{if } (x, y) \in T, \\
\infty & \text{otherwise.}
\end{cases}
\]

Indicator functions are an essential part of modern variational analysis and convex analysis.
However, in practical settings, they can prove quite difficult to use.
In our set up, an isoquant intuitively corresponds to a lower boundary of an
input set, V(y), and a transformation curve to an upper boundary of an output
set, Y(x). Isoquants and transformation curves are, thus, correspondences derived
by projecting a slice⁷ of T onto $\mathbb{R}^N$ and $\mathbb{R}^M$, respectively, and then looking
at its appropriate boundary. Hence, both V(y) and Y(x) offer natural ways to
formalize the concept of a transformation function. We have the following natural
generalizations of our two-dimensional measures:

Definition 3. An output-oriented distance function, $O^\omega : \mathbb{R}^{N+M} \to \bar{\mathbb{R}}$, is defined

\[ O^\omega(y, x) \equiv \min\{\beta \in \mathbb{R} : y - \beta\omega \in Y(x)\} \]

if there exists $\beta \in \mathbb{R}$ such that $y - \beta\omega \in Y(x)$ and +∞ otherwise.

Definition 4. An input-oriented distance function, $I^\iota : \mathbb{R}^{N+M} \to \bar{\mathbb{R}}$, is defined

\[ I^\iota(x, y) \equiv \max\{\lambda \in \mathbb{R} : x - \lambda\iota \in V(y)\} \]

if there exists $\lambda \in \mathbb{R}$ such that $x - \lambda\iota \in V(y)$ and −∞ otherwise.

Because these distance functions are themselves functions of the numeraire (as
recognized by the appropriate superscript), the ability to think in terms of different
potential numeraires ensures that many distance functions potentially exist. The
output-oriented (or output for short) distance function is visualized with the aid
of Fig. 2.8 Y (x) is illustrated by all the points on or below the negatively sloped
curvilinear frontier. In this instance, the numeraire illustrated by the vector labelled
ω contains positive amounts of both outputs. The output couple of interest, y, lies
outside of Y (x) and cannot be produced using the input bundle x. Hence, y must
be “shrunk” to be technically feasible. The numeraire, ω, determines how (in what

7 Slices,in fact, have a precise mathematical definition with which we need not concern ourselves
here. The intuitive idea is straightforward. Imagine T in say three-dimensional space with one
dimension representing output and the other two representing inputs. Now mark off a particular
level of y and imagine taking a knife and slicing through T at this point parallel to the input axes.
That’s the slice that you project onto the input axes to get the isoquant. If you can represent T via
a production function, the equivalent operation is to obtain its upper contour set for a particular y.
8 The visual intuition for the input-oriented distance function is similar with V (y) replacing Y (x)

and x replacing y.
306 R. G. Chambers and R. Färe

Fig. 2 Output-oriented y2
distance function

y
y*

Y(x)
ω
y1

direction) y is to be shrunk. The shrinkage is envisioned as moving from y “in

the direction of the numeraire.” One visualizes this as sliding y along the dotted
line parallel to ω until it encounters Y (x). For the case illustrated, O ω (y, x) is the
Euclidean length of the line segment y ∗ y divided by ‖ω‖. Put differently, y is
projected onto the boundary of Y (x) in the direction defined by ω. The resulting
projection is at point y ∗ , whence

y ∗ = y − O ω (y, x) ω.

The basic properties of distance functions are described by (for a proof, see
Chambers [7], Chapter 3):

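Theorem 1. If Y (x) satisfies GON in ω:

(a) $O^{\omega}(y, x) \le 0 \Leftrightarrow y \in Y(x) \Leftrightarrow x \in V(y) \Leftrightarrow (x, y) \in T$ (Indication);
(b) for $b \in \mathbb{R}$, $O^{\omega}(y + b\omega, x) = O^{\omega}(y, x) + b$ (Translation); and
(c) $O^{\mu\omega}(y, x) = \mu^{-1} O^{\omega}(y, x)$ for $\mu > 0$ (Normalization).

If V (y) satisfies GIN in ι:

(a) $I^{\iota}(x, y) \ge 0 \Leftrightarrow x \in V(y) \Leftrightarrow y \in Y(x) \Leftrightarrow (x, y) \in T$ (Indication);
(b) for $l \in \mathbb{R}$, $I^{\iota}(x + l\iota, y) = I^{\iota}(x, y) + l$ (Translation); and
(c) $I^{\mu\iota}(x, y) = \mu^{-1} I^{\iota}(x, y)$ for $\mu > 0$ (Normalization).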

The Indication property ensures that our distance functions are complete function
representations of T . That is, knowledge of them is equivalent to knowledge of the
technology. An immediate implication is that either one can be used as a formal
means of defining t (x, y), albeit under different regularity conditions.
Mathematically, Translation is an arithmetic consequence of how the distance
function is defined, and it has nothing to do with the regularity properties that we
have imposed upon T . One visualizes it using Fig. 2. From the illustrated point y,
y + bω for b real is illustrated by sliding y along the dotted line b units in the
7 Distance Functions in Production Economics 307

direction of the numeraire. Calculating O ω for that translated point carries one back
to the same y ∗ as y, and the only difference in the calculation is the arithmetic
correction for the translation amount. Put another way, Translation ensures that the
distance calculation between, for example, y and Y (x) is invariant to translations
of the origin in the direction of ω. An immediate consequence of Translation is, for
example,

$$\lim_{\varphi \to 0} \frac{I^{\iota}(x + \varphi\iota, y) - I^{\iota}(x, y)}{\varphi} = 1$$

for all (x, y).9 Thus, differentially small adjustments of x in the direction of ι
are matched by equivalent changes in I ι . Where Translation ensures invariance
to translations of the origin in the direction of ω, Normalization ensures that
rescaling the numeraire by μ > 0 rescales the resulting distance function by
the reciprocal factor μ−1.
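The three properties in Theorem 1 can be checked numerically. The sketch below is only illustrative: the output set Y(x) = {y : max(y1, 0)² + max(y2, 0)² ≤ x}, the function names, and the bisection bounds are assumptions introduced here, not part of the chapter. With a strictly positive numeraire, feasibility of y − βω is monotone in β, so O ω (y, x) can be approximated by bisection.

```python
import math

def in_output_set(y, x):
    # Illustrative Y(x) (an assumption): positive parts of y lie in a disc of radius sqrt(x).
    return sum(max(v, 0.0) ** 2 for v in y) <= x

def output_distance(y, x, omega, lo=-1e3, hi=1e3, tol=1e-10):
    """Approximate O^omega(y, x) = min{beta : y - beta*omega in Y(x)} by bisection.

    Assumes omega > 0 componentwise (so feasibility is monotone in beta)
    and that the minimizing beta lies in [lo, hi].
    """
    def shifted(beta):
        return [yi - beta * wi for yi, wi in zip(y, omega)]

    if not in_output_set(shifted(hi), x):
        return math.inf  # no feasible beta found: the +infinity case of the definition
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_output_set(shifted(mid), x):
            hi = mid   # mid is feasible, so the minimum is at or below mid
        else:
            lo = mid   # mid is infeasible, so the minimum is above mid
    return hi

x = 25.0
omega = (1.0, 1.0)
y_out = (4.0, 4.0)   # lies outside Y(x)
y_in = (3.0, 3.0)    # lies inside Y(x)

base = output_distance(y_out, x, omega)
inside = output_distance(y_in, x, omega)
translated = output_distance((6.0, 6.0), x, omega)   # y_out + 2*omega
rescaled = output_distance(y_out, x, (2.0, 2.0))      # numeraire 2*omega

print(base > 0, inside < 0)          # Indication
print(translated - base)             # Translation: close to 2
print(rescaled / base)               # Normalization: close to 1/2
```

In this example the infeasible bundle has a positive distance and the feasible one a negative distance, shifting y by 2ω raises the distance by 2, and doubling the numeraire halves it.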

Remark 2. Choosing ω to be y and ι to be x, respectively, yields

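$$\begin{aligned}
O^{y}(y, x) &= \min\{\beta : y - \beta y \in Y(x)\} \\
&= \min\{\beta - 1 : (1 - \beta)\, y \in Y(x)\} + 1 \\
&= 1 - \max\{1 - \beta : (1 - \beta)\, y \in Y(x)\} \\
&= 1 - g^{Y}(y, x)^{-1}
\end{aligned}$$

with

$$g^{Y}(y, x) = \inf\{\gamma > 0 : y \in \gamma Y(x)\},$$

and

$$I^{x}(x, y) = \max\{\lambda : x - \lambda x \in V(y)\} = 1 - d^{V}(x, y)^{-1}$$

with

$$d^{V}(x, y) = \sup\{\theta > 0 : x \in \theta V(y)\}.$$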

g Y is often called a radial output-oriented distance function,10 and d V a radial input-oriented
distance function. An immediate consequence of Theorem 1 is that if T

9 In formal terms, this is equivalent to I ι (x, y) being Gâteaux differentiable in the direction of ι
for all (x, y).
10 g Y is also often called a gauge function or a Minkowski functional for the set Y .

satisfies WDO

g Y (y, x) ≤ 1 ⇔ y ∈ Y (x) ,

and if T satisfies WDI

d V (x, y) ≥ 1 ⇔ x ∈ V (y) ,

so that both satisfy a version of Indication.

In principle, O ω , g Y , I ι , and d V all provide a suitable means of defining t. And,
despite the claims made by some authors, there seems to be no reason to prefer one
over any of the others for purposes of representing the technology. The situation is
not unlike that faced by a carpenter or a mechanic. Depending upon the task at hand,
they will use the most appropriate tool. One can, of course, use a hammer to drive a
wood screw, but a deftly applied drill bit and screwdriver (spanner) work much more
efficiently. Similarly, a pliers can be used to fasten and loosen bolts, but properly
sized wrenches work even better. So too for our distance functions. In some cases,
good reasons exist to prefer output-oriented ones to input-oriented ones, and just the
reverse in other situations. This same reasoning applies to choosing the perspective
for looking at T and to choosing a numeraire: a suitable choice will depend upon the problem to be
solved.
Along these lines, we have derived the output distance functions using Y (x)
and the input distance functions using V (y) . It follows trivially, however, from
Theorem 1 that we could have derived either directly from T , the former from V (y),
or the latter from Y (x). Our derivation reflects traditional treatment and the natural
bias that issues mainly involving outputs are best examined in output space and
issues involving inputs are best examined in input space because they are most easily
visualized, and thus illustrated, in those terms.11

11 Using the netput representation,

$$T = \{z \in \mathbb{R}^{N+M} : z \text{ is technically feasible}\},$$

one can also define a netput-distance function as

N γ (z) = min {ψ ∈ R : z − ψγ ∈ T }

if there exists ψ ∈ R such that z − ψγ ∈ T and ∞ otherwise that satisfies appropriate versions of
Indication, Translation, and Normalization under GNN in the direction γ .

Differential Properties of Distance Functions

Pictorial depictions of isoquants, transformation curves, and graphs of production
functions are fundamental to production economics. Any discussion of optimal
producer behavior teems with references to marginal rates of technical substitution,
marginal rates of transformation, and marginal products depicted as slopes of
frontiers. One way to investigate the resulting concepts is to impose smoothness
on an appropriate function representation of T , use the function representation to
generate the appropriate boundary of T , and then use the calculus to investigate the
slope and curvature of that boundary.
Our approach to such matters relies less on mechanical calculation and more
on viewing matters geometrically. Generally speaking, the idea is to start with an
appropriate boundary of T . Because we assume T to be closed, little or no generality
is lost in assuming that the boundary is for a closed set and hence belongs to it. We
then examine what different aspects of the visual representations of the boundary
imply about the differential properties of the function generating the boundary.
The basic ideas can be demonstrated using either O ω or I ι . Going from one to
the other requires slight changes in wording. We concentrate the visual and intuitive
discussion on O ω and Y (x) and follow with a relatively terse discussion focused on
I ι , leaving it to the interested reader to fill in the details. We start with O ω (y, x),
given GON for ω,

$$T = \{(x, y) : O^{\omega}(y, x) \le 0\}.$$

Our particular interest is in the outer boundary for Y (x), which we identify with

$$\bar{Y}(x) = \{y : O^{\omega}(y, x) = 0\},$$

and call the transformation curve. (We’re fudging here for the sake of easy intuition.
We have yet to place enough structure upon T to ensure how to characterize
the boundary of Y (x) properly.) Figure 3 illustrates the transformation curve as
the curve connecting points A and B and Y (x) as all points falling below the
transformation curve. For most economists, identifying Y (x) with points below the
transformation curve surely will seem natural. But, this is precisely where FDO
usually comes into play. FDO ensures GON in all directions, so that if points on
AB belong to Y (x), so too must the points falling below it. Moreover, for $y' \le y$,
$y' - O^{\omega}(y, x)\,\omega \le y - O^{\omega}(y, x)\,\omega \in Y(x)$, so that $O^{\omega}(y', x) \le O^{\omega}(y, x)$.
That is, $O^{\omega}$ is nondecreasing in y under FDO, which with Indication is another way of saying that any
point falling below the transformation curve must belong to Y (x).
FDO also ensures that the slope of the transformation curve must not be a
strictly positive real number. Intuitively, increasing one output prompts an increase
in O ω , and to balance it, one needs another output to decrease to return production
feasibility. The slope of the transformation curve that reflects this principle is

Fig. 3 Transformation curve (axes: y1, y2; Y(x) lies below the curve joining A and B, with points such as C and E marked on the curve)

traditionally called the marginal rate of transformation. A negative value ensures
that outputs, when viewed as pairs, must be substitutes in production.12
Figure 3 depicts a smooth transformation curve. The marginal rate of transfor-
mation is obtained by finding a tangent to Ȳ (x) or by implicitly differentiating
O ω (y, x) when the latter is smooth. Figure 3 has been drawn so that at C, D, and E,
the marginal rate of transformation is illustrated by the slope of the parallel dotted
tangent line segments implying the same marginal rate at C, D, and E. This has
been done to illustrate the connection between a smoothly differentiable structure
and a more general differential representation that determines whether points on the
transformation curve are economically relevant in a sense to be made precise.
From technical and mechanical perspectives, points C, D, and E appear similar.
They each satisfy O ω (y, x) = 0, and they share a common marginal rate
of transformation. But they represent different output mixes. And looking more
closely, D is located on a portion of Ȳ (x) that is convex to the origin, while C and
E are located on portions that are concave to the origin. In production economics
jargon, the technology exhibits a diminishing marginal rate of transformation at D
and an increasing marginal rate of transformation at C and E.
If the slopes of the dotted line segments were to reflect relative output prices,
D would be judged economically inferior to C and E because less revenue can be
obtained from D than from the latter two. Using the same criterion, however, C is
judged inferior to E.
What Fig. 3 illustrates is that distinct points can be closely similar from the
mechanical calculus perspective in which we frequently discuss economic concepts
but drastically different economically. They can be on the frontier, and they can
have equal marginal rates of transformation, and marginal rates of transformation

12 Good reasons exist to believe that this is not always true. In other words, FDO might be too
strong a restriction (hence our insistence upon the presence of GON ). Perhaps the best example is
given by pollutants that have complementary relationships with many outputs. Chapter 12,
“Bad Outputs,” discusses such concerns in detail.

increasing at the same rate, but still differ in terms of their economic efficiency.13
Differentiability is often intuitive and analytically convenient for those practiced in
calculus manipulation for identifying potential optima. But, in the final analysis,
something else is required.
As it turns out, a one-sided differential notion that slightly generalizes the
geometric association of the gradient with the hyperplane tangent to the function’s
level set suffices. Point E in Fig. 3 illustrates. Letting the dotted line segments
represent relative output prices, $-p_1/p_2$, point E satisfies

$$p_1 y_1^{E} + p_2 y_2^{E} > p_1 y_1^{C} + p_2 y_2^{C} > p_1 y_1^{D} + p_2 y_2^{D}$$

and moreover

$$p_1 y_1^{E} + p_2 y_2^{E} \ge p_1 y_1 + p_2 y_2$$

for all (y1 , y2 ) ∈ Y (x). A simple tangency between the hyperplane with normal
(p1 , p2 ) that passes through $(y_1^{E}, y_2^{E})$ and Ȳ (x) won’t satisfy this criterion (points
C and D both illustrate). What’s required in addition to tangency is that the
associated hyperplane supports Y (x) from above so that Y (x) falls in
the half-space below the tangent hyperplane. Economically, that means for prices
(p1 , p2 ), $(y_1^{E}, y_2^{E})$ must be at least as valuable as any other output combination
falling in Y (x).
We can write the desired criterion for $(y_1^{E}, y_2^{E})$ as (p1 , p2 ) satisfying

$$O^{\omega}(y, x) - O^{\omega}(y^{E}, x) \ge p_1 (y_1 - y_1^{E}) + p_2 (y_2 - y_2^{E}) \quad \text{for all } (y_1, y_2) \in \mathbb{R}^{2}.$$

To see why this always works, recall that $O^{\omega}(y^{E}, x) = 0$ and that for any (y1 , y2 )
∈ Y (x), O ω (y, x) ≤ 0, whence

$$0 \ge O^{\omega}(y, x) \ge p_1 (y_1 - y_1^{E}) + p_2 (y_2 - y_2^{E}),$$

or

$$p_1 (y_1^{E} - y_1) + p_2 (y_2^{E} - y_2) \ge -O^{\omega}(y, x) \ge 0,$$

as desired. More generally, if y E is to satisfy such a criterion, there must exist a
p ∈ RM such that14

$$O^{\omega}(y, x) - O^{\omega}(y^{E}, x) \ge p'(y - y^{E}) \quad \text{for all } y \in \mathbb{R}^{M}.$$

13 This is another way of saying that location, first-order, and second-order conditions can be met
without identifying the true optimum. Note that at C second-order conditions for an interior
optimum are satisfied, but C remains non-optimal.
14 Here and elsewhere $p'$ for p ∈ RM denotes the transpose of an M-dimensional column vector,
and $p'y$ denotes the standard inner product of p and y.


Fig. 4 Subdifferentiable but not smooth (axes: y1, y2; Y(x) lies below the curve joining A and B, with point C marked on the curve)

A p ∈ RM satisfying this criterion is called a subgradient (in y) of O ω (y, x) at
y E . In Fig. 3, all the parallel dotted line segments can be identified with gradients of
O ω (y, x) at C, D, and E, but only the one at E defines a subgradient. When such a
p exists, O ω is said to be subdifferentiable at y E .15
Because derivatives are typically defined via limits, gradients describe point-to-
point mappings. Subdifferentiability is not defined in terms of limiting behavior but
in terms of a global condition that must apply at a particular point. That definition
permits the existence of multiple subgradients at a point. Figure 4 illustrates with
point C possessing multiple supporting hyperplanes for Y (x). The subdifferential
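correspondence $\partial O^{\omega} : \mathbb{R}^{N+M} \rightrightarrows \mathbb{R}^{M}$ is defined at $(y^{E}, x)$ by

$$\partial O^{\omega}(y^{E}, x) = \{p \in \mathbb{R}^{M} : O^{\omega}(y, x) - O^{\omega}(y^{E}, x) \ge p'(y - y^{E}) \text{ for all } y \in \mathbb{R}^{M}\}.$$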

Despite the fact that gradients and subgradients are both intuitively identified
with tangent hyperplanes, one does not imply the other. Numerical functions can be
differentiable at a point but not subdifferentiable (see C in Fig. 3) or subdifferentiable
but not differentiable (see C in Fig. 4). The key distinction is that subdifferentiability
directly reflects global optimality,16 while differentiability does not. This distinction
is highlighted by noting that

$$p \in \partial O^{\omega}(y^{E}, x) \Leftrightarrow O^{\omega}(y, x) - p'y \ge O^{\omega}(y^{E}, x) - p'y^{E} \text{ for all } y \in \mathbb{R}^{M},$$

whence

15 The “sub” terminology arises from (y, O ω (y, x)) falling in the half-space lying above the affine
hyperplane

$$\{(y, O) \in \mathbb{R}^{M+1} : O = O^{\omega}(y^{E}, x) + p'(y - y^{E}),\ y \in \mathbb{R}^{M}\}.$$

16 In the sense of minimization.


Fig. 5 Convex Y(x) (axes: y1, y2; Y(x) lies below the modified transformation curve through A, F, G, and B)

$$0 \in \partial O^{\omega}(y^{E}, x) \Leftrightarrow O^{\omega}(y, x) \ge O^{\omega}(y^{E}, x) \text{ for all } y \in \mathbb{R}^{M}.$$

Smooth numerical functions are not generally subdifferentiable. On the other
hand, as Fig. 4 illustrates, non-smooth functions can be subdifferentiable but not
differentiable. Examining Figs. 3 and 4 gives us a hint at what’s required to ensure
subdifferentiability. At both points of subdifferentiability, the transformation curve
is (locally) concave to the origin. But point C in Fig. 3 also demonstrates that
local concavity does not ensure subdifferentiability. What’s needed is envisioned
by taking the transformation curve in Fig. 3 and somehow eliminating the salient
between points C and E that points toward the origin. Figure 5 depicts this operation
as pushing the inward pointing salient from Fig. 3 northwest toward the dotted line
segment until it is replaced by the line segment connecting points F and G. The
resulting modified transformation curve AFGB is subdifferentiable everywhere on
the interior of the positive orthant. The resulting transformation curve is illustrated
visually by a contour that is everywhere concave toward the origin, and the
modified Y (x) is a convex set.17 As the visual argument suggests, convexity of
Y (x) suffices to ensure that O ω is convex in y and subdifferentiable as required.
We formalize this visual intuition with:

Theorem 2. If T satisfies FDO and Y (x) is a convex set for all x ∈ RN , O ω (y, x)
is nondecreasing and convex in y, ∂O ω (y, x) ≠ ∅ for all y in the relative interior
of {y : O ω (y, x) < ∞}, and O ω (y, x) is differentiable almost everywhere in the
relative interior of {y : O ω (y, x) < ∞}.

Figure 5, with the inner-directed salient removed, depicts a “normal”-looking
transformation curve that is an “upper bound” for a nicely convex Y (x). Our discussion
reveals that what’s considered “normal” is one consequence of requiring an
everywhere subdifferentiable O ω . In other words, our most common visualization
of an output set embeds subdifferentiability in it. This ensures that for any boundary

17 More formally, the modified set is the convex hull of Y (x).


points on Ȳ (x), one can find a suitable set of prices that will make it optimal. In
a more general setting that does not correspond to our stylized representations,
subdifferentiability enables discrimination between boundary points that are eco-
nomically rational and ones that are not. In economics jargon, subdifferentiability is
the differential condition required to ensure a point on the transformation curve can
be identified with an output “price” vector that makes the point revenue maximal,
what economists refer to as a “shadow” or “virtual” price vector.
The identification of subdifferentiability with shadow prices is formalized and
strengthened by the following result (see, e.g., Chambers [7], Chapter 3):

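Theorem 3. Under FDO, $p \in \partial O^{\omega}(y, x) \Rightarrow p \in \mathbb{R}^{M}_{+}$ and $p'\omega = 1$ for all
$y \in \{y : \partial O^{\omega}(y, x) \ne \emptyset\}$, and

$$p^{o} \in \partial O^{\omega}(y^{o}, x),\ p^{oo} \in \partial O^{\omega}(y^{oo}, x) \Rightarrow (p^{oo} - p^{o})'(y^{oo} - y^{o}) \ge 0 \quad \text{(cyclical monotonicity)}$$

for all $y^{oo}, y^{o} \in \{y : \partial O^{\omega}(y, x) \ne \emptyset\}$.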

Theorem 3 establishes that FDO and subdifferentiability of O ω guarantee the
existence of non-negative shadow price vectors that price the numeraire bundle at
1. Moreover, these shadow prices are cyclically monotone with their associated
output bundles. Cyclical monotonicity of shadow prices generalizes the concept
of an increasing marginal rate of transformation between outputs to higher dimen-
sions. Cyclical monotonicity is perhaps best understood as a generalization of the
univariate notion of monotonicity. For example, when M = 1, cyclical monotonicity of
∂O ω (y, x) implies that the subdifferential with respect to y is nondecreasing in y,
which also requires O ω (y, x) to be convex in y.
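A small numerical sketch can make the shadow-price interpretation concrete. Everything specific below is an assumption for illustration: the convex output set Y(x) = {y : max(y1, 0)² + max(y2, 0)² ≤ x}, the boundary point y E = (3, 4) with x = 25, the numeraire ω = (1, 1), and the helper names. The candidate shadow price is the outward normal at y E rescaled so that it prices the numeraire at 1, and the code checks the subgradient inequality on a grid of output bundles.

```python
import itertools

def in_output_set(y, x):
    # Illustrative convex Y(x) satisfying free disposability of outputs (an assumption).
    return sum(max(v, 0.0) ** 2 for v in y) <= x

def output_distance(y, x, omega, lo=-1e3, hi=1e3, tol=1e-10):
    # Bisection for min{beta : y - beta*omega in Y(x)}; assumes omega > 0
    # and that the minimizing beta lies in [lo, hi].
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_output_set([yi - mid * wi for yi, wi in zip(y, omega)], x):
            hi = mid
        else:
            lo = mid
    return hi

x = 25.0
omega = (1.0, 1.0)
y_E = (3.0, 4.0)                                   # boundary point: 3^2 + 4^2 = 25

# Outward normal at y_E, rescaled so that p'omega = 1.
scale = sum(a * b for a, b in zip(y_E, omega))
p = tuple(v / scale for v in y_E)

O_E = output_distance(y_E, x, omega)
grid = [0.5 * i for i in range(-4, 17)]            # -2.0, -1.5, ..., 8.0

# Smallest value of O(y) - O(y_E) - p'(y - y_E) over the grid;
# the subgradient inequality requires it to be non-negative.
worst_gap = min(
    output_distance(y, x, omega) - O_E
    - sum(pi * (yi - ei) for pi, yi, ei in zip(p, y, y_E))
    for y in itertools.product(grid, repeat=2)
)

print(p, sum(pi * wi for pi, wi in zip(p, omega)))
print(worst_gap)   # zero up to bisection tolerance, attained at y = y_E
```

The normalized price vector is non-negative and values ω at 1, in line with Theorem 3, and the inequality holds across the grid because the assumed Y(x) is convex.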
Turning to the differential properties of I ι (x, y), we first note that replacing
“transformation curves” with “isoquants,” “marginal rate of transformation” with
“marginal rate of substitution,” and “increasing marginal rate of transformation”
with “decreasing marginal rate of substitution” provides the needed jargon. Then,
if one uses −I ι (x, y), replaces FDO with FDI, and recycles arguments made
in terms of y in terms of x, the extension is straightforward. Subdifferentiability
of O ω in y is replaced with subdifferentiability of −I ι in x,18 maximization of
revenue with minimization of cost, and concave to the origin transformation curves
with convex to the origin isoquants that are consistent with FDI.

18 Subdifferentiability of −I ι in x is equivalent to superdifferentiability of I ι in x.


Distance Functions at Work

Distance functions play a central role in many areas of economics including
producer analysis, consumer theory, efficiency analysis, index number theory, and
equilibrium computation among others. This section presents an overview of the
uses of distance functions in duality theory, efficiency measurement, index number
theory, and productivity measurement. As with the discussion of the differential
properties of distance functions, the basic ideas can be demonstrated using output-
or input-based concepts. Going from one to the other requires slight changes in
wording. When relevant, we concentrate the discussion on V (y) and leave the
extension to Y (x) to the reader.

Duality Theory

The duality terminology is borrowed from mathematics, where the dual of a primal
vector space X is identified as the space of the linear functionals on X. Using
the traditional Euclidean norm, the linear functions of elements of RN , x ∈ RN ,
assume the form $l'x$ with l ∈ RN so that the space of linear functionals of RN
is itself, RN . Put another way, RN is self-dual. Because our exclusive focus is on
real numbers, what is primal and what is dual is, therefore, often ambiguous and
depends importantly upon how axes are labelled. Economists typically resolve this
ambiguity by invoking the aphorism that “primal means things you can eat, and dual
things you cannot.”19 Thus, RN conceived of as measuring quantities of physical
commodities is typically identified with the primal space, and RN consisting of the
linear functions of quantities associated with prices that value these quantities is the
dual space.
The core notion of duality in production economics is that technologies can be
characterized either in primal or dual terms. And under appropriate restrictions upon
T the primal characterization and the dual characterization are equivalent.20 To our
knowledge, the first complete demonstration of a dual relationship was Shephard
[21] who demonstrated that a radial input-oriented distance function and a cost
function formed a dual pair.
We now quickly illustrate why this is true using our setup. Letting input prices
be denoted $w \in \mathbb{R}^{N}_{++}$, the cost function, $c : \mathbb{R}^{N}_{++} \times \mathbb{R}^{M} \to \bar{\mathbb{R}}$, for V (y) is defined:

$$c(w, y) = \min_{x} \{w'x : x \in V(y)\} \qquad (5)$$

19 We attribute this to Professor Shawna Grosskopf, and even if it is a bit imprecise because primal also
contains items such as coal, it nicely conveys the general idea.
20 Please see Chapter 3 of Volume 1 of this Handbook for a thorough discussion of duality theory.

if V (y) ≠ ∅ and +∞ otherwise. As is well-known, the cost function is nondecreasing,
positively homogeneous, and concave in w. Moreover, it can be recognized as
the (lower) support function for V (y). Thus, if V (y) satisfies FDI and is convex
as a subset of $\mathbb{R}^{N}_{+}$, a well-known result in convex analysis is that (Rockafellar [20])

$$V(y) = \{x \in \mathbb{R}^{N}_{+} : w'x \ge c(w, y) \text{ for all } w \in \mathbb{R}^{N}_{++}\}. \qquad (6)$$

Together, expressions (5) and (6) are referred to as a dual pair. They show that
knowledge of V (y) is sufficient to generate c (w, y) for given input prices and that
knowledge of c (w, y) is sufficient to generate V (y). As a result, any information
that is imbedded in the primal technology, V (y), can be recaptured from its dual
c (w, y) provided V (y) satisfies FDI and is convex. Conversely, information
imbedded in the dual c (w, y) can be recaptured from V (y). In this sense, c (w, y)
and V (y) are informationally equivalent. In the words of McFadden [19, p.4],
the cost function is “. . . a sufficient statistic for all the economically relevant
characteristics of the underlying technology.”
Input distance functions offer another perspective on this duality that can be
analyzed using simple optimization arguments. If V (y) satisfies FDI,

$$x - I^{\iota}(x, y)\,\iota \in V(y),$$

then

$$w'\left(x - I^{\iota}(x, y)\,\iota\right) \ge c(w, y) \quad \text{for all } x \in \mathbb{R}^{N}_{+},\ w \in \mathbb{R}^{N}_{++}.$$

Using the positive homogeneity of c (w, y) and rearranging gives

$$w^{*\prime}x \ge I^{\iota}(x, y) + c(w^{*}, y) \quad \text{for all } x \in \mathbb{R}^{N}_{+},\ w^{*} \in \mathbb{R}^{N}_{++} \qquad \text{(Fenchel's inequality)} \quad (7)$$

where $w^{*} = w/(w'\iota)$ represents real input prices normalized by the value of the
numeraire bundle. Holding x fixed, Fenchel’s inequality implies

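$$w^{*\prime}x - c(w^{*}, y) \ge I^{\iota}(x, y) \quad \text{for all } w^{*} \in \mathbb{R}^{N}_{++},$$

while holding w∗ fixed gives

$$w^{*\prime}x - I^{\iota}(x, y) \ge c(w^{*}, y) \quad \text{for all } x \in \mathbb{R}^{N}_{+}.$$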

Thus, I ι (x, y) provides a lower bound for all affine functions of x describable as
$w^{*\prime}x - c(w^{*}, y)$ for $w^{*} \in \mathbb{R}^{N}_{++}$, and $c(w^{*}, y)$ provides a lower bound for all
affine functions of $w^{*}$ describable as $w^{*\prime}x - I^{\iota}(x, y)$ for $x \in \mathbb{R}^{N}_{+}$. An immediate
consequence is that

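$$\inf_{w^{*} \in \mathbb{R}^{N}_{++}} \{w^{*\prime}x - c(w^{*}, y)\} \ge I^{\iota}(x, y), \quad \text{and}$$

$$\min_{x \in \mathbb{R}^{N}_{+}} \{w^{*\prime}x - I^{\iota}(x, y)\} \ge c(w^{*}, y). \qquad (8)$$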

The left-hand sides of the inequalities in (8) are, respectively, the concave conjugates
of $c(w^{*}, y)$ and $I^{\iota}(x, y)$ (Rockafellar [20]). When V (y) satisfies FDI and is
convex as a subset of $\mathbb{R}^{N}_{+}$, these inequalities convert to equalities to form a conjugate
dual pair (Rockafellar [20])

$$I^{\iota}(x, y) = \inf_{w^{*} \in \mathbb{R}^{N}_{++}} \{w^{*\prime}x - c(w^{*}, y)\}, \quad \text{and} \qquad (9)$$

$$c(w^{*}, y) = \min_{x \in \mathbb{R}^{N}_{+}} \{w^{*\prime}x - I^{\iota}(x, y)\}.$$

An immediate consequence is the following version of Shephard’s Lemma:

$$x \in \partial c(w^{*}, y) \Leftrightarrow w^{*} \in \partial I^{\iota}(x, y),$$

where, for example, ∂c (w, y) now denotes the superdifferential of c (w, y) in
w.21 In words, the efficient solution to the cost minimization problem belongs
to the superdifferential of the cost function and the efficient solution to the dual
formulation of the distance-function problem belongs to the superdifferential of I ι .
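The conjugate relationship in (9) can be illustrated with a two-input example. The sketch below rests on assumptions chosen only for tractability: V(y) = {x ≥ 0 : √(x1 x2) ≥ y}, whose cost function is c(w, y) = 2y√(w1 w2), the numeraire ι = (1, 1), and the helper names. It computes I ι (x, y) directly from its definition by bisection and then again from the cost function by minimizing over normalized prices w∗ = (t, 1 − t).

```python
import math

def input_distance(x, y, tol=1e-10):
    """I^iota(x, y) = max{lam : x - lam*iota in V(y)} with iota = (1, 1).

    Illustrative V(y) = {x >= 0 : sqrt(x1*x2) >= y}, y > 0 (an assumption).
    """
    def feasible(lam):
        a, b = x[0] - lam, x[1] - lam
        return a >= 0 and b >= 0 and a * b >= y * y

    lo, hi = -1e3, min(x)      # feasible at lo, infeasible at hi when y > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo

def cost(w, y):
    # Cost function dual to the assumed V(y): min{w'x : x1*x2 >= y^2}.
    return 2.0 * y * math.sqrt(w[0] * w[1])

def dual_input_distance(x, y, iters=200):
    """inf over w* = (t, 1 - t), so that w*'iota = 1, of w*'x - c(w*, y)."""
    def objective(t):
        w = (t, 1.0 - t)
        return w[0] * x[0] + w[1] * x[1] - cost(w, y)

    lo, hi = 0.0, 1.0
    for _ in range(iters):     # ternary search: the objective is convex in t
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            hi = m2
        else:
            lo = m1
    return objective(0.5 * (lo + hi))

x, y = (2.0, 8.0), 2.0
primal = input_distance(x, y)
dual = dual_input_distance(x, y)
print(primal, dual)            # the two values agree up to numerical tolerance
```

For other normalized price vectors, w∗′x − c(w∗, y) stays at or above the primal value, which is Fenchel's inequality (7) at work.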

Efficiency Analysis

Another important use of distance functions is to measure technical efficiency.
Recall, for example, the radial input-oriented distance function

d V (x, y) = sup {θ > 0 : x ∈ θ V (y)} .

Because

$$d^{V}(x, y) = \left(1 - I^{x}(x, y)\right)^{-1},$$

its effective numeraire is x. Moreover, if x ∈ V (y), d V (x, y) ≥ 1, and under WDI,
the converse is true.
One important interpretation of d V (x, y) is as a measure of technical efficiency
for producing y (Debreu [10], Farrell [14]). If d V (x, y) = 1, x might be called
technically efficient relative to V (y) because x cannot be shrunk radially and stay

21 Recall the superdifferential of c is the subdifferential of −c.


in V (y). On the other hand, if d V (x, y) > 1, x “might” be called technically
inefficient because the radially shrunken $x / d^{V}(x, y)$ belongs to V (y).
We’ve employed the “might” qualification to emphasize that precise definitions
of technical efficiency are necessarily entangled with how the “boundary” for V (y)
is characterized. To illustrate, recall that we have previously referred to V (y)’s
boundary as an isoquant. Intuitively, its name is meant to connote the intermediate
micro intuition that it represents the input bundles that can produce the same output.
But that definition also describes V (y), so more precision is needed. In particular,
we want to exclude points falling in V (y) but not on its boundary. That intuitive
idea can be accomplished in a number of different ways. One popular alternative is
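to identify the isoquant correspondence, $\bar{V}^{R} : \mathbb{R}^{M} \rightrightarrows \mathbb{R}^{N}$, via

$$\bar{V}^{R}(y) = \{x \in V(y) : \lambda x \notin V(y),\ \lambda < 1\}.$$

Another alternative is

$$\bar{V}^{L}(y) = \{x \in V(y) : x' \le x,\ x' \ne x \Rightarrow x' \notin V(y)\}.$$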

Because λx ≤ x for λ < 1, one can show V̄ L (y) ⊆ V̄ R (y). The converse is not
true because V̄ L (y) can exclude points falling in V̄ R (y). The Leontief production
function f (x) = min {x1 , x2 } illustrates. For that case,

$$\bar{V}^{L}(y) = \{(y, y)\}$$

is a singleton set. $\bar{V}^{R}(y)$, on the other hand, is represented as an L-shaped
isoquant emanating upward from (y, y) with its vertical “arm” extending parallel
to the vertical axis and its horizontal “arm” extending outward and parallel to the
horizontal axis. Hence,

$$x \in \bar{V}^{L}(y) \Rightarrow d^{V}(x, y) = 1, \quad \text{and} \quad d^{V}(x, y) = 1 \Leftrightarrow x \in \bar{V}^{R}(y), \qquad (10)$$

so that points can be declared technically efficient according to the Debreu-Farrell
criterion but not fall on the “isoquant” V̄ L (y).
Expressions (10) illustrate the close connection between notions of technical
efficiency and different versions of GIN. Choosing V̄ R (y) to be the boundary
concept associates technical efficiency with radial shrinkage of x and naturally links
technical efficiency judgments to GIN for all x ∈ V (y). V̄ L (y), on the other hand,
associates technical inefficiency with being able to decrease any single input and,
thus, links technical efficiency comparisons to GIN for all $\iota \in \mathbb{R}^{N}_{+} \setminus \{0\}$. Because
the latter is a more stringent criterion, $\bar{V}^{L}(y) \subseteq \bar{V}^{R}(y)$. More generally, different
boundary notions derived from either O ω or I ι by varying either ω or ι will lead to
definitions of technical efficiency that are specific to the choice of the numeraire.

Expressions (10) also illustrate that input bundles, which could never be eco-
nomically efficient, can be technically efficient according to some criteria. Again
the Leontief technology, y = min {x1 , x2 }, illustrates. As long as input prices are
positive, the only economically efficient point in V̄ R (y) is (y, y). Economists are
usually more interested in economic efficiency than in technical efficiency. And
economic efficiency is often defined in dual terms. A natural measure of cost
efficiency for a given input bundle x ∈ V (y) is the ratio of minimal cost to observed
cost

$$\frac{c(w, y)}{w'x}.$$

If x ∈ V (y) and $c(w, y)/(w'x) = 1$, x is judged cost efficient. If $c(w, y)/(w'x) < 1$, x is cost
inefficient because costs can be reduced while still producing the same output.
Because it is definitionally true that $x / d^{V}(x, y) \in V(y)$, it follows for $x \in \mathbb{R}^{N}_{++}$ and
$V(y) \subset \mathbb{R}^{N}_{++}$ that

$$w'x \ge c(w, y)\, d^{V}(x, y). \qquad \text{(Mahler's inequality)} \quad (11)$$

Mahler’s inequality is the multiplicative manifestation of the same economic
phenomenon lying behind Fenchel’s inequality. Farrell [14] suggested closing the
Mahler inequality by defining allocative efficiency residually as the ratio of the right-hand
side divided by the left. That is,

$$a^{V}(x, w, y) \equiv \frac{c(w, y)\, d^{V}(x, y)}{w'x}.$$

The input bundle x is allocatively efficient for (w, y) if a V (x, w, y) = 1 and
inefficient otherwise.
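A short numerical sketch shows how the Farrell decomposition fits together. The technology V(y) = {x ≥ 0 : √(x1 x2) ≥ y}, its closed-form radial distance and cost functions, the data, and the variable names are all assumptions made for illustration.

```python
import math

def radial_input_distance(x, y):
    # d^V(x, y) = sup{theta > 0 : x/theta in V(y)} for the assumed
    # V(y) = {x >= 0 : sqrt(x1*x2) >= y}, which gives sqrt(x1*x2)/y.
    return math.sqrt(x[0] * x[1]) / y

def cost(w, y):
    # Minimal cost of producing y under the assumed technology.
    return 2.0 * y * math.sqrt(w[0] * w[1])

x, y, w = (2.0, 8.0), 2.0, (1.0, 1.0)

observed_cost = w[0] * x[0] + w[1] * x[1]
d = radial_input_distance(x, y)

technical_efficiency = 1.0 / d                     # radial contraction factor
cost_efficiency = cost(w, y) / observed_cost       # c(w, y) / w'x
allocative_efficiency = cost(w, y) * d / observed_cost

# Mahler's inequality (11): observed cost is at least c(w, y) * d^V(x, y).
mahler_holds = observed_cost >= cost(w, y) * d

print(technical_efficiency, allocative_efficiency, cost_efficiency, mahler_holds)
```

With these numbers, cost efficiency (0.4) is the product of technical efficiency (0.5) and allocative efficiency (0.8), which is the sense in which the allocative term closes Mahler's inequality.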
If FDI is maintained and V (y) is assumed to be a convex set, Mahler’s inequality
forms the basis for yet another dual pair:

$$c(w, y) = \min_{x} \frac{w'x}{d^{V}(x, y)} = \min_{x} \{w'x : d^{V}(x, y) \ge 1\},$$

$$d^{V}(x, y) = \min_{w} \frac{w'x}{c(w, y)} = \min_{w} \{w'x : c(w, y) \ge 1\}. \qquad (12)$$

Moreover, convexity of V (y) ensures the following version of Shephard’s Lemma

$$x \in \partial c(w, y) \Leftrightarrow w \in \partial d^{V}(x, y).$$

The Farrell [14] definitions of technical, cost, and allocative efficiency depend
upon the chosen numeraire. Choosing a different numeraire results in different notions

of technical, cost, and allocative efficiency (Chambers, Chung, and Färe [9]).
Recalling that

$$x - I^{\iota}(x, y)\,\iota \in V(y)$$

implies

$$w'x - I^{\iota}(x, y)\, w'\iota \ge c(w, y),$$

whence

$$\frac{w'x - c(w, y)}{w'\iota} = w^{*\prime}x - c(w^{*}, y) \ge I^{\iota}(x, y). \qquad (13)$$

Following Chambers, Chung, and Färe [9], a difference-based measure of allocative
efficiency is then determined residually to “balance” Fenchel’s inequality as

$$w^{*\prime}x = A^{\iota}(w^{*}, x, y) + c(w^{*}, y) + I^{\iota}(x, y).$$

To recycle this discussion for measuring output inefficiency, replacing V (y) with
Y (x), “isoquants” with “transformation curves,” and x technically efficient for y
with y technically efficient for x provides the needed jargon. Then, replacing WDI
with WDO, FDI with FDO, and $w \in \mathbb{R}^{N}_{++}$ with strictly positive output prices
$p \in \mathbb{R}^{M}_{++}$, and recycling arguments made in terms of x in terms of y, will provide the
needed extension.

Index Numbers and Productivity Measurement

Konüs ([1924], [15])22 proposed using cost or expenditure functions to represent
price indexes. Following that contribution, Sten Malmquist [18], following the basic
logic of duality, proposed using distance functions to represent quantity indexes.
Malmquist [18], who was working in the context of consumer theory, phrased things
in slightly different terms than ours. But the translation is easy enough. Simply
equate Malmquist’s commodities with our inputs and his indifference level with our
output (now treated as a scalar).
For the commodity bundles $x^{o}, x^{1} \in \mathbb{R}^{N}_{++}$ and the indifference level $y^{o}$,
Malmquist [18, p. 230] defined his quantity index

$$I^{x}_{o1}(y^{o}) = \min\{\mu > 0 : \mu x^{1} \in V(y^{o})\}$$

22 Konüs [15] is a translation of a paper originally published by Konüs in Russian in 1924.


$$= d^{V}(x^{1}, y^{o})^{-1},$$

with x o chosen so that d V (x o , y o ) = 1. Hence, more generally,

$$I^{x}_{o1}(y^{o}) = \frac{d^{V}(x^{o}, y^{o})}{d^{V}(x^{1}, y^{o})}.$$

Translated into inputs and outputs, either $I^{x}_{o1}(y^{o})$ or its reciprocal $I^{x}_{o1}(y^{o})^{-1}$ defines
a Malmquist input index (Caves, Christensen, and Diewert [3]) that relates input
bundles x o , x 1 in terms of their ability to produce the output y o . Because
$I^{x}_{o1}(y^{o})$ is the smallest radial scaling of x 1 that remains in V (y o ), if $I^{x}_{o1}(y^{o}) < 1$, x 1
can be radially shrunk and still produce output y o , and if $I^{x}_{o1}(y^{o}) > 1$,
it must be radially expanded to be capable of producing y o . Thus, in this sense, x 1 would be
judged “larger” than x o in the first instance and smaller in the second.

The Malmquist input index effectively ranks $x^o$ and $x^1$ in terms of their ability to produce $y^o$. If one instead uses $y^1$, one obtains

$$I^x_{o1}(y^1) = \frac{d^V(x^o, y^1)}{d^V(x^1, y^1)}.$$

In general, $I^x_{o1}(y^1) \neq I^x_{o1}(y^o)$ so that the two indexes typically differ. This is a familiar problem in index number theory. Namely, indexes always depend upon the “reference” situation. Here the reference is determined by the level at which the output, y, is set. A common way23 to address this indeterminacy is to take geometric averages to arrive at the following input index

$$\tilde{I}^x_{o1}(y^o, y^1) = \left[\frac{d^V(x^o, y^1)}{d^V(x^1, y^1)}\,\frac{d^V(x^o, y^o)}{d^V(x^1, y^o)}\right]^{1/2}.$$
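The dependence of the Malmquist input index on the reference output, and its geometric-mean resolution, can be sketched numerically. The example assumes a hypothetical non-homothetic single-output technology $f(x) = x_1 + \sqrt{x_2}$ with $V(y) = \{x : f(x) \ge y\}$, and computes $d^V(x, y) = \max\{\lambda > 0 : f(x/\lambda) \ge y\}$ by bisection. The technology, the bundles, and all function names are assumptions chosen for illustration only.

```python
import math

def f(x):
    """Hypothetical non-homothetic production function (an assumption)."""
    return x[0] + math.sqrt(x[1])

def input_distance(x, y, tech=f, rel_tol=1e-12):
    """d^V(x, y) = max{lam > 0 : tech(x / lam) >= y}, by bisection.

    Assumes tech is increasing, so feasibility is monotone in lam.
    """
    def feasible(lam):
        return tech([xi / lam for xi in x]) >= y

    lo = hi = 1.0
    for _ in range(200):                 # shrink lo until it is feasible
        if feasible(lo):
            break
        lo /= 2.0
    else:
        raise ValueError("no feasible radial contraction found")
    for _ in range(200):                 # grow hi until it is infeasible
        if not feasible(hi):
            break
        hi *= 2.0
    else:
        raise ValueError("distance appears unbounded")
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo

def malmquist_input_index(x0, x1, y):
    """I(y) = d^V(x0, y) / d^V(x1, y) for reference output y."""
    return input_distance(x0, y) / input_distance(x1, y)

def geometric_mean_input_index(x0, x1, y0, y1):
    """Geometric average of the indexes based on y0 and on y1."""
    return math.sqrt(malmquist_input_index(x0, x1, y0)
                     * malmquist_input_index(x0, x1, y1))

x0, x1 = [0.0, 4.0], [2.0, 0.0]
print(malmquist_input_index(x0, x1, 1.0))            # approximately 2
print(malmquist_input_index(x0, x1, 2.0))            # approximately 1
print(geometric_mean_input_index(x0, x1, 1.0, 2.0))  # approximately sqrt(2)
```

With a homothetic technology the two base-specific indexes would coincide, which is why a non-homothetic form is assumed here.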

Exactly parallel arguments on the output side suggest defining a Malmquist output index (Caves, Christensen, and Diewert [3]) as

$$O^y_{o1}(x^o) = \max\{\mu > 0 : \mu y^1 \in Y(x^o)\} = g^Y(y^1, x^o)^{-1}$$

with $y^o$ chosen so that $g^Y(y^o, x^o) = 1$, whence

$$O^y_{o1}(x^o) = \frac{g^Y(y^o, x^o)}{g^Y(y^1, x^o)}.$$

23 Other ways exist. For example, one could simply choose an arbitrary y that is neither $y^o$ nor $y^1$ as the reference. That resolves the dilemma of choosing either o or 1 as the basis, but it does not resolve its arbitrariness.

Again, either $O^y_{o1}(x^o)$ or its reciprocal, $O^y_{o1}(x^o)^{-1}$, defines an index. But now the index relates $y^o$ and $y^1$ in terms of $x^o$’s ability to produce them. If $O^y_{o1}(x^o) < 1$, $y^1$ would have to be radially shrunk to be producible by $x^o$. Because $y^o$ could be produced using $x^o$, $y^1$ would often be judged as larger than $y^o$ in this case. As before, choosing a different input base, say $x^1$, yields a different index. The corresponding geometric average that is often used to resolve this indeterminacy is

$$\tilde{O}^y_{o1}(x^o, x^1) = \left[\frac{g^Y(y^o, x^o)}{g^Y(y^1, x^o)}\,\frac{g^Y(y^o, x^1)}{g^Y(y^1, x^1)}\right]^{1/2}.$$

Following a parallel logic, Chambers [4, 6] suggested developing input and output “indicators”24 defined in terms of a specific numeraire. He defined a Luenberger input indicator for $x^o$, $x^1$, $y^o$

$$X^{\iota}(x^o, x^1, y^o) = I^{\iota}(x^1, y^o) - I^{\iota}(x^o, y^o),$$

and a Luenberger output indicator for $x^o$, $y^o$, $y^1$

$$Y^{\omega}(x^o, y^o, y^1) = O^{\omega}(y^o, x^o) - O^{\omega}(y^1, x^o).$$

As with the Malmquist index numbers, the Luenberger indicators depend upon their bases. This dependence is addressed by taking simple averages to get the respective input and output indicators,

$$\tilde{X}^{\iota}(x^o, x^1, y^o, y^1) = \frac{1}{2}\left[I^{\iota}(x^1, y^o) - I^{\iota}(x^o, y^o) + I^{\iota}(x^1, y^1) - I^{\iota}(x^o, y^1)\right],$$

$$\tilde{Y}^{\omega}(x^o, x^1, y^o, y^1) = \frac{1}{2}\left[O^{\omega}(y^o, x^o) - O^{\omega}(y^1, x^o) + O^{\omega}(y^o, x^1) - O^{\omega}(y^1, x^1)\right].$$

One of the primary reasons for interest in index numbers is the desire to expand concepts developed in the case of a single input and single output to higher dimensions. Particular interest has focused on average products defined in the single input-single output case by the ratio y/x. A very common choice for x is labor. Average products have many different interpretations, but perhaps the most common is that of an average return to using x. This average return goes by a variety of names. In agricultural interpretations, it’s typically yield and x is taken to be land. In other areas, the parallel terminology is “productivity,” which has come to play an important role in modern macroeconomics.

24 The “indicator” terminology appears due to Diewert (1993).
A natural way to compare productivity across different observations is to compare, say, $y^1/x^1$ to $y^o/x^o$. Working in ratio terms suggests the following notion of a productivity index

$$\frac{y^1/x^1}{y^o/x^o}.$$

When $(x, y) \in \mathbb{R}^2$, such productivity comparisons are perfectly sensible. The natural question, however, is what to do in the more realistic case when there are multiple inputs and multiple outputs. A variety of different solutions have been proposed. All rely on distance functions. Here we shall review the two most common.
What has been dubbed the Malmquist approach (Caves, Christensen, and Diewert [3]) views productivity measurement from two different perspectives: one is output-based and the other is input-based. We treat the input-based approach and leave the output-based approach, which is exactly parallel, to be developed by the interested reader. The Malmquist input-oriented productivity index is defined

$$m(x^o, x^1, y^o, y^1) = \frac{d^V(x^1, y^1)}{d^V(x^o, y^o)},$$

which represents the ratio of the radial distance that $x^o$ is from the frontier for $V(y^o)$ to the corresponding distance for $x^1$ and $y^1$. If $x^1$ is on the frontier for $V(y^1)$ but $x^o$ is not on the frontier for $V(y^o)$, $m(x^o, x^1, y^o, y^1) < 1$ and observation 1 is judged to be the more productive in an input sense. However, if both observations are on the frontier for their respective $V(y)$, $m(x^o, x^1, y^o, y^1) = 1$, and both are judged equally efficient. Thus, the Malmquist approach effectively identifies productivity closely with how efficiently the technology is being used. The Luenberger input-oriented productivity indicator (Chambers [4, 6]) works similarly and defines the productivity indicator as

$$l^{\iota}(x^o, x^1, y^o, y^1) = I^{\iota}(x^1, y^1) - I^{\iota}(x^o, y^o).$$

The Hicks-Moorsteen approach identifies its concept of productivity not with technical efficiency but with the intuitive idea that productivity should reflect an average product. To illustrate, suppose that $\hat{X}$ represents an aggregate input and $\hat{Y}$ an aggregate output; then one version of a Hicks-Moorsteen productivity index would be written

$$H(\hat{X}^o, \hat{X}^1, \hat{Y}^o, \hat{Y}^1) = \frac{\hat{Y}^1/\hat{X}^1}{\hat{Y}^o/\hat{X}^o}.$$

If this operation is sensible (and in general there’s good reason to suspect it may not be), it’s also true that

$$H(\hat{X}^o, \hat{X}^1, \hat{Y}^o, \hat{Y}^1) = \frac{\hat{Y}^1/\hat{Y}^o}{\hat{X}^1/\hat{X}^o},$$

so that the Hicks-Moorsteen productivity index can also be interpreted as the ratio of an output index, $\hat{Y}^1/\hat{Y}^o$, to an input index, $\hat{X}^1/\hat{X}^o$.
Making such observations sensible requires defining the notion of aggregation
of inputs and outputs and then of indexing. There are many such alternatives.
For example, one could simply use observed prices and Laspeyres, Paasche, or
Fisher methods to compute aggregates without any reference to the technology.
Another approach is to combine Malmquist methods of creating input indexes and
output indexes with the Hicks-Moorsteen definition. If that were done, one possible
candidate for a Hicks-Moorsteen productivity index would be

$$H^M(\hat{X}^o, \hat{X}^1, \hat{Y}^o, \hat{Y}^1) = \frac{\tilde{O}^y_{o1}(x^o, x^1)}{\tilde{I}^x_{o1}(y^o, y^1)},$$

where the M superscript recognizes that the indexes are Malmquist ones.
As a general matter, $H^M(\hat{X}^o, \hat{X}^1, \hat{Y}^o, \hat{Y}^1)$ and $m(x^o, x^1, y^o, y^1)$ do not coincide. The same would be true for $H^M(\hat{X}^o, \hat{X}^1, \hat{Y}^o, \hat{Y}^1)$ and a Malmquist output-based productivity index constructed using $g^Y$. To illustrate, take the special case where $(x^o, y^o)$ and $(x^1, y^1)$ satisfy $d^V(x^o, y^o) = d^V(x^1, y^1) = g^Y(y^o, x^o) = g^Y(y^1, x^1) = 1$ so that all observations are efficient in the Debreu-Farrell sense. Then $H^M(\hat{X}^o, \hat{X}^1, \hat{Y}^o, \hat{Y}^1)$ and $m(x^o, x^1, y^o, y^1)$ coincide if and only if

$$\frac{g^Y(y^o, x^1)}{g^Y(y^1, x^o)} = \frac{d^V(x^o, y^1)}{d^V(x^1, y^o)}.$$

Conditions can be identified under which this equality is satisfied, but they will be restrictive.25 The ultimate reason that the two approaches differ lies in the different approaches taken to generalize a univariate concept to multiple dimensions. One views productivity in the univariate case as measuring productive efficiency, and the other views it as measuring an average product. Because the viewpoints differ, the end results differ.

25 See, for example, the chapter in this volume by Färe, Primont, and Weber.
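A small numerical sketch can make the gap between the two productivity notions concrete. It assumes a hypothetical single-output technology $f(x) = x_1^a x_2^b$, homogeneous of degree $r = a + b$, for which $d^V(x, y) = (f(x)/y)^{1/r}$ and $g^Y(y, x) = y/f(x)$ in closed form. Under these assumptions the Malmquist input-oriented index $m$ and the Malmquist-based Hicks-Moorsteen index $H^M$ agree when $r = 1$ and differ otherwise. The functional form, parameter values, and function names are illustrative and not taken from the chapter.

```python
import math

def f(x, a, b):
    """Hypothetical Cobb-Douglas technology, homogeneous of degree a + b."""
    return x[0] ** a * x[1] ** b

def d_V(x, y, a, b):
    """Radial input distance: max{lam : f(x/lam) >= y} = (f(x)/y)^(1/(a+b))."""
    return (f(x, a, b) / y) ** (1.0 / (a + b))

def g_Y(y, x, a, b):
    """Radial output distance: min{theta : y/theta <= f(x)} = y / f(x)."""
    return y / f(x, a, b)

def malmquist_productivity(x0, x1, y0, y1, a, b):
    """m = d^V(x1, y1) / d^V(x0, y0)."""
    return d_V(x1, y1, a, b) / d_V(x0, y0, a, b)

def hicks_moorsteen(x0, x1, y0, y1, a, b):
    """H^M = geometric-mean output index / geometric-mean input index."""
    out = math.sqrt(g_Y(y0, x0, a, b) / g_Y(y1, x0, a, b)
                    * g_Y(y0, x1, a, b) / g_Y(y1, x1, a, b))
    inp = math.sqrt(d_V(x0, y1, a, b) / d_V(x1, y1, a, b)
                    * d_V(x0, y0, a, b) / d_V(x1, y0, a, b))
    return out / inp

obs = dict(x0=[1.0, 1.0], x1=[4.0, 4.0], y0=0.8, y1=2.0)

# Constant returns to scale (a + b = 1): the two measures agree.
print(malmquist_productivity(**obs, a=0.5, b=0.5),
      hicks_moorsteen(**obs, a=0.5, b=0.5))

# Decreasing returns to scale (a + b = 0.7): the two measures differ.
print(malmquist_productivity(**obs, a=0.3, b=0.4),
      hicks_moorsteen(**obs, a=0.3, b=0.4))
```

The example is deliberately narrow: it shows one restrictive condition (constant returns with a single output) under which the measures coincide, and it is not meant as a general characterization.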

Productivity comparisons are made in many different contexts. In some, an implicit assumption that we have so far maintained is plausible. In others, it is not. That implicit assumption is that both $(x^o, y^o)$ and $(x^1, y^1)$ represent observations on
firms or decision-making units that have access to the same technology. Given our
description of T as simply containing the feasible inputs and outputs, that’s perfectly
sensible at any point in time regardless of the breadth of any cross section. However,
once time changes it seems less sensible because we clearly have the knowledge to
produce things now that we could not produce in earlier periods. One way to tackle
this difficulty is to adopt the perspective that things that are physically possible do
not change and that the change that we observe represents not a change in physical
possibilities but a change in our knowledge of those physical possibilities. That
perspective, if taken, would imply that one of the components of x should represent
that knowledge (or better yet the flow associated with it). In principle, that is fine,
but in practice representing that knowledge in measurable units is quite tricky. The
common default, therefore, is to assume that technologies change with the passage
of time and to recognize that passage by an appropriate subscripting of T , Y , or
V . Once that is done, productivity comparisons also involve comparisons across
different technologies.

Empirical Implementation of Distance Functions

Distance functions have been used in a variety of applied production settings. In broad terms, there are two distinct approaches, frequently referred to as nonparametric and parametric. With few exceptions, nonparametric analyses are conducted
using programming tools developed in the operations research literature. Parametric
analyses, on the other hand, mainly utilize statistical tools developed in the
econometrics literature. Both programming and econometric approaches to applied
production problems are covered in detail in other chapters of this handbook. Hence,
in this section, we restrict attention to the most basic details of both approaches as
they relate to distance functions.
Assume that there exist K distinct observations on inputs and outputs $(x^k, y^k) \in \mathbb{R}^{N+M}_+$, which we shall refer to as the data set, and denote by

$$D(K) = \left\{(x^k, y^k) \in \mathbb{R}^{N+M}_+ : k = 1, 2, \ldots, K\right\}.$$

The most common version of the programming approach to analyzing these data, often referred to as data envelopment analysis (DEA), approximates the technology by developing a conservative extension of the convex hull of the data set. Specifically, the data set is assumed to have been generated by a T that is closed, convex as a subset of $\mathbb{R}^{N+M}_+$, and exhibits FDO and FDI. If T is closed and convex, all convex combinations of $D(K)$ must belong to T. That is,

$$C\{D(K)\} \subset T,$$

where

$$C\{D(K)\} \equiv \left\{(x, y) \in \mathbb{R}^{N+M}_+ : x = \sum_{k=1}^{K} \lambda_k x^k,\; y = \sum_{k=1}^{K} \lambda_k y^k,\; \sum_{k=1}^{K} \lambda_k = 1,\; \lambda_k \in \mathbb{R}_+ \;\forall k\right\},$$

is the convex hull of $D(K)$. The standard DEA approximation to T is then derived by recognizing that if T also satisfies FDI and FDO, then

$$C\{D(K)\} \subset T \Rightarrow F\{D(K)\} \subset T$$

where

$$F\{D(K)\} \equiv \left\{(x, y) \in \mathbb{R}^{N+M}_+ : x \ge \sum_{k=1}^{K} \lambda_k x^k,\; y \le \sum_{k=1}^{K} \lambda_k y^k,\; \sum_{k=1}^{K} \lambda_k = 1,\; \lambda_k \in \mathbb{R}_+ \;\forall k\right\}.$$

$F\{D(K)\}$, sometimes referred to as the free disposal convex hull of $D(K)$, represents the smallest subset of $\mathbb{R}^{N+M}_+$ that contains all convex combinations of $D(K)$ and elements of $\mathbb{R}^{N+M}_+$ that must belong to T if it is to be consistent with both FDI and FDO.
Because $F\{D(K)\}$ itself is a closed convex set and satisfies FDI and FDO, both

$$I^{\iota}_{F\{D(K)\}}(x, y) = \max\{\lambda \in \mathbb{R} : (x - \lambda\iota, y) \in F\{D(K)\}\},$$

and

$$O^{\omega}_{F\{D(K)\}}(y, x) = \min\{\beta \in \mathbb{R} : (x, y - \beta\omega) \in F\{D(K)\}\},$$

satisfy Indication for $F\{D(K)\}$, Translation, and Normalization. Both are calculable using standard linear programming techniques applied to $D(K)$.
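For intuition, the input-oriented function on the free disposal convex hull can be sketched in the special case of one input and one output with direction $\iota = 1$. In that case the underlying linear program has only two constraints (the output constraint and the convexity constraint), so an optimal solution puts positive weight on at most two observations, and a simple enumeration of single observations and pairs suffices. This is an illustrative shortcut under those assumptions, not the general method; with several inputs and outputs one would use a linear programming solver. The data and function names are hypothetical.

```python
from itertools import combinations

def min_input(data, y):
    """Smallest input on the free disposal convex hull that can produce y.

    data is a list of (x_k, y_k) pairs with a single input and a single output.
    """
    # Single observations that already produce at least y (free disposal of output).
    candidates = [xk for xk, yk in data if yk >= y]
    # Convex combinations of two observations that produce exactly y.
    for (xj, yj), (xk, yk) in combinations(data, 2):
        if yj != yk and min(yj, yk) <= y <= max(yj, yk):
            t = (y - yj) / (yk - yj)          # weight on observation k, in [0, 1]
            candidates.append((1.0 - t) * xj + t * xk)
    if not candidates:
        raise ValueError("y cannot be produced by any convex combination of the data")
    return min(candidates)

def input_directional_distance(data, x, y, iota=1.0):
    """max{lam : (x - lam*iota, y) in F{D(K)}} for one input, one output, iota > 0."""
    return (x - min_input(data, y)) / iota

observations = [(1.0, 1.0), (2.0, 3.0), (4.0, 4.0), (3.0, 2.0)]
for x, y in observations:
    print((x, y), input_directional_distance(observations, x, y))
```

Observations with a value of zero lie on the estimated frontier; the observation (3, 2) could reduce its input by 1.5 units and still produce its output according to this approximation.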
Econometric analysis involving distance functions typically requires assumptions about the data-generating process (DGP) associated with $D(K)$ and about the specific parametric form for either $O^{\omega}$ or $I^{\iota}$. The assumptions about the underlying DGPs are varied and are treated in detail elsewhere in this volume. That leaves the issue of the specific functional form. To illustrate some of the potential issues, consider approximating an arbitrary function $F : \mathbb{R}^J \to \mathbb{R}$ by the generalized quadratic form

$$\hat{F}(q) \approx \alpha_0 + \sum_{j=1}^{J} \alpha_j q_j + \sum_{j=1}^{J}\sum_{i=1}^{J} \beta_{ji} q_j q_i$$

where the α and β terms represent parameters to be econometrically estimated. If $\hat{F}$ is to satisfy Translation in the direction $\chi \in \mathbb{R}^J$,

$$\hat{F}(q + \chi) \approx \hat{F}(q) + 1$$

for all $q \in \mathbb{R}^J$ whence

$$\sum_{j=1}^{J} \alpha_j \chi_j + \sum_{j=1}^{J}\sum_{i=1}^{J} \beta_{ji}\left(q_j \chi_i + q_i \chi_j + \chi_j \chi_i\right) = 1$$

for all $q \in \mathbb{R}^J$. Thus, for the generalized quadratic form to satisfy Translation

$$\sum_{j=1}^{J} \alpha_j \chi_j = 1, \quad \sum_{i=1}^{J} \beta_{ji} \chi_i = 0, \quad \text{and} \quad \sum_{j=1}^{J} \beta_{ji} \chi_j = 0.$$
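The parameter restrictions can be checked numerically. The sketch below builds a two-variable generalized quadratic with hypothetical parameter values that satisfy the three restrictions for the direction $\chi = (1, 1)$, confirms that $\hat{F}(q + \chi) - \hat{F}(q) = 1$ at an arbitrary point, and shows that the property fails once a restriction on $\beta$ is violated. All numbers and function names are illustrative assumptions.

```python
def quadratic(q, alpha0, alpha, beta):
    """Generalized quadratic: alpha0 + sum_j alpha_j q_j + sum_j sum_i beta_ji q_j q_i."""
    J = len(q)
    linear = sum(alpha[j] * q[j] for j in range(J))
    second = sum(beta[j][i] * q[j] * q[i] for j in range(J) for i in range(J))
    return alpha0 + linear + second

def translation_gap(q, chi, alpha0, alpha, beta):
    """F(q + chi) - F(q); equals 1 for every q when Translation holds."""
    shifted = [qj + cj for qj, cj in zip(q, chi)]
    return quadratic(shifted, alpha0, alpha, beta) - quadratic(q, alpha0, alpha, beta)

chi = [1.0, 1.0]
alpha = [0.3, 0.7]                        # sum_j alpha_j chi_j = 1
beta_ok = [[0.5, -0.5], [-0.5, 0.5]]      # rows and columns weighted by chi sum to 0
beta_bad = [[0.5, 0.0], [0.0, 0.5]]       # violates both restrictions on beta

q = [2.0, -1.5]
print(translation_gap(q, chi, 0.1, alpha, beta_ok))    # close to 1
print(translation_gap(q, chi, 0.1, alpha, beta_bad))   # depends on q, not 1 in general
```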

More generally, Färe and Lundberg [13] demonstrate that among the class of second-order flexible approximations to an arbitrary function $F : \mathbb{R}^J \to \mathbb{R}$ describable as

$$\hat{F}(q) = \alpha_0 + \sum_{j=1}^{J} \alpha_j h_j(q_j) + \sum_{j}\sum_{i} \beta_{ji} h_j(q_j) h_i(q_i),$$

only the generalized quadratic

$$\alpha_0 + \sum_{j=1}^{J} \alpha_j q_j + \sum_{j}\sum_{i} \beta_{ji} q_j q_i,$$

and

$$\frac{1}{2\mu} \ln\left(\sum_{j}\sum_{i} \beta_{ji} \exp(\mu q_j) \exp(\mu q_i)\right),$$

can be made consistent with translation.26

26 Chambers [5] appears to be the first to have considered a version of this form, which he referred to as logarithmic-transcendental.

Commentary

Distance functions, as we have treated them, descend from older ideas in mathematics. The most obvious example is the Minkowski functional (gauge) that is used to “define” distance on a vector space. For $K \subset \mathbb{R}^N$, its associated Minkowski functional (gauge) is defined

$$\rho_K(x) = \inf\{\lambda > 0 : x \in \lambda K\}.$$

It has played a central role in both functional analysis and convex analysis (e.g.,
Rockafellar [20]). Both input-oriented and output-oriented radial distance functions
are versions of Minkowski functionals.
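A minimal sketch of the gauge may help fix ideas. It evaluates $\rho_K(x)$ by bisection given only a membership test for $K$, assuming that $K$ is convex, contains the origin, and that $x/\lambda \in K$ for all sufficiently large $\lambda$. For the Euclidean unit ball the gauge is the Euclidean norm, which gives a simple check. Reading an output set $Y(x)$ in place of $K$ gives $\inf\{\theta > 0 : y/\theta \in Y(x)\}$, which is the sense in which a radial output distance function is a gauge; that reading, like the function names below, is an assumption made for illustration.

```python
import math

def gauge(x, contains, tol=1e-10):
    """Minkowski functional rho_K(x) = inf{lam > 0 : x in lam*K}, by bisection.

    contains(z) is a membership test for K. K is assumed convex with the
    origin inside, so that x/lam in K implies x/lam' in K for every lam' > lam.
    """
    hi = 1.0
    for _ in range(200):                        # grow hi until x/hi lies in K
        if contains([xi / hi for xi in x]):
            break
        hi *= 2.0
    else:
        raise ValueError("x/lam never entered K; the gauge may be infinite")
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)                   # mid > 0 because hi > 0
        if contains([xi / mid for xi in x]):
            hi = mid
        else:
            lo = mid
    return hi

def in_unit_ball(z):
    return math.sqrt(sum(zi * zi for zi in z)) <= 1.0

def in_box(z):
    """Hypothetical set K = {z : |z_1| <= 2, |z_2| <= 1}."""
    return abs(z[0]) <= 2.0 and abs(z[1]) <= 1.0

print(gauge([3.0, 4.0], in_unit_ball))   # approximately 5, the Euclidean norm
print(gauge([3.0, 3.0], in_box))         # approximately 3 = max(3/2, 3/1)
```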
Debreu [10], Shephard [21], and Malmquist [18] seem to be independently
responsible for introducing versions of the Minkowski functional into economics.
Debreu [10] used it to define a coefficient of resource utilization. Shephard [21]
used an input-oriented version to derive a dual relationship between a single-
product technology and its cost function. And Malmquist [18], explicitly reflecting
an older argument that he attributes to Konüs [15], used it to define a quantity
index. Somewhat later, Shephard [22] used a Minkowski functional to define
an output-oriented radial distance function. Färe and Lovell [12] recognized the
formal connection between input-oriented radial distance functions and the Farrell
[14] efficiency score and showed that input-oriented and output-oriented distance
functions for a technology are reciprocals if and only if the technology satisfies
constant returns to scale.
The output-oriented and input-oriented distance functions, which explicitly
recognize the use of a numeraire, can be attributed to Blackorby and Donaldson’s [2]
translation function as used in inequality measurement and to Luenberger’s [16,17]
benefit function and shortage function as used in equilibrium characterization. But
as Luenberger [16] points out, the benefit function generalizes Allais’s [1] even
older idea of disposable surplus. Their use to characterize technological structures
is due to Chambers, Chung, and Färe [8], who also explicitly recognized them as
generalized versions of radial distance functions.

Cross-References

Bad Outputs
Duality in Production
Multiproduct Technologies

References
1. Allais M (1943) Traité d’Économie pure, vol 3. Imprimerie Nationale, Paris
2. Blackorby C, Donaldson D (1980) A theoretical treatment of indices of absolute inequality. Int
Econ Rev 21(1):107–136
3. Caves DW, Christensen LR, Diewert WE (1982) Multilateral comparisons of output, input, and
productivity using superlative indexes. Econ J 92:73–86

4. Chambers RG (1996) Computable profit as a superlative technical change, input, output, and
productivity measure. Department of Agricultural and Resource Economics, The University of
Maryland, College Park, pp 96–01
5. Chambers RG (1998) Input and output indicators. In: Färe R, Grosskopf S, Russell RR (eds)
Essays in honor of Sten Malmquist. Kluwer Academic Publishers, Boston
6. Chambers RG (2002) Exact nonradial input, output, and productivity measurement. Econ
Theory 20:751–767
7. Chambers RG (2020, forthcoming) Competitive agents in certain and uncertain. Oxford
University Press, Oxford
8. Chambers RG, Chung Y, Färe R (1996) Benefit and distance functions. J Econ Theory 70:
407–419
9. Chambers RG, Chung Y, Färe R (1998) Profit, directional distance functions, and nerlovian
efficiency. J Optim Theory Appl 98:351–364
10. Debreu G (1951) The coefficient of resource utilization. Econometrica 19(3):273–292
11. Debreu G (1959) The theory of value. Yale University Press, New Haven
12. Färe R, Lovell CAK (1978) The structure of technical efficiency. J Econ Theory 19:150–162
13. Färe R, Lundberg A (2006) mimeo
14. Farrell MJ (1957) The measurement of productive efficiency. J R Stat Soc 129A:253–281
15. Konüs AA (1939) The problem of the true index of the cost of living. Econometrica 7:10–29
16. Luenberger DG (1992a) Benefit functions and duality. J Math Econ 21:461–481
17. Luenberger DG (1992b) New optimality principles for economic efficiency and equilibrium.
J Optim Theory Appl 75(2):221–264
18. Malmquist S (1953) Index numbers and indifference surfaces. Trabajos De Estadistica 4:
209–242
19. McFadden D (1978) Cost, revenue, and profit functions. In: McFadden D, Fuss M (eds)
Production economics: a dual approach to theory and applications. North Holland, Amsterdam
20. Rockafellar RT (1970) Convex analysis. Princeton University Press, Princeton
21. Shephard RW (1953) Cost and production functions. Princeton University Press, Princeton
22. Shephard RW (1970) Theory of cost and production functions. Princeton University Press,
Princeton
8 Stochastic Frontier Analysis: Foundations and Advances I
Subal C. Kumbhakar, Christopher F. Parmeter, and Valentin Zelenyuk

Contents
Introduction and Overview
The Benchmark SFM
The Distribution of ε
Alternative Specifications
Estimation of Individual Inefficiency
Do Distributional Assumptions Even Matter?
Finite Sample Identification of Inefficiency
Handling Endogeneity in the SFM
A Corrected Two-Stage Least Squares Approach
A Likelihood Approach
A Method of Moments Approach
Estimation of Individual Inefficiency
An Economic Approach to Deal with Endogeneity
Modeling Determinants of Inefficiency
Proper Modeling of the Determinants of Inefficiency
Incorporating Determinants When u Is Truncated-Normal
The Scaling Property
Estimation Without Imposing Distributional Assumptions
Estimation When Determinants of Efficiency and Endogeneity Are Present
Conclusions
Cross-References
References

S. C. Kumbhakar
Department of Economics, State University of New York at Binghamton, Binghamton, NY, USA
Inland Norway University of Applied Sciences, Lillehammer, Norway
C. F. Parmeter
Department of Economics, University of Miami, Miami, FL, USA
V. Zelenyuk
School of Economics and Centre for Efficiency and Productivity Analysis (CEPA), The University of Queensland, Brisbane, QLD, Australia

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_9

Abstract

This chapter (as well as Chap. 11) reviews some of the most important develop-
ments in the econometric estimation of productivity and efficiency surrounding
the stochastic frontier model.

Keywords

Efficiency · Productivity · Panel data · Endogeneity · Nonparametric · Determinants of inefficiency · Quantile · Identification

JEL Classification

C10, C13, C14, C50

Introduction and Overview

The primary goal of this and the next chapter is to introduce the wide audience of this Handbook to the range of methods, developed over the last four decades, within one of the most popular paradigms in modern productivity analysis: the approach called Stochastic Frontier Analysis, often abbreviated as SFA.
The first, and one of the most important, questions a reader might ask is why a researcher on productivity should ever care about SFA in general and especially about the enormous variety of different types of SFA models that have been proposed over the last four decades. Our goal in writing these two chapters was to provide the reader with a good answer to this important question. Here, we strive to outline the essence of the major types of SFA methods, providing the minimal but most essential details and focusing on the advantages and disadvantages of each method for dealing with various issues that arise in practice. We hope that upon finishing these chapters, the reader who is barely familiar, or even unfamiliar, with SFA will have obtained a general understanding of the importance and relevance of different SFA methods, along with key references for further details on each method. Of course, the reader also deserves a quick answer now, to decide whether it is worth it for a productivity researcher to read these chapters further; we try to give such a quick answer in this section.
The Nobel Laureate Paul Krugman was hardly exaggerating when he once quipped that “Productivity isn’t everything, but in the long run, it’s almost everything.” The root of this statement can be seen when looking at various theoretical models of economic growth, e.g., starting from Solow’s growth model, the related variations of more advanced growth theory models or the empirical growth accounting approach to productivity measurement, as well as from the more sophisticated measurements of productivity. Regardless of how productivity is measured,
it is inevitably tied to measuring production relationships. Such relationships are
usually modeled through the so-called production functions or, more generally,
transformation functions (e.g., Shephard’s distance functions, directional distance
functions), cost functions, etc. In the classical growth accounting approach [95],
all the variation in growth apart from the variation of inputs is attributed to the so-
called Solow’s residual, which under certain restrictions measures what is referred
to as the change in total factor productivity (TFP).
A well-known problem of simple growth accounting is that it piles up and
hides many sources for growth, the most obvious of which is the statistical error.
Standard regression methods can be, and often are, used to estimate average relationships conditional on various factors (inputs, demographic and geographic factors, etc.) to filter out the effect of statistical noise. All the deviations from the
estimated regression curves in such approaches are attributed to the statistical error,
and all the decision-making units (DMUs) represented in the data as observations
(e.g., firms, countries, etc.) are typically assumed to be fully efficient or on the
frontier of the production relationship. Such full efficiency assumption certainly
simplifies the measurement complexity, but is it really an innocent assumption?
Indeed, while many economic models admit the assumption that all firms
are efficient, the reality that one observes in practice usually suggests there are
reasonable amounts of inefficiency in this world. Such inefficiencies could arise,
for example, because of asymmetric information or, more generally, the problem of incomplete markets (e.g., see [97]), which to some extent are present in almost every aspect of our lives. Differences in inefficiencies (or in relative productivity levels)1 across firms or countries can also arise due to different managerial practices (e.g., see [19]), which could in turn be implied by the asymmetric information problem, different cultural beliefs, traditions, and expectations [17]. Does accounting for such inefficiency matter for productivity measurement? A vast literature on the subject suggests that it often matters substantially, as has been documented in thousands of articles in the last four decades. The difference is in the approach (SFA, data envelopment analysis (DEA), free disposal hull (FDH), etc.), and the goal of this and the next chapter is to give a sense of a few major approaches within the SFA paradigm.
In a nutshell, the main premise of the SFA approach is a recognition that whether all DMUs are efficient or not is an empirical question that can and should be statistically tested against the data, while allowing for a statistical error. To enable such testing, the SFA approach provides a framework where the production relationship is also estimated as a conditional average (of outputs given inputs and other factors, in the case of a production function), but the total deviation from the regression curve is decomposed into two terms: statistical noise and inefficiency. Both of these terms are unobserved by the researcher, but with relatively mild assumptions the different approaches within SFA allow the analyst to estimate them for the sample as a whole (e.g., representing an industry) or for each individual DMU.

1 Despite the variety of definitions, intuitively, production efficiency can be understood as a relative measure of productivity. In other words, production efficiency is a productivity measure that is normalized (e.g., to be between 0 and 1 to reflect percentages) relative to some benchmark, such as the corresponding frontier outcome, optimal with respect to some criterion: e.g., maximal output given a certain level of input and technology in the case of technical efficiency or minimal cost given a certain level of output and technology in the case of cost efficiency.
Importantly, the SFA approach also allows for the inefficiency term to be statistically insignificant, if the data suggest so, thus encompassing the classical approach with a naive assumption of full efficiency as a special case and, importantly, allowing for this assumption to be tested. Moreover, the SFA approach also contains the other extreme where one assumes no statistical noise, with all deviations from the frontier treated as inefficiency. Thus, the SFA approach is a natural compromise between approaches that make two extreme assumptions, yet it also encompasses them as special cases, which can still be followed if the data and the statistical tests from SFA do not recommend otherwise. If the tests support (or at least cannot reject) the full efficiency hypothesis, then one can proceed with the standard regression techniques, or even with Solow’s growth accounting; if not, then accounting for possible inefficiency could be critical for both quantitative and qualitative conclusions and, perhaps more importantly, for the resulting policy implications. Indeed, if statistical tests reject the hypothesis of full efficiency of DMUs, then it can be imperative to decompose the productivity (be it Solow’s residual or any other productivity measure) further, to estimate the inefficiency component for the sample (e.g., representing an industry) and for each individual DMU. Moreover, SFA also provides a framework to analyze the sources of production inefficiency and the variation of productivity levels, both of which can give important insights into how to reduce inefficiency and increase productivity. We discuss these interesting and important issues in this and the next chapter. Some of the stylized facts we present here can also be found in previous reviews [38, 59, 61, 70, 79], and it is impossible to give a good review without following them to some degree. Here we also summarize many of (what we believe to be) the key recent developments and, with their help, offer some novel perspectives on the workhorse methods. All in all, we believe that this chapter adds value for the reader by complementing what was done well in earlier reviews on this theme.
The rest of the chapter is structured as follows. Sections “The Benchmark SFM”, “Handling Endogeneity in the SFM”, and “Modeling Determinants of Inefficiency” focus on stochastic frontier models (SFM) for cross-sectional variation in efficiency (relative productivity): section “The Benchmark SFM” covers the foundation laid by [4] and some closely related research, section “Handling Endogeneity in the SFM” discusses endogeneity issues, and section “Modeling Determinants of Inefficiency” focuses on modeling the determinants of inefficiency. Section “Conclusions” concludes.

The Benchmark SFM

One of the main approaches to study productivity and efficiency of a cross section of firms is the SFM, independently proposed by [4] (ALS hereafter) and [72] (MvB hereafter).2 Using conventional notation, let $Y_i$ be the single output for observation (e.g., firm) $i$ and let $y_i = \ln(Y_i)$. The SFM can be written for a production frontier3 as

$$y_i = m(x_i; \beta) - u_i + v_i = m(x_i; \beta) + \varepsilon_i. \qquad (1)$$

Here $m(x_i; \beta)$ represents the production frontier of a firm (or more generally a DMU), with given input vector $x_i$. Our use of β is to clearly signify that we are parametrically specifying our production function.4 The main difference between a standard production function setup and the SFM is the presence of two distinct error terms in the model. The $u_i$ term captures inefficiency, the shortfall from maximal output dictated by the production technology, while the $v_i$ term captures stochastic shocks. The standard neoclassical production function model assumes full efficiency, so the SFM embraces it as a special case, when $u_i = 0$, $\forall i$, and allows the researcher to test this statistically.5
One shortcoming of the benchmark SFM is that the appearance of inefficiency in
(1) lacks any specific structural interpretation. Where is inefficiency coming from?
It could stem from inputs being used sub-optimally: workers may not put forth full
effort or capital may be improperly used, e.g., due to asymmetric information or
other reasons hidden to the researcher or even the firm. Without a specific structural
link, it is difficult to know just how to treat inefficiency in (1). Thus, to estimate
the model, several assumptions need to be imposed. First, it is commonly assumed
that inputs are independent of u and v, $u_i \perp x$ and $v_i \perp x$ $\forall x$.6 Second, u and
v are assumed to be independent of one another. Next, given that $u_i$ leads directly
to a shortfall in output, it must come from a one-sided distribution, implying that
$E[\varepsilon_i | x] = -E[u_i] \neq 0$. This has two effects if one were to estimate the SFM using OLS.
First, the intercept of technology would not be identified, and second, without

2 Battese and Corra [16] and Meeusen and van den Broeck [73], while appearing in the same year,
are applications of the methods.
3 Our discussion in both chapters will focus on a production frontier, as it is the most popular
object of study, while the framework for dual characterizations (e.g., cost, revenue, profit) or other
frontiers is similar and follows with only minor changes in notation.
4 See Chapter 9 “Stochastic Frontier Analysis: Foundations and Advances II” for a discussion on
relaxing parametric restrictions on the production frontier in the SFM.
5 Prior to the development of the SFM, approaches which intended to model inefficiency typically
ignored $v_i$, leading to estimators of the SFM with less desirable statistical properties: see the work
of [1, 3, 29, 84, 87, 99].
6 See section “Handling Endogeneity in the SFM” for a discussion on estimation of the SFM when
some inputs are allowed to be endogenous.

336 S. C. Kumbhakar et al.

any additional information, nothing can be said about inefficiency. Additionally,

if ui is an independently and identically distributed random variable, there is no
policy implication behind it given that nothing can directly increase or decrease
inefficiency. That is, the conclusions of such a study would be descriptive (reporting
presence or absence of inefficiency) rather than prescriptive or normative.7
Denoting $E[u]$ by $\mu_u$ and defining $\varepsilon_i^* = v_i - (u_i - \mu_u)$, the benchmark SFM can be rewritten
as

\[ y_i = m(x_i; \beta) - \mu_u - (u_i - \mu_u) + v_i \equiv m^*(x_i; \beta) + \varepsilon_i^* \tag{2} \]

and $E[\varepsilon_i^* | x] = 0$. The OLS estimator could be used to recover the mean-inefficiency-adjusted
technology $m^*(x_i; \beta) = m(x_i; \beta) - \mu_u$ in this case. However, rarely is the
sole focus of an analysis of productivity on the production technology. It is more
likely that both the production technology and information about inefficiency for
each DMU are the targets of interest; more structure is required on the SFM in this
case.
ALS’ and MvB’s approach to extract information on inefficiency, while also
estimating technology, was to impose distributional assumptions on ui and vi ,
recovering the implied distribution for εi and then estimating all of the parameters
of the SFM with the maximum likelihood estimator (MLE). Both sets of researchers
assumed that $v_i$ is normally distributed with mean 0 and variance $\sigma_v^2$,
while the distribution of $u_i$ differed across the papers: ALS assumed that $u_i$ was
generated from a half-normal distribution, $N_+(0, \sigma_u^2)$, whereas MvB assumed $u_i$
was distributed exponentially, with parameter $\sigma_u$.8
Even though the half-normal and exponential distributions are distinct, they
possess several common aspects. Both densities have modes at zero and monoton-
ically decay (albeit at different speeds) as ui increases. The zero mode property
is indicative of an industry where there is a tendency for higher efficiency for
the majority of the DMUs. Both densities would be classified as single-parameter
distributions, which means that the mean and variance both depend on the single
parameter, and these distributions also possess the scaling property, which we will
discuss in section “The Scaling Property.”

The Distribution of ε

Estimation of the SFM in (1) with maximum likelihood requires that the density of
ε, f (ε), is known. f (ε) can be determined through the distributional assumptions
invoked for v and u. Not all pairs of distributional assumptions for v and u will

7 See section “Modeling Determinants of Inefficiency” for models handling determinants of

inefficiency.
8 ALS also briefly discussed the exponential distribution, but its use and development is mainly

attributed to MvB.

lead to a tractable density of f (ε), permitting estimation via maximum likelihood.

Fortunately, the half-normal specification of [4] and the exponential specification of
MvB (along with the normal assumption for v), produce a density for ε that has a
closed form solution; direct application of maximum likelihood is straightforward in
this setting. For brevity we report the density of the composed error for the normal-
half-normal specification.

\[ f(\varepsilon) = \frac{2}{\sigma}\, \phi(\varepsilon/\sigma)\, \Phi(-\varepsilon\lambda/\sigma), \tag{3} \]

where $\phi(\cdot)$ is the standard normal probability density function (pdf), $\Phi(\cdot)$ is the
standard normal cumulative distribution function (cdf), with the parameterization
$\sigma = \sqrt{\sigma_u^2 + \sigma_v^2}$ and $\lambda = \sigma_u/\sigma_v$. $\lambda$ is commonly interpreted as the proportion of
variation in $\varepsilon$ due to inefficiency. The density of $\varepsilon$ in (3) can be characterized as that
of a skew normal random variable with location parameter 0, scale parameter $\sigma$, and
skew parameter $-\lambda$.9 This connection has only recently appeared in the efficiency
and productivity literature [27].
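From $f(\varepsilon)$ in (3), along with independence assumptions on $u_i$ and $v_i$, the log-likelihood
function is, up to an additive constant that does not depend on the parameters,

\[ \ln L = \sum_{i=1}^{n} \ln f(\varepsilon_i) = -n \ln \sigma + \sum_{i=1}^{n} \ln \Phi(-\varepsilon_i \lambda/\sigma) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} \varepsilon_i^2, \tag{4} \]

where $\varepsilon_i = y_i - m(x_i; \beta)$. The SFM can be estimated using the traditional maximum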
likelihood estimator (MLE). The benefit of this is that under the assumption of
correct distributional specification of ε, the MLE is asymptotically efficient (i.e.,
consistent, asymptotically normal and its asymptotic variance reaches the Cramer-
Rao lower bound). A further benefit is that a range of testing options are available.
For instance, tests related to β can easily be undertaken using any of the classic
trilogy of tests: Wald, Lagrange multiplier, or likelihood ratio. The ability to readily
and directly conduct asymptotic inference is one of the major benefits of stochastic
frontier analysis over DEA.10
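
As a minimal sketch of how the log-likelihood in (4) might be maximized in practice, the following assumes a linear-in-parameters frontier $m(x_i; \beta) = x_i'\beta$ with a column of ones in `X`. The function names `nhn_logpdf` and `fit_nhn`, the log-scale parameterization of the two standard deviations, and the choice of the BFGS optimizer are illustrative assumptions, not part of the original exposition. Unlike (4), the code keeps the additive constants so that the density in (3) integrates to one.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def nhn_logpdf(eps, sigma_u, sigma_v):
    """Log of density (3) for eps = v - u, v ~ N(0, sigma_v^2), u ~ N+(0, sigma_u^2)."""
    sigma = np.hypot(sigma_u, sigma_v)      # sqrt(sigma_u^2 + sigma_v^2)
    lam = sigma_u / sigma_v
    # norm.logcdf avoids underflow of Phi(.) for large positive residuals
    return (np.log(2.0) - np.log(sigma)
            + norm.logpdf(eps / sigma) + norm.logcdf(-eps * lam / sigma))


def fit_nhn(y, X):
    """MLE of (beta, sigma_u, sigma_v); X is assumed to contain a column of ones."""
    k = X.shape[1]
    b_ols = np.linalg.lstsq(X, y, rcond=None)[0]     # OLS starting values
    s0 = np.log(np.std(y - X @ b_ols))               # crude start for both scales

    def negloglik(p):
        # scales are optimized on the log scale so that they stay positive
        eps = y - X @ p[:k]
        return -np.sum(nhn_logpdf(eps, np.exp(p[k]), np.exp(p[k + 1])))

    res = minimize(negloglik, np.r_[b_ols, s0, s0], method="BFGS")
    return res.x[:k], np.exp(res.x[k]), np.exp(res.x[k + 1]), -res.fun
```

Standard errors for Wald-type inference could be obtained from the inverse Hessian of the negative log-likelihood, with a delta-method adjustment for the log-scale parameters; that step is omitted here.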

9 The pdf of a skew normal random variable x is $f(x) = 2\phi(x)\Phi(\alpha x)$. The distribution is right
skewed if $\alpha > 0$ and is left skewed if $\alpha < 0$. We can also place the normal, truncated-normal
pair of distributional assumptions in this class. The pdf of x with location $\xi$, scale $\omega$, and skew
parameter $\alpha$ is $f(x) = \frac{2}{\omega}\, \phi\left(\frac{x-\xi}{\omega}\right) \Phi\left(\alpha \frac{x-\xi}{\omega}\right)$. See [12, 76] for more details.
10 This in no way suggests that inference cannot be undertaken when the DEA estimator
is deployed; rather, the DEA estimator has an asymptotic distribution which is much more
complicated than that of the MLE for the SFM, and so direct asymptotic inference is not available;
bootstrapping techniques are required for many of the most popular DEA estimators [93, 94].

Alternative Specifications

The half-normal assumption for the one-sided inefficiency term is almost without
question the most common distribution for inefficiency in practice. This stems partly
from historical precedent, partly from the closed form solution of the likelihood function, and
partly from the availability of software to estimate the model for applied researchers.
However, none of these reasons is sufficient for blind application of the half-normal
density for inefficiency in the SFM.

The Exponential Distribution

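The exponential assumption on inefficiency is also popular. The exponential density
is

\[ f(u) = \frac{1}{\sigma_u}\, e^{-u/\sigma_u}, \quad u \ge 0. \tag{5} \]

For the normal-exponential distributional pair, the density of $\varepsilon$ is

\[ f(\varepsilon) = \frac{1}{\sigma_u}\, \Phi(-\varepsilon/\sigma_v - \sigma_v/\sigma_u)\, e^{\varepsilon/\sigma_u + \sigma_v^2/(2\sigma_u^2)}, \tag{6} \]

with log-likelihood function

\[ \ln L = -n \ln \sigma_u + n \frac{\sigma_v^2}{2\sigma_u^2} + \sum_{i=1}^{n} \ln \Phi(-\varepsilon_i/\sigma_v - \sigma_v/\sigma_u) + \frac{1}{\sigma_u} \sum_{i=1}^{n} \varepsilon_i. \tag{7} \]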

Like the half-normal specification for u, the exponential density decreases monotonically
in u, suggesting that larger levels of inefficiency are less likely
to occur than small levels of inefficiency. Both the half-normal and exponential
specifications for inefficiency stem from what are known as single-parameter
distributions. Single-parameter distributions are the simplest distributions, and an
unfortunate (yet sometimes very convenient) property of them is that all of their
moments depend on this single parameter, which can restrict the shape that the
density can potentially take.11
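
The normal-exponential density in (6) and log-likelihood in (7) can be sketched in code as follows. The function names are hypothetical, and the log-domain evaluation of $\Phi(\cdot)$ is an implementation choice made here for numerical stability rather than anything prescribed by the text.

```python
import numpy as np
from scipy.stats import norm


def nexp_logpdf(eps, sigma_u, sigma_v):
    """Log of density (6): eps = v - u, v ~ N(0, sigma_v^2), u exponential with mean sigma_u."""
    return (-np.log(sigma_u)
            + eps / sigma_u + sigma_v**2 / (2.0 * sigma_u**2)
            + norm.logcdf(-eps / sigma_v - sigma_v / sigma_u))


def nexp_loglik(eps, sigma_u, sigma_v):
    """Log-likelihood (7), obtained by summing the log density over observations."""
    return np.sum(nexp_logpdf(np.asarray(eps, dtype=float), sigma_u, sigma_v))
```

A quick sanity check on any such implementation is that the density integrates to one and that its mean equals $-\sigma_u$, since $E[\varepsilon] = E[v] - E[u]$.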

The Truncated Normal Distribution

To allow more generality into the SFM, while guarding against distribution mis-
specification, a variety of one-sided distributions have been proposed for modeling
ui in the SFM. Stevenson [96] proposed the truncated-normal distribution as a
generalization of the half-normal distribution; whereas the half-normal distribution
is the truncation of the N(0, σu2 ) at 0, the truncated-normal distribution is the

11 See [79] for a more detailed analysis of the SFM with u distributed exponentially.

truncation of the $N(\mu, \sigma_u^2)$ at 0. The pre-truncation mean parameter, $\mu$, affords the
SFM more flexibility in the shape of the distribution of inefficiency.
The truncated-normal density is

\[ f(u) = \frac{1}{\sqrt{2\pi}\, \sigma_u\, \Phi(\mu/\sigma_u)}\, e^{-\frac{(u-\mu)^2}{2\sigma_u^2}}, \quad u \ge 0. \tag{8} \]

This density reduces to the half-normal distribution when $\mu = 0$ and thus
provides a generalization (more specifically a nesting structure) and an opportunity
for inference on $\mu$. An intuitive appeal of deploying the truncated-normal distribution
in practice is that, unlike the half-normal and exponential densities, the truncated-
normal density has a mode at 0 only when $\mu \le 0$ but otherwise has a mode at $\mu$.
When $\mu > 0$, the implication is that producers in a given market would tend to have
inefficiency $u_i$ near $\mu > 0$ rather than near 0. This connotation may be more realistic
in some settings (e.g., the regulatory environment) than the half-normal assumption,
where the probability of being nearly efficient is much larger than that of being grossly
inefficient.
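For the normal-truncated-normal distributional pair, the density of $\varepsilon$ is

\[ f(\varepsilon) = \frac{1}{\sigma}\, \phi\left(\frac{\varepsilon + \mu}{\sigma}\right) \Phi\left(\frac{\mu}{\sigma\lambda} - \frac{\varepsilon\lambda}{\sigma}\right) \Big/ \Phi(\mu/\sigma_u). \tag{9} \]

The corresponding log-likelihood function is

\[ \ln L = -n \ln \sigma - 0.5 \sum_{i=1}^{n} \left(\frac{\varepsilon_i + \mu}{\sigma}\right)^2 - n \ln \Phi(\mu/\sigma_u) + \sum_{i=1}^{n} \ln \Phi\left(\frac{\mu}{\sigma\lambda} - \frac{\varepsilon_i \lambda}{\sigma}\right). \tag{10} \]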
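
To illustrate the nesting structure, here is a small sketch of the normal-truncated-normal composed-error density in (9). The function name `ntn_logpdf` is an assumption made for illustration; the normalizing constant of $\phi(\cdot)$ is retained so that the result is a proper density. Setting `mu = 0` should reproduce the normal-half-normal density in (3), because $\Phi(0) = 1/2$.

```python
import numpy as np
from scipy.stats import norm


def ntn_logpdf(eps, mu, sigma_u, sigma_v):
    """Log of density (9): v ~ N(0, sigma_v^2), u ~ N(mu, sigma_u^2) truncated at 0."""
    sigma = np.hypot(sigma_u, sigma_v)
    lam = sigma_u / sigma_v
    return (-np.log(sigma)
            + norm.logpdf((eps + mu) / sigma)
            + norm.logcdf(mu / (sigma * lam) - eps * lam / sigma)
            - norm.logcdf(mu / sigma_u))       # truncation normalizer Phi(mu/sigma_u)
```

Because the half-normal model is the special case $\mu = 0$, a likelihood ratio or Wald test on $\mu$ can be built directly on this log density.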

Other Distributions
Aside from the truncated-normal specification for the distribution of u, a variety
of alternatives have been proposed throughout the literature. Greene [34, 35] and
Stevenson [96] both proposed a gamma distribution for inefficiency. The gamma
distribution generalizes the exponential distribution in much the same way that
the truncated-normal distribution nests the half-normal distribution. Ritter and
Simar [85] advocate against the use of the gamma specification in practice noting
that large samples were required to reliably estimate the parameters of the gamma
distribution due to computational identification problems with the constant of the
regression. Lee [66] proposed a four parameter Pearson density for the specification
of inefficiency; unfortunately, this distribution is intractable for applied work and
until now has not appeared to gain popularity. Li [68] proposed the use of the
uniform distribution for inefficiency noting an intriguing feature of the subsequent

composed error density: that it could be positively skewed.12 Another specification

for inefficiency appears in [22], who assumes that the distribution of u follows a
binomial specification; this allows the skewness of the composed error to be positive
or negative. Gagnepain and Ivaldi [32] specify inefficiency as being beta distributed
when inefficiency can be defined as a percentage (scaled between 0 and 1), while
[6] further generalize Stevenson’s [96] framework by assuming a doubly truncated-
normal distribution for inefficiency. This distributional assumption also allows the
convolved error term to be either positively or negatively skewed.
A common theme of all of the papers just mentioned is that they focus exclusively
on the distribution of inefficiency inside the SFM. Recent literature has shed light
on the features of f (ε) for the SFM when both the density of v and the density
of u are changed. Horrace and Parmeter [46] study the behavior of the composed
error when v is distributed as Laplace and u is distributed as truncated Laplace.
Nguyen [75] considers the Laplace-exponential distributional pair as well as the
Cauchy-Half Cauchy pair for the two error terms of the composed error. While
these alternative distributional pairs do provide different insights into the behavior
of the composed error, it remains to be seen if they will be regularly adopted in
practice and whether they provide substantially different conclusions than the most
frequently adopted distributional pairs (e.g., normal-half-normal); see section “Do
Distributional Assumptions Even Matter?” for more discussion on the perceived
importance of distributional assumptions regarding estimation of the SFM.
It is important to note that the main idea behind the SFM is that nearly any pair
of distributions can be used to model u and v. The advantage of the normal-half-
normal pair that is dominant in the literature is that the likelihood function has an
easy to evaluate expression. In general this should not be expected. More likely than
not, for a range of distributional assumptions, the likelihood function will contain
one or more intractable integrals, complicating estimation.13

Alternative Estimation Approaches of the SFM

Given the focus on inefficiency in the SFM and the impact that the distributional
assumption on u is likely to have on the MLE, studying the behavior of the SFM
across a range of distributional assumptions is desirable. However, outside of a few
specifications (half-normal, exponential, truncated-normal for ui , and normal for
vi ), the composed error density will not have a likelihood function that lends itself
for easy evaluation. In these cases it can be difficult to estimate all of the parameters
of the SFM, but several approaches exist, ranging in complexity, to estimate the
SFM when direct estimation of the likelihood function is not feasible. The simplest

12 Prior to [68], all of the previously proposed distributions produced a composed error
density that was theoretically negatively skewed. Note that if u is distributed uniformly over the
interval [0, b], every level of inefficiency between 0 and b is equally likely.
13 Note that the likelihood function for the normal-half-normal pair depends on the cdf of the
normal distribution, $\Phi(\cdot)$, which contains an integral, but this can be quickly and easily evaluated
across all modern software platforms.

approach, dubbed corrected OLS (COLS) by [77],14 recognizes that OLS estimation
of the SFM produces consistent estimates of the coefficients of the frontier function
aside from the intercept. The intercept is biased downward by the expected level
of industry inefficiency, which is $E[u] = \sqrt{2/\pi}\, \sigma_u$ in the half-normal case.15 Olson et al.’s [77] insight was that
for a given pair of distributional assumptions (normal-exponential, say), the central
moments of the OLS residuals could be used to construct consistent estimators of the
parameters of the convolved error. Once these were estimated, expected inefficiency
could be estimated and the bias in the intercept corrected. The beauty of COLS from
the applied perspective is that OLS can be used and difficult likelihood functions do
not have to be derived or estimated.16
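
A minimal sketch of the COLS idea under the normal-half-normal pair is given below. It relies on the standard moment relations for this pair: the variance of $\varepsilon$ is $\sigma_v^2 + (1 - 2/\pi)\sigma_u^2$ and its third central moment is $\sqrt{2/\pi}\,(1 - 4/\pi)\,\sigma_u^3$, which is negative. The function name `cols_half_normal`, the linear frontier, and the convention that the first column of `X` is the intercept are assumptions made for illustration.

```python
import numpy as np


def cols_half_normal(y, X):
    """Corrected OLS for the normal-half-normal SFM; X[:, 0] must be a column of ones."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b                        # OLS residuals have mean zero by construction
    m2 = np.mean(e**2)                   # second central moment
    m3 = np.mean(e**3)                   # third central moment
    if m3 >= 0:
        # residuals skewed the "wrong" way: the moment estimator is undefined
        raise ValueError("OLS residuals are not negatively skewed")
    c = np.sqrt(2.0 / np.pi) * (1.0 - 4.0 / np.pi)       # negative constant
    sigma_u = (m3 / c) ** (1.0 / 3.0)
    sigma_v2 = m2 - (1.0 - 2.0 / np.pi) * sigma_u**2
    b = b.copy()
    b[0] += np.sqrt(2.0 / np.pi) * sigma_u               # undo the downward intercept bias
    return b, sigma_u, np.sqrt(max(sigma_v2, 0.0))
```

The explicit error for positively skewed residuals reflects a known practical limitation of moment-based estimators: the third moment is estimated noisily in small samples and can take the wrong sign.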
Several newer approaches exist as well. One that is becoming popular is
maximum simulated likelihood (MSL) estimation [71]. Greene [37] used MSL
estimation to estimate the parameters of the SFM for the normal-gamma convo-
lution. The key to implementation of the SFM when the composed error does not
produce a tractable likelihood is to notice that the integrals that commonly remain in
the density (from integrating u out of the density) can be treated as expectations and
evaluated by simulation rather than analytically. Given that the distribution
of u is assumed known (up to unknown parameters), for a given set of parameters,
draws can be taken, and the subsequent expectation that is evaluated can replace the
integral. Optimization proceeds by searching over the parameter space until a global
maximum is found.
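
The simulation step can be sketched as follows. The normal-half-normal pair is used here only because its closed form in (3) allows the simulated density to be checked; in applications the same construction would be applied to a pair with no tractable density. The function name `simulated_logpdf` and the use of plain pseudo-random draws are assumptions for illustration.

```python
import numpy as np
from scipy.stats import norm


def simulated_logpdf(eps, sigma_u, sigma_v, draws):
    """Simulated log density of eps = v - u with u ~ N+(0, sigma_u^2).

    f(eps) = E_u[ f_v(eps + u) ], so the integral over u is replaced by an
    average over draws of u. `draws` holds standard normal variates that are
    kept fixed across parameter values so the simulated objective is smooth.
    """
    u = sigma_u * np.abs(draws)                                   # half-normal draws
    dens_v = norm.pdf((eps[:, None] + u[None, :]) / sigma_v) / sigma_v
    return np.log(dens_v.mean(axis=1))
```

Summing `simulated_logpdf` over observations gives a simulated log-likelihood that can be passed to a numerical optimizer in the same way as an exact one.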
An even more recent approach to evaluating intractable likelihoods is found
in [102] who suggested estimation of the parameters of the SFM through the
characteristic function of the composed error. The reason that this will work is that
the characteristic function is a unique representation of a distribution (whether the

14 See also [34, pp. 31–32]. Richmond [84] also proposed adjusting the intercept from OLS
estimation, however, his model differs from that of [77] by assuming the presence of inefficiency
(which follows a gamma distribution) but no noise.
15 An alternative approach would be to estimate a weighted average efficiency of an industry, as
described theoretically in Chapter 25 “Aggregation of Efficiency and Productivity: From Firm
to Sector and Higher Levels”.
16 There exists some confusion over the terminology COLS as it relates to another method, modified
OLS (MOLS). Beginning with [111] and discussed in [31] and [34, pp. 32–34], MOLS shifts the
estimated OLS production function until all of the observations lie on or below the “frontier.” At
issue is the appropriate name of these two techniques. Greene [38] called the bounding approach
COLS, crediting [70, p. 21] with the initial nomenclature, and referred to MOLS as the method in
which one bias corrects the intercept based on a specific set of distributional assumptions. Further,
[59, pp. 70–71] also adopted this terminology. However, given that [77, p. 69] explicitly used
the terminology COLS, in our review we will adopt COLS to imply bias correction of the OLS
intercept and MOLS as a procedure that shifts up (or down) the intercept to bound all of the
data. The truth is both COLS and MOLS are the same in the sense that the OLS intercept is
augmented, it is just in how each method corrects, or modifies, the intercept that is important. While
we are departing from the more mainstream use of COLS and MOLS currently deployed, given
the original use of COLS, coupled with myriad papers written by Peter Schmidt and coauthors that
we discuss here, we will use the COLS acronym to imply a bias corrected intercept.

density does or does not exist), and following from the convolution theorem, the
characteristic function of the sum of two independent random variables (here v and −u)
is the product of the individual characteristic functions. The characteristic
functions for all of the densities described above are known, and so, using
the Fast Fourier Transform, the estimated characteristic function can be mapped
to the underlying density and, subsequently, the likelihood function. Tsionas’s
[102] method is somewhat computationally complicated, but it offers another
avenue to estimate the SFM under alternative distributional assumptions on both
v and u.
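
The product rule for characteristic functions can be illustrated with the normal-exponential pair, for which the inverted density can be compared with the closed form in (6). This sketch uses a direct numerical inversion on a fixed frequency grid rather than the Fast Fourier Transform, so it is a simplified stand-in and not the procedure of [102]; the function names, the grid limit `t_max`, and the grid size `n` are assumptions.

```python
import numpy as np
from scipy.stats import norm


def cf_eps(t, sigma_u, sigma_v):
    """Characteristic function of eps = v - u.

    v ~ N(0, sigma_v^2) has cf exp(-sigma_v^2 t^2 / 2); u exponential with
    mean sigma_u has cf 1 / (1 - i sigma_u t), so -u has cf 1 / (1 + i sigma_u t).
    """
    return np.exp(-0.5 * (sigma_v * t) ** 2) / (1.0 + 1j * sigma_u * t)


def density_from_cf(eps, sigma_u, sigma_v, t_max=60.0, n=20001):
    """Invert the cf numerically: f(eps) = (1 / 2 pi) * integral of exp(-i t eps) cf(t) dt."""
    t = np.linspace(-t_max, t_max, n)
    dt = t[1] - t[0]
    integrand = np.exp(-1j * np.outer(eps, t)) * cf_eps(t, sigma_u, sigma_v)
    return (integrand.sum(axis=1) * dt).real / (2.0 * np.pi)
```

The Gaussian factor makes the integrand decay quickly, so truncating the frequency range at `t_max` is harmless for moderate values of `sigma_v`; very small noise variances would require a wider grid.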

Estimation of Individual Inefficiency

Once the parameters of the SFM have been estimated, estimates of firm-level
productivity and efficiency can be recovered. Observation-specific estimates of
inefficiency are one of the main benefits of the SFM relative to neoclassical models
of production. Firms can be ranked according to estimated efficiency; the identity
of underperforming firms as well as those who are deemed best practice can also
be gleaned from the SFM. All of this information is useful in helping to design
more efficient public policy or subsidy programs aimed at improving the market,
for example, insulating consumers from the poor performance of heavily inefficient
firms.
As a concrete illustration, consider firms operating electricity distribution net-
works that typically possess a natural local monopoly given that the construction
of competing networks over the same terrain is prohibitively expensive.17 It is not
uncommon for national governments to establish regulatory agencies which monitor
the provision of electricity to ensure that abuse of the inherent monopoly power is
not occurring. Regulators face the task of determining an acceptable price for the
provision of electricity while having to balance the heterogeneity that exists across
the firms (in terms of size of the firm and length of the network). Firms which
are inefficient may charge too high a price to recoup a profit but at the expense of
operating below capacity. However, given production and distribution shocks, not
all departures from the frontier represent inefficiency. Thus, measures designed to
account for noise are required to parse information from εi regarding ui .
Alternatively, further investigation could reveal what it is that makes these estab-
lishments attain such high levels of performance. This could then be used to identify
appropriate government policy implications and responses or identify processes
and/or management practices that should be spread (or encouraged) across the less
efficient, but otherwise similar, units. This is the essence of the determinants of

17 The current literature is fairly rich on various examples of empirical values of SFA for the
estimation and use of efficiency estimates in different fields of research. For example, in the context
of electricity providers, see [42,54,62]; for banking efficiency, see [23] and references cited therein;
for the analysis of the efficiency of national healthcare systems, see [33] and a review by [45]; for
analyzing efficiency in agriculture, see [14, 15, 21, 69], to mention just a few.

inefficiency approach which we will discuss in section “Modeling Determinants of

Inefficiency.” More directly, efficiency rankings are used in regulated industries such
that regulators can set tougher future cost reduction targets for the more inefficient
companies, in order to ensure that customers do not pay for the inefficiency of firms.
The only direct estimate coming from the normal-half-normal SFM is σu2 . This
provides context regarding the shape of the half-normal distribution on ui and
the industry average efficiency E[u], but not on the absolute level of inefficiency
for a given firm. If we are only concerned with the average level of technical
efficiency for the population, then this is all the information that is needed. Yet,
if we want to know about a specific firm, then something else is required. The main
approach to estimating firm-level inefficiency is the conditional mean estimator of
[50], commonly known as the JLMS estimator. Their idea was to calculate the
expected value of ui conditional on the realization of composed error of the model,
εi ≡ vi − ui , i.e., E[ui |εi ].18 This conditional mean of ui given εi gives a point
prediction of ui . The composed error contains individual-specific information, and
the conditional expectation is one measure of firm-specific inefficiency.
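Jondrow et al. [50] show that for the normal-half-normal specification of the
SFM, the conditional density function of $u_i$ given $\varepsilon_i$, $f(u_i | \varepsilon_i)$, is $N_+(\mu_{*i}, \sigma_*^2)$,
where

\[ \mu_{*i} = \frac{-\varepsilon_i \sigma_u^2}{\sigma^2} \tag{11} \]

and

\[ \sigma_*^2 = \frac{\sigma_v^2 \sigma_u^2}{\sigma^2}. \tag{12} \]

Given results on the mean of a truncated-normal density it follows that

\[ E[u_i | \varepsilon_i] = \mu_{*i} + \frac{\sigma_*\, \phi(\mu_{*i}/\sigma_*)}{\Phi(\mu_{*i}/\sigma_*)}. \tag{13} \]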

The individual estimates are then obtained by replacing the true parameters in
(13) with MLE estimates from the SFM.
Another measure of interest is the Afriat-type level of technical efficiency,
defined as $e^{-u_i} = Y_i / (e^{m(x_i)} e^{v_i}) \in [0, 1]$. This is useful in cases where output is
measured in logarithmic form. Further, technical efficiency is bounded between 0
and 1, making it somewhat easier to interpret relative to a raw inefficiency score.
Since $e^{-u_i}$ is not directly observable, the idea of [50] can be deployed here, and
$E[e^{-u_i} | \varepsilon_i]$ can be calculated [13, 67]. For the normal-half-normal model, we have

18 Jondrow et al. [50] also suggested an alternative estimator based on the conditional mode.

\[ E[e^{-u_i} | \varepsilon_i] = e^{-\mu_{*i} + \frac{1}{2}\sigma_*^2}\, \frac{\Phi(\mu_{*i}/\sigma_* - \sigma_*)}{\Phi(\mu_{*i}/\sigma_*)}, \tag{14} \]

where $\mu_{*i}$ and $\sigma_*$ were defined in (11) and (12), respectively. Technical efficiency
estimates are obtained by replacing the true parameters in (14) with MLE estimates
from the SFM. When ranking efficiency scores, one should use estimates of $1 - E[u_i | \varepsilon_i]$,
which is the first-order approximation of (14). Similar expressions for
the [50] and [13] efficiency scores can be derived under the assumption that u is
exponential [59, p. 82], truncated-normal [59, p. 86], and Gamma [59, p. 89]; see
also [61].19
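
The conditional-mean formulas (13) and (14) translate almost line by line into code. The sketch below assumes the normal-half-normal model and takes the residuals and the two scale parameters as given (in practice their MLE estimates); the function names are hypothetical, and the ratios of normal pdf and cdf values are evaluated on the log scale as a numerical precaution.

```python
import numpy as np
from scipy.stats import norm


def conditional_moments(eps, sigma_u, sigma_v):
    """mu_* and sigma_* from (11) and (12)."""
    s2 = sigma_u**2 + sigma_v**2
    mu_star = -eps * sigma_u**2 / s2
    sig_star = sigma_u * sigma_v / np.sqrt(s2)
    return mu_star, sig_star


def jlms(eps, sigma_u, sigma_v):
    """E[u | eps] as in (13)."""
    mu_star, sig_star = conditional_moments(eps, sigma_u, sigma_v)
    z = mu_star / sig_star
    return mu_star + sig_star * np.exp(norm.logpdf(z) - norm.logcdf(z))


def battese_coelli(eps, sigma_u, sigma_v):
    """E[exp(-u) | eps] as in (14)."""
    mu_star, sig_star = conditional_moments(eps, sigma_u, sigma_v)
    z = mu_star / sig_star
    return np.exp(-mu_star + 0.5 * sig_star**2
                  + norm.logcdf(z - sig_star) - norm.logcdf(z))
```

By Jensen's inequality, `battese_coelli` is never smaller than `exp(-jlms)` for the same residual, which gives a simple consistency check on an implementation.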

Inference About the Presence of Inefficiency

Having estimated the benchmark SFM, a natural hypothesis is whether inefficiency
is even present. In this case the null hypothesis of interest is $H_0: \sigma_u^2 = 0$ against
$H_1: \sigma_u^2 > 0$.20 The direct way to test $H_0$ is through a likelihood ratio test,
keeping in mind that the unrestricted model is the assumed SFM and the restricted
model is the linear regression model (or more specifically the normal regression
model). There is a problem with implementation of this test, however. Under $H_0$, $\sigma_u^2$
is restricted to lie on the boundary of the parameter space, and this precludes direct
use of a likelihood ratio test.
[28] demonstrates that under $H_0$, the likelihood ratio statistic
in this setting is a 50:50 mixture of a $\chi^2_1$ distribution, the distribution of the
ordinary likelihood ratio statistic if the parameter was not on the boundary of the
parameter space, and a $\chi^2_0$ distribution (a point mass at zero); the mixture is known as
the chi-bar-square distribution, $\bar{\chi}^2$ [28, 89]. The $\chi^2_0$ component captures the
possibility that $\sigma_u^2$ lies on the boundary of the parameter space and creates a point
mass in the asymptotic distribution of the likelihood ratio statistic.
Calculation of the test statistic itself is invariant to whether the parameter lies
on the boundary under $H_0$. What does change is how one goes about calculating
either the p-value or the critical value to assess the outcome of the test. In the case
of the 50:50 mixture, the critical values are determined by looking at the $2\alpha$-level
critical value from a $\chi^2_1$ distribution. For example, whereas the critical value for a 5%
significance level is 3.841 for $\chi^2_1$, it is 2.706 for the 50:50 mixture. More specifically,
Table 1 presents the critical values of both the $\chi^2_1$ and the 50:50 mixture for a range
of significance levels.
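
The $2\alpha$ rule is easy to apply with any routine for $\chi^2$ quantiles. The sketch below uses `scipy.stats.chi2`; the function names are illustrative assumptions, and the p-value formula follows from the fact that the point mass at zero contributes nothing to the right tail for a strictly positive statistic.

```python
from scipy.stats import chi2


def mixture_critical_value(alpha):
    """Right-tail critical value of the 50:50 mixture of chi2(0) and chi2(1).

    Valid for alpha < 0.5: it equals the 2*alpha-level critical value of chi2(1).
    """
    return chi2.ppf(1.0 - 2.0 * alpha, df=1)


def mixture_p_value(lr_stat):
    """p-value of a likelihood ratio statistic under the 50:50 mixture."""
    if lr_stat <= 0:
        return 1.0                      # all of the chi2(0) mass sits at zero
    return 0.5 * chi2.sf(lr_stat, df=1)
```

For instance, `mixture_critical_value(0.05)` rounds to 2.706 and `mixture_critical_value(0.01)` rounds to 5.412, so using the ordinary $\chi^2_1$ critical values instead would make the test conservative.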

19 In principle, these individual efficiency scores can then be used for estimating weighted average
efficiencies of an industry or a group within it, as described theoretically in Chapter 25
“Aggregation of Efficiency and Productivity: From Firm to Sector and Higher Levels”, which
seems novel for SFA context.
20 One could test if other moments of the distribution were 0 as well, but most of the SFMs

parameterize the distribution of u with σu and so this seems the most natural.

Table 1 Right tail critical values for both a $\chi^2_1$ and a 50:50 mixture of a $\chi^2_0$ and a $\chi^2_1$, denoted
as $\bar{\chi}^2$

Significance level    0.01     0.05     0.1      0.15     0.2      0.25
$\chi^2_1$            6.634    3.841    2.706    2.072    1.642    1.323
$\bar{\chi}^2$        5.412    2.706    1.642    1.074    0.708    0.455

An alternative type of test for the presence of inefficiency is based on the

skewness of the residuals. A variety of tests for skewness exist, notably [2, 43, 63].
Henderson and Parmeter [43] proposed a bootstrap based version of Ahmad &
Li’s [2] asymptotic test, noting that in finite samples the bootstrap version is
likely to have superior performance. This test involves estimating the SFM using
OLS and then testing whether the distribution of the OLS residuals is symmetric.
Kuosmanen and Fosgerau’s [63] test of symmetry is also based on the bootstrap,
but rather than focus on the estimated distribution of the OLS residuals, their test
focuses exclusively on the skewness coefficient of the residuals. Both of these tests
of symmetry are appealing because they do not require parametric distributional
assumptions and can be implemented after having estimated the SFM using OLS.

Inference About the Distribution of Inefficiency

It is important to recognize, despite the frequent misuse of terminology, that the
JLMS or Battese-Coelli (or similar types) efficiency estimators are not estimators of
ui or e−ui , respectively, and do not converge to them for n → ∞. As n → ∞, the
new observations represent different firms each with their own level of inefficiency
and noise (upon which JLMS conditions), rather than observations from the same
firm. Even more importantly, the JLMS estimator was not intended to estimate
unconditional inefficiency. The JLMS estimator is, however, a consistent estimator
for the expected level of inefficiency conditional on the particular realizations of ε.21
The JLMS efficiency scores can be used to provide a (limited) test of the
distribution of inefficiency. The key insight to understand how a test can be
constructed is that if the distributional assumptions are correct, then the distribution
of $E[u_i | \varepsilon_i]$ is completely known. Hence a comparison of the empirical distribution
of the estimated $E[u_i | \varepsilon_i]$ to its theoretical distribution will shed light on the statistical validity of
the assumed distributions for u and v. Wang and Schmidt [108] derived the
distribution of $E[u_i | \varepsilon_i]$ for the normal-half-normal SFM, while [107] proposed $\chi^2$
and Kolmogorov-Smirnov-type test statistics against this distribution.22
We caution readers regarding a rejection with use of this test. A rejection does
not necessarily imply that the distributional assumption on u is incorrect, it could
be that the normality distributional assumption on v or some other assumptions
about the SFM (e.g., the parametric form of m) is violated, and this is leading to

21 The JLMS efficiency estimator is known as a shrinkage estimator; on average, it understates the
efficiency level of a firm with small ui while it overstates efficiency for a firm with large ui .
22 See also [66] for a different test based on the Pearson distributional assumption for u.

the rejection. Similarly, one must be careful in interpreting tests on the distribution
of ε (or functionals of ε) when the distribution of v is also assumed to be normal.
Alternative tests similar to [107] could be formulated using the Laplace-exponential
SFM of [46].

Predicting Inefficiency
Aside from testing for the appropriate distribution of inefficiency, one should also
test, or present uncertainty, as it pertains to an individual efficiency score. Each
JLMS efficiency score is a prediction of inefficiency, and it is possible to calculate
prediction intervals. Interestingly, few applied papers cover in depth uncertainty of
estimated efficiency scores.
A prediction interval for E[ui |εi ] was first derived by [98] and also appeared
in [18, 44, 47] (see the discussion of this in [92]). The prediction interval is based
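on $f(u_i | \varepsilon_i)$. The lower ($L_i$) and upper ($U_i$) bounds for a $(1 - \alpha)100\%$ prediction
interval are

\[ L_i = \mu_{*i} + \Phi^{-1}\left(1 - \left(1 - \frac{\alpha}{2}\right)\left[1 - \Phi\left(-\frac{\mu_{*i}}{\sigma_*}\right)\right]\right) \sigma_*, \tag{15} \]

\[ U_i = \mu_{*i} + \Phi^{-1}\left(1 - \frac{\alpha}{2}\left[1 - \Phi\left(-\frac{\mu_{*i}}{\sigma_*}\right)\right]\right) \sigma_*, \tag{16} \]

where $\mu_{*i}$ and $\sigma_*$ are defined in (11) and (12), respectively, and replacing them with
their MLE estimates will give estimated prediction intervals for $E[u_i | \varepsilon_i]$.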
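
A minimal sketch of the equal-tailed bounds in (15) and (16) is given below, again for the normal-half-normal model with the parameters treated as known. The function name `prediction_interval` is an assumption, and the code does not implement the minimum-width variant.

```python
import numpy as np
from scipy.stats import norm


def prediction_interval(eps, sigma_u, sigma_v, alpha=0.05):
    """Equal-tailed (1 - alpha) prediction bounds for u given eps, as in (15)-(16)."""
    s2 = sigma_u**2 + sigma_v**2
    mu_star = -eps * sigma_u**2 / s2                  # eq. (11)
    sig_star = sigma_u * sigma_v / np.sqrt(s2)        # square root of eq. (12)
    p0 = norm.cdf(mu_star / sig_star)                 # equals 1 - Phi(-mu_*/sigma_*)
    lower = mu_star + sig_star * norm.ppf(1.0 - (1.0 - alpha / 2.0) * p0)
    upper = mu_star + sig_star * norm.ppf(1.0 - (alpha / 2.0) * p0)
    return lower, upper
```

When $\mu_{*i}/\sigma_*$ is strongly negative, `p0` is tiny and the arguments of `norm.ppf` approach one, so a production implementation would likely need a more careful tail computation than this sketch provides.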
Wheat et al. [109] derived minimum width prediction intervals noting that the
confidence interval studied in [47] was based on a symmetric two-sided interval.
Given that the distribution of ui conditional on εi is truncated (at 0) normal
and asymmetric, this form of interval is not of minimum width. Parmeter and
Kumbhakar [79] showed that depending upon the ratio of σu to σv , the difference in
relative widths of Horrace & Schmidt’s [47] and Wheat et al.’s [109] prediction
intervals can be quite substantial. It is thus recommended to use the intervals
provided by [109] as these are not based on symmetry. Note that although we could
predict u and construct a prediction interval, this information is not that useful for
policy purposes unless there are some variables that affect inefficiency and such
variables can be changed by a specific policy.
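
One way to see why an equal-tailed interval is not of minimum width is to compute the highest-density interval of the truncated normal directly. The sketch below is an illustrative construction based on that idea and is not claimed to reproduce the exact procedure of [109]; the function names and input values are assumptions. Because the density of N+(μ∗, σ∗²) is unimodal, the shortest interval is either symmetric about μ∗ (when that interval stays above zero) or starts at zero.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def truncated_cdf(u, mu_star, sigma_star):
    """CDF of N+(mu_star, sigma_star^2) at u >= 0."""
    base = STD_NORMAL.cdf(-mu_star / sigma_star)
    return (STD_NORMAL.cdf((u - mu_star) / sigma_star) - base) / (1.0 - base)


def equal_tailed_interval(mu_star, sigma_star, alpha=0.05):
    """Bounds with alpha/2 probability in each tail."""
    mass = STD_NORMAL.cdf(mu_star / sigma_star)  # P(untruncated normal > 0)
    lower = mu_star + sigma_star * STD_NORMAL.inv_cdf(1.0 - (1.0 - alpha / 2.0) * mass)
    upper = mu_star + sigma_star * STD_NORMAL.inv_cdf(1.0 - (alpha / 2.0) * mass)
    return lower, upper


def shortest_interval(mu_star, sigma_star, alpha=0.05):
    """Highest-density (1 - alpha) interval of N+(mu_star, sigma_star^2)."""
    mass = STD_NORMAL.cdf(mu_star / sigma_star)
    # Candidate: symmetric about the mode mu_star with truncated mass 1 - alpha.
    half_width = sigma_star * STD_NORMAL.inv_cdf(0.5 + 0.5 * (1.0 - alpha) * mass)
    if mu_star - half_width >= 0.0:
        return mu_star - half_width, mu_star + half_width
    # Otherwise the highest-density region is pinned at the truncation point.
    upper = mu_star + sigma_star * STD_NORMAL.inv_cdf(1.0 - alpha * mass)
    return 0.0, upper


# Hypothetical (mu_star, sigma_star) pairs covering both branches.
for mu_star, sigma_star in [(-0.5, 0.4), (0.2, 0.4), (1.5, 0.4)]:
    et = equal_tailed_interval(mu_star, sigma_star)
    sh = shortest_interval(mu_star, sigma_star)
    print(mu_star, "equal-tailed width:", et[1] - et[0],
          "shortest width:", sh[1] - sh[0])
```

Both intervals have the same coverage by construction; only their widths differ, and the difference is largest when much of the conditional mass sits near zero.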

Do Distributional Assumptions Even Matter?

An important empirical concern when using the SFM is the choice of distributional
assumptions made for v and u. The distribution of v has almost universally been
accepted as being normal in both applied and theoretical work (a recent exception is
[46]); the distribution of u is more commonly debated, but relatively little work has
been devoted to discerning the impact that alternative shapes of the distribution can
have. Moreover, choice of u is often driven through available statistical software to
implement the method rather than an underlying theoretical link between a model
of productive inefficiency and the exact shape of the corresponding distribution.
A majority of applied papers studying productivity do not rigorously check
differences in estimates, or perform inference, across different distributional
assumptions. Greene [36] is often cited as one of the first analyses to compare
average inefficiency levels across several distributional specifications (half-normal,
truncated-normal, exponential, and gamma), and he finds little difference in average
inefficiency across 123 US electric generation firms. Following Greene’s [36]
investigation into the choice of distribution, [59] calculated the rank correlations
among the JLMS scores from these same four models, producing rank correlations
as low as 0.75 and as high as 0.98.23
The intuition underlying these findings is that one’s understanding of ineffi-
ciency, as measured through the JLMS score, is robust to distributional choices,
at least from a ranking perspective. The reason for this can be found in the work
of [78, p. 438] who have shown that the JLMS efficiency scores are monotonic in
ε provided that the distribution of v is log-concave (which the normal distribution
is). The implication here is that firm rankings can be obtained via the OLS residuals
without the need of distributional assumptions whatsoever [18]. Thus, in light of
these insights, the important aspect of distributional choice for u is the impact
that it has on the corresponding estimates of the production function; when these
estimates are robust to distributional choice, so too will be the inefficiency rankings.
Thus, if interest hinges on features of the frontier, then so long as inefficiency
does not depend on conditioning variables (see section “Modeling Determinants of
Inefficiency”), one can effectively ignore the choice of distribution, as this
affects only the level of the estimated technology (and usually not substantially),
but not its shape, which is what influences measures such as returns to scale and
elasticities of substitution.
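
The monotonicity argument can be illustrated numerically. The sketch below evaluates the JLMS predictor under a half-normal and under an exponential assumption for u, using the standard closed forms for these two models, and confirms that both order firms exactly as the residuals do. The simulated data, parameter values, and function names are assumptions made purely for illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def jlms_half_normal(eps, sigma_u, sigma_v):
    """E[u | eps] when u is half-normal and v is normal."""
    sigma2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_u ** 2 / sigma2
    sigma_star = sigma_u * sigma_v / sigma2 ** 0.5
    r = mu_star / sigma_star
    return mu_star + sigma_star * STD_NORMAL.pdf(r) / STD_NORMAL.cdf(r)


def jlms_exponential(eps, sigma_u, sigma_v):
    """E[u | eps] when u is exponential with mean sigma_u and v is normal."""
    mu_tilde = -eps - sigma_v ** 2 / sigma_u
    r = mu_tilde / sigma_v
    return mu_tilde + sigma_v * STD_NORMAL.pdf(r) / STD_NORMAL.cdf(r)


# Simulated composed errors eps = v - u (hypothetical parameter values).
rng = random.Random(0)
eps = [rng.gauss(0, 0.5) - abs(rng.gauss(0, 1.0)) for _ in range(200)]

scores_hn = [jlms_half_normal(e, 1.0, 0.5) for e in eps]
scores_ex = [jlms_exponential(e, 1.0, 0.5) for e in eps]

# Rank firms from most to least inefficient under each assumption,
# and from smallest to largest residual.
order_hn = sorted(range(len(eps)), key=lambda i: -scores_hn[i])
order_ex = sorted(range(len(eps)), key=lambda i: -scores_ex[i])
order_eps = sorted(range(len(eps)), key=lambda i: eps[i])

print("same ranking:", order_hn == order_ex == order_eps)
```

The levels of the two sets of scores differ, but the ordering does not, which is why rank correlations across distributional assumptions are driven by differences in the estimated frontier rather than by the predictor itself.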

Finite Sample Identification of Inefficiency

An early analysis of the finite sample performance of the normal-half-normal SFM
by [77] uncovered an interesting phenomenon: quite regularly, the corrected OLS
estimator would produce an estimate of $\sigma_u^2 \le 0$. This was deemed a “Type I”
failure of the SFM; further, [77, p. 70] noted that “It is also true that, in every
case of Type I failure we encountered, the MLE estimate of [$\sigma_u^2$] also turned out
to equal zero. (This makes some sense, though we cannot prove analytically that
it should happen.)” Waldman [104] provided the analytic foundation behind this
result, demonstrating that a stationary point of the log-likelihood function exists,
and this stationary point is a local maximum when the sign of the skewness of
residuals stemming from OLS estimation of the SFM is positive. This is broadly
viewed as a deficiency of the SFM as an estimate of $\sigma_u^2$ of 0 is literally interpreted
as a finding of no inefficiency.

23 In a limited Monte Carlo analysis, [86] compared rank correlations of stochastic frontier
estimates assuming that inefficiency was either half-normal (which was the true distribution)
or exponential (a misspecified distribution) and found very little evidence that misspecification
impacted the rank correlations in any meaningful fashion; [46] conducted a similar set of
experiments and found essentially the same results.

However, this is an unfortunate interpretation because it is purely a finite sample
issue. If in fact u is distributed half-normal, then as shown in [28, 92, 103], as
n → ∞ the likelihood of drawing a random sample which will have positive
skewness decreases, and the rate of this decrease is directly related to $\sigma_u^2/\sigma_v^2$; the
larger this ratio, the faster the decrease in the probability of observing a random
sample with positive skew.24 The observance of OLS residuals with positive skew is,
in and of itself, of no concern. What is concerning is that for an applied researcher
whose focus is to study the efficiency level of firms, analysis of a sample where the
residuals from the SFM have positive skewness leads to the conclusion of all firms
being efficient, and this finding might be incongruent with either preconceptions
about the industry or perceived publication standards when applying these methods.
This has often led to various forms of respecification: using a different data set,
trying an alternative functional form for the production function or, most likely,
deploying different distributional assumptions regarding inefficiency.
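
The finite sample nature of the issue can be seen in a small simulation. The sketch below draws samples from a correctly specified normal-half-normal model with an intercept-only frontier (so that the OLS residuals are simply the demeaned composed errors) and records how often the third central moment is positive. The design, sample sizes, and parameter values are assumptions chosen for illustration and are not taken from the cited studies.

```python
import random


def third_central_moment(values):
    """Sample third central moment; its sign is the sign of the skewness."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 3 for x in values) / len(values)


def share_positive_skew(n, sigma_u, sigma_v, replications=500, seed=0):
    """Fraction of simulated samples whose residuals have positive skew."""
    rng = random.Random(seed)
    count = 0
    for _ in range(replications):
        # eps = v - u with v normal and u half-normal (correct specification).
        eps = [rng.gauss(0, sigma_v) - abs(rng.gauss(0, sigma_u))
               for _ in range(n)]
        if third_central_moment(eps) > 0:
            count += 1
    return count / replications


# Small sample with a low signal-to-noise ratio versus a larger sample
# with a high ratio (hypothetical settings).
share_small = share_positive_skew(n=50, sigma_u=0.5, sigma_v=1.0)
share_large = share_positive_skew(n=500, sigma_u=1.0, sigma_v=0.5)
print("n=50,  sigma_u/sigma_v=0.5:", share_small)
print("n=500, sigma_u/sigma_v=2.0:", share_large)
```

Even though inefficiency is present in every simulated sample, positive skew appears frequently in the first design and rarely in the second, in line with the dependence on n and on the variance ratio described above.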
As noted by [92], none of these respecification approaches are appropriate or
warranted. Again, Table 1 in [92] evinces that even when everything about the
SFM is correctly specified, positively skewed OLS residuals are still a regular
occurrence. Their suggestion is to use special resampling techniques based on
bootstrapping to conduct inference on either overall inefficiency of the industry
under study or specific firms. The finding of OLS residuals with positive skewness
is commonly denoted the “wrong skew problem,” though it is not clear where this
term initially originated. It is unfortunate that this term has crept into the lexicon of
productivity analysis as there really is no problem at all, except for the problem of
misinterpretation and mistreatment.
One reason why respecification is troubling is that classical statistical inference
assumes that model specification is selected independently of estimation. When
specification searches are conducted, this introduces biases into the final parameter
estimates. Further, there is the concern in published research that if the researcher
did encounter positive skew, this information is not provided to the reader. It is worth
mentioning that not all SFMs are plagued by this issue. In fact, some distributional
combinations will lead to identification of inefficiency regardless of the sign of
the skewness of the OLS residuals. Examples include the normal-uniform SFM of
[68], the normal-Weibull SFM of [101], the normal-binomial SFM of [22], and the
normal-doubly truncated SFM of [6]. Even more recently, [46] demonstrated, in
the style of [104], that the log-likelihood function of the Laplace-exponential SFM
is not dependent upon the sign of the skewness of the OLS residuals. This mainly
stems from the fact that as $\sigma_u^2 \to 0$, this model converges to a regression model with
error term distributed Laplace, for which the MLE is the least absolute deviations
(LAD) estimator.

24 Note that the estimator of the skewness coefficient is distributed asymptotically standard normal,
so it is feasible to have either negative or positive skewness in any finite sample.

Despite the history behind the impact of the sign of the skewness of the OLS
residuals on the SFM, interest still abounds surrounding this issue. Recently, [40]
presented a generalized method which always ensures that the SFM can be identified
and that this model will converge to the traditional SFM model as n → ∞
if the traditional SFM is correctly specified. Bonanno et al. [20] introduced a
generalized SFM which allows v to be distributed as a Type 1 generalized logistic
which introduces asymmetry in v, coupled with allowing dependence between u
and v. These two additional assumptions, similarly to [40], allow the parameters
of the SFM to be identified regardless of the sign of the skewness of the OLS
residuals. Feng et al. [30] describe a constrained MLE that uses the traditional
normal-half-normal distributional pair but imposes a penalty in estimation to combat
the potential for positive skewness of the OLS residuals to lead to an estimate of
$\sigma_u^2$ of 0. Finally, [48]
generalize the theory of [104] by studying the SFM without explicit distributional
assumptions. All told, this issue is one that still generates a substantial amount of
interest in the academic community, and it is one that is not likely to fade any time
soon (see the discussion in [7]).

Handling Endogeneity in the SFM

A common assumption in the SFM is that x is either exogenous or independent
of both ui and vi . If either of these conditions is violated, then the MLE will
be biased and most likely inconsistent. Yet, it is not difficult to think of settings
where endogeneity is likely to exist. For example, if shocks are observed before
inputs are chosen, then producers may respond to good or bad shocks by adjusting
inputs, leading to correlation between x and v. Alternatively, if managers know
they are inefficient, they may use this information to guide their level of inputs,
again, producing endogeneity. In a regression model, dealing with endogeneity is
well understood. However, in the composed error setting, these methods cannot be
simply transferred over but require care in how they are implemented [10].
To incorporate endogeneity into the SFM in (1), we set $m(x_i; \beta) = \beta_0 +
x_{1i}\beta_1 + x_{2i}\beta_2$, where $x_1$ are our exogenous inputs and $x_2$ are the endogenous inputs,
where endogeneity may arise through correlation of $x_2$ with u, v, or both. To deal
with endogeneity, we require instruments, w, and identification necessitates that the
dimension of w is at least as large as the dimension of x 2 . The natural assumption
for valid instrumentation is that w is independent of both u and v. Our following
discussion here will center on the distributional assumptions of ALS.
Why worry about endogeneity? Economic endogeneity means that the inputs
in question are choice variables and chosen to optimize some objective function
such as cost minimization or profit maximization. Statistical endogeneity arises
from simultaneity, omitted variables, and measurement errors. For example, if the
omitted variable is managerial ability, which is part of inefficiency, inefficiency is
likely to be correlated with inputs because managerial ability affects inputs. This
is Mundlak’s argument for why omitting a management quality variable (for
us inefficiency) will cause biased parameter estimates. Endogeneity can also be
caused by simultaneity, meaning that more than one variable in the model is jointly
determined.
One way to address the problem is to look at it from a purely statistical angle
and use instrumental variables. The other solution is economic, that is, address
the economic issue that is causing endogeneity. We consider first the statistical
solution and then the economic solution. In many applied settings, it is not clear
what researchers mean when they attempt to handle endogeneity inside the SFM.
An excellent introduction into the myriad of influences that endogeneity can have on
the estimates stemming from the SFM can be found in [74]. Mutter et al. [74] used
simulations designed around data based on the California nursing home industry
to understand the impact of endogeneity of nursing home quality on inefficiency
measurement.

A Corrected Two-Stage Least Squares Approach

The simplest approach to accounting for endogeneity is to use a corrected two-stage
least squares (C2SLS) approach, similar to the common COLS approach that has
been used to estimate the SFM. This method estimates the SFM using standard
2SLS with instruments w. This produces consistent estimators for $\beta_1$ and $\beta_2$ but
not $\beta_0$, as this is obscured by the presence of E[u] (to ensure that the residuals have
mean zero). The second and third moments of the 2SLS residuals are then used
to recover estimators of $\sigma_v^2$ and $\sigma_u^2$. Once $\sigma_u^2$ is determined, the intercept can be
corrected by adding $\sqrt{2/\pi}\,\hat{\sigma}_u$.
This represents a simple avenue to account for endogeneity, and it does not
require specifying how endogeneity enters the model, i.e., through correlation
with v, with u, or both. However, as with other corrected procedures based on
calculations of the second and third (or even higher) moments of the residuals, from
[77] and [104], if the initial 2SLS residuals have positive skew (instead of negative),
then $\sigma_u^2$ cannot be identified, and its estimator is 0. Further, the standard errors from
this approach need to be modified for the estimator of the intercept to account for
the stepwise nature of the estimation.
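
The moment step of a corrected procedure can be sketched as follows, taking a vector of first-stage residuals as given (the 2SLS step itself is not shown). The sketch relies on the standard half-normal moments, Var(u) = (1 − 2/π)σu² and a third central moment of ε equal to √(2/π)(1 − 4/π)σu³; the function name and the simulated residuals are illustrative assumptions.

```python
import math
import random


def corrected_moments(residuals):
    """Recover (sigma_u^2, sigma_v^2, intercept shift) from residual moments."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    if m3 >= 0:
        # Positive skew: sigma_u^2 is not identified and is set to zero.
        return 0.0, m2, 0.0
    # Third central moment of eps = v - u under half-normal u (negative constant).
    scale = math.sqrt(2.0 / math.pi) * (1.0 - 4.0 / math.pi)
    sigma_u2 = (m3 / scale) ** (2.0 / 3.0)
    # Note: this difference can be negative in small samples; it is not clipped here.
    sigma_v2 = m2 - (1.0 - 2.0 / math.pi) * sigma_u2
    intercept_shift = math.sqrt(2.0 / math.pi) * math.sqrt(sigma_u2)
    return sigma_u2, sigma_v2, intercept_shift


# Simulated residuals standing in for first-stage output (hypothetical values:
# sigma_u = 1.0, sigma_v = 0.5).
rng = random.Random(1)
resid = [rng.gauss(0, 0.5) - abs(rng.gauss(0, 1.0)) for _ in range(100_000)]
sigma_u2_hat, sigma_v2_hat, shift_hat = corrected_moments(resid)
print(sigma_u2_hat, sigma_v2_hat, shift_hat)
```

The returned shift is the quantity added to the first-stage intercept; the positive-skew branch makes explicit the identification failure noted above.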

A Likelihood Approach

The SFM with endogeneity has recently been studied by [10, 52, 64, 100]. Here
we describe maximum likelihood estimation of the SFM under endogeneity. Our
discussion here follows [10] as their derivation of the likelihood relies on a simple
conditioning argument as opposed to the earlier work relying on the Cholesky
decomposition. While both approaches lead to the same likelihood function, the
conditioning idea of [10] is simpler and more intuitive.
Consider the stochastic frontier system:

$$y_i = x_i\beta + \varepsilon_i \tag{17}$$

$$x_{2i} = w_i\Gamma + \eta_i \tag{18}$$

where $x_i = (x_{1i}, x_{2i})$, $\beta = (\beta_1, \beta_2)$, $w_i = (x_{1i}, q_i)$ is the vector of instruments, $\eta_i$
is uncorrelated with $w_i$, and endogeneity of $x_{2i}$ arises through $\operatorname{cov}(\varepsilon_i, \eta_i) \neq 0$. Here
simultaneity bias (and the resulting inconsistency) exists because $\eta_i$ is correlated
with either $v_i$, $u_i$, or both.
The following assumptions are used by [10]: $u_i \sim N_+(0, \sigma_u^2)$, $m(x_{1i}, x_{2i}; \beta_1, \beta_2)
= \beta_0 + x_{1i}\beta_1 + x_{2i}\beta_2$, and conditional on $w_i$, $\psi_i = (v_i, \eta_i) \sim N(0, \Omega)$, where

$$\Omega = \begin{bmatrix} \sigma_v^2 & \Sigma_{v\eta} \\ \Sigma_{\eta v} & \Sigma_{\eta\eta} \end{bmatrix}.$$

Amsler et al. [10] focused on the setting where $u_i$ is independent of $\psi_i = (v_i, \eta_i)$.
To derive the likelihood function, [10] condition on the instruments, w. Doing
this yields f (y, x 2 |w) = f (y|x 2 , w) · f (x 2 |w). With the density in this form,
the log-likelihood follows suit: ln L = ln L1 + ln L2 , where ln L1 corresponds
to f (y|x 2 , w) and ln L2 corresponds to f (x 2 |w). These two components can be
written as

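$$\ln L_1 = -(n/2)\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\tilde{\varepsilon}_i^2 + \sum_{i=1}^{n}\ln\Phi\left(-\lambda_c\tilde{\varepsilon}_i/\sigma\right)$$

$$\ln L_2 = -(n/2)\ln|\Sigma_{\eta\eta}| - 0.5\sum_{i=1}^{n}\eta_i'\Sigma_{\eta\eta}^{-1}\eta_i,$$

where $\tilde{\varepsilon}_i = y_i - \beta_0 - x_i\beta - \mu_{ci}$, $\mu_{ci} = \Sigma_{v\eta}\Sigma_{\eta\eta}^{-1}\eta_i$, $\sigma^2 = \sigma_c^2 + \sigma_u^2$, $\lambda_c = \sigma_u/\sigma_c$, and
$\sigma_c^2 = \sigma_v^2 - \Sigma_{v\eta}\Sigma_{\eta\eta}^{-1}\Sigma_{\eta v}$. The subtraction of $\mu_{ci}$ in $\ln L_1$ is an endogeneity correction,
while $\ln L_2$ is nothing more than the standard likelihood
function of a multivariate normal regression model (as in (18)). Estimates of the
model parameters ($\beta$, $\sigma_v^2$, $\sigma_u^2$, $\Gamma$, $\Sigma_{v\eta}$, and $\Sigma_{\eta\eta}$) can be obtained by maximizing the
likelihood function ln L.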
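
To make the structure of the two likelihood components concrete, the sketch below evaluates ln L1 and ln L2 for the simplest case of one endogenous regressor and one instrument, so that every covariance term is a scalar. Additive constants are omitted, as in the expressions above. The function name, the argument names, and the simulated data are assumptions for illustration only; this is not the estimation code of [10].

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def log_likelihood_parts(y, x2, w, beta0, beta2, gamma,
                         sigma_u2, sigma_v2, s_veta, s_etaeta):
    """Return (ln L1, ln L2) for a scalar endogenous regressor x2 and instrument w."""
    n = len(y)
    sigma_c2 = sigma_v2 - s_veta ** 2 / s_etaeta   # variance of v given eta
    sigma2 = sigma_c2 + sigma_u2
    sigma = math.sqrt(sigma2)
    lam_c = math.sqrt(sigma_u2 / sigma_c2)
    ln_l1 = -0.5 * n * math.log(sigma2)
    ln_l2 = -0.5 * n * math.log(s_etaeta)
    for yi, xi, wi in zip(y, x2, w):
        eta = xi - gamma * wi                      # reduced-form error
        mu_c = s_veta / s_etaeta * eta             # endogeneity correction
        eps_t = yi - beta0 - beta2 * xi - mu_c
        ln_l1 += -eps_t ** 2 / (2.0 * sigma2)
        ln_l1 += math.log(STD_NORMAL.cdf(-lam_c * eps_t / sigma))
        ln_l2 += -0.5 * eta ** 2 / s_etaeta
    return ln_l1, ln_l2


# Simulated data under hypothetical parameter values.
rng = random.Random(1)
true = dict(beta0=1.0, beta2=0.5, gamma=1.0, sigma_u2=0.5,
            sigma_v2=0.5, s_veta=0.3, s_etaeta=1.0)
sigma_c = math.sqrt(true["sigma_v2"] - true["s_veta"] ** 2 / true["s_etaeta"])
y, x2, w = [], [], []
for _ in range(500):
    wi = rng.gauss(0, 1)
    eta = rng.gauss(0, math.sqrt(true["s_etaeta"]))
    v = true["s_veta"] / true["s_etaeta"] * eta + sigma_c * rng.gauss(0, 1)
    u = abs(rng.gauss(0, math.sqrt(true["sigma_u2"])))
    xi = true["gamma"] * wi + eta
    w.append(wi)
    x2.append(xi)
    y.append(true["beta0"] + true["beta2"] * xi + v - u)

ll_true = sum(log_likelihood_parts(y, x2, w, **true))
ll_off = sum(log_likelihood_parts(y, x2, w, **{**true, "beta2": 1.0}))
print(ll_true, ll_off)
```

In an application the sum of the two parts would be passed to a numerical optimizer; here it is only evaluated at the simulated truth and at a perturbed slope to show that the former attains the higher value.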
While direct estimation of the likelihood function is possible, a two-step
approach is also available [64]. However, as pointed out by both [64] and [10],
this two-step approach will have incorrect standard errors. Even though the two-
step approach might be computationally simpler, it is, in general, different from full
optimization of the likelihood function of [10]. This is due to the fact that the two-
step approach ignores the information provided by $\Gamma$ and $\Sigma_{\eta\eta}$ in $\ln L_1$. In general,
full optimization of the likelihood function is recommended as the standard errors
(obtained in the usual manner from the inverse of the Fisher information matrix) are
valid.25

A Method of Moments Approach

An insightful avenue to deal with endogeneity in the SFM that differs from the
traditional corrected methods or maximum likelihood is proposed by [10], who
used the work of [41]. The idea is to use the first-order conditions for maximization
of the likelihood function under exogeneity:

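$$E\left[\varepsilon_i^2/\sigma^2 - 1\right] = 0 \tag{19}$$

$$E\left[\frac{\varepsilon_i\phi_i}{1 - \Phi_i}\right] = 0 \tag{20}$$

$$E\left[x_i\varepsilon_i/\sigma + \lambda x_i\frac{\phi_i}{1 - \Phi_i}\right] = 0, \tag{21}$$

where $\phi_i = \phi(\lambda\varepsilon_i/\sigma)$ and $\Phi_i = \Phi(\lambda\varepsilon_i/\sigma)$. Note that these expectations are taken over
$x_i$ and $y_i$ (and by default, $\varepsilon_i$) and solved for the parameters of the SFM.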
The key here is that these first-order conditions (one for $\sigma^2$, one for λ and
the vector for β) are valid under exogeneity, and this implies that the maximum
likelihood estimator is the generalized method of moments estimator. Under
endogeneity, however, this relationship does not hold directly. But the seminal idea
of [10] is that the first-order conditions (19) and (20) are based on the distributional
assumptions on v and u, not on the relationship of x with v and/or u. Thus, these
moment conditions are valid whether x contains endogenous components or not.
The only moment condition that needs to be adjusted is (21). In this case the first-
order condition needs to be taken with respect to w, the exogenous variable, not x.
Doing so results in the following amended first-order condition:

$$E\left[w_i\varepsilon_i/\sigma + \lambda w_i\frac{\phi_i}{1 - \Phi_i}\right] = 0, \tag{22}$$

where $\phi_i$ and $\Phi_i$ are identical to those in (21). It is important to acknowledge that
this moment condition is valid when $\varepsilon_i$ and $w_i$ are independent. This is a more
stringent requirement than the typical regression setup with $E[\varepsilon_i|w_i] = 0$. As with
the C2SLS approach, the source of endogeneity for $x_2$ does not need to be specified
(through v and/or u).

25 Typically the standard errors can be obtained either through use of the outer product of gradients
(OPG) or direct estimation of the Hessian matrix of the log-likelihood function. Given the nascency
of these methods, it has yet to be determined which of these two methods is more reliable in
practice, though in other settings both tend to work well. One caveat for promoting the use of the
OPG is that since this only requires calculation of the first derivatives, it can be more stable (and
more likely to be invertible) than calculation of the Hessian. Also note that in finite samples, the
different estimators of the covariance of the ML estimator can give different numerical estimates, even
suggesting different implications for inference (reject or do not reject the null hypothesis). So,
for small samples, it is often advised to check all feasible estimates whenever there is suspicion
of ambiguity in the conclusions (e.g., when a hypothesis is rejected only at, say, around the 10%
significance level).
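
The sample analogues of (19), (20), and (22) are straightforward to compute. The sketch below evaluates them for a scalar instrument at given values of σ and λ and, as a sanity check, confirms that they are close to zero at the true parameters when the composed error is simulated independently of the instrument. The function name and the simulation design are illustrative assumptions; a GMM estimator would minimize a quadratic form in these averages rather than merely evaluate them.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def sample_moments(eps, w, sigma, lam):
    """Sample averages of the moment functions in (19), (20) and (22)."""
    n = len(eps)
    m19 = m20 = m22 = 0.0
    for e, wi in zip(eps, w):
        z = lam * e / sigma
        # phi_i / (1 - Phi_i), writing 1 - Phi(z) as Phi(-z) for accuracy.
        hazard = STD_NORMAL.pdf(z) / STD_NORMAL.cdf(-z)
        m19 += e ** 2 / sigma ** 2 - 1.0
        m20 += e * hazard
        m22 += wi * e / sigma + lam * wi * hazard
    return m19 / n, m20 / n, m22 / n


# Simulated composed errors, independent of the instrument (hypothetical values).
rng = random.Random(2)
sigma_u, sigma_v = 1.0, 0.5
sigma_true = math.sqrt(sigma_u ** 2 + sigma_v ** 2)
lam_true = sigma_u / sigma_v
eps = [rng.gauss(0, sigma_v) - abs(rng.gauss(0, sigma_u)) for _ in range(100_000)]
w = [1.0 + rng.gauss(0, 1) for _ in range(100_000)]

moments_at_truth = sample_moments(eps, w, sigma_true, lam_true)
moments_off = sample_moments(eps, w, 2.0 * sigma_true, lam_true)
print(moments_at_truth)
print(moments_off)
```

Evaluating the same averages at a wrong value of σ moves them away from zero, which is the variation a moment-based estimator exploits.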

Estimation of Individual Inefficiency

An interesting and important finding from [10] is that when there is endogeneity, one
can potentially improve estimation of inefficiency through the JLMS estimator. The
traditional predictor of [50] is E(ui |εi ). However, more information is available
when endogeneity is present, namely, via ηi . This calls for a modified JLMS
estimator, E(ui |εi , ηi ). Note that even though it is assumed that ui is independent
from ηi (as in [10]), because ηi is correlated with vi , there is information that can
be used to help predict ui even after conditioning on εi .
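Amsler et al. [10] showed that $\eta_i$ is independent of $(u_i, \tilde{\varepsilon}_i)$, so that

$$E(u_i|\varepsilon_i, \eta_i) = E(u_i|\tilde{\varepsilon}_i, \eta_i) = E(u_i|\tilde{\varepsilon}_i),$$

and that the distribution of $u_i$ conditional on $\tilde{\varepsilon}_i = y_i - \beta_0 - x_i\beta - \mu_{ci}$ is
$N_+(\mu_*, \sigma_*^2)$ with $\mu_* = -\sigma_u^2\tilde{\varepsilon}_i/\sigma^2$ and $\sigma_*^2 = \sigma_u^2\sigma_c^2/\sigma^2$, which is identical to
the original JLMS estimator, except that $\sigma_v^2$ is replaced with $\sigma_c^2$ and $\tilde{\varepsilon}_i$ takes the
place of $\varepsilon_i$. The modified JLMS estimator in the presence of endogeneity becomes

$$E(u_i|\varepsilon_i, \eta_i) = \sigma_*\left[\frac{\phi(\xi_i)}{1 - \Phi(\xi_i)} - \xi_i\right]$$

with $\xi_i = \lambda_c\tilde{\varepsilon}_i/\sigma$. Note that $E(u_i|\varepsilon_i, \eta_i)$ is a better
predictor than $E(u_i|\varepsilon_i)$ because $\sigma_c^2 < \sigma_v^2$. The improvement in prediction follows
from the textbook identity for variances, where for any random vector (X, Z), where
X and Z are random sub-vectors, we have

$$\operatorname{var}(X) = \underbrace{\operatorname{var}[E(X|Z)]}_{\text{Explained}} + \underbrace{E(\operatorname{var}[X|Z])}_{\text{Unexplained}}.$$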

In this case, by conditioning on both εi and ηi , the conditioning set is larger than
simply conditioning on εi , and so it must hold that the unexplained portion of
E(ui |εi , ηi ) is smaller than that of E(ui |εi ). It then holds that there is less variation
in E(ui |εi , ηi ) as a predictor than E(ui |εi ), which is a good thing. While it is not
obvious at first glance, one benefit of endogeneity is that researchers may be able
to more accurately predict firm-level inefficiency, though it comes at the expense
of having to deal with endogeneity. This improvement in prediction may also be
accompanied by narrower prediction intervals; however, this is not known as [10]
did not study the prediction intervals.
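
A short sketch of the modified predictor, for the scalar case in which there is a single endogenous regressor, is given below. It returns both the prediction and σ∗, so that the reduction in the conditional scale relative to the case with no correlation between v and η can be seen directly. The function name and the input values are assumptions for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def modified_jlms(eps_tilde, sigma_u2, sigma_v2, s_veta, s_etaeta):
    """Return (E[u | eps_tilde], sigma_star) with sigma_c^2 in place of sigma_v^2."""
    sigma_c2 = sigma_v2 - s_veta ** 2 / s_etaeta
    sigma2 = sigma_c2 + sigma_u2
    sigma_star = math.sqrt(sigma_u2 * sigma_c2 / sigma2)
    xi = math.sqrt(sigma_u2 / sigma_c2) * eps_tilde / math.sqrt(sigma2)
    # sigma_* [ phi(xi) / (1 - Phi(xi)) - xi ], with 1 - Phi(xi) = Phi(-xi).
    prediction = sigma_star * (STD_NORMAL.pdf(xi) / STD_NORMAL.cdf(-xi) - xi)
    return prediction, sigma_star


# Hypothetical parameter values; s_veta = 0 reproduces the usual JLMS inputs.
with_corr = modified_jlms(-0.4, sigma_u2=0.5, sigma_v2=0.5, s_veta=0.4, s_etaeta=1.0)
no_corr = modified_jlms(-0.4, sigma_u2=0.5, sigma_v2=0.5, s_veta=0.0, s_etaeta=1.0)
print("with correlation:   ", with_corr)
print("without correlation:", no_corr)
```

Note that in an application the corrected residual itself also changes with the covariance between v and η, since μci enters its definition; the comparison above holds the residual fixed only to isolate the effect on σ∗.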

An Economic Approach to Deal with Endogeneity

An alternative to developing valid instruments and correcting for endogeneity is
to use what is known as a primal system approach, when inputs are endogenous
[61, Chap. 8]. This setup estimates the traditional SFM but appends the first-order
conditions stemming from cost minimization (one could alternatively attach profit
maximization or return to the outlay conditions instead if this was a more
representative behavior for the industry under study). That is, if a producer minimizes
costs26

$$\min\; p'x, \quad \text{s.t. } y = m(x; \beta) + v - u, \tag{23}$$

for input prices p, the first-order conditions in this case are

$$\frac{m_j(x; \beta)}{m_1(x; \beta)} = \frac{p_j}{p_1}, \quad j = 2, \ldots, J, \tag{24}$$

where $m_j(x; \beta)$ is the partial derivative of $m(x; \beta)$ with respect to $x_j$. These
first-order conditions are exact, which usually does not arise in practice; rather, a
stochastic term is added, which is designed to capture allocative inefficiency. That
is, our empirical first-order conditions are $\frac{m_j(x;\beta)}{m_1(x;\beta)} = \frac{p_j}{p_1}e^{\xi_j}$ for $j = 2, \ldots, J$, where
$e^{\xi_j}$ captures allocative inefficiency for the jth input relative to input 1 (the choice
of input to compare to is without loss of generality). The idea behind allocative
inefficiency is that firms could be fully technically efficient and still have room for
improvement due to over or under use of inputs, relative to another input, given the
price ratio. In general, if firms are cost minimizers and one estimates a production
function, the inputs will be endogenous as these are choice variables to the firm.
Hence, a different approach is needed.
The primal system approach estimates the SFM as in (1) but also incorporates
the information in the J − 1 conditions in (24) with allocative inefficiency built
in. Shephard’s lemma in microeconomics dictates that the first-order conditions are
actually cost share information; when the logarithm of the production function is
taken, the first derivatives represent the cost shares of the corresponding inputs,

$$\frac{m_j(x; \beta)}{m_1(x; \beta)} = \frac{\left(\partial\ln m/\partial\ln x_j\right)/x_j}{\left(\partial\ln m/\partial\ln x_1\right)/x_1} = \frac{s_j/x_j}{s_1/x_1}. \tag{25}$$

When these are equated to the ratio of input prices, one obtains $\frac{s_j/x_j}{s_1/x_1} = \frac{p_j}{p_1}e^{\xi_j}$,
which can be rearranged to yield $\frac{s_j}{s_1} = \frac{p_j x_j}{p_1 x_1}e^{\xi_j}$. Taking logarithms produces

$$\ln(s_j) - \ln(s_1) - \ln(p_j x_j) + \ln(p_1 x_1) = \xi_j. \tag{26}$$

If distributional assumptions are imposed on v, u, and ξ, the parameters of the
production function can be estimated along with technical and allocative efficiency.
An unfortunate consequence of the primal system approach is that the input demand
and cost functions are analytically tractable only for quite specific assumptions
on the production function (Cobb-Douglas being one). Outside these special cases, a more
complicated process is required to determine the impact of technical and allocative
inefficiency on costs [60]. See [56, 57] for more detailed discussion of these types
of primal system approaches to handle economic endogeneity across a range of
settings.

26 It is possible to treat a subset of x as endogenous; i.e., $x = (x_1, x_2)$, where $x_1$ is endogenous
and $x_2$ is exogenous.
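
Equation (26) can be evaluated directly once the log-derivatives of the production function are available. The sketch below does so for a Cobb-Douglas technology, where the log-derivatives are simply the output elasticities, and checks that the implied ξj is zero when inputs satisfy the exact first-order conditions (24). The function name, the elasticities, and the price and quantity values are hypothetical.

```python
import math


def allocative_errors(shares, prices, quantities):
    """xi_j = ln s_j - ln s_1 - ln(p_j x_j) + ln(p_1 x_1) for j = 2, ..., J."""
    base = math.log(shares[0]) - math.log(prices[0] * quantities[0])
    return [math.log(s) - math.log(p * x) - base
            for s, p, x in zip(shares[1:], prices[1:], quantities[1:])]


# Cobb-Douglas elasticities play the role of s_j (hypothetical values).
elasticities = [0.3, 0.5, 0.2]
prices = [2.0, 1.0, 4.0]
x1 = 10.0

# Quantities satisfying the exact conditions (24): x_j = s_j p_1 x_1 / (s_1 p_j).
optimal = [x1] + [s * prices[0] * x1 / (elasticities[0] * p)
                  for s, p in zip(elasticities[1:], prices[1:])]
xi_optimal = allocative_errors(elasticities, prices, optimal)

# Doubling the second input relative to its cost-minimizing level.
distorted = [optimal[0], 2.0 * optimal[1], optimal[2]]
xi_distorted = allocative_errors(elasticities, prices, distorted)

print("cost-minimizing mix:", xi_optimal)
print("input 2 doubled:    ", xi_distorted)
```

With the second input doubled, the corresponding ξ equals −ln 2, so a negative value signals over-use of that input relative to input 1 at the given price ratio.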

Modeling Determinants of Inefficiency

The use of the SFM is exciting for productivity analysis because a prediction
of firm-level efficiency can be obtained. However, in the benchmark SFM, ui is
treated as completely random, and so nothing connects the level of inefficiency to
variables which might serve as an explanation for the existence and the level of
inefficiency. As the SFM has gained popularity in applied productivity analysis, it
has become common to introduce variables outside the main production structure
which influence output through their effect on inefficiency.27
As a concrete example, consider the study of productivity within the banking
industry. A researcher may want to know whether a bank’s level of efficiency is
affected by the use of information technology, the amount of assets the bank has
access to, the type of bank, or the type of ownership structure in place, corporate
governance practices, etc. Similarly, the government might be interested in whether
regulations (such as allowing banks to merge) improve banks’ performance. To
answer these questions, the relationship between efficiency and its potential deter-
minants needs to be modeled and estimated.
Consider estimating what influences firm-level inefficiency in the benchmark
SFM. This model assumes that both vi and ui are homoskedastic. In a traditional
linear regression, heteroskedasticity has no impact on the bias/consistency of the
OLS estimator. However, if we were to allow $\sigma_u^2$ to depend on determinants of
inefficiency, z, then ignoring this will lead to, except in special settings, a biased
and inconsistent estimator of the parameters of the SFM. Both [59, Section 3.4]
and [106] provide detailed accounts of the consequences of ignoring the presence
of determinants of inefficiency in the SFM.

27 Reifschneider and Stevenson [83] used the term “inefficiency explanatory variables,” while
others call them “environmental variables,” but it is now common to refer to these variables
as “determinants of inefficiency.” A variety of approaches have been proposed to model the
determinants of inefficiency with the first pertaining to panel data models [14, 55] (see chapter
11).

Recall from section “The Benchmark SFM” that $E[u] = \sqrt{2/\pi}\,\sigma_u$. Now imagine
ignoring the composed structure of ε and estimating the SFM via OLS. If it is the
case that determinants of inefficiency are present, so that $\sigma_u^2 = \sigma_u^2(z)$, this omission
leads to biased parameter estimates of the SFM given that the assumed model is

$$y_i = m(x_i; \beta) - \sqrt{2/\pi}\,\sigma_u + \varepsilon_i^*,$$

with $\varepsilon_i^* = \varepsilon_i + \sqrt{2/\pi}\,\sigma_u$, whereas the true model is

$$y_i = m(x_i; \beta) - \sqrt{2/\pi}\,\sigma_u(z_i) + \varepsilon_i^* \equiv \tilde{m}(x_i, z_i; \beta, \delta) + \varepsilon_i^*.$$

The estimates of m(x i ; β) are conflated with σu (zi ), unless x and z are
uncorrelated. The reason that this issue presents itself is the fact that the mean of
u, due to the truncation at 0, must depend on the variance. Thus, it is not possible to
allow u to be heteroskedastic without the mean of u being a function of z as well.
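
The conflation of the frontier with σu(z) can be illustrated with a small simulation. The sketch below generates output from a linear frontier with half-normal inefficiency whose scale depends on a variable z correlated with x, and compares the OLS slope with the case in which the scale is constant. The design (scale equal to exp(δ1 z), correlation 0.8, sample size) is an assumption chosen for illustration only.

```python
import math
import random


def ols_slope(x, y):
    """Slope from a simple regression of y on x with an intercept."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_slope(delta1, n=20_000, rho=0.8, beta1=1.0, seed=0):
    """OLS slope when the scale of u is exp(delta1 * z) and corr(x, z) = rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        z = rho * x + math.sqrt(1.0 - rho ** 2) * rng.gauss(0, 1)
        sigma_u = math.exp(delta1 * z)      # heteroskedastic inefficiency scale
        u = abs(rng.gauss(0, sigma_u))      # half-normal draw
        v = rng.gauss(0, 0.3)
        xs.append(x)
        ys.append(beta1 * x + v - u)
    return ols_slope(xs, ys)


slope_homoskedastic = simulate_slope(delta1=0.0)
slope_heteroskedastic = simulate_slope(delta1=0.5)
print("true slope: 1.0")
print("OLS slope, scale constant:     ", slope_homoskedastic)
print("OLS slope, scale depends on z: ", slope_heteroskedastic)
```

When the scale is constant only the intercept absorbs E[u], whereas when it varies with a z that is correlated with x the slope absorbs part of the variation in E[u|z] as well.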
Notice here that we have specifically separated the impacts of x and z on output,
with x capturing pure production and z capturing inefficiency. This is commonly
known as the separability assumption. In some settings this assumption does not
have to be made, but in other settings, it is a necessity for identification. See [80]
for a more detailed discussion of the separability assumption. Our use of it here is
more for expositional clarity.
Exactly how to model the influence of z on inefficiency is unknown, and at
various points in time, practitioners have deployed a simpler, two-step analysis to
account for the presence of determinants of inefficiency. This approach constructs
JLMS predictions in the first step and then regresses these inefficiency estimates
on z in the second step. Pitt and Lee [82] were the first to implement this type of
approach (in a panel data setting), and many others followed this two-step approach
blindly [5, 21, 51]. However, this route to modeling determinants of inefficiency has
been met with criticism repeatedly and for good reason.
As explained in [15], the first-stage model is misspecified if z is ignored. Further,
[106] note that if x and z are correlated, then an omitted variable bias exists in the
first step rendering the second step ineffectual. Even in the special case where x
and z are uncorrelated, ignoring the dependence of u on z will lead to the estimated
JLMS predictions in the first stage to have too little variation (see also [88]), and,
subsequently, the estimator in the second-stage regression will be biased downward.
Caudill and Ford [24] provide Monte Carlo evidence on the effects that ignoring the
impact of z on u has on the estimator of the parameters of the SFM, while [106]
provide a detailed analysis of the bias of the second-stage parameter estimators.
As should be clear, the two-stage approach to account for determinants of
inefficiency in the SFM has no statistical foundation and is widely agreed
to yield poor insights on the actual behavior of inefficiency; as such, this approach
should be strictly avoided. Even with these criticisms of the two-step approach,
one will occasionally happen across research that adopts this flawed two-step
methodology.

While the two-stage approach has undesirable statistical properties, this does not
mean that determinants of inefficiency cannot be accounted for. Quite the contrary,
the preferred approach to studying the exogenous influences on efficiency is a
single-step procedure that explicitly accounts for z.

Proper Modeling of the Determinants of Inefficiency

The first proper proposals to model z in the SFM are [58, 83], who used the normal
truncated-normal SFM as the basis for estimation.28 While their focus was on the
normal truncated-normal SFM, the key insights hold for the normal-half-normal
SFM, which is what we will base our discussion on here. The main idea is to specify
$\sigma_u^2$ as a parametric function of z.29 Formally, their parameterization of $\sigma_u^2$ is

$$\sigma_u^2 = e^{z'\delta}. \tag{27}$$

The log-likelihood function of the heteroskedastic model is the same as in
(4), except that we replace $\sigma_u^2$ with (27).30 Here all of the model parameters are
estimated simultaneously, and once they are found, technical inefficiency can be
computed using (13) or (14) with the appropriate form of σu2 substituted into the
expressions.
If u follows a half-normal distribution, with the $\sigma_u^2$ function depending upon z,
then the mean of $u_i$ is

$$E[u_i|z_i] = \sqrt{2/\pi}\,e^{z_i\delta} = e^{\frac{1}{2}\ln(2/\pi) + z_i\delta}. \tag{28}$$

Note that the $\frac{1}{2}\ln(2/\pi)$ term can be absorbed by the constant term in $z_i\delta$.
Therefore, by parameterizing $\sigma_u^2$, we allow z to affect the expected value of
inefficiency. More importantly, however, is that the parameterization (27) produces
maximum likelihood estimates of δ which may not be very informative. This is
because E[ui |zi ] is nonlinear in z, and therefore the slope coefficients δ are not the
marginal effects of z. For instance, assume the j th variable in z has an estimated
coefficient of 0.5. This number itself tells us very little about the magnitude of
the j th variable’s (marginal) effect on the inefficiency, though it does tell us the
direction of the effect on inefficiency. Also, the nonlinearity of the conditional mean
of u implies that for different levels of z, there will be different expected levels of
u. In these instances the marginal effect of z may be useful for empirical purposes.

28 Caudill and Ford [24], Huang and Liu [49], Battese and Coelli [15], Caudill et al. [25], Hadri
[39], and Wang [105] present alternative specifications as well.
29 It is also possible to model $\sigma_v^2$ as a function of variables, but this poses fewer problems, and we
omit the details here. See [79] and [91] for more discussion.
30 Actually, given the reparameterization of the log-likelihood function, the specification for $\sigma_u$
implies a particular specification for both λ and σ.
358 S. C. Kumbhakar et al.

For the given parameterization of the normal-half-normal SFM, the marginal
effect of the jth variable of $z_i$, $z_{ji}$, on $E[u_i \mid z_i]$ is

$$\frac{\partial E[u_i \mid z_i]}{\partial z_j} = \delta_j \sqrt{2/\pi}\, \sigma_{u,i}, \tag{29}$$

where $\sqrt{2/\pi}$ is approximately 0.80. Because $\sqrt{2/\pi}\,\sigma_{u,i} > 0$, (29) also implies

$$\operatorname{sign}\left(\frac{\partial E[u_i \mid z_i]}{\partial z_j}\right) = \operatorname{sign}(\delta_j), \tag{30}$$

so that the sign of the coefficient reveals the direction of impact of $z_{ji}$ on
$E[u_i \mid z_i]$. This property does not always hold across distributional assumptions. For
example, in the normal-truncated-normal SFM, the sign of the coefficient cannot be
interpreted directly [79]. In general, only in one-parameter families for the pdf of u
(exponential, half-normal, etc.) does this correspondence hold; this suggests caution
in directly interpreting the impact that a particular variable $z_j$ has on inefficiency
based purely on the sign of $\delta_j$.
The nonlinear nature of the relationship of E[u|z] with z implies that for a sample
of n observations, we have n marginal effects for each variable. A concise statistic
to present is the average partial effect (APE) on inefficiency or the partial effect of
the average (PEA):

$$APE(z_j) = \left(\delta_j \sqrt{2/\pi}\right) n^{-1} \sum_{i=1}^{n} \sigma_{u,i}, \tag{31}$$

$$PEA(z_j) = \delta_j \sqrt{2/\pi}\, e^{\bar{z}'\delta}. \tag{32}$$

Either of these measures can be used to provide an overall sense of the impact of
a given variable on the level of inefficiency. However, these statistics should also be
interpreted with care. Neither necessarily reflects the impact of a given covariate for
a given firm; rather, each describes an effect on average and ceteris paribus, i.e.,
holding other covariates fixed. For example, it could be that half of the sample has a
very negative effect that is balanced by positive effects in the other half of the sample,
giving an average near zero that misrepresents the phenomenon. It is also possible to
standardize further by using elasticities, which will cancel out the $\sqrt{2/\pi}$ term; this
occurs when the variables are measured in logarithms. It could also prove useful
to present the estimates at quartiles or at particular points of interest
suggested by the empirical context (for example, a specific regulatory output
target).
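The quantities in (29), (31) and (32) are simple to compute once δ has been estimated. The following is a minimal sketch, assuming σ_{u,i} = e^{z_i'δ} and using made-up values for δ and z; the function names are hypothetical.

```python
import math

C = math.sqrt(2.0 / math.pi)  # roughly 0.80

def sigma_u(z, delta):
    # sigma_{u,i} = exp(z_i'delta)
    return math.exp(sum(d * v for d, v in zip(delta, z)))

def expected_u(z, delta):
    # Equation (28): mean of half-normal inefficiency given z
    return C * sigma_u(z, delta)

def marginal_effect(z, delta, j):
    # Equation (29): observation-specific marginal effect of z_j
    return delta[j] * C * sigma_u(z, delta)

def ape(Z, delta, j):
    # Equation (31): average of the n observation-specific effects
    return delta[j] * C * sum(sigma_u(z, delta) for z in Z) / len(Z)

def pea(Z, delta, j):
    # Equation (32): effect evaluated at the sample mean of z
    zbar = [sum(z[m] for z in Z) / len(Z) for m in range(len(delta))]
    return delta[j] * C * sigma_u(zbar, delta)

# Illustrative values only: first column of z is a constant.
Z = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
delta = [-1.0, 0.5]
for z in Z:
    print(z[1], round(marginal_effect(z, delta, 1), 4))
print("APE", round(ape(Z, delta, 1), 4), "PEA", round(pea(Z, delta, 1), 4))
```

The same coefficient of 0.5 yields a different marginal effect at each z, and the APE and PEA generally differ because the exponential is convex.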

Incorporating Determinants When u Is Truncated-Normal

As we have discussed earlier, the truncated-normal distribution offers greater
flexibility to model an array of shapes of the true, but unknown, distribution
of u. When determinants of inefficiency are present and one elects to assume
the truncated-normal distribution, several additional modeling choices become
available to the researcher. These additional choices are important because, as with
the choice of distributional assumption, there is typically little guidance on how best
to incorporate the determinants.
What do we mean? Consider again the truncated-normal density that would be
assumed for u, when determinants of inefficiency are present:

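$$f(u) = \frac{1}{\sqrt{2\pi}\,\sigma_u(z;\delta_1)\,\Phi\big(\mu(z;\delta_2)/\sigma_u(z;\delta_1)\big)} \exp\left(-\frac{(u-\mu(z;\delta_2))^2}{2\sigma_u(z;\delta_1)^2}\right), \quad u \ge 0. \tag{33}$$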

In this case the impact of z on u can be modeled through the pre-truncation mean,
μ, and the pre-truncation standard deviation, σu . The issue with where to assume
that z influences u is that modeling either parameter as a function of z impacts all
of the moments of u, due to the truncation. Consider the conditional (on z) mean of
a truncated normal random variable

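$$E[u \mid z] = \sigma_u(z;\delta_1)\left[\frac{\mu(z;\delta_2)}{\sigma_u(z;\delta_1)} + \frac{\phi\big(\mu(z;\delta_2)/\sigma_u(z;\delta_1)\big)}{\Phi\big(\mu(z;\delta_2)/\sigma_u(z;\delta_1)\big)}\right]. \tag{34}$$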

Whether it is $\sigma_u$ or μ that is held constant, z still influences the mean of
inefficiency; only when both are constant does z drop out. This is what makes the
choice of where to incorporate z difficult when using the truncated-normal distribution.
Parametric specification of either $\sigma_u$ or μ will allow z to influence expected
inefficiency, but in different manners and in nonlinear fashion. Given that μ can be
positive or negative, it is common to model it in a linear fashion, i.e.,
$\mu(z;\delta_2) = z'\delta_2$, and to model $\sigma_u(z;\delta_1)$ as $e^{z'\delta_1}$, to ensure
positivity of the pre-truncation standard deviation.
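To see how z moves expected inefficiency through either pre-truncation parameter, the conditional mean in (34) can be evaluated directly. The sketch below assumes μ(z; δ_2) = z'δ_2 and σ_u(z; δ_1) = e^{z'δ_1} as in the text; the coefficient values are invented for illustration.

```python
import math

def phi(a):
    # Standard normal pdf
    return math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)

def Phi(a):
    # Standard normal cdf
    return 0.5 * math.erfc(-a / math.sqrt(2.0))

def mean_truncated_normal(mu, sigma_u):
    """Equation (34): mean of N(mu, sigma_u^2) truncated below at zero."""
    a = mu / sigma_u
    return sigma_u * (a + phi(a) / Phi(a))

def expected_u(z, delta1, delta2):
    """E[u|z] with mu = z'delta2 and sigma_u = exp(z'delta1)."""
    mu = sum(d * v for d, v in zip(delta2, z))
    sigma_u = math.exp(sum(d * v for d, v in zip(delta1, z)))
    return mean_truncated_normal(mu, sigma_u)

for zval in (0.0, 1.0, 2.0):
    z = [1.0, zval]  # constant plus one determinant
    only_mu = expected_u(z, [0.0, 0.0], [0.2, 0.5])     # sigma_u fixed at 1
    only_sigma = expected_u(z, [0.0, 0.5], [0.2, 0.0])  # mu fixed at 0.2
    print(zval, round(only_mu, 4), round(only_sigma, 4))
```

Both columns change with z, which is the point made above: fixing one parameter does not stop z from affecting the mean through the other.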
When we assume that u has the half-normal distribution, our choice is easy
because only a single parameter exists and it is clear where z enters. However, in
the truncated-normal setup, we could elect to have z enter only through the pre-
truncation mean, only through the pre-truncation standard deviation, or both. In
fact, various applied papers have used any of these three approaches. Kumbhakar
et al. [58] and Reifschneider and Stevenson [83] modeled the impact of determinants
of inefficiency through μ,31 while [24] incorporated determinants through σu .32
Lastly, [105] modeled the determinants through both μ and σu . The benefit of

31 See also [49] and [15] for early approaches following this strategy.
32 Other early approaches that followed this route include [25] and [39].

modeling both pre-truncation parameters jointly as functions of z is that this
leaves little room for ambiguity and makes inference of where z belongs a viable
option. The costs are that the model is more complex to estimate and may lead
to identification problems, as raised in [85]. An alternative approach, which we
discuss next, is to invoke a special assumption on the distribution that makes it more
amenable to modeling the influence of determinants of inefficiency in the SFM.

The Scaling Property

Many of the main proposals to incorporate determinants of inefficiency did so
through the normal truncated-normal SFM. The two-parameter nature of the
truncated-normal distribution implies that determinants could influence the pre-
truncation mean, μ, the pre-truncation variance, σu2 , or both. Further still, different
variables could influence each parameter.
A popular simplification [90, 106], which encapsulates the normal-half-normal
SFM, is to assume that inefficiency behaves as

$$u_i \sim g(z_i;\delta)\, u_i^*, \tag{35}$$

where $g(\cdot) \ge 0$ is a function of the exogenous variables, while $u_i^* \ge 0$ is a
random variable. This behavior is known as the scaling property. Single-parameter
distributions, such as the half-normal and the exponential, automatically possess this
property, but more flexible distributions, such as truncated-normal or gamma, can
have this property imposed. The key feature of the scaling property is that $u_i^*$ does
not depend on $z_i$ in any fashion; $u_i^*$ is known as base inefficiency [8, 106].
When a distribution possesses the scaling property, the shape of the distribution
of ui is the same for all firms, which can be viewed as an attractive feature. The
scaling function, g(·), expands or contracts the horizontal axis so that the scale of the
distribution of ui changes while preserving the underlying shape of the distribution.
In comparison, the normal truncated-normal SFM allows different scalings
for each $u_i$, so that for some firms the distribution of inefficiency is close to a
normal (if the pre-truncation mean is large), while for other firms, the distribution
of inefficiency is the extreme right tail of a normal with a mode of zero (if the
pre-truncation mean is negative). By contrast, for a model with the scaling property,
the mean and the standard deviation of u change with $z_i$, but the shape of the
distribution is fixed.
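One way to see the fixed-shape claim numerically is to compare the coefficient of variation (standard deviation over mean) across firms. The sketch below assumes a truncated-normal base inefficiency u* ~ N+(μ, σ²), so that g·u* ~ N+(gμ, g²σ²) under the scaling property, and contrasts it with a model in which only the pre-truncation mean varies. The parameter values and function names are illustrative assumptions.

```python
import math

def phi(a):
    return math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)

def Phi(a):
    return 0.5 * math.erfc(-a / math.sqrt(2.0))

def moments_truncated_normal(mu, sigma):
    """Mean and standard deviation of N(mu, sigma^2) truncated below at zero."""
    a = mu / sigma
    lam = phi(a) / Phi(a)
    mean = sigma * (a + lam)
    var = sigma ** 2 * (1.0 - a * lam - lam ** 2)
    return mean, math.sqrt(var)

def cv_scaling(g, mu=0.5, sigma=1.0):
    # Scaling property: both pre-truncation parameters are multiplied by g.
    mean, sd = moments_truncated_normal(g * mu, g * sigma)
    return sd / mean

def cv_mean_shift(m, sigma=1.0):
    # No scaling property: only the pre-truncation mean changes across firms.
    mean, sd = moments_truncated_normal(m, sigma)
    return sd / mean

for g in (0.5, 1.0, 2.0):
    print("scaling, g =", g, round(cv_scaling(g), 6))
for m in (-1.0, 0.0, 2.0):
    print("mean shift, mu =", m, round(cv_mean_shift(m), 6))
```

Under scaling the ratio is the same for every g, whereas shifting only the pre-truncation mean changes it, reflecting a change in shape rather than scale alone.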
Another advantage of the scaling property specification is the ease of interpretation
of δ when $g(z_i;\delta) = e^{z_i'\delta}$,

$$\frac{\partial \ln E[u_i \mid z_i]}{\partial z_j} = \delta_j. \tag{36}$$

That is, $\delta_j$ is the semi-elasticity (or elasticity if z is already measured on the
logarithmic scale) of expected inefficiency with respect to the jth element of z, and
more importantly, this interpretation is distinct from any distributional assumption
placed on $u^*$. An interpretation of this ilk is generally not available in other model
specifications. Further, the sign of the elements of δ can be directly interpreted.
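The step behind (36) can be spelled out in a few lines. Here μ* = E[u*] denotes the mean of base inefficiency, assumed strictly positive, and the derivation uses only the fact that u* does not depend on z:

```latex
\begin{aligned}
E[u_i \mid z_i] &= g(z_i;\delta)\,E[u_i^*] = e^{z_i'\delta}\,\mu^*, \\
\ln E[u_i \mid z_i] &= z_i'\delta + \ln \mu^*, \\
\frac{\partial \ln E[u_i \mid z_i]}{\partial z_j} &= \delta_j .
\end{aligned}
```

No feature of the distribution of u* other than its mean enters, and even the mean drops out after differentiation, which is why the interpretation does not rest on a distributional assumption.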
The scaling property provides an attractive economic interpretation as well. $u^*$
can be interpreted as a benchmark level of inefficiency of the firm, reflecting, for
instance, underlying managerial talent [8]. The scaling function then determines how
well the firm exploits (or fails to exploit) these talents through other
variables, z, which might include experience of the plant manager, the operating
environment of the firm, or regulatory restrictions.
The scaling property is not a fundamental feature; rather, as with the choice of
distribution on u, it is an assumption on the features of the inefficiency distribution.
As such it can be tested against models that do not possess this property for the
inefficiency distribution. As it currently stands, all tests of the scaling property
hinge on a given distributional assumption, for example, estimating the normal
truncated-normal SFM and then estimating a restricted version of the same model
but imposing the scaling property. An important avenue for future research is
the development of a test (or tests) that does not require specific distributional
assumptions.

Estimation Without Imposing Distributional Assumptions

In settings where the researcher is comfortable with imposing the scaling property
on the distribution of inefficiency, the SFM can be estimated without parametric
distributional assumptions. This is perhaps the key benefit of invoking the scaling
property. To understand how it is possible to estimate the SFM without distributional
assumptions, we expound on the discussion of [8,90,106]. The SFM with the scaling
property can be written as33

$$y_i = m(x_i;\beta) + v_i - e^{z_i'\delta} u_i^*. \tag{37}$$

The conditional mean of y given x and z is

$$E[y \mid x, z] = m(x;\beta) - e^{z'\delta}\mu^*, \tag{38}$$

where $\mu^* = E[u^*]$ and $E[v \mid x, z] = 0$. The SFM is then

$$y_i = m(x_i;\beta) - e^{z_i'\delta}\mu^* + v_i - e^{z_i'\delta}(u_i^* - \mu^*) = m(x_i;\beta) - e^{z_i'\delta}\mu^* + \varepsilon_i^*, \tag{39}$$

with $\varepsilon_i^* = v_i - e^{z_i'\delta}(u_i^* - \mu^*)$, which, for a given parameterization of $m(x_i;\beta)$, can
be estimated using nonlinear least squares (NLS) as

33 Note here that we are making the implicit assumption that z is different from x. The nonlinearity
of the scaling function does allow z and x to overlap however.

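$$\left(\hat\beta, \hat\delta, \hat\mu^*\right) = \arg\min_{\beta,\delta,\mu^*} \; n^{-1} \sum_{i=1}^{n} \left[y_i - m(x_i;\beta) + \mu^* e^{z_i'\delta}\right]^2. \tag{40}$$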

The elegance of invoking the scaling property is that the SFM can be estimated
in a distribution-free manner via NLS; the need for NLS stems from the fact that
the scaling function must be positive, and if it was specified as linear, this would be
inconsistent with theoretical requirements on the variance of the distribution.
Direct NLS will produce a consistent estimator of all of the terms of the SFM.
However, the error term εi∗ is heteroskedastic,

$$\operatorname{var}(\varepsilon_i^* \mid x_i, z_i) = \sigma_v^2 + \sigma_{u^*}^2 e^{2 z_i'\delta},$$

where $\sigma_v^2 = \operatorname{var}(v_i)$ and $\sigma_{u^*}^2 = \operatorname{var}(u_i^*)$. As such, a generalized NLS estimator
would be called for to produce an efficient estimator (one comparable to the MLE).
Unfortunately, a generalized NLS algorithm hinges on distributional assumptions
to appropriately separate $\sigma_v^2$ and $\sigma_{u^*}^2$. An alternative, which allows valid inference
to be undertaken, is to compute heteroskedasticity-robust standard errors for β and
δ [110].
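The criterion in (40) can be minimized in many ways. The sketch below uses one convenient shortcut that is not taken from the source: for a fixed δ the model is linear in (β, μ*), so δ can be profiled over a grid with OLS solving the inner problem. It assumes a linear frontier m(x; β) = β_0 + β_1 x and a single determinant z entered without a constant, since a constant inside z'δ could not be told apart from μ*. All names and simulated values are hypothetical, and robust standard errors are not computed.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    sol = [0.0] * n
    for r in reversed(range(n)):
        sol[r] = (M[r][n] - sum(M[r][k] * sol[k] for k in range(r + 1, n))) / M[r][r]
    return sol

def ols(X, y):
    """Least-squares coefficients and the sum of squared residuals."""
    k = len(X[0])
    XtX = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]
    Xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(k)]
    coef = solve(XtX, Xty)
    sse = sum((yi - sum(c * v for c, v in zip(coef, r))) ** 2 for r, yi in zip(X, y))
    return coef, sse

def profiled_nls(y, x, z, grid):
    """Minimize (40) over a grid for delta; returns (b0, b1, mu_star, delta)."""
    best = None
    for d in grid:
        # Regressor -exp(d*z) carries the coefficient mu_star.
        X = [[1.0, xi, -math.exp(d * zi)] for xi, zi in zip(x, z)]
        (b0, b1, mu), sse = ols(X, y)
        if best is None or sse < best[0]:
            best = (sse, b0, b1, mu, d)
    return best[1:]

# Simulated data: half-normal base inefficiency, true delta = 0.6.
random.seed(0)
n = 400
x = [random.uniform(0, 5) for _ in range(n)]
z = [random.uniform(0, 1) for _ in range(n)]
y = [1.0 + 2.0 * xi + random.gauss(0, 0.2) - math.exp(0.6 * zi) * abs(random.gauss(0, 0.5))
     for xi, zi in zip(x, z)]
print(profiled_nls(y, x, z, [k / 10 for k in range(1, 16)]))
```

The grid deliberately excludes δ = 0, where the scaling regressor collapses to a constant and the design matrix becomes singular; this is the same identification failure discussed for the test of δ = 0.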
An interesting extension of this idea was recently proposed by [81] for the setting
where $u_i$ has already been converted into technical efficiency. In this case the level
of efficiency must be bounded between 0 and 1. To account for this, [81] model
the impact of z on the level of efficiency through a probit function. Again, given
the nonlinear nature of the probit function, this necessitates the use of NLS if one
wishes to eschew distributional assumptions.
With the wide range of statistical software that can quickly implement an
NLS problem, it is perhaps surprising that this avenue has not been exploited
in applied research. Certainly the scaling property is an assumption that requires
judicious justification, but not more so than distributional assumptions imposed
on the composed error structure of the SFM. It is also possible in this nonlinear
setup that the calculation of expected firm efficiency can be done without requiring
distributional assumptions, leading to the potential for more robust conclusions
regarding observation-specific inefficiency. It is also possible to estimate the SFM
in (39) without imposing assumptions on the scaling function.
Currently no test of the scaling property exists without enforcing distributional
assumptions. Alvarez et al. [8] proposed standard tests of the scaling property
by using the nesting structure of the normal-truncated-normal distributional pair
against the normal-half-normal distributional pair. Unfortunately this testing facility
requires distributional assumptions on both vi and ui . A test of the statistical
significance of the determinants of inefficiency, using the NLS framework just
described, is available [53]. Under H0 : δ = 0 it follows that

$$y_i = m(x_i;\beta) - \mu^* e^{z_i'\delta} + \varepsilon_i^* = m^*(x_i;\beta) + \mu^*(1 - e^{z_i'\delta}) + \varepsilon_i^*,$$

where $m^*(x_i;\beta) = m(x_i;\beta) + c$ for a constant c (here $c = -\mu^*$). That is, one can only identify $\mu^*$
if at least one element of δ is nonzero; note that under $H_0$, $1 - e^{z_i'\delta} = 0$, so $\mu^*$ cannot
be separately identified. This lack of identification creates issues for inference under
the null hypothesis and invalidates the common asymptotic behavior of Wald and
likelihood ratio tests. The solution that [53] proposed to avoid this problem is
to use the Lagrange multiplier (LM) test, which requires estimation only under the
null hypothesis. A novel insight of [53] is that the LM test they proposed has power
in directions where the scaling property does not hold. This is because
the model under $H_0$ is indifferent to “how” inefficiency enters the model.
Thus, while an explicit test of the scaling property without requiring distributional
assumptions would be a useful tool, the [53] LM test is likely to be sufficient.
The LM test is based on the derivative of the NLS criterion function in (40) with
respect to δ, evaluated at the restricted estimates (δ = 0):

$$\frac{2}{n} \sum_{i=1}^{n} \left[y_i - m(x_i;\beta) + \mu^*\right] \mu^* z_i. \tag{41}$$

The test statistic is designed to determine how close the derivative of the NLS
objective function (with respect to the parameters under the null hypothesis) is to 0.
If the parameter restrictions are true, then this should be close to 0. The reason that
distributional assumptions are not needed for this test to work properly is that this
test is identical to an F-test, and F-tests are invariant to the scale of the covariates
[53]. Thus, one can simply set $\mu^* = 1$ and use NLS to regress y on (x, z) and test
the significance of δ.
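Because z enters the score in (41) linearly at δ = 0, the test can be illustrated with an auxiliary linear regression. The sketch below is one reading of that idea, not necessarily the exact statistic of [53]: it assumes a linear frontier, regresses y on a constant, x and z by OLS, and forms the textbook F statistic for excluding z. Under the null the composed error v − (u* − μ*) is homoskedastic, so the classical form is used here. Names and simulated values are hypothetical.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    sol = [0.0] * n
    for r in reversed(range(n)):
        sol[r] = (M[r][n] - sum(M[r][k] * sol[k] for k in range(r + 1, n))) / M[r][r]
    return sol

def sse(X, y):
    """Sum of squared OLS residuals from regressing y on the columns of X."""
    k = len(X[0])
    XtX = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]
    Xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(k)]
    coef = solve(XtX, Xty)
    return sum((yi - sum(c * v for c, v in zip(coef, r))) ** 2 for r, yi in zip(X, y))

def f_statistic(y, x, z):
    """F statistic for excluding every column of z from y ~ 1 + x + z."""
    n = len(y)
    X_r = [[1.0] + list(xi) for xi in x]           # restricted: no z
    X_u = [r + list(zi) for r, zi in zip(X_r, z)]  # unrestricted: with z
    q = len(z[0])
    sse_r, sse_u = sse(X_r, y), sse(X_u, y)
    return ((sse_r - sse_u) / q) / (sse_u / (n - len(X_u[0])))

random.seed(0)
n = 300
x = [[random.uniform(0, 5)] for _ in range(n)]
z = [[random.uniform(0, 1)] for _ in range(n)]
v = [random.gauss(0, 0.2) for _ in range(n)]
u_star = [abs(random.gauss(0, 0.5)) for _ in range(n)]
y_null = [1 + 2 * xi[0] + vi - ui for xi, vi, ui in zip(x, v, u_star)]
y_alt = [1 + 2 * xi[0] + vi - math.exp(zi[0]) * ui for xi, zi, vi, ui in zip(x, z, v, u_star)]
print("F under delta = 0:", f_statistic(y_null, x, z))
print("F under delta = 1:", f_statistic(y_alt, x, z))
```

The statistic would be compared with an F(q, n − k) critical value, where q is the number of determinants and k the number of columns in the unrestricted regression.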

Estimation When Determinants of Efficiency and Endogeneity Are Present

Quite recently, attention has focused on estimation of the SFM when some of the
determinants of inefficiency may be endogenous [11, 65]. These models can be
estimated using traditional instrumental variables methods. However, given that the
determinants of inefficiency enter the model nonlinearly, nonlinear methods are
required. To begin, we consider the model of [11],

$$y_i = x_i'\beta + v_i - u_i = x_i'\beta + v_i - u_i^* e^{z_i'\delta}, \tag{42}$$

where the scaling property has been invoked. The covariates x i and zi are
partitioned as

$$x_i = \begin{bmatrix} x_{1i} \\ x_{2i} \end{bmatrix}, \qquad z_i = \begin{bmatrix} z_{1i} \\ z_{2i} \end{bmatrix},$$

where x 1i and z1i are exogenous and x 2i and z2i are endogenous. The set of
instruments used to combat endogeneity are defined as

$$w_i = \begin{bmatrix} x_{1i} \\ z_{1i} \\ q_i \end{bmatrix},$$

where $q_i$ are the traditional outside instruments. Identification of all the parameters
requires that the dimension of q be at least as large as the dimension of $x_2$ plus the
dimension of $z_2$ (the order condition).
In the model of [11], endogeneity arises through correlation between a variable
in the model ($x_2$ and/or $z_2$) and noise, v. That is, both x and z are assumed to
be independent of base inefficiency $u^*$. Given that $E[u_i]$ is not constant, the COLS
approach to deal with endogeneity proposed by [10] cannot be used here. To develop
an appropriate estimator, add and subtract the mean of inefficiency to produce a
composed error term that has mean 0,

$$y_i = x_i'\beta - \mu^* e^{z_i'\delta} + v_i - (u_i^* - \mu^*) e^{z_i'\delta}. \tag{43}$$

Proper estimation through instrumental variables requires that the moment condition

$$E\left[v_i - (u_i^* - \mu^*) e^{z_i'\delta} \mid w_i\right] = 0 \tag{44}$$

holds. The nonlinearity of this moment condition necessitates the use of nonlinear
two-stage least squares (NL2SLS) [9].
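For reference, a generic NL2SLS criterion in the sense of [9] can be written for the model in (43). The symbols e_i(θ), θ and W are introduced here for illustration and are not taken from [11]; W is assumed to be the n × L matrix whose rows collect the instrument functions built from w_i.

```latex
\begin{aligned}
e_i(\theta) &= y_i - x_i'\beta + \mu^* e^{z_i'\delta},
  \qquad \theta = (\beta, \delta, \mu^*), \\
\hat\theta_{NL2SLS} &= \arg\min_{\theta}\;
  e(\theta)'\, W \left(W'W\right)^{-1} W'\, e(\theta),
  \qquad e(\theta) = \big(e_1(\theta), \dots, e_n(\theta)\big)'.
\end{aligned}
```

At the true parameter values e_i(θ) equals the composed error v_i − (u_i^* − μ^*) e^{z_i'δ} from (43), so the criterion is the sample counterpart of the moment condition in (44).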
Latruffe et al. [65] use a setup similar to that of [11], based on the model in (42), but
develop a four-step estimator for the parameters; additionally, only $x_2$ is treated as
endogenous. Latruffe et al.’s [65] approach is based on [26], using the construction of
efficient moment conditions. The vector of instruments proposed in [65] is defined
as

$$w_i(\gamma, \delta) = \begin{bmatrix} x_{1i} \\ q_i'\gamma \\ z_i e^{z_i'\delta} \end{bmatrix}, \tag{45}$$

where $q_i'\gamma$ captures the linear projection of $x_2$ on the external instruments q. The
four-step estimator is defined as

Step 1 Regress $x_2$ on q to estimate γ. Denote the OLS estimator of γ as $\hat\gamma$.

Step 2 Use NLS to estimate the SFM in (42). Denote the NLS estimates of (β, δ)
as $(\ddot\beta, \ddot\delta)$. Use the NLS estimate of δ and the OLS estimate of γ in Step 1 to
construct the instruments $w_i(\hat\gamma, \ddot\delta)$.

Step 3 Using the estimated instrument vector $w_i(\hat\gamma, \ddot\delta)$, calculate the NL2SLS
estimator of (β, δ) as $(\tilde\beta, \tilde\delta)$. Use the NL2SLS estimate of δ and the OLS estimate
of γ in Step 1 to construct the instruments $w_i(\hat\gamma, \tilde\delta)$.

Step 4 Using the estimated instrument vector $w_i(\hat\gamma, \tilde\delta)$, calculate the NL2SLS
estimator of (β, δ) as $(\hat\beta, \hat\delta)$.
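The bookkeeping in Step 1 and the instrument vector in (45) can be sketched for the simplest possible case. The code below assumes scalar x_1, x_2, q and z and a first stage without an intercept; these simplifications and all function names are assumptions made for illustration, and the NLS and NL2SLS fits of Steps 2 to 4 are not implemented.

```python
import math

def first_stage_gamma(x2, q):
    """Step 1 with one instrument and no intercept: OLS slope of x2 on q."""
    return sum(a * b for a, b in zip(q, x2)) / sum(a * a for a in q)

def instruments(x1, q, z, gamma_hat, delta_hat):
    """Rows (x1_i, q_i * gamma_hat, z_i * exp(z_i * delta_hat)) as in (45)."""
    return [[a, b * gamma_hat, c * math.exp(c * delta_hat)]
            for a, b, c in zip(x1, q, z)]

# Made-up data: x1 exogenous, x2 endogenous, q an outside instrument.
x1 = [1.0, 1.0, 1.0]
x2 = [2.0, 4.0, 6.0]
q = [1.0, 2.0, 3.0]
z = [0.0, 1.0, 2.0]

gamma_hat = first_stage_gamma(x2, q)
delta_start = 0.3  # placeholder for the NLS estimate of delta from Step 2
for row in instruments(x1, q, z, gamma_hat, delta_start):
    print(row)
```

In Steps 3 and 4 the same `instruments` call would be repeated with the updated estimate of δ, which is why the instrument matrix has to be rebuilt between the two NL2SLS fits.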

This multistep estimator is necessary in the context of efficient moments because
the actual set of instruments is not used directly; rather $w_i(\gamma, \delta)$ is used, and this
instrument vector requires estimates of γ and δ. The first two steps of the algorithm
are designed to construct estimates of these two unknown parameter vectors. The
third step then is designed to construct a consistent estimator of w i (γ , δ), which is
not done in Step 2 given that the endogeneity of x 2 is ignored (note that NLS is
used as opposed to NL2SLS). The iteration from Step 2 to Step 3 does produce a
consistent estimator of wi (γ , δ), and as such, Step 4 produces consistent estimators
for β and δ. While [65] proposed a set of efficient moment conditions to handle
endogeneity, the model of [11] is more general because it can handle endogeneity
in the determinants of inefficiency as well.

Conclusions

In this chapter we covered the workhorse SFM and discussed avenues to include
determinants of inefficiency and productivity and how to deal with potential
endogeneity issues. This material should give the general reader a good
stepping-stone to the next chapter, where we discuss various ways
to analyze stochastic frontier problems with panel data, how to adapt quantile
estimation to SFA, how to use robust methods involving nonparametric regression
and local likelihood, as well as what software can be used to implement the
various methods of stochastic frontier analysis. We defer further concluding
remarks until the end of Chapter 9 “Stochastic Frontier Analysis: Foundations and
Advances II”.

Cross-References

Activity Analysis in Production Economics

Conceptualization and Measurement of Productivity Growth and Technical
Change: A Nonparametric Approach
Cost, Revenue, and Profit Function Estimates
Data Envelopment Analysis: A Nonparametric Method of Production Analysis
Distance Functions in Production Economics
Duality in Production
Elasticities of Substitution
Modeling Technical Change: Theory and Practice
Multiproduct Technologies

Reminiscences of “Returns to Scale in Electricity Supply”

Scale Elasticity and Returns to Scale

References
1. Afriat SN (1972) Efficiency estimation of production functions. Int Econ Rev 13(3):568–598
2. Ahmad IA, Li Q (1997) Testing symmetry of an unknown density function by kernel method.
J Nonparametric Stat 7:279–293
3. Aigner D, Chu S (1968) On estimating the industry production function. Am Econ Rev
58:826–839
4. Aigner DJ, Lovell CAK, Schmidt P (1977) Formulation and estimation of stochastic frontier
production functions. J Econ 6(1):21–37
5. Ali M, Flinn JC (1989) Profit efficiency among Basmati rice producers in Pakistan Punjab.
Am J Agric Econ 71(2):303–310
6. Almanidis P, Qian J, Sickles RC (2014) Stochastic frontier models with bounded inefficiency.
In: Sickles RC, Horrace WC (eds) Festschrift in honor of Peter Schmidt econometric methods
and applications. Springer, New York, pp 47–82
7. Almanidis P, Sickles RC (2011) The skewness issue in stochastic frontier models: fact or
fiction? In: van Keilegom I, Wilson PW (eds) Exploring research frontiers in contemporary
statistics and econometrics. Springer, Berlin
8. Alvarez A, Amsler C, Orea L, Schmidt P (2006) Interpreting and testing the scaling property
in models where inefficiency depends on firm characteristics. J Prod Anal 25(2):201–212
9. Amemiya T (1974) The nonlinear two-stage least-squares estimator. J Econ 2:105–111
10. Amsler C, Prokhorov A, Schmidt P (2016) Endogeneity in stochastic frontier models. J Econ
190:280–288
11. Amsler C, Prokhorov A, Schmidt P (2017) Endogenous environmental variables in stochastic
frontier models. J Econ 199:131–140
12. Azzalini A (1985) A class of distributions which includes the normal ones. Scand J Stat
12(2):171–178
13. Battese GE, Coelli TJ (1988) Prediction of firm-level technical efficiencies with a generalized
frontier production function and panel data. J Econ 38:387–399
14. Battese GE, Coelli TJ (1992) Frontier production functions, technical efficiency and panel
data: with application to paddy farmers in India. J Prod Anal 3:153–169
15. Battese GE, Coelli TJ (1995) A model for technical inefficiency effects in a stochastic frontier
production function for panel data. Empir Econ 20(1):325–332
16. Battese GE, Corra GS (1977) Estimation of a production frontier model: with application to
the pastoral zone of Eastern Australia. Aust J Agric Econ 21(3):169–179
17. Benabou R, Tirole J (2016) Mindful economics: the production, consumption, and value of
beliefs. J Econ Perspect 30(3):141–164
18. Bera AK, Sharma SC (1999) Estimating production uncertainty in stochastic frontier
production function models. J Prod Anal 12(2):187–210
19. Bloom N, Lemos R, Sadun R, Scur D, Van Reenen J (2016) International data on measuring
management practices. Am Econ Rev 106(5):152–156
20. Bonanno G, De Giovanni D, Domma F (2017) The ‘wrong skewness’ problem: a re-
specification of stochastic frontiers. J Prod Anal 47(1):49–64
21. Bravo-Ureta BE, Rieger L (1991) Dairy farm efficiency measurement using stochastic
frontiers and neoclassical duality. Am J Agric Econ 73(2):421–428
22. Carree MA (2002) Technological inefficiency and the skewness of the error component in
stochastic frontier analysis. Econ Lett 77(1):101–107
23. Case B, Ferrari A, Zhao T (2013) Regulatory reform and productivity change in Indian
banking. Rev Econ Stat 95(3):1066–1077

24. Caudill SB, Ford JM (1993) Biases in frontier estimation due to heteroskedasticity. Econ Lett
41(1):17–20
25. Caudill SB, Ford JM, Gropper DM (1995) Frontier estimation and firm-specific inefficiency
measures in the presence of heteroskedasticity. J Bus Econ Stat 13(1):105–111
26. Chamberlain G (1987) Asymptotic efficiency in estimation with conditional moment restric-
tions. J Econ 34(2):305–334
27. Chen Y-Y, Schmidt P, Wang H-J (2014) Consistent estimation of the fixed effects stochastic
frontier model. J Econ 181(2):65–76
28. Coelli TJ (1995) Estimators and hypothesis tests for a stochastic frontier function: a Monte
Carlo analysis. J Prod Anal 6(4):247–268
29. Dugger R (1974) An application of bounded nonparametric estimating functions to the
analysis of bank cost and production functions, Ph.D. thesis, University of North Carolina,
Chapel Hill
30. Feng Q, Horrace WC, Wu GL (2015) Wrong skewness and finite sample correction in
parametric stochastic frontier models. Center for Policy Research – The Maxwell School,
working paper N. 154
31. Gabrielsen A (1975) On estimating efficient production functions. Working Paper No. A-85,
Chr. Michelsen Institute, Department of Humanities and Social Sciences, Bergen
32. Gagnepain P, Ivaldi M (2002) Stochastic frontiers and asymmetric information models. J Prod
Anal 18(2):145–159
33. Greene W (2004) Distinguishing between heterogeneity and inefficiency: stochastic frontier
analysis of the World Health Organization’s panel data on national health care systems. Health
Econ 13(9):959–980
34. Greene WH (1980a) Maximum likelihood estimation of econometric frontier functions. J
Econ 13(1):27–56
35. Greene WH (1980b) On the estimation of a flexible frontier production model. J Econ
13(1):101–115
36. Greene WH (1990) A gamma-distributed stochastic frontier model. J Econ 46(1–2):
141–164
37. Greene WH (2003) Simulated likelihood estimation of the normal-gamma stochastic frontier
function. J Prod Anal 19(2):179–190
38. Greene WH (2008) The econometric approach to efficiency analysis. In: Knox Lovell CA,
Fried HO, Schmidt SS (eds) The measurement of productive efficiency and productivity
change, chapter 2. Oxford University Press, Oxford, UK
39. Hadri K (1999) Estimation of a doubly heteroscedastic stochastic frontier cost function. J Bus
Econ Stat 17(4):359–363
40. Hafner C, Manner H, Simar L (2016) The “wrong skewness” problem in stochastic frontier
model: a new approach. Econometric Reviews. forthcoming
41. Hansen C, McDonald JB, Newey WK (2010) Instrumental variables estimation with flexible
distributions. J Bus Econ Stat 28:13–25
42. Hattori T (2002) Relative performance of U.S. and Japanese electricity distribution: an
application of stochastic frontier analysis. J Prod Anal 18(3):269–284
43. Henderson DJ, Parmeter CF (2015) A consistent bootstrap procedure for nonparametric
symmetry tests. Econ Lett 131:78–82
44. Hjalmarsson L, Kumbhakar SC, Heshmati A (1996) DEA, DFA, and SFA: a comparison. J
Prod Anal 7(2):303–327
45. Hollingsworth B (2008) The measurement of efficiency and productivity of health care
delivery. Health Econ 17(10):1107–1128
46. Horrace WC, Parmeter CF (2014) A Laplace stochastic frontier model. University of Miami
Working Paper
47. Horrace WC, Schmidt P (1996) Confidence statements for efficiency estimates from stochas-
tic frontier models. J Prod Anal 7:257–282
48. Horrace WC, Wright IA (2016) Stationary points for parametric stochastic frontier models.
Center for Policy Research – The Maxwell School, working paper N. 196

49. Huang CJ, Liu J-T (1994) Estimation of a non-neutral stochastic frontier production function.
J Prod Anal 5(1):171–180
50. Jondrow J, Lovell CAK, Materov IS, Schmidt P (1982) On the estimation of technical
efficiency in the stochastic frontier production function model. J Econ 19(2/3):233–238
51. Kalirajan KP (1990) On measuring economic efficiency. J Appl Econ 5(1):75–85
52. Karakaplan MU, Kutlu L (2013) Handling endogeneity in stochastic frontier analysis.
Unpublished manuscript
53. Kim M, Schmidt P (2008) Valid test of whether technical inefficiency depends on firm
characteristics. J Econ 144(2):409–427
54. Knittel CR (2002) Alternative regulatory methods and firm efficiency: stochastic frontier
evidence from the U.S. electricity industry. Rev Econ Stat 84(3):530–540
55. Kumbhakar SC (1987) The specification of technical and allocative inefficiency in stochastic
production and profit frontiers. J Econ 34(1):335–348
56. Kumbhakar SC (2011) Estimation of production technology when the objective is to
maximize return to the outlay. Eur J Oper Res 208:170–176
57. Kumbhakar SC (2013) Specification and estimation of multiple output technologies: a primal
approach. Eur J Oper Res 231:465–473
58. Kumbhakar SC, Ghosh S, McGuckin JT (1991) A generalized production frontier approach
for estimating determinants of inefficiency in US dairy farms. J Bus Econ Stat 9(1):279–286
59. Kumbhakar SC, Lovell CAK (2000) Stochastic frontier analysis. Cambridge University Press,
Cambridge
60. Kumbhakar SC, Wang H-J (2006) Estimation of technical and allocative inefficiency: a primal
system approach. J Econ 134(3):419–440
61. Kumbhakar SC, Wang H-J, Horncastle A (2015) A practitioner’s guide to stochastic frontier
analysis using Stata. Cambridge University Press, Cambridge, UK
62. Kuosmanen T (2012) Stochastic semi-nonparametric frontier estimation of electricity distri-
bution networks: application of the StoNED method in the Finnish regulatory model. Energy
Econ 34:2189–2199
63. Kuosmanen T, Fosgerau M (2009) Neoclassical versus frontier production models? Testing
for the skewness of regression residuals. Scand J Econ 111(2):351–367
64. Kutlu L (2010) Battese-Coelli estimator with endogenous regressors. Econ Lett 109:79–81
65. Latruffe L, Bravo-Ureta BE, Carpentier A, Desjeux Y, Moreira VH (2017) Subsidies and
technical efficiency in agriculture: evidence from European dairy farms. Am J Agric Econ
99:783–799
66. Lee L (1983) A test for distributional assumptions for the stochastic frontier function. J Econ
22(2):245–267
67. Lee L-F, Tyler WG (1978) The stochastic frontier production function and average efficiency:
an empirical analysis. J Econ 7:385–389
68. Li Q (1996) Estimating a stochastic production frontier when the adjusted error is symmetric.
Econ Lett 52(3):221–228
69. Lien G, Kumbhakar SC, Hardaker JB (2017) Accounting for risk in productivity analysis: an
application to Norwegian dairy farming. J Prod Anal 47(3):247–257
70. Lovell CAK (1993) Production frontiers and productive efficiency. In: Knox Lovell CA,
Fried HO, Schmidt SS (eds) The measurement of productive efficiency, chapter 1. Oxford
University Press, Oxford, UK
71. McFadden D (1989) A method of simulated moments for estimation of discrete response
models without numerical integration. Econometrica 57(5):995–1026
72. Meeusen W, van den Broeck J (1977a) Efficiency estimation from Cobb-Douglas production
functions with composed error. Int Econ Rev 18(2):435–444
73. Meeusen W, van den Broeck J (1977b) Technical efficiency and dimension of the firm: some
results on the use of frontier production functions. Empir Econ 2(2):109–122
74. Mutter RL, Greene WH, Spector W, Rosko MD, Mukamel DB (2013) Investigating the impact
of endogeneity on inefficiency estimates in the application of stochastic frontier analysis to
nursing homes. J Prod Anal 39(1):101–110

75. Nguyen NB (2010) Estimation of technical efficiency in stochastic frontier analysis. Ph.D.
thesis, Bowling Green State University
76. O’Hagan A, Leonard T (1976) Bayes estimation subject to uncertainty about parameter
constraints. Biometrika 63(1):201–203
77. Olson JA, Schmidt P, Waldman DA (1980) A Monte Carlo study of estimators of stochastic
frontier production functions. J Econ 13:67–82
78. Ondrich J, Ruggiero J (2001) Efficiency measurement in the stochastic frontier model. Eur J
Oper Res 129(3):434–442
79. Parmeter CF, Kumbhakar SC (2014) Efficiency analysis: a primer on recent advances. Found
Trends Econ 7(3–4):191–385
80. Parmeter CF, Zelenyuk V (2016) A bridge too far? the state of the art in combining the virtues
of stochastic frontier analysis and data envelopment analysis. University of Miami Working
Paper 2016-10
81. Paul S, Shankar S (2017) An alternative specification for technical efficiency effects in a
stochastic frontier production function. Crawford School Working Paper 1703
82. Pitt MM, Lee L-F (1981) The measurement and sources of technical inefficiency in the
Indonesian weaving industry. J Dev Econ 9(1):43–64
83. Reifschneider D, Stevenson R (1991) Systematic departures from the frontier: a framework
for the analysis of firm inefficiency. Int Econ Rev 32(1):715–723
84. Richmond J (1974) Estimating the efficiency of production. Int Econ Rev 15(2):515–521
85. Ritter C, Simar L (1997) Pitfalls of normal-gamma stochastic frontier models. J Prod Anal
8(2):167–182
86. Ruggiero J (1999) Efficiency estimation and error decomposition in the stochastic frontier
model: a Monte Carlo analysis. Eur J Oper Res 115(6):555–563
87. Schmidt P (1976) On the statistical estimation of parametric frontier production functions.
Rev Econ Stat 58(2):238–239
88. Schmidt P (2011) One-step and two-step estimation in SFA models. J Prod Anal 36(2):201–
203
89. Silvapulle M, Sen P (2005) Constrained statistical inference. Wiley, Hoboken
90. Simar L, Lovell CAK, van den Eeckaut P (1994) Stochastic frontiers incorporating exogenous
influences on efficiency. Discussion Papers No. 9403, Institut de Statistique, Universite de
Louvain
91. Simar L, Van Keilegom I, Zelenyuk V (2017) Nonparametric least squares methods for
stochastic frontier models. J Prod Anal 47(3):189–204
92. Simar L, Wilson PW (2010) Inferences from cross-sectional, stochastic frontier models. Econ
Rev 29(1):62–98
93. Simar L, Wilson PW (2013) Estimation and inference in nonparametric frontier models:
recent developments and perspectives. Found Trends Econ 5(2):183–337
94. Simar L, Wilson PW (2015) Statistical approaches for nonparametric frontier models: a
guided tour. Int Stat Rev 83(1):77–110
95. Solow R (1957) Technical change and the aggregate production function. Rev Econ Stat
39(3):312–320
96. Stevenson R (1980) Likelihood functions for generalized stochastic frontier estimation. J
Econ 13(1):58–66
97. Stiglitz JE, Greenwald BC (1986) Externalities in economies with imperfect information and
incomplete markets. Q J Econ 101(2):229–264
98. Taube R (1988) Möglichkeiten der Effizienzmessung von öffentlichen Verwaltungen. Duncker
& Humblot GmbH, Berlin
99. Timmer CP (1971) Using a probabilistic frontier production function to measure technical
efficiency. J Polit Econ 79(4):776–794
100. Tran KC, Tsionas EG (2013) GMM estimation of stochastic frontier models with endogenous
regressors. Econ Lett 118:233–236
101. Tsionas EG (2007) Efficiency measurement with the Weibull stochastic frontier. Oxf Bull
Econ Stat 69(5):693–706
102. Tsionas EG (2012) Maximum likelihood estimation of stochastic frontier models by the
Fourier transform. J Econ 170(2):234–248
103. Uekusa M, Torii A (1985) Stochastic production functions: an application to Japanese
manufacturing industry (in Japanese). Keizaigaku Ronsyu (Journal of Economics) 51(1):2–23
104. Waldman DM (1982) A stationary point for the stochastic frontier likelihood. J Econ
18(1):275–279
105. Wang H-J (2002) Heteroscedasticity and non-monotonic efficiency effects of a stochastic
frontier model. J Prod Anal 18(2):241–253
106. Wang H-J, Schmidt P (2002) One-step and two-step estimation of the effects of exogenous
variables on technical efficiency levels. J Prod Anal 18:129–144
107. Wang WS, Amsler C, Schmidt P (2011) Goodness of fit tests in stochastic frontier models. J
Prod Anal 35(1):95–118
108. Wang WS, Schmidt P (2009) On the distribution of estimated technical efficiency in stochastic
frontier models. J Econ 148(1):36–45
109. Wheat P, Greene B, Smith A (2014) Understanding prediction intervals for firm specific
inefficiency scores from parametric stochastic frontier models. J Prod Anal 42:55–65
110. White H (1980) A heteroskedasticity-consistent covariance matrix estimator and a direct test
for heteroskedasticity. Econometrica 48:817–838
111. Winsten CB (1957) Discussion on Mr. Farrell’s paper. J R Stat Soc Ser A Gen 120(3):282–284

9 Stochastic Frontier Analysis: Foundations and Advances II
Subal C. Kumbhakar, Christopher F. Parmeter, and Valentin Zelenyuk

Contents
Introduction
Panel Data
  Time-Invariant Technical Inefficiency Models
  Time-Varying Technical Inefficiency Models
  Models That Separate Firm Heterogeneity from Inefficiency
  Models That Separate Persistent and Time-Varying Inefficiency
  Models That Separate Firm Effects, Persistent Inefficiency, and Time-Varying Inefficiency
  The Four-Component Panel Data SFM with Determinants of Inefficiency
  Inference Across the Panel Data SFM
Nonparametric Estimation of the SFM
  Early Attempts
  Local Likelihood Methods
  Local Least-Squares Approaches
  Avoiding Distributional and (Some) Parametric Assumptions When Determinants of Inefficiency Are Present
  Future Directions in Semi- and Nonparametric Estimation and Inference of the SFM
Quantile Estimation of the SFM
Additional Approaches/Extensions of the SFM
Available Software to Estimate SFMs
Conclusions
Cross-References
References

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_11

Abstract

This chapter continues to review some of the most important developments in the
econometric estimation of productivity and efficiency surrounding the stochastic
frontier model. As in the previous chapter, we continue to place an emphasis on
highlighting recent research and providing broad coverage, while details are left
for further reading in the rich (although not exhaustive) references at the end of
this chapter.

Keywords

Efficiency · Productivity · Panel data · Endogeneity · Nonparametric ·
Determinants of inefficiency · Quantile · Identification

JEL codes

C10, C13, C14, C50

Introduction

In this chapter, we will continue our discussion of one of the most popular paradigms
in modern productivity analysis – stochastic frontier analysis, or SFA. In particular,
section “Panel Data” focuses on various SFA models for analyzing variation of
efficiency (or relative productivity) not only across firms but also over time, i.e.,
in the panel data context. Section “Nonparametric Estimation of the SFM” reviews
several prominent semi- and nonparametric approaches to SFA. Section “Quantile
Estimation of the SFM” briefly discusses a recent vein of literature focusing on
quantile estimation of the SFM. Section “Additional Approaches/Extensions of
the SFM” presents some further extensions of the SFM, while section “Available
Software to Estimate SFMs” briefly summarizes some of the available software to
estimate SFMs in practice. Section “Conclusions” concludes. We will use the same
notation as in the previous chapter, with some modifications given the new contexts
we discuss here.

Panel Data

Our discussion of the SFM so far has focused on cross-sectional data. When repeated
observations of firms are available, more information about inefficiency can be extracted
(often with more flexibility), and a range of panel data SFMs are available to the applied
researcher. Here we highlight some of the most prominent models. The advantage of panel
data is that changes in efficiency and productivity over time can be studied, whereas a
cross-sectional setting can only provide a static portrayal of inefficiency.
While [87] were the first to consider extending the cross-sectional SFM to the
panel data setting, it was [92] who brought prominence to the use of models tailored
exclusively to panel data. They raise three problems with cross-sectional models
that are used to measure inefficiency and productivity: first, if the MLE is used to
estimate the parameters of the SFM and inefficiency is estimated through JLMS, everything
is contingent on distributional assumptions for both noise and inefficiency; second,
technical inefficiency is assumed to be independent of the regressor(s)¹; and third,
the JLMS estimator is not a consistent estimator of $u$, as $E[u|\varepsilon]$ never
approaches $u$ as the number of cross-sectional units approaches infinity ($n \to \infty$).
Access to panel data can, to varying degrees, mitigate all of these issues. However, with
panel data comes a range of additional assumptions that the researcher needs to carefully
consider before proceeding.
To begin, consider the benchmark linear panel data regression model:

$$ y_{it} = m(x_{it}; \beta) + c_i + v_{it}. \tag{1} $$

Aside from the indexing of our data by individual, $i$, and time, $t$, we have the
presence of firm-specific heterogeneity, $c_i$. The common dilemma facing application
of the linear panel data regression model is how to treat the relationship between
$c_i$ and $x_{it}$. Under the fixed-effects (FE) framework [103], $x_{it}$ is allowed to be
correlated with $c_i$, and the parameters of the model can be estimated consistently
using the within transformation [7]. Under the random-effects (RE) framework,
$x_{it}$ and $c_i$ are required to be uncorrelated, so OLS is a consistent estimator,
but it is inefficient because the variance-covariance matrix of the composed error
term $c_i + v_{it}$ is no longer diagonal. A feasible generalized least-squares approach
is available to obtain asymptotically efficient estimators of the parameters of the
regression model in this case.
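
A short worked sketch may help show why the RE error covariance is not diagonal. It assumes, beyond what is stated above, that $c_i$ and $v_{it}$ are mutually uncorrelated with constant variances $\sigma_c^2$ and $\sigma_v^2$. The symbols $w_i$ (the $T$ composed errors of firm $i$ stacked into a vector), $\iota_T$ (a $T$-vector of ones), $\sigma_c^2$, and $\Omega$ are introduced here for illustration only.

```latex
w_i = c_i \iota_T + v_i, \qquad
\Omega \equiv \operatorname{Var}(w_i) = \sigma_c^2\, \iota_T \iota_T' + \sigma_v^2 I_T, \qquad
\operatorname{Cov}(w_{it}, w_{is}) = \sigma_c^2 \neq 0 \quad (t \neq s).
```

Any two observations on the same firm share $c_i$, so the off-diagonal entries of $\Omega$ equal $\sigma_c^2$. Under these assumptions, feasible GLS amounts to replacing $\sigma_c^2$ and $\sigma_v^2$ with estimates and weighting by $\Omega^{-1}$.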
Now, to think about where inefficiency enters the model in (1), we must
characterize the nature of inefficiency. If inefficiency is assumed to be constant over
time, then $c_i$ might be augmented to also capture inefficiency. If
inefficiency is time-varying, then we could include a second, one-sided error term
to be convolved with $v_{it}$ in (1), in much the same way we did in the benchmark SFM.
Or, it could be that inefficiency is composed of both a time-invariant component and
a time-varying component. All told, the general panel data SFM is

$$ y_{it} = m(x_{it}; \beta) + c_i - \eta_i + v_{it} - u_{it} = m(x_{it}; \beta) + \alpha_i + \varepsilon_{it}, \tag{2} $$

where $\alpha_i = c_i - \eta_i$ with $c_i$ capturing time-invariant heterogeneity and $\eta_i$
encapsulating time-invariant inefficiency, while $\varepsilon_{it} = v_{it} - u_{it}$ with
$u_{it}$ representing time-varying inefficiency. The panel data SFM looks identical to the
panel data regression model in (1), except that, due to $u_{it} > 0$, $\varepsilon_{it}$ no
longer has mean zero, and $\alpha_i$ no longer solely captures individual-specific
heterogeneity. Early approaches that studied inefficiency in panel data settings placed
restrictions on how inefficiency entered the panel data SFM. As time progressed, fewer
assumptions were made, especially as more advanced econometric techniques were exploited.

¹ If firms maximize profit, and inefficiency is known to the firm, then this assumption is
unlikely to be true as firms may adjust their inputs to account for inefficiency (e.g., see [75]).

Time-Invariant Technical Inefficiency Models

When inefficiency in the panel data SFM is assumed to be time-invariant, it is
possible to estimate the model without the need for distributional assumptions. To
begin, we assume that $u_{it}$ does not exist in (2) and all time-invariant unobserved
heterogeneity is inefficiency, $\alpha_i = -\eta_i$. With these restrictions, the panel
data SFM is written as

$$ y_{it} = m(x_{it}; \beta) - \eta_i + v_{it}; \quad i = 1, \ldots, n; \; t = 1, \ldots, T. \tag{3} $$

This model is termed the time-invariant SFM. Aside from the one-sided nature of
$\eta_i$, this model can be estimated with standard panel data regression techniques,
once an assumption on the underlying statistical relationship between $x_{it}$ and
$\eta_i$ (either the FE or RE framework) is made. Which framework to deploy depends upon
the relationship that one assumes exists between the covariates of the model and
firm-level inefficiency. Under the FE framework, correlation is allowed between
$x_{it}$ and $\eta_i$, whereas under the RE framework, no correlation is permitted between
$x_{it}$ and $\eta_i$. Regardless of which framework is deemed appropriate, neither requires
distributional assumptions for $\eta$ or $v$. This freedom from imposing a parametric
assumption on the distribution of $\eta_i$ (i.e., we have some statistical requirements
on the distribution but do not require a precise parametric form) has led to the
time-invariant SFM being referred to as a distribution-free approach [92]. To estimate the
time-invariant SFM while respecting the one-sided nature of $\eta_i$, a simple transformation
is needed to interpret the individual effect as time-invariant inefficiency as opposed
to pure firm heterogeneity. One major limitation of the time-invariant SFM is that
inefficiency and individual heterogeneity are not separately identified.
Additionally, the production technology is assumed to be constant over time, which may
be a further limitation depending upon the length of the panel.
We briefly discuss how to estimate the time-invariant SFM under the FE
framework, which was first proposed by [92]. For ease of exposition, we assume
$m(\cdot)$ is linear in $x_{it}$. The time-invariant SFM is

$$
\begin{aligned}
y_{it} &= \beta_0 + x_{it}\beta + v_{it} - \eta_i && (4) \\
       &= (\beta_0 - \eta_i) + x_{it}\beta + v_{it} \\
       &= c_i + x_{it}\beta + v_{it} && (5)
\end{aligned}
$$
where $c_i \equiv \beta_0 - \eta_i$. Under the FE framework, $\eta_i$ and thus $c_i$,
$i = 1, \ldots, n$, are allowed to have arbitrary correlation with $x_{it}$.
Given the similarity of the time-invariant SFM and a traditional panel data
regression model, [92] used standard estimation methods to estimate the parameters
of the model, namely, within estimation. The within transformation subtracts the time
mean of each cross-sectional unit from that unit's observations (e.g., replacing $y_{it}$
by $y_{it} - \bar{y}_{i\cdot}$ and $x_{it}$ by $x_{it} - \bar{x}_{i\cdot}$, where
$\bar{y}_{i\cdot} = (1/T)\sum_t y_{it}$, etc.), thereby eliminating $c_i$. OLS can
then be used to estimate the transformed model, essentially regressing transformed
$y$ on transformed $x$. The OLS estimator with the transformed data, $\hat{\beta}$, is a
consistent estimator for $\beta$. An estimator of $c_i$, $\hat{c}_i$, is constructed from the
mean of the residuals for each cross-sectional unit, i.e.,
$\hat{c}_i = \bar{y}_{i\cdot} - \bar{x}_{i\cdot}\hat{\beta}$, but it is biased, because
$\eta_i > 0$ $\forall i$. A simple transformation will produce a consistent estimator of
$\eta_i$. Once $\hat{c}_i$ is determined, $\eta_i$ is estimated as [92]:

$$ \hat{\eta}_i = \max_j\{\hat{c}_j\} - \hat{c}_i \ge 0, \quad i = 1, \ldots, n. \tag{6} $$

This formulation implicitly assumes that the most efficient firm/DMU in the sample
is 100% efficient. In other words, estimated inefficiency in the fixed-effects model
is relative to the best firm/DMU in the sample. If one is interested in estimating
firm-specific technical efficiency, it can be obtained from

$$ \widehat{TE}_i = e^{-\hat{\eta}_i}, \quad i = 1, \ldots, n. \tag{7} $$
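
The steps leading from (5) to (7) can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure of [92] verbatim: it assumes a single regressor, a balanced panel stored as nested lists, and noise-free toy data so that the result is easy to verify by hand. The function name `within_fe_efficiency` and all variable names are hypothetical.

```python
import math


def within_fe_efficiency(y, x):
    """Within (FE) estimation of a time-invariant SFM with one regressor.

    y, x: lists of per-firm lists (n firms, T periods each).
    Returns (beta_hat, eta_hat, te_hat).
    """
    n = len(y)
    # Firm-specific time means used by the within transformation.
    ybar = [sum(yi) / len(yi) for yi in y]
    xbar = [sum(xi) / len(xi) for xi in x]
    # OLS slope on demeaned data: sum(x~ * y~) / sum(x~ ** 2).
    sxy = sum((xit - xbar[i]) * (yit - ybar[i])
              for i in range(n) for xit, yit in zip(x[i], y[i]))
    sxx = sum((xit - xbar[i]) ** 2 for i in range(n) for xit in x[i])
    beta_hat = sxy / sxx
    # Recover firm effects from firm means.
    c_hat = [ybar[i] - xbar[i] * beta_hat for i in range(n)]
    # Equation (6): inefficiency relative to the best firm in the sample.
    c_max = max(c_hat)
    eta_hat = [c_max - c for c in c_hat]
    # Equation (7): technical efficiency.
    te_hat = [math.exp(-e) for e in eta_hat]
    return beta_hat, eta_hat, te_hat


# Assumed toy data: beta = 0.5 and firm effects c = (1.0, 0.8, 0.5), no noise.
x = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 3.0, 5.0]]
c = [1.0, 0.8, 0.5]
y = [[c[i] + 0.5 * xit for xit in x[i]] for i in range(3)]
beta_hat, eta_hat, te_hat = within_fe_efficiency(y, x)
```

In this toy example the firm with the largest effect is assigned zero inefficiency by construction, which is the normalization implied by (6).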

Operating under the FE framework may be more appropriate for empirical
applications in which inefficiency is believed to be correlated with the inputs used.
However, a disadvantage of using the time-invariant SFM under the FE framework
is that no other time-invariant variables can be included, for example, the gender of
a plant manager or the ownership status of the firm (which may not change over a short
time). Effectively, the influence of time-constant variables will be absorbed into
(and distort) the estimates of inefficiency.
In settings where time-invariant variables are expected to be relevant regressors
in the production model, an alternative is to operate under the RE framework.
Estimation of the model still does not require distributional assumptions on $v$ or
$\eta$ (beyond the requirement of no correlation between inefficiency and inputs), but
OLS on the transformed model no longer represents an efficient estimator given that the
composed error term, $v_{it} - \eta_i$, no longer has a diagonal variance-covariance
matrix. Schmidt and Sickles [92] discuss estimation under the RE framework
through generalized least-squares as well. Another alternative, if one is uncomfortable
with the implications of the RE framework, is to make distributional assumptions and
estimate the model via maximum likelihood. This avenue was suggested by [87] and can
allow time-invariant covariates to enter the model while still identifying time-invariant
inefficiency. The cost is the use of distributional assumptions so that the likelihood
function can be constructed. Following [1], [87] assume that $\eta_i$ follows a half-normal
distribution and $v_{it}$ follows a normal distribution. Kumbhakar [53] discussed
estimation of inefficiency in such a model by extending the JLMS formulation.

Time-Varying Technical Inefficiency Models

The time-invariant SFM allows inefficiency to differ across individuals but restricts
any change over time. The implication of this is that an inefficient firm could not
improve productivity over time by lessening inefficiency. This may be unrealistic in
a variety of applied settings or where T is large. We must consider models that allow
both technology and inefficiency to change over time to accommodate the idea of
productivity and efficiency improvement at the firm level.
A nice feature of time-varying SFMs is that the time-invariant SFM is a special
case and, correspondingly, the time-invariant specification can be tested, opening up
a variety of inferential opportunities for empirical analyses. To introduce the
time-varying SFM, recall the model in (5):

$$ y_{it} = c_i + x_{it}\beta + v_{it}. \tag{8} $$

To allow $c_i$ to be time-varying, one may impose some reasonable and tractable
structure, e.g., [26] suggested replacing $c_i$ by $c_{it}$ where

$$ c_{it} = c_{0i} + c_{1i} t + c_{2i} t^2, \tag{9} $$

where $t$ is the time trend variable. The parameterization in (9) allows the parameters
to be firm-specific. If the number of cross-sectional units ($n$) is not large, one can
define $n$ firm dummies and interact these dummies with time and time squared.
These variables along with the regressors (i.e., the $x$ variables) are then used in a
standard OLS regression. The coefficients associated with the firm dummies and
their interactions are the estimates of $c_{0i}$, $c_{1i}$, and $c_{2i}$. These estimated
coefficients can be used to obtain estimates of $c_{it}$, $\tilde{c}_{it}$. Again, the within
estimator can be used to consistently estimate $\beta$ along with the $3n$ parameters from
the parameterization of $c_{it}$. Finally, $\hat{c}_{it}$ (the estimator of relative
inefficiency) is obtained from

$$ \hat{c}_{it} = \hat{c}_t - \tilde{c}_{it} \quad \text{where} \quad \hat{c}_t = \max_j(\tilde{c}_{jt}) \;\; \forall t. \tag{10} $$

In this model, efficiency is calculated relative to the best firm in each year. Since
the firm with the maximum $\tilde{c}_{jt}$ is likely to change over time, different firms may
be fully efficient (or inefficient at different levels) in different years. An alternative
would be to calculate $\hat{c} = \max_{j,t}(\tilde{c}_{jt})$, the maximum over all $j$ and $t$,
and replace $\hat{c}_t$ with this definition in (10); efficiency is then relative to the firm
that was the most efficient over the entire sample period.
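
The second step of this procedure, going from estimated coefficients to relative inefficiency, can be sketched as follows. The sketch assumes the firm-specific coefficients in (9) have already been estimated (the regression itself is not shown), and the coefficient values are made up for illustration. The function name `css_relative_inefficiency` and the `global_benchmark` switch are hypothetical.

```python
def css_relative_inefficiency(coefs, periods, global_benchmark=False):
    """Relative inefficiency from firm-specific quadratic time paths.

    coefs: list of (c0, c1, c2) tuples, one per firm, as in equation (9).
    periods: list of time indices t.
    global_benchmark: if False, benchmark against the best firm in each
        period as in equation (10); if True, benchmark against the single
        best firm-period over the whole sample.
    """
    # Fitted time paths c~_it = c0 + c1 * t + c2 * t^2.
    c_tilde = [[c0 + c1 * t + c2 * t * t for t in periods]
               for (c0, c1, c2) in coefs]
    n_periods = len(periods)
    if global_benchmark:
        best = max(max(row) for row in c_tilde)
        frontier = [best] * n_periods
    else:
        frontier = [max(row[s] for row in c_tilde) for s in range(n_periods)]
    return [[frontier[s] - row[s] for s in range(n_periods)]
            for row in c_tilde]


# Assumed estimates for two firms over three periods.
coefs = [(1.0, 0.0, 0.0), (0.5, 0.3, 0.0)]
periods = [1, 2, 3]
by_year = css_relative_inefficiency(coefs, periods)
overall = css_relative_inefficiency(coefs, periods, global_benchmark=True)
```

With these made-up coefficients the first firm is the benchmark in period 1 and the second firm is the benchmark in periods 2 and 3, which illustrates how the identity of the fully efficient firm can change across years.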
The [26] estimation procedure is easy to implement. It relies on the standard
panel data estimator with the FE framework. Note that since $t$ appears in the
inefficiency function, it cannot also appear as a regressor in $x_{it}$, which would be
required if one were to capture technical change, i.e., a shift in the production
frontier, $m(x)$. In other words, the above model cannot separate inefficiency from
technical change, which is an obvious drawback of this approach. In general, if
one wants to have both time-varying inefficiency and technical change, then the
distribution-free route of [26] will not work. In this case, distributional assumptions
will be necessary to allow time (and higher powers of it) to enter the model in
various places.
With large $n$ and small $T$, the model will have too many parameters
($3n$ parameters in the $c_{it}$ function alone). A somewhat more parsimonious time-varying
inefficiency model was proposed by [69]:

$$ y_{it} = m(x_{it}; \beta) + v_{it} - u_{it} = m(x_{it}; \beta) + \varepsilon_{it}, \tag{11} $$

where $u_{it} = u_i \lambda_t$ and the $\lambda_t$ represent time-specific effects to be
estimated. This model is quite flexible in its ability to model time-varying inefficiency.
However, the temporal pattern of inefficiency ($\lambda_t$) is assumed to be exactly the
same for all firms. Under the FE framework, this specification can be viewed as an
interactive effects panel data model, and estimation can be undertaken by introducing
both firm and time dummies. Though no distributional assumptions are required by [69], the
structure of inefficiency is similar to that assuming the scaling property discussed
above. Again, given that inefficiency depends directly upon time, it is difficult to
model both time-varying inefficiency and technical change in (11).
A similar idea was used prior to [69] in [54] and [11], who proposed time-varying
SFMs, but made distributional assumptions on both $v_{it}$ and $u_{it}$ and estimated the
corresponding likelihood functions. Lee and Schmidt's [69] model is more general
than either the [54] or [11] models as both can be derived as special cases with
appropriate parametric restrictions on $\lambda_t$. Further still, the time-invariant SFM
is also a special case: $\lambda_t = 1$ $\forall t$. Once $\lambda_t$ and $u_i$ are estimated,
inefficiency can be estimated from

$$ \hat{u}_{it} = \max_j\{\hat{u}_j \hat{\lambda}_t\} - \hat{u}_i \hat{\lambda}_t. \tag{12} $$

So far, the time-varying models that we have discussed treat inefficiency in
a fully deterministic fashion, i.e., no distributional assumptions are required. In
the [69] time-varying SFM, both $u_i$ and $\lambda_t$ are deterministic. This model can
also be estimated treating the time component as deterministic, but the individual
component as stochastic (through a distributional assumption). The deviation from
the [69] time-varying SFM in (11) is that $u_{it} = G(t)u_i$ with $G(t)$ being a
deterministic function of time and $u_i \sim N^+(\mu, \sigma_u^2)$ [11, 54]. The ideas
discussed pertaining to the scaling property appear here, where firms have a base level of
inefficiency, and then, through time, become more or less efficient. The stochastic
component, $u_i$, utilizes the panel structure of the data in this model. The $G(t)$
component is common across individuals (as in, but not limited to, [69]).
Given $u_i \ge 0$, $u_{it} \ge 0$ is ensured by having $G(t) > 0$. Undoubtedly, the most
popular form of $G(t)$ is that proposed by [11]:

$$ G(t) = \exp[\gamma(t - T)], \tag{13} $$

where $T$ is the terminal period of the sample. The specification for $G(t)$ in (13) is a
simplification of the first attempt to introduce stochasticity into the time-varying
SFM by [54], which assumes a more general specification of $G(t)$ given by

$$ G(t) = \left[1 + \exp(\gamma_1 t + \gamma_2 t^2)\right]^{-1}. \tag{14} $$

The [11] specification essentially enforces $\gamma_2 = 0$ in the [54] time-varying
SFM. The popularity of the [11] time-varying SFM has been aided by the freely
available statistical package Frontier V4.1 which implements this model at the push
of a button (see section “Available Software to Estimate SFMs” as well). Other
specifications for $G(t)$ have also been proposed; see [27] and [65] for more recent
examples. Little research has been done on comparing a variety of forms of $G(t)$.
Lastly, modeling technical change in the [54] or [11] framework is trivial because
the imposition of distributional assumptions allows inclusion of $t$ (as a deterministic
time trend, e.g., linear, quadratic, etc.) as a component of $x_{it}$.
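
To get a feel for how the two forms of $G(t)$ in (13) and (14) shape the inefficiency path $u_{it} = G(t)u_i$, a small Python sketch may help. The parameter values and the base inefficiency `u_i` below are arbitrary illustrations, not estimates, and the function names `g_exponential` and `g_logistic` are hypothetical labels for (13) and (14).

```python
import math


def g_exponential(t, T, gamma):
    """Equation (13): G(t) = exp[gamma * (t - T)], equal to 1 at t = T."""
    return math.exp(gamma * (t - T))


def g_logistic(t, gamma1, gamma2):
    """Equation (14): G(t) = [1 + exp(gamma1 * t + gamma2 * t^2)]^(-1)."""
    return 1.0 / (1.0 + math.exp(gamma1 * t + gamma2 * t * t))


# Assumed values: five periods, base inefficiency u_i = 0.4, gamma = -0.1.
T = 5
u_i = 0.4
path = [g_exponential(t, T, gamma=-0.1) * u_i for t in range(1, T + 1)]
```

Under (13) with $\gamma < 0$, $G(t)$ exceeds one in early periods and falls to one at $t = T$, so inefficiency declines toward $u_i$ over the sample; with $\gamma > 0$ the path rises toward $u_i$ instead. Both functions are strictly positive, which is what keeps $u_{it} \ge 0$.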

Models That Separate Firm Heterogeneity from Inefficiency

While the time-invariant SFM is a standard panel data model where $c_i$ is the
unobservable individual effect, a notable drawback of this approach is that inefficiency is
indistinguishable from individual heterogeneity. All time-invariant heterogeneity is
confounded with inefficiency, and therefore $\hat{c}_i$ will capture heterogeneity in addition
to, or even instead of, inefficiency [37]. An important question for practitioners
using the time-invariant SFM is how to view the time-invariant component. Should
it be thought of as persistent inefficiency (as per [55–58]) or is it more appropriate
to think of it as individual heterogeneity, capturing the effects of unobserved
time-invariant covariates? If it is the latter, then the insights from the time-invariant
panel data SFMs are incorrect. A less rigid perspective is that the truth lies somewhere
in the middle; inefficiency may be decomposed into two components: one that is
persistent over time and one that varies over time.
Unless persistent inefficiency is disentangled from time-invariant individual
heterogeneity, practitioners need to choose between treating $c_i$ as persistent
inefficiency and treating $c_i$ as an individual-specific effect (heterogeneity).
Here, we will discuss both specifications. In particular, we will
consider models in which inefficiency is time-varying irrespective of whether the
time-invariant component is treated as inefficiency or not. Thus, the model we will
describe is

$$ y_{it} = c_i + x_{it}\beta + v_{it} - u_{it}. \tag{15} $$

Compared to a standard panel data model, we have the additional time-varying
inefficiency term, $-u_{it}$, in (15). If one treats $c_i$, $i = 1, \ldots, n$, as a random
variable that may be correlated with $x_{it}$ but does not capture inefficiency, then the
above model becomes what has been termed the “true fixed-effects” panel SFM [36]. The
model is labeled as the “true random-effects” SFM when $c_i$ is treated as uncorrelated
with $x_{it}$. Note that these specifications are of the same nature as the models proposed
by [55–58]. The difference is in the interpretation of the “time-invariant term,” $c_i$.
Estimation of the model in (15) is not straightforward. When $c_i$, $i = 1, \ldots, n$,
are embedded in the FE framework, the model encounters the incidental parameters
problem [76]. The incidental parameters problem arises when the number of
parameters to be estimated increases with the number of cross-sectional units in
the data, which is the case with the $c_i$ in (15). In this situation, consistency of the
parameter estimates is not guaranteed even if $n \to \infty$ because the number of $c_i$
increases with $n$. Therefore, usual asymptotic results may not apply. In addition to
this specific statistical problem, another technical issue in estimating (15) is that the
number of parameters to be estimated can be prohibitively large when $n$ is large.
For a standard linear panel data model (i.e., one that does not have $-u_{it}$ in (15)),
the literature has developed estimation methods to deal with this problem. These
methods involve transforming the model so that $c_i$ is removed before estimation.
Without $c_i$ in the transformed model, the incidental parameters problem no longer
exists, and the number of parameters to be estimated no longer increases with
the number of individuals. Methods of transformation include conditioning the
model on a sufficient statistic² for $c_i$ to obtain the conditional MLE, and the within
transformation or the first-difference transformation to construct the
marginal MLE (e.g., [25]). For the basic panel data SFM, this could be done
by transforming the error term if assumptions on $v_{it}$ and $u_{it}$ are such that the
composed error term's distribution is closed skew normal (i.e., the normal-half-normal
distributional pair).
These standard methods, however, are usually not applicable to (15). For the
conditional MLE of (15), [37] showed that there is no sufficient statistic for $c_i$.
For the marginal MLE, the resulting model after the within or first-difference
transformation usually does not have a closed-form likelihood function, if one uses
standard procedures.³ In general this would not pose an issue, as regression methods
can be easily applied. However, given the precise interest in recovering estimates of
the parameters of the distribution of inefficiency, maximum likelihood or specific
moments of the distribution of the transformed error component are needed. This
calls for methods that can recover information regarding $u_{it}$.

² A sufficient statistic contains all the information needed to compute any estimate of the parameter.
³ Colombi et al. [24] showed that the likelihood function has a closed-form expression. Chen et al.
[20] considered a special case of [24] and derived a closed-form expression.

Greene [37] proposed a tentative solution. He assumed $u_{it}$ follows a simple
i.i.d. half-normal distribution and suggested including $n$ dummy variables in the
model for $c_i$, $i = 1, \ldots, n$, and then estimating the model by MLE without any
transformation. He found that the incidental parameters problem does not cause
significant bias to the model parameters when $T$ is relatively large (e.g., $T \ge 10$). The
problem of having to estimate more than $n$ parameters is dealt with by employing
an advanced numerical algorithm.
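
As a rough illustration of what such a dummy-variable likelihood looks like, the sketch below evaluates the log-likelihood of (15) with one intercept per firm. It is not the implementation of [37]: it assumes $v_{it} \sim N(0, \sigma_v^2)$ alongside the half-normal $u_{it}$, uses the standard normal-half-normal density of the composed error $\varepsilon = v - u$, namely $f(\varepsilon) = (2/\sigma)\,\phi(\varepsilon/\sigma)\,\Phi(-\varepsilon\lambda/\sigma)$ with $\sigma^2 = \sigma_v^2 + \sigma_u^2$ and $\lambda = \sigma_u/\sigma_v$, and restricts attention to a single regressor. The function names are hypothetical, and the maximization step is omitted.

```python
import math


def norm_logpdf(z):
    """Log density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal CDF computed through the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def tfe_loglik(y, x, c, beta, sigma_v, sigma_u):
    """Log-likelihood with firm intercepts c[i] and normal-half-normal errors.

    y, x: lists of per-firm lists (single regressor); c: one intercept per firm.
    """
    sigma = math.hypot(sigma_v, sigma_u)   # sqrt(sigma_v^2 + sigma_u^2)
    lam = sigma_u / sigma_v
    ll = 0.0
    for i, (yi, xi) in enumerate(zip(y, x)):
        for yit, xit in zip(yi, xi):
            eps = yit - c[i] - beta * xit  # composed error v_it - u_it
            ll += (math.log(2.0) - math.log(sigma)
                   + norm_logpdf(eps / sigma)
                   + math.log(norm_cdf(-eps * lam / sigma)))
    return ll


# One observation with a zero composed error and sigma_v = sigma_u = 1.
ll_zero = tfe_loglik([[1.0]], [[0.0]], [1.0], 0.5, 1.0, 1.0)
```

In practice this function would be handed to a numerical optimizer over $\beta$, $\sigma_v$, $\sigma_u$, and all $n$ intercepts, which is where the dimensionality issue mentioned above arises.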
There are some recent econometric developments on this issue. First, [20] proposed
a solution in the FE framework. They showed that the likelihood functions of
the within-transformed and the first-differenced models have closed-form expressions
using results in [30]. The same theorem in [30] is used by [23] to derive the
log-likelihood function in the RE framework.
Using a different approach, [102] solve the problem identified in [37] by
proposing a class of SFMs in which the within and first-difference transformations
on the model can be carried out while also providing a closed-form likelihood
function. The main advantage of such a model is that because the $c_i$s are removed
from the model in (15), the incidental parameters problem is avoided entirely. As
such, consistency of the estimates is obtained for either $n \to \infty$ or $T \to \infty$,
which is invaluable for applied settings. A further computational benefit is that
the elimination of the $c_i$s reduces the number of parameters to be estimated to a
manageable number. The catch is in the specification of inefficiency, which is the
product of an i.i.d. non-negative random component and a deterministic function of
$z_{it}$ (determinants of inefficiency). Formally, the [102] model is

$$ y_{it} = c_i + x_{it}\beta + \varepsilon_{it}, \tag{16} $$

where $\varepsilon_{it} = v_{it} - u_{it}$ with $v_{it} \sim N(0, \sigma_v^2)$ and
$u_{it} = g_{it} u_i^*$ with $u_i^* \sim N^+(\mu, \sigma_u^2)$, which is the now familiar
scaling property model with a truncated-normal distribution for the basic distribution
of inefficiency.
For the scaling function [102] set git = g(zit δ). What allows the model
transformation to be applied is the scaling property; the within and first-difference
transformations leave this stochastic term intact as u∗i does not change with time.
As [102] showed that the within-transformed and the first-differenced models are
algebraically identical we have only provided discussion on the first-differenced
model. However, a limitation of their model is that it does not completely separate
persistent and time-varying inefficiency, a subject which we now turn our attention
to. Lastly, as with the models of [54] or [11], the use of distributional assumptions
allows both time-varying inefficiency and technical change to be modeled in (16).
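
To see why the scaling property lets the transformation go through, the following minimal Python sketch simulates one firm from (16) and first-differences the data. It is an illustration under stated assumptions, not the estimator of [102]: the scalar regressor, the linear-in-time scaling function g_it, the half-normal draw for u*_i (the μ = 0 case), and all function names are hypothetical choices made here.

```python
import random

def first_difference(series):
    """Return [s_t - s_{t-1}] for t = 2, ..., T."""
    return [b - a for a, b in zip(series[:-1], series[1:])]

def simulate_firm(c_i, u_star, x, g, beta, v):
    """y_it = c_i + x_it * beta + v_it - g_it * u_star (scalar regressor)."""
    return [c_i + xt * beta + vt - gt * u_star for xt, gt, vt in zip(x, g, v)]

rng = random.Random(0)
T, beta = 5, 0.7
x = [rng.gauss(0, 1) for _ in range(T)]      # inputs
g = [1.0 + 0.2 * t for t in range(T)]        # assumed deterministic scaling function
v = [rng.gauss(0, 0.1) for _ in range(T)]    # two-sided noise
u_star = abs(rng.gauss(0, 0.5))              # time-invariant basic inefficiency

# Same firm, two very different fixed effects c_i.
dy_a = first_difference(simulate_firm(3.0, u_star, x, g, beta, v))
dy_b = first_difference(simulate_firm(-8.0, u_star, x, g, beta, v))

# After differencing, only (delta g_it) * u_star remains of the inefficiency term.
expected = [dx * beta + dv - dg * u_star
            for dx, dv, dg in zip(first_difference(x),
                                  first_difference(v),
                                  first_difference(g))]
```

The differenced series is the same for both values of ci, and inefficiency enters only through the differenced scaling function multiplied by the single random variable u*_i, which is what keeps the likelihood tractable.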

Models That Separate Persistent and Time-Varying Inefficiency

Although several models discussed earlier can separate firm heterogeneity from
time-varying inefficiency (which is either modeled as the product of a time-invariant
random variable and a deterministic function of covariates or distributed i.i.d. across
firms and over time), none of these models consider persistent technical inefficiency.
It is important to quantify persistent inefficiency, especially in short panels, as it
captures the effects of inputs like management quality [75]. Unless there is a change
in something that affects management practices at the firm (e.g., new government

regulations or a change in ownership), it is unlikely that persistent inefficiency will

change. Time-varying inefficiency, by contrast, can change over time without
requiring structural changes that affect the firm.
This distinction between the time-varying and persistent components is important
from a policy perspective as each yields different implications. Colombi et al. [23]
refer to time-varying inefficiency as short-run inefficiency and mention that it can
arise due to failure in allocating resources properly in the short run. They argued
that, for example, a hospital with excess capacity may increase its efficiency in
the short run by reallocating the work force across different activities. Thus, some
of the physicians’ and nurses’ daily working hours might be changed to include
other hospital activities such as acute discharges. This is a short-run improvement
in efficiency that may be independent of short-run inefficiency levels in the previous
period, which can justify the assumption that uit is i.i.d. However, this does not
impact the overall management of the hospital and so is independent from time-
invariant inefficiency.
To help formalize this issue more clearly, we consider the model4

yit = β0 + x it β + εit = β0 + x it β + vit − (ηi + uit ) (17)

Technical inefficiency is represented as ηi + uit where ηi is the persistent, firm-

specific component (e.g., time-invariant ownership or geographic location) and uit
is the time-varying component of technical inefficiency which is firm- and time-
specific. Model (17) generalizes the previously discussed models because it allows
for firm heterogeneity and time-invariant and time-varying inefficiency all at once.
Such a decomposition is desirable because, since ηi does not change over time,
for a firm to improve efficiency, a structural change in policy or management would
need to arise. Additionally, ηi does not fully capture firm-level inefficiency because
it does not account for learning over time since it is time-invariant; the time-varying
component, uit can capture this aspect. In (17) the level of overall firm inefficiency,
as well as the components, is important to know because they convey different types
of information. Thus, for example, it may be argued that if residual inefficiency for
a firm is relatively large in a particular year, this is due to an event which is unlikely
to occur in the following year. Alternatively, if persistent inefficiency is large, then
a firm is expected to operate with a relatively high level of inefficiency over time,
unless some changes in policy and/or management occur. Therefore, a large value
of ηi is more concerning in the long run given its persistent nature than is a high
value of uit .
The specification in (17) offers the advantage of testing for the presence of
the persistent nature of technical inefficiency without the imposition of a specific
parametric form of time dependence. Furthermore, by including time in the x it
vector, (17) has the ability to separate exogenous technical change from technical
inefficiency.

4 This is the model proposed by [56–58], among others.


To estimate the model, we rewrite (17) as

yit = αi + x it β + ωit = (β0 − ηi − E[uit ]) + x it β + vit − (uit − E[uit ]). (18)

The error, ωit , has zero mean and constant variance. Model (18) is a standard panel
data model with firm-specific heterogeneity (one-way error component model) and
can be estimated either by the within transformation (under the FE framework) or
by generalized least-squares (under the RE framework).
The SFM in (18) can be estimated under the FE framework using a step-wise
procedure.
Step 1 The standard within transformation can be performed on (18) to remove αi
before estimation. Since both the components of ωit are zero mean and constant
variance random variables, the within transformed ωit will generate a random
variable that has zero mean and constant variance. OLS can be used on the
within-transformed version of (18) to obtain consistent estimates of β.
Step 2 Given the estimate of β, β̂, from Step 1, construct the pseudo-residuals
rit = yit − x it β̂, which contain information on αi + ωit . Using these, we first
estimate αi from the mean of rit for each i. Then, we can estimate ηi from
η̂i = maxi α̂i − α̂i = maxi {r̄i } − r̄i where r̄i is the mean (over time) of rit for firm i.
Note that the intercept, β0 , and ωit are eliminated by taking the mean of rit over
time for a firm. The above formula gives an estimate of ηi relative to the best
firm in the sample.
Step 3 With our estimates of β and ηi , we calculate residuals eit = yit − x it β̂ +
ηˆi , which contains information on β0 + vit − uit . At this stage, additional
distributional assumptions are required to separate vit from uit . Here we follow
convention and assume vit ∼ i.i.d. N(0, σv2 ) and uit ∼ i.i.d. N+ (0, στ2 ). MLE
can be deployed here, treating eit as the dependent variable, to estimate β0 and
the parameters associated with vit and uit . The log-likelihood for this setup is,
letting N = nT ,

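\[
\ln L = -N \ln \sigma + \sum_{i=1}^{n} \sum_{t=1}^{T} \ln \Phi\left(-e_{it}\lambda/\sigma\right) - \frac{1}{2\sigma^{2}} \sum_{i=1}^{n} \sum_{t=1}^{T} e_{it}^{2} \qquad (19)
\]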

Note that the parameters to be estimated here are β0 , σν2 , and στ2 . Once these
parameters have been estimated, a JLMS conditional mean or median technique
can be used to estimate uit for each observation.
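
The Step 3 likelihood and the JLMS conditional mean can be sketched in a few lines of Python. This is a hedged illustration rather than a reference implementation: the function names are hypothetical, sigma_u plays the role of the standard deviation parameter of uit, the usual definitions σ² = σv² + σu² and λ = σu/σv are assumed, the intercept β0 is netted out of eit before the normal-half-normal formulas are applied (an assumption about how (19) is used in practice), and additive constants are dropped as in (19).

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def loglik_step3(e, beta0, sigma_v, sigma_u):
    """Normal-half-normal log-likelihood in the form of (19), constants dropped.

    e holds the Step 3 residuals e_it, which contain beta0 + v_it - u_it.
    """
    sigma = math.sqrt(sigma_v ** 2 + sigma_u ** 2)
    lam = sigma_u / sigma_v
    ll = 0.0
    for e_it in e:
        eps = e_it - beta0  # composed error v_it - u_it
        ll += (-math.log(sigma)
               + math.log(norm_cdf(-eps * lam / sigma))
               - eps * eps / (2.0 * sigma ** 2))
    return ll

def jlms_mean(eps, sigma_v, sigma_u):
    """JLMS conditional mean E[u | eps] for eps = v - u (normal-half-normal)."""
    s2 = sigma_v ** 2 + sigma_u ** 2
    mu_star = -eps * sigma_u ** 2 / s2
    s_star = sigma_u * sigma_v / math.sqrt(s2)
    z = mu_star / s_star
    return mu_star + s_star * norm_pdf(z) / norm_cdf(z)
```

In practice one would hand the negative of loglik_step3 to a numerical optimizer over (β0, σv, σu) and then evaluate jlms_mean at the fitted composed errors. For very large positive residuals norm_cdf can underflow, so a production implementation would work with a log-CDF instead.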
To summarize estimation under the FE framework, we estimate (18) using
standard FE panel data tools to obtain consistent estimates of β in Step 1. Step 2
estimates persistent technical inefficiency, ηi . Lastly, Step 3 involves estimation of
β0 and the parameters associated with the distributional assumptions imposed on the
random components, vit and uit . One can then use the JLMS formula to estimate the
time-varying (residual) component of inefficiency, uit . Note that no distributional

assumptions are used in the first two steps. Without further assumptions, residual
inefficiency cannot be identified, and hence, distributional assumptions are needed
in the last step. This model can also be estimated under the RE framework
(see also [23]).
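
The first two, distribution-free steps of the FE procedure can be sketched as follows. This is a minimal illustration on simulated data, assuming a single regressor so that the within estimator has a scalar closed form; the data-generating values and all function names are hypothetical.

```python
import random
from statistics import fmean

def within_ols(y, x):
    """Step 1: within (fixed-effects) OLS slope for a scalar regressor.

    y and x map a firm id to its list of observations over time.
    """
    num = den = 0.0
    for i in y:
        ybar, xbar = fmean(y[i]), fmean(x[i])
        for yt, xt in zip(y[i], x[i]):
            num += (xt - xbar) * (yt - ybar)
            den += (xt - xbar) ** 2
    return num / den

def persistent_inefficiency(y, x, beta_hat):
    """Step 2: alpha_hat_i is the firm mean of the pseudo-residuals;
    eta_hat_i = max_j alpha_hat_j - alpha_hat_i (relative to the best firm)."""
    alpha_hat = {i: fmean([yt - xt * beta_hat for yt, xt in zip(y[i], x[i])])
                 for i in y}
    best = max(alpha_hat.values())
    return {i: best - a for i, a in alpha_hat.items()}

def simulate_panel(n=50, T=10, beta0=1.0, beta=0.6, seed=7):
    """Simulate model (17) with half-normal eta_i and u_it (assumed values)."""
    rng = random.Random(seed)
    y, x = {}, {}
    for i in range(n):
        eta_i = abs(rng.gauss(0, 0.4))
        x[i] = [rng.gauss(0, 1) for _ in range(T)]
        y[i] = [beta0 + xt * beta + rng.gauss(0, 0.2)
                - eta_i - abs(rng.gauss(0, 0.3)) for xt in x[i]]
    return y, x

y, x = simulate_panel()
beta_hat = within_ols(y, x)
eta_hat = persistent_inefficiency(y, x, beta_hat)
```

By construction the best firm in the sample receives an estimate of zero, so the resulting values measure persistent inefficiency relative to that firm rather than in absolute terms.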

Models That Separate Firm Effects, Persistent Inefficiency, and Time-Varying Inefficiency

All of the panel data SFMs introduced so far have departed from the general model
introduced in (2) in some aspect pertaining to the four separate error components.
This is due to the fact that, until recently, it was not clear how to estimate the full
panel data SFM represented by (2). The models of [23,59] overcome the limitations
of the previous models by embracing the nature of the four-component structure
inherent in the general panel data SFM. In the SFM represented in (2), the four
components take into account different factors affecting output, given the inputs. As
in [36, 37], the first component captures firms’ latent heterogeneity, which needs
to be extricated from inefficiency; the second component captures time-varying
inefficiency, the third component captures time-invariant inefficiency as in [56–58]
while the fourth component captures stochastic shocks beyond control of the firm.
The ability to estimate model (2) allows improvement over the previous models
in several ways. To begin, while some of the time-varying inefficiency models
just described can accommodate firm effects, these models fail to acknowledge
the potential for factors that might have time-invariant effects on firm inefficiency.
Second, SFMs which allow time-varying inefficiency commonly assume that the
inefficiency level of the firm at time t is independent of its previous level of
inefficiency; it is more reasonable to assume that a firm may eliminate some of
its inefficiency by mitigating short-run rigidities, while other sources of inefficiency
may remain over time. The former is captured by the time-varying component,
uit , and the latter by the time-invariant component, ηi . Finally, many panel SFMs
do consider time-invariant inefficiency, but do not simultaneously account for the
presence of unobserved firm heterogeneity. In doing so, these models confound
time-invariant inefficiency with firm effects (heterogeneity). The models proposed
by [20,36,37,65,102] decompose the error term in the production function into three
components: a firm-specific time-varying inefficiency term, a firm-specific effect
capturing latent heterogeneity, and a time- and firm-varying random error term.
However, these models consider any producer-specific, time-invariant component
as unobserved heterogeneity. Thus, although firm heterogeneity is now accounted
for, it comes at the cost of ignoring long-term inefficiency. As before, latent
heterogeneity is confounded with long-run inefficiency.
Estimation of the panel data SFM in (2) can be undertaken in a single-stage MLE
method based on distributional assumptions on the four components [24]. We first
describe a simpler, multi-step procedure [59]. For this, we rewrite the model in (2)
as

yit = β0∗ + x it β + αi + εit , (20)

where β0∗ = β0 − E[ηi ] − E[uit ]; αi = ci − ηi + E[ηi ]; and εit = vit − uit + E[uit ].
With this specification, both αi and εit are zero mean and constant variance random
variables. (20) is estimated in three steps.
Step 1 Standard random-effect panel regression is used to estimate β̂ (since (20)
is a common panel data model). Predicted values of αi and εit , denoted by α̂i and
ε̂it , are also available after estimating (20).
Step 2 Time-varying technical inefficiency, uit , is estimated using ε̂it from Step
1. Since

εit = vit − uit + E[uit ], (21)

by assuming $v_{it}$ is i.i.d. $N(0, \sigma_v^2)$ and $u_{it}$ is i.i.d. $N_+(0, \sigma_u^2)$, which yields
$E[u_{it}] = \sqrt{2/\pi}\,\sigma_u$, and ignoring the difference between the true and predicted
values5 of $\varepsilon_{it}$, we can estimate (21) using standard SFA techniques. Doing so
produces predictions of the time-varying technical inefficiency component $u_{it}$,
$E[e^{-u_{it}} \mid \varepsilon_{it}]$ (i.e., [10]), which we call relenting technical efficiency (RTE).
Step 3 Estimate ηi following a similar strategy as in Step 2. For this we use α̂i
from Step 1. Since

αi = ci − ηi + E[ηi ], (22)
by assuming $c_i$ is i.i.d. $N(0, \sigma_\mu^2)$, $\eta_i$ is i.i.d. $N_+(0, \sigma_\eta^2)$, where $E[\eta_i] = \sqrt{2/\pi}\,\sigma_\eta$,
estimate (22) using the standard normal-half-normal cross-sectional SFM and
obtain estimates of the persistent technical inefficiency component, $\eta_i$, following
JLMS. Persistent technical efficiency (PTE) can then be estimated as $\mathrm{PTE} = e^{-\hat{\eta}_i}$,
where $\hat{\eta}_i$ is the JLMS estimator of $\eta_i$. Overall technical efficiency (OTE) is then
constructed as the product of PTE and RTE, i.e., OTE = PTE×RTE.
It is possible to extend this model (in steps 2 and 3) so that the inefficiency terms underlying PTE and RTE are
distributed as truncated-normal or exponential as opposed to half-normal.
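
The efficiency predictors used in Steps 2 and 3 can be sketched as below. This is a hedged illustration for the normal-half-normal case only; the function names are hypothetical, and the recentering helper encodes the assumption that the Step 1 residuals estimate the composed error plus the mean of the one-sided term, so that mean has to be subtracted before the standard formulas apply.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def recenter(resid, sigma_one_sided):
    """Remove E[u] = sqrt(2/pi) * sigma_u (or E[eta]) from a Step 1 residual."""
    return resid - math.sqrt(2.0 / math.pi) * sigma_one_sided

def _posterior(eps, sigma_two_sided, sigma_one_sided):
    """Location and scale of the one-sided term given eps (two-sided minus one-sided)."""
    s2 = sigma_two_sided ** 2 + sigma_one_sided ** 2
    mu_star = -eps * sigma_one_sided ** 2 / s2
    s_star = sigma_one_sided * sigma_two_sided / math.sqrt(s2)
    return mu_star, s_star

def rte(eps, sigma_v, sigma_u):
    """E[exp(-u_it) | eps_it], the predictor used for RTE in Step 2."""
    mu_star, s_star = _posterior(eps, sigma_v, sigma_u)
    z = mu_star / s_star
    return (norm_cdf(z - s_star) / norm_cdf(z)) * math.exp(-mu_star + 0.5 * s_star ** 2)

def pte(alpha, sigma_c, sigma_eta):
    """exp(-eta_hat_i) with eta_hat_i the JLMS conditional mean, as in Step 3."""
    mu_star, s_star = _posterior(alpha, sigma_c, sigma_eta)
    z = mu_star / s_star
    eta_hat = mu_star + s_star * norm_pdf(z) / norm_cdf(z)
    return math.exp(-eta_hat)

def ote(pte_i, rte_it):
    """Overall technical efficiency as the product PTE x RTE."""
    return pte_i * rte_it
```

A typical call would recenter each Step 1 residual first, for example rte(recenter(eps_hat, sigma_u), sigma_v, sigma_u), with the variance parameters taken from the corresponding stochastic frontier fit.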
While the multi-step approach of [59] is straightforward to implement, it is
inefficient relative to full MLE. Given the structure of the four separate
errors, deriving the likelihood function was previously seen as infeasible. However,
using insights related to the closed-skew normal distribution, [23] showed that a tractable
likelihood function can be obtained.
Colombi et al. [23] made skew normal distributional assumptions for both ci −ηi
and vit − uit in (20).6 Assuming vit is i.i.d normal and uit is i.i.d half-normal,
the composed error vit − uit has a skew normal distribution. The same set of

5 Which is the standard practice in any two- or multi-step procedure.

6 The skew normal distribution is a more general distribution than the normal distribution, allowing

for asymmetry [5].


assumptions can be used for ci and ηi . Thus, model (2)’s likelihood can be derived.
Even though the log-likelihood for (2) can be determined based on skew normal
assumptions for the time-varying and time-invariant error components, it can be
daunting to implement. Greene and Filippini [39] recently proposed a
simulation-based optimization routine which circumvents many of the challenges associated
with direct optimization. They used a trick suggested by [17], conditioning on ci
and ηi . This conditioning eliminates many of the computational hurdles that direct
optimization of the likelihood function presents.

The Four-Component Panel Data SFM with Determinants of Inefficiency

A further generalization of the four-component model in (2) involves the inclusion

of determinants of inefficiency, either for the time-varying or the time-invariant
components. An estimator for this model was recently proposed in [6],

yit = m(x it ; β) + ci − ηi + vit − uit , (23)

where $\eta_i \sim N_+(0, \sigma_{\eta,i}^2)$, $u_{it} \sim N_+(0, \sigma_{u,it}^2)$, $c_i \sim N(0, \sigma_{c,i}^2)$, and $v_{it} \sim N(0, \sigma_{v,it}^2)$.
These distributional assumptions are imposed so that the time-invariant composed
error and the time-varying composed error both follow the closed-skew normal
distribution. Each of the variance parameters of the four components is dependent
upon a set of covariates and specified as an exponential function:
$\sigma_{\eta,i}^2 = \sigma_\eta^2 e^{z_{\eta,i}\delta_\eta}$, $\sigma_{c,i}^2 = \sigma_c^2 e^{z_{c,i}\delta_c}$, $\sigma_{u,it}^2 = \sigma_u^2 e^{z_{u,it}\delta_u}$, and $\sigma_{v,it}^2 = \sigma_v^2 e^{z_{v,it}\delta_v}$.
The time-constant and time-varying z vectors can overlap under these distributional assumptions,
that is, $z_{c,i}$ can share elements with $z_{\eta,i}$ and $z_{u,it}$ can share elements with $z_{v,it}$.
To estimate this four-component model, [6] used the insights of [39] and
deployed simulated maximum likelihood techniques. The benefit of this approach
is that rather than having T integrals to evaluate, by conditioning on ci − ηi ,
the likelihood function can be written as the product of T univariate integrals.
Simulation methods are required to construct draws of ci −ηi inside the convolution
density. The final log-likelihood function is

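\[
L = \sum_{i=1}^{n} \log \left\{ R^{-1} \sum_{r=1}^{R} \prod_{t=1}^{T} \frac{2}{\sigma_{it}}\, \phi\left(\frac{\varepsilon_{itr}}{\sigma_{it}}\right) \Phi\left(-\frac{\varepsilon_{itr}\lambda_{it}}{\sigma_{it}}\right) \right\}, \qquad (24)
\]

where $\sigma_{it} = \sqrt{e^{z_{u,it}\delta_u} + e^{z_{v,it}\delta_v}}$, $\lambda_{it} = \sqrt{e^{z_{u,it}\delta_u - z_{v,it}\delta_v}}$,
$\varepsilon_{itr} = \epsilon_{it} - \sqrt{e^{z_{c,i}\delta_c}}\, V_{ir} + \sqrt{e^{z_{\eta,i}\delta_\eta}}\, |U_{ir}|$
(the composed error $v_{it} - u_{it}$ implied by the rth simulated draw of $c_i - \eta_i$ in (23)), and $\epsilon_{it} = y_{it} - m(x_{it}; \beta)$. R is the number of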
draws over which to numerically evaluate the integral (larger R increases accuracy
but slows down the routine; smaller R leads to faster computation but decreases
accuracy). Lastly, both Vir and Uir are random draws from a standard normal

distribution. Implementation of this routine is straightforward if one has access to

a standard normal random number generator (which is typically available in any
general statistical software). Once draws for Vir and Uir have been constructed,
the likelihood is evaluated for the current set of parameters (β, δ u , δ v , δ η , δ c ). This
process is then iterated over different sets of parameter values. Naturally, one can
impose constancy at various parts of the error components by restricting $\delta_\ell = 0$ for
$\ell \in \{u, v, c, \eta\}$.
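
The simulated log-likelihood can be sketched as follows. This is a simplified, hedged illustration and not the routine of [6] or [39]: each variance function is reduced to a single log-variance parameter (no z covariates), the signs follow the production-frontier form of (23), in which subtracting a simulated draw of ci − ηi from yit − m(x it ; β) leaves vit − uit, and all names are hypothetical.

```python
import math
import random

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def make_draws(firms, R, seed=0):
    """R pairs (V_ir, U_ir) of standard normal draws per firm, held fixed
    across likelihood evaluations."""
    rng = random.Random(seed)
    return {i: [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(R)]
            for i in firms}

def simulated_loglik(resid, log_s2u, log_s2v, log_s2c, log_s2eta, draws):
    """Simulated log-likelihood for the four-component model.

    resid maps firm id -> list of y_it - m(x_it; beta) over time.
    """
    sigma = math.sqrt(math.exp(log_s2u) + math.exp(log_s2v))
    lam = math.sqrt(math.exp(log_s2u - log_s2v))
    s_c = math.exp(0.5 * log_s2c)
    s_eta = math.exp(0.5 * log_s2eta)
    total = 0.0
    for i, eps_i in resid.items():
        acc = 0.0
        for V, U in draws[i]:
            shift = s_c * V - s_eta * abs(U)   # simulated c_i - eta_i
            prod = 1.0
            for e in eps_i:                    # product over t = 1, ..., T
                e_r = e - shift                # implied v_it - u_it
                prod *= ((2.0 / sigma) * norm_pdf(e_r / sigma)
                         * norm_cdf(-e_r * lam / sigma))
            acc += prod
        total += math.log(acc / len(draws[i]))
    return total
```

Keeping the draws fixed across parameter values, as make_draws does, makes the simulated objective a smooth function of the parameters, which is what a numerical optimizer needs. For long panels the product over t can underflow, so a careful implementation would accumulate logs and use a log-sum-exp over draws.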

Inference Across the Panel Data SFM

The most general SFM in the panel context is the model which allows for
firm-specific heterogeneity, persistent technical inefficiency, relenting technical
inefficiency, and individual time-specific idiosyncratic shocks. Colombi et al. [23]
denote this model as TTT (True for having firm-specific heterogeneity, True for
having time-constant inefficiency, and True for having time-varying inefficiency).
The majority of all panel data models that have appeared in the literature are special
cases of TTT. For example, the widely used true RE model of [37] is a special
case of the TTT model. The same holds for all of the models we have discussed
above. Naturally, inference is necessary to determine the model which best fits the
data at hand. One benefit of nearly all of the panel data SFM discussed here is that
standard panel data type tests (coefficient significance, fixed-versus random-effects
framework, serial correlation, etc.) are easily implemented. This is similar to the
benefits of the cross-sectional SFM that we discussed earlier.
What is less straightforward is to test the most general TTT model against more
restricted versions. Testing any of the previous models against the most general TTT
model is a nonstandard problem because, under the null hypothesis, one or more
of the parameters of interest lie on the boundary of the parameter space. Under
reasonable assumptions, the asymptotic distribution of the log-likelihood ratio test
statistic is χ̄ 2 , as discussed in chapter 9. For example, the model of [87] could be
tested against the TTT model with the log-likelihood ratio test statistic but using χ̄ 2
to determine the p-value; see Table 1 in chapter 9.
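
As a hedged illustration of how such a p-value differs from the usual one, the sketch below covers only the simplest case, in which a single parameter lies on the boundary under the null and the limiting distribution is assumed to be an equal-weight mixture of a point mass at zero and a chi-square with one degree of freedom. With more restrictions the mixture weights change, so the tabulated critical values should be used instead. The function names are hypothetical.

```python
import math

def chi2_1_sf(x):
    """Survival function of a chi-square(1): P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0)) if x > 0 else 1.0

def mixed_chi2_pvalue(lr_stat):
    """p-value under the assumed 0.5 * chi2(0) + 0.5 * chi2(1) mixture."""
    if lr_stat <= 0:
        return 1.0
    return 0.5 * chi2_1_sf(lr_stat)

def naive_chi2_pvalue(lr_stat):
    """p-value that ignores the boundary problem (conservative here)."""
    return chi2_1_sf(lr_stat)
```

Under this mixture the 5% critical value is about 2.71 rather than 3.84, so ignoring the boundary makes the test reject too rarely.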
Future research focusing on adapting testing procedures to the TTT framework is
important moving forward. As discussed earlier, the presence of both time-varying
and time-invariant efficiencies yields different policy recommendations, and so
working with models that document their presence, or lack thereof, is important
for proper analysis.

Nonparametric Estimation of the SFM

Early Attempts

In a nutshell, the semiparametric and nonparametric approaches to SFA typically use

the benchmark SFM of [1] as the stepping-stone, generalizing it in different ways
by relaxing all or some parametric assumptions by utilizing existing semiparametric

and nonparametric statistical methods, such as the Nadaraya-Watson estimator, the

local-polynomial estimator, or the likelihood (pseudo or local) estimators.
To facilitate further and more precise discussion, recall that the benchmark SFM
for a sample of n DMUs is given by

yi = m(x i ) + vi − ui = m(x i ) + εi , i = 1, . . . , n, (25)

where m(·) is the frontier of the production technology that can be used to transform
a vector of inputs $x_i \in \mathbb{R}_+^q$ into scalar output yi , perturbed by some statistical noise
vi and adjusted by technical inefficiency ui . As we discussed in Section 2 in the
previous chapter, traditional parametric estimation of the model begins by assuming
a particular functional form for the production technology, most commonly a Cobb-
Douglas or a Translog, besides making distributional assumptions on both vi and ui ,
which help to identify and estimate the unknown parameters via, say, the maximum
likelihood approach. All the asymptotic results (consistency, asymptotic normality)
are conditional on these assumptions, and if they happen to be incorrect, then,
strictly speaking, all these results may be invalid. In such cases, the parametric
MLE will be inconsistent or converging in probability not to the truth (e.g., true
elasticities) but to some other numbers, which can even be very far from the truth if
the parametric assumption made on a function is far from the true one.
Early attempts at estimating the SFM nonparametrically or semiparametrically
go back to at least [8,34,47]. Specifically, [8] proposed a nonparametric approach in
the spirit of the DEA estimator but embedded in a maximum likelihood framework,
similar to parametric SFA, thus allowing for modeling both the noise and the
inefficiency. A few years later, [34] proposed estimating the production frontier in
another flexible manner, using nonparametric kernel regression methods embedded
into the parametric maximum likelihood. About the same time, [47] suggested using
the kernel regression estimator (Nadaraya-Watson in particular) for the panel data
SFM.
Importantly, note that the estimated conditional mean E[yi |x] is a biased estimator
of the production frontier when the inefficiency term is ignored. Indeed, a critical
assumption for consistent estimation of the production frontier in a regression
setting is E[εi |x] = 0, and due to the one-sided nature of ui , this assumption is
not satisfied, because E[εi |x] = μu ≠ 0 in the simplest case when inefficiency is
independent of the inputs, or more generally, E[εi |x] = μu (x) ≠ 0 ∀x. Therefore,
the production frontier cannot be identified in the regression setup, where one would
estimate

yi = m(x i ) + εi = m(x i ) + μu + (εi − μu ) ≡ m∗ (x i ) + εi∗ . (26)

Realizing this, [34] proposed correcting the estimation bias of m(x) via a three-
stage semiparametric pseudo-likelihood estimation of the SFM. In this approach,

at the first stage, one estimates (26) nonparametrically.7 Results from this first
stage are then fed into the second stage, involving parametric MLE with particular
assumptions on the distribution of the noise and inefficiency that help identifying
and disentangling the two.8 Once the parameters of this symbiosis of MLE and
kernel regression are estimated, the estimated conditional mean can then, in the
third stage, be corrected for the bias by the estimated mean of inefficiency (as in
COLS), $\hat{\mu}_u(x_i)$, to get a consistent estimator $\hat{m}(x_i)$ given by

\[
\hat{m}(x_i) = \hat{m}^{*}(x_i) - \hat{\mu}_u(x_i). \qquad (27)
\]

[47] also proposed a similar strategy for correcting the bias occurring in estimating
(26) nonparametrically, but avoided using MLE due to the possibility of disentan-
gling the noise from inefficiency without distributional assumptions, by utilizing the
panel-data SFA framework.
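
The bias in (26) and the shift in (27) can be illustrated with a small simulation. This is a sketch under stated assumptions, not the estimator of [34] or [47]: the frontier, the parameter values, and the bandwidth are hypothetical, the second-stage estimation of the inefficiency parameters is skipped, and the true value of the mean of εi (written μu in (26)) is plugged in so that only the first and third stages are shown.

```python
import math
import random

def nw_estimate(x0, xs, ys, h):
    """Nadaraya-Watson (local-constant) fit at x0 with a Gaussian kernel."""
    w = [math.exp(-0.5 * ((x - x0) / h) ** 2) for x in xs]
    return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)

def frontier(x):
    """Assumed true production frontier m(x)."""
    return math.log(1.0 + x)

rng = random.Random(42)
n, sigma_v, sigma_u, h = 400, 0.2, 0.5, 0.3
xs = [rng.uniform(1.0, 5.0) for _ in range(n)]
ys = [frontier(x) + rng.gauss(0, sigma_v) - abs(rng.gauss(0, sigma_u))
      for x in xs]

# Stage 1: the regression recovers m*(x) = m(x) + mu_u, not the frontier m(x).
m_star_hat = [nw_estimate(x, xs, ys, h) for x in xs]
mean_gap = sum(ms - frontier(x) for ms, x in zip(m_star_hat, xs)) / n

# Mean of eps = v - u under half-normal u; negative by construction.
mu_eps = -math.sqrt(2.0 / math.pi) * sigma_u

# Stage 3: shift the fitted conditional mean as in (27).
m_hat = [ms - mu_eps for ms in m_star_hat]
corrected_gap = sum(mh - frontier(x) for mh, x in zip(m_hat, xs)) / n
```

The average gap between the first-stage fit and the true frontier is close to the mean of εi, and it is close to zero after the shift, which is the COLS-type correction described above.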
The approaches of [34,47] provided a useful framework and formed a foundation
on which many other approaches have been built.9 For example, more recent
approaches of [68, 84] share some essence of [34] except that they required
the estimated production frontier to obey traditional axioms of production, such
as monotonicity and concavity, something that [34] did not accommodate in
their approach. Specifically, [84] employ the framework of [34] but combine it
with constraint weighted bootstrapping [31, 40] to ensure that monotonicity and
concavity are enforced during estimation. More recently, [77] made improvements
to the approach of [84], which resulted in small sample performance gains.
On the other hand, [68] used an entirely different estimation approach, concave
nonparametric least-squares (CNLS), to impose monotonicity and concavity. Lastly,
[73] showed that while the estimator of [34] is consistent, the parametric estimator
for the parameters of the density of the convolved error yields an asymptotic bias
(when normalized by $\sqrt{n}$) and proposed an alternative estimator that estimates the
distributional parameters and the unknown frontier jointly.

Local Likelihood Methods

The local likelihood approach [99] is known to be a natural alternative to the

semiparametric pseudo-likelihood, and it was first proposed in the SFA context by

7 They used a local constant (Nadaraya-Watson) regression, although other consistent nonparamet-
ric estimators can be used there too.
8 In their work, the normal-half-normal assumption was used, but other assumptions as discussed

above can be used there too. Note, however, that for some alternative distributional assumptions
on u, for example, exponential or truncated-normal, a concentrated version of the log-likelihood
function may not exist, causing identification problems.
9 See [86] for a more comprehensive review of this topic.

[60]. This approach closely resembles the parametric likelihood approach with the
only (yet fundamental) difference being the kernel-based weights (instead of the
equal weights) used to weigh each individual contribution to the likelihood, which
help in localizing the estimation in the direction of each continuous variable through
the bandwidths. Specifically, for a given regression error density, fε (ε, θ ), we have
the local log-likelihood function

\[
\check{L}_n(\theta(x), m_x) = (n|h|)^{-1} \sum_{i=1}^{n} \ln f_\varepsilon\left(y_i - m(x_i); \theta(x)\right) K_{ix}, \qquad (28)
\]

where $m_x$ captures the conditional mean of y given x (a q × 1 vector of covariates)
and θ is the vector of remaining parameters of $f_\varepsilon$,
$K_{ix} = \prod_{s=1}^{q} h_s^{-1} k\left(\frac{x_{is} - x_s}{h_s}\right)$
is the standard product kernel where k(·) is any second order univariate kernel
(e.g., Epanechnikov or Gaussian), $h_s$ is the smoothing parameter for the sth covariate
(and is the sth element of vector h), while $|h| = h_1 h_2 \cdots h_q$.
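
The product kernel and the kernel-weighted objective in (28) can be sketched as below. The function names are hypothetical, and the per-observation log densities are passed in as plain numbers so that the sketch stays agnostic about the error density; the normalization mirrors (28) as printed and affects only the scale of the objective, not its maximizer.

```python
import math

def gaussian(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def epanechnikov(u):
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def product_kernel(x_i, x, h, k=gaussian):
    """K_ix = prod_s h_s^{-1} k((x_is - x_s) / h_s)."""
    out = 1.0
    for xis, xs, hs in zip(x_i, x, h):
        out *= k((xis - xs) / hs) / hs
    return out

def local_loglik(x, X, log_dens, h, k=gaussian):
    """(n |h|)^{-1} sum_i log_dens[i] * K_ix, mirroring (28).

    X is the list of covariate vectors x_i and log_dens[i] is
    ln f_eps(y_i - m(x_i); theta(x)) for the current local parameters.
    """
    n = len(X)
    h_prod = math.prod(h)
    weighted = sum(ld * product_kernel(x_i, x, h, k)
                   for x_i, ld in zip(X, log_dens))
    return weighted / (n * h_prod)
```

Observations far from the evaluation point x receive little weight (exactly zero under the Epanechnikov kernel once they are more than one bandwidth away), which is how the bandwidths localize the likelihood.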
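Kumbhakar et al. [60] used a local-linear approximation for the unknown
production function m(x i ) combined with the assumption of a normal-half-normal
convolved error term, where parameters are also modeled as unknown functions of
the covariates:

\[
\check{L}_n = (n|h|)^{-1} \sum_{i=1}^{n} \left[ -0.5\,\ddot{\sigma}_x^2(x_i) - 0.5\,\ddot{\varepsilon}_i^2 e^{-\ddot{\sigma}_x^2(x_i)} + \ln \Phi\left( -\ddot{\varepsilon}_i e^{\ddot{\lambda}_x(x_i) - 0.5\ddot{\sigma}_x^2(x_i)} \right) \right] K_{ix} \qquad (29)
\]

where $\ddot{\varepsilon}_i = y_i - \ddot{m}_x(x_i)$, $\ddot{m}_x(x_i) = \ddot{m}_0 + \ddot{m}_1(x_i - x)$, $\ddot{\sigma}_x^2(x_i) = \ddot{\sigma}_0^2 + \ddot{\sigma}_1^2(x_i - x)$,
and $\ddot{\lambda}_x(x_i) = \ddot{\lambda}_0 + \ddot{\lambda}_1(x_i - x)$.10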
Noting that often the main focus of interest is related to $\sigma_u$, [82] suggested
directly parameterizing the local likelihood function in terms of $\ln \sigma_v^2$ and $\ln \sigma_u^2$,
which also imposes positivity of $\sigma_v^2$ and $\sigma_u^2$ throughout the estimation, making it
more stable computationally. Park et al. [82] also outlined an asymptotic theory for
modeling discrete variables in the context of the local likelihood approach, which
can be imperative for many applications, since many covariates that researchers
have access to are categorical in nature (regulated vs. non-regulated firms or
industries, private vs. publicly owned companies, male vs. female managers, etc.).
The local likelihood function in this case would be

10 One could also use a quadratic approximation, but note that even in this local-linear case, there
are already 3 + 3q parameters to estimate (i.e., optimize over) at each point of interest x: these are
the three functional estimates, m̈0 , σ̈02 and λ̈0 and the 3q derivative estimates of the functions, m̈1 ,
σ̈12 and λ̈1 .

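\[
\check{L}(\theta(x^c, x^d), m_{x^c, x^d}) = (n|h|)^{-1} \sum_{i=1}^{n} \Bigg[ -0.5 \ln\left( e^{\ddot{\sigma}_v^2(x_i^c, x_i^d)} + e^{\ddot{\sigma}_u^2(x_i^c, x_i^d)} \right)
- 0.5\,\ddot{\varepsilon}_i^2 \Big/ \left( e^{\ddot{\sigma}_v^2(x_i^c, x_i^d)} + e^{\ddot{\sigma}_u^2(x_i^c, x_i^d)} \right)
+ \ln \Phi\left( -\ddot{\varepsilon}_i\, e^{\ddot{\sigma}_u^2(x_i^c, x_i^d)/2 - \ddot{\sigma}_v^2(x_i^c, x_i^d)/2} \Big/ \sqrt{ e^{\ddot{\sigma}_v^2(x_i^c, x_i^d)} + e^{\ddot{\sigma}_u^2(x_i^c, x_i^d)} } \right) \Bigg] K_{ix^c}\, W_i(x_i^d) \qquad (30)
\]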

where $x_i^c$ is a vector of continuous regressors, while $x_i^d$ is a vector of discrete
regressors, and $W_i(x_i^d)$ is an appropriate discrete kernel, e.g., the one proposed
by [2] or its variations. The theory in [82] is derived for the case of the kernel from
[90], given by $W_i(x^d) = \prod_{j=1}^{k} \omega_j^{I(x_{ij}^d \neq x_j^d)}$, which is a standardized version of
the Aitchison-Aitken kernel, standardized so that the bandwidths for a jth discrete
variable, here denoted as $\omega_j$, are always between 0 and 1, regardless of the number
of categories. However, this theory also extends (with some modifications) to cases
with other discrete kernels. For example, one might prefer the so-called discrete
Epanechnikov kernels, which are particularly useful and can be superior to others
in case of sparse data (e.g., see [21] and the references cited therein). One can also
use more adaptive bandwidths, e.g., allow for bandwidths of some or all continuous
regressors to differ across categories of some or all discrete variables (e.g., see [70]
for related discussion).
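
A discrete kernel of the standardized type described above can be sketched in a few lines. The sketch assumes the form in which a bandwidth ωj between 0 and 1 multiplies the weight whenever the jth discrete regressor of observation i differs from the evaluation point; the function name and the example categories are hypothetical.

```python
def discrete_kernel(xd_i, xd, omega):
    """W_i(x^d) = prod_j omega_j ** I(x_ij^d != x_j^d).

    xd_i  : discrete regressors of observation i
    xd    : discrete values at the point of interest
    omega : bandwidths in [0, 1], one per discrete regressor
    """
    weight = 1.0
    for value_i, value, om in zip(xd_i, xd, omega):
        if value_i != value:
            weight *= om
    return weight

# Observation matches on ownership but not on regulation status.
w = discrete_kernel(["private", "regulated"],
                    ["private", "non-regulated"],
                    [0.3, 0.5])
```

With ωj = 0 only observations in the same category receive weight, which amounts to splitting the sample by that variable, while ωj = 1 gives every category equal weight and effectively smooths the variable out of the local fit.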
Standard optimization algorithms can be used here, but as with any nonlinear
optimization, careful choice of starting values is imperative, especially in selecting
the bandwidths. For example, [60] suggested starting with the local-linear least-
squares estimates for m̈0 and m̈1 and the global, parametric maximum likelihood
estimates for σ 2 and λ (from [1]) so that m̈0 is properly corrected (as in [34]).
Selection of the bandwidths is a very important step here (as is true in
general for kernel-based methods), and many interesting general selection methods
can be adapted to the current context. One of the most popular approaches
is cross-validation.11 Kumbhakar et al. [60] outlined how to use least-squares
cross-validation (LSCV) for their approach. Meanwhile, [82] suggested using
maximum likelihood cross-validation (MLCV), which is more natural for the local
likelihood approach, although it may be more demanding in computation. For the
starting values in numerical optimization of LSCV or MLCV for selecting optimal
bandwidths, one could use the so-called rules-of-thumb bandwidths that reflect
the rates of convergence required for the asymptotic theory, e.g., for a continuous
variable $x_s^c$, use $h_0(x_s^c) = 1.06 \times n^{-1/(4+q)}\, \hat{\sigma}_{x_s^c}$, where $\hat{\sigma}_{x_s^c}$ is the estimated standard
deviation of $x_s^c$, and $\omega_0 = n^{-2/(q+4)}$ for the discrete bandwidths.
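
These rule-of-thumb starting values are simple to compute. The sketch below assumes q denotes the number of continuous regressors and uses the sample standard deviation; the function names are hypothetical.

```python
from statistics import stdev

def rot_bandwidth_continuous(xs, q):
    """h_0 = 1.06 * n^(-1/(4+q)) * (sample standard deviation of xs)."""
    n = len(xs)
    return 1.06 * n ** (-1.0 / (4 + q)) * stdev(xs)

def rot_bandwidth_discrete(n, q):
    """omega_0 = n^(-2/(q+4)) for each discrete regressor."""
    return n ** (-2.0 / (q + 4))
```

Both values shrink as n grows, and they are meant only as starting points for the LSCV or MLCV search rather than as final bandwidths.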
Kneip et al. [48] provide an update of the [60] estimator whereby the
distributional assumption on the inefficiency term can be dropped. The only

11 For more discussions on the pros and cons, as well as references on this approach in general, see
[42].

parametric assumption required in [48] is that the two-sided error term is normal,
which allows them to rely on penalized likelihood, where the unknown density is
constructed nonparametrically via a histogram over the support of the covariate
space and the penalty term is included to ensure appropriate smoothness of the
resulting density. Both the theory and simulated evidence appearing in [48] suggest
that this estimator works quite well in a variety of settings. To date, no application
of this method has appeared to our knowledge, and so it represents an exciting
opportunity moving forward.

Local Least-Squares Approaches

In spite of the appealing theoretical advantages of the likelihood-based approaches,

they involve numerical optimization of the local likelihood function over many
parameters at each point of interest, which can be computationally complex,
especially if bootstrap methods are needed to conduct inference. An attractive
alternative that is much simpler to compute is provided by adopting the local least-
squares methods; because these methods do not require nonlinear optimization
(given closed-form solutions), only basic matrix operations are required, marking
dramatic improvements in computation time.
Recently, [94] (SVKZ hereafter) proposed what can be viewed as a semi- or non-
parametric generalization of COLS [79],12 which also allows for modeling determi-
nants of inefficiency. Specifically, they considered a generalization of (25) given by

yi = m(x i , zi ) + vi − ui = m(x i , zi ) + εi . (31)

where m(x i , zi ) is the production frontier evaluated at x i , the realizations of inputs

for observation i, and at zi , the realization of the so-called environmental factors
faced by the observation i and disturbed by the realizations of statistical noise vi
and inefficiency ui. They imposed fairly general and mild conditions
on the model, e.g., (ui|xi = x, zi = z) ∼ D+(μ(x, z), σu²(x, z)), with D+(·, ·)
being the distribution of a non-negative random variable with mean μ(·, ·) and
finite positive variance σu²(·, ·), while (vi|xi = x, zi = z) ∼ D(0, σv²(x, z)),
with D(0, ·) being the distribution of a random variable with mean zero and
finite positive variance σv²(·, ·). They also assumed that, conditional on
(xi, zi), ui and vi are independent random variables. Further, given that vi
has a symmetric distribution around zero, while ui is a positive random
variable from a skewed distribution, E[εi|x, z] = −E[ui|x, z] ≠ 0. Therefore,
after recentering, we have

yi = m(x i , zi ) + vi − ui + E[ui |x, z] − E[ui |x, z] = m∗ (x i , zi ) + εi∗ (32)

12 As with our earlier discussion, SVKZ referred to this approach as nonparametric MOLS, but cite

[79], who used the term COLS and so we refer to it as COLS here.
392 S. C. Kumbhakar et al.

where m∗ (x i , zi ) = m(x i , zi ) − E[ui |x, z] and εi∗ = εi + E[ui |x, z]. Adapting
the strategy of COLS from [79], SVKZ proposed estimating m∗(x, z) in the first
stage by m̂∗(x, z), using local-polynomial least-squares, noting that under
mild regularity conditions and appropriate choice of bandwidths, such estimators
have desirable statistical properties (consistency, asymptotic normality, etc.; see
[32,42,71]). Then, in the second stage, they utilized the moment conditions implied
by the assumptions on ui and vi , namely,

E[\varepsilon^* \mid x, z] = 0,

E[(\varepsilon^*)^2 \mid x, z] = \sigma_u^2(x, z) + \sigma_v^2(x, z),

E[(\varepsilon^*)^3 \mid x, z] = -E\left[(u - E[u \mid x, z])^3 \mid x, z\right],

and estimated the second and third moments of ε∗ using local-polynomial methods
with the residuals ε̂i∗ = yi − m̂∗(xi, zi) from the first stage, i.e.,

\hat{m}_2(x, z) = \sum_{i=1}^{n} A_i(x, z) (\hat{\varepsilon}_i^{*})^2 (33)

and

\hat{m}_3(x, z) = \sum_{i=1}^{n} A_i(x, z) (\hat{\varepsilon}_i^{*})^3, (34)

where Ai(x, z) would vary depending upon the local smoothing method used. If one
desires to estimate the level of the frontier in SVKZ’s setup, then a (local) parametric
distributional assumption for ui is needed, although the ranking of output would
be independent of this distributional choice. Importantly, note that if the moments
of ui depend on x or z, then the frontier correction will also depend on x and z
implying that any features of the production frontiers, such as returns to scale, may
depend on the distribution of ui . One therefore needs to either make some type
of distributional assumption or to assume a type of separability assumption, such
as E[u|x, z] = E[u|z]. With the normal-half-normal framework, SVKZ showed
(adapting [79]) that
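
\hat{\sigma}_u(x, z) = \max\left\{0, \left[\sqrt{\frac{\pi}{2}} \left(\frac{\pi}{\pi - 4}\right) \hat{m}_3(x, z)\right]^{1/3}\right\} (35)

\hat{\sigma}_v^2(x, z) = \hat{m}_2(x, z) - \hat{\sigma}_u^2(x, z) \left(\frac{\pi - 2}{\pi}\right). (36)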

These estimates can then be used to obtain the estimates of the efficiency scores
for each observation, in the spirit of [46], generalized to the heteroskedastic
case involving E[ui |εi , x i , zi ] instead of E[ui |εi ]. However, as mentioned in

the parametric context above, one should be careful interpreting these estimates
of efficiency scores, as they are “predicted values” conditional on unobserved εi ,
replaced with its estimate for the specific realization i, and as such the prediction
intervals tend to be quite wide (see [96] for related discussion). In turn, the
conditional mean of inefficiency can be consistently estimated as

\hat{\mu}_u(x, z) = \sqrt{\frac{2}{\pi}}\, \hat{\sigma}_u(x, z), (37)

which can then be used at any point of interest (x, z) to form a consistent
estimate of the level of the frontier, m(x, z), using

\hat{m}(x, z) = \hat{m}^*(x, z) + \hat{\mu}_u(x, z). (38)
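
The two-stage logic of equations (33) to (38) can be sketched numerically. The
following is a minimal illustration, not SVKZ's implementation: it assumes a
single input, a local-constant (kernel-weighted average) smoother in place of a
general local-polynomial one, a rule-of-thumb bandwidth, and a simulated
normal-half-normal design. The function names, the design, and the bandwidth
are all hypothetical choices made for the example.

```python
import numpy as np

def gaussian_weights(points, data, h):
    """Local-constant kernel weights A_i(point); each row sums to one."""
    d = (points[:, None] - data[None, :]) / h
    k = np.exp(-0.5 * d**2)
    return k / k.sum(axis=1, keepdims=True)

def second_stage(resid, weights):
    """Map first-stage residuals into (sigma_u, sigma_v^2, mu_u), eqs. (33)-(37)."""
    m2 = weights @ resid**2                      # eq. (33)
    m3 = weights @ resid**3                      # eq. (34)
    arg = np.sqrt(np.pi / 2) * (np.pi / (np.pi - 4)) * m3
    sigma_u = np.maximum(0.0, np.cbrt(arg))      # eq. (35); zero if skewness has the wrong sign
    sigma_v2 = m2 - sigma_u**2 * (np.pi - 2) / np.pi   # eq. (36)
    mu_u = np.sqrt(2 / np.pi) * sigma_u          # eq. (37)
    return sigma_u, sigma_v2, mu_u

# Simulated data (assumed design): logarithmic frontier, heteroskedastic u.
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0.0, 1.0, n)
u = (0.3 + 0.5 * x) * np.abs(rng.standard_normal(n))
v = 0.2 * rng.standard_normal(n)
y = np.log1p(x) + v - u

h = 1.06 * x.std() * n ** (-1 / 5)               # rule-of-thumb bandwidth (assumption)
A = gaussian_weights(x, x, h)
m_star_hat = A @ y                               # first stage: conditional mean of y
resid = y - m_star_hat
sigma_u_hat, sigma_v2_hat, mu_u_hat = second_stage(resid, A)
frontier_hat = m_star_hat + mu_u_hat             # eq. (38)
```

Nothing in this sketch prevents the estimate of σv² from turning negative in a
finite sample, and third-moment estimates are typically noisy, so the output
should be read as an illustration of the mechanics only.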

SVKZ also derived the asymptotic properties of these estimators, generalizing

earlier results from [33] and [19].
Finally, and perhaps most interestingly, SVKZ pointed out that if one is only
interested in the influence of z or x on the (conditional mean) efficiency, or as
a special case to test if E[u|x, z] is a constant, then no parametric distributional
specification is required for ui, only a condition that it belongs to the
one-parameter scale family of distributions. Specifically, they showed that the
elasticity of E[u|x, z] with respect to ψ, an element of (x, z), defined as

\xi_\psi(x, z) = \frac{\partial \mu_u(x, z)}{\partial \psi} \frac{\psi}{\mu_u(x, z)}, (39)

assuming that μu(x, z) ≠ 0, can be estimated as

\hat{\xi}_\psi(x, z) = \frac{1}{3} \frac{\partial \hat{m}_3(x, z)}{\partial \psi} \frac{\psi}{\hat{m}_3(x, z)}, (40)

where m̂3(x, z) and ∂m̂3(x, z)/∂ψ are the estimates from the local-polynomial
estimator, provided that m̂3(x, z) ≠ 0 for the particular combination of
interest (x, z). Importantly, SVKZ also derived the asymptotic law for this
elasticity estimator, showing that

(n h^{p+d+2})^{1/2} \left(\hat{\xi}_\psi(x, z) - \xi_\psi(x, z)\right) \xrightarrow{d} N(0, s_\xi^2(x, z)). (41)

In turn, these asymptotic results can be used to test statistically whether
elements of (x, z) influence expected inefficiency.
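
The factor 1/3 in the elasticity estimator follows from the scale-family
condition. A brief derivation sketch is given here; the constants c_1 and c_3
and the standardized variable w are notation introduced only for this
illustration, and the step E[v^3 | x, z] = 0 relies on the symmetry of v
assumed above.

```latex
\begin{align*}
u \mid (x, z) &= \sigma_u(x, z)\, w, \qquad w \ge 0 \text{ with a distribution free of } (x, z), \\
\mu_u(x, z) &= c_1\, \sigma_u(x, z), \qquad c_1 = E[w], \\
E[(\varepsilon^*)^3 \mid x, z] &= -E\left[(u - E[u \mid x, z])^3 \mid x, z\right]
  = -c_3\, \sigma_u^3(x, z), \qquad c_3 = E[(w - c_1)^3] \neq 0, \\
\frac{\partial \ln \mu_u(x, z)}{\partial \psi}
  &= \frac{\partial \ln \sigma_u(x, z)}{\partial \psi}
  = \frac{1}{3}\, \frac{\partial \ln \left| E[(\varepsilon^*)^3 \mid x, z] \right|}{\partial \psi}, \\
\xi_\psi(x, z) &= \frac{1}{3}\, \frac{\partial E[(\varepsilon^*)^3 \mid x, z]}{\partial \psi}\,
  \frac{\psi}{E[(\varepsilon^*)^3 \mid x, z]}.
\end{align*}
```

Because c_1 and c_3 cancel, the elasticity is identified from the third moment
alone, which is why no particular member of the scale family needs to be
specified.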
A practical limitation of SVKZ is that the estimated production technology may
not satisfy axioms of production. One might be tempted to follow [68] or [84],

imposing the desired constraints first and then recovering E[u|x, z]. However, as we
noted earlier, the methods of [68] and [84] work when the distribution of inefficiency
is independent of x and z, i.e., when u is homoskedastic. The issue the applied

researcher faces here is much more subtle. When heteroskedasticity is present in u,
one must recognize that what is being estimated in the first stage is a conditional
mean, and not a production frontier. Thus, it is not necessarily the case that the
axioms of production should be expected to hold when estimating the conditional
mean.
Consider the case of a monotonic production function. The conditional mean of

output could be non-monotonic if E[u|x, z] was non-monotonic, even though the
production function is monotonic. Further, the difference of two concave
functions need not be concave, so even if E[u|x, z] were concave, subtracting
it from the production frontier may not produce a concave conditional mean.
And therein lies the danger of imposing constraints when estimating the
conditional mean: it is not necessarily the case that they should be
satisfied. This might seem
innocuous except for the fact that imposing constraints on a conditional mean
that are incorrect will not produce a consistent estimator, and typically, consistent
estimates in the first stage are needed for the second stage (recovering inefficiency)
to produce valid estimates.
Take, for example, the discussion in [67, p. 233], who consider estimation of
a production frontier nonparametrically, while also allowing u to depend on x. In
this case, they stated (in our notation) “. . . Note that the shape of function g can
differ from that of frontier m because E(ui |x i ) is a function of inputs x . . . It is also
worth noting that function g is not necessarily monotonic increasing and concave
even if the production function m satisfies these axioms because −E(ui |x i ) can be
a non-monotonic and non-concave function of inputs . . . To apply CNLS in step 1,
we need to assume that the curvature of the production function m dominates and
that function g is monotonic increasing and concave (at least by approximation).”
Unless the conditional mean of output satisfies the axioms of production, it is
recommended the axiomatic restrictions be enforced after consistent, unrestricted
estimation of the conditional mean as this will ensure that the first-stage estimator of
the conditional mean is consistent. How exactly to do this is a relatively unexplored
area in stochastic frontier analysis and is a fruitful avenue for future research.
Figures 1, 2, and 3 illustrate the pitfalls of enforcing constraints ex ante
on the conditional mean of y (given x). We have a single input, x, and our
production frontier is logarithmic, which is naturally monotonic and concave. When
inefficiency is homoskedastic, we see that the conditional mean is just a shift down
of the production frontier and remains both monotone and concave. However, if we
allow heteroskedasticity of inefficiency, e.g., through a quadratic relationship, then,
depending on the nature of heteroskedasticity, we can violate monotonicity, Fig. 2,
or concavity, Fig. 3, of E[y|x]. This quadratic relationship is not beyond the pale,
even in the parametric setting.13
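
The point made by Figures 1, 2, and 3 can be reproduced with a few lines of
arithmetic. The sketch below is only an illustration under assumed functional
forms: the frontier log(1 + x) and the three specifications for E[u|x] are
hypothetical choices, not the ones used to draw the figures. It checks
monotonicity and concavity of the conditional mean E[y|x] = m(x) − E[u|x] on a
grid using first and second differences.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 201)
frontier = np.log1p(x)   # hypothetical logarithmic frontier: monotone and concave

def is_monotone(g):
    """True if g is non-decreasing on the grid."""
    return bool(np.all(np.diff(g) >= 0))

def is_concave(g, tol=1e-12):
    """True if all second differences of g are non-positive (up to tol)."""
    return bool(np.all(np.diff(g, n=2) <= tol))

# Assumed forms for the conditional mean of inefficiency E[u|x].
mu_cases = {
    "homoskedastic": np.full_like(x, 0.3),
    "convex quadratic": 0.1 + 2.0 * x**2,
    "concave quadratic": 0.6 - 0.5 * x**2,
}

shape = {}
for name, mu_u in mu_cases.items():
    cond_mean = frontier - mu_u          # E[y|x] = m(x) - E[u|x]
    shape[name] = (is_monotone(cond_mean), is_concave(cond_mean))
    print(f"{name}: monotone={shape[name][0]}, concave={shape[name][1]}")
```

Under these assumed forms, the homoskedastic case keeps both properties, the
convex quadratic breaks monotonicity while preserving concavity, and the
concave quadratic preserves monotonicity while breaking concavity, even though
the frontier itself is the same in all three cases.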

13 Wang [101] documents non-monotonic efficiency effects in a panel of Philippine rice farmers
based on the age of the farmer.
[Figure 1 plot: two panels against x, one showing y and one showing μu; panel title: Conditional Mean is Concave and Monotonic.]

Fig. 1 Concave and Monotonic Conditional Mean and Production Frontier Under Homoskedastic
Inefficiency. The solid line is the production frontier, while the dashed line is the conditional mean
of output

Avoiding Distributional and (Some) Parametric Assumptions When Determinants of Inefficiency Are Present

Here we discuss the approach of [85, 100]. Let the SFM be

y_i = m(x_i) + v_i - u_i = m(x_i) + v_i - u_i + E[u_i \mid z_i] - E[u_i \mid z_i] = m^*(x_i, z_i) + \varepsilon_i^*, (42)

where m∗(xi, zi) = m(xi) − g(zi), (ui|zi = z) ∼ D+(μ(x, z), σu²(x, z)), while
(vi|xi, zi) ∼ D(0, σv²). This model is a special case of SVKZ’s model. Now, if we
specify our production technology as m(x i ) = x i β and E[ui |zi ] = g(zi ), then if
β were known, g(zi ) could be identified as the conditional mean of ε̃i = yi − x i β
given zi .
However, β is unknown and must be estimated. It can be estimated as follows.
Conditioning only on zi in equation (42), we have
[Figure 2 plot: two panels against x, one showing y and one showing μu; panel title: Conditional Mean is Non-Monotonic.]

Fig. 2 Concave but Non-monotonic Conditional Mean and Production Frontier Under Het-
eroskedastic Inefficiency. The solid line is the production frontier, while the dashed line is the
conditional mean of output

E[y_i \mid z_i] = E[x_i \mid z_i]'\beta - g(z_i). (43)

Subtracting (43) from (42) yields

y_i - E[y_i \mid z_i] = (x_i - E[x_i \mid z_i])'\beta + \varepsilon_i^*. (44)

If E[yi |zi ] and E[x i |zi ] were known, β could be estimated via OLS from (44).
The idea is to replace the unknown conditional means with their nonparametric
estimates [91].
To estimate both β and g(zi ), we replace E[yi |zi ] and E[x i |zi ] in (44) with

\hat{E}[y \mid z_i] = \sum_{j=1}^{n} A_j(z_i) y_j
[Figure 3 plot: two panels against x, one showing y and one showing μu; panel title: Conditional Mean is Non-Concave.]

Fig. 3 Monotonic but Non-concave Conditional Mean and Production Frontier Under Het-
eroskedastic Inefficiency. The solid line is the production frontier, while the dashed line is the
conditional mean of output

\hat{E}[x_s \mid z_i] = \sum_{j=1}^{n} A_j(z_i) x_{sj},

For a given bandwidth, the conditional expectations for y and each element of x
can be estimated, and OLS can then be used to obtain a consistent estimator of β.
That is, instead of the usual regression of y on x, one performs the modified OLS
regression of ỹ on x̃, where we have used the notation w̃ = w − Ê[w|z] to denote
a random variable that has been conditionally demeaned. The estimates for β can
then be used to obtain a consistent estimator of the conditional mean of inefficiency
via standard nonparametric regression techniques.
Let ε̌i = yi − xi′β̂, where β̂ is our estimate from the OLS regression of ỹ on x̃.
We then estimate g(zi) nonparametrically via local-polynomial least-squares as

\hat{g}(z_i) = \sum_{j=1}^{n} A_j(z_i) \check{\varepsilon}_j. (45)
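
The steps above (conditionally demean y and x, run OLS, then smooth the
residuals on z) can be sketched as follows. This is a minimal illustration
under stated assumptions rather than the estimator of [85, 100] as published:
it uses a local-constant Gaussian-kernel smoother with a rule-of-thumb
bandwidth, a simulated design with a scalar z, and hypothetical names
throughout. In this simulation E[u|z] enters output with a negative sign, so
the smoothed residual is sign-flipped before being compared with the true
conditional mean of inefficiency.

```python
import numpy as np

def nw_smooth(z, w, h):
    """Local-constant kernel regression of w on z at the sample points."""
    k = np.exp(-0.5 * ((z[:, None] - z[None, :]) / h) ** 2)
    a = k / k.sum(axis=1, keepdims=True)     # weights A_j(z_i)
    return a @ w

# Simulated data (assumed design): linear frontier without intercept.
rng = np.random.default_rng(1)
n = 1000
z = rng.uniform(0.0, 1.0, n)
x = np.column_stack([z + rng.standard_normal(n), rng.standard_normal(n)])
beta = np.array([0.6, 0.3])
sigma_u = 0.2 + 0.8 * z                      # inefficiency scale depends on z only
u = sigma_u * np.abs(rng.standard_normal(n))
v = 0.2 * rng.standard_normal(n)
y = x @ beta + v - u

h = 1.06 * z.std() * n ** (-1 / 5)           # rule-of-thumb bandwidth (assumption)

# Step 1: conditionally demean y and each column of x given z.
y_tilde = y - nw_smooth(z, y, h)
x_tilde = x - nw_smooth(z, x, h)

# Step 2: OLS of demeaned y on demeaned x.
beta_hat, *_ = np.linalg.lstsq(x_tilde, y_tilde, rcond=None)

# Step 3: smooth the residuals y - x'beta_hat on z.
eps_check = y - x @ beta_hat
mean_ineff_hat = -nw_smooth(z, eps_check, h)  # sign flipped: E[eps_check|z] is about -E[u|z]
mean_ineff_true = np.sqrt(2 / np.pi) * sigma_u
```

An intercept is deliberately left out of x here, since a constant column would
be wiped out by the conditional demeaning and could not be separated from the
level of E[u|z] without further assumptions.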
In the cross-sectional regression setting, without assuming some structure on the

distributions of the error components, it is not possible to identify the impact that
any given variable has on output directly, i.e., through the frontier, indirectly through
inefficiency, or both.14 One way to achieve identification is through invocation of
the separability assumption. This assumption, described in exceeding detail in [95],
essentially requires two distinct sets of variables: those which influence the frontier
and those which solely influence inefficiency. In the context of a model for which
two-sided noise does not exist (the standard DEA framework), when this assumption
is satisfied, a two-step approach is available which can produce consistent estimators
of both the frontier function and the inefficiency of a firm [9, 95, 97].
In general, it is recommended that if variables which influence inefficiency exist,
this information should be used directly, with a single-stage estimator, such as
maximum likelihood. When the separability assumption holds, then the partly linear
model of [85, 100] could be deployed (albeit with some parametric assumptions
imposed) or the additive model previously described can be used.15
Importantly, the separability assumption can be tested in the stochastic frontier
context, including the fully nonparametric or semiparametric frameworks. We can
compare the estimates from the additively separable SFM, with that from a fully
nonparametric model to determine if there are statistical differences. Fortunately,
this type of setup is conducive to inference through either a residual sum of squares
test or a conditional moment test. See the discussion in Chapter 6 of [42].

Future Directions in Semi- and Nonparametric Estimation and Inference of the SFM

One of the future directions of research within non- and semiparametric SFA is,
naturally, related to statistical inference. The asymptotic results developed in the
above mentioned papers as well as various testing procedures developed in the

14 Hall and Simar [41] discussed nonparametric identification of the mean of inefficiency subject to
the variance of the noise distribution diminishing as n → ∞. Horrace and Parmeter [43] showed
how to nonparametrically identify the full distribution of inefficiency if one assumes that v is
distributed normal.
15 The approach of SVKZ allows for both x and z to influence both the frontier and inefficiency,

and as such the separability assumption is not required. Yet, one may say that there is also a
kind of “separability” structure involved implicitly: (x, z) is assumed to influence the frontier
via the first moment, while for the inefficiency term, u, the same (x, z) is modeled through the
skedastic function defining the second moment. Besides helping with statistical identification, such
structure can be viewed as quite natural to the context of measurement. Indeed, one often thinks of
the frontier as the level, and so using the (conditional) first moment, measuring the (conditional)
average level of outputs, would be very natural. Meanwhile, the inefficiency is often understood
as the deviation from the frontier, so it would be a more natural way to model it with the second
moment. In addition, one could also think of the inefficiency as a reflection of the uncertainty and
related “risk” to produce less than the potential and beyond the usual (and symmetric) noise, and
it is very common to model risk through the second moment.
general statistics community make a solid foundation for this to happen, with careful
adaptation and extensive Monte Carlo evidence supporting the theory. Additionally,
few of the methods discussed here have been fully developed in the panel data
setting.
It is worth noting that neither [60], nor [73], nor [82], nor SVKZ imposed any
axioms of production on the frontier, e.g., monotonicity (i.e., requiring
∇x m(x) ≥ 0 ∀x), although some of them have brief discussions about possible
extensions to do so. Specifically, to impose the desired constraints, one could
adapt ideas from [29] and [28], use DEA or FDH on the fitted values from these
methods (thus using the stochastic DEA or stochastic FDH approaches of [98]),
or employ constraint weighted bootstrapping [31, 40], as was adapted to the
baseline SFM by [84].

Quantile Estimation of the SFM

A recent development in the estimation of the SFM has been to embrace the use
of quantile methods [15, 16, 49, 72]. Quantile regression is known to provide a
more complete picture of a conditional distribution [50, 52] and provides a robust
alternative to ordinary least squares. Whereas the ordinary least-squares estimator
stems from minimization of the sum of squared errors, the conditional quantile
estimator is determined through minimization of the “check” function [51] defined
for a particular quantile, the median say.
The conditional quantile function Qy(τ|x) for a random variable y with
conditional CDF F(y|x) is defined as Qy(τ|x) = F⁻¹(τ|x) = inf{y : F(y|x) ≥ τ},
where τ ∈ (0, 1) indexes the quantile of interest. Rather than directly
inverting the conditional distribution function, the conditional quantile can
be determined through the loss function:

\rho_\tau(\epsilon) = \epsilon\,(\tau - 1\{\epsilon < 0\}). (46)

ρτ(ε) is known as the check function. For a traditional linear-in-parameters
framework, Qy(τ|x) = xi′β(τ), the quantile estimator is found by minimizing

\min_{\beta} \sum_{i=1}^{n} \rho_\tau\left(y_i - x_i'\beta(\tau)\right), (47)

for a given τ . When the error terms are i.i.d., the conditional quantiles represent
vertical shifts of the conditional median function by the appropriate quantile of
the error distribution. However, when heteroskedasticity is present, the conditional
quantiles are no longer vertical shifts of the conditional median, but will have
varying slopes; moreover, the quantiles will become nonlinear.
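
To make the role of the check function concrete, the sketch below evaluates it
in the simplest possible case, a model with only a constant, where minimizing
the summed check loss should return a sample quantile. This is an illustration
under that simplifying assumption, not a quantile regression routine; the
function names are hypothetical, and the brute-force search over the observed
values is chosen for transparency rather than efficiency.

```python
import numpy as np

def check_loss(e, tau):
    """Check function rho_tau(e) = e * (tau - 1{e < 0})."""
    e = np.asarray(e, dtype=float)
    return e * (tau - (e < 0).astype(float))

def quantile_by_check_loss(y, tau):
    """Minimize the summed check loss over constants.

    The objective is piecewise linear with kinks at the data points,
    so it is enough to search over the observed values.
    """
    candidates = np.sort(y)
    losses = np.array([check_loss(y - q, tau).sum() for q in candidates])
    return candidates[np.argmin(losses)]

rng = np.random.default_rng(2)
y = rng.standard_normal(501)
q50 = quantile_by_check_loss(y, 0.5)   # should coincide with the sample median
q90 = quantile_by_check_loss(y, 0.9)
print(q50, np.median(y))
print(np.mean(y <= q90))
```

With covariates, the same objective is minimized over β(τ) instead of a
constant, which is typically done by linear programming rather than by a grid
search.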
The use of conditional quantile estimation to recover the frontier is appealing
because in general a frontier can be thought of as a quantile in the distribution of
output. At issue is the appropriate quantile, τ . For example, [16, p. 379] estimate the
frontier with the conditional quantile estimator using τ = 0.5, 0.9 and 0.975. τ =
0.5 corresponds to the median and is equivalent to the conditional mean in the case
that σu² = 0 (see the discussion in [44]). Knox et al. [49, p. 79] estimate conditional
quantiles for τ = 0.85, 0.9 and 0.95, while [72, p. 1080] consider τ = 0.5 and 0.8.
Lastly, [15, p. 572] recommended the use of τ = 0.95 for estimation of production
frontiers and τ = 0.05 for estimation of cost frontiers.
What is lost in the recommendations of this earlier research is how one estimates
(or predicts) individual efficiency once the frontier has been estimated. Currently
the standard practice is to treat any firm whose output lies above the frontier as
fully efficient, and any firm whose output is below the frontier as inefficient, with
inefficiency defined as the difference between the estimated frontier and observed
output. However, both of these recommendations ignore the fact that the composed
error term represents inefficiency and noise. There does not exist at present an
approach that separates inefficiency from noise in a manner similar to [46]. One
idea could be to use the conditional mode as proposed in [74]. This estimator can
be interpreted as a maximum likelihood estimator for the distribution of the joint
density of v and u, and more importantly, for positive residuals, it is always 0,
which is akin to how inefficiency is currently calculated using conditional quantile
estimation. Unfortunately, as with the conditional mean, the conditional mode
estimator requires distributional assumptions for it to be operational.
Lastly, we mention two important caveats with quantile estimation of frontiers.
First, heteroskedasticity in either v or u has, until now, not been accounted for.
This is a severe limitation as heteroskedasticity is commonly seen as present in
v in applied efficiency studies, and researchers typically have access to an array
of determinants of inefficiency, which induce heteroskedasticity in the inefficiency
term. Moreover, unlike estimation of a conditional mean, when conditional het-
eroskedasticity is present, this can affect consistent estimation of the conditional
quantile. Second, estimation of the conditional quantile for a specific value of τ is
an implicit assumption on the ratio of signal to noise between σu² and σv². To
see this more clearly, Figs. 4, 5, 6, and 7 present the results of quantile
estimation for τ = 0.5, 0.8, 0.85, 0.9, and 0.95 for 1,000 observations drawn
from the model:

y_i = x_i^{0.4} e^{v_i - u_i}, (48)

with vi ∼ N(0, 1) and ui ∼ N+(0, σu²). In Fig. 4 the inefficiency draws are taken
with σu² = 0.01, in Fig. 5 we have σu² = 0.25, in Fig. 6 σu² = 1, and in Fig. 7
σu² = 4. The case where σu² = 4 corresponds to λ = σu/σv = 2, which is of a
decent size for an applied efficiency study. In this case, the true frontier is
approximately equal to the 85th quantile. It is clear that interpreting the
frontier for a given quantile as the benchmark for a firm being efficient or
inefficient is implicitly a statement on the ratio between the variance of the
noise and the inefficiency for the sample. In Fig. 4, where σu² = 0.01
(λ = 0.1), the situation where there is almost no inefficiency, the frontier is
nearly equivalent to the median, which is the least absolute deviation
estimator that [44] discussed.
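
The link between λ and the quantile at which the true frontier sits can be
checked directly. Under the normal-half-normal design above, the frontier
corresponds to ε = v − u = 0, so the relevant quantile is P(v − u ≤ 0). A short
geometric argument (our own back-of-the-envelope calculation, not taken from
the cited papers) gives P(v − u ≤ 0) = 1 − arctan(1/λ)/π. The sketch below
evaluates this expression and compares it with a Monte Carlo frequency; the
function name and the simulation size are assumptions made for illustration.

```python
import numpy as np

def frontier_quantile(lam):
    """P(v - u <= 0) for v ~ N(0, s_v^2), u ~ N+(0, s_u^2), with lam = s_u / s_v."""
    return 1.0 - np.arctan(1.0 / lam) / np.pi

rng = np.random.default_rng(3)
n = 200_000
results = {}
for sigma_u2 in (0.01, 0.25, 1.0, 4.0):
    lam = np.sqrt(sigma_u2)                     # sigma_v = 1 as in the design above
    v = rng.standard_normal(n)
    u = lam * np.abs(rng.standard_normal(n))    # half-normal inefficiency
    mc = np.mean(v - u <= 0)                    # simulated share at or below the frontier
    results[sigma_u2] = (frontier_quantile(lam), mc)
    print(f"sigma_u^2={sigma_u2}: formula={results[sigma_u2][0]:.4f}, simulated={mc:.4f}")
```

For λ = 2 the expression evaluates to roughly 0.85, in line with the statement
that the true frontier is close to the 85th quantile, and as λ shrinks toward
zero it approaches 0.5, the median.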
[Figure 4 plot: scatter of y against x with the true frontier and estimated conditional quantiles for τ = 0.5, 0.8, 0.85, 0.9, and 0.95.]

Fig. 4 Conditional quantile estimation of a univariate SFM with σu² = 0.01

[Figure 5 plot: scatter of y against x with the true frontier and estimated conditional quantiles for τ = 0.5, 0.8, 0.85, 0.9, and 0.95.]

Fig. 5 Conditional quantile estimation of a univariate SFM with σu² = 0.25

While the quantile estimator marks an interesting and robust alternative to

traditional stochastic frontier analysis, it should be clear that more work needs to
be done. We direct the reader to the earlier referenced papers for more details and
additional insights on how best to use conditional quantile methods at present for
conducting efficiency analysis. Furthermore, panel estimation of quantiles, as well
[Figure 6 plot: scatter of y against x with the true frontier and estimated conditional quantiles for τ = 0.5, 0.8, 0.85, 0.9, and 0.95.]

Fig. 6 Conditional quantile estimation of a univariate SFM with σu² = 1

[Figure 7 plot: scatter of y against x with the true frontier and estimated conditional quantiles for τ = 0.5, 0.8, 0.85, 0.9, and 0.95.]

Fig. 7 Conditional quantile estimation of a univariate SFM with σu² = 4

as semi- and nonparametric estimation of quantiles, is still in its infancy in this area,
and extensions to the SFM have yet to appear in the literature.

Additional Approaches/Extensions of the SFM

As with any review or summary article, there is never enough space to cover all
topics equally or broadly enough. The SFM has been studied and used for 40
years now, and even though we have covered a range of approaches and insights,
there are still many topics which we did not cover. These include finite mixture
models [18, 37, 80], the zero-inefficiency SFM [63], the meta-frontier [13, 14],
total factor productivity change and its individual components [45], the two-tier
frontier [88, 89], sample selection in the SFM [38, 64], and directional distance
function estimation [4]. Parmeter and Kumbhakar [83] cover broadly estimation and
inference of finite mixture models, the zero-inefficiency SFM, and issues pertaining
to sample selection. Full details on the measurement of total factor productivity and
separation into distinct components can be found in [66, chap. 11]. Both the two-tier
frontier [61, 62, 81] and meta-frontier [3, 78] have started to receive more attention
recently, but as of yet, no broad review of either exists. Regarding the estimation
of directional distance functions, we refer interested readers to [35] for a thorough
treatment.

Available Software to Estimate SFMs

Despite the popularity of the SFM, only the most basic implementations of it
are available across a wide array of statistical platforms. For example, in the R
programming environment, the frontier [22] package allows for cross-sectional
estimation of the SFM assuming either the half-normal or truncated-normal distri-
bution for ui , and the [11, 12] panel data estimators of the SFM are implemented.16
There are similar estimators available in LIMDEP through the NLOGIT module, but
these also include the normal-gamma specification as well as the true fixed- and true
random-effects estimators along with the latent class stochastic frontier estimator.
There are also several modules in the STATA software as described in [66] which
implement several other panel data estimators as described earlier. Additionally,
many authors provide their own personal codes. For example, Young Hoon Lee
provides GAUSS code for a variety of cross-sectional and panel data stochastic
frontier estimators on his webpage complete with several datasets
(https://sites.google.com/site/yhnlee3/SFM-code). Federico Belotti provides integrated STATA

16 The frontier package accesses the Frontier V4.1 Fortran codes originally developed by
Tim Coelli, which are also freely available (at http://www.uq.edu.au/economics/cepa/frontier.php),
although they are fairly outdated by now (see also
https://cran.r-project.org/web/packages/frontier/frontier.pdf).
404 S. C. Kumbhakar et al.

code which works with the basic frontier capabilities. These new codes are
sfcross and sfpanel and can be obtained through his blog http://www.
econometrics.it.17
However, there does not yet exist a single software package that implements all of the
available estimators described here. This should not be surprising. As with any
applied field, as statistical improvements are made, there is a lag with available
software, and the array of options makes it infeasible to include all discussed models
in a single package. Researchers interested in the newest methods can invest in
programming these methods and disseminating them to the field, or can collaborate
with the authors of the original models to develop software that can be made widely
available, and we strongly encourage researchers to do so.
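
For readers who do want to program an estimator themselves, the core of the basic model is small. The following is a minimal sketch, not taken from any of the packages mentioned above, of the log-likelihood of the normal/half-normal production frontier with composed error eps = v - u. It is written in plain Python; the function names and the list-based data layout are assumptions made purely for illustration, and a numerical optimizer would still be needed to maximize the function.

```python
import math


def log_density(eps, sigma_v, sigma_u):
    """Log density of eps = v - u with v ~ N(0, sigma_v^2) and u ~ |N(0, sigma_u^2)|.

    Uses the standard parameterization sigma^2 = sigma_v^2 + sigma_u^2 and
    lambda = sigma_u / sigma_v, so that
        f(eps) = (2 / sigma) * phi(eps / sigma) * Phi(-lambda * eps / sigma).
    """
    sigma = math.hypot(sigma_v, sigma_u)          # sqrt(sigma_v^2 + sigma_u^2)
    lam = sigma_u / sigma_v
    z = eps / sigma
    log_phi = -0.5 * math.log(2.0 * math.pi) - 0.5 * z * z
    # Phi(-lam * z) written with the complementary error function
    cdf = 0.5 * math.erfc(lam * z / math.sqrt(2.0))
    return math.log(2.0 / sigma) + log_phi + math.log(cdf)


def log_likelihood(beta, sigma_v, sigma_u, X, y):
    """Sum of log densities of the residuals y_i - x_i'beta.

    X is a list of rows (hypothetical layout: each row includes the constant),
    y is a list of (log) outputs, beta is a list of coefficients.
    """
    total = 0.0
    for row, yi in zip(X, y):
        fitted = sum(b * xij for b, xij in zip(beta, row))
        total += log_density(yi - fitted, sigma_v, sigma_u)
    return total
```

In practice one would pass the negative of `log_likelihood` to a general-purpose optimizer and reparameterize the two standard deviations (for example in logs) to keep them positive; the more elaborate panel and latent class estimators discussed in this chapter require considerably more work than this sketch suggests.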

Conclusions

The review in this and the previous chapter was meant to highlight
some of the most important econometric developments over the past 40 years
that improve the estimation of productivity and efficiency.
While, in the previous chapter, we covered the workhorse SFM and how to include
determinants of inefficiency and productivity and how to deal with endogeneity,
in this chapter, we focused on panel data, quantile estimation, and robust
methods involving nonparametric regression and local likelihood. All told, a variety
of methods and models exist for the practitioner, and our hope is that this review
will encourage applied researchers to move away from some of the basic SFMs in
search of more robust and insightful conclusions.
While much has been covered, much remains unsaid. Important areas that are
still being developed include modeling dependence between statistical noise and
inefficiency, selection of firm technology, handling heterogeneous technology in a
sample of firms, and how to allow a subset of firms to be fully efficient. While
our discussion was couched in terms of the single equation stochastic production
frontier, system-based approaches surrounding cost, profit, or revenue frontiers are
also available, and, similar to the other methods mentioned earlier without any
details, they deserve attention and separate reviews.

Cross-References

Activity Analysis in Production Economics

Data Envelopment Analysis: A Nonparametric Method of Production Analysis

17 One can install these commands via net install sfcross, all from(http://www.econometrics.it/stata)
and net install sfpanel, all from(http://www.econometrics.it/stata); see also
https://sites.google.com/site/productivityefficiency/home1. For details refer to
Chapter 17 of [93].
9 Stochastic Frontier Analysis: Foundations and Advances II 405

References
1. Aigner D, Lovell CAK, Schmidt P (1977) Formulation and estimation of stochastic frontier
production functions. J Econ 6(1):21–37
2. Aitchison J, Aitken C (1976) Multivariate binary discrimination by the kernel method.
Biometrika 63:413–420
3. Amsler C, O’Donnell CJ, Schmidt P (2017) Stochastic metafrontiers. Econ Rev 36:
1007–1020
4. Atkinson SE, Tsionas EG (2016) Directional distance functions: optimal endogenous direc-
tions. J Econ 190:301–314
5. Azzalini A (1985) A class of distributions which includes the normal ones. Scand J Stat
12(2):171–178
6. Badunenko O, Kumbhakar SC (2017) Economies of scale, technical change and persistent and
time-varying cost efficiency in Indian banking: do ownership, regulation and heterogeneity
matter? Eur J Oper Res 260:789–803
7. Baltagi BH (2013) Econometric analysis of panel data, 5th edn. John Wiley & Sons, Great
Britain
8. Banker RD, Maindiratta A (1992) Maximum likelihood estimation of monotone and concave
production frontiers. J Prod Anal 3(4):401–415
9. Banker RD, Natarajan R (2008) Evaluating contextual variables affecting productivity using
data envelopment analysis. Oper Res 56(1):48–58
10. Battese GE, Coelli TJ (1988) Prediction of firm-level technical efficiencies with a generalized
frontier production function and panel data. J Econ 38:387–399
11. Battese GE, Coelli TJ (1992) Frontier production functions, technical efficiency and panel
data: with application to paddy farmers in India. J Prod Anal 3:153–169
12. Battese GE, Coelli TJ (1995) A model for technical inefficiency effects in a stochastic frontier
production function for panel data. Empir Econ 20(1):325–332
13. Battese GE, Rao DSP (2002) Technology gap, efficiency and a stochastic metafrontier
function. Int J Bus Econ 1:1–7
14. Battese GE, Rao DSP, O’Donnell CJ (2004) A metafrontier production function for estimation
of technical efficiencies and technology gaps for firms operating under different technologies.
J Prod Anal 21:91–103
15. Behr A (2010) Quantile regression for robust bank efficiency score estimation. Eur J Oper
Res 200:568–581
16. Bernini C, Freo M, Gardini A (2004) Quantile estimation of frontier production function.
Empir Econ 29:373–381
17. Butler J, Moffitt R (1982) A computationally efficient quadrature procedure for the one factor
multinomial probit model. Econometrica 50:761–764
18. Caudill SB (2003) Estimating a mixture of stochastic frontier regression models via the EM
algorithm: a multiproduct cost function application. Empir Econ 28(1):581–598
19. Chen L-H, Cheng M-Y, Peng L (2009) Conditional variance estimation in heteroscedastic
regression models. J Stat Plan Inference 139(2):236–245
20. Chen Y-Y, Schmidt P, Wang H-J (2014) Consistent estimation of the fixed effects stochastic
frontier model. J Econ 181(2):65–76
21. Chu C-Y, Henderson DJ, Parmeter CF (2017, forthcoming). On discrete Epanechnikov
kernels. Comput Stat Data Anal
22. Coelli T, Henningsen A (2013) Frontier: stochastic frontier analysis. R package version 1.1-0.
http://CRAN.R-Project.org/package=frontier
23. Colombi R, Kumbhakar S, Martini G, Vittadini G (2014) Closed-skew normality in stochastic
frontiers with individual effects and long/short-run efficiency. J Prod Anal 42(2):123–136
24. Colombi R, Martini G, Vittadini G (2011) A stochastic frontier model with short-run and long-
run inefficiency random effects. Department of Economics and Technology Management,
University of Bergamo, Working Paper Series
25. Cornwell C, Schmidt P (1992) Models for which the MLE and the conditional MLE coincide.
Empir Econ 17(2):67–75
26. Cornwell C, Schmidt P, Sickles RC (1990) Production frontiers with cross-sectional and time-
series variation in efficiency levels. J Econ 46(2):185–200
27. Cuesta RA (2000) A production model with firm-specific temporal variation in technical
inefficiency: With application to Spanish dairy farms. J Prod Anal 13:139–152
28. Daouia A, Park BU (2013) On projection-type estimators of multivariate isotonic functions.
Scand J Stat 40:363–386
29. Daouia A, Simar L (2005) Robust nonparametric estimators of monotone boundaries.
J Multivar Anal 96(2):311–331
30. Domínguez-Molina JA, González-Farías G, Ramos-Quiroga R (2003) Skew normality in
stochastic frontier analysis. Comunicación Técnica No I-03-18/06-10-2003 (PE/CIMAT)
31. Du P, Parmeter CF, Racine JS (2013) Nonparametric kernel regression with multiple
predictors and multiple shape constraints. Stat Sin 23(3):1347–1371
32. Fan J, Gijbels I (1996) Local polynomial modelling and its application. Chapman and Hall,
Boca Raton
33. Fan J, Yao Q (1998) Efficient estimation of conditional variance functions in stochastic
regression. Biometrika 85:645–660
34. Fan Y, Li Q, Weersink A (1996) Semiparametric estimation of stochastic production frontier
models. J Bus Econ Stat 14(4):460–468
35. Färe R, Martins-Filho C, Vardanyan M (2010) On functional form representation of multi-
output production technologies. J Prod Anal 33(1):81–96
36. Greene WH (2005) Fixed and random effects in stochastic frontier models. J Prod Anal
23(1):7–32
37. Greene WH (2005) Reconsidering heterogeneity in panel data estimators of the stochastic
frontier model. J Econ 126(2):269–303
38. Greene WH (2010) A stochastic frontier model with correction for sample selection. J Prod
Anal 34(1):15–24
39. Greene WH, Filippini M (2014) Persistent and transient productive inefficiency: a maximum
simulated likelihood approach. CER-ETH – Center of Economic Research at ETH Zurich,
Working Paper 14/197
40. Hall P, Huang H (2001) Nonparametric kernel regression subject to monotonicity constraints.
Ann Stat 29(3):624–647
41. Hall P, Simar L (2002) Estimating a changepoint, boundary or frontier in the presence of
observation error. J Am Stat Assoc 97:523–534
42. Henderson DJ, Parmeter CF (2015) Applied nonparametric econometrics. Cambridge Uni-
versity Press, Cambridge, Great Britain
43. Horrace WC, Parmeter CF (2011) Semiparametric deconvolution with unknown error
variance. J Prod Anal 35(2):129–141
44. Horrace WC, Parmeter CF (2014) A Laplace stochastic frontier model. University of Miami
Working Paper
45. Hulten CR (2001) Total factor productivity: a short biography. In: Hulten CR, Dean ER,
Harper MJ (eds) New developments in productivity analysis. University of Chicago Press,
Chicago, pp 1–54
46. Jondrow J, Lovell CAK, Materov IS, Schmidt P (1982) On the estimation of technical
efficiency in the stochastic frontier production function model. J Econ 19(2/3):233–238
47. Kneip A, Simar L (1996) A general framework for frontier estimation with panel data. J Prod
Anal 7(2):187–212
48. Kneip A, Simar L, Van Keilegom I (2015) Frontier estimation in the presence of measurement
error with unknown variance. J Econ 184:379–393
49. Knox KJ, Blankmeyer EC, Stutzman JR (2007) Technical efficiency in Texan nursing
facilities: a stochastic production frontier approach. J Econ Financ 31(1):75–86
50. Koenker R (2005) Quantile regression. Cambridge University Press, Cambridge
51. Koenker R, Bassett G (1978) Regression quantiles. Econometrica 46(1):33–50
52. Koenker R, Hallock K (2001) Quantile regression. J Econ Perspect 15:143–156

53. Kumbhakar SC (1987) The specification of technical and allocative inefficiency in stochastic
production and profit frontiers. J Econ 34(1):335–348
54. Kumbhakar SC (1990) Production frontiers, panel data, and time-varying technical ineffi-
ciency. J Econ 46(1):201–211
55. Kumbhakar SC (1991) The measurement and decomposition of cost-inefficiency: the translog
cost system. Oxf Econ Pap 43(6):667–683
56. Kumbhakar SC, Heshmati A (1995) Efficiency measurement in Swedish dairy farms: an
application of rotating panel data, 1976–1988. Am J Agric Econ 77(3):660–674
57. Kumbhakar SC, Hjalmarsson L (1993) Technical efficiency and technical progress in Swedish
dairy farms. In: Fried HO, Lovell CAK, Schmidt SS (eds) The measurement of productive efficiency.
Oxford University Press, Oxford, UK
58. Kumbhakar SC, Hjalmarsson L (1998) Relative performance of public and private ownership
under yardstick competition: electricity retail distribution. Eur Econ Rev 42(1):97–122
59. Kumbhakar SC, Lien G, Hardaker JB (2014) Technical efficiency in competing panel data
models: a study of Norwegian grain farming. J Prod Anal 41(2):321–337
60. Kumbhakar SC, Park BU, Simar L, Tsionas EG (2007) Nonparametric stochastic frontiers: a
local maximum likelihood approach. J Econ 137(1):1–27
61. Kumbhakar SC, Parmeter CF (2009) The effects of match uncertainty and bargaining on labor
market outcomes: evidence from firm and worker specific estimates. J Prod Anal 31(1):1–14
62. Kumbhakar SC, Parmeter CF (2010) Estimation of hedonic price functions with incomplete
information. Empir Econ 39(1):1–25
63. Kumbhakar SC, Parmeter CF, Tsionas E (2013) A zero inefficiency stochastic frontier
estimator. J Econ 172(1):66–76
64. Kumbhakar SC, Tsionas EG, Sipiläinen T (2009) Joint estimation of technology choice and
technical efficiency: an application to organic and conventional dairy farming. J Prod Anal
31(2):151–161
65. Kumbhakar SC, Wang H-J (2005) Production frontiers, panel data, and time-varying technical
inefficiency. J Econ 46(1):201–211
66. Kumbhakar SC, Wang H-J, Horncastle A (2015) A practitioner's guide to stochastic frontier
analysis using stata. Cambridge University Press, Cambridge, UK
67. Kuosmanen T, Johnson A, Saastamoinen A (2015) Stochastic nonparametric approach
to efficiency analysis: a unified framework. In: Zhu J (ed) Data envelopment analysis.
International series in Operations research & management science. Springer Science, New
York, chapter 7, pp 191–244
68. Kuosmanen T, Kortelainen M (2012) Stochastic non-smooth envelopment of data: semi-
parametric frontier estimation subject to shape constraints. J Prod Anal 38(1):11–28
69. Lee Y, Schmidt P (1993) A production frontier model with flexible temporal variation
in technical efficiency. In: Fried HO, Lovell CAK, Schmidt SS (eds) The measurement of productive
efficiency. Oxford University Press, Oxford, UK
70. Li D, Simar L, Zelenyuk V (2016) Generalized nonparametric smoothing with mixed discrete
and continuous data. Comput Stat Data Anal 100:424–444
71. Li Q, Racine J (2007) Nonparametric econometrics: theory and practice. Princeton University
Press, Princeton
72. Liu C, Laporte A, Ferguson BS (2008) The quantile regression approach to efficiency
measurement: insights from Monte Carlo simulations. Health Econ 17:1073–1087
73. Martins-Filho CB, Yao F (2015) Semiparametric stochastic frontier estimation via profile
likelihood. Econ Rev 34(4):413–451
74. Materov IS (1981) On full identification of the stochastic production frontier model (in
Russian). Ekonomika i Matematicheskie Metody 17:784–788
75. Mundlak Y (1961) Empirical production function free of management bias. J Farm Econ
43(1):44–56
76. Neyman J, Scott EL (1948) Consistent estimation from partially consistent observations.
Econometrica 16:1–32
77. Noh H (2014) Frontier estimation using kernel smoothing estimators with data transforma-
tion. J Korean Stat Soc 43:503–512
78. O’Donnell CJ, Rao DSP, Battese GE (2008) Metafrontier frameworks for the study of firm-
level efficiencies and technology ratios. Empir Econ 34:231–255
79. Olson JA, Schmidt P, Waldman DA (1980) A Monte Carlo study of estimators of stochastic
frontier production functions. J Econ 13:67–82
80. Orea L, Kumbhakar SC (2004) Efficiency measurement using a latent class stochastic frontier
model. Empir Econ 29(1):169–183
81. Papadopoulos A (2015) The half-normal specification for the two-tier stochastic frontier
model. J Prod Anal 43(2):225–230
82. Park BU, Simar L, Zelenyuk V (2015) Categorical data in local maximum likelihood: theory
and applications to productivity analysis. J Prod Anal 43(1):199–214
83. Parmeter CF, Kumbhakar SC (2014) Efficiency Analysis: A Primer on Recent Advances.
Found Trends Econ 7(3–4):191–385
84. Parmeter CF, Racine JS (2012) Smooth constrained frontier analysis. In: Chen X, Swanson N
(eds) Recent advances and future directions in causality, prediction, and specification analysis:
essays in honor of Halbert L. White Jr. Springer, New York, chapter 18, pp 463–489
85. Parmeter CF, Wang H-J, Kumbhakar SC (2017) Nonparametric estimation of the
determinants of inefficiency. J Prod Anal 47(3):205–221
86. Parmeter CF, Zelenyuk V (2019) A bridge too far? the state of the art in combining the virtues
of stochastic frontier analysis and data envelopment analysis. Operations Research 67(6):
1628–1658
87. Pitt MM, Lee L-F (1981) The measurement and sources of technical inefficiency in the
Indonesian weaving industry. J Dev Econ 9(1):43–64
88. Polachek SW, Yoon BJ (1987) A two-tiered earnings frontier estimation of employer and
employee information in the labor market. Rev Econ Stat 69(2):296–302
89. Polachek SW, Yoon BJ (1996) Panel estimates of a two-tiered earnings frontier. J Appl Econ
11(2):169–178
90. Racine JS, Li Q (2004) Nonparametric estimation of regression functions with both categori-
cal and continuous data. J Econ 119(1):99–130
91. Robinson PM (1988) Root-n consistent semiparametric regression. Econometrica 56:931–954
92. Schmidt P, Sickles RC (1984) Production frontiers and panel data. J Bus Econ Stat 2(2):
367–374
93. Sickles R, Zelenyuk V (2019) Measurement of productivity and efficiency: theory and
practice. Cambridge University Press, Cambridge
94. Simar L, Van Keilegom I, Zelenyuk V (2017) Nonparametric least squares methods for
stochastic frontier models. J Prod Anal 47(3):189–204
95. Simar L, Wilson PW (2007) Estimation and inference in two-stage, semi-parametric models
of production processes. J Econ 136(1):31–64
96. Simar L, Wilson PW (2010) Inferences from cross-sectional, stochastic frontier models. Econ
Rev 29(1):62–98
97. Simar L, Wilson PW (2011) Two-stage DEA: Caveat emptor. J Prod Anal 36(2):205–218
98. Simar L, Zelenyuk V (2011) Stochastic FDH/DEA estimators for frontier analysis. J Prod
Anal 36(1):1–20
99. Tibshirani R, Hastie T (1987) Local likelihood estimation. J Am Stat Assoc 82:559–568
100. Tran KC, Tsionas EG (2009) Estimation of nonparametric inefficiency effects stochastic
frontier models with an application to British manufacturing. Econ Model 26:904–909
101. Wang H-J (2002) Heteroscedasticity and non-monotonic efficiency effects of a stochastic
frontier model. J Prod Anal 18(2):241–253
102. Wang H-J, Ho C-W (2010) Estimating fixed-effect panel stochastic frontier models by model
transformation. J Econ 157(2):286–296
103. Wooldridge JM (2010) Econometric analysis of cross section and panel data, 2nd edn. MIT
Press, Cambridge, MA
10 Data Envelopment Analysis: A Nonparametric Method of Production Analysis

Subhash C. Ray

Contents
Introduction
The Production Technology and Technical Efficiency
Shephard Distance Functions
Input and Output Sets
Nonparametric Construction of the Technology and Measurement of Technical Efficiency
Assumptions
DEA Models for Measuring Output and Output-Oriented Technical Efficiency
Technology and Efficiency Under Constant Returns to Scale
Multiplier Models
Scale Efficiency
Ray Average Productivity and Returns to Scale
Most Productive Scale Size
Identifying the Nature of Local Returns to Scale
Identifying Returns to Scale for Inefficient Unit
The Case of Multiple MPSS
Choice Between Input- and Output-Oriented Projections
Graph Efficiency Measures
Graph Hyperbolic Distance Function
Directional Distance Function
Non-radial Measures of Efficiency
Non-radial Russell Output Efficiency
Non-radial Russell Input Efficiency
Pareto-Koopmans Measures
Efficiency Measurement with Market Prices
Cost Efficiency
Fixed Inputs and Short Run Cost Minimization
Using Total Cost as an Aggregate Input
Multi-location Cost Minimization
Revenue Efficiency
Profit Efficiency
Capacity Utilization
A Physical Measure of Short Run Capacity Utilization
Long Run Capacity Output
Economic Scale Efficiency
Efficiency Measurement with Bad Outputs
Bad Output as Input
Good and Bad Outputs as Joint Products
Bad Output as a By-Product
Joint Disposability and Material Balance Principle
Contextual Variables in DEA
All-Inclusive DEA
A Second Stage Regression
A Three-Stage Analysis
Conclusion
References

S. C. Ray
Department of Economics, University of Connecticut, Storrs, CT, USA
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_24

Abstract

In the Operations Research/Management Science literature, the nonparametric
method of Data Envelopment Analysis (DEA) has gained wide popularity as
a valid analytical format for efficiency evaluation. In economics, however, its
reception has been far less enthusiastic. Yet, the intellectual roots of DEA go back
to the seminal contributions to nonparametric analysis of production by Debreu,
Shephard, Farrell, Afriat, and others. Over the past four decades, DEA has
matured into a full-blown nonparametric methodology for measuring productive
efficiency that serves as an alternative to parametric Stochastic Frontier Analysis
(SFA). Both grounded in the neoclassical theory of production, DEA and SFA
provide the researcher alternative ways to calibrate testable relations between
inputs, outputs, costs, revenue, and profit.
Starting from the central concept of the production possibility set, this chapter
provides a broad overview of the literature on DEA methodology for radial and
non-radial measurement of technical efficiency from input and output quantity
data under alternative returns to scale assumptions. This is followed by models
for performance evaluation in the presence of market prices through cost,
revenue, and overall profit efficiency, both in the long run, when all inputs are
variable, and in the short run, when some inputs are fixed. DEA models for
physical measures of the capacity output in the short run and economic measures
of capacity in the long run are discussed. Alternative ways to incorporate
the production of ‘bad’ or undesirable outputs collaterally with the ‘good’
or intended output in DEA models for efficiency measurement are presented.
Finally, the role of contextual or environmental variables that affect efficiency is
also discussed.
Keywords

Neoclassical production theory · Production efficiency · Returns to scale ·
Cost efficiency · Capacity utilization · Bad outputs

Introduction

Charnes, Cooper, and Rhodes (CCR) [19] introduced data envelopment analysis
(DEA) as a nonparametric method of measuring technical efficiency in a paper
published in the European Journal of Operational Research in 1978. At that
time, the journal was only in its second year of publication. In the Operations
Research (OR)/Management Science (MS) literature, DEA was enthusiastically
accepted as a path-breaking contribution and within a few years found its place
in the analytical tool kit of empirical researchers. By contrast, its reception in
economics was lukewarm at best and, in general, outright skeptical. Apparently,
CCR first tried The American Economic Review, the iconic economics journal, as
the outlet for their new model and went to EJOR only after they were frustrated
by rejections from top economics journals. In general, economists do not feel
quite comfortable dealing with mathematical programming problems incorporating
multiple inequality constraints. Because these constraints cannot easily be assumed
to hold as equalities simultaneously, the problem cannot be reduced to a standard
Lagrange multiplier problem.
Moreover, DEA does not readily accommodate random noise in the data. Finally,
in the absence of an explicit functional form of the underlying production, cost,
or profit function, one cannot extract ready-to-use elasticities for policy evaluation.
All this may account (at least partially) for the general reluctance to accept DEA
as a valid empirical method in mainstream economics. Nonetheless, in economics
the intellectual roots of DEA go all the way back to Debreu [28], Koopmans [44],
Shephard [72], and Farrell [37] continuing further in the works of Afriat [1] and
Hanoch and Rothschild [42] and beyond.
At present, there are two different views of DEA among academics. In
the OR/MS field, DEA is considered to be an extension of the method of linear
programming (LP) introduced by Dantzig [27] and Charnes, Cooper, and Mellon
[18] and others. The novelty in DEA is that unlike in earlier formulations of LP
problems, the input requirements per unit of the output (sometimes described as
the activity vector) are unknown. The primary emphasis there is on the algebraic
formulation of the optimization problems and their solution algorithms. In eco-
nomics, by contrast, DEA is an extension and refinement of the nonparametric
approach to production analysis and is firmly grounded in neoclassical production
economics. This is evident from the fact that, although originally designed to measure
the technical efficiency of production units, the DEA methodology has been extended
to address general questions about the technology, ranging from capacity utilization
to technical change.
The objective of this chapter is to provide a broad overview of DEA as a
nonparametric method of empirical analysis of the production technology and producer
behavior. The sequence of topics discussed in this chapter is not necessarily in the
chronological order of their development in the DEA literature. Rather, it starts with
the most general characterization of the technology (exhibiting variable returns to
scale (VRS)) and gradually imposes additional restrictions (like constant returns to
scale (CRS)) and/or makes behavioral assumptions like cost minimization, revenue
maximization, or profit maximization.1 Our discussion of the different DEA LP
models is prefaced by the underlying economic theory so that each of these models
can be clearly interpreted economically. At the same time, this chapter does not
provide an overall review of nonparametric analysis of production.
The chapter is organized as follows. Section “The Production Technology and
Technical Efficiency” introduces the production technology represented either by
the production possibility set or by the families of input and output sets as
the reference for measuring technical efficiency in production and/or Shephard
distance functions. Section “Nonparametric Construction of the Technology and
Measurement of Technical Efficiency” spells out the underlying assumptions for
empirically constructing a nonparametric production possibility set and formulates
the DEA models for measuring technical efficiency under variable and constant
returns to scale. Section “Scale Efficiency” deals with scale efficiency and the most
productive scale size (MPSS) and shows how to identify the nature of (local) returns
to scale at any point on the frontier. Section “Graph Efficiency Measures” considers
the graph hyperbolic and the directional distance functions for measurement of
graph efficiency. Section “Non-radial Measures of Efficiency” deals with non-radial
input- and output-oriented Russell measures and the overall Pareto-Koopmans
measure (also known as the slack-based measure (SBM)). Section “Efficiency
Measurement with Market Prices” covers different measures of economic efficiency
including DEA models for measurement of cost efficiency (both in the long run and
in the short run) along with multi-location cost efficiency, revenue efficiency, and
profit efficiency. Measurements of capacity utilization – both physical measures of
capacity utilization (with fixed inputs) in the short run and economic measures in
the long run (without fixed inputs), are considered in section “Capacity Utilization.”
Section “Efficiency Measurement with Bad Outputs” deals with measurement of
efficiency in the presence of undesirable outputs. Section “Contextual Variables
in DEA” deals with non-discretionary (or contextual) variables that influence
productivity but are not within the control of the decision-maker. Section
“Conclusion” wraps up the chapter.

1 By contrast, the earlier models in DEA and activity analysis assumed CRS (e.g., CCR [19]), and
generalization to the VRS technology was a subsequent development [8].

The Production Technology and Technical Efficiency

The conceptual foundation of neoclassical production economics is the
production possibility set. Consider an industry where the individual production
decision-making units (generically described as firms in economics and as DMUs
in DEA) produce bundles of m outputs $y = (y_1, y_2, \ldots, y_r, \ldots, y_m)$ using bundles of
n inputs, $x = (x_1, x_2, \ldots, x_i, \ldots, x_n)$. An input-output pair $(x^0, y^0)$ constitutes a
feasible production plan if, and only if, output $y^0$ can be produced from input $x^0$.2
The production possibility set, T, consists of all feasible production plans and can
be defined as

\[ T = \{(x, y) : x \in \mathbb{R}^n_+,\ y \in \mathbb{R}^m_+;\ y \text{ can be produced from } x\} \tag{1} \]

It is assumed that T is a closed and bounded subset of $\mathbb{R}^{m+n}_+$. The production
technology of the industry is completely defined by the set T.
In the single output case, one defines a production function:

$$y^* = f(x_1, x_2, \ldots, x_i, \ldots, x_n) = f(x) \tag{2}$$

where $y^*$ is the maximum quantity of the (scalar) output y that can be produced from
the input bundle $x = (x_1, x_2, \ldots, x_i, \ldots, x_n)$. Under most circumstances, one may assume
that inputs may be left idle or otherwise wasted. As a result, the output actually
produced may be less than what is maximally producible. In this case, the production
possibility set can be defined as

$$T = \{(x, y) : y \le f(x)\}. \tag{3}$$

The graph of the production function (also known as the frontier of the
production possibility set) is

$$G = \{(x, y) : y = f(x)\}. \tag{4}$$

For any firm with the observed input-output bundle $(x^0, y_0) \in T$, $y_0 \le f(x^0)$. Its
output-oriented technical efficiency can be measured as

$$\tau_y(x^0, y_0) = \frac{y_0}{y_0^*} = \frac{y_0}{f(x^0)}. \tag{5}$$

The input-output bundle $(x^0, y_0^*)$ lies on the graph of the technology and serves
as the benchmark for measuring the output-oriented technical efficiency of the firm.

2 In this chapter, vectors are denoted by superscripts and scalars by subscripts.

414 S. C. Ray

An alternative way to measure technical efficiency would be to look for the
maximum reduction in inputs feasible without lowering the output below the
observed level, $y_0$. In other words, we want an input bundle $x^*$ such that $f(x^*) = y_0$.
Because there will be many such bundles, one possible way to get a unique
projection is to require that the target input bundle $x^*$ be proportional to the actual
bundle $x^0$. Let $x^* = \theta x^0$. Then $f(x^*) = f(\theta x^0) = y_0$, and the input-oriented technical
efficiency of the firm is

$$\tau_x(x^0, y_0) = \theta. \tag{6}$$
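As a small numerical illustration of (5) and (6), the following Python sketch uses a hypothetical single-input production function f(x) = √x. The functional form and the bundle (16, 2) are assumptions made only for this example and do not come from the chapter.

```python
import math

def f(x):
    """Hypothetical production function, assumed only for illustration."""
    return math.sqrt(x)

def output_efficiency(x0, y0):
    """tau_y in (5): observed output relative to the frontier output f(x0)."""
    return y0 / f(x0)

def input_efficiency(x0, y0):
    """tau_x in (6): the theta solving f(theta * x0) = y0.
    For f = sqrt this gives theta = y0**2 / x0."""
    return y0 ** 2 / x0

x0, y0 = 16.0, 2.0                 # assumed observed bundle
tau_y = output_efficiency(x0, y0)  # 2 / sqrt(16)
tau_x = input_efficiency(x0, y0)   # 4 / 16
print(tau_y, tau_x)
```

For this bundle the two orientations give different scores (0.5 and 0.25), which is the usual situation when the technology does not exhibit constant returns to scale.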

When the technology involves multiple outputs and multiple inputs, one cannot
define the production possibility set T in terms of a production function and, instead,
has to consider a production correspondence mapping from the input space into the
output space through the transformation function

$$F(x, y) = 0. \tag{7}$$

An input-output combination $(x^0, y^0)$ is a feasible production plan if and only if
$F(x^0, y^0) \le 0$. Hence, the production possibility set can be defined as

$$T = \{(x, y) : F(x, y) \le 0\}. \tag{8}$$

For any specific input-output pair $(x^0, y^0)$, the output-oriented technical
efficiency is

$$\tau_y(x^0, y^0) = \frac{1}{\phi^*} \quad \text{where} \quad \phi^* = \max\{\phi : (x^0, \phi y^0) \in T\}. \tag{9}$$

The corresponding input-oriented technical efficiency is

$$\tau_x(x^0, y^0) = \min\{\theta : (\theta x^0, y^0) \in T\}. \tag{10}$$

Shephard Distance Functions

Shephard [72, 73] defined the (output) distance function evaluated at any arbitrary
input-output bundle $(x^0, y^0)$ as

$$D^y(x^0, y^0) = \min\left\{\lambda : \left(x^0, \tfrac{1}{\lambda} y^0\right) \in T\right\}. \tag{11}$$

Note that λ > 1 implies that $(x^0, y^0)$ is infeasible whereas if λ < 1, $(x^0, y^0)$ is
technically inefficient and the output bundle can be scaled upward without using
any more input. Finally, λ = 1 implies that the bundle is technically efficient. Thus,
the distance function relates to the transformation function as

$$D^y(x^0, y^0) = \min\left\{\lambda : F\left(x^0, \tfrac{1}{\lambda} y^0\right) \le 0\right\} \tag{12}$$

and an alternative characterization of the production possibility set is

$$T = \{(x, y) : D^y(x, y) \le 1\}. \tag{13}$$

Also, it is evident from a comparison of (12) with (9) that

$$D^y(x^0, y^0) = \tau_y(x^0, y^0). \tag{14}$$

Analogous to the output distance function is the Shephard input distance function

$$D^x(x^0, y^0) = \max\left\{\beta : \left(\tfrac{1}{\beta} x^0, y^0\right) \in T\right\}. \tag{15}$$

Note that β in (15) is the inverse of θ in (10). Hence,

$$D^x(x^0, y^0) = \frac{1}{\tau_x(x^0, y^0)}. \tag{16}$$
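The relations in (11) through (16) can be checked on a toy technology. The sketch below assumes a hypothetical transformation function with one input and two outputs, F(x, y) = y1² + y2² − x; this form and the evaluated bundle are illustrative assumptions, not part of the chapter. For this F, both distance functions have closed forms.

```python
import math

def F(x, y):
    """Hypothetical transformation function: (x, y) is feasible iff F(x, y) <= 0."""
    return y[0] ** 2 + y[1] ** 2 - x

def output_distance(x, y):
    """D^y in (11): smallest lam with F(x, y / lam) <= 0, i.e. |y|^2 / lam^2 <= x."""
    return math.sqrt((y[0] ** 2 + y[1] ** 2) / x)

def input_distance(x, y):
    """D^x in (15): largest beta with F(x / beta, y) <= 0, i.e. |y|^2 <= x / beta."""
    return x / (y[0] ** 2 + y[1] ** 2)

x0, y0 = 100.0, (3.0, 4.0)        # assumed interior (inefficient) bundle
d_y = output_distance(x0, y0)     # equals tau_y by (14)
d_x = input_distance(x0, y0)      # equals 1 / tau_x by (16)

# Radial projections implied by the two distance functions.
y_proj = (y0[0] / d_y, y0[1] / d_y)
x_proj = x0 / d_x
print(d_y, d_x, F(x0, y_proj), F(x_proj, y0))
```

Both projections land on the frontier F = 0, while the scores differ by orientation: here the output-oriented efficiency is 0.5 and the input-oriented efficiency is 1/4.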

Input and Output Sets

An alternative characterization of the production possibility set T is possible in
terms of a family of input sets. For any output bundle y, the input (requirement)
set consists of all input bundles that can produce that output bundle and can be
expressed as

$$V(y) = \{x : (x, y) \in T\}. \tag{17}$$

The frontier of the input requirement set is the input isoquant

$$\bar{V}(y^0) = \{x : x \in V(y^0) \text{ and } \alpha < 1 \Rightarrow \alpha x \notin V(y^0)\}. \tag{18}$$

It is clear that $\tau_x(x^0, y^0) = 1 \Rightarrow x^0 \in \bar{V}(y^0)$.
In a comparable manner, one can define the output set of any input bundle x as

$$P(x) = \{y : (x, y) \in T\} \tag{19}$$

and the output isoquant

$$\bar{P}(x) = \{y : (x, y) \in T \text{ and } \alpha > 1 \Rightarrow (x, \alpha y) \notin T\}. \tag{20}$$

Further, $\tau_y(x^0, y^0) = 1 \Rightarrow y^0 \in \bar{P}(x^0)$.

Nonparametric Construction of the Technology and Measurement of Technical Efficiency

In practice, the production technology is unknown, and one needs to construct an
approximation to the production possibility set T from data. The nonparametric
method of data envelopment analysis (DEA) enables one to construct from the
observed input-output data a piecewise linear approximation to the frontier of
the production possibility set.3 In DEA one relies on a number of fairly weak
assumptions about the production technology but leaves the exact functional form
of the frontier unspecified. The result is a conservative approximation to the frontier.

Assumptions

In particular, it is assumed that:

(A1) All observed input-output bundles are feasible

(A2) The production possibility set is convex
(A3) Inputs are freely disposable
(A4) Outputs are freely disposable

Consider the data set consisting of the input-output bundles of N firms in the
sample

$$D = \{(x^j, y^j);\ j = 1, 2, \ldots, N\}. \tag{21}$$

• Assumption (A1) implies that $(x^j, y^j) \in D \Rightarrow (x^j, y^j) \in T$.
• The convexity assumption in (A2) implies (x1 , y1 ) ∈ T ∧ (x2 , y2 ) ∈ T ⇒ (λx1 +
(1 − λ)x2 , λy1 + (1 − λ)y2 ) ∈ T ∀ λ ∈ (0, 1). Thus, (A1–A2) together imply that
every convex combination of the observed bundles in D will also be in T.
• Free disposability of inputs in (A3) implies that if (x0 , y0 ) ∈ T, then for every
x ≥ x0 , (x, y0 ) ∈ T.

3 The more popular alternative is Stochastic frontier analysis (SFA) introduced by Aigner, Lovell,
and Schmidt [2] where one uses the maximum likelihood procedure to estimate a parametrically
specified frontier production function incorporating a one-sided error term representing ineffi-
ciency and another two-sided error term representing random noise. See Kumbhakar and Lovell
[46] for a detailed discussion of Stochastic frontier analysis (SFA).

• Free disposability of outputs in (A4) implies that if (x0 , y0 ) ∈ T, then for every
y ≤ y0 , (x0 , y) ∈ T.4

Under the assumptions (A1)–(A4), an empirical approximation of the production
possibility set T is

$$\hat{T} = \left\{(x, y) : x \ge \sum_{j=1}^{N} \lambda_j x^j;\ y \le \sum_{j=1}^{N} \lambda_j y^j;\ \sum_{j=1}^{N} \lambda_j = 1;\ \lambda_j \ge 0;\ j = 1, 2, \ldots, N\right\}. \tag{22}$$

The set $\hat{T}$ is often described as the free disposal convex hull of the set D of the
observed input-output bundles and is the smallest set satisfying assumptions (A1)–
(A4). The frontier of the set $\hat{T}$ provides the tightest envelope that covers the data
from above, and $\hat{T}$ itself is an underapproximation (that is, a subset) of the true
production possibility set T.5

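DEA Models for Measuring Input- and Output-Oriented Technical Efficiency

The output-oriented technical efficiency of a firm using input $x^0$ and producing
output $y^0$ may be evaluated as

$$\tau_y(x^0, y^0) = \frac{1}{\phi^*}$$

where

$$\begin{aligned}
\phi^* = \max\ & \phi \\
\text{s.t. } & \sum_{j=1}^{N} \lambda_j y_r^j \ge \phi y_r^0 \quad (r = 1, 2, \ldots, m); \\
& \sum_{j=1}^{N} \lambda_j x_i^j \le x_i^0 \quad (i = 1, 2, \ldots, n); \\
& \sum_{j=1}^{N} \lambda_j = 1; \\
& \lambda_j \ge 0 \quad (j = 1, 2, \ldots, N);\ \phi \text{ unrestricted.}
\end{aligned} \tag{23}$$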

4 If the transformation function is differentiable, weak disposability of inputs and outputs will imply
$\partial F / \partial x_i \le 0$ for each input i and $\partial F / \partial y_r \ge 0$ for each output r. In a later section in this chapter and in
much greater detail in Chapter 12, “Bad Outputs” by Murty and Russell in this volume of
the Handbook, free or strong disposability is contrasted with weak disposability where an output
cannot be decreased (or an input increased) unilaterally but simultaneous reduction in multiple
outputs or increase in multiple inputs may be feasible.
5 In the DEA literature, this is often described as minimum extrapolation. It should be noted that
this is a criterion for estimation rather than a property of the technology.


Even though ϕ is unrestricted, when $(x^0, y^0)$ is one of the bundles in D (say the
bundle of firm k), $(\lambda_k = 1,\ \lambda_j = 0\ (j \ne k),\ \phi = 1)$ is a feasible solution and hence 1
would be a lower bound for $\phi^*$.6
The benchmark input-output bundle for $(x^0, y^0)$ is
$\left(x^* = \sum_{j=1}^{N} \lambda_j^* x^j,\ y^* = \sum_{j=1}^{N} \lambda_j^* y^j\right)$,
constructed from the optimal solution of the problem. For any output r, the
output slack $s_r^+ = \sum_{j=1}^{N} \lambda_j^* y_r^j - \phi^* y_r^0$ represents additional expansion of the output
feasible beyond the common expansion by the scalar $\phi^*$. Similarly, the input slack
$s_i^- = x_i^0 - \sum_{j=1}^{N} \lambda_j^* x_i^j$ is the potential reduction in input i. The scalar $\phi^*$ shows the
factor by which all outputs can be expanded without requiring any additional input.
In the single output case, the optimal value of the objective function in the output-
oriented DEA problem ($\phi^*$) yields an estimate of the maximum output7 producible
from the input bundle $x^0$ as

$$\hat{f}(x^0) = \phi^* y_0. \tag{24}$$

Based only on assumptions (A1)–(A4), it is the most conservative estimate of
the frontier output, and, hence, $\tau_y(x^0, y_0) = 1/\phi^*$ is an upper bound on the output-
oriented technical efficiency of the firm.
This model was formulated in Banker, Charnes, and Cooper [9] and is commonly
known as the output-oriented BCC DEA model. It should be emphasized that
although the credit for this formulation of $\hat{T}$ is given to BCC, it was already
formulated in Afriat ([1] Theorem 1.2, p. 571) for the single output case.
The corresponding input-oriented technical efficiency of the firm using input $x^0$
and producing output $y^0$ may be evaluated as

$$\tau_x(x^0, y^0) = \theta^*,$$

where

6 When (x0 , y0 ) is not one of the observed bundles, non-negativity of the λs and the outputs will
ensure that ϕ will never be negative even if it is lower than 1. However, if any individual input in
the bundle x0 is smaller than the smallest value of the corresponding input across all observations
in the data set D, (23) will not have a feasible solution.
7 The true maximum may actually be considerably higher than $\phi^* y_0$. But we cannot infer that on
the basis of the observed input-output bundles without making additional assumptions about the
technology. However, it cannot be any smaller than $\phi^* y_0$ if the assumptions (A1)–(A4) hold.

Fig. 1 (a) Output-oriented technical efficiency. (b) Input-oriented technical efficiency

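$$\begin{aligned}
\theta^* = \min\ & \theta \\
\text{s.t. } & \sum_{j=1}^{N} \lambda_j y_r^j \ge y_r^0 \quad (r = 1, 2, \ldots, m); \\
& \sum_{j=1}^{N} \lambda_j x_i^j \le \theta x_i^0 \quad (i = 1, 2, \ldots, n); \\
& \sum_{j=1}^{N} \lambda_j = 1; \\
& \lambda_j \ge 0 \quad (j = 1, 2, \ldots, N);\ \theta \text{ unrestricted.}
\end{aligned} \tag{25}$$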

It should be noted that the benchmark input-output bundle
$\left(x^* = \sum_{j=1}^{N} \lambda_j^* x^j,\ y^* = \sum_{j=1}^{N} \lambda_j^* y^j\right)$
on the frontier for the input-oriented DEA problem will generally
be different from what was obtained as the efficient output-oriented projection.
Output- and input-oriented measurement of technical efficiency are shown
graphically for the 1-output 1-input case in Fig. 1a and b, respectively. The curve
$y^* = f(x)$ is the production frontier. Points A and B show two observed input-
output bundles $(x_A, y_A)$ and $(x_B, y_B)$. The output-oriented efficient projections of the
two points shown in Fig. 1a are $A^*(x_A, y_A^*)$ and $B^*(x_B, y_B^*)$, respectively. The
output-oriented technical efficiencies of the two bundles are $\tau_y(A) = \frac{O y_A}{O y_A^*}$ and
$\tau_y(B) = \frac{O y_B}{O y_B^*}$. In Fig. 1b, their input-oriented efficient projections are the points
$C(x_A^*, y_A)$ and $D(x_B^*, y_B)$. Their input-oriented technical efficiency measures are
$\tau_x(A) = \frac{O x_A^*}{O x_A}$ and $\tau_x(B) = \frac{O x_B^*}{O x_B}$.
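The two BCC problems (23) and (25) can be illustrated without an LP solver in the 1-output 1-input case. There, each problem has only three constraints, so some optimal solution mixes at most two observed firms, and it is enough to enumerate pairs of firms. The Python sketch below does exactly that. The four data points and the helper names are hypothetical, and the enumeration is a shortcut for this special case only, not the general method (which requires solving the linear programs).

```python
from itertools import combinations_with_replacement

# Hypothetical (input, output) observations, assumed only for illustration.
DATA = [(2.0, 2.0), (4.0, 6.0), (8.0, 8.0), (6.0, 5.0)]
TOL = 1e-9

def weights(a_j, a_k, a_0):
    """Candidate weights on firm j: the endpoints 0 and 1, plus the weight (if it
    lies in [0, 1]) at which lam * a_j + (1 - lam) * a_k equals a_0."""
    cands = [0.0, 1.0]
    if a_j != a_k:
        lam = (a_0 - a_k) / (a_j - a_k)
        if 0.0 <= lam <= 1.0:
            cands.append(lam)
    return cands

def bcc_output(x0, y0, data=DATA):
    """phi* in (23): largest output expansion using no more than x0."""
    best = None
    for (xj, yj), (xk, yk) in combinations_with_replacement(data, 2):
        for lam in weights(xj, xk, x0):
            x = lam * xj + (1 - lam) * xk
            y = lam * yj + (1 - lam) * yk
            if x <= x0 + TOL and (best is None or y / y0 > best):
                best = y / y0
    return best  # None when (23) has no feasible solution

def bcc_input(x0, y0, data=DATA):
    """theta* in (25): largest radial input contraction still producing y0."""
    best = None
    for (xj, yj), (xk, yk) in combinations_with_replacement(data, 2):
        for lam in weights(yj, yk, y0):
            x = lam * xj + (1 - lam) * xk
            y = lam * yj + (1 - lam) * yk
            if y >= y0 - TOL and (best is None or x / x0 < best):
                best = x / x0
    return best

phi_star = bcc_output(6.0, 5.0)
theta_star = bcc_input(6.0, 5.0)
print(1.0 / phi_star, theta_star)
```

For the assumed firm (6, 5), the output-oriented benchmark is (6, 7) and the input-oriented benchmark is (3.5, 5), so the two projections differ, as noted in the text. Calling `bcc_output(1.0, 1.0)` returns `None` because the input is smaller than every observed input, which is the infeasibility case described in footnote 6.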
For the output bundle y0 , the input set can be empirically constructed as

$$\hat{V}(y^0) = \left\{x : x \ge \sum_{j=1}^{N} \lambda_j x^j;\ \sum_{j=1}^{N} \lambda_j y^j \ge y^0;\ \sum_{j=1}^{N} \lambda_j = 1;\ \lambda_j \ge 0\ (j = 1, 2, \ldots, N)\right\}. \tag{26}$$

The input-oriented technical efficiency $\tau_x(x^0, y^0)$ can alternatively be measured
empirically as

$$\theta^* = \min\{\theta : \theta x^0 \in \hat{V}(y^0)\}. \tag{27}$$

For the input bundle $x^0$, the output set can be empirically constructed as

$$\hat{P}(x^0) = \left\{y : y \le \sum_{j=1}^{N} \lambda_j y^j;\ \sum_{j=1}^{N} \lambda_j x^j \le x^0;\ \sum_{j=1}^{N} \lambda_j = 1;\ \lambda_j \ge 0\ (j = 1, 2, \ldots, N)\right\}. \tag{28}$$

The output-oriented technical efficiency can alternatively be measured
empirically as

$$\tau_y(x^0, y^0) = \frac{1}{\phi^*}$$

where

$$\phi^* = \max\{\phi : \phi y^0 \in \hat{P}(x^0)\}. \tag{29}$$

Technology and Efficiency Under Constant Returns to Scale

So far no assumption has been made about returns to scale. In some cases, it would
be reasonable to assume constant returns to scale. The technology exhibits constant
returns to scale (CRS) globally if8

(x, y) ∈ T ⇒ (kx, ky) ∈ T ∀k ≥ 0. (30)

Under the CRS assumption, an empirical estimate of the production possibility
set is

8 The case of k = 0 corresponds to inaction when no input is used and no output is produced.

$$\hat{T}^C = \left\{(x, y) : x \ge \sum_{j=1}^{N} \lambda_j x^j;\ y \le \sum_{j=1}^{N} \lambda_j y^j;\ \lambda_j \ge 0;\ j = 1, 2, \ldots, N\right\}. \tag{31}$$

The set $\hat{T}^C$ is sometimes described as the free disposal conical hull of D.9

The output-oriented CRS technical efficiency is

$$\tau_y^C(x^0, y^0) = \frac{1}{\phi_C^*} \tag{32}$$

where

$$\begin{aligned}
\phi_C^* = \max\ & \phi \\
\text{s.t. } & \sum_{j=1}^{N} \lambda_j y^j \ge \phi y^0; \\
& \sum_{j=1}^{N} \lambda_j x^j \le x^0; \\
& \lambda_j \ge 0 \quad (j = 1, 2, \ldots, N);\ \phi \text{ unrestricted.}
\end{aligned} \tag{33}$$

Similarly, the input-oriented CRS technical efficiency is

$$\begin{aligned}
\tau_x^C(x^0, y^0) = \theta_C^* = \min\ & \theta \\
\text{s.t. } & \sum_{j=1}^{N} \lambda_j y^j \ge y^0; \\
& \sum_{j=1}^{N} \lambda_j x^j \le \theta x^0; \\
& \lambda_j \ge 0 \quad (j = 1, 2, \ldots, N);\ \theta \text{ unrestricted.}
\end{aligned} \tag{34}$$

It may be noted that, correspondingly, under CRS, the output distance function is
$D_C^y(x^0, y^0) = \tau_y^C(x^0, y^0) = \frac{1}{\phi_C^*}$, while the input distance function is
$D_C^x(x^0, y^0) = \frac{1}{\tau_x^C(x^0, y^0)} = \frac{1}{\theta_C^*}$. It is easy to verify that under the CRS assumption, input- and
output-oriented measures of technical efficiency are identical. In Fig. 2, the line
f (x) = kx is the CRS production frontier. For the input-output bundle shown by

9 Note the absence of the restriction that the λs add up to unity. Under convexity only,
$\left(x = \sum_{j=1}^{N} \lambda_j x^j,\ y = \sum_{j=1}^{N} \lambda_j y^j\right)$ is feasible so long as $\sum_{j=1}^{N} \lambda_j = 1$ and no $\lambda_j$ is negative. With
the added assumption of CRS, (kx, ky) is also feasible for any k ≥ 0. CRS implies that for
$\mu_j = k\lambda_j$, k ≥ 0, $\left(\sum_{j=1}^{N} \mu_j x^j,\ \sum_{j=1}^{N} \mu_j y^j\right)$ is feasible. But $\sum_{j=1}^{N} \mu_j = k$ need not be equal to 1.

Fig. 2 Technical efficiency under CRS

the point $P_A$, the output-oriented projection onto the frontier is $Q_A$, and the input-
oriented projection is $R_A$. It can be seen that
$\tau_y^C(P_A) = \frac{P_A x_A}{Q_A x_A} = \frac{R_A x_A^*}{Q_A x_A} = \frac{O x_A^*}{O x_A} = \tau_x^C(P_A)$.
Similarly,
$\tau_y^C(P_B) = \frac{P_B x_B}{Q_B x_B} = \frac{R_B x_B^*}{Q_B x_B} = \frac{O x_B^*}{O x_B} = \tau_x^C(P_B)$.
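The equality of the two CRS measures is easy to confirm numerically in the 1-output 1-input case, where problems (33) and (34) can be solved by inspection: all weight goes to the firm with the highest average productivity. The Python sketch below relies on that shortcut. The data points and function names are hypothetical and serve only as an illustration.

```python
# Hypothetical (input, output) observations, assumed only for illustration.
DATA = [(2.0, 2.0), (4.0, 6.0), (8.0, 8.0), (6.0, 5.0)]

def best_firm(data=DATA):
    """Firm with the highest average productivity y / x; it spans the CRS frontier."""
    return max(data, key=lambda p: p[1] / p[0])

def crs_output(x0, y0, data=DATA):
    """phi_C* in (33): the weight lam = x0 / xb exhausts the input x0."""
    xb, yb = best_firm(data)
    lam = x0 / xb
    return lam * yb / y0

def crs_input(x0, y0, data=DATA):
    """theta_C* in (34): the weight lam = y0 / yb just produces y0."""
    xb, yb = best_firm(data)
    lam = y0 / yb
    return lam * xb / x0

phi_c = crs_output(6.0, 5.0)
theta_c = crs_input(6.0, 5.0)
print(1.0 / phi_c, theta_c)  # the two CRS efficiency scores coincide
```

For the assumed firm (6, 5), both orientations give an efficiency of 5/9, in contrast with the VRS case, where input- and output-oriented scores generally differ.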

Multiplier Models

It is worthwhile at this point to look at the following LP dual of the minimization
problem in (34) above:

$$\begin{aligned}
\max\ & \sum_{r=1}^{m} u_r y_r^0 \\
\text{s.t. } & \sum_{r=1}^{m} u_r y_r^j - \sum_{i=1}^{n} v_i x_i^j \le 0 \quad (j = 1, 2, \ldots, N); \\
& \sum_{i=1}^{n} v_i x_i^0 = 1; \\
& u_r, v_i \ge 0 \quad (i = 1, 2, \ldots, n;\ r = 1, 2, \ldots, m).
\end{aligned} \tag{35}$$

Utilizing the normalization constraint $\sum_{i=1}^{n} v_i x_i^0 = 1$, one can rewrite (35) as

$$\begin{aligned}
h = \max\ & \frac{\sum_{r=1}^{m} u_r y_r^0}{\sum_{i=1}^{n} v_i x_i^0} \\
\text{s.t. } & \frac{\sum_{r=1}^{m} u_r y_r^j}{\sum_{i=1}^{n} v_i x_i^j} \le 1 \quad (j = 1, 2, \ldots, N); \\
& u_r, v_i \ge 0 \quad (i = 1, 2, \ldots, n;\ r = 1, 2, \ldots, m).
\end{aligned} \tag{36}$$

This is the original ratio form of technical efficiency proposed by CCR [19].
The multipliers u = (u1 , u2 , . . . , um ) are the shadow prices of outputs, while
v = (v1 , v2 , . . . , vn ) are the shadow prices of inputs.10 The objective function is
a measure of the shadow return on outlay of the unit under evaluation.
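For the VRS technology, consider the following dual of the output-oriented BCC
model in (23):

$$\begin{aligned}
\psi = \min\ & v_0 + \sum_{i=1}^{n} v_i x_i^0 \\
\text{s.t. } & v_0 + \sum_{i=1}^{n} v_i x_i^j - \sum_{r=1}^{m} u_r y_r^j \ge 0 \quad (j = 1, 2, \ldots, N); \\
& \sum_{r=1}^{m} u_r y_r^0 = 1; \\
& u_r, v_i \ge 0 \quad (i = 1, 2, \ldots, n;\ r = 1, 2, \ldots, m);\ v_0 \text{ unrestricted.}
\end{aligned} \tag{37}$$

By standard duality results, $\psi^*$ in (37) is equal to $\phi^*$ in (23). Hence, the ratio
measure of the output-oriented technical efficiency under VRS will be

$$\begin{aligned}
\tau_y(x^0, y^0) = \frac{1}{\psi^*} = \max\ & \frac{\sum_{r=1}^{m} u_r y_r^0}{v_0 + \sum_{i=1}^{n} v_i x_i^0} \\
\text{s.t. } & \frac{\sum_{r=1}^{m} u_r y_r^j}{v_0 + \sum_{i=1}^{n} v_i x_i^j} \le 1 \quad (j = 1, 2, \ldots, N); \\
& v_0 + \sum_{i=1}^{n} v_i x_i^0 = 1; \\
& u_r, v_i \ge 0 \quad (i = 1, 2, \ldots, n;\ r = 1, 2, \ldots, m);\ v_0 \text{ unrestricted.}
\end{aligned} \tag{38}$$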

10 These shadow prices or multipliers are uniquely designed for the unit under evaluation.

Scale Efficiency11

While full technical efficiency requires a firm to produce the maximum output
from its observed input bundle, in order to be considered scale efficient, the firm
needs to operate at the scale of the input where average productivity reaches a
maximum. In the 1-output 1-input case, the average productivity of a firm with
input-output $(x_0, y_0)$ is $\frac{y_0}{x_0}$. Now suppose that the production function is $y^* = f(x)$.
Along the production function, y = f (x) so that $AP(x) = \frac{y}{x} = \frac{f(x)}{x}$. Thus, if
$y = f(x_0)$, $AP(x_0) = \frac{f(x_0)}{x_0}$. Under VRS, average productivity varies across
different levels of the input. Let $x^*$ be the input level where average productivity
attains a maximum. In that case, $\frac{dAP(x)}{dx} = \frac{x f'(x) - f(x)}{x^2} = 0$ at the input level $x^*$.
Frisch [40] described the input level where average productivity is maximum as the
technically optimal production scale (TOPS). At the technically optimal input level
($x^*$), $\frac{dAP(x)}{dx} = 0$. That is, within a small neighborhood of $x^*$, average productivity
remains unchanged as x changes. Hence, locally constant returns to scale holds.
Also, at the input level ($x^*$), marginal productivity and average productivity are
equal. Thus, $f'(x^*) = \frac{f(x^*)}{x^*}$. This implies that $f(x^*) = f'(x^*) x^*$. Hence,

$$SE(x_0) = \frac{f(x_0)/x_0}{f(x^*)/x^*} = \frac{f(x_0)}{x_0 f'(x^*)}. \tag{39}$$

Now define $\delta \equiv f'(x^*)$ and consider an artificial CRS production function

$$y^{**} = r(x) = \delta x. \tag{40}$$

Then, the denominator in (39) becomes $\delta x_0 = r(x_0)$. Therefore, an alternative
measure of scale efficiency is

$$SE(x_0) = \frac{f(x_0)}{r(x_0)} = \frac{y_0 / r(x_0)}{y_0 / f(x_0)} = \frac{\tau_y^C(x_0, y_0)}{\tau_y(x_0, y_0)} = \frac{D_C^y(x_0, y_0)}{D^y(x_0, y_0)}. \tag{41}$$

It should be noted that the expression in (41) measures the output-oriented scale
efficiency of the input level $x_0$. In a perfectly analogous manner, one can take the
output level $y_0$ as given and measure the input-oriented scale efficiency

$$SE(y_0) = \frac{\tau_x^C(x_0, y_0)}{\tau_x(x_0, y_0)} = \frac{D^x(x_0, y_0)}{D_C^x(x_0, y_0)}. \tag{42}$$
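A short numerical check of (39) through (42) may help. The Python sketch below assumes a hypothetical production function f(x) = 2√x − 2 for x ≥ 1, for which average productivity peaks at x* = 4 with f′(x*) = f(x*)/x* = 0.5. The functional form and the bundle (9, 3) are assumptions made purely for illustration.

```python
import math

def f(x):
    """Hypothetical VRS production function (for x >= 1), assumed for illustration."""
    return 2.0 * math.sqrt(x) - 2.0

def f_inverse(y):
    """Input needed on the frontier to produce y: solves f(x) = y."""
    return ((y + 2.0) / 2.0) ** 2

X_STAR = 4.0                # technically optimal scale: f'(x*) = f(x*) / x*
DELTA = f(X_STAR) / X_STAR  # slope of the artificial CRS frontier r(x) in (40)

def r(x):
    return DELTA * x

x0, y0 = 9.0, 3.0           # assumed observed (inefficient) bundle

# Output-oriented measures, as in (41).
tau_y = y0 / f(x0)
tau_y_crs = y0 / r(x0)
se_output = tau_y_crs / tau_y

# Input-oriented measures, as in (42).
tau_x = f_inverse(y0) / x0
tau_x_crs = (y0 / DELTA) / x0
se_input = tau_x_crs / tau_x

print(se_output, se_input)
```

Here the output-oriented scale efficiency is 8/9, which equals f(x0)/r(x0) as (41) requires, while the input-oriented scale efficiency is 0.96. The CRS technical efficiency is 2/3 in both orientations, so the difference arises entirely from the two VRS projections.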

11 For a more detailed discussion of this topic, refer to Chap. 17, “Scale Elasticity and Returns
to Scale”, by Podinovski and Førsund in this volume of the Handbook.


As in the case of technical efficiency, the scale efficiency measure also will
generally be different for input and output orientation. In Fig. 3 the point A shows the
input-output bundle $(x_0, y_0)$, and the point B on the production function $y^* = f(x)$ is
the output-oriented technically efficient projection. The most productive input scale
is $x^*$ and $AP(x^*) = \frac{f(x^*)}{x^*} = \frac{C x^*}{O x^*} = \frac{D x_0}{O x_0}$. Also, the tangent to the production
function at the point C can be treated as a counterfactual CRS production function
$y^{**} = r(x) = \delta x$; $\delta \equiv f'(x^*)$. Thus,

$$SE(x_0) = \frac{f(x_0)/x_0}{f(x^*)/x^*} = \frac{B x_0}{D x_0} = \frac{f(x_0)}{r(x_0)} = \frac{y_0 / r(x_0)}{y_0 / f(x_0)} = \frac{D_C^y(x_0, y_0)}{D^y(x_0, y_0)}. \tag{43}$$

Ray Average Productivity and Returns to Scale

The single input single output case was useful for illustrative purposes but is of
little relevance in real life because seldom, if ever, is any output produced from
one input alone. Now consider a multiple input single output technology. The
production function now shows the maximum scalar output producible from a
vector of inputs. Consider an input-output combination (x0 , y0 ) that lies in the graph
of the technology, G. That is, y0 = f (x0 ). Now consider another bundle (x1 , y1 )
also in the graph such that x1 = βx0 . The two input bundles differ only in scale
but not in input proportions. The vectors x0 and x1 lie on the same ray through the
origin in the input space. If the bundle x0 is considered to be 1 unit of a composite
input, then x1 represents β units of the same input. If β is greater than 1, then x1 is
a radial expansion of the x0 bundle. Now suppose that y1 = αy0 . The ray average
productivity measured by output per unit of the composite input at (x0 , y0 ) is y0 and
at (x1 , y1 ) is αy0 /β. Now, if α > β > 1, then ray average productivity is increasing at
(x0 , y0 ), and one can conclude that locally increasing returns to scale (IRS) holds at
this point on the graph. On the other hand, 1 < α < β signifies locally diminishing
returns to scale (DRS). Finally, α = β implies constant returns to scale (CRS). Note
that these are all local characteristics of the technology and are evaluated as β → 1
from above. The technology may exhibit increasing, constant, or diminishing returns
to scale at different points on the graph. This is why it is described as variable returns
to scale (VRS).

Most Productive Scale Size

Banker [5] generalized Frisch’s concept of the technically optimal production
scale to the multiple output multiple input case. A feasible input-output bundle
$(x^0 \in \mathbb{R}^n_+,\ y^0 \in \mathbb{R}^m_+)$ is a most productive scale size (MPSS) if for all non-negative
scalars (α, β) for which $(\beta x^0, \alpha y^0)$ is a feasible input-output combination, $\frac{\alpha}{\beta} \le 1$. In
other words, $(x^0, y^0)$ is an MPSS only if there is no other feasible input-output bundle

Fig. 3 Scale efficiency

with the same mix of inputs and outputs but a higher ray average productivity. It is
obvious that no feasible input-output bundle can be an MPSS unless it is in the
graph.12
Banker and Thrall [8] and Ray [60] have shown that when the production
possibility set is convex, IRS holds at all scales smaller than the smallest MPSS.
Similarly, DRS holds at all scales larger than the largest MPSS.

Identifying the Nature of Local Returns to Scale

There are three alternative ways to identify the nature of returns to scale at a
specific input-output bundle: (a) a primal approach [5], (b) a dual approach [8],
and (c) a nesting approach due to Färe, Grosskopf, and Lovell (FGL) [31]. This
chapter considers the primal approach in detail both because it is the most

12 If $(x^0, y^0) \in T$ but $\notin G$, then there will exist either some β < 1 such that $(\beta x^0, y^0) \in T$ or some
α > 1 such that $(x^0, \alpha y^0) \in T$. In the former case, one gets $\frac{\alpha}{\beta} > 1$ for α = 1. In the latter case,
$\frac{\alpha}{\beta} > 1$ for β = 1.

popular alternative and also because it serves to identify the MPSS. The other two
approaches are mentioned only briefly13 .

Banker’s Primal Approach

Banker [5] developed the following important theorem that serves as a basis for
identifying the nature of local returns to scale at the input-output bundle (x0 , y0 ) if
it is on the VRS frontier and at its efficient projection if it is an interior point.

Theorem 1 An input-output bundle (x0 , y0 ) is an MPSS if and only if the optimal
value of the objective function of a CCR-DEA model equals unity for this
input-output combination.

Proof See Banker ([5], p. 40).

This theorem only determines whether (x0 , y0 ) is an MPSS or not. It does not say
anything directly about the nature of local returns to scale when it is not an MPSS.
However, three important corollaries follow from the theorem:

1. If $k = \sum_{j=1}^{N} \lambda_j^* = 1$ at the optimal solution of the DEA LP problem (34) above,
(x0 , y0 ) is an MPSS and CRS holds locally.
2. If $k = \sum_{j=1}^{N} \lambda_j^* < 1$ at the optimal solution of the DEA LP problem (34) above,
IRS holds locally at (x0 , y0 ) or at its input-oriented efficient projection on to the
VRS frontier if it is technically inefficient.
3. If $k = \sum_{j=1}^{N} \lambda_j^* > 1$ at the optimal solution of the DEA LP problem (34) above,
DRS holds locally at (x0 , y0 ) or at its input-oriented efficient projection on to the
VRS frontier if it is technically inefficient.

The intuition behind these corollaries is quite simple. When k = 1, the optimal
solution from the input-oriented CRS problem in (34) is an optimal solution for
the corresponding VRS problem. Because the CRS and VRS technical efficiency
measures are identical, scale efficiency equals unity and $(x^0, y^0)$ is an MPSS.
Moreover, by virtue of part (a) of the theorem, $\theta^*$ equals unity and $(x^0, y^0)$ is on
the frontier. If k ≠ 1, the CRS input-oriented projection $(\theta^* x^0, y^0)$ is not a feasible
solution for the corresponding VRS problem. But $\frac{1}{k}(\theta^* x^0, y^0)$ is on both the CRS
and the VRS frontiers. If k < 1, the input-oriented projection is to be scaled up to
attain an MPSS, and it lies in the IRS region. On the other hand, if k > 1, it is to the
right of the MPSS, and the input-oriented projection falls in the DRS region on the
VRS frontier.
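The corollaries can be tried out in the 1-output 1-input case, where the optimal solution of (34) is available by inspection when a single firm has the highest average productivity: that firm receives the weight λ = y0 / yb and every other weight is zero, so k = y0 / yb. The Python sketch below uses this shortcut. The data and helper names are hypothetical, and the uniqueness of the optimal solution is an assumption that holds for this data set but not in general.

```python
# Hypothetical (input, output) observations, assumed only for illustration.
DATA = [(2.0, 2.0), (4.0, 6.0), (8.0, 8.0), (6.0, 5.0)]

def sum_of_lambdas(y0, data=DATA):
    """k = sum of optimal lambdas in (34) when one firm alone has the highest
    average productivity: that firm gets weight y0 / yb, the others get zero."""
    xb, yb = max(data, key=lambda p: p[1] / p[0])
    return y0 / yb

def local_rts(y0, data=DATA):
    """Returns to scale at the input-oriented projection, by the sign of k - 1."""
    k = sum_of_lambdas(y0, data)
    if abs(k - 1.0) < 1e-9:
        return "CRS"
    return "IRS" if k < 1.0 else "DRS"

for x0, y0 in DATA:
    print((x0, y0), round(sum_of_lambdas(y0), 4), local_rts(y0))
```

With this data the firm (4, 6) is the MPSS, firms producing less than 6 units are classified as IRS, and the firm producing 8 units is classified as DRS. When several firms tie for the highest average productivity, the optimal λs are no longer unique and the two-step check around (44) is needed.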

13 The more interested reader may see Ray [57] for detailed treatment of all these three approaches.

A potential problem with this method of returns to scale characterization is that
there may be multiple optimal solutions to the DEA problem in (34) with the sum
of λs greater than 1 in some and less than 1 in others. In that situation conflicting
conclusions would be drawn depending on which optimal solution was reached.
This requires a modification of corollaries (2) and (3) as follows:

(2a) Locally increasing returns to scale holds if $k = \sum_{j=1}^{N} \lambda_j^* < 1$ at all optimal
solutions of the CRS DEA problem in (34).

(3a) Locally diminishing returns to scale holds if $k = \sum_{j=1}^{N} \lambda_j^* > 1$ at all optimal solutions of
the CRS DEA problem in (34).

This can be implemented in two steps. In step 1, the DEA problem in (34)
is solved, and the optimal value θ * is determined. For (2a) above, in step 2, the
following problem is solved:

$$\begin{aligned}
\max\ & \sum_{j=1}^{N} \lambda_j \\
\text{s.t. } & \sum_{j=1}^{N} \lambda_j x^j \le \theta^* x^0; \\
& \sum_{j=1}^{N} \lambda_j y^j \ge y^0; \\
& \lambda_j \ge 0 \quad (j = 1, 2, \ldots, N).
\end{aligned} \tag{44}$$

If the maximum value of the objective function is less than 1, it can be concluded
that $k = \sum_{j=1}^{N} \lambda_j^* < 1$ at all optimal solutions of (34). Similarly, in order to check for
(3a), one minimizes the sum of λs in (44), and if the minimum is greater than 1, one
can conclude that DRS holds locally.

A Dual Approach
Banker, Charnes, and Cooper (BCC) [9] offer an alternative method of identifying
local returns to scale from the following dual of the input-oriented VRS DEA
problem:

$$\begin{aligned}
\max\ & u'y^0 - u_0 \\
\text{s.t. } & u'y^j - u_0 \le v'x^j \quad (j = 1, 2, \ldots, N); \\
& v'x^0 = 1; \\
& u, v \ge 0;\ u_0 \text{ unrestricted.}
\end{aligned} \tag{45}$$

Note that at the optimal shadow prices $(u^*, v^*)$, the shadow profit from any
observed input-output bundle $(x^j, y^j)$ is $u^{*\prime} y^j - v^{*\prime} x^j$. Thus the constraints in (45)
imply that $u_0^*$ is the upper bound on the shadow profit from the bundle $(x^j, y^j)$.
BCC have shown that:

(i) CRS holds at $(x^0, y^0)$ if at the optimal solution of (45) $u_0$ is zero
(ii) IRS holds at $(x^0, y^0)$ if at the optimal solution of (45) $u_0$ is < 0
(iii) DRS holds at $(x^0, y^0)$ if at the optimal solution of (45) $u_0$ is > 0

As in the case of Banker’s approach, multiple optimal solutions pose a problem,

and the conditions (ii) and (iii) have to be appropriately modified.

A Nesting Approach
Färe, Grosskopf, and Lovell (FGL) [31] consider a technology that lies in between
CRS and the VRS technologies. They call it a non-increasing returns to scale (NIRS)
technology. Under the assumption of NIRS

$$(x^0, y^0) \in T \Rightarrow (kx^0, ky^0) \in T \text{ for any } k \in (0, 1).$$

The DEA approximation to an NIRS production possibility set is

$$\hat{T}^{NIRS} = \left\{(x, y) : x \ge \sum_{j=1}^{N} \lambda_j x^j;\ y \le \sum_{j=1}^{N} \lambda_j y^j;\ \sum_{j=1}^{N} \lambda_j \le 1;\ \lambda_j \ge 0;\ j = 1, 2, \ldots, N\right\}. \tag{46}$$

It may be noted that the frontiers of the CRS and NIRS production possibility sets
coincide in the region of IRS. Similarly, the VRS and NIRS frontiers are identical in
the DRS region. Therefore, when IRS holds at $(x^0, y^0)$, in an input-oriented model
$\theta_*^{C} = \theta_*^{NIRS} < \theta_*^{V}$, where the superscripts C, NIRS, and V refer to CRS, NIRS, and
VRS. Similarly, $\theta_*^{C} < \theta_*^{NIRS} = \theta_*^{V}$ implies DRS. Of course, in the case of CRS,
all three estimates of technical efficiency equal unity.
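The nesting comparison can be sketched for the 1-output 1-input case without an LP solver. Two shortcuts are assumed here: an optimal VRS solution mixes at most two firms (so pairs can be enumerated), and the NIRS restriction that the λs sum to at most 1 is equivalent to adding the origin (0, 0) as an extra observation under the VRS restriction. The data and function names below are hypothetical.

```python
from itertools import combinations_with_replacement

DATA = [(2.0, 2.0), (4.0, 6.0), (8.0, 8.0), (6.0, 5.0)]  # hypothetical (x, y) pairs
TOL = 1e-9

def theta_convex(x0, y0, points):
    """Smallest theta such that a mix of at most two points (weights summing
    to one) produces at least y0 using theta * x0 of the input."""
    best = None
    for (xj, yj), (xk, yk) in combinations_with_replacement(points, 2):
        cands = [0.0, 1.0]
        if yj != yk:
            cands.append((y0 - yk) / (yj - yk))  # weight at which output equals y0
        for lam in cands:
            if not 0.0 <= lam <= 1.0:
                continue
            x = lam * xj + (1 - lam) * xk
            y = lam * yj + (1 - lam) * yk
            if y >= y0 - TOL and (best is None or x / x0 < best):
                best = x / x0
    return best

def theta_vrs(x0, y0):
    return theta_convex(x0, y0, DATA)

def theta_nirs(x0, y0):
    # Weight on the origin plays the role of the slack in "sum of lambdas <= 1".
    return theta_convex(x0, y0, DATA + [(0.0, 0.0)])

def theta_crs(x0, y0):
    return (y0 / x0) / max(y / x for x, y in DATA)

def returns_to_scale(x0, y0):
    """Classify the input-oriented projection by comparing the three scores."""
    c, n, v = theta_crs(x0, y0), theta_nirs(x0, y0), theta_vrs(x0, y0)
    if abs(c - v) < TOL:
        return "CRS"
    return "IRS" if abs(c - n) < TOL else "DRS"

for x0, y0 in DATA:
    print((x0, y0), returns_to_scale(x0, y0))
```

With this data, the firms (2, 2) and (6, 5) are classified as IRS, (4, 6) as CRS, and (8, 8) as DRS, which agrees with what the sum of the λs in (34) indicates for the same observations.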

Identifying Returns to Scale for an Inefficient Unit

The concept of returns to scale is meaningful only when the relevant input-output
bundle lies on the frontier of the production possibility set. For an inefficient bundle,
one must consider its efficient projection – either input- or output-oriented. Unless
similar returns to scale are found at both projections, one cannot conclusively
determine the returns to scale at the observed input-output bundle.
The following minor modification of Banker [5] considered by Cooper, Thompson,
and Thrall [22] can be used not only to determine whether an input-output
bundle $(x^0, y^0)$ is an MPSS but also to identify the bundle $(x_*^0, y_*^0)$ which is an
MPSS for $(x^0, y^0)$:

$$\begin{aligned}
\max\ & \frac{\alpha}{\beta} \\
\text{s.t. } & \sum_{j=1}^{N} \lambda_j y^j \ge \alpha y^0; \\
& \sum_{j=1}^{N} \lambda_j x^j \le \beta x^0; \\
& \sum_{j=1}^{N} \lambda_j = 1; \\
& \alpha, \beta, \lambda_j \ge 0 \quad (j = 1, 2, \ldots, N).
\end{aligned} \tag{47}$$

As such, the objective function is nonlinear. However, it can be easily
transformed into a linear programming problem. Define $t = \frac{1}{\beta}$, $\rho = \frac{\alpha}{\beta}$, and $\mu_j = t\lambda_j$
(j = 1, 2, . . . , N). Note that non-negativity of β and the $\lambda_j$s ensures that t and the $\mu_j$s are
also non-negative. Multiplying each constraint of (47) by t and using $t\beta = 1$,
problem (47) can, therefore, be reformulated as the following
linear programming problem:

$$\begin{aligned}
\max\ & \rho \\
\text{s.t. } & \sum_{j=1}^{N} \mu_j y^j \ge \rho y^0; \\
& \sum_{j=1}^{N} \mu_j x^j \le x^0; \\
& \sum_{j=1}^{N} \mu_j = t; \\
& t, \mu_j \ge 0 \quad (j = 1, 2, \ldots, N).
\end{aligned} \tag{48}$$

From the optimal solution of this problem, we can derive $\beta^* = \frac{1}{t^*}$ and $\alpha^* = \frac{\rho^*}{t^*}$.
One can then infer the nature of returns to scale from these values of $\alpha^*$ and $\beta^*$.
It may be pointed out here that because the only restriction on t is non-negativity,
(48) is simply the output-oriented CCR DEA problem and $\frac{1}{\rho^*}$ is the same as the
output-oriented CRS technical efficiency $\tau_y^C(x^0, y^0)$.
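To make the recovery of the MPSS from (48) concrete, the Python sketch below again uses the 1-output 1-input shortcut: the optimal μ puts all weight on the firm with the highest average productivity, so t* = x0 / xb. The data and the function name are hypothetical, and the closed form is specific to this special case with a unique best firm.

```python
# Hypothetical (input, output) observations, assumed only for illustration.
DATA = [(2.0, 2.0), (4.0, 6.0), (8.0, 8.0), (6.0, 5.0)]

def mpss_scalars(x0, y0, data=DATA):
    """Solve (48) by inspection in the 1-output 1-input case and recover
    alpha* = rho* / t* and beta* = 1 / t*."""
    xb, yb = max(data, key=lambda p: p[1] / p[0])
    t = x0 / xb        # total weight: mu_b * xb = x0 exhausts the input
    rho = t * yb / y0  # optimal objective value of (48)
    beta = 1.0 / t
    alpha = rho / t
    return alpha, beta, rho

x0, y0 = 6.0, 5.0
alpha, beta, rho = mpss_scalars(x0, y0)
print(alpha, beta, rho)          # approximately 1.2, 0.667, 1.8
print(beta * x0, alpha * y0)     # the MPSS bundle, here the firm (4, 6)
```

For the assumed firm (6, 5), β* < 1 < α*, so its input scale is larger and its output scale is smaller than the MPSS, and 1/ρ* = 5/9 reproduces the CRS technical efficiency.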
Because (x0 , y0 ) is assumed to be a feasible input-output bundle, (α = β = ρ = 1)
is a feasible solution for this problem. Hence, the optimal value ρ ∗ is always greater
10 Data Envelopment Analysis: A Nonparametric Method of Production Analysis 431

∗
than or equal to 1.When ρ ∗ = βα ∗ exceeds unity, we know that (x0 , y0 ) is not an
MPSS. But we can also conclude that (β ∗ x0 , α ∗ y0 ) is an MPSS.
When the bundle (x0 , y0 ) is not itself an MPSS, ρ ∗ > 1 so that α ∗ > β ∗ . If the
MPSS is unique, there are five different possibilities: (i) 1 < β ∗ < α ∗ ; (ii) β ∗ < α ∗ < 1;
(iii) β ∗ = 1 < α ∗ ; (iv) β ∗ < 1 = α ∗ ; and (v) β ∗ < 1 < α ∗ . When the MPSS is unique,
if 1 < β ∗ < α ∗ , both input- and output-oriented projections of the bundle (x0 , y0 )
fall in the region of IRS. In this case, the unit is conclusively too small relative to its
MPSS. Similarly, if β ∗ < α ∗ < 1, both input- and output-oriented projections fall in
the region of DRS. The implication is that the unit is too large. When β ∗ = 1 < α ∗ ,
the input scale corresponds to the MPSS but the output scale is too small. This is
only due to output-oriented technical inefficiency but there is no scale inefficiency.
Similarly when β ∗ < 1 = α ∗ , there is output-oriented technical inefficiency but no
scale inefficiency. Finally, in the intermediate case, where β ∗ < 1 < α ∗ , the input
scale is bigger than the MPSS, and the output-oriented projection falls in the region
of DRS. At the same time, the input scale is smaller than the MPSS, and the input-
oriented projection falls in the region of IRS. When β ∗ = 1 < α ∗ , the (α = 1, β = 1)
point lies directly below the MPSS. Similarly, for β ∗ < α ∗ = 1, the (α = 1, β = 1)
point lies on the horizontal line through the MPSS.
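The five cases for a unique MPSS can be written as a small lookup. The following is only a sketch: the function name `classify_scale`, the returned labels, and the tolerance `tol` are illustrative choices that do not appear in the chapter, and the scaling factors (α ∗ , β ∗ ) are assumed to have been obtained already from an optimal solution of (48).

```python
import math

def classify_scale(alpha_star, beta_star, tol=1e-9):
    """Label a bundle relative to a unique MPSS (beta* x0, alpha* y0).

    The labels are hypothetical shorthand for cases (i)-(v) in the text.
    """
    if alpha_star <= 0 or beta_star <= 0:
        raise ValueError("scaling factors must be positive")
    alpha_is_one = math.isclose(alpha_star, 1.0, abs_tol=tol)
    beta_is_one = math.isclose(beta_star, 1.0, abs_tol=tol)
    if alpha_is_one and beta_is_one:
        return "MPSS"                    # rho* = 1: the bundle is itself an MPSS
    if alpha_star <= beta_star:
        raise ValueError("rho* = alpha*/beta* must exceed 1 for a non-MPSS bundle")
    if beta_is_one:
        return "input scale at MPSS"     # case (iii): beta* = 1 < alpha*
    if alpha_is_one:
        return "output scale at MPSS"    # case (iv): beta* < 1 = alpha*
    if beta_star > 1:
        return "IRS"                     # case (i): 1 < beta* < alpha*, unit too small
    if alpha_star < 1:
        return "DRS"                     # case (ii): beta* < alpha* < 1, unit too large
    return "mixed"                       # case (v): beta* < 1 < alpha*

print(classify_scale(1.5, 1.2))
print(classify_scale(1.2, 0.8))
```

In the "mixed" case the classification depends on the orientation of the projection, so a caller would still need to report the input- and output-oriented results separately.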
Figure 4a through c graphically illustrate the three cases mentioned above. In
each diagram, the horizontal axis measures the input scale, and the vertical axis
measures the output scale of a specific input-output bundle shown by the point
(α = 1, β = 1) in each of these three diagrams. The broken line shown as the frontier
is the fixed-mix graph of the technology

\[
G(x^0, y^0) = \{(\alpha, \beta) : F(\beta x^0, \alpha y^0) = 0\} \tag{49}
\]

The MPSS is shown by the point (α ∗ , β ∗ ). In Fig. 4a, 1 < β ∗ < α ∗ and both the
input and output-oriented projections of (x0 , y0 ) will be in the region of increasing
returns to scale. Figure 4b illustrates the case where β ∗ < α ∗ < 1 and both
projections will be in the region of diminishing returns to scale. The intermediate
case, where β ∗ < 1 < α ∗ , is shown in Fig. 4c. In this case, the output scale is
smaller than the MPSS and the input-oriented projection would be in the region
of increasing returns but the input scale is bigger than the MPSS and the output-
oriented projection will be in the region of diminishing returns.

The Case of Multiple MPSS

Next consider the possibility of multiple MPSS. It is obvious that when (48) has
a unique optimal solution (in particular, t* is unique), there cannot be multiple
MPSS. If different values of t are obtained in multiple optimal solutions (of course,
with the same optimal value ρ ∗ ), the largest $t^* = \sum_j \mu_j^* = t_1$ across all optimal
solutions of (48) corresponds to the smallest MPSS $(\alpha_1^* = \rho^*/t_1,\ \beta_1^* = 1/t_1)$. Similarly,
the smallest $t^* = \sum_j \mu_j^* = t_2$ at an optimal solution yields the largest MPSS
$(\alpha_2^* = \rho^*/t_2,\ \beta_2^* = 1/t_2)$.

Fig. 4 Unique and ambiguous RTS characterization of inefficient units

Now the returns to scale classification of (x0 , y0 ) needs to be revised appropri-
ately. In particular:

(i) 1 < β1∗ < α1∗ corresponds to IRS.
(ii) β1∗ < α1∗ < 1 < β2∗ < α2∗ corresponds to CRS.
(iii) β2∗ < α2∗ < 1 corresponds to DRS.

For other feasible combinations of (α ∗ , β ∗ ), the returns to scale classification will

depend on the direction of projection (whether input- or output-oriented).
Zhu [77] uses a single input single output example to partition the interior
of the production possibility set into six different regions for returns to scale
classification of inefficient production units14 . In three out of these six regions, both
input- and output-oriented efficient projections exhibit the same returns to scale:
increasing, constant, or diminishing. In the remaining three, increasing returns at
the input-oriented projection combines with constant or diminishing returns at the
output-oriented projection, or constant returns at the input-oriented projection is
associated with diminishing returns at the output-oriented. In order to correctly
locate an inefficient unit in the appropriate region, one has to ascertain returns to
scale at both projections.

Choice Between Input- and Output-Oriented Projections

Except in the case of globally constant returns to scale, output- and input-oriented
technical efficiency measures would differ for the same firm. An important question
is how to decide which measure is preferable. As a general rule, the answer depends
on whether output augmentation is more important than input conservation in a
specific context. In many situations, however, there is no clear-cut priority. A rule
of thumb would then be to select the orientation that yields a lower measure of
efficiency under the VRS assumption. The logic behind this criterion is that the
corresponding efficient projection would have a higher level of scale efficiency.
This can be explained by a simple 1-input 1-output example. Consider a technically
inefficient input-output combination (x0 , y0 ). Now suppose that the output-oriented
efficient projection is $(x_0, \varphi^* y_0)$ while the input-oriented projection is $(\theta^* x_0, y_0)$.
Thus the corresponding technical efficiency measures are $\tau_y = 1/\varphi^*$ and $\tau_x = \theta^*$.

14 See also the earlier paper by Seiford and Zhu [71].


Assume, arbitrarily, that $\tau_y < \tau_x$. This implies $\frac{1}{\varphi^*} < \theta^*$ or $\frac{1}{\theta^*} < \varphi^*$. Therefore,
$\frac{y_0}{\theta^* x_0} < \frac{\varphi^* y_0}{x_0}$. This shows that average productivity is higher at the output-oriented
efficient projection than at the input-oriented projection of (x0 , y0 ).
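The rule of thumb can be checked numerically. The sketch below uses made-up values for the bundle and for the optimal factors ϕ∗ and θ ∗ (in practice they would come from the output- and input-oriented VRS DEA problems); the function names are illustrative only.

```python
def average_productivities(x0, y0, phi_star, theta_star):
    """Average productivity at the output- and input-oriented projections."""
    ap_output = (phi_star * y0) / x0      # projection (x0, phi* y0)
    ap_input = y0 / (theta_star * x0)     # projection (theta* x0, y0)
    return ap_output, ap_input

def preferred_orientation(phi_star, theta_star):
    """Pick the orientation with the lower VRS technical efficiency."""
    tau_y = 1.0 / phi_star
    tau_x = theta_star
    if tau_y < tau_x:
        return "output"
    if tau_x < tau_y:
        return "input"
    return "either"

# Hypothetical unit: x0 = 4, y0 = 2, phi* = 2 (tau_y = 0.5), theta* = 0.625
ap_out, ap_in = average_productivities(4.0, 2.0, 2.0, 0.625)
print(preferred_orientation(2.0, 0.625), ap_out, ap_in)
```

With these numbers the output orientation has the lower efficiency score, and its projection has the higher average productivity, in line with the inequality derived above.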

Graph Efficiency Measures

For the output-oriented radial measure of technical efficiency of an inefficient unit,
one projects the observed input-output bundle vertically upward onto the frontier.
Similarly, for the input-oriented measure, it is projected horizontally leftward.
One may, however, prefer to project to a point on the frontier which is strictly
to the northwest of the observed point. Movement to such a point involves some
increase in output simultaneously with some decrease in input. A comparison of this
efficient projection with the observed bundle yields a graph efficiency measure that
takes account of potential increase in output and decrease in input simultaneously.
However, there is an ambiguity about the optimal projection in this case. In a
1-output 1-input diagram with output measured along the vertical and input along
the horizontal axis, the optimal output-oriented projection is directly above the
actual point. Similarly, for the input-oriented model, it is horizontally toward the left.
However, for the graph efficiency, one can, in principle, select any point in between
these two limits. The choice of a specific point in this segment of the frontier must
be based on some other reasonable criterion. While there are several alternative
models, the more popular ones are the graph hyperbolic and the directional distance
function.

Graph Hyperbolic Distance Function

The graph hyperbolic distance function (GHDF) is defined as

\[
\delta^{GH}(x^0, y^0) = \max\left\{\delta : \left(\tfrac{1}{\delta} x^0, \delta y^0\right) \in T\right\}. \tag{50}
\]

It derives its name from the fact that in the 1-output 1-input case, the optimal
projection would be located at the point where a rectangular hyperbola through the
observed point intersects the production frontier in the input-output space.
In Fig. 5 the point A shows the input-output bundle (x0 , y0 ). The curve xy = k
is a rectangular hyperbola through A. The point B where this curve intersects the
production frontier y∗ = f (x) is the graph hyperbolic efficient projection of A. At B,
$x^* = \frac{1}{\delta} x_0$ and $y^* = \delta y_0$.
Fig. 5 Graph hyperbolic distance function

The DEA problem for (50) under VRS is

\[
\begin{aligned}
\max\ & \delta \\
\text{s.t.}\ & \sum_j \lambda_j y_{rj} \ge \delta y_{r0}; \quad (r = 1, 2, \ldots, m) \\
& \sum_j \lambda_j x_{ij} \le \tfrac{1}{\delta} x_{i0}; \quad (i = 1, 2, \ldots, n) \\
& \sum_j \lambda_j = 1; \quad \lambda_j \ge 0; \quad (j = 1, 2, \ldots, N)
\end{aligned}
\tag{51}
\]

Under CRS, the constraint on the sum of the λs is deleted.

Because both δ and 1/δ appear in the constraints, this is a nonlinear programming
problem. However, under CRS one can define β = δ² and μj = δλj (j = 1, 2, . . . , N),
and the model can be rewritten as15

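\[
\begin{aligned}
\max\ & \beta \\
\text{s.t.}\ & \sum_j \mu_j y_{rj} \ge \beta y_{r0}; \quad (r = 1, 2, \ldots, m) \\
& \sum_j \mu_j x_{ij} \le x_{i0}; \quad (i = 1, 2, \ldots, n) \\
& \mu_j \ge 0; \quad (j = 1, 2, \ldots, N)
\end{aligned}
\tag{52}
\]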
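For one input and one output, the CRS problem (52) has a closed-form solution, which makes the substitution β = δ² easy to verify. The sketch below assumes a small made-up data set and strictly positive inputs and outputs; the function names are illustrative and a general multi-input, multi-output case would need an LP solver instead.

```python
import math

def crs_output_factor(data, x0, y0):
    """Optimal beta in (52) for one input and one output.

    Under CRS the frontier is y = k * x with k the largest observed y/x,
    so the largest feasible output at input x0 is k * x0.
    """
    k = max(y / x for x, y in data)
    return k * x0 / y0

def graph_hyperbolic_crs(data, x0, y0):
    """Return delta = sqrt(beta) and the projection (x0/delta, delta*y0)."""
    beta = crs_output_factor(data, x0, y0)
    delta = math.sqrt(beta)
    return delta, (x0 / delta, delta * y0)

# Hypothetical observations (x_j, y_j); the last one is the unit evaluated.
data = [(2.0, 2.0), (4.0, 6.0), (5.0, 5.0), (4.0, 1.5)]
delta, (x_proj, y_proj) = graph_hyperbolic_crs(data, 4.0, 1.5)
print(delta, x_proj, y_proj)
```

Here k = 1.5, so β = 4 and δ = 2; the projection (2, 3) lies on the frontier y = 1.5x and on the hyperbola xy = 6 through the observed point.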

For VRS, one can use the linear approximation $\frac{1}{\delta} \approx 2 - \delta$ at δ = 1 to revise (51)
as

15 See Färe, Grosskopf, Lovell, and Pasurka [33].


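\[
\begin{aligned}
\max\ & \delta \\
\text{s.t.}\ & \sum_j \lambda_j y_{rj} \ge \delta y_{r0}; \quad (r = 1, 2, \ldots, m) \\
& \sum_j \lambda_j x_{ij} + \delta x_{i0} \le 2 x_{i0}; \quad (i = 1, 2, \ldots, n) \\
& \sum_j \lambda_j = 1; \quad \lambda_j \ge 0; \quad (j = 1, 2, \ldots, N)
\end{aligned}
\tag{53}
\]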

Directional Distance Function

Chambers, Chung, and Färe (CCF) [15, 16] introduced the directional distance
function (DDF) based on Luenberger’s benefit (or shortage) function [49] as a
measure of inefficiency. While in the GHDF the direction of projection of an
inefficient bundle on to the frontier is endogenously determined, in the case of the
DDF, the direction is prespecified by the analyst. Consider a bundle g = (gx , gy )
which may or may not be feasible. However, the DDF measures the potential for
radial movement from the observed bundle along that direction as it is projected
onto the frontier. The DDF is measured as

\[
\vec{D}(x^0, y^0; g^x, g^y) = \max\{\beta : (x^0 - \beta g^x,\ y^0 + \beta g^y) \in T\}. \tag{54}
\]

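The CRS DEA LP problem for (54) for a given pair of direction vectors (gx , gy ) is

\[
\begin{aligned}
\max\ & \beta \\
\text{s.t.}\ & \sum_{j=1}^{N} \lambda_j y^j - \beta g^y \ge y^0; \\
& \sum_{j=1}^{N} \lambda_j x^j + \beta g^x \le x^0; \\
& \lambda_j \ge 0; \quad j = 1, 2, \ldots, N.
\end{aligned}
\tag{55}
\]

The dual LP problem for (55) is

\[
\begin{aligned}
\min\ & v'x^0 - u'y^0 \\
\text{s.t.}\ & v'x^j - u'y^j \ge 0, \quad (j = 1, 2, \ldots, N) \\
& v'g^x + u'g^y = 1, \\
& u, v \ge 0.
\end{aligned}
\tag{56}
\]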

The normalization constraint in the DDF model (56) shows that when the same
direction vectors (gx , gy ) are chosen for evaluating efficiency of all units, the
numeraire is the sum of the shadow values of the same input-output bundle for all
units, and the efficiency measures are more directly comparable across units. By
contrast, in a radial output-oriented model like (35), the numeraire is the shadow
value of the output bundle of the unit under evaluation and varies across units,
making a direct comparison of the efficiency ratios somewhat problematic.
Leleu and Briec [47] used $g^x = X = \sum_j x^j$, $g^y = Y = \sum_j y^j$, the industry
aggregate input-output bundle, for the direction of projection.
Ray [58] offered a measure of “overall technical inefficiency” as

\[
\xi = \max\{(\varphi - \theta) : (\theta x^0, \varphi y^0) \in T\}. \tag{57}
\]

Because ϕ − θ = (ϕ − 1) + (1 − θ ), one can think of the overall inefficiency as
the sum of output-oriented and input-oriented inefficiencies.
Aparicio, Pastor, and Ray (APR) [3] extended the model in (57) as

\[
\vec{D}(x^0, y^0; g^x, g^y) = \max\{\beta^x + \beta^y : (x^0 - \beta^x g^x,\ y^0 + \beta^y g^y) \in T\}. \tag{58}
\]

APR have shown that $(\beta^x, \beta^y) = \left(\frac{1}{N}\sum_j \beta_j^x,\ \frac{1}{N}\sum_j \beta_j^y\right)$ solves the problem in
(58) for $\vec{D}(X, Y; g^x, g^y)$ when $g^x = X$, $g^y = Y$.
Another direction vector often used is (gx = − ι, gy = ι). In this case, β is the
maximum (absolute) amount by which each output can be expanded and every input
be contracted simultaneously. Unlike in (58), this time β does not have a simple
intuitive interpretation.
A popular choice of the direction is (gx = x0 , gy = y0 ). In that case

\[
\vec{D}(x^0, y^0; g^x, g^y) = \max\{\beta : ((1 - \beta) x^0,\ (1 + \beta) y^0) \in T\}. \tag{59}
\]

Here β can be interpreted as the maximum proportion by which the output
vector can be expanded and the input vector be contracted simultaneously. Note
that for (gx = 0, gy = y0 ) one gets the familiar output-oriented projection, whereas
(gx = x0 , gy = 0) yields the input-oriented projection.
In Fig. 6, A is the observed bundle (x0 , y0 ). The point B defines the direction of
movement (−x0 , y0 ). The point C on the production frontier shows the maximum
feasible movement within the production possibility set in the direction parallel to
OB. In this case, the directional distance function is $\beta = \frac{AC}{OB} = \frac{OD}{OB}$.

Fig. 6 Directional distance function (horizontal axis: input; vertical axis: output)

The DEA model for (59) under VRS will be

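\[
\begin{aligned}
\max\ & \beta \\
\text{s.t.}\ & \sum_j \lambda_j y_{rj} - \beta y_{r0} \ge y_{r0}; \quad (r = 1, 2, \ldots, m) \\
& \sum_j \lambda_j x_{ij} + \beta x_{i0} \le x_{i0}; \quad (i = 1, 2, \ldots, n) \\
& \sum_j \lambda_j = 1; \quad \lambda_j \ge 0; \quad (j = 1, 2, \ldots, N)
\end{aligned}
\tag{60}
\]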
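A one-input, one-output illustration of the proportional direction (gx = x0 , gy = y0 ) is sketched below. Note the simplifying assumption: the convexity constraint in (60) is dropped, so the frontier is the CRS ray y = kx and β has a closed form. The data and the function name are hypothetical; the VRS model (60) itself would normally be solved with an LP solver.

```python
def proportional_ddf_crs(data, x0, y0):
    """Directional distance with g = (x0, y0), one input, one output, CRS.

    On the CRS frontier y = k * x the projection satisfies
    (1 + beta) * y0 = k * (1 - beta) * x0, which gives beta in closed form.
    """
    k = max(y / x for x, y in data)            # slope of the CRS frontier
    beta = (k * x0 - y0) / (k * x0 + y0)
    projection = ((1 - beta) * x0, (1 + beta) * y0)
    return beta, projection

# Hypothetical observations (x_j, y_j); the last one is the unit evaluated.
data = [(2.0, 2.0), (4.0, 6.0), (5.0, 5.0), (4.0, 1.5)]
beta, (x_proj, y_proj) = proportional_ddf_crs(data, 4.0, 1.5)
print(beta, x_proj, y_proj)
```

For this unit the input can be cut and the output raised by the same proportion β until the bundle reaches the frontier; a unit already on the frontier gets β = 0.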

Non-radial Measures of Efficiency

It should be emphasized that all of the DEA efficiency measures considered so far
are radial in the sense that technical efficiency is determined by the maximum
proportional expansion of all outputs or contraction of all inputs. The measured
efficiency does not reflect the potential for expanding any individual output beyond
the common rate of expansion or decreasing individual inputs on top of the common
rate of contraction. For example, in a 2-output case, a firm is radially efficient if no
increase is possible in one output even if the other output could be doubled!
In fact, CCR were bothered by this problem right from the beginning and, soon
after their original 1978 paper, introduced the so-called non-Archimedean number
(ε) as a penalty for the presence of a slack in any of the input or output constraints
in their note [20], revising the DEA model as

\[
\begin{aligned}
\max\ & \varphi + \varepsilon\left(\sum_{r=1}^{m} s_r^+ + \sum_{i=1}^{n} s_i^-\right) \\
\text{s.t.}\ & \sum_{j=1}^{N} \lambda_j y_r^j - s_r^+ = \varphi y_{r0}; \quad (r = 1, 2, \ldots, m); \\
& \sum_{j=1}^{N} \lambda_j x_i^j + s_i^- = x_{i0}; \quad (i = 1, 2, \ldots, n) \\
& s_r^+, s_i^- \ge 0; \quad (r = 1, 2, \ldots, m;\ i = 1, 2, \ldots, n); \\
& \lambda_j \ge 0; \quad (j = 1, 2, \ldots, N); \quad \varphi \text{ unrestricted.}
\end{aligned}
\tag{61}
\]

Now, even when ϕ∗ equals 1 in the optimal solution, the unit under evaluation
will not be considered efficient unless all slacks are also 0. It can be seen from the
dual of the LP problem in (61) that all the shadow prices of outputs and inputs
have a lower bound of ε and cannot be equal to 0. However, the practical
usefulness of this device is virtually nil because, in order to solve the problem in (61)
with data, one cannot assign a positive real numeric value to ε and at the same time
treat it as non-Archimedean (smaller than any positive real number).16
The non-radial Russell measure of efficiency introduced by Färe and Lovell [30]
avoids the presence of slacks (in outputs in output-oriented models and in inputs
in input-oriented models) by allowing outputs to expand (or inputs to contract) at
different rates.

Non-radial Russell Output Efficiency

The output-oriented Russell efficiency measure is

\[
RM_y(x^0, y^0) = \frac{1}{\rho_y(x^0, y^0)},
\]

where

\[
\begin{aligned}
\rho_y(x^0, y^0) = \max\ & \frac{1}{m}\sum_{r=1}^{m} \varphi_r \\
\text{s.t.}\ & \sum_{j=1}^{N} \lambda_j y_r^j \ge \varphi_r y_{r0}; \quad (r = 1, 2, \ldots, m); \\
& \sum_{j=1}^{N} \lambda_j x_i^j \le x_{i0}; \quad (i = 1, 2, \ldots, n) \\
& \sum_{j=1}^{N} \lambda_j = 1; \\
& \varphi_r \ge 1; \quad (r = 1, 2, \ldots, m); \\
& \lambda_j \ge 0; \quad (j = 1, 2, \ldots, N).
\end{aligned}
\tag{62}
\]

16 Rumor has it that when Färe raised this point at a conference in Austin TX, Charnes was so
irritated that he excluded the former from his guest list to a barbecue!

It may be noted that although the output constraints are in the form of inequali-
ties, because the ϕ s appear in the objective function in (62), there will be no output
slacks in the optimal solution.

Non-radial Russell Input Efficiency

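Analogous to (62) above is the input-oriented Russell efficiency measure

\[
\begin{aligned}
RM_x(x^0, y^0) = \rho_x(x^0, y^0) = \min\ & \frac{1}{n}\sum_{i=1}^{n} \theta_i \\
\text{s.t.}\ & \sum_{j=1}^{N} \lambda_j y_r^j \ge y_{r0}; \quad (r = 1, 2, \ldots, m); \\
& \sum_{j=1}^{N} \lambda_j x_i^j \le \theta_i x_{i0}; \quad (i = 1, 2, \ldots, n) \\
& \sum_{j=1}^{N} \lambda_j = 1; \\
& \theta_i \le 1; \quad (i = 1, 2, \ldots, n); \\
& \lambda_j \ge 0; \quad (j = 1, 2, \ldots, N).
\end{aligned}
\tag{63}
\]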

Pareto-Koopmans Measures

An input-output bundle is Pareto-Koopmans efficient [44, 45] if and only if no
output can be increased without reducing some other output or increasing some
input, and at the same time, no input can be reduced without increasing some other
input or reducing some output. In other words, for any Pareto-Koopmans efficient
bundle, there cannot be any slack in any input or in any output. As has been shown
above for the non-radial Russell measure, there cannot be any output slack in an
efficient non-radial output-oriented projection, but slacks in inputs are not ruled out.
Conversely, the input-oriented non-radial projection may allow output slacks, but
there cannot be any slack in any input. Thus, Pareto-Koopmans (PK) efficiency
combines both input- and output-oriented Russell efficiency. There are different
variants of this PK efficiency, but the most popular of them is the ratio of the
Russell input and output efficiency factors.17 It is called enhanced Russell measure by
Pastor, Ruiz, and Sirvent (PRS) [52], slack-based measure (SBM) by Tone [74],
and simply Pareto-Koopmans efficiency by Ray [57] and can be measured as

\[
\begin{aligned}
\tau^{PK}(x^0, y^0) = \min\ & \frac{\frac{1}{n}\sum_i \theta_i}{\frac{1}{m}\sum_r \varphi_r} \\
\text{s.t.}\ & \sum_j \lambda_j y_{rj} \ge \varphi_r y_{r0}; \quad (r = 1, 2, \ldots, m) \\
& \sum_j \lambda_j x_{ij} \le \theta_i x_{i0}; \quad (i = 1, 2, \ldots, n) \\
& \varphi_r \ge 1; \quad (r = 1, 2, \ldots, m); \\
& \theta_i \le 1; \quad (i = 1, 2, \ldots, n) \\
& \sum_j \lambda_j = 1; \quad \lambda_j \ge 0; \quad (j = 1, 2, \ldots, N)
\end{aligned}
\tag{64}
\]

17 Portela and Thanassoulis [54] use the measure $\left(\prod_i \theta_i\right)^{1/n} \big/ \left(\prod_r \varphi_r\right)^{1/m}$ and called it the
geometric distance function.

Note that in (64) every input and output constraint will be strictly binding.
Therefore at the optimal projection $x_i^* = \sum_j \lambda_j^* x_{ij} = \theta_i^* x_{i0}$ (i = 1, 2, . . . , n).
Define the total reduction in input i as $x_{i0} - x_i^* = s_i^- \ge 0$. This leads
to

\[
\theta_i^* = \frac{x_i^*}{x_{i0}} = 1 - \frac{s_i^-}{x_{i0}}. \tag{65}
\]

Similarly, by defining $s_r^+ = y_r^* - y_{r0}$, we can derive

\[
\varphi_r^* = \frac{y_r^*}{y_{r0}} = 1 + \frac{s_r^+}{y_{r0}}, \quad (r = 1, 2, \ldots, m) \tag{66}
\]

Hence the objective function in (64) becomes

\[
\frac{\frac{1}{n}\sum_i \theta_i}{\frac{1}{m}\sum_r \varphi_r}
= \frac{1 - \frac{1}{n}\sum_i \frac{s_i^-}{x_{i0}}}{1 + \frac{1}{m}\sum_r \frac{s_r^+}{y_{r0}}} \tag{67}
\]

which is the SBM.18
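The equivalence between the two sides of (67) can be checked with a few lines of arithmetic. In the sketch below the slacks are made-up numbers (they are not the solution of any DEA problem) and the function names are illustrative.

```python
def sbm_from_slacks(x0, y0, s_minus, s_plus):
    """Right-hand side of (67): the slack-based form of the measure."""
    n, m = len(x0), len(y0)
    numerator = 1 - sum(s / x for s, x in zip(s_minus, x0)) / n
    denominator = 1 + sum(s / y for s, y in zip(s_plus, y0)) / m
    return numerator / denominator

def sbm_from_factors(theta, phi):
    """Left-hand side of (67): mean input factor over mean output factor."""
    return (sum(theta) / len(theta)) / (sum(phi) / len(phi))

# Hypothetical bundle with two inputs, one output, and assumed slacks.
x0, y0 = [10.0, 20.0], [5.0]
s_minus, s_plus = [2.0, 0.0], [1.0]

# Factors implied by (65) and (66).
theta = [1 - s / x for s, x in zip(s_minus, x0)]
phi = [1 + s / y for s, y in zip(s_plus, y0)]

print(sbm_from_slacks(x0, y0, s_minus, s_plus), sbm_from_factors(theta, phi))
```

Both expressions give 0.75 for these numbers, since θ = (0.8, 1.0) and ϕ = (1.2).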

Both PRS and Tone use the expression in (67) for the objective function and
resort to a normalization to convert the linear fractional programming
problem into an LP (following Charnes and Cooper [17]). Ray [56], Ray and Jeon
[63], and Ray and Ghose [62], on the other hand, used a first-order linear approximation
of the objective function at (θ i = 1, ϕr = 1) (i = 1, 2, . . . , n; r = 1, 2, . . . , m) to get

\[
\frac{\frac{1}{n}\sum_i \theta_i}{\frac{1}{m}\sum_r \varphi_r}
\approx 1 + \frac{1}{n}\sum_i \theta_i - \frac{1}{m}\sum_r \varphi_r \tag{68}
\]

and used $\min\ \frac{1}{n}\sum_i \theta_i - \frac{1}{m}\sum_r \varphi_r$ as the objective function.
i r

Efficiency Measurement with Market Prices

There is a commonly held belief that DEA should be used only to measure effi-
ciency of non-profit organizations like government departments, non-governmental
organizations (NGOs), and other non-profit institutions like schools. For decision-
making units operating in the market, it is believed, one should use econometric
methods like SFA to estimate a parametric cost, revenue, or profit function. This is
a result of confusing conceptualization with calibration. The appropriate criterion
of measuring efficiency – cost, revenue, or profit – is determined by the scope of
decision-making. With outputs exogenously assigned, efficiency lies in minimizing
cost. Similarly, for a given input bundle, the objective is to maximize revenue.
Lastly, when both outputs and inputs can be freely chosen, profit maximization is
the appropriate criterion of efficiency. The nonparametric method of DEA and the
parametric method of SFA are merely two alternative methods of computing the
appropriate measure of efficiency.

18 See also the range-adjusted measure (RAM) introduced by Cooper, Park, and Pastor [23].

Cost Efficiency

The different kinds of efficiency considered so far – radial, non-radial, output-oriented,
input-oriented, and graph – are based entirely on input and output quantity
data. In all of these cases, improvement in efficiency came from reduction in
inputs or increase in outputs or some combination of both.
In measuring the radial input-oriented technical efficiency, proportional reduction
in each input is given the same importance, and the objective is to scale the
entire input bundle down as much as possible without changing the input mix. In
practice, however, different inputs have different market prices and account for
different proportions of the total cost of the firm. When the technical efficiency
is below unity, say 0.90, the firm can, obviously, reduce all inputs by 10% and
reduce its total cost by 10%. It may be possible, however, to achieve an even
greater reduction in cost by changing different inputs by different proportions. In
fact, sometimes cost minimization involves increasing some inputs while reducing
some others.
Cost minimization is an important objective for a variety of production decision-
making units. Even when a firm is maximizing profit and selects its optimal input
and output bundles together, cost minimization remains an objective embedded
within the overall objective of profit maximization because profit is not maximized
unless the selected output bundle is produced at the lowest cost. In a non-profit
organization, the output may be exogenously given, but accountability to the
stakeholders makes cost minimization an important objective.
Consider a firm facing a vector of input prices w0 that produces the output bundle
y0 using the input bundle x0 . Thus its actual cost is $C_0 = w^{0\prime}x^0$. The minimum cost
of producing the output bundle y0 at input prices w0 is

\[
C_0^* = \min\{w^{0\prime}x : (x, y^0) \in T\}. \tag{69}
\]

Alternatively,

\[
C_0^* = \min\{w^{0\prime}x : x \in V(y^0)\}. \tag{70}
\]

Given that x0 ∈ V(y0 ), obviously $C_0^* \le C_0$. The cost efficiency of the firm is
measured as

\[
\gamma = \frac{C_0^*}{C_0}. \tag{71}
\]

The relevant DEA LP problem for cost minimization under the VRS assumption
is

\[
\begin{aligned}
C_0^* = \min\ & w^{0\prime}x \\
\text{s.t.}\ & \sum_{j=1}^{N} \lambda_j x^j \le x; \\
& \sum_{j=1}^{N} \lambda_j y^j \ge y^0; \\
& \sum_{j=1}^{N} \lambda_j = 1; \\
& x \ge 0; \quad \lambda_j \ge 0, \quad (j = 1, 2, \ldots, N).
\end{aligned}
\tag{72}
\]

As usual, when CRS is assumed, the restriction $\sum_{j=1}^{N} \lambda_j = 1$ is dropped. It is
important to note that like the λj s, the elements of the input vector x are also
choice variables. Moreover, there cannot be any input slack at the optimal solution
of (70).
Following Farrell [37] one can decompose the (overall) cost efficiency of the firm
into two multiplicative components separately representing technical and allocative
efficiencies. If the input-oriented technical efficiency of the firm is $\tau_x(x^0, y^0) = \theta$, it
would be possible to scale down its input bundle from x0 to $x_T^0 = \theta x^0$. As a result
its cost would be lowered to

\[
C_0^T = \theta \cdot w^{0\prime}x^0 = \theta C_0. \tag{73}
\]

The scaled down input bundle $x_T^0$ is on the isoquant of $V(y^0)$, and there remains
no further scope for reducing cost through proportionate reduction in all inputs.
Now, all input bundles in $V(y^0)$ will produce the output bundle y0 . But they
do not cost the same amount of money. There remains the possibility of further
cost reduction by moving from the bundle $x_T^0$ to another bundle on the isoquant.
Because all bundles on the isoquant are technically efficient, reduction in any input has
to be counterbalanced by increase in some other input(s).19 The potential for cost
reduction through input substitution along the isoquant will depend on both the
degree of substitution possible and the relative prices of the inputs. Note that input
substitution will alter the input mix. The amount of cost reduction through a change in
the input mix as a proportion of the technically efficient cost of the firm is a measure
of its allocative efficiency.
Thus,

\[
CE(x^0 \mid w^0, y^0) = \frac{C_0^*}{C_0} = \frac{C_0^T}{C_0} \cdot \frac{C_0^*}{C_0^T}. \tag{74}
\]

19 An exception is when there is any slack in the technically efficient input bundle.

Fig. 7 Technical, allocative, and cost efficiency

As is apparent from (73) above, $\frac{C_0^T}{C_0} = \theta$. Now define $\frac{C_0^*}{C_0^T} = \alpha$. Then we
have Farrell’s decomposition of cost efficiency in (71) into technical and allocative
efficiency as

\[
\gamma = (\theta) \cdot (\alpha). \tag{75}
\]

Figure 7 shows the measurement and decomposition of cost efficiency for
the 2-input case. Suppose that the point P represents the input bundle x0 and
the expenditure line A0 B0 through P shows the cost $C_0 = w^{0\prime}x^0$. Point Q on the
isoquant is the bundle $x_T^0$, the technically efficient projection of x0 . The cost of
this bundle is $C_0^T = w^{0\prime}x_T^0$, shown by the expenditure line A1 B1 through Q. The
minimum expenditure $C_0^* = w^{0\prime}x^*$ is shown by the line A2 B2 through R, which is
the cost minimizing input bundle. We therefore have a measure of cost efficiency
$\gamma = \frac{C_0^*}{C_0} = \frac{OA_2}{OA_0}$.
This is further broken up into $\theta = \frac{C_0^T}{C_0} = \frac{OA_1}{OA_0} = \frac{OQ}{OP}$, representing technical
efficiency, and $\alpha = \frac{C_0^*}{C_0^T} = \frac{OA_2}{OA_1} = \frac{OS}{OQ}$, representing allocative efficiency.
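A numerical illustration of the Farrell decomposition may help. The sketch below takes the technical efficiency θ and the cost-minimizing bundle x∗ as given (in practice they would come from the input-oriented DEA problem and from (72)); all numbers and the function name are hypothetical.

```python
def farrell_cost_decomposition(w, x0, theta, x_star):
    """Return (gamma, theta, alpha) with gamma = theta * alpha."""
    def dot(a, b):
        return sum(ai * bi for ai, bi in zip(a, b))

    c0 = dot(w, x0)             # actual cost C0
    c_t = theta * c0            # cost of the technically efficient bundle
    c_star = dot(w, x_star)     # minimum cost C0*
    gamma = c_star / c0         # overall cost efficiency
    alpha = c_star / c_t        # allocative efficiency
    return gamma, theta, alpha

# Assumed prices, observed bundle, technical efficiency, and optimal bundle.
gamma, theta, alpha = farrell_cost_decomposition(
    w=[2.0, 3.0], x0=[10.0, 10.0], theta=0.8, x_star=[5.0, 8.0]
)
print(gamma, theta, alpha)
```

With these values C0 = 50, the technically efficient cost is 40, and the minimum cost is 34, so cost efficiency 0.68 splits into technical efficiency 0.8 and allocative efficiency 0.85.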

Fixed Inputs and Short Run Cost Minimization

In the short run, one distinguishes between fixed and variable inputs. Suppose that
the input vector x is partitioned as x = (v, K) where v is the vector of variable
inputs and K is a single fixed input. The vector of variable input prices is $w_v$, while
the rental rate of the fixed input is $r_K$. The firm has a target (scalar) output y0 and
has to select a bundle of variable inputs (v) that can produce y0 when combined with
the given quantity of fixed input K0 . Note that like y0 , K0 is also given in the short
run. Hence, minimizing the short run total cost $SRTC = w_v^{\prime}v + r_K K_0$ involves
minimization of only the variable cost $w_v^{\prime}v$. The firm’s optimization problem in
this case is

\[
\begin{aligned}
\min\ & VC = w_v^{\prime}v \\
\text{s.t.}\ & \sum_j \lambda_j v^j \le v; \\
& \sum_j \lambda_j K_j \le K_0; \\
& \sum_j \lambda_j y_j \ge y_0; \\
& \sum_j \lambda_j = 1; \quad v \ge 0; \quad \lambda_j \ge 0; \quad (j = 1, 2, \ldots, N)
\end{aligned}
\tag{76}
\]

The optimal solution of (76) along with the fixed cost ($r_K K_0$) leads to the optimal
short run total cost $SRTC^* = w_v^{\prime}v^* + r_K K_0$. It should be noted that when the fixed
input constraint is binding, the corresponding dual variable $\frac{\partial VC^*}{\partial K} \equiv z_k$, measuring
the change in the variable cost as the quantity of the fixed input is increased, will
be negative. In production economics, $-z_k$ is treated as the shadow value of
the fixed input. If $-z_k > r_K$, the firm can lower the SRTC for its output y0 by increasing
K if possible. A measure of the short run marginal cost is $MC = \frac{\partial VC^*}{\partial y_0}$, which is the
dual variable associated with the output constraint.

Using Total Cost as an Aggregate Input

In many situations, in order to evaluate the technical efficiency of a
firm, one has to use the total expenditure or cost as an aggregate measure of input. As
argued by Banker, Chang, and Natarajan (BCN) [10], because the total cost involves
both quantities and prices of inputs, the measure of efficiency obtained using
a single aggregated input variable equals the aggregate (technical and allocative)
efficiency of the firm. For this to be true, one has to assume that all firms face the
same input prices even though these prices may not be known.
Assume that all firms face the same strictly positive price vector w. Thus, the
total cost of the input bundle xj of firm j is $X_j = w'x^j$. Now construct the set

\[
\tilde{T}(w) = \left\{(X, y) : X \ge \sum_{j=1}^{N} \lambda_j X_j;\ y \le \sum_{j=1}^{N} \lambda_j y^j;\ \sum_{j=1}^{N} \lambda_j = 1;\ \lambda_j \ge 0\right\} \tag{77}
\]

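derived from the estimated production possibility set T̂ defined earlier.

Consider the single input technical efficiency, $\theta^X = \min\{\theta : (\theta X_0, y^0) \in \tilde{T}(w)\}$.
The relevant DEA problem is

\[
\begin{aligned}
\theta^X = \min\ & \theta \\
\text{s.t.}\ & \sum_{j=1}^{N} \lambda_j X_j \le \theta X_0; \\
& \sum_{j=1}^{N} \lambda_j y^j \ge y^0; \\
& \sum_{j=1}^{N} \lambda_j = 1; \quad \lambda_j \ge 0; \quad (j = 1, 2, \ldots, N).
\end{aligned}
\tag{78}
\]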

Contrast this with the cost minimization problem in (72). It is clear that the
constraint $\sum_{j=1}^{N} \lambda_j x^j \le x$ in (72) implies $\sum_{j=1}^{N} \lambda_j w'x^j \le w'x$. If we define
$w'x = \theta \cdot w'x^0$, it is obvious that the single input DEA technical efficiency
problem in (78) is the same as the multiple input cost minimization problem
in (72).
In other words, $\tau_x(X_0, y^0) = \theta^X$ from (78) is the same as $\gamma(x^0 \mid y^0, w)$.20 We have
already seen that cost efficiency coincides with technical efficiency if and only if
allocative efficiency is unity. An implication of this result is that technical efficiency
obtained from using total cost as the single input will in general underestimate the
technical efficiency of a multiple input firm.
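The equivalence can be illustrated with a small example. The sketch below makes two simplifying assumptions that are not in the text: there is a single output, and the convexity constraint is dropped (CRS), so both problems have closed-form solutions. The data, prices, and function names are hypothetical.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cost_efficiency_crs(units, w, x0, y0):
    """gamma = C*/C0 from the cost problem, single output, CRS.

    With one output and CRS, the minimum cost of y0 is y0 times the
    lowest observed cost per unit of output.
    """
    c_star = y0 * min(dot(w, x) / y for x, y in units)
    return c_star / dot(w, x0)

def aggregate_input_efficiency_crs(units, w, x0, y0):
    """theta from the one-input model that uses total cost X_j = w'x_j."""
    costs = [(dot(w, x), y) for x, y in units]
    return min(big_x / y for big_x, y in costs) * y0 / dot(w, x0)

# Hypothetical units: (input bundle, output); common prices w.
units = [((2.0, 4.0), 2.0), ((3.0, 3.0), 3.0), ((6.0, 2.0), 2.0)]
w = (1.0, 2.0)
x0, y0 = units[0]

print(cost_efficiency_crs(units, w, x0, y0),
      aggregate_input_efficiency_crs(units, w, x0, y0))
```

Both functions return 0.6 for the first unit, which is below any purely technical efficiency score whenever the unit's input mix is not cost minimizing.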
It is very important to realize that in the cost minimization problem, the input
price vector, w0 , is assumed to be a parameter like the target output bundle y0 . No
matter whether the product market is competitive or monopolistic, the input markets
are assumed to be competitive, and the firm is a price taker in the factor markets. It
is assumed that the firm can purchase any input bundle it wants at the given vector of
input prices. Failure to recognize this competitive assumption leads to the erroneous
perception that between two units producing the same output and using the same
input bundle, one is more cost-efficient because it faces lower input prices in the
factor market and has a lower cost.21

20 See the discussion in Ray and Mukherjee [64].

21 See, for an example, Tone [75].

Multi-location Cost Minimization

Ray, Chen, and Mukherjee (RCM) [65] introduced some measure of flexibility in
the choice of input prices on the part of the firm by considering multiple production
locations across which input prices vary.
In the RCM multi-location cost minimization problem, the firm still has a target
output vector y0 to produce. But it has the option to produce all or parts of y0 in any
one or several out of a given set of S sites. The vector of input prices, ws , that the
firm has to pay at any specific location s is fixed. But the firm can choose to pay a
different vector of prices, wq , by selecting a different site, q. There is an important
constraint, however. Any output produced at any specific site, s, must be produced
from inputs procured at the local market at the locally applicable prices, ws . The firm
is not allowed to “cherry pick” by purchasing various inputs at favorable prices from
different markets. Suppose that the firm has a choice of sites at three locations: A, B,
and C. The input prices at these locations are wA , wB , and wC , respectively. The firm
decides to produce the output bundles yA , yB , and yC at these locations. Of course, it
may decide not to produce anything at location s, in which case ys = 0 (s = A, B, C).
The input bundles used at these locations are (xA , xB , xC ), and the corresponding
costs of production are CA = wA xA , CB = wB xB , and CC = wC xC . The firm has to
choose its multi-location production plan such that its total output target is met at
the minimum cost. Of course, the production plan (xs , ys ) at each location s must be
feasible. As formulated in RCM, the multi-location minimum cost is22

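\[
\begin{aligned}
\min\ & C = w^{A\prime}x^A + w^{B\prime}x^B + w^{C\prime}x^C \\
\text{s.t.}\ & \sum_{j=1}^{N} \lambda_j^s x^j \le x^s, \quad (s = A, B, C); \\
& \sum_{j=1}^{N} \lambda_j^s y^j \ge y^s, \quad (s = A, B, C); \\
& \sum_{s=A,B,C} y^s \ge y^0; \\
& \sum_{j=1}^{N} \lambda_j^s = B_s, \quad (s = A, B, C); \\
& B_s \in \{0, 1\}, \quad (s = A, B, C); \quad \lambda_j^s \ge 0 \quad (s = A, B, C;\ j = 1, 2, \ldots, N).
\end{aligned}
\tag{79}
\]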

22 This problem assumes homogeneous technology across all locations. RCM also consider the
case where the technology varies across locations.

In the above problem, which is a mixed-integer program because of the binary
variables, Bs is an indicator variable. When Bs equals zero, every
λsj also equals zero. Thus, both xs and ys are null vectors. That is, the firm does
not select location s for producing any output. On the other hand, when Bs equals
unity, (xs , ys ) is some convex combination of observed input-output bundles and is a
strictly positive input-output bundle. That is, the firm does produce a positive output
bundle at location s.

Revenue Efficiency

In some cases, the objective of a firm is to select the optimal output bundle that can
be produced from a given endowment of inputs in order to generate the maximum
revenue. This time, the output prices (like the input quantities) are treated as given
parameters. Suppose that a firm produces the output bundle y0 from the input bundle
x0 and at the output price vector p0 earns the revenue R0 = p0 y0 . Now the maximum
possible revenue that can be generated from this input bundle and at these output
prices is

\[
R^* = \max\{p^{0\prime}y : (x^0, y) \in T\}. \tag{80}
\]

Alternatively,

\[
R^* = \max\{p^{0\prime}y : y \in P(x^0)\}. \tag{81}
\]

The VRS DEA formulation of this problem is

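\[
\begin{aligned}
\max\ & p^{0\prime}y \\
\text{s.t.}\ & \sum_j \lambda_j y^j \ge y; \\
& \sum_j \lambda_j x^j \le x^0; \\
& \sum_j \lambda_j = 1; \quad \lambda_j \ge 0, \quad (j = 1, 2, \ldots, N).
\end{aligned}
\tag{82}
\]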

It may be noted in passing that the revenue maximization problem is the
nonparametric version of the familiar product mix problem. Here the resource and
return vectors are specified but the technology matrix is not. The optimal value of
the objective function in (82) yields the maximum revenue $R^* = p^{0\prime}y^*$, and the
revenue efficiency of the firm is measured as

\[
\eta(y^0; x^0, p^0) = \frac{p^{0\prime}y^0}{p^{0\prime}y^*} = \frac{R_0}{R^*}. \tag{83}
\]

As in the case of cost efficiency, revenue efficiency also can be multiplicatively
decomposed into technical and allocative efficiency.
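By analogy with the cost decomposition in (74), the revenue decomposition can be sketched as follows. The symbols $R_0^T$ (the revenue at the radially expanded output bundle) and $\varphi^*$ (the output-oriented radial expansion factor at the given input bundle) are introduced here only for illustration and are not the chapter's own notation.

```latex
% Revenue at the technically efficient (radially expanded) output bundle
R_0^T = p^{0\prime}(\varphi^* y^0) = \varphi^* R_0 .

% Multiplicative decomposition of revenue efficiency
\eta = \frac{R_0}{R^*}
     = \underbrace{\frac{R_0}{R_0^T}}_{\text{technical efficiency } = 1/\varphi^*}
       \cdot
       \underbrace{\frac{R_0^T}{R^*}}_{\text{allocative efficiency}} .
```

The second factor is at most one because the radially expanded bundle is feasible for (82), and it falls below one when changing the output mix would raise revenue further.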

Profit Efficiency

In the case of cost minimization, the output quantities and input prices were
treated as exogenously determined, and only the input quantities were the choice
variables. Similarly, for revenue maximization, input quantities and output prices
were parameters, and the output quantities were the choice variables. For profit
maximization by a competitive firm, both input and output quantities are choice
variables, while the input and output prices are parameters. In order to maximize
its profit, the firm is free to select any input-output bundle so long as it represents a
feasible production plan in the sense that the selected output bundle can be produced
from the corresponding input bundle. Thus, the profit maximization problem can be
expressed as

$$ \pi(p^0, w^0) = \max \{ p^0 y - w^0 x : (x, y) \in T \}. \tag{84} $$

The DEA problem for profit maximization is

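$$
\begin{aligned}
\max\ & p^0 y - w^0 x \\
\text{s.t. } & \sum_j \lambda_j y^j \ge y; \\
& \sum_j \lambda_j x^j \le x; \\
& \sum_j \lambda_j = 1;\quad \lambda_j \ge 0\ (j = 1, 2, \ldots, N).
\end{aligned}
\tag{85}
$$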

It is important to remember that the optimal profit cannot assume a non-zero yet
finite value if CRS holds.
The optimal solution of (85) yields the maximum profit π ∗ = p0 y∗ − w0 x∗ ,
and the profit efficiency of the firm can be measured as

$$ \upsilon = \frac{\pi_0}{\pi^*} = \frac{p^0 y^0 - w^0 x^0}{p^0 y^* - w^0 x^*}. \tag{86} $$

A potential problem with this measure of efficiency is that if the actual profit
is negative, one gets υ < 0. Note, however, that from the standpoint of economic
theory, this is really a question of misspecification of the problem. Because all inputs
are being freely chosen, (85) represents an analysis of the behavior of the firm in the
long run. If the firm is making losses even in the long run, it would have the option to
quit in which case there would be 0 profit. The fact that the firm is earning negative
profit but still is in business would suggest that there are fixed costs and not all inputs
are freely variable. In that case (85) is not an appropriate analytical framework for
evaluating efficiency.
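
A minimal numerical sketch of the profit maximization problem (85) and the ratio (86) is given below. It assumes SciPy's `linprog`; the function name `max_profit_vrs`, the prices, and the four-firm data set are hypothetical. The example is set up so that actual profit is non-negative, which sidesteps the sign problem discussed above.

```python
import numpy as np
from scipy.optimize import linprog


def max_profit_vrs(X, Y, p0, w0):
    """Solve the VRS profit-maximization LP (85); returns (pi_star, x_star, y_star)."""
    N, n = X.shape
    m = Y.shape[1]
    # Decision variables: x (n entries), y (m entries), lambda (N entries).
    c = np.concatenate([w0, -p0, np.zeros(N)])                 # minimize w0.x - p0.y
    A_out = np.hstack([np.zeros((m, n)), np.eye(m), -Y.T])     # y <= sum_j lambda_j y^j
    A_in = np.hstack([-np.eye(n), np.zeros((n, m)), X.T])      # sum_j lambda_j x^j <= x
    A_eq = np.concatenate([np.zeros(n + m), np.ones(N)])[None, :]
    res = linprog(c, A_ub=np.vstack([A_out, A_in]), b_ub=np.zeros(m + n),
                  A_eq=A_eq, b_eq=[1.0], bounds=(0, None), method="highs")
    return -res.fun, res.x[:n], res.x[n:n + m]


# Hypothetical data: four firms, one input, one output.
X = np.array([[1.0], [2.0], [4.0], [3.0]])
Y = np.array([[1.0], [3.0], [4.0], [3.5]])
p0 = np.array([1.0])
w0 = np.array([1.0])

pi_star, x_star, y_star = max_profit_vrs(X, Y, p0, w0)
k = 3                                              # evaluate the fourth firm
pi_0 = float(p0 @ Y[k] - w0 @ X[k])                # actual profit
profit_efficiency = pi_0 / pi_star                 # upsilon in (86)
```

Here the maximum profit of 1 is attained at the second firm's bundle, and the fourth firm, with an actual profit of 0.5, has a profit efficiency of 0.5.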
Varian [76] proposed a quick test of profit maximizing behavior simple enough
to be carried out on the back of an envelope.23 Consider the observed input-output
bundles (xj , yj ), (j = 1, 2, . . . , N). Suppose that the output and input price vectors of
firm i are (pi , wi ). Thus its actual profit is π i = pi yi − wi xi . Firm i is not profit
maximizing if π i = pi yi − wi xi < pi yj − wi xj for some j ≠ i. Varian calls this
the weak axiom of profit maximization (WAPM). The intuition behind this is quite
simple. Every observed input-output bundle is feasible by assumption. Hence, if a
different bundle yields higher profit at the prices applicable for firm i, then this firm
cannot have maximized profit. In that sense, WAPM is a necessary condition for
consistency of the observed bundle with profit maximizing choice. It can also be
shown that under the standard assumptions of convexity and free disposability, it is
also a sufficient condition. (See Ray [56], p. 261.)
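
The back-of-the-envelope check described above amounts to comparing each firm's actual profit with the profit that every other observed bundle would yield at that firm's own prices. A minimal sketch, using NumPy and hypothetical names and data (none of which come from the chapter), is:

```python
import numpy as np


def wapm_consistent(X, Y, P, W, tol=1e-9):
    """Return a boolean array that is True where firm i passes the WAPM check.

    Rows of X and Y are the observed input and output bundles; row i of P and W
    holds the output and input prices faced by firm i.
    """
    # profit[i, j] = p^i . y^j - w^i . x^j, i.e. bundle j valued at firm i's prices
    profit = P @ Y.T - W @ X.T
    own = np.diag(profit)              # actual profit of each firm
    best = profit.max(axis=1)          # best profit among all observed bundles
    return own >= best - tol


# Hypothetical data: three firms, one input, one output, identical prices.
X = np.array([[1.0], [2.0], [4.0]])
Y = np.array([[1.0], [3.0], [4.0]])
P = np.ones((3, 1))
W = np.ones((3, 1))

passes = wapm_consistent(X, Y, P, W)
```

With these numbers only the second firm passes: the first and third firms each earn zero profit although the second firm's bundle would have earned 1 at the same prices.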

Capacity Utilization24

While there is considerable interest in measuring capacity utilization, there is no

general agreement on the definition of the capacity output. Indeed, there are several
definitions of the capacity output – each is valid in a particular context. By far, the
simplest of them is the maximum level of output that can be produced from a given
level of quasi-fixed inputs (like plant and machinery) even when variable inputs
(like labor or materials) are available without restriction. By definition, the actual
output produced cannot exceed this maximum quantity. This is a physical measure
of capacity that is technologically determined. First proposed by Johansen [43],
it has been subsequently popularized in empirical applications by Färe, Grosskopf,
and Kokkelenberg (FGK) [32]. Note that this capacity is a short run concept because
at least one variable is held fixed. Moreover, the level of the capacity output
would depend on which input is being held fixed and at what level. In contrast

23 Also see Afriat [1] and Hanoch and Rothschild [42]. Banker and Maindiratta [6] further extended
Varian [76] to construct upper and lower bounds on technical, allocative, and overall
profit efficiencies.
24 See Chap. 24, “Capacity and Capacity Utilization in Production Economics,” in this volume
of the Handbook for a detailed discussion of capacity utilization in production economics in
general.

to this physical measure, the textbook definition of the economic capacity output
corresponds to the minimum of the long run average cost curve of a competitive
firm. Because this is a long run measure, there is no fixed input. In the case of the
long run average cost curve, its U-shape derives from economies of scale followed
by diseconomies of scale and is not due to any fixed input. An economic measure
of the capacity output differs from the physical measure even in the short run. The
presence of fixed costs associated with the quasi-fixed inputs of the firm justifies
the U-shaped average cost curve, and the output level where the short run average
(total) cost reaches the minimum is the capacity level for the given bundle of the
(quasi)-fixed inputs. In fact, Cassell [13] argued that, “since the absolute technical
upper limit of output obtainable from the fixed factors is likely to lie far beyond the
realm of practical economic operations, their capacity output should be taken as that
at which the average full costs of production are at their minimum.”
It is important to emphasize at the outset that the average cost curve – whether in
the short run or in the long run – is unequivocally defined only in the context
of a single product technology. In the multiple output case, there is no natural
definition of average cost. One approach to deal with this problem is to obtain
an aggregate measure of output (like total revenue) and to treat it as a single
product problem. Alternatively, following Baumol, Panzar, and Willig [11], one
may consider variations in only the scale of an output bundle but keep the output mix
unchanged and measure the ray average cost. The economic capacity scale will then
correspond to the minimum point of the ray average cost curve for the given output
mix. In this section only the single output case is considered. In a recent paper, Ray,
Walden, and Chen [67] developed DEA models to determine the rate of capacity
utilization in the multiple output case both in the short run and in the long run under
alternative returns to scale assumptions and accommodating multiple fixed inputs.

A Physical Measure of Short Run Capacity Utilization

To obtain the physical measure of the short run capacity output, partition the input
vector as x = (v, f ), where v is the sub-vector of variable inputs and f is the sub-
vector of fixed inputs. Then for any given bundle of fixed inputs f0 , capacity output
is defined by FGK as

$$ y^{**}(f^0) = \max_v \{ y : (v, f^0, y) \in T \}. \tag{87} $$

In empirical applications, the FGK capacity output is measured as

$$ y^{**} = \varphi^{**} y_0 $$

where

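$$
\begin{aligned}
\varphi^{**} = \max\ & \varphi \\
\text{s.t. } & \sum_j \lambda_j y_j \ge \varphi y_0; \\
& \sum_j \lambda_j v^j \le v; \\
& \sum_j \lambda_j f^j \le f^0; \\
& \sum_j \lambda_j = 1; \\
& v \ge 0;\quad \lambda_j \ge 0\ (j = 1, 2, \ldots, N).
\end{aligned}
\tag{88}
$$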

Note that apart from non-negativity the variable inputs are unconstrained and,
in consequence, play no role in the optimization problem in (88). Recall that the
technically efficient output producible from the given levels of variable and fixed
inputs (v0 , f0 ) is

$$ y^* = \varphi^* y_0 $$

where

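$$
\begin{aligned}
\varphi^{*} = \max\ & \varphi \\
\text{s.t. } & \sum_j \lambda_j y_j \ge \varphi y_0; \\
& \sum_j \lambda_j v^j \le v^0; \\
& \sum_j \lambda_j f^j \le f^0; \\
& \sum_j \lambda_j = 1; \\
& v \ge 0;\quad \lambda_j \ge 0\ (j = 1, 2, \ldots, N).
\end{aligned}
\tag{89}
$$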

FGK defined capacity utilization as

$$ CU^* = \frac{y^*}{y^{**}} = \frac{\varphi^*}{\varphi^{**}}. \tag{90} $$

It is clear from above that ϕ∗∗ ≥ ϕ∗ and, therefore, their measure of capacity
utilization can never exceed unity. One might prefer to measure capacity utilization

by the ratio of the actual output y0 to the capacity output y∗∗ . The corresponding rate
of capacity utilization is

$$ CU^0 = \frac{y_0}{y^{**}} = \frac{1}{\varphi^{**}} = \frac{1}{\varphi^*} \cdot \frac{\varphi^*}{\varphi^{**}}. \tag{91} $$

The first factor on the right measures wasted capacity due to inefficiency, while
the second relates to excess capacity.
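
The two output-expansion problems (88) and (89) and the utilization rates (90) and (91) can be illustrated with a small sketch. It assumes SciPy's `linprog`; the function name `output_expansion` and the data are hypothetical. Because the variable inputs play no role in (88), that problem is obtained here simply by omitting the variable-input constraints.

```python
import numpy as np
from scipy.optimize import linprog


def output_expansion(y, F, f0, y0, V=None, v0=None):
    """Maximum radial expansion factor phi of the scalar output y0 under VRS.

    With V and v0 supplied, the variable-input constraints of (89) are imposed;
    without them the problem is (88), where the variable inputs are free.
    """
    N = len(y)
    c = np.zeros(N + 1)
    c[0] = -1.0                                        # maximize phi
    rows = [np.concatenate([[y0], -y])]                # phi*y0 - sum_j lambda_j y_j <= 0
    rhs = [0.0]
    for block, bound in ((F, f0), (V, v0)):
        if block is None:
            continue
        for col, b in zip(block.T, bound):
            rows.append(np.concatenate([[0.0], col]))  # sum_j lambda_j (input)_j <= bound
            rhs.append(b)
    A_eq = np.concatenate([[0.0], np.ones(N)])[None, :]   # sum_j lambda_j = 1
    res = linprog(c, A_ub=np.array(rows), b_ub=np.array(rhs), A_eq=A_eq,
                  b_eq=[1.0], bounds=(0, None), method="highs")
    return res.x[0]


# Hypothetical data: three firms, one variable input v, one fixed input f.
y = np.array([2.0, 4.0, 2.0])
V = np.array([[1.0], [3.0], [2.0]])
F = np.array([[1.0], [1.0], [1.0]])

k = 2                                                   # evaluate the third firm
phi_star = output_expansion(y, F, F[k], y[k], V, V[k])  # problem (89)
phi_cap = output_expansion(y, F, F[k], y[k])            # problem (88)
CU_star = phi_star / phi_cap                            # FGK measure (90)
CU_0 = 1.0 / phi_cap                                    # measure based on actual output (91)
```

For the third firm in this example the efficient output is 1.5 times the actual output and the capacity output is twice the actual output, so CU* = 0.75 and CU0 = 0.5 = (1/1.5)(0.75), which reproduces the decomposition in (91).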

Long Run Capacity Output

The short run physical capacity output is not economically appealing because it
does not pay any attention to the cost of adjusting the variable inputs that would be
needed to produce the capacity output and it is not clear why a firm would like to
produce at that level. By contrast, the minimum point of the long run average cost
curve is arguably the only sustainable output level for a firm in the long run in a
competitive market with free entry and exit.
The minimum point of the long run average cost (LAC) curve corresponds to
the output level where economies of scale have been exhausted but diseconomies
of scale have not yet set in. Of course, the LAC will have the usual U-shape only
when the technology exhibits variable returns to scale. There is a subtle difference
between the MPSS and the cost-efficient scale size. The former maximizes the
average productivity $AP(x) = f(x)/x$ in the single input single output case and the ray
average productivity $RAP(t, x) = f(tx)/t$ in the multiple input single output case. On
the other hand, the cost-efficient output scale minimizes the average cost
$AC(y) = C(w, y)/y$ in the single output case and the ray average cost
$RAC(t, y) = C(ty, w)/t$ in the multiple output case.
Ray [61] addressed the problem of finding the capacity output in the single output
multiple input case. In order to find the output level where the LAC under VRS
reaches a minimum, one needs to solve the problem

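$$
\begin{aligned}
\min\ & \frac{w^0 x}{y} \\
\text{s.t. } & \sum_j \lambda_j x^j \le x; \\
& \sum_j \lambda_j y_j \ge y; \\
& \sum_j \lambda_j = 1; \\
& \lambda_j \ge 0\ (j = 1, 2, \ldots, N).
\end{aligned}
\tag{92}
$$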

The optimal input-output bundle $(x^*, y^*)$ from (92) above yields the minimum average
cost

$$ AC^{VRS}(w^0, y^*) = \frac{C^{VRS}(w^0, y^*)}{y^*} = \frac{w^0 x^*}{y^*} \tag{93} $$

along with the efficient output level ($y^*$).

Although the problem in (92) is nonlinear, one can utilize the following two
lemmas from production economics to determine both the output level y* and the
minimum average cost ACVRS (w0 , y∗ ) by solving a simple DEA LP problem.

Lemma 1
Locally constant returns to scale holds at the input-output bundle (x* , y* ) where
the average cost reaches a minimum.

Lemma 2
If the technology exhibits constant returns to scale globally, average cost is a
constant at all output levels.

Proof See Ray [61].

Suppose that firm k faces the input prices $w^k$ and produces the output quantity $y_k$.
The objective is to find the output level $y_k^* = y(w^k)$ that minimizes $AC^{VRS}(w^k, y)$.
For this, first consider the CRS cost minimization problem

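$$
\begin{aligned}
\min\ & w^k x \\
\text{s.t. } & \sum_j \lambda_j x^j \le x; \\
& \sum_j \lambda_j y_j \ge y_k; \\
& \lambda_j \ge 0.
\end{aligned}
\tag{94}
$$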

Suppose that at the optimal solution $\sum_j \lambda_j^* = q$ and $\sum_j \lambda_j^* x^j = x^*$.
This implies that $w^k x^* = C^{CRS}(w^k, y_k)$ is the minimum cost of producing $y_k$
at input prices $w^k$ under the CRS assumption. Then, by Lemma 2,
$\frac{1}{q} w^k x^* = \frac{1}{q} C^{CRS}(w^k, y_k) = C^{CRS}(w^k, \frac{1}{q} y_k)$.
Define the output level $y = \frac{1}{q} y_k$ and the weights $\mu_j^* = \frac{1}{q} \lambda_j^*$.
Clearly, $\sum_j \mu_j^* = 1$ and each $\mu_j^* \ge 0$. Further, define the input
vector $\tilde{x} = \frac{1}{q} x^* = \sum_j \mu_j^* x^j$.

Now consider the VRS cost minimization problem

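$$
\begin{aligned}
C^{VRS}(w^k, y) = \min\ & w^k x \\
\text{s.t. } & \sum_j \lambda_j x^j \le x; \\
& \sum_j \lambda_j y_j \ge y; \\
& \sum_j \lambda_j = 1; \\
& \lambda_j \ge 0.
\end{aligned}
\tag{95}
$$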

Suppose that the optimal solution of (95) yields the input bundle $\hat{x}$ with minimum
cost $w^k \hat{x}$. It can be seen from above that the $\mu_j^*$s and $\tilde{x}$ constructed
from (94) constitute a feasible solution for (95). Hence,

$$ w^k \hat{x} \le w^k \tilde{x}. \tag{96} $$

At the same time, $w^k \tilde{x} = \frac{1}{q} w^k x^* = C^{CRS}(w^k, y)$ is the optimal value of the
less restrictive CRS problem

$$
\begin{aligned}
\min\ & w^k x \\
\text{s.t. } & \sum_j \lambda_j x^j \le x; \\
& \sum_j \lambda_j y_j \ge y; \\
& \lambda_j \ge 0.
\end{aligned}
\tag{97}
$$

Because $\hat{x}$ is feasible for (97), it follows that

$$ w^k \tilde{x} \le w^k \hat{x}. \tag{98} $$

The inequalities (96) and (98) together imply

$$ w^k \tilde{x} = w^k \hat{x}. \tag{99} $$

Therefore, at the output level y, the total and average costs are the same under
both CRS and VRS assumptions. That is, at the output level y, the average cost
$AC^{VRS}(w^k, y)$ reaches a minimum. In other words, y is an efficient production scale.

Recall, however, that $y = \frac{1}{q} y_k$. Hence, an economic measure of capacity
utilization is

$$ CU_e = \frac{y_k}{y} = q. \tag{100} $$

In order to obtain the capacity output, one only needs to solve the CRS cost
minimization problem and from the optimal values of the λs compute

$$ y = \frac{y_k}{q} = \frac{y_k}{\sum_j \lambda_j^*}. \tag{101} $$
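
The recipe in (100) and (101) can be sketched as follows, assuming SciPy's `linprog` and strictly positive input prices. The function name `economic_capacity` and the data are hypothetical. With positive prices the optimal x in (94) equals the weighted sum of the observed input bundles, so the sketch writes the LP in the λs alone. If (94) has multiple optimal solutions, the value of q returned by the solver need not be unique.

```python
import numpy as np
from scipy.optimize import linprog


def economic_capacity(X, y, w, k):
    """Return (q, capacity_output) for firm k from the CRS cost LP (94)."""
    bundle_costs = X @ w                       # w.x^j: cost of each observed input bundle
    # min sum_j lambda_j (w.x^j)  s.t.  sum_j lambda_j y_j >= y_k,  lambda_j >= 0
    res = linprog(bundle_costs, A_ub=-y[None, :], b_ub=[-y[k]],
                  bounds=(0, None), method="highs")
    q = res.x.sum()                            # q = sum_j lambda_j^*, equal to CU_e in (100)
    return q, y[k] / q                         # capacity output from (101)


# Hypothetical data: three firms, one input, one output.
X = np.array([[2.0], [3.0], [6.0]])
y = np.array([1.0, 3.0, 4.0])
w = np.array([1.0])

q_small, cap_small = economic_capacity(X, y, w, 0)   # firm producing y = 1
q_large, cap_large = economic_capacity(X, y, w, 2)   # firm producing y = 4
```

In this example average cost is lowest at the second firm's output level of 3, which is therefore the capacity output for both evaluated firms; the first firm operates at one third of capacity, while the third firm operates above it (q = 4/3).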

Economic Scale Efficiency

The scale efficiency of an input-output bundle $(x^0, y^0)$ is measured by the inverse of
the ray average productivity at its MPSS. Thus, $SE(x^0, y^0) = \beta/\alpha$ when $(\beta x^0, \alpha y^0)$
is its MPSS. By contrast, the economic scale efficiency of the output (bundle) $y^0$ is

$$ ESE(y^0; w) = \frac{RAC(t^* y^0; w)}{RAC(y^0; w)} = \frac{C(t^* y^0, w)}{t^* C(y^0, w)} \tag{102} $$

where $t^*$ is the output scale that minimizes the ray average cost. In the scalar output
case, where $t^* y_0 = y^*$, this reduces to

$$ ESE(y_0 \mid w^0) = \frac{AC(w^0, y^*)}{AC(w^0, y_0)} \tag{103} $$

where y∗ is the output level where average cost reaches a minimum.

Efficiency Measurement with Bad Outputs25

In many cases, the production of the desired or intended output results
simultaneously in the production of some undesirable or “bad” outputs as well. The
most common example is one of electricity generation in thermal power plants
resulting in air pollution as well. It is recognized that just as increased production
of the desirable output from a given bundle of inputs implies higher efficiency,
increased production of the undesirable output should be considered lower efficiency.
While “good/bad” classification of outputs is a matter of preference, production

25 For a detailed treatment of bad outputs, see Chap. 12, “Bad Outputs” by Murty and Russell in
this Handbook.

of an output in spite of it being considered bad by the producer calls into question
the assumption of free disposability of such output. Clearly, it is not possible to
eliminate totally or at least to reduce significantly the bad output unilaterally without
increasing some input(s) and/or reducing some desired or good output as well.
While there is no general agreement about how to reconceptualize the production
technology to accommodate bad outputs, there are three principal approaches found
in the relevant literature.

Bad Output as Input

The first approach is to treat the bad output like a conventional input (e.g., in Baumol
and Oates [12] or Cropper and Oates [25]). In light of the positive correlation of the
good output (power) and the bad output (smoke emission), it is intuitively appealing
to treat the bad output like an input (say coal). But even though observationally
equivalent, the bad output is conceptually quite different from an input. First, an
input exists even before the production process starts. There was no smoke in the
air before production started. Second, an input is depleted in stock as production is
carried out. In this case, there is more smoke in the atmosphere as more power is
generated.

Good and Bad Outputs as Joint Products

In the second, and by far the most widely used, approach introduced by Färe and
Grosskopf and applied in numerous papers with their coauthors,26 the good and the
bad outputs are treated as weakly disposable and null joint. Two outputs are weakly
disposable if a unilateral decrease in one output holding the other output and inputs
unchanged is not feasible but they can be reduced together. They are null joint if
production of one output can be stopped only if the other output is not produced as
well. For example, consider an industry producing one good output (g) along with
a single bad output (b) using two inputs x1 and x2 . Under the assumption of free
disposability of outputs,

$$ (x_1^0, x_2^0, g^0, b^0) \in T \wedge (g^1, b^1) \le (g^0, b^0) \Rightarrow (x_1^0, x_2^0, g^1, b^1) \in T. \tag{104} $$

But under weak disposability

$$ (x_1^0, x_2^0, g^0, b^0) \in T \wedge (g^1, b^1) = (\alpha g^0, \alpha b^0);\ 0 \le \alpha \le 1 \Rightarrow (x_1^0, x_2^0, g^1, b^1) \in T. \tag{105} $$

26 See, for example, Färe, Grosskopf, Lovell, and Pasurka [33]; Färe, Grosskopf, Lovell, and
Yaisawarng [34]; and Färe, Grosskopf, Noh, and Weber [36] among many others.

Following Färe and Grosskopf [29], assuming (i) CRS and (ii) weak disposability
of the bad output but free disposability of the good output, the production possibility
set under joint production can be approximated as
$$
S_{JWPD} = \left\{ (x_1, x_2; g, b) :
x_i \ge \sum_{j=1}^{N} \mu_j x_i^j\ (i = 1, 2);\
g \le \sum_{j=1}^{N} \mu_j g_j;\
b = \sum_{j=1}^{N} \mu_j b_j;\
\mu_j \ge 0\ (j = 1, 2, \ldots, N) \right\}.
\tag{106}
$$

Bad Output as a By-Product

Murty and Russell [50] and Murty, Russell, and Levkoff (MRL) [51] regard the
bad output as an unintended by-product of producing the good output.27 Both the
neutral input x1 and the polluting input x2 are used for the production of the good
output, g, while only the polluting input produces the bad output b as an undesired
side effect.28 In this conceptualization, the production possibility set consists of two
sub-technologies
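$$ T^g = \left\{ (x_1, x_2; g) : F^g(x_1, x_2; g) \le 0;\ \frac{\partial F^g}{\partial x_i} < 0\ (i = 1, 2);\ \frac{\partial F^g}{\partial g} > 0 \right\} \tag{107} $$

and

$$ T^b = \left\{ (x_2; b) : F^b(x_2; b) \ge 0;\ \frac{\partial F^b(k x_2, k b)}{\partial k} > 0 \right\}. \tag{108} $$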

MRL [51] define the corresponding nonparametric construction of the production

possibility set for the by-production technology under CRS as

27 See their Chap. 12 on “Bad Outputs” in this volume of the Handbook.
28 MRL [51] also consider another model including pollution abatement as a separate desired output
produced by diverting resources from the production of the desired output g. An example would
be treatment of polluted waste water before discharging into the stream.

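$$
\begin{aligned}
S_{BP}^{g} &= \left\{ (x_1, x_2, g) : x_1 \ge \sum_j \lambda_j x_1^j;\ x_2 \ge \sum_j \lambda_j x_2^j;\ g \le \sum_j \lambda_j g_j;\ \lambda_j \ge 0;\ j = 1, 2, \ldots, N \right\} \\
S_{BP}^{b} &= \left\{ (x_2, b) : x_2 \ge \sum_j \mu_j x_2^j;\ b \ge \sum_j \mu_j b_j;\ \mu_j \ge 0;\ j = 1, 2, \ldots, N \right\} \\
S_{BP} &= \left\{ (x_1, x_2, g, b) : (x_1, x_2, g) \in S_{BP}^{g} \wedge (x_2, b) \in S_{BP}^{b} \right\}
\end{aligned}
\tag{109}
$$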

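MRL [51] formulate two separate optimization problems, one for maximizing

$$ \varphi_g : (x_1^0, x_2^0, \varphi_g g^0) \in S_{BP}^{g} \tag{110} $$

and another for minimizing

$$ \theta_b : (x_2^0, \theta_b b^0) \in S_{BP}^{b} \tag{111} $$

and measure overall efficiency as

$$ \psi^* = \alpha \frac{1}{\varphi_g^*} + (1 - \alpha)\, \theta_b^* \tag{112} $$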

where 0 < α < 1 is a preselected weight.

There are two problems with treating (110) and (111) as separate optimization
problems. First, the optimal value of the common input x2 can be different across
the two problems. As argued in Lozano [48] and Ray, Mukherjee, and Venkatesh
(RMV) [66], the quantity of the polluting input, x2∗ , in (110) that is used to produce
the good output g∗ should be exactly the x2∗ in (111) that produces the optimal bad
output b∗ . There is no constraint in the MRL formulation that ties the two problems
together. Another problem (although much less serious than the other) is that unless
the optimal λs in (110) and the optimal μs in (111) are the same, the peer group
from the good output benchmark would be different from the one for the bad output
benchmark. This implicitly assumes that there are two decision-making units within
the firm: one maximizing the good output and the other minimizing the bad output.

Joint Disposability and Material Balance Principle

RMV [66] modified the weak disposability assumption into what they call joint
disposability in respect of the pollution generating technology. In the joint production
approach, the two outputs (good and bad) are considered weakly disposable. This
is a relation between a pair of outputs. RMV [66] connect the bad output and the
polluting input through joint disposability in the sense that the bad output can be
reduced only if the polluting input is also reduced.29 Their formulation of Tb is

$$
S_{BP}^{b} = \left\{ (x_2, b) : x_2 = \alpha \sum_j \lambda_j x_2^j;\ b = \alpha \sum_j \lambda_j b_j;\ 1 \ge \alpha \ge 0;\ \lambda_j \ge 0;\ j = 1, 2, \ldots, N \right\}
\tag{113}
$$

Førsund ([38], [39]) points out that there is a physical law that requires that any
part of the (material) input that is not incorporated into the output must take the form
of a residual, which is treated as a bad output in specific situations. Accordingly, he
argues strongly for imposing a materials balance condition30 relating the good and
the bad outputs and the polluting input. In the simple case considered above, such
materials balance condition would be of the form

$$ a_1 b = a_2 x_2 - a_3 g \tag{114} $$

For the by-production technology, he recommends an additional restriction of the
form

$$ a_1 \left( \sum_j \lambda_j^* b_j \right) = a_2 \left( \sum_j \lambda_j^* x_2^j \right) - a_3 \left( \sum_j \lambda_j^* g_j \right). \tag{115} $$

Note that RMV [66] in their “unified” models require that the same λs must be
used for creating the benchmark bundles for the good and the bad output production
technologies. That should ensure that the material balance condition in (115) is
automatically satisfied. Because (114) is assumed to hold as a physical necessity for
every $(b_j, g_j, x_2^j)$ combination that is observed, $a_1 b_j = a_2 x_2^j - a_3 g_j$ holds for every j.

29 The joint disposability of the bad output and the polluting input is comparable to the two
materials balance postulates MB1 and MB2 in Dakpo et al. ([26], p. 352).
30 Ayres and Kneese [4] introduced the question of materials balance in economics. It has
subsequently been discussed extensively in the context of production efficiency by a
number of authors including Pethig [53]; Coelli, Lauwers, and Van Huylenbroeck [21]; Chambers
and Melkonyan [14]; Hampf [41]; Rodseth [68, 69]; and Førsund [39] among others. See, in
particular, Dakpo, Jeanneaux, and Latruffe [26].

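Hence, $\lambda_j^* a_1 b_j = \lambda_j^* a_2 x_2^j - \lambda_j^* a_3 g_j$ for every j, and summing over j gives

$$ a_1 \sum_j \lambda_j^* b_j = a_2 \sum_j \lambda_j^* x_2^j - a_3 \sum_j \lambda_j^* g_j. $$

This would make imposition of (115) as an additional restriction
redundant.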
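
The redundancy argument is easy to verify numerically. In the short NumPy sketch below, the coefficients, the data, and the weights are all hypothetical; the bad output of every observed unit is generated from (114), and the weighted benchmark is then checked against (115).

```python
import numpy as np

a1, a2, a3 = 1.0, 2.0, 1.0                 # hypothetical materials-balance coefficients
x2 = np.array([3.0, 5.0, 4.0])             # polluting input of each observed unit
g = np.array([2.0, 4.0, 3.0])              # good output of each observed unit
b = (a2 * x2 - a3 * g) / a1                # bad output implied by (114) for every unit

lam = np.array([0.2, 0.5, 0.3])            # any set of non-negative weights
lhs = a1 * (lam @ b)                       # left-hand side of (115)
rhs = a2 * (lam @ x2) - a3 * (lam @ g)     # right-hand side of (115)
balance_holds = bool(np.isclose(lhs, rhs))
```

Because both sides of (115) are linear in the weights, the equality holds for any choice of λs, provided the same λs are applied to the good output, the bad output, and the polluting input.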
It is important to remember that environmental pollution is not the only case
where a bad output is produced alongside the good output and there is not a
single model that applies in every context. Consider, for example, the case of
nonperforming loans in banking. Such loans in default are best modeled as joint
products with the good output (loans in good standing). Because the probability of
loan default can never be reduced to 0, the only way to eliminate them is to stop
lending altogether. Thus, they are indeed null joint. The joint production model is a
more appropriate analytical format in this case.

Contextual Variables in DEA

The production possibility set consists of all feasible input-output bundles and is
thus defined in the input-output space. However, production takes place in a specific
physical, social, and cultural environment. Differences in environmental conditions
can play a decisive role in defining the feasibility of a particular input-output bundle.
In measuring the efficiency of a decision-making unit, one assumes that it can
choose the input bundle it uses or the output bundle it produces. Unlike inputs or
outputs, the environmental factors cannot be chosen by the firm and have to be treated
as “non-discretionary.”
An obvious example of an environmental factor is rainfall in the context of
agricultural production. The maximum output producible from a given bundle of
inputs (say labor, fertilizer, and land) depends on the amount of rainfall. In that
sense, rainfall contributes to the output much the same way as irrigation. However,
while the farmer can choose the level of irrigation, the amount of rainfall is not
within his control. Here, rainfall acts as a non-discretionary input. In defining the
feasible set for a DEA LP problem, one has to include a constraint for the amount
of rainfall, but while measuring the radial input-oriented technical efficiency, the
proportional scaling factor should be applied only to the discretionary inputs (like
labor, fertilizer, and land) but not to rainfall.
For another example, consider a secondary school where the average
performance of its pupils in a standardized test in mathematics is one of the outputs and
hours of classroom instruction in math is one of the inputs. An increase in this
input is expected to improve the average test score in math. Now consider another
variable – the median family income of the town where the school is located. There
is ample evidence to conclude that students from more affluent families where the
parents are professionals are better motivated and spend more time on homework
and perform better in tests. In that sense, the economic status of the pupil acts

like class time spent on math. However, class time is an input, while the economic
status of the pupil is a contextual variable. Such variables are also referred to as
“non-discretionary inputs.”
Another example of a contextual variable is the marital status of parents of a pupil.
A child from a single parent family (irrespective of income) is unlikely to get the
same level of parental attention as when both parents are present. Thus, an increase
in the proportion of pupils from single parent families will lower the average math
score even when all other variables are unchanged. Two things emerge out of this
illustrative example. First, unlike an input, a contextual variable may be either
favorable (like family income) or detrimental for production (like single parent
families). Second, a decision-maker at an appropriate level of authority (the school
superintendent or the Board of Education) can select the input bundle used. This is
not true for the contextual variables.
For yet another example, consider the efficiency of a water utility. The outputs are
the number of customers served and the gallons of water distributed. The inputs are
pumps, length of pipelines, and hours of labor. Note that in an urban area the higher
density of population implies that the same number of customers can be served and
the same volume of water dispensed with a smaller network of pipelines than what
is required in a rural area. Moreover, when many customers are located in the same
building (as is the case in an urban community), the labor hours needed for meter
reading will be lower than in a rural area where customers are located at distant
points. In this case, density of population is an environmental variable.
Where contextual variables are considered to be significant determinants of
performance, an appropriate way to conceptualize the production technology is to
define the production possibility set conditional on a specific vector of contextual
variables z0 . With explicit inclusion of the contextual variables in the transformation
function, the production possibility set becomes $T = \{(x, y, z) : F(x, y, z) \le 0\}$, and
the conditional production possibility set is

$$ T(z^0) = \{ (x, y) : F(x, y, z^0) \le 0 \}. \tag{116} $$

Efficiency is still evaluated at the inputs used and outputs produced. But the
appropriate benchmark bundle depends on the applicable vector of contextual
variables. The disposability and convexity assumptions about the technology apply
to the input-output set but are not necessarily extended to the contextual variables
z0 . There are mainly two different ways to formulate the DEA problem depending
on how the revised transformation function is conceptualized.

All-Inclusive DEA

In one approach, following Banker and Morey [7], one imposes additional
constraints for the non-discretionary variables. Assume that the vector of
non-discretionary inputs is $z^0 = (z_1^0, z_2^0)$ where $z_1$ is a favorable and $z_2$ an unfavorable
factor. Then the input-oriented VRS technical efficiency of a firm producing output

y0 from input x0 while facing the external conditions z0 is obtained as

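$$
\begin{aligned}
\theta^* = \min\ & \theta \\
\text{s.t. } & \sum_j \lambda_j y^j \ge y^0; \\
& \sum_j \lambda_j x^j \le \theta x^0; \\
& \sum_j \lambda_j z_1^j \le z_1^0; \\
& \sum_j \lambda_j z_2^j \ge z_2^0; \\
& \sum_j \lambda_j = 1;\quad \lambda_j \ge 0\ (j = 1, 2, \ldots, N).
\end{aligned}
\tag{117}
$$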

Several points are to be noted. In the problem above, the scaling factor θ
applies to the discretionary inputs only. Also, the actual value of the favorable non-
discretionary input is treated as an upper bound, and that of the unfavorable one is
set as a lower bound for the corresponding values in the benchmark.
This all-inclusive DEA model is intuitively appealing because it does not allow
the hypothetical benchmark unit to have a lower level of the unfavorable factor
or a higher level of the favorable factor than what is faced by the unit under
evaluation. However, extending the convexity and disposability assumptions to the
non-discretionary inputs is open to question. Also, all of the non-discretionary
factors have to be unequivocally identified beforehand as favorable or unfavorable.
Lastly, the DEA LP scores have to be recomputed whenever any non-discretionary
factor is added or removed.
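
A minimal sketch of the all-inclusive problem (117) is given below, assuming SciPy's `linprog`. The function name `banker_morey_theta` and the three-unit data set (one discretionary input, one output, one favorable and one unfavorable factor) are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def banker_morey_theta(X, Y, Z_fav, Z_unf, k):
    """Input-oriented VRS efficiency of unit k with non-discretionary factors, as in (117)."""
    N = X.shape[0]
    c = np.zeros(N + 1)
    c[0] = 1.0                                        # variables: theta, then lambda; minimize theta

    def rows(theta_coef, lam_coef):
        # One constraint row per column of lam_coef: [theta coefficient | lambda coefficients]
        return np.hstack([np.reshape(theta_coef, (-1, 1)), lam_coef.T])

    A_ub = np.vstack([
        rows(np.zeros(Y.shape[1]), -Y),               # sum_j lambda_j y^j >= y0
        rows(-X[k], X),                               # sum_j lambda_j x^j <= theta * x0
        rows(np.zeros(Z_fav.shape[1]), Z_fav),        # sum_j lambda_j z1^j <= z1_0
        rows(np.zeros(Z_unf.shape[1]), -Z_unf),       # sum_j lambda_j z2^j >= z2_0
    ])
    b_ub = np.concatenate([-Y[k], np.zeros(X.shape[1]), Z_fav[k], -Z_unf[k]])
    A_eq = np.concatenate([[0.0], np.ones(N)])[None, :]   # sum_j lambda_j = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=(0, None), method="highs")
    return res.fun


# Hypothetical data: all units produce y = 2; the first unit faces a poor environment.
X = np.array([[2.0], [1.0], [3.0]])
Y = np.array([[2.0], [2.0], [2.0]])
Z_fav = np.array([[1.0], [3.0], [3.0]])               # e.g. rainfall
Z_unf = np.array([[5.0], [5.0], [5.0]])               # identical for all units here

theta = [banker_morey_theta(X, Y, Z_fav, Z_unf, k) for k in range(3)]
```

In this example the first unit uses twice the input of the second, but because no benchmark with equally low rainfall exists it is rated fully efficient. The third unit enjoys the same favorable conditions as the second and is rated at one third.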

A Second Stage Regression

The other approach introduced by Ray [56] includes only the discretionary inputs
and outputs in the DEA LP problem in the first stage and estimates a regression
of the DEA efficiency score on the non-discretionary variables. The conceptual
foundation of the second stage regression is provided in Ray [55, 56]. Consider
the single output case and assume that the production function is multiplicatively
separable as

$$ y^* = f(x, z) = g(x) \cdot h(z). \tag{118} $$

The actual output is


$$ y = y^* e^{-\eta};\quad \eta \ge 0 \tag{119} $$

where $y/y^* = e^{-\eta}$ is the technical efficiency of the firm.

In this case, any change in the contextual variables causes a neutral shift in the
production frontier that does not alter the marginal rates of substitution between
inputs or marginal rates of transformation between outputs. Assuming CRS and that
h(z) is naturally bounded between 0 and 1 for all values of z, Ray [55] has shown
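that the DEA efficiency score $\tau_y = 1/\varphi^*$ is an estimate of $h(z)e^{-\eta}$. One can specify

$$ \ln h(z^j) = \alpha_0 + \sum_p \alpha_p z_p^j \tag{120} $$

and estimate an OLS regression

$$ \ln \tau_y^j = \alpha_0^* + \sum_p \alpha_p z_p^j + \varepsilon_j;\quad \text{where } \varepsilon_j = E(\eta_j) - \eta_j;\ \alpha_0^* = \alpha_0 - E(\eta_j). \tag{121} $$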

Ray [55] proposed a corrected OLS (COLS) procedure to adjust the intercept
by the largest OLS residual and to use the adjusted residuals to measure technical
efficiency.
Apart from its simplicity, the second stage regression is quite appealing because
the regression coefficients measure the marginal effects of changes in different
non-discretionary variables on the DEA efficiency score. However, the statistical
properties of the second stage regression estimators have been questioned in the
literature. It is important to remember that the second stage regression will be invalid
unless the non-discretionary variables are all uncorrelated with the inputs.
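
The second stage can be sketched with ordinary least squares in NumPy. The code below is one reading of the COLS step described above, not a reproduction of the original procedure: the intercept is raised by the largest OLS residual and the shifted residuals are exponentiated. The function name `second_stage_cols` and the data are hypothetical, and the first-stage DEA scores are taken as given.

```python
import numpy as np


def second_stage_cols(tau, Z):
    """Regress ln(tau) on Z by OLS and apply a COLS intercept shift.

    Returns (coefficients with the corrected intercept first, efficiency estimates).
    """
    log_tau = np.log(tau)
    design = np.column_stack([np.ones(len(tau)), Z])
    coef, *_ = np.linalg.lstsq(design, log_tau, rcond=None)
    resid = log_tau - design @ coef
    shift = resid.max()
    coef_cols = coef.copy()
    coef_cols[0] += shift                     # raise the intercept by the largest residual
    efficiency = np.exp(resid - shift)        # adjusted residuals are non-positive
    return coef_cols, efficiency


# Hypothetical first-stage scores generated from ln(tau) = -0.1 - 0.2 z + e,
# with e chosen to be orthogonal to the regressors so that OLS recovers it exactly.
z = np.array([0.0, 1.0, 2.0, 3.0])
e = np.array([0.05, -0.05, -0.05, 0.05])
tau = np.exp(-0.1 - 0.2 * z + e)

coef_cols, efficiency = second_stage_cols(tau, z)
```

By construction the unit with the largest residual receives an efficiency of one, and every other unit is measured relative to it.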

A Three-Stage Analysis

Ruggiero [70] developed a three-stage procedure that uses the second stage
regression and then reverts to the all-inclusive DEA of Banker and Morey with a
modification. In this procedure, the predicted value of the dependent variable from
the second stage regression is treated as a composite measure of “environmental
harshness” (Ej ) for each decision-making unit in the sample. In the subsequent third
stage of the analysis, this composite variable is used for data screening so that the
benchmark frontier for any unit with input-output (x0 , y0 ) is constructed from the
(xj , yj ) data for only those units with Ej ≥ E0 . The Ruggiero [70] three-stage VRS
input-oriented DEA problem is

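$$
\begin{aligned}
\theta^* = \min\ & \theta \\
\text{s.t. } & \sum_j \lambda_j y^j \ge y^0; \\
& \sum_j \lambda_j x^j \le \theta x^0; \\
& \lambda_j = 0 \text{ if } E_j < E_0; \\
& \sum_j \lambda_j = 1;\quad \lambda_j \ge 0\ (j = 1, 2, \ldots, N).
\end{aligned}
\tag{122}
$$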

As already noted, the convexity assumption about the input-output bundles in the
all-inclusive problem (117) may not be applicable for some contextual variables.
This is particularly true for categorical variables. Suppose that in the water utility
example, the service areas are classified as rural, urban, and metropolitan but
the exact measure of population density is not available for each observation. In
this case, all one knows is that water delivery is most difficult in the rural areas
and the least difficult in the metropolitan areas. Creating convex combinations of
a categorical variable representing population density is not meaningful in this
context. One may handle this by treating the conditional production possibility
sets as nested in the sense that all input-output bundles that are feasible in a less
favorable condition are also feasible in a more favorable condition but not the other
way around. In this case, one includes only the rural observations to construct the
frontier for evaluating utilities serving rural areas but all observations to construct
the frontier that is to be used for evaluating the utilities serving the most densely
populated areas. It should be noted though that for multiple contextual variables
that are categorical, such cross-classification may severely restrict the number of
observations available for constructing the frontier for the less favored groups. The
LP problem in (122) by focusing on an aggregated measure of overall disadvantage
helps significantly to mitigate the problem of data attenuation.
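
The screening logic of problem (122) can be illustrated with a minimal sketch. It is not Ruggiero's implementation: it assumes a single input and a single output, so that the linear program can be solved by enumerating single peers and pairs of peers instead of calling an LP solver. The function name `screened_vrs_theta` and the four-utility data set are hypothetical.

```python
from itertools import combinations_with_replacement

def screened_vrs_theta(x, y, E, k, tol=1e-12):
    """Input-oriented VRS efficiency of unit k for ONE input and ONE output.

    Only units j with E[j] >= E[k] may enter the benchmark, which mirrors
    the screening constraint lambda_j = 0 if E_j < E_0 in problem (122).
    """
    peers = [j for j in range(len(x)) if E[j] >= E[k]]
    best_input = float("inf")
    # With one output constraint plus the convexity constraint, an optimal
    # basic solution puts positive weight on at most two peers.
    for i, j in combinations_with_replacement(peers, 2):
        weights = [0.0, 1.0]                      # all weight on j, or all on i
        if y[i] != y[j]:
            t = (y[k] - y[j]) / (y[i] - y[j])     # weight at which output binds
            if 0.0 <= t <= 1.0:
                weights.append(t)
        for t in weights:
            out = t * y[i] + (1.0 - t) * y[j]
            inp = t * x[i] + (1.0 - t) * x[j]
            if out >= y[k] - tol:                 # benchmark must produce y_0
                best_input = min(best_input, inp)
    return best_input / x[k]

# Hypothetical data: input, output, and harshness index for four utilities.
x = [2.0, 3.0, 4.0, 5.0]
y = [2.0, 2.0, 2.0, 4.0]
E = [1.0, 3.0, 3.0, 2.0]

theta_screened = screened_vrs_theta(x, y, E, k=2)             # peers: units 1 and 2
theta_unscreened = screened_vrs_theta(x, y, [0.0] * 4, k=2)   # every unit is a peer
print(theta_screened, theta_unscreened)
```

In this made-up sample, unit 2 looks less inefficient once the unit operating under the least harsh conditions is screened out of its reference set, which is the effect the third stage is designed to capture.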

Conclusion

This chapter has covered only some of the major topics in DEA. Given the limited
scope of a book chapter (as compared to a full-length book), many other important
topics had to be excluded. Among them are such areas of research as productivity
growth and technical change,³¹ network DEA, centralized resource allocation,
non-convexity (including free disposal hull analysis),³² and others. The criterion
for selecting topics was how closely they were related to core neoclassical
production theory.³³

31 Chapters 20, “Conceptualization and Measurement of Productivity Growth and Technical
Change: A Nonparametric Approach” by Ray, and 21, “Modeling Technical Change: Theory
and Practice” by Kumbhakar, in this volume of the Handbook cover non-parametric DEA and
parametric SFA approaches to measuring productivity growth and technical change.
32 For a detailed discussion of non-convexity in general, refer to Chap. 18, “Nonconvexity
in Production and Cost Functions: An Exploratory and Selective Review” by Briec, Kerstens, and
Van de Woestyne in this volume of the Handbook.
33 For a detailed discussion of DEA from an OR/MS perspective, the reader should refer to Zhu [77]
and Cooper, Seiford, and Tone [24]. Färe, Grosskopf, and Lovell [35] and Ray [56] explain the
economic theory behind DEA.

References
1. Afriat S (1972) Efficiency estimation of production functions. Int Econ Rev 13(3):568–598
2. Aigner DJ, Lovell CAK, Schmidt P (1977) Formulation and estimation of stochastic frontier
production function models. J Econ 6(1):21–37
3. Aparicio J, Pastor JT, Ray SC (2013) An overall measure of technical inefficiency at the firm
and at the industry level: the ‘lost profit on outlay’. Eur J Oper Res 226(1):154–162
4. Ayres RU, Kneese AV (1969) Production, consumption, and externalities. Am Econ Rev
59:282–297
5. Banker RD (1984) Estimating the most productive scale size using data envelopment analysis.
Eur J Oper Res 17(1):35–44
6. Banker RD, Maindiratta A (1988) Nonparametric analysis of technical and allocative
efficiencies in production. Econometrica 56(5):1315–1332
7. Banker RD, Morey RC (1986) Efficiency analysis for exogenously fixed inputs and outputs.
Oper Res 34(4):513–521
8. Banker RD, Thrall RM (1992) Estimating most productive scale size using data envelopment
analysis. Eur J Oper Res 62:74–84
9. Banker RD, Charnes A, Cooper WW (1984) Some models for estimating technical and scale
inefficiencies in data envelopment analysis. Manage Sci 30(9):1078–1092
10. Banker RD, Chang H, Natarajan R (2007) Estimating DEA technical and allocative inefficiency
using aggregate cost or revenue data. J Prod Anal 27:115–121
11. Baumol WJ, Panzar JC, Willig RD (1982) Contestable markets and the theory of industry
structure. Harcourt Brace Jovanovich, New York
12. Baumol WJ, Oates WE (1988) The theory of environmental policy, 2nd edn. Cambridge
University Press, Cambridge
13. Cassell JM (1937) Excess capacity and monopolistic competition. Q J Econ 51(3):426–443
14. Chambers RG, Melkonyan T (2012) Production technologies, material balance, and the
income-environmental quality trade-off. University of Exeter working paper
15. Chambers RG, Chung Y, Färe R (1996) Benefit and distance functions. J Econ Theory 70:
407–419
16. Chambers RG, Chung Y, Färe R (1998) Profit, directional distance functions, and Nerlovian
efficiency. J Optim Theory Appl 98:351–364
17. Charnes A, Cooper WW (1968) Programming with linear fractional functionals. Nav Res
Logist Q 15:517–522
18. Charnes AC, Cooper WW, Mellon B (1952) Blending aviation gasolines – a study in
programming interdependent activities in an integrated oil company. Econometrica 20(2):
135–159
19. Charnes A, Cooper WW, Rhodes E (1978) Measuring the efficiency of decision making units.
Eur J Oper Res 2(6):429–444
20. Charnes A, Cooper WW, Rhodes E (1979) Short communication: measuring the efficiency of
decision making units. Eur J Oper Res 3(4):339
21. Coelli T, Lauwers L, Van Huylenbroeck GV (2007) Environmental efficiency measurement
and the materials balance condition. J Prod Anal 28:3–12
22. Cooper WW, Thompson RG, Thrall RM (1996) Introduction: extensions and new
developments in DEA. Ann Oper Res 66:3–45
23. Cooper WW, Park SK, Pastor JT (1999) RAM: a range adjusted measure of inefficiency for
use with additive models, and relations to other models and measures in DEA. J Prod Anal
11:5–42
24. Cooper WW, Seiford L, Tone K (2002) Data envelopment analysis: a comprehensive text with
uses, example applications, references and DEA-solver software. Kluwer, Norwell
25. Cropper ML, Oates WE (1992) Environmental economics: a survey. J Econ Lit 30:675–740
26. Dakpo KH, Jeanneauxe P, Latruffe L (2016) Modeling pollution generating technologies in
performance benchmarking: recent developments, limits, and future prospects in the non-
parametric framework. Eur J Oper Res 250:347–359
27. Dantzig GB (1951) Maximization of a linear function of variables subject to linear inequalities.
In: Koopmans TC (ed) Activity analysis of production and allocation. Wiley, New York, pp
339–347
28. Debreu G (1951) The coefficient of resource utilization. Econometrica 19(3):273–292
29. Färe R, Grosskopf S (2003) Nonparametric productivity analysis with undesirable outputs:
comment. Am J Agric Econ 85:1070–1074
30. Färe R, Lovell CAK (1978) Measuring the technical efficiency of production. J Econ Theory
19(1):150–162
31. Färe R, Grosskopf S, Lovell CAK (1985) The measurement of efficiency of production.
Kluwer-Nijhoff, Boston
32. Färe R, Grosskopf S, Kokkelenberg EC (1989) Measuring plant capacity, utilization and
technical change: a nonparametric approach. Int Econ Rev 30(3):655–666
33. Färe R, Grosskopf S, Lovell CAK, Pasurka C (1989) Multilateral productivity comparisons
when some outputs are undesirable: a non-parametric approach. Rev Econ Stat 71(1):90–98
34. Färe R, Grosskopf S, Lovell CAK, Yaisawarng S (1993) Derivation of shadow prices for
undesirable outputs: a distance function approach. Rev Econ Stat 75:374–380
35. Färe R, Grosskopf S, Lovell CAK (1994) Production frontiers. Cambridge University Press,
Cambridge
36. Färe R, Grosskopf S, Noh DW, Weber W (2005) Characteristics of a polluting technology:
theory and practice. J Econ 126:469–492
37. Farrell MJ (1957) The measurement of technical efficiency. J R Stat Soc Ser A Gen 120(Part
3):253–281
38. Førsund F (2009) Good modelling of bad outputs: pollution and multiple-output production.
Int Rev Environ Resour Econ 3(1):1–38
39. Førsund F (2018) Multi-equation modeling of desirable and undesirable outputs satisfying the
material balance. Empir Econ 54(1):67–99
40. Frisch R (1965) Theory of production. Rand McNally and Company, Chicago
41. Hampf B (2014) Separating environmental efficiency into production and abatement efficiency:
a nonparametric model with application to US power plants. J Prod Anal 41:457–473
42. Hanoch G, Rothschild M (1972) Testing the assumptions of production theory: a nonparametric
approach. J Polit Econ 80(2):256–275
43. Johansen L (1968) Production functions and the concept of capacity. Reprinted in Førsund FR
(ed) Collected works of Leif Johansen, vol 1. North Holland, Amsterdam
44. Koopmans TC (1951) Analysis of production as an efficient combination of activities. In:
Koopmans TC (ed) Activity analysis of production and allocation. Wiley, New York, pp 33–97
45. Koopmans TC (1957) Three essays on the state of economic science. McGraw Hill, New York
46. Kumbhakar S, Lovell CAK (2000) Stochastic frontier analysis. Cambridge University Press,
New York
47. Leleu H, Briec W (2009) A DEA estimation of a lower bound for firms’ allocative efficiency
without information on price data. Int J Prod Econ 121:203–211
48. Lozano SC (2015) A joint-inputs network DEA approach to production and pollution-
generating technologies. Expert Syst Appl 42:7960–7968
49. Luenberger DG (1992) Benefit functions and duality. J Math Econ 21:115–145
50. Murty S, Russell RR (2016) Modeling emission-generating technologies: reconciliation of
axiomatic and by-production approaches. Empir Econ 54(1):7–30
51. Murty S, Russell R, Levkoff SB (2012) On modeling pollution-generating technologies. J
Environ Econ Manag 64:117–135
52. Pastor JT, Louis JL, Sirvent I (1999) An enhanced DEA Russell-graph efficiency measure. Eur
J Oper Res 115:596–607
53. Pethig R (2006) Non-linear production, abatement, pollution and materials balance
reconsidered. J Environ Econ Manag 51:185–204
54. Portela MCAS, Thanassoulis E (2005) Profitability of a sample of Portuguese bank branches
and its decomposition into technical and allocative components. Eur J Oper Res 162(3):
850–866
55. Ray SC (1988) Data envelopment analysis, non-discretionary inputs and efficiency: an
alternative interpretation. Socio Econ Plan Sci 22(4):167–176
56. Ray SC (1991) Resource-use efficiency in public schools: a study of Connecticut data. Manag
Sci 37(12):1620–1628
57. Ray SC (2004) Data envelopment analysis: theory and techniques for economics and operations
research. Cambridge University Press, New York
58. Ray SC (2007) Shadow profit maximization and a measure of overall inefficiency. J Prod Anal
27:231–236
59. Ray SC (2009) Are Indian firms too small? A nonparametric analysis of cost efficiency and the
optimal organization of industry in Indian manufacturing. Indian Econ Rev XXXXVI(1):49–67
60. Ray SC (2010) A one-step procedure for returns to scale classification of decision making units
in data envelopment analysis. University of Connecticut Economics working paper 2010-07
61. Ray SC (2015) Nonparametric measures of scale economies and capacity utilization: an
application to U.S. manufacturing. Eur J Oper Res 245:602–611
62. Ray SC, Ghose A (2014) Production efficiency in Indian agriculture: an assessment of the post
green revolution years. Omega 44:58–69
63. Ray SC, Jeon Y (2009) Reputation and efficiency: a non-parametric assessment of America’s
top-rated MBA programs. Eur J Oper Res 189(2008):245–268
64. Ray SC, Mukherjee K (2016) Data envelopment analysis with aggregated inputs and a test of
allocative efficiency when input prices vary across firms. Data Envel Anal J 2(2):141–161
65. Ray SC, Chen L, Mukherjee K (2008) Input price variation across locations and a generalized
measure of cost efficiency. Int J Prod Econ 116:208–218
66. Ray SC, Mukherjee K, Venkatesh A (2018) Nonparametric measures of efficiency in the
presence of undesirable outputs: a by-production approach with weak disposability. Empir
Econ 54(1):31–65
67. Ray SC, Walden J, Chen L (2018) Economic measures of capacity utilization: a nonparametric
cost function analysis. University of Connecticut, Department of Economics working paper
2018-02
68. Rodseth KL (2015) Axioms of a polluting technology: a materials balance approach. Environ
Res Econ 67(1):1–22. Online October 2015
69. Rodseth KL (2016) Environmental efficiency measurement and the materials balance condition
reconsidered. Eur J Oper Res 250:342–346
70. Ruggiero J (1998) Non-discretionary inputs in data envelopment analysis. Eur J Oper Res
111:461–469
71. Seiford L, Zhu J (1999) An investigation of returns to scale in data envelopment analysis.
Omega Int J Manag Sci 27:1–11
72. Shephard RW (1953) Cost and production functions. Princeton University Press, Princeton
73. Shephard RW (1970) Theory of cost and production functions. Princeton University Press,
Princeton
74. Tone K (2001) A slacks-based measure of efficiency in data envelopment analysis. Eur J Oper
Res 130:498–509
75. Tone K (2002) A strange case of the cost and allocative efficiencies in DEA. J Oper Res Soc
53:1225–1231
76. Varian HR (1984) The nonparametric approach to production analysis. Econometrica
52(3):579–597
77. Zhu J (2003) Quantitative models for performance evaluation and benchmarking: data
envelopment analysis with spreadsheets and DEA excel solver. Kluwer Academic, Boston

11 Activity Analysis in Production Economics

Thijs ten Raa

Contents
Introduction
The Origin of Activity Analysis
Substitution
Activity Foundation of the Production Function
Variants of Houthakker’s Theorem
Activity Foundation of Input-Output Analysis
Efficiency
Conclusion
Cross-References
References

Abstract

This chapter opens with the historical root of activity analysis. The framework
of activity analysis admits multiple techniques to produce a commodity. Sub-
stitution theorems investigate when the market mechanism singles out a best
technique for each product and if the best techniques vary with the data of
an economy, such as resource availabilities. Houthakker’s Theorem initiated
a literature on the relationship between the distribution of activities and the
form of the aggregate production function. Activity analysis is connected to
modern input-output analysis, where the numbers of products and industries
differ, which facilitates the measurement of the efficiency of the production units
of an economy and of the economy.

T. ten Raa
Utrecht School of Economics, Utrecht University, Utrecht, The Netherlands
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_25

Keywords

Activity analysis · Efficiency · Houthakker’s theorem · Input-output analysis · Substitution theorem

Introduction

This chapter shows how activity analysis is a basis of production functions. One of
the advantages of activity analysis is that it offers a framework for the measurement
of efficiency. Given the ensemble of all activities, one may determine the frontier
and then measure how close activities are to the frontier. There are two problems:
First, in which direction should one go to measure the gap between an activity and a
best practice frontier element? Second, how should one aggregate the efficiencies
of the individual activities? Both issues are solved in the recent literature, and
these results neatly complement neoclassical production analysis with its explicit
or implicit assumption of cost minimization.
In the section “The Origin of Activity Analysis,” I discuss the historical root
of activity analysis. The framework of activity analysis admits multiple techniques
to produce a commodity. An important economic question is: Does the market
mechanism single out a best technique for each product? If so, does the best
technique vary with the data of an economy, such as resource availabilities?
Substitution theorems are discussed in the section “Substitution.” Section “Activity
Foundation of the Production Function” discusses in detail the relationship between
the distribution of activities and the form of the production function. The classic
result is Houthakker’s Theorem. Section “Variants of Houthakker’s Theorem”
discusses the further literature on this relationship. Section “Activity Foundation of
Input-Output Analysis” connects activity analysis to modern input-output analysis,
where the numbers of products and industries differ. Section “Efficiency” analyzes
the measurement of the efficiency of the production units of an economy and of the
economy.

The Origin of Activity Analysis

Activity analysis originates from Tjalling Koopmans’ [10] Activity Analysis of
Production and Allocation, a conference volume. The conference was on linear
programming, and indeed, activity analysis is still considered a practical tool for
the analysis of production and allocation. The conference volume is an icon in
the history of economic thought. In 1975, his research on this topic earned Dutch
American Koopmans, jointly with Russian Leonid Kantorovich, the Nobel Prize in
economics. Incidentally, Koopmans was named by the then young William Baumol,
who sadly died during the writing of this chapter. Baumol’s [1] Economic Theory
and Operations Analysis popularized the use of operations research in economics
and was a leading microeconomic textbook until well into the 1980s.
Koopmans’ Activity Analysis of Production and Allocation consists of four parts:
theory (ten papers), applications (six papers), convex analysis (four papers), and
algorithms (five papers). The third and fourth parts are now standard fare in applied
mathematics and software, greatly facilitated by recent computer power. The second
part is a collection of assorted papers. The first part is the most important one, at
least for our objective, an overview of activity analysis in the context of production
economics.
The first part of Activity Analysis of Production and Allocation begins with
Dantzig’s mathematical programming and Koopmans’ “Analysis of Production
as an Efficient Combination of Activities,” continues with Von Neumann and
Leontief’s dynamic models, and concludes with five (!) papers on the static Leontief
model. Indeed, there is a close connection between input-output analysis and activity
analysis, both at the level of concept and of application. To begin with the latter,
note that activity analysis blossomed shortly after World War II. Planning was an
important policy tool, not only in Russia, but also in the United States. Warfare
prompted a change in demand (toward aircraft and other equipment), and this had
to be reconciled with resource scarcity.

Substitution

At the conceptual level, an activity is a pair of an input vector and an output vector.
Leontief, who won the Nobel Prize 2 years earlier, was ahead of Koopmans and
activity analysis, but the latter, not surprisingly, offers a more general framework
for economic analysis. By the same token, input-output analysis is a special, albeit
important, case of activity analysis. This is particularly true of static input-output
analysis. Here an activity is a pair of an input vector and a pure output vector, where
“pure” means that only one component is nonzero. Joint production is ruled out.
Dynamic input-output analysis is more general. It features two types of inputs,
namely, absorbed inputs and capital, and the output is basically still pure, but
accompanied by the same capital. There may be depreciation, but that is modeled
by including it in the absorbed inputs. The main simplification of input-output
analysis is the implicit assumption that for every output there is only one activity
that produces it. Each product has a unique input structure. Economists, particularly
students of Leontief, starting with Paul Samuelson, were intrigued by this simplify-
ing, implicit assumption and theorized about it. Samuelson, Koopmans, and Arrow
each contribute a paper on this subject to Koopmans [10].
The result of their analysis is the substitution theorem. If activities are pure,
but different ones coexist for the production of commodities, and if there is only
one nonproduced input, called “labor,” then there exists a collection of activities,
one for each product, that minimizes the labor input of the net output vector
produced, irrespective of the proportions of the net output vector. Except for labor,
commodities feature in output vectors and input vectors, and the difference between
the two is net output, available for household consumption. Thus, the substitution
theorem states that there is a dominant technology to produce net output, whatever
its commodity composition. An activity is also called a “technique,” and a collection
of available activities or techniques is also called a “technology.” For an obvious
reason, the substitution theorem – industries can substitute techniques but do not do
so when minimizing the resource use – is also called the nonsubstitution theorem.
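
The theorem can be checked numerically on a toy economy. The sketch below is only an illustration under stated assumptions: two produced goods, two alternative pure techniques per good, and labor as the single nonproduced input. All coefficients and the names `TECHNIQUES` and `labor_values` are hypothetical. For each choice of one technique per good it computes the direct plus indirect labor needed per unit of net output, and then checks that one choice is at least as good as every other for both goods, hence for any composition of net output.

```python
from itertools import product

# Hypothetical techniques: per unit of output, inputs of goods 1 and 2, and labor.
TECHNIQUES = {
    1: {"1a": (0.2, 0.3, 1.0), "1b": (0.1, 0.5, 0.8)},   # ways to make good 1
    2: {"2a": (0.4, 0.1, 0.5), "2b": (0.2, 0.2, 0.9)},   # ways to make good 2
}

def labor_values(t1, t2):
    """Direct plus indirect labor per unit of net output of each good.

    Solves v = v A + l for the 2x2 input matrix A selected by (t1, t2).
    """
    a11, a21, l1 = t1          # inputs into good 1
    a12, a22, l2 = t2          # inputs into good 2
    det = (1 - a11) * (1 - a22) - a12 * a21
    v1 = (l1 * (1 - a22) + l2 * a21) / det
    v2 = (l2 * (1 - a11) + l1 * a12) / det
    return v1, v2

values = {
    (n1, n2): labor_values(TECHNIQUES[1][n1], TECHNIQUES[2][n2])
    for n1, n2 in product(TECHNIQUES[1], TECHNIQUES[2])
}

# Pick the combination with the smallest total, then verify that it is
# weakly better for EACH good, so it minimizes labor for every net output mix.
best = min(values, key=lambda combo: sum(values[combo]))
dominates = all(values[best][g] <= v[g] + 1e-12
                for v in values.values() for g in (0, 1))
print(best, dominates)
```

Because the labor needed for a net output vector is the inner product of these labor values with that vector, componentwise dominance is exactly what makes the chosen techniques independent of demand.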
Activity analysis sticks to the input-output assumption of constant returns to
scale. Consequently, the minimization of the resource input for a given level of net
output is equivalent to the maximization of the level of net output given a resource
input. This observation is obvious in the context of the substitution theorem (a single
nonproduced input and pure output vectors) and remains valid when net outputs are
not pure but feature multiple nonzero components. For example, if a given net output
vector is producible with only 4/5 (80%) of the observed amount of resource,
then, under constant returns to scale, the observed amount of resource
could produce 5/4 (125%) of the given net output vector. We say the efficiency
of the economy is 80%. Equivalently, potential output is 125% of actual output.
In determining these performance measures, the collection of available activities is
considered to be given, but the intensities with which each activity is run are to be
determined by the mathematical program that minimizes the resource requirement
or maximizes the level of net output. There are two constraints in either program.
Commodity balances require that the activity intensities are large enough so that the
supply of net output is at least equal to household demand, but small enough so that
production demand for the resource does not exceed the available stock.
The constraints pick up Lagrange multipliers: a factor reward for the resource and
also shadow prices for the produced commodities. The shadow prices, including the
one for the factor input, fulfill the so-called dual constraints. The dual constraints
are such that all activities have nonpositive profit and the activities running with
positive intensity break even. Therefore, the shadow prices are competitive prices.
Competitive prices would signal to entrepreneurs which activities to undertake. The
competitive prices are also a useful analytical tool. An advantage is that in the
general activity model, there are more activities than commodities, thus allowing
substitution. Therefore, the dual variables (the prices) have lower dimension than the
primal variables (the activity intensities). Johansen [8] approached the substitution
theorem using the competitive prices, and ten Raa [21] filled the gaps.
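
The output-maximizing program and its dual can be written out as follows. This is a sketch in notation that is assumed here and not taken from the chapter: activity $j$ run at unit intensity yields the net output vector $c_j$ (outputs minus produced inputs) and uses $l_j$ of the resource, $s_j$ is its intensity, $y$ is household demand, $N$ is the resource stock, $\gamma$ is the expansion factor of net output, and $(p, w)$ are the shadow prices of the commodities and the resource.

```latex
\begin{align*}
\text{Primal:}\quad & \max_{s \ge 0,\ \gamma}\ \gamma
  \quad \text{s.t.}\quad \sum_j c_j s_j \ge \gamma\, y, \qquad \sum_j l_j s_j \le N, \\
\text{Dual:}\quad & \min_{p \ge 0,\ w \ge 0}\ w N
  \quad \text{s.t.}\quad p \cdot c_j - w\, l_j \le 0 \ \text{ for all } j, \qquad p \cdot y = 1, \\
\text{Complementary slackness:}\quad & s_j > 0 \ \Longrightarrow\ p \cdot c_j = w\, l_j .
\end{align*}
```

In this formulation the dual constraints state that no activity makes a positive profit at prices $(p, w)$, and complementary slackness states that active activities break even. An optimal value $\gamma^* = 1.25$ corresponds to the example in the text: potential output is 125% of actual output and efficiency is $1/\gamma^* = 80\%$.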
The substitution theorem has the striking result that prices are independent
of demand, hence determined by supply, more precisely, by technology. This is
classical economics. It rests on the classical assumptions of a single nonproduced
input and no joint production. The classical assumptions have a built-in tension.
The assumption of a single nonproduced input, labor, suggests that capital is a
produced commodity. We have no difficulty with this view. Capital is buildings,
machinery, equipment, and infrastructure, and these are produced commodities
indeed. However, the essence of productive capital is that at least some of it (after
correcting for depreciation) remains when an activity has been completed. But then,
the activity has at least two outputs: the commodity produced and the remaining
capital. In other words, there is joint production. There are two approaches to deal
with this issue. The first, going back to Von Neumann [25], is to accommodate
multiple positive output components in the activity vectors. However, Von Neumann
trades off this generality on the supply side against more specificity on the demand
side, assuming that labor services are the output of the household activity with fixed
consumption coefficients, which essentially is a classical, Marxian assumption.
The second is to accommodate joint production and to analyze to what extent
substitution emerges [15].
A more blatant violation of the substitution theory assumptions occurs when
there are multiple nonproduced inputs, for example, labor and land, or, as in many
neoclassical economic models, labor and capital, where at any point of time the latter
is considered to be given by the past. Then the choice of techniques will depend on
the composition of the resources. A relatively more labor-endowed economy will
employ more labor-intensive techniques when maximizing the level of output. The
argument is simple, particularly when the relative factor intensities range from very
low to very high values compared to the endowment ratio. Then both endowments
can be fully employed, and the average factor intensity in production will be equal to
the endowment ratio. When East Germany was absorbed, labor became less scarce
and capital more so. The shadow price of labor became smaller, supporting more
efficient labor-intensive production. This reasoning pertains to the potential output
of the economy. The actual economy may have followed a different, less efficient
path.

Activity Foundation of the Production Function

The economy has a supply side, populated with producers, and a demand side,
populated with consumers. Center pieces of the supply side are production func-
tions. Production functions have different functional forms. Implicit are alternative
degrees of substitution and scale economies [5]. In activity analysis, however, the
situation is more basic. An activity is like a recipe. There are input requirements per
unit of output. Output may be multidimensional as well, for example, juice may be
a byproduct when cooking. Differences in output are accompanied by differences
in inputs. And even when there are no differences in outputs, such as in an industry
producing a homogeneous product, there may be differences in inputs. Moreover,
alternative activities may produce the same output. Local conditions vary, alternative
production techniques compete, and some production units are simply less efficient
than others, a phenomenon which shows in a different (higher) input structure.
Activities replace each other. For example, when a new supply of some resource
is discovered, activities which make relatively intensive use of this resource will
expand. The economy will use a different mix of inputs. This, indeed, may be
described by a production function, but it is an interesting question how differences
in activities translate into alternative functional forms of production.
In activity analysis, alternative techniques to produce commodities coexist,
and the efficient ones are determined using a mathematical program in which
intensities, one for each activity, that is, technique, are the variables. How can we
reconcile this framework with a neoclassical production function, such as the Cobb-Douglas
function, $Y = AK^{\alpha} L^{\beta}$? Here $K$ and $L$ are inputs, $Y$ is output, and $A$, $\alpha$,
and $\beta$ are parameters. There are decreasing, constant, or increasing returns to scale if
$\alpha + \beta$ is less than, equal to, or greater than 1, respectively.
A simplistic link would be as follows, for the case of constant returns to scale. An
activity would be a pair of variable inputs and an output, which can be normalized,
$(k, l; 1)$, with $Ak^{\alpha} l^{\beta} = 1$. The activities can be parameterized by one input, for
example, $k$. Then $l = (Ak^{\alpha})^{-1/\beta}$, and therefore the technology set of activities
is $\{(k, (Ak^{\alpha})^{-1/\beta}; 1) \mid k > 0\}$. Each activity can be run with intensity $s_k$. Total
output will be $\int s_k \, dk$, where the integral is taken over the positive numbers. The
constraints are $\int s_k k \, dk \le K$ and $\int s_k l \, dk \le L$, with $l = (Ak^{\alpha})^{-1/\beta}$ as above,
where $K$ and $L$ are the factor endowments. Because of the convexity of the technology set
and the assumption of constant returns to scale, $\alpha + \beta = 1$, running different activities
with positive intensity can be improved upon, in terms of output, by replacement of the
activities by their intensity-weighted average. Hence, output is maximized by running the
single activity with the right factor intensity that matches the endowments ratio,
$k/l = k(Ak^{\alpha})^{1/\beta} = K/L$. Solving, using $\alpha + \beta = 1$, $k^* = (K/L)^{\beta}/A$. The intensity
is determined by $s_{k^*} k^* = K$, hence $s_{k^*} = K(K/L)^{-\beta} A = AK^{\alpha} L^{\beta}$. All other
intensities $s_k$ are zero. Since activities were normalized by output and returns to
scale are constant, output equals the activity intensity, $Y = AK^{\alpha} L^{\beta}$. In other words,
the aggregate production function is the same as the underlying microtechnology.
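
The algebra of this simplistic link can be verified numerically. The following minimal sketch uses arbitrary, hypothetical values for the parameters and endowments. It picks the activity whose factor intensity matches the endowment ratio, runs it at the intensity that exhausts capital, and confirms that labor is then exhausted as well and that output equals the Cobb-Douglas value.

```python
# Hypothetical parameters with constant returns to scale (alpha + beta = 1).
A, alpha = 2.0, 0.3
beta = 1.0 - alpha
K, L = 8.0, 5.0                     # hypothetical factor endowments

def labor_per_unit(k):
    """Labor requirement l of the unit-output activity that uses capital k."""
    return (A * k ** alpha) ** (-1.0 / beta)

k_star = (K / L) ** beta / A        # capital input of the selected activity
l_star = labor_per_unit(k_star)     # its labor input
s_star = K / k_star                 # intensity that uses up all capital

Y_formula = A * K ** alpha * L ** beta
print(s_star, Y_formula)            # output of the activity vs. Cobb-Douglas value
print(s_star * l_star, L)           # labor used vs. labor endowment
```

Since each activity is normalized to one unit of output, the intensity `s_star` is the output of the economy, so agreement of the first two printed numbers is the claim made in the text.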
In this simple activity analytic underpinning of the aggregate production func-
tion, all production units are free to choose from a continuum of activities, from
labor to capital intensive. And all production units would select the same activity.
This extreme flexibility, with its concentrated optimal activity pattern, is not very
realistic. Capital intensities of production units are fixed once installed and vary
across production units. Individual production units cannot access the full menu of
activities, that is, the technology. In activity analysis, it is customary to assume that
production units have given techniques and are represented by their activities. The implicit
assumption is that production units cannot substitute inputs. However, at the
macrolevel, substitution may take place. When a factor input becomes abundant,
such as labor in the time of German unification, its price will go down, making
activities with intensive use of the abundant factor input financially feasible. The
subpopulation of active production units will shift to the more intensive users of the
abundant endowment. In this framework, it is better not to assume that production
intensities can vary freely from zero to infinity. If they could, then a single production unit
with the right factor intensity, which matches the endowments ratio, would pick
up all activity. The result would be the same as in the simplistic world in which all
production units have access to the full technology.
In line with the factor specificity of an activity, it is assumed there is a capacity
constraint for each activity. A fixed input causes the capacity constraint. The
fixed input is other than the variable inputs, capital, and labor. Houthakker [7]
suggests entrepreneurial resources. The distribution of the capacity constraint (of
entrepreneurial resources) over activities (k, l; 1) is considered to be given, y(k, l).
This distribution need not be concentrated on a frontier like $\{(k, l) \mid Ak^{\alpha} l^{\beta} = 1\}$.
Some activities may dominate others, with both components of $(k, l)$ smaller. Yet a
dominated activity may be run, because the superior activity, like all activities, has a
capacity constraint. Activities can be run with intensities $0 \le s(k, l) \le y(k, l)$. Subject
to the factor constraints $\iint s(k, l)\, k \, dk \, dl \le K$ and $\iint s(k, l)\, l \, dk \, dl \le L$, we maximize
output $\iint s(k, l) \, dk \, dl$. This is a linear program with a continuum of variables $s(k, l)$.
Denote the shadow prices of the two factor constraints by $r$ and $w$, respectively. By
the phenomenon of complementary slackness, unprofitable activities, with unit cost
$rk + wl > 1$, are not run, $s(k, l) = 0$. By the same argument, profitable activities,
with unit cost $rk + wl < 1$, are run at full capacity, $s(k, l) = y(k, l)$. Activities
which break even, $rk + wl = 1$, have intensity $0 \le s(k, l) \le y(k, l)$, but since the
set of such activities has measure zero we may set $s(k, l) = y(k, l)$. It follows that
inputs and output are $K = \iint_{rk + wl \le 1} y(k, l)\, k \, dk \, dl$, $L = \iint_{rk + wl \le 1} y(k, l)\, l \, dk \, dl$, and
$Y = \iint_{rk + wl \le 1} y(k, l) \, dk \, dl$, respectively. The implicit assumption is that all factor
input can be fully employed. There must be activities with factor intensity k/l below
endowment ratio K/L and activities with factor intensity above the endowment ratio.
The three expressions, for inputs K and L and output Y, are interrelated by
the two shadow prices r and w. The idea of Houthakker [7] is to use the first two
expressions to solve for r and w in terms of K and L. Substitution of the results
in the third expression yields output as a function of the inputs. Houthakker [7]
carries out this calculation for the capacity distribution with Pareto density function,
$y(k, l) = \mu k^{\kappa-1} l^{\lambda-1}$, where μ, κ, and λ are positive constants. The result is
$Y = A K^{\alpha} L^{\beta}$ with α = κ/(κ + λ + 1), β = λ/(κ + λ + 1), and A a positive
constant depending on μ, κ, and λ. In other words, a Pareto capacity distribution
yields a Cobb-Douglas production function. This is Houthakker’s Theorem. At the
microlevel, activities have fixed input-output ratios – it takes given amounts of labor
to operate given machinery and equipment – but a change in resources, such as the
inclusion of the East German labor force, is accommodated by the activation of
new activities and the deactivation of some incumbent activities. Reallocations of
resources across activities manifest as substitutions.
The capacity distribution is not concentrated on a single isoquant in input space.
Both k and l can be bigger. In solving the output maximization, smaller input
combinations are activated, but only to full capacity. Residual inputs are employed
by more input-intensive activities. The capacity constraints thus yield decreasing
returns to scale. Indeed, the Cobb-Douglas function has exponents summing to a
number less than one. Houthakker’s activity foundation of neoclassical production
functions works only if returns to scale are decreasing.
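The aggregation argument can be checked numerically. The following is a minimal sketch, not Houthakker's own derivation: it assumes the Pareto density above, evaluates the three integrals over the region rk + wl ≤ 1 in closed form with the Dirichlet integral formula, and confirms that Y / (K^α L^β) does not depend on the shadow prices. The function names and parameter values are illustrative assumptions.

```python
import math


def aggregates(r, w, mu=1.0, kappa=2.0, lam=3.0):
    """Closed-form K, L, Y for the density y(k, l) = mu * k**(kappa-1) * l**(lam-1),
    integrated over the active region r*k + w*l <= 1 with k, l >= 0.

    Substituting u = r*k, v = w*l reduces each integral to a Dirichlet integral:
    the double integral of u**(a-1) * v**(b-1) over u + v <= 1 equals
    Gamma(a) * Gamma(b) / Gamma(a + b + 1).
    """
    g = math.gamma
    scale = mu * r ** (-kappa) * w ** (-lam)
    K = scale / r * g(kappa + 1) * g(lam) / g(kappa + lam + 2)
    L = scale / w * g(kappa) * g(lam + 1) / g(kappa + lam + 2)
    Y = scale * g(kappa) * g(lam) / g(kappa + lam + 1)
    return K, L, Y


def implied_constant(r, w, kappa=2.0, lam=3.0):
    """A = Y / (K**alpha * L**beta) with alpha = kappa/(kappa+lam+1), beta = lam/(kappa+lam+1)."""
    alpha = kappa / (kappa + lam + 1)
    beta = lam / (kappa + lam + 1)
    K, L, Y = aggregates(r, w, kappa=kappa, lam=lam)
    return Y / (K ** alpha * L ** beta)


# Different shadow prices (hence different endowments K, L) give the same constant A.
constants = [implied_constant(r, w) for r, w in [(0.5, 0.5), (1.0, 2.0), (3.0, 0.7)]]
print(constants)
```

With these exponents, α + β = (κ + λ)/(κ + λ + 1) is below one for any positive κ and λ, in line with the decreasing returns to scale noted above.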

Variants of Houthakker’s Theorem

Clearly, different capacity distributions for the activity levels will generate different
production functions. Houthakker [7] has generated a stream of theoretical and
applied research, to date. The bulk of this literature features a lower dimension,
with only one variable input, namely labor, and again one fixed input, which is
now capital. In this one fixed-one variable input framework, Levhari [14] found
the capital distribution for which total output is a CES function of the total fixed
input (capital) and the total variable input (labor) and showed it encompasses the
Cobb-Douglas function. Muysken [16] has consolidated the Cobb-Douglas, CES,
and VES functions by showing they are all generated by beta distributions, with
alternative parametrizations. Two books on the distribution approach to production
are Johansen [9] and Sato [19].
In this literature, activities have fixed input-output proportions, and capacity
constraints explain the existence of inefficient activities. Increases in levels of inputs
prompt the activation of less efficient activities, in Ricardian style. The law of one
price yields rents to the more efficient activities. The activation of different activities
prompts different proportions between the input totals and the output. Substitution
is considered a symptom of the change in the range of active activities (run with
positive intensity).
The interrelation of total output to two inputs is a shortcut with a strong macroe-
conomic flavor (e.g., Lagos [12]). One way to reconcile economy-wide analysis
with activities is to aggregate in stages, from activities (production units) via
conglomerates (industries) to the economy. In the second stage, one has to aggregate
production functions more general than fixed proportions functions, also called
Leontief functions. In the one variable-one fixed input framework, Sato [18] analyzed
how micro-CES functions and an appropriate inefficiency distribution (reflecting
capacity constraints) generate a macro-CES function, with a greater elasticity of
substitution (for the same reason as capacity constraints create substitutability when
the microfunctions are Leontief).
Growiec [6] generalizes the capacity distribution with Pareto density function,
$y(k, l) = \mu k^{\kappa-1} l^{\lambda-1}$. He keeps the multiplicative structure, in other words the
independence of the unit factor productivities, $F_a$ and $F_b$. For each (K, L), firms
maximize CES output $A[\psi (bK)^{\theta} + (1 - \psi)(aL)^{\theta}]^{1/\theta}$ with respect to unit factor
productivities a and b, subject to $F_a(a) F_b(b) = N$, where N indexes the technology
frontier, 0 < N < 1. If $F_a(a) = c_a a^{\gamma}$ and $F_b(b) = c_b b^{\gamma\alpha/(1-\alpha)}$, then maximum
output is $A K^{\alpha} L^{1-\alpha}$. This is the case where a Pareto distribution of unit factor
productivities and free choice of technology yields Cobb-Douglas output. However,
the mechanism is very different from that in Houthakker [7]. In Growiec [6], firms freely
choose from a menu which is parametrized by a distribution. The formal similarity –
a Pareto distribution translates into a Cobb-Douglas function – is coincidental. In
Growiec, the Weibull distribution translates into a CES function, while this is not
the case in Levhari’s [14] CES analysis of the Houthakker [7] model.

Activity Foundation of Input-Output Analysis

The two-stage aggregation, from activities via industries to the economy, is a
useful framework to accommodate the output differences between industries and
to relate their inefficiencies. We return to the definition of an activity: a pair
of an input vector and an output vector. Unlike traditional input-output analysis,
modern activity analysis allows for multiple outputs and even different numbers of
outputs and production units or industries. The advantage of input-output analysis,
the accommodation of intermediate inputs, is preserved though. An input vector
consists of m commodities and l factor inputs. An output vector consists of m
commodities (the same as in the input vector). There are n production units, that
is, activities. The first production unit has produced inputs $(u_{11}, \ldots, u_{m1})$, factor
inputs $(f_{11}, \ldots, f_{l1})$, and outputs $(v_{11}, \ldots, v_{m1})$. Here f stands for factor,
u stands for use, and v, the next letter in the alphabet, stands for output. Writing
these vectors as column vectors (as the index notation suggests), and stacking the
column vectors representing the other production units next to them, the n activities
are represented by the triplet of l × n-dimensional factor input matrix F, m × n-
dimensional intermediate input matrix U, and m × n-dimensional output matrix V.
If all activities are included, the economy is represented by the triplet (F, U, V) and
the nonnegative l-dimensional available resource vector ω.
An allocation is a nonnegative n-dimensional activity vector, s, where the i-th
component is the scale of production of unit i. For example, if $s_i = 1.1$, all inputs
and outputs of activity i are 10% greater than observed. An allocation is feasible if
Fs ≤ ω. Intermediate demand is Us. Gross output is Vs. Net output is the difference,
(V − U) s. This is final consumption.
In traditional input-output analysis, the number of activities equals the number
of commodities, i.e., m = n. In this literature, gross output Vs is denoted x and
net output (V – U) s is denoted y. It is reasonable to assume that output matrix
V has a dominant diagonal. Then V is invertible, and the choice between the
allocation variable s of activity analysis and the gross output variable x of input-output
analysis is a matter of a change of variable, x = Vs and $s = V^{-1}x$. The
material balance y = (V − U)s can be rewritten as Leontief’s [13] basic equation,
y = x − Ax. Here A is the matrix of input-output coefficients determined by $A = UV^{-1}$.
The latter specification is the so-called commodity technology model, which has
superior balance and invariance properties [11]. The upshot is that activity analysis
encompasses input-output analysis.
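The change of variable can be illustrated with a small two-commodity, two-activity example. This is a sketch under assumed data: the matrices U and V and the helper functions are hypothetical and serve only to show that net output computed from the activity side, (V − U)s, agrees with Leontief's y = x − Ax once A = UV⁻¹.

```python
def matmul(A, B):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix (sufficient for this illustration)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical data: columns are activities, rows are commodities.
U = [[2.0, 1.0], [3.0, 2.0]]    # intermediate inputs
V = [[10.0, 1.0], [0.0, 8.0]]   # outputs, with a dominant diagonal
s = [[1.0], [1.0]]              # observed activity levels

x = matmul(V, s)                        # gross output x = Vs
A = matmul(U, inverse_2x2(V))           # commodity technology model A = U V^{-1}
Ax = matmul(A, x)
y_leontief = [[xi[0] - axi[0]] for xi, axi in zip(x, Ax)]   # y = x - Ax

net = [[v - u for v, u in zip(v_row, u_row)] for v_row, u_row in zip(V, U)]
y_activity = matmul(net, s)             # y = (V - U)s

print(A, y_leontief, y_activity)
```

Both routes give the same net output vector, because Ax = UV⁻¹Vs = Us.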
In the System of National Accounts [2], the number of commodities is greater
than the number of industries: m > n. Standard input-output analysis is problematic,
but activity analysis remains doable [24].
When input and output data are used in raw form, at the level of reporting
production units, without aggregation to industries, the number of activities is
greater than the number of commodities, m < n. In this case, there is a wealth of data,
and activity analysis facilitates stochastic input-output analysis. The commodity
technology model, $A = UV^{-1}$, does not exist when output matrix V is not square,
but input-output coefficients matrix A can be estimated by regressing inputs U on
outputs V: U = AV + ε, including an error term [17].

Efficiency

An allocation is efficient if no other allocation is better for one consumer without
being worse for the other consumers. Observed allocations tend to be inefficient.
The efficiency of an economy is measured by Debreu’s [4] coefficient of resource
utilization, ρ, a number between 0 and 1. ρ is the lowest number such that
all consumers can be equally well off if the endowment is reduced from ω to
ρω. The coefficient of resource utilization depends on the preferences of the
consumers. If there is much substitutability, there is much scope for reallocations
and, therefore, for potential efficiency gains. In this case, the coefficient of resource
utilization will be low. Conversely, ten Raa [22] has shown that if the consumers
have Leontief preferences (consumption with fixed proportions), the coefficient
of resource utilization attains its upper bound. In other words, the assumption of
Leontief preferences yields a conservative inefficiency measure. ten Raa [22] calls
this measure the Debreu-Diewert coefficient of resource utilization.
The Debreu-Diewert coefficient of resource utilization, by its assumption of fixed
consumption bundles, rules out efficiency gains due to consumers’ exchanges. If
Debreu’s coefficient of resource utilization is 0.7 and the Debreu-Diewert coefficient
is 0.8, then the difference represents consumer inefficiency. In this example, overall
inefficiency is 30% (1 − 0.7), production inefficiency 20% (1 − 0.8), and consumer inefficiency 10% (0.8 − 0.7).
ten Raa [22] shows that microdata of final consumption are not needed to calculate
the Debreu-Diewert coefficient of resource utilization.
The calculation of the Debreu-Diewert coefficient of resource utilization is simple
in the activity framework of the economy. ten Raa [22] shows that the better set
of Pareto noninferior allocations is {s ≥ 0| (V – U)s ≥ (V – U)e}, where e is the
unit or summation vector with all components equal to one and the inequalities are
commodity constraints. Over this set, one must minimize ρ subject to feasibility
condition Fs ≤ ρω. This is a linear program. The Lagrange multipliers of the
commodity and factor constraints are competitive prices. By the phenomenon
of complementary slackness, activities with positive activity level break even,
and unprofitable activities are shut down. In other words, the principle of profit
maximization selects the activities that minimize resource use.
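The linear program can be made concrete in the simplest possible case. The sketch below is not the general program: it assumes a single commodity and a single factor, so that minimizing ρ reduces to running only the activity with the lowest factor requirement per unit of net output. The function name and the data are hypothetical; the general case with several commodities and factors needs a linear programming solver.

```python
def debreu_diewert_one_by_one(net_output, factor_use, endowment):
    """Hypothetical helper for one commodity and one factor.

    Solves: minimize rho subject to
        sum(net_output[i] * s[i]) >= sum(net_output)    # (V - U)s >= (V - U)e
        sum(factor_use[i] * s[i]) <= rho * endowment    # Fs <= rho * omega
        s >= 0
    Assumes total observed net output is positive.
    """
    required = sum(net_output)
    # Factor needed per unit of net output, for activities with positive net output.
    unit_costs = [(f / n, i) for i, (n, f) in enumerate(zip(net_output, factor_use)) if n > 0]
    unit_cost, best = min(unit_costs)
    s = [0.0] * len(net_output)
    s[best] = required / net_output[best]
    rho = unit_cost * required / endowment
    # Shadow prices, normalized so that the endowment is worth 1.
    w = 1.0 / endowment
    p = w * unit_cost
    profits = [p * n - w * f for n, f in zip(net_output, factor_use)]
    return rho, s, p, w, profits


# Three activities at observed unit scale; observed factor use is 9 out of an endowment of 10.
rho, s, p, w, profits = debreu_diewert_one_by_one([4.0, 3.0, 2.0], [2.0, 3.0, 4.0], 10.0)
print(rho, s, profits)
```

In this example the active activity breaks even at the shadow prices and the other two would make losses, which is the complementary slackness pattern described above.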
The competitive commodity prices can be used to evaluate the net output growth,
competitive factor rewards are used to evaluate the factor input growth, and the
difference is total factor productivity growth (TFP). A classical result is that for
perfectly competitive economies, TFP equals the shift in the production function
or technical change (TC; see Solow [20]). In general, TFP equals the sum of TC
and the change in the Debreu-Diewert coefficient of resource utilization or, briefly,
efficiency change (EC; see ten Raa [22]). Both components can be decomposed
further in numerous ways.
The decomposition of efficiency involves a bias issue. The efficiency of a system
of production units is less than the average efficiency of the production units. The
reason is that the allocation of resources may be inefficient. This bias issue was first
analyzed by Blackorby and Russell [3], who demonstrated that there is no bias only when
production is linear, in the sense that marginal rates of substitution and marginal rates of
transformation are constant and these constants are common to the production units.
ten Raa [23] showed that the bias measures the inefficiency of
the industrial organization of the production units.

Conclusion

Activity analysis bridges the gap between input-output analysis with its fixed input
proportions and neoclassical production theory with abounding substitutability,
using the former as a foundation of the latter and showing that the latter encom-
passes the former. Substitutability of factor inputs is a manifestation of reallocations
between activities with different factor intensities. Activity analysis accommodates
efficiency analysis by measuring and decomposing inefficiencies.

Cross-References

Multiproduct Technologies
Neoclassical Production Economics: An Introduction

References
1. Baumol WJ (1961) Economic theory and operations analysis. Prentice-Hall, New York
2. Beutel J (2017) Chapter 3. The supply and use framework of national accounts. In: ten Raa T
(ed) Handbook of input–output analysis. Edward Elgar, Cheltenham
3. Blackorby C, Russell RR (1999) Aggregation of efficiency indices. J Prod Anal 12(1):5–20
4. Debreu G (1951) The coefficient of resource utilization. Econometrica 19(3):273–292
5. Diewert WE, Fox KJ (2008) On the estimation of returns to scale, technical progress and
monopolistic markups. J Econ 145(1):174–193
6. Growiec J (2008) Production functions and distributions of unit factor productivities: uncover-
ing the link. Econ Lett 101(1):87–90
7. Houthakker HS (1955) The Pareto distribution and the Cobb-Douglas production function in
activity analysis. Rev Econ Stud 23(1):27–31
8. Johansen L (1972) Simple and general nonsubstitution theorems for input–output models. J
Econ Theory 5(3):383–394
9. Johansen L (1972) Production functions: an integration of micro and macro, short run and long
run aspects. North-Holland, Amsterdam
10. Koopmans TC (1951) Activity analysis of production and allocation. Wiley, New York
11. Kop Jansen P, ten Raa T (1990) The choice of model in the construction of input–output
coefficients matrices. Int Econ Rev 31(1):213–227
12. Lagos R (2006) A model of TFP. Rev Econ Stud 73(4):983–1007
13. Leontief WW (1936) Quantitative input and output relations in the economic system of the
United States. Rev Econ Stat 18(3):105–125
14. Levhari D (1968) A note on Houthakker’s aggregate production function in a multifirm
industry. Econometrica 36(1):151–154
15. Mirrlees JA (1969) The dynamic nonsubstitution theorem. Rev Econ Stud 36(1):67–76
16. Muysken J (1983) Transformed beta-capacity distributions of production units. Econ Lett
11(3):217–221
17. Rueda-Cantuche J (2017) Chapter 4. The construction of input–output coefficients. In: ten Raa
T (ed) Handbook of input–output analysis. Edward Elgar, Cheltenham
18. Sato K (1969) Micro and macro constant-elasticity-of-substitution production functions in a
multifirm industry. J Econ Theory 1(4):438–453
19. Sato K (1975) Production functions and aggregation. North-Holland, Amsterdam
20. Solow RM (1957) Technical change and the aggregate production function. Rev Econ Stat
39(3):312–320
21. ten Raa T (1995) The substitution theorem. J Econ Theory 66(2):632–636
22. ten Raa T (2008) Debreu’s coefficient of resource utilization, the Solow residual, and TFP: the
connection by Leontief preferences. J Prod Anal 30(3):191–199
23. ten Raa T (2011) Benchmarking and industry performance. J Prod Anal 36(3):285–292
24. ten Raa T, Shestalova V (2015) Supply-use framework for international environmental policy
analysis. Econ Syst Res 27(1):77–94
25. von Neumann J (1945) A model of general economic equilibrium. Rev Econ Stud 13(1):1–9
12 Bad Outputs
Sushama Murty and R. Robert Russell

Contents
Introduction
Single-Equation Modeling of the Technology Under Standard Disposability Assumptions
Treating Pollution as a Conventional Production Output
Treating Pollution as a Conventional Production Input
Weakly Disposable Technologies
Multiple-Equation Modeling of Pollution-Generating Technologies
Rival vs. Joint Production of Multiple Outputs
Multi-equation Modeling: The Case of Factorially Determined Multi-output Production
Multi-equation Modeling: The Case of Rival and Joint Production
Multi-equation Modeling of Emission-Generating Technologies with Abatement Activities and Multiple Emissions
Rival Production of Abatement and the Economic Output
Modeling the Generation of Multiple Emissions
The Overall By-Production Technology with Abatement and Multiple Emissions
Axiomatic Approach to Modeling Emission-Generating Technologies
Efficiency Measurement
Properties of Environmental Efficiency Indexes
Hyperbolic and Directional Distance Indexes
The “Färe-Grosskopf-Lovell” Index
Extension of the FGL Index to Graph Space
Critiques and Suggested Modifications of the By-Production Structure

S. Murty
Centre for International Trade and Development, School of International Studies, Jawaharlal
Nehru University, New Delhi, India
R. R. Russell
Department of Economics, University of California, Riverside, Riverside, CA, USA

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_3
Concluding Remarks: The Material Balance Condition
Cross-References
References

Abstract

Inadequacies of the traditional, single-equation representations of models of
emission-generating technologies prominently associated with the classic book
by Baumol and Oates (The theory of environmental policy, 1st and 2nd edn.
Cambridge University Press, Cambridge, 1975, 1988) are first laid out. In
particular, these models lack the “monotonicity degrees of freedom” to capture
adequately the complex trade-offs in the production of unintended as well as
intended outputs using emission-generating inputs. Reprising ideas in the classic
1965 book on Theory of Production by Ragnar Frisch, it is shown that the use
of multiple functional restrictions, a phenomenon referred to as by-production
in Murty, Russell, and Levkoff (J Environ Econ Manag 64:117–135, 2012),
facilitates the modeling of pollution-generating technologies. In particular, a
by-production technology is obtained as the intersection of an intended-output
sub-technology and an unintended-output sub-technology. These principles are
illustrated by sketching a model of coal-fired electrical power generation. A data
envelopment analysis (DEA) methodology for measuring technical efficiency
under the by-production approach is also discussed.

Introduction

The modeling of production technologies has a long history. Early conceptualizations
of relations among inputs and outputs, based on stylized facts and empirical
observations, were manifested in the law of diminishing marginal productivity (or
increasing marginal cost) and various types of returns to scale. These modeling
efforts culminated in a rigorous axiomatization of production technologies and their
representations by production functions in the middle of the twentieth century.
Prominent among the main features of a technology recognized by this literature
were the free disposal properties of the inputs and outputs. Together they imply the
empirically observed positive relationship between inputs and outputs. Combined
with the assumption of convexity, this axiomatization of the technology laid a
foundation for many pathbreaking theoretical results, including the existence of a
general competitive equilibrium, the formalization of the two fundamental theorems
of welfare, and the duality between technological constraints and optimizing
behavior (e.g., profit maximization and cost minimization). These results facilitated
a plethora of applied work with significant consequences for economic policy in
areas like public economics, measurement of efficiency and productivity, economic
growth, and industrial organization.
Operations of many technologies lead to the production of not only desirable eco-
nomic outputs but also incidental outputs that may have undesirable consequences
for the rest of the economy. Rigorous study of the relations between inputs and out-
puts that are satisfied by such technologies was generally lacking for a long while.
It was only in the latter part of the twentieth century, as the field of environmental
and natural resource economics established its roots and gained momentum, that
researchers faced the challenge of modeling the generation of bad (unintended)
outputs as by-products of the production of the desired (intended) economic outputs.
The technology of an emission-generating producer was recognized as an important
primitive in the theoretical analysis of market externalities and the formulation of
policies aimed at efficient mitigation of the inimical effects on social welfare of the
production of the bad outputs.
The initial treatment of bad outputs in the modeling of production technologies
was very simple. The standard approach, adopted in the classic Baumol-Oates [7]
book and persisting to this day, is simply to include in the production function
an emission variable, assumed to satisfy the same (free disposability) conditions
as a conventional input. An early exception to the standard approach can be
found in Färe, Grosskopf, and Pasurka [22] and Färe, Grosskopf, Lovell, and
Pasurka [19], where emissions are modeled as (bad) outputs satisfying a weak
disposability assumption: bad and good outputs can only be disposed of in tandem
(proportionately). The main idea behind these “input” or “output” approaches
to modeling bad outputs is to capture the empirically observed positive relation
between the production of good and bad outputs: as the production of good outputs
increases, the technology also generates more of the bad outputs. Under such
approaches, it became possible to represent the technology set of an emission-
generating production unit by a single production function/equation.
As first pointed out by Førsund [27] and Murty and Russell [42], each of
these approaches to modeling pollution-generating technologies entails implausible
properties of the technology. Most egregiously, the Baumol-Oates formulation
implies that, ceteris paribus, increases in the use of a pollution-generating input
lower the levels of emissions. The weak disposability approach of Färe, Grosskopf,
Lovell, and Pasurka [19] entails free disposability of emission-generating inputs,
implying for example that coal input can be increased without bound and without
generating additional pollution.
Building on ideas of Frisch [30], Førsund [27] and Murty and Russell [42]
argued analytically that the perverse trade-offs engendered by the single-equation
representations of pollution-generating technologies can be avoided by using
multiple functional restrictions to describe the technology, a construction that Murty
and Russell call by-production. These ideas have been further developed in Førsund
[28], Murty [40], and Murty, Russell, and Levkoff [44].
The chapter unfolds as follows. The single-equation (Baumol-Oates) and weak
disposability approaches are presented and critiqued in sections “Single-Equa-
tion Modeling of the Technology Under Standard Disposability Assumptions”
and “Weakly Disposable Technologies.” The multiple constraint approach without
abatement activities is developed in section “Multiple-Equation Modeling of Pol-
lution-Generating Technologies.” To study the required form of the multifunctional
restrictions, a distinction is made between rival and joint production of multiple
outputs. It is argued that the production of multiple economic (desirable) outputs
can be rival or joint but that there is jointness in the production of good and bad
outputs.
While the analysis in section “Multiple-Equation Modeling of Pollution-Generating
Technologies” is restricted to the case where only one emission is generated,
section “Multi-equation Modeling of Emission-Generating Technologies with
Abatement Activities and Multiple Emissions” extends the multi-equation modeling
approach to allow for the generation of multiple emissions by a producing unit and
to incorporate abatement activities that it can undertake to mitigate its emissions.
Ayres and Kneese [3] and Pethig [46] argue that abatement activities mitigate
harmful emissions by transforming them into less harmful matter. When multiple
emissions are generated by a production unit, some may be jointly produced, while
others may be rival in nature; e.g., Levkoff [35], Kumbhakar and Tsionas [33], and
Murty and Russell [43] distinguish between complementarity and substitutability in
the generation of emissions. Section “Axiomatic Approach to Modeling Emission-
Generating Technologies” adopts an axiomatic approach, proving that any model
of pollution-generating technologies satisfies a set of desirable axioms if and only
if it is a by-production technology. Section “Efficiency Measurement” studies the
implications of the multiple production-relations approach to modeling an emission-
generating technology for the measurement of technical inefficiency of a producing
unit.
The multi-equation models of emission-generating technologies that are devel-
oped in this chapter are motivated by both the first and the second law of
thermodynamics.1 Together these laws explain why emission generation is an
inevitable consequence of economic production. Of the two, the first law of
thermodynamics, also called the material balance or the mass balance condition, is
especially popular in the literature, where it has often been employed to measure
the extent of emission generation. Intuitively, it states that matter cannot be
destroyed and hence that the mass of all material inputs must equal the mass of all
outputs produced. Section “Concluding Remarks: The Material Balance Condition”
concludes with some comments on the consequences of this condition for the
economic modeling of production technology.
A couple of caveats about the content of the chapter are in order. First, in keeping
with the theme of this volume, we consider only theoretical characterizations of
a pollution-generating technology and only non-stochastic notions of efficiency
measurement. Second, the chapter is not a standard survey. Rather, its primary
objective is to develop a consistent framework for modeling technologies that
generate by-products, drawing on the relevant literature as necessary.

1 See Ayres and Kneese [3], Ayres [2], Baumgärtner and de Swaan Arons [6], and Baumgärtner
[5].

Single-Equation Modeling of the Technology Under Standard Disposability Assumptions

Consider a very parsimonious model in which two inputs are employed to produce
a single intended (economic) output, with a single unintended (bad) output as a
by-product. Denote the quantities of the two inputs by $x_1$ and $x_2$ and the quantities
of the intended and unintended outputs, respectively, by y and z. Finally, denote the
underlying technology set by T and the production vector by $\langle x_1, x_2, y, z \rangle \in \mathbf{R}^4_+$.
Assume that the technology satisfies standard free disposability with respect to
both the inputs and the intended output2 :

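$\langle x_1, x_2, y, z \rangle \in T \;\wedge\; \bar{x}_1 \ge x_1 \;\wedge\; \bar{x}_2 \ge x_2 \;\wedge\; \bar{y} \le y \implies \langle \bar{x}_1, \bar{x}_2, \bar{y}, z \rangle \in T.$ (1)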

In particular, output free disposability (implied by (1)),

$\langle x_1, x_2, y, z \rangle \in T \;\wedge\; \bar{y} < y \implies \langle x_1, x_2, \bar{y}, z \rangle \in T,$

states that, for fixed quantities of inputs (and emissions), the economic output can
be arbitrarily reduced. Thus, reduction of the economic output is costless: it need
not entail use of additional inputs (or reduction of other economic outputs if y were
a vector of several economic outputs). Similarly, input free disposability (implied
by (1)),

$\langle x_1, x_2, y, z \rangle \in T \;\wedge\; \bar{x}_1 \ge x_1 \;\wedge\; \bar{x}_2 \ge x_2 \implies \langle \bar{x}_1, \bar{x}_2, y, z \rangle \in T,$

states that, holding the economic output (and emissions) fixed, input quantities can
be arbitrarily increased. Thus, the use of additional amounts of inputs is costless:
it need not entail reductions in the production of outputs (both intended and
unintended).
If we also assume that T is a closed set and that there are upper bounds on
production of the economic output when inputs are held fixed, T can then be
represented by a single explicit production function, $F : \mathbf{R}^3_+ \to \mathbf{R}_+$, with image

$F(x_1, x_2, z) := \max\{y \mid \langle x_1, x_2, y, z \rangle \in T\}.$ (2)

Given that the economic output is freely disposable, it is clear that

$\langle x_1, x_2, y, z \rangle \in T \iff y \le F(x_1, x_2, z).$ (3)

2 Vector notation: $\bar{x} \ge x$ if $\bar{x}_i \ge x_i$ for all i; $\bar{x} > x$ if $\bar{x}_i \ge x_i$ for all i and $\bar{x} \ne x$; and $\bar{x} \gg x$ if
$\bar{x}_i > x_i$ for all i. The conjunction symbol ∧ stands for “and”.

The frontier of the technology is defined to be the set of production vectors
$\langle x_1, x_2, y, z \rangle \in T$ such that $y = F(x_1, x_2, z)$. Free disposability of inputs implies
that the function F is nondecreasing in each of the inputs. To see this, suppose
$y = F(x_1, x_2, z)$ and $\bar{x}_1 \ge x_1$. Then $\langle x_1, x_2, y, z \rangle \in T$ and free input disposability imply
that $\langle \bar{x}_1, x_2, y, z \rangle \in T$. Hence, (3) implies that $F(x_1, x_2, z) = y \le F(\bar{x}_1, x_2, z)$.
Thus, F is nondecreasing in inputs.
Note that (1) imposes no disposability restriction on the unintended output. The
alternative (standard) assumptions are to treat the unintended output either as a
conventional output or as a conventional input.3 As is demonstrated below, either
of these assumptions ensures that the technology has a single-equation functional
representation (albeit these modeling assumptions both lead to counterintuitive
properties of the technology).

Treating Pollution as a Conventional Production Output

Suppose first that emission is treated as a standard output, so that T also satisfies
standard output free disposability with respect to this variable:

$$\langle x_1, x_2, y, z\rangle \in T \;\wedge\; \bar{z} \le z \implies \langle x_1, x_2, y, \bar{z}\rangle \in T. \tag{4}$$

The implications of assuming that emission is a freely disposable output are
counterintuitive. This assumption implies that the technology permits arbitrary
reductions in the emission, holding all input and economic output quantities fixed.
This implies in turn that there is no cost associated with reducing the emission:
emission can be reduced without affecting the production of the economic output,
an implication that is refuted by simple empirical observation in many situations. In
real-life situations, decreases in emissions like greenhouse gases come at the cost
of decreasing the economic output. In particular, assuming that emission is also
a standard output implies that the function F is nonincreasing in emission, i.e., the
trade-off along the frontier of the technology between the maximum-producible
economic output and the emission is nonpositive.4 This perverse trade-off between the
intended and unintended outputs implies that there is no conflict between growth
and environment: ceteris paribus, a reduction in emission (i.e., an improvement in
environmental quality) never decreases, and may increase, the production of the
economic output.
The negative relation between emission and the economic output when emission
is treated as a standard output of the technology can also be interpreted to imply that
emission has a detrimental effect on the production of the economic output. Førsund

3 A nonstandard disposability assumption is explored in section “Weakly Disposable Technologies”.
4 Sketch of proof: Suppose $y = F(x_1, x_2, z)$ and $\bar{z} \le z$. Free output disposability of emission (4)
implies that $\langle x_1, x_2, y, \bar{z}\rangle \in T$. Hence, by (3), $y \le F(x_1, x_2, \bar{z})$; that is,
$F(x_1, x_2, z) \le F(x_1, x_2, \bar{z})$.

[27] demonstrates that, if emission also has a detrimental effect on social welfare,
maximization of social welfare subject to such a technological constraint results in
a solution where no emission is generated, while a positive amount of the economic
output is produced and consumed. This solution, as Førsund argues, contradicts the
inevitability of emission generation when economic outputs are produced.

Treating Pollution as a Conventional Production Input

The arguments presented above have been well understood in the literature, which
has consistently refrained from assuming that emission is a freely disposable output.
Rather, it has aimed at developing models of technology that yield a positive relation
between the generation of emissions and the production of economic outputs. One
strand of this literature,5 going back to Baumol and Oates [7] and Cropper and
Oates [14], models emissions as freely disposable inputs. In the context of the
parsimonious model presented in the previous section, this modeling strategy entails
replacing (4) with

$$\langle x_1, x_2, y, z\rangle \in T \;\wedge\; \bar{z} \ge z \implies \langle x_1, x_2, y, \bar{z}\rangle \in T, \tag{5}$$

while maintaining standard disposability (1) with respect to all other goods and all
other hypotheses made about the technology T in the previous section.
The input approach has some appeal: it relates emissions to the waste disposal
capacity of the environment, which is interpreted as an input in production, just
like other economic inputs. Since emission is now treated as a standard input
and satisfies standard input free disposability, the resulting trade-off between the
emission and the economic output obtained under this approach is nonnegative. To
see this, suppose $y = F(x_1, x_2, z)$ and $\bar{z} > z$. Free (input) disposability of the
emission (5) implies that $\langle x_1, x_2, y, \bar{z}\rangle \in T$. Hence, by (3), $y \le F(x_1, x_2, \bar{z})$, so that
$F(x_1, x_2, z) = y \le F(x_1, x_2, \bar{z}) =: \bar{y}$. That is, the function F is
nondecreasing in the emission, so that the emission and the economic output are
positively related. This relationship is consistent with empirical observation: in real
life, emission generation and economic output production usually go hand in hand.
The proponents of the input approach6 also justify the positive trade-off between
emission generation and intended-output production under this approach by invoking
abatement activities. Economic resources are shared between the production of
abatement and the economic output, so that the more of a producing unit's resources
are diverted to abatement activities, the fewer are available for production of
the economic outputs; thus, the lower are the amounts produced of both economic
outputs and emissions.

5 See, for example, Njuki and Bravo-Ureta [45] and the references therein.
6 See, e.g., Baumol and Oates [7], Laffont [34, Ch. 2], Cropper and Oates [14], Reinhard, Lovell,
and Thijssen [48], and Ball, Lovell, Luu, and Nehring [4].

Taking a very different approach, Førsund [27, 28] argues that, although the
solution to a standard social welfare maximization problem subject to a techno-
logical constraint that assumes emission is a freely disposable input and where
the emission is detrimental to social welfare is well-defined, the input approach to
modeling emission-generating technologies is unsatisfactory, as it is not revealing
of the underlying purification (abatement) activities. Abatement activities are only
implicitly assumed, and this approach therefore fails to show how abatement is
produced from the given inputs.
While the input approach seems to generate the correct trade-off between
emission generation and economic output production, Murty and Russell [42] and
Murty, Russell, and Levkoff [44] (hereafter MRL) show that it also generates two
unacceptable implications for production trade-offs.
To discuss the first of these unacceptable implications, let us first differentiate
inputs according to whether they are emission-causing (such as fossil fuels) or non-
emission-causing (such as labor and capital). Emission-causing inputs are composed
of substances that generate emissions. For example, coal contains sulfur and carbon,
so that when it is burned to generate energy, it
releases CO2 and SO2 into the atmosphere. In the context of our parsimonious
model, assume that the second input is emission-causing, while the first is not.
MRL demonstrate that treatment of the emission as a standard input results in
a nonpositive trade-off between emission and any emission-causing input. For
example, the input approach implies that an emission like CO2 decreases with an
increase in an emission-causing input like coal, a finding that is inconsistent with
common sense. Below we provide an alternative (non-differential) proof of this
counterintuitive implication of free disposability of emissions in a single-equation
representation of the technology.
The function $\Psi : \mathbb{R}^3_+ \to \mathbb{R}_+$, with image

$$\Psi(x_1, y, z) := \min\{x_2 \mid \langle x_1, x_2, y, z\rangle \in T\},$$

identifies the minimal amount of the emission-causing input that is required to
produce economic output y and emission z when the non-emission-causing input
use is $x_1$. Since T satisfies input free disposability, it can also be represented
functionally as7

$$\langle x_1, x_2, y, z\rangle \in T \iff x_2 \ge \Psi(x_1, y, z).$$

A production vector $\langle x_1, x_2, y, z\rangle$ is a frontier point of T if $x_2 = \Psi(x_1, y, z)$.
Suppose $x_2 = \Psi(x_1, y, z)$ and $\bar{z} \ge z$. Since the emission is treated as a standard
input, T satisfies free disposability of the emission. Consequently, $\langle x_1, x_2, y, \bar{z}\rangle \in T$,
and the definition of the function $\Psi$ implies that $x_2 \ge \Psi(x_1, y, \bar{z})$. Hence,
$\Psi(x_1, y, z) = x_2 \ge \Psi(x_1, y, \bar{z}) =: \bar{x}_2$. Thus, the function $\Psi$ is nonincreasing in
the emission; i.e., when the amount of the non-emission-causing input is held fixed
at $x_1$, the minimal amount of the emission-causing input that is required to produce
y amount of the economic output and $\bar{z}$ amount of the emission is no greater than the
minimal amount required to produce the same amount of the economic output but a
lower amount z of emission. Hence, the input approach implies, contrary to common
sense, a nonpositive relation between the emission and the emission-causing input.

7 The set T can have more than one functional representation. The function F, defined in (2), offers
one, the function $\Psi$ offers another, and later in this section, we define a function ð that offers
yet another. Along the strictly efficient frontier of T, we have $y = F(x_1, x_2, z) \iff x_2 = \Psi(x_1, y, z) \iff z = ð(x_1, x_2, y)$.
A production vector in T is a strictly efficient point of T if
there exists no other point in T with no greater amounts of emission and inputs and no smaller
amount of the good output. The set of all strictly efficient points of T forms the strictly efficient
frontier of T. (See section “Axiomatic Approach to Modeling Emission-Generating Technologies”
for a formal definition of the strictly efficient frontier of technology T.)
MRL and Murty [41] demonstrate a second paradox associated with the input
approach: if we assume, as is realistic, that emission generation is positively related
to the use of the emission-causing input, then the technology violates free input
disposability of the emission-causing input. We demonstrate this violation below.
The function $ð : \mathbb{R}^3_+ \to \mathbb{R}_+$, defined by

$$ð(x_1, x_2, y) := \min\{z \mid \langle x_1, x_2, y, z\rangle \in T\},$$

identifies the minimal emission level under technology T when the economic output
quantity is y and the input use is $\langle x_1, x_2\rangle$. Thus, we have

$$\langle x_1, x_2, y, z\rangle \in T \iff z \ge ð(x_1, x_2, y).$$

Since the second input is emission-causing, its use should not decrease the
minimal level of emission that can be generated, so that ð should be nondecreasing
in $x_2$. Suppose, in conformance with our intuition, that ð is strictly increasing
in the use of the second (emission-causing) input. Let $z = ð(x_1, x_2, y)$. Then
$\langle x_1, x_2, y, z\rangle \in T$, and z is the minimal emission generated by quantity $x_2$ of the
emission-causing input. Suppose, ceteris paribus, there is an increase in the use of
this input, $\bar{x}_2 > x_2$. Define the minimal emission that can now be generated as
$\bar{z} := ð(x_1, \bar{x}_2, y)$. Since ð is increasing in the emission-causing input, we have
$\bar{z} > z$. This clearly implies that $\langle x_1, \bar{x}_2, y, z\rangle \notin T$, because otherwise $\bar{z}$ would
not have been the minimal emission generated by quantity $\bar{x}_2$ of the
emission-causing input. Thus, to summarize, we have $\langle x_1, x_2, y, z\rangle \in T$ and $\bar{x}_2 > x_2$ but
$\langle x_1, \bar{x}_2, y, z\rangle \notin T$. Clearly, this is a violation of free disposability of the
emission-causing input. Thus, the input approach is not consistent with the empirically
observed positive relation between the emission and an emission-causing input.
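
The violation can be checked numerically. The following is a minimal sketch, assuming a hypothetical technology in which the minimal emission is proportional to the emission-causing input (coefficient 0.5) and the economic output is bounded by a Cobb-Douglas form. The function names, functional forms, and numbers are illustrative assumptions and are not taken from the chapter.

```python
import math

EMISSION_COEFF = 0.5  # hypothetical: minimal emission per unit of emission-causing input


def max_output(x1, x2):
    """Hypothetical upper bound on the economic output."""
    return math.sqrt(x1 * x2)


def min_emission(x1, x2, y):
    """Hypothetical minimal-emission function, strictly increasing in x2."""
    return EMISSION_COEFF * x2


def in_technology(x1, x2, y, z):
    """Membership test: y is producible and z is at least the minimal emission."""
    return 0.0 <= y <= max_output(x1, x2) and z >= min_emission(x1, x2, y)


x1, x2, y = 4.0, 4.0, 2.0
z = min_emission(x1, x2, y)   # minimal emission at x2 = 4
x2_bar = 9.0                  # more of the emission-causing input

print(in_technology(x1, x2, y, z))        # True: the original vector is feasible
print(in_technology(x1, x2, y, z + 1.0))  # True: extra emission is allowed (emission as input)
print(in_technology(x1, x2_bar, y, z))    # False: extra x2 with z unchanged leaves T
```

The last line is the failure of free disposability of the emission-causing input: raising its quantity while holding emission at the old minimal level is infeasible once the minimal emission rises with that input.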
To see a final critique of the input approach, define the production possibility set
corresponding to a given vector of input, say x, as the set of emission and intended
output configurations that can be produced by input vector x under technology T :

$$P(x) = \{\langle y, z\rangle \in \mathbb{R}^2_+ \mid \langle x, y, z\rangle \in T\}.$$

If the production function F represents the technology, then output disposability of
y implies that

$$P(x) = \{\langle y, z\rangle \in \mathbb{R}^2_+ \mid y \le F(x, z)\}.$$

The (weakly efficient) frontier of P(x) is the set

$$WP(x_1, x_2) := \{\langle y, z\rangle \in \mathbb{R}^2_+ \mid y = F(x_1, x_2, z)\}.$$

Thus, along the frontier of the technology, when inputs are held fixed, there is a
rich menu of combinations of the quantities of the emission and the economic
output. As noted earlier, under free input disposability of the emission and free
output disposability of the economic output, the relation between emission and
economic output along the frontier when all inputs are held fixed is nonnegative.
A combination of good and bad outputs in P (x) is a strictly efficient point of P (x)
if there exists no other combination in P (x) with no larger amount of the bad output
and no smaller amount of the good output. The set of all strictly efficient points of
P (x) – its strictly efficient frontier – is denoted P̂ (x).
Panel (a) of Fig. 1 gives an example of a production possibility set P (x)
satisfying free disposability assumptions (1) and (5), where (5) in particular implies
that emission is treated as a freely disposable input. The strictly efficient frontier,
the bold part of the weakly efficient frontier, reflects a positive relationship between
the good and the bad outputs. The overall frontier reflects a nonnegative relation
between the good and the bad outputs, but this is counterintuitive: if we hold the
quantity of the emission-causing input fixed, there exists a unique minimal level of
emission, i.e., the minimal level is independent of the output quantity. For example,
the minimal amount of smoke that can be produced by one ton of coal containing a
fixed amount of carbon is unique.

Fig. 1 Production possibility sets satisfying (a) free disposability and (b) weak disposability

Weakly Disposable Technologies

Over the years, the principal alternative to the Baumol-Oates (single equation)
method of modeling emission-generating technologies has been the set-theoretic
approach inaugurated by Färe, Grosskopf, Lovell, and Pasurka [19] (hereafter
referred to as FGLP). This approach, which is generally oriented toward data
envelopment analysis (mathematical programming) methods of estimating or con-
structing technologies, characterizes technologies by sets of inequality conditions
for the inputs and outputs (rather than by use of explicit or implicit production
functions).8
The technologies constructed by this method satisfy conditions (1) on the free
disposability of economic output and standard inputs but not condition (5) regarding
free input disposability of the emission. Instead, the authors propose the weak
disposability condition,

$$\langle x, y, z\rangle \in T \;\wedge\; \lambda \in [0, 1] \implies \langle x, \lambda y, \lambda z\rangle \in T,$$

and the null-jointness condition,

$$\langle x, y, z\rangle \in T \;\wedge\; z = 0 \implies y = 0.$$

Panel (b) of Fig. 1 illustrates a production possibility set P (x) for a weakly
disposable technology given a fixed vector of inputs x. The bold region of its
boundary, denoted by P̂ (x), is its strictly efficient frontier, showing a positive
relation between the good and the bad outputs.
By not treating emission as a conventional output, the FGLP approach eliminates
the “global” possibility of the perverse negative trade-off between emission and
the economic output demonstrated in section “Treating Pollution as a Conventional
Production Output.” Under the weak disposability condition, pollution cannot be
freely disposed of as a standard output but can instead be reduced only in tandem
(proportionally) with intended output.
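
The two FGLP conditions can be illustrated on a toy output set. The sketch below assumes a hypothetical P(x), for a fixed input vector, in which the economic output is bounded by a concave function of emission that passes through the origin. The bound 2·sqrt(z), the cap on z, and the grid are illustrative assumptions only. It checks weak disposability and null-jointness on a grid and shows that free output disposability of the emission fails.

```python
import itertools

Z_MAX = 4.0  # hypothetical cap on emission for the fixed input vector


def in_output_set(y, z):
    """Hypothetical P(x): y is bounded by a concave function g(z) = 2*sqrt(z), g(0) = 0."""
    return 0.0 <= z <= Z_MAX and 0.0 <= y <= 2.0 * z ** 0.5 + 1e-12


grid = [i / 4 for i in range(17)]      # 0, 0.25, ..., 4
scales = [i / 10 for i in range(11)]   # lambda in [0, 1]
feasible = [(y, z) for y, z in itertools.product(grid, grid) if in_output_set(y, z)]

# Weak disposability: proportional contraction of (y, z) stays in the set.
weakly_disposable = all(
    in_output_set(lam * y, lam * z) for (y, z) in feasible for lam in scales
)

# Null-jointness: zero emission forces zero economic output.
null_joint = all(y == 0.0 for (y, z) in feasible if z == 0.0)

# Free output disposability of z: cutting z alone should stay in the set (it does not).
z_freely_disposable = all(
    in_output_set(y, z_low) for (y, z) in feasible for z_low in grid if z_low <= z
)

print(weakly_disposable, null_joint, z_freely_disposable)  # True True False
```

In this toy set, emission can be lowered only together with the economic output, which is the sense in which the bad output is weakly rather than freely disposable.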
As is well documented (see, e.g., Førsund [27, 28]), however, the FGLP
approach does not altogether eliminate the negative trade-off between emission and
the economic output: local regions of the production space can exist where this
trade-off is negative. See panel (b) of Fig. 1, in which the boundary of P (x) has a
negatively sloped region.
Moreover, the MRL and Murty [41] critique of emission-generating technologies
satisfying free input disposability of emission-causing inputs, which was discussed
in the previous section, continues to apply even in the weak disposability approach,

8 See, e.g., Färe, Grosskopf, Lovell, and Yaisawarng [20], Coggins and Swinton [13], Murty and
Kumar [38], Murty and Kumar [39], Färe, Grosskopf, Noh, and Weber [21], and Boyd and
McClelland [8]. See Zhou, Ang, and Poh [56] for a comprehensive survey of a number of papers
employing this approach.

as this approach also maintains free disposability of all inputs. The strictly efficient
frontier P̂ (x) of the production possibility set P (x) for the weakly disposable
technology in panel (b) of Fig. 1 contains multiple points. As in the rationalization
of the input approach to modeling emissions, the proponents of the FGLP approach
justify the positive relation between economic output production and emission
generation in terms of abatement activities that can be undertaken by the production
unit. However, since such activities are not explicitly modeled, what is modeled can
only be interpreted as a reduced form of the technology in the space of all intended
and unintended outputs and all inputs.9 MRL demonstrate that even this reduced
form of the technology violates free disposability of the emission-causing input.
Moreover, MRL argue that, when abatement activities are produced by a produc-
ing unit along with the economic outputs, an emission-generating technology can
violate the null-jointness assumption in the weak disposability approach. Although
the use of emission-causing inputs results in the generation of emissions alongside
the generation of the economic output, it is possible that abatement activities so
produced can totally eliminate the emissions. Thus, generation of zero net emissions
alongside positive levels of economic outputs is a theoretical possibility.

Multiple-Equation Modeling of Pollution-Generating Technologies

The output and input approaches to modeling emission-generating technologies,
critiqued in sections “Treating Pollution as a Conventional Production Output”
and “Treating Pollution as a Conventional Production Input,” impose disposability
conditions on the technology that make possible its representation by a single
functional relation F (see, e.g., Eq. (3)) or, equivalently, by the minimal-input
function or the minimal-emission function ð defined in the latter section.
These sections demonstrated that some of these disposal properties do not
conform to our intuitive understanding and empirical observations of the features
of emission-generating technologies and, more particularly, that a single functional
relation is not sufficient to capture all the complex trade-offs among inputs and
outputs involved in the production of economic outputs and the generation of
emissions.
Redress of the problems with the single-equation modeling has focused on using
multiple functional restrictions to implement richer and more plausible disposability
conditions on the representation of the technology. The conceptual framework for
multiple function specifications of technologies was laid out long ago in a book
by Ragnar Frisch [30]. Although inadequately appreciated by the profession for
years, Frisch’s ideas have been reprised for the special case of modeling pollution-
generating technologies in a series of papers by Finn Førsund ([26–28]; 2017).
Based on the ideas of Frisch, Førsund proposes the use of multiple functional
relations to represent emission-generating technologies. But identification of the

9 See Sect. 3.2 of MRL for further explanation of this restrictive interpretation

precise functional relations that correctly capture the trade-offs among goods in
production processes that generate emissions leads to dual questions about the
realistic disposal properties satisfied by such technologies. Such questions led to
the development of an axiomatic framework for modeling such technologies in a
series of papers by Murty and Russell [42, 43] [hereafter MR], Murty, Russell, and
Levkoff [44], and Murty [40, 41].
To provide a rationale for the introduction of multiple functional relations in the
modeling of such technologies, we first distinguish below between rival production
and joint production. We argue that the production of emissions and economic
outputs is not rival in production; rather, this is a special case of joint production that
is discussed in Frisch. While Førsund proposes a model where all goods (including
abatement activities) are jointly produced, in the model proposed by MR and MRL,
the independent production of economic outputs is rival, but the production of
economic outputs and emission is collectively joint, a phenomenon they call by-
production. MR and MRL consider the case of a single emission in their theoretical
model.10 Later in this chapter, we show that, in the case of multiple emissions,
independent production of emissions can also be joint or rival.11 Moreover, intuition
suggests that the production of economic outputs and explicit abatement activities
(such as mitigation of emissions by treatment plants) by a single producing unit is
rival in nature. The proposed framework can be extended to the case where some
economic outputs are also jointly produced.

Rival vs. Joint Production of Multiple Outputs

Let $T \subset \mathbb{R}^{n+m}_+$ be a general technology set producing m outputs using n inputs.12
Thus, $\langle x, y\rangle \in \mathbb{R}^{n+m}_+$ is a production vector, where $x \in \mathbb{R}^n_+$ denotes an input vector
and $y \in \mathbb{R}^m_+$ denotes an output vector. Outputs are indexed by j, while inputs are
indexed by i.

Rival Production of Outputs

The following definition of rival production is equivalent to the definition of input
non-jointness in an earlier literature (e.g., Kohli [32] and Chambers [9], p. 287).
Definition. T exhibits rivalry in the production of (all) outputs if there exist
production functions, f j : Rn+ −→ R+ , one for each output, such that

10 However, in their data envelopment analysis (DEA) model and its empirical application to the
measurement of efficiency of a production unit, they adopt a multi-emission framework.
11 See section “Modeling the Generation of Multiple Emissions” of this chapter.
12 In this section, we do not distinguish between economic outputs and emissions. Both are
considered as outputs of the technology.


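$$\langle x, y\rangle \in T \subset \mathbb{R}^{n+m}_+ \iff \exists\, x^{Y_j} \in \mathbb{R}^n_+ \text{ for all } j = 1, \dots, m, \text{ satisfying } \sum_{j=1}^{m} x^{Y_j} = x \text{ and } y_j \le f^j(x^{Y_j}) \;\; \forall j.$$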

Thus, rivalry in production means that a given vector of input quantities x
employed by the production unit is allocated to (divided among) the production of its
m outputs as $x^{Y_1}, \dots, x^{Y_m}$. So, if more of any input is diverted to the production of
a particular output, less of that input is available for the production of the remaining
outputs.13
The multiple production functions, f j for j = 1, . . . , m (each representing the
production of a single output), can be combined into a single production function
representing the overall technology. For example, when m = 2, we can define

$$F(x, y_1) := \max_{x^{Y_1},\, x^{Y_2}} \{ f^2(x^{Y_2}) \mid y_1 \le f^1(x^{Y_1}),\; x^{Y_1} + x^{Y_2} \le x \}. \tag{6}$$

Given an input vector x and a level of production y1 of the first output, this problem
finds the optimal split of the input vector between the production of the two outputs.
The optimal split is one that maximizes the production of the second output without
reducing the amount of the first output below y1 . Clearly, T can equivalently be
represented by the function F as follows:

$$\langle x, y_1, y_2\rangle \in T \iff y_2 \le F(x, y_1).$$

If the production functions, $f^j$, $j = 1, \dots, m$, are nondecreasing, the inputs
and outputs are freely disposable under this representation of the technology T. If
$f^j$ is also differentiable for all $j = 1, \dots, m$, then it can be shown, employing the
envelope theorem on problem (6), that holding the input vector x fixed, an increase
in the production of the first economic output comes at the cost of a decrease in the
production of the second economic output. This is because, given the input vector,
an increase in the first economic output involves diversion of inputs to its production,
which implies that smaller amounts of inputs are available for producing the second
economic output. Thus, if F is differentiable, we have (in a slight abuse of notation)

13 See also Kohli [32]. A related literature on network DEA (e.g., Färe and Grosskopf [17],
Färe, Grosskopf and Pasurka [23], and Hampf [31]) features various subprocesses of production
among which inputs are shared (divided). One strand of this literature (see, e.g., Lozano [36]
and references therein) distinguishes between joint and non-joint inputs. While non-joint inputs
are associated with rival production, joint inputs are not shared (or divided) among production
processes and lead to the joint production of outputs, a concept that is defined in the next
subsection.

$$\frac{dy_2}{dy_1} = \frac{\partial F(x, y_1)}{\partial y_1} \le 0.$$
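
Problem (6) can be approximated by a simple grid search over the input split. The following is a minimal sketch, assuming a single input and the hypothetical production functions f1(x) = f2(x) = sqrt(x), for which the frontier has the closed form sqrt(x − y1²). The function names, functional forms, and grid size are illustrative assumptions.

```python
import math


def f1(x):
    """Hypothetical production function for output 1."""
    return math.sqrt(x)


def f2(x):
    """Hypothetical production function for output 2."""
    return math.sqrt(x)


def transformation(x, y1, steps=10_000):
    """Grid-search version of problem (6): largest output 2 given x and a floor y1 on output 1."""
    best = None
    for i in range(steps + 1):
        x_to_1 = x * i / steps      # input allocated to output 1
        x_to_2 = x - x_to_1         # remainder allocated to output 2
        if f1(x_to_1) >= y1:        # the requirement on output 1 is met
            value = f2(x_to_2)
            best = value if best is None else max(best, value)
    return best                     # None if y1 cannot be produced from x


x = 4.0
frontier = [(y1, transformation(x, y1)) for y1 in (0.0, 0.5, 1.0, 1.5, 2.0)]
for y1, y2 in frontier:
    print(f"y1 = {y1:.1f}  ->  maximal y2 = {y2:.3f}")
```

The computed values fall as y1 rises, which is the nonpositive trade-off between rival outputs when the input vector is held fixed.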

In the general case of m outputs, the technology can be represented by a
single output distance function, given the individual production functions $f^j$ for
$j = 1, \dots, m$, as follows:

$$D_O(x, y) := \min_{\lambda,\, x^{Y_1}, \dots, x^{Y_m}} \Bigl\{ \lambda > 0 \Bigm| \frac{y_j}{\lambda} \le f^j(x^{Y_j}) \;\; \forall\, j = 1, \dots, m \;\wedge\; \sum_{j=1}^{m} x^{Y_j} \le x \Bigr\},$$

so that

$$\langle x, y\rangle \in T \iff D_O(x, y) \le 1.$$

It can be shown that DO is nondecreasing in the outputs.
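
As an illustration, consider an assumed special case that is not taken from the chapter: a single input (n = 1) and the hypothetical production functions $f^j(x^{Y_j}) = \sqrt{x^{Y_j}}$ for every j, with x > 0 and y ≠ 0. Under these assumptions the output distance function has a closed form:

```latex
% Assumed special case: n = 1 and f^j(x^{Y_j}) = \sqrt{x^{Y_j}} for every j.
\begin{align*}
\frac{y_j}{\lambda} \le \sqrt{x^{Y_j}}
  &\iff x^{Y_j} \ge \frac{y_j^2}{\lambda^2}, \\
\text{a feasible split with } \sum_{j=1}^{m} x^{Y_j} \le x \text{ exists}
  &\iff \sum_{j=1}^{m} \frac{y_j^2}{\lambda^2} \le x
  \iff \lambda \ge \sqrt{\frac{\sum_{j=1}^{m} y_j^2}{x}}, \\
D_O(x, y) &= \sqrt{\frac{\sum_{j=1}^{m} y_j^2}{x}} .
\end{align*}
```

In this special case $D_O$ is nondecreasing in each $y_j$, and $D_O(x, y) \le 1$ holds exactly when $\sum_j y_j^2 \le x$.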

Given an $x \in \mathbb{R}^n_+$, the set of weakly efficient output vectors is14

$$WP(x) = \{ y \in \mathbb{R}^m_+ \mid D_O(x, y) = 1 \}.$$

An output vector is strictly efficient given an input vector x if it can be produced
with input vector x, and there exists no other output vector that can be produced
with x containing a larger amount of at least one output and no smaller amount of
any other output. Let $\hat{P}(x)$ denote the set of all strictly efficient output vectors given
input vector x.
If $f^j$ is a differentiable (hence continuous) and increasing function for all
$j = 1, \dots, m$, then $D_O$ is increasing and differentiable in the outputs
($\partial D_O(x, y)/\partial y_j > 0$ for all j) and $WP(x) = \hat{P}(x)$, that is, the sets of weakly and strictly efficient
output vectors contingent on an input vector coincide. Further, it follows from the
implicit function theorem that there exists a continuous function, $\hat{F} : \mathbb{R}^{n+m-1}_+ \to \mathbb{R}_+$,
such that the implicit production function $D_O$ can be solved to express the level
of the jth output as an explicit function of the levels of all the remaining goods; i.e.,

$$D_O(x, y) = 1 \iff y_j = \hat{F}(x, y_{-j}),$$

where y−j is the vector of all outputs other than the j th. Thus, the efficient set of
outputs P̂ (x) has a continuum of points, i.e., given an input vector x, there exists
a rich menu of efficient output combinations that can be produced. Further, the
implicit function theorem also implies that the trade-off between the outputs along
the strictly efficient frontier P̂ (x) is negative and is given by

14 Given input vector x, y is a weakly efficient vector of outputs if there exists no other output
vector that can be produced by input vector x with larger amounts of all outputs.

Fig. 2 Production possibility frontier: rival outputs

$$\frac{\partial y_j}{\partial y_{j'}} = -\frac{\partial D_O(x, y)/\partial y_{j'}}{\partial D_O(x, y)/\partial y_j} < 0, \quad \forall\, j \ne j'.$$

The famous guns and butter example in the classic textbook by Paul Samuelson and
William Nordhaus [52] is an example of rival production: the more guns produced,
the fewer resources are available for producing the other good, butter.
Figure 2 illustrates the set of strictly efficient output vectors when the input vector
is held fixed at x̄ for the case when m = 2. This is given by the set

$$\hat{P}(\bar{x}) = \{ y \in \mathbb{R}^2_+ \mid F(\bar{x}, y_1) = y_2 \},$$

where ∂F(x̄, y1 )/∂y1 < 0. The diagram shows that there are many ways of
allocating input vector x̄ efficiently between the production of the two outputs.
The negative slope of the strictly efficient frontier implies that, as greater amounts
of inputs are allocated to production of good 1, less and less are available for
production of good 2.
Thus, when the production of outputs is rival, it is possible to represent the
technology by a single production function. Holding input levels fixed, there is
a continuum of efficient output combinations, and the trade-off between any two
outputs along the efficient frontier of the technology is nonpositive.

Joint Production of Outputs

The following definition of joint production of outputs is equivalent to the concept
of input price non-jointness defined in Kohli [32]. See also Chambers [9], p. 289.
Definition. T jointly produces outputs 1, . . . , m if there exist production functions
f j : Rn+ −→ R+ , one for every output, such that

$$\langle x, y\rangle \in T \iff y_j \le f^j(x), \quad j = 1, \dots, m.$$

Intuitively, if a production unit employs a given vector of inputs, then the same
vector of inputs is available for the production of each of its economic outputs.

Fig. 3 Production possibility frontier: joint outputs

Thus, in contrast to rival production of outputs, the amounts of inputs are not shared/
divided among the various lines of production of the unit; rather, they are equally
available to all lines of production. Frisch [30] and Førsund [27,28] provide real-life
examples of joint production. A sheep as an input jointly produces milk, wool, and
mutton. A chicken jointly yields both eggs and poultry meat.
If, for $j = 1, \dots, m$, the production function $f^j$ is increasing in the inputs, the
set of strictly efficient output vectors for a given input vector is a singleton15:

$$\hat{P}(x) = \{ \langle f^1(x), \dots, f^m(x)\rangle \}.$$

Thus, in contrast to the case of rival production, there is no trade-off in the
production of the outputs along an efficient frontier with fixed input quantities.
Rather, there is a positive correlation in the production of various outputs: if $f^j$ is
increasing in inputs for all j, then as input amounts increase, the unique efficiently
produced vector of outputs becomes larger, i.e.,

$$\bar{x} > x \;\wedge\; y = \langle f^1(x), \dots, f^m(x)\rangle \;\wedge\; \bar{y} = \langle f^1(\bar{x}), \dots, f^m(\bar{x})\rangle \implies \bar{y} > y.$$

Figure 3 illustrates a case of joint production of two outputs. Shown in the
diagram are production possibility sets for three input vectors satisfying $\bar{x} \ge x' \ge x$.
The unique strictly efficient points of $P(x)$, $P(x')$, and $P(\bar{x})$ are
$\langle y_1, y_2\rangle = \langle f^1(x), f^2(x)\rangle$, $\langle y_1', y_2'\rangle = \langle f^1(x'), f^2(x')\rangle$, and
$\langle \bar{y}_1, \bar{y}_2\rangle = \langle f^1(\bar{x}), f^2(\bar{x})\rangle$,
respectively.
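
The contrast with rival production can be made concrete with a small sketch. It assumes a single input (say, the number of sheep) and two hypothetical increasing production functions; the functional forms and function names are illustrative assumptions, not the chapter's.

```python
import math


def f1(x):
    """Hypothetical production function for output 1 (e.g., milk)."""
    return 2.0 * math.sqrt(x)


def f2(x):
    """Hypothetical production function for output 2 (e.g., wool)."""
    return 0.5 * x


def in_output_set(x, y1, y2):
    """Joint production: each output has access to the whole input x."""
    return 0.0 <= y1 <= f1(x) and 0.0 <= y2 <= f2(x)


def strictly_efficient_point(x):
    """With increasing f1 and f2, the strictly efficient set is this single point."""
    return (f1(x), f2(x))


points = [strictly_efficient_point(x) for x in (1.0, 4.0, 9.0)]
print(points)  # [(2.0, 0.5), (4.0, 2.0), (6.0, 4.5)]
```

For each input level there is one efficient output vector, and that vector grows in both components as the input grows, so no trade-off arises along a fixed-input frontier.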

15 For example, there is a unique efficient combination of milk and wool that a single sheep can
produce, for it can produce only a certain maximal amount of milk and a certain maximal amount
of wool. In general, it seems realistic to assume that there is no trade-off in the production of milk
and wool by a sheep.

Multi-equation Modeling: The Case of Factorially Determined Multi-output Production

Extending the framework in sections “Treating Pollution as a Conventional Production
Output,” “Treating Pollution as a Conventional Production Input,” and “Weakly
Disposable Technologies,” we henceforth assume that there are n inputs of which
$n_z$ are emission-causing, while the remaining $n - n_z =: n_o$ are non-emission-causing.
The input vector $x \in \mathbb{R}^n_+$ is partitioned as $\langle x_z, x_o\rangle$, where $x_z \in \mathbb{R}^{n_z}_+$
is the vector of use of emission-generating inputs, while $x_o \in \mathbb{R}^{n_o}_+$ is the vector
of use of non-emission-generating inputs. Inputs continue to be indexed by i or,
alternatively, when the partition into emission-causing and non-emission-causing
inputs is relevant, by $z_i$ or $o_i$.16 We assume that there are m economic outputs
(indexed by j) and $m'$ emissions (indexed by k); the respective quantity vectors
are denoted by $y \in \mathbb{R}^m_+$ and $z \in \mathbb{R}^{m'}_+$. Let $t := n + m + m'$.
Viewing the production of emissions and economic outputs as a clear case of
joint production, Førsund [27, 28] argues that the particular multi-equation model
of Frisch [30] that is best suited for modeling emission-generating technologies is
the case Frisch called “factorially determined multi-output production,” where there
is joint production of all economic outputs and emissions. He specifically suggests
the following multi-equation system:

yj = Fj (x1 , . . . , xn ), j = 1, . . . , m, and
zk = Gk (x1 , . . . , xn ), k = 1, . . . , m , (7)

where, for all $j$ and all $k$, $F^j$ and $G^k$ are differentiable functions with derivatives
satisfying $F^j_{x_i}(x_1, \dots, x_n) \geq 0$ for all $i = 1, \dots, n$, $G^k_{x_{z_i}}(x_1, \dots, x_n) \geq 0$ for all
$i = 1, \dots, n_z$, and $G^k_{x_{o_i}}(x_1, \dots, x_n) \leq 0$ for all $i = 1, \dots, n_o$. That is, the signs of
the derivatives of $F^j$ with respect to inputs imply that the marginal products of all
inputs in the production of the economic outputs are nonnegative. The signs of the
derivatives of $G^k$ imply that emission-causing inputs (weakly) increase emissions,
while non-emission-causing inputs (called service inputs by Førsund) (weakly)
decrease emissions.
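The sign restrictions attached to system (7) can be checked numerically for any candidate pair of functions. The following is a minimal sketch, assuming one emission-causing input, one service input, one economic output, and one emission; the functional forms and the names `economic_output`, `emission`, and `satisfies_sign_conditions` are hypothetical illustrations and are not taken from Førsund or Frisch.

```python
def economic_output(x_z, x_o):
    """Hypothetical F: Cobb-Douglas output from a fuel x_z and a service input x_o."""
    return (x_z ** 0.4) * (x_o ** 0.6)


def emission(x_z, x_o):
    """Hypothetical G: emission rises with the fuel and falls with the service input."""
    return x_z / (1.0 + x_o)


def partial(fn, x_z, x_o, wrt, h=1e-6):
    """Central finite-difference partial derivative with respect to 'z' or 'o'."""
    if wrt == "z":
        return (fn(x_z + h, x_o) - fn(x_z - h, x_o)) / (2 * h)
    return (fn(x_z, x_o + h) - fn(x_z, x_o - h)) / (2 * h)


def satisfies_sign_conditions(x_z, x_o):
    """Check the sign restrictions on the derivatives of F and G at one input bundle."""
    return (
        partial(economic_output, x_z, x_o, "z") >= 0      # marginal product of fuel
        and partial(economic_output, x_z, x_o, "o") >= 0  # marginal product of service input
        and partial(emission, x_z, x_o, "z") >= 0         # fuel (weakly) raises emission
        and partial(emission, x_z, x_o, "o") <= 0         # service input (weakly) lowers it
    )
```

In this sketch both outputs are pinned down by the input bundle alone, so for a fixed bundle there is no menu of alternative output combinations; this is the "factorially determined" feature that the subsequent discussion questions.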
As discussed above (in section “Treating Pollution as a Conventional Production
Input”), Førsund [27, 28] argues that the single-equation input approach of Baumol
and Oates to modeling an emission-generating technology does not reveal the under-
lying purification/abatement activities that explain the positive relation between
emissions and economic activities. He goes on to argue that purification activities
can be embedded in the technology when it is modeled by equation system (7). In

16 Thus, $x_{z_i}$ refers to the amount of the $i$th emission-causing input for $i = 1, \dots, n_z$, and $x_{o_i}$ refers to the amount of the $i$th non-emission-causing input for $i = 1, \dots, n_o$.

12 Bad Outputs 501

particular, he assumes that some or all service inputs such as labor and capital can
be employed to mitigate emissions, an assumption reflected in the nonpositive signs
of the derivatives of the functions $G^k$, $k = 1, \dots, m'$, with respect to service inputs.
But the problem with adopting a full-fledged joint production approach to multi-
equation modeling of a technology producing multiple economic outputs and also
engaging in abatement activities is that it fails to recognize that not only are
many economic outputs (such as guns and butter in the classic example of Paul
Samuelson) rival in production but that the production of abatement activities and
economic outputs are also rival. If the economic unit employs a vector x of inputs,
it may have to share these resources in the production of many of its economic
outputs, so that if inputs are diverted to the production of some economic output, a
lesser amount of the input vector is available for the production of its other economic
outputs. But the formulation (7) assumes that the input vector x is jointly and equally
available across all lines of production.
Similarly, when service inputs are employed by the economic unit for mitigating
its emissions, lesser amounts of these inputs are available for the production of its
economic outputs. This explains why a cost minimizing/profit maximizing produc-
ing unit diverts no resources to abatement activities when it is unregulated. The
purpose of regulation is to force a production unit to internalize abatement activities
in its operational calculus. Profit maximization, which implies minimization of
abatement expenditure, requires it not only to choose the aggregate levels of inputs
to purchase and use but also to simultaneously choose the optimal split of the
purchased input quantities between economic output production and abatement
activities.
Contrast this description with Førsund’s (2017, p. 18) approach, in which inputs
going into abatement do not come from a common pool of resources of the
producing unit. Rather, it is recommended that abatement and economic production
be treated as separate “profit centres.” Given the arguments above, however, this
may not be realistic when a producing unit engages in both economic output
production and abatement activities. For example, scrubbing activities form an
integral part of several regulated thermal plants, where SO2 emissions produced
are instantly subjected to treatment. If thermal power plants were unregulated, they
would fail to undertake scrubbing, as it eats into their profits. Under regulation,
profit maximization internalizes scrubbing costs as scrubbing activities are vertically
integrated into (i.e., become a part of) the production structure.

Multi-equation Modeling: The Case of Rival and Joint Production

In contrast to the pure case of joint production (or equivalently, the factorially
determined multi-output production) discussed above, MR and MRL propose a
multi-equation model that allows rival production of economic outputs on the one
hand and joint production of economic outputs and emissions on the other. They
call this approach to modeling emission-generating technologies the by-production approach.17
Murty [41] and Murty and Russell [43] argue that there is no unique model that
can encompass all emission-generating technologies. Models must vary depending
upon case-specific characteristics of emission generation and economic output pro-
duction. They argue further, however, that the by-production approach encompasses
production relations that can characterize all cases. These will generally be of
two types: (i) those that describe the production of the economic and abatement
outputs and (ii) those based mainly on considerations such as the mass balance
conditions that (a) relate generation of emissions to emission-causing inputs used in
the production of the economic and abatement outputs and (b) describe mitigation
of emissions by abatement activities. Each of these production relations describes a
sub-technology with its own disposability features. The overall emission-generating
technology is obtained as an intersection of these sub-technologies; i.e., it contains
production vectors that satisfy all the production relations in (i) and (ii). Dispos-
ability properties of the overall technology are engendered by the disposability
properties of its sub-technologies.
In the simple by-production technology studied in MR and MRL, only one
type of emission is produced (i.e., $m' = 1$), and it is generated because the
production unit uses a particular input that is known to be a natural cause of this
emission. Denote the quantity of this input by $x_z$. There are only two inputs, and
the other input is non-emission causing. Denote the quantity of this input by $x_o$.
In this section, “Multi-equation Modeling: The Case of Factorially Determined
Multi-output Production”, for simplicity of exposition, we retain these assumptions
and assume, in addition, that more than one type of economic output is produced
(i.e., m > 1) and that there is rivalry in the production of economic outputs.18
This model also assumes that the production unit does not engage in explicit
abatement activities.19 Yet, MR and MRL show that this simplified model yields
a positive relation between the emission and the economic outputs. This relation
is based purely on the fact that the use of the emission-causing input affects
both economic output production and emission generation, resulting in a positive
correlation between these two types of outputs.

The Technology Producing Economic Outputs

The first sub-technology is a standard technology restricting the allowable combi-
nations of economic outputs and conventional inputs. It represents the production

17 This model is also extended by MR and MRL to include abatement activities, production of
which is rival to the production of economic outputs. We study this model in section “Multi-equation Modeling of Emission-Generating Technologies with Abatement Activities and Multiple
Emissions”.
18 This model can be generalized to encompass the case where some economic outputs are jointly produced.
19 Extensions of the model to include such activities are studied in section “Multi-equation Modeling of Emission-Generating Technologies with Abatement Activities and Multiple Emissions”.

relation of human engineering design. The formulation below assumes that the
emission by the unit does not affect the production of its economic outputs20 :

$$T_1 = \left\{ \langle x_z, x_o, y, z \rangle \in \mathbf{R}^t_+ \;\middle|\; f(x_z, x_o, y) \leq 0 \right\}, \tag{8}$$

where the implicit production function $f$ is differentiable and satisfies $f_{x_i}(x_z, x_o, y) \leq 0$
for $i = z, o$, and $f_{y_j}(x_z, x_o, y) \geq 0$ for all $j = 1, \dots, m$. These
monotonicity conditions, together with the sign of the inequality constraint in (8),
imply the following standard neoclassical disposability conditions for inputs and
economic outputs:

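$$\begin{aligned}
\langle x_z, x_o, y, z \rangle \in T_1 \wedge \bar{x}_z \geq x_z &\implies \langle \bar{x}_z, x_o, y, z \rangle \in T_1 \\
\langle x_z, x_o, y, z \rangle \in T_1 \wedge \bar{x}_o \geq x_o &\implies \langle x_z, \bar{x}_o, y, z \rangle \in T_1 \\
\langle x_z, x_o, y, z \rangle \in T_1 \wedge \bar{y} \leq y &\implies \langle x_z, x_o, \bar{y}, z \rangle \in T_1.
\end{aligned} \tag{9}$$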

The signs of the derivatives of $f$ imply that, along the frontier of sub-technology
$T_1$ (i.e., the set of production vectors satisfying $f(x_z, x_o, y) = 0$), standard trade-offs
between goods hold. In particular, if $f_{y_j}(x_z, x_o, y) > 0$ for some $j = 1, \dots, m$,
the implicit function theorem implies that, holding inputs fixed, there is a nonpositive
trade-off among economic outputs:

$$\frac{\partial y_j}{\partial y_{j'}} = -\frac{f_{y_{j'}}(x_z, x_o, y)}{f_{y_j}(x_z, x_o, y)} \leq 0 \quad \forall\, j' = 1, \dots, m,\ j' \neq j.$$

The implicit production function $f$ is similar to the output distance function $D_O$
derived in the discussion on rival production in section “Rival vs. Joint Production
of Multiple Outputs.” It is clear that the above trade-offs among economic outputs
imply that, under the maintained assumptions, sub-technology T1 exhibits rival
production of economic outputs. Holding inputs fixed, the greater the production of
some economic outputs, the lesser will be the production of the remaining economic
outputs along the frontier of the technology set.
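The rivalry implied by the implicit function theorem can be illustrated numerically. The sketch below assumes a hypothetical implicit production function with two economic outputs, $f(x_z, x_o, y) = \sqrt{y_1^2 + y_2^2} - \sqrt{x_z x_o}$, which satisfies the sign conventions above for positive quantities; the function names are illustrative only and are not part of MR or MRL.

```python
import math


def f(x_z, x_o, y):
    """Hypothetical implicit production function; f <= 0 means the vector is in T1."""
    return math.hypot(*y) - math.sqrt(x_z * x_o)


def in_T1(x_z, x_o, y, tol=1e-12):
    """Membership test for the intended-production sub-technology (emission is unrestricted)."""
    return f(x_z, x_o, y) <= tol


def output_tradeoff(x_z, x_o, y, j, j_prime, h=1e-6):
    """Frontier slope dy_j/dy_j' = -f_{y_j'} / f_{y_j}, using central finite differences."""
    def f_y(index):
        up, down = list(y), list(y)
        up[index] += h
        down[index] -= h
        return (f(x_z, x_o, up) - f(x_z, x_o, down)) / (2 * h)

    return -f_y(j_prime) / f_y(j)
```

With inputs fixed at $x_z = x_o = 5$, the output pair $(3, 4)$ lies on the frontier of this hypothetical technology, and the computed slope is negative there: producing more of one economic output requires giving up some of the other.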

The Emission-Generating Mechanism

The second sub-technology, $T_2 \subset \mathbf{R}^t_+$, links the emission generation to its various
causes in nature. Emissions are generated because many processes producing
marketable outputs necessarily require the use of emission-causing inputs,21 and
many components of these inputs are not fully transferred to the good outputs
during the process of production. Rather, some amounts of these components are

20 This feature is generalized in Murty [41], where emissions of a unit can affect its economic
output production detrimentally or beneficially.
21 This follows from the second (entropy) law of thermodynamics. See, for instance, Baumgärtner and Arons [6] and Baumgärtner [5].

transformed into other outputs (wastes), many of which are harmful to society.22
The exact amounts of emissions produced depend also on the physical conditions
and parameters under which the production takes place, some of which may
be unobservable to the researcher. Thus, the set T2 embodies nature’s emission-
generating mechanism.
In general, one expects that the material balance condition would imply a positive
relation between the use of emission-causing input and the generation of emission
along the frontier of the sub-technology T2 . To obtain additional insights into the
structure of this sub-technology, we begin by describing the disposability properties
of this set. We show that, given these disposability properties, the function that best
represents this sub-technology implies a positive relation between the emission-
causing input and the emission along the frontier.
The following disposal properties are assumed by MR and MRL for sub-
technology T2 :

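$$\begin{aligned}
\langle x_z, x_o, y, z \rangle \in T_2 \wedge \bar{x}_z \leq x_z &\implies \langle \bar{x}_z, x_o, y, z \rangle \in T_2 \\
\langle x_z, x_o, y, z \rangle \in T_2 \wedge \bar{z} \geq z &\implies \langle x_z, x_o, y, \bar{z} \rangle \in T_2 \\
\langle x_z, x_o, y, z \rangle \in T_2 \wedge \bar{y} \neq y \wedge \bar{x}_o \neq x_o &\implies \langle x_z, \bar{x}_o, \bar{y}, z \rangle \in T_2.
\end{aligned} \tag{10}$$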

The last assumption in (10) restricts the generation of emission to the use of
emission-causing inputs, as it implies that, ceteris paribus, arbitrary changes in
the levels of economic outputs and non-emission-causing inputs have no effect on
the generation of emission. Thus, the by-production technology described here is
not applicable to cases where the emissions are generated by outputs rather than
inputs.23
As discussed in section “Treating Pollution as a Conventional Production Input,”
the emission-causing input is not freely disposable: the quantity generated of the
emission might not remain unchanged if the use of this input increases. This feature
of the technology is reflected in the first condition in (10). This restriction, the
polar opposite of standard free disposability of inputs, is called costly disposability
of the emission-causing input. In contrast to free input disposability, it says that if
quantity xz of the emission-causing input produces amount z of emission, a lower
use of this input can also continue producing this amount of emission. This reflects
inefficiencies in the functioning of the emission-generating mechanism.24 This will
be true, for example, if production takes place under physical conditions (or other
unobservable parameters) that are not conducive to minimizing emission generation.

22 This follows from the first law of thermodynமics (equivalently, the material-balance condition).
23 See Murty [41] for the case where emissions can also be generated by the economic output,
once it has been produced.
24 In contrast, when this mechanism works efficiently, lowering use of the emission-generating
input will lower the emission level. For example, if coal is burnt in an efficient manner, a lower use
of coal implies a lower emission of CO2.
In section “Single-Equation Modeling of the Technology Under Standard Disposability Assumptions,” we noted that emission is an output that does not satisfy
standard free output disposability: ceteris paribus, reductions in the emission come
at the cost of reductions in the production of economic outputs. The second condition
in (10) is the polar opposite of standard free disposability of outputs and is
therefore called costly disposability of emission. More intuition on this assumption
will be provided in section “Axiomatic Approach to Modeling Emission-Generating
Technologies” of this chapter.25 But we note here that this assumption permits
inefficiencies in emission generation: if a given amount of this input generates a
certain amount of emission, then owing to inefficiencies caused by unfavorable
physical and other unobservable conditions, this input quantity could also generate
more emission.
Now define the function $\hat{g} : \mathbf{R}^{t-1}_+ \to \mathbf{R}_+$ with image

$$\hat{g}(x_z, x_o, y) := \min\{ z \geq 0 \mid \langle x_z, x_o, y, z \rangle \in T_2 \}. \tag{11}$$

Since (10) implies that emission generation is not caused by and hence is
unaffected by changes in the economic outputs and the non-emission-generating
input, the image of the minimum emission function $\hat{g}$ can be redefined as

$$\hat{g}(x_z, x_o, y) =: g(x_z).$$

The second costly disposability assumption in (10) implies that

$$\langle x_z, x_o, y, z \rangle \in T_2 \iff z \geq \hat{g}(x_z, x_o, y) \equiv g(x_z).$$

Hence, $T_2$ can be functionally represented as

$$T_2 = \left\{ \langle x_z, x_o, y, z \rangle \in \mathbf{R}^t_+ \;\middle|\; z \geq g(x_z) \right\}. \tag{12}$$

We now show that under the first costly disposal condition in (10), $g$ is nondecreasing
in the use of the emission-causing input.
Sketch of proof. Suppose $z = \hat{g}(x_z, x_o, y) = g(x_z)$ and $\bar{x}_z \leq x_z$. Then
$\langle x_z, x_o, y, z \rangle \in T_2$, and costly disposability of the emission-causing input in (10)
implies that $\langle \bar{x}_z, x_o, y, z \rangle \in T_2$. Thus, (12) implies that $z \geq g(\bar{x}_z)$. But this implies
$g(x_z) = z \geq g(\bar{x}_z)$.
Thus, the costly disposability assumptions in (10) imply that the efficient
frontier of the emission-generating set can be represented functionally by employing
the function $g$ and that, along this frontier, emission is positively related to its natural
cause.
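The two costly disposability conditions and the resulting monotonicity of $g$ can be made concrete with a small sketch. It assumes a hypothetical linear minimum-emission function $g(x_z) = 0.5\,x_z$; the constant `EMISSION_FACTOR` and the function names are illustrative assumptions rather than part of the MR and MRL formulation.

```python
EMISSION_FACTOR = 0.5  # hypothetical: minimum emission per unit of the emission-causing input


def g(x_z):
    """Hypothetical minimum emission generated by x_z units of the emission-causing input."""
    return EMISSION_FACTOR * x_z


def in_T2(x_z, x_o, y, z):
    """Membership in the emission-generating sub-technology: z >= g(x_z).

    x_o and y are accepted but deliberately ignored, mirroring the third
    condition in (10): they play no role in emission generation.
    """
    return z >= g(x_z)
```

Starting from a point with $x_z = 4$ and $z = 2$ on the frontier of this hypothetical set, lowering $x_z$ or raising $z$ keeps the vector in the set (costly disposability), whereas raising $x_z$ with $z$ unchanged takes it outside.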

25 See also Murty [41] and MR.

An alternative formulation of the set $T_2$ can be found in Ray, Mukherjee, and Venkatesh [47].
To capture the positive relation between emission and emission-causing
inputs along the frontier of the technology $T_2$, they assume that this
set satisfies weak disposability of emissions and emission-causing inputs; i.e.,
emissions can be reduced in tandem with emission-causing inputs. With no further
disposability assumptions on emissions, however, this formulation could lead to
cases where the frontier of T2 has local regions with negative slopes.26 This problem,
however, can be solved if costly disposability of emissions is assumed in addition to
this weak disposability assumption.

The Overall Emission-Generating Technology

The overall by-production technology is the intersection of the two sub-technologies:

$$T_B := T_1 \cap T_2 \equiv \left\{ \langle x_z, x_o, y, z \rangle \in \mathbf{R}^t_+ \;\middle|\; f(x_z, x_o, y) \leq 0 \wedge z \geq g(x_z) \right\}. \tag{13}$$

The efficient frontier of this set comprises all production vectors $\langle x_z, x_o, y, z \rangle \in \mathbf{R}^t_+$
that simultaneously satisfy equations

$$f(x_z, x_o, y) = 0 \wedge z = g(x_z). \tag{14}$$

Since all inputs are (potentially27 ) shared in the production of the economic outputs,
technology TB exhibits rivalry in the production of these goods, implying that, when
all inputs (including emission-causing inputs) are held fixed, there is a menu of
efficient combinations of economic outputs. At the same time, there also exists
a unique minimal level of emission. This is because the emission-causing input
independently influences economic output production and emission generation. It
is shared in the production of economic outputs but results in a unique minimal
level of emission. Thus, TB also exhibits jointness in economic output production
and emission generation.
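The coexistence of rivalry among economic outputs and jointness between outputs and the emission can be seen in a short sketch of the intersection in (13). The functional forms for `f` and `g` below are hypothetical (a quarter-circle output frontier and a linear minimum-emission function), chosen only so that the frontier can be traced in closed form.

```python
import math


def f(x_z, x_o, y):
    """Hypothetical implicit production function for two economic outputs."""
    return math.hypot(y[0], y[1]) - math.sqrt(x_z * x_o)


def g(x_z):
    """Hypothetical minimum-emission function."""
    return 0.5 * x_z


def in_TB(x_z, x_o, y, z, tol=1e-9):
    """Membership in the by-production technology: both sub-technology constraints hold."""
    return f(x_z, x_o, y) <= tol and z >= g(x_z) - tol


def efficient_menu(x_z, x_o, n_points=5):
    """Efficient (y, z) pairs for fixed inputs: y varies along the frontier, z does not."""
    radius = math.sqrt(x_z * x_o)
    menu = []
    for i in range(n_points):
        theta = (math.pi / 2) * i / (n_points - 1)
        y = (radius * math.cos(theta), radius * math.sin(theta))
        menu.append((y, g(x_z)))
    return menu
```

For fixed inputs, every entry returned by `efficient_menu` has a different output mix but the same emission level, which is the sense in which the emission is jointly produced with the whole menu of rival economic outputs.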
The panels of Fig. 4 illustrate the structure of a by-production technology. Panels
(a), (b), and (c) are drawn under the assumption that there is one good and one bad
output and only one emission-causing input. Since sub-technology T1 is independent
of emission generation and sub-technology T2 is independent of the production
of the good output, the former is depicted in the restricted space of the input and
the good output in panel (a), while the latter is depicted in the restricted space of
the input and the bad output in panel (b). As drawn, it is clear that T1 satisfies
input and output free disposability as defined in (9), while $T_2$ satisfies costly
disposability of the emission-causing input and the emission as defined in (10).
Hence, the production point $\langle \bar{x}, y', z' \rangle$ is in technology $T_B = T_1 \cap T_2$. However, the

26 This problem is similar to that encountered in the output approach to emission modeling, which
assumes weak disposability of emissions and good outputs.
27 Obviously, this specification could be specialized to restrict the use of some inputs.
Fig. 4 By-production technology ($n = m = m' = 1$)

maximum producible amount of the good output and the minimum emission that
can be generated when the input level is held fixed at $\bar{x}$ are $\bar{y}$ and $\bar{z}$, respectively.
In panel (c), $\langle \bar{z}, \bar{y} \rangle$ is the sole strictly efficient point of the production possibility
set $P(\bar{x})$. Compare this with Fig. 1, where emission is treated as a freely disposable
input and the set $P(\bar{x})$ has many strictly efficient points. Panels (a) and (b) show
that, as the level of the input increases from $\bar{x}$ to $x^*$, the maximum level of the
good output and the minimum level of the bad output that are producible increase
to $y^*$ and $z^*$, respectively. The shift in the production possibility set resulting from
the increase in the level of the input is seen in panel (c) of Fig. 4. The sole strictly
efficient point of the set $P(x^*)$ is $\langle z^*, y^* \rangle$.
Panel (a) of Fig. 5 illustrates the strictly efficient frontier of the production
possibility set $P(\bar{x})$ for the case of two good outputs and a single bad output. The
strictly efficient frontier, denoted $\hat{P}(\bar{x})$, also shows the rivalness in the production
of the two good outputs as well as the by-production of the good and bad
outputs. Given input vector $\bar{x}$, the figure shows that there are a number of possible
combinations of the two good outputs: the greater the amount of the good output $y_1$ produced
given a fixed vector of inputs $\bar{x}$, the smaller the amount of the good output $y_2$
produced. The maximum amount of the first (respectively, second) good output that
Fig. 5 (a) Strictly efficient frontier with two good outputs and one bad output and (b) input possibility set with two inputs and one bad output

can be produced from x̄ is ȳ1 (respectively, ȳ2 ). However, given the input vector x̄,
there is only one feasible level of the bad output z̄ that is generated.
Panel (b) of Fig. 5 illustrates the input possibility sets for emission generation for
the case where there are one bad output and two inputs, both of which are emission-causing;
i.e., $m' = 1$ and $n = n_z = 2$. The diagram illustrates that both inputs satisfy
costly disposability in the production of the emission and that the input possibility
sets expand as the level of the emission increases.
The disposability properties of TB are derived from those of the sub-technologies.
Since T1 satisfies standard free disposability with respect to the economic outputs
and non-emission-causing inputs and the constraint defining T2 is independent
of quantities of these goods (see the third condition in (10)), TB also satisfies
standard free disposability with respect to these goods. But because T1 satisfies free
input disposability with respect to the emission-causing input, while T2 violates
this condition, instead satisfying costly disposability, TB does not satisfy free
disposability with respect to this input. Recall that this is predicted in the latter
part of section “Treating Pollution as a Conventional Production Input.”
The trade-offs among goods along the efficient frontier of $T_B$ can be obtained by
applying the implicit function theorem to (14). As the number of outputs including
the emission is $m + 1$, the degree of assortment (using Frisch’s terminology) in
equation system (14) is $m - 1$ (the number of outputs minus the number of
equations). Thus, if $f_{y_j}(x_z, x_o, y) > 0$ for $j = 1, \dots, m$, there exists an explicit
function $F : \mathbf{R}^{m+1}_+ \to \mathbf{R}_+$ such that $F(x_z, x_o, y_{-j}) = y_j \iff f(x_z, x_o, y) = 0$,
and equation system (14) can be written as:
$$y_j = F(x_z, x_o, y_{-j}) \wedge z = g(x_z). \tag{15}$$

Moreover, if the derivative of $g$ is positive, we can invert to solve for $x_z$ as a
function of the emission level $z$:

$$z = g(x_z) \iff x_z = h(z).$$

Substitution into the first equation in (15) then yields

$$y_j = F(h(z), x_o, y_{-j}).$$

Thus, the trade-off between the $j$th economic output and the emission along the
frontier of $T_B$ is positive (under our maintained sign convention for derivatives of
functions $f$ and $g$):

$$\frac{\partial y_j}{\partial z} = F_{x_z}(h(z), x_o, y_{-j})\, h'(z) = -\frac{f_{x_z}(h(z), x_o, y)}{f_{y_j}(h(z), x_o, y)}\, h'(z) > 0,$$

as suggested by our intuition.
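The positive trade-off can be verified numerically. The sketch below assumes hypothetical forms: $g(x_z) = 0.5\,x_z$, so that $h(z) = 2z$, and an explicit frontier function $F(x_z, x_o, y_2) = \sqrt{x_z x_o - y_2^2}$ for the first of two economic outputs. None of these forms comes from MR or MRL; they are chosen only so that the derivative has a simple closed form.

```python
import math


def h(z):
    """Inverse of the hypothetical minimum-emission function g(x_z) = 0.5 * x_z."""
    return 2.0 * z


def F(x_z, x_o, y_other):
    """Hypothetical explicit frontier: maximal y_1 given the inputs and the other output."""
    return math.sqrt(x_z * x_o - y_other ** 2)


def y_along_frontier(z, x_o, y_other):
    """Economic output as a function of the emission level, y_1 = F(h(z), x_o, y_2)."""
    return F(h(z), x_o, y_other)


def tradeoff_dy_dz(z, x_o, y_other, step=1e-6):
    """Central finite-difference estimate of dy_1/dz along the frontier of T_B."""
    upper = y_along_frontier(z + step, x_o, y_other)
    lower = y_along_frontier(z - step, x_o, y_other)
    return (upper - lower) / (2 * step)
```

At $z = 2$, $x_o = 9$, and $y_2 = 4.8$, the closed-form slope for these assumed functions is $x_o / y_1 = 9 / 3.6 = 2.5$, and the finite-difference estimate agrees with it up to numerical error.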

The substitution of the non-emission-causing input for the emission-causing
input affects both economic output production and emission generation. Suppose
the differential changes in the emission-causing and non-emission-causing inputs
are $dx_z < 0$ and $dx_o > 0$ and the effect of these differential changes on economic
production is zero:

$$dy_j = F_{x_z}(x_z, x_o, y_{-j})\, dx_z + F_{x_o}(x_z, x_o, y_{-j})\, dx_o = 0.$$

This implies

$$dx_z = -\frac{F_{x_o}(x_z, x_o, y_{-j})}{F_{x_z}(x_z, x_o, y_{-j})}\, dx_o < 0,$$

indicating a substitution of the non-emission-causing input for the emission-causing
input in the production of the $j$th economic output. The effect that this substitution
has on emission generation is negative:

$$dz = g'(x_z)\, dx_z = -g'(x_z)\, \frac{F_{x_o}(x_z, x_o, y_{-j})}{F_{x_z}(x_z, x_o, y_{-j})}\, dx_o < 0.$$

Thus, the above specification of an emission-causing technology using the by-production
approach, where the cause of emission in nature is attributed solely to
the good used as an input in intended production, yields the correct effect on the
emission when another input that does not cause the emission is substituted for the
emission-causing input.28
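A discrete version of this substitution argument can be sketched as follows. It assumes a hypothetical technology in which the output capacity is $\sqrt{x_z x_o}$ and the minimum emission is $0.5\,x_z$; the function names are illustrative. Moving along an isoquant toward more of the service input lowers the required amount of the emission-causing input, and the emission falls even though the service input never appears in the emission function.

```python
def required_fuel(capacity, x_o):
    """Emission-causing input needed to sustain a given output capacity,
    assuming the hypothetical technology capacity = sqrt(x_z * x_o)."""
    return capacity ** 2 / x_o


def emission(x_z):
    """Hypothetical minimum-emission function g; it depends only on x_z."""
    return 0.5 * x_z


def substitution_effect(capacity, x_o_before, x_o_after):
    """Changes in x_z and in emission when x_o rises and output capacity is held fixed."""
    x_z_before = required_fuel(capacity, x_o_before)
    x_z_after = required_fuel(capacity, x_o_after)
    return x_z_after - x_z_before, emission(x_z_after) - emission(x_z_before)
```

For a capacity of 6, raising the service input from 9 to 12 lowers the emission-causing input from 4 to 3 and the emission from 2 to 1.5 under these assumed forms.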
In order to capture this input substitutability in the factorially determined
multi-output production system (7), Førsund [29, pp. 10–13] includes service
inputs as arguments in the emission-generating functions, $G^k$, $k = 1, \dots, m'$.
The above calculations, however, show that the by-production model captures this
substitutability in the overall technology without the need to include these services
as arguments of the emission-generation function $g$ in (15).

Multi-equation Modeling of Emission-Generating Technologies with Abatement Activities and Multiple Emissions

As argued in section “Multiple-Equation Modeling of Pollution-Generating Technologies,”
the production of economic outputs and abatement of emissions in
treatment plants is rival in nature. Resources diverted toward either of these ends
reduce resources available to meet the other. Moreover, the law of conservation of
mass implies that abatement activities merely transform targeted (usually harmful)
emissions into other forms of “less harmful” or even “useful” matter. These abate-
ment activities might also use inputs that generate additional harmful emissions.
Pethig [46] makes these points and develops a model that includes these aspects. It
is important to note, however, that many of these less harmful emissions generated
during the abatement process are outside the purview of economic policy analysis
and hence often not modeled by the researcher.
Further, generation of multiple emissions can itself be joint or rival. Two
emissions are jointly produced when there is no trade-off in their production for
a given vector of emission-causing inputs. On the other hand, the production of two
emissions is rival if an increase in the generation of one type of emission implies
a decrease in the generation of the other type for a given vector of the emission-
causing inputs.
To illustrate these points, consider the operations of a thermal power plant that
uses coal along with other inputs such as labor and capital to generate electricity
as its economic output. The coal employed generates CO, CO2, and SO2 as the

28 Baumgärtner [5] refers to thermodynamic inefficiencies in the use of fossil fuels. These
inefficiencies arise when the heat generated by the combustion of fossil fuels is not fully (100%)
converted into the desired form of energy (such as electricity) that is required to produce the
economic output. Some of this heat can be lost. Increased use of the service inputs or improvements
in the quality of these inputs, such as large-scale plants or better capital equipment, can reduce
thermodynamic inefficiencies, so that a given amount of fossil fuel can generate a greater amount
of the desired form of energy. A reduction in thermodynamic inefficiencies attributable to better
quality or more use of service inputs hence implies that the same amounts of the economic outputs
can be produced with lower amounts of fossil fuels. At the same time, lower use of fossil fuels,
together with the production relations characterizing the sub-technology T2 , implies lower amounts
of emission generation.
three emissions owing to its carbon and sulfur content. Of these three emissions,
CO and CO2 are rival, since the total carbon content of a given amount of coal is
limited, and, depending upon the availability of oxygen, the greater the production
of CO2, the lesser is the production of CO.29 On the other hand, assuming that coal
contains carbon and sulfur in fixed proportions, SO2 is jointly produced with the
two carbon-based emissions. Suppose, in addition, that the plant has a scrubbing
unit that employs lime or limestone as sorbents to mitigate its sulfur emission. The
use of lime in scrubbing converts a part of the sulfur emission into gypsum, which
is either treated as a marketable by-product by the producing unit or is treated as
a relatively less harmful emission by the researcher. The extent of conversion of
SO2 into gypsum depends on the amount of lime employed and the efficiency of the
scrubbing unit.
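The distinction between rival and joint emissions in this example can be sketched with a simple mass balance. The code assumes, hypothetically, that all carbon in the coal leaves as either CO2 or CO, that all sulfur leaves as SO2 before any scrubbing, and that molar masses are rounded to 12, 16, and 32 g/mol for carbon, oxygen, and sulfur; the function name and the parameter `co2_fraction` (the share of carbon oxidized fully, standing in for oxygen availability) are illustrative.

```python
M_C, M_O, M_S = 12.0, 16.0, 32.0  # rounded molar masses in g/mol


def gross_emissions(coal, carbon_share, sulfur_share, co2_fraction):
    """Gross emission masses from burning `coal` mass units of coal.

    co2_fraction is the (hypothetical) share of carbon oxidized to CO2;
    the remaining carbon is assumed to leave as CO.
    """
    carbon = coal * carbon_share
    sulfur = coal * sulfur_share
    co2 = co2_fraction * carbon * (M_C + 2 * M_O) / M_C
    co = (1.0 - co2_fraction) * carbon * (M_C + M_O) / M_C
    so2 = sulfur * (M_S + 2 * M_O) / M_S
    return {"CO2": co2, "CO": co, "SO2": so2}
```

Holding the coal input fixed, raising `co2_fraction` increases CO2 and decreases CO (rival production), while the gross SO2 mass is unchanged (joint production with the carbon-based emissions).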
In the spirit of this real-world example, we devote this section of the chapter
to the development of another parsimonious model, one entailing two emission-causing
inputs, four types of emissions, and one economic output; i.e., $n_z = 2$,
$m = 1$, and $m' = 4$. An abatement activity helps in mitigating one type (say
the third type) of emission (e.g., scrubbing mitigates SO2 emission), while it is
solely responsible for generating the fourth type of emission because of its use of
the second emission-causing input (e.g., scrubbing leads to production of gypsum,
which we treat as another emission, the fourth). Economic output production
employs the non-emission-causing inputs (e.g., labor and capital) in conjunction
with the first emission-causing input (say coal) to produce thermal electricity. The
use of the first emission-causing input leads to the generation of the first three
types of emissions (say CO2, CO, and SO2). Of these, the first two types of
emissions are rival in nature, while the third type is jointly produced with the other
two types of emissions. Thus, the space of all goods under study has dimension
$t = n_z + n_o + m + m' + 1 = n_o + 8$.
We first model the rival production of abatement and the economic output.
The sub-technology that produces these is the intended-production technology. We
then develop the structure of the sub-technology generating multiple emissions
from emission-causing inputs. The overall technology that produces the economic
output, multiple emissions, and abatement from all inputs is obtained as the set of
production vectors that lie simultaneously in both of these sub-technologies.

Rival Production of Abatement and the Economic Output

Individual Technologies Producing Economic Output and Abatement

The technology that produces the desired economic outputs is represented by the set
$T^Y_1 \subset \mathbf{R}^{n+1}_+$ and consists of production vectors $\langle x^Y_z, x^Y_o, y \rangle =: \langle x^Y, y \rangle \in \mathbf{R}^{n+1}_+$.30

29 If the concentration of oxygen in the air is high, relatively more CO2 is produced, and if it is low,
relatively more CO is produced.
30 Since the second emission-causing input is not employed in the production of the economic
output, $x^Y_{z_2} = 0$ whenever $\langle x^Y_z, x^Y_o, y \rangle \in T^Y_1$.

Let us pause to describe the nature of the output of an abatement technology
aimed at the reduction of a particular type of emission. The net output of a pollution
treatment technology (e.g., a scrubber technology in the case of SO2 emission) is
often measured in terms of the resultant reduction in the “gross” emission level.31
The gross emission of SO2 generated by the combustion of sulfur contained in coal,
say $z_3^g$, is reduced by the end of the scrubbing procedure. Denote this reduction in
the gross amount of SO2 by $a \in \mathbf{R}_+$, so that the “net” emission of SO2 generated
by the producing unit is $z_3 = z_3^g - a$.
The abatement technology employs inputs such as labor, capital, and lime or
limestone to produce reductions in the emission.32 It is clear that there are bounds
on emission reductions given fixed amounts of these inputs.33 For example, the
amount of SO2 reduction from the flue gas depends upon the amount of lime or
limestone used as a sorbent during flue gas desulfurization (FGD).34 Any given
quantity of lime or limestone, along with fixed amounts of the service inputs used
by the abatement technology, fixes the maximal amount of SO2 reduction.
Thus, the abatement technology is defined by relations among all the inputs used
by it and the extent of reduction that is made possible by the use of these inputs.35
Denote the production technology that captures these relations by $T^A_1 \subset \mathbf{R}^{n+1}_+$. This
technology contains production vectors of the form $\langle x^A_z, x^A_o, a \rangle = \langle x^A, a \rangle \in T^A_1$.
We assume that technologies $T^Y_1$ and $T^A_1$ are standard neoclassical technologies
satisfying the following assumptions:

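(T1 C) $T^Y_1$ and $T^A_1$ are non-empty and closed.
(T1 B) The sets $\{ y \in \mathbf{R}_+ \mid \langle x^Y, y \rangle \in T^Y_1 \}$ and $\{ a \in \mathbf{R}_+ \mid \langle x^A, a \rangle \in T^A_1 \}$ are bounded
for all $x^Y \in \mathbf{R}^n_+$ and for all $x^A \in \mathbf{R}^n_+$.
(T1 FD) $\langle x, y \rangle \in T^Y_1 \wedge \bar{x} \geq x \wedge \bar{y} \leq y \implies \langle \bar{x}, \bar{y} \rangle \in T^Y_1$.
$\langle x, a \rangle \in T^A_1 \wedge \bar{x} \geq x \wedge \bar{a} \leq a \implies \langle \bar{x}, \bar{a} \rangle \in T^A_1$.
(T1 SD) $\langle 0^n, 0 \rangle \in T^Y_1$ and $\langle 0^n, 0 \rangle \in T^A_1$.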

31 For example, in the case of the scrubber technology, reductions are usually measured as
percentages of the gross emission.
32 As will be seen in section “Modeling the Generation of Multiple Emissions,” the reduction in

the third emission, say SO2 , is accompanied by an increase in the fourth emission, say gypsum.
This is because, depending on the quantity of the second input (say lime) used, the third emission
is converted into the fourth emission during the abatement process (say scrubbing).
33 As an analogy, a pound of a cleaning powder can clean only a finite amount of dirty surface area.
If used inefficiently, it cleans less than its potential.

34 See, e.g., Srivastava and Jozewicz [54].
35 Hampf [31] provides a network DEA formulation of technology that includes rival production

of abatement. The inputs employed by the abatement technology include standard inputs and
gross emissions, while its output is measured in terms of reductions in emission levels. While net
emissions are observable, gross emissions are computed employing the material balance condition
as the difference between the mass of the emission-generating input used and the mass of these
inputs transferred to the marketable output during production. The difference in the gross and net
emissions is defined as the reduction in the emission levels attributable to abatement.

While Assumption (T1B) implies that the outputs of the economic production
technology $T_1^Y$ and the abatement-producing technology $T_1^A$ are bounded when the
respective vectors of inputs used by these technologies are fixed at $x^Y$ and $x^A$,
Assumption (T1FD) implies that these technologies satisfy standard free disposability
conditions with respect to their respective outputs and inputs. Assumption
(T1SD) says that it is possible to shut down operations of the two technologies.
Under these assumptions, the technologies $T_1^Y$ and $T_1^A$ have functional representations.
Define the functions $f^Y : \mathbf{R}_+^n \to \mathbf{R}_+$ and $f^A : \mathbf{R}_+^n \to \mathbf{R}_+$ by

$$f^Y(x^Y) = \max\{y \ge 0 \mid \langle x^Y, y \rangle \in T_1^Y\} \quad \text{and} \quad
f^A(x^A) = \max\{a \ge 0 \mid \langle x^A, a \rangle \in T_1^A\}. \tag{16}$$

Under the maintained assumptions, technologies $T_1^Y$ and $T_1^A$ can be functionally
represented as

$$\langle x^Y, y \rangle \in T_1^Y \iff y \le f^Y(x^Y) \quad \text{and} \quad
\langle x^A, a \rangle \in T_1^A \iff a \le f^A(x^A).$$
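As a rough illustration of how an upper-bound function represents a technology set, the sketch below tests membership of an input-output pair. The square-root functional forms and the names `f_Y`, `f_A`, and `in_technology` are assumptions made purely for illustration; the chapter does not specify functional forms.

```python
import math

def f_Y(x_Y):
    """Hypothetical maximal economic output from input vector x_Y (assumed form)."""
    return math.sqrt(sum(x_Y))

def f_A(x_A):
    """Hypothetical maximal abatement from input vector x_A (assumed form)."""
    return 0.5 * math.sqrt(sum(x_A))

def in_technology(frontier, inputs, output):
    """<inputs, output> is feasible iff inputs are nonnegative and 0 <= output <= frontier(inputs)."""
    return all(v >= 0 for v in inputs) and 0 <= output <= frontier(inputs)

print(in_technology(f_Y, [4.0, 5.0], 2.5))  # output below the frontier value of 3
print(in_technology(f_Y, [4.0, 5.0], 3.5))  # output above the frontier value of 3
print(in_technology(f_A, [0.0, 0.0], 0.0))  # shutdown point
```

Because membership reduces to a single inequality, free disposability of the output and the possibility of shutdown are immediate in this toy version.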

The Overall Intended-Production Technology T1

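We define the intended production of the economic unit as its production of both the
economic and the abatement outputs. The intended-production technology, denoted
by $T_1 \subset \mathbf{R}_+^t$, combines the two technologies, $T_1^Y$ and $T_1^A$, as follows:

$$T_1 := \Big\{ \langle x_z, x_o, a, y, z \rangle \in \mathbf{R}_+^t \;\Big|\;
\exists\, \langle x_z^Y, x_o^Y \rangle \in \mathbf{R}_+^{n_z+n_o} \text{ and } \langle x_z^A, x_o^A \rangle \in \mathbf{R}_+^{n_z+n_o}
\text{ such that } x_z^Y + x_z^A = x_z,\; x_o^Y + x_o^A = x_o,\;
\langle x_z^Y, x_o^Y, y \rangle \in T_1^Y, \text{ and } \langle x_z^A, x_o^A, a \rangle \in T_1^A \Big\}. \tag{17}$$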

Thus, $T_1$ is the set of all production vectors $\langle x, a, y, z \rangle$ such that the outputs
$y$ and $a$ are possible with some allocation of the aggregate input vector $x$ between
the two technologies $T_1^Y$ and $T_1^A$. Thus, if the vector $\langle x_z^Y, x_o^Y \rangle$ of the two types of
inputs is employed in the production of the economic output, only the remaining
amounts of inputs $\langle x_z^A, x_o^A \rangle := \langle x_z, x_o \rangle - \langle x_z^Y, x_o^Y \rangle$ are available for abatement. The
technology defined by (17) explicitly incorporates the resource cost of cleaning-
up: the diversion of resources toward scrubbing reduces the resources available for
electricity generation.
The proposition below, which follows directly from the restrictions imposed
on $T_1^Y$ and $T_1^A$, states the properties of the intended-production technology $T_1$: it
is a closed set that permits shutting down; the set of combinations of economic
and abatement outputs that are feasible under $T_1$ with finite amounts of inputs is
bounded; it satisfies free disposability in all inputs and in the economic and abatement
outputs; and it is independent of the level of net emissions (net emissions do not
affect intended production).36

Proposition 1. Under Assumptions (T1C), (T1B), (T1FD), and (T1SD), the
following conditions are satisfied:

(i) $T_1$ is closed and $\langle 0^n, 0^s, 0^m, z \rangle \in T_1$ for all $z \in \mathbf{R}_+^{m'}$.
(ii) The set $\{\langle a, y \rangle \in \mathbf{R}_+^{m+s} \mid \langle x, a, y, z \rangle \in T_1\}$ is bounded for all $\langle x, z \rangle \in \mathbf{R}_+^{n+4}$.
(iii) $\langle x, a, y, z \rangle \in T_1$, $x \le \bar{x}$, $y \ge \bar{y}$, and $a \ge \bar{a}$ imply $\langle \bar{x}, \bar{a}, \bar{y}, \bar{z} \rangle \in T_1$.

Note that (iii) holds for $z \ne \bar{z}$ as well as $z = \bar{z}$, a reflection of the fact that the set
$T_1$ simply constrains the production of intended outputs for given quantities of the
inputs, independently of the pollution (by-product) levels.
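Employing the production function $f^Y$ and the abatement function $f^A$ defined in (16),
we can obtain an implicit distance function representation of the overall
intended-production technology $T_1$. Define

$$F(x, y, a, z) = \max_{\lambda,\, x^Y,\, x^A} \big\{ \lambda \ge 0 \;\big|\; \lambda y \le f^Y(x^Y),\; \lambda a \le f^A(x^A),\;
x^Y + x^A \le x,\; x^Y \in \mathbf{R}_+^n, \text{ and } x^A \in \mathbf{R}_+^n \big\}. \tag{18}$$

Then the set $T_1$ can be functionally represented by

$$\langle x, y, a, z \rangle \in T_1 \iff F(x, y, a, z) \ge 1.$$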
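The maximization in (18) can be approximated numerically. The following is a minimal sketch for the special case of a single scalar input, using a grid search over the split of that input between the two activities; the function name `F`, the grid size, and the linear technologies are illustrative assumptions. Because both upper-bound functions are nondecreasing under free disposability, the search uses the input fully, and the emission vector is omitted because it does not enter the problem.

```python
def F(x, y, a, f_Y, f_A, steps=1000):
    """Grid-search approximation of problem (18) for one scalar input x.

    Tries splits x_Y + x_A = x and returns the largest lambda satisfying
    lambda * y <= f_Y(x_Y) and lambda * a <= f_A(x_A). Assumes y > 0 and a > 0.
    """
    best = 0.0
    for i in range(steps + 1):
        x_Y = x * i / steps          # share of the input used for the economic output
        x_A = x - x_Y                # remainder goes to abatement
        lam = min(f_Y(x_Y) / y, f_A(x_A) / a)
        best = max(best, lam)
    return best

def linear(v):
    """Hypothetical technology: one unit of input yields one unit of output."""
    return v

lam = F(x=10.0, y=3.0, a=2.0, f_Y=linear, f_A=linear)
print(lam, lam >= 1.0)               # a value of at least 1 means the vector is in T1
```

In this example the best split devotes 6 units to the economic output and 4 to abatement, so both outputs could be doubled; with y = a = 6 the same input would be insufficient and the value falls below 1.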

Remark 1. If the functions $f^Y$ and $f^A$ in (16) are differentiable, production efficiency
implies that the input vector $x$ is split between the production of the economic output and
the abatement output such that (on the interior of $\mathbf{R}_+^t$)37 the marginal rates of
technical substitution between any two inputs in economic output and abatement
output production are equalized.38
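A sketch of the first-order conditions behind Remark 1 is given below. Here $f^Y$ and $f^A$ stand for the two upper-bound functions defined in (16); these labels, the multipliers $\mu_Y$, $\mu_A$, and $\nu_i$, and the restriction to an interior solution with strictly positive multipliers are assumptions introduced for this illustration.

```latex
\begin{align*}
\mathcal{L} &= \lambda + \mu_Y \bigl[ f^Y(x^Y) - \lambda y \bigr]
             + \mu_A \bigl[ f^A(x^A) - \lambda a \bigr]
             + \sum_{i=1}^{n} \nu_i \bigl( x_i - x_i^Y - x_i^A \bigr), \\[4pt]
\frac{\partial \mathcal{L}}{\partial x_i^Y}
  &= \mu_Y \frac{\partial f^Y}{\partial x_i^Y} - \nu_i = 0,
\qquad
\frac{\partial \mathcal{L}}{\partial x_i^A}
   = \mu_A \frac{\partial f^A}{\partial x_i^A} - \nu_i = 0, \\[4pt]
\frac{\partial f^Y / \partial x_i^Y}{\partial f^Y / \partial x_j^Y}
  &= \frac{\nu_i}{\nu_j}
   = \frac{\partial f^A / \partial x_i^A}{\partial f^A / \partial x_j^A}
\qquad \text{for all inputs } i, j.
\end{align*}
```

Dividing the condition for input $i$ by the condition for input $j$ eliminates $\mu_Y$ and $\mu_A$, which is why the two marginal rates of technical substitution coincide with the common shadow-price ratio $\nu_i / \nu_j$.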

Modeling the Generation of Multiple Emissions

Levkoff [35], Kumbhakar and Tsionas [33], and Murty and Russell [43] have
argued that, while a single restriction on emissions and emission-causing inputs
suffices to capture the generation of emissions that are rival (or substitutable) in
production, multiple restrictions, one for each type of emission, are required when
emissions are jointly produced (complementary).

36 See Murty [41] for the case where emissions also affect production of the economic outputs. For
example, smoke from a factory can have detrimental effects on the productivity of its labor.
37 And at the boundaries for appropriately defined directional derivatives.
38 This follows from considering the first-order conditions of problem (18).

Continuing our example where coal is used to produce electricity, we first

model the sub-technology that generates emissions as the by-product of economic
output production. These emissions include CO2 , CO, and SO2 . We then study the
sub-technology that generates the new emission (gypsum) during the abatement
(scrubbing) process. The overall technology that captures generation of all emis-
sions from emission-causing inputs and their mitigation by abatement activities is
obtained by combining these two sub-technologies.

Modeling Generation of Carbon and Sulfur Emissions Attributable to Combustion of Coal
We capture the rivalry in the production of CO and CO2 and the jointness in the
production of SO2 and the carbon emissions when coal is combusted to generate
thermal electricity by first defining the set,

$$T_2^Y := \big\{ \langle x_{z_1}, a, z_1, z_2, z_3 \rangle \in \mathbf{R}_+^5 \;\big|\; g^C(x_{z_1}, z_1, z_2) \ge 0 \,\wedge\,
z_3 \ge \max\{ g^S(x_{z_1}) - a,\, 0 \} \big\},$$

where $z_1$, $z_2$, and $z_3$ denote the net emissions of CO, CO2, and SO2, respectively.
The function $g^S : \mathbf{R}_+ \to \mathbf{R}_+$ with image $z_3^g = g^S(x_{z_1})$ gives the minimal amount
of gross emission of SO2 associated with the level $x_{z_1}$ of coal. Given an arbitrary level
of abatement $a \ge 0$, the minimal “net” emission generated is $z_3 = g^S(x_{z_1}) - a$
if $a \le g^S(x_{z_1})$. If, however, $a > g^S(x_{z_1})$, the minimal net emission is $z_3 = 0$.39
Hence, the minimal net emission of SO2 is given by

$$z_3 = \max\{ g^S(x_{z_1}) - a,\, 0 \}.$$

The actual level of net emission, z3 , can be more than this if there are inefficiencies
in emission generation.
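The net-emission bound can be written as a short feasibility check. This is a minimal sketch: the linear form of the gross-emission function and the `sulfur_rate` parameter are hypothetical, chosen only to make the example concrete.

```python
def gross_so2(coal, sulfur_rate=0.02):
    """Hypothetical g^S: minimal gross SO2 from burning `coal` (assumed linear)."""
    return sulfur_rate * coal

def min_net_so2(coal, abatement):
    """Minimal net SO2 emission: max{g^S(coal) - a, 0}."""
    return max(gross_so2(coal) - abatement, 0.0)

def so2_feasible(coal, abatement, z3):
    """A net emission z3 is feasible iff it is at least the minimal net emission."""
    return z3 >= min_net_so2(coal, abatement)

print(min_net_so2(100.0, 0.5))        # abatement below the gross emission
print(min_net_so2(100.0, 5.0))        # abatement capacity exceeds the gross emission
print(so2_feasible(100.0, 0.5, 3.0))  # inefficient but feasible emission level
```

Levels of z3 above the bound correspond to the inefficiencies in emission generation mentioned above; levels below it are ruled out.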
The implicit production function $g^C$ captures the rival production of CO2 and CO
emissions owing to the use of coal. Thus, $g^C(x_{z_1}, z_1, z_2) \ge 0$ implies that the levels $z_1$ and
$z_2$ of the two carbon emissions are feasible given combustion of the amount $x_{z_1}$
of coal. We assume that the functions $g^C$ and $g^S$ are differentiable and that their
derivatives have the following signs: $g^C_{x_{z_1}}(x_{z_1}, z_1, z_2) \ge 0$; $g^C_{z_k}(x_{z_1}, z_1, z_2) < 0$ for
$k = 1, 2$; and $dg^S(x_{z_1})/dx_{z_1} > 0$. From the implicit function theorem, it follows
that the carbon and sulfur emissions are increasing in the use of coal and that there
is rivalry in the production of the two carbon emissions. The latter follows because,
holding the quantity of coal fixed, there is a negative trade-off between these two
emissions:

$$\frac{\partial z_2}{\partial z_1} = - \frac{g^C_{z_1}(x_{z_1}, z_1, z_2)}{g^C_{z_2}(x_{z_1}, z_1, z_2)} < 0.$$
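The sign of this trade-off can be checked numerically for an assumed form of the carbon constraint. In the sketch below, the linear carbon-balance form of `g_C` and the `carbon_per_coal` coefficient are hypothetical; the factors 12/28 and 12/44 are the mass shares of carbon in CO and CO2.

```python
def g_C(coal, z1, z2, carbon_per_coal=0.7):
    """Hypothetical g^C: carbon supplied by coal minus carbon leaving as CO (z1) and CO2 (z2)."""
    return carbon_per_coal * coal - (12 / 28) * z1 - (12 / 44) * z2

def partial(fn, args, i, h=1e-6):
    """Central finite-difference estimate of the i-th partial derivative of fn at args."""
    up = list(args)
    dn = list(args)
    up[i] += h
    dn[i] -= h
    return (fn(*up) - fn(*dn)) / (2 * h)

point = (10.0, 4.0, 15.0)                      # (coal, CO, CO2)
dz2_dz1 = -partial(g_C, point, 1) / partial(g_C, point, 2)
print(dz2_dz1)                                 # negative: more CO means less CO2 for fixed coal
```

With this form the derivative with respect to coal is positive and those with respect to both emissions are negative, so the assumed signs hold and the trade-off equals minus the ratio of the two carbon shares.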

39 The potential level of abatement a can be greater than the gross emission level g S (xz1 ) if the
inputs used in the abatement technology are capable of reducing more than g S (xz1 ) of SO2 .

We assume in addition that, when no coal is used, none of the carbon or sulfur-based
emissions are produced: $0^5 \in T_2^Y$.

Modeling the Production of Gypsum During Scrubbing

Limestone used by the abatement technology transforms the SO2 emission into
gypsum. Thus, the scrubber technology jointly produces the abatement output (a
reduction in SO2 ) and a new emission (gypsum). It is clear that, given an amount
of limestone used for scrubbing, there is an upper bound on the amount of SO2
that the scrubbing can abate. Since the abated SO2 is converted into gypsum, this
upper bound also defines the maximum amount of gypsum that can be produced by
the given amount of limestone. Inefficiency in abatement implies that less than the
maximum reduction of SO2 by the given amount of limestone takes place, resulting
in a lower amount of gypsum production. In the extreme case of inefficiency, no
reduction of SO2 takes place, and so no gypsum is produced by the scrubber. Define
a function $g^G : \mathbf{R}_+ \to \mathbf{R}_+$, with image,

$$z_4 = g^G(x_{z_2}),$$

specifying the (maximal) amount of gypsum that can be produced when the amount $x_{z_2}$
of limestone is used efficiently in the scrubber to reduce the SO2 emission.
Assume that this function is differentiable. Since it is increasing in the amount of
limestone used, its derivative is positive. The following sub-technology captures the
production of gypsum during the scrubbing process:

$$T_2^A := \big\{ \langle x_{z_2}, z_4 \rangle \in \mathbf{R}_+^2 \;\big|\; z_4 \le g^G(x_{z_2}) \big\}.$$

In addition, we assume that when no limestone is used, no gypsum is produced:

$$g^G(0) = 0.$$

Combining Sub-technologies Generating Carbon and Sulfur Emissions and Gypsum
In the space $\mathbf{R}_+^t$ of all goods, the set depicting the net generation of all emissions
by emission-causing inputs is obtained from the individual net emission generating
sub-technologies, $T_2^Y$ and $T_2^A$, as

$$T_2 = \big\{ \langle x_z, x_o, a, y, z \rangle \in \mathbf{R}_+^t \;\big|\; \langle x_{z_1}, a, z_1, z_2, z_3 \rangle \in T_2^Y \,\wedge\, \langle x_{z_2}, z_4 \rangle \in T_2^A \big\}. \tag{19}$$

The proposition below states the properties of set T2 .

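Proposition 2. Under the maintained assumptions, the following are true:

(i) $T_2$ is closed and $\langle 0^{n_z}, x_o, a, y, 0^4 \rangle \in T_2$ for all $y \in \mathbf{R}_+^m$, $a \in \mathbf{R}_+$, and $x_o \in \mathbf{R}_+^{n_o}$.
(ii) $\langle x, a, y, z \rangle \in T_2$, $x_z \ge \bar{x}_z$, $a \le \bar{a}$, $z_k \le \bar{z}_k$ for $k = 1, 2, 3$, and $z_4 \ge \bar{z}_4$
imply $\langle \bar{x}_z, \bar{x}_o, \bar{a}, \bar{y}, \bar{z} \rangle \in T_2$ for arbitrary $\bar{x}_o$ and $\bar{y}$.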

Thus, T2 satisfies joint essentiality of coal and limestone in generating emissions. It

also satisfies costly disposability of the emission-causing inputs, sulfur and carbon-
based emissions, and the abatement activity. However, it satisfies free disposability
of gypsum. Thus, ceteris paribus, T2 permits arbitrary increases in carbon and sulfur-
based emissions, decreases in gypsum, decreases in the emission-causing inputs,
and increases in abatement activity. Moreover, emission generation is independent
of the levels of the non-emission-causing inputs and the economic output–these
goods do not influence the amounts generated of emissions.

The Overall By-Production Technology with Abatement and Multiple Emissions

Given the intended-production technology $T_1$ defined in (17) and the set $T_2$ depicting
emission generation defined in (19), a by-production technology, denoted by
$T^B \subset \mathbf{R}_+^t$, is defined as in MRL, Murty [41], and MR as the intersection of these two sets:

$$T^B = T_1 \cap T_2. \tag{20}$$
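The intersection in (20) can be illustrated with a toy membership test. Everything concrete below is an assumption made for the example: the functional forms, the coefficients, and the simplification that coal is productive only in electricity generation and limestone only in scrubbing, so that the allocation of inputs in (17) is trivial.

```python
import math

def g_C(coal, z1, z2):
    """Hypothetical carbon constraint: carbon in coal minus carbon in CO and CO2."""
    return 0.7 * coal - (12 / 28) * z1 - (12 / 44) * z2

def g_S(coal):
    """Hypothetical minimal gross SO2 from coal."""
    return 0.02 * coal

def g_G(limestone):
    """Hypothetical maximal gypsum from limestone."""
    return 1.5 * limestone

def in_T1(coal, limestone, a, y):
    """Intended production: electricity bounded by coal, abatement bounded by limestone."""
    return 0 <= y <= 2.0 * math.sqrt(coal) and 0 <= a <= limestone

def in_T2(coal, limestone, a, z):
    """Emission generation; z = (CO, CO2, SO2, gypsum)."""
    z1, z2, z3, z4 = z
    return (g_C(coal, z1, z2) >= 0
            and z3 >= max(g_S(coal) - a, 0.0)
            and 0 <= z4 <= g_G(limestone))

def in_TB(coal, limestone, a, y, z):
    """By-production technology: feasible only if in both T1 and T2."""
    return in_T1(coal, limestone, a, y) and in_T2(coal, limestone, a, z)

print(in_TB(100.0, 1.0, 1.0, 15.0, (10.0, 200.0, 1.1, 1.2)))  # satisfies both sets
print(in_TB(100.0, 1.0, 1.0, 15.0, (10.0, 200.0, 0.5, 1.2)))  # SO2 below nature's lower bound
print(in_TB(100.0, 1.0, 1.0, 25.0, (10.0, 200.0, 1.1, 1.2)))  # output above the frontier
```

The second and third calls fail for different reasons, one in each sub-technology, which is the sense in which the overall technology needs more than one restriction.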

Once again, as seen in section “Multi-equation Modeling: The Case of Rival
and Joint Production,” the disposability properties of set $T^B$ with respect to
the emission-causing inputs are not obvious. The disposability property of $T^B$
with respect to the abatement activity is also unclear. This is because, while set
$T_1$ satisfies free input disposability of emission-causing inputs and free output
disposability of abatement, set $T_2$ satisfies costly disposability with respect to the
emission-causing inputs and abatement.
We next study the disposability properties of such an overall technology with
respect to all goods.

Axiomatic Approach to Modeling Emission-Generating Technologies

The primitive concepts in the modeling of emission-generating technologies in
section “Multi-equation Modeling of Emission-Generating Technologies with Abatement
Activities and Multiple Emissions” are two sub-technologies, one for
characterizing the intended production (set T1 in the previous section) and the other
for characterizing emission generation in nature (set T2 in the previous section).
The first sub-technology is an engineering construct, while the latter captures
natural laws that link emissions to their basic sources in nature and the mitigation
of these emissions through human abatement activities. The intersection of these
sub-technologies yields the by-production technology (BPT). As seen in the
previous section, these sub-technologies have well-defined disposability properties
that conform to our intuitive understanding of the processes and are consistent
with empirical observation. The properties of the overall BPT, however, remained
undetermined.
In the analysis that follows, with a view to understanding the basic disposability
properties of the overall by-production technology, we adopt a reverse approach,
which is based on Murty [41] and Murty and Russell [43]. We perceive observable
data to have been generated by a technology T that engages simultaneously in
economic output production and emission generation. We postulate its disposability
properties in the form of some axioms. Murty and Russell show that, if a technology
T satisfies these axioms, it can be decomposed into an intended production
sub-technology and a set that describes residual generation in nature. Moreover, a BPT,
as defined in section “Modeling the Generation of Multiple Emissions,” satisfies
these axioms.
We generalize our model to include $s$ types of abatement activities. A quantity
vector of abatement outputs is denoted by $a \in \mathbf{R}_+^s$. Redefine the number of
commodities as $t = n + s + m + m'$.
An emission-generating technology comprises a set of technologically feasible
production vectors $\langle x, a, y, z \rangle \in \mathbf{R}_+^t$ and is denoted by $\mathcal{T} \subset \mathbf{R}_+^t$. This technology
should capture all relations that describe how the use of inputs in production
generates the economic and abatement outputs and the emissions as well as the
mechanism by which abatement/cleaning-up activities help in mitigating emissions.
From the material balance conditions of nature, one can infer that there are both
upper and lower limits to production of emissions once the levels of emission-
causing inputs and abatement are fixed; e.g., because of its carbon content, the
combustion of coal must generate a nonnegative amount of CO2 , but the amount of
CO2 emitted depends on the oxygen supply in the air. As economists, we are usually
concerned with emissions that are harmful and economic policies that aim, ceteris
paribus, to minimize the generation of such emissions. At the same time, economic
policies aim, ceteris paribus, to maximize the production of economic outputs
from inputs. Hence, the relevant economic frontier of a technology generating
harmful emissions combines the lower limits of emission generation attributable to
the use of emission-causing inputs and abatement activities with the upper limits
of intended output production from all inputs. In this chapter, we assume that
abatement activities transform harmful emissions generated by the producing unit
into less harmful emissions that are outside the purview of economic policy and
hence not modeled by the researcher.
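Thus, the strictly efficient frontier of the true technology $\mathcal{T}$ contains only those
production vectors in $\mathcal{T}$ for which there do not exist other production vectors, also in
$\mathcal{T}$, with no larger amounts of inputs or emissions and no smaller amounts of economic and
cleaning-up outputs. Thus, $\langle x, a, y, z \rangle$ in $\mathcal{T}$ is a strictly efficient point of $\mathcal{T}$ if
$\langle -\bar{x}, \bar{a}, \bar{y}, -\bar{z} \rangle > \langle -x, a, y, -z \rangle$ implies that $\langle \bar{x}, \bar{a}, \bar{y}, \bar{z} \rangle$ is not contained in $\mathcal{T}$.
Murty and Russell [43] show that, to study the properties of the true emission-
generating technology $\mathcal{T}$ relative to its frontier, it suffices to study the properties of
its costly disposal hull in the direction of emissions, which is defined as the set,

$$T := \big\{ \langle x, a, y, z + \Delta \rangle \in \mathbf{R}_+^t \;\big|\; \langle x, a, y, z \rangle \in \mathcal{T} \text{ and } \Delta \in \mathbf{R}_+^{m'} \big\}.$$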

The set $T$ includes any production vector $v = \langle x, a, y, z \rangle$ in the true technology
$\mathcal{T}$ as well as production vectors of type $\langle x, a, y, z + \Delta \rangle \in \mathbf{R}_+^t$ that, ceteris
paribus (i.e., holding levels of all other goods unchanged), produce arbitrarily larger
amounts of emissions than $z$. This approach is adopted because the economically relevant
frontiers of the two technologies, $T$ and $\mathcal{T}$, are identical, and the set $T$ is analytically
more tractable than the true technology set $\mathcal{T}$. In what follows, we therefore adopt the
costly disposal hull $T$ as the relevant emission-generating technology.
It is helpful to define some subspaces of the set $T$ in $\mathbf{R}_+^t$. These include the
intended-output possibility set,

$$T^y(x, a, z) = \{ y \in \mathbf{R}_+^m \mid \langle x, a, y, z \rangle \in T \},$$

the pollution-generation set,

$$T^z(x, a, y) = \{ z \in \mathbf{R}_+^{m'} \mid \langle x, a, y, z \rangle \in T \},$$

and the set of vectors of economic outputs and emissions that are feasible under $T$,

$$T^{y,z}(x, a) = \{ \langle y, z \rangle \in \mathbf{R}_+^{m+m'} \mid \langle x, a, y, z \rangle \in T \}.$$

For example, T y (x, a, z) is the set of all economic outputs that are feasible under
technology T with the fixed vectors of inputs, cleaning-up activities, and emissions
x, a, z. It is possible for such a subspace to be empty: for example, T y (x, a, z)
could be empty if the amount of some component of the emission vector z is
smaller than the minimal amount of net emissions that can be generated by input and
cleaning-up vectors xz and a. (This will be true, for example, if the given amounts
of fossil fuels in vector xz , combusted under the most favorable of atmospheric
conditions, generate far more CO2 than the amount indicated by the relevant
component of z.) In this case, there is no economic output vector y such that
x, a, y, z is technologically feasible, because generation of the emission vector
z is infeasible under physical laws of nature given the vector xz of emission-causing
inputs and the cleaning-up vector a. Similarly, the set T z (x, a, y) can be empty
if the levels of inputs in vector x are too small or too large a part of the inputs is
siphoned into producing abatement vector a to ensure production of intended-output
vector y. (E.g., a given amount of coal, when burnt, may be too small to produce
the amount of heat required to generate the given amount of electricity.) In such a
case, there is no emission vector z such that x, a, y, z is technologically feasible,
because y is infeasible in intended production given the combination x, a of inputs
and abatement levels. It is useful, therefore, to define the sets,

$$\Omega = \big\{ \langle x, a, z \rangle \in \mathbf{R}_+^{n+s+m'} \;\big|\; T^y(x, a, z) \ne \emptyset \big\} \quad \text{and} \quad
\Gamma = \big\{ \langle x, a, y \rangle \in \mathbf{R}_+^{n+s+m} \;\big|\; T^z(x, a, y) \ne \emptyset \big\}.$$

Murty and Russell [43] impose the following assumptions on the set $T$:40

(EG0) $T$ is closed and contains $0^t$.

(EG1) $T^y(x, a, z)$ is bounded and satisfies free disposability of non-emission-causing
inputs and outputs,

$$y \in T^y(x, a, z),\; \bar{x}_o \ge x_o, \text{ and } \bar{y} \le y \implies \bar{y} \in T^y(x_z, \bar{x}_o, a, z), \tag{21}$$

conditional free disposability of emission-causing inputs and cleaning-up activities,

$$y \in T^y(x, a, z),\; \bar{x}_z \ge x_z,\; \bar{a} \le a, \text{ and } \langle \bar{x}_z, x_o, \bar{a}, z \rangle \in \Omega
\implies y \in T^y(\bar{x}_z, x_o, \bar{a}, z), \tag{22}$$

and independence of emissions,

$$y \in T^y(x, a, z) \implies y \in T^y(x_z, x_o, a, \bar{z}) \quad \forall\, \bar{z} \ne z. \tag{23}$$

(EG2) $T^z(x, a, y)$ satisfies joint essentiality of emission-causing inputs for emission
generation:

$$x_z = 0^{(n_z)} \implies 0^{(m')} \in T^z(x, a, y), \tag{24}$$

conditional costly disposability of emission-causing inputs, cleaning-up activities,
and emissions:

$$z \in T^z(x, a, y),\; \bar{x}_z \le x_z,\; \bar{a} \ge a,\; \bar{z} \ge z, \text{ and } \langle \bar{x}_z, x_o, \bar{a}, y \rangle \in \Gamma
\implies \bar{z} \in T^z(\bar{x}_z, x_o, \bar{a}, y), \tag{25}$$

and independence of intended outputs and non-emission-causing inputs:

$$z \in T^z(x, a, y) \text{ and } \langle x_z, \bar{x}_o, a, \bar{y} \rangle \in \Gamma
\implies z \in T^z(x_z, \bar{x}_o, a, \bar{y}) \quad \forall\, \bar{x}_o, \bar{y}. \tag{26}$$

To understand these axioms, recall that the set T contains only those production
vectors that simultaneously satisfy constraints on intended production and emission
generation. While the intended-production technology satisfies free disposability of
all inputs, emission-causing inputs are not freely disposable in nature’s emission
generation mechanism. For example, ceteris paribus, increasing coal combustion

40 These assumptions attribute emission generation to the use of emission-causing inputs only. See
Murty [41] for the case where emissions can also be generated by the economic output once it has
been produced.

is not free: it comes at the cost of increasing the emission levels. Similarly,
we can argue that, while the intended-production technology satisfies free output
disposability in the direction of cleaning-up outputs, these goods are not freely
disposable in the generation of emissions: decreasing the level of the scrubbing
activity comes at a cost of decreasing the mitigation of the SO2 emission. Hence,
the overall technology T –a composition of the intended production technology and
the laws that govern emission generation–is not freely disposable in the direction of
emission-causing inputs and cleaning-up activities.
The (EG0) assumption is standard in production theory, while (EG1) captures
properties that the technology T inherits from the intended-production technology
of human engineering design, and (EG2) reflects relations between goods that
describe emission generation and remain relevant for the technology T.
More specifically, (21) in (EG1) states that the set T permits standard free dis-
posability of the economic outputs and standard free disposability of non-emission-
causing inputs, while (22) states that emission-causing inputs and cleaning-up
activities are only conditionally freely disposable: if the production of intended-
output vector y is permitted given the quantities of inputs, cleaning-up levels,
and emissions, then y is also permitted by T under a larger vector of emission-
causing inputs x̄z and a smaller vector of cleaning-up activities ā, provided that the
vector $\langle \bar{x}_z, \bar{a} \rangle$ can continue generating $z$ amounts of the emissions–i.e., provided
that $\langle \bar{x}_z, x_o, \bar{a}, z \rangle \in \Omega$. In addition, condition (23) in (EG1) states that changes in
the levels of emissions do not affect production of the intended outputs. Implicit in
this independence is an assumption that emissions produced by a producing unit are
not detrimental to its production of intended outputs.41
Condition (25) in (EG2) states that, if the emission vector z is permitted by
technology T , given the vector x, a, y of inputs, cleaning-up levels, and intended
outputs, then technical inefficiencies in emission generation can imply that z is
also permitted by T under a smaller vector of emission-causing inputs x̄z and a
larger vector of cleaning-up activities $\bar{a}$, provided the vector $\langle x_o, \bar{x}_z, \bar{a} \rangle$ can still
produce amount $y$ of intended outputs–i.e., provided $\langle \bar{x}_z, x_o, \bar{a}, y \rangle \in \Gamma$. Thus,
emission-causing inputs and cleaning-up activities satisfy only conditional costly
disposability. In addition, condition (26) states that emission generation is not
affected by changes in the production levels of intended outputs or changes in the
use of non-emission-causing inputs.
Definition. The set T ⊂ Rt+ is an emission-generating technology (EGT) if it
satisfies (EG0), (EG1), and (EG2).
Let T be an EGT. To recover the intended-production technology and the
emission-generation set underlying T and to obtain its functional representation,
Murty and Russell [43] propose the use of distance functions. Define D1EG : Rt+ →
R+ and D2EG : Rt+ → R+ by

41 For generalization to the case where the emissions produced by a producing unit also affect the
production of its economic output, see Murty [41].

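$$D_1^{EG}(x, a, y, z) =
\begin{cases}
\inf\{ \lambda \in \mathbf{R}_{++} \mid y/\lambda \in T^y(x, a, z) \} & \text{if } T^y(x, a, z) \ne \emptyset \\
\infty & \text{if } T^y(x, a, z) = \emptyset
\end{cases}$$

$$D_2^{EG}(x, a, y, z) =
\begin{cases}
\min\{ \lambda \in \mathbf{R}_{+} \mid \lambda z \in T^z(x, a, y) \} & \text{if } T^z(x, a, y) \ne \emptyset \\
\infty & \text{if } T^z(x, a, y) = \emptyset.
\end{cases}$$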

Thus, D1EG is the inverse of the maximum technologically permissible amount

by which we can expand the intended output vector y holding input, abatement, and
emission levels fixed, while D2EG is the inverse of the maximum technologically
permissible amount by which we can contract the emission vector z holding input,
abatement, and intended output levels fixed.
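For a toy technology with one intended output and one emission, both distance functions have closed forms. The sketch below assumes that, for given inputs and abatement, the technology is summarized by an output ceiling `y_max` and an emission floor `z_min`; these names, and the restriction to strictly positive `y_max` and `z`, are assumptions made for the example.

```python
import math

def D1(y, z, y_max, z_min):
    """inf{lam > 0 : y/lam in T^y}; infinite when T^y is empty (z below the floor)."""
    if z < z_min:
        return math.inf
    return y / y_max

def D2(y, z, y_max, z_min):
    """min{lam >= 0 : lam*z in T^z}; infinite when T^z is empty (y above the ceiling)."""
    if y > y_max:
        return math.inf
    return z_min / z

y_max, z_min = 20.0, 1.0
print(D1(15.0, 2.0, y_max, z_min), D2(15.0, 2.0, y_max, z_min))  # both at most 1: feasible
print(D1(15.0, 0.5, y_max, z_min), D2(15.0, 0.5, y_max, z_min))  # emission below the floor
```

In the first case output could be expanded by a factor of 1/0.75 and emissions contracted to half their level; in the second the emission level is physically unattainable, so the first function is infinite and the second exceeds 1.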
The function $D_1^{EG}$ provides an implicit functional representation of the intended-
production technology, which can be recovered as

$$\hat{T}_1 := \{ \langle x, a, y, z \rangle \in \mathbf{R}_+^t \mid D_1^{EG}(x, a, y, z) \le 1 \},$$

while the function $D_2^{EG}$ provides a functional representation of the underlying
emission-generation set, which can be recovered as

$$\hat{T}_2 := \{ \langle x, a, y, z \rangle \in \mathbf{R}_+^t \mid D_2^{EG}(x, a, y, z) \le 1 \}.$$

The set of production vectors $\langle x, a, y, z \rangle$ satisfying $D_1^{EG}(x, a, y, z) = 1$ forms
the upper frontier of the set $\hat{T}_1$, indicating the upper bounds to the production
of intended outputs. The set of production vectors $\langle x, a, y, z \rangle$ satisfying
$D_2^{EG}(x, a, y, z) = 1$ forms the lower frontier of the set $\hat{T}_2$, indicating the lower
bounds of emission generation.
The following theorem in Murty and Russell [43] shows that all the intuitive
trade-offs between goods in intended production and emission generation hold along
the frontiers defined by the functions D1EG and D2EG , respectively. In particular,
along the frontier of the underlying intended production technology defined by
D1EG , the trade-offs between standard economic outputs and inputs are nonnegative,
and those between two inputs or two economic outputs are nonpositive. On the other
hand, along the frontier of the emission-generating set defined by D2EG , the trade-
offs between emission-causing inputs and emissions are non-negative, and those
between cleaning-up activities and emissions are nonpositive.

Theorem 1. Suppose $T$ is an EGT. $D_1^{EG}$ is independent of $z$ and linearly
homogeneous in $y$, and $D_2^{EG}$ is independent of $y$ and homogeneous of degree minus one
in $z$. $D_1^{EG}$ is nonincreasing in $x$ and nondecreasing in $y$ on the set $\Omega$, while $D_2^{EG}$ is
nondecreasing in $x$ and nonincreasing in $z$ and $a$ on the set $\Gamma$.

The distance functions, $D_1^{EG}$ and $D_2^{EG}$, provide a functional representation of the
EGT, which we denote $T^{EG}$:

$$\langle x, y, a, z \rangle \in T^{EG} \iff D_1^{EG}(x, a, y, z) \le 1 \text{ and } D_2^{EG}(x, a, y, z) \le 1.$$

The set of frontier points of $T^{EG}$ are the production vectors $\langle x, a, y, z \rangle$ that satisfy
$D_1^{EG}(x, a, y, z) = 1$ or $D_2^{EG}(x, a, y, z) = 1$.
As a conclusion to this section, we draw attention to two important points:
First, recall that an EGT is defined as a technology set that satisfies axioms
EG0, EG1, and EG2, properties that hold, we have argued, for realistic/empirically
observed emission-generating technologies. We have demonstrated above that, in
contrast to the technologies derived under the input or output approaches to emission
modeling discussed in sections “Single-Equation Modeling of the Technology
Under Standard Disposability Assumptions and Weakly Disposable Technologies,”
an EGT has a multiple-equation representation. We have earlier argued that the input
and output approaches result in many counterintuitive consequences for technology
modeling.
Second, Murty and Russell [43] show that if T is a BPT – i.e., T = T1 ∩T2 , where
T1 is defined in (17), T2 is defined in (19), and T1 and T2 satisfy the properties in
Propositions 1 and 2 – then T is also an EGT. Hence, the disposability properties of
a by-production technology are fully specified by the properties of an EGT; that is,
a BPT satisfies properties EG0, EG1, and EG2.

Efficiency Measurement

The emission-generating technologies throughout this chapter have been characterized
in terms of production and emission sets, encompassing the possibility of firms
producing off the frontier because of technological or managerial inefficiencies.
This approach therefore facilitates discussion of the measurement of technical
(in)efficiency – calculation of a scalar measure of the “distance” from the point
of operation of the firm to the technological frontier.
Formally, an environmental technological efficiency index is a mapping,
$E : \mathbf{R}_+^t \times \mathbb{T} \to (0, 1]$, where $\mathbb{T}$ is the set of allowable technologies42, and, informally,
larger image values are interpreted as higher levels of efficiency. A technological
inefficiency index is a mapping, $I : \mathbf{R}_+^t \times \mathbb{T} \to [0, \infty)$, where higher image values
are interpreted as greater levels of inefficiency. Clearly, any inefficiency index can
be converted into an efficiency index, and vice versa, by a simple renormalization:
$I(x, y, z, T) = [1/E(x, y, z, T)] - 1$. The production vector $\langle x, y, z \rangle$ is a frontier
point of $T$ if and only if $E(x, y, z, T) = 1$ or $I(x, y, z, T) = 0$.
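The renormalization between the two kinds of index is a one-line calculation in each direction. The function names below are hypothetical and are used only to show that the two maps are inverses of each other.

```python
def inefficiency_from_efficiency(E):
    """I = 1/E - 1 for an efficiency score E in (0, 1]."""
    if not 0.0 < E <= 1.0:
        raise ValueError("efficiency score must lie in (0, 1]")
    return 1.0 / E - 1.0

def efficiency_from_inefficiency(I):
    """E = 1/(1 + I) for an inefficiency score I in [0, infinity)."""
    if I < 0.0:
        raise ValueError("inefficiency score must be nonnegative")
    return 1.0 / (1.0 + I)

print(inefficiency_from_efficiency(1.0))   # frontier point
print(inefficiency_from_efficiency(0.5))   # off the frontier
print(efficiency_from_inefficiency(inefficiency_from_efficiency(0.8)))
```

A frontier point maps to an inefficiency of zero, and lower efficiency scores map to larger inefficiency scores without bound.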

42 Dakpo, Jeanneaux, and Latruffe [16] provide a comprehensive survey of recent developments in
DEA modeling of pollution-generating technologies.

Properties of Environmental Efficiency Indexes

These definitions have no interesting content without a rigorous definition of
efficiency and the stipulation of properties satisfied by the indexes. To that end,
we first define the notion of technical efficiency, which has as its basis (i) a
normative criterion of discouraging generation of harmful emissions and (ii) a
criterion of minimizing wastage of scarce and productive economic inputs. A
production vector $\langle x, y, z \rangle \in T$ is (technologically) efficient if
$\langle \hat{x}, -\hat{y}, \hat{z} \rangle < \langle x, -y, z \rangle$ implies $\langle \hat{x}, \hat{y}, \hat{z} \rangle \notin T$ and weakly efficient if
$\langle \hat{x}, -\hat{y}, \hat{z} \rangle \ll \langle x, -y, z \rangle$ implies $\langle \hat{x}, \hat{y}, \hat{z} \rangle \notin T$. Intuitively, a production vector
is efficient if there does not exist another feasible production vector with no smaller
amounts of the good outputs and no larger amounts of emissions and inputs. Next, we
stipulate additional properties that efficiency indexes are required to satisfy. The most
important possibilities are

• identification of (weakly) efficient points: $E(x, y, z, T) = 1$ (or $I(x, y, z, T) = 0$)
if and only if $\langle x, y, z \rangle$ is (weakly) efficient for all $T \in \mathbb{T}$, and
• monotonicity: $\langle \hat{x}, -\hat{y}, \hat{z} \rangle > \langle x, -y, z \rangle \implies E(\hat{x}, \hat{y}, \hat{z}, T) < E(x, y, z, T)$
for all $T \in \mathbb{T}$
or
• weak monotonicity: $\langle \hat{x}, -\hat{y}, \hat{z} \rangle \gg \langle x, -y, z \rangle \implies E(\hat{x}, \hat{y}, \hat{z}, T) < E(x, y, z, T)$
for all $T \in \mathbb{T}$.

Note that the satisfaction or violation of these properties depends on the maintained
set of admissible technologies $\mathbb{T}$ as well as the specific formulation of the index $E$
or $I$.

Hyperbolic and Directional Distance Indexes

A large number of specific (in)efficiency indexes have been proposed in the

literature.43 The first application of efficiency measurement to emission-generating
technologies was carried out by Färe, Grosskopf, and Pasurka [22]. They proposed
the (output oriented) hyperbolic efficiency index (HYP),44 defined by

$$E_H(x, y, z, T) = \min \{ \beta \in (0, \infty) \mid \langle x, y/\beta, \beta z \rangle \in T \}.$$

The inverse of this index provides the maximal, technologically feasible (scalar)
amount by which the vector of intended-output quantities can be scaled up and

43 See Russell and Schworm [51] for an analysis of these indexes and their properties.
44 So-called because it measures the distance from the stipulated production/emission quantity
vector to the frontier along a hyperbolic path.

the vector of unintended-output quantities can be scaled down, holding all input
quantities fixed.45
In recent years, the more widely employed environmental efficiency index is the
(output-oriented) directional distance inefficiency index (DD), proposed by Chung,
Färe, and Grosskopf [11]46 and defined by

$$I_{DD}(x, y, z, T, g) = \max\{\beta \in (0, \infty) \mid \langle x, y + \beta g_y, z - \beta g_z\rangle \in T\},$$

where $g = \langle g_y, g_z\rangle \in \mathbb{R}^{m+m'}_{+}$ is the arbitrary (output) “direction vector.” This index
provides the maximal technologically feasible (scalar) amount by which the vector
of intended outputs can be increased in the direction $g_y$, and, concomitantly, the
vector of unintended outputs can be decreased in the direction $g_z$ while holding all
the inputs fixed.
The vectors $\langle x^d, y^d/\beta^*, \beta^* z^d\rangle$ and $\langle x^d, y^d + \beta^* g_y, z^d - \beta^* g_z\rangle$, where $\beta^*$ is
the solution value in each case, are referred to as “reference points”; they are
comparison vectors for assessing the efficiency of a particular production vector.
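To make the two definitions concrete, the following minimal Python sketch computes both indexes by bisection. Everything specific in it is an assumption made for illustration and is not taken from the chapter: the single-input technology encoded in `in_T`, the sample point, and the two direction vectors. It also assumes that the observed point is feasible, so that the hyperbolic minimum lies in (0, 1].

```python
import math

def in_T(x, y, z):
    """Membership test for a hypothetical technology (illustrative assumption):
    intended output is bounded by sqrt(x), and emissions cannot fall below
    half of the intended output."""
    tol = 1e-12
    return y <= math.sqrt(x) + tol and z >= 0.5 * y - tol

def hyperbolic_index(x, y, z, tol=1e-10):
    """Smallest beta with <x, y/beta, beta*z> in T, found by bisection.
    Assumes the observed point is feasible, so the minimum lies in (0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_T(x, y / mid, mid * z):
            hi = mid   # feasible: try a smaller beta
        else:
            lo = mid   # infeasible: beta must be larger
    return hi

def directional_index(x, y, z, gy, gz, tol=1e-10):
    """Largest beta with <x, y + beta*gy, z - beta*gz> in T, found by bisection.
    Assumes the observed point is feasible and g is not the zero vector."""
    lo, hi = 0.0, 1.0
    while in_T(x, y + hi * gy, z - hi * gz):
        hi *= 2.0      # widen the bracket until the shifted point leaves T
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_T(x, y + mid * gy, z - mid * gz):
            lo = mid
        else:
            hi = mid
    return lo

x, y, z = 4.0, 1.0, 2.0
print(round(hyperbolic_index(x, y, z), 6))                    # 0.5
print(round(directional_index(x, y, z, gy=1.0, gz=1.0), 6))   # 1.0
print(round(directional_index(x, y, z, gy=0.0, gz=1.0), 6))   # 1.5
```

In this toy case the hyperbolic reference point is ⟨4, 2, 1⟩, and the directional score changes from 1.0 to 1.5 when the direction vector changes from ⟨1, 1⟩ to ⟨0, 1⟩, which illustrates how the inefficiency score depends on the choice of g.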
The hyperbolic and directional distance indexes work well when applied to
the weak disposability technologies advanced by the authors. MRL, however,
have noted a fundamental problem with the conventional measures of efficiency
when using the by-production (BP) approach for constructing the technology: the
efficiency score for a firm may take the value 1 for HYP measures or 0 for the
DD measure even though the firm is not weakly efficient in both environmental and
intended-output directions. In addition, the reference point, itself, with which the
firm is compared may not be weakly efficient in both these dimensions, resulting
in an understatement of overall inefficiency (overstatement of efficiency). In the BP
approach, the emission-generating technology is an intersection of one or more sub-
technologies, each possessing distinct disposability properties that capture different
types of production relations among the inputs and outputs.47
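The problem noted by MRL can be reproduced in a small numerical sketch. The by-production technology below is purely hypothetical (the functions `max_output` and `min_emission` are assumptions chosen so that the indexes have closed forms): T1 bounds intended output by the square root of the input, and T2 bounds emissions from below by a quarter of the input.

```python
import math

def max_output(x):
    """Hypothetical frontier of the intended-output sub-technology T1."""
    return math.sqrt(x)

def min_emission(x):
    """Hypothetical lower bound on emissions in the sub-technology T2."""
    return 0.25 * x

def hyperbolic_bp(x, y, z):
    # <x, y/b, b*z> lies in T1 ∩ T2 iff b >= y / max_output(x) and b >= min_emission(x) / z
    return max(y / max_output(x), min_emission(x) / z)

def directional_bp(x, y, z, gy, gz):
    # Largest b with y + b*gy <= max_output(x) and z - b*gz >= min_emission(x)
    bounds = []
    if gy > 0:
        bounds.append((max_output(x) - y) / gy)
    if gz > 0:
        bounds.append((z - min_emission(x)) / gz)
    return min(bounds)

# A firm on the T1 frontier that emits three times the minimum implied by T2.
x, y, z = 4.0, 2.0, 3.0
print(hyperbolic_bp(x, y, z))              # 1.0
print(directional_bp(x, y, z, 1.0, 1.0))   # 0.0
print(min_emission(x) / z)                 # 0.333..., the feasible contraction of emissions
```

Both conventional scores label this firm efficient (1 for HYP, 0 for DD with g = ⟨1, 1⟩), although its emissions could be cut to one third of their observed level without changing inputs or intended output.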
MRL argue that the DD index is particularly unsuitable for use as an inefficiency
index for a BP technology. It is well-known that the inefficiency scores obtained
from the DD measure can be very sensitive to the choice of the direction vector
g.48 This sensitivity, moreover, seems to be more salient in the BP approach, since
the choice of g in this context is typically tantamount to predetermining a choice

45 The index is called “output oriented” because it measures efficiency in output space (as opposed
to the entire <input,output> space). We return to this point later in this section.
46 Based on the notion of the directional distance function formulated by Luenberger [37] in his
novel approach to duality analysis. For recent applications of the DD to the measurement of
environmental efficiency, see Aparicio, Barbero, Kapelko, Pastor, and Zofio [1] and the papers
cited therein.
47 For example, Serra, Chambers, and Lansink [53] specify a rich model of a BP technology that
takes into account the stochastic nature of agricultural production and incorporates several sub-
technologies governing not only the production of the good (marketable) outputs and the bad
outputs but also the damage to human health.
48 See, e.g., Vardanyan and Noh [55] and Färe, Grosskopf, and Pasurka [24].

between the selection of the environmental and the intended production inefficiency
components as the measure of overall inefficiency.

The “Färe-Grosskopf-Lovell” Index

Because of these problems with the employment of the HYP or DD efficiency
measure on BP technologies, MRL propose an alternative index motivated by the
input-oriented index proposed by Färe and Lovell [25] and extended to the full
<input, output> space for standard technologies (with no unintended outputs) by
Färe, Grosskopf, and Lovell ([18], pp. 153–154). The key feature of this index
is that the reference points it uses to assign efficiency scores to production vectors
are strictly efficient, in contrast to the HYP and DD indexes for which the reference
points are weakly efficient. In particular, this measure deems a production vector to
be efficient if and only if it is both environmentally efficient and efficient in intended
production.49
As the MRL modification is minor, they continue to refer to the measure as the
(output oriented) Färe-Grosskopf-Lovell (FGL) index and define it as follows:

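$$E_{FGL}(x, y, z, T_{BP}; \alpha) := \min_{\langle\theta, \gamma\rangle \in (0,1]^{m+m'}} \left\{ \alpha \frac{\sum_j \theta_j}{m} + (1 - \alpha) \frac{\sum_k \gamma_k}{m'} \;\middle|\; \langle x, y \oslash \theta, \gamma \otimes z\rangle \in T_{BP} \right\},$$

where $y \oslash \theta = \langle y_1/\theta_1, \ldots, y_m/\theta_m\rangle$, $\gamma \otimes z = \langle \gamma_1 z_1, \ldots, \gamma_{m'} z_{m'}\rangle$, and $\alpha \in (0, 1)$
is an arbitrary weighting factor (which could depend on analytical or policy
considerations).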
This index maps into the (0,1] interval and is equal to 1 if and only if the output
vectors are strictly efficient given the input vector. Moreover, in the case of BP
technologies, the index decomposes as follows:

$$
\begin{aligned}
E_{FGL}(x, y, z, T_{BP}; \alpha)
&= \min_{\langle\theta, \gamma\rangle \in (0,1]^{m+m'}} \left\{ \alpha \frac{\sum_j \theta_j}{m} + (1 - \alpha) \frac{\sum_k \gamma_k}{m'} \;\middle|\; \langle x, y \oslash \theta, \gamma \otimes z\rangle \in T_{BP} \right\} \\
&= \min_{\langle\theta, \gamma\rangle \in (0,1]^{m+m'}} \left\{ \alpha \frac{\sum_j \theta_j}{m} + (1 - \alpha) \frac{\sum_k \gamma_k}{m'} \;\middle|\; \langle x, y \oslash \theta, z\rangle \in T_1 \,\wedge\, \langle x, y, \gamma \otimes z\rangle \in T_2 \right\} \\
&= \alpha \min_{\theta \in (0,1]^{m}} \left\{ \frac{\sum_j \theta_j}{m} \;\middle|\; \langle x, y \oslash \theta, z\rangle \in T_1 \right\}
 + (1 - \alpha) \min_{\gamma \in (0,1]^{m'}} \left\{ \frac{\sum_k \gamma_k}{m'} \;\middle|\; \langle x, y, \gamma \otimes z\rangle \in T_2 \right\} \\
&=: \alpha E^1_{FGL}(x, y, z, T_1) + (1 - \alpha) E^2_{FGL}(x, y, z, T_2) =: \alpha \beta_1 + (1 - \alpha) \beta_2 = \beta,
\end{aligned}
$$

where the second identity follows from independence of $T_1$ from z and
independence of $T_2$ from y. This index is a weighted average of the
average maximal coordinate-wise expansions of economic output quantities and
the average maximal coordinate-wise contractions of unintended-output quantities
subject to the constraint that the expanded/contracted output quantity vector remain
in the production possibility set for a given input vector. Under the independence
assumptions, the index decomposes into the sum of a standard intended-output-oriented
index defined on $T_1$ and an environmental index defined on $T_2$ ($\beta_1$ and
$\beta_2$, respectively).

49 This feature is attributable to the fact that the Färe-Grosskopf-Lovell index involves a maximal
contraction/expansion of all inputs/outputs in coordinate-wise directions (rather than in a maximal
radial or hyperbolic direction). Hence, all the slack in inputs and outputs is removed. (Of course,
the output-oriented version of the MRL index takes up all slack only in the output space, leaving
the possibility of residual slack in inputs. More on this in section “Extension of the FGL Index to
Graph Space” below.)
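The decomposition can be illustrated with a short Python sketch. The technology is a hypothetical one chosen so that each coordinate-wise minimum has a closed form: T1 caps each intended output j at a_j·sqrt(x), and T2 bounds each emission k from below by c_k·x. The coefficients `a` and `c`, the sample data, and the function name `fgl_index` are all assumptions for illustration; the observed point is assumed feasible, so that every factor lies in (0, 1].

```python
import math

def fgl_index(x, y, z, a, c, alpha):
    """Output-oriented FGL index on a hypothetical by-production technology:
        T1: y[j] <= a[j] * sqrt(x)   (independent of z)
        T2: z[k] >= c[k] * x         (independent of y)
    Returns (beta1, beta2, overall index)."""
    # Smallest theta_j such that y_j / theta_j still lies under the T1 cap.
    theta = [yj / (aj * math.sqrt(x)) for yj, aj in zip(y, a)]
    # Smallest gamma_k such that gamma_k * z_k still lies above the T2 bound.
    gamma = [ck * x / zk for ck, zk in zip(c, z)]
    beta1 = sum(theta) / len(theta)   # intended-output efficiency
    beta2 = sum(gamma) / len(gamma)   # environmental efficiency
    return beta1, beta2, alpha * beta1 + (1 - alpha) * beta2

beta1, beta2, overall = fgl_index(
    x=4.0, y=[1.0, 1.0], z=[2.0], a=[1.0, 0.5], c=[0.25], alpha=0.5
)
print(beta1, beta2, overall)   # 0.75 0.5 0.625
```

Because T1 does not involve z and T2 does not involve y, the two minimizations are solved separately and then combined with the weight α, which is the content of the decomposition above.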
The FGL environmental efficiency index and the underlying MRL by-production
structure have come under criticism recently. A number of technical criticisms and
proposed corrections appear in Dakpo [15], Dakpo et al. [16], and Ray et al. [47].
We respond to these critiques in section “Critiques and Suggested Modifications
of the By-Production Structure” below but first address a more fundamental issue
raised by both Dakpo [15] and Lozano [36].

Extension of the FGL Index to Graph Space

Each of the efficiency indexes described in sections “Hyperbolic and Directional
Distance Indexes” and “The “Färe-Grosskopf-Lovell” Index” limits the measurement
of efficiency to (intended and unintended) output space (for given input
quantities), leaving open the possibility of remaining slack in input space. This
feature, also adopted by Färe, Grosskopf, and Pasurka [22] and by Färe, Grosskopf,
Noh, and Weber [21] in their specifications of environmental efficiency indexes, has
been criticized recently by Dakpo [15] and Lozano [36] for ignoring slack in input
quantities.
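The FGL environmental index, however, can be readily extended to the full
<input, output> (or graph) space. This extension requires the incorporation of
additional contraction factors for inputs, $\delta = \langle \delta_z, \delta_o\rangle \in (0, 1]^n$, as follows:

$$
\begin{aligned}
E^G_{FGL}(x, y, z, T_{BP})
&= \min_{\langle\theta, \gamma, \delta\rangle \in (0,1]^{m+m'+n}} \left\{ \alpha_1 \frac{\sum_{j=1}^{m} \theta_j}{m} + \alpha_2 \frac{\sum_{k=1}^{m'} \gamma_k}{m'} + \alpha_3 \frac{\sum_{i=1}^{n} \delta_i}{n} \;\middle|\; \langle \delta \otimes x, y \oslash \theta, \gamma \otimes z\rangle \in T_{BP} \right\} \\
&= \min_{\langle\theta, \gamma, \delta\rangle \in (0,1]^{m+m'+n}} \left\{ \alpha_1 \frac{\sum_{j=1}^{m} \theta_j}{m} + \alpha_2 \frac{\sum_{k=1}^{m'} \gamma_k}{m'} + \alpha_3 \frac{\sum_{i=1}^{n} \delta_i}{n} \;\middle|\; \langle \delta \otimes x, y \oslash \theta, z\rangle \in T_1 \,\wedge\, \langle \delta \otimes x, y, \gamma \otimes z\rangle \in T_2 \right\},
\end{aligned}
\tag{27}
$$

where $\alpha_\nu \in (0, 1]$ for $\nu = 1, 2, 3$ and $\sum_\nu \alpha_\nu = 1$.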
The minimization problem in this formulation takes up the slack in all inputs
as well as all intended and unintended outputs, assuring that the reference point is
strictly efficient.50,51 Although the index $E^G_{FGL}$ is well-defined, it appears that its
decomposition into a conventional (economic output) production index and
an environmental index is not possible in the full <input, output> space (owing to
the interaction between contractions of the input vector x and contractions of both
the economic output and the unintended-output vectors y and z). Thus, while
adaptation of the FGL index to account for slack in all directions is straightforward,
the cost of this adaptation is the loss of the decomposition of overall efficiency
measurement into production and environmental components. This loss will, of
course, be consequential in some contexts and not in others.
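A rough numerical sketch of the graph-space index shows where the interaction comes from. It uses a hypothetical single-input technology (T1: y ≤ sqrt(x); T2: z ≥ 0.25·x, with m = m′ = n = 1) and a plain grid search over the input contraction factor δ; the functional forms, the equal weights, and the function name `graph_fgl` are assumptions for illustration only, and the observed point is assumed feasible.

```python
import math

def graph_fgl(x, y, z, a1, a2, a3, steps=10000):
    """Grid search for the graph-space FGL index on a hypothetical technology
    T1: y <= sqrt(x), T2: z >= 0.25 * x. Returns (value, theta, gamma, delta)."""
    best = None
    for i in range(1, steps + 1):
        delta = i / steps
        cap = math.sqrt(delta * x)        # T1 frontier after contracting the input
        if cap < y:
            continue                      # theta would have to exceed 1: infeasible
        theta = y / cap                   # smallest feasible theta given delta
        gamma = 0.25 * delta * x / z      # smallest feasible gamma given delta
        value = a1 * theta + a2 * gamma + a3 * delta
        if best is None or value < best[0]:
            best = (value, theta, gamma, delta)
    return best

value, theta, gamma, delta = graph_fgl(4.0, 1.0, 2.0, 1 / 3, 1 / 3, 1 / 3)
print(round(value, 4), round(theta, 4), round(gamma, 4), round(delta, 4))
```

In this example the optimal θ and γ both depend on the chosen δ (a smaller δ lowers the T1 frontier but also lowers the T2 emission bound), so the minimum cannot be computed sub-technology by sub-technology as in the output-oriented case.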

Critiques and Suggested Modifications of the By-Production Structure

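Some recent critiques of the by-production structure are presented in the context
of data envelopment analysis. Under the assumption of constant returns to scale
and convexity of the technology, $E_{FGL}(x, y, z, T_{BP}; \alpha)$ can be rewritten in DEA
parlance as52

$$
\begin{aligned}
E_{DEA}(x, y, z, X, Y, Z; \alpha)
={}& \alpha \min_{\theta \in (0,1],\, \lambda} \left\{ \theta \;\middle|\; X_o \lambda \le x_o;\; X_z \lambda \le x_z;\; Y \lambda \ge \frac{y}{\theta};\; \lambda \ge 0_U \right\} \\
&+ (1 - \alpha) \min_{\gamma \in (0,1],\, \mu} \left\{ \gamma \;\middle|\; X_z \mu \ge x_z;\; Z \mu \le \gamma z;\; \mu \ge 0_U \right\},
\end{aligned}
$$
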
50 Contraction of effluent-generating input quantities (lowering the components of the vector $\delta_z$)
paradoxically moves $\langle x_z, z\rangle$ away from the frontier in its ambient subspace, but under reasonable
assumptions on the technology, reductions in these quantities will be bounded from below by the
constraints in the $\langle x, y\rangle$ subspace. Thus, the effective constraints in the $\langle x_z, z\rangle$ subspace are the
lower bounds on the pollution variables.
51 Note that, in this program, removal of slacks in the input direction leads to production vectors
with the same amounts of all inputs in the two sub-technologies $T_1$ and $T_2$. This important point
has been made by Ray, Mukherjee, and Venkatesh [47].
52 See p. 133 in MRL.

where $X_o$, $X_z$, $Y$, and $Z$ are $n_o \times U$, $n_z \times U$, $1 \times U$, and $1 \times U$ data matrices of
non-emission-causing input quantities, emission-causing input quantities, intended
output quantities, and emission levels, and U is the number of decision-making units
(DMUs) in the dataset. Here, we have assumed that $m = m' = 1$. This optimization
problem is clearly equivalent to

$$E_{DEA}(x, y, z, X, Y, Z; \alpha) = \min_{\langle\theta, \gamma\rangle \in (0,1]^2,\ \lambda,\ \mu} \; \alpha \theta + (1 - \alpha) \gamma$$

subject to

$$
\begin{aligned}
& X_o \lambda \le x_o;\quad X_z \lambda \le x_z;\quad Y \lambda \ge \frac{y}{\theta};\quad \lambda \ge 0_U; \\
& X_z \mu \ge x_z;\quad Z \mu \le \gamma z;\quad \mu \ge 0_U.
\end{aligned}
\tag{28}
$$
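Problem (28) can be sketched in a few lines of Python for the smallest interesting case: one non-emission-causing input, one emission-causing input, one intended output, and one emission, with strictly positive data. Under those assumptions (which are ours, not the chapter's), the T1 program has only two input constraints, so its optimum sits at a vertex with at most two active DMUs and can be found by enumeration; the T2 program has a single binding constraint and reduces to a ratio. A general application would use a linear-programming solver instead. The dataset and function names are hypothetical.

```python
from itertools import combinations

def max_intended_output(Xo, Xz, Y, xo, xz):
    """Max of sum(Y[j]*lam[j]) s.t. sum(Xo[j]*lam[j]) <= xo,
    sum(Xz[j]*lam[j]) <= xz, lam >= 0, by enumerating LP vertices."""
    eps = 1e-12
    best = 0.0
    # Vertices with a single active DMU, scaled up until an input binds.
    for j in range(len(Y)):
        lam = min(xo / Xo[j], xz / Xz[j])
        best = max(best, Y[j] * lam)
    # Vertices with two active DMUs and both input constraints binding.
    for j, k in combinations(range(len(Y)), 2):
        det = Xo[j] * Xz[k] - Xo[k] * Xz[j]
        if abs(det) < eps:
            continue
        lam_j = (xo * Xz[k] - Xo[k] * xz) / det
        lam_k = (Xo[j] * xz - xo * Xz[j]) / det
        if lam_j >= -eps and lam_k >= -eps:
            best = max(best, Y[j] * lam_j + Y[k] * lam_k)
    return best

def by_production_dea(Xo, Xz, Y, Z, d, alpha=0.5):
    """Scores (theta, gamma, overall) for DMU d under problem (28)."""
    xo, xz, y, z = Xo[d], Xz[d], Y[d], Z[d]
    theta = y / max_intended_output(Xo, Xz, Y, xo, xz)
    # Least emissions compatible with using at least xz of the dirty input.
    gamma = xz * min(zj / xzj for zj, xzj in zip(Z, Xz)) / z
    return theta, gamma, alpha * theta + (1 - alpha) * gamma

# Hypothetical data for three DMUs (A, B, C).
Xo = [2.0, 4.0, 4.0]
Xz = [4.0, 2.0, 4.0]
Y = [4.0, 4.0, 3.0]
Z = [2.0, 1.0, 4.0]
for d, name in enumerate("ABC"):
    print(name, [round(s, 5) for s in by_production_dea(Xo, Xz, Y, Z, d)])
```

With these numbers DMU A scores 1 on both components, while DMU C obtains θ = 0.5625 (its intended-output benchmark mixes A and B) and γ = 0.5, giving an overall score of 0.53125 when α = 0.5.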

A Missing Constraint?
Dakpo et al. [16] add an additional constraint, Xz λ = Xz μ, to the optimization
problem (28). They argue that the absence of such a restriction can lead to
inconsistencies in the emission-causing input levels in the benchmark/reference
points computed by the program for the two sub-technologies. We argue below that
this is not so.
Let $\langle \theta^{BP}, \gamma^{BP}, \lambda^{BP}, \mu^{BP}\rangle$ solve problem (28). Then convexity and constant
returns to scale of the constructed DEA technology, independence of sub-technology
$T_1$ from emission, and independence of $T_2$ from intended production and use of
non-emission-causing inputs together imply

$$\left\langle X_o \lambda^{BP}, X_z \lambda^{BP}, \frac{y}{\theta^{BP}}, \gamma^{BP} z \right\rangle \in T_1$$

and

$$\left\langle X_o \lambda^{BP}, X_z \mu^{BP}, \frac{y}{\theta^{BP}}, \gamma^{BP} z \right\rangle \in T_2.$$

It is possible (as Dakpo et al. argue) that the former production vector does not lie in
sub-technology $T_2$ and/or the latter production vector does not lie in sub-technology
$T_1$. But since the solution satisfies the constraints of problem (28), we also have

$$X_z \lambda^{BP} \le x_z \quad \text{and} \quad X_z \mu^{BP} \ge x_z.$$

Free disposability of $T_1$ in all inputs and costly disposability of $T_2$ in emission-causing
inputs then imply that the production vector $\langle x_o, x_z, y/\theta^{BP}, \gamma^{BP} z\rangle$ belongs
to both sub-technologies $T_1$ and $T_2$. Thus, although it is possible that
$X_z \lambda^{BP} \ne X_z \mu^{BP}$ at the optimum of problem (28), we nevertheless always have

$$\left\langle x_o, x_z, \frac{y}{\theta^{BP}}, \gamma^{BP} z \right\rangle \in T_1 \cap T_2 = T_{BP}.$$

The case $X_z \lambda^{BP} \ne X_z \mu^{BP}$ is simply indicative of slacks in the use of emission-causing
inputs at the optimum: $X_z \lambda^{BP} < x_z$ (indicating slack in $T_1$) or $X_z \mu^{BP} > x_z$
(indicating slack in $T_2$).

Conflicting Efficiency Improvements in T1 and T2?

Dakpo [15] raises some issues regarding the removal of slack in the input direction
in the particular context of a BP technology (as in (27)). Given that the intended-
output technology T1 satisfies standard free disposability with respect to the
inputs and the economic outputs, input efficiency improvements in intended-output
production entail reductions in the quantities of inputs with no reductions in the
amounts of good outputs produced. On the other hand, features of the emission
generation set T2 (namely, costly disposability in the directions of both emissions
and emission-causing inputs) suggest that, starting from an inefficient (e.g., an
interior) point of T2, it is possible to increase the use of the emission-causing
inputs without increasing the emission levels. Hence, according to Dakpo, efficiency
improvements in the direction of inputs with respect to the sub-technologies T1 and
T2 have conflicting implications, as they involve decreasing the use of inputs with
respect to T1 and increasing the use of inputs in the context of T2. It is also for this
reason that Lozano [36] restricts efficiency improvements in the input direction
to only non-emission-causing inputs. The expressed rationale for this restriction is
that the set T2 is assumed to be independent of non-emission-causing inputs so
that, holding the use of emission-causing inputs fixed, efficiency improvements boil
down to standard reductions in the use of non-emission-causing inputs, increases in
the production of the good outputs, and reductions in the generation of emissions.
Despite the concerns raised by Dakpo, the general definition of economic
efficiency, as spelled out in (27), is unambiguous about the input reductions that
imply efficiency improvement. Scarcity of all productive inputs (both emission-
generating and non-emission-generating) implies that efficiency improvements
entail reducing wastage (removing slacks) in any input direction. Thus, in the
context of the overall BP technology (an intersection of the sets T1 and T2), efficiency
improvements involve reduced use of any input without decreasing the production
of the good outputs or increasing the generation of emissions.

An Overall Intensity Factor?

Ray, Mukherjee, and Venkatesh [47] distinguish between a unified and a decentral-
ized (by-production) approach in their DEA constructions of pollution-generating
technologies. The latter approach involves construction of two sub-technologies,
T1 and T2 , from data on DMUs using two distinct sets of intensity vectors, one
for each sub-technology. The objective is to capture the distinct sets of production
relations satisfied by sub-technologies T1 and T2 . The unified DEA approach, on
the other hand, uses only a single intensity vector to construct an overall technology
satisfying (i) standard free disposability of good outputs and non-emission-causing
inputs and (ii) weak disposability of emissions and emission-causing inputs:

$$\langle x_o, x_z, y, z\rangle \in T \,\wedge\, \lambda \in [0, 1] \implies \langle x_o, \lambda x_z, y, \lambda z\rangle \in T.$$

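Specifically, the Ray et al. technology is

$$
T_{RMV} = \left\{ \langle x_o, x_z, y, z\rangle \in \mathbb{R}^t_+ \;\middle|\;
\begin{aligned}
& \sum_{j=1}^{U} \lambda_j y_j \ge y;\quad \sum_{j=1}^{U} \lambda_j x_o^j \le x_o;\quad \alpha \sum_{j=1}^{U} \lambda_j x_z^j = x_z; \\
& \alpha \sum_{j=1}^{U} \lambda_j z_j = z;\quad 0 \le \alpha \le 1;\quad \sum_{j} \lambda_j = 1;\quad \lambda_j \ge 0 \;\; \forall\, j = 1, \ldots, U
\end{aligned}
\right\}.
\tag{29}
$$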

As pointed out in section “Axiomatic Approach to Modeling Emission-Generating
Technologies,” a BP technology is equivalent to an overall emission-generating
technology satisfying axioms (EG0), (EG1), and (EG2). In particular, (EG1) and
(EG2) imply conditional free disposability and conditional costly disposability
of emission-causing inputs. It is possible that the unified approach of Ray et
al. could violate these axioms. Specifically, weak disposability of emissions and
emission-causing inputs could be in conflict with conditional costly disposability
of the pollution-generating inputs. Intuitively, weak disposability of emissions and
emission-causing inputs implies that, holding the quantities of all good outputs and
non-emission-causing inputs fixed, radial contractions of emissions and emission-
causing inputs are points in the technology. Realistically, however, this may not
be possible, as reductions in emission-causing inputs, such as fossil fuels, may
require a decrease in the good outputs produced when all other inputs are held
fixed.
The Ray et al. DEA technology in (29) is predicated on the use of “two different
intensity vectors” λ and μ (both in $\mathbb{R}^U$), in MRL’s DEA construction of the by-
production technology. They argue that, “[w]hen separate intensity vectors are used,
the peer group for good output production may (and in many cases will) be different
from the one for bad output production.” In fact, the programs (6.1) and (6.2) on
page 130 of MRL generate a single reference point, $\langle y \oslash \theta, \beta \otimes z\rangle$, where θ solves
(6.1), and β solves (6.2). (Note that the program (6.1) is independent of z, and the
program (6.2) is independent of y.) Alternatively, in the notation of this chapter
(see Eq. (28)), the programs generate a unique reference point, $\langle y/\theta^{BP}, \gamma^{BP} z\rangle$,
when inputs are held fixed at $\langle x_o, x_z\rangle$. Thus, the reference production vector is
$\langle x_o, x_z, y/\theta^{BP}, \gamma^{BP} z\rangle$, which lies in $T_{BP} = T_1 \cap T_2$. This construction allows
for the possibility that a firm using $\langle x_o, x_z\rangle$ amounts of inputs is efficient with
respect to the intended production sub-technology $T_1$ but not with respect to the
environmental sub-technology $T_2$, or vice versa, or is inefficient with respect to both
sub-technologies.

Concluding Remarks: The Material Balance Condition

Throughout this chapter, we have made frequent reference to the material balance
condition, a physical relationship that must hold for all production processes. This
condition, embedded in the first law of thermodynamics, intuitively states that
matter can be neither created nor destroyed and hence that the mass of all material
inputs must equal the mass of all material outputs produced. This law of the
conservation of mass and energy was introduced into economics in the seminal work
of Ayres and Kneese [3]. They employed this principle to account for wastes generated
at the macroeconomic level based on the knowledge of the masses of material inputs
employed by the economy and the economic outputs produced. The roles played
by both the first and the second (entropy) laws of thermodynamics in the generation
of emissions as an inevitable consequence of production activities have been more
comprehensively discussed in subsequent papers by Baumgärtner and de Swaan
Arons [6] and Baumgärtner [5].
We have endeavored to present in this chapter multi-relation models of pollution-
generating technologies that are consistent with both of the physical laws of
thermodynamics. There exists, however, a microeconomic literature that aims at
explicit incorporation of the first law into the specification of the technology.
Especially noteworthy are the papers by Pethig [46], Coelli, Lauwers, and Van
Huylenbroeck [12], Chambers and Melkonyan [10], Hampf [31], and Rodseth
[49, 50]. Essentially, these papers introduce a material balance identity along the
following lines into the model of the production process:

$$\alpha \cdot x_z = \beta \cdot y + \gamma \cdot z, \qquad \alpha, \beta, \gamma > 0, \tag{30}$$

where the coefficient vectors α, β, and γ convert input and (intended and unintended)
output flows into common mass units. Many of these works also demonstrate
that material balance conditions are generally violated in the conventional
input and output approaches to modeling emission-generating technologies.
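The accounting content of identity (30) can be shown with a short sketch. All coefficients and quantities below are made-up numbers chosen for illustration; the function names `residual_mass` and `balance_gap` are likewise hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def residual_mass(alpha, xz, beta, y):
    """Mass entering with the material inputs that is not embodied in the
    intended outputs: alpha . xz - beta . y."""
    return dot(alpha, xz) - dot(beta, y)

def balance_gap(alpha, xz, beta, y, gamma, z):
    """Zero when identity (30) holds exactly; positive when some mass is
    unaccounted for by the recorded unintended outputs."""
    return residual_mass(alpha, xz, beta, y) - dot(gamma, z)

alpha, xz = [0.8, 0.5], [10.0, 4.0]   # mass content of two material inputs
beta, y = [0.6], [5.0]                # mass embodied in one intended output
gamma = [1.0, 0.5]                    # mass content of two unintended outputs

print(residual_mass(alpha, xz, beta, y))                  # mass that must leave as waste
print(balance_gap(alpha, xz, beta, y, gamma, [5.0, 4.0])) # both wastes recorded
print(balance_gap(alpha, xz, beta, y, gamma, [5.0, 0.0])) # second waste not recorded
```

When every waste flow is recorded the gap is zero; dropping one flow from the data leaves a positive gap, so the identity cannot hold as an equality in the observed variables alone.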
While the material balance condition (a physical law) must hold at both the macro
and the micro level, MRL and Førsund (2017) discuss some concerns that may
arise when it is directly employed to quantify generation of emissions at a micro
level. In particular, the accounting nature of this condition accurately measures the
amounts of wastes generated only if the researcher has full information about all
the inputs (economic and noneconomic) used and the full set of outputs (good and
bad) produced. This seems possible only if the production process is a completely
closed system, which in turn requires no leakage of unaccounted-for effluents
or inputs. In the space of observable and deducible variables, we cannot expect the
material balance condition to hold as an equality when unobservable variables are
not available to complete the balance. For example, one of the most important forms
of matter in the universe is oxygen, which is an input in many industrial processes
and is difficult to account for in the specification of the technology. As another
example, if only some of the wastes generated during production are policy relevant,
the researcher may not find it worthwhile to collect data on the remaining wastes (or
noneconomic outputs), especially if these wastes are not directly observable.
For these and other reasons, we think the research on the incorporation of the
material balance condition into models of pollution-generating technologies is only
in its formative stage and not yet ready for synthesizing. This inchoate feature of the
research suggests ample opportunities for researchers interested in contributing to
the development of models featuring this condition. We recommend that they start
with careful review of the works cited above.

Cross-References

Data Envelopment Analysis: A Nonparametric Method of Production Analysis

Distance Functions in Production Economics
Economics of Externalities: An Overview
Multiproduct Technologies

References
1. Aparicio J, Barbero J, Kapelko M, Pastor JT, Zofio JL (2017) Testing the consistency and
feasibility of the standard Malmquist–Luenberger index: environmental productivity in world
air emissions. J Environ Manag 196:148–160
2. Ayres RU (1996) Eco-thermodynamics: economics and the second law. Ecol Econ 26:189–209
3. Ayres RU, Kneese AV (1969) Production, consumption and externalities. Am Econ Rev
59:282–297
4. Ball VE, Lovell CAK, Luu H, Nehring R (2004) Incorporating environmental impacts in the
measurement of agricultural productivity growth. J Agric Resour Econ 29:436–460
5. Baumgärtner S (2012) Ambivalent joint production and the natural environment: an economic
and thermodynamic analysis. Springer Science & Business Media, Berlin
6. Baumgärtner S, de Swaan Arons J (2003) Necessity and inefficiency in the generation of waste.
J Ind Ecol 7:113–123
7. Baumol WJ, Oates WE (1975, 1988) The theory of environmental policy, 1st and 2nd edns.
Cambridge University Press, Cambridge
8. Boyd GA, McClelland JD (1999) The impact of environmental constraints on productivity
improvement in integrated paper plants. J Environ Econ Manag 38:121–142
9. Chambers RG (1988) Applied production analysis: a dual approach. Cambridge University
Press, Cambridge
10. Chambers RG, Melkonyan T (2012) Production technologies, material balance, and the
income-environmental quality trade-off. University of Exeter Working Paper
11. Chung YH, Färe R, Grosskopf S (1997) Productivity and undesirable outputs: a directional
distance function approach. J Environ Manag 51:229–240
12. Coelli T, Lauwers L, Van Huylenbroeck GV (2007) Environmental efficiency measurement
and the materials balance condition. J Prod Anal 28:3–12
13. Coggins JS, Swinton JR (1996) The price of pollution: a dual approach to valuing SO2
allowances. J Environ Econ Manag 30:58–72
14. Cropper ML, Oates WE (1992) Environmental economics: a survey. J Econ Lit 30:675–740
15. Dakpo KH (2015) On modeling pollution-generating technologies: a new formulation of the
by-production approach. Paper presented at the 6th EAAE PhD Workshop, Rome. Co-organized
by AIEAA (Italian Association of Agricultural and Applied Economics) and the Department
of Economics of Roma Tre University. http://prodinra.inra.fr/ft?id=1DE1A17F-F41C-41F9-A853-305563DA93A9
16. Dakpo KH, Jeanneaux P, Latruffe L (2016) Modelling pollution-generating technologies in
performance benchmarking: recent developments, limits and future prospects in the non-
parametric framework. Eur J Oper Res 250:347–359
17. Färe R, Grosskopf S (2000) Network DEA. Socio Econ Plan Sci 34:35–49
18. Färe R, Grosskopf S, Lovell CAK (1985) The measurement of efficiency of production.
Kluwer-Nijhoff, Boston
19. Färe R, Grosskopf S, Lovell CAK, Pasurka CA (1989) Multilateral productivity comparisons
when some outputs are undesirable: a nonparametric approach. Rev Econ Stat 71:90–98
20. Färe R, Grosskopf S, Lovell CAK, Yaisawarng S (1993) Derivation of shadow prices for
undesirable outputs: a distance function approach. Rev Econ Stat 75:374–380
21. Färe R, Grosskopf S, Noh D-W, Weber W (2005) Characteristics of a polluting technology:
theory and practice. J Econ 126:469–492
22. Färe R, Grosskopf S, Pasurka C (1986) Effects of relative efficiency in electric power
generation due to environmental controls. Resour Energy 8:167–184
23. Färe R, Grosskopf S, Pasurka C (2013) Joint production of good and bad outputs with a network
application. In: Shogren J (ed) Encyclopedia of energy, natural resources, and environmental
economics, vol 2. Elsevier, Amsterdam, pp 109–118
24. Färe R, Grosskopf S, Pasurka C (2007) Environmental production functions and environmental
directional distance functions. Energy 32:1055–1066
25. Färe R, Lovell CAK (1978) Measuring the technical efficiency of production. J Econ Theory
91:150–162
26. Førsund F (1972) Allocation in space and environmental pollution. Swed J Econ 74:19–34
27. Førsund F (1998) Pollution modeling and multiple-output production theory. Discussion
Paper #D-37/1998, Department of Economics and Social Sciences, Agricultural University
of Norway
28. Førsund F (2009) Good modelling of bad outputs: pollution and multiple-output production.
Int Rev Environ Resour Econ 3:1–38
29. Førsund F (2018) Multi-equation modeling of desirable and undesirable outputs satisfying the
material balance. Empir Econ 54:67–99
30. Frisch R (1965) Theory of production. D. Reidel Publishing Company, Dordrecht
31. Hampf B (2014) Separating environmental efficiency into production and abatement efficiency:
a nonparametric model with application to US power plants. J Prod Anal 41:457–473
32. Kohli U (1983) Non-joint technologies. Rev Econ Stud 50:209–219
33. Kumbhakar SC, Tsionas EG (2016) The good, the bad and the technology: endogeneity in
environmental production models. J Econ 190:315–327
34. Laffont J-J (1998) Ch. 1 in Fundamentals of public economics, translated by Bonin JP and
Bonin H, MIT Press, Cambridge, Massachusetts; London, England
35. Levkoff SB (2013) Efficiency trends in U.S. coal-fired energy production & the 1990 clean
air act amendment: a nonparametric approach. Working paper, UC San Diego. Online version:
http://stevelevkoff.com/uploads/Clean_Air_Act.pdf
36. Lozano SC (2015) A joint-inputs network DEA approach to production and pollution-
generating technologies. Expert Syst Appl 42:7960–7968
37. Luenberger DG (1992) New optimality principles for economic efficiency and equilibrium. J
Optim Appl 75:221–264
38. Murty MN, Kumar S (2002) Measuring cost of environmentally sustainable industrial devel-
opment in India: a distance function approach. Environ Dev Econ 7:467–486
39. Murty MN, Kumar S (2003) Win-win opportunities and environmental regulation: testing of
Porter hypothesis for Indian manufacturing industries. J Environ Manag 67:139–144
40. Murty S (2010) Externalities and fundamental nonconvexities: a reconciliation of approaches
to general equilibrium externality modeling and implications for decentralization. J Econ
Theory 145:331–353

41. Murty S (2015) On the properties of an emission-generating technology and its parametric
representation. Econ Theory 60:243–282
42. Murty S, Russell RR (2002) On modeling pollution-generating technologies. Department of
Economics, University of California, Riverside, Discussion Papers Series, No 02-14
43. Murty S, Russell RR (2018) Modeling emission-generating technologies: reconciliation of
axiomatic and by-production approaches. Empir Econ 54:7–30
44. Murty S, Russell RR, Levkoff SB (2012) On modeling pollution-generating technologies. J
Environ Econ Manag 64:117–135
45. Njuki E, Bravo-Ureta BE (2015) The economic costs of environmental regulation in U.S. dairy
farming: a directional distance function approach. Am J Agric Econ 97:1087–1106
46. Pethig R (2006) Non-linear production, abatement, pollution and materials balance reconsid-
ered. J Environ Econ Manag 51:185–204
47. Ray SC, Mukherjee K, Venkatesh A (2018) Nonparametric measures of efficiency in the
presence of undesirable outputs: a by-production approach with weak disposability. Empir
Econ 54:31–65
48. Reinhard S, Lovell CAK, Thijssen GJ (1999) Econometric estimation of technical and
environmental efficiency: an application to Dutch dairy farms. Am J Agric Econ 81:44–60
49. Rodseth KL (2016) Environmental efficiency measurement and the materials balance condition
reconsidered. Eur J Oper Res 250:342–346
50. Rodseth KL (2017) Axioms of a polluting technology: a materials balance approach. Environ
Resour Econ 67:1–22
51. Russell RR, Schworm W (2011) Properties of inefficiency indexes on input, output space. J
Prod Anal 36:143–156
52. Samuelson P (1948) Economics. McGraw-Hill [19th Edition (2009) by Samuelson P, Nordhaus
WD. McGraw-Hill, Irwin]
53. Serra T, Chambers RG, Lansink AO (2016) Measuring technical and environmental efficiency
in a state-contingent technology. Eur J Oper Res 236:706–717
54. Srivastava RK, Jozewicz W (2001) Flue gas desulphurization: the state of the art. J Waste Air
Manag Assoc 51:1676–1688
55. Vardanyan M, Noh D-W (2006) Approximating pollution abatement costs via alternative
specifications of a multi-output production technology: a case of US electric utility industry.
J Environ Manag 80:177–190
56. Zhou P, Ang B, Poh K-L (2008) A survey of data envelopment analysis in energy and
environmental studies. Eur J Oper Res 189:1–18
Market Structures in Production Economics
13
Devin Garcia, Levent Kutlu, and Robin C. Sickles

Contents
Introduction
The Structure-Conduct-Performance Paradigm
The Bounds Approach
Commonly Used Basic Market Structures
Conduct Parameter Approach
Dynamic Market Structures
Market Structures with Differentiated Products
Market Structure and Market Power
Market Structure and Innovation Studies with No Explicit Treatment for Distorted Production Decisions
   Theoretical Work
   Empirical Work
Conclusion
Cross-References
References

D. Garcia
Ernst and Young, LLP, Houston, TX, USA
L. Kutlu
Department of Economics and Finance, University of Texas Rio Grande Valley, Edinburg, TX,
USA
R. C. Sickles
Department of Economics, Rice University, Houston, TX, USA

© Springer Nature Singapore Pte Ltd. 2022

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_4
538 D. Garcia et al.

Abstract

Our chapter begins by discussing the structure-conduct-performance (SCP)
paradigm, which is an early descriptive literature that provided many of the
stylized facts about market behaviors. This is followed by a discussion of the
bounds approach, which concentrates on making predictions that can hold across
a broad range of industries by basing conclusions on minimal assumptions. We
then briefly talk about commonly used fundamental market structures and
illustrate how various standard concepts are combined to describe market
structures. As dynamic environments and markets with product differentiation
play important roles in defining market structures, we provide additional
information about models with dynamic environments and models with product
differentiation in separate sections. We finalize our review by discussing in
depth the literature on market structure and innovation, productivity
slowdowns, and related problems of income inequality.

Keywords

Structure-conduct-performance paradigm · Structural models · Dynamics · Innovation

Introduction

One of the important purposes of industrial organization is to understand the
environments, i.e., the market structures, in which firms interact with each other,
their customers, and potential entrants, as well as the implications of these
environments for market outcomes. Market structure is characterized by, among
other factors, the number of firms in the market, the concentration of the market,
the technology, the nature of the products, and the presence of information
asymmetry between firms and customers.1 These factors help describe the market
structure and examine how it relates to firm conduct, firm efficiency, and market
performance.
Issues that are taken up by this literature include whether collusion and higher
prices are facilitated in more concentrated markets and whether market concentration
affects firms’ research and development (R&D) or other types of investments.
Among other benefits, an understanding of such relationships helps policy makers
determine whether regulatory actions are needed and, if so, how particular
regulations may be implemented. Often, economic policies are implemented at a
level that affects a wide range of industries, which requires economists to develop
tools that apply in a broad range of settings and industries. However, it is also
essential to understand industry-specific differences and to make industry- or even
firm-specific predictions, as in merger cases. Therefore, not surprisingly, economists
have developed tools that can be useful in both of these settings.

1 Some of these factors may be endogenous, which we will talk about later in the chapter.
13 Market Structures in Production Economics 539

The structure-conduct-performance (SCP) paradigm is an early descriptive
literature that provided many of the stylized facts about market behaviors. The
bounds approach concentrates on making predictions that can hold across a broad
range of industries. This is achieved by basing conclusions on minimal assumptions.
Structural methods, on the other hand, rely on game-theoretic modeling of market
structure and generally focus on specific industries. Since the focus is a specific
industry, stronger assumptions that are compatible with the industry-specific
properties are acceptable.
This chapter begins with a discussion of the SCP paradigm and related literature,
which is followed by a description of, and literature review on, the bounds approach.
Then, the chapter briefly talks about commonly used fundamental market structures.
That section illustrates how various standard concepts are combined to describe
market structures. At the end of that section, additional potential weaknesses of
the SCP approach are briefly discussed. It is followed by a description and brief
literature review of the conduct parameter approach, which is mainly used in the
market power literature. As dynamic environments and markets with product
differentiation play important roles in defining market structures, additional
information about models with dynamic environments and models with product
differentiation is presented in separate sections. The review is finalized by
discussing in depth the literature on market structure and innovation, borrowing
from the perspectives of Schumpeter. As innovation plays a major role in the ongoing
public policy debates about productivity slowdowns and the related problem of
income inequality, this discussion seems particularly relevant to a chapter that
addresses market structure in a Handbook of Production Economics. The last section
concludes.

The Structure-Conduct-Performance Paradigm

The structure-conduct-performance paradigm proposes a one-way causation path
that starts with market structure’s effect on firm conduct, followed by the effect
of firm conduct on market performance. As mentioned in the introduction, market
structure is a function of a number of factors, such as the number of firms in a
market, the concentration of a market, technology, the nature of products, and the
presence of information asymmetry between firms and customers. A general approach
taken by SCP studies is to describe market structure through market concentration.
SCP studies argue that in highly concentrated markets, firms’ conduct is likely
to be more collusive, which would lead to higher prices and thus higher profits.
A potential reason for high concentration is entry barriers. One particular entry
barrier is the presence of scale economies. Bain [7] argues that the need for a firm
to be large in order to obtain productive efficiency creates an entry barrier, and
shows that there is a significant correlation between scale economies and
market concentration. This result is consistent with the idea that scale economies
act as entry barriers. The follow-up studies by Bain and others showed that scale
economies are not always the only driving force for high market concentration. For
example, in markets with intensive investment in advertising and/or R&D, it is
possible to observe high market concentrations.
In the SCP literature, the hypothesis that increased concentration leads to higher
prices is strongly supported by both empirical and theoretical studies. Hence, it
is argued that concentration is bad for consumers, which historically paved the
way for further antitrust legislation. However, a similar strong relation could not
be deduced for market concentration and profitability. In particular, the empirical
results for the relationship between market concentration and profitability are
mixed. One difficulty in examining this relationship stems from measurement and
interpretation issues [31]. Another difficulty is that cross-sectional data used in these
studies frequently come from different industries. This is problematic due to the
fact that demand and supply conditions in these industries may be substantially
different. A final criticism of SCP studies is that market structure is assumed
to be exogenous. Demsetz [27] argues that the positive correlation between market
concentration and profits may be due to efficiency differences among firms. That
is, firms with superior efficiency can gain market share, and this may lead to
higher market concentration levels. Weiss [113] argues that the studies that focus
on concentration and price do not face all the criticisms that market concentration
and profit studies receive. For example, many of the price-concentration studies use
data across specific local markets within an industry. However, as pointed out by
both Bresnahan [17] and Schmalensee [91], many of the price-concentration studies
suffer from serious endogeneity issues. More precisely, the unobserved demand and
supply shocks in a market may not only affect prices but also the market structure
(e.g., entry barriers may be affected through changes in costs).

The Bounds Approach

The underlying idea of the bounds approach, developed by Sutton2 [104, 105], is
to identify strong mechanisms that can characterize market outcomes across a
broad range of environments. Hence, the approach aims to make as few assumptions
as possible about market structure to generate general testable predictions. Ideally,
these assumptions should be based on strongly confirmed empirical regularities to
reach a reasonable conclusion. This approach deviates from the notion of a fully
specified model, which leads to a (unique) equilibrium outcome. Different models
that have been proposed in the literature lead to different equilibrium outcomes,
and the bounds approach aims to find the bounds for these outcomes. Therefore,
rather than providing a single equilibrium outcome, the approach provides a set
of equilibrium outcomes that are reasonable. Below we provide the details of this
approach. Sutton [106] and Ellickson [30] are other studies that provide summaries
of this approach.

2 These papers are inspired by an idea introduced by Shaked and Sutton [98].

To illustrate the bounds approach, we borrow a simple example from Sutton [106],
which involves a two-stage game. There are $N_0$ ($\geq 2$) firms with constant marginal
cost $c$, and they produce a homogenous good. In the first stage, the strategic and
forward-looking firms decide about entry, and in the second stage they compete for
customers. The payoff of a firm equals the profit in the second stage minus the sunk
cost of entry, $e$. All consumers have the same utility, given by
$U = x^{\alpha} z^{1-\alpha}$, where $x$ is the good that is the focus of the study
and $z$ is the outside good. Assume that there is a (high) price $p_0$ such that if
$p > p_0$ the consumers do not buy $x$. The demand schedule is given by $X = S/p$,
where $p$ is the market price, $S$ is the total expenditure on $x$, and $X$ is the
total quantity sold by all firms. Sutton [106] characterizes the equilibrium
as a perfect Nash equilibrium of the game. Let $N$ denote the number of entrants
in the first stage, and solve the symmetric Cournot game in the second stage.
As expected, the price falls as $N$ increases and eventually converges to $c$. Also, the
equilibrium profit per firm in the second stage is $\pi = S/N^2$. This enables us to
solve for the number of entrants by finding the largest $N$ for which $\pi \geq e$.
It turns out that as $S$ increases, the 1-firm concentration ratio, $C_1 = 1/N$,
monotonically converges to zero and output per firm increases. Finally, Sutton [106]
notes that the number of firms increases less than proportionally with $S$.3
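The two-stage example above can be sketched numerically. The following Python snippet is a minimal illustration, not Sutton's own implementation: the function names are hypothetical, the closed forms p = cN/(N − 1) and π = S/N² are the symmetric Cournot solution under the demand X = S/p, and the sketch assumes at least two entrants, so the monopoly case (where the price cap p0 binds) is left out.

```python
import math


def cournot_price(n: int, c: float) -> float:
    """Symmetric Cournot price under demand X = S/p and marginal cost c (n >= 2)."""
    if n < 2:
        raise ValueError("the closed form requires at least two firms")
    return c * n / (n - 1)


def cournot_profit(S: float, n: int) -> float:
    """Second-stage profit per firm, pi = S / n**2 (n >= 2)."""
    if n < 2:
        raise ValueError("the closed form requires at least two firms")
    return S / n**2


def free_entry_firms(S: float, e: float) -> int:
    """Largest number of entrants n such that S / n**2 >= e (viability)."""
    n = int(math.sqrt(S / e))
    # Guard against floating-point rounding in the square root.
    while S / (n + 1) ** 2 >= e:
        n += 1
    while n > 0 and S / n**2 < e:
        n -= 1
    return n


if __name__ == "__main__":
    c, e = 1.0, 1.0  # illustrative values, not taken from the text
    for S in (100.0, 400.0, 1600.0):
        n = free_entry_firms(S, e)
        print(S, n, 1 / n, cournot_price(n, c))
```

Under these assumptions, quadrupling S from 100 to 400 with e = 1 doubles the number of entrants from 10 to 20, which illustrates the less-than-proportional growth in the number of firms.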
One of the key differences between this setting and SCP is that entry is endogenous.
Two fundamental assumptions are that equilibria must satisfy stability
and viability. Stability requires that any firm that chooses not to enter would expect
negative profits if it entered the market. Viability requires that any firm that chooses
to enter the market expects nonnegative profits from doing so.
In general, the bounds approach leads to different conclusions depending on
whether advertising and R&D play an important role in the market. Sutton shows
that when advertising and R&D are relatively less important for an industry, the
market size negatively affects the minimum level of concentration. In other words,
the lower bound for concentration decreases as the market grows larger. This result,
however, does not imply a functional relationship between concentration and market
size. That is, the result does not rule out the possibility of a positive relationship
between market size and concentration. What Sutton finds is a relationship between
the lower bound of concentration and market size. The vagueness of this result is an
artifact of the generality of the assumptions applied.
Another related question that is answered by Sutton is the relationship between
the strength of competition and the lower bound for concentration. He finds that,
for a given market size, an intensification of price competition will shift the lower
bound of concentration upwards. Sutton [106] argues that there are two candidate
mechanisms that may lead to this increase in concentration. Some firms may exit
and/or consolidate via mergers and acquisitions. An important policy implication

3 For additional examples, see Sutton [106].

542 D. Garcia et al.

of this finding is that regulations which aim to increase competition may indeed
lead to market structures that are more concentrated. Of course, as explained earlier,
the bounds approach does not provide functional relationships and thus this result
does not necessarily imply that an increase in competition would lead to more
concentrated market structures.
For industries where advertising and R&D are relatively more important, Sutton
finds that the share of the largest firm remains bounded away from zero as the
market size grows. Hence, concentration is bounded below. This result contrasts with
the findings for industries where advertising and R&D are relatively less crucial.
In advertising- and R&D-intensive industries, larger markets are still more
profitable than small ones, but a larger market size also increases the incentives
to raise fixed investments. This in turn prevents market concentration from falling
indefinitely. These results explain why in some industries firms have incentives
that result in a growing amount of unprofitable sunk investments.
One interesting yet difficult question is whether the results of standard multistage
models carry over to the dynamic games framework. Sutton [105] (Chap. 13,
“Technology and Market Structure Revisited”) presents a dynamic game in this
context. Sutton [106] argues that the multistage game framework excludes certain
kinds of equilibria that may arise in a dynamic framework, which is not surprising
given the richer nature of dynamic games. Sutton explains that this leads to the
appearance of what he refers to as “underinvestment equilibria.” In a dynamic
setting, firms may underspend on R&D. This is because of strategies under which a
firm increases its R&D spending at time t + 1 after its rival has increased its R&D
spending at time t. Nocke [76] shows that this kind of equilibrium may arise
in dynamic games in which the firms can react to rivals arbitrarily quickly. As a
consequence, the lower bound to concentration is lower than in the multistage
counterpart of the dynamic game.
Symeonidis [108, 109] tests some of the important predictions of Sutton via
systematic tests that take advantage of a natural experiment, namely a change
in British competition law that took place in the 1960s. It turns out that as the
laws strengthened, market concentrations generally increased in manufacturing
industries. Ellickson [29] provides empirical support from the US supermarket
industry. The US grocery industry is not very concentrated at the national level and
even less so at the city level (generally dominated by 3 or 4 chain firms). Ellickson
[29] argues that while R&D and advertising play relatively small roles, supermarkets
invest competitively in information technology to increase product variety. As a
result of these investments, the number of local firms is limited.
Marin and Siotis [70] examine Sutton’s results in the chemicals industry. The
advantage of this study is that the chemicals industry comprises many distinct
markets that differ in R&D intensity. Moreover, except for pharmaceuticals, advertising
expenditures are low. This allowed them to concentrate on R&D aspects. As
in many market structure studies, this study encountered the difficulty of defining
the markets and products. Their results support the findings in Sutton [105] that
the lower bound on market concentration is higher and increasing in product
concentration where markets have relatively high R&D intensities.
We finalize this section by pointing out that although the bounds approach of
Sutton is based on minimal assumptions, the key predictions that are described
above depend critically on some of these assumptions. In particular, the assumption
that the game form is exogenous to the model may be challenged. That is, the
assumption that the form of the game remains unchanged as the market size
changes may not be plausible in every context.

Commonly Used Basic Market Structures

The structure-conduct-performance paradigm generally uses market concentration
measures to describe market structure. However, some of the commonly used
economic models provide more precise descriptions of market structures. In this
section, we present a very brief summary of these fundamental market structures that
constitute the core of more detailed and potentially more technical market structure
representations.
The market environment may depend on the nature of products (e.g., homogenous
vs. differentiated products), the choice of strategy variables (e.g., quantity
vs. price), the timing of the strategic decisions, the presence or absence of conditions
that allow price discrimination, etc. The simplest market structure models are given
in the context of homogenous goods, a single price, and static environments. One
widely used market structure type is perfect competition, which is used as a
benchmark model in many settings. In this market structure type, a firm is said to
be competitive if it believes that it cannot affect the market price. The importance of
this market structure type is that the first-best welfare outcome is achieved. Under
this framework, there is no room for strategic interaction, and each firm’s
residual demand curve is flat.
At the other extreme, another benchmark market structure type is monopoly,
where there is only one producer/seller of one or more products. It is possible to add
more structure to this market structure type by providing more details about the costs
and/or technology, whether the monopolist can and is willing to price discriminate
by some specific price discrimination mechanism, the nature of the product(s),
etc. The base version of monopoly sets one price and sells a perishable good. If
average costs are declining over all meaningful quantity ranges, the most efficient
outcome would be for a single firm to produce all output (e.g., public utilities). This
version of monopoly is said to be a natural monopoly. In general, this happens in
markets where the fixed costs are very high and marginal cost is relatively low.
In some market environments, the monopolist may find ways to implement some
form of price discrimination. The essential ingredients for price discrimination are
the ability to identify customers’ types (e.g., their valuations) and the ability to
prevent arbitrage. Broadly, there are three types of price discrimination. First-degree
price discrimination (personalized pricing) happens when the firm has
perfect knowledge of the valuations of the customers. In this framework, the firm
charges each customer a different price based on that customer’s valuation. Total
welfare in this setting is the same as in perfect competition. The distinction is that
all the surplus goes to the monopolist. While initially this type of price discrimination
seemed to be an abstract concept that is hard to find in real-world scenarios,
recent advancements in machine learning algorithms and big data make this type
of price discrimination in some market structures closer to reality than fiction.
With second-degree price discrimination (menu pricing), the firm creates slightly
different products (e.g., different amounts of the same product) for the purpose
of price discrimination. Finally, third-degree price discrimination (group pricing)
divides the market into segments and charges everyone within a segment the same
price. The reason for such segmentation is that the firm may not have the
tools to identify the individual valuations of customers. Hence, the firm instead
identifies groups in an attempt to get a heuristic approximation that simplifies the
identification problem. Two obvious examples of third-degree price discrimination
are student discounts and geographical price discrimination. Armstrong [5] and
Stole [102] survey this literature and provide details of advancements in the price
discrimination literature.
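The welfare comparison in this paragraph can be checked with a small numerical sketch. It assumes a linear inverse demand P = a − bQ and a constant marginal cost c, neither of which is specified in the text; the function names, the segment labels, and the parameter values are illustrative only. The third-degree case further assumes that segments are fully separated, so that each one is priced as its own single-price monopoly.

```python
def uniform_monopoly(a, b, c):
    """Single-price monopoly under inverse demand P = a - b*Q and marginal cost c."""
    q = (a - c) / (2 * b)                 # marginal revenue a - 2bQ equals c
    p = a - b * q
    profit = (p - c) * q
    consumer_surplus = 0.5 * (a - p) * q
    return {"price": p, "quantity": q, "profit": profit,
            "consumer_surplus": consumer_surplus,
            "total_surplus": profit + consumer_surplus}


def first_degree(a, b, c):
    """Personalized pricing: each buyer pays exactly his or her valuation."""
    q = (a - c) / b                       # every buyer with valuation >= c is served
    profit = 0.5 * (a - c) * q            # the seller captures the whole surplus triangle
    return {"quantity": q, "profit": profit,
            "consumer_surplus": 0.0, "total_surplus": profit}


def perfect_competition(a, b, c):
    """Price equals marginal cost; all surplus accrues to consumers."""
    q = (a - c) / b
    consumer_surplus = 0.5 * (a - c) * q
    return {"price": c, "quantity": q, "profit": 0.0,
            "consumer_surplus": consumer_surplus,
            "total_surplus": consumer_surplus}


def third_degree(segments, c):
    """Group pricing: segments maps a label to that group's (a, b) demand parameters."""
    return {name: uniform_monopoly(a, b, c)["price"]
            for name, (a, b) in segments.items()}


if __name__ == "__main__":
    print(uniform_monopoly(10, 1, 2))
    print(first_degree(10, 1, 2))
    print(perfect_competition(10, 1, 2))
    print(third_degree({"students": (6, 1), "others": (10, 1)}, 2))
```

With these illustrative numbers, total surplus is identical under personalized pricing and perfect competition, and only its division between the firm and consumers differs, while the single-price monopoly generates less total surplus.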
If a product is a durable good rather than a perishable good, this has serious
implications for the monopolist. One important implication of market structures
with durable goods is that the monopolist needs to consider the future. In a typical
market with perishable goods, it is possible to use static models; however, this
is inappropriate with durable goods. Coase’s conjecture claims that if consumers
do not discount time heavily and expect the price to fall in the future, then this
would affect the current demand negatively. This in turn forces the monopolist to
charge a lower price compared to what it would charge for a perishable good. Some
remedies that may address consumers’ expectation of lower future prices include
introducing capacity constraints, announcing future prices, or renting. Whether this
conjecture holds or not is discussed by many researchers and has led to a large
literature [32].
So far, the market structure examples that we provided were either perfect
competition or monopoly. Cases that are more interesting can be explored in market
structures with imperfect competition. One important distinction between monopoly and
imperfect competition models is that for the latter, in general, the choice variable
plays a more important role. That is, in imperfect competition models the outcomes
may change substantially depending on whether the firms choose quantity or price.
Similarly, the timing of firms’ actions (e.g., simultaneous versus sequential) plays
an important role.
We start with the imperfectly competitive market structures where the
firms choose quantities. Cournot competition assumes that there are multiple firms
in a homogenous product market that choose quantities to maximize profit. The
equilibrium output is determined such that no firm can increase its profits by
changing its output level if other firms produce Cournot output levels. That is, firms
maximize their profits in accordance with a Nash equilibrium. A characteristic of
this market structure is that an increase in another firm’s output leads the firm to
produce less. In other words, the best response functions are downward sloping.
When the number of firms is one, this market structure coincides with the (base)
monopoly structure. When the number of firms goes to infinity, the outcome of this
market structure converges to that of perfect competition.
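These properties can be illustrated under an assumed linear inverse demand P = a − bQ with a common constant marginal cost c. The text does not commit to a functional form, so the closed forms, function names, and parameter values below are assumptions made for this sketch.

```python
def best_response(q_others, a, b, c):
    """Profit-maximizing output of one firm given the total output of its rivals."""
    return max(0.0, (a - c - b * q_others) / (2 * b))


def cournot_equilibrium(n, a, b, c):
    """Symmetric n-firm Cournot equilibrium: (output per firm, market price)."""
    q = (a - c) / (b * (n + 1))
    price = a - b * n * q
    return q, price


if __name__ == "__main__":
    a, b, c = 10.0, 1.0, 2.0  # illustrative values
    for n in (1, 2, 10, 1000):
        q, price = cournot_equilibrium(n, a, b, c)
        # Each firm's output is a best response to the other n - 1 firms.
        assert abs(best_response((n - 1) * q, a, b, c) - q) < 1e-9
        print(n, q, price)
```

With one firm the function returns the single-price monopoly outcome, and as n grows the price approaches the marginal cost c. The best response falls as rivals' output rises, which is the downward slope described above.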
In the Cournot competition model, firms act simultaneously. If one firm has some
advantage that enables it to move first, then this would change the market outcomes
and lead to another market structure known as Stackelberg competition. The simplest
version of this model assumes that there are two firms in the market that compete on
quantity: the leader and the follower. The leader moves first, and after observing the
leader’s action the follower moves. It is assumed that the leader knows ex ante that
the follower observes its action. Moreover, the follower must not have the ability
to commit to a future non-Stackelberg follower action, and this has to be known
by the leader. One reason why a leader might have a first-mover advantage is that
it may be the incumbent monopolist in the industry, and the follower is a potential
new entrant. Hence, the Stackelberg competition model and its variations play an
important role in describing markets that face potential entrants.4
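A minimal sketch of the leader-follower logic, again under an assumed linear inverse demand P = a − bQ and a common marginal cost c (assumptions of this example, not of the text): the leader chooses its output anticipating the follower's best response. Function names and numbers are illustrative.

```python
def follower_response(q_leader, a, b, c):
    """Follower's profit-maximizing output after observing the leader's output."""
    return max(0.0, (a - c - b * q_leader) / (2 * b))


def stackelberg(a, b, c):
    """Two-firm Stackelberg outcome under P = a - b*Q and marginal cost c."""
    # Substituting the follower's response into the leader's profit and
    # maximizing over the leader's output gives q_leader = (a - c) / (2b).
    q_leader = (a - c) / (2 * b)
    q_follower = follower_response(q_leader, a, b, c)
    price = a - b * (q_leader + q_follower)
    return {"q_leader": q_leader, "q_follower": q_follower, "price": price,
            "profit_leader": (price - c) * q_leader,
            "profit_follower": (price - c) * q_follower}


if __name__ == "__main__":
    a, b, c = 10.0, 1.0, 2.0  # illustrative values
    outcome = stackelberg(a, b, c)
    cournot_duopoly_profit = (a - c) ** 2 / (9 * b)  # per-firm profit with simultaneous moves
    print(outcome, cournot_duopoly_profit)
```

In this linear example the leader earns more, and the follower less, than each firm would earn in the simultaneous-move Cournot duopoly, which is one way to see the first-mover advantage.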
As in the monopoly setting, it is possible to add more details about the market
structure to the quantity-competition-based models. For instance, Stole [102] gives
an example of third-degree price discrimination in a quantity choice setting.
Similarly, Hazledine [47, 48] and Kutlu [62] incorporate a form of second-degree
price discrimination into the Cournot competition framework. They show that, under
linear and some nonlinear demand functions, the quantity-weighted average price
does not depend on the extent of price discrimination. Kutlu [61] incorporates the
same price discrimination type into the Stackelberg competition framework where
the demand is linear and costs are symmetric. He shows that the leader does not use
price discrimination but the follower does. The leader directs all of its first-mover
advantage to attracting the highest-value consumers.
Whether firms actually set quantities or not depends on the particularities of the
relevant markets. Bertrand competition assumes that the firms set prices rather than
quantities. In a homogenous goods market with Bertrand competition, the consumer
is assumed to buy from the firm with the lowest price. When the firms charge the
same price, we need a sharing rule. A sensible sharing rule is to divide the
demand equally among those firms that charge the lowest price. Under these conditions,
if the marginal costs are the same for the firms, the market price falls to the
marginal cost. This result is paradoxical because even with two firms the market
price equals the competitive price, whereas empirical evidence shows that in markets
with a small number of firms the price–cost markup is positive in general. In the
literature, this result is referred to as the Bertrand Paradox. Hence, when firms
choose prices, the market outcome is materially different compared to Cournot
competition.
When capacity constraints are introduced to the price choice framework, it is
possible to achieve outcomes like Cournot competition [59]. In particular, Kreps
and Scheinkman [59] show that it is possible to get Cournot-like outcomes if the
firms first choose capacity and then prices. This suggests a potential solution to the
Bertrand Paradox. However, their result critically depends on the rationing rule that
is used. A technical aspect of Bertrand competition is that the equilibrium may not
exist if the marginal costs are not constant. This leads to alternative models such as
the supply function equilibrium of Klemperer and Meyer [57]. One advantage of the
supply function equilibrium approach compared to Bertrand and Cournot games is
that in this setting a firm adjusts to the uncertainty in an optimal way given the other
firms’ behavior.

4 In other contexts, a seminal work on static entry models is Bresnahan and Reiss [18], which
examines the strategic entry decisions of small retail firms.

Until now, we have concentrated on market structures with homogenous goods.
However, there are many markets where such an assumption may not be realistic.
That is, with rare exceptions, the goods vary in features, quality, location, etc. Also,
although Kreps and Scheinkman [59] provide a plausible solution to the Bertrand
Paradox, a more applicable solution is to introduce differentiated products. Product
differentiation allows the consumers to choose the product variations close to their
tastes. The greater the variety of the products sold in the market, the more likely
consumers would find a better match to their ideal preferences. Hence, product
differentiation helps firms to increase the potential amount of goods that they
can sell. However, product differentiation comes with the potential cost that firms
have to determine the optimal degree of product differentiation.
There are two types of product differentiation. In vertical differentiation,
all consumers have the same preference ranking of the products given that they are
charged the same price. In horizontal differentiation, consumers do not have the
same preference rankings of the products even when the products are sold at the
same price. In the vertical differentiation setting, consumers may rank the products,
say, based on quality. With horizontal differentiation, differences between the products
may be due to factors other than quality, such as color. In the literature,
markets with vertical differentiation are generally examined by models that account
for quality differences of the product variations (e.g., [33, 34, 96, 97]). A common
way to model horizontal differentiation is to use variations of the linear city model
of Hotelling [52] and the circular city model of Salop [88]. In these models, the
distinct preferences of consumers for the products are modelled through a parameter
called transportation cost. If a variation of the product is not close to a consumer’s
taste, then the transportation cost of this individual would be high for this product
variation.
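The role of the transportation cost can be seen in a minimal sketch of the linear city model. The sketch rests on assumptions that the text does not spell out: two firms located at the endpoints of the unit interval, consumers uniformly distributed with unit mass, a linear transportation cost t per unit of distance, a common marginal cost c, and a fully covered market. Function names and parameter values are illustrative.

```python
def indifferent_consumer(p1, p2, t):
    """Location x in [0, 1] at which p1 + t*x equals p2 + t*(1 - x)."""
    x = 0.5 + (p2 - p1) / (2 * t)
    return min(1.0, max(0.0, x))


def best_response_price(p_rival, c, t):
    """Price maximizing (p - c) * demand when both firms serve some consumers."""
    return (p_rival + c + t) / 2


def equilibrium_prices(c, t, iterations=200):
    """Iterate simultaneous best responses, starting from marginal-cost pricing."""
    p1 = p2 = c
    for _ in range(iterations):
        p1, p2 = best_response_price(p2, c, t), best_response_price(p1, c, t)
    return p1, p2


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        print(t, equilibrium_prices(c=2.0, t=t))
```

Under these assumptions the iteration settles at a price of c + t for both firms, so a higher transportation cost (stronger horizontal differentiation) supports a larger markup over marginal cost.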
Many of the current empirical studies choose to model horizontal product
differentiation via discrete choice models. In these models, it is important that
consumers have heterogeneous preferences and choose only one product (out of
a set of finitely many products) to ensure a smooth aggregate demand curve, which
seems to be a requirement that is naturally satisfied in most cases. Later, we will
provide a more detailed discussion of these types of models.
As we discussed in the SCP section, market structures may be described
by market concentration measures such as the four-firm concentration ratio or the
Herfindahl–Hirschman Index (HHI). For example, the Department of Justice
Merger Guidelines use HHI in merger evaluation analysis. One problem
with using concentration measures is that it may be difficult to come up with a proper
market definition on which the calculation of market shares is based. Another difficulty is that
market shares are highly imperfect at describing market dynamics. Thus, they do
not give us a complete or precise description of the market structure. For example,
two homogenous product industries with the same market concentration measures
may have very different characteristics. In particular, in one industry the market
structure may be described with Cournot competition and in the other industry
it may be Bertrand competition. Similarly, in one industry the market structure
may be described by uniform-pricing firms, and in the other industry the market
structure may be described by price discriminating firms. Hence, using HHI as a
measure of competition in a cross-industry study may not always be the best way
to proceed. In such cases, the models that describe the market by more precise
structural information can be used as alternative options.
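For reference, the two concentration measures mentioned above can be computed from firm-level sales as follows. This is a minimal sketch with hypothetical function names and made-up sales figures; shares are expressed in percentage points, so the HHI ranges from near 0 to 10,000.

```python
def market_shares(sales):
    """Percentage market shares from a list of firm-level sales."""
    total = sum(sales)
    return [100.0 * s / total for s in sales]


def hhi(sales):
    """Herfindahl-Hirschman Index: sum of squared percentage shares."""
    return sum(share ** 2 for share in market_shares(sales))


def concentration_ratio(sales, k=4):
    """Combined percentage share of the k largest firms."""
    return sum(sorted(market_shares(sales), reverse=True)[:k])


if __name__ == "__main__":
    sales = [40, 30, 20, 10]  # made-up figures for four firms
    print(hhi(sales), concentration_ratio(sales, k=2))
```

The sketch also makes the limitation discussed above concrete: the only inputs are market shares, so a Cournot industry and a Bertrand industry with identical shares receive the same index.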
In the next section, we consider the conduct parameter (or conjectural variations)
approach that enables researchers to model market structures by “generalizing”
some of the equilibrium concepts that are used in the common models
described above.

Conduct Parameter Approach

An exhaustive survey of the conduct parameter approach along with technical details
and some empirical examples is provided by Perloff et al. [78]. Bresnahan [17] is
another study that provides a detailed summary of this approach. Some of the earlier
works that use this approach are Gollop and Roberts [40], Iwata [53], Appelbaum
[4], Porter [81], and Spiller and Favaro [100].
As mentioned earlier, the focus of the SCP literature is the cross-sectional
study of many industries. The conduct parameter approach, on the other hand,
concentrates on a single industry in an attempt to estimate a conduct parameter
that characterizes firm behavior. Hence, conduct parameter models use economic
theory to guide the empirical model specification while concentrating on a single
industry. These empirical models rely on the theory of conjectural variations to
estimate the conduct parameter, which is mainly used as a measure of market power.
Based on the conjectural variations interpretation, the conduct parameter mea-
sures the market power of firms in a market in a fairly general way by allowing
equilibrium outcomes that may not be supported by the standard equilibrium
concepts such as Nash equilibrium. For example, in the standard Cournot model,
the conjecture is that the firms will have zero reaction; yet conjectural variations
theory allows more general types of reactions. Basically, the conjectural variations
of the firms determine the slopes of their reaction functions. Hence, as in the
common models that we described, the researcher may add some structure (e.g.,
capacity constraints, dynamic factors, price discrimination, etc.) to the model that
describes the market; but at the same time, the researcher may
remain agnostic about the firms’ competitive behavior, i.e., the firm conduct, and
estimate it using the available economic data.
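
One common way to write this idea down, offered here as an illustrative sketch rather than the specification of any particular study cited above, is a supply relation in which a single parameter scales the markup term. The symbols P(Q), MC(Q), N, and ν_i are notation assumed for this illustration.

```latex
% Aggregate supply relation with conduct parameter \theta
P(Q) + \theta \, P'(Q)\, Q = MC(Q), \qquad \theta \in [0, 1]

% Benchmark cases
\theta = 0   \;\Rightarrow\; P = MC \quad \text{(price taking)}
\theta = 1/N \;\Rightarrow\; \text{symmetric Cournot with } N \text{ firms}
\theta = 1   \;\Rightarrow\; \text{monopoly or joint profit maximization}

% Conjectural variations reading for firm i, with \nu_i = dQ_{-i}/dq_i
P(Q) + (1 + \nu_i)\, P'(Q)\, q_i = MC_i(q_i)
```

Under symmetry, q_i = Q/N and ν_i = ν give θ = (1 + ν)/N, so the Cournot conjecture ν = 0 corresponds to θ = 1/N, and intermediate values of θ correspond to the more general reactions discussed above.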
548 D. Garcia et al.

For example, Puller [83] incorporates capacity constraints; Puller [82] and Kutlu
and Sickles [63] incorporate dynamic strategic factors; Graddy [42] and Kutlu and
Sickles [64] incorporate price discrimination in their conduct parameter models.
Corts [25] and Kutlu and Sickles [64] state that in the language of conjectural
variations theory, the conduct is described in terms of firms’ conjectural variations,
which are their expectations about other firms’ reactions. Conduct parameter models
use the conduct parameters to represent the conjectural variations of the firms. Based
on this interpretation, the conduct parameter can take a continuum of values. It is
important to note that the conjectures do not refer to what firms believe will happen
if they change their quantity levels. Rather, what is being estimated is what firms
would do because of their expectations. Hence, as argued by Corts [25], the conduct
parameter can be estimated “as if” the firms are playing a conjectural variations
game, which would reveal the price-cost margins.
However, some researchers may not be comfortable with the idea of a conjectural
variation that allows nonstandard equilibrium outcomes, e.g., equilibria other
than the Bertrand, Cournot, collusion, etc. Although some other researchers have
argued that the folk theorem allows a range of conduct parameter values that
are consistent with Nash equilibrium, they could not make a strong case against
critics that question the validity of using a static model to represent a dynamic
game. Hence, some researchers prefer to view conduct as a parameter that can
take values consistent with existing theories, which would be estimated in the
conduct parameter model. That is, the estimated conduct parameter value can be
used to categorize the competitive behavior of firms by using statistical tests. In
many cases, the researcher would face more than two alternative models to pick
from. For example, the researcher may need to test whether the market outcome is
consistent with perfect competition, Cournot competition, or monopoly. Therefore,
nonnested hypothesis tests (e.g., [112]) can be used as in Gasmi et al. [37].
Alternatively, for simplicity, some researchers may prefer to choose a compatible
model by making pairwise comparisons with standard statistical tests, similar to
the pairwise model comparisons in Bresnahan [16].
Besides the interpretation-related criticisms, the standard conduct parameter
models that do not incorporate the dynamics specifically are intrinsically static.
Corts [25] argues and illustrates by an example that this may lead to severely
mismeasured market power estimates. Puller [83] and Kutlu and Sickles [63] present
conduct parameter models that are robust to the criticism of Corts [25]. That is, they
offer general empirical models that allow the consistent estimation of the parameters
of the model (including the conduct parameter) that is robust to efficient collusion.
As argued by Puller [83], while these models are robust to efficient tacit collusion,
they may not be robust to other forms of dynamic solutions. Nevertheless, these
models nest the static scenario in a testable way, and it would be extremely difficult
if not impossible to design economic models that are robust to every, or even most,
dynamic market behaviors. More details about dynamic market structures in other
contexts are provided in the next section.
One of the distinctions of conduct parameter models from the standard economic
models is that the identification requires an extra effort. In particular, when
13 Market Structures in Production Economics 549

estimating a demand-supply system of equations, the researcher needs to be more
careful compared to standard demand-supply models. The source of this problem
is that not every functional form choice for demand and marginal cost functions
enables separate identification of the marginal cost and conduct parameter. If the
functional form choices are not carefully done, it would be possible to confuse
competitive markets with high marginal cost and collusive markets with low
marginal cost.
Lau [68] and Bresnahan [15] provide some conditions for identification in the
conduct parameter setting. Bresnahan [15] suggests that this identification issue
can be solved by using more general demand functions so that the exogenous
variables do more than parallel shifts, i.e., change the demand slope by rotations.
Hence, the rotations around the equilibrium point would identify the conduct
parameter. This can be achieved by including an interaction term with the quantity
variable. However, Perloff and Shen [79] show that such rotations may cause some
multicollinearity issues.
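
The identification problem and the role of demand rotation can be illustrated numerically. The sketch below assumes a simple linear supply relation, P = c0 + c1*Q + theta*b*Q, where b is the absolute slope of inverse demand; the functional form and all parameter values are hypothetical and chosen only for illustration.

```python
# Sketch: why demand rotation helps identify the conduct parameter.
# Assumed supply relation (illustrative): P = c0 + c1*Q + theta*b*Q,
# where b = |dP/dQ| is the absolute slope of inverse demand.

def supply_price(q, c0, c1, theta, b):
    """Price implied by the supply relation for quantity q and demand slope b."""
    return c0 + c1 * q + theta * b * q

quantities = [1.0, 2.0, 3.0]

# Two hypothetical parameterizations: a competitive market with steep
# marginal cost, and a partly collusive market with flatter marginal cost.
competitive = dict(c0=1.0, c1=1.0, theta=0.0)
collusive = dict(c0=1.0, c1=0.5, theta=0.5)

# Case 1: exogenous variables only shift demand in parallel, so b is fixed.
fixed_b = 1.0
same_without_rotation = all(
    abs(supply_price(q, b=fixed_b, **competitive)
        - supply_price(q, b=fixed_b, **collusive)) < 1e-12
    for q in quantities
)

# Case 2: a rotation variable changes the demand slope across observations.
rotated_slopes = [0.5, 1.0, 2.0]
gaps_with_rotation = [
    supply_price(q, b=b, **competitive) - supply_price(q, b=b, **collusive)
    for q, b in zip(quantities, rotated_slopes)
]

print(same_without_rotation)   # the two models imply identical prices
print(gaps_with_rotation)      # the two models now imply different prices
```

With a fixed slope, only the sum c1 + theta*b is pinned down, so the two parameterizations are observationally equivalent. Once the slope varies across observations, the term theta*b*Q moves separately from c1*Q, which is what allows the conduct parameter to be distinguished from the marginal cost slope.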
Another, potentially more realistic, identification approach would be the non-
parametric structural identification approach in Brown [19], Roehrig [86], and
Brown and Matzkin [20]. Recently, Orea and Steinbucks [77] and Karakaplan and
Kutlu [56] proposed conduct parameter models that can be estimated using stochas-
tic frontier approaches. The advantage of these methods is that they model the firm
and time-specific conduct parameters as random draws from a doubly truncated
normal distribution. Hence, in contrast to existing conduct parameter models, they
use the skewness of the conduct parameter distribution in order to identify marginal
cost and conduct parameters separately without requiring some of the strong
functional form restrictions on the demand and marginal cost functions. In another
context, Kumbhakar et al. [60] also use the stochastic frontier approach to estimate
market powers (i.e., markups) of firms. Their approach allows estimation of market
power even when the input price data are not available. Moreover, their method can
reliably estimate market power with or without the constant returns to scale assumption.

Dynamic Market Structures

Perloff et al. [78] distinguish two types of reasons that make a market structure
dynamic: fundamental and strategic. If the dynamics are
due to a stock variable that affects future profits, they call this
type of reason fundamental. If the dynamic interactions of firms stem from the
belief that rivals will respond to current actions, they call this type of reason
strategic.
In a dynamic setting, the optimization condition should be modified so that price
equals full marginal cost, where the full marginal cost equals the sum of marginal
cost and a term that is a function of shadow value of the constraints that the firms
face. If the reasons are strategic, this can be a function of the shadow value of the
incentive compatibility constraints due to cooperation (e.g., [63, 83]). If the reasons
are fundamental, this can be a function of the shadow value of a stock (e.g., [80, 82]).
The stock variable can be the amount of a natural resource, knowledge, or a quasi-fixed
input. A common example of a quasi-fixed input is capital, as in many applications
it is more expensive to make quicker adjustments in the capital. This is an example
of production-related fundamental reasons (e.g., [28, 85]).
Any market structure that involves a quasi-fixed input requires that firms solve a
dynamic optimization problem. The reason is that the quasi-fixed input affects the
current profits and future levels of quasi-fixed input, which in turn affects future
profits. Therefore, the optimal investment path depends on the current
period’s quasi-fixed input amount and the firm’s belief about future factors such as
input prices. Similarly, in markets where advertising is a relatively more important
tool that can change demand, the firms face a dynamic optimization problem. This
is an example of demand-related fundamental reasons. Perloff et al. [78] argue that
advertising may create a stock effect by increasing the firms’ customers today and
in the future. If the firm has a small portion of the potential customers, the value
of additional advertisement would be large. If this firm invests a large amount in
the current time period, it may boost the current demand by capturing potentially
a large portion of the market demand. This behavior, however, affects the need
for advertisement in the next period because the firm already has a high demand
due to large investment on advertisement in the current period; and thus the value
of additional advertisement for the next period may be small. The intertemporal
connection of advertisement decisions consequently makes the firm’s optimization
problem a dynamic one.
For market structures where dynamics play an important role, open-loop equi-
libria and Markov perfect equilibria are among the most commonly used solution
concepts. With Markov perfect equilibria, firms know that rivals will respond to a
change in the state variable. On the other hand, in open-loop equilibria, the firm will
assume that the rivals will not respond to these changes. Therefore, it seems that
Markov perfect equilibria more closely reflect what many would call rational firms’
behavior. Perloff et al. [78] provide detailed examples of those models that use these
equilibrium concepts.

Market Structures with Differentiated Products

In this section, we concentrate more on the approaches that are used when modelling
markets with differentiated products and their estimation. The first approach
estimates residual demands for close substitutes. Since the degree of market power
of a firm depends on the residual demand elasticity, the residual demand approach
may be useful in studies that analyze market power. Some earlier examples that
use this approach are Bresnahan [14, 16] and Spiller and Favaro [100]. The main
difficulty of this approach is that it is not clear how one estimates all own- and
cross-price elasticities.
The second approach estimates a neoclassical demand system. The general idea
is simultaneously estimating a demand system of goods along with marginal cost
functions for each good. Hence, for n goods this approach estimates a system
of 2n equations (n for demand and n for supply). However, when the number
of goods is relatively large, estimation of such a system of equations may be
impossible due to data requirements. A solution to this problem is estimating a
model with a multilevel demand specification (e.g., Hausman et al. [46]; Hausman
and Leonard [45]). This approach imposes some implicit restrictions on the
cross-price elasticities. In particular, changes in the prices of one category do not affect
the demand for another category. Another solution is estimating a so-called almost
ideal demand system (AIDS). A relatively simple version of this method is the linear
approximation of AIDS that is proposed by [26]. This approach uses Stone’s [103]
geometric approximation to the price index.
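
For reference, the textbook form of the AIDS budget share equation and of the Stone approximation can be sketched as follows. The notation (w_i for the budget share of good i, p_j for prices, X for total expenditure) is assumed here for illustration and is not taken from the studies cited above.

```latex
% AIDS budget share equation for good i
w_i = \alpha_i + \sum_{j} \gamma_{ij} \ln p_j + \beta_i \ln\!\left(\frac{X}{P}\right)

% Exact (translog) price index, nonlinear in the parameters
\ln P = \alpha_0 + \sum_{k} \alpha_k \ln p_k
        + \frac{1}{2} \sum_{k} \sum_{j} \gamma_{kj} \ln p_k \ln p_j

% Stone's approximation, which makes the share equations linear in parameters
\ln P^{*} = \sum_{k} w_k \ln p_k
```

Replacing P with P* removes the parameters from the price index, so each share equation can be estimated by linear methods; this is the sense in which the approximation is "linear".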
One overlooked issue in the literature is that the estimated demand system
may be incomplete because it ignores other goods. It is, however, possible
to estimate an incomplete demand system in a way that is consistent with utility
maximization (e.g., [66, 67]). The third and relatively more widely used approach
estimates a random parameter utility model. This method solves the potential
data requirement problems mentioned above. Moreover, Nevo [73] argues that the
assumptions for consumers to have preferences so that an aggregate consumer exists
and has a demand function that satisfies conditions assumed by economic theory are
strong, and many times these assumptions are empirically falsifiable. Hence, using
an aggregate model may lead to different conclusions compared to a model that
explicitly models individual heterogeneity.
One potential solution is to use discrete choice models, which solve the dimen-
sionality problem by projecting the products into a characteristics space (e.g., [8,
73–75]). These models allow the researchers to model the market structure at a
very detailed microlevel. For example, Nevo [75] estimates brand-level demand
for the ready-to-eat cereal industry and uses the estimates along with the pricing
rules to recover price-cost markups without observing the costs. In contrast to
the conduct parameters method, which estimates the firm’s conduct along with
other parameters, Nevo’s approach assumes that the firms compete under Nash-
Bertrand setting. Hence, the conduct is exogenously given. He instead uses three
hypothetical ownership structures to determine the extent of market power: single-
product firms; the current structure observed in the data; and a multibrand monopoly
producing all brands. Based on these different ownership scenarios, he calculates
the counterfactual price–cost markups that correspond to each of these scenarios.
By using crude measures of actual price–cost markups, he can determine which
ownership structure better fits the observed data. Using the same approach, Nevo
[74] provides an analysis of merger impact prior to its consummation. This provides
an important tool for policy-makers for merger analysis.
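
The logic of comparing markups across hypothetical ownership structures can be sketched with a much simpler demand model than the one used in the studies above. The example assumes a plain logit demand with a single price coefficient alpha, for which Nash-Bertrand pricing implies the same markup, 1 / (alpha * (1 - S_f)), on every product of a firm whose products jointly hold share S_f. The shares, the value of alpha, and the ownership labels are hypothetical, and shares are held fixed at their observed values rather than re-solved for each scenario.

```python
# Sketch under a simple (non-random-coefficient) logit demand assumption.
# With price coefficient alpha > 0 and observed shares, Nash-Bertrand pricing
# implies a common markup for all products owned by the same firm:
#   p_j - c_j = 1 / (alpha * (1 - S_f)),  S_f = sum of the firm's shares.

def logit_markups(shares, owner, alpha):
    """Return the implied markup p_j - c_j for each product."""
    firm_share = {}
    for s, f in zip(shares, owner):
        firm_share[f] = firm_share.get(f, 0.0) + s
    return [1.0 / (alpha * (1.0 - firm_share[f])) for f in owner]

# Hypothetical data: four brands; the outside good holds the remaining 40%.
shares = [0.20, 0.15, 0.15, 0.10]
alpha = 2.0

# Three ownership scenarios, holding shares fixed at their observed values.
single_product = logit_markups(shares, owner=[0, 1, 2, 3], alpha=alpha)
observed = logit_markups(shares, owner=["A", "A", "B", "B"], alpha=alpha)
monopoly = logit_markups(shares, owner=["M"] * 4, alpha=alpha)

for name, m in [("single-product", single_product),
                ("observed", observed),
                ("monopoly", monopoly)]:
    print(name, [round(x, 3) for x in m])
```

Implied markups rise as ownership becomes more concentrated, because a multiproduct firm internalizes the sales that its price increases divert to its own other brands. Comparing such implied markups with independent markup measures is the spirit of the exercise described above, although a random coefficients model would produce richer substitution patterns than this logit sketch.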

Market Structure and Market Power

There is a strong connection between market structure and market power. At a high
level, market structure creates the environment that forms the base of market power.
Although they actually measure market concentration, HHI and, to a lesser extent,
concentration ratios are commonly used proxies for market power. The appeal of
HHI is that it not only gives more weight to larger firms, but it also increases as the
number of firms decreases. However, HHI does not consider the particularities of
markets. For example, although not very likely, it is possible to imagine relatively
competitive markets even with two symmetric firms. Application of the HHI in
practice is also problematic since it requires that goods be homogenous. Hence,
in markets with differentiated products, the usage of HHI requires additional
assumptions, e.g., calculating the market shares based on sales rather than actual
quantities.
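
The two properties of HHI mentioned above can be checked with a few lines of code. This is a minimal sketch that assumes shares are expressed as fractions summing to one (so HHI lies between 0 and 1; it is often reported on a 0 to 10,000 scale instead), and the example markets are hypothetical.

```python
def hhi(shares):
    """Herfindahl-Hirschman index from market shares that sum to one."""
    return sum(s * s for s in shares)

def concentration_ratio(shares, k):
    """Combined market share of the k largest firms."""
    return sum(sorted(shares, reverse=True)[:k])

# Hypothetical markets.
four_equal = [0.25] * 4
two_equal = [0.5] * 2
four_unequal = [0.55, 0.15, 0.15, 0.15]

print(hhi(four_equal), hhi(two_equal), hhi(four_unequal))
print(concentration_ratio(four_equal, 2), concentration_ratio(four_unequal, 2))
```

Moving from four equal firms to two equal firms raises the index, and so does making the four firms unequal, since squaring gives larger firms more weight. Neither number, however, says anything about how the firms actually compete.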
Another common measure of market power is the Lerner index, defined as the
ratio of price-marginal cost markup to price. In a standard static setting, when
price equals marginal cost, this measure equals zero, which indicates a perfectly
competitive market. One issue with the Lerner index is that efficiency improvements
could be mistaken as increments in market power. Koetter et al. [58] propose
efficiency adjusted measures of Lerner index to overcome this issue. Moreover, if the
market structure involves dynamics (due to either strategic or fundamental reasons),
the relevant marginal cost concept for optimality conditions is the full marginal
cost. This suggests that the Lerner index also must use the full marginal cost in its
calculation. Kutlu and Sickles [63] suggest using efficiency adjusted full marginal
cost, which is robust to efficiency- and dynamics-related concerns mentioned above.
Kutlu and Wang [65] provide a variety of conduct-parameter-based models in this
context.
The Lerner index may not always be easy to estimate because of data limitations.
More precisely, calculating the Lerner index requires knowledge of marginal cost,
which is not directly observable and thus must be estimated. One way to do so
is estimating a cost function and calculating the marginal cost from the parameter
estimates of this cost function. This, however, requires total cost data, which is
not always easy to obtain. The conduct parameter approach enables the estimation
of marginal cost implicitly without using the total cost data. Also, this approach
provides an alternative measure of market power, i.e., the conduct parameter. At
least some variations of conduct parameters are shown to be highly correlated with
the Lerner index. In particular, under some assumptions, the conduct parameter
equals the elasticity-adjusted Lerner index.
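
The link between the two measures can be made explicit with a short derivation. It assumes a static supply relation with a single conduct parameter θ, which is one common formulation and not necessarily the one used in every study cited here.

```latex
% Assumed supply relation with conduct parameter \theta
P + \theta \, \frac{dP}{dQ} \, Q = MC

% Lerner index and price elasticity of demand
L = \frac{P - MC}{P}, \qquad \varepsilon = \frac{dQ}{dP}\,\frac{P}{Q} < 0

% Rearranging the supply relation
L = -\theta \, \frac{dP}{dQ}\,\frac{Q}{P} = \frac{\theta}{|\varepsilon|}
\quad\Longrightarrow\quad \theta = |\varepsilon| \, L
```

Under these assumptions the conduct parameter is the Lerner index multiplied by the absolute value of the demand elasticity, so a given markup signals more market power when demand is more elastic.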
Boone [12] proposes a new way to measure market power. He motivates his
analysis by pointing to the theoretical possibility (e.g., [21, 87, 101]) that
more intense competition leads to higher price–cost margins. Hence, Boone [12]
aims to develop a competition measure that is theoretically robust yet requires
data similar to those needed for price–cost margin estimation. Boone [12] calls his measure
relative profit differences (RPD). This measure is defined as follows: Let π(n) be the
profit level of a firm with efficiency level n, where higher n denotes higher efficiency.
Consider three firms with different efficiency levels, n < n′ < n″, and calculate
the following variable: (π(n″) − π(n))/(π(n′) − π(n)). Boone [12] argues that an
increase in competition raises this variable in models where a rise in competition
reallocates output from less efficient to more efficient firms, which covers a broad
range of models. Another related study with a very similar measure is Boone [13],
which proposes a measure called relative profits. This time Boone considers the
variable π(n′)/π(n) for two firms with efficiency levels n′ > n. Boone [13] argues that an increase in
competition reallocates profits from the less efficient firm to the more efficient firm
and thus increases the relative profits measure.
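
Both measures are simple functions of firm profits, as the following sketch shows. The relative profits measure is assumed here to be the ratio of the more efficient firm's profit to the less efficient firm's profit, and the profit figures for the two competition regimes are hypothetical numbers chosen only to illustrate the direction of the comparison.

```python
def relative_profit_differences(pi_low, pi_mid, pi_high):
    """RPD = (pi(n'') - pi(n)) / (pi(n') - pi(n)) for efficiency n < n' < n''."""
    if pi_mid == pi_low:
        raise ValueError("profits of the two less efficient firms must differ")
    return (pi_high - pi_low) / (pi_mid - pi_low)

def relative_profits(pi_less_efficient, pi_more_efficient):
    """Assumed form: profit of the more efficient firm over the less efficient one."""
    return pi_more_efficient / pi_less_efficient

# Hypothetical profits of three firms ordered from least to most efficient.
weak_competition = (10.0, 14.0, 18.0)
strong_competition = (4.0, 7.0, 13.0)

rpd_weak = relative_profit_differences(*weak_competition)
rpd_strong = relative_profit_differences(*strong_competition)
rp_weak = relative_profits(weak_competition[0], weak_competition[2])
rp_strong = relative_profits(strong_competition[0], strong_competition[2])

print(rpd_weak, rpd_strong)
print(rp_weak, rp_strong)
```

In this illustration all profits fall under stronger competition, yet the most efficient firm loses proportionally less, so both measures rise. Only profit and efficiency rankings are needed, which is comparable to the data needed for price–cost margins.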

Market Structure and Innovation Studies with No Explicit Treatment for Distorted Production Decisions

Schumpeter [93] argued that innovation plays the principal role in advancing
economic prosperity. This is supported by the historical record of innovations
that have come and gone since the industrial revolution. The steam engine, light
bulb, automobile, airplane, radio, and integrated circuit immediately come to
mind – each of these innovations has pushed the economic frontier forward,
noticeably improving consumer welfare. Understanding what innovation is and how
competition policy could affect it is consequently important. The chief goal of this
section is to explore the latter, but we first provide a short summary of the origin of
the literature, what innovation is and some of the issues surrounding measurement.
Innovation is defined generally as a new or materially improved product, process,
service or business method. What drives it is the profit motive. Economic policy,
therefore, can only be effective at fostering innovation if it helps to safeguard or
promote the incentive to innovate.
Broadly speaking, one may view the profit motive from two perspectives: (1)
the objective and expectation of realizing an economic gain solely on the merits
of the innovation, with or without existing competition; or (2) the objective to
maintain profitability⁵ at a minimum or prevent an economic loss when competition
is present. The first approach to innovation is purely entrepreneurial in that the
innovator’s investment decision is independent of its competitive environment. In
contrast, the second approach to innovation is a mixture of entrepreneurship and
an effort to survive; i.e., the innovator also takes into account its competitive
environment to optimize its investment strategy. This section limits its discussion
to the second approach to innovation.
Whether or not competition spurs innovation has been a question of signif-
icant interest in the industrial organization literature. One may attribute this to
Schumpeter [93], who argues that in a capitalist system, perfect competition is both
incompatible with economic progress and inferior to large scale enterprise. Taken
at face value, his conclusion might indicate that a more competitive industry stifles
innovation, but this is not Schumpeter’s intention.⁶
Schumpeter’s claim that “big business” is superior to atomistic competition in a
capitalist system is secondary to his main concern, the conventional argument that capitalism is superior to

5 Provided profitability is high enough to fully cover economic costs.

6 For example, Aghion et al. [3] define the “Schumpeterian effect” as the effect of increased
competition lowering post-innovation rents, thereby reducing the incentive to innovate.

socialism because perfectly competitive markets maximize social welfare. Schumpeter
advanced the idea that capitalism is superior not because of price competition
or any particular market structure, but rather because free enterprise begets the
introduction of new products, processes, services, and methods of production.
If one therefore accepts the premise that economic prosperity is the product of
innovation, then perfect competition cannot be the reason for capitalism’s success
as an economic system. Schumpeter demonstrates this point with the following
observations.
The first of these observations is that capitalism is a system of constant and
disruptive structural change. Accordingly, it is not appropriate to assess capitalism’s
performance based on an outcome of static conditions. In particular, while the static
outcome of perfect competition is socially desirable at any given point in time, it
implies that goods and production methods never change. The static efficiency of
perfect competition thus comes at the cost of precluding innovation’s promise of
dynamic and long-run efficiency.
Schumpeter also juxtaposed the assumption of free entry in a perfectly com-
petitive market against an innovator’s expected return on investment. That is, even
though free entry is a necessary condition for maximizing social welfare, quick and
costless entry into a new market is at odds with an innovator’s incentive to invest.⁷
In particular, if free entry implies costless and nearly instantaneous replication of the
innovator’s idea, then the innovator’s expected return will not exceed its opportunity
cost of investment. Even ignoring the assumption of homogeneous goods and static
production technologies, it follows that a firm in a perfectly competitive market has
no incentive to innovate.
The final observation is that large scale and anticipated market power facilitate
investment in innovative ideas. More specifically, the financial position of a prof-
itable, large firm grants it the opportunity to take on riskier innovative activity than
a small firm. And anticipated market power, by way of an intellectual property right,
promotes innovative activity by allowing the innovator to temporarily charge a price
in excess of what the market would bear. The latter observation, while not novel,
reinforces Schumpeter’s point that free and prompt entry is problematic for
innovation. Schumpeter additionally advocates for intellectual property rights because
they permit a firm to better plan for the future. On the other hand, Schumpeter’s view
that large scale is a boon rather than a detriment to innovation is meant to highlight
the limitations of a perfectly competitive firm, specifically limitations for growth
due to relatively little capital to finance risky innovative activity, and the relative
inability to weather the constant flow of outside innovations.

7 The idea that free entry will quickly erode an innovator’s supranormal profit assumes that the
innovator’s competition is effective; that the idea embedded in the innovation is copied well and
easily; and that being a first mover does not confer a material advantage. While this is not always
the case (see, for example, Boldrin and Levin [11]), the topic of intellectual property rights is
outside the scope of this chapter.

It bears repeating that Schumpeter’s criticism of perfect competition as a model
of efficiency is not a statement about the relationship between innovation and ex
ante market structure, market power or level of competition. Rather, Schumpeter’s
intention is to demonstrate that the assumptions of perfect competition are wholly
in conflict with innovation, and therefore economic progress. The idea that ex ante
market power fosters innovation is nevertheless commonly ascribed to Schumpeter.
This is borne out by the multitude of studies that have tested some form of
Schumpeter’s proposition.
Before going into some detail about the literature alluded to above, we briefly
touch on the issue of measurement. A common measure of competition is the
Lerner index. The ubiquity of this measure is traced to its relationship with the price
elasticity of demand, the source of a firm’s market power when price is the only
choice variable. Indeed, when a single, profit maximizing firm takes its demand
as given, its equilibrium Lerner index will be inversely proportional to its price
elasticity of demand. Thus, a larger Lerner index is taken to indicate greater market
power or less competition.
There are several reasons why this measure could fail to indicate changes in
market power. One possibility is that a firm engages in cost-minimizing behavior.
In this case, a temporal increase in the Lerner index would reflect nothing more
than the competitive process at work. The life cycle of a product will also affect
the margin we observe in the data. For many goods, as a product matures, focus
shifts from differentiation and quality improvement to cost reduction [111]. That
is, commoditization will take effect in the later stages of a product’s life cycle, and
margins will generally fall as a result. A firm’s margin will also increase if it is
able to generate a competitive advantage. For example, in a hypothetical industry of
two firms selling the same good, successful product innovation by one of the firms
will allow the innovator to command a higher price for its product. The average
margin in the industry will increase, but if the competitive advantage possessed by
the innovator pushes the noninnovator to invest in developing a better product, then
rivalry will have been preserved if not strengthened. Thus, an increase or decrease
in margins does not fundamentally indicate a fall or rise in competitive intensity,
respectively.
The Herfindahl–Hirschman index is another commonly used measure of com-
petition. The main attraction of this measure is that it resonates with the intuition
that more concentrated industries are less competitive (i.e., industries with few firms
and/or firms with substantial market share are viewed as relatively less competitive).
Also attractive is the Herfindahl–Hirschman index’s theoretical link to market
structure. For example, the Herfindahl–Hirschman index corresponds to monopoly
when the market is captured entirely by one firm and perfect competition when there
are infinitely many firms with equal shares.
The prevalence of the above measures is based on the presumption that changes
in market structure or market power will identify changes in competitive intensity.
This presumption is questionable, however, if the defining feature of a more
competitive market is greater rivalry to deliver superior products and services. In
particular, it does not follow that a more profitable or concentrated industry will
be less inclined to compete vigorously (or vice versa), especially when changes in
market structure or market power are measurably small. A more competitive market
will arise when, on average, a firm faces a greater risk of falling behind (resp.,
a greater prospect of moving ahead) its competitors if it offers inferior (superior)
value to consumers. That is, a more contestable market will result in greater rivalry.
Shapiro [99] calls this the “contestability” principle, and it is adopted by Garcia [36]
to examine the relationship between competition and innovation.
Application of the “contestability” principle is consistent with antitrust policy.⁸
The 2010 Horizontal Merger Guidelines (HMG) of the United States Department of
Justice and the Federal Trade Commission state the following:⁹
“The unifying theme of these Guidelines is that mergers should not be permitted to create,
enhance, or entrench market power or to facilitate its exercise . . . [where by definition] a
merger enhances market power if it is likely to encourage one or more firms to raise price,
reduce output, diminish innovation, or otherwise harm customers as a result of diminished
competitive constraints or incentives.”

The Guidelines identify the root of enhanced market power as any market
characteristics that would lessen rivalry or the incentive to compete. While it is
possible that enhanced market power could manifest as greater profitability or
market concentration, the converse is not necessarily true. In contrast, any policy
that would serve to make a market more contestable, all else equal, will intensify
competition.
The measurement of innovation is equally important. The four defining features
of innovation are: investment; expansion of revenue and/or profit not attributable to
routine labor and capital; relatively rare occurrence; and growth in economic output
that exceeds growth in inputs. The three most common measures of innovative
activity are R&D intensity, patent counts, and productivity growth. We briefly argue
below that productivity growth is the most appropriate measure for capturing the
above features.
R&D intensity, defined as the ratio of R&D expenditure to sales, is perhaps
less used today than it was in the nascent stages of the literature. Intuitively,
R&D intensity captures the “innovative effort” of a firm, but it is not a measure
of innovative output. The latter ultimately makes R&D intensity an inappropriate
measure of innovation. But even as a measure of effort, it suffers from a variety
of problems. Notably, there is no requirement for R&D expenditure to be reported,
and the peculiarities of firm-level accounting methods make it unsuitable as an
estimate of relative innovative activity [10].

8 Anne Bingaman, a former Assistant Attorney General for the Antitrust Division in the U.S.
Department of Justice during 1993–1996, said the following about rivalry and innovation: “The
fundamental thesis of strong antitrust enforcement is that rivalry, not market power, fosters
innovation and efficiency over the long run . . . Antitrust has an important role in preserving the
rivalry that spurs innovation.” See Bingaman [9].
9 See HMG [49] for the complete set of guidelines governing antitrust policy in the United States.

Patent statistics are often used as a measure of innovation because patents
represent a novel idea. A citation-weighted patent count, in particular, is the number
of forward citations a patent has; its use is based on the idea that more citations
are indicative of greater economic value. There are several problems with patents
as a measure of innovation, however. One, patents represent only a fraction of
innovative output [41]. Two, the incentive to patent encompasses more than just
the intent to protect a novel and valuable idea; companies also seek out patents to
defend against litigation or to litigate themselves [43]. Three, evidence suggests that
companies view patents as relatively weak mechanisms for protecting intellectual
assets; instead, trade secrecy and lead time to market are viewed as more effective
[22]. Four, the presumption that a greater number of citations reflects a more valuable
idea has been challenged. Empirical evidence indicates that the relationship between
citations and economic value is an inverted-U [1]. Intuitively, the most valuable
patents are the ones firms actively try to protect the most, resulting in fewer citations.
For the reasons above, it has been argued that productivity growth is the most
appropriate measure of innovation. As Dale Jorgenson and others at the NBER
argue (see, for example, [54]):
. . . Productivity growth is the key economic indicator of innovation. Economic growth can
take place without innovation through replication of established technologies. Investment
increases the availability of these technologies, while the labor force expands as population
grows. With only replication and without innovation, output will increase in proportion
to capital and labor inputs, as suggested by Schultz ([94], [95]). By contrast the successful
introduction of new products and new or altered processes, organization structures, systems,
and business models generates growth of output that exceeds the growth of capital and labor
inputs. This results in growth in multifactor productivity or output per unit of input . . . .

Thus, not only does productivity growth as a measure of innovation avoid the
economic complications underlying patents, it is consistent with long-run economic
growth – precisely why Schumpeter was such an advocate for anything that would
serve to promote innovation.
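The distinction between replication and innovation in the passage quoted above can be illustrated with a small growth-accounting calculation. The sketch below is only an illustration: the share-weighted input aggregate, the function name `mfp_growth`, and all numbers are assumptions introduced here and are not taken from [54].

```python
def mfp_growth(output_growth, capital_growth, labor_growth, capital_share):
    """Multifactor productivity growth computed as a residual.

    Assumption (hypothetical): aggregate input growth is a share-weighted
    average of capital and labor growth, with the two shares summing to one.
    """
    if not 0.0 <= capital_share <= 1.0:
        raise ValueError("capital_share must lie in [0, 1]")
    input_growth = (capital_share * capital_growth
                    + (1.0 - capital_share) * labor_growth)
    return output_growth - input_growth


# Pure replication: output grows in proportion to inputs, so the residual is zero.
replication = mfp_growth(0.02, 0.02, 0.02, capital_share=0.3)

# Innovation: output growth exceeds share-weighted input growth.
innovation = mfp_growth(0.04, 0.03, 0.01, capital_share=0.3)
```

Under these made-up numbers, the first case has no productivity growth, while the second leaves a positive residual that the argument above would attribute to innovation.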
The main concern of early empirical work in the literature was the effect of firm
size and market concentration on innovative activity, the latter being some measure
of R&D expenditure. Little evidence was found that supported a statistically robust
or economically significant cross-sectional size effect. More recent empirical IO
studies have turned their attention to the relationship between ex ante market power
and innovation. Typical measures of competition and innovation in this literature are
the Lerner index and patent statistics, respectively. The results are mixed despite the
use of panel data, similar methods and more sophisticated econometric techniques.
Some studies find a negative relationship, some find a positive relationship and some
find an inverted-U relationship.
A related but distinct empirical literature presents strong evidence of a positive
relationship between competition and innovation. This literature examines changes
in productivity at the microlevel in response to discrete changes in the competitive
environment. In particular, this literature uses natural experiments to identify
changes in competitive intensity. For example, one study finds that a group of US
558 D. Garcia et al.

iron ore manufacturers nearly doubled their labor productivity in the 1980s after
Brazilian manufacturers entered their market.
Economic theory also examines the relationship between competition and inno-
vation. Broadly speaking, two approaches have been taken. One approach compares
how much an incumbent firm would be willing to invest in R&D compared to
a potential entrant, and the other approach examines the relationship between a
parameter that affects market structure and R&D effort. Like the empirical IO
literature, the results are mixed. The incumbent-entrant class of models predicts both
a positive and negative competition-innovation relationship, while predictions from
the parametric class of models range from a positive relationship to an inverted-U.
Overall, both empirical evidence and theory paint an unclear picture of the
competition-innovation relationship. The nebulous qualities and complex interaction
of these variables are doubtless a major reason why. Our endeavor in this section
is to provide a synthesis of the competition-innovation debate and the implications
it has for economic growth.

Theoretical Work

Theoretical Models of Market Structure, Incumbency, and the Incentive to Innovate
Arrow [6] was the first to rigorously examine the relationship between market
structure and the incentive to innovate, the latter defined as the difference between
post- and pre-innovation profit. Under the crucial assumption that property rights
are perfectly exclusive and infinitely lived, Arrow analyzes the relationship by
comparing a monopolist’s incentive to reduce marginal cost to that of a perfectly
competitive firm.
Two types of process innovation are possible in his model: drastic and nondrastic.
With drastic innovation, the new technology achieves a profit maximizing price that
is less than the marginal cost of the old technology. Consequently, the monopolist
remains a monopolist with drastic innovation, and the perfectly competitive firm
becomes one.10 It follows that the incentive to innovate will be higher for a
competitive firm since its post-innovation profit is the same as the monopolist’s,
while its pre-innovation profit of zero is lower.
In the case of nondrastic innovation (i.e., the new profit maximizing price is
higher than the marginal cost of the old technology), the post-innovation profit
for a competitive firm will be less than a monopolist’s. This is a consequence of
the fact that a competitive firm cannot profitably charge a price that exceeds the
prevailing competitive price. Instead, a competitive firm’s post-innovation profit

10 To see the latter, note that the marginal cost of the old technology is equal to the prevailing price
in a perfectly competitive market. Thus, with drastic innovation and exclusive property rights, a
perfectly competitive firm will drive its competitors out of the market with the new price and
subsequently become a monopolist.

will be limited to the unit royalty it charges (i.e., the difference between the old
and new marginal costs) times the number of units sold in the market at the old
price. Notwithstanding, Arrow shows that a monopolist’s incentive to innovate will
still be less than a competitive firm’s, precisely because the monopolist applies the
cost reduction to relatively fewer units of output in equilibrium.
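Arrow’s comparison can be made concrete with a numerical sketch. The linear demand curve Q = a - p, the function names, and the parameter values below are assumptions chosen for illustration only; they are not part of Arrow’s general argument.

```python
def monopoly_price(a, c):
    """Profit-maximizing price under the assumed linear demand Q = a - p."""
    return (a + c) / 2.0


def monopoly_profit(a, c):
    """Monopoly profit under the same assumed demand, for marginal cost c < a."""
    return ((a - c) / 2.0) ** 2


def is_drastic(a, c_old, c_new):
    """Drastic if the new monopoly price lies below the old marginal cost."""
    return monopoly_price(a, c_new) < c_old


def monopolist_incentive(a, c_old, c_new):
    """Post-innovation minus pre-innovation monopoly profit."""
    return monopoly_profit(a, c_new) - monopoly_profit(a, c_old)


def competitive_incentive(a, c_old, c_new):
    """Incentive of a competitive firm whose pre-innovation profit is zero."""
    if is_drastic(a, c_old, c_new):
        # The innovator becomes a monopolist with the new technology.
        return monopoly_profit(a, c_new)
    # Nondrastic: unit royalty times the quantity sold at the old price c_old.
    return (c_old - c_new) * (a - c_old)
```

With a = 10 and an old marginal cost of 6, a reduction to 5 is nondrastic and a reduction to 1 is drastic under these assumptions. In both cases the competitive firm’s incentive exceeds the monopolist’s, in line with the result described above.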
Gilbert and Newbery [39] build on Arrow’s model by allowing for the possibility
of entry into the monopolist’s market. Specifically, an incumbent (i.e., monopolist)
bids for a patent on a new, substitute technology to preempt entry by a challenger,
and a challenger bids for a chance to compete with the incumbent.
In this setup, Gilbert and Newbery find that preemption is a rational strategy –
in fact, a Nash equilibrium – for the incumbent if monopoly profits with the new
technology exceed the costs of preemption. This will hold if post-entry industry
profit is less than preemptive monopoly profit. That is, letting $e$ and $m$ denote
entrant and monopolist, respectively, if $\pi^{m}(p_1^{m}, p_2^{m}) > \pi^{m}(p_1^{m}, p_2^{e}) + \pi^{e}(p_1^{m}, p_2^{e})$
holds, where $\pi^{i}(\cdot)$ represents firm $i$’s profit function, $i \in \{m, e\}$, and
$p_j^{i}$ represents the price of product $j$, $j \in \{1, 2\}$, then the incumbent will have the
incentive to submit a larger bid than the challenger.11 Gilbert and Newbery show that
this condition holds under fairly weak assumptions, implying that the incumbent’s
incentive to innovate will be relatively greater if it has more to lose from entry than
the challenger has to gain.
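The preemption logic reduces to a comparison of what each firm stands to gain from holding the patent. The sketch below assumes, for illustration, that each firm’s maximum bid equals its profit with the patent minus its profit without it; the function names and numbers are hypothetical.

```python
def max_bids(monopoly_profit, incumbent_duopoly_profit, entrant_duopoly_profit):
    """Return (incumbent_bid, entrant_bid) under the stated assumption.

    The incumbent keeps monopoly profit if it wins and falls to its duopoly
    profit if it loses. The entrant earns its duopoly profit if it wins and
    zero otherwise.
    """
    incumbent_bid = monopoly_profit - incumbent_duopoly_profit
    entrant_bid = entrant_duopoly_profit
    return incumbent_bid, entrant_bid


def incumbent_preempts(monopoly_profit, incumbent_duopoly_profit,
                       entrant_duopoly_profit):
    """True when monopoly profit exceeds total post-entry industry profit."""
    incumbent_bid, entrant_bid = max_bids(
        monopoly_profit, incumbent_duopoly_profit, entrant_duopoly_profit)
    return incumbent_bid > entrant_bid
```

For example, with a monopoly profit of 10 and post-entry profits of 4 and 3, the incumbent would bid up to 6 against the challenger’s 3, so preemption occurs. If entry raised industry profit above 10, the ranking of bids would reverse.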
Reinganum [84] extends the model of Gilbert and Newbery [39] by introducing
uncertainty to the innovation process. Specifically, she assumes that innovation
follows a Poisson process with an exponentially distributed date of successful
innovation. In turn, the incumbent and challenger choose investment levels that
increase their chance of innovating first.
Her model has three possible outcomes. If the incumbent succeeds in reducing
its current marginal cost from $\bar{c}$ to $c < \bar{c}$ and secures a patent before the challenger,
then it will earn flow profits $\Pi(c)$. If the challenger succeeds before the incumbent,
then the incumbent and challenger will earn Cournot flow profits $\pi_I(c) < \Pi(c)$ and
$\pi_C(c)$, respectively. If neither succeeds, then the incumbent and challenger maintain
their pre-innovation profit flows at $R$ and zero, respectively.12
In the spirit of Arrow [6], Reinganum defines drastic innovation as an innovation
that lowers marginal cost to a level $c \le c_0$, where $c_0$ is the largest value of $c$ such that
$\pi_I(c) = 0$. Coupled with the assumption of constant returns to scale, this condition
implies that the incumbent produces zero output. The incumbent is thus knocked
out of the market by the challenger when innovation is drastic, allowing the latter to
earn profit flows $\pi_C(c) = \Pi(c)$.
The Nash equilibrium of this game implies that the challenger will unambigu-
ously spend more on R&D than the incumbent when innovation is drastic. And to

11 Product j = 2 represents the new, substitute technology.

12 Reinganum assumes that $\Pi(c)$ and $\pi_C(c)$ are nonincreasing and that $\pi_I(c)$ is nondecreasing in
$c$, the intuition being that the successful (unsuccessful) innovator’s flow profits are higher (lower)
the greater is the reduction in cost.

the extent that the equilibrium solution can be analyzed under nondrastic innovation,
Reinganum shows that there exists a nontrivial set of nondrastic innovations where
the challenger’s incentive to innovate is relatively greater.13
The assumption of stochastic innovation is critical to the innovation-market
structure relationship, as demonstrated by the different results reached by Rein-
ganum [84] and Gilbert and Newbery [39]. Why the difference occurs may be
explained by how investment is modeled. Specifically, whereas in the model of
Gilbert and Newbery there is no investment decision (the incumbent and challenger
innovate with probability one), Reinganum assumes that further investment can only
marginally increase the probability of successful innovation; i.e., the assumptions of
Reinganum’s model imply diminishing returns to investment. The marginal increase
in expected post-innovation flow profits (via the marginal increase in probability
of relatively early innovation) is accordingly offset by the marginal cost of further
investment at some point. And this point occurs earlier for the incumbent precisely
because its pre-existing profit flows must be replaced.

Theoretical Models on the Degree of Competition and the Incentive to Innovate
Theoretical models in the vein of Arrow [6] may be viewed as “discrete” in the
sense that they compare the incentive to innovate across two types of firms: one
with market power and one without. More recent theory, however, examines the
competition-innovation relationship with competition measured on a continuum.
Kamien and Schwartz [55] were the first to take this approach.
In their model, a firm chooses a date to introduce innovation based on the degree
of rivalry it faces, where greater rivalry is modeled as a parameter that accelerates the
expected date of rival innovation. Specifically, the decision problem facing the firm
is to choose an innovation arrival time that maximizes its present value of expected
innovation cash flows, conditional on the probability of a rival innovating first.14
The firm formally chooses its development date $T^*$ to solve

$$\max_{T} \int_{T}^{\infty} e^{-(r-g)t}\left[P_0\,(1 - F(t)) + P_1\,(F(t) - F(T)) + P_2\,F(T)\right] dt - C(T),$$

where $r > g$ is the discount rate; $g$ is market growth; $F(\tau)$ is the probability of rival
introduction by time $\tau$; $P_0(1 - F(t))$ is the expected payoff to the firm conditional

13 This is for nondrastic innovations where the reduction in cost is sufficiently close to the drastic
level $c_0$.
14 Kamien and Schwartz [55] examine two versions of this model, one with patent protection
and one without. We omit the model with patent protection as the results are not affected by the
differences in appropriability.

on no rival entry by time $t$; $P_1(F(t) - F(T))$ is the expected payoff to the firm
conditional on it innovating before a rival and the rival appearing between time
$T$ and $t$; $P_2$ is the expected payoff to the firm conditional on a rival innovating
before time $T$; and $C(T)$ is the minimum present value of the cost of completing
development by time $T$. It is assumed that $C'(T) < 0$ and $C''(T) > 0$, and $P_0 \ge P_1$
and $P_0 \ge P_2$.
In addition to other parameters, the optimal date $T^*$ is a function of the hazard
rate of successful innovation by a rival, $h \equiv F'(t)/(1 - F(t))$. Kamien and Schwartz
assume for simplicity that the probability of innovation is memoryless, implying a
constant hazard rate with cumulative distribution function $F(\tau) = 1 - e^{-h\tau}$, $\tau \in [0, T]$. The
significance of h is its tie to the degree of rivalry. Namely, because the inverse of h is
the expected date of innovation by a rival, an increase in h heightens the probability
of a rival innovating first. Kamien and Schwartz accordingly interpret an increase in
h as an intensification of rivalry.
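The role of the hazard rate can be explored numerically. The sketch below assumes the exponential form for the rival’s distribution, F(τ) = 1 - exp(-hτ), under which the discounted integral in the firm’s objective has a closed form. The cost function C(T) = k/T is a hypothetical choice that merely satisfies C'(T) < 0 and C''(T) > 0, and all function names and parameter values are illustrative rather than taken from Kamien and Schwartz [55].

```python
import math


def rival_cdf(tau, h):
    """Probability that a rival has innovated by time tau (constant hazard h)."""
    return 1.0 - math.exp(-h * tau)


def expected_rival_date(h):
    """Mean of the exponential distribution: the inverse of the hazard rate."""
    return 1.0 / h


def expected_revenue(T, h, r, g, P0, P1, P2):
    """Closed form of the discounted payoff integral from T to infinity."""
    rho = r - g
    if rho <= 0 or h <= 0:
        raise ValueError("requires r > g and h > 0")
    # Part of the integrand that decays with the rival's survival probability.
    lead_term = (P0 - P1) * math.exp(-(rho + h) * T) / (rho + h)
    # Part of the integrand that is constant in t once T is fixed.
    tail_term = ((P1 * math.exp(-h * T) + P2 * rival_cdf(T, h))
                 * math.exp(-rho * T) / rho)
    return lead_term + tail_term


def development_cost(T, k=1.0):
    """Hypothetical cost of finishing by T: decreasing and convex for T > 0."""
    return k / T


def best_date(h, r, g, P0, P1, P2, grid):
    """Grid search for the development date that maximizes net present value."""
    return max(grid, key=lambda T: expected_revenue(T, h, r, g, P0, P1, P2)
               - development_cost(T))
```

Evaluating `best_date` over a range of values of `h` traces out how the chosen date responds to rivalry for a given set of payoffs. The shape obtained depends on the assumed payoffs and cost function, so the sketch should not be read as reproducing the original results.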
As Kamien and Schwartz demonstrate, the relationship between h and T∗
depends on the size of the innovation (defined as the amount of profit rewarded
for innovating). When the size of innovation is sufficiently large, the relationship
takes a U shape. That is, for relatively small values of h, the marginal cost saving of
postponement is less than the marginal loss of delay, pushing the firm to accelerate
its development date. But for sufficiently high values of h, the effort required to
preempt a rival’s development date becomes excessively costly, driving the firm to
postpone. In the case of small innovation size, h and T∗ are strictly positively related.
Intuitively, an already small reward for innovation makes any additional effort to
pre-empt rival innovation less attractive.
Loury [69] extends the model of Kamien and Schwartz [55] to a game-theoretic
setting. In particular, he examines how R&D incentives are affected by rivalry
when a finite number of symmetric firms, n, choose their investment strategies
simultaneously. As in Kamien and Schwartz [55], Loury assumes that the date of
successful innovation is memoryless. Thus, the probability of successful innovation
by time $t$ is given by $P(\tau(x) \le t) = 1 - e^{-h(x)t}$, where $\tau(\cdot)$ is the random date
of successful innovation, $x$ is R&D expenditure and $h(\cdot)$ is the hazard rate of
innovation. It is assumed that $h'(x) > 0$, with $h''(x) \ge 0$ for $x \in [0, \bar{x}]$ and $h''(x) < 0$ for
$x \in (\bar{x}, \infty)$.
Each firm chooses its R&D expenditure to maximize its present value of
discounted cash flows, taking as given the collective R&D expenditure of its rivals.
The symmetry of the problem implicitly defines the optimal R&D expenditure
function $x^* = \hat{x}(a, r, V)$, where $a = (n - 1)h(x^*)$ and $V$ and $r$ denote the flow
of revenues from successful innovation and the interest rate, respectively.
In the special case of no strategic effects, Loury shows that a firm’s R&D
expenditure will either first increase and then decrease with rival R&D expenditure
(i.e., the parameter a), or monotonically decrease. This mirrors the results of Kamien
and Schwartz [55]. When strategic effects are allowed (and n ≥ 2), however,
individual R&D expenditure and the expected date of innovation strictly decrease

with n.15 Hence, while greater rivalry (as measured by an increase in n) lowers
individual R&D expenditures, it also accelerates the introduction of innovation.
Intuitively, an increase in the number of firms in a symmetric industry will lower
expected profits, thus deterring firms from investing in R&D; concurrently, more firms will
simultaneously invest in R&D, which increases the chance of earlier innovation.
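The two opposing forces can be seen in a short calculation. If each of n symmetric firms innovates with a constant hazard, and the firms’ innovation dates are independent, the first innovation in the industry arrives with hazard n times the individual hazard. The independence assumption, the function names, and the numbers below are introduced here for illustration and are not taken from Loury [69].

```python
def expected_first_innovation_date(n, hazard):
    """Expected date of the earliest success among n independent firms."""
    if n < 1 or hazard <= 0:
        raise ValueError("requires n >= 1 and hazard > 0")
    return 1.0 / (n * hazard)


def rival_hazard(n, hazard):
    """Aggregate hazard of a firm's n - 1 rivals, i.e., a = (n - 1) h(x*)."""
    return (n - 1) * hazard


# Two firms, each with a hypothetical hazard of 0.10.
date_two_firms = expected_first_innovation_date(2, 0.10)

# Four firms that each invest less, so each has a lower hazard of 0.08.
date_four_firms = expected_first_innovation_date(4, 0.08)
```

In this made-up example each firm’s hazard falls when n rises from two to four, yet the expected date of the first innovation moves from 5 to about 3.1 periods.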
Aghion et al. [2] develop a dynamic macroeconomic model of “step-by-step”
innovation whereby a leading (laggard) firm in its sector can widen (narrow) its
technological lead (lag) if it successfully innovates. The continuum of sectors
that make up the economy each comprise two cost-asymmetric firms that
ultimately compete in price with differentiated goods.
In the first stage of a two-stage game, each firm takes its second-stage profit
function as given and makes an R&D investment decision to maximize expected
future discounted profits, conditional on the technological gap in their industry and
the level of competition. In particular, the firm makes a cost-reducing investment
that increases its probability of favorably changing the technological gap in its
industry, where the technological gap is a function of the relative production cost
between the two firms. In the second stage, firms compete on price alone, taking
as given their relative cost. The equilibrium profit in this stage is an implicit
function of relative marginal cost16 and the degree of product substitutability. The
latter is Aghion et al.’s [2] measure of competition; however, it is more precisely
a reflection of consumer tastes, which affects the interpretation of an industry’s
market structure.17
Aghion et al. [2] analyze the effect of competition on innovation under the
following assumptions: (1) there is a steady state composition of n-gap (unleveled)
and 0-gap (leveled) industries; (2) the laggard can immediately catch up to the
leader if the laggard is the sole innovator; (3) there is at most a one-step increase
in the gap if a leader or neck-and-neck firm innovates; and (4) changes in product
substitutability affect the whole economy.
When innovation is large, they find that an increase in competition can either
foster or retard innovation. On one hand, large innovation implies that a one-step
lead will raise the would-be leader’s profit to the maximal level, so a leader will
not innovate further. On the other hand, large innovation effectively fixes the post-
innovation rent for a neck-and-neck firm and the pre-innovation rent for a laggard.
The effect of competition on innovation thus only operates through the pre- and

15 Loury shows that the expected arrival date of innovation is strictly decreasing with respect to
n when a unit increase in R&D investment by any single firm causes every other firm to invest a
smaller amount into R&D.
16 Defined as the ratio of firm i’s marginal cost to firm j’s marginal cost.
17 Aghion et al. [2] concede that product substitutability is a taste parameter. By construction,
product substitutability will affect the structure of a firm’s demand. When product substitutability
is at its lowest, a firm has no competitors, so the firm will behave like a monopolist. On the other
hand, a firm will behave like a perfectly competitive firm when product substitutability is at its
highest. Thus, a higher level of product substitutability may be interpreted as a “less monopolistic”
market structure.

post-innovation rents of neck-and-neck and laggard firms, respectively. In fact, these
rents are the same: the pre-innovation rent for a neck-and-neck firm is the post-
innovation rent for a laggard. Thus, since greater competition lowers neck-and-neck
rents, competition will have a nonlinear effect on economy-wide innovation.
The shape of the relationship found by Aghion et al. [2] under large innovation
is an inverted-U. This arises because the steady state distribution of leveled and
unleveled industries is itself a function of the level of competition. More specifically,
when competition is already relatively high, neck-and-neck firms will have a greater
incentive to innovate than laggards as they seek to “escape competition.” This will
push the economy into a state where there are more unleveled industries than leveled
ones. And because this transition does not affect the level of competition (i.e.,
the level of neck-and-neck profits), the incentive to innovate across the economy
will diminish – laggards have little incentive to innovate when competition is
already high, and leaders do not innovate at all. Aghion et al. [2] call this the
“Schumpeterian” effect of competition. Contrast this to the case of relatively
low competition. When competition is relatively low, laggard firms will be more
inclined than neck-and-neck firms to innovate. This will push the economy into a
state with more leveled than unleveled industries where the “escape competition”
effect dominates. Thus, innovation initially increases with competition but then
declines.
In the case of small innovation, innovation is found to monotonically increase
with competition. Intuitively, the increment in profit from innovating is approx-
imately the same for leaders, laggards and neck-and-neck firms; and since the
increment in profit for a neck-and-neck firm increases with competition (due to
the “escape competition” effect), it follows that economy-wide innovation will also
increase.
Finally, Aghion et al. [2] examine the general case numerically and confirm their
analytical results. That is, they find innovation to increase with competition for
intermediate values of innovation size, but eventually an inverted-U shape arises
when innovation is large and the probability of imitation is low.
Aghion et al. [3] reexamine the “step-by-step” innovation model developed
in Aghion et al. [2] under two modifications. First, a firm can only advance its
technological position by one step through successful innovation. If, for example,
the current state of a sector is m – a nonnegative integer that indexes the efficiency
gap between two firms – and the leader (laggard) successfully innovates while the
laggard (leader) does not, then the state of the sector will change from m to m + 1
(respectively, from m to m − 1). Thus, unlike Aghion et al. [2], this model assumes
that the laggard cannot immediately catch up to the leader.
Second, instead of explicitly using product substitutability as a proxy for
competition, greater competition is measured as the degree to which neck-and-neck
firms cannot collude. That is, letting $\varepsilon \in [0, 1/2]$ denote a neck-and-neck firm’s
profit as a fraction of a leader’s profit, Aghion et al. [3] treat a smaller value of $\varepsilon$
as greater competition. To operationalize $\varepsilon$ in their model, Aghion et al. [3] use
$\Delta \equiv 1 - \varepsilon = (\pi_1 - \pi_0)/\pi_1$, where $\pi_0$ and $\pi_1$ are the profit levels of a neck-and-neck
and leader firm, respectively.

With the above modifications, Aghion et al. [3] restrict attention to the case where
m ∈ {0, 1}, so that a leader (laggard) can be at most one step ahead (behind). This
mirrors the “large innovation” scenario analyzed in Aghion et al. [2], and implies
that a leader will not innovate. The inverted-U competition-innovation relationship
found in Aghion et al. [2] (under large innovation) remains intact in Aghion et al.
[3].
Hashmi [44] argues that the theoretical model of Aghion et al. [3] is not well-
suited for industry level analysis, particularly because Aghion et al. [3] model
the interaction of competition and innovation at the economy level. Hashmi notes,
however, that an industry-level analysis is possible with Aghion et al.’s [3] duopoly
model of competition, so he accordingly adopts it.
Namely, Hashmi considers a setting where two cost-asymmetric firms compete in
price with differentiated goods. The demand for firm $i$, $i = 1, 2$, is given by:

$$q_i = \frac{p_i^{1/(\alpha-1)}}{p_i^{\alpha/(\alpha-1)} + p_j^{\alpha/(\alpha-1)}},$$

where $p$ is price, and $\alpha \in [0, 1]$ is the degree of product substitutability. Hashmi
follows Aghion et al. [3] and uses $\alpha$ to measure competition. Letting $c_i = w\gamma^{-k_i}$
denote firm $i$’s constant marginal cost of production, where $w$ is the wage rate, $\gamma$ is
the size of innovation and $k_i$ is the technology level of firm $i$, the equilibrium profit
function for firm $i$ is

$$\pi_i(n) = \frac{(1 - \alpha)\,R_i(n)}{1 - \alpha R_i(n)},$$

where $n \equiv k_i - k_j$ is the technology gap between firm $i$ and firm $j$, and $R_i(\cdot)$ is firm
$i$’s market share. Finally, Hashmi defines the probability of successful innovation as

$$P = 1 - e^{-ax} + \max\left\{0,\; 1 - e^{\eta(n - \hat{n})}\right\},$$

where x is R&D investment. The first term is common to both firms; it represents the
baseline probability of success. The second term is specific to an unleveled industry,
and it allows the laggard to more easily catch up to the leader as the technological
gap grows. This term is introduced by Hashmi to ensure that zero investment is not
chosen by both the laggard and leader when the gap is large.
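The demand system and profit function in Hashmi’s duopoly can be checked numerically. The sketch below codes the demand for firm i as p_i^(1/(α-1)) divided by the sum of p_i^(α/(α-1)) and p_j^(α/(α-1)), and profit as (1 - α)R/(1 - αR) for a given market share R. Treating the market share as a revenue share, restricting α to the open interval (0, 1), and taking prices as given rather than solving for equilibrium are assumptions made here; the function names are hypothetical.

```python
def demand(p_i, p_j, alpha):
    """Quantity demanded from firm i at prices (p_i, p_j)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("this sketch requires 0 < alpha < 1")
    exponent = alpha / (alpha - 1.0)  # negative for alpha in (0, 1)
    return p_i ** (1.0 / (alpha - 1.0)) / (p_i ** exponent + p_j ** exponent)


def revenue_share(p_i, p_j, alpha):
    """Firm i's revenue; total spending on the two goods sums to one."""
    return p_i * demand(p_i, p_j, alpha)


def profit_from_share(share, alpha):
    """Profit as a function of market share and substitutability alpha."""
    return (1.0 - alpha) * share / (1.0 - alpha * share)


# Hypothetical prices: firm 1 has the cost advantage and charges less.
share_leader = revenue_share(1.0, 1.5, alpha=0.5)
share_laggard = revenue_share(1.5, 1.0, alpha=0.5)
```

Because the two revenues sum to one under this demand system, each firm’s revenue can be read directly as a share. For a fixed share below one, `profit_from_share` falls as `alpha` rises, which is one way to see why higher substitutability is read as tougher competition.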
Given the above, both firms choose their R&D investment level to maximize
their expected future discounted profits. The relationship between optimal R&D
investment and competition depends on the technology gap. When the technology
gap is small, the relationship is approximately monotonically increasing; when it is
intermediate, the relationship is an inverted-U; and when it is large, the relationship
is monotonically decreasing. Thus, the result of Aghion et al. [3] is not robust to
variations in the technology gap.

Hashmi’s results are interesting from a policy perspective, particularly if one
views the technology gap as a better proxy, or at least a characteristic, of market
structure. Intuitively, a technological lead confers a competitive advantage (in
this case, the ability to charge a more competitive price due to a cost-reducing
technology), and as the lead gets larger, the industry will move closer to monopoly.18
Perhaps even more interesting is the result that innovation decreases with the tech-
nological gap. Ostensibly, a large gap lulls the leader and discourages the laggard
from innovating, whereas a small gap keeps rivals from becoming complacent with
their competitive positions.

Empirical Work

Empirical Results on Market Structure, Incumbency, and the Incentive to Innovate
The literature on firm size, market structure, and innovation is voluminous [38,
107]. Scherer [90], an extension of Scherer’s earlier work, Scherer [89], is a seminal
paper in the literature. There are two material differences between Scherer [90] and
Scherer [89]. One, Scherer [90] uses a more comprehensive dataset of 56 industries,
as opposed to the 48 used in Scherer [89]. And two, whereas Scherer [89] estimates
the impact of firm size on patents granted, Scherer [90] estimates the effect of
market concentration on the total number of engineers and scientists employed as
a proportion of total employees. Thus, he moves from an output-based to an input-
based measure of innovation.
Two findings stand out in Scherer’s cross-sectional study. One is that the
explanatory power of market concentration drops substantially when industry-
specific dummies are added to the regression. In other words, industry differences
in technological opportunity account for most of the variation in R&D investment.
Another notable finding is that innovation effort exhibits an inverted-U relationship
with concentration. Thus, R&D investment increases with market concentration
when market concentration is relatively low, and decreases when it is relatively high.
Scherer interprets this as a rejection of the Schumpeterian hypothesis that market
power fosters innovation.
The drop in market concentration’s explanatory power after technological oppor-
tunity is accounted for is a common theme in the empirical literature related to
market structure, incumbency, and the incentive to innovate [107]. However, it
is unclear from these studies if the elimination of market concentration’s effect
is theoretically driven, or if it is a statistical artifact of the limitations of cross-

18 In fact, when an industry consists only of two firms, a larger technological gap will necessarily
lead to greater concentration. This relationship does not necessarily hold for an industry with more
than two firms, however. When there are three firms with different efficiency levels, for example,
the laggard firm could narrow the technological gap with the second-place and leader firm, but the
reallocation of sales among the three firms could lead to lower or higher concentration.

sectional methods. The latter is addressed by more recent empirical studies, which
use panel data methods to identify the effect of market power on innovation.
More significant from a policy perspective is the inverted-U relationship brought
forward by Scherer, which implies that too much competitive rivalry will retard
innovation. The possibility of a nonlinear competition-innovation relationship has
been and continues to be investigated in the IO literature. We discuss this more
recent literature shortly.

Empirical Results on the Degree of Competition and the Incentive to Innovate
Relatively early empirical work was concerned with the hypothesis that large scale
stimulates innovation. More recent empirical IO research is expressly concerned
with the effect of competition on innovation. In addition to this different, albeit
somewhat related policy question, the newer literature differentiates itself via more
sophisticated econometric techniques, measures of competition and innovation, and
breadth of data.
The literature is broadly split into two categories: models that test competition’s
effect on innovation under the assumption that the relationship is linear, and models
that test the effect assuming the relationship is nonlinear. In particular, Nickell [72]
and Blundell et al. [10] estimate linear specifications, while Aghion et al. [3], Correa
[23], Correa and Ornaghi [24], and Hashmi [44] estimate nonlinear specifications.
Both Nickell [72] and Blundell et al. [10] estimate a linear, dynamic panel-data
model and find a positive competition-innovation relationship based on a panel of
publicly traded firms in the UK. Nickell uses a firm’s average operating margin
as a proxy for competition and total factor productivity to measure innovation,
whereas Blundell et al. [10] measure competition in terms of individual market
share, market concentration and import penetration, and innovation in terms of
survey-based innovation counts.
The nonlinear class of empirical IO models, all based on Aghion et al. [3],
generates a range of qualitative results. Aghion et al. [3] use Poisson regression and
a 1973–1994 panel of publicly traded firms in the UK to estimate the competition-
innovation relationship at the industry-year level. They use citation-weighted patent
counts to measure innovation and operating margins to measure competition. They
find that competition’s effect on innovation is positive for relatively low and
intermediate levels of competition, but negative when competition is sufficiently
high. Using the same data and econometric techniques as Aghion et al. [3], Correa
[23] instead finds a monotonically increasing relationship after accounting for
structural changes in the data, namely a policy that facilitated patent grants. Correa
and Ornaghi [24] also find a monotonically increasing relationship, but with panel
data on US manufacturing firms spanning the period 1974–2001. Their finding is
robust to different measures of innovation, including citation-weighted patent counts
and productivity growth. Finally, Hashmi [44] finds a weak, monotonically negative
relationship using a panel of US publicly traded firms over the period 1976–2001.
His econometric model is largely the same as Aghion et al.’s [3], except he uses
negative binomial regression to estimate the competition-innovation relationship.

The range of conclusions for the nonlinear class of models is striking given the
strong similarity in data and methods. This bears further discussion, which we
pursue below.
As noted above, the nonlinear empirical models that test the competition-
innovation relationship are closely related to the Aghion et al. [3] model. Accord-
ingly, this warrants some discussion of the Aghion et al. [3] model and the models
that followed.
Specifically, Aghion et al. [3] estimate the following conditional mean function:
$E[p \mid c, x] = e^{g(c) + x\beta}$,
where p is the number of citation-weighted patents, c is competition, x is a set
of industry and time dummy variables and g(·) is some function to be estimated.
It is assumed that p follows a Poisson process. Aghion et al. [3] estimate the
model with data on publicly traded manufacturing firms in the UK over the period
1973–1994.
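As a rough illustration of how a conditional mean of this form can be estimated, the sketch below fits a Poisson regression with a quadratic g(c) to synthetic data by Newton's method. The quadratic form, the coefficient values, the sample size, and the function name fit_poisson are all assumptions made for the example. They are not taken from Aghion et al. [3], who leave g(·) as a function to be estimated and include industry and time dummies.

```python
import numpy as np

def fit_poisson(X, y, iters=50, tol=1e-10):
    """Poisson regression with a log link, fitted by Newton's method.

    Assumes the first column of X is a constant (intercept) term.
    """
    # Start from the intercept-only fit so the first Newton step is well scaled.
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())
    for _ in range(iters):
        mu = np.exp(X @ beta)              # conditional mean E[p | c, x]
        grad = X.T @ (y - mu)              # score vector
        info = X.T @ (X * mu[:, None])     # Fisher information (negative Hessian)
        step = np.linalg.solve(info, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

rng = np.random.default_rng(0)
n = 5000
c = rng.uniform(0.0, 1.0, n)               # hypothetical competition index in [0, 1]
true_beta = np.array([0.5, 4.0, -4.0])     # assumed values: inverted U peaking at c = 0.5
X = np.column_stack([np.ones(n), c, c**2])
p = rng.poisson(np.exp(X @ true_beta))     # synthetic patent counts

beta_hat = fit_poisson(X, p)
peak = -beta_hat[1] / (2.0 * beta_hat[2])  # turning point of the fitted quadratic g(c)
print(beta_hat, peak)
```

A positive linear coefficient together with a negative quadratic coefficient, and a turning point inside the observed range of c, is what an inverted-U finding looks like in this parameterization.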
Correa [23] revisits the Aghion et al. [3] model using the same sample of data and
empirical formulation, but allows for structural breaks in the data. Correa reasons
that this is appropriate because the establishment of the United States Court of
Appeals for the Federal Circuit (henceforth CAFC) in 1982 made it effectively
easier to have a patent granted. Correa takes two approaches to test the structural
break hypothesis. The first approach is a Chow test. The base model estimated is the
same as in Aghion et al. [3]:

$$p_{jt} = \exp\Big\{\beta_0 + \beta_1 c_{jt} + \beta_2 c_{jt}^2 + \phi \hat{v}_{jt} + \delta_1 D_\tau c_{jt} + \delta_2 D_\tau c_{jt}^2 + \sum_j \alpha_j D_j + \sum_t \gamma_t D_t\Big\} + u_{jt},$$
where $D_\tau = 1$ for all $t \ge \pi$ ($\pi$ denotes a pre-defined structural break) and 0 otherwise;
$c_{jt}$ is the level of competition for industry $j$ at time $t$, measured as one minus the
industry average price-cost margin; $\hat{v}_{jt}$ is the residual for industry $j$ at time $t$ from
regressing the competition index on policy and foreign-industry instruments (i.e.,
endogeneity is accounted for with a control function approach); and the last two
terms are industry- and time-fixed effects. The null hypothesis of time stability at
t = 1983 is rejected by the Chow test at the 5% level of significance.
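The role of the break dummy can be made explicit by differentiating the exponent of the specification above with respect to competition. The lines below are a sketch that follows only from the stated functional form. The symbols $\eta_{jt}$ (the argument of the exponential) and $c^{*}$ (the turning point, defined whenever the quadratic coefficient is nonzero) are notation introduced here, not Correa's.

$$
\frac{\partial \eta_{jt}}{\partial c_{jt}} =
\begin{cases}
\beta_1 + 2\beta_2 c_{jt}, & t < \pi,\\
(\beta_1+\delta_1) + 2(\beta_2+\delta_2)\, c_{jt}, & t \ge \pi,
\end{cases}
\qquad
c^{*} =
\begin{cases}
-\beta_1/(2\beta_2), & t < \pi,\\
-(\beta_1+\delta_1)/\big(2(\beta_2+\delta_2)\big), & t \ge \pi.
\end{cases}
$$

Time stability corresponds to $\delta_1 = \delta_2 = 0$, in which case the two regimes coincide. Within a regime, the fitted relationship is monotonically increasing over the observed range of $c_{jt}$ when the derivative is positive throughout that range, and it is an inverted U when the quadratic coefficient is negative and $c^{*}$ lies inside that range.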
Correa also carries out a Wald-type test for structural breaks, finding a single
structural break, in 1981. Correa gives several reasons why this year, instead of
1982, was detected. One of these reasons is that the political discussion to establish
the CAFC began in 1979. Thus, in anticipation of the CAFC being established,
a structural change in patent incentives may have manifested before the CAFC’s
official introduction.
The indication of a structural break in the data by both tests prompted Correa
to test the joint statistical significance of the competition coefficients. Correa
concluded from this exercise that before the establishment of the CAFC, the
competition-innovation relationship is statistically significant; but after the CAFC’s
introduction, the relationship is not statistically significant. Correa then estimated
the relationship under the two regimes, i.e., with the two identified structural breaks.
In both cases, he found that the relationship between competition and innovation is
monotonically increasing before the structural break.
Hashmi [44] and Correa and Ornaghi [24] also revisit the empirical Aghion
et al. [3] model. Both use a negative binomial instead of a Poisson specification
for the conditional mean function (to account for over-dispersion in the data), data
on publicly traded manufacturing firms in the US and the Lerner index as a proxy for
competition. Nevertheless, Correa and Ornaghi [24] find a monotonically positive
relationship using total factor productivity growth, labor productivity and
citation-weighted patent counts as proxies for innovation, while Hashmi [44], using only
citation-weighted patent counts as a proxy for innovation, finds a “mild” but
monotonically negative relationship. There remains no clear consensus of empirical
evidence that points to the direction of the effect of competition on innovation or on
the productivity growth that is presumably engendered by innovation.

Empirical Results Using Natural Experiments to Identify the Effect of Competition
The empirical IO literature on competition and innovation has exclusively focused
on concentration and profitability to infer the level of competitiveness of an industry,
and largely patents and R&D to capture the level of innovation. Another literature
investigates the link between structural changes to the competitive environment
and productivity. Following Holmes and Schmitz [50], this section presents a brief
review of that literature.
We would first like to draw attention to some comments made by Holmes
and Schmitz [50] regarding the measurement of competition and how compe-
tition affects productivity. Holmes and Schmitz [50] claim that concentration
and profitability are inadequate at identifying structural changes in a competitive
environment and, in fact, have the potential to mislead. To illustrate, they consider an
industry that has a strong trade barrier and is made up of small, unproductive firms.
The government then lifts the trade barrier, which attracts large, highly
productive firms into the market. A researcher who observes only market shares
and profitability might conclude that the
industry became less competitive due to a significant increase in concentration
and profitability. This, however, is at odds with the conventional thinking that
lower entry barriers stimulate competition. They also note the selection effect of
competition, whereby relatively unproductive firms are “selected out” of an industry
because they cannot compete effectively. Taken altogether, market concentration
may increase in a more competitive state. Moreover, to the extent that productivity
is positively correlated with profitability, average profitability will also increase in
a more competitive state.
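The Holmes and Schmitz thought experiment can be made concrete with a small numerical sketch. The Herfindahl-Hirschman index is used here as one common concentration measure, and every market share below is hypothetical, chosen only to show that measured concentration can rise when a trade barrier falls.

```python
def hhi(shares):
    """Herfindahl-Hirschman index from market shares (normalized to sum to one)."""
    total = sum(shares)
    return sum((s / total) ** 2 for s in shares)

# Hypothetical industry behind a trade barrier: ten equally sized small firms.
before = [0.10] * 10

# Hypothetical industry after the barrier is lifted: two large entrants take
# 35% each and three surviving incumbents keep 10% each.
after = [0.35, 0.35, 0.10, 0.10, 0.10]

hhi_before = hhi(before)
hhi_after = hhi(after)
print(round(hhi_before, 3), round(hhi_after, 3))
```

With these illustrative shares the index rises from 0.10 to 0.275 even though the barrier to entry has been removed, which is the sense in which concentration alone can mislead.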
Notwithstanding the above, Holmes and Schmitz [50] note that there is no model
that can comprehensively explain why or how competitive pressure induces firms to
be more productive. The body of evidence strongly suggests that it does, however.
Matsa [71], for example, found that incumbent supermarket retailers significantly
upgraded their inventory systems (to maximize product availability) after Wal-Mart
entered their local markets. Importantly, the increase in productivity cannot be
attributed to an increase in market or average firm size, as demand did not suddenly
increase for the existing firms, nor did the existing firms substantially
increase in scale. The observed gain in productivity was, therefore, largely a
response to increased competitive pressure.
Competition can also boost productivity by lowering the opportunity cost of
investment. For example, Schmitz [92] found that plant managers were reluctant
to adopt new managerial practices because they feared losing profits to a
strike. From this, he argues that the competitive process – which in the absence
of innovation tends to shrink margins over time – will reduce forgone profits and
thereby spur investment into new, more efficient forms of management.
Major shifts in the competitive landscape are perhaps the best way to identify
the effect of competition on innovation. Holmes and Schmitz [51] examine the
effect of railroad transportation on water shipping in the 19th and 20th centuries.
Before transportation by railroad was economically feasible in the US (1850s),
freight transportation by water was effectively the only way to ship cargo across
the nation. This meant that ports not only had tremendous market power, but also
had the incentive to keep it. Their market power was heavily weakened, however,
when railroads became a viable alternative for transportation.
Railroads undermined the market power of ports in two ways. On one hand,
railroads gave consumers easier accessibility to other ports. So, if a consumer was
not happy with the price or service of a port, it could use a train to ship its cargo to
another port. On the other hand, railroads could in some cases entirely replace the
function provided by water services. Ultimately, the threat that railroads presented
to ports manifested as an effort by the latter to increase productivity.
In the same vein, Galdon-Sanchez and Schmitz [35] and Schmitz [92] examine
the effect of Brazil’s entry into the lower Great Lakes iron ore market during the
1980s. They note that before Brazil’s entry, markets were characterized by few and
distant producing locations and high transportation costs, which fostered significant
market power. Market power was in fact evident since unions and local government
exercised their power to extract as much of the surplus from iron ore producers as
possible.
The market power of iron ore producers around the Great Lakes eventually
eroded, however. Due to a substantial decrease in transportation costs, Brazil entered
the iron ore market around the Great Lakes in the 1980s. This put tremendous price
pressure on the domestic iron ore producers and, in turn, pressure to improve labor
productivity. Labor productivity actually doubled in the mid-1980s, and Galdon-Sanchez
and Schmitz [35] demonstrate that the productivity growth came not from the closing of
inefficient mines or increases in scale, but from investments by surviving mines
to lower costs.
Finally, Syverson [110] investigates the effect of spatially dense competition on
the distribution of productivity in the US ready-mix concrete industry. He finds
that more densely clustered markets exhibit higher average productivity and lower
productivity dispersion. The reason for this is that more densely clustered markets
lower switching costs for consumers, and thus an inefficient firm is more likely to
exit an industry that is highly dense. In turn, average productivity and productivity
dispersion will increase and decrease, respectively, in more competitive markets.
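The selection mechanism can be sketched with a simple truncation exercise. This is not Syverson's model: it assumes, purely for illustration, that log productivity is standard normal and that denser markets impose a higher survival cutoff, with both cutoffs chosen arbitrarily.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical draws of log productivity for potential producers.
log_productivity = rng.normal(loc=0.0, scale=1.0, size=100_000)

def survivors(draws, cutoff):
    """Keep only producers whose productivity is at or above the survival cutoff."""
    return draws[draws >= cutoff]

sparse = survivors(log_productivity, cutoff=-1.0)  # assumed lenient cutoff (sparse market)
dense = survivors(log_productivity, cutoff=0.0)    # assumed stricter cutoff (dense market)

print("sparse market: mean, std =", sparse.mean(), sparse.std())
print("dense market:  mean, std =", dense.mean(), dense.std())
```

Raising the cutoff removes the lower tail, so the surviving producers in the denser market have a higher mean and a smaller standard deviation, which matches the direction of the pattern described above.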

Conclusion

In this chapter, we have demonstrated what market structure is, how it is measured
and how it can affect productivity and innovation. Such a chapter of course leaves
much room for other studies and perspectives and we do not expect that this
handbook chapter will satisfy all of those in this rather dense literature. We do trust,
however, that our treatments and perspectives on important contributions to this lit-
erature are balanced and that they provide a relatively complete perspective on such
an important issue in regard to a broad array of assumptions and methodological
approaches.

Cross-References

Capacity and Capacity Utilization in Production Economics
Cost, Revenue, and Profit Function Estimates
Multiproduct Technologies

References
1. Abrams D, Akcigit U, Popadak J (2013) Patent value and citations: creative destruction or
strategic disruption? NBER working paper, 19647
2. Aghion P, Harris C, Howitt P, Vickers J (2001) Competition, imitation and growth with step-
by-step innovation. Rev Econ Stud 68:467–492
3. Aghion P, Bloom N, Blundell R, Griffith R, Howitt P (2005) Competition and innovation: an
inverted-u relationship. Q J Econ 120:701–728
4. Appelbaum E (1982) The estimation of the degree of oligopoly power. J Econ 19:287–299
5. Armstrong M (2006) Recent developments in the economics of price discrimination. In:
Blundell R, Newey W, Persson T (eds) Advances in economics and econometrics: theory and
applications: ninth world congress of the econometric society. Cambridge University Press,
Cambridge, UK
6. Arrow KJ (1962) Economic welfare and the allocation of resources for invention. In: Nelson
R (ed) The rate and direction of inventive activity. Princeton University Press, Princeton, pp
609–625
7. Bain JS (1956) Barriers to new competition. Harvard University Press, Cambridge, MA
8. Berry S, Levinsohn J, Pakes A (1995) Automobile prices in market equilibrium. Econometrica
63:841–890
9. Bingaman A (1994) Innovation and antitrust. https://www.justice.gov/atr/speech/innovation-
and-antitrust
10. Blundell R, Griffith R, Reenen JV (1999) Market share, market value and innovation in a
panel of British manufacturing firms. Rev Econ Stud 66:529–554
11. Boldrin M, Levine D (2008) Against Intellectual Monopoly. Cambridge University Press
12. Boone J (2008) A new way to measure competition. Econ J 118:1245–1261
13. Boone J (2008) Competition: theoretical parameterizations and empirical measures. J Inst
Theor Econ 164:587–611
14. Bresnahan TF (1981) Departures from marginal-cost pricing in the American automobile
industry. J Econ 17:201–227
15. Bresnahan TF (1982) The oligopoly solution concept is identified. Econ Lett 10:87–92
16. Bresnahan TF (1987) Competition and collusion in the American automobile oligopoly: the
1955 price war. J Ind Econ 35:457–482
17. Bresnahan TF (1989) Empirical studies of industries with market power. In: Schmalensee R,
Willig RD (eds) The handbook of industrial organization, vol 2. North-Holland, Amsterdam,
pp 1011–1057
18. Bresnahan TF, Reiss PC (1990) Entry in monopoly markets. Rev Econ Stud 57:57–81
19. Brown B (1983) The identification problem in systems nonlinear in the variables. Economet-
rica 51:175–196
20. Brown DJ, Matzkin RL (1998) Estimation of nonparametric functions in simultaneous
equations models with application to consumer demand. Working paper, Yale University
21. Bulow J, Klemperer P (2002) Prices and the winner’s curse. Rand J Econ 33:1–21
22. Cohen WM, Nelson RR, Walsh JP (2000) Protecting their intellectual assets: appropriability
conditions and why U.S. manufacturing firms patent (or not). NBER working paper, 7552
23. Correa JA (2012) Innovation and competition: an unstable relationship. J Appl Economet
27:160–166
24. Correa JA, Ornaghi C (2014) Competition and innovation: evidence from U.S. patent and
productivity data. J Ind Econ LXII:258–285
25. Corts KS (1999) Conduct parameters and the measurement of market power. J Econ 88:
227–250
26. Deaton A, Muellbauer JN (1980) Economics and consumer behavior. Cambridge University
Press, New York
27. Demsetz H (1973) Industry structure, market rivalry, and public policy. J Law Econ 16:1–9
28. Driskill R, McCafferty S (1989) Dynamic duopoly with adjustment costs: a differential game
approach. J Econ Theory 49:324–338
29. Ellickson P (2007) Does Sutton apply to supermarkets? Rand J Econ 38:43–59
30. Ellickson P (2015) Market structure and performance. In: Wright JD (ed) International
encyclopedia of the social and behavioral sciences, vol 14, 2nd edn. Elsevier, Oxford, pp
9211–9216
31. Fisher FM, McGowan JJ (1983) On the misuse of accounting rate of return to infer monopoly
profits. Am Econ Rev 73:1141–1200
32. Fudenberg D, Tirole J (1991) Game theory. MIT Press, Cambridge, MA
33. Gabszewicz J, Thisse JF (1979) Price competition, quality and income disparities. J Econ
Theory 20:340–359
34. Gabszewicz J, Thisse JF (1980) Entry, (and exit) in a differentiated industry. J Econ Theory
22:327–338
35. Galdon-Sanchez JE, Schmitz JA (2002) Competitive pressure and labor productivity: world
iron-ore markets in the 1980s. Am Econ Rev 92:1222–1235
36. Garcia DD (2016) Examination of the relationship between competition and innovation:
toward a robust approach. Unpublished PhD dissertation, Rice University
37. Gasmi F, Laffont JJ, Vuong QH (1992) Econometric analysis of collusive behavior in a soft-
drink market. J Econ Manag Strateg 1:277–311
38. Gilbert R (2006) Looking for Mr. Schumpeter: where are we in the competition-innovation
debate? In: Jaffe AB, Lerner J, Stern S (eds) Innovation policy and the economy. The MIT
Press, pp 159–215
39. Gilbert RJ, Newbery DMG (1982) Preemptive patenting and the persistence of monopoly.
Am Econ Rev 72:514–526
40. Gollop FM, Roberts MJ (1979) Firm interdependence in oligopolistic markets. J Econ
16:617–645
41. Griliches Z (1990) Patent statistics as indicators: a survey. J Econ Lit XXVIII:1661–1707
42. Graddy K (1995) Testing for imperfect competition at the Fulton fish market. Rand J Econ
25:37–57
43. Hall B, Harhoff D (2012) Recent research on the economics of patents. Ann Rev Econ 4:
541–565
44. Hashmi AR (2013) Competition and innovation: the inverted-u relationship revisited. Rev
Econ Stat 95:1653–1668
45. Hausman JA, Leonard GK (2004) The competitive effects of a new product introduction: a
case study. J Ind Econ 50:237–263
46. Hausman J, Leonard G, Zona JD (1994) Competitive Analysis with Differentiated Products.
Annales d’Économie et de Statistique 34:159–180
47. Hazledine T (2006) Price discrimination in Cournot-Nash oligopoly. Econ Lett 93:413–420
48. Hazledine T (2010) Oligopoly price discrimination with many prices. Econ Lett 109:150–153
49. HMG (2010) Horizontal merger guidelines: U.S. Department of Justice and Federal Trade
Commission. https://www.justice.gov/atr/horizontal-merger-guidelines-08192010
50. Holmes T, Schmitz J (2010) Competition and productivity: a review of evidence. Federal
Reserve Bank of Minneapolis: Research Department Staff Report 439
51. Holmes T, Schmitz J (2001) Competition at work: railroads vs. monopoly in U.S. shipping.
Fed Reserve Bank Minneapolis Quart Rev 25(2):3–29
52. Hotelling H (1929) Stability in competition. Econ J 39:41–57
53. Iwata G (1974) Measurement of conjectural variations in oligopoly. Econometrica 42:
947–966
54. Jorgenson DW (2011) Innovation and productivity growth. Am J Agric Econ 93:276–296
55. Kamien MI, Schwartz NL (1976) On the degree of rivalry for maximum innovative activity.
Q J Econ 90:245–260
56. Karakaplan MU, Kutlu L (2019) Estimating market power using a composed error model.
Scottish J Polit Econ 66:489–510
57. Klemperer P, Meyer M (1989) Supply function equilibria in oligopoly under uncertainty.
Econometrica 57:1243–1277
58. Koetter M, Kolari JW, Spierdijk L (2012) Enjoying the quiet life under deregulation?
Evidence from adjusted Lerner indices for U.S. banks. Review of Economics and Statistics
94:462–480
59. Kreps D, Scheinkman J (1983) Quantity precommittment with Bertrand competition. Bell J
Econ 14:326–337
60. Kumbhakar SC, Baardsen S, Lien G (2012) A new method for estimating market power with
an application to Norwegian sawmilling. Rev Ind Organ 40:109–129
61. Kutlu L (2009) Price discrimination in Stackelberg competition. J Ind Econ 57:364
62. Kutlu L (2012) Price discrimination in Cournot competition. Econ Lett 117:540–543
63. Kutlu L, Sickles RC (2012) Estimation of market power in the presence of firm level
inefficiencies. J Econ 168:141–155
64. Kutlu L, Sickles RC (2017) Measuring market power when firms price discriminate. Empir
Econ 53:287–305
65. Kutlu L, Wang R (2018) Estimation of cost efficiency without cost data. J Prod Anal 49:137–
151
66. LaFrance JT (1990) Incomplete demand systems and semilogarithmic demand models. Aust
J Agric Econ 34:118–131
67. LaFrance JT (2004) Integrability of the linear approximate almost ideal demand system. Econ
Lett 84:297–303
68. Lau LJ (1982) On identifying the degree of competitiveness from industry price and output
data. Econ Lett 10:93–99
69. Loury GC (1979) Market structure and innovation. Q J Econ 93:395–410
70. Marin P, Siotis G (2007) Innovation and market structure: an empirical evaluation of the
‘bounds approach’ in the chemical industry. J Ind Econ 55:93–111
71. Matsa DA (2011) Competition and product quality in the supermarket industry. Q J Econ
126:1539–1591
72. Nickell SJ (1996) Competition and corporate performance. J Polit Econ 104:724–746
73. Nevo A (2000) A practitioner’s guide to estimation of random-coefficients logit models of
demand. J Econ Manag Strateg 9:513–548
74. Nevo A (2000) Mergers with differentiated products: the case of the ready-to-eat cereal
industry. Rand J Econ 31:395–421
75. Nevo A (2001) Measuring market power in the ready-to-eat cereal industry. Econometrica
69:307–342
76. Nocke V (2007) Collusion and dynamic (under)investment in quality. Rand J Econ 38:
227–249
77. Orea L, Steinbuks J (2018) Estimating market power in homogenous product markets using
a composed error model: application to the California electricity market. Econ Inq 56:1296–
1321
78. Perloff JM, Karp LS, Golan A (2007) Estimating market power and strategies. Cambridge
University Press, Cambridge
79. Perloff JM, Shen EZ (2012) Collinearity in linear structural models of market power. Rev Ind
Organ 2:131–138
80. Pindyck R (1985) The measurement of monopoly power in dynamic markets. J Law Econ
28:193–222
81. Porter R (1983) A study of cartel stability: the joint executive committee 1880–1886. Bell J
Econ 14:301–314
82. Puller SL (2007) Pricing and firm conduct in California’s deregulated electricity market. Rev
Econ Stat 89:75–87
83. Puller SL (2009) Estimation of competitive conduct when firms are efficiently colluding:
addressing the Corts critique. Appl Econ Lett 16:1497–1500
84. Reinganum JF (1983) Uncertain innovation and the persistence of monopoly. Am Econ Rev
73:741–748
85. Reynolds S (1987) Capacity investment, preemption and commitment. Int Econ Rev 28:69–88
86. Roehrig CS (1988) Conditions for identification in nonparametric and parametric models.
Econometrica 56:433–447
87. Rosenthal R (1980) A model in which an increase in the number of sellers leads to a higher
price. Econometrica 48:1575–1579
88. Salop S (1979) Monopolistic competition with outside goods. Bell J Econ 10:141–156
89. Scherer FM (1965) Firm size, market structure, opportunity, and the output of patented
inventions. Am Econ Rev 55:1097–1125
90. Scherer FM (1967) Market structure and the employment of scientists and engineers. Am
Econ Rev 57:524–531
91. Schmalensee R (1989) Inter-industry differences of structure and performance. In:
Schmalensee R, Willig RD (eds) The handbook of industrial organization, vol 2. North-
Holland, Amsterdam, pp 951–1009
92. Schmitz JA (2005) What determines productivity? Lessons from the dramatic recovery of
the U.S. and Canadian iron ore industries following their early 1980s crisis. J Polit Econ
113:582–625
93. Schumpeter JA (1942) Capitalism, socialism and democracy. Harper & Row, New York
94. Schultz TW (1956) Reflections on Agricultural Production, Output, and Supply. J Farm Econ
38(3):748–762
95. Schultz TW (1962) Reflections on Investment in Man. J Polit Econ 70(5, pt. 2):1–8
96. Shaked A, Sutton J (1982) Relaxing price competition through product differentiation. Rev
Econ Stud 49:3–13
97. Shaked A, Sutton J (1983) Natural oligopolies. Econometrica 51:1469–1484
98. Shaked A, Sutton J (1987) Product differentiation and industrial structure. J Ind Econ 36:
131–146
99. Shapiro C (2012) Competition and innovation: did Arrow hit the bull’s eye? In: Lerner J,
Stern S (eds) The rate and direction of inventive activity revisited. University of Chicago
Press, Chicago, pp 361–404
100. Spiller PT, Favaro E (1984) The effects of entry regulation or oligopolistic interaction: the
Uruguayan banking sector. Rand J Econ 15:244–254
101. Stiglitz J (1989) Imperfect information in the product market. In: Schmalensee R, Willig RD
(eds) The handbook of industrial organization, vol 1. North-Holland, Amsterdam, pp 769–847
102. Stole LA (2007) Price discrimination and competition. In: Armstrong M, Porter R (eds) The
handbook of industrial organization, vol 3. North-Holland, Amsterdam, pp 2221–2299
103. Stone JRN (1953) The measurement of consumers’ expenditure and behavior in the United
Kingdom, 1920–38. Cambridge University Press, Cambridge, UK
104. Sutton J (1991) Sunk costs and market structure. MIT Press, Cambridge, MA
105. Sutton J (1998) Technology and market structure. MIT Press, Cambridge, MA
106. Sutton J (2007) Market structure: theory and evidence. In: Armstrong M, Porter R (eds) The
handbook of industrial organization, vol 3. North-Holland, Amsterdam, pp 2301–2368
107. Symeonidis G (1996) Innovation, firm size and market structure: Schumpeterian hypotheses
and some new themes. OECD economics department working paper, 161
108. Symeonidis G (2000) Price competition and market structure: the impact of restrictive
practices legislation on concentration in the UK. J Ind Econ 48:1–26
109. Symeonidis G (2001) The effects of competition: cartel policy and the evolution of strategy
and structure in British industry. MIT Press, Cambridge, MA
110. Syverson C (2004) Market structure and productivity: a concrete example. J Polit Econ
112:1181–1222
111. Utterback J, Abernathy W (1975) A dynamic model of process and product innovation. Omega
3:639–656
112. Vuong QH (1989) Likelihood ratio tests for model selection and non-nested hypotheses.
Econometrica 57:307–333
113. Weiss L (1989) Concentration and price. MIT Press, Cambridge, MA
14 Production Under Uncertainty
Robert G. Chambers

Contents
Introduction
Uncertainty and Risk
The Stochastic Technology
Incorporating Randomness
Some Common Assumptions
The Structure of Stochastic Technologies
Stochastic Production Decisions
Producer Preferences
Cost Minimization, Duality, Risk-Neutral Probabilities, Fisher Separation, and More Duality
Revenue Cost and Graphical Illustration of Producer Equilibrium
Concluding Remarks
Cross-References
References

Abstract

This chapter describes a formal model of a stochastic production technology.
Alternative axioms and different structural restrictions are presented, and
producer decision-making under uncertainty is examined. The presentation
emphasizes the formal similarities between the stochastic production environment and
more traditional models of a nonstochastic technology and producer behavior
under certainty. The nonstochastic multiple-output technology is shown to be
a special case of the more general stochastic production structure.

My thanks to Spiro Stefanou for comments that considerably improved the presentation.

R. G. Chambers
Department of Agricultural and Resource Economics, University of Maryland,
College Park, MD, USA

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_6

Keywords

Production · Uncertainty · Risk · Producers · Cost · Decision-making under uncertainty

JEL codes

D21, D24, D25, D81, D84

Introduction

Virtually all economic decisions involve uncertainty. Even such mundane trans-
actions as buying a fast-food hamburger carry the potential for foreseeable, but
uncertain, unpleasant outcomes. In such settings, the uncertainty extends over
a relatively short time horizon, perhaps a day, and thus may seem relatively
inconsequential. Producers, on the other hand, routinely make important economic
decisions well in advance of exact knowledge of the conditions that will prevail
when their products are ready for market. Traditionally, microeconomic analyses
of producer behavior swept this awkwardness under an analytically convenient rug
woven from the fibers of a nonstochastic decision environment.
Of the reasons offered for this abstraction, perhaps the most basic was the most
forceful. Nonstochastic decision environments are simpler than stochastic ones. And
despite their relative simplicity, they still presented sufficient challenges to fully
employ entire generations of production economists. Better, perhaps, to resolve the
challenges posed by the simpler setting before attempting to resolve those posed by
the more realistic, but more challenging, setting.
Still, important contributions were made. Knight [26], by arguing that profit was
the entrepreneurial reward for bearing uncertainty, made uncertainty the centerpiece
of his theory of the firm. Arrow (1953, later translated as Arrow [3]), Savage
[39], and Debreu [17] provided the basic analytic foundations for analyzing
uncertain decision-making. Both Arrow [2] and Debreu [17] treated general-
equilibrium settings. Arrow’s [2] analysis focused on a pure-exchange economy,
but he suggested that “. . . the introduction of production would not be difficult.”
And Debreu [17] later extended the basic model to incorporate uncertain production
possibilities.
Nevertheless, the bulk of the formal production economics literature remains
focused on nonstochastic decision settings. This chapter’s goal is to provide a
capsular depiction of a formal model of stochastic production technologies and then
to use it to discuss briefly the fundamentals of producer decision-making under
uncertainty. The emphasis, following Magill and Quinzii [31] and Chambers and
Quiggin [10], is on identifying formal similarities between stochastic models and those
developed for nonstochastic production structures. The view taken is that nonstochastic
models should be special cases of a more general stochastic production structure.
Thus, readers of other chapters in this Handbook should encounter many familiar
concepts, albeit slightly recycled to accommodate the more general decision setting.
One thing this chapter will not attempt is a review of the stochastic production
function literature (e.g., [24] and [25], Antle [1], Lapan and Moschini [28], Pope
and Chavas [34], Chavas and Holt [14], and Pope and Just [35, 36]). In its place,
this chapter shows why this approach is a restrictive special case of the more general
Arrow-Debreu framework. Readers interested in more details on this approach are
referred to the excellent overview presented in Moschini and Hennessy [33].
The chapter is organized as follows. The first section discusses the conceptual
framework for analyzing uncertain decisions. A formal model of uncertain pro-
duction is then developed. The first step is to recall some basic concepts from
nonstochastic production analysis. These concepts are then extended to an uncertain
setting. The result is a general model that contains traditional nonstochastic models
as a special case. Alternative axioms for that technology are then discussed. After
that, different structural representations of stochastic technologies are examined.
The primary emphasis is on discriminating between an econometrically framed
version of that technology and one that flows more naturally from the Arrow and
Debreu contributions and is thus better grounded in axiomatic production theory.
The penultimate section contains a brief treatment of producer decision-making in
a stochastic environment, and then the chapter concludes.

Uncertainty and Risk

Following Knight [26], economists distinguish between risk and uncertainty.
Randomness that is “susceptible of measurement” by statistical calculation, his-
torical experience, or deduction is called risk. Uncertainty is defined residually
as randomness that is not “susceptible of measurement” by statistical calculation,
historical experience, or deduction. Practically speaking, risk involves randomness
to which objective probability assessments can be attached, and uncertainty involves
randomness to which objective probability assessments cannot be attached.
With the obvious exceptions of formal and informal gambling fora, such as card
games, casinos, and publicly supported lotteries, most economic decisions over
random outcomes involve uncertainty and not risk. A bit of introspection reveals
why. Consider a simple coin-flip game where two individuals wager on the outcome
of tossing a coin. If the coin is known to be fair, it’s a 50–50 proposition and the
game involves risk. On the other hand, we’re all aware of the phenomenon of unfair
coins. And so unless the coin is subjected to testing before the wagers are made, any
assessment of its fairness is purely subjective. Thus, the game involves uncertainty.
More generally, many of our daily endeavors are influenced by interacting random
factors that are beyond our control and which we only understand imprecisely.
Creating laboratory-like settings for evaluating the likelihood of different outcomes
578 R. G. Chambers

in such settings is not possible. Any probability assessments we make to aid our
decision-making in such settings are inherently subjective.
In discussing individual attitudes toward randomness, and in providing an
axiomatic basis for making decisions about potentially random outcomes, distin-
guishing between risk and uncertainty is vital. In modelling random production
structures, it is not. A complete and probability-free theory of random
production is available.
The claim is not that probabilities are unimportant for decision-makers con-
fronted by stochastic technologies. Whether that is true or not is the subject of
decision theory and not production economics. Instead, the claim following Debreu
[17] is that probabilities are not needed to describe a physical stochastic technology
and that a complete theory of producer decision-making under uncertainty can be
developed without any reference to the notion of probabilities.
The formal structure, traceable to Arrow [2] and Savage [39], is common to both
decision theory and financial analysis. There are two periods, 0 and 1. In period 0,
which predates 1 and is called the decision period, no uncertainty exists. In period
1, uncertainty exists. Decisions made in period 0 are not uncertain, but they can
have uncertain consequences (outcomes) in period 1. Uncertainty is modelled by a
set of states, S, that is exogenous to producers and for which one, and only one,
realization, s ∈ S, called a state or a state of Nature actually occurs in period 1.
Uncertainty is resolved by a neutral player, Nature, making a unique choice from
S. Nature’s choice is made after period 0 and cannot be affected by producers. In
general, S can contain either a finite number of elements or an infinite number of
elements. In this chapter attention is restricted to the case where S contains a finite
number of elements. The elements of S are indexed, with an abuse of notation,
as S = {1, 2, . . . , S}. Hence, S is used to denote both the set of states and its
dimensionality. Subsets of S are referred to as events.
From the perspective of a production economist, the main practical difference
between the case of finite-dimensional S and infinite-dimensional S is that the
latter results in random variables being defined as infinite-dimensional vectors.
Conceptually, this causes no real problems, but mathematically it does require
notions of differentiability that depart from those usually used by economists.
Finite-dimensional S, on the other hand, allows us to operate in terms of commonly
understood gradients and partial derivatives.
Outcomes realized in period 1 are always expressible as real numbers. Thus,
the choices made in period 0 that have uncertain outcomes in period 1, commonly
referred to as acts, random variables, or state-contingent choices, are representable
as vectors in $\mathbb{R}^S$. For an uncertain choice $f \in \mathbb{R}^S$, its realized value if Nature
chooses $s \in S$ is denoted by $f_s \in \mathbb{R}$.
An example illustrates. At planting time, a farmer is uncertain how much rain
will occur during the growing season. There are two possibilities, drought (D) and
adequate rainfall (A). The farmer can either plant a drought-resistant seed variety
(R) or a non-resistant seed variety (N ). If the non-resistant seed variety is planted,
yield will be yAN if adequate rainfall occurs and yDN if drought occurs with yAN >
14 Production Under Uncertainty 579

yDN . If the drought-resistant variety is planted, yield will be yAR if adequate rainfall
occurs and yDR if drought occurs with yAR > yDR .
In this example, period 0 is “planting time,” and period 1 is “end of the growing
season.” Abusing notation, S = {A, D} and the decision to be made is “non-resistant
variety” or “drought-resistant variety,” that is, choose either $y_N = (y_{AN}, y_{DN})'$
or $y_R = (y_{AR}, y_{DR})'$. (The notation $'$ denotes the transpose of the vector.)
The period 0 choice reduces to choosing between two real vectors, each with the
characteristic that its sth element corresponds to the realized value for that choice
if Nature chooses s from S.
The farmer’s available choices can be represented in several ways. One is in
tabular format:

State \ Choice    Resistant     Non-resistant
Adequate          $y_{AR}$      $y_{AN}$
Drought           $y_{DR}$      $y_{DN}$

Another is the corresponding matrix format:

$$Y = \begin{bmatrix} y_{AR} & y_{AN} \\ y_{DR} & y_{DN} \end{bmatrix} \in \mathbb{R}^{2 \times 2}.$$

Our notational convention is that the mth column of an S × M matrix corresponds
to the mth random variable. Each row, s ∈ S, on the other hand represents the
corresponding outcome for each of the random variables in realized state s.
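The row-and-column convention can be made concrete with a short sketch. The yields below are hypothetical numbers chosen only to respect yAN > yDN and yAR > yDR, and the names STATES, CHOICES, act, and realized are illustrative assumptions rather than notation from the chapter.

```python
# Hypothetical yields; the numbers are illustrative only.
STATES = ["A", "D"]    # adequate rainfall, drought
CHOICES = ["R", "N"]   # drought-resistant, non-resistant

# Rows are states and columns are choices, matching the S x M convention.
Y = [
    [4.0, 5.0],  # state A: y_AR, y_AN
    [3.0, 1.0],  # state D: y_DR, y_DN
]

def act(choice):
    """Return the state-contingent vector (a column of Y) for a period-0 choice."""
    m = CHOICES.index(choice)
    return [row[m] for row in Y]

def realized(choice, state):
    """Return the period-1 outcome once Nature has picked `state`."""
    return Y[STATES.index(state)][CHOICES.index(choice)]

y_N = act("N")  # the random variable chosen by planting the non-resistant variety
y_R = act("R")  # the random variable chosen by planting the resistant variety
```

Calling realized("R", "D") picks out the drought row of the resistant column, which is the sense in which the period-0 choice is a vector and the period-1 outcome is a single real number.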

The Stochastic Technology

When talking production, economists typically rely upon three interrelated con-
cepts. Each characterizes the technology but views it from a different visual
perspective. The concepts are a technology set, an input set, and an output set. The
technology set, when depicted in R2 , consists of input-output combinations that
lie on or below a technical frontier typically conceived as the graph of a production
function. The input set, when depicted in R2 , consists of input combinations capable
of producing a given output. These input combinations fall on or above another
technical frontier, the isoquant. The output set, when depicted in R2 , portrays output
bundles producible using a fixed input bundle. These output bundles fall on or below
a technical frontier known, variously, as the transformation frontier, the production
possibilities frontier, or the production transformation frontier.
Figure 1, which measures a single output, z, on the vertical axis and a single
input, x, on the horizontal axis, depicts the technology set as everything falling on
or below the familiar “lazy-S” shaped production function. Figure 2 illustrates the
input set for the case of two inputs, x1 and x2. The input set is obtained by holding
output, z, constant and orthogonally projecting the locus of input bundles that can
produce z onto $\mathbb{R}^2$. Its lower boundary is the isoquant.

Fig. 1 The technology set

Fig. 2 The input set

More formally, identify input bundles as N-dimensional real vectors, $x \in \mathbb{R}^N$,
and output bundles as M-dimensional real vectors, $z \in \mathbb{R}^M$. The technology set, T,
is the subset of input-output space, $\mathbb{R}^N \times \mathbb{R}^M_+$, satisfying

$$T \equiv \{(x, z) \in \mathbb{R}^N \times \mathbb{R}^M_+ : x \text{ and } z \text{ are technically feasible}\}.$$

The input set is obtained by applying a point-to-set mapping (correspondence) to
T. The notation $P : A \Rightarrow B$ denotes a point-to-set mapping from the real subspace
A to the real subspace B. That is, P takes points, $a \in A$, and maps them into subsets
of B, $B^o \subset B$. The input correspondence maps points in output space, $z \in \mathbb{R}^M$,
into the subset of the input space, $\mathbb{R}^N$, containing the input bundles that can produce
z:

$$X(z) \equiv \{x \in \mathbb{R}^N : (x, z) \in T\}.$$

The image of the correspondence evaluated at z, X (z) , is the input set for z.

Fig. 3 The output correspondence

Fig. 4 The output set

The output correspondence transmutes the process and maps input space into
output space, $Z : \mathbb{R}^N \Rightarrow \mathbb{R}^M$, with

$$Z(x) \equiv \{z \in \mathbb{R}^M : (x, z) \in T\}.$$

Figure 3 illustrates the output correspondence visually for the case N = 1 and
M = 2, and Fig. 4 illustrates the output set Z (x0 ) obtained by projecting the surface
in Fig. 3 onto RM for x0 . Its outer boundary is the production possibilities frontier.
Formally, there is no need to discriminate between inputs and outputs. One can
always think in terms of netputs letting positive realizations denote outputs and
negative realizations denote inputs. This easily handles the case of intermediate
inputs where some outputs from one production process are inputs in another. We
discriminate between inputs and outputs because it allows us to develop our ideas
using non-specialist language and widely familiar concepts.
The input correspondence and the output correspondence are lower inverses of
one another. That is,

$$Z(x) = X^{-}(x) \equiv \{z : x \in X(z)\},$$

and

$$X(z) = Z^{-}(z) \equiv \{x : z \in Z(x)\}.$$

From the definitions, it is apparent that the following equivalence holds

x ∈ X (z) ⇔ (x, z) ∈ T ⇔ z ∈ Z (x) . (1)

The different perspectives from which to view production problems are equivalent.
Different perspectives are used because different economic problems may require
focusing on different aspects of the technology. For example, for problems involving
the interplay between a single input and a single output, T is the natural choice. If
the interplay between different inputs is the primary concern, X (z) is the natural
focus, and Z (x) when the interplay between different outputs is the focus.
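The equivalence in (1) can be checked mechanically on a small finite technology. The sketch below assumes a hypothetical single-input, single-output technology on a grid, feasible whenever z ≤ 2x. The grid, the frontier, and every function name are assumptions made purely for illustration.

```python
from itertools import product

# Hypothetical finite technology: (x, z) is feasible whenever z <= 2 * x.
X_GRID = [0, 1, 2, 3]
Z_GRID = [0, 1, 2, 3, 4, 5, 6]
T = {(x, z) for x, z in product(X_GRID, Z_GRID) if z <= 2 * x}

def input_set(z):
    """X(z): the inputs on the grid that can produce z."""
    return {x for x in X_GRID if (x, z) in T}

def output_set(x):
    """Z(x): the outputs on the grid producible from x."""
    return {z for z in Z_GRID if (x, z) in T}

def lower_inverse_of_X(x):
    """X^-(x) = {z : x in X(z)}, which should coincide with Z(x)."""
    return {z for z in Z_GRID if x in input_set(z)}

# x in X(z)  <=>  (x, z) in T  <=>  z in Z(x), checked at every grid point.
equivalence_holds = all(
    (x in input_set(z)) == ((x, z) in T) == (z in output_set(x))
    for x, z in product(X_GRID, Z_GRID)
)
```

The three membership tests agree at every grid point because each set is built from the same T. Which of the three one works with is a matter of convenience rather than substance.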

Incorporating Randomness

Despite the prominent role that nonstochastic technologies play in economic theory,
very few real technologies are nonstochastic. Everyone is familiar with very simple
circumstances where seemingly identical efforts produce different outcomes. Most
need go no further than attempting to draw two uniform circles freehand on a single
sheet of paper to grasp this phenomenon. Try as we might, slightly different images
will likely emerge each time the process is repeated.
Just what causes such phenomena is not uncontroversial. And because our pur-
pose is not to explain why such things happen, we attempt no concrete explanation.
Instead, we outline a formal framework that permits incorporating such possibilities
into models of producer behavior.
To that end, assume that inputs, x, are committed in period 0 and that outputs,
although also chosen in period 0, are only realized in period 1. After the producer
makes that input-output choice, Nature makes a selection, s, from S that resolves
uncertainty and determines what level of outputs are actually realized.
Just what S incorporates and what constitutes a state is often vague and typically
depends upon the problem setting. So, for an agricultural producer, Nature’s choice
might involve picking such things as moisture and radiation that characterize
the physical conditions under which plant growth actually occurs. Moisture and
radiation are physical inputs that the producer knows can affect production but
which he or she cannot feasibly know prior to planting. Hence, in this case, each
state of the world may reasonably be interpreted as specifying distinct moisture-
radiation combinations, and S would include all such possible combinations. Of
course, in other decision settings, producers may not care about moisture and
radiation because those random inputs may not affect production outcomes. Then,
S would likely be conceptualized differently to accommodate the relevant sources
of uncertainty.
Regardless of the setting, however, the states and S must satisfy several criteria.
S and its elements must be exogenous to the producer. S must exhaustively
characterize the uncertainty faced by the producer. Put in language used by Donald
Rumsfeld,1 all unknowns are known: Nature’s choice must always be an element of
S, and one element of S must be chosen. S must cover all random outcomes that
can possibly affect production. And, finally, the states must be mutually exclusive
so that no two states can occur simultaneously. For example, drought and adequate
rainfall cannot both occur.
As a practical matter, the exhaustive nature of S in the Savage setup is often
violated. In describing criteria that states need satisfy, Savage (1954, p. 8) only
refers to “. . . descriptions that might thinkably apply.” Despite this vagueness, it is
critical that the producer knows each possible state and what it means. What the
producer does not know in period 0 is which state will actually be chosen. But once
Nature chooses an s from S, the production process becomes entirely deterministic.
However, as has become increasingly apparent, individuals often face choices
for which the range of alternative outcomes is unknowable or imperfectly foreseen,
what Rumsfeld called unknown unknowns. There is a burgeoning literature on mod-
eling and describing decision-making over unknown unknowns, but that remains
beyond the scope of the current chapter.
The conceptual approach to accommodating this uncertainty in modelling pro-
duction decisions, traceable to Arrow [2], recognizes that just as commodities
are differentiated according to their physical characteristics, location, and time
provided, they also should be differentiated according to the state of Nature, s ∈ S,
that eventuates. For example, when it doesn’t rain, those galoshes on your feet are
superfluous attire. But when it does rain, the same galoshes handily keep your feet
dry.
Therefore, outputs chosen by the producer, z, before Nature’s choice is revealed
should not only be subscripted according to their physical characteristics (e.g.,
wheat of a particular variety) but also according to the state of Nature that occurs.
Thus, multiple stochastic outputs are represented not as vectors but as matrices,
$z \in \mathbb{R}^{S \times M}$ (and not $z \in \mathbb{R}^M$), with typical element $z_{sm}$ denoting the amount of the
mth commodity produced if state of Nature $s \in S$ is realized. And so if in period 0,
the producer chooses the input and stochastic output mix, $(x, z) \in \mathbb{R}^N \times \mathbb{R}^{S \times M}$, and
then Nature picks $s \in S$, the realized period-1 output bundle is

$$z_{s\cdot} = (z_{s1}, z_{s2}, \ldots, z_{sM}),$$

that corresponds to the sth row of z.

1 Donald Rumsfeld famously said: “. . . there are known knowns; there are things we know we
know. We also know there are known unknowns; that is to say we know there are some things we
do not know. But there are also unknown unknowns – the ones we don’t know we don’t know”
(US Department of Defense, 2002).

With these changes in notation and interpretation, the definition of the stochastic
technology becomes

$$T = \{(x, z) \in \mathbb{R}^N \times \mathbb{R}^{S \times M} : x \text{ and } z \text{ are technically feasible}\}, \tag{2}$$

which is virtually indistinguishable from that for a nonstochastic technology. The
main difference is the domain for z.
The formal similarity in definitions is especially attractive because it implies that
concepts used to analyze nonstochastic production decisions can be productively
recycled to analyze stochastic ones. Of course, because more forces are at play,
actual decisions are now more complicated. But fundamentally, one is still interested
in input substitutability, output transformability, and the interaction between inputs
under the control of producers and outputs realized. What’s new is Nature’s role in
this process.
Nature’s role necessarily complicates things, but no reason exists to suggest that
it invalidates ideas developed in nonstochastic models. Indeed, in a formal setting,
one would hope that familiar nonstochastic models would emerge as the special case
of the more general stochastic technology for S = {1} . Mathematically, when one
wishes to focus on certain aspects of a Q-dimensional set, one often orthogonally
projects the set onto a lower dimensional space holding one or several elements of
the set constant. (This is the basic idea behind input sets and output sets.) Using
similar logic, nonstochastic technologies can be visualized as projections of the
stochastic technologies from RN × RS×M onto, say, RN × RM for a particular
s ∈ S. However, this projection is only economically relevant if it is known for sure
in period 0 that Nature’s choice will be s.
Some examples will help illustrate. First, take the case where M = 1. The
producer produces a single stochastic output. In our formalism, in period 0 the
producer chooses x ∈ RN and z ∈ RS . The sth element of z, zs , is the amount
of the output produced if Nature picks s.
Something akin to Figs. 3 and 4 remains a natural way to depict the output choice.
But despite the similarity of the visual representations, the intuition differs subtly.
For now the horizontal axis in Fig. 4 measures output realized from x if Nature’s
choice is 1 ∈ S = {1, 2} , and the vertical axis measures output produced if
2 ∈ S = {1, 2} is chosen. As drawn, T permits transforming z1 into z2 holding
x constant. This would be done, for example, by diverting efforts to prepare for
state 1 to preparing for state 2.
In this instance, the visual depiction for X (z) remains essentially unchanged
from Fig. 2. And Fig. 1, as drawn, corresponds to how much of one state-specific
output, say z1 , would be produced as x varies now holding the other state-specific
output, z2 , constant. It emerges from projecting the three-dimensional surface in
Fig. 3 onto the (x, z1 ) coordinate plane holding z2 constant.
Now suppose that M = 2 so that two stochastic outputs are produced. Then
instead of just one two-dimensional transformation frontier for a fixed x, as in
Fig. 4, there are now six: between $z_{11}$ and $z_{21}$ (same output, different states), $z_{12}$
and $z_{22}$ (same output, different states), $z_{11}$
and $z_{12}$ (same state, different outputs), $z_{21}$ and $z_{22}$ (same state, different outputs),
$z_{11}$ and $z_{22}$ (different outputs and different states), and $z_{12}$ and $z_{21}$ (different outputs
and different states). Presumably, one might represent these distinct output sets as
similar to Fig. 4. For our current purposes, however, the specific shape is not crucial.
And, in any case, it likely varies considerably across problem settings. What is
important is that trade-offs occur both across products and across states of Nature.
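The count of six frontiers is simply the number of unordered pairs that can be drawn from the four state-contingent outputs. A minimal sketch of the enumeration follows; the labels and the helper name describe are illustrative assumptions.

```python
from itertools import combinations

S, M = 2, 2  # two states of Nature, two outputs

# Each element z_sm is labelled by the pair (s, m): state s, output m.
elements = [(s, m) for s in range(1, S + 1) for m in range(1, M + 1)]

def describe(a, b):
    """Classify the trade-off between two state-contingent outputs."""
    if a[0] == b[0]:
        return "same state, different outputs"
    if a[1] == b[1]:
        return "same output, different states"
    return "different outputs and different states"

pairs = [(a, b, describe(a, b)) for a, b in combinations(elements, 2)]

for a, b, label in pairs:
    print(f"z{a[0]}{a[1]} and z{b[0]}{b[1]}: {label}")
```

For general S and M the same enumeration yields SM(SM - 1)/2 pairwise frontiers, which is one reason the chapter goes on to restrict attention to a single stochastic output.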
In the following, we shall always speak in terms of outputs as being random and
inputs as being deterministic. Again this is a naming convention and is pursued
to promote an intuitive grasp of the basic ideas while avoiding an excessively
complicated notation. The case of stochastic inputs is easily handled within our
formalism. To illustrate, consider the example of a farmer who plants a crop in
early spring knowing that it might be subject to a pest infestation closer to harvest
time that can only be treated upon emergence. Then at planting time, when the
farmer picks z ∈ RS , he or she also picks a stochastic treatment plan t ∈ RS to
accommodate the potential pest outbreak.

Some Common Assumptions

Economic analysis always requires assumptions. This subsection discusses the
technical details that facilitate analysis of stochastic technologies. As a general
rule, the mathematics relevant for stochastic technologies are the same as for
nonstochastic ones. Where differences emerge, they typically result from the subtle
difference between a stochastic commodity, $z \in \mathbb{R}^S$, and an S-dimensional bundle
of distinct, but nonstochastic, outputs. The discussion focuses on those differences,
and readers requiring a more complete discussion of technical issues are referred
to Chambers and Quiggin [10]. And because the focus is stochastic production and
not multiple outputs, the remaining sections exclusively treat the case of a single
stochastic output so that $T \subset \mathbb{R}^N \times \mathbb{R}^S$ instead of $\mathbb{R}^N \times \mathbb{R}^{S \times M}$.
The assumptions relevant to modelling stochastic production can be classified
as feasibility assumptions, continuity assumptions, disposability assumptions, and
curvature or smoothness assumptions. Each will be discussed in turn.
Feasibility assumptions ensure that T , defined above, is non-empty. That’s simple
enough and needs relatively little motivation. After all, if T = ∅, it would be hard
to justify economists studying it intensively.
Continuity assumptions describe the type of set that we want T to be. In the
main, these are chosen for mathematical convenience, and depending upon context
they can assume different forms. For example, if our interest lies in studying input
behavior, one might only need restrictions on the input correspondence, X. To be
sure, because X and Z are lower inverses of one another, this results in restrictions
on Z. But those induced restrictions on Z may not be the same as those invoked if one were
only studying output behavior.
For our purposes, however, simplicity is best even if it results in a loss of
generality. Therefore, we require T to be compact, that is, as a subset of RN × RS ,
it is both closed and bounded. In certain contexts, this can prove overly strong. For
example, it rules out technologies satisfying global constant returns to scale. But the
gain in simplification in presenting basic results more than compensates for any lost
generality.
As the chapters “Distance Functions” and “Bad Outputs” demonstrate, poorly
chosen disposability assumptions can be quite limiting even when technologies
involve no uncertainty. Similar issues occur here. For example, imposing free
disposability of certain inputs and outputs can violate the laws of Nature as we
now understand them.
On the other hand, disposability properties frequently convey the most important
analytic consequence of any assumptions – the ability to write down a “function
representation of T .” Where, for example, would firm theory or empirical produc-
tion analysis be if one could not describe the technology in functional terms, be it a
production function, a transformation function, or an input requirement function?
The disposability assumption we invoke, while acknowledging its potential
limitations, is free disposability of $z \in \mathbb{R}^S$. Formally, $T \subset \mathbb{R}^N \times \mathbb{R}^S$ satisfies

$$z \in Z(x) \Rightarrow z' \in Z(x) \quad \text{for } z' \le z.$$

(Note, attention is here restricted to M = 1.) Free disposability of z has a number of
consequences, but for our immediate purposes, two are particularly important. The
first is that

$$Z(x) \neq \emptyset \Rightarrow 0^S \in Z(x).$$

(Notation: For any $r \in \mathbb{R}$, $r^S$ denotes the element of $\mathbb{R}^S$ where r always
occurs regardless of Nature’s choice. Alternatively, one can write $r^S = r1^S$. $0^S$,
thus, denotes the origin associated with the usual orthonormal basis for $\mathbb{R}^S$.) In
words, if something can be produced from a given x, one can always choose to
produce nothing from that same input bundle. Put yet another way, complete output
inefficiency is possible.
The second consequence is that

$$z \in Z(x) \Leftrightarrow D(z, x) \le 0, \quad \text{(Indication)} \tag{3}$$

where

$$D(z, x) \equiv \min\{\beta \in \mathbb{R} : z - \beta^S \in Z(x)\}$$

if there exists $\beta \in \mathbb{R}$ such that $z - \beta^S \in Z(x)$ and $\infty$ otherwise. Indication
ensures that D(z, x) fully characterizes Z(x) and hence, by (1), T and X(z). Thus,
knowledge of D(z, x) is equivalent to knowledge of the stochastic technology.

Fig. 5 D(z, x) illustrated for S = {1, 2}

The extended-real-valued function, D, is called either a directional output
distance function or a shortage function.2 It gives the smallest amount of $1^S \in \mathbb{R}^S$
that can be subtracted from z and leave the result within the output set Z (x) . It is a
directional notion because geometrically it involves translating z in the direction 1S .
The choice of 1S as the direction is motivated by the fact that it allows interpreting
distance as being measured in units of the sure thing, which is formally equivalent
to a degenerate random variable (see below).
Figure 5 illustrates the determination of D (z, x) geometrically. As drawn, z lies
outside of Z (x) . The direction 1S is illustrated by the vector (1, 1). D (z, x) is
then derived by translating the point z in the direction parallel to (1, 1) toward the
boundary of Z (x) until the translated point, z−D (z, x) 1S , just lies on the boundary
of Z (x) . For the case drawn, a positive amount of 1S has been subtracted from z
to make it feasible and thus D (z, x) > 0 signalling that x could not have produced
that particular z.
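The translation toward the boundary can be mimicked numerically. The sketch below assumes a hypothetical output set, Z(x) = {z : Σ_s max(z_s, 0)² ≤ x}, which satisfies free disposability of z. That functional form and the names feasible and shortage are illustrative assumptions and are not taken from the chapter. D(z, x) is approximated by bisection on β.

```python
import math

def feasible(z, x):
    """Membership test for the assumed output set Z(x).

    Z(x) = {z : sum_s max(z_s, 0)^2 <= x} is hypothetical. It is freely
    disposable because lowering any z_s cannot raise the sum.
    """
    return sum(max(zs, 0.0) ** 2 for zs in z) <= x

def shortage(z, x, iterations=200):
    """Approximate D(z, x) = min{beta : z - beta * 1^S in Z(x)} by bisection."""
    if x < 0:
        return math.inf                  # Z(x) is empty, so no beta works
    hi = max(z)                          # z - hi * 1^S <= 0, hence feasible
    lo = min(z) - math.sqrt(x) - 1.0     # every element exceeds sqrt(x): infeasible
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if feasible([zs - mid for zs in z], x):
            hi = mid                     # mid works, so the minimum is at or below it
        else:
            lo = mid                     # mid fails, so the minimum is above it
    return hi

d_outside = shortage([1.0, 1.0], 1.0)   # z lies outside Z(x)
d_inside = shortage([0.5, 0.5], 1.0)    # z lies inside Z(x)
```

Under these assumptions d_outside is positive and d_inside is negative, in line with Indication, and adding α to every element of z raises the computed value by α, which is the sense in which distance is measured in units of the sure thing. Points exactly on the boundary are sensitive to floating-point rounding, so a small tolerance is advisable when testing D(z, x) ≤ 0 numerically.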
Because they focus on trade-offs, economists routinely use calculus-based argu-
ments. In a production context, these are usually justified by curvature assumptions
placed directly upon X (z) , Z (x) , or T or by smoothness assumptions placed
upon a function representation of the technology such as D (z, x) . Depending upon
the context, required assumptions can differ subtly. For example, assuming that
Z (x) ⊂ RS is a strictly convex set ensures that D (z, x) is differentiable in z. It
does not ensure, however, that D (z, x) is differentiable in x.
Figure 6 illustrates. At x0, the boundary of Z(x0) is strictly convex and thus
nicely smooth in (z1, z2) space. For points on the boundary of Z(x0), D(z, x) will
be differentiable in z. But, as drawn, the correspondence possesses a kink as one
varies x away from x0.

Fig. 6 Smooth output set

2 Luenberger [30] coined the “shortage” terminology. Later building upon Luenberger [30],
Chambers, Chung, and Färe [7] introduced the “directional distance” terminology to emphasize its
similarities to and differences from distance functions of the type studied by Shephard [? ]. See the
chapter entitled “Distance Functions” by Chambers and Färe in this volume.

Similar reasoning shows that requiring $X(z) \subset \mathbb{R}^N$ to be a strictly convex
set ensures differentiability of D(z, x) in x, but not in z. On the other hand,
one can easily imagine smooth boundaries for nonconvex Z(x) that will generate
differentiable D(z, x).
Broadly speaking, whether differentiability assumptions are crucial to analyzing
production decisions under uncertainty is, as in most other areas of economic
analysis, context-dependent. In the main, non-differentiability, if it occurs, will only
occur at isolated points (sets of measure zero) and can be safely ignored. In such
instances, differentiability assumptions are akin to continuity assumptions discussed
above, analytically convenient but not terribly substantive.
But in other cases, those sets of measure zero can be acutely interesting and
can convey important economic information. In fact, several long-standing puzzles
or paradoxes in economics (e.g., the Allais Paradox, the Ellsberg Paradox, the
endowment effect) have been accommodated analytically by admitting the potential
for non-differentiabilities.
From the perspective of a production economist, non-differentiabilities signal
a lack of substitutability or transformability. Economic analysis of alternative
production schemes often hinges on a presumed ability to substitute either away
from or toward economically attractive alternatives. For example, tax policies
directed at reducing the use of an environmentally damaging input are likely
ineffective if no substitutes exist for the damaging input. That lack of substitutability
is often manifested analytically as a lack of differentiability.
Curvature assumptions also often reflect economic regularities in which
economists typically believe. The laws of the diminishing marginal productivity of
inputs (traceable at least to von Thünen’s (1826) early nineteenth-century studies),
the diminishing marginal rate of technical substitution, and the increasing rate
of marginal transformation are common production examples. Each is formalized
by imposing convexity on one of the set-based representations of the stochastic
technology. Our focus is on the latter two, the diminishing marginal rate of technical
substitution and the increasing marginal rate of transformation.
Assuming that X (z) is a convex set yields input sets as depicted in Fig. 2. The
marginal rate of technical substitution between any two inputs at a particular point
on the isoquant is given by the slope of the tangent hyperplane to that point. We
compute it using D(z, x) as follows. For any point, $x^o$, belonging to

$$\bar{X}(z) = \{x : D(z, x) = 0\},$$

use an appropriate version of the implicit function theorem to obtain by partial
differentiation the small variation in $x_k$ that will exactly balance a small variation in
$x_j$ (thus keeping one on the isoquant) as

$$\frac{\partial x_k}{\partial x_j} = -\frac{\partial D(z, x^o)/\partial x_j}{\partial D(z, x^o)/\partial x_k},$$

which (depending upon naming conventions) is the marginal rate of technical
substitution of $x_k$ for $x_j$ at $(z, x^o)$. When the input set is nicely smooth as in Fig. 2,
the marginal rate of substitution is well defined, is well understood, and exhibits
a diminishing marginal rate (input substitution becomes increasingly difficult). On
the other hand, if X (z) is convex but possesses a kinked boundary, this standard
procedure will not work. The presence of a kink implies the existence of an infinity
of supporting hyperplanes, each with different slope, and signals a lack of smooth
substitutability between inputs. The economic consequence is that local factor
demands will exhibit “stickiness” to a continuum of relative factor-price changes.
Assuming that Z (x) is convex allows one to depict output sets as in Fig. 4. But
one must now remember that this familiar shape’s interpretation is subtly different
from that in the nonstochastic multi-output production case. In the latter, the smooth
shape implies that redirecting inputs from the production of one output to the
production of another effectively permits transforming one into the other. You get
more corn, from a given bundle of inputs, by sacrificing some wheat.
Here, however, the message is that to get more corn in one state of Nature, say 1 ∈
{1, 2} , you have to sacrifice corn in state of Nature 2. Output transformation occurs
not across dissimilar products but across different states of Nature for otherwise
identical products.
Mechanically, one obtains the marginal rate of transformation between state-
contingent or state-specific outputs exactly as one obtains the marginal rate of
substitution. Define the stochastic or state-contingent production possibilities frontier
by

$$\bar{Z}(x) = \{z \in \mathbb{R}^S : D(z, x) = 0\}.$$

Holding all inputs and state-contingent outputs except s and m constant, implicit
differentiation yields the marginal rate of transformation as

$$\frac{\partial z_s}{\partial z_m} = -\frac{\partial D(z, x)/\partial z_m}{\partial D(z, x)/\partial z_s}.$$

When Z (x) takes the shape in Fig. 4, this marginal rate of transformation is
always well defined. However, if Z̄ (x) contains a kink, the marginal rate of transfor-
mation is not well defined. Again a kink in Z̄ (x) implies a lack of substitutability or,
perhaps, transformability between outputs in different states of Nature. Thus, if state
1 were identified with ideal growing conditions for corn and state 2 with less than
ideal growing conditions for corn, a kink in the stochastic production possibilities
frontier for corn would signal that farmers experience stickiness in transforming
corn in the ideal state of Nature into corn in the less than ideal state of Nature.
This, in turn, would signal physical difficulties in redirecting inputs to different uses
in preparing for different states of Nature. As later developments reveal, a broadly
studied class of stochastic technologies always possesses such kinks.
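For a smooth frontier, the implicit-differentiation formula can be verified numerically. The sketch below assumes a hypothetical technology with Z(x) = {z : Σ_s exp(z_s) ≤ x} for x > 0, for which the shortage function has the closed form D(z, x) = log Σ_s exp(z_s) − log x. That technology, the step size h, and the function names are assumptions introduced only for illustration; partial derivatives are approximated by central differences.

```python
import math

def shortage(z, x):
    """D(z, x) for the assumed set Z(x) = {z : sum_s exp(z_s) <= x}, x > 0."""
    return math.log(sum(math.exp(zs) for zs in z)) - math.log(x)

def partial_z(z, x, k, h=1e-6):
    """Central-difference approximation to dD/dz_k."""
    up = list(z)
    down = list(z)
    up[k] += h
    down[k] -= h
    return (shortage(up, x) - shortage(down, x)) / (2.0 * h)

def mrt(z, x, s, m):
    """dz_s/dz_m along the frontier: -(dD/dz_m) / (dD/dz_s)."""
    return -partial_z(z, x, m) / partial_z(z, x, s)

# A point on the frontier: exp(0) + exp(log 2) = 3 = x, so D is zero there.
z0 = [0.0, math.log(2.0)]
x0 = 3.0
slope = mrt(z0, x0, 0, 1)
```

For this assumed technology the analytical rate is −exp(z_m)/exp(z_s), so slope should be close to −2 at z0. A technology with a kinked frontier would instead give different left and right difference quotients at the kink, which is the numerical counterpart of the ill-defined marginal rate of transformation discussed above.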
Other subtle differences can emerge between how one interprets curvature or
smoothness properties for nonstochastic technologies and how one interprets similar
properties in studying stochastic technologies. We illustrate with Fig. 7. There each
point on the ray passing through (1, 1) that is labelled the “Bisector” is characterized
by z1 = z2 . The terminology reminds us that the ray passing through (1, 1) splits
R2 in half. In the nonstochastic case, points on this ray contain equal amounts of
two different outputs and are not particularly meaningful. But in the stochastic
context, points on that ray have the same output in both states of Nature. In other
words, these are the non-random (or degenerately random) production outcomes.
In finance, elements on that ray have certain (nonstochastic) returns in period 1.
And, 1S is often referred to as the riskless asset. (Remember random variables are
vectors in RS . Thus, financial assets with random period 1 payouts are representable
as vectors in RS .)

Fig. 7 Stochastic production possibilities

Depicting radial contractions or expansions of 1S as intersecting Z (x) , thus,

represents the nontrivial assumption that the input bundle x can be used to produce
some output level nonstochastically. That is, all uncertainty can be controlled.
Whether this is true depends upon the technology, but the important economic
implication is that whether the producer exposes himself or herself to uncertainty
now represents a conscious choice. For example, as noted, free disposability of
output requires Z (x) ≠ ∅ ⇒ 0S ∈ Z (x) , so that all uncertainty can be avoided by
the simple device of choosing not to produce.
The riskless asset is not the only geometric surface in RS to possess a subtly
different interpretation when discussing random variables. Recall that an affine
hyperplane, as a subset of RS , is defined as the set of points satisfying

\[ \bar{H}(\alpha, p) = \{ z \in \mathbb{R}^S : p^\top z = \alpha \}, \quad \alpha \in \mathbb{R}. \]

Visually, an affine hyperplane is represented as a linear surface that is parallel to

the hyperplane passing through the origin with normal p ∈ RS . In the definition,
the location parameter, α, represents the amount that the hyperplane through the
origin is translated in the direction of 1S to obtain the surface H̄ (α, p) . When
$p \in \mathbb{R}^S_+ \setminus \{0\}$, hyperplanes in $\mathbb{R}^S$ depict sets of random variables sharing the same
expected value, $\alpha / (p^\top 1^S)$, as defined relative to the probability measure

\[ \frac{p}{p^\top 1^S}. \]

To see why, consider two facts. First, the homogeneity properties of affine hyperplanes
ensure that

\[ \bar{H}\left( \frac{\alpha}{p^\top 1^S}, \frac{p}{p^\top 1^S} \right) = \bar{H}(\alpha, p). \]

Dividing both the normal and the location parameter by the same constant leaves
the hyperplane unchanged. And second

\[ \left( \frac{p}{p^\top 1^S} \right)^{\top} 1^S = 1. \]

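When $p \in \mathbb{R}^S_+ \setminus \{0\}$, elements of the normal to
$\bar{H}\left( \frac{\alpha}{p^\top 1^S}, \frac{p}{p^\top 1^S} \right)$ are non-negative
and sum to one. Hence, $\frac{p}{p^\top 1^S} \in \mathbb{R}^S_+ \setminus \{0\}$ is interpretable as a probability measure,
and its inner products, $\left( \frac{p}{p^\top 1^S} \right)^{\top} z$, with random variables, $z \in \mathbb{R}^S$, are interpretable
as the expected values of those random variables.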
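
The normalization argument can be checked numerically. The following is a minimal sketch, assuming a two-state world with a hypothetical normal p = (1, 3) and location parameter α = 8; the function names are illustrative and not part of the chapter.

```python
def normalize(p):
    """Return p / (p'1^S), the probability measure induced by the normal p."""
    total = sum(p)
    if total <= 0 or any(ps < 0 for ps in p):
        raise ValueError("p must be non-negative and nonzero")
    return [ps / total for ps in p]


def expected_value(prob, z):
    """Inner product of a probability vector with a random variable z in R^S."""
    return sum(q * zs for q, zs in zip(prob, z))


p = [1.0, 3.0]        # hypothetical normal (assumption), S = 2
alpha = 8.0           # hypothetical location parameter (assumption)
prob = normalize(p)   # [0.25, 0.75]

# Three points satisfying p'z = alpha, including the riskless point (2, 2).
points = [[8.0, 0.0], [5.0, 1.0], [2.0, 2.0]]
for z in points:
    # Every point on the hyperplane has expected value alpha / (p'1^S) = 2.0.
    print(z, expected_value(prob, z))
```

All three random variables lie on the same hyperplane and share the expected value of the riskless point on it, (2, 2).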

The Structure of Stochastic Technologies

Generally speaking, economists possess little exact knowledge of what characterizes

physical production technologies. As a practical matter, analytic tractability rou-
tinely trumps realism in choosing our models. Thus, an economist’s representation
of the technology is often better viewed more as a parable about production than as
a representation of the physical world.
Hopefully, these parables don’t do too much damage to reality. But when all
is said and done, our primary interest remains in how technologies constrain or
condition economic behavior. And thus, if models can be designed that do not
controvert the physical laws of Nature and that do not trivialize the economic
responses of producers, one is likely ahead of the game. Unfortunately, economists
rather persistently specify technical models that do controvert the laws of Nature
(see, e.g., the chapter entitled “Bad Outputs” by Murty and Russell in this volume)
and that do trivialize the economic responses of producers.
In examining the structure of stochastic technologies, we must first decide which
assumptions to impose on T . To preserve generality, they are kept to a minimum: a)
T ≠ ∅ (feasibility); b) T is compact (continuity); and c) z ∈ Z (x) ⇒ z′ ∈ Z (x)
for z′ ≤ z (free disposability of stochastic output). As noted, a) and b) are relatively
harmless and do relatively little damage to reality. Free disposability of z, on the
other hand, is often violated but is maintained because it simplifies the presentation.
These assumptions ensure that D (z, x) characterizes T . But together, they are
not sufficient to identify a specific form for D (z, x). How to proceed? If one were
to poll economists on a functional form for D (z, x), the smart money would bet
that something similar to

\[ D(z, x) = A(x) \prod_{s=1}^{S} z_s^{\alpha_s}, \tag{4} \]

would be among the favored answers.

The Cobb-Douglas form is surely among the most popular in economics. Its
popularity is traced to several roots. It admits nonlinearity while remaining relatively
simple. Its economically interesting characteristics are captured by relatively few
parameters. Its separability properties (see the chapter by Primont on “Functional
Structure” in this volume) ensure that the interactions between those characteristics
are relatively simple. And its supermodular structure ensures behavioral responses
are “nice.”3

3 As it turns out, the Cobb-Douglas specification is actually not a viable candidate for characterizing

a directional distance function. It fails to satisfy a key regularity property such functions must
possess. This regularity condition, which is referred to as the translation property, is important
from a technical perspective. But it is not crucial to our discussion of stochastic technologies and
thus has been ignored.

Any choice of form for D (z, x) imposes structure on T . Judging that structure’s
plausibility remains a craft that requires understanding the problem and the basic
nature of the technology. What truly distinguishes nonstochastic technologies
from stochastic ones is the latter’s treatment of output as random rather than as
deterministic. Consequently, our structural focus is on D (z, x)’s treatment of z.
For (4), the marginal rate of transformation between two stochastic outputs is

\[ \frac{\partial z_s}{\partial z_m} = -\frac{\alpha_m}{\alpha_s} \frac{z_s}{z_m}. \]

The associated partial elasticity of transformation (please see the chapter in this
volume by Russell entitled “Elasticities of Substitution”) between these two outputs
is thus

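\[ \frac{\partial \ln \left( z_s / z_m \right)}{\partial \ln \left( \dfrac{\partial D/\partial z_m}{\partial D/\partial z_s} \right)} = 1, \]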

for all pairs of realized outputs. A priori, little reason exists to suggest that the
elasticity of transformation should parametrically equal one (or any other value).
The message this conveys is that, along the stochastic production possibilities
frontier, percentage changes in ratios of stochastic outputs are exactly matched
by equal percentage changes in marginal rates of transformation. This behavior
mirrors the behavior in input space of Cobb-Douglas production functions. Thus,
even though it may possess no strong factual basis, it is conceptually familiar.
By appropriate choice of the parameters, this specification also admits stochastic
production possibilities frontiers approximating that drawn in Fig. 4.
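
A small numerical check can make the unit elasticity concrete. The sketch below assumes S = 2, hypothetical exponents (0.4, 0.6), and a fixed level for the product of the state-contingent outputs (so that x is held constant); none of these values come from the chapter.

```python
import math

ALPHA = (0.4, 0.6)   # hypothetical exponents (alpha_1, alpha_2), an assumption


def z1_on_frontier(z2, level=1.0):
    """Solve z1^a1 * z2^a2 = level for z1, holding x (and hence A(x)) fixed."""
    a1, a2 = ALPHA
    return (level / z2 ** a2) ** (1.0 / a1)


def mrt_formula(z1, z2):
    """Closed form dz1/dz2 = -(alpha_2 / alpha_1) * (z1 / z2)."""
    a1, a2 = ALPHA
    return -(a2 / a1) * (z1 / z2)


def mrt_numeric(z2, h=1e-6):
    """Central-difference slope of the frontier at z2."""
    return (z1_on_frontier(z2 + h) - z1_on_frontier(z2 - h)) / (2.0 * h)


def elasticity(z2_a, z2_b):
    """Arc elasticity d ln(z1/z2) / d ln|MRT| between two frontier points."""
    z1_a, z1_b = z1_on_frontier(z2_a), z1_on_frontier(z2_b)
    d_ratio = math.log(z1_b / z2_b) - math.log(z1_a / z2_a)
    d_mrt = math.log(-mrt_formula(z1_b, z2_b)) - math.log(-mrt_formula(z1_a, z2_a))
    return d_ratio / d_mrt


print(mrt_numeric(1.5), mrt_formula(z1_on_frontier(1.5), 1.5))  # the two slopes agree
print(elasticity(1.0, 2.0))                                     # equals one up to rounding
```

Because the marginal rate of transformation is proportional to the output ratio, the elasticity equals one for any pair of frontier points, whatever exponents are chosen.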
Before the late 1990s, however, the most common economic specification of
stochastic technologies did not resemble Fig. 4. Instead, something akin to the
rectangular Z (x) in Fig. 8 was the most popular choice.4 The marginal rate of
transformation between stochastic outputs is only well defined along a portion of
the boundary of Z (x) in Fig. 8. Depending upon your viewpoint, the slope of the
flat surfaces is either zero or infinity (in the limit). And, so, if one is zero, the
other is infinity. Moreover, at the “kink” the marginal rate of transformation is
not well defined. In its place, we find the continuum of potential marginal rates of
transformation, (−∞, 0) .5 That kink, of course, is a set of measure zero on which
differentiability of D (z, x) fails. It’s economically interesting because it represents
the most efficient z that can be produced using x. Any other z producible using that

4 As far as I am aware, Chambers and Quiggin [8] were the first to recognize that Z (x) for this
particular specification took this shape.
5 Formally, gradients are replaced by subgradients at the kink. Visually, subgradients are represented
by the infinity of hyperplanes tangent to Z (x) at the kink.


Fig. 8 Cubical production possibilities

x results in a lower output for at least one state. In probabilistic terms, the outer
vertex of Z (x) first-order stochastically dominates all the other elements of Z (x) .
Figure 8 exhibits either “output-price-nonjointness” or “perfect output comple-
mentarity.” The former terminology is due to Kohli (1983). The latter is analogous
to the consumer case where Leontief preferences identify perfect complements.
The economic consequences of both are easily grasped by considering what Fig. 8
implies when the axes correspond to two distinct outputs for a nonstochastic
technology. In that case, if x were held constant, altering relative output prices
would elicit no production response from the producer. Thus, even though one
output becomes economically more attractive as its price rises, the producer does
not respond by producing more of it at the expense of producing less of the now
relatively less attractive output. Because there is no substitution effect, the only way
for supply to respond is by adjusting x.
Imagine, in a stochastic setting, a farmer with only one tractor who has already
contracted to employ a fixed amount of labor. Suppose also that the farmer must
decide how to allocate that labor and that tractor without knowing what final
growing conditions will be. In that setting, consider two decision scenarios. In the
first, a government program guarantees that the farmer receives a fixed price for
her crop. In the second, no government programs exist, and the farmer must take
whatever price the market offers.
The first scenario provides the farmer with perfect price insurance; the second
provides none. Do you believe that the farmer produces in the same fashion under
both so that the provision of perfect price insurance has no substitution effect? If
your answer is no, your intuition aligns with a Z (x) resembling Fig. 4 because that
technology allows one to adjust the mix of stochastic outputs holding x constant. A
yes answer, on the other hand, aligns with Fig. 8 because the only available output
adjustment involves throwing away output in one or both of the states.
The Z (x) depicted in Fig. 8 can be identified with the stochastic production
function model of the technology defined as

\[ Z^{pf}(x) = \{ z \in \mathbb{R}^S : z_s \le f_s(x) \text{ for all } s \in S \} = \prod_{s \in S} \{ z_s : z_s \le f_s(x) \}, \tag{5} \]

where each fs : RN → R is a state-specific production function and the random

variable f (x) = (f1 (x) , ..., fS (x)) ∈ RS is the stochastic production function.
This technology works as follows. When input choices are made, the producer
only knows that one of S different ways of using the chosen x will occur. But she
does not know exactly which one. It’s as though she picks x and then spins a roulette
wheel to determine the technology that will prevail. The producer cannot allocate the
chosen x to perform different actions in response to realizations of s ∈ S. Instead,
the only true choice is the maximal output that occurs in each state.
The D (z, x) for this technology is derived as

\[ \begin{aligned} D(z, x) &= \min \{ \beta : z - \beta 1^S \in Z(x) \} \\ &= \min \{ \beta : z_s - f_s(x) \le \beta, \; s \in S \} \\ &= \max_{s \in S} \{ z_s - f_s(x) \}. \end{aligned} \tag{6} \]

Thus, assuming that s ∈ S corresponds to the state with maximal zs − fs (x) ,

∇z D (z, x) = es

where ∇ denotes the gradient with respect to the subscripted argument and es ∈ RS
is the sth element of the usual orthonormal basis. Taking pairwise ratios of elements
of ∇z D (z, x) yields either zeroes or undefined terms. The associated kink occurs at
z = f (x) ∈ RS .
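
The max form in (6) and its kink are easy to see numerically. The following is a minimal sketch, assuming two states, a single input, and hypothetical state-specific production functions f_s(x) = a_s √x; the function names are illustrative only.

```python
import math


def f(x, scale=(1.0, 2.0)):
    """Hypothetical state-specific production functions f_s(x) = a_s * sqrt(x)."""
    return [a * math.sqrt(x) for a in scale]


def distance(z, x):
    """D(z, x) = max_s { z_s - f_s(x) } for the stochastic production function model."""
    return max(zs - fs for zs, fs in zip(z, f(x)))


def gradient_in_z(z, x):
    """Return e_s for the unique maximizing state s, or None at a tie (a kink)."""
    gaps = [zs - fs for zs, fs in zip(z, f(x))]
    top = max(gaps)
    winners = [s for s, gap in enumerate(gaps) if gap == top]
    if len(winners) > 1:
        return None   # D is not differentiable here; only subgradients exist
    return [1.0 if s == winners[0] else 0.0 for s in range(len(z))]


x = 4.0                                  # f(x) = [2.0, 4.0]
print(distance([1.0, 1.0], x))           # negative: z lies inside Z(x)
print(gradient_in_z([1.0, 1.0], x))      # the basis vector for the binding state
print(gradient_in_z([2.0, 4.0], x))      # None: the kink at z = f(x)
```

Translating z by t units of the riskless asset raises D by exactly t, which is the translation property mentioned in footnote 3.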
The stochastic production function model is a natural consequence of two distinct
analytic traditions. The first is the single-product production function familiar
from intermediate micro theory. And the second is Haavelmo’s (1943) classic
identification of the “econometric error” with the equation as a whole and not any
single variable in the set of dependent and independent variables. The most common
version of the model melds these ideas into the generic form

zs = fs (x) ≡ g (x, εs ) , s∈S (7)

where ε ≡ (ε1 , ε2 , ..., εS ) ∈ RS is an exogenously determined random variable

and g : RN +1 → R. The assumption is that, prior to choosing x, the producer

knows g and ε but not its specific realization. Hence, choosing x immediately results
in the unique choice of rational z as (g (x, ε1 ) , ..., g (x, εS )) . How ε is determined
is rarely broached and even more rarely explained. Typically, ε is chosen to satisfy
analytically convenient statistical properties rather than with regard to the actual
decision setting.
This is a decidedly econometric view of stochastic production. In econometric
terms, ε provides the “sample space” for the stochastic factors affecting produc-
tion.6 As such, the stochastic production function specification naturally possesses
important advantages in formulating statistically estimable versions of g. That is
most easily seen by considering its simplest version,

zs = g o (x) + εs , s ∈ S.

Realized values of the random variable z, in this specification, decompose into

two components. One, g o (x) , is deterministic with its parameters amenable to
parametric estimation using distributional assumptions on the other, zs − E [ε] .
Studies using some version of (7) are varied and include theoretical and empirical
analyses (see, e.g., Fuller [19], Feldstein [18], de Janvry [16], Stiglitz [? ], Batra
[4], Just and Pope [24] and [25], Moscardi and de Janvry [32], Holmström [22], Antle
[1], Lapan and Moschini [28], Pope and Chavas [34], Chavas and Holt [14], and
Pope and Just [35] and [36]).
The cost of econometric tractability, however, may be economic plausibility.
Chambers and Quiggin [10] provide a detailed critique of the economic implications
of the stochastic production function model, which they refer to as output-cubical,
and an axiomatic derivation of it in a decision-theoretic setting. Not the least of
their critiques is the lack of analytic tractability that the “kinky” nature of this
specification imposes on function representations of T . But there are also other
conceptual problems.
To give a flavor of some of these, yet another agricultural example is useful. Let
z correspond to a stochastic crop output, x to the vector of inputs controlled by
the producer, and ε to a composite of inputs beyond the producer’s control such
as natural moisture (m ∈ RS ), radiation (r ∈ RS ), and pest infestation (b ∈ RS ).
Specifically, set

εs = ϕ (ms , rs , bs ) , s∈S

so that

zs = g (x, ϕ (ms , rs , bs )) , s ∈ S.

6 Most applications treat the case where S is infinite-dimensional and thus take ε to be an interval
of the real line.

This specification requires the state-specific inputs, (ms , rs , bs ) , to be weakly

separable from x. So, for example, if x contains pesticides, the marginal rate
of substitution between pest infestation and moisture must be independent of
the amount of pesticides applied. Similarly, if x contains chemical fertilizer, the
marginal rate of substitution between natural moisture and radiation must be
independent of fertilizer. Both are restrictive if not implausible.
The familiar Edgeworth box and Lerner-Pearce diagrams, which provide the
backdrop for the classic derivations of the Stolper-Samuelson theorem, the Rybczyn-
ski theorem, and the factor-price equalization theorem, suggest a natural way to
move from the representation in Fig. 8 to that in Fig. 4. In a nonstochastic setting,
the formal framework is the input-nonjoint production model (see, e.g., Chambers
[5], pp. 286–8). Translated to a stochastic setting, the input-nonjoint framework
models Z (x) as

\[ Z^{N}(x) = \left\{ z \in \mathbb{R}^S : z_s \le r_s(x_s) \text{ for all } s \in S, \; x = \sum_{s} x_s \right\}. \]

Here each rs : RN → R is a state-specific production function that maps the

input bundle allocated to s, xs ∈ RN , into a state-specific output. The associated
random output is r (x) = (r1 (x1 ) , ..., rS (xS )) . The main difference from the
output-cubical model is that the total bundle of inputs, x, is now allocable across
state-specific tasks. Thus, the random variable r (x) changes as the allocation,
(x1 , x2 , ..., xS ), changes even if x is held constant. Chambers and Quiggin [10]
call Z N the state-allocable model.
Figure 9 shows how this ability to allocate inputs to different s-specific tasks
affects the stochastic production possibilities frontier. Suppose that only one input,
call it effort, exists. Suppose further that the production process is agricultural and
that the crucial factor determining S is the level of natural moisture. It’s easy to
imagine that too much moisture is a bad thing because it will drown the plants
or animals involved. Conversely, too little moisture is also a bad thing. Thus, in
choosing how to apply her or his effort, the farmer must decide on how to allocate
that effort to prepare for different levels of moisture.
Suppose that the effort allocation choice is reflected by x1 + x2 = x in Fig. 9. For
that allocation, maximal production is determined by (r1 (x1 ) , r2 (x2 )) as illustrated
in the figure. A rectangular stochastic production possibilities frontier emerges
for that allocation. Now suppose the same farmer almost simultaneously receives
a flash of insight (or a long-term forecast) that the likelihood of bad drought
conditions had diminished while the likelihood of massive flooding had dramatically
increased. If, given the chance to change the effort allocation without changing
total x, the farmer responds by changing the allocation to (x1∗ , x2∗ ), the
state-allocable model permits the maximum output bundle to move to (r1 (x1∗ ), r2 (x2∗ )),
representing a different stochastic output mix and a different random variable. The
output-cubical model would permit no such reallocation. Extending the thought
experiment to all possible moisture levels, the effort reallocations define a continuum
of new rectangular production possibilities frontiers, with the outer vertexes of
these rectangles tracing out a representation approximating Fig. 4.
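
The tracing-out argument can be illustrated with a small sketch. It assumes a single input (effort) with total x = 1, two states, and hypothetical state-specific production functions r_s(x_s) = a_s √x_s; these functional forms are assumptions chosen for illustration.

```python
import math

A = (1.0, 2.0)   # hypothetical productivity parameters a_1, a_2 (assumption)


def vertex(share, x_total=1.0):
    """Outer vertex (r_1(x_1), r_2(x_2)) when a fraction `share` of x goes to state 1."""
    x1 = share * x_total
    x2 = x_total - x1
    return (A[0] * math.sqrt(x1), A[1] * math.sqrt(x2))


def on_envelope(z, x_total=1.0, tol=1e-9):
    """Check (z_1/a_1)^2 + (z_2/a_2)^2 = x, the smooth frontier traced by the vertices."""
    return abs((z[0] / A[0]) ** 2 + (z[1] / A[1]) ** 2 - x_total) < tol


# Sweep the allocation; each vertex is the corner of one rectangular frontier.
vertices = [vertex(k / 10.0) for k in range(11)]
for v in vertices:
    print(v, on_envelope(v))
```

Each allocation alone yields a rectangular (output-cubical) frontier, but under these assumed forms the vertices taken together lie on a smooth, quarter-ellipse frontier.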
For real-world settings, neither the output-cubical model nor the state-allocable
model is likely correct. The same is probably true for any specific functional
structure that one chooses. Paraphrasing W.M. Gorman [20], choosing a functional
structure is about choosing what part of the model to analyze in detail and what to
leave to later, and hopefully more thorough, analysis.
Several points bear emphasis. First, choosing a specific structure limits the issues
that can be examined. For example, the output-cubical structure predetermines the
degree of output transformability. Thus, the rate of output transformability cannot
be informatively investigated using that structure.
Second, states of Nature are neither ex ante “good” nor “bad.” It can and often
will depend upon how one prepares for them. Insurance provides an example.
Purchasing insurance, formally speaking, is a decision to gamble, just as is
purchasing a lottery ticket. You pay a premium, and if a particular state occurs, you
get an indemnity back. Otherwise, you get nothing. If you receive no indemnity, ex
post you have less money than if you had not taken the insurance gamble and, in
that sense, you have lost the gamble.
Most people likely view not collecting on insurance as a good thing. Nevertheless,
you have clearly lost the gamble that something bad would happen. Reversing
the analogy, purchasing a lottery ticket, usually perceived as gambling, is formally
equivalent to buying insurance for the unlikely event that your chosen number gets
picked. So, too, for bingo and shaking hands with one-armed bandits. Formally, they
can be viewed as insurance activities.
The same phenomenon applies for stochastic production systems. States are not
fundamentally bad or good. Instead, they reflect Nature’s role in the production
process. How one prepares for them determines whether a good or a bad outcome
emerges. Figure 9 again illustrates. Continuing our production parable, suppose
that the farmer responds to his or her insight that flooding was more likely by
reallocating effort. Output in the reallocated situation (associated with ∗) is higher

Fig. 9 State-allocable production possibilities

in state 2 than in state 1. Thus, if that allocation were chosen, one expects the farmer
to be hoping that 2 occurs because it involves a higher output. Contrast that with the
original allocation. There output in 1 is higher than in 2. Then one naturally expects
the farmer to be hoping that 1 occurs.7 Bad conditions, when properly prepared
for, can lead to good outcomes just as gambling on a bad outcome (insurance)
occurring can ameliorate losses. The output-cubical specification flips this logic on
its head. Rather than the producer being able to undertake preventative actions that
affect which state is “good” or “bad,” Nature alone decides. The result is a technical
specification that, by its very nature, underestimates (if not eliminates) any potential
for technically efficient self-protective activities.

Stochastic Production Decisions

Our next step is to integrate the stochastic production model into producer decision-
making. In this section, we first present an objective function for the producer. It
includes as special cases virtually all producer objective functions that have been
studied in uncertain decision settings. Important examples include risk-neutral,
subjective expected-utility preferences, and mean-variance structures consistent
with first-order stochastic dominance. We then discuss the primitives of the pro-
ducer’s decision set, demonstrate a fundamental result underlying optimal producer
behavior, and relate that behavior to net present value decision rules. The final
subsection develops an equivalent, but probability-free, version of the decision
model and relates it to fundamental results for competitive, but nonstochastic,
markets.

Producer Preferences

Period 0 wealth (income) of the producer is exogenously determined and denoted
$i^0$. Producer preferences over uncertain period 1 consumption, $y^1 \in \mathbb{R}^S$, and
nonstochastic period 0 consumption, $y^0$, are complete and given by $W : \mathbb{R}^{S+1} \to \mathbb{R}$
that is strictly increasing and continuous in $(y^1, y^0)$. For intuitive simplicity, we also
frequently assume W is quasi-concave in $y^1$. Quasi-concavity imposes a generic
form of risk aversion (Debreu [17]) that contains more familiar notions as special
cases.
Although one can operate in terms of W, its existence ensures the existence of
an even-more convenient cardinal representation, known as its certainty equivalent,
and defined by

7 At this juncture, it would be a good exercise for you to revisit Fig. 9 and the associated intuitive
discussion in an attempt to determine whether 1 is associated with more or less moisture than 2.

\[ e(y^1, y^0) = \min \{ \beta \in \mathbb{R} : W(\beta 1^S, y^0) \ge W(y^1, y^0) \} \]

if there exists $\beta \in \mathbb{R}$ such that $W(\beta 1^S, y^0) \ge W(y^1, y^0)$, and ∞ otherwise. The certainty
equivalent gives the amount of the riskless asset, $1^S$, that, when combined with
a period 0 consumption of $y^0$, leaves the producer indifferent to $(y^1, y^0)$. It is strictly increasing and
continuous in $(y^1, y^0)$ and satisfies

\[ e(y^{1*}, y^{0*}) \ge e(y^1, y^0) \iff W(y^{1*}, y^{0*}) \ge W(y^1, y^0) \quad \text{(Preference Indication)}, \]

so that it is functionally equivalent to W. When W is quasi-concave in y 1 , e is

concave in y 1 and thus differentiable almost everywhere.
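
To make the definition concrete, the certainty equivalent can be computed by bisection on β. The following is a minimal sketch, assuming two equally likely states and a hypothetical W built from log utility; neither the functional form nor the probabilities come from the chapter.

```python
import math

PROB = (0.5, 0.5)   # hypothetical subjective probabilities (assumption)


def W(y1, y0):
    """Hypothetical preferences: log utility today plus expected log utility tomorrow."""
    return math.log(y0) + sum(p * math.log(ys) for p, ys in zip(PROB, y1))


def certainty_equivalent(y1, y0, lo=1e-9, hi=1e9, iters=200):
    """Smallest beta with W(beta * 1^S, y0) >= W(y1, y0), found by bisection."""
    target = W(y1, y0)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if W([mid] * len(y1), y0) >= target:
            hi = mid   # mid is feasible, so the minimum is at or below it
        else:
            lo = mid
    return hi


print(certainty_equivalent([1.0, 4.0], 1.0))   # close to 2.0, below the mean of 2.5
```

For this assumed W, the certainty equivalent of the consumption profile (1, 4) is its geometric mean, 2, which lies below the expected value of 2.5 and so reflects risk aversion.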

Cost Minimization, Duality, Risk-Neutral Probabilities, Fisher Separation, and More Duality

We assume the producer faces competitive period 0 markets for the inputs. Their
market-determined prices are denoted w ∈ RN ++ . Period 1 price for the stochastic
output z is stochastic but determined exogenously to the producer and denoted p ∈
RS++ . The producer thus views herself or himself as a price taker both for w and for
p.
The producer’s task is to determine how to allocate i 0 to producing stochastic
output to be sold in period 1 at the stochastic spot price of p, how much to consume
in period 0, and how much to consume in period 1. The producer faces two period-
specific budget constraints

\[ \begin{aligned} i^0 &\ge y^0 + w^\top x, \\ p \cdot z &\ge y^1. \end{aligned} \]

(For two random variables m and n, the notation m · n ∈ RS denotes the random
variable formed as the element-wise product of the two random variables, so that
m·n = (m1 n1 , ..., mS nS ) .) The first inequality requires that period 0 consumption
not exceed income available for consumption, $i^0 - w^\top x$. And the second ensures
that stochastic period 1 income always is at least as large as stochastic period 1
consumption. Put another way, period 1 income first-order stochastically dominates
period 1 consumption. Hence, the producer’s problem is to solve

\[ \max_{(x, z, y)} \left\{ e(y^1, y^0) : i^0 \ge y^0 + w^\top x, \; p \cdot z \ge y^1 \right\}. \tag{8} \]

The following result is immediate (Chambers and Quiggin [5, 10]):


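Theorem 1. (Chambers and Quiggin) The producer’s optimal production decisions
satisfy

\[ \max \left\{ e\left( p \cdot z, \; i^0 - c(w, z) \right) \right\}, \]

where:

\[ \begin{aligned} c(w, z) &= \min_x \{ w^\top x : x \in X(z) \} \\ &= \min_x \{ w^\top x : z \in Z(x) \} \\ &= \min_x \{ w^\top x : D(z, x) \le 0 \} \end{aligned} \]

if X (z) ≠ ∅ and ∞ otherwise.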

Theorem 1 shows that solving the producer’s problem decomposes, as usual, into
two components. In the first, producers solve the cost-minimizing problem for z.
And, once that problem is solved, producers pick their optimal stochastic output to
maximize their preferences. The producer cost function, c (w, z) , is nondecreasing,
closed, concave, and positively homogeneous in w. It also satisfies Shephard’s
Lemma. Namely, its superdifferential in w contains the cost-minimizing solutions,
and when that superdifferential is a singleton set (implying differentiability in w),
the cost-minimizing solution is unique and equals the corresponding gradient. Con-
versely, if the cost-minimizing solution is unique, the cost function is differentiable
in w, and the gradient equals that solution.
An immediate consequence of the properties of c (w, z) is that the correspon-
dence, X̂ : RS ⇒ RN , generated by c (w, z) as

\[ \hat{X}(z) = \{ x \in \mathbb{R}^N : w^\top x \ge c(w, z) \text{ for all } w \in \mathbb{R}^N_{++} \} \]

possesses images that are closed, convex sets satisfying $\hat{X}(z) + \mathbb{R}^N_+ \subset \hat{X}(z)$ (free
disposability of inputs), with X (z) ⊂ X̂ (z) . Moreover, if X (z) satisfies these same
properties, then X̂ (z) = X (z) . In other words, standard duality relationships apply
for input correspondences associated with stochastic technologies.
A theorem stating that competitive producers minimize cost and that dual
relations apply might strike some as pedantic. After all, the producer cost function
is one of the first concepts to which we introduce our students. But the fact remains
that the relevance of the cost minimization and dual cost structures for producers
facing stochastic production possibilities was still being questioned until the middle
of the last decade of the twentieth century [23, 34, 35].
Theorem 1 establishes that producers do minimize cost when production is
uncertain. Moreover, these cost functions are fully dual to input correspondences

satisfying familiar properties. The confusion that did exist on this issue seems to
have emerged from an exclusive focus on the stochastic production function model
without a formal specification of its measure-theoretic underpinnings in terms of
a state space, its associated measurable events, and the definition of random
variables as mappings from the state space to the reals.
Suppose, to the contrary, that a producer decided not to minimize the cost of
producing z. Instead of incurring c (w, z) to produce z, he or she would incur
some other strictly greater cost level, ĉ, to produce that same output, z. That,
however, would require the producer to forego period 0 consumption in the amount
of ĉ − c (w, z) > 0 while generating an unchanged stochastic income of p · z!
In language borrowed from financial economics, the producer ignores an arbitrage
opportunity. Such behavior is always inconsistent with the choices of a rational
economic individual regardless of their risk preference or the structure of W .

Example 1. To illustrate the derivation of the producer’s cost function, we use the
specific example of a stochastic production function. The input correspondence for
that technology is given by the lower inverse to $Z^{pf}$,

\[ (Z^{pf})^{-}(z) = \{ x \in \mathbb{R}^N : z_s \le f_s(x) \text{ for all } s \in S \} = \bigcap_{s \in S} \{ x \in \mathbb{R}^N : z_s \le f_s(x) \}, \]

which corresponds to the intersection of the input sets for the S state-specific
technologies. It follows immediately that the minimal cost associated with this
technology can be no lower than the minimal cost associated with the most
expensive of the S state-specific technologies. That is, the cost function obeys the
maximin criterion

\[ c(w, z) \ge \max_{s \in S} \min \{ w^\top x : z_s \le f_s(x) \}, \]

that implies an isocost structure of the same generic form as $Z^{pf}(x)$.
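
The inequality in Example 1 can be strict. The sketch below assumes a deliberately simple, hypothetical technology with two non-negative inputs and two states in which f_s(x) = a_s x_s, so that each state's output depends on a different input; it is an illustration under these assumptions rather than a model from the chapter.

```python
A = (1.0, 2.0)   # hypothetical state-specific productivities a_s (assumption)


def state_cost(w, s, z_s):
    """Minimal cost of meeting state s's target alone: only input s is needed (x >= 0)."""
    return w[s] * max(z_s, 0.0) / A[s]


def cost(w, z):
    """c(w, z): the cheapest bundle lying in every state-specific input set."""
    return sum(state_cost(w, s, zs) for s, zs in enumerate(z))


def maximin_bound(w, z):
    """Max over states of the minimal cost of meeting that state's target alone."""
    return max(state_cost(w, s, zs) for s, zs in enumerate(z))


w = (1.0, 1.0)
z = (3.0, 4.0)
print(cost(w, z), maximin_bound(w, z))   # 5.0 and 3.0: the bound holds strictly
```

With a single input and increasing f_s, by contrast, the state-specific input sets are nested, and the bound holds with equality.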

For the sake of a familiar argument, our discussion of the producer’s problem
assumes that $e(y^1, y^0)$ and c (w, z) are smooth and that all solutions are interior.8
The producer’s first-order conditions then require

\[ \nabla_1 e(y^1, y^0) \cdot p = \frac{\partial e(y^1, y^0)}{\partial y^0} \nabla_z c(w, z) \tag{9} \]

8 These conditions are easily relaxed. See, for example, Chambers and Quiggin ([10–12], 2010),
Chambers [6], and Chambers and Voica [13].

where $\nabla_1 e(y^1, y^0)$ denotes the gradient of e in $y^1$ and $\nabla_z c(w, z)$ denotes the
gradient of c in z.
Some familiar transformations borrowed from the finance literature (e.g.,
Cochrane [15] or LeRoy and Werner [29]) permit translating these conditions into
net present value terms. Define the producer’s subjective intertemporal discount
factor, denoted as $\delta(y^1, y^0) \in \mathbb{R}$, as the marginal riskless variation in period 1
consumption needed to balance a small change in period 0 consumption, ε,

\[ e\left( y^1 + \delta(y^1, y^0) 1^S, \; y^0 - \varepsilon \right) = e(y^1, y^0). \]

Letting ε → 0 and using the implicit function theorem gives

\[ \delta(y^1, y^0) = \frac{\partial e / \partial y^0}{\nabla_1 e(y^1, y^0)^\top 1^S}, \]

where the right-hand side reveals that $\delta(y^1, y^0)$ is also interpretable as the marginal
rate of substitution between one unit of period 0 consumption and one unit of
the riskless asset in period 1. Next define the producer’s risk-neutral probability
measure, denoted by $\pi(y^1, y^0) \in \mathbb{R}^S$, as

\[ \pi(y^1, y^0) = \frac{\nabla_1 e(y^1, y^0)}{\nabla_1 e(y^1, y^0)^\top 1^S}. \]

First, as the notation indicates, both $\delta(y^1, y^0)$ and $\pi(y^1, y^0)$ are local measures
and depend upon $e(y^1, y^0)$. Therefore, both are subjectively determined. Two
producers with the same consumption profile $(y^1, y^0)$ can easily possess different
subjective intertemporal discount factors and/or risk-neutral probability measures.
Second, both are intuitive devices that make the producer’s decision-making more
accessible to individuals accustomed to using net present value methods.
In that context, $\delta(y^1, y^0)$ represents the return on the riskless asset that would
leave the producer just indifferent between a dollar today and a dollar tomorrow.
And $\pi_s(y^1, y^0) / \pi_{s'}(y^1, y^0)$ gives the relative odds of states s and s′ that would
form an expected net present value producer’s marginal rate of substitution between
period 1 incomes in states s and s′.
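
The risk-neutral probabilities can be computed numerically from any smooth certainty equivalent. The following is a minimal sketch, assuming two equally likely states and a hypothetical certainty equivalent equal to the probability-weighted geometric mean of period 1 consumption. Because this illustrative form does not vary with period 0 consumption, the sketch computes only the risk-neutral probabilities and not the discount factor.

```python
import math

PROB = (0.5, 0.5)   # hypothetical subjective probabilities (assumption)


def e(y1, y0):
    """Hypothetical certainty equivalent: probability-weighted geometric mean of y1.

    This illustrative form ignores y0, so it cannot be used to compute delta.
    """
    return math.exp(sum(p * math.log(ys) for p, ys in zip(PROB, y1)))


def gradient_in_y1(y1, y0, h=1e-6):
    """Central-difference approximation to the gradient of e in y1."""
    grad = []
    for s in range(len(y1)):
        up, down = list(y1), list(y1)
        up[s] += h
        down[s] -= h
        grad.append((e(up, y0) - e(down, y0)) / (2.0 * h))
    return grad


def risk_neutral_probabilities(y1, y0):
    """pi(y1, y0): the gradient of e in y1, normalized to sum to one."""
    grad = gradient_in_y1(y1, y0)
    total = sum(grad)
    return [g / total for g in grad]


pi = risk_neutral_probabilities([1.0, 4.0], 1.0)
print(pi)   # approximately [0.8, 0.2]
```

Although the two states are equally likely under the assumed subjective probabilities, the risk-neutral measure places more weight on the state with lower consumption.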
Using these transformations and (9) gives

\[ \frac{\pi(y^1, y^0)}{\delta(y^1, y^0)} \cdot p = \nabla_z c(w, z). \tag{10} \]

Expression (10) forms the basis of:

Theorem 2. (Fisher separation) The producer’s optimal consumption $(y^1, y^0)$ and
production decisions satisfy

\[ c^*\left( w, \frac{\pi(y^1, y^0)}{\delta(y^1, y^0)} \cdot p \right) \equiv \sup_z \left\{ \left( \frac{\pi(y^1, y^0)}{\delta(y^1, y^0)} \right)^{\top} (p \cdot z) - c(w, z) \right\}. \]

Here $c^*\left( w, \frac{\pi(y^1, y^0)}{\delta(y^1, y^0)} \cdot p \right)$, which is the convex conjugate of c (w, z) , is a net
present value profit function defined in terms of period 0 input prices, w, and
discounted period 1 state-specific output prices, $\frac{\pi(y^1, y^0)}{\delta(y^1, y^0)} \cdot p$. It is positively
homogeneous and convex in its arguments and satisfies versions of Hotelling’s Lemma so
that optimal derived demands and state-specific supplies can be recaptured as either
subdifferentials or gradients of $c^*$ in $\left( w, \frac{\pi(y^1, y^0)}{\delta(y^1, y^0)} \cdot p \right)$. Thus, when unique optimal
derived demands and state-specific supplies exist, they can be written, respectively,
as

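$$\begin{aligned}
x\left(w, \frac{\pi(y^1, y^0)}{1 + \delta(y^1, y^0)} \cdot p\right) &= -\nabla_w c^*\left(w, \frac{\pi(y^1, y^0)}{1 + \delta(y^1, y^0)} \cdot p\right) \\
&= -\nabla_w c\left(w, z\left(w, \frac{\pi(y^1, y^0)}{1 + \delta(y^1, y^0)} \cdot p\right)\right),
\end{aligned} \tag{11}$$

and

$$z\left(w, \frac{\pi(y^1, y^0)}{1 + \delta(y^1, y^0)} \cdot p\right) = \nabla_{\frac{\pi(y^1, y^0)}{1 + \delta(y^1, y^0)} \cdot p}\, c^*\left(w, \frac{\pi(y^1, y^0)}{1 + \delta(y^1, y^0)} \cdot p\right). \tag{12}$$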

The convexity properties of $c^*$ ensure that the Hessian of $c^*$ in $\left(w, \frac{\pi(y^1, y^0)}{1 + \delta(y^1, y^0)} \cdot p\right)$
is positive semi-definite, thus yielding familiar comparative-static results for optimal
demands and state-specific supplies in $\left(w, \frac{\pi(y^1, y^0)}{1 + \delta(y^1, y^0)} \cdot p\right)$. It also follows
immediately from (11) that optimal derived demand behavior, following Sakai [38], can
be decomposed into pure substitution effects and pure expansion effects.9
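The Hotelling-type recovery of supplies and demands from $c^*$ can be checked numerically. The sketch below is a hedged illustration, not the chapter's model: the quadratic cost function, the single input with price `w`, and the vector `q` of discounted state-specific prices are hypothetical choices made only so that the conjugate has a simple closed form. Finite differences of the conjugate are compared with the known maximizer.

```python
def cost(w, z):
    # Hypothetical cost function (an assumption): c(w, z) = (w / 2) * sum_s z_s^2
    return 0.5 * w * sum(zs * zs for zs in z)

def profit(w, q):
    # Conjugate c*(w, q) = sup_z { q'z - c(w, z) }.
    # For the quadratic cost above, the maximizer is z_s = q_s / w.
    z = [qs / w for qs in q]
    return sum(qs * zs for qs, zs in zip(q, z)) - cost(w, z)

def gradient_q(w, q, h=1e-6):
    # Central finite differences of c* with respect to each discounted price.
    grad = []
    for s in range(len(q)):
        up = list(q)
        down = list(q)
        up[s] += h
        down[s] -= h
        grad.append((profit(w, up) - profit(w, down)) / (2.0 * h))
    return grad

def derivative_w(w, q, h=1e-6):
    # Central finite difference of c* with respect to the input price.
    return (profit(w + h, q) - profit(w - h, q)) / (2.0 * h)

w = 2.0
q = [0.25, 0.60, 0.24]          # hypothetical discounted state-specific prices

supplies = gradient_q(w, q)     # state-specific supplies, as in (12)
demand = -derivative_w(w, q)    # derived demand, first equality in (11)

print(supplies, demand)
```

With this cost function the recovered supplies equal `q[s] / w` and the recovered demand equals the input requirement `0.5 * sum(z_s ** 2)` evaluated at those supplies.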
The convex conjugate of $c^*$ defined by

$$c^{**}(w, z) = \sup_{q \in \mathbb{R}^S} \left\{ q^\top z - c^*(w, q) \right\}$$

is a closed, convex function of $z$ that is closed, concave, nondecreasing, and
positively homogeneous in $w$ (Rockafellar [37]). Moreover, if $c(w, z)$ itself is
closed and convex in $z$, $c(w, z) = c^{**}(w, z)$, so that standard duality results also
apply for net present value profit functions.

9 Although it is not discussed here, one can alternatively express the Fisher separation theorem in
terms of a revenue function defined over discounted period 1 prices and x. That derivation, in turn,
allows for a decomposition of present value maximizing supplies in terms of substitution and scale
effects.
The intuition behind Fisher separation is not that producers maximize the
expected net present value of producing their stochastic output. Rather, Theorem 2
asserts that producers choose their stochastic output in the same manner as would an
expected net present value maximizer using the subjectively determined stochastic
pricing kernel, $\frac{\pi(y^1, y^0)}{\delta(y^1, y^0)} \in \mathbb{R}^S$, to discount period 1 stochastic incomes.
Terminologies regarding pricing kernels for stochastic assets can differ. The
general idea is that a pricing kernel should discount the period 1 stochastic return
to equal its period 0 price. For example, if $A \in \mathbb{R}^S$ is the period 1 stochastic return
from an asset with a period 0 acquisition price of 1 and the stochastic pricing
kernel is denoted $k \in \mathbb{R}^S$, then $k^\top A = 1$. An alternative definition works in
terms of expectations inner products and often invokes the terminology stochastic
discount factor. In this setting, if $m \in \mathbb{R}^S$ is a stochastic discount factor, then
$E_\pi[mA] = (\pi \cdot m)^\top A = 1$. That is, the expected discounted value of the asset
equals its period 0 acquisition price. Thus, if $k$ is a stochastic pricing kernel,
$\left(\frac{k_1}{\pi_1}, \ldots, \frac{k_S}{\pi_S}\right) \in \mathbb{R}^S$ is a stochastic discount factor for the probability measure $\pi$.
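The relation between the two definitions can be verified with a few lines of arithmetic. In the sketch below every number is hypothetical: a kernel `k`, a probability measure `pi`, and a payoff vector that is rescaled so that its period 0 acquisition price is 1. It then checks that the state-by-state ratio of kernel to probability prices the same asset in expectation.

```python
def inner(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# All values below are hypothetical and chosen only for illustration.
k = [0.30, 0.45, 0.20]       # stochastic pricing kernel
pi = [0.25, 0.50, 0.25]      # probability measure over S = 3 states
raw_payoff = [1.0, 1.2, 0.8]

# Rescale the payoff so that its period 0 acquisition price is 1.
raw_price = inner(k, raw_payoff)
A = [a / raw_price for a in raw_payoff]

# Stochastic discount factor implied by the kernel: m_s = k_s / pi_s.
m = [ks / ps for ks, ps in zip(k, pi)]

kernel_price = inner(k, A)   # k'A
expected_price = sum(ps * ms * a for ps, ms, a in zip(pi, m, A))  # E_pi[mA]

print(kernel_price, expected_price)
```

Both valuations agree because multiplying `m` by `pi` state by state recovers `k`.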
In the current context, if one knows $\frac{\pi(y^1, y^0)}{\delta(y^1, y^0)}$, the optimal stochastic production
choice is the one that maximizes expected net present value. Because
$c^*\left(w, \frac{\pi(y^1, y^0)}{\delta(y^1, y^0)} \cdot p\right)$ inherits all of the properties of a standard profit function,
Fisher separation is both analytically and empirically convenient. Moreover, the
existence of a technology fully dual to $c^*\left(w, \frac{\pi(y^1, y^0)}{\delta(y^1, y^0)} \cdot p\right)$ ensures that no true
generality is lost by operating in terms of $c^*$ (rather than $c$, $D$, or any other
representation of $T$) provided that $e(y^1, y^0)$ is increasing in $(y^1, y^0)$. A producer
optimizing $e$ with respect to $c$ and then with respect to $c^*$ makes the same economic
choices for both. Thus, just as in standard producer decision-making under certainty,
rational producers effectively “skip over” nonconvexities in technologies, and the
“true” technology and the dual technology are observationally equivalent.
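The way conjugation "skips over" nonconvexities can be seen on a grid. The sketch below is a minimal illustration under stated assumptions: a hypothetical one-dimensional cost function that is concave on part of its domain, and coarse grids for output `z` and discounted price `q`. It computes a discrete conjugate and biconjugate and compares the biconjugate with the original cost.

```python
import math

def cost(z):
    # Hypothetical cost in one state-contingent output (an assumption):
    # concave (nonconvex) on [0, 1], convex afterwards.
    return math.sqrt(z) if z <= 1.0 else z * z

zs = [i / 100 for i in range(201)]   # output grid on [0, 2]
qs = [j / 100 for j in range(401)]   # discounted price grid on [0, 4]

# Discrete conjugate: c*(q) = max_z { q z - c(z) }
conj = [max(q * z - cost(z) for z in zs) for q in qs]

# Discrete biconjugate: c**(z) = max_q { q z - c*(q) }
biconj = [max(q * z - cq for q, cq in zip(qs, conj)) for z in zs]

# Compare the original and the dual cost at z = 0.5 (nonconvex region)
# and at z = 1.5 (convex region).
print(cost(zs[50]), biconj[50])
print(cost(zs[150]), biconj[150])
```

On the nonconvex segment the biconjugate lies strictly below the original cost and follows the chord between its endpoints, while on the convex segment the two coincide. A price-taking optimizer would never choose a point in the gap, which is the sense in which the two technologies are observationally equivalent.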
Traditional shadow-pricing arguments now reveal that the producer’s appropriate
shadow values for $z$ are determined by its marginal cost, $\nabla_z c(w, z) \in \mathbb{R}^S$. A closely
related observation is obtained by rewriting (10) as

$$\frac{\pi(y^1, y^0)}{\delta(y^1, y^0)} = \nabla_z c(w, z) \,/\cdot\, p,$$

where $/\cdot$ denotes element-wise division. The left-hand side of this expression is the
appropriate stochastic pricing kernel for an individual with preferences $e(y^1, y^0)$.
But, in equilibrium, that realization reveals that $\nabla_z c(w, z) \,/\cdot\, p \in \mathbb{R}^S$ will represent
an appropriate virtual (shadow) stochastic pricing kernel.

Revenue Cost and Graphical Illustration of Producer Equilibrium

The concepts of risk-neutral probabilities and Fisher separation are intuitively
handy, but they are by no means necessary for discussing producer decision-making
under uncertainty. This section shows how a probability-free approach to analyzing
producer behavior can be developed. A direct consequence of Theorem 1 is:

Corollary 1. The producer’s optimal consumption decisions satisfy:

$$\max_{y^1} \; e\left(y^1, \; i - c(w, y^1 /\cdot\, p)\right).$$

$c(w, y^1 /\cdot\, p)$ is a special case of the revenue-cost function introduced by
Chambers and Quiggin [10]. It gives the minimal producer cost of obtaining a period
1 consumption $y^1$ given $p$ and $w$.10 In the smooth case, the associated first-order
conditions for the producer’s optimization problem are

$$\frac{\nabla_1 e(y^1, y^0)}{\partial e(y^1, y^0)/\partial y^0} = \nabla_z c(w, y^1 /\cdot\, p) \,/\cdot\, p, \tag{13}$$

which is fully equivalent to (9). (Substitute $y^1 /\cdot\, p$ for $z$ in the latter to obtain the
former.) Despite their equivalence, (13) motivates a subtly different interpretation
of optimality conditions that does not rely on probabilities. The expression on the
left-hand side of (13) equals the S-vector of marginal rates of substitution between
period 1 consumption in state s and period 0 consumption. The expression on the
right-hand side is the S-vector of period 0 marginal costs of raising consumption in
period 1 by one unit. Optimality, not surprisingly, requires equating the two.
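The probability-free optimality condition (13) can be checked on a small example. The sketch below rests entirely on hypothetical ingredients: a log-additive preference function with weights `a`, a quadratic cost function, and illustrative values for prices and period 0 income. For these choices the optimum has a closed form, so the code verifies that marginal rates of substitution equal price-deflated marginal costs state by state, and that nearby consumption plans yield lower utility.

```python
import math

# Hypothetical primitives (assumptions, not taken from the chapter).
w = 2.0                      # period 0 input price
income = 10.0                # the endowment "i" in Corollary 1
p = [1.0, 1.5, 0.8]          # state-specific output prices
a = [0.5, 0.3, 0.2]          # utility weights on period 1 consumption

def cost(z):
    # c(w, z) = (w / 2) * sum_s z_s^2
    return 0.5 * w * sum(zs * zs for zs in z)

def utility(y1):
    # e(y1, y0) = ln(y0) + sum_s a_s ln(y1_s), with y0 = i - c(w, y1 /. p)
    z = [ys / ps for ys, ps in zip(y1, p)]
    y0 = income - cost(z)
    return math.log(y0) + sum(ws * math.log(ys) for ws, ys in zip(a, y1))

# Closed-form optimum for these functional forms.
y0_star = income / (1.0 + 0.5 * sum(a))
y1_star = [ps * math.sqrt(ws * y0_star / w) for ps, ws in zip(p, a)]
z_star = [ys / ps for ys, ps in zip(y1_star, p)]

# Left-hand side of (13): (de/dy1_s) / (de/dy0) = a_s * y0 / y1_s
lhs = [ws * y0_star / ys for ws, ys in zip(a, y1_star)]
# Right-hand side of (13): marginal cost divided element-wise by price
rhs = [w * zs / ps for zs, ps in zip(z_star, p)]

print(lhs)
print(rhs)
```

Taking ratios of any two elements of `lhs` and `rhs` reproduces the equality between the marginal rate of substitution and the marginal rate of transformation across states.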
The ratio of any two elements of the left-hand side of (13), say $s$ and $s'$,
corresponds to

$$\frac{\partial e(y^1, y^0)/\partial y^1_s}{\partial e(y^1, y^0)/\partial y^1_{s'}},$$

the producer’s marginal rate of substitution between consumption in states $s$ and $s'$.
The corresponding ratio on the right-hand side is

$$\frac{p_{s'} \, \partial c(w, y^1 /\cdot\, p)/\partial z_s}{p_s \, \partial c(w, y^1 /\cdot\, p)/\partial z_{s'}},$$

the producer’s marginal rate of transformation between income in states $s$ and $s'$.

Thus, equilibrium corresponds to a tangency between the producer’s indifference
curve between period 1 consumption in different states and the stochastic production
possibilities frontier (as expressed in revenue terms). Figure 10 illustrates that
equilibrium as $(y_1^*, y_2^*)$.

Fig. 10 Producer equilibrium

10 The reader is referred to Chambers and Quiggin [10] for a more detailed and thorough treatment
of revenue-cost functions.

The equilibrium depicted in Fig. 10 is reminiscent of basic representation results
for general equilibria with representative producers and representative consumers.
The difference is that the equilibrium there depicted represents an “internal”
equilibrium for the producer that involves her or his assessment of optimal trade-offs
and arbitrage opportunities between her or his preference structure and the physical,
but stochastic, production technology.
The slope of the hyperplane defined by the tangency between the stochastic
production possibilities frontier and the indifference curve (labelled “Relative
State-Claim Prices”) is also expressible as

$$-\frac{\pi_s(y^1, y^0)}{\pi_{s'}(y^1, y^0)},$$

the relative risk-neutral probabilities (the relative risk-neutral odds) between states
$s$ and $s'$. The visual interpretation of the Fisher separation theorem is simply
that the producer chooses the stochastic output to ensure that the slope of the
illustrated hyperplane in Fig. 10 equals the rate of transformation between state-
specific outputs. When those relative risk-neutral odds are reinterpreted as relative
prices, the visual criterion for optimality is the same as for multiple-output profit
maximization.
By an exactly parallel logic, the solution to the “consumer” side of the producer’s
problem can be characterized as maximizing the utility from y 1 given a budget
constraint defined by the hyperplane in Fig. 10. In that context, the relative prices
are interpretable as subjective Arrow [2] state-claim prices – the period 0 prices
of options on 1 unit of period 1 income in each of the states. Thus, given these Arrow
state-claim prices or the parallel risk-neutral probabilities, an individual’s behavior
as a producer and a consumer can be “separated” and analyzed independently.
The obvious catch is that one must know $\frac{\pi(y^1, y^0)}{\delta(y^1, y^0)}$ to operationalize these
separation results. The market and price-taking behavior ensure separation between
producer and consumer behavior for the corresponding analysis for nonstochastic
technologies. But here the “market” is one internal to the producer, and unless
competitive markets exist for Arrow securities or options (so-called complete
markets), the relevant state-claim prices depend upon the solution to the producer’s
problem and commingle producer preferences and the technology. Thus, individuals
with different preference structures or with different technologies have different
stochastic pricing kernels.
Except for relatively trivial versions of S, producers do not face complete
markets for period 1 state-claims. More generally, markets will be incomplete. But
incompleteness is not the same as non-extant, and as Magill and Quinzii [31],
Chambers and Quiggin [9,12], and Chambers and Voica [13] demonstrate, different
circumstances can arise where incomplete financial markets yield situations that are
effectively complete from the producer’s perspective, and price discipline over state-
claims arises from competitive behavior. In those instances, separation is complete,
and producer and consumer behavior can be analyzed separately. Fully analyzing
producer behavior in the presence of such incomplete markets requires a complete
specification of the financial market structure that is beyond the scope of this chapter
and thus is not considered. Nevertheless, a straightforward consequence of these
results is that comparative-static analyses developed for dual profit structures and
indirect preference structures translate directly to a stochastic setting.

Concluding Remarks

This chapter presents a capsular depiction of a formal model of a stochastic
production technology. The basic model is that developed by Arrow [2] and
Debreu [17] as later extended by Chambers and Quiggin [10]. It has as an
important special case the standard nonstochastic multiple-output production model.
Alternative axioms, functional representations, and structural restrictions for the
stochastic technology are examined. The model is then used to characterize optimal
producer decision-making in a stochastic environment.

Cross-References

Bad Outputs
Distance Functions in Production Economics
Functional Structure and Aggregation

References
1. Antle JM (1987) Econometric estimation of producers’ risk attitudes. Am J Agric Econ
69:509–522
2. Arrow KJ (1953) Le Role des Valeurs Boursiers pour la Repartition la Meilleur des Risques.
Cahiers du Seminair d’Economie. CNRS, Paris
3. Arrow KJ (1964) The role of securities in the optimal allocation of risk bearing. Rev Econ Stud
31:91–96
4. Batra RN (1974) Resource allocation in a general equilibrium model of production uncertainty.
J Econ Theory 8:50–63
5. Chambers RG (1988) Applied production analysis: a dual approach. Cambridge University
Press, Cambridge
6. Chambers RG (2007) Valuing agricultural insurance. Am J Agric Econ 89:596–606
7. Chambers RG, Chung Y, Färe R (1996) Benefit and distance functions. J Econ Theory 70:407–
419
8. Chambers RG, Quiggin J (1992) A state-contingent approach to production under uncertainty.
Mimeo
9. Chambers RG, Quiggin J (1997) Separation and hedging results with state-contingent produc-
tion. Economica 64:187–209
10. Chambers RG, Quiggin J (2000) Uncertainty, production, choice, and agency: the state-
contingent approach. Cambridge University Press, New York
11. Chambers RG, Quiggin J (2008) Narrowing the no-arbitrage bounds. J Math Econ 44(1):1–14
12. Chambers RG, Quiggin J (2009) Separability of stochastic production decisions from producer
risk preferences in the presence of financial markets. J Math Econ 45:730–737
13. Chambers RG, Voica D (2017) “Decoupled” farm program payments are really decoupled: the
theory. Am J Agric Econ 99:773–782
14. Chavas J-P, Holt M (1996) Economic behavior under uncertainty: a joint analysis of risk
preferences and the technology. Rev Econ Stat 78:329–335
15. Cochrane JH (2001) Asset pricing. Princeton University Press, Princeton
16. de Janvry A (1972) The generalized power production function. Am J Agric Econ 54:234–237
17. Debreu G (1959) The theory of value. Yale University Press, New Haven
18. Feldstein M (1971) Production uncertainty with uncertain technology: some economic and
econometric implications. Int Econ Rev 12:27–36
19. Fuller W (1965) Stochastic fertilizer production functions for continuous corn. J Farm Econ
47:105–119
20. Gorman WM (1976) Tricks with utility functions. In: Artis MJ, Nobay AR (eds) Essays in
economic analysis. Cambridge University Press, New York
21. Haavelmo T (1943) The structural implications of simultaneous equations systems. Economet-
rica 11:1–12
22. Holmström B (1979) Moral hazard and observability. Bell J Econ 10:74–91
23. Just RE (1993) Discovering production and supply relationships: present status and future
opportunities. Rev Mark Agric Econ 61:11–40
24. Just RE, Pope RD (1978) Stochastic specification of production functions and economic
implications. J Econ 7:67–86
25. Just RE, Pope RD (1979) Production function estimation and related risk considerations.
Am J Agric Econ 61:277–284
26. Knight FH (1921) Risk, uncertainty, and profit. Augustus M. Kelley, New York
27. Kohli U (1983) Nonjoint technologies. Rev Econ Stud 50:209–219
28. Lapan H, Moschini G (1994) Futures hedging under price, basis, and production risk. Am J
Agric Econ 76:465–477
29. LeRoy SF, Werner J (2014) Principles of financial economics. Cambridge University Press,
Cambridge
30. Luenberger DG (1994) Dual Pareto efficiency. J Econ Theory 62:70–84

31. Magill M, Quinzii M (1996) Theory of incomplete markets. MIT Press, Cambridge
32. Moscardi E, de Janvry A (1977) Attitudes towards risk among peasants: an econometric
approach. Am J Agric Econ 59:710–716
33. Moschini G, Hennessey D (2001) Uncertainty, risk aversion and risk management for
agricultural producers. In: Gardner BL, Rausser GC (eds) Handbook of agricultural economics,
vol 1. Elsevier, pp 87–115
34. Pope RD, Chavas J-P (1994) Cost functions under production uncertainty. Am J Agric Econ
76:196–204
35. Pope RD, Just RE (1996) Empirical implementation of ex ante cost functions. J Econ 72:231–
249
36. Pope RD, Just RE (1998) Cost function estimation under risk aversion. Am J Agric Econ
80:296–302
37. Rockafellar RT (1970) Convex analysis. Princeton University Press, Princeton
38. Sakai Y (1974) Substitution and expansion effects in production economics: the case of joint
products. J Econ Theory 9:255–274
39. Savage LJ (1954) Foundations of statistics. Wiley, New York
40. United States Department of Defense (2002) DoD news briefing: Secretary Rumsfeld
and Gen. Myers, February 12, 2002. https://archive.defense.gov/Transcripts/Transcript.aspx?TranscriptID=2636
41. von Thünen JH (1826) Der Isolierte Staat. Pergamon Press, New York
15 Dynamic Analysis of Production
Spiro E. Stefanou

Contents
Introduction
Background
The Setting
Dynamic Optimization Frameworks
Adjustment Cost Model
Long History and Evolution
Incorporating Adjustment Costs into a Technology
Primal-Dual Theory Opportunities
Econometric Approaches
Nonparametric Approaches
Dynamic Generalizations of Modern Production Theory Concepts: Scale and Scope, Efficiency, Capacity, and Productivity
Efficiency
Capacity Utilization
Productivity Change
Non-convex Production Relationships
Network Approach
Conclusion
Cross-References
References

S. E. Stefanou
Food and Resource Economics Department, University of Florida, Gainesville, FL, USA
Wageningen University, Wageningen, Netherlands

© Springer Nature Singapore Pte Ltd. 2022

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_17

Abstract

A parable of economic life is that some factors can adjust rapidly while others
adjust slowly in a given time scale. Focusing on production analysis in the
dynamic setting leads us to emphasize a technology specification that permits
a theoretical construction amenable to empirical implementation. A historical
perspective on the framing of dynamic decision-making is reviewed. The
adjustment cost model of investment is the key
conceptual feature, as it can be incorporated into the formal structure of a
production technology, which offers the opportunity to exploit primal-dual
theory in both analysis and empirical implementation. An overview of empirical
formulations in both econometric (parametric) and nonparametric settings is
provided. The dynamic production decision environment allows explicitly for the
evolution of assets, implying that firms may not be in long-run equilibrium at a
given point in time. The dynamic generalizations of modern production theory
concepts measuring economic performance are reviewed given the need to
properly account for and value the factors that are out of equilibrium. Empirical
nonparametric and parametric approaches are addressed at length. While these
cases can be addressed relatively easily within a nonparametric, dynamic data
envelopment analysis setting, econometric formulations are a greater challenge.

Keywords

Adjustment cost · Dynamic production technology · Dynamic data
envelopment analysis · Econometric approaches · Economies of scope ·
Economies of scale · Capacity utilization · Productivity change · Dynamic
distance function · Quasi-fixed factors · Capital adjustment

Introduction

. . . if you don’t do the best you can with what you happen to have got, you’ll never do the
best you might have done with what you should have had . . . (Aris [7], p. 27)

This insightful statement captures the essence of production decision making
over time. Unpacking the elements of Aris’ single sentence is a driving force of this
review chapter. The dynamic decision environment necessarily involves the linkage
of current decisions to future opportunities, leading us back to Aris’ quote, which
is a restatement of Bellman’s Principle of Optimality.1 Optimal solutions are linked
forward “... to the best we might have done ...” and backward “... with what
you should have had ....”

1 Bellman’s principle of optimality states: “An optimal policy has the property that whatever the
initial state and initial decision are, the remaining decisions must constitute an optimal policy with
regard to the state resulting from the first decision” ([10], p. 83).

We take a step back from the basic economic setting to address the case where
the time scale of decisions and the presence of factors of production that are still
productive beyond the data-reporting period add another layer impacting how we
characterize decision making. For the analyst, economic data arrive at different time
scales than the decisions. The nature of inputs that contribute to a dynamic context for
production is a fact of life in economic decision making. Durable factors of production,
which typically include structures and machinery, have a productive life beyond the
reported decision period, and extracting the full value of these factors is the decision
maker’s challenge.
This leads us to reconsidering the technology characterization to articulate how
decisions are linked over time and the mechanisms influencing these linkages.
Technologies and behavior need to be specified. A dynamic production environment
can be characterized as one where current production decisions impact future
production possibilities. Consequently, the dynamic perspective of production rela-
tionships necessarily involves the close interplay between stock and flow elements
in the input-output transformation process, and how current decisions impact the
changes in future stocks. Stock elements in the production transformation process
can involve physical elements, such as the volume of capital (buildings, machinery,
soil nutrient endowment, etc.) that can be effectively employed in the input-output
transformation process as well as the stock of technical knowledge and expertise
available to the decision maker during the decision period.
This chapter is organized as follows. The next section reviews the historical
background on how the literature addresses dynamic aspects of production decision
making and the interplay between the short and long run. This is followed by
an extended presentation of the adjustment cost model of investment, focusing
on how it can be incorporated into the formal structure of a production
technology, which offers the opportunity to exploit primal-dual theory in both
analysis and empirical implementation. An overview of empirical formulations in
both econometric (parametric) and nonparametric settings is then provided. Next is the
review of dynamic generalizations of modern production theory concepts measuring
economic performance. Notions of scale, efficiency, capacity, and productivity
now need to properly account for and value the factors that are out of equilibrium.
The important special case of micro-oriented studies follows, where non-convex
adjustment processes can arise and affect economic performance
measures and the analysis of productivity change. While these cases can
be addressed relatively easily within a nonparametric DEA setting, econometric
formulations are a greater challenge. This is followed by a review of the Network
DEA approach, which offers a computational decision-processing framework for
production systems in which outputs in one stage feed into a future stage. The final section
addresses two directions, in particular, that are worthy of attention and discussion.

Background

The Setting

The characterization of dynamics in production decision making is a theme of long
interest. Samuelson ([124], Chapter IX) provides an overview of the earliest efforts
to introduce dynamic notions in economics and identifies the distinction between
“dynamic and causal (non-historical)” and “dynamic and historical” ([124], p. 315).
“Dynamic and causal (non-historical)” refers to a subset of variables leading other
variables to move, but the parameters of the system remain unchanged. “Dynamic
and historical” allows for a parameter of the system (such as the state of the
art) to lead to changes in the system or in its behavior over time. Frisch ([59],
Part V, Chapters 16–19) focuses on the dynamic theory of production by initially
commenting on the time shape (his italics), or trajectory, of input and output
decisions. These early efforts focus on investment decisions, with particular
attention to the depreciation and replacement of equipment and structures. Frisch,
like others of this era, applies an engineering economics framework to the
intertemporal case, focusing on a replacement theory approach where maintenance
investment is the emphasis. The focus on replacement and maintenance leads to the
investment dynamics influencing the installation of additional capital, which can
lead to overcapacity [28].
The nature of factors of production is a driving force of dynamic production
decision making. Viner’s [152] distinction between some factors being “freely
adjusted” while others are “necessarily fixed” necessitates the distinction between
the short- and long-run production technology. Capital inputs are characterized as
having productive value beyond the current period and at the same time present a
degree of inflexibility that freely adjusted (variable) inputs do not present. Alchian
[5], Smith [137], and De Alessi [36] advanced the stock-flow production function,
in which the notion of intertemporal investment and production decision making is
formalized within the long-run cost minimization framework.
Three major components of production decisions have the potential to drive a
dynamic decision process. The first is driven by economic forces related largely
to adjustment processes, which arise from the dichotomy between the short and
long runs. The distinction between the short and long run becomes a prime
consideration in determining the appropriate time scale of economic decision-
making strategies. These strategies focus on the choice of production factors
assumed to be fixed when factor allocation decisions are to be made. All economic
activity occurs in the short run to the extent a factor (or factors) of production
are taken as fixed [137, 152]. The long run refers to the firm planning to select a
future short-run production situation. The problem with the classical description
of the short and long run is that the story of the envelope curve is not entirely
consistent with the story motivating the distinction between the short and long
run.
The long run consists of a range of possible short-run opportunities available to
the firm. As such, the firm always operates in the short run but plans for the long run
([55], p. 198). A more complete description of producer behavior in the long-run
theory of cost concentrates on the planning problem involving the minimization of
the discounted stream of costs. Such a characterization focuses on long-run costs as
a stock rather than a flow concept.
The classical approach characterizes both short- and long-run cost functions as
flows. The long run is merely the case where the fixed factor is now variable –
presumably, because the time span under consideration is now long enough to view
the problem as a short-run planning problem. This could entail describing the long
run to last 5 or 10 years given capital adjustment rates estimated in the empirical
literature. Viner’s [152] idea of some factors being “freely adjusted” while others
are “necessarily fixed” is sufficiently vague to allow long-run costs to be considered
a flow. Freely adjusted implies that altering the input levels of these factors does not
impose a penalty on the firm other than a constant acquisition cost.
The application of non-freely adjusted inputs presumably occurs because the
firm must absorb some additional costs beyond the acquisition cost. Introducing
the concept of adjustment costs can capture this phenomenon. Some factors are
considered “fixed” in the short run, not because the operator is physically prevented
from removing or introducing more of the factor, but because the economic
environment places a high cost on adjusting the factor level. The full slate of
costs that can lead to gradual adjustment are rarely, if ever, observed but can be
proxied by a relationship (or function) that is driven by changes in the quasi-fixed
factors.
To those who may see this as a shortcoming, we take a step back and acknowl-
edge that many of the concepts we have in economics are theoretical constructions
proposed as the mechanisms to rationalize economic choices to observed behavior.
We never observe a production function, but it serves as a mechanism to relate input
choices into output realizations; we never observe a consumer’s utility function,
but it serves to translate how choices of goods and services lead to the consumer’s
overall satisfaction.
Fundamentally, economic analysis in a static (or timeless) context entails
addressing how a change in an economic variable of interest (e.g., price, tax, the
level of capital stock) impacts the firm’s decision (e.g., how much to invest). This
change is assumed to happen instantaneously, or more appropriately, over a time
period long enough that the stimulus for change
has taken full effect. In fact, this time period may span several years. This is
akin to asking the question: How long is the long run? For a firm manufacturing
automobiles, the answer can vary from 12 hours to 12 weeks; a new manufacturing
facility may take 2–3 years to construct, debug, and gear up into full operational
status. But for a politician, the answer is quite simple: The long run is until the next
election. Understanding the decision time path associated with a policy stimulus is
as important as understanding the final impact of that stimulus.

Dynamic Optimization Frameworks

The presentation of dynamic optimization approaches has evolved over the years.
Many books, including Intriligator [69], Kamien and Schwartz [71], Léonard and
Long [83], Takayama [139], Seierstad and Sydsaeter [126], and Caputo [19], offer
complete presentations of the calculus of variations and optimal control theory, with
applications to economics. Dorfman [41] is a classic contribution on the economic
interpretations of optimal control theory. Sufficient conditions in optimal control
can be found in Kamien and Schwartz [70] and Seierstad and Sydsaeter [126] for
both the standard control problems and the case of corner solutions for the controls.
Benveniste and Scheinkman [11] identify the sufficient conditions to guarantee the
value function is differentiable.
Enhanced computational capabilities have brought treatments with a distinctly
empirical orientation that exploit the dynamic programming approach. Bellman [10]
introduces the theory of dynamic programming, which predates the optimal control
theory approaches and offers the foundation for computational algorithms as well as
a fundamental functional equation of optimization that many applications exploit to
connect with the primal-dual theory of dynamic production. A comprehensive discussion
of the computational methods suitable for estimating structural dynamic models of
investment is found in Adda and Cooper [2] and Bond and Van Reenen [17].
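The functional equation at the heart of dynamic programming can be illustrated with a small value-function iteration. The sketch below is not taken from any of the works cited. It assumes a hypothetical firm with a quadratic cost of holding capital away from a target (K_TARGET), a quadratic adjustment cost with weight ADJ, no depreciation, and an integer grid of capital stocks; all names and numbers are illustrative assumptions.

```python
# Illustrative value-function iteration for a capital-adjustment problem.
# All parameter values and the quadratic cost forms are assumptions.

BETA = 0.95                 # discount factor
ADJ = 2.0                   # weight on the quadratic adjustment cost
K_TARGET = 10               # capital level with zero holding cost
GRID = list(range(0, 11))   # admissible capital stocks 0, 1, ..., 10


def period_cost(k, k_next):
    """Cost of being away from the target plus the cost of adjusting."""
    return 0.5 * (k - K_TARGET) ** 2 + 0.5 * ADJ * (k_next - k) ** 2


def solve(n_iter=500):
    """Iterate on V(k) = min over k' of [cost(k, k') + BETA * V(k')]."""
    value = {k: 0.0 for k in GRID}
    policy = {k: k for k in GRID}
    for _ in range(n_iter):
        new_value = {}
        for k in GRID:
            best = min(GRID, key=lambda kn: period_cost(k, kn) + BETA * value[kn])
            policy[k] = best
            new_value[k] = period_cost(k, best) + BETA * value[best]
        value = new_value
    return value, policy


def simulate(policy, k0, periods=12):
    """Follow the computed policy from an initial stock k0."""
    path = [k0]
    for _ in range(periods):
        path.append(policy[path[-1]])
    return path


value, policy = solve()
path = simulate(policy, k0=0)
print(path)
```

Under these assumptions the computed path closes the gap to the target in several steps rather than in one jump, which is the qualitative behavior the adjustment cost literature discussed next is built to capture.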

Adjustment Cost Model

Long History and Evolution

Holt et al. [67] and Eisner and Strotz [42] introduce the notion that some
factors of production adjust gradually while others adjust
instantaneously. The classic microeconomic theory of variable and fixed factors
is now amended to refer to these gradually adjusting, or quasi-fixed, factors of
production. The mechanism to capture the inertia of some factors is the cost of
adjustment function that defines the relationship between cost (in either physical
output or value terms) and the magnitude of the adjustment. As a result, the
relative speed of adjustment can be used to characterize the degree to which a
factor is variable or quasi-fixed. These models are motivated by the following
propositions: (i) there are costs associated with adjusting the capital stock at a
rapid rate per unit of time; and (ii) these costs increase rapidly with the absolute
rate of investment, presumably so rapidly that the firm never attempts to achieve a
jump in its capital stock at any given moment. The factors that are not adjusted
instantaneously are referred to as quasi-fixed. The stock of quasi-fixed factors is
gradually accumulated since it costs more to adjust the stock rapidly rather than
slowly.
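A short calculation shows why convex adjustment costs favor gradual accumulation. Assume, purely for illustration, a quadratic adjustment cost $C(I) = \tfrac{a}{2} I^2$ with $a > 0$ (a functional form the text does not specify), and ignore discounting and depreciation. Closing a capital gap $\Delta$ in one period, or in $n$ equal installments, then costs

```latex
\[
C_{\text{jump}} = \frac{a}{2}\,\Delta^{2},
\qquad
C_{n\ \text{steps}} = n \cdot \frac{a}{2}\left(\frac{\Delta}{n}\right)^{2}
                    = \frac{a\,\Delta^{2}}{2n}.
\]
```

With $\Delta = 10$ and $a = 2$, the one-shot adjustment costs 100, while five equal steps cost 20 in total. Spreading the adjustment lowers the adjustment cost but postpones the benefit of the larger stock, which is the trade-off the intertemporal problem resolves.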
15 Dynamic Analysis of Production 617

A number of studies followed up on this basic concept, building on the
parable about production2 and focusing on the properties of the adjustment cost
mechanism and its relationship to investment behavior. Early contributions to this
stream of thought that explore the properties of the adjustment cost model are
Lucas [87], Rothschild [120], Treadway [145, 146], and Mortensen [97]. General
overviews of the theory of the firm facing adjustment costs are Brechling [18] and
Nickel [101], with Hamermesh and Pfann [62] offering an extensive history of both
the theoretical and empirical applications that focus on the convex (smooth) and
non-convex (kinked) quasi-fixed factor adjustment functions.
Internal adjustment costs are the result of a reduction in output, which occurs when
the quasi-fixed factor is absorbed (or released) too quickly. A firm may have personnel and
training departments, which are adequately budgeted and staffed for the normal
replacement of quits and retirements. If the firm seeks to expand its work force,
more capital and labor must be devoted to the personnel and training departments.
With total inputs fixed, the level of output must fall. Many such internal costs of
adjustment can be viewed as learning. A manager seeking to expand the operation
must spend more time learning how to manage more resources effectively. As
some of this time may be devoted to more formal training (e.g., studying manuals,
attending workshops), a portion of this training time can result in a loss in
physical output due to the manager’s learning by production experience. With the
manager’s total time available fixed, less time is available to manage the operation
as a significant block of time is diverted to learning. Consequently, the level of
output must fall. External adjustment costs arise from market forces or contractual
obligations. With capital as a quasi-fixed factor, examples of external adjustment
costs are expansion planning fees (e.g., architects, legal costs associated with zoning
issues, design consultants) and imperfect capital goods markets. A firm intending
to expand its capital base rapidly may be able to obtain more capital at a steeply
increasing marginal cost because the rate of production of new capital goods may
be insufficient. A firm may have to go to alternative credit sources to raise capital
beyond the level traditional lending institutions may permit due to collateralization
requirements. With labor as a quasi-fixed factor, severance pay, job advertisement,
and other labor recruiting (search) costs are examples of actual costs incurred by the
firm.
For a production function specification for output (y), variable inputs (x), capital
stock (K), and gross investment (I), y = f(x, K, I), output is non-increasing in I and
concave, reflecting the lost output associated with incurring investment. In a cost
function setting with variable input prices (w), C(w, y, K, I), optimal short-run

2 Economists are prone to refer to the “black box” as the unseen mechanism translating action and
decisions into accomplishments. Our production functions are examples of such mysterious mech-
anisms. I defer here to Prof. Chambers who aptly characterizes the economist’s conceptualization
of the mechanism as a parable (a simple story) in his Chap. 14, “Production Under Uncertainty”
in this volume.

Fig. 1 Isoquant map (I, x). Axis label: x_t; isoquant slope = -f_I / f_x

costs are increasing in investment at an increasing rate, reflecting diseconomies of
investment. As the isoquant for (I, x) in Fig. 1 illustrates, additional investment
requires additional variable input to maintain output, which is the classic congestion input behavior.
The current period cost of investment is balanced against the flow of future gains
associated with the embodiment of additional capital.
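As a worked illustration of these properties, consider a hypothetical production function (the functional form and the symbols $a$, $b$, and $\phi$ are assumptions, not taken from the text) with $a, b > 0$, $a + b \le 1$, and $\phi > 0$:

```latex
\[
y = f(x, K, I) = x^{a} K^{b} - \frac{\phi}{2}\, I^{2},
\qquad
f_x = a\,x^{a-1} K^{b} > 0,
\qquad
f_I = -\phi I \le 0 \ \ (I \ge 0),
\qquad
f_{II} = -\phi < 0,
\]
\[
\left.\frac{dx}{dI}\right|_{y,\,K\ \text{fixed}}
  = -\frac{f_I}{f_x}
  = \frac{\phi\, I}{a\, x^{a-1} K^{b}} \ \ge 0 .
\]
```

Output falls with investment at an increasing rate, and the isoquant in (I, x) slopes upward: holding output and capital fixed, more investment must be offset by more of the variable input.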

Incorporating Adjustment Costs into a Technology

Several primal representations of the production technology are defined and charac-
terized axiomatically in the static theory of production.3 The production function
has been used, in general, as the primal representation of the adjustment-cost
production technology (e.g., [45, 82, 108]). Recently, other primal representations of
the adjustment-cost production technology have emerged in the literature allowing
for the possibility of multiple outputs. Sengupta [127] addresses adjustment costs in
an optimal control framework with a specification leading to a closed form solution
of controls.
The nonparametric approaches to the theory of production led to a proliferation
of empirical work. These contributions start with Farrell [54] and were advanced
importantly by Afriat [3] and Hanoch and Rothschild [63]. Together they provide
the foundation for the more complete primal/dual nonparametric characterizations
of production behavior by developing a set of inequalities that must be satisfied by
observed prices, input decisions, and output realizations that would be generated
by optimizing (cost minimizing and profit maximizing) behavior. Computationally

3 The interested reader is referred to the Professors Chambers and Färe’s Chap. 7, “Distance
Functions in Production Economics” in this volume for a presentation of the production sets, and
the radial and directional distance functions.

convenient tests developed by Diewert and Parkan [39] and Varian [149] broadened
the power of the revealed preference approach to production analysis.
Silva and Stefanou [130] take on the dynamic production case by defining
an adjustment-cost technology through a family of input requirement sets that
is negative monotonic and convex in I(t) and reverse nested in k(t). Negative
monotonicity in I(t) and reverse nestedness in k(t) imply that current additions to the
capital stock are output decreasing in the current period but increase output in
the future by increasing the future stock of capital. Convexity in I(t) implies the
more rapidly the quasi-fixed factors are adjusted the greater the cost, leading to
sluggish adjustment in the quasi-fixed factors. For other regularity conditions of this
technology, see Silva and Stefanou [130].
Consider the data series $S^c = \{(y^i(t), x^i(t), I^i(t), k^i(t), w^i(t), c^i(t));\ i = 1, \ldots, n;\ t = 1, \ldots, T\}$
representing the observed behavior of each production unit i at each
time t, together with the market input price vectors ($w^i(t)$ for variable inputs and $c^i(t)$ for
quasi-fixed factors). Focusing on the input requirement set, V(y(t): k(t)), Theorem 2
in Silva and Stefanou [130] identifies $V_I(y(t) : k(t))$, the tightest inner bound on
V(y(t): k(t)), as the convex monotonic hull of $(x^i(t), I^i(t))$, constructed as

$$
V_I(y(t) : k(t)) = \Big\{ (x(t), I(t)) :\ x(t) \ge \sum_j \lambda^j(t)\, x^j(t);\ I(t) \le \sum_j \lambda^j(t)\, I^j(t);\ y(t) \le \sum_j \lambda^j(t)\, y^j(t);\ k(t) \ge k^j(t);\ \sum_j \lambda^j(t) = 1;\ \lambda^j(t) \in \mathbb{R}_+,\ \forall j \Big\} \tag{1}
$$

where λ(t) is the intensity vector at time t.

The distance function perspective on defining a production technology is focused
on the entire opportunity set, including the boundary. For a given input-output
bundle, the distance to the boundary is selected given the technology set and this
distance is a measure of technical efficiency. Figure 2 presents an input requirement
set in (I, x) for a given output, identifies an interior bundle, Z, and illustrates the hyperbolic
input distance function, which is defined in Silva and Stefanou [131] to represent a
production technology with adjustment costs.

Primal-Dual Theory Opportunities

At any point of time t, the firm is presumed to minimize the discounted flow of costs
from time t forward as follows:
$$
W(w, c, y, k_t) = \min_{x, I} \int_t^{\infty} e^{-rs} \left[ w' x_s + c' K_s \right] ds
\quad \text{s.t.} \quad \dot{K} = I - \delta K,\ K_t = k_t;\quad (x_s, I_s) \in V(y_s : K_s),\ s \in (t, \infty) \tag{2}
$$

Fig. 2 Technical efficiency of variable and quasi-fixed factors (labels recovered from the figure: x, z´)

where w is the vector of current variable input prices, c is the current rental price
vector of quasi-fixed factors,4 ys is the production target at time s,5 r is the constant
discount rate, K̇ = dK/dt is the vector of net investment, and δ is a diagonal (o × o)
matrix of the depreciation rates δ_h, h = 1, ..., o. The optimal current value function
W(w, c, y, k_t) associated with the problem in (2) focuses on long-run costs as a stock
concept. The flow version of the long-run cost function is the dynamic programming
equation or Hamilton-Jacobi-Bellman (HJB) equation

$$
rW(w, c, y, k_t) = \min_{x_t, I_t} \left\{ w' x_t + c' k_t + W_k' (I_t - \delta k_t) :\ (x_t, I_t) \in V(y_t : K_t) \right\}, \tag{3}
$$

where rW(·) is a flow version of the intertemporal cost and Wk (w, c, y, kt ) is the
vector of the shadow value of capital. By definition, the shadow value of the quasi-
fixed factor h, Wkh (·), measures the impact on the value function due to a small
change in the initial capital stock, kh . Consequently, the shadow value of capital is
an endogenous price and influenced by input prices (w, c), the production target, y,

4 Static price expectations are assumed in model (2). This assumption means the firm considers
that current prices contain all relevant information about future prices. The firm revises its price
expectations as the initial period changes. Chambers and Lopez [21] discuss the reasons a firm that
is aware of the cost of obtaining information may choose rationally to generate expectations in this
way and update decisions continuously as new information appears.
5 A sequence of production targets is specified over the planning horizon in the dynamic cost model

(Stefanou [138] and Epstein and Denny [47]). Consequently, the value function depends on current
and future production targets. However, static output expectations are assumed in (2). The firm
revises its expectations and production plans as the initial period changes and the new output targets
are developed. In this way, the firm is solving an open loop optimization problem that allows but
does not anticipate revisions in expectations.

and initial capital stocks, kt . The economic interpretation of the optimized version of
(2), the HJB equation in (3), is straightforward. The left-hand side, rW(·), is the opportunity cost of the production
plan using the starting capital stock, kt , and production target, y. The right-hand side
presents the instantaneous flow of variable costs plus the user cost of capital and the
imputed value of the net investment to the long-run cost function.
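The minimization on the right-hand side of (3) can be made concrete under an additional assumption that the text does not impose: a single output, an input requirement set described by a differentiable production function, $V(y_t : K_t) = \{(x_t, I_t) : f(x_t, k_t, I_t) \ge y_t\}$, and an interior solution. With a multiplier $\mu \ge 0$ on the output constraint, a sketch of the Lagrangian and first-order conditions is

```latex
\[
\mathcal{L} = w' x_t + c' k_t + W_k' (I_t - \delta k_t)
            + \mu \left[\, y_t - f(x_t, k_t, I_t) \,\right],
\]
\[
\frac{\partial \mathcal{L}}{\partial x_t} = w - \mu f_x = 0,
\qquad
\frac{\partial \mathcal{L}}{\partial I_t} = W_k - \mu f_I = 0
\quad\Longrightarrow\quad
\frac{f_I}{f_x} = \frac{W_k}{w}
\ \ \text{(one variable input, one quasi-fixed factor)}.
\]
```

Because $f_I \le 0$ and $\mu \ge 0$, the shadow value $W_k$ is non-positive in this cost-minimization setting: an additional unit of capital lowers the discounted stream of costs. At the optimum, the rate at which investment must be offset by the variable input equals the ratio of the absolute shadow value of capital to the variable input price.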
The attractive feature of characterizing adjustment costs within a primal technol-
ogy is that it creates opportunities to exploit the dual characterizations. Intertempo-
ral (dynamic) duality focuses on the dual relation between the production function
and the optimal value function of an intertemporal optimization problem (e.g., [45,
82, 108]), and duality between the optimal value function and the instantaneous
variable profit function [93]. In particular, McLaren and Cooper [93] and Epstein
[45] introduce the duality relationship between the value and production functions,
publishing their contributions nearly simultaneously (appearing 2 months apart).
McLaren and Cooper [93] exploit the optimal control approach, while Epstein [45]
focuses on the HJB equation to develop the theory and econometric specification of
dynamic dual models.

Econometric Approaches

Focused econometric approaches to dynamic adjustment
estimate dynamic factor demands, test for the presence
of instantaneous adjustment by determining whether the marginal cost of adjustment is
constant, and demonstrate how the demands for quasi-fixed factors are generated as
an approximate solution to the multivariate linear accelerator. Early econometric
implementation of this approach is found in Denny et al. [37]. Mahmud et al.
[91] examine the implications of functional form specification on the invariance
of estimation, followed by Pindyck and Rotemberg [117], Shapiro [129], and
Chirinko [29], who focus on the implicit equations approach. These approaches are
a mixture of dual short-run functional specifications and primal adjustment cost
function specifications.
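The accelerator mechanism mentioned above can be illustrated with a scalar, discrete-time partial adjustment rule, K[t+1] = K[t] + m (K* - K[t]). This is a simplified stand-in for the multivariate continuous-time form; the adjustment rate m, the target K*, and the function name are assumptions made for the sketch.

```python
# Scalar, discrete-time flexible accelerator: K[t+1] = K[t] + m * (K_star - K[t]).
# m is an assumed adjustment rate in (0, 1]; m = 1 is instantaneous adjustment.

def accelerator_path(k0, k_star, m, periods):
    """Return the capital path implied by partial adjustment toward k_star."""
    path = [k0]
    for _ in range(periods):
        k = path[-1]
        path.append(k + m * (k_star - k))
    return path


sluggish = accelerator_path(k0=50.0, k_star=100.0, m=0.25, periods=20)
instant = accelerator_path(k0=50.0, k_star=100.0, m=1.0, periods=20)

# The share of the initial gap still open after t periods is (1 - m) ** t.
gap_after_4 = (100.0 - sluggish[4]) / (100.0 - 50.0)
print(round(gap_after_4, 4))
```

Testing for instantaneous adjustment corresponds, in this simplified setting, to testing whether m equals one; a smaller m implies more sluggish quasi-fixed factors.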
Epstein and Denny [47] estimate the dynamic dual model for the US manufacturing
sector and present a discussion of homotheticity and aggregation restrictions
for intertemporal cost minimization. Blackorby and Schworm [16] and Epstein
[46] focus on aggregation in dynamic models. With this early start, a number
of applications ensue in agriculture (Taylor and Monson [141], Vasavada and
Chambers [151], Howard and Shumway [68], Vasavada and Ball [150], Luh and
Stefanou [88], Sckokai and Moro [125], Serra et al. [128], Rungsuriyawiboon and
Hockmann [121], Yang and Shumway [153]), utilities [122], and manufacturing
[15].
Several approaches to formulating and estimating dynamic decision rules under
non-static expectations have been suggested in the literature. The first is the one used
in Hansen and Sargent [64] and Epstein and Yatchew [48]. This approach is based
on an explicit analytic solution to the Euler equation for the firm’s intertemporal
optimization problem in the classic calculus of variations formulation. To yield

an analytic solution, this approach requires a full specification of the expectation

model. Moreover, due to the difficulty in solving the Euler equation, most often
this approach is restricted to linear-quadratic technologies. The second approach
due to Kennan [78] assumes expectations to be rational. Applying an instrumental
variable estimation technique, Pindyck and Rotemberg [117] implemented this
approach in their study. Unlike the first approach, this approach does not require
an explicit solution of the Euler equations; therefore, it allows greater flexibility
in the specification of the production technology, but this ignores the information
contained in the transversality conditions. The third approach considers certainty
equivalence feedback control policies and is suggested by Prucha and Nadiri
[118]. Although the approach also utilizes the full solution to the intertemporal
optimization problem, the algorithm suggested by Prucha and Nadiri avoids the need
for explicit analytic solution.
Taylor [140], LaFrance and Barney [80], and Lasserre and Ouellette [82] engage
in theoretical explorations into the duality properties in stochastic environments.
Empirical investigations are found in Luh and Stefanou [90] using aggregate data,
and micro-level investigation is found in Pietola and Myers [116]. These dual-based
formulations to address price expectations gave way to structural investment
models such as Cooper and Haltiwanger [32] for US manufacturing plants and
Roberts and Vuong [119], who address the relationship between R&D investment and
productivity for German manufacturing firms.

Nonparametric Approaches

With dynamic decision making focused on the HJB equation, the focus is on
making current decisions on variable and dynamic factors, while looking to the
future. This context allows for specifying nonparametric technologies for the
investment decision making. The HJB equation in (3) can specify a technology
nonparametrically for firm i over J firms, with M outputs, N inputs, and F capital
factors:

$$
\begin{aligned}
& y_m^i \le \sum_{j=1}^{J} \gamma^j y_m^j, && m = 1, \ldots, M; \\
& \sum_{j=1}^{J} \gamma^j x_n^j \le x_n^i - \beta^i g_{x_n}, && n = 1, \ldots, N; \\
& I_f^i + \beta^i g_{I_f} - \delta_f K_f^i \le \sum_{j=1}^{J} \gamma^j \left( I_f^j - \delta_f K_f^j \right), && f = 1, \ldots, F; \\
& \gamma^j \ge 0, && j = 1, \ldots, J.
\end{aligned}
$$

The objective function in (3) is linear in the choice of the variable factor, x, and
investment, I, given the shadow value of capital, Wk (·), which is the dual variable
to the primal stated in (3). The solution to this class of problems must allow for the
presence of both primal and dual variables in the specification of the optimization
problem in such a way to guarantee the dual variables are not treated as primal
variables in optimization. This can be framed as a linear complementarity problem6
as in Silva and Stefanou [131] or exploit the primal-dual structure as in Silva et al.
[132].
The revealed preference approach to production analysis is one that focuses on
allowing the data to reveal the technology without the constraints of a functional
form. Building on Varian [149], the dynamic version leads to two boundaries for
the input requirement set. The tightest inner bound, presented in (1), is based on
the observed input-output data. When prices are available as well, the theoretical
foundation of cost minimization can be used to assess whether the data rationalize
dynamic cost minimizing behavior, yielding the inner and outer bounds of the
production technology. Theorem 3 in Silva and Stefanou [130] constructs the tightest
outer bound as the intersection of the half-spaces that are created by the isocost planes
and defined as

$$
V_O(y(t) : k(t)) = \left\{ (x(t), I(t)) :\ w^i(t)' x(t) + W_k^i(t)' I(t) \ge w^i(t)' x^i(t) + W_k^i(t)' I^i(t);\ y(t) \ge y^i(t);\ k(t) \le k^i(t) \right\} \tag{4}
$$

where Wki (t) is the vector of the shadow value of capital for observation i at time t.
Silva and Stefanou [130] present a detailed empirical investigation of this approach
with firm-level data (Fig. 3).
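Because the outer bound in (4) is an intersection of half-spaces, membership can be checked with simple arithmetic. The sketch below assumes scalar x, I, y, and k, reads (4) as requiring the cost inequality for every observation i with y ≥ y^i and k ≤ k^i, and treats the shadow values as known; in practice they must be recovered from the data. The class `Obs`, the function `in_outer_bound`, and all numbers are hypothetical.

```python
# Membership test for the outer bound V_O in (4), with scalar x, I, y, k.
# Observations and shadow values below are made-up numbers for illustration.

from dataclasses import dataclass


@dataclass
class Obs:
    y: float    # output
    k: float    # capital stock
    x: float    # variable input
    inv: float  # gross investment
    w: float    # variable input price
    wk: float   # shadow value of capital (assumed known here)


def in_outer_bound(x, inv, y, k, data):
    """True if (x, inv) costs at least as much as every comparable observed
    bundle when valued at that observation's own prices."""
    for o in data:
        comparable = (y >= o.y) and (k <= o.k)
        if comparable and o.w * x + o.wk * inv < o.w * o.x + o.wk * o.inv:
            return False
    return True


data = [
    Obs(y=10.0, k=5.0, x=4.0, inv=1.0, w=2.0, wk=-1.0),
    Obs(y=10.0, k=5.0, x=6.0, inv=3.0, w=1.0, wk=-1.0),
]

inside = in_outer_bound(x=5.0, inv=2.0, y=10.0, k=5.0, data=data)
outside = in_outer_bound(x=3.0, inv=1.0, y=10.0, k=5.0, data=data)
print(inside, outside)
```

The first candidate bundle is at least as costly as both observed choices at their own prices and so cannot be ruled out of the technology; the second would have been cheaper than the first observation's choice, which is inconsistent with that firm having minimized cost.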
More recent efforts emphasize the directional distance function approach to
specifying the dynamic production technology. Building on the pioneering work
of Chambers et al. [22, 24] and Chambers [20], Silva et al. [132] establish
the primal-dual foundations for production under dynamic adjustment using the
dynamic version of the directional distance function and develop decomposition of
the dynamic version of the Luenberger productivity indicator in Chambers [20].
The directional distance function approach offers the flexibility of choosing the
direction, rather than forcing a hyperbolic path to the frontier. The dynamics
are introduced in the production technology specification as an adjustment cost in
the form of the properties of the directional input distance function with respect to
the dynamic factors (or the change in the quasi-fixed factors) as

6 The linear complementarity problem and solution approaches are found in Mangasarian [92] and

applied in economics in Paris [111].


Fig. 3 Identifying technically efficient reference location (axes: X1, X2; marked points: X0, X*; origin: 0)

$$
\vec{D}(y(t), K(t), x(t), I(t); g_x, g_I) = \max \left\{ \beta \in \mathbb{R} :\ (x(t) - \beta g_x,\ I(t) + \beta g_I) \in V(y(t) \mid K(t)) \right\}. \tag{5}
$$

The vector $(g_x, g_I) \in \mathbb{R}^N_{++} \times \mathbb{R}^F_{++}$ is a nonzero vector determining the
direction in which $\vec{D}$ is defined. This function measures the distance of (x(t),
I(t)) to the boundary of V(y(t)|K(t)) in a predefined direction $(g_x, g_I) \neq 0_{N+F}$.
Figure 4 illustrates the dynamic directional input distance function assuming one
variable input and one dynamic factor. The input vector (x(t), I(t)) is projected
onto the isoquant of V(y(t)|K(t)) at the point
$(x(t) - \vec{D}(\cdot) g_x,\ I(t) + \vec{D}(\cdot) g_I) \in V(y(t) \mid K(t))$, with $(g_x, g_I) \neq 0_{N+F}$.
This figure presents three possible projections of
the input vector (x(t), I(t)) associated with three directions: $g_0$, $g_1$, and $g_2$.
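The definition in (5) can be evaluated numerically when the technology is known. The sketch below assumes a hypothetical parametric technology, V(y|K) = {(x, I) : f(x, K, I) ≥ y}, with one variable input and one dynamic factor, and finds the largest feasible β by bisection. The functional form of `f`, the function `directional_distance`, and all numbers are assumptions for illustration; this is not the nonparametric estimator.

```python
# Directional input distance function (5) by bisection for a hypothetical
# technology V(y|K) = {(x, I): f(x, K, I) >= y}. The functional form of f
# and all numbers are assumptions made for this sketch.

def f(x, k, inv):
    """Output: increasing in x and k, decreasing in investment (adjustment cost)."""
    return (x ** 0.5) * (k ** 0.5) - 0.1 * inv ** 2


def directional_distance(y, k, x, inv, gx, gi, tol=1e-10):
    """Largest beta with (x - beta*gx, inv + beta*gi) still able to produce y."""
    if f(x, k, inv) < y:
        raise ValueError("starting bundle is not in V(y|K)")
    lo, hi = 0.0, x / gx          # x - beta*gx must stay non-negative
    # For inv >= 0, f falls monotonically along the direction, so bisection is valid.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(x - mid * gx, k, inv + mid * gi) >= y:
            lo = mid
        else:
            hi = mid
    return lo


beta = directional_distance(y=4.0, k=4.0, x=9.0, inv=1.0, gx=1.0, gi=1.0)
x_proj, i_proj = 9.0 - beta * 1.0, 1.0 + beta * 1.0
print(round(beta, 4), round(f(x_proj, 4.0, i_proj), 4))
```

Changing (gx, gi) changes both the projected point on the isoquant and the resulting value of β, which is the flexibility in the choice of direction that the text emphasizes.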
As in the static case, the empirical implementation of the dynamic version for the
directional distance function takes both parametric and nonparametric approaches.
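For the dynamic input distance function case, the nonparametric problem for firm i
is to maximize the distance, $\beta^i$, or level of inefficiency, stated as

$$
\begin{aligned}
\vec{D}(y^i, K^i, x^i, I^i; g_x, g_I) = \max_{\beta^i, \gamma^j}\ & \beta^i \\
\text{s.t.}\quad & y_m^i \le \sum_{j=1}^{J} \gamma^j y_m^j, && m = 1, \ldots, M; \\
& \sum_{j=1}^{J} \gamma^j x_n^j \le x_n^i - \beta^i g_{x_n}, && n = 1, \ldots, N; \\
& I_f^i + \beta^i g_{I_f} - \delta_f K_f^i \le \sum_{j=1}^{J} \gamma^j \left( I_f^j - \delta_f K_f^j \right), && f = 1, \ldots, F; \\
& \gamma^j \ge 0, && j = 1, \ldots, J.
\end{aligned}
$$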

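Fig. 4 The dynamic input distance function (labels recovered from the figure: input requirement set V(y(t)|K(t)); bundle (I(t), x(t)) and projections (I0(t), x0(t)), (I1(t), x1(t)), (I2(t), x2(t)); direction vectors g0, g2; axis I)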

Dynamic Generalizations of Modern Production Theory

Concepts: Scale and Scope, Efficiency, Capacity, and Productivity

With the dynamic duality concepts established, rW(·) is the value function in flow
terms in (3) and provides the starting point to commence the dynamic generalization
of economic performance measures in modern production theory. The concepts of
economic performance are muddled by the problem-specific nature of determining
the length of run. As apparent in (3), the shadow value of the quasi-fixed factor, Wk ,
enters as an endogenous valuation of the capital asset. Conditioned on current value
parameters (e.g., prices, output, and capital), it is predetermined in the sense that it is
contemporaneously fixed and intertemporally endogenous. When the firm is not at
a long-run equilibrium position, the under- or over-utilization of quasi-fixed factors
imposes an internal cost. The time scale and the inertia of quasi-fixed factor change
have two effects on our measure of economic performance. As the long-run cost
function allows quasi-fixed factor stocks to evolve over time at an endogenous rate,
there is not just a long run and a short run but a continuum of runs. Consequently,
we can generate measures such as returns to scale and scope economies, typically
reported as a static metric, at each point in time, driven by the fact that K̇ ≠ 0
and an intertemporal objective is in effect. Elasticities can now be viewed as short
run (with K(t) fixed at kt ), intermediate run (with K(t) variable according to the
optimized investment policy), and long-run equilibrium (with optimal K(t) fixed
such that K̇*(t) = 0 and I* = δK*, where K* is the long-run equilibrium capital
stock and is a function of (w, c, y)).
When the technology can shift over time due to technological progress,
we typically denote this progress with the time trend and the technology,
(xt , It ) ∈ V(yt , Kt , t), implying the value function, W(w, c, Kt , yt , t), also shifts
over time with technological progress. The measure of productivity growth in

the dynamic production case can be generated the same way as the static case.
In the classic static case, total factor productivity growth is decomposed into two
components: (i) a scale effect reflecting input growth along the existing technology
frontier, and (ii) a technological progress effect. If real prices are held fixed
over time, the components driving productivity change are associated with those
variables that change over time (inputs and technology, in this case). In the dynamic
case, market input prices can be assumed to be fixed; the shadow value of capital,
WK (w, c, Kt , yt , t), in the HJB equation in (3) serves as an internal price that does
change over time with changes in the quasi-fixed factor stock.
For the case under dynamic adjustment, Luh and Stefanou [88] find that there
will be a technological progress effect but the input change effect now has several
components. The first input change effect addresses the change in the flow variables;
i.e., variable inputs and gross investment (the flow in the quasi-fixed factor stock).
As investment takes place along with the depreciation of the quasi-fixed factor stock,
the quasi-fixed factor stock changes over time as it is driven by production choices.
The last force driving productivity change emerges from the shadow value of capital,
which is an endogenous value (price); namely, the impact of an internal price
change, $\frac{dW_k}{dt} \cdot \frac{1}{W_k} = \left( W_{KK} \dot{K} + W_{Kt} \right) \cdot \frac{1}{W_k}$.

Measures of economic performance that are moving over time such as pro-
ductivity, capacity utilization, efficiency, among others, must also now reflect the
impact of disequilibrium that includes both the changing capital stock as well as
the endogenously shifting shadow value of capital. Extensions to the case of returns
to scale under dynamic adjustment are developed in Stefanou [138], Morrison [94,
96], and Paul and Siegel [113]. For the case of scope economies, Fernandez-Cornejo
et al. [56] present an application to West German dairy farms, Oude Lansink and
Stefanou [106] focus on farm-level Dutch cash crops, and Helfat and Eisenhardt
[65] present a business case study. A related direction arises in the strategic
management literature focusing on the dynamic capabilities of firms and how
resources are reallocated to build competitive advantage as firms experience rapid
change in the face of competitive pressures [142]. While not based on structural
production technologies, the theme and interest is related in terms of looking at
the organizational capabilities, which strikes at the heart of the scope economies
motivation. Silva et al. ([133], Chapter 5) construct measures of scale, scope,
capacity utilization, and productivity change under dynamic adjustment with
the directional distance function.

Efficiency

Fallah-Fini et al. [49] offer an overview of the current nonparametric dynamic

efficiency models. They organize the landscape to attribute dynamic behavior to
one or a combination of factors associated with the dynamic aspects of production
that include dynamic adjustment, cognitive capacity of decision makers, long lags
in installing infrastructure, among others. The paradox of efficiency under dynamic
adjustment is illustrated in Fig. 3.

With a firm starting at X0 and the cost minimizing input bundle at X∗ , how
does the decision maker guide her firm to move from X0 to X∗ ? If there were no friction
or inertia in the decision making system, the move would be instantaneous and
we would attribute the initial inefficient bundle as a lapse in managerial ability
and the study of efficiency analysis would be fairly trivial. The point is that we
do empirically observe firms operating off the frontier, and their presence off the
frontier can be persistent. The dynamics of input reallocation may be significant.
Each point on an isoquant is a technique of a production process (or technology)
and the smooth isoquant results in the presence of an infinite number of techniques
to achieve a given output level [31, 98]. Changing the input bundle as the firm
reorients the techniques of production to enhance efficiencies can cause the firm to
incur monitoring costs associated with reorganizing the production process. This can
be revealed by the presence of transition costs associated with reallocating inputs,
which would lead to a trajectory taking her to the optimal input bundle [30]. But
as the decision-maker is following a path toward the optimal input allocation, a
snapshot in time would reveal her to be inefficient; i.e., the firm may be statically
inefficient but dynamically efficient (as she follows the optimal path).
This approach has been applied to a range of applications in agriculture and food
manufacturing (Skevas and Oude Lansink [134], Kapelko et al. [77], Kapelko and
Oude Lansink [74], Kapelko [73]) and in construction [75].
As in the static version of the directional distance function, econometric appli-
cations are challenging. The need to impose the translation property restricts the
range of functional form specifications, with the quadratic case remaining a tractable
operational choice in the dynamic case as well, as is applied in Oude Lansink et al.
[107].
The emphasis of the following discussion is driven by the error specification of a
parametric frontier production function to address the production dynamics. The
literature discussed here focuses on the production technology, not a cost function
(although there is no restriction, other than access to price data, precluding the
implementation to cost function frontiers). These econometric approaches seek to
capture the sluggish movement of quasi-fixed factors by proposing that the dynamic
stochastic frontier models tend to estimate firms’ long-run technical inefficiency
level.
Early attempts to take into account the time dependence of inefficiency in
the stochastic frontier literature include studies by Battese and Coelli [9] and
Kumbhakar [79]. These models allow the efficiency scores to follow deterministic
functions of time. The disadvantage of this modeling approach is that it restricts the
time path of efficiency to have the same structure for all firms under consideration,
on average. In this sense, the results are interpretable with respect to the direction
of evolution of efficiency over time for the industry. Cornwell et al. [34] develop a
specification using firm-specific parameters that does not impose this restriction by
proposing estimation in a generalized least squares framework. Cuesta [35] extends
the Battese and Coelli [9] model by allowing the parameter in the function that
describes the evolution of efficiency scores to be firm specific. All four models
disregard the persistence of a positive or negative shock in firm efficiency and are,

Table 1 Summary of studies estimating long-run persistence of efficiency

Study                      Application                                  Long-run efficiency score
Tsionas [147]              US banking (1989–2000)                       0.955
Emvalomatis et al. [44]    Dairy farms (1995–2005)                      Netherlands: 1.00; Germany: 0.778
Emvalomatis [43]           US electric utilities firms (1986–1997)      0.813
Lambarraa et al. [81]      Spanish olive farms (2005–2012)              0.727
Skevas et al. [135]        German dairy & livestock (1999–2009)         0.700

therefore, not consistent with a dynamic model of firm behavior. Alvarez et al. [6]
note that most of the parameters of a stochastic frontier model remain consistent
even if the correlation between the efficiency scores among periods is ignored.
Ahn and Sickles [4] and Tsionas [147] have specified inefficiency in a true
autoregressive form. The primary focus of these two studies is the estimation of
the autocorrelation parameter(s). Both studies find very strong autocorrelation, or
persistence, in the efficiency scores. Although dynamic firm behavior is not modeled
explicitly, the implications of such a structural model are revealed in the form of the
parameter measuring the persistence of inefficiency. These approaches investigate
how technical inefficiency can be present and persist, while at the same time this
inefficiency is the consequence of factors that are under the firm’s control. The
assumption is that these factors cannot be adjusted without entailing costs; i.e.,
the efficiency improvement will necessarily depend on the costs of adjustment. If
such costs are high, we expect to find persistent technical inefficiency. This is a
reduced form perspective in that observed relations lead to the revelation of behavior
underlying the choices. Emvalomatis [43] builds on this line of work to investigate
the impact of unobserved heterogeneity in the estimation of the long-run persistence
of inefficiency. Table 1 summarizes empirical estimates from several studies of
long-run persistent efficiency using the autoregressive frontier specification.
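
To make the notion of long-run persistence concrete, the following minimal Python
sketch assumes a first-order autoregressive process on the logit of technical
efficiency, s_t = delta + rho * s_(t-1), with the random shock suppressed. The
parameter names delta and rho, the logit transformation, and the numerical values
are illustrative assumptions, not the specification or the estimates of any study
cited above.

```python
import math


def logistic(s: float) -> float:
    """Map a latent state s on the real line to an efficiency score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))


def long_run_efficiency(delta: float, rho: float) -> float:
    """Efficiency at the stationary point s* = delta / (1 - rho) of the assumed AR(1)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie in (-1, 1) for a stationary process")
    return logistic(delta / (1.0 - rho))


def efficiency_path(te0: float, delta: float, rho: float, periods: int) -> list:
    """Deterministic efficiency path starting from te0 (shocks set to zero)."""
    s = math.log(te0 / (1.0 - te0))  # logit of the starting efficiency
    path = []
    for _ in range(periods):
        s = delta + rho * s  # assumed autoregressive law of motion
        path.append(logistic(s))
    return path
```

With rho close to one, a firm that starts at an efficiency of 0.5 approaches its
long-run level only slowly, which is the sense in which high adjustment costs
would show up as persistent inefficiency.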

Capacity Utilization

Because capacity utilization is directly connected to the firm’s assets (or physical
infrastructure), its connection to dynamic production decision making is a
natural topic to address briefly in this chapter.7 Various notions of capacity
utilization have evolved over the decades. These measures are cyclical economic
indicators with long-standing use in public policy formation and analysis of business
decisions. The derivation and calculation of capacity utilization rates has been

7 The interested reader is referred to Professors Hayes and Jorgenson’s Chap. 24, “Capacity
and Capacity Utilization in Production Economics,” in this volume.

15 Dynamic Analysis of Production 629

historically based more on data analysis, both statistical and judgmental, with more recent
efforts focusing on theoretical foundations. These more rigorous efforts focus on
technology- and economic-based formulations.
The economic characterization of capacity utilization is surveyed in the 1986
annals issue of the Journal of Econometrics (e.g., see Berndt and Fuss [12] and
Slade [136]). Berndt and Fuss [13] generalize the notion of capacity utilization to
multiple outputs and multiple quasi-fixed factors within a static setting. Fousekis
and Stefanou [58] address the primal and dual measures of capacity utilization
under dynamic adjustment for the case of the US food processing and distribution
sector. Morrison [95] explores the case for the US automobile industry, and Paul
([112], Chapter 3) disentangles capacity utilization from productivity measures
under dynamic adjustment.

Productivity Change

It is commonly accepted that measures of productivity growth should involve the
use of multiple inputs and multiple outputs. Luh and Stefanou [88] explicitly
address the change in the shadow value of capital while incorporating
dynamic adjustment in measuring productivity growth in a study of US agricultural
productivity, finding that most productivity gains in production agriculture are
from technical change. Further investigations accounting for the change in the
shadow value of capital are found in Luh and Stefanou [89], who account for
learning-by-doing in productivity growth, Rungsuriyawiboon and Stefanou [122,
123], and Oude Lansink et al. [107]. Building on the dynamic analogue to the
contributions of Chambers [20] and Färe et al. [53], Oude Lansink et al. [107]
present the Luenberger-based measures of productivity change. The primal Luen-
berger productivity growth indicator is decomposed to identify the contributions
of efficiency growth and technical change, while the dual Luenberger productivity
growth indicator offers a further decomposition to identify the impact of quasi-fixed
factor disequilibrium and allocative efficiency change.
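
The primal decomposition can be illustrated with a minimal Python sketch. It
assumes that four directional distance function values have already been computed
(for example by DEA): each period's observation evaluated against each period's
technology. The function and argument names are hypothetical, and the formulas
follow the standard form of the Luenberger indicator rather than reproducing the
exact dynamic expressions in Oude Lansink et al. [107].

```python
from typing import NamedTuple


class LuenbergerResult(NamedTuple):
    productivity_change: float
    efficiency_change: float
    technical_change: float


def luenberger_indicator(d_t_obs_t: float, d_t_obs_t1: float,
                         d_t1_obs_t: float, d_t1_obs_t1: float) -> LuenbergerResult:
    """Decompose the Luenberger indicator between periods t and t+1.

    d_a_obs_b is the directional distance (inefficiency) of the observation
    from period b measured against the technology of period a.
    """
    # Average, over both reference technologies, of the fall in measured inefficiency.
    productivity = 0.5 * ((d_t1_obs_t - d_t1_obs_t1) + (d_t_obs_t - d_t_obs_t1))
    # Catching up: change in inefficiency relative to the contemporaneous frontier.
    efficiency = d_t_obs_t - d_t1_obs_t1
    # Frontier shift: average gap between the two technologies at both observations.
    technical = 0.5 * ((d_t1_obs_t1 - d_t_obs_t1) + (d_t1_obs_t - d_t_obs_t))
    return LuenbergerResult(productivity, efficiency, technical)
```

By construction the efficiency and technical change terms sum to the productivity
indicator, so the decomposition is exact for any set of distance values.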
The decomposition adds context to the policy discussions. When policies
promoting growth are assessed, the productivity impact is certainly important as a
summary measure. But how this growth is distributed among the components
drives the policy discussion. Is the sector gaining through reallocation toward
better, more efficient firms (e.g., by merger and acquisition, or entry vs.
exit) and/or through reallocation of factors within a firm?
Petrin and Levinsohn [115], Dhyne et al. [38], and Nishida et al. [103] address
the distribution of the reallocation of productivity drivers by specifying the produc-
tion function controlling for simultaneity and selection problems. These production
function estimation approaches control for the productivity shocks [1, 85, 104].
The interest in accommodating the potential endogeneity of productivity lends itself
to a range of policy considerations.

Non-convex Production Relationships

Non-convex adjustment arises in the context of micro-level investigations.
The smooth adjustment models can be justified with the aggregation
across firms and quasi-fixed factors. Empirically, one finds frequent
periods of investment inactivity at the plant level punctuated by bursts, a
pattern referred to as lumpy investment or investment spikes. Non-convex
adjustment can lead to investment bursts, and jumps in productive capacity can
follow. Extensive evidence exists for these erratic investment patterns
(Cooper et al. [33]; Geylani and Stefanou [60]). The
findings of Letterie et al. [84] for German firms indicate that most investment spikes
reflect an expansionary type of investment that has no direct relationship with
improved productivity, while episodes of large investments in new technology that
enhance productivity are very rare. Nielsen and Schiantarelli [102] find only very
small changes in labor productivity associated with investment spikes, suggesting
that productivity improvements are not related to technological change through
investment spikes.
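
Lumpy investment is typically identified in plant-level data by classifying each
period's investment rate (investment divided by beginning-of-period capital). The
Python sketch below is a minimal illustration; the 20% spike threshold and the 1%
inaction band are assumed values chosen for illustration, not thresholds taken
from the studies cited above, and the function name is hypothetical.

```python
def classify_investment(investment: float, capital: float,
                        spike_threshold: float = 0.20,
                        inaction_band: float = 0.01) -> str:
    """Label one period's investment using assumed, illustrative thresholds."""
    if capital <= 0:
        raise ValueError("capital must be positive")
    rate = investment / capital  # investment rate for the period
    if abs(rate) < inaction_band:
        return "inaction"        # essentially no adjustment of the capital stock
    if rate < 0:
        return "disinvestment"   # capital is being sold off or retired
    if rate > spike_threshold:
        return "spike"           # a large, lumpy burst of investment
    return "regular"             # smooth, moderate adjustment
```

Applied to a panel of plants, the share of periods labeled "inaction" or "spike"
gives a simple descriptive check on whether smooth, convex adjustment is a
reasonable approximation at the chosen level of aggregation.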
At the micro-level decision-making unit, Cooper and Haltiwanger [32] review
the empirical literature and find that non-convexity is likely the rule rather than the
exception. Once we aggregate the data beyond the micro decision units, convexity
is likely to be a tolerable abstraction. In the end, the adjustment cost function is
a mechanism created by the analyst to rationalize the behavior that generates the
observed data. As such, it is another of these parables of production that rationalizes
all the forces that can lead to the gradual adjustment of some factors. From an
empirical perspective, it is desirable to specify models that can test for the degree of
inflexibility in adjusting a given factor. The relationship between investment spikes
and productivity gains is an empirical relationship that has received scant attention
to date. It is an empirical issue to establish the link between productivity growth and
large investments, and examine how productivity growth changes in the presence of
lumpy investments and the potential impact of learning-by-doing, which relates to
the story of efficiency change.
Another direction related to non-convex adjustment is to explore the prospect of
asymmetry in adjustment as a motivation for lumpy investment, where expansions
can occur at a different rate than contractions. Chang and Stefanou [25] and
Oude Lansink and Stefanou [105] take a dynamic dual modeling perspective
and find the presence of adjustment asymmetry at the farm level, and Palm and
Pfann [110] explore the asymmetry of adjustment at a macro level, focusing on
UK manufacturing and building on a Koyck-lag mechanism driving adjustment. An
alternative approach mixing dynamic DEA and econometrics is taken by Kapelko
et al. [76], who use impulse response estimation to assess the impact of
investment spikes on dynamic inefficiency, and productivity more broadly, over
several years after the spike for firm-level data on Spanish meat processing
firms, finding that the impact largely depends on firm size.
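
Asymmetric adjustment can be sketched as a partial adjustment rule whose speed
depends on the direction of the gap between the target and the current stock. The
rule, the function names, and the rates used below are illustrative assumptions,
not the dual specifications estimated in the studies cited above.

```python
def adjust_stock(current: float, target: float,
                 rate_up: float, rate_down: float) -> float:
    """Close a fraction of the gap to the target, at a direction-specific rate."""
    for rate in (rate_up, rate_down):
        if not 0.0 < rate <= 1.0:
            raise ValueError("adjustment rates must lie in (0, 1]")
    gap = target - current
    rate = rate_up if gap > 0 else rate_down  # expansions vs. contractions
    return current + rate * gap


def adjustment_path(start: float, target: float,
                    rate_up: float, rate_down: float, periods: int) -> list:
    """Stock levels over time, beginning with the starting stock."""
    path = [start]
    for _ in range(periods):
        path.append(adjust_stock(path[-1], target, rate_up, rate_down))
    return path
```

With rate_up = 0.5 and rate_down = 0.1, a stock of 100 facing a target of 120
moves to 110 in one period, whereas a stock of 120 facing a target of 100 moves
only to 118, so contractions are slower than expansions under these assumed rates.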

Network Approach

Network DEA is a computational decision process framework that is appropriate
for modeling dynamic economic decision making [51]. It is particularly well suited
to production systems when an output in one stage feeds into the next stage as an
input. A network consists of sub-technologies entailing time-specific input decisions
and the production of time-specific (intermediate) outputs. The link from
one stage (or decision period) to the next can be input constrained, just as capital-
like factors can be in dynamic problems. This is a mathematical programming
problem where the optimization takes place over all stages (periods) to generate the
optimal trajectories for the control variables. A DEA model typically describes a
technology to a level of abstraction necessary for the analyst’s purpose, but leaves
out a description of the sub-technologies that make up the internal functions of the
technology. Essentially, the objective is to model the internal structure of firms with
intermediate products or carry-over activities across multiple periods.
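
The idea that an intermediate product links two sub-technologies can be
illustrated with a deliberately small case. The Python sketch below assumes a
two-stage series network with one input x, one intermediate product z, and one
final output y under constant returns to scale, and a relational multiplier
formulation that values the intermediate product identically in both stages. In
that special case the linear programs reduce to ratio comparisons and overall
efficiency equals the product of the stage efficiencies. The data and function
names are hypothetical; general multi-input, multi-output network or dynamic DEA
models require a linear programming solver.

```python
def stage_efficiencies(inputs: list, outputs: list) -> list:
    """CRS efficiency with one input and one output: own ratio over the best ratio."""
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


def two_stage_network(x: list, z: list, y: list) -> tuple:
    """Stage and overall efficiencies for the chain x -> z -> y."""
    stage1 = stage_efficiencies(x, z)  # input to intermediate product
    stage2 = stage_efficiencies(z, y)  # intermediate product to final output
    overall = [e1 * e2 for e1, e2 in zip(stage1, stage2)]
    return stage1, stage2, overall


# Hypothetical data for three decision-making units.
x = [2.0, 4.0, 5.0]
z = [4.0, 4.0, 10.0]
y = [8.0, 6.0, 10.0]
stage1, stage2, overall = two_stage_network(x, z, y)
```

In this example the third unit is fully efficient in the first stage, so its
overall shortfall arises entirely in the second stage, which is information that
a black-box comparison of y to x alone would not reveal.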
Färe et al. [52] provide a comprehensive review of the Network DEA perspective.
This framework has been extended by, among others, Nemoto and Goto [99], Chen
[26], Ouellette and Yan [109], Chen and Dalen [27], Tone and Tsutsui [143, 144],
and Kao [72].
There are two clear points of difference between the Network DEA and the HJB
equation framework presented in Section “Adjustment Cost Model” in this chapter.
The HJB approach solves for the current period choice variables while looking to
the impact of the flow of profits or costs into the future and has the ability to exploit
the value function relations that lead to formulating the dynamic generalizations of
modern production theory. The value function can be implemented computationally
through DEA or can be parameterized and estimated econometrically. In addition,
the HJB approach is more akin to a decision-making process where firms make
decisions in the short run as they look to the future. Since both approaches are
consistent with dynamic optimization theory, they yield the same answers. As the
Network DEA framework solves from the current period forward, a trajectory is
generated for the future periods, thus clearly linking current decisions to the future
flow of profits or costs.
Chambers et al. [23] use this model to study the inefficiency of APEC countries
due to dynamic misallocation of resources. Nemoto and Goto [99] extend the DEA
to a dynamic framework consistent with the adjustment-cost theory of investment
and obtain measures of dynamic inefficiency based on the HJB equation. Nemoto
and Goto [100] apply this dynamic network model to study Japanese electricity
production over time using a cost minimization criterion.
Fallah-Fini et al. [50] use the network DEA approach to measure the dynamic
efficiency of highway maintenance policies, allowing for the inter-temporal
dependencies between consumption of inputs and
realization of outputs. Herrera-Restrepo et al. [66] build on Tone and Tsutsui
[143, 144] for highway evacuation and highway-ramp closure strategies. Both of
these studies combine the dynamic network DEA approach with traffic engineering
and socio-behavioral theory of protective action. An initial exploration of the
proposed approach allows for the discovery of efficiency interdependencies among
perspectives, which in turn provides useful information and insights for the future
design of holistic evacuation traffic management strategies.

Conclusion

This chapter takes clear aim at the specification of a technology that can
accommodate yet another parable of economic life: that some factors can adjust
rapidly while others adjust slowly in a given time scale. Focusing on production
analysis in the dynamic setting leads us to emphasize the technology specification
that permits a theoretical construction that can be translated into, and is
amenable to, empirical implementation.
While this chapter is kept focused on the structure of the technology
that supports dynamic decision making, there are several directions that embrace
more fully the complexity that the dynamic production parable struggles to speak to.
It can be a tolerable abstraction to state that we know the current state of nature
better than the future states; hence, why not fully admit the case of uncertainty
in the course of dynamic decision making? This is certainly a fair criticism, but my
defense is that we must understand the kernel of the technology leading to the linkage
of decisions over time. Grasping this starting point firmly provides us with some
structure to venture out to address future states of nature where real prices and
other forces influencing decisions have elements of randomness. Several empirical
econometric-oriented directions addressing unknown future prices are discussed in
this chapter. Alternatively, uncertain relations have been modeled with fuzzy logic
as an alternative to modeling stochastic aspects within a DEA framework and having
more precise data, which comes at a cost [86].
As this chapter concludes, I turn to two directions that are worthy of attention
and discussion, but outside the scope of this chapter. Both directions implicitly
admit a degree of uncertainty as a key component to developing a more complete
characterization. While the preoccupation in motivating application is with physical
quasi-fixed factors of production, I have alluded to the case where quasi-fixed factor
stocks can also be forms of human and knowledge capital [14]. This direction
connotes a link to activities actively encouraging infusion (or investment) that can
arise from activities starting with R&D and going beyond to the commercialization
of such infusions to innovations. Hall et al. [61] provide a review of this direction
to include the dynamic factors. There is a rich literature and interest in this
direction and these explorations by their nature imply a degree of uncertainty in the
dynamic setting. Aw et al. [8], Doraszelski and Jaumandreu [40], and Peters et al.
[114] model the firm’s endogenous decisions to engage in R&D investment. These
models explicitly address the concept of productivity in the context of knowledge
capital while allowing for stochastic shocks to the current productivity. These
considerations include addressing the role of research and development (R&D) in
promoting productivity growth and eventually addressing the demand for R&D.
The second is to pull ourselves out of the weeds of a particular technology and
how the decision interacts in that setting, and move to a broader view in which the
nodes comprising the system embody several behavioral forces. The system dynamics
paradigm has its roots in Jay Forrester’s [57] pioneering work, with Vaneman and
Triantis [148] offering a contribution on how to embed system dynamics elements in
the Network DEA framework. However, it is important to emphasize the role of managing a
system that involves manufacturing activities, service delivery, human oversight,
and policy decisions regarding the performance of this system.

Cross-References

Capacity and Capacity Utilization in Production Economics

Distance Functions in Production Economics
Production Under Uncertainty

References
1. Ackerberg DA, Caves K, Frazer G (2015) Identification properties of recent production
function estimators. Econometrica 83:2411–2451
2. Adda J, Cooper R (2003) Dynamic economics: quantitative methods and applications. MIT
Press, Boston
3. Afriat S (1972) Efficiency estimation of production functions. Int Econ Rev 13(3):568–598
4. Ahn SC, Sickles RC (2000) Estimation of long-run inefficiency levels: a dynamic frontier
approach. Econ Rev 19(4):461–492
5. Alchian A (1959) Costs and outputs. In: Abramovitz M (ed) The allocation of economic
resources: essays in honor of B.F. Haley. Stanford University Press, Stanford
6. Alvarez A, Amsler C, Orea L, Schmidt P (2006) Interpreting and testing the scaling property
in models where inefficiency depends on firm characteristics. J Prod Anal 25:201
7. Aris R (1964) Dynamic programming: an introduction to the optimization of staged processes.
Blaisdell Publishing Co, New York
8. Aw BY, Roberts MJ, Xu DY (2011) R&D investment, exporting and productivity dynamics.
Am Econ Rev 101:1312–1344
9. Battese GE, Coelli TJ (1992) Frontier production functions, technical efficiency and panel
data with application to paddy farmers in India. J Prod Anal 3:153–169
10. Bellman R (1959) Dynamic programming. Princeton University Press, Princeton
11. Benveniste LM, Scheinkman JA (1979) On the differentiability of the value function in
dynamic models of economics. Econometrica 47(3):727–732
12. Berndt ER, Fuss MA (1986) Productivity measurement with adjustments for variations in
capacity utilization, and other forms of temporary equilibrium. J Econ 33:7–29
13. Berndt ER, Fuss MA (1989) Economic capacity utilization and productivity measurement for
multiproduct firms with multiple quasi-fixed inputs. National Bureau of Economic Research
working paper no. 2932, Cambridge, MA
14. Bernstein JI, Mamuneas TP (2006) R&D depreciation, stocks, user costs and productivity
growth for US knowledge intensive industries. Struct Chang Econ Dyn 17(1):70–99
15. Bernstein JI, Nadiri MI (1989) Research and development and intra-industry spillovers: an
empirical application of dynamic duality. Rev Econ Stud 56(2):249–267
16. Blackorby C, Schworm W (1982) Aggregate investment and consistent intertemporal tech-
nologies. Rev Econ Stud 49(4):595–614
17. Bond S, Van Reenen J (2007) Microeconometric models of investment and employment. In:
Heckman JJ, Leamer EE (eds) Handbook of econometrics, vol 6A. Elsevier, pp 4417–4498
18. Brechling F (1975) Investment and employment decisions. Manchester University Press,
Manchester
19. Caputo MR (2005) Foundations of dynamic economic analysis. Cambridge University Press,
Cambridge, MA
20. Chambers RG (2002) Exact nonradial input, output, and productivity measurement. Eco-
nomic Theory 20(4):751–765
21. Chambers RG, Lopez R (1984) A general dynamic supply-response model. Northeast Journal
of Agricultural and Resource Economics 13:142–154
22. Chambers RG, Chung Y, Färe R (1996a) Benefit and distance functions. J Econ Theory
70(2):407–419
23. Chambers RG, Färe R, Grosskopf S (1996b) Productivity growth in APEC countries. Pac
Econ Rev 1(3):181–190
24. Chambers RG, Chung Y, Färe R (1998) Profit, directional distance functions, and Nerlovian
efficiency. J Optim Theory Appl 98(2):351–364
25. Chang CC, Stefanou SE (1988) Specification and estimation of asymmetric adjustment rates
for quasi-fixed factors of production. J Econ Dyn Control 12:145–151
26. Chen CM (2009) A network-DEA model with new efficiency measures to incorporate the
dynamic effect in production networks. Eur J Oper Res 194:687–699
27. Chen CM, Dalen JV (2010) Measuring dynamic efficiency: theories and an integrated
methodology. Eur J Oper Res 203(3):749–760
28. Chenery HB (1952) Overcapacity and the acceleration principle. Econometrica 20(1):1–28
29. Chirinko RS (1993) Business fixed investment spending: modeling strategies, empirical
results, and policy implications. J Econ Lit 31(4):1875–1911
30. Choi O, Stefanou SE, Stokes JR (2006) The dynamics of efficiency improving input
allocation. J Prod Anal 25(1–2):159–171
31. Coeymans JE, Mundlak Y (1992) Endogenous technology and sectoral productivity: Chile
1962–1982. J Prod Anal 3(3):257–275
32. Cooper RW, Haltiwanger JC (2006) On the nature of capital adjustment costs. Rev Econ Stud
73(3):611–633
33. Cooper RW, Haltiwanger JC, Power L (1999) Machine replacement and the business cycle:
lumps and bumps. Am Econ Rev 89(4):921–946
34. Cornwell C, Schmidt P, Sickles RC (1990) Production frontiers with cross sectional and time
series variation in efficiency levels. J Econ 46:185–200
35. Cuesta RA (2000) A production model with firm-specific temporal variation in technical
inefficiency: with application to Spanish dairy farms. J Prod Anal 13:139
36. De Alessi L (1967) The short run revisited. Am Econ Rev 57(3):450–461
37. Denny M, Fuss M, Waverman L (1981) The measurement and interpretation of total factor
productivity in regulated industries with an application to Canadian telecommunications. In:
Cowing TG, Stevens RE (eds) Productivity measurement in regulated industries. Academic,
London, pp 179–218
38. Dhyne E, Petrin A, Smeets V, Warzynski F (2017) Multi product firms, import competition,
and the evolution of firm-product technical efficiencies. NBER working paper no. 23637.
National Bureau of Economic Research, Cambridge, MA
39. Diewert WE, Parkan C (1983) Linear programming tests of regularity conditions for
production functions. In: Eichhorn W, Henn R, Neumann K, Shephard RW (eds) Quantitative
studies on production and prices. Physica, West Germany, pp 131–158
40. Doraszelski U, Jaumandreu J (2013) R&D and productivity: estimating endogenous produc-
tivity. Rev Econ Stud 80:1338–1383
41. Dorfman R (1969) An economic interpretation of optimal control theory. Am Econ Rev
59(5):817–831
42. Eisner R, Strotz RH (1963) Determinants of business investment. Research Study Two in
Impacts of monetary policy. Prentice Hall, Englewood Cliffs
43. Emvalomatis G (2012) Adjustment and unobserved heterogeneity in dynamic stochastic
frontier models. J Prod Anal 37:7–16
44. Emvalomatis G, Stefanou SE, Oude Lansink A (2011) A reduced form model for dynamic
efficiency measurement: application to dairy farms in Germany and the Netherlands. Am J
Agric Econ 93:161–174
45. Epstein LG (1981) Duality theory and functional forms for dynamic factor demands. Rev
Econ Stud 48(1):81–95
46. Epstein LG (1983) Aggregating quasi-fixed factors. Scand J Econ 85(2):191–205
47. Epstein LG, Denny MGS (1983) The multivariate flexible accelerator model, its empirical
restrictions and an application to U.S. manufacturing. Econometrica 51:647–674
48. Epstein LG, Yatchew AJ (1985) The empirical determination of technology and expectations:
a simplified procedure. J Econ 27(2):235–258
49. Fallah-Fini S, Triantis K, Johnson AL (2014) Reviewing the literature on non-parametric
dynamic efficiency measurement: state-of-the-art. J Prod Anal 41(1):51–67
50. Fallah-Fini S, Triantis K, Rahmandad H, de la Garza JM (2015) Measuring dynamic
efficiency of highway maintenance operations. Omega 50:18–28
51. Färe R, Grosskopf S (1996) Intertemporal production frontiers: with dynamic DEA. Kluwer-
Nijhoff Publishing, Boston
52. Färe R, Grosskopf S, Whittaker G (2007) Network DEA. In: Zhu J, Cook WD (eds) Mod-
eling data irregularities and structural complexities in data envelopment analysis. Springer,
Boston
53. Färe R, Grosskopf S, Margaritis D (2008) Efficiency and productivity. In: Fried HO, Lovell
CAK, Schmidt SS (eds) The measurement of productive efficiency and productivity growth.
Oxford University Press, New York, pp 522–622
54. Farrell MJ (1957) The measurement of productive efficiency. Journal of the Royal Statistical
Society Series A (General) 120(3):253–290
55. Ferguson CE (1969) Microeconomic theory. Richard D. Irwin Press, Homewood
56. Fernandez-Cornejo J, Gempesaw CM, Elterich JG, Stefanou SE (1992) Dynamic measures
of scope and scale economies: an application to German agriculture. Am J Agric Econ 74:
329–342
57. Forrester JW (1961) Industrial dynamics. MIT Press, Cambridge, MA
58. Fousekis P, Stefanou SE (1996) Capacity utilization under dynamic profit maximization.
Empir Econ 21:335–359
59. Frisch R (1965) Theory of production. Rand McNally & Company, Chicago
60. Geylani PC, Stefanou SE (2013) Linking investment spikes and productivity growth. Empir
Econ 45(1):157–178
61. Hall BH, Mairesse J, Mohnen P (2010) Chapter 24: measuring the returns to R&D. In: Hall
BH, Rosenberg N (eds) Handbook of the economics of innovation, vol 2, North-Holland, pp
1033–1082
62. Hamermesh D, Pfann G (1996) Adjustment costs in factor demand. J Econ Lit 34(3):
1264–1292
63. Hanoch G, Rothschild M (1972) Testing the assumptions of production theory: a nonpara-
metric approach. J Polit Econ 80(2):256–275
64. Hansen LP, Sargent TJ (1980) Formulating and estimating dynamic linear rational expecta-
tions models. J Econ Dyn Control 2:7–46
65. Helfat CE, Eisenhardt KM (2004) Inter-temporal economies of scope, organizational modu-
larity, and the dynamics of diversification. Strateg Manag J 25(13):1217–1232
66. Herrera-Restrepo O, Triantis K, Trainor J, Murray-Tuite P, Edara P (2016) A multi-
perspective dynamic network performance efficiency measurement of an evacuation: a
dynamic network-DEA approach. Omega 60:45–59
67. Holt CC, Modigliani F, Muth JF, Simon HA (1960) Planning production, inventories, and
work force. Prentice-Hall, Englewood Cliffs
68. Howard WH, Shumway CR (1988) Dynamic adjustment in the U.S. dairy industry. Am J
Agric Econ 70(4):837–847
69. Intriligator MD (1971) Mathematical optimization and economic theory. Prentice-Hall,
Philadelphia
70. Kamien MI, Schwartz NL (1971) Sufficient conditions in optimal control theory. J Econ
Theory 3:207–214
71. Kamien MI, Schwartz NL (1981) Dynamic optimization: the calculus of variations and
optimal control in economics and management. North Holland, New York
72. Kao C (2013) Dynamic data envelopment analysis: a relational analysis. Eur J Oper Res
227(3):325–330
73. Kapelko M (2017) Dynamic versus static inefficiency assessment of the Polish meat-
processing industry in the aftermath of the European Union integration and financial crisis.
Agribusiness 33:505–521
74. Kapelko M, Oude Lansink A (2017) Dynamic multi-directional inefficiency analysis of
European dairy manufacturing firms. Eur J Oper Res 257(1):338–344
75. Kapelko M, Oude Lansink A, Stefanou SE (2014) Assessing dynamic inefficiency of the
Spanish construction sector pre- and post-financial crisis. Eur J Oper Res 237(1):349–357
76. Kapelko M, Oude Lansink A, Stefanou SE (2015) Analyzing the impact of investment spikes
on dynamic productivity growth. Omega 54:116–124
77. Kapelko M, Oude Lansink A, Stefanou SE (2017) The impact of the 2008 financial crisis
on dynamic productivity growth of the Spanish food manufacturing industry. An impulse
response analysis. Agric Econ 48(5):561–571
78. Kennan J (1979) The estimation of partial adjustment models with rational expectations.
Econometrica 47(6):1441–1455
79. Kumbhakar SC (1990) Production frontiers, panel data, and time-varying technical ineffi-
ciency. J Econ 46(1–2):201–211
80. LaFrance JT, Barney LD (1991) The envelope theorem in dynamic optimization. J Econ Dyn
Control 15(2):355–385
81. Lambarraa F, Stefanou S, Gil J (2016) The analysis of irreversibility, uncertainty and dynamic
technical inefficiency on the investment decision in the Spanish olive sector. Eur Rev Agric
Econ 43(1):59–77
82. Lasserre P, Ouellette P (1999) Dynamic factor demands and technology measurement under
arbitrary expectations. J Prod Anal 11:219
83. Léonard D, Long NV (1992) Optimal control theory and static optimization in economics.
Cambridge University Press
84. Letterie W, Pfann GA, Verick S (2010) On lumpiness in the replacement and expansion of
capital. Oxf Bull Econ Stat 72:263–281
85. Levinsohn J, Petrin A (2003) Estimating production functions using inputs to control for
unobservables. Rev Econ Stud 70(2):317–341
86. Liu S, Triantis KP, Sarangi S (2011) Representing qualitative variables and their interactions
with fuzzy logic in system dynamics modeling. Syst Res Behav Sci 28:245–263
87. Lucas RE (1967) Adjustment costs and the theory of supply. J Polit Econ 75(4–1):321–334
88. Luh YH, Stefanou SE (1991) Productivity growth in U.S. agriculture under dynamic
adjustment. Am J Agric Econ 73(4):1116–1125
89. Luh YH, Stefanou SE (1993) Learning-by-doing and the sources of productivity growth: a
dynamic model with application to U.S. agriculture. J Prod Anal 4(4):353–370
90. Luh YH, Stefanou SE (1996) Estimating dynamic dual models under nonstatic price
expectations. Am J Agric Econ 78(4):991–1003
91. Mahmud SF, Robb AL, Scarth WM (1987) On estimating dynamic factor demands. J Appl
Econ 2(1):69–75
92. Mangasarian OL (1977) Solution of symmetric linear complementarity problems by iterative
methods. J Optim Theory Appl 22:465–485
93. McLaren KR, Cooper RJ (1980) Intertemporal duality: application to the theory of the firm.
Econometrica 48(7):1755–1762
94. Morrison CJ (1986a) Productivity measurement with nonstatic expectations and varying
capacity utilization: an integrated approach. J Econ 33(1–2):51–74
95. Morrison CJ (1986b) Capacity utilization and productivity measurement: an application to the
U.S. automobile industry. In: Dogramaci A, Färe R (eds) Applications of modern production
theory: efficiency and productivity. Kluwer, Boston
96. Morrison CJ (1997) Assessing the productivity of information technology equipment in U.S.
manufacturing industries. Rev Econ Stat 79(3):471–481
97. Mortensen D (1973) Generalized costs of adjustment and dynamic factor demand theory.
Econometrica 41(4):657–665
98. Mundlak Y (1988) Endogenous technology and the measurement of productivity. In: Capalbo
SM, Antle JM (eds) Agricultural productivity: measurement and explanation. Resources for
the Future, Inc., Washington, DC
99. Nemoto J, Goto M (1999) Dynamic data envelopment analysis: modeling intertemporal
behavior of a firm in the presence of productive inefficiencies. Econ Lett 64(1):51–56
100. Nemoto J, Goto M (2003) Measurement of dynamic efficiency in production: an appli-
cation of data envelopment analysis to Japanese electric utilities. J Prod Anal 19(2–3):
191–210
101. Nickell SJ (1978) The investment decisions of firms. Cambridge University Press, Cambridge
102. Nielsen ØA, Schiantarelli F (2003) Zeros and lumps in investment: empirical evidence on
irreversibilities and nonconvexities. Rev Econ Stat 85(4):1021–1037
103. Nishida M, Petrin A, Rotemberg M, White T (2017) Are we undercounting reallocation’s
contribution to growth? US Census Bureau Center for Economic Studies Paper No. CES-
WP-13-55R
104. Olley S, Pakes A (1996) The dynamics of productivity in the telecommunications industry.
Econometrica 64(6):1263–1298
105. Oude Lansink A, Stefanou SE (1997) Asymmetric adjustment of dynamic factors at the firm
level. Am J Agric Econ 79(4):1340–1351
106. Oude Lansink A, Stefanou SE (2001) Dynamic area allocation and economies of scale and
scope. J Agric Econ 52:38–52
107. Oude Lansink A, Stefanou SE, Serra T (2015) Primal and dual dynamic Luenberger
productivity indicators. Eur J Oper Res 241(2):555–563
108. Ouellette P, Vigeant S (2001) Cost and production duality: the case of the regulated firm. J
Prod Anal 16:203
109. Ouellette P, Yan L (2008) Investment and dynamic DEA. J Prod Anal 29(3):235–247
110. Palm FC, Pfann GA (1998) Sources of asymmetry in production factor dynamics. J Econ
82(2):361–392
111. Paris Q (1979) Revenue and cost uncertainty, generalized mean-variance, and the linear
complementarity problem. Am J Agric Econ 61(2):268–275
112. Paul CJM (1999) Cost structure and the measurement of economic performance. Kluwer
Academic, Boston
113. Paul CJM, Siegel DS (1999) Scale economies and industry agglomeration externalities: a
dynamic cost function approach. Am Econ Rev 89(1):272–290
114. Peters B, Roberts MJ, Vuong VA, Fryges H (2017) Estimating dynamic R&D choice: an
analysis of costs and long-run benefits. Rand J Econ 48:409–437
115. Petrin A, Levinsohn J (2012) Measuring aggregate productivity growth using plant-level data.
Rand J Econ 43:705–725
116. Pietola K, Myers RJ (2000) Investment under uncertainty and dynamic adjustment in the
Finnish pork industry. Am J Agric Econ 82:956–967
117. Pindyck R, Rotemberg J (1983) Dynamic factor demands and the effects of energy price
shocks. Am Econ Rev 73(5):1066–1079
118. Prucha IR, Nadiri MI (1986) A comparison of alternative methods for the estimation of
dynamic factor demand models under nonstatic expectations. J Econ 33:187–211
119. Roberts MJ, Vuong VA (2013) Empirical modeling of R&D demand in a dynamic framework.
Appl Econ Perspect Policy 35(2):185–205
120. Rothschild M (1971) On the cost of adjustment. Q J Econ 85(4):605–622
121. Rungsuriyawiboon S, Hockmann H (2015) Adjustment costs and efficiency in Polish
agriculture: a dynamic efficiency approach. J Prod Anal 44:51
122. Rungsuriyawiboon S, Stefanou SE (2007) Dynamic efficiency estimation: an application to
U.S. electric utilities. J Bus Econ Stat 25:226–238
123. Rungsuriyawiboon S, Stefanou SE (2008) Decomposition of total factor productivity growth
in the U.S. electric utilities. J Prod Anal 30(3):177–190
124. Samuelson PA (1947) Foundations of economic analysis. Harvard University Press, Cam-
bridge, MA
125. Sckokai P, Moro D (2009) Modelling the impact of the CAP single farm payment on farm
investment and output. Eur Rev Agric Econ 36(3):395–423
126. Seierstad A, Sydsaeter K (1987) Optimal control theory and applications. North Holland,
New York
127. Sengupta JK (1999) A dynamic efficiency model using data envelopment analysis. Int J Prod
Econ 62(3):209–218
128. Serra T, Stefanou SE, Gil JM, Featherstone A (2009) Investment rigidity and policy measures.
Eur Rev Agric Econ 36(1):103–120
129. Shapiro MD (1986) The dynamic demand for capital and labor, the quarterly. J Econ
101(3):513–542
130. Silva E, Stefanou SE (2003) Nonparametric dynamic production analysis and the theory of
cost. J Prod Anal 19:5–32
131. Silva E, Stefanou SE (2007) Dynamic efficiency measurement: theory and application. Am J
Agric Econ 89(2):398–419
132. Silva E, Oude Lansink A, Stefanou SE (2015) The adjustment-cost model of the firm: duality
and productive efficiency. Int J Prod Econ 168:245–256
133. Silva E, Stefanou SE, Oude Lansink A (2019) Dynamic efficiency and productivity measure-
ment. Oxford University Press, New York
134. Skevas T, Oude Lansink A (2014) Reducing pesticide use and pesticide impact by productivity
growth: the case of Dutch Arable farming. J Agric Econ 65(1):191–211
135. Skevas I, Emvalomatis G, Brümmer B (2018) The effect of farm characteristics on the
persistence of technical inefficiency: a case study in German dairy farming. Eur Rev Agric
Econ 45(1):3–25
136. Slade M (1986) Total-factor-productivity measurement when equilibrium is temporary: a
Monte Carlo assessment. J Econ 33(1–2):75–95
137. Smith VL (1961) Investment and production: a study in the theory of the capital-using
enterprise. Oxford University Press, Cambridge, MA
138. Stefanou SE (1989) Returns to scale in the long run: the dynamic theory of cost. South Econ
J 55:570–571
139. Takayama A (1985) Mathematical economics, 2nd edn. Cambridge University Press, Cam-
bridge
140. Taylor CR (1984) Stochastic dynamic duality: theory and empirical applicability. Am J Agric
Econ 66(3):351–357
141. Taylor T, Monson M (1985) Dynamic factor demands for aggregate southeastern United
States agriculture. J Agric Appl Econ 17(2):1–10
142. Teece D, Pisano G, Shuen A (1997) Dynamic capabilities and strategic management. Strateg
Manag J 18(7):509–533
143. Tone K, Tsutsui M (2010) Dynamic DEA: a slacks-based measure approach. Omega 38(3–
4):145–156
144. Tone K, Tsutsui M (2014) Dynamic DEA with network structure: a slacks-based measure
approach. Omega 42(1):124–131
145. Treadway AB (1969) On rational entrepreneurial behavior and the demand for investment.
Rev Econ Stud 36:227–240
15 Dynamic Analysis of Production 639

146. Treadway AB (1970) Adjustment costs and variable inputs in the theory of the competitive
firm. J Econ Theory 2:329–347
147. Tsionas EG (2006) Inference in dynamic stochastic frontier models. J Appl Econ 21(5):669–
676
148. Vaneman WK, Triantis K (2003) The dynamic production axioms and system dynamics
behaviors: the foundation for future integration. J Prod Anal 19(1):93–113
149. Varian HR (1984) The nonparametric approach to production analysis. Econometrica
52(3):579–597
150. Vasavada U, Ball VE (1988) A dynamic adjustment model for U.S. agriculture: 1948–1979.
Agric Econ 2:123–137
151. Vasavada U, Chambers RG (1986) Investment in U.S. agriculture. Am J Agric Econ 68:950–
960
152. Viner J (1931) Cost curves and supply curves, Zeitschrift für Nationalökonomie, 23–46.
Reprinted in Boulding KE, Stigler GJ (eds) 1952 A.E.A Readings in price theory. Homewood:
Richard D. Irwin
153. Yang S, Shumway CR (2016) Dynamic adjustment in US agriculture under climate change.
Am J Agric Econ 98(3):910–924
16 Cost, Revenue, and Profit Function Estimates

Levent Kutlu, Shasha Liu, and Robin C. Sickles

Contents

Introduction
Duality of the Technology and Characterizations of the Technology Using the Cost, Revenue, and Profit Functions
Cost Functions
Cost Function Properties
Functional Forms for Cost Function Estimation
Stochastic Frontier Models for Cost Functions
Endogeneity in Cost Function Models
Marginal Cost Estimation
Revenue Functions
Revenue Function Properties
Functional Forms for Revenue Function Estimation
Stochastic Frontier Models for Revenue Functions
Profit Functions
Profit Function Properties
Functional Forms for Profit Function Estimation
Profit Function with Allocative and Technical Distortions
Stochastic Frontier Models for Profit Functions
Alternative Profit Function
Multi-output Functional Forms
Non-parametric Estimation (and Shape Restrictions)
Concluding Remarks
Cross-References
References

L. Kutlu
Department of Economics and Finance, University of Texas Rio Grande Valley, Edinburg, TX, USA

R. C. Sickles
Department of Economics, Rice University, Houston, TX, USA

S. Liu
Enterprise Model Risk, Freddie Mac, McLean, VA, USA

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_12

Abstract

This chapter reviews the ways in which cost, revenue, and profit functions are
used to identify and characterize an underlying technology. It concentrates on the
more widely used functional forms to motivate various issues in the flexibility of
various parametric functions, in the imposition of regularity conditions, in the
use of non-parametric estimation of models, and in standard econometric models
used to estimate the parameters of these different functional characterizations
of an underlying technology. The modeling scenarios we consider also allow
allocative and technical distortions and address how such distortions may be
modeled empirically in the specification and estimation of the dual functional
representations of the underlying primal technology.

Keywords

Duality · Flexible functional forms · Non-parametric production models · Endogeneity

Introduction

The purpose of this chapter is to provide a review of how cost, revenue, and profit
functions are used to identify and characterize an underlying technology. Such an
undertaking for a Handbook will undoubtedly leave out certain topics. The chapter
provides a relatively cursory discussion of duality theory and the links between
cost, revenue, and profit functions and the underlying technology they characterize
under certain testable regularity conditions. A more extensive recent treatment and
summary can be found in Sickles and Zelenyuk [110].
Moreover, as the functional forms and estimation setup for the cost, revenue,
and profit functions have many generic commonalities, the chapter will concentrate
on the more widely used functions to motivate various issues in the flexibility
of various parametric functions, in the imposition of regularity conditions, in the
use of non-parametric estimation of models, and in standard econometric models
used to estimate the parameters of these different functional characterizations of an
underlying technology.
The chapter also briefly discusses modeling settings in which allocative and
technical distortions may exist and how such distortions may be addressed empirically
in the specification and estimation of the dual functional representations of the
underlying primal technology.

Duality of the Technology and Characterizations of the Technology Using the Cost, Revenue, and Profit Functions¹

Very often researchers either do not have information that allows them to identify
the underlying technology, and thus its characterization in terms of marginal
products, substitution possibilities, and other technical aspects of the production
process, or have problems estimating such a relationship due to statistical problems
such as endogeneity of inputs. This situation was one of the motivations for work
in the area of duality by various legendary economists. Among them, Ronald Shephard
revolutionized neoclassical production theory by developing his duality theory, a
foundation for many later practical results. The chapter summarizes and highlights
some important results of this theory that underpin the discussion of various
estimating relationships, such as the cost, profit, and revenue functions that are
the topics of this chapter.
The starting point is a firm that produces $M_y$ outputs $y \in \mathbb{R}_+^{M_y}$ using $M_x$ inputs
$x \in \mathbb{R}_+^{M_x}$ with exogenous prices $w = (w_1, \ldots, w_{M_x}) \in \mathbb{R}_{++}^{M_x}$ and some technology $T$,
where the technology set $T$ is defined as
$T \equiv \{(x, y) \in \mathbb{R}_+^{M_x} \times \mathbb{R}_+^{M_y} : y \text{ is producible from } x\}$.
The input requirement set $L(y)$ completely characterizes the technology and is defined as

$$L(y) \equiv \{x \in \mathbb{R}_+^{M_x} : y \text{ is producible from } x\}, \quad y \in \mathbb{R}_+^{M_y}. \tag{1}$$

Moreover, Shephard's input distance function, defined as

$$D_i(y, x) \equiv \sup \{\theta > 0 : x/\theta \in L(y)\}, \tag{2}$$

completely characterizes the input requirement set in the sense that

$$x \in L(y) \iff D_i(y, x) \geq 1. \tag{3}$$

A firm faced with a cost constraint chooses its level of inputs given the input prices
and the output level. Such a cost (we are considering here long-run costs) and its
functional representation can be shown to be

$$C(y, w) \equiv \min \{wx : x \in L(y)\}. \tag{4}$$

¹ For more details on the issues discussed in this section, see Chapter 2 (Production Theory: Dual
Approach) of Sickles and Zelenyuk [110], whose notation we adopt here.

Given the cost-minimizing decision of the firm in the employment of its resources,
we can express the input demand functions as

$$x(y, w) \equiv \arg\min \{wx : x \in L(y)\}, \tag{5}$$

which, of course, are conditional on the level of output produced. If the input
requirement sets are convex and there is free disposability of inputs, then it can
be shown that the technology underlying the cost function can be identified. Thus,
under these conditions the cost function is dual to the primal technology.
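
As a rough numerical illustration of the cost function and the conditional input demands defined above, the sketch below assumes a hypothetical single-output Cobb-Douglas technology, y = x1^A x2^B, which is not taken from the chapter. It compares the closed-form minimum cost with a brute-force search over the boundary of L(y) and checks Shephard's lemma by a finite difference. The elasticities A and B, the search range for x1, and all function names are illustrative assumptions.

```python
import math

A, B = 0.3, 0.5  # hypothetical output elasticities (assumed, not from the chapter)


def conditional_demand(y, w1, w2):
    """Cost-minimizing inputs x(y, w) for the assumed technology y = x1**A * x2**B."""
    r = A + B
    k = y ** (1 / r) * (w1 / A) ** (A / r) * (w2 / B) ** (B / r)
    return A * k / w1, B * k / w2


def cost(y, w1, w2):
    """C(y, w) = w x(y, w), the minimum cost of producing y at prices w."""
    x1, x2 = conditional_demand(y, w1, w2)
    return w1 * x1 + w2 * x2


def cost_by_search(y, w1, w2, n=20000):
    """Brute-force minimum of w x over L(y): scan x1, place x2 on the isoquant."""
    best = math.inf
    for i in range(1, n + 1):
        x1 = 10.0 * i / n                # candidate x1 in (0, 10], an assumed range
        x2 = (y / x1 ** A) ** (1 / B)    # smallest x2 with (x1, x2) in L(y)
        best = min(best, w1 * x1 + w2 * x2)
    return best


def shephard_gap(y, w1, w2, h=1e-6):
    """Difference between dC/dw1 (central difference) and the demand for input 1."""
    slope = (cost(y, w1 + h, w2) - cost(y, w1 - h, w2)) / (2 * h)
    return slope - conditional_demand(y, w1, w2)[0]
```

Calling `cost_by_search(1.0, 2.0, 3.0)` and `cost(1.0, 2.0, 3.0)` should give nearly identical values, and `shephard_gap(1.0, 2.0, 3.0)` should be close to zero. The grid search is only meant to make the minimization in the definition concrete; it is not an estimation method.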
If a firm's behavioral objective is to maximize revenues instead of minimizing
costs, then a duality can be shown to exist between the revenue function and the
underlying primal technology under certain regularity conditions. First, define the
technological possibilities (output set) as
$P(x) \equiv \{y \in \mathbb{R}_+^{M_y} : y \text{ is producible from } x\}$, $x \in \mathbb{R}_+^{M_x}$,
and let the output prices for the $M_y$ outputs be $p = (p_1, \ldots, p_{M_y}) \in \mathbb{R}_{++}^{M_y}$.
The output set $P(x)$ completely characterizes the technology. Shephard's output
distance function is used to completely characterize $P(x)$ as

$$y \in P(x) \iff D_o(x, y) \leq 1, \tag{6}$$

where

$$D_o(x, y) \equiv \inf \{\theta > 0 : y/\theta \in P(x)\}. \tag{7}$$

The revenue function $R : \mathbb{R}_+^{M_x} \times \mathbb{R}_{++}^{M_y} \to \mathbb{R}_+ \cup \{+\infty\}$ is then defined as

$$R(x, p) \equiv \max \{py : y \in P(x)\}, \tag{8}$$

which leads to a set of output supply functions

$$y(x, p) \equiv \arg\max \{py : y \in P(x)\}. \tag{9}$$

Finally, the profit function $\pi : \mathbb{R}_{++}^{M_x} \times \mathbb{R}_{++}^{M_y} \to \mathbb{R}_+ \cup \{+\infty\}$ is defined as

$$\pi(w, p) \equiv \sup_{x, y} \{py - wx : (x, y) \in T\}, \tag{10}$$

and the corresponding output supply and input demand equations are given by

$$(x(w, p), y(w, p)) \equiv \arg\sup_{x, y} \{py - wx : (x, y) \in T\}, \tag{11}$$

assuming profit-maximizing behavior.
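
The profit function and its supply and demand equations can be made concrete with a small sketch. It assumes a hypothetical one-input, one-output technology y = x^A with 0 < A < 1, chosen only because its profit-maximizing solution has a closed form; nothing here is taken from the chapter. The finite-difference check uses Hotelling's lemma, a standard duality result that the text above does not state explicitly.

```python
A = 0.5  # hypothetical output elasticity with 0 < A < 1 (assumed for illustration)


def profit_solution(w, p):
    """Return (x, y, profit) maximizing p*y - w*x subject to y = x**A."""
    x = (A * p / w) ** (1 / (1 - A))   # first-order condition: p*A*x**(A-1) = w
    y = x ** A
    return x, y, p * y - w * x


def profit(w, p):
    """Profit function pi(w, p) for the assumed technology."""
    return profit_solution(w, p)[2]


def hotelling_gaps(w, p, h=1e-6):
    """Compare numerical derivatives of pi(w, p) with output supply and input demand."""
    x, y, _ = profit_solution(w, p)
    d_p = (profit(w, p + h) - profit(w, p - h)) / (2 * h)   # should equal y
    d_w = (profit(w + h, p) - profit(w - h, p)) / (2 * h)   # should equal -x
    return d_p - y, d_w + x
```

With w = 2 and p = 4, the sketch gives x = 1, y = 1, and a profit of 2, and both gaps returned by `hotelling_gaps(2.0, 4.0)` should be close to zero.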


Next, the chapter turns to explicit functional forms and assumptions for cost
functions and factor demand equations, revenue functions and output supply equations,
and profit functions and the corresponding output supply and input demand
equations. The chapter also considers their shadow prices when allocative distortions
exist in the optimal relative output mix and input mix selected by the firm.

Cost Functions

Simple inflexible cost functions, thanks to their parametric forms, often satisfy
the regularity conditions required by production theory and its dual representations
such as the cost function. However, these simple but inflexible forms have serious
limitations. Aside from the strong and often unrealistic restrictions they impose on
the technology being modeled, they suffer from other shortcomings as well. A
multi-output Cobb-Douglas distance function, for example, does not satisfy the concavity
condition because it has a convex production possibility frontier. Since a majority of
firms produce more than one output, distinguishing each output by using a different
production function is empirically infeasible and theoretically dubious. Because
substitution possibilities do not vary across inputs under inflexible forms, the
multi-output version of the technology using inflexible functions, in general,
does not have varying substitution possibilities either. More flexible functional
representations of production are needed to satisfy the regularity conditions and to
resolve the issues of using inflexible forms in multi-output production. Flexible
functional forms allow non-increasing marginal rates of substitution, which is a
property all well-defined production functions possess.
One important motivation for using flexible functional forms is that they do not
impose any prior restrictions on the Allen-Uzawa elasticities of substitution. Given
any arbitrary function, a flexible form can match the function as well as its
first two derivatives at a point [19, 117]. Flexible functional forms are not
entirely new constructs. In fact, they can be derived by adding second-order terms
to a wide range of functions used in production studies. Therefore,
the flexible functional forms can be considered as non-parametric versions of the
commonly used functional forms such as the linear, the Leontief, and the
Cobb-Douglas functions.
The following sections focus on a set of cost functions widely used
in the literature: the translog, the quadratic, the generalized Cobb-Douglas, the
generalized Leontief, the CES-translog, and the symmetric generalized McFadden
cost functions. This section presents some of the important features of the
dual cost function and issues related to its estimation. Since many of these concepts
apply to revenue and profit function estimation, those sections are relatively brief.

Cost Function Properties

The cost function gives the minimal cost of producing a given level of outputs
$y \in \mathbb{R}_+^{M_y}$ with given technological possibilities and fixed input prices
$w \in \mathbb{R}_{++}^{M_x}$, where $M_y$ and $M_x$ are the numbers of outputs and inputs, respectively.
Duality theory shows that the cost function of a productive unit contains all the
information about its technology. An immediate example is that the scale elasticity
based on the input distance function coincides with the cost-based measure of scale
elasticity. Hence, understanding the cost function is essential for understanding the
technology of production. The properties of a cost function are summarized below,
as these properties play a central role in estimating a cost function [110].

1. $C(y, w) \geq 0$ (non-negativity)
2. $C(y, w)$ is continuous in $(y, w)$ (continuity)²
3. $C(y, kw) = k\,C(y, w)$, $\forall k > 0$ (linear homogeneity in $w$)
4. $C(y, \tilde{w}) \geq C(y, w)$, $\forall \tilde{w} \geq w$ (monotonicity in $w$)
5. $C(y, w)$ is concave in $w$ (concavity in $w$)

where $y \in \mathbb{R}_+^{M_y}$ and $w \in \mathbb{R}_{++}^{M_x}$ are vectors of outputs and input prices, respectively.
In practice, Conditions 1 and 2 are automatically satisfied by a proper functional
choice for the cost function. Condition 1 may be violated for some functional form
choices but, generally, it is satisfied at sample data points. Imposition of Condition 3
is not problematic either. However, imposing Conditions 4 and 5 on a cost function
is a relatively more difficult, yet possible, task. The difficulty stems from the fact
that, for flexible functional forms, the restrictions would be observation specific. In
practice, the monotonicity condition is usually the lesser concern, since estimated
factor demands are typically positive and cost is typically increasing in output even
with no parametric restrictions imposed. However, curvature conditions pose a somewhat
difficult problem when estimating a flexible functional form.
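
The properties listed above can be checked numerically at individual points, which is essentially what an observation-specific regularity check does. The sketch below is a minimal illustration under stated assumptions: the helper `check_cost_properties`, the two example cost functions, and the tolerances are all hypothetical, and the concavity test is only a necessary condition based on one midpoint comparison between two price vectors.

```python
def check_cost_properties(cost, y, w, w_alt, k=2.0, step=1e-4, tol=1e-9):
    """Check properties 1, 3, 4 and 5 of a candidate cost function at (y, w).

    The concavity check compares the cost at the midpoint of w and a second
    price vector w_alt with the average of the two costs (necessary only).
    """
    c0 = cost(y, w)
    scaled = cost(y, [k * wj for wj in w])
    # Raise one price at a time by a small step; cost must not fall.
    bumped = [cost(y, [wj + (step if j == i else 0.0) for j, wj in enumerate(w)])
              for i in range(len(w))]
    mid = [(a + b) / 2 for a, b in zip(w, w_alt)]
    return {
        "non_negative": c0 >= 0,
        "homogeneous": abs(scaled - k * c0) <= tol * max(1.0, abs(c0)),
        "monotone": all(c >= c0 - tol for c in bumped),
        "concave": cost(y, mid) >= (c0 + cost(y, w_alt)) / 2 - tol,
    }


def cobb_douglas_cost(y, w):
    """Hypothetical Cobb-Douglas cost function; passes all four checks."""
    return y * w[0] ** 0.4 * w[1] ** 0.6


def norm_cost(y, w):
    """Hypothetical counterexample: homogeneous and monotone, but convex in w."""
    return y * (w[0] ** 2 + w[1] ** 2) ** 0.5
```

For example, `check_cost_properties(cobb_douglas_cost, 2.0, [1.0, 2.0], [3.0, 1.0])` should report every property as satisfied, while the same call with `norm_cost` should fail only the concavity check.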

Functional Forms for Cost Function Estimation

This section briefly discusses some of the most widely used functional forms for
cost function estimation and how regularity conditions are treated in
this context. Although this section concentrates only on single-output cost functions,
generalizations to multi-output cases are available and straightforward.³

² A weaker continuity condition is that C(y, w) is continuous in w and lower semi-continuous in y.

³ See Caves et al. [19] for a discussion of multi-output cost functions. See also Röller [100] for
another study that considers multi-output cost functions.

Translog Cost Function

The translog (TL) cost function is the most widely used flexible functional form for
cost function estimation and is given by

$$\ln C(y, w) = \beta_0 + \beta_y \ln y + \sum_j \beta_j \ln w_j + \frac{1}{2} \beta_{yy} (\ln y)^2 + \sum_j \beta_{yj} \ln y \ln w_j + \frac{1}{2} \sum_{j,k} \beta_{jk} \ln w_j \ln w_k, \tag{12}$$

where $\beta_{jk} = \beta_{kj}$ (symmetry), $\sum_j \beta_j = 1$, $\sum_j \beta_{yj} = 0$, and $\sum_k \beta_{jk} = 0$ (linear
homogeneity). A standard way to impose the linear homogeneity restriction is by
normalizing $C(y, w)$ and the input prices using one of the input prices. It is common
to estimate the cost function jointly with the input share system in order to add
degrees of freedom and boost the precision of the estimates. This, of course, may not
be appropriate if input allocations are distorted, in which case the observed input
shares are not given by the cost-minimizing shares derived from the TL:

$$s_j(y, w) = \frac{\partial \ln C(y, w)}{\partial \ln w_j} = \beta_j + \beta_{yj} \ln y + \sum_k \beta_{jk} \ln w_k. \tag{13}$$

These input share equations (as opposed to the input demand equations in
level form) are linear in parameters. Regularity conditions can be tested using the
cost function's estimates. For example, the monotonicity condition is satisfied if
$s_j(y, w) \geq 0$. Homogeneity in $y$ holds when $\beta_{yy} = \beta_{yj} = 0$ (linear homogeneity
additionally requires $\beta_y = 1$), while the less restrictive property of homotheticity
only requires that $\beta_{yj} = 0$. When the TL second-order terms $\beta_{yy}$, $\beta_{yj}$, $\beta_{jk}$ are
zero, it becomes the Cobb-Douglas (CD) cost function.
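
To make the translog restrictions concrete, the following sketch evaluates a two-input TL cost function and its share equations at given data points. All parameter values are hypothetical and were chosen only so that the symmetry and linear homogeneity restrictions hold; the function names `ln_cost` and `shares` are illustrative as well.

```python
import math

# Hypothetical parameter values for two inputs (assumed for illustration only).
B0, BY, BYY = 0.5, 0.8, 0.05
BJ = [0.4, 0.6]                  # first-order price terms, sum to 1
BYJ = [0.02, -0.02]              # output-price interactions, sum to 0
BJK = [[0.10, -0.10],
       [-0.10, 0.10]]            # symmetric, each row sums to 0


def ln_cost(y, w):
    """Log of the translog cost function at output y and input prices w."""
    ly, lw = math.log(y), [math.log(wj) for wj in w]
    n = len(w)
    val = B0 + BY * ly + 0.5 * BYY * ly ** 2
    val += sum(BJ[j] * lw[j] for j in range(n))
    val += sum(BYJ[j] * ly * lw[j] for j in range(n))
    val += 0.5 * sum(BJK[j][k] * lw[j] * lw[k] for j in range(n) for k in range(n))
    return val


def shares(y, w):
    """Cost-minimizing input shares, the log-derivatives of the cost function."""
    ly, lw = math.log(y), [math.log(wj) for wj in w]
    n = len(w)
    return [BJ[j] + BYJ[j] * ly + sum(BJK[j][k] * lw[k] for k in range(n))
            for j in range(n)]
```

Under the homogeneity restrictions the shares returned by `shares(2.0, [1.5, 3.0])` sum to one, and doubling both prices raises `ln_cost` by exactly log 2. Monotonicity at a data point amounts to checking that every returned share is non-negative.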

Translog Cost Functions with Allocative and Technical Distortions

Kumbhakar [63] discusses inefficiencies with a focus on multiple outputs in the
frameworks of cost minimization and profit maximization, using translog functions to
represent the technology. A firm minimizes shadow cost given inefficient output, and
the resulting shadow cost is

$$c^*(w^*, y e^u) = \sum_j w_j^* \, x_j(w^*, y e^u), \tag{14}$$

where $w_j^*$ is the shadow price of input $j$, $y$ is the actual output, and
$u \geq 0$ is technical inefficiency such that $y e^u$ is the maximum possible output. Since
shadow costs are unobservable, actual costs are related to shadow costs through the
input demand functions, which gives

$$\ln c^A = \ln c^* + \ln \Big( \sum_j S_j^* \theta_j^{-1} \Big), \tag{15}$$

where $S_j^*$ is the shadow cost share and $\theta_j \neq 1$ indicates allocative inefficiency. Actual
cost shares can be related to the shadow cost shares by

$$S_j^A = \frac{S_j^* \theta_j^{-1}}{\sum_k S_k^* \theta_k^{-1}}. \tag{16}$$

A translog form is used for the shadow cost function, with homogeneity of degree one
in $w^*$ imposed, and is written as

$$\ln \frac{c^*}{w_1^*} = \alpha_0 + \sum_j \alpha_j \ln \tilde{w}_j^* + \frac{1}{2} \sum_{j,k} \alpha_{jk} \ln \tilde{w}_j^* \ln \tilde{w}_k^* + \sum_m \beta_m \ln (y_m e^u) + \frac{1}{2} \sum_{m,l} \beta_{ml} \ln (y_m e^u) \ln (y_l e^u) + \sum_{j,m} \gamma_{jm} \ln w_j^* \ln (y_m e^u), \tag{17}$$

where $\alpha_{jk} = \alpha_{kj}$, $\beta_{ml} = \beta_{lm}$, and $\tilde{w}_j^* = w_j^* / w_1^*$.
Then shadow cost shares can be obtained as

$$S_j^* = \frac{\partial \ln c^*}{\partial \ln w_j^*} = \alpha_j + \sum_k \alpha_{jk} \ln \tilde{w}_k^* + \sum_m \gamma_{jm} \ln (y_m e^u). \tag{18}$$

Technical inefficiency not only appears additively but also interacts with input
prices and outputs, which results in heteroscedasticity.
In the presence of input inefficiency, the shadow cost function incorporating
technical inefficiency is

$$\tilde{c}(w^*, y) = \sum_j w_j^* \, x_j^e(w^*, y), \tag{19}$$

and input demand functions are derived from Shephard's lemma

$$x_j^e(w^*, y) = \frac{\partial \tilde{c}(w^*, y)}{\partial w_j^*}. \tag{20}$$

For the translog cost function, actual cost and shadow cost can be related by

$$\ln c^A = \ln \tilde{c}(w^*, y) + \ln \Big( \sum_j \tilde{S}_j \theta_j^{-1} \Big) + \tau, \tag{21}$$

where $\tilde{S}_j$ is the shadow cost share in the case of input inefficiency and $\tau$ is the
technical inefficiency term. Similar to the derivation in the output inefficiency case,
actual cost shares can be derived as

$$S_j^A = \frac{w_j x_j}{c^A} = \frac{\tilde{S}_j \theta_j^{-1}}{\sum_k \tilde{S}_k \theta_k^{-1}}. \tag{22}$$

The cost function $\ln c^A$ is then complete after using the translog form for
$\ln \tilde{c}(w^*, y)$, and $\tilde{S}_j$ is derived from the translog form.
Sickles and Streitwieser [109] study distortions in the pipeline transmission
of natural gas, employing a restricted cost function in which the distortions are
captured by shadow prices, and estimate various aspects of production.
Assuming exogenous output and input prices, a firm minimizes its short-run cost
as follows:

$$\min \sum_i w_i x_i \quad \text{subject to} \quad G(y, x; t) = 0, \tag{23}$$

where $G$ is the transformation function for the technology indexed by $t$, and $x$
includes labor, energy, and two quasi-fixed capital inputs. The solution to this is the
short-run variable cost function

$$VC = C(y, w, x; t), \tag{24}$$

where $C$ is homogeneous of degree one, non-decreasing, and concave in factor prices
$w$, non-increasing and convex in the quasi-fixed factors $x$, and non-negative and
non-decreasing in output $y$. A non-homothetic translog function is used to approximate
$C$. Given exogenous $w_i$, they derive the variable cost share utilizing Shephard's
lemma as

$$M_i = \alpha_i + \sum_j \beta_{ij} \ln w_j + \beta_{yi} \ln y + \sum_k \beta_{ik} \ln x_k. \tag{25}$$

The shadow share equation $-\partial \ln C / \partial \ln x_k = z_k x_k / VC$ is incorporated in the model, where
$z_k$, the shadow price, can be obtained by taking the difference between revenues and
variable costs. The shadow cost share in the restricted translog cost function is

$$M_k = -\Big( \alpha_k + \sum_i \beta_{ik} \ln w_i + \beta_{yk} \ln y + \sum_h \beta_{hk} \ln x_h \Big). \tag{26}$$

Good, Nadiri, and Sickles [38] develop several modeling scenarios for the airline
industry that allow input price distortions to be incorporated in a translog variable
cost function, which captures the linkage between observed cost and assumed minimized
cost. Airlines are assumed to use inputs $x = (x_J, x_{N-J}) > 0$ to produce outputs
$y = (y_K, y_{M-K})$, where the last $N - J$ inputs are assumed to be fixed and the last
$M - K$ outputs are non-physical output characteristics. Consider a virtual technology
and virtual input and output decisions, labeled with an asterisk, that are consistent
with the standard assumptions of duality theory. The observed prices deviate from the
virtual prices by $\theta = (\theta_1, \ldots, \theta_N)$ such that $w_i^* = w_i + \theta_i$ for input $i$. Based on
Shephard's lemma, factor demands derived from the firm's minimum virtual cost
function are

$$x_J^*(y, w_J^*; x_{N-J}) = \nabla_{w_J^*} C^*(y, w_J^*; x_{N-J}). \tag{27}$$

The observed cost function and associated short-run factor shares are

$$C(y, w_J^*, w_J; x_{N-J}) = \sum_j w_j \, x_j^*(y, w_J^*; x_{N-J}) \tag{28}$$

and

$$M_i = \frac{w_i x_i}{C(y, w_J^*, w_J; x_{N-J})}, \quad i = 1, \ldots, J. \tag{29}$$

Since $M_i^* = w_i^* x_i / C^*$, observed input use can be written as $x_i = M_i^* C^* / w_i^*$.
Then, observed costs can be expressed as

$$C = C^* \sum_i \frac{M_i^* w_i}{w_i^*}, \tag{30}$$

and observed factor shares expressed as

$$M_i = \frac{M_i^* w_i / w_i^*}{\sum_j M_j^* w_j / w_j^*}. \tag{31}$$

The equations above provide linkages between an observable cost function and
the virtual technology when the application of the technology is distorted.
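
The link between virtual and observed quantities can be traced in a few lines of code. The sketch below assumes that the virtual cost and the virtual shares are already known (in practice they would come from an estimated cost function) and applies the additive price distortions described above. The function names and all numbers in the usage note are hypothetical.

```python
def observed_cost_and_shares(virtual_cost, virtual_shares, w, theta):
    """Observed cost and factor shares implied by virtual shares and distortions.

    Virtual prices are w*_i = w_i + theta_i, as in the text above.
    """
    w_star = [wi + ti for wi, ti in zip(w, theta)]
    ratios = [m * wi / ws for m, wi, ws in zip(virtual_shares, w, w_star)]
    total = sum(ratios)
    cost = virtual_cost * total            # observed cost
    shares = [r / total for r in ratios]   # observed factor shares
    return cost, shares


def observed_cost_direct(virtual_cost, virtual_shares, w, theta):
    """Same cost computed from input quantities x_i = M*_i C* / w*_i valued at w."""
    w_star = [wi + ti for wi, ti in zip(w, theta)]
    x = [m * virtual_cost / ws for m, ws in zip(virtual_shares, w_star)]
    return sum(wi * xi for wi, xi in zip(w, x))
```

For instance, with a virtual cost of 100, virtual shares (0.5, 0.3, 0.2), observed prices (1, 2, 4), and distortions (0.25, 0, -1), both routes should give the same observed cost, and the observed shares still sum to one. With all distortions set to zero the observed and virtual quantities coincide.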
Atkinson and Halvorsen [10] incorporate regulatory constraints into the cost
function framework, in which they assume shadow prices to be simply proportional
to market prices. Later, Getachew and Sickles [36] utilize the same approach to
study the impact of policy constraints on relative prices and the structure of
production. With additional constraints $R(w, x; \phi)$ imposed, the firm minimizes the
production cost as follows:

$$\min_x C = w'x \quad \text{s.t.} \quad f(x) \geq Q \ \text{ and } \ R(w, x; \phi) \leq 0, \tag{32}$$

where $f(x)$ is a production function and $Q$ is a certain level of output. Forming the
Lagrangian, the constrained cost minimization of the firm becomes

$$L = w'x - v\,(f(x) - Q) - \sum_r \lambda_r R_r(w, x; \phi), \tag{33}$$

where $\lambda_r$ are the Lagrangian multipliers for each of the $R_r$ constraints. The
unobserved shadow prices are approximated by using a first-order Taylor series

$$w_i^e = k_i w_i, \tag{34}$$

where $k_i$ is a factor of proportionality for the price of input $i$. The updated demand
functions can be derived from the shadow cost function using Shephard's lemma.
The updated demand functions give an actual cost function

$$\ln C^A = \ln C^* + \ln \sum_i \frac{M_i^*}{k_i}, \tag{35}$$

where $M_i^*$ is the shadow share of factor $i$. The actual share equation $M_i^A$ is derived
to be

$$M_i^A = \frac{M_i^* / k_i}{\sum_j M_j^* / k_j}. \tag{36}$$

The shadow cost function $\ln C^*$ can be rewritten in the translog form as follows:

$$\ln C^* = \alpha_0 + \alpha_Q \ln Q + \frac{1}{2} \gamma_{QQ} (\ln Q)^2 + \sum_i \alpha_i \ln (k_i w_i) + \sum_i \gamma_{iQ} \ln Q \ln (k_i w_i) + \frac{1}{2} \sum_{i,j} \gamma_{ij} \ln (k_i w_i) \ln (k_j w_j) + \delta_t t, \tag{37}$$

where $t$ is the time trend that represents technological change over time. Then,
the expression for the shadow share $M_i^*$ can be obtained by logarithmic
differentiation. Substituting into the actual cost function gives

$$\ln C^A = \ln C^* + \ln \Big\{ \sum_i \Big[ \alpha_i + \gamma_{iQ} \ln Q + \sum_j \gamma_{ij} \ln (k_j w_j) \Big] \frac{1}{k_i} \Big\}. \tag{38}$$

Then, the actual cost share of input $i$ can be derived as

$$M_i^A = \frac{\Big[ \alpha_i + \gamma_{iQ} \ln Q + \sum_j \gamma_{ij} \ln (k_j w_j) \Big] \frac{1}{k_i}}{\sum_l \Big[ \alpha_l + \gamma_{lQ} \ln Q + \sum_j \gamma_{lj} \ln (k_j w_j) \Big] \frac{1}{k_l}}. \tag{39}$$

The actual cost function is then complete.

Generalized Leontief Cost Function

The generalized Leontief (GL) cost function [25] is homogeneous by construction
and is given by:

$$C(y, w) = \sum_j \beta_j w_j + y \sum_{j,k} \beta_{jk} w_j^{1/2} w_k^{1/2} + y^2 \sum_j \beta_{yj} w_j, \tag{40}$$

where $\beta_{jk} = \beta_{kj}$ (symmetry). Input demand equations are given by:

$$x_j(y, w) = \beta_j + y \sum_k \beta_{jk} \Big( \frac{w_k}{w_j} \Big)^{1/2} + \beta_{yj} y^2. \tag{41}$$

The monotonicity condition is satisfied if $x_j(y, w) \geq 0$. The GL cost function
is non-homothetic unless $\beta_j = \beta_{yj} = 0$ and is incapable of distinguishing between
homotheticity and linear homogeneity. When $\beta_{jk} = 0$ for $j \neq k$, the GL cost function
collapses to the Leontief fixed-proportions cost function.
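
The relationship between the GL cost function and its input demands can be verified numerically. In the sketch below, the three-input parameter values are hypothetical, and the check simply confirms that the demand equations are the price derivatives of the cost function (Shephard's lemma) and that they add up to total cost.

```python
import math

# Hypothetical parameters for three inputs (assumed for illustration only).
BETA = [0.5, 0.2, 0.3]
BETA_Y = [0.01, 0.02, 0.00]
BETA_JK = [[0.6, 0.2, 0.1],
           [0.2, 0.4, 0.3],
           [0.1, 0.3, 0.5]]    # symmetric


def gl_cost(y, w):
    """Generalized Leontief cost at output y and input prices w."""
    n = len(w)
    linear = sum(BETA[j] * w[j] for j in range(n))
    cross = sum(BETA_JK[j][k] * math.sqrt(w[j] * w[k])
                for j in range(n) for k in range(n))
    quad = sum(BETA_Y[j] * w[j] for j in range(n))
    return linear + y * cross + y ** 2 * quad


def gl_demand(y, w):
    """Input demands implied by the GL cost function."""
    n = len(w)
    return [BETA[j]
            + y * sum(BETA_JK[j][k] * math.sqrt(w[k] / w[j]) for k in range(n))
            + BETA_Y[j] * y ** 2
            for j in range(n)]


def shephard_gap(y, w, j, h=1e-6):
    """Central-difference derivative of cost in price j minus the demand for input j."""
    up = [wk + (h if k == j else 0.0) for k, wk in enumerate(w)]
    down = [wk - (h if k == j else 0.0) for k, wk in enumerate(w)]
    return (gl_cost(y, up) - gl_cost(y, down)) / (2 * h) - gl_demand(y, w)[j]
```

At, say, y = 2 and w = (1, 4, 9), the gaps returned by `shephard_gap` should be close to zero for each input, and the sum of price times demand should reproduce `gl_cost`. Checking monotonicity at a data point reduces to checking that each element of `gl_demand` is non-negative.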

The Symmetric Generalized McFadden Cost Function

The symmetric generalized McFadden (SGM) cost function [25] is given by:

$$C(y, w) = g(w)\,y + \sum_j \beta_j w_j + y \sum_j \beta_{yj} w_j, \tag{42}$$

where $g(w) = \frac{1}{2} \frac{w' S w}{\theta' w}$, $S$ is a symmetric negative semidefinite parameter matrix,
and $\theta$ is a non-negative vector (not all zero). In order to achieve identification of
all parameters, it is necessary to have $S \bar{w} = 0$ for some $\bar{w}$ with strictly positive
components, e.g., a vector of ones. Input demand equations are given by the vector:

$$x(w) = \frac{S w}{\theta' w} - \frac{1}{2} \frac{w' S w}{(\theta' w)^2}\, \theta. \tag{43}$$

By construction, the SGM cost function is linearly homogeneous in $w$. The monotonicity
condition is satisfied if the components of $x(w)$ are non-negative. It turns out that
$C(y, w)$ is globally concave in $w$ if $S$ is negative semidefinite. If the estimate of $S$
is not negative semidefinite, one can reparametrize $S$ as $S = -LL'$, where $L$ is a
lower triangular matrix such that $L' \bar{w} = 0$, which assures global concavity of
$C(y, w)$. Kumbhakar [62] gives a generalization of the SGM cost function to the
multi-output case that makes it relatively easy to estimate different aspects of a
production technology. He applies the SGM to a panel data set of 12 Finnish foundry
plants to estimate technical progress, economies of scale, and economies of scope.
Rask [97] proposes a modified version of the SGM that allows for fixed factors of
production, so that the cost function can be applied to processes with fixed costs.
He estimates the modified SGM cost function for sugarcane in Brazil. Sugarcane accounts
for over two-thirds of total costs in ethanol production, which makes the technology of
sugarcane production important to study.

Imposing Regularity Conditions for Cost Functions

As Barnett [11] points out, unless both monotonicity and curvature conditions are
satisfied, the second-order conditions for optimization and duality theory fail. While
some empirical researchers do not report on these conditions, many others check
the regularity conditions carefully. Guilkey and Lovell [46] and Guilkey et al. [47]
are examples of studies that provide evidence of potentially poor global behavior of
multi-output cost functions.⁴

⁴ See Wales [117] for another example in the utility function context.

If the percentage of violations of the monotonicity and curvature conditions is small
(e.g., smaller than 5%), some researchers attribute this to the stochastic nature of the
estimations and find the violations acceptable. When the percentage of violations is
high, some researchers modify the model to get an acceptable violation percentage.
For example, when estimating a TL variable cost function of US airports, Kutlu
and McCarthy [73] include an additional term to reduce the violation percentages
for the monotonicity and concavity conditions. The percentage of violations decreases
from 4.2% to 0.5% after including this term. They argue that some airports have
particularly high capital levels relative to the median airport, and the additional
term that they include captures this pattern.
Another approach is simply imposing regularity conditions. Serletis and Feng
[107] and references therein provide good discussions on how this can be done.
Hence, the rest of this subsection closely follows their arguments. Serletis and Feng
[107] categorize these methods as local regularity (at some data point in the sample),
regional regularity (over a neighborhood of data points in the sample), pointwise
regularity (at every data point in the sample), or global regularity (at all possible
data points).
Cholesky decomposition methods for imposing regularity conditions were first
used by Wiley et al. [120]. This method is based on the Cholesky decomposition
of a Hessian matrix into the product of a lower triangular matrix and its conjugate
transpose. For imposing concavity, one can reparametrize a matrix S as S = −LL′,
where L is a lower triangular matrix. As stated by Serletis and Feng [107], this
approach can be used to impose not only the curvature conditions but also the
monotonicity conditions. While this approach is capable of imposing local and global
curvature conditions, it cannot impose regional or pointwise curvature conditions.
For monotonicity, the approach can be used to impose the local monotonicity condition.
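The reparametrization S = −LL′ can be illustrated with a minimal sketch in plain Python. The function names `neg_llt` and `quad_form`, the dimension, and the random parameter values are illustrative assumptions and are not taken from Wiley et al. [120] or Serletis and Feng [107]. The point is only that any unrestricted lower triangular L yields a negative semidefinite S, so an estimation routine can search over the elements of L freely.

```python
import random

def neg_llt(theta, n):
    """Return S = -L L' where the lower triangle of L is filled from theta."""
    L = [[0.0] * n for _ in range(n)]
    values = iter(theta)
    for i in range(n):
        for j in range(i + 1):
            L[i][j] = next(values)
    # S[i][j] = -sum_k L[i][k] * L[j][k]
    return [[-sum(L[i][k] * L[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

def quad_form(S, x):
    """Compute the quadratic form x' S x."""
    n = len(x)
    return sum(x[i] * S[i][j] * x[j] for i in range(n) for j in range(n))

rng = random.Random(0)
n = 3
# n(n+1)/2 unrestricted parameters (hypothetical values).
theta = [rng.uniform(-1.0, 1.0) for _ in range(n * (n + 1) // 2)]
S = neg_llt(theta, n)

# x' S x = -||L' x||^2, so no direction x can produce a positive value.
worst = max(quad_form(S, [rng.gauss(0.0, 1.0) for _ in range(n)])
            for _ in range(1000))
print("largest x'Sx over random draws:", worst)
```

In an application, S would stand for the matrix whose negative semidefiniteness is required at the chosen reference point, and the elements of L would be estimated in place of the original parameters.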
As an illustration, consider the TL cost function given in section “Translog Cost
Function.” The concavity in input prices is satisfied if the Hessian matrix

$$H(y, w) = \frac{\partial^2 C(y, w)}{\partial w \, \partial w'} \tag{44}$$

is negative semidefinite. Diewert and Wales [25] prove that H is negative semidefi-
nite if and only if the following matrix is negative semidefinite:

G (y, w) = B − Diag (s (y, w)) + s (y, w) s (y, w)′, (45)

where B = [β ij ] is the matrix with element ij being equal to β ij , s (y, w) =
(s1 (y, w), s2 (y, w), · · · , sMx (y, w))′ is the input share vector, and Diag(s(y, w))
is the Mx × Mx diagonal matrix with diagonal elements being equal to input
share vector s(y, w). Since G(y, w) is observation specific, it may not be easy to
impose concavity for all data points in the sample. However, as in Ryan and Wales
[102] and Feng and Serletis [30], concavity can be easily imposed on G(y, w) at
a reference point in the sample. Usually once the concavity is satisfied at a single
reference point, it is satisfied at most of the other sample points (if not all). If the
654 L. Kutlu et al.

percentage of violations is still high, one can simply try other reference points
and choose the reference point that gives the minimum number of violations. The TL
cost function would satisfy global concavity in input prices if s(y, w) > 0 and B is
negative semidefinite [25]. However, Lau [78] and Diewert and Wales [25] argue
that imposing negative semidefiniteness on B destroys the flexibility of the TL cost
function and reduces it to the Cobb-Douglas form. The imposition of monotonicity
by the Cholesky decomposition is not difficult and is explained by Serletis and Feng
[107].
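The pointwise check based on G(y, w) in Eq. (45) can be sketched as follows for a two-input translog cost function. The share equation form, all coefficient values, and the simulated sample are assumptions made for illustration. With two inputs and linear homogeneity imposed, B = β11 [[1, −1], [−1, 1]], so G has eigenvalues 0 and 2(β11 − s1 s2), and concavity at a data point reduces to β11 ≤ s1 s2.

```python
import math
import random

def share1(w1, w2, y, b11, b1=0.4, b1y=0.02):
    """Translog cost share of input 1 with homogeneity imposed (assumed form)."""
    return b1 + b11 * math.log(w1 / w2) + b1y * math.log(y)

def g_matrix(b11, s1):
    """G = B - Diag(s) + s s' for two inputs, with B = b11 * [[1, -1], [-1, 1]]."""
    s = [s1, 1.0 - s1]
    B = [[b11, -b11], [-b11, b11]]
    return [[B[i][j] - (s[i] if i == j else 0.0) + s[i] * s[j] for j in range(2)]
            for i in range(2)]

def max_eig_sym2(G):
    """Largest eigenvalue of a symmetric 2x2 matrix (closed form)."""
    a, c, d = G[0][0], G[0][1], G[1][1]
    return 0.5 * (a + d) + math.sqrt((0.5 * (a - d)) ** 2 + c * c)

def violation_share(b11, sample, tol=1e-10):
    """Percentage of sample points violating monotonicity or concavity."""
    bad = 0
    for w1, w2, y in sample:
        s1 = share1(w1, w2, y, b11)
        monotone = 0.0 < s1 < 1.0                       # positive input shares
        concave = max_eig_sym2(g_matrix(b11, s1)) <= tol  # G negative semidefinite
        if not (monotone and concave):
            bad += 1
    return 100.0 * bad / len(sample)

rng = random.Random(1)
# Hypothetical sample of (w1, w2, y) observations.
sample = [(rng.uniform(0.5, 2.0), rng.uniform(0.5, 2.0), rng.uniform(1.0, 10.0))
          for _ in range(500)]
print("violations (%), b11 = 0.08:", violation_share(0.08, sample))
print("violations (%), b11 = 0.30:", violation_share(0.30, sample))
```

Because s1 s2 can never exceed 0.25, any β11 above 0.25 violates concavity at every point, while a small β11 satisfies it wherever the shares stay well inside (0, 1). With more than two inputs the same check would need a general eigenvalue routine.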
The non-linear optimization method for imposing regularity conditions is first
used by Geman and Geman [35]. In order to reduce computational difficulties and
time, Serletis and Feng [107] impose linear homogeneity by normalizing the cost
and input prices by the last input price wMx .5 They impose negative semidefiniteness
on G(y, w), i.e., concavity in input prices, by restricting its eigenvalues to be
non-positive. They also impose non-negativity on the cost function and non-
negativity of input shares (monotonicity). This approach can impose curvature and
monotonicity conditions locally, regionally, and pointwise. It is possible to impose
global concavity by restricting the eigenvalues of B to be non-positive. However,
global monotonicity and non-negativity cannot be imposed if one wants to keep the
concavity assumption.
Serletis and Feng [107] argue that the Bayesian method is a convenient way
for imposing regularity conditions due to Gibbs sampling methods introduced
by Geman and Geman [35] and the Metropolis-Hastings algorithm [49, 90].
Terrell [113], Koop et al. [60], and Griffiths et al. [43] exemplify some important
contributions in this area that allow incorporation of non-negativity, monotonicity,
and concavity conditions.6
Serletis and Feng [107] examine the performance of all three methods for imposing
non-negativity, monotonicity, and concavity conditions on the TL cost function.
They find that, irrespective of the method, imposing global curvature conditions
forces the elements of the B matrix to be close to zero, so that the TL cost function
reduces to the Cobb-Douglas cost function in this case. Hence, they recommend
imposing pointwise regularity using either the constrained optimization or the Bayesian
approach. However, the Bayesian approach may be preferred on the grounds that it
is easy to obtain statistical inferences for the parameters and relevant measures (e.g.,
elasticities and productivity), which can be expressed as functions of parameters.

Stochastic Frontier Models for Cost Functions

The stochastic frontier analysis literature relaxes the neoclassical full efficiency
assumption by allowing the productive units to be inefficient. Aigner et al. [5] and

5 For another application of constrained optimization method to a flexible (i.e., globally flexible
Fourier) cost function, see Feng and Serletis [31].
6 See Kleit and Terrell [58] as an application of Bayesian approach for flexible cost functions.

Meeusen and van den Broeck [89] exemplify earlier studies of stochastic frontier
models that aim to measure efficiencies of productive units. A common feature
of stochastic frontier models (SFMs) is that they assume a composed error term
where the first component is the usual two-sided error and the second component
is a one-sided (non-negative) error term, which represents inefficiency. A variety
of distributions is proposed for the one-sided error component including the half
normal [5], the exponential [89], the truncated normal, the gamma, and doubly
truncated normal distributions.
A stochastic cost frontier model is given by:

ln C = α + x1′β + u + v (46)

where C is the cost of the productive unit; α is the constant term; x1 is a vector
of frontier variables, which does not contain the constant; u ≥ 0 is the one-sided
term that captures the cost inefficiency; and v is the usual two-sided error term. It is
common to model the inefficiency term as u = h(x2′γ)u∗ , where u∗ ≥ 0 is a one-sided
random variable and h > 0 is a function of so-called environmental variables
x2 that affect inefficiency. Smaller values of u indicate that the productive unit is
more cost efficient, and u = 0 means that the productive unit is fully efficient.
The standard stochastic frontier models assume that u∗ , v, and (x1′, x2′)′ are all
independent of each other. Cost efficiency is estimated by predicting7 :

Eff = exp (−u). (47)
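The prediction step behind Eq. (47) can be sketched in Python under a normal-half-normal specification. The sketch assumes that σu and σv are known, whereas in practice they are estimated (typically by maximum likelihood), and it uses the conditional mean E[u | ε] of the composed error ε = u + v as the predictor of u. This is one common choice and not the only one. The function name `jlms_cost` and all numerical values are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal, for pdf and cdf

def jlms_cost(eps, sigma_u, sigma_v):
    """E[u | eps] for a cost frontier with eps = v + u,
    u half-normal (scale sigma_u) and v normal (scale sigma_v)."""
    s2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = eps * sigma_u ** 2 / s2           # sign is positive for a cost frontier
    sig_star = sigma_u * sigma_v / math.sqrt(s2)
    z = mu_star / sig_star
    return mu_star + sig_star * STD.pdf(z) / STD.cdf(z)

rng = random.Random(42)
sigma_u, sigma_v = 0.3, 0.2                      # hypothetical, treated as known
n = 5000
u = [abs(rng.gauss(0.0, sigma_u)) for _ in range(n)]   # one-sided inefficiency
v = [rng.gauss(0.0, sigma_v) for _ in range(n)]        # two-sided noise
eps = [ui + vi for ui, vi in zip(u, v)]                # composed error

u_hat = [jlms_cost(e, sigma_u, sigma_v) for e in eps]
eff_hat = [math.exp(-uh) for uh in u_hat]              # Eff = exp(-u)

print("mean true u     :", sum(u) / n)
print("mean predicted u:", sum(u_hat) / n)
print("mean efficiency :", sum(eff_hat) / n)
```

For production, revenue, or profit frontiers the composed error is v − u, so the sign of `mu_star` would flip.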

The earlier stochastic models (e.g., [5, 89]) are in the cross-sectional framework.
Panel data can potentially give more reliable inefficiency estimates. Pitt and Lee
[95] and Schmidt and Sickles [106] propose random and fixed effects models
for estimating unit specific inefficiencies. These models assume time-invariant
inefficiency, which may not be a reasonable assumption for relatively longer panel
data. Cornwell et al. [21], Kumbhakar [61], Battese and Coelli [12], and Lee and
Schmidt [79] exemplify earlier time-varying inefficiency models. Ahn et al. [4],
Desli et al. [22], Tsionas [115], Huang and Chen [51], Assaf et al. [9], and Duygun
et al. [28] provide dynamic efficiency models. Greene [40, 41] argues that if there
is productive unit specific heterogeneity in the frontier and this is not controlled, the
heterogeneity may be confused with inefficiency. Greene [40, 41] proposes fixed
and random effects models to control for heterogeneity, which are called true fixed
effects and true random effects, respectively. The advantage of fixed effects models
is that the heterogeneity can be correlated with the regressors. However, they are subject
to the incidental parameters problem. In particular, while the frontier parameters are
consistent, the inefficiency estimates may not be accurate. Wang and Ho [118] solve
this problem by introducing first difference and within transformations to eliminate
the fixed effects term. Although the fixed effects models of Greene [40, 41], and

7 See Kumbhakar and Lovell [65] for details.


Wang and Ho [118] allow inefficiency to vary over time, the heterogeneity is
time-invariant. Kutlu, Tran, and Tsionas [76] illustrate using Monte Carlo simulations
that ignoring time-varying heterogeneity may lead to biased parameter estimates and
seriously distorted efficiency estimates. The individual effects model of Kutlu, Tran,
and Tsionas [76] solves this issue by allowing both heterogeneity and inefficiency to
vary over time without being subject to the incidental parameters problem.
Similar to the conventional cost function estimation, the most widely used
functional form in stochastic cost frontier studies is the translog functional form.
As stated earlier, in a conventional cost function model if the monotonicity and/or
curvature conditions are violated, the second-order conditions for optimization and
duality theory fail. The issue is even more serious for stochastic frontier models.
Sauer et al. [103] illustrate that when the monotonicity and curvature conditions are
not satisfied, the efficiency estimates may be seriously distorted. Many stochastic
frontier studies either do not state whether the regularity conditions are satisfied or
simply check these conditions at the mean or median of the sample data points.
Hence, the regularity conditions may still be violated at many other sample points,
indicating that the cost efficiency estimates for these sample points (and potentially
other sample points) are not reliable.
All of the stochastic frontier models mentioned so far can be applied to stochastic
cost, production, profit, and revenue frontier estimation with minor
modifications. In particular, for production, profit, and revenue estimations, the
inefficiency component u is replaced by −u to estimate efficiency.
Allocative inefficiency results in utilization of inputs in wrong proportions given
input prices, i.e., misallocation of inputs. A production function can be used to
estimate technical inefficiency, which happens when the firm fails to produce
maximum output from a given input bundle, but it cannot be used to estimate
allocative inefficiency. Under the Cobb-Douglas production function assumption,
Schmidt and Lovell [104] present a stochastic cost frontier model where both costs
of allocative and technical inefficiency can be estimated. However, they assume that
allocative and technical inefficiency are not correlated. Under the same production
technology, Schmidt and Lovell [105] relax this assumption by allowing allocative
and technical inefficiencies to be correlated. Modeling allocative inefficiency under
the translog cost function assumption is less trivial. Greene [39] models allocative
and technical inefficiency in a translog cost function by assuming that allocative
inefficiency appears as departures from the cost shares. However, he does not derive the cost of
allocative inefficiency due to such departures. Rather, he assumes that allocative
inefficiency and the cost of allocative inefficiency are independent. Bauer [13] calls
this the “Greene problem.” Kumbhakar and Wang [68] and Kutlu [72] examine the
consequences of lumping allocative inefficiency together with technical inefficiency
when estimating a cost frontier, i.e., the assumption that the one-sided error term
in the cost function captures the overall cost of inefficiency. They both start
with the cost minimization problem for the translog cost function. Then, they
calculate the exact allocative inefficiency and the corresponding cost of allocative
inefficiency where allocative inefficiency is defined as the deviations from the
optimal input allocation. Both Kumbhakar and Wang [68] and Kutlu [72] point

out negative consequences of lumping the allocative inefficiency with technical
inefficiency when estimating a cost frontier. Kutlu [72] argues that system estimators
perform worse than single equation estimators even when the complex functional
form for allocative inefficiency is approximated by a first-order Taylor series. In
order to address this issue, Kumbhakar and Tsionas [66] use similar approximations
in a Bayesian setting, but solutions based on the cost function approach do not seem
easy. Kumbhakar and Wang [67] overcome this issue by using a primal system
consisting of a translog production function and first-order conditions of cost
minimization. In defense of standard stochastic cost frontier models, Kumbhakar
and Wang [67] and Kutlu [72] are typical examples of studies that find
negative results based on changing where and how an error term enters a model.
While these negative results raise some concerns about cost function estimations, they
depend on how the data generating process is specified. Nevertheless, unlike in
the conventional cost function literature, where researchers generally estimate a
cost-input share system, such system studies are almost non-existent in the
stochastic frontier literature.

Endogeneity in Cost Function Models

Using the production function approach is appropriate if the inputs are exogenous.
However, researchers often encounter endogenous input choices in the production
process. In particular, the factor inputs under a firm’s control may be reallocated
to achieve the firm’s objectives. In the case of a stochastic production function by a
firm maximizing expected profits [122], all variable inputs can be considered weakly
exogenous. However, if the expected profit maximization assumption of Zellner
et al. [122] is not accurate, then one potential solution is to use an instrumental
variable or control variable approach to address the issue. In many scenarios, the
price taking assumption is more reasonable compared to the exogenous factor inputs
assumption and good instruments may be hard to find. Hence, a widely used solution
is to estimate a cost function rather than a production function. This is one of
the reasons why a dual cost function specification may be preferred over a primal
production function specification. Exogenous input prices are more likely when the
market is competitive, and thus researchers would prefer the cost function approach
given that the level of output is dictated by market forces exogenous to the firm.
However, cost functions may suffer from endogeneity problems as well if the output
fails to be exogenous. Thus, both production and cost functions may suffer from
endogeneity. Besides endogenous outputs, other scenarios may lead to endogeneity
in the cost function approach.
One potential problem occurs when a cost function includes a quality variable
where the quality is jointly determined by the costs. Mutter et al. [92] argue that
inclusion of the quality variable leads to inconsistent parameter estimates. Some
researchers drop the quality variable to avoid such problem, but this does not solve
the issue in the stochastic frontier framework. If the quality is cost enhancing
and a stochastic frontier model is estimated, the efficiency estimates would be

inconsistent irrespective of whether the quality variable is included in the frontier.

Duncombe and Yinger [27] and Gronberg et al. [44] exemplify studies that point
out the endogeneity of output quality in their cost equation. Another potentially
endogenous variable used in cost function estimations is the Herfindahl-Hirschman
Index (HHI). This variable is popular in stochastic frontier models due to the close
connection between market power and efficiency. In particular, it is common to
model inefficiency by using HHI as one of the environmental variables. Karakaplan
and Kutlu [55, 56] find evidence of endogeneity from HHI. Similarly, Kutlu, Tran,
and Tsionas [76] find evidence of endogeneity from another related variable that
measures profitability, i.e., return on revenue.
The endogeneity problem is more likely to occur in a stochastic frontier setting
due to the presence of the additional inefficiency term. In particular, as stated earlier, the
standard models in this literature assume that u∗ , v, and (x1′, x2′)′ are all independent
of each other. Guan et al. [45] and Kutlu [71] are the earliest studies that aim
to solve endogeneity problems in the stochastic frontier setting. These papers relax
the independence assumption of x1 and v. Guan et al. [45] achieve this via a two-stage
method where in the first stage they get consistent frontier parameter
estimates using the GMM and in the second stage they estimate efficiency using a
standard stochastic frontier model. Kutlu [71] uses a limited information maximum
likelihood estimation method (single-stage control function estimation) to solve the
endogeneity problem. Tran and Tsionas [114] propose the GMM counterpart of
Kutlu [71]. Karakaplan and Kutlu [54, 55] present cross-sectional and panel data
variations of Kutlu [71] and extend his method to allow environmental variables
to be endogenous, i.e., allowing v and (x1′, x2′)′ to be correlated. In a Bayesian
framework, Griffiths and Hajargasht [42] propose models that allow v and (x1′, x2′)′
to be correlated. Using a copula approach, Amsler et al. [7, 8] provide cross-sectional
models that allow more general correlations, including the correlation
between u∗ and (x1′, x2′)′. The approach requires using a proper copula and may
be computationally intensive. Kutlu, Tran, and Tsionas [76] provide an individual
effects panel data model that allows v and (x1′, x2′)′ to be correlated, which
generalizes the model of Wang and Ho [118] by allowing time-varying heterogeneity. In an
appendix, they also provide a copula variation of their model that allows more
general correlation structures. However, they argue and illustrate by Monte Carlo
simulations that the consequences of violating general correlation assumptions are not
serious when the heterogeneity is controlled.
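The control function idea mentioned above can be illustrated in a deliberately simplified linear setting. The sketch below is not the limited information maximum likelihood estimator of Kutlu [71] and contains no inefficiency term. It only shows, on simulated data with assumed parameter values, how adding the first-stage residual as an extra regressor removes the bias caused by correlation between a regressor x and the two-sided error v. The helper `ols` and the instrument z are hypothetical.

```python
import random

def ols(columns, y):
    """Least squares coefficients (intercept first) via the normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]
    c = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(k)]
    # Gaussian elimination with partial pivoting on the augmented system [A | c].
    M = [row + [ci] for row, ci in zip(A, c)]
    for p in range(k):
        piv = max(range(p, k), key=lambda r: abs(M[r][p]))
        M[p], M[piv] = M[piv], M[p]
        for r in range(p + 1, k):
            f = M[r][p] / M[p][p]
            M[r] = [mr - f * mp for mr, mp in zip(M[r], M[p])]
    beta = [0.0] * k
    for p in reversed(range(k)):
        tail = sum(M[p][j] * beta[j] for j in range(p + 1, k))
        beta[p] = (M[p][k] - tail) / M[p][p]
    return beta

rng = random.Random(7)
n = 20000
z = [rng.gauss(0.0, 1.0) for _ in range(n)]            # instrument
e = [rng.gauss(0.0, 1.0) for _ in range(n)]            # first-stage error
v = [0.8 * ei + 0.6 * rng.gauss(0.0, 1.0) for ei in e]  # correlated with e
x = [zi + ei for zi, ei in zip(z, e)]                  # endogenous regressor
y = [1.0 + 1.0 * xi + vi for xi, vi in zip(x, v)]      # true slope is 1.0

b_ols = ols([x], y)[1]                                 # ignores endogeneity
pi_hat = ols([z], x)                                   # first stage: x on z
e_hat = [xi - pi_hat[0] - pi_hat[1] * zi for xi, zi in zip(x, z)]
b_cf = ols([x, e_hat], y)[1]                           # control function estimate

print("OLS slope             :", b_ols)
print("control function slope:", b_cf)
```

In this design the OLS slope converges to 1.4 rather than 1.0, while the control function slope is consistent. The stochastic frontier versions apply the same logic inside a likelihood that also includes the one-sided term.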
Finally, the standard modeling of a cost function does not incorporate agency-
related aspects into the optimization problem. Kutlu, Mamatzakis, and Tsionas [77]
present a model where the manager is a utility maximizer in a quantity-setting
oligopoly market. The utility of the manager is a function of profit and her effort
level. They assume that higher effort reduces the costs. This introduces an additional
structural inefficiency term, which is a specific function of frontier variables. Hence,
given that the standard models ignore this structural inefficiency term, the parameter
and efficiency estimates from the standard stochastic frontier cost function models
would be inconsistent if the assumptions of this model hold. Basically, the solution

to this problem would be including the structural inefficiency term as a control

function to correct the bias. Gagnepain and Ivaldi [33, 34] propose related models
where additional terms appear in the cost function due to agency-related problems.

Marginal Cost Estimation

Sometimes a researcher is interested in the marginal cost rather than the cost itself.
A common approach is to estimate the cost function and then calculate the
marginal cost (e.g., Weiher et al. [119] and Kutlu and Sickles [74]). However, on many occasions
data on total cost are either not available at all or not available at the desired market
level. For example, Weiher et al. [119], Kutlu and Sickles [74], and Kutlu and Wang
[75] have airline-specific total cost data for the US airlines although these studies
are interested in route-airline-specific marginal cost estimates. The new empirical
industrial organization literature allows estimation of marginal cost without using
total cost data. The marginal cost estimates (along with market power estimates)
are obtained by estimating the so-called conduct parameter (conjectural variations)
model where a general form of demand-supply system is estimated. Bresnahan [18]
and Perloff et al. [94] provide excellent surveys on this topic. Recently, Kutlu and
Wang [75] present a methodology that combines the conduct parameter and stochastic
frontier methods and enables estimation of market power, marginal cost, and
marginal cost efficiency from a demand-supply system. The advantage of
studying marginal cost efficiency over cost efficiency is that marginal cost efficiency
is directly related to deadweight loss. While both measures are valuable, the marginal
cost efficiency measure may be more relevant from the antitrust point of view.
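The first route described above, estimating a cost function and then computing marginal cost from it, can be sketched as follows. The sketch assumes a single-output, single-price translog cost function with hypothetical coefficients. Since ∂C/∂y = (C/y) ∂ln C/∂ln y, marginal cost is the average cost multiplied by the estimated cost elasticity of output, which the code checks against a finite difference.

```python
import math

# Hypothetical translog coefficients (one output y, one normalized input price w).
A0, AY, AYY, AW, AYW = 1.0, 0.7, 0.05, 0.4, 0.02

def cost(y, w):
    """C(y, w) implied by ln C = a0 + ay ln y + 0.5 ayy (ln y)^2 + aw ln w + ayw ln y ln w."""
    ly, lw = math.log(y), math.log(w)
    return math.exp(A0 + AY * ly + 0.5 * AYY * ly ** 2 + AW * lw + AYW * ly * lw)

def marginal_cost(y, w):
    """MC = (C / y) * d ln C / d ln y."""
    elasticity = AY + AYY * math.log(y) + AYW * math.log(w)
    return cost(y, w) / y * elasticity

y0, w0 = 5.0, 1.3
h = 1e-5
finite_diff = (cost(y0 + h, w0) - cost(y0 - h, w0)) / (2.0 * h)

print("analytical MC      :", marginal_cost(y0, w0))
print("finite-difference MC:", finite_diff)
```

When only firm-level total cost is observed, this calculation yields firm-level marginal cost, which is why the conduct parameter approach is used for finer market levels.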

Revenue Functions

This section presents some important features of a revenue function and issues
related to its estimation. As mentioned before, since many of the concepts intro-
duced apply to the revenue function estimation, this section will be brief.

Revenue Function Properties

The revenue function gives the maximal amount of revenue a firm can achieve at a
certain level of inputs $x \in \mathbb{R}_{+}^{M_x}$, given technological possibilities and fixed output
prices $p \in \mathbb{R}_{+}^{M_y}$. First, the properties of a revenue function are summarized below
as these properties play a central role when estimating a revenue function [110]:

1. R(x, p) ≥ 0 (non-negativity)
2. R(x, p) is continuous in (x, p) (continuity)8

8A weaker continuity condition is that R(x, p) is continuous in p and upper semi-continuous in x.


3. R(x, kp) = kR(x, p), ∀ k > 0 (linear homogeneity in p)
4. R(x, p̃) ≥ R(x, p), ∀ p̃ ≥ p (monotonicity in p)
5. R(x, p) is convex in p (convexity in p)

where $x \in \mathbb{R}_{+}^{M_x}$ and $p \in \mathbb{R}_{++}^{M_y}$ are vectors of inputs and output prices, respectively.

In practice, Conditions 1 and 2 are automatically satisfied by a proper functional
form choice for the revenue function. As in the cost function case, Condition 1 may be
violated for some functional form choices but, generally, it is satisfied at sample data
points. Imposition of Condition 3 is not problematic either. As in the cost function
case, the monotonicity conditions are not problematic in practice. However, again,
curvature conditions pose some difficulties when estimating a flexible functional
form.

Functional Forms for Revenue Function Estimation

Typically, the functional forms used in revenue function estimation are similar to
those used in (multiple-output) cost function estimation. Hence, this section is brief.
The most widely used revenue function is translog revenue function [23], which is
given by:

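$$\ln R(x, p) = \beta_0 + \sum_j \beta_{xj} \ln x_j + \sum_j \beta_j \ln p_j + \frac{1}{2} \sum_{j,k} \beta_{xxjk} \ln x_j \ln x_k + \sum_{j,k} \beta_{xjk} \ln p_j \ln x_k + \frac{1}{2} \sum_{j,k} \beta_{jk} \ln p_j \ln p_k \tag{48}$$

where $\beta_{jk} = \beta_{kj}$, $\beta_{xxjk} = \beta_{xxkj}$ (symmetry), $\sum_j \beta_j = 1$, $\sum_j \beta_{xjk} = 0$, and $\sum_k \beta_{jk} = 0$
(linear homogeneity). The output share equations are given by:

$$s_j^y(x, p) = \beta_j + \sum_k \beta_{xjk} \ln x_k + \sum_k \beta_{jk} \ln p_k. \tag{49}$$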
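The role of the symmetry and homogeneity restrictions in Eqs. (48) and (49) can be checked numerically. The sketch below uses one input and two outputs with hypothetical coefficients chosen to satisfy the restrictions. Under those restrictions the output shares sum to one and scaling all output prices by k scales revenue by k.

```python
import math

# Hypothetical coefficients: one input, two outputs.
B0 = 0.5
BX = [0.8]                          # beta_xj (inputs)
BP = [0.6, 0.4]                     # beta_j (output prices), sum to 1
BXX = [[0.02]]                      # beta_xxjk, symmetric
BXP = [[0.05], [-0.05]]             # beta_xjk: row = output price j, column = input k
BPP = [[0.1, -0.1], [-0.1, 0.1]]    # beta_jk, symmetric with zero row sums

def log_revenue(x, p):
    """Translog revenue function, Eq. (48)."""
    lx = [math.log(v) for v in x]
    lp = [math.log(v) for v in p]
    nx, n_p = len(lx), len(lp)
    out = B0
    out += sum(BX[j] * lx[j] for j in range(nx))
    out += sum(BP[j] * lp[j] for j in range(n_p))
    out += 0.5 * sum(BXX[j][k] * lx[j] * lx[k] for j in range(nx) for k in range(nx))
    out += sum(BXP[j][k] * lp[j] * lx[k] for j in range(n_p) for k in range(nx))
    out += 0.5 * sum(BPP[j][k] * lp[j] * lp[k] for j in range(n_p) for k in range(n_p))
    return out

def output_shares(x, p):
    """Output revenue shares, Eq. (49)."""
    lx = [math.log(v) for v in x]
    lp = [math.log(v) for v in p]
    return [BP[j]
            + sum(BXP[j][k] * lx[k] for k in range(len(lx)))
            + sum(BPP[j][k] * lp[k] for k in range(len(lp)))
            for j in range(len(lp))]

x0, p0 = [2.0], [1.5, 0.7]
shares = output_shares(x0, p0)
scaled = log_revenue(x0, [3.0 * v for v in p0]) - log_revenue(x0, p0)

print("shares:", shares, "sum:", sum(shares))
print("ln R(x, 3p) - ln R(x, p):", scaled, "vs ln 3:", math.log(3.0))
```

Dropping any of the restrictions, for example setting the first element of `BP` to 0.7, breaks both properties, which is why they are imposed before estimation.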

Diewert [23] provides the details about the Generalized Leontief revenue function.
A functional form that has not been mentioned earlier and that is used in the revenue
framework is the mean of order two revenue function [24]. Diewert considers
only the one-input case, though the functional form can be extended to a multi-input
scenario in a straightforward way. Using solutions to a set of functional equations,
Chambers et al. [20] show that the translog revenue function can be obtained from
the Shephard distance function for generalized quadratic functions in the dual price
space.

Stochastic Frontier Models for Revenue Functions

Unlike the cost function, the relevant stochastic revenue frontier model needs to be
slightly modified and is given by:

ln R = α + x1′β − u + v (50)

where R is the revenue of a productive unit; α is the constant term; x1 is a vector
of input variables; u ≥ 0 is the one-sided term that captures revenue inefficiency; and v
is the usual two-sided error term. As in the cost function case, smaller values
of u indicate that the productive unit is more revenue efficient, and when u = 0 the
productive unit is fully efficient. The standard stochastic frontier assumptions
about independence of variables remain the same so that u∗ , v, and (x1′, x2′)′ are all
independent of each other. In the case of endogenous input variables, estimates
from the revenue function would be inconsistent. The endogeneity solutions mentioned
for the stochastic cost frontier models can also be applied to the stochastic
revenue frontier function estimation.
Applications of the revenue function are not as prevalent as those of the cost and
production functions, but the revenue function is still applicable to various research
questions. Kumbhakar and Lai [69] apply the revenue function to a non-radial
and output-specific measure of technical efficiency they propose in a revenue-maximizing
framework. They use the maximum likelihood estimation method to
estimate a translog revenue-share system. Oliveira et al. [93] use a revenue function
to analyze the efficiency of hotel companies
in Portugal based on the stochastic frontier approach. Mairesse and Jaumandreu
[83] study the discrepancies between the cross-sectional and time-series estimates
of scale and capital elasticities by estimating the production function as well as
the revenue function with two panel datasets. They find that the estimates of the
functions differ little and conclude that biases from other sources, rather
than the lack of firm data on output prices, are more likely to be problematic. Rogers
[99] estimates revenue efficiency along with cost and profit efficiency to show the
importance of including nontraditional output in bank studies. The study finds that the
standard model understates bank efficiency if nontraditional output is excluded.

Profit Functions

This section presents some important features of a profit function and issues related
to its estimation. It also discusses a less well-known form of the profit function with
many desirable properties, the so-called alternative profit function.

Profit Function Properties

The profit function gives the maximal amount of profit for given input and
output prices with given technological possibilities. First, this section summarizes
properties of a profit function below as these properties play a central role when one
estimates a profit function [110]:

1. π(w, p) ≥ 0 (non-negativity)
2. π(w, p) is continuous in (w, p) (continuity)
3. π(kw, kp) = kπ(w, p), ∀ k > 0 (linear homogeneity in (w, p))
4. π(w, p̃) ≥ π(w, p), ∀ p̃ ≥ p (monotonicity in p)
5. π(w̃, p) ≥ π(w, p), ∀ w̃ ≤ w (monotonicity in w)
6. π(w, p) is convex in w (convexity in w)
7. π(w, p) is convex in p (convexity in p)

where $w \in \mathbb{R}_{++}^{M_x}$ and $p \in \mathbb{R}_{++}^{M_y}$ are vectors of input and output prices, respectively.

While Conditions 2–5 are relatively easily satisfied, the curvature conditions
(Conditions 6 and 7) and Condition 1 need some extra care. In the banking industry,
for example, data points with negative profits are not uncommon. However, profit
cannot be negative given a concave production function. To use this result, the profit
has to be defined as π(w, p) = p′y − w′x and used in the model, instead of reported
profit. Observed negative profits violate the property and are problematic.

Functional Forms for Profit Function Estimation

As discussed earlier in the cost function setting, apparent proper candidates for
a profit function are twice differentiable functional forms that are based on a
quadratic form. Diewert [23] notes that having a second-order approximation which
is homogeneous of degree one is a preferred method. However, in this case, the
second-order approximation reduces to a first-order approximation. For this
reason, he considers alternatives such as generalized quadratic in square roots profit
function and its special case, the generalized Leontief profit function. The extended
profit function of Behrman et al. [14] exemplifies another study that is motivated by
the same problem.
Now, this section briefly discusses the extended profit function of Behrman et al.
[14]. This model is presented using their notation. Let x be the vector of variable
inputs and H be the quasi-fixed input used for producing multiple outputs represented
by y with prices p. Further, the output and input prices and quantities are combined
as q = (p′, w′)′ and u = (−y′, x′)′. Then, the generalized Leontief variable profit
function can be written as follows:

$$\pi(q, H) = \sum_{j,k} \gamma_{jk} q_j^{1/2} q_k^{1/2} + \sum_j \gamma_{jH} q_j H^{1/2} \tag{51}$$

where γ jk = γ kj . Therefore, the constant elasticity transformation-constant elasticity
of substitution-generalized Leontief variable profit function (CET-CES-GL) can be
expressed as:

$$\pi(q, H) = \Big( \sum_j \gamma_{jj} q_j^{\varepsilon} \Big)^{1/\varepsilon} + \sum_{j,\, k \neq j} \gamma_{jk} q_j^{1/2} q_k^{1/2} + \sum_j \gamma_{jH} q_j H^{1/2} \tag{52}$$

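where γ jk = γ kj . The corresponding variable profit maximizing output supply and
input demand equations are given by:

$$u_j(q, H) = \gamma_{jj} q_j^{\varepsilon - 1} \Big( \sum_k \gamma_{kk} q_k^{\varepsilon} \Big)^{(1-\varepsilon)/\varepsilon} + \sum_{k \neq j} \gamma_{jk} q_j^{-1/2} q_k^{1/2} + \gamma_{jH} H^{1/2}. \tag{53}$$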
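The relation between Eqs. (52) and (53) can be verified numerically: each u_j in (53) is the partial derivative of the CET-CES-GL profit function (52) with respect to q_j. The sketch below uses three prices, hypothetical parameter values, and ε = 2, and it reads the last term of (53) as γ_jH H^{1/2}, which is what differentiating (52) gives. It also checks that the profit function is linearly homogeneous in q.

```python
import math

# Hypothetical parameters for three netput prices.
GAMMA = [[0.5, 0.2, -0.1],
         [0.2, 0.4, 0.05],
         [-0.1, 0.05, 0.3]]        # symmetric
GAMMA_H = [0.3, 0.1, -0.2]
EPS = 2.0

def profit(q, H):
    """CET-CES-GL variable profit function, Eq. (52)."""
    n = len(q)
    ces = sum(GAMMA[j][j] * q[j] ** EPS for j in range(n)) ** (1.0 / EPS)
    gl = sum(GAMMA[j][k] * math.sqrt(q[j] * q[k])
             for j in range(n) for k in range(n) if k != j)
    fixed = sum(GAMMA_H[j] * q[j] for j in range(n)) * math.sqrt(H)
    return ces + gl + fixed

def netput(j, q, H):
    """Supply or demand equation for netput j, Eq. (53)."""
    n = len(q)
    inner = sum(GAMMA[k][k] * q[k] ** EPS for k in range(n))
    ces = GAMMA[j][j] * q[j] ** (EPS - 1.0) * inner ** ((1.0 - EPS) / EPS)
    gl = sum(GAMMA[j][k] * math.sqrt(q[k] / q[j]) for k in range(n) if k != j)
    return ces + gl + GAMMA_H[j] * math.sqrt(H)

def numeric_gradient(j, q, H, h=1e-6):
    """Central finite difference of the profit function with respect to q_j."""
    up = list(q)
    dn = list(q)
    up[j] += h
    dn[j] -= h
    return (profit(up, H) - profit(dn, H)) / (2.0 * h)

q0, H0 = [2.0, 1.5, 1.0], 4.0
for j in range(3):
    print(j, netput(j, q0, H0), numeric_gradient(j, q0, H0))
```

Because the function is homogeneous of degree one in q, the sum of q_j u_j over j also reproduces π(q, H), which gives a second consistency check on estimated parameters.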

Profit Function with Allocative and Technical Distortions

Lovell and Sickles [81] incorporate technical and allocative inefficiency into a profit
function in the Generalized Leontief form by assuming wrong price ratios and by
allowing the actual output and input to differ from the optimal levels. The output
prices p = (p1 , . . . , pm ) > 0 and input prices w = (w1 , . . . , wn ) > 0 are given as
exogenous; the profit maximization problem becomes

$$\max_{y, x} \; p'y - w'x \quad \text{s.t.} \quad (y, -x) \in T. \tag{54}$$

The profit function is useful because a profit function π and a production
possibilities set T both represent the profit-maximizing technology due to a duality
relationship. In addition, profit maximizing output and input allocations can be
derived using Hotelling’s Lemma:

∇p π (p, w) = y (p, w) , ∇w π (p, w) = −x (p, w) . (55)

As an example, the profit function of a firm producing two outputs using two inputs is
assumed to take the Generalized Leontief form. Then, the profit maximizing output
and input equations can be derived from Hotelling’s Lemma and can be modified to
include inefficiency as follows

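$$y_1(p, w, \phi, \theta) = (\beta_{11} - \phi_1) + \beta_{12} \left( \theta_{12} \frac{p_1}{p_2} \right)^{-1/2} + \beta_{13} \left( \theta_{13} \frac{p_1}{w_1} \right)^{-1/2} + \beta_{14} \left( \theta_{14} \frac{p_1}{w_2} \right)^{-1/2}, \tag{56}$$

$$y_2(p, w, \phi, \theta) = (\beta_{22} - \phi_2) + \beta_{12} \left( \theta_{12} \frac{p_1}{p_2} \right)^{1/2} + \beta_{23} \left( \theta_{23} \frac{p_2}{w_1} \right)^{-1/2} + \beta_{24} \left( \theta_{24} \frac{p_2}{w_2} \right)^{-1/2}, \tag{57}$$

$$-x_1(p, w, \phi, \theta) = (\beta_{33} - \phi_3) + \beta_{13} \left( \theta_{13} \frac{p_1}{w_1} \right)^{1/2} + \beta_{23} \left( \theta_{23} \frac{p_2}{w_1} \right)^{1/2} + \beta_{34} \left( \theta_{34} \frac{w_1}{w_2} \right)^{-1/2}, \tag{58}$$

$$-x_2(p, w, \phi, \theta) = (\beta_{44} - \phi_4) + \beta_{14} \left( \theta_{14} \frac{p_1}{w_2} \right)^{1/2} + \beta_{24} \left( \theta_{24} \frac{p_2}{w_2} \right)^{1/2} + \beta_{34} \left( \theta_{34} \frac{w_1}{w_2} \right)^{1/2}. \tag{59}$$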

The parameters φ i ≥ 0 measure the under-production of outputs and excessive
usage of inputs due to technical inefficiency. The parameters θ ij > 0, j > i are
interpreted as allocative inefficiency. If both technical and allocative inefficiency
exist, the observed profit can be expressed as

$$\pi(q, \phi, \theta) = \sum_{i=1}^{4} (\beta_{ii} - \phi_i) q_i + \sum_{i=1}^{3} \sum_{j>i}^{4} \beta_{ij} \left( \theta_{ij}^{-1/2} + \theta_{ij}^{1/2} \right) q_i^{1/2} q_j^{1/2}, \tag{60}$$

where q ≡ (p1 , p2 , w1 , w2 ). The change in profit due to technical inefficiency is
obtained by

$$\pi(q) - \pi(q, \phi) = \sum_{i=1}^{4} \phi_i q_i, \tag{61}$$

and the change in profit due to allocative inefficiency is obtained by

$$\pi(q) - \pi(q, \theta) = \sum_{i=1}^{3} \sum_{j>i}^{4} \beta_{ij} \left[ 2 - \left( \theta_{ij}^{-1/2} + \theta_{ij}^{1/2} \right) \right] q_i^{1/2} q_j^{1/2}. \tag{62}$$

Allocative inefficiency can be further decomposed into output mix inefficiency,
input mix inefficiency, and scale inefficiency depending on θ ij . The perceived price
ratios $\theta_{ij} q_i / q_j$ represent consistent allocative inefficiency if they satisfy

$$\left( \theta_{ij} \frac{q_i}{q_j} \right) \left( \theta_{jk} \frac{q_j}{q_k} \right) = \theta_{ik} \frac{q_i}{q_k}, \quad i < j < k, \tag{63}$$

which requires

$$\theta_{ik} = \theta_{ij} \theta_{jk}, \quad i < j < k. \tag{64}$$
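The accounting in Eqs. (56) to (64) can be checked with a small numerical sketch. All parameter values below are hypothetical, and the θ values are built to satisfy the consistency condition (64). The code confirms that valuing the distorted netputs (56) to (59) at actual prices reproduces the observed profit (60), and that the gap between frontier and observed profit splits exactly into the technical part (61) and the allocative part (62).

```python
import math

Q = [2.0, 1.5, 1.0, 0.8]                 # (p1, p2, w1, w2), hypothetical prices
BETA = [[10.0, -1.0, -2.0, -1.5],
        [-1.0, 8.0, -1.0, -2.0],
        [-2.0, -1.0, -3.0, -0.5],
        [-1.5, -2.0, -0.5, -2.0]]        # symmetric, hypothetical
PHI = [0.5, 0.4, 0.3, 0.2]               # technical inefficiency parameters

# Allocative parameters for adjacent pairs; the rest follow from Eq. (64).
BASE = {(0, 1): 1.1, (1, 2): 0.9, (2, 3): 1.2}
THETA = dict(BASE)
THETA[(0, 2)] = BASE[(0, 1)] * BASE[(1, 2)]
THETA[(1, 3)] = BASE[(1, 2)] * BASE[(2, 3)]
THETA[(0, 3)] = THETA[(0, 2)] * BASE[(2, 3)]

def netput(i, q, phi, theta):
    """Distorted netput i from Eqs. (56)-(59); netputs are (y1, y2, -x1, -x2)."""
    d = BETA[i][i] - phi[i]
    for j in range(4):
        if j > i:
            d += BETA[i][j] * (theta[(i, j)] * q[i] / q[j]) ** -0.5
        elif j < i:
            d += BETA[j][i] * (theta[(j, i)] * q[j] / q[i]) ** 0.5
    return d

def profit(q, phi, theta):
    """Observed profit, Eq. (60)."""
    own = sum((BETA[i][i] - phi[i]) * q[i] for i in range(4))
    cross = sum(BETA[i][j] * (theta[(i, j)] ** -0.5 + theta[(i, j)] ** 0.5)
                * math.sqrt(q[i] * q[j])
                for i in range(3) for j in range(i + 1, 4))
    return own + cross

NO_PHI = [0.0] * 4
NO_THETA = {pair: 1.0 for pair in THETA}

pi_frontier = profit(Q, NO_PHI, NO_THETA)
pi_observed = profit(Q, PHI, THETA)
tech_loss = sum(f * qi for f, qi in zip(PHI, Q))                      # Eq. (61)
alloc_loss = sum(BETA[i][j]
                 * (2.0 - (THETA[(i, j)] ** -0.5 + THETA[(i, j)] ** 0.5))
                 * math.sqrt(Q[i] * Q[j])
                 for i in range(3) for j in range(i + 1, 4))          # Eq. (62)

print("frontier profit:", pi_frontier)
print("observed profit:", pi_observed)
print("technical loss :", tech_loss)
print("allocative loss:", alloc_loss)
```

Since θ^{1/2} + θ^{−1/2} is at least 2, the allocative loss is non-negative whenever the off-diagonal β ij are non-positive, as they are in this hypothetical parameterization.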

Based on the work of Lovell and Sickles [81], Sickles, Good, and Johnson [111]
apply the Generalized Leontief profit function with allocative distortions to the US
airline industry by assuming wrong price ratios. The generalized Leontief profit
function including output characteristics is expressed as

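$$\pi(q, c, t; \theta) = \sum_i \beta_{ii} q_i + \sum_{i,j} \beta_{ij} \left( \theta_{ij}^{-1/2} + \theta_{ij}^{1/2} \right) q_i^{1/2} q_j^{1/2} + \sum_i \beta_{it} q_i t + \sum_{i,j,k} \delta_{ijk} q_i c_j^{1/2} c_k^{1/2}, \quad \delta_{ijk} = \delta_{ikj}, \ \forall i, \ j \neq k, \tag{65}$$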

where q is the vector of input and output prices, c is the vector of output
characteristics, and t is a time index. The output and input allocation equations can
be derived as

$$d_i(q, c, t; \theta) = \beta_{ii} + \sum_{j \neq i} \beta_{ij} \left( \theta_{ij} \frac{q_i}{q_j} \right)^{-1/2} + \beta_{it} t + \sum_{j,k} \delta_{ijk} c_j^{1/2} c_k^{1/2}, \tag{66}$$

where d = (y, −x). The output characteristics are approximated by

$$c_i(q, t) = \sum_j \sum_{k>j} \gamma_{ijk} q_j^{1/2} q_k^{-1/2} + \sum_j \sum_{k>j} \gamma_{ijkt} q_j^{1/2} q_k^{-1/2} t + \gamma_{it} t + \gamma_i. \tag{67}$$

Kumbhakar [63] models technical and allocative inefficiencies in profit maximizing
frameworks with an emphasis on multiple outputs and multiple inputs. He derives the
exact relations between the inefficiencies and profit when translog functions are used
to represent technology. In the presence of output technical inefficiency, the firm’s
profit maximization problem is

$$\max_{y, x} \; \pi = p'y - w'x \quad \text{s.t.} \quad F(y e^{u}, x) = 0, \tag{68}$$

where $y$ is the actual output and $u\ge 0$ is the technical inefficiency so that $ye^{u}$ is the maximum possible output level. Assume $w_j^{*}=\theta_j w_j$ and $p_m^{*}=k_m p_m$, where $\theta_j$ and $k_m$ are input inefficiency and output inefficiency, respectively. Optimal inputs and outputs are determined by the shadow profit adjusted for efficiency. The efficiency-adjusted normalized shadow profit is

$$\hat\pi^{*}=y_1e^{u}+\sum_{m}\tilde p_m^{*}\,y_me^{u}-\sum_{j}\hat w_j^{*}x_j=\hat\pi^{*}\left(\hat w^{*},p^{*}\right), \tag{69}$$

where $\hat\pi^{*}=\pi^{*}e^{u}/p_1^{*}$, $\hat w_j^{*}=e^{u}\tilde w_j^{*}=e^{u}w_j^{*}/p_1$, $\tilde p_m^{*}=p_m^{*}/p_1^{*}$, and $p_1^{*}=p_1$. The normalized actual profit adjusted for efficiency and the shadow profit adjusted for efficiency are related as follows:

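$$e^{u}\tilde\pi^{A}=\hat\pi^{*}\left[1+\sum_{m}\left(\frac{1}{k_m}-1\right)R_m^{*}+\sum_{j}\left(\frac{1}{\theta_j}-1\right)Q_j^{*}\right], \tag{70}$$

where the shadow revenue and cost shares are $R_m^{*}=\partial\ln\hat\pi^{*}/\partial\ln\tilde p_m^{*}$ and $Q_j^{*}=\partial\ln\hat\pi^{*}/\partial\ln\hat w_j^{*}=-\hat w_j^{*}x_j/\hat\pi^{*}$. This transforms into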

ln π̃ A = ln π̂ ∗ + ln H − u, (71)

where H incorporates the shadow revenue and cost shares. The equations that relate
the actual revenue and cost shares to the shadow revenue and cost shares are given
by

$$R_m^{A}=R_m^{*}\,\frac{1}{H}\,\frac{1}{k_m} \tag{72}$$

$$Q_j^{A}=-Q_j^{*}\,\frac{1}{H}\,\frac{1}{\theta_j}. \tag{73}$$

Using a translog form for $\hat\pi^{*}(\hat w^{*},p^{*})$, given the expressions for the shadow revenue and cost shares, one can obtain the expression for H. The profit function specification is then complete.
In the presence of input technical inefficiency, the firm maximizes the profit as
follows:

$$\max_{y,x}\ \pi=p'y-w'x\quad\text{s.t.}\quad F\left(y,xe^{-\tau}\right)=0, \tag{74}$$

where $\tau\ge 0$ is interpreted as technical inefficiency and $e^{-\tau}\le 1$ as input technical efficiency. Similar to the output technical inefficiency setup, the normalized shadow profit function is

$$\tilde\pi^{*}\left(w^{*}e^{\tau},p^{*}\right)=\frac{\pi^{*}(\cdot)}{p_1}=y_1+\sum_{m}\tilde p_m^{*}\,y_m-\sum_{j}\hat w_j^{*}x_j^{e}, \tag{75}$$

where $\hat w_j^{*}=w_j^{*}e^{\tau}/p_1$ and $x_j^{e}=x_je^{-\tau}$. Since $\tilde\pi^{*}(w^{*}e^{\tau},p^{*})$ is not observed, it can be related to the normalized actual profit by

$$\tilde\pi^{A}=\tilde\pi^{*}\left[1+\sum_{m}\left(\frac{1}{k_m}-1\right)\tilde R_m+\sum_{j}\left(\frac{1}{\theta_j}-1\right)\tilde Q_j\right], \tag{76}$$

which implies $\ln\tilde\pi^{A}=\ln\tilde\pi^{*}+\ln\tilde H$, where $\tilde p_m=p_m/p_1$, $\tilde w_j=w_j/p_1$, and $\hat w_j=\tilde w_je^{\tau}$. The same procedure follows as in the output technical inefficiency case, in which the derived shadow revenue and cost shares can be related to the actual shares. Assuming a translog form for $\tilde\pi^{*}$ gives expressions for the shadow revenue and cost shares.

Stochastic Frontier Models for Profit Functions

The stochastic frontier models for profit functions differ from the models for cost and revenue functions in the presence of technical inefficiency. Kumbhakar [64] derives the expressions for the profit function corresponding to different assumptions on the underlying production function. In the presence of technical inefficiency, the profit function can be written as $\pi(p,w,u)=\pi(w,pe^{-u})$, where $p$ is the output price, $w$ is the input price, and $e^{-u}\le 1$ is a measure of technical efficiency. To illustrate, assume a translog form for actual profit; the estimable profit function is then

$$\ln\frac{\pi}{p}=\alpha+\sum_{j}\alpha_j\ln\frac{w_j}{pe^{-u}}+\frac{1}{2}\sum_{j}\sum_{k}\alpha_{jk}\ln\frac{w_j}{pe^{-u}}\ln\frac{w_k}{pe^{-u}}-u+v \tag{77}$$

or in terms of the profit frontier:

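$$\ln\frac{\pi}{p}=\ln\pi(p,w)+\ln h(p,w,u)+v,\quad\text{where} \tag{78}$$

$$\ln h(p,w,u)=-u\left[1-\sum_{j}\alpha_j-\sum_{j}\sum_{k}\alpha_{jk}\ln\frac{w_j}{p}-\frac{u}{2}\sum_{j}\sum_{k}\alpha_{jk}\right] \tag{79}$$

is profit technical inefficiency, which is not a constant multiple of $u$ unless $\sum_{j}\alpha_{jk}=0\ \forall k$, i.e., the underlying production technology is homogeneous.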
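The step from (77) to (79) can be spelled out. The following is a brief sketch, writing $\tilde w_j=w_j/p$ (shorthand introduced here only for brevity) and assuming the usual symmetry $\alpha_{jk}=\alpha_{kj}$ of the translog coefficients:

```latex
\begin{aligned}
\ln\frac{w_j}{pe^{-u}} &= \ln\tilde w_j + u,\\[4pt]
\sum_j \alpha_j \ln\frac{w_j}{pe^{-u}}
  &= \sum_j \alpha_j \ln\tilde w_j + u\sum_j \alpha_j,\\[4pt]
\frac12\sum_j\sum_k \alpha_{jk}\ln\frac{w_j}{pe^{-u}}\ln\frac{w_k}{pe^{-u}}
  &= \frac12\sum_j\sum_k \alpha_{jk}\ln\tilde w_j\ln\tilde w_k
   + u\sum_j\sum_k \alpha_{jk}\ln\tilde w_j
   + \frac{u^2}{2}\sum_j\sum_k \alpha_{jk}.
\end{aligned}
```

Collecting every term that involves $u$, together with the $-u$ already present in (77), gives the bracketed expression in (79). The bracket is free of prices only when $\sum_j\alpha_{jk}=0$ for every $k$; in that case the $u^2$ term vanishes as well and $\ln h=-u\,(1-\sum_j\alpha_j)$, a constant multiple of $u$.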
The standard stochastic profit frontier models assume that u, v, and the profit
frontier variables are all independent from each other. These assumptions can be
relaxed as stated in the stochastic cost frontier section.
In empirical applications, negative accounting profit is a commonly observed
phenomenon. However, the dependent variable for a stochastic profit frontier model
is the logarithm of the profit, which is not defined for observations with negative
profit. Some studies drop the observations with negative profits and estimate the
model with the remaining observations. As Bos and Koetter [17] mention, this
method has at least two shortcomings. First, one cannot obtain efficiency estimates
for the observations that are dropped. Second, these observations are likely to
belong to the least efficient productive units. Hence, dropping these observations
may potentially distort efficiency estimates. An alternative method is rescaling π for all firms so that the rescaled π becomes positive. For example, a commonly used rescaling adds $\theta=|\min(\pi^{-})|+1$ to π, where $\pi^{-}=\min(\pi,0)$ is the negative part of π and the minimum is taken over the sample. Hence, the stochastic frontier profit model is given as follows:

$$\ln(\pi+\theta)=\alpha+x'\beta-u+v. \tag{80}$$
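The rescaling step is simple to reproduce. The sketch below assumes θ is the absolute value of the most negative profit in the sample plus one, which is the reading that makes every shifted profit at least 1; the function name `rescale_profit` and the profit figures are hypothetical.

```python
import numpy as np

def rescale_profit(profit):
    """Shift profits so that the log is defined for every observation.

    Returns the dependent variable ln(profit + theta) and the shift theta.
    """
    profit = np.asarray(profit, dtype=float)
    negative_part = np.minimum(profit, 0.0)     # pi^- = min(pi, 0)
    theta = np.abs(negative_part.min()) + 1.0   # |most negative profit| + 1
    return np.log(profit + theta), theta

# Made-up accounting profits, two of them negative.
profits = np.array([120.0, -35.0, 8.0, -2.5])
log_shifted, theta = rescale_profit(profits)
```

With this choice the worst-performing unit has a shifted profit of exactly 1, so its dependent variable is 0 and no observation has to be dropped.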

Berger and Mester [15], Vander Vennet [116], Maudos et al. [88], and Kasman and Yildirim [57] are examples of studies that use this rescaling approach. Critics of this approach would ask, “Where did this money come from?” Hence, Berger and Mester [15] modify the prediction of profit efficiency as follows:

$$\mathit{Eff}_{\pi}=\frac{\hat f_{\pi}-u-\theta}{\hat f_{\pi}-\theta}, \tag{81}$$

where $\hat f_{\pi}-u-\theta$ is the predicted actual profit and $\hat f_{\pi}-\theta$ is the predicted maximum profit that could be earned if the productive unit were fully efficient. In order to reflect the actual amounts, the profits are adjusted by θ, and thus the standard formula for efficiency calculations does not work.9
Finally, an issue in the estimation of a stochastic frontier profit function is that
the risk needs to be included in the model when the production involves risks. Since
the risk-taking behavior of a productive unit represents its objective, one would
incorrectly consider the risk-averse productive units as relatively inefficient when
the risk is not included in the estimation. The studies on financial sectors (e.g.,
banking) are generally careful about controlling for risk when estimating a profit or
alternative profit function.

Alternative Profit Function

The alternative profit function, introduced by Humphrey and Pulley [52], is another representation of profits that can be used when the underlying assumptions of the standard profit function do not hold. In contrast to the profit function, which takes input and output prices as given, the alternative profit function takes the input prices and output as given, i.e., π(w, y). Hence, the independent variables for an alternative profit function are the same as those of a cost function. The underlying assumption in the derivation of the alternative profit function is that the productive units maximize profits by choosing input quantities and output prices. Berger and Mester [15] list four conditions under which estimating an alternative profit function may provide useful information:

1. There are substantial unmeasured differences in quality of outputs.

2. Outputs are not completely variable so that the productive unit cannot achieve
every scale and output mix.
3. Output markets are not perfectly competitive.
4. Output prices are not accurately measured.

9 Bos and Koetter [17] propose an alternative approach to overcome this issue. For observations

where the profit is positive, they keep the left-hand-side variable as lnπ , and for those observations
where the profit is negative, they replace the left-hand-side variable with 0. They also add an
indicator variable to the right-hand side. This indicator variable equals 0 when the profit is positive
and equals ln|π − | when the profit is negative. This method has the advantage that it uses all sample
points for the estimations. However, when measuring inefficiency, the logarithmic scale breaks
down for negative profits. Hence, the interpretation of inefficiency estimates for the observations
with negative profits deviates from the standard interpretation. Koetter et al. [59] exemplify a study
that uses this approach.
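The transformation described in this footnote can be sketched as follows. This is only an illustration of the two variables as the footnote describes them, assuming no observation has exactly zero profit; the function name `bos_koetter_transform` and the sample values are hypothetical.

```python
import numpy as np

def bos_koetter_transform(profit):
    """Build the left-hand-side variable and the negative-profit indicator.

    Positive profit: lhs = ln(profit), indicator = 0.
    Negative profit: lhs = 0,          indicator = ln|profit|.
    """
    profit = np.asarray(profit, dtype=float)
    is_loss = profit < 0
    # np.where evaluates both branches, so feed log() a harmless 1.0 where unused.
    lhs = np.where(is_loss, 0.0, np.log(np.where(is_loss, 1.0, profit)))
    indicator = np.where(is_loss, np.log(np.where(is_loss, -profit, 1.0)), 0.0)
    return lhs, indicator

profits = np.array([100.0, -50.0, 20.0])
lhs, indicator = bos_koetter_transform(profits)
```

Every observation stays in the sample, and the indicator enters the right-hand side of the frontier as an additional regressor.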
A model of the alternative profit function is very similar to that of a cost function, except for the dependent variable and the assumption of linear homogeneity in input prices. However, in the stochastic frontier setting, an alternative stochastic profit model does not penalize high-quality banks in terms of efficiency, which may not be the case for a stochastic frontier cost model.
It is important to note that, unlike the profit function, the alternative profit function is not linearly homogeneous in input prices [98]. Hence, linear homogeneity of an alternative profit function is an empirical question and not a theoretical restriction. Restrepo-Tobón and Kumbhakar [98] illustrate that incorrect imposition of linear homogeneity in prices may lead to misleading results.

Multi-output Functional Forms

In productivity analysis, data on input and output levels are needed to estimate the production function. The difficulty in obtaining input data, together with the fact that more companies have integrated production across different segments, makes it even harder to access division-level input information. Data on total output and input do not show how the company allocates resources to a given segment, and thus one cannot estimate the production function for one specific segment. The same problem persists in the study of a country's productivity. In this case, most models of productivity assume one common production function for the whole economy. This does not correctly reflect how a country invests its resources, since different industries and sectors use technology differently.
Gong and Sickles [37] develop modeling and estimation methods for multidivisional/multiproduct firms and improve on standard assumptions in the productivity and efficiency literature. They develop a model to find input allocations among different divisions given total inputs, outputs from each division, and input prices averaged over the segment. The stochastic frontier model for a company i at time t is

$$y_{it}=f(x_{it};\beta_0)\,e^{z\tau}e^{v_{it}}e^{-u_{it}}, \tag{82}$$

where $y_{it}$ is the total (aggregated) output; $x_{it}=(x_{it1},x_{it2},\ldots,x_{itM})$ is the vector of inputs of M types; $f(x_{it};\beta_0)e^{z\tau}$ is the average production frontier; $\beta_0=(\beta_{01},\beta_{02},\ldots,\beta_{0M})$ is a vector of M types of parameters; z is a vector including time dummy variables, and τ is a vector of corresponding coefficients; $v_{it}$ is the random shock to production; and $u_{it}$ is a one-sided stochastic term related to technical efficiency.
To allow different frontiers for different segments, Gong and Sickles [37] introduce the segment-specific production frontier (SSPF). In an economy that produces N outputs/segments using M inputs during T periods, the production technology in each segment is characterized by a system of N equations as follows:

$$\begin{cases}y_{i1t}=f_1(x_{i1t};\beta_1)\,e^{z_1\tau_1}e^{v_{i1t}}e^{-u_{i1t}}\\ \qquad\vdots\\ y_{iNt}=f_N(x_{iNt};\beta_N)\,e^{z_N\tau_N}e^{v_{iNt}}e^{-u_{iNt}}\end{cases} \tag{83}$$

where $y_{ijt}$ is the observed output and $x_{ijt}$ is the (unobserved) vector of inputs of firm i in segment j at time t. The production frontier for segment j is represented by $f_j(x_{ijt};\beta_j)$. Note that the production parameters $\beta_j$ are segment specific. Similar to the single-frontier case, $z_j=(z_{j2},z_{j3},\ldots,z_{jT})$ is a vector of year dummy variables, and $\tau_j=(\tau_{j2},\tau_{j3},\ldots,\tau_{jT})$ is a vector of the corresponding coefficients. The technical inefficiency term $u_{ijt}=-\eta(t-T)u_{ij}$ is time-varying. The random shock $v_{ijt}$ is assumed to be drawn from $N(0,\sigma_{vj}^{2})$. One can use the SSPF framework to predict division-level efficiency $\widehat{TE}_{ijt}=e^{-u_{ijt}}$. In the single-frontier case, the SSPF approach predicts firm-level efficiency. It is straightforward to see that the firm-level efficiency of the multidivisional firm, $\widehat{TE}_{it}$, is the average of division-level efficiency weighted by the ratio of division-level revenue to firm-level revenue:

$$\widehat{TE}_{it}=\sum_{j}\frac{R_{ijt}}{R_{it}}\,\widehat{TE}_{ijt}. \tag{84}$$

The SSPF approach incorporates the heterogeneity in production frontiers and has an advantage over a traditional SPF in deriving division-level efficiency.
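The aggregation in (84) amounts to a revenue-weighted average. The following is a minimal sketch for a single firm and period, assuming the division-level inefficiency terms have already been estimated; the function name `firm_efficiency` and the numbers are hypothetical.

```python
import numpy as np

def firm_efficiency(u_division, revenue_division):
    """Revenue-weighted firm-level efficiency from division-level inefficiency.

    u_division       : estimated inefficiency u_ijt for each division j
    revenue_division : revenue R_ijt of each division j
    """
    te_division = np.exp(-np.asarray(u_division, dtype=float))  # TE_ijt = exp(-u_ijt)
    revenue = np.asarray(revenue_division, dtype=float)
    weights = revenue / revenue.sum()                           # R_ijt / R_it
    return float(weights @ te_division), te_division

# Two divisions: one fully efficient, one operating at half its frontier output.
te_firm, te_div = firm_efficiency([0.0, np.log(2.0)], [3.0, 1.0])
```

Because the weights sum to one, the firm-level score always lies between the lowest and highest division-level scores.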

Non-parametric Estimation (and Shape Restrictions)

No better example exists of a disconnect between the conditions under which a

dual relationship is estimated and interpreted than in the case of non-parametric
estimation of cost functions. Such relationships have been estimated in the literature
for a variety of important industries, most notably in banking services where
substantial data in the form of panels of cross sections are publicly available.
This issue has been well studied over the last several decades. A number of
important papers have contributed to the development of shape restrictions in non-
parametric estimation. A short list includes Matzkin [86, 87], Ruud [101], Fox [32],
Mammen and Thomas-Agnan [85], Hall and Huang [48], Ait-Sahalia and Duarte
[6], Lewbel [80], Shively et al. [108], Du et al. [26], and Wu and Sickles [121].
This section begins with some examples to show how to restrict a function by transforming it. First, it looks at how to restrict a function's range. If a function needs to be nonnegative, for example, a common approach is to specify the function as $f(x)=(r(x))^{2}$ or $f(x)=e^{r(x)}$ such that $f(x)\ge 0$. To further restrict the values of the function to (0, 1), one can specify the function as $f(x)=\frac{1}{1+e^{r(x)}}$ such that $0<f(x)<1$. In general, a range restriction on a function to take values in (a, b) is $f(x)=a+\frac{b-a}{1+e^{r(x)}}$ such that $a<f(x)<b$. This transformation turns a constrained problem (specifying f) into an unconstrained problem (specifying r) while still maintaining global compliance with the constraints. This contrasts with the kernel-based methods, which maintain observation-specific compliance, as seen in Mukerjee [91] and Mammen [84]. Later studies on penalized kernel smoothers include Hall and Huang [48], Henderson et al. [50], Blundell et al. [16], Ma and Racine [82], and Du et al. [26].
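The range transformations above are easy to verify numerically. The sketch below applies them to arbitrary values of an unconstrained r(x); the function names are hypothetical, and the grid of r values is kept moderate to avoid floating-point overflow in the exponential.

```python
import numpy as np

def nonnegative_square(r):
    """f = r^2, so f >= 0."""
    return r ** 2

def positive_exp(r):
    """f = exp(r), so f > 0."""
    return np.exp(r)

def unit_interval(r):
    """f = 1 / (1 + exp(r)), so 0 < f < 1."""
    return 1.0 / (1.0 + np.exp(r))

def open_interval(r, a, b):
    """f = a + (b - a) / (1 + exp(r)), so a < f < b."""
    return a + (b - a) / (1.0 + np.exp(r))

r = np.linspace(-5.0, 5.0, 11)      # any real-valued r(x) is admissible
bounded = open_interval(r, 2.0, 7.0)
```

Whatever values r takes, the transformed function stays inside its target range, which is why the restriction holds globally rather than only at the observed data points.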
To impose monotonicity constraints, integration procedures are utilized. Suppose there is a monotone function f(x) with x ∈ [0, 1]. Ramsay [96] represents f(x) as:

$$f(x)=\int_{0}^{x}e^{r(s)}\,ds. \tag{85}$$

Monotonicity and concavity can be imposed on f(x) by introducing an unconstrained function r(x), such that $f'(x)=e^{r(x)}>0$ and $f''(x)=f'(x)\,r'(x)$, so that if $r'(x)<0$, then $f''(x)<0$. One way to model r(x) is to use an integration transformation as follows:

$$f(x)=\int_{0}^{x}\exp\Big(\underbrace{-\int_{0}^{s}g(t)\,dt}_{r(s)}\Big)\,ds. \tag{86}$$

It is clear that $f'(x)=\exp\left(-\int_{0}^{x}g(t)\,dt\right)>0$ and $f''(x)=-f'(x)\,g(x)$. Therefore, if g(x) > 0, monotonicity and concavity naturally follow. Such a positivity constraint can be imposed via functions such as $g=x^{2}$ or $g=\exp(x)$.
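The construction in (86) can be illustrated on a grid. The following is a minimal sketch that replaces both integrals with trapezoid sums and uses a squared function for g; the helper names and the choice of the unconstrained function are hypothetical.

```python
import numpy as np

def cumulative_trapezoid(values, grid):
    """Running integral of `values` over `grid`, starting at 0."""
    areas = 0.5 * (values[1:] + values[:-1]) * np.diff(grid)
    return np.concatenate(([0.0], np.cumsum(areas)))

def monotone_concave(h, grid):
    """Build f(x) = int_0^x exp(-int_0^s g(t) dt) ds with g = h^2 >= 0."""
    g = h(grid) ** 2                      # positivity imposed by squaring
    r = -cumulative_trapezoid(g, grid)    # r(s) = -int_0^s g(t) dt, non-increasing
    slope = np.exp(r)                     # f'(s) = exp(r(s)) > 0
    return cumulative_trapezoid(slope, grid), slope

grid = np.linspace(0.0, 1.0, 201)
# Any unconstrained function may be passed in; this one is arbitrary.
f, slope = monotone_concave(lambda t: 1.0 + np.sin(6.0 * t), grid)
```

The unconstrained function can oscillate freely, yet the resulting f is increasing with a non-increasing slope, so no inequality constraints need to be handled during estimation.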
Wu and Sickles [121] utilize a spline basis for the non-parametric expression of the function g(x) = g(h(x)), where, e.g., $g(x)=(h(x))^{2}$. A dth-order spline basis can be written as

$$\Gamma(x)=\Big(\underbrace{1,x,\ldots,x^{d}}_{\text{power series}},\ \underbrace{(x-j_1)_{+}^{d},\cdots,(x-j_M)_{+}^{d}}_{\text{piecewise power series}}\Big)^{T}, \tag{87}$$

where $(x)_{+}=\max(x,0)$ and $j_1<\cdots<j_M$ are spline knots, and thus $h(x)=c^{T}\Gamma(x)$, where c are the spline coefficients. This leads to a non-parametric production model with monotonicity and curvature constraints given by:

$$\begin{aligned}y_i&=\beta_0+\beta_1\int_{0}^{x_i}\exp\left(-\int_{0}^{s}g\left(c^{T}\Gamma(t)\right)dt\right)ds+\varepsilon_i\\&=f(x_i;\beta,c)+\varepsilon_i.\end{aligned} \tag{88}$$
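The truncated power basis in (87) can be built in a few lines. This is a minimal sketch assuming a degree of at least one; the function name `truncated_power_basis`, the knots, and the coefficient vector are hypothetical.

```python
import numpy as np

def truncated_power_basis(x, degree, knots):
    """Rows are Gamma(x)^T = (1, x, ..., x^d, (x-j_1)_+^d, ..., (x-j_M)_+^d)."""
    x = np.asarray(x, dtype=float).reshape(-1)
    powers = [x ** k for k in range(degree + 1)]                        # power series part
    pieces = [np.maximum(x - knot, 0.0) ** degree for knot in knots]    # piecewise part
    return np.column_stack(powers + pieces)

x = np.array([0.0, 0.5, 1.0])
basis = truncated_power_basis(x, degree=2, knots=[0.25, 0.75])

# h(x) = c^T Gamma(x) for an arbitrary coefficient vector c.
c = np.array([0.3, -1.0, 0.5, 2.0, -1.5])
h = basis @ c
```

Each knot adds one column that is zero to its left, which lets h change shape locally while remaining a smooth piecewise polynomial.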

Then, one can derive the penalized nonlinear least squares estimator as:

$$\min_{\beta,c}\ \underbrace{\frac{1}{n}\sum_{i=1}^{n}\left(y_i-f(x_i;\beta,c)\right)^{2}}_{\text{goodness of fit}}+\underbrace{\lambda R(f)}_{\text{roughness penalty}}, \tag{89}$$

where R(f) > 0 measures roughness in f and λ controls the balance between goodness of fit and smoothness so that the model is not overfit. The spline coefficients become closer to 0 as λR(f) decreases. One can use the common integrated squared derivatives $R(f)=\int_{0}^{1}\left(f^{(q)}(x)\right)^{2}dx$, q = 1, 2, ..., to model the penalty.
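The two parts of the objective in (89) can be evaluated separately. The sketch below approximates the integrated squared derivative with finite differences on a grid, which is an assumption made here for illustration and not necessarily how the penalty is computed in practice; the function names are hypothetical.

```python
import numpy as np

def roughness(f_values, grid, q=2):
    """Approximate R(f) = int (f^(q)(x))^2 dx by finite differences."""
    derivative = np.asarray(f_values, dtype=float)
    for _ in range(q):
        derivative = np.gradient(derivative, grid)   # numerical differentiation
    squared = derivative ** 2
    return float(np.sum(0.5 * (squared[1:] + squared[:-1]) * np.diff(grid)))

def penalized_objective(y, fitted, f_values, grid, lam, q=2):
    """Mean squared residual plus lam times the roughness penalty."""
    goodness_of_fit = np.mean((np.asarray(y) - np.asarray(fitted)) ** 2)
    return goodness_of_fit + lam * roughness(f_values, grid, q)

grid = np.linspace(0.0, 1.0, 101)
linear = 1.0 + 2.0 * grid      # no curvature, so the q = 2 penalty is negligible
curved = grid ** 2             # constant curvature, so the penalty is positive
```

A linear fit incurs essentially no second-derivative penalty, so increasing λ pushes the estimate toward a straight line.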
Simar, Van Keilegom, and Zelenyuk [112] propose a non-parametric least-squares method, which retains the advantages of the local MLE method under less strict assumptions (and with less computational complexity), to analyze efficiency in the context of stochastic frontiers. Consider a set of i.i.d. random variables $(Y_i,X_i,Z_i)$, i = 1, ..., n, where $Y_i$ is the output, $X_i$ are the inputs, and $Z_i$ can be considered as environmental conditions that affect production. Setting $X_i=x$ and $Z_i=z$, one can characterize the output produced as:

Y = m (x, z) − u + v, (90)

where m(x, z) is the production frontier unknown to researchers, and $u\sim D^{+}(\mu_u(x,z),\mathrm{var}_u(x,z))$ and $v\sim D(0,\mathrm{var}_v(x,z))$ are independent random variables conditional on (X, Z). The method first estimates an average production function along with some moments of the composed error term. Then the local inefficiency can be computed after identifying the local asymmetry of the error. To estimate the average production function in the first step, rewrite the output equation as:

Y = r1 (x, z) + e, (91)

where $r_1(x,z)=m(x,z)-\mu_u(x,z)$ represents the average production function and $e=v-u+\mu_u(x,z)$. One can estimate $r_1(x,z)=E(Y\mid x,z)$ using standard non-parametric methods and obtain the residuals $\hat e_i=Y_i-\hat r_1(X_i,Z_i)$, i = 1, ..., n. Then, one can consistently estimate $r_2(x,z)$ and $r_3(x,z)$, where $r_j(x,z)=E(e^{j}\mid x,z)$, by:

$$\hat r_j(x,z)=\sum_{i=1}^{n}W_{i,h}(x,z)\,\hat e_i^{\,j}, \tag{92}$$

where $W_{i,h}(x,z)$ represents the estimation method, which depends on the vector of bandwidths h. Next, assume semi-parametric forms for the independent u and v:

$$u\mid x,z\sim N^{+}\left(0,\sigma_u^{2}(x,z)\right), \tag{93}$$

$$v\mid x,z\sim N\left(0,\sigma_v^{2}(x,z)\right). \tag{94}$$

One can derive expressions in terms of $r_2$ and $r_3$ for $\sigma_u^{3}(x,z)$ and $\sigma_v^{2}(x,z)$, and thus obtain the variance estimates $\hat\sigma_u^{2}(x,z)$ and $\hat\sigma_v^{2}(x,z)$ by plugging in $\hat r_2$ and $\hat r_3$ estimated in the first step. Then, individual efficiency scores can be estimated using the method of Jondrow et al. [53] for the general case. One can also consistently estimate the conditional mean of the inefficiency term, $\hat\mu_u(x,z)$, after the variance estimates are obtained. Then, the stochastic frontier can be estimated by:

$$\hat m(x,z)=\hat r_1(x,z)+\hat\mu_u(x,z). \tag{95}$$
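The moment-based steps above can be sketched in code. Two assumptions are made here that the text does not spell out: the weights are local-constant (Nadaraya-Watson) weights with a Gaussian product kernel, and the variances are recovered from the textbook moment identities of the normal/half-normal pair, with a truncation at zero when the third moment has the wrong sign. All function names and numbers are hypothetical.

```python
import numpy as np

def kernel_weights(points, target, bandwidth):
    """Local-constant weights W_i(x, z) with a Gaussian product kernel."""
    scaled = (np.asarray(points, dtype=float) - target) / bandwidth
    kernel = np.exp(-0.5 * np.sum(scaled ** 2, axis=1))
    return kernel / kernel.sum()

def half_normal_from_moments(r2, r3):
    """Recover sigma_u, sigma_v^2 and mu_u from the second and third moments of e.

    Uses E[e^2] = sigma_v^2 + sigma_u^2 (pi - 2) / pi and
         E[e^3] = sigma_u^3 sqrt(2 / pi) (pi - 4) / pi.
    """
    sigma_u_cubed = np.sqrt(np.pi / 2.0) * np.pi / (np.pi - 4.0) * r3
    sigma_u = np.cbrt(max(sigma_u_cubed, 0.0))   # guard against wrong skewness
    sigma_v_sq = r2 - sigma_u ** 2 * (np.pi - 2.0) / np.pi
    mu_u = sigma_u * np.sqrt(2.0 / np.pi)        # mean of the half-normal term
    return sigma_u, sigma_v_sq, mu_u

# Round trip: start from known parameters, form the moments, and invert them.
sigma_u_true, sigma_v_true = 0.5, 0.3
r2 = sigma_v_true ** 2 + sigma_u_true ** 2 * (np.pi - 2.0) / np.pi
r3 = sigma_u_true ** 3 * np.sqrt(2.0 / np.pi) * (np.pi - 4.0) / np.pi
sigma_u, sigma_v_sq, mu_u = half_normal_from_moments(r2, r3)

# Local moments at a target point from made-up first-step residuals.
points = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.0]])
residuals = np.array([0.2, -0.1, -0.4])
weights = kernel_weights(points, np.array([1.0, 0.5]), np.array([0.5, 0.5]))
r2_hat = float(weights @ residuals ** 2)
r3_hat = float(weights @ residuals ** 3)
```

In an application, `r2_hat` and `r3_hat` would be passed to `half_normal_from_moments` at each evaluation point, and the resulting `mu_u` would be added to the first-step estimate of the average production function to obtain the frontier.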

Non-parametric methods have become essential to analyze productivity. Devel-

opments in computational software and statistical methods have contributed greatly
to empirical studies on productivity. The challenge remains, however, that estimat-
ing non-parametric models is more difficult than imposing restrictions needed for
interpreting results from different functional forms.
As pointed out in the introductory remarks, this chapter focuses on “how cost,
revenue, and profit functions are used to identify and characterize an underlying
technology” and “concentrates on the more widely used cost functions to motivate
various issues.” Thus, this chapter focuses on parametric functions. There are of
course many alternative nonparametric methods to specify both the mean function
and the error terms in a stochastic frontier function that can be utilized to estimate
the dual cost, revenue, and profit functions or the primal production or distance
function. Relatively recent work has also focused on methods to ensure that
the regularity conditions in such nonparametric approaches are imposed on the
functions estimated. This literature includes the work by Fan, Li, and Weersink
[29], Adams, Berger, and Sickles [2, 3], Adams and Sickles [1], Kuosmanen and
Kortelainen [70], and Simar, Van Keilegom, and Zelenyuk [112].

Concluding Remarks

What this chapter has covered is just a small part of the economic theory and
practice enriched by the development of the duality theory. It highlights some
of the benefits from the duality theory of the production function and uses the
cost function to summarize the benefits. Revenue and profit functions have similar
properties and the benefits can also be applied. First, the cost function enables us
to derive an easier representation of technology. In the case of multiple outputs, for
example, the production function becomes infeasible and only yields an implicit
function. However, the cost function can deal with multiple outputs and has become
a common practice. Second, the dual approach using the cost function incorporates
optimal input allocation from optimizing firms’ behavior, while the primal approach
which uses the distance or production function does not contain such information.
Third, one can check if a firm is a cost-minimizer based on the precise conditions
that the cost function needs to satisfy in order to characterize the technology of
a cost-minimizing firm. In addition, one can specify a functional form of the cost
function suitable for the estimations as long as it satisfies all the precise conditions.
Finally, the data on outputs and input prices required to estimate the cost function
are easier to obtain than the data on actual output and input levels required in the
estimation using the primal approach.
This chapter has also illustrated how to specify functional forms of technology
in various optimization problems which are consistent with both primal and
dual relationships in empirical productivity studies. One important lesson is that
allowing flexible functional forms may sacrifice parsimony properties. Researchers
always need to consider the benefits and losses from flexibility when choosing
functional forms to represent the production technology. Last but not least, one
must be consistent when interpreting results from any productivity research with
the standards established 40 years ago that are still of great importance today.

Cross-References

Data Envelopment Analysis: A Nonparametric Method of Production Analysis

Distance Functions in Production Economics
Duality in Production
Multiproduct Technologies
Neoclassical Production Economics: An Introduction
Shadow Pricing in Production Economics
Stochastic Frontier Analysis: Foundations and Advances I
Stochastic Frontier Analysis: Foundations and Advances II

References
1. Adams RM, Sickles RC (2007) Semi-parametric efficient distribution free estimation of panel
models. Commun Stat Theory Methods 36:2425–2442
2. Adams RM, Berger A, Sickles RC (1997) Computation and inference in semiparametric
efficient estimation, computational approaches to economic problems. In: Amman H, Rustem
B, Whinston A (eds) Advances in computational economics. Kluwer, Boston, pp 57–70
3. Adams RM, Berger A, Sickles RC (1999) Semiparametric approaches to stochastic panel
frontiers with applications to the banking industry. J Bus Econ Stat 17:349–358
4. Ahn SC, Good DH, Sickles RC (2000) Estimation of long-run inefficiency levels: a dynamic
frontier approach. Econ Rev 19:461–492
5. Aigner DJ, Lovell CAK, Schmidt P (1977) Formulation and estimation of stochastic frontier
production function models. J Econ 6:21–37
6. Ait-Sahalia Y, Duarte J (2003) Nonparametric option pricing under shape restrictions. J Econ
116:9–47
7. Amsler C, Prokhorov A, Schmidt P (2016) Endogenous stochastic frontier models. J Econ
190:280–288
8. Amsler C, Prokhorov A, Schmidt P (2017) Endogenous environmental variables in stochastic
frontier models. J Econ 199:131–140
9. Assaf AG, Gillen D, Tsionas EG (2014) Understanding relative efficiency among airports: a
general dynamic model for distinguishing technical and allocative efficiency. Transp Res B
70:18–34
10. Atkinson SE, Halvorsen R (1984) Parametric efficiency tests, economies of scale, and input
demand in U.S. electric power generation. Int Econ Rev 25:647–662
11. Barnett WA (2002) Tastes and technology: curvature is not sufficient for regularity. J Econ
108:199–202
12. Battese GE, Coelli TJ (1992) Frontier production functions, technical efficiency and panel
data with application to paddy farmers in India. J Prod Anal 3:153–169
13. Bauer PW (1990) Recent Developments in the Econometric Estimation of Frontiers. J Econ
46:39–56
14. Behrman JR, Lovell CK, Pollak RA, Sickles RC (1992) The CET-CES-generalized Leontief
variable profit function: an application to Indian agriculture. Oxf Econ Pap 44:341–354
15. Berger AN, Mester LJ (1997) Inside the black box: what explains differences in the
efficiencies of financial institutions. J Bank Financ 21:895–947
16. Blundell R, Horowitz JL, Parey M (2012) Measuring the price responsiveness of gasoline
demand: economic shape restrictions and nonparametric demand estimation. Quant Econ
3:29–51
17. Bos JWB, Koetter M (2009) Handling losses in translog profit models. Appl Econ 43:
307–312
18. Bresnahan TF (1989) Studies of industries with market power. In: Schmalensee, Richard,
Willig, Robert D. (Eds.), The Handbook of Industrial Organization. North-Holland, Amster-
dam
19. Caves D, Christensen L, Tretheway M (1980) Flexible cost functions for multiproduct firms.
Rev Econ Stat 62:477–481
20. Chambers R, Färe R, Grosskopf S, Vardanyan M (2013) Generalized quadratic revenue
functions. J Econ 173:11–21
21. Cornwell C, Schmidt P, Sickles RC (1990) Production frontiers with cross-sectional and time-
series variation in efficiency levels. J Econ 46:185–200
22. Desli E, Ray SC, Kumbhakar SC (2003) A dynamic stochastic frontier production model with
time-varying efficiency. Appl Econ Lett 10:623–626
23. Diewert WE (1974a) A note on aggregation and elasticities of substitution. Can J Econ 7:
12–20
24. Diewert WE (1974b) Functional forms for revenue and factor requirements functions. Int
Econ Rev 15:119–130
25. Diewert WE, Wales TJ (1987) Flexible functional forms and global curvature conditions.
Econometrica 55:43–68
26. Du P, Parmeter CF, Racine JS (2013) Nonparametric kernel regression with multiple
predictors and multiple shape constraints. Stat Sin 23:1347–1371
27. Duncombe W, Yinger J (2011) Making do: state constraints and local responses in Califor-
nia’s education finance system. Int Tax Public Financ 18:337–368
28. Duygun M, Kutlu L, Sickles RC (2016) Measuring productivity and efficiency: a Kalman
filter approach. J Prod Anal 46:155–167
29. Fan Y, Li Q, Weersink A (1996) Semiparametric estimation of stochastic production frontier.
J Bus Econ Stat 14:460–468
30. Feng G, Serletis A (2008) Productivity trends in U.S. manufacturing: evidence from the NQ
and AIM cost functions. J Econ 142:281–311
31. Feng G, Serletis A (2009) Efficiency and productivity of the U.S. banking industry, 1998–
2005: evidence from the Fourier cost function satisfying global regularity conditions. J Appl
Econ 24:105–138
32. Fox KJ (1998) Non-parametric estimation of technical progress. J Prod Anal 10:235–250
33. Gagnepain P, Ivaldi M (2002) Stochastic frontiers and asymmetric information models. J Prod
Anal 18:145–159
34. Gagnepain P, Ivaldi M (2017) Economic efficiency and political capture in public service
contracts. J Ind Econ 65:1–38
35. Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions and the Bayesian
restoration of images. IEEE Trans Pattern Anal Mach Intell 12:609–628
36. Getachew L, Sickles RC (2007) The policy environment and relative price effi-
ciency of Egyptian private sector manufacturing: 1987/88–1995/96. J Appl Econ 22:
703–728
37. Gong B, Sickles RC (2017) Resource allocation in multidivisional multiproduct firms:
examining the divisional productivity of energy companies, Working paper
38. Good DH, Nadiri MI, Sickles RC (1991) The structure of production, technical change and
efficiency in a multiproduct industry: an application to U.S. airlines, NBER Working paper
3939
39. Greene WH (1980) On the estimation of a flexible frontier production model. J Econ 13:
101–115
40. Greene WH (2005a) Fixed and random effects in stochastic frontier models. J Prod Anal
23:7–32
41. Greene WH (2005b) Reconsidering heterogeneity in panel data estimators of the stochastic
frontier model. J Econ 126:269–303
42. Griffiths WE, Hajargasht G (2016) Some models for stochastic frontiers with endogeneity.
J Econ 190:341–348
43. Griffiths WE, O’Donnell CJ, Tan Cruz A (2000) Imposing regularity conditions on a system
of cost and cost-share equations: a Bayesian approach. Aust J Agric Resour Econ 44:
107–127
44. Gronberg TJ, Jansen DW, Taylor LL (2011) The adequacy of educational cost functions:
lessons from Texas. Peabody J Educ 86:3–27
45. Guan Z, Kumbhakar SC, Myers RJ, Lansink AO (2009) Measuring excess capital capacity in
agricultural production. Am J Agric Econ 91:765–776
46. Guilkey DK, Lovell CAK (1980) On the flexibility of the translog approximation. Int Econ
Rev 21:137–147
47. Guilkey DK, Lovell CAK, Sickles RC (1983) A comparison of the performance of three
flexible functional forms. Int Econ Rev 24:591–616
48. Hall P, Huang LS (2001) Nonparametric kernel regression subject to monotonicity constraints.
Ann Stat 29:624–647
49. Hastings WK (1970) Monte Carlo sampling methods using Markov chains and their
applications. Biometrika 57:97–109
50. Henderson DJ, List JA, Millimet DL, Parmeter CF, Price MK (2012) Empirical implementa-
tion of nonparametric first-price auction models. J Econ 168:17–28
51. Huang TH, Chen YH (2009) A study on long-run inefficiency levels of a panel dynamic
cost frontier under the framework of forward-looking rational expectations. J Bank Financ
33:842–849
52. Humphrey D, Pulley L (1997) Banks’ responses to deregulation: profits, technology, and
efficiency. J Money Credit Bank 29:73–93
53. Jondrow J, Lovell CAK, Materov IS, Schmidt P (1982) On the estimation of technical
inefficiency in stochastic frontier production models. J Econ 19:233–238
54. Karakaplan MU, Kutlu L (2017a) Handling endogeneity in stochastic frontier analysis. Econ
Bull 37:889–901
55. Karakaplan MU, Kutlu L (2017b) Endogeneity in panel stochastic frontier models: an
application to the Japanese cotton spinning industry. Appl Econ 49:5935–5939
56. Karakaplan MU, Kutlu L (2018) School district consolidation policies: endogenous cost
inefficiency and saving reversals, Forthcoming in Empirical Economics
57. Kasman A, Yildirim C (2006) Cost and profit efficiencies in transition banking: the case of
new EU members. Appl Econ 38:1079–1090
58. Kleit A, Terrell D (2001) Measuring potential efficiency gains from deregulation of electricity
generation: a Bayesian approach. Rev Econ Stat 83:523–530
59. Koetter M, Kolari JW, Spierdijk L (2012) Enjoying the quiet life under deregulation?
Evidence from adjusted Lerner indices for US banks. Rev Econ Stat 94:462–480
60. Koop G, Osiewalski J, Steel M (1997) Bayesian efficiency analysis through individual effects:
hospital cost Frontiers. J Econ 76:77–105
61. Kumbhakar SC (1990) Production frontiers, panel data, and time-varying technical ineffi-
ciency. J Econ 46:201–211
62. Kumbhakar SC (1994) A multiproduct symmetric generalized McFadden cost function.
J Prod Anal 5:349–357
63. Kumbhakar SC (1996) Efficiency measurement with multiple outputs and multiple inputs.
J Prod Anal 7:225–255
64. Kumbhakar SC (2001) Estimation of profit functions when profit is not maximum. Am
J Agric Econ 83:1–19
65. Kumbhakar SC, Lovell CK (2003) Stochastic Frontier Analysis. Cambridge University Press,
Cambridge
66. Kumbhakar SC, Tsionas EG (2005) Measuring technical and allocative inefficiency in the
translog cost system: A Bayesian approach. J Econ 126:355–388
67. Kumbhakar SC, Wang HJ (2006a) Estimation of technical and allocative inefficiency: A
primal system approach. J Econ 134:419–440
68. Kumbhakar SC, Wang HJ (2006b) Pitfalls in the estimation of a cost function that ignores
allocative inefficiency: A Monte Carlo analysis. J Econ 134:317–340
69. Kumbhakar SC, Lai HP (2016) Maximum likelihood estimation of the revenue function
system with output-specific technical efficiency. Econ Lett 138:42–45
70. Kuosmanen T, Kortelainen M (2012) Stochastic non-smooth envelopment of data: semi-
parametric frontier estimation subject to shape constraints. J Prod Anal 38:11–28
71. Kutlu L (2010) Battese-Coelli estimator with endogenous regressors. Econ Lett 109:79–81
72. Kutlu L (2013) Misspecification in Allocative Inefficiency: A Simulation Study, Econ
118:151–154
73. Kutlu L, McCarthy P (2016) US airport governance and efficiency. Transp Res E 89:117–132
74. Kutlu L, Sickles RC (2012) Estimation of market power in the presence of firm level
inefficiencies. J Econ 168:141–155
75. Kutlu L, Wang R (2018) Estimation of cost efficiency without cost data. J Prod Anal 49:137–
151
76. Kutlu L, Tran CK, Tsionas EG (2019) A time-varying true individual effects model with
endogenous regressors. J Econ 211:539–559
77. Kutlu L, Mamatzakis E, Tsionas MG (2019) A principal-agent approach for estimating firm
efficiency: revealing bank managerial behavior. Unpublished manuscipt
78. Lau LJ (1978) Testing and imposing monotonicity, convexity, and quasiconcavity. In: Fuss
M, McFadden D (eds) Production economics, a dual approach to theory and applications.
North-Holland, Amsterdam, pp 409–453
79. Lee YH, Schmidt P (1993) A production frontier model with flexible temporal variation in
technical efficiency. In: Fried HO, Schmidt SS (eds) The measuring productivity efficiency:
techniques and applications. U.K, Oxford, pp 237–255
80. Lewbel A (2010) Using heteroscedasticity to identify and estimate mismeasured and endoge-
nous regressor models. J Bus Econ Stat 30:67–80
81. Lovell CAK, Sickles RC (1983) Testing efficiency hypothesis in joint production: a paramet-
ric approach. Rev Econ Stat 65:51–58
82. Ma S, Racine JS (2013) Additive regression splines with irrelevant categorical and continuous
regressors. Stat Sin 23(2):515–541
83. Mairesse J, Jaumandreu J (2005) Panel-data estimates of the production function and the
revenue function: what difference does it make? Scand J Econ 107:651–672
84. Mammen E (1991) Estimating a smooth monotone regression function. Ann Stat 19:724–740
85. Mammen E, Thomas-Agnan C (1999) Smoothing splines and shape restrictions. Scand J Stat
Theory Appl 26:239–252
86. Matzkin RL (1991) A nonparametric maximum rank correlation estimator. In: Barnett W,
Powell J, Tauchen G (eds) Nonparametric and semiparametric methods in econometrics and
statistics. Cambridge University Press, Cambridge
87. Matzkin RL (1994) Chapter 42: Restrictions of economic theory in nonparametric methods.
In: Engel RF, McFadden DL (eds) Handbook of econometrics, vol IV. Elsevier, Amsterdam
678 L. Kutlu et al.

88. Maudos J, Pastor JM, Pérez F, Quesada J (2002) Cost and profit efficiency in European Banks.
J Int Financ Mark Inst Money 12:33–58
89. Meeusen W, Van Den Broeck J (1977) Efficiency estimation from Cobb-Douglas production
function with composed errors. Int Econ Rev 18:435–444
90. Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E (1953) Equations of state
calculations by fast computing machines. J Chem Phys 21:1087–1092
91. Mukerjee H (1988) Monotone nonparametric regression. Ann Stat 16:741–750
92. Mutter RL, Greene WH, Spector W, Rosko MD, Mukamel DB (2013) Investigating the impact
of endogeneity on inefficiency estimates in the application of stochastic frontier analysis to
nursing homes. J Prod Anal 39:101–110
93. Oliveira R, Pedro MI, Marques RC (2013) Efficiency performance of the Algarve hotels using
a revenue function. Int J Hosp Manag 35:59–67
94. Perloff JM, Karp LS, Golan A (2007) Estimating market power and strategies. Cambridge
University Press, Cambridge
95. Pitt MM, Lee LF (1981) The measurement and sources of technical inefficiency in the
Indonesian weaving industry. J Dev Econ 9:43–64
96. Ramsay JO (1998) Estimating smooth monotone functions. J R Stat Soc Ser B Stat Methodol
60:365–375
97. Rask K (1995) The structure of technology in Brazilian sugarcane production, 1975–87:
an application of a modified symmetric generalized McFadden cost function. J Appl Econ
10:221–232
98. Restrepo-Tobón D, Kumbhakar S (2014) Enjoying the quiet life under deregulation? Not
quite. J Appl Econ 29:333–343
99. Rogers KE (1998) Nontraditional activities and the efficiency of US commercial banks.
J Bank Financ 22:467–482
100. Röller L-H (1990) Proper quadratic cost functions with an application to the Bell System. Rev
Econ Stat 72:202–210
101. Ruud PA (1997) Restricted least squares subject to monotonicity and concavity constraints.
In: Kreps DM, Wallis KF (eds) Advances in economics and econometrics: theory and
applications, proceedings of the seventh world congress, vol 3. Cambridge University Press,
Cambridge, pp 166–187
102. Ryan DL, Wales TJ (2000) Imposing local concavity in the translog and generalized Leontief
cost functions. Econ Lett 67:253–260
103. Sauer J, Frohberg K, Hockman H (2006) Stochastic efficiency measurement: the curse of
theoretical consistency. J of Appl Econ 10(1):139–165
104. Schmidt P, Lovell CAK (1979) Estimating technical and allocative inefficiency relative to
stochastic production and cost frontiers. J Econ 9:343–366
105. Schmidt P, Lovell CAK (1980) Estimating stochastic production and cost frontiers when
technical and allocative inefficiency are correlated. J Econ 13:83–100
106. Schmidt P, Sickles RC (1984) Production frontiers and panel data. J Bus Econ Stat 2:
367–374
107. Serletis A, Feng G (2015) Imposing theoretical regularity on flexible functional forms. Econ
Rev 34:198–227
108. Shively TS, Walker SG, Damien P (2011) Nonparametric function estimation subject to
monotonicity, convexity and other shape constraints. J Econ 161:166–181
109. Sickles RC, Streitwieser ML (1998) An analysis of technology, productivity, and regulatory
distortion in the interstate natural gas transmission industry: 1977–1985. J Appl Econ 13:
377–395
110. Sickles RC, Zelenyuk V (2019) Measurement of productivity and efficiency: theory and
practice, with Valentin Zelenyuk. Cambridge University Press, New York
111. Sickles RC, Good D, Johnson RL (1986) Allocative distortions and the regulatory transition
of the U.S. airline industry. J Econ 33:143–163
112. Simar L, Van Keilegom I, Zelenyuk V (2017) Nonparametric least squares methods of
stochastic frontier models. J Prod Anal 47:189–204
16 Cost, Revenue, and Profit Function Estimates 679

113. Terrell D (1996) Incorporating monotonicity and concavity conditions in flexible functional
forms. J Appl Econ 11:179–194
114. Tran KC, Tsionas EG (2013) GMM estimation of stochastic frontier model with endogenous
regressors. Econ Lett 118:233–236
115. Tsionas EG (2006) Inference in dynamic stochastic frontier models. J Appl Econ 21:669–676
116. Vander Vennet R (2002) Cost and profit efficiency of financial conglomerates and universal
banks in Europe. J Money, Credit, Bank 34:254–282
117. Wales TJ (1977) On the flexibility of flexible functional forms: an empirical approach. J Econ
5:183–193
118. Wang HJ, Ho CW (2010) Estimating fixed-effect panel stochastic frontier models by model
transformation. J Econ 157:286–296
119. Weiher JC, Sickles RC, Perloff JM (2002) Market power in the US airline industry. In: Slottje
DJ (ed) Measuring market power. Emerald Group Publishing, North-Holland
120. Wiley DE, Schmidt WH, Bramble WJ (1973) Studies of a class of covariance structure
models. J Am Stat Assoc 68:317–323
121. Wu X, Sickles RC (2018) Semiparametric estimation under shape constraints. Economentric
Stat 6:74–89
122. Zellner A, Kmenta J, Drèze J (1966) Specificsation and estimation of cobb-Douglas produc-
tion function models. Econometrica 34:784–795
17 Scale Elasticity and Returns to Scale

Victor V. Podinovski and Finn R. Førsund

Contents
Introduction
Scale Elasticity and Returns to Scale for Smooth Production Frontiers
  The Case of a Single Output
  The General Case with Multiple Outputs
  Returns to Scale
Scale Elasticity and Returns to Scale in the VRS Technology
Evaluation of Scale Elasticity and Returns to Scale in the VRS Technology
Technically Optimal Scale
Economies of Scale and Cost Functions
Partial Scale Characteristics for Smooth Production Frontiers
Partial Elasticity of Response for Arbitrary Polyhedral Technologies
Global Returns to Scale
Conclusion
Cross-References
References

Abstract

This chapter presents conventional and recent developments of the notions
of scale elasticity and returns to scale, in both the neoclassical economics
framework and the nonparametric methodology of data envelopment analysis.
In addition to the standard development of these notions, this chapter provides
a rigorous exposition of their extensions to the case of nonsmooth production
frontiers, partial scale characteristics, general polyhedral technologies, and
global returns to scale. We show that this broad range of extensions naturally
arises from the introduction of an output response function that, in a general
setting, describes a proportional response of any subset of inputs and outputs to
marginal proportional changes of another subset, as observed on the production
frontier. This function is closely related to the directional distance function and
provides a natural language for the definition and computation of different scale
characteristics in both the neoclassical and nonparametric frameworks.

V. V. Podinovski
School of Business and Economics, Loughborough University, Loughborough, UK
F. R. Førsund
Department of Economics, University of Oslo, Oslo, Norway

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_23

Keywords

Returns to scale · Scale elasticity · Production frontiers · Data envelopment analysis

Introduction

The notions of scale elasticity and returns to scale are important characteristics of
a production function and have their roots in the works of classical and neoclassical
economists starting in the middle of the nineteenth century.1 Frisch [31], based on a
series of papers starting in the mid-1920s, gives a comprehensive presentation of the
neoclassical production theory and the most common mathematical assumptions to
be imposed. The two key concepts are scale properties and substitution properties.
This chapter is concerned with scale properties only.2
We distinguish between the case of a single output as an explicit function of
multiple inputs and the case of an implicit transformation function which allows
both multiple outputs and inputs.
The single output production function can be stated as

y = f (x1 , . . . , xm ) , (1)

where y is the single output and (x1 , . . . , xm ) is the vector of inputs. The production
function (1) represents the maximum amount of output that can be produced from
any given vector of inputs, as stated, for example, in Samuelson [77], and therefore
defines what is now called the production frontier.
The scale property of production frontiers was expressed in Johnson [39] by
the elasticity of output with respect to a uniform change in the scale of all factor
inputs, named the “elasticity of production.” Frisch [31] coined the term “passus
coefficient.” The term “elasticity of scale” may be the most common one now used
in the literature. Frisch characterized general scale properties by introducing
a class of production functions (with a single output) obeying the Regular Ultra
Passum Law. The latter defines a particular behavior of the scale elasticity along
any nondecreasing curve in the input space: the scale elasticity starts out as greater
than one, then decreases monotonically and passes the value of one, and continues
to decrease after this. Frisch illustrated scale properties by drawing contour lines for
constant values of the passus coefficient. The contour line for the passus coefficient
equal to one is of special importance because this curve shows the technically
optimal scale.

1 See, for example, the introduction to the work of J. H. von Thünen published in 1863 by
Samuelson [78].
2 See Lloyd [47] for a review of the origins of the concept of substitution.

The case of multiple outputs and inputs is based on the use of a transformation
function written as the implicit relation

F (X, Y ) = 0, (2)

where $X = (x_1, \dots, x_m) \in \mathbb{R}^m_+$ is the vector of inputs and
$Y = (y_1, \dots, y_s) \in \mathbb{R}^s_+$ is the vector of outputs.3 The transformation
function F(X, Y) is usually assumed to be sufficiently smooth and satisfy additional
conditions of the implicit function theorem. The definition of the elasticity of scale
for this case was not developed until the 1970s [37, 55, 81].
The theory of scale elasticity and returns to scale was later extended and
operationalized in the nonparametric methodology of data envelopment analysis
(DEA). In contrast with the neoclassical setting in which the transformation function
F(X, Y) is assumed to be known and sufficiently smooth, in DEA this function
is generally unknown and not differentiable everywhere. Banker et al. [5] and
Banker and Thrall [4] demonstrated that the scale properties of the production
frontiers in DEA can be evaluated using the appropriate supporting hyperplanes
to the production possibility set which in turn are obtained by analyzing the set of
shadow prices at a given unit on the production frontier. The latter task is achieved
by resorting to linear programming techniques.
It is worth noting that while in economics it is standard to characterize scale
properties of production frontiers in terms of scale elasticity, in DEA the focus
has traditionally been on the less informative qualitative characterization of returns
to scale as increasing, constant, or decreasing returns [27]. An explanation of this
difference may lie in the fact that in the neoclassical approach, the transformation
function F(X, Y) is assumed to be known and the evaluation of scale elasticity is a
straightforward task, both conceptually and computationally. In contrast, in DEA
models where the efficient frontier is not known explicitly and is not smooth,
the notion of scale elasticity is generally undefined on the edges (including the
vertex points) of the production frontier and its one-sided analogues are less
intuitive. Methods of assessment of returns to scale were developed in the
early pioneering works on DEA. However, the theoretical links between returns to
scale and the underlying scale elasticity have often been overlooked and have only
recently been rigorously investigated.

3 Joint production was also discussed by the classical and neoclassical economists – see Kurz [46]
for a review of the origins.

There is a large body of literature devoted to scale characteristics of production
frontiers. Diewert et al. [20] and Balk et al. [2] provide reviews of contributions
in the neoclassical setting. Banker et al. [8] and Sahoo and Tone [74] summarize
contemporary work on the scale elasticity and returns to scale in the field of DEA.
The goal of this chapter is to present a unifying exposition of the notions of scale
elasticity and returns to scale that reflects recent advances in both the neoclassical
and nonparametric DEA settings. In writing this chapter, we pursue several goals.
First, we aim to provide an updated overview of the field, with particular empha-
sis on the new developments and extensions. Well-known results are presented
in a reduced form and are given only cursory discussion, with references to the
literature. Second, in contrast with some existing work that discusses returns to
scale without introducing the notion of scale elasticity, we develop the two notions
simultaneously and link one to the other. Third, we show that there is a direct
correspondence and analogy between the neoclassical and DEA approaches to the
treatment of standard scale characteristics and their various extensions. In simple
words, we show that there is parity between the neoclassical framework and DEA
with respect to conceptual and computational possibilities to categorize scale on
production frontiers. Fourth, we provide a rigorous development of a number of
theoretical issues that either have traditionally been overlooked or approached
heuristically in previous work. This includes a rigorous treatment of the notion
of one-sided scale elasticity in different production technologies and the relation
between returns to scale and technically optimal scale (most productive scale size)
in arbitrary, including nonconvex, technologies. Fifth, we include and discuss recent
extensions to the traditional notions of scale elasticity and returns to scale. This
includes different partial scale characterizations, their generalizations to arbitrary
nonparametric polyhedral technologies, and the notion of global returns to scale.

Scale Elasticity and Returns to Scale for Smooth Production Frontiers

The scale elasticity and the types of returns to scale are characteristics of efficient
frontiers of production technologies. Conceptually, a technology T is defined as the
set of all feasible input-output combinations (X, Y) that can be used by a production
unit:

$$T = \{ (X, Y) \in \mathbb{R}^m_+ \times \mathbb{R}^s_+ \mid X \text{ can produce } Y \}.$$

If the transformation function (2) is known, technology T is defined as

$$T = \{ (X, Y) \in \mathbb{R}^m_+ \times \mathbb{R}^s_+ \mid F(X, Y) \le 0 \}. \tag{3}$$

For a freely disposable technology, it is assumed that the function F(X, Y) is
strictly increasing in all outputs and strictly decreasing in all inputs, for which,
further assuming differentiability, a sufficient condition used by Hanoch [37] is

$$\frac{\partial F(X, Y)}{\partial x_i} < 0, \quad i = 1, \dots, m; \qquad \frac{\partial F(X, Y)}{\partial y_r} > 0, \quad r = 1, \dots, s. \tag{4}$$

The inequalities (4) imply that the technology T defined by Eq. (3) is freely
disposable in all inputs and outputs and that equality (2) describes efficient
production.

The Case of a Single Output

Consider the case of a single output as a function of multiple inputs defined by
production function (1). Let the unit $(x_{o1}, \dots, x_{om}, y_o)$ be located on the production
frontier, that is, satisfy Eq. (1). Consider an arbitrary proportional change of
the input vector $(x_{o1}, \dots, x_{om})$ by the variable multiplier $\alpha \ge 0$ and define the
corresponding relative change $\beta(\alpha)$ of the output $y_o$, assuming that the resulting unit
$(\alpha x_{o1}, \dots, \alpha x_{om}, \beta(\alpha) y_o)$ remains located on the production frontier. The function
$\beta(\alpha)$ represents the largest proportion of the output $y_o$ that can be produced given the
input vector $(\alpha x_{o1}, \dots, \alpha x_{om})$:

$$\beta(\alpha)\, y_o = f(\alpha x_{o1}, \dots, \alpha x_{om}). \tag{5}$$

Note that β(1) = 1. Assuming differentiability of the production function (1),
the scale elasticity ε, or the Passus coefficient of Frisch [31], evaluated at the unit
$(x_{o1}, \dots, x_{om}, y_o)$ is the derivative of the function $\beta(\alpha)$ evaluated at α = 1, that is,

$$\varepsilon = \beta'(1).$$

Differentiating (5) with respect to α at α = 1, we obtain the Passus equation of Frisch:

$$\varepsilon y_o = \sum_{i=1}^{m} \frac{\partial f(x_{o1}, \dots, x_{om})}{\partial x_i}\, x_{oi},$$

or, rearranging and assuming $y_o \ne 0$,

$$\varepsilon = \sum_{i=1}^{m} \frac{\partial f(x_{o1}, \dots, x_{om})}{\partial x_i}\, \frac{x_{oi}}{y_o} = \frac{\langle (x_{o1}, \dots, x_{om}), \nabla f(x_{o1}, \dots, x_{om}) \rangle}{y_o}, \tag{6}$$

where ∇f (xo1 , . . . , xom ) is the gradient of the function f (x1 , . . . , xm ) evaluated at the
point (xo1 , . . . , xom ). The numerator on the right-hand side of Eq. (6) is the scalar
product of this gradient and the input vector (xo1 , . . . , xom ).
The meaning of the scale elasticity ε follows from equality (5). Namely, ε
is the multiplier by which a marginal proportional change α of the input vector
(xo1 , . . . , xom ) translates to the relative change β (α) of the output yo observed on
the production frontier.

For example, let $\varepsilon = \beta'(1) = 2$. Consider increasing the input vector in some
small proportion, for example, by 1%, which corresponds to the increase of α from
$\alpha = 1$ to $\tilde{\alpha} = 1.01$. Then, by definition of the derivative $\beta'(1)$ and to the first degree
of approximation,

$$\frac{\beta(\tilde{\alpha}) - \beta(1)}{\tilde{\alpha} - 1} \approx \beta'(1) = 2.$$

Rearranging the above relation, in which $\beta(1) = 1$ and $\tilde{\alpha} = 1.01$, we have
$\beta(\tilde{\alpha}) \approx 1.02$. Therefore, in response to the proportional increase of the input vector
by 1%, the maximum amount of the output that can be produced increases by 2%.
Similarly, a proportional reduction of the input vector by 1% (corresponding to
$\hat{\alpha} = 0.99$) results in the reduction of the maximum amount of the output by 2%
(corresponding to $\beta(\hat{\alpha}) \approx 0.98$).
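
The calculation in Eq. (6) and the 1% example above can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the Cobb-Douglas frontier and its exponents are hypothetical (chosen so that ε = 2), and the helper names `scale_elasticity` and `output_response` are introduced here for convenience.

```python
from typing import Callable, List, Sequence


def scale_elasticity(f: Callable[[Sequence[float]], float],
                     x: Sequence[float], h: float = 1e-6) -> float:
    """Eq. (6): sum_i x_i * df/dx_i divided by f(x), using central differences."""
    y = f(x)
    total = 0.0
    for i, xi in enumerate(x):
        step = h * max(1.0, abs(xi))
        up: List[float] = list(x)
        down: List[float] = list(x)
        up[i] = xi + step
        down[i] = xi - step
        total += xi * (f(up) - f(down)) / (2.0 * step)
    return total / y


def output_response(f: Callable[[Sequence[float]], float],
                    x: Sequence[float], alpha: float) -> float:
    """Eq. (5): beta(alpha) = f(alpha * x) / f(x)."""
    return f([alpha * xi for xi in x]) / f(x)


def cobb_douglas(x: Sequence[float]) -> float:
    # Hypothetical frontier, homogeneous of degree 1.2 + 0.8 = 2.
    return x[0] ** 1.2 * x[1] ** 0.8


x_o = [3.0, 5.0]
eps = scale_elasticity(cobb_douglas, x_o)
beta_up = output_response(cobb_douglas, x_o, 1.01)
beta_down = output_response(cobb_douglas, x_o, 0.99)
print(f"epsilon    = {eps:.4f}")
print(f"beta(1.01) = {beta_up:.4f}")
print(f"beta(0.99) = {beta_down:.4f}")
```

For this particular function β(α) = α², so the exact responses 1.0201 and 0.9801 differ from the first-order values 1.02 and 0.98 only in the fourth decimal place.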

The General Case with Multiple Outputs

Consider the general case characterized by multiple inputs and multiple outputs in
which the efficient boundary (production frontier) of technology T is defined by
the transformation function F(X, Y) as in equality (2).4 Let (Xo , Yo ) be an arbitrary
unit located on the production frontier such that Xo ≠ 0 and Yo ≠ 0. Then we have
F(Xo , Yo ) = 0.
Consider an arbitrary proportional change of the input vector Xo by the variable
multiplier α ≥ 0 and define the corresponding proportional change β (α) of the
output vector Yo , assuming that the resulting unit (αXo , β (α) Yo ) remains located
on the production frontier. Substituting this into Eq. (2), we have

$$F(\alpha X_o, \beta(\alpha) Y_o) = 0. \tag{7}$$

Because the transformation function in Eq. (2) represents efficient production,
β (α) is unique for each α ≥ 0 and is equal to the largest proportion of the output
vector Yo that can be produced given the input vector αXo . Below we refer to β (α)
as the (proportional) output response function.

4 The case of a single output described by production function (1) is a special case of the more
general statement (2). To see this, we can restate (1) as F(x1 , . . . , xm , y) = y − f (x1 , . . . , xm ) = 0.

Generalizing the single-output case, define the scale elasticity ε(Xo , Yo ) as the
derivative of the function β (α) evaluated at α = 1:

$$\varepsilon(X_o, Y_o) = \beta'(1). \tag{8}$$

According to the given definition, the scale elasticity ε(Xo , Yo ) represents the
largest proportional change of the vector of outputs Yo possible in the technology
(as observed on the production frontier) as a response to a marginal proportional
change of the input vector Xo by a factor α.
Subject to known regularity conditions assumed by the implicit function theorem
and differentiating (7) with respect to α, we obtain the standard formula [37, 55, 81]:

$$\varepsilon(X_o, Y_o) = \beta'(1) = -\frac{\langle X_o, \nabla_X F(X_o, Y_o) \rangle}{\langle Y_o, \nabla_Y F(X_o, Y_o) \rangle}, \tag{9}$$

where ∇ X F(Xo , Yo ) and ∇ Y F(Xo , Yo ) are the partial gradients of the function
F(X, Y) evaluated at the point (Xo , Yo ) with respect to the inputs and outputs,
respectively. Note that in the case of a single output defined by the production
function (1), formula (9) is equivalent to Eq. (6).
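
Formula (9) can be illustrated with a small numerical sketch in Python. The transformation function below is hypothetical (two inputs, two outputs, with exponents 0.3 and 0.4 chosen for illustration), and the partial gradients are approximated by central differences rather than computed analytically.

```python
import math
from typing import Callable, List, Sequence


def F(X: Sequence[float], Y: Sequence[float]) -> float:
    # Hypothetical transformation function: increasing in outputs,
    # decreasing in inputs, and equal to zero on the production frontier.
    return math.hypot(Y[0], Y[1]) - X[0] ** 0.3 * X[1] ** 0.4


def partial(func: Callable[[List[float]], float],
            v: Sequence[float], i: int, h: float = 1e-6) -> float:
    """Central-difference approximation of d func / d v_i."""
    up, down = list(v), list(v)
    up[i] += h
    down[i] -= h
    return (func(up) - func(down)) / (2.0 * h)


def scale_elasticity(X: Sequence[float], Y: Sequence[float]) -> float:
    """Formula (9): -<X, grad_X F> / <Y, grad_Y F> at a frontier unit."""
    num = sum(X[i] * partial(lambda v: F(v, Y), X, i) for i in range(len(X)))
    den = sum(Y[r] * partial(lambda v: F(X, v), Y, r) for r in range(len(Y)))
    return -num / den


X_o = [4.0, 9.0]
g = X_o[0] ** 0.3 * X_o[1] ** 0.4
Y_o = [0.6 * g, 0.8 * g]          # chosen so that F(X_o, Y_o) = 0
eps = scale_elasticity(X_o, Y_o)
print(f"F(X_o, Y_o) = {F(X_o, Y_o):.2e}, epsilon = {eps:.4f}")
```

In this example the input side is homogeneous of degree 0.7 and the output side of degree 1, so solving Eq. (7) directly gives β(α) = α^0.7 and hence ε = 0.7, which the numerical value should reproduce.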

Remark 1 It is useful to give an equivalent definition of the proportional output
response function β (α) based only on the notion of technology T, without
reference to the transformation function, similar to its definition in Starrett [81] and
Førsund [27]:

β (α) = max {β| (αXo , βYo ) ∈ T , β ∈ R} . (10)

Defining the function β (α) as in Eq. (10) is particularly useful for the devel-
opment of the notion of scale elasticity in nonparametric models of technology in
which the transformation function is typically unknown.5 In most such models,
equality (10) becomes a linear program. Applying known results of sensitivity
analysis to this program at the point (Xo , Yo ) leads to a linear programming
method for the evaluation of scale elasticity, even without knowing the explicit
transformation function. This is discussed in detail in the next sections.
It is worth noting that the function β (α) is closely related to the directional
distance function of Chambers et al. [13, 14] and is reciprocal to the gauge function
of Rockafellar [73] evaluated at the unit (αXo , Yo ) in the direction of vector gY = Yo
(see, e.g., Podinovski et al. [68]).

5 To avoid dealing with excessive technicalities, we assume that the maximum in formula (10)
is attained. This assumption is true for most technologies of practical interest, in particular, for
the nonparametric variable returns-to-scale technology used in DEA and discussed in subsequent
sections.

Remark 2 It is clear that there exist infinitely many different transformation
functions F(X, Y) that characterize the same production technology T by formula
(3), or its production frontier by equality (2). One important example of this is the
characterization of technology T by the directional distance function of Chambers
et al. [13, 14] defined as follows:

$$D(X, Y, g_X, g_Y) = \begin{cases} \sup \{ \delta \mid (X - \delta g_X, Y + \delta g_Y) \in T \}, & \text{if } (X, Y) \in T, \\ -\infty, & \text{if } (X, Y) \notin T, \end{cases} \tag{11}$$

where $(g_X, g_Y) \in \mathbb{R}^m_+ \times \mathbb{R}^s_+ \setminus \{(0, 0)\}$.

Chambers et al. [14] show that if technology T is freely disposable with respect
to all inputs and outputs, then the function D(X, Y, gX , gY ) is a complete function
representation of T, that is, we have

$$T = \{ (X, Y) \in \mathbb{R}^m_+ \times \mathbb{R}^s_+ \mid D(X, Y, g_X, g_Y) \ge 0 \}. \tag{12}$$

In this case the frontier of technology T is given by the equality D(X, Y, gX , gY ) = 0.

Taking into account that the vectors gX and gY are fixed, note that the directional
distance function D(X, Y, gX , gY ) defines the transformation function in Eq. (2) as
follows:

F (X, Y ) = −D (X, Y, gX , gY ) ,

and (12) becomes a special case of Eq. (3).6

If the function D(X, Y, gX , gY ) is sufficiently smooth (with respect to the variable
vectors X and Y, for the fixed parameters gX and gY ) and satisfies all conditions
of the implicit function theorem at the unit (Xo , Yo ), the scale elasticity ε(Xo , Yo )
is calculated by formula (9), in which F(X, Y) is substituted by D(X, Y, gX , gY ) [2,
33, 83].
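
Definition (11) can be made concrete with a short Python sketch. The technology used here (a single input and a single output with y ≤ √x) and the direction g = (1, 1) are hypothetical choices. The bisection relies on free disposability, which guarantees that the feasible values of δ form an interval starting at zero.

```python
import math


def in_T(x: float, y: float) -> bool:
    """Hypothetical freely disposable technology: y <= sqrt(x), x >= 0, y >= 0."""
    return x >= 0.0 and y >= 0.0 and y <= math.sqrt(x)


def directional_distance(x: float, y: float, gx: float = 1.0,
                         gy: float = 1.0, tol: float = 1e-12) -> float:
    """Eq. (11): sup{delta | (x - delta*gx, y + delta*gy) in T}, or -inf outside T."""
    if not in_T(x, y):
        return -math.inf
    lo, hi = 0.0, 1.0
    # Expand the bracket until (x - hi*gx, y + hi*gy) leaves the technology.
    while in_T(x - hi * gx, y + hi * gy):
        lo, hi = hi, 2.0 * hi
    # Bisection: lo is always feasible, hi is always infeasible.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_T(x - mid * gx, y + mid * gy):
            lo = mid
        else:
            hi = mid
    return lo


d_interior = directional_distance(4.0, 1.0)   # unit inside T
d_frontier = directional_distance(4.0, 2.0)   # unit on the frontier y = sqrt(x)
d_outside = directional_distance(4.0, 3.0)    # unit outside T
d_shifted = directional_distance(4.0 - 0.3, 1.0 + 0.3)
print(d_interior, d_frontier, d_outside, d_shifted)
```

The frontier unit returns zero, in line with the equality D(X, Y, gX , gY ) = 0 on the frontier, and the unit outside T returns −∞. The last call illustrates the translation property that follows directly from definition (11): moving a unit by t in the direction (−gX , gY ) while staying in T reduces the distance by t.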

Returns to Scale

The evaluation of scale elasticity by formula (9) leads to the conventional returns-to-
scale characterization of the unit (Xo , Yo ) into the following three types: increasing,
decreasing, and constant returns to scale.

Definition 1 Let F(Xo , Yo ) = 0. Then the unit (Xo , Yo ) exhibits

(i) Increasing returns to scale if ε(Xo , Yo ) > 1
(ii) Decreasing returns to scale if ε(Xo , Yo ) < 1
(iii) Constant returns to scale if ε(Xo , Yo ) = 1

6 Properties of the directional distance function (11) are explored by Chambers et al. [14, 15] and
Chambers and Quiggin [12].

Fig. 1 Returns to scale for a smooth production frontier (horizontal axis: Input; vertical axis: Output)

Example 1 Consider the technology with a single input x and single output y
depicted in Fig. 1. Suppose that its efficient frontier is defined by the production
function y = f (x). In this case, the formula for scale elasticity (6) can be restated as
follows:

$$\varepsilon(x_o, y_o) = \frac{x_o f'(x_o)}{y_o} = f'(x_o) \Big/ \frac{y_o}{x_o}. \tag{13}$$

The above formula explains the following conventional interpretation of the scale
elasticity in the case of a single input and a single output. Namely, the scale elasticity
evaluated at an output radial efficient unit (xo , yo ) is equal to the ratio of its marginal
productivity $f'(x_o)$ to its average productivity yo /xo .
Consider, for example, unit A in Fig. 1. The marginal productivity evaluated at
this unit is equal to the slope of the line KL tangent to the frontier at the point A.
The corresponding average productivity is equal to the slope of the ray OA. Because
the former is greater than the latter, their ratio is greater than 1, and unit A exhibits
increasing returns to scale.
Similarly, both the marginal and average productivities evaluated at the unit B
are equal to the slope of the ray OM. Therefore, B exhibits constant returns to scale.
Finally, the marginal productivity evaluated at the unit C (equal to the slope of the
line PQ) is lower than the average productivity (equal to the slope of the ray OC).
By formula (13), the scale elasticity evaluated at C is less than 1 and therefore unit
C exhibits decreasing returns to scale.
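
The reasoning of Example 1 and Definition 1 can be sketched in Python as follows. The S-shaped frontier f(x) = x²/(1 + x²) is a hypothetical stand-in for the curve in Fig. 1, the marginal productivity is approximated by a central difference, and the tolerance used to detect ε = 1 is an arbitrary numerical choice.

```python
from typing import Callable


def frontier(x: float) -> float:
    # Hypothetical smooth frontier with an S shape; not the curve of Fig. 1.
    return x * x / (1.0 + x * x)


def marginal_productivity(f: Callable[[float], float], x: float,
                          h: float = 1e-6) -> float:
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def scale_elasticity(f: Callable[[float], float], x: float) -> float:
    """Formula (13): marginal productivity divided by average productivity."""
    return marginal_productivity(f, x) / (f(x) / x)


def returns_to_scale(eps: float, tol: float = 1e-6) -> str:
    """Definition 1, with a numerical tolerance around eps = 1."""
    if eps > 1.0 + tol:
        return "increasing"
    if eps < 1.0 - tol:
        return "decreasing"
    return "constant"


for x in (0.5, 1.0, 2.0):
    eps = scale_elasticity(frontier, x)
    print(f"x = {x}: epsilon = {eps:.4f}, {returns_to_scale(eps)} returns to scale")
```

For this frontier the elasticity has the closed form 2/(1 + x²), so it falls monotonically through one at x = 1, mirroring the progression from unit A through B to C described in the example.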

Scale Elasticity and Returns to Scale in the VRS Technology

In this section we discuss the extension of the notions of scale elasticity and returns
to scale to the variable returns-to-scale (VRS) technology of Banker et al. [5].
Further extensions to other nonparametric technologies and their generalizations
are presented in subsequent sections.7
Let T be the VRS technology generated by a finite set of observed units
$(X_j, Y_j) \in \mathbb{R}^m_+ \times \mathbb{R}^s_+$, $j \in J = \{1, \dots, n\}$. We assume that $X_j$ and $Y_j$ are nonzero
vectors for any j ∈ J, that is, that each observed unit has at least one positive input
and one positive output. Denote by $\bar{X}$ and $\bar{Y}$ the m × n and s × n matrices whose
columns are, respectively, the input and output vectors $X_j$ and $Y_j$ of the observed
units j ∈ J.
Following Banker et al. [5], the VRS technology T can be stated as follows:

$$T = \{ (X, Y) \in \mathbb{R}^{m+s}_+ \mid \bar{X} \lambda \le X,\ \bar{Y} \lambda \ge Y,\ \mathbf{1}^\top \lambda = 1,\ \lambda \in \mathbb{R}^n_+ \}. \tag{14}$$

(We use the superscript ⊤ to denote transposed vectors. The vector inequalities
mean that the specified inequalities are satisfied for each component.)
In contrast with the neoclassical economics approach in which the transformation
function is assumed to be known and sufficiently smooth, the task of defining and
evaluating the scale elasticity in the VRS technology leads to two obstacles.
First, we do not have an explicit formula for the transformation function (2)
describing the frontier of the VRS technology and cannot use formula (9) for the
calculation of scale elasticity. Below we show that this difficulty can be overcome
by using known results of sensitivity analysis applied to the definition of the output
response function β (α) by formula (10) which, in the case of the VRS technology,
becomes a linear program.
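
To make the linear program concrete, the following Python sketch evaluates β (α) from formula (10) for a VRS technology with a single input and a single output. The four observed units are hypothetical. The sketch avoids an LP solver by using the fact that, in this special case, the program has only two constraints besides nonnegativity, so an optimal solution can be found among single units and pairs of units; a general multiple-input, multiple-output model would require a proper LP solver.

```python
from itertools import combinations
from typing import List, Tuple

Unit = Tuple[float, float]  # (input x, output y)


def beta(alpha: float, units: List[Unit], x_o: float, y_o: float) -> float:
    """Largest beta with (alpha*x_o, beta*y_o) in the single-input,
    single-output VRS technology generated by `units` (formula (10))."""
    cap = alpha * x_o
    # Vertices with one positive lambda: observed units that fit the input cap.
    candidates = [y for x, y in units if x <= cap]
    # Vertices with two positive lambdas: convex combinations using exactly cap.
    for u, v in combinations(units, 2):
        (x1, y1), (x2, y2) = sorted([u, v])
        if x1 < cap < x2:
            w = (cap - x1) / (x2 - x1)
            candidates.append((1.0 - w) * y1 + w * y2)
    if not candidates:
        raise ValueError("alpha lies outside the domain of beta")
    return max(candidates) / y_o


# Hypothetical observed units (x, y).
units = [(2.0, 1.0), (4.0, 4.0), (8.0, 6.0), (6.0, 3.0)]

b_at_1 = beta(1.0, units, 4.0, 4.0)      # unit (4, 4)
b_up = beta(1.5, units, 4.0, 4.0)
b_down = beta(0.75, units, 4.0, 4.0)
b_inefficient = beta(1.0, units, 6.0, 3.0)  # unit (6, 3)
print(b_at_1, b_up, b_down, b_inefficient)
```

For the unit (4, 4) the sketch gives β(1) = 1, so this unit is output radial efficient, whereas the unit (6, 3) yields β(1) > 1 and is not. The values of β on either side of α = 1 change at different rates, which anticipates the lack of smoothness discussed next.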
Second, the efficient frontier of the VRS technology is generally not smooth and
the standard definition (8) of scale elasticity generally does not apply to this frontier.
This conceptual difficulty is overcome by defining a one-sided (left-hand and
right-hand) scale elasticity and giving a corresponding one-sided returns-to-scale
characterization of the frontier points. In the context of VRS technology, this approach
was pioneered by Banker and Thrall [4]. It was explored further by Fukuyama [32],
Førsund and Hjalmarsson [28], Hadjicostas and Soteriou [35], Podinovski et al.
[67], and Podinovski and Førsund [66]. It was extended to more general settings
by Chambers and Färe [11], Zelenyuk [83], and Podinovski et al. [68].

7 The axiomatic approach used by Banker et al. [5] stipulates that the VRS technology is a convex
set. This assumption precludes an essential feature of the neoclassical frontiers, i.e., the Regular
Ultra Passum Law of Frisch [31] – for a discussion, see Førsund and Hjalmarsson [29], Olesen and
Petersen [51] and Olesen and Ruggiero [52].

It is important to note that in the literature it is often assumed that the notions
of scale elasticity and returns to scale apply only to strongly efficient units.8 As
shown by the development of these notions in Podinovski et al. [67] and Podinovski
et al. [68], it is sufficient that the unit (Xo , Yo ) be only output radial efficient. Using
the proportional output response function β (α) defined in Eq.(10), this condition is
stated as follows:

Assumption 1 The unit (Xo , Yo ) is output radial efficient, that is, β(1) = 1.

If the transformation function F(X, Y) is known and the unit (Xo , Yo ) satisfies
equality (2) that represents efficient production, Assumption 1 is automatically
satisfied and does not need to be mentioned or verified. However, in the VRS
technology, typically only some units satisfy Assumption 1 which therefore needs
to be verified.
Consider any unit (Xo , Yo ) ∈ T that satisfies Assumption 1. (This unit may be
observed or unobserved.) Consider the domain of the output response function
β (α). Because the VRS technology T is freely disposable in all inputs, it is
straightforward to show that this domain is an unbounded interval of the
form [α∗ , +∞), where α∗ > 0. As follows from known results of sensitivity
analysis in linear programming, β (α) is a continuous, concave, and piecewise linear
function on its domain [68]. This implies that if α = 1 is not the left extreme point
of the domain (i.e., if α∗ ≠ 1), then the function β (α) has both the right-hand and
left-hand derivatives β′+ (1) and β′− (1) (because β (α) is a linear function in some
neighborhood on each side of α = 1). If α∗ = 1, then only the right-hand derivative
is defined in the conventional sense. In this case, the left-hand derivative β′− (1) is
undefined and is often formally taken equal to +∞.
The above observations lead to the following definition of the one-sided (right-
hand and left-hand) scale elasticities ε+ and ε− evaluated at the unit (Xo , Yo ).
This definition generalizes the notion of scale elasticity ε defined by Eq. (8) and
is effectively used by Banker and Thrall [4]:

$$
\varepsilon^{+}(X_o, Y_o) = \beta'_{+}(1), \qquad \varepsilon^{-}(X_o, Y_o) = \beta'_{-}(1). \tag{15}
$$

Because the function β (α) is concave on its domain, according to a known result of
convex analysis ([73], Theorem 24.1) we have:

ε− (Xo , Yo ) ≥ ε+ (Xo , Yo ) . (16)

8 A unit (Xo , Yo ) ∈ T is strongly efficient if there exists no (X′ , Y′ ) ∈ T such that X′ ≤ Xo , Y′ ≥ Yo ,
and (X′ , Y′ ) ≠ (Xo , Yo ).

The inequality (16) is satisfied as equality if and only if the function β (α) is
differentiable at α = 1, in which case both one-sided scale elasticities are equal to
the scale elasticity ε(Xo , Yo ) defined by Eq.(8). This shows that the notion of one-
sided scale elasticities (15) is a generalization of the conventional notion of scale
elasticity to the points on the frontier at which the latter is undefined but the former
is defined correctly.
The meaning of the one-sided scale elasticities (15) is similar to the mean-
ing of the conventional scale elasticity ε(Xo , Yo ). For example, suppose that
ε+ (Xo , Yo ) = 0.5 and ε− (Xo , Yo ) = 2. If we increase the input vector Xo in a
small proportion, for example, by 1%, the proportion by which the production of
the output vector Yo in the VRS technology can be increased is equal to 0.5% as
indicated by the value of the right-hand scale elasticity ε+ (Xo , Yo ). Similarly, if we
reduce the input vector Xo proportionally by 1%, the maximum production of the
output vector Yo is reduced by 2% as indicated by the left-hand scale elasticity
ε− (Xo , Yo ).
The notion of one-sided scale elasticity (15) leads to the following characterization
of returns to scale of output radial efficient units, which is due to Banker and Thrall [4].

Definition 2 Let the unit (Xo , Yo ) ∈ T satisfy Assumption 1. Then (Xo , Yo ) exhibits

(i) Increasing returns to scale if 1 < ε+ (Xo , Yo ) ≤ ε− (Xo , Yo )

(ii) Decreasing returns to scale if ε+ (Xo , Yo ) ≤ ε− (Xo , Yo ) < 1
(iii) Constant returns to scale if ε+ (Xo , Yo ) ≤ 1 ≤ ε− (Xo , Yo )

Example 2 Consider the VRS technology with a single input and a single output
shown in Fig. 2. This technology is generated by the five observed units A, B, C, D,
and E, all of which are output radial efficient.
Consider unit C. The right-hand marginal productivity at this unit is equal
to 2/3, which is the slope of the segment CD. The average productivity of unit
C is equal to 6/5. Therefore, the right-hand scale elasticity at C is equal to
the ratio (2/3)/(6/5) ≈ 0.56. Similarly, the left-hand scale elasticity is the ratio
(2/1)/(6/5) ≈ 1.67. According to Definition 2, C exhibits constant returns to scale.9
Similar analysis shows that the units A and B exhibit increasing returns to scale,
and the units D and E exhibit decreasing returns to scale.
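
The arithmetic of Example 2 can be sketched in a few lines of Python. The sketch below computes each one-sided scale elasticity as the ratio of the one-sided marginal productivity to the average productivity and then applies Definition 2. Only the coordinates of C and the slopes of the segments BC and CD are taken from the example; the coordinates of A, B, D, and E, as well as the function names, are assumptions made for illustration. The sketch also assumes that every listed unit is a vertex of a concave frontier.

```python
from fractions import Fraction as F
import math

# Hypothetical (input, output) data. Only C = (5, 6) and the slopes of
# BC (2) and CD (2/3) are taken from Example 2; the rest is assumed.
UNITS = {"A": (F(3), F(1)), "B": (F(4), F(4)), "C": (F(5), F(6)),
         "D": (F(8), F(8)), "E": (F(11), F(9))}

def one_sided_elasticities(name, units=UNITS):
    """Return (eps_plus, eps_minus) for a vertex of a concave frontier."""
    pts = sorted(units.values())
    x, y = units[name]
    i = pts.index((x, y))
    avg = y / x  # average productivity
    # Beyond the last vertex the frontier is flat (free disposal of input).
    if i + 1 < len(pts):
        eps_plus = (pts[i + 1][1] - y) / (pts[i + 1][0] - x) / avg
    else:
        eps_plus = F(0)
    # Before the first vertex the technology is empty: eps_minus is +infinity.
    if i > 0:
        eps_minus = (y - pts[i - 1][1]) / (x - pts[i - 1][0]) / avg
    else:
        eps_minus = math.inf
    return eps_plus, eps_minus

def returns_to_scale(eps_plus, eps_minus):
    """Classify a unit according to Definition 2."""
    if eps_plus > 1:
        return "IRS"
    if eps_minus < 1:
        return "DRS"
    return "CRS"

for name in UNITS:
    ep, em = one_sided_elasticities(name)
    print(name, ep, em, returns_to_scale(ep, em))
```

For unit C the sketch returns the exact values 5/9 and 5/3, which round to the 0.56 and 1.67 quoted in the example.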

9 Note that unit C actually exhibits decreasing returns to scale if the input is increased (because
the right-hand scale elasticity evaluated at this unit is less than 1) and increasing returns to scale
if the input is reduced (because the left-hand scale elasticity is greater than 1). This contrasts with
the standard definition of constant returns to scale requiring that the scale elasticity be equal to 1
[34].
[Figure 2: piecewise linear frontier of a single-input, single-output VRS technology passing through the units A, B, C, D, and E. Horizontal axis: Input (0 to 11); vertical axis: Output (0 to 11).]

Fig. 2 Returns to scale in the VRS technology

Evaluation of Scale Elasticity and Returns to Scale in the VRS Technology

In DEA, with the exception of very simple cases such as Example 2, we do not
know a functional representation of the production frontier and do not know the
proportional output response function β (α). Consequently, we cannot use formulae
(15) for the calculation of the one-sided scale elasticities. Below we consider the
linear programming approach to their evaluation based on the dual characterization
of the VRS technology T. The described approach leads to the known returns-to-
scale characterization of production units based on the minimum and maximum
optimal values of the sign-free variable dual to the normalizing equality 1⊤λ = 1 in
the statement (14) of the VRS technology.
Let the unit (Xo , Yo ) ∈ T satisfy Assumption 1. Consider assessing its output
radial efficiency by solving the following output-oriented multiplier program in
which $u \in \mathbb{R}^{s}_{+}$ and $v \in \mathbb{R}^{m}_{+}$ are the vectors of output and input weights, respectively,
and v0 is a sign-free variable:10

10 We state program (17) and related programs (18) and (19) in a form consistent with Banker and
Thrall [4] and Førsund and Hjalmarsson [28]. Because in these three programs the variable v0 is
free in sign, an alternative statement is equally valid in which the “plus” sign before this variable
in the objective function and constraints is changed to the “minus” sign as, for example, in Cooper
et al. [18]. In the latter case the formulae for the calculation of the one-sided scale elasticities given
in this section need to be changed accordingly.

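$$
\theta^{*} = \min \; v^{\top} X_o + v_0, \tag{17}
$$

subject to

$$
\begin{aligned}
& u^{\top} Y_o = 1, \\
& v^{\top} X_j - u^{\top} Y_j + v_0 \ge 0, \quad j = 1, \dots, n, \\
& u, v \ge 0, \quad v_0 \ \text{sign free}.
\end{aligned}
$$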

Because by Assumption 1, unit (Xo , Yo ) is output radial efficient, we have θ ∗ = 1.

Consider the set of all optimal solutions (u, v, v0 ) of program (17). Let v0min and
v0max be, respectively, the minimal and maximal values of the scalar v0 taken over
this set. For example, v0max is the optimal value of the following linear program:

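$$
v_0^{\max} = \max \; v_0, \tag{18}
$$

subject to

$$
\begin{aligned}
& u^{\top} Y_o = 1, \\
& v^{\top} X_o - u^{\top} Y_o + v_0 = 0, \\
& v^{\top} X_j - u^{\top} Y_j + v_0 \ge 0, \quad j = 1, \dots, n, \\
& u, v \ge 0, \quad v_0 \ \text{sign free}.
\end{aligned}
$$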

Note that the feasible region of program (18) is the set of optimal solutions to the
multiplier program (17). Indeed, program (18) has the same constraints as program
(17) and the additional constraint v⊤Xo − u⊤Yo + v0 = 0. Because u⊤Yo = 1, this
additional constraint is equivalent to v⊤Xo + v0 = 1, which is satisfied only by the
optimal solutions of program (17).
Similarly, to compute v0min , we change the maximization of the objective function
in Eq. (18) to its minimization:

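$$
v_0^{\min} = \min \; v_0, \tag{19}
$$

subject to

$$
\begin{aligned}
& u^{\top} Y_o = 1, \\
& v^{\top} X_o - u^{\top} Y_o + v_0 = 0, \\
& v^{\top} X_j - u^{\top} Y_j + v_0 \ge 0, \quad j = 1, \dots, n, \\
& u, v \ge 0, \quad v_0 \ \text{sign free}.
\end{aligned}
$$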

Programs (18) and (19) are both feasible, and their feasible sets coincide with
the set of optimal solutions to program (17). It is clear that in program (18) we
always have v0max ≤ 1.11 In contrast, the objective function of program (19) may
be unbounded below, in which case program (19) does not have a finite optimal
solution.
The following theorem takes into account that for any unit (Xo , Yo ) ∈ T, the
domain of the function β (α) is a closed half-interval of the form [α∗ , +∞). Its proof
follows from a more general result established by Podinovski and Førsund [66].

Theorem 1 Let the unit (Xo , Yo ) ∈ T satisfy Assumption 1. Then

(i) Program (18) has a finite optimal value v0max and

$$
\varepsilon^{+}(X_o, Y_o) = \beta'_{+}(1) = 1 - v_0^{\max}. \tag{20}
$$

(ii) Program (19) has a finite optimal value if and only if α = 1 is not the left
extreme point of the domain of β (α), that is, α∗ ≠ 1. In this case we have

$$
\varepsilon^{-}(X_o, Y_o) = \beta'_{-}(1) = 1 - v_0^{\min}. \tag{21}
$$

(iii) Program (19) has an unbounded optimal value if and only if α = 1 is the left
extreme point of the domain of β (α), that is, if α∗ = 1. (In this case we can formally
define v0min = −∞ and let ε− (Xo , Yo ) = β′− (1) = 1 − v0min = +∞.)

Theorem 1 suggests a straightforward computational procedure for the evaluation
of one-sided scale elasticities ε− and ε+ at any given unit (Xo , Yo ) ∈ T. This
procedure automatically establishes whether the unit (Xo , Yo ) satisfies Assumption 1, that
is, whether it is output radial efficient. This fact allows us to solve programs (18) and (19)
on the entire set of observed units J = {1, . . . , n} without first evaluating
their output radial efficiency and selecting the efficient subset.12
To be specific, consider the evaluation of the left-hand scale elasticity ε− (Xo , Yo )
by solving program (19) and using formula (21). The following three outcomes are
logically possible.
Case 1. Let program (19) have a finite optimal solution û, v̂, v0min . As any
feasible solution to program (19), this optimal solution is also optimal in program
(17), and the optimal value of the latter is equal to 1. Therefore, the unit (Xo , Yo )
satisfies Assumption 1, that is, it is output radial efficient. By formula (21), we have
ε− (Xo , Yo ) = β′− (1) = 1 − v0min .
Case 2. Let program (19) be feasible and its optimal value be unbounded. In this
case any feasible solution to this program is also optimal in program (17). Therefore,
the unit (Xo , Yo ) satisfies Assumption 1. By statement (iii) of Theorem 1, α = 1 is
the left extreme point of the domain of β (α). This means that reducing the input vector Xo
in any proportion α < 1 leads outside the VRS technology. (Such a situation occurs at
the unit A in Fig. 2, where there are no units in the technology with the input smaller
than that of the unit A.) In this case the left-hand scale elasticity ε− is undefined.
For convenience, in this case it is often formally assumed that v0min = −∞ and
ε− (Xo , Yo ) = β′− (1) = 1 − v0min = +∞.

11 This follows from the second constraint of program (18), taking into account that v⊤Xo ≥ 0 and
u⊤Yo = 1.
12 The fact that programs (18) and (19) can be solved on the full set of observed units, without first
identifying the subset of output radial efficient units, was suggested in a similar setting by Atici
and Podinovski [1].

Case 3. Let program (19) be infeasible. This means that there exists no solution
that satisfies all constraints of program (17) at which its objective function attains the
value of 1. (Equivalently, the equality v⊤Xo − u⊤Yo + v0 = 0 cannot be satisfied.)
In this case the unit (Xo , Yo ) does not satisfy Assumption 1, and the notion of one-
sided scale elasticity evaluated at this unit is undefined.
The evaluation of the right-hand scale elasticity ε+ (Xo , Yo ) requires solving
program (18) and using formula (20) to calculate ε+ (Xo , Yo ) = β′+ (1). As noted,
if the unit (Xo , Yo ) satisfies Assumption 1, this program always has a finite optimal
value v0max ≤ 1. Therefore, we can have only the analogues of Cases 1 and 3,
since Case 2 is impossible. It is also worth noting that programs (18) and (19) are
either both feasible or both infeasible. Therefore, if solving either of these programs
shows its infeasibility (Case 3), then there is no need to solve the remaining
program.
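
For a single input and a single output, programs (18) and (19) can be solved by hand, which gives a compact way to see the three cases at work. The Python sketch below is an illustration under stated assumptions, not a general solver: the data are hypothetical (only C = (5, 6) is taken from Example 2, and G is an added inefficient unit), the function names are invented, and the reduction to a single variable v holds only in the one-input, one-output case.

```python
from fractions import Fraction as F
import math

# Hypothetical (input, output) data; G is an inefficient unit added for Case 3.
UNITS = {"A": (F(3), F(1)), "B": (F(4), F(4)), "C": (F(5), F(6)),
         "D": (F(8), F(8)), "E": (F(11), F(9)), "G": (F(6), F(5))}

def v0_range(name, units=UNITS):
    """Solve programs (18) and (19) for one input and one output.

    u = 1/Yo is fixed by u*Yo = 1 and v0 = 1 - v*Xo by the extra equality,
    so v >= 0 is the only free variable and each constraint
    v*Xj - u*Yj + v0 >= 0 becomes v*(Xj - Xo) >= Yj/Yo - 1.
    Returns (v0_min, v0_max), or None if the programs are infeasible.
    """
    xo, yo = units[name]
    lo, hi = F(0), math.inf            # feasible interval for v
    for xj, yj in units.values():
        d, rhs = xj - xo, yj / yo - 1
        if d > 0:
            lo = max(lo, rhs / d)
        elif d < 0:
            hi = min(hi, rhs / d)
        elif rhs > 0:                  # same input, larger output
            return None
    if lo > hi:
        return None                    # Case 3: Assumption 1 fails
    v0_max = 1 - lo * xo
    v0_min = -math.inf if hi == math.inf else 1 - hi * xo   # Case 2 if unbounded
    return v0_min, v0_max

def scale_elasticities(name, units=UNITS):
    """Return (eps_plus, eps_minus) by formulae (20) and (21), or None."""
    r = v0_range(name, units)
    if r is None:
        return None
    v0_min, v0_max = r
    return 1 - v0_max, 1 - v0_min

for name in UNITS:
    print(name, scale_elasticities(name))
```

With these data, unit C falls under Case 1, unit A under Case 2 (its left-hand elasticity is reported as infinity), and unit G under Case 3.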

Remark 3 The one-sided scale elasticities and the returns-to-scale characterization
of units in the VRS technology can also be evaluated by solving the input-oriented
multiplier VRS models. Let the unit (Xo , Yo ) ∈ T be both output radial efficient (as
required by Assumption 1 for a correct definition of the one-sided scale elasticities)
and also input radial efficient. Then the optimal value θ̂ of the following input-oriented
multiplier VRS model is equal to 1:

$$
\hat{\theta} = \max \; u^{\top} Y_o + u_0, \tag{22}
$$

subject to

$$
\begin{aligned}
& v^{\top} X_o = 1, \\
& v^{\top} X_j - u^{\top} Y_j - u_0 \ge 0, \quad j = 1, \dots, n, \\
& u, v \ge 0, \quad u_0 \ \text{sign free}.
\end{aligned}
$$

Consider the set of optimal solutions (u, v, u0 ) of program (22). Let u0min and
u0max be the minimal and maximal values of u0 taken over this set. These two
values can be evaluated by solving programs analogous to programs (18) and (19).
Then the one-sided scale elasticities at the unit (Xo , Yo ) are expressed as follows [4,
28, 67]:

$$
\varepsilon^{+}(X_o, Y_o) = \frac{1}{1 - u_0^{\min}}, \qquad \varepsilon^{-}(X_o, Y_o) = \frac{1}{1 - u_0^{\max}}. \tag{23}
$$

Remark 4 In the DEA literature, the notion of returns to scale is sometimes
extended to the units that are output radial inefficient and, therefore, do not satisfy
Assumption 1. A common approach in such cases requires that the unit (Xo , Yo )
be first projected on the boundary of the VRS technology, either in the output or
input orientation. If the projection satisfies Assumption 1 (which is always true
for output projections but is not necessarily true for the input projections), then its
returns-to-scale type is defined correctly and may be assigned to the unit (Xo , Yo ).
It is well known that the input and output projections of the same inefficient unit
may belong to different types of returns to scale, and therefore, the returns-to-scale
characterization of the inefficient units generally depends on the selected projection
[7, 27].

Remark 5 Let us mention two known methods that evaluate the returns-to-scale
type of the unit (Xo , Yo ) directly, without assessing the one-sided scale elasticities.
These methods may be useful when the strength of the characterization measured
by the one-sided scale elasticities is unimportant, and only the classification of units
into the three types of returns to scale is of interest. The resulting characterizations
obtained by these methods are equivalent to those based on Definition 2 that
employs the notion of one-sided scale elasticity [6].
The first method was introduced by Banker and Thrall [4] and further discussed
by Seiford and Zhu [80] and Cooper et al. [18, p. 138]. In this method, the returns-
to-scale characterization of the unit (Xo , Yo ) is obtained by evaluating its input radial
efficiency in the constant returns-to-scale (CRS) technology of Charnes et al. [16].
The statement of the CRS technology is obtained from the statement (14) of the
VRS technology, by omitting the normalizing equality 1⊤λ = 1. The unit (Xo , Yo )
exhibits increasing returns to scale if and only if 1⊤λ < 1 in all optimal solutions to
the CRS model measuring its input radial efficiency. Similarly, the unit (Xo , Yo )
exhibits decreasing returns to scale if and only if 1⊤λ > 1 in all such optimal
solutions. The remaining case (i.e., when there exists an optimal solution such that
1⊤λ = 1) corresponds to the unit (Xo , Yo ) exhibiting constant returns to scale.
Applying this method requires solving two linear programs: one maximizing and
the other minimizing the term 1⊤λ.
An alternative method was developed by Färe et al. [23, 24] and was further
discussed by Färe and Grosskopf [22]. This approach requires the evaluation of the
input radial efficiency of the unit (Xo , Yo ) in the three technologies: the actual VRS
technology T and its nonincreasing and constant returns-to-scale (NIRS and CRS)
extensions. Let ET (Xo , Yo ), ENIRS (Xo , Yo ), and ECRS (Xo , Yo ) be the corresponding
input radial efficiencies of the unit (Xo , Yo ) in the three technologies. Then (Xo , Yo )
exhibits constant returns to scale if and only if ET (Xo , Yo ) = ECRS (Xo , Yo ). Otherwise
we have ET (Xo , Yo ) > ECRS (Xo , Yo ) and further investigation using the NIRS
technology is required. Namely, if ENIRS (Xo , Yo ) = ECRS (Xo , Yo ), then the unit
(Xo , Yo ) exhibits increasing returns to scale. Otherwise, (Xo , Yo ) exhibits decreasing
returns to scale.
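
The three-technology comparison described above can be sketched for a single input and a single output without an LP solver. In the following Python illustration the data are hypothetical (only C = (5, 6) comes from Example 2) and the function names are invented. Two modelling shortcuts are assumptions of the sketch: the NIRS extension is obtained by adding the origin to the set of points before taking the convex hull, and the input-minimizing point is searched among single units and pairs of units, which suffices because an optimal basic solution of the underlying linear program has at most two positive weights.

```python
from fractions import Fraction as F

# Hypothetical (input, output) data; only C = (5, 6) comes from Example 2.
UNITS = {"A": (F(3), F(1)), "B": (F(4), F(4)), "C": (F(5), F(6)),
         "D": (F(8), F(8)), "E": (F(11), F(9))}

def min_input(y, points):
    """Least input yielding output >= y in the convex hull of `points`
    with free disposal (single units and pairs of units are enough)."""
    candidates = []
    for xk, yk in points:
        if yk < y:
            continue
        candidates.append(xk)
        for xj, yj in points:
            if yj < y:
                share = (y - yj) / (yk - yj)   # weight placed on (xk, yk)
                candidates.append(xj + share * (xk - xj))
    return min(candidates)

def input_efficiencies(name, units=UNITS):
    """Input radial efficiency in the VRS, NIRS and CRS technologies."""
    pts = list(units.values())
    xo, yo = units[name]
    e_vrs = min_input(yo, pts) / xo
    e_nirs = min_input(yo, pts + [(F(0), F(0))]) / xo  # weights sum to <= 1
    e_crs = yo / max(y / x for x, y in pts) / xo       # cone extension
    return e_vrs, e_nirs, e_crs

def returns_to_scale(name, units=UNITS):
    e_vrs, e_nirs, e_crs = input_efficiencies(name, units)
    if e_vrs == e_crs:
        return "CRS"
    return "IRS" if e_nirs == e_crs else "DRS"

for name in UNITS:
    print(name, input_efficiencies(name), returns_to_scale(name))
```

Exact rational arithmetic is used so that the equality tests between efficiencies are not affected by rounding; with floating-point LP output a tolerance would be needed instead.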
The method of Färe et al. [23] was further modified by Kerstens and Vanden
Eeckaut [41] and Briec et al. [9]. Podinovski [56, 57] proved that the above
approaches correctly characterize returns to scale in any convex technology T,
including the standard VRS technology.13 If the technology T is not convex, this
group of methods does not evaluate the local returns to scale but rather, with a small
modification, indicates a direction to the optimal scale of operations. We discuss this
in detail in section “Global Returns to Scale.”

Remark 6 An alternative to the linear programming approach is a group of direct
computational methods [30, 43, 44]. Such methods aim at determining (reconstructing)
the section of the production frontier that includes all units stated in the
form (αXo , βYo ). Identifying this section of frontier is equivalent to identifying the
function β (α), and the use of formula (15) becomes straightforward. Direct methods
generally offer advantages in computational speed compared to the traditional
methods based on solving multiplier DEA models, although their implementation
requires good programming skills and fine-tuning of the algorithms. This makes
direct methods particularly suitable for the implementation in software packages,
especially those that aim at the visualization of production frontiers, which is a
computationally demanding task [43].

Technically Optimal Scale

In the neoclassical setting, the notion of technically optimal scale was introduced by
Frisch [31]. Assuming that a smooth production function (1) satisfies the Regular
Ultra Passum Law, the surface of technically optimal scale is the locus of all units
characterized by the scale elasticity equal to 1. Under the Regular Ultra Passum Law,
any proportional increase of all inputs of a unit located on this surface by a factor
α > 1 would lead to an increase of the output by a factor less than α. Similarly, a
proportional reduction of all inputs by α < 1 would result in a reduction of the output
by a factor greater than α. In both cases the resulting unit would exhibit reduced
productivity compared to the original unit. This implies that any unit of technically
optimal scale with the vector of inputs (x1 , . . . , xm ) exhibits the highest productivity
among all units whose vectors of inputs belong to the ray (αx1 , . . . , αxm ), where
α > 0.
A similar idea is utilized by Banker [3] who introduces the notion of the most
productive scale size (MPSS) in the nonparametric setting of DEA. Although
Banker defines MPSS for the VRS technology, his definition is equally applicable to
any production technology. As discussed by Førsund [26, 27], the notion of MPSS is
identical to the notion of technically optimal scale for smooth production functions
(1) that satisfy the Regular Ultra Passum Law. However, Banker’s definition of
MPSS also extends to arbitrary technologies with multiple outputs, including
various DEA technologies whose efficient frontiers are not smooth and for which the
conventional scale elasticity is undefined. It also applies to nonconvex technologies
in which the units characterized by the scale elasticity equal to 1 are not necessarily
at MPSS. (We discuss this in detail in section “Global Returns to Scale.”)

13 The application of the method of Färe et al. [23, 24] to arbitrary convex technologies assumes that
we have an operational statement of their cone and NIRS extensions. Obtaining such statements
is generally not a simple task. Podinovski and Bouzdine-Chameeva [65] develop a unifying
operational statement of such extensions for any polyhedral technology.

Let T be any production technology. Following the definition of Banker [3], a
unit (Xo , Yo ) ∈ T is at MPSS if, for all units in the form (αXo , βYo ) ∈ T, where
α > 0, we have β/α ≤ 1. This is equivalent to the condition β (α) /α ≤ 1, for all
α > 0, where β (α) is the output response function defined by formula (10).
Verifying whether the unit (Xo , Yo ) ∈ T is at MPSS requires solving the following
program14 :

max β/α, (24)

subject to

(αXo , βYo ) ∈ T ,
α, β > 0.

The unit (Xo , Yo ) is at MPSS if the optimal value of program (24) is equal to 1.
In this case it is attained at α = β = 1.
As shown by Banker [3], if T is a VRS technology, the supremum in program
(24) is attained and is equal to the output radial efficiency of the unit (Xo , Yo )
in the corresponding CRS technology of Charnes et al. [16], which is the cone
extension of the VRS technology. It turns out that a similar result is also true for
an arbitrary technology T. Define the reference technology T ∗ as the cone extension
of technology T:

$$
T^{*} = \left\{ (X, Y) \in \mathbb{R}^{m}_{+} \times \mathbb{R}^{s}_{+} \;\middle|\; \exists\, (\tilde{X}, \tilde{Y}) \in T,\ \delta \ge 0 : (X, Y) = \delta (\tilde{X}, \tilde{Y}) \right\}.
$$

As proved by Podinovski [63], the optimal value of program (24) is equal to the
optimal value of the following program utilizing the reference cone technology T ∗ :

$$
\max \; \beta,
$$

subject to

$$
(X_o, \beta Y_o) \in T^{*}, \qquad \beta \ge 0.
$$

14 For an arbitrary technology T, the maximum in program (24) may not be attained and needs to be
replaced by supremum. To avoid excessive technicalities, we assume that the maximum in program
(24) is attained. This assumption is true for most technologies used in DEA, including the VRS
technology.

This result shows that the optimal value of program (24) stated for technology T
is equal to the inverse of the output radial efficiency of the unit (Xo , Yo ) evaluated in
the reference cone technology T ∗ .
Summarizing the above results, the unit (Xo , Yo ) ∈ T is at MPSS if and only if it
is output radial efficient in the reference cone technology T ∗ generated by T.
If T is the conventional VRS technology, Banker [3] and Banker and Thrall [4]
show that the unit (Xo , Yo ) is at MPSS if and only if it exhibits constant returns to
scale according to Definition 2. If (Xo , Yo ) exhibits increasing or decreasing returns
to scale, it needs to increase or, respectively, reduce its size in order to achieve its
MPSS. Podinovski [63] proves that the same result is true not only in the VRS
technology but in any convex technology T. However, as shown in section “Global
Returns to Scale” below, the same relationship between MPSS and the three types
of returns to scale is generally invalid if technology T is not convex.
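
The MPSS test can be illustrated for a single input and a single output, where the ratio β/α in program (24) is simply the productivity of the rescaled unit relative to the productivity of (Xo , Yo ). The Python sketch below uses hypothetical data (only C = (5, 6) comes from Example 2) and invented function names. It relies on the fact that, in a convex hull with free disposal, the highest output-to-input ratio is attained at an observed unit; this shortcut is specific to the one-input, one-output case.

```python
from fractions import Fraction as F

# Hypothetical (input, output) data; only C = (5, 6) comes from Example 2.
UNITS = {"A": (F(3), F(1)), "B": (F(4), F(4)), "C": (F(5), F(6)),
         "D": (F(8), F(8)), "E": (F(11), F(9))}

def max_beta_over_alpha(name, units=UNITS):
    """Optimal value of program (24) for one input and one output:
    the best productivity in the technology divided by that of the unit."""
    xo, yo = units[name]
    best = max(y / x for x, y in units.values())
    return best / (yo / xo)

def is_mpss(name, units=UNITS):
    """A unit is at MPSS if the optimal value of program (24) equals 1."""
    return max_beta_over_alpha(name, units) == 1

for name in UNITS:
    print(name, max_beta_over_alpha(name), is_mpss(name))
```

With these data only unit C is at MPSS, which agrees with its classification as the only unit exhibiting constant returns to scale under Definition 2.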

Economies of Scale and Cost Functions

The notions of optimal scale, or MPSS in the framework of DEA, and returns to
scale considered in the previous sections are technical characteristics of production
frontiers. As shown, all these notions are evaluated as specific characteristics of the
output response function β (α) defined by formula (10). In this section, we consider
an alternative, and generally different, approach to defining the optimal scale and
economies of scale based on the cost minimization framework. Research in
this framework was initiated by Frisch [31], and a more recent exposition can be found
in Färe et al. [25] and Ray [70, 71].
Let $T \subset \mathbb{R}^{m}_{+} \times \mathbb{R}^{s}_{+}$ be a production technology with m inputs and s outputs. Also,
let $w \in \mathbb{R}^{m}_{+}$ be the vector of input prices which is assumed to be known. For any
output vector $Y \in \mathbb{R}^{s}_{+}$, define the cost function C(w, Y) as follows:

$$
C(w, Y) = \min \left\{ w^{\top} X \mid (X, Y) \in T \right\}. \tag{25}
$$

The cost function C(w, Y) is correctly defined for the output vectors $Y \in \mathbb{R}^{s}_{+}$
for which there exists an input vector $X \in \mathbb{R}^{m}_{+}$ such that the unit (X, Y) ∈ T.
In other words, the function C(w, Y) is defined for all output vectors Y for which
the input requirement set $V(Y) = \{X \in \mathbb{R}^{m}_{+} \mid (X, Y) \in T\}$ is not empty. Below, the
condition V(Y) ≠ ∅ is assumed implicitly without further mention.
Let T be the VRS technology given by its statement (14). For a fixed output
vector Y, C(w, Y) is found as the optimal value of the following linear program:

$$
C(w, Y) = \min \left\{ w^{\top} X \mid \bar{X}\lambda \le X,\ \bar{Y}\lambda \ge Y,\ \mathbf{1}^{\top}\lambda = 1,\ \lambda \in \mathbb{R}^{n}_{+} \right\}, \tag{26}
$$

where $\bar{X}$ and $\bar{Y}$ are the matrices whose columns are the observed input and
output vectors Xj and Yj , j = 1, . . . , n. The minimization of the objective function
w⊤X in program (26) is performed
with respect to variable vectors X and λ, while the output vector Y and the vector
of input prices w are fixed. The cost function C(w, Y) represents the lowest cost of
producing the output vector Y possible in the VRS technology T, given the input
prices w.
Based on the notion of cost function (25), we can now proceed to giving an
economic definition of scale efficiency. In the case of a single (scalar) output Y > 0,
the average cost of producing this output is defined as

$$
AC(w, Y) = \frac{C(w, Y)}{Y}. \tag{27}
$$

Economies of scale are present at the production level Y = Yo if the average cost
AC(w, Y) is a decreasing function of Y in some neighborhood of Yo . This implies
that AC(w, Y) falls to the right of Yo and increases if we move to the left of Yo .
Similarly, diseconomies of scale are present at Y = Yo if AC(w, Y) is an increasing
function in some neighborhood of Yo .
Let Y ∗ > 0 be an output level at which, for the given input prices, the average cost
function AC(w, Y) defined by Eq. (27) attains its minimum. Any such output level
Y ∗ is referred to as an efficient scale of production. The economic scale efficiency of
any actual output level Y = Yo is defined as the ratio of the minimum average cost
AC(w, Y ∗ ) evaluated at any efficient scale Y ∗ to the actual average cost AC(w, Yo ).
This definition is generalized to the case of multiple outputs Y ∈ Rs+ . In this
case, consider the ray of output vectors {tY| t > 0} and define the ray average cost as

$$
RAC(w, t, Y) = \frac{C(w, tY)}{t}. \tag{28}
$$

Scale economies (or diseconomies) at Y = Yo are present if the ray average
cost RAC(w, t, Yo ) is a decreasing (respectively, increasing) function of t in some
neighborhood of t = 1. For an output vector Yo , the efficient scale of production is
defined as Y ∗ = t∗ Yo , where t∗ minimizes RAC(w, t, Yo ) with respect to t. The
economic scale efficiency of output vector Yo is defined as the ratio of the minimum
ray average cost RAC(w, t, Yo ) attained at t = t∗ to the ray average cost at t = 1.
The latter is equal to the cost function C(w, Yo ) evaluated by Eq. (25).
In both the single and multiple-output cases, the economic scale efficiency is
interpretable as the utmost factor by which the ray average costs associated with the
production of the output (vector) Y can be reduced by rescaling production to the
efficient scale Y ∗ .
Following Ray [71], it is worth highlighting the difference between the notions
of (technically) optimal scale or MPSS and efficient scale of production arising in
the cost minimization framework. For a production unit (Xo , Yo ) ∈ T, the definition
of MPSS by program (24) seeks the highest ratio β/α achievable among the units
(αXo , βYo ) ∈ T that have the same input and output structures Xo and Yo (although
taken in different quantities α and β) as the actual unit (Xo , Yo ). This definition
is therefore independent of the vector of input prices w used by the definition
of efficient scale of production. Moreover, as pointed out in section “Technically
Optimal Scale,” in any convex technology T, the notion of MPSS is equivalent to
the notion of local CRS, that is, a DMU (Xo , Yo ) ∈ T is at MPSS if and only if it
exhibits CRS in the sense of Definition 2.
In contrast, the definition of efficient scale of production depends on the vector
of input prices w that is assumed known. Similar to MPSS, the idea is to minimize
the use of inputs per unit of quantity of the output vector Yo . (In program (28), this
quantity is represented by variable t.) However, in contrast with the notion of MPSS,
this minimization is understood in the sense of identifying the minimal total cost of
the inputs (which depends on their prices w) required for the production of outputs
Yo . For a unit (Xo , Yo ), the efficient scale of production of its output vector Yo is
defined by comparing this unit with all units in the form (X, tYo ), t > 0. In particular,
the input vector X is not assumed to be in the form αXo as in the definition of MPSS.
Assume that the unit (Xo , Yo ) minimizes the (ray) average cost. This involves two
assumptions. First, the vector Xo minimizes the cost of producing Yo , that is, we
have C(w, Yo ) = w⊤Xo . Second, the output vector Yo represents efficient scale of
production. In the case of a single output this means that Yo minimizes the average
cost (27). In the case of multiple outputs, this means that t∗ = 1 is optimal in
program (28). Ray [70] proves that any such unit (Xo , Yo ) is at MPSS. The opposite
is not true, that is, it is possible that a unit (Xo , Yo ) is at MPSS but does not minimize
the (ray) average cost.
Let T be a CRS technology. In this case, the cost function C(w, Y) is evaluated
by solving program (26) from which we remove the normalizing equality
1⊤λ = 1. As shown by Ray [70], the cost function C(w, Y) for the CRS technology
is homogeneous of degree 1 in vector Y, that is, for any scalar t ≥ 0, we have

C (w, tY ) = tC (w, Y ) .

Based on this property, Ray [70] proves that in the case of a single output, the
average cost function AC(w, Y) defined by Eq. (27) is constant with respect to Y > 0.
Similarly, in the case of vector output Y, the ray average cost function RAC(w, t, Y)
defined by Eq. (28) is constant with respect to t > 0.
In contrast, the (ray) average cost functions (27) and (28) for the VRS technology
are not constant and can generally be thought of as having a U-shape [31, 70]. It
further follows [70] that the minimum of this function is equal to the (constant)
value of the ray average cost function of the corresponding CRS technology. In
graphical terms, the straight horizontal line representing constant ray average costs
(28) in the CRS technology as a function of variable t > 0 is tangent to the U-shaped
curve representing ray average costs in the VRS technology, at the minimum point
t∗ of the latter curve.

Remark 7 The economic scale efficiency of the output level Y is defined as the
ratio of the minimum (ray) average costs (attained at the efficient scale) to the ray
average costs at the actual level Y. Ray [70] provides an alternative method for the
calculation of this ratio based on the cost functions C(w, Y) evaluated in the VRS and
CRS technologies. Namely, let $C^{*}_{VRS}$ and $C^{*}_{CRS}$ be the minimum costs of producing
output (vector) Y in the VRS and CRS technologies, respectively, that is, $C^{*}_{VRS}$ and
$C^{*}_{CRS}$ are the optimal values of linear program (26) and its CRS analogue. Then the
economic scale efficiency of the output level Y is equal to the ratio $C^{*}_{CRS} / C^{*}_{VRS}$.
The advantage of this approach is that it avoids solving a nonlinear program aimed
at minimizing the ray average cost (28) as required by the definition of efficient
scale.
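
The equivalence stated in Remark 7 can be checked numerically in a one-input, one-output setting. The Python sketch below is an illustration only: the data and the input price W are hypothetical, the function names are invented, and every listed unit is assumed to be a vertex of the VRS frontier, so that the minimum VRS cost of producing its output is the cost of its own input. Under that assumption the minimum average cost is attained at one of the listed units and can be found by enumeration.

```python
from fractions import Fraction as F

# Hypothetical frontier vertices (input, output) and input price.
UNITS = {"A": (F(3), F(1)), "B": (F(4), F(4)), "C": (F(5), F(6)),
         "D": (F(8), F(8)), "E": (F(11), F(9))}
W = F(2)

def cost_vrs(name, units=UNITS, w=W):
    """Minimum VRS cost of the unit's output (the unit is a frontier vertex)."""
    return w * units[name][0]

def cost_crs(name, units=UNITS, w=W):
    """Minimum CRS cost: the output is produced at the best productivity."""
    best = max(y / x for x, y in units.values())
    return w * units[name][1] / best

def scale_efficiency_by_costs(name):
    """Ratio of the CRS and VRS minimum costs, as in Remark 7."""
    return cost_crs(name) / cost_vrs(name)

def scale_efficiency_by_average_cost(name, units=UNITS):
    """Minimum average cost over the vertices divided by the actual one."""
    ac = {n: cost_vrs(n) / units[n][1] for n in units}
    return min(ac.values()) / ac[name]

for name in UNITS:
    print(name, scale_efficiency_by_costs(name),
          scale_efficiency_by_average_cost(name))
```

Both functions return the same value for every unit in this data set, for example 5/6 for unit D and 1 for unit C.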

Remark 8 We now turn to the question of practical identification of the efficient
scale of production in the VRS technology represented by an optimal solution $Y^*$
of program (27) or vector $Y^* = T^* Y_o$, where $T^*$ is optimal in program (28). Both
programs are nonlinear and finding their exact solution may be problematic. To
overcome this problem, Ray [71] shows that we can first solve the cost minimization
linear program (25) with $Y = Y_o$ in the CRS technology, that is, program (26) from
which the normalizing condition $\mathbf{1}^\top \lambda = 1$ is removed. Let vector $\lambda^*$ be optimal in
the resulting linear program (together with some input vector $X^*$). Then
$Y^* = Y_o / (\mathbf{1}^\top \lambda^*)$ represents an efficient scale of production for the output
vector $Y_o$.
In the general case, the optimal scale of production is not unique, and the
described approach will produce only one such scale $Y^*$. The full range of optimal
scales can be obtained as follows. Following Ray [71], find the maximum and minimum
of the term $\mathbf{1}^\top \lambda$ over the set of optimal solutions of the CRS analogue of
program (26). Identifying these two values requires solving two linear programs. Then
the smallest and largest efficient scales of production are found as
$Y^*_{\min} = Y_o / \max \mathbf{1}^\top \lambda$ and $Y^*_{\max} = Y_o / \min \mathbf{1}^\top \lambda$, respectively.
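
The procedure of Remark 8 can be sketched for a single input and a single output. In this special case the optimal solutions of the CRS cost minimization program place positive weight only on the most productive observed units and satisfy the output constraint as an equality, so the two linear programs have closed-form solutions. The function name and the data are hypothetical; strictly positive data are assumed.

```python
from fractions import Fraction


def efficient_scale_range(units, y_o):
    """Return (Y_min, Y_max), the smallest and largest efficient output scales
    for the output level y_o, where `units` is a list of (x, y) pairs."""
    productivity = [Fraction(y) / Fraction(x) for x, y in units]
    best = max(productivity)
    # Outputs of the units that attain the maximum productivity.
    best_outputs = [Fraction(y) for (x, y), p in zip(units, productivity) if p == best]

    # Optimal lambda satisfies sum(lambda_j * y_j) = y_o on the best units, so
    # the sum of its components is largest when all weight is on the smallest
    # such unit and smallest when all weight is on the largest one.
    sum_lambda_max = Fraction(y_o) / min(best_outputs)
    sum_lambda_min = Fraction(y_o) / max(best_outputs)

    return Fraction(y_o) / sum_lambda_max, Fraction(y_o) / sum_lambda_min


# Hypothetical data: units (3, 3) and (6, 6) share the highest productivity.
units = [(2, 1), (3, 3), (6, 6), (10, 8)]
y_min, y_max = efficient_scale_range(units, 4)
```

Here every output level between 3 and 6 is an efficient scale, whichever output level y_o is assessed.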

Partial Scale Characteristics for Smooth Production Frontiers

The conventional notion of scale elasticity given by formula (9) and the notion of
returns to scale assume that the whole vector of outputs Yo responds to proportional
marginal changes of the whole vector of inputs Xo . From a practical perspective,
we may also be interested in the elasticity of response of a partial vector of outputs
to marginal changes of a partial vector of inputs. For example, in the short run, we
may be able to change only some inputs such as labor and materials, but be unable
to change the other inputs such as capital or infrastructure.15 This raises a question
of short-run scale elasticity that characterizes the response of a subvector of outputs
to proportional marginal changes of a subvector of inputs.

15 For example, Johansen [38] considers a short-run industry production function that requires
the evaluation of the scale elasticity with respect to a subset of two inputs (labour and energy)
and treating the third input (capital) as fixed. Nelson [50] and Salvanes and Tjøtta [76] explore a
similar idea by excluding quasi-fixed inputs in applications to the power generation and distribution
sectors.

In a more general setting, we may consider the elasticity of response of an
arbitrary subset B of inputs and/or outputs to marginal proportional changes of an
arbitrary subset A of inputs and/or outputs, while keeping the remaining subset C of
inputs and outputs constant. Special cases of this general framework are situations
in which the subsets A and B include a single input or output. We require that the
subsets A, B, and C be mutually exclusive, the subsets A and B be nonempty, while
C may be an empty set.
To model the described scenario, we represent any unit (Xo, Yo) in the following
form:

$$(X_o, Y_o) = \left(X_o^A, X_o^B, X_o^C, Y_o^A, Y_o^B, Y_o^C\right). \tag{29}$$

Depending on the composition of the subsets A, B, and C, some subvectors in the
above general formula may be omitted. For example, if the subset A includes only
inputs, the subvector $Y_o^A$ is removed. If the set C is empty, then both subvectors $X_o^C$
and $Y_o^C$ are removed.
Assume that the unit (Xo, Yo) stated in the form (29) is located on the production
frontier (this requirement is stated more precisely in Assumption 2 below). We are
interested in the elasticity of proportional response of the joint subvector of inputs
and outputs $(X_o^B, Y_o^B)$ observed on the production frontier and caused by a marginal
proportional change of the joint subvector of inputs and outputs $(X_o^A, Y_o^A)$, provided
the joint subvector of inputs and outputs $(X_o^C, Y_o^C)$ is kept constant.
Similar to our development of the notion of scale elasticity, and following the
approach of Podinovski and Førsund [66] and Podinovski et al. [68], define the
partial proportional response function

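$$\tilde{\beta}(\alpha) = \max\left\{\beta \mid \left(\alpha X_o^A, \beta X_o^B, X_o^C, \alpha Y_o^A, \beta Y_o^B, Y_o^C\right) \in T,\ \beta \in \mathbb{R}\right\}. \tag{30}$$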

The above function is a generalization of the output response function (10) and
becomes the latter if $X_o^A = X_o$ and $Y_o^B = Y_o$. We also need to adjust Assumption 1,
by requiring that the unit (Xo, Yo) be efficient in the direction of the joint subvector
$(X_o^B, Y_o^B)$.

Assumption 2
β̃(1) = 1.

Note that if the set B contains only outputs and does not contain inputs, then
Assumption 2 means that the unit (Xo, Yo) is weakly efficient in the production of
the subvector $Y_o^B$ of outputs.
Assume that the function β̃ (α) is defined in some neighborhood of α = 1 and
is differentiable at this point. Define the elasticity of response of the joint subvector
$(X_o^B, Y_o^B)$ with respect to marginal changes of the joint subvector $(X_o^A, Y_o^A)$:

$$\varepsilon_{A,B}(X_o, Y_o) = \tilde{\beta}'(1).$$


Let the production frontier be defined by the transformation function (2).
Taking into account (30), we obtain the following implicit statement of the
function β̃ (α):

$$F\left(\alpha X_o^A, \tilde{\beta}(\alpha) X_o^B, X_o^C, \alpha Y_o^A, \tilde{\beta}(\alpha) Y_o^B, Y_o^C\right) = 0.$$

Assuming that the conditions of the implicit function theorem are satisfied and
differentiating with respect to α at α = 1, we have

$$\varepsilon_{A,B}(X_o, Y_o) = \tilde{\beta}'(1) = -\frac{\langle X_o^A, \nabla_{X^A} F(X_o, Y_o)\rangle + \langle Y_o^A, \nabla_{Y^A} F(X_o, Y_o)\rangle}{\langle X_o^B, \nabla_{X^B} F(X_o, Y_o)\rangle + \langle Y_o^B, \nabla_{Y^B} F(X_o, Y_o)\rangle}, \tag{31}$$

where $\nabla_{X^A} F(X_o, Y_o)$, $\nabla_{Y^A} F(X_o, Y_o)$, $\nabla_{X^B} F(X_o, Y_o)$ and $\nabla_{Y^B} F(X_o, Y_o)$ are the
partial gradients of the function F(X, Y) evaluated at the point (Xo, Yo) with respect
to the inputs and outputs included in the sets A and B, respectively.
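
The step from the implicit statement of β̃ (α) to formula (31) is an application of the chain rule. The following sketch spells it out under the assumptions already made in the text (differentiability and a nonzero denominator), with all gradients evaluated at (Xo, Yo) because β̃(1) = 1 by Assumption 2:

```latex
\begin{aligned}
0 &= \frac{d}{d\alpha}
     F\bigl(\alpha X_o^A,\ \tilde{\beta}(\alpha) X_o^B,\ X_o^C,\
            \alpha Y_o^A,\ \tilde{\beta}(\alpha) Y_o^B,\ Y_o^C\bigr)\Big|_{\alpha = 1} \\
  &= \langle X_o^A, \nabla_{X^A} F \rangle + \langle Y_o^A, \nabla_{Y^A} F \rangle
     + \tilde{\beta}'(1)\,\bigl( \langle X_o^B, \nabla_{X^B} F \rangle
     + \langle Y_o^B, \nabla_{Y^B} F \rangle \bigr).
\end{aligned}
```

Solving the last equality for β̃′(1) gives the ratio with the negative sign stated in (31).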
Depending on technology T (or transformation function F(X, Y)), the choice of
sets A and B and the unit (Xo , Yo ), the elasticity of response εA, B (Xo , Yo ) may be
positive, negative, or zero. For example, if each of the sets A and B consists of a
single output and the unit (Xo , Yo ) is strongly efficient, then εA, B (Xo , Yo ) < 0, that
is, a marginal increase of one output on the production frontier would result in a
decrease of the other output.

Remark 9 Consider a special case in which the set A consists only of inputs
(but not necessarily of all of them) and the set B consists only of outputs
(again, not necessarily of all of them). In this case the elasticity of response
εA, B (Xo, Yo) has the meaning of a partial scale elasticity, and formula (31) is
restated as follows:

$$\varepsilon_{A,B}(X_o, Y_o) = \tilde{\beta}'(1) = -\frac{\langle X_o^A, \nabla_{X^A} F(X_o, Y_o)\rangle}{\langle Y_o^B, \nabla_{Y^B} F(X_o, Y_o)\rangle}. \tag{32}$$

For a freely disposable technology for which we have inequalities (4),
formula (32) implies that εA, B (Xo, Yo) ≥ 0. This allows us to give the
following definition of partial returns to scale, which generalizes the standard
Definition 2.

Definition 3 Let the unit (Xo, Yo) ∈ T satisfy Assumption 2. Then (Xo, Yo)
exhibits

(i) Partial increasing returns to scale if εA, B (Xo, Yo) > 1
(ii) Partial decreasing returns to scale if εA, B (Xo, Yo) < 1
(iii) Partial constant returns to scale if εA, B (Xo, Yo) = 1
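
As an illustration of formula (32) and Definition 3, the following minimal sketch evaluates the partial scale elasticity numerically for a smooth frontier. The Cobb-Douglas frontier, the sign convention F(X, Y) = Y − f(X), the function names, and the data are all assumptions made for this example only; the gradients are approximated by central finite differences.

```python
def partial_scale_elasticity(F, x, y, A, B, h=1e-6):
    """Evaluate formula (32) at the unit (x, y) by central finite differences.

    F : transformation function with F(x, y) = 0 on the frontier
    A : indices of the inputs that are scaled proportionally
    B : indices of the outputs that respond proportionally
    """
    def d_dx(i):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        return (F(up, y) - F(down, y)) / (2 * h)

    def d_dy(r):
        up, down = list(y), list(y)
        up[r] += h
        down[r] -= h
        return (F(x, up) - F(x, down)) / (2 * h)

    numerator = sum(x[i] * d_dx(i) for i in A)
    denominator = sum(y[r] * d_dy(r) for r in B)
    return -numerator / denominator


def partial_returns_to_scale(eps, tol=1e-4):
    """Classify a unit according to Definition 3 (tol absorbs numerical error)."""
    if eps > 1 + tol:
        return "increasing"
    if eps < 1 - tol:
        return "decreasing"
    return "constant"


# Hypothetical frontier: output = labor^0.3 * capital^0.5.
def F(x, y):
    return y[0] - x[0] ** 0.3 * x[1] ** 0.5


x_o = [4.0, 9.0]                       # labor, capital
y_o = [x_o[0] ** 0.3 * x_o[1] ** 0.5]  # places the unit on the frontier

short_run = partial_scale_elasticity(F, x_o, y_o, A=[0], B=[0])    # labor only
long_run = partial_scale_elasticity(F, x_o, y_o, A=[0, 1], B=[0])  # all inputs
```

In this example the short-run elasticity with respect to labor alone is approximately the labor exponent 0.3, and the elasticity with respect to both inputs is approximately 0.8, so the unit exhibits partial decreasing returns to scale in both cases.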

Partial Elasticity of Response for Arbitrary Polyhedral Technologies

Following Podinovski et al. [68], below we show how the notion of elasticity of
response εA, B (Xo , Yo ) described in the previous section can be defined and computed
in a very large and practically important class of polyhedral technologies. The
notions of one-sided scale elasticity and returns-to-scale introduced for the standard
VRS technology become special cases in this more general development.
Technology T is called polyhedral if T is a polyhedral set in the input-output
space $\mathbb{R}^m \times \mathbb{R}^s$, that is, T is the intersection of a finite number of closed
half-spaces in $\mathbb{R}^m \times \mathbb{R}^s$ [73]. To simplify notation and formulations, we assume that all
inputs and outputs of all units in T are nonnegative and that technology T is strongly
disposable with respect to its inputs and outputs. These assumptions are not essential
and are not required in the general case considered by Podinovski et al. [68].
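We state the freely disposable polyhedral technology T as the set of all units
$(X, Y) \in \mathbb{R}^m_+ \times \mathbb{R}^s_+$ for which there exist a vector $\hat{\lambda} \in \mathbb{R}^q$ and vectors of input and
output slacks $S_X \in \mathbb{R}^m$ and $S_Y \in \mathbb{R}^s$ such that the following conditions are true:

$$\begin{aligned}
\hat{X}\hat{\lambda} + S_X &= X,\\
\hat{Y}\hat{\lambda} - S_Y &= Y,\\
\hat{U}\hat{\lambda} &= U_o,\\
\hat{\lambda},\ S_X,\ S_Y &\ge 0.
\end{aligned} \tag{33}$$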

In the statement (33), X̂ and Ŷ are, respectively, the m × q and s × q input and
output data matrices which allow a range of different interpretations. For example,
the columns of matrices X̂ and Ŷ may be the input and output vectors of the observed
units, as in the standard VRS and CRS models. These columns may also be used, for
example, to incorporate additional information in the model such as the information
about the scalability of selected inputs and outputs or production trade-offs.16
The third equality in the statement (33) is used to specify additional conditions
on the vector λ̂ and is optional. In this equality, the vector Uo is of some dimension
p and the matrix Û is of dimension p × q. An example of this equality is the
normalizing condition $\mathbf{1}^\top \lambda = 1$ used in the statement (14) of the VRS technology.
Another example is the two-stage VRS technology described in Sahoo et al. [75]
which includes two normalizing equalities, one for each stage. Often, as in the cone
extension of the hybrid returns-to-scale (HRS) technology of Podinovski [61], Uo is
a zero vector. In the CRS technology of Charnes et al. [16], the condition Û λ̂ = Uo
is not used and is omitted from the statement of technology.
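
To make the roles of the matrices in statement (33) concrete, the following minimal sketch checks whether a given vector of intensities certifies that a unit (X, Y) belongs to the technology. It is only a certificate check: deciding membership in general requires solving a linear feasibility problem, which is not attempted here. The function name and the data are hypothetical; matrices are stored as lists of rows, so the observed units are the columns.

```python
def satisfies_statement_33(X_hat, Y_hat, lam, X, Y, U_hat=None, U_o=None, tol=1e-9):
    """Return True if lam, together with the slacks it implies, satisfies (33)."""
    def matvec(M, v):
        return [sum(m * vi for m, vi in zip(row, v)) for row in M]

    if any(l < -tol for l in lam):
        return False
    # Implied slacks: S_X = X - X_hat @ lam and S_Y = Y_hat @ lam - Y.
    s_x = [xi - v for xi, v in zip(X, matvec(X_hat, lam))]
    s_y = [v - yi for yi, v in zip(Y, matvec(Y_hat, lam))]
    if any(s < -tol for s in s_x + s_y):
        return False
    # The optional third equality of (33).
    if U_hat is not None:
        if any(abs(a - b) > tol for a, b in zip(matvec(U_hat, lam), U_o)):
            return False
    return True


# Two hypothetical observed units with one input and one output.
X_hat = [[2, 4]]
Y_hat = [[1, 3]]
U_hat, U_o = [[1, 1]], [1]  # VRS normalizing condition: the intensities sum to 1

in_vrs = satisfies_statement_33(X_hat, Y_hat, [0.5, 0.5], [3], [2], U_hat, U_o)
scaled_in_crs = satisfies_statement_33(X_hat, Y_hat, [0, 2], [8], [6])
scaled_in_vrs = satisfies_statement_33(X_hat, Y_hat, [0, 2], [8], [6], U_hat, U_o)
```

Omitting the third equality gives the CRS case, in which the doubled second unit is certified, whereas the same intensities violate the VRS normalizing condition.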

16 An example of the former is the hybrid returns-to-scale (HRS) technology of Podinovski [58]
and its multiple HRS analogue [69]. Polyhedral technologies with production trade-offs are dual
to multiplier models with weight restrictions [62].

The class of polyhedral technologies includes most of the known convex
nonparametric technologies. This includes the standard CRS and VRS technologies
of Charnes et al. [16] and Banker et al. [5] and their extensions by production
trade-offs [60, 62], nonincreasing and nondecreasing returns-to-scale (NIRS and
NDRS) technologies [21, 79], hybrid returns-to-scale technologies that assume
scalability only of a subset of inputs and outputs [58, 61], technologies with multiple
component processes [17, 69], many of the known network technologies stated in
terms of the inputs and final outputs [40, 75], and convex technologies assuming
joint weak disposability of outputs or inputs [45, 48, 49].
Podinovski et al. [68] show that the notion of elasticity of response εA, B (Xo , Yo )
introduced in the previous section for smooth production frontiers is generalizable
to the one-sided elasticity of response in any polyhedral technology. Moreover,
such one-sided elasticities can be calculated by solving essentially the same linear
programs which need appropriate specification depending on the statement (33) of
technology T and the choice of the sets A and B. Below we describe this approach
in detail.17
Let T be any polyhedral technology stated in the form (33), and let β̃ (α) be
the partial proportional response function defined as in Eq. (30). Consider any unit
(Xo, Yo) ∈ T and the domain of the function β̃ (α). Clearly, α = 1 belongs to this
domain. Furthermore, the domain is a closed interval and β̃ (α) is a continuous,
concave and piecewise linear function on it [68].
Define the partial right-hand and left-hand elasticities of response of the
subvector $(X_o^B, Y_o^B)$ with respect to marginal changes of the subvector $(X_o^A, Y_o^A)$
as follows:

$$\begin{aligned}
\varepsilon^+_{A,B}(X_o, Y_o) &= \tilde{\beta}'_+(1),\\
\varepsilon^-_{A,B}(X_o, Y_o) &= \tilde{\beta}'_-(1).
\end{aligned} \tag{34}$$

Because the function β̃ (α) is concave on its domain, we have the following inequality:

$$\varepsilon^-_{A,B}(X_o, Y_o) \ge \varepsilon^+_{A,B}(X_o, Y_o). \tag{35}$$
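
The one-sided elasticities (34) and inequality (35) can be illustrated in the simplest polyhedral case, a VRS technology with a single input and a single output, where A is the input and B is the output. The following minimal sketch evaluates the response function by enumeration and takes exact one-sided difference quotients of this piecewise linear function; it does not use the linear programs of Podinovski et al. [68]. The function names and the data are hypothetical.

```python
from fractions import Fraction


def frontier_output(units, x):
    """Maximum output obtainable from input x in the single-input, single-output
    VRS technology spanned by `units`; None if x is below every observed input."""
    x = Fraction(x)
    candidates = [Fraction(yj) for xj, yj in units if xj <= x]
    for xj, yj in units:
        for xk, yk in units:
            if xj < x < xk:
                lam = (x - xj) / Fraction(xk - xj)  # weight on unit k
                candidates.append((1 - lam) * yj + lam * yk)
    return max(candidates) if candidates else None


def one_sided_elasticities(units, x_o, y_o):
    """Return (eps_plus, eps_minus) for the frontier unit (x_o, y_o)."""
    def beta(alpha):
        return frontier_output(units, alpha * x_o) / y_o

    if beta(Fraction(1)) != 1:
        raise ValueError("the unit is not on the frontier")

    # The response function is linear between consecutive observed input
    # levels, so a step of half the distance to the nearest one is exact.
    gaps = [abs(Fraction(xj) / Fraction(x_o) - 1) for xj, _ in units if xj != x_o]
    h = min(gaps) / 2 if gaps else Fraction(1, 2)

    eps_plus = (beta(1 + h) - 1) / h
    if any(xj < x_o for xj, _ in units):
        eps_minus = (1 - beta(1 - h)) / h
    else:
        eps_minus = float("inf")  # alpha = 1 is the left end of the domain
    return eps_plus, eps_minus


# Hypothetical observed units (input, output), all on the VRS frontier.
units = [(2, 1), (3, 3), (6, 6), (10, 8)]
results = {unit: one_sided_elasticities(units, *unit) for unit in units}
```

For each unit in this example the left-hand elasticity is at least as large as the right-hand one, as inequality (35) requires.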

Let us consider a linear programming approach to the evaluation of the partial
one-sided elasticities (34). The following theorem is proved in Podinovski et al.
[68]. In its statement, the vectors $v = (v_A, v_B, v_C)$ and $u = (u_A, u_B, u_C)$ represent the
weights corresponding to the inputs and outputs in the sets A, B, and C, respectively.
The vector ω is of dimension p, the same as the dimension of vector $U_o$.

Theorem 2 Let the unit (Xo, Yo) ∈ T satisfy Assumption 2. Then the following four
statements are true:

(i) If α = 1 is not the right extreme point of the domain of β̃ (α), then the right-hand
elasticity of response $\varepsilon^+_{A,B}(X_o, Y_o)$ exists, is finite and equal to the optimal value
of the linear program:

$$\begin{aligned}
\varepsilon^+_{A,B}(X_o, Y_o) = \tilde{\beta}'_+(1) = \min\ & v_A^\top X_o^A - u_A^\top Y_o^A, \\
\text{subject to}\ & -v_B^\top X_o^B + u_B^\top Y_o^B = 1,\\
& v^\top X_o - u^\top Y_o + \omega^\top U_o = 0,\\
& v^\top \hat{X} - u^\top \hat{Y} + \omega^\top \hat{U} \ge 0,\\
& u, v \ge 0,\ \omega \text{ sign free vector.}
\end{aligned} \tag{36}$$

(ii) If α = 1 is the right extreme point of the domain of β̃ (α), then program (36) has
an unbounded optimal value. (In this case we can formally define
$\varepsilon^+_{A,B}(X_o, Y_o) = \tilde{\beta}'_+(1) = -\infty$.)
(iii) If α = 1 is not the left extreme point of the domain of β̃ (α), then the left-hand
elasticity of response $\varepsilon^-_{A,B}(X_o, Y_o)$ exists, is finite and equal to the optimal value
of the linear program:

$$\begin{aligned}
\varepsilon^-_{A,B}(X_o, Y_o) = \tilde{\beta}'_-(1) = \max\ & v_A^\top X_o^A - u_A^\top Y_o^A, \\
\text{subject to}\ & -v_B^\top X_o^B + u_B^\top Y_o^B = 1,\\
& v^\top X_o - u^\top Y_o + \omega^\top U_o = 0,\\
& v^\top \hat{X} - u^\top \hat{Y} + \omega^\top \hat{U} \ge 0,\\
& u, v \ge 0,\ \omega \text{ sign free vector.}
\end{aligned} \tag{37}$$

(iv) If α = 1 is the left extreme point of the domain of β̃ (α), then program (37) has
an unbounded optimal value. (In this case we can formally define
$\varepsilon^-_{A,B}(X_o, Y_o) = \tilde{\beta}'_-(1) = +\infty$.)

17 The described approach is also applicable to various polyhedral technologies with undesirable
outputs, e.g., to the weakly disposable technology of Kuosmanen [45]. This is considered in detail
by Podinovski [64].

Theorem 2 is a generalization of Theorem 1 to the case of an arbitrary polyhedral
technology T and arbitrary subsets A and B of inputs and outputs. In this general
case the right-hand elasticity of response $\varepsilon^+_{A,B}(X_o, Y_o)$ may be undefined (unlike
in the standard case of scale elasticity) because a proportional increase of the joint
subvector $(X_o^A, Y_o^A)$ may lead outside the polyhedral technology T. This case is
described by part (ii) of Theorem 2. This case is impossible in the VRS technology
and is not included in Theorem 1.18

18 If technology T is the standard VRS technology and the sets A and B are the sets of all inputs and
outputs, respectively, then by a simple rearrangement program (36) becomes formula (20) from
Theorem 1, and program (37) becomes formula (21). We discuss this in greater detail in Remark
11.

Theorem 2 leads to a straightforward computational procedure for the evaluation
of the one-sided elasticities of response $\varepsilon^+_{A,B}(X_o, Y_o)$ and $\varepsilon^-_{A,B}(X_o, Y_o)$, which
is similar to the evaluation of the one-sided scale elasticities discussed in section
“Evaluation of Scale Elasticity and Returns to Scale in the VRS Technology.”
To be specific, consider the assessment of the left-hand elasticity of response
$\varepsilon^-_{A,B}(X_o, Y_o)$.
If program (37) has a finite optimal solution, then the left-hand elasticity of
response $\varepsilon^-_{A,B}(X_o, Y_o)$ exists and is equal to the optimal value of program (37).
If program (37) has an unbounded optimal value, we formally let
$\varepsilon^-_{A,B}(X_o, Y_o) = +\infty$. This means that the technology T contains no units whose
structure is $(\alpha X_o^A, \beta X_o^B, X_o^C, \alpha Y_o^A, \beta Y_o^B, Y_o^C)$, where α < 1, that is, it is
impossible to reduce the joint subvector of inputs and outputs $(X_o^A, Y_o^A)$ in technology T
while keeping the subvectors $X_o^C$ and $Y_o^C$ constant. If program (37) is infeasible, then
the unit (Xo, Yo) does not satisfy Assumption 2.19

Remark 10 Similar to Remark 9, let us consider a special case in which the set
A consists only of inputs and the set B consists only of outputs.20 In this case, the
objective functions of programs (36) and (37) are replaced by the single term $v_A^\top X_o^A$,
which is always nonnegative. Furthermore, if technology T is freely disposable with
respect to all inputs then, similar to the case of the VRS technology, the domain of
the function β̃ (α) is a closed unbounded interval $[\tilde{\alpha}, +\infty)$. This implies that
the case (ii) of Theorem 2 is impossible. Taking into account inequality (35) and
formally letting $\varepsilon^-_{A,B}(X_o, Y_o) = \tilde{\beta}'_-(1) = +\infty$ if α = 1 is the left extreme point of
this domain, we have

$$0 \le \varepsilon^+_{A,B}(X_o, Y_o) \le \varepsilon^-_{A,B}(X_o, Y_o) \le +\infty.$$

In the described special case, the one-sided elasticities of response $\varepsilon^+_{A,B}(X_o, Y_o)$
and $\varepsilon^-_{A,B}(X_o, Y_o)$ represent the marginal proportional change of the partial vector
of outputs $Y_o^B$ with respect to a marginal proportional change of the partial vector of
inputs $X_o^A$. They are therefore interpretable as the one-sided partial scale elasticities.
Similar to Definition 3, the evaluation of partial scale elasticities $\varepsilon^+_{A,B}(X_o, Y_o)$
and $\varepsilon^-_{A,B}(X_o, Y_o)$ leads to the following characterization of partial returns to scale.

19 This last statement formally follows from Theorem 1 and Proposition 5 proved in Podinovski
et al. [68].
20 Hadjicostas and Soteriou [36] consider the same case for the VRS technology only. Their
linear programs for the evaluation of one-sided scale elasticities are special cases of the programs
presented by Podinovski and Førsund [66] which in turn are generalized further by programs (36)
and (37).

Definition 4 Let the unit (Xo, Yo) ∈ T satisfy Assumption 2. Then (Xo, Yo) exhibits

(i) Partial increasing returns to scale if $1 < \varepsilon^+_{A,B}(X_o, Y_o) \le \varepsilon^-_{A,B}(X_o, Y_o)$
(ii) Partial decreasing returns to scale if $\varepsilon^+_{A,B}(X_o, Y_o) \le \varepsilon^-_{A,B}(X_o, Y_o) < 1$
(iii) Partial constant returns to scale if $\varepsilon^+_{A,B}(X_o, Y_o) \le 1 \le \varepsilon^-_{A,B}(X_o, Y_o)$
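
Definition 4 translates directly into a small classification routine. The sketch below assumes that the two one-sided elasticities have already been computed, for example as the optimal values of programs (36) and (37), with an unbounded left-hand elasticity passed as positive infinity. The function name is hypothetical.

```python
import math


def partial_returns_to_scale(eps_plus, eps_minus):
    """Classify a unit by Definition 4 from its one-sided partial scale
    elasticities (right-hand eps_plus, left-hand eps_minus)."""
    if eps_plus > eps_minus:
        raise ValueError("inequality (35) requires eps_minus >= eps_plus")
    if eps_plus > 1:
        return "increasing"
    if eps_minus < 1:
        return "decreasing"
    # Remaining case: eps_plus <= 1 <= eps_minus.
    return "constant"


examples = {
    "kink with slopes on both sides of 1": partial_returns_to_scale(0.5, 2.0),
    "left end of the domain": partial_returns_to_scale(4.0, math.inf),
    "flat part of the frontier": partial_returns_to_scale(0.0, 0.625),
}
```

Note that the constant type covers every unit at which the value 1 lies between the two one-sided elasticities, including the boundary cases in which one of them is equal to 1.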

Remark 11 Let the set A include all inputs and let the set B include all outputs.
Then the one-sided elasticities of response $\varepsilon^+_{A,B}(X_o, Y_o)$ and $\varepsilon^-_{A,B}(X_o, Y_o)$ are the
standard one-sided scale elasticities $\varepsilon^+(X_o, Y_o)$ and $\varepsilon^-(X_o, Y_o)$ evaluated at the unit
(Xo, Yo) on the frontier of the polyhedral technology T stated in the form (33).
Consider the set of all optimal solutions (u, v, ω) of the multiplier model assessing
the output radial efficiency of the unit (Xo, Yo) in technology T. As shown in
Podinovski et al. [68], the linear programs (36) and (37) are then equivalently
restated as follows, where the maximum and minimum are taken over this set:

$$\varepsilon^+(X_o, Y_o) = 1 - \max_{(u, v, \omega)} \omega^\top U_o, \tag{38}$$

$$\varepsilon^-(X_o, Y_o) = 1 - \min_{(u, v, \omega)} \omega^\top U_o. \tag{39}$$

It is now clear that Theorem 1 is a special case of Theorem 2. Indeed, if T is the
standard VRS technology, then the vector $U_o$ has a single component equal to 1, and
the equalities (38) and (39) are the same as (20) and (21), respectively.
The one-sided scale elasticities $\varepsilon^+(X_o, Y_o)$ and $\varepsilon^-(X_o, Y_o)$ can also be expressed
using the set of optimal solutions (u, v, ω) to the multiplier model measuring the
input radial efficiency of the unit (Xo, Yo), in an arbitrary polyhedral technology T.
Podinovski et al. [68] show that these are calculated as follows, with the minimum
and maximum taken over this set:

$$\varepsilon^+(X_o, Y_o) = 1 \Big/ \Bigl(1 - \min_{(u, v, \omega)} \omega^\top U_o\Bigr), \tag{40}$$

$$\varepsilon^-(X_o, Y_o) = 1 \Big/ \Bigl(1 - \max_{(u, v, \omega)} \omega^\top U_o\Bigr). \tag{41}$$

If T is the standard VRS technology, the formulae (40) and (41) become formulae
(23).

Remark 12 Although most of the DEA literature devoted to the subject of returns
to scale and scale elasticity has traditionally focused on the VRS technology, a
number of studies have developed specialized techniques for evaluating returns
to scale in other polyhedral technologies, often without exploring a link to the
underlying scale elasticity. For example, Tone [82] and Korhonen et al. [42] develop
methods for the evaluation of returns to scale in technologies expanded by weight
restrictions. Sahoo et al. [75] consider the notion of returns to scale in two-stage
network models. Atici and Podinovski [1] develop methods for the assessment of a
partial elasticity of response in the standard CRS technology.

The general approach developed by Podinovski et al. [68] and presented above
as Theorem 2 allows the evaluation of scale elasticity and its partial analogues,
and hence the characterization of returns to scale, in any polyhedral technology,
by solving essentially the same appropriately specified linear programs. This
effectively removes the need to develop bespoke methodologies for assessing
returns to scale in each individual technology. Podinovski et al. [68] illustrate
the application of this general approach in the VRS technology expanded by the
specification of weight restrictions. Podinovski [63] applies the same approach in a
two-stage network technology.

Global Returns to Scale

The conventional returns-to-scale characterization simultaneously plays two roles:
local and global. To be specific, let us illustrate this using the nonparametric VRS
technology of Banker et al. [5]. Assume that the unit (Xo , Yo ) exhibits increasing
returns to scale. From the local perspective, a small proportional increase of the
input vector of this unit (e.g., by 1%) would result in a larger proportional increase
(more than 1%) of the vector of outputs, assuming that the resulting unit remains on
the efficient frontier. In other words, a marginal increase of the size of the unit leads
to its improved productivity.
From the global perspective, if the unit (Xo , Yo ) exhibits increasing returns to
scale, then it is not at MPSS, and in order to achieve the latter, the unit (Xo , Yo )
needs to increase the scale of its operations. Therefore, the type of returns to
scale exhibited by the unit (Xo , Yo ) in the VRS technology is also indicative of the
direction of resizing in which the productivity would increase, both in the immediate
local sense and in the global sense as a direction to MPSS.
Podinovski [63] proves that the same dual role of the local returns-to-scale
characterization is true in any convex production technology. However, if technology T
is not convex, the conventional local characterization of returns to scale is generally
no longer indicative of the direction to MPSS.21 This observation motivates the
development of the concept of global returns to scale by Podinovski [56, 57].

21 Nonconvex technologies arise naturally if, for example, some inputs or outputs are represented
by ratio measures such as percentages, which is common in managerial applications of DEA.
The incorporation of ratio inputs or outputs in the model of technology generally invalidates the
assumption of convexity. Olesen et al. [53, 54] develop nonconvex variants of the VRS and CRS
technologies in which both volume and ratio types of inputs and outputs are native types of data.

The notion of global returns to scale applies to almost any production technology.
The only assumption that is required is that technology T does not allow free and
unlimited production of outputs.22 In particular, technology T is not assumed to be
convex or exhibit any particular type of input or output disposability.
The types of global returns to scale are indicative of the direction in which a
given unit should resize the scale of its operations in order to achieve its MPSS.
To introduce this characterization formally, let the unit (Xo , Yo ) be output radial
efficient, that is, satisfy Assumption 1.
Consider program (24) used for verifying if the unit (Xo, Yo) is at MPSS. Let
$(\hat{\alpha}, \hat{\beta})$ be any optimal solution to this program. Then the unit $(\hat{\alpha} X_o, \hat{\beta} Y_o)$ is at
MPSS. This unit is referred to as a scale reference unit (SRU) of (Xo, Yo).
Clearly, the unit (Xo, Yo) is at MPSS if and only if $\hat{\alpha} = \hat{\beta} = 1$ is an
optimal solution to program (24). Suppose that (Xo, Yo) is not at MPSS. Then two
possibilities arise: either $\hat{\alpha} > 1$ or $\hat{\alpha} < 1$. In the former case the SRU $(\hat{\alpha} X_o, \hat{\beta} Y_o)$
is larger than the unit (Xo, Yo) and in the latter it is smaller.23
Note that program (24) may have multiple optimal solutions, each defining a
different SRU of the unit (Xo , Yo ).24 It is possible that all such SRUs are larger than
(Xo , Yo ) or all are smaller than the latter. It is also theoretically possible that some
SRUs are larger and some are smaller than (Xo , Yo ).
The following definition of the four types of global returns to scale is given by
Podinovski [56, 57] and indicates the direction of resizing for the unit (Xo , Yo ) as it
changes the scale of its operations towards its MPSS (represented by its SRUs).

Definition 5 Let the unit (Xo, Yo) satisfy Assumption 1. Then (Xo, Yo) exhibits

(i) Global constant returns to scale (G-CRS) if (Xo, Yo) is at MPSS
(ii) Global increasing returns to scale (G-IRS) if all its SRUs are larger than (Xo, Yo)
(iii) Global decreasing returns to scale (G-DRS) if all its SRUs are smaller than
(Xo, Yo)
(iv) Global subconstant returns to scale (G-SCRS) if some of its SRUs are smaller
and some are larger than (Xo, Yo), but the unit (Xo, Yo) itself is not at MPSS

In a more general case (which is primarily of academic interest), it is possible
that the optimal value of program (24) is not attained and, therefore, the unit
(Xo, Yo) does not have an SRU. In this case, Podinovski [56] defines the notion
of an approximate SRU and restates Definition 5 in the latter terms.

22 Technology T allows free production if there exists a unit $(\hat{X}, \hat{Y}) \in T$ such that $\hat{X} = 0$ and
$\hat{Y} \ge 0$, $\hat{Y} \ne 0$. Technology T allows unlimited production if there exists an input vector $\hat{X}$ and
output vector $\hat{Y} \ne 0$ such that $\sup\{\beta \mid (\hat{X}, \beta \hat{Y}) \in T\} = +\infty$.
23 Podinovski [56] proves that $\hat{\alpha} > 1$ implies $\hat{\beta} > 1$ and, assuming strong efficiency of the unit
(Xo, Yo), $\hat{\alpha} < 1$ implies $\hat{\beta} < 1$.
24 In practice, it may be of interest to identify the smallest and largest SRUs of the unit (Xo, Yo).
Provided we have already evaluated the scale efficiency of the unit (Xo, Yo), identifying its smallest
and largest SRUs requires solving two further simple linear programs ([63], Footnote 16).

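[Figure: a nonconvex technology shown as a shaded area, with Input on the horizontal axis,
Output on the vertical axis, and frontier units labeled A to F]

Fig. 3 Global returns to scale in a nonconvex technology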

The following example illustrates the four types of global returns to scale.

Example 3 Consider the nonconvex technology shown as the shaded area in Fig. 3.
Consider the local characterization of returns to scale first. Similar to Example 1,
we establish that the units A, C, and E exhibit constant returns to scale, D exhibits
increasing returns to scale and B and F exhibit decreasing returns to scale. Note that
only the unit E is at MPSS. Therefore, this unit is the single SRU of the other five
units.
By Definition 5, unit E exhibits G-CRS, units A, B, C, and D exhibit G-IRS
(because all four are smaller than their SRU E) and F exhibits G-DRS (because it is
larger than its SRU E). Note that for the units A, B, and C these global types differ from
the types of local returns to scale.
In this example, no unit exhibits the fourth type of global subconstant returns to
scale. This type would be observed if the frontier point A were located higher, on
the ray OE. In this case both units A and E would then exhibit G-CRS, and units B,
C, and D would be classed as exhibiting G-SCRS.

Example 4 Figure 4 shows an example of free disposal hull technology of Deprins
et al. [19]. It is clear that the unit C is at MPSS and therefore exhibits G-CRS. The
units A and B are smaller than C and therefore exhibit G-IRS. The units D and E are
larger than C and exhibit G-DRS.

[Figure: a free disposal hull technology with observed units A to E, with Input on the
horizontal axis and Output on the vertical axis, both scaled from 0 to 11]

Fig. 4 Global returns to scale in the free disposal hull technology
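
The classification of Example 4 can be reproduced with a minimal sketch of Definition 5 for a free disposal hull technology with a single input and a single output. The sketch assumes that the SRUs of a unit are the points of the technology with the highest ratio of output to input, which in this setting are exactly the observed units attaining the highest ratio. The coordinates are hypothetical and are not read from Fig. 4; the function name is also an assumption.

```python
from fractions import Fraction


def global_rts_fdh(units, name):
    """Global returns-to-scale type (Definition 5) of the unit `name` in the
    single-input, single-output FDH technology; `units` maps names to (x, y)."""
    x_o, y_o = units[name]
    # Assumption 1: no observed unit yields more output from at most the same input.
    if any(x <= x_o and y > y_o for x, y in units.values()):
        raise ValueError("the unit is not output radial efficient")

    ratio = {k: Fraction(y) / Fraction(x) for k, (x, y) in units.items()}
    best = max(ratio.values())
    if ratio[name] == best:
        return "G-CRS"  # the unit itself is at MPSS

    sru_inputs = [units[k][0] for k in units if ratio[k] == best]
    larger = any(x > x_o for x in sru_inputs)
    smaller = any(x < x_o for x in sru_inputs)
    if larger and smaller:
        return "G-SCRS"
    return "G-IRS" if larger else "G-DRS"


# Hypothetical coordinates with the same ordering of units as in Example 4.
fdh_units = {"A": (2, 1), "B": (4, 3), "C": (5, 7), "D": (8, 8), "E": (10, 9)}
types = {name: global_rts_fdh(fdh_units, name) for name in fdh_units}

# A second hypothetical data set with two most productive units, P and R.
scrs_units = {"P": (4, 4), "Q": (6, 5), "R": (8, 8)}
type_of_q = global_rts_fdh(scrs_units, "Q")
```

In the second data set the unit Q lies between two SRUs of different sizes, which illustrates how the subconstant type can arise in a nonconvex technology.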

It is worth noting that the standard local characterization of returns to scale in the
free disposal hull technology is uninteresting. Indeed, the right-hand scale elasticity
at each of the observed units in Fig. 4 is equal to zero, and the left-hand scale
elasticity is undefined (or can formally be taken equal to +∞). Therefore, by the
standard Definition 2 of returns to scale, all observed units exhibit constant returns
to scale.25
The above examples show that in a nonconvex technology, the local and global
characterizations are generally different. The three types of local returns to scale
are based on the notion of scale elasticity and are therefore local characteristics
of the production frontier. The four types of global returns to scale are global
characteristics. They are based on the direction towards MPSS and are not directly
related to the scale elasticity.
It is also worth noting that the global type of subconstant returns to scale is
primarily of academic interest and should not normally be observed in practical
applications. This fact has recently been confirmed by computational experiments
by Cesaroni et al. [10]. However, without this type the global characterization of
production frontiers would be logically incomplete.26

25 We can also say that each observed unit exhibits decreasing returns to scale on the right
(corresponding to the right-hand scale elasticity equal to zero) and increasing returns to scale on
the left (corresponding to the left-hand scale elasticity equal to +∞).
26 Podinovski [63] notes that the G-SCRS type may occur naturally in the nonconvex free
replication hull technology developed by Ray and Hu [72].


Podinovski [63] proves that if technology T is convex, the local and global
characterizations coincide. More precisely, the unit (Xo , Yo ) exhibits global constant,
increasing, or decreasing returns to scale if and only if it exhibits local constant,
increasing, or decreasing returns to scale, respectively. This further implies that in
a convex technology, the global type of subconstant returns to scale is impossible.
Therefore, any difference between the local and global characterizations may be
observed only in a nonconvex technology, as was illustrated by Examples 3 and 4.
Podinovski [56] shows that the four types of global returns to scale can be
identified by a modification of the reference technology method developed by Färe
et al. [23, 24] and further modified by Kerstens and Vanden Eeckaut [41] and Briec
et al. [9]. To describe this method, consider any technology T that disallows free
and unlimited production. Let ET (Xo , Yo ), E NIRS (Xo , Yo ) and E NDRS (Xo , Yo ) denote
the output radial efficiencies of the unit (Xo , Yo ) in technology T and the NIRS and
NDRS technologies generated by T, respectively.27 The following theorem is proved
by Podinovski [56, 57].28

Theorem 3 Let the unit (Xo , Yo ) satisfy Assumption 1. Then it exhibits

(i) G-CRS if and only if E NDRS (Xo , Yo ) = E NIRS (Xo , Yo ) = ET (Xo , Yo ) = 1
(ii) G-DRS if and only if E NDRS (Xo , Yo ) < E NIRS (Xo , Yo ) ≤ ET (Xo , Yo ) = 1
(iii) G-IRS if and only if E NIRS (Xo , Yo ) < E NDRS (Xo , Yo ) ≤ ET (Xo , Yo ) = 1
(iv) G-SCRS if and only if E NIRS (Xo , Yo ) = E NDRS (Xo , Yo ) < ET (Xo , Yo ) = 1
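
Once the three output radial efficiencies have been evaluated, the conditions of Theorem 3 reduce to a few comparisons. The following minimal sketch assumes that the efficiencies are reported as numbers in the interval (0, 1] and uses a small tolerance for numerical error; the function name and the tolerance are assumptions of this example.

```python
def global_rts_from_efficiencies(e_t, e_nirs, e_ndrs, tol=1e-9):
    """Classify a unit by Theorem 3 from its output radial efficiencies in
    technology T (e_t) and in the NIRS and NDRS technologies generated by T."""
    if abs(e_t - 1) > tol:
        raise ValueError("Assumption 1 requires the unit to be efficient in T")
    if e_nirs > 1 + tol or e_ndrs > 1 + tol:
        raise ValueError("efficiencies are expected to lie in (0, 1]")

    nirs_efficient = abs(e_nirs - 1) <= tol
    ndrs_efficient = abs(e_ndrs - 1) <= tol
    if nirs_efficient and ndrs_efficient:
        return "G-CRS"
    if e_ndrs < e_nirs - tol:
        return "G-DRS"
    if e_nirs < e_ndrs - tol:
        return "G-IRS"
    # Equal efficiencies, both strictly below 1.
    return "G-SCRS"


classified = [
    global_rts_from_efficiencies(1.0, 1.0, 1.0),
    global_rts_from_efficiencies(1.0, 1.0, 0.8),
    global_rts_from_efficiencies(1.0, 0.7, 1.0),
    global_rts_from_efficiencies(1.0, 0.9, 0.9),
]
```

The four calls correspond, in order, to parts (i) to (iv) of the theorem.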

27 If T is a free disposal hull technology, the output radial efficiencies E NIRS (Xo , Yo ) and
E NDRS (Xo , Yo ) and their input analogues can be evaluated by solving mixed integer linear programs
[59].
28 Podinovski [56, 57] states and proves an analogue of Theorem 3 in the input orientation, that is,
by using the input radial efficiencies of the unit (Xo , Yo ) in the three technologies. Its proof is based
on the assumption of strong efficiency of the unit (Xo , Yo ). For a similar proof of the output-oriented
variant of this theorem presented in this chapter, the weaker Assumption 1 is sufficient.

Conclusion

In this chapter we presented an overview of the notions of scale elasticity and returns
to scale in different production technologies. The literature on this topic is vast
and has been developed over many decades. A particular difficulty in presenting
a consistent unifying development of this topic is the fact that it has historically
been explored in two different theoretical frameworks. The first and the older one
is the neoclassical economics framework originating in the pioneering works
of Frisch [31]. According to this approach, efficient production is represented
by a known transformation function which is subject to various economic and
technical assumptions that facilitate its analysis. The second approach started
with the seminal works on DEA by Banker et al. [5] and Banker [3], in which the
whole production technology is defined axiomatically, without making any explicit
assumptions about its efficient frontier.
We note that despite the differences between the two theoretical frameworks and
required assumptions, the fundamental and unifying concept in both approaches is
the proportional output response function $\beta(\alpha)$. For any unit $(X_o, Y_o)$, this function
represents the maximum proportion of the output vector $Y_o$ that can be produced in
the technology as a result of the input vector $X_o$ changing proportionally by a factor
$\alpha \ge 0$. It is important that the definition of the function $\beta(\alpha)$ is universal: it does
not depend on the theoretical approach used to model the technology and does not
rely on any additional assumptions about it.

In the neoclassical approach, the derivative of this function $\beta(\alpha)$ calculated at
$\alpha = 1$ is the scale elasticity evaluated at the unit $(X_o, Y_o)$. In DEA models, the
one-sided derivatives of the function $\beta(\alpha)$ are the corresponding one-sided scale
elasticities. The maximum ratio $\beta(\alpha)/\alpha$ over the ray $\alpha > 0$ corresponds to the
optimal technical scale in the neoclassical approach and MPSS in nonparametric
DEA models. Furthermore, the function $\beta(\alpha)$ can be redefined to represent the
proportional response of a subset of inputs and outputs with respect to proportional
changes of another subset of inputs and outputs, which leads to a spectrum of
different definitions of partial scale elasticity. Finally, in nonconvex technologies,
the ratio $\beta(\alpha)/\alpha$ may generally attain its maximum over the ray $\alpha > 0$ at multiple
values of the variable $\alpha$. An investigation of this leads to the global returns-to-scale
characterization of production frontiers.
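
To make the role of the proportional output response function concrete, the following minimal sketch evaluates it numerically for a single-input, single-output technology. The production function `f` is a hypothetical choice made only for illustration (it is not taken from the chapter), and the unit is assumed to lie on the frontier. For this particular `f` the scale elasticity at input level x equals 2/(1 + x²), so the values computed below can be checked by hand.

```python
def f(x):
    """Hypothetical single-input production function (an assumption)."""
    return x**2 / (1.0 + x**2)


def beta(alpha, x_o, y_o):
    """Largest proportion of y_o producible when the input is alpha * x_o."""
    return f(alpha * x_o) / y_o


x_o = 2.0
y_o = f(x_o)  # the unit is on the frontier, so beta(1) = 1

# Scale elasticity: derivative of beta at alpha = 1 (central difference).
h = 1e-6
elasticity = (beta(1 + h, x_o, y_o) - beta(1 - h, x_o, y_o)) / (2 * h)

# Optimal scale: alpha maximising beta(alpha) / alpha on a grid over (0, 3].
grid = [k / 1000 for k in range(1, 3001)]
alpha_star = max(grid, key=lambda a: beta(a, x_o, y_o) / a)

print(round(elasticity, 4), alpha_star)
```

Under these assumptions the elasticity at the unit is 0.4, which is below one, and the ratio is maximised at a scaling factor of 0.5, that is, at half the unit's current input level.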

Cross-References

Data Envelopment Analysis: A Nonparametric Method of Production Analysis
Distance Functions in Production Economics
Elasticities of Substitution
Nonconvexity in Production and Cost Functions: An Exploratory and Selective Review
Reminiscences of “Returns to Scale in Electricity Supply”

References
1. Atici KB, Podinovski VV (2012) Mixed partial elasticities in constant returns-to-scale
production technologies. Eur J Oper Res 220(1):262–269
2. Balk BM, Färe R, Karagiannis G (2015) On directional scale elasticities. J Prod Anal 43(1):99–
104
3. Banker RD (1984) Estimating most productive scale size using data envelopment analysis. Eur
J Oper Res 17(1):35–44
4. Banker RD, Thrall RM (1992) Estimation of returns to scale using data envelopment analysis.
Eur J Oper Res 62(1):74–84
5. Banker RD, Charnes A, Cooper WW (1984) Some models for estimating technical and scale
inefficiencies in data envelopment analysis. Manag Sci 30(9):1078–1092
6. Banker RD, Chang H, Cooper WW (1996) Equivalence and implementation of alternative
methods for determining returns to scale in data envelopment analysis. Eur J Oper Res
89(3):473–481
7. Banker RD, Cooper WW, Seiford LM, Thrall RM, Zhu J (2004) Returns to scale in different
DEA models. Eur J Oper Res 154(2):345–362
8. Banker RD, Cooper WW, Seiford LM, Zhu J (2011) Returns to scale in DEA. In: Cooper WW,
Seiford LM, Zhu J (eds) Handbook on data envelopment analysis, 2nd edn. Springer Science
+ Business Media, New York, pp 41–70
9. Briec W, Kerstens K, Leleu H, Vanden Eeckaut P (2000) Returns to scale on nonparametric
deterministic technologies: Simplifying goodness-of-fit methods using operations on technolo-
gies. J Prod Anal 14(3):267–274
10. Cesaroni G, Kerstens K, Van de Woestyne I (2017) Global and local scale characteristics in
convex and nonconvex nonparametric technologies: a first empirical exploration. Eur J Oper
Res 259(2):576–586
11. Chambers RG, Färe R (2008) A “calculus” for data envelopment analysis. J Prod Anal
30(3):169–175
12. Chambers RG, Quiggin J (2007) Dual approaches to the analysis of risk aversion. Economica
74(294):189–213
13. Chambers RG, Chung Y, Färe R (1996) Benefit and distance functions. J Econ Theory
70(2):407–419
14. Chambers RG, Chung Y, Färe R (1998) Profit, directional distance functions, and Nerlovian
efficiency. J Optim Theory Appl 98(2):351–364
15. Chambers RG, Färe R, Quiggin J (2004) Jointly radial and translation homothetic preferences:
generalized constant risk aversion. Economic Theory 23(3):689–699
16. Charnes A, Cooper WW, Rhodes E (1978) Measuring the efficiency of decision making units.
Eur J Oper Res 2(6):429–444
17. Cherchye L, De Rock B, Dierynck B, Roodhooft F, Sabbe J (2013) Opening the “black
box” of efficiency measurement: Input allocation in multioutput settings. Oper Res 61(5):
1148–1165
18. Cooper WW, Seiford LM, Tone K (2007) Data envelopment analysis. A comprehensive text
with models, applications, references and DEA-Solver software, 2nd edn. Springer Science +
Business Media, New York
19. Deprins D, Simar L, Tulkens H (1984) Measuring labor-efficiency in post offices. In:
Marchand M, Pestieau P, Tulkens H (eds) The performance of public enterprises: concepts
and measurements. North-Holland, Amsterdam, pp 243–267
20. Diewert WE, Nakajima T, Nakamura A, Nakamura E, Nakamura M (2011) Returns to
scale: concept, estimation and analysis of Japan’s turbulent 1964–88 economy. Can J Econ
44(2):451–485
21. Färe R, Grosskopf S (1985) A nonparametric cost approach to scale efficiency. Scand J Econ
87(4):594–604
22. Färe R, Grosskopf S (1994) Estimation of returns to scale using data envelopment analysis: a
comment. Eur J Oper Res 79(2):379–382
23. Färe R, Grosskopf S, Lovell CAK (1983) The structure of technical efficiency. Scand J Econ
85(2):181–190
24. Färe R, Grosskopf S, Lovell CAK (1985) The measurement of efficiency of production. Kluwer
Academic, Boston
25. Färe R, Grosskopf S, Lovell CAK (1994) Production frontiers. Cambridge University Press,
Cambridge
26. Førsund FR (1971) A note on the technically optimal scale in inhomogeneous production
functions. Swedish J Econ 73(2):225–240
27. Førsund FR (1996) On the calculation of scale elasticity in DEA models. J Prod Anal 7(2–
3):283–302
28. Førsund FR, Hjalmarsson L (2004) Calculating scale elasticity in DEA models. J Oper Res
Soc 55(10):1023–1038
29. Førsund FR, Hjalmarsson L (2004) Are all scales optimal in DEA? Theory and empirical
evidence. J Prod Anal 21(1):25–48
30. Førsund FR, Hjalmarsson L, Krivonozhko VE, Utkin OB (2007) Calculation of scale elasticities
in DEA models: direct and indirect approaches. J Prod Anal 28(1–2):45–56
31. Frisch R (1965) Theory of production. D. Reidel Publishing, Dordrecht
32. Fukuyama H (2000) Returns to scale and scale elasticity in data envelopment analysis. Eur J
Oper Res 125(1):93–112
33. Fukuyama H (2003) Scale characterizations in a DEA directional technology distance function
framework. Eur J Oper Res 144(1):108–127
34. Golany B, Yu G (1997) Estimating returns to scale in DEA. Eur J Oper Res 103(1):28–37
35. Hadjicostas P, Soteriou AC (2006) One-sided elasticities and technical efficiency in multi-
output production: a theoretical framework. Eur J Oper Res 168(2):425–449
36. Hadjicostas P, Soteriou AC (2010) Different orders of one-sided scale elasticities in multi-
output production. J Prod Anal 33(2):147–167
37. Hanoch G (1970) Homotheticity in joint production. J Econ Theory 2(4):423–426
38. Johansen L (1972) Production functions: an integration of micro and macro, short run and long
run aspects. North-Holland, Amsterdam/London
39. Johnson WE (1913) The pure theory of utility curves. Econ J 23(92):483–513
40. Kao C (2017) Network data envelopment analysis. Springer Science + Business Media, New
York
41. Kerstens K, Vanden Eeckaut P (1999) Estimating returns to scale using non-parametric deter-
ministic technologies: a new method based on goodness-of-fit. Eur J Oper Res 113(1):206–214
42. Korhonen PJ, Soleimani-damaneh M, Wallenius J (2011) Ratio-based RTS determination in
weight-restricted DEA models. Eur J Oper Res 215(2):431–438
43. Krivonozhko VE, Utkin OB, Volodin AV, Sablin IA, Patrin M (2004) Constructions of
economic functions and calculations of marginal rates in DEA using parametric optimization
methods. J Oper Res Soc 55(10):1049–1058
44. Krivonozhko VE, Førsund FR, Lychev AV (2014) Measurement of returns to scale using non-
radial DEA models. Eur J Oper Res 232(3):664–670
45. Kuosmanen T (2005) Weak disposability in nonparametric productivity analysis with undesir-
able outputs. Am J Agric Econ 87(4):1077–1082
46. Kurz HD (1986) Classical and early neoclassical economists on joint production. Metroeco-
nomica 38(1):1–37
47. Lloyd P (2012) The discovery of the isoquant. Hist Polit Econ 44(4):643–661
48. Mehdiloo M, Podinovski VV (2019) Selective strong and weak disposability in efficiency
analysis. Eur J Oper Res 276(3):1154–1169
49. Mehdiloozad M, Podinovski VV (2018) Nonparametric production technologies with weakly
disposable inputs. Eur J Oper Res 266(1):247–258
50. Nelson RA (1985) Returns to scale from variable and total cost functions: evidence from the
electric power industry. Econ Lett 18(2–3):271–276
51. Olesen OB, Petersen NC (2013) Imposing the Regular Ultra Passum Law in DEA models.
Omega 41(1):16–27
52. Olesen OB, Ruggiero J (2014) Maintaining the Regular Ultra Passum Law in data envelopment
analysis. Eur J Oper Res 235(3):798–809
53. Olesen OB, Petersen NC, Podinovski VV (2015) Efficiency analysis with ratio measures. Eur
J Oper Res 245(2):446–462
54. Olesen OB, Petersen NC, Podinovski VV (2017) Efficiency measures and computational
approaches for data envelopment analysis models with ratio inputs and outputs. Eur J Oper
Res 261(2):640–655
55. Panzar JC, Willig RD (1977) Economies of scale in multi-output production. Q J Econ
91(3):481–491
56. Podinovski VV (2004) Efficiency and global scale characteristics on the “no free lunch”
assumption only. J Prod Anal 22(3):227–257
57. Podinovski VV (2004) Local and global returns to scale in performance measurement. J Oper
Res Soc 55(2):170–178
58. Podinovski VV (2004) Bridging the gap between the constant and variable returns-to-scale
models: selective proportionality in data envelopment analysis. J Oper Res Soc 55(3):265–276
59. Podinovski VV (2004) On the linearisation of reference technologies for testing returns to scale
in FDH models. Eur J Oper Res 152(3):800–802
60. Podinovski VV (2007) Improving data envelopment analysis by the use of production trade-
offs. J Oper Res Soc 58(10):1261–1270
61. Podinovski VV (2009) Production technologies based on combined proportionality assump-
tions. J Prod Anal 32(1):21–26
62. Podinovski VV (2015) DEA models with production trade-offs and weight restrictions. In:
Zhu J (ed) Data envelopment analysis: a handbook of models and methods. Springer Science
+ Business Media, New York, pp 105–144
63. Podinovski VV (2017) Returns to scale in convex production technologies. Eur J Oper Res
258(3):970–982
64. Podinovski VV (2019) Direct estimation of marginal characteristics of nonparametric produc-
tion frontiers in the presence of undesirable outputs. Eur J Oper Res 279(1):258–276
65. Podinovski VV, Bouzdine-Chameeva T (2019) Cone extensions of polyhedral production
technologies. Eur J Oper Res 276(2):736–743
66. Podinovski VV, Førsund FR (2010) Differential characteristics of efficient frontiers in data
envelopment analysis. Oper Res 58(6):1743–1754
67. Podinovski VV, Førsund FR, Krivonozhko VE (2009) A simple derivation of scale elasticity in
data envelopment analysis. Eur J Oper Res 197(1):149–153
68. Podinovski VV, Chambers RG, Atici KB, Deineko ID (2016) Marginal values and returns to
scale for nonparametric production frontiers. Oper Res 64(1):236–250
69. Podinovski VV, Olesen OB, Sarrico SC (2018) Nonparametric production technologies with
multiple component processes. Oper Res 66(1):282–300
70. Ray SC (2004) Data envelopment analysis: theory and techniques for economics and operations
research. Cambridge University Press, Cambridge
71. Ray SC (2015) Nonparametric measures of scale economies and capacity utilization: an
application to U.S. manufacturing. Eur J Oper Res 245(2):602–611
72. Ray SC, Hu X (1997) On the technically efficient organization of an industry: a study of U.S.
airlines. J Prod Anal 8(1):5–18
73. Rockafellar RT (1970) Convex analysis. Princeton University Press, Princeton
74. Sahoo BK, Tone K (2015) Scale elasticity in nonparametric DEA approach. In: Zhu J (ed)
Data envelopment analysis: a handbook of models and methods. Springer Science + Business
Media, New York, pp 269–290
75. Sahoo BK, Zhu J, Tone K, Klemen BM (2014) Decomposing technical efficiency and scale
elasticity in two-stage network DEA. Eur J Oper Res 233(3):584–594
76. Salvanes KG, Tjøtta S (1994) Productivity differences in multiple output industries: an
empirical application to electricity distribution. J Prod Anal 5(1):23–43
77. Samuelson PA (1965) Foundations of economic analysis. Atheneum, New York
78. Samuelson PA (2009) Thünen: an economist ahead of his times. In: von Thünen J. The isolated
state in relation to agriculture and political economy. Palgrave Macmillan, Basingstoke, pp xii–
xiv
79. Seiford LM, Thrall RM (1990) Recent developments in DEA: the mathematical programming
approach to frontier analysis. J Econ 46(1–2):7–38
80. Seiford LM, Zhu J (1999) An investigation of returns to scale in data envelopment analysis.
Omega 27(1):1–11
81. Starrett DA (1977) Measuring returns to scale in the aggregate, and the scale effect of public
goods. Econometrica 45(6):1439–1455
82. Tone K (2001) On returns to scale under weight restrictions in data envelopment analysis. J
Prod Anal 16(1):31–47
83. Zelenyuk V (2013) A scale elasticity measure for directional distance function and its dual:
theory and DEA estimation. Eur J Oper Res 228(3):592–600
18 Nonconvexity in Production and Cost Functions: An Exploratory and Selective Review∗

Walter Briec, Kristiaan Kerstens, and Ignace Van de Woestyne

Contents
Introduction
Technologies and Distance Functions: Basic Definitions
Axiom of Convexity: Arguments
  Convexity and Duality
  Convexity and Time Divisibility
  Convexity and Managerial Practice: Some Skepticism Around
Nonparametric Nonconvex Technologies and Value Functions: Free Disposal Assumption and Minimum Extrapolation Principle
  Technologies: FDH and Its Extensions
  Economic Value Functions
  Efficiency Decompositions and the Testing of Convexity: A Priori Relations
  Empirical Evidence on FDH and Its Extensions: The Impact of Convexity
  FDH and Its Extensions: Further Methodological Refinements
Mitigating Convexity: A Selection
  Partial Convexity
  Regular Ultra Passum Law
  From Generalized Convexity to Nonconvexity

∗ We acknowledge the most helpful comments of R. Chambers and G. Cesaroni on an earlier
version. The usual disclaimer applies.
W. Briec
University of Perpignan, LAMPS, Perpignan, France
K. Kerstens (✉)
IESEG School of Management, CNRS, Université de Lille, UMR 9221-LEM, Lille, France
I. Van de Woestyne
Research Unit MEES, KU Leuven, Brussel, Belgium

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_15

  Semilattice Structures
  Preliminary Conclusions
Conclusions
References

Abstract

The purpose of this contribution is to provide an overview of developments in
nonconvex production technologies and economic value functions, with special
attention to the cost function. Apart from a somewhat selective review of
theoretical issues, the emphasis is on whether the assumption of convexity makes
a difference in practice. Anticipating our conclusion, we argue that traditional
convex empirical results differ on average rather markedly from alternative
nonconvex ones. This should make the discipline reconsider its traditional
relationship with convexity in both theoretical and applied production analysis.

Keywords

Nonparametric frontier · Convexity · Production · Cost function · Scale · Productivity

Introduction

This contribution focuses on deterministic nonparametric frontier technologies that
somehow relax the traditional hypothesis of convexity. Apart from developments
in general equilibrium theory with nonconvexities, we are unaware of any developments
in empirical production theory that allow one to document empirically the
potential impact of the traditional convexity axiom. This explains the narrow and
selective focus of this chapter.
The seminal article of Farrell [61] introduced a single output multiple inputs
deterministic nonparametric frontier technology, but did not establish a link with
linear programming. Boles [20] and Charnes et al. [39] are the first economics and
operations research articles, respectively, that have given the impetus that made the
nonparametric approach to production one of the great success stories in terms of
both methodological developments and empirical applications. While the axiom of
convexity is traditionally maintained in these nonparametric production models (see
Afriat [4], Banker et al. [13], Charnes et al. [39], Diewert and Parkan [50]) as well
as in the mainstream empirical economic literature on production analysis, Afriat
[4] was probably the first to mention a basic single output nonconvex technology
imposing the assumptions of strong input and output disposability. A multiple output
version has probably been proposed for the first time in Deprins et al. [49] and these
authors suggested the moniker Free Disposal Hull (FDH).
The work of Scarf [108–111] may well be considered an important predecessor
of FDH, since he studied activity analysis models based on integer data. For
instance, Figure 1 displayed in Scarf [108, p. 3638] resembles the FDH as we know
it. Without pretending to recount the history of the FDH technology in detail, it
suffices to mention that Lovell and Vanden Eeckaut [88, footnote 2] list another three
potential historical sources of the FDH concept.
This traditional stress on convex applied production analysis is to some extent
surprising, since it is theoretically well-known that important features of technology
fundamentally violate the convexity of the production possibility set (see Farrell
[62]). First, indivisibility implies that inputs and outputs are not necessarily perfectly
divisible. Furthermore, scaling down or up the entire production process in infinitesimal
fractions may not be feasible. Examples include the start-up and shutdown
costs in industries (see, e.g., O’Neill et al. [93] for electricity generation). Scarf
[112, 113] stresses the importance of indivisibility in selecting among technological
options. Second, economies of scale (e.g., modern information technology) and
economies of specialization (e.g., Romer [106] on nonrival inputs in the new growth
theory) violate the convexity of technology. Third, the existence of positive or
negative production externalities also leads to nonconvexities. Thus, the structure
of production in society is potentially full of nonconvexities.
It should be realized that the natural environment is full of nonconvexities
as well (see Dasgupta and Mäler [46] for an overview). Ecologists identify
pathways by which ecosystem constituents interact with one another and with the
external environment. A large body of empirical work reveals that those pathways
often involve transformation possibilities among environmental goods and services
that constitute nonconvex sets (e.g., see Boscolo and Vincent [21] on forestry
economics). In the words of Dasgupta and Mäler [46]: “The word ‘convexity’
is ubiquitous in economics, but absent from ecology.”
This book chapter is structured as follows. Section “Technologies and Distance
Functions: Basic Definitions” provides some basic definitions of the traditional
axioms underlying technologies and their representation via distance functions. Sec-
tion “Axiom of Convexity: Arguments” discusses in detail the existing justifications
for the axiom of convexity. Section “Nonparametric Nonconvex Technologies and
Value Functions: Free Disposal Assumption and Minimum Extrapolation Principle”
first focuses on nonconvex FDH and its extensions and the corresponding traditional
convex technologies, followed by a discussion of nonconvex economic
value functions as well as efficiency decompositions and tests of convexity that have
been proposed in the literature. Next, we offer an empirical perspective on the use
of FDH and its extensions on a variety of topics. Finally, we discuss some further
methodological refinements. Section “Mitigating Convexity: A Selection” offers a
very selective review of several attempts to mitigate the impact of the convexity
axiom while avoiding FDH and its extensions. Section “Conclusions” concludes
and outlines some future research issues.

Technologies and Distance Functions: Basic Definitions

A production technology describes all available possibilities to transform input
vectors $x = (x_1, \dots, x_m) \in \mathbb{R}^m_+$ into output vectors $y = (y_1, \dots, y_n) \in \mathbb{R}^n_+$.
The production possibility set or technology $T$ summarizes the set of all feasible
input and output vectors: $T = \{(x, y) \in \mathbb{R}^m_+ \times \mathbb{R}^n_+ : x \text{ can produce } y\}$. It may be
surprising that the main contributions in this literature continue to treat
the technology as a subset of $\mathbb{R}^m \times \mathbb{R}^n$. In section “Nonparametric
Nonconvex Technologies and Value Functions: Free Disposal Assumption and
Minimum Extrapolation Principle” we open a perspective on considering the
domain $\mathbb{N}^m \times \mathbb{N}^n$ instead.
Given our focus on input-oriented efficiency measurement later on, this technology
can be represented by the input correspondence $L : \mathbb{R}^n_+ \to 2^{\mathbb{R}^m_+}$, where $L(y)$ is
the set of all input vectors that yield at least the output vector $y$:

$$L(y) = \{x : (x, y) \in T\}. \tag{1}$$

The radial input efficiency measure is a map $E : \mathbb{R}^m_+ \times \mathbb{R}^n_+ \to \mathbb{R}_+ \cup \{\infty\}$ that
can be defined as:

$$E(x, y) = \min \{\lambda : \lambda \ge 0, \ \lambda x \in L(y)\}. \tag{2}$$

This radial efficiency measure, which is the inverse of the input distance function,
indicates the minimum contraction of an input vector by a scalar $\lambda$ while still
remaining in the input correspondence. Obviously, the resulting input combination
is located at the boundary of this input correspondence. For our purpose, the radial
input efficiency has two key properties (see, e.g., Hackman [68]). First, for feasible
input vectors it is smaller than or equal to unity ($0 \le E(x, y) \le 1$), whereby efficient
production on the isoquant of $L(y)$ is represented by unity and $1 - E(x, y)$ indicates
the amount of inefficiency. Second, it has a cost interpretation. Note that more general
efficiency measures exist in the literature: one example is the directional distance
function introduced by Chambers et al. [38] that is sometimes mentioned in this contribution.
Consider a set of $K$ observations $A = \{(x_1, y_1), \dots, (x_K, y_K)\} \subset \mathbb{R}^m_+ \times \mathbb{R}^n_+$.
In the following, let us denote $\mathcal{K} = \{1, \dots, K\}$. Nonparametric specifications of
technology can then be estimated by enveloping these $K$ observations in the set $A$
while maintaining some basic production axioms (see Hackman [68] or Ray [104]).
We are interested in defining minimum extrapolation technologies satisfying the
following assumptions:
(T1) $(0, y) \in T \Rightarrow y = 0$; $(0, 0) \in T$.
(T2) $T$ is closed.
(T3) For all $(x, y) \in T$ and all $(u, v) \in \mathbb{R}^m_+ \times \mathbb{R}^n_+$, if $(x, -y) \le (u, -v)$, then $(u, v) \in T$.
(T4) $T$ exhibits (i) constant returns to scale (CRS), $\delta T \subseteq T$, $\forall \delta > 0$; (ii)
nonincreasing returns to scale (NIRS), $\delta T \subseteq T$, $\forall \delta \in (0, 1)$; (iii) nondecreasing
returns to scale (NDRS), $\delta T \subseteq T$, $\forall \delta > 1$; (iv) variable returns to scale (VRS),
when (i), (ii), and (iii) do not hold.
(T5) $T$ is convex.
We briefly expand on the interpretation of these basic axioms. Axiom (T1) states
that there is no free lunch and that inaction is feasible. Axiom (T2) indicates that
the technology is closed. Axiom (T3) represents strong or free disposability in the
inputs and the outputs: inputs can be wasted without opportunity costs, and outputs
can be reduced at will. Axiom (T4) defines all four traditional returns to scale
hypotheses (i.e., constant, nonincreasing, nondecreasing, and variable (flexible)
returns to scale). Finally, the convexity assumption (T5) is traditional, but it is not
indispensable.
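
As an illustration of how the radial input efficiency (2) can be evaluated once convexity (T5) is dropped, the following minimal sketch assumes the simplest nonconvex case: a technology that contains exactly those points dominated by at least one observation under free disposability (T3), with no returns to scale assumption. This is the usual reading of the free disposal hull, which the chapter defines formally only later, so the formula used here is an assumption of the sketch. The function name, the data, and the requirement of strictly positive inputs for the evaluated unit are also illustrative assumptions.

```python
def fdh_input_efficiency(x_o, y_o, observations):
    """Radial input efficiency of (x_o, y_o) by enumeration over observations.

    observations: list of (x_k, y_k) pairs of input and output tuples.
    Assumes every component of x_o is strictly positive.
    Returns infinity if no observation produces at least y_o.
    """
    best = float("inf")
    for x_k, y_k in observations:
        # Only observations producing at least y_o can serve as benchmarks.
        if all(yk >= yo for yk, yo in zip(y_k, y_o)):
            # Smallest lambda such that lambda * x_o >= x_k componentwise.
            lam = max(xk / xo for xk, xo in zip(x_k, x_o))
            best = min(best, lam)
    return best


# Hypothetical data: three units with two inputs and one output.
data = [((2.0, 4.0), (1.0,)), ((4.0, 2.0), (1.0,)), ((5.0, 5.0), (1.0,))]

for x, y in data:
    print(x, fdh_input_efficiency(x, y, data))
```

No linear program is needed in this case: the efficiency score is obtained by comparing the evaluated unit with each observed unit in turn, and the benchmark is always an observed unit.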

Axiom of Convexity: Arguments

While the axiom of convexity (T5) is traditionally maintained in economics, we
develop three types of arguments to put it under scrutiny. Two arguments are related
to economic theory. One argument is more pragmatic: in empirical applications,
it turns out that managers often object to convexity. Sometimes the motivation to
maintain the convexity axiom is just analytical convenience (see, e.g., Hackman
[68, p. 2]). We think this is an argument that is valid only if one can show that
convex results provide a reasonably good approximation to a potentially nonconvex
economic reality.

Convexity and Duality

Often duality is invoked as a reason to maintain convexity. Since the main duality
relations in economics linking, e.g., production and cost approaches presume
some form of convexity, in applied empirical production analysis, researchers feel
compelled to maintain the same axioms. It is an open question whether this desire
for theoretical consistency is cogent.
We explore this viewpoint a little bit. The traditional duality results often fit in a
general equilibrium framework that maintains convexity in its simplest forms. But,
applied researchers tend to forget that general equilibrium theory has become less
attractive as a general normative framework since the Sonnenschein-Mantel-Debreu
results appeared in the early 1970s. Almost entirely negative conclusions appeared
about the uniqueness and stability of general equilibrium. While uniqueness only
occurs under restrictions void of economic realism, instability is the rule rather than
the exception since almost any continuous pattern of price movements may occur in
general equilibrium (see Ackerman [2]).
Furthermore, general equilibrium theory has been developed under more general
conditions of nonconvexity on technology and preferences (see Chavas and Briec
[41]). Realistically, this involves some process of nonlinear pricing. At the firm
level, one may therefore look for proper nonconvex specifications that do justice to
the nonconvexities in technology. This may imply recourse to more complex duality
relations, but this is simply the price to pay for the gain in realism. The FDH and
its extensions can be seen as one example that may fit into such a strategy (see, e.g.,
Agrell and Tind [5]).

Convexity and Time Divisibility

Several economic theorists interpret convexity of technology solely in terms of time
divisibility of technologies and see no other justification for its use.
Hackman [68, p. 39] puts things clearly when discussing the axiom of convexity
in his textbook:
It does have the following “time-divisibility” justification. Suppose input vectors x1 and x2
each achieve output level u > 0. Pick a λ ∈ [0, 1], and imagine operating 100λ% of the
time using x1 and 100(1 − λ)% of the time using x2 . At an aggregate level of detail, it is
not unreasonable to assume that the weighted average input vector λx1 + (1 − λ)x2 can also
achieve output level u.

Jacobsen [70, p. 759] remarks when discussing the quasi-concavity property of
the production function:
(A.5) implies a time divisibility in the production process.

Shephard [116, p. 15] states about the property of convexity of the input set:
Property P.8 is valid for time divisibly-operable technologies. For example, if x ∈ L(u), y ∈
L(u) and θ ∈ [0, 1], the input vector [(1 − θ)x + θy] may be interpreted as an operation of
the technology a fraction (1 − θ) of some unit time interval with the input vector x and a
fraction θ with y, assuring at least the output rate u.

The added footnote at the end of the last cited phrase reads: “Indeed the input vector
[(1 − θ )x + θy] may have no meaning unless so interpreted.”
This time divisibility argument basically ignores setup and lead times which
make a switch between the underlying activities costly in terms of time. This implies
that convexity becomes questionable when time indivisibilities compound all other
reasons for spatial nonconvexities (e.g., indivisibilities, increasing returns to scale,
economies of specialization, externalities, etc.).

Convexity and Managerial Practice: Some Skepticism Around

Decision-makers do not necessarily believe in convexity. This is evidenced in
remarks, scattered in the literature, on the problems encountered in communicating
the results of traditional efficiency measurement assuming convexity to
decision-makers. We provide some quotes reflecting managers' doubts about
the axiom of convexity.
In a study applying convex nonparametric frontier methods to measure bank
branch efficiency, Parkan [96, p. 242] notes:
The comparison of a branch which was declared relatively efficient, to a hypothetical
composite branch, did not allow for convincing practical arguments as to where the
inefficiencies lay.

Epstein and Henderson [53, p. 105] report similar experiences when discussing an
application to a large public-sector organization, in that managers simply question
the feasibility of the hypothetical projection points resulting from convex
nonparametric frontiers:
The algorithm for construction of the frontier was also discussed. The frontier segment
connecting A and B was considered unattainable. It was suggested that either (1) these
two DMUs should be viewed as abnormal and dropped from the model, (2) certain key
variables have been excluded, or (3) the assumption of linearity was inappropriate in this
organization. It appears that each of these factors was present to some degree.

In a very similar vein, Bouhnik et al. [22, p. 243] state:

Equally as important, it is our experience that managers often question the meaning of
convex combinations that involve what they perceive to be irrelevant DMUs.

All quotes seem to point to the fact that convexity may well, in practice, combine observations that are too far apart in terms of input mix, output mix, and/or scale of operations. While one hopes for a fairly uniformly dense, well-spaced cloud of points that avoids combining extreme points of production, such extreme combinations apparently occur and puzzle managers.
In a value efficiency analysis application (a way of incorporating preference
information into efficiency analysis), Halme et al. [69, p. 11] also opt for its use
with FDH because this matched the preferences of management:
The management was also more comfortable providing preference information over existing
units than virtual units, and found the results valuable.

Some researchers also concede that nonconvex analysis of production facilitates the practical use of efficiency analysis. For instance, Bogetoft et al. [19, p. 859] declare in this context:
In general, allowing the possibility set to be nonconvex facilitates the practical use of
productivity analysis in benchmarking. In particular, fictitious production possibilities,
generated as convex combinations of those actually observed, are usually less convincing
as benchmarks, or reference units, than actually observed production possibilities.

This experience is confirmed by Halme et al. [69, p. 10]:

During our long experience of DEA applications we repeatedly encountered the phe-
nomenon that DMs (Decision Maker) are reluctant to evaluate other than existing units.

Obviously, we understand that this is just casual evidence that transpires from the
empirical literature. But, it is useful to consider in addition to the other arguments
above.
Turning to a mathematical argument, notice that there exists a general condition under which a distance function (related to the efficiency measure (2)) can characterize a nonconvex technology. This condition is independent of the strong disposability assumption (T3) (though we use it in the remainder for computational reasons). The condition can be stated in terms of radian subsets of $\mathbb{R}^d$. A subset $R$ of $\mathbb{R}^d$ is a radian set if for all $\lambda \in [0, 1]$ and all $x \in R$, $\lambda x \in R$. Equivalently, such a subset is called a starshaped set (see Aliprantis and Border [6] for related concepts). A subset $S$ of $\mathbb{R}^d$ is co-radian if for all $\lambda \ge 1$ and all $x \in S$, $\lambda x \in S$. In functional analysis, a distance function is called a gauge function (analogous to the Minkowski functional for symmetrical sets). This is a function that recovers a notion of distance on a linear space. For every subset $D$ of $\mathbb{R}^d$, the gauge function $\psi_D$ is the map $\psi_D : \mathbb{R}^d \to [0, \infty]$ defined by:

$$\psi_D(x) = \sup\{\delta : \delta x \in D\}, \tag{3}$$

with the convention that $\psi_D(x) = 0$ if there is no $\lambda \ge 0$ such that $\lambda x \in D$.

Paralleling this definition, for every co-radian set, one can define a co-gauge as:

$$\eta_D(x) = \inf\{\delta : \delta x \in D\}. \tag{4}$$

This definition implies that for all closed radian sets $R$ and all closed co-radian sets $S$, respectively:

$$R = \{x \in \mathbb{R}^d : \psi_R(x) \ge 1\} \quad \text{and} \quad S = \{x \in \mathbb{R}^d : \eta_S(x) \le 1\}. \tag{5}$$

It follows that a production technology can be characterized from the efficiency measure (2) if and only if the input set $L(y)$ is co-radian for all $y \in \mathbb{R}^n_+$. Considering an output-oriented efficiency measure, such a characterization applies if and only if the output set is a radian (starshaped) set.
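As a small worked illustration (toy numbers of our own, not taken from the source), consider a two-input set generated by free disposal from the two points (1, 3) and (3, 1). This set is co-radian but not convex, and its co-gauge, which plays the role of a radial input efficiency score, still recovers the set as in (5):

```latex
\[
L = \{x \in \mathbb{R}^2_+ : x \ge (1,3)\} \cup \{x \in \mathbb{R}^2_+ : x \ge (3,1)\},
\qquad
\eta_L(x) = \min\left\{ \max\left\{\tfrac{1}{x_1}, \tfrac{3}{x_2}\right\},\
                        \max\left\{\tfrac{3}{x_1}, \tfrac{1}{x_2}\right\} \right\}
\quad (x_1, x_2 > 0).
\]
\[
\eta_L(4,4) = \tfrac{3}{4} \le 1 \ \Rightarrow\ (4,4) \in L,
\qquad
\eta_L(2,2) = \tfrac{3}{2} > 1 \ \Rightarrow\ (2,2) \notin L,
\]
% although (2,2) = (1/2)(1,3) + (1/2)(3,1) is a convex combination of two points of L.
```

The point (2, 2) is exactly the kind of hypothetical composite unit that a convex technology would admit and that the co-gauge of the nonconvex set rejects.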

Nonparametric Nonconvex Technologies and Value Functions: Free Disposal Assumption and Minimum Extrapolation Principle

Technologies: FDH and Its Extensions

While Deprins et al. [49] are commonly acknowledged as the developers of the
basic FDH model, Kerstens and Vanden Eeckaut [73] extended this basic model by
introducing the possibilities of constant, nonincreasing, and nondecreasing returns
to scale. This leads to the definition of three new technologies complementary to
the assumption of flexible or variable returns to scale embodied in the basic FDH
model.
Individual production possibility sets are based upon one production unit
(xk , yk ), the strong disposability assumption, and different maintained hypotheses
of returns to scale:

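$$N_{\Gamma}(x_k, y_k) = \left\{ (x, y) \in \mathbb{R}^m_+ \times \mathbb{R}^n_+ : x \ge \delta x_k,\ y \le \delta y_k,\ \delta \in \Gamma \right\}, \tag{6}$$

where $\Gamma \in \{\Gamma^{CRS}, \Gamma^{NDRS}, \Gamma^{NIRS}, \Gamma^{VRS}\}$, with:

(i) $\Gamma^{CRS} = \{\delta : \delta \ge 0\}$;
(ii) $\Gamma^{NDRS} = \{\delta : \delta \ge 1\}$;
(iii) $\Gamma^{NIRS} = \{\delta : 0 \le \delta \le 1\}$;
(iv) $\Gamma^{VRS} = \{\delta : \delta = 1\}$.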

Unions and convex unions of these individual production possibility sets yield the nonconvex technologies on the one hand and the traditional convex models on the other hand:

$$T_{NC,\Gamma} = \bigcup_{k \in K} N_{\Gamma}(x_k, y_k) \quad \text{and} \quad T_{C,\Gamma} = \mathrm{Co}\Big( \bigcup_{k \in K} N_{\Gamma}(x_k, y_k) \Big), \tag{7}$$

where Co stands for the convex hull operator.

In addition to this approach based on sets and their operations, an alternative and
useful formulation can be proposed making some analogy to the traditional convex
model. Let us introduce the following notation:

$$\Lambda^{C} = \Big\{ z : \sum_{k \in K} z_k = 1,\ z_k \ge 0 \Big\} \quad \text{and} \quad \Lambda^{NC} = \Big\{ z : \sum_{k \in K} z_k = 1,\ z_k \in \{0, 1\} \Big\}.$$

A unified algebraic representation of convex and nonconvex technologies under different returns to scale assumptions for a sample of K observations is found in Briec et al. [30]:

$$T_{\Lambda,\Gamma} = \Big\{ (x, y) \in \mathbb{R}^m_+ \times \mathbb{R}^n_+ : (x, -y) \ge \sum_{k \in K} \delta z_k (x_k, -y_k),\ z \in \Lambda,\ \delta \in \Gamma \Big\}, \tag{8}$$

where $\Lambda \in \{\Lambda^{NC}, \Lambda^{C}\}$. First, there is the activity vector (z) operating subject to a convexity (C) or nonconvexity (NC) constraint. Second, there is a scaling parameter (δ) allowing for a particular scaling of all K observations spanning the technology. This scaling parameter is smaller than or equal to 1 or larger than or equal to 1 under nonincreasing returns to scale (NIRS) and nondecreasing returns to scale (NDRS), respectively, fixed at unity under variable returns to scale (VRS), and free under constant returns to scale (CRS).
Briec et al. [30, Proposition 1] prove the following result:

Proposition 1 ([30, p. 166]). The nonconvex technologies $T_{NC,\Gamma}$ are the minimal extrapolation technologies containing the data $A = \{(x_k, y_k) : k \in K\} \subset \mathbb{R}^m_+ \times \mathbb{R}^n_+$ and satisfying the axioms T1 to T4.

The same statement, for the basic FDH model only, was developed earlier in Färe and Li [55]: FDH can be seen as the closest inner approximation of the true, strongly disposable but possibly nonconvex technology.
The advantages of formulation (8) are twofold. First, it offers a coherent formulation of all basic technologies under the four basic returns to scale assumptions (T4) and under both convexity (T5) and nonconvexity. For example, under VRS (i.e., setting δ = 1) and no convexity (i.e., constraint $\Lambda^{NC}$), one obtains the classical FDH technology:

$$T_{NC,VRS} = \Big\{ (x, y) \in \mathbb{R}^m_+ \times \mathbb{R}^n_+ : (x, -y) \ge \sum_{k \in K} z_k (x_k, -y_k),\ z \in \Lambda^{NC} \Big\}, \tag{9}$$

as formulated by Deprins et al. [49]. As another example, under VRS and convexity (i.e., constraint $\Lambda^{C}$), one retrieves the basic technology defined by Banker et al. [13] and Färe et al. [56]:¹

$$T_{C,VRS} = \Big\{ (x, y) \in \mathbb{R}^m_+ \times \mathbb{R}^n_+ : (x, -y) \ge \sum_{k \in K} z_k (x_k, -y_k),\ z \in \Lambda^{C} \Big\}. \tag{10}$$
k∈K

Second, its pedagogical advantage is that it neatly separates the role of the various
assumptions in the formulation of technology. For instance, the restrictions on the
scaling parameter (δ) relate directly to the basic definitions of the axioms on returns
to scale (T4). Furthermore, the sum constraint on the activity vector z (i.e., constraint $\Lambda^{C}$) relates to the convexity axiom (T5).
In this way, one can avoid confusing statements as found in the literature. For instance, the sum constraint on the activity vector z (i.e., constraint $\Lambda^{C}$) in the
envelopment or primal formulation (10) is often called a “convexity constraint”
under the VRS assumption, while the CRS technology has no such constraint in
the formulation of Charnes et al. [39] though it also maintains the convexity axiom
(see, e.g., Cook and Seiford [44, p. 2–3]).
Computing the radial input efficiency measure (2) relative to the convex technologies in (8) requires solving a nonlinear programming problem (NLP) for each evaluated observation. As shown in Briec and Kerstens [28, Lemma 2.1], this NLP can be transformed into the familiar linear programming (LP) problems known from the literature by substituting $w_k = \delta z_k$.
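A sketch of this substitution, written out from (8) under the assumption that measure (2) is the usual radial input contraction factor (the symbol θ and the explicit constraint layout are ours, not the source's):

```latex
\begin{align*}
E_{C,\Gamma}(x,y) = \min_{\theta,\,\delta,\,z}\ \theta
  \quad &\text{s.t.} \quad \sum_{k\in K} \delta z_k x_k \le \theta x,\quad
  \sum_{k\in K} \delta z_k y_k \ge y,\quad
  \sum_{k\in K} z_k = 1,\ z_k \ge 0,\ \delta \in \Gamma. \\
\intertext{With $w_k = \delta z_k$ one has $\sum_{k\in K} w_k = \delta$, so the products $\delta z_k$ disappear:}
E_{C,\Gamma}(x,y) = \min_{\theta,\,w}\ \theta
  \quad &\text{s.t.} \quad \sum_{k\in K} w_k x_k \le \theta x,\quad
  \sum_{k\in K} w_k y_k \ge y,\quad w_k \ge 0,\quad \sum_{k\in K} w_k \in \Gamma.
\end{align*}
% CRS: no restriction on the sum; NIRS: sum <= 1; NDRS: sum >= 1; VRS: sum = 1.
```

The last constraint is linear in each case, which is why the transformed problem is an ordinary LP: it is vacuous under CRS, an inequality under NIRS and NDRS, and the familiar sum-to-one equality under VRS.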
For the nonconvex technologies in (8), the radial input efficiency measure (2) requires solving a nonlinear binary mixed integer program (NLBMIP): see Briec et al. [30, p. 166]. To reduce the computational complexity of this NLBMIP problem, three distinct alternative solution methods have been proposed in the literature. First, Podinovski [99] reformulates all these nonconvex technologies as binary mixed integer programs (BMIP) using a big M technique. Second, starting from an existing LP model for the basic FDH model (9) (see Agrell and Tind [5]), Leleu [85] formulates equivalent LP problems for all these nonconvex technologies. Third, Briec et al. [30] develop for all nonconvex technologies an implicit enumeration strategy to obtain closed form solutions for the radial input efficiency measure (2):²

Proposition 2. Let $E_{NC,\Gamma}$ denote the radial input efficiency measure defined with respect to technologies $T_{NC,\Gamma}$. For all $(x, y) \in T_{NC,\Gamma}$ and $k = 1, \dots, K$, let us denote:

$$\alpha_k(x) = \max_{i \in I(x)} \frac{x_{ki}}{x_i} \quad \text{and} \quad \beta_k(y) = \max_{j \in J(y_k)} \frac{y_j}{y_{kj}},$$

where for all $(x, y) \in \mathbb{R}^m_+ \times \mathbb{R}^n_+$, $I(x) = \{i \in \{1, \dots, m\} : x_i > 0\}$ and $J(y) = \{j \in \{1, \dots, n\} : y_j > 0\}$. We have, for all $(x, y) \in T_{NC,\Gamma}$:

$$
E_{NC,\Gamma}(x, y) =
\begin{cases}
\min\limits_{(x_k, y_k) \in B_{\Gamma}(x, y)} \alpha_k(x) & \text{if } \Gamma = \Gamma^{VRS}; \\
\min\limits_{(x_k, y_k) \in B_{\Gamma}(x, y)} \beta_k(y) \cdot \alpha_k(x) & \text{if } \Gamma \in \{\Gamma^{CRS}, \Gamma^{NIRS}\}; \\
\min\limits_{(x_k, y_k) \in B_{\Gamma}(x, y)} \left\{ \max\{\beta_k(y), 1\} \cdot \alpha_k(x) \right\} & \text{if } \Gamma = \Gamma^{NDRS},
\end{cases}
$$

with $B_{\Gamma}(x, y) = \{(x_k, y_k) : \delta x_k \le x,\ \delta y_k \ge y,\ \delta \in \Gamma\}$.

¹ Note that the convex VRS and NDRS technologies do not satisfy inaction.
² Note that the use of enumeration for the basic nonconvex FDH production model (9) has been around in the literature for quite a while: examples include [49, 63, 122], among others.

Briec and Kerstens [28, p. 148–149] refine this analysis and also offer closed form solutions for the output-oriented and graph-oriented efficiency measures. Furthermore, these authors indicate that the computational complexity of enumeration is advantageous compared to the BMIP or LP approaches. Indeed, the maximum (minimum) of a vector with n components can be calculated in the worst case in O(n) arithmetic operations. Thus, to enumerate over a data set of K firms, the number of arithmetic operations is about O(LK(m + n)), where m and n represent the number of input and output dimensions and L is a measure of data storage for a given precision. A standard linear program has an $O(LK^3)$ polynomial time complexity linked to the number of observed firms K. Since K > m + n in general, the time complexity of enumeration is thus better than that of LP. In fact, Kerstens and Van de Woestyne [75] empirically document that implicit enumeration is by far the fastest solution strategy, followed by BMIP and finally LP.³ Kerstens and Van de Woestyne [76] provide closed form solutions for the directional distance functions under alternative returns to scale assumptions.
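The enumeration behind Proposition 2 can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the authors' implementation: the function names, the string labels for the returns to scale cases, and the toy data are hypothetical, data are assumed nonnegative, and membership in the set B is checked by intersecting the admissible range of δ with the range implied by the data.

```python
import math

# Admissible range [lo, hi] of the scaling parameter delta for each Gamma.
GAMMA = {
    "CRS": (0.0, math.inf),
    "NDRS": (1.0, math.inf),
    "NIRS": (0.0, 1.0),
    "VRS": (1.0, 1.0),
}


def alpha(x, xk):
    """alpha_k(x): largest ratio x_ki / x_i over the positive inputs of x."""
    return max((xki / xi for xi, xki in zip(x, xk) if xi > 0), default=0.0)


def beta(y, yk):
    """beta_k(y): largest ratio y_j / y_kj over the positive outputs of y_k."""
    return max((yj / ykj for yj, ykj in zip(y, yk) if ykj > 0), default=0.0)


def in_B(x, y, xk, yk, rts):
    """True if some delta in Gamma gives delta*x_k <= x and delta*y_k >= y."""
    if any(yj > 0 and ykj == 0 for yj, ykj in zip(y, yk)):
        return False  # a zero output of unit k cannot be scaled up to y_j > 0
    lo = beta(y, yk)  # smallest delta with delta*y_k >= y
    hi = min((xi / xki for xi, xki in zip(x, xk) if xki > 0), default=math.inf)
    g_lo, g_hi = GAMMA[rts]
    return max(lo, g_lo) <= min(hi, g_hi)


def radial_input_efficiency_nc(x, y, data, rts="VRS"):
    """Closed-form score obtained by one pass over the K observations."""
    scores = []
    for xk, yk in data:
        if not in_B(x, y, xk, yk, rts):
            continue
        a, b = alpha(x, xk), beta(y, yk)
        if rts == "VRS":
            scores.append(a)
        elif rts in ("CRS", "NIRS"):
            scores.append(b * a)
        else:  # NDRS
            scores.append(max(b, 1.0) * a)
    if not scores:
        raise ValueError("(x, y) does not belong to the nonconvex technology")
    return min(scores)


# Toy data: three units with two inputs and one output (hypothetical numbers).
data = [([2.0, 4.0], [1.0]), ([4.0, 2.0], [1.0]), ([8.0, 8.0], [4.0])]
scores = {rts: radial_input_efficiency_nc([4.0, 4.0], [1.0], data, rts)
          for rts in GAMMA}
print(scores)
```

For the evaluated unit with inputs (4, 4) and output 1, the score is 1 under VRS and NDRS and 0.5 under CRS and NIRS, because only the scaled-down third unit dominates it. Each score needs a single loop over the observations with O(m + n) work per unit, in line with the complexity argument above.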
Note that this nonconvex framework can also handle the discrete case by considering that the technology is a subset of $\mathbb{N}^m \times \mathbb{N}^n$ (instead of $\mathbb{R}^m \times \mathbb{R}^n$). However, the radial measure (2) involves an assumption of divisibility and is therefore unsuitable. In line with Andriamasy et al. [9], one can overcome this problem by using the directional distance function (see Chambers et al. [38]) and selecting a direction that is the unit vector of $\mathbb{N}^m \times \mathbb{N}^n$.
³ This poor performance is related to the huge size of the LP formulation in Leleu [85].

In principle, the appropriateness of the convexity axiom can be tested for any comparison between convex and nonconvex technologies imposing a similar returns to scale hypothesis. We can define tests for the convexity of technology as a simple ratio between the convex and nonconvex input efficiency measures. Thus, the ratio:

$$CT_{\Gamma}(x, y) = E_{C,\Gamma}(x, y) / E_{NC,\Gamma}(x, y) \tag{11}$$

determines a nonparametric local goodness-of-fit test for the convexity of technologies conditional on the scaling law (see Briec et al. [30, p. 178]).
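As a worked illustration with toy numbers of our own (not from the source), take a reference set of two observed units with inputs (2, 4) and (4, 2), each producing one unit of output, and evaluate under VRS a unit that uses (4, 4) to produce the same output:

```latex
\begin{align*}
E_{NC,VRS} &= \min\left\{ \max\{\tfrac{2}{4}, \tfrac{4}{4}\},\ \max\{\tfrac{4}{4}, \tfrac{2}{4}\} \right\} = 1, \\
E_{C,VRS} &= \min\{\theta : z_1 (2,4) + z_2 (4,2) \le \theta (4,4),\ z_1 + z_2 = 1,\ z \ge 0\} = \tfrac{3}{4}
  \quad \text{(attained at } z_1 = z_2 = \tfrac{1}{2}\text{)}, \\
CT_{VRS} &= E_{C,VRS} / E_{NC,VRS} = 0.75 .
\end{align*}
```

The convex benchmark (3, 3) is a mixture of two units that no firm in the sample actually operates, which is why the convex score falls below the nonconvex one. A ratio below one signals that the convexity assumption is binding at this point.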

Economic Value Functions

The nonconvex production models have been complemented by nonconvex cost functions with corresponding specific returns to scale assumptions in Briec et al. [30]. Turning to a dual representation of technology, recall that the cost function $C : \mathbb{R}^n_+ \times \mathbb{R}^m_+ \to \mathbb{R}_+ \cup \{\infty\}$ defines the minimum costs to produce an output vector $y$ given a vector of semi-positive input prices ($w \in \mathbb{R}^m_+$):

$$C(y, w) = \inf\{w \cdot x : x \in L(y)\}. \tag{12}$$

Briec et al. [30, p. 175–176] establish a local duality result between the nonconvex cost functions and the nonconvex FDH and its extensions.
The computation of the cost function (12) relative to convex nonparametric technologies $T_{C,\Gamma}$ again requires an NLP to be solved for each evaluated observation. As above, this NLP can be transformed into the familiar LP problem that is known from the literature (e.g., Hackman [68]).
The cost function (12) relative to the nonconvex technology $T_{NC,\Gamma}$ involves computing an NLBMIP as mentioned above. Again, to reduce the computational complexity of this NLBMIP problem, three distinct solution methods can be pursued. First, following the Podinovski [99] approach, one can transform these nonconvex cost functions into BMIPs. Second, Leleu [85] formulates equivalent LP problems for all these nonconvex cost functions. Third, Briec et al. [30] develop for all nonconvex cost functions an implicit enumeration strategy yielding closed form solutions. For all $y \in \mathbb{R}^n_+$, let us denote:

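$$V_{\Gamma}(y, x_k, y_k) = \left\{ x \in \mathbb{R}^m_+ : (x, y) \in N_{\Gamma}(x_k, y_k) \right\}. \tag{13}$$

By construction, we have:

$$C_{NC,\Gamma}(y, w) = \min\Big\{ w \cdot x : x \in \bigcup_{k \in K} V_{\Gamma}(y, x_k, y_k) \Big\}. \tag{14}$$

By defining $C^{(k)}_{NC,\Gamma}(y, w) = \min\{w \cdot x : x \in V_{\Gamma}(y, x_k, y_k)\}$, we obtain:

$$C_{NC,\Gamma}(y, w) = \min_{k \in K} C^{(k)}_{NC,\Gamma}(y, w). \tag{15}$$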
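
Interestingly, the above properties can be derived from the standard background of convex analysis (see Clarke [43] and Rockafellar and Wets [105] for references).⁴ Given a closed subset $D$ of $\mathbb{R}^d$, let $\delta_D : \mathbb{R}^d \to \mathbb{R} \cup \{-\infty\}$ be the indicator function defined as:

$$
\delta_D(z) =
\begin{cases}
0 & \text{if } z \in D \\
-\infty & \text{if } z \notin D
\end{cases}
\tag{16}
$$

One can then show that:

$$\inf\{w \cdot z : z \in D\} = \inf\{w \cdot z - \delta_D(z) : z \in \mathbb{R}^d\} = \delta_D^{\star}(w), \tag{17}$$

where $\delta_D^{\star}(w)$ stands for the conjugate of $\delta_D$. Suppose moreover that for all $k \in K$, $D_k$ is a closed subset of $\mathbb{R}^d$ and that $D = \bigcup_{k \in K} D_k$. Then:

\begin{align}
\delta_D^{\star}(w) = \delta_{\bigcup_{k \in K} D_k}^{\star}(w) &= \inf\{w \cdot z - \delta_{\bigcup_{k \in K} D_k}(z) : z \in \mathbb{R}^d\} \tag{18} \\
&= \inf\{w \cdot z - \max_{k \in K} \delta_{D_k}(z) : z \in \mathbb{R}^d\} = \inf\Big\{ \min_{k \in K} \big[ w \cdot z - \delta_{D_k}(z) \big] : z \in \mathbb{R}^d \Big\} \tag{19} \\
&= \min_{k \in K} \inf\{w \cdot z - \delta_{D_k}(z) : z \in \mathbb{R}^d\} = \min_{k \in K} \delta_{D_k}^{\star}(w). \tag{20}
\end{align}

Along this line we obtain for all $k \in K$:

$$C^{(k)}_{NC,\Gamma}(y, w) = \delta_{V_{\Gamma}(y, x_k, y_k)}^{\star}(w) \quad \text{and} \quad C_{NC,\Gamma}(y, w) = \min_{k \in K} \delta_{V_{\Gamma}(y, x_k, y_k)}^{\star}(w). \tag{21}$$

⁴ This point was suggested to the authors by R. Chambers.

Notice that a similar method applies for efficiency analysis. The next result is then derived.

Proposition 3. Let $C_{NC,\Gamma}(y, w)$ denote the cost function with respect to technologies $T_{NC,\Gamma}$. For all $(y, w) \in \mathbb{R}^n_+ \times \mathbb{R}^m_+$, we have:

$$
C_{NC,\Gamma}(y, w) =
\begin{cases}
\min\limits_{k \in K} \{w \cdot x_k : y_k \ge y\} & \text{if } \Gamma = \Gamma^{VRS}; \\
\min\limits_{k \in K} \{\beta_k(y)\, w \cdot x_k\} & \text{if } \Gamma = \Gamma^{CRS}; \\
\min\limits_{\{k : \beta_k(y) \le 1\}} \{\beta_k(y)\, w \cdot x_k\} & \text{if } \Gamma = \Gamma^{NIRS}; \\
\min\limits_{k \in K} \max\{\beta_k(y), 1\}\, w \cdot x_k & \text{if } \Gamma = \Gamma^{NDRS};
\end{cases}
$$

where $J(y) = \{j : y_j > 0\}$ and $\beta_k(y) = \max_{j \in J(y_k)} y_j / y_{kj}$ are defined as in Proposition 2.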
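The closed forms of Proposition 3 translate directly into a single pass over the observations. The following Python sketch is illustrative only: the function name `cost_nc`, the string labels for the returns to scale cases, and the toy data are our own assumptions, data are assumed nonnegative, and an infeasible output vector is reported as an infinite cost, in line with the range of the cost function in (12).

```python
import math


def beta(y, yk):
    """beta_k(y): largest ratio y_j / y_kj over the positive outputs of y_k."""
    return max((yj / ykj for yj, ykj in zip(y, yk) if ykj > 0), default=0.0)


def cost_nc(y, w, data, rts="VRS"):
    """Nonconvex minimum cost of producing y at input prices w (enumeration)."""
    best = math.inf
    for xk, yk in data:
        if any(yj > 0 and ykj == 0 for yj, ykj in zip(y, yk)):
            continue  # unit k cannot produce output j at any scale
        b = beta(y, yk)
        wx = sum(wi * xki for wi, xki in zip(w, xk))  # observed cost w . x_k
        if rts == "VRS":
            feasible, scale = b <= 1.0, 1.0  # b <= 1 is equivalent to y_k >= y
        elif rts == "CRS":
            feasible, scale = True, b
        elif rts == "NIRS":
            feasible, scale = b <= 1.0, b
        elif rts == "NDRS":
            feasible, scale = True, max(b, 1.0)
        else:
            raise ValueError(f"unknown returns to scale label: {rts}")
        if feasible:
            best = min(best, scale * wx)
    return best


# Toy data: three units with two inputs and one output (hypothetical numbers).
data = [([2.0, 4.0], [1.0]), ([4.0, 2.0], [1.0]), ([8.0, 8.0], [4.0])]
prices = [1.0, 1.0]
costs = {rts: cost_nc([1.0], prices, data, rts)
         for rts in ("VRS", "CRS", "NIRS", "NDRS")}
print(costs)
```

With unit prices and one unit of output, the sketch returns a cost of 6 under VRS and NDRS and of 4 under CRS and NIRS, since the third unit scaled down by a factor of 0.25 is cheaper than either small unit.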

Note that Ray [104, Section 10.2] shows that the basic FDH cost function yields the same result as the Weak Axiom of Cost Minimization (WACM) as defined by Varian [123]. This is intuitively obvious since WACM only imposes convexity of the input set, and thus this partial convexity yields the same cost function as the one not imposing convexity at all.
Now, there is a property of the cost function in the outputs worthwhile spelling
out. Some seminal contributors to axiomatic production theory state that the cost
function is nondecreasing and convex (nonconvex) in the outputs when convexity
of technology is assumed (rejected) (e.g., Färe [54, p. 87], Jacobsen [70, p. 765],
Shephard [116, p. 227], or Shephard [117, p. 15]). A central result established in Briec et al. [30] is that cost functions based on convex technologies are always smaller than or equal to cost functions based on nonconvex technologies.

Proposition 4 ([30, p. 171]). The convex and nonconvex cost functions $C_{C,\Gamma}$ and $C_{NC,\Gamma}$, respectively, satisfy the following properties:

(a) For all $(y, w) \in \mathbb{R}^n_+ \times \mathbb{R}^m_+$, $C_{C,\Gamma}(y, w) \le C_{NC,\Gamma}(y, w)$.
(b) In the single output case, if $\Gamma = \Gamma^{CRS}$, then $C_{C,\Gamma}(y, w) = C_{NC,\Gamma}(y, w)$.

Both cost functions are only equal in the case of CRS and a single output.
Proposition 4 can be conceived as a more detailed result spelling out the
precise impact of convexity on the above property of cost functions in the
outputs.
Obviously, these results can also be transposed to other economic value func-
tions. Revenue functions based upon convex technologies are higher than or equal to
revenue functions based upon nonconvex technologies. Only in the single input and
CRS case, both these revenue functions coincide. For the long-run profit function,
by contrast, the use of convex technologies or nonconvex technologies is logically
indistinguishable. However, for any other restricted profit function, one obtains the
result that profit is higher or equal when tangent to a convex instead of a nonconvex
technology.
The appropriateness of the convexity axiom can also be tested by comparing convex and nonconvex value functions imposing a similar returns to scale hypothesis. A simple test of the convexity of, e.g., the cost function can be defined as the ratio between the convex and nonconvex cost functions. Thus, the ratio:

$$CC_{\Gamma}(y, w) = C_{C,\Gamma}(y, w) / C_{NC,\Gamma}(y, w) \tag{22}$$

determines a nonparametric local goodness-of-fit test for the convexity of cost functions conditional on the scaling law (see Briec et al. [30, p. 178]). Obviously, this convexity test in (22) is similar in structure to the test developed earlier in (11).

Efficiency Decompositions and the Testing of Convexity: A Priori Relations

While Farrell [61] provided the first measurement scheme for the evaluation of Technical and Allocative Efficiency in a frontier context, Färe et al. [57] and Seitz [115] both offer alternative extended efficiency taxonomies. Because it is, in our opinion, the most widely used, we adhere in this contribution to the conceptual framework developed in Färe et al. [57, pp. 3–5].
The radial efficiency measure (2) used relative to different technologies entails
the different concepts in this efficiency taxonomy of Färe et al. [57]. By conditioning
the notation of the radial efficiency measure (2) on, e.g., a particular returns to scale
hypothesis, it is straightforward to provide a formal characterization of all efficiency
notions in the following definition (see, e.g., Briec et al. [30, p. 179]).
The following input-oriented efficiency notions are identified:

(a) Technical Efficiency: $TE_{\Lambda}(x, y) = E_{\Lambda,VRS}(x, y)$.
(b) Overall Technical Efficiency: $OTE_{\Lambda}(x, y) = E_{\Lambda,CRS}(x, y)$.
(c) Scale Efficiency: $SCE_{\Lambda}(x, y) = E_{\Lambda,CRS}(x, y) / E_{\Lambda,VRS}(x, y)$.
(d) Overall Efficiency: $OE_{\Lambda}(x, y, w) = C_{\Lambda,CRS}(y, w) / (w \cdot x)$.
(e) Allocative Efficiency: $AE_{\Lambda}(x, y, w) = OE_{\Lambda}(x, y, w) / OTE_{\Lambda}(x, y)$.

While Technical Efficiency ($TE_{\Lambda}(x, y)$) requires production on the boundary of the VRS technology, Overall Technical Efficiency ($OTE_{\Lambda}(x, y)$) necessitates that production is situated on the boundary of the CRS technology. Scale Efficiency ($SCE_{\Lambda}(x, y)$) reflects a social goal and is measured by the ratio between the actual (VRS) and ideal (CRS) technological configurations. Overall Efficiency ($OE_{\Lambda}(x, y, w)$) requires computing a cost function relative to a CRS technology ($C_{\Lambda,CRS}(y, w)$) and taking the ratio between minimal and observed costs ($w \cdot x$). Allocative Efficiency ($AE_{\Lambda}(x, y, w)$) is a residual term computed as the ratio of $OE_{\Lambda}(x, y, w)$ to $OTE_{\Lambda}(x, y)$.⁵

⁵ This decomposition ignores structural efficiency or congestion. Recently, an attempt was made to develop new methods to measure strong forms of hypercongestion for convex and nonconvex technologies alike in Briec et al. [31]. This new methodology is empirically illustrated in Briec et al. [32]. Abad and Briec [1] transpose this methodology toward the modeling of bad outputs using a by-production framework.

Since $E_{\Lambda,CRS}(x, y) \le E_{\Lambda,VRS}(x, y)$, evidently $0 < SCE_{\Lambda}(x, y) \le 1$. The embeddedness of technologies in terms of returns to scale assumptions determines the relations between these efficiency measures. These static efficiency concepts are mutually exclusive, and their radial measurement yields a multiplicative decomposition:

$$OE_{\Lambda}(x, y, w) = AE_{\Lambda}(x, y, w) \cdot OTE_{\Lambda}(x, y), \tag{23}$$

where $OTE_{\Lambda}(x, y) = TE_{\Lambda}(x, y) \cdot SCE_{\Lambda}(x, y)$.
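The arithmetic of this decomposition can be sketched as follows. The function name `decompose` and the input values are hypothetical: the sketch only assumes that the CRS and VRS radial input efficiency scores, the CRS minimum cost, and the observed cost have already been computed for a given unit under one and the same convexity assumption.

```python
def decompose(e_crs, e_vrs, min_cost_crs, observed_cost):
    """Farrell-type input-oriented decomposition from precomputed scores."""
    te = e_vrs                          # Technical Efficiency (VRS frontier)
    ote = e_crs                         # Overall Technical Efficiency (CRS)
    sce = e_crs / e_vrs                 # Scale Efficiency
    oe = min_cost_crs / observed_cost   # Overall Efficiency
    ae = oe / ote                       # Allocative Efficiency (residual)
    return {"TE": te, "SCE": sce, "OTE": ote, "AE": ae, "OE": oe}


# Hypothetical scores for one unit.
parts = decompose(e_crs=0.6, e_vrs=0.8, min_cost_crs=45.0, observed_cost=100.0)
print(parts)
```

By construction the components multiply back to Overall Efficiency: in this example a Technical Efficiency of 0.8, a Scale Efficiency of about 0.75, and an Allocative Efficiency of about 0.75 give an Overall Efficiency of 0.45.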

To develop tests for convexity, we clarify the relationship between convex and
nonconvex decompositions:

Proposition 5 ([30, p. 180]). For all $(x, y) \in \mathbb{R}^m_+ \times \mathbb{R}^n_+$, the relations between convex and nonconvex decomposition components are: (a) $OTE_{C}(x, y) \le OTE_{NC}(x, y)$; (b) $TE_{C}(x, y) \le TE_{NC}(x, y)$; (c) $OE_{C}(x, y, w) \le OE_{NC}(x, y, w)$.

Thus, while three out of the five above efficiency notions can be ordered with respect to the impact of convexity, there is no a priori ordering possible for the nonconvex and convex Scale ($SCE_{\Lambda}(x, y)$) and Allocative ($AE_{\Lambda}(x, y, w)$) Efficiency components. Though the underlying efficiency measures can be ordered, it is not possible to order the ratios between these efficiency measures.
Nonparametric goodness-of-fit tests for the convexity of the efficiency components based upon constant returns to scale technologies and cost functions, respectively, are provided by the following ratios (see Briec et al. [30, p. 181]):

$$CRTE(x, y) = OTE_{C}(x, y) / OTE_{NC}(x, y) \tag{24}$$

and

$$CRCE(x, y, w) = OE_{C}(x, y, w) / OE_{NC}(x, y, w). \tag{25}$$

Several methods have been proposed in the literature to obtain qualitative information regarding global returns to scale (e.g., see Seiford and Zhu [114]). Since these methods are not suitable for nonconvex technologies, Kerstens and Vanden Eeckaut [73, Proposition 2] generalize an existing goodness-of-fit method to suit all technologies. Including a fourth returns to scale case only relevant for nonconvex technologies (see Podinovski [98]), the following proposition summarizes this method.

Proposition 6 ([35, p. 579]). Conditional on the optimal efficient point, technology $T_{\Lambda,VRS}$ is globally characterized by:

(a) CRS: $E_{\Lambda,NIRS}(x, y) = E_{\Lambda,NDRS}(x, y) = E_{\Lambda,VRS}(x, y)$;
(b) IRS: $E_{\Lambda,NIRS}(x, y) < E_{\Lambda,NDRS}(x, y) \le E_{\Lambda,VRS}(x, y)$;
(c) DRS: $E_{\Lambda,NDRS}(x, y) < E_{\Lambda,NIRS}(x, y) \le E_{\Lambda,VRS}(x, y)$;
(d) SCRS: $E_{\Lambda,NIRS}(x, y) = E_{\Lambda,NDRS}(x, y) < E_{\Lambda,VRS}(x, y)$;

where IRS, DRS, and SCRS stand for increasing, decreasing, and sub-constant returns to scale, respectively.
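The four cases of Proposition 6 amount to a simple comparison of three efficiency scores. A minimal Python sketch follows, assuming the NIRS, NDRS, and VRS scores have already been computed for the same unit. The function name `classify_rts` and the numerical tolerance `tol` used to decide equality are our own additions, not part of the proposition.

```python
def classify_rts(e_nirs, e_ndrs, e_vrs, tol=1e-9):
    """Global returns to scale label following the cases of Proposition 6."""
    def equal(a, b):
        return abs(a - b) <= tol  # tolerance is an implementation assumption

    if equal(e_nirs, e_ndrs):
        # Both auxiliary scores coincide: CRS if they also match VRS, else SCRS.
        return "CRS" if equal(e_ndrs, e_vrs) else "SCRS"
    if e_nirs < e_ndrs:
        return "IRS"
    return "DRS"


# Hypothetical score triples (NIRS, NDRS, VRS).
labels = [
    classify_rts(0.5, 1.0, 1.0),
    classify_rts(1.0, 0.5, 1.0),
    classify_rts(0.7, 0.7, 0.7),
    classify_rts(0.6, 0.6, 0.9),
]
print(labels)
```

The four triples illustrate increasing, decreasing, constant, and sub-constant returns to scale, in that order. In empirical work the tolerance matters, because scores that are equal in theory may differ by rounding error.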

Table 1 Nonconvex and convex cost estimates: a selection

Article                          Ratio CC(y, w) (in %)    Remarks
-------------------------------  -----------------------  ---------
Balaguer-Coll et al. [11]        58.87
Briec et al. [30]                97.76                    CRS
Cummins and Zi [45]              50.55
De Borger & Kerstens [47]        77.59
Grifell-Tatjé & Kerstens [67]    90.85                    Actual
                                 79.82                    Ideal
Viton [124]                      87.64                    1 Output
                                 92.77                    4 Outputs

Essentially, these CRS, NIRS, and NDRS technologies serve as auxiliary devices to determine the position of an observation relative to the true flexible (i.e., VRS) returns to scale technology. Recently, Mostafaee and Soleimani-Damaneh [92] propose a more elaborate taxonomy of global returns to scale characterizations for nonconvex technologies based on results of Mostafaee and Soleimani-Damaneh [91].

Empirical Evidence on FDH and Its Extensions: The Impact of Convexity

This subsection focuses on the key question: does nonconvexity matter in empirical
applications when compared to traditional convex analysis? We provide some
evidence for a selection of four economic topics: (i) cost functions, (ii) efficiency
decompositions, (iii) productivity growth, and (iv) capacity utilization.

Cost Function Results

In Table 1 we list a small selection of studies that report the results of convex and nonconvex frontier cost estimates. The first column lists the authors of the article, the second column reports the ratio $CC_{\Gamma}(y, w)$ as defined in (22), and the third column provides a remark where applicable.⁶
The Balaguer-Coll et al. [11] study on Spanish municipalities reveals that convex costs are only 58.87% of nonconvex costs at the sample average. Analyzing the US life insurance industry, Cummins and Zi [45] even report 50.55% on average for $CC_{\Gamma}(y, w)$: this means that convex costs are about half of nonconvex costs. The De Borger and Kerstens [47] analysis of Belgian municipalities shows that convex costs are only 77.59% of nonconvex costs. In a study of Spanish electricity distribution, Grifell-Tatjé and Kerstens [67] report a ratio of 90.85% when using data from the actual network and of 79.82% when using data from an ideal engineering network. The Briec et al. [30] study lists a ratio of 97.76%, but this study imposes CRS and therefore meets one of the two conditions for equality (see Proposition 4).
The Viton [124] article is something of a special case in that the author compares WACM and traditional convex cost estimates: since WACM coincides with a nonconvex estimate, this amounts to an implicit test of convexity. He reports a ratio of 87.64% under a single output specification (again meeting one of the two conditions for equality in Proposition 4) and a ratio of 92.77% under a multiple output specification.

⁶ In case the study does not report cost estimates but rather overall efficiency ratios, one can obtain $CC_{\Gamma}(y, w) = C_{C,\Gamma}(y, w) / C_{NC,\Gamma}(y, w)$ by taking the ratio of the corresponding overall efficiency ratios $OE_{C}(x, y, w) / OE_{NC}(x, y, w)$. The observed cost in each of the denominators of $OE_{\Lambda}(x, y, w)$ cancels out.
In conclusion, it is undeniable that convexity has a substantial to very large impact on cost estimates and hence on Overall Efficiency.

Efficiency Decomposition
From the efficiency decomposition discussed in section “Efficiency Decompositions
and the Testing of Convexity: A Priori Relations,” the overall efficiency component
has already been discussed in section “Cost Function Results.” Therefore, we focus
on technical efficiency components in this part.
As established in Proposition 5, $TE_{C}(x, y) \le TE_{NC}(x, y)$. There is an abundance of studies reporting efficiency measures computed relative to basic convex (10) and nonconvex (9) technologies. We focus on just a few examples.
For instance, Stroobants and Bouckaert [120] compare libraries in the Flemish
region and report substantial differences between convex and nonconvex results for
three specifications (though no statistical tests are reported). As another example,
Mayston [90] evaluates UK economics departments and finds substantial differences
at the sample level (though again no statistical tests are reported).
Cesaroni et al. [35, p. 582–583] report on the decomposition $OTE_{\Lambda}(x, y) = TE_{\Lambda}(x, y) \cdot SCE_{\Lambda}(x, y)$ for five secondary data sets. These authors find that convex and nonconvex $OTE_{\Lambda}(x, y)$ differ significantly for only two data sets, while convex and nonconvex $SCE_{\Lambda}(x, y)$ differ significantly for all data sets and convex and nonconvex $TE_{\Lambda}(x, y)$ for most data sets. The same authors also focus on conflicting cases in returns to scale determination using Proposition 6: e.g., switches from increasing returns to scale (IRS) to decreasing returns to scale (DRS), from CRS to IRS, and from CRS to DRS. While one data set has no conflicting cases, the four other data sets exhibit conflicting cases ranging between 6.98% and 39.02% of observations. Finally, these authors explore the markedly different patterns of ray average productivity curves under convex and nonconvex technologies.
Chavas and Kim [42, p. 69–70] report on convex and nonconvex $TE_{\Lambda}(x, y)$ and $SCE_{\Lambda}(x, y)$: while no statistical tests are reported, the descriptive statistics seem to be markedly different. Cesaroni and Giovannola [34, p. 128–129] establish results for alternative convex and nonconvex cost-based efficiency components similar to the above: though no statistical tests are mentioned, the descriptive statistics are clearly different.

Productivity Growth
Kerstens and Van de Woestyne [74] report empirical results for the immensely
popular Malmquist productivity index (e.g., Färe et al. [60]) as well as for the
Hicks-Moorsteen Total Factor Productivity (TFP) index (defined by Bjurek [17])
under various specifications of technology. For both indices, it turns out that convex
and nonconvex results for both CRS and VRS yield different descriptive statistics,
though no formal tests are provided regarding the statistical significance of these
differences.
Kerstens and Managi [72] focus on the Luenberger productivity indicator which
is defined in terms of the differences between directional distance functions (see
[37]) using basic convex (10) and nonconvex (9) technologies. Analyzing a huge
data set of petroleum wells, their findings can be summarized as follows. First,
productivity change is on average smaller under nonconvexity, and the resulting
distributions are significantly different. Second, substantially more observations
tend to push the frontier outward under nonconvexity and are thus involved in
creating technological change. Third, both β-convergence and σ -convergence are
being tested for and happen to occur only under nonconvexity, not under the
traditional convexity axiom. In a follow-up study of Chinese banks, Barros et
al. [15] also find that the Luenberger productivity change is on average smaller
under nonconvexity. Testing differences in productivity with respect to scale and
ownership does not yield different patterns according to convexity.
Finally, Ang and Kerstens [10] study productivity of US agriculture at the state
level using the Luenberger-Hicks-Moorsteen TFP indicator (introduced by Briec
and Kerstens [27]) again using basic convex (10) and nonconvex (9) technologies.
These authors report a higher TFP change under nonconvexity, and the resulting
distributions are significantly different.

Capacity Utilization
Johansen [71] introduces the notion of plant capacity as the maximum output vector
that can be produced with existing equipment with unrestricted variable inputs
per unit of time. Färe et al. [59] transpose this notion into a multi-output frontier
framework by using a combination of two output-oriented efficiency measures: one
relative to a technology including the variable inputs and another one excluding the
variable inputs. Walden and Tomberlin [125] report average output-oriented plant capacity estimates of 52% and 84% for the basic convex (10) and the basic nonconvex (9) technology, respectively.
Kerstens et al. [79] argue that the output-oriented plant capacity utilization is
unrealistic when the amounts of variable inputs needed to reach the maximum
capacity outputs are not available. This is related to the attainability issue already
noted by Johansen [71]. These authors illustrate empirically that the scaling of
variable inputs is less implausible for nonconvex compared to traditional convex
technologies.
Cesaroni et al. [36] define an alternative input-oriented plant capacity notion
by using a combination of two sub-vector input-oriented efficiency measures only
aimed at reducing the variable inputs: one relative to a standard technology and
one relative to a technology with the minimum output level per dimension among
all observed units. These authors report average output-oriented plant capacity
estimates of 92% and 89% for the convex (10) and nonconvex (9) technologies,
respectively; these apparently small differences nevertheless reflect distributions
that are statistically significantly different. For the average input-oriented
plant capacity estimates, they report 120% and 121% for the convex (10) and
nonconvex (9) technologies, respectively: again, these apparently small differences
reflect distributions that are statistically significantly different.
Such differences may well have large implications for the design of policies to
combat overcapacity in fisheries. Kerstens et al. [77] report results from a
short-run Johansen sector model that allows for the reallocation of production
between firms and is developed in two steps. In the first step, output-oriented
plant capacity estimates are computed. In the second step, the industry model
radially minimizes the industry use of fixed inputs such that total production is
maintained at its current level by reallocating production among firm capacities.
Of the 398 vessels in the fleet, the convex plant capacity estimates lead to
maintaining only 330 vessels, while the nonconvex estimates maintain 357 vessels.
Thus, the required decommissioning effort resulting from the short-run Johansen
sector model is larger under convexity (68 versus 41 vessels).
Kerstens et al. [78] aim to compare empirically technical and economic capacity
notions on both convex and nonconvex technologies. After defining these capacity
notions, an empirical comparison is performed using a secondary data set containing
data of French fruit producers. Two key empirical conclusions are that all these dif-
ferent capacity notions follow different distributions and also that these distributions
almost always differ under convex and nonconvex technologies.

FDH and Its Extensions: Further Methodological Refinements

One can mention a whole series of methodological refinements and variations that
have been introduced in the literature related to methods initially developed in a
convex setting.
First, traditional radial efficiency measures in FDH models yield potentially huge
amounts of slacks and surpluses since the efficient subset is limited to the corner
points; nonradial input-, output-, and graph-oriented efficiency measures have been
evaluated and found particularly relevant in the basic FDH model by De Borger et al.
[48]. Portela et al. [101] focus on some alternative graph-oriented (or nonoriented)
efficiency measures in the same context. Following up on Ebrahimnejad et al.
[51], Fukuyama et al. [64] develop least-distance efficiency measures for FDH
technologies that satisfy a strong monotonicity property.
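The slack issue raised above can be illustrated numerically. The following is a minimal sketch, assuming the usual enumeration formula for the radial input efficiency measure on the basic FDH technology (the minimum, over observations producing at least the evaluated outputs, of the largest input ratio). The function name and the data are hypothetical and serve only as an illustration.

```python
import numpy as np

def fdh_input_efficiency(x, y, X, Y):
    """Radial input efficiency of (x, y) on the basic FDH technology.

    X is a (K, m) array of observed inputs, Y a (K, n) array of observed
    outputs. Assumes strictly positive inputs in x.
    Returns the efficiency score and the inputs of the dominating peer.
    """
    dominating = np.all(Y >= y, axis=1)          # units producing at least y
    ratios = np.max(X[dominating] / x, axis=1)   # largest input ratio per unit
    best = int(np.argmin(ratios))
    return float(ratios[best]), X[dominating][best]

# Hypothetical data: three units, two inputs, one output.
X = np.array([[2.0, 4.0], [4.0, 2.0], [5.0, 5.0]])
Y = np.array([[1.0], [1.0], [1.0]])

theta, peer = fdh_input_efficiency(X[2], Y[2], X, Y)
slack = theta * X[2] - peer   # input slack left after the radial contraction
```

In this hypothetical example the third unit is contracted radially to 80% of its inputs, yet a slack of two units remains in the first input because the projection point is compared with a corner point of the frontier.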
Second, in the spirit of Bouhnik et al. [22] who proposed lower bound restrictions
on the intensity variables to avoid unreasonable optimal activity vectors in a convex
setting, Mairesse and Vanden Eeckaut [89] develop for these nonconvex production
models lower and upper bound restrictions to the scaling of observations.
Third, several types of extreme points (including anchor points) can be distin-
guished in FDH (see Soleimani-damaneh and Mostafaee [119]). Fourth, Soleimani-
damaneh [118] develops a dynamic FDH production model that can be recursively
solved by means of simple enumeration.
Fifth, Tavakoli and Mostafaee [121] are the first to develop a network structure
production model that opens up the black box of production via parallel and
sequential production processes in a nonconvex world. These authors obtain closed
form solutions for the basic efficiency measures under FDH and its extensions.
Sixth, there is some work on the construction of three-dimensional sections of the
efficient frontier for nonconvex models via enumeration methods as developed supra
(see Krivonozhko and Lychev [80–83], Krivonozhko et al. [84]).
Finally, Tulkens [122] was the first to propose a Free Replicability Hull (FRH)
by allowing for integer replications of all observations, possibly complemented
by upper bounds on the integer replication process. It turns out that this FRH is
computationally quite challenging (see Ehrgott and Tind [52]). In a similar vein,
Green and Cook [66] define a nonconvex technology containing all observations
as well as all composite observations obtained by simple aggregation. This Free
Coordination Hull (FCH) can also be complemented by an upper bound on the
number of observations being aggregated.
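The construction of composite observations can be sketched as follows. This is only an illustration, assuming that a composite is obtained by summing the inputs and outputs of a subset of distinct observations and that the upper bound limits the subset size. The function name `fch_composites` and the data are hypothetical.

```python
from itertools import combinations

import numpy as np

def fch_composites(X, Y, max_size):
    """Enumerate composite observations built from up to `max_size` units.

    Each composite is a tuple (subset, summed inputs, summed outputs).
    """
    K = X.shape[0]
    composites = []
    for size in range(1, max_size + 1):
        for subset in combinations(range(K), size):
            idx = list(subset)
            composites.append((subset, X[idx].sum(axis=0), Y[idx].sum(axis=0)))
    return composites

# Hypothetical data: three units, one input, one output.
X = np.array([[2.0], [3.0], [6.0]])
Y = np.array([[1.0], [2.0], [2.5]])

composites = fch_composites(X, Y, max_size=2)
```

With these hypothetical numbers, the aggregate of the first two units uses 5 units of input to produce 3 units of output and therefore dominates the third unit, although no single observation does. The number of composites grows combinatorially with the bound, which hints at why these hulls are computationally demanding.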
Thus, most of the analysis that has been developed for convex technologies can
somehow be transposed to FDH and its extensions. This simply illustrates that
this rich body of analytical results is not necessarily jeopardized when opting for
nonconvex technologies.

Mitigating Convexity: A Selection

It should be clear by now that if one drops the convexity axiom altogether, then FDH
and its extensions are the straightforward technological and economic value function
choices to consider. However, some people have sought to mitigate the impact of
convexity in a variety of ways. This section offers a selection of approaches defining
some alternative to the traditional convexity axiom and somehow avoiding FDH and
its extensions.

Partial Convexity

Several authors have attempted to relax the convexity axiom somewhat. Petersen
[97] initiated a small literature aimed at maintaining convexity in input space and
in output space solely, but not in the graph of technology. The implementation of
this relaxed set of assumptions is corrected by Bogetoft [18] with restrictions on
the dimensionality of the production technology. Bogetoft et al. [19] relax these
restrictions on the dimensionality of the input and output spaces, while Post [102]
improves upon the latter article by proposing a procedure that avoids computational
problems in large-scale applications.
This relaxed assumption is justified by appeal to, for instance, the law of
diminishing marginal rates of substitution in the input space or to the idea of
diminishing marginal rates of transformation in the output space. However, it is
not clear how time divisibility can be applied in the context of this partial convexity
notion. Furthermore, one may question whether there really is, for instance, a law of
diminishing marginal rates of substitution in the input space. For example, Brokken
[33] summarizes three studies revealing that there are increasing marginal rates
of substitution of grain for roughage in beef production. Therefore, the law of
diminishing marginal rates of substitution is questionable.
Podinovski [100] introduces the idea of partial convexity between certain subsets
of inputs and subsets of outputs and derives BMIP for the traditional efficiency
measures. Leleu [86] proposes new LP formulations combining aspects of convex
and nonconvex production models across dimensions for all returns to scale
assumptions and for the directional distance efficiency measure. While Podinovski
[100, p. 555–556] justifies his partial convexity approach by appealing to divisibility
arguments pertaining to specific inputs and/or outputs, one may wonder whether
time divisibility does not by definition relate to the whole production process, so
that setup times and indivisibilities destroy convexity altogether rather than only
in some subset of dimensions.
Finally, Chavas and Kim [42] adopt a different strategy to combine convex
and nonconvex models by defining the technology as a union of neighborhood-based
local representations of the technology, each of which is convex. Obviously,
the union of convex technologies need not be convex. By choosing very small or
very large neighborhoods, this union of neighborhood-based local representations
converges to the nonconvex technology (9) or the convex technology (10),
respectively. An obvious problem of the whole approach is the neighborhood choice
and its impact on productivity and efficiency analysis.

Regular Ultra Passum Law

Olesen and Petersen [94] intend to make convex models (10) suitable to estimate
optimal scale size by augmenting these with two additional maintained hypotheses
which imply that the frontier is consistent with smooth curves along rays in input
and in output space that obey the Regular Ultra Passum (RUP) law (i.e., mono-
tonically decreasing scale elasticities). This RUP law implies that the production
frontier must be S-shaped along any expansion path in input space. Obviously, such
technologies are nonconvex in input-output space. Olesen and Petersen [94] focus
on the multiple inputs single output case.
Olesen and Ruggiero [95] continue from there and focus on production
technologies that are input homothetic. This makes it possible to maintain convexity
in input and in output space while allowing for nonconvexities in input-output space.
This homotheticity assumption mainly serves to simplify the estimation procedure.
This presentation also assumes only one output.
In a sense, imposing the RUP law in this context again focuses on allowing
for nonconvexities in input-output space, just as in section “Partial Convexity.”
Therefore, the same reservations prevail. Furthermore, there are long-standing
misgivings on the use of homothetic structures in production theory as in Olesen and
Ruggiero [95]. Already Samuelson and Swamy [107, p. 592] conclude: “Empirical
experience is abundant that the Santa Claus hypothesis of homotheticity in tastes
and in technical change is quite unrealistic.”

From Generalized Convexity to Nonconvexity

We now focus on a modification of the CES − CET model introduced by Färe
et al. [58] that is a generalization of the traditional convex approach (10). This
CES − CET model has two parts: the output part is characterized by a Constant
Elasticity of Transformation specification, and the input part is characterized by a
Constant Elasticity of Substitution specification. Consider a generic map
$\phi_r : \mathbb{R}^d_+ \to \mathbb{R}^d_+$ defined as $\phi_r(z) = (z_1^r, \ldots, z_d^r)$. For all $r > 0$, this function
is an isomorphism from $\mathbb{R}^d_+$ to itself, and its reciprocal is defined on $\mathbb{R}^d_+$ as
$\phi_r^{-1}(z) = (z_1^{1/r}, \ldots, z_d^{1/r})$.

Given a subset $B = \{u_k : k \in K\}$ of $\mathbb{R}^d_+$, from Ben-Tal [16], one can define its
$\phi_r$-generalized convex hull as:

$$\mathrm{Co}^{\phi_r}(B) = \Big\{ \phi_r^{-1}\Big( \sum_{k \in K} z_k \phi_r(u_k) \Big) : \sum_{k \in K} z_k = 1,\ z_k \ge 0 \Big\}. \tag{26}$$

Notice that this set is in general not convex in the usual sense; the usual convex
hull corresponds to the case where $r = 1$. The CES − CET model can then be
defined as the set:

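$$T_{C,\gamma,\delta} = \Big\{ (x, y) \in \mathbb{R}^m_+ \times \mathbb{R}^n_+ : x \ge \phi_\gamma^{-1}\Big( \sum_{k \in K} z_k \phi_\gamma(x_k) \Big),\ y \le \phi_\delta^{-1}\Big( \sum_{k \in K} z_k \phi_\delta(y_k) \Big),\ \sum_{k \in K} z_k = 1,\ z_k \ge 0 \Big\}, \tag{27}$$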

where γ and δ > 0. Paralleling Banker et al. [13], this construction is derived
from the notion of generalized convex hull defined in (26). For such a class of
models, the radial efficiency measure (2) can be computed making some obvious
linear transformations. Notice that Ravelojaona [103] has proposed a nonlinear
version of the directional distance function (see Chambers et al. [38]) that can also
be computed by linear programming methods.
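The generalized combination underlying (26) and (27) can be illustrated with a short numerical sketch. The helper names below are hypothetical, the data are made up, and the code assumes nonnegative points and $r > 0$ as in the text.

```python
import numpy as np

def phi(z, r):
    """Component-wise power map phi_r(z) = (z_1^r, ..., z_d^r)."""
    return np.power(z, r)

def phi_inv(z, r):
    """Reciprocal map phi_r^{-1}(z) = (z_1^(1/r), ..., z_d^(1/r))."""
    return np.power(z, 1.0 / r)

def generalized_combination(points, weights, r):
    """phi_r^{-1}( sum_k z_k phi_r(u_k) ) for weights z_k >= 0 summing to one."""
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return phi_inv(weights @ phi(points, r), r)

# Two hypothetical points in the plane, equally weighted.
u = [[1.0, 4.0], [4.0, 1.0]]
z = [0.5, 0.5]

mid_linear = generalized_combination(u, z, r=1.0)     # ordinary midpoint
mid_quadratic = generalized_combination(u, z, r=2.0)  # lies off the segment
```

For $r = 1$ the combination is the ordinary midpoint of the two points, whereas for $r = 2$ it lies outside the line segment joining them, which is why the resulting hull differs from the usual convex hull.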
Boussemart et al. [23, p. 334] state that a production technology T is said to be
homogeneous of degree α if for all λ > 0:

$$(x, y) \in T \Rightarrow (\lambda x, \lambda^{\alpha} y) \in T. \tag{28}$$

This technology has also been termed “almost homogeneous technology of degree
1 and α.” This degree of homogeneity of the technology has direct implications for
the nature of returns to scale.

Proposition 7 ([23, p. 334]). Assume that the production technology T satisfies
T1–T4. Moreover, suppose that T is homogeneous of degree α. (a) If α > 1, then
T satisfies strictly increasing returns to scale; (b) if 0 < α < 1, then T satisfies
strictly decreasing returns to scale.

Thus, these homogeneous technologies exhibit either strictly increasing or
strictly decreasing returns to scale according to their degree of homogeneity.
Therefore, one can say that if the technology is homogeneous of degree α, then
it satisfies α-returns to scale. Obviously, strictly increasing returns to scale imply
nonconvexity of technology.
Boussemart et al. [23] propose to relax the definition proposed in Färe et al. [58]
by considering the following production model:

$$T^{\alpha}_{C,\gamma,\delta} = \Big\{ (x, y) \in \mathbb{R}^m_+ \times \mathbb{R}^n_+ : x \ge \phi_\gamma^{-1}\Big( \sum_{k \in K} z_k \phi_\gamma(x_k) \Big),\ y \le \phi_\delta^{-1}\Big( \sum_{k \in K} z_k \phi_\delta(y_k) \Big),\ z_k \ge 0 \Big\}, \tag{29}$$

where γ and δ > 0. $T^{\alpha}_{C,\gamma,\delta}$ satisfies an α-returns to scale assumption with
$\alpha = \gamma/\delta$. This technology differs from the one proposed by Färe et al. [58] because it
suppresses the constraint $\sum_{k \in K} z_k = 1$. While their model is not compatible with
an α-returns to scale assumption, model (29) satisfies axioms (T1)–(T4) and satisfies
α-returns to scale under a suitable specification of α.

Proposition 8 ([23, p. 336]). The production technology $T^{\alpha}_{C,\gamma,\delta}$ defined in (29)
satisfies:

(a) strictly increasing returns to scale if and only if γ/δ > 1;
(b) strictly decreasing returns to scale if and only if γ/δ < 1;
(c) constant returns to scale if and only if γ/δ = 1.

Furthermore, this notion of α-returns to scale has also been extended to FDH and
its extensions (see Boussemart et al. [23, p. 336]).
In empirical applications, γ and δ are a priori parameters: optimal parameter
values can be determined by applying a goodness-of-fit method. This can be done
using a grid search method. For example, Leleu et al. [87] analyze four types of
intensive care units and find overwhelming evidence of increasing returns to scale,
but at the hospital level most institutions operate under decreasing returns to scale.
More recently, Boussemart et al. [24] attempt to endogenize γ and δ using global
optimization tools. They propose a tractable procedure to find an optimal value of
α under a generalized FDH technology. This approach fully endogenizes α and
estimates its value by linear programming. For each firm k ∈ K, we consider an
individual technology defined by:

$$Q_{\gamma,\delta}(x_k, y_k) = \big\{ (x, y) \in \mathbb{R}^m_+ \times \mathbb{R}^n_+ : x \ge \lambda^{1/\gamma} x_k,\ y \le \lambda^{1/\delta} y_k,\ \lambda \ge 0 \big\}. \tag{30}$$

The global technology is then the union of individual technologies as follows:

$$T_{NC,\gamma,\delta} = \bigcup_{k \in K} Q_{\gamma,\delta}(x_k, y_k). \tag{31}$$

For all k, j ∈ K, let us denote:

$$E^{(k)}_{\gamma,\delta}(x_j, y_j) = \min\{\theta : (\theta x_j, y_j) \in Q_{\gamma,\delta}(x_k, y_k)\}. \tag{32}$$

By definition, one has $E^{(k)}_{\gamma,\delta}(x_k, y_k) = 1$. From Boussemart et al. [24], one can show
that:

$$E^{(k)}_{\gamma,\delta}(x_j, y_j) = \big(\beta_k(y_j)\big)^{\delta/\gamma} \cdot \alpha_k(x_j), \tag{33}$$

where for all k, $\alpha_k(x_j)$ and $\beta_k(y_j)$ are as in Proposition 2. Notice that this result
generalizes the one defined in the VRS case. It follows that:

$$E_{NC,\gamma,\delta}(x_j, y_j) = \min\{\theta : (\theta x_j, y_j) \in T_{NC,\gamma,\delta}\} \tag{34}$$

$$= \min_{k \in K} \big(\beta_k(y_j)\big)^{\delta/\gamma} \cdot \alpha_k(x_j). \tag{35}$$

By defining α = γ /δ, using the fact that any efficiency score is obtained in closed
form, one can then find α which maximizes the quantity M defining an index of
goodness of fit as:

$$M(A; \alpha) = \prod_{j \in K} E_{NC,\gamma,\delta}(x_j, y_j) = \prod_{j \in K} \min_{k \in K} \big(\beta_k(y_j)\big)^{1/\alpha} \cdot \alpha_k(x_j) \tag{36}$$

subject to the constraint that (xj , yj ) ∈ TN C,γ ,δ for all j ∈ K. Taking the logarithm
it is then easy to convert this optimization problem to a linear program. An empirical
application is proposed in Boussemart et al. [24].
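The closed-form scores and the search for α can be sketched numerically. The code below is only an illustration under explicit assumptions: data are strictly positive, $\alpha_k(x_j)$ is taken to be the largest ratio of the inputs of unit k to those of unit j, $\beta_k(y_j)$ the largest ratio of the outputs of unit j to those of unit k (which is what (30) and (32) imply for positive data), and the goodness-of-fit index is taken to be the product of the efficiency scores. A plain grid search replaces the linear program mentioned in the text, and all names and data are hypothetical.

```python
import numpy as np

def efficiency_scores(X, Y, alpha):
    """Closed-form input efficiency of every unit under alpha-returns to scale.

    Assumed definitions (strictly positive data):
      alpha_k(x_j) = max_i X[k, i] / X[j, i]
      beta_k(y_j)  = max_i Y[j, i] / Y[k, i]
    Entry [j, k] of each matrix refers to evaluated unit j and reference unit k.
    """
    a = np.max(X[None, :, :] / X[:, None, :], axis=2)   # alpha_k(x_j)
    b = np.max(Y[:, None, :] / Y[None, :, :], axis=2)   # beta_k(y_j)
    return np.min(b ** (1.0 / alpha) * a, axis=1)

def fit_alpha(X, Y, grid):
    """Grid search for the alpha maximizing the product of efficiency scores."""
    fits = np.array([np.prod(efficiency_scores(X, Y, a)) for a in grid])
    best = int(np.argmax(fits))
    return float(grid[best]), float(fits[best])

# Hypothetical data generated by y = x**2 (one input, one output).
X = np.array([[1.0], [2.0], [3.0]])
Y = X ** 2

scores_crs = efficiency_scores(X, Y, 1.0)
alpha_hat, fit = fit_alpha(X, Y, np.linspace(0.5, 3.0, 26))
```

Because the hypothetical data lie exactly on $y = x^2$, every unit becomes efficient when α equals 2, so the grid search recovers that value. With α set to 1, only the largest unit is efficient.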
In the same vein, based on Charnes et al. [40], we now consider the piecewise
Cobb-Douglas (CD) model. Let us define the map $\phi_0 : \mathbb{R}^d_{++} \to \mathbb{R}^d$ as
$\phi_0(u) = (\ln(u_1), \ldots, \ln(u_d))$. This function is a bijection from $\mathbb{R}^d_{++}$ to
$\mathbb{R}^d$, and its reciprocal is defined on $\mathbb{R}^d$ by $\phi_0^{-1}(u) = (\exp(u_1), \ldots, \exp(u_d))$.
This piecewise Cobb-Douglas model can be written as:

$$T_{CD} = \Big\{ (x, y) \in \mathbb{R}^{m+n}_{++} : x \ge \prod_{k \in K} x_k^{\lambda_k},\ y \le \prod_{k \in K} y_k^{\lambda_k},\ \sum_{k \in K} \lambda_k = 1,\ \lambda \ge 0 \Big\}.$$

This model is a generalized convex model derived from the notion of generalized
convexity analyzed by Ben-Tal [16]. A general taxonomy is provided in the next
subsection.

Semilattice Structures

In mathematics, a partially ordered set S for which every two elements have a
supremum contained in S is called an upper-semilattice. Hence for some dimension
d ∈ N, the partial order defined by u ≤ w if ui ≤ wi for all i ∈ {1, . . . , d}, with
u, w ∈ Rd+ , realizes upper-semilattice structures in Rd+ . The supremum of u and w
is determined by u ∨ w = (max(u1 , w1 ), . . . , max(ud , wd )). Note that the operator
∨ can be seen as taking the component-wise maximum.
Following Briec and Horvath [25], a subset L ⊂ Rd+ is said to be a B-
convex set, if ∀u, w ∈ L, ∀t ∈ [0, 1] : u ∨ tw ∈ L. Obviously, B-convex
subsets determine a special class of upper-semilattice structures in Rd+ of which
the mathematical properties are analyzed in detail in Briec and Horvath [25]. Briec
and Horvath [26] impose B-convexity on technologies in production economics as
a substitute for convexity (and nonconvexity in the sense of FDH) and study general
properties of these technologies and related cost functions. Starting from the set of
K observations $A = \{(x_1, y_1), \ldots, (x_K, y_K)\} \subset \mathbb{R}^m_+ \times \mathbb{R}^n_+$, the following
B-convex nonparametric technology is defined:

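$$T_{\max} = \Big\{ (x, y) \in \mathbb{R}^m_+ \times \mathbb{R}^n_+ : x \ge \bigvee_{k \in K} z_k x_k,\ y \le \bigvee_{k \in K} z_k y_k,\ \max_{k \in K} z_k = 1,\ z_k \ge 0 \Big\}, \tag{37}$$

with the notation

$$\bigvee_{k \in K} u_k = \Big( \max_{k \in K}(u_{k1}), \ldots, \max_{k \in K}(u_{kd}) \Big) \in \mathbb{R}^d_+,$$

for $u_k = (u_{k1}, \ldots, u_{kd}) \in \mathbb{R}^d_+$ ($k \in K$), expanding the operator ∨ to multiple
vectors. Notice the structural similarity with (10) by replacing summation with
component-wise maximum.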
Dual to the notion of an upper-semilattice, a lower-semilattice is defined as a
partially ordered set S for which every two elements have an infimum contained
in S. Applied to $\mathbb{R}^d_+$, this infimum of $u, w \in \mathbb{R}^d_+$ is determined by
$u \wedge w = (\min(u_1, w_1), \ldots, \min(u_d, w_d))$. Obviously, the operator ∧ takes the
component-wise minimum of both vectors.
Using this dual notion, Adilov and Yesilce [3] define a subset L ⊂ Rd+ ∪ {+∞}d
to be inverse B-convex if ∀u, w ∈ L, ∀t ∈ [1, +∞] : u ∧ tw ∈ L, and study
its properties. By analogy with the B-convex case, Briec and Liang [29] define the
following inverse B-convex nonparametric technology:

$$T_{\min} = \Big\{ (x, y) \in \mathbb{R}^m_+ \times \mathbb{R}^n_+ : x \ge \bigwedge_{k \in K} z_k x_k,\ y \le \bigwedge_{k \in K} z_k y_k,\ \min_{k \in K} z_k = 1,\ z_k \ge 0 \Big\}, \tag{38}$$

with the notation

$$\bigwedge_{k \in K} u_k = \Big( \min_{k \in K}(u_{k1}), \ldots, \min_{k \in K}(u_{kd}) \Big) \in \mathbb{R}^d_+,$$

for $u_k = (u_{k1}, \ldots, u_{kd}) \in \mathbb{R}^d_+$ ($k \in K$). Compared with (10), summation is
now replaced with component-wise minimum. This type of production technology
makes it possible to account for situations where the inputs exhibit complementarity.
In such a case, the structure of the input set is similar to that of the Leontief
production function.
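The two semilattice combinations can be compared on a small numerical example. This sketch assumes the normalizations in which the largest weight equals one in the B-convex case and the smallest weight equals one in the inverse B-convex case. Function names and data are hypothetical.

```python
import numpy as np

def b_convex_point(z, U):
    """Component-wise maximum of the scaled vectors z_k * u_k."""
    return np.max(z[:, None] * U, axis=0)

def inverse_b_convex_point(z, U):
    """Component-wise minimum of the scaled vectors z_k * u_k."""
    return np.min(z[:, None] * U, axis=0)

# Two hypothetical input vectors.
X = np.array([[2.0, 6.0], [6.0, 2.0]])

z_max = np.array([1.0, 0.5])   # largest weight equals one
z_min = np.array([1.0, 2.0])   # smallest weight equals one

x_upper = b_convex_point(z_max, X)          # max((2, 6), (3, 1))
x_lower = inverse_b_convex_point(z_min, X)  # min((2, 6), (12, 4))
```

The first combination returns (3, 6) and the second (2, 4). Neither point lies on the line segment between the two observations, which shows how both structures depart from the convex combination used in (10).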
Radial efficiency measurements can be computed with respect to both technolo-
gies Tmin and Tmax by using enumeration algorithms developed in Briec and Horvath
[26] and Briec and Liang [29]. These new production models have recently been
applied in, e.g., energy (Andriamasy et al. [7]), transportation (Barros et al. [14]),
and the tourism industry (Goncalves et al. [65]).
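Coming back to the model proposed by Färe et al. [58], Andriamasy et al.
[8] show that these production technologies are the Painlevé-Kuratowski lower
[upper] limit of the sequence of production technologies $T_{C,r,r}$ that are derived from
the CES − CET technology (27) by setting $\gamma = \delta = r$:⁷

$$\operatorname{Lim}_{r \to \infty} T_{C,r,r} = T_{\max}. \tag{39}$$

In addition, if $A \subset \mathbb{R}^m_{++} \times \mathbb{R}^n_{++}$,

$$\operatorname{Lim}_{r \to -\infty} T_{C,r,r} = T_{\min}, \tag{40}$$

and finally

$$\operatorname{Lim}_{r \to 0} T_{C,r,r} = T_{CD}. \tag{41}$$

7 The Painlevé-Kuratowski lower [upper] limit (sometimes also called Peano limit) of the sequence
of sets $\{E_n\}_{n \in \mathbb{N}}$ is denoted $\mathrm{Li}_{n \to \infty} E_n$ [$\mathrm{Ls}_{n \to \infty} E_n$]. The lower limit is the set of points p for
which there exists a sequence $\{p_n\}$ of points such that $p_n \in E_n$ for all n and $p = \lim_{n \to \infty} p_n$;
the upper limit is the set of points p that are the limit of a subsequence $\{p_{n_k}\}$ with
$p_{n_k} \in E_{n_k}$. A sequence $\{E_n\}_{n \in \mathbb{N}}$ of subsets of $\mathbb{R}^m$ is said to converge, in the
Painlevé-Kuratowski sense, to a set E if $\mathrm{Ls}_{n \to \infty} E_n = E = \mathrm{Li}_{n \to \infty} E_n$, in which case we write
$E = \operatorname{Lim}_{n \to \infty} E_n$.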
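A rough numerical intuition for these three limits can be obtained on scalars. The sketch below does not reproduce the set-convergence argument of Andriamasy et al. [8]; it only checks the well-known behavior of the weighted power mean built from the map $\phi_r$, which approaches the maximum, the minimum, and the geometric mean as r tends to plus infinity, minus infinity, and zero, respectively. The function name and the numbers are hypothetical.

```python
import numpy as np

def power_mean(u, z, r):
    """phi_r^{-1}( sum_k z_k phi_r(u_k) ) for positive scalars u_k and r != 0."""
    u = np.asarray(u, dtype=float)
    z = np.asarray(z, dtype=float)
    return float((z @ u ** r) ** (1.0 / r))

u, z = [1.0, 4.0], [0.5, 0.5]

large = power_mean(u, z, 200.0)    # close to max(u) = 4
small = power_mean(u, z, -200.0)   # close to min(u) = 1
near0 = power_mean(u, z, 1e-6)     # close to the geometric mean 2
```

The maximum, the minimum, and the geometric mean are precisely the aggregation rules appearing in the B-convex, inverse B-convex, and piecewise Cobb-Douglas technologies.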

Andriamasy et al. [9] consider a class of closely related nonparametric production
models built on the so-called Max-Plus algebra. Let us consider the semi-ring
$\mathbb{R}_{\max} = (\mathbb{R} \cup \{-\infty\}, \oplus, \otimes)$ composed of the set $\mathbb{R} \cup \{-\infty\}$, which is endowed with
the maximization operation as addition, $s \oplus t := \max(s, t)$, and the usual addition
operation as multiplication, $s \otimes t := s + t$. $-\infty$ and 0 are, respectively, the neutral
elements of the “addition” ⊕ and the “multiplication” ⊗. One can derive from this
algebraic structure the following production model:

$$T_{\oplus} := \Big\{ (x, y) \in \mathbb{R}^m_+ \times \mathbb{R}^n_+ : x \ge \bigoplus_{k \in K} (z_k \otimes x^k),\ y \le \bigoplus_{k \in K} (z_k \otimes y^k),\ \max_{k \in K} z_k = 0,\ z \in \mathbb{R}^K \Big\}. \tag{42}$$

This model is called a Max-Plus nonparametric estimation of the production
technology. The efficiency of firms can be meaningfully evaluated using the directional
distance function introduced by Chambers et al. [38] for which some closed form
has been provided in Andriamasy et al. [9].
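The Max-Plus operations are simple to reproduce numerically. The following sketch uses hypothetical function names and data, and assumes that a scalar weight acts on a vector by being added to each of its components, so that the weights translate observations rather than scale them.

```python
import numpy as np

def oplus(s, t):
    """Max-Plus addition: s (+) t = max(s, t)."""
    return np.maximum(s, t)

def otimes(s, t):
    """Max-Plus multiplication: s (x) t = s + t."""
    return s + t

def max_plus_combination(z, U):
    """Component-wise maximum over k of the translated vectors z_k + u^k."""
    return np.max(z[:, None] + U, axis=0)

# Two hypothetical input vectors and weights whose maximum is zero.
X = np.array([[2.0, 6.0], [6.0, 2.0]])
z = np.array([0.0, -3.0])

x_ref = max_plus_combination(z, X)   # max((2, 6), (3, -1))
```

Here the second observation is translated by −3 before the component-wise maximum is taken, giving the reference point (3, 6). The neutral elements can be checked in the same way: `oplus(-np.inf, 5.0)` and `otimes(0.0, 5.0)` both return 5.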
Paralleling the standard technology TC,CRS , it is quite natural to define a graph
translation homothetic Max-Plus nonparametric model of the technology. This is
done by dropping the last constraint in equation (42). The following technology
is Max-Plus convex and satisfies a graph translation homothetic (denoted th)
assumption:

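$$T_{\oplus}^{th} := \Big\{ (x, y) \in \mathbb{R}^m_+ \times \mathbb{R}^n_+ : x \ge \bigoplus_{k \in K} (z_k \otimes x^k),\ y \le \bigoplus_{k \in K} (z_k \otimes y^k),\ z \in \mathbb{R}^K \Big\}. \tag{43}$$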

Notice that these types of algebraic structures have more recently been consid-
ered by Baldwin and Klemperer [12] to analyze discrete demand types and to prove
the existence of an equilibrium with indivisibilities.

Preliminary Conclusions

This selection is by definition incomplete and somewhat subjective. For instance,
we ignore Hackman [68, p. 135] who introduces the notion of projective convexity.
As another example, Kleine [80] offers a series of production models with general
or individual bounds on activity levels potentially leading to nonconvexities. Our
limited overview just offers a perspective on a non-negligible literature seeking
alternatives to the convexity axiom.

Conclusions

Section “Technologies and Distance Functions: Basic Definitions” laid the founda-
tions by providing basic definitions of the traditional axioms underlying technolo-
gies and their representation via distance functions. Section “Axiom of Convexity:
Arguments” has focused on existing justifications for the axiom of convexity. Apart
from duality reasons that often seem to be misunderstood, we have stressed the
time divisibility argument and its weakness when indivisibilities also affect the
time dimension (e.g., setup times). Furthermore, we have cited some evidence
that decision-makers often have a hard time understanding the results from convex
analysis and sometimes almost explicitly object to its use.
Section “Nonparametric Nonconvex Technologies and Value Functions: Free
Disposal Assumption and Minimum Extrapolation Principle” started with a discussion
of the nonconvex FDH and its extensions and also their corresponding convex
technologies. The focus was on computational problems related to the need to
solve nonlinear binary mixed integer programs. Three solution strategies were
discussed: (i) BMIP, (ii) LP, and (iii) an implicit enumeration strategy, whereby
the latter turns out to be the most efficient from a computational point of view. The
ensuing discussion of nonconvex economic value functions also touched upon these
computational problems and the same three solution strategies. Thereafter, the focus
moved to some popular efficiency decompositions and the formulation of basic tests
of convexity on the technology and on the cost function.
After this methodological analysis, we switched to an empirical perspective on
the use of FDH and its extensions grouped under four headings: (i) cost functions,
(ii) efficiency decompositions, (iii) productivity growth, and (iv) capacity utilization.
A final subsection discussed a series of methodological refinements of FDH and
its extensions revealing that almost all refined analysis developed for convex
technologies can somehow be transposed to FDH and its extensions.
Section “Mitigating Convexity: A Selection” has offered a selective review of
attempts to mitigate the impact of the convexity axiom while avoiding FDH and its
extensions. We focused extensively on partial convexity, the imposition of Regular
Ultra Passum laws, α-returns to scale, and semilattice structures. This review is
nowhere near complete and reflects our own interests and biases.
An attempt to summarize the current state of affairs may be that the alternatives
to traditional convex technologies have now been around for a decade or so.
Empirical results reveal that convexity matters not only for the technology but also
for economic value functions. The latter may surprise some, but it reveals that the
issue of imposing convexity or not cannot be taken lightly. We consider attempts
to mitigate convexity while steering away from FDH and its extensions not very
successful at the moment. Therefore, unless we manage to renew the axiomatic
foundations of production theory in a fundamental way, it may be hard to ignore
FDH, its extensions, and the associated value functions, and even harder to
ignore the empirical results they yield. An open question is to what extent existing
empirical methodologies need to be re-examined to be able to cope with nonconvexities:
given the local nature of some of the results, new standards may need to be established.
This lack of standards to report nonconvex results, as well as the need to go beyond
traditional convex optimization that is often considered a cornerstone for economic
analysis, may well contribute to the neglect of nonconvex approaches.

References
1. Abad A, Briec W (2019) On the axiomatic of pollution-generating technologies: non-
parametric production analysis. Eur J Oper Res 277(1):377–390
2. Ackerman F (2002) Still dead after all these years: interpreting the failure of general
equilibrium theory. J Econ Methodol 9(2):119–139
3. Adilov G, Yesilce I (2012) $B^{-1}$-convex sets and $B^{-1}$-measurable maps. Numer Funct Anal
Optim 33(2):131–141
4. Afriat S (1972) Efficiency estimation of production functions. Int Econ Rev 13(3):568–598
5. Agrell P, Tind J (2001) A dual approach to nonconvex frontier models. J Prod Anal 16(2):
129–147
6. Aliprantis C, Border K (2006) Infinite dimensional analysis: a Hitchhiker’s guide, 3rd edn.
Springer, Berlin
7. Andriamasy L, Barros C, Liang Q (2014) Technical efficiency of French nuclear energy
plants. Appl Econ 46(18):2119–2126
8. Andriamasy L, Briec W, Mussard S (2017) On some relations between several generalized
convex DEA models. Optimization 66(4):547–570
9. Andriamasy R, Briec W, Solonandrasana B (2017) Tropical production technologies. Pac J
Optim 13(4):683–706
10. Ang F, Kerstens P (2017) Decomposing the Luenberger-Hicks-Moorsteen total factor produc-
tivity indicator: an application to U.S. agriculture. Eur J Oper Res 260(1):359–375
11. Balaguer-Coll M, Prior D, Tortosa-Ausina E (2007) On the determinants of local government
performance: a two-stage nonparametric approach. Eur Econ Rev 51(2):425–451
12. Baldwin E, Klemperer P (2019) Understanding preferences: “Demand Types”, and the
existence of equilibrium with indivisibilities. Econometrica 87(3):867–932
13. Banker R, Charnes A, Cooper W (1984) Some models for estimating technical and scale
inefficiencies in data envelopment analysis. Manag Sci 30(9):1078–1092
14. Barros C, Liang Q, Peypoch N (2013) The efficiency of French regional airports: an inverse
B-convex analysis. Int J Prod Econ 141(1):668–674
15. Barros C, Fujii H, Managi S (2015) How scale and ownership are related to financial
performance? A productivity analysis of the Chinese banking sector. J Econ Struct 4:
article 16
16. Ben-Tal A (1977) On generalized means and generalized convex functions. J Optim Theory
Appl 21(1):1–13
17. Bjurek H (1996) The Malmquist total factor productivity index. Scand J Econ 98(2):303–313
18. Bogetoft P (1996) DEA on relaxed convexity assumptions. Manag Sci 42(3):457–465
19. Bogetoft P, Tama J, Tind J (2000) Convex input and output projections of nonconvex
production possibility sets. Manag Sci 46(6):858–869
20. Boles J (1966) Efficiency squared – efficient computation of efficiency indexes. In: Proceed-
ings of the Annual Meeting (Western Farm Economics Association). Western Agricultural
Economics Association, Washington, pp 137–142
21. Boscolo M, Vincent J (2003) Nonconvexities in the production of timber, biodiversity, and
carbon sequestration. J Environ Econ Manag 46(2):251–268
22. Bouhnik S, Golany B, Passy U, Hackman S, Vlatsa D (2001) Lower bound restrictions on
intensities in data envelopment analysis. J Prod Anal 16(3):241–261
23. Boussemart J-P, Briec W, Peypoch N, Tavéra C (2009) α-returns to scale and multi-output
production technologies. Eur J Oper Res 197(1):332–339
24. Boussemart J-P, Briec W, Leleu H, Ravelojaona P (2019) On estimating optimal α-returns to
scale. J Oper Res Soc 70(1):1–11
25. Briec W, Horvath C (2004) B-convexity. Optimization 52(2):103–127
26. Briec W, Horvath C (2009) A B-convex production model for evaluating performance of
firms. J Math Anal Appl 355(1):131–144
27. Briec W, Kerstens K (2004) A Luenberger-Hicks-Moorsteen productivity indicator: its
relation to the Hicks-Moorsteen productivity index and the Luenberger productivity indicator.
Econ Theory 23(4):925–939
28. Briec W, Kerstens K (2006) Input, output and graph technical efficiency measures on non-
convex FDH models with various scaling laws: an integrated approach based upon implicit
enumeration algorithms. TOP 14(1):135–166
29. Briec W, Liang Q (2011) On some semilattice structures for production technologies. Eur J
Oper Res 215(3):740–749
30. Briec W, Kerstens K, Vanden Eeckaut P (2004) Non-convex technologies and cost
functions: definitions, duality and nonparametric tests of convexity. J Econ 81(2):
155–192
31. Briec W, Kerstens K, Van de Woestyne I (2016) Congestion in production correspondences.
J Econ 119(1):65–90
32. Briec W, Kerstens K, Van de Woestyne I (2018) Hypercongestion in production correspon-
dences: an empirical exploration. Appl Econ 50(27): 2938–2956
33. Brokken R (1977) The case of a queer isoquant: increasing marginal rates of substitution of
grain for roughage in cattle finishing. West J Agric Econ 1(1):221–224
34. Cesaroni G, Giovannola D (2015) Average-cost efficiency and optimal scale sizes in non-
parametric analysis. Eur J Oper Res 242(1):121–133
35. Cesaroni G, Kerstens K, Van de Woestyne I (2017) Global and local scale characteristics in
convex and nonconvex nonparametric technologies: a first empirical exploration. Eur J Oper
Res 259(2):576–586
36. Cesaroni G, Kerstens K, Van de Woestyne I (2017) A new input-oriented plant capacity
notion: definition and empirical comparison. Pac Econ Rev 22(4):720–739
37. Chambers R (2002) Exact nonradial input, output, and productivity measurement. Econ
Theory 20(4):751–765
38. Chambers R, Chung Y, Färe R (1998) Profit, directional distance functions, and Nerlovian
efficiency. J Optim Theory Appl 98(2):351–364
39. Charnes A, Cooper W, Rhodes E (1978) Measuring the efficiency of decision making units.
Eur J Oper Res 2(6):429–444
40. Charnes A, Cooper W, Seiford L, Stutz J (1982) A multiplicative model for efficiency
analysis. Socio-Econ Plan Sci 16(5):223–224
41. Chavas J-P, Briec W (2012) On economic efficiency under non-convexity. Econ Theory
50(3):671–701
42. Chavas J, Kim K (2015) Nonparametric analysis of technology and productivity under non-
convexity: a neighborhood-based approach. J Prod Anal 43(1):59–74
43. Clarke F (1983) Optimization and nonsmooth analysis. Wiley, New York
44. Cook W, Seiford L (2009) Data envelopment analysis (DEA) – thirty years on. Eur J Oper
Res 192(1):1–17
45. Cummins D, Zi H (1998) Comparison of frontier efficiency methods: an application to the
U.S. life insurance industry. J Prod Anal 10(2):131–152
46. Dasgupta P, Mähler K-G (2003) The economics of non-convex ecosystems: introduction.
Environ Resour Econ 26(4):499–525
47. De Borger B, Kerstens K (1996) Cost efficiency of Belgian local governments: a comparative
analysis of FDH, DEA, and econometric approaches. Reg Sci Urban Econ 26(2):145–170
48. De Borger B, Ferrier G, Kerstens K (1998) The choice of a technical efficiency measure on
the free disposal hull reference technology: a comparison using US banking data. Eur J Oper
Res 105(3):427–446
49. Deprins D, Simar L, Tulkens H (1984) Measuring labor efficiency in post offices. In:
Marchand M, Pestieau P, Tulkens H (eds) The performance of public enterprises: concepts
and measurements, pp 243–268. North Holland, Amsterdam
50. Diewert W, Parkan C (1983) Linear programming test of regularity conditions for production
functions. In: Eichhorn W, Neumann K, Shephard R (eds) Quantitative studies on production
and prices, pp 131–158. Physica-Verlag, Würzburg
51. Ebrahimnejad A, Shahverdi R, Balf F, Hatefi M (2013) Finding target units in FDH model by
least-distance measure model. Kybernetika 49(4):619–635
52. Ehrgott M, Tind J (2009) Column generation with free replicability in DEA. Omega 37(5):
943–950
53. Epstein M, Henderson J (1989) Data envelopment analysis for managerial control and
diagnosis. Decis Sci 20(1):90–119
54. Färe R (1988) Fundamentals of production theory. Springer, Berlin
55. Färe R, Li S-K (1998) Inner and outer approximations of technology: a data envelopment
approach. Eur J Oper Res 105(3):622–625
56. Färe R, Grosskopf S, Lovell C (1983) The structure of technical efficiency. Scand J Econ
85(2):181–190
57. Färe R, Grosskopf S, Lovell C (1985) The measurement of efficiency of production. Kluwer,
Boston
58. Färe R, Grosskopf S, Njinkeu D (1988) On piecewise reference technologies. Manag Sci
34(12):1507–1511
59. Färe R, Grosskopf S, Valdmanis V (1989) Capacity, competition and efficiency in hospitals:
a nonparametric approach. J Prod Anal 1(2):123–138
60. Färe R, Grosskopf S, Norris M, Zhang Z (1994) Productivity growth, technical progress, and
efficiency change in industrialized countries. Am Econ Rev 84(1):66–83
61. Farrell M (1957) The measurement of productive efficiency. J R Stat Soc Ser A General
120(3):253–281
62. Farrell M (1959) The convexity assumption in the theory of competitive markets. J Polit Econ
67(4):377–391
63. Fried H, Lovell C, Turner J (1996) An analysis of the performance of university affiliated
credit unions. Comput Oper Res 23(4):375–384
64. Fukuyama H, Hougaard J, Sekitani K, Shi J (2016) Efficiency measurement with a non-
convex free disposal hull technology. J Oper Res Soc 67(1):9–19
65. Goncalves O, Liang Q, Peypoch N (2012) Technical efficiency measurement and inverse B-
convexity: Moroccan travel agencies. Tour Econ 18(3):597–606
66. Green R, Cook W (2004) A free coordination hull approach to efficiency measurement. J
Oper Res Soc 55(10):1059–1063
67. Grifell-Tatjé E, Kerstens K (2008) Incentive regulation and the role of convexity in
benchmarking electricity distribution: economists versus engineers. Ann Public Cooperative
Econ 79(2):227–248
68. Hackman S (2008) Production economics: integrating the microeconomic and engineering
perspectives. Springer, Berlin
69. Halme M, Korhonen P, Eskelinen J (2014) Non-convex Value efficiency analysis and its
application to bank branch sales evaluation. Omega 48:10–18
70. Jacobsen S (1970) Production correspondences. Econometrica 38(5):754–771
71. Johansen L (1987) Production functions and the concept of capacity. In: Førsund F (ed)
Collected works of Leif Johansen, vol 1. North Holland, Amsterdam, pp 359–382
72. Kerstens K, Managi S (2012) Total Factor productivity growth and convergence in the
petroleum industry: empirical analysis testing for convexity. Int J Prod Econ 139(1):
196–206
73. Kerstens K, Vanden Eeckaut P (1999) Estimating returns to scale using nonparametric
deterministic technologies: a new method based on goodness-of-fit. Eur J Oper Res
113(1):206–214
74. Kerstens K, Van de Woestyne I (2014a) Comparing Malmquist and Hicks-Moorsteen
productivity indices: exploring the impact of unbalanced vs. balanced panel data. Eur J Oper
Res 233(3):749–758
75. Kerstens K, Van de Woestyne I (2014b) Solution methods for nonconvex free disposal hull
models: a review and some critical comments. Asia-Pac J Oper Res 31(1)
76. Kerstens K, Van de Woestyne I (2018) Enumeration algorithms for FDH directional distance
functions under different returns to scale assumptions. Ann Oper Res 271(2):1067–1078
77. Kerstens K, Squires D, Vestergaard N (2005) Methodological reflections on the short-run
Johansen industry model in relation to capacity management. Mar Res Econ 20(4):425–443
78. Kerstens K, Sadeghi J, Van de Woestyne I (2019a) Convex and nonconvex input-oriented
technical and economic capacity measures: an empirical comparison. Eur J Oper Res
276(2):699–709
79. Kerstens K, Sadeghi J, Van de Woestyne I (2019b) Plant capacity and attainability: exploration
and remedies. Oper Res 67(4):1135–1149
80. Kleine A (2004) A general model framework for DEA. Omega 32(1):17–23
81. Krivonozhko V, Lychev A (2017a) Algorithms for construction of efficient frontier for
nonconvex models on the basis of optimization methods. Dokl Math 96(2):541–544
82. Krivonozhko V, Lychev A (2017b) Frontier visualization for nonconvex models with the use
of purposeful enumeration methods. Dokl Math 96(3):650–653
83. Krivonozhko V, Lychev A (2019) Frontier visualization and estimation of returns to scale in
free disposal hull models. Comput Math Math Phys 59(3):501–511
84. Krivonozhko V, Lychev A, Blokhina N (2019) Construction of three-dimensional sections of
the efficient frontier for non-convex models. Doklady Math 100(2):472–475
85. Leleu H (2006) A linear programming framework for free disposal hull technologies and cost
functions: primal and dual models. Eur J Oper Res 168(2):340–344
86. Leleu H (2009) Mixing DEA and FDH models together. J Oper Res Soc 60(12):1730–1737
87. Leleu H, Moises J, Valdmanis V (2012) Optimal productive size of hospital’s intensive care
units. Int J Prod Econ 136(2):297–305
88. Lovell C, Vanden Eeckaut P (1994) Frontier tales: DEA and FDH. In: Diewert W, Spremann
K, Stehlings F (eds) Mathematical modelling in economics: essays in honor of Wolfgang
Eichhorn. Springer, Berlin, pp 446–457
89. Mairesse F, Vanden Eeckaut P (2002) Museum assessment and FDH technology: towards a
global approach. J Cult Econ 26(4):261–286
90. Mayston D (2014) Effectiveness analysis of quality achievements for university departments
of economics. Appl Econ 46(31):3788–3797
91. Mostafaee A, Soleimani-Damaneh M (2020a) Closed form of the response function in FDH
technologies: theory, computation and application. RAIRO-Oper Res 54(1):53–68
92. Mostafaee A, Soleimani-Damaneh M (2020b) Global sub-increasing and global sub-
decreasing returns to scale in free disposal hull technologies: definition, characterization and
calculation. Eur J Oper Res 280(1):230–241
93. O’Neill R, Sotkiewicz P, Hobbs B, Rothkopf M, Stewart W (2005) Efficient market-clearing
prices in markets with nonconvexities. Eur J Oper Res 164(1):269–285
94. Olesen O, Petersen N (2013) Imposing the regular ultra passum law in DEA models. Omega
41(1):16–27
95. Olesen O, Ruggiero J (2014) Maintaining the regular ultra passum law in data envelopment
analysis. Eur J Oper Res 235(3):798–809
96. Parkan C (1987) Measuring the efficiency of service operations: an application to bank
branches. Eng Cost Prod Econ 12(1–4):237–242
97. Petersen N (1990) Data envelopment analysis on a relaxed set of assumptions. Manag Sci
36(3):305–314
98. Podinovski V (2004a) Local and global returns to scale in performance measurement. J Oper
Res Soc 55(2):170–178
99. Podinovski V (2004b) On the linearisation of reference technologies for testing returns to
scale in FDH models. Eur J Oper Res 152(3):800–802
100. Podinovski V (2005) Selective convexity in DEA models. Eur J Oper Res 161(2):552–563
101. Portela M, Borges P, Thanassoulis E (2003) Finding closest targets in non-oriented DEA
models: the case of convex and non-convex technologies. J Prod Anal 19(2–3):251–269
102. Post T (2001) Estimating Non-convex production sets – imposing convex input sets and output
sets in data envelopment analysis. Eur J Oper Res 131(1):132–142
103. Ravelojaona P (2019) On Constant Elasticity of Substitution – Constant Elasticity of
Transformation directional distance functions. Eur J Oper Res 272(2):780–791
104. Ray S (2004) Data envelopment analysis: theory and techniques for economics and operations
research. Cambridge University Press, Cambridge
105. Rockafellar R, Wets R-B (1998) Variational analysis. Springer, Berlin
106. Romer P (1990) Are nonconvexities important for understanding growth? Am Econ Rev
80(2):97–103
107. Samuelson PA, Swamy S (1974) Invariant economic index numbers and canonical duality:
survey and synthesis. Am Econ Rev 64(4):566–593
108. Scarf H (1977) An observation on the structure of production sets with indivisibilities. Proc
Natl Acad Sci 74(9):3637–3641
109. Scarf H (1981a) Production sets with indivisibilities Part I: generalities. Econometrica
49(1):1–32
110. Scarf H (1981b) Production sets with indivisibilities Part II: the case of two activities.
Econometrica 49(2):395–423
111. Scarf H (1986a) Neighborhood systems for production sets with indivisibilities. Econometrica
54(3):507–532
112. Scarf H (1986b) Testing for optimality in the absence of convexity. In: Heller W, Starr R,
Starrett S (eds) Social choice and public decision making: essays in honor of Kenneth J.
Arrow, vol I. Cambridge University Press, Cambridge, pp 117–134
113. Scarf H (1994) The allocation of resources in the presence of indivisibilities. J Econ Perspect
8(4):111–128
114. Seiford M, Zhu J (1999) An investigation of returns to scale in data envelopment analysis.
Omega 27(1):1–11
115. Seitz W (1971) Productive efficiency in the steam-electric generating industry. J Polit Econ
79(4):878–886
116. Shephard R (1970) Theory of cost and production functions. Princeton University Press,
Princeton
117. Shephard R (1974) Indirect production functions. Verlag Anton Hain, Meisenheim am Glan
118. Soleimani-damaneh M (2013) An enumerative algorithm for solving nonconvex dynamic
DEA models. Optim Lett 7(1):101–115
119. Soleimani-damaneh M, Mostafaee A (2015) Identification of the anchor points in FDH
models. Eur J Oper Res 246(3):936–943
120. Stroobants J, Bouckaert G (2014) Benchmarking local public libraries using non-parametric
frontier methods: a case study of Flanders. Libr Inf Sci Res 36(3–4):211–224
121. Tavakoli I, Mostafaee A (2019) Free disposal hull efficiency scores of units with network
structures. Eur J Oper Res 277(3):1027–1036
122. Tulkens H (1993) On FDH Efficiency analysis: some methodological issues and applications
to retail banking, courts, and urban transit. J Prod Anal 4(1–2):183–210
123. Varian H (1984) The nonparametric approach to production analysis. Econometrica
52(3):579–597
124. Viton P (2007) Cost efficiency in US air carrier operations, 1970–1984: a comparative study.
Int J Transp Econ 34(3):369–401
125. Walden J, Tomberlin D (2010) Estimating fishing vessel capacity: a comparison of nonparametric
frontier approaches. Mar Res Econ 25(1):23–36
19 Index Numbers and Productivity Measurement
D. S. Prasada Rao

Contents
Introduction
Notation and Preliminaries
  Technology, Output, and Input Distance Functions
  Regularity Conditions R.1
Productivity Measurement: The Case of Single Output and Single Input
  Absolute Versus Relative Measures
  Decomposition of Productivity Change: Single Input and Single Output Case
Multiple Outputs and Inputs: The Index Number Problem
  What Are Index Numbers?
  Measuring Quantity Aggregates
  Measures of Output and Input Quantity Change as an Aggregate of Commodity-Specific Changes
  Decomposition of Changes in Revenues and Costs
Index Numbers Based on Quantity Aggregates
  Specification of Functional Form for the Output Aggregates
  Specification of Functional Forms for Input Aggregates
Index Number Approach to Measuring Quantity Change
  Direct Approach to Quantity Index Numbers
  Indirect Measures of Quantity Change
  Direct Versus Indirect Measures of Quantity Change: Which One to Use?
Axiomatic Approach to Index Numbers
  Notation for the Axiomatic Approach
  Axioms and Discussion
Economic Theoretic Approach to Output and Input Quantity Index Numbers
  Notation and Basic Framework
  Economic-Theoretic Approaches to Measurement of Output Quantity Change
  Direct Measures of Quantity Change in the Presence of Price Data
  Indirect Output Quantity Index Numbers Using Output Price Index Numbers
  Indirect Output Quantity Index
  Direct Quantity Index Based on Malmquist Distance Function
  Input Quantity Index Numbers
  Summary
Special Topics
  Use of Quantity Aggregates to Measure Quantity Change
  Transitivity and Quantity Index Numbers
Conclusion
References

D. S. Prasada Rao
School of Economics, The University of Queensland, Brisbane St. Lucia, QLD, Australia
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_8

Abstract

The chapter provides an overview of the index number methods and approaches
for measuring changes in output and input quantities as well as measuring
changes in output and input prices. The problem of generalization to the case of
multiple outputs and inputs is discussed. Two major alternative approaches to this
problem form the core of this chapter. The first approach uses a framework where
aggregates of output and input quantities are first computed and the resulting
aggregates are used to measure changes in output and input quantities. Issues
relating to the choice of functional form for the aggregates are discussed. The
second approach is the standard index number approach where index number
formulae are used to directly measure changes in output and input quantities
without having to first measure quantity aggregates. Index number formulae
used in direct and indirect measurement of quantity change are discussed. The
axiomatic and economic theoretic approaches to index numbers along with
their implications for the choice of an appropriate index number formula are
described. The problem of multilateral comparisons, the transitivity requirement,
implications of transitivity for the choice of appropriate formulae along with a
few recommendations to the practitioner are presented in the last section.

Keywords

Output and input aggregates · Index numbers · Axiomatic approach ·
Economic theoretic approach · Laspeyres · Paasche · Fisher and Törnqvist ·
Lowe and Young indices · Transitivity

Introduction

Index number theory and practice are important tools used by researchers, analysts
and policy makers at the national and international level. Most national statistical
offices around the world compile measures of changes in consumer prices, which
are published in the form of a consumer price index (CPI) on a monthly, quarterly
and annual basis. Central banks in different countries use measures of core inflation
for the purpose of setting interest rates in their macroeconomic policy making.
Economists rely on regular updates on macroeconomic aggregates such as gross
domestic product (GDP) at current and constant prices. Users are also familiar
with indices like the Dow Jones Index, which measures changes in stock market prices
for 30 large companies listed on stock exchanges in the United States, and the
human development index (HDI) published regularly in the United Nations’ Human
Development Report. The HDI tracks performance of nations in three different
dimensions: income, health as measured by life expectancy, and education. The
objective of this chapter is to provide an overview of index number theory and
practice as it relates to productivity measurement.
Index numbers have a long history and tradition. William Fleetwood was one
of the earliest (1707) to use the idea of index numbers and measures of price
change, concluding that what five pounds bought in the fifteenth century would
cost around 30 pounds at the beginning of the eighteenth century. His conclusion
was based on prices of essentials like bread, drink, meat, cloth, and books. Since
then contributions in this area have been steady and research on index numbers for
measuring temporal and spatial price changes is continuing. Some of the earliest
known index numbers are due to Dutot, Carli, Young, Lowe, and Jevons, as well as
Laspeyres [39], Paasche [49], Fisher [28], and Törnqvist [62]. Most of these index numbers are
still in vogue and play a central role in the construction of price and quantity index
numbers. For a comprehensive account of the history of index numbers, the reader
is referred to Diewert [17] and to a more recent account in Balk [6].
Index numbers can be used to measure price and quantity changes from the
perspective of a consumer or a producer. On the consumer side, emphasis is on
the construction of the consumer price index or cost-of-living index numbers. The
economic theoretic approach employed in this context necessarily depends on the
theory of consumer behavior. The Konus [37] and the Allen [2] indices are two
theoretical index numbers used for making price and real expenditure comparisons.
The axiomatic approach to index numbers, in contrast, focuses on the properties
expected of price and quantity index numbers.
On the production side, the scope of index numbers is somewhat wider as there
are data on output prices and quantities as well as data on input quantities used
in the production process and their prices. This means that index number methods
are needed for compiling output as well as input quantity and price index numbers.
The economic theoretic approach in this case relies on the production technology
and on the revenue-maximizing, cost-minimizing, or profit-maximizing
behavior of the producer. The axiomatic approach for these index numbers, however,
is similar to that used in the consumer price indices and comparisons of real
expenditure. An added dimension on the production side is the need to compile
measures of productivity change and then to identify various components of
productivity change. This chapter will focus purely on index numbers for output
and input level comparisons across firms or movements over time. Measurement
of productivity change and identification of sources of productivity change are
discussed in other contributions in this Handbook.
Index numbers are also used in empirical studies as a tool to reduce dimensionality
in the process of identifying production frontiers. Techniques like data
envelopment analysis (DEA) and stochastic frontier analysis (SFA) make use of
micro- or firm-level data and often suffer from the curse of dimensionality. In order
to be able to apply these methods, index numbers are used in the first stage to
reduce the dimension of the output or input vectors. For example, in the case of
agriculture, agricultural output includes thousands of agricultural commodities. It is
difficult to identify production frontiers with such high-dimensional data using DEA
or SFA. Index numbers are used to reduce the dimension to a manageable number of
composite agricultural commodities like cereals, pulses, vegetables, milk, livestock,
and animal products. Index number techniques used for this purpose are the same as
index numbers used to measure quantity change or to make comparisons of levels
of output at the firm level.
As the current chapter presents only essential elements of index number construc-
tion, readers and practitioners are advised to refer to major reference works like the
ECE-ILO Manual on the Consumer Price Index (2014) [21]1 and the Producer Price
Index (PPI) Manual [35]2 for further details. In addition, there are a large number
of publications, working papers, and economic measurement course materials
available on the website of Erwin Diewert whose pioneering work has helped shape
the modern approach to index numbers.3 As the main focus of this Handbook is on
productivity measurement, a useful reference on various measurement and index
number related issues is OECD [48], Measuring Productivity – OECD Manual:
Measurement of Aggregate and Industry-level Productivity Growth.
This chapter is organized as follows. Section “Notation and Preliminaries”
establishes the notation and preliminaries for the chapter. Section “Productivity
Measurement: The Case of Single Output and Single Input” examines, using the
simple one output and one input case, the link between productivity measurement,
price and quantity movements, profitability ratio, and changes in the terms of
trade. The section also discusses various components of productivity change.
Section “Multiple Outputs and Inputs: The Index Number Problem” focuses on the
multi-output and multi-input case and introduces the notion of fundamental index
number decomposition. Section “Index Numbers Based on Quantity Aggregates”
focuses on measuring quantity change using quantity aggregates. Section “Index
Number Approach to Measuring Quantity Change” presents the standard index

1 The latest version of the CPI Manual can be found at:
www.ilo.org/global/statistics-and-databases/WCMS_331153/lang%2D%2Den/index.htm.
The Manual is an expanded revision of Consumer price indices: An ILO manual, published in 1989.
The current version is largely due to the efforts of the Inter-secretariat Working Group on
Price Statistics (IWGPS) which included the participation of: the International Labour Office
(ILO); the International Monetary Fund (IMF); the Organisation for Economic Co-operation and
Development (OECD); the Statistical Office of the European Communities (Eurostat); the United
Nations Economic Commission for Europe (UNECE); and the World Bank.
2 The PPI Manual can be found at:
www.imf.org/en/Publications/Manuals-Guides/Issues/2016/12/30/Producer-Price-Index-Manual-Theory-and-Practice-16966.
This manual is produced by the Inter-secretariat Working Group on Price Statistics, with the
lead role taken by the International Monetary Fund.
3 Diewert’s website can be accessed through:
https://economics.ubc.ca/faculty-and-staff/w-erwin-diewert/
number theory and practice which attempts to measure quantity change directly.
Section “Axiomatic Approach to Index Numbers” is devoted to the axiomatic
approach to index numbers. Section “Economic Theoretic Approach to Output and
Input Quantity Index Numbers” reviews the economic theoretic approach to index
numbers. The relationship between the theoretical constructs and operational index
number formulae is discussed. The concept of exact and superlative index numbers
is introduced. Section “Special Topics” deals with topics that are of special interest.
In particular the section discusses the transitivity requirement and its implication
for the selection of suitable index number formulae for measuring output and input
quantity change. Section “Conclusion” offers guidance to practitioners regarding the
choice of appropriate index number methodology for productivity measurement.

Notation and Preliminaries

Consider the case where there are $N$ outputs and $K$ inputs indexed by $i = 1, 2, \ldots, N$
and $k = 1, 2, \ldots, K$. Let $\{p_{i\tau}, q_{i\tau};\ i = 1, 2, \ldots, N \text{ and } \tau = s, t\}$ represent,
respectively, output prices and output quantities in periods $s$ and $t$.4 The input prices
and quantities are, respectively, denoted by $\{w_{k\tau}, x_{k\tau};\ k = 1, 2, \ldots, K \text{ and } \tau = s, t\}$.
Let $p$, $w$, $q$ and $x$, respectively, represent vectors of output and input prices and
vectors of output and input quantities. All prices and quantities are assumed to be
strictly positive. Inner products of vectors, say $p$ and $q$ or $w$ and $x$, denoted by $p \cdot q$
and $w \cdot x$, represent the sum of products of elements of the two respective vectors:

$$p \cdot q \equiv \sum_{i=1}^{N} p_i q_i, \qquad w \cdot x \equiv \sum_{k=1}^{K} w_k x_k$$

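The output revenue shares and input cost shares, denoted by $r_{i\tau}$ and $v_{k\tau}$, are
defined by:

$$r_{i\tau} \equiv \frac{p_{i\tau} q_{i\tau}}{\sum_{i=1}^{N} p_{i\tau} q_{i\tau}} = \frac{p_{i\tau} q_{i\tau}}{p_\tau \cdot q_\tau}, \qquad i = 1, 2, \ldots, N;\ \tau = s, t$$

and

$$v_{k\tau} \equiv \frac{w_{k\tau} x_{k\tau}}{\sum_{k=1}^{K} w_{k\tau} x_{k\tau}} = \frac{w_{k\tau} x_{k\tau}}{w_\tau \cdot x_\tau}, \qquad k = 1, 2, \ldots, K;\ \tau = s, t$$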

The revenue and cost shares are all strictly positive, by assumption, and add up
to unity over respective domains.
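
As a minimal sketch of the notation above, the following Python snippet computes the inner product p · q and the revenue shares for a hypothetical three-output firm in a single period. The price and quantity values, and the helper names `inner` and `shares`, are illustrative assumptions and do not come from the chapter.

```python
# Hypothetical output prices and quantities for one period (illustrative values only).
p = [2.0, 5.0, 1.0]    # output prices p_i
q = [10.0, 4.0, 20.0]  # output quantities q_i


def inner(a, b):
    """Inner product: sum of element-wise products of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def shares(prices, quantities):
    """Value shares: each commodity's value divided by the total value."""
    total = inner(prices, quantities)
    return [pi * qi / total for pi, qi in zip(prices, quantities)]


revenue = inner(p, q)  # p . q = 20 + 20 + 20
r = shares(p, q)       # revenue shares r_i, strictly positive and summing to one
```

The same `shares` helper applies unchanged to input prices and quantities, in which case it returns the cost shares.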

4 When we consider comparisons across M firms we use indices j and k (= 1, 2, . . . , M).

Technology, Output, and Input Distance Functions

This chapter covers a number of basic concepts from production economics.

The state of technology, represented by S, is defined as the set of all pairs of input and
output vectors (x, q) such that x can produce q.
Throughout this chapter, where relevant, the production technology is assumed
to satisfy the following set of regularity conditions. A formal treatment of these
conditions can be found in Färe and Primont [26].

Regularity Conditions R.1

• With $N$ outputs and $K$ inputs, the production technology $S$ is a nonempty closed
subset of the nonnegative orthant $\mathbb{R}_+^{N+K}$.
• If $x \gg 0$, then there exists $q \gg 0$ such that $(x, q) \in S$.5 This means positive inputs
lead to positive amounts of outputs.
• Free disposability of inputs and outputs: if $(x_1, q_1) \in S$, $x_2 \geq x_1$, and $q_2 \leq q_1$, then
$(x_2, q_2) \in S$.
• The technology set is convex. That is, if $(x_1, q_1) \in S$ and $(x_2, q_2) \in S$, then for any
$0 \leq \lambda \leq 1$, $(\lambda x_1 + (1 - \lambda) x_2,\ \lambda q_1 + (1 - \lambda) q_2) \in S$.
• For any given input vector $x$, the output set $P(x) = \{q : (x, q) \in S\}$ is closed,
bounded and convex. This means that finite amounts of inputs cannot produce
unlimited outputs.
• For any given output vector $q$, the input set consisting of all input vectors that
can produce $q$, $L(q) = \{x : (x, q) \in S\}$, is a closed set.

The output and input distance functions are used at various places in this
chapter. Shephard (1970) introduced the notion of distance functions, which have
subsequently assumed prominence in efficiency and productivity measurement. The
output distance function measures the radial distance [27] of the observed output
vector from the production possibility frontier determined by the state of
technology and a prespecified input vector. The output distance function is
defined as6:

$$D_o(x, q : S) = \min_{\lambda > 0} \{\lambda \mid (x, q/\lambda) \in S\} = \min_{\lambda > 0} \{\lambda \mid q/\lambda \in P(x)\} \tag{1}$$

5 Here $x \gg y$ means that each element $x_i$ of the vector $x$ is strictly greater than the corresponding
element $y_i$ of the vector $y$. In contrast, $x > y$ means that $x_i \geq y_i$ for all $i$, with strict inequality
holding for at least one $i$. Further, $x \geq y$ simply means that $x_i \geq y_i$ for all $i$.
6 This is not as rigorous a definition as one would find in standard expositions on this subject (see
[26]), but it suffices for the purposes of this chapter.
The input distance function is similarly defined:

$$D_i(x, q : S) = \max_{\rho > 0} \{\rho \mid (x/\rho, q) \in S\} = \max_{\rho > 0} \{\rho \mid x/\rho \in L(q)\} \tag{2}$$

The output and input distance functions play a critical role in the construction
of output and input quantity index numbers. See Coelli et al. [12] and Sickles and
Zelenyuk [55] for more details and extensive discussion of these distance functions.
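
To make definitions (1) and (2) concrete, the sketch below evaluates both distance functions numerically for a hypothetical single-input, single-output technology S = {(x, q): q ≤ √x}. This technology, the feasibility test `in_S`, and the bisection routines are assumptions chosen for illustration; they are not taken from the chapter. For this technology the closed forms are q/√x for the output distance and x/q² for the input distance, which the numerical search should approximately reproduce.

```python
import math


def f(x):
    """Assumed frontier: the maximum output obtainable from input x (hypothetical)."""
    return math.sqrt(x)


def in_S(x, q):
    """Feasibility test for the assumed technology S = {(x, q): q <= f(x)}."""
    return q <= f(x) + 1e-12  # small tolerance for floating-point rounding


def output_distance(x, q, lo=1e-9, hi=1e9, iters=200):
    """Smallest lam > 0 with (x, q / lam) in S, located by bisection on a log scale."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # geometric midpoint suits a wide positive range
        if in_S(x, q / mid):
            hi = mid  # feasible: a smaller lam may still work
        else:
            lo = mid  # infeasible: lam must be larger
    return hi


def input_distance(x, q, lo=1e-9, hi=1e9, iters=200):
    """Largest rho > 0 with (x / rho, q) in S, located by bisection on a log scale."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if in_S(x / mid, q):
            lo = mid  # feasible: a larger rho may still work
        else:
            hi = mid  # infeasible: rho must be smaller
    return lo


d_o = output_distance(4.0, 1.0)  # closed form: q / sqrt(x)
d_i = input_distance(4.0, 1.0)   # closed form: x / q**2
```

In this example the observed point (x, q) = (4, 1) lies inside the assumed technology, so the output distance is below one and the input distance is above one; both would equal one for a point on the frontier.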

Productivity Measurement: The Case of Single Output and Single Input

Consider the simplest case where a firm produces only one commodity using a
single input. Let q and x represent, respectively, the quantities of output produced
and input used in the production process. In addition, suppose the output and input
prices are known and represented by p and w, respectively. In this case the following
measures of performance of the firm can be used. The first measure of productivity,
denoted by PROD, is simply the ratio of output to input:

$$\mathit{PROD} = \frac{\text{output}}{\text{input}} = \frac{q}{x} \tag{3}$$

This measure shows the number of units of output produced per unit of input.
An alternative measure of performance can be defined using the profitability ratio,
denoted by PFR, which shows the amount of revenue generated per unit of cost. It
is given by

$$\mathit{PFR} = \frac{\text{total revenue}}{\text{total cost}} = \frac{p \cdot q}{w \cdot x} \tag{4}$$

Both of these measures can be computed directly using data on output, input,
and the respective output and input prices. These measures are intuitive and are of
practical importance.

Absolute Versus Relative Measures

Measures in Eqs. (3) and (4) are both simple real numbers and are not informative
unless they are compared with the performance of another firm or compared over
time for the same firm. For purposes of illustration, consider productivity and
performance of a given firm in two periods, t and t + 1. Then

$$PROD_t = \frac{q_t}{x_t}, \qquad PROD_{t+1} = \frac{q_{t+1}}{x_{t+1}}.$$
762 D. S. Prasada Rao

Productivity change over time, denoted by PRODt,t+1, is the ratio of productivity levels in these two periods:

$$PROD_{t,t+1} = \frac{PROD_{t+1}}{PROD_t} = \frac{q_{t+1}/x_{t+1}}{q_t/x_t} = \frac{q_{t+1}/q_t}{x_{t+1}/x_t} = \frac{\text{output growth}}{\text{input growth}} \tag{5}$$

In this simple case, productivity growth over the period is equivalent to output
growth discounted by input growth.7
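Now consider the firm’s performance as measured by the change in the profitability ratio over the two
periods:

$$PFR_t = \frac{p_t q_t}{w_t x_t}, \qquad PFR_{t+1} = \frac{p_{t+1} q_{t+1}}{w_{t+1} x_{t+1}}$$

$$PFR_{t,t+1} = \frac{PFR_{t+1}}{PFR_t} = \frac{p_{t+1} q_{t+1} / w_{t+1} x_{t+1}}{p_t q_t / w_t x_t} = \frac{q_{t+1}/q_t}{x_{t+1}/x_t} \times \frac{p_{t+1}/p_t}{w_{t+1}/w_t} = \text{Productivity growth} \times \text{Change in terms of trade} \tag{6}$$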

Equation (6) is important from a producer’s viewpoint as it implies that changes
in profitability are driven by changes in productivity (ability to produce more output
with a given input) as well as changes in terms of trade measured by movements
in output prices relative to input prices. This, in turn, implies that profitability can
change even if productivity remains the same or, alternatively, profitability could
fall even when there are improvements in productivity.
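
The decomposition in Eq. (6) can be checked numerically. The sketch below uses made-up prices and quantities (all values are assumptions) for a case in which productivity rises by 20% while the output price falls, so that profitability declines despite the productivity improvement.

```python
def prod(q, x):
    """Productivity level, Eq. (3): output per unit of input."""
    return q / x


def pfr(p, q, w, x):
    """Profitability ratio, Eq. (4): revenue per unit of cost."""
    return (p * q) / (w * x)


# Illustrative data (assumptions): period t, then period t + 1
q0, x0, p0, w0 = 100.0, 50.0, 4.0, 5.0
q1, x1, p1, w1 = 120.0, 50.0, 3.0, 5.0

prod_change = prod(q1, x1) / prod(q0, x0)                  # Eq. (5): 1.2
terms_of_trade = (p1 / p0) / (w1 / w0)                     # 0.75
pfr_change = pfr(p1, q1, w1, x1) / pfr(p0, q0, w0, x0)     # Eq. (6): 0.9
```

With these numbers the profitability ratio falls by 10% even though productivity grows by 20%, because the terms of trade deteriorate by 25%.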

Decomposition of Productivity Change: Single Input and Single Output Case

Even though this chapter is about index numbers and measurement of output
and input quantity change and productivity change, it is useful to examine the
components or drivers of productivity change in this simplest case where a single
input is used in producing a single commodity. The discussion below draws material
from Section “Specification of Functional Forms for Input Aggregates” of Coelli
et al. [11], pp. 100–103. Following Eq. (5), productivity change from period s to t is
given by

$$PROD_{st} = \frac{q_t/q_s}{x_t/x_s} = \frac{\text{output change}}{\text{input change}}$$

7 A similar interpretation holds when two firms A and B are compared instead of a single firm
compared over two periods. In that case, these ratios represent relative levels rather than growth.

Suppose production is governed by production technologies in periods s and
t which are represented by production functions fs(x) and ft(x), respectively. Under
the assumption that the observed output quantities are the maximum possible under
the technologies existing in these periods, and hence that the firm is technically efficient in
both periods, the productivity index in (5) can be written as:

$$PROD_{st} = \frac{q_t/x_t}{q_s/x_s} = \frac{f_t(x_t)/x_t}{f_s(x_s)/x_s} \tag{7}$$

However, if the firm is technically inefficient in both periods then the observed
production is only a fraction of the maximum feasible indicated by the production
function. In this case

$$q_\tau = \delta_\tau \times f_\tau(x_\tau) \quad \text{for } \tau = s, t, \text{ where } 0 \le \delta_\tau \le 1 \tag{8}$$

Here δs and δt represent levels of technical efficiency. Substituting (8) into (7) we
have

$$PROD_{st} = \frac{\delta_t}{\delta_s} \times \frac{f_t(x_t)/x_t}{f_s(x_s)/x_s} \tag{9}$$

The first part of the right hand side of Eq. (9) represents the efficiency change
component of productivity change. Suppose, further, that the quantity of input used in
the two periods is the same, with xt = xs = x∗. Then Eq. (9) simplifies to:

$$PROD_{st} = \frac{\delta_t}{\delta_s} \times \frac{f_t(x^*)}{f_s(x^*)} \tag{10}$$

Equation (10) simply states that in the absence of input change, productivity
growth is driven by two factors: technical efficiency change and technological
change.
In practice, input quantities do change and hence xt ≠ xs. In the case of a single
input, the input in period t can be written as a scalar multiple of input in period s.
Therefore, we have xt = κxs where κ > 0. In this case, Eq. (9) can be written as:

$$PROD_{st} = \frac{\delta_t}{\delta_s} \times \frac{f_t(\kappa x_s)/x_t}{f_s(x_s)/x_s} = \frac{\delta_t}{\delta_s} \times \frac{f_t(\kappa x_s)/\kappa x_s}{f_s(x_s)/x_s} = \frac{\delta_t}{\delta_s} \times \frac{f_t(\kappa x_s)}{f_s(x_s)} \times \frac{1}{\kappa} \tag{11}$$

This means that the scale of input usage has changed by a factor of κ. Suppose
the production function in period t is homogeneous of degree ε(t), so that
ft(κxs) = κ^{ε(t)} ft(xs). Then Eq. (11) can be written as:

$$PROD_{st} = \frac{\delta_t}{\delta_s} \times \kappa^{\varepsilon(t)-1} \times \frac{f_t(x_s)}{f_s(x_s)} \tag{12}$$

Equation (12) identifies three drivers of productivity change: (i) the first component
on the right hand side of Eq. (12), δt/δs, shows technical efficiency change; (ii) the second
component, κ^{ε(t)−1}, shows the effect of change in the scale of operations and thus represents
the scale effect; and (iii) the last component, ft(xs)/fs(xs), shows the contribution of technical
change.
A few comments are notable in this context. First, if the production function
exhibits constant returns to scale in period t then productivity change is driven solely
by technical change and technical efficiency change. Second, measures of technical
change and the scale effect in Eq. (12) are conditional on expressing period t input as
a multiple of period s input. A different answer results if period s input is expressed
as a multiple of period t input. It is likely that measures of both the scale effect
and technical change are different. This difference vanishes if technology in both
periods exhibits constant returns to scale.
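
As a numerical check of the three-way decomposition in Eq. (12), the sketch below assumes a hypothetical power-function technology f(x) = a·x^ε in each period, which is homogeneous of degree ε. The parameter values, efficiency levels, and variable names are all assumptions made for illustration.

```python
def make_technology(a, eps):
    """Hypothetical production function f(x) = a * x**eps (homogeneous of degree eps)."""
    return lambda x: a * x ** eps


eps_t = 0.8                                   # degree of homogeneity in period t
f_s = make_technology(a=1.0, eps=0.8)         # period-s technology (assumed)
f_t = make_technology(a=1.1, eps=eps_t)       # period-t technology (assumed)

delta_s, delta_t = 0.80, 0.90                 # technical efficiency levels, Eq. (8)
x_s, kappa = 10.0, 1.5
x_t = kappa * x_s                             # input scaled by kappa

q_s = delta_s * f_s(x_s)
q_t = delta_t * f_t(x_t)
prod_change = (q_t / x_t) / (q_s / x_s)       # observed productivity change

efficiency_change = delta_t / delta_s         # technical efficiency change
scale_effect = kappa ** (eps_t - 1.0)         # scale effect (below one here)
technical_change = f_t(x_s) / f_s(x_s)        # technical change at period-s input
```

Because the assumed technology exhibits decreasing returns to scale (ε = 0.8 < 1), expanding the input by κ = 1.5 produces a scale effect below one, which partly offsets the gains from efficiency change and technical change.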
In the more general and realistic case where a vector of outputs is produced
using a vector of inputs – the case of multiple inputs and outputs – it is necessary
to consider the contribution of changes in the output mix and the input mix to
the observed productivity change. There is a vast literature on the decomposition
of productivity change and on identifying the contribution of various drivers of
productivity change. This is the subject matter of other chapters in this Handbook.

Multiple Outputs and Inputs: The Index Number Problem

Consider the more general and realistic case where a firm produces N outputs using
K inputs.8 In addition there are input and output data for two periods s and t (or
for two firms).9 In this case, let (qs , qt ); (xs , xt ); (ps , pt ); (ws , wt ) represent vectors of
output and input quantities and prices, respectively, in periods s and t.
The problem is how to generalize the measures of productivity and profitability
defined in Eqs. (3) and (4) and to measure changes in productivity and profitability.
Index numbers can be used in deriving suitable measures in this multi-output and
multi-input case.

What Are Index Numbers?

The standard notion of index number used in the discipline of economic statistics is
that an index number is a real number which measures changes in a set of related
variables. This is the concept that underpins the construction of consumer and
producer price index numbers as well as standard price and quantity index numbers.

8 Inputs may be classified as intermediate inputs and primary inputs (labor and capital) or as
KLEMS representing capital, labor, energy, materials and service inputs. This kind of distinction
is important in the case of measuring productivity within different sectors of the economy, e.g.,
agriculture, manufacturing etc. However, for the purpose of this chapter this distinction is not
critical.
9 For purposes of exposition the case of two periods is considered but all the results and
considerations equally apply to the case of two firms.

An alternative concept of index number could be that it is a real number that
measures the size of a given vector of variables. This concept leads to measures
such as the aggregate or the average size of a vector of variables.
These two alternative notions of index numbers are implicit in the following
description of the index number problem that can be found in Diewert’s [20]
introductory chapter on Index Numbers and Measurement Economics. Diewert states:
“The question that this book addresses is: how exactly should the microeconomic
information involving possibly millions of prices and quantities be aggregated into
a smaller number of price and quantity variables? This is the basic index number
problem.”10
In the literature on productivity measurement, the traditional approach has been
to use index numbers as measures of change in a set of related variables and
therefore index numbers are used as measures of changes in prices and quantities
over time or as a measure of difference in levels in prices and quantities associated
with different firms. This treatment can be found in the seminal works of Diewert
[13], Caves et al. [10], Diewert [16], Balk [6], the CPI and PPI Manuals, and in a
standard textbook treatment provided in Coelli et al. [11, 12].
Based on this discussion, three strands of index number problems relevant for
productivity measurement and performance assessment can be identified.

Measuring Quantity Aggregates

In its simplest form, construction of index numbers based on economic aggregates
involves the computation of output and input aggregates. Let Q and X, respectively,
be real valued functions representing output and input aggregates. These are
expressed as functions of the quantities of outputs produced and inputs used in the
process. Thus,

$$Q = Q(q_1, q_2, \ldots, q_N), \qquad X = X(x_1, x_2, \ldots, x_K) \tag{13}$$

Once the output and input aggregates are constructed, productivity in the
presence of multiple outputs and inputs can simply be measured, similar to the single
output and single input case, as the ratio

$$PROD = \frac{\text{aggregate output}}{\text{aggregate input}} = \frac{Q}{X} = \frac{Q(q_1, q_2, \ldots, q_N)}{X(x_1, x_2, \ldots, x_K)} \tag{14}$$

The output and input quantity indices measuring changes in outputs produced
and inputs used are simply the ratios of the corresponding aggregates. The indices
measuring changes in outputs and inputs are defined as:

10 Reader may find alternative working definitions of index numbers from Balk [6] and Coelli et al.
[11, 12].
$$Q_{st} \equiv \frac{Q(q_t)}{Q(q_s)} = \frac{Q(q_{1t}, q_{2t}, \ldots, q_{Nt})}{Q(q_{1s}, q_{2s}, \ldots, q_{Ns})} \quad \text{and} \quad X_{st} \equiv \frac{X(x_t)}{X(x_s)} = \frac{X(x_{1t}, x_{2t}, \ldots, x_{Kt})}{X(x_{1s}, x_{2s}, \ldots, x_{Ks})} \tag{15}$$

where Qst and Xst , respectively, represent output and input quantity index numbers
measuring the changes in output and input levels from period s to period t.11
Implementation of this approach hinges on how the quantity aggregates in Eq.
(14) are measured. This approach is discussed further in Section “Index Numbers
Based on Quantity Aggregates.”

Measures of Output and Input Quantity Change as an Aggregate of Commodity-Specific Changes

This approach advocates measuring overall change in outputs and inputs directly
using index number methods. Here the required quantity index is measured using
both price and quantity data pertaining to the two periods. The output and input
quantity index numbers are then defined as:

$$Q_{st} \equiv Q_{st}(q_t, p_t; q_s, p_s) \quad \text{and} \quad X_{st} \equiv X_{st}(x_t, w_t; x_s, w_s) \tag{16}$$

where Qst and Xst are real valued functions satisfying a number of properties like
positivity, continuity, monotonicity, and others.12
We discuss a number of index number formulae in the Section “Index Numbers
Based on Quantity Aggregates,” but it is useful to consider the intuition behind the
index number approach. First focus on the output quantity index number measuring
change in the output vector over time. For commodity i (=1, 2, . . . , N), growth
or change in the output, from base year s to comparison period t, is given by the
ratio qit /qis . Given this observed growth in each of these outputs, an overall index
may be defined as a weighted average of the growth in individual items. If weights
reflect the importance of each commodity as measured by the revenue share of the
commodity in period s defined in Section “Notation and Preliminaries,”13 the output
quantity index is given by

11 Index numbers always identify the reference or base period and the comparison or current period.
In Equation (15), periods s and t are the base and current periods for which the index is defined.
12 These properties are usually discussed under the axiomatic approach to index numbers which
is considered further in Section “Index Number Approach to Measuring Quantity Change” of this
chapter.
13 On the other hand, one may choose to use expenditure shares in the current or comparison
period, t.
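$$Q_{st} = \sum_{i=1}^{N} \frac{q_{it}}{q_{is}} \cdot r_{is} = \sum_{i=1}^{N} \frac{q_{it}}{q_{is}} \cdot \frac{p_{is} q_{is}}{\sum_{i=1}^{N} p_{is} q_{is}} = \frac{\sum_{i=1}^{N} q_{it} \cdot p_{is}}{\sum_{i=1}^{N} q_{is} \cdot p_{is}} = \frac{q_t \cdot p_s}{q_s \cdot p_s} \tag{17}$$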

This quantity index is generally referred to as the Laspeyres [39] index.
Several features of this formula are worth noting. First, this index does not require
a quantity aggregate to be defined for each period; instead, it starts with growth in
output of each commodity as building blocks and the index is then defined as a
weighted average of the change in different commodities. Second, the last ratio in
this equation suggests that this index can be interpreted as the ratio of value of output
in periods t and s both evaluated at the output prices in period s, the base period.
Third, weights used in Eq. (17) which are revenue shares in period s can be replaced
by revenue shares in period t or by an average of the shares in both periods. Finally, it
is not necessary that the quantity index makes use of an arithmetic average. One may
use geometric or harmonic means of commodity specific output quantity changes.
A range of index number formulae based on these alternative specifications are
discussed in Section “Index Numbers Based on Quantity Aggregates.”
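
The equivalence of the two readings of Eq. (17), a share-weighted average of quantity relatives and a ratio of values at base-period prices, can be verified with a small sketch. The three-commodity data and the function names below are assumptions made for illustration.

```python
def laspeyres_from_relatives(q_s, q_t, p_s):
    """Base-period revenue-share weighted mean of quantity relatives q_it / q_is."""
    base_revenue = sum(p * q for p, q in zip(p_s, q_s))
    shares = [p * q / base_revenue for p, q in zip(p_s, q_s)]
    return sum(r * (qt / qs) for r, qt, qs in zip(shares, q_t, q_s))


def laspeyres_value_ratio(q_s, q_t, p_s):
    """Value of period-t outputs over value of period-s outputs, both at base prices."""
    value_t = sum(p * q for p, q in zip(p_s, q_t))
    value_s = sum(p * q for p, q in zip(p_s, q_s))
    return value_t / value_s


# Illustrative data (assumptions)
p_s = [2.0, 5.0, 1.0]
q_s = [10.0, 4.0, 20.0]
q_t = [12.0, 5.0, 18.0]

index_a = laspeyres_from_relatives(q_s, q_t, p_s)
index_b = laspeyres_value_ratio(q_s, q_t, p_s)    # 67 / 60
```

Both routes give the same index value of 67/60, roughly an 11.7% increase in output for these data.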

Decomposition of Changes in Revenues and Costs

From the perspective of the firm, performance is reflected in the revenue, costs, and
profits generated out of its operations. In periods t and s, total revenue and costs are
given by:

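$$R_t = \sum_{i=1}^{N} q_{it} \cdot p_{it} = q_t \cdot p_t, \qquad C_t = \sum_{k=1}^{K} x_{kt} \cdot w_{kt} = x_t \cdot w_t$$

$$R_s = \sum_{i=1}^{N} q_{is} \cdot p_{is} = q_s \cdot p_s, \qquad C_s = \sum_{k=1}^{K} x_{ks} \cdot w_{ks} = x_s \cdot w_s \tag{18}$$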

The revenue and cost changes over the two periods are then given by

$$R_{st} = \frac{R_t}{R_s} = \frac{q_t \cdot p_t}{q_s \cdot p_s} \quad \text{and} \quad C_{st} = \frac{C_t}{C_s} = \frac{x_t \cdot w_t}{x_s \cdot w_s} \tag{19}$$

In Eq. (19), total revenues and costs in periods t and s are both observed. The
index number problem is often stated as one of identifying the price change and
quantity change that drive the revenue and cost changes. For example, suppose the
total revenue of a firm has increased from $1500 to $2000 which gives a value of
1.33 for Rst indicating a 33% increase in the revenue. The firm manager might wish
to know how much of this increase is due to increase in output prices and what part
is due to increase in outputs produced. The index number problem is often described
as finding Qst and Pst which respectively represent quantity change and price change
over the period s to t such that the value or revenue change is equal to the product
of price and quantity change

$$R_{st} = \frac{R_t}{R_s} = \frac{q_t \cdot p_t}{q_s \cdot p_s} = Q_{st} \cdot P_{st}. \tag{20}$$

The change in costs from period s to t can be similarly decomposed into input
quantity change and input price change

$$C_{st} = \frac{C_t}{C_s} = \frac{x_t \cdot w_t}{x_s \cdot w_s} = X_{st} \cdot W_{st} \tag{21}$$

This decomposition is often described as a test or an axiom in the index number
literature. The axiomatic or test approach is discussed in Section “Axiomatic
Approach to Index Numbers.”
An important corollary of this decomposition of value change is that, for
example, if revenue change is observed to be 1.33, then estimates of quantity and
price change have to be consistent with this observation. It also implies that, in Eq.
(20), the knowledge of two out of the three entities, Rst , Qst and Pst will determine
the third. For example, if revenue change is observed and if it is easy to compute the
price change, Pst , the quantity change can be simply obtained as:

$$Q_{st} = \frac{R_{st}}{P_{st}} \tag{22}$$

Similarly, changes in input costs can be decomposed into input price change and
input quantity change.
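
The revenue example above ($1500 to $2000) can be carried through Eqs. (20) and (22). The sketch below invents two-commodity price and quantity data that reproduce those revenue totals, and it assumes that price change is measured with a Laspeyres price index; both the data and that choice of price index are assumptions made for illustration.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


# Illustrative data (assumptions) with R_s = 1500 and R_t = 2000
p_s, q_s = [10.0, 5.0], [100.0, 100.0]
p_t, q_t = [12.0, 5.0], [125.0, 100.0]

revenue_change = dot(p_t, q_t) / dot(p_s, q_s)     # R_st = 2000 / 1500
price_change = dot(p_t, q_s) / dot(p_s, q_s)       # Laspeyres price index (assumed choice)
quantity_change = revenue_change / price_change    # Eq. (22)
```

With these numbers the 33% revenue increase splits into a price change of about 13.3% and a quantity change of about 17.6%. When the Laspeyres price index is used, the implied quantity index coincides with the Paasche quantity index, 2000/1700.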
Based on this decomposition, it is possible to decompose change in profitability
ratio. This is given by

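$$PFR_{st} = \frac{PFR_t}{PFR_s} = \frac{R_t/C_t}{R_s/C_s} = \frac{R_t/R_s}{C_t/C_s} = \frac{\text{Revenue change}}{\text{Cost change}} = \frac{Q_{st} \cdot P_{st}}{X_{st} \cdot W_{st}} = \frac{Q_{st}}{X_{st}} \cdot \frac{P_{st}}{W_{st}} = \text{Productivity change} \times \text{Terms of trade change} \tag{23}$$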

Equation (23) is a multi-output and multi-input generalization of the decomposition
of profitability change described in Eq. (6). This equation underscores the important
link between price and quantity index numbers and the value change. Ignoring
this link would create a disconnection between measures of quantity change, price
change, and the measure of value change.
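
Equation (23) holds for any pair of price and quantity indices whose product equals the value change. As a minimal sketch, the code below uses Fisher indices, which are known to have this property, on invented data for two outputs and two inputs. The data, the choice of the Fisher formula, and the function names are assumptions made for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def fisher_quantity(p_s, q_s, p_t, q_t):
    """Geometric mean of the Laspeyres and Paasche quantity indices."""
    laspeyres = dot(p_s, q_t) / dot(p_s, q_s)
    paasche = dot(p_t, q_t) / dot(p_t, q_s)
    return math.sqrt(laspeyres * paasche)


def fisher_price(p_s, q_s, p_t, q_t):
    """Fisher price index, obtained by swapping the roles of prices and quantities."""
    return fisher_quantity(q_s, p_s, q_t, p_t)


# Illustrative data (assumptions): outputs (p, q) and inputs (w, x)
p_s, q_s = [10.0, 5.0], [100.0, 100.0]
p_t, q_t = [12.0, 5.0], [125.0, 100.0]
w_s, x_s = [20.0, 2.0], [30.0, 200.0]
w_t, x_t = [22.0, 2.0], [32.0, 210.0]

pfr_change = (dot(p_t, q_t) / dot(w_t, x_t)) / (dot(p_s, q_s) / dot(w_s, x_s))
productivity_change = fisher_quantity(p_s, q_s, p_t, q_t) / fisher_quantity(w_s, x_s, w_t, x_t)
terms_of_trade_change = fisher_price(p_s, q_s, p_t, q_t) / fisher_price(w_s, x_s, w_t, x_t)
```

The product of the last two quantities reproduces the observed change in the profitability ratio, which is the link between index numbers and value change that Eq. (23) emphasizes.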
Index Numbers Based on Quantity Aggregates

This review of index numbers begins, within the context of measuring productivity
change, with the approach that builds on the notion of output and input quantity
aggregates. Use of quantity aggregates and price aggregates in the context of
index numbers is not new (see [20]). However, this approach has reemerged in the
literature as it has recently been advocated in a series of papers on productivity
measurement by O’Donnell [44–47]. The approach in these papers underpins the
DPIN software developed by O’Donnell [43] which is available through the website
of the Centre for Efficiency and Productivity Analysis (CEPA) (URL: www.uq.edu.au/economics/cepa).
In simple terms, this approach advocates the construction of input and output
aggregates in periods s and t denoted respectively by Q(qs ), Q(qt ), X(xs ), and X(xt ).
The required indices are then given by:

$$\text{Output index} = Q_{st} \equiv \frac{Q(q_t)}{Q(q_s)}; \qquad \text{Input index} = X_{st} \equiv \frac{X(x_t)}{X(x_s)}; \qquad \text{Productivity change} = PROD_{st} \equiv \frac{Q_{st}}{X_{st}} \tag{24}$$

Implementation of this approach requires details as to how these aggregates can
be computed. This in turn requires a description of the basic properties expected
of these aggregates. O’Donnell [43, 46] lists the following properties for the output
aggregates14:
Property 1 Q(.) is a real-valued function which is nonnegative. For any given
output vector which is nonnegative, the aggregate is nonnegative.
Property 2 Output aggregate is nondecreasing in its arguments. If the output of
one of the commodities increases, keeping other outputs fixed at given levels, the
output aggregate must increase or at least remain the same.
Property 3 The output aggregate function is linearly homogeneous. This
means that if all the outputs are multiplied by a positive constant then
the output aggregate is also multiplied by the same constant. That is, for
any λ > 0, Q(λq1, λq2, ..., λqN) = λQ(q1, q2, ..., qN).
These properties may be considered as a minimal set of properties to be satisfied
by output and input aggregates.
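
The three properties can be spot-checked for a candidate aggregator function. The sketch below uses a hypothetical linear aggregator with fixed nonnegative weights; the weights, the test vector, and the names are assumptions. It checks the properties at a single point only and is not a proof.

```python
def linear_aggregate(q, prices=(3.0, 1.5, 4.0)):
    """Hypothetical aggregator: Q(q) = sum_i p_i * q_i with fixed nonnegative prices."""
    return sum(p * qi for p, qi in zip(prices, q))


q = [2.0, 10.0, 1.0]                      # illustrative output vector (assumption)
lam = 2.5
scaled = [lam * qi for qi in q]           # all outputs multiplied by lam
bumped = [q[0] + 1.0] + q[1:]             # raise the first output, hold the others fixed

is_nonnegative = linear_aggregate(q) >= 0.0                                          # Property 1
is_nondecreasing = linear_aggregate(bumped) >= linear_aggregate(q)                   # Property 2
is_homogeneous = abs(linear_aggregate(scaled) - lam * linear_aggregate(q)) < 1e-12   # Property 3
```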

Specification of Functional Form for the Output Aggregates

In order to implement this approach, it is necessary to specify the functional
form for Q(.) and X(.). O’Donnell ([43], p. 3) provides a long list of functional
forms that satisfy the three properties listed above. In defining some specific types
of aggregates, observed price data are used to combine quantities of different
commodities. Thus, the observed price vectors in periods s and t, ps and pt, are
used below. Here s represents the base period (or reference firm) and t represents
the current or comparison period (or firm).

14 Focus here is on the output aggregate and similar properties are expected of the input aggregate.
Laspeyres-type aggregate: Base period prices are used to compute the output
aggregate. It is given by:

$$Q_L(q_\tau) \equiv p_s \cdot q_\tau = \sum_{i=1}^{N} p_{is} \cdot q_{i\tau}, \quad \tau = s, t \tag{25}$$

In this case the output quantity aggregate is simply the total value of the output
in each period s and t evaluated at prices observed in the base period, s.
Paasche-type aggregate: This aggregate uses current or comparison period prices
to compute the output aggregate and it is given by:

$$Q_P(q_\tau) \equiv p_t \cdot q_\tau = \sum_{i=1}^{N} p_{it} \cdot q_{i\tau}, \quad \tau = s, t \tag{26}$$

This aggregate represents the total value of output in periods s and t evaluated at
prices observed in the current period, t.
Fisher-type aggregate: The Fisher aggregate is defined as the geometric mean of
the Laspeyres- and Paasche-type aggregates defined above.

$$Q_F(q_\tau) \equiv \left[(p_s \cdot q_\tau)(p_t \cdot q_\tau)\right]^{0.5} = \left[\sum_{i=1}^{N} p_{is} \cdot q_{i\tau} \cdot \sum_{i=1}^{N} p_{it} \cdot q_{i\tau}\right]^{0.5}, \quad \tau = s, t \tag{27}$$

It is important to note that the Fisher-type aggregate is not derived by evaluating
output quantities at geometric averages of prices in the base and current periods.
The Fisher-type aggregate in (27) may be considered as a symmetric average
of the Laspeyres and Paasche quantity aggregates based on the geometric
average. Thus, it is possible to define alternatives to Fisher-type aggregates by
taking the arithmetic or harmonic averages, leading to [QL(qτ) + QP(qτ)]/2 or
2QL(qτ) · QP(qτ)/[QL(qτ) + QP(qτ)], τ = s, t. The use of Laspeyres, Paasche,
and Fisher type aggregates (in Eqs. 25, 26 and 27) leads to the standard Laspeyres,
Paasche, and Fisher quantity indices widely used for measuring quantity change.
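
The following sketch shows how the aggregates in Eqs. (25), (26), and (27) turn into quantity indices once the ratio in Eq. (24) is taken. The two-commodity data and the function names are assumptions made for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def q_laspeyres(q, p_s, p_t):
    """Eq. (25): output valued at base-period prices."""
    return dot(p_s, q)


def q_paasche(q, p_s, p_t):
    """Eq. (26): output valued at current-period prices."""
    return dot(p_t, q)


def q_fisher(q, p_s, p_t):
    """Eq. (27): geometric mean of the two aggregates above."""
    return math.sqrt(dot(p_s, q) * dot(p_t, q))


def quantity_index(aggregate, q_s, q_t, p_s, p_t):
    """Eq. (24): ratio of the period-t aggregate to the period-s aggregate."""
    return aggregate(q_t, p_s, p_t) / aggregate(q_s, p_s, p_t)


# Illustrative data (assumptions)
p_s, p_t = [10.0, 5.0], [12.0, 5.0]
q_s, q_t = [100.0, 100.0], [125.0, 100.0]

laspeyres = quantity_index(q_laspeyres, q_s, q_t, p_s, p_t)   # 1750 / 1500
paasche = quantity_index(q_paasche, q_s, q_t, p_s, p_t)       # 2000 / 1700
fisher = quantity_index(q_fisher, q_s, q_t, p_s, p_t)
```

The ratio of Fisher-type aggregates equals the geometric mean of the Laspeyres and Paasche indices, so it always lies between the two.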
Lowe aggregate: The Lowe aggregate is a general specification that includes
Laspeyres and Paasche specifications as special cases. This aggregate makes use of
a fixed reference price vector, pR to evaluate quantities in both periods. It is given
by:
$$Q_{Lowe}(q_\tau) \equiv p_R \cdot q_\tau = \sum_{i=1}^{N} p_{iR} \, q_{i\tau}, \quad \tau = s, t \tag{28}$$
The Lowe aggregate requires nomination of a reference price vector. If the reference
price vector selected is the same as the base period price vector, we have the Laspeyres
aggregate. Similarly, use of the current period price vector results in the Paasche
aggregate.
The Lowe aggregate also includes the Edgeworth-Marshall and Walsh indices
(see [6] for a description of these indices) as special cases where the reference price
vector is an arithmetic or geometric average of the prices in the two periods. When
working with multiperiod or multifirm data the Lowe aggregate may be defined
using an average of prices in all the periods.
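
A short sketch of the Lowe aggregate in Eq. (28) illustrates how the choice of reference price vector selects among these special cases. The data and names are assumptions; the third case uses the arithmetic mean of the two observed price vectors as the reference vector.

```python
def lowe_aggregate(q, p_ref):
    """Eq. (28): quantities valued at a fixed reference price vector."""
    return sum(p * qi for p, qi in zip(p_ref, q))


def lowe_index(q_s, q_t, p_ref):
    """Ratio of Lowe aggregates for periods t and s."""
    return lowe_aggregate(q_t, p_ref) / lowe_aggregate(q_s, p_ref)


# Illustrative data (assumptions)
p_s, p_t = [10.0, 5.0], [12.0, 5.0]
q_s, q_t = [100.0, 100.0], [125.0, 100.0]
p_mean = [(a + b) / 2.0 for a, b in zip(p_s, p_t)]   # arithmetic mean of the two price vectors

laspeyres = lowe_index(q_s, q_t, p_s)           # reference prices = base period prices
paasche = lowe_index(q_s, q_t, p_t)             # reference prices = current period prices
mean_price_index = lowe_index(q_s, q_t, p_mean) # reference prices = average prices
```

For these data the index based on average prices, 1875/1600, falls between the Laspeyres and Paasche values.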
Now consider a series of aggregates which make use of the Malmquist output
distance function defined in Section “Notation and Preliminaries” of this chapter.
Alternative quantity aggregators based on Malmquist distance are listed below.
These aggregates form the basis for the Hicks-Moorsteen index [8].
Laspeyres-type Malmquist quantity aggregate: This is defined using base period
production technology and the input vector represented, respectively, by Ss and xs .
The aggregate is given by

$$Q_{M-L}(q_\tau) \equiv D_o(x_s, q_\tau : S^s), \quad \tau = s, t \tag{29}$$

Paasche-type Malmquist aggregate: This aggregate is defined using the current
period technology and input vector represented, respectively, by St and xt. The
aggregate is defined as:

$$Q_{M-P}(q_\tau) \equiv D_o(x_t, q_\tau : S^t), \quad \tau = s, t \tag{30}$$

Hicks-Moorsteen-type Malmquist aggregate: This aggregate is defined along
the lines of the Fisher index. It is defined as the geometric average of the Laspeyres
and Paasche-type Malmquist aggregates.

$$Q_{M-HM}(q_\tau) = \left[D_o(x_s, q_\tau : S^s) \cdot D_o(x_t, q_\tau : S^t)\right]^{0.5}, \quad \tau = s, t \tag{31}$$

Färe-Primont Malmquist Aggregate: This aggregate function is similar in
structure to the Lowe index defined above. It depends upon an arbitrarily selected
reference input vector, xR, and reference technology, SR. It is defined as:

$$Q_{M-FP}(q_\tau) \equiv D_o(x_R, q_\tau : S^R), \quad \tau = s, t \tag{32}$$

The aggregates defined above satisfy all three Properties 1 to 3. These
quantity aggregates have been employed in the past in measuring productivity levels
and productivity change over time. A distinguishing feature of the Malmquist-based
aggregators is that these aggregates do not require any price information and
therefore can be used even when price data are not available. On the down side, it
is necessary to know the production technology in the form of the output distance
function – both the functional form and values of the parameters in the function.
In implementing the quantity aggregates approach as illustrated through Eqs.
(25), (26), (27), (28), (29), (30), (31), and (32), the practitioner has to make some
conscious decisions regarding the type of aggregator, price vector in the case of
Lowe index, and the choice of reference technology and reference input vector in
the case of Färe-Primont Malmquist distance based aggregate.
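
To illustrate how a distance-function-based aggregate works without prices, the sketch below assumes a hypothetical two-output, two-input technology S = {(x, q): ‖q‖ ≤ a·√(x1·x2)}, for which the output distance function has a closed form. The functional form, the parameter values, and the data are all assumptions; in applied work the distance function would have to be estimated.

```python
import math


def output_distance(x, q, a):
    """Hypothetical D_o(x, q : S) for S = {(x, q): ||q|| <= a * sqrt(x1 * x2)}."""
    frontier = a * math.sqrt(x[0] * x[1])
    return math.hypot(q[0], q[1]) / frontier


# Illustrative data (assumptions): inputs and technology parameter in each period
x_s, a_s = [4.0, 9.0], 2.0
x_t, a_t = [4.0, 16.0], 2.5
q_s, q_t = [6.0, 8.0], [9.0, 12.0]


def hicks_moorsteen_aggregate(q):
    """Eq. (31): geometric mean of the base- and current-period output distances."""
    return math.sqrt(output_distance(x_s, q, a_s) * output_distance(x_t, q, a_t))


output_index = hicks_moorsteen_aggregate(q_t) / hicks_moorsteen_aggregate(q_s)
```

In this example the period-t output vector is 1.5 times the period-s vector, so linear homogeneity of the distance function in outputs yields an output index of 1.5, and no price data enter the calculation.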

Specification of Functional Forms for Input Aggregates

Specification of functional forms for input aggregates is similar to output aggregates
discussed in the previous section. In defining input quantity aggregates use is often
made of input prices. As input quantities cannot be summed directly, input prices are
used in the process of aggregating quantities. The input quantity and price vectors
in the two periods are represented by (xτ, wτ : τ = s, t). Measures similar to those in
Eqs. (25), (26), (27), and (28) used in defining output quantity aggregates will
lead to the following input quantity aggregates.

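$$X_L(x_\tau) \equiv w_s \cdot x_\tau = \sum_{k=1}^{K} w_{ks} \cdot x_{k\tau}, \quad \tau = s, t$$

$$X_P(x_\tau) \equiv w_t \cdot x_\tau = \sum_{k=1}^{K} w_{kt} \cdot x_{k\tau}, \quad \tau = s, t$$

$$X_F(x_\tau) \equiv \left[(w_s \cdot x_\tau)(w_t \cdot x_\tau)\right]^{0.5} = \left[\sum_{k=1}^{K} w_{ks} \cdot x_{k\tau} \cdot \sum_{k=1}^{K} w_{kt} \cdot x_{k\tau}\right]^{0.5}, \quad \tau = s, t$$

$$X_{Lowe}(x_\tau) \equiv w_R \cdot x_\tau = \sum_{k=1}^{K} w_{kR} \, x_{k\tau}, \quad \tau = s, t$$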

where X(.) represents the quantity aggregate and wR denotes a reference input price
vector used in the Lowe aggregate function.
The Malmquist input distance function based aggregates make use of the input
distance function Di(x, q : S). Based on this distance function, we can define four
input aggregates similar to those in Eqs. (29), (30), (31), and (32). These are:

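$$X_{M-L}(x_\tau) \equiv D_i(x_\tau, q_s : S^s), \quad \tau = s, t$$

$$X_{M-P}(x_\tau) \equiv D_i(x_\tau, q_t : S^t), \quad \tau = s, t$$

$$X_{M-HM}(x_\tau) = \left[D_i(x_\tau, q_s : S^s) \cdot D_i(x_\tau, q_t : S^t)\right]^{0.5}, \quad \tau = s, t$$

$$X_{M-FP}(x_\tau) \equiv D_i(x_\tau, q_R : S^R), \quad \tau = s, t$$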

where qR represents a preselected reference quantity vector and SR represents
reference production technology. The Färe-Primont type aggregate will differ
depending on the choice of the reference technology and reference output vector.
The output and input aggregates discussed in this section can be used in
measuring changes in output and input vectors over time or the level differences
across two firms. Measures of input and output change over time are defined, as in Eq.
(24), as the ratio of the respective quantity aggregates in the two periods. For
comparisons of levels across firms, the ratio of the respective aggregates for the two
firms at the same point in time is used.

Index Number Approach to Measuring Quantity Change

This section describes the traditional index number theoretic approach to measuring
quantity changes. As before, the exposition focuses on methods for measuring
output quantity change but these are equally applicable to measurement of input
quantity change. These methods are designed to make comparisons between two
sets of observations, over time or across two firms. Without loss of generality,
discussion below focuses on comparisons over time.
As index numbers measure changes in a set of related variables, it is necessary
to identify one of the periods or firms as the base or reference period and the
other as the current or comparison period. This means that the index number
measures changes in output quantities from the base period to the current period.
For comparisons between firms, use the terms reference firm and the comparison
firm.
In the literature there are two alternative approaches to the construction of
quantity index numbers. The first is the direct approach, whereby the
index number formulae are designed to measure output quantity change. The
second, known as the indirect approach, measures the output price change
first and then obtains quantity change as the ratio of the observed
value (revenue in this case) change to the price change. The revenue change is
measured by the ratio of revenues in periods t and s, as shown in the equation below.

$$\text{Revenue change} = R_{st} \equiv \frac{R_t}{R_s} = \frac{q_t \cdot p_t}{q_s \cdot p_s}$$

The indirect approach is based on the fundamental index number decomposition
which states that measures of value change must equal the product of measures of
quantity change and price change. In the case of outputs and revenues, revenue
change is decomposed into output quantity change and output price change, as
shown below.

$$R_{st} = Q_{st} \cdot P_{st} = \text{Output quantity change} \times \text{Output price change}.$$

The output quantity change can be measured indirectly as:

$$R_{st} = Q_{st} \cdot P_{st} \;\Rightarrow\; Q_{st} = \frac{R_{st}}{P_{st}}$$

The indirect input quantity index numbers can similarly be defined using Cst , Wst ,
and Xst which, respectively, represent measures of cost change, input price change,
and input quantity change. See Section “Indirect Measures of Quantity Change” for
further discussion on the decomposition of value (revenue or cost) change.

Direct Approach to Quantity Index Numbers

Under this approach quantity index numbers are computed directly using observed
price and quantity data. In presenting various formulae a further distinction is made
between two different approaches. The first approach makes use of a fixed price
vector to compare quantity levels. The second approach computes an average of
changes in quantities of different products over the two periods. This approach is
essentially one of obtaining a measure of central tendency.
The formulae presented here refer to output quantities but are equally applicable
for measuring changes in input quantities when output quantities and prices are
substituted by input quantities and input prices.

Fixed Price Approach

As the title suggests, application of this approach requires specification of a
reference set of prices at which output quantities are evaluated. The most generic
index under this class is the Lowe index. Indices such as the Laspeyres, Paasche,
and Walsh indices are similar to the Lowe index in formulation but with subtle
differences.
Lowe Index: The Lowe index [40] makes use of an arbitrarily chosen reference
vector of prices, pR . The Lowe index is defined as:
$$Q^{Lowe}_{st} \equiv \frac{\sum_{i=1}^{N} p_{iR} \, q_{it}}{\sum_{i=1}^{N} p_{iR} \, q_{is}} \tag{33}$$

The Lowe index is the ratio of revenues associated with output quantities in
periods s and t evaluated at fixed reference prices. Any change in the revenue ratio
in Eq. (33) reflects changes in quantities since reference prices are fixed.
The selection of the reference price vector is a critical decision in the implemen-
tation of the Lowe index. In practice use is made of the average of output prices
in both periods s and t or all the time periods in the analysis if quantity changes
are measured over several periods. If quantity comparisons are made across several
firms from a given industry, then the reference price vector may be an average of the
prices faced by different firms.
Laspeyres Index: The Laspeyres quantity index [39] is in a way a special case
of the Lowe index but it differs from the Lowe index from both operational and
economic perspectives. The Laspeyres index has its origins in the construction of fixed
basket consumer price index numbers.
Faced with the choice of the reference price vector for the implementation of the
Lowe index, the Laspeyres index uses the price vector of the base period, ps. The choice is
driven by the fact that this price vector is operationally relevant for the firm
in at least one of the periods. From an economic point of view, the firm’s decisions on output
quantities in the base period would have been governed by output prices prevailing
in that period.
Thus, the Laspeyres quantity index is defined as:
$$Q^{L}_{st} \equiv \frac{\sum_{i=1}^{N} p_{is} \, q_{it}}{\sum_{i=1}^{N} p_{is} \, q_{is}} = \frac{p_s \cdot q_t}{p_s \cdot q_s} \tag{34}$$

The economic intuition behind the Laspeyres index is simply that one output quantity
vector is deemed to be bigger than another if it generates more revenue at the prices
prevailing in the base period. For example, at the base period prices if the revenues
generated at base and current period quantities are $10 million and $12 million,
respectively, then the Laspeyres quantity index implies a 20% increase in output
quantities.
Paasche Index: The Paasche index [49] is defined as the ratio of the revenues
generated by the outputs of the two periods when the current period price vector, $p_t$,
is used for valuing output. This index is also a special case of the Lowe index where the
reference price vector is the current period price vector, $p_t$. The Paasche quantity
index is defined as:

$$
Q^{P}_{st} \equiv \frac{\sum_{i=1}^{N} p_{it}\, q_{it}}{\sum_{i=1}^{N} p_{it}\, q_{is}} = \frac{p_t \cdot q_t}{p_t \cdot q_s} \tag{35}
$$

The Paasche index compares the revenue generated by current period output
quantities at current period prices with the revenue that would have been generated
in the base period at current period prices. As the same prices are used in evaluating
revenues generated by quantity vectors in periods s and t, any change in the revenue
is attributed to quantity change.
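If $p_{it} = \lambda \cdot p_{is}$ for $i = 1, 2, \ldots, N$, where $\lambda$ is strictly positive, then the Laspeyres
and Paasche indices coincide:

$$
Q^{P}_{st} = \frac{\sum_{i=1}^{N} p_{it}\, q_{it}}{\sum_{i=1}^{N} p_{it}\, q_{is}} = \frac{\sum_{i=1}^{N} \lambda \cdot p_{is}\, q_{it}}{\sum_{i=1}^{N} \lambda \cdot p_{is}\, q_{is}} = \frac{\sum_{i=1}^{N} p_{is}\, q_{it}}{\sum_{i=1}^{N} p_{is}\, q_{is}} = Q^{L}_{st}
$$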

However, in practice price changes are not proportional across all commodities.
In that case these indices differ, sometimes quite significantly, if the price structures
exhibit major shifts. When comparisons are being made over time, one would expect
the price structures to be similar when the periods are not too distant. However,
firms in different spatial locations are likely to face different prices and relative
price structures. Differential price structures are likely to be encountered if the firms
are located in different countries or in different regions of a country. For example,
relative prices of agricultural commodities are likely to differ significantly across
different states within the USA.
Fixed Price Indices based on Symmetric Averages of Base and Current Period
Prices: The Laspeyres and Paasche indices are usually considered to be two
extremes since they utilize the prices prevailing in the base and current periods, respectively.
In cases where price structures may be changing over time, it may be prudent to
make use of an average of the prices in the two periods. The resulting quantity index
would be:
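$$
Q^{*}_{st} = \frac{\sum_{i=1}^{N} p_i^{*}\, q_{it}}{\sum_{i=1}^{N} p_i^{*}\, q_{is}}, \quad \text{where} \quad
p_i^{*} = \frac{p_{is} + p_{it}}{2}; \ \text{or} \ \sqrt{p_{is} \cdot p_{it}}; \ \text{or} \ \frac{2}{\dfrac{1}{p_{is}} + \dfrac{1}{p_{it}}}; \quad i = 1, 2, \ldots, N \tag{36}
$$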

In Eq. (36), the arithmetic, geometric, and harmonic means of prices in the two
periods are used in evaluating quantity vectors in the two periods. The quantity
index resulting from the use of the arithmetic average of prices is known as the
Edgeworth-Marshall index [22, 42]. Similarly, the Walsh index [62] makes use of
geometric average of prices in the current and base periods.
Fisher Index and Symmetric averages of the Laspeyres and Paasche Indices:
Since the Laspeyres and Paasche indices are seen as two extremes relying on the
base or the current period prices, Irving Fisher [28] suggested the use of a geometric
average of the Laspeyres and Paasche indices. The Fisher index is defined as:
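$$
Q^{F}_{st} \equiv \left[ Q^{L}_{st} \cdot Q^{P}_{st} \right]^{1/2} = \left[ \frac{\sum_{i=1}^{N} p_{is}\, q_{it}}{\sum_{i=1}^{N} p_{is}\, q_{is}} \cdot \frac{\sum_{i=1}^{N} p_{it}\, q_{it}}{\sum_{i=1}^{N} p_{it}\, q_{is}} \right]^{0.5} \tag{37}
$$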

The Fisher index possesses several important axiomatic and economic theoretic
properties which are discussed in this and the following section. It is possible to use
arithmetic or harmonic averages of the Laspeyres and Paasche quantity indices, but
neither has properties comparable to those of the Fisher index.
The Lowe, Laspeyres, Paasche, and Fisher indices discussed in this section
play an important role in measuring changes in output quantities, changes in input
quantities and measurement of productivity change.
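The fixed-price indices of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the two-commodity data are hypothetical and are not taken from the chapter.

```python
from math import sqrt

def lowe(p_ref, q_s, q_t):
    """Lowe quantity index: revenue ratio at fixed reference prices p_ref."""
    revenue_t = sum(p * q for p, q in zip(p_ref, q_t))
    revenue_s = sum(p * q for p, q in zip(p_ref, q_s))
    return revenue_t / revenue_s

def laspeyres(p_s, p_t, q_s, q_t):
    """Eq. (34): base period prices serve as reference prices."""
    return lowe(p_s, q_s, q_t)

def paasche(p_s, p_t, q_s, q_t):
    """Eq. (35): current period prices serve as reference prices."""
    return lowe(p_t, q_s, q_t)

def fisher(p_s, p_t, q_s, q_t):
    """Eq. (37): geometric mean of the Laspeyres and Paasche indices."""
    return sqrt(laspeyres(p_s, p_t, q_s, q_t) * paasche(p_s, p_t, q_s, q_t))

# Hypothetical two-commodity data (not taken from the chapter).
p_s, q_s = [1.0, 2.0], [10.0, 5.0]
p_t, q_t = [1.5, 2.0], [12.0, 5.0]

q_l = laspeyres(p_s, p_t, q_s, q_t)  # revenue ratio 22 / 20
q_p = paasche(p_s, p_t, q_s, q_t)    # revenue ratio 28 / 25
q_f = fisher(p_s, p_t, q_s, q_t)     # lies between the two
print(q_l, q_p, q_f)

# With proportional prices (p_t = 3 * p_s) the two indices coincide.
p_prop = [3.0 * p for p in p_s]
gap = laspeyres(p_s, p_prop, q_s, q_t) - paasche(p_s, p_prop, q_s, q_t)
print(abs(gap) < 1e-12)
```

Passing an average of the two price vectors to `lowe` would give the symmetric-average indices of Eq. (36) in the same way.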

Direct Indices Based on Statistical Averages

In this section the problem of measuring changes in the output quantity vector is
considered as a problem of obtaining a suitable average or a measure of central
tendency of observed growth in outputs of different commodities. Given the output
quantities in two periods, {q1s , q2s , . . . , qNs } and {q1t , q2t , . . . , qNt } , there are N
measures of quantity change, one for each commodity, {qit /qis : i = 1, 2, . . . , N}.
Then the index number problem is one of obtaining a suitable measure of central
tendency of these N ratios.

Unweighted Measures
If all commodities are treated as equally important, then the following two index
numbers due to Carli [9] and Jevons [36] are of importance.

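$$
Q^{Carli}_{st} \equiv \frac{1}{N} \sum_{i=1}^{N} \frac{q_{it}}{q_{is}} \qquad\qquad
Q^{Jevons}_{st} \equiv \prod_{i=1}^{N} \left( \frac{q_{it}}{q_{is}} \right)^{1/N} \tag{38}
$$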

These indices are, respectively, the arithmetic and geometric averages of the quantity
changes observed for the individual commodities. They were proposed and in use well
before the Lowe, Laspeyres, Paasche, and Fisher index numbers.
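As a small illustration of Eq. (38), the two unweighted indices can be computed as follows. The function names and the data are hypothetical; the quantity ratios are chosen to be 4 and 1 so that the gap between the arithmetic and geometric means is easy to see.

```python
from math import prod

def carli(q_s, q_t):
    """Arithmetic mean of the commodity-specific quantity ratios."""
    ratios = [qt / qs for qs, qt in zip(q_s, q_t)]
    return sum(ratios) / len(ratios)

def jevons(q_s, q_t):
    """Geometric mean of the commodity-specific quantity ratios."""
    ratios = [qt / qs for qs, qt in zip(q_s, q_t)]
    return prod(ratios) ** (1.0 / len(ratios))

# Hypothetical quantities: the ratios q_it / q_is are 4 and 1.
q_s = [1.0, 2.0]
q_t = [4.0, 2.0]

print(carli(q_s, q_t))   # arithmetic mean of 4 and 1
print(jevons(q_s, q_t))  # geometric mean of 4 and 1
```

Since a geometric mean never exceeds the corresponding arithmetic mean, the Jevons index cannot exceed the Carli index for the same data.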

Quantity Indices Using Revenue Share Weighted Averages

Use of unweighted averages as is the case in Eq. (38) means that quantity change in
each commodity is considered equally important. This is an untenable position from
the perspective of an economist as not all commodities produced are considered
to be of equal importance. Usually a weighted average with weights reflecting
the economic importance of different commodities is recommended. A measure
of economic importance of output of a particular commodity is its contribution
to the total revenue or its share in the revenue (defined in Section “Notation and
Preliminaries”). Revenue shares may refer to the base period, s, or to the current
period, t, or an average of shares in both periods. It is also possible to use some
hypothetical or reference set of weights which have no relationship with revenue
shares observed in the base and current periods.

Weighted Arithmetic Averages

Base-period Revenue Share Weights: Laspeyres Quantity Index
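Using the base period revenue shares, $r_{is}$, as weights in an arithmetic average
of the quantity ratios leads to the Laspeyres index. This can be seen from

$$
\sum_{i=1}^{N} r_{is} \cdot \frac{q_{it}}{q_{is}} = \sum_{i=1}^{N} \frac{p_{is}\, q_{is}}{\sum_{j=1}^{N} p_{js}\, q_{js}} \cdot \frac{q_{it}}{q_{is}} = \frac{\sum_{i=1}^{N} p_{is}\, q_{it}}{\sum_{i=1}^{N} p_{is}\, q_{is}} = Q^{L}_{st}
$$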

Current Period Revenue Share Weights: Paasche Quantity Index

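If a harmonic mean of quantity ratios is used in conjunction with current period
revenue shares, $r_{it}$, it results in the Paasche quantity index.

$$
\frac{1}{\sum_{i=1}^{N} r_{it} \cdot \dfrac{q_{is}}{q_{it}}} = \frac{1}{\sum_{i=1}^{N} \dfrac{p_{it}\, q_{it}}{\sum_{j=1}^{N} p_{jt}\, q_{jt}} \cdot \dfrac{q_{is}}{q_{it}}} = \frac{\sum_{i=1}^{N} p_{it}\, q_{it}}{\sum_{i=1}^{N} p_{it}\, q_{is}} = Q^{P}_{st}
$$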
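The two share-weighted representations above can be checked numerically. The sketch below uses hypothetical helper names and data; it compares the share-weighted means of the quantity ratios with the revenue-ratio forms of Eqs. (34) and (35).

```python
def revenue_shares(p, q):
    """Share of each commodity in total revenue p . q."""
    revenue = [pi * qi for pi, qi in zip(p, q)]
    total = sum(revenue)
    return [r / total for r in revenue]

def weighted_arithmetic_mean(weights, ratios):
    return sum(w * x for w, x in zip(weights, ratios))

def weighted_harmonic_mean(weights, ratios):
    return 1.0 / sum(w / x for w, x in zip(weights, ratios))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical two-commodity data (not taken from the chapter).
p_s, q_s = [1.0, 2.0], [10.0, 5.0]
p_t, q_t = [1.5, 2.0], [12.0, 5.0]
ratios = [qt / qs for qs, qt in zip(q_s, q_t)]

# Base period shares with an arithmetic mean give the Laspeyres index.
laspeyres_via_shares = weighted_arithmetic_mean(revenue_shares(p_s, q_s), ratios)
laspeyres_direct = dot(p_s, q_t) / dot(p_s, q_s)

# Current period shares with a harmonic mean give the Paasche index.
paasche_via_shares = weighted_harmonic_mean(revenue_shares(p_t, q_t), ratios)
paasche_direct = dot(p_t, q_t) / dot(p_t, q_s)

print(laspeyres_via_shares, laspeyres_direct)
print(paasche_via_shares, paasche_direct)
```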

Weighted Geometric Averages

A number of commonly used formulae can be derived as geometric averages of
quantity ratios with revenue share weights. Of these, the most commonly used are
the Törnqvist index [62] and the geometric-Young (G-Y) index [62]. These indices
can be built from the Laspeyres- and Paasche-type indices shown below.
Geometric-Laspeyres: This index is defined using base period revenue shares.

$$
Q^{G\text{-}L}_{st} \equiv \prod_{i=1}^{N} \left( \frac{q_{it}}{q_{is}} \right)^{r_{is}} \tag{39a}
$$

Geometric-Paasche: This index is defined using current period revenue shares.

$$
Q^{G\text{-}P}_{st} \equiv \prod_{i=1}^{N} \left( \frac{q_{it}}{q_{is}} \right)^{r_{it}} \tag{39b}
$$

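Törnqvist Index: The Törnqvist (TT) index is then defined, similar to the Fisher
index, as the geometric mean of the geometric-Laspeyres and geometric-Paasche
indices. It is given by:

$$
Q^{TT}_{st} \equiv \left[ Q^{G\text{-}L}_{st} \cdot Q^{G\text{-}P}_{st} \right]^{1/2}
= \left[ \prod_{i=1}^{N} \left( \frac{q_{it}}{q_{is}} \right)^{r_{is}} \cdot \prod_{i=1}^{N} \left( \frac{q_{it}}{q_{is}} \right)^{r_{it}} \right]^{1/2}
= \prod_{i=1}^{N} \left( \frac{q_{it}}{q_{is}} \right)^{\frac{r_{is} + r_{it}}{2}} \tag{40}
$$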

The TT index is similar to the Fisher index in its formulation: it is defined as
a symmetric average of the geometric-Laspeyres and geometric-Paasche indices.
While the Fisher index cannot be shown to be a weighted arithmetic average of the quantity
changes observed for each commodity, the TT index is indeed a weighted geometric
average of quantity ratios where the weights are the arithmetic averages of the shares in the
base and current periods.
The TT index is an important formula for constructing consumer and producer
price and quantity index numbers. It also plays an important role in productivity
measurement. Though the TT index does not satisfy all the axiomatic properties
that the Fisher index does, numerical values of TT index in most practical situations
are quite close to the Fisher index. So, in practical situations it really does not matter
whether one chooses the Fisher or TT index for measuring quantity change.
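The closeness of the TT and Fisher indices can be illustrated with a short sketch. The function names and data are hypothetical, and the example shows only one numerical case, not a general result.

```python
from math import exp, log, sqrt

def revenue_shares(p, q):
    revenue = [pi * qi for pi, qi in zip(p, q)]
    total = sum(revenue)
    return [r / total for r in revenue]

def geometric_index(weights, q_s, q_t):
    """Weighted geometric mean of the quantity ratios q_it / q_is."""
    return exp(sum(w * log(qt / qs) for w, qs, qt in zip(weights, q_s, q_t)))

def tornqvist(p_s, p_t, q_s, q_t):
    """Eq. (40): weights are averages of base and current revenue shares."""
    r_s = revenue_shares(p_s, q_s)
    r_t = revenue_shares(p_t, q_t)
    avg = [(a + b) / 2 for a, b in zip(r_s, r_t)]
    return geometric_index(avg, q_s, q_t)

def fisher(p_s, p_t, q_s, q_t):
    """Eq. (37): geometric mean of the Laspeyres and Paasche indices."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    laspeyres = dot(p_s, q_t) / dot(p_s, q_s)
    paasche = dot(p_t, q_t) / dot(p_t, q_s)
    return sqrt(laspeyres * paasche)

# Hypothetical two-commodity data (not taken from the chapter).
p_s, q_s = [1.0, 2.0], [10.0, 5.0]
p_t, q_t = [1.5, 2.0], [12.0, 5.0]

g_l = geometric_index(revenue_shares(p_s, q_s), q_s, q_t)  # Eq. (39a)
g_p = geometric_index(revenue_shares(p_t, q_t), q_s, q_t)  # Eq. (39b)
tt = tornqvist(p_s, p_t, q_s, q_t)
f = fisher(p_s, p_t, q_s, q_t)
print(tt, f)
```

For these data the two indices differ only in the fourth decimal place.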

Geometric-Young Index
The G-Y index is similar in concept to the Lowe index but is defined in terms of
quantity changes rather than quantity aggregates. Thus the G-Y index is more flexible
operationally and carries an intuitive statistical interpretation. This index uses
item-specific weights that need not correspond to revenue shares in the base
or current periods. If $r_i$ denotes the weight for the i-th commodity ratio, then the G-Y
index is defined as:
$$
Q^{G\text{-}Y}_{st} \equiv \prod_{i=1}^{N} \left( \frac{q_{it}}{q_{is}} \right)^{r_{i}} \tag{41}
$$

Empirical implementation of the G-Y index requires specification of the weights.
The index is particularly useful when comparisons are being made over several
periods or across a set of firms. In that case the G-Y index can be anchored on
the average of revenue shares, averaged over time or across different firms. After
we discuss the axiomatic framework for index numbers, we will note that the G-Y
index has a unique property that makes it a particularly useful tool.
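One way to anchor the G-Y weights on average revenue shares is sketched below. Averaging the shares with equal weight on each period is an assumption made here for illustration; the function names and the three-period data are hypothetical.

```python
from math import exp, log

def revenue_shares(p, q):
    revenue = [pi * qi for pi, qi in zip(p, q)]
    total = sum(revenue)
    return [r / total for r in revenue]

def average_shares(prices, quantities):
    """Average each commodity's revenue share over all periods (equal weight per period)."""
    shares = [revenue_shares(p, q) for p, q in zip(prices, quantities)]
    return [sum(column) / len(shares) for column in zip(*shares)]

def geometric_young(weights, q_s, q_t):
    """Eq. (41) with period-independent weights."""
    return exp(sum(w * log(qt / qs) for w, qs, qt in zip(weights, q_s, q_t)))

# Hypothetical prices and quantities for three periods.
prices = [[1.0, 2.0], [1.5, 2.0], [2.0, 2.5]]
quantities = [[10.0, 5.0], [12.0, 5.0], [12.0, 8.0]]

weights = average_shares(prices, quantities)
q_01 = geometric_young(weights, quantities[0], quantities[1])
q_02 = geometric_young(weights, quantities[0], quantities[2])
print(weights, q_01, q_02)
```

The same weights are used for every pair of periods, which is what distinguishes this index from the geometric-Laspeyres, geometric-Paasche, and Törnqvist indices.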

Indirect Measures of Quantity Change

Measures of output quantity change are often derived indirectly whereby price
changes are measured first and then quantity changes are obtained indirectly through
the standard decomposition of revenue change15 formula:
$$
R_{st} = \frac{\sum_{i=1}^{N} p_{it}\, q_{it}}{\sum_{i=1}^{N} p_{is}\, q_{is}} = P_{st} \cdot Q_{st}
$$

This decomposition is also known as the product test.

Given this decomposition, Qst can be derived indirectly if Pst is known. The
indirect quantity index is then given by
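$$
Q_{st} = \frac{R_{st}}{P_{st}} = \frac{\sum_{i=1}^{N} p_{it}\, q_{it}}{\sum_{i=1}^{N} p_{is}\, q_{is}} \cdot \frac{1}{P_{st}} = \frac{\sum_{i=1}^{N} p_{it}\, q_{it} / P_{st}}{\sum_{i=1}^{N} p_{is}\, q_{is}}
= \frac{\text{Revenue in period } t \text{ adjusted for price change from period } s \text{ to } t}{\text{Revenue in period } s} \tag{42}
$$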

The numerator is revenue in period t after the effect of price change over
the period s to t is removed. It is usually referred to as the constant price revenue
aggregate.16 Equation (42) means that the indirect quantity change measure is the
ratio of current period revenue to base period revenue after the effect of price change is
eliminated using an appropriate price index.
In order to implement (42), we need to select a price index number formula.
Price index formulae are similar to quantity indices and most standard price index
number formulae can be derived by simply interchanging prices and quantities in the
formulae discussed in Section “Direct Approach to Quantity Index Numbers.” As
this is a mechanical exercise, we do not list them.

15 In the case of input quantity changes, changes in total cost are decomposed into input price
change and input quantity change. Here input price change is measured first and input quantity
change is measured indirectly.
16 This terminology is more common in the case of national accounts aggregates such as GDP.
National statistical offices regularly publish GDP in current prices and GDP in constant (base
period) prices.

Quantity Index Based on the Laspeyres Price Index

This index is defined as the ratio of the revenue change index and Laspeyres price
index.
$$
IQ^{P^{L}}_{st} = \frac{R_{st}}{P^{L}_{st}} = \frac{\sum_{i=1}^{N} p_{it}\, q_{it} \Big/ \sum_{i=1}^{N} p_{is}\, q_{is}}{\sum_{i=1}^{N} p_{it}\, q_{is} \Big/ \sum_{i=1}^{N} p_{is}\, q_{is}} = \frac{\sum_{i=1}^{N} p_{it}\, q_{it}}{\sum_{i=1}^{N} p_{it}\, q_{is}} = Q^{P}_{st} \tag{43}
$$

where $IQ^{P^{L}}_{st}$ represents the indirect quantity index derived using the Laspeyres
price index. From Eq. (43), it can be seen that this indirect index is the same as
the Paasche quantity index.
An implication of Eq. (43) is that if the Laspeyres price index is a suitable
measure of price change then the Paasche quantity index must be used in order to
satisfy the index number decomposition condition. In this sense the Paasche quantity
index is the dual of the Laspeyres price index.

Quantity Index Based on the Paasche Price Index

This index is defined as the ratio of the revenue change index and the Paasche price
index.
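$$
IQ^{P^{P}}_{st} = \frac{R_{st}}{P^{P}_{st}} = \frac{\sum_{i=1}^{N} p_{it}\, q_{it} \Big/ \sum_{i=1}^{N} p_{is}\, q_{is}}{\sum_{i=1}^{N} p_{it}\, q_{it} \Big/ \sum_{i=1}^{N} p_{is}\, q_{it}} = \frac{\sum_{i=1}^{N} p_{is}\, q_{it}}{\sum_{i=1}^{N} p_{is}\, q_{is}} = Q^{L}_{st} \tag{44}
$$

where $IQ^{P^{P}}_{st}$ represents the indirect quantity index derived using the Paasche price
index. From Eq. (44) it can be seen that the Laspeyres quantity index is dual to the
Paasche price index.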

Quantity Index Based on the Fisher Price Index

The Fisher price index is the geometric mean of the Laspeyres and Paasche price
index numbers. Thus, the Fisher-based indirect quantity index is given by:

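$$
IQ^{P^{F}}_{st} = \frac{R_{st}}{\left[ P^{L}_{st} \cdot P^{P}_{st} \right]^{1/2}} = \left[ \frac{R_{st}}{P^{L}_{st}} \right]^{0.5} \cdot \left[ \frac{R_{st}}{P^{P}_{st}} \right]^{0.5} = \left[ Q^{P}_{st} \right]^{0.5} \cdot \left[ Q^{L}_{st} \right]^{0.5} = Q^{F}_{st} \tag{45}
$$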

The second last expression on the right hand side follows from Eqs. (43) and
(44). This is an important result which shows that the Fisher quantity index is dual
to the Fisher price index and therefore Fisher price and quantity indices are self-dual
to each other.
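The duality results in Eqs. (43) to (45) can be verified numerically by deflating the revenue ratio with each price index, as in Eq. (42). The sketch below uses hypothetical function names and data.

```python
from math import sqrt

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def laspeyres_price(p_s, p_t, q_s, q_t):
    return dot(p_t, q_s) / dot(p_s, q_s)

def paasche_price(p_s, p_t, q_s, q_t):
    return dot(p_t, q_t) / dot(p_s, q_t)

def fisher_price(p_s, p_t, q_s, q_t):
    return sqrt(laspeyres_price(p_s, p_t, q_s, q_t) * paasche_price(p_s, p_t, q_s, q_t))

def indirect_quantity(price_index, p_s, p_t, q_s, q_t):
    """Eq. (42): revenue ratio deflated by the chosen price index."""
    revenue_ratio = dot(p_t, q_t) / dot(p_s, q_s)
    return revenue_ratio / price_index(p_s, p_t, q_s, q_t)

# Hypothetical two-commodity data (not taken from the chapter).
p_s, q_s = [1.0, 2.0], [10.0, 5.0]
p_t, q_t = [1.5, 2.0], [12.0, 5.0]

laspeyres_quantity = dot(p_s, q_t) / dot(p_s, q_s)
paasche_quantity = dot(p_t, q_t) / dot(p_t, q_s)
fisher_quantity = sqrt(laspeyres_quantity * paasche_quantity)

via_laspeyres = indirect_quantity(laspeyres_price, p_s, p_t, q_s, q_t)
via_paasche = indirect_quantity(paasche_price, p_s, p_t, q_s, q_t)
via_fisher = indirect_quantity(fisher_price, p_s, p_t, q_s, q_t)
print(via_laspeyres, via_paasche, via_fisher)
```

Deflating by the Laspeyres price index reproduces the Paasche quantity index, deflating by the Paasche price index reproduces the Laspeyres quantity index, and deflating by the Fisher price index reproduces the Fisher quantity index.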

Indirect Quantity Indices with Geometric Price Indices

Since the revenue shares are additive and the geometric indices are multiplicative,
there are no interesting dual properties attached to indirect geometric quantity
indices. We simply note here that indirect quantity indices can be computed using Eq.
(42) along with the price counterpart of any of the geometric indices defined in
Eqs. (39a), (39b), (40), and (41).

Direct Versus Indirect Measures of Quantity Change: Which One to Use?

This is an important question. When a practitioner is working with a data set
that contains both prices and quantities of commodities, this question assumes
significance. Fortunately this is an issue that was adequately addressed by Allen
and Diewert [3]. Their recommendation is to use the index that is most reliable
out of the two alternatives. An important conclusion from their paper is that if
price and quantity ratios – commodity specific changes in prices and quantities
over time – do not show much variability it really does not matter whether direct
or indirect measures are used and, in this case, most index number formulae will
result in numerical values close to each other. Their main recommendation refers
to the case where price and quantity relatives exhibit significant variability across
commodities. In this case, Allen and Diewert [3] suggest a procedure which involves
the following steps: (i) run regressions where ln (pit /pis ) and ln (qit /qis ) are regressed
separately on a constant; (ii) compute residual sum of squares of these regressions;
and (iii) choose the index that is associated with the smallest residual sum of squares.
Properties of the residual sum of squares as measures of reliability are established in
Allen and Diewert [3]. The only comment to add is that this measure of reliability
can be improved upon by running weighted least squares with revenue shares as
weights reflecting the relative importance of the different commodities included in
the index computation.
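The three-step procedure can be sketched as follows. A regression of the log relatives on a constant has the sample mean as its least squares estimate, so the residual sum of squares is simply the sum of squared deviations from that mean. The function names and data are hypothetical, and the reading that the index with the smaller residual sum of squares is computed directly (with the other obtained indirectly through Eq. (42)) is our interpretation of step (iii).

```python
from math import log

def rss_on_constant(values):
    """Residual sum of squares from regressing the values on a constant."""
    mean = sum(values) / len(values)  # least squares estimate of the constant
    return sum((v - mean) ** 2 for v in values)

def choose_direct_index(p_s, p_t, q_s, q_t):
    """Return which index to compute directly, plus both residual sums of squares."""
    log_price_relatives = [log(pt / ps) for ps, pt in zip(p_s, p_t)]
    log_quantity_relatives = [log(qt / qs) for qs, qt in zip(q_s, q_t)]
    rss_p = rss_on_constant(log_price_relatives)
    rss_q = rss_on_constant(log_quantity_relatives)
    # Assumption: the less dispersed set of relatives gives the more reliable direct index.
    choice = "price" if rss_p < rss_q else "quantity"
    return choice, rss_p, rss_q

# Hypothetical two-commodity data (not taken from the chapter).
p_s, q_s = [1.0, 2.0], [10.0, 5.0]
p_t, q_t = [1.5, 2.0], [12.0, 5.0]

choice, rss_p, rss_q = choose_direct_index(p_s, p_t, q_s, q_t)
print(choice, rss_p, rss_q)
```

In this example the quantity relatives are less dispersed than the price relatives, so the direct quantity index would be preferred. A share-weighted variant would replace the simple mean and sum of squares with their revenue-share-weighted counterparts.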

Axiomatic Approach to Index Numbers

The axiomatic approach to index numbers offers a number of criteria which can be
used in evaluating the usefulness of various index number formulae. These criteria
are usually referred to as axioms or tests. This approach serves two important
functions. First and foremost it helps the user or practitioner in choosing an
appropriate index number formula which can be a real issue when one is confronted
with a myriad of formulae serving the same purpose. The second role of the
axiomatic approach is that it helps us to narrow the choice set of formulae by
restricting to those which satisfy a given set of axioms. The earliest attempts to
identify a number of desirable criteria date back to Fisher [28]. Section “Absolute
Versus Relative Measures” of Balk [6] sketches the historical development of
axiomatic approach. A formal treatment of the axioms and proofs of a number
of useful results can be found in Eichorn and Voeller [24], Eichorn [23], Diewert
[16], and Balk [4, 6]. Balk [6] makes a distinction between axioms and tests.
Time reversal, factor reversal, and some others are classified under tests whereas
properties like identity and linear homogeneity are included in the list of axioms.
Diewert [16] and Chap. 19, “Index Numbers and Productivity Measurement”,
of the Producer Price Index Manual [35] refer to all the properties as tests. This
chapter treats axioms and tests alike as properties of index numbers and discusses them
under the single heading, axioms. The axiomatic theory treats prices and quantities
of commodities as independent nonstochastic variables. This treatment of prices and
quantities differs from the economic theoretic approach where prices and quantities
are assumed to be functionally related.

Notation for the Axiomatic Approach

A slightly more elaborate notation is used to discuss axioms for index numbers. For
simplicity the focus is on axioms for index numbers measuring changes in prices
and quantities from the base period, s, to current or comparison period, t.17 In the
axiomatic approach, price and quantity index numbers are considered as real valued
functions of prices and quantities in these two periods. The following notation:

$$
P_{st} = P(p_t, q_t, p_s, q_s) \qquad\qquad Q_{st} = Q(p_t, q_t, p_s, q_s)
$$

where Pst and Qst represent, respectively, price and quantity index numbers is used.
Prices and quantities are assumed to be strictly positive. The price and quantity
index functions are assumed to be strictly positive, continuous, and differentiable.18
Price and quantities here may refer to outputs or inputs.

Axioms and Discussion

The following is a set of widely canvassed axioms in the index number literature. All
the axioms are stated for quantity indices but are equally applicable to price index
numbers.
Axiom 1 – Strong Identity Axiom: If all the quantities in the current period are
equal to quantities in the base period, then the quantity index must equal 1. If
qt = qs = q then

$$
Q_{st}(p_t, q, p_s, q) = 1 \tag{46}
$$

17 In most standard textbook expositions on index numbers base and current periods are denoted by
“0” and “1” respectively.
18 Continuity and differentiability are mathematical requirements that ensure smoothness of the
index numbers derived.

This axiom implies that changes in prices over the period s to t do not influence
the output quantities. This has the implication that output prices and quantities are
independently determined, which may not be consistent with microeconomic theory
of producer behavior.
Axiom 2 – Weak Identity Axiom: If prices and quantities in period t are the same
as those observed in period s, that is, pt = ps ; qt = qs then the quantity index must
equal 1. Thus

$$
Q_{st}(p_s, q_s, p_s, q_s) = 1 \tag{47}
$$

This axiom links price and quantity movements: it applies only when the entire
price-quantity pair remains the same and is, therefore, weaker than the
strong identity axiom.
Axiom 3 – Linear homogeneity in current period quantities: If the output vector
in current period is multiplied by a positive constant, λ, then the resulting index is a
multiple of the quantity index by the same constant λ. This means

$$
Q_{st}(p_t, \lambda q_t, p_s, q_s) = \lambda \cdot Q_{st}(p_t, q_t, p_s, q_s) \quad \text{for } \lambda > 0. \tag{48}
$$

Axiom 4 – Monotonicity: The quantity index must be increasing in current period
quantities and decreasing in base period quantities.

$$
Q_{st}(p_t, q, p_s, q_s) > Q_{st}(p_t, q_t, p_s, q_s) \ \text{for all } q > q_t; \qquad
Q_{st}(p_t, q_t, p_s, q) < Q_{st}(p_t, q_t, p_s, q_s) \ \text{for all } q > q_s \tag{49}
$$

Axiom 5 – Homogeneity of degree zero in quantities: If quantity vectors in
both periods are multiplied by the same factor then the quantity index remains
unchanged.

$$
Q_{st}(p_t, \lambda q_t, p_s, \lambda q_s) = Q_{st}(p_t, q_t, p_s, q_s) \quad \text{for } \lambda > 0 \tag{50}
$$

Axiom 6 – Commensurability or dimensional invariance: The quantity index
must be independent of the units in which output quantities are measured. If all
the quantities are scaled using a diagonal matrix, $\Lambda_{N \times N}$, with positive diagonal
elements, and prices are rescaled correspondingly, then the quantity index remains
unchanged. Thus, we have

$$
Q_{st}(\Lambda^{-1} p_t, \Lambda q_t, \Lambda^{-1} p_s, \Lambda q_s) = Q_{st}(p_t, q_t, p_s, q_s) \quad \text{where } \Lambda_{N \times N} > 0 \tag{51}
$$

Axiom 6 – Proportionality: If quantities in period t are a constant multiple of
quantities in period s, or equivalently $q_{it}/q_{is} = \lambda$ ($\lambda > 0$) for all i, then the quantity
index must equal $\lambda$.

$$
Q_{st}(p_t, \lambda q_s, p_s, q_s) = \lambda \quad \text{where } \lambda > 0
$$

It can be seen that Axiom 1, the Strong Identity Axiom, is a special case of the
proportionality axiom where λ is set to 1.
Axiom 7 – Mean value axiom: A quantity index number must lie within the
smallest and largest changes in quantities from period s to t, or observed quantity
ratios. This means

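$$
\min \left\{ \frac{q_{1t}}{q_{1s}}, \frac{q_{2t}}{q_{2s}}, \ldots, \frac{q_{Nt}}{q_{Ns}} \right\} \le Q_{st}(p_t, q_t, p_s, q_s) \le \max \left\{ \frac{q_{1t}}{q_{1s}}, \frac{q_{2t}}{q_{2s}}, \ldots, \frac{q_{Nt}}{q_{Ns}} \right\} \tag{52}
$$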

Axiom 8 – Time Reversal Test: The quantity index measuring changes from
period s to t must be the reciprocal of the quantity index measuring changes from
period t to s. Thus, the test requires

$$
Q_{st}(p_t, q_t, p_s, q_s) = \frac{1}{Q_{ts}(p_s, q_s, p_t, q_t)} \quad \text{or} \quad
Q_{st}(p_t, q_t, p_s, q_s) \cdot Q_{ts}(p_s, q_s, p_t, q_t) = 1 \tag{53}
$$

This axiom implies consistency between comparisons from s to t and comparisons
in reverse from t to s.
Axiom 9 – Circularity test: Suppose we are interested in comparisons involving
several periods, say 1 to T; then the circularity test requires:

$$
Q_{12}(p_2, q_2, p_1, q_1) \times Q_{23}(p_3, q_3, p_2, q_2) \times \cdots \times Q_{T-1,T}(p_T, q_T, p_{T-1}, q_{T-1}) \cdot Q_{T1}(p_1, q_1, p_T, q_T) = 1 \tag{54}
$$

The circularity test requires consistency in quantity comparisons over several
periods and suggests that the chained comparison from period 1 to T must equal the
reciprocal of the comparison from T to 1.
Axiom 10 – Transitivity Axiom: The circularity test is often stated in the form of
a transitivity axiom which states that for any three periods $s$, $t$, and $t'$ the following
condition on pairwise output quantity comparisons holds:

$$
Q_{st}(p_t, q_t, p_s, q_s) = Q_{st'}(p_{t'}, q_{t'}, p_s, q_s) \cdot Q_{t't}(p_t, q_t, p_{t'}, q_{t'}) \tag{55}
$$

The transitivity axiom can be interpreted as follows. Suppose a comparison
between time periods $s$ and $t$ is desired. There are two alternative approaches.
The first is a direct comparison between the two periods $s$ and $t$. The second is
to make an indirect chained comparison through a third time period, $t'$, using the
product $Q_{st'}(p_{t'}, q_{t'}, p_s, q_s) \cdot Q_{t't}(p_t, q_t, p_{t'}, q_{t'})$. The transitivity axiom stipulates
that the direct and the chained comparisons must yield the same value for the index.
The circularity test and transitivity axiom are similar in concept but not the same.
This can be established from Eqs. (54) and (55). Transitivity implies identity test
and the circularity test. The circularity test together with the identity axiom implies
transitivity. However, most index number formulae satisfy the time reversal test but
very few index number formulae satisfy the circularity and transitivity axioms.
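The time reversal test in Eq. (53) is easy to check numerically for a given formula. The sketch below follows the argument order $Q(p_t, q_t, p_s, q_s)$ used in this section; the function names and data are hypothetical.

```python
from math import isclose, sqrt

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def laspeyres(p_t, q_t, p_s, q_s):
    return dot(p_s, q_t) / dot(p_s, q_s)

def paasche(p_t, q_t, p_s, q_s):
    return dot(p_t, q_t) / dot(p_t, q_s)

def fisher(p_t, q_t, p_s, q_s):
    return sqrt(laspeyres(p_t, q_t, p_s, q_s) * paasche(p_t, q_t, p_s, q_s))

def satisfies_time_reversal(index, p_t, q_t, p_s, q_s):
    """Eq. (53): forward and backward comparisons must multiply to one."""
    forward = index(p_t, q_t, p_s, q_s)
    backward = index(p_s, q_s, p_t, q_t)  # roles of periods s and t swapped
    return isclose(forward * backward, 1.0, rel_tol=1e-12)

# Hypothetical two-commodity data (not taken from the chapter).
p_s, q_s = [1.0, 2.0], [10.0, 5.0]
p_t, q_t = [1.5, 2.0], [12.0, 5.0]

print(satisfies_time_reversal(laspeyres, p_t, q_t, p_s, q_s))
print(satisfies_time_reversal(fisher, p_t, q_t, p_s, q_s))
```

For these data the Laspeyres index fails the test while the Fisher index passes; a single example can refute a property for a formula but cannot, of course, establish it in general.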
The following two axioms relate price and quantity indices to value change.
Axiom 11: Factor test or product test: This test simply requires that the product
of a quantity index and a price index used must be equal to their value index.
$$
Q_{st}(p_t, q_t, p_s, q_s) \cdot P_{st}(p_t, q_t, p_s, q_s) = \frac{\sum_{i=1}^{N} p_{it}\, q_{it}}{\sum_{i=1}^{N} p_{is}\, q_{is}} = V_{st} \tag{56}
$$

Equation (56) places no restriction on the formulae chosen to compute $Q_{st}$ and
$P_{st}$. For example, if the Laspeyres index is chosen to measure price changes and the
Paasche index to measure quantity changes, these two index numbers together satisfy
the factor test. A stronger requirement would be to insist that the same formula be
used to compute both price and quantity changes.
Axiom 12 – Factor Reversal Test: According to this test, the product of price and
quantity comparisons derived using the same formula must equal the value change.
If $Q_{st}$ represents a quantity index formula and the price index uses the same formula
but with price and quantity vectors interchanged, then their product must equal
value change. Thus

$$
Q_{st}(p_t, q_t, p_s, q_s) \cdot Q_{st}(q_t, p_t, q_s, p_s) = \frac{\sum_{i=1}^{N} p_{it}\, q_{it}}{\sum_{i=1}^{N} p_{is}\, q_{is}} = V_{st} \tag{57}
$$

where Qst (qt , pt , qs , ps ) denotes a price index obtained using the same formula that
is used for the quantity index and by interchanging prices and quantities.
The factor reversal test represents a more stringent requirement than the factor
test and consequently it is satisfied by only a few index number formulae.
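A numerical check of the factor reversal test in Eq. (57) can be sketched as follows: the price index is obtained from the same function by interchanging price and quantity arguments. The function names and data are hypothetical.

```python
from math import sqrt

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def laspeyres(p_t, q_t, p_s, q_s):
    return dot(p_s, q_t) / dot(p_s, q_s)

def paasche(p_t, q_t, p_s, q_s):
    return dot(p_t, q_t) / dot(p_t, q_s)

def fisher(p_t, q_t, p_s, q_s):
    return sqrt(laspeyres(p_t, q_t, p_s, q_s) * paasche(p_t, q_t, p_s, q_s))

def factor_reversal_gap(index, p_t, q_t, p_s, q_s):
    """Difference between Q * P (same formula, roles swapped) and the value ratio."""
    quantity_index = index(p_t, q_t, p_s, q_s)
    price_index = index(q_t, p_t, q_s, p_s)  # prices and quantities interchanged
    value_ratio = dot(p_t, q_t) / dot(p_s, q_s)
    return quantity_index * price_index - value_ratio

# Hypothetical two-commodity data (not taken from the chapter).
p_s, q_s = [1.0, 2.0], [10.0, 5.0]
p_t, q_t = [1.5, 2.0], [12.0, 5.0]

gap_laspeyres = factor_reversal_gap(laspeyres, p_t, q_t, p_s, q_s)
gap_fisher = factor_reversal_gap(fisher, p_t, q_t, p_s, q_s)
print(gap_laspeyres, gap_fisher)
```

For these data the Laspeyres formula leaves a nonzero gap, whereas the gap for the Fisher formula is zero up to rounding error.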
We introduce two more axioms, though not standard, that play an important role
in characterizing the Fisher index.
Axiom 13 – Price Reversal Test: This test states that the quantity index remains
unchanged if the price vectors in periods s and t are reversed. Thus

$$
Q_{st}(p_t, q_t, p_s, q_s) = Q_{st}(p_s, q_t, p_t, q_s) \tag{58}
$$

This axiom reflects the general philosophy of the axiomatic approach that prices
and quantities are independently determined.
Axiom 14: Value dependence Axiom. This axiom states that the quantity index
can be expressed as a function of value aggregates formed out of prices and
quantities in the two periods expressed as:
$$
Q_{st}(p_t, q_t, p_s, q_s) = f\left( \sum_{i=1}^{N} p_{it}\, q_{it};\ \sum_{i=1}^{N} p_{is}\, q_{is};\ \sum_{i=1}^{N} p_{it}\, q_{is};\ \sum_{i=1}^{N} p_{is}\, q_{it} \right) \tag{59}
$$

This axiom restricts the form of the quantity index number. The Laspeyres,
Paasche and Fisher indices, among others, satisfy this axiom. However, none of
the geometric indices satisfy this axiom.
The tests listed above are a selection designed to give a feel for the axiomatic
approach to index numbers. There are many variants of these tests but it is sufficient
to focus on the tests listed here. For an in-depth coverage of the tests and their
inter-relationships, the reader may refer to Diewert [16], the Producer Price Index
Manual [35], and Balk [6].
An important logical question that arises is whether there exists an index number
formula that satisfies all these axioms. If such an index exists, that index can be used
for measuring output and input quantity changes.
Applying these tests to the formulae discussed so far, it is easy to check that
all the indices satisfy positivity, continuity, commensurability, and the bounds tests.
The Laspeyres and Paasche indices do not satisfy the time reversal test or the factor reversal
test. However, the Laspeyres price index paired with the Paasche quantity index, and the Paasche
price index paired with the Laspeyres quantity index, satisfy the product test. The Fisher
index satisfies the time reversal test as well as the factor reversal test. None of the
geometric indices listed (geometric-Laspeyres, geometric-Paasche, Törnqvist, and
geometric-Young) satisfies the factor reversal test.19 With the exception of the
geometric-Young index, none of these indices satisfies the circularity test.
The axioms are also used to characterize different index numbers. In what
follows, a few important results are stated (without proofs) to conclude this section.
Proofs of these results are available from Eichorn and Voeller [24], Diewert [16],
IMF [35], and Balk [6] just to mention a few.
Result 1: The only quantity index number $Q_{st}(p_t, q_t, p_s, q_s)$ that satisfies the
axioms of positivity, time reversal test, price reversal test and the factor reversal
test is the Fisher quantity index.
In this result the price reversal test in Eq. (58) is critical. A similar result
characterizing the Fisher index using the value dependence axiom is also available.
Result 2: A quantity index Qst (pt , qt , ps , qs ) satisfies the linear homogeneity
axiom, factor reversal test and the value dependence test if and only if the index
is the Fisher index.
Now we state an impossibility theorem which establishes that we cannot find an
index number formula that simultaneously satisfies even a small set of axioms.
Result 3: There does not exist an index number formula that simultaneously
satisfies the identity axiom, the circularity test and the factor reversal test.
There are several other impossibility theorems for price and quantity index
numbers which can be found in the book by Eichorn and Voeller [24].

19 The only geometric or log-change index that satisfies the factor reversal test is the Sato-Vartia
index [53, 62]. The Sato-Vartia index is similar to the Törnqvist index but the weights used are
based on logarithmic averages of expenditure shares in the current and base periods.

If one is seeking a formula that satisfies the circularity test or transitivity,20 the
following result due to Funke et al. [31] is important.
Result 4: A quantity index Qst (pt , qt , ps , qs ) satisfies monotonicity, linear homo-
geneity, identity, commensurability and circularity tests if and only if the index is of
the geometric-Young form:

$$
Q_{st}(p_t, q_t, p_s, q_s) = \prod_{i=1}^{N} \left( \frac{q_{it}}{q_{is}} \right)^{w_i} \quad \text{with } w_i > 0,\ i = 1, 2, \ldots, N \ \text{ and } \ \sum_{i=1}^{N} w_i = 1 \tag{60}
$$

This index is sometimes referred to as a Cobb-Douglas type index.

This is indeed an important result if one is selecting index number formula that
is required to satisfy transitivity as well as a few other relatively basic properties
listed in Result 4. A point to note here is that weights in the geometric-Young index
in Eq. (60) must be independent of the time periods under consideration. This rules
out indices such as the geometric-Laspeyres, geometric-Paasche and the Törnqvist
index.
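The role of period-independent weights can be illustrated by checking transitivity on three periods, comparing a geometric-Young index with fixed weights against the Törnqvist index, whose weights depend on the pair of periods compared. The function names, the fixed weights, and the data below are hypothetical.

```python
from math import exp, log, isclose

def revenue_shares(p, q):
    revenue = [pi * qi for pi, qi in zip(p, q)]
    total = sum(revenue)
    return [r / total for r in revenue]

def geometric_index(weights, q_t, q_s):
    """Weighted geometric mean of the quantity ratios q_it / q_is."""
    return exp(sum(w * log(a / b) for w, a, b in zip(weights, q_t, q_s)))

# Hypothetical prices and quantities for three periods (0, 1, 2).
prices = [[1.0, 2.0], [1.5, 2.0], [2.0, 2.5]]
quantities = [[10.0, 5.0], [12.0, 5.0], [12.0, 8.0]]
fixed_weights = [0.5, 0.5]  # assumed period-independent weights, summing to one

def geometric_young(s, t):
    return geometric_index(fixed_weights, quantities[t], quantities[s])

def tornqvist(s, t):
    r_s = revenue_shares(prices[s], quantities[s])
    r_t = revenue_shares(prices[t], quantities[t])
    weights = [(a + b) / 2 for a, b in zip(r_s, r_t)]
    return geometric_index(weights, quantities[t], quantities[s])

def is_transitive(index):
    """Check Q_02 = Q_01 * Q_12 on the three periods above."""
    return isclose(index(0, 2), index(0, 1) * index(1, 2), rel_tol=1e-9)

print(is_transitive(geometric_young))  # chained and direct comparisons agree
print(is_transitive(tornqvist))        # weights change with the periods compared
```

With fixed weights the logarithm of the index is additive across periods, which is why the chained and direct comparisons agree; for these data the Törnqvist index does not share this property.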
In concluding this section it is useful to remind the readers that the focus has
been mainly on output quantity index numbers. Similar axioms can be stated for
output price index numbers as well as input price and input quantity index numbers.
Input quantity index numbers can be obtained by simply replacing {pt , qt , ps , qs }
with input price and quantity vectors for the two periods, {wt , xt , ws , xs } and all the
axioms and results are then applicable for these vectors.

Economic Theoretic Approach to Output and Input Quantity Index Numbers

The economic theoretic, or economic, approach to index numbers has a long history. In
his seminal work, Konus [37] articulated an economic theoretic approach to measure
changes in the cost of living using the theory of consumer behavior. The Konus
index serves as a starting point for any discourse on consumer price index numbers.
Ragnar Frisch [30] in his survey article on index numbers referred to this approach
as the functional approach. This approach explicitly recognizes the existence of
a functional relationship between price and quantity data arising out of decision
making by consumers. On the production side, the work of Fisher and Shell [29]
offered a producer behavior perspective to producer price index numbers – their
contributions form the basis for the exposition in this section. However, it is the
path-breaking work of Diewert [13] followed by Caves et al. [10] that has provided
a framework to establish a systematic link between the Laspeyres, Paasche, Fisher,
and Törnqvist index numbers and producer behavior operating within the confines
of a production technology. Diewert [15, 16] and many of his works21 have been
instrumental in establishing economic theoretic properties of these index numbers
which were until then considered to be either statistical or heuristic or axiomatic. In
particular Diewert has shown that the Fisher and Törnqvist index numbers are exact
measures of output and input quantity and price change under certain specific forms
of the production technology and under the assumption that the observed input and
output quantities are both technically and allocatively efficient. Exposition in this
section draws heavily from Diewert [15].22

20 Transitivity is a property which is considered essential in the context of multilateral comparisons
of prices and quantities. See Section “Economic-Theoretic Approaches to Measurement of Output
Quantity Change” for further discussion on multilateral comparisons.

It is impossible to provide a comprehensive review of all the material available on
this topic. Further, some of the work in this area relies on sophisticated economic
and mathematical analysis. The principal objective of this section is to offer the
reader a flavor of what is available and provide an overview of developments in this
area. Throughout this section the focus is on bilateral comparisons involving a firm
over two time points. In order to limit the size of this section, the focus is on output
related measurement issues. Input related measures can be derived by following
similar concepts and measures with appropriate changes. Under a given technology,
a firm’s objective, for example, could be to maximize revenue by making decisions
on outputs produced given a vector of inputs and a vector of output prices. Just as
easily, the firm’s objective could be to minimize the cost of producing a given output
vector with a given vector of input prices.

Notation and Basic Framework

The notation and framework for this section is the same as that discussed in Section
“Notation and Preliminaries.” Reference will be made to the regularity conditions
R.1 listed in that section. The notion of revenue function plays an important role
in the discussion on output price and output quantity index numbers. Given the
production technology, the revenue function,23 r (p, x), is defined as the maximum
revenue feasible with a given vector of input endowments and output prices. This
function is defined as:

$$ r(p, x) \equiv \max_{q} \{\, p \cdot q : (x, q) \in S \,\} \tag{61} $$

21 Research work and contributions of Diewert in the area of index numbers can be found and
downloaded from his website: https://economics.ubc.ca/faculty-and-staff/w-erwin-diewert/
22 Of all the papers on this subject, Diewert [15] is easily one of the best expositions where the
treatment is focused solely on the economic theoretic approach and the matter is developed quite
systematically. The earlier work of Fisher and Shell [29] is another systematic exposition on this
subject but Diewert [15] is comprehensive in its coverage of this subject matter.
23 Note that this terminology differs slightly from that used by Diewert [15] but this terminology is
more intuitive.

Properties of the revenue function can be derived under the regularity conditions
R.1. Important properties of the revenue function include nonnegativity, and linear
homogeneity, continuity, and convexity in the output price vector p; it is also nondecreasing
in p. The revenue function is nondecreasing and concave in the input vector x.
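
The definition in Eq. (61) and some of the properties listed above can be illustrated with a small numerical sketch. The technology below is purely hypothetical: a finite list of activities, each an (input requirement, output vector) pair, with free disposal of inputs. It is a deliberately simplified stand-in and is not claimed to satisfy all of the regularity conditions R.1; the names `revenue` and `ACTIVITIES` are illustrative only.

```python
def revenue(p, x, activities):
    """Maximum revenue p.q over output vectors q that are feasible with inputs x."""
    feasible = [q for x_req, q in activities
                if all(need <= have for need, have in zip(x_req, x))]
    return max(sum(pi * qi for pi, qi in zip(p, q)) for q in feasible)

# Hypothetical technology: (input requirement, output vector) pairs.
ACTIVITIES = [
    ((0.0, 0.0), (0.0, 0.0)),   # inaction is always feasible
    ((1.0, 2.0), (4.0, 0.0)),
    ((2.0, 1.0), (0.0, 3.0)),
    ((2.0, 2.0), (3.0, 2.0)),
]

x = (2.0, 2.0)
r_base = revenue((1.0, 1.0), x, ACTIVITIES)                   # unit output prices
r_doubled = revenue((2.0, 2.0), x, ACTIVITIES)                # all prices doubled
r_less_input = revenue((1.0, 1.0), (1.0, 2.0), ACTIVITIES)    # smaller input vector
```

In this sketch doubling all output prices doubles the maximum revenue (linear homogeneity in p), and reducing the input vector cannot raise it (nondecreasing in x).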

Economic-Theoretic Approaches to Measurement of Output Quantity Change

The economic theoretic approach to measurement of output quantity changes
depends on whether output price data are available. In most practical situations,
price data are available and it is reasonable to assume revenue maximizing behavior
by the firms. When output price data are available, there are two options. The first
option is to measure output changes directly using the Samuelson and Swamy [52]
approach. Alternatively, one may first obtain a measure of price change as discussed
in Fisher and Shell [29] and use it to measure output change by applying the deflation
method. When price data are not available, a direct approach based
on the Malmquist output distance function is used to measure output quantity
change. Price data are usually not available for nonmarketed goods
and services such as education, health, and police services. The material in this section
is divided into the following four parts, each dealing with a different economic-theoretic
approach to the construction of output quantity index numbers.
Direct approach in the presence of output price data: Under this approach, the
output quantity index is measured directly but it is based on the notion of a revenue
function introduced in Eq. (61).
Fisher and Shell approach to output price index numbers: In the presence of
output price data, economic-theoretic measurement of price change is considered
using the Fisher and Shell approach.
Indirect or deflation approach to output quantity index numbers: This approach
makes use of the fundamental index number decomposition applied to the production
side, whereby the change in revenue from period s to t is expressed as a product
of price change and output quantity change. In this case, the indirect measure is
defined as the ratio of value change to a measure of price change. Essentially,
this approach compares the value of output in the base period with the value of output in the
current period after adjusting for movements in output prices, and it is therefore referred to as
the deflation approach.
Direct approach based on Malmquist output distance function: This approach
provides a measure of quantity change without the need for any price data. It
is anchored on the Malmquist output distance function, leading to what is usually
referred to as the Malmquist output quantity index.
As these approaches are all anchored on the underlying production technology,
empirical implementation and numerical measures of quantity indices depend in
principle on the knowledge of the production technology. However, in practice,
technology used by the firms is not known. In many practical situations, the only
observations available are the actual quantities of inputs used and outputs produced
along with data on input and output prices faced by the firms. These can be a random
sample from firms operating in the industry. In many practical situations where
index number methods are used only two data points – either observations for a
firm at two different points of time or for two firms at a given point of time – are
available. On the face of it, this looks like a mission impossible, but under certain
assumptions it can be shown that these measures of output change can be reasonably
well approximated by standard index number formulae such as the Törnqvist index
and the Fisher index.

Direct Measures of Quantity Change in the Presence of Price Data

The approach described here is due to Samuelson and Swamy [52] and further
discussion of the approach can also be found in Sato [54]. It makes use of the
existence of production technology S which satisfies the regularity conditions R.1
stated in Section “Notation and Preliminaries” and the availability of price and
quantity data for both outputs and inputs. Here the output index is defined as
the ratio of the maximum revenues attainable in periods t and s, using the production
technologies available in these periods, St and Ss, and the input quantity vectors xt and xs,
all evaluated at a common reference price vector pR. Using the revenue function defined in Eq. (61),
the Samuelson-Swamy (SS) output index is defined as:

$$ Q^{SS}_{st} \equiv \frac{r^{t}(p_R, x_t)}{r^{s}(p_R, x_s)} \tag{62} $$

This index depends on the choice of the reference price vector. If the base period
output price vector is the reference vector, then the resulting Laspeyres-type SS
output index is given by:

$$ Q^{SS-L}_{st} = \frac{r^{t}(p_s, x_t)}{r^{s}(p_s, x_s)} \tag{63} $$

If the firm is technically and allocatively efficient in both periods, revenue
maximizing behavior implies that the observed revenue in the base period equals
the denominator in (63). Then,

$$ Q^{SS-L}_{st} = \frac{r^{t}(p_s, x_t)}{r^{s}(p_s, x_s)} = \frac{r^{t}(p_s, x_t)}{\sum_{i=1}^{N} p_{is} q_{is}} \tag{64} $$

Similarly, Paasche-type SS index can be defined using the current period price
vector as the reference vector. Then,
$$ Q^{SS-P}_{st} = \frac{r^{t}(p_t, x_t)}{r^{s}(p_t, x_s)} = \frac{\sum_{i=1}^{N} p_{it} q_{it}}{r^{s}(p_t, x_s)} \tag{65} $$

If there is sufficient data, for example data on a large number of firms in both
years, then it would be possible to compute indices in (64) and (65) by first
identifying production technologies in periods s and t.
The SS index in Eq. (62) can be expressed as:

$$ Q^{SS}_{st} = \frac{r^{t}(p_R, x_t)}{r^{s}(p_R, x_t)} \times \frac{r^{s}(p_R, x_t)}{r^{s}(p_R, x_s)} \tag{66} $$

Equation (66) shows that the SS output index is the product of two components.
The first component provides a measure of technical change: for a given reference
price vector pR and input quantity vector xt, it measures the change in revenue
that is purely attributable to the change in technology from period s to t. The second
component measures the change in revenue purely attributable to the change in the input
vector from xs to xt.
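
The decomposition in Eq. (66) can be sketched numerically once a technology is assumed. The example below uses a hypothetical technology that is not taken from the chapter: two outputs and two inputs with production possibilities sqrt(q1² + q2²) ≤ A·sqrt(x1·x2), where A is a period-specific efficiency parameter. For strictly positive prices, the revenue function of this technology has the closed form A·sqrt(x1·x2)·sqrt(p1² + p2²).

```python
import math

def revenue(p, x, A):
    """Closed-form revenue function for the hypothetical technology
    {(x, q): sqrt(q1^2 + q2^2) <= A * sqrt(x1 * x2)}, valid for p > 0."""
    capacity = A * math.sqrt(x[0] * x[1])
    return capacity * math.sqrt(sum(pi * pi for pi in p))

def ss_decomposition(p_ref, x_s, x_t, A_s, A_t):
    """Return (SS index, technical-change term, input-change term) as in Eq. (66)."""
    technical = revenue(p_ref, x_t, A_t) / revenue(p_ref, x_t, A_s)
    inputs = revenue(p_ref, x_t, A_s) / revenue(p_ref, x_s, A_s)
    total = revenue(p_ref, x_t, A_t) / revenue(p_ref, x_s, A_s)
    return total, technical, inputs

# Illustrative (made-up) reference prices, inputs and efficiency levels.
total, technical, inputs = ss_decomposition(
    p_ref=(2.0, 5.0), x_s=(4.0, 9.0), x_t=(9.0, 16.0), A_s=1.0, A_t=1.1)
```

For this particular technology the reference price vector cancels out of every ratio, so the technical-change term equals A_t/A_s. In general the SS index does depend on the choice of pR, as noted after Eq. (62).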

Indirect Output Quantity Index Numbers Using Output Price Index Numbers

The indirect approach to the measurement of output change was first described in
Fisher and Shell [29]. Under this approach, a suitable output price index is first
measured and the quantity index is then derived indirectly using the fundamental
index decomposition of revenue (value) change into output price change and output
quantity change. Then
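$$
\begin{aligned}
\text{Revenue change} &= \frac{\sum_{i=1}^{N} p_{it} q_{it}}{\sum_{i=1}^{N} p_{is} q_{is}} \\
\text{Price index} &= P_{st}(p_t, q_t, p_s, q_s) \\
\text{Indirect quantity index} &\equiv \left[ \frac{\sum_{i=1}^{N} p_{it} q_{it}}{\sum_{i=1}^{N} p_{is} q_{is}} \right] \Big/ P_{st}(p_t, q_t, p_s, q_s)
\end{aligned}
\tag{67}
$$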

In order to derive the quantity index defined in Eq. (67), it is necessary to obtain
an economic theoretic measure of output price change. Here one may employ the
Fisher-Shell price index24 which uses the revenue function to define the output price
index.

The Fisher-Shell Output Price Index

The Fisher and Shell (FS) (1972) output price index, $P^{F-S}_{st}$, is defined as the ratio
of the maximum revenues generated under the prices prevailing in periods t and s using a

24 This is somewhat similar in concept to the Konus index used in measuring changes in consumer
prices.

reference technology, SR , and a fixed reference input vector xR . The FS price index
is given by

$$ P^{F-S}_{st} = P_{st}(p_t, p_s, x_R, S^R) \equiv \frac{R^{R}(p_t, x_R)}{R^{R}(p_s, x_R)} \tag{68} $$

In the case of a single output, this price index reduces to the ratio of output prices
in the two periods.
In addition to the assumptions on the production technology, assume
that the observed output quantities in periods t and s are optimal in that they are
solutions to the problem of maximizing revenue given the technology prevailing in
each period and the input vector available. This means that these observations
represent output vectors that are technically and allocatively efficient. Thus,

$$ R^{t}(p_t, x_t) = \sum_{i=1}^{N} p_{it} q_{it} = p_t \cdot q_t \quad \text{and} \quad R^{s}(p_s, x_s) = \sum_{i=1}^{N} p_{is} q_{is} = p_s \cdot q_s \tag{69} $$

Equation (69) provides information helpful to link the FS output price index
concept in (68) to the price index numbers discussed in Section “Index Number
Approach to Measuring Quantity Change.”
The following results are stated without proof.
Result 5: If the price vectors are strictly positive, the input vector is nonnegative
with at least one positive input, the reference technology satisfies the standard
regularity conditions of production technology, and a nontrivial output vector is
feasible with the given input vector, then the FS price index in Eq. (68) lies between the
minimum and maximum observed price ratios.

$$ \min_i \left\{ \frac{p_{it}}{p_{is}},\; i = 1, 2, \ldots, N \right\} \le P^{FS}_{st} \le \max_i \left\{ \frac{p_{it}}{p_{is}},\; i = 1, 2, \ldots, N \right\} \tag{70} $$

For proof, see Diewert [15], p. 1058. This means that the FS price index satisfies
the standard bounds-axiom.

FS-Laspeyres and FS-Paasche Output Price Index Numbers

In empirical implementation of the Fisher-Shell index, the most obvious choices
for the reference technology and the input vectors would be those prevailing in
periods t and s. When the base period input vector and production technology are
used, it leads to the FS-Laspeyres index indicating that the index relies on base
period information. Similarly FS-Paasche index uses period t technology and the
current/comparison period input vector. These are defined as:

$$ P^{FS-L}_{st} \equiv \frac{R^{s}(p_t, x_s)}{R^{s}(p_s, x_s)} \quad \text{and} \quad P^{FS-P}_{st} \equiv \frac{R^{t}(p_t, x_t)}{R^{t}(p_s, x_t)} \tag{71} $$

Result 6: Under the assumption that the regularity conditions R.1 hold in periods
t and s and that the observed outputs are optimal in periods t and s (technically
and allocatively efficient), then the FS-Laspeyres and FS-Paasche indices satisfy
the following inequalities.

$$ P^{FS-L}_{st} = \frac{R^{s}(p_t, x_s)}{R^{s}(p_s, x_s)} = \frac{R^{s}(p_t, x_s)}{p_s \cdot q_s} \ge \frac{p_t \cdot q_s}{p_s \cdot q_s} = P^{L}_{st} \tag{72} $$

and

$$ P^{FS-P}_{st} = \frac{R^{t}(p_t, x_t)}{R^{t}(p_s, x_t)} = \frac{p_t \cdot q_t}{R^{t}(p_s, x_t)} \le \frac{p_t \cdot q_t}{p_s \cdot q_t} = P^{P}_{st} \tag{73} $$

This means that the standard Laspeyres and Paasche output price indices
discussed in Section “Index Number Approach to Measuring Quantity Change”
provide bounds for the economic-theoretic FS output price index numbers. This
is the first link between the economic theoretic output price index and the index
number formulae based on statistical and other considerations.
Proofs of Result 6 and the results that follow can be found in Diewert [17].
Result 7: Combining results 5 and 6, the following bounds hold for the FS-
Laspeyres and FS-Paasche indices:

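$$ P^{L}_{st} = \frac{p_t \cdot q_s}{p_s \cdot q_s} \le P^{FS-L}_{st} \le \max_i \left\{ \frac{p_{it}}{p_{is}},\; i = 1, 2, \ldots, N \right\} \tag{74} $$

$$ \min_i \left\{ \frac{p_{it}}{p_{is}},\; i = 1, 2, \ldots, N \right\} \le P^{FS-P}_{st} \le \frac{p_t \cdot q_t}{p_s \cdot q_t} = P^{P}_{st} \tag{75} $$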

Equations (74) and (75) provide useful upper and lower bounds for the unknown
FS-Laspeyres and FS-Paasche index numbers. These bounds can be computed using
observed data on output prices and quantities even in the absence of any knowledge
about the actual production technology.
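
As a minimal sketch of how the bounds in Eqs. (74) and (75) might be computed from observed data alone, the following function returns the two intervals. The price and quantity figures are made up for illustration, and the function name `fs_price_bounds` is hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fs_price_bounds(p_s, q_s, p_t, q_t):
    """Observable bounds from Eqs. (74) and (75).

    Returns ((lower, upper) for FS-Laspeyres, (lower, upper) for FS-Paasche).
    """
    ratios = [pt / ps for pt, ps in zip(p_t, p_s)]
    laspeyres = dot(p_t, q_s) / dot(p_s, q_s)   # P^L: base-period quantities
    paasche = dot(p_t, q_t) / dot(p_s, q_t)     # P^P: current-period quantities
    return (laspeyres, max(ratios)), (min(ratios), paasche)

# Illustrative (made-up) data for two outputs.
p_s, q_s = [2.0, 5.0], [10.0, 4.0]
p_t, q_t = [2.5, 5.5], [12.0, 5.0]
fs_l_bounds, fs_p_bounds = fs_price_bounds(p_s, q_s, p_t, q_t)
```

Both intervals are nonempty because the Laspeyres and Paasche price indices are weighted averages of the individual price relatives and so lie between the smallest and largest of them.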
The following result shows that under certain conditions there exists a theoretical
FS index for which the Laspeyres and Paasche indices simultaneously provide lower
and upper bounds respectively.
Result 8: Suppose the price and input vectors in periods s and t are strictly
positive and that technologies in period s and t are well-behaved, then there exists
a constant λ, 0 ≤ λ ≤ 1, such that the following bounds hold for the FS index
defined for production technology25 $S^{*} = (1 - \lambda) S^{s} + \lambda S^{t}$ and the input vector
$x^{*} = (1 - \lambda) x_s + \lambda x_t$:

25 The convex combination of technologies Ss and St is the convex hull of the two technologies.
For example, this implies that for a given input vector x, the production possibility set under the
convex combination of technologies would be the smallest convex set that contains the production
possibilities sets in periods s and t, Ss and St .

$$ P^{L}_{st} = \frac{p_t \cdot q_s}{p_s \cdot q_s} \le P^{FS}_{st}(p_t, p_s, x^{*}, S^{*}) \le \frac{p_t \cdot q_t}{p_s \cdot q_t} = P^{P}_{st} \tag{76} $$

The bounds given in Eq. (76) are important for two reasons. First, they bound the unknown
Fisher-Shell output price index defined for a weighted average of the input vectors and the
corresponding convex combination of technologies. An average of the standard Laspeyres
and Paasche indices is therefore likely to be close to the unknown theoretical output price
index, which offers support for the use of the Fisher index. Second, if the Laspeyres and
Paasche indices are numerically close, then the unknown theoretical index lies within a
small numerical interval.
The next result is similar to the result proved by Diewert [13, 14] where he
established the notion of exact and superlative consumer price index numbers. This
version focuses on the production side counterpart and offers a result for the FS
output price index.
Result 9: Suppose the regularity conditions R.1 hold and that the observed output
vectors in periods s and t are revenue maximizing under respective technologies
and output prices (i.e., observed outputs are technically and allocatively efficient).
Further, let the revenue function in each of the two observed periods τ (τ = s, t)
have the translog functional form:

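$$
\begin{aligned}
\ln R^{\tau}(p, x) = {} & a_0^{\tau} + \sum_{k=1}^{K} b_k^{\tau} \ln x_k + \frac{1}{2} \sum_{k=1}^{K} \sum_{l=1}^{K} b_{kl}^{\tau} \ln x_k \ln x_l \\
& + \sum_{i=1}^{N} a_i^{\tau} \ln p_i + \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} a_{ij}^{\tau} \ln p_i \ln p_j \\
& + \sum_{i=1}^{N} \sum_{k=1}^{K} c_{ik}^{\tau} \ln p_i \ln x_k, \qquad \tau = s, t
\end{aligned}
\tag{77}
$$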

where $b_{kl}^{\tau} = b_{lk}^{\tau}$ and $a_{ij}^{\tau} = a_{ji}^{\tau}$ for all i, j = 1, 2, . . . , N and k, l = 1, 2, . . . , K; τ = s, t.
Assume further that the second order coefficients associated with prices in Eq.
(77) are the same in both periods, i.e., $a_{ij}^{s} = a_{ij}^{t}$ for all i and j. Then the geometric
average of the FS-Laspeyres and FS-Paasche output price index numbers is exactly
equal to the Törnqvist index in Eq. (40):

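$$ \left[ P^{FS-L}_{st} \cdot P^{FS-P}_{st} \right]^{1/2} = \left[ \frac{R^{s}(p_t, x_s)}{R^{s}(p_s, x_s)} \cdot \frac{R^{t}(p_t, x_t)}{R^{t}(p_s, x_t)} \right]^{1/2} = \prod_{i=1}^{N} \left( \frac{p_{it}}{p_{is}} \right)^{\frac{w_{i1} + w_{i2}}{2}} = P^{TT}_{st} \tag{78} $$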

This result is quite important as it forges a link between index number theory
and practice. The left-hand-side of Eq. (78) is an economic theoretic concept based
on the Fisher-Shell (1972) approach and the right-hand-side of the equation is a
practitioners’ measure of output price change which is a weighted geometric average
of commodity specific price change.
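
The right-hand side of Eq. (78) needs only observed prices and quantities. The sketch below assumes that the weights are the revenue shares of each output in the two periods, which is the usual convention for the Törnqvist index; the data and function names are illustrative.

```python
import math

def revenue_shares(p, q):
    total = sum(pi * qi for pi, qi in zip(p, q))
    return [pi * qi / total for pi, qi in zip(p, q)]

def tornqvist_price_index(p_s, q_s, p_t, q_t):
    """Right-hand side of Eq. (78): share-weighted geometric mean of price relatives."""
    w_s, w_t = revenue_shares(p_s, q_s), revenue_shares(p_t, q_t)
    log_index = sum(0.5 * (ws + wt) * math.log(pt / ps)
                    for ws, wt, ps, pt in zip(w_s, w_t, p_s, p_t))
    return math.exp(log_index)

# Illustrative (made-up) data for two outputs.
p_s, q_s = [2.0, 5.0], [10.0, 4.0]
p_t, q_t = [2.5, 5.5], [12.0, 5.0]
p_tt = tornqvist_price_index(p_s, q_s, p_t, q_t)
```

Because the averaged shares sum to one, the index lies between the smallest and largest price relatives, and it equals the common factor when all prices change in the same proportion.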
Result 10: Suppose all the conditions stated in Result 9 hold and that the
revenue function is given by a quadratic function. If the second order coefficients
corresponding to output prices are identical in both periods s and t then the
geometric average of FS Laspeyres and FS Paasche indices is exactly equal to the
Fisher output price index. Thus

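$$ \left[ P^{FS-L}_{st} \cdot P^{FS-P}_{st} \right]^{1/2} = \left[ \frac{R^{s}(p_t, x_s)}{R^{s}(p_s, x_s)} \cdot \frac{R^{t}(p_t, x_t)}{R^{t}(p_s, x_t)} \right]^{1/2} = \left[ \frac{\sum_{i=1}^{N} p_{it} q_{is}}{\sum_{i=1}^{N} p_{is} q_{is}} \cdot \frac{\sum_{i=1}^{N} p_{it} q_{it}}{\sum_{i=1}^{N} p_{is} q_{it}} \right]^{1/2} = P^{F}_{st} \tag{79} $$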

Results 9 and 10 are important in establishing links between economic theoretic
measures of output price change and the commonly used Fisher and Törnqvist index
numbers which can be computed using observed price and quantity data without the
need for any additional information or knowledge about the production technology.

Exact and Superlative Index Numbers

At this stage it is useful to introduce the reader to the notion of exact and superlative
index numbers which were introduced in a path-breaking contribution by Diewert
[13]. An index number is said to be exact if it is equal to the economic theoretic
index when the revenue function takes a particular form. Result 9 shows that the
Törnqvist index is exact for the translog revenue function, and Result 10 shows that the
Fisher index is exact for the quadratic revenue function. Diewert [13]
designates an index as superlative if it is exact for a revenue function that
provides a second order approximation to the unknown revenue function. Since
the translog function provides a second order approximation to the unknown revenue
function, the Törnqvist index is a superlative index. Similarly, the Fisher index is also
superlative since it is exact for a quadratic revenue function, which also provides a
second order approximation to the unknown revenue function. These results hold
only when observed output vectors are technically and allocatively efficient for the
prevailing technologies and output prices.
An advantage of the TT and Fisher indices is that they are numerically very
close to each other, almost to four significant digits [14]. Empirically, therefore, it
matters little which of the two indices is chosen.
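
The numerical closeness of the two superlative indices can be checked on any data set. The sketch below uses made-up prices and quantities, so the size of the gap it reports is illustrative only and says nothing about other data.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fisher_price_index(p_s, q_s, p_t, q_t):
    laspeyres = dot(p_t, q_s) / dot(p_s, q_s)
    paasche = dot(p_t, q_t) / dot(p_s, q_t)
    return math.sqrt(laspeyres * paasche)

def tornqvist_price_index(p_s, q_s, p_t, q_t):
    v_s, v_t = dot(p_s, q_s), dot(p_t, q_t)   # total revenue in each period
    return math.exp(sum(
        0.5 * (ps * qs / v_s + pt * qt / v_t) * math.log(pt / ps)
        for ps, qs, pt, qt in zip(p_s, q_s, p_t, q_t)))

# Illustrative (made-up) data for two outputs.
p_s, q_s = [2.0, 5.0], [10.0, 4.0]
p_t, q_t = [2.5, 5.5], [12.0, 5.0]
fisher = fisher_price_index(p_s, q_s, p_t, q_t)
tornqvist = tornqvist_price_index(p_s, q_s, p_t, q_t)
gap = abs(fisher - tornqvist)
```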

Indirect Output Quantity Index

Returning to Eq. (67), the indirect output quantity index based on the Fisher-Shell
output price index, $Q^{FS}_{st}$, is given by:

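$$
\begin{aligned}
Q^{FS}_{st} &= \left[ \sum_{i=1}^{N} p_{it} q_{it} \Big/ \sum_{i=1}^{N} p_{is} q_{is} \right] \Big/ P^{FS}_{st}(p_t, q_t, p_s, q_s) \\
&= \frac{\sum_{i=1}^{N} p_{it} q_{it} \big/ \sum_{i=1}^{N} p_{is} q_{is}}{P^{FS}_{st}(p_t, p_s, x_R)} = \frac{\sum_{i=1}^{N} p_{it} q_{it} \big/ \sum_{i=1}^{N} p_{is} q_{is}}{R^{R}(p_t, x_R) / R^{R}(p_s, x_R)}
\end{aligned}
\tag{80}
$$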

where the technology and input vector in reference period R are used in defining the
Fisher-Shell index. The following result can be easily established.
The indirect quantity index in Eq. (80) satisfies the properties of positivity, continuity,
commensurability, and the time reversal test. However, it fails the homogeneity
property, which states that if the output vector in period t is a positive scalar
multiple, λ, of the output vector in period s, then the output quantity index must be
equal to that scalar ([15], p. 1071).
Note here that implementation of the Fisher-Shell output price index requires the
selection of a reference input vector and a reference technology. Results 9 and 10
suggest that the true Fisher-Shell price index in (80) may be replaced by exact or
superlative indices such as the Fisher and the Törnqvist price indices.
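
A minimal sketch of the deflation approach, with the Fisher price index standing in for the unknown Fisher-Shell index in Eq. (80), is given below. The data are made up and the helper names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fisher_index(a_s, a_t, b_s, b_t):
    """Fisher index for variable a, using variable b as weights in periods s and t."""
    laspeyres = dot(a_t, b_s) / dot(a_s, b_s)
    paasche = dot(a_t, b_t) / dot(a_s, b_t)
    return math.sqrt(laspeyres * paasche)

def indirect_quantity_index(p_s, q_s, p_t, q_t):
    """Eq. (80) with the Fisher price index standing in for the Fisher-Shell index."""
    revenue_change = dot(p_t, q_t) / dot(p_s, q_s)
    return revenue_change / fisher_index(p_s, p_t, q_s, q_t)

# Illustrative (made-up) data for two outputs.
p_s, q_s = [2.0, 5.0], [10.0, 4.0]
p_t, q_t = [2.5, 5.5], [12.0, 5.0]
q_indirect = indirect_quantity_index(p_s, q_s, p_t, q_t)
q_direct = fisher_index(q_s, q_t, p_s, p_t)   # direct Fisher quantity index
```

With the Fisher formula as deflator, the indirect index coincides with the direct Fisher quantity index, since the product of the Fisher price and quantity indices equals the revenue ratio. Other deflators, such as the Törnqvist price index, do not in general have this property.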

Direct Quantity Index Based on Malmquist Distance Function

The direct quantity index based on the Malmquist distance function is usually
referred to as the Malmquist output index. The Malmquist quantity index, denoted
by $Q^{M}_{st}$, is defined using a reference technology, $S^{R}$, and a reference input vector, $x_R$.
It is defined as:

$$ Q^{M}_{st} \equiv \frac{D_o^{R}(x_R, q_t)}{D_o^{R}(x_R, q_s)} \tag{81} $$

There are several advantages with the Malmquist output index. First, the
Malmquist output index in (81) does not require any output price data. This is
an advantage when it comes to measuring output growth in nonmarketed goods
and services like education and health. Second, the Malmquist index satisfies the
homogeneity property which is not satisfied by the indirect index.
The Malmquist output index in (81) depends on the reference technology and
reference input vector used unless the technology satisfies homothetic separability
condition ([15], p. 1071) which states that the output distance function must have
the following form:

$$ D_o^{t}(x, q) = \frac{1}{g(q)\, h(t, x)} \tag{82} $$

where g(.) and h(., .) are appropriately defined functions (see [15], pages 1069 and
1071). In general, the homothetic separability condition does not hold empirically
and therefore it is necessary to select the reference technology and the reference
input vector before the Malmquist quantity index in Eq. (81) can be computed.
The most natural choices available are the base and current period input vectors and
technologies. This leads to two alternative forms of the Malmquist index.

Malmquist-Laspeyres Index
As the title suggests, here the technology and input vector associated with the base
period are selected.

$$ Q^{M-L}_{st} \equiv \frac{D_o^{s}(x_s, q_t)}{D_o^{s}(x_s, q_s)} \tag{83} $$

If the production vector qs is on the frontier of the output set associated with
input vector xs and technology in period s, we have $D_o^{s}(x_s, q_s) = 1$. In this case,
the Malmquist-Laspeyres index simplifies to $Q^{M-L}_{st} = D_o^{s}(x_s, q_t)$.

Malmquist-Paasche Index
Use of information from the comparison or current period leads to the Malmquist-
Paasche index:

$$ Q^{M-P}_{st} \equiv \frac{D_o^{t}(x_t, q_t)}{D_o^{t}(x_t, q_s)} \tag{84} $$

If the production vector qt is on the frontier of the output set with input vector
xt and technology in period t, then the Malmquist-Paasche index simplifies to
$Q^{M-P}_{st} = 1 / D_o^{t}(x_t, q_s)$.

Malmquist-Fisher Index
In the spirit of the Fisher index, the Malmquist-Fisher index is defined as the
geometric mean of the Malmquist-Laspeyres and Malmquist-Paasche indices in Eqs. (83) and
(84).

$$ Q^{M-F}_{st} \equiv \left[ \frac{D_o^{s}(x_s, q_t)}{D_o^{s}(x_s, q_s)} \cdot \frac{D_o^{t}(x_t, q_t)}{D_o^{t}(x_t, q_s)} \right]^{1/2} \tag{85} $$
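
When the output distance function is known, Eqs. (83) to (85) can be evaluated directly. The sketch below assumes a hypothetical distance function, not one taken from the chapter: D_o(x, q) = sqrt(a1·q1² + a2·q2²) / (A·sqrt(x1·x2)), where the weights a and the efficiency level A may differ between periods.

```python
import math

def output_distance(x, q, a, A):
    """Hypothetical D_o(x, q) = sqrt(sum_i a_i q_i^2) / (A * sqrt(x1 * x2))."""
    aggregate = math.sqrt(sum(ai * qi * qi for ai, qi in zip(a, q)))
    return aggregate / (A * math.sqrt(x[0] * x[1]))

def malmquist_indices(x_s, q_s, x_t, q_t, tech_s, tech_t):
    """Eqs. (83)-(85); tech_s and tech_t are (a, A) parameter pairs."""
    laspeyres = output_distance(x_s, q_t, *tech_s) / output_distance(x_s, q_s, *tech_s)
    paasche = output_distance(x_t, q_t, *tech_t) / output_distance(x_t, q_s, *tech_t)
    return laspeyres, paasche, math.sqrt(laspeyres * paasche)

# Illustrative (made-up) technologies and observations.
tech_s = ((1.0, 1.0), 1.0)
tech_t = ((1.0, 2.0), 1.1)
x_s, x_t = (4.0, 9.0), (9.0, 16.0)
q_s, q_t = (10.0, 4.0), (12.0, 5.0)
m_l, m_p, m_f = malmquist_indices(x_s, q_s, x_t, q_t, tech_s, tech_t)

# Homogeneity check: period-t output is three times period-s output.
scaled = malmquist_indices(x_s, q_s, x_t, tuple(3.0 * v for v in q_s), tech_s, tech_t)
```

In this example the input vector cancels out of each ratio, but the Laspeyres and Paasche versions still differ because the output weights a change between periods. All three versions return the scale factor when output changes proportionally, which is the homogeneity property.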

The Laspeyres, Paasche, and Fisher versions of the Malmquist output index
require the knowledge of the production technology or equivalently the output
distance function so that all the distances in Eqs. (83), (84), and (85) can be
evaluated. In the absence of this key information, the following results, stated
without proof, from Diewert [15] are quite useful.
Result 11: Suppose the regularity conditions R.1 hold and that the observed
output vectors in periods s and t are revenue maximizing under respective tech-
nologies and output prices (i.e., observed outputs are technically and allocatively
efficient). The Malmquist-Laspeyres output index in (83) is bounded from below by
the Laspeyres output quantity index.

$$ Q^{M-L}_{st} \equiv \frac{D_o^{s}(x_s, q_t)}{D_o^{s}(x_s, q_s)} \ge Q^{L}_{st} = \frac{p_s \cdot q_t}{p_s \cdot q_s} $$

Result 12: Suppose the regularity conditions R.1 hold and that the observed
output vectors in periods s and t are revenue maximizing under respective tech-
nologies and output prices (i.e., observed outputs are technically and allocatively
efficient). The Malmquist-Paasche output index is bounded from above by the
Paasche quantity index.
$$ Q^{M-P}_{st} \equiv \frac{D_o^{t}(x_t, q_t)}{D_o^{t}(x_t, q_s)} \le Q^{P}_{st} = \frac{p_t \cdot q_t}{p_t \cdot q_s} $$

Result 13: Suppose the regularity conditions R.1 hold and that the observed out-
put vectors in periods s and t are revenue maximizing under respective technologies
and output prices (i.e., observed outputs are technically and allocatively efficient).
Suppose further that technologies in periods s and t are sufficiently well behaved
so that the distance functions are well defined for convex production technologies.
There exists a scalar λ, 0 ≤ λ ≤ 1, such that the Malmquist index defined using a convex
combination of the production technologies in periods s and t, λSs + (1 − λ)St, and the
weighted average of input vectors in periods s and t, λxs + (1 − λ)xt, is bounded
by the Laspeyres and Paasche quantity index numbers. Stated formally, this result
gives the following bounds:

$$ Q^{L}_{st} = \frac{p_s \cdot q_t}{p_s \cdot q_s} \le \frac{D_o^{*}(x^{*}, q_t)}{D_o^{*}(x^{*}, q_s)} \le Q^{P}_{st} = \frac{p_t \cdot q_t}{p_t \cdot q_s} $$

where $S^{*} = \lambda S^{s} + (1 - \lambda) S^{t}$ and $x^{*} = \lambda x_s + (1 - \lambda) x_t$.

This result implies that by using the Fisher index it may be possible to get closer
to the unknown Malmquist quantity index. This theorem is particularly useful when
the Laspeyres and Paasche indices computed using observed price and quantity data
are numerically close to each other.
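
The bounds in Results 11 and 12 can be checked numerically for a technology in which the revenue-maximizing outputs are known in closed form. The sketch below assumes a hypothetical output set {q ≥ 0 : sqrt(q1² + q2²) ≤ c}, where c summarizes the period's inputs and technology. With strictly positive prices p, the revenue-maximizing output is c·p/‖p‖ and the output distance of any q is ‖q‖/c.

```python
import math

def norm(v):
    return math.sqrt(sum(a * a for a in v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def optimal_output(p, c):
    """Revenue-maximising q on the hypothetical output set {q >= 0 : ||q|| <= c}."""
    n = norm(p)
    return [c * pi / n for pi in p]

def distance(q, c):
    """Output distance of q relative to the output set with capacity c."""
    return norm(q) / c

# Illustrative (made-up) prices and capacities for periods s and t.
p_s, p_t = [2.0, 5.0], [2.5, 5.5]
c_s, c_t = 10.0, 12.0
q_s, q_t = optimal_output(p_s, c_s), optimal_output(p_t, c_t)

malmquist_l = distance(q_t, c_s) / distance(q_s, c_s)   # Eq. (83)
malmquist_p = distance(q_t, c_t) / distance(q_s, c_t)   # Eq. (84)
laspeyres = dot(p_s, q_t) / dot(p_s, q_s)               # Q^L
paasche = dot(p_t, q_t) / dot(p_t, q_s)                 # Q^P
fisher = math.sqrt(laspeyres * paasche)                 # Q^F
```

For this technology the Fisher quantity index coincides with the Malmquist indices, while the Laspeyres and Paasche quantity indices lie on either side of them.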
The following result due to Caves et al. [10] provides a direct link between the
Törnqvist index, Fisher Index and the Malmquist-Fisher theoretical index in Eq.
(85). This result is similar in content to Result 9.
Result 14: If the following conditions hold: (i) the production technology
satisfies the standard regularity conditions stated in R.1; (ii) the output vector,
output price vector and input vector are all strictly positive26 ; (iii) the output vectors
in periods s and t are technically and allocatively efficient, respectively, for the input
vectors and technologies in these two periods; and (iv) the distance function assumes
the following translog functional form:

26 All elements of these vectors are strictly positive.


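$$
\begin{aligned}
\ln D_o^{\tau}(x, q) = {} & a_0^{\tau} + \sum_{k=1}^{K} b_k^{\tau} \ln x_k + \frac{1}{2} \sum_{k=1}^{K} \sum_{l=1}^{K} b_{kl}^{\tau} \ln x_k \ln x_l \\
& + \sum_{i=1}^{N} a_i^{\tau} \ln q_i + \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} a_{ij}^{\tau} \ln q_i \ln q_j \\
& + \sum_{i=1}^{N} \sum_{k=1}^{K} c_{ik}^{\tau} \ln q_i \ln x_k, \qquad \tau = s, t
\end{aligned}
$$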

where $b_{kl}^{\tau} = b_{lk}^{\tau}$ and $a_{ij}^{\tau} = a_{ji}^{\tau}$ for all i, j = 1, 2, . . . , N and k, l = 1, 2, . . . , K; τ = s, t;
and (v) $a_{ij}^{s} = a_{ij}^{t}$ for all i, j = 1, 2, . . . , N, then the economic theoretic Malmquist
output index based on the Malmquist-Fisher type index in Eq. (85) is exactly equal
to the Törnqvist output index. That is:

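$$ \left[ \frac{D_o^{s}(x_s, q_t)}{D_o^{s}(x_s, q_s)} \cdot \frac{D_o^{t}(x_t, q_t)}{D_o^{t}(x_t, q_s)} \right]^{1/2} = \prod_{i=1}^{N} \left( \frac{q_{it}}{q_{is}} \right)^{\frac{w_{i1} + w_{i2}}{2}} = Q^{TT}_{st} \tag{86} $$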

This result is usually restated to indicate that the Törnqvist quantity index is exact
for a translog output distance function. This index is also considered superlative
as the translog distance function provides a second order approximation to any
unknown output distance function.
A result similar to Result 14 can be established to show that if all the conditions in
Result 14 hold and further if the distance function is quadratic with identical second
order coefficients in both periods s and t, then the theoretical Malmquist-Fisher type
index is exactly equal to the Fisher index. In this case:

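$$ \left[ \frac{D_o^{s}(x_s, q_t)}{D_o^{s}(x_s, q_s)} \cdot \frac{D_o^{t}(x_t, q_t)}{D_o^{t}(x_t, q_s)} \right]^{1/2} = \left[ \frac{\sum_{i=1}^{N} p_{is} q_{it}}{\sum_{i=1}^{N} p_{is} q_{is}} \cdot \frac{\sum_{i=1}^{N} p_{it} q_{it}}{\sum_{i=1}^{N} p_{it} q_{is}} \right]^{1/2} = Q^{F}_{st} \tag{87} $$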

Equation (87) shows that the theoretical Malmquist index is equal to the Fisher
index introduced in Section “Index Number Approach to Measuring Quantity
Change” provided the distance function is quadratic. These two results are significant
since they show that, under plausible assumptions on the distance function and
the data collected, it is possible to compute the theoretical Malmquist
output index using only price and quantity data from the two periods. The main
assumption here is that the distance function is quadratic and has the same second
order coefficients associated with the outputs in both periods. The Fisher index is thus
known as an exact and superlative index number. The Fisher and Törnqvist indices can
therefore be considered good approximations to the unknown theoretical indices.
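
The right-hand side of Eq. (86) can be computed from observed prices and quantities in the same way as its price counterpart. The sketch below assumes that the weights are revenue shares in the two periods and uses made-up data; the function name is illustrative.

```python
import math

def tornqvist_quantity_index(p_s, q_s, p_t, q_t):
    """Right-hand side of Eq. (86), with revenue shares as weights."""
    v_s = sum(p * q for p, q in zip(p_s, q_s))
    v_t = sum(p * q for p, q in zip(p_t, q_t))
    log_index = sum(
        0.5 * (ps * qs / v_s + pt * qt / v_t) * math.log(qt / qs)
        for ps, qs, pt, qt in zip(p_s, q_s, p_t, q_t))
    return math.exp(log_index)

# Illustrative (made-up) data for two outputs.
p_s, q_s = [2.0, 5.0], [10.0, 4.0]
p_t, q_t = [2.5, 5.5], [12.0, 5.0]
q_tt = tornqvist_quantity_index(p_s, q_s, p_t, q_t)

# Proportional output change: every quantity grows by 50 percent.
q_scaled = tornqvist_quantity_index(p_s, q_s, p_t, [1.5 * q for q in q_s])
```

When all outputs grow in the same proportion the index returns that proportion, whatever the prices.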
So far three alternative theoretical measures of output quantity change have
been considered – the direct index due to Samuelson and Swamy [52]; the indirect
index based on deflated value change with the Fisher-Shell output deflator; and
the Malmquist output index based on the notion of output distance function. This
section is concluded with two key points. The first point is that of all the three
indices, the Malmquist output index is the only index that satisfies the homogeneity
condition. The second point is that all the three indices coincide when the production
technology satisfies the homothetic separability condition (see Theorem 5, [15],
p. 1067).

Input Quantity Index Numbers

This chapter so far has entirely focused on economic theoretic approach to output
quantity index numbers. For productivity measurement, input quantity index num-
bers are equally important. The general apparatus used in the construction of input
quantity index numbers is much the same as that used for output index numbers. The
main difference is that the input quantity indices make use of input cost functions
and Malmquist input distance functions.
Given the production technology, the cost function is defined as the minimum
cost of producing a given output vector q when the input price vector is w and the
technology is represented by S. This function is defined as:

$$ c(w, q) \equiv \min_{x} \{\, w \cdot x : (x, q) \in S \,\} $$

Using the cost function, the indirect input quantity index is defined as the change
in input costs after adjusting for movements in price change. If our focus is on
comparisons over two periods, s and t, the indirect input quantity index27 is defined
using these measures:
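$$
\begin{aligned}
\text{Cost change} &= \frac{\sum_{k=1}^{K} w_{kt} x_{kt}}{\sum_{k=1}^{K} w_{ks} x_{ks}} \\
\text{Input price index} &= IP_{st}(w_t, x_t, w_s, x_s) \\
\text{Indirect quantity index } IQ_{st} &\equiv \left[ \frac{\sum_{k=1}^{K} w_{kt} x_{kt}}{\sum_{k=1}^{K} w_{ks} x_{ks}} \right] \Big/ IP_{st}(w_t, x_t, w_s, x_s)
\end{aligned}
$$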

In order to derive the indirect input quantity index, we need a measure of input
price change. The input price index, following Fisher and Shell [29], is defined as:

$$ IP^{F-S}_{st} = IP_{st}(w_t, w_s, q_R, S^{R}) \equiv \frac{C^{R}(w_t, q_R)}{C^{R}(w_s, q_R)} $$

27 Denote input quantity and price index numbers as IQ and IP- this notation helps to distinguish
this from the output quantity and price index numbers.

As in the case of output price index numbers, under assumptions similar to those in
Section “Notation and Basic Framework” and under the assumption that the observed
input quantities are optimal for the input prices and output quantities prevailing in
the two periods, that is,

$$ C^{t}(w_t, q_t) = \sum_{k=1}^{K} w_{kt} x_{kt} = w_t \cdot x_t \quad \text{and} \quad C^{s}(w_s, q_s) = \sum_{k=1}^{K} w_{ks} x_{ks} = w_s \cdot x_s, $$

then the equivalent of Results 5 to 14 can be proved. The Laspeyres and Paasche
input price index numbers, $IP^{FS-L}_{st}$ and $IP^{FS-P}_{st}$, usually serve as upper and
lower bounds for the true input price index. The main message is that under the
assumptions stated in Results 9 to 14, and under the assumption that the input cost
function is of the translog functional form, the input price index can be computed
using the Törnqvist input price index number:

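$$ \left[ IP^{FS-L}_{st} \cdot IP^{FS-P}_{st} \right]^{1/2} = \left[ \frac{C^{s}(w_t, q_s)}{C^{s}(w_s, q_s)} \cdot \frac{C^{t}(w_t, q_t)}{C^{t}(w_s, q_t)} \right]^{1/2} = \prod_{k=1}^{K} \left( \frac{w_{kt}}{w_{ks}} \right)^{\frac{c_{ks} + c_{kt}}{2}} = IP^{TT}_{st} $$

where $c_{k\tau} = w_{k\tau} x_{k\tau} \big/ \sum_{k=1}^{K} w_{k\tau} x_{k\tau}$ represents the cost share of input k in period τ
(τ = s, t).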
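
The exactness of the Törnqvist input price index can be illustrated with a simple special case. The sketch below assumes a hypothetical single-output Cobb-Douglas technology q ≤ A·sqrt(x1·x2), whose cost function 2·(q/A)·sqrt(w1·w2) is a translog cost function with all second order coefficients equal to zero. The parameter values are made up.

```python
import math

def cost(w, q, A):
    """Minimum cost of producing scalar output q when q <= A * sqrt(x1 * x2)."""
    return 2.0 * (q / A) * math.sqrt(w[0] * w[1])

def cost_minimising_inputs(w, q, A):
    """Input vector that attains cost(w, q, A) for the same technology."""
    scale = q / A
    return (scale * math.sqrt(w[1] / w[0]), scale * math.sqrt(w[0] / w[1]))

def tornqvist_input_price_index(w_s, x_s, w_t, x_t):
    c_s = sum(a * b for a, b in zip(w_s, x_s))   # observed cost in period s
    c_t = sum(a * b for a, b in zip(w_t, x_t))   # observed cost in period t
    return math.exp(sum(
        0.5 * (ws * xs / c_s + wt * xt / c_t) * math.log(wt / ws)
        for ws, xs, wt, xt in zip(w_s, x_s, w_t, x_t)))

# Illustrative (made-up) input prices, outputs and efficiency levels.
w_s, w_t = (4.0, 9.0), (5.0, 12.0)
q_s, q_t, A_s, A_t = 6.0, 8.0, 1.0, 1.2
x_s = cost_minimising_inputs(w_s, q_s, A_s)
x_t = cost_minimising_inputs(w_t, q_t, A_t)

fs_laspeyres = cost(w_t, q_s, A_s) / cost(w_s, q_s, A_s)
fs_paasche = cost(w_t, q_t, A_t) / cost(w_s, q_t, A_t)
theoretical = math.sqrt(fs_laspeyres * fs_paasche)
observed = tornqvist_input_price_index(w_s, x_s, w_t, x_t)
```

Here the index computed from observed input prices and quantities reproduces the geometric mean of the two theoretical indices, even though the technology differs between the two periods.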
Further, if the input cost function is quadratic then we have

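$$ \left[ IP^{FS-L}_{st} \cdot IP^{FS-P}_{st} \right]^{1/2} = \left[ \frac{C^{s}(w_t, q_s)}{C^{s}(w_s, q_s)} \cdot \frac{C^{t}(w_t, q_t)}{C^{t}(w_s, q_t)} \right]^{1/2} = \left[ \frac{\sum_{k=1}^{K} w_{kt} x_{ks}}{\sum_{k=1}^{K} w_{ks} x_{ks}} \cdot \frac{\sum_{k=1}^{K} w_{kt} x_{kt}}{\sum_{k=1}^{K} w_{ks} x_{kt}} \right]^{1/2} = IP^{F}_{st} $$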

Similar results also hold in the case of Malmquist input quantity index numbers.
The theoretical input quantity index is defined using the Malmquist input distance
function:

$$ IQ^{M}_{st} \equiv \frac{D_i^{R}(x_t, q_R)}{D_i^{R}(x_s, q_R)} $$

Under various regularity conditions on the technology set and assumptions on the
form of the input distance function and under the assumption that observed input
quantity vectors are technically and allocatively efficient, the Fisher and Törnqvist
quantity index numbers are exact and superlative for the Malmquist input quantity
index when the distance functions have, respectively, the quadratic or the translog
functional form.

Summary

The economic theoretic approach to output and input price and quantity indicators
is well developed. This section has just touched upon the basic elements of this
approach. The main objective of this section is to give an overview rather than
to provide a detailed description of all the methods and to offer proofs of various
results. Interested readers will benefit from reading the original works of Fisher and
Shell [29], Samuelson and Swamy [52], Sato [54], Malmquist [41], Diewert [13, 15,
16], and Balk [5].
Regarding the choice between compiling direct and indirect quantity indices, the
general advice is to use direct quantity indices based on the Malmquist distance function
as they satisfy the homogeneity condition. When the revenue and cost functions and
the input and output distance functions satisfy the homothetic separability
assumption, the direct and indirect measures coincide and there is no need to
choose between the alternative approaches.
The basic take-home message is that empirical implementation of the economic
theoretic indices requires complete knowledge of the functional form for the
revenue, cost, Malmquist input and output distance functions. However, in the
absence of such information, economic theoretic approach can provide empirically
feasible measures of output price, output quantity, input price, and input quantity
index numbers based on observed price and quantity data when certain assumptions
and regularity conditions are satisfied. This approach is particularly well suited
when limited data are available for a firm over two periods or only for two firms
at a given point of time. In such cases, under some reasonable assumptions, the
Fisher and Törnqvist indices provide useful tools for measuring output and input
quantity change which can then be used in measuring productivity change.

Special Topics

This chapter, thus far, has provided an extensive survey of various index number
approaches for measuring output and input quantity change that can in turn be used
to measure productivity change. This section is devoted to two special topics that
deserve special attention and additional discussion.
The first topic addresses a question which is somewhat fundamental in that it
concerns the choice between the two alternative approaches which are both designed
to measure output and input quantity change. The first approach, discussed in
Section “Index Numbers Based on Quantity Aggregates” of this chapter, advocates
construction of quantity (output or input) aggregates before measuring quantity
change. Under this approach, aggregate quantity levels are computed first and
change is measured using these levels. The second approach, discussed in Sections
“Index Number Approach to Measuring Quantity Change,” “Axiomatic Approach
to Index Numbers,” and “Economic Theoretic Approach to Output and Input
Quantity Index Numbers,” in contrast constructs measures of quantity change
directly using standard index number methods. The practitioner is then faced with
a choice between these two approaches. Section “Use of Quantity Aggregates to
Measure Quantity Change” below is devoted to this problem of choosing between
these two approaches.
The discussion so far has typically focused on comparisons over two periods s
and t or across two firms. However, in practical situations one may be required to
make multilateral comparisons across a number of firms or measure change over
several periods of time. Section “Transitivity and Quantity Index Numbers” focuses
on the problem of multilateral comparisons of quantity change or levels and discusses
the role of the transitivity requirement. The reader may recall that the transitivity and
circularity axioms were discussed in the context of the axiomatic approach to index
numbers, but in this section the main concern is the implications of transitivity
for the choice of index number methods for measuring quantity change.

Use of Quantity Aggregates to Measure Quantity Change

Under the approach described in Section “Index Numbers Based on Quantity Aggregates,”
output quantity change is measured by the ratio of quantity aggregates:

$$Q_{st} = \frac{Q(q_t)}{Q(q_s)} = \frac{Q(q_{1t}, q_{2t}, \ldots, q_{Nt})}{Q(q_{1s}, q_{2s}, \ldots, q_{Ns})}$$

where the aggregator function Q(qτ ), τ = s, t, is required to satisfy a minimal set of
conditions: nonnegativity, nondecreasing in arguments, and linear homogeneity.
In the implementation of this approach based on quantity aggregates advocated
by O’Donnell [44, 46, 47], a few aspects need special consideration. (1) Are there
other axioms, in addition to the three conditions, that need to be considered in
selecting quantity aggregator functions? If so, what are the implications of the
additional conditions? This aspect is considered in Section “An Additional Axiom
for Quantity Aggregates.” (2) One of the indices canvassed in O’Donnell [44, 46]
is the Lowe index which makes use of value aggregates evaluated at a fixed price
vector. This index is attractive since it is consistent with the notion of additivity of
quantity aggregates (see Section “An Additional Axiom for Quantity Aggregates”
below) and it leads to transitive comparisons. Despite these useful properties, the
Lowe index can lead to strange ordering of output quantity vectors. This issue is
considered in Section “The Lowe Index.” (3) If the quantity aggregate approach
for measuring quantity change is adopted, then it is natural and symmetric to use
a similar approach for measuring price change based on price aggregates. Will
the measures of quantity change and price change based on respective aggregates
be consistent with observed changes in the revenue or costs? Will the product of
quantity change and price change, each based on respective aggregates in periods
s and t, be equal to the observed revenue change or change in input costs? This
issue is discussed in Section “Value Decomposition and the Use of Quantity and
Price Aggregates.” (4) The use of quantity aggregates to measure quantity change
is implicitly based on the assumption that movements in prices and quantities are
independent. Economic theory suggests that these are not independent. How does
one evaluate validity of this assumption using observed data? This question is
answered in Section “Are Price and Quantity Data Independent?”

An Additional Axiom for Quantity Aggregates

The current treatment of quantity aggregates uses a minimal set of properties for the
quantity aggregate. The properties of nonnegativity, nondecreasing in its arguments
and linear homogeneity listed in O’Donnell [44, 46] are indeed a minimal set. While
these properties are desirable, they may not capture all the characteristics expected
of an aggregate function.
Now consider an additional property which is an intuitive requirement one would
expect from a quantity aggregate. This is the additivity property which requires
the quantity aggregate of the sum of two quantity vectors to be equal to the sum
of quantity aggregates of the two quantity vectors. Stated formally, the quantity
aggregate Q(q) is said to satisfy additivity if for two output vectors, q1 and q2 , we
have:

Q (q1 + q2 ) = Q (q1 ) + Q (q2 ) for all q1 , q2 ≥ 0 (88)

This type of additivity ensures that if a firm produces q1 in the first half of the
year and q2 in the second half of the year, then the quantity aggregate for the year
would be the same as the sum of quantity aggregates in the first and second half of
the year. The additivity axiom is also important if one were aggregating output of
firms over a geographical region.
If we add additivity to the three properties already listed, then the class of
functions which satisfy these four properties together would be more restricted. The
question is how restricted is such a class? The answer to this can be found from the
solution to the first fundamental Cauchy equation: f (x + y) = f (x) + f (y). This is
similar to (88) which is in vector form. The following result provides the answer.
Result 15: If the function Q (q) is a real-valued function that maps vectors in the
nonnegative orthant of the N-dimensional space, $\mathbb{R}_+^N$, to $\mathbb{R}_+$, and if the function Q (.)
satisfies, in addition to properties (1) to (3), the additivity property stated in Eq. (88),
then there exists a vector of positive constants c = {c1 , c2 , . . . , cN } such that

$$Q(q) = \sum_{i=1}^{N} c_i q_i \quad \text{for all } q = (q_1, q_2, \ldots, q_N) \tag{89}$$

Proof of this for the scalar case is in Diewert [19] and in earlier works of Aczél [1]
and Eichhorn [23]. The multivariate case with multiple outputs has been considered
in Kuczma and Gilányi [38].
This result implies that once additivity is imposed in addition to axiomatic
properties (1) to (3), the class of functions narrows to the simple linear function
shown in Eq. (89). The constants in this equation, the $c_i$, may be interpreted as prices
at which outputs are evaluated. Such an aggregate is consistent with the notion of
a quantity aggregate evaluated at a given set of prices. This function is indeed the
Lowe aggregate and it results in the Lowe quantity index discussed in Section “Index
Numbers Based on Quantity Aggregates.”
Discussion presented here is just to alert the practitioner that it is important to
be clear about what properties one wishes to see in the aggregate and that it may in
turn restrict the choice of an appropriate aggregator function.
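As a minimal numerical sketch of what the additivity axiom rules out, the snippet below compares a linear aggregate of the form in Eq. (89) with a geometric (Cobb-Douglas type) aggregate, which also satisfies nonnegativity, monotonicity, and linear homogeneity. The constants, weights, and quantity vectors are made-up illustrative values, and the function names are hypothetical.

```python
import math

def linear_aggregate(q, c):
    """Linear aggregate Q(q) = sum_i c_i * q_i, the form in Eq. (89)."""
    return sum(ci * qi for ci, qi in zip(c, q))

def geometric_aggregate(q, w):
    """Geometric aggregate prod_i q_i**w_i with weights summing to one."""
    return math.prod(qi ** wi for qi, wi in zip(q, w))

# Illustrative (assumed) values, not taken from the chapter.
c = [2.0, 5.0]      # constants c_i, interpretable as fixed prices
w = [0.5, 0.5]      # weights of the geometric aggregate
q1 = [4.0, 1.0]     # output in the first half of the year
q2 = [1.0, 4.0]     # output in the second half of the year
q_sum = [a + b for a, b in zip(q1, q2)]

# Additivity gap: Q(q1 + q2) - [Q(q1) + Q(q2)]; zero means Eq. (88) holds.
lin_gap = linear_aggregate(q_sum, c) - (
    linear_aggregate(q1, c) + linear_aggregate(q2, c))
geo_gap = geometric_aggregate(q_sum, w) - (
    geometric_aggregate(q1, w) + geometric_aggregate(q2, w))

print("linear gap:", lin_gap)
print("geometric gap:", geo_gap)
```

With these values the linear aggregate has a zero gap, while the geometric aggregate assigns the full-year output a larger value than the sum of the two half-year aggregates, so it violates Eq. (88).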

The Lowe Index

Let pR represent a vector of output prices which serves as the reference price vector
in defining the Lowe aggregate. Then the value of the quantity aggregate in periods
s and t, is given by:

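$$Q_{\tau}^{Lowe} \equiv \sum_{i=1}^{N} p_{iR} \cdot q_{i\tau}, \quad \tau = s, t \tag{90}$$

where {piR : i = 1, 2, . . . , N} are fixed constants.
The Lowe quantity index is then given by:

$$Q_{st}^{Lowe} = \frac{Q_{t}^{Lowe}}{Q_{s}^{Lowe}} = \frac{\sum_{i=1}^{N} p_{iR} q_{it}}{\sum_{i=1}^{N} p_{iR} q_{is}} \tag{91}$$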

The use of Lowe index for making price and quantity comparisons is not new.
Since the time Lowe [40] proposed this index, it has been the subject of considerable
research. Hill [34] provides an overview of the use of the Lowe index for temporal
and spatial price and quantity comparisons. Balk and Diewert [7] examine the
nature and extent of substitution bias induced by the Lowe index in the context
of intertemporal price comparisons. Here attention is drawn to the issues that arise
when the Lowe quantity index in Eq. (91) is used for productivity comparisons.
The Lowe index may lead to counter-intuitive conclusions if one uses this index
somewhat mechanically.
Suppose the quantity vectors in the two periods, qt and qs have the same value
aggregate when evaluated at the reference prices of the Lowe index. Then at the
Lowe reference prices

$$\sum_{i=1}^{N} p_{iR} q_{it} = \sum_{i=1}^{N} p_{iR} q_{is} = \delta > 0 \tag{92}$$

Further suppose that these two output vectors are produced using the same input
vector, x, in both periods. Then the Lowe output quantity index equals 1
and the input quantity index also equals 1. The productivity index comparing
periods s and t therefore also equals 1, with the conclusion that there has been no
productivity change.
Fig. 1 depicts the two-commodity case. Here points A and
B represent output vectors in periods t and s. The straight line associated with the
reference price vector pR is the iso-revenue line, that is, the set of all quantity
vectors which yield the same revenue, δ in Eq. (92).
Since both output vectors are produced using the same input vector, x, the use of the
Lowe index leads to the conclusion that there has been no productivity change when
we move from output qs at point B to qt at point A, and the Lowe productivity index
equals 1.

Fig. 1 Illustration of Lowe output aggregate

However, it can be seen from the diagram that there has been a shift in
the production possibility frontier associated with technology Ss in period s to
technology St in period t representing a significant technological change. In fact
the production possibility frontier in period t completely dominates the frontier in
s – with the implication that the firm is more productive in period t compared to
period s. Such a shift in the frontier should imply productivity change driven by
technical change. However, the Lowe index shows no productivity change!
This illustration serves to remind the users of the Lowe index of some of the
consequences of using the Lowe index to measure output and input quantities and
their subsequent use in measures of productivity levels and change.
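The situation in Eq. (92) can be sketched numerically. The snippet below uses made-up two-commodity data in which both output vectors have the same value at the reference prices, so the Lowe index in Eq. (91) equals 1, and then shows that a different (also made-up) reference price vector yields a different answer. The function name is hypothetical.

```python
def lowe_quantity_index(p_ref, q_base, q_comp):
    """Lowe quantity index of Eq. (91): both vectors valued at p_ref."""
    numerator = sum(p * q for p, q in zip(p_ref, q_comp))
    denominator = sum(p * q for p, q in zip(p_ref, q_base))
    return numerator / denominator

# Illustrative (assumed) data: point B is q_s, point A is q_t.
q_s = [2.0, 4.0]
q_t = [4.0, 2.0]

p_ref = [1.0, 1.0]   # both vectors are worth 6 at these prices
p_alt = [2.0, 1.0]   # an alternative reference price vector

index_ref = lowe_quantity_index(p_ref, q_s, q_t)
index_alt = lowe_quantity_index(p_alt, q_s, q_t)

print("Lowe index at p_ref:", index_ref)
print("Lowe index at p_alt:", index_alt)
```

At `p_ref` the index is 1 and, with an unchanged input vector, so is the implied productivity index, whatever has happened to the frontier. At `p_alt` the same two output vectors give an index of 1.25, which illustrates how strongly the conclusion depends on the reference prices.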

Value Decomposition and the Use of Quantity and Price Aggregates

From an economist’s perspective and that of a firm, the links between changes in
revenues, costs and profitability and movements in outputs produced, inputs used,
and changes in the terms of trade are quite important. These links are discussed in detail in
Sections “Productivity Measurement: The Case of Single Output and Single Input”
and “Multiple Outputs and Inputs: The Index Number Problem.” If the approach
of constructing output and input quantity aggregates is implemented, then the link
hinges on the fundamental decomposition of value aggregates into quantity and
price aggregates below. If Q (q) and P (p) represent the output and price aggregates
then this decomposition is satisfied if

$$Q(q) \cdot P(p) = \sum_{i=1}^{N} q_i p_i = V \tag{93}$$

If there exist functions Q (q) and P (p) such that decomposition in Eq. (93) holds
then the following decomposition of value change into quantity and price change
holds:

$$V_{st} \equiv \frac{V_t}{V_s} = \frac{Q(q_t) \cdot P(p_t)}{Q(q_s) \cdot P(p_s)} = \frac{Q(q_t)}{Q(q_s)} \cdot \frac{P(p_t)}{P(p_s)} = Q_{st} \cdot P_{st} \tag{94}$$

From a firm’s perspective this decomposition is useful in decision making,
and therefore the usefulness of the aggregates approach hinges on the existence of
Q (q) and P (p) satisfying Eq. (94). The following theorem due to Eichhorn ([23], p.
144), discussed in Diewert [19], sheds some light on the existence of these functions.
Result 16: Consider the case where the number of commodities, N, is greater
than 1. Then there do not exist any functions Q (q) and P (p) that satisfy the
following two conditions:

(i) Q (q) > 0 if q ≫ 0 and P (p) > 0 if p ≫ 0, and
(ii) $Q(q) \cdot P(p) = \sum_{i=1}^{N} p_i q_i = V$

Proof of this result is quite simple and available in Diewert [19].

The implication of Result 16 is significant if this approach is expected to serve as
a tool in the analysis of revenue and cost changes and its decomposition into input
and output changes and changes in the terms of trade. This result shows that the use
of the aggregates approach is of limited use when the scope of the analysis extends
beyond just providing measures of productivity change.
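One way to see why Result 16 holds is sketched below. This is an illustrative argument and not necessarily the proof given in Diewert [19]; it assumes that condition (ii) is required to hold for all strictly positive price and quantity vectors, and it writes $\mathbf{1}$ for the N-vector of ones (notation introduced here).

```latex
% Assumption: (ii) holds for all p >> 0 and q >> 0; \mathbf{1} = (1, ..., 1).
% Step 1: evaluate (ii) at p = 1, at q = 1, and at p = q = 1.
\[
Q(q) = \frac{1}{P(\mathbf{1})} \sum_{i=1}^{N} q_i , \qquad
P(p) = \frac{1}{Q(\mathbf{1})} \sum_{i=1}^{N} p_i , \qquad
Q(\mathbf{1}) \, P(\mathbf{1}) = N .
\]
% Step 2: combine the three expressions (division is allowed by condition (i)).
\[
Q(q) \, P(p) = \frac{1}{N} \Big( \sum_{i=1}^{N} q_i \Big) \Big( \sum_{i=1}^{N} p_i \Big) .
\]
% Step 3: compare with (ii) at p = q = (2, 1, ..., 1).
\[
Q(q) \, P(p) = \frac{(N+1)^2}{N} = N + 2 + \frac{1}{N}
\qquad \text{whereas} \qquad
\sum_{i=1}^{N} p_i q_i = 4 + (N - 1) = N + 3 .
\]
% The two sides agree only if 1/N = 1, that is, only if N = 1.
```

For N > 1 the two sides differ, so no pair of functions Q and P can satisfy (i) and (ii) simultaneously.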

Are Price and Quantity Data Independent?

The construction and use of quantity aggregates in measuring quantity change is
predicated on the premise that price and quantity data are somehow independent.
Some of the axioms discussed in Section “Axiomatic Approach to Index Numbers”
also assume independence of price and quantity vectors. For example, the strong
identity axiom assumes that quantity vectors would remain the same even when
the prices are different in the two periods. The microeconomic theory of the firm and the
optimizing behavior of firms – revenue maximization, cost minimization, or
profit maximization – imply that decisions of the firms depend on observed input
and output prices. Therefore, one would expect significant correlation between
prices and quantities. The question then is how does one know if such correlation
exists between prices and quantities (outputs or inputs)? The answer to this question
can be found in the following decomposition of the spread between the Laspeyres
and Paasche indices. This decomposition was provided by von Bortkiewicz [62].
Bortkiewicz has shown that the percentage difference between the Laspeyres and
Paasche price or quantity index can be written as (see [6], p. 64):

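$$\frac{P_{st}^{P}}{P_{st}^{L}} - 1 = \frac{P_{st}^{P} - P_{st}^{L}}{P_{st}^{L}} = \frac{Q_{st}^{P} - Q_{st}^{L}}{Q_{st}^{L}} = \sum_{i=1}^{N} r_{is} \left( \frac{p_{it}/p_{is}}{P_{st}^{L}} - 1 \right) \left( \frac{q_{it}/q_{is}}{Q_{st}^{L}} - 1 \right) \tag{95}$$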

The right hand side of the Laspeyres-Paasche spread in Eq. (95) is a weighted
measure of covariance between relative price changes and the relative quantity
changes. If the covariance is positive, implying that output quantities increase when
output prices increase, then Eq. (95) shows that the Paasche index would be greater
than the Laspeyres index – a result stated in Section “Economic Theoretic Approach
to Output and Input Quantity Index Numbers.” If the prices and quantities refer
to inputs then one would expect the covariance to be negative in which case the
Paasche index would be lower than the Laspeyres index.
An important implication of Eq. (95) is that if the relative price movements and
relative quantity movements are uncorrelated, one would expect the Laspeyres and
Paasche indices (price as well as quantity) to be close to each other. For a practitioner
the recommendation would then be to compute the Laspeyres and Paasche indices
and if there is a significant difference between these two, then the assumption of
independence between price and quantity data is not justified. This in turn gives an
indication as to the type of axioms one may wish to impose on the index numbers
compiled.
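The check recommended above can be sketched in a few lines. The snippet computes the Laspeyres and Paasche price and quantity indices and the weighted covariance term on the right-hand side of Eq. (95), taking the weights r_is to be period-s revenue shares. The data are made-up illustrative values and the function name is hypothetical.

```python
def bortkiewicz_terms(p_s, q_s, p_t, q_t):
    """Return (P_L, P_P, Q_L, Q_P, cov) for periods s (base) and t."""
    v_ss = sum(p * q for p, q in zip(p_s, q_s))   # period-s prices, period-s quantities
    v_tt = sum(p * q for p, q in zip(p_t, q_t))   # period-t prices, period-t quantities
    v_ts = sum(p * q for p, q in zip(p_t, q_s))   # period-t prices, period-s quantities
    v_st = sum(p * q for p, q in zip(p_s, q_t))   # period-s prices, period-t quantities

    price_laspeyres = v_ts / v_ss
    price_paasche = v_tt / v_st
    qty_laspeyres = v_st / v_ss
    qty_paasche = v_tt / v_ts

    # Right-hand side of Eq. (95) with period-s revenue shares as weights.
    cov = 0.0
    for ps, qs, pt, qt in zip(p_s, q_s, p_t, q_t):
        share = ps * qs / v_ss
        cov += share * ((pt / ps) / price_laspeyres - 1.0) \
                     * ((qt / qs) / qty_laspeyres - 1.0)
    return price_laspeyres, price_paasche, qty_laspeyres, qty_paasche, cov

# Illustrative (assumed) output data: the price and quantity of good 1 both rise.
p_s, q_s = [1.0, 2.0], [10.0, 5.0]
p_t, q_t = [2.0, 2.0], [12.0, 5.0]

P_L, P_P, Q_L, Q_P, cov = bortkiewicz_terms(p_s, q_s, p_t, q_t)
print("price spread   :", P_P / P_L - 1.0)
print("quantity spread:", Q_P / Q_L - 1.0)
print("covariance term:", cov)
```

In this example prices and quantities move together, so the covariance term is positive and the Paasche indices exceed the Laspeyres indices, in line with the discussion of output data above. The three printed numbers coincide up to floating-point rounding.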

Transitivity and Quantity Index Numbers

Transitivity is one of the axioms within the axiomatic framework for index numbers
discussed in Section “Axiomatic Approach to Index Numbers.” It is useful to discuss
the context in which this axiom assumes significance and examine some of the
practical issues when choosing an appropriate index number formula that satisfies
transitivity.
The need for transitivity of quantity and price index numbers arises when
multilateral comparisons are being undertaken. Multilateral comparisons refer to
the case when comparisons between every pair of firms, regions, countries, time
periods, or a combination of all of these are undertaken. Let us consider the simplest
case where output index numbers are compiled for all pairs of firms within a cross-
section of firms.

Multilateral Comparisons
Suppose there are M firms producing N outputs using K inputs. In this case data for
all these firms are represented by: {pj , qj , wj , xj ; j = 1, 2, . . . , M} which, respectively,
represent output price and quantity vectors of dimension N and input price and
quantity vectors of dimension K.
For the purpose of exposition, consider output quantity index numbers. In the
case of multilateral comparisons, the aim is to fill the following matrix of quantity
index numbers, Q:
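$$Q_{M \times M} = \begin{bmatrix} Q_{11} & Q_{12} & \cdots & Q_{1k} & \cdots & Q_{1M} \\ Q_{21} & Q_{22} & \cdots & Q_{2k} & \cdots & Q_{2M} \\ \vdots & \vdots & & \vdots & & \vdots \\ Q_{j1} & Q_{j2} & \cdots & Q_{jk} & \cdots & Q_{jM} \\ \vdots & \vdots & & \vdots & & \vdots \\ Q_{M1} & Q_{M2} & \cdots & Q_{Mk} & \cdots & Q_{MM} \end{bmatrix} \tag{96}$$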

where Qjk (j, k = 1, 2, . . . , M) represents output quantity index for firm k with firm
j as the base. If this index, for example, is 1.20 then output of firm k is 20% higher
than that of the firm j. By definition, all the diagonal elements are equal to 1.
In principle, elements of this matrix can be filled by applying any of the formulae
discussed in Section “Index Number Approach to Measuring Quantity Change.”
Discussion in Sections “Axiomatic Approach to Index Numbers” and “Economic
Theoretic Approach to Output and Input Quantity Index Numbers” suggests that
either the Fisher or the Törnqvist index could be used here. These index numbers
are intuitive and have useful axiomatic properties and from an economic theoretic
perspective these are exact and superlative.
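An important issue arises in this context. For example, when the Fisher (or Törnqvist) index
is used, the direct comparison between two firms, say 1 and 2, will generally differ from
an indirect comparison through another firm, say firm 3. The direct and indirect
comparisons in this case are, respectively, given by $Q^F_{12}$ and $Q^F_{13} \cdot Q^F_{32}$:

$$Q^F_{12} \equiv \left[ \frac{\sum_{i=1}^{N} p_{i1} q_{i2}}{\sum_{i=1}^{N} p_{i1} q_{i1}} \cdot \frac{\sum_{i=1}^{N} p_{i2} q_{i2}}{\sum_{i=1}^{N} p_{i2} q_{i1}} \right]^{1/2} \neq Q^F_{13} \cdot Q^F_{32} = \left[ \frac{\sum_{i=1}^{N} p_{i1} q_{i3}}{\sum_{i=1}^{N} p_{i1} q_{i1}} \cdot \frac{\sum_{i=1}^{N} p_{i3} q_{i3}}{\sum_{i=1}^{N} p_{i3} q_{i1}} \right]^{1/2} \times \left[ \frac{\sum_{i=1}^{N} p_{i3} q_{i2}}{\sum_{i=1}^{N} p_{i3} q_{i3}} \cdot \frac{\sum_{i=1}^{N} p_{i2} q_{i2}}{\sum_{i=1}^{N} p_{i2} q_{i3}} \right]^{1/2}$$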

It is easy to check this fact using any numerical data. This means that use of the
Fisher index formula (or for that matter most other formulae) will result in a matrix
of comparisons that are not internally consistent and hence not transitive. A formal
definition of transitivity is given below.
Definition: A matrix of quantity index numbers Q is said to satisfy transitivity if
for all j, k and l (=1, 2, . . . , M)

$$Q_{jk} = Q_{jl} \cdot Q_{lk} \tag{97}$$

Transitivity is really an accounting or consistency requirement which will ensure
internally consistent assessments of output levels and productivity levels.
This is a requirement that is currently imposed on international comparisons
of prices and real expenditures conducted by the World Bank as a part of the
International Comparison Program. Details of the methods used by the World
Bank can be found in [18, 50, 51]. In the context of international comparisons,
an additional requirement of base invariance or country symmetry is also imposed
which requires symmetric treatment of all the countries involved in price and real
expenditure comparisons.
An implication of adherence to transitivity requirement is that inclusion of an
additional firm into comparisons will affect all the quantity comparisons.
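The failure of transitivity for the Fisher index is easy to reproduce. The following sketch uses made-up price and quantity data for three firms and two outputs (not taken from the chapter) and compares the direct Fisher comparison of firms 1 and 2 with the indirect comparison through firm 3. The helper names are hypothetical.

```python
import math

def value(p, q):
    """Inner product of a price vector and a quantity vector."""
    return sum(pi * qi for pi, qi in zip(p, q))

def fisher_quantity_index(p_j, q_j, p_k, q_k):
    """Fisher output quantity index for firm k with firm j as the base."""
    laspeyres = value(p_j, q_k) / value(p_j, q_j)
    paasche = value(p_k, q_k) / value(p_k, q_j)
    return math.sqrt(laspeyres * paasche)

# Illustrative (assumed) data: firm -> (price vector, quantity vector).
firms = {
    1: ([1.0, 1.0], [10.0, 10.0]),
    2: ([2.0, 1.0], [20.0, 8.0]),
    3: ([1.0, 3.0], [6.0, 18.0]),
}

def q_fisher(j, k):
    (p_j, q_j), (p_k, q_k) = firms[j], firms[k]
    return fisher_quantity_index(p_j, q_j, p_k, q_k)

direct = q_fisher(1, 2)                     # Q^F_12
indirect = q_fisher(1, 3) * q_fisher(3, 2)  # Q^F_13 * Q^F_32

print("direct  :", direct)
print("indirect:", indirect)
```

For these data the direct comparison is roughly 1.497 while the comparison chained through firm 3 is roughly 1.453, so Eq. (97) fails for the Fisher matrix.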

Is Transitivity a Natural Requirement?

Before turning attention to implications of transitivity for the choice of a suitable
formula, it is useful to recognize that transitivity is an accounting requirement and
it is usually not naturally satisfied when comparisons are being made. For example,
transitivity is violated even in the simplest case of measuring distance between two
N-dimensional vectors, x and y. Distance between these vectors can be measured
using the Euclidean distance measure, d(x, y), which is given by:

$$d(x, y) = \left[ \sum_{i=1}^{N} (x_i - y_i)^2 \right]^{1/2}$$

This distance measure is symmetric but it does not generally satisfy transitivity.
In fact, the distance measure satisfies the triangle inequality: d(x, y) ≤ d(x, z) + d(z, y),
with equality holding if and only if z lies on the line segment joining x and y.
This means that the Euclidean distance satisfies transitivity only
in the cases where all three vectors x, y and z lie on a straight line, with z between x and y.
An example where transitivity is, in principle, automatically satisfied is the case
of exchange rates of currencies. In the absence of any arbitrage and differential
transaction costs, exchange rates between pairs of currencies satisfy transitivity. But
in practice, even these do not satisfy transitivity strictly.
These two examples serve as a cautionary note but these are not presented as
an argument to discard the notion of transitivity or to diminish the importance of
transitivity in the context of multilateral output and productivity comparisons.

Alternative Approaches to Making Transitive Comparisons

There are basically two approaches to index numbers that satisfy transitivity. The
first approach is to restrict to the use of only those index number methods that satisfy
transitivity. The second approach is to construct transitive multilateral comparisons
using binary index numbers as building blocks. The second approach was first
proposed by Gini [32, 33] but was independently proposed and popularized by
Éltető and Köves [25] and Szulc [56], and hence the method is referred to as the
GEKS method.

Index Number Formulae that Are Transitive

Several index numbers described in Sections “Index Numbers Based on Quantity
Aggregates” and “Index Number Approach to Measuring Quantity Change” automatically
satisfy transitivity. For example, the Lowe index, the geometric Young
index, or the Malmquist quantity index applied to multilateral comparisons with a
specific choice of reference technology and reference input quantity vector satisfy
transitivity.

The Lowe Index

The Lowe index in the context of multilateral quantity index numbers takes the
following form. For any pair of firms j and k, the index is defined as:

$$Q_{jk}^{Lowe} = \frac{\sum_{i=1}^{N} p_{iR} q_{ik}}{\sum_{i=1}^{N} p_{iR} q_{ij}} \quad \text{for all } j, k = 1, 2, \ldots, M \tag{98}$$

where piR represent the “reference” or fixed price vector used to evaluate quantity
vectors of all the firms.
It is easy to see that the Lowe index satisfies the identity, positivity, circularity,
and transitivity axioms. It does not satisfy the factor reversal test. Implementation
of the Lowe index requires the choice of the reference prices used in Eq. (98). The
obvious choices are:

(i) Choose price vector associated with one of the firms as the reference price
vector. In this case indices are transitive but one firm is given a special
treatment. Note that results can differ significantly when one firm’s price vector
is replaced by another firm’s price vector.
(ii) In order to avoid the problem of choosing between different firms to specify the
reference price vector, one may choose the average of all the price vectors as
the reference vector. In this case the arithmetic, geometric, or harmonic average
of price vectors of all the M firms can be employed. For commodity i

$$p_i^{AM} = \frac{1}{M} \sum_{j=1}^{M} p_{ij} \quad \text{and} \quad p_i^{GM} = \prod_{j=1}^{M} p_{ij}^{1/M}, \quad i = 1, 2, \ldots, N \tag{99}$$

A problem with this approach is that all the firms are treated as equally
important in defining the average price. If firms differ significantly in size, one
may consider a weighted average.
(iii) A slightly more complicated but a realistic choice is to use a quantity weighted
average of the prices faced by the M firms in the comparison. For each item i,
use the average:

$$\bar{p}_i = \frac{\sum_{j=1}^{M} p_{ij} q_{ij}}{\sum_{j=1}^{M} q_{ij}} = \sum_{j=1}^{M} p_{ij} \cdot \frac{q_{ij}}{\sum_{j=1}^{M} q_{ij}}, \quad i = 1, 2, \ldots, N \tag{100}$$

It is clear that one can think of many other reference price vectors for use
in conjunction with the Lowe index formula in Eq. (98). Logically and from an
economic view point the quantity-weighted average price vector in Eq. (100) is to
be preferred. This choice is less important if all the price vectors are close to each
other.
It is useful to remind the reader that the use of Lowe index in Eq. (98) may lead
to quantity indices that are counter-intuitive. In view of the discussion and Fig. 1,
the general recommendation is not to use the Lowe index for purposes of measuring
output, input quantity and productivity changes. If one were to use Lowe index
for this purpose, it is important to examine and report sensitivity of the results to
alternative choices of the reference price vector.
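A sensitivity check of the kind recommended here might look as follows. The sketch computes the Lowe index of Eq. (98) under three reference price vectors: the quantity-weighted average of Eq. (100), the prices of firm 1, and the prices of firm 3. The data are made-up illustrative values and the function names are hypothetical.

```python
def lowe_index(p_ref, q_j, q_k):
    """Lowe quantity index of Eq. (98): firm k relative to firm j."""
    numerator = sum(p * q for p, q in zip(p_ref, q_k))
    denominator = sum(p * q for p, q in zip(p_ref, q_j))
    return numerator / denominator

def quantity_weighted_prices(prices, quantities):
    """Reference prices of Eq. (100): total value over total quantity, per item."""
    n_items = len(prices[0])
    reference = []
    for i in range(n_items):
        total_value = sum(p[i] * q[i] for p, q in zip(prices, quantities))
        total_quantity = sum(q[i] for q in quantities)
        reference.append(total_value / total_quantity)
    return reference

# Illustrative (assumed) data for M = 3 firms and N = 2 outputs.
prices = [[1.0, 1.0], [2.0, 1.0], [1.0, 3.0]]
quantities = [[10.0, 10.0], [20.0, 8.0], [6.0, 18.0]]
q1, q2, q3 = quantities

p_bar = quantity_weighted_prices(prices, quantities)

# Transitivity holds for any fixed reference price vector.
direct = lowe_index(p_bar, q1, q2)
chained = lowe_index(p_bar, q1, q3) * lowe_index(p_bar, q3, q2)

# Sensitivity of the firm 1 versus firm 2 comparison to the reference prices.
by_firm1 = lowe_index(prices[0], q1, q2)
by_firm3 = lowe_index(prices[2], q1, q2)

print("reference prices, Eq. (100):", p_bar)
print("direct vs chained:", direct, chained)
print("firm-1 prices:", by_firm1, " firm-3 prices:", by_firm3)
```

The direct and chained comparisons agree, as transitivity requires, but the comparison of firms 1 and 2 moves from 1.4 to 1.1 when firm 3's prices replace firm 1's as the reference. This is the sensitivity that should be examined and reported.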

The Geometric Young Index

Recall that the geometric Young index (GY) for quantity comparisons (see Section
“Index Number Approach to Measuring Quantity Change”) is given by:

$$Q_{jk}^{Young} = \prod_{i=1}^{N} \left( \frac{q_{ik}}{q_{ij}} \right)^{w_i}, \quad 0 \le w_i \le 1 \text{ and } \sum_{i=1}^{N} w_i = 1; \quad j, k = 1, 2, \ldots, M \tag{101}$$

The GY index is similar in structure to the Cobb-Douglas function and it is
also similar to the Törnqvist quantity index. The main point to note here is that
the weights wi are fixed for all pairwise comparisons.
Numerical values of the index depend upon the choice of the weights wi . It is
meaningful to use weights based on output revenue shares. The most commonly
used set of weights are:

$$w_i = \frac{1}{M} \sum_{j=1}^{M} r_{ij} \quad \text{where} \quad r_{ij} = \frac{p_{ij} q_{ij}}{\sum_{i=1}^{N} p_{ij} q_{ij}} \quad \text{for all } i = 1, 2, \ldots, N \tag{102}$$

By construction the GY index is transitive and satisfies a number of properties.
It is useful to recall the following important theorem, stated as Result 4 in Section
“Axioms and Discussion,” which has strong implications for the choice of quantity
index numbers which are transitive. This result is stated again here.
Result: A quantity index Qst (pt , qt , ps , qs ) satisfies monotonicity, linear homogeneity,
identity, commensurability and circularity tests if and only if the index is of
the Geometric-Young form

$$Q_{st}(p_t, q_t, p_s, q_s) = \prod_{i=1}^{N} \left( \frac{q_{it}}{q_{is}} \right)^{w_i} \quad \text{with } w_i > 0, \; i = 1, 2, \ldots, N \text{ and } \sum_{i=1}^{N} w_i = 1$$

This result and its proof are originally from Funke et al. [31], but Balk ([6], pp.
97–98) provides a simpler proof (see footnote 28).
This result is indeed quite powerful since it shows that if the monotonicity, linear
homogeneity, identity, commensurability, and circularity tests are to be satisfied then
the choice is limited to the use of the Geometric Young index in Eq. (101). This result
has important implications for the recently developed approach by O’Donnell [44,
46].
As in the case of Lowe index, output quantity comparisons from the GY index
are influenced by the particular choice of the set of weights used. In this regard, it is
useful to conduct an analysis of sensitivity of quantity comparisons to the choice of
weights and then make a judicious choice regarding the weights.
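As a rough sketch of how the GY index might be computed, the snippet below builds the average revenue share weights of Eq. (102) and evaluates Eq. (101) for made-up data on three firms and two outputs. The function names are hypothetical and the numbers are illustrative only.

```python
import math

def average_revenue_shares(prices, quantities):
    """Weights of Eq. (102): unweighted mean over firms of revenue shares r_ij."""
    n_firms = len(prices)
    n_items = len(prices[0])
    weights = [0.0] * n_items
    for p, q in zip(prices, quantities):
        revenue = sum(pi * qi for pi, qi in zip(p, q))
        for i in range(n_items):
            weights[i] += (p[i] * q[i] / revenue) / n_firms
    return weights

def geometric_young_index(weights, q_j, q_k):
    """GY quantity index of Eq. (101): firm k relative to firm j."""
    return math.prod((qk / qj) ** w for w, qj, qk in zip(weights, q_j, q_k))

# Illustrative (assumed) data for M = 3 firms and N = 2 outputs.
prices = [[1.0, 1.0], [2.0, 1.0], [1.0, 3.0]]
quantities = [[10.0, 10.0], [20.0, 8.0], [6.0, 18.0]]
q1, q2, q3 = quantities

w = average_revenue_shares(prices, quantities)

direct = geometric_young_index(w, q1, q2)
chained = geometric_young_index(w, q1, q3) * geometric_young_index(w, q3, q2)

print("weights:", w)
print("direct vs chained:", direct, chained)
```

Because the same weights are used in every pairwise comparison, the direct and chained values agree up to rounding. Replacing `w` with another weight vector that sums to one is a simple way to run the sensitivity analysis suggested above.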

The Malmquist Output Quantity Index

The Malmquist output quantity index was introduced in Section “Direct Quantity
Index Based on Malmquist Distance Function.” The Malmquist index is given by:

$$Q_{jk}^{M} = \frac{D_o^R(x_R, q_k)}{D_o^R(x_R, q_j)} \quad \text{for all } j, k = 1, 2, \ldots, M \tag{103}$$

The Malmquist index requires the specification of a reference technology and a
reference input vector. In the case of binary comparisons, it is usually defined as the
geometric average of the Malmquist index computed using the technology and input
vectors of the base and reference firms leading to the following index:

$$Q_{jk}^{M} = \left[ \frac{D_o^j(x_j, q_k)}{D_o^j(x_j, q_j)} \cdot \frac{D_o^k(x_k, q_k)}{D_o^k(x_k, q_j)} \right]^{1/2} \tag{104}$$

Note that the index in Eq. (104) is not transitive in general. However, the index
would be transitive if all the binary comparisons make use of the same reference
technology and reference input vector. Two remarks are in order.

1. The numerical values of the quantity index critically depend on the choice of
the technology and the input vector except in the case where the technology
is such that the output distance function is separable and homothetic, that is,
$D_o^j(q, x) = 1/g(q)h(j, x)$. In this case the value of the Malmquist index is
independent of the choice of reference technology and the reference input vector
xR .
2. However, the assumption of homothetic separability does not hold in general.
When this condition does not hold, the choice of reference technology and input
vector is indeed critical. If it is cross-sectional data, all the firms may be using
the same technology and in this case choice of the reference input vector xR is
critical. Again the recommendation is to examine the sensitivity of productivity
comparisons resulting from different choices and select a plausible reference
input vector.

28 The original theorem and the proof in Balk [6] refer to GY index for price comparisons. But the
same proof holds for quantity index numbers.

Gini-Éltető-Köves-Szulc (GEKS) Approach

The GEKS approach proposed in Gini [32, 33], Éltető and Köves [25], and Szulc
[56] is a simple approach that works on the following logic. For making binary
comparisons the Fisher, Törnqvist or some other suitable formula may be selected.
This choice may depend upon the fact that these indices satisfy a number of
axiomatic and economic theoretic properties. However, using these formulae results
in a matrix of multilateral quantity index numbers which are not transitive. For
example, use of the Fisher index leads to:
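$$Q^{F}_{M \times M} = \begin{bmatrix} Q^F_{11} & Q^F_{12} & \cdots & Q^F_{1k} & \cdots & Q^F_{1M} \\ Q^F_{21} & Q^F_{22} & \cdots & Q^F_{2k} & \cdots & Q^F_{2M} \\ \vdots & \vdots & & \vdots & & \vdots \\ Q^F_{j1} & Q^F_{j2} & \cdots & Q^F_{jk} & \cdots & Q^F_{jM} \\ \vdots & \vdots & & \vdots & & \vdots \\ Q^F_{M1} & Q^F_{M2} & \cdots & Q^F_{Mk} & \cdots & Q^F_{MM} \end{bmatrix} \tag{105}$$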

This matrix of comparisons is not transitive. The GEKS approach is then to
construct a matrix of transitive comparisons, denoted by $Q^{GEKS}_{M \times M}$, where elements
of the matrix exhibit the property that the GEKS indices deviate the least from the
Fisher binary indices. If Fisher index is the primary choice then the GEKS approach
provides a way to make minimal changes to the Fisher indices resulting in transitive
comparisons.
The GEKS method solves the following minimization problem:

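$$\min_{Q_{jk}^{GEKS}} \sum_{j=1}^{M} \sum_{k=1}^{M} \left[ \ln Q_{jk}^{GEKS} - \ln Q_{jk}^{F} \right]^2 \quad \text{s.t. } Q_{jk}^{GEKS} = Q_{jl}^{GEKS} \times Q_{lk}^{GEKS} \text{ for all } j, k \text{ and } l = 1, 2, \ldots, M \tag{106}$$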

The solution to the optimization problem in Eq. (106) has the following closed
form:

\[
Q^{GEKS}_{jk} = \prod_{l=1}^{M} \left[ Q^{F}_{jl} \cdot Q^{F}_{lk} \right]^{1/M}, \quad j, k = 1, 2, \ldots, M
\tag{107}
\]

The GEKS quantity comparisons have an intuitive interpretation. For a compari-

son between j and k, it is possible to make indirect comparisons through each of the
firms in the comparison. Since the Fisher index is not transitive, different numerical
values result from these indirectly linked comparisons. The GEKS index in Eq.
(107) can be seen as a simple geometric average of all these M chained comparisons.
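
The closed form in Eq. (107) can be illustrated with a short numerical sketch. The code below is only an illustration under stated assumptions: the price and quantity data for three firms are hypothetical, and the convention that the binary Fisher index compares firm k with firm j as the base is assumed here rather than taken from the chapter. The function names are likewise illustrative.

```python
import math

def fisher_quantity_index(p_j, q_j, p_k, q_k):
    """Fisher quantity index of unit k relative to base unit j (assumed convention)."""
    # Laspeyres quantity index: quantities valued at base-unit prices.
    laspeyres = sum(p * q for p, q in zip(p_j, q_k)) / sum(p * q for p, q in zip(p_j, q_j))
    # Paasche quantity index: quantities valued at comparison-unit prices.
    paasche = sum(p * q for p, q in zip(p_k, q_k)) / sum(p * q for p, q in zip(p_k, q_j))
    return math.sqrt(laspeyres * paasche)

def geks(q_f):
    """GEKS matrix from a matrix of binary indices, following Eq. (107)."""
    m = len(q_f)
    result = []
    for j in range(m):
        row = []
        for k in range(m):
            # Geometric mean over all link units l of the chained comparison j -> l -> k.
            log_mean = sum(math.log(q_f[j][l] * q_f[l][k]) for l in range(m)) / m
            row.append(math.exp(log_mean))
        result.append(row)
    return result

def max_transitivity_gap(q):
    """Largest absolute violation of q[j][k] = q[j][l] * q[l][k]."""
    n = len(q)
    return max(abs(q[j][k] - q[j][l] * q[l][k])
               for j in range(n) for k in range(n) for l in range(n))

# Hypothetical data: 3 firms, 2 commodities.
prices = [[1.0, 2.0], [1.5, 1.8], [0.9, 2.5]]
quantities = [[10.0, 5.0], [8.0, 7.0], [12.0, 4.0]]
m = len(prices)

q_f = [[fisher_quantity_index(prices[j], quantities[j], prices[k], quantities[k])
        for k in range(m)] for j in range(m)]
q_geks = geks(q_f)

print("Fisher transitivity gap:", max_transitivity_gap(q_f))
print("GEKS transitivity gap:  ", max_transitivity_gap(q_geks))
```

The GEKS gap is zero up to rounding error because each GEKS index can be written as a ratio of firm-specific geometric means of the Fisher indices, which relies on the Fisher index satisfying the time (unit) reversal property.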

The GEKS index is a formula for multilateral comparisons. It has been the main
aggregation method used in international comparisons of prices and real expenditure
at the World Bank over the last two decades, and Eurostat has used it as the
aggregation procedure for comparisons across the EU member countries since
1990. The GEKS index is also the method proposed by Caves et al. [10] for
multilateral output and input quantity index numbers based on the Malmquist
distance functions and for multilateral productivity comparisons based on the
Malmquist productivity index.29
The GEKS index satisfies many of the axioms discussed in Section “Axiomatic
Approach to Index Numbers.” However, it fails the strong form of identity test but
satisfies the weak form of identity test. It also fails to satisfy the proportionality
axiom and monotonicity axiom in extreme cases. Despite the deficiencies associated
with the GEKS index, it is a commonly used method for multilateral comparisons. The
GEKS method is preferred over the Lowe and GY indices, as these methods are
anchored on a single selected reference price vector in the case of the Lowe index
and a reference set of weights in the case of the GY index. In addition, the Malmquist
index requires a reference technology and reference input vector.
In conclusion to this section, transitivity is an important operational requirement
when multilateral output and input quantity index numbers are considered. If one
wishes to strictly adhere to the axiomatic approach, the result by Funke et al. [31] is
quite important and in this case the Geometric Young index is the only option. The
GEKS approach provides an alternative but it fails some important tests. However,
it has the advantage that it does not rely on any choice with respect to the weights
or reference technology or input vectors.

Conclusion

This chapter is about the use of index numbers to measure output and input
growth generally and about their use in productivity measurement. This chapter is
not about efficiency and productivity measures and the related task of productivity
decomposition. These aspects are dealt with in other chapters in this Handbook.
Of particular relevance are Chap. 20, “Conceptualization and Measurement
of Productivity Growth and Technical Change: A Nonparametric Approach”;
Chap. 8, “Stochastic Frontier Analysis: Foundations and Advances I”;
Chap. 9, “Stochastic Frontier Analysis: Foundations and Advances II”; and
Chap. 10, “Data Envelopment Analysis: A Nonparametric Method of Production
Analysis”.
The main objective of the chapter has been one of providing the practitioner with
an overview of the index number methods and approaches available for measuring
changes in output and input quantities. These methods are equally applicable for

29 The GEKS index is also being increasingly used in inter-temporal comparisons of prices based
on scanner data.
measuring changes in output and input prices. In this chapter two alternative
approaches to this problem are discussed in detail. The first approach, discussed
in Section “Index Numbers Based on Quantity Aggregates,” utilizes the notion
of aggregates of output and input quantities and uses the resulting aggregates to
measure changes in output and input quantities. The second approach, discussed
in Section “Index Number Approach to Measuring Quantity Change,” reviews
the standard index number approach where index number formulae are used to
directly measure changes in output and input quantities without having to first
measure quantity aggregates. The axiomatic and economic theoretic approaches are
discussed in Sections “Axiomatic Approach to Index Numbers” and “Economic
Theoretic Approach to Output and Input Quantity Index Numbers.” Section “Special
Topics” discusses a few additional topics that bring the whole chapter together. The
problem of multilateral comparisons and the need to compile transitive measures
of output and input quantity change are discussed in Section “Transitivity and Quantity
Index Numbers.”
The main take-home message for the practitioner is that the index number
approach to quantity and productivity comparisons does not always lead to a
single index number formula. When bilateral comparisons involving only two
periods or two firms are required, the general recommendation is to use the Fisher
or Törnqvist index numbers formulae. These indices possess useful axiomatic
and economic theoretic properties and are known to be numerically close to
each other. However, the problem becomes somewhat complex when multilateral
comparisons (Section “Transitivity and Quantity Index Numbers”) are required as
the recommended Fisher and Törnqvist indices fail to satisfy the transitivity property.
The often recommended Gini-Éltető-Köves-Szulc method of measuring quantity
changes in a multilateral context satisfies transitivity but fails to satisfy the strong
identity test (although it satisfies the weak identity test) and the proportionality
test and may fail the monotonicity test in extreme cases (Section “Transitivity and
Quantity Index Numbers”). It is in this context that the approach based on quantity
aggregates, discussed in Section “Index Numbers Based on Quantity Aggregates,”
assumes relevance. The practitioner interested in using the quantity aggregates to
measure quantity changes must be aware of the main issues associated with the
use and implementation of the quantity aggregates to measure quantity change
(discussed in Section “Use of Quantity Aggregates to Measure Quantity Change”).
The empirical implementation of this approach requires choices to be made at
several stages. To start with, one has to articulate the set of properties to be satisfied
by the quantity aggregate (see Section “Use of Quantity Aggregates to Measure
Quantity Change”). The subsequent choice of an appropriate aggregator function
such as the Lowe, Geometric Young, or Malmquist aggregator function (Section
“Specification of Functional form for the Output Aggregates”) followed by the
choice of weights, reference vectors, and reference technology (in the
Malmquist index case) are critical to this approach. The resulting measures of output
and input quantity changes are likely to be highly sensitive to the choices made in
this process. Currently no concrete recommendations regarding these choices are
available. Keeping this in perspective, it is important to report sensitivity of the
results arising out of various choices made in measuring quantity changes using
quantity aggregates, along with results from the use of the direct index number approach
and the GEKS method. Where alternative choices produce significantly different
results, it is necessary to outline the reasons and to defend any
particular choice made for the final analysis.

References
1. Aczél J (1966) Lectures on functional equations and their applications. Academic, New
York/London
2. Allen RGD (1949) The economic theory of index numbers. Economica NS 16(99):197–203
3. Allen RC, Diewert WE (1981) Direct versus implicit superlative index number formulae. Rev
Econ Stat 63:430–435
4. Balk BM (1995) Axiomatic price index theory: a survey. Int Stat Rev 63:69–93
5. Balk BM (1998) Industrial Price, quantity, and productivity indices: the micro-economic theory
and an application. Springer, New York
6. Balk B (2008) Price and quantity index numbers – models for measuring aggregate change and
difference. Cambridge University Press, New York
7. Balk BM, Diewert WE (2009) The Lowe consumer price index and its substitution bias.
Chapter 8. In: Diewert WE, Balk BM, Fixler D, Fox KJ, Nakamura AO (eds) Price and
productivity measurement – volume 6 index number theory. Trafford On Demand Pub
8. Bjurek H (1996) The Malmquist total factor productivity index. Scand J Econ 98(2):303–313
9. Carli GR (1764) Del valore e della proporzione de’ metalli monetati. In: Scrittori Classici
Italiani di Economia Politica, Parte Moderna, XIII, 297–366
10. Caves DW, Christensen LR, Diewert WE (1982) Multilateral comparisons of output, input, and
productivity using superlative index numbers. Econ J 92:73–86
11. Coelli TJ, Rao DSP, Battese GE (1998) An introduction to efficiency and productivity
measurement. Kluwer Academic, New York
12. Coelli TJ, Rao DSP, O’Donnell CJ, Battese GE (2005) An introduction to efficiency and
productivity measurement. Springer, New York
13. Diewert WE (1976) Exact and superlative index numbers. J Econ 4:115–145
14. Diewert WE (1978) Superlative index numbers and consistency in aggregation. Econometrica
46:883–900
15. Diewert WE (1983) The theory of the output price index and the measurement of real output
change. In: Diewert WE, Montmarquette C (eds) Price level measurement: proceedings from
a conference sponsored by Statistics Canada, Ottawa, pp 1049–1108
16. Diewert WE (1992) Fisher ideal output, input and productivity indices revisited. J Prod Anal
3:211–248
17. Diewert WE (1993) The early history of Price index research. In: Diewert WE, Nakamura A
(eds) Essays in index number theory. Amsterdam, North-Holland
18. Diewert WE (2013) Methods of aggregation above the basic heading level Chapter 5. In:
World Bank (ed) Measuring the real size of the world economy. World Bank, Washington, DC,
pp 93–119
19. Diewert WE (2015a) Functional equations, Chapter 2. In: Index number theory and practice.
Lecture Notes, Department of Economics, University of British Columbia, Vancouver
20. Diewert WE (2015b) Early approaches to index number theory, Chapter 1. In: Index number
theory and practice. Lecture Notes, Department of Economics, University of British Columbia,
Vancouver
21. ECE-ILO (2014) Manual of the consumer price index. ILO, Geneva
22. Edgeworth FY (1925) The plurality of index numbers. Econ J 35:379–388
23. Eichhorn W (1978) Functional equations in economics. Addison-Wesley, Reading
24. Eichhorn W, Voeller J (1976) Theory of the price index, Lecture notes in economics and
mathematical systems 140. Springer, Berlin
25. Éltető O, Köves P (1964) On an index computation problem in international comparisons (in
Hungarian). Statisztikai Szemle 42:507–518
26. Färe R, Primont D (1995) Multi-output production and duality: theory and applications. Kluwer
Academic, Boston
27. Farrell M (1957) The measurement of productive efficiency. J Roy Stat Soc A (General)
120(3):253–290
28. Fisher I (1922) The making of index numbers: a study of their varieties, tests and reliability.
Houghton Mifflin Company, Boston/New York
29. Fisher FM, Shell K (1972) The pure theory of the national output deflator. In: Fisher FM,
Shell K, The economic theory of price indices. Academic, New York, pp 49–113
30. Frisch R (1936) Annual survey of general economic theory: the problem of index numbers.
Econometrica 4:1–38
31. Funke H, Hacker G, Voeller J (1979) Fisher’s circular test revisited. Schweizerische Zeitschrift
für Volkswirtschaft und Statistik 115:677–687
32. Gini C (1924) Quelques considérations au sujet de la construction des nombres indices des
prix et des questions analogues. Metron 4:3–162
33. Gini C (1931) On the circular test of index numbers. Int Rev Stat 9(2):3–25
34. Hill TP (2009) Lowe indices. Chapter 9. In: Diewert WE, Balk BM, Fixler D, Fox KJ,
Nakamura AO (eds) Price and productivity measurement – volume 6 index number theory.
Trafford On Demand Pub
35. IMF (2004) Producer Price Index (PPI). Manual, Washington, DC
36. Jevons WS (1865) On the variation of prices and the value of the currency since 1782. J Stat
Soc Lond 28:294–320
37. Konus AA (1924) The problem of the true index of the cost of living, translated in
Econometrica, 7 (1939), pp 10–29
38. Kuczma M, Gilányi A (2009) An introduction to the theory of functional equations and
inequalities: cauchy’s equation and Jensen’s inequality. Birkhäuser, Basel
39. Laspeyres E (1871) Die Berechnung einer mittleren Waarenpreissteigerung. Jahrbücher für
Nationalökonomie und Statistik 16:296–314
40. Lowe J (1823) The present state in regard to agriculture, trade and finance, 2nd edn. Longman,
Hurst, Rees, Orne & Brown, London
41. Malmquist S (1953) Index numbers and indifference surfaces. Trabajos de Estadistica
4:209–242
42. Marshall A (1887) Remedies for fluctuations of general prices. Contemp Rev 51:355–375.
(Reprinted in Memorials of Alfred Marshall, edited by A.C. Pigou, Macmillan, London,
1925)
43. O’Donnell CJ (2011) DPIN 3.0: a program for decomposing productivity index numbers.
Available on Center for Efficiency and Productivity Analysis website: www.uq.edu.au/
economics/cepa/dpin.php; School of Economics, University of Queensland, Brisbane
44. O’Donnell CJ (2012a) Alternative indices for multiple comparisons of quantities and prices.
Centre for Efficiency and Productivity Analysis Working Papers WP05/2012. School of
Economics, University of Queensland, Brisbane
45. O’Donnell CJ (2012b) Nonparametric estimates of the components of productivity and
profitability change in U.S. agriculture. Am J Agric Econ 94(4):873–890
46. O’Donnell CJ (2016) Using information about technologies, markets and firm behaviours to
decompose a proper productivity index. J Econ 190(2):328–340
47. O’Donnell CJ (2018) Productivity and efficiency analysis: an economic approach to measuring
and explaining managerial performance. Springer, Singapore
48. OECD (2001) Measurement of aggregate and industry-level productivity growth – productivity
Manual. OECD, Paris
49. Paasche H (1874) Ueber die Preisentwicklung der letzten Jahre nach den Hamburger
Börsennotierungen. Jahrbücher für Nationalökonomie und Statistik 23:168–178
50. Rao DSP (2013a) The framework for the international comparison program (ICP). Chapter 1.
In: World Bank (ed) Measuring the real size of the world economy. World Bank, Washington,
DC, pp 13–45
51. Rao DSP (2013b) Computation of basic heading purchasing power parities (PPPs) for
comparisons within and between regions. Chapter 4. In: World Bank (ed) Measuring the real
size of the world economy. World Bank, Washington, DC, pp 93–119
52. Samuelson PA, Swamy S (1974) Invariant economic index numbers and canonical duality:
survey and synthesis. Am Econ Rev 64:566–593
53. Sato K (1976a) The ideal log-change index number. Rev Econ Stat 58:223–228
54. Sato K (1976b) The meaning and measurement of the real value added index. Rev Econ Stat
58:434–442
55. Sickles R, Zelenyuk V (2019) Measurement of productivity and efficiency: theory and practice.
Cambridge University Press, Cambridge, UK
56. Szulc B (1964) Index numbers of multilateral regional comparisons (in Polish). Przegląd
Statystyczny 3:239–254
57. Shephard RW (1970) Theory of Cost and Production Functions. Princeton University Press,
Princeton
58. Törnqvist L (1936) The Bank of Finland’s consumption Price index. Bank Finl Mon Bull
16(10):27–34
59. Vartia YO (1976) Ideal log-change index numbers. Scand J Econ 3:121–126
60. von Bortkiewicz L (1923) Zweck und Struktur einer Preisindexzahl. Nordisk Statistisk
Tidsskrift 2:369–408
61. Walsh CM (1901) The measurement of general exchange-value. Macmillan, New York
62. Young A (1812) An inquiry into the progressive value of money in England as marked by the
price of agricultural products. Hatchard, Piccadilly
20 Conceptualization and Measurement
of Productivity Growth and Technical
Change: A Nonparametric Approach

Subhash C. Ray

Contents
Introduction
The Theoretical Background
  The Production Possibility Set
  Distance Function
  Technical Efficiency and Distance Function
  Input and Output Sets
  Technical Change
  Change in the Technology Versus Change in the Technique: A Clarification
  Productivity Change in Continuous Time
Productivity Change in Discrete Time
  A Hicks-Moorsteen Productivity Index
  The Tornqvist Productivity Index
  The Fisher Productivity Index
  Profitability, Terms of Trade, and Productivity Indexes
  Malmquist Productivity Index
  Malmquist Productivity Index with Multiple Outputs and Inputs
  Allowing Technological Change
  Allowing Returns to Scale Effect
  Biennial Malmquist Index
  Directional Distance Function and Luenberger Productivity Indicator
  Relation Between Tornqvist and Malmquist Productivity Indexes
  Relation Between Fisher and Malmquist Productivity Indexes
  Nonparametric Decomposition of the Fisher Productivity Index
  Relation Between Malmquist Productivity Index and Luenberger Productivity Indicator
Data Envelopment Analysis and a Nonparametric Measurement of Productivity Change
  DEA Models for Measuring the Malmquist Productivity Index
  Biennial Malmquist Index
  DEA Model for the Directional Distance Function
Conclusion
Cross-References
References

S. C. Ray
Department of Economics, University of Connecticut, Storrs, CT, USA

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_26

Abstract

The famous Solow residual measures productivity change as the difference
between the growth rates of output and input. Under constant returns to scale
and in the absence of any technical inefficiency, it serves as a measure of
technical change. In neoclassical production economics, productivity change can
be measured alternatively from the production, cost, profit, or distance functions.
In continuous time analysis, one measures the rates of productivity and technical
change. In discrete time, one measures indexes of productivity and technical
change over time. This chapter describes the Tornqvist, Fisher, and Malmquist
productivity indexes along with the Luenberger productivity indicator based
on the directional distance function and how they relate to one another. Also
discussed is the relation between productivity and profitability of a firm. The
relevant nonparametric DEA models for measuring the Malmquist index and the
Luenberger productivity indicator are formulated for nonparametric analysis of
productivity, technical change, and change in efficiency.

Keywords

Solow residual · Shephard distance function · Directional distance function ·
Data envelopment analysis

JEL Classification Numbers

D24, C61

Introduction

An appropriate starting point for any discussion of productivity growth and technical
change in the neoclassical production economic framework is the seminal paper
by Solow on technical progress and productivity change [34]. The famous Solow
residual measuring the difference between the rates of growth in output and inputs
is interpreted as the rate of technical progress. Solow assumed constant returns
to scale, which is quite appropriate in the context of his macroeconomic model.
When applied to an individual producer, one needs to allow variable returns to
scale. Further, changes in technical efficiency may account in part for a higher
or lower rate of growth in output. It is now generally accepted that in addition to
technical progress, returns to scale effects of a change in inputs along with changes
in technical efficiency may also contribute to the Solow residual. The principal
objective of this chapter is to explain how to isolate technical progress, scale effects,
and efficiency change as three distinct components of productivity change measured
empirically using the nonparametric method of data envelopment analysis (DEA).1
The chapter unfolds as follows. Section “The Theoretical Background” presents
a brief overview and basic assumptions of the neoclassical production theory and
considers alternative ways to measure technical change and productivity growth
in continuous time using production, cost, profit, and distance functions as the
analytical framework. Section “Productivity Change in Discrete Time” revisits
the issues in discrete time using index numbers. In particular, Tornqvist, Fisher,
and Malmquist productivity indexes and their interrelations are considered. Also
described is the more recent Luenberger productivity indicator measured by the
differences between directional distance functions. Section “Data Envelopment Analysis
and a Nonparametric Measurement of Productivity Change” presents the relevant
nonparametric DEA models for measuring the Malmquist index and the Luenberger
productivity indicator. Section “Conclusion” concludes.

The Theoretical Background

The Production Possibility Set

Consider an industry producing m outputs from n inputs. Let $x \in R^n_+$ be a vector
of inputs and $y \in R^m_+$ an output vector. Then the input-output pair (x, y) is a feasible
production plan if and only if y can be produced from x. The set of all feasible
production plans constitutes the production possibility set:

\[
T = \{(x, y) : y \in R^m_+ \text{ can be produced from } x \in R^n_+\} \tag{1}
\]

It is conventional to make the following assumptions about the production

possibility or the technology set:

(a) T is closed and bounded.

(b) T is convex.
(c) Inputs are freely disposable. This implies that for any y, if (x0 , y) ∈ T and
x1 ≥ x0 , then (x1 , y) ∈ T.

1 A comparable analysis of productivity change using the parametric method of stochastic frontier
analysis (SFA) may be found in Chap. 21, “Modeling Technical Change: Theory and Practice”
by Kumbhakar in this volume.
(d) Outputs are freely disposable. This implies that for any x, if (x, y0 ) ∈ T and
y1 ≤ y0 , then (x, y1 ) ∈ T.
If, in addition to (a)–(d) one also assumes constant returns to scale (CRS),
we get
(e) (x, y) ∈ T implies that (kx, ky) ∈ T for all k ≥ 0.

It is often convenient to define the production possibility set indirectly by means

of a parametrically specified function.
For single-output technologies, one can define the production function y∗ = f (x)
as the maximum output producible from the input (vector) x. Equivalently, the
production possibility set is defined as

T = {(x, y) : y ≤ f (x)} (2)

In practice, the output produced by a firm from the input bundle x may be strictly
lower than f (x) due to technical inefficiency. One can incorporate efficiency in the
production function as

y = f (x)τ ; 0 ≤ τ ≤ 1. (3)

Alternatively,
\[
\tau = \frac{y}{f(x)}. \tag{4}
\]

When multiple outputs are produced from multiple inputs, a simple representa-
tion of the technology through a production function is no longer possible. One may
however specify a transformation function:

\[
F(x, y) = a; \quad x \in R^n_+, \; y \in R^m_+, \; a \in R^1. \tag{5}
\]

The production possibility set can then be defined as

\[
T = \{(x, y) : F(x, y) \le 0; \; x \in R^n_+, \; y \in R^m_+\}. \tag{6}
\]

For any explicit functional specification of the transformation function, free

disposability of inputs and outputs would imply

\[
\frac{\partial F}{\partial x_i} \le 0, \quad i = 1, 2, \ldots, n; \tag{7a}
\]

and

\[
\frac{\partial F}{\partial y_j} \ge 0, \quad j = 1, 2, \ldots, m. \tag{7b}
\]

Further, convexity of T would imply that F(x, y) is a concave function.

Distance Function

It is apparent from (6) above that the transformation function provides a criterion for
separating feasible input-output bundles from those which are infeasible. However,
an input-output bundle is not efficient unless F(x, y) = 0. When F(x0 , y0 ) < 0 for any
input-output bundle (x0 , y0 ), (7a) and (7b) would imply that it would be possible to
increase one or more output(s) or to reduce one or more input(s) without violating
feasibility.
Shephard [32, 33] defined the (output) distance function as

\[
D^y(x, y) = \min\left\{\lambda : \left(x, \tfrac{1}{\lambda} y\right) \in T\right\}. \tag{8}
\]

It is clear that if F(x, y) = 0, Dy (x, y) = 1. On the other hand, for F(x, y) < 0,
Dy (x, y)< 1. Thus, an alternative, and equivalent, characterization of the production
possibility set is

\[
T = \{(x, y) : D^y(x, y) \le 1; \; x \in R^n_+, \; y \in R^m_+\}. \tag{9}
\]

At this point, it will be useful to distinguish between weak (or radial) efficiency
and strong (or non-radial) efficiency.
An input-output bundle (x0 , y0 ) is weakly output efficient if

\[
(x^0, y^0) \in T \text{ and } a > 1 \Rightarrow (x^0, a y^0) \notin T. \tag{10a}
\]

Similarly, (x0 , y0 ) is weakly input efficient if

\[
(x^0, y^0) \in T \text{ and } b < 1 \Rightarrow (b x^0, y^0) \notin T. \tag{10b}
\]

By contrast, (x0 , y0 ) is strongly (non-radially) output efficient if

\[
(x^0, y^0) \in T \text{ and } y^1 \ge y^0 \Rightarrow (x^0, y^1) \notin T. \tag{11a}
\]

Similarly, (x0 , y0 ) is strongly (non-radially) input efficient if

\[
(x^0, y^0) \in T \text{ and } x^1 \le x^0 \Rightarrow (x^1, y^0) \notin T. \tag{11b}
\]

Weak output efficiency rules out any proportional increase in all outputs, but
increase in some outputs may be possible. The vector inequality in (11a) rules out
any increase in any output without increasing inputs. It is clear that Dy (x, y) = 1
implies that the bundle (x, y) must be weakly (or radially) efficient even though it
may or may not be strongly efficient.
Listed below are some properties of the output-oriented distance function:

O1. Dy (x, y) is non-decreasing in y. That is, for any input bundle x, y1 ≥ y0 ⇒
Dy (x, y0 ) ≤ Dy (x, y1 ).
O2. Dy (x, y) is non-increasing in x. That is, for any output bundle y, x1 ≥ x0 ⇒
Dy (x0 , y) ≥ Dy (x1 , y).
O3. Dy (x, y) is homogeneous of degree one in y. That is, Dy (x, αy) = αDy (x, y).
O4. Dy (x, y) is convex in y.

Comparable to the output distance function, one can define the input distance
function:

\[
D^x(x, y) = \max\left\{\delta : \left(\tfrac{1}{\delta} x, y\right) \in T\right\} \tag{12}
\]

Again, for any feasible input-output bundle (x, y), Dx (x, y) ≥ 1. Hence, another
way to define the production possibility set is

\[
T = \{(x, y) : D^x(x, y) \ge 1; \; x \in R^n_+, \; y \in R^m_+\}. \tag{13}
\]

As argued above, if Dx (x0 , y0 ) = 1, (x0 , y0 ) is weakly (but not necessarily

strongly) input efficient.
The analogous properties of the input-oriented distance function are:

I1. Dx (x, y) is non-decreasing in x.

I2. Dx (x, y) is non-increasing in y.
I3. Dx (x, y) is homogeneous of degree one in x.
I4. Dx (x, y) is concave in x.

In general, there is no specific relationship between the output- and the input-
oriented distance functions for the same input-output bundle (x, y). However, under
CRS, the distance functions are inverses of one another. This can be shown as
follows.
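We know from the definition of the output-oriented distance function in (8) above
that $\lambda^*$ is $\min\{\lambda : (x, \tfrac{1}{\lambda} y) \in T\}$. However, under the CRS assumption, that would
mean

\[
\lambda^* = \min\{\lambda : (\lambda x, y) \in T\} = \frac{1}{\max\left\{\theta : \left(\tfrac{1}{\theta} x, y\right) \in T\right\}}. \tag{14}
\]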

Hence, using the subscript C to refer to a CRS technology,

\[
D^y_C(x, y) = \frac{1}{D^x_C(x, y)}. \tag{14a}
\]
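
The reciprocal relation in (14a) can be checked numerically. The following is a minimal sketch, assuming a hypothetical single-output CRS technology with production function f(K, L) = 2√(KL) and an arbitrary bundle chosen for illustration. Both distance functions are computed by bisection directly from their definitions in (8) and (12), so the reciprocal relation is a result of the computation rather than something built into the code.

```python
import math

def f(k, l):
    """Hypothetical CRS production function (homogeneous of degree one)."""
    return 2.0 * math.sqrt(k * l)

def output_distance(k, l, y, lo=1e-9, hi=1e9, iters=200):
    """Smallest lambda such that (x, y / lambda) is feasible, i.e. y / lambda <= f(x)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if y / mid <= f(k, l):
            hi = mid   # mid is feasible, so the minimum is at or below mid
        else:
            lo = mid   # mid is infeasible, so the minimum is above mid
    return hi

def input_distance(k, l, y, lo=1e-9, hi=1e9, iters=200):
    """Largest delta such that (x / delta, y) is feasible, i.e. f(x / delta) >= y."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(k / mid, l / mid) >= y:
            lo = mid   # mid is feasible, so the maximum is at or above mid
        else:
            hi = mid   # mid is infeasible, so the maximum is below mid
    return lo

# Illustrative bundle: K = 4, L = 9 gives f = 12, while observed output is y = 6.
d_y = output_distance(4.0, 9.0, 6.0)
d_x = input_distance(4.0, 9.0, 6.0)
print(d_y, d_x, d_y * d_x)
```

With a production function that is not homogeneous of degree one, the same two routines would generally return values whose product differs from one, which is the sense in which the relation is specific to CRS.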

Technical Efficiency and Distance Function

Farrell [21] defined the output-oriented technical efficiency of an input-output

bundle (x, y) as

\[
\tau_y(x, y) = \frac{1}{\varphi^*}; \quad \varphi^* = \max\{\varphi : (x, \varphi y) \in T\}. \tag{15}
\]

Comparison of (15) with (8) shows that

τy (x, y) = D y (x, y) . (15a)

On the other hand, input-oriented Farrell efficiency is

\[
\tau_x(x, y) = \min\{\theta : (\theta x, y) \in T\}. \tag{16}
\]

Hence,

\[
D^x(x, y) = \frac{1}{\tau_x(x, y)}. \tag{16a}
\]

Input and Output Sets

Yet another equivalent characterization of the technology is through the families of

input sets and output sets.
For any given output vector y0 , the input set is

\[
V(y^0) = \{x \in R^n_+ : (x, y^0) \in T\} \tag{17}
\]

Two things may be noted. First, for any production possibility set T, there is a
family of input sets, one for each output vector. Second, while $T \subset R^{m+n}_+$, $V(y) \subset R^n_+$.
In an analogous way, for any input vector $x^0$, the output set is

\[
P(x^0) = \{y \in R^m_+ : (x^0, y) \in T\} \tag{18}
\]

Again, T yields a family of output sets, one for each input vector.

Technical Change

In our discussion, so far, we have assumed that the production possibility set, T,
remains unchanged over time. In reality, however, due to advancement in technical
knowhow, the frontier of the production possibility may shift outward making it
possible to produce a bigger output bundle than before from the same input bundle.
This is interpreted as technical progress.
One can accommodate technical change by dating the production possibility set
over time as

\[
T^t = \{(x, y) : y \in R^m_+ \text{ can be produced from } x \in R^n_+ \text{ in period } t\} \tag{19}
\]

Thus, in the single-output case,

\[
T^t = \{(x, y) : y \le f(x, t); \; x \in R^n_+ \text{ in period } t\} \tag{20}
\]

Similarly, for the multiple output case,

\[
\begin{aligned}
T^t &= \{(x, y) : F(x, y; t) \le 0; \; x \in R^n_+; \; y \in R^m_+\} \\
&\iff \{(x, y) : D^y(x, y; t) \le 1; \; x \in R^n_+; \; y \in R^m_+\}
\end{aligned}
\tag{21}
\]

Change in the Technology Versus Change in the Technique: A Clarification

It is important to clear up a common misconception about technical progress. Often

an increase in the capital labor ratio is erroneously regarded as an improvement in
the technology. This amounts to confusing a movement along an isoquant with a
shift in the isoquant. Consider a firm that uses 2 machines (K) and 50 workers (L),
and suppose that the production function is
$$Q = 2\sqrt{KL}. \tag{22}$$

Further assume that there is no technical inefficiency. Then, its output (Q) will
be 20. Now suppose that in the next period, it uses 20 machines and 5 workers. So
long as the same production function applies in both periods, again its output will
be 20. The two input bundles (K = 2, L = 50) and (K = 20, L = 5) are two points
in the same isoquant for Q=20. Even though the second input bundle shows a much
higher use of capital per worker, it represents input substitution and does not imply
any technical progress. Only if the production function itself changes between the
two periods can we talk about technical progress. This will be associated with a shift
of the isoquant toward the origin because less of labor or capital (or a combination of
both) will be required to produce the same level of output as before. Solow is quite
clear on this point and used the phrase “technical change” as a shorthand expression
for any kind of shift in the production function ([34], p. 312). A point on an isoquant
represents a technique. Thus, movement from one point to another on the isoquant
amounts to a change in the technique. By contrast, the technology is the assortment
of all available techniques to produce any given level of output, and a change in
the technology implies a shift in the production function resulting in a shift in the
isoquant.
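
The distinction can be checked numerically. The following is a minimal sketch that evaluates the production function in (22) at the two input bundles from the text; the function name `output` is an illustrative choice, not part of the original exposition.

```python
import math

def output(K, L):
    """Production function (22): Q = 2 * sqrt(K * L)."""
    return 2.0 * math.sqrt(K * L)

# Two techniques from the text: same technology, different capital-labor ratios.
q_period0 = output(2, 50)   # K/L = 0.04
q_period1 = output(20, 5)   # K/L = 4.0

# Both bundles lie on the isoquant Q = 20, so the move is input substitution.
print(q_period0, q_period1)
```

A change in the technology would instead require a different function, so that the same bundle produces a different output level.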

Productivity Change in Continuous Time

Deriving the Solow Measure of Technical Progress

Solow considered an aggregate production function for the entire economy with
output, Q, as a function of capital, K, and labor, L, and technical change causing a
neutral shift in the production function over time without altering the marginal rate
of substitution between capital and labor:

$$Q = A(t) f(K, L) \tag{23}$$

Thus

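$$\frac{dQ}{dt} = A(t)\,\frac{d f(K, L)}{dt} + f(K, L)\,\frac{dA(t)}{dt} = A(t)\left(\frac{\partial f}{\partial K}\frac{dK}{dt} + \frac{\partial f}{\partial L}\frac{dL}{dt}\right) + f(K, L)\,\frac{dA(t)}{dt}$$

Defining $\frac{dQ}{dt} = \dot{Q}$, $\frac{dK}{dt} = \dot{K}$, $\frac{dL}{dt} = \dot{L}$, $\frac{dA(t)}{dt} = \dot{A}$, $\frac{\partial f}{\partial K} = f_K$, and $\frac{\partial f}{\partial L} = f_L$, we
get

$$\frac{\dot{Q}}{Q} = \frac{f_K K}{f}\frac{\dot{K}}{K} + \frac{f_L L}{f}\frac{\dot{L}}{L} + \frac{\dot{A}}{A} \tag{24}$$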

Under perfect competition and treating the output price as the numeraire, the
input prices of capital and labor will be equal to their marginal productivities $\frac{\partial Q}{\partial K} = A(t) f_K$
and $\frac{\partial Q}{\partial L} = A(t) f_L$, respectively. In that case, the partial elasticities of output
with respect to the inputs, $\frac{\partial Q}{\partial K}\frac{K}{Q} = \frac{f_K K}{f} = s_K$ and $\frac{\partial Q}{\partial L}\frac{L}{Q} = \frac{f_L L}{f} = s_L$, become the
corresponding shares of K and L in the total output. Under the assumption of CRS,
$f(K, L) = f_K K + f_L L$ so that the factor shares add up to unity. At this point, one
may define the rate of change in the aggregate input as

$$\frac{\dot{X}}{X} = s_K \frac{\dot{K}}{K} + s_L \frac{\dot{L}}{L} \tag{25}$$

Then the (multi-factor) productivity change can be seen to be equal to the rate of
technical change:

$$\frac{\dot{M}}{M} = \frac{\dot{Q}}{Q} - \frac{\dot{X}}{X} = \frac{\dot{A}}{A}. \tag{26}$$

As already noted above, in Solow’s analysis CRS was the default returns to scale.
Moreover, he assumed full technical efficiency. We now relax the CRS assumption
and also accommodate inefficiency.
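
As an illustration of (25) and (26), the sketch below computes the Solow residual from growth rates and the capital share under CRS. The function name and the numerical values are hypothetical and chosen only to make the arithmetic visible.

```python
def solow_residual(q_growth, k_growth, l_growth, s_k):
    """Return (aggregate input growth, productivity growth) as in (25)-(26).

    Assumes CRS, so the labor share is 1 - s_k.
    """
    s_l = 1.0 - s_k
    input_growth = s_k * k_growth + s_l * l_growth   # equation (25)
    return input_growth, q_growth - input_growth     # equation (26)

# Hypothetical data: output grows 4%, capital 5%, labor 1%, capital share 0.3.
x_growth, m_growth = solow_residual(0.04, 0.05, 0.01, 0.3)
print(x_growth, m_growth)
```

With these numbers the share-weighted input growth is 2.2%, leaving a residual of 1.8% that is attributed to the neutral shift term in (26).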

Productivity Change at the Firm Level with Variable Returns to Scale and Inefficiency²

Consider a one-output multiple-input production function at the firm level allowing
the returns to scale to vary across different input bundles and the presence of
technical inefficiency in the observed output.
Let y be a scalar output and $x \in R^n_+$ be a bundle of n inputs. The production
function can then be written as

$$y = f(x, t)\,\tau; \qquad 0 \le \tau \le 1 \tag{27}$$

where t is an index of time to capture technical change and τ is a measure of

efficiency, which coincides with the Shephard output distance function.
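Then

$$\frac{dy}{dt} \equiv \dot{y} = \tau\,\frac{d f(x, t)}{dt} + f(x, t)\,\frac{d\tau}{dt} = \tau\left(\sum_i \frac{\partial f}{\partial x_i}\frac{dx_i}{dt} + \frac{\partial f}{\partial t}\right) + f(x, t)\,\frac{d\tau}{dt} \tag{28}$$

Define $\frac{\partial f}{\partial x_i} \equiv f_i$ (i = 1, 2, . . . , n) and $\frac{\partial f}{\partial t} \equiv f_t$.
Thus

$$\frac{\dot{y}}{y} = \sum_i \frac{f_i x_i}{f}\frac{\dot{x}_i}{x_i} + \frac{f_t}{f} + \frac{\dot{\tau}}{\tau}. \tag{29}$$

It can be easily seen that $\varepsilon_i = \frac{\partial \ln y}{\partial \ln x_i} = \frac{f_i x_i}{f}$ is the partial elasticity of output with
respect to the ith input. We may now define a measure of the rate of growth in the
total input as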

2 Sections “Productivity Change at the Firm Level with Variable Returns to Scale and Inefficiency”
through “Productivity Growth from Growth Accounting” and “Measuring Productivity Growth
from the Cost Function” build upon the model in Denney, Fuss, and Waverman [14].

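$$\left(\frac{\dot{X}}{X}\right)_1 = \sum_i \frac{f_i x_i}{f}\frac{\dot{x}_i}{x_i} = \sum_i \varepsilon_i \frac{\dot{x}_i}{x_i} \tag{30}$$

Then we get the measure of total (or multi) factor productivity growth as

$$\left(\frac{\dot{M}}{M}\right)_1 = \frac{\dot{y}}{y} - \left(\frac{\dot{X}}{X}\right)_1 = \frac{f_t}{f} + \frac{\dot{\tau}}{\tau}. \tag{31}$$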

Here $\frac{f_t}{f}$ is the rate of autonomous shift in the production function due to technical
change, and $\frac{\dot{\tau}}{\tau}$ is a measure of the change in technical efficiency over time.
While the measure of total factor productivity growth in (31) is theoretically
correct, the measure of the rate of change in total input in (30) is not quite satisfactory.
The reason is that unless CRS holds, the partial elasticities ($\varepsilon_i$) do not add up to
unity. This creates the paradox that if $\sum_i \varepsilon_i < 1$ (as is the case when diminishing
returns to scale hold at the observed input bundle), even when all inputs increase at
the same rate (say by 5%), the total input growth $\left(\frac{\dot{X}}{X}\right)_1$ will be less than 5%! One
way out of this paradox is to redefine the aggregate input growth rate as

$$\left(\frac{\dot{X}}{X}\right)_2 = \sum_i \frac{\varepsilon_i}{\varepsilon_0}\frac{\dot{x}_i}{x_i} = \sum_i \eta_i \frac{\dot{x}_i}{x_i}; \qquad \eta_i = \frac{\varepsilon_i}{\varepsilon_0}, \quad \varepsilon_0 = \sum_i \varepsilon_i \tag{32}$$

Note that ε0 is the scale elasticity that equals, falls short of, or exceeds unity
under constant, diminishing, and increasing returns to scale, respectively.
With this alternative measure of the growth rate in the aggregate input, the
paradox mentioned above does not arise. Note also that

$$\left(\frac{\dot{X}}{X}\right)_2 = \frac{1}{\varepsilon_0}\left(\frac{\dot{X}}{X}\right)_1. \tag{33}$$

Thus the total factor productivity growth rate based on this alternative measure
of growth rate of total input is

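$$\left(\frac{\dot{M}}{M}\right)_2 = \frac{\dot{y}}{y} - \left(\frac{\dot{X}}{X}\right)_2 = \left[\frac{\dot{y}}{y} - \left(\frac{\dot{X}}{X}\right)_1\right] + \left[\left(\frac{\dot{X}}{X}\right)_1 - \left(\frac{\dot{X}}{X}\right)_2\right]$$

$$= \left(\frac{\dot{M}}{M}\right)_1 + \left(1 - \frac{1}{\varepsilon_0}\right)\left(\frac{\dot{X}}{X}\right)_1 \tag{34}$$

$$= \frac{f_t}{f} + \frac{\dot{\tau}}{\tau} + \left(1 - \frac{1}{\varepsilon_0}\right)\left(\frac{\dot{X}}{X}\right)_1.$$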

In this decomposition of productivity growth, the last term on the right captures
the returns to scale effect. Under CRS, ε0 equals unity and the scale component
disappears. Under DRS, ε0 is less than unity and the scale effect is negative. The
opposite is true under IRS when ε0 exceeds unity.
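
The decomposition in (30) through (34) can be traced with a short numerical sketch. The function below and its inputs (elasticities, input growth rates, and the technical change and efficiency change rates) are hypothetical; the example reproduces the 5% paradox discussed above.

```python
def productivity_decomposition(elasticities, input_growth, tech_change, eff_change):
    """Return the pieces of (30)-(34) for given rates.

    elasticities: partial output elasticities eps_i
    input_growth: growth rates of the individual inputs
    tech_change:  f_t / f
    eff_change:   tau_dot / tau
    """
    eps0 = sum(elasticities)                                         # scale elasticity
    x1 = sum(e * g for e, g in zip(elasticities, input_growth))      # (30)
    x2 = x1 / eps0                                                   # (33)
    y_growth = x1 + tech_change + eff_change                         # (29)
    m1 = y_growth - x1                                               # (31)
    m2 = y_growth - x2                                               # (34), first line
    scale_effect = (1.0 - 1.0 / eps0) * x1                           # (34), last term
    return {"X1": x1, "X2": x2, "M1": m1, "M2": m2, "scale": scale_effect}

# Diminishing returns (eps0 = 0.8); every input grows by 5%.
result = productivity_decomposition([0.3, 0.5], [0.05, 0.05], 0.01, -0.002)
print(result)
```

Here the first input growth measure is 4% even though every input grows by 5%, while the rescaled measure is exactly 5%, and the scale effect accounts for the difference between the two productivity growth measures.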

Productivity Growth from Growth Accounting

By far, the most common approach to measuring growth of total input is to create
an average of growth rates of the individual inputs weighted by their respective cost
shares. This yields the measures
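$$\left(\frac{\dot{X}}{X}\right)_3 = \sum_i \frac{w_i x_i}{C}\frac{\dot{x}_i}{x_i}; \qquad C = \sum_i w_i x_i \tag{35}$$

and

$$\left(\frac{\dot{M}}{M}\right)_3 = \frac{\dot{y}}{y} - \left(\frac{\dot{X}}{X}\right)_3. \tag{36}$$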

As an aside, consider, at this point, the cost minimization problem of the firm:

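$$\min\; C = w'x : A(t) f(x, t)\,\tau = y. \tag{37}$$

Here $w \in R^n_{++}$ is a strictly positive vector of input prices. We know from the
first-order conditions for a minimum that for each input, $w_i = \lambda A(t) f_i \tau$ and also that
the Lagrange multiplier $\lambda = \frac{\partial C}{\partial y}$. Thus

$$\frac{w_i x_i}{C} = \frac{y}{C}\frac{w_i x_i}{y} = \frac{y}{C}\,\frac{\lambda A(t)(f_i x_i)\,\tau}{A(t) f(x, t)\,\tau} = \frac{y}{C}\frac{\partial C}{\partial y}\frac{f_i x_i}{f} \tag{38}$$

Thus,

$$\left(\frac{\dot{X}}{X}\right)_3 = \sum_i \frac{w_i x_i}{C}\frac{\dot{x}_i}{x_i} = \sum_i \frac{y}{C}\frac{\partial C}{\partial y}\frac{f_i x_i}{f}\frac{\dot{x}_i}{x_i}. \tag{39}$$

Further, $\frac{y}{C}\frac{\partial C}{\partial y} = \frac{\partial \ln C}{\partial \ln y} \equiv \varepsilon_{cy}$ is the flexibility of cost with respect to output.³
Thus,

$$\left(\frac{\dot{X}}{X}\right)_3 = \frac{y}{C}\frac{\partial C}{\partial y}\sum_i \frac{f_i x_i}{f}\frac{\dot{x}_i}{x_i} = \varepsilon_{cy}\left(\frac{\dot{X}}{X}\right)_1. \tag{40}$$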

Hence,

3 The inverse of εcy is a metric of overall scale economies.

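$$\left(\frac{\dot{M}}{M}\right)_3 = \frac{\dot{y}}{y} - \left(\frac{\dot{X}}{X}\right)_3 = \left[\frac{\dot{y}}{y} - \left(\frac{\dot{X}}{X}\right)_1\right] + \left[\left(\frac{\dot{X}}{X}\right)_1 - \left(\frac{\dot{X}}{X}\right)_3\right]$$

$$= \left(\frac{\dot{M}}{M}\right)_1 + \left(1 - \varepsilon_{cy}\right)\left(\frac{\dot{X}}{X}\right)_1 \tag{41}$$

$$= \frac{f_t}{f} + \frac{\dot{\tau}}{\tau} + \left(1 - \varepsilon_{cy}\right)\left(\frac{\dot{X}}{X}\right)_1.$$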

Again, the last term on the right in (41) measures the scale effect. Under CRS,
average cost remains constant so that εcy = 1 and there is no scale effect. Under
diminishing returns, the average cost is increasing, and εcy > 1 so that the scale
effect is negative. In the case of increasing returns, the scale effect is positive.
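
A minimal sketch of the growth accounting measure in (35), (36), (40), and (41) follows. The cost shares, growth rates, and the value of the cost flexibility are hypothetical inputs, and the function name is illustrative.

```python
def growth_accounting(cost_shares, input_growth, y_growth, eps_cy):
    """Cost-share-weighted input growth and the scale effect in (41)."""
    x3 = sum(s * g for s, g in zip(cost_shares, input_growth))   # (35)
    x1 = x3 / eps_cy                                             # from (40)
    m3 = y_growth - x3                                           # (36)
    m1 = y_growth - x1                                           # (31)
    scale_effect = (1.0 - eps_cy) * x1                           # last term of (41)
    return x3, x1, m3, m1, scale_effect

# Hypothetical firm under diminishing returns (eps_cy = 1.25 > 1).
x3, x1, m3, m1, scale = growth_accounting([0.4, 0.6], [0.02, 0.04], 0.03, 1.25)
print(m3, m1 + scale)   # the two expressions for the growth accounting measure agree
```

Because the cost flexibility exceeds unity in this example, the scale effect is negative and the growth accounting measure falls below the measure in (31).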

Measuring Productivity Growth from the Cost Function

We now show how the measure of productivity growth can be derived from a firm’s
dual cost function.
Denoting the vector of its input prices by w, the dual cost function can be written
as

$$C = C(w, y, t). \tag{42}$$

The right-hand side of (42) equals the right-hand side of (37) above by definition.
Differentiating (42) with respect to t, we get

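$$\frac{dC}{dt} = \sum_i \frac{\partial C}{\partial w_i}\frac{dw_i}{dt} + \frac{\partial C}{\partial y}\frac{dy}{dt} + \frac{\partial C}{\partial t}. \tag{43}$$

That is

$$\dot{C} = \sum_i \frac{\partial C}{\partial w_i}\dot{w}_i + \frac{\partial C}{\partial y}\dot{y} + C_t. \tag{44}$$

Also recall that the dual cost equals the cost of the least-cost input bundle and

$$C = \sum_i w_i x_i. \tag{45}$$

From (45) we get

$$\dot{C} = \sum_i w_i \dot{x}_i + \sum_i x_i \dot{w}_i. \tag{46}$$

Comparing (44) and (46) and recognizing that by Shephard’s lemma $\frac{\partial C}{\partial w_i} = x_i$
for each input i, we get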

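$$\frac{\partial C}{\partial y}\dot{y} + C_t = \sum_i w_i \dot{x}_i. \tag{47}$$

Hence,

$$\frac{\partial C}{\partial y}\frac{y}{C}\frac{\dot{y}}{y} + \frac{C_t}{C} = \sum_i \frac{w_i x_i}{C}\frac{\dot{x}_i}{x_i}. \tag{48}$$

That is

$$\varepsilon_{cy}\frac{\dot{y}}{y} + \frac{C_t}{C} = \sum_i \frac{w_i x_i}{C}\frac{\dot{x}_i}{x_i} = \left(\frac{\dot{X}}{X}\right)_3 = \varepsilon_{cy}\left(\frac{\dot{X}}{X}\right)_1. \tag{49}$$

Hence,

$$\left(\frac{\dot{M}}{M}\right)_1 = \frac{\dot{y}}{y} - \left(\frac{\dot{X}}{X}\right)_1 = -\frac{1}{\varepsilon_{cy}}\frac{C_t}{C}. \tag{50}$$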

Of course, under constant, diminishing, or increasing returns to scale, $\varepsilon_{cy}$ is equal
to, greater than, or less than unity, respectively. In this representation, technical
progress is captured by $-\frac{C_t}{C}$, the rate of autonomous downward shift of the cost
function.
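
To see (50) at work, the sketch below assumes a hypothetical one-input technology $y = e^{gt} x^a$ with full efficiency, whose dual cost function is $C = w\,(y / e^{gt})^{1/a}$, and evaluates the right-hand side of (50) by central finite differences. The parameter values and function names are illustrative assumptions, not taken from the text.

```python
import math

def cost(w, y, t, a=0.8, g=0.02):
    """Dual cost function of the hypothetical technology y = exp(g*t) * x**a."""
    return w * (y / math.exp(g * t)) ** (1.0 / a)

def tfp_growth_from_cost(w, y, t, h=1e-5):
    """Evaluate -(1/eps_cy) * (C_t / C), as in (50), by central differences."""
    c = cost(w, y, t)
    c_y = (cost(w, y + h, t) - cost(w, y - h, t)) / (2 * h)   # dC/dy
    c_t = (cost(w, y, t + h) - cost(w, y, t - h)) / (2 * h)   # dC/dt
    eps_cy = c_y * y / c                                      # cost flexibility
    return -(1.0 / eps_cy) * (c_t / c)

rate = tfp_growth_from_cost(w=2.0, y=10.0, t=1.0)
print(rate)   # should be close to g = 0.02, the rate of shift f_t / f
```

For this technology the cost flexibility is 1/a = 1.25 and the cost function shifts down at the rate g/a, so the measure recovers the primal rate of technical change g.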

Measuring Productivity Growth from the Profit Function⁴

To wrap up the discussion of the one-output case, we show how productivity growth
can be measured from a firm’s dual profit function:

$$\pi = \pi(p, w, t) = \max\{py - w'x : (x, y) \in T^t\} \iff \pi(p, w, t) = \max\{py - w'x : y \le f(x, t)\} \tag{51}$$

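In (51) above, p is the price of output, w the input price vector, and $T^t$ the
production possibility set at time t. As before, differentiating (51) with respect to
t, we get

$$\dot{\pi} = \frac{d\pi}{dt} = \frac{\partial \pi}{\partial p}\frac{dp}{dt} + \sum_i \frac{\partial \pi}{\partial w_i}\frac{dw_i}{dt} + \frac{\partial \pi}{\partial t}. \tag{52}$$

This time, by Hotelling’s lemma, $\frac{\partial \pi}{\partial p} = y$, and $\frac{\partial \pi}{\partial w_i} = -x_i$ so that (52) leads to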

4 This section extends Ray and Segerson [29] by explicitly including inefficiency.

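$$\dot{\pi} = \frac{d\pi}{dt} = y\,\dot{p} - \sum_i x_i \dot{w}_i + \pi_t. \tag{53}$$

Also,

$$\pi = py - \sum_i w_i x_i \tag{54}$$

where (x, y) is the profit maximizing input-output bundle.

Again, from (54)

$$\dot{\pi} = \frac{d\pi}{dt} = y\,\dot{p} + p\,\dot{y} - \sum_i x_i \dot{w}_i - \sum_i w_i \dot{x}_i. \tag{55}$$

Thus a comparison of (53) and (55) yields

$$\pi_t = p\,\dot{y} - \sum_i w_i \dot{x}_i. \tag{56}$$

Dividing both sides of (56) by py, we get

$$\frac{\pi_t}{py} = \frac{\dot{y}}{y} - \frac{1}{py}\sum_i w_i \dot{x}_i = \frac{\dot{y}}{y} - \frac{C}{py}\sum_i \frac{w_i x_i}{C}\frac{\dot{x}_i}{x_i} \tag{57}$$

For profit maximization, $p = \frac{\partial C}{\partial y}$ so that $\frac{C}{py} = \frac{C}{\frac{\partial C}{\partial y}\,y} = \frac{1}{\varepsilon_{cy}}$. Thus, (57) becomes

$$\frac{\pi_t}{py} = \frac{\dot{y}}{y} - \frac{1}{\varepsilon_{cy}}\sum_i \frac{w_i x_i}{C}\frac{\dot{x}_i}{x_i} = \frac{\dot{y}}{y} - \frac{1}{\varepsilon_{cy}}\left(\frac{\dot{X}}{X}\right)_3. \tag{58}$$

Hence, by virtue of (49) above

$$\frac{\pi_t}{py} = \frac{\dot{y}}{y} - \left(\frac{\dot{X}}{X}\right)_1 = \left(\frac{\dot{M}}{M}\right)_1. \tag{59}$$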

Productivity Measurement with Multiple Outputs⁵

We now consider a firm producing a vector of m outputs $y \in R^m_+$ from the input
vector $x \in R^n_+$. Let its production technology be represented by the Shephard
(output) distance function:

$$D^y(x, y, t) = \frac{1}{\phi}; \qquad \phi \ge 1 \tag{60}$$

Alternatively,

$$D^y(x, y, t)\,\phi = 1 \tag{61}$$

Differentiating (61) with respect to t,

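$$\frac{d\left[D^y(x, y, t)\,\phi\right]}{dt} = \phi\,\frac{dD^y(x, y, t)}{dt} + D^y(x, y, t)\frac{d\phi}{dt} = \phi\left(\sum_i \frac{\partial D^y}{\partial x_i}\frac{dx_i}{dt} + \sum_j \frac{\partial D^y}{\partial y_j}\frac{dy_j}{dt} + \frac{\partial D^y}{\partial t}\right) + D^y(x, y, t)\frac{d\phi}{dt} = 0 \tag{62}$$

Dividing (62) through by $D^y(x, y, t)\,\phi$, we get

$$\left(\sum_i \frac{\partial D^y}{\partial x_i}\frac{1}{D^y}\frac{dx_i}{dt} + \sum_j \frac{\partial D^y}{\partial y_j}\frac{1}{D^y}\frac{dy_j}{dt} + \frac{\partial D^y}{\partial t}\frac{1}{D^y}\right) + \frac{d\phi}{dt}\frac{1}{\phi} = 0 \tag{63}$$

Recall that the distance function is increasing in outputs and decreasing in
inputs, and define its partial output and input elasticities as $\delta_j = \frac{\partial \ln D^y}{\partial \ln y_j} = \frac{\partial D^y}{\partial y_j}\frac{y_j}{D^y}$ (j = 1, 2, . . . , m);
$\mu_i = -\frac{\partial \ln D^y}{\partial \ln x_i} = -\frac{\partial D^y}{\partial x_i}\frac{x_i}{D^y}$ (i = 1, 2, . . . , n).
Then the expression in (63) becomes

$$\left(\sum_j \delta_j \frac{\dot{y}_j}{y_j} - \sum_i \mu_i \frac{\dot{x}_i}{x_i}\right) + \frac{D^y_t}{D^y} + \frac{\dot{\phi}}{\phi} = 0 \tag{64}$$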

Define the growth rates of aggregate output and input derived from the function
as

5 This section is based on Lovell [23].


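$$\left(\frac{\dot{Y}}{Y}\right)^D = \sum_j \delta_j \frac{\dot{y}_j}{y_j} \tag{65}$$

$$\left(\frac{\dot{X}}{X}\right)^D_1 = \sum_i \mu_i \frac{\dot{x}_i}{x_i} \tag{66}$$

Note that the output distance function is increasing (non-decreasing) and homogeneous
of degree 1 in outputs. Thus $\sum_j D^y_j y_j = D^y(x, y, t)$, and therefore

$$\sum_j \frac{D^y_j y_j}{D^y} = \sum_j \delta_j = 1. \tag{67}$$

However, unless CRS holds, $\sum_i \mu_i \neq 1$. As a result the growth rate of aggregate
input is not homogeneous of degree 1 in the growth rates of individual inputs. To
solve this problem, we define the new weights $\omega_i = \frac{\mu_i}{\mu_0}$, $\mu_0 = \sum_i \mu_i$, and a different
measure of the growth rate of aggregate input

$$\left(\frac{\dot{X}}{X}\right)^D_2 = \sum_i \omega_i \frac{\dot{x}_i}{x_i} = \frac{1}{\mu_0}\sum_i \mu_i \frac{\dot{x}_i}{x_i} = \frac{1}{\mu_0}\left(\frac{\dot{X}}{X}\right)^D_1. \tag{68}$$

Using (65) and (66) in (64), we get

$$\left(\frac{\dot{M}}{M}\right)^D_1 \equiv \left(\frac{\dot{Y}}{Y}\right)^D - \left(\frac{\dot{X}}{X}\right)^D_1 = -\frac{D^y_t}{D^y} - \frac{\dot{\phi}}{\phi} \tag{69}$$

The alternative measure of total factor productivity growth is

$$\left(\frac{\dot{M}}{M}\right)^D_2 \equiv \left(\frac{\dot{Y}}{Y}\right)^D - \left(\frac{\dot{X}}{X}\right)^D_2 = \left[\left(\frac{\dot{Y}}{Y}\right)^D - \left(\frac{\dot{X}}{X}\right)^D_1\right] + \left[\left(\frac{\dot{X}}{X}\right)^D_1 - \left(\frac{\dot{X}}{X}\right)^D_2\right]$$

$$= \left(\frac{\dot{M}}{M}\right)^D_1 + (\mu_0 - 1)\left(\frac{\dot{X}}{X}\right)^D_2 \tag{70}$$

$$= (\mu_0 - 1)\left(\frac{\dot{X}}{X}\right)^D_2 - \frac{D^y_t}{D^y} - \frac{\dot{\phi}}{\phi}$$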

In (70), the first term represents the returns to scale effect and disappears under
constant returns when μ0 equals unity, the second term captures productivity growth
measured by the rate of autonomous shift in the distance function over time, and the
last term is the rate of change in technical efficiency over time.

Productivity Change in Discrete Time

A Hicks-Moorsteen Productivity Index

Consider a firm producing a single output y from a single input x. Let
$(x_t, y_t)$ (t = 0, 1) represent its input-output pairs in two successive time periods.
Then its productivity index in period 1 with period 0 treated as the base year is

$$\pi_{1,0} = \frac{AP_1}{AP_0} = \frac{y_1 / x_1}{y_0 / x_0}. \tag{71}$$

This can also be written as

$$\pi_{1,0} = \frac{AP_1}{AP_0} = \frac{y_1 / y_0}{x_1 / x_0}. \tag{72}$$

Define the output and input quantity ratios $Q_y = \frac{y_1}{y_0}$ and $Q_x = \frac{x_1}{x_0}$. Then

$$\pi_{1,0} = \frac{Q_y}{Q_x}. \tag{73}$$

If the proportionate increase in output is greater than the increase in input,
$\pi_{1,0} > 1$, and productivity is higher in period 1.
In almost all realistic cases, however, firms use multiple inputs and often produce
multiple outputs. Even in a simple one-output two-input case, there will be two
partial average productivities corresponding to the two inputs. Suppose that the
input vectors in the two periods are $x^t = (x_1^t, x_2^t)$, t = 0, 1. Now the partial
average productivities are $AP_1^t = \frac{y^t}{x_1^t}$ and $AP_2^t = \frac{y^t}{x_2^t}$ in the two different periods
t = 0 and 1. Consequently, measuring a productivity index as the ratio of average
productivities becomes problematic. In the case of multiple outputs and multiple
inputs, the problem is even more complicated.
One way to solve this problem is to create aggregate measures of inputs and
outputs and measure total factor productivity (rather than partial productivity for
individual inputs). Suppose that the output vectors produced by the firm in the two
periods are $y^t = (y_1^t, y_2^t)$, t = 0, 1. We use the aggregator functions $Y_t = a(y_1^t, y_2^t)$
and $X_t = b(x_1^t, x_2^t)$ to measure the total factor productivity index:

$$\pi_{1,0} = TFP_{1,0} = \frac{Y_1 / X_1}{Y_0 / X_0} = \frac{a(y_1^1, y_2^1) / a(y_1^0, y_2^0)}{b(x_1^1, x_2^1) / b(x_1^0, x_2^0)}. \tag{74}$$

Define the output and input quantity indexes as
$Q_y = \frac{a(y_1^1, y_2^1)}{a(y_1^0, y_2^0)}$ and $Q_x = \frac{b(x_1^1, x_2^1)}{b(x_1^0, x_2^0)}$. Then

$$TFP_{1,0} = \frac{Q_y}{Q_x}. \tag{75}$$

The only restrictions on the aggregator functions are that they must be non-
negative, increasing (non-decreasing) in each argument, and homogeneous of degree
1. That is, if the quantity of any input (output) increases, the aggregate input (output)
also increases (does not decrease), and if every input (output) increases by the same
proportion, so does the aggregate input (output).6 The total factor productivity index
defined in (75) is the ratio of the output and input quantity indexes. 7
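
The construction in (74) and (75) can be sketched as follows. The two aggregator functions used here (a geometric mean for outputs and an equally weighted sum for inputs) are hypothetical choices that satisfy the stated restrictions; the text does not prescribe any particular functional form.

```python
import math

def output_aggregator(y1, y2):
    """Hypothetical aggregator a(.): geometric mean, homogeneous of degree 1."""
    return math.sqrt(y1 * y2)

def input_aggregator(x1, x2):
    """Hypothetical aggregator b(.): equally weighted sum, homogeneous of degree 1."""
    return 0.5 * x1 + 0.5 * x2

def tfp_index(y0, y1, x0, x1):
    """Total factor productivity index (75): ratio of quantity indexes."""
    q_y = output_aggregator(*y1) / output_aggregator(*y0)
    q_x = input_aggregator(*x1) / input_aggregator(*x0)
    return q_y / q_x

# Outputs double while inputs rise by 50%, so the index is 2 / 1.5.
index = tfp_index(y0=(4, 9), y1=(8, 18), x0=(2, 4), x1=(3, 6))
print(index)
```

Because both aggregators are homogeneous of degree 1, scaling every output by 2 and every input by 1.5 yields quantity indexes of exactly 2 and 1.5.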

The Tornqvist Productivity Index

By far the most popular quantity index number is the Tornqvist index measured
by a weighted geometric mean of the relative quantities from the two periods.
Consider the output quantity index first. Suppose that m outputs are involved. The
output vectors produced in periods 0 and 1 are, respectively, $y^0 = (y_1^0, y_2^0, \ldots, y_m^0)$
and $y^1 = (y_1^1, y_2^1, \ldots, y_m^1)$. The corresponding output price vectors are $p^0 = (p_1^0, p_2^0, \ldots, p_m^0)$
and $p^1 = (p_1^1, p_2^1, \ldots, p_m^1)$, respectively. Then, the Tornqvist
output quantity index in period 1 with period 0 as the base is

$$TQ_y^t = \left(\frac{y_1^1}{y_1^0}\right)^{v_1^t}\left(\frac{y_2^1}{y_2^0}\right)^{v_2^t} \cdots \left(\frac{y_m^1}{y_m^0}\right)^{v_m^t}; \qquad \sum_{j=1}^{m} v_j^t = 1 \tag{76}$$

Here, $v_j^t = \frac{p_j^t y_j^t}{\sum_{k=1}^{m} p_k^t y_k^t}$ (t = 0, 1) is the share of output j in the total value of the
output bundle in period t. Of course, the value shares of the individual outputs are,
in general, different in the two periods. In practical applications, for $v_j^t$, one uses the
arithmetic mean of $v_j^0$ and $v_j^1$. This leads to

6 It is important to emphasize that although such arbitrary quantity aggregators yield a theoretically

valid productivity index, they may not provide any insight into the extent of technical change
between the two periods.
7 For a detailed discussion of index numbers, see Chap. 19, “Index Numbers and Productivity

Measurement” by Prasada Rao in this volume.


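$$TQ_y = \left(\frac{y_1^1}{y_1^0}\right)^{\frac{v_1^0 + v_1^1}{2}}\left(\frac{y_2^1}{y_2^0}\right)^{\frac{v_2^0 + v_2^1}{2}} \cdots \left(\frac{y_m^1}{y_m^0}\right)^{\frac{v_m^0 + v_m^1}{2}} = \sqrt{TQ_y^0 \cdot TQ_y^1}. \tag{77}$$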

It may be noted that in the single-output case, the Tornqvist output quantity index
trivially reduces to the ratio of output quantities in the numerator of (73). This is also
true when the quantity ratio remains unchanged across all outputs.
Similarly, let the input vectors in the two periods be $x^0 = (x_1^0, x_2^0, \ldots, x_n^0)$
and $x^1 = (x_1^1, x_2^1, \ldots, x_n^1)$. The corresponding input price vectors are $w^0 = (w_1^0, w_2^0, \ldots, w_n^0)$
and $w^1 = (w_1^1, w_2^1, \ldots, w_n^1)$. Then, the Tornqvist input quantity
index is

$$TQ_x^t = \left(\frac{x_1^1}{x_1^0}\right)^{s_1^t}\left(\frac{x_2^1}{x_2^0}\right)^{s_2^t} \cdots \left(\frac{x_n^1}{x_n^0}\right)^{s_n^t}; \qquad \sum_{i=1}^{n} s_i^t = 1; \quad t = 0, 1. \tag{78}$$

Here, $s_i^t = \frac{w_i^t x_i^t}{\sum_{k=1}^{n} w_k^t x_k^t}$ (t = 0, 1) is the share of input i in the total cost of the input bundle.
Again, in practice, one uses the average of the cost share of any input in the two
periods. Thus, the Tornqvist input quantity index is

$$TQ_x = \sqrt{TQ_x^0 \cdot TQ_x^1}. \tag{79}$$

The Tornqvist productivity index is the ratio of the Tornqvist output and input
quantity indexes. Thus,

$$\pi_{TQ} = \frac{TQ_y}{TQ_x}. \tag{80}$$

When TQy > TQx , output in period 1 has grown faster (or declined slower) than
input as a result of which productivity has increased in period 1 compared to what
it was in period 0.
It may be noted that the Tornqvist productivity index can be measured without
any knowledge of the underlying technology so long as data are available for the
input and output quantities as well as the shares of the individual inputs and outputs
in the total cost and total revenue, respectively.8
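
The Tornqvist calculation in (76) through (80) requires only quantities and prices. The following is a minimal sketch with hypothetical data; the function names are illustrative.

```python
import math

def value_shares(prices, quantities):
    """Shares of each item in the total value of the bundle."""
    values = [p * q for p, q in zip(prices, quantities)]
    total = sum(values)
    return [v / total for v in values]

def tornqvist_quantity_index(q0, q1, p0, p1):
    """Weighted geometric mean of quantity relatives with average shares, as in (77)."""
    s0 = value_shares(p0, q0)
    s1 = value_shares(p1, q1)
    log_index = sum(0.5 * (a + b) * math.log(new / old)
                    for a, b, old, new in zip(s0, s1, q0, q1))
    return math.exp(log_index)

def tornqvist_productivity_index(y0, y1, p0, p1, x0, x1, w0, w1):
    """Ratio of the output and input quantity indexes, as in (80)."""
    return (tornqvist_quantity_index(y0, y1, p0, p1)
            / tornqvist_quantity_index(x0, x1, w0, w1))

# Hypothetical data: every output grows by 20%, every input by 10%.
tq = tornqvist_productivity_index(
    y0=[10, 20], y1=[12, 24], p0=[1.0, 2.0], p1=[1.5, 2.0],
    x0=[5, 8], x1=[5.5, 8.8], w0=[3.0, 1.0], w1=[3.0, 2.0])
print(tq)
```

Since the quantity ratio is the same across all outputs (and across all inputs) in this example, each quantity index reduces to that common ratio regardless of the shares, which matches the remark following (77).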

8 However, unless one assumes that the cost or revenue shares correspond to cost-minimizing/profit-

maximizing behavior by the producer, one cannot extract technical change from the Tornqvist
productivity index.

The Fisher Productivity Index

An alternative to the Tornqvist index of productivity is the Fisher index where one
uses Fisher indexes of output and input quantity in the multi-factor productivity
index measure. It may be noted that the Fisher quantity (or price) index is itself the
geometric mean of the relevant Laspeyres and Paasche indexes.
The Laspeyres output quantity index is the value ratio of the two output vectors
at base period prices and is measured as
$$LQ_y = \frac{\sum_{j=1}^{m} p_j^0 y_j^1}{\sum_{j=1}^{m} p_j^0 y_j^0}. \tag{81}$$

It is easy to see that $LQ_y = \sum_{j=1}^{m} \lambda_j^0 \frac{y_j^1}{y_j^0}$, where $\lambda_j^0 = \frac{p_j^0 y_j^0}{\sum_{k=1}^{m} p_k^0 y_k^0}$ is the same as $v_j^0$ defined above.
Thus, while the Tornqvist quantity index is a weighted geometric mean of
the quantity relatives, the corresponding Laspeyres index is a similarly weighted
arithmetic mean.
The Paasche output quantity index, for which we evaluate the current and base
period output bundles at current period prices, is measured as

$$PQ_y = \frac{\sum_{j=1}^{m} p_j^1 y_j^1}{\sum_{j=1}^{m} p_j^1 y_j^0}. \tag{82}$$

Thus, $PQ_y = \sum_{j=1}^{m} \mu_j^1 \frac{y_j^1}{y_j^0}$ where $\mu_j^1 = \frac{p_j^1 y_j^0}{\sum_{k=1}^{m} p_k^1 y_k^0}$.
The Fisher output quantity index is the geometric mean of the Laspeyres and
Paasche output quantity indexes.⁹ Hence,

$$FQ_y = \sqrt{LQ_y \cdot PQ_y}. \tag{83}$$

In an analogous manner, the Laspeyres, Paasche, and Fisher input quantity

indexes are obtained as

9 The geometric mean is a merged relative score that satisfies a number of important postulates.
The interested reader should refer to Aczel [1].

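$$LQ_x = \frac{\sum_{j=1}^{n} w_j^0 x_j^1}{\sum_{j=1}^{n} w_j^0 x_j^0}, \tag{84}$$

$$PQ_x = \frac{\sum_{j=1}^{n} w_j^1 x_j^1}{\sum_{j=1}^{n} w_j^1 x_j^0}, \tag{85}$$

and

$$FQ_x = \sqrt{LQ_x \cdot PQ_x}, \tag{86}$$

respectively.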
The resulting Fisher productivity index is

$$\pi_F = \frac{FQ_y}{FQ_x}. \tag{87}$$

It may be noted that the Tornqvist and Fisher indexes are derived from the
geometric and arithmetic means of ratios of the output and input quantities. In
practical applications their numerical values are generally quite close.
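
The Laspeyres, Paasche, and Fisher quantity indexes in (81) through (87) can be sketched in a few lines. The data below are hypothetical, and the function names are illustrative.

```python
import math

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(u * v for u, v in zip(a, b))

def laspeyres_quantity(p0, q0, q1):
    """Value ratio of the two bundles at base period prices, as in (81)."""
    return dot(p0, q1) / dot(p0, q0)

def paasche_quantity(p1, q0, q1):
    """Value ratio of the two bundles at current period prices, as in (82)."""
    return dot(p1, q1) / dot(p1, q0)

def fisher_quantity(p0, p1, q0, q1):
    """Geometric mean of the Laspeyres and Paasche indexes, as in (83)."""
    return math.sqrt(laspeyres_quantity(p0, q0, q1) * paasche_quantity(p1, q0, q1))

def fisher_productivity(p0, p1, y0, y1, w0, w1, x0, x1):
    """Fisher productivity index (87)."""
    return fisher_quantity(p0, p1, y0, y1) / fisher_quantity(w0, w1, x0, x1)

# Hypothetical two-output, two-input data.
pi_f = fisher_productivity(p0=[1, 2], p1=[2, 1], y0=[10, 10], y1=[20, 10],
                           w0=[1, 1], w1=[1, 2], x0=[4, 6], x1=[5, 6])
print(pi_f)
```

In this example the Laspeyres and Paasche output indexes are 40/30 and 50/30, so the Fisher output index lies between them, as a geometric mean must.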

Profitability, Terms of Trade, and Productivity Indexes¹⁰

Consider again the input and output quantity and price vectors from the two periods
$(x^t, y^t)$ and $(w^t, p^t)$ for t = 0, 1. Then, in any period, one can define profitability
measures:

$$\mathrm{Prof}_t = \frac{p^{t\prime} y^t}{w^{t\prime} x^t} \quad (t = 0, 1). \tag{88}$$

Following O’Donnell [26], one may construct a profitability index in period 1
with period 0 as the base as

$$\mathrm{Prof}_{1,0} = \frac{\mathrm{Prof}_1}{\mathrm{Prof}_0} = \frac{p^{1\prime} y^1 / w^{1\prime} x^1}{p^{0\prime} y^0 / w^{0\prime} x^0}. \tag{89}$$

The right-hand side of (89) can be alternatively expressed as

10 This section is based in part on O’Donnell [26].


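$$\mathrm{Prof}_{1,0} = \frac{\mathrm{Prof}_1}{\mathrm{Prof}_0} = \frac{p^{1\prime} y^1 / p^{0\prime} y^0}{w^{1\prime} x^1 / w^{0\prime} x^0} = \frac{\dfrac{p^{1\prime} y^0}{p^{0\prime} y^0} \cdot \dfrac{p^{1\prime} y^1}{p^{1\prime} y^0}}{\dfrac{w^{1\prime} x^0}{w^{0\prime} x^0} \cdot \dfrac{w^{1\prime} x^1}{w^{1\prime} x^0}} \tag{90a}$$

$$\mathrm{Prof}_{1,0} = \frac{\mathrm{Prof}_1}{\mathrm{Prof}_0} = \frac{p^{1\prime} y^1 / p^{0\prime} y^0}{w^{1\prime} x^1 / w^{0\prime} x^0} = \frac{\dfrac{p^{1\prime} y^1}{p^{0\prime} y^1} \cdot \dfrac{p^{0\prime} y^1}{p^{0\prime} y^0}}{\dfrac{w^{1\prime} x^1}{w^{0\prime} x^1} \cdot \dfrac{w^{0\prime} x^1}{w^{0\prime} x^0}} \tag{90b}$$

A geometric mean of (90a) and (90b) yields

$$\mathrm{Prof}_{1,0} = \sqrt{\frac{\dfrac{p^{1\prime} y^0}{p^{0\prime} y^0} \cdot \dfrac{p^{1\prime} y^1}{p^{0\prime} y^1}}{\dfrac{w^{1\prime} x^0}{w^{0\prime} x^0} \cdot \dfrac{w^{1\prime} x^1}{w^{0\prime} x^1}}} \cdot \sqrt{\frac{\dfrac{p^{1\prime} y^1}{p^{1\prime} y^0} \cdot \dfrac{p^{0\prime} y^1}{p^{0\prime} y^0}}{\dfrac{w^{1\prime} x^1}{w^{1\prime} x^0} \cdot \dfrac{w^{0\prime} x^1}{w^{0\prime} x^0}}} = \frac{P(p^1, p^0)}{W(w^1, w^0)} \cdot \frac{FQ_y}{FQ_x} = TT_{1,0}\,\pi_F \tag{91}$$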

In (91) above, $P(p^1, p^0) = \sqrt{\frac{p^{1\prime} y^0}{p^{0\prime} y^0} \cdot \frac{p^{1\prime} y^1}{p^{0\prime} y^1}}$ is a Fisher index of output prices
and $W(w^1, w^0) = \sqrt{\frac{w^{1\prime} x^0}{w^{0\prime} x^0} \cdot \frac{w^{1\prime} x^1}{w^{0\prime} x^1}}$ is a Fisher index of input prices.
Thus, the two components of the profitability index are the Fisher productivity
index $\pi_F = \frac{FQ_y}{FQ_x}$ and a terms of trade factor $TT_{1,0} = \frac{P(p^1, p^0)}{W(w^1, w^0)}$. When output
prices increase faster (slower) than input prices, the terms of trade factor is greater
(less) than 1, and profitability index is greater (less) than the productivity index.
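
The identity that the profitability index equals the terms of trade factor times the Fisher productivity index can be verified numerically. The sketch below uses standard Fisher price and quantity index formulas (geometric means of the Laspeyres and Paasche forms); the data and function names are hypothetical.

```python
import math

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(u * v for u, v in zip(a, b))

def fisher_quantity(p0, p1, q0, q1):
    """Fisher quantity index: sqrt(Laspeyres * Paasche)."""
    return math.sqrt((dot(p0, q1) / dot(p0, q0)) * (dot(p1, q1) / dot(p1, q0)))

def fisher_price(p0, p1, q0, q1):
    """Fisher price index: sqrt(Laspeyres * Paasche)."""
    return math.sqrt((dot(p1, q0) / dot(p0, q0)) * (dot(p1, q1) / dot(p0, q1)))

def profitability_decomposition(p0, p1, y0, y1, w0, w1, x0, x1):
    """Return (profitability index, terms of trade factor, Fisher productivity index)."""
    prof_index = (dot(p1, y1) / dot(w1, x1)) / (dot(p0, y0) / dot(w0, x0))
    terms_of_trade = fisher_price(p0, p1, y0, y1) / fisher_price(w0, w1, x0, x1)
    productivity = fisher_quantity(p0, p1, y0, y1) / fisher_quantity(w0, w1, x0, x1)
    return prof_index, terms_of_trade, productivity

prof, tt, pi_f = profitability_decomposition(
    p0=[1.0, 2.0], p1=[1.2, 2.5], y0=[10, 10], y1=[12, 11],
    w0=[1.0, 3.0], w1=[1.1, 3.0], x0=[4, 6], x1=[5, 6])
print(prof, tt * pi_f)   # the two numbers coincide
```

The check works because the product of a Fisher price index and the matching Fisher quantity index equals the ratio of total values between the two periods.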
In this chapter the principal focus is on explaining the rate of growth in
productivity measured by the rate of growth in the output beyond the rate of
growth in inputs in terms of technical change, change in technical efficiency, and
returns to scale effects of input change. This is in the tradition of Schultz [31] and
Jorgenson and Griliches [22]. O’Donnell [26] argues, however, that changes in the
so-called terms of trade factor (TT1, 0 ) affect productivity under competitive profit-
maximizing behavior of a firm.
In a single-output single-input case, the profit maximization problem is

$$\max\; py - wx \quad \text{s.t. } y \le f(x). \tag{92}$$

From the first-order conditions for a maximum, one gets (i) $f'(x) = \frac{w}{p}$ and (ii)
$y = f(x)$. This leads to the input demand $x^* = (f')^{-1}\left(\frac{w}{p}\right) = x\left(\frac{w}{p}\right)$ and the output supply
function $y^* = f\left(x\left(\frac{w}{p}\right)\right) = y\left(\frac{w}{p}\right)$. When $\frac{w}{p}$ goes up (i.e., the TT goes down),
$x\left(\frac{w}{p}\right)$ goes down, and hence the output supply $y\left(\frac{w}{p}\right)$ also goes down. Finally, if
the production function is concave, output declines less than proportionately with
the input. As a result, average productivity goes up. Hence, there would be an inverse
relationship between changes in the terms of trade and productivity change.
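
A worked example may help. The sketch below assumes the hypothetical concave production function $f(x) = 2\sqrt{x}$, for which the first-order condition $f'(x) = 1/\sqrt{x} = w/p$ can be solved in closed form; the function name is illustrative.

```python
def optimal_plan(w_over_p):
    """Profit-maximizing plan for the hypothetical frontier f(x) = 2*sqrt(x).

    Returns (input demand, output supply, average productivity).
    """
    x_star = (1.0 / w_over_p) ** 2      # from f'(x) = 1/sqrt(x) = w/p
    y_star = 2.0 / w_over_p             # y* = f(x*) = 2*sqrt(x*)
    return x_star, y_star, y_star / x_star

# The input price rises relative to the output price (terms of trade worsen).
before = optimal_plan(1.0)
after = optimal_plan(2.0)
print(before, after)
```

When w/p doubles, input demand falls to a quarter and output supply falls by half, so average productivity doubles, which is the inverse relationship described above.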

Malmquist Productivity Index

Both the Tornqvist and Fisher productivity indexes are essentially descriptive
measures based on the observed input and output data. By contrast, the Malmquist
productivity index introduced by Caves, Christensen, and Diewert (CCD) [6] is a
normative index and is based on a reference technology.
We first consider the one-output one-input case and assume that the input-output
quantities in the two periods are $(x_0, y_0)$ and $(x_1, y_1)$. Further, assume that the
production function in both periods is $y^* = f(x)$. We can then write the productivity
index as

$$\pi_{1,0} = \frac{y_1 / x_1}{y_0 / x_0} = \frac{\dfrac{y_1}{f(x_1)} \cdot \dfrac{f(x_1)}{x_1}}{\dfrac{y_0}{f(x_0)} \cdot \dfrac{f(x_0)}{x_0}}. \tag{93}$$

Alternatively, we could invert the production function to get the input requirement
function $x^* = f^{-1}(y) \equiv g(y)$ and write the productivity index as

$$\pi_{1,0} = \frac{y_1 / x_1}{y_0 / x_0} = \frac{\dfrac{y_1}{g(y_1)} \cdot \dfrac{g(y_1)}{x_1}}{\dfrac{y_0}{g(y_0)} \cdot \dfrac{g(y_0)}{x_0}} = \frac{\dfrac{y_1}{g(y_1)} \cdot \dfrac{x_1^*}{x_1}}{\dfrac{y_0}{g(y_0)} \cdot \dfrac{x_0^*}{x_0}} = \frac{\dfrac{y_1}{g(y_1)} \cdot \tau_x(x_1, y_1)}{\dfrac{y_0}{g(y_0)} \cdot \tau_x(x_0, y_0)}. \tag{94}$$

CCD proposed two different measures of the productivity index defined by ratios
of distance functions – one input oriented and the other output oriented. Their input-
oriented Malmquist productivity index for (x1 , y1 ) relative to (x0 , y0 ) is

$$MPI^x(x_1, y_1; x_0, y_0) = \frac{D^x(x_0, y_0)}{D^x(x_1, y_1)}. \tag{95}$$

On the other hand, their output-oriented Malmquist productivity index for the
same input-output pairs is

$$MPI^y(x_1, y_1; x_0, y_0) = \frac{D^y(x_1, y_1)}{D^y(x_0, y_0)}. \tag{96}$$

CCD assume $D^x(x_0, y_0)$ equals unity in (95). As a result, $MPI^x(x_1, y_1; x_0, y_0) = \left(D^x(x_1, y_1)\right)^{-1}$.
They explicitly recognize, however, that the input- and output-
oriented Malmquist productivity indexes “will differ from each other by a factor
that reflects the returns to scale of the production structure” ([6], p. 1402).
Figure 1 shows, for the one-output one-input case, how the Malmquist productivity
index in (95) differs from the intuitive measure of productivity change.
Points A and B in this diagram denote the input-output bundles $(x_0, y_0)$ and $(x_1, y_1)$,
respectively. Because $D^x(x_0, y_0)$ equals unity, A is located on the production frontier
shown by the curve f(x). The input-oriented efficient projection of point B is the point
$C(x_1^*, y_1)$ and $D^x(x_1, y_1) = \frac{Ox_1}{Ox_1^*} = \frac{By_1}{Cy_1}$. Further, dividing both the numerator and
the denominator of $D^x(x_1, y_1)$ by $Oy_1$, we obtain

Fig. 1 CCD input-oriented Malmquist productivity index (VRS)

$$D^x(x_1, y_1) = \frac{By_1 / Oy_1}{Cy_1 / Oy_1} = \frac{Oy_1 / Cy_1}{Oy_1 / By_1} = \frac{AP(C)}{AP(B)}. \tag{97}$$

Hence,

$$MPI^x(x_1, y_1; x_0, y_0) = \frac{D^x(x_0, y_0)}{D^x(x_1, y_1)} = \frac{AP(B)}{AP(C)}. \tag{98}$$

Thus,

$$MPI^x(x_1, y_1; x_0, y_0) = \frac{AP(B)}{AP(C)} = \frac{AP(B)}{AP(A)} \cdot \frac{AP(A)}{AP(C)}. \tag{99}$$

In the CCD input-oriented Malmquist productivity index shown in (99), the first
factor is the productivity index, while the second factor is the returns to scale effect.
Fig. 2 CCD input-oriented Malmquist productivity index (CRS)

We can contrast this with the case where the production function exhibits CRS.
We would then get $\frac{f(x_1)}{x_1} = \frac{f(x_0)}{x_0}$ and also $\frac{g(y_1)}{y_1} = \frac{g(y_0)}{y_0}$, and (94) above would
reduce to

$$\pi_{1,0} = \frac{y_1 / x_1}{y_0 / x_0} = \frac{\dfrac{y_1}{g(y_1)} \cdot \dfrac{g(y_1)}{x_1}}{\dfrac{y_0}{g(y_0)} \cdot \dfrac{g(y_0)}{x_0}} = \frac{x_1^* / x_1}{x_0^* / x_0} = \frac{\tau_x(x_1, y_1)}{\tau_x(x_0, y_0)} = \frac{D_C^x(x_0, y_0)}{D_C^x(x_1, y_1)}. \tag{100}$$

This is shown in Fig. 2, where the production function f(x) is a ray through the
origin. This time, AP(A) and AP(C) are equal due to CRS. Hence the ratio of the
distance functions correctly measures the productivity index.
$$\pi_{1,0} = \frac{y_1 / x_1}{y_0 / x_0} = \frac{\tau_y^C(x_1, y_1)}{\tau_y^C(x_0, y_0)} = \frac{\tau_x^C(x_1, y_1)}{\tau_x^C(x_0, y_0)}. \tag{101}$$

That is, productivity change between period 0 and period 1 is simply the change
in technical efficiency relative to a CRS production function. Unless CRS holds,
however, the ratio of technical efficiencies (input and output oriented) will differ
from the productivity index11 .
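
The divergence under VRS can be seen with a small sketch that uses the production function $f(x) = 2\sqrt{x}$ from the footnote example. The two observed bundles are hypothetical, as are the function names.

```python
import math

def frontier(x):
    """VRS production function f(x) = 2*sqrt(x)."""
    return 2.0 * math.sqrt(x)

def output_efficiency(x, y):
    """Output-oriented technical efficiency: observed over frontier output."""
    return y / frontier(x)

def compare(x0, y0, x1, y1):
    """Return (ratio of technical efficiencies, ratio of average productivities)."""
    efficiency_ratio = output_efficiency(x1, y1) / output_efficiency(x0, y0)
    productivity_index = (y1 / x1) / (y0 / x0)
    return efficiency_ratio, productivity_index

# Hypothetical bundles: (x0, y0) = (4, 3) and (x1, y1) = (9, 6).
eff_ratio, prod_index = compare(4.0, 3.0, 9.0, 6.0)
print(eff_ratio, prod_index)
```

In this example technical efficiency rises from 0.75 to 1, yet average productivity falls because the firm has moved to a larger scale on a frontier with diminishing returns.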

11 For a simple example, consider the production function $f(x) = 2\sqrt{x}$ and its inverse $f^{-1}(y) = g(y) = \frac{y^2}{4}$.
This time, $\tau_y(x_k, y_k) = \frac{y_k}{2\sqrt{x_k}}$ (k = 0, 1). Clearly, $\frac{\tau_y(x_1, y_1)}{\tau_y(x_0, y_0)} = \frac{y_1 / (2\sqrt{x_1})}{y_0 / (2\sqrt{x_0})} \neq \frac{y_1 / x_1}{y_0 / x_0}$.

Fig. 3 Geometry of the output-oriented Malmquist productivity index

Malmquist Productivity Index with Multiple Outputs and Inputs

Although the one-input one-output example of the Malmquist productivity index
easily generalizes to the m-output n-input technology, the input- and output-oriented
Malmquist productivity indexes in (95) and (96) correctly measure productivity
change only when the distance functions are measured relative to a CRS frontier.
Thus,

$$\pi_{1,0} = MPI_C(x^1, y^1; x^0, y^0) = \frac{D_C^y(x^1, y^1)}{D_C^y(x^0, y^0)} = \frac{D_C^x(x^0, y^0)}{D_C^x(x^1, y^1)} \tag{102}$$

In reality, of course, CRS may not hold in many situations. It is important to

recognize, however, that the Malmquist productivity index should be measured by
the ratio of the CRS distance functions even when the technology does not exhibit
CRS globally. Below, in the discussion of a multiplicative decomposition of the
Malmquist productivity index, we identify the specific contribution of the returns to
scale factor to overall productivity change.

Allowing Technological Change

We now consider technological change over time. In that case, there will be two
different production possibility sets in the two different time periods, and we will
get two different measures of the distance function (or technical efficiency) for
the same input-output bundle. Let $D_t^y(x, y)$ be the VRS output distance function
evaluated at the input-output bundle (x, y) relative to the technology from period
t (= 0, 1). Further, $D_{Ct}^y(x, y)$ is the output distance function when CRS is assumed. We will
now have two alternative measures of the Malmquist index depending on which of
the two technologies (t = 0 and t = 1) is used as the benchmark.
Färe, Grosskopf, Lindgren, and Roos (FGLR), [17] provided a decomposition of
the Malmquist index into two distinct components representing change in technical
efficiency and technical change measured by the shift in the frontier.
Consider first the MPI relative to the technology from period 0:

$$MPI_{C0}(x^1, y^1; x^0, y^0) = \frac{D_{C0}^y(x^1, y^1)}{D_{C0}^y(x^0, y^0)} = \frac{D_{C1}^y(x^1, y^1)}{D_{C0}^y(x^0, y^0)} \cdot \frac{D_{C0}^y(x^1, y^1)}{D_{C1}^y(x^1, y^1)}. \tag{103}$$

Similarly, with period 1 technology as the benchmark,

$$MPI_{C1}(x^1, y^1; x^0, y^0) = \frac{D_{C1}^y(x^1, y^1)}{D_{C1}^y(x^0, y^0)} = \frac{D_{C1}^y(x^1, y^1)}{D_{C0}^y(x^0, y^0)} \cdot \frac{D_{C0}^y(x^0, y^0)}{D_{C1}^y(x^0, y^0)}. \tag{104}$$

Taking the geometric mean of (103) and (104), one gets

$$MPI_C(x^1, y^1; x^0, y^0) = \frac{D_{C1}^y(x^1, y^1)}{D_{C0}^y(x^0, y^0)} \cdot \sqrt{\frac{D_{C0}^y(x^0, y^0)}{D_{C1}^y(x^0, y^0)} \cdot \frac{D_{C0}^y(x^1, y^1)}{D_{C1}^y(x^1, y^1)}}. \tag{105}$$

The first factor on the right-hand side

$$\frac{D_{C1}^y(x^1, y^1)}{D_{C0}^y(x^0, y^0)} = TEC \tag{106}$$

represents technical efficiency change (TEC) between the two periods. A value
of TEC greater (less) than 1 indicates an increase (decline) in technical efficiency
in period 1 compared to period 0. Each ratio inside the second factor represents
the shift in the production frontier measured at the input-output bundle in the two
different periods. The geometric mean of the two is the second factor

$$\sqrt{\frac{D_{C0}^y(x^0, y^0)}{D_{C1}^y(x^0, y^0)} \cdot \frac{D_{C0}^y(x^1, y^1)}{D_{C1}^y(x^1, y^1)}} = TC \tag{107}$$

representing technical change (TC). A value of TC greater than 1 implies an outward

shift of the technology or technical progress.
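
For the one-input one-output case, the FGLR decomposition in (105) through (107) can be sketched directly. The example assumes hypothetical CRS frontiers $y = a_t x$, for which the output distance function is $y / (a_t x)$; the slopes and bundles are illustrative values.

```python
import math

def crs_output_distance(x, y, a):
    """Output distance function relative to the CRS frontier y = a * x."""
    return y / (a * x)

def fglr_decomposition(x0, y0, x1, y1, a0, a1):
    """Return (TEC, TC, MPI) as in (106), (107), and (105)."""
    d00 = crs_output_distance(x0, y0, a0)   # period-0 bundle, period-0 frontier
    d11 = crs_output_distance(x1, y1, a1)   # period-1 bundle, period-1 frontier
    d01 = crs_output_distance(x1, y1, a0)   # period-1 bundle, period-0 frontier
    d10 = crs_output_distance(x0, y0, a1)   # period-0 bundle, period-1 frontier
    tec = d11 / d00
    tc = math.sqrt((d00 / d10) * (d01 / d11))
    return tec, tc, tec * tc

# Hypothetical frontier slopes a0 = 2 and a1 = 2.5; bundles (4, 6) and (5, 10).
tec, tc, mpi = fglr_decomposition(4.0, 6.0, 5.0, 10.0, 2.0, 2.5)
print(tec, tc, mpi)
```

With a CRS frontier the product of the two components equals the ratio of average productivities, (10/5)/(6/4), and the technical change component equals the ratio of the frontier slopes.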

Allowing Returns to Scale Effect

In a subsequent paper, Färe, Grosskopf, Norris, and Zhang (FGNZ) [18] relaxed the
CRS assumption and offered the following decomposition:

$$MPI_C(x^1, y^1; x^0, y^0) = \frac{D_1^y(x^1, y^1)}{D_0^y(x^0, y^0)} \cdot \sqrt{\frac{D_{C0}^y(x^0, y^0)}{D_{C1}^y(x^0, y^0)} \cdot \frac{D_{C0}^y(x^1, y^1)}{D_{C1}^y(x^1, y^1)}} \cdot \frac{D_{C1}^y(x^1, y^1) / D_1^y(x^1, y^1)}{D_{C0}^y(x^0, y^0) / D_0^y(x^0, y^0)} \tag{108}$$

As in FGLR [17], the first two terms are interpreted as technical efficiency change
(TEC) and technical change (TC). The last term in the right-hand side of (108)

$$\frac{D_{C1}^y(x^1, y^1) / D_1^y(x^1, y^1)}{D_{C0}^y(x^0, y^0) / D_0^y(x^0, y^0)} = SEC \tag{109}$$

is the ratio of scale efficiencies in period 1 and period 0 and is described as scale
efficiency change (SEC).
Ray and Desli (RD) [30] argued that assuming VRS to measure technical effi-
ciency change and CRS to measure technical change within the same decomposition
is not internally consistent and proposed the following decomposition:

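\[
MPI_{C}(x^{1},y^{1};x^{0},y^{0})
=\frac{D^{y}_{1}(x^{1},y^{1})}{D^{y}_{0}(x^{0},y^{0})}
\cdot\left[\frac{D^{y}_{0}(x^{0},y^{0})}{D^{y}_{1}(x^{0},y^{0})}\cdot\frac{D^{y}_{0}(x^{1},y^{1})}{D^{y}_{1}(x^{1},y^{1})}\right]^{1/2}
\cdot\left[\frac{D^{y}_{C0}(x^{1},y^{1})/D^{y}_{0}(x^{1},y^{1})}{D^{y}_{C0}(x^{0},y^{0})/D^{y}_{0}(x^{0},y^{0})}\cdot\frac{D^{y}_{C1}(x^{1},y^{1})/D^{y}_{1}(x^{1},y^{1})}{D^{y}_{C1}(x^{0},y^{0})/D^{y}_{1}(x^{0},y^{0})}\right]^{1/2} \tag{110}
\]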

The first factor in (110) measures technical efficiency change (TEC) exactly the
same way as in (108), but the second factor measures technical change (TC) by
the shift in the VRS frontier and is consistent with the first factor where technical
efficiency change is also measured with reference to VRS frontiers. RD call the last
factor on the right-hand side of (110)
850 S. C. Ray

\[
\left[\frac{D^{y}_{C0}(x^{1},y^{1})/D^{y}_{0}(x^{1},y^{1})}{D^{y}_{C0}(x^{0},y^{0})/D^{y}_{0}(x^{0},y^{0})}\cdot\frac{D^{y}_{C1}(x^{1},y^{1})/D^{y}_{1}(x^{1},y^{1})}{D^{y}_{C1}(x^{0},y^{0})/D^{y}_{1}(x^{0},y^{0})}\right]^{1/2}=SCF \tag{111}
\]

a scale change factor (SCF). Unlike SEC in (109), it has no clear intuitive
interpretation and is more like a residual incorporating both scale efficiency change
and what is described by some authors as “scale bias of technical change.”
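
The FGNZ and RD decompositions can be compared side by side on the same data. The sketch below assumes hypothetical CRS and VRS output distance function values, keyed by (technology period, bundle period); all numbers and helper names are illustrative only:

```python
from math import sqrt

# Hypothetical output distance function values: key = (technology period, bundle period).
crs = {(0, 0): 0.80, (0, 1): 1.10, (1, 0): 0.64, (1, 1): 0.88}  # CRS frontiers
vrs = {(0, 0): 0.90, (0, 1): 1.20, (1, 0): 0.75, (1, 1): 0.96}  # VRS frontiers

def frontier_shift(d):
    """Geometric mean of the frontier shift measured at the two bundles."""
    return sqrt((d[0, 0] / d[1, 0]) * (d[0, 1] / d[1, 1]))

def scale_eff(s, t):
    """Scale efficiency of the period-t bundle relative to the period-s technology."""
    return crs[s, t] / vrs[s, t]

# CRS Malmquist index (geometric mean of the period-0 and period-1 based indexes).
mpi_c = sqrt((crs[0, 1] / crs[0, 0]) * (crs[1, 1] / crs[1, 0]))

# Both decompositions share the same VRS-based technical efficiency change.
tec = vrs[1, 1] / vrs[0, 0]

# FGNZ: technical change from the CRS frontier, scale term as in (109).
fgnz_tc = frontier_shift(crs)
fgnz_sec = scale_eff(1, 1) / scale_eff(0, 0)

# RD: technical change from the VRS frontier, scale change factor as in (111).
rd_tc = frontier_shift(vrs)
rd_scf = sqrt((scale_eff(0, 1) / scale_eff(0, 0)) * (scale_eff(1, 1) / scale_eff(1, 0)))

print("MPI (CRS):", round(mpi_c, 4))
print("FGNZ:", round(tec, 4), round(fgnz_tc, 4), round(fgnz_sec, 4))
print("RD:  ", round(tec, 4), round(rd_tc, 4), round(rd_scf, 4))
```

Both sets of factors multiply to the same CRS index; the two approaches differ only in how the non-TEC part is split between technical change and the scale term.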
The Malmquist productivity index and its three-factor decomposition are shown in Fig. 3. The points A $(x_0, y_0)$ and B $(x_1, y_1)$ show the input-output bundles for the same firm in period 0 and period 1. The curves $y^{*} = f^{0}(x)$ and $y^{*} = f^{1}(x)$ show the production functions in periods 0 and 1. The productivity index is

\[
\pi_{1,0}=\frac{y_{1}/x_{1}}{y_{0}/x_{0}}=\frac{AP(B)}{AP(A)}. \tag{112}
\]

This can be expressed as

\[
\pi_{1,0}=\frac{AP(B)}{AP(A)}=\frac{AP(C)}{AP(A)}\cdot\frac{AP(D)}{AP(C)}\cdot\frac{AP(E)}{AP(D)}\cdot\frac{AP(B)}{AP(E)}
=\frac{AP(B)/AP(E)}{AP(A)/AP(C)}\cdot\frac{AP(D)}{AP(C)}\cdot\frac{AP(E)}{AP(D)}. \tag{113}
\]

In (113) the first factor on the right, $\frac{AP(B)/AP(E)}{AP(A)/AP(C)} = \frac{y_1/f^{1}(x_1)}{y_0/f^{0}(x_0)} = \frac{\tau_y^{1}(x_1,y_1)}{\tau_y^{0}(x_0,y_0)} = TEC$, measures technical efficiency change; the second factor, $\frac{AP(D)}{AP(C)} = \frac{f^{1}(x_0)}{f^{0}(x_0)} = TC(x_0)$, is a measure of the autonomous shift in the production function due to technical change measured at $x_0$; and the last factor, $\frac{AP(E)}{AP(D)} = \frac{SE^{1}(x_1)}{SE^{1}(x_0)}$, measures the scale efficiency of input $x_1$ relative to that of $x_0$ along the production function from period 1.
In a completely analogous way,

\[
\pi_{1,0}=\frac{AP(B)}{AP(A)}=\frac{AP(B)}{AP(E)}\cdot\frac{AP(E)}{AP(F)}\cdot\frac{AP(F)}{AP(C)}\cdot\frac{AP(C)}{AP(A)}
=\frac{AP(B)/AP(E)}{AP(A)/AP(C)}\cdot\frac{AP(E)}{AP(F)}\cdot\frac{AP(F)}{AP(C)}. \tag{114}
\]

In (114),

\[
\pi_{1,0}=\frac{AP(B)}{AP(A)}=\frac{\tau_y^{1}(x_1,y_1)}{\tau_y^{0}(x_0,y_0)}\cdot\frac{f^{1}(x_1)}{f^{0}(x_1)}\cdot\frac{SE^{0}(x_1)}{SE^{0}(x_0)}. \tag{115}
\]

The RD decomposition is the geometric mean of (113) and (114).
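
The single-input single-output decompositions in (113) and (115) can be illustrated with a small numerical sketch. The production functions and bundles below are hypothetical; scale efficiency ratios are computed as ratios of frontier average products, since the maximal average product cancels:

```python
from math import sqrt

def f0(x):
    """Hypothetical period-0 production function (defined for x >= 1)."""
    return 4.0 * sqrt(x - 1.0)

def f1(x):
    """Hypothetical period-1 production function (defined for x >= 1)."""
    return 4.8 * sqrt(x - 1.0)

x0, y0 = 2.0, 3.0   # bundle A (period 0)
x1, y1 = 5.0, 6.0   # bundle B (period 1)

pi_10 = (y1 / x1) / (y0 / x0)                    # productivity index (112)
tec = (y1 / f1(x1)) / (y0 / f0(x0))              # technical efficiency change

# Decomposition (113): technical change at x0, scale term along f1.
tc_x0 = f1(x0) / f0(x0)
se1_ratio = (f1(x1) / x1) / (f1(x0) / x0)

# Decomposition (115): technical change at x1, scale term along f0.
tc_x1 = f1(x1) / f0(x1)
se0_ratio = (f0(x1) / x1) / (f0(x0) / x0)

print(pi_10, tec * tc_x0 * se1_ratio, tec * tc_x1 * se0_ratio)
```

In this example technical change is neutral (the two functions differ by a constant factor), so the two decompositions coincide; with a non-neutral shift their technical change and scale terms would differ while the products stay equal.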


Biennial Malmquist Index

The biennial Malmquist index introduced by Pastor, Asmild, and Lovell [27]
provides the same decomposition as in Ray and Desli [30] but avoids a linear
programming infeasibility problem associated with the Ray-Desli decomposition
of the Malmquist index that arises when an input-output bundle from one period
is infeasible relative to the frontier in the other period. Instead of using a period-
specific production possibility frontier, they estimate the technical efficiency of a
production unit with reference to a biennial production possibility frontier which
is empirically constructed from the pooled input-output data from two consecutive
periods. The reference technology set TB is empirically constructed from the pooled
data from both periods t and t + 1 (a simple graphical illustration of the biennial
production possibility frontier, for single-output single-input case, is given in section
“Biennial Malmquist Index” and in Fig. 6).
Using the output-oriented technical efficiency scores with reference to a CRS
biennial frontier, the biennial Malmquist productivity index of the firm s producing
a single output from multiple inputs is measured as12

\[
M_{c}^{B}(x_{s}^{t},y_{s}^{t};x_{s}^{t+1},y_{s}^{t+1})=\frac{TE_{c}^{B}(x_{s}^{t+1},y_{s}^{t+1})}{TE_{c}^{B}(x_{s}^{t},y_{s}^{t})}. \tag{116}
\]

The decomposition of this biennial Malmquist productivity index is

\[
M_{c}^{B}(x_{s}^{t},y_{s}^{t};x_{s}^{t+1},y_{s}^{t+1})=TEC\times TC\times SEC \tag{117}
\]

where

\[
TEC=\frac{TE_{v}^{t+1}(x_{s}^{t+1},y_{s}^{t+1})}{TE_{v}^{t}(x_{s}^{t},y_{s}^{t})}, \tag{118}
\]

\[
TC=\frac{TE_{v}^{B}(x_{s}^{t+1},y_{s}^{t+1})/TE_{v}^{t+1}(x_{s}^{t+1},y_{s}^{t+1})}{TE_{v}^{B}(x_{s}^{t},y_{s}^{t})/TE_{v}^{t}(x_{s}^{t},y_{s}^{t})} \tag{119}
\]

and

\[
SEC=\frac{TE_{c}^{B}(x_{s}^{t+1},y_{s}^{t+1})/TE_{v}^{B}(x_{s}^{t+1},y_{s}^{t+1})}{TE_{c}^{B}(x_{s}^{t},y_{s}^{t})/TE_{v}^{B}(x_{s}^{t},y_{s}^{t})}. \tag{120}
\]
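
As a quick check on (116) to (120), the three factors can be computed from efficiency scores and multiplied back together. This is a minimal sketch with hypothetical scores; the dictionary keys and function name are illustrative assumptions:

```python
def biennial_decomposition(scores_t, scores_t1):
    """Return (M, TEC, TC, SEC) from technical efficiency scores of one firm.

    Each argument is a dict with keys:
      'te_v'   own-period VRS efficiency,
      'te_v_b' efficiency against the biennial VRS frontier,
      'te_c_b' efficiency against the biennial CRS frontier.
    """
    tec = scores_t1["te_v"] / scores_t["te_v"]                       # (118)
    tc = (scores_t1["te_v_b"] / scores_t1["te_v"]) / (
        scores_t["te_v_b"] / scores_t["te_v"])                       # (119)
    sec = (scores_t1["te_c_b"] / scores_t1["te_v_b"]) / (
        scores_t["te_c_b"] / scores_t["te_v_b"])                     # (120)
    m = scores_t1["te_c_b"] / scores_t["te_c_b"]                     # (116)
    return m, tec, tc, sec

# Hypothetical efficiency scores for one firm in periods t and t + 1.
period_t = {"te_v": 0.90, "te_v_b": 0.80, "te_c_b": 0.60}
period_t1 = {"te_v": 0.95, "te_v_b": 0.90, "te_c_b": 0.75}

m, tec, tc, sec = biennial_decomposition(period_t, period_t1)
print(m, tec * tc * sec)
```

The product of the three factors telescopes to the ratio of biennial CRS efficiencies, which is why no geometric mean is needed.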

12 Since the biennial Malmquist index of productivity uses the biennial CRS production possibility set, which includes the period t and t + 1 sets, one need not calculate a “geometric mean” of two productivity indexes while measuring it.

Directional Distance Function and Luenberger Productivity Indicator

Building upon Luenberger’s [24] benefit function, Chambers, Chung, and Färe [10]
introduced the directional distance function (DDF) to measure the distance of an
observed input-output bundle from the frontier of the PPS in a direction chosen by
the analyst. Let $g^{x} = (g_1^{x}, g_2^{x}, \ldots, g_n^{x}) \in R_{+}^{n}$ and $g^{y} = (g_1^{y}, g_2^{y}, \ldots, g_m^{y}) \in R_{+}^{m}$ be two direction sub-vectors. Then the DDF can be defined as

\[
\vec{D}(x^{0},y^{0};g^{x},g^{y})=\max\{\beta : (x^{0}-\beta g^{x},\, y^{0}+\beta g^{y})\in T\}. \tag{121}
\]

It is clear that one can recover the radial output-oriented model by setting
gx = 0 and gy = y0 . In that case, β in (121) would equal (ϕ − 1) in (15) above.
An interesting choice of the direction for projection would be (gx , gy ) = (x0 , y0 ).
That leads to
\[
\vec{D}(x^{0},y^{0};g^{x},g^{y})=\max\{\beta : ((1-\beta)x^{0},\,(1+\beta)y^{0})\in T\}. \tag{122}
\]

In that case β is the maximum percentage by which all outputs can be expanded
and all inputs can be contracted simultaneously.
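
The effect of the choice of direction can be seen in a closed-form sketch. The example assumes a hypothetical single-input single-output CRS technology T = {(x, y) : y ≤ a·x}; the slope a, the bundle, and the function name are illustrative and not taken from the chapter:

```python
def directional_distance(x0, y0, gx, gy, slope):
    """DDF for the hypothetical technology T = {(x, y): y <= slope * x}.

    Solves y0 + beta * gy = slope * (x0 - beta * gx) for beta, i.e. moves the
    bundle in the direction (-gx, gy) until it reaches the frontier.
    """
    denominator = gy + slope * gx
    if denominator <= 0:
        raise ValueError("direction must have gy + slope * gx > 0")
    return (slope * x0 - y0) / denominator

a = 2.0              # assumed frontier slope
x0, y0 = 4.0, 6.0    # observed (inefficient) bundle

beta_radial = directional_distance(x0, y0, gx=0.0, gy=y0, slope=a)
beta_proportional = directional_distance(x0, y0, gx=x0, gy=y0, slope=a)

print("output-radial beta:", beta_radial)        # equals phi - 1 with phi = a*x0/y0
print("proportional beta:", beta_proportional)
```

With g = (0, y0) the result is the radial output expansion factor minus one; with g = (x0, y0) it is the common proportion by which output is expanded and input contracted.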
In Fig. 4, A is the observed bundle (x0 , y0 ). The point B (gx = − x0 , gy = y0 ) defines the direction of movement. The point C on the production frontier shows the maximum feasible movement within the production possibility set in the direction parallel to OB. In this case, the directional distance function is $\beta = \frac{AC}{OB} = \frac{OD}{OB}$.
While the Malmquist productivity index measured by the geometric mean of ratios of Shephard distance functions corresponding to the technologies of two different periods continues to be the most commonly used approach for measurement of productivity change, in recent years an alternative measure of productivity change known as the Luenberger productivity indicator introduced by Chambers, Färe, and Grosskopf (CFG) [11] has gained wide popularity.13

Fig. 4 Directional distance function (axes: Output, Input (X))
Consider, as before, two CRS technologies: $T_C^{0}$ for period 0 and $T_C^{1}$ for period 1. Further, let $(x^{t}, y^{t})$ be the input-output bundle of a firm in period t (t = 0, 1). Next consider a direction $g = (-g^{x}, g^{y})$. Then two possible measures of productivity change between period 0 and period 1 are

\[
L^{0}(x^{1},y^{1},x^{0},y^{0};g)=\vec{D}^{0}(x^{0},y^{0};g)-\vec{D}^{0}(x^{1},y^{1};g) \tag{123}
\]

with reference to $T_C^{0}$ and

\[
L^{1}(x^{1},y^{1},x^{0},y^{0};g)=\vec{D}^{1}(x^{0},y^{0};g)-\vec{D}^{1}(x^{1},y^{1};g) \tag{124}
\]

with reference to $T_C^{1}$.

If $\vec{D}^{t}(x^{0},y^{0};g) > \vec{D}^{t}(x^{1},y^{1};g)$, or equivalently $L^{t}(x^{1},y^{1},x^{0},y^{0};g) > 0$, then $(x^{0}, y^{0})$ is farther away from the period t frontier than $(x^{1}, y^{1})$ in the direction g. This implies higher productivity in period 1 than in period 0.
CFG [11] define the Luenberger productivity indicator as
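\[
\begin{aligned}
L(x^{1},y^{1},x^{0},y^{0};g)&=\tfrac{1}{2}\left[L^{0}(x^{1},y^{1},x^{0},y^{0};g)+L^{1}(x^{1},y^{1},x^{0},y^{0};g)\right]\\
&=\tfrac{1}{2}\left[\left(\vec{D}^{0}(x^{0},y^{0};g)-\vec{D}^{0}(x^{1},y^{1};g)\right)+\left(\vec{D}^{1}(x^{0},y^{0};g)-\vec{D}^{1}(x^{1},y^{1};g)\right)\right]
\end{aligned} \tag{125}
\]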

This can also be expressed as

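\[
\begin{aligned}
L(x^{1},y^{1},x^{0},y^{0};g)=&\left[\vec{D}^{0}(x^{0},y^{0};g)-\vec{D}^{1}(x^{1},y^{1};g)\right]\\
&+\tfrac{1}{2}\left[\left(\vec{D}^{1}(x^{0},y^{0};g)-\vec{D}^{0}(x^{0},y^{0};g)\right)+\left(\vec{D}^{1}(x^{1},y^{1};g)-\vec{D}^{0}(x^{1},y^{1};g)\right)\right]
\end{aligned} \tag{126}
\]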

13 The Luenberger productivity indicator was first formulated in a working paper by Chambers [7]. Subsequently, it appeared in a number of papers including Chambers, Chung, and Färe ([10], [12]); Chambers and Pope [9]; Chambers, Färe, and Grosskopf [11]; and Chambers [8].

Fig. 5 Luenberger productivity indicator

The first term inside the square brackets represents change in (directional)
technical (in)efficiency. A positive value of this difference means that the level of
(in)efficiency is lower in period 1 compared to period 0. This implies a positive
technical efficiency change (TEC). The other term represents technical change (TC).
The two differences inside the second term show shifts in the frontier between
period 0 and period 1 measured from (x0 , y0 ) and (x1 , y1 ) in the direction g. A
positive value of the difference implies an outward shift of the frontier.
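
The additive structure of the Luenberger indicator and its two components can be sketched as follows. The four directional distance function values are hypothetical, and the function name is an illustrative assumption:

```python
def luenberger(d0_a, d0_b, d1_a, d1_b):
    """Return (L, TEC, TC) from four directional distance function values.

    d{s}_a / d{s}_b: distance of the period-0 bundle (a) / period-1 bundle (b)
    from the period-s frontier, all measured in the same direction g.
    """
    l0 = d0_a - d0_b                              # equation (123)
    l1 = d1_a - d1_b                              # equation (124)
    indicator = 0.5 * (l0 + l1)                   # equation (125)
    tec = d0_a - d1_b                             # efficiency change term in (126)
    tc = 0.5 * ((d1_a - d0_a) + (d1_b - d0_b))    # frontier shift term in (126)
    return indicator, tec, tc

# Hypothetical directional distances, for illustration only.
indicator, tec, tc = luenberger(d0_a=0.20, d0_b=0.05, d1_a=0.30, d1_b=0.12)
print(indicator, tec, tc)
```

Unlike the Malmquist index, the components here add up to the indicator rather than multiply, and a value of zero (rather than one) signals no change.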
In Fig. 5, the lines OR0 and OR1 show the CRS production frontiers in period 0 and period 1. The points a and b show the input-output combinations of a firm in period 0 and period 1, respectively. The projection of a in the direction g is the point c on OR0 and the point e on OR1. The corresponding projections of b are the point d on OR0 and the point f on OR1. One way to measure the productivity difference between b and a is to compare the distances $ac = \vec{D}_C^{0}(x^{0},y^{0};g)$ and $bd = \vec{D}_C^{0}(x^{1},y^{1};g)$ from the OR0 line. A positive value of this difference implies that productivity is higher at b than at a. Similarly, one could compare the distances $ae = \vec{D}_C^{1}(x^{0},y^{0};g)$ and $bf = \vec{D}_C^{1}(x^{1},y^{1};g)$ from the OR1 line. The Luenberger productivity indicator shown in (126) is $\tfrac{1}{2}[(ac - bd) + (ae - bf)]$.
Because it is based on directional distance functions, the Luenberger productivity indicator depends critically on the choice of the direction of projection. As mentioned above, a popular choice for the direction is (gx = xt , gy = yt ). This, of course, would mean that the directions of projection will be different in the two periods 0 and 1 (see footnote 14).

Relation Between Tornqvist and Malmquist Productivity Indexes

CCD define the Malmquist output quantity index as

\[
Q_{y}(y^{1},y^{0};x)=\frac{D^{y}(y^{1},x)}{D^{y}(y^{0},x)} \tag{127}
\]

Because the output distance function is homogeneous of degree 1 and is (weakly) monotonic in outputs, for any output vector y, $D^{y}(y,x) = \sum_{j} \frac{\partial D^{y}}{\partial y_j} y_j$ is a valid output aggregator. Corresponding to the two different input bundles, there are two measures of the Malmquist output quantity index,

\[
Q_{y}^{0}(y^{1},y^{0};x^{0})=\frac{D^{y}(y^{1},x^{0})}{D^{y}(y^{0},x^{0})}\quad\text{and}\quad Q_{y}^{1}(y^{1},y^{0};x^{1})=\frac{D^{y}(y^{1},x^{1})}{D^{y}(y^{0},x^{1})}. \tag{128}
\]

Similarly, the Malmquist input quantity indexes are

\[
Q_{x}^{0}(x^{1},x^{0};y^{0})=\frac{D^{x}(y^{0},x^{1})}{D^{x}(y^{0},x^{0})}\quad\text{and}\quad Q_{x}^{1}(x^{1},x^{0};y^{1})=\frac{D^{x}(y^{1},x^{1})}{D^{x}(y^{1},x^{0})}. \tag{129}
\]

At this point, consider the revenue maximization problem

\[
\begin{aligned}
R(y_1,y_2;x_1,x_2)=\max\ & p_1y_1+p_2y_2\\
\text{s.t.}\ & D^{y}(x_1,x_2;y_1,y_2)=1
\end{aligned} \tag{130}
\]

where (p1 , p2 ) are the output prices and (x1 , x2 ) are the given input quantities. The
Lagrangian for the constrained maximization problem is

L = p1 y1 + p2 y2 + λ (1 − D y (x1 , x2 ; y1 , y2 )) (131)

and the corresponding first-order conditions for a maximum are

\[
\frac{\partial L}{\partial y_j}=p_j-\lambda\frac{\partial D^{y}}{\partial y_j}=0\quad(j=1,2) \tag{132}
\]

14 Afsharian and Ahn [2] have extended the CFG [11] decomposition of the Luenberger productivity indicator to identify separately a change in direction component.

\[
\frac{\partial L}{\partial\lambda}=1-D^{y}(x_1,x_2;y_1,y_2)=0 \tag{133}
\]

Thus, (132) and (133), together with the homogeneity of the output distance function, imply that

\[
\frac{p_jy_j}{\sum_{k}p_ky_k}=v_j=\frac{\partial\ln D^{y}}{\partial\ln y_j}\quad(j=1,2). \tag{134}
\]

Here vj is the revenue share of output j defined earlier in the context of the
Tornqvist index. Similarly, for the cost minimization problem for a given pair of
outputs,

\[
\begin{aligned}
C(x_1,x_2;y_1,y_2)=\min\ & w_1x_1+w_2x_2\\
\text{s.t.}\ & D^{x}(x_1,x_2;y_1,y_2)=1
\end{aligned} \tag{135}
\]

Arguing as above, we can get

\[
\frac{w_ix_i}{\sum_{k}w_kx_k}=s_i=\frac{\partial\ln D^{x}}{\partial\ln x_i}\quad(i=1,2). \tag{136}
\]

As before, si is the cost share of input i.

CCD have shown that when both input-output bundles (xk , yk ) (k = 0, 1) are elements of the same production possibility set (i.e., there is no technological difference between them) and have translog distance functions with identical linear, quadratic, and interaction parameters, the Malmquist and Tornqvist output and input quantity indexes will be identical. For simplicity, we show this with a two-output two-input example.
Consider now the output distance function

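\[
\begin{aligned}
\ln D^{y}(x_1,x_2;y_1,y_2)=\ &\alpha_0+\alpha_1\ln y_1+\alpha_2\ln y_2+\tfrac{1}{2}\alpha_{11}(\ln y_1)^2+\tfrac{1}{2}\alpha_{22}(\ln y_2)^2\\
&+\alpha_{12}(\ln y_1)(\ln y_2)+\beta_1\ln x_1+\beta_2\ln x_2+\tfrac{1}{2}\beta_{11}(\ln x_1)^2\\
&+\tfrac{1}{2}\beta_{22}(\ln x_2)^2+\beta_{12}(\ln x_1)(\ln x_2)+\gamma_{11}(\ln y_1)(\ln x_1)\\
&+\gamma_{12}(\ln y_1)(\ln x_2)+\gamma_{21}(\ln y_2)(\ln x_1)+\gamma_{22}(\ln y_2)(\ln x_2)
\end{aligned} \tag{137}
\]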

For the distance function in (137),

\[
\begin{aligned}
\frac{\partial\ln D^{y}}{\partial\ln y_1}&=\alpha_1+\alpha_{11}(\ln y_1)+\alpha_{12}(\ln y_2)+\gamma_{11}(\ln x_1)+\gamma_{12}(\ln x_2);\\
\frac{\partial\ln D^{y}}{\partial\ln y_2}&=\alpha_2+\alpha_{21}(\ln y_1)+\alpha_{22}(\ln y_2)+\gamma_{21}(\ln x_1)+\gamma_{22}(\ln x_2);
\end{aligned} \tag{138}
\]

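Because the output distance function is homogeneous of degree 1 in (y1 , y2 ), $\frac{\partial\ln D^{y}}{\partial\ln y_1}+\frac{\partial\ln D^{y}}{\partial\ln y_2}=1$. This in its turn implies

\[
\begin{aligned}
&\alpha_1+\alpha_2=1;\\
&\alpha_{11}+\alpha_{12}=\alpha_{21}+\alpha_{22}=0;\\
&\gamma_{11}+\gamma_{21}=\gamma_{12}+\gamma_{22}=0;
\end{aligned} \tag{139}
\]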

Also, by Young’s theorem, α 12 = α 21 .

In light of (139), CCD use Diewert’s quadratic identity [15] formulated in CCD
[6] as a translog identity to show that

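\[
\tfrac{1}{2}\ln\left(Q_{y}^{0}\cdot Q_{y}^{1}\right)=\tfrac{1}{2}\left(v_1^{1}+v_1^{0}\right)\left(\ln y_1^{1}-\ln y_1^{0}\right)+\tfrac{1}{2}\left(v_2^{1}+v_2^{0}\right)\left(\ln y_2^{1}-\ln y_2^{0}\right) \tag{140}
\]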

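Hence, the Malmquist output quantity index

\[
Q_{y}=\sqrt{Q_{y}^{1}\cdot Q_{y}^{0}}=\left(\frac{y_1^{1}}{y_1^{0}}\right)^{\frac{v_1^{0}+v_1^{1}}{2}}\cdot\left(\frac{y_2^{1}}{y_2^{0}}\right)^{\frac{v_2^{0}+v_2^{1}}{2}}=TQ_{y} \tag{141}
\]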
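
The Tornqvist output quantity index on the right-hand side of (141) is straightforward to compute from prices and quantities. The sketch below uses hypothetical two-output data; function and variable names are illustrative assumptions:

```python
from math import exp, log

def revenue_shares(prices, quantities):
    """Share of each output in total revenue."""
    values = [p * q for p, q in zip(prices, quantities)]
    total = sum(values)
    return [v / total for v in values]

def tornqvist_quantity_index(q0, q1, shares0, shares1):
    """Weighted geometric mean of quantity relatives with average share weights."""
    log_index = sum(
        0.5 * (s0 + s1) * log(b / a)
        for a, b, s0, s1 in zip(q0, q1, shares0, shares1)
    )
    return exp(log_index)

# Hypothetical prices and quantities for two outputs in periods 0 and 1.
p0, y0 = [2.0, 1.0], [10.0, 20.0]
p1, y1 = [2.5, 1.0], [12.0, 22.0]

tq_y = tornqvist_quantity_index(y0, y1, revenue_shares(p0, y0), revenue_shares(p1, y1))
print("Tornqvist output quantity index:", round(tq_y, 4))
```

Because the average share weights sum to one, a proportional change in all outputs is returned exactly as that proportion, whatever the prices.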

In a comparable manner, one can use a translog input distance function to show
that

\[
MPI^{x}(x^{0},x^{1};y^{0},y^{1})=\frac{\sqrt{Q_{y}^{1}\cdot Q_{y}^{0}}}{\sqrt{Q_{x}^{1}\cdot Q_{x}^{0}}}=\frac{Q_{y}}{Q_{x}} \tag{142}
\]

Recall that CCD propose two alternative measures of the Malmquist productivity
index, one output oriented and the other input oriented, and in the absence of CRS,
they will provide differing measures of productivity change. Further, it should be
noted that if the two units face different technologies, the Malmquist productivity
indexes (whether input or output oriented) will be technology-specific. In that
case, one should use a geometric mean of the technology-specific indexes. CCD
([6], theorems 3 and 4) have shown that the input- and output-oriented Malmquist
productivity indexes will differ from the Tornqvist productivity index by a scale
factor. The difference will disappear when CRS holds.

Bjurek [4] proposed a different measure of the Malmquist productivity index as

\[
MPI(x^{0},x^{1};y^{0},y^{1})=\frac{\sqrt{Q_{y}^{1}\cdot Q_{y}^{0}}}{\sqrt{Q_{x}^{1}\cdot Q_{x}^{0}}}=\frac{TQ_{y}}{TQ_{x}}. \tag{143}
\]

Relation Between Fisher and Malmquist Productivity Indexes

Färe and Grosskopf [16] used the duality between the distance function and the
dual cost function to establish equivalence between the Fisher and Malmquist
productivity indexes under certain conditions.
We have assumed that the production possibility set T and, hence, the input requirement set V(y) = {x : (x, y) ∈ T} is convex. By definition of a distance function, $x^{*t} = \frac{x^{t}}{D^{x}(x^{t},y^{t})}$ lies on the boundary of $V(y^{t})$. Therefore, by the supporting hyperplane theorem, there exists an input price vector $w^{*0}$ such that

\[
w^{*0\prime}x^{*0}=\frac{w^{*0\prime}x^{0}}{D^{x}(x^{0},y^{0})}=C(w^{*0},y^{0}) \tag{144}
\]

In other words,

\[
\frac{w^{*0\prime}x^{0}}{C(w^{*0},y^{0})}=D^{x}(x^{0},y^{0}). \tag{145}
\]

Similarly, there will be an input price vector $w^{*1}$ for which

\[
\frac{w^{*1\prime}x^{1}}{C(w^{*1},y^{1})}=D^{x}(x^{1},y^{1}). \tag{146}
\]

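At this point, they assume that there is no allocative inefficiency and, consequently, $w^{*t} = w^{t}$ (t = 0, 1). This amounts to assuming that

\[
\begin{aligned}
C(w^{0},y^{0})&=\frac{w^{0\prime}x^{0}}{D^{x}(x^{0},y^{0})};\qquad C(w^{1},y^{1})=\frac{w^{1\prime}x^{1}}{D^{x}(x^{1},y^{1})};\\
C(w^{0},y^{1})&=\frac{w^{0\prime}x^{1}}{D^{x}(x^{1},y^{1})};\qquad C(w^{1},y^{0})=\frac{w^{1\prime}x^{0}}{D^{x}(x^{0},y^{0})};
\end{aligned} \tag{147}
\]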

As argued by Balk [3], while the first two assumptions in (147) are relatively innocuous, the other two are quite problematic15 . While the input mix of the x1 bundle may be allocatively efficient for the isoquant for the output bundle y1 at the prices of period 1, there is no reason why its efficient radial projection will be cost minimizing at the prices of period 0 as well. Following Färe and Grosskopf, one can express the input-oriented Malmquist productivity index of CCD in (95) as

15 Balk points out a typo in FG [16]. See footnote 3 on page 681 of Balk [3].

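\[
MPI^{x}(x^{0},x^{1};y^{0},y^{1})=\frac{D^{x}(x^{0},y^{0})}{D^{x}(x^{1},y^{1})}
=\left[\frac{w^{0\prime}x^{0}}{w^{0\prime}x^{1}}\cdot\frac{w^{1\prime}x^{0}}{w^{1\prime}x^{1}}\right]^{1/2}\cdot\left[\frac{C(w^{1},y^{1})}{C(w^{1},y^{0})}\cdot\frac{C(w^{0},y^{1})}{C(w^{0},y^{0})}\right]^{1/2}. \tag{148}
\]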

Finally, they assume profit maximization under CRS, which implies (a) that for each output j, the price $p_j$ equals its marginal cost and (b) that the cost function is homogeneous of degree 1 in outputs. That is,

\[
\frac{\partial C}{\partial y_j}=p_j\quad\text{and}\quad C(w,y)=\sum_{j}\frac{\partial C}{\partial y_j}y_j=\sum_{j}p_jy_j.
\]

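In consequence, (148) would reduce to

\[
MPI^{x}(x^{0},x^{1};y^{0},y^{1})=\frac{\left[\dfrac{p^{1\prime}y^{1}}{p^{1\prime}y^{0}}\cdot\dfrac{p^{0\prime}y^{1}}{p^{0\prime}y^{0}}\right]^{1/2}}{\left[\dfrac{w^{0\prime}x^{1}}{w^{0\prime}x^{0}}\cdot\dfrac{w^{1\prime}x^{1}}{w^{1\prime}x^{0}}\right]^{1/2}}=\frac{FQ_{y}}{FQ_{x}}=\pi_{F}. \tag{149}
\]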
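
The Fisher productivity index on the right-hand side of (149) is the ratio of a Fisher output quantity index to a Fisher input quantity index. A minimal sketch with hypothetical price and quantity vectors (all names and numbers are illustrative):

```python
from math import sqrt

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(u * v for u, v in zip(a, b))

def fisher_quantity_index(p0, p1, q0, q1):
    """Geometric mean of the Laspeyres and Paasche quantity indexes."""
    laspeyres = dot(p0, q1) / dot(p0, q0)
    paasche = dot(p1, q1) / dot(p1, q0)
    return sqrt(laspeyres * paasche)

def fisher_productivity_index(p0, p1, y0, y1, w0, w1, x0, x1):
    """Fisher output quantity index divided by Fisher input quantity index."""
    return fisher_quantity_index(p0, p1, y0, y1) / fisher_quantity_index(w0, w1, x0, x1)

# Hypothetical data: two outputs, two inputs, two periods.
p0, p1 = [2.0, 1.0], [2.5, 1.0]      # output prices
y0, y1 = [10.0, 20.0], [12.0, 22.0]  # output quantities
w0, w1 = [1.0, 3.0], [1.2, 3.0]      # input prices
x0, x1 = [8.0, 4.0], [8.5, 4.1]      # input quantities

pi_f = fisher_productivity_index(p0, p1, y0, y1, w0, w1, x0, x1)
print("Fisher productivity index:", round(pi_f, 4))
```

The index uses only observed prices and quantities, which is why the chapter describes it as descriptive rather than normative.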

Balk [3] argues that although the conditions under which FG show equivalence
between the Malmquist and Fisher productivity indexes are unlikely to hold except
in very special cases, an approximate equivalence can be established for the general
case.
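Because $C(w^{0},y^{1}) \le \frac{w^{0\prime}x^{1}}{D^{x}(x^{1},y^{1})}$ and $C(w^{1},y^{0}) \le \frac{w^{1\prime}x^{0}}{D^{x}(x^{0},y^{0})}$, combined with the first two equalities in (147), one can get16

\[
\frac{w^{0\prime}x^{0}/C(w^{0},y^{0})}{w^{0\prime}x^{1}/C(w^{0},y^{1})}\le\frac{D^{x}(x^{0},y^{0})}{D^{x}(x^{1},y^{1})}\le\frac{w^{1\prime}x^{0}/C(w^{1},y^{0})}{w^{1\prime}x^{1}/C(w^{1},y^{1})} \tag{150}
\]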

16 By definition, $D^{x}(x^{1},y^{1}) = \max\{\theta : \tfrac{1}{\theta}x^{1} \in V(y^{1})\}$. Hence, $\frac{1}{D^{x}(x^{1},y^{1})}x^{1} \in V(y^{1})$. But, $C(w^{0},y^{1}) = \min\{w^{0\prime}x : x \in V(y^{1})\}$. Hence, $C(w^{0},y^{1}) \le \frac{w^{0\prime}x^{1}}{D^{x}(x^{1},y^{1})}$. Similarly, for the other inequality.

As argued by Balk [3] (see page 681),

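\[
\begin{aligned}
\frac{D^{x}(x^{0},y^{0})}{D^{x}(x^{1},y^{1})}&\approx\left[\frac{w^{0\prime}x^{0}/C(w^{0},y^{0})}{w^{0\prime}x^{1}/C(w^{0},y^{1})}\cdot\frac{w^{1\prime}x^{0}/C(w^{1},y^{0})}{w^{1\prime}x^{1}/C(w^{1},y^{1})}\right]^{1/2}\\
&=\left[\frac{w^{0\prime}x^{0}}{w^{0\prime}x^{1}}\cdot\frac{w^{1\prime}x^{0}}{w^{1\prime}x^{1}}\right]^{1/2}\cdot\left[\frac{C(w^{0},y^{1})}{C(w^{0},y^{0})}\cdot\frac{C(w^{1},y^{1})}{C(w^{1},y^{0})}\right]^{1/2}\\
&=\frac{\left[\dfrac{C(w^{0},y^{1})}{C(w^{0},y^{0})}\cdot\dfrac{C(w^{1},y^{1})}{C(w^{1},y^{0})}\right]^{1/2}}{FQ_{x}}
\end{aligned} \tag{151}
\]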

Again, assuming profit maximization under CRS,

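\[
\frac{D^{x}(x^{0},y^{0})}{D^{x}(x^{1},y^{1})}\approx\frac{\left[\dfrac{p^{0\prime}y^{1}}{p^{0\prime}y^{0}}\cdot\dfrac{p^{1\prime}y^{1}}{p^{1\prime}y^{0}}\right]^{1/2}}{FQ_{x}}=\frac{FQ_{y}}{FQ_{x}}=\pi_{F}. \tag{152}
\]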

Nonparametric Decomposition of the Fisher Productivity Index

As was recognized before, the Fisher productivity index is a descriptive rather than
a normative measure. It is, nonetheless, possible to use the dual representation
of an empirically constructed best practice technology to decompose the Fisher
productivity index into a number of economically meaningful factors. We now
consider an analogous decomposition of the Fisher productivity index introduced
by Ray and Mukherjee [28].
As explained before, the Fisher productivity index is the geometric mean of a
Laspeyres and a Paasche productivity index. Consider the Laspeyres index first.
For simplicity, assume that the firm produces a single output from multiple inputs.
Suppose that we are measuring the productivity index for firm k. The output quantities produced by the firm are $y_k^{0}$ in period 0 (the base period) and $y_k^{1}$ in period 1 (the current period). The observed input bundles are $x_k^{0}$ and $x_k^{1}$ in the two periods. The corresponding input price vectors are $w_k^{0}$ and $w_k^{1}$. Then the Laspeyres productivity index becomes

\[
L=\frac{y_k^{1}/y_k^{0}}{w_k^{0\prime}x_k^{1}/w_k^{0\prime}x_k^{0}}. \tag{153}
\]

At this point, recall the dual cost function for period t

\[
C^{t}(w,y)=\min\{w^{\prime}x : (x,y)\in T^{t}\}
\]

where $T^{t}$ is the production possibility set in period t.

Then the Laspeyres productivity index can be expressed as

\[
L=\frac{C^{1}(w_k^{0},y_k^{1})/w_k^{0\prime}x_k^{1}}{C^{0}(w_k^{0},y_k^{0})/w_k^{0\prime}x_k^{0}}\cdot\frac{y_k^{1}/C^{1}(w_k^{0},y_k^{1})}{y_k^{0}/C^{0}(w_k^{0},y_k^{0})}. \tag{154}
\]

But, following the Farrell decomposition of the cost efficiency, we can write

\[
\frac{C^{1}(w_k^{0},y_k^{1})}{w_k^{0\prime}x_k^{1}}=TE^{1}(x_k^{1},y_k^{1})\cdot AE^{1}(x_k^{1},y_k^{1};w_k^{0}) \tag{155}
\]

where $TE^{1}(x_k^{1},y_k^{1})$ is the technical efficiency of the input-output pair $(x_k^{1},y_k^{1})$ in period 1 and $AE^{1}(x_k^{1},y_k^{1};w_k^{0})$ is the allocative efficiency of the input mix of the bundle $x_k^{1}$ at input price $w_k^{0}$ in period 1. In an analogous manner,

\[
\frac{C^{0}(w_k^{0},y_k^{0})}{w_k^{0\prime}x_k^{0}}=TE^{0}(x_k^{0},y_k^{0})\cdot AE^{0}(x_k^{0},y_k^{0};w_k^{0}). \tag{156}
\]

Thus,

\[
L=\frac{TE^{1}(x_k^{1},y_k^{1})\cdot AE^{1}(x_k^{1},y_k^{1};w_k^{0})}{TE^{0}(x_k^{0},y_k^{0})\cdot AE^{0}(x_k^{0},y_k^{0};w_k^{0})}\cdot\frac{C^{0}(w_k^{0},y_k^{0})/y_k^{0}}{C^{1}(w_k^{0},y_k^{1})/y_k^{1}}. \tag{157}
\]

This can be further manipulated to get

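\[
L=\frac{TE^{1}(x_k^{1},y_k^{1})}{TE^{0}(x_k^{0},y_k^{0})}\cdot\frac{AE^{1}(x_k^{1},y_k^{1};w_k^{0})}{AE^{0}(x_k^{0},y_k^{0};w_k^{0})}\cdot\frac{C^{0}(w_k^{0},y_k^{0})}{C^{1}(w_k^{0},y_k^{0})}\cdot\left[\frac{C^{1}(w_k^{0},y_k^{0})/y_k^{0}}{C^{1}(w_k^{0},y_k^{1})/y_k^{1}}\right]. \tag{158}
\]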

Similar manipulations of the Paasche productivity index

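\[
P=\frac{y_k^{1}/y_k^{0}}{w_k^{1\prime}x_k^{1}/w_k^{1\prime}x_k^{0}} \tag{159}
\]

lead to the decomposition

\[
P=\frac{TE^{1}(x_k^{1},y_k^{1})}{TE^{0}(x_k^{0},y_k^{0})}\cdot\frac{AE^{1}(x_k^{1},y_k^{1};w_k^{1})}{AE^{0}(x_k^{0},y_k^{0};w_k^{1})}\cdot\frac{C^{0}(w_k^{1},y_k^{1})}{C^{1}(w_k^{1},y_k^{1})}\cdot\left[\frac{C^{0}(w_k^{1},y_k^{0})/y_k^{0}}{C^{0}(w_k^{1},y_k^{1})/y_k^{1}}\right]. \tag{160}
\]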

Now define

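\[
TEI=\frac{TE^{1}(x_k^{1},y_k^{1})}{TE^{0}(x_k^{0},y_k^{0})}; \tag{161}
\]

\[
AEI=\left[\frac{AE^{1}(x_k^{1},y_k^{1};w_k^{0})}{AE^{0}(x_k^{0},y_k^{0};w_k^{0})}\cdot\frac{AE^{1}(x_k^{1},y_k^{1};w_k^{1})}{AE^{0}(x_k^{0},y_k^{0};w_k^{1})}\right]^{1/2}; \tag{162}
\]

\[
TCI=\left[\frac{C^{0}(w_k^{0},y_k^{0})}{C^{1}(w_k^{0},y_k^{0})}\cdot\frac{C^{0}(w_k^{1},y_k^{1})}{C^{1}(w_k^{1},y_k^{1})}\right]^{1/2}; \tag{163}
\]

and

\[
ACI=\left[\frac{C^{1}(w_k^{0},y_k^{0})/y_k^{0}}{C^{1}(w_k^{0},y_k^{1})/y_k^{1}}\cdot\frac{C^{0}(w_k^{1},y_k^{0})/y_k^{0}}{C^{0}(w_k^{1},y_k^{1})/y_k^{1}}\right]^{1/2}. \tag{164}
\]

Then,

\[
\pi_{F}=\sqrt{L\cdot P}=(TEI)\cdot(AEI)\cdot(TCI)\cdot(ACI). \tag{165}
\]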

In this factorization, the four terms on the right-hand side relate to (a) technical efficiency change, (b) allocative efficiency change, (c) technical change, and (d) change in scale economies, respectively. The first, TEI, obviously shows the increase
(decrease) in technical efficiency in period 1 relative to what it was in period 0.
The factor AEI is itself the geometric mean of two ratios, each of which shows
the relative allocative efficiency of the input bundle from period 1 compared to
the bundle from period 0. The allocative efficiencies are measured using the same
technology and input prices for both bundles. TCI is a dual measure of technical
change. It shows the autonomous shift of the cost function between the two periods
evaluated alternatively at the input price and output quantity levels from the two
periods. Finally, the factor ACI shows the relative (dual) scale efficiencies of the
output levels from the two periods. When any one of the two ratios under the
square root sign in this factor is greater than unity, it implies that along the dual
cost function for the technology and input prices specified, the average cost is lower
at the output level in the current period than at the output level from the base period.

That is, the current period output is relatively more scale efficient. This contributes
positively to productivity growth.
A note of caution is in order here. As with all nonparametric models based
on cross-period DEA, some components of this decomposition of the Fisher
productivity index may be unavailable. This will be the case when the output level
from one period is larger than the maximum output observed in the other period. In
that case, the input requirement set relevant for the cross-period cost minimization
problem would be empty.

Relation Between Malmquist Productivity Index and Luenberger Productivity Indicator

We first consider the relationship between the Shephard distance function and the
directional distance function under CRS.
For any input-output pair (x, y), $D^{y}(x,y) = \min\{\lambda : (x, \tfrac{1}{\lambda}y) \in T\}$, which implies $F(x, \tfrac{1}{\lambda}y) = 0$, while $\vec{D}(x,y;g^{x},g^{y}) = \max\{\beta : F(x + \beta g^{x}, y + \beta g^{y}) = 0\}$. Hence, setting $(g^{x}, g^{y}) = (-x, y)$, we get $F((1-\beta)x, (1+\beta)y) = 0$. Now, under CRS, F(x, y) = 0 implies F(tx, ty) = 0 for any t > 0. Setting $t = \frac{1}{1-\beta}$, we get $F\left(x, \frac{1+\beta}{1-\beta}y\right) = 0$. Hence, $\lambda = \frac{1-\beta}{1+\beta}$. In other words,

\[
D^{y}(x,y)=\frac{1-\vec{D}(x,y;-x,y)}{1+\vec{D}(x,y;-x,y)}. \tag{166}
\]

That is,

\[
\ln D^{y}(x,y)=\ln\left(1-\vec{D}(x,y;-x,y)\right)-\ln\left(1+\vec{D}(x,y;-x,y)\right) \tag{166a}
\]
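
The identity in (166) can be verified numerically for a simple case. The sketch assumes a hypothetical single-input single-output CRS frontier y = A·x, for which both distance functions have closed forms; the constant A and the bundle are illustrative only:

```python
from math import log

A = 2.0  # slope of an assumed CRS frontier y = A * x

def shephard_output_distance(x, y):
    """Output distance function: the smallest lambda with (x, y / lambda) feasible."""
    return y / (A * x)

def proportional_ddf(x, y):
    """DDF in the direction (-x, y): solves (1 + beta) * y = A * (1 - beta) * x."""
    return (A * x - y) / (A * x + y)

x, y = 4.0, 6.0
beta = proportional_ddf(x, y)
d_y = shephard_output_distance(x, y)

print("D^y(x, y)           :", d_y)
print("(1 - beta)/(1 + beta):", (1 - beta) / (1 + beta))
# First-order expansion of (166a) around beta = 0: ln D^y is close to -2 * beta.
print("ln D^y vs -2*beta    :", log(d_y), -2 * beta)
```

The last line illustrates why a first-order approximation links the log of a Shephard distance function to a multiple of the proportional directional distance function when inefficiency is small.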

From (96) above,

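\[
\begin{aligned}
\ln MPI^{y}(x^{0},x^{1};y^{0},y^{1})&=\ln D^{y}(x^{1},y^{1})-\ln D^{y}(x^{0},y^{0})\\
&=\left[\ln\left(1-\vec{D}(x^{1},y^{1};-x^{1},y^{1})\right)-\ln\left(1-\vec{D}(x^{0},y^{0};-x^{0},y^{0})\right)\right]\\
&\quad-\left[\ln\left(1+\vec{D}(x^{1},y^{1};-x^{1},y^{1})\right)-\ln\left(1+\vec{D}(x^{0},y^{0};-x^{0},y^{0})\right)\right]
\end{aligned} \tag{167}
\]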

Using a first-order Taylor’s series approximation, Boussemart et al. [5] show (see their proposition 2 on pp. 399–400) that

ln MP I y x 0 , x 1 ; y 0 , y 1 −L1 x 1 , y 1 , x 0 , y 0 ; g (168)

Data Envelopment Analysis and a Nonparametric Measurement of Productivity Change

Unlike in a parametric approach like stochastic frontier analysis (SFA), in data envelopment analysis, one makes a minimal number of fairly general assumptions about the underlying production technology to construct an approximation to the production possibility set using observed values of inputs and outputs without any explicit specification of a functional form of the production, cost, or profit function.

One starts with the data set of input-output bundles $D = \{(x^{j}, y^{j}),\ j = 1, 2, \ldots, N\}$ and assumes that all observed input-output bundles are feasible. That is, $(x^{j}, y^{j}) \in D \Rightarrow (x^{j}, y^{j}) \in T$. Then under the assumptions of free disposability of inputs and outputs and convexity of the production possibility set, an empirical approximation of T based on the observed data points is

\[
S=\left\{(x,y): x\ge\sum_{j=1}^{N}\lambda_jx^{j};\ y\le\sum_{j=1}^{N}\lambda_jy^{j};\ \sum_{j=1}^{N}\lambda_j=1;\ \lambda_j\ge0;\ j=1,2,\ldots,N\right\} \tag{169}
\]

DEA Models for Measuring the Malmquist Productivity Index

In order to quantify the different components of the Malmquist index, we need to compute various contemporaneous and cross-period efficiency measures. The contemporaneous technical efficiency/distance function measures are obtained by solving, for each period (t = 0, 1), the standard output-oriented DEA LP problems

\[
\tau_{y}^{t}(x_{t}^{0},y_{t}^{0})=\frac{1}{\varphi^{*}}
\]

where

\[
\begin{aligned}
\varphi^{*}=\max\ &\varphi\\
\text{s.t.}\ &\sum_{j}\lambda_jy_{t}^{j}\ge\varphi y_{t}^{0};\\
&\sum_{j}\lambda_jx_{t}^{j}\le x_{t}^{0};\\
&\sum_{j}\lambda_j=1;\ \lambda_j\ge0;\ (j=1,2,\ldots,N)
\end{aligned} \tag{170}
\]
Of course, for CRS the constraint $\sum_{j}\lambda_j = 1$ is dropped. For measurement of cross-period efficiency of the input-output bundle from period t against the period s technology, $\tau_{y}^{s}(x_{t}^{0},y_{t}^{0})$, one needs to solve the problem:

\[
\begin{aligned}
\max\ &\varphi\\
\text{s.t.}\ &\sum_{j}\lambda_jy_{s}^{j}\ge\varphi y_{t}^{0};\\
&\sum_{j}\lambda_jx_{s}^{j}\le x_{t}^{0};\\
&\sum_{j}\lambda_j=1;\ \lambda_j\ge0;\ (j=1,2,\ldots,N)
\end{aligned} \tag{171}
\]

In solving the cross-period optimization problem in (171), one encounters infeasibility whenever any input from the bundle $x_{t}^{0}$ is smaller than the smallest value of the corresponding input across all input bundles $x_{s}^{j}$ (j = 1, 2, ..., N). This
is an inherent LP problem with the VRS assumption and is not a limitation of
the RD decomposition in particular. By using the CRS frontier to evaluate cross-
period distance functions for measuring technical change, the FGNZ decomposition
avoids the infeasibility problem. However, at the conceptual level, it lacks internal
consistency.
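
The cross-period problem (171) and its infeasibility under VRS can be illustrated without an LP solver in the special case of one input and one output. The sketch below is an assumption-laden simplification, not the general DEA model: under CRS the optimum is determined by the best observed average product, and under VRS an optimal solution uses at most two reference units, so enumerating pairs of units is enough. Data and function names are hypothetical.

```python
def crs_output_efficiency(x0, y0, reference):
    """1/phi* for problem (171) without the convexity constraint (one input, one output)."""
    best_average_product = max(y / x for x, y in reference)
    phi = best_average_product * x0 / y0
    return 1.0 / phi

def vrs_output_efficiency(x0, y0, reference):
    """1/phi* for problem (171) with sum(lambda) = 1, or None if the LP is infeasible.

    Pair enumeration replaces the LP: for each pair of reference units, the best
    convex combination lies at an endpoint or where the input constraint binds.
    """
    best_output = None
    for xi, yi in reference:
        for xj, yj in reference:
            weights = [0.0, 1.0]
            if xi != xj:
                weights.append((x0 - xj) / (xi - xj))  # input constraint binds
            for mu in weights:
                feasible = 0.0 <= mu <= 1.0 and mu * xi + (1 - mu) * xj <= x0 + 1e-12
                if feasible:
                    candidate = mu * yi + (1 - mu) * yj
                    if best_output is None or candidate > best_output:
                        best_output = candidate
    return None if best_output is None else y0 / best_output

# Hypothetical period-s reference data as (input, output) pairs.
period_s = [(2.0, 2.0), (4.0, 5.0), (8.0, 7.0)]

print(vrs_output_efficiency(3.0, 3.0, period_s))  # feasible cross-period evaluation
print(vrs_output_efficiency(1.0, 1.0, period_s))  # None: input below the smallest observed input
print(crs_output_efficiency(1.0, 1.0, period_s))  # CRS evaluation is still defined
```

The second call shows the infeasibility described above: no convex combination of period-s units uses as little input as the evaluated bundle, whereas the CRS frontier can be scaled down to reach it.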
Another problem that applies to the Malmquist productivity index in general is that it fails the circularity test in the sense that a direct measure of productivity change between periods 0 and 2 need not be the same as the product of the change between periods 0 and 1 and between 1 and 2. To avoid the infeasibility problem, one may use a “sequential technology” as the base. As mentioned above in section “Biennial Malmquist Index,” Pastor, Asmild, and Lovell [27] proposed a “biennial” Malmquist productivity index.

Biennial Malmquist Index18

18 This section is based on Deb and Ray [13].

The appropriate DEA model to estimate period t output-oriented technical efficiency $TE_{c}^{B}(x_{s}^{t},y_{s}^{t})$ of firm s, with reference to a CRS biennial production possibility set, is

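\[
\begin{aligned}
\varphi_{s}^{*}=\max\ &\varphi\\
\text{subject to}\ &\sum_{k=t,t+1}\sum_{j=1}^{n_k}\lambda_{j}^{k}y_{j}^{k}\ge\varphi y_{s}^{t};\\
&\sum_{k=t,t+1}\sum_{j=1}^{n_k}\lambda_{j}^{k}x_{j}^{k}\le x_{s}^{t};\\
&\lambda_{j}^{k}\ge0;
\end{aligned} \tag{172}
\]

where $n_k$ is the number of observed firms in period k and $TE_{c}^{B}(x_{s}^{t},y_{s}^{t}) = \frac{1}{\varphi_{s}^{*}}$.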
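Period t output-oriented technical efficiency $TE_{v}^{B}(x_{s}^{t},y_{s}^{t})$ of firm s, with reference to a biennial VRS frontier, can be estimated by the following DEA model:

\[
\begin{aligned}
\phi_{s}^{*}=\max\ &\phi\\
\text{subject to}\ &\sum_{k=t,t+1}\sum_{j=1}^{n_k}\lambda_{j}^{k}y_{j}^{k}\ge\phi y_{s}^{t};\\
&\sum_{k=t,t+1}\sum_{j=1}^{n_k}\lambda_{j}^{k}x_{j}^{k}\le x_{s}^{t};\\
&\sum_{k=t,t+1}\sum_{j=1}^{n_k}\lambda_{j}^{k}=1;\\
&\lambda_{j}^{k}\ge0;
\end{aligned} \tag{173}
\]

where $n_k$ is the number of observed firms in period k and $TE_{v}^{B}(x_{s}^{t},y_{s}^{t}) = \frac{1}{\phi_{s}^{*}}$.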
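
For a single input and a single output, the CRS biennial model (172) reduces to comparing a firm's average product with the best average product in the pooled data, so the biennial index can be sketched without an LP solver. The data below are hypothetical, and the shortcut is an assumption that holds only in this one-input one-output case:

```python
def crs_output_efficiency(bundle, reference):
    """CRS output-oriented efficiency with one input and one output.

    Equals the bundle's average product divided by the best average product
    among the reference units (closed form of problem (172) in this case).
    """
    x0, y0 = bundle
    best_average_product = max(y / x for x, y in reference)
    return (y0 / x0) / best_average_product

# Hypothetical (input, output) data; the firm of interest is the second unit.
period_t = [(2.0, 2.0), (4.0, 3.0)]
period_t1 = [(2.0, 3.0), (5.0, 6.0)]
pooled = period_t + period_t1          # biennial reference set

te_t = crs_output_efficiency(period_t[1], pooled)
te_t1 = crs_output_efficiency(period_t1[1], pooled)
m_biennial = te_t1 / te_t              # biennial Malmquist index, as in (116)

print(te_t, te_t1, m_biennial)
```

Because both periods are evaluated against the same pooled frontier, the bundle from either period is always feasible and the index is a simple ratio of two scores.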
Figure 6 provides an illustration of the biennial production possibility frontier and the measure of output-oriented technical efficiency with reference to it for a firm, producing a single output from a single input, observed in two time periods t and t + 1 (points A and B, respectively). The VRS frontiers for periods t and t + 1 are indicated by K0L0M0-extension and K1L1M1-extension, respectively. The rays through the origin OP0 and OP1 represent the CRS frontiers for period t and period t + 1, respectively. The biennial VRS frontier is indicated by the broken line K1L1DFM0-extension, and the biennial CRS frontier in this case coincides with that of period t + 1. Output-oriented technical efficiency of the firm with reference to the CRS biennial frontier in period t is $TE_{c}^{B}(x^{t},y^{t}) = AX_{t}/QX_{t}$, and that for period t + 1 is $TE_{c}^{B}(x^{t+1},y^{t+1}) = BX_{t+1}/RX_{t+1}$. Similarly, with reference to the VRS biennial frontier, $TE_{v}^{B}(x^{t},y^{t}) = AX_{t}/DX_{t}$ and $TE_{v}^{B}(x^{t+1},y^{t+1}) = BX_{t+1}/FX_{t+1}$ show the levels of technical efficiency for the firm in period t and t + 1, respectively. Therefore, the biennial Malmquist productivity index is

Fig. 6 Biennial Malmquist index and its decomposition

\[
M_{c}^{B}(x^{t},y^{t};x^{t+1},y^{t+1})=\frac{BX_{t+1}/RX_{t+1}}{AX_{t}/QX_{t}}. \tag{174}
\]

The decomposition of this Malmquist productivity index is

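\[
\begin{aligned}
M_{c}^{B}(x^{t},y^{t};x^{t+1},y^{t+1})=\ &\frac{BX_{t+1}/EX_{t+1}}{AX_{t}/CX_{t}}\\
&\times\frac{(BX_{t+1}/FX_{t+1})/(BX_{t+1}/EX_{t+1})}{(AX_{t}/DX_{t})/(AX_{t}/CX_{t})}\\
&\times\frac{(BX_{t+1}/RX_{t+1})/(BX_{t+1}/FX_{t+1})}{(AX_{t}/QX_{t})/(AX_{t}/DX_{t})}
\end{aligned} \tag{175}
\]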

Here

\[
\frac{BX_{t+1}/EX_{t+1}}{AX_{t}/CX_{t}}=TEC\ \text{(technical efficiency change)} \tag{176a}
\]

\[
\frac{(BX_{t+1}/FX_{t+1})/(BX_{t+1}/EX_{t+1})}{(AX_{t}/DX_{t})/(AX_{t}/CX_{t})}=TC\ \text{(technical change)} \tag{176b}
\]

and

\[
\frac{(BX_{t+1}/RX_{t+1})/(BX_{t+1}/FX_{t+1})}{(AX_{t}/QX_{t})/(AX_{t}/DX_{t})}=SEC\ \text{(scale efficiency change)} \tag{176c}
\]

DEA Model for the Directional Distance Function

For an arbitrary choice of (gx , gy ), the relevant VRS DEA problem will be

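\[
\begin{aligned}
\max\ &\beta\\
\text{s.t.}\ &\sum_{j=1}^{N}\lambda_jy_{r}^{j}-\beta g_{r}^{y}\ge y_{r}^{0};\ (r=1,2,\ldots,m);\\
&\sum_{j=1}^{N}\lambda_jx_{i}^{j}+\beta g_{i}^{x}\le x_{i}^{0};\ (i=1,2,\ldots,n);\\
&\sum_{j=1}^{N}\lambda_j=1;\\
&\lambda_j\ge0;\ (j=1,2,\ldots,N);\ \beta\ \text{unrestricted}
\end{aligned} \tag{177}
\]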

The flexibility of the DDF is apparent from the fact that it can be radial (setting
gx = 0 or gy = 0), bi-radial (setting gx = x0 and gy = y0 ), or completely non-radial
for arbitrary choice of (gx , gy ).

Conclusion

This chapter offers a broad overview of how a measure of technical change

can be extracted from alternative measures of total factor productivity change
obtainable from production, cost, profit, or distance functions. In the nonparametric
approach, the focus is on index numbers. Appropriate DEA models are formulated
for measuring Malmquist or biennial Malmquist productivity indexes and the
Luenberger productivity indicator. Some topics not considered in this chapter are measurement of technical change without the convexity assumption (through Free Disposal Hull analysis [35]) and output and input biases of technical change. For a
relevant chapter in this volume. An excellent discussion of the Malmquist index can
be found in Färe, Grosskopf, and Roos [19].

Cross-References

Index Numbers and Productivity Measurement

Modeling Technical Change: Theory and Practice

References
1. Aczel J (1990) Determining merged relative scores. J Math Anal Appl 150:20–40
2. Afsharian M, Ahn H (2014) The Luenberger indicator and directions of measurement: a
bottoms-up approach with an empirical illustration to German savings banks. Int J Prod Res
52(20):6216–6233
3. Balk B (1993) Malmquist productivity indexes and Fisher productivity indexes comment. Econ
J 103:680–682
4. Bjurek H (1996) The Malmquist total factor productivity index. Scand J Econ 98:303–313
5. Boussemart J, Briec W, Kerstens K, Poutineau J (2003) Luenberger and Malmquist productiv-
ity indices: theoretical comparisons and empirical illustration. Bull Econ Res 55(4):391–405
6. Caves DW, Christensen LR, Diewert WE (1982) The economic theory of index numbers and
the measurement of input, output, and productivity. Econometrica 50(6):1399–1414
7. Chambers RG (1996) A new look at exact input, output, productivity, and technical change
measurement. Working papers 197840, University of Maryland, Department of Agricultural
and Resource Economics
8. Chambers RG (2002) Exact nonradial input, output, and productivity measurement. Economic
Theory 20:751–765
9. Chambers RG, Pope RD (1996) Aggregate productivity measures. Am J Agric Econ
78(5):1360–1365
10. Chambers RG, Chung Y, Färe R (1996) Benefit and distance functions. J Econ Theory 70:407–
419
11. Chambers RG, Färe R, Grosskopf S (1996) Productivity growth in APEC countries. Pac Econ
Rev 1:181–190
12. Chambers RG, Chung Y, Färe R (1998) Profit, directional functions, and Nerlovian efficiency.
J Optim Theory Appl 98:351–364
13. Deb AK, Ray SC (2014) Total factor productivity growth in Indian manufacturing: a biennial
Malmquist analysis. Indian Econ Rev 49(1):1–25
14. Denney M, Fuss M, Waverman L (1981) The measurement and interpretation of total factor
productivity in regulated industries with an application to Canadian telecommunications. In:
Cowing TG, Stevenson RE (eds) Productivity measurement in regulated industries. Academic,
New York, pp 179–218
15. Diewert WE (1976) Exact and superlative index numbers. J Econ 4:115
16. Färe R, Grosskopf S (1992) Malmquist productivity indexes and fisher ideal indexes: comment.
Econ J 102:158–160
17. Färe R, Grosskopf S, Lindgren B, Roos P (1994) Productivity developments in Swedish
hospitals: a Malmquist output index approach. In: Charnes A, Cooper WW, Lewin AY,
Seiford LM (eds) Data envelopment analysis: theory, methodology and applications. Kluwer
Academic, Boston. (Originally presented at a Conference on new uses of DEA in management
and public policy, University of Texas, Austin, September 27–29, 1989)
18. Färe R, Grosskopf S, Norris M, Zhang Z (1994) Productivity growth, technical progress, and
efficiency change in industrialized countries. Am Econ Rev 84(1):66–83
19. Färe R, Grosskopf S, Roos P (1994) Malmquist productivity indexes: a survey of theory and
practice. In: Färe R, Grosskopf S, Russell R (eds) Index numbers: essays in Honor of Sten
Malmquist. Kluwer Academic, Boston, 1998, pp 127–190

20. Färe R, Grosskopf S, Roos P (1998) Malmquist productivity indexes: a survey of theory and
practice. In: Färe R, Grosskopf S, Russell R (eds) Index numbers: essays in Honor of Sten
Malmquist. Kluwer Academic, Boston, pp 127–190
21. Farrell MJ (1957) The measurement of technical efficiency. J R Stat Soc Ser A (General)
120(3):253–281
22. Jorgenson DW, Griliches Z (1967) Explanation of productivity change. Rev Econ Stud
34(3):249–283
23. Lovell CAK (2003) The decomposition of productivity indexes. J Prod Anal 20:437–458
24. Luenberger DG (1992) Benefit functions and duality. J Math Econ 21:461–481
25. Moorsteen RH (1961) On measuring productive potential and relative efficiency. Q J Econ
75(3):451–467
26. O’Donnell C (2012) Nonparametric estimates of components of productivity and profitability
change in U.S. agriculture. Am J Agric Econ 94(4):873–890
27. Pastor JT, Asmild M, Lovell C (2011) The biennial Malmquist productivity change index.
Socio Econ Plan Sci 45:10–15
28. Ray SC, Mukherjee K (1996) Decomposition of the fisher ideal index of productivity: a non-
parametric dual analysis of U. S. Airlines data. Econ J 106:1659–1678
29. Ray SC, Segerson K (1991) A profit function approach to measuring the rate of technical
progress: an application to U.S. manufacturing. J Prod Anal 2:39–52
30. Ray SC, Desli E (1997) Productivity growth, technical progress, and efficiency change
in industrialized countries: comment. Am Econ Rev 87(5):1033–1039
31. Schultz T (1956) Reflections on agricultural production, output, and supply. J Farm Econ
38(3):748–762
32. Shephard RW (1953) Cost and production functions. Princeton University Press, Princeton
33. Shephard RW (1970) Theory of cost and production functions. Princeton University Press, Princeton
34. Solow RM (1957) Technical change and the aggregate production function. Rev Econ Stat
39(3):312–320
35. Tulkens H, Eeckaut PV (1995) Non-parametric efficiency, progress and regress measures for
panel data: methodological aspects. Eur J Oper Res 80(3):474–499
21 Modeling Technical Change: Theory and Practice
Subal C. Kumbhakar

Contents
Introduction
Modeling TC: The Single Output Case
Production Function Approach
The Cost Function Approach
Profit Function
Price Function Approach
Specification and Estimation Issues
Production Function Approach
Estimation
Production Function Models
Cost Function Approach
Profit Function Approach
Multiple Outputs
The Primal Approach
The Dual Approach
TC Measures Induced by Management/Exogenous Factors and Time
Management Variables as Technology Shifter
Specification of the IDF with Multiple Outputs
TC in Cross-Sectional Data
Technical Change from Other Indirect Functions
Revenue Function
Indirect Production Function
Technical Change and TFP Change
TC and Profit
TC as a Component of TFP Change
Formulating and Estimating TC Without Estimating Profit/Cost Function

S. C. Kumbhakar
Department of Economics, State University of New York at Binghamton, Binghamton, NY, USA
Inland Norway University of Applied Sciences, Lillehammer, Norway
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2022
S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_27

Models with Technical Inefficiency
TC and Technical Inefficiency
Estimation
TC in Production Models with Good and Bad Outputs
Productivity and Profitability
TC and Factor Productivity with One Variable Input
Concluding Remarks
References

Abstract

This chapter focuses on models dealing with technical change (TC). We discuss
TC under both single-output and multiple-output cases. For the single-output
case, we specify the technology in terms of (i) a production (input or output
distance) function, (ii) a cost function, and (iii) a profit function. Each of these
cases is discussed under disembodied approaches, in which TC is not associated
with a specific input, and embodied (factor-augmenting (FA)) approaches, in which
TC is specific to one or more inputs. In the disembodied approach, we mostly focus
on time-driven TC, which includes models with the time trend and general index
approaches. We first discuss each of the cases in a standard neoclassical setup
without technical inefficiency, followed by a brief discussion of some of the models
with technical inefficiency. We also discuss TC and total factor productivity (TFP)
in models with good and bad outputs. Finally, we link TFP change with
profitability change.

Keywords

Technical change · Total factor productivity · Panel data · Production
function · Cost function · Profit function · Distance function · Technical
inefficiency

JEL Classification Numbers

D24

Introduction

“Productivity isn’t everything, but in the long run it is almost everything. A
country’s ability to improve its standard of living over time depends almost
entirely on its ability to raise its output per worker.” (Paul Krugman, The Age of
Diminishing Expectations [1]). Similarly, Robert Solow argued in his famous paper,
A Contribution to the Theory of Economic Growth [2], that an economy cannot
grow in the long run without technological progress. More specifically, growth
in per capita real income for an economy in the long run will be determined by
technological progress.

Both Krugman and Solow referred to macro aspects of economic growth. Solow,
in particular, focused on the role of technical progress in steady-state long-run
growth. In this chapter, however, we take a micro perspective and therefore do not
discuss economic well-being or aggregate economic growth. Instead, we focus on
individual firms (producers) and concentrate on the impact of technical change (TC)
on a firm’s cost, profit, revenue, returns to outlay, etc. In pursuing this goal, we
discuss various micro models that are used to estimate TC. As in the macro models,
we define TC as a shift in firm-level technology, which can occur due to many
factors. The main idea is that, if there is technical progress (positive TC), more
output is produced without increasing inputs; in a production function formulation,
the production function shifts upward. Alternatively, less input is required (i.e.,
cost is lower given input prices) to produce a given level of output. TC can also be
examined through other representations of the technology, such as profit, revenue,
and distance functions. Because the shift in firm-level technology can take different
forms and arise from many different factors, TC can be examined in many ways,
although the production function formulation is used the most. TC can also be
viewed as endogenous or exogenous. In endogenous TC, the factors causing the
shift are decision variables of the firm. For example, a firm can invest in
state-of-the-art computers (IT capital) to speed up TC. If firms do not decide on
such variables, TC can be viewed as exogenous. Similarly, TC can be embodied in
some specific inputs or disembodied. If TC is associated with capital of recent
vintage, it is said to be embodied in capital. On the other hand, TC may not be
associated with any particular input, in which case it is disembodied. Finally, TC
can be neutral (not directed towards any input in particular) or non-neutral. If TC
is non-neutral, it is biased. Skill-biased TC is an example of non-neutral TC.
In the Solow model, unitary returns to scale are assumed. In the micro models
that we will be focusing on, returns to scale (RTS) will be unconstrained – either
constant (not unitary) or observation specific, depending on the functional form
used. Nonconstant RTS will be assumed when we focus on flexible functional forms.
Since the estimate of TC is likely to depend on whether one assumes RTS to be
unitary or not, which is a testable hypothesis, our models will be unconstrained
in relation to RTS. Thus, although our main focus is TC, we will be frequently
discussing RTS. Another reason for emphasizing RTS is that TC is only one
component of total factor productivity (TFP) change. Although our discussion of
TFP change will be limited, because it is discussed in detail in another chapter of
this volume (Ray [68]), we cannot ignore RTS entirely.
In this chapter, we focus on time driven TC (a shift taking place over time), as
well as a shift due to some exogenous (to the firm) factors in parametric models.
That is, we review models in which TC is exogenous and non-neutral. Perhaps we
will be abusing the terminology “embodied TC” by labeling the factor-augmenting
(FA) TC models as embodied TC models, especially when TC is embodied in the
inputs of recent vintage proxied by time in panel models. For the cross-sectional
models, embodiments are associated with exogenous factors such as management.

By the same token, we can use the term disembodied TC to mean a shift that is
not attributed to a particular input. These notions can also be linked to neutral and
non-neutral TC. As a result, different classifications are not completely separate.
Although one can discuss TC using time series data for a firm, region, state, country,
etc., our focus will mostly be on the use of panel (time series of cross sections)
data so that we can concentrate on time as the shifter in the absence of exogenous
factors. We will also discuss TC in cross-sectional models when TC is driven
by exogenous factors.
Our discussion of TC in this chapter will cover single as well as multiple outputs.
For the single output case, we will specify the technology mostly but not entirely in
terms of a (I) production (input or output distance) function, (II) cost function, and
(III) profit function. The definition of TC and its interpretation will differ depending
on whether the technology is specified in terms of a production, cost, profit, or
revenue function. Finally, each of these cases will be discussed under disembodied
and embodied (FA) approaches. In the disembodied approach, TC is not associated
with a specific input, which is the case in the FA approach. We will mostly focus on
time driven TC in the disembodied approach, including the time trend and general
index approaches. First, we discuss each of the cases without technical inefficiency
followed by a brief discussion of some models with technical inefficiency. We also
discuss TC and TFP in models with good and bad outputs. Finally, although our
focus in this chapter is not on TFP, we link TFP change with profitability change for
an easy interpretation of TFP change.
We use an input or output distance function to represent multiple outputs in
a primal framework. Since the dual cost/profit functions with multiple outputs
are straightforward generalizations of the single output cost/profit functions, our
discussion on the multiple output cost/profit functions with and without inefficiency
is not extensive.

Modeling TC: The Single Output Case

In this section, we discuss modeling (disembodied) exogenous TC via (i) the
production function approach, (ii) the cost function approach, and (iii) the profit
function approach.

Production Function Approach

First, we consider models where TC is time driven.

The Time Trend (Continuous Time) Model

In the time trend (TT) formulation of TC, the production function (see, for example,
Chambers [3] for the properties) is specified as

\[
y = f(x, t) \tag{1}
\]

where $y$ is output, $x$ is a vector of inputs, and $t$ is the time-trend variable.
Mathematically, $t$ can be viewed as a factor of production like $x$. Although $x$ is
mostly treated as the input vector, it can also include environmental variables
(facilitating inputs). In measuring TC, we consider the shift in the production
function over time, ceteris paribus. TC, in this formulation, is defined as

\[
TC_p(x, t) = \partial \ln y / \partial t = \partial \ln f(x, t) / \partial t. \tag{2}
\]

The subscript $p$ on TC in (2) indicates that TC is based on a production
function formulation. A positive (negative) value of TC indicates technical progress
(regress). It shows the rate at which output changes while holding input levels
unchanged. In other words, output increases over time without increasing the input
quantities when technical progress takes place. This is shown in the figure below,
where output increases from $y_A$ to $y_B$ moving forward in time from $t_0$ to $t_1$
($t_1 > t_0$), holding $x$ unchanged at $x_A$. If $x$ is labor, the figure shows that
output per worker (the average product of labor) increases over time, ceteris paribus.
More generally, the average products of the inputs change over time by amounts
that depend on the functional form of $f(\cdot)$.

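[Figure: production functions y = f(x, t0) and y = f(x, t1), with output y on the vertical axis and input x on the horizontal axis; at x = xA, output rises to yB on the t1 curve.]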

TC is said to be Hicks neutral if the production function $y = f(x, t)$ can be written
as $y = \mu(t) f(x)$, where $\mu(t)$ is the shift function. Note that here TC (the shift
function) is not embodied in any particular input, which is why it is labeled as
neutral. Thus, the productivity of all the inputs changes by the same proportion
$\mu(t)$ under Hicks neutral TC. Consequently, the rate of technical substitution
between any two inputs (the ratio of their marginal products) is unaffected by Hicks
neutral TC: for given input quantities, it does not change over time.
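The two properties of Hicks neutral TC can be checked numerically. The sketch below is only an illustration under assumed ingredients: the shift function mu(t) = exp(g t), the CES-type f(x), and all parameter values are hypothetical choices, not taken from the chapter. It computes TC as in (2) by a central difference and compares the ratio of marginal products at two dates.

```python
import math

def f(x1, x2, t, g=0.02):
    """Hicks-neutral technology y = mu(t) * f(x), with an assumed mu(t) = exp(g * t)."""
    mu = math.exp(g * t)
    return mu * (0.4 * x1 ** 0.5 + 0.6 * x2 ** 0.5) ** 1.6  # illustrative CES-type f(x)

def tc(x1, x2, t, h=1e-5):
    """TC_p = d ln f / dt by central difference, inputs held fixed (eq. 2)."""
    return (math.log(f(x1, x2, t + h)) - math.log(f(x1, x2, t - h))) / (2 * h)

def mrts(x1, x2, t, h=1e-6):
    """Ratio of marginal products f_1 / f_2, by central differences."""
    f1 = (f(x1 + h, x2, t) - f(x1 - h, x2, t)) / (2 * h)
    f2 = (f(x1, x2 + h, t) - f(x1, x2 - h, t)) / (2 * h)
    return f1 / f2

tc_small = tc(2.0, 3.0, t=1.0)    # TC at one input bundle and date
tc_large = tc(20.0, 5.0, t=7.0)   # TC at a very different bundle and date
mrts_t0 = mrts(2.0, 3.0, t=0.0)   # substitution rate before the shift
mrts_t5 = mrts(2.0, 3.0, t=5.0)   # substitution rate after the shift
print(tc_small, tc_large, mrts_t0, mrts_t5)
```

In this example TC is the same at every input bundle because it depends only on the assumed mu(t), and the ratio of marginal products at a given bundle is the same at both dates.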

The General Index (Discrete Time) Formulation

In the general index (GI) approach, instead of treating $t$ as a continuous variable,
an index $A(t)$ is defined using the time dummies. The production function with this
index is written as

\[
y = f(x, A(t)) \tag{3}
\]

where $A(t) = \sum_t \lambda_t DT_t$, $DT_t$ being the time dummies and $\lambda_t$ the
associated coefficients (parameters). TC is then defined as

\[
TC_p(x, A(t)) = \Delta \ln f(x, A(t)) = \ln f(x, A(t)) - \ln f(x, A(t-1)) \tag{4}
\]

Here, the partial derivative is replaced by finite changes, holding $x$ unchanged.
Conceptually, it is no different from the TT model. The only difference is in the
algebraic formulation and estimation, which will come later.
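As a small illustration of (4), suppose the index enters ln f additively, so that TC between two consecutive periods is simply A(t) - A(t-1). The coefficient values and years below are hypothetical, chosen only to show that the index allows both progress and regress from one period to the next.

```python
# Hypothetical estimated time-dummy coefficients (base period normalised to 0).
lam = {2001: 0.000, 2002: 0.015, 2003: 0.041, 2004: 0.036}

def general_index_tc(coefs):
    """TC between consecutive periods when ln f = g(x) + A(t): A(t) - A(t-1)."""
    years = sorted(coefs)
    return {t: coefs[t] - coefs[s] for s, t in zip(years, years[1:])}

tc_by_year = general_index_tc(lam)
for year, rate in tc_by_year.items():
    label = "progress" if rate > 0 else "regress"
    print(year, round(rate, 3), label)
```

Unlike a linear time trend, nothing here forces TC to be constant or monotone over time.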

The Factor-Augmenting (Embodied in Time) Approach

As noted before, we view the factor-augmenting approach as a form of embodied
TC for which the production function is written as

\[
y = f(x^*) \tag{5}
\]

where $x^*$ is the vector of inputs measured in efficiency units, with elements
$x_j^* = \lambda_j(t) x_j$, $j = 1, \ldots, J$. The efficiency (factor-augmenting)
factors $\lambda_j > 0$ are functions of time. Thus, TC is embodied in all the inputs,
although one can set $\lambda_j = 1$ for some $j$ so that TC augments only some specific
inputs. For example, TC might be embodied only in new capital. In a way, the
FA approach is more intuitive because it shows that the shift in the technology
takes place because the efficiency of one or more inputs changes over time. For
example, if labor becomes more productive because of learning with experience, it
is natural to think that technical progress is due to labor becoming more productive
over time. TC is Harrod neutral if the production function is
$y = f(K, L, t) = f(K, \lambda_L(t) L)$, $\lambda_L > 1$. That is, under Harrod neutral
TC, only the efficiency of labor increases. Such an interpretation is not always
possible with the TT representation, which means that the TT and FA formulations
are not identical for all types of production function.
TC in the FA model is obtained from

\[
TC_p(x^*) = \partial \ln y / \partial t
= \sum_j \left( \partial \ln y / \partial \ln x_j^* \right) \left( \partial \ln x_j^* / \partial t \right)
= \sum_j TC_j = \sum_j E_j \dot{\lambda}_j \tag{6}
\]

where $E_j = \partial \ln y / \partial \ln x_j^*$ and $\dot{\lambda}_j = \partial \ln \lambda_j / \partial t$.

Thus, the overall TC can be decomposed into input-specific components. However,
the input-specific components cannot be identified for all types of production
functions (Cobb-Douglas (CD), for example).
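The identification problem in the CD case can be seen in a few lines. The sketch below assumes exponential augmentation, lambda_j(t) = exp(b_j t), and hypothetical elasticities; these are illustrative choices, not the chapter's specification. Two very different augmentation patterns produce exactly the same output path, so only the sum of alpha_j b_j can be recovered from data.

```python
import math

alpha = (0.3, 0.6)  # hypothetical CD output elasticities

def ln_y(x, t, b):
    """ln y for a CD function in efficiency units x_j* = exp(b_j * t) * x_j."""
    return sum(a * (math.log(xj) + bj * t) for a, xj, bj in zip(alpha, x, b))

def overall_tc(b):
    """Eq. (6): sum_j E_j * lambda_dot_j, with E_j = alpha_j in the CD case."""
    return sum(a * bj for a, bj in zip(alpha, b))

b_first_only = (0.04, 0.0)    # all augmentation in input 1
b_second_only = (0.0, 0.02)   # all augmentation in input 2
x = (5.0, 8.0)

# Largest difference in ln y between the two patterns over ten periods.
gap = max(abs(ln_y(x, t, b_first_only) - ln_y(x, t, b_second_only)) for t in range(10))
print(overall_tc(b_first_only), overall_tc(b_second_only), gap)
```

With a functional form whose output elasticities vary with the inputs, the two patterns would no longer be observationally equivalent.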

The Cost Function Approach

Duality in production theory shows that all the features of the production function
can be captured by a dual cost function under cost minimization behavior. So instead
of using a production function to represent the technology, one can also use a cost
function which is a function of input prices and output. To accommodate facilitating
inputs, we can add them separately and treat them like output variables which are
exogenous. Similar to the production function, to model TC, we append the shifter
as an argument in the cost function. In the TT approach, the shifter is t and the
neoclassical cost function (see, for example, Chambers [3], Fuss and McFadden [4]
for its properties) is

\[
C = C(w, y, t) \tag{7}
\]

where w is the input price vector. If the production function shifts upward (technical
progress), the cost function will shift downwards, ceteris paribus. Thus, it is natural
to define TCc as

\[
TC_c(w, y, t) = -\partial \ln C / \partial t \equiv -\dot{C}_t \tag{8}
\]

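where the subscript $c$ indicates that TC is based on a cost function formulation. A
positive (negative) value of TC indicates technical progress (regress). TC in a cost
function approach is the rate of cost diminution, and it is generally not the same as
TC from the production function unless the cost elasticity of output is unitary (i.e.,
$\partial \ln C / \partial \ln y = 1$). This can be shown as follows. The Lagrangian of the
cost minimization problem is

\[
\mathcal{L} = wx + \theta \left( y - f(x, t) \right)
\]

where $\theta$ is the Lagrange multiplier, whose optimum value is $\theta = \partial C / \partial y$.
Since at the optimum the value of the Lagrangian is the same as the optimum
cost, the envelope theorem gives

\[
\partial C(w, y, t) / \partial t = -\theta \, \partial f(x, t) / \partial t
= -(\partial C / \partial y) \, \partial f(x, t) / \partial t
\;\Rightarrow\;
\partial \ln f(x, t) / \partial t = TC_p(x, t) = TC_c(w, y, t) / E_{CY}, \tag{9}
\]

where $E_{CY} = \partial \ln C / \partial \ln y$ is the cost elasticity of output. The sign
follows from the definition in (8): dividing the first equality by $C$ gives
$\partial \ln C / \partial t = -E_{CY} \, \partial \ln f / \partial t$, and $TC_c = -\partial \ln C / \partial t$.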
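The link between the primal and dual measures can be verified numerically. The sketch below is an illustration under assumed ingredients: a two-input CD technology with Hicks-neutral trend exp(G t), hypothetical parameter values, and a simple ternary search as the cost minimizer. With TC_c defined as the rate of cost diminution in (8), it checks that TC from the production function equals TC_c divided by the cost elasticity of output.

```python
import math

A1, A2, G = 0.5, 0.3, 0.02   # hypothetical CD elasticities (RTS = 0.8) and rate of TC

def min_cost(w1, w2, y, t):
    """C(w, y, t): minimise w1*x1 + w2*x2 subject to exp(G*t) * x1**A1 * x2**A2 = y."""
    def cost(x1):
        x2 = (y * math.exp(-G * t) / x1 ** A1) ** (1.0 / A2)  # stay on the isoquant
        return w1 * x1 + w2 * x2
    lo, hi = 1e-6, 1e4
    for _ in range(300):  # ternary search; cost is convex in x1 along the isoquant
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if cost(m1) < cost(m2):
            hi = m2
        else:
            lo = m1
    return cost((lo + hi) / 2)

def dlnC_dt(w1, w2, y, t, h=1e-4):
    """Central difference of ln C with respect to t."""
    return (math.log(min_cost(w1, w2, y, t + h)) - math.log(min_cost(w1, w2, y, t - h))) / (2 * h)

def dlnC_dlny(w1, w2, y, t, h=1e-4):
    """Central difference of ln C with respect to ln y."""
    up, down = y * math.exp(h), y * math.exp(-h)
    return (math.log(min_cost(w1, w2, up, t)) - math.log(min_cost(w1, w2, down, t))) / (2 * h)

w1, w2, y, t = 2.0, 1.0, 10.0, 3.0
tc_c = -dlnC_dt(w1, w2, y, t)    # eq. (8): rate of cost diminution
e_cy = dlnC_dlny(w1, w2, y, t)   # cost elasticity of output
tc_p = G                         # d ln f / dt for this technology, inputs held fixed
print(round(tc_c, 4), round(e_cy, 4), round(tc_c / e_cy, 4))
```

Because returns to scale are below one in this example, the cost elasticity exceeds one and the dual measure is larger than the primal one.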


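In the GI approach, the cost function is expressed as

\[
C = C(w, y, A(t)) \tag{10}
\]

In the FA approach, the neoclassical cost function is

\[
C = C(w^*, y, t) \tag{11}
\]

where $w^* = Bw$, that is, $w_j^* = B_j w_j$ with $B_j = 1/\lambda_j$, so that time enters
through the effective input prices. TC is then

\[
TC_c(w^*, y, t) = -\partial \ln C / \partial t
= -\sum_j \left( \partial \ln C / \partial \ln w_j^* \right) \left( \partial \ln w_j^* / \partial t \right)
= \sum_j S_j^* \dot{\lambda}_j, \qquad S_j^* = \partial \ln C / \partial \ln w_j^*. \tag{12}
\]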

If there is technical progress, $\dot{\lambda}_j > 0$ and therefore TC will be positive.
Intuitively, technical progress in an FA model means an increase in the productivity
of input $j$ at the rate $\dot{\lambda}_j$, which is equivalent to a decrease in its
effective price at the rate $\dot{\lambda}_j$ and a decrease in cost at the rate
$\sum_j S_j^* \dot{\lambda}_j$.
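The share-weighted formula in (12) can be checked against a direct numerical derivative. The sketch assumes a CES cost function in effective prices and exponential augmentation lambda_j(t) = exp(b_j t) with b_j > 0 (rising efficiency); the functional form and all parameter values are hypothetical.

```python
import math

A = (0.4, 0.6)     # hypothetical CES distribution parameters
SIGMA = 0.5        # hypothetical elasticity of substitution
B = (0.03, 0.01)   # hypothetical augmentation rates: lambda_j(t) = exp(b_j * t)

def effective_prices(w, t):
    """w_j* = w_j / lambda_j(t)."""
    return [wj / math.exp(bj * t) for wj, bj in zip(w, B)]

def ln_cost(w, y, t):
    """ln of a CES cost function C = y * (sum_j a_j * w_j***(1 - sigma))**(1 / (1 - sigma))."""
    ws = effective_prices(w, t)
    agg = sum(a * p ** (1 - SIGMA) for a, p in zip(A, ws))
    return math.log(y) + math.log(agg) / (1 - SIGMA)

def shares(w, t):
    """S_j* = d ln C / d ln w_j* for the CES cost function above."""
    ws = effective_prices(w, t)
    terms = [a * p ** (1 - SIGMA) for a, p in zip(A, ws)]
    return [v / sum(terms) for v in terms]

w, y, t, h = (2.0, 1.5), 10.0, 4.0, 1e-5
tc_numeric = -(ln_cost(w, y, t + h) - ln_cost(w, y, t - h)) / (2 * h)  # -d ln C / dt
tc_formula = sum(s * b for s, b in zip(shares(w, t), B))               # sum_j S_j* b_j
print(tc_numeric, tc_formula)
```

Rising input efficiency lowers effective prices, so cost falls over time and the measure is positive.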

Profit Function

Following neoclassical duality theory, the technology for a profit maximizing firm
can be expressed as

\[
\pi = \pi(p, w, t) \tag{13}
\]

where $p$ is the output price and $w$ is the vector of input prices. See Chambers [3]
and Fuss and McFadden [4] for properties of profit functions. Here TC is defined as

\[
TC_\pi(p, w, t) = \partial \ln \pi / \partial t \equiv \dot{\pi}_t \tag{14}
\]

Again, the subscript $\pi$ on TC indicates that it is based on a profit function. Note
that to use the profit function approach for estimating TC, profit must be
positive, especially when one uses popular functional forms such as the CD or
translog.
Similar to the production function approach, technical progress (regress) will
mean a positive (negative) value of $\dot{\pi}_t$. The numerical value of TC in a profit
function is different from those in a production and cost function. The relationship
can easily be established. The Lagrangian for the profit maximization problem is
$\mathcal{L} = py - wx + \mu (y - f(x, t))$, where $\mu$ is the Lagrange multiplier. The
first-order condition with respect to $y$ gives $\mu = -p$. Using the envelope theorem
and the fact that $\pi(p, w, t) = py - wx$ at the optimum, we get
$\partial \pi(w, p, t) / \partial t = -\mu \, \partial f(x, t) / \partial t \Rightarrow
\pi \, \partial \ln \pi(w, p, t) / \partial t = py \, \partial \ln f(x, t) / \partial t$. Thus,

\[
TC_\pi(p, w, t) = (R/\pi) \, TC_p(x, t) = E_{\pi p} \, TC_p(x, t)
\]

where $R = py$ is revenue and $E_{\pi p} = \partial \ln \pi(w, p, t) / \partial \ln p = py/\pi$
using Hotelling’s lemma.
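A one-input example makes the scaling by R/pi concrete. The sketch assumes the technology y = exp(G t) x^ALPHA with decreasing returns (so that maximum profit is positive) and hypothetical prices; the closed-form input demand comes from the first-order condition p f_x = w. The rate of change of maximum profit over time is compared with R/pi times the primal TC.

```python
import math

ALPHA, G = 0.6, 0.02   # hypothetical output elasticity (decreasing RTS) and rate of TC

def optimum(p, w, t):
    """Profit-maximising input, output and profit for y = exp(G * t) * x**ALPHA."""
    x = (ALPHA * p * math.exp(G * t) / w) ** (1.0 / (1.0 - ALPHA))  # from p * f_x = w
    y = math.exp(G * t) * x ** ALPHA
    return x, y, p * y - w * x

p, w, t, h = 3.0, 1.0, 2.0, 1e-5
_, y, profit = optimum(p, w, t)

# d ln pi / dt by central difference, prices held fixed (eq. 14).
tc_profit = (math.log(optimum(p, w, t + h)[2]) - math.log(optimum(p, w, t - h)[2])) / (2 * h)
revenue_share = p * y / profit   # R / pi, equal to E_pi_p by Hotelling's lemma
tc_prod = G                      # d ln f / dt with x held fixed
print(tc_profit, revenue_share * tc_prod)
```

Since revenue exceeds profit, the profit-based measure is larger than the primal one, here by a factor of 1/(1 - ALPHA).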

In the GI model, the profit function can be expressed as $\pi = \pi(p, w, A(t))$, where
$A(t)$ is an index of time dummies. TC can then be defined as

\[
TC_\pi(p, w, A(t)) = \Delta \ln \pi(w, p, A(t)) = \ln \pi(w, p, A(t)) - \ln \pi(w, p, A(t-1)). \tag{15}
\]

For this, profit has to be positive. Since in reality profit can be negative for many
firms, the profit function is not very popular. Furthermore, it requires information on
both output and input prices, which is often difficult to get, and without enough
variation in prices the profit function cannot be reliably estimated.
In the FA approach, we write the profit function as $\pi = \pi(p, w^*)$, where
$w_j^* = w_j B_j$, and TC is

\[
\partial \ln \pi / \partial t = TC_\pi(p, w^*)
= \sum_j \left( \partial \ln \pi / \partial \ln w_j^* \right) \left( \partial \ln w_j^* / \partial t \right)
= \sum_j PS_j \dot{\lambda}_j, \qquad PS_j = w_j x_j / \pi \tag{16}
\]

Price Function Approach

Jorgenson [5] suggested the use of a price function approach to model TC. The
price function is based on the long-run equilibrium condition of zero profit (revenue
equals cost under constant returns to scale). Jorgenson [6] shows that all the features
of the underlying technology can be obtained from the price function. Thus, this
approach is useful for estimating aggregate TC along with other features of the
aggregate technology. In this approach, output price is specified as a function of
input prices and the time trend. TC is then defined as the negative of the output
price change between two time periods, holding everything else unchanged. See Jin
and Jorgenson [7] for details. In many micro applications, the constant returns to
scale assumption is not the norm, so this approach is not very useful for modeling TC
for individual producers who may be operating under increasing or decreasing RTS.

Specification and Estimation Issues

Production Function Approach

Time Driven Disembodied TC

Here the production function is $y = f(x, t)$. We start our discussion with a Cobb-
Douglas (CD) formulation. Although this is much simpler and more restricted than
a flexible functional form, such as the translog, the parameters in the CD formulation
have economic meaning and are therefore easy to interpret. Thus, it might be helpful
in understanding the issues. Because of this, the CD formulation is still used in many
applications. We skip the firm and time subscripts to avoid notational clutter and use
them only when absolutely necessary.
In specifying the technology, assumptions are often made on RTS. For the
production function in (1), it is defined as $RTS = \sum_j \partial \ln y / \partial \ln x_j$,
which is the percentage change in output for a change in all the inputs by a constant
percentage. Thus, the question is: if the scale of operation is increased by a given
percentage, by what percentage will output increase? If all the inputs are scaled by
a factor $\rho > 1$ and output changes by the factor $\rho^h$, then RTS is said to be
increasing (decreasing) if $h > 1$ ($< 1$). Mathematically, if the production function
in (1) is homogeneous of degree $h$, RTS equals $h$. RTS is unitary when $h = 1$. If $h$
is constant, meaning it is the same for all levels of inputs (the scale is invariant to
the input quantities), RTS is constant. However, here we separate constant RTS
(non-observation specific) from unitary RTS (constant and equal to one). RTS is
increasing (IRS) if RTS > 1. Similarly, it is decreasing (DRS) if RTS < 1. Finally,
the case RTS = 1, conventionally called constant returns to scale (CRTS), is what we
call unitary RTS. RTS for a firm may not be increasing, decreasing, or constant for
all levels of output, unless the underlying functional form is restrictive. For a more
flexible function, RTS is allowed to change with the output level, which is more
intuitive.

The Cobb-Douglas Case: Time Trend Formulation

\[
\ln y = \alpha_0 + \sum_j \alpha_j \ln x_j + \alpha_t t, \qquad
TC = \dot{A}_t = \alpha_t \quad \text{and} \quad RTS = \sum_j \alpha_j \tag{17}
\]

Both are constant for all $t$ and for all firms. Furthermore, TC is neutral because
it is not related to any input. Similarly, RTS is constant but not necessarily unitary.
TC can be biased towards some inputs: non-neutral technical change affects the
productivity (efficiency) of different inputs differently. More specifically, TC is
said to be biased towards input $j$ if its share in total cost increases over time.
Alternatively, TC is said to be input $j$ using (saving) if
$B_j(t) = \partial S_j / \partial t > 0$ ($< 0$), where $S_j = \partial \ln y / \partial \ln x_j$.
TC is neutral if $B_j(t) = 0$ for all $j$. That is, if TC is neutral, it is not biased
towards any particular input. Since $S_j = \alpha_j$ does not depend on $t$, TC in a CD
model is always neutral.
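To make the CD time trend model concrete, the sketch below simulates a small panel and recovers TC and RTS by ordinary least squares. Everything here is an assumption for illustration: the parameter values, the data-generating process, the use of OLS (which ignores the possible endogeneity of inputs), and the small normal-equations solver written so that only the standard library is needed.

```python
import math
import random

def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y, solved by Gauss-Jordan elimination."""
    k = len(X[0])
    a = [[sum(r[i] * r[j] for r in X) for j in range(k)] + [sum(r[i] * v for r, v in zip(X, y))]
         for i in range(k)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(a[r][c]))   # partial pivoting
        a[c], a[piv] = a[piv], a[c]
        for r in range(k):
            if r != c:
                m = a[r][c] / a[c][c]
                a[r] = [u - m * v for u, v in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]

random.seed(0)
TRUE = {"a0": 1.0, "a1": 0.3, "a2": 0.5, "at": 0.02}   # hypothetical parameters
X, y = [], []
for firm in range(50):
    for t in range(1, 11):
        l1, l2 = random.uniform(0, 3), random.uniform(0, 3)   # ln x1, ln x2
        X.append([1.0, l1, l2, float(t)])
        y.append(TRUE["a0"] + TRUE["a1"] * l1 + TRUE["a2"] * l2 + TRUE["at"] * t
                 + random.gauss(0, 0.01))

a0, a1, a2, at = ols(X, y)
tc, rts = at, a1 + a2   # eq. (17): both are constants in the CD model
print(round(tc, 4), round(rts, 4))
```

The estimated TC is a single number for all firms and periods, which is exactly the restriction that the flexible forms discussed later relax.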
Instead of using the time trend, one can use time dummies and write the
production function as

\[
\ln y = \alpha_0 + \sum_j \alpha_j \ln x_j + \sum_t A(t) DT_t \tag{18}
\]

where $A(t)$ are parameters and $DT_t$ are time dummies. Although we are using the
terminology general index for a CD function, the model with time dummies and the
GI formulation are identical. This is not the case when one uses a translog function,
which we discuss later. In the CD case, $TC = A(t) - A(t-1)$, which is time-varying
but the same for all firms. As before, the bias is zero for all inputs.

The Generalized GI Model

The above GI model can be generalized further. Introducing the firm and time
subscripts $i$ and $t$, the CD production function is written as

\[
\ln y_{it} = \alpha_0 + \sum_j \alpha_j \ln x_{jit} + \theta(i)\, t \tag{19}
\]

where $\theta(i)$ are firm-specific coefficients (which can be captured by introducing
firm dummies $D_i$, i.e., $\theta(i) = \sum_i \mu_i D_i$). Here the coefficient of the time
trend variable varies with the firm. Thus, TC is firm-specific but time invariant, viz.,

\[
TC_i = \theta(i) \quad \forall t \tag{20}
\]

It is neutral, and the bias is zero for all inputs.

One can do the same for the model with time dummies, that is,

\[
\ln y_{it} = \alpha_0 + \sum_j \alpha_j \ln x_{jit} + \theta(i) A(t) \tag{21}
\]

where the interaction of $\theta(i)$ and $A(t)$, built from the firm and time dummies,
means that the coefficient of time for each period is not only firm-specific but also
differs from one year to another. As a result, TC is both firm-specific and time
varying, viz.,

\[
TC_{it} = \theta(i) \left[ A(t) - A(t-1) \right]. \tag{22}
\]

Although this generalizes the TT formulation, the bias is still zero for all inputs.
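A short sketch of (22) shows what the multiplicative structure implies. The firm loadings and index values are hypothetical. Every firm shares the same time pattern A(t) - A(t-1), scaled by its own theta(i), so TC paths are proportional across firms.

```python
theta = {"firm_A": 1.0, "firm_B": 0.6, "firm_C": 1.4}   # hypothetical firm-specific loadings
A = [0.00, 0.02, 0.05, 0.04]                             # hypothetical index A(t), t = 1..4

def tc_table(theta, A):
    """TC_it = theta_i * (A(t) - A(t-1)) for t = 2..T, as in eq. (22)."""
    steps = [cur - prev for prev, cur in zip(A, A[1:])]
    return {firm: [th * s for s in steps] for firm, th in theta.items()}

tc = tc_table(theta, A)
for firm, path in tc.items():
    print(firm, [round(v, 4) for v in path])
```

With positive loadings, all firms experience progress or regress in the same periods, which is a restriction worth keeping in mind when interpreting estimates from this model.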
We do not discuss the factor-augmenting approach for the CD case mainly
because in most cases, these models will be identical to the TT case.
Given the limitations of the CD functional form, we now move to the flexible
functional form (translog) in modeling TC. First, we consider the production
function approach [8].

The Translog TT Model

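\[
\ln y = \alpha_0 + \sum_j \alpha_j \ln x_j + \alpha_t t
+ \frac{1}{2} \left[ \sum_j \sum_k \alpha_{jk} \ln x_j \ln x_k + \alpha_{tt} t^2 \right]
+ \sum_j \alpha_{jt} \ln x_j \, t \tag{23}
\]

\[
TC = \partial \ln y / \partial t = \alpha_t + \alpha_{tt} t + \sum_j \alpha_{jt} \ln x_j \tag{24}
\]

\[
RTS = \sum_j \partial \ln y / \partial \ln x_j
= \sum_j \left( \alpha_j + \sum_k \alpha_{jk} \ln x_k + \alpha_{jt} t \right) \tag{25}
\]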

Note that TC is observation-specific because the inputs are observation-specific.

Further,
TC can be decomposed into a neutral (α t + α tt t) and a non-neutral part
( j α jt lnxj ). As the name suggests, the neutral part is not associated with any
particular input, while the non-neutral part is associated with inputs, that is, it
depends on the level of each input. Thus, unless two otherwise identical producers
use the same levels of inputs (meaning that they are identical), TC will be different.
Finally, we can check whether TC is biased towards a particular input or not from
the bias measure

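$$\text{Bias}_j = \partial S_j/\partial t = \alpha_{jt}, \tag{26}$$

where $S_j = \alpha_j + \sum_k \alpha_{jk} \ln x_k + \alpha_{jt} t$.
Thus, the bias is input specific but, being the constant $\alpha_{jt}$, is neither time varying nor firm-specific. RTS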
in (25) is both firm-specific and time varying. This means that RTS for a firm can
change over time from increasing to constant to decreasing. The change in RTS is
likely to be affected by a change in the input quantities over time.
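The formulas in (24)–(26) can be evaluated directly once coefficient estimates are available. The following is a minimal sketch in Python; the function name `translog_tt_measures` and all coefficient values are hypothetical and chosen only for illustration, not taken from any estimated model.

```python
import math

def translog_tt_measures(alpha, alpha_jk, alpha_t, alpha_tt, alpha_jt, x, t):
    """Return (TC, RTS, bias) for the translog TT model, Eqs. (24)-(26)."""
    ln_x = [math.log(v) for v in x]
    J = len(x)
    # Eq. (24): neutral part plus non-neutral part
    tc = alpha_t + alpha_tt * t + sum(alpha_jt[j] * ln_x[j] for j in range(J))
    # Eq. (25): RTS is the sum of the output elasticities of the inputs
    elasticities = [
        alpha[j] + sum(alpha_jk[j][k] * ln_x[k] for k in range(J)) + alpha_jt[j] * t
        for j in range(J)
    ]
    rts = sum(elasticities)
    # Eq. (26): the bias for input j is the constant alpha_jt
    bias = list(alpha_jt)
    return tc, rts, bias

# Hypothetical coefficients for a two-input technology
alpha = [0.6, 0.3]
alpha_jk = [[0.02, -0.01], [-0.01, 0.03]]   # symmetric
alpha_jt = [0.004, -0.002]
tc, rts, bias = translog_tt_measures(
    alpha, alpha_jk, alpha_t=0.01, alpha_tt=0.001, alpha_jt=alpha_jt,
    x=[math.e, 1.0], t=5,
)
print(tc, rts, bias)
```

With these illustrative numbers TC and RTS change with the input levels and with t, while the bias vector does not.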

The Translog GI Model

$$\ln y = \alpha_0 + \sum_j \alpha_j \ln x_j + A(t) + \frac{1}{2}\sum_j \sum_k \alpha_{jk} \ln x_j \ln x_k + \sum_j \alpha_{jt} \ln x_j\, A(t) \tag{27}$$

This is called a GI model because, instead of using t as a continuous variable,
an index $A(t) = \sum_t a_t DT_t$, t = 2, . . . , T, is defined using the time dummies
($DT_t$), which is then used as a regressor in the translog function. The squares of
A(t) are not used because squares of the time dummies are the same as the time
dummies themselves. The translog GI model is not the same as a translog model
with time dummies, because in the GI model the function A(t) is used as a covariate
and its parameters do not change when moving from A(t) to its interactions with
ln xj variables. Consequently, the translog GI model is inherently nonlinear (in
parameters), whereas a translog model with time dummies is linear (in parameters).
TC for the GI model is
$$TC = [A(t) - A(t-1)]\Big[1 + \sum_j \alpha_{jt} \ln x_j\Big] \tag{28}$$

which is non-neutral, and the bias is

$$B_j = \alpha_{jt}\,[A(t) - A(t-1)] \tag{29}$$

which is input specific and varies over time.


Finally, RTS is

$$RTS = \sum_j \partial \ln y/\partial \ln x_j = \sum_j \Big[\alpha_j + \sum_k \alpha_{jk} \ln x_k + \alpha_{jt} A(t)\Big] \tag{30}$$
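As an illustration of (28) and (29), the sketch below computes TC and the input biases from a given index A(t). The function name and the values of A(t) and $\alpha_{jt}$ are hypothetical; in practice they would come from the estimated model.

```python
import math

def translog_gi_measures(A, alpha_jt, x, t):
    """TC and input biases in the translog GI model, Eqs. (28)-(29).

    A maps a period to the value of the index A(t).
    """
    dA = A[t] - A[t - 1]                      # change in the general index
    ln_x = [math.log(v) for v in x]
    tc = dA * (1.0 + sum(a * lx for a, lx in zip(alpha_jt, ln_x)))   # Eq. (28)
    bias = [a * dA for a in alpha_jt]                                 # Eq. (29)
    return tc, bias

A = {1: 0.0, 2: 0.02, 3: 0.05}               # hypothetical index values
tc, bias = translog_gi_measures(A, alpha_jt=[0.1, -0.2], x=[math.e, 1.0], t=3)
print(tc, bias)
```

Because the biases are scaled by A(t) − A(t − 1), they change from period to period even though the $\alpha_{jt}$ are constants.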

Factor-Augmenting Approach
The translog FA model is

$$\ln y = \alpha_0 + \sum_j \alpha_j (\ln x_j + \ln \lambda_j) + \frac{1}{2}\sum_j \sum_k \alpha_{jk} (\ln x_j + \ln \lambda_j)(\ln x_k + \ln \lambda_k) \tag{31}$$

If we assume that $\ln \lambda_j = b_j t$ and $b_j > 0$, then the efficiency of input j increases over
time at the rate $b_j$.

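$$TC = \partial \ln y/\partial t = \sum_j \Big[\alpha_j + \sum_r \alpha_{jr} (\ln x_r + \ln \lambda_r)\Big]\, \partial \ln \lambda_j/\partial t = \sum_j S_j b_j \tag{32}$$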

$$\text{Bias}_j = \partial S_j/\partial t = \sum_r \alpha_{jr} (\partial \ln \lambda_r/\partial t) = \sum_r \alpha_{jr} b_r \tag{33}$$

which is a constant but differs across inputs.

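$$RTS = \sum_j \partial \ln y/\partial \ln x_j = \sum_j (\partial \ln y/\partial \ln x_j^*)(\partial \ln x_j^*/\partial \ln x_j) = \sum_j \Big[\alpha_j + \sum_r \alpha_{jr} (\ln x_r + \ln \lambda_r)\Big] = \sum_j S_j \tag{34}$$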

which is observation specific. Thus, RTS changes with input levels and is noncon-
stant unless restrictions are imposed on the parameters.
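The relation TC = Σ_j S_j b_j in (32) can be checked numerically against a finite difference of ln y with respect to t. The sketch below assumes $\ln \lambda_j = b_j t$ and a symmetric $\alpha_{jk}$ matrix; the function names and all parameter values are hypothetical.

```python
import math

def ln_y_fa(alpha0, alpha, alpha_jk, b, x, t):
    """Translog FA production function, Eq. (31), with ln(lambda_j) = b_j * t."""
    z = [math.log(xj) + bj * t for xj, bj in zip(x, b)]   # ln x_j + ln lambda_j
    J = len(z)
    quad = sum(alpha_jk[j][k] * z[j] * z[k] for j in range(J) for k in range(J))
    return alpha0 + sum(a * zj for a, zj in zip(alpha, z)) + 0.5 * quad

def fa_measures(alpha, alpha_jk, b, x, t):
    """Return (TC, bias, RTS) from Eqs. (32)-(34)."""
    J = len(x)
    z = [math.log(x[j]) + b[j] * t for j in range(J)]
    S = [alpha[j] + sum(alpha_jk[j][r] * z[r] for r in range(J)) for j in range(J)]
    tc = sum(S[j] * b[j] for j in range(J))                                    # Eq. (32)
    bias = [sum(alpha_jk[j][r] * b[r] for r in range(J)) for j in range(J)]   # Eq. (33)
    rts = sum(S)                                                               # Eq. (34)
    return tc, bias, rts

# Hypothetical parameter values
alpha0, alpha = 1.0, [0.6, 0.3]
alpha_jk = [[0.02, -0.01], [-0.01, 0.03]]   # symmetric
b = [0.01, 0.02]                            # input-specific augmentation rates
x, t = [2.0, 3.0], 4.0

tc, bias, rts = fa_measures(alpha, alpha_jk, b, x, t)

# Central finite difference of ln y with respect to t, inputs held fixed
h = 1e-3
tc_fd = (ln_y_fa(alpha0, alpha, alpha_jk, b, x, t + h)
         - ln_y_fa(alpha0, alpha, alpha_jk, b, x, t - h)) / (2 * h)
print(tc, tc_fd, bias, rts)
```

Since ln y is quadratic in t under this specification, the central difference agrees with the analytic TC up to rounding error.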

Estimation

We now introduce the firm and time subscript, i and t.

Production Function Models

The TT Model
The TT model (CD) is

$$\ln y_{it} = \alpha_0 + \sum_j \alpha_j \ln x_{jit} + \alpha_t t + v_{it} \tag{35}$$

where v is the error term. Assuming the input variables are exogenous, as is
usually done in the literature, OLS can be used to estimate the parameters.
If the input variables are not exogenous/predetermined, we need to use instrumental
variables to take care of the endogeneity. Once the parameters are estimated, we can
get an estimate of TC from the coefficient of the time trend variable (t), that is,

$$TC = \alpha_t \tag{36}$$

$$RTS = \sum_j \alpha_j \tag{37}$$

Both TC and RTS are constant.
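A minimal sketch of this estimation step on simulated data is given below. It assumes exogenous inputs, as in the text, and uses NumPy's least-squares routine; the sample sizes, the "true" parameter values, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, J = 50, 10, 2                               # firms, periods, inputs (hypothetical)
t = np.tile(np.arange(1, T + 1), N).astype(float) # time trend for each observation
ln_x = rng.normal(size=(N * T, J))                # exogenous log inputs

# Simulated CD technology with a time trend, Eq. (35)
alpha_true = np.array([0.6, 0.3])
alpha_t_true = 0.02
v = rng.normal(scale=0.05, size=N * T)
ln_y = 1.0 + ln_x @ alpha_true + alpha_t_true * t + v

# OLS: regress ln y on a constant, the log inputs, and the time trend
X = np.column_stack([np.ones(N * T), ln_x, t])
coef, *_ = np.linalg.lstsq(X, ln_y, rcond=None)

tc_hat = coef[-1]               # Eq. (36): TC is the trend coefficient
rts_hat = coef[1:1 + J].sum()   # Eq. (37): RTS is the sum of the input coefficients
print(tc_hat, rts_hat)
```

If the inputs were endogenous, the least-squares step would have to be replaced by an instrumental variables estimator, which this sketch does not cover.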

The GI Model
Here the model is

$$\ln y_{it} = \alpha_0 + \sum_j \alpha_j \ln x_{jit} + A(t) + v_{it} \tag{38}$$

which can be estimated using OLS with time-dummies. TC is then estimated from

$$TC_t = A(t) - A(t-1) \tag{39}$$

$$RTS = \sum_j \alpha_j \tag{40}$$

Here, TC is time varying but exactly the same for all firms. On the other hand,
RTS is the same for all firms in every year.
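The GI version can be sketched in the same way by replacing the trend with time dummies for t = 2, . . . , T, so that the first period serves as the base. The simulated values of A(t) and all names below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 200, 5                                   # firms and periods (hypothetical)
period = np.tile(np.arange(1, T + 1), N)        # period label of each observation
ln_x = rng.normal(size=(N * T, 2))

# Hypothetical general index; the base-period value is absorbed by the intercept
A_true = np.array([0.0, 0.03, 0.04, 0.08, 0.09])
ln_y = (1.0 + ln_x @ np.array([0.6, 0.3]) + A_true[period - 1]
        + rng.normal(scale=0.05, size=N * T))

# Time dummies DT_t for t = 2, ..., T
D = (period[:, None] == np.arange(2, T + 1)[None, :]).astype(float)
X = np.column_stack([np.ones(N * T), ln_x, D])
coef, *_ = np.linalg.lstsq(X, ln_y, rcond=None)

A_hat = np.concatenate([[0.0], coef[3:]])       # estimated A(t), base period set to 0
tc_hat = np.diff(A_hat)                         # Eq. (39): TC_t = A(t) - A(t-1)
rts_hat = coef[1:3].sum()                       # Eq. (40)
print(tc_hat, rts_hat)
```

Unlike the trend specification, the estimated TC here can differ freely from one period to the next, but it is still common to all firms.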

Extensions of TT and GI Models

Here the model is

$$\ln y_{it} = \alpha_0 + \sum_j \alpha_j \ln x_{jit} + \theta(i)\, A(t) + v_{it} \tag{41}$$

Since θ(i) and A(t) are firm-specific and time-specific parameters, respectively, and
enter the model as a product, the model is nonlinear in parameters and can be
estimated using nonlinear least squares. However, we need to set θ(i) = 1 for one i
for identification. Once the parameters are estimated, TC can be obtained from

$$TC_{it} = \theta(i)\,[A(t) - A(t-1)] \tag{42}$$

which is both firm and time-specific. However, RTS = $\sum_j \alpha_j$ is the same for all
observations.
The main problem with the CD models in all the above cases is that RTS does
not vary across firms and over time. This is very restrictive. One can easily avoid
this problem by using a flexible functional form such as the translog.

The TT Model (Translog)

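$$\ln y_{it} = \alpha_0 + \sum_j \alpha_j \ln x_{jit} + \alpha_t t + \frac{1}{2}\Big[\sum_j \sum_k \alpha_{jk} \ln x_{jit} \ln x_{kit} + \alpha_{tt} t^2\Big] + \sum_j \alpha_{jt} \ln x_{jit}\, t + v_{it} \tag{43}$$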

The TT model, like the CD model, can be estimated using OLS, assuming that
the inputs are not correlated with the noise term. Once the parameters are estimated,
TC, RTS, Bias, etc., can be computed using the formulas discussed earlier (Eqs.
(24)–(26)).

The GI Model
The translog GI model adds a noise (error) term.

$$\ln y_{it} = \alpha_0 + \sum_j \alpha_j \ln x_{jit} + A(t) + \frac{1}{2}\sum_j \sum_k \alpha_{jk} \ln x_{jit} \ln x_{kit} + \sum_j \alpha_{jt} \ln x_{jit}\, A(t) + v_{it} \tag{43a}$$

One of the parameters in $A(t) = \sum_t a_t DT_t$ is to be normalized to 1 for identification.
Since A(t) appears interactively with $\ln x_j$, the model is nonlinear and can be
estimated using nonlinear least squares (NLS). Once the parameters are estimated,
TC, RTS, Bias, etc., can be computed using the formulas discussed earlier (Eqs.
(28)–(30)).

Factor Augmenting TT Model

$$\ln y_{it} = \alpha_0 + \sum_j \alpha_j (\ln x_{jit} + \ln \lambda_{jt}) + \frac{1}{2}\sum_j \sum_k \alpha_{jk} (\ln x_{jit} + \ln \lambda_{jt})(\ln x_{kit} + \ln \lambda_{kt}) + v_{it} \tag{44}$$

If we assume that ln λjt = bj t, that is, factor augmentation is input specific and
time varying, the above model becomes nonlinear and can be estimated using NLS.
Once the parameters are estimated, TC, RTS, Bias, etc., can be computed using the
formulas discussed earlier (Eqs. (32)–(34)).

Factor Augmenting GI Model

The model is similar to the TT model above, except that the factor augmentation is
modeled differently, that is,

$$\ln y_{it} = \alpha_0 + \sum_j \alpha_j (\ln x_{jit} + \ln \lambda_{jt}) + \frac{1}{2}\sum_j \sum_k \alpha_{jk} (\ln x_{jit} + \ln \lambda_{jt})(\ln x_{kit} + \ln \lambda_{kt}) + v_{it} \tag{45}$$

where ln λjt = bj (t) = bj (DTt ). Since DTt are time dummies and bj (t) are both input
specific and time varying parameters, the above model is nonlinear and NLS can be
used to estimate the parameters. Once the parameters are estimated, TC, RTS, Bias,
etc., can be computed from

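$$TC_{it} = \sum_j \alpha_j [b_j(t) - b_j(t-1)] + \frac{1}{2}\sum_j \sum_k \alpha_{jk} (\ln x_{jit} + \ln \lambda_{jt})\{b_k(t) - b_k(t-1)\} + \frac{1}{2}\sum_j \sum_k \alpha_{jk} \{\ln x_{kit} + \ln \lambda_{kt}\}[b_j(t) - b_j(t-1)] \tag{46}$$

$$S_{jt} = \alpha_j + \sum_k \alpha_{jk} \{\ln x_{kit} + \ln \lambda_{kt}\} \tag{47}$$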

and the bias is

$$B_{jt} = \Delta S_{jt} = \sum_k \alpha_{jk} [b_k(t) - b_k(t-1)] \tag{48}$$

which is input specific and varies over time. Finally, RTS is

$$RTS = \sum_j \Big[\alpha_j + \sum_k \alpha_{jk} \{\ln x_{kit} + \ln \lambda_{kt}\}\Big] \tag{49}$$

Cost Function Approach

Time Trend Formulation in Generalized Leontief Cost Function

One can use many different parametric forms of the cost function, although the
translog is the most popular. The generalized Leontief (GL) function, which is
also flexible, is often used. The generalized Leontief (GL) cost function was first
introduced by Diewert [9]. RTS in this chapter was assumed to be unitary. The cost
function was specified as shown below (the time trend is added to allow technical
change).
$$C(w, y, t) = y\Big[\sum_j \sum_k b_{jk} w_j^{1/2} w_k^{1/2} + \sum_j b_{jt} w_j t\Big] \tag{50a}$$

with $b_{jk} = b_{kj}$, j, k = 1, . . . , J. TC in this formulation is $TC_c = y \sum_j b_{jt} w_j$, which
is quite restrictive compared to the more flexible specification introduced later. The
GL cost function is further generalized in Diewert and Wales [10] in which the
generalized GL cost function (double generalized) is

$$C(w, y, t) = \sum_j \sum_k b_{jk} w_j^{1/2} w_k^{1/2}\, y + \sum_j b_j w_j + \sum_j b_{jt} w_j t y + b_t \Big(\sum_j \alpha_j w_j\Big) t + b_{yy} \Big(\sum_j \beta_j w_j\Big) y^2 + b_{tt} \Big(\sum_j \gamma_j w_j\Big) t^2 y \tag{50b}$$

with $b_{jk} = b_{kj}$, j, k = 1, . . . , J. Note that the cost function is globally concave only
if all $b_{jk}$ for j ≠ k are constrained to be non-negative. See Diewert and
Wales [10] for identifying restrictions on the parameters. Since this condition is
too restrictive, Diewert and Wales [10] did not recommend its use in empirical
applications. There are other forms of the GL cost function which are not fully
flexible (constant RTS imposed) that are used in the literature.

$$C(w, y, t) = y\Big[\sum_j \sum_k b_{jk} w_j^{1/2} w_k^{1/2} + \sum_j b_{jt} w_j t + \sum_j \gamma_{jt} w_j t^2\Big] \tag{51}$$

Another form suggested by Diewert and Wales [10] is the Normalized quadratic
cost function
$$C(w, y, t) = y\Big\{\sum_j \beta_j w_j + g(w) + \sum_j \beta_{jt} w_j t + \sum_j \gamma_{jt} w_j t^2\Big\} \tag{52}$$

where $g(w) = w'Bw/(2\theta'w)$, $B = [\beta_{jk}]$ is a J × J symmetric matrix, and
$\theta = [\theta_1, \ldots, \theta_J] > 0$ is a vector of non-negative predetermined constants, not
all zero. In order to identify all the parameters in B, it is assumed that $Bw^* = 0$ for
some chosen $w^*$.
TC for each of the above cost functions is [∂C(w, y, t)/∂t]/C(w, y, t).

Time Trend and General Index Models (Translog)

The translog cost function for the TT model can be written as

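$$\ln C = \alpha_0 + \sum_j \alpha_j \ln w_j + \alpha_y \ln y + \alpha_t t + \frac{1}{2}\Big\{\sum_j \sum_k \alpha_{jk} \ln w_j \ln w_k + \alpha_{yy} \ln y \ln y + \alpha_{tt} t^2\Big\} + \sum_j \alpha_{jy} \ln w_j \ln y + \sum_j \alpha_{jt} \ln w_j\, t + \alpha_{yt} \ln y\, t + v. \tag{53}$$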

The cost function above is assumed to satisfy linear homogeneity (in w) and
symmetry restrictions. These are: $\sum_j \alpha_j = 1$, $\sum_j \alpha_{jk} = 0\ \forall k$, $\sum_j \alpha_{jy} = 0$,
$\sum_j \alpha_{jt} = 0$, and $\alpha_{jk} = \alpha_{kj}$.
One way of generalizing this TT model is to replace the trend variable t by A(t)
where A(t) (t = 1, . . . , T) has one parameter for each time period. This model is
labeled as the general index (GI) model by Baltagi and Griffin [11]. Thus, assuming
a translog form, the GI model can be written as

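$$\ln C = \alpha_0 + \sum_j \alpha_j \ln w_j + \alpha_y \ln y + A(t) + \frac{1}{2}\Big\{\sum_j \sum_k \alpha_{jk} \ln w_j \ln w_k + \alpha_{yy} \ln y \ln y\Big\} + \sum_j \alpha_{jy} \ln w_j \ln y + \sum_j \alpha_{jt} \ln w_j\, A(t) + \alpha_{yt} \ln y\, A(t) + v \tag{54}$$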

The above cost function is assumed to satisfy linear homogeneity (in w) and
symmetry restrictions. These restrictions are the same as before.
Technical change in the TT (TC_TT0) and GI (TC_GI0) models can be expressed
as

$$TC\_TT0 = -\Big[\alpha_t + \alpha_{tt} t + \sum_j \alpha_{jt} \ln w_j + \alpha_{yt} \ln y\Big], \tag{55}$$

$$TC\_GI0 = -\{A(t) - A(t-1)\}\Big[1 + \sum_j \alpha_{jt} \ln w_j + \alpha_{yt} \ln y\Big]. \tag{56}$$

Both measures of TC are firm-specific and time-varying.

It is to be noted that there are restrictions built-in on the nature of TC in the
TT model. First, the pure (neutral) component, −{α t + α tt t}, either increases or
decreases linearly. Second, with unbalanced panel data, it is not clear whether the
trend variable for a firm entering in period τ (1 < τ < T) should start from τ or
be rescaled to start from 1. Both of these problems are avoided in the GI model by
estimating one parameter for each time period in A(t).
In translog models, TC can be biased towards a particular input(s). For
input j (j = 1, . . . , J) bias in TC can be measured from IBj = ∂Sj /∂t, where
Sj = ∂ln C/∂ln wj = wj xj /C is the cost share of input j [12]. A positive (negative)
value of IBj indicates that TC is relatively jth input using (saving). A zero value of
IBj implies that TC is neutral (not biased towards any particular input). For example, TC is
skill biased if the cost share of skilled workers increases over time.
In the TT model, IBj = α jt which is a constant for all time periods, and its sign is
determined by the sign of α jt . Thus, although the cost function representing the TT
model is flexible, measures of technological biases (inputs and scale) derived from it
are neither firm-specific nor time-varying. Stevenson [12] avoided this problem in a
TT model by appending some third order terms in the translog cost function. These
additional terms allow technological biases to change among firms and over time.
However, in the GI model, IBj = α jt [A(t) − A(t − 1)] varies over time. The sign of
IBj is the same (opposite) as that of A(t) − A(t − 1), if α jt is positive (negative). RTS
can be obtained from

$$RTS\_TT0^{-1} = \alpha_y + \alpha_{yy} \ln y + \sum_j \alpha_{jy} \ln w_j + \alpha_{yt} t \tag{57}$$

$$RTS\_GI0^{-1} = \alpha_y + \alpha_{yy} \ln y + \sum_j \alpha_{jy} \ln w_j + \alpha_{yt} A(t) \tag{58}$$

in the TT and GI models, respectively.
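For the TT cost model, (55) and (57) translate into a few lines of code. The sketch below uses hypothetical coefficients that satisfy the homogeneity restrictions $\sum_j \alpha_{jt} = 0$ and $\sum_j \alpha_{jy} = 0$; the function name and the values are illustrative only.

```python
import math

def translog_cost_tt0(p, w, y, t):
    """TC, RTS, and input biases in the translog TT cost model, Eqs. (55) and (57)."""
    ln_w = [math.log(v) for v in w]
    ln_y = math.log(y)
    # Eq. (55): TC is minus the derivative of ln C with respect to t
    tc = -(p["a_t"] + p["a_tt"] * t
           + sum(a * lw for a, lw in zip(p["a_jt"], ln_w))
           + p["a_yt"] * ln_y)
    # Eq. (57): the inverse of RTS is the cost elasticity of output
    inv_rts = (p["a_y"] + p["a_yy"] * ln_y
               + sum(a * lw for a, lw in zip(p["a_jy"], ln_w))
               + p["a_yt"] * t)
    bias = list(p["a_jt"])        # IB_j = alpha_jt in the TT model
    return tc, 1.0 / inv_rts, bias

# Hypothetical coefficients
params = {
    "a_t": -0.02, "a_tt": 0.001, "a_yt": -0.001,
    "a_y": 0.8, "a_yy": 0.05,
    "a_jt": [0.002, -0.002],      # sums to zero
    "a_jy": [0.01, -0.01],        # sums to zero
}
tc, rts, bias = translog_cost_tt0(params, w=[math.e, 1.0], y=math.e, t=5)
print(tc, rts, bias)
```

A positive TC here corresponds to cost diminution over time, and an RTS above one indicates increasing returns to scale at the chosen data point.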

Generalizations of Time Trend and General Index Models

In both the TT and GI models, TC and RTS are firm-specific, because the w and y
variables vary among firms. However, input and scale biases in TC are the same for
all firms in both models. This undesirable feature can be eliminated by extending
both the TT and GI models so that the $\alpha_{yt}$, $\alpha_{jt}$ parameters are firm-specific, that
is, $\alpha_{yt}$, $\alpha_{jt}$ are replaced by $\alpha^i_{yt}$, $\alpha^i_{jt}$, i = 1, . . . , N. See Kumbhakar, Nakamura, and
Heshmati [13] for details on this model. In doing so, no restrictions are imposed on
technological bias measures. The cost of this generalization is that JN additional
parameters are to be estimated. Thus, TC in the TT1 and GI1 models can be
expressed as

$$TC\_TT1 = -\Big[\alpha_t + \alpha_{tt} t + \sum_j \alpha^i_{jt} \ln w_j + \alpha^i_{yt} \ln y\Big], \tag{59}$$

$$TC\_GI1 = -\{A(t) - A(t-1)\}\Big[1 + \sum_j \alpha^i_{jt} \ln w_j + \alpha^i_{yt} \ln y\Big]. \tag{60}$$

It can be seen from (59) and (60) that firms producing the same output level and
facing the same input prices will experience different rates of TC as long as the $\alpha^i_{yt}$,
$\alpha^i_{jt}$ parameters are not the same for all firms. For the same reason, input and scale
biases in TC are firm-specific although the formulas for calculating these biases
are the same as those in the TT0 and GI0 models. Another advantage of the above
generalization is that the TT0 and GI0 models become special cases of the TT1 and
GI1 models – specification of which can be tested using likelihood ratio (LR) tests.
Formulas for calculating RTS, for the TT1 and GI1 models, are the same as those
in (55)–(58), except that $\alpha_{yt}$, $\alpha_{jt}$ are now firm-specific and are replaced by $\alpha^i_{yt}$, $\alpha^i_{jt}$.
Thus, estimates of RTS, for firms producing the same output level and facing the
same input prices, will be different. Consequently, the scale components will also
be different for these firms. The same is true for input and scale biases, which are
firm-specific if the αyt , αj t parameters are firm-specific.
In addition to these two extensions, several other extensions of the TT and
GI models are considered in the Kumbhakar et al. [13] paper. These extensions
are parsimonious in comparison with the TT1 and GI1 models, and yet flexible
enough to generate firm-specific biases in TC. In the TT2 (GI2) model, they specify
$\alpha^i_{yt} = \theta(i)\alpha_{yt}$ and $\alpha^i_{jt} = \theta(i)\alpha_{jt}$, where θ(i) are firm-specific parameters. Thus,
the input and scale biases are not allowed to be completely free as in the TT1 (GI1)
model, because the θ (i) parameters are constrained to be the same for all inputs and
output. Consequently, the TT2 (GI2) model has only (N–1) additional parameters
compared to the TT0 (GI0) model. Note that one of the θ parameters is normalized
to unity for identification. The TT0 (GI0) model can be obtained as a special case
of the TT2 (GI2) model by restricting θ (i) = 1 for all i. Such restrictions are testable
econometrically.
One can justify the above extensions of the TT and GI models because of their
flexibility. Empirically, we often deal with firms that are heterogeneous in size. Such
heterogeneity may not always be captured by firm-effects (fixed or random) in the
error term (which essentially makes the intercept firm-specific). It might have some
second-order effects as well. Since we are mostly interested in analyzing TC and
RTS properties of the production technology in an industry that is characterized by
firms of different sizes and product mixes, one often focuses on models in which the
scale and TC components can vary across firms, irrespective of variation in input
prices and output quantities. In other words, one can let the data determine whether
such second order effects are present or not. This means that before imposing such
constraints a priori, one can test whether scale and TC components are affected by
firm size or not. Since the TT0 (GI0) model is nested in TT1 and TT2 (GI1 and
GI2), the LR test can be used to determine the appropriateness of these extensions.

Factor Augmenting Approach

The cost function, which is dual to the FA production function in (5), can be written
as

C = C (w̃, y) , (61)

where w̃j = Bj (t)wj and Bj (t) = 1/λj (t) ∀ j, λj (t) ≥ 1. Thus, an increase in
efficiency of an input (λj (t) ≥ 1) is equivalent to a decrease (Bj (t) ≤ 1) in its effective
price. Assuming a translog form for the cost function in (61), we get

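$$\ln C = \alpha_0 + \sum_j \alpha_j \ln \tilde{w}_j + \alpha_y \ln y + \frac{1}{2}\Big\{\sum_j \sum_k \alpha_{jk} \ln \tilde{w}_j \ln \tilde{w}_k + \alpha_{yy} \ln y \ln y\Big\} + \sum_j \alpha_{jy} \ln \tilde{w}_j \ln y \tag{62}$$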

where w̃j = Bj (t)wj and Bj (t) are parameters (j = 1, . . . , J and t = 1, . . . , T). The
above cost function is assumed to satisfy the symmetry and linear homogeneity (in
w̃) restrictions.
It is to be noted that efficiency of each input in the above cost function
model is completely flexible (subject to the normalizing constraints Bj (1) = 1
for all j). The rate of efficiency change in input j can be obtained from ln
λj (t) − ln λj (t − 1) = − (ln Bj (t) − ln Bj (t − 1)) for t = 2, · · · , T, and
j = 1, · · · , J.

The different neutrality hypotheses postulated above can be tested via the cost
function in (62) with appropriate parametric restrictions. For example, technical
change is Hicks neutral if the RTS is unitary (i.e., $\alpha_y = 1$, $\alpha_{yy} = 0$, $\alpha_{jy} = 0\ \forall j$), and
$B_j(t) = B(t)\ \forall j$. That is, the cost function under the Hicks neutral TC can be written
as C = B(t)·C(w, y) where the cost function C(w, y) is homogeneous of degree 1 in
y (RTS = 1). Similarly, technical change is Harrod neutral if RTS = 1 and $B_j(t) = 1$
for all j ≠ labor. Furthermore, technical change is Solow neutral if RTS = 1 and
$B_j(t) = 1$ for all j ≠ capital.
The measure of overall technical change (TC_FA) in the above model can be
obtained from

$$TC\_FA = [\ln C(t) - \ln C(t-1)]\big|_{w,y\ \text{unchanged}} = \sum_j S_j [\ln B_j(t) - \ln B_j(t-1)] + \frac{1}{2}\sum_j \sum_k \alpha_{jk} [\ln B_j(t) - \ln B_j(t-1)]\{\ln B_k(t) - \ln B_k(t-1)\} \equiv \sum_j STC_j \tag{63}$$

where

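$$S_j = \frac{\partial \ln C}{\partial \ln \tilde{w}_j} = \frac{\tilde{w}_j \tilde{x}_j}{C} = \frac{w_j x_j}{C} = \alpha_j + \sum_k \alpha_{jk} \ln \tilde{w}_k + \alpha_{jy} \ln y \tag{64}$$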

and

$$STC_j = [\ln B_j(t) - \ln B_j(t-1)]\Big[S_j + \frac{1}{2}\sum_k \alpha_{jk} \{\ln B_k(t) - \ln B_k(t-1)\}\Big] \tag{65}$$

which is the share of input j in the overall technical change TC_FA.

The above formula has several important features. First, the rate of overall
technical change is defined by the change in efficiency of each input, as well as
the cost share of each input (Sj ). The cost shares as shown in (64) are also affected
by factor productivity of each input via ln w̃k = ln wk + ln Bk . Second, the share
of input j in the overall technical change is given by STCj which is defined in (65).
Note that $STC_j$ is affected by a change in efficiency of other inputs (k ≠ j) directly
through the $\sum_k \alpha_{jk} \{\ln B_k(t) - \ln B_k(t-1)\}$ component and indirectly via $S_j$. Third,
by rewriting (63) as

$$TC\_FA = \sum_j [\ln B_j(t) - \ln B_j(t-1)]\Big[\alpha_j + \frac{1}{2}\sum_k \alpha_{jk} \{\ln B_k(t) - \ln B_k(t-1)\}\Big] + \sum_j \alpha_{jy} \ln y\, [\ln B_j(t) - \ln B_j(t-1)] + \sum_j \sum_k \alpha_{jk} \ln w_k\, [\ln B_j(t) - \ln B_j(t-1)] \tag{66}$$

one can decompose TC into the following components:

pure

$$\sum_j [\ln B_j(t) - \ln B_j(t-1)]\Big[\alpha_j + \frac{1}{2}\sum_k \alpha_{jk} \{\ln B_k(t) - \ln B_k(t-1)\}\Big], \tag{67}$$

scale related

$$\sum_j \alpha_{jy} \ln y\, [\ln B_j(t) - \ln B_j(t-1)], \tag{68}$$

and input price related

$$\sum_j \sum_k \alpha_{jk} \ln w_k\, [\ln B_j(t) - \ln B_j(t-1)]. \tag{69}$$

Bias in technical change associated with input j (j = 1, . . . , J) in the present
model can be measured from

$$Ib_j = \sum_k \alpha_{jk} \{\ln B_k(t) - \ln B_k(t-1)\} \tag{70}$$

signs of which will depend on efficiency change in each input as well as the α jk
parameters. Like the GI model, these measures are time-varying, but are also related
to efficiency of all inputs. Consequently, input bias in technical change associated
with an input j depends on efficiency change in every input. If efficiency factors
are the same for all inputs, that is, Bj (t) = B(t) ∀ j, and the underlying production
function is homogeneous, then Ibj = 0 ∀ j. Another possibility is that α jk = 0 ∀ j,
k, which implies that the production function is Cobb-Douglas. The neutrality
hypothesis arising from either way is testable. One can also test the neutrality
hypothesis with constant returns to scale (Hicks neutral technical change).
Other forms of neutrality hypotheses representing Harrod and Solow neutral
technical change can also be tested. Scale bias in technical change in the factor
augmenting model can be obtained from

$$Scb = \sum_j \alpha_{jy} \{\ln B_j(t) - \ln B_j(t-1)\}. \tag{71}$$

Thus, unlike the other models, scale bias in the factor augmenting model has
input-specific components which can be estimated. If $B_j(t) = B(t)\ \forall j$ and $\sum_j \alpha_{jy} = 0$,
the scale bias in technical change is zero.
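The decomposition of TC_FA into pure, scale-related, and input-price-related components can be illustrated numerically. The sketch below evaluates the change between t = 1 and t = 2 under the normalization $B_j(1) = 1$, in which case the three components add up exactly to ln C(2) − ln C(1) with w and y held fixed. All parameter values and function names are hypothetical, and the $\alpha_{jk}$ matrix is assumed symmetric.

```python
import math

def ln_cost(a0, a, a_y, a_yy, a_jk, a_jy, ln_w, ln_y, ln_B):
    """Translog cost function in effective prices, ln w~_j = ln w_j + ln B_j."""
    lw = [w + b for w, b in zip(ln_w, ln_B)]
    J = len(lw)
    quad = sum(a_jk[j][k] * lw[j] * lw[k] for j in range(J) for k in range(J))
    return (a0 + sum(a[j] * lw[j] for j in range(J)) + a_y * ln_y
            + 0.5 * (quad + a_yy * ln_y ** 2)
            + sum(a_jy[j] * lw[j] for j in range(J)) * ln_y)

def tc_fa_components(a, a_jk, a_jy, ln_w, ln_y, d_ln_B):
    """Pure, scale-related, and price-related components of TC_FA."""
    J = len(a)
    pure = sum(d_ln_B[j] * (a[j] + 0.5 * sum(a_jk[j][k] * d_ln_B[k] for k in range(J)))
               for j in range(J))
    scale = sum(a_jy[j] * ln_y * d_ln_B[j] for j in range(J))
    price = sum(a_jk[j][k] * ln_w[k] * d_ln_B[j] for j in range(J) for k in range(J))
    return pure, scale, price

# Hypothetical parameters satisfying homogeneity and symmetry
a0, a, a_y, a_yy = 0.5, [0.6, 0.4], 0.9, 0.02
a_jk = [[0.05, -0.05], [-0.05, 0.05]]
a_jy = [0.01, -0.01]
ln_w, ln_y = [math.log(1.5), math.log(0.8)], math.log(3.0)

ln_B1 = [0.0, 0.0]            # normalization B_j(1) = 1
ln_B2 = [-0.02, -0.01]        # hypothetical efficiency gains in period 2
d_ln_B = [b2 - b1 for b1, b2 in zip(ln_B1, ln_B2)]

pure, scale, price = tc_fa_components(a, a_jk, a_jy, ln_w, ln_y, d_ln_B)
direct = (ln_cost(a0, a, a_y, a_yy, a_jk, a_jy, ln_w, ln_y, ln_B2)
          - ln_cost(a0, a, a_y, a_yy, a_jk, a_jy, ln_w, ln_y, ln_B1))
print(pure + scale + price, direct)
```

For later periods, where ln B_k(t − 1) is no longer zero, the shares $S_j$ in TC_FA also depend on the lagged efficiency factors, so the simple check above applies only to the first transition.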

Symmetric Generalized McFadden Cost Function

Cost functions, such as the translog, tend to violate economic theoretical restrictions
(see Diewert and Wales [10]). Imposition of global concavity in input prices often
destroys flexibility of the translog cost function. To avoid this problem, Diewert and
Wales [10] suggested a functional form (Symmetric Generalized McFadden (SGM))
that is globally concave. The SGM cost function is
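$$C(w, t, y) = g(w)\, y + \sum_j \eta_j w_j + \sum_j \eta_{jj} w_j y + b \Big(\sum_j \gamma_j w_j\Big) t y + a_t \Big(\sum_j \phi_j w_j\Big) t + \eta_{YY} \Big(\sum_j \beta_j w_j\Big) y^2 + a_{tt} \Big(\sum_j \lambda_j w_j\Big) t^2 y, \quad j = 1, \ldots, J \tag{72}$$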

where the function g(w) is defined as

$$g(w) = \frac{1}{2}\,\frac{w'Sw}{\theta'w} \tag{73}$$

where $\theta \equiv [\theta_1, \ldots, \theta_J] > 0_J$ is a vector of non-negative constants not all zero, and
S is a J × J symmetric negative semidefinite (NSD) matrix such that $S w^* = 0$. $w^*$
can be considered as the point of approximation, that is, the point at which the data
are normalized to unity. Following theorem 10 of Diewert and Wales [10], it can be
shown that C(·) is globally concave in w if the matrix S is negative semidefinite.
If S is not NSD then negative semi-definiteness can be imposed following the
technique developed by Wiley, Schmidt, and Bramble [14]. One can reparametrize S
as the negative of the product of an upper triangular matrix and its transpose. Such reparameterization does
not affect the flexibility of C(·) and its global behavior. The θ parameters in g(w)
are assumed to be known. To minimize the number of parameters, while keeping
the function flexible, the $\gamma_j$, $\phi_j$, $\beta_j$, and $\lambda_j$ are also assumed to be known. See
Kumbhakar [15] for an application of the SGM model. The translog cost function
described above can be generalized to accommodate multiple outputs. In general,
the cost function is C = C(w, y, t) where w and y are vectors of input prices and
output. Kumbhakar [16] generalizes the SGM cost function to the multioutput case
that makes it relatively easy to estimate different aspects of a production technology.
The model is estimated using the system of demand functions (normalized by y)
which are derived from Shephard's lemma, $\partial C(.)/\partial w_j = x_j(.)$. Once the parameters are estimated, TC
can be obtained from

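$$TC = \frac{1}{C(.)}\,\frac{\partial C(.)}{\partial t}, \quad \text{where} \quad \frac{\partial C(.)}{\partial t} = b \Big(\sum_j \gamma_j w_j\Big) y + a_t \Big(\sum_j \phi_j w_j\Big) + 2 a_{tt} \Big(\sum_j \lambda_j w_j\Big) t y \tag{74}$$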

and C(.) is replaced by its predicted value.
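As a sanity check on the derivative used in the TC formula, the sketch below codes the SGM cost function (72) for two inputs and compares the analytic ∂C/∂t, obtained by differentiating (72) with respect to t, with a central finite difference. All parameter values, the dictionary keys, and the function names are hypothetical; S is chosen to be symmetric NSD with S w* = 0 at w* = (1, 1).

```python
def _dot(c, w):
    return sum(cj * wj for cj, wj in zip(c, w))

def sgm_cost(w, y, t, p):
    """Symmetric generalized McFadden cost function, Eq. (72)."""
    J = len(w)
    wSw = sum(p["S"][j][k] * w[j] * w[k] for j in range(J) for k in range(J))
    g = 0.5 * wSw / _dot(p["theta"], w)                 # Eq. (73)
    return (g * y + _dot(p["eta"], w) + _dot(p["eta_jj"], w) * y
            + p["b"] * _dot(p["gamma"], w) * t * y
            + p["a_t"] * _dot(p["phi"], w) * t
            + p["eta_yy"] * _dot(p["beta"], w) * y ** 2
            + p["a_tt"] * _dot(p["lam"], w) * t ** 2 * y)

def sgm_dcost_dt(w, y, t, p):
    """Analytic derivative of (72) with respect to t."""
    return (p["b"] * _dot(p["gamma"], w) * y
            + p["a_t"] * _dot(p["phi"], w)
            + 2.0 * p["a_tt"] * _dot(p["lam"], w) * t * y)

# Hypothetical parameter values
p = {
    "S": [[-0.2, 0.2], [0.2, -0.2]], "theta": [0.5, 0.5],
    "eta": [1.0, 2.0], "eta_jj": [0.5, 0.5],
    "b": -0.01, "gamma": [0.5, 0.5],
    "a_t": -0.02, "phi": [0.5, 0.5],
    "eta_yy": 0.01, "beta": [0.5, 0.5],
    "a_tt": 0.0005, "lam": [0.5, 0.5],
}
w, y, t = [1.2, 0.9], 2.0, 3.0

dC_dt = sgm_dcost_dt(w, y, t, p)
tc = dC_dt / sgm_cost(w, y, t, p)        # TC = (1/C) dC/dt

h = 1e-4
dC_dt_fd = (sgm_cost(w, y, t + h, p) - sgm_cost(w, y, t - h, p)) / (2 * h)
print(tc, dC_dt, dC_dt_fd)
```

In an application C(.) in the denominator would be the predicted cost from the estimated model, as noted in the text.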

Other applications include Kutlu, Liu, and Sickles [17] who discussed properties
of the SGM as well as some other cost functions.

Profit Function Approach

Under the behavioral assumption of profit maximization with competitive input and
output markets, the profit function can be expressed as π = π (w, p, t) [3, 4, 18],
where w (p) is a vector of input (output) prices. The profit function, similar to
the cost function, can handle multiple outputs. The profit function is homogeneous
of degree 1 in p and w. One can estimate the profit function after imposing the
linear homogeneity restrictions, using any parametric functions. For the CD and the
translog functions, profit has to be positive for all observations. TC can then be
computed using the formula defined earlier in the estimated profit function. Note
that to estimate the profit function, one needs data on input and output prices, which
should have enough variation (both cross-sectionally and over time) to yield
reliable estimates of the parameters.

Multiple Outputs

The Primal Approach

Output Distance Function

Although most of the production processes consist of multiple outputs, multiple
output production functions are rarely used in practice [19]. The problem lies in
using the function econometrically. Even if inputs are assumed to be exogenous,
it is not straightforward to accommodate multiple outputs in a single equation
production function when outputs are endogenous. That is, using one output as
the left-hand side variable (which will be treated as endogenous) and others as
the right-hand side variables (which are assumed to be exogenous) will give rise
to the endogeneity problem, which cannot be avoided no matter which output
appears on the left-hand side of the equation. To avoid this, one can use distance
functions, for example, output distance functions (ODF). An easy way of describing
it is to start with a transformation function T(y, x, t) = A. For identification of the
parameters in it using a parametric form such as the CD or translog, we assume
T(.) to be homogeneous of degree 1 in outputs which gives the ODF [20]. With the
homogeneity assumption, the ODF can be expressed as

$$\ln y_1 = m^O(\ln \hat{y}_m, \ln x, t) + \ln A \tag{75}$$

where ŷm = ym /y1 , m = 2, . . . , M. One can use CD or translog form for the mO (.)
function. The translog ODF with TT specification is

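$$\ln y_{1it} = \beta_0 + \sum_{j=1} \beta_j \ln x_{jit} + \frac{1}{2}\sum_{j=1} \sum_{k=1} \beta_{jk} \ln x_{jit} \ln x_{kit} + \sum_{m=2} \alpha_m \ln \hat{y}_{mit} + \frac{1}{2}\sum_{m=2} \sum_{n=2} \alpha_{mn} \ln \hat{y}_{mit} \ln \hat{y}_{nit} + \sum_{m=2} \sum_{j=1} \delta_{mj} \ln x_{jit} \ln \hat{y}_{mit} + \alpha_t t + \frac{1}{2}\alpha_{tt} t^2 + \sum_j \alpha_{jt} \ln x_{jit}\, t + \sum_{m=2} \beta_{mt} \ln \hat{y}_{mit}\, t + v_{it}, \tag{76}$$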

If inputs are exogenous (which is rarely true) for a revenue maximizing firm,
output ratios will be exogenous (competitive output markets) especially when
the output prices are exogenous. In such a case, the ODF does not suffer from
the endogeneity (of outputs) problem. The ODF is invariant to the choice of the
numeraire output which is the output that is used to impose the homogeneity
restriction.
Similar to the production function approach, one can obtain RTS from
$\sum_j \partial \ln y_1/\partial \ln x_j$ and TC from $\partial \ln y_1/\partial t$. Note that the interpretation of TC here is
similar to the production function model. Holding the inputs unchanged, TC denotes
a proportional change in all the outputs. This means the output ratios $\hat{y}_m$ appearing
on the right-hand side of the ODF are unchanged. Because of this, TC from the ODF
can be obtained from $TC(\hat{y}, x, t) = \partial \ln y_1/\partial t$. It is a radial measure and shows the
rate at which all the outputs would be increased holding inputs unchanged. Since
an increase in all the outputs by k% increases revenue by k% as well, TC in ODF
can have a revenue interpretation (the percentage by which revenue changes due
to a change in all the outputs by 1%). A positive (negative) value of TC indicates
technical progress (regress). Similarly, RTS is defined as RTS = $\sum_j \partial \ln y_1/\partial \ln x_j$. It
is a radial measure, meaning that RTS shows the rate at which all the outputs change
when the inputs are changed by a constant percentage.
The GI formulation in the ODF can be obtained from the above TT formulation
by replacing the time trend variable t above by A(t) and dropping the $t^2$ term. TC
can be computed by taking finite differences, that is, $\ln y_{1it} - \ln y_{1i,t-1}$, holding the
inputs $x_{jit}$ and $\hat{y}_{mit}$ unchanged.
For FA modeling, we write the ODF as $\hat{y}_1 = m^{IF}(\hat{y}, A_1(t)x_1, \ldots, A_J(t)x_J)$
and then use a parametric functional form (such as the translog) for $m^{IF}(.)$. The rest
of the model is similar to our earlier discussion of TC in a FA model.

Input Distance Function (IDF)

If outputs are assumed to be exogenous (which is the case for service industries
such as electricity, transportation, hospital, telephone, post offices, etc.), one can
specify the technology in terms of an IDF which is obtained from the transformation
function by making it homogeneous of degree 1 in inputs. Thus, the IDF can be
expressed as

$$\ln x_1 = m^I(\ln y, \ln \hat{x}, t) \tag{77}$$

where $\ln \hat{x} = (\ln(x_2/x_1), \ln(x_3/x_1), \ldots, \ln(x_J/x_1))$. One can use a CD or a
translog form for the mI (.) function. The translog IDF can be written as (Kumbhakar
and Lien [21], Kumbhakar [20], and many others)

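$$\ln x_{1it} = \beta_0 + \sum_{j=2} \beta_j \ln \hat{x}_{jit} + \frac{1}{2}\sum_{j=2} \sum_{k=2} \beta_{jk} \ln \hat{x}_{jit} \ln \hat{x}_{kit} + \sum_m \alpha_m \ln y_{mit} + \frac{1}{2}\sum_m \sum_n \alpha_{mn} \ln y_{mit} \ln y_{nit} + \sum_m \sum_{j=2} \delta_{mj} \ln \hat{x}_{jit} \ln y_{mit} + \alpha_t t + \frac{1}{2}\alpha_{tt} t^2 + \sum_{j=2} \alpha_{jt} \ln \hat{x}_{jit}\, t + \sum_m \beta_{mt} \ln y_{mit}\, t, \tag{78}$$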

For a cost minimizing firm with exogenous outputs, input ratios will be exogenous
if input prices are exogenous (competitive input market) and therefore the
IDF does not suffer from the endogeneity (of inputs) problem. See, for example,
Sipilainen et al. [22]. The IDF is invariant to the choice of the numeraire input
(the input that is used to impose the homogeneity restriction). Similar to the cost
function approach, one can obtain RTS from $\sum_m \partial \ln x_1/\partial \ln y_m$ and TC from
$-\partial \ln x_1/\partial t$. Note that here TC is a measure of cost diminution holding outputs and
input proportions unchanged. That is, TC is measured radially by examining the
rate at which all the inputs can be reduced. Thus, for example, if all the inputs are
reduced by k%, cost is also reduced by k% and hence we have the cost interpretation
of TC in IDF.
Instead of using the TT formulation, one can use the GI as well as FA approaches
in specifying the IDF (ODF). The expressions for TC and RTS can be derived
similarly. Because of this, we do not repeat them here.

The Dual Approach

In a primal approach (using the production function, ODF, and IDF), no economic
behavioral assumption is made explicitly so far as the use of inputs and/or
production of output are concerned. In a dual approach, economic behavior is
explicitly used. In the dual cost function approach, the economic behavior is cost
minimization subject to a given output level.

Cost Function
The cost function with multiple outputs can be written as C = C(w, y, t) where
w and y are input price and output vectors, respectively. Since the cost function
assumes output(s) to be exogenous, a vast majority of the applications are from
service sectors such as banking, provision of health services (hospitals, nursing
homes, etc.), water, gas, and electricity distribution, transportation services (airlines,
railroad, trucking, etc.), post office, and many others. The only difference between a
single and multiple outputs is that y is scalar in a single output case and it is a vector
for multiple outputs.
The translog cost function with multiple outputs and time trend TC can be written
as
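$$\ln C_{it} = \beta_0 + \sum_j \beta_j \ln w_{jit} + \frac{1}{2}\sum_j \sum_k \beta_{jk} \ln w_{jit} \ln w_{kit} + \sum_m \alpha_m \ln y_{mit} + \frac{1}{2}\sum_m \sum_n \alpha_{mn} \ln y_{mit} \ln y_{nit} + \sum_m \sum_j \delta_{mj} \ln w_{jit} \ln y_{mit} + \alpha_t t + \frac{1}{2}\alpha_{tt} t^2 + \sum_j \alpha_{jt} \ln w_{jit}\, t + \sum_m \beta_{mt} \ln y_{mit}\, t, \tag{79}$$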

which is homogeneous of degree 1 in input prices. Other restrictions are symmetry,
that is, $\beta_{jk} = \beta_{kj}$, $\alpha_{mn} = \alpha_{nm}$. TC from (79) is

$$TC(w, y, t) = \alpha_t + \alpha_{tt} t + \sum_{j=1} \alpha_{jt} \ln w_{jit} + \sum_m \beta_{mt} \ln y_{mit} \tag{80}$$

which is non-neutral. Input bias of TC is $\partial S_{jit}/\partial t = \alpha_{jt}$, where

$$S_{jit} = \partial \ln C_{it}/\partial \ln w_{jit} = \beta_j + \sum_k \beta_{jk} \ln w_{kit} + \sum_m \delta_{mj} \ln y_{mit} + \alpha_{jt} t. \tag{81}$$

Thus, bias is input specific but constant for all firms. Finally, RTS is defined as
the inverse of

$$RTS_{it}^{-1} = \sum_m \partial \ln C_{it}/\partial \ln y_{mit} = \sum_m \Big[\alpha_m + \sum_n \alpha_{mn} \ln y_{nit} + \sum_j \delta_{mj} \ln w_{jit} + \beta_{mt} t\Big] \tag{82}$$

which is observation specific.

There are many special cases of the above cost function. Some of these are:
(i) separability ($\delta_{mj} = 0$), (ii) homogeneity (which implies constant
RTS; $\delta_{mj} = 0$, $\sum_m \alpha_{mn} = 0$, $\sum_m \beta_{mt} = 0$), and (iii) unitary RTS ($\delta_{mj} = 0$,
$\sum_m \alpha_{mn} = 0$, $\sum_m \beta_{mt} = 0$, $\sum_m \alpha_m = 1$).

Profit Function
The profit function with multiple outputs is similar to that of the single output. The
only difference is that output price (p) is replaced by a vector, that is, π = π (p, w, t).
The profit function is homogeneous of degree 1 in input and output prices, which can
be easily imposed by writing π (.) as π/p1 = π (p̃, w̃, t), where p̃m = pm /p1 , m =
2, . . . , M and w̃j = wj /p1 , j = 1, . . . , J . Since TC is defined in the same way
and has the same interpretation as in the single output case, we are not discussing it
again to avoid repetition.

TC Measures Induced by Management/Exogenous Factors and Time

So far, in the models we considered, TC is not only exogenous but purely time
driven (manna from heaven). Sometimes TC can be explained by some exogenous
(predetermined) variables. In this case, TC can be decomposed into a time driven
component and a component that is induced by predetermined variables (z). It
could also be driven by endogenous variables. However, we are not addressing
endogenous TC here. Thus, if something is endogenous, for modeling purposes
we will be treating it as predetermined (decided sometime in the past) to avoid
the endogeneity problem. First, we discuss the primal approach for a single output
(production function). Then we address TC in the context of multiple outputs
(distance function).

Management Variables as Technology Shifter

Here the z variables (dummy) represent different management types, say, from 1 to
5 as shown in Triebs and Kumbhakar [69]. One can also think of different types
of innovation, R&D, etc. Finally, it is possible to combine management induced
and time driven TC in a production function. In the following, we consider both
management induced (z) and time driven (t) TC in a translog production function.

Both z and t Are Continuous

The translog production function with the time-invariant z variable can be written
as (Triebs and Kumbhakar [69])
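\[
\begin{aligned}
\ln y_{it} = {} & \beta_0 + \sum_j \beta_j \ln x_{jit} + \frac{1}{2}\sum_j \sum_k \beta_{jk} \ln x_{jit} \ln x_{kit} + \beta_t t + \frac{1}{2}\beta_{tt} t^2 \\
& + \sum_j \beta_{jt} \ln x_{jit}\, t + \beta_z z_i + \frac{1}{2}\beta_{zz} z_i^2 + \sum_j \gamma_{jz} \ln x_{jit}\, z_i + \delta z_i t
\end{aligned}
\tag{83}
\]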

from which the time driven and management induced TC, respectively, are

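\[
\begin{aligned}
TC1_{it} &= \frac{\partial \ln y_{it}}{\partial t} = \beta_t + \beta_{tt} t + \sum_j \beta_{jt} \ln x_{jit} + \delta z_i \\
TCM1_{it} &= \frac{\partial \ln y_{it}}{\partial z_i} = \beta_z + \beta_{zz} z_i + \sum_j \gamma_{jz} \ln x_{jit} + \delta t
\end{aligned}
\tag{84}
\]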

The z variable(s) can be both firm-specific and time-varying.

If z is treated as a discrete (binary) variable, one needs to drop the $z^2$ term in (83), and therefore $TCM1_{it}$ will be given by $TCM1_{it} = \ln y_{it}\,|\,\{\ln x_{jit}, z_i = 1\} - \ln y_{it}\,|\,\{\ln x_{jit}, z_i = 0\} = \beta_z + \sum_j \gamma_{jz} \ln x_{jit} + \delta t$.
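The two measures in (84), and the binary-z case just described, can be checked numerically. The sketch below is only an illustration: the coefficient values and dictionary keys are hypothetical, and the function `ln_output` simply codes the translog form in (83).

```python
import numpy as np

def ln_output(ln_x, t, z, b):
    """Translog production function (83); set b['bzz'] = 0 for a binary z."""
    return (b["b0"] + b["bj"] @ ln_x + 0.5 * ln_x @ b["bjk"] @ ln_x
            + b["bt"] * t + 0.5 * b["btt"] * t**2 + (b["bjt"] @ ln_x) * t
            + b["bz"] * z + 0.5 * b["bzz"] * z**2
            + (b["gjz"] @ ln_x) * z + b["delta"] * z * t)

def tc_time(ln_x, t, z, b):
    """Time driven TC: derivative of ln y with respect to t."""
    return b["bt"] + b["btt"] * t + b["bjt"] @ ln_x + b["delta"] * z

def tc_management(ln_x, t, z, b):
    """Management induced TC: derivative of ln y with respect to z."""
    return b["bz"] + b["bzz"] * z + b["gjz"] @ ln_x + b["delta"] * t

# Hypothetical coefficients for two inputs; bzz = 0 mimics the binary-z case.
b = dict(b0=1.0, bj=np.array([0.6, 0.3]),
         bjk=np.array([[0.02, -0.01], [-0.01, 0.03]]),
         bt=0.01, btt=0.002, bjt=np.array([0.004, -0.002]),
         bz=0.05, bzz=0.0, gjz=np.array([0.01, 0.02]), delta=0.003)
ln_x = np.log(np.array([4.0, 9.0]))
t = 3.0

# Binary case: difference in ln y between z = 1 and z = 0 at given inputs and t.
binary_effect = ln_output(ln_x, t, 1.0, b) - ln_output(ln_x, t, 0.0, b)
```

With `bzz = 0`, `binary_effect` coincides with $\beta_z + \sum_j \gamma_{jz} \ln x_{jit} + \delta t$, so the discrete difference and the derivative formula agree.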

Multiple Discrete Management Variables zm and Time Trend (Continuous)
If there are multiple management variables that are discrete in nature, one can define a management index $M(z_i) = \sum_{m=1}^{M} \theta_m z_{mi}$ and use it in the translog production function, which is (Triebs and Kumbhakar [69])

\[
\begin{aligned}
\ln y_{it} = {} & \beta_0 + \sum_j \beta_j \ln x_{jit} + \frac{1}{2}\sum_j \sum_k \beta_{jk} \ln x_{jit} \ln x_{kit} + \beta_t t + \frac{1}{2}\beta_{tt} t^2 \\
& + \sum_j \beta_{jt} \ln x_{jit}\, t + M(z_i) + \sum_j \gamma_j \ln x_{jit}\, M(z_i) + \delta M(z_i)\, t.
\end{aligned}
\tag{85}
\]

As before, TC in the above formulation can have two components: one driven by time and the other induced by management variables. Since there are M management variables $z_m$, which may not all equal 1 for a firm i, one can define TC induced by management in many different ways. For simplicity, we consider the case where $z_{mi} = 1\ \forall m$ vs. $z_{mi} = 0\ \forall m$ in defining management induced TC. The time driven and management induced TC are:

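\[
\begin{aligned}
TC2_{it} &= \frac{\partial \ln y_{it}}{\partial t} = \beta_t + \beta_{tt} t + \sum_j \beta_{jt} \ln x_{jit} + \delta M(z_i) \\
TCM2_{it} &= \ln y_{it}\,|\,\{\ln x_{jit}, M(z_{mi} = 1\ \forall m)\} - \ln y_{it}\,|\,\{\ln x_{jit}, M(z_{mi} = 0\ \forall m)\} \\
&= M(z_i) \Big( 1 + \sum_j \gamma_j \ln x_{jit} + \delta t \Big)
\end{aligned}
\tag{86}
\]

where $M(z_i) = M(z_{mi} = 1\ \forall m)$. Note that although the $z_m$ variables (and therefore $M(z_i)$) are time-invariant, $TCM2_{it}$ is not time-invariant.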

TC with Continuous Management and General Time Index

Assuming that there is only one continuous management variable, z, we write the
translog production function as (Triebs and Kumbhakar [69])

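\[
\begin{aligned}
\ln y_{it} = {} & \beta_0 + \sum_j \beta_j \ln x_{jit} + \frac{1}{2}\sum_j \sum_k \beta_{jk} \ln x_{jit} \ln x_{kit} + A(t) + \sum_j \beta_{jt} \ln x_{jit}\, A(t) \\
& + \beta_z z_i + \frac{1}{2}\beta_{zz} z_i^2 + \sum_j \gamma_{jz} \ln x_{jit}\, z_i + \delta z_i A(t)
\end{aligned}
\tag{87}
\]

where $A(t) = \sum_{t=1}^{T} \lambda_t D_t$.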
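TC measures associated with (87) are, respectively,

\[
\begin{aligned}
TC3_{it} &= \big(A(t) - A(t-1)\big) \Big( 1 + \sum_j \beta_{jt} \ln x_{jit} + \delta z_i \Big), \\
TCM3_{it} &= \beta_z + \beta_{zz} z_i + \sum_j \gamma_{jz} \ln x_{jit} + \delta A(t)
\end{aligned}
\tag{88}
\]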
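With a general time index, A(t) is a step function whose value in period t is the coefficient on that period's dummy, so time driven TC is a discrete change rather than a derivative. The sketch below illustrates the first line of (88); the values of `lam` (with the first period normalized to zero, an assumption made here for identification) and of the other coefficients are hypothetical.

```python
import numpy as np

# Hypothetical time-dummy coefficients lambda_1..lambda_4 (lambda_1 = 0 by normalization).
lam = np.array([0.0, 0.015, 0.040, 0.035])

def tc3(t, ln_x, z, lam, bjt, delta):
    """Time driven TC between periods t-1 and t under the general index A(t).

    t is a 1-based period label, so A(t) = lam[t - 1].
    """
    if t < 2 or t > len(lam):
        raise ValueError("t must lie between 2 and the number of periods")
    dA = lam[t - 1] - lam[t - 2]          # A(t) - A(t-1)
    return dA * (1.0 + bjt @ ln_x + delta * z)

bjt = np.array([0.01, -0.02])   # hypothetical beta_jt
ln_x = np.log(np.array([4.0, 9.0]))
tc_period2 = tc3(2, ln_x, 0.5, lam, bjt, delta=0.1)
tc_period4 = tc3(4, ln_x, 0.5, lam, bjt, delta=0.1)
```

Because the $\lambda_t$ are unrestricted, the index allows technical regress in some periods (here between periods 3 and 4) without imposing a smooth trend.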

TC with Management Index and Time Index

To accommodate both management and time indexes, we write the translog
production function as (Triebs and Kumbhakar [69])

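\[
\begin{aligned}
\ln y_{it} = {} & \beta_0 + \sum_j \beta_j \ln x_{jit} + \frac{1}{2}\sum_j \sum_k \beta_{jk} \ln x_{jit} \ln x_{kit} + A(t) + \sum_j \beta_{jt} \ln x_{jit}\, A(t) \\
& + M(z_i) + \sum_j \gamma_j \ln x_{jit}\, M(z_i) + \delta M(z_i) A(t)
\end{aligned}
\tag{89}
\]

where $A(t) = \sum_{t=1}^{T} \lambda_t D_t$ and $M(z_i) = \sum_{m=1}^{M} \theta_m z_{mi}$.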
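The time driven and management induced TC measures for this model are

\[
\begin{aligned}
TC4_{it} &= \big(A(t) - A(t-1)\big) \Big( 1 + \sum_j \beta_{jt} \ln x_{jit} + \delta M(z_i) \Big), \\
TCM4_{it} &= M(z_i) \Big( 1 + \sum_j \gamma_j \ln x_{jit} + \delta A(t) \Big)
\end{aligned}
\tag{90}
\]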

Specification of the IDF with Multiple Outputs

The Translog Specification with Time Trend

The translog IDF with the time trend formulation is written as

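\[
\begin{aligned}
\ln x_{1it} = {} & \alpha_0 + \sum_{m=1}^{M} \beta_m \ln y_{mit} + \sum_{m=1}^{M} \sum_{j=2}^{J} \delta_{mj} \ln y_{mit} \ln \tilde{x}_{jit} + \sum_{j=2}^{J} \alpha_j \ln \tilde{x}_{jit} + \alpha_t t \\
& + \frac{1}{2} \Big( \sum_{m=1}^{M} \sum_{n=1}^{M} \beta_{mn} \ln y_{mit} \ln y_{nit} + \sum_{j=2}^{J} \sum_{k=2}^{J} \alpha_{jk} \ln \tilde{x}_{jit} \ln \tilde{x}_{kit} + \alpha_{tt} t^2 \Big) \\
& + \sum_{m=1}^{M} \alpha_{mt} \ln y_{mit}\, t + \sum_{j=2}^{J} \beta_{jt} \ln \tilde{x}_{jit}\, t + v_{it}
\end{aligned}
\tag{91}
\]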

Sometimes researchers are interested in estimating the impact of innovation, innovation systems, R&D, management, etc., on productivity. One can view such effects as TC because these exogenous and/or predetermined variables shift the technology, ceteris paribus. Thus, one can measure the effect of innovation, R&D, and management type variables from the shift in technology, and therefore, it can be labeled as TC.
In particular, for the innovation/management induced TC, one can (i) define an innovation (management) index or several indices, or (ii) use Likert scale variables to define dummy variables for innovation. These innovation dummies can then be used to define a general innovation index similar to the general time index of Baltagi and Griffin [11]. This index (or these indices) can be constructed in a cross-sectional framework as well. Thus, one can define TC in a cross-sectional model also. See Heshmati and Kumbhakar [23] for this type of formulation.

IDF with Many Innovation Indices as Shifters

Following Heshmati and Kumbhakar [24], we specify an IDF with many innovation indices

\[
\ln x_{1it} = \ln D_I\big(\tilde{x}, y, t, T_q(z^q_{it})\big) \tag{92}
\]

where $T_q(z^q_{it})$, $q = 1, \ldots, Q$, are Q innovation indices based on a vector of external factors $z^q_{it}$ (labeled as technology shifters). Further, we specify $T_q(z^q_{it})$ as

\[
T_q(z^q_{it}) = \Big( \sum_{p=1}^{P_q} \phi^q_p z^q_{pit} \Big), \qquad \sum_{p=1}^{P_q} \phi^q_p = 1\ \forall q \tag{93}
\]

where $P_q$ is the number of technology shifters in the innovation index $T_q(\cdot)$. For each of the indices, one can restrict the sum of the weights to be unity, so that the weights can be interpreted as the “importance” of each shifter in the innovation index.
To complete the model, a translog function for $\ln D_I\big(\tilde{x}, y, t, T_q(z^q_{it})\big)$ is assumed, that is,

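\[
\begin{aligned}
\ln x_{1it} = {} & \alpha_0 + \sum_{m=1}^{M} \beta_m \ln y_{mit} + \sum_{q=1}^{Q} \lambda_q T_q(z^q_{it}) + \sum_{m=1}^{M} \sum_{j=2}^{J} \delta_{mj} \ln y_{mit} \ln \tilde{x}_{jit} \\
& + \sum_{m=1}^{M} \sum_{q=1}^{Q} \gamma_{mq} \ln y_{mit}\, T_q(z^q_{it}) + \sum_{q=1}^{Q} \sum_{j=2}^{J} \gamma_{qj} T_q(z^q_{it}) \ln \tilde{x}_{jit} + \sum_{j=2}^{J} \alpha_j \ln \tilde{x}_{jit} \\
& + \frac{1}{2} \Big( \sum_{m=1}^{M} \sum_{n=1}^{M} \beta_{mn} \ln y_{mit} \ln y_{nit} + \sum_{j=2}^{J} \sum_{k=2}^{J} \alpha_{jk} \ln \tilde{x}_{jit} \ln \tilde{x}_{kit} \\
& \qquad + \sum_{q=1}^{Q} \sum_{r=1}^{Q} \lambda_{qr} T_q(z^q_{it})\, T_r(z^r_{it}) + \alpha_{tt} t^2 \Big) + \alpha_t t + \sum_{m=1}^{M} \alpha_{mt} \ln y_{mit}\, t \\
& + \sum_{j=2}^{J} \beta_{jt} \ln \tilde{x}_{jit}\, t + \sum_{q=1}^{Q} \lambda_{qt} T_q(z^q_{it})\, t + v_{it}
\end{aligned}
\tag{94}
\]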

In this specification, technology shifts (given input quantities) are allowed in terms of both time trend and innovation indices. The time driven technical change ($TC_T$) for this model is

\[
TC_T = -\frac{\partial \ln x_{1it}}{\partial t}
= -\Big[ \underbrace{\alpha_t + \alpha_{tt} t}_{\text{Pure TC}}
+ \underbrace{\sum_{m=1}^{M} \alpha_{mt} \ln y_{mit} + \sum_{j=2}^{J} \beta_{jt} \ln \tilde{x}_{jit}}_{\text{Non-neutral TC}}
+ \underbrace{\sum_{q=1}^{Q} \lambda_{qt} T_q(z^q_{it})}_{\text{Innovation index component}} \Big]
\tag{95}
\]

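Innovation induced technical change associated with the qth index ($TC_{Z_q}$) is

\[
TC_{Z_q} = -\frac{\partial \ln x_{1it}}{\partial T_q(z^q_{it})}
= -\Big[ \lambda_q + \sum_{m=1}^{M} \gamma_{mq} \ln y_{mit} + \sum_{j=2}^{J} \gamma_{qj} \ln \tilde{x}_{jit} + \sum_{r=1}^{Q} \lambda_{qr} T_r(z^r_{it}) + \lambda_{qt} t \Big]
\tag{96}
\]

where the fourth term uses the symmetry $\lambda_{qr} = \lambda_{rq}$.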

Thus, the overall innovation induced TC is

\[
TC_z = \sum_{q=1}^{Q} TC_{Z_q} \tag{97}
\]
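The construction of the innovation indices in (93) and of the innovation induced TC in (96) and (97) can be sketched as follows. This is only an illustration under stated assumptions: the weights, shifter values, and translog coefficients are hypothetical, the index is coded as a simple weighted sum as written in (93), and the derivative assumes a symmetric $\lambda_{qr}$ matrix.

```python
import numpy as np

def innovation_index(z, phi):
    """Weighted sum of technology shifters; weights must sum to one."""
    z, phi = np.asarray(z, float), np.asarray(phi, float)
    if not np.isclose(phi.sum(), 1.0):
        raise ValueError("weights must sum to one")
    return float(phi @ z)

def tc_innovation(T, ln_y, ln_x_tilde, t, p):
    """Vector of TC_Zq, q = 1..Q: minus the derivative of ln x1 w.r.t. each index."""
    grad = (p["lam_q"] + p["gam_mq"].T @ ln_y + p["gam_qj"] @ ln_x_tilde
            + p["lam_qr"] @ T + p["lam_qt"] * t)
    return -grad

# Two hypothetical indices: the first built from two shifters, the second from one.
T = np.array([innovation_index([2.0, 4.0], [0.5, 0.5]),
              innovation_index([1.0], [1.0])])

p = dict(lam_q=np.array([-0.03, -0.01]),
         gam_mq=np.zeros((1, 2)),                       # shape (M, Q)
         gam_qj=np.zeros((2, 1)),                       # shape (Q, J-1)
         lam_qr=np.array([[0.002, 0.0], [0.0, 0.0]]),   # symmetric (Q, Q)
         lam_qt=np.zeros(2))
tc_zq = tc_innovation(T, ln_y=np.array([0.0]), ln_x_tilde=np.array([0.0]), t=0.0, p=p)
tc_z = tc_zq.sum()   # overall innovation induced TC
```

In this toy case a one-point increase in the first index lowers input use (and hence cost) by about 2.4%, and the overall effect is the sum across indices.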

Note that TC, in this setup, can be defined for each innovation index, which
can be interpreted as the percentage cost diminution for a one-point change (or
one standard deviation change) in the innovation index. This, as mentioned before,
follows from the duality between the IDF and the cost function.
As noted at the outset, TC is a shift in the technology without changing the inputs.
So, in principle, if there are variables other than inputs that shift the technology,
the shift can be viewed as TC so long as the shift variables are exogenous or
predetermined. If we take this broad view of TC, then we can model it in a cross-
sectional setup. Below we illustrate this in the context of innovation indices using
an IDF.

TC in Cross-Sectional Data

Since TC is defined as a shift in the technology, it is possible to define TC in cross-sectional models also. The IDF with the innovation index using cross-section data can be expressed as

\[
\ln x_{1i} = \ln D_I\big(\tilde{x}, y, T_q(z^q_i)\big) \tag{98}
\]

where $T_q(z^q_i)$, $q = 1, \ldots, Q$, are Q innovation indices and $z^q_i$ are external factors that shift the technology (technology shifters). As before, we specify $T_q(z^q_i)$ as

\[
T_q(z^q_i) = \Big( \sum_{p=1}^{P_q} \phi^q_p z^q_{pi} \Big), \qquad \sum_{p=1}^{P_q} \phi^q_p = 1\ \forall q \tag{99}
\]

where $P_q$ is the number of technology shifters in the technology index $T_q(\cdot)$. For each of the indices, we restrict the sum of the weights to be unity, so that we can interpret the weights as the “importance” of each shifter in the technology components. The overall TC can then be calculated from

\[
TC_z = -\sum_{q=1}^{Q} \frac{\partial \ln D_I}{\partial T_q(z^q_i)} \tag{100}
\]

where

\[
\frac{\partial \ln D_I}{\partial T_q(z^q_i)} = \lambda_q + \sum_{m=1}^{M} \gamma_{mq} \ln y_{mi} + \sum_{j=2}^{J} \gamma_{qj} \ln \tilde{x}_{ji} + \sum_{r=1}^{Q} \lambda_{qr} T_r(z^r_i)
\tag{101}
\]

In this setup, TC can be defined for each innovation index, which can be inter-
preted as the percentage cost diminution for a one-point change in the innovation
index. Note that all the variables used to define the innovation index are assumed to
be exogenous (predetermined).

Technical Change from Other Indirect Functions

Revenue Function

We considered cost minimization and profit maximization as behavioral objectives of firms. These are widely used in the literature. However, a firm may be maximizing revenue, given the inputs. If inputs are exogenously given, then profit maximization is the same as revenue maximization. This is because if inputs and input prices are exogenously given, so is cost. Therefore, maximizing profit is the same as maximizing revenue. In such a case, the technology can be represented by the dual revenue function defined as

R = R (p, x, t) (102)

where R = py is revenue and p is the output price vector. TC, for the revenue
function, can then be defined as ∂lnR(.)/∂t, holding output prices and input quantities
fixed. In this case, TC represents a shift in the revenue function, meaning the
percentage change in revenue over time, ceteris paribus. That is,

TCR (p, x, t) = ∂ln R(.)/∂t (103)

$TC_R$ can be related to TC in the ODF for the multiple output case, and to the production function for a single output. For a single output, $R = py = pf(x, t) \Rightarrow \partial \ln R/\partial t = \partial \ln f(x, t)/\partial t \equiv TC_p(x, t)$. Thus, TC from the revenue and production functions is theoretically the same. Intuitively, if output is
increased by k% due to a shift in the production function, revenue will also increase
by k% holding output price unchanged. However, empirical results are likely to be
different because data used for estimation of the revenue function are different from
those used in a production function estimation.
For the multiple output case, the technology can be represented in various forms. If one uses an output distance function (ODF), which without inefficiency is nothing but a transformation function $T(y, x, t) = A$ with homogeneity of degree 1 in outputs imposed for identification [20], the ODF can be expressed as $\ln y_1 = \ln \psi(\hat{y}, x, t)$, where $\hat{y} = (y_2/y_1, \ldots, y_M/y_1)$. Assuming that TC changes all outputs radially, $TC_{odf} = \partial \ln y_1/\partial t = \partial \ln \psi(\hat{y}, x, t)/\partial t$. We can give it a revenue interpretation by writing

\[
R(p, x, t) = \sum_{m} p_m y_m = p_1 y_1 \Big( 1 + \sum_{m=2}^{M} \hat{p}_m \hat{y}_m \Big) \Rightarrow \frac{\partial \ln R}{\partial t} = \frac{\partial \ln y_1}{\partial t},
\tag{104}
\]

holding output ratios unchanged. Note that $\hat{p}_m = p_m/p_1$, $m = 2, \ldots, M$. Thus,

\[
TC_R(p, x, t) = TC_{odf}(\hat{y}, x, t) \tag{105}
\]

that is, TC from the revenue function is the same as TC from the ODF. The interpretation is the same as in the single output case: $TC_{odf}$ is the percentage change in revenue, holding output prices and inputs unchanged.

Indirect Production Function

In formulating the profit maximization problem, we assume that producers do not face any budget constraint. In reality, the producers may face a budget constraint. In such a case, profit maximization, given output price, is the same as output maximization if input cost is exactly the same as the exogenously given budget. Thus,

\[
\max\ \pi = pf(x, t) - wx \quad \text{subject to } C^0 = wx
\]

is the same as

\[
\max\ y = f(x, t) \quad \text{subject to } C^0 = wx
\]

where $C^0$ is the exogenously given budget. That is, the input allocation under profit maximization that satisfies the exogenously given budget is the same as output maximization. The above problem gives solutions of input demand functions which can be plugged into the production function. This gives the indirect production function (IPF), which can be written as $y = \phi(w, C^0, t)$ and which is homogeneous of degree zero in $w$ and $C^0$. TC from the IPF can be derived from

\[
TC_{ipf}(w, C^0, t) = \partial \ln \phi(w, C^0, t)/\partial t \tag{106}
\]

which shows the percentage change in output over time, keeping input prices and the
budget constraint unchanged. TCp (x, t) shows the rate at which output changes over
time, holding input quantities constant. On the other hand, TCipf (w, C0 , t) shows
the rate at which output changes, holding input prices and budget unchanged. If
input quantities and input prices are unchanged, then the budget constraint is also
unchanged. Thus, the two problems are theoretically the same. For example, if
TCp (x, t) increases output by k% holding x unchanged, TCipf (w, C0 , t) will be the
same if input prices do not change. While TCp (x, t) does not assume any economic
behavior and it does not depend on input prices, TCipf (w, C0 , t) is based on an
explicit economic behavior and it depends on input prices. Thus, although both $TC_p(x, t)$ and $TC_{ipf}(w, C^0, t)$ measure the rate of change in output over time, the underlying functions depend on different sets of variables and therefore, empirically, the estimates of TC from $f(x, t)$ and $\phi(w, C^0, t)$ are likely to be different.

Technical Change and TFP Change

TC and Profit

Since our discussion is focused on technical change, from an individual producer’s point of view, a question that arises naturally is how TC affects productivity and
profitability for an individual producer. The underlying assumption is that profit is
the ultimate goal of a producer in making production decisions. To show the link,
we first relate profit with productivity and then link it to TC. For this, we define
productivity as the ratio of output produced to an input index, instead of using the
ratio of output to labor (which is labor productivity). In a one input case, productivity
is nothing but the average product (AP) of the input. So, the question is whether
high productivity is always good to a producer. Alternatively, is low productivity
necessarily bad to a producer?
The answer depends on input and output prices. If real wage (w/p) increases
faster than AP, productivity in value terms (py/wx) will be lower. Since py/wx = R/C
is the return to the outlay, it can be lower (higher) even if AP is higher (lower). Thus,
low productivity (AP) in physical terms is not always bad judging from return to the
outlay.
Can productivity for a producer be increased by using more labor? More
generally, can it be input driven? To answer this, assume that there is no inefficiency.
Go back to the production function, which shows that productivity declines as x is
increased (follows from concavity of the production function). Labor productivity
can be increased in the short run by using more capital. But capital is not free. Thus,
productivity cannot increase simply by using more inputs. Input driven growth is
not sustainable in the sense that it cannot increase productivity [2, 25].
The production function has to shift up (technical progress) to increase produc-
tivity and increase returns to the outlay. This is the ideal situation as can be seen in
the figure below.

Is high productivity always good for a producer from the point of view of
profit? That is, how does productivity affect profit? For this, we link profit (π ) to
productivity starting from the definition of profit π = py − wx ⇒ y = π /p + (w/p)x.
From the figure below, it can be seen that the optimal x, x∗ (associated with
maximum profit) is given by the point of tangency between the straight line
y = π /p + (w/p)x and the production function, y = f (x). If x < x∗ a firm could
increase its profit by increasing the use of x, although it will reduce its productivity.
That is, high productivity is associated with low profitability. Increase in input usage
might increase profit but lower productivity.
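The trade-off between productivity and profit described above can be made concrete with a small numerical sketch. The Cobb-Douglas form $f(x) = A x^a$ and the values of A, a, p, and w are assumptions chosen for illustration only.

```python
def output(x, A=10.0, a=0.5):
    """Hypothetical concave production function f(x) = A * x**a."""
    return A * x ** a

def average_product(x, A=10.0, a=0.5):
    """Productivity in the one-input case: y / x."""
    return output(x, A, a) / x

def profit(x, p=1.0, w=1.0, A=10.0, a=0.5):
    return p * output(x, A, a) - w * x

def optimal_input(p=1.0, w=1.0, A=10.0, a=0.5):
    """Solve the first-order condition p * f'(x) = w for x."""
    return (a * p * A / w) ** (1.0 / (1.0 - a))

x_star = optimal_input()      # profit-maximizing input use
x_low = 16.0                  # an input level below x_star
```

Moving from `x_low` to `x_star` raises profit while lowering the average product, which is the sense in which higher productivity need not mean higher profit.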
[Figure: iso-profit line y = π/p + (w/p)x tangent to the production function y = f(x) at x∗, with intercept (π/p)∗]

TC as a Component of TFP Change

Production Function Approach

From the discussion above, it is clear that input productivity (partial) is affected by TC defined as a shift in the production function. In a multiple inputs case, productivity can be measured by changes in (y/x), where x is an index of aggregate input (input aggregator). In such a case, productivity change is called total factor productivity (TFP) change as opposed to partial productivity change, which is the change in $y/x_j$, $j = 1, \ldots, J$. Starting from the production function and the definition of TFP change ($T\dot{F}P$) as $\dot{y} - \dot{x} \equiv \dot{y} - \sum_j S_j \dot{x}_j$, it is straightforward to show that [13], [63]

\[
T\dot{F}P = TC_p + (RTS - 1) \sum_j \lambda_j \dot{x}_j = TC_p + Scale, \tag{107}
\]

where the “dot” over a variable means its rate of change and $S_j = w_j x_j / C$ is the cost share of input j. In deriving the result in (107), we assumed that input allocations are optimal (no misallocation/allocative inefficiency). RTS and $\lambda_j$ in the scale component are defined as

\[
RTS = \sum_j \frac{\partial \ln y}{\partial \ln x_j} \equiv \sum_j \varepsilon_j, \qquad \lambda_j = \varepsilon_j / RTS \tag{108}
\]

It is clear from (107) that TC plays an important part in total factor productivity
change. In fact, if one assumes a technology that exhibits unitary returns to scale
(as in Solow [2]), total factor productivity change will be solely determined by
TC. That is, productivity can only be increased by shifting the production function
upward (technical progress). This can be done via some policy instruments such as
investment in R&D, changes in management, etc.
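The step from the definition of TFP change to the decomposition in (107) can be sketched as follows, assuming (as the text does) that input allocations are optimal so that $\lambda_j = S_j$:

```latex
\begin{align*}
\dot{y} &= \sum_j \varepsilon_j \dot{x}_j + TC_p
         = RTS \sum_j \lambda_j \dot{x}_j + TC_p
         && \text{(differentiate } \ln y = \ln f(x,t)\text{, use } \varepsilon_j = RTS\,\lambda_j\text{)} \\
T\dot{F}P &= \dot{y} - \sum_j S_j \dot{x}_j
           = TC_p + RTS \sum_j \lambda_j \dot{x}_j - \sum_j S_j \dot{x}_j \\
          &= TC_p + (RTS - 1) \sum_j \lambda_j \dot{x}_j
          && \text{(since } \lambda_j = S_j\text{)}
\end{align*}
```

With RTS = 1 the scale term vanishes and TFP change reduces to $TC_p$, which is the unitary returns to scale case discussed above.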

Cost Function Approach

Starting from the accounting definition of cost (C = wx) and the neoclassical cost function, C = C(w, y, t), it is shown by Denny, Fuss, and Waverman [26] that

\[
T\dot{F}P = \dot{y} - \sum_j S_j \dot{x}_j = \big(1 - RTS^{-1}\big)\, \dot{y} - \dot{C}_t = Scale + TC_c \tag{109}
\]

where $RTS = [\partial \ln C/\partial \ln y]^{-1}$. As argued before, TC in (107) and (109) are not exactly the same. They have different interpretations and are derived from two different representations of the technology.

Profit Function Approach

Starting from the accounting definition of profit (π = py − wx) and the neoclassical profit function π = π(w, p, t), it is straightforward to show that [27]

\[
T\dot{F}P = (\pi/py)\,(\partial \ln \pi/\partial t) + (RTS - 1) \sum_j S_j \dot{x}_j = TC_\pi + Scale \tag{110}
\]

Note that $TC_\pi$ is the change in profit as a ratio of total revenue and therefore it is not the same as $\partial \ln \pi/\partial t$. Thus, the interpretation of TC and its magnitude differ depending on whether one uses a production, cost, or profit function.

Formulating and Estimating TC Without Estimating Profit/Cost Function

TC from a Production Function Formulation

In modeling TC, we used either a primal or a dual approach to specify the technology and derive measures of TC therefrom. In this subsection, we consider formulations where TC is formulated differently, although its definition is exactly the same. For this we start with the production function y = f(x, t), which gives $\dot{y} = \sum_j \lambda_j \dot{x}_j + TC(x, t)$, where $\lambda_j = \partial \ln y/\partial \ln x_j$ are input elasticities. If firms maximize profit in choosing the inputs and output and the markets are perfectly competitive, $\lambda_j = \partial \ln y/\partial \ln x_j = w_j x_j / py = R_j$, which can be obtained from the observed data on inputs, output, and their prices. Thus, $\dot{y} - \sum_j R_j \dot{x}_j = TC(x, t)$, which can be computed from data without doing any econometric estimation. In this framework, TC will be observation specific. Note that although the profit maximizing assumption is used here, TC is a function of input quantities x and t. However, one can argue that since the input demand functions in a profit maximizing case depend on w and p, TC can be expressed in terms of w, p, and t. This framework is pointed out in Chambers [3].
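The growth accounting calculation described above can be sketched in a few lines. The discrete-time approximation used here (log differences for growth rates and period-averaged revenue shares) is an assumption of this sketch, not something prescribed in the text, and the two-period data are synthetic.

```python
import numpy as np

def growth_accounting_tc(y, x, p, w):
    """TC = ydot - sum_j R_j * xdot_j, with R_j = w_j x_j / (p y).

    y, p: arrays of shape (T,); x, w: arrays of shape (T, J).
    Growth rates are log differences; shares are averaged over adjacent periods.
    """
    y, x, p, w = (np.asarray(a, float) for a in (y, x, p, w))
    rev_share = (w * x) / (p * y)[:, None]            # R_j, shape (T, J)
    avg_share = 0.5 * (rev_share[1:] + rev_share[:-1])
    dlny = np.diff(np.log(y))
    dlnx = np.diff(np.log(x), axis=0)
    return dlny - np.sum(avg_share * dlnx, axis=1)

# Synthetic data: y = A * K**0.3 * L**0.7 with A growing by 2% (in logs).
K = np.array([100.0, 110.0])
L = np.array([50.0, 52.0])
A = np.exp(np.array([0.0, 0.02]))
y = A * K**0.3 * L**0.7
p = np.ones(2)
x = np.column_stack([K, L])
# Competitive input prices equal the value of marginal products.
w = np.column_stack([0.3 * p * y / K, 0.7 * p * y / L])
tc = growth_accounting_tc(y, x, p, w)
```

No parameter is estimated here: the residual recovers the 2% shift purely from observed quantities and prices, which is why the measure is observation specific.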
If firms minimize cost and output is exogenously given (as is the case in the service industries), then using the first order conditions of cost minimization, we can express input elasticities as $\lambda_j = \partial \ln y/\partial \ln x_j = E_{cy}^{-1} S_j$, where $E_{cy} = \partial \ln C/\partial \ln y$. Thus, $TC(x, t) = \dot{y} - E_{cy}^{-1}(y, w, t) \sum_j S_j \dot{x}_j$, which can be written in a regression format

\[
\dot{y} = TC(x, t) + E_{cy}^{-1}(y, w, t)\, \dot{x}, \tag{111}
\]

where $\dot{x} = \sum_j S_j \dot{x}_j$ and $E_{cy}(\cdot)$ is a function of w, y, and t because the cost function depends on them. Furthermore, since the input demand functions (in a cost minimizing model) depend on w, y, and t, we can write

\[
\dot{y} = TC(w, y, t) + E_{cy}^{-1}(y, w, t)\, \dot{x} \equiv \beta_0(w, y, t) + \beta_1(w, y, t)\, \dot{x} + v, \tag{112}
\]

which is a semiparametric smooth coefficient (SPSC) regression model after adding the noise term v. Since the intercept and slope coefficients are nonparametric functions of w, y, and t, TC $= \beta_0(w, y, t)$ and RTS (which is the reciprocal of $E_{cy}$) $= \beta_1(w, y, t)$ are much more flexible than those from a parametric function such as the translog production function. The variables in this SPSC model, $\dot{y}$ and $\dot{x}$, are computed from data. To estimate the model (the functional coefficients), we need data on input prices in addition to output quantities. If prices are constant (the same for all firms), the model still works, although $\beta_0$ and $\beta_1$ will be functions of only y and t. Note that here we are not using a cost function per se, although the cost minimization assumptions are invoked.
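One common way to estimate a smooth coefficient model of this kind is kernel-weighted least squares evaluated at a point of interest. The sketch below uses a local-constant estimator with a product Gaussian kernel and a fixed bandwidth; these choices, the function name `spsc_fit`, and the synthetic data are assumptions of this illustration and are not taken from the text.

```python
import numpy as np

def spsc_fit(ydot, xdot, z, z0, h):
    """Local-constant estimate of (beta0(z0), beta1(z0)) in ydot = b0(z) + b1(z) * xdot + v.

    z: (n, d) array of smoothing variables (e.g., w, y, t); z0: (d,) evaluation point.
    """
    z = np.asarray(z, float)
    u = (z - z0) / h
    k = np.exp(-0.5 * np.sum(u**2, axis=1))            # product Gaussian kernel weights
    X = np.column_stack([np.ones_like(xdot), xdot])    # intercept and slope regressors
    XtK = X.T * k                                      # weight each observation
    return np.linalg.solve(XtK @ X, XtK @ ydot)

# Synthetic check: with constant coefficients and no noise, the estimator
# should return them at any evaluation point.
rng = np.random.default_rng(0)
n = 200
z = rng.uniform(size=(n, 2))
xdot = rng.normal(0.03, 0.02, size=n)
ydot = 0.02 + 1.1 * xdot
beta_hat = spsc_fit(ydot, xdot, z, z0=np.array([0.5, 0.5]), h=0.2)
```

Repeating the fit over a grid of evaluation points traces out the intercept and slope as functions of the smoothing variables; in practice the bandwidth would be chosen by cross-validation rather than fixed.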

TC from a Cost Function Formulation

Here we start from the cost function C(w, y, t), in which cost minimization conditions are explicitly used. The log derivative of C = C(w, y, t), where C = wx, gives $\dot{C} = \sum_j S_j \dot{w}_j + E_{cy} \dot{y} + \dot{C}_t$, which is rewritten as $\dot{C} - \sum_j S_j \dot{w}_j = E_{cy}(w, y, t)\, \dot{y} - TC_c(w, y, t)$. Defining $\dot{q} = \dot{C} - \sum_j S_j \dot{w}_j$ (which is data), the above expression becomes

\[
\dot{q} = \beta_1(w, y, t)\, \dot{y} + \beta_0(w, y, t) \tag{113}
\]

After adding an error term to it, we get a SPSC model in which the intercept function represents TC (here $\beta_0 = -TC_c$) and the slope is the reciprocal of RTS. Being nonparametric, both TC and RTS are flexible and do not depend on the functional form of the cost function. Note that TC from (112) is not likely to be the same as the one from (113).

TC from TFP Change

The Divisia index defined as $\dot{y} - \sum_j S_j \dot{x}_j$ is used as a measure of nonparametric (growth accounting) TFP change. Starting from the production function y = f(x, t) yields

\[
T\dot{F}P = TC_p + (RTS - 1) \sum_j \lambda_j \dot{x}_j = TC_p + SCALE \tag{114}
\]

in which all the notation is as defined before. If firms minimize cost, and there are no mistakes in input allocations, $\lambda_j = S_j$. Defining $\dot{x} = \sum_j S_j \dot{x}_j$ as the change in the aggregate x as before, (114) gives (after adding error terms) a regression function of the form

\[
T\dot{F}P = \big(RTS(x, t) - 1\big)\, \dot{x} + TC(x, t) \equiv \beta_0(x, t) + \beta_1(x, t)\, \dot{x} + v \tag{115}
\]

which is a SPSC regression function. The intercept function gives an estimate of $TC(x, t) = \beta_0(x, t)$ and the slope function estimates $RTS(x, t) - 1 = \beta_1(x, t)$ (scale economies).
Kumbhakar and Sun [28] used an IDF and their estimating equation used the
rate of change in the input, input ratios, and output variables. The coefficients are
nonparametric functions of outputs, input ratios, and time. They linked the estimated
coefficients to TFP change, RTS, and TC. The main advantage of their approach is
that they did not use any functional form assumption on the IDF.

Models with Technical Inefficiency

So far, we assumed that the production process is fully efficient. However, a firm
can be technically inefficient. That is, firms may either not be able to produce
the maximum possible output (output-oriented inefficiency) or they may be using
more inputs than necessary (input-oriented inefficiency). Instead of repeating our
discussion of inefficiency in all the cases discussed so far, we focus on the discussion
in terms of TFP change. Details can be found in Kumbhakar and Lovell [29],
Kumbhakar, Wang and Horncastle [30]. See also the chapters by Kumbhakar et al.
[66, 67] in this volume.

TC and Technical Inefficiency

Note that although technical inefficiency can be defined in a cross-sectional setup, here we focus on panel models because our objective is to model TC.

The Production Function Approach

There is a plethora of panel models that include technical inefficiency. We start with the production function and write it as (Kumbhakar [60])

\[
y_{it} = f(x_{it}, t) \exp(-u_{it}),
\]

where $u_{it} \geq 0$ is output-oriented technical inefficiency. All other variables are as defined before. Output-oriented technical inefficiency, $u_{it}$, measures the proportion by which actual output ($y_{it}$) falls short of maximum possible output (labeled as the frontier output $f(x_{it}, t)$). Output technical efficiency (or simply technical efficiency) is then defined by $y_{it}/f(x_{it}, t) = \exp(-u_{it}) \leq 1$. For small values of $u_{it}$, technical efficiency, $\exp(-u_{it})$, is approximately equal to 1 minus technical inefficiency, since $\exp(-u_{it}) \approx 1 - u_{it}$.¹
Given the input vector xit , output can change because of a shift in the technology
(which is labeled as TC) as well as a change in technical efficiency (TEC).
Differentiating the production function totally and using the definition of TFP change (as before), we get

\[
T\dot{F}P = (RTS - 1) \sum_j \lambda_j \dot{x}_j + TC + TEC + \sum_j (\lambda_j - S_j)\, \dot{x}_j, \tag{116}
\]

where every term is defined as before except TEC = −∂u/∂t. In (116) we allow the possibility of input misallocation. Subscripts i and t are omitted to avoid notational clutter. Defining the scale component as $SCALE = (RTS - 1) \sum_j \lambda_j \dot{x}_j$ and input misallocation (or price effects) as $MISAL = \sum_j (\lambda_j - S_j)\, \dot{x}_j$, the relationship in (116) decomposes TFP change into SCALE, TC, TEC, and MISAL. Note that MISAL captures either deviations of input prices from the value of their marginal products, that is, $w_j \neq p f_j$, or departure of the marginal rate of technical substitution from the ratio of input prices ($f_j/f_k \neq w_j/w_k$), which is often labeled as allocative inefficiency. It is clear that exclusion of inefficiency (both technical and allocative) is likely to bias estimates of TFP change.
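A small numerical sketch of the four-way decomposition in (116) may help fix ideas. The elasticities, cost shares, input growth rates, TC, and TEC used below are hypothetical values; the function only applies the definitions of SCALE and MISAL given above.

```python
import numpy as np

def tfp_decomposition(eps, shares, xdot, tc, tec):
    """Decompose TFP change into SCALE, TC, TEC and MISAL.

    eps: input elasticities; shares: cost shares S_j; xdot: input growth rates.
    """
    eps, shares, xdot = (np.asarray(a, float) for a in (eps, shares, xdot))
    rts = eps.sum()
    lam = eps / rts                                   # lambda_j = eps_j / RTS
    scale = (rts - 1.0) * (lam @ xdot)
    misal = (lam - shares) @ xdot
    return {"SCALE": scale, "TC": tc, "TEC": tec, "MISAL": misal,
            "TFP": scale + tc + tec + misal}

eps = np.array([0.4, 0.7])        # hypothetical elasticities, RTS = 1.1
shares = np.array([0.3, 0.7])     # observed cost shares
xdot = np.array([0.05, 0.01])     # input growth rates
result = tfp_decomposition(eps, shares, xdot, tc=0.015, tec=-0.005)

# Direct Divisia calculation for comparison: ydot - sum_j S_j * xdot_j,
# where ydot = sum_j eps_j * xdot_j + TC + TEC.
ydot = eps @ xdot + 0.015 - 0.005
tfp_direct = ydot - shares @ xdot
```

The sum of the four components matches the directly computed Divisia index, so the decomposition only reallocates observed TFP change among its sources.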
If technical efficiency is time-invariant (i.e., TEC = −∂u/∂t = 0), the above decomposition of TFP change shows that TEC does not affect TFP change. Under the unitary RTS assumption, the TFP change formula is identical to the one derived in Nishimizu and Page [31], viz.,

\[
T\dot{F}P = TC - \partial u/\partial t + \sum_j (\lambda_j - S_j)\, \dot{x}_j = TC + TEC + MISAL. \tag{117}
\]

Note that all the components of TFP change are firm specific and vary over time. With inefficiency, one can define TFP in a potential sense, that is, $TFP^p_{it} = y_{it} \exp(u_{it})/x_{it}$, where $x_{it}$ is an input aggregator. This gives $T\dot{F}P^p_{it} = \dot{y}_{it} - \dot{x}_{it} + \dot{u}_{it} = T\dot{F}P_{it} + \dot{u}_{it}$, where $\dot{x}_{it} = \sum_j S_{jit} \dot{x}_{jit}$. Thus, potential TFP change adjusted for inefficiency will depend on the temporal pattern of inefficiency change ($\dot{u}_{it}$). Consequently, TFP change ignoring inefficiency is likely to be biased (unless inefficiency is time invariant). If inefficiency declines over time ($\dot{u}_{it} < 0$), the discrepancy between the potential TFP change and TFP change ($T\dot{F}P^p_{it} - T\dot{F}P_{it} = \dot{u}_{it}$) will decline.

1 Kumbhakar and Hjalmarsson [61] were the first to separate TC from technical efficiency.

Estimation

The components of TFP change defined in the preceding section can be estimated
from parametric production, cost, and profit functions. In this subsection, we discuss
in detail the production function approach in which inputs are assumed to be
exogenous (not decision variables).2 We assume that the production function is
defined by a translog function, viz.,
\[
\begin{aligned}
\ln y_{it} = {} & a_0 + \sum_j a_j \ln x_{jit} + a_t t + \frac{1}{2} \sum_j \sum_k a_{jk} \ln x_{jit} \ln x_{kit} + \frac{1}{2} a_{tt} t^2 \\
& + \sum_j a_{jt} \ln x_{jit}\, t - u_{it} + v_{it},
\end{aligned}
\tag{118}
\]

where $v_{it}$ is the i.i.d. $N(0, \sigma_v^2)$ random noise term, $i = 1, \ldots, N$ indexes producers, and $t = 1, \ldots, T$ indexes time. This is not the state-of-the-art stochastic
frontier model. Greene [32, 33] defined models that include either random or
fixed firm-effects along with uit and vit . Although the presence of the firm-effects
will not change the decomposition formula discussed below, it will change the
estimation method and the parameter estimates. Further developments by Colombi
et al. [34] and Kumbhakar et al. [35] decompose inefficiency into time-invariant and
time-varying components. Again, these models will not change the decomposition
formula because the time-invariant effects do not count in TFP change. Again, the
estimation method will be different. See Kumbhakar, Wang, and Horncastle [30] for
discussion of TFP change decomposition for some of these new models.
Estimation of the above models requires some assumption about the behavior of
technical inefficiency over time. If it is assumed to be time-invariant, the TFP change
is not affected by the presence of technical inefficiency. Thus, it might not be an
interesting specification to study TFP change. There are several specifications that
make the technical inefficiency term time-varying. In Kumbhakar [36] and Battese
and Coelli [37], uit = ui g(t) where g(t) is a parametric function of time, and ui
is a nonnegative random variable with half-normal (or exponential) distribution.
In Lee and Schmidt [38], uit = ui λt where λt are time-effects (represented by
time dummies) and ui can be either fixed or random producer-specific effects with
no distributional assumption. Cornwell, Sickles, and Schmidt [39] specified it as
$u_{it} = a_{1i} + a_{2i} t + a_{3i} t^2$, where $a_{1i}, a_{2i}, a_{3i}$ are producer-specific parameters. Since
time appears in a linear fashion as a regressor in the production function, as well as
in uit , the parameters associated with the time variable in the production function
and in uit cannot, in general, be identified. For example, it is not possible to separate
the effects of technical change and technical efficiency change in the Cornwell,
Sickles, and Schmidt [39] model. In the specification $u_{it} = u_i g(t)$, the time effects, captured by g(t), are identified through the nonlinear specifications of g(t) posited by Kumbhakar [36] and Battese and Coelli [37]. In the specification $u_{it} = u_i \lambda_t$, no assumption is made on the temporal pattern of inefficiency. However, a restriction is imposed in both specifications: the temporal pattern of technical inefficiency is assumed to be the same for all producers.

2 We are not surveying the literature that deals with input endogeneity.
With these caveats in mind, let us consider the model in which $v_{it}$ is i.i.d. $N(0, \sigma_v^2)$, $u_{it} = u_i g(t)$, $u_i \sim$ i.i.d. $N^+(\mu, \sigma_u^2)$, and $v_{it}$ is independent of $u_i$ for all i and t. Based on these assumptions, one can derive the probability density function of the composite error term $v_{it} - u_{it}$, and hence the log-likelihood function for the model in (118). Once the ML estimators of the parameters of the production function in (118), the parameters in g(t), as well as $\sigma_u^2$, $\sigma_v^2$, and $\mu$ are obtained, the Jondrow, Lovell, Materov, and Schmidt [40] (JLMS) formula can be used to obtain estimates of $u_{it}$ (see Kumbhakar and Lovell [29] for details).
Since our interest here is estimation of TFP change and its components that are
given in (116), we need to get estimates of RTS, SCALE, TC, and TEC, all of which
can be computed once the parameters (including u) in (118) are estimated. These are
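\[
\begin{aligned}
RTS_{it} &= \sum_j \varepsilon_{jit} = \sum_j \Big( a_j + \sum_k a_{jk} \ln x_{kit} + a_{jt}\, t \Big), \\
SCALE_{it} &= (RTS_{it} - 1) \sum_j \lambda_{jit}\, \dot{x}_{jit}, \\
TC_{it} &= a_t + a_{tt}\, t + \sum_j a_{jt} \ln x_{jit}, \quad \text{and} \quad TEC = -\frac{\partial u_{it}}{\partial t} = -u_i \frac{\partial g(t)}{\partial t}.
\end{aligned}
\tag{119}
\]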

Once these components are estimated, the TFP change can be computed for each
producer at every point in time. If price information is available, one can calculate
Sj directly from the data.
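The components in (119) can be evaluated mechanically once the translog parameters are in hand. The sketch below does so for a single producer-period. All parameter values are hypothetical, and the weights λj are assumed to equal εj/RTS, a definition that is not given in this section and should be checked against the one used for (116).

```python
import numpy as np

def tfp_components(a, A, a_jt, a_t, a_tt, ln_x, t, x_dot):
    """Evaluate RTS, SCALE and TC from (119) for one producer-period.

    a     : (J,) first-order input coefficients a_j
    A     : (J, J) symmetric second-order coefficients a_jk
    a_jt  : (J,) input-time interaction coefficients
    ln_x  : (J,) log input quantities
    x_dot : (J,) input growth rates
    """
    eps = a + A @ ln_x + a_jt * t          # input elasticities eps_j
    rts = eps.sum()                        # returns to scale
    lam = eps / rts                        # ASSUMPTION: lambda_j = eps_j / RTS
    scale = (rts - 1.0) * (lam @ x_dot)    # scale component
    tc = a_t + a_tt * t + a_jt @ ln_x      # technical change
    return rts, scale, tc

# Hypothetical parameter values and data, for illustration only.
a = np.array([0.6, 0.5])
A = np.array([[0.02, -0.01], [-0.01, 0.03]])
a_jt = np.array([0.001, -0.002])
rts, scale, tc = tfp_components(a, A, a_jt, a_t=0.01, a_tt=0.0005,
                                ln_x=np.array([1.0, 2.0]), t=3.0,
                                x_dot=np.array([0.02, 0.04]))
```

With these numbers returns to scale exceed one, so growing inputs contribute a positive scale effect on top of technical change.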
Although dated, there are two main advantages of the specification in uit = ui λt
over uit = ui g(t). First, the temporal pattern of technical inefficiency is completely
flexible (other than the normalizing restrictions for identification). Second, no
distributional assumption is made on ui which can be estimated from the coefficient
of firm dummies (in the fixed effects model). The production function in (118) can
be estimated using the nonlinear least squares method without any distributional
assumption on v. Since the mean of vit is zero, one can use the time mean of the
composed error term (residual after estimation) to estimate ui and then uit = ui g(t).
Since the decomposition and estimation using a cost or profit function are similar,
we decided not to discuss them here. Details can be found in Kumbhakar [13] for
single as well as multiple outputs. The decomposition results are similar to those of
a single output. One can also add facilitating inputs (z) in the production function.
The corresponding TFP change formula will have an extra component associated
with the rate of change in the z variables.
In a cost function, one can add allocative inefficiency. However, estimation
of both technical and allocative inefficiency requires a system approach, either
the production function and the first-order conditions of cost minimization (as

in Schmidt and Lovell [41], Kumbhakar [42, 43], Kumbhakar and Wang [44],
Kumbhakar and Lai [45], Tsionas and Kumbhakar [64]) or the cost function and
cost share equations (Kumbhakar and Tsionas [46], Greene [47, 48], Kumbhakar
and Lovell [29], Kumbhakar et al. [30], among others).
Modeling technical and allocative inefficiency in a profit function is more
complicated unless the underlying production function is restrictive (Cobb-Douglas)
(see Kumbhakar [42], Kumbhakar et al. [30]). For a profit function with flexible
production function, see Kumbhakar [49].

TC in Production Models with Good and Bad Outputs

In many cases, production of desirable (good) outputs generates some unintended
(bad) outputs. The special issue of Empirical Economics (2018) edited by Kumbhakar
and Malikov [65] presents papers on theoretical and econometric
models that address the issues arising from bad outputs. Interested readers are advised
to consult the papers in the special issue.
In this subsection, we consider a modeling approach in the spirit of Fernandez
et al. [50, 51], Førsund [52], Murty et al. [53], and Malikov et al. [54] in which
production of both good and bad outputs is considered. Specifically, we model
the firm’s production process as a system of simultaneous production technologies
for desirable and undesirable outputs. In this setup, desirable outputs are produced
by transforming inputs via the conventional transformation function, satisfying all
standard assumptions. The by-production of undesirable outputs is treated as the so-
called “residual generation technology” which models production of bad outputs.
The production function for bad outputs depends on production of good output(s), as
well as inputs that might be specifically used to mitigate production of bad outputs.
By separating the generation of undesirable outputs from that of desirable outputs,
we ensure that the former are not modeled as inputs in the same production function.
An advantage of the by-production system approach is that it can distinguish
between technical efficiency and environmental efficiencies that are undesirable
output-specific. It can also differentiate between traditional technical productivity
and environmental (“green”) productivity. This framework allows for technical
inefficiency in the production of, say, one desirable output and environmental
inefficiency in the by-production of multiple undesirable outputs. For example,
in electricity generation, the good output is electricity which is produced using
(good inputs) labor and capital. Production of electricity also produces unintended
bad outputs such as SO2 and NOx . The technologies for the production of these
bad outputs are separated so that the arguments are good output and possibly bad
inputs. An alternative to the by-production model is to consider a single technology
which can be represented by F(y, xg , xb , z) = 1 where y, xg , xb , z are vectors of
M good outputs, J good inputs, K pollution generating (bad) inputs and Q bad
outputs. The monotonicity assumptions on these variables consist of Fy ≥ 0, Fxg ≤
0, Fxb ≤ 0 and Fz ≤ 0 where Fy , Fxg , Fxb , Fz are partial derivatives of F (.).

Since Fxg ≤ 0, Fxb ≤ 0 and Fz ≤ 0, there is no difference between z, xg and
xb in F(y, xg , xb , z) = 1 from a purely mathematical point of view. That is, bad
outputs can be treated as inputs (both xg and xb ), and since inputs are assumed
to be freely disposable, so are bad outputs. This violates axioms of production
theory and is criticized in the environmental production literature (e.g., see Färe
and Grosskopf [56] and Färe et al. [55]). Further, if, for example, in estimating the
input distance function all bad outputs are treated as inputs, both bad outputs and
inputs can be scaled back by the same proportion, holding good outputs constant.
However, this approach violates a basic engineering requirement that a reduction
in bad outputs requires the usage of more good inputs, holding good outputs and
inefficiency constant. See Kumbhakar and Tsionas [46] for other problems.
Given the problems in modeling bad outputs in terms of a single-equation
distance function, Fernandez et al. [50], Førsund [52], and Murty et al. [53] proposed
a by-production approach which uses two separate technologies to model good and
bad outputs. The former describes the textbook type production process, where
inputs (good and bad) are transformed into desirable outputs, and the process does
not depend on bad outputs. Furthermore, it satisfies all the standard properties,
most importantly the free-disposability property. The latter can be viewed as a
residual generation technology which models the production of bad outputs as a
function of good outputs and xb . Inefficiency is allowed in each technology, thereby
distinguishing technical inefficiency from environmental inefficiency.
We write the by-production system in terms of stochastic transformation
functions, with P separate environmental residual generation functions, one for each
undesirable output. This is similar to Kumbhakar and Tsionas [57], that is,

\[
\begin{aligned}
F\left(x, \theta^{-1} y, t\right) &= \exp\{v_0\} \\
H_p\left(y, \lambda_p B_p, t\right) &= \exp\{v_p\} \quad \forall\, p = 1, 2, \ldots, P
\end{aligned}
\tag{120}
\]

where θ ≤ 1 and λp ≤ 1 are technical and environmental efficiencies, respectively;
and (v0 , vp ) are the noise terms.
We rewrite the above system in the log form as

\[
\begin{aligned}
\ln y_t &= \ln f(x_t, t) - u_{0,t} + v_{0,t} \\
\ln B_{p,t} &= \ln h_p(y_t, t) + u_{p,t} + v_{p,t}
\end{aligned}
\tag{121}
\]

where, for convenience, we define $f(\cdot) = [F(\cdot, 1)]^{-1}$ and $h_p(\cdot) = [H_p(\cdot, 1)]^{-1}$; and
$u_{0,t} = -\ln \theta_t \ge 0$ and $u_{p,t} = -\ln \lambda_{p,t} \ge 0$ (p = 1, 2, . . . , P) are technical and
environmental inefficiencies, respectively.
Total differentiation of the above equations with respect to t yields

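\[
\frac{d \ln y_t}{dt} = \sum_{j=1}^{J} \frac{\partial \ln f(x_t, t)}{\partial \ln x_{j,t}} \frac{\partial \ln x_{j,t}}{\partial t} + \frac{\partial \ln f(x_t, t)}{\partial t} - \frac{\partial u_{0,t}}{\partial t}
\]

\[
\frac{d \ln B_{p,t}}{dt} = \frac{\partial \ln h_p(y_t, t)}{\partial \ln y_t} \frac{\partial \ln y_t}{\partial t} + \frac{\partial \ln h_p(y_t, t)}{\partial t} + \frac{\partial u_{p,t}}{\partial t} \quad \forall\, p
\]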

where we have made use of ∂v0, t /∂t = ∂vp, t /∂t = 0 since (v0 , vp ) are the i.i.d.
noise terms. After some rearranging, we get the following Solow [25] type (Divisia)
technical productivity index (TFPG):

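\[
TFPG = \frac{d \ln y_t}{dt} - \sum_{j=1}^{J} \frac{\partial \ln f(x_t, t)}{\partial \ln x_{j,t}} \frac{\partial \ln x_{j,t}}{\partial t}
= \underbrace{\frac{\partial \ln f(x_t, t)}{\partial t}}_{TTC} \; \underbrace{- \frac{\partial u_{0,t}}{\partial t}}_{TEC},
\tag{122}
\]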

along with the similarly defined environmental productivity index (EPG) for each p
from

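\[
\begin{aligned}
EPG_p &= -\left[ \frac{d \ln B_{p,t}}{dt} - \frac{\partial \ln h_p(y_t, t)}{\partial \ln y_t} \frac{\partial \ln y_t}{\partial t} \right] \\
&= \underbrace{- \frac{\partial \ln h_p(y_t, t)}{\partial t}}_{ETC_p} \; \underbrace{- \frac{\partial u_{p,t}}{\partial t}}_{EEC_p} \quad \forall\, p = 1, 2, \ldots, P
\end{aligned}
\tag{123}
\]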

Furthermore, (122) and (123) provide a meaningful way to decompose productivity
indices into technical change and efficiency change components. The
conventional technical productivity index TFPG equals the sum of the technical
change TTC = ∂lnf (xt , t)/∂t, which measures the temporal shift in the production
frontier, and technical efficiency change TEC = − ∂u0, t /∂t which measures the
movement toward (away from) the frontier. Similarly, the Bp -oriented environmental
productivity index EPGp is decomposed into environmental technological change
ETCp = − ∂ln hp (yt , t)/∂t and environmental efficiency change EECp = − ∂up, t /∂t.
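As a numerical check on the decomposition in (122) and (123), the sketch below builds a noise-free version of the system in (121) from hypothetical log-linear frontiers and hypothetical inefficiency paths, then compares growth net of input (or good-output) growth with the sum of the technical change and efficiency change components. Every functional form and number here is an assumption chosen for readability, not an estimate.

```python
import numpy as np

# Hypothetical log-linear technologies; derivatives can be read off directly.
def ln_f(ln_x, t):            # good-output frontier
    return 0.5 + 0.8 * ln_x + 0.02 * t

def ln_h(ln_y, t):            # residual-generation frontier for one bad output
    return 0.1 + 0.9 * ln_y - 0.01 * t

def u0(t):                    # technical inefficiency, falling over time
    return 0.30 * np.exp(-0.05 * t)

def up(t):                    # environmental inefficiency, falling over time
    return 0.20 * np.exp(-0.03 * t)

def ln_x_path(t):             # assumed input path
    return 1.0 + 0.03 * t

def ln_y_path(t):             # log good output, noise set to zero
    return ln_f(ln_x_path(t), t) - u0(t)

def ln_B_path(t):             # log bad output, noise set to zero
    return ln_h(ln_y_path(t), t) + up(t)

def ddt(fn, t, h=1e-5):       # central finite difference
    return (fn(t + h) - fn(t - h)) / (2 * h)

t = 4.0
TTC = 0.02                    # d ln f / dt
TEC = -ddt(u0, t)
ETC = 0.01                    # -(d ln h / dt), with d ln h / dt = -0.01
EEC = -ddt(up, t)

# Growth net of input growth (good output) and net of good-output growth (bad output).
TFPG = ddt(ln_y_path, t) - 0.8 * ddt(ln_x_path, t)
EPG = -(ddt(ln_B_path, t) - 0.9 * ddt(ln_y_path, t))
```

In this example both frontiers improve and both inefficiencies shrink, so all four components are positive and each index equals the sum of its two parts up to finite-difference error.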
Note the conceptual difference between the definition of a “technological
progress” for desirable outputs and that for undesirable outputs. For a desirable
output y, the technological progress corresponds to the case of TTC > 0, that is,
an outward shift in the production frontier over time, whereas for an undesirable
output Bp , the technological progress corresponds to ETCp > 0 (∂ln hp (yt , t)/∂t < 0), that is, an inward
shift in the residual generating frontier over time. Thus, the residual generating
frontier Hp (·) (p = 1, . . . , P) is defined as the minimum quantity of undesirable
output generated when producing a given quantity of desirable outputs subject to
the material balance condition.
To implement it empirically, one needs to specify functional forms on ln f (.) and
ln hp (.) and estimate the system jointly. Malikov et al. [54] used translog functional
forms for ln f (.) and ln hp (.) (p = 1, . . . , P). Random firm effects, in ln f (.) and
ln hp (.), were also introduced so that the estimates are not contaminated by firm
effects in the technologies of both good and bad output production. Malikov et al.
[54] used a Bayesian approach to estimate the system.

Productivity and Profitability

The concept of productivity change in a micro model is not quite intuitive because
even if productivity change is positive it is not clear whether profitability of a firm
will be increasing over time or not. To make the concept more intuitive and practical,
we examine change in profit as a percentage of total cost as opposed to profit. This
is useful because in practice, actual profit is often negative. If profit is negative but
change in profit is positive (things are improving), then the percentage change in
profit will be negative and there is no way to separate this from the case in which
change in profit is negative (things are getting worse) but actual profit is positive.
This problem can be avoided by expressing profit change as a percentage of cost or
revenue.
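The ambiguity described above is easy to reproduce numerically. In the sketch below, with purely hypothetical figures, one firm halves its losses while another sees its profit halved. The percentage change in profit is the same for both, whereas the change in profit per unit of cost has the expected sign in each case.

```python
def pct_change(profit_prev, profit_now):
    """Rate of change in profit relative to last period's profit."""
    return (profit_now - profit_prev) / profit_prev

def change_per_cost(profit_prev, profit_now, cost):
    """Change in profit expressed per unit of total cost."""
    return (profit_now - profit_prev) / cost

cost = 1000.0   # hypothetical total cost, the same for both firms

# Firm A: losses shrink from -100 to -50 (things are improving).
a_pct = pct_change(-100.0, -50.0)
a_cost = change_per_cost(-100.0, -50.0, cost)

# Firm B: profit falls from 100 to 50 (things are getting worse).
b_pct = pct_change(100.0, 50.0)
b_cost = change_per_cost(100.0, 50.0, cost)
```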
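Differentiating profit π (= py − wx) totally, we get

\[
\frac{1}{C}\frac{\partial \pi}{\partial t} = \frac{py}{C}\{\dot{p} + \dot{y}\} - \Big\{ \sum_j \frac{w_j x_j}{C}\, \dot{w}_j + \sum_j S_j\, \dot{x}_j \Big\}.
\]

Using the expression for TFP change, we can express the above as

\[
\frac{1}{C}\frac{\partial \pi}{\partial t} = \frac{R}{C}\, \dot{p} + \frac{R}{C}\, \dot{y} - \sum_j S_j\, \dot{w}_j + \dot{TFP}
\tag{124}
\]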

We call the left-hand side of the above equation profitability change (change in
profit per unit of cost). It is positive when a change in profit is positive, irrespective
of profit being positive or negative. This will not be the case if we use rate of change
in profit as a profitability change measure, because negative profit with a negative
profit change will be identical to a positive profit change with positive profit. The
above decomposition gives three additional components for profitability change and
these are related to output and output price changes as well as input price changes.
This result holds irrespective of the number of outputs. Note that TFP change is a
physical measure and it cannot describe performance of a firm in terms of profit.
Since profit depends on input and output prices, the decomposition in (124) is more
appealing. Even if TFP change is positive and large, profitability change can be
low or even negative if change in output price is negative and input price change is
positive and large.
To show this explicitly, we write the profitability change as

\[
\frac{1}{C}\frac{\partial \pi}{\partial t} = \frac{R}{C}\, \dot{R} - \sum_j s_j\, \dot{w}_j + \dot{TFP}
\tag{125}
\]

so that profitability change depends on the change in TFP, which is a physical measure,
plus the financial measures, which include rates of change in revenue (R) and input
prices. For an individual producer, the profitability measure is more important than
TFP change. For a country, the TFP change is more meaningful because a positive
TFP change means higher average output, which in turn means higher wellbeing no

matter whether higher average output goes to the workers or not. It is possible to
derive the profitability change formula with facilitating inputs (z). In such a case,
there will be an extra term associated with the profitability change formula.

TC and Factor Productivity with One Variable Input

If we assume that labor is the only variable input in the production process, that
is, y = f (L, z, t) where z is a vector of quasi-fixed inputs, then labor productivity
is the same as TFP. This might be the mindset of people when they do not
distinguish TFP from labor productivity. In such a case, the change in labor productivity is
$\dot{y} - \dot{L} = (\varepsilon_L - 1)\dot{L} + \sum_j \varepsilon_j \dot{z}_j + TC$, where $\varepsilon_L = \partial \ln f(\cdot)/\partial \ln L < 1$. Thus, labor
productivity may not increase with an increase in labor use, even with TC > 0, unless
$TC > (1 - \varepsilon_L)\dot{L} - \sum_j \varepsilon_j \dot{z}_j$. With multiple variable inputs,
$\dot{y} - \dot{L} = (\varepsilon_L - 1)\dot{L} + \sum_{j \neq L} \varepsilon_j \dot{x}_j + TC$.
Thus, labor productivity will depend on whether nonlabor inputs are
increasing over time or not, along with the other two terms (TC and $\varepsilon_L$). Therefore,
one cannot ignore the other factors while talking about labor productivity. This is a
common mistake in many official statistics.
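A small numerical illustration of this point, under hypothetical values: with a labor elasticity of 0.6 and labor use growing 5% a year, 1% technical progress is not enough to raise output per worker unless the quasi-fixed input grows as well. The function name and all figures are assumptions made for the example.

```python
def labor_productivity_growth(eps_L, L_dot, eps_z, z_dot, tc):
    """Growth of y/L when labor is the only variable input:
    (eps_L - 1) * L_dot + sum_j eps_j * z_dot_j + TC."""
    quasi_fixed = sum(e * z for e, z in zip(eps_z, z_dot))
    return (eps_L - 1.0) * L_dot + quasi_fixed + tc

# Quasi-fixed input held constant: productivity falls despite TC > 0.
growth = labor_productivity_growth(eps_L=0.6, L_dot=0.05,
                                   eps_z=[0.3], z_dot=[0.0], tc=0.01)

# Same technology, but the quasi-fixed input also grows by 5%.
growth_with_capital = labor_productivity_growth(eps_L=0.6, L_dot=0.05,
                                                eps_z=[0.3], z_dot=[0.05], tc=0.01)
```

The first case gives (0.6 − 1)(0.05) + 0.01 = −0.01, while the second adds 0.3 × 0.05 = 0.015 and turns the growth rate positive.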
If there is one variable input but many outputs, one can specify the technology
as L = ψ(z, y, t) which is the labor requirement function introduced in Diewert
[58], Kumbhakar and Hjalmarsson [59]. Like a cost function, the labor requirement
function gives the minimum amount of labor required to produce a vector of
outputs, ceteris paribus. Assuming a TT specification with translog functional form
as Kumbhakar and Hjalmarsson [59, 62] do, the labor requirement function (without
z) is

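\[
\begin{aligned}
\ln L_{it} = {} & \beta_0 + \sum_m \beta_m \ln y_{mit} + \frac{1}{2} \sum_k \sum_l \beta_{kl} \ln y_{kit} \ln y_{lit} \\
& + \alpha_t\, t + \frac{1}{2} \alpha_{tt}\, t^2 + \sum_k \lambda_{kt} \ln y_{kit}\, t + v_{it}
\end{aligned}
\tag{126}
\]

where ym are M different services which are exogenously given. If a GI formulation
is used, the model becomes

\[
\begin{aligned}
\ln L_{it} = {} & \beta_0 + \sum_m \beta_m \ln y_{mit} + \frac{1}{2} \sum_k \sum_l \beta_{kl} \ln y_{kit} \ln y_{lit} + A(t) \\
& + \sum_k \lambda_{kt} \ln y_{kit}\, A(t) + v_{it}
\end{aligned}
\tag{127}
\]

where $A(t) = \sum_t \delta_t DT_t$, $DT_t$ being the time dummies.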

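TC in the above models (126) and (127) is

\[
\begin{aligned}
TC\_TT_{it} &= \alpha_t + \alpha_{tt}\, t + \sum_k \lambda_{kt} \ln y_{kit} \\
TC\_GI_{it} &= [A(t) - A(t-1)] \Big[ 1 + \sum_k \lambda_{kt} \ln y_{kit} \Big]
\end{aligned}
\tag{128}
\]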

A negative sign on TC means labor requirement is reduced (labor saving) over
time, ceteris paribus. This is equivalent to technical progress in the cost function
model.
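The two expressions in (128) can be compared on the same data. The sketch below evaluates both for one producer-period with two services. All coefficients are hypothetical, and the general index is stored as an array with A[0] = 0, a normalization assumed here for identification.

```python
import numpy as np

def tc_tt(alpha_t, alpha_tt, lam, ln_y, t):
    """Time-trend TC in (128): alpha_t + alpha_tt * t + sum_k lam_k * ln y_k."""
    return alpha_t + alpha_tt * t + lam @ ln_y

def tc_gi(A, lam, ln_y, t):
    """General-index TC in (128): [A(t) - A(t-1)] * (1 + sum_k lam_k * ln y_k).

    A[s] holds the index for period s; A[0] = 0 is an assumed normalization.
    """
    return (A[t] - A[t - 1]) * (1.0 + lam @ ln_y)

# Hypothetical coefficients and data for two services.
lam = np.array([-0.002, 0.001])
ln_y = np.array([2.0, 3.0])
A = np.array([0.0, -0.010, -0.025, -0.030])   # time-dummy coefficients

tt = tc_tt(alpha_t=-0.01, alpha_tt=-0.001, lam=lam, ln_y=ln_y, t=2)
gi = tc_gi(A, lam, ln_y, t=2)
```

Both values are negative here, so either specification would indicate labor-saving technical change in period 2. The GI version lets the size of the shift differ freely from one period to the next.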
If there is more than one variable input and outputs are exogenous, the best
strategy is to use the IDF which is invariant to the choice of numeraire input.
This strategy works not only for estimating TC but also inefficiency and change
in technical efficiency. For more details, there are many papers that use IDF with
inefficiency and decompose output growth and TFP change (Kumbhakar [20],
Sipilainen et al. [22], among others).

Concluding Remarks

The literature on TC and productivity is extensive. Our discussion of it in this
chapter does not cover every aspect of it, nor does it cover
all the topics addressed in papers on TC and productivity (which would
require a monograph). We concentrated on the models that are most widely used in the
microeconomics literature.³ In particular, we focused on parametric models based
on different forms of production, cost, and profit functions to model TC in a
single output case. For multiple outputs, we used cost, profit, and distance function
formulations. Most of our models and discussions excluded detailed treatment of
technical inefficiency, especially the recent models. This is done to avoid repetition
and to keep a reasonable length. Although most of our models use panel data,
we added a section on modeling TC in a cross-sectional setup when shift in the
technology is explained by exogenous factors. Finally, we added a section linking
TC to TFP and also used a model to include both good and bad outputs. TFP is also
linked to a measure of profitability change.

References
1. Krugman P (1997) The age of diminishing expectations. MIT Press, Cambridge
2. Solow RM (1956) A contribution to the theory of economic growth. Q J Econ 70:65–94
3. Chambers RG (1988) Applied production analysis. Cambridge University Press, Cambridge
4. Fuss M, McFadden D (1978) Production economics: a dual approach to theory and applications
volume I: the theory of production. North-Holland, Amsterdam

3 See Ray [68] in this volume for other formulations.


5. Jorgenson DW (1986) Econometric methods for modeling producer behavior. In: Griliches Z,
Intriligator MD (eds) Handbook of econometrics, vol 3. North-Holland, Amsterdam, pp 1841–
1915
6. Jorgenson DW (2000) Econometric modeling of producer behavior. The MIT Press, Cambridge
7. Jin H, Jorgenson DW (2009) Econometric modeling of technical change. J Econ 157:205–219
8. Kumbhakar SC, Heshmati A, Hjalmarsson L (1999) Parametric approaches to productivity
measurement: a comparison among alternative models. Scand J Econ 101:405–424
9. Diewert WE (1971) An application of the Shephard duality theorem: a generalized Leontief
production function. J Polit Econ 79:481–507
10. Diewert WE, Wales TJ (1987) Flexible functional forms and global curvature conditions.
Econometrica 55:43–68
11. Baltagi BH, Griffin JM (1988) A general index of technical change. J Polit Econ 96:20–41
12. Stevenson R (1980) Measuring technological bias. Am Econ Rev 70:162–173
13. Kumbhakar SC, Nakamura S, Heshmati A (2000) Estimation of firm-specific technological
bias, technical change and total factor productivity: a dual approach. Econ Rev 19(4):493–515
14. Wiley DE, Schmidt WH, Bramble WJ (1973) Studies of a class of covariance structure models.
J Am Stat Assoc 68:317–323
15. Kumbhakar SC (1992) Allocative distortions, technical progress, and input demand in U.S.
airlines: 1970–1984. Int Econ Rev 33(3):723–737
16. Kumbhakar SC (1994) A multiproduct symmetric generalized McFadden cost function. J Prod
Anal 5:349–357
17. Kutlu L, Liu S, Sickles R (2020) Cost, revenue, and profit function estimates. In: Ray SC,
Chambers R, Kumbhakar SC (eds) Handbook of production economics, vol 1. Springer Nature,
Singapore
18. Lau LJ (1978) Applications of profit functions. In: Fuss M, McFadden DL (eds) Production
economics: a dual approach to theory and applications, Volume I: The theory of production.
Elsevier, Amsterdam
19. Hasenkamp G (1976) Specification and estimation of multiple-output production functions,
Lecture notes in economics and mathematical systems, vol 120. Springer-Verlag,
Berlin/Heidelberg
20. Kumbhakar SC (2013) Specification and estimation of multiple output technologies: a primal
approach. Eur J Oper Res 231:465–473
21. Kumbhakar SC, Lien G (2009) Productivity and profitability decomposition: a parametric
distance function approach. Food Econ Acta Agric Scand C 6:143–155
22. Sipilainen T, Kumbhakar S, Lien G (2014) Performance of dairy farms in Finland and Norway
from 1991–2008. Eur Rev Agric Econ 41:63–86
23. Heshmati A, Kumbhakar SC (2011) Technical change and total productivity growth: the case
of Chinese provinces. Technol Forecast Soc Chang 78:575–590
24. Heshmati A, Kumbhakar SC (2013) A general model of technical change with an application
to the OECD countries. Econ Innov New Technol 23:25–48
25. Solow RM (1957) Technical change and the aggregate production function. Rev Econ Stat
39:312–320
26. Denny M, Fuss M, Everson C, Waverman L (1981) The measurement and interpretation of
total factor productivity in regulated industries, with an application to Canadian telecommunications.
In: Cowing TG, Stevenson RE (eds) Productivity measurement in regulated industries.
Academic Press, New York, pp 179–218
27. Kumbhakar SC (2002) Productivity measurement: a profit function approach. Appl Econ Lett
9:331–334
28. Kumbhakar SC, Sun K (2012) Estimation of TFP growth: a semiparametric smooth coefficient
approach. Empir Econ 43:1–24
29. Kumbhakar SC, Lovell CAK (2000) Stochastic frontier analysis. Cambridge University Press,
Cambridge
30. Kumbhakar SC, Wang H-J, Horncastle AP (2015) A practitioner’s guide to stochastic frontier
analysis using Stata. Cambridge University Press, New York

31. Nishimizu M, Page JM (1982) Total factor productivity growth, technological progress and
technical efficiency change: dimensions of productivity change in Yugoslavia, 1965–78. Econ
J 92:920–936
32. Greene W (2005) Fixed and random effects in stochastic frontier models. J Prod Anal 23:
7–32
33. Greene W (2005) Reconsidering heterogeneity in panel data estimators of the stochastic
frontier model. J Econ 126:269–303
34. Colombi R, Kumbhakar SC, Martini G, Vittadini G (2014) Closed-skew normality in stochastic
frontiers with individual effects and long/short-run efficiency. J Prod Anal 42:123–136
35. Kumbhakar SC, Lien G, Hardaker JB (2014) Technical efficiency in competing panel data
models: a study of Norwegian grain farming. J Prod Anal 41:321–337
36. Kumbhakar SC (1990) Production frontiers, panel data, and time-varying technical inefficiency.
J Econ 46:201–211
37. Battese GE, Coelli TJ (1992) Frontier production functions, technical efficiency and panel data:
with application to paddy farmers in India. J Prod Anal 3:153–169
38. Lee Y, Schmidt P (1993) A production frontier model with flexible temporal variation in
technical efficiency. In: Fried H, Lovell CAK, Schmidt S (eds) The measurement of productive
efficiency. Oxford University Press, Oxford
39. Cornwell C, Schmidt P, Sickles RC (1990) Production frontiers with cross-sectional and time-
series variation in efficiency levels. J Econ 46:185–200
40. Jondrow J, Lovell CAK, Materov IS, Schmidt P (1982) On the estimation of technical
inefficiency in the stochastic frontier production function model. J Econ 19:233–238
41. Schmidt P, Lovell CAK (1979) Estimating technical and allocative inefficiency relative to
stochastic production and cost frontiers. J Econ 9:343–366
42. Kumbhakar SC (1987) The specification of technical and allocative inefficiency in stochastic
production and profit frontiers. J Econ 34:335–348
43. Kumbhakar SC (1997) Modeling allocative inefficiency in a translog cost function and cost
share equations: an exact relationship. J Econ 76:351–356
44. Kumbhakar SC, Wang H-J (2006) Estimation of technical and allocative inefficiency: a primal
system approach. J Econ 134:419–440
45. Kumbhakar SC, Lai H-p (2019) Technical and allocative efficiency in a panel stochastic
production frontier system model. Eur J Oper Res 278:255–265
46. Kumbhakar SC, Tsionas EG (2005) The joint measurement of technical and allocative
inefficiencies: an application of Bayesian inference in nonlinear random-effects models. J Am
Stat Assoc 100:736–747
47. Greene WH (1980) On the estimation of a flexible frontier production model. J Econ 13:
101–115
48. Greene WH (1993) The econometric approach to efficiency analysis. In: Fried HO, Lovell
CAK, Schmidt SS (eds) The measurement of productive efficiency: techniques and applications.
Oxford University Press, Oxford, pp 68–119
49. Kumbhakar SC (2001) Estimation of a profit function when profit is not maximum. Am J Agric
Econ 83:1–19
50. Fernandez C, Koop G, Steel MFJ (2002) Multiple-output production with undesirable outputs:
an application to nitrogen surplus in agriculture. J Am Stat Assoc 97:432–442
51. Fernandez C, Koop G, Steel MFJ (2005) Alternative efficiency measures for multiple-output
production. J Econ 126:411–444
52. Førsund F (2009) Good modelling of bad outputs: pollution and multiple-output production.
Int Rev Environ Resour Econ 31:1–38
53. Murty S, Russell RR, Levkoff SB (2012) On modeling pollution-generating technologies. J
Environ Econ Manag 64:117–135
54. Malikov E, Kumbhakar SC, Tsionas EG (2015) Bayesian approach to disentangling technical
and environmental productivity. Econometrics 3:443–465
55. Färe R, Grosskopf S, Noh D-W, Weber WL (2005) Characteristics of a polluting technology:
theory and practice. J Econ 126:469–492

56. Färe R, Grosskopf S (2003) Nonparametric productivity analysis with undesirable outputs. Am
J Agric Econ 85(4):1070–1074
57. Kumbhakar SC, Tsionas EG (2016) The good, the bad and the technology: endogeneity in
environmental production models. J Econ 190:315–327
58. Diewert WE (1974) Functional forms for revenue and factor requirements functions. Int Econ
Rev 15:119–130
59. Kumbhakar SC, Hjalmarsson L (1995) Decomposing technical change with panel data: an
application to the public sector. Scand J Econ 97:309–323
60. Kumbhakar SC (2000) Estimation and decomposition of productivity change when production
is not efficient: a panel data approach. Econ Rev 19:425–460
61. Kumbhakar SC, Hjalmarsson L (1993) Technical efficiency and technical progress in Swedish
dairy farms. In: Fried H, Schmidt S, Lovell CAK (eds) The measurement of productive
efficiency: techniques and applications. Oxford University Press, New York
62. Kumbhakar SC, Hjalmarsson L (1998) Relative performance of public and private ownership
under yardstick competition: electricity retail distribution. Eur Econ Rev 42:97–122
63. Kumbhakar SC, Heshmati A (1996) Technical change and total factor productivity growth in
Swedish manufacturing industries. Econ Rev 15(3):275–298
64. Kumbhakar SC, Tsionas EG (2020) On the estimation of technical and allocative efficiency in a
panel stochastic production frontier system model: some new formulations and generalizations.
Eur J Oper Res 287(2):762–775
65. Kumbhakar SC, Malikov E (2018) Good modeling of bad outputs, special issue (ed. Kumbhakar
and Malikov). Empir Econ 54:1–308
66. Kumbhakar SC, Parmeter CF, Zelenyuk V (2020) Stochastic frontier analysis: foundations
and advances I. In: Ray SC, Chambers R, Kumbhakar SC (eds) Handbook of production
economics, vol 1. Springer Nature, Singapore
67. Kumbhakar SC, Parmeter CF, Zelenyuk V (2020) Stochastic frontier analysis: foundations
and advances II. In: Ray SC, Chambers R, Kumbhakar SC (eds) Handbook of production
economics, vol 1. Springer Nature, Singapore
68. Ray S (2020) Conceptualization and measurement of productivity growth and technical
change: A nonparametric approach. In: Ray SC, Chambers R, Kumbhakar SC (eds) Handbook
of production economics, vol 1. Springer Nature, Singapore
69. Triebs T, Kumbhakar SC (2018) Management in production: from unobserved to observed. J
Prod Anal 49:111–121
22 Economics of Externalities: An Overview
Jean-Paul Chavas

Contents
Introduction
Motivations
A General Equilibrium Analysis of Efficiency under Externalities
Efficient Pricing under Externalities
Efficient Policies
Conclusion
References

Abstract

Externalities arise when the decisions of an agent have direct effects on the
welfare of others. This chapter presents an overview of the economics of
externalities. Relying on Pareto efficiency, the analysis is presented in a general
equilibrium framework and evaluates the efficient management of externalities.
The investigation also focuses on the role of non-convexity and transaction costs.
It covers alternative institutional setups, including markets, government inter-
ventions, and contracts. It reexamines how efficient management of externalities
remains consistent with aggregate profit maximization under transaction costs
and non-convexity. It indicates how pricing can support an efficient allocation
under externalities, but this may require nonlinear pricing under non-convexity.

The author would like to thank two anonymous reviewers for useful comments on an earlier draft
of the chapter.

J.-P. Chavas
University of Wisconsin, Madison, WI, USA

© Springer Nature Singapore Pte Ltd. 2022

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_13

And it explores how the minimization of transaction costs is an integral part of
the efficient management of externalities.

Keywords

Externalities · Pareto efficiency · Non-convexity · Transaction costs

Introduction

Externalities are common in economics: they arise whenever an agent makes
decisions that affect the welfare of other agents [1, 2]. The agents can be producers
or consumers. And the externalities can be positive (when they improve the welfare
of others) or negative (when they have adverse effects on others). Examples abound.
Pollution is a negative production externality when it causes damages to crops
or public health. Second-hand smoke is a negative consumer externality when a
smoker has adverse effects on the health of nearby nonsmokers. Bees generate
a positive production externality when, besides producing honey, they also provide
pollination services to surrounding crops and orchards. An individual getting
vaccinated against a communicable disease generates a positive consumer externality
when vaccination reduces the odds of infection for both the individual and other people nearby.
Thus, externalities are prevalent. But how they should be managed is less clear. One
argument is that externalities make unregulated markets inefficient (when market
prices do not reflect the full social cost or benefit of goods and services). In this
case, recommendations are often made for government interventions implementing
policies that can “internalize” the externalities (e.g., [1, 3–5]). Others have argued
that externalities can be managed by contracts among the affected agents, in which
case an efficient outcome can be attained without government intervention [6].
This debate raises two important questions. First, how should an efficient
allocation be characterized in the presence of externalities? Second, how can we evaluate the role of
markets versus non-market institutions in the efficient management of externalities?
These are the key questions addressed in this chapter.
It is well known that, in the absence of externalities and under convexity
conditions, competitive markets can implement an efficient allocation [7]. This
has stimulated much interest in relying on market institutions and decentralized
decision-making to support efficient allocations. But externalities create significant
challenges to market allocations [8]. First, by affecting the ability to decentralize
decision-making in an efficient manner, externalities undermine the efficiency of
unregulated markets (e.g., [5]). Second, as argued by Starrett [9] and Baumol
and Bradford [10], production externalities can imply non-convex technologies. In
addition, environmental goods often generate externalities while being produced
in non-convex ecosystems [11]. It is well known that, under non-convexity and
uniform pricing, markets can fail to be efficient (e.g., [12–17]). Chavas and Briec
[13] showed that nonlinear pricing can be required under non-convexity. Following
Chavas [18], it means that nonlinear pricing may be needed to manage externalities
efficiently. Third, as emphasized by Coase [6] and others, various institutional
choices (including contracts and the legal system) can also be used to manage
externalities.
Relying on the classical Pareto efficiency criterion, this chapter provides a broad
overview of the economics of externalities. By definition, externalities involve the
effects of some agents on others. Managing externalities means developing some
coordination scheme among these agents, which typically involves the use of
resources. We call the costs of these resources “transaction costs.” They include
information cost, search cost, and enforcement cost. The economics of transaction
costs has been analyzed by Coase [19], Foley [20], Hahn [21], Williamson [22],
Williamson and Winter [23], Shleifer [24], and others. Transaction costs vary across
economic institutions and are relevant in the evaluation of externality management.
This applies to market as well as nonmarket institutions (including government and
contracts).
The efficient management of externalities suggests a need to integrate Coase’s
[6] analysis of contracts with the role of transaction costs and non-convexity.
This chapter argues that the Coase theorem (stating that the efficient management
of externalities is consistent with aggregate profit maximization) holds under
transaction costs and non-convexity. This result applies to production externalities
as well as coordination/exchange externalities. In addition, the efficient management
of consumption externalities is consistent with the minimization of aggregate
consumer expenditure. In this context, we examine how transaction costs play a role
in efficiency analysis and how the minimization of transaction costs is an integral
part of efficient allocations. Such results apply under non-convexity provided that
we allow for nonlinear pricing.
In the presence of non-convexity, the efficient management of externalities
becomes more complex. First, the classical dichotomy often made in economics
between nonmarket institutions and competitive markets (exhibiting uniform pricing
for all market participants) is not very helpful. Indeed, when externalities generate
non-convexity, markets exhibiting nonlinear pricing may be needed to implement
efficient allocations. Second, within market institutions, the implementation of
nonlinear pricing (where prices vary across agents) requires the identification and
implementation of price discrimination schemes. Such choices would be made by
firm managers and/or policy makers (a “visible hand”) and would typically involve
transaction costs. This indicates that transaction costs play a role in the management
of externalities under markets as well as nonmarket institutions.
The chapter is organized as follows. Section 2 motivates the analysis of efficiency
under externalities using graphical arguments. A general equilibrium character-
ization of Pareto efficiency under externalities, non-convexity, and transaction
costs is presented in Sect. 3. The analysis goes beyond Baumol and Oates [1]
in several ways. First, it relies on a benefit function as an intuitive measure of
aggregate welfare (the maximization of aggregate benefit providing a convenient
representation of efficiency). Second, the linkages between the maximization of
aggregate benefit and pricing give useful insights into the role of markets in supporting
efficient allocations under externalities. For example, the approach identifies
conditions under which the implementation of simple Pigouvian taxes/subsidies is
inappropriate (when uniform pricing is inefficient). Third, the analysis stresses the
role of transaction costs in choosing institutions and policies. Policy implications
for the efficient management of externalities are discussed in Sect. 4. Finally, Sect.
5 concludes.

Motivations

To motivate the analysis, we examine a production process involving two outputs
(y1, y2) produced under a given set of resources. We consider alternative situations,
all illustrated in Fig. 1. We start the analysis with a situation where there are no
externalities. Without externalities, the outputs satisfy (y1, y2) ∈ Y0 ⊂ ℝ², where
Y0 denotes the feasible set without externalities. We assume that the feasible
set Y0 is convex, corresponding to a technology exhibiting diminishing marginal
productivity. We now introduce externalities in the analysis.

P1: An externality arises when the decisions of one agent have direct effects
on the welfare of other agents.
Coase [6] discusses an example with two agents: a rancher managing livestock and a
farmer producing crops, the externality coming from straying livestock that destroy
crops growing on the neighboring farm. In this context, letting y1 be livestock
production and y2 be crop production, there is a negative externality of y1 on y2 .
From P1, at least two agents are required. Indeed, in the Coase example, if the farmer
and the rancher were to merge into a single firm, then the management of crop losses
due to livestock damage would reduce to an issue of internal management (in which
case the externality would be “internalized”).

Fig. 1 Feasible set under negative externalities

We examine two externality scenarios. Under scenario 1, the outputs satisfy
(y1, y2) ∈ Y1 ⊂ ℝ² with Y1 ⊂ Y0 and the feasible set Y1 is convex. Under scenario
2, (y1, y2) ∈ Y2 ⊂ ℝ² with Y2 ⊂ Y0 but the feasible set Y2 is non-convex. These two
scenarios are illustrated in Fig. 1. Both scenarios represent negative externalities that
have adverse effects on productivity and reduce the feasible set. As shown in Fig. 1,
introducing externalities implies an inward shift in the boundary of the feasible set
from isoquant (A B C) to isoquant (A E1 C) under scenario 1 and to isoquant (A E2
C) under scenario 2. Note that all isoquants go through the points A and C: there
is no externality when only one product is produced (i.e., when either y1 = 0 or
y2 = 0). But when both products are produced (i.e., when y1 > 0 and y2 > 0),
negative externalities between y1 and y2 imply a reduction in productivity (due to crop
losses generated by straying livestock in the Coase example). As discussed below,
the distinction between scenario 1 (where Y1 is convex) and scenario 2 (where Y2
is non-convex) will prove important.
While Fig. 1 reflects negative externalities reducing the feasible set, it also shows
that, when external effects are large enough (under scenario 2), this reduction is
associated with a shift from a convex set Y0 to a non-convex set Y2.

P2: Externalities can lead to non-convexity in the production set.

Property P2 has been noted in the literature by Starrett [9], Baumol and Bradford
[10], and Dasgupta and Maler [11]. We discuss below how non-convexity creates
significant challenges to the analysis of economic efficiency under externalities.
This section focuses on a graphical analysis involving two firms (a rancher and
a farmer in the Coase example) and one consumer. Firm 1 (the rancher) produces
output y1 (livestock production) while firm 2 (the farmer) produces y2 (crop output),
with y1 having negative effects on the production of y2. This example will help
motivate the more general analysis presented in the rest of the chapter.
First, consider scenario 1 where negative externalities imply that Y1 ⊂ Y0, the
feasible set Y1 being convex. This scenario is illustrated in Fig. 2. Figure 2 shows
that the efficient allocation is given by point E1. Indeed, point E1 generates the
highest possible utility level, represented by the indifference curve (D E1 D’). At
the efficient point E1, the indifference curve (D E1 D’) is tangent to
the isoquant line (A E1 C) (the upper bound of the feasible set Y1). And in the
neighborhood of point E1, the slope of both curves is equal to the slope of the line (F
E1 F’) in Fig. 2. In turn, this slope is equal to (−p2/p1), where (p1, p2) are the (social)
shadow prices of (y1, y2). When the feasible set is convex, taking prices (p1, p2) as
given, the line (F E1 F’) is the budget line supporting the efficient point E1 for the
two producers and the consumer.
But what if producers failed to take into consideration the external effects of
their decisions on others? In general, a failure to internalize production externalities
would affect production incentives. In the Coase example, if the external cost
imposed on the farmer is neglected, the rancher would have incentives to
produce y1 beyond what is socially optimal, leading to a market equilibrium where
prices (p1’, p2’) differ from (p1, p2). This is illustrated in Fig. 3 for the market
good y1. Figure 3 distinguishes between a social supply curve (where production
decisions reflect the presence of externalities) and a private supply curve (where
externalities are ignored in an unregulated market). In Fig. 3, the efficient point
E1 corresponds to (p1, y1∗) and is situated at the intersection of the social supply
curve and the demand curve. But the unregulated market equilibrium is given by
point E1’, where (p1’, y1’) differs from (p1, y1∗) because production decisions ignore
external effects. Such a market equilibrium would necessarily be inefficient. In this
context, government intervention can help. It can be done in at least two ways.
First, government could set regulations/standards or issue permits/quotas stipulating
the efficient quantities to be produced (i.e., y1∗ and y2∗ in Figs. 2 and 3). Second,
government could implement Pigouvian taxes/subsidies that reflect the social cost of
the externality [5]. When the feasible set is convex, such Pigouvian taxes/subsidies
would restore efficiency [1, 3, 5].1

Fig. 2 Efficiency under externalities and a convex feasible set Y1 (scenario 1)

P3: In unregulated markets facing externalities, government policy can
help restore efficiency. This can be done through regulations using
standards/permits/quotas; or through price policies imposing Pigouvian
taxes/subsidies when the feasible set is convex.
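
Property P3 can be illustrated with a small numerical sketch in Python. Everything in it is a hypothetical assumption made for illustration (linear inverse demand a − b·y, linear private marginal cost c + d·y, a constant marginal external damage e, and the parameter values); none of it comes from the chapter's figures. Under these convex (linear) assumptions, a per-unit tax equal to the marginal damage moves the market quantity to the surplus-maximizing quantity.

```python
def equilibrium_quantity(a, b, c, d, tax=0.0):
    """Quantity at which inverse demand a - b*y equals private marginal
    cost c + d*y plus a per-unit tax (hypothetical linear market)."""
    return (a - c - tax) / (b + d)


def social_surplus(a, b, c, d, e, y):
    """Gross consumer benefit minus private cost minus external damage."""
    gross_benefit = a * y - 0.5 * b * y ** 2
    private_cost = c * y + 0.5 * d * y ** 2
    external_damage = e * y  # constant marginal damage e (assumption)
    return gross_benefit - private_cost - external_damage


# Hypothetical parameter values, chosen only for illustration.
A, B, C, D, E = 10.0, 1.0, 2.0, 1.0, 2.0

y_market = equilibrium_quantity(A, B, C, D)         # externality ignored
y_taxed = equilibrium_quantity(A, B, C, D, tax=E)   # Pigouvian tax = marginal damage

if __name__ == "__main__":
    print("unregulated quantity:", y_market)
    print("quantity under Pigouvian tax:", y_taxed)
    print("surplus gain from the tax:",
          social_surplus(A, B, C, D, E, y_taxed)
          - social_surplus(A, B, C, D, E, y_market))
```

In this sketch the unregulated market overproduces relative to the taxed outcome, and social surplus is higher under the tax. The result relies on the convexity built into the linear specification.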

1 In cases where Pigouvian taxes/subsidies are not fiscally neutral, attaining efficiency requires
redistribution of any fiscal surplus/deficit to consumers through lump sum payments [1, 3, 5].
Otherwise, Pigouvian taxes would not achieve “first-best” efficiency.

Fig. 3 Market equilibrium under externalities

Fig. 4 Efficiency under externalities and a non-convex feasible set Y2 (scenario 2)

But what if the feasible set is not convex? We now turn our attention to scenario
2 where negative externalities are associated with the feasible set Y2 being
non-convex. This scenario is illustrated in Fig. 4. The efficient allocation corresponds
to point E2 in Fig. 4. Indeed, point E2 generates the highest possible utility level,
represented by the indifference curve (D E2 D’). Again, at the efficient point E2,
the indifference curve (D E2 D’) is tangent to the isoquant line (A E2 C)
(the upper bound of the feasible set Y2). And in the neighborhood of E2, the slope of
both curves is equal to the slope of the line (F E2 F’) in Fig. 4. What is new in Fig. 4 is
that the linkage between the slope of the line (F E2 F’) and efficiency breaks down.
Indeed, if the two firms faced prices given by the slope of (F E2 F’), they would have
an incentive to produce at point C, where only output y2 is produced. This is shown
in Fig. 4, where the line (F E2 F’) and the line through point C have the same slope
(reflecting the same relative prices), but aggregate revenue is higher at point C than
at point E2. Yet, point C is inefficient as it generates lower utility, the indifference
curve through point C being lower than the indifference curve (D E2 D’). In other
words, under the non-convexity of the set Y2, the slope of the line (F E2 F’) no longer
defines global prices that can support an efficient allocation. Note that this slope still
provides a local measure of the shadow prices of (y1, y2) in the neighborhood of the
efficient point E2. But this local measure does not provide a global representation of
efficient pricing. This raises the question: Does there exist a pricing scheme that can
support an efficient allocation? As shown in Fig. 4, an efficient pricing scheme is given
by the budget line (G E2 G’), the slope of this line reflecting the shadow relative
prices of (y1, y2). Like the line (F E2 F’), the line (G E2 G’) is tangent to both the
indifference curve (D E2 D’) and the isoquant (A E2 C) in the neighborhood of
the efficient point E2. But unlike (F E2 F’), the budget line (G E2 G’) provides
incentives for the firms to produce at the efficient point E2. And in contrast with
(F E2 F’), the budget line (G E2 G’) is nonlinear. Interpreting the slope of budget
lines as measures of relative prices, we deduce that a pricing scheme supporting an
efficient allocation must involve nonlinear pricing. Noting that this statement holds
only under the non-convexity of the feasible set (i.e., under scenario 2), this gives
the following result.

P4: In a market economy with externalities, nonlinear pricing may be required
to support an efficient allocation when the feasible set is non-convex.
Knowing from property P2 that externalities can lead to non-convexity of the
feasible set, property P4 indicates that linear pricing may not support an efficient
allocation. In such a situation, uniform Pigouvian taxes/subsidies mentioned in
property P3 would not be efficient.2
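
The breakdown of linear pricing under non-convexity can be checked numerically. The Python sketch below is illustrative only: the two frontier functions, the tangency points, and the grid search are assumptions chosen for convenience and are not taken from the chapter's figures. Along each frontier, y2 is written as a function of y1, so a price line has slope −p1/p2, with p2 normalized to 1. Prices are set equal to the frontier's slope at a chosen interior point, and the code then finds the frontier point that maximizes revenue at those prices.

```python
import math


def convex_set_frontier(y1):
    """Concave frontier: the feasible set below it is convex (assumed form)."""
    return math.sqrt(max(0.0, 1.0 - y1 * y1))


def nonconvex_set_frontier(y1):
    """Convex frontier: the feasible set below it is non-convex (assumed form)."""
    return (1.0 - math.sqrt(y1)) ** 2


def tangent_prices(frontier, y1_star, eps=1e-6):
    """Linear prices (p1, p2) given by the frontier's slope at y1_star,
    using a central-difference derivative and the normalization p2 = 1."""
    slope = (frontier(y1_star + eps) - frontier(y1_star - eps)) / (2.0 * eps)
    return -slope, 1.0


def revenue_maximizer(frontier, prices, n=2000):
    """Grid search for the y1 on [0, 1] that maximizes p1*y1 + p2*y2."""
    p1, p2 = prices
    grid = [k / n for k in range(n + 1)]
    return max(grid, key=lambda y1: p1 * y1 + p2 * frontier(y1))


if __name__ == "__main__":
    # Convex case: revenue maximization returns the tangency point.
    p = tangent_prices(convex_set_frontier, 0.6)
    print("convex set, chosen y1:", revenue_maximizer(convex_set_frontier, p))
    # Non-convex case: the same exercise selects a corner instead.
    p = tangent_prices(nonconvex_set_frontier, 0.16)
    print("non-convex set, chosen y1:", revenue_maximizer(nonconvex_set_frontier, p))
```

In the convex case the tangency point is the revenue maximizer, so linear prices support it. In the non-convex case the tangency point is a local revenue minimum along the frontier and the firms would move to a corner where only one output is produced, which is the failure described in property P4.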
Such complexities raise a more fundamental question: What is the most efficient
way to manage externalities? Trying to identify the appropriate institutional and
policy response to externality issues has generated a debate among economists and
policy makers. At least three different lines of arguments have been explored. A
first line is that externality issues can be managed through government intervention.
This includes regulation and/or Pigouvian taxes/subsidies chosen to make market
prices reflect the social cost of the externalities [3, 5]. But as noted above, standard
Pigouvian taxes may not always lead to efficient allocations. A second line is to note
that production externality issues can be resolved when the affected firms merge. In
this case, the externalities can be “internalized” under efficient management by the
merged firm. A third line is that externalities can be managed through contracts
between the affected parties [6]. In the Coase example mentioned above, this would
involve a contract between the rancher and the farmer that would stipulate how
livestock is managed to reduce or eliminate crop damages. For example, a contract
could stipulate building a fence to prevent livestock from straying and destroying the
crop grown on the neighboring farm. This could be efficient if the cost of building
a fence is less than the crop damages. As stressed by Coase [6], under this contract
option, there is no need for government intervention.

2 As discussed below, government intervention can still be helpful to achieve efficiency through
quantity regulations and/or through nonlinear pricing policies.

So, how can externalities be managed efficiently? There are many institutional
options, including government intervention, mergers, and contracts.3 In general,
it is difficult to tell which option is better. If each option could be implemented
under perfect information and at zero cost, then all options may be seen as equally
efficient, in which case efficiency alone would not provide much guidance.4 But
most situations involve imperfect information and the management of externalities
is typically costly. This suggests introducing transaction costs explicitly in the
analysis. Defining transaction cost as the cost of the resources used in the process of
coordination/exchange, this will provide additional insights on how externalities can
be managed efficiently. This is illustrated in Fig. 5 under two institutional options,
A and B, where institution B exhibits larger transaction costs (i.e., uses more
resources) than institution A in managing externalities, ceteris paribus. Resources
used in externality management are no longer available for consumption. As a
result, the feasible set for consumer goods is larger under institution A than under
institution B. As shown in Fig. 5, it follows that institution A is deemed more efficient
than institution B. Indeed, point EA attained under institution A generates a
higher level of utility (as given by the indifference curve (D EA D’)) than point EB
attained under institution B. This generates the following result.

P5: Transaction costs are relevant in the management of externalities: ceteris
paribus, institutions exhibiting lower transaction costs are deemed more efficient.
Property P5 states that transaction costs play an important role in the efficient
management of externalities. It indicates that each institutional option (government
intervention, merger, or contract) may be desirable on efficiency grounds when
associated with low transaction costs. Alternatively, any option involving large
transaction costs may be inefficient. To the extent that information costs are
included among transaction costs, property P5 would apply to information: efficient
management of externalities should be handled by institutions that have good access
to information. These arguments suggest that government intervention may be seen
as desirable when government has good information about the externalities and
government action can be implemented at relatively low cost (e.g., [5]).5 Similarly,
contracts provide a good option to manage externalities when contractual costs are
low and the affected agents are well informed [6, 24]. Finally, the merger option
may be desirable when the management of the merged firm is well informed and
effective in “internalizing” the externalities [19, 27].

Fig. 5 Efficiency under externalities and transaction costs

3 Other options include individual transferable permits/quotas [25] and Varian’s [26] scheme
involving a two-step mechanism that can implement efficient allocations as subgame-perfect
equilibria under externalities.
4 Note that equity considerations (not addressed in this chapter) can also play a role in evaluating
alternative externality management strategies.
5 In this context, when comparing government pricing policies versus government standards/quotas,
economists often follow Pigou [5] and argue in favor of pricing policies on the ground that they
are easier to implement and require less information (especially in the presence of heterogeneous
agents). These issues are further discussed in Sect. 4.

A General Equilibrium Analysis of Efficiency under Externalities

Based on the motivating example presented in Sect. 2, we now present a general
model of externalities. Following Chavas [18], consider an economy consisting of $m$
commodities and $n$ economic agents. We distinguish between two groups of agents:
consumers and producers. Let $N_c$ be the set of $n_c$ consumers, and $N_s$ the set of $n_s$
producers. The set of all agents is $N = N_c \cup N_s = \{1, 2, \ldots, n\}$, where $n = n_c + n_s$.

The $i$-th consumer chooses a consumption bundle $x_i = (x_{i1}, \ldots, x_{im}) \in X_i \subset \mathbb{R}^m$,
$i \in N_c$, where the feasible set $X_i$ is assumed to be convex and to have a lower
bound. Let $x = (x_1, \ldots, x_{n_c})$, where $x \in X = X_1 \times \cdots \times X_{n_c}$. Consumer
preferences for the $i$-th consumer are represented by a continuous utility function
$u_i(x)$, $i \in N_c$. The utility functions are general and allow for non-zero income effects.
And given $x = (x_1, \ldots, x_{n_c})$, they allow for externalities among consumers (as the
consumption decision of a consumer can affect the utility of other consumers).
The allocation of $m$ commodities among the $n$ agents also involves production
and exchange activities. The production activities of the $j$-th producer are denoted
by $y_j = (y_{j1}, \ldots, y_{jm}) \in \mathbb{R}^m$, $j \in N_s$. When the $k$-th commodity is a consumer good,
$y_{jk}$ is the nonnegative quantity produced by the $j$-th producer ($y_{jk} \ge 0$), $j \in N_s$, and $x_{ik}$
is the nonnegative quantity consumed by the $i$-th consumer ($x_{ik} \ge 0$), $i \in N_c$. Let
$y = (y_1, \ldots, y_{n_s}) \in Y \subset \mathbb{R}^{m n_s}$, where $Y$ is the feasible set for production activities.
In general, the feasible set $Y$ represents a joint production process, allowing for
externalities among producers.
Exchange can take place among agents. Let the vector $t_{ji} = \{t_{jik} : k = 1, \ldots, m\} \in \mathbb{R}^m$
denote the exchanged quantities of commodities provided by the $j$-th agent to the
$i$-th agent, $j, i \in N$. When the $k$-th commodity is a consumer good, $t_{jik} \ge 0$ is
the quantity traded from the $j$-th agent to the $i$-th consumer. We consider the case
where coordination and exchange among agents can be costly and involve the use
of resources. Let $z = (z_1, z_2, \ldots, z_n)$, where $z_i = (z_{i1}, \ldots, z_{im}) \in \mathbb{R}^m$ is the vector of
commodities used by the $i$-th agent in coordination and exchange activities, $i \in N$.
The costs of $z$ are the transaction costs from coordination and exchange activities
among the agents. Such costs include transportation cost, information cost, search
cost, contractual cost, and enforcement cost. The feasible set for $(z, t)$ is denoted by
$Z$, with $(z, t) \in Z$. Again, this allows for externalities in coordination and exchange
among agents.
Assume that the sets $Y$, $Z$, and $X$ are closed, that $(0, 0) \in Z$, and that the set
$\bigl[\sum_{i \in N_c} X_i\bigr] \cap \bigl\{\sum_{j \in N_s} y_j : (y_1, \ldots, y_{n_s}) \in Y\bigr\}$ has a non-empty interior. Importantly,
we allow the sets $Y$ and $Z$ to be non-convex. In this context, a feasible allocation is
defined as a vector $(x, y, z, t)$ satisfying

$$\sum_{i \in N} t_{ji} \le y_j - z_j, \quad j \in N_s, \tag{1}$$

$$x_i \le \sum_{j \in N} t_{ji} - z_i, \quad i \in N_c, \tag{2}$$

where $x \in X$, $y \in Y$, $(z, t) \in Z$. Equations (1) and (2) are commodity balance
constraints. When the $k$-th good is an output and a consumer good, $y_{jk} \ge 0$ is the
quantity produced by the $j$-th producer, $t_{jik} \ge 0$ is the quantity traded from the $j$-th
agent to the $i$-th agent, and (1) implies that the $j$-th producer cannot sell more than
its production $y_{jk}$ net of resources used in coordination and exchange $z_{jk}$, $j \in N_s$. And
when the $k$-th commodity is an input, then $y_{jk} \le 0$ where $y_{jk}$ is the input quantity
used by the $j$-th producer, $t_{jik} \le 0$ where $t_{jik}$ is the quantity traded from the $j$-th agent
to the $i$-th agent, and (1) implies that $|y_{jk}| \le \sum_{i \in N} |t_{jik}| - z_{jk}$, $j \in N_s$, meaning that
the use of the $k$-th input by the $j$-th producer, $|y_{jk}|$, cannot exceed its availability from
exchange ($\sum_{i \in N} |t_{jik}|$) net of $z_{jk}$.
Similarly, when the $k$-th commodity is a consumer good, we have $x_{ik} \ge 0$, and
(2) implies that the $i$-th consumer cannot consume more than what it can acquire
($\sum_{j \in N} t_{jik}$) net of $z_{ik}$, $i \in N_c$. And when the $k$-th commodity is a “consumer bad”
(e.g., pollution), let $x_{ik} \le 0$ where $x_{ik}$ is the quantity of the $k$-th bad facing the
$i$-th consumer, and let $t_{ijk} \le 0$ where $t_{ijk}$ is the quantity of the $k$-th bad exchanged
from the $i$-th agent to the $j$-th agent, $i \in N_c$. This shows that a feasible allocation in
(1) and (2) applies under general conditions, including situations of externalities between
producers and consumers where production/trade activities have adverse effects on
consumer welfare.
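
The commodity balance constraints (1) and (2) can be checked mechanically for a candidate allocation. The Python sketch below is a minimal illustration under stated assumptions: the dictionary-based data layout, the function name `satisfies_balance`, the agent labels, and the numbers are all hypothetical. It tests only the two inequalities and does not test the set memberships x ∈ X, y ∈ Y, (z, t) ∈ Z.

```python
def satisfies_balance(x, y, z, t, producers, consumers, tol=1e-12):
    """Check constraints (1) and (2) commodity by commodity.

    x[i], y[j], z[a] are lists of length m; t[(j, i)] is the vector of
    quantities provided by agent j to agent i (missing pairs mean zero).
    """
    agents = list(producers) + list(consumers)
    m = len(next(iter(z.values())))

    def shipped(j, i):
        return t.get((j, i), [0.0] * m)

    # Constraint (1): sum_i t_ji <= y_j - z_j for each producer j.
    for j in producers:
        for k in range(m):
            provided = sum(shipped(j, i)[k] for i in agents)
            if provided > y[j][k] - z[j][k] + tol:
                return False

    # Constraint (2): x_i <= sum_j t_ji - z_i for each consumer i.
    for i in consumers:
        for k in range(m):
            acquired = sum(shipped(j, i)[k] for j in agents)
            if x[i][k] > acquired - z[i][k] + tol:
                return False
    return True


# Hypothetical one-commodity example: one producer, one consumer.
producers, consumers = ["farm"], ["home"]
y = {"farm": [10.0]}                    # production
z = {"farm": [1.0], "home": [0.5]}      # resources used in exchange
t = {("farm", "home"): [9.0]}           # quantity traded
x_ok = {"home": [8.5]}                  # consumption net of transaction costs
x_too_much = {"home": [9.0]}            # ignores the consumer's own cost z

if __name__ == "__main__":
    print(satisfies_balance(x_ok, y, z, t, producers, consumers))
    print(satisfies_balance(x_too_much, y, z, t, producers, consumers))
```

In this example, 1.5 of the 10 units produced are absorbed by transaction costs, so at most 8.5 units can be consumed.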
Finally, note that the analysis can include dynamics and uncertainty. Following
Debreu [7] and using a state-contingent approach, each decision can be defined to
be specific to a given time period and a given state of nature representing uncertainty
(e.g., weather conditions). The feasible sets X, Y, and Z would then reflect the
information available to each of the n agents. In this case, externalities may arise
when information and learning involve social networks [28].
Our analysis focuses on efficient allocations, relying on the classical Pareto
efficiency criterion: a feasible allocation (x∗ , y∗ , z∗ , t∗ ) is Pareto efficient if there is
no other feasible allocation (x, y, z, t) that can make one individual better off without
making anyone else worse off.
Our analysis of Pareto efficiency relies on a benefit function. Consider the
commodity bundle $g = (0, \ldots, 0, 1) \in \mathbb{R}^m_+$ where the $m$-th commodity is “money”
treated as a private good that can be exchanged costlessly among the $n$
individuals. We assume that consumer preferences are non-satiated in the $m$-th
good (money). Following Luenberger [29], using $g$ as a reference bundle, define
the aggregate benefit function as

$$B(x, U) = \begin{cases} \max_{\beta} \Bigl\{ \sum_{i \in N_c} \beta_i : (x_i - \beta_i g) \in X_i,\ u_i(x_1 - \beta_1 g, \ldots, x_{n_c} - \beta_{n_c} g) \ge U_i,\ i \in N_c \Bigr\} & \text{if there is a feasible } \beta = (\beta_1, \ldots, \beta_{n_c}), \\ -\infty & \text{otherwise,} \end{cases} \tag{3}$$

where $U = (U_1, \ldots, U_{n_c})$. The benefit function $B(x, U)$ in (3) gives the
largest amount of the bundle $g$ that consumers facing utilities $U$ are willing to
give up to reach consumption $x$. The function $B(x, U)$ in (3) provides a general
measure of aggregate consumer benefits under consumption externalities.6 When
$g = (0, \ldots, 0, 1) \in \mathbb{R}^m_+$ with the $m$-th commodity being money, we set the unit price
of the bundle $g$ to be 1, meaning that our welfare measurements involve monetary
evaluations. Yet, the analysis allows for production and consumption externalities
with respect to the first $(m - 1)$ goods. And it allows for transaction costs to arise
when exchange in the first $(m - 1)$ commodities takes place.
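
As a worked special case of (3), offered only as an illustration, suppose there are no consumption externalities, each utility function is quasi-linear in money, $u_i(x_i) = v_i(x_{i1}, \ldots, x_{i,m-1}) + x_{im}$, and the constraint $(x_i - \beta_i g) \in X_i$ is not binding. The functions $v_i$ and these restrictions are assumptions introduced here; the chapter's analysis does not require them. Because $g = (0, \ldots, 0, 1)$, subtracting $\beta_i g$ lowers only the money component, so each $\beta_i$ can be solved for separately:

```latex
\begin{align*}
u_i(x_i - \beta_i g) \ge U_i
  &\iff v_i(x_{i1}, \ldots, x_{i,m-1}) + x_{im} - \beta_i \ge U_i \\
  &\iff \beta_i \le v_i(x_{i1}, \ldots, x_{i,m-1}) + x_{im} - U_i, \\
B(x, U) &= \sum_{i \in N_c} \bigl[ v_i(x_{i1}, \ldots, x_{i,m-1}) + x_{im} - U_i \bigr].
\end{align*}
```

In this special case the aggregate benefit is the sum of each consumer's money-metric willingness to pay to move from utility $U_i$ to the bundle $x_i$.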
Next, define a maximal allocation as an allocation $(x, y, z, t)$ solving the following
maximization problem

$$V(U) = \max_{x, y, z, t} \bigl\{ B(x, U) : \text{Eqs. (1)--(2)},\ x \in X,\ y \in Y,\ (z, t) \in Z \bigr\}. \tag{4}$$

And $(x, y, z, t)$ is said to be zero maximal if, in addition, $U$ is chosen in (4) such
that $V(U) = 0$. The following result was obtained by Luenberger [29, 30] and Chavas
[18].

Proposition 1: A Pareto efficient allocation is equivalent to a zero-maximal
allocation given in (4) with $U = (U_1, \ldots, U_{n_c})$ being chosen such that $V(U) = 0$.

Proposition 1 holds in the presence of externalities, non-convexity, and transaction
costs. It means that the investigation of Pareto efficiency under externalities can
be based on the analysis of zero-maximal allocations. The function $V(U)$ in (4) has
an intuitive interpretation: it is the distributable surplus, i.e., the maximized aggregate
benefit [29, 32]. From (4), a maximal allocation makes aggregate benefit as large as
possible. And zero-maximality means that this surplus must be entirely redistributed
to consumers. In this context, Proposition 1 states that Pareto efficiency involves the
maximization of aggregate benefit and then its complete redistribution.
In addition, the zero-maximality condition $V(U) = 0$ is an implicit equation for
$U = (U_1, \ldots, U_{n_c})$ that characterizes the Pareto utility frontier. Noting that $V(U)$
is non-increasing in $U$, the set of $U$ that satisfies $V(U) \ge 0$ defines the space of
reachable utility levels, and $\{U : V(U) = 0\}$ identifies the upper bound of this space
as the Pareto utility frontier. In this case, moving along the Pareto utility frontier
corresponds to efficient allocations associated with different welfare distributions
among consumers. These results establish the characterization of efficiency under
externalities, transaction costs, and non-convexity under general conditions.
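
To see what the frontier $\{U : V(U) = 0\}$ looks like in a simple case, consider a hypothetical quasi-linear economy. The assumptions are made here for illustration only: no consumption externalities, utilities of the form $u_i(x_i) = v_i(x_{i1}, \ldots, x_{i,m-1}) + x_{im}$, and the constraints $(x_i - \beta_i g) \in X_i$ not binding. Under these assumptions the benefit function reduces to $B(x, U) = \sum_{i \in N_c} [v_i(x_{i1}, \ldots, x_{i,m-1}) + x_{im}] - \sum_{i \in N_c} U_i$, so the utility levels separate from the maximization over allocations:

```latex
\begin{align*}
V(U) &= W^{*} - \sum_{i \in N_c} U_i, \\
W^{*} &= \max_{x, y, z, t} \Bigl\{ \sum_{i \in N_c} \bigl[ v_i(x_{i1}, \ldots, x_{i,m-1}) + x_{im} \bigr] :
  \text{Eqs. (1)--(2)},\ x \in X,\ y \in Y,\ (z, t) \in Z \Bigr\}, \\
V(U) = 0 &\iff \sum_{i \in N_c} U_i = W^{*}.
\end{align*}
```

Here $W^{*}$ is notation introduced for this sketch. In this special case $V(U)$ is decreasing in each $U_i$, and the Pareto utility frontier is the set of utility vectors that sum to the maximal aggregate surplus $W^{*}$. Moving along the frontier redistributes money among consumers without changing the surplus-maximizing allocation of the other goods.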
By identifying the efficient quantities produced, consumed, and traded, Proposition 1
can be used in the evaluation of contracts and government regulations in the
presence of externalities. When externalities are evenly distributed, Proposition 1
is consistent with government regulations imposing uniform standards over space.
But efficient contracts/standards would change depending on the nature and extent
of the externalities. For example, efficient standards would vary across space when
the externalities are spatially distributed. In this situation, efficient regulation would
require information on the spatial nature of the externalities. In addition,
non-convexity can provide incentives to concentrate external effects in a small area, in
which case spatially uniform standards would be inefficient [33].

6 Equation (3) includes as a special case the situation where there is no consumption externality, in
which case individual benefit can be evaluated one consumer at a time and aggregate benefit is just
the sum of individual benefits across all consumers [30, 31].

Efficient Pricing under Externalities

We now examine the role of markets and pricing in controlling externalities.
Studying the linkages between efficiency and market allocations is not new (e.g.,
[34, 35]). Here, we follow Chavas [18] to explore the economics of efficiency and
pricing in the presence of externalities under general conditions (including
non-convexity (from P2) and transaction costs (from P5)).
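Let $F$ be the set of continuous and non-decreasing functions $f$ from $\mathbb{R}^m$ to $\mathbb{R}$ that
satisfy the translation property: $f(y + \alpha g) = \alpha + f(y)$ for any $y \in \mathbb{R}^m$ and any $\alpha \in \mathbb{R}$.
Consider the generalized Lagrangian functional $L$:

$$L(x, y, z, t, f, h, U) = B(x, U) + \sum_{j \in N_s} \Bigl[ f_j(y_j) - f_j\Bigl( \sum_{i \in N} t_{ji} + z_j \Bigr) \Bigr] + \sum_{i \in N_c} \Bigl[ h_i\Bigl( \sum_{j \in N} t_{ji} - z_i \Bigr) - h_i(x_i) \Bigr], \tag{5}$$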

where $f = (f_1, \ldots, f_{n_s}) \in F_s = F \times \cdots \times F$ and $h = (h_1, \ldots, h_{n_c}) \in F_c = F \times
\cdots \times F$ are “penalty functions” associated with constraints (1) and (2), respectively.
Given $g = (0, \ldots, 0, 1)$, the $m$-th commodity is used as a numeraire good. When the
unit price of $g$ is normalized to be equal to 1, then $f_i$ and $h_i$ can be interpreted as
monetary values of goods associated with the $i$-th agent, $i \in N$. In this context, we
interpret the functions $f$ and $h$ as reflecting pricing schemes. Note the Lagrangian
in (5) is “generalized” in the sense that it allows the functions $(f, h)$ to be nonlinear.
This is an extension of the standard Lagrangian approach applied under convexity
where the penalty functions $(f, h)$ are taken to be linear, their slopes being Lagrange
multipliers measuring the shadow prices of constraints (e.g., [36, 37]). As illustrated
in Fig. 4, the presence of non-convexity requires us to consider nonlinear penalty
functions (and nonlinear pricing as discussed below).
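
For comparison, the linear special case can be written out explicitly. As an illustrative assumption, let every agent face the same price vector $p = (p_1, \ldots, p_m)$ with $p_m = 1$, and set $f_j(y_j) = p \cdot y_j$ and $h_i(x_i) = p \cdot x_i$. The vector $p$ and this uniform-pricing restriction are introduced here for the sketch and are not imposed in the chapter. Such linear functions satisfy the translation property, and the generalized Lagrangian (5) collapses to the standard one:

```latex
\begin{align*}
f_j(y + \alpha g) &= p \cdot y + \alpha \, (p \cdot g) = p \cdot y + \alpha \, p_m = f_j(y) + \alpha, \\
L(x, y, z, t, p, U) &= B(x, U)
  + \sum_{j \in N_s} p \cdot \Bigl( y_j - \sum_{i \in N} t_{ji} - z_j \Bigr)
  + \sum_{i \in N_c} p \cdot \Bigl( \sum_{j \in N} t_{ji} - z_i - x_i \Bigr).
\end{align*}
```

Each bracketed term is the slack in constraint (1) or (2), so $p$ plays the role of a vector of Lagrange multipliers measuring shadow prices. This restriction is what can fail to support an efficient allocation when the sets $Y$ or $Z$ are non-convex.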
For a given $U$, consider the case where the generalized Lagrangian in (5) has a
saddle-point $(x^*, y^*, z^*, t^*, f^*, h^*) \in [X \times Y \times Z \times F_s \times F_c]$ that satisfies

$$L(x, y, z, t, f^*, h^*, U) \le L(x^*, y^*, z^*, t^*, f^*, h^*, U) \le L(x^*, y^*, z^*, t^*, f, h, U) \tag{6}$$

for any $x \in X$, $y \in Y$, $(z, t) \in Z$, $f \in F_s$, $h \in F_c$.7 The first inequality in (6)
implies the maximization of the Lagrangian with respect to $(x, y, z, t)$ and the
second inequality implies its minimization with respect to $(f, h)$. In addition, we
have $L(x^*, y^*, z^*, t^*, f^*, h^*, U) = V(U)$, implying that the Lagrangian approach in (6)
provides a dual formulation to the maximization of aggregate benefit given in (4).
The Lagrangian approach also gives the efficient pricing scheme $(f^*, h^*)$. Chavas
[18] obtained the following result.

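Proposition 2: A Pareto efficient allocation satisfies

$$E(h, U) = \min_{x} \Bigl\{ \sum_{i \in N_c} h_i(x_i) : u_i(x) \ge U_i,\ i \in N_c;\ x \in X \Bigr\}, \tag{7}$$

$$\pi_s(f) = \max_{y} \Bigl\{ \sum_{j \in N_s} f_j(y_j) : y \in Y \Bigr\}, \tag{8}$$

$$\pi_T(f, h) = \max_{z, t} \Bigl\{ \sum_{i \in N_c} h_i\Bigl( \sum_{j \in N} t_{ji} - z_i \Bigr) - \sum_{j \in N_s} f_j\Bigl( \sum_{i \in N} t_{ji} + z_j \Bigr) : (z, t) \in Z \Bigr\}, \tag{9}$$

$$V(U) = \inf_{f, h} \bigl\{ \pi_s(f) + \pi_T(f, h) - E(h, U) : f \in F_s,\ h \in F_c \bigr\}, \tag{10}$$

with $U = (U_1, \ldots, U_{n_c})$ being chosen such that $V(U) = 0$.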

Proposition 2 gives a dual representation of Pareto efficiency that allows for
externalities, non-convexity, and transaction costs. It provides useful
information on the role of markets in the efficient management of externalities. We
interpret the function fj as measuring the value of goods associated with the j-th
producer, j ∈ Ns , and the function hi as measuring the value of goods associated
with the i-th consumer, i ∈ Nc . This interpretation applies to market institutions as
well as nonmarket institutions (in which case the functions f and h represent
valuations associated with implicit markets; see Rosen [39] and Ekeland et al. [40]).
Equations (7), (8), and (9) follow from the first inequality in (6), while Eq. (10)
follows from the second inequality in (6). Equation (7) states that, conditional on U
and the pricing scheme h, consumption x is chosen to minimize aggregate consumer
expenditure, $\sum_{i \in N_c} h_i(x_i)$. Equations (8) and (9) are profit-maximizing conditions. In
Eq. (8), $\sum_{j \in N_s} f_j(y_j)$ is the aggregate value from all production activities and
πs(f) is the largest possible aggregate profit given the pricing scheme f. In Eq. (9),
$\sum_{i \in N_c} h_i\big(\sum_{j \in N} t_{ji} - z_i\big) - \sum_{j \in N_s} f_j\big(\sum_{i \in N} t_{ji} + z_j\big)$ is the aggregate value from
coordination and exchange activities, and πT(f, h) is the largest possible profit from
trade, conditional on f and h. Equation (10) links the distributable surplus to aggregate
profit, πs(f) + πT(f, h), net of aggregate expenditure, E(h, U). Note that the function
V(U) in (10) is the same as the distributable surplus V(U) defined in (3), making
it clear that V(U) is a welfare measure with a monetary interpretation. Then, Eq.
(10) states that the distributable surplus V(U) is obtained by choosing the pricing
functions f and h that minimize aggregate profit minus aggregate expenditure.
Finally, choosing U such that V(U) = 0 means that, under Pareto efficiency, the
distributable surplus V(U) is entirely redistributed.

7 The conditions needed for the existence of a saddle-point in (6) are mild and are expected to hold
under fairly general conditions. See Gould [38], Bertsekas [36], and Chavas and Briec [13]. In this
chapter, we assume that these conditions hold.

Denote the solution to Eq. (10) by (f∗ , h∗ ). From Proposition 2, given the
aggregate expenditure function E(h, U) in (7) and the aggregate profit functions
πs (f ) in (8) and πT (f, h) in (9), Eq. (10) defines the pricing scheme (f∗ , h∗ ) that
yields the distributable surplus V(U). Then, choosing U such that V(U) = 0 in
(10) implies that πs (f∗ ) + πT (f∗ , h∗ ) − E(h∗ , U) = 0. This corresponds to an
aggregate budget constraint where aggregate consumer expenditure E(h∗ , U) equals
aggregate profit πs (f∗ ) + πT (f∗ , h∗ ). This provides a basis supporting the measures
commonly used in national accounts: the total value of goods and services can
be measured equivalently from the production side πs (f∗ ) + πT (f∗ , h∗ ), or from
the consumption side E(h∗ , U). And with {U : V(U) = 0} defining the Pareto
utility frontier, moving along this utility frontier corresponds to efficient allocations
associated with different distributions of income among consumers. Recall that our
analysis covers general consumer preferences and allows for income effects. In the
presence of income effects, it means that efficient allocations would change along
the Pareto utility frontier (as production, consumption, trade, and monetary values
would typically change under different income redistributions).
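The dual characterization in Eq. (10) and the condition V(U) = 0 can be checked numerically in a deliberately simple special case. The sketch below is an illustration under assumptions that are not taken from the chapter: a single producer who turns L units of the numeraire into sqrt(L) units of good 1, a single consumer with quasi-linear utility 2·sqrt(x1) + x2, no externalities, no transaction costs (so the trade profit term is zero), and linear pricing with one price p for good 1. The closed-form profit and expenditure functions follow from these assumed forms; all function names are hypothetical.

```python
import math


def profit(p):
    """pi_s(p) = max over L >= 0 of p*sqrt(L) - L, attained at L = p^2 / 4."""
    return p * p / 4.0


def expenditure(p, U):
    """E(p, U) = min of p*x1 + x2 s.t. 2*sqrt(x1) + x2 >= U, attained at x1 = 1/p^2."""
    return U - 1.0 / p


def argmin_convex(fn, lo, hi, tol=1e-10):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if fn(m1) < fn(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def distributable_surplus(U, p_lo=0.1, p_hi=10.0):
    """Dual value in the spirit of Eq. (10): inf over p of profit minus expenditure."""
    p_star = argmin_convex(lambda p: profit(p) - expenditure(p, U), p_lo, p_hi)
    return profit(p_star) - expenditure(p_star, U), p_star


def primal_surplus(U, x_lo=1e-6, x_hi=4.0):
    """Direct maximization of the numeraire left over: 2*sqrt(x1) - x1^2 - U."""
    def surplus(x1):
        return 2.0 * math.sqrt(x1) - x1 * x1 - U

    x_star = argmin_convex(lambda x1: -surplus(x1), x_lo, x_hi)
    return surplus(x_star)


if __name__ == "__main__":
    v0, p_star = distributable_surplus(0.0)
    u_frontier = v0  # V(U) = V(0) - U in this quasi-linear example
    print("supporting price:", p_star)
    print("dual value V(0):", v0, "primal value:", primal_surplus(0.0))
    print("V at the frontier utility:", distributable_surplus(u_frontier)[0])
    print("demand:", 1.0 / p_star ** 2, "supply:", p_star / 2.0)
```

In this example the price that minimizes profit minus expenditure is also the price at which demand for good 1 equals supply, and the utility level at which the dual value vanishes is the one at which consumer expenditure equals producer profit. Quasi-linear utility rules out income effects, so the example does not capture the point made above about allocations changing along the Pareto utility frontier.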
Proposition 2 generalizes the Coase theorem [6]. Indeed, Proposition 2 and
expression (8) imply that aggregate profit maximization is consistent with Pareto
efficiency. This is the essence of the Coase theorem [6], making it clear that
aggregate profit maximization is at the heart of Pareto efficiency, with or without
externalities. It indicates that a failure to maximize aggregate profit would be
inconsistent with efficiency. But the analysis in Coase [6] was presented assuming
no transaction costs. As argued in Chavas [18], aggregate profit maximization
remains a valid characterization of the efficient management of externalities under
two important generalizations: (1) under transaction costs; and (2) in the presence
of non-convexity in production and exchange activities. These are important
generalizations of the Coase theorem. They stress the generality of the arguments
presented by Coase [6] and their importance, which contributed to his winning the
Nobel Prize in 1991.
Another important result is given in Eq. (9): aggregate profit maximization for
exchange activities is also consistent with Pareto efficiency. This is an insight
that was explored in Coase [19] but not in Coase [6]. Thus, Proposition 2
provides a nice integration of Coase’s two seminal papers. First, the consistency
between aggregate profit maximization and Pareto efficiency applies to production
as well as coordination/exchange activities. Importantly, this result holds in the
presence of externalities in consumption and coordination/exchange activities and
under non-convex technology. Second, as stressed in [19], assessing the relative
efficiency of markets versus nonmarket institutions requires evaluating the role
of transaction costs. As argued in Chavas [18], this is a “missing piece of the
puzzle” in Coase [6]. Identifying which institutions can manage externalities
efficiently depends on their associated transaction costs. The efficient management
of externalities is associated with the institutions that maximize aggregate profit,
including both aggregate profit from production activities (as stated in (8)) and
aggregate profit from coordination/exchange activities (as stated in (9)). The profit
maximization condition in (9) implies the minimization of transaction costs. In
other words, efficient institutions are the ones that maximize aggregate profit (as
discussed in Coase [6]) as well as minimize transaction costs (as stressed in Coase
[19]).
It is sometimes argued that the Coase analysis implies that the efficient manage-
ment of externalities is independent of the assignment of property rights [41]. In
general, this argument is false. Indeed, reassigning property rights typically affects
the distribution of income. In the presence of income effects, this would affect
consumption decisions and thus resource allocation [42]. Thus, given the empirical
prevalence of income effects, it is incorrect to assert that the efficient management
of externalities is independent of property rights. Note that this argument holds
under very general conditions: it applies with or without transaction cost; and it
applies in the presence of non-convexity. In other words, it is inappropriate to argue
that the assignment of property rights is irrelevant in the efficiency evaluation of
externalities.
While Eqs. (7, 8) state that aggregate profit maximization is consistent with
Pareto efficiency, what does it say about decentralized decision making? In general,
in the presence of externalities among firms, production and exchange activities
cannot be fully decentralized. Indeed, production externalities would make firm-
level profit maximization inefficient. As illustrated in Fig. 3, efficiency requires that
external effects among producers be explicitly taken into consideration. Similarly,
under coordination/exchange externalities (e.g., due to social learning), firm-level
profit maximization applied to trading firms would be inefficient (by failing to
consider external effects among traders). In such cases, some coordination schemes
are needed among the agents facing externalities.
How does Proposition 2 relate to the standard welfare theorems establishing close
linkages between Pareto efficiency and markets (e.g. [43])? As illustrated in Fig. 2,
externalities do not always lead to non-convexity. Under convexity, the analysis
of efficiency is simpler. In this case, the separating hyperplane theorem applies,
meaning that there exists a hyperplane separating the feasible set from the efficient
consumption set [37], the slope of the hyperplane measuring prices supporting an
efficient allocation. Then, the functions f and h in (7)–(10) can be taken to be linear:

$f_j(y_j) = p_{si}\, y_j$, j ∈ Ns, and $h_i(x_i) = p_{ci}\, x_i$, i ∈ Nc, where the p's are prices reflecting the
social value of the commodities.8 In Fig. 2, the separating hyperplane is given by the
line (F E1 F ). The slope of the hyperplane is equal to (−p2/p1), where (p1, p2) are
the social prices supporting an efficient allocation. But as discussed in Sect. 2, under
decentralized decisions, neglecting the externalities would affect supply/demand
decisions, leading to inefficiency as observed prices differ from the efficient
prices (p1, p2) (as illustrated in Fig. 3). Under this scenario, efficiency can be
restored by government intervention imposing Pigouvian taxes/subsidies reflecting
the social cost of the externalities.
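The convex case just described, where a Pigouvian tax closes the gap between the observed price and the social price, can be illustrated with a small numerical sketch. The functional forms and numbers below are hypothetical and chosen only for simplicity: a price-taking firm with quadratic private cost and a constant marginal external damage per unit of output. None of the names come from the chapter.

```python
def private_output(price, cost_slope, tax=0.0):
    """Output chosen by a firm maximizing (price - tax)*q - 0.5*cost_slope*q^2."""
    return max(0.0, (price - tax) / cost_slope)


def social_output(price, cost_slope, marginal_damage):
    """Output maximizing price*q - 0.5*cost_slope*q^2 - marginal_damage*q."""
    return max(0.0, (price - marginal_damage) / cost_slope)


def social_surplus(q, price, cost_slope, marginal_damage):
    """Market value of output net of private cost and external damage."""
    return price * q - 0.5 * cost_slope * q ** 2 - marginal_damage * q


if __name__ == "__main__":
    price, cost_slope, marginal_damage = 10.0, 2.0, 4.0  # hypothetical values

    q_unregulated = private_output(price, cost_slope)
    q_efficient = social_output(price, cost_slope, marginal_damage)
    # Assumed policy: a per-unit tax equal to the marginal external damage.
    q_taxed = private_output(price, cost_slope, tax=marginal_damage)

    for label, q in [("unregulated", q_unregulated),
                     ("efficient", q_efficient),
                     ("taxed", q_taxed)]:
        print(label, q, social_surplus(q, price, cost_slope, marginal_damage))
```

With these assumed forms the tax makes the price faced by the firm equal to the social price, so the decentralized choice coincides with the efficient one. The result relies on convexity; it is not a statement about the non-convex case discussed next.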
As argued by Starrett [9] and Baumol and Bradford [10], externalities can lead to
non-convexity. This is the scenario illustrated in Fig. 4, where negative externalities
are large enough to make the feasible set non-convex. The investigation of non-
convexity is not new (e.g., [13–17]). It is well known that introducing non-convexity
for Y or Z can invalidate the standard welfare theorems. The reason is that, under
non-convexity, the separating hyperplane theorem no longer holds and cannot be
used to identify efficient prices. This is illustrated in Fig. 4 where the linear pricing
line (F E2 F ) fails to support the efficient allocation E2 (as it would provide
incentives to produce at the inefficient point C). But as discussed in Sect. 2, the
nonlinear pricing line (G E2 G’) would support the efficient allocation E2. This
pricing line corresponds to a hypersurface separating the feasible set from the
efficient consumption set. Again, the slope of this separating hypersurface provides
information about prices. The separating hypersurface being nonlinear implies
nonlinear pricing.9 This is a scenario where the pricing schemes (f, h) in (7)–(10)
must be nonlinear. Interestingly, Eq. (7) still associates efficiency with aggregate
expenditure minimization. And Eqs. (8, 9) associate efficiency with aggregate
profit maximization. Thus, the problem created by non-convexity does not come
from aggregate profit maximization (which continues to hold under efficiency).
The problem comes from uniform pricing. Under non-convexity, efficiency can
be attained by moving away from uniform pricing and implementing a nonlinear
pricing scheme [13]. In Fig. 4, this involves moving from the uniform price line (F
E2 F ) to the nonlinear pricing line (G E2 G’). This argument applies to externality-
driven non-convexity of the production set Y. But it also applies to possible
non-convexity in the feasible set Z. For example, externalities among traders
could also generate non-convexity in Z. Again, such non-convexity invalidates the
separating hyperplane theorem, implying that uniform prices may fail to support
an efficient allocation and that nonlinear pricing schemes f∗ and h∗, as identified
in (10), may be required.
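The failure of uniform pricing under non-convexity can be reproduced in a stylized example. The sketch below does not reconstruct Fig. 4; it assumes a hypothetical frontier y1 = (1 − y2)² on 0 ≤ y2 ≤ 1, so that the set below the frontier is non-convex, and takes good 1 as the numeraire. It scans a range of uniform prices for good 2 and records which frontier point maximizes revenue, then repeats the exercise with an assumed nonlinear schedule whose marginal price 3 − 4·y2 declines with volume.

```python
def frontier(y2):
    """Assumed non-convex frontier: maximal y1 attainable for a given y2 in [0, 1]."""
    return (1.0 - y2) ** 2


# Discretized frontier points, indexed by y2.
GRID = [k / 100 for k in range(101)]


def best_response_linear(p2):
    """Frontier point maximizing y1 + p2*y2 under a uniform price p2 for good 2."""
    return max(GRID, key=lambda y2: frontier(y2) + p2 * y2)


def nonlinear_value(y2):
    """Assumed nonlinear payment for y2; its marginal price is 3 - 4*y2."""
    return 3.0 * y2 - 2.0 * y2 ** 2


def best_response_nonlinear():
    """Frontier point maximizing y1 plus the nonlinear payment for y2."""
    return max(GRID, key=lambda y2: frontier(y2) + nonlinear_value(y2))


TARGET = 0.5  # interior frontier point we would like prices to support

# Every uniform price from 0 to 5 (step 0.1) sends the producer to a corner.
linear_choices = {best_response_linear(p / 10) for p in range(0, 51)}

if __name__ == "__main__":
    print("choices under uniform prices:", sorted(linear_choices))
    print("choice under the nonlinear schedule:", best_response_nonlinear())
```

Because revenue under a uniform price is a convex function of y2 along this frontier, it is always maximized at an endpoint, so no uniform price on the scanned range supports the interior target. The declining marginal price in the nonlinear schedule is in the spirit of the volume discount mentioned in footnote 9, but the particular schedule is only one of many that would work here.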

8 Note that, under convexity and in the absence of externalities, the analysis would then reduce
to the standard welfare theorems establishing close relationships between Pareto efficiency,
decentralized decisions, and competitive markets (e.g., [7, 43]).
9 The line (G E2 G’) in Fig. 4 implies that the relative price (p2/p1) declines with y2, indicating
that the price p2 decreases with y2. This is a situation of “volume discount” commonly observed in
nonlinear pricing (e.g., [44]).
This stresses the importance of nonlinear pricing in efficiency analysis under
non-convexity. When externalities generate non-convexity, this implies that
economists and policy makers should not insist on uniform pricing in the efficiency
evaluation of externalities. Indeed, nonlinear pricing may be required to implement
an efficient allocation in the presence of externalities. As noted by Wilson [44],
nonlinear pricing schemes are commonly observed in many markets. The challenges
of implementing nonlinear pricing in support of efficiency are discussed below.
Finally, while the efficient pricing schemes (f∗ , h∗ ) were identified in (10),
one issue remains: how to discover and implement such pricing schemes? Under
nonmarket institutions (e.g., contracts), these pricing schemes need not be explicit:
they would be shadow prices in implicit markets supporting an efficient allocation
[39, 40]. Alternatively, under market institutions, the pricing schemes (f∗ , h∗ ) are
an explicit part of a market economy. Under linear pricing (where $f_j(y_j) = p_{si}\, y_j$,
j ∈ Ns, and $h_i(x_i) = p_{ci}\, x_i$, i ∈ Nc), the prices $p_{si}^*$ and $p_{ci}^*$ are easy to identify and
implement: they are the market-clearing prices that satisfy the commodity balance
Eqs. (1) and (2), respectively. As illustrated in Fig. 2, linear pricing can always
support an efficient allocation under convexity, and identifying efficient schemes
reduces to finding the “right prices”. In competitive markets in the absence
of externalities, the market-clearing prices correspond to Adam Smith’s “invisible
hand.” In the presence of externalities, this can involve Pigouvian taxes that make
market prices equal to social prices [1, 3, 5].
But non-convexity and nonlinear pricing make economic evaluations more
complex. In this case, the nonlinear pricing schemes (f∗ , h∗ ) have two roles to play:
they clear the markets; and they provide the proper incentives to implement an
efficient allocation. This second role arises as a separate function only under non-
convexity. This raises the question: In situations where the role of pricing goes
beyond just “clearing the markets,” who chooses the nonlinear pricing schemes
(f∗ , h∗ )? In this case, the pricing strategy is chosen by the managers in charge
of marketing and/or the policy makers in charge of pricing policy. These are
scenarios where implementing an efficient allocation requires a “visible hand”
as pricing decisions are made by managers and/or policy makers. Such pricing
decisions require the use of information and managerial skills that are typically not
costless. This implies that, under non-convexity and nonlinear pricing, transaction
costs would be relevant in the evaluation of pricing schemes supporting efficient
allocations. In such situations, transaction costs would play a role in the efficient
management of externalities under markets as well as nonmarket institutions. These
issues are further discussed in the next section.

Efficient Policies

The analysis presented in the previous section provides two approaches to the
characterization of Pareto efficiency in the presence of externalities: Proposition
1 identifies efficient production, consumption, and trade; Proposition 2 relies on
a dual generalized Lagrangian approach that also evaluates the role of pricing.
From a policy viewpoint, can we use these two Propositions to recommend
whether externalities should be managed through market-based mechanisms (as
opposed to regulations)? The answer to this question is: No. Indeed, Propositions
1 and 2 provide alternative representations of the same efficient allocations. This
means that Proposition 1 does not imply that regulation is a preferred solution to
an externality problem. Similarly, Proposition 2 does not imply that market-based
mechanisms provide superior means of controlling externalities.
How can economists assist policy makers in choosing policies supporting effi-
cient allocations in the presence of externalities? Typically, tradeoffs exist between
regulatory approaches versus market-based approaches to externality control. If
Propositions 1 and 2 do not assist in evaluating these tradeoffs, economists may
make recommendations on ideological grounds (e.g., market-loving economists
argue in favor of market-based solutions, while government-loving economists
argue in favor of regulations). This is not desirable: ideological arguments do not
help identify the tradeoffs between alternative policy options. Yet, economists can
make constructive contributions to the policy-making process by focusing attention
on the role of transaction costs.
As first argued by Coase [19], transaction costs play a fundamental role as
they affect the limits of organizations and firms and the functioning of markets
(e.g., [20–24]). Such arguments also apply to institutions affecting the management
and control of externalities [24, 45]. Two issues have been the subject of special
attention (e.g., [4]). First, when externalities can be reduced through technological
innovations, any policy response to externalities needs to provide proper incentives
to support innovations. Second, information is typically costly and is an important
part of transaction costs. Access to information is crucial in the design and
implementation of efficient allocations (e.g., [46, 47]). Without good information,
any management or policy would fail to assess the nature and extent of externalities.
This is a scenario where poorly informed decision-makers would fail to provide
efficient management of externalities. This argument is often presented against
government regulations: poorly informed regulators would fail to develop and
implement efficient policies dealing with externalities. This is a likely scenario when
externalities vary across space and obtaining information about local externalities
is difficult (e.g., [48]). But poor information is not specific to regulators. For
example, when externalities are global and there are economies of scale in obtaining
information about externalities, private agents may be less informed than policy
makers about the exact nature of externalities.
In general, obtaining information about externalities or new technologies is
costly. The cost and utilization of this information typically vary across policy
options. Higher information cost would contribute to a downward shift in the Pareto
utility frontier, indicating that economic efficiency would improve under institutions
that have lower information costs and a better access to information. This argument
applies across institutions, including markets, contracts, and regulation. Shleifer
[24] has argued that the prevalence of regulations in market economies can be
explained in part by the failure of courts and their (in)effectiveness in settling
contract disputes. This raises the question of identifying the institution(s) that can
obtain good information and use it to implement an efficient control of externalities.
Our analysis has examined the role of pricing. Figure 2 indicates that externalities
do not always generate non-convexity, in which case linear pricing and Pigouvian
taxes/subsidies are appropriate. But Fig. 4 shows that externalities can generate non-
convexity that requires nonlinear pricing. More generally, externalities are not the
only possible source of non-convexity. Another source of non-convexity involves
the presence of fixed cost. It is well known that fixed cost can make competitive
markets inefficient (e.g., [12, 14, 28]). For example, under fixed cost and increasing
returns to scale, marginal cost pricing is inefficient: marginal cost would be lower
than average cost and competitive firms would exit (as marginal cost pricing would
generate negative firm profit, revenue being insufficient to cover the fixed cost).
Yet, nonlinear pricing can support an efficient allocation (when many consumers
buy at marginal cost, but some consumers pay a higher price that allows firms to
cover their fixed cost). For this reason, some industries have moved in the direction
of using nonlinear pricing (e.g., [44]). An interesting example is the case of block
pricing for electricity (e.g., [49–52]). The electricity industry has three important
characteristics: (1) electricity-generating power plants have large fixed cost (thus
exhibiting non-convexity); (2) the demand for electricity varies over time (e.g., peak
demand occurring during heat waves); and (3) fuel-based power plants pollute the
air (thus generating negative externalities). One may think about implementing a
Pigouvian tax to cover the social cost of the externality. But in this case, a uniform
Pigouvian tax is not efficient: a “high uniform price” would be needed to cover
the fixed cost of building an additional power plant to supply electricity during
peak periods. The inefficiency comes from two sources: the “high” electricity price
would have negative impacts on the welfare of most consumers; and when active,
the additional power plant would increase pollution. With fixed cost creating non-
convexity, the efficient nonlinear pricing would involve many consumers paying
the (lower) marginal social cost of electricity (including the cost of pollution),
while some consumers pay higher prices inducing a reduction in their electricity
consumption during peak periods. In this case, the higher price paid by some
consumers would play two roles: (1) covering the fixed cost of the power plant, thus
making the pricing scheme sustainable; and (2) reducing the demand for electricity
during peak periods. This last effect creates a double dividend: it saves on the cost
of building an additional power plant; and it reduces pollution during peak periods.
In this case, uniform Pigouvian taxes would be inefficient. But while nonlinear pricing
can be efficient, it also has distributional effects [49]. And to achieve efficiency,
nonlinear pricing faces the difficulties of identifying the price-responsive consumers
who are going to pay higher prices as well as the nature and timing of the price
discrimination scheme [50]. Addressing these difficulties requires more information
(compared to uniform pricing) and creates significant challenges for designing and
implementing market-based policies that can achieve efficiency.
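The fixed-cost argument above can be illustrated with a short numerical sketch. It uses a two-part tariff (an access fee plus a per-unit price equal to marginal cost), which is a simpler textbook form of nonlinear pricing than the peak-period block pricing described in the text, and it leaves out both the pollution externality and the timing of demand. The linear demand curve, the number of consumers, and the cost figures are all hypothetical.

```python
import math

N_CONSUMERS = 100      # identical consumers (assumption)
CHOKE_PRICE = 10.0     # individual demand is q = CHOKE_PRICE - p
MARGINAL_COST = 2.0
FIXED_COST = 1000.0


def demand(p):
    """Quantity bought by one consumer at per-unit price p."""
    return max(0.0, CHOKE_PRICE - p)


def consumer_surplus(p):
    """Surplus of one consumer under linear demand, before any access fee."""
    return 0.5 * demand(p) ** 2


def firm_profit(p, access_fee=0.0):
    """Profit given a per-unit price and an optional per-consumer access fee."""
    margin = (p - MARGINAL_COST) * demand(p)
    return N_CONSUMERS * (margin + access_fee) - FIXED_COST


def total_surplus(p, access_fee=0.0):
    """Consumer surplus net of fees plus firm profit."""
    return N_CONSUMERS * (consumer_surplus(p) - access_fee) + firm_profit(p, access_fee)


def break_even_uniform_price():
    """Lowest uniform price at which revenue covers variable and fixed cost."""
    b = CHOKE_PRICE + MARGINAL_COST
    c = CHOKE_PRICE * MARGINAL_COST + FIXED_COST / N_CONSUMERS
    return (b - math.sqrt(b * b - 4.0 * c)) / 2.0


if __name__ == "__main__":
    fee = FIXED_COST / N_CONSUMERS
    p_uniform = break_even_uniform_price()
    print("profit under marginal cost pricing:", firm_profit(MARGINAL_COST))
    print("profit under the two-part tariff:", firm_profit(MARGINAL_COST, fee))
    print("break-even uniform price:", p_uniform)
    print("total surplus, two-part tariff:", total_surplus(MARGINAL_COST, fee))
    print("total surplus, uniform break-even price:", total_surplus(p_uniform))
```

Under these assumptions the access fee is smaller than each consumer's surplus at the marginal cost price, so every consumer still participates and consumption stays at its efficient level while the firm breaks even. With heterogeneous consumers, which is the relevant case for the distributional concerns raised in [49], a single access fee could exclude some buyers, and the design problem becomes harder.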
Previous environmental policies can shed useful light on the relative effectiveness
of alternative policy options for externality control. Experiences have varied
greatly depending on the nature and extent of the externality [53]. For illustration
purposes, we consider three examples. First, the depletion of the stratospheric ozone
layer and its link with pollution were identified in the 1970s. This generated a strong
regulatory response leading to the 1987 Montreal Protocol and the international
banning of ozone-depleting chemicals [54, 55]. Evidence indicates that this strong
policy response was effective: healing of the Antarctic ozone layer is in progress
[56]. Second, acid rain was another environmental issue identified as an effect
of sulfur dioxide (SO2 ) emission from US coal-fired power plants. This is a
situation where externalities vary across space (as some power plants pollute more
than others). The US policy response was a “cap and trade” acid rain program
enacted under the 1990 Clean Air Act [57, 58]. The program consisted of the
government setting maximum emission allowances for US electric power plants
while permitting a market exchange for allowances among plants. The program
has been credited as a great success: it drastically reduced SO2 emissions from
US coal-fired power plants and it did so at a lower cost than a comparable
command-and-control regulation [59]. The US acid rain program illustrates how
market-based policies can provide a flexible and cost-effective way of reducing
pollution.
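The cost advantage of allowance trading over a uniform standard can be seen in a stylized calculation. The sketch below is not based on data from the acid rain program: it assumes two hypothetical plants with quadratic abatement costs 0.5·slope·a², a fixed total abatement target, and a frictionless permit market (no transaction costs) in which marginal abatement costs are equalized at the permit price.

```python
def abatement_cost(slope, abatement):
    """Quadratic abatement cost; marginal cost is slope * abatement."""
    return 0.5 * slope * abatement ** 2


def uniform_standard(slopes, total_abatement):
    """Command-and-control benchmark: every plant abates the same amount."""
    share = total_abatement / len(slopes)
    return [share for _ in slopes]


def trading_allocation(slopes, total_abatement):
    """Allocation equalizing marginal costs: slope_i * a_i = permit price for all i."""
    permit_price = total_abatement / sum(1.0 / s for s in slopes)
    return [permit_price / s for s in slopes], permit_price


def total_cost(slopes, allocation):
    """Sum of abatement costs across plants."""
    return sum(abatement_cost(s, a) for s, a in zip(slopes, allocation))


if __name__ == "__main__":
    slopes = [1.0, 4.0]     # hypothetical low-cost and high-cost plants
    target = 10.0           # hypothetical total abatement required by the cap

    standard = uniform_standard(slopes, target)
    traded, price = trading_allocation(slopes, target)

    print("uniform standard:", standard, total_cost(slopes, standard))
    print("with trading:", traded, total_cost(slopes, traded), "permit price:", price)
```

Both allocations meet the same cap, but trading shifts abatement toward the plant that can reduce emissions more cheaply. The size of the saving depends on how different the plants are, and introducing transaction costs in the permit market would reduce it.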
Our last example involves climate change. Nordhaus [60] identifies climate
change and its linkages with greenhouse gas (GHG) emissions as the “ultimate
challenge” for economics. (Nordhaus received the Nobel Prize in economics in 2018
for his research on climate change.) The production of GHG from human activities
(mostly carbon dioxide CO2 generated from burning fossil fuels) contributes to
increasing atmospheric temperature on earth with significant long-term effects on
global climate [61]. Attempts to reduce GHG emissions have led to the 1997 Kyoto
Protocol and the 2015 Paris Agreement. The debate has focused on evaluating the
social cost of carbon and on considering the imposition of Pigouvian taxes on
GHG emissions. At this point, there are some disagreements. On the policy side, the
United States announced its withdrawal from the Paris Agreement in 2017. On the economic side,
uncertainties related to climate dynamics and economic valuations have made it
difficult for economists and policy makers to agree on precise estimates of the social
price of carbon [60], thus lessening the political support in favor of a Pigouvian tax
on GHG. Finally, there is some concern that a carbon tax may not be an effective way
of dealing with GHG externalities (e.g., [62]). Patt and Lilliestam [62] argue that
the most effective response to climate change issues will be technological, implying
that current policies should be supporting technological innovations that reduce our
reliance on fossil fuels. Patt and Lilliestam [62] contend that a carbon tax alone
would not be enough to move the world economy away from fossil fuels toward
low-carbon technologies.
These examples illustrate the complexities of designing and implementing
efficient policies dealing with externalities. In general, the nature and magnitude of
externalities matter. So does the information available on their effects. In the context
of pollution, the presence (or absence) of close substitutes to the pollutants or to the
polluting technology plays a role. Finally, policy making always depends on the
bargaining ability of different interest groups to deal with each other. This argument
applies at the local level, national level, and international level (e.g., [53]). This
stresses the importance of the political economy of externalities.

Conclusion

This chapter has investigated the efficiency of resource allocation in the presence
of externalities. The analysis applies to markets as well as nonmarket institutions
(including contracts and government). We argue that externalities can lead to non-
convexity and the need for nonlinear pricing. In this context, while simple Pigouvian
taxes/subsidies can help, they are inappropriate when uniform pricing is inefficient.
We also examine the effects of transaction costs on efficient allocations and
coordination/exchange activities. Reducing transaction costs is an integral part of
the efficiency of resource allocation. This argument applies to contracts, regulations
as well as market-based policies and their relative abilities to control externalities.
The analysis provides useful insights into the efficiency of alternative governance
structures dealing with externalities.

References
1. Baumol WJ, Oates WE (1988) The theory of environmental policy, 2nd edn. Cambridge
University Press, Cambridge
2. Laffont JJ (2008) Externalities. In: Durlauf S, Blume LE (eds) The new Palgrave dictionary of
economics, 2nd edn. Palgrave Macmillan, London
3. Baumol WJ (1972) On taxation and the control of externalities. Am Econ Rev 62:307–322
4. Cornes R, Sandler T (1996) The theory of externalities, public goods and club goods.
Cambridge University Press, Cambridge
5. Pigou AC (1920) The economics of welfare. Macmillan, London
6. Coase RH (1960) The problem of social cost. J Law Econ 3:1–44
7. Debreu G (1959) Theory of value. Wiley, New York
8. Arrow K (1969) The organization of economic activity: issues pertinent to the choice of market
versus non-market allocation. In: The analysis and evaluation of public expenditures: the PPB
system. Government Printing Office, Washington, DC
9. Starrett D (1972) Fundamental non-convexities in the theory of externalities. J Econ Theory
4:180–199
10. Baumol WJ, Bradford DF (1972) Detrimental externalities and non-convexity of the production
set. Economica 39:160–176
11. Dasgupta P, Maler KG (2003) The economics of non-convex ecosystems: introduction. Environ
Resour Econ 26:499–525
12. Brown DJ (1991) Equilibrium analysis with non-convex technologies. In: Hildenbrand W,
Sonnenschein H (eds) Handbook of mathematical economics, vol 4. Elsevier, Amsterdam
13. Chavas JP, Briec W (2012) On efficiency under non-convexity. Econom Theor 50:671–701
14. Guesnerie R (1975) Pareto optimality in non-convex economies. Econometrica 43:1–29
15. Khan MA (1999) The Mordukhovich normal cone and the foundations of welfare economics. J
Public Econom Theor 1:309–338
16. Khan MA, Vohra R (1987) An extension of the second welfare theorem to economies with
nonconvexities and public goods. Q J Econ 102:223–242
17. Mordukhovich BS (2005) Nonlinear prices in nonconvex economies with classical Pareto and
strong Pareto optimal allocations. Positivity 9:541–568
18. Chavas JP (2015) Coase revisited: economic efficiency under externalities, transaction costs
and non-convexity. J Inst Theor Econ 171:709–734
19. Coase RH (1937) The nature of the firm. Economica 4:386–405
20. Foley DK (1970) Economic equilibrium with costly marketing. J Econ Theory 2:276–291
21. Hahn FH (1971) Equilibrium with transaction costs. Econometrica 39:417–439
22. Williamson OE (1989) Transaction cost economics. In: Handbook of industrial organization,
vol 1. North Holland, Amsterdam
23. Williamson OE, Winter SG (eds) (1991) The nature of the firm, origins, evolution, and
development. Oxford University Press, Oxford
24. Shleifer A (2010) Efficient regulation. Working paper no. 15651, National Bureau of Economic
Research, Cambridge, MA
25. Stavins RN (1995) Transaction costs and tradeable permits. J Environ Econ Manag 29:133–148
26. Varian HR (1994) A solution to the problem of externalities when agents are well-informed.
Am Econ Rev 84:1278–1293
27. Coase RH (1992) The institutional structure of production. Am Econ Rev 82:713–719
28. Radner R (1968) Competitive equilibrium under uncertainty. Econometrica 36:31–58
29. Luenberger DG (1995) Externalities and benefits. J Math Econ 24:159–177
30. Luenberger DG (1992a) New optimality principles for economic efficiency and equilibrium. J
Optim Theory Appl 75:221–264
31. Luenberger DG (1992b) Benefit functions and duality. J Math Econ 21:461–481
32. Allais M (1981) La Théorie Générale des Surplus. Economies et Sociétés Institut des Sciences
Mathématiques et Economiques Appliquées, Paris
33. Helfand GE, Rubin J (1994) Spreading versus concentrating damages: environmental policy in
the presence of nonconvexities. J Environ Econ Manag 27:84–91
34. Dixit AK, Norman V (1980) Theory of international trade: a dual general equilibrium
approach. Cambridge University Press, London
35. Luenberger DG (1994) Dual Pareto efficiency. J Econ Theory 62:70–85
36. Bertsekas DP (1995) Nonlinear programming. Athena Scientific, Belmont
37. Takayama A (1985) Mathematical economics, 2nd edn. Cambridge University Press, Cam-
bridge
38. Gould FJ (1969) Extensions of Lagrangian multipliers in nonlinear programming. SIAM J
Appl Math 17:1280–1297
39. Rosen S (1974) Hedonic prices and implicit markets: product differentiation in pure competi-
tion. J Polit Econ 82:34–55
40. Ekeland I, Heckman JJ, Nesheim L (2004) Identification and estimation of hedonic models. J
Polit Econ 112:S60–S109
41. Stigler GJ (1989) Two notes on the Coase theorem. Yale Law J 99(3):631–633
42. Hurwicz L (1999) Revisiting externalities. J Public Econom Theor 1(2):225–245
43. Mas-Colell A, Whinston MD, Green J (1995) Microeconomic theory. Oxford University
Press, New York
44. Wilson RB (1993) Nonlinear pricing. Oxford University Press, Oxford
45. Papandreou AA (2003) Externalities, convexity and institutions. Econ Philos 19:281–309
46. Salanié B (1999) The economics of contracts. The MIT Press, Cambridge, MA
47. Tirole J (1999) Incomplete contracts: where do we stand? Econometrica 67:741–781
48. Fabrizio KR, Rose NL, Wolfram CD (2007) “Do markets reduce costs”? Assessing the impact
of regulatory restructuring on US electric generation efficiency. Am Econ Rev 97(4):1250–
1277
49. Borenstein S (2012) The redistributional impact of nonlinear electricity pricing. Am Econom
Rev Econom Policy 4(3):56–90
50. Borenstein S, Bushnell JB (2018) Do two electric pricing wrongs make a right? Cost recovery,
externalities and efficiency. Working paper 24756, National bureau of economic research,
Cambridge, MA
51. Dutta G, Mitra K (2017) A literature review on dynamic pricing of electricity. J Oper Res Soc
68:1131–1145

52. Joskow PL, Wolfram CD (2012) Dynamic pricing of electricity. Am Econ Rev 102(3):381–385
53. Sandler T (2017) Environmental cooperation: contrasting international environmental agree-
ments. Oxford Econ Pap 69(2):345–364
54. Haas PM (1991) Policy responses to stratospheric ozone depletion. Glob Atmos Change
1(3):224–234
55. Morrisette PM (1989) The evolution of policy responses to stratospheric ozone depletion. Nat
Resour J 29(3):794–820
56. Solomon S, Ivy DJ, Kinnison D, Mills MJ, Neely RR III, Schmidt A (2016) Emergence of
healing in the Antarctic ozone layer. Science 353(6296):269–274
57. Joskow PL, Schmalensee R (1998) The political economy of market-based environmental
policy: the U.S. acid rain program. J Law Econ 41(1):37–84
58. Stavins RN (1998) What can we learn from the grand policy experiment? Lessons from SO2
allowance trading. J Econom Perspect 12(3):69–88
59. Chan HR, Chupp BA, Cropper ML, Muller NZ (2018) The impact of trading on the costs and
benefits of the acid rain program. J Environ Econ Manag 88:180–209
60. Nordhaus W (2019) Climate change: the ultimate challenge for economics. Am Econ Rev
109(6):1991–2014
61. Velders GJM, Andersen SO, Daniel JS, Fahey DW, McFarland M (2007) The importance of
the Montreal protocol in protecting climate. Proc Natl Acad Sci 104(12):4814–4819
62. Patt A, Lilliestam J (2018) The case against carbon prices. Joule 2:2494–2498
Shadow Pricing in Production Economics
23
Rolf Färe, Shawna Grosskopf, and Dimitris Margaritis

Contents
Introduction: What Is a Shadow Price?
Primal Representation of Technology: Distance Functions
Calculus and Dual Spaces
Pricing Inputs and Outputs
  One Input Price Is Known
  Total Cost Is Known
  Pricing Inputs with a Single-Output Technology and CRS
Pricing Outputs and Their Characteristics
  Pricing Outputs When One Output Price Is Known
  Total Revenue Is Known
Cost and Revenue Indirect Pricing Models
  Cost Indirect Pricing Models
  Revenue Indirect Pricing Models
  Sub-cost and Sub-revenue Indirect Models
Pricing Inputs and Outputs: A Profit Maximization Approach

R. Färe ()
Department of Economics and Department of Agricultural and Resource Economics, Oregon
State University, Corvallis, OR, USA
Department of Economics and Department of Applied Economics, School of Public Policy,
Oregon State University, Corvallis, OR, USA
Department of Agricultural Economics, University of Maryland, College Park, MD, USA
e-mail: [email protected]
S. Grosskopf
Department of Economics, School of Public Policy, Oregon State University, Corvallis, OR, USA
e-mail: [email protected]
D. Margaritis
Department of Accounting and Finance, University of Auckland Business School, Auckland,
New Zealand
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2022 951

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_16

Appendix A: Catalog of Shadow Pricing Rules
  Input Pricing Rules
  Output Pricing Rules
  Pricing Inputs, Indirect Approaches
  Pricing Outputs, Indirect Approaches
  Price and Quantity Mixed, Indirect Approaches
  Pricing Under CRS
  Pricing Inputs and Outputs When Total Profit and (x, y) Are Known
Appendix B: Functional Forms
More Formal Exposition of Calculus and Primal and Dual Spaces
References

Abstract

This chapter is devoted to outlining production theoretical approaches to identifying
shadow prices. Shadow prices have a long history in economics; they
are perhaps most familiar from welfare economics and benefit-cost analysis.
The focus here is narrower – shadow prices derived in a production theoretical
framework. We seek to identify prices that are consistent with resource use or
opportunity cost that would induce the decision-maker to choose the observed
quantity vector. Market prices – if they exist – may be consistent with this
condition, but there are many cases where they are not, often referred to as cases
of market failure.
We begin with the function representations of technology used to identify
shadow prices, i.e., distance functions and their dual associated value functions.
We emphasize the relationship between primal and dual spaces through their
connection with calculus. Perhaps the most familiar example of our approach is
Shephard’s lemma, which uses calculus and duality to find the optimal input
quantities associated with the cost function, i.e., going from price space to
quantity space. Here we adopt a dual Shephard’s lemma approach: begin with
quantity space representations of technology, and use duality and calculus to
find the associated dual support prices. These include shadow pricing of inputs
or their characteristics, shadow prices of outputs, as well as pricing of inputs
and outputs in a profit function and directional distance function framework.
We include an appendix devoted to the choice of appropriate functional forms
which accommodate the underlying structure of technology and the calculus. A
more detailed theoretical development of the role of calculus in our approach
to shadow pricing suggested to us by Robert Chambers is also included in an
appendix.

Keywords

Shadow pricing · Production economics · Shephard distance functions ·

Directional distance functions · Cost pricing · Revenue pricing

Introduction: What Is a Shadow Price?

“Let x be a given vector of quantities. Then p is a shadow price for vector x if x
is an optimal choice given p” (Daniel Primont, private communication, August 29,
2016).
Building on [1], this chapter is devoted to outlining production theoretical
approaches to identifying shadow prices. Shadow prices have a long history in
economics; they are perhaps most familiar from welfare economics and benefit-cost
analysis (see, e.g., [2]). Although the focus here is narrower – shadow prices derived
in a production theoretical framework (see [3] for a consumer-based approach) – the
definition is motivated by basic welfare economics notions. In simple undergraduate
textbook terms, we seek to identify prices that are consistent with resource use
or opportunity cost that would induce the decision-maker to choose the observed
quantity vector. Market prices – if they exist – may be consistent with this condition,
but there are many cases where they are not, often referred to as cases of market
failure. Among other examples are the cases of public goods, positive and negative
externalities, and price distortions due to regulation. For example, production
decisions may be made with respect to shadow prices rather than observed market
prices. In this sense the wedge between these two prices reflects the effects of
exogenous constraints faced by the producer; see [4–8], and [9] among many
others.
We begin with some preliminaries to focus on the production setting. The starting
point includes the basic function representations of technology which we use to
identify shadow prices, i.e., various distance functions and their dual associated
value functions. We emphasize the relationship between primal and dual spaces
through their connection with calculus. Perhaps the most familiar example which
exploits the relationship between duality and calculus is Shephard’s lemma. After
the preliminaries we turn to various cases of shadow pricing beginning with pricing
inputs or input characteristics. Shadow prices of outputs are next followed by
pricing of inputs and outputs in a profit function and directional distance function
framework. Our basic empirical focus here is on parametric specifications of
technology to facilitate the use of calculus to identify the shadow prices. We
include an appendix devoted to the choice of appropriate functional forms which
accommodate the underlying structure of technology.1 Also included in an appendix
is a more detailed theoretical development of the role of calculus in our approach
to shadow pricing suggested to us by Robert Chambers. We summarize all of our
pricing models in yet another appendix.

1 Although not included in this chapter, nonparametric estimation is possible. For how to apply
calculus to estimation with, for example, data envelopment analysis, see [10].

Primal Representation of Technology: Distance Functions

In this chapter we derive shadow pricing rules using nine different distance
functions as our representations of technology.2 They are function representations of
technology that accommodate multiple input quantities, as do production functions.
In contrast to production functions, they accommodate multiple output quantities
as well. The distance functions we employ may be classified into two groups:
those that are defined as radial scalings (like Shephard-type distance functions)
and the second group dubbed directional distance functions. In order to provide
some intuition and perhaps clarification, we begin with the underlying technology
sets and the properties required to ensure existence of our distance functions and
the resulting properties inherited by the distance functions. We show how these
functions are defined and construct a chart summarizing these distance functions
and relationships.
The basic technology sets underlying our distance functions include the input set
consisting of all input vectors $x = (x_1, \ldots, x_N) \in \mathbb{R}^N_+$

$L(y) = \{x \in \mathbb{R}^N_+ : x \text{ can produce } y \in \mathbb{R}^M_+\}$, (1)

and the output set of all output vectors $y = (y_1, \ldots, y_M) \in \mathbb{R}^M_+$

$P(x) = \{y \in \mathbb{R}^M_+ : x \in L(y)\}$. (2)

We assume free disposability of inputs and outputs (to ensure the existence of our
distance functions) and that the technology set may be defined as

T = {(x, y) : x ∈ L(y)} (3)

and is closed and strictly convex. This ensures that our functions are differentiable
and globally optimal. Note that these three sets represent the same technology, hence
(x, y) ∈ T ⇔ x ∈ L(y) ⇔ y ∈ P (x).
Before formally introducing the various distance functions and their relation-
ships, we first summarize key properties of the two types of distance functions that
they inherit from the technology and their definitions. We illustrate for the input
side; the output side is similarly derived. The standard radial input distance function
is defined as

Di (y, x) = sup{λ > 0 : x/λ ∈ L(y)} (4)

and the directional input distance function is

$\vec{D}_i(x, y; g_x) = \sup\{\beta \in \mathbb{R} : x - \beta g_x \in L(y)\}$, (5)

2 The sub-cost and sub-revenue distance functions are not included in this count.

where gx is a direction vector. Given the conditions above, both of these are function
representations of the technology so that

$D_i(y, x) \geq 1 \Leftrightarrow x \in L(y) \Leftrightarrow \vec{D}_i(x, y; g_x) \geq 0$. (6)

In terms of our shadow prices, the most important properties of these two types
of distance functions are (i) positive homogeneity of Di (y, x) in x which ensures
that Euler’s theorem holds

$\nabla_x D_i(y, x)\, x = D_i(y, x)$ (7)

and (ii) the translation property for $\vec{D}_i(x, y; g_x)$

$\vec{D}_i(x + \alpha g_x, y; g_x) = \vec{D}_i(x, y; g_x) + \alpha$. (8)

Parallel to (7), the inner product of the gradient of the directional distance function
and the direction vector is one3 which implies that

$\nabla_x \vec{D}_i(x + \alpha g_x, y; g_x) = \nabla_x \vec{D}_i(x, y; g_x)$. (9)

We now can define our distance functions – providing a function representation

of the underlying sets defined above, which we can ultimately estimate. See the
appendix on functional forms for details on parameterization of these functions.
Beginning with input-based radial and directional distance functions, we have

$D_i(y, x) = \sup\{\theta : x/\theta \in L(y)\}$
$\vec{D}_i(x, y; g_x) = \sup\{\beta : (x - \beta g_x) \in L(y)\}$
$ID_i(p/r, x) = \sup\{\lambda : x/\lambda \in L(y),\ py \geq r\}$
$\vec{ID}_i(p/r, x; g_x) = \sup\{\beta : (x - \beta g_x) \in L(y),\ py \geq r\}$, (10)

where $ID_i(p/r, x)$ and $\vec{ID}_i(p/r, x; g_x)$ denote revenue indirect input distance
functions, both Shephard-type and directional distance functions, respectively.
These functions are appropriate for entities operating with revenue targets $py \geq r$.
On the output side we have

$D_o(x, y) = \inf\{\theta : y/\theta \in P(x)\}$
$\vec{D}_o(x, y; g_y) = \sup\{\beta : (y + \beta g_y) \in P(x)\}$
$ID_o(w/c, y) = \inf\{\theta : y/\theta \in P(x),\ wx \leq c\}$
$\vec{ID}_o(w/c, y; g_y) = \sup\{\beta : (y + \beta g_y) \in P(x),\ wx \leq c\}$, (11)

where $ID_o(w/c, y)$ and $\vec{ID}_o(w/c, y; g_y)$ denote cost indirect output distance
functions which are appropriate for entities operating under a budget constraint,
typical of the public sector.

3 This is readily shown by differentiating (8) with respect to α and then setting α equal to zero.

We also have the directional distance function defined on the technology set:

$\vec{D}_T(x, y; g_x, g_y) = \sup\{\beta : (x - \beta g_x, y + \beta g_y) \in T\}$. (12)

The relationships among these distance functions are depicted in Fig. 1.4 The
numbers in parentheses refer to the more detailed explanations of the relationships
below the schematic.

Input Distance Functions (Left-Hand Side of Fig. 1)

(1) If we set $g_x = x$ in $\vec{D}_i(x, y; g_x)$, then (see [11])

$\vec{D}_i(x, y; g_x) = 1 - 1/D_i(y, x)$.

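(2) $ID_i(p/r, x)$ and $D_i(y, x)$ are dual to each other (see [12]):

$D_i(y, x) = \inf_{p/r}\{ID_i(p/r, x) : py \geq r\}$

$ID_i(p/r, x) = \sup_{y}\{D_i(y, x) : py \geq r\}$

[Fig. 1 Distance functions. Top row, left to right: $ID_i(p/r, x)$, $D_i(y, x)$, $D_o(x, y)$,
$ID_o(w/c, y)$, joined by links (2), (9), and (6). Bottom row, left to right:
$\vec{ID}_i(p/r, x; g_x)$, $\vec{D}_i(x, y; g_x)$, $\vec{D}_o(x, y; g_y)$, $\vec{ID}_o(w/c, y; g_y)$,
joined by links (3) and (7). Vertical links (4), (1), (5), and (8) connect each radial
function to the directional function below it; links (10) and (11) connect
$\vec{D}_i$ and $\vec{D}_o$ to $\vec{D}_T(x, y; g_x, g_y)$.]

4 Many thanks to Maryam Hasannasab for the figure.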

(3) $\vec{ID}_i(p/r, x; g_x)$ and $\vec{D}_i(x, y; g_x)$ are dual to each other (see [13]):

$\vec{D}_i(x, y; g_x) = \inf_{p/r}\{\vec{ID}_i(p/r, x; g_x) : py \geq r\}$

$\vec{ID}_i(p/r, x; g_x) = \sup_{y}\{\vec{D}_i(x, y; g_x) : py \geq r\}$

(4) To our knowledge this relationship is not available in the literature, but we derive
it here. Let $g_x = x$; then

$\vec{ID}_i(p/r, x; g_x) = \sup\{\beta : (x - \beta x) \in L(y),\ py \geq r\}$
$= \sup\{\beta : D_i(y, x(1 - \beta)) \geq 1,\ py \geq r\}$
$= \sup\{\beta : D_i(y, x)(1 - \beta) \geq 1,\ py \geq r\}$
$= 1 + \sup\{-(1 - \beta) : (1 - \beta) \geq 1/D_i(y, x),\ py \geq r\}$
$= 1 - \dfrac{1}{ID_i(p/r, x)}$
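
Relationship (1) can be checked numerically. The following is a minimal sketch, assuming a hypothetical technology $L(y) = \{x : \sqrt{x_1 x_2} \geq y\}$ that is not taken from the chapter; the function names and the bisection bounds are illustrative choices. The directional function is computed directly from its definition and compared with $1 - 1/D_i(y, x)$.

```python
import math

def radial_input_distance(y, x):
    """D_i(y, x) for the assumed technology L(y) = {x : sqrt(x1*x2) >= y}."""
    return math.sqrt(x[0] * x[1]) / y

def directional_input_distance(y, x, g, lo=-10.0, hi=10.0, tol=1e-12):
    """sup{beta : x - beta*g in L(y)}, located by bisection on beta.

    Assumes beta = lo is feasible and beta = hi is infeasible,
    which holds for the data used below.
    """
    def feasible(beta):
        z = [xn - beta * gn for xn, gn in zip(x, g)]
        return min(z) >= 0 and math.sqrt(z[0] * z[1]) >= y

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo

x, y = (4.0, 9.0), 3.0
d_radial = radial_input_distance(y, x)               # sqrt(36) / 3
d_direct = directional_input_distance(y, x, g=x)     # direction g_x = x
gap = d_direct - (1.0 - 1.0 / d_radial)              # should be close to zero
```

Choosing the direction equal to the observed input vector turns the additive contraction into a proportional one, which is why the two measures carry the same information in this case.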

Output Distance Functions (Right-Hand Side of Fig. 1)

(5) As in (1), these two distance functions are related through the choice of
direction. Set $g_y = y$; then (see [11])

$\vec{D}_o(x, y; g_y) = 1/D_o(x, y) - 1$.

(6) These functions are dual to each other (see [12]):

$D_o(x, y) = \sup_{w/c}\{ID_o(w/c, y) : wx \leq c\}$

$ID_o(w/c, y) = \inf_{x}\{D_o(x, y) : wx \leq c\}$.

(7) These functions are dual to each other (see [13]):

$\vec{D}_o(x, y; g_y) = \inf_{w/c}\{\vec{ID}_o(w/c, y; g_y) : wx \leq c\}$

$\vec{ID}_o(w/c, y; g_y) = \sup_{x}\{\vec{D}_o(x, y; g_y) : wx \leq c\}$

(8) Set $g_y = y$ (the proof is left to the reader); then

$\vec{ID}_o(w/c, y; g_y) = 1/ID_o(w/c, y) - 1$.

(9) Under CRS we have

Di (y, x) = 1/Do (x, y).

(10) Set $g_y = 0$; then

$\vec{D}_i(x, y; g_x) = \vec{D}_T(x, y; g_x, 0)$.

(11) Set $g_x = 0$; then

$\vec{D}_o(x, y; g_y) = \vec{D}_T(x, y; 0, g_y)$.

Calculus and Dual Spaces

The purpose of this section is to provide some basic intuition with respect to the
role of calculus in deriving shadow prices. In particular we highlight the role of the
derivative of a function – which takes us from the primal space to the dual space on
which the function is defined.5 An important example from economics is Shephard’s
lemma [18], which we use to illustrate this concept.
We provide a sketch of this lemma and begin by recalling the definition of the
input requirement set L(y) as

L(y) = {x : x can produce y} (13)

where $x \in \mathbb{R}^N_+$ is an input vector and $y \in \mathbb{R}^M_+$ an output vector. The cost
function is defined as

$C(y, w) = \min_x\ wx$ s.t. $x \in L(y)$, (14)

where $w \in \mathbb{R}^N_+$ denotes input prices.

Shephard’s lemma states that the partial derivative of the cost function with
respect to any input price wn yields the associated cost minimizing input quantity
xn , i.e.,

∂C(y, w)/∂wn = xn , n = 1, . . . , N, (15)

5 For a more formal treatment which requires more advanced calculus concepts than those we
use here, we refer the reader to the appendix, which was included with the collaboration of
Robert Chambers.

or even more compactly this can be written in terms of the gradient vector of the
cost function as

$\nabla_w C(y, w) = x$. (16)

The gradient vector can be written in more detail as

$\nabla_w C(y, w_1, \ldots, w_N) = \left(\dfrac{\partial C}{\partial w_1}, \ldots, \dfrac{\partial C}{\partial w_N}\right)$. (17)

Thus the derivative taken with respect to a price wn yields a quantity xn , where
the two variables are dual to each other.
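
Shephard's lemma can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical two-input Cobb-Douglas technology $y = x_1^a x_2^{1-a}$ with $a = 0.3$; the function names, parameter values, and step size are illustrative choices, not part of the chapter. It differentiates the closed-form cost function with respect to prices and compares the result with the cost-minimizing input quantities.

```python
def cost(y, w, a=0.3):
    """Cost function dual to the assumed technology y = x1**a * x2**(1 - a)."""
    return y * (w[0] / a) ** a * (w[1] / (1 - a)) ** (1 - a)

def conditional_demand(y, w, a=0.3):
    """Cost-minimizing inputs for the same technology, from the first-order conditions."""
    ratio = (a * w[1]) / ((1 - a) * w[0])
    return (y * ratio ** (1 - a), y * ratio ** (-a))

def gradient(f, w, h=1e-6):
    """Central-difference gradient of f at the point w."""
    grad = []
    for n in range(len(w)):
        up, dn = list(w), list(w)
        up[n] += h
        dn[n] -= h
        grad.append((f(up) - f(dn)) / (2 * h))
    return grad

y, w = 2.0, (4.0, 3.0)
x_from_lemma = gradient(lambda prices: cost(y, prices), w)  # derivative in price space
x_direct = conditional_demand(y, w)                         # quantities in primal space
```

The derivative is taken with respect to prices, yet the result is a vector of quantities, which is the sense in which differentiation moves between the two spaces.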
We state this idea in more general terms. Let $l : \mathbb{R}^2 \to \mathbb{R}$ be a function with
domain $\mathbb{R}^2$ and range $\mathbb{R}$. This function is linear if, for $q, q^o, q^1 \in \mathbb{R}^2$,

$l(q^o + q^1) = l(q^o) + l(q^1)$ (18)

and since this is Cauchy’s first equation, under continuity it implies

$l(aq) = a\,l(q)$, $a \in \mathbb{R}$. (19)

One can prove that a function is linear if and only if (see [14] and [15]) there
exist real numbers b1 , b2 such that

l(q) = b1 q1 + b2 q2 . (20)

The dual space to $\mathbb{R}^2$, $(\mathbb{R}^2)^*$, consists of all continuous linear functionals
($f$, $f_1$, $f_2$), with addition and scalar multiplication given by

$(f_1 + f_2)q = f_1(q) + f_2(q)$ (21)

and

$(af)q = a(f(q))$. (22)

The following holds

$\mathbb{R}^2 = (\mathbb{R}^2)^*$, (23)

i.e., if $\mathbb{R}^2$ is the quantity space and $(\mathbb{R}^2)^*$ is the price space, they coincide.6 We note
that in economics, where we think of quantities being in primal space and prices in
dual space, we consider them as different. We typically can “consume” quantities,
but not prices.

6 This is true here, but in general it is not. See, e.g., [16].
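To relate the dual space to calculus, let

$g : \mathbb{R} \to \mathbb{R}$. (24)

This function is differentiable if there exists an $l$, an element of $\mathbb{R} = \mathbb{R}^*$, that is, a
linear function $l : \mathbb{R} \to \mathbb{R}$, such that

$\lim_{h \to 0} \dfrac{g(q + h) - g(q) - l(h)}{h} = 0$. (25)

By homogeneity of $l$, $l(h) = h\,l(1)$, so we have

$\lim_{h \to 0} \dfrac{g(q + h) - g(q)}{h} - l(1) = 0$, (26)

and hence

$\dfrac{dg(q)}{dq} = l(1) \in (\mathbb{R})^*$. (27)

Therefore

$dg(q)/dq \in (\mathbb{R})^*$ (28)

belongs to the dual space of $q$.

If $g : \mathbb{R}^N \to \mathbb{R}$, then the gradient vector is

$\nabla_q g(q_1, \ldots, q_N) = \left(\dfrac{\partial g}{\partial q_1}, \ldots, \dfrac{\partial g}{\partial q_N}\right) \in (\mathbb{R}^N)^*$. (29)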

In this chapter we use these concepts to derive shadow price models. Specifically,
we model technologies in terms of our previously defined distance functions, which
are typically functions we define in quantity space and apply calculus to derive the
corresponding dual space shadow prices.

Pricing Inputs and Outputs

One Input Price Is Known

In this section we assume that we observe one input price, as well as input quantities
$x \in \mathbb{R}^N_+$ and output quantities $y \in \mathbb{R}^M_+$. Let the one known input price be $w_1$; our
goal is to show that we can retrieve shadow prices for the remaining (unknown)
input prices $(w_2, \ldots, w_N)$ using our known data $w_1$, $x$, and $y$. Recall that the Shephard
input distance function is defined as

$D_i(y, x) = \sup\{\theta : (x/\theta) \in L(y)\}$ (30)

where

$L(y) = \{x : x \text{ can produce } y\}$, $y \in \mathbb{R}^M_+$ (31)

is the input requirement set. For our purpose, two properties of the distance function
introduced earlier in (6) and (7) are key, namely,

(i) $x \in L(y)$ if and only if $D_i(y, x) \geq 1$,

and

(ii) $D_i(y, \theta x) = \theta D_i(y, x)$, $\theta > 0$.

We refer to (i) as the representation property and (ii) as homogeneity of
degree +1 in inputs. The homogeneity property follows from the definition of the
distance function, while the representation property holds if inputs are weakly
disposable, i.e.,

$x \in L(y),\ \theta \geq 1 \Rightarrow \theta x \in L(y)$. (32)

If we denote input prices by $w = (w_1, \ldots, w_N) \in \mathbb{R}^N_+$, then the cost function is
given by

$C(y, w) = \min_x\ wx$ s.t. $x \in L(y)$ (33)

which due to the representation property can be written as

$C(y, w) = \min_x\{wx : D_i(y, x) \geq 1\}$, (34)

or if written as a Lagrangian problem

$C(y, w) = \min_x\ wx - \mu(D_i(y, x) - 1)$. (35)

The first-order conditions associated with this problem are

$w - \mu \nabla_x D_i(y, x) = 0$. (36)

Using our first-order conditions, we can start to identify the unknown or shadow
price $w_2$ from

$\dfrac{w_2}{w_1} = \dfrac{\partial D_i(y, x)/\partial x_2}{\partial D_i(y, x)/\partial x_1}$ (37)

which we can solve for $w_2$ as

$w_2 = w_1 \dfrac{\partial D_i(y, x)/\partial x_2}{\partial D_i(y, x)/\partial x_1}$, (38)

where the right-hand side consists of known data $w_1$, $x$, $y$.
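
The pricing rule (38) can be sketched in a few lines. This is only an illustration under stated assumptions: the input distance function is a hypothetical Cobb-Douglas form $D_i(y, x) = x_1^a x_2^{1-a}/y$ with $a = 0.3$ rather than an estimated translog, and all names and numbers are made up for the example.

```python
def input_distance(y, x, a=0.3):
    """Assumed Shephard input distance function, homogeneous of degree +1 in x."""
    return x[0] ** a * x[1] ** (1 - a) / y

def gradient_x(y, x, h=1e-6):
    """Central-difference gradient of the distance function with respect to inputs."""
    grad = []
    for n in range(len(x)):
        up, dn = list(x), list(x)
        up[n] += h
        dn[n] -= h
        grad.append((input_distance(y, up) - input_distance(y, dn)) / (2 * h))
    return grad

def shadow_price(w1, y, x, n):
    """Rule (38): price of input n implied by the known price of input 1 (index 0)."""
    grad = gradient_x(y, x)
    return w1 * grad[n] / grad[0]

# Hypothetical observation: known price w1 = 4, inputs x = (2, 5), output y = 3.
w2 = shadow_price(4.0, 3.0, (2.0, 5.0), n=1)
```

For this functional form the ratio of partial derivatives is $(1-a)x_1/(a x_2)$, so the numerical value can be checked by hand against $4 \cdot 0.7 \cdot 2/(0.3 \cdot 5)$.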

Recall that our preferred parameterization of Shephard input distance functions
is the translog function, which implies that no zeros are allowed in the data. To
allow for zeros, one may specify a quadratic function, which is our preferred
parameterization for the directional distance function.
The directional input distance function is defined as

$\vec{D}_i(x, y; g_x) = \sup\{\beta : x - \beta g_x \in L(y)\}$ (39)

where $g_x \in \mathbb{R}^N_+$, $g_x \neq 0$, is the directional vector which determines in which
direction the input vector x is contracted toward the boundary of L(y).
This distance function satisfies the representation property

$\vec{D}_i(x, y; g_x) \geq 0$ if and only if $x \in L(y)$ (40)

as well as the translation property

$\vec{D}_i(x - \alpha g_x, y; g_x) = \vec{D}_i(x, y; g_x) - \alpha$, $\alpha \in \mathbb{R}$. (41)

The translation property (which is an additive analog of the multiplicative
homogeneity property associated with Shephard distance functions) follows from
the definition of this distance function. The representation property holds when the
technology satisfies g-disposability (technology is g-disposable if $x \in L(y)$
implies $\hat{x} \in L(y)$, where $\hat{x} = x + \lambda g_x$, $\lambda \geq 0$).

Since $\vec{D}_i(x, y; g_x)$ satisfies the representation property, we can rewrite our cost
minimization definition as

$C(y, w) = \min_x\{wx : \vec{D}_i(x, y; g_x) \geq 0\}$, (42)

or as a Lagrangian problem

$C(y, w) = \min_x\ wx - \mu \vec{D}_i(x, y; g_x)$. (43)

The associated first-order conditions7

$w - \mu \nabla_x \vec{D}_i(x, y; g_x) = 0$ (44)

may be used as the basis for our next pricing rule for unknown input price $w_2$

$w_2 = w_1 \dfrac{\partial \vec{D}_i(x, y; g_x)/\partial x_2}{\partial \vec{D}_i(x, y; g_x)/\partial x_1}$, (45)

where again we are assuming that $w_1$, $x$, $y$, and $g_x$ are known. It is important to note
that the resulting value of $w_2$ depends on the choice of the direction vector $g_x$, i.e.,
it may change as $g_x$ changes.

Total Cost Is Known

In this section we assume that we observe total cost

$c = wx = \sum_{n=1}^{N} w_n x_n$; (46)

however, we do not observe individual input prices $w_n$. In this case we can derive
the following input pricing rule

$w = c\, \dfrac{\nabla_x D_i(y, x)}{D_i(y, x)}$. (47)

To verify this rule, recall our first-order conditions for cost minimization with the
Shephard input distance function

$w = \mu \nabla_x D_i(y, x)$. (48)

We start by deriving a useful interpretation of the multiplier μ. Multiply both
sides of the first-order condition by x,

$wx = \mu \nabla_x D_i(y, x)\, x$, (49)

and by the homogeneity property of the distance function in the input vector (see
(7)), we have

$c = \mu D_i(y, x)$ (50)
$\mu = c/D_i(y, x)$
$\mu = C(y, w)$.

Inserting this into our first-order conditions results in

$w = C(y, w) \nabla_x D_i(y, x)$. (51)

7 Note that the inner product of the gradient vector of $\vec{D}_i(x, y; g_x)$ and $g_x$ is equal to
one by the translation property.

Multiplying both sides by x and applying Euler’s theorem to the distance
function, we obtain

$c = wx = C(y, w) D_i(y, x)$ (52)

which implies

$C(y, w) = c/D_i(y, x)$. (53)

We can now use the right-hand side to substitute for the unobserved C(y, w) to
yield our pricing rule:

$w = C(y, w) \nabla_x D_i(y, x) = c\, \dfrac{\nabla_x D_i(y, x)}{D_i(y, x)}$. (54)

Thus given observed data on c, y, x, and the Shephard input distance function, the
input price vector w can be estimated.
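
A small numerical sketch of rule (54) follows. It again assumes a hypothetical Cobb-Douglas input distance function $D_i(y, x) = x_1^a x_2^{1-a}/y$ with $a = 0.3$ and made-up data; none of the names or values come from the chapter. Because the distance function is homogeneous of degree +1 in inputs, Euler's theorem guarantees that the recovered prices reproduce the observed total cost.

```python
def input_distance(y, x, a=0.3):
    """Assumed Shephard input distance function, homogeneous of degree +1 in x."""
    return x[0] ** a * x[1] ** (1 - a) / y

def gradient_x(y, x, h=1e-6):
    """Central-difference gradient of the distance function with respect to inputs."""
    grad = []
    for n in range(len(x)):
        up, dn = list(x), list(x)
        up[n] += h
        dn[n] -= h
        grad.append((input_distance(y, up) - input_distance(y, dn)) / (2 * h))
    return grad

def price_vector(c, y, x):
    """Rule (54): w = c * gradient of D_i divided by D_i."""
    d = input_distance(y, x)
    return [c * gn / d for gn in gradient_x(y, x)]

# Hypothetical observation: total cost c = 100, inputs x = (2, 5), output y = 3.
c, y, x = 100.0, 3.0, (2.0, 5.0)
w = price_vector(c, y, x)
implied_cost = sum(wn * xn for wn, xn in zip(w, x))  # equals c by Euler's theorem
```

For this form the rule reduces to $w_n = c\,a_n/x_n$ with $a_1 = a$ and $a_2 = 1 - a$, so the output level drops out of the recovered prices.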
To relate the above pricing rule to earlier literature, assume that inputs are
technically efficient in the sense of Farrell, i.e.,

Di (y, x) = 1. (55)

Also assume that we seek to find cost-deflated input prices:

ŵ = w/c. (56)

In this case the pricing rule becomes

$\hat{w} = \nabla_x D_i(y, x)$, (57)

and this expression for a single output can be found in [17] as well as in [18].
Again, if there are zeros in the data, the translog function associated with the
Shephard input distance functions may not be suitable. The directional distance
function $\vec{D}_i(x, y; g_x)$ is an alternative. Rewriting the first-order condition for cost
minimization

$w - \mu \nabla_x \vec{D}_i(x, y; g_x) = 0$ (58)

we can derive the input pricing rule as

$w = c\, \dfrac{\nabla_x \vec{D}_i(x, y; g_x)}{\nabla_x \vec{D}_i(x, y; g_x)\, x}$. (59)

To verify this rule, we again need to interpret the Lagrangian multiplier under
the directional distance function constraint. We set up a perturbation of the cost
minimization problem as8

$\hat{C}(y, w, \alpha) = \min_x\{wx - \mu(\vec{D}_i(x, y; g_x) - \alpha)\}$ (60)
$= \alpha w g_x + \min_{\hat{x}}\{w\hat{x} - \mu \vec{D}_i(\hat{x}, y; g_x)\}$

by the translation property, where $\hat{x} = x - \alpha g_x$. Thus

$\hat{C}(y, w, \alpha) = \alpha w g_x + C(y, w)$. (61)

Differentiating with respect to α

$\partial \hat{C}(y, w, \alpha)/\partial \alpha = \mu = w g_x$. (62)

So in the directional distance function case, the multiplier may be interpreted
as the value of the direction vector $g_x$. Inserting this expression into the first-order
conditions yields

$w = w g_x \nabla_x \vec{D}_i(x, y; g_x)$. (63)

Multiplying both sides by x and noting that c = wx, we have

$c = w g_x \nabla_x \vec{D}_i(x, y; g_x)\, x$ (64)

$w g_x = \dfrac{c}{\nabla_x \vec{D}_i(x, y; g_x)\, x}$. (65)

From this and using $w = w g_x \nabla_x \vec{D}_i(x, y; g_x)$ from above to substitute for $w g_x$,
we derive the pricing rule

$w = c\, \dfrac{\nabla_x \vec{D}_i(x, y; g_x)}{\nabla_x \vec{D}_i(x, y; g_x)\, x}$. (66)

8 Alternatively, we could obtain $\mu = w g_x$ by multiplying both sides of (58) by $g_x$, noting that
the inner product of the gradient of the directional distance function and the direction vector is
equal to one, and then use this result to obtain (63) directly.

This pricing rule is in terms of the directional input distance function and
therefore will depend on the choice of the directional vector gx . Note that in contrast
to the pricing rule for the radial input distance function given by (54), here we have
in the denominator of (66) the inner product of the gradient of the distance function
times the input vector rather than simply the input distance function since we cannot
apply Euler’s theorem.
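
The directional rule (66) can be sketched the same way. The example below assumes a hypothetical technology $L(y) = \{x : \sqrt{x_1 x_2} \geq y\}$ and the direction $g_x = (1, 1)$, for which the directional input distance function has a closed form; these choices and all names are illustrative and are not the quadratic parameterization discussed in the chapter. The sketch also checks the translation property and the unit inner product of the gradient with the direction vector.

```python
import math

G = (1.0, 1.0)  # direction vector g_x; an arbitrary choice for this sketch

def directional_distance(y, x):
    """sup{beta : x - beta*G in L(y)} for the assumed L(y) = {x : sqrt(x1*x2) >= y}.

    Solves (x1 - beta) * (x2 - beta) = y**2 for the smaller root.
    """
    s = x[0] + x[1]
    return 0.5 * (s - math.sqrt((x[0] - x[1]) ** 2 + 4.0 * y * y))

def gradient_x(y, x, h=1e-6):
    """Central-difference gradient of the directional distance function in x."""
    grad = []
    for n in range(len(x)):
        up, dn = list(x), list(x)
        up[n] += h
        dn[n] -= h
        grad.append((directional_distance(y, up) - directional_distance(y, dn)) / (2 * h))
    return grad

def price_vector(c, y, x):
    """Rule (66): w = c * gradient / (gradient times x)."""
    grad = gradient_x(y, x)
    scale = sum(gn * xn for gn, xn in zip(grad, x))
    return [c * gn / scale for gn in grad]

# Hypothetical observation: total cost c = 100, inputs x = (4, 9), output y = 3.
c, y, x = 100.0, 3.0, (4.0, 9.0)
grad = gradient_x(y, x)
w = price_vector(c, y, x)
grad_dot_g = sum(gn * dn for gn, dn in zip(grad, G))       # one, by translation
implied_cost = sum(wn * xn for wn, xn in zip(w, x))        # reproduces c
alpha = 0.5
shifted = tuple(xn + alpha * dn for xn, dn in zip(x, G))
translation_gap = directional_distance(y, shifted) - directional_distance(y, x) - alpha
```

Rerunning the sketch with a different direction vector would require a different distance function and would in general give different prices, which is the dependence on $g_x$ noted in the text.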

Pricing Inputs with a Single-Output Technology and CRS

Here we relate our Shephard input distance function model to those of [19] and
[20] by assuming that only one output $y \in \mathbb{R}_+$ is produced (e.g., house value). In
addition we assume that technology exhibits CRS (constant returns to scale), i.e.,

$L(\lambda y) = \lambda L(y)$, $\lambda > 0$

or equivalently

$D_i(\lambda y, x) = (1/\lambda) D_i(y, x)$, $\lambda > 0$. (67)

In this case the pricing rule (54) is also independent of output, i.e.,

$w = c\, \dfrac{\nabla_x D_i(1, x)}{D_i(1, x)}$. (68)

Since CRS and single output yield a distance function

$D_i(y, x) = (1/y) D_i(1, x)$, $y > 0$, (69)

when the distance function takes a value of one,

$D_i(y, x) = 1$,

which signals technical efficiency, we have

$y = D_i(1, x)$ (70)

$y = F(x)$ (71)

where F(x) is a production function.

In their work on pricing, [19] takes F (x) to be a CES function, while [20] takes
F (x) to be a Cobb-Douglas function.

Pricing Outputs and Their Characteristics

Pricing Outputs When One Output Price Is Known

In this section we derive shadow prices for outputs when only one output price
is known: we seek to derive shadow prices for the outputs which do not have
observable output prices that reflect resource use or opportunity cost. This follows
the outline of the previous treatment of shadow pricing inputs and their characteristics.
We began the treatment of input shadow prices with the representation of
technology using the Shephard input distance function. Here we turn to Shephard’s
output distance function as the representation of a multiple output,
multiple input technology.
Recall that this function is defined on the output set

P (x) = {y : x can produce y} (72)

Do (x, y) = inf{θ : y/θ ∈ P (x)}. (73)

We exploit two of the properties of the output distance function to identify our
shadow prices, namely,

(i) $y \in P(x)$ if and only if $D_o(x, y) \leq 1$,

and

(ii) $D_o(x, \lambda y) = \lambda D_o(x, y)$, $\lambda > 0$.

The first condition is our representation property, which establishes the distance
function as a complete characterization of the underlying output set, with the
provision that outputs are weakly disposable, i.e.,

$y \in P(x),\ 0 \leq \theta \leq 1 \Rightarrow \theta y \in P(x)$. (74)

The second condition – homogeneity in outputs – follows from the definition of
$D_o(x, y)$ and is important for determining its parameterization.
The value function associated with the output distance function is the revenue
function which depends on the vector of output prices $p \in \mathbb{R}^M_+$ and inputs $x \in \mathbb{R}^N_+$
and is the result of maximizing revenues py given technology

$R(x, p) = \max_y\{py : y \in P(x)\}$ (75)

or using (i) above

$R(x, p) = \max_y\{py : D_o(x, y) \leq 1\}$. (76)

The Lagrangian formulation of this problem is

$R(x, p) = \max_y\ py - \mu(D_o(x, y) - 1)$, (77)

where μ is the Lagrangian multiplier.

The first-order conditions are

$p - \mu \nabla_y D_o(x, y) = 0$. (78)

Suppose the price of $y_1$ is known to equal its observed market price $p_1$; then we
can solve for the shadow price of, say, output $y_2$ using our first-order conditions

$\dfrac{p_2}{p_1} = \dfrac{\partial D_o(x, y)/\partial y_2}{\partial D_o(x, y)/\partial y_1}$ (79)

and

$p_2 = p_1 \dfrac{\partial D_o(x, y)/\partial y_2}{\partial D_o(x, y)/\partial y_1}$ (80)

where the right-hand side consists of observed data x, y, $p_1$. This model applies to
any $p_m$, m = 2, . . . , M, and of course requires an estimate of the distance function.
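
The output pricing rule (80) can be illustrated with a small sketch. It assumes a hypothetical single-input technology with output set $P(x) = \{y : \sqrt{y_1^2 + y_2^2} \leq x\}$, so that $D_o(x, y) = \sqrt{y_1^2 + y_2^2}/x$; this form, the function names, and the data are illustrative assumptions rather than the translog specification preferred in the chapter.

```python
import math

def output_distance(x, y):
    """Assumed Shephard output distance function, homogeneous of degree +1 in y."""
    return math.sqrt(y[0] ** 2 + y[1] ** 2) / x

def gradient_y(x, y, h=1e-6):
    """Central-difference gradient of the output distance function in outputs."""
    grad = []
    for m in range(len(y)):
        up, dn = list(y), list(y)
        up[m] += h
        dn[m] -= h
        grad.append((output_distance(x, up) - output_distance(x, dn)) / (2 * h))
    return grad

def shadow_price(p1, x, y, m):
    """Rule (80): price of output m implied by the known price of output 1 (index 0)."""
    grad = gradient_y(x, y)
    return p1 * grad[m] / grad[0]

# Hypothetical observation: known price p1 = 6, outputs y = (3, 4), input x = 10.
x, y = 10.0, (3.0, 4.0)
p2 = shadow_price(6.0, x, y, m=1)
d_o = output_distance(x, y)   # below one: the observation lies inside P(x)
```

For this form the ratio of partial derivatives is $y_2/y_1$, so the shadow price depends only on the output mix and not on how far the observation is from the frontier.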
Translog is our preferred parameterization of the output distance function.
However, if there are zeros in the data, our preferred alternative is a quadratic form
which allows for zeros and is consistent with the properties of the directional output
distance function, to which we turn next.

The directional output distance function is defined as

\[ \vec{D}_o(x, y; g_y) = \sup\{\beta : (y + \beta g_y) \in P(x)\}, \tag{81} \]

where $g_y \in \mathbb{R}^M_+$, $g_y \neq 0$ is the directional output vector, which specifies the direction
in which the output vector is projected onto the boundary of the output set $P(x)$. It
is important to note that $\vec{D}_o(x, y; g_y)$ is a function of $g_y$, since the choice of this
vector affects the resulting output price.
This distance function, like the directional input distance function, satisfies
representation

\[ \vec{D}_o(x, y; g_y) \ge 0 \text{ if and only if } y \in P(x) \tag{82} \]

and the translation property

\[ \vec{D}_o(x, y + \alpha g_y; g_y) = \vec{D}_o(x, y; g_y) - \alpha. \tag{83} \]

The translation property follows from the definition of the distance function,
and representation holds if the technology $P(x)$ is $g_y$-disposable. The translation property
is important for parameterizing $\vec{D}_o(x, y; g_y)$, as is discussed in a later section.
The revenue function $R(x, p)$ is also the value function associated with the
directional output distance function. Given output prices $p \in \mathbb{R}^M_+$, the revenue
maximization problem is

\[ R(x, p) = \max\{py : y \in P(x)\} \tag{84} \]

\[ R(x, p) = \max_y \{py : \vec{D}_o(x, y; g_y) \ge 0\}. \tag{85} \]

In Lagrangian form

\[ R(x, p) = \max_y \; py - \mu \vec{D}_o(x, y; g_y), \tag{86} \]

where, as before, $\mu$ is the Lagrangian multiplier. The first-order conditions for this
problem are

\[ p - \mu \nabla_y \vec{D}_o(x, y; g_y) = 0. \tag{87} \]

Suppose again that one output price is known, say $p_1$. Then we may derive the
shadow prices of outputs $m = 2, \ldots, M$ from the first-order conditions as

\[ p_m = p_1 \frac{\partial \vec{D}_o(x, y; g_y)/\partial y_m}{\partial \vec{D}_o(x, y; g_y)/\partial y_1}, \tag{88} \]

where the right-hand side consists of the data we have assumed are observable,
$(p_1, y, x)$, and the chosen direction vector $g_y$.
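A small numerical sketch may help fix ideas for rule (88) and the translation property (83). It assumes a hypothetical output set $P(x) = \{y : y_1^2 + y_2^2 \le x^2\}$, for which the directional distance has a closed form (the larger root of a quadratic in $\beta$); the set, the direction vector, and the data values are illustrative assumptions only.

```python
import math


def directional_output_distance(x, y, g):
    """sup{beta : y + beta*g in P(x)} for the assumed set P(x) = {y : y1^2 + y2^2 <= x^2}.

    The boundary condition |y + beta*g|^2 = x^2 is a quadratic in beta;
    the supremum is its larger root.
    """
    a = g[0] ** 2 + g[1] ** 2
    b = 2.0 * (y[0] * g[0] + y[1] * g[1])
    c = y[0] ** 2 + y[1] ** 2 - x ** 2
    return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)


def gradient_y(x, y, g, h=1e-6):
    """Central finite-difference gradient of the directional distance in y."""
    grad = []
    for m in range(len(y)):
        up = list(y)
        dn = list(y)
        up[m] += h
        dn[m] -= h
        diff = directional_output_distance(x, up, g) - directional_output_distance(x, dn, g)
        grad.append(diff / (2.0 * h))
    return grad


def shadow_prices(x, y, g, p1):
    """Rule (88): p_m = p_1 * (dD/dy_m) / (dD/dy_1)."""
    grad = gradient_y(x, y, g)
    return [p1 * d / grad[0] for d in grad]


x, g = 5.0, [1.0, 1.0]

# Translation property (83): D(x, y + alpha*g; g) = D(x, y; g) - alpha.
y_inside, alpha = [1.5, 2.0], 0.5
shifted = [y_inside[0] + alpha * g[0], y_inside[1] + alpha * g[1]]
translation_gap = directional_output_distance(x, shifted, g) - (
    directional_output_distance(x, y_inside, g) - alpha
)

# Shadow prices at a frontier point, given p_1 = 3.
prices_frontier = shadow_prices(x, [3.0, 4.0], g, p1=3.0)
print(translation_gap, prices_frontier)
```

At an interior point the gradient is evaluated at the projection of $y$ along $g_y$, so a different direction vector generally gives different shadow prices, which is the dependence on $g_y$ noted in the text.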
If one price is known, our pricing rules give the remaining
prices above as well as total revenue, since

\[ r = p_1 y_1 + p_1 \sum_{m=2}^{M} \frac{\partial D_o(x, y)/\partial y_m}{\partial D_o(x, y)/\partial y_1}\, y_m \tag{89} \]

or in terms of the directional distance function

\[ r = p_1 y_1 + p_1 \sum_{m=2}^{M} \frac{\partial \vec{D}_o(x, y; g_y)/\partial y_m}{\partial \vec{D}_o(x, y; g_y)/\partial y_1}\, y_m. \tag{90} \]

In the next section, we assume that revenue $r$ but no $p_m$ is known.

Total Revenue Is Known

We assume that total revenue

\[ r = py = \sum_{m=1}^{M} p_m y_m \tag{91} \]

is known, but none of the output prices $p_m$, $m = 1, \ldots, M$, are known. This leads to
two output pricing rules: one in terms of the Shephard output distance function and
another in terms of the directional output distance function, i.e.,

\[ p = r\, \frac{\nabla_y D_o(x, y)}{D_o(x, y)} \tag{92} \]

and

\[ p = r\, \frac{\nabla_y \vec{D}_o(x, y; g_y)}{\nabla_y \vec{D}_o(x, y; g_y)\, y}, \tag{93} \]

respectively.
Recall that the first-order conditions associated with the revenue maximization
problem with Shephard’s output distance function are

\[ p - \mu \nabla_y D_o(x, y) = 0 \tag{94} \]

or, equivalently,

\[ p = \mu \nabla_y D_o(x, y). \tag{95} \]

In order to derive a pricing rule using these conditions, we need to interpret the
multiplier $\mu$. We will prove that

\[ \mu = R(x, p), \tag{96} \]

i.e., the multiplier equals the value function, maximal revenue $R(x, p)$. As we did
for the input distance function case, we begin by multiplying both sides of the
first-order condition above by $y$. Summing and applying Euler’s theorem, since the
output distance function is homogeneous of degree 1 in $y$, we have

\[ py - \mu D_o(x, y) = 0. \tag{97} \]

Rearranging yields

\[ \mu = \frac{py}{D_o(x, y)} = R(x, p) \tag{98} \]

as desired, since at the revenue-maximizing output vector $py = R(x, p)$ and $D_o(x, y) = 1$.
Inserting this into the first-order condition gives

\[ p = R(x, p) \nabla_y D_o(x, y). \tag{99} \]

Multiply both sides by $y$ and apply Euler’s theorem to obtain

\[ r = py = R(x, p) D_o(x, y) \tag{100} \]

so that

\[ R(x, p) = r/D_o(x, y). \tag{101} \]

Inserting this into

\[ p = R(x, p) \nabla_y D_o(x, y) \tag{102} \]

yields our pricing rule

\[ p = r\, \frac{\nabla_y D_o(x, y)}{D_o(x, y)}, \tag{103} \]

where as before the right-hand side consists of observed data $r, x, y$, as well as
estimates of the Shephard output distance function and its gradient.
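As a check on rule (103), the following sketch computes all output shadow prices from total revenue alone and confirms that they reproduce observed revenue, which is what Euler’s theorem implies. The distance function is the same kind of hypothetical, degree-1 homogeneous form used purely for illustration; it is not an estimated function from the chapter.

```python
import math


def output_distance(x, y):
    """Hypothetical D_o(x, y) = sqrt(y1^2 + 4*y2^2) / sqrt(x); homogeneous of degree 1 in y."""
    return math.sqrt(y[0] ** 2 + 4.0 * y[1] ** 2) / math.sqrt(x)


def gradient_y(dist, x, y, h=1e-6):
    """Central finite-difference gradient of dist(x, y) with respect to y."""
    grad = []
    for m in range(len(y)):
        up = list(y)
        dn = list(y)
        up[m] += h
        dn[m] -= h
        grad.append((dist(x, up) - dist(x, dn)) / (2.0 * h))
    return grad


def shadow_prices_from_revenue(dist, x, y, r):
    """Rule (103): p = r * grad_y D_o(x, y) / D_o(x, y)."""
    d = dist(x, y)
    return [r * g_m / d for g_m in gradient_y(dist, x, y)]


# Illustrative data: quantities and total revenue are observed, prices are not.
x, y, r = 4.0, [1.0, 0.5], 10.0
p = shadow_prices_from_revenue(output_distance, x, y, r)

# By Euler's theorem the shadow prices exhaust observed revenue: p.y = r.
implied_revenue = sum(p_m * y_m for p_m, y_m in zip(p, y))
print(p, implied_revenue)
```

The adding-up check `implied_revenue == r` holds at any observed $y$, not only on the frontier, because the distance function appears in both numerator and denominator.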

Cost and Revenue Indirect Pricing Models

In the earlier sections of this chapter, we used Shephard and directional distance
functions to provide us with a function representation of technology based on input
and output quantities. Here we generalize that approach to the case in which the
decision-maker faces a budget constraint and revenue target, which were termed
indirect models by Shephard; see [21]. He recognized that these models were
especially useful for the case of service and public sectors. Following Shephard’s
terminology, technology in this section is represented by indirect distance
functions.

Cost Indirect Pricing Models

The pricing models discussed earlier used the classic Shephard and directional
distance functions as their representation of technology, defined in terms of input
and output quantity vectors. Here we define indirect distance functions which are
defined in terms of normalized input prices and output quantities. Thus this pricing
model makes use of prices, much like the classic competitive model that sets the
prices of outputs equal to marginal cost.
We begin with the traditional output set

\[ P(x) = \{y : x \text{ can produce } y\}, \quad x \in \mathbb{R}^N_+. \tag{104} \]

Input prices are $w \in \mathbb{R}^N_+$, and we let observed cost be denoted by $c$. The associated
budget constraint is

\[ c = wx = \sum_{n=1}^{N} w_n x_n. \tag{105} \]

We modify the traditional output set by introducing the budget constraint and
defining the cost indirect output set as the union of all output sets which satisfy the
budget constraint, i.e., all those output sets with input vectors which cost no more
than the given $c$

\[ \begin{aligned} IP(w/c) &= \{y : y \in P(x),\ wx \le c\} \\ &= \{y : y \in P(x),\ (w/c)x \le 1\}. \end{aligned} \tag{106} \]

We can now define a radial (Shephard-type) distance function on the cost indirect
output set I P (w/c) as

I Do (w/c, y) = inf{θ : y/θ ∈ I P (w/c)}. (107)

As in the direct distance function case, this function is homogeneous of degree +1
in the scaled output vector

\[ ID_o(w/c, \lambda y) = \lambda ID_o(w/c, y), \quad \lambda > 0. \tag{108} \]

It also satisfies the representation property

\[ ID_o(w/c, y) \le 1 \text{ if and only if } y \in IP(w/c), \tag{109} \]

\[ IP(w/c) = \{y : ID_o(w/c, y) \le 1\}, \tag{110} \]

if outputs are strongly disposable, i.e.,

\[ y \in IP(w/c),\ y' \le y \;\Rightarrow\; y' \in IP(w/c). \tag{111} \]

We are now ready to introduce the cost indirect pricing models, which are derived
using revenue maximization. We define the cost indirect revenue function as

\[ \begin{aligned} IR(w/c, p) &= \max\{py : y \in IP(w/c)\} \\ &= \max_y \{py : ID_o(w/c, y) \le 1\}, \end{aligned} \tag{112} \]

where the second equality follows from the representation property. As usual,
$p \in \mathbb{R}^M_+$ is a non-negative output price vector, and one can prove that the output set
$IP(w/c)$ is compact; therefore the maximum exists.
The Lagrangian formulation of the maximization problem is

I R(w/c, p) = max py − μ(I Do (w/c, y) − 1), (113)

where $\mu$ is the Lagrangian multiplier. The first-order conditions are

\[ p - \mu \nabla_y ID_o(w/c, y) = 0. \tag{114} \]

Suppose that we know $p_1$ and seek the price of output $m$, $p_m$, $m = 2, \ldots, M$;
then using the first-order conditions, we have

\[ \frac{p_m}{p_1} = \frac{\partial ID_o(w/c, y)/\partial y_m}{\partial ID_o(w/c, y)/\partial y_1}, \tag{115} \]

which yields the pricing model

\[ p_m = p_1 \frac{\partial ID_o(w/c, y)/\partial y_m}{\partial ID_o(w/c, y)/\partial y_1}, \quad m = 2, \ldots, M. \tag{116} \]

We have shown that if input prices $w$, total cost $c$, outputs $y$, and output price $p_1$
are known, we may compute the remaining output prices $p_m$, $m = 2, \ldots, M$. We note
that normalized input prices are used here.
In the case in which technology exhibits constant returns to scale, i.e.,

\[ L(\lambda y) = \lambda L(y), \quad \lambda > 0, \tag{117} \]

then the cost function is homogeneous of degree +1 in outputs

\[ C(\lambda y, w) = \lambda C(y, w), \quad \lambda > 0. \tag{118} \]

In turn, we can then prove that

\[ C(y, w/c) = ID_o(w/c, y), \tag{119} \]

in which case we may substitute the cost function for our cost indirect output
distance function. Using this in our maximization problem, we have

\[ \max\,(py - C(y, w)), \tag{120} \]

which yields the standard competitive result that output prices equal marginal cost

\[ p = \nabla_y C(y, w). \tag{121} \]

Again under CRS, we may use the indirect output distance function in the
maximization problem

\[ \max\,(py - ID_o(w/c, y)), \tag{122} \]

which yields the pricing rule

\[ p = \nabla_y ID_o(w/c, y). \tag{123} \]
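The cost indirect output distance function and the CRS identity (119) can be illustrated with a closed-form example. The sketch below assumes a hypothetical CRS technology $P(x) = \{y : \lVert y \rVert \le \sqrt{x_1 x_2}\}$ with two inputs and two outputs; this technology and all numbers are assumptions made for illustration, not taken from the chapter.

```python
import math


def max_scale(w, c):
    """Largest sqrt(x1*x2) affordable with budget c at input prices w.

    For this assumed Cobb-Douglas aggregator the budget is split equally,
    x_n = c / (2*w_n), which gives c / (2*sqrt(w1*w2)).
    """
    return c / (2.0 * math.sqrt(w[0] * w[1]))


def indirect_output_distance(w, c, y):
    """ID_o(w/c, y) = inf{theta : y/theta in IP(w/c)} for the assumed technology."""
    return math.hypot(y[0], y[1]) / max_scale(w, c)


def cost_function(y, w):
    """C(y, w) = min{wx : ||y|| <= sqrt(x1*x2)} = 2*sqrt(w1*w2)*||y||."""
    return 2.0 * math.sqrt(w[0] * w[1]) * math.hypot(y[0], y[1])


w, c, y = [1.0, 4.0], 8.0, [0.6, 0.8]
ido = indirect_output_distance(w, c, y)

# Identity (119) under CRS: the cost function at normalized prices w/c equals ID_o.
cost_at_normalized_prices = cost_function(y, [w[0] / c, w[1] / c])

# Homogeneity (108): doubling outputs doubles the indirect distance.
ido_doubled = indirect_output_distance(w, c, [2.0 * y[0], 2.0 * y[1]])

# Only normalized prices matter: scaling w and c together changes nothing.
ido_rescaled = indirect_output_distance([2.0 * w[0], 2.0 * w[1]], 2.0 * c, y)
print(ido, cost_at_normalized_prices, ido_doubled, ido_rescaled)
```

In this example the observed output vector is feasible within the budget because its indirect distance is below one.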

Next assume that we know total revenue

\[ r = py = \sum_{m=1}^{M} p_m y_m; \tag{124} \]

then we can derive the following pricing rule using the cost indirect output distance
function

\[ p = r\, \frac{\nabla_y ID_o(w/c, y)}{ID_o(w/c, y)}. \tag{125} \]

Recall that revenue maximization under the constraint $ID_o(w/c, y) \le 1$ yields the
first-order conditions

\[ p - \mu \nabla_y ID_o(w/c, y) = 0. \tag{126} \]

Following our earlier sections, one can prove that

μ = I R(w/c, p). (127)

Inserting this into the first-order conditions yields

\[ p = IR(w/c, p) \nabla_y ID_o(w/c, y). \tag{128} \]

Next multiply both sides by $y$ and apply Euler’s theorem to arrive at

\[ r = py = IR(w/c, p)\, ID_o(w/c, y) \tag{129} \]

\[ IR(w/c, p) = r/ID_o(w/c, y) \tag{130} \]

yielding the pricing rule

\[ p = r\, \frac{\nabla_y ID_o(w/c, y)}{ID_o(w/c, y)}. \tag{131} \]

Thus if we know revenue $r$ but not individual prices, we can derive a pricing rule
given $w$, $c$ and $y$.
Next we consider the indirect functions in terms of directional distance functions.
Thus define the cost indirect directional output distance function as

\[ \vec{ID}_o(w/c, y; g_y) = \sup\{\beta : (y + \beta g_y) \in P(x),\ wx \le c\}, \tag{132} \]

where $g_y \in \mathbb{R}^M_+$, $g_y \neq 0$ is the directional vector. This function satisfies the two
properties:

\[ \vec{ID}_o(w/c, y + \alpha g_y; g_y) = \vec{ID}_o(w/c, y; g_y) - \alpha \tag{133} \]

and

\[ \vec{ID}_o(w/c, y; g_y) \ge 0 \text{ if and only if } y \in IP(w/c), \tag{134} \]

the translation and representation properties, respectively. The cost indirect revenue
maximization problem is

\[ \begin{aligned} IR(w/c, p) &= \max\{py : y \in IP(w/c)\} \\ &= \max_y \{py : \vec{ID}_o(w/c, y; g_y) \ge 0\}, \end{aligned} \tag{135} \]

where the last equality holds due to the representation property. The Lagrangian
formulation of the problem is

\[ IR(w/c, p) = \max_y \; py - \mu \vec{ID}_o(w/c, y; g_y), \tag{136} \]

where $\mu$ is the multiplier.

The first-order conditions associated with this problem are

\[ p - \mu \nabla_y \vec{ID}_o(w/c, y; g_y) = 0. \tag{137} \]

Now if the price of one output is known, say $p_1$, the prices of the other outputs
follow from

\[ p_m = p_1 \frac{\partial \vec{ID}_o(w/c, y; g_y)/\partial y_m}{\partial \vec{ID}_o(w/c, y; g_y)/\partial y_1}. \tag{138} \]

Given $w$, $c$, $y$ and $g_y$ together with $p_1$, the prices $p_m$, $m = 2, \ldots, M$, may be
calculated from the expression above. In addition, we can then find total revenue
as

\[ \begin{aligned} r = py &= \sum_{m=1}^{M} p_m y_m \\ &= p_1 y_1 + p_1 \sum_{m=2}^{M} \frac{\partial \vec{ID}_o(w/c, y; g_y)/\partial y_m}{\partial \vec{ID}_o(w/c, y; g_y)/\partial y_1}\, y_m. \end{aligned} \tag{139} \]

What if we know revenue $r$ but not the individual output prices? Following our
earlier models, we can derive a pricing model

\[ p = r\, \frac{\nabla_y \vec{ID}_o(w/c, y; g_y)}{\nabla_y \vec{ID}_o(w/c, y; g_y)\, y}, \tag{140} \]

where we note that $\nabla_y \vec{ID}_o(w/c, y; g_y)$ is a gradient vector.
We leave the derivation to the reader.
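One possible route to this rule is sketched below. It assumes the sign convention of the first-order conditions (137) and uses only the definition of total revenue; it is offered as a plausible reconstruction rather than as the authors’ own argument.

```latex
% Start from the first-order conditions (137):
p = \mu \nabla_y \vec{ID}_o(w/c, y; g_y).
% Multiply both sides by the observed output vector y and use r = py:
r = py = \mu \nabla_y \vec{ID}_o(w/c, y; g_y)\, y
\quad\Longrightarrow\quad
\mu = \frac{r}{\nabla_y \vec{ID}_o(w/c, y; g_y)\, y}.
% Substitute the multiplier back into (137) to recover (140):
p = r\, \frac{\nabla_y \vec{ID}_o(w/c, y; g_y)}{\nabla_y \vec{ID}_o(w/c, y; g_y)\, y}.
```

Unlike the radial rule (131), the denominator here is the inner product of the gradient with $y$ rather than the distance function itself: the directional function is not homogeneous in $y$, so Euler’s theorem cannot be used to replace that inner product.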

Revenue Indirect Pricing Models

A parallel set of pricing rules can be developed for our other indirect models –
namely, the revenue indirect cases. Here the decision-maker seeks to minimize costs
but must also meet a minimal revenue target. Thus the decision-maker must choose
input quantities to minimize costs given input prices as well as a revenue target r.
More formally the revenue indirect cost minimization problem is

I C(p/r, w) = min wx s.t. x ∈ I L(p/r), (141)

where the revenue indirect input set is

\[ \begin{aligned} IL(p/r) &= \{x : x \in L(y),\ py \ge r\} \\ &= \{x : x \in L(y),\ (p/r)y \ge 1\}. \end{aligned} \tag{142} \]

This set models the condition that feasible input vectors $x \in L(y)$ must generate
at least total revenue $r$, given output prices $p \in \mathbb{R}^M_+$,

\[ r = py = \sum_{m=1}^{M} p_m y_m. \tag{143} \]

The input distance function defined on I L(p/r) is called the revenue indirect
input distance function, and it is defined as

I Di (p/r, x) = sup{λ : x/λ ∈ I L(p/r)}. (144)

This function is homogeneous of degree +1 in inputs:

\[ ID_i(p/r, \theta x) = \theta\, ID_i(p/r, x), \quad \theta > 0. \tag{145} \]

It also satisfies the representation condition

\[ ID_i(p/r, x) \ge 1 \text{ if and only if } x \in IL(p/r), \tag{146} \]

\[ IL(p/r) = \{x : ID_i(p/r, x) \ge 1\}. \tag{147} \]

We can now define the cost function on $IL(p/r)$

\[ IC(p/r, w) = \min\{wx : x \in IL(p/r)\}, \tag{148} \]

which by the representation property can also be written as

\[ IC(p/r, w) = \min\{wx : ID_i(p/r, x) \ge 1\}. \tag{149} \]

The Lagrangian formulation of this cost function is

\[ IC(p/r, w) = \min \; wx - \mu(ID_i(p/r, x) - 1), \tag{150} \]

with first-order conditions

\[ w - \mu \nabla_x ID_i(p/r, x) = 0. \tag{151} \]

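Here again if we know one input price, say $w_1$, the other input prices may be
estimated from the above as

\[ \frac{w_n}{w_1} = \frac{\partial ID_i(p/r, x)/\partial x_n}{\partial ID_i(p/r, x)/\partial x_1} \tag{152} \]

or equivalently

\[ w_n = w_1 \frac{\partial ID_i(p/r, x)/\partial x_n}{\partial ID_i(p/r, x)/\partial x_1}, \quad n = 2, \ldots, N. \tag{153} \]

Using this result we can compute the total cost:

\[ c = wx = \sum_{n=1}^{N} w_n x_n = w_1 x_1 + w_1 \sum_{n=2}^{N} \frac{\partial ID_i(p/r, x)/\partial x_n}{\partial ID_i(p/r, x)/\partial x_1}\, x_n. \tag{154} \]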

If we assume that technology satisfies CRS, we can prove that the revenue
function

\[ R(x, p/r) = \max\{(p/r)y : y \in P(x)\}, \quad r > 0 \tag{155} \]

equals the revenue indirect input distance function

\[ R(x, p/r) = ID_i(p/r, x). \tag{156} \]

Thus under CRS, we may substitute the revenue function for $ID_i(p/r, x)$ in our
pricing models. For example, if we define profit maximization as the least costly
way to meet the minimum revenue target

\[ \min\; R(x, p) - wx, \tag{157} \]

we find that

\[ \nabla_x R(x, p) = w, \tag{158} \]

i.e., marginal revenue equals input price. Of course under CRS, we could also
consider the problem

\[ \min\; ID_i(p/r, x) - wx, \tag{159} \]

which gives us

\[ \nabla_x ID_i(p/r, x) = w. \tag{160} \]

If we observe total cost

\[ c = wx = \sum_{n=1}^{N} w_n x_n \]

but do not know individual input prices, we can develop the following pricing rule

\[ w = c\, \frac{\nabla_x ID_i(p/r, x)}{ID_i(p/r, x)}. \tag{161} \]

From the first-order conditions

\[ w - \mu \nabla_x ID_i(p/r, x) = 0 \tag{162} \]

and the condition that

\[ \mu = IC(p/r, w), \tag{163} \]

we have

\[ w = IC(p/r, w) \nabla_x ID_i(p/r, x). \tag{164} \]

Multiplying both sides by $x$ and applying Euler’s theorem yields

\[ c = wx = IC(p/r, w)\, ID_i(p/r, x), \tag{165} \]

which gives us the pricing rule for input prices

\[ w = c\, \frac{\nabla_x ID_i(p/r, x)}{ID_i(p/r, x)}. \tag{166} \]

Our next indirect pricing rule uses the directional indirect input distance function,
which is defined as

\[ \vec{ID}_i(p/r, x; g_x) = \sup\{\beta : (x - \beta g_x) \in L(y),\ py \ge r\}, \tag{167} \]

where $g_x \in \mathbb{R}^N_+$, $g_x \neq 0$ is the directional vector. As a directional distance function,
this satisfies the translation and representation properties:

\[ \vec{ID}_i(p/r, x - \alpha g_x; g_x) = \vec{ID}_i(p/r, x; g_x) - \alpha \tag{168} \]

and

\[ \vec{ID}_i(p/r, x; g_x) \ge 0 \text{ if and only if } x \in L(y),\ py \ge r, \tag{169} \]

respectively.
respectively.
The revenue indirect cost minimization problem is

I C(p/r, w) = min{wx : x ∈ I L(p/r)} (170)

where the revenue indirect input set is defined as

\[ IL(p/r) = \{x : x \in L(y),\ py \ge r\}. \tag{171} \]

Employing the representation property, we may reformulate the minimization
problem as

\[ IC(p/r, w) = \min_x \; wx - \mu \vec{ID}_i(p/r, x; g_x), \tag{172} \]

where $\mu$ is the Lagrangian multiplier.

The first-order conditions for this problem are

\[ w - \mu \nabla_x \vec{ID}_i(p/r, x; g_x) = 0. \tag{173} \]

If one input price, say $w_1$, is known, the remaining input prices may be calculated
as

\[ w_n = w_1 \frac{\partial \vec{ID}_i(p/r, x; g_x)/\partial x_n}{\partial \vec{ID}_i(p/r, x; g_x)/\partial x_1}. \tag{174} \]

This can be used to calculate total cost:

\[ \begin{aligned} c = wx &= \sum_{n=1}^{N} w_n x_n \\ &= w_1 x_1 + w_1 \sum_{n=2}^{N} \frac{\partial \vec{ID}_i(p/r, x; g_x)/\partial x_n}{\partial \vec{ID}_i(p/r, x; g_x)/\partial x_1}\, x_n. \end{aligned} \tag{175} \]

We now assume that we know total cost but that none of the individual input
prices are known. Our pricing rule in this case is

\[ w = c\, \frac{\nabla_x \vec{ID}_i(p/r, x; g_x)}{\nabla_x \vec{ID}_i(p/r, x; g_x)\, x}. \tag{176} \]

To derive this expression we have used the fact that the Lagrangian multiplier μ
equals the indirect cost function

μ = I C(p/r, w). (177)

Using this together with the first-order conditions yields the pricing rule above.
We leave the details to the reader.
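A short route to the total-cost rule is sketched here under the sign convention of the first-order conditions (173). It is an alternative that works directly with $c = wx$ and does not require interpreting the multiplier, so it should be read as one possible derivation rather than the one the authors have in mind.

```latex
% First-order conditions (173):
w = \mu \nabla_x \vec{ID}_i(p/r, x; g_x).
% Multiply both sides by the observed input vector x and use c = wx:
c = wx = \mu \nabla_x \vec{ID}_i(p/r, x; g_x)\, x
\quad\Longrightarrow\quad
\mu = \frac{c}{\nabla_x \vec{ID}_i(p/r, x; g_x)\, x}.
% Substitute the multiplier back into (173) to recover (176):
w = c\, \frac{\nabla_x \vec{ID}_i(p/r, x; g_x)}{\nabla_x \vec{ID}_i(p/r, x; g_x)\, x}.
```

By construction the resulting shadow prices satisfy $wx = c$, so they exhaust observed total cost.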

Sub-cost and Sub-revenue Indirect Models

In this section we generalize the pricing models from sections “One Input Price Is
Known” and “Total Cost Is Known” by defining sub-cost and sub-revenue indirect
distance functions. This accommodates short- and long-run models as well as
models with mixed quantity and price data. The indirect sub-cost distance function
is defined on a mix of input quantities and normalized input prices; the sub-revenue
indirect functions are treated similarly.
Beginning with the indirect sub-cost case, let inputs

\[ x = (x_s, x_{-s}) \in \mathbb{R}^N_+, \tag{178} \]

where $s$ is a subset of $\{1, \ldots, N\}$ and $-s$ is its complement.

A sub-budget cost indirect output set can then be defined as

\[ SIP(w_s/c_s, x_{-s}) = \{y : y \in P(x),\ w_s x_s \le c_s,\ x = (x_s, x_{-s})\}, \tag{179} \]

where $c_s$ is the sub-budget cost for the inputs in $s$.

The radial sub-budget indirect output distance function is

\[ SID_o(w_s/c_s, x_{-s}, y) = \inf\{\theta : y/\theta \in SIP(w_s/c_s, x_{-s})\}. \tag{180} \]
The corresponding directional distance function is

\[ \vec{SID}_o(w_s/c_s, x_{-s}, y; g_y) = \sup\{\beta : (y + \beta g_y) \in SIP(w_s/c_s, x_{-s})\}. \tag{181} \]

Both functions satisfy a representation property; thus we can define the following
revenue maximization problems

\[ \max_y \; py \quad \text{s.t.} \quad SID_o(w_s/c_s, x_{-s}, y) \le 1, \tag{182} \]

and

\[ \max_y \; py \quad \text{s.t.} \quad \vec{SID}_o(w_s/c_s, x_{-s}, y; g_y) \ge 0. \tag{183} \]

The first problem yields the pricing models

\[ p_m = p_1 \frac{\partial SID_o(w_s/c_s, x_{-s}, y)/\partial y_m}{\partial SID_o(w_s/c_s, x_{-s}, y)/\partial y_1}, \quad m = 2, \ldots, M, \tag{184} \]

when one price, $p_1$, is known, and

\[ p = r\, \frac{\nabla_y SID_o(w_s/c_s, x_{-s}, y)}{\nabla_y SID_o(w_s/c_s, x_{-s}, y)\, y} \tag{185} \]

when total revenue $r$ is known. Verification is left to the reader.

The second maximization yields the following pricing rules

\[ p_m = p_1 \frac{\partial \vec{SID}_o(w_s/c_s, x_{-s}, y; g_y)/\partial y_m}{\partial \vec{SID}_o(w_s/c_s, x_{-s}, y; g_y)/\partial y_1} \tag{186} \]

and

\[ p = r\, \frac{\nabla_y \vec{SID}_o(w_s/c_s, x_{-s}, y; g_y)}{\nabla_y \vec{SID}_o(w_s/c_s, x_{-s}, y; g_y)\, y} \tag{187} \]

for the cases in which $p_1$ and $r$ are known, respectively. Details are left to the reader.
Finally, we turn to the sub-budget version of the revenue indirect models, whose
reference input requirement set is

\[ SIL(p_s/r_s, y_{-s}) = \{x : x \in L(y),\ p_s y_s \ge r_s,\ y = (y_s, y_{-s})\}, \tag{188} \]

where $r_s$ is the sub-revenue and $y = (y_s, y_{-s})$ with $s \subset \{1, \ldots, M\}$ and $-s$ its
complement.

The radial and directional distance functions defined on this set are

\[ SID_i(p_s/r_s, y_{-s}, x) = \sup\{\theta : x/\theta \in SIL(p_s/r_s, y_{-s})\} \tag{189} \]

and

\[ \vec{SID}_i(p_s/r_s, y_{-s}, x; g_x) = \sup\{\beta : (x - \beta g_x) \in SIL(p_s/r_s, y_{-s})\}, \tag{190} \]

respectively. Both distance functions satisfy a representation property which allows
us to define

\[ \min_x \; wx \quad \text{s.t.} \quad SID_i(p_s/r_s, y_{-s}, x) \ge 1, \tag{191} \]

and

\[ \min_x \; wx \quad \text{s.t.} \quad \vec{SID}_i(p_s/r_s, y_{-s}, x; g_x) \ge 0. \tag{192} \]

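The four pricing rules associated with these two minimization problems are

\[ w_n = w_1 \frac{\partial SID_i(p_s/r_s, y_{-s}, x)/\partial x_n}{\partial SID_i(p_s/r_s, y_{-s}, x)/\partial x_1}, \quad n = 2, \ldots, N \tag{193} \]

\[ w = c\, \frac{\nabla_x SID_i(p_s/r_s, y_{-s}, x)}{SID_i(p_s/r_s, y_{-s}, x)} \tag{194} \]

\[ w_n = w_1 \frac{\partial \vec{SID}_i(p_s/r_s, y_{-s}, x; g_x)/\partial x_n}{\partial \vec{SID}_i(p_s/r_s, y_{-s}, x; g_x)/\partial x_1}, \quad n = 2, \ldots, N \tag{195} \]

\[ w = c\, \frac{\nabla_x \vec{SID}_i(p_s/r_s, y_{-s}, x; g_x)}{\nabla_x \vec{SID}_i(p_s/r_s, y_{-s}, x; g_x)\, x}, \tag{196} \]

respectively.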

Pricing Inputs and Outputs: A Profit Maximization Approach

In most of our earlier pricing sections, we have focused on input prices and output
prices separately. And in each case, we assume that we know or observe at least
one input (output) price or total costs (total revenues), respectively. In this section
we show how knowledge of one input price can be used to price outputs and how
one output price can be used to price inputs. We also consider using total costs to
shadow price outputs and total revenues to shadow price inputs.
The key idea is to exploit profit maximization as the optimization criterion, rather
than cost minimization or revenue maximization. When the goal is to maximize
profit, we are seeking optimal quantities of both inputs and outputs, providing the
“crossover” from inputs to outputs. In fact this allows us to simultaneously find
shadow prices for both inputs and outputs.
The traditional profit maximization problem takes the form

\[ \max_{x, y}\, (py - wx) \quad \text{s.t.} \quad (x, y) \in T, \tag{197} \]

where $T$ is the technology set, i.e.,

\[ T = \{(x, y) : x \text{ can produce } y\}. \tag{198} \]

As usual, $x \in \mathbb{R}^N_+$, $w \in \mathbb{R}^N_+$ are our input vectors and their prices, respectively.
Similarly, we have $y \in \mathbb{R}^M_+$, $p \in \mathbb{R}^M_+$ on the output side. Next we formulate
the Lagrangian representation of the profit maximization problem, but we directly
represent technology as the directional technology distance function, noting that this
function satisfies representation and translation:

\[ \vec{D}_T(x, y; g_x, g_y) = \sup\{\beta : (x - \beta g_x, y + \beta g_y) \in T\}. \tag{199} \]

In addition we will make use of the radial input and output distance functions.9
In order to justify the use of the Shephard-type radial distance functions, we appeal
to the following equivalencies:

(x, y) ∈ T ⇔ x ∈ L(y) ⇔ y ∈ P (x), (200)

i.e., the three sets $T$, $P(x)$, $L(y)$ model the same technology. Both the input and
output distance functions satisfy the representation property

\[ L(y) = \{x : D_i(y, x) \ge 1\} \]

and

\[ P(x) = \{y : D_o(x, y) \le 1\}, \]

respectively. It follows that technology $T$ may be expressed in terms of these
distance functions

\[ T = \{(x, y) : x \in L(y)\} = \{(x, y) : D_i(y, x) \ge 1\} \]

and

\[ T = \{(x, y) : y \in P(x)\} = \{(x, y) : D_o(x, y) \le 1\}. \]

9 We leave development of the corresponding theory for the directional input and output distance
functions to the reader.

The last two expressions verify our use of the radial input and output distance
functions as constraints in the profit maximization model. We begin by using the
output distance function as the technology constraint,

\[ \max_{x, y}\, (py - wx) - \mu(D_o(x, y) - 1), \tag{201} \]

where $\mu$ is the Lagrangian multiplier. The first-order conditions with respect to
outputs are

\[ p - \mu \nabla_y D_o(x, y) = 0 \tag{202} \]

and with respect to inputs

\[ w - \mu \nabla_x D_o(x, y) = 0. \tag{203} \]

Our first pricing rule is$^{10}$

\[ p_m = w_1 \frac{\partial D_o(x, y)/\partial y_m}{\partial D_o(x, y)/\partial x_1}, \quad m = 1, \ldots, M, \tag{204} \]

and the second rule is

\[ w_n = p_1 \frac{\partial D_o(x, y)/\partial x_n}{\partial D_o(x, y)/\partial y_1}, \quad n = 1, \ldots, N. \tag{205} \]

Thus we may price outputs $y_m$, $m = 1, \ldots, M$, using information on one input
price, say $w_1$, as well as information on input and output quantities $(x, y)$. Similarly,
we may also price inputs $x_n$, $n = 1, \ldots, N$, given one output price, say $p_1$, together
with data on $(x, y)$.
To show how we may use data on total revenue

\[ r = py = \sum_{m=1}^{M} p_m y_m, \]

together with input and output quantities to solve for shadow prices for inputs,
consider the first-order conditions for outputs from our profit maximization problem

\[ p = \mu \nabla_y D_o(x, y) \tag{206} \]

and multiply both sides by $y$ and apply Euler’s theorem to the right-hand side
(recalling that $D_o(x, y)$ is homogeneous of degree +1 in $y$); then

10 We may also derive rules for the “same” price.

\[ r = py = \mu D_o(x, y); \tag{207} \]

thus

\[ \mu = r/D_o(x, y). \tag{208} \]

Inserting this into the first-order conditions for inputs, we have

\[ w = r\, \frac{\nabla_x D_o(x, y)}{D_o(x, y)}, \tag{209} \]

which shows that total revenue $r$ together with quantities $(x, y)$ can be used to
solve for shadow prices of inputs. This may be compared to the pricing rule from
Section 3.2:

\[ w = c\, \frac{\nabla_x D_i(y, x)}{D_i(y, x)}. \tag{210} \]

The next two pricing rules for outputs make use of the Shephard input distance
function $D_i(y, x)$. Again maximizing profit, now using $D_i(y, x)$ as the technology
constraint, we have

\[ \max_{x, y}\; py - wx - \mu(D_i(y, x) - 1) \tag{211} \]

with the associated first-order conditions with respect to $y$

\[ p - \mu \nabla_y D_i(y, x) = 0 \tag{212} \]

and for the conditions with respect to $x$

\[ w - \mu \nabla_x D_i(y, x) = 0, \tag{213} \]

where $\mu$ is the Lagrangian multiplier. From these conditions we can develop the
output pricing rule

\[ p_m = w_1 \frac{\partial D_i(y, x)/\partial y_m}{\partial D_i(y, x)/\partial x_1}, \quad m = 1, \ldots, M, \tag{214} \]

and for the inputs

\[ w_n = p_1 \frac{\partial D_i(y, x)/\partial x_n}{\partial D_i(y, x)/\partial y_1}, \quad n = 1, \ldots, N. \tag{215} \]

Thus observing one input or output price, together with quantities $(x, y)$, we can
price outputs $p_m$, $m = 1, \ldots, M$, or inputs $w_n$, $n = 1, \ldots, N$.
We may also use total cost

\[ c = wx = \sum_{n=1}^{N} w_n x_n \]

together with $(x, y)$ to price outputs. To verify this claim, multiply the first-order
conditions for inputs by $x$ and use Euler’s theorem to obtain

\[ c = wx = \mu D_i(y, x), \tag{216} \]

yielding

\[ \mu = c/D_i(y, x). \tag{217} \]

Inserting this expression into the first-order conditions for outputs yields

\[ p = c\, \frac{\nabla_y D_i(y, x)}{D_i(y, x)}, \tag{218} \]

a pricing rule for outputs based on total costs $c$ and input distance function $D_i(y, x)$
and its gradient with respect to outputs $y$.
Contrasting this rule to that for inputs

\[ w = c\, \frac{\nabla_x D_i(y, x)}{D_i(y, x)}, \tag{219} \]

we see that the pricing rule for inputs or outputs depends on which gradient vector
of Di (y, x) we choose. Recall, however, that this latter rule was derived under cost
minimization while the former was derived under profit maximization.
Next we return to the directional technology distance function to derive what
we call “crossover” shadow pricing rules. We begin with the profit maximization
problem with directional technology distance function constraint

\[ \max_{x, y}\, (py - wx) \quad \text{s.t.} \quad \vec{D}_T(x, y; g_x, g_y) \ge 0, \tag{220} \]

or as a Lagrangian problem

\[ \max_{x, y}\, (py - wx) - \mu \vec{D}_T(x, y; g_x, g_y), \tag{221} \]

where $\mu$ is the Lagrangian multiplier.

The first-order conditions with respect to outputs are

\[ p - \mu \nabla_y \vec{D}_T(x, y; g_x, g_y) = 0 \tag{222} \]

and with respect to inputs

\[ -w - \mu \nabla_x \vec{D}_T(x, y; g_x, g_y) = 0. \tag{223} \]

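From these expressions we can derive two crossover pricing rules:$^{11}$

\[ p_m = w_1 \frac{\partial \vec{D}_T(x, y; g_x, g_y)/\partial y_m}{\partial \vec{D}_T(x, y; g_x, g_y)/\partial x_1}, \quad m = 1, \ldots, M, \tag{224} \]

and for inputs

\[ w_n = p_1 \frac{\partial \vec{D}_T(x, y; g_x, g_y)/\partial x_n}{\partial \vec{D}_T(x, y; g_x, g_y)/\partial y_1}, \quad n = 1, \ldots, N. \tag{225} \]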

Next we assume that, in addition to input and output quantities $(x, y)$, total
profit is known:

\[ \Pi = py - wx = \sum_{m=1}^{M} p_m y_m - \sum_{n=1}^{N} w_n x_n. \]

Given this information, two additional pricing rules can be developed: one for
outputs

\[ p = \Pi\, \frac{\nabla_y \vec{D}_T(x, y; g_x, g_y)}{\nabla_y \vec{D}_T(x, y; g_x, g_y)\, y + \nabla_x \vec{D}_T(x, y; g_x, g_y)\, x} \tag{226} \]

and for inputs

\[ w = -\Pi\, \frac{\nabla_x \vec{D}_T(x, y; g_x, g_y)}{\nabla_y \vec{D}_T(x, y; g_x, g_y)\, y + \nabla_x \vec{D}_T(x, y; g_x, g_y)\, x}. \tag{227} \]

We go through the derivation of the expression for output shadow prices and
leave the input pricing derivation to the reader.
Using the translation property of $\vec{D}_T(x, y; g_x, g_y)$, it follows that the Lagrangian
multiplier $\mu$ takes the form

\[ \mu = wg_x + pg_y. \tag{228} \]

11 We may also price say pm , m = 2, . . . , M for outputs if we know p1 . Similar results hold for
input pricing.
Inserting this into the first-order conditions with respect to outputs and multiplying
by the output vector $y$ gives us

\[ py = (wg_x + pg_y) \nabla_y \vec{D}_T(x, y; g_x, g_y)\, y \tag{229} \]

with the corresponding expression for the input case, which is

\[ wx = -(wg_x + pg_y) \nabla_x \vec{D}_T(x, y; g_x, g_y)\, x. \tag{230} \]

If we combine these two expressions, so that the left-hand side is $\Pi = py - wx$, and insert

\[ \mu = (wg_x + pg_y) = \frac{\Pi}{\nabla_y \vec{D}_T(x, y; g_x, g_y)\, y + \nabla_x \vec{D}_T(x, y; g_x, g_y)\, x} \tag{231} \]

into the first-order conditions for outputs, it yields the desired output shadow pricing
rule.

Appendix A: Catalog of Shadow Pricing Rules

This appendix summarizes the pricing rules we developed in sections “Introduction:

What Is a Shadow Price?,” “Primal Representation of Technology: Distance Func-
tions,” and “Calculus and Dual Spaces.” They are organized by the type of distance
function used to represent technology – whether radial or directional – as well as by
what type of variable is being priced. We also explicitly state what data are required
to estimate the shadow price.

Input Pricing Rules

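(x, y) is known and one input price is known:

\[ w_n = w_1 \frac{\partial D_i(y, x)/\partial x_n}{\partial D_i(y, x)/\partial x_1}, \quad n = 2, \ldots, N \tag{232} \]

\[ w_n = w_1 \frac{\partial \vec{D}_i(x, y; g_x)/\partial x_n}{\partial \vec{D}_i(x, y; g_x)/\partial x_1}, \quad n = 2, \ldots, N \tag{233} \]

(x, y) and total cost c are known:

\[ w = c\, \frac{\nabla_x D_i(y, x)}{D_i(y, x)} \tag{234} \]

\[ w = c\, \frac{\nabla_x \vec{D}_i(x, y; g_x)}{\nabla_x \vec{D}_i(x, y; g_x)\, x}. \tag{235} \]

(x, y) and one output price are known:

\[ w_n = p_1 \frac{\partial D_i(y, x)/\partial x_n}{\partial D_i(y, x)/\partial y_1}, \quad n = 1, \ldots, N \tag{236} \]

\[ w_n = p_1 \frac{\partial D_o(x, y)/\partial x_n}{\partial D_o(x, y)/\partial y_1}, \quad n = 1, \ldots, N. \tag{237} \]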

Output Pricing Rules

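(x, y) and one output price are known:

\[ p_m = p_1 \frac{\partial D_o(x, y)/\partial y_m}{\partial D_o(x, y)/\partial y_1}, \quad m = 2, \ldots, M, \tag{238} \]

\[ p_m = p_1 \frac{\partial \vec{D}_o(x, y; g_y)/\partial y_m}{\partial \vec{D}_o(x, y; g_y)/\partial y_1}, \quad m = 2, \ldots, M. \tag{239} \]

(x, y) and total revenue r are known:

\[ p = r\, \frac{\nabla_y D_o(x, y)}{D_o(x, y)} \tag{240} \]

\[ p = r\, \frac{\nabla_y \vec{D}_o(x, y; g_y)}{\nabla_y \vec{D}_o(x, y; g_y)\, y} \tag{241} \]

(x, y) and total cost c are known:

\[ p = c\, \frac{\nabla_y D_i(y, x)}{D_i(y, x)} \tag{242} \]

(x, y) and one input price are known:

\[ p_m = w_1 \frac{\partial D_i(y, x)/\partial y_m}{\partial D_i(y, x)/\partial x_1}, \quad m = 1, \ldots, M, \tag{243} \]

\[ p_m = w_1 \frac{\partial D_o(x, y)/\partial y_m}{\partial D_o(x, y)/\partial x_1}, \quad m = 1, \ldots, M. \tag{244} \]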

Pricing Inputs, Indirect Approaches

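(p/r, x) and one input price are known:

\[ w_n = w_1 \frac{\partial ID_i(p/r, x)/\partial x_n}{\partial ID_i(p/r, x)/\partial x_1}, \quad n = 2, \ldots, N, \tag{245} \]

\[ w_n = w_1 \frac{\partial \vec{ID}_i(p/r, x)/\partial x_n}{\partial \vec{ID}_i(p/r, x)/\partial x_1}, \quad n = 2, \ldots, N. \tag{246} \]

(p/r, x) and total cost c are known:

\[ w = c\, \frac{\nabla_x ID_i(p/r, x)}{ID_i(p/r, x)}, \tag{247} \]

\[ w = c\, \frac{\nabla_x \vec{ID}_i(p/r, x)}{\nabla_x \vec{ID}_i(p/r, x)\, x}. \tag{248} \]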

Pricing Outputs, Indirect Approaches

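(w/c, y) and one output price are known:

\[ p_m = p_1 \frac{\partial ID_o(w/c, y)/\partial y_m}{\partial ID_o(w/c, y)/\partial y_1}, \quad m = 2, \ldots, M, \tag{249} \]

\[ p_m = p_1 \frac{\partial \vec{ID}_o(w/c, y)/\partial y_m}{\partial \vec{ID}_o(w/c, y)/\partial y_1}, \quad m = 2, \ldots, M. \tag{250} \]

(w/c, y) and total revenue are known:

\[ p = r\, \frac{\nabla_y ID_o(w/c, y)}{ID_o(w/c, y)}, \tag{251} \]

\[ p = r\, \frac{\nabla_y \vec{ID}_o(w/c, y; g_y)}{\nabla_y \vec{ID}_o(w/c, y; g_y)\, y}. \tag{252} \]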

Price and Quantity Mixed, Indirect Approaches

Pricing inputs when $(p_s/r_s, y_{-s}, x)$ and one input price are known:

\[ w_n = w_1 \frac{\partial SID_i(p_s/r_s, y_{-s}, x)/\partial x_n}{\partial SID_i(p_s/r_s, y_{-s}, x)/\partial x_1}, \quad n = 2, \ldots, N, \tag{253} \]

\[ w_n = w_1 \frac{\partial \vec{SID}_i(p_s/r_s, y_{-s}, x; g_x)/\partial x_n}{\partial \vec{SID}_i(p_s/r_s, y_{-s}, x; g_x)/\partial x_1}, \quad n = 2, \ldots, N. \tag{254} \]

Pricing inputs when $(p_s/r_s, y_{-s}, x)$ and total cost c are known:

\[ w = c\, \frac{\nabla_x SID_i(p_s/r_s, y_{-s}, x)}{SID_i(p_s/r_s, y_{-s}, x)}, \tag{255} \]

\[ w = c\, \frac{\nabla_x \vec{SID}_i(p_s/r_s, y_{-s}, x; g_x)}{\nabla_x \vec{SID}_i(p_s/r_s, y_{-s}, x; g_x)\, x}. \tag{256} \]

Pricing outputs when $(w_s, c_s, x_{-s}, y)$ and one output price are known:

\[ p_m = p_1 \frac{\partial SID_o(w_s/c_s, x_{-s}, y)/\partial y_m}{\partial SID_o(w_s/c_s, x_{-s}, y)/\partial y_1}, \quad m = 2, \ldots, M \tag{257} \]

\[ p_m = p_1 \frac{\partial \vec{SID}_o(w_s/c_s, x_{-s}, y; g_y)/\partial y_m}{\partial \vec{SID}_o(w_s/c_s, x_{-s}, y; g_y)/\partial y_1}, \quad m = 2, \ldots, M \tag{258} \]

$(w_s, c_s, x_{-s}, y)$ and total revenue r are known:

\[ p = r\, \frac{\nabla_y SID_o(w_s/c_s, x_{-s}, y)}{SID_o(w_s/c_s, x_{-s}, y)} \tag{259} \]

\[ p = r\, \frac{\nabla_y \vec{SID}_o(w_s/c_s, x_{-s}, y; g_y)}{\nabla_y \vec{SID}_o(w_s/c_s, x_{-s}, y; g_y)\, y}. \tag{260} \]

Pricing Under CRS

\[ p = \nabla_y C(y, w) \tag{261} \]

\[ w = \nabla_x R(x, p). \tag{262} \]

Pricing Inputs and Outputs When Total Profit and (x, y) Are Known

\[ p = \Pi\, \frac{\nabla_y \vec{D}_T(x, y; g_x, g_y)}{\nabla_y \vec{D}_T(x, y; g_x, g_y)\, y + \nabla_x \vec{D}_T(x, y; g_x, g_y)\, x} \tag{263} \]

\[ w = -\Pi\, \frac{\nabla_x \vec{D}_T(x, y; g_x, g_y)}{\nabla_y \vec{D}_T(x, y; g_x, g_y)\, y + \nabla_x \vec{D}_T(x, y; g_x, g_y)\, x}. \tag{264} \]

Appendix B: Functional Forms

In this chapter we make use of two types of distance functions as the representa-
tion of the production technology: radial (Shephard-type distance functions) and
directional distance functions. Our pricing rules are derived in terms of derivatives
of these functions, which in general would be estimated parametrically, typically
requiring specification of a functional form. The purpose of this appendix is to
provide guidance in those choices based on the properties of the functions as well
as flexibility of the specification.
By their definitions these two types of distance functions satisfy distinct properties,
which as we shall see imply different functional forms. The Shephard distance
functions are based on a radial scaling (or multiplication) of a vector, which
implies that the distance function is homogeneous in that scaled vector. In contrast
the directional distance function, rather than homogeneity of the outputs or inputs,
satisfies the translation property. To formalize these properties, we begin with some
notation. Let

\[ F : \mathbb{R}^2 \to F(q_1, q_2) \in \mathbb{R} \tag{265} \]

be a function, which is homogeneous of degree +1 in $q$ if

\[ F(\lambda q) = \lambda F(q), \quad \lambda > 0,\ q = (q_1, q_2). \tag{266} \]

This function satisfies the translation property if

\[ F(q + \alpha g) = F(q) + \alpha, \quad g \in \mathbb{R}^2,\ \alpha \in \mathbb{R}. \tag{267} \]

In addition to these properties, we seek a functional form that is “flexible,” i.e., it
allows for interaction terms and second-order terms. Thus we want our functions to
belong to the family of generalized quadratic functions.1 More formally, let

\[
h : \mathbb{R} \to \mathbb{R} \quad \text{and} \quad \rho^{-1} : \mathbb{R} \to \mathbb{R},
\]

then we say that F is generalized quadratic if there exist real constants \(a_i\), \(a_{ij}\)
such that

\[
F(q_1, q_2) = \rho^{-1}\Big(a_o + \sum_{i=1}^{2} a_i h(q_i) + \sum_{i=1}^{2}\sum_{j=1}^{2} a_{ij} h(q_i) h(q_j)\Big). \tag{268}
\]

When \(a_o = 0\) and every \(a_{ij} = 0\), this reduces to a quasi-linear function [25].

When F is generalized quadratic as well as homogeneous, then [24] showed that
F can take two forms: the translog

\[
F(q_1, q_2) = a_o + \sum_{i=1}^{2} a_i \ln(q_i) + \sum_{i=1}^{2}\sum_{j=1}^{2} a_{ij} \ln(q_i) \ln(q_j) \tag{269}
\]

and, alternatively, the mean of order ρ form

\[
F(q_1, q_2) = \big(a_{11} q_1 + a_{22} q_2 + a_{12} q_1^{\rho/2} q_2^{\rho/2}\big)^{1/2}, \tag{270}
\]

where \(a_i\) and \(a_{ij}\) are appropriately restricted to meet homogeneity. Thus the radial
distance function can be parameterized as a translog or mean of order ρ function.
Since the mean of order ρ function has only second-order parameters \(a_{ij}\)
and no first-order parameters \(a_i\), we prefer the translog form. This form is also
differentiable, which allows us to use calculus to derive and estimate shadow prices.
When F is generalized quadratic and satisfies the translation property (rather
than homogeneity), [26] show that the functional form must be either a quadratic
function

\[
F(q_1, q_2) = a_o + \sum_{i=1}^{2} a_i q_i + \sum_{i=1}^{2}\sum_{j=1}^{2} a_{ij} q_i q_j, \tag{271}
\]

or an unnamed function

\[
F(q_1, q_2) = \frac{1}{2\alpha}\Big(\ln \sum_{i=1}^{2}\sum_{j=1}^{2} a_{ij} \exp(\lambda q_i) \exp(\lambda q_j)\Big) \tag{272}
\]

1 This terminology is due to [22]. This form has been dubbed transformed quadratic by [23] and
Taylor series approximation by [24].

with the appropriate restrictions on the parameters ai , aij to satisfy translation.

Again, we favor the form that has both first- and second-order parameters (the
quadratic functional form) over the unnamed function which has only second-order
parameters. Thus we recommend the quadratic functional form – which is also
differentiable – for estimation of the directional distance function.
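
As a hedged numerical illustration (not part of the original derivation), the translation property (267) can be checked for the quadratic form (271). The sketch below assumes a symmetric coefficient matrix and the restrictions sum_i a_i g_i = 1 and sum_j a_ij g_j = 0 for every i, which are sufficient for translation in the direction g. The parameter values and function names are hypothetical and chosen only so that the restrictions hold.

```python
import math

def quadratic(q, a0, a, A):
    """Quadratic form as in (271): a0 + sum_i a_i q_i + sum_i sum_j a_ij q_i q_j."""
    n = len(q)
    linear = sum(a[i] * q[i] for i in range(n))
    second = sum(A[i][j] * q[i] * q[j] for i in range(n) for j in range(n))
    return a0 + linear + second

def meets_translation_restrictions(a, A, g, tol=1e-12):
    """Check symmetry of A, sum_i a_i g_i = 1, and sum_j a_ij g_j = 0 for each i."""
    n = len(g)
    symmetric = all(abs(A[i][j] - A[j][i]) <= tol for i in range(n) for j in range(n))
    first_order = abs(sum(a[i] * g[i] for i in range(n)) - 1.0) <= tol
    second_order = all(abs(sum(A[i][j] * g[j] for j in range(n))) <= tol for i in range(n))
    return symmetric and first_order and second_order

# Hypothetical parameter values (assumptions, not estimates from the chapter).
a0 = 1.0
a = [0.25, 0.75]
A = [[0.5, -0.5], [-0.5, 0.5]]
g = [1.0, 1.0]

q = [2.0, 3.0]
alpha = 1.5
shifted = [q[i] + alpha * g[i] for i in range(len(q))]

lhs = quadratic(shifted, a0, a, A)      # F(q + alpha * g)
rhs = quadratic(q, a0, a, A) + alpha    # F(q) + alpha
print(meets_translation_restrictions(a, A, g), math.isclose(lhs, rhs))
```

In an estimation setting, restrictions of this kind would typically be imposed on the parameters rather than checked after the fact; the check here only illustrates why they deliver translation.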
We have not considered a number of problems associated with estimating these
distance functions and their value counterparts. For examples of econometric
applications that address some of these issues, including endogeneity and choice of
direction vectors, see [27] and [28].

More Formal Exposition of Calculus and Primal and Dual Spaces

We gratefully acknowledge this appendix which was suggested and sketched out for
us by Robert G. Chambers.
Suppose we have an arbitrary vector space C; we denote its dual space C∗ as
the space of linear functionals defined on C (for a reference, see [16]). For \(\mathbb{R}^N\) it is
well known that

\[
(\mathbb{R}^N)^* = \mathbb{R}^N.
\]

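Note that the usual notion of a gradient (using the Gâteaux definition) for a
smooth function \(f : \mathbb{R}^N \to \mathbb{R}\) is as the singleton set

\[
\nabla f(x) = \Big\{ m \in \mathbb{R}^N : mv = \lim_{\lambda \to 0} \frac{f(x + \lambda v) - f(x)}{\lambda} \ \text{for all } v \in \mathbb{R}^N \Big\},
\]

or alternatively the gradient is defined as

\[
\nabla f(x) \in \mathbb{R}^N = (\mathbb{R}^N)^*
\]

such that

\[
\nabla f(x) v = \lim_{\lambda \to 0} \frac{f(x + \lambda v) - f(x)}{\lambda}
\]

for all \(v \in \mathbb{R}^N\), which in turn means that the directional derivative is linear.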
Hence, gradients are elements of the dual space. This is the generalization of the
definition used in the earlier section on calculus and dual spaces that were employed
for real spaces.
As earlier, we assume properties (disposability, convexity, and closedness of
technology set T ) to ensure existence, differentiability, and optimality of the
distance functions which then are function representations used to identify shadow
prices.2 As before (focusing on the input distance functions for simplicity), these
distance functions are of two types: radial (Shephard-type) distance functions

\[
D_i(y, x) = \sup\{\lambda > 0 : x/\lambda \in L(y)\}
\]

and the directional distance function

\[
\vec{D}_i(x, y; g_x) = \sup\{\beta \in \mathbb{R} : x - \beta g_x \in L(y)\}.
\]

The properties of these functions that are relevant for shadow prices include positive
homogeneity of \(D_i(y, x)\) in x, which ensures that \(\nabla_x D_i(y, x)\) is homogeneous
of degree zero in x (Euler’s theorem), and translation for \(\vec{D}_i(x, y; g_x)\), which implies
that for \(\alpha \in \mathbb{R}\)

\[
\nabla_x \vec{D}_i(x + \alpha g_x, y; g_x) = \nabla_x \vec{D}_i(x, y; g_x).
\]

These properties have implications for the elements of the dual space \((\mathbb{R}^N)^*\)
associated with these gradient vectors, yielding

\[
\frac{\nabla_x D_i(y, x)}{D_i(y, x)}\, x = 1
\]

and

\[
\nabla_x \vec{D}_i(x, y; g_x)\, g_x = 1.
\]

This means that the hyperplanes generated by the respective gradients of \(D_i(y, x)\)
and \(\vec{D}_i(x, y; g_x)\) value x and \(g_x\) at 1, respectively, i.e., x and \(g_x\) are the respective
numeraires and \(\nabla_x D_i(y, x)/D_i(y, x)\) and \(\nabla_x \vec{D}_i(x, y; g_x)\) define “dual (valuation)” hyperplanes
passing through the numeraires.
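
The two normalizations above can be checked numerically. The following is a minimal sketch under strong assumptions: it uses a hypothetical two-input technology L(y) = {x : sqrt(x1 x2) >= y}, the direction g_x = (1, 1), closed-form distance functions for that technology only, and central finite differences in place of analytical gradients. None of these choices come from the chapter.

```python
import math

def radial_distance(x, y):
    """D_i(y, x) = sqrt(x1 * x2) / y for the illustrative technology."""
    return math.sqrt(x[0] * x[1]) / y

def directional_distance(x, y):
    """Directional input distance for g_x = (1, 1): the largest beta with
    (x1 - beta) * (x2 - beta) >= y**2 and x - beta * g_x nonnegative."""
    spread = math.sqrt((x[0] - x[1]) ** 2 + 4.0 * y ** 2)
    return 0.5 * (x[0] + x[1] - spread)

def numerical_gradient(f, x, h=1e-6):
    """Central finite-difference approximation to the gradient of f at x."""
    grad = []
    for k in range(len(x)):
        up, down = list(x), list(x)
        up[k] += h
        down[k] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad

# Hypothetical data point and direction vector.
x, y, g = [4.0, 9.0], 3.0, [1.0, 1.0]

grad_radial = numerical_gradient(lambda v: radial_distance(v, y), x)
grad_directional = numerical_gradient(lambda v: directional_distance(v, y), x)

# (grad_x D_i / D_i) x and grad_x D_i-vector g_x; both should be close to 1.
radial_value = sum(gk * xk for gk, xk in zip(grad_radial, x)) / radial_distance(x, y)
directional_value = sum(gk * dk for gk, dk in zip(grad_directional, g))
print(round(radial_value, 6), round(directional_value, 6))
```

The first value reflects homogeneity of degree one in x, and the second reflects the translation property in the direction g_x.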
In economics we are generally interested in the linear functions of, e.g., x
associated with prices. For example, one may ask what is
the cheapest bundle of inputs consistent with producing a given output. So define
\(C : \mathbb{R}^M \times (\mathbb{R}^N)^* \to \mathbb{R}\) as

\[
C(y, w) = \min\{wx : x \in L(y)\}.
\]

For our two types of input distance functions, the associated Lagrangeans are

wx − μDi (y, x)

2 Following [10], it is important to distinguish the linear programming notion of a shadow price
from the economic notion of a shadow price. The economic notion presumes efficiency, whereas
the linear programming notion does not.
\[
wx - \varphi \vec{D}_i(x, y; g_x).
\]

Taking account of the homogeneity and translation properties of the distance
functions, the FOC simplify to

\[
wx = \mu
\]

since \(\nabla_x D_i(y, x)\, x = D_i(y, x) = 1\) (by the constraint) and \(w g_x = \varphi\), respectively.

Returning to a restatement of Daniel Primont’s definition (in real form) put into
the cost function context, we have the following: for a given vector of quantities
x, \(w = (w_1, w_2, \ldots, w_N) \in (\mathbb{R}^N)^*\) is a real shadow price vector for it if x is optimal
for w. According to that definition,

\[
\Big(\frac{\partial D_i(y, x)}{\partial x_1}\Big)^{-1} \nabla_x D_i(y, x)
\]

and

\[
\frac{\partial D_i(y, x)}{\partial x_1}\, \nabla_x \vec{D}_i(x, y; g_x)
\]

are real shadow price vectors for x; all suitably normalized gradients may be
interpreted as real shadow price vectors.
Returning to the cost function and the relationship of our shadow prices to
Shephard’s lemma, and maintaining the assumptions and properties from above,
note that by the duality between cost and distance functions, we may write

\[
C(y, w) = \min_x \Big\{ \frac{wx}{D_i(y, x)} \Big\}
\]

and

\[
C\Big(y, \frac{w}{w g_x}\Big) = \min_x \Big\{ \frac{w}{w g_x}\, x - \vec{D}_i(x, y; g_x) \Big\}.
\]

Given the curvature conditions we have imposed, we obtain

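\[
\frac{w}{wx} = \frac{\nabla_x D_i(y, x)}{D_i(y, x)} \iff \nabla_w C(y, w) = \frac{x}{D_i(y, x)}
\]

and

\[
\frac{w}{w g_x} = \nabla_x \vec{D}_i(x, y; g_x) \iff \nabla_{w/wx}\, C\Big(y, \frac{w}{w g_x}\Big) = x.
\]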

This provides the key link between shadow pricing results (on the left) and their
dual manifestations (on the right). In a smooth world, the relationship above is what
Shephard proved, i.e., Shephard’s lemma.
Using these definitions and results and extending them to revenue and profit
provides the structure for deriving the remaining “pricing rules” in this chapter.

References
1. Färe R, Grosskopf S, Margaritis D (2019) Pricing non-marketed goods using distance
functions. In: Collaboration with Robin Sickles, Chenjun Shang, Maryam Hasannasab and
William L. Weber. NOW Publisher, Inc, Hanover
2. Warr PG (1982) Pricing rules for non-traded commodities. Oxf Econ Pap 34:305–325
3. Diewert EW (2003) Hedonic regressions: a consumer theory approach. In: Feenstra RC,
Shapiro MD (eds) Scanner data and price indexes. University of Chicago Press, Chicago, pp
317–348
4. Toda Y (1976) Estimation of a cost function when the cost is not minimum: the case of Soviet
manufacturing industries. Rev Econ Stat 58:259–268
5. Lau LJ, Yotopoulos PA (1979) Resource use in agricultural applications of the profit function
to selected countries: the methodological framework. Stanford Food Res Inst Stud 17:11–22
6. Atkinson S, Halvorsen R (1980) A test of the relative and absolute price efficiency in regulated
utilities. Rev Econ Stat 62:81–88
7. Kumbhakar SC, Bhattacharyya A (1992) Price distortions and resource-use inefficiency in
Indian agriculture: a restricted profit function approach. Rev Econ Stat 74:231–239
8. Wang J, Wailes EJ, Cramer GL (1996) A shadow price frontier measurement of profit efficiency
in Chinese agriculture. Am J Agric Econ 78:146–156
9. Starrett DA (2000) Shadow pricing in economics. Ecosystems 3(1):16–20
10. Chambers RG, Färe R (2008) A calculus for data envelopment analysis. J Prod Anal 30:
169–175
11. Färe R, Grosskopf S (2004) New directions: efficiency and productivity. Kluwer Academic
Publishers, Boston
12. Färe R, Primont D (1995) Multi-output production and duality: theory and applications. Kluwer
Academic, Norwell
13. Färe R, Primont D (2006) Directional duality theory. Econ Theory 29:239–247
14. Fleming W (1977) Functions of several variables. Springer, Berlin
15. Aczél J (1987) A short course on functional equations. D. Reidel Publishing Co., Dordrecht
16. Luenberger DG (1969) Optimization by vector space methods. Wiley, New York
17. Shephard RW (1970) Theory of cost and production functions. Princeton University Press,
Princeton
18. Shephard RW (1953) Cost and production functions. Princeton University Press, Princeton
19. Thorsnes P (1997) Consistent estimates of the elasticity of substitution between land and non-
land inputs in the production of housing. J Urban Econ 42:98–108
20. McMillen DP (2003) The return of centralization to Chicago: using repeat sales to identify
changes in house price distance gradients. Reg Sci Urban Econ 33:287–304
21. Shephard RW (1974) Indirect production functions. Mathematical Systems in Economics, No.
10, Verlag Anton Hain, Meisenheim Am Glan
22. Chambers RG (1988) Applied production analysis: a dual approach. Cambridge University
Press, Cambridge
23. Diewert EW (2002) The quadratic approximation lemma and decomposition of superlative
indexes. J Econ Soc Res 28:63–88
24. Färe R, Sung KJ (1986) On second order Taylor’s series approximations and linear homogene-
ity. Aequationes Mathematicae 30:180–186
25. Aczél J (1966) Lectures on functional equations and their applications. Academic, New York
26. Färe R, Lundberg A (2006) Parameterizing the shortage function, mimeo. Department of
Economics, Oregon State University
27. Atkinson S, Tsionas MG (2018) Shadow directional distance function with bads: GMM
estimation of optimal directions and efficiencies. Empir Econ 54:207–230
28. Atkinson S, Primont D, Tsionas MG (2018) Statistical inference and efficiency of production
with bad inputs and outputs using latent price and optimal directions. J Econometr 204(2):
131–146
24 Capacity and Capacity Utilization in Production Economics
Dale Squires and Kathleen Segerson

Contents
Introduction
Conceptual Foundations
  Technological Approach to Defining Capacity and CU
  An Economic Optimization Approach to Defining Capacity and CU
  Additional Considerations
  Other Capacity and Utilization-Related Concepts
Measurement of Capacity and CU
  Macroeconomic Approaches
  Microeconomic Frontier-Based Approaches
  Microeconomic Optimization-Based Approaches
Concluding Remarks
References

Keywords

Capacity · Capacity utilization · Dual measures · Primal measures · Quasi-fixed inputs

D. Squires
NMFS, Southwest Fisheries Science Center, La Jolla, CA, USA
Department of Economics, University of California San Diego, La Jolla, CA, USA
K. Segerson
Department of Economics, University of Connecticut, Storrs, CT, USA

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_7

Introduction

The concepts of capacity and capacity utilization (CU) help explain many economic
phenomena, including investment behavior, productivity measurement, inventory
behavior, entry/exit into an industry, market power, pricing, and profitability
[35, 107, 124, 132, 136, 166]. These concepts are sometimes employed to indicate
the strength of aggregate demand, especially throughout the business cycle and con-
sequent inflationary pressures, and the exploitation pressures placed upon renewable
resource stocks. Understanding the role of measured capacity in economic fluctuations
also helps in understanding theories of the business cycle. Central banks, such as
the Federal Reserve Board in the United States, other government economists,
academic macroeconomists, and industrial organization economists developed the
early notions of capacity to address these issues.1
Similarly, capacity utilization has been studied within the contexts of operations
research and business strategy [27, 103, 124]. Capacity expansion models from the
operations research literature have been used to explain both the amount and the
timing of firm investment or disinvestment to alter capacity and to relate industry
capacity expansion to nonstrategic and measurable factors, such as the growth rate
and variability of demand, the number of firms and plants in the industry, and the
degree of capital intensity and investment economies of scale.2 This approach also
considers the role of the firm’s business strategy. For example, excess capacity may
be used as a strategy to protect firms’ market share [5, 46, 54] or to allow the firm
to seize opportunities that may develop [85]. Capacity strategies can also allow
a firm to preempt competitors by taking advantage of market changes before its
competitors respond [58].
This chapter provides an overview of the literature on capacity utilization,
focusing primarily on the work within production economics but also noting the
literature from related fields, especially macroeconomics. We focus on the two
broad approaches that have been taken to defining capacity and CU, an engineering
or technological approach based on production possibilities and an economic

1 Notable among the initial economists who contributed are Cassels [28], Chenery [33, 34], Klein
[113], Klein and Summers [115], de Leeuw [50], Schultze [160], Hickman [91, 92], Wilson and
Eckstein [186], Klein et al. [116], and Klein and Preston [114].
2 The optimal timing of capital investment projects to expand capacity distinguishes between
incremental and more substantive increases in capacity. This discussion also distinguishes between
timing within the business cycle, building capacity when the need for it develops at peaks or
countercyclically at the lower point of a cycle, and long-term objectives of the firm, such as
expanding to maintain the firm’s competitive position and market share. In response to economic
shocks, Greenwood et al. [86, 87] allow intensity of capital stock use to vary, and Kydland
and Prescott [121] and Bils and Cho [21] allow the workweek of capital to respond to examine
variations in CU by the intensity or the period of time for capital utilization. Bresnahan and Ramey
[24] and Cooley et al. [42] allow CU to vary by the fraction of the capital stock used in production.
approach based on optimizing firm behavior [91].3 Under both approaches, capacity
utilization (CU) is defined as the ratio of actual output to some reference level
of output, but the reference level differs under different CU approaches. Under
the technological approach, the reference level is typically some notion of a
realistically sustainable maximum level of output. Alternatively, under an economic
optimization approach, it is defined as some notion of “optimal” output. We
discuss both of these approaches, focusing first on the theoretical/methodological
underpinnings of these concepts and then turning to some of the relevant empirical
literature aimed at measuring capacity and/or CU. Our intention is not to provide an
exhaustive literature review, but rather to introduce and overview this literature and
the multitude of issues.
The sections of this chapter are organized as follows. “Conceptual Foundations”
develops the conceptual foundations for capacity and CU. “Technological Approach
to Defining Capacity and CU” discusses the technological approach to defining
capacity and CU. Several different approaches or variations are discussed, since
the literature developed in different contexts and with different emphases. “An
Economic Optimization Approach to Defining Capacity and CU” then discusses the
economic optimization approach. Again, we discuss several different approaches
that differ by the behavioral objective (cost minimization, profit maximization,
or revenue maximization) as well as whether they are based on primal or dual
specifications of the firm’s problem. We then turn briefly to some extensions to
the basic theory to account for multiple outputs and/or multiple inputs, other stocks
(such as stocks of natural capital), regulatory constraints, dynamic decision-making
and adjustment, uncertainty and stochastic demand, and imperfect competition. This
section then briefly surveys other capacity and utilization-related concepts, which
are sometimes conflated with the canonical approach. These include input capacity
based on quasi-fixed inputs, capital (as opposed to capacity) utilization, variable
input utilization, and the link between capacity utilization and productivity.
Following this conceptual overview, “Measurement of Capacity and CU” devel-
ops the empirical measurement of capacity and CU. “Macroeconomic Approaches”
develops primal technical-engineering measures of capacity and CU. Distinctions
are made between different approaches that developed within different contexts and
slightly different analytical frameworks and emphases but that are fundamentally the
same (each with their own strand of literature), including the US Federal Reserve;
peak-to-peak; full employment maximum output; and an explicit production func-
tion. “Microeconomic Frontier-based Approaches” discusses measurement based
on technological-economic methods that draw from microeconomic foundations
and the theory of the firm. These include both primal frontier methods based on
production or distance functions and dual methods based on cost, profit, or revenue
functions. “Microeconomic Optimization-based Approaches” discusses the plant

3 The analysis of capacity, investment to increase capacity, and CU can also be approached from
the capital budgeting and finance literatures. This chapter does not include a review of this strand
of the capacity discussion [118, 158].

capacity model of CU developed by Johansen [98]. Finally, “Concluding Remarks”
provides concluding comments.

Conceptual Foundations

As noted above, capacity utilization (CU) is usually defined as the ratio of actual
output (Y) to some measure of its “capacity,” which is a reference level of output
(YR ) generally thought to be the level that can or should be produced given the
existing input base, prices, states of technology and the environment, and firm
management. Thus, the primal measure of CU is simply \(CU_Y = Y/Y_R\). The inverse,
\(1/CU_Y = Y_R/Y\), indicates the factor by which output could increase if the existing
capacity were fully used (under the technological approach) or optimally used
(under the economic optimization approach) [132].
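
A minimal sketch of this primal measure, with hypothetical output figures and function names, may help fix ideas. It assumes that actual output Y and the reference output Y_R are already measured in the same units; how Y_R is obtained is the subject of the rest of the chapter.

```python
def capacity_utilization(actual_output, reference_output):
    """Primal CU: ratio of actual output Y to reference (capacity) output Y_R."""
    if actual_output < 0 or reference_output <= 0:
        raise ValueError("need actual_output >= 0 and reference_output > 0")
    return actual_output / reference_output

def potential_expansion(actual_output, reference_output):
    """Inverse measure Y_R / Y: factor by which output could rise at full use."""
    if actual_output <= 0 or reference_output <= 0:
        raise ValueError("both outputs must be positive")
    return reference_output / actual_output

# Hypothetical figures: the firm produces 80 units against a reference of 100.
actual, reference = 80.0, 100.0
cu = capacity_utilization(actual, reference)
expansion = potential_expansion(actual, reference)
print(f"CU = {cu:.2f}, Y_R / Y = {expansion:.2f}")  # prints "CU = 0.80, Y_R / Y = 1.25"
```

With these numbers, output could rise by a factor of 1.25, i.e., by 25 percent, if the reference level were reached.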
Capacity and CU are typically viewed as short-run concepts. Earlier views
allowed for capacity in either the short or the long run. For example, Cassels [28]
states: “Careful distinction must always be made between the excess capacity of
fixed factors which exists for the short run and the excess capacity of all factors
which may be present over the long run . . . .” However, the contemporary view is
that capacity and CU arise due to scarcity or fixity or quasi-fixity of one or more
inputs that are available to utilize for the production of one or more outputs.4 The
stock of labor is sometimes considered as one of these fixed factors, but the most
commonly specified fixed or quasi-fixed input, forming the usual capacity base
available for production, is the amount of physical capital stock (K) – which we
use hereafter unless otherwise noted.5
In addition, in some contexts, firms face other constraints in the short run (and
potentially the long run as well) that operate as restrictions on their choices and
should thus be treated as “fixed” when defining capacity and CU. For example, in
sectors with renewable resources (e.g., fisheries) or nonrenewable resources
(e.g., mining), the natural capital stock forms a second stock in a stock-flow
production technology that is fixed for individual firms. Similarly, a firm might be
regulated by a binding quota or predetermined demand that should be reflected in
the measure of capacity and CU. We return to the role of these additional constraints
below.

4 Fixed factors are factors that must receive payment whether or not any output is produced. Quasi-
fixed factors must be paid only if the firm decides to produce a positive amount of output. Another
definition of quasi-fixed factors is factors of production that can be adjusted in a time period,
the short run, but will not be adjusted all the way to the full static equilibrium level because of
constraints such as adjustment costs.
5 Issues arise regarding consistently aggregating K from individual capital stocks and different
vintages of each capital stock and incorporating multiple stocks in the capacity base [22, 96].

Technological Approach to Defining Capacity and CU

Measures Based on Maximum Sustainable Output

Technological notions of capacity and CU measures were discussed early on by
Cassels [28] and further developed and given rigor for empirical application by Gold
[82], Klein [113], Klein and Summers [115], Klein et al. [116], Klein and Preston
[114], Hickman [92], and Johansen [97]. Klein and Summers [115] state that full
capacity is “the maximum sustainable level of output which the industry can attain
within a very short time if the demand for its product were not a constraining factor,
and when the industry is operating its existing stock of capital at its customary level
of intensity.”
The technological approach thus defines the reference level of output YR as the
maximum output (Y0 ) that may be produced given the firm’s short-run capital stock
K – or more generally, assuming some factors of production are fixed in the short
run that include K but can also include labor and other factors [113, 116, 166].
Capacity is “wasted” when the output produced is lower than what is maximally
producible from the quantities of variable and fixed inputs used. Capacity remains
“underutilized” when the quantity of variable inputs is less than what is required
to produce the maximum output producible given the fixed inputs. Shapiro [166]
gives an economic interpretation to this approach: “Capacity is best thought of as
the level of output where the marginal cost curve becomes steep. If this region of
the cost curve is sometimes relevant, the relationship between capacity and output
should be non-linear. If capacity is tight, growth in capacity limits growth in output.”
This concept of capacity includes the qualification that capacity represents
a realistically sustainable maximum level of output rather than some higher
unsustainable short-term maximum [44]. Additional assumptions include that the
elasticity of substitution between variable and fixed factors of production (i.e., the
elasticity of intensity, [57]) is very low and that movements in production arise
from shifts in demand rather than shifts in production possibilities [166]. Under
these assumptions, short-run changes in output equal short-run changes in utilization
of fixed factors. Thus, the US Federal Reserve, Census Bureau, and Defense
Logistics define full production capability as the maximum level of production that
an establishment could reasonably expect to attain under reasonable and realistic
operating conditions fully utilizing the machinery in place and ready to operate.
Under this definition, CU is always less than or equal to 1, i.e.,
\(CU_Y = Y/Y_0 \le 1\). When \(CU_Y < 1\), the firm has the potential for greater production,
given K, without having to incur major expenditures for new capital or equipment
[115]. This implies that some of the capital stock is not fully utilized, while full
capital utilization and technical efficiency would yield \(Y_0\). However, because
marginal costs rise sharply as production approaches \(Y_0\), full sustainable
CU is in practice typically considered to be attained at a production level lower
than \(Y_0\).
This raises practical questions about defining the full employment or full
utilization level of variable inputs and a precise distinction between variable and
fixed factors [44, 135]. Klein et al. [116] state that full capacity can be described
as a full input point on an aggregate production function and that full capacity
should be defined as an “attainable level of output that can be reached under
normal input conditions” and “fully employing” the variable inputs, “without
lengthening accepted working weeks, and allowing for usual vacations and normal
maintenance,” given the current technology and keeping fixed factors at their current
levels. However, some notion of what constitutes normal conditions must still be
determined. For example, is the capacity of a plant and equipment determined by
the production of this plant and equipment operating throughout the day or season or
year, and should downtime for repair and maintenance, institutional constraints such
as holidays, and the like be considered? The answer varies by the type of technology
and institutional factors that constitute issues such as “normal” downtime [44].
Short-run output varies with technology type in different ways according to (1)
duration and (2) intensity or speed of operations. Alternatively, capacity output can
be defined based on applying variable inputs without limit, to the point at which
the marginal productivity of all variable inputs falls to zero. As seen below, Färe
[65] and Färe et al. [67] rigorously develop this approach within the context of Data
Envelopment Analysis.
Finally, note that the above definition of capacity output is based on the notion
of a best-practice production frontier of Debreu [52], Koopmans [117], and Farrell
[67, 72].6 However, in practice firms might not be operating on their production
frontiers because of technical inefficiency. In the presence of inefficiency, the
observed output used to calculate the standard measure of primal CU may differ
from the capacity output due either to the quasi-fixity of inputs or to technical
inefficiency. This would result in a downward bias relative to the frontier measures
of capacity and in the CU measure. To address this, Färe et al. [67] proposed an
alternative primal CU measure that is the ratio of technically efficient output, \(Y^{TE}\),
to capacity output: \(CU_Y^{TE} = Y^{TE}/Y_0\). This CU measure is consistent in that both the
numerator and the denominator are technically efficient. It provides a CU measure
that indicates deviations from capacity output due solely to quasi-fixed inputs. Such
a ratio is “unbiased” in that it is not directly influenced by measured technical
inefficiency.
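
The relationship between the observed and the efficiency-adjusted measures can be sketched numerically. Because Y/Y_0 = (Y/Y^TE)(Y^TE/Y_0), the observed CU is the product of an output-oriented technical efficiency score and the adjusted CU. The figures and names below are hypothetical, and the efficient and capacity outputs are taken as given rather than estimated from a frontier.

```python
import math

def cu_decomposition(actual_output, efficient_output, capacity_output):
    """Return (observed CU, technical efficiency, efficiency-adjusted CU)."""
    if not (0 < actual_output <= efficient_output <= capacity_output):
        raise ValueError("expected 0 < Y <= Y_TE <= Y_0")
    observed_cu = actual_output / capacity_output          # Y / Y_0
    technical_efficiency = actual_output / efficient_output  # Y / Y_TE
    adjusted_cu = efficient_output / capacity_output       # Y_TE / Y_0
    return observed_cu, technical_efficiency, adjusted_cu

# Hypothetical firm: observed 60, technically efficient 75, capacity 100.
observed_cu, technical_efficiency, adjusted_cu = cu_decomposition(60.0, 75.0, 100.0)

# The observed measure understates utilization whenever the firm is inefficient.
print(f"observed {observed_cu:.2f}, TE {technical_efficiency:.2f}, adjusted {adjusted_cu:.2f}")
print(math.isclose(observed_cu, technical_efficiency * adjusted_cu))
```

In this example the gap between the observed value (0.60) and the adjusted value (0.75) is attributable entirely to technical inefficiency.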

Johansen’s Plant Capacity

The most commonly used approach to defining capacity based on production
possibilities employs a primal definition of capacity output – plant capacity –
developed by Johansen [97] and refined by Färe [65].7 In contrast to the above

6 Output-oriented technical efficiency occurs when firms produce the maximum output attainable
for a given set of inputs, given the state of technology, environmental conditions, and, in natural
resource industries, the resource stock. Output-oriented technical inefficiency occurs when the
actual output is less than the technically efficient level of output. Technical efficiency is measured
from a best-practice production frontier – the production frontier estimated from the input-output
observations for firms with the observed best practice.
7 Again, empirical application uses the notion of the best-practice production frontier of Debreu
[52], Koopmans [117], and Farrell [72].

measure, the plant capacity concept of capacity is defined as “the maximum amount
that can be produced per unit of time with existing plant and equipment, provided the
availability of variable factors of production is not restricted.” Färe [65] developed
a formal proof of existence for Johansen’s definition of capacity and
identified Johansen’s concept as a strong definition of capacity because
it is unbounded. He also developed a weaker notion, which only requires that for
fixed inputs output is bounded when there is no restriction on variable inputs. The
strong definition implies the weak definition, but the reverse does not hold.
When the Johansen plant capacity definition is applied in an industry, the
assumption of unbounded variable inputs is problematic [109], again raising the
question of how to measure capacity in practice. Complicating factors include
how to define customary and normal operating procedures, the number of shifts,
vacations, etc. [44, 115, 116]. The maximum potential variable input level, such as
three full 8-hour shifts a day every day of a week and every week in a year for similar
plants, may not match the observed practice (and hence the data). The maximum
must then be defined relative to historically observed maximum variable input usage
in a fairly recent period for similar plants (see, e.g., [106, 112]). These norms, rules,
and practices can also vary by industry. When the Johansen plant capacity concept
is employed in a regulated industry in which the variable input use may be less than
the theoretically unrestricted levels, the resulting capacity estimates are likely to be
more realistically obtainable than the strict definition indicates [73].
Capacity output may also be limited by budget constraints. Ray et al. [153]
develop a restricted version of Johansen’s [97] plant capacity, utilizing a restricted
version of Shephard’s [167] indirect production function that takes explicit account
of input prices. Capacity output is then the maximum quantity that the firm can
produce given a specific quantity of the quasi-fixed factor and an overall budget
constraint for its choice of variable inputs. The firm can use any variable input
bundle within an overall expenditure constraint.
Estimates of plant capacity using individual firm-level data face several limita-
tions [105]. Horizontally summing these firm-level capacity outputs across firms
gives a measure of aggregate industry capacity output.8 Comparing this aggregate
industry capacity output to current aggregate industry output provides a measure
of CU. The plant capacity measure, however, does not allow reallocation of inputs
and outputs across firms [98]. In turn, this does not allow assessing the industry’s
optimal restructuring and configuration. Nonetheless, the plant capacity approach
has been widely applied, notably to fisheries, hospitals, banking, and power plants
along with other industries [55, 68, 102, 104–106, 108–112, 129, 138, 139, 142,
151, 153, 155, 177, 179, 181–184].
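
As a minimal sketch of the aggregation just described, the following Python snippet horizontally sums firm-level capacity outputs and compares the total with observed industry output. The firm names and all numbers are hypothetical and serve only to illustrate the arithmetic; the function name `industry_cu` is likewise an assumption of this sketch.

```python
from typing import Dict, Tuple


def industry_cu(observed: Dict[str, float],
                capacity: Dict[str, float]) -> Tuple[float, float, float]:
    """Return (aggregate output, aggregate capacity output, industry CU).

    Firm-level figures are summed horizontally, which presumes a private good
    and no reallocation of inputs or outputs across firms.
    """
    if observed.keys() != capacity.keys():
        raise ValueError("observed and capacity must cover the same firms")
    total_output = sum(observed.values())
    total_capacity = sum(capacity.values())
    if total_capacity <= 0:
        raise ValueError("aggregate capacity output must be positive")
    return total_output, total_capacity, total_output / total_capacity


# Hypothetical firm-level data (illustrative only).
observed_output = {"firm_a": 80.0, "firm_b": 45.0, "firm_c": 60.0}
capacity_output = {"firm_a": 100.0, "firm_b": 50.0, "firm_c": 100.0}

total_y, total_cap, cu = industry_cu(observed_output, capacity_output)
print(f"industry output={total_y}, capacity output={total_cap}, CU={cu:.3f}")
```

Because the sum holds each firm's inputs and outputs in place, the resulting ratio says nothing about how much capacity could be gained by restructuring the industry.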

8 The horizontal summation of output to achieve an aggregate industry output implies a private
good. A public good would require vertical summation to achieve the aggregate industry provision
of the public good.

The Johansen-Färe plant capacity measure of capacity for the individual producer
can be extended to the industry model using a multi-product, frontier-based version
of the short-run Johansen industry model [53, 105, 106, 179]. Industry capacity
is the sum of firm capacities with possible reallocation of fixed inputs across
producers. The short-run Johansen [98] industry model analyzes the industry
structure due to the underlying ex post firm-level production structure. Investment
decisions imply a putty-clay production structure. Thus, firms may choose ex ante
from a catalogue of production options exhibiting smooth substitution possibilities,
but most face fixed coefficients ex post, once the investment is made. Firms then have a capacity that
is entirely conditioned by the investment decisions they have made. The short-run
industry nonetheless exhibits substitution possibilities when inputs and outputs can
be reallocated across the production units that comprise the industry. Over time,
substitution and technical change can be traced via shifts in successive short-run
industry models.

An Economic Optimization Approach to Defining Capacity and CU

In contrast to the technological approach described above, the economic optimization
approach to defining capacity and CU defines the reference level of output based
on economic optimizing behavior. Part of the stimulus for the development of this
approach was an attempt to understand the impact of rising energy prices on capacity
and CU following the 1973 oil price shock and the emergence of OPEC and the
subsequent very high rates of global inflation.
Even though rates of economic growth remained relatively high, investment and
average labor productivity were lower than expected, and the explanatory power of
existing CU measures for inflation and other macroeconomic indicators had fallen
off sharply.
The development of this approach was also spurred by advances in applied
microeconomic theory in the form of duality theory and flexible functional forms,
developments in econometrics, advances in computing power and software, and the
advent of increasingly larger and more comprehensive data sets. These develop-
ments strengthened the link between capacity/CU and economic theory based on
explicit behavioral objectives of the firm. Early, and indeed foundational, work
in this new, microeconometric approach to capacity and CU included Berndt
and Morrison [17], Morrison [132–134], Berndt and Fuss [14], and Hulten [94].
Although this approach was implicit in earlier work, such as that by Cassels
[28], Klein [113], Friedman [79], and Hickman [92] among others, developments
in duality theory, econometrics, computing power and software, and data sets
stimulated an explosion of applied theory and empirical applications.
The economic optimization approach is based on defining the reference level of
output as the level that is economically optimal (Y*) [28]. Because the stock K is
taken as fixed or quasi-fixed in the short run, short-run fluctuations in demand are
accommodated by changes in the amount of variable inputs used in production [132,
135]. If the demanded output level differs from the production level that would be
optimally supported by K, utilization of K will not be economically optimal. The
economically optimal production level given short-run constraints on adjustment,
i.e., the firm’s “capacity,” is defined as the output level that would be produced in
steady-state equilibrium given prices, technology, and the current levels of K. At
this production level, i.e., if Y = Y∗ , given input prices and technology, there is no
incentive for the firm to change levels of the fixed input stocks. Thus, while capacity
output and CU are inherently short-run concepts (since they reflect the existence of
one or more fixed factors), the capital stock or the capacity decision is a long-run
concept [107]. As is developed below, an economic CU different from one indicates
that the firm faces incentives to invest or disinvest in K and therefore is not in long-
run equilibrium.
Since capacity and CU are inherently short-run concepts, they should be based
on the short-run objectives of the firm. The economic behavioral objective of the
firm falls into one of three broad categories [163, 164, 171]: (1) cost
minimization, where output levels are exogenous or predetermined and some inputs
are variable while others are fixed or quasi-fixed; (2) revenue maximization, where
one or more outputs are endogenous and all remaining outputs and all inputs are
fixed or quasi-fixed; or (3) restricted (variable) profit maximization, where one or more
outputs are endogenous and variable and some inputs are variable while all other
outputs and inputs are fixed or quasi-fixed. Each of these can be used to define both
the primal measure described above, CUY , and analogous dual measures. Below we
provide an overview of these measures under the alternative behavioral objectives.

Primal Measures of CU Based on Cost-Minimizing Behavior

Three different economic definitions of capacity output have been proposed for
defining a primal measure of CU for a cost-minimizing firm. Each of these
approaches addresses the short-run excessive or insufficient utilization of fixed or
quasi-fixed inputs. The first defines capacity output as the output at which the short-
run average cost (SRAC) curve reaches its minimum. This definition was originally
proposed by Cassels [28] and subsequently adopted by others, including Chenery
[33, 34], Hickman [92], Wilson and Eckstein [186], Berndt and Morrison [17], and
Berndt and Hesse [16]. This approach emphasizes exploiting short-run economies
of scale. However, Klein [113] noted that Cassels’s definition was difficult to apply
empirically, since L-shaped long-run average cost curves without a well-defined
minimum were more likely to be observed [16].
A second definition of economic capacity based on cost-minimizing behavior
defines capacity as the output determined by the minimum of the long-run average
cost curve (LRAC) [28, 92] rather than the minimum of the SRAC. However, this
approach has not been used much in practice, perhaps because it is so closely
intertwined with scale economies [105, 152].
The third definition is based on defining capacity output as the output at which the
LRAC and SRAC curves are tangent when the stock K is fixed. This approach was
originally suggested by Klein [113] and then further developed by Friedman [79]
and Morrison [132], Berndt and Fuss [14], and Hulten [94]. This third definition
coincides with the first measure under long-run constant returns to scale, but more
generally reflects a capacity output level that is a steady state in the sense that
the firm does not have an incentive to change output if input prices, stocks of
fixed inputs, and technology remain unchanged. Intuitively, this economic notion
of capacity means that for a given output level and state of technology and prices,
the firm is using the stock of capital that allows that output to be produced at the
lowest average cost. If demand falls short of the output level corresponding to the
tangency of the SRAC and LRAC curves so that CUY < 1, there is excess capital,
and the firm faces incentives to disinvest to lower K. If, on the other hand, demand
exceeds the level of output supported in steady state by the existing stock K, then
CUY > 1, which implies that there is insufficient K and the firm faces incentives to
invest.

Dual Measures of CU Based on Cost-Minimizing Behavior

The measures of CU described above are based on an “output gap,” i.e., a comparison
of actual vs. capacity output. However, CU can be equivalently measured
in terms of the cost gap that exists when $Y \neq Y^*$ [132]. This dual CU measure
contains information on the difference between the current short-run or temporary
equilibrium and the long-run equilibrium in terms of the implicit costs of divergence
from long-run equilibrium. It is defined as $CU_C = C^*/C$, where $C$ is the firm’s actual
cost and $C^*$ is the firm’s shadow cost. The shadow cost is its cost when capital is
valued at its shadow price (rather than its actual price), where the shadow value is
defined as the negative of the derivative of the variable cost function with respect to
$K$. More specifically, if $G(Y, W, K)$ represents the firm’s short-run variable cost as a
function of output, variable input prices ($W$), and the fixed capital input $K$, then

$$CU_C = \frac{C^*}{C} = \frac{G(Y, W, K) + P_K^* K}{G(Y, W, K) + P_K K} = 1 + \frac{(P_K^* - P_K) K}{C} \qquad (1)$$

where $P_K$ is the actual price of $K$ and $P_K^* = -G_K$ is the shadow price of $K$.
Clearly, $C^* > C$ and hence $CU_C > 1$ when $P_K^* = -G_K > P_K$. This corresponds
to the case where the valuation of an incremental unit of $K$ is higher than its actual
cost and the current level of $K$ is therefore less than the cost-minimizing level, $K^*$.
Equivalently, it means that $Y > Y^*$ and hence $CU_Y = Y/Y^* > 1$, when $Y^*$ is defined
as the output level where long-run and short-run average costs will be equal, i.e., the
SRAC and LRAC curves will be tangent, which corresponds to the level of output
where $G_K(Y, W, K) + P_K = 0$. Thus, while differing in magnitude, both the primal
and dual measures imply that the firm has an incentive to invest to increase its capital
stock. Conversely, when the observed $K$ is larger than $K^*$, then the shadow price of
$K$ will be less than $P_K$ and $Y < Y^*$. In this case, both measures signal that the firm
has an incentive to reduce its capital stock or disinvest.
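
The agreement between the primal and dual signals can be checked numerically. The sketch below is only an illustration under an assumed technology, Y = L^a K^b with a single variable input L priced at W; this functional form, the parameter values, and all function names are assumptions of the example and are not taken from the studies cited. For this form the variable cost is G = W (Y / K^b)^(1/a), so that the shadow price is -G_K = (b/a) G / K.

```python
def variable_cost(y, w, k, a=0.5, b=0.5):
    """G(Y, W, K) for the assumed technology Y = L**a * K**b with wage W."""
    labor = (y / k**b) ** (1.0 / a)  # labor needed to produce y with capital k
    return w * labor


def shadow_price(y, w, k, a=0.5, b=0.5):
    """P_K* = -G_K; for this functional form -G_K = (b / a) * G / K."""
    return (b / a) * variable_cost(y, w, k, a, b) / k


def dual_cu_cost(y, w, k, p_k, a=0.5, b=0.5):
    """Dual measure CU_C = C*/C, with capital valued at its shadow price in C*."""
    g = variable_cost(y, w, k, a, b)
    actual_cost = g + p_k * k
    shadow_cost = g + shadow_price(y, w, k, a, b) * k
    return shadow_cost / actual_cost


def capacity_output(w, k, p_k, a=0.5, b=0.5):
    """Y* solving G_K(Y*, W, K) + P_K = 0, i.e., the SRAC-LRAC tangency output."""
    return (a * p_k / (b * w)) ** a * k ** (a + b)


def primal_cu(y, w, k, p_k, a=0.5, b=0.5):
    """Primal measure CU_Y = Y / Y*."""
    return y / capacity_output(w, k, p_k, a, b)


# Hypothetical prices and capital stock: W = 1, P_K = 1, K = 4.
for y in (2.0, 4.0, 6.0):
    print(y, round(primal_cu(y, 1.0, 4.0, 1.0), 4), round(dual_cu_cost(y, 1.0, 4.0, 1.0), 4))
```

In this example both measures equal one at Y = 4, exceed one at Y = 6, and fall below one at Y = 2, although their magnitudes differ away from Y*.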

Profit and Revenue-Based Measures of CU

The above economic measures of capacity output assume cost minimization of
exogenous or predetermined output given one or more fixed or quasi-fixed inputs.
This approach is particularly apt when a firm produces a standardized product in
anticipation of stable market demand and competes on the basis of price and low cost
arising from economies of scale. However, firms may have behavioral objectives
other than minimizing the cost of a predetermined level of output, and one or more
outputs may be endogenous, i.e., choice variables for the individual firm. Squires
[171], Berndt and Fuss [15], Segerson and Squires [163], Fousekis and Stefanou
[78], Kim [107], Coelli et al. [41], and Briec et al. [26] extended the concept of
capacity and CU to the case where firms choose their output level(s) to maximize
short-run profit.

Fig. 1 Alternative measures of capacity output (vertical axis: $; horizontal axis: output Y, with points A–E marked; curves shown: SRMC, SRAC, and LRAC, with the price line P = MC)

Assuming an objective of short-run profit maximization rather than cost mini-
mization has implications for both primal and dual measures of CU. In the context
of a single-product firm,9 under cost minimization Y∗ is typically defined based on
the point of tangency of the SRAC and LRAC curves, as discussed above. Kim [107]
and Briec et al. [26] advocate defining Y* in this way even when the firms choose
output endogenously. However, this is not necessarily an equilibrium output level,
even in the short run. The short-run profit-maximizing output level is the output
level at which the firm maximizes variable profit, i.e., at which price equals short-
run marginal cost or P = SRMC. This output level generally differs from the output
level at which the SRAC and LRAC curves are tangent,
as illustrated in Fig. 1 (discussed in more detail below). For this reason, Squires
[171] and Segerson and Squires [163, 164] implicitly advocate and Coelli et al.
[41] explicitly advocate defining Y* as the short-run profit-maximizing output level
when defining primal measures of CU.

9 The definition becomes more complex when there are multiple products as discussed below.

Defining Y* as the short-run profit-maximizing choice of output also allows for
a comparable dual measure to be easily defined. In particular, under the assumption
of profit maximization, we can define a CU measure based on the profit gap (rather
than the cost gap) that stems from the firm being out of long-run equilibrium. This
profit-gap measure of CU is given by $CU_\pi = \pi/\pi^*$, where $\pi$ and $\pi^*$ are actual and
shadow profit, respectively. More specifically, if $H(P, W, K)$ is the short-run variable
(restricted) profit function where $P$ is the output price and $W$ denotes the vector of
variable input prices, then $\pi^* = H(P, W, K) - P_K^* K$, where $P_K^*$ is now the shadow
value of capital given by $H_K$. As a result

$$CU_\pi = \frac{\pi}{\pi^*} = \frac{H(P, W, K) - P_K K}{H(P, W, K) - P_K^* K} = 1 + \frac{(P_K^* - P_K) K}{\pi^*} \qquad (2)$$

(Segerson and Squires [163]). As with the dual cost-based measure $CU_C$, this
profit-based dual measure will signal incentives to invest ($CU_\pi > 1$) when $P_K^* > P_K$, or,
equivalently, $K < K^*$ and $Y > Y^*$. Conversely, $CU_\pi < 1$ signals incentives to reduce
the capital stock.
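
A small numerical illustration of the profit-gap measure follows. It is a sketch under an assumed technology, Y = L^a K^b with a + b < 1 so that shadow profit is positive; the functional form, the parameter values, and the function names are assumptions of the example rather than anything specified in the cited work. With one variable input, the restricted profit function and its derivative H_K = b P Y / K (by the envelope theorem) are available in closed form.

```python
def restricted_profit(p, w, k, a=0.5, b=0.25):
    """Return (H, Y): maximized variable profit H(P, W, K) and the chosen output Y.

    Assumed technology: Y = L**a * K**b with a single variable input L.
    """
    labor = (a * p * k**b / w) ** (1.0 / (1.0 - a))  # from P * dY/dL = W
    output = labor**a * k**b
    return p * output - w * labor, output


def shadow_value_of_capital(p, w, k, a=0.5, b=0.25):
    """P_K* = H_K; by the envelope theorem H_K = P * dY/dK = b * P * Y / K."""
    _, output = restricted_profit(p, w, k, a, b)
    return b * p * output / k


def dual_cu_profit(p, w, k, p_k, a=0.5, b=0.25):
    """Profit-gap measure CU_pi = pi / pi*, with capital at its shadow value in pi*."""
    h, _ = restricted_profit(p, w, k, a, b)
    actual_profit = h - p_k * k
    shadow_profit = h - shadow_value_of_capital(p, w, k, a, b) * k
    return actual_profit / shadow_profit


# Hypothetical values: P = 4, W = 1, K = 16, so that P_K* = 0.5.
for p_k in (0.25, 0.5, 0.75):
    print(p_k, round(dual_cu_profit(4.0, 1.0, 16.0, p_k), 4))
```

With these numbers the measure exceeds one when the actual price of capital (0.25) is below its shadow value (0.5), equals one when the two coincide, and falls below one when capital costs more than it is worth at the margin.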
Revenue maximization is a special case of profit maximization for multi-product
firms when all inputs are quasi-fixed or fixed and at least one output is endogenous
(some outputs may be quasi-fixed). In this case, the profit-based CU measure can be
readily adapted by simply reinterpreting H in (2), giving R(P, K). Alternatively, the
cost-based measure in (1) can be applied. Because short-run variable costs are zero
if all inputs are quasi-fixed or fixed, in this context the cost-based measure reduces
to $CU_C = P_K^*/P_K$, where now the shadow price of capital is defined as its marginal
contribution to revenue [164]. While the magnitudes of these different measures
differ, they all consistently imply incentives to increase capital when CU > 1 and to
reduce capital when CU < 1.
The different measures of capacity output described above, including the primal
technological definition and the definitions based on cost-minimizing and profit-
maximizing behavior, are depicted graphically in Fig. 1 (taken from Coelli et al.
[41]). The first three measures are based solely on the cost curves. Point B defines
capacity output as the output at which the short-run average total cost (SRAC) curve
reaches its minimum. Point E defines capacity as the output determined by the
minimum of the LRAC curve rather than the minimum of the SRAC curve. Point A
defines capacity as the output at which the LRAC and SRAC curves are tangent. The
fourth measure, based on profit-maximizing behavior and labeled point C, defines
capacity as the output at which marginal revenue (here, price P) equals SRMC.
Finally, the fifth measure, labeled point D, is the primal plant capacity measure of
Johansen [97].

Extensions to Multi-product Firms

The above discussion of alternative measures of CU focused on a single-product
firm. However, many contexts of interest involve firms that produce multiple
products under joint production. A number of authors have considered CU for
multi-product firms with joint production. For example, Squires [171], Berndt and
Fuss [15], Segerson and Squires [162–164], Fousekis and Stefanou [78], Kim
[107], Coelli et al. [41], Briec et al. [26], and De Borger et al. [49] extended
the concept of capacity and CU when the firm’s behavioral objective is short-run
profit maximization and there are multiple endogenous outputs. Similarly, Segerson
and Squires [163, 164], Färe et al. [71], and Lindebo et al. [125] developed a
revenue-based economic concept of capacity for a multi-product firm. As explained
here, it is quite straightforward to extend the dual measures of CU to allow for
multiple products, but more challenging for primal measures.
The economic duality approach readily extends from the single-product to the
multiple-product case with a single quasi-fixed input, because these measures are
based on the ratio of scalar measures of costs or profits even when output is not a
scalar. Consider first the cost-based dual measure of CU defined in (1). By simply
reinterpreting Y as a vector of output levels, this definition can be applied to multi-
product firms as well [162]. Similarly, by interpreting P as a vector of output prices
in (2), the profit-based dual measure also extends to multi-product firms. The fact
that output is a vector does not limit the use of these measures since cost and profit
are still scalars, and hence the cost or profit gap is still easily defined.
However, extension of the primal measures is less straightforward, and the
restrictiveness of these extensions is one reason to prefer the use of dual measures
when studying CU in multi-product firms. Segerson and Squires [162] observe that
a consistent scalar measure of output in multi-product firms exists if all outputs
are homothetically separable from inputs. In this case, a direct analogue of the
single-product primal measure of capacity utilization can be developed for the multi-
product firm or industry. They specifically examined aggregate output for which the
long- and short-run cost functions are tangent given the stock of physical capital.10
When the production technology is not homothetically separable in outputs so
that a consistent composite output is not possible, Segerson and Squires [162]
suggest two alternative ways of defining a primal capacity utilization measure.
Both approaches require restrictive assumptions: (1) outputs must move along
a ray, giving a ray measure of capacity utilization, or (2) only one output can
adjust (giving a partial measure of capacity utilization).11,12 Eilon and Soesan
[60] suggested a similar primal ray measure (due to a constant output mix) by
constructing a full capacity envelope curve that defines the maximum possible
output level for each output mix and then measuring CU as the ratio of observed
output to maximum output, holding the output mix constant.13 Coelli et al. [41]
suggested a ray primal measure using a short-run profit function giving an approach
analogous to Johansen’s [97] plant capacity but at the corresponding short-run
profit-maximizing scale of production when variable input prices are zero.

10 Gold [82, 83] suggested using output prices as weights to create a price-weighted sum of actual
output levels divided by the price-weighted sum of the maximum possible levels of each output.
11 Segerson and Squires [162] show the equivalence of the ray measure to homothetic output
separability due to the linear output expansion path with homothetic output separability (where
linear homogeneity is imposed upon the output aggregator function; see [22], Lemma 3.3.a) and
relate the (short-run) ray average cost to the work of Baumol, Panzar, and Willig [12] rather than
the other major production economics strand of the capacity literature that uses the output distance
functions of Shephard [167].
12 The numerical value of this CU measure will vary across products, and therefore it is not unique
for a given firm. However, it can be shown under certain conditions that these measures provide
a consistent indication of whether the firm’s capacity is currently under- or overutilized and the
same answer is given regardless of which output is used to measure CU.

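
To make the ray idea concrete, the sketch below computes a ray measure of CU alongside the price-weighted measure attributed to Gold. The full-capacity frontier used here, sum((y_i / m_i)^2) = 1, is a purely hypothetical functional form chosen so that the radial scaling factor has a closed form; the output levels, maxima, prices, and function names are likewise illustrative assumptions.

```python
import math
from typing import Sequence


def gold_cu(prices: Sequence[float], outputs: Sequence[float],
            max_outputs: Sequence[float]) -> float:
    """Price-weighted actual output over price-weighted maximum output."""
    numerator = sum(p * y for p, y in zip(prices, outputs))
    denominator = sum(p * m for p, m in zip(prices, max_outputs))
    return numerator / denominator


def ray_cu(outputs: Sequence[float], max_outputs: Sequence[float]) -> float:
    """Ray CU for the assumed frontier sum((y_i / m_i)**2) = 1.

    Scaling the observed bundle by theta = 1 / sqrt(sum((y_i / m_i)**2)) puts it
    on the frontier with the output mix unchanged, so CU = 1 / theta.
    """
    return math.sqrt(sum((y / m) ** 2 for y, m in zip(outputs, max_outputs)))


# Hypothetical two-output firm.
prices = [2.0, 3.0]
outputs = [60.0, 30.0]
max_outputs = [100.0, 50.0]  # maximum of each output when produced alone

print(round(gold_cu(prices, outputs, max_outputs), 4))
print(round(ray_cu(outputs, max_outputs), 4))
```

The two numbers differ because the price-weighted ratio compares actual output with the individual maxima of every output at once, whereas the ray measure compares it with the largest attainable bundle that keeps the observed mix.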
Felthoven and Morrison Paul [74] describe two additional primal approaches
using a stochastic multi-product asymmetric transformation function: (1) iteratively
or simultaneously solving for the capacity outputs and the capacity level of the
variable input such that the marginal product of the variable input equals zero
and the capacity output ratio(s) equal the observed ratio(s) or (2) iteratively or
simultaneously solving for capacity output values that satisfy the profit- (or
revenue-)maximizing condition equating the marginal rate of product transformation
to the negative of the price ratio(s), together with a condition on capacity
variable input use.
In addition to Felthoven and Morrison Paul’s [74] application of a multi-product
asymmetric transformation function, Färe et al. [71] measure CU in multi-product
firms through the multi-product distance functions of Shephard [167]. This approach
is not only readily estimated through either econometrics or Data Envelopment
Analysis, but it also allows for undesirable outputs (“bads”) in addition to desirable
outputs. This approach has spawned a large number of applications, all too numerous
to cite here, in the areas of environmental performance, environmental damages,
fisheries, and even conservation (e.g., [75]). This approach also allows for the
possibility that “too much” of the fixed factor may be employed, because directional
distance functions can expand outputs (giving the output orientation of
Johansen) while contracting inputs in the capacity measure.

Additional Considerations

Multiple Fixed Inputs

As discussed above, the dual economic approach, which is based on cost, profit,
or revenue optimization and economic duality, readily accommodates multi-product
production under joint production. However, this is predicated on the existence of a
single quasi-fixed input that is the focus of incentives to invest or disinvest. When
there are multiple quasi-fixed factors, optimization-based measures of capacity and
CU are less obvious and more limited measures must be applied to define and
measure capacity and capacity utilization [15, 162]. While it might be tempting
to assume that movements along the long-run average total cost curve correspond
to plant expansions derived from (proportional) increases in all quasi-fixed inputs,
Chambers [31] shows that this is clearly not the case. Movements along the average
total cost curve indicate plant expansion activities only when there is a single fixed
input. Thus, as formally demonstrated by Berndt and Fuss [15], it may not be
possible to determine the maximum capacity output or rate of capacity utilization
when there are multiple quasi-fixed or fixed factors. The indeterminacy problem is
more severe in the presence of multiple products.

13 Such an aggregation approach (constant output mix) can be justified by Leontief aggregation of
outputs or more generally Lewbel’s Generalized Composite Commodity Theorem [123].

Ray et al. [154] adopt a linear programming approach and build on Ray
[152] to develop a CU measure based on minimum short-run ray average cost under
alternative scale assumptions when there are multiple outputs and/or multiple fixed
inputs. The concept of ray average costs inherently keeps multiple outputs in fixed
proportions and thereby satisfies Leontief output aggregation to create a composite
output. This being a short-run analysis, the scale of the fixed input vector does not
change, but ray expansion of the output vector serves to change the average ray fixed
cost.

CU in Natural Resource Industries

The concepts of capacity and CU have been of particular interest in natural resource
industries where output depends not only on the firm’s inputs but also on the resource
stock. In fact, an explosion of conceptual and especially empirical work on capacity
and CU arose in part as a response to emerging and critical practical problems in
natural resource economics and concerns about sustainable exploitation of natural
resources and pressures on natural resource stocks.
The concept of capacity in industries exploiting a natural resource stock, whether
renewable or non-renewable, must deal with two stocks of capital, physical capital K
and natural capital N.14 The stock of natural capital, however, differs from physical
capital, since natural capital is not under the control of the individual firm and hence
should not be treated as a discretionary input. In addition, the resource stock imposes
an upper limit on the total output that can be produced [108, 109, 147]. That is,
regardless of the expansion of K and increased utilization of variable inputs, for
given input prices and technology, output cannot exceed some level determined
by N. Thus, key conceptual issues that arise in this context are the addition of the
stock of natural capital, the observed and sustainable capacity output flows from this
stock, and the requirement to compare capacity output not just to observed output
for a measure of CU but to the sustainable target output for the industry [63, 64, 108,
109, 137, 138, 142, 147]. Ill-structured, incomplete, or severely attenuated property
rights, regulatory structure, regional specificity, multiple stocks for different species,
mobility of physical capital, and externalities all further complicate capacity issues
in these industries [110].

14 Note that the multiple quasi-fixed stock problem of K and N differs from the problem of multiple
quasi-fixed inputs discussed above, since N is not directly controlled by the firms.

Capacity and CU Under Regulatory Constraints

Up to this point, the discussion has assumed that firms face constraints in the
form of fixed or quasi-fixed inputs and possibly predetermined output (in cost-based
measures). In many contexts, firms can face additional constraints from,
for example, government regulations, which can and should be factored into CU
measures as well [51]. For example, Averch and Johnson [9] demonstrated that
regulation designed to ensure some rate of return on capital for the firm will induce
some “excess” capacity and lower the rate of CU. Similarly, regulations in the form
of rations, quotas, or other input or output restrictions can impact production and
investment decisions that in turn affect CU. For example, in an effort to promote
sustainable fisheries, fisheries managers often impose limits on harvests or inputs
that can impact capacity and CU [163, 172, 173, 175, 185].

Dynamics
The concepts of capacity and CU discussed above can also be extended to
incorporate dynamic decision-making and adjustment. The primal measure is then
based on behavior of the firm’s supply along an optimal path to the steady state,
while the dual measure is based on the behavior of the dynamic value function in
the stock of the quasi-fixed input. For example, Morrison [132] and Fousekis and
Stefanou [78] extended the static (single-period) concept of capacity to a dynamic
concept in which there is multi-period adjustment of K that reflects the costs of
adjustment. Abel [1] allows firms to choose optimal utilization rates of quasi-fixed
factors; in his model, capital investment is negatively related to capacity utilization
along the path to steady state, whereas capital utilization and investment are positively related.
Dynamic models, based on dynamic optimization (e.g., minimizing the present
value of costs or maximizing the present value of profits), can help to explain the
gradual adjustment that occurs in capital stocks as they move over time toward some
target level of quasi-fixed inputs [132]. They can also be modified to incorporate
expectations about future output demand and input prices [133] or to account for
changes in natural resource stocks over time.

Uncertainty
CU can also be affected by uncertainty. Stochasticity in demand is one important
source of uncertainty that impacts CU [176]. When demand is stochastic, the firm
faces potential losses when capacity turns out to be too high or too low. If future
demand is higher than the chosen capacity can handle, the firm will lose potential
revenue from sales. Conversely, if capacity turns out to be too high relative to
demand, the firm must bear the cost of excess capacity. Hence, in an uncertain
environment, the choice of capacity (and hence CU) depends on both the distribution
of future demand and the magnitude of these two potential losses arising from a
mismatch of capacity and demand [176].
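
The trade-off between the two losses can be illustrated with a simple expected-loss calculation. The sketch below assumes a discrete demand distribution and constant per-unit losses for shortfalls and for excess capacity; these linear losses, the numbers, and the function names are assumptions made for illustration and are not necessarily the specification used in [176].

```python
from typing import Sequence


def expected_mismatch_loss(capacity: float, demands: Sequence[float],
                           probs: Sequence[float], loss_short: float,
                           loss_excess: float) -> float:
    """Expected loss from capacity below demand (lost sales) or above it (idle capacity)."""
    return sum(
        p * (loss_short * max(d - capacity, 0.0) + loss_excess * max(capacity - d, 0.0))
        for d, p in zip(demands, probs)
    )


def best_capacity(candidates: Sequence[float], demands: Sequence[float],
                  probs: Sequence[float], loss_short: float,
                  loss_excess: float) -> float:
    """Pick the candidate capacity with the lowest expected mismatch loss."""
    return min(
        candidates,
        key=lambda q: expected_mismatch_loss(q, demands, probs, loss_short, loss_excess),
    )


# Hypothetical demand distribution.
demands = [80.0, 100.0, 120.0]
probs = [0.25, 0.5, 0.25]

# Costly shortfalls push the chosen capacity up; costly idle capacity pushes it down.
print(best_capacity(demands, demands, probs, loss_short=5.0, loss_excess=1.0))
print(best_capacity(demands, demands, probs, loss_short=2.0, loss_excess=1.0))
print(best_capacity(demands, demands, probs, loss_short=1.0, loss_excess=4.0))
```

Holding the demand distribution fixed, changing only the relative size of the two losses moves the chosen capacity, and hence expected CU, across the three candidate levels.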
Investment that is largely irreversible, so that expenditures are mostly sunk
costs that cannot be recovered, creates an additional source of uncertainty [6, 38,
143, 144]. In addition, the investments can be delayed, giving the firm an oppor-
tunity to wait for new information to arrive about prices, costs, and other market
conditions before it commits resources [7, 143, 144]. Irreversibility makes invest-
ment especially sensitive to various forms of risk, such as uncertainty over future
product and input prices and operating costs that determine cash flows, uncertainty
over future interest rates, and uncertainty over the cost and timing of the investment
itself. Irreversibility can arise through capital that is firm or industry specific, so
that a different firm or industry cannot productively use the capital, and hence the
capital becomes a sunk cost. Government regulations or institutional arrangements
can also make capital expenditures a sunk cost. Irreversibility and sunk costs in turn
impact $P_K^*$, the shadow value of capital given by $H_K$ in Equation (2), and therefore
$CU_\pi = \pi/\pi^*$.
In natural resource industries, fluctuations in natural resource
stock size and production quotas, and more generally environmental variability,
introduce another source of uncertainty over the optimal level of capacity [88].
In industries where property rights are incomplete or poorly structured, individual
production rights can be imposed and made tradable between firms. These production
rights can then be considered a distinct type of capital in which investment is
required for a firm to operate, and the CU measure can be adjusted to account for
this additional source of investment [185].

Imperfect Competition
Imperfect competition can also lead to excess capacity, since it causes each firm
to produce below its cost-minimizing full capacity in order to earn higher profits.
Studies that have examined the relationship between imperfect competition and
capacity include Cassels [28], Gabszewicz and Poddar [81], Fagnart et al. [61, 62],
Kim [107], Todorova [180], and Dixon and Savagar [59].

Other Capacity and Utilization-Related Concepts

Input Capacity Based on Quasi-Fixed Inputs

The measures of capacity and CU discussed above focus on output levels or their
dual representation through costs or profits. However, as noted above, in many
industries, such as fisheries, regulations (designed, e.g., to promote sustainability
or other objectives) may limit the number of firms or plants, output per firm, and/or
the capital stock. Focusing only on output-based CU levels raises the question of what
should be the optimal input levels or the number of operating units. For example,
the literature on fisheries repeatedly stresses the potential for “overcapitalization”
or “excess harvesting capacity,” i.e., too many resources chasing too few fish or
production that is wasteful or not at minimum cost or maximum net benefit to
society.
In this case, the need is for information on capacity and CU that is based on
output levels, but expressed in terms of inputs, typically the capacity base given by
K [108]. This gives rise to the concept of input capacity, in which the output levels
are taken as exogenously fixed at the industry level or at the level of the individual
firm, and these in turn determine the corresponding appropriate level of K without
restrictions on the availability of variable inputs (borrowing from Johansen’s [97]
output-oriented plant capacity). The potential contraction in the capacity base K that
can still support existing output levels can be imputed and compared to the existing
capacity or capital level to create input-oriented CU measures.
This input-oriented CU measure can be formalized by first defining a measure of
potential capital K as the minimum amount of capital that, in a given unit of time,
produces the existing or target output under customary and usual working condi-
tions, provided the availability of variable factors of production is not restricted
[13].15,16 Formally, $CU_K = \text{Potential } K / K$, where the subscript K indicates that it is
a capital input-oriented measure of capital utilization. Note that, as with technical
output-based measures of CU, this measure can never exceed 1. A value of $CU_K < 1$
indicates the potential capacity or capital contraction that could be achieved and still
maintain current or target output levels. This definition is closely related to capital
utilization (see discussion below) or even input-oriented technical inefficiency,
where the key difference is the absence of any restriction on the availability of
variable inputs.17
Alternatively, input-based CU can be defined in a manner analogous to the
output-based definition of CU based on economic optimization. For example, under
the assumption of cost minimization, CU for capital could be defined as $CU_K = K^*/K$,
where $K^*$ is the long-run cost-minimizing level of capital given the level of
output. In this case, $CU_K$ can be greater or less than 1. As with the previous output
definitions based on optimizing behavior, it provides a signal regarding incentives
to invest when $CU_K > 1$ or disinvest when $CU_K < 1$.
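
The two input-oriented ratios can be illustrated with a minimal sketch. All numbers and the helper name `input_cu` are hypothetical and chosen only to show how the technical and economic versions are read; they do not come from any study cited here.

```python
def input_cu(reference_k: float, observed_k: float) -> float:
    """Ratio of a reference capital level to the observed capital stock."""
    if observed_k <= 0:
        raise ValueError("observed capital must be positive")
    return reference_k / observed_k

# Hypothetical values, for illustration only.
observed_k = 100.0
potential_k = 80.0   # minimum K that still supports current output
k_star = 120.0       # long-run cost-minimizing K for current output

cu_technical = input_cu(potential_k, observed_k)  # below 1: capital could contract
cu_economic = input_cu(k_star, observed_k)        # above 1: incentive to invest
```

In this example the technical measure suggests that a fifth of the capital base could be removed without lowering output, while the economic measure, built on a different reference level, signals an incentive to invest. The two can point in different directions because they answer different questions.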

Capacity Utilization Versus Capital Utilization

Capacity utilization, whether based on outputs or quasi-fixed inputs, is sometimes
conflated with capital utilization. However, these are two distinct concepts that
coincide only under some very specific conditions.
There are several definitions of capital utilization.18 Betancourt [18] refers to
capital utilization as the duration of operations of productive processes. Betancourt
and Clague [20] define capital utilization as the proportion of time that capital is
working productively. Bosworth and Dawkins [23] refer to capital utilization as the
timing of input flows and in particular to shift work and overtime. A traditional

15 More recently, Cesaroni et al. [29, 30] define an input-oriented plant capacity
measure that compares variable input levels relative to the amount of variable inputs compatible
with a zero output. This measure is discussed below.
16 The minimum stock of capital needed to produce a given level of output can be derived using the
factor requirements function or inverse production function (see [57, 167], and [108]). The factor
requirements function depicts the production possibilities set and relates the minimal amount of
an input required to produce a vector of outputs: $K_t = g(Y_{1t}, Y_{2t}, \ldots, Y_{Mt})$, where the outputs are
exogenously fixed and K is an endogenous stock.
17 Input-oriented technical inefficiency relates to how much the input use of a firm could contract
if used efficiently in order to achieve the same output level.
18 General summaries of capital utilization are given in Betancourt [19], Betancourt and Clague
[20], and Winston [187].

strand of the literature refers to the ratio of actual usage to a maximum value derived
from technical-engineering specifications [76]. Schworm [161], Hulten [94, 95],
and Lee [122] define capital utilization as the ratio of capital services to the stock
of capital. However, the endogenous capital utilization literature allows the rate of
capital utilization to vary with input prices and output levels or prices and hence treats
it as a choice variable for firms. The rate of capital utilization, whether in the form of the
length of time that capital is operated or the intensity of its operation per unit of
time, is determined by profit maximization.
The concept of capital utilization, which relates to underutilizing a given capital
stock, is distinct from the economic concept of capacity utilization described above.
Capital utilization captures how much of the existing capital stock is being utilized,
whereas CU provides information about short-run vs. long-run equilibrium levels of
output and capacity output relative to one or more fixed factors that include the
capital stock, and about economic incentives for investment and disinvestment. These
measures coincide only when there is only one fixed input, all variable inputs are
in fixed proportions to the fixed input, and production is characterized by constant
returns to scale [4, 13, 15].
One context in which confusion over the distinction between capital utilization
and capacity utilization has arisen is fisheries. The fisheries economics literature
[36–38], which typically assumes a single composite input (fishing effort), has
often conflated a version of capital utilization (i.e., utilization of fishing effort)
with capacity utilization.19 Although the specification of the canonical bioeconomic
model meets the restrictive conditions for these two concepts to be the same (see,
e.g., [36, 47, 89]), this model overlooks that the production relationship between
effort and catch is a stock-flow relationship, in which effort is applied to the stock
of natural capital (here fish) to produce a flow of output. Hence, variations in the
utilization of fishing effort do not provide a direct and linear relationship with
variations in catch, even under constant returns to scale in effort, and full utilization
of effort can give different catch levels depending upon the size of the stock of
natural capital. The effort aggregator function for rivalrous inputs must be linearly
homogeneous for consistent aggregation.20 Thus, although capital stock measures
are often used to measure potential or capacity output, as well as capital utilization,
the equivalence of these measures must be understood to be valid only under very
limited conditions.

Variable Input Utilization

Capacity utilization is also distinct from variable input utilization, which measures
the ratio of optimal use of a variable input to observed use [65]. As with economic
measures of capacity utilization, the behavioral objective determines the nature of

19 See Kirkley and Squires [108] for a review of this literature.
20 When there are externalities beyond those due to the common resource stock, such as congestion
or knowledge spillovers with technical change, then linear homogeneity is no longer enough to
ensure consistent aggregation [174].

the optimal input use, which can be based on cost minimization or maximization of
revenue, profit, or output. When the rate exceeds (falls short of) unity, that variable
input is overutilized (underutilized) relative to optimal utilization, and the firm or
plant has a shortage (surplus) of that variable input.
The difference between variable input utilization and capital utilization is that the
former refers to a variable input and the latter refers to a fixed or quasi-fixed factor.
Moreover, the variable input-based measures involve the returns to variable inputs,
given the existing capital base, that result from an expansion in scale of production
holding the capital factors fixed [142]. Both this and the output-based measures
of CU entail imputing the potential for expansion of output, given fixed inputs.
However, the input-based measure reflects returns to a particular input, rather than
returns to scale, and the measures will only be numerically the same with constant
returns to the variable input.

Capacity Utilization and Productivity

Capacity utilization has also played an important role in studies of productivity.
Productivity measures within the Denison-Kendrick-Jorgenson-Griliches-Solow
framework (see [99, 100]) traditionally assumed that producers are in long-run
equilibrium, so that the firm’s output is always at the long-run equilibrium level,
i.e., the point of tangency between the short-run unit or average total cost curve and
the long-run unit or average cost curve. However, firms may instead be in short-
run or temporary equilibrium [14, 16, 94, 134] due to unexpected demand shocks
that in turn lead to under- or overutilization of capacity or sudden changes in factor
prices. Temporary equilibrium, such as that associated with the business cycle, can
bias measured productivity growth away from its long-term growth path. Periods
of low growth and low demand alternate with periods of high growth and growth
above long-term trends. Stocks of capital and other quasi-fixed or fixed factors of
production are difficult to rapidly adjust, so that periods of low growth and low
demand are associated with underutilization of capital, other quasi-fixed factors,
and capacity.
The total factor productivity (TFP) approach to measuring productivity assumes
that the total growth rate of real output is comprised of two components: one that
reflects movements along the aggregate production function due to growth rates
of the factor inputs and a residual that reflects shifts in the aggregate production
function due to changes in the efficiency of production, i.e., TFP
[94, 170]. However, for the reasons described above, the TFP residuals
calculated using capital stock data fluctuate procyclically along with the rate of
utilization [14, 94, 96], and these fluctuations in the rate of service flows and
utilization of the stocks in turn tend to obfuscate the longer-term movements in
components of the TFP residual. Thus, when accounting for changes in utilization,
changes in TFP should be decomposed into three components: changes in variable
inputs, changes in both the stocks and utilization of quasi-fixed or fixed factors, and
productivity [14, 94]. The resulting decomposition indicates the amount by which
the efficiency-driven change is cyclical and how much is driven by the long-run
trend [14].21
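
The procyclical bias in a stock-based residual can be illustrated with a minimal growth-accounting sketch. It is not the Berndt-Fuss procedure: it simply assumes constant returns to scale, a hypothetical labor share, and capital services measured as a utilization rate times the capital stock. The function name and all numbers are illustrative.

```python
import math

def solow_residual(y0, y1, l0, l1, k0, k1, labor_share, u0=1.0, u1=1.0):
    """Log-difference TFP residual with capital services measured as u * K.

    Assumes constant returns, so the capital share is 1 - labor_share.
    """
    dln_y = math.log(y1 / y0)
    dln_l = math.log(l1 / l0)
    dln_ks = math.log((u1 * k1) / (u0 * k0))  # growth of capital services
    return dln_y - labor_share * dln_l - (1.0 - labor_share) * dln_ks

# Hypothetical downturn: output falls 3%, input stocks are unchanged,
# and the utilization rate of capital drops from 0.85 to 0.78.
raw = solow_residual(100, 97, 50, 50, 200, 200, labor_share=0.7)
adjusted = solow_residual(100, 97, 50, 50, 200, 200, labor_share=0.7,
                          u0=0.85, u1=0.78)
```

With unadjusted stocks, the whole fall in output is attributed to productivity. Once the drop in utilization is counted as a fall in capital services, the measured productivity decline is much smaller, which is the sense in which utilization changes obscure the longer-term movement in the residual.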
An alternative approach to the Berndt-Fuss one uses Johansen’s [97] concept of
plant capacity to allow for variations in CU in a primal, nonparametric specification
of technology [48, 189]. This approach decomposes the Malmquist productivity
index into technical efficiency change, variations in plant capacity utilization, and
production frontier shifts.

Measurement of Capacity and CU

Given the conceptual foundations described above, we turn now to an overview of
the empirical measurement or estimation of capacity and CU. The measures can
be classified as macroeconomic and microeconomic (firm level), where the latter
include both frontier-based and optimization-based approaches.

Macroeconomic Approaches

United States (US) Federal Reserve

The US Federal Reserve calculates macroeconomic measures of output, capacity,
and CU for the United States’ industrial sector, comprised of industries within
the manufacturing, mining, and electric and gas utilities subsectors, creating the
Federal Reserve’s Industrial Production and Capacity Utilization index [44, 131,
149, 166].22 The Federal Reserve and the Census Bureau have adopted a definition
of capacity that assumes the full employment of all variable factors of production
and the use of only the equipment in place and ready to operate to give “full
production.”23 The CU measure is the ratio of the actual level of output to this
definition of sustainable maximum level of output or capacity, defined as the
maximum output each plant in a given industry can maintain within the framework
of a realistic work schedule, taking account of normal downtime and assuming

21 The Berndt-Fuss [14] approach to the Solow residual does not, however, remove the procyclical
component of the residual that can arise, for example, through entry and exit of firms over the
business cycle. This approach to utilization also does not generalize to multiple capital goods.
22 The Federal Reserve’s capacity and CU measures are derived through a complex procedure
[44, 131, 148, 166]. Since 1990, the Bureau of the Census’s Survey of Plant Capacity has formed the
primary source of utilization rates. The Federal Reserve combines the available survey evidence on
utilization with their industrial production indices to obtain a consistent system of output, capacity,
and CU [44, 166].
23 The survey questions on rates of utilization that underlie these measures have
changed somewhat over time. Nonetheless, the definitions appear close enough that the time series
are treated as a single series without any ad hoc adjustments.

sufficient availability of inputs to operate the machinery and equipment that is in
place. In principle, this is the engineering-technical measure discussed by Klein
[113], Klein and Summers [115], Klein et al. [116], etc. However, as Shapiro
[166] observes, “The Federal Reserve procedure, moreover, mixes engineering and
economic notions of capacity, particularly in its assumption that seasonal peaks in
output are unsustainable.”

Peak-to-Peak Measurement of CU
The peak-to-peak approach of Klein [113], Klein and Summers [115], and Klein
and Preston [114], also known as trend through peaks, is another macroeconomic
measure. It is an interpolation of peak values of output per unit input over time,
adjusted for technical change [35, 114, 132, 135, 149]. This nonparametric approach
intends to reflect the maximum attainable output, given K. Peaks in production
per unit of K are used to represent full capacity, and linearly interpolated output-
capital stock ratios between peak years are used, in conjunction with data on K, to
estimate the maximum attainable output between peak years (essentially a straight
line is drawn between peaks). This approach calculates the ratio of a composite
output to K (a ratio of output to an input), identifying the peak values over time,
assuming that these peaks depict the full CU given normal operating and economic
conditions. This approach is essentially a partial productivity measure, Y/K, and as
such suffers from the standard problem of ignoring scarcity of other factors besides
K and input substitution possibilities. Ragan [149] and Christiano [35] discuss many
of the limitations and extensions to this approach, such as modified trend through
peaks of Dhrymes [56] and demand for capital of Hickman [92].
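
The interpolation logic can be sketched as follows. This is a simplified illustration, not the Wharton procedure: peaks are taken to be local maxima of Y/K, the first and last observations are treated as peaks, and no adjustment is made for technical change. The function name and data are hypothetical.

```python
def peak_to_peak_cu(output, capital):
    """Return (capacity, cu) lists from a trend-through-peaks interpolation."""
    if len(output) < 2 or len(output) != len(capital):
        raise ValueError("need two or more matching observations")
    ratio = [y / k for y, k in zip(output, capital)]
    n = len(ratio)
    # Interior local maxima of Y/K; endpoints are treated as peaks (assumption).
    interior = [t for t in range(1, n - 1)
                if ratio[t] >= ratio[t - 1] and ratio[t] > ratio[t + 1]]
    peaks = [0] + interior + [n - 1]
    trend = [0.0] * n
    for a, b in zip(peaks[:-1], peaks[1:]):
        for t in range(a, b + 1):
            w = (t - a) / (b - a)
            trend[t] = (1 - w) * ratio[a] + w * ratio[b]  # straight line between peaks
    capacity = [r * k for r, k in zip(trend, capital)]    # capacity output given K
    cu = [y / c for y, c in zip(output, capacity)]
    return capacity, cu

# Hypothetical annual series.
output_series = [50, 45, 48, 60, 54, 57, 66]
capital_series = [100, 100, 100, 110, 110, 115, 120]
capacity, cu = peak_to_peak_cu(output_series, capital_series)
```

By construction CU equals 1 in the peak years and falls below 1 between them in this example. With a cruder peak definition such as the one used here, an observation between two peaks can lie above the interpolated line, which is one reason applied work selects major cyclical peaks rather than every local maximum.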
The Wharton School index of CU is an example of a peak-to-peak measure [114,
149]. Cyclical peaks for each of the component indices of the Federal Reserve
Board’s Index of Industrial Production are denoted, and then linear segments
between successive peaks are fitted. The trend lines through peaks are assumed to
represent an index of capacity output, on a base of actual output, which is the same
base used in the Federal Reserve Board’s Index of Industrial Production. Harris
and Taylor [90] estimate Cobb-Douglas and CES production functions using peak
output data to estimate capacity output, which in turn is used to estimate CU. This
CU estimate then serves as a benchmark for evaluating the estimates produced by the
Wharton method and the output-capital method.

Full Employment Maximum Output

Another macroeconomic measure of capacity and CU depends on the notion that
the full employment level determines the possible or potential capacity output [80,
135, 140]. This approach is based on the “maximum” level of labor input, although
it can include other inputs that are assumed to be important to “fully” employ.
Incorporating only a single input excludes the impact that other inputs can have
on the maximum output. Incorporating multiple inputs yields a measure that is
essentially the same in principle as the Federal Reserve measure that assumes the
full employment of all variable factors of production. However, it raises the question
of how to define “full” or “maximum” levels of these other inputs.

Production Function Approach

Another macroeconomic approach, which is a variant of the Federal Reserve and
full employment maximum approaches, was developed based on the production
function. Klein [113] and Klein and Preston [114] develop the production function
approach to capacity and CU for a firm or industry, and Christiano [35] provides
a comprehensive survey. The production function for actual operations may be
specified as: $Y = f(L, D)$, where (following Klein) Y denotes output as a flow, L
denotes labor measured as actual employment and a flow, and D here denotes the
actual flow of capital services. At full capacity, $Y_C = f(L_f, K)$, where $Y_C$ denotes
full capacity output, $L_f$ denotes the input flow of the fully employed work force,
and K denotes the stock of capital that is fully utilized. Capacity output is thus the
production flow associated with the input of fully utilized manpower, capital, and
other relevant factors of production.
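
A minimal numerical sketch of this definition follows. The Cobb-Douglas form, its parameters, and the input levels are assumptions made only for illustration, and the flow of capital services D is treated as measured in the same units as the fully utilized stock K.

```python
def cobb_douglas(labor, capital_input, a=1.0, alpha=0.7):
    """Y = a * L**alpha * K**(1 - alpha); the functional form is an assumption."""
    return a * labor ** alpha * capital_input ** (1.0 - alpha)

# Actual operations: Y = f(L, D), with employment 90 and capital services 160.
actual_output = cobb_douglas(labor=90.0, capital_input=160.0)

# Full capacity: Y_C = f(L_f, K), with full employment 100 and capital stock 200.
capacity_output = cobb_douglas(labor=100.0, capital_input=200.0)

cu = actual_output / capacity_output
```

Because both inputs are below their fully employed levels, CU is below 1. The result depends entirely on how the "full" levels of labor and capital are defined, which is the measurement difficulty the production function approach has to confront.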
Since actual capital and labor cannot be assumed fully utilized, this method
requires some adjustment of the inputs [8, 35, 165], which gives rise to the
utilization-adjusted production function approach. This approach (1) uses available
surveys to estimate the “natural rates” of capacity and labor utilization above which
inflation begins to accelerate; (2) estimates a production function with utilization-
adjusted capital and labor inputs; and (3) defines potential output as the level of
output obtained when both capital and labor are at their estimated natural rates
[8, 141]. This basic approach, employed by the International Monetary Fund,
requires the existence of an aggregate production function and the natural rate of
unemployment and hence is a macroeconomic measure. This approach can also be
estimated with adjustments for the mean age of the capital stock to account for
vintage effects [8, 35].

Microeconomic Frontier-Based Approaches

Production economists have developed empirical methods for estimating measures
of capacity and CU based on microeconomic foundations and the theory of the firm.
These include both frontier methods (based on production or distance functions)
and optimization-based methods (based on cost, profit, or revenue functions). The
frontier methods are described here, followed by a description of optimization-based
methods in “Microeconomic Optimization-based Approaches.”

Production Frontier Methods

The capacity measures generated from frontier models in some cases have been
deemed technological-economic, because the analysis restricts the production
technology to those observed in the data, which inherently captures underlying
economic decisions [70, 91]. Within this framework, the capacity estimate refers
to the maximum potential or frontier level of output that could be produced
given the fixed factors, states of technology and environment, and unconstrained
variable factors. Empirically estimating the production technology by frontier
methods involves fitting the production function to the data points representing
observed output-input combinations. This can be estimated nonparametrically or
parametrically.

Data Envelopment Analysis: A Nonparametric Frontier Approach

Several nonparametric frontier primal approaches are available to estimate capac-
ity output and CU for both output-oriented and input-oriented approaches. The
most widely applied approach, proposed by Färe et al. [67],24 is based on Data
Envelopment Analysis (DEA) developed by Charnes et al. [32], which implements
the Johansen [97]-Färe [65] plant capacity definition. DEA has also been used to
develop and measure output-oriented, economic capacity and CU for the behavioral
objectives of profit maximization [26, 41], revenue maximization [70, 125], and cost
minimization [49, 154].
DEA is a form of activity analysis or linear programming that builds off of Farrell
[72, 151]. This approach is typically deterministic, so that the frontier envelops all
observations, and no allowance is made for outputs that could potentially lie above
the best-practice frontier due to stochastic factors. DEA uses linear programming
to construct a piecewise linear representation of the production frontier. This is
usually done nonparametrically, i.e., without an explicit functional form, although
a parametric approach has been used [2]. These piecewise linear frontiers envelop
the observations as tightly as possible subject to certain minimal production axioms.
Deviations from the frontier can result from technical inefficiency and variable input
use that differs from the firms that define the best-practice frontier.
The Färe et al. [67] DEA specification, an output-oriented DEA linear program,
computes the maximum proportionate increase in outputs when variable inputs are
allowed to vary and be fully utilized but fixed inputs are bounded by their observed
value for each observation or firm.25 The chapter entitled “Stochastic Frontier
Analysis: Foundations and Advances I” in this volume develops this DEA model.
The radial DEA approach to capacity measurement effectively converts the multiple
products into a single composite output through a radial expansion of outputs.
Outputs kept in fixed proportions for different input levels give Leontief aggregation
and the ray measure of capacity and CU considered by Segerson and Squires [162].
The DEA approach effectively converts the heterogeneous capital stock (multiple
fixed factors) into a single measure of the capital stock, i.e., a composite fixed factor,
to solve the indeterminacy problem raised by Berndt and Fuss [15].
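
The structure of the output-oriented plant capacity program can be sketched for a deliberately small special case: one output, one fixed input, and variable returns to scale (an assumption). The program maximizes the weighted sum of observed outputs subject to the weighted fixed input not exceeding the firm's own, with nonnegative weights summing to one and variable inputs left unconstrained. In this special case an optimum uses at most two observed firms, so the sketch enumerates candidate solutions instead of calling a linear programming solver, which a general multi-output, multi-input application would need. All names and data are hypothetical.

```python
from itertools import combinations

def plant_capacity_output(k0, firms):
    """Max of sum(z*y) s.t. sum(z*K) <= k0, sum(z) = 1, z >= 0.

    `firms` is a list of (output, fixed_input) pairs. Variable inputs are
    unconstrained, so they do not appear in the program.
    """
    best = max((y for y, k in firms if k <= k0), default=None)
    if best is None:
        raise ValueError("no observed firm is feasible for this fixed input")
    for (y1, k1), (y2, k2) in combinations(firms, 2):
        if min(k1, k2) < k0 < max(k1, k2):
            w = (k0 - k1) / (k2 - k1)  # weight on the second firm
            best = max(best, (1 - w) * y1 + w * y2)
    return best

def plant_capacity_utilization(index, firms):
    """Observed output relative to capacity output for one firm."""
    y0, k0 = firms[index]
    theta = plant_capacity_output(k0, firms) / y0  # radial expansion factor
    return 1.0 / theta

# Hypothetical (output, fixed input) observations.
firms = [(10, 5), (18, 10), (20, 20), (12, 10), (14, 15)]
cu_values = [plant_capacity_utilization(i, firms) for i in range(len(firms))]
```

The ratio computed here compares observed output with capacity output, so it mixes unused capacity with technical inefficiency. A version that nets out technical inefficiency would also need an efficiency score from a program in which variable inputs are constrained, which is omitted from this sketch.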

Stochastic Production Frontier

The stochastic production frontier represents the efficient relationship between
inputs and outputs [3, 84, 130, 178]. That is, the firms or plants that produce

24 Eilon and Soesan [60] suggested but did not empirically implement the use of linear programming.
25 Other nonparametric frontier models are available that are consistent with the underlying
production technology [184], including the free disposal hull model and the “order-m” frontier,
which have been used to study efficiency in other industries [25, 45, 169].

the maximum potential output given a bundle of inputs define the best-practice
production frontier and are technically efficient. Firms or plants that produce less
output given a comparable input bundle, technology, etc. are below the frontier
and are technically inefficient. Inputs can be flows or stocks, depending upon
the situation and specification. The production frontier could also include control
variables such as the state of the environment. The stochastic approach allows for
random or stochastic effects upon production, such as weather, bad luck, machine
breakdowns, and the like. The stochastic production frontier can be extended to
include an equation that explains the technical inefficiency according to exogenous
or predetermined variables and that is simultaneously estimated by maximum
likelihood [10, 11, 40, 120, 156]. The integrated approach provides consistency in
assumptions about the distribution of the inefficiencies. While stochastic frontier
techniques have been developed primarily to estimate technical efficiency, they can
be readily modified to measure capacity utilization [101, 108–110, 142].

Stochastic Multi-product Distance Function

The single-product stochastic production frontier discussed above does not allow
for situations where firms produce multiple outputs. To address this concern,
Kirkley and Squires [108] and Färe and Grosskopf [66] proposed the primal multi-
output stochastic distance function for multi-product technologies, which permits
various assumptions about how the composition of multiple outputs may change
when operating at full capacity.26 Felthoven and Morrison-Paul [74] empirically
estimate such a function for an Alaskan fishery. Fousekis [77] and Felthoven
[73] estimated a stochastic ray distance function [126] for multi-product fisheries.
Such an approach maintains outputs in fixed proportions and effectively converts
the multiple outputs into a single output. (See chapters on “Index Numbers
and Productivity Measurement,” and “Stochastic Frontier Analysis: Foundations
and Advances I” in this volume for more detailed discussion.) As with the
stochastic production frontier, the inefficiency function can be explicitly modeled
and simultaneously estimated with the stochastic multi-product distance function
[93].
The multi-product distance function can be extended to an input orientation
[39]. Rather than looking at how the output vector can be proportionally expanded
with the input vector held fixed, an input-oriented distance function considers
by how much the input vector can be proportionally contracted with the output
vector held fixed. The input distance function is defined on the input set, L(Y), as
$D_I(Y, X) = \max\{\rho : (X/\rho) \in L(Y)\}$ [167]. The input set, L(Y), represents the set
of all input vectors X that can produce the output vector Y, where the properties
of L(Y) are given by Shephard [167]. Under constant returns to scale, the input
distance function is equivalent to the inverse of the output distance function (i.e.,

26 Other options for considering multiple outputs in a general stochastic frontier framework include
the use of polar coordinates, canonical regression, and instrumental variables, although all of these
options have some limitations.

$D_O = 1/D_I$) [69]. Econometric estimation of the input-oriented stochastic
multi-product distance function is similar to the output-oriented distance function, where
homogeneity of degree one in inputs is imposed on a parametric form of $D_I(Y, X)$,
such as the translog [39]. Inputs are normalized by one of the inputs, say $X_n$.
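
The definition of the input distance function and its reciprocal relation to the output distance function under constant returns to scale can be checked numerically. The sketch below assumes a hypothetical single-output, two-input Cobb-Douglas technology with equal exponents and finds the largest feasible contraction factor by bisection; none of this is taken from the cited studies.

```python
def f(x1, x2):
    """Hypothetical constant-returns technology (Cobb-Douglas, equal exponents)."""
    return (x1 * x2) ** 0.5

def input_distance(y, x1, x2, tol=1e-12):
    """Largest rho such that (x1/rho, x2/rho) can still produce y."""
    lo, hi = 1e-9, 1e9
    while hi - lo > tol * hi:
        mid = (lo * hi) ** 0.5            # geometric midpoint
        if f(x1 / mid, x2 / mid) >= y:
            lo = mid                      # still feasible: inputs can shrink more
        else:
            hi = mid
    return lo

def output_distance(y, x1, x2):
    """Smallest theta such that y/theta is producible from (x1, x2)."""
    return y / f(x1, x2)

d_i = input_distance(6.0, 4.0, 16.0)   # f(4, 16) = 8, so d_i is close to 8/6
d_o = output_distance(6.0, 4.0, 16.0)  # 6/8
```

Here the input bundle could be scaled down by a factor of about 1.33 and still produce the observed output, and the product of the two distance values is 1 up to numerical tolerance. Under non-constant returns the two measures would no longer be reciprocals.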
The stochastic multi-product distance function can also be extended to the
directional distance function [66]. The directional distance function is associated
with an explicit direction in which efficiency is gauged. It requires specification of
a direction vector, so that outputs are expanded and inputs are contracted in that
direction. We do not develop this approach further, but note that it can be adapted to
account for capacity and CU.
Finally, as with the single-product production functions and frontiers, the para-
metric multiple-product stochastic distance function approach raises identification
issues due to potential endogeneity of the regressors (outputs, inputs), which
can lead to biased and inconsistent parameter estimates [157]. However, semi-
parametric alternatives are available for both the output and input distance functions.
The earliest approach is Corrected Ordinary Least Squares, implemented by Lovell
et al. [128] for multi-product stochastic distance functions.27 Pitt and Lee [145]
and Schmidt and Sickles [159], reviewed by Sickles [168], developed this approach
within a panel data context.28 The advantage of this semi-parametric approach
is that it avoids the need to specify particular parametric distributions of the
inefficiency term. Although the distribution of the inefficiency term is one-sided,
the terms are intrinsically latent and unobservable components. The assumption that
inefficiency is time-invariant is quite strong, although the model is relatively simple
to estimate if efficiency is specified as a fixed parameter instead of as a random
variable. (Greene [84] proposed flexible one-sided distributions such as the gamma,
which allows the distribution to be shaped by the data.) Sickles [168] reviews other
panel frontier estimators that differ from the parametric approach.
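
The two-step logic of corrected ordinary least squares can be sketched in a simplified setting. The example is a single-output, single-input log-linear production frontier, not the multi-product distance function estimated in the cited work; in this production-function form the intercept is shifted up by the largest positive residual. Function names and data are hypothetical.

```python
import math

def cols_efficiency(outputs, inputs):
    """Two-step corrected OLS on ln y = b0 + b1 * ln x.

    Returns (frontier_intercept, slope, technical_efficiency_scores).
    """
    ly = [math.log(y) for y in outputs]
    lx = [math.log(x) for x in inputs]
    n = len(ly)
    mx, my = sum(lx) / n, sum(ly) / n
    # Step 1: ordinary least squares through the center of the data.
    b1 = (sum((a - mx) * (b - my) for a, b in zip(lx, ly))
          / sum((a - mx) ** 2 for a in lx))
    b0 = my - b1 * mx
    resid = [b - (b0 + b1 * a) for a, b in zip(lx, ly)]
    # Step 2: shift the intercept so the line bounds the data from above.
    shift = max(resid)
    te = [math.exp(e - shift) for e in resid]  # 1.0 for the best-practice firm
    return b0 + shift, b1, te

# Hypothetical observations.
inputs_obs = [1.0, 2.0, 4.0, 8.0]
outputs_obs = [1.0, 1.2, 2.1, 2.4]
frontier_b0, slope, te = cols_efficiency(outputs_obs, inputs_obs)
```

The firm with the largest residual defines the frontier and receives an efficiency score of 1, and every other firm is measured relative to it. No distribution is assumed for the inefficiency term, but the whole deviation from the shifted line is treated as inefficiency, so noise in the best-practice observation carries over to every score.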

Stochastic Ray Production Functions

Löthgren [126, 127] proposed the stochastic ray production function as an alterna-
tive to the stochastic multiproduct distance function. Felthoven [73] and Felthoven

27 The COLS function is fitted in two steps. The first step involves interpreting the unobservable
term $-\ln D_{Oj}$ as a two-sided i.i.d. $N(0, \sigma_v^2)$ distributed random error, $v_j$, and estimating the
(translog) distance function using ordinary least squares (OLS). In the second step, the OLS
estimate of the intercept parameter, $\gamma_0$, is adjusted by adding the largest negative OLS residual
to it, so that the function no longer passes through the center of the observed data but bounds
the data from above (i.e., envelops the data as a frontier). The distance measure for firm j,
j = 1, 2, ..., J, is then calculated as the exponent of the corrected OLS residual, forming the technical
inefficiency term.

28 The fixed-effects model can be written as [159]:
$\ln Y_{jt} = \beta_0 + \ln X_{jt} + v_{jt} - u_j = (\beta_0 - u_j) + \ln X_{jt} + v_{jt} = \alpha_j + \ln X_{jt} + v_{jt}$.
Once the $\hat{\alpha}_j$ are available, $\hat{u}_j = \max_i \hat{\alpha}_i - \hat{\alpha}_j \ge 0$. The approach
assumes that the most efficient unit in the sample is 100% efficient. Firm-specific efficiency can be
obtained from $\widehat{TE}_j = \exp(-\hat{u}_j)$. Cornwell et al. [43] extend the model to allow for time-varying
efficiency, and Rashidghalam et al. [150] give further specifications and discussion.

and Morrison-Paul [74] empirically applied this approach to a primal estimate of
capacity. The ray frontier is conditioned on the polar-coordinate angles representing
the output mix and the inputs [77, 127]. All random error terms are specified to affect
the outputs radially, given the output mix that is exogenous under the specification.
A stochastic version of the ray production function can also be defined. As with the
stochastic production frontier, partitioning the input vector into fixed and variable
inputs, $X_K$ and $X_v$, and only specifying $X_K$ in the stochastic ray production function
gives a measure of Johansen’s plant capacity output. Another option estimates with
both $X_K$ and $X_v$ and evaluates at maximal values of $X_v$, which would differ from
Johansen’s plant capacity output since $X_v$ is bounded.
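
The polar-coordinate representation of the output mix can be sketched as follows. The sketch assumes the standard transformation in which an output vector is written as its Euclidean norm times a direction determined by M - 1 angles; this is taken here to be the representation intended, and the function names are hypothetical.

```python
import math

def to_polar(outputs):
    """Return (norm, angles) for a nonnegative output vector with M >= 2 entries."""
    norm = math.sqrt(sum(y * y for y in outputs))
    angles, sin_prod = [], 1.0
    for y in outputs[:-1]:
        denom = norm * sin_prod
        # Clamping guards against rounding slightly outside [-1, 1].
        theta = math.acos(max(-1.0, min(1.0, y / denom))) if denom > 0 else 0.0
        angles.append(theta)
        sin_prod *= math.sin(theta)
    return norm, angles

def from_polar(norm, angles):
    """Rebuild outputs: y_i = norm * cos(theta_i) * product of earlier sines."""
    outputs, sin_prod = [], 1.0
    for theta in angles:
        outputs.append(norm * math.cos(theta) * sin_prod)
        sin_prod *= math.sin(theta)
    outputs.append(norm * sin_prod)  # last output: the cosine term is set to 1
    return outputs

norm, angles = to_polar([2.0, 3.0, 6.0])
rebuilt = from_polar(norm, angles)
scaled_norm, scaled_angles = to_polar([4.0, 6.0, 12.0])
```

Scaling all outputs by a common factor changes only the norm and leaves the angles unchanged. This is why the norm can serve as the single dependent variable of the ray frontier while the angles enter as regressors describing the output mix.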
Estimation can take several approaches. As with the stochastic production
frontier, the inefficiency function can be explicitly modeled and simultaneously
estimated with the stochastic ray production function. In this case, $u_i$ is distributed
as a truncation at zero of the $N(\delta Z, \sigma_u^2)$ distribution [93]. Alternatively, Corrected
Ordinary Least Squares (COLS) can be used [128], as discussed above. Estimates of
industry capacity output are then obtained by horizontal summation of the individual
firm capacity outputs [73]. As with the single-product stochastic production frontier
models, CU measures can be constructed as the ratio of technically efficient
output to capacity output. Firm-level technical efficiency scores (i.e., estimates of
$D_O(Y_i, X_i)$) can be computed as $TE_i = E[\exp(-u_i) \mid e_i]$, where $e_i$ denotes the composite
error term. Kumbhakar and Lovell [119] provide further details on the specific
formula used in the conditional expectation and the likelihood function for the
maximum likelihood estimation.

Nonparametric Deterministic Frontier for Minimum Cost- and Profit-Based Capacity
The above measures focus on use of frontier methods to estimate technological-
economic measures of capacity rather than cost- or profit-based measures. However,
nonparametric frontier methods can also be used to estimate measures of capacity
and CU that explicitly incorporate profit-maximizing behavior. For example, Coelli
et al. [41] developed linear programming models to estimate their short-run
profit measure of capacity and CU. Their approach involves the following steps:
(1) develop a piecewise linear capacity possibility frontier, essentially applying
the Johansen [97] plant capacity framework, to obtain a measure of ray capac-
ity, (2) estimate the maximum short-run profit for each firm, (3) estimate the
output-oriented technical efficiency of each firm using the standard DEA linear
programming approach, and (4) solve a fourth and final linear programming model
to obtain the ray economic capacity measure for each firm.
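
Step (1) can be illustrated for the special case of one output and one fixed input, where variable inputs are left unconstrained and the piecewise linear capacity frontier (under variable returns to scale) can be traced by enumerating vertices instead of calling a linear programming solver. This is a simplified sketch and not the models in [41]; the function name capacity_output and the data are hypothetical.

```python
def capacity_output(K0, firms):
    """Largest output on the convex (VRS) hull using at most K0 of the fixed input.

    firms is a list of (fixed_input, output) pairs. With one fixed input and
    the convexity constraint, an optimal solution mixes at most two firms.
    """
    # Vertices that use a single observed firm
    best = max(y for K, y in firms if K <= K0)
    # Vertices that mix two firms so the fixed-input constraint binds exactly
    for Ki, yi in firms:
        for Kj, yj in firms:
            if Ki < K0 < Kj:
                lam = (Kj - K0) / (Kj - Ki)
                best = max(best, lam * yi + (1.0 - lam) * yj)
    return best

# Hypothetical firms: (fixed input, observed output)
firms = [(10.0, 4.0), (20.0, 10.0), (40.0, 14.0), (30.0, 9.0)]

# Ratio of observed output to capacity output for each firm
cu = [y / capacity_output(K, firms) for K, y in firms]
```

In this example only the last firm lies below the capacity frontier, which at its level of fixed input is spanned by the second and third firms.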
Similarly, Briec et al. [26] propose the following approach to implement the
nonparametric, deterministic frontier-based approach to minimum cost- and profit-
based notions of capacity and CU: (1) estimate the short-run minimum average
total cost by solving a variable cost function relative to a constant returns to
scale technology [146], (2) estimate the long-run minimum average total cost by
computing a total cost function relative to a constant returns to scale technology
[151], and (3) given fixed inputs but endogenous outputs, such that installed capacity
is utilized ex post at a tangency level, compute the output level and costs at the
tangency between the short-run and long-run average cost functions. Briec et al.
[26] also develop a comparable notion of economic capacity under variable profit
maximization with multiple endogenous outputs.
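
The tangency notion in step (3) can be illustrated with a parametric example, bearing in mind that the approach in [26] is nonparametric. Assume, purely for illustration, the constant returns to scale technology y = sqrt(K * L), with fixed input K priced at r and variable input L priced at w. Short-run average total cost is then w*y/K + r*K/y, long-run average cost is the constant 2*sqrt(w*r), and the two curves touch at y* = K*sqrt(r/w). All names and numbers below are hypothetical.

```python
import math

def sr_average_total_cost(y, K, w, r):
    """Short-run average total cost when y = sqrt(K * L), so L = y**2 / K."""
    variable_cost = w * y ** 2 / K
    fixed_cost = r * K
    return (variable_cost + fixed_cost) / y

def lr_average_cost(w, r):
    """Long-run average cost after K is chosen optimally (constant under CRS)."""
    return 2.0 * math.sqrt(w * r)

def tangency_output(K, w, r):
    """Output at which short-run and long-run average cost are tangent."""
    return K * math.sqrt(r / w)

K, w, r = 16.0, 1.0, 4.0
y_star = tangency_output(K, w, r)     # cost-based capacity output
observed_y = 24.0
cu = observed_y / y_star              # utilization relative to tangency output
```

A ratio below one indicates that the installed fixed input is larger than the level that would minimize the cost of the observed output.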
Capacity can also be incorporated into decompositions of static economic effi-
ciency that integrate primal and economic (dual) notions of CU using nonparametric
production frontiers [41, 49]. For example, Coelli et al. [41] use ray economic
capacity (the largest radial expansion or contraction of the output vector coinciding
with the largest possible profit) to decompose the gap between observed and
maximum short-run profit into unused capacity, technical efficiency, input-mix
allocative inefficiency, and output-mix allocative inefficiency.

Microeconomic Optimization-Based Approaches

The methods described in the previous section are based on the estimation of frontier
models that explicitly link inputs to outputs. These typically yield output-based, i.e.,
primal, measures of capacity and CU.
Dual measures of CU, on the other hand, are typically estimated through
parametric estimation of variable cost functions G(Y, W, K), variable profit functions
H(P, W, K), or revenue functions R(P, K). The most important issue in choosing
whether to estimate a cost, profit, or revenue function is whether the firm minimizes
costs for a given output bundle or maximizes revenue or profits. A related determi-
nant is whether outputs are endogenous. Endogenous outputs can be incorporated
by estimating a profit or revenue function (depending on whether some inputs
are variable or not), or by estimating a cost function that includes marginal cost
functions for outputs and uses instrumental variables to account for endogenous
outputs Y. Typically translog or generalized Leontief flexible functional forms are
specified, although other functional forms are occasionally used.
The minimum variable cost function G(Y, W, K), in which the firm minimizes
costs for a fixed (exogenous or predetermined) output bundle, is simultaneously
estimated with conditional demand functions for variable inputs derived using
Shephard’s Lemma, using either Zellner’s seemingly unrelated regression (SUR)
or maximum likelihood (ML) (both typically iterated to convergence) to account
for correlation of errors across equations. Estimating a system of equations gives
greater efficiency of parameter estimates. The variable profit function H(P, W, K) is
simultaneously estimated with variable input demand and variable output supply
equations obtained through Hotelling’s Lemma, again using either SUR or ML
(iterated to convergence). The revenue function R(P, K) is simultaneously estimated
with variable supply functions, using either SUR or ML estimation. When data are
time series, either a time trend or time fixed effects are included to account for
changes over time, including exogenous technical change. When data are panel,
firm fixed or random effects account for unobserved differences among firms in
managerial skill or in other characteristics of the
firm or the environment in which it operates (along with either time fixed effects
or a linear time trend). When data are available on managers, a three-way fixed-
effect model accounts for manager skill, firm characteristics, and time fixed effects
[188]. Standard errors should be heteroscedasticity- and autocorrelation-consistent.
When tests support homothetic output separability or Hicks-Leontief aggregation,
the number of outputs can be reduced, which gives a two-stage production process in
which some outputs are combined into an aggregate output under the assumption of
revenue maximization. Similar approaches can
aggregate variable inputs.
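
The conditional demand equations estimated alongside G(Y, W, K) are obtained by differentiating the cost function with respect to the variable input prices. The sketch below checks this numerically for a hypothetical generalized Leontief variable cost function with two variable inputs. The functional form, the parameter values, and the function names are assumptions chosen only for illustration.

```python
import math

# Hypothetical parameters of a generalized Leontief variable cost function
B11, B22, B12, GAMMA = 0.4, 0.3, 0.2, 0.5

def variable_cost(Y, w1, w2, K):
    """G(Y, W, K): linearly homogeneous in the input prices (w1, w2)."""
    price_index = B11 * w1 + B22 * w2 + 2.0 * B12 * math.sqrt(w1 * w2)
    return Y * K ** (-GAMMA) * price_index

def demand_1(Y, w1, w2, K):
    """Conditional demand for input 1, the analytic derivative dG/dw1."""
    return Y * K ** (-GAMMA) * (B11 + B12 * math.sqrt(w2 / w1))

def demand_2(Y, w1, w2, K):
    """Conditional demand for input 2, the analytic derivative dG/dw2."""
    return Y * K ** (-GAMMA) * (B22 + B12 * math.sqrt(w1 / w2))

def numerical_dG_dw1(Y, w1, w2, K, h=1e-6):
    """Central finite-difference approximation of dG/dw1."""
    return (variable_cost(Y, w1 + h, w2, K) - variable_cost(Y, w1 - h, w2, K)) / (2.0 * h)

Y, w1, w2, K = 10.0, 2.0, 8.0, 4.0
x1 = demand_1(Y, w1, w2, K)
x2 = demand_2(Y, w1, w2, K)
G = variable_cost(Y, w1, w2, K)
```

In estimation, these demand equations with additive errors would form the system that is fitted jointly with the cost function; the adding-up property w1*x1 + w2*x2 = G follows from linear homogeneity in prices.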

Concluding Remarks

The concepts of capacity and capacity utilization were originally and largely
developed to address issues and questions in macroeconomics and industrial organi-
zation. Developments in microeconomic theory (especially production economics)
and econometrics, increased availability of data (especially firm level), and the
emergence of other issues, such as energy shocks, renewable resource management
(especially overcapacity in fisheries), and appropriate levels of capacity in other
industries such as health care (e.g., hospitals) and banking, all stimulated further
refinements in the concepts of capacity and capacity utilization and further devel-
opment in methods of measurement. The emergence of readily available firm-level
data led to an orientation away from industry-level production technologies to firm-
level specifications.
Two main developments in production economics contributed to the analysis
of capacity and capacity utilization. The first, duality theory, led to a
reorientation away from the primal toward the dual
specification of production technology and to the different firm behavioral objectives of
cost minimization, revenue maximization, and profit maximization. The second
was the concept of economic efficiency
of Debreu [52], Farrell [72], Koopmans [117], and Shephard [167], notably production
frontier concepts and the decomposition of economic efficiency into its components
of technical, allocative, and scale efficiency. Much of the analysis of capacity and
capacity utilization in this vein utilizes Shephard’s [167] distance function.
Empirical methods borrowed from operations research, especially Data Envelopment
Analysis [32], have supplemented the development of econometrics and have been
applied to the distance function framework. The emergence of Data Envelopment
Analysis (coupled with the frequent problem of a paucity of cost data, especially at
the firm level) led to a resurgence of primal approaches and applications, especially
at the firm level. Data Envelopment Analysis facilitated decompositions of capacity
and capacity utilization to account for technical efficiency.
These further developments in capacity and capacity utilization arising from
duality and economic efficiency (especially the distance function framework),
refinement of the concepts and measurement (especially econometrics and Data
Envelopment Analysis), and application to new issues occurred almost entirely
within the field of production economics. The two distinct strands of capacity
and capacity utilization analysis, in macroeconomics and in production economics (now
the approach used throughout industrial organization applications), continue to reflect
their different orientations, purposes, and even sources of data and methods of
measurement.

References
1. Abel AB (1981) A dynamic model of investment and capacity utilization. Q J Econ
96(3):379–403
2. Aigner D, Chu S (1968) On estimating the industry production function. Am Econ Rev
58:826–839
3. Aigner DJ, Lovell CAK, Schmidt P (1977) Formulation and estimation of stochastic frontier
production function models. J Econ 6(1):21–37
4. Allen RGD (1967) The technology of the economy. Chapter 3. In: Macro-economic theory: a
mathematical treatment. Palgrave Macmillan, London, pp 35–56
5. Anand G, Ward PT (2004) Fit, flexibility and performance in manufacturing: coping with
dynamic environments. Prod Oper Manag 13(4):369–385
6. Arrow KJ (1968) Optimal capital policy with irreversible investment. In: Wolfe JN (ed) Value,
capital and growth: papers in honour of Sir John Hicks. Edinburgh University Press, Chicago
7. Arrow KJ, Fisher AC (1974) Environmental preservation, uncertainty, and irreversibility. Q J
Econ 88:312–319
8. Artus JR (1977) Measures of potential output in manufacturing for eight industrial countries,
1955–1978. Int Monetary Fund Staff Pap 24:1–35
9. Averch H, Johnson LL (1962) Behavior of the firm under regulatory constraint. Am Econ Rev
52(5):1052–1069
10. Battese GE, Coelli TJ (1988) Prediction of firm level technical inefficiencies with a
generalized frontier production function and panel data. J Econ 38:387–399
11. Battese GE, Coelli TJ (1995) A model for technical inefficiency effects in a stochastic frontier
production function for panel data. Empir Econ 20:325–332
12. Baumol W, Panzar J, Willig R (1982) Contestable markets and the theory of industry structure.
Harcourt, Brace & Jovanovich, San Diego
13. Berndt ER (1990) Comment. In: Berndt ER, Triplett J (eds) Fifty years of economic
measurement, National Bureau of economic research studies in income and wealth, vol 54.
University of Chicago Press, Chicago
14. Berndt ER, Fuss M (1986) Productivity measurement with adjustments for variations in
capacity utilization and other forms of temporary equilibria. J Econ 33:7–29
15. Berndt ER, Fuss M (1989) Economic capacity utilization and productivity measurement
for multiproduct firms with multiple Quasi-Fixed inputs. Working paper no. 2932. National
Bureau of Economic Research, Cambridge, MA
16. Berndt ER, Hesse DM (1986) Measuring and assessing capacity utilization in the manufac-
turing sectors of nine OECD countries. Eur Econ Rev 30:961–989
17. Berndt ER, Morrison CJ (1981) Capacity utilization measures: underlying economic theory
and an alternative approach. Am Econ Rev 71:48–52
18. Betancourt RR (1986) A generalization of modern production theory. Appl Econ 18(8):
915–928
19. Betancourt RR (1987) Capital utilization. In: Eatwell J, Milgate M, Newman P (eds) The new
Palgrave: a dictionary of economics. Stockton Press, New York
20. Betancourt RR, Clague C (1982) Capital utilization: a theoretical and empirical analysis.
Cambridge University Press, Cambridge
21. Bils M, Cho J-O (1993) Cyclical factor utilization. Manuscript, University of Rochester,
Rochester
22. Blackorby C, Primont D, Russell R (1978) Duality, separability, and functional structure:
theory and economic applications. North-Holland, New York
23. Bosworth D, Dawkins PJ (1983) Optimal utilization in British manufacturing industry. In:
Eichhorn W, Henn R, Neumann K, Shephard R (eds) Quantitative studies in production and
prices. Physica-Verlag, Würzburg-Wien
24. Bresnahan TF, Ramey VA (1994) Output fluctuations at the plant level. Q J Econ 109(3):
593–624
25. Briec W, Kerstens K, Van den Eeckaut P (2004) Non-convex technologies and cost functions:
definitions, duality and nonparametric tests of convexity. J Econ 81(2):155–192
26. Briec W, Kerstens K, Prior D, Van de Woestyne I (2010) Tangency capacity notions based
upon the profit and cost functions: a non-parametric approach and a general comparison.
Econ Model 27:1156–1166
27. Cagle CJ (2011) Capacity position and financial performance: longitudinal evidence from
U.S. manufacturers. Unpublished PhD dissertation, University of Texas at Arlington, 89 pp
28. Cassels JM (1937) Excess capacity and monopolistic competition. Q J Econ 51(3):426–443
29. Cesaroni G, Kerstens K, Van de Woestyne I (2017a) Short- and long-run plant capacity
notions: definitions and comparisons. IESEG school of management working paper series
2017-EQM-04, Lille Catholic University
30. Cesaroni G, Kerstens K, Van de Woestyne I (2017b) A new input-oriented plant capacity
notion: definition and empirical comparison. Pac Econ Rev 22(4):720–739
31. Chambers RG (1988) Applied production economics: a dual approach. Cambridge University
Press, Cambridge
32. Charnes A, Cooper WW, Rhodes E (1978) Measuring efficiency of decision-making units.
Eur J Oper Res 2:429–444
33. Chamberlin EH (1947) The theory of monopolistic competition: a re-orientation of the theory of
value, 5th edn. Harvard University Press, Cambridge
34. Chenery HB (1952) Overcapacity and the acceleration principle. Econometrica 20:1–28
35. Christiano L (1981) A survey of measures of capacity utilization. Int Monetary Fund Staff
Pap 28(1):144–198
36. Clark CW (2010) Mathematical bioeconomics: the optimal management of renewable
resources, 3rd edn. Wiley, New York
37. Clark CW, Munro GR (1975) The economics of fishing and modern capital theory: a
simplified approach. J Environ Econ Manag 2:92–106
38. Clark CW, Clarke FH, Munro GR (1979) The optimal exploitation of renewable resource
stocks: problems of irreversible investment. Econometrica 47:25–47
39. Coelli TJ, Perelman S (1999) A comparison of parametric and non-parametric distance
functions: with application to European railways. Eur J Oper Res 117:326–339
40. Coelli TJ, Rao DSP, Battese GE (1998) An introduction to efficiency and productivity
analysis. Kluwer, Boston
41. Coelli TJ, Grifell-Tatjé E, Perelman S (2002) Capacity utilization and profitability: a decom-
position of short-run profit efficiency. Int J Prod Econ 79:261–278
42. Cooley TF, Hansen GD, Prescott EC (1995) Equilibrium business cycles with idle resources
and variable capacity utilization. Economic Theory 6:3–49
43. Cornwell C, Schmidt P, Sickles RC (1990) Production frontiers with cross-sectional and time-
series variation in efficiency levels. J Econ 46(1–2):185–200
44. Corrado C, Mattey J (1997) Capacity utilization. J Econ Perspect 11(1):151–168
45. Cullinane K, Song DW, Wang T (2005) The application of mathematical programming
approaches to estimate container port production efficiency. J Prod Anal 24:73–92
46. Cyert RM, March JG (1992) A behavioral theory of the firm. Blackwell Publishers,
Cambridge
47. Dasgupta P (1982) The control of resources. Basil Blackwell, Oxford
48. De Borger B, Kerstens K (2000) The Malmquist productivity index and plant capacity
utilization. Scand J Econ 102(2):303–310
49. De Borger B, Kerstens K, Prior D, Van de Woestyne I (2012) Static efficiency decompositions
and capacity utilization: integrating technical and economic capacity notions. Appl Econ
44(31):4125–4141
50. De Leeuw F (1961) The concept of capacity. In: Proceedings of the business and economics
statistics section. American Statistical Association, Washington, DC, pp 320–329
51. De Vany AS (1975) Capacity utilization under alternative regulatory restraints: an analysis of
taxi markets. J Polit Econ 83(1):83–94
52. Debreu G (1951) The coefficient of resource utilization. Econometrica 19(3):273–292
53. Dervaux B, Kerstens K, Leleu H (2000) Remedying excess capacity in French surgery units
by industry reallocations: the scope for short and long term improvements in plant capacity
utilization. In: Blank J (ed) Public provision and performance: contribution from efficiency
and productivity measurement. Elsevier, Amsterdam, pp 121–146
54. Dess GG, Beard DW (1984) Dimensions of organizational task environments. Adm Sci Q
29(1):52–73
55. Deville A (2009) Branch banking network assessment using DEA: a benchmarking analysis –
a note. Manag Account Res 20:252–261
56. Dhrymes PJ (1976) Impacts of an overvalued currency on domestic employment and prices.
Unpublished manuscript, Department of Economics, Columbia University. Referenced by
Christiano (1981)
57. Diewert WE (1974) Functional forms for revenue and factor requirements functions. Int Econ
Rev 15(1):119–130
58. Dixit A (1980) The role of investment in entry-deterrence. Econ J 90:95–106
59. Dixon H, Savagar A (2017) The effect of firm entry on capacity utilization and macroeco-
nomic productivity. 2017 meeting papers 1130, Society for Economic Analysis
60. Eilon S, Soesan J (1976) Reflections on measurement and evaluation. In: Eilon S, Gold
B, Soesan J (eds) Applied productivity analysis for industry. Pergamon Press, Oxford,
pp 115–133
61. Fagnart JF, Licandro O, Sneessens H (1997) Capacity utilization dynamics and market power.
J Econ Dyn Control 22:123–140
62. Fagnart JF, Licandro O, Portier F (1999) Firm heterogeneity, capacity utilization, and the
business cycle. Rev Econ Dyn 2:433–455
63. FAO (Food and Agriculture Organization of the United Nations) (1998) Report of the
technical working group on the management of fishing capacity, La Jolla, California, United
States, 15–18 April 1998. Fisheries report no. 586. FAO, Rome
64. FAO (Food and Agriculture Organization of the United Nations) (1999) International plan of
action on the management of fishing capacity. FAO, Rome
65. Färe R (1984) On the existence of plant capacity. Int Econ Rev 25(1):209–213
66. Färe R, Grosskopf S (2000) Theory and application of directional distance functions. J Prod
Anal 13(2):93–103
67. Färe R, Grosskopf S, Kokkelenberg E (1989a) Measuring plant capacity, utilization, and
technical change: a nonparametric approach. Int Econ Rev 30(3):655–666
68. Färe R, Grosskopf S, Valdmanis V (1989b) Capacity, competition and efficiency in hospitals:
a nonparametric approach. J Prod Anal 1(2):123–138
69. Färe R, Grosskopf S, Lovell CAK (1994) Production frontiers. Cambridge University Press,
New York
70. Färe R, Grosskopf S, Kirkley J (2000a) Multiple output capacity measures and their relevance
for productivity. Bull Econ Res 52(2):101–113
71. Färe R, Grosskopf S, Kirkley J, Squires D (2000b) Data Envelopment Analysis (DEA): a
framework for assessing capacity in fisheries when data are limited. In: Proceedings of the
tenth biennial conference of the International Institute of Fisheries Economics and Trade,
July 10–14, 2000, Corvallis. Edited by Johnston RS and compiled by Shriver AL. International
Institute of Fisheries Economics and Trade (IIFET), Corvallis, Oregon, 2001
72. Farrell MJ (1957) The measurement of productive efficiency. J R Stat Soc Ser A 120(Part
3):253–281
73. Felthoven R (2002) Effects of the American Fisheries Act on capacity, utilization and
technical efficiency. Mar Resour Econ 17:181–205
74. Felthoven R, Morrison-Paul C (2004) Multioutput, nonfrontier primal measures of capacity
and capacity utilization. Am J Agric Econ 86(3):619–633
75. Ferraro P (2004) Targeting conservation investments in heterogeneous landscapes: a distance-
function approach and application to watershed management. Am J Agric Econ 86(4):
905–918
76. Foss MF (1963) The utilization of capital equipment: postwar compared with prewar. Surv
Curr Bus 43(6):8–16
77. Fousekis P (2002) Distance vs. ray functions: an application to the inshore fishery of Greece.
Mar Resour Econ 17(4):251–267
78. Fousekis P, Stefanou SE (1996) Capacity utilization under dynamic profit maximization.
Empir Econ 21(3):335–359
79. Friedman M (1963) More on Archibald versus Chicago. Rev Econ Stud 30:65–67
80. Fromm G, Klein LR, Ripley FC, Crawford D (1979) Production function estimation of
capacity utilization. Paper presented at the econometric society meetings, Atlantic, December.
Referenced by Morrison (1993)
81. Gabszewicz JJ, Poddar S (1997) Demand fluctuations and capacity utilization under duopoly.
Economic Theory 10(1):131–146
82. Gold B (1955) Foundations of productivity analysis. Pittsburgh University Press, Pittsburgh
83. Gold B (1976) Framework for productivity analysis. In: Eilon S, Gold B, Soesan J (eds)
Applied productivity analysis for industry. Pergamon Press, Oxford, pp 15–40
84. Greene WH (1990) A gamma-distributed stochastic frontier model. J Econ 46(1-2):141–163
85. Greenley GE, Oktemgil M (1998) A comparison of slack resources in high and low
performing British companies. J Manag Stud 35(3):377–398
86. Greenwood J, Hercowitz Z, Huffman GW (1988) Investment, capacity utilization, and the
real business cycle. Am Econ Rev 78:402–417
87. Greenwood J, Hercowitz Z, Krusell P (1992) Macroeconomic implications of investment-
specific technological change. Manuscript, University of Rochester, Rochester
88. Hannesson R (1987) Optimal catch capacity and fishing effort in deterministic and stochastic
fishery models. Fish Res 5:1–21
89. Hannesson R (1993) Bioeconomic analysis of fisheries. Blackwell, Oxford
90. Harris R, Taylor J (1985) The measurement of capacity utilization. Appl Econ 17(5):849–866
91. Hickman BG (1957) Capacity, capacity utilization, and the acceleration principle. In:
Problems of capital formation, Studies in income and wealth, vol 19. Princeton University
Press, Princeton
92. Hickman BG (1964) A new method of capacity estimation. J Am Stat Assoc 59:529–549
93. Huang CJ, Liu JT (1994) Estimation of a non-neutral stochastic frontier production function.
J Prod Anal 5:171–180
94. Hulten CR (1986) Productivity change, capacity utilization, and the sources of efficiency
growth. J Econ 33(1/2):31–50
95. Hulten CR (1990) The measurement of capital. In: Berndt ER, Triplett J (eds) Fifty years of
economic measurement, National Bureau of economic research studies in income and wealth,
vol 54. University of Chicago Press, Chicago
96. Hulten CR (2001) Total factor productivity: a short biography. In: Hulten CR, Dean ER,
Harper MJ (eds) New developments in productivity analysis. University of Chicago Press,
Chicago
97. Johansen L (1968) Production functions and the concept of capacity. In: Recherches Recentes
sur la Fonction de Production, Collection Economie et Mathematique et Econometrie, vol 2.
Centre d’Etudes et de la Recherche Universitaire de Namur, Namur, pp 46–72. Reprinted in
Førsund FR (ed) Collected works of Leif Johansen, vol 1, 1987. North-Holland, Amsterdam,
pp 359–382
98. Johansen L (1972) Production functions: an integration of micro and macro, short run and
long run aspects. North Holland, Amsterdam
99. Jorgenson DW, Griliches Z (1967) The explanation of productivity change. Rev Econ Stud
34:249–82. Reprinted with corrections 1972. In: Survey of current business. U.S. Department
of Commerce, Washington, DC, pp 3–36
100. Jorgenson DW, Griliches Z (1972) Issues in growth accounting: a reply to Edward F.
Denison. In: Survey of current business. U.S. Department of Commerce, Washington, DC,
pp 65–94
101. Kalirajan KP, Salim RA (1997) Economic reforms and productive capacity realisation in
Bangladesh: an empirical analysis. J Ind Econ XLV(4):387–403
102. Karagiannis R (2015) A system-of-equations two-stage DEA approach for explaining capac-
ity utilization and technical efficiency. Ann Oper Res 227(1):25–43
103. Kärri T (2007) Timing of capacity change: models for capital intensive industry. Unpublished
DSc dissertation, Department of Industrial Management, Faculty of Technology Manage-
ment, Lappeenranta University of Technology, Finland, 134 pp
104. Kerr CA, Glass JC, McCallion GM, McKillop DG (1999) Best-practice measures of resource
utilization for hospitals: a useful complement in performance assessment. Public Adm
77(3):639–650
105. Kerstens K, Squires D, Vestergaard N (2005) Methodological reflections on the short-run
Johansen industry model in relation to capacity management. Mar Resour Econ 20(4):
425–443
106. Kerstens K, Vestergaard N, Squires D (2006) A short-run Johansen industry model for
common-pool resources: planning a fishery’s industrial capacity to curb overfishing. Eur Rev
Agric Econ 33(3):1–29
107. Kim HY (1999) Economic capacity utilization and its determinants: theory and evidence. Rev
Ind Organ 15:321–339
108. Kirkley JE, Squires DE (1999a) Measuring capacity and capacity utilization in fisheries.
In: Greboval D (ed) Managing fishing capacity: selected papers on underlying concepts and
issues, FAO fisheries technical paper 386. Food and Agricultural Organization of the United
Nations, Rome
109. Kirkley JE, Squires D (1999b) Capacity and capacity utilization in fishing industries.
Discussion paper 99-16, Department of Economics University of California, San Diego
110. Kirkley J, Morrison Paul C, Squires D (2002) Capacity and capacity utilization in common-
pool resource industries: definition, measurement, and a comparison of approaches. Environ
Resour Econ 22(1-2):71–97
111. Kirkley JE, Squires D, Alam MF, Ishak HO (2003) Excess capacity and asymmetric
information in developing country fisheries: the Malaysian purse seine fishery. Am J Agric
Econ 85:647–662
112. Kirkley JE, Morrison-Paul C, Squires DE (2004) Deterministic and stochastic estimation
of capacity utilization and implications for capacity reduction. Mar Resour Econ 19(3):
271–294
113. Klein L (1960) Some theoretical issues in the measurement of capacity. Econometrica
28(2):272–286
114. Klein LR, Preston RS (1967) Some new results in the measurement of capacity utilization.
Am Econ Rev 57(1):34–58
115. Klein LR, Summers R (1967) The Wharton index of capacity utilization. University of
Pennsylvania, Studies in Quantitative Economics, Philadelphia, 94 pp
116. Klein L, Long V, Greenspan A, Greenwald D, Edmonson N, Perry G (1973) Capacity
utilization: concept, measurement, and recent estimates. Brook Pap Econ Act 1973(3):
743–756
117. Koopmans TC (1951) Analysis of production as an efficient combination of activities. In:
Koopmans TC (ed) Activity analysis of production and allocation. Physica-Verlag, Heidelberg
118. Krinsky I (1991) Capital budgeting and plant capacity. Eng Costs Prod Econ 21(3):233–241
119. Kumbhakar S, Lovell CAK (2000) Stochastic frontier analysis. Cambridge University Press,
Cambridge
120. Kumbhakar S, Ghosh S, McGuckin J (1991) A generalized production frontier approach
for estimating determinants of inefficiency in United States dairy farms. J Bus Econ Stat 9:
279–286
121. Kydland FE, Prescott EC (1991) Hours and employment variation in business cycle theory.
Economic Theory 1:63–81
122. Lee J-K (1995) Comparative performance of short-run capacity utilization measures. Econ
Lett 48(3-4):293–300
123. Lewbel A (1996) Aggregation without separability: a generalized composite commodity
theorem. Am Econ Rev 86(3):524–543
124. Lieberman M (1989) Capacity utilization: theoretical models and empirical tests. Eur J Oper
Res 40(2):155–168
125. Lindebo E, Hoff A, Vestergaard N (2007) Revenue-based capacity utilization measures and
decomposition: the case of Danish North Sea trawlers. Eur J Oper Res 180(1):215–227
126. Löthgren M (1997) Generalized stochastic frontier production models. Econ Lett 57:255–259
127. Löthgren M (2000) Specification and estimation of stochastic multi-output production and
technical inefficiency. Appl Econ 32(12):1533–1540
128. Lovell CAK, Richardson S, Travers P, Wood I (1994) Resources and functionings: a new
view of inequality in Australia. In: Eichorn W (ed) Models and measurement of welfare and
inequality. Springer, Berlin, pp 787–807
129. Magnussen J, Mobley LR (1999) The impact of market environment on excess capacity and
the cost of an empty hospital bed. Int J Econ Bus 6(3):383–398
130. Meeusen W, van den Broeck J (1977) Efficiency estimation from Cobb-Douglas production
functions with composed error. Int Econ Rev 18(2):435–444
131. Morin N, Stevens J (2004) Estimating capacity utilization from survey data. Paper 2004-49,
Finance and economics discussion series. Federal Reserve Board, Division of Research &
Statistics and Monetary Affairs, Washington, DC
132. Morrison CJ (1985a) Primal and dual capacity utilization: an application to productivity
measurement in the U.S. automobile industry. J Bus Econ Stat 3(4):312–324
133. Morrison CJ (1985b) On the economic interpretation and measurement of optimal capacity
utilization with anticipatory expectations. Rev Econ Stud 52(2):295–310
134. Morrison CJ (1986) Productivity measurement with non-static expectations and varying
capacity utilization. J Econ 33(1/2):51–74
135. Morrison CJ (1993) A microeconomic approach to the measurement of economic perfor-
mance: productivity growth, capacity utilization, and related performance indicators, Bilkent
University lecture series. Springer, New York
136. Nelson R (1989) On the measurement of capacity utilization. J Ind Organ 33(3):51–74
137. Kirkley JE, Squires D, Walden J, Ward J (1999) Assessing efficiency and capacity in fisheries.
Prepared for the National Marine Fisheries Service workshop of “Assessing technical
efficiency and capacity in fisheries”. Silver Spring, September
138. NOAA Fisheries (2004) U.S. National plan of action for the management of fishing capacity
NOAA Fisheries. U.S. Department of Commerce, National Oceanic and Atmospheric
Administration, National Marine Fisheries Service, Washington, DC
139. Ward J, Brainerd T, Freese, S, Mace P, Milazzo M, Squires D, Terry J, Thunberg EM, Travis
M, Walden J (2001) Report of the national task force for defining and measuring fishing
capacity. National marine fisheries service, office of science and technology, Silver Spring,
Maryland
140. Okun AM (1962) Potential GNP: its measurement and significance. In: Proceedings of the
business and economics section of the American Statistical Association. American Statistical
Association, Washington, DC, pp 98–104
141. Oomes N, Dynnikova O (2006) The utilization-adjusted gap: is the Russian economy
overheating? IMF staff paper 06/68. International Monetary Fund, 46 p
142. Pascoe S, Kirkley J, Greboval D, Morrison-Paul CJ (2003) Measuring and assessing capacity
in fisheries 2. Issues and methods. FAO fisheries technical paper 433/2. Food and Agriculture
Organization of the United Nations, Rome, 130 pp
143. Pindyck RS (1988) Irreversible investment, capacity choice, and the value of the firm. Am
Econ Rev 78:969–985
144. Pindyck RS (1991) Irreversibility, uncertainty, and investment. J Econ Lit 29:1110–1148
145. Pitt MM, Lee L-F (1981) The measurement and sources of technical inefficiency in
the Indonesian weaving industry. J Dev Econ 9:43–64
146. Prior D (2003) Long-run and short-run non-parametric cost frontier efficiency: an application
to Spanish Savings Banks. J Bank Financ 27(4):655
147. Prochaska FJ (1978) Theoretical and empirical considerations for estimating capacity and
capacity utilization in commercial fisheries. Am J Agric Econ 60(5):1020–1025
148. Raddock RD (1995) A revision to industrial production and capacity utilization, 1991–1995.
Fed Reserv Bull 81:16–25
149. Ragan JF (1976) Measuring capacity utilization in manufacturing. Federal Reserve Bank of
New York Quarterly Review
150. Rashidghalam M, Heshmati A, Dashti G, Pishbahar E (2016) A comparison of panel data
models in estimating technical efficiency. IZA DP no. 9807. The Institute for Labor Studies,
Bonn
151. Ray SC (2004) Data envelopment analysis: theory and techniques for economics and
operations research. Cambridge University Press, Cambridge
152. Ray SC (2015) Nonparametric measures of scale economies and capacity utilization: an
application to U.S. manufacturing. Eur J Oper Res 245:602–611
153. Ray SC, Mukherjee K, Wu Y (2006) Direct and indirect measures of capacity utilization: a
non-parametric analysis of U.S. manufacturing. Manch Sch 74(4):526–548
154. Ray SC, Walden J, Chen L (2018) Economic measures of capacity utilization: a nonpara-
metric cost function analysis. Working paper 2018-02, Department of Economics, University
of Connecticut, Storrs. http://web2.uconn.edu/economics/working/2018-02.pdf. Accessed 28
Aug 2018
155. Reid C, Squires D, Jeon Y, Clarke L, Clarke R (2003) Fishing capacity of tuna purse seine
vessels in the Western and Central Pacific Ocean. Mar Policy 27(6):449–469
156. Reifschneider D, Stevenson R (1991) Systematic departures from the frontier: a framework
for the analysis of firm inefficiency. Int Econ Rev 32(3):715–723
157. Ritter C, Simar L (1997) Pitfalls of normal-gamma stochastic frontier models. J Prod Anal
8:167–182
158. Sarkar S (2009) A real-option rationale for investing in excess capacity. Manag Decis Econ
30(9):119–133
159. Schmidt P, Sickles R (1984) Production frontiers and panel data. J Bus Econ Stat 2:367–374
160. Schultze C (1963) Uses of capacity measures for short-run economic analysis. Am Econ Rev
53:293–308
161. Schworm W (1977) User cost and the demand for capital, discussion paper no. 77-22,
Department of Economics, University of British Columbia
162. Segerson K, Squires D (1990) On the measurement of economic capacity utilization for multi-
product industries. J Econ 44:347–361
163. Segerson K, Squires D (1993) Capacity utilization under regulatory constraints. Rev Econ
Stat 75(1):76–85
164. Segerson K, Squires D (1995) Measurement of economic capacity utilization for revenue
maximizing firms. Bull Econ Res 47:77–84
165. Shaikh A, Moudud JK (2004) Measuring capacity utilization in OECD countries: a coin-
tegration method. Working paper no. 415, The Levy Economics Institute of Bard College.
http://www.levyinstitute.org/pubs/wp415.pdf. Accessed July 15 2019
166. Shapiro MD (1989) Assessing the federal reserve’s measures of capacity and utilization.
Brook Pap Econ Act 1989(1):181–241
167. Shephard RW (1970) Theory of cost and production functions. Princeton University Press,
Princeton
168. Sickles RC (2005) Panel estimators and the identification of firm-specific efficiency levels in
parametric, semiparametric, and nonparametric settings. J Econ 126:305–334
169. Simar L (2003) Detecting outliers in frontier models: a simple approach. J Prod Anal 20:
391–424
170. Solow RM (1956) A contribution to the theory of economic growth. Q J Econ 70(1):65–94
171. Squires D (1987) Long-run profit functions for multiproduct firms. Am J Agric Econ
69(3):558–569
172. Squires D (1994) Firm behavior under input rationing. J Econ 61(2):235–257
173. Squires D (2016) Firm behavior under quantity controls: the theory of virtual quantities. J
Environ Econ Manag 79:70–86
174. Squires D, Vestergaard N (2018) Rethinking the commons problem: technical change,
knowledge spillovers, and social learning. J Environ Econ Manag 91:1–25
175. Squires D, Jeon Y, Grafton RQ, Kirkley J (2010) Controlling excess capacity in common-pool
resource industries: the transition from input to output controls. Aust J Agric Resour Econ
54(3):361–377
176. Steen F (1994) Optimal capacity with stochastic demand. J Prod Anal 5:375–384
177. Steinmann L, Dittrich G, Karmann A, Zweifel P (2004) Measuring and comparing the
(in)efficiency of German and Swiss hospitals. Eur J Health Econ 5(3):216–226
178. Stevenson RE (1980) Likelihood functions for generalized stochastic frontier estimation. J
Econ 13:56–66
179. Tingley D, Pascoe S (2005) Eliminating excess capacity: implications for the Scottish fishing
industry. Mar Resour Econ 20:407–424
180. Todorova T (2015) Is there excess capacity really?. Theory Pract Res Econ Fields 6(2):
127–143. ASERS Publishing
181. Valdmanis V, Bernet P, Moises J (2010) Hospital capacity, capability, and emergency
preparedness. Eur J Oper Res 207(3):1628–1634
182. Valdmanis V, De Nicola A, Bernet P (2015) Public health capacity in the provision of health
care services. Health Care Manag Sci 18(4):475–482
183. Vestergaard N, Squires D, Kirkley J (2003) Measuring capacity and capacity utilization in
fisheries: the case of the Danish Gill-net fleet. Fish Res 60:357–368
184. Walden J, Tomberlin D (2010) Estimating fishing vessel capacity: a comparison of nonpara-
metric frontier approaches. Mar Resour Econ 25:23–36
185. Weninger Q, Just R (1997) An analysis of transition from limited entry to transferable quota:
non-Marshallian principles for fisheries management. Nat Resour Model 10(1):53–83
186. Wilson TA, Eckstein O (1964) Short-run productivity behavior in U.S. manufacturing. Rev
Econ Stat 46(1):41–54
187. Winston GC (1982) The timing of economic activities. Cambridge University Press, Cam-
bridge, MA
188. Wolff F-C, Guillotreau P, Squires D (2013) The firm’s management in production: manage-
ment, firm, and time effects in an Indian Ocean Tuna fishery. Am J Agric Econ 95(3):547–567
189. Yu MM (2007) The capacity productivity change and the variable input productivity change: a
new decomposition of the Malmquist productivity index. Appl Math Comput 185(1):375–381
25 Aggregation of Efficiency and Productivity: From Firm to Sector and Higher Levels
Valentin Zelenyuk

Contents
Introduction
The Aggregation Problem: A Brief Background
  The Essence of the Aggregation Problem
  The Evolution of the Aggregation Literature
Aggregation of Efficiency Scores
  Individual Primal and Dual Efficiency Scores
  Group Primal and Dual Efficiency Scores
  The Fundamental Aggregation Results
  Understanding the Fundamental Aggregation Results
  Aggregation of Aggregates
  Price-Independent Weights
Aggregation of Productivity Indexes
  Individual Malmquist Productivity Indexes
  Aggregation Problem: Inter-temporal Perspective
  Aggregation of the MPIs
  Geometric vs. Harmonic Averaging
  Decomposition and Aggregation
Aggregation for Scale Measures
Aggregation with Possibility of Reallocation
  Aggregate Technology and Measures with Reallocation
  Reallocation vs. No Reallocation
  Aggregate vs. Individual Reallocative Measures of Efficiency
Remarks on Estimation of Aggregate Scores
Concluding Remarks
Cross-References
References

The author acknowledges support of the University of Queensland and from the ARC grants (ARC
FT170100401).

V. Zelenyuk
School of Economics and Centre for Efficiency and Productivity Analysis (CEPA), The
University of Queensland, Brisbane, QLD, Australia
e-mail: [email protected]

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_19

Abstract

Here we consider various cases where researchers are interested in measuring
aggregate efficiency or productivity levels or their changes for a group of
decision-making units. These could be an entire industry composed of individual
firms, banks, and hospitals or a region composed of sub-regions or countries, or
particular sub-groups of these units within a group, e.g., sub-groups of public
vs. private or regulated vs. non-regulated firms, banks, or hospitals within the
same industry, etc. Such analysis requires solutions to the aggregation problem –
some theoretically justified approaches that can connect individual measures
to aggregate measures. Various solutions are offered in the literature, and our
goal is to try to coherently summarize at least some of them in this chapter.
This material should be interesting not only for theorists but also (and perhaps
more so) for applied researchers, as it provides exact formulas and intuitive
explanations for various measures of group efficiency, group scale elasticity, and
group productivity indexes and refers to original papers for more details.

Keywords

Efficiency · Productivity · Aggregation · Industry efficiency · Duality

JEL Classification Numbers

D24, C43, L25

Introduction

An aggregate perspective is very important for theory and perhaps even more so
in practice. Even if a researcher estimates the efficiency of individual units, she/he
might still (and usually does) want to have just one or a few aggregate numbers that
summarize the individual estimates. Such aggregate numbers would be especially
useful if the number of individual units is too large to report all of them and
especially to comprehend them all for understanding the overall picture. Indeed,
hardly anyone would want to read hundreds of individual efficiency scores and
would rather demand a summary – some aggregate efficiency or productivity
measures that will give a big picture about the efficiency or productivity situation in
the industry or sub-groups of interest within it.
The key question here is therefore: How to meaningfully aggregate the individual
efficiency and productivity scores or indexes? A natural answer would be: “Take
an average!” But which one? Is it arithmetic, geometric, harmonic, or any other?
And, much more importantly: Should it be a weighted or a non-weighted average?
Or, more generally: What are the most meaningful (from an economics point of view)
ways to aggregate the individual efficiency and productivity scores or indexes, of
potentially many individuals, into one number representing the aggregate efficiency or
productivity of a group?
The goal of this chapter is to provide some answers to these fundamentally
important questions, by summarizing the recent developments in the literature.
In a nutshell, the results summarized here provide applied researchers with the
formulas for group efficiency, group scale elasticity, and group productivity indexes.
Importantly, in all these formulas, the weights of aggregation have a fairly intuitive
economic meaning, yet they are not ad hoc but derived mathematically via economic
theoretical reasoning.1

The Aggregation Problem: A Brief Background

The problem of finding a measure (a score, an index) representing a group of
individual measures is called an aggregation problem – a problem that has been
studied in many fields, including economics.
In the field of productivity and efficiency analysis, this problem has been raised
starting, at least, with the classical works of [17, 19], and later followed up by [30],
and most thoroughly theoretically scrutinized by [1] and critically evaluated by [54].
More recently this important analytical problem was addressed and to some
extent resolved by [7, 12–14, 36, 48, 48, 56] and most recently [31, 33], to mention a
few. Here, we will briefly summarize the essence of the key results from these and
other works.2

The Essence of the Aggregation Problem

As in the general context, the most important issue here is the choice of weights in
the aggregation. To illustrate the point vividly, consider an example of an industry
with many firms, most of which are small, while a very few large firms take most
of the industry share.3 Now suppose that those small firms are very efficient; for
simplicity of computation, suppose they are 100% (or nearly that) efficient.
Meanwhile, suppose those big firms are much less efficient, say 50% efficient.4
If, for such an example, a researcher were to use the simple (equally weighted)
arithmetic average, then such an aggregate efficiency score would indicate that the
industry is close to 100% efficient! On the other hand, if another researcher wanted to
use a weighted arithmetic average, then a dramatically different conclusion might be
reached, depending on the weighting scheme. Indeed, if one takes the market
shares as the relative weights and uses them in the weighted arithmetic average, then
such an aggregate efficiency score will indicate that the industry is closer to 50%
efficient. Thus, one would reach a dramatically different conclusion, with opposite
policy implications, than from the equally weighted average!
The essence of the problem here is in the nature of efficiency scores: by
construction, they are “standardized” so that they are between 0 and 1, and, while
this gives some advantages, the side effect of such standardization is that they lose
the information about the relative weights of the firms that obtained these scores.
Clearly, one may try to justify some other weights that may imply very different
conclusions and thus different policy implications, and this, in turn, emphasizes the
importance of having justifications for the choice of weights.

1 A different area of the aggregation questions that focuses on the aggregation of inputs or
aggregation of outputs for a firm (e.g., to reduce the dimension of the model) is not considered
here and can be found in [6, 11, 51, 53] and the references therein. We also do not consider the
question of aggregation of indexes with respect to different references (e.g., time periods) for the
same firm, which can be found in [16] and the references therein.
2 This chapter is a substantially revised, extended, and elaborated version of material that I
presented earlier, in Chapter 5 of [47].
3 While this is a generic example, a reader might have realized that many industries in the real
world have a similar composition, often resembling the so-called Pareto principle, more casually
known as “the 80/20 rule” postulating that about 80% share (e.g., of wealth, sales, etc.) is taken up
by about 20% of members of a group.
4 Lower efficiency of large firms is not unusual and has often been reported in the literature. It can
arise, for example, due to the greater complexity of being a larger organization involving greater
levels of hierarchy and thus implying potentially greater principal-agent problems or requiring more
inputs or higher costs than needed for producing the same level and the same quality of output.

The Evolution of the Aggregation Literature

The early key ideas that attempted to take into account the economic weights of
firms when aggregating their efficiency can be found in the seminal work of [17],
where he proposed the concept of Structural Efficiency of an Industry. To be precise,
[17] considered a single-output case and proposed taking the weighted arithmetic
average of efficiency scores of individual firms in that industry, where the weights
were the observed output shares of the firms within the industry. Importantly, note
that Farrell had not given any formal theoretic justifications for such an aggregation
scheme at that time and, in particular, had not justified why output shares were
to be used for aggregating the input-oriented technical efficiency scores that he
considered.5 Farrell also did not explain how to apply his idea for a multiple-output
case. These limitations were perhaps among the main reasons why Farrell’s
concept of Structural Efficiency of an Industry had not attained a wider use in
practice, where many tended to just use the equally weighted averages to report
on the aggregate efficiency of industries or sub-groups within them.

5 Indeed, later in this chapter, we will see that output shares are more coherent with output
orientation, while for the input orientation it would be more natural to use the cost shares.
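The contrast between the equally weighted average and an output-share-weighted average of the kind Farrell proposed for the single-output case can be checked with a small numerical sketch. All numbers below are hypothetical and chosen only to mimic the many-small-firms example discussed earlier; the function names are illustrative and do not come from the cited works.

```python
# Hypothetical industry (illustrative numbers only): 98 small firms that are
# fully efficient and 2 large firms that are 50% efficient.
small_n, large_n = 98, 2
scores = [1.0] * small_n + [0.5] * large_n       # efficiency scores in (0, 1]
outputs = [1.0] * small_n + [196.0] * large_n    # large firms produce 80% of output


def equally_weighted_mean(values):
    """Simple arithmetic average: every firm counts the same."""
    return sum(values) / len(values)


def share_weighted_mean(values, sizes):
    """Arithmetic average weighted by each firm's share in total output."""
    total = sum(sizes)
    return sum(v * s / total for v, s in zip(values, sizes))


simple = equally_weighted_mean(scores)            # close to 1: industry looks efficient
weighted = share_weighted_mean(scores, outputs)   # much closer to 0.5

print(f"equally weighted: {simple:.2f}, output-share weighted: {weighted:.2f}")
```

With these assumed numbers the two averages point to opposite conclusions about the same industry, which is exactly why the choice of weights needs a justification.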
About two decades later, Farrell’s ideas were revisited by [19] who proposed to
estimate efficiency scores of an “average firm,” constructed as the average of input-
output allocations. It is easy to construct an example that will show that such a
measure can indicate high inefficiency even though all firms are technically efficient,
and this was considered (incorrectly) as a drawback and, apparently, was one of the
reasons why this measure is also rarely used in practice.
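One possible construction of such an example is sketched below. It assumes, purely for illustration, that all firms share a single-input, single-output technology with frontier y = sqrt(x); neither this functional form nor the numbers come from [19].

```python
import math


def frontier(x):
    """Hypothetical common frontier: maximal output obtainable from input x."""
    return math.sqrt(x)


def farrell_output_score(x, y):
    """Frontier output over actual output: 1 means efficient, above 1 inefficient."""
    return frontier(x) / y


# Two firms, both located exactly on the assumed frontier.
firms = [(1.0, 1.0), (9.0, 3.0)]
individual = [farrell_output_score(x, y) for x, y in firms]

# The "average firm" uses the average input and produces the average output.
avg_x = sum(x for x, _ in firms) / len(firms)
avg_y = sum(y for _, y in firms) / len(firms)
average_firm = farrell_output_score(avg_x, avg_y)

print(individual, round(average_firm, 3))
```

Because the assumed frontier is strictly concave, the average of two efficient allocations lies strictly below it, so the average firm appears inefficient even though no individual firm is.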
The fundamental ideas of [17, 19] were then revisited by a very important (yet
for a long time overlooked) work of [30], who attempted to synthesize the two
approaches with additional assumptions, focusing on the data envelopment analysis
(DEA) context and on the use of the so-called shadow prices in DEA.
At the turn of the last century, [1] were the first to scrutinize the problem on
pure theoretical grounds and derived several important, yet “negative” results –
they proved the impossibility of a solution of the aggregation problem in a general
setup. What this implied was that some additional assumptions or structure were
needed to arrive at a “positive result.”6 Such additional assumptions and structure
were discovered by [12]: In addition to the usual assumptions of production
theory, they followed [25]’s work on aggregation in economics, adapting it to the
context of efficiency analysis. Differently from [25], however, they assumed an
additive structure for the aggregate technology being the set-wise summation of the
individual output (rather than technology) sets, for given input allocations. Adding
this structure to the standard regularity conditions of production theory and with
the so-called “law of one price” assumption (as in Koopmans), [12] then invoked
the principles of economic optimization to derive a theoretically justified weighting
scheme for aggregation of individual efficiencies into a group efficiency. In turn, this
theoretical framework provided the grounds of economic theory for the weighting
scheme of [17, 30] and circumvented the impossibility theorems of [1].
The approach of [12] was then used to derive many other interesting and
useful aggregation results, e.g., for aggregation of directional distance functions
in [8], aggregation of scale elasticities in [15], and scale efficiencies in [59]; for
aggregation of Malmquist and Hicks-Moorsteen productivity indexes [31, 33, 56];
etc. The goal of this chapter is to summarize these aggregation results and give some
insights on future developments.

6 This is not entirely surprising; e.g., recall that very strong assumptions are needed to establish
positive aggregation results in consumer theory.

Aggregation of Efficiency Scores

While most of the discussion here will be theoretical, it is helpful for the reader
to keep in mind that a typical empirical context of this methodology is a study
of efficiency or productivity (or their changes) of an economic system consisting
of different decision-making units (hereafter DMUs), e.g., an industry consisting of
firms or a particular bank or other institution consisting of many branches, or
a comparison of distinct groups within such a system (e.g., regulated vs. non-regulated,
foreign vs. local, private vs. public firms, etc.). To get to the group level,
we first need to briefly refresh the key concepts and notions for the individual level,
which we do in the next subsection.

Individual Primal and Dual Efficiency Scores

Without loss of generality, suppose the system is a group (e.g., industry, sector, etc.)
consisting of $n$ DMUs, where for each DMU $k \in \{1, 2, \ldots, n\}$ we use the vector
$x^k = (x_1^k, \ldots, x_N^k) \in \mathbb{R}_+^N$ to denote the $N$ inputs that DMU $k$ utilizes to produce a
vector of $M$ outputs, $y^k = (y_1^k, \ldots, y_M^k) \in \mathbb{R}_+^M$. For generality of the aggregation
results, we allow each DMU $k$ to employ a technology that is potentially
different from those used by other DMUs, and we assume it can be characterized by
the technology set $\Psi^k$, defined in general terms as7

$$\Psi^k \equiv \{(x^k, y^k) : x^k \text{ can produce } y^k\}. \tag{1}$$

An equivalent characterization of technology can also be given via the output sets

$$P^k(x^k) \equiv \{y^k : x^k \text{ can produce } y^k\}, \quad x^k \in \mathbb{R}_+^N. \tag{2}$$

An important advantage of the aggregation results that we summarize here is
their generality with respect to the characterization of technology. Indeed, we do not
assume any particular production or transformation function (e.g., Cobb-Douglas,
Leontief, CES); rather, we allow for a very wide class of technologies that satisfy
the usual regularity axioms of production theory, in particular:

A1: The technology set $\Psi^k$ is closed.
A2: The output correspondence $P^k(x^k)$ is bounded $\forall x^k \in \mathbb{R}_+^N$.
A3: There is no “free lunch,” i.e., something cannot be produced from nothing:
$(0_N, y^k) \notin \Psi^k$, $\forall y^k \geq 0_M$ (i.e., $y_m^k \geq 0$ for $m = 1, \ldots, M$, $y^k \neq 0_M$).
A4: It is possible to produce nothing, i.e., $0_M \in P^k(x^k)$, $\forall x^k \in \mathbb{R}_+^N$.
A5: Outputs and inputs are freely (strongly) disposable, i.e., $(x^0, y^0) \in \Psi^k \Rightarrow
(x, y) \in \Psi^k$, $\forall y \leq y^0$, $\forall x \geq x^0$.

To employ the results from the duality theory in economics, we also need some
convexity assumptions. At the beginning we only assume that the output sets
are convex, i.e.,

A6: $y^o, y^1 \in P^k(x^k)$, $x^k \in \mathbb{R}_+^N \Rightarrow \delta y^o + (1 - \delta) y^1 \in P^k(x^k)$, $\forall \delta \in [0, 1]$.8

7 In the discussion of the economic theoretical foundation here, we mainly use the framework
developed by [45, 46], further refined in many works and concisely outlined in [10] and [47].

With these conditions, the output-oriented [46] distance function
$D_o^k : \mathbb{R}_+^N \times \mathbb{R}_+^M \to \mathbb{R}_+^1 \cup \{+\infty\}$, defined as

$$D_o^k(x^k, y^k) \equiv \inf\{\theta > 0 : y^k/\theta \in P^k(x^k)\}, \tag{3}$$

gives a complete characterization of the technology of DMU $k$, in the sense that

$$D_o^k(x^k, y^k) \leq 1 \Leftrightarrow y^k \in P^k(x^k). \tag{4}$$

A closely related concept is the Farrell output-oriented measure of technical
efficiency, defined as

$$OTE^k(x^k, y^k) \equiv \sup\{\theta > 0 : \theta y^k \in P^k(x^k)\} = 1/D_o^k(x^k, y^k). \tag{5}$$

Furthermore, let $p = (p_1, \ldots, p_M) \in \mathbb{R}_{++}^M$ be the vector of corresponding output
prices;9 then the dual characterization of $P^k(x^k)$ is obtained from the revenue
function $R^k : \mathbb{R}_+^N \times \mathbb{R}_{++}^M \to \mathbb{R}_+^1 \cup \{+\infty\}$,

$$R^k(x^k, p) \equiv \sup\{py : y \in P^k(x^k)\}, \tag{6}$$

and the related efficiency measure for a DMU $k$ in the dual framework would then
be the revenue efficiency (also referred to as the overall output efficiency), defined
formally as

$$RE^k(x^k, y^k, p) \equiv R^k(x^k, p)/py^k. \tag{7}$$

From the duality theory for the revenue function [10, 46], we then have

$$R^k(x^k, p) \geq py^k/D_o^k(x^k, y^k), \tag{8}$$

which leads to another notion, a measure of the output-oriented allocative
(in)efficiency, defined as a multiplicative residual that turns (8) into an equality, i.e.,

$$OAE^k(x^k, y^k, p) \equiv RE^k(x^k, y^k, p)/OTE^k(x^k, y^k), \tag{9}$$

and so, we have a useful decomposition:

$$RE^k(x^k, y^k, p) = OTE^k(x^k, y^k) \times OAE^k(x^k, y^k, p). \tag{10}$$

This decomposition (10) is a stepping-stone for deriving the aggregation results, as
will be apparent below.

8 For theoretical results we do not require convexity of $\Psi^k$, although when implementing in
practice one may impose it when choosing a particular estimator or particular functional form
for technology.
9 Note that for the aggregation results, a necessary assumption is the so-called Law of One Price,
i.e., here it implies that all firms face the same output prices.
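To see how the distance function, technical efficiency, revenue efficiency, and allocative efficiency defined in (3)–(10) fit together, the following sketch evaluates them in closed form for one DMU. The technology is an assumption made only for this illustration: a single input x and an output set P(x) = {y >= 0 : a·y <= x} with hypothetical coefficients a; the data are also hypothetical.

```python
# Hypothetical one-input, two-output technology (an assumption for illustration):
# P(x) = {y >= 0 : a[0]*y[0] + a[1]*y[1] <= x}.
a = (1.0, 2.0)      # assumed input requirement per unit of each output
x = 10.0            # observed input
y = (2.0, 2.0)      # observed outputs
p = (3.0, 4.0)      # common output prices


def dot(u, v):
    """Inner product of two equally long tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Output distance function (3): smallest theta with y/theta in P(x).
distance = dot(a, y) / x
# Farrell output-oriented technical efficiency (5).
ote = 1.0 / distance
# Revenue function (6): a linear objective over this P(x) is maximized by
# spending all of x on the output with the highest price per unit of input.
max_revenue = x * max(pm / am for pm, am in zip(p, a))
# Revenue efficiency (7) and allocative efficiency (9).
actual_revenue = dot(p, y)
rev_eff = max_revenue / actual_revenue
oae = rev_eff / ote

print(f"OTE={ote:.4f}, RE={rev_eff:.4f}, OAE={oae:.4f}")
```

In this sketch the inequality (8) holds strictly, so the allocative component exceeds 1, and the product of the technical and allocative components reproduces the revenue efficiency as in (10).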

Group Primal and Dual Efficiency Scores

Let us consider a sub-group $l$ ($l = 1, \ldots, L$), consisting of $n_l$ DMUs within the
original group of $n$ DMUs. Such sub-grouping can be based on various exogenous
criteria such as geographic regions, ownership structures, regulation regimes, etc.
For each group $l$ ($l = 1, \ldots, L$), let the input allocation among DMUs within the
sub-group $l$ be $X^l = (x^{l,1}, \ldots, x^{l,n_l})$, and let the total of output vectors over all
firms in the $l$th group be $\overline{Y}^l = \sum_{k=1}^{n_l} y^{l,k}$.
A cornerstone in the derivation of the aggregation results is the structure of the
aggregate technology. In the context of output orientation, it is natural to assume
a linear structure of aggregation of the output sets, as was done in [12]: For each
group $l$ ($l = 1, \ldots, L$), the aggregate output set $\overline{P}^l(X^l)$ is the Minkowski sum of the
individual output sets across all DMUs $k$ ($k = 1, \ldots, n_l$) within the group $l$, i.e.,10

$$\overline{P}^l(X^l) \equiv \bigoplus_{k=1}^{n_l} P^{l,k}(x^{l,k}). \tag{11}$$

As a result of such a structure, $\overline{P}^l(X^l)$ inherits the regularity conditions
imposed on the individual output sets. In particular, note that the Minkowski sum
of convex sets is also a convex set.11 Thus, convexity of the individual output sets
imposed by A6 ensures convexity of $\overline{P}^l(X^l)$.
It is also worth noting that the aggregation structure defined by $\overline{P}^l(X^l)$ presumes
no reallocation of inputs across the individuals $k \in \{1, \ldots, n_l\}$ and so
depends not on the total sum of all the inputs but on the particular allocation
$X^l = (x^{l,1}, \ldots, x^{l,n_l})$. This structure also assumes there are no externalities across
firms.
Now, using the $l$th sub-group technology (11), one can define the sub-group
revenue function as

$$\overline{R}^l(X^l, p) \equiv \sup_y\{py : y \in \overline{P}^l(X^l)\}, \tag{12}$$

which, analogous to (7), gives rise to the $l$th sub-group revenue efficiency measure

$$\overline{RE}^l(X^l, \overline{Y}^l, p) \equiv \overline{R}^l(X^l, p)/p\overline{Y}^l. \tag{13}$$

10 We use ⊕ to distinguish the summation of sets (also called “Minkowski summation”) from the
standard summation; e.g., see [37].
11 For example, see [26, 44], and a more recent work of [37], as well as references therein. For other
examples involving Minkowski summation in economics, see the Shapley–Folkman–Starr theorem and
related results [50].

The Fundamental Aggregation Results

Having specific formulas for the efficiency measures defined with respect to
individual technologies and with respect to aggregate technologies raises questions
regarding the relationship between them. Ideally, one may want to establish their
equality, so that the latter can be obtained from the former via some feasible
computations, at least under some clear and reasonable conditions. Formally, the
goal is to find $f_{RE}(\cdot)$ such that

$$\overline{RE}^l(X^l, \overline{Y}^l, p) = f_{RE}(RE^1(\cdot), \ldots, RE^{n_l}(\cdot)). \tag{14}$$

In words, the goal is to find some aggregation function, which we call $f_{RE}(\cdot)$, that
can relate the aggregate measure (13) to the individual measures (7), for all firms
$k \in \{1, \ldots, n_l\}$ in a group of interest, and do so in some meaningful way, in the
sense that the group measure should represent the group. Finding a function $f_{RE}(\cdot)$
is not a difficult problem: there is an abundance of well-studied functions offered
by mathematicians. It is the “meaningful way” aspect that is the most challenging
and, as with many (if not all) notions in economics, depends on the views and
assumptions of a researcher. The goal therefore is to make the choice grounded on
and derived from some clear assumptions. If one does not like some of these assumptions,
then one may try to replace them with others and, possibly, derive new aggregation
results. This is the approach we discuss here. In particular, we also consider it
desirable that the decomposition of revenue efficiency into technical efficiency and
allocative efficiency that we have at the individual level is also maintained at the
aggregate level, so that we have

$$\overline{RE}^l(\cdot) = \overline{OTE}^l(\cdot) \times \overline{OAE}^l(\cdot), \tag{15}$$

where

$$\overline{OTE}^l(\cdot) = f_{TE}(OTE^1(\cdot), \ldots, OTE^{n_l}(\cdot)), \tag{16}$$

and

$$\overline{OAE}^l(\cdot) = f_{AE}(OAE^1(\cdot), \ldots, OAE^{n_l}(\cdot)), \tag{17}$$

where $f_{TE}(\cdot)$, $f_{AE}(\cdot)$ are also some aggregation functions to be found (potentially
different from each other and from $f_{RE}(\cdot)$) so that they ensure the aggregate
measures are related to the individual analogues. Such functions can be found using
the following fundamental theorem.

Theorem 1. For each group $l$ ($l = 1, \ldots, L$), the maximal revenue of the sub-group
of DMUs feasible from $(X^l, p)$ is equal to the sum of the maximal revenues of all its
member DMUs feasible from their $(x^{l,k}, p)$, $k = 1, \ldots, n_l$, i.e.,

$$\overline{R}^l(X^l, p) = \sum_{k=1}^{n_l} R^{l,k}(x^{l,k}, p). \tag{18}$$

This theorem is from [12], and it can be viewed as the revenue analogue to the
[25] theorem of aggregation of the profit functions, while the cost or input-oriented
analogue can be found in [7].
More importantly, this theorem provides a key to our aggregation problem, and so
it is important to understand the economic intuition behind it: The theorem says that
the sum of the revenues of individual revenue-maximizing DMUs in a sub-group is
the same as the revenue optimized over the aggregate technology (11) for this sub-
group, provided these DMUs face the same (e.g., equilibrium) output prices (and
other regularity conditions hold). That is, whether optimized individually or as a
group, the same revenue is attained under the “Law of One Price” (e.g., equilibrium
price level) for all the outputs.
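The intuition can be checked numerically in a simplified setting. The sketch below assumes, only for illustration, that each output set is the convex hull of a short list of hypothetical vertices; a linear revenue objective then attains its maximum at a vertex, and the vertices of the Minkowski sum are among the sums of one vertex from each set. The helper names are illustrative.

```python
from itertools import product


def max_revenue(vertices, p):
    """Maximal revenue p*y over the convex hull of the given vertices."""
    return max(sum(pm * ym for pm, ym in zip(p, y)) for y in vertices)


def minkowski_sum(*vertex_sets):
    """Candidate vertices of the Minkowski sum: one point taken from each set."""
    return [tuple(sum(c) for c in zip(*combo)) for combo in product(*vertex_sets)]


# Hypothetical output sets of three DMUs (given their own inputs), as vertex lists.
P1 = [(0, 0), (4, 0), (3, 2), (0, 3)]
P2 = [(0, 0), (6, 0), (5, 1), (0, 2)]
P3 = [(0, 0), (2, 0), (1, 4), (0, 5)]
p = (2, 3)  # common output prices ("Law of One Price")

# Revenue maximized over the aggregate output set, as in (12) ...
group_revenue = max_revenue(minkowski_sum(P1, P2, P3), p)
# ... versus the sum of individually maximized revenues, as in (18).
sum_of_individual = sum(max_revenue(P, p) for P in (P1, P2, P3))

print(group_revenue, sum_of_individual)
```

If the DMUs faced different prices, the group problem (12) would not be well defined with a single price vector, which is why the common-price assumption is essential here.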
The theorem above assumes full revenue efficiency (and full information) and
so, a natural question is: Why do we consider full revenue efficiency when we want
to measure output oriented inefficiency? And the answer is: Because we need it
to set a benchmark against which the inefficiency will be measured. This is in the
same fashion as how we choose the maximal output as the benchmark (although not
assuming it to be reached by each firm) so that the actual output can be measured
relative to it, in the output oriented context of efficiency measurement.12
From this fundamental theorem (as well as its cost and profit analogues), one can
then get many useful results for the aggregation of the efficiency scores, some of
which we summarize below, starting with the following corollary that first appeared
in [12] and is an immediate consequence of (18).

Corollary 1. For each group $l$ ($l = 1, \ldots, L$), we have

$$\overline{RE}^l(X^l, \overline{Y}^l, p) = \sum_{k=1}^{n_l} RE^{l,k}(x^{l,k}, y^{l,k}, p) \times S^{l,k}, \tag{19}$$

where

$$S^{l,k} = py^{l,k}/p\overline{Y}^l, \quad k = 1, \ldots, n_l. \tag{20}$$

In words, this corollary states that the weighted sum of the revenue efficiencies
of individual revenue-maximizing firms in a sub-group is the same as the revenue
efficiency with respect to the aggregate technology (11) for this same sub-group,
provided these firms face the same output prices and the standard regularity conditions
of production theory hold. In turn, this corollary implies another useful result,
which gives the weighting schemes for aggregating the technical and allocative efficiencies
into their group analogues, preserving a decomposition like (10) also at the aggregate
level. We summarize this important result in the next corollary (which also first appeared
in [12]).

12 In the input-oriented context, such a benchmark will be the cost function, while in the framework
where both input and output vectors can be changed when measuring efficiency (e.g., for efficiency
based on the directional distance function or hyperbolic measures), the natural benchmark will be
the profit function. We will briefly discuss these cases later in the chapter.
For each group l (l = 1, . . . , L), the aggregate revenue efficiency can be
decomposed multiplicatively into the weighted sum of the technical efficiencies
(where the weights are the actual revenue shares), and the weighted sum of the
allocative efficiencies (where the weights are the revenue shares corrected for
technical inefficiency) of all its member DMUs. We summarize this formally in
the next corollary.

Corollary 2. For each group l (l = 1, . . . , L), we have

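$$RE^{l}(X^{l}, Y^{l}, p) = OTE^{l} \times OAE^{l}, \tag{21}$$

where

$$OTE^{l} \equiv \sum_{k=1}^{n_l} OTE^{l,k}(x^{l,k}, y^{l,k}) \times S^{l,k}, \tag{22}$$

and

$$OAE^{l} \equiv \sum_{k=1}^{n_l} OAE^{l,k}(x^{l,k}, y^{l,k}, p) \times S_{ae}^{l,k}, \tag{23}$$

where

$$S^{l,k} \equiv \frac{p\,y^{l,k}}{p\,Y^{l}}, \qquad S_{ae}^{l,k} \equiv \frac{p\,\big(y^{l,k}\, OTE^{l,k}(x^{l,k}, y^{l,k})\big)}{p \sum_{k=1}^{n_l} \big(y^{l,k}\, OTE^{l,k}(x^{l,k}, y^{l,k})\big)}, \quad k = 1, \dots, n_l. \tag{24}$$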

In the next subsection, we provide some intuition behind these important results
from which many other results can be derived.
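As a purely numerical illustration of Corollaries 1 and 2, the following minimal sketch computes the weights in (24) and the aggregates in (19), (22) and (23) for one sub-group. All numbers (prices, outputs, technical efficiency scores and maximal revenues) are hypothetical and are assumed to have been estimated elsewhere; the variable names are ours and are not part of the original derivations.

```python
# Illustrative sketch of (19)-(24) for one sub-group; all numbers are hypothetical.
p = [2.0, 1.0]                                   # common output prices
y = [[10.0, 5.0], [4.0, 8.0], [6.0, 6.0]]        # observed outputs y^{l,k}
ote = [1.25, 1.10, 1.00]                         # Farrell output technical efficiency (>= 1)
max_rev = [35.0, 20.0, 18.0]                     # R^{l,k}(x^{l,k}, p), assumed already estimated


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


rev = [dot(p, yk) for yk in y]                            # observed revenues p y^{l,k}
rev_eff = [r_max / r for r_max, r in zip(max_rev, rev)]   # individual revenue efficiency
oae = [re_k / te_k for re_k, te_k in zip(rev_eff, ote)]   # individual allocative efficiency

# Weights (24): revenue shares, and revenue shares corrected for technical inefficiency.
s = [r / sum(rev) for r in rev]
adj = [r * t for r, t in zip(rev, ote)]
s_ae = [a / sum(adj) for a in adj]

re_group = dot(rev_eff, s)    # (19)
ote_group = dot(ote, s)       # (22)
oae_group = dot(oae, s_ae)    # (23)

print(re_group, ote_group * oae_group)  # the two numbers agree, as (21) states
```

With these inputs the group revenue efficiency equals the total maximal revenue divided by the total observed revenue, which is exactly what the revenue-share weights deliver.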
1050 V. Zelenyuk

Understanding the Fundamental Aggregation Results

Before going further, it is worth making a few intuitive remarks that should help
in clarifying the fundamental results on aggregation summarized in the previous
subsection.
First of all, it should be clear that if L = 1, then the aggregate measures above
are the efficiency measures for the entire group.
Second, note that the measure (22) can be viewed as a multi-output generalization
of Farrell’s measure of Structural Efficiency of an Industry ([17], pp. 261–262).
Third, recall that in the context of aggregation over industries, [3] derived
a similar weighting scheme, using different arguments than outlined here and
after imposing more restrictive assumptions. Meanwhile, [30] proposed the same
weights and decomposition of the aggregate revenue (although defined differently
than above) into aggregate technical efficiency and aggregate allocative efficiency
measures.13
Fourth, note that while the technical efficiency is constructed to be a price inde-
pendent measure of efficiency, the aggregation weights for obtaining the (sub)group
technical efficiency derived above depend on prices. This might be viewed as
undesirable. On the other hand, note that these weights were not chosen arbitrarily
or in an ad hoc way, but came out as a result of imposing an economic criterion of
optimizing behavior, which researchers also often consider as a benchmark when
making their choice of orientation in measuring efficiency. Intuitively, if one wants
to account for the economic importance of a DMU that obtained a particular
“standardized” efficiency score then, since prices contain important economic
information, it should not be surprising that the weights derived using the economic
optimization principle are price-dependent. Another consideration is more practical:
price information may be unavailable (or unreliable) in a given study. To circumvent
this problem, one may use the shadow prices [30]. Alternatively, one may impose
an extra assumption to make the derived weights price-independent, as we outline
in section “Price-Independent Weights”.
Fifth, a condition often referred to as the ‘Law of One Price’ was assumed to
enable feasibility of the derivations of these aggregation results. Importantly, note
that this is a necessary assumption for obtaining a positive result in the stated
aggregation problem. To be more precise, it is a necessary condition to establish
an equivalence between the aggregate notions of efficiency (defined with respect to
the aggregate technology and optimized as a group) and the dis-aggregate notions
of efficiency (defined with respect to the individual technologies and optimized
independently by each individual in the group). In other words, this “Law of One
Price” condition can be viewed as the condition of an equilibrium that ensures
the system reaches the same outcome whether optimized individually and then

13 To be precise, [30] used a similar framework, yet without an explicit relationship to the maximal
revenue defined on the sum of the output sets, without noting the theoretical link via the
analogue of the theorem of [25], and with a focus on the DEA framework.

aggregated or optimized over the aggregate technology by a group (e.g., a “central

planner” for the group). In this sense, the weights derived from this framework can
be viewed as “optimal weights,” in the sense that they are derived from a framework
where the system has reached equivalent optimal outcomes from both the aggregate
and the dis-aggregate sides.
On the other hand, without this assumption, the general impossibility theorems
of [1] are in action, which lead to much more disappointing conclusions for
practitioners (since they ensure the impossibility of the equivalence) than this quite
common condition in economic theory. Indeed, this condition is coherent with many
economic models (perfect competition, Cournot-type oligopoly, etc.), where the
notion of economic equilibrium indeed implies a common price. Like many other
theoretical assumptions, it is, of course, a simplification of reality (e.g., see [20, 27]
for a discussion). In practice, it is of course possible to use the same formulas for
the weights but with different prices and then compute an aggregate by averaging
the individual efficiencies using such “nonoptimal” (or ad hoc) weights, which
can be viewed as approximations of the “optimal weights” derived above. The
problem is that such an aggregate is not guaranteed to be equivalent to the aggregate
obtained with respect to the aggregate technology, yet it may have another useful
meaning that might be appealing from another perspective (e.g., it can be regarded
as an aggregate efficiency that accounts for the price variation across the
observations and thus shows the gap relative to the aggregate based on the “optimal
weights”).
Finally, it should not be surprising that establishing positive aggregation results
in economics requires extra and perhaps relatively strict assumptions. A good
example would be the fairly strong conditions imposed to obtain the well-known
solutions in economic theory to the aggregation of demands, whether over goods or over
consumers. Similarly, and as mentioned above, in the context of efficiency analysis,
[1] analyzed a more general aggregation problem (without considering optimization
behavior) and arrived at several impossibility results, concluding that very strong
assumptions on the technology are needed for establishing positive aggregation
results. The approach summarized above circumvents such assumptions by resorting
to the optimization behavior (as a benchmark against which inefficiency of actual
performance is measured) along with the other assumptions described above.

Aggregation of Aggregates

We now look at the case when a researcher wants to aggregate further, over
already aggregate efficiency scores, i.e., across some sub-groups within a larger
group. For example, suppose there is some partitioning of interest of the entire
group into L non-intersecting and exhaustive sub-groups l = 1, . . . , L. Let
$Y \equiv \sum_{k=1}^{n} y^{k} = \sum_{l=1}^{L} \sum_{k=1}^{n_l} y^{l,k}$ be the total output across all DMUs in all the sub-groups.
Also let the input allocation among firms within all the groups be denoted with
$X = (X^{1}, \dots, X^{L})$. If (11) is true for all l = 1, . . . , L, then we must have

$$P(X) = \bigoplus_{k=1}^{n} P^{k}(x^{k}) = \bigoplus_{l=1}^{L} P^{l}(X^{l}) = \bigoplus_{l=1}^{L} \bigoplus_{k=1}^{n_l} P^{l,k}(x^{l,k}), \tag{25}$$

i.e., the aggregate output set of all groups together is the Minkowski sum of the
group output sets, over l = 1, . . . , L. Thus, P (X) would inherit its properties
from the properties of sub-group technologies, which in turn are inherited from the
regularity conditions imposed on the output sets of individual DMUs.
Using the group technology (25), one can define the group revenue function as

R(X, p) ≡ sup{py : y ∈ P (X)}, (26)

which, similar to (7), gives rise to the group revenue efficiency measure

RE(X, Y , p) ≡ R(X, p)/pY . (27)

An immediate consequence of the previous theorem and of (25) is summarized
in the next corollary.

Corollary 3. The maximal revenue of the entire group of DMUs feasible from
(X, p) is equal to the sum of maximal revenues of all its (non-intersecting) sub-
groups of DMUs feasible from (Xl , p), l = 1, . . . , L, i.e.,

$$R(X, p) = \sum_{l=1}^{L} R^{l}(X^{l}, p) = \sum_{l=1}^{L} \sum_{k=1}^{n_l} R^{l,k}(x^{l,k}, p). \tag{28}$$

The intuition behind this result is the same as for its analogue (18): it
extends that result to the aggregation of the sub-groups into a larger group.
The corresponding result about aggregation of revenue efficiency measures is
summarized in the next corollary.
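The additivity stated in Corollary 3 can be checked numerically in a stylized setting. The sketch below is only an illustration under strong simplifying assumptions: each output set is represented by a small, hypothetical finite list of output vectors (real output sets are not finite), and the helper names `max_revenue` and `minkowski_sum` are ours.

```python
from itertools import product


def dot(p, y):
    """Revenue p*y for a price vector p and an output vector y."""
    return sum(pi * yi for pi, yi in zip(p, y))


def max_revenue(points, p):
    """Maximal revenue over a finite list of feasible output vectors."""
    return max(dot(p, y) for y in points)


def minkowski_sum(set_a, set_b):
    """All pairwise sums of output vectors from the two sets."""
    return [tuple(a + b for a, b in zip(ya, yb)) for ya, yb in product(set_a, set_b)]


# Hypothetical finite output sets for three DMUs and one common price vector.
P1 = [(0.0, 4.0), (3.0, 3.0), (5.0, 0.0)]
P2 = [(0.0, 2.0), (2.0, 1.0), (3.0, 0.0)]
P3 = [(1.0, 6.0), (4.0, 2.0)]
p = (2.0, 1.0)

group_set = minkowski_sum(minkowski_sum(P1, P2), P3)
group_revenue = max_revenue(group_set, p)                        # revenue on the aggregate set
sum_of_revenues = sum(max_revenue(P, p) for P in (P1, P2, P3))   # sum of individual maxima

print(group_revenue, sum_of_revenues)
```

Both quantities coincide here because revenue is linear in outputs and all DMUs face the same prices, which is the role of the common-price assumption in the corollary.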

Corollary 4. We have

$$RE(X, Y, p) = \sum_{l=1}^{L} RE^{l}(X^{l}, Y^{l}, p) \times S^{l}, \tag{29}$$

where

$$S^{l} = p\,Y^{l} \Big/ p \sum_{l=1}^{L} Y^{l}, \quad l = 1, \dots, L. \tag{30}$$

Intuitively, this corollary says that the weighted sum of the revenue efficiencies
of revenue-maximizing sub-groups of firms is the same as the revenue efficiency
with respect to the aggregate technology (25) for the group that unites these sub-
groups (assuming all firms face the same output prices and the standard regularity
conditions hold). That is, it is an analogue of (19)–(20). In turn, this corollary
implies the following important result.

Corollary 5. We have

RE(X, Y , p) = OT E × OAE, (31)

where

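$$OTE = \sum_{l=1}^{L} OTE^{l} \times S^{l}, \tag{32}$$

and

$$OAE = \sum_{l=1}^{L} OAE^{l} \times S_{ae}^{l}, \tag{33}$$

where

$$S^{l} = p\,Y^{l} \Big/ p \sum_{l=1}^{L} Y^{l}, \quad l = 1, \dots, L, \tag{34}$$

and

$$S_{ae}^{l} = p\,Y^{l} \times OTE^{l} \Big/ p \sum_{l=1}^{L} Y^{l} \times OTE^{l}, \quad l = 1, \dots, L. \tag{35}$$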

Intuitively, this last corollary provides a theoretically justified weighting scheme for
an aggregation over sub-groups of the aggregate technical and aggregate allocative
efficiencies into more aggregate analogues, and such that they decompose the
aggregated revenue efficiency. Thus, this approach provides “internally consistent”
aggregation within and between the sub-groups.
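The “internal consistency” of the two-level scheme can be illustrated with a small sketch: aggregating first within sub-groups via (19)–(24) and then between sub-groups via (29)–(35) gives the same result as aggregating all firms at once. The data below are hypothetical, and the function `aggregate` is our own helper, written under the assumption that observed revenues and individual scores are already available.

```python
def aggregate(revenues, re_scores, ote_scores):
    """Revenue-share aggregation of RE, OTE and OAE scores.

    Returns (total revenue, aggregate RE, aggregate OTE, aggregate OAE).
    """
    total = sum(revenues)
    s = [r / total for r in revenues]                      # revenue shares
    adj = [r * t for r, t in zip(revenues, ote_scores)]    # revenue corrected for tech. inefficiency
    s_ae = [a / sum(adj) for a in adj]
    re_agg = sum(w * v for w, v in zip(s, re_scores))
    ote_agg = sum(w * v for w, v in zip(s, ote_scores))
    oae_agg = sum(w * r / t for w, r, t in zip(s_ae, re_scores, ote_scores))
    return total, re_agg, ote_agg, oae_agg


# Hypothetical sub-groups: observed revenues, RE scores and OTE scores of member firms.
groups = {
    "A": {"rev": [25.0, 16.0, 18.0], "re": [1.40, 1.25, 1.00], "ote": [1.25, 1.10, 1.00]},
    "B": {"rev": [30.0, 10.0], "re": [1.20, 1.50], "ote": [1.10, 1.20]},
}

# Step 1: aggregate within each sub-group.
within = [aggregate(g["rev"], g["re"], g["ote"]) for g in groups.values()]

# Step 2: aggregate the sub-group scores, weighting by sub-group revenues.
two_stage = aggregate([w[0] for w in within], [w[1] for w in within], [w[2] for w in within])

# Benchmark: aggregate all firms at once.
pooled = aggregate(
    [r for g in groups.values() for r in g["rev"]],
    [r for g in groups.values() for r in g["re"]],
    [t for g in groups.values() for t in g["ote"]],
)

print(two_stage)
print(pooled)
```

The same helper serves both levels because the between-group weights (34) and (35) have the same form as the within-group weights (24), with sub-group totals in place of firm-level quantities.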

Price-Independent Weights

In this section we summarize the method for converting the price-dependent
weights derived above into price-independent weights such that the same aggregation
scheme based on and derived from the economic principles is preserved. This
method was proposed by [12, 14, 48]. We first focus on the case of aggregating

efficiency scores of the entire group. The key additional assumption here is the
following:

$$p_m Y_m \Big/ \sum_{m=1}^{M} p_m Y_m = \alpha_m, \quad m = 1, \dots, M, \tag{36}$$

where $Y_m \equiv \sum_{k=1}^{n} y_m^{k}$ and $\alpha_m \in (0, 1)$ is a constant (known or estimated) for all
$m \in \{1, \dots, M\}$, with the normalization $\sum_{m=1}^{M} \alpha_m = 1$. Intuitively, (36) states that the
weight of the industry revenue from the output m in the industry total revenue equals
$\alpha_m$. Furthermore, let us denote $\varpi_m^{k} = y_m^{k} / Y_m$ to be the weight of the k-th firm in the
group in terms of the m-th output, and let us impose the condition (36) upon the
weights for the aggregation of the revenue and technical efficiency scores derived
above, to obtain

$$S^{k} = \sum_{m=1}^{M} \alpha_m \varpi_m^{k}, \quad k = 1, \dots, n. \tag{37}$$

Intuitively, (37) says that the weight of a firm is the weighted average over all the
output shares of this firm in its group, where the weights are the revenue shares of
the industry for each output m in the total revenue of the industry. Next, use (36)
and (34) to derive the weights for aggregating “between the sub-groups”

$$S^{l} = \sum_{m=1}^{M} \alpha_m W_m^{l}, \quad l = 1, \dots, L, \tag{38}$$

where $W_m^{l} = Y_m^{l} / Y_m$ is the share of the l-th sub-group in the entire group in terms
of the m-th output. Furthermore, with a bit more algebra we can derive the
price-independent weight for an individual efficiency of firm k “within a sub-group l” to be

$$S^{l,k} = S^{k} / S^{l}, \quad k = 1, \dots, n_l; \; l = 1, \dots, L, \tag{39}$$

i.e., we get an analogue of (37) which accounts for the weight of each particular
sub-group in the entire group.
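The price-independent weights in (37)–(39) only require output quantities and the shares $\alpha_m$. The sketch below is a minimal illustration with hypothetical data; the function name `price_independent_weights` and the grouping are ours. It also checks that, when $\alpha_m$ is computed from a common price vector as in (36), the weights coincide with ordinary revenue shares.

```python
def price_independent_weights(y, alpha):
    """Firm weights as in (37): alpha-weighted average of output shares y_m^k / Y_m."""
    n_outputs = len(alpha)
    totals = [sum(yk[m] for yk in y) for m in range(n_outputs)]   # Y_m
    return [sum(alpha[m] * yk[m] / totals[m] for m in range(n_outputs)) for yk in y]


# Hypothetical data: four firms, two outputs, two sub-groups (lists of firm indices).
y = [[10.0, 5.0], [4.0, 8.0], [6.0, 6.0], [30.0, 2.0]]
groups = {"A": [0, 1], "B": [2, 3]}
alpha = [0.7, 0.3]                       # assumed known (or estimated) revenue shares of outputs

S = price_independent_weights(y, alpha)                                        # (37)
# Summing firm weights within a sub-group reproduces (38), since W_m^l is the
# sum of the members' output shares.
S_group = {l: sum(S[k] for k in ks) for l, ks in groups.items()}
S_within = {l: [S[k] / S_group[l] for k in ks] for l, ks in groups.items()}    # (39)

# Consistency check: if alpha is derived from common prices via (36),
# the weights equal the revenue shares p y^k / p Y.
p = [2.0, 1.0]
Y = [sum(yk[m] for yk in y) for m in range(2)]
alpha_p = [p[m] * Y[m] / sum(p[j] * Y[j] for j in range(2)) for m in range(2)]
S_p = price_independent_weights(y, alpha_p)
revenue_shares = [sum(p[m] * yk[m] for m in range(2)) / sum(p[j] * Y[j] for j in range(2)) for yk in y]

print(S, S_group, S_within)
```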
On the other hand, the price-independent weights for aggregating allocative
efficiencies are derived similarly as above but where the observed outputs are
replaced with their technically-efficient analogues, i.e.,

$$S_{ae}^{k} = \frac{S^{k} \times OTE^{k}(x^{k}, y^{k})}{\sum_{k=1}^{n} S^{k} \times OTE^{k}(x^{k}, y^{k})}, \quad k = 1, \dots, n, \tag{40}$$

where $S^{k}$ is given in (37).


Meanwhile, we can also employ the standardization in (36), along with (34), to
get the weights for aggregating “between the sub-groups” to be given by
$$S_{ae}^{l} = \frac{\sum_{m=1}^{M} \alpha_m W_m^{l}\, OTE^{l}}{\sum_{l=1}^{L} \sum_{m=1}^{M} \alpha_m W_m^{l}\, OTE^{l}} = \frac{OTE^{l} \times S^{l}}{\sum_{l=1}^{L} OTE^{l} \times S^{l}}, \quad l = 1, \dots, L, \tag{41}$$

where $W_m^{l} = Y_m^{l} / Y_m$ is the weight of the l-th sub-group in the entire group in terms of
the m-th output. Note that (41) is analogous to what we obtained for the individual
firms but for the sub-group level. Moreover, (41) can be used to derive the weight
of an individual efficiency of firm k “within a sub-group l” to be

$$S_{ae}^{l,k} = S_{ae}^{k} / S_{ae}^{l}, \quad k = 1, \dots, n_l; \; l = 1, \dots, L.$$

Finally, note that all the derivations here were done for the case of the output
orientation and analogous derivations can be made for the case of input orientation
(and, potentially, for the joint input-output or profit-orientation), which we leave as
exercises for the readers (see [7, 31, 32] for some related derivations).

Aggregation of Productivity Indexes

As with the efficiency scores, applied studies involving productivity
indexes usually need to present some aggregates of the estimated productivity
indexes – to summarize the overall tendencies in a sample, to perform statistical
inference about the population, etc. Typically, researchers use the simple or the
equally weighted geometric mean for this purpose. The discussion above suggests
that it would also be important to have some well-justified weights when aggregating
productivity indexes. Such weights would help in accounting for the relative
importance of each firm whose index is entering into the average. This question
was first addressed by [56], who derived an aggregation scheme for the Malmquist
productivity index (MPI), and we summarize this approach in this section.14 To
simplify the notation, from now on we will consider just one group, i.e., drop the
sub-group subscript l (but add the time subscript τ = s, t).

Individual Malmquist Productivity Indexes

Let us first recall the definitions of the MPI. We will focus on measuring changes
in productivity from a period s to a period t (s < t). Recall that the output-oriented
MPI can be defined as

14 Also see [31, 33] for extensions of this approach.

$$M^{k}(y_s^{k}, y_t^{k}, x_s^{k}, x_t^{k}) \equiv \left[\frac{D_s^{k}(x_t^{k}, y_t^{k})}{D_s^{k}(x_s^{k}, y_s^{k})} \times \frac{D_t^{k}(x_t^{k}, y_t^{k})}{D_t^{k}(x_s^{k}, y_s^{k})}\right]^{1/2}, \tag{42}$$

where $D_s^{k}(x_t^{k}, y_t^{k})$ is Shephard’s output-oriented distance function that we now
cast in the inter-temporal framework, characterizing the technology of DMU k in period
s and evaluated at the point $(x_t^{k}, y_t^{k})$. Note that we dropped the subscript “o” to
simplify our already intense notation.15
In the light of the duality between the distance function and the revenue function,
one can also define the revenue (or dual) analogue of the MPI as

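$$RM^{k}(\cdot) \equiv RM^{k}(p_s, p_t, y_s^{k}, y_t^{k}, x_s^{k}, x_t^{k}) \equiv \left[\left(\frac{RE_s^{k}(x_t^{k}, y_t^{k}, p_t)}{RE_s^{k}(x_s^{k}, y_s^{k}, p_s)} \times \frac{RE_t^{k}(x_t^{k}, y_t^{k}, p_t)}{RE_t^{k}(x_s^{k}, y_s^{k}, p_s)}\right)^{-1}\right]^{1/2}, \tag{43}$$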

which, naturally, can be decomposed as

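$$RM^{k}(\cdot) \equiv M^{k}(\cdot) \times AM^{k}(\cdot), \tag{44}$$

where $M^{k}(\cdot)$ is defined in (42) and $AM^{k}(\cdot)$ is the allocative component of the dual
MPI, defined as

$$AM^{k}(\cdot) \equiv AM^{k}(p_s, p_t, y_s^{k}, y_t^{k}, x_s^{k}, x_t^{k}) \equiv \left[\left(\frac{OAE_s^{k}(x_t^{k}, y_t^{k}, p_t)}{OAE_s^{k}(x_s^{k}, y_s^{k}, p_s)} \times \frac{OAE_t^{k}(x_t^{k}, y_t^{k}, p_t)}{OAE_t^{k}(x_s^{k}, y_s^{k}, p_s)}\right)^{-1}\right]^{1/2}. \tag{45}$$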

Aggregation Problem: Inter-temporal Perspective

Here we adapt the aggregation concepts outlined above to the inter-temporal
framework. As for the case of efficiency aggregation, a key stepping-stone for
deriving the aggregation results for productivity indexes is to define a relevant group
technology, and as before, here we admit the additive structure of aggregation of the
output sets, i.e.,

$$P_{\tau}(X) \equiv \bigoplus_{k=1}^{n} P_{\tau}^{k}(x^{k}), \quad \tau = s, t, \tag{46}$$

15 Again, here we focus on the output orientation case and similar developments can be done for
the input orientation case. See [31, 33] for some of these details.

and so the group revenue function at period τ is given by

$$R_{\tau}(X, p) \equiv \max\{py : y \in P_{\tau}(X)\}, \quad \tau = s, t, \tag{47}$$

while the revenue efficiency at τ is given by

$$RE_{\tau}(X, Y, p) \equiv R_{\tau}(X, p) / pY, \quad \tau = s, t. \tag{48}$$

Now, to measure changes in productivity between s and t, let the group (aggregate)
analogue of (43) be

$$RM(p_s, p_t, Y_s, Y_t, X_s, X_t) \equiv \left[\left(\frac{RE_s(X_t, Y_t, p_t)}{RE_s(X_s, Y_s, p_s)} \times \frac{RE_t(X_t, Y_t, p_t)}{RE_t(X_s, Y_s, p_s)}\right)^{-1}\right]^{1/2}, \tag{49}$$
where the time subscripts indicate the particular values of efficiency measures for
specific periods τ = s, t.
Ideally, we want to find an aggregation function fRM (·) that can relate the
aggregate measure (49) to the individual measures (43) in some “meaningful” way.
Being unable to find such a “meaningful” way, [56] resorted to something that
may seem “less than ideal,” yet feasible – find an aggregation function fRE (·) that
can relate the aggregate measure (49) to all the components of all the individual
measures (43), in a “meaningful” way, i.e., so that we have

RM(ps , pt , Y s , Y t , Xs , Xt ) = fRE (REτ1 (·), . . . , REτn (·)), τ = s, t, (50)

such that, preferably, the decomposition (44) is maintained at the aggregate level,
i.e.,

RM(ps , pt , Y s , Y t , Xs , Xt ) = M(·) × AM(·), (51)

where, in turn, one needs to find some aggregation functions fD (·), fA (·) that
ensure that the aggregate primal MPI is related to all the components of all the
individual analogues (42), i.e.,

M(·) ≡ M(Y s , Y t , Xs , Xt ) ≡ fD (Dτ1 (·), . . . , Dτn (·)), τ = s, t, (52)

while the aggregate allocative-MPI is related to (45) or its individual components,
i.e.,

AM(·) ≡ AM(Y s , Y t , Xs , Xt ) ≡ fA (OAEτ1 (·), . . . , OAEτn (·)), τ = s, t. (53)


Such functions are found in the next subsection using, again, the Koopmans-type
arguments that we described above.

Aggregation of the MPIs

As for the aggregation of efficiency scores, the foundation here is the inter-temporal
extension of the aggregation theory from [12], which in turn is an adaptation of
aggregation theory from [25], which we restate below casting it in the inter-temporal
framework, with the time subscript τ = s, t:
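$$R_{\tau}(X, p) = \sum_{k=1}^{n} R_{\tau}^{k}(x^{k}, p), \quad x^{k} \in \mathbb{R}_{+}^{N}, \; \forall k = 1, \dots, n, \; p \in \mathbb{R}_{++}^{M}, \tag{54}$$

and therefore, for j, τ = s, t, we have

$$RE_{\tau}(X_j, Y_j, p_j) = \sum_{k=1}^{n} RE_{\tau}^{k}(x_j^{k}, y_j^{k}, p_j) \times S_j^{k}, \tag{55}$$

where

$$S_j^{k} \equiv p_j\, y_j^{k} / p_j\, Y_j, \quad k = 1, \dots, n. \tag{56}$$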

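Thus, the decomposition is maintained at the aggregate level: for any j, τ = s, t,
we have

$$RE_{\tau}(X_j, Y_j, p_j) = OTE_{\tau}(j) \times OAE_{\tau}(j), \tag{57}$$

where

$$OTE_{\tau}(j) \equiv \sum_{k=1}^{n} [D_{\tau}^{k}(x_j^{k}, y_j^{k})]^{-1} \times S_j^{k}, \tag{58}$$

$$OAE_{\tau}(j) \equiv \sum_{k=1}^{n} OAE_{\tau}^{k}(x_j^{k}, y_j^{k}, p_j) \times S_{ae,\tau,j}^{k}, \tag{59}$$

and

$$S_{ae,\tau,j}^{k} \equiv \frac{p_j\, y_j^{k} / D_{\tau}^{k}(x_j^{k}, y_j^{k})}{p_j \sum_{k=1}^{n} y_j^{k} / D_{\tau}^{k}(x_j^{k}, y_j^{k})}, \quad k = 1, \dots, n. \tag{60}$$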

Furthermore, applying (49), (55) and (56) we get a desired aggregation result – a
solution to (50), given by:
$$RM(p_s, p_t, Y_s, Y_t, X_s, X_t) = \left[\left(\frac{\sum_{k=1}^{n} RE_s^{k}(x_t^{k}, y_t^{k}, p_t) \times S_t^{k}}{\sum_{k=1}^{n} RE_s^{k}(x_s^{k}, y_s^{k}, p_s) \times S_s^{k}} \times \frac{\sum_{k=1}^{n} RE_t^{k}(x_t^{k}, y_t^{k}, p_t) \times S_t^{k}}{\sum_{k=1}^{n} RE_t^{k}(x_s^{k}, y_s^{k}, p_s) \times S_s^{k}}\right)^{-1}\right]^{1/2}. \tag{61}$$

Importantly, note that the decomposition at the aggregate level is preserved and
given by

RM(ps , pt , Y s , Y t , Xs , Xt ) = M(·) × AM(·), (62)

where the solutions to (52) and (53) are given, respectively, by

$$M(Y_s, Y_t, X_s, X_t) = \left[\left(\frac{OTE_s(t)}{OTE_s(s)} \times \frac{OTE_t(t)}{OTE_t(s)}\right)^{-1}\right]^{1/2}, \tag{63}$$

and

$$AM(p_s, p_t, Y_s, Y_t, X_s, X_t) = \left[\left(\frac{OAE_s(t)}{OAE_s(s)} \times \frac{OAE_t(t)}{OAE_t(s)}\right)^{-1}\right]^{1/2}, \tag{64}$$

where, in turn, the four components inside (63) are given in (58) while the four
components inside (64) are given in (59).
The theoretical and practical importance of these results is that they give
explicit formulas for aggregation of the MPIs. In particular, they give a way of
obtaining a group productivity change score from the individual analogues, where
the aggregation function and the aggregation weights are not ad hoc but derived
from economic principles, besides being intuitive.
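To make the aggregation formula concrete, the following sketch computes the aggregate primal MPI in (63) from individual distance function values and revenue shares, using (56) and (58). The distance function values and revenues are hypothetical and assumed to have been estimated beforehand; the dictionary layout and function names are ours.

```python
from math import sqrt


def shares(revenues):
    """Revenue shares S_j^k as in (56)."""
    total = sum(revenues)
    return [r / total for r in revenues]


def ote_agg(d, rev, tau, j):
    """Aggregate technical efficiency OTE_tau(j) as in (58)."""
    return sum(w / dk for w, dk in zip(shares(rev[j]), d[(tau, j)]))


def aggregate_mpi(d, rev):
    """Aggregate primal MPI as in (63)."""
    ratio = (ote_agg(d, rev, "s", "t") / ote_agg(d, rev, "s", "s")) * (
        ote_agg(d, rev, "t", "t") / ote_agg(d, rev, "t", "s")
    )
    return sqrt(1.0 / ratio)


def individual_mpi(d, k):
    """Individual MPI of firm k as in (42)."""
    return sqrt(
        d[("s", "t")][k] / d[("s", "s")][k] * d[("t", "t")][k] / d[("t", "s")][k]
    )


# Hypothetical inputs: d[(tau, j)][k] is D_tau^k evaluated at firm k's period-j data;
# rev[j][k] is firm k's observed revenue in period j.
d = {
    ("s", "s"): [0.80, 0.90, 1.00],
    ("s", "t"): [0.95, 1.05, 1.10],
    ("t", "s"): [0.70, 0.85, 0.90],
    ("t", "t"): [0.85, 0.95, 1.00],
}
rev = {"s": [25.0, 16.0, 18.0], "t": [30.0, 15.0, 20.0]}

print([individual_mpi(d, k) for k in range(3)])
print(aggregate_mpi(d, rev))
```

A simple sanity check of this sketch is that, for a group consisting of a single firm, the aggregate index reduces to that firm's individual MPI.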

Geometric vs. Harmonic Averaging

In earlier studies, noting the multiplicative nature of the MPI, researchers often
used not only equal weights but also geometric rather than arithmetic or
harmonic averaging of the individual estimates when they wished to summarize
the point estimates of MPIs (e.g., see [4]). That is, not only were the weights equal,
but also the aggregating function used in previous practice was quite different from
what the theoretical derivations in the previous subsection suggested. How can these
different approaches be reconciled? From the discussions above, it must be clear
that the weights can dramatically influence the results, whether quantitatively or
qualitatively. A natural question is whether the functional form of the aggregation is

critical and, in particular, can one use the geometric mean rather than the arithmetic
mean?
This question was also addressed by [56], who pointed out that (63) can be
restated in terms of harmonic aggregations of individual distance functions, i.e.,
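$$M(\cdot) = \left[\frac{\left(\sum_{k=1}^{n} [D_s^{k}(x_t^{k}, y_t^{k})]^{-1} \times S_t^{k}\right)^{-1}}{\left(\sum_{k=1}^{n} [D_s^{k}(x_s^{k}, y_s^{k})]^{-1} \times S_s^{k}\right)^{-1}} \times \frac{\left(\sum_{k=1}^{n} [D_t^{k}(x_t^{k}, y_t^{k})]^{-1} \times S_t^{k}\right)^{-1}}{\left(\sum_{k=1}^{n} [D_t^{k}(x_s^{k}, y_s^{k})]^{-1} \times S_s^{k}\right)^{-1}}\right]^{1/2}, \tag{65}$$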

and its geometric analogue can be defined as

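$$M^{G}(\cdot) \equiv \left[\frac{\prod_{k=1}^{n} D_s^{k}(x_t^{k}, y_t^{k})^{\omega_t^{k}}}{\prod_{k=1}^{n} D_s^{k}(x_s^{k}, y_s^{k})^{\omega_s^{k}}} \times \frac{\prod_{k=1}^{n} D_t^{k}(x_t^{k}, y_t^{k})^{\omega_t^{k}}}{\prod_{k=1}^{n} D_t^{k}(x_s^{k}, y_s^{k})^{\omega_s^{k}}}\right]^{1/2}, \tag{66}$$

for some weights $\omega_t^{k}$, $\omega_s^{k}$.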

The aggregation usually used in practice is a particular case of (66) that assumes
equal weights across all k, i.e., $\omega_t^{k} = \omega_s^{k} = 1/n$. It must be clear that, in general, (65)
is not equal to (66) and, in fact, no exact general relationship exists between
the two. However, taking the first-order approximation of $\prod_{k=1}^{n} D_s^{k}(x_t^{k}, y_t^{k})^{S_t^{k}}$ and
of $\left(\sum_{k=1}^{n} [D_s^{k}(x_t^{k}, y_t^{k})]^{-1} \times S_t^{k}\right)^{-1}$ around unity (which is a natural point around
which productivity and efficiency indexes can be approximated), in both cases we
get $\sum_{k=1}^{n} D_s^{k}(x_t^{k}, y_t^{k}) \times S_t^{k}$, meaning that one can conclude

$$M(Y_s, Y_t, X_s, X_t) \cong M^{G}(Y_s, Y_t, X_s, X_t), \quad \text{for } (\omega_t^{k}, \omega_s^{k}) = (S_t^{k}, S_s^{k}). \tag{67}$$

In words, (67) states that a first-order approximation relationship exists between
the aggregate MPI constructed with harmonic components derived above and the
geometric aggregate of individual MPIs, if both use the same set of weights. This
implies that, for anyone who prefers the geometric aggregation, this relationship
gives a justification for choosing the aggregation weights (which are more influ-
ential) – the weights derived from economic principles, which account for the
economic weight of each firm.
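The closeness of the harmonic-type and geometric-type aggregates under common weights can be explored with a short sketch. It compares weighted harmonic, geometric and arithmetic means of hypothetical scores drawn uniformly from [0.75, 1.25], standing in for one component such as $D_s^{k}(x_t^{k}, y_t^{k})$; the sample size, seed and weight distribution are arbitrary choices of ours, not those of [56].

```python
import math
import random


def harmonic(values, weights):
    """Weighted harmonic mean."""
    return 1.0 / sum(w / v for v, w in zip(values, weights))


def geometric(values, weights):
    """Weighted geometric mean."""
    return math.exp(sum(w * math.log(v) for v, w in zip(values, weights)))


def arithmetic(values, weights):
    """Weighted arithmetic mean."""
    return sum(w * v for v, w in zip(values, weights))


random.seed(0)
n = 50
scores = [random.uniform(0.75, 1.25) for _ in range(n)]   # hypothetical index components
raw = [random.uniform(1.0, 10.0) for _ in range(n)]       # hypothetical revenues
weights = [r / sum(raw) for r in raw]                     # revenue-share weights

h = harmonic(scores, weights)
g = geometric(scores, weights)
a = arithmetic(scores, weights)

print(h, g, a)   # ordered as harmonic <= geometric <= arithmetic
```

Under the same weights the three means are always ordered in this way, and for scores scattered moderately around unity they tend to lie close to each other, which is the sense in which the choice of weights matters more than the choice of the averaging function.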
A natural question is “How substantial is the difference between the geometric
and harmonic aggregations?”16 Zelenyuk [56] presented some simulation results

16 From theory, it is known that under the same weighting scheme, the geometric mean is larger
than the harmonic mean but smaller than the arithmetic mean. Note however that the aggregate MPI
in (65) involves products of ratios of the harmonic means and so it can be smaller or greater than

confirming that the difference is fairly small. For instance, if the scores of a
productivity index come from a uniform distribution around unity with a range
of 50 percentage points (thus allowing a substantial change), then the square root
of the mean squared difference between the harmonic and geometric means across
various simulations was only about 1 percentage point. Thus, a practical implication
that one can deduce from here is that the geometric-type and the harmonic-type
aggregations of the productivity indexes (under the same weights) give similar
aggregate scores for moderate variations of the scores being aggregated. In other
words, the aggregation function per se (whether geometric, harmonic or arithmetic)
is not as crucial; what is more important are the weights of aggregation, which
need to be justified on some theoretical grounds.17

Decomposition and Aggregation

The aggregation results we summarized above can also be extended to the aggre-
gation of components of various decompositions of MPIs. While there are many
decompositions of MPI offered in the literature, here we focus on what seems to be
the most popular decomposition in practice – the one proposed in the seminal work
of [4], as the following

M k (·) ≡ EF CH k (·) × T ECH k (·), (68)

where the first component is referred to as the efficiency change, defined as

$$EFCH^{k}(\cdot) \equiv EFCH^{k}(y_s^{k}, y_t^{k}, x_s^{k}, x_t^{k}) \equiv \frac{D_t^{k}(x_t^{k}, y_t^{k})}{D_s^{k}(x_s^{k}, y_s^{k})}, \tag{69}$$

and the second component is referred to as the technological change, defined as

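$$TECH^{k}(\cdot) \equiv TECH^{k}(y_s^{k}, y_t^{k}, x_s^{k}, x_t^{k}) \equiv \left[\frac{D_s^{k}(x_t^{k}, y_t^{k})}{D_t^{k}(x_t^{k}, y_t^{k})} \times \frac{D_s^{k}(x_s^{k}, y_s^{k})}{D_t^{k}(x_s^{k}, y_s^{k})}\right]^{1/2}. \tag{70}$$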

The aggregation question then is to find appropriate group analogues to (69)
and (70), i.e., some functions fEC (·) and fTC (·) that relate the aggregate measures

the aggregate MPI obtained via a geometric mean as in (66), depending on the relative magnitudes
that appear in the numerators and denominators of (65). Both means are approximately equal (to
the arithmetic mean) in the sense of first order approximation around unity.
17 One should however be careful aggregating when there are scores equal or very close to zero:

both geometric and harmonic averages completely fail if at least one element is zero and may
yield an unreasonably low aggregate score if at least one element is very close to zero (even if
many others have large efficiency or productivity scores), unless they are “neutralized” by a very
low weight in the aggregation, as can be done with weighted aggregates. In such cases, using
arithmetic aggregation, which is less sensitive to the outliers, could also be a better solution.

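to individual ones. As above, a natural choice is to utilize the Koopmans-type
arguments, as was done in [56], to arrive at

$$EFCH(\cdot) = \left[\frac{OTE_t(t)}{OTE_s(s)}\right]^{-1} = \frac{\left(\sum_{k=1}^{n} [D_t^{k}(x_t^{k}, y_t^{k})]^{-1} \times S_t^{k}\right)^{-1}}{\left(\sum_{k=1}^{n} [D_s^{k}(x_s^{k}, y_s^{k})]^{-1} \times S_s^{k}\right)^{-1}}, \tag{71}$$

and

$$TECH(\cdot) = \left[\left(\frac{OTE_s(t)}{OTE_t(t)} \times \frac{OTE_s(s)}{OTE_t(s)}\right)^{-1}\right]^{1/2} = \left[\frac{\left(\sum_{k=1}^{n} [D_s^{k}(x_t^{k}, y_t^{k})]^{-1} \times S_t^{k}\right)^{-1}}{\left(\sum_{k=1}^{n} [D_t^{k}(x_t^{k}, y_t^{k})]^{-1} \times S_t^{k}\right)^{-1}} \times \frac{\left(\sum_{k=1}^{n} [D_s^{k}(x_s^{k}, y_s^{k})]^{-1} \times S_s^{k}\right)^{-1}}{\left(\sum_{k=1}^{n} [D_t^{k}(x_s^{k}, y_s^{k})]^{-1} \times S_s^{k}\right)^{-1}}\right]^{1/2}. \tag{72}$$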

As before, the first-order approximation relationship can also be established
between the harmonic-type aggregations in (71) and (72) and their geometric
analogues. Moreover, these aggregation results can also be extended to aggregation
across or over larger groups, in a similar manner as for aggregating efficiency scores
that we discussed above, i.e., extending [48].
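The aggregate efficiency change and technological change components can be computed from the same ingredients as the aggregate MPI. The sketch below, with hypothetical distance function values and revenues and helper names of our own, evaluates (71) and (72) and verifies that their product reproduces the aggregate MPI in (63).

```python
from math import sqrt

# Hypothetical inputs: d[(tau, j)][k] is D_tau^k evaluated at firm k's period-j data;
# rev[j][k] is firm k's observed revenue in period j.
d = {
    ("s", "s"): [0.80, 0.90, 1.00],
    ("s", "t"): [0.95, 1.05, 1.10],
    ("t", "s"): [0.70, 0.85, 0.90],
    ("t", "t"): [0.85, 0.95, 1.00],
}
rev = {"s": [25.0, 16.0, 18.0], "t": [30.0, 15.0, 20.0]}


def ote(tau, j):
    """Aggregate technical efficiency OTE_tau(j) as in (58), with revenue-share weights."""
    total = sum(rev[j])
    return sum((r / total) / dk for r, dk in zip(rev[j], d[(tau, j)]))


# (71): aggregate efficiency change.
efch = ote("s", "s") / ote("t", "t")

# (72): aggregate technological change.
tech = sqrt((ote("t", "t") / ote("s", "t")) * (ote("t", "s") / ote("s", "s")))

# (63): aggregate MPI, for comparison with the product of the two components.
mpi = sqrt((ote("s", "s") / ote("s", "t")) * (ote("t", "s") / ote("t", "t")))

print(efch, tech, efch * tech, mpi)
```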

Aggregation for Scale Measures

Measurement of economies of scale for an individual firm or for an industry has
been one of the most frequently addressed research questions in economics and
applied econometrics. This is usually done via estimating such measures as scale
elasticity and/or scale efficiency. Here we will focus on the elasticity approach,
following [15], while the aggregation for scale efficiency can be found in [59].
For analyzing economies of scale for a group (e.g., industry or sub-industry),
researchers usually estimate the elasticity at some points of interest, e.g., the non-
weighted mean or the median of the data or, alternatively, the non-weighted mean
of the individual estimates of scale elasticities. Importantly, note that these different
approaches do not give the same information, in general, and each has certain
theoretical or practical appeals and caveats. Here we discuss another theoretical
approach of measuring scale elasticity of a group, which is based on a similar
aggregation result as that derived above.
So far we considered the output-oriented framework, and in this section, because
researchers often focus on elasticity of the cost function, we will consider the case
of input orientation. To do so, first note that the technology set of firm k can be
equivalently characterized by the input requirement sets, defined as
$$L^{k}(y^{k}) \equiv \{x : x \text{ can produce } y^{k}\}, \quad y^{k} \in \mathbb{R}_{+}^{M}, \tag{73}$$

so that $x^{k} \in L^{k}(y^{k}),\ y^{k} \in \mathbb{R}_{+}^{M} \iff (x^{k}, y^{k}) \in T^{k}$. In turn, technology
can also be equivalently characterized by the input-oriented [45] distance function
$D_i^{k} : \mathbb{R}_{+}^{M} \times \mathbb{R}_{+}^{N} \to \mathbb{R}_{+} \cup \{\infty\}$, defined as

$$D_i^{k}(y^{k}, x^{k}) \equiv \sup\{\delta > 0 : x^{k}/\delta \in L^{k}(y^{k})\}. \tag{74}$$

A closely related concept is the input-oriented Farrell measure of technical
efficiency:

$$ITE^{k}(y^{k}, x^{k}) \equiv \inf\{\theta > 0 : \theta x^{k} \in L^{k}(y^{k})\} = 1 / D_i^{k}(y^{k}, x^{k}). \tag{75}$$

In addition to assuming the main regularity axioms of production theory (A1-A5),
we also assume convexity of the input requirement sets, i.e.,

A7: Input requirement sets $L^{k}(y^{k})$ are convex, $\forall y^{k} \in \mathbb{R}_{+}^{M}$.

As a result, due to duality theory in economics (see [10, 45, 47]), the technology
can be equivalently characterized by the cost function, $C^{k} : \mathbb{R}_{+}^{M} \times \mathbb{R}_{++}^{N} \to \mathbb{R}_{+} \cup \{\infty\}$,
defined as

C k (y k , w) ≡ inf{wx : x ∈ L k (y k )}, (76)

where $w \equiv (w_1, \dots, w_N) \in \mathbb{R}_{++}^{N}$ is the vector of input prices. The related
efficiency measure for a DMU k in the dual input-oriented framework would then be
the cost efficiency (also referred to as the overall input efficiency), defined formally
as

CE k (y k , x k , w) ≡ C k (y k , w)/wx k . (77)

From the duality theory for the cost function, we also have the so-called Mahler’s
inequality:

$$C^{k}(y^{k}, w) \leq w x^{k} / D_i^{k}(y^{k}, x^{k}), \tag{78}$$

leading to the notion of the input-oriented allocative (in)efficiency, defined as a
multiplicative residual that turns (78) into an equality, i.e.,

$$IAE^{k}(y^{k}, x^{k}, w) \equiv CE^{k}(y^{k}, x^{k}, w) / ITE^{k}(y^{k}, x^{k}), \tag{79}$$

and so we have another useful decomposition:

$$CE^{k}(y^{k}, x^{k}, w) = ITE^{k}(y^{k}, x^{k}) \times IAE^{k}(y^{k}, x^{k}, w). \tag{80}$$

We will use these efficiency measures and their aggregate analogues later in the
chapter, while for the framework of scale elasticity, we will focus on the distance
function and the cost function characterizations.
It is important to note that one can use both the primal and the dual charac-
terizations to measure economies of scale via the scale elasticity. Specifically, with
appropriate differentiability assumptions, for the dual framework the scale elasticity
is defined as18
$$e_c(y^{k}, w) \equiv \left.\frac{\partial \ln C^{k}(y^{k}\theta, w)}{\partial \ln \theta}\right|_{\theta=1} = \frac{\nabla_{y^{k}} C^{k}(y^{k}, w)\, y^{k}}{C^{k}(y^{k}, w)}, \tag{81}$$

and for the primal framework, the scale elasticity is defined as19

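$$e_i(y^{k}, x^{k}) \equiv \left.\frac{\partial \ln \lambda}{\partial \ln \theta}\right|_{D_i^{k}(y^{k}\theta,\, x^{k}\lambda)=1,\ \theta=1,\ \lambda=1} = -\nabla_{y^{k}} D_i^{k}(y^{k}, x^{k})\, y^{k}. \tag{82}$$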

Now, suppose x ∗k is a solution to (76), then one can obtain equality between the
dual and the primal measures, i.e.,20

ec (y k , w) = ei (y k , x ∗k ), (83)

where x ∗k is a solution to (76), i.e.,

x ∗k ≡ arg inf{wx : x ∈ L k (y k )}. (84)

Intuitively, (83) states that the same information about the scale elasticity of an
individual firm k can be obtained from the primal and dual approaches.
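The dual scale elasticity in (81) can be approximated numerically for any differentiable cost function. The sketch below uses a hypothetical cost function that is homogeneous of degree 1.2 in outputs (so the elasticity should be 1.2 at every point) and evaluates both forms in (81): the derivative along the output ray and the gradient form. The functional form, parameter values and function names are our assumptions, chosen only for illustration.

```python
import math


def cost(y, w):
    """Hypothetical cost function, homogeneous of degree 1.2 in outputs y."""
    return (w[0] ** 0.4) * (w[1] ** 0.6) * (y[0] ** 2 + y[1] ** 2) ** 0.6


def scale_elasticity_ray(cost_fn, y, w, h=1e-6):
    """d ln C(theta*y, w) / d ln theta at theta = 1, by central differences."""
    up = math.log(cost_fn([yi * math.exp(h) for yi in y], w))
    down = math.log(cost_fn([yi * math.exp(-h) for yi in y], w))
    return (up - down) / (2.0 * h)


def scale_elasticity_gradient(cost_fn, y, w, h=1e-6):
    """Gradient form: (grad_y C(y, w) . y) / C(y, w), by central differences."""
    total = 0.0
    for m in range(len(y)):
        step = h * max(1.0, abs(y[m]))
        y_up = list(y)
        y_down = list(y)
        y_up[m] += step
        y_down[m] -= step
        total += (cost_fn(y_up, w) - cost_fn(y_down, w)) / (2.0 * step) * y[m]
    return total / cost_fn(y, w)


y = [3.0, 4.0]   # hypothetical output vector
w = [2.0, 5.0]   # hypothetical input prices

e_ray = scale_elasticity_ray(cost, y, w)
e_grad = scale_elasticity_gradient(cost, y, w)
print(e_ray, e_grad)   # both are approximately 1.2 for this cost function
```

A value of 1.2 means that a 1% proportional increase in all outputs raises the minimal cost by about 1.2% for this hypothetical technology.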
Now, analogous to what we did in previous sections, let the group input
requirement set be given by the Minkowski sum of the individual input requirement
sets across all DMUs k (k = 1, . . . , n), i.e.,

$$L(y^1, \dots, y^n) = \bigoplus_{k=1}^{n} L^k(y^k). \tag{85}$$

Note that $L(y^1, \dots, y^n)$ inherits the regularity conditions imposed on the individual input requirement sets; in particular, convexity of the individual input requirement sets implies that $L(y^1, \dots, y^n)$ is also convex. Also note that the aggregation structure defined by $L(y^1, \dots, y^n)$ presumes no reallocation of outputs and no externalities across the individual DMUs.

18 For example, see [39].
19 See [5, 10].
20 See [5, 58] for more details on this.

In turn, the group cost function would be the aggregate analogue of (76), defined
as

$$C(y^1, \dots, y^n, w) \equiv \inf\{wx : x \in L(y^1, \dots, y^n)\}, \tag{86}$$

while the group input-oriented distance function would be defined as

$$D_i\Big(y^1, \dots, y^n, \sum_{k=1}^{n} x^k\Big) \equiv \sup\Big\{\delta > 0 : \Big(\sum_{k=1}^{n} x^k/\delta\Big) \in L(y^1, \dots, y^n)\Big\}, \tag{87}$$

as the aggregate analogue of (74).
Therefore, one can measure the economies of scale for the group from the
measures of scale elasticity defined for the aggregate technology – analogously to
how it is done for the individual technologies, i.e., we have

$$e_c(y^1, \dots, y^n, w) \equiv \left.\frac{\partial \ln C(y^1\theta, \dots, y^n\theta, w)}{\partial \ln \theta}\right|_{\theta=1} = \frac{\nabla_Y C(y^1, \dots, y^n, w)\, Y}{C(y^1, \dots, y^n, w)}, \tag{88}$$

where $\nabla_Y C(y^1, \dots, y^n, w) \equiv (\partial C(y^1, \dots, y^n, w)/\partial y^1, \dots, \partial C(y^1, \dots, y^n, w)/\partial y^n)$ and $Y \equiv (y^1, \dots, y^n)$. Meanwhile, for the primal framework, we get

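$$e_i\Big(y^1, \dots, y^n, \sum_{k=1}^{n} x^k\Big) \equiv \left.\frac{\partial \ln \lambda}{\partial \ln \theta}\right|_{D_i(y^1\theta, \dots, y^n\theta,\ \sum_{k=1}^{n} x^k\lambda)=1,\ \theta=1,\ \lambda=1} = -\nabla_Y D_i\Big(y^1, \dots, y^n, \sum_{k=1}^{n} x^k\Big)\, Y. \tag{89}$$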

Furthermore, if $x^*$ is a solution to (86), then the dual and the primal measures of group scale elasticity are equal, i.e.,

$$e_c(y^1, \dots, y^n, w) = e_i(y^1, \dots, y^n, x^*), \tag{90}$$

where

$$x^* \equiv \arg\inf\{wx : x \in L(y^1, \dots, y^n)\}.$$

The reader will notice that (90) is an aggregate analogue of (83). The main goal
therefore now is to find a relationship between the aggregate and the individual
scale elasticity measures that will enable getting the aggregate measures from
the individual ones. As above, the fundamental step for reaching this aim is the
following result.

Theorem 2. The minimal cost of the group of DMUs with production plan Y =
(y 1 , . . . , y n ) is equal to the sum of the minimal costs of all its member DMUs with
the same production plan y 1 , . . . , y n , assuming all the member DMUs face the same
input prices w, i.e.,

$$C(y^1, \dots, y^n, w) = \sum_{k=1}^{n} C^k(y^k, w). \tag{91}$$

In words, this theorem states that whether the group of DMUs minimize the costs
for their given output plans together via a “social planner” (and without reallocation
of outputs across DMUs) or they minimize individually and then these costs are
summed over, the result should be the same if they face the same input prices w.
This theorem is the cost analogue of the theorem of [25] for aggregation of profit
functions (see [7] for a proof).
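
The mechanics behind Theorem 2 can be illustrated with a small sketch. It replaces each input requirement set with a hypothetical finite list of candidate input vectors (real input requirement sets are infinite, so this is only an illustration of the additive structure, not of the theorem's proof). The Minkowski sum is built explicitly and the cost is minimized over it.

```python
from itertools import product

def min_cost(points, w):
    # smallest value of w.x over a finite list of candidate input vectors
    return min(sum(wr * xr for wr, xr in zip(w, x)) for x in points)

def minkowski_sum(sets):
    # every vector obtained by adding one candidate from each set
    return [tuple(map(sum, zip(*combo))) for combo in product(*sets)]

# Hypothetical candidate input vectors for three DMUs (two inputs each)
L1 = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
L2 = [(3.0, 5.0), (5.0, 3.0)]
L3 = [(2.0, 6.0), (3.0, 3.0), (6.0, 2.0)]
w = (2.0, 1.0)  # common input prices, as the theorem requires

group_cost = min_cost(minkowski_sum([L1, L2, L3]), w)
sum_of_costs = sum(min_cost(L, w) for L in (L1, L2, L3))
print(group_cost, sum_of_costs)
```

Because the objective is linear and the prices are common, minimizing over the sum of sets separates into independent minimizations over each set, which is why the two numbers coincide.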
Now, for measuring the change in costs due to an infinitesimal and equiproportional change of all outputs, we differentiate both sides of (91) along the ray from the origin through the point $Y \equiv (y^1, \dots, y^n)$. Doing so for the l.h.s. of (91), we get

$$\left.\frac{\partial C(y^1\theta, \dots, y^n\theta, w)}{\partial \theta}\right|_{\theta=1} = \nabla_Y C(y^1, \dots, y^n, w)\, Y, \tag{92}$$

while doing so for the r.h.s. of (91), we get

$$\left.\partial\Big(\sum_{k=1}^{n} C^k(y^k\theta, w)\Big)\Big/\partial\theta\,\right|_{\theta=1} = \sum_{k=1}^{n} \nabla_{y^k} C^k(y^k, w)\, y^k, \tag{93}$$

and combining the two, we get the following important equivalence results (originally derived by [15]), summarized in the next two corollaries.


Corollary 6. We have

$$e_c(y^1, \dots, y^n, w) = \sum_{k=1}^{n} e_c(y^k, w) \times S^k, \tag{94}$$

where

$$S^k \equiv C^k(y^k, w)\Big/\sum_{k=1}^{n} C^k(y^k, w). \tag{95}$$

This mathematical result is quite intuitive: In the dual framework, the scale elasticity
of a group equals the weighted sum of the individual scale elasticity scores of all
firms in this group, where the weights are the cost shares. As above, a strength of
this result is that the weights are not ad hoc but derived from economic principles.
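
A minimal numerical check of (94) and (95) follows. It assumes three hypothetical firms whose costs along the ray behave as $C^k(y^k\theta, w) = c_k\theta^{\beta_k}$ at fixed prices, so that each firm's scale elasticity is $\beta_k$. The cost-share weighted sum is compared with a finite-difference elasticity of the summed cost, which is the group cost by Theorem 2.

```python
import math

# Hypothetical firms: (minimal cost at theta = 1, individual scale elasticity)
firms = [(10.0, 0.8), (30.0, 1.0), (60.0, 1.3)]

def group_cost(theta):
    # group cost along the ray, using additivity of costs as in (91)
    return sum(c * theta**beta for c, beta in firms)

def aggregate_elasticity_from_shares():
    total = sum(c for c, _ in firms)
    return sum(beta * c / total for c, beta in firms)  # (94) with weights (95)

def aggregate_elasticity_numeric(h=1e-6):
    # central difference of ln(group cost) with respect to ln(theta) at theta = 1
    up = math.log(group_cost(math.exp(h)))
    down = math.log(group_cost(math.exp(-h)))
    return (up - down) / (2.0 * h)

from_shares = aggregate_elasticity_from_shares()
numeric = aggregate_elasticity_numeric()
print(from_shares, numeric)
```

The largest firm dominates the aggregate because it carries 60% of the group cost, which is the practical content of cost-share weighting.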
A similar aggregation result can also be derived for the primal scale elasticity
measurement. In particular, from (83) and (90), we get the following equivalence
result.

Corollary 7. We have

$$e_i(y^1, \dots, y^n, x^*) = \sum_{k=1}^{n} e_i(y^k, x^{*k}) \times S^k. \tag{96}$$

This important result tells us how to obtain the group scale elasticity measure from
the individual scale elasticity measures in the primal framework. Specifically, note
that (96) says that one can get the primal aggregate scale elasticity measure from the
weighted arithmetic average of the individual scale elasticity scores of all firms in
this group, where the weights are the individual cost shares, derived from economic
theoretic reasoning.
If the researcher has no price information to calculate the weights, she/he may use shadow prices, estimated from the primal information, or, alternatively, impose an additional assumption that helps to derive price-independent weights, similarly to what was discussed above and following [12, 14]. Specifically, the additional assumption here would be

$$\sum_{k=1}^{n} w_r x_r^{*k} \Big/ \sum_{r=1}^{N} \sum_{k=1}^{n} w_r x_r^{*k} = b_r, \quad r = 1, \dots, N, \tag{97}$$

where $b_r \in (0, 1)$ is a known or estimated constant. In words, (97) states that the share of the group expenditures on the $r$th input in the group total cost is given by $b_r$. Further, if we let $\omega_r^k \equiv x_r^k / \sum_{k=1}^{n} x_r^k$ be the share of the $k$th firm in the group in terms of the $r$th input, then from (97) we get the price-independent weights given by

$$S^k = \sum_{r=1}^{N} \omega_r^k b_r, \quad k = 1, \dots, n. \tag{98}$$

In words, (98) states that a firm’s weight is the weighted average over all input
shares of this firm in the group, where the weights are the shares of the industry
expenditures on the r th input in the industry total cost.
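
The weights in (98) are simple to compute once the input quantities and the group cost shares $b_r$ are available. The sketch below uses hypothetical quantities for three firms and two inputs, treated as the relevant input vectors. Prices appear only to construct $b_r$ and to confirm that, when $b_r$ is consistent with (97), the price-independent weights reproduce the ordinary cost shares.

```python
def price_independent_weights(x, b):
    """x[k][r]: quantity of input r used by firm k; b[r]: group cost share of input r."""
    n_inputs = len(b)
    totals = [sum(firm[r] for firm in x) for r in range(n_inputs)]
    # weight of firm k: sum over inputs of (firm's share of input r) * b_r, as in (98)
    return [sum(b[r] * firm[r] / totals[r] for r in range(n_inputs)) for firm in x]

x = [[2.0, 6.0], [3.0, 3.0], [5.0, 1.0]]  # hypothetical data: 3 firms, 2 inputs
w = [4.0, 1.0]                             # prices, used only to build b for the check

group_cost = sum(w[r] * firm[r] for firm in x for r in range(2))
b = [w[r] * sum(firm[r] for firm in x) / group_cost for r in range(2)]  # as in (97)

weights = price_independent_weights(x, b)
cost_shares = [sum(w[r] * firm[r] for r in range(2)) / group_cost for firm in x]
print(weights, cost_shares)
```

In practice $b_r$ would come from outside information such as sector-level expenditure shares, in which case no firm-level prices are needed at all.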
It is also worth noting that analogous developments can also be done for other
“derivatives” of the cost function as well as of the revenue and profit functions.
Moreover, such aggregation results can be generalized further to the case of
aggregation within sub-groups (e.g., private vs. public, etc.) and then aggregation
between these sub-groups into a larger group.
Finally, similar analysis can also be done for the case of aggregation of scale
efficiency scores, as was done in [55, 59].

Aggregation with Possibility of Reallocation

In the discussion above, we restricted attention to cases where reallocation of inputs between DMUs is not allowed for the output orientation and reallocation of outputs between DMUs is not allowed for the input orientation. What if one or both of these restrictions are relaxed? This question was first considered in [36] for aggregating Farrell-type efficiency scores, while [31] extended it to the aggregation of MPIs. Both papers focused on the output-oriented context, while the input-oriented context was outlined in [32] and refined further in [33], which also extended it to the aggregation of Hicks-Moorsteen Productivity Indexes (HMPIs). In this section we briefly summarize some key results from these papers.

Aggregate Technology and Measures with Reallocation

To measure the gains from allowing for the reallocation of resources among DMUs in a group, we need to allow for a more general structure of aggregate technology, which we will refer to as the group potential technology, and define it as the Minkowski sum of technology sets of all individual DMUs for a given period $\tau$:²¹

$$\mathcal{T}^*_\tau \equiv \bigoplus_{k=1}^{n} \mathcal{T}^k_\tau. \tag{99}$$

21 This technology aggregation structure was earlier used in [1, 30] and goes back to [25].

By aggregating technology sets rather than the output or the input requirement sets (as was done above), this type of aggregate technology allows for full reallocation of inputs and outputs among all the DMUs in the group.²² Other, and equivalent, characterizations of this technology can be given via the group potential input requirement set, defined as

$$L^*_\tau(Y) = \{x : (x, Y) \in \mathcal{T}^*_\tau\}, \tag{100}$$

and via the group potential output set, defined as

$$P^*_\tau(X) = \{y : (X, y) \in \mathcal{T}^*_\tau\}. \tag{101}$$

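Based on this aggregate technology, and following [36], let the group potential output-oriented technical efficiency be defined as²³

$$OTE^*_\tau \equiv OTE^*_\tau(\overline{X}_\tau, \overline{Y}_j) \equiv \sup\{\theta : \theta\,\overline{Y}_j \in P^*_\tau(\overline{X}_\tau)\}, \tag{102}$$

while the dual characterization of $P^*_\tau(\overline{X}_\tau)$, the group potential revenue function, is defined as

$$R^*_\tau(\overline{X}_\tau, p_j) \equiv \sup\{p_j y : y \in P^*_\tau(\overline{X}_\tau)\}. \tag{103}$$

The associated group potential revenue efficiency is then defined as

$$RE^*_\tau \equiv RE^*_\tau(\overline{X}_\tau, \overline{Y}_j, p_j) \equiv \frac{R^*_\tau(\overline{X}_\tau, p_j)}{p_j \overline{Y}_j}, \quad p_j \overline{Y}_j \ne 0. \tag{104}$$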

Due to duality between the revenue function and the output distance function, we have $RE^*_\tau \ge OTE^*_\tau$, and so the group potential output-oriented allocative efficiency can be defined to turn it into equality, yielding the following decomposition:

$$RE^*_\tau(\overline{X}_\tau, \overline{Y}_j, p_j) = OTE^*_\tau(\overline{X}_\tau, \overline{Y}_j) \times OAE^*_\tau(\overline{X}_\tau, \overline{Y}_j, p_j), \quad \forall \tau, j. \tag{105}$$

In words, (102) and (104) measure the group efficiency relative to the group potential output set (101) and the associated aggregate revenue function, similar to the individual level.
22 More recently, another definition of aggregate technology, which involved the union of technology sets, was considered by [40, 41]; it was later shown to be equivalent to the Koopmans-type aggregate technology $\mathcal{T}^*_\tau$ under standard regularity conditions of production theory (see [47]).
23 Here, note that we allow for different time subscripts for inputs and outputs for the framework to be compatible with the HMPI context.

By the same token, and following [32, 33], let the group potential input-oriented technical efficiency be defined as

$$ITE^*_\tau \equiv ITE^*_\tau(\overline{Y}_\tau, \overline{X}_j) \equiv \inf\{\lambda : \lambda\overline{X}_j \in L^*_\tau(\overline{Y}_\tau)\}, \tag{106}$$

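while the dual characterization of $L^*_\tau(\overline{Y}_\tau)$, the group potential cost function, can be given by

$$C^*_\tau(\overline{Y}_\tau, w_j) \equiv \inf\{w_j x : x \in L^*_\tau(\overline{Y}_\tau)\}, \tag{107}$$

and so, the related group potential cost efficiency is then given by

$$CE^*_\tau \equiv CE^*_\tau(\overline{Y}_\tau, \overline{X}_j, w_j) \equiv \frac{C^*_\tau(\overline{Y}_\tau, w_j)}{w_j \overline{X}_j}, \quad w_j \overline{X}_j \ne 0. \tag{108}$$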

As before, due to duality between the cost function and the input distance function,
we have CEτ∗ ≤ I T Eτ∗ , and so the group potential input-oriented allocative
efficiency can be defined to close this inequality, giving rise to the following
decomposition:

CEτ∗ (Y τ , X j , wj ) = I T Eτ∗ (Y τ , X j ) × I AEτ∗ (Y τ , X j , wj ), ∀τ, j. (109)

In words, (106) and (108) measure group efficiency relative to the group potential
input requirement set (100) and associated aggregate cost function, in a way similar
to measurements done at the individual level.

Reallocation vs. No Reallocation

A natural question at this stage is the following: How does the group technology that allows full reallocation relate to the group technologies we considered earlier, which did not allow it? The following simple, yet important lemma answers this question.

Lemma 1. We have

$$\overline{P}_\tau(X_\tau) \subseteq P^*_\tau(\overline{X}_\tau), \tag{110}$$

and

$$\overline{L}_\tau(Y_\tau) \subseteq L^*_\tau(\overline{Y}_\tau). \tag{111}$$

A proof of (110) is relatively simple and can be found in [36], and the proof of (111) is analogous [33]. This lemma affirms what is expected on an intuitive level: The aggregate technology characterizations where full reallocation across firms is allowed must always embrace, as a special case, the aggregate technology where the full reallocation (of outputs in the input-oriented case and of inputs in the output-oriented case) is not permitted.
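As a result of this lemma, note that for any $(Y_\tau, \overline{Y}_\tau, w_j)$ we must have

$$C^*_\tau(\overline{Y}_\tau, w_j) \le \overline{C}_\tau(Y_\tau, w_j), \tag{112}$$

and so for any $(Y_\tau, \overline{Y}_\tau, w_j)$, we must also have

$$CE^*_\tau \le \overline{CE}_\tau. \tag{113}$$

Similarly, for any $(X_\tau, \overline{X}_\tau, p_j)$ we have

$$R^*_\tau(\overline{X}_\tau, p_j) \ge \overline{R}_\tau(X_\tau, p_j), \tag{114}$$

and so for any $(X_\tau, \overline{X}_\tau, p_j)$ we must also have

$$RE^*_\tau \ge \overline{RE}_\tau. \tag{115}$$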

To measure the difference in efficiency between these different levels of aggregation, [36] introduced the concept of reallocative efficiency. Specifically, in the output-oriented context, we now also have the group revenue reallocative efficiency, as the multiplicative residual that closes the inequality (115), i.e.,

RREτ∗ ≡ RREτ∗ (Xτ , X τ , pj ) ≡ Rτ∗ (Xτ , pj )/R τ (Xτ , pj ), (116)

and we obtain a useful decomposition of the group revenue efficiency

REτ∗ = RE τ × RREτ∗ . (117)

Meanwhile, in the input-oriented context (following [33]), we have the group cost reallocative efficiency, as the multiplicative residual which closes the inequality (113), i.e.,

CREτ∗ ≡ CREτ∗ (Yτ , Y τ , wj ) ≡ Cτ∗ (Y τ , wj )/C τ (Yτ , wj ), (118)

and so we obtain a useful decomposition of the group cost efficiency:

CEτ∗ = CE τ × CREτ∗ . (119)
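
The decomposition (119) can be traced through a small worked case. The sketch assumes a hypothetical single-input, single-output technology $x \ge y^2$ that is common to all DMUs and convex. Under that assumption the group potential cost equals $n$ times the minimal cost of producing the average output, which is what the code uses. The data and the technology are illustrative and not taken from the chapter.

```python
def individual_min_cost(y, w):
    # assumed common technology: one input, one output, x >= y**2 (a convex set)
    return w * y**2

outputs = [1.0, 3.0]   # hypothetical outputs of two DMUs
inputs = [2.0, 12.0]   # hypothetical observed input use
w = 1.0                # common input price

observed_cost = w * sum(inputs)

# No reallocation: each DMU minimizes the cost of its own output (Theorem 2)
cost_no_realloc = sum(individual_min_cost(y, w) for y in outputs)

# Full reallocation: n times the minimal cost of the average output
n = len(outputs)
cost_potential = n * individual_min_cost(sum(outputs) / n, w)

ce_group = cost_no_realloc / observed_cost
ce_potential = cost_potential / observed_cost
cre = cost_potential / cost_no_realloc   # cost reallocative efficiency, as in (118)
print(ce_group, ce_potential, cre)
```

Shifting output from the large DMU to the small one lowers total cost from 10 to 8 here because marginal cost rises with output, so the reallocative component is 0.8.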

What is even more interesting is that both RREτ∗ and CREτ∗ can be further
decomposed, as outlined in the following lemmas.

Lemma 2. We have

RREτ∗ = OT REτ∗ × OAREτ∗ , ∀τ, (120)


where group output-oriented technical reallocative efficiency is

OT REτ∗ ≡ OT Eτ∗ /OT E τ , (121)

and group output-oriented allocative reallocative efficiency is

OAREτ∗ ≡ OAEτ∗ /OAE τ . (122)

This result is from [36], and its input-oriented analogue (from [32, 33]) is outlined
next.

Lemma 3. We have

CREτ∗ = I T REτ∗ × I AREτ∗ , ∀τ, (123)

where group input-oriented technical reallocative efficiency is

I T REτ∗ ≡ I T Eτ∗ /I T E τ , (124)

and group input-oriented allocative reallocative efficiency is

I AREτ∗ ≡ I AEτ∗ /I AE τ . (125)

In words, these two lemmas say that the reallocative efficiency measures character-
ize the difference for the group between individual efficiency in each DMU and the
collective efficiency, where outputs are allowed to be reallocated among DMUs in
the input orientation or inputs are allowed to be reallocated among DMUs in the
output orientation.

Aggregate vs. Individual Reallocative Measures of Efficiency

What about the individual counterparts of the reallocative efficiency measures that
appeared in the previous subsection? Using the path “from aggregate to individual,”
[36] introduced the reallocative measures for individual output-oriented DMUs,
which for k = 1, . . . , n are given by

RREτk ≡ REτ∗ /REτk , (126)

OT REτk ≡ OT Eτ∗ /OT Eτk , (127)
OAREτk ≡ OAEτ∗ /OAEτk , (128)

and then established the relationship between individual and group reallocative
measures, which we summarize in the next lemma.

Lemma 4. We have
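
$$RRE^*_\tau = \Big[\sum_{k=1}^{n} \big(RRE^k_\tau(x^k_\tau, y^k_j, p_j)\big)^{-1} \times S^k_j\Big]^{-1}, \tag{129}$$

$$OTRE^*_\tau = \Big[\sum_{k=1}^{n} \big(OTRE^k_\tau(x^k_\tau, y^k_j)\big)^{-1} \times S^k_j\Big]^{-1}, \tag{130}$$

$$OARE^*_\tau = \Big[\sum_{k=1}^{n} \big(OARE^k_\tau(x^k_\tau, y^k_j, p_j)\big)^{-1} \times S^k_{ae,\tau,j}\Big]^{-1}, \tag{131}$$

where

$$S^k_j \equiv \frac{p_j y^k_j}{p_j \sum_{k=1}^{n} y^k_j}, \qquad S^k_{ae,\tau,j} \equiv \frac{p_j\big(y^k_j \times OTE^k_\tau(x^k_j, y^k_j)\big)}{p_j \sum_{k=1}^{n} \big(y^k_j \times OTE^k_\tau(x^k_j, y^k_j)\big)}, \quad k = 1, \dots, n. \tag{132}$$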

Moreover, note that combining these results with the decompositions we derived
above, we also get the following useful decompositions of group potential revenue
efficiency

REτ∗ = OT E τ × OAE τ × OT REτ∗ × OAREτ∗ . (133)
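
The harmonic-type aggregation in (129) can be verified with a few lines of arithmetic. The sketch assumes, in line with the no-reallocation aggregation results discussed earlier in the chapter, that the group revenue efficiency is the revenue-share weighted sum of the individual revenue efficiencies. All scores and revenues below are hypothetical.

```python
def harmonic_aggregate(individual_scores, weights):
    # inverse of the weighted sum of inverses, the form used in (129)-(131)
    return 1.0 / sum(weight / score for score, weight in zip(individual_scores, weights))

revenues = [20.0, 30.0, 50.0]        # hypothetical observed revenues of three DMUs
re_individual = [1.25, 1.10, 1.40]   # hypothetical individual revenue efficiencies
re_potential = 1.60                  # hypothetical group potential revenue efficiency

shares = [r / sum(revenues) for r in revenues]                      # revenue shares
rre_individual = [re_potential / re_k for re_k in re_individual]    # as in (126)
rre_group = harmonic_aggregate(rre_individual, shares)              # as in (129)

# Assumed: group revenue efficiency without reallocation is the share-weighted sum
re_group = sum(re_k * s for re_k, s in zip(re_individual, shares))
print(rre_group, re_potential / re_group)
```

Both printed numbers equal the ratio of the potential to the no-reallocation group revenue efficiency, which is the definition in (116) expressed through efficiency scores.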

Furthermore, [32, 33], using the same logic as [36], defined the corresponding
reallocative measures for individual input-oriented DMUs (for k = 1, . . . , n)

CREτk ≡ CEτ∗ /CEτk , (134)

I T REτk ≡ I T Eτ∗ /I T Eτk , (135)
I AREτk ≡ I AEτ∗ /I AEτk . (136)

and then established the relationship between the individual and the group realloca-
tive measures, which we summarize in the following lemma.

Lemma 5. We have
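
$$CRE^*_\tau = \Big[\sum_{k=1}^{n} \big(CRE^k_\tau(y^k_\tau, x^k_j, w_j)\big)^{-1} \times W^k_j\Big]^{-1}, \tag{137}$$

$$ITRE^*_\tau = \Big[\sum_{k=1}^{n} \big(ITRE^k_\tau(y^k_\tau, x^k_j)\big)^{-1} \times W^k_j\Big]^{-1}, \tag{138}$$

$$IARE^*_\tau = \Big[\sum_{k=1}^{n} \big(IARE^k_\tau(y^k_\tau, x^k_j, w_j)\big)^{-1} \times W^k_{ae,\tau,j}\Big]^{-1}, \tag{139}$$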

where

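$$W^k_j \equiv \frac{w_j x^k_j}{w_j \sum_{k=1}^{n} x^k_j}, \qquad W^k_{ae,\tau,j} \equiv \frac{w_j\big(x^k_j \times ITE^k_\tau(y^k_\tau, x^k_j)\big)}{w_j \sum_{k=1}^{n} \big(x^k_j \times ITE^k_\tau(y^k_\tau, x^k_j)\big)}, \quad k = 1, \dots, n. \tag{140}$$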

If we combine these results with the decompositions derived above, then we get the
following decomposition of group potential cost efficiency:

CEτ∗ = I T E τ × I AE τ × I T REτ∗ × I AREτ∗ . (141)

These key results can be further used for extending various aggregation results
discussed above to allow full reallocation: For MPI it was done in [31], while for
HMPI it was done in [33]. Related extensions for the aggregation of scale efficiency
and scale elasticity as well as for the directional distance functions are yet to be
developed, which presents a fruitful field of research for the near future.

Remarks on Estimation of Aggregate Scores

It is worth emphasizing here that our discussion so far has been mainly theoretical, and we have not restricted our attention to any particular estimator. Indeed, the aggregation theories we summarized here are fairly general and can serve as a background for any suitable estimator, whether it is based on Data Envelopment Analysis (DEA), Stochastic Frontier Analysis (SFA), the Free Disposal Hull (FDH) approach, or another appropriate paradigm.²⁴ These well-established approaches can be used to estimate the individual efficiency scores, which can then be aggregated to obtain the corresponding estimates of most of the aggregate scores we presented above. Note, however, that the group potential measures are not calculated from the individual efficiency scores, but require calculation directly from the group potential technology. Yet, after imposing two extra assumptions, we can recover these measures from the individual scores as well. These two assumptions are actually very common for many methods in productivity and efficiency analysis, especially in DEA. In particular, if we assume that (i) the technology set $\mathcal{T}^k_\tau$ is the same for all DMUs within each period and that (ii) it is also convex, then, following [30, 36], we get:

24 For example, see [47] and the relevant Chap. 8, “Stochastic Frontier Analysis: Foundations and Advances I”, Chap. 9, “Stochastic Frontier Analysis: Foundations and Advances II”, and Chap. 10, “Data Envelopment Analysis: A Nonparametric Method of Production Analysis” in this Handbook.

$$\mathcal{T}^*_\tau = n\mathcal{T}_\tau, \quad \forall k = 1, \dots, n, \ \forall \tau, \tag{142}$$

which in turn, for any period τ , gives:

$$P^*_\tau(\overline{X}_\tau) = nP_\tau(\tilde{x}_\tau), \tag{143}$$

where $\tilde{x}_j \equiv n^{-1}\sum_{k=1}^{n} x^k_j$, and

$$L^*_\tau(\overline{Y}_\tau) = nL_\tau(\tilde{y}_\tau), \tag{144}$$

where $\tilde{y}_\tau \equiv n^{-1}\sum_{k=1}^{n} y^k_\tau$. Intuitively, $P_\tau(\tilde{x}_\tau)$ and $L_\tau(\tilde{y}_\tau)$ are, respectively, the output set and the input requirement set of the “average DMU” for the sample (i.e., a hypothetical DMU whose input-output allocation is the average of input-output allocations in the sample, in period $\tau$).
Therefore, the output-oriented group potential efficiencies can be obtained as the
efficiency measures of the average DMU in the group, i.e., we have:

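$$OTE^*_\tau(\overline{X}_\tau, \overline{Y}_j) = OTE_\tau(\tilde{x}_\tau, \tilde{y}_j), \tag{145}$$

$$RE^*_\tau(\overline{X}_\tau, \overline{Y}_j, p_j) = RE_\tau(\tilde{x}_\tau, \tilde{y}_j, p_j), \tag{146}$$

$$OAE^*_\tau(\overline{X}_\tau, \overline{Y}_j, p_j) = OAE_\tau(\tilde{x}_\tau, \tilde{y}_j, p_j) = RE_\tau(\tilde{x}_\tau, \tilde{y}_j, p_j)/OTE_\tau(\tilde{x}_\tau, \tilde{y}_j), \tag{147}$$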

where OT E, RE, and OAE are as defined in (5), (7), and (9), respectively, with
superscript k dropped, and presented in the inter-temporal context.
Similarly, the input-oriented group potential efficiencies are the same as the
efficiency measures of the average DMU in the group, i.e., we have

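$$ITE^*_\tau(\overline{Y}_\tau, \overline{X}_j) = ITE_\tau(\tilde{y}_\tau, \tilde{x}_j), \tag{148}$$

$$CE^*_\tau(\overline{Y}_\tau, \overline{X}_j, w_j) = CE_\tau(\tilde{y}_\tau, \tilde{x}_j, w_j), \tag{149}$$

$$IAE^*_\tau(\overline{Y}_\tau, \overline{X}_j, w_j) = IAE_\tau(\tilde{y}_\tau, \tilde{x}_j, w_j) = CE_\tau(\tilde{y}_\tau, \tilde{x}_j, w_j)/ITE_\tau(\tilde{y}_\tau, \tilde{x}_j), \tag{150}$$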

where I T E, CE, and I AE are as defined in (75), (77), and (79) respectively, which
we cast in the inter-temporal context, with superscript k dropped.
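
The "average DMU" shortcut in (148) to (150) is easy to apply once a common technology is specified. The sketch below assumes a hypothetical common and convex technology with input requirement set $L(y) = \{x : \sqrt{x_1 x_2} \ge y\}$, which stands in for whatever estimator (DEA, SFA, FDH) would be used in practice. The data are illustrative, and the time subscripts are dropped.

```python
import math

def ite(y, x):
    # input technical efficiency for the assumed technology L(y) = {x : sqrt(x1*x2) >= y}
    return y / math.sqrt(x[0] * x[1])

def ce(y, x, w):
    # cost efficiency: minimal cost 2*y*sqrt(w1*w2) over observed cost w.x
    return 2.0 * y * math.sqrt(w[0] * w[1]) / (w[0] * x[0] + w[1] * x[1])

inputs = [(4.0, 4.0), (8.0, 2.0)]  # hypothetical input vectors of two DMUs
outputs = [2.0, 2.0]               # hypothetical outputs
w = (1.0, 4.0)                     # common input prices

n = len(inputs)
x_avg = tuple(sum(x[r] for x in inputs) / n for r in range(2))  # average input vector
y_avg = sum(outputs) / n                                        # average output

ite_star = ite(y_avg, x_avg)    # group potential ITE via the average DMU, as in (148)
ce_star = ce(y_avg, x_avg, w)   # group potential CE, as in (149)
iae_star = ce_star / ite_star   # group potential IAE, as in (150)
print(ite_star, ce_star, iae_star)
```

Only the sample averages enter the calculation, so the group potential scores require a single evaluation of the efficiency measure rather than one per DMU.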
It is also worth recalling that (145) and (148) are the versions of aggregate efficiency measures suggested and advocated by [19], albeit without the theoretical developments summarized here.

Concluding Remarks

In this chapter we briefly summarized some of the key results on aggregation in productivity and efficiency analysis. We mostly focused on the output orientation and pointed out that most of the results can be extended or generalized to derive analogous results for the input orientation as well as for various other contexts.
Here we conclude by briefly mentioning a few interesting extensions and applications of these and other related aggregation results. First, the aggregation results for the directional distance functions were developed by [8].²⁵ A similar theoretical framework for aggregating growth rates in Solow’s growth accounting approach was derived by [57]. Meanwhile, the theory for aggregation of scale efficiency was developed by [59].
Related aggregation analysis can be also found in [38, 43], and some interesting
extensions can be found in [2, 9, 23, 24, 28, 29, 35].
Various applications analyzing real data for various economic questions can be
found in [18, 21, 22, 34, 42, 52], to mention just a few.
Finally, in terms of actual estimation, note that what we discussed are point measures, and one may (and typically should) be interested in the corresponding confidence intervals and related inference. The first theoretical foundation for this important aspect was laid out in [48], who proposed a practical bootstrap-based approach for constructing confidence intervals and performing related inference on the aggregate efficiency measures. More recently, [49] extended this framework by deriving convergence rates and new central limit theorems (CLTs) for the aggregate efficiency scores estimated via DEA and FDH. With the help of a Monte Carlo study, they also confirmed that for statistical inference on aggregate efficiency, the standard CLTs work poorly even for very simple 1-input-1-output cases and do not work correctly at all for larger dimensions. Meanwhile, the new CLTs that they derived performed reasonably well, reaching the nominal levels when samples get large. While deriving their asymptotic results, [49] focused on aggregates of the Farrell-type efficiency scores, and similar developments are yet to be made for the other aggregates, which constitute key research questions in the area.

Cross-References

Data Envelopment Analysis: A Nonparametric Method of Production Analysis

Stochastic Frontier Analysis: Foundations and Advances I
Stochastic Frontier Analysis: Foundations and Advances II

25 Also see [55] for this and other related results.


References
1. Blackorby C, Russell RR (1999) Aggregation of efficiency indices. J Prod Anal 12(1):
5–20
2. Cooper W, Huang Z, Li S, Parker B, Pastor J (2007) Efficiency aggregation with enhanced
Russell measures in data envelopment analysis. Socio Econ Plan Sci 41(1):1–21
3. Domar ED (1961) On the measurement of technological change. Econ J 71(284):709–729
4. Färe R, Grosskopf S, Lindgren B, Roos P (1994) Productivity developments in Swedish
hospitals: a Malmquist output index approach. In: Charnes A, Cooper W, Lewin AY, Seiford
LM (eds) Data envelopment analysis: theory, methodology and applications. Kluwer Academic
Publishers, Boston, pp 253–272
5. Färe R, Grosskopf S, Lovell CAK (1986) Scale economies and duality. Zeitschrift für
Nationalökonomie/J Econ 46(2):175–182
6. Färe R, Grosskopf S, Zelenyuk V (2004) Aggregation bias and its bounds in measuring technical efficiency. Appl Econ Lett 11(10):657–660. https://doi.org/10.1080/1350485042000207243
7. Färe R, Grosskopf S, Zelenyuk V (2004) Aggregation of cost efficiency: indicators and indexes
across firms. Acad Econ Pap 32(3):395–411
8. Färe R, Grosskopf S, Zelenyuk V (2008) Aggregation of Nerlovian profit indicator. Appl Econ
Lett 15(11):845–847
9. Färe R, Karagiannis G (2014) A postscript on aggregate Farrell efficiencies. Eur J Oper Res
233(3):784–786
10. Färe R, Primont D (1995) Multi-output production and duality: theory and applications. Kluwer
Academic Publishers, New York
11. Färe R, Zelenyuk V (2002) Input aggregation and technical efficiency. Appl Econ Lett
9(10):635–636
12. Färe R, Zelenyuk V (2003) On aggregate Farrell efficiencies. Eur J Oper Res 146(3):615–620
13. Färe R, Zelenyuk V (2005) On Farrell’s decomposition and aggregation. Int J Bus Econ
4(2):167–171
14. Färe R, Zelenyuk V (2007) Extending Färe and Zelenyuk 2003. Eur J Oper Res 179(2):594–
595
15. Färe R, Zelenyuk V (2012) Aggregation of scale elasticities across firms. Appl Econ Lett
19(16):1593–1597
16. Färe R, Zelenyuk V (2019) On Luenberger input, output and productivity indicators. Econ Lett
179:72–74
17. Farrell MJ (1957) The measurement of productive efficiency. J R Stat Soc Ser A (General)
120(3):253–290
18. Ferrier G, Leleu H, Valdmanis V (2009) Hospital capacity in large urban areas: is there enough
in times of need? J Prod Anal 32(2):103–117
19. Førsund FR, Hjalmarsson L (1979) Generalised Farrell measures of efficiency: an application
to milk processing in Swedish dairy plants. Econ J 89(354):294–315. http://www.jstor.org/stable/2231603
20. Färe R, Karagiannis G (2017) The denominator rule for share-weighting aggregation. Eur
J Oper Res 260(3):1175–1180. https://ideas.repec.org/a/eee/ejores/v260y2017i3p1175-1180.html
21. Hall MJ, Kenjegalieva KA, Simper R (2012) Environmental factors affecting Hong Kong
banking: a post-Asian financial crisis efficiency analysis. Glob Financ J 23(3):184–201
22. Henderson DJ, Zelenyuk V (2007) Testing for (efficiency) catching-up. South Econ J
73(4):1003–1019. http://www.jstor.org/stable/20111939
23. Karagiannis G (2015) On structural and average technical efficiency. J Prod Anal 43(3):
259–267

24. Karagiannis G, Lovell CAK (2015) Productivity measurement in radial DEA models with a
single constant input. Eur J Oper Res 251(1):323–328
25. Koopmans T (1957) Three essays on the state of economic science. McGraw-Hill, New York
26. Krein M, Smulian V (1940) On regularly convex sets in the space conjugate to a Banach space.
Ann Math 41(2):556–583
27. Kuosmanen T, Cherchye L, Sipiläinen T (2006) The law of one price in data envelopment
analysis: Restricting weight flexibility across firms. Eur J Oper Res 170(3):735–757
28. Kuosmanen T, Kortelainen M, Sipiläinen T, Cherchye L (2010) Firm and industry level
profit efficiency analysis using absolute and uniform shadow prices. Eur J Oper Res 202(2):
584–594
29. Li SK, Cheng YS (2007) Solving the puzzles of structural efficiency. Eur J Oper Res
180(2):713–722
30. Li S-K, Ng YC (1995) Measuring the productive efficiency of a group of firms. Int Adv Econ
Res 1(4):377–390
31. Mayer A, Zelenyuk V (2014) Aggregation of Malmquist productivity indexes allowing for
reallocation of resources. Eur J Oper Res 238(3):774–785
32. Mayer A, Zelenyuk V (2014) An aggregation paradigm for Hicks-Moorsteen productivity
indexes, CEPA Working Paper No. WP01/2014
33. Mayer A, Zelenyuk V (2019) Aggregation of individual efficiency measures and productivity
indices. In: ten Raa T, Greene W (eds) The Palgrave Handbook of Economic Performance
Analysis. Palgrave Macmillan, Cham. https://link.springer.com/chapter/10.1007/978-3-030-23727-1_14#citeas
34. Mugera A, Ojede A (2014) Technical efficiency in African agriculture: is it catching up or
lagging behind? J Int Dev 26(6):779–795
35. Mussard S, Peypoch N (2006) On multi-decomposition of the aggregate Malmquist productiv-
ity index. Econ Lett 91(3):436–443
36. Nesterenko V, Zelenyuk V (2007) Measuring potential gains from reallocation of resources. J
Prod Anal 28(1–2):107–116
37. Oks E, Sharir M (2006) Minkowski sums of monotone and general simple polygons. Discret
Comput Geom 35(2):223–240
38. Pachkova EV (2009) Restricted reallocation of resources. Eur J Oper Res 196(3):1049–1057
39. Panzar JC, Willig RD (1977) Free entry and the sustainability of natural monopoly. Bell J Econ
8(1):1–22
40. Peyrache A (2013) Industry structural inefficiency and potential gains from mergers and break-
ups: a comprehensive approach. Eur J Oper Res 230(2):422–430
41. Peyrache A (2015) Cost constrained industry inefficiency. Eur J Oper Res 247(3):996–1002
42. Pilyavsky A, Staat M (2008) Efficiency and productivity change in Ukrainian health care. J
Prod Anal 29(2):143–154
43. Raa TT (2011) Benchmarking and industry performance. J Prod Anal 36(3):285–292
44. Schneider R (1993) Convex bodies: the Brunn-Minkowski Theory. Cambridge University
Press, New York
45. Shephard RW (1953) Cost and production functions. Princeton University Press, Princeton
46. Shephard RW (1970) Theory of cost and production functions. Princeton studies in mathemat-
ical economics. Princeton University Press, Princeton
47. Sickles R, Zelenyuk V (2019) Measurement of productivity and efficiency: theory and practice.
Cambridge University Press, Cambridge. https://doi.org/10.1017/9781139565981
48. Simar L, Zelenyuk V (2007) Statistical inference for aggregates of Farrell-type efficiencies. J
Appl Econ 22(7):1367–1394. http://ideas.repec.org/a/jae/japmet/v22y2007i7p1367-1394.html
49. Simar L, Zelenyuk V (2018) Central limit theorems for aggregate efficiency. Oper Res
166(1):139–149
50. Starr RM (2008) Shapley-Folkman theorem. In: Durlauf SN, Blume LE (eds) The new palgrave
dictionary of economics. Palgrave Macmillan, Basingstoke, pp 317–318
51. Tauer LW (2001) Input aggregation and computed technical efficiency. Appl Econ Lett 8:295–
297

52. Weill L (2008) On the inefficiency of European socialist economies. J Prod Anal 29(2):79–89
53. Wilson PW (2018) Dimension reduction in nonparametric models of production. Eur J Oper
Res 267(1):349–367. http://www.sciencedirect.com/science/article/pii/S0377221717310317
54. Ylvinger S (2000) Industry performance and structural efficiency measures: solutions to
problems in firm models. Eur J Oper Res 121(1):164–174
55. Zelenyuk V (2002) Essays in efficiency and productivity analysis of economic systems. Ph.D.
thesis, Oregon State University, Corvallis
56. Zelenyuk V (2006) Aggregation of Malmquist productivity indexes. Eur J Oper Res
174(2):1076–1086
57. Zelenyuk V (2013) A scale elasticity measure for directional distance function and its dual:
theory and DEA estimation. Eur J Oper Res 228(3):592–600. http://ideas.repec.org/a/eee/ejores/v228y2013i3p592-600.html
58. Zelenyuk V (2013) A note on equivalences in measuring returns to scale. Int J Bus Econ
12(1):85–89. http://ideas.repec.org/a/ijb/journl/v12y2013i1p85-89.html
59. Zelenyuk V (2015) Aggregation of scale efficiency. Eur J Oper Res 240(1):269–277
Part II
Applications
Choice of Inputs and Outputs
for Production Analysis 26
Subhash C. Ray

Contents
Introduction
Inputs and Outputs: Some Basic Features
  Multistage Production
  Classification of Inputs: Variable and Fixed
  Fixed Inputs and Short Run Cost Function
  Bad Outputs: Zero or Negative Inputs and Outputs
Input Aggregation
  Input Aggregation in Nonparametric Models
  Input Aggregation in Parametric Models
  Statistical Tests for Input Aggregation
  A Statistical Test for Nested Radial DEA Models
  DEA and Bootstrap
Nondiscretionary Inputs
  Second Stage Regression
  Truncated Regression in the Second Stage
  Contextual Variables in Parametric Models
  Choice Between Inputs and Contextual Variables
Input-Output Choice in Some Areas of Application
  Manufacturing
  Banking
  Health Care
Conclusion
Cross-References
References

The paper has benefitted from valuable comments by Subal Kumbhakar on an earlier draft.

S. C. Ray
Department of Economics, University of Connecticut, Storrs, CT, USA
e-mail: [email protected]

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_20
Abstract

Production creates value by transforming inputs into outputs. Classification of
variables as inputs or outputs depends on the scope of decision-making by the
firm. Inputs enter the boundary of the firm from outside without any prior pro-
cessing by the firm and once transformed into outputs exit its jurisdiction without
any further processing. This chapter highlights the defining characteristics of
inputs and outputs both for single stage and multistage production. The necessary
conditions that must be met for a valid aggregation of several inputs into
total expenditure are discussed both for nonparametric and parametric models
of production. Several statistical tests of hypotheses related to aggregation
of several inputs or exclusion of individual inputs in nonparametric models
are discussed. The technology set of feasible input-output bundles invariably
depends on many environmental or contextual variables that are outside the
control of the producer. In parametric Stochastic Frontier Analysis, they can
be directly included as determinants of the mean or variance of the technical
efficiency factor causing shifts in the production frontier. In nonparametric Data
Envelopment Analysis, influence of such factors is measured through a second
stage regression of efficiency scores on the contextual variables. The alternative
approaches of a second stage least squares regression and a truncated regression
are briefly discussed. The chapter ends with examples of input-output choice in
several popular areas of application like manufacturing, banking, and health care.

Keywords

Multistage production · Input aggregation · Contextual variables · Second stage regression

JEL Classification Numbers

D24, C44

Introduction

Production is the act of transforming inputs into outputs creating value in the
process. Enhancing human welfare is the ultimate goal of an economic system
and welfare of an individual in the society is increased through consumption of
goods and services. Thus, production of these goods and services is the means
of improving welfare. As noted by Marx and Engels, men begin to “distinguish
themselves from animals as soon as they begin to produce their means of subsis-
tence, a step which is conditioned by their physical organization. By producing their
means of subsistence men are indirectly producing their actual material life” (Marx
and Engels, The German Ideology, “Feuerbach” 1) [17]. Over the different phases
of civilization, human societies morphed from hunter-gatherer tribes, into farming
communities, and ultimately into the modern industrial economies supported by
artificial intelligence, machine learning, and information technology. Despite all the
technological progress, however, the essence of production remains conversion of
inputs into outputs. In order to measure rates of technical progress or to evaluate
productive performance of a firm, one must appropriately identify and measure
inputs and outputs. The broad objective of this chapter is to highlight a number
of important points that must be considered while classifying resources as inputs or
outputs in any specific context.
The rest of the chapter unfolds as follows. Section “Inputs and Outputs: Some
Basic Features” starts with some defining characteristics of inputs and outputs. This
is followed by a discussion of multistage production, the distinction between fixed
and variable inputs, the role of fixed inputs in short run cost minimization, and
dealing with bad outputs or zero and negative outputs and inputs.
Section “Input Aggregation” considers the validity of input aggregation and
use of total cost as a single aggregate input in both nonparametric Data Envelop-
ment Analysis (DEA) and parametric Stochastic Frontier Analysis (SFA) models,
statistical tests for aggregation and/or inclusion of inputs, and bootstrapping the
estimated DEA efficiency scores. Various issues related to use of a second stage
regression to measure the marginal effects of nondiscretionary contextual variables
on efficiency measurement are discussed in section “Nondiscretionary Inputs.”
Section “Input-Output Choice in Some Areas of Application” wraps up the chapter
with examples of input-output selection for three frequently analyzed industries:
manufacturing, banking, and health care. Section “Conclusion” offers closing remarks.

Inputs and Outputs: Some Basic Features

In production economics, the firm is the decision-making agent with a clearly
defined boundary or jurisdiction of its decisions. It acquires its inputs from outside
and, once production is complete, ships out the finished output. In most (if not all)
cases, such inputs have been subject to prior processing by some other firm before
they enter its jurisdiction. For example, in agriculture, the farmer uses fertilizers
and pesticides as inputs. These fertilizers and other chemicals had to be produced by
other firms before they were available as inputs to the farmer. However, the farmer
did not participate in the production of these chemicals. At the other end, wheat
produced by the farmer is ground into flour by a flour mill. But once the bag of
wheat has been shipped out by the farm, the farm plays no role in any further processing
of its product. Thus, an input enters the jurisdiction of the firm from outside without
any prior processing by the firm, whereas its output, once it exits the boundary of the
firm, is not subject to any further processing by the firm.
In some cases, an input used by the firm may not be supplied from outside. A
steel plant may acquire coal for its furnace from its own mines. But coal provided
by the company’s captive mine as fuel for the furnace is no different from coal
purchased from some other supplier in the market and has not been excavated by
the steel plant.
In classifying resources as inputs and outputs, the critical points to note are:

• An input exists even before production starts. The truckload of coal was already
there even before the steel plant started production.
• An input is depleted in stock as production is carried out. The inventory of coal
in the warehouse of the steel plant is reduced as more and more coal is used for
steel making.
• An input has not been subject to any prior processing by the firm. The truckload
of coal has not been excavated from the mine by the steelmaking plant.
• An output did not exist in its present form prior to production. Although the iron
ore, coal, and other materials used by the steelmaker were in existence, there was
no steel before production.
• An output increases in stock as production is carried out.
• Once the output is shipped out, it is not subject to any further processing by the
firm.

In some cases, the output may not be physically differentiable from the inputs.
Making a bowl of salad, for example, requires slicing tomatoes, cucumbers, and
lettuce, chopping onions, and adding salad dressing. The bowl of salad is the output
while the various ingredients and labor are the inputs. In this case, the output does
not look much different from the inputs and most of the ingredients (at least the
vegetables) can be identified and physically separated. But the prepared bowl of
salad is more valuable than the raw ingredients. The act of slicing, chopping, and
mixing the ingredients is the production process that adds value to the ingredients.
An extreme example can be found in the postal service industry, where the
accumulated mail (consisting of letters, magazines, and packages) at the post office
is the input and the delivered mail at the different addresses is the output. In this
case, there is no physical transformation of the inputs at all. However, the letter
carrier’s labor and the transportation service of the carrier’s vehicle are used to
deliver the mail from the post office to the recipients’ addresses, thereby creating
value. Even though not physically altered in shape or size, the accumulated mail at
the post office is turned into delivered mail at the recipient’s address and becomes
more valuable to the addressee than the undelivered mail.
In any empirical application, the relevant input and output variables need to be
defined with care. There should be some obvious technical relationship between
inputs and outputs. It is important to remember that definition of inputs and outputs
will depend on the boundary of the firm.
Multistage Production

The process of producing the output typically involves various tasks – some to be
performed simultaneously and others sequentially. Often, a number of tasks can be
grouped together to define a specific stage of production. In many cases, one can
identify multiple stages of production within the jurisdiction of the same firm. In
such cases, the output from the earlier stage becomes an input for the next stage.
For example, in furniture making, lumber, labor, and tools are the inputs and the
fabricated parts are the outputs in the fabrication shop. These parts along with
additional labor and tools are the inputs in the assembly shop while the assembled
but unpainted furniture is the output at this stage. This unfinished furniture, more
labor, and paints are the inputs and the finished furniture is the output at the final
stage. One can think of each division within the firm as a decision-making unit with
the inputs and outputs appropriately defined to measure efficiency at each stage.
Alternatively, the entire firm can be viewed as the decision-making unit and only the
lumber, labor, tools used at the different stages, and the paints are treated as inputs
while the finished furniture is the output. A strand in the DEA literature described
as network DEA (but more appropriately a multistage DEA) seeks to ascribe the
overall (in)efficiency of a firm to the different stages of production in a multistage
production setting.
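The two views of the furniture maker described above (stage by stage, or the firm as a whole) can be sketched in a few lines. This is only an illustration: the `Stage` class and the function `firm_level_bundle` are hypothetical names introduced here, and the rule applied (anything produced by one stage and used by another is an intermediate product, hence neither an input nor an output of the firm) is a simplified reading of the discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One stage of production with its own inputs and outputs (hypothetical)."""
    name: str
    inputs: frozenset
    outputs: frozenset


stages = [
    Stage("fabrication", frozenset({"lumber", "labor", "tools"}),
          frozenset({"fabricated parts"})),
    Stage("assembly", frozenset({"fabricated parts", "labor", "tools"}),
          frozenset({"unpainted furniture"})),
    Stage("finishing", frozenset({"unpainted furniture", "labor", "paints"}),
          frozenset({"finished furniture"})),
]


def firm_level_bundle(stages):
    """Return (inputs, outputs) when the whole firm is the decision-making unit."""
    all_inputs = set().union(*(s.inputs for s in stages))
    all_outputs = set().union(*(s.outputs for s in stages))
    # Items produced inside the firm and used by a later stage never cross
    # the firm's boundary, so they drop out at the firm level.
    intermediates = all_inputs & all_outputs
    return all_inputs - intermediates, all_outputs - intermediates


firm_inputs, firm_outputs = firm_level_bundle(stages)
print(sorted(firm_inputs))   # external inputs of the firm
print(sorted(firm_outputs))  # output shipped out by the firm
```

At the stage level each `Stage` keeps its own input and output lists, which is the information a multistage DEA would use.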

Classification of Inputs: Variable and Fixed

In production economics, a distinction is made between variable and fixed inputs.
Variable inputs are those the quantities of which change with the quantity of output
produced. These include raw materials, unskilled labor, and energy. By contrast, one
does not see a change in plant size or the number of managers with every change
in the output level. These are called fixed inputs. It is important to recognize
that the fixed versus variable input designation is not a physical but an economic
concept. When the output level increases, the firm can schedule production workers
for additional hours and use more fuel and raw materials without any significant
adjustment cost. Frequent changes in plant size (even when physically possible, by
leasing extra facilities at short notice) would entail enormous adjustment
costs. Similarly, hiring and firing management personnel would result in large
adjustment costs in the form of severance pay or hiring bonuses. In such cases, the
firm is better off by not altering the levels of these inputs. That is the reason why
they are considered to be fixed. But this is only in the short run. In the long run,
when the time comes to renew the lease on the premises or to renew the contract
of management staff, the firm can change the levels of these inputs that would be
appropriate for its planned level of output.
Fixed Inputs and Short Run Cost Function

For cost minimization in the short run, the firm will have to find the least expensive
bundle of the variable inputs that can produce the targeted output when combined
with the preset quantities of the fixed inputs. Note that because both the prices
and the quantities of the fixed inputs are predetermined, the level of fixed costs
is already set. Cost minimization in the short run, therefore, requires minimization
of the variable costs only.
Consider the production possibility set

$$T = \{(x, y) : x \in \mathbb{R}^n_+ \text{ can produce } y \in \mathbb{R}^m_+\} \tag{1}$$

where x is a vector of inputs and y is a vector of outputs.

Suppose that the firm’s input bundle is partitioned as x = (v; K) where v is
the vector of variable inputs and K is the vector of fixed inputs. A corresponding
partition of the input price vector is q = (w, r). The firm has a target output
vector $y^0$ and a given bundle of fixed inputs $K^0$. The minimum variable cost
would be

$$VC(w; K^0, y^0) = \min_v \{ w'v : (v, K^0, y^0) \in T \}. \tag{2}$$

In parametric econometric analysis, one can specify a suitable functional form
of the variable cost and estimate the parameters of the model empirically. The
regularity conditions that must be satisfied by any estimated variable cost function
are

(i) $\partial VC/\partial y > 0$ (positive marginal costs);
(ii) $\partial VC/\partial w > 0$ (positive conditional input demands implied by Shephard’s
lemma);
(iii) $(\partial VC/\partial w)'\, w = VC$ (linear homogeneity in the variable input prices); and
(iv) the matrix $\partial^2 VC/\partial w\, \partial w'$ is negative semidefinite (concavity in the input prices).

Additionally, (v) $\partial VC/\partial K < 0$. This last condition follows from the fact that
because the marginal productivity of each fixed input is positive, any increase in
a fixed input must be offset by a decrease in some variable input if the output is to
remain constant. This causes the variable cost to decline. In fact, one can define the
shadow price of the sth fixed input as

$$\rho_s = -\frac{\partial VC}{\partial K_s}. \tag{3}$$

In the single fixed input case, one can perform a statistical test for the hypothesis
that the shadow price (ρ) of the fixed input is equal to its market price (r). When
multiple fixed inputs are involved, one can apply the Mahalanobis $D^2$ test to
compare the mean vectors of shadow prices and the market prices of the fixed inputs.
The corresponding DEA LP problem is

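$$
\begin{aligned}
VC^{*}_{0} = \min\ & w'v \\
\text{s.t.}\ & \sum_{j=1}^{N} \lambda_j v^j \le v; \\
& \sum_{j=1}^{N} \lambda_j K^j \le K^0; \\
& \sum_{j=1}^{N} \lambda_j y^j \ge y^0; \\
& \sum_{j=1}^{N} \lambda_j = 1; \\
& v \ge 0;\ \lambda_j \ge 0\ (j = 1, 2, \ldots, N).
\end{aligned}
\tag{4}
$$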

The short run cost of the firm is $C(w, r, y^0, K^0) = VC(w, K^0, y^0) + r'K^0$. The
shadow price of a fixed input is the negative of the optimal value of the dual variable
associated with that particular fixed input. The marginal cost of any output is the
dual variable associated with that output constraint.
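The LP in (4) can be handed to any linear programming routine. The following is a minimal sketch using SciPy's `linprog`; the function name `short_run_variable_cost` and the three-firm data set are hypothetical, made up purely for illustration, and the decision variables are stacked as the variable-input bundle v followed by the weights λ.

```python
import numpy as np
from scipy.optimize import linprog


def short_run_variable_cost(w, V, K, Y, K0, y0):
    """Solve the DEA problem (4) for given fixed inputs K0 and target output y0.

    V, K, Y hold the observed variable inputs, fixed inputs, and outputs,
    one row per firm. Returns the minimum variable cost and the optimal v.
    """
    V, K, Y = (np.asarray(a, dtype=float) for a in (V, K, Y))
    w, K0, y0 = (np.asarray(a, dtype=float) for a in (w, K0, y0))
    N, nv = V.shape
    nk, m = K.shape[1], Y.shape[1]

    # Decision variables: v (nv entries), then lambda_1..lambda_N.
    c = np.concatenate([w, np.zeros(N)])
    A_ub = np.vstack([
        np.hstack([-np.eye(nv), V.T]),          # sum_j lambda_j v^j <= v
        np.hstack([np.zeros((nk, nv)), K.T]),   # sum_j lambda_j K^j <= K0
        np.hstack([np.zeros((m, nv)), -Y.T]),   # sum_j lambda_j y^j >= y0
    ])
    b_ub = np.concatenate([np.zeros(nv), K0, -y0])
    A_eq = np.concatenate([np.zeros(nv), np.ones(N)]).reshape(1, -1)

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=(0, None), method="highs")
    if not res.success:
        raise ValueError(res.message)
    return res.fun, res.x[:nv]


# Hypothetical data: one variable input, one fixed input, one output.
V = [[2.0], [4.0], [3.0]]
K = [[1.0], [1.0], [2.0]]
Y = [[1.0], [2.0], [2.0]]
w = [2.0]

vc_small_plant, _ = short_run_variable_cost(w, V, K, Y, K0=[1.0], y0=[2.0])
vc_large_plant, _ = short_run_variable_cost(w, V, K, Y, K0=[2.0], y0=[2.0])
print(vc_small_plant, vc_large_plant)
```

In this made-up example the minimum variable cost of producing two units of output is lower with the larger endowment of the fixed input, which is the pattern that regularity condition (v) requires.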
It is important to clearly distinguish between fixed inputs and outputs. Output
is clearly a “cost driver” in the sense that an increase in the output quantity will
result in an increase in the cost. For the fixed inputs, there are two opposing forces
at work. Clearly, with prices of the fixed inputs given, any increase in the quantity of
fixed inputs will increase the fixed cost. But to the extent that the fixed inputs have
substitutability with any of the variable inputs, an increase in a fixed input holding
the outputs unchanged will lower the quantity of some variable input and hence the
cost of variable inputs. Hence, the short run total cost may rise or fall.
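A small numerical sketch may help to see the two opposing forces. It assumes, purely for illustration, a Cobb-Douglas technology y = v^a K^b with a single variable input v and a single fixed input K; the functional form, the parameter values, and the function names are assumptions made here and are not taken from the chapter. The shadow price in (3) is approximated by a finite difference.

```python
def variable_cost(w, K, y, a=0.5, b=0.5):
    """Variable cost of producing y when y = v**a * K**b and K is fixed."""
    v = (y / K**b) ** (1.0 / a)  # the only v that yields y with this K
    return w * v


def shadow_price(w, K, y, a=0.5, b=0.5, rel_step=1e-6):
    """rho = -dVC/dK, approximated by a central finite difference."""
    h = rel_step * K
    drop = variable_cost(w, K - h, y, a, b) - variable_cost(w, K + h, y, a, b)
    return drop / (2 * h)


def short_run_cost(w, r, K, y, a=0.5, b=0.5):
    """Short run total cost C = VC + r*K."""
    return variable_cost(w, K, y, a, b) + r * K


w, r, y = 1.0, 4.0, 10.0  # hypothetical prices and target output
for K in (4.0, 5.0, 8.0):
    print(K, round(shadow_price(w, K, y), 4), round(short_run_cost(w, r, K, y), 4))
```

With these numbers the shadow price exceeds the market price r at K = 4, so adding to the fixed input lowers total cost; at K = 8 the shadow price is below r and a further increase raises total cost. The two are equal at K = 5, where the short run total cost is lowest among the three levels.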
Sometimes, fixed inputs are confused with outputs because there are variable
inputs required for the maintenance of these fixed inputs. One requires labor and
other resources for maintenance of a furnace in a steel plant. In this case, an increase
in the level of a fixed input is actually increasing the quantity and the cost associated
with variable inputs. Nonetheless, the furnace remains an input and not an output
because it is not produced in the steel plant. For accurate costing, the maintenance
cost of the fixed input (like the furnace) should be included in the user cost of the
fixed input and would constitute a part of the fixed cost of the firm.

Bad Outputs: Zero or Negative Inputs and Outputs

Bad Output
In some cases, production results in some unintended and undesirable bad outputs
side by side with the desired or good outputs. Electric power generation also results
in smoke emission from thermal plants. Cement production leads to toxic waste
on the side. In production economics, bad outputs have been treated in various
ways – as inputs, joint products with the good outputs, and as by-products of
good output production. Of these, the input interpretation is the least satisfactory.
There was no smoke before power generation started. The amount of smoke in
the air is increased rather than reduced as more power is generated. Further, the
smoke in the air is not used for producing anything by the power plant. Thus, it
possesses none of the three defining characteristics of an input listed above. As for the
joint production vis-à-vis by-production interpretation, the context should determine
which one is more appropriate. In some cases, products are physically joint – like
cowhide and beef or some flowers and allergenic pollens. There joint production is
the obvious interpretation. Similarly, the same polluting input coal produces heat
(for electricity) and carbon. The two outputs – power and carbon emission – are
not physically joint but are ascribed to the same input. There by-production is more
appropriate. In other cases, the conceptualized production technology depends on
the analyst’s perspective. (For a detailed discussion of bad outputs, see the chapter
by Murty and Russell in volume 1 of this handbook.) It is important to note in this
context that good or bad characterization is related to an individual’s preferences
and is not a characteristic of the technology. For example, while smoke is generally
considered a bad output, in rural India villagers often burn wet hay explicitly to
create smoke to drive away mosquitoes. In that specific case, smoke is the intended
good output.

Zero Input or Output

In some cases, one finds that the firm is not using some input at all. For example,
in agriculture in less developed countries, farmers use both chemical fertilizers and
farmyard manure. But often farms are found to use no manure at all. The same is
true for the use of bullocks or motor-powered tillers. This implies that one input
can be substituted for the other. Soil nutrients can be provided either by manure
or by chemical fertilizers. In such cases, the two inputs should be aggregated as
“Fertilizers and Manure.” The exact basis for aggregation may be their chemical
composition or some other formula determined by experts. Similarly, different kinds
of fuel (coal, oil, and natural gas) can be aggregated based upon their calorific
values.
In some other cases, substitutability between pairs of inputs is less obvious. One
can think of chemical weed killers and labor. One may spray chemicals to kill weeds
or use labor to root the weeds out. In this situation, aggregation is neither possible
nor recommended and one must treat any zero value of the chemical herbicide input
as it is.
As for outputs, in multiproduct firms, some outputs may not be produced at
all. This is a perfectly rational decision of a profit or revenue maximizing firm. In
fact, even when all firms produce all of the outputs, one would like to compare the
combined cost of standalone production of individual outputs with the production
cost of a diversified firm producing all of the outputs together to measure economies
of scope. Zero output levels are particularly a problem when outputs are measured
in logs (as in a translog cost function). Often the recommended solution is to use a Box-Cox
transformation instead of the logarithm of the output quantity to circumvent the
problem of defining the log of zero.
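As a brief illustration of why the Box-Cox transformation sidesteps the problem, the sketch below evaluates the standard form (y^λ − 1)/λ at a zero output level. The function name `box_cox` and the parameter value 0.5 are chosen here for illustration only; in an application the parameter would be estimated or set by the analyst.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a non-negative quantity y with parameter lam."""
    if lam == 0:
        return math.log(y)         # limiting case; undefined at y = 0
    return (y ** lam - 1.0) / lam  # finite at y = 0 whenever lam > 0


print(box_cox(0.0, 0.5))                  # -2.0
print(box_cox(2.0, 1e-6), math.log(2.0))  # nearly equal for small lam
```

For any positive λ the transformed value at zero output is −1/λ, while for λ close to zero the transformation approaches the logarithm for positive outputs.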
In the nonparametric DEA models, a zero output poses no problem in a radial
output-oriented model. Similarly, a zero input is easily accommodated in an input-
oriented model. But with a zero input, in an output-oriented model, the benchmark
can be constructed only from units that employ zero quantity of the relevant input.

Negative Inputs or Outputs

In many papers (especially in the nonparametric DEA literature), authors have
developed various mathematical models for evaluating production efficiency in the
presence of negative inputs or outputs. One such model simply reverses the algebraic
sign and treats negative inputs as outputs or outputs as inputs (when all observations
of the relevant input or output are negative). When some observations have negative
and others have positive values of an input or output, one adds a sufficiently large
positive number to all observations so that the modified values are all positive. One
appeals to the translation invariance property of the DEA models to justify this data
transformation. (As explained in Ray [24], the output-oriented radial DEA model
is invariant to input translation but not to output translation. Similarly, the input-
oriented model is invariant to output translation but not to input translation.) But
the fundamental point one needs to remember is that the neoclassical production
possibility set is a subset of $\mathbb{R}^{m+n}_+$ and for any feasible input-output bundle (x, y),
$x \in \mathbb{R}^n_+$ and $y \in \mathbb{R}^m_+$. Thus, negative inputs or outputs are precluded. One often
regards the profit of a firm as an output and points out that it can be negative. But
a firm does not produce profit. It uses inputs to produce output. Profit is neither an
input nor an output but the difference between the revenue from the output and the
cost of the inputs.
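The asymmetry in translation invariance noted parenthetically above can be checked directly from the constraints of the output-oriented model under variable returns to scale. The following is a sketch of the argument for a scalar output; the translation constants $c_i$ and $d$ are symbols introduced here for illustration and do not appear in the original text.

```latex
% Input translation: add a constant c_i to input i of every firm.
\sum_{j=1}^{N} \lambda_j (x_{ij} + c_i)
  = \sum_{j=1}^{N} \lambda_j x_{ij} + c_i \le x_{i0} + c_i
  \iff \sum_{j=1}^{N} \lambda_j x_{ij} \le x_{i0},
  \qquad \text{because } \sum_{j=1}^{N} \lambda_j = 1 .

% Output translation: add a constant d to the output of every firm.
\sum_{j=1}^{N} \lambda_j (y_j + d) \ge \varphi\,(y_0 + d)
  \iff \sum_{j=1}^{N} \lambda_j y_j \ge \varphi y_0 + (\varphi - 1)\, d .
```

The input constraints are unchanged by the translation, so the optimal expansion factor is unaffected. The output constraint, by contrast, coincides with the original one only when d = 0 or the expansion factor equals one, so adding a constant to the outputs generally alters the measured efficiency of an inefficient unit.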

Input Aggregation

Selecting the right number of distinct inputs and outputs for the specification of
the underlying technology is often quite difficult in applied production analysis. In
many cases, for a variety of reasons, several inputs (or outputs) may be aggregated
into a single composite input (or output). In agricultural production, for example,
the number of even the important crops may be too large to treat them as separate
outputs. This requires aggregating them into a single index of crop output. Similarly,
especially in developing countries, one constructs a measure of total cultivated area
by adding up irrigated and unirrigated acreage, assigning a higher weight to irrigated
land. Because there are farms that cultivate only unirrigated land, including irrigated
land as a distinct input would create the usual problems associated with zero input
levels. In manufacturing, quantities of different kinds of fossil fuels are sometimes
aggregated in terms of their British Thermal Units (BTU) equivalent. In many cases,
information about input use is available only in value terms. One may, for example,
have to use the total wage payments instead of the numbers of different categories
of employees to measure the labor input. This is especially true for applications
where the input-output data are indirectly constructed from financial statements of
firms.
In applications of Data Envelopment Analysis (DEA), maintaining a large
number of inputs (or outputs) in the model may result in all or most of the decision-
making units (DMUs) being rated as efficient. As explained by Leibenstein and
Maital [16], this is a result of the dimensionality of the input/output space relative to
the number of observations. Meaningful aggregation of multiple inputs or outputs,
therefore, reduces the number of constraints in the linear programming model,
thereby increasing the “degrees of freedom.” However, while input aggregation has
some advantages when it is valid, it does pose problems in cases when it is not valid.
The theoretical implications of such aggregation have been addressed in some
previous studies. For instance, based on Monte Carlo simulations, Thomas and
Tauer [40] argue that with linear aggregation of inputs, the technical efficiency
measure becomes an economic measure of efficiency comprising both technical
and allocative components. As such, this introduces bias in the measurement of
technical efficiency. In a subsequent study by Tauer [39], the results of simulations
indicate that technical efficiency estimates computed by DEA are biased even when
the exact aggregator function is used to aggregate inputs. Färe and Zelenyuk [12] and
Färe et al. [13] demonstrate that if some inputs are introduced in the DEA model in
price-aggregated form, the technical efficiency measure will be biased downward
relative to the measure obtained with the inputs in disaggregated form. Further, the
bias is equal to the allocative inefficiency. They further illustrate that the DEA
technique will yield unbiased technical efficiency scores even when inputs are price
aggregated, if and only if there is no allocative inefficiency in the subvector of inputs
that is aggregated.
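A stylized numerical sketch of this result follows. The three input bundles and the prices are hypothetical numbers chosen so that every firm produces the same output and is technically efficient in the disaggregated two-input model: A uses the least of input 1, C uses the least of input 2, and B lies below the line segment joining A and C. Because outputs are identical and there is a single aggregated input, the input-oriented DEA score reduces to the lowest observed cost divided by the firm's own cost, so no LP solver is needed here.

```python
# Hypothetical input bundles (x1, x2); every firm produces one unit of output.
bundles = {"A": (1.0, 4.0), "B": (2.0, 2.0), "C": (4.0, 1.0)}
prices = (1.0, 1.0)  # all firms are assumed to face the same input prices

# Price-aggregated input: total expenditure on the two inputs.
cost = {f: sum(p * x for p, x in zip(prices, xs)) for f, xs in bundles.items()}

# With one aggregated input and identical outputs, the input-oriented DEA
# score of a firm is the lowest observed cost divided by its own cost.
lowest_cost = min(cost.values())
aggregated_score = {f: lowest_cost / c for f, c in cost.items()}

# Technical efficiency in the disaggregated model is 1 for all three firms
# by construction, so any shortfall below 1 is allocative inefficiency.
for f in bundles:
    print(f, cost[f], aggregated_score[f])
```

Firms A and C receive a score of 0.8 in the aggregated model even though neither could reduce both inputs proportionally; the shortfall reflects their input mix at the given prices, not technical inefficiency.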
The above theoretical developments reveal that while including too many distinct
inputs (or outputs) in the DEA model may lead to problems associated with
overspecification, aggregating inputs into a smaller number of composite inputs to be included
in the model may lead to biased measures of technical efficiency. The validity of
input aggregation in any specific application is therefore an empirical question.
However, the statistical side of this issue has been far less explored. Of the few
studies that address this aspect of the problem, some are nonparametric, while
others assume some parametric form of the statistical distribution. Simar and Wilson
[36] discuss bootstrap procedures for testing several restrictions in nonparametric
models, including tests for input or output aggregation. Sirvent et al. [38] use
simulations to evaluate the performance of a number of statistical tests that can
be used for the selection of variables in DEA efficiency models. In a recent paper,
Banker et al. [7] propose a method of estimating allocative inefficiency when only
aggregate cost or revenue data and quantity data (but not price data for individual
inputs or outputs) are available. The prices are not known to the
analyst, but if all firms face identical prices and are allocatively efficient, then the
aggregated and disaggregated technical efficiencies are identical. The authors test
this equality, and hence allocative efficiency, through an application
of Banker’s [4] F-test based on the DEA technical efficiency residuals from the
aggregate and the disaggregated models.
In an empirical application, even when price and quantity data are all available,
one might prefer to aggregate subgroups of inputs (or outputs) to overcome the
“curse of dimensionality.” There are, of course, various ways in which inputs may be
aggregated. In some contexts, the aggregation weights may be guided by technical
norms – like aggregating various kinds of fuels into a total energy input based on
their individual BTU contents. In some other cases, the aggregation may be data
driven – like using Principal Component Analysis where several inputs (or outputs)
are replaced by a small number of their principal components, to reduce the problem
of dimensionality. (For a discussion of the use of Principal Component Analysis
(PCA) within DEA see Adler and Golany [1, 2]. An inherent problem with PCA is
that typically there is no intuitive or economic interpretation of what these principal
components represent.) An economically meaningful aggregation procedure is one
where a subgroup of inputs (or outputs) is replaced by its total cost (or revenue).
Using prices would be the valid way to aggregate when choice of the relevant inputs
(or outputs) is allocatively efficient.

Input Aggregation in Nonparametric Models

Consider an industry producing a scalar output, y, from a bundle of n inputs,
$x = (x_1, x_2, \ldots, x_n)$. Let $(x^j, y_j)$ be the observed input-output bundle of firm j (j = 1, 2, ..., N).
(This section draws heavily from Ray and Mukherjee [28].) The technology is
defined by the production possibility set

$$T = \{(x, y) : y \le f(x);\ y \in \mathbb{R}_+,\ x \in \mathbb{R}^n_+\} \tag{5}$$

where $f(x) = \max \{y : (x, y) \in T\}$ is the production function.

Firms move from one point to another on the same isoquant when (relative) input
prices change – either over time or across regions. But under the assumption of
competitive input markets (justifying price-taking behavior), firms in the same market will
face the same prices.
The production function evaluated at the input bundle $x^0$ is

$$f(x^0) = \max \{\varphi y_0 : (x^0, \varphi y_0) \in T\}. \tag{6}$$

Obviously, every observed input-output bundle $(x^j, y_j)$ is feasible. Under the
standard assumptions of convexity of the production possibility set and free
disposability of inputs and output, an estimate of the set T is:

$$S = \Big\{(x, y) : x \ge \sum_{j=1}^{N} \lambda_j x^j;\ y \le \sum_{j=1}^{N} \lambda_j y_j;\ \sum_{j=1}^{N} \lambda_j = 1;\ \lambda_j \ge 0;\ j = 1, 2, \ldots, N\Big\} \tag{7}$$
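A nonparametric estimate of the production frontier at the input bundle $x^0$ is
obtained by solving the following linear programming problem:

$$
\begin{aligned}
\hat f(x^0) = \max\ & \varphi y_0 \\
\text{s.t.}\ & \sum_{j=1}^{N} \lambda_j x^j \le x^0; \\
& \sum_{j=1}^{N} \lambda_j y_j \ge \varphi y_0; \\
& \sum_{j=1}^{N} \lambda_j = 1;\ \lambda_j \ge 0\ (j = 1, 2, \ldots, N).
\end{aligned}
\tag{8}
$$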
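Problem (8) is a standard LP and can be solved numerically. Below is a minimal sketch using SciPy's `linprog`; the function name `dea_frontier_output` and the four-firm, two-input data set are hypothetical and serve only to illustrate the mechanics. The output constraint is entered as: the convex combination of observed outputs must be at least φ times the output of the firm under evaluation.

```python
import numpy as np
from scipy.optimize import linprog


def dea_frontier_output(X, y, x0, y0):
    """Solve problem (8): return (phi*, f_hat(x0)) with f_hat(x0) = phi* * y0."""
    X = np.asarray(X, dtype=float)   # N x n matrix of observed input bundles
    y = np.asarray(y, dtype=float)   # N observed scalar outputs
    x0 = np.asarray(x0, dtype=float)
    N, n = X.shape

    # Decision variables: phi, then lambda_1..lambda_N. linprog minimizes,
    # so maximizing phi * y0 becomes minimizing -y0 * phi.
    c = np.concatenate([[-y0], np.zeros(N)])
    A_ub = np.vstack([
        np.hstack([np.zeros((n, 1)), X.T]),     # sum_j lambda_j x^j <= x0
        np.hstack([[y0], -y]).reshape(1, -1),   # phi*y0 <= sum_j lambda_j y_j
    ])
    b_ub = np.concatenate([x0, [0.0]])
    A_eq = np.concatenate([[0.0], np.ones(N)]).reshape(1, -1)

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=(0, None), method="highs")
    if not res.success:
        raise ValueError(res.message)
    phi = res.x[0]
    return phi, phi * y0


# Hypothetical data: four firms, two inputs, one output.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [2.5, 2.5]]
y = [1.0, 1.0, 3.0, 1.5]

phi, f_hat = dea_frontier_output(X, y, x0=X[3], y0=y[3])
print(phi, f_hat)
```

For the fourth firm in this made-up sample the frontier output at its input bundle is 7/3, obtained from a convex combination of the other three firms, so its observed output of 1.5 could be expanded by a factor of 14/9.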

The support of $\hat f(\cdot)$ is the free disposal convex hull of the observed input bundles

$$X^* = \Big\{x : x \ge \sum_{j=1}^{N} \lambda_j x^j;\ \sum_{j=1}^{N} \lambda_j = 1;\ \lambda_j \ge 0;\ j = 1, 2, \ldots, N\Big\}. \tag{9}$$

The dual of the LP problem (8) is

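$$
\begin{aligned}
\min\ & \alpha_0 + \beta^{0\prime} x^0 \\
\text{s.t.}\ & \alpha_0 + \beta^{0\prime} x^j \ge p y_j;\ (j = 1, 2, \ldots, N) \\
& p y_0 = y_0; \\
& p \ge 0;\ \beta^0 \ge 0;\ \alpha_0 \text{ unrestricted.}
\end{aligned}
\tag{10}
$$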

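For a simple example, consider the 3-input 1-output case and the observed input-
output bundle of firm k in the sample. For this example, the explicit form of problem
(8) above is

\[
\begin{aligned}
\hat{f}(x_{1k}, x_{2k}, x_{3k}) = \max\ & \varphi y_k \\
\text{s.t. } & \sum_{j=1}^{N} \lambda_j y_j \ge \varphi y_k; \\
& \sum_{j=1}^{N} \lambda_j x_{1j} \le x_{1k}; \\
& \sum_{j=1}^{N} \lambda_j x_{2j} \le x_{2k}; \\
& \sum_{j=1}^{N} \lambda_j x_{3j} \le x_{3k}; \\
& \sum_{j=1}^{N} \lambda_j = 1;\ \lambda_j \ge 0\ (j = 1, 2, \ldots, N)
\end{aligned}
\tag{11}
\]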

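The corresponding dual problem is

\[
\begin{aligned}
\hat{f}(x_{1k}, x_{2k}, x_{3k}) = \min\ & \alpha_k + \beta_1^k x_{1k} + \beta_2^k x_{2k} + \beta_3^k x_{3k} \\
\text{s.t. } & \alpha_k + \beta_1^k x_{1j} + \beta_2^k x_{2j} + \beta_3^k x_{3j} \ge p y_j;\ (j = 1, 2, \ldots, N) \\
& p y_k = y_k; \\
& \beta_1^k, \beta_2^k, \beta_3^k, p \ge 0;\ \alpha_k \text{ unrestricted}
\end{aligned}
\tag{12}
\]

Clearly, in (12) above, p = 1.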

Define the function

\[ R^k(x_1, x_2, x_3) = \alpha_k + \beta_1^k x_1 + \beta_2^k x_2 + \beta_3^k x_3 \tag{13} \]

Then $R^k(x^j) \ge y_j$ for each j and $R^k(x^k) = \hat{f}(x^k)$. (The function $R^k(\cdot)$ is a
supporting hyperplane constituting a local outer approximation to the frontier of the
production possibility set and is one of the facets of the overproduction function
[41]. For a more detailed discussion see Ray [24], Chap. 10.)
We may express the deviation between $y_j$ and $R^k(x^j)$ as

\[ \varepsilon_j^k = R^k(x^j) - y_j \quad (j = 1, 2, \ldots, k, \ldots, N). \tag{14} \]

Given that $R^k(x^k) = \hat{f}(x^k)$, as noted above, the problem (12) can be
reformulated as

\[
\begin{aligned}
\min\ & \varepsilon_k^k = R^k(x^k) - y_k \\
\text{s.t. } & \varepsilon_j^k = R^k(x^j) - y_j \ge 0 \quad (j = 1, 2, \ldots, k, \ldots, N).
\end{aligned}
\tag{15}
\]

Several points need to be noted here:

(i) While the non-negative residual $\varepsilon_k^k$ measures the deviation of $y_k$ from the
frontier $\hat{f}(\cdot)$ evaluated at the input bundle $x^k$, the other residuals $\varepsilon_j^k$ ($j \ne k$)
have no such interpretation.
(ii) For any other input bundle $x^j$ ($j \ne k$), the corresponding tangent hyperplane
$R^j(x_1, x_2, x_3) = \alpha_j + \beta_1^j x_1 + \beta_2^j x_2 + \beta_3^j x_3$ will have a different gradient
vector.
(iii) The efficiency residuals $\varepsilon_j^j$ (j = 1, 2, . . . , N) are obtained independently
rather than jointly.

In (12) or (14) above, we impose no additional constraints beyond the non-
negativity of the βs. Now impose a further constraint

\[ a\beta_1^k - b\beta_2^k = 0. \tag{16} \]


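That is, $\beta_2^k = \frac{a}{b}\beta_1^k$. The restricted version of (12) would then be

\[
\begin{aligned}
\tilde{R}(x^k) = \min\ & \alpha_k + \beta_1^k \left( x_{1k} + \tfrac{a}{b} x_{2k} \right) + \beta_3^k x_{3k} \\
\text{s.t. } & \alpha_k + \beta_1^k \left( x_{1j} + \tfrac{a}{b} x_{2j} \right) + \beta_3^k x_{3j} \ge p y_j; \\
& p y_k = y_k; \\
& \beta_1^k, \beta_2^k, \beta_3^k, p \ge 0;\ \alpha_k \text{ unrestricted.}
\end{aligned}
\tag{17}
\]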

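Define, now, the aggregated input

\[ X_{1j} = x_{1j} + \frac{a}{b} x_{2j}. \tag{18} \]

Problem (17) would then become

\[
\begin{aligned}
\tilde{R}(x^k) = \min\ & \alpha_k + \beta_1^k X_{1k} + \beta_3^k x_{3k} \\
\text{s.t. } & \alpha_k + \beta_1^k X_{1j} + \beta_3^k x_{3j} \ge p y_j; \\
& p y_k = y_k; \\
& \beta_1^k, \beta_3^k, p \ge 0;\ \alpha_k \text{ unrestricted}
\end{aligned}
\tag{19}
\]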

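The dual of this problem is

\[
\begin{aligned}
\tilde{f}(x_{1k}, x_{2k}, x_{3k}) = \max\ & \varphi y_k \\
\text{s.t. } & \sum_{j=1}^{N} \lambda_j y_j \ge \varphi y_k; \\
& \sum_{j=1}^{N} \lambda_j X_{1j} \le X_{1k}; \\
& \sum_{j=1}^{N} \lambda_j x_{3j} \le x_{3k}; \\
& \sum_{j=1}^{N} \lambda_j = 1;\ \lambda_j \ge 0\ (j = 1, 2, \ldots, N)
\end{aligned}
\tag{20}
\]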

Obviously, when a = b, X1 is simply the sum of the quantities of the inputs x1
and x2 . In that case, the two inputs are treated as perfect substitutes. Further, because
(19) is a restricted version of (12), the minimum value of the objective function at
the optimal solution of (19), which by duality equals the optimal value of (20), will
be no lower than what is obtained at the optimal solution of (12). Therefore,

\[ \tilde{\varepsilon}_k^k = \tilde{f}_k(x^k) - y_k \ge \varepsilon_k^k = \hat{f}(x^k) - y_k. \tag{21} \]

The former is the restricted and the latter the unrestricted residual.

Input Aggregation in Parametric Models

For input aggregation in a parametric specification of the production function, one
assumes homothetic separability. Consider, again, the production function

\[ y = f(x_1, x_2, x_3) = F(a(x_1, x_2), x_3). \tag{22} \]

Assume further that $X_1 = a(x_1, x_2)$ is homogeneous of degree 1 and is functionally
separable from $x_3$. Linear homogeneity implies (by Euler's theorem) $X_1 = a_1 x_1 + a_2 x_2$,
where $a_1 = \partial a / \partial x_1$ and $a_2 = \partial a / \partial x_2$. Separability implies that the marginal rate of
substitution between $x_1$ and $x_2$ is independent of $x_3$. That is,
$\partial (a_2 / a_1) / \partial x_3 = 0$. Now, if the firm selects
the inputs $(x_1, x_2)$ to minimize $C(X_1) = w_1 x_1 + w_2 x_2$ for the sub-aggregate input
$X_1$, then $a_2 / a_1 = w_2 / w_1$. Hence, up to the scale factor $a_1$,
$X_1 = x_1 + \frac{w_2}{w_1} x_2 = \frac{w_1 x_1 + w_2 x_2}{w_1}$. Using $w_1$ as the numeraire, we
get

\[ X_1 = w_1 x_1 + w_2 x_2. \tag{23} \]

Statistical Tests for Input Aggregation

In econometric models, test of linear restrictions on parameters is usually conducted

by comparing the residual sums of squares from the restricted and the unrestricted
model in an appropriate F test. A comparable F test developed by Banker [4]
can be employed to test whether the efficiency residuals from the restricted DEA
model are significantly bigger as a whole than the unrestricted DEA residuals.
(Pastor, Ruiz, and Sirvent [20] performed a nonparametric statistical test of nested
radial DEA models to determine the optimal choice of inputs and outputs.) Note
that the restriction (16) results in the input aggregation defined in (17). Hence,
this F test is in effect a test of the validity of the way the inputs have been
aggregated.
The multipliers β1k and β2k at the optimal solution of (12) denote the shadow
prices of the corresponding inputs of firm k: x1k and x2k . A problem often encountered
in DEA (as in other linear programming models) is that at the optimal solution,
the shadow price becomes zero because a constraint is nonbinding. To avoid this
problem, one may impose upper and lower bounds of the form

\[ L \le \frac{\beta_2^k}{\beta_1^k} \le U \tag{24} \]

on the ratio of a pair of shadow prices. This is known as assurance region (AR)
analysis.

Note that one gets the restriction (16) above by setting

\[ L = \frac{a}{b} = U. \tag{25} \]

In that sense, aggregation of the two inputs in the manner described above is a
degenerate case of AR analysis.
The lower and upper bounds in AR analysis are often arbitrarily chosen.
However, the cost minimizing behavior of a competitive firm can provide guidance
in selecting the ratio a/b from available data. When firm k is allocatively efficient in
its choice of the input quantities x1k and x2k , the marginal rate of substitution will be
equal to the ratio of the two input prices faced by the firm. In that case,

\[ \frac{\beta_2^k}{\beta_1^k} = \frac{w_{2k}}{w_{1k}}. \tag{26} \]

A valid measure of the aggregate input would then be

\[ X_{1j} = x_{1j} + \frac{w_{2k}}{w_{1k}} x_{2j}. \tag{27} \]

As argued by Banker et al. [7], Banker’s F test would then become a test of
allocative efficiency.
In the existing literature, it is standard practice to assume that all firms face the
same prices of the inputs. In that case

\[ \frac{w_{2k}}{w_{1k}} = \frac{w_2}{w_1} \quad \text{for each firm } k. \tag{28} \]

When this is not the case, one must construct a different measure of the aggregate
input by using a different price ratio in order to evaluate the efficiency of each firm
k. That is, when evaluating firm r, the aggregate input is constructed as

\[ X_{1j}^r = x_{1j} + \frac{w_{2r}}{w_{1r}} x_{2j}. \tag{29} \]

In aggregating over more than two inputs, any one input price may be regarded as
a numeraire and the other inputs may be weighted by their relative prices. However,
when evaluating a different firm, s, the appropriate way to measure the aggregate
input would be to define

\[ X_{1j}^s = x_{1j} + \frac{w_{2s}}{w_{1s}} x_{2j} \quad (j = 1, 2, \ldots, N). \tag{30} \]

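The relevant LP problem (20) for any firm s (s = 1, 2, . . . , N) would then become

\[
\begin{aligned}
\tilde{f}(x_{1s}, x_{2s}, x_{3s}) = \max\ & \varphi y_s \\
\text{s.t. } & \sum_{j=1}^{N} \lambda_j y_j \ge \varphi y_s; \\
& \sum_{j=1}^{N} \lambda_j X_{1j}^s \le X_{1s}^s; \\
& \sum_{j=1}^{N} \lambda_j x_{3j} \le x_{3s}; \\
& \sum_{j=1}^{N} \lambda_j = 1;\ \lambda_j \ge 0\ (j = 1, 2, \ldots, N)
\end{aligned}
\tag{31}
\]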

Note that the aggregate input X1 needs to be measured differently while solving
the problem (31) for each individual firm unless all firms face the same prices of the
two inputs.
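
The firm-specific aggregation in (29) and (30) can be sketched as follows. When firm r is evaluated, the inputs of every firm j are aggregated with firm r's own price ratio, so a separate aggregate input series is built for each firm being evaluated. The function name and the quantities and prices below are hypothetical.

```python
def aggregate_input(x1, x2, w1_ref, w2_ref):
    """Return X_1j = x_1j + (w2/w1) * x_2j for every firm j.

    The price ratio is that of the firm being evaluated (the reference firm),
    with the price of input 1 used as the numeraire.
    """
    ratio = w2_ref / w1_ref
    return [a + ratio * b for a, b in zip(x1, x2)]


# Hypothetical data for three firms: quantities and prices of inputs 1 and 2.
x1 = [10.0, 8.0, 12.0]
x2 = [4.0, 6.0, 2.0]
w1 = [2.0, 2.0, 1.0]
w2 = [4.0, 2.0, 3.0]

# One aggregate input series per evaluated firm r, each covering all firms j.
agg_by_firm = [aggregate_input(x1, x2, w1[r], w2[r]) for r in range(len(x1))]
```

If all firms faced the same price ratio, the three series would coincide and a single aggregate input could be used in every LP problem.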
It is possible, of course, that in a particular application input (or output) prices
are not available and one must use the actual cost (or revenue) data in the model as
in Banker, Chang, and Natarajan [7]. This, it may be noted, implicitly assumes that
input prices are identical across firms.

Banker’s F Test in DEA

It may be noted that the DEA efficiency residuals εj are obtained independently of
each other. This is in contrast with the frontier production function model proposed
by Aigner and Chu [3]. In their case, a single parametric function is fitted to the
entire data set and the efficiency residuals are jointly derived and, therefore, are
not independent of one another. Now suppose that we choose a probability density
function f (.) such that f (εj ) is monotone decreasing in the efficiency residuals. In
that case, because the DEA estimate of the production function minimizes each
εj , it thereby maximizes each f (εj ). Hence, the DEA frontier g* (x) maximizes the
likelihood function subject to the constraints specified above.
Banker specifies the deterministic frontier where the random inefficiency com-
ponent of y appears in an additive manner. One may directly link the one-sided
econometric frontier with the DEA frontier by specifying (18) differently as

\[ y = g(x) e^{-\varepsilon};\ \varepsilon \ge 0 \tag{32} \]

leading to

\[ g(x) = \varphi y;\ \varphi \ge 1. \tag{33} \]

Thus,

\[ \varepsilon = \ln(\varphi) \ge 0. \tag{34} \]

Note that as argued in Banker [4], interpretation of the DEA frontier g* (x) as a
maximum likelihood estimator of the unknown frontier g(x) remains valid.
An alternative conceptualization of the frontier would be to directly specify (21)
with the non-negative efficiency residual defined as

\[ \varepsilon_k^k = \frac{\tilde{f}_k(x^k) - y_k}{y_k} = (\varphi_k - 1) \ge 0. \tag{35} \]

Banker has proposed a number of statistical tests for comparing two groups of
firms to assess whether one group is more efficient than the other. Assume that there
are N firms in the sample of which m1 are in group 1 and m2 are in group 2. Firms
in group 1 have the exponential distribution of (in)efficiency εj with parameter σ1
and those in group 2 also have the exponential distribution but with parameter σ2 .
Designate the first group of firms as M1 and the second group as M2 . Consider the
residuals εj* (j = 1, . . . , N) obtained from DEA. Under the maintained hypothesis,
the sample statistic $\sum_{j \in M_i} \varepsilon_j^* / \sigma_i$ has the $\chi^2$ distribution with $2m_i$ (i = 1, 2) degrees of freedom.
Under the null hypothesis σ1 = σ2 , the test statistic

\[ F = \frac{\sum_{j \in M_1} \varepsilon_j^* / m_1}{\sum_{j \in M_2} \varepsilon_j^* / m_2} \tag{36} \]

has the F distribution with (2m1 , 2m2 ) degrees of freedom.

On the other hand, if the εj s have the half Normal distribution (i.e., the
Normal distribution with mean 0 and variance σ² truncated from below at 0),
then $\sum_{j \in M_1} (\varepsilon_j^* / \sigma_1)^2$ has the $\chi^2$ distribution with m1 degrees of freedom. Similarly,
$\sum_{j \in M_2} (\varepsilon_j^* / \sigma_2)^2$ has the $\chi^2$ distribution with m2 degrees of freedom. Hence, in this
case, under the null hypothesis σ1 = σ2 , the statistic

\[ F = \frac{\sum_{j \in M_1} (\varepsilon_j^*)^2 / m_1}{\sum_{j \in M_2} (\varepsilon_j^*)^2 / m_2} \tag{37} \]

has the F distribution with (m1 , m2 ) degrees of freedom.
One would use $\varepsilon_k^* \equiv \ln \hat{\varphi}_k$ for the aggregated (i.e., the restricted) model in
the numerator and $\varepsilon_k^* \equiv \ln(\varphi_k)$ for the disaggregated (i.e., the unrestricted) model
in the denominator in testing for the validity of input aggregation. The aggregated
model would be rejected only when the test statistic exceeds the critical value for
the relevant degrees of freedom, for any specific distributional assumption.
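
A minimal sketch of the two test statistics (36) and (37) as they would be used for testing input aggregation, with the residuals from the aggregated (restricted) model in the numerator and those from the disaggregated (unrestricted) model in the denominator. The function name and the expansion factors are hypothetical, and the sketch only returns the statistics and their degrees of freedom; critical values would have to come from F tables or a statistics library.

```python
import math


def banker_f_statistics(phi_restricted, phi_unrestricted):
    """Compute the F statistics in (36) and (37) from output expansion factors.

    Residuals are eps = ln(phi) as in (34). The denominator is zero when every
    unrestricted phi equals 1; that case is not handled here.
    """
    eps_r = [math.log(p) for p in phi_restricted]
    eps_u = [math.log(p) for p in phi_unrestricted]
    m1, m2 = len(eps_r), len(eps_u)
    # Exponential case, equation (36): ratio of mean residuals.
    f_exp = (sum(eps_r) / m1) / (sum(eps_u) / m2)
    # Half-normal case, equation (37): ratio of mean squared residuals.
    f_half = (sum(e * e for e in eps_r) / m1) / (sum(e * e for e in eps_u) / m2)
    return {
        "exponential": (f_exp, (2 * m1, 2 * m2)),
        "half_normal": (f_half, (m1, m2)),
    }


# Hypothetical expansion factors from the restricted and unrestricted models.
phi_restricted = [1.30, 1.10, 1.25, 1.05]
phi_unrestricted = [1.20, 1.00, 1.15, 1.05]
stats = banker_f_statistics(phi_restricted, phi_unrestricted)
```

Because the restricted residuals are never smaller than the unrestricted ones, both statistics are at least 1 in this setting, and the question is only whether they are large enough to reject the aggregation.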

A Statistical Test for Nested Radial DEA Models

In parametric regression analysis, one typically applies the t test at the desired level
of significance to decide whether or not any individual regressor should be included
in the estimated model. Pastor, Ruiz, and Sirvent [20] proposed an analogous test for
model selection in terms of what they describe as marginal efficiency contribution
(MEC) of any individual input or output.
The underlying logic behind their test is quite simple. Suppose that an output-
oriented model includes m outputs and n inputs and the resulting radial output
expansion factors for the N individual units are $\varphi_j^*$ (j = 1, 2, . . . , N). Next
consider the model without one of the inputs. Because the new model has one fewer
restriction (corresponding to the excluded input), we get a new set of optimal values
$\varphi_j^{**}$ (j = 1, 2, . . . , N) for the same units. Define $\rho_j = \varphi_j^{**} / \varphi_j^*$. Clearly, because the
model excluding one input is less restrictive, $\varphi_j^{**} \ge \varphi_j^*$ so that $\rho_j \ge 1$. At this point,
the analyst has to select a critical value $\bar{\rho} > 1$ (e.g., $\bar{\rho} = 1.10$) and define the
binary variable Tj which equals 1 if $\rho_j > \bar{\rho}$ and 0 otherwise. Stated differently,
Tj = 1 indicates that exclusion of the input under consideration significantly alters
the optimal value of the DEA problem for unit j. Viewed this way, the random
variables Tj are binary outcomes of a series of independent Bernoulli trials with
some constant probability p of “success” (i.e., Tj = 1). One may select some
hypothesized probability p0 and test the null hypothesis

H0 : p ≤ p0 against the alternative H1 : p > p0 .

If H0 is rejected, one may conclude that exclusion of the relevant input
significantly affects the distribution of efficiency and, hence, it is a relevant input.
The p-value for the test statistic is $1 - F_B(T_0 - 1)$, where $T_0 = \sum_j T_j$ is the observed
number of successes and $F_B$ is the cumulative distribution function of the Binomial
distribution B(N − 1, p0 ). (The p-value is the
probability of getting T0 or more “successes” in a series of N − 1 independent trials
where the probability of success in any trial is p0 .)
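
The test can be sketched in a few lines. The function name, the default values of the critical ratio and of p0, and the expansion factors below are hypothetical choices for illustration; the p-value is computed as the probability of T0 or more successes in N − 1 trials, as described above.

```python
from math import comb


def mec_binomial_test(phi_full, phi_reduced, rho_bar=1.10, p0=0.15):
    """Return (T0, p_value) for the relevance of an excluded input.

    phi_full    : expansion factors from the model that includes the input
    phi_reduced : expansion factors from the model that excludes it
    rho_bar     : critical ratio chosen by the analyst (assumed default)
    p0          : hypothesized probability of success (assumed default)
    """
    rho = [reduced / full for full, reduced in zip(phi_full, phi_reduced)]
    t0 = sum(1 for r in rho if r > rho_bar)  # observed number of successes
    n_trials = len(rho) - 1
    # Upper-tail binomial probability P(T >= T0) with N - 1 trials.
    p_value = sum(
        comb(n_trials, k) * p0 ** k * (1.0 - p0) ** (n_trials - k)
        for k in range(t0, n_trials + 1)
    )
    return t0, p_value


# Hypothetical expansion factors for five units.
phi_full = [1.0, 1.2, 1.1, 1.0, 1.3]
phi_reduced = [1.0, 1.5, 1.15, 1.2, 1.3]
t0, p_value = mec_binomial_test(phi_full, phi_reduced, rho_bar=1.10, p0=0.5)
```

A small p-value would lead to rejection of H0, that is, to the conclusion that the excluded input is relevant.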

DEA and Bootstrap

Simar [32, 33], Simar and Wilson (1998, 2000) set the foundation for the consis-
tent use of bootstrap techniques to generate empirical distributions of efficiency
scores and have developed tests of hypotheses relating to returns to scale through
bootstrapping. Following Simar and Wilson (1997a), we can describe the existing
bootstrap techniques for the output-oriented technical efficiency measure from an
output-oriented radial DEA model with the following algorithm:

(i) Solve the DEA LP problem to obtain $\hat{\varphi}_j$ for each DMU j = 1, 2, . . . , N.
(ii) Select the b-th (b = 1, 2, . . . , B) independent naive bootstrap sample
$\varphi_{1,b}^*, \varphi_{2,b}^*, \ldots, \varphi_{N,b}^*$, which consists of N data values drawn with
replacement from the estimated values $\hat{\varphi}_j$.
(iii) Construct the smoothed bootstrap sample $\varphi_{1,b}^{**}, \varphi_{2,b}^{**}, \ldots, \varphi_{N,b}^{**}$ from the
naive bootstrap sample. Notice that all the $\varphi_j$s are greater than or equal to 1.
Therefore, the smoothed bootstrap sample should be appropriately bounded. It
will be computed according to:

\[
\varphi_{j,b}^{**} =
\begin{cases}
\varphi_j^* + h\varepsilon_j & \text{if } \varphi_j^* + h\varepsilon_j \ge 1 \\
2 - (\varphi_j^* + h\varepsilon_j) & \text{otherwise}
\end{cases}
\quad \text{for } j = 1, 2, \ldots, N. \tag{38}
\]

As proposed in Silverman (1986), $h = 0.9\, A\, N^{-1/5}$, where
$A = \min\{\text{s.d.}, \text{IQR}/1.34\}$, is the optimal bandwidth that minimizes the approximate
mean integrated square error of the distribution of the DEA values of $\hat{\varphi}$.
(iv) Create the b-th pseudo-data set as $\{(x_j^*, y_j^* = y_j \hat{\varphi}_j / \varphi_j^{**});\ j = 1, 2, \ldots, N\}$.
(v) Use the pseudo-data set to compute new $\hat{\varphi}_j^*$s from the output-oriented DEA
LP problem.
(vi) Repeat steps (ii)–(v) B times to obtain $\{\hat{\varphi}_{j,b}^*;\ b = 1, 2, \ldots, B\}$ for each DMU
j, j = 1, 2, . . . , N.
(vii) Calculate the average of the bootstrap estimates of φ for each unit j,
$\bar{\varphi}_j^* = \frac{1}{B} \sum_{b=1}^{B} \hat{\varphi}_{j,b}^*$. The bias corrected estimate of $\varphi_j$ is $\varphi_j^{bc} = 2\hat{\varphi}_j - \bar{\varphi}_j^*$. Use the
empirical distribution function of $\hat{\varphi}_{j,b}^*$ to construct the relevant bootstrap
confidence intervals for $\varphi_j$ (see Simar and Wilson [35] and Sickles and
Zelenyuk [30], Sect. 9.3.1, for a detailed discussion).
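
The resampling steps of the algorithm above (the bandwidth, steps (ii) to (iv), and the bias correction in step (vii)) can be sketched as follows. Step (v), re-solving the DEA problem on the pseudo-data, is omitted. The function names are hypothetical, the noise term in (38) is assumed here to be a standard normal draw, and the quartiles are computed with Python's default quantile method, which is one of several conventions.

```python
import random
import statistics


def silverman_bandwidth(phi_hat):
    """h = 0.9 * A * N**(-1/5) with A = min(s.d., IQR / 1.34).

    Note that A (and hence h) is zero when the interquartile range is zero,
    for example when most units are efficient with phi equal to 1.
    """
    n = len(phi_hat)
    sd = statistics.stdev(phi_hat)
    q1, _, q3 = statistics.quantiles(phi_hat, n=4)
    return 0.9 * min(sd, (q3 - q1) / 1.34) * n ** (-1 / 5)


def smoothed_bootstrap_sample(phi_hat, h, rng):
    """Steps (ii) and (iii): naive resampling followed by reflected smoothing."""
    naive = [rng.choice(phi_hat) for _ in phi_hat]
    smoothed = []
    for phi_star in naive:
        value = phi_star + h * rng.gauss(0.0, 1.0)  # assumed N(0, 1) noise
        # Reflect around 1 so that every smoothed value is at least 1, as in (38).
        smoothed.append(value if value >= 1.0 else 2.0 - value)
    return smoothed


def pseudo_outputs(y, phi_hat, smoothed):
    """Step (iv): y*_j = y_j * phi_hat_j / phi**_j."""
    return [yj * ph / ps for yj, ph, ps in zip(y, phi_hat, smoothed)]


def bias_corrected(phi_hat_j, boot_estimates_j):
    """Step (vii): 2 * phi_hat_j minus the mean of the bootstrap estimates."""
    return 2.0 * phi_hat_j - statistics.fmean(boot_estimates_j)


# Hypothetical DEA estimates and outputs for six units.
rng = random.Random(0)
phi_hat = [1.0, 1.0, 1.1, 1.25, 1.4, 1.6]
y = [10.0, 12.0, 9.0, 8.0, 7.0, 5.0]
h = silverman_bandwidth(phi_hat)
smoothed = smoothed_bootstrap_sample(phi_hat, h, rng)
y_star = pseudo_outputs(y, phi_hat, smoothed)
```

In a full implementation, each pseudo-data set would be passed to the DEA solver and the loop repeated B times before the bias correction and confidence intervals are computed.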

It should be noted that an interpretation of the results obtained from the bootstrap
procedure is not always clear. For example, in the bth replication using the pseudo-
data consisting of the actual input bundles coupled with the fictitious output levels
of firms, the optimal solution φ ∗ shows the scalar expansion factor for the fictitious
output quantity and its inverse is not a measure of the efficiency of the actual input
output bundle. It is possible that the actual input-output bundle may lie above the
production frontier constructed from the pseudo-data obtained in any one bootstrap
sample. One may, of course, use the optimal solutions from the (bootstrap) DEA
problems to construct measures of the frontier output level producible from the fixed
input bundle of a firm. Thus, it is more meaningful to construct a 95% confidence
interval of the maximum output with lower and upper bounds $(y_L^*, y_U^*)$. In principle,
the upper bound $y_U^*$ may be used to derive a probabilistic measure of the technical
efficiency of an observed input-output bundle. It is still possible that the actually
observed output from a given input bundle may exceed its corresponding upper
bound.

Nondiscretionary Inputs

Variable inputs are varied while fixed inputs are held constant (even as the level of
production changes) at the discretion of the firm (see also the section on “contextual
variables” in the chapter on DEA by Ray in volume 1 of this Handbook). There are,
however, other factors that affect the output produced from the inputs chosen by
the firm but are beyond its control. For a simple example, consider irrigation in
farming. The farmer can increase the level of irrigation by choosing a bigger pump
but may want to wait until its existing pump is sufficiently depreciated. Rainfall is an
alternative source of water for crops. However, the farmer cannot alter the amount
of rainfall at his discretion. These are treated as nondiscretionary inputs. One can
think of the discretionary inputs as directly productive and the nondiscretionary
inputs as facilitating production. (This description was suggested to me by Subal
Kumbhakar.)
As noted earlier, the production possibility set consists of all feasible input-
output bundles and is thus defined in the input-output space. However, production
takes place in a specific physical, social, and cultural environment. Differences in
environmental conditions can play a decisive role in defining the feasibility of a
particular input-output bundle. In measuring the efficiency of a decision-making
unit, we assume that it can choose the input bundle it uses, or the output bundle it
produces. Unlike inputs or outputs, the environmental factors cannot be chosen by
the firm and must be treated as “nondiscretionary.”
An obvious example of an environmental factor is rainfall in the context of
agricultural production. The maximum output producible from a given bundle of
inputs (say labor, fertilizer, and land) depends on the amount of rainfall. In that
sense, rainfall contributes to the output much the same way as irrigation. However,
while the farmer can choose the level of irrigation, the amount of rainfall is not
within his control. Here, rainfall acts as a nondiscretionary input. Note that in
defining the feasible set for a DEA LP problem, one has to include a constraint for
the amount of rainfall. However, when the radial input-oriented technical efficiency
is to be measured, the proportional scaling factor applies only to the discretionary
inputs (like labor, fertilizer, and land) but not to rainfall.
For another example, consider a secondary school where the average perfor-
mance of its pupils in a standardized test in mathematics is one of the outputs
and hours of classroom instruction in math is one of the inputs. An increase in
this input is expected to improve the average test score in math. Now consider
another variable – the median family income of the town where the school is located.
There is ample evidence to conclude that students from more affluent families
where the parents are professionals are better motivated and spend more time on
homework and perform better in tests. In that sense, the economic status of the
pupil acts like class time spent on math. However, the latter is an input while
the former is a contextual variable. Contextual variables are also referred to as nondiscretionary
inputs. Another example of a contextual variable is the marital status of parents
of a pupil. A child from a single parent family (irrespective of income) is unlikely
to get the same level of parental attention as when both parents are present. Thus, an
increase in the proportion of pupils from single parent families will lower the average
math score even when all other variables are unchanged. Two things emerge out of
this illustrative example. First, unlike an input, a contextual variable may be either
favorable (like family income) or detrimental (like single parent families). Second,
a decision maker at an appropriate level of authority (the school superintendent or
the Board of Education) can select the input bundle used. This is not true for the
contextual variables.
For yet another example, consider the efficiency of a water utility. The outputs
are the number of customers served and the gallons of water distributed. The inputs
are pumps, length of pipe lines, and hours of labor. Note that in an urban area
the higher density of population implies that the same number of customers can
be served and the same volume of water dispensed with a smaller network of
pipelines than what is required in a rural area. Moreover, when many customers
are located in the same building (as is the case in an urban community), the labor
hours needed for meter reading will be lower than in a rural area where customers
are located at distant points. In this case, density of population is an environmental
variable.
Where contextual variables are considered to be significant determinants of
performance, an appropriate way to conceptualize the production technology is to
define the production possibility set conditional on a specific vector of contextual
variables z0 :

\[ T(z^0) = \{(x, y) : y \text{ can be produced from } x \text{ given the contextual variables } z^0\}. \tag{39} \]

One then has a family of conditional production possibility sets. Efficiency is still
evaluated at the inputs used and outputs produced. But the appropriate benchmark
bundle depends on the applicable vector of contextual variables. The disposability
and convexity assumptions about the technology apply to the input-output set but
are not necessarily extended to the contextual variables z.
Consider the revised transformation function

F (x, y; z) = k. (40)

The conditional production possibility set would then be:

\[ T(z^0) = \{(x, y) : F(x, y; z^0) \le 0\}. \tag{41} \]

There are two different ways to formulate the DEA problem depending on how
the revised transformation function is conceptualized. One possibility is to consider
the function multiplicatively separable between (x, y) and z as

F (x, y, z) = a (x, y) .b(z) = k. (42)


In this case, any change in the contextual variables causes a neutral shift in
the production frontier without altering the marginal rates of substitution between
inputs or marginal rates of transformation between outputs. Such multiplicative
separability permits a 2-stage analysis where a conventional DEA model is solved
using only the inputs and outputs in the first stage followed by a regression of the
DEA efficiency scores on the contextual variables in the second stage. It is important
to recognize that this 2-stage analysis is permissible only if the environmental
variables (the regressors) in the second stage are uncorrelated with the inputs and
outputs in the first stage DEA. This second stage regression analysis was proposed
by Ray [22,23] and is quite widely used in empirical applications to explain
variations in measured levels of DEA efficiency.
The other option is to include the contextual variables along with the inputs and
outputs in a comprehensive DEA problem. In this case, favorable environmental
factors are treated like inputs because they contribute positively towards outputs
while unfavorable environmental factors are treated like outputs because they
demand additional inputs. However, even though appropriate inequality constraints
are included for these factors, the proportional scaling factors do not apply to them.
This one step analysis would be recommended when the contextual factors are
correlated with some inputs. In the water utility example, the population density in
the area served affects the specific inputs like length of network or labor but not other
inputs like the number of pumps. In the school example on the other hand, there is no
reason to believe that socioeconomic factors (either good or bad) would be related
specifically to some inputs and considering a neutral shift is more appropriate.
There are two important points to note about a one-step DEA. First, the analyst
must make a prior judgment about whether any relevant contextual variable is
favorable or unfavorable because that would determine the nature of the inequality
in the relevant constraint. Second, and more importantly, the convexity assumption
about the input-output bundles may not be applicable for some contextual variables.
This is particularly true for categorical variables. Suppose that in our water utility
example, the service areas are classified as rural, urban, and metropolitan but
the exact measure of population density is not available for each observation. In
this case, all we know is that water delivery is most difficult in the rural areas
and the least difficult in the metropolitan areas. Creating convex combinations of
a categorical variable representing population density is not meaningful in this
context. Following Banker and Morey [5], one may handle this by treating the
conditional production possibility sets as nested in the sense that all input-output
bundles that are feasible in a less favorable condition are also feasible in a more
favorable condition but not the other way around. In this case, we include only the
rural observations to construct the frontier for evaluating utilities serving rural areas
but all observations to construct the frontier that is to be used for evaluating the
utilities serving the most densely populated areas. It should be noted though that
for multiple contextual variables that are categorical such cross-classification may
severely restrict the number of observations available for constructing the frontier
for the less favored groups.
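
The nesting of reference sets for a categorical contextual variable can be sketched as follows. The ranking of the three area types follows the water utility example above; the function name, the dictionary layout, and the sample units are hypothetical.

```python
# Higher rank means a more favorable environment for water delivery.
FAVORABILITY = {"rural": 0, "urban": 1, "metropolitan": 2}


def reference_set(units, category):
    """Return the observations allowed to form the frontier for `category`.

    A unit is compared only with units operating in the same or in a less
    favorable environment, so the reference sets are nested.
    """
    rank = FAVORABILITY[category]
    return [u for u in units if FAVORABILITY[u["area"]] <= rank]


# Hypothetical sample of utilities.
units = [
    {"name": "A", "area": "rural"},
    {"name": "B", "area": "urban"},
    {"name": "C", "area": "metropolitan"},
    {"name": "D", "area": "rural"},
]
rural_peers = reference_set(units, "rural")
metro_peers = reference_set(units, "metropolitan")
```

The DEA problem for each utility would then be solved using only the observations in its own reference set, which makes the shrinking sample for the least favored group easy to see.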

Second Stage Regression

Ray (1988, [24]) conceptualized a 1-output production function of the form

\[ y = g(x) \cdot h(z) e^{-\eta};\ \eta \ge 0;\ 1 \ge h(z) \ge 0 \tag{43} \]

and argued that the DEA radial output-oriented efficiency score $\tau_y = 1/\varphi^*$ is a
measure of $y / g(x) = h(z) e^{-\eta}$ when g(x) is additive. (Unless x and z are correlated, and
if g(x) is additive, there should not be any bias. However, the DEA efficiency
measure includes effects of differences in z (like soil type, public infrastructure,
etc.). The second stage regression is designed to appropriately shift the frontier
downwards relative to the best possible environment. In the absence of CRS, the
additivity property of g(x) does not hold and there will be a bias due to convexity.)

Assume further that $\ln h(z) = z'\beta$. This leads to the second stage regression

\[ \ln \tau_y = z'\beta + u;\ u \le 0. \tag{44} \]

One can estimate this regression by OLS to get consistent estimators of the
slope parameters. However, by construction, some of the fitted residuals $\hat{u}$ will
be positive. To overcome this problem, one can adjust the intercept upwards by
the largest positive residual so that the so-called “Greene corrected” residuals
$u_j^* = \hat{u}_j - \hat{u}_{\max}$ are all nonpositive. Ray suggests using $e^{u_j^*} \le 1$ to measure
efficiency corrected for the contextual variables.
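
A minimal sketch of this correction for a single contextual variable, assuming the second stage regression is fitted with an intercept. The function name and the data are hypothetical.

```python
import math


def second_stage_ols(log_tau, z):
    """OLS of ln(tau) on one contextual variable with a shifted intercept.

    Returns the slope, the intercept after the upward shift by the largest
    residual, and the corrected efficiency measures exp(u*_j), all at most 1.
    """
    n = len(z)
    z_bar = sum(z) / n
    t_bar = sum(log_tau) / n
    sxx = sum((zi - z_bar) ** 2 for zi in z)
    sxy = sum((zi - z_bar) * (ti - t_bar) for zi, ti in zip(z, log_tau))
    beta = sxy / sxx
    alpha = t_bar - beta * z_bar
    residuals = [ti - alpha - beta * zi for zi, ti in zip(z, log_tau)]
    u_max = max(residuals)
    # Shifted residuals u*_j = u_hat_j - u_hat_max are all nonpositive.
    u_star = [r - u_max for r in residuals]
    efficiency = [math.exp(u) for u in u_star]
    return beta, alpha + u_max, efficiency


# Hypothetical log efficiency scores and one contextual variable.
z = [0.0, 1.0, 2.0, 3.0]
log_tau = [-0.10, -0.35, -0.45, -0.80]
beta, shifted_intercept, efficiency = second_stage_ols(log_tau, z)
```

By construction, the unit with the largest OLS residual receives a corrected efficiency of exactly 1, and every other unit is measured relative to it.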
This section is not designed to be a comprehensive review of different approaches
to measuring the impact of contextual variables on the measured efficiency through
a second stage regression. The interested reader may refer to Chaps. 9 and 10 of
Sickles and Zelenyuk [30].
Two specific problems should be highlighted. First, when g(x) is strictly concave,
the DEA efficiency measures will be biased upwards. (Because in that case,
$\sum_j \lambda_j g(x^j) < g\big(\sum_j \lambda_j x^j\big)$ when $\sum_j \lambda_j = 1$.) Also, the “fitted value” $\hat{h}(z) = e^{z'\hat{\beta}}$
may exceed 1 for some observations. Moreover, as noted above, the second stage
OLS regression will not have the standard statistical properties of the classical linear
regression.
Banker and Natarajan [6] and Banker, Natarajan, and Zhang [8] frame the second
stage regression model differently. First, in the spirit of the parametric stochastic
frontier analysis, they include a two-sided random noise along with the one-sided
efficiency. Thus, their model is

\[ y = g(x) \cdot h(z) e^{v - u};\ 1 \ge h(z) \ge 0 \tag{45} \]

Further, while u ≥ 0, $v^M \ge v \ge -v^M$. Thus, the random noise, although two-
sided, is bounded both from above and below. Also, they specify $h(z) = e^{-z'\beta}$ and
allow only non-negative values of both the coefficient vector β and the contextual variables
z. These non-negativity restrictions automatically ensure 1 ≥ h(z) ≥ 0. Thus, their
model becomes

\[ \ln \frac{y}{g(x)} = v - u - z'\beta \tag{46} \]

For the simplest case of a single contextual variable, the regression with an
intercept becomes

\[ \ln \frac{y}{g(x)} = v - u - \beta_1 z \tag{47} \]

Define $\varepsilon = v - u$ and $\tilde{\varepsilon} = \varepsilon - E(\varepsilon)$. Then the regression can be written as

\[ \ln \frac{y}{g(x)} = \beta_0 - \beta_1 z + \tilde{\varepsilon}; \quad \text{where } \beta_0 = E(\varepsilon) - v^M. \tag{48} \]

One can use the log of the DEA score $\tau_y = 1/\varphi^*$ as the dependent variable and
estimate the second stage regression by OLS.
While the OLS regression leads to consistent estimators of the coefficients of the
contextual variables, the pure inefficiency measures (unrelated to these variables)
cannot be derived directly. Banker and Natarajan [6] also consider maximum
likelihood estimation of the second stage regression and derive conditional mean
and mode of the distribution of the one-sided term (u) (comparable to the formulas
in Jondrow, Lovell, Materov, and Schmidt [14]) but advise using the OLS residuals
to rank order the units in terms of efficiency.

Truncated Regression in the Second Stage

Simar and Wilson [37] extend their earlier models of smoothed bootstrap of DEA
efficiency scores to incorporate contextual variables in a second stage regression.
However, they object to using either OLS or a Tobit model (i.e., censored regression)
and instead propose a truncated regression. Using the optimal value of the DEA
output-oriented LP problem, they consider a second stage model

\[ \varphi_i^* = z_i\beta + u_i \ge 1 \Rightarrow u_i \ge 1 - z_i\beta. \tag{49} \]

One can specify a normal distribution $N(0, \sigma_u^2)$ for u and use MLE for truncated
regression to estimate the model parameters $(\tilde{\beta}, \tilde{\sigma}_u^2)$ using only those observations
for which $\varphi^*$ is strictly greater than 1. They advise bootstrapping by drawing pseudo
values ($\tilde{u}_{ib}$) of the residuals from a Normal distribution with variance $\tilde{\sigma}_u^2$ truncated
at $1 - z_i\tilde{\beta}$ and constructing a new bootstrap sample $\varphi_{ib}^* = z_i\tilde{\beta} + \tilde{u}_{ib}$ for truncated
regression.
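
The drawing of truncated pseudo residuals can be sketched with the inverse-CDF method. The function names, the fitted values, and the value of sigma below are hypothetical, and the truncated regression itself (the MLE step) is not shown.

```python
import random
from statistics import NormalDist


def draw_truncated_residual(lower, sigma, rng):
    """Draw u from N(0, sigma^2) truncated from below at `lower`."""
    dist = NormalDist(0.0, sigma)
    p_low = dist.cdf(lower)
    # Map a uniform draw into the probability mass above the truncation point.
    p = p_low + rng.random() * (1.0 - p_low)
    # Keep the argument of inv_cdf strictly inside (0, 1).
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return max(lower, dist.inv_cdf(p))


def bootstrap_phi(z_beta, sigma, rng):
    """Construct phi*_ib = z_i * beta + u_ib with u_ib truncated at 1 - z_i * beta."""
    return [zb + draw_truncated_residual(1.0 - zb, sigma, rng) for zb in z_beta]


# Hypothetical fitted values z_i * beta and estimated sigma.
rng = random.Random(42)
z_beta = [1.05, 1.30, 0.90, 1.60]
phi_boot = bootstrap_phi(z_beta, sigma=0.2, rng=rng)
```

Each bootstrap sample built this way respects the lower bound of 1 on the expansion factors (up to floating-point rounding) and would be fed back into the truncated regression to obtain the bootstrap distribution of the coefficients.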

In the computational procedure described above, bootstrapping is applied only

in the second stage but the DEA efficiency scores are computed only once with the
actual data for the initial estimation of the truncated regression. Simar and Wilson
[37] also proposed an alternative and more elaborate computational algorithm that
can be described as “double bootstrap” where the DEA results are also bootstrapped.
(For details of the steps involved, see Simar and Wilson [37].)

Contextual Variables in Parametric Models

In stochastic frontier analysis, a comprehensive maximum likelihood estimation is

applied to estimate the production function along with the “efficiency function”
nested into it. Typically, the specified model is

\[ \ln y = \beta' \ln x + v - u \tag{50} \]

where the random noise v is distributed as $N(0, \sigma_v^2)$ while the inefficiency
component u follows some one-sided distribution, either naturally non-negative (e.g., the
exponential or the gamma distribution) or a two-sided distribution truncated from
below at 0. A popular choice for u is the truncated Normal distribution $|N(\mu_u, \sigma_u^2)|$.
To incorporate the contextual variables, one expresses the mean of the pretruncated
distribution as $\mu_u = \alpha' z$. All parameters of the model $(\beta, \alpha, \sigma_v^2, \sigma_u^2)$ are estimated
by maximizing the likelihood function. Estimates of efficiency at individual data
points are obtained by the mean (or mode) of the one-sided component conditional
upon the composite disturbance (see the chapters on SFA by Kumbhakar, Parmeter,
and Zelenyuk in volume 1 of this Handbook for details). One can also make σu2 a
function of the z variables. Wang [42] considers a framework where z affects both
the mean and variance.

Choice Between Inputs and Contextual Variables

Except in the obvious cases like rainfall, classification of factors affecting output
as inputs or contextual variables is not quite straightforward. In principle, an
input is something that the producer can choose. For example, the proportion of
students from non-English language speaking families has an impact on the average
test score of a school district. But the school superintendent cannot change this
socio-demographic characteristic of its student population. In that sense, it is an
exogenously determined (nondiscretionary) variable. However, the town’s Board of
Education may adopt a policy of encouraging enrollment from such families. In that
case, it becomes an input (like the number of teachers per student).
In the end, classification of variables as inputs or contextual is a judgment call
for the analyst. In general, however, the greater the scope of decision-making, the
fewer are the variables deemed contextual.
26 Choice of Inputs and Outputs for Production Analysis 1109

Input-Output Choice in Some Areas of Application

Manufacturing

A very popular subject of efficiency and productivity analysis is manufacturing (see
also the chapter by Bhaumik in this volume).

Output
In this sector, output can be defined in alternative ways. The three main choices are:

(i) Value added
For value added, the only two inputs included should be Labor and Capital.
(ii) Sales/shipments
Sales or shipments do not reflect total production and should be adjusted for
changes in inventories.
(iii) Gross output
Gross output should be used only when intermediate inputs (materials and
energy) are included.

Inputs: The inputs included are labor, capital, energy, and materials. (Recently,
purchased services have been included as yet another input in the so-called KLEMS
(Capital-Labor-Energy-Materials-Services) data sets constructed at the 2-digit
industry level for different countries. But there is hardly any KLEMS analysis
at the firm level.)

(i) Labor:
Typically, one distinguishes between production workers and nonproduction
workers.
Because production workers are often scheduled for overtime during peak
periods and fewer hours during slack times, a more accurate measure of this
labor input is the number of hours worked.
Nonproduction workers are usually full-time employees. Nonproduction
labor is measured by the number of employees.
(ii) Capital:
Capital is, by far, the most difficult input to measure. Apart from the fact that
the age and vintage differences of machinery make it hard to get an aggregate
measure of the capital input, there is also the question of stock vs. flow.
Usually, one can get the data only on the value of gross fixed assets. This
can be appropriately deflated by a price index of machinery and transport
equipment to get a real value of the capital stock. This can be used as a measure
of the capital input.
One major shortcoming of this approach is that it fails to take account
of differences in capacity utilization across firms and/or over years. The true
measure of the capital input is the actual flow of capital services that contributes
to production of the output during the production period.

A different approach is to construct a measure of the capital flow input
indirectly using the “user cost approach.” The total user cost associated with
the capital flow input can be measured by the sum of interest and amortization,
depreciation, rent, and repair and maintenance expenses. Next one needs to
construct a price of capital services. One can start with the wholesale price
index of transport and machinery and compute the interest cost using an
appropriate interest rate. For example, suppose that the machinery price index
in a particular year is 250 and the interest rate is 5%. Next assume that the relevant
depreciation rate is 7% and repairs and maintenance add another 3%. Hence,
the user cost is 15% of 250, or 37.50 per year of capital service of one machine. If the
total user cost of capital for a firm is 7500, we can measure the flow of capital
service as $7500/37.50 = 200$ machine years.
A note of caution here is that one should never use new capital formation
to measure capital input. New capital formation is investment and neither the
stock of capital nor the flow of capital services.
(iii) Energy:
In general, information on energy input is available mainly in terms of the
expenditure on energy. The principal sources of energy are (a) coal, (b) oil,
(c) electricity, and (d) natural gas. One needs to construct a price of energy as
a weighted average of these four different energy inputs. The problem is that
these fuels are measured in different units. For example, coal is measured in
tons while electricity in kilowatt hours. One must, first, reduce these units of
measurement to British Thermal Units (BTUs). Thus, price per ton of coal is to
be transformed into the price per BTU of energy from coal. Similarly, for other
fuels. Next a price of energy per BTU is constructed as the weighted average
of these energy source specific prices using the expenditure share of industrial
consumption of energy in the geographical region (e.g., the states of the USA)
where the firm is located. Finally, a measure of the quantity of energy consumed
(in BTU) by the firm is obtainable by the ratio of the expenditure on fuels and
this price per BTU of energy.
(iv) Materials:
The materials input in most cases includes too many different items and even
though the total expenditure on materials may be available, it is difficult to
construct a price and deflate the expenditure to get a measure of the quantity
of materials consumed. The only reasonable way to measure materials is by
the value (effectively setting the price per unit equal to 1). Of course, for
intertemporal analysis, one can use the price index of industrial raw materials to
deflate the nominal amount of expenditure on materials.
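The capital and energy calculations in items (ii) and (iii) reduce to simple arithmetic, sketched below. The capital figures are the ones used in the text; the fuel prices, heat contents, and shares in the usage note are made-up placeholders rather than real conversion factors, and the function names are illustrative.

```python
def capital_service_flow(total_user_cost, price_index, interest,
                         depreciation, repairs):
    """Flow of capital services in machine years.

    The unit user cost is the machinery price index multiplied by the sum
    of the interest, depreciation, and repair-and-maintenance rates.
    """
    unit_user_cost = price_index * (interest + depreciation + repairs)
    return total_user_cost / unit_user_cost


def energy_quantity_btu(expenditure, price_per_unit, btu_per_unit, shares):
    """Quantity of energy (in BTU) implied by total expenditure on fuels.

    Each fuel price is first converted to a price per BTU; the prices are
    then averaged using regional expenditure shares as weights. All three
    dictionaries are keyed by fuel name.
    """
    price_per_btu = sum(
        shares[fuel] * price_per_unit[fuel] / btu_per_unit[fuel]
        for fuel in shares
    )
    return expenditure / price_per_btu


# Worked example from the text: 15% of 250 is 37.50, and 7500 / 37.50
# gives about 200 machine years.
flow = capital_service_flow(7500, 250, 0.05, 0.07, 0.03)
```

With placeholder inputs such as a coal price of 100 per ton at 20 BTU units per ton, an electricity price of 2 per kilowatt hour at 0.5 BTU units per kilowatt hour, and equal expenditure shares, the weighted price is 4.5 per BTU unit, so an expenditure of 900 corresponds to 200 BTU units.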

Banking

Although an overwhelming share of empirical applications relates to measurement
of efficiency in banking, there is no consensus on what constitutes the inputs and
outputs of a bank. (For a detailed analysis of the banking industry see the chapter
by Miller in this volume.)
There are two competing views about the technological character of a bank
as a production decision-making unit. Berger and Humphrey [9] describe the two
alternative views as the production approach and the intermediation approach.
In the production approach, a bank is visualized as a provider of different kinds
of services to deposit and credit account holders by performing transactions and
processing documents like loan applications, credit reports, demand drafts, and so
on. In this view of a bank, labor and physical capital (like buildings and equipment)
are the inputs and the numbers of deposit and credit transactions and documents
processed are the outputs. Because information on the number of deposit and credit
transactions performed or the number of documents processed is not available,
researchers use either the numbers of deposit and credit accounts or the monetary
values of the balances in these accounts.
In the intermediation approach (also known as the Asset Approach), a bank is
seen to be a financial intermediary between savers with funds to lend and investors
looking for funds to borrow. The principal act of value addition by a bank is to
transform the deposits and other borrowed funds into loans and other investments
that yield returns. As a financial intermediary, a bank uses labor and capital along
with funds accepted from depositors (used as raw materials) to create income
generating assets like loans and investments.
In the intermediation approach based primarily on the model of the short run
optimization problem of a financial firm (like a bank) developed by Sealey and
Lindley [29], loans and investments are the outputs while labor, deposits, and
(physical) capital are the inputs of the bank.
A choice between the two different approaches is not merely a matter of
interpretation of how a bank works but has important implications for selection of
inputs and outputs. As Berger and Humphrey [9] argue, the production approach is
more appropriate for evaluating performance at the branch level because branches
primarily process accounts for the bank as a whole and branch managers have little
control over bank funding and investment decisions. For the bank as the unit of
evaluation, the intermediation approach is more appropriate because it explicitly
includes interest expenses, which account for a major part of the total expenses of a
bank.
There are numerous studies, however, at the bank level which treat deposits as
output. It is often argued that the bank as a financial firm does offer valuable service
to the depositors by keeping their savings secure and deposits, therefore, should be
treated as output. But these funds in deposit accounts generate no revenue to the
banks unless transformed into credit given out to borrowers.
Sealey and Lindley [29] argue that the deposited funds are themselves inputs
that are turned into “loanable funds” as intermediate output which is ultimately
transformed into revenue generating assets.
Another argument often voiced against considering deposits as input is that it
would imply that a bank becomes more efficient if it reduces deposits because
economizing on the use of inputs enhances efficiency. But this line of argument is
somewhat misleading in that it does not tell the whole story. One must recognize that
acquiring more deposits cannot be an end in itself for the bank and if the funds in the
deposit accounts are held idle (entailing interest expenses) without being processed
into loans and investments generating revenue, an increase in deposits would indeed
make the bank less efficient!
A variable that has received little attention in the production efficiency literature
on banking is the amount of equity capital of a bank. Very few studies have included
it in the list of inputs and outputs. Berger and Mester [10] argue that within the
intermediation or asset approach, the equity capital of the bank along with other
liability items like core deposits and purchased funds are sources of funds and
should be treated as inputs that enable a bank to produce the revenue generating
assets. However, unlike deposits and other funds, the equity capital cannot be
changed in the short run. Hence, it ought to be treated as a fixed input. Ray and
Das [27] in their study of short run cost and profit efficiency of Indian commercial
banks included a bank’s capital and reserves as a fixed input.
Finally, there is no consensus about how to treat the number of branches of a bank
in the input-output classification. On one hand, an increase in the number of branches
amounts to easier access for retail customers of the bank and can be interpreted as
better service quality. In that sense, a branch is an output of the bank. On the other
hand, a branch generates more business and can, therefore, be treated as an input.
In fact, like deposits, the number of branches has been treated as an output in some
studies and as an input in others.
An ingenious way to treat branches in the production technology of a bank is
to include it as a conditioning variable in a cost function with unequivocal output
quantities and input prices as regressors. If the coefficient of the number of branches
is negative, then it can be treated as an input because a negative marginal effect
implies that increase in this “input” permits reduction in the use of other inputs
resulting in a lower cost without reducing the outputs. If, however, the marginal
effect is positive, branches should be treated as output because in that case, with
more branches, the cost of producing the unchanged quantities of the other outputs
becomes higher.
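The sign test described above can be written down compactly. The log-linear functional form below and the symbols $C$ (cost), $y_r$ (output quantities), $w_k$ (input prices), $B$ (number of branches), and $\gamma$ are illustrative assumptions, not a specification taken from the chapter:

```latex
\ln C = a_0 + \sum_r a_r \ln y_r + \sum_k \delta_k \ln w_k + \gamma \ln B + \varepsilon,
\qquad
\frac{\partial \ln C}{\partial \ln B} = \gamma
\begin{cases}
< 0 & \Rightarrow \text{branches behave like an input},\\
> 0 & \Rightarrow \text{branches behave like an output}.
\end{cases}
```

Under this sketch the classification rests on the estimated sign of $\gamma$ alone, so in practice one would also want to check whether the estimate is statistically different from zero before drawing a conclusion.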

Health Care

In view of the fast-increasing cost of health care all over the world, it is not
surprising that efficiency in delivery of health care has attracted considerable
attention from both academic researchers and policy makers, and DEA has been
extensively used to evaluate the efficiency of hospitals, nursing homes, and other
providers of health care.
Conceptually the output of health care is the extent of improvement in health
status or the physical quality of life of a patient. Just as, in the case of education,
the cognitive skills acquired by a pupil can only be indirectly (and imperfectly)
measured by standardized test scores, the output of a provider of health care must
be measured indirectly by a number of observable and quantifiable indicators.
A hospital as the unit of analysis is regarded as a production decision-making
unit combining different kinds of labor and capital inputs to provide treatment to
different kinds of patients.

Inputs
Three broad kinds of labor considered as separate inputs are numbers of physicians,
nurses and paramedical staff, and other workers. Where detailed data are available,
a further distinction is made between surgeons and general practitioners.
Measuring the capital input is far more problematic. While hi-tech equipment
accounts for the bulk of the physical capital input of a hospital, lack of data makes it
difficult to include it in a measure of the capital input used.
Similarly, the size of the hospital building and the number of intensive care
units available are measures of capital input that should be included if the data are
available.
In most cases, the number of beds is used as a “catch all” measure of the capital
input. Basically, the number of beds is regarded as a measure of the size of a
hospital and the implicit assumption is that other capital inputs are proportional
to the number of beds.

Outputs
Outputs are measured by the numbers of inpatient days and outpatient cases treated.
In some cases, inpatient care is further broken up into emergency, maternity, and
other kinds of treatment. In the case of health care, the preexisting condition of the
patient at the time of admission is an important determinant of the resources needed
for treatment.
Therefore, adjusting for complexity and intradiagnostic severity of cases is
important while measuring output by the number of patients. In practice, however,
even if detailed information on severity and complexity of cases may be obtained
for one or two hospitals, it is unrealistic to expect that the analyst will have such
information for all hospitals in the sample. A minimal adjustment for complexity
may be made by grouping patients as seniors (needing geriatric care), children
(receiving pediatric care), and other adults (requiring general care).
Another important aspect of health care that should be taken into consideration
in measuring efficiency is the quality of care provided. A provider may be able to
treat a larger number of patients from the same bundle of labor and capital inputs if
the quality of care is lower.
Few studies of health care efficiency have taken explicit account of quality.
Sometimes an index of quality is constructed from patient satisfaction surveys. But
constructing an objective measure of quality of care is difficult.
Intuitively, a better quality of care restores the patient to normal health sooner
and keeps the person healthy for a longer time period in the absence of any new and
unrelated ailment.
In the extreme case, negligence in critical situations may result in death of the
patient. In some studies, the incidence of patient mortality is treated as a measure of
the lack of quality. In other words, lower mortality rate ceteris paribus implies better
quality of care.
Another variable used to measure quality is the proportion of unscheduled
readmissions. Here the presumption is that if a patient needs to be readmitted
without a scheduled follow-up, the patient was given less than proper care in the
hospital and was discharged even before treatment was complete.

Conclusion

Inputs are related to the outputs by the technology. The role of any empirically
constructed production or transformation function is to quantify this relationship.
The technology can also be formulated through cost, revenue, and profit functions.
But these require additional assumptions about optimizing behavior by the producer.
In principle, the production technology captures a physical relationship between
inputs and outputs. In that sense, identical input bundles should produce identical
outputs unless there are differences in the contextual variables – systematic or
random. A second stage regression in DEA or incorporating such variables in the
statistical distribution of the efficiency component of the error term in stochastic
frontier analysis allows one to extract the impact of these contextual variables from
the measured efficiency.
Finally, appropriate choice of inputs and outputs is a precondition of any
meaningful analysis of the production technology. Whether one wishes to evaluate
productive performance through technical efficiency or simply wants to examine
returns to scale properties, homotheticity, sub-/superadditivity, or any other
characteristic of the technology, improper selection of inputs and outputs will render
any empirical analysis of production meaningless. This chapter ends with a note
of caution. In some cases (especially among OR analysts), there is a perception that
anything that the decision makers want to reduce is an input while anything that they
want to increase is an output. An extreme example is one where, in an international
comparison of efficiency in emission control, fossil fuel is treated as an input (because
the author believes that nations should use less of it) while nonfossil fuel is treated as
an output (because the author considers it desirable to use more of it). One can avoid
such ill-conceived model formulation by paying attention to the underlying production
economic theory (see the DEA example in Sect. 6.2.5 in Ramanathan [21]).

Cross-References

Bad Outputs
Data Envelopment Analysis: A Nonparametric Method of Production Analysis

References
1. Adler N, Golany B (2001) Evaluation of deregulated airline networks using data envelopment
analysis with principal component analysis with an application to Western Europe. Eur J Oper
Res 132:260–273
2. Adler N, Golany B (2002) Including principal component weights to improve discrimination
in data envelopment analysis. J Oper Res Soc 53:985–991
3. Aigner DJ, Chu SF (1968) On estimating the industry production function. Am Econ Rev
58(4):826–839
4. Banker RD (1993) Maximum likelihood, consistency, and data envelopment analysis: a
statistical foundation. Manag Sci 39:1265–1273
5. Banker RD, Morey RC (1986) The use of categorical variables in data envelopment analysis.
Manag Sci 32(12):1613–1627
6. Banker RD, Natarajan R (2008) Evaluating contextual variables affecting productivity using
data envelopment analysis. Oper Res 56(1):48–58
7. Banker RD, Chang H, Natarajan R (2007) Estimating DEA technical and allocative inefficiency
using aggregate cost or revenue data. J Prod Anal 27:115–121
8. Banker RD, Natarajan R, Zhang D (2019) Two-stage estimation of contextual variables in
stochastic production function models using data envelopment analysis: second stage OLS
versus bootstrap approaches. Eur J Oper Res 278:368–384
9. Berger AN, Humphrey DB (1997) Efficiency of financial institutions: international survey and
directions for future research. Eur J Oper Res 98:175–212
10. Berger AN, Mester LJ (1997) Inside the black box: what explains differences in the efficiencies
of financial institutions? J Bank Financ 21:895–967
11. Bhaumik S (2022, forthcoming) Technical efficiency and its determinants in the manufacturing
sector: what we know and what we should know. In: Ray SC, Chambers RG, Kumbhakar SC (eds)
Handbook of production economics, vol 2. Springer Nature
12. Färe R, Zelenyuk V (2002) Input aggregation and technical efficiency. Appl Econ Lett 9:
635–636
13. Färe R, Grosskopf S, Zelenyuk V (2004) Aggregation bias and its bounds in measuring
technical efficiency. Appl Econ Lett 11:657–660
14. Jondrow J, Materov IS, Lovell CAK, Schmidt P (1982) On the estimation of technical
inefficiency in the stochastic frontier production function model. J Econ 19(2–3):233–238
15. Kumbhakar SC, Parmeter C, Zelenyuk V (2022, forthcoming) Stochastic frontier analysis:
foundations and advances I. In: Ray SC, Chambers RG, Kumbhakar SC (eds) Handbook of
production economics, vol 1. Springer Nature
16. Leibenstein H, Maital S (1992) Empirical estimation and partitioning of X-inefficiency: a data-
envelopment approach. Am Econ Rev 82(2):428–433
17. Marx K, Engels F (1938) The German ideology (trans: Looch W, McGill CP). Lawrence and
Wishart, London
18. Miller S (2022, forthcoming) Empirical analysis of production economics: applications to
banking. In: Ray SC, Chambers RG, Kumbhakar SC (eds) Handbook of production economics,
vol 2. Springer Nature
19. Murty S, Russell RR (2022, forthcoming) Bad outputs in production economics. In: Ray SC,
Chambers RG, Kumbhakar SC (eds) Handbook of production economics, vol 1. Springer Nature
20. Pastor JT, Ruiz JL, Sirvent I (2002) A statistical test for nested radial DEA models. Oper Res
50(4):728–735
21. Ramanathan R (2003) An introduction to data envelopment analysis. Sage
22. Ray SC (1988) Data envelopment analysis, non-discretionary inputs and efficiency: an
alternative interpretation. Socio-Economic Planning Sciences 22(4):167–176
23. Ray SC (1991) Resource-use efficiency in public schools: a study of Connecticut data. Manag
Sci 37(12):1620–1628
24. Ray S (2004) Data envelopment analysis: theory and techniques for economics and operations
research. Cambridge University Press, Cambridge, UK
25. Ray SC (2022, forthcoming) Data envelopment analysis: a nonparametric approach to
production analysis. In: Ray SC, Chambers RG, Kumbhakar SC (eds) Handbook of production
economics, vol 1. Springer Nature
26. Ray SC (2022, forthcoming) Conceptualization and measurement of productivity growth and
technical change. In: Ray SC, Chambers RG, Kumbhakar SC (eds) Handbook of production
economics, vol 1. Springer Nature
27. Ray SC, Das A (2010) Distribution of cost and profit efficiency: evidence from Indian banking.
Eur J Oper Res 201:297–307
28. Ray SC, Mukherjee K (2005) The validity of input aggregation in DEA models: some statistical
tests. University of Connecticut Economics working paper 2005-54
29. Sealey C, Lindley J (1977) Inputs, outputs, and a theory of production and cost at depository
financial institutions. J Financ 32:1251–1266
30. Sickles R, Zelenyuk V (2019) Measurement of productivity and efficiency. Cambridge
University Press, Cambridge
31. Silverman BW (1986) Density estimation for statistics and data analysis. Chapman and
Hall, London
32. Simar L (1992) Estimating efficiencies from frontier models with panel data: a comparison
of parametric, nonparametric, and semiparametric methods with bootstrapping. J Prod Anal
3(1):171–203
33. Simar L (1996) Aspects of statistical analysis in DEA-type frontier models. J Prod Anal
7:177–185
34. Simar L, Wilson PW (1998) Sensitivity analysis of efficiency scores: how to bootstrap in
nonparametric frontier models. Manag Sci 44:49–61
35. Simar L, Wilson PW (2000) Statistical inference in nonparametric frontier models: the state of
the art. J Prod Anal 13:49–78
36. Simar L, Wilson PW (2001) Testing restrictions in nonparametric efficiency models. Commun
Stat Simul Comput 30(1):159–184
37. Simar L, Wilson PW (2007) Estimation and inference in two-stage, semi-parametric models of
production processes. J Econ 136:31–64
38. Sirvent I, Ruiz JL, Borrás F, Pastor JT (2005) A Monte Carlo evaluation of several tests for the
selection of variables in DEA models. Int J Inf Technol Decis Mak 4(3):325–344
39. Tauer LW (2001) Input aggregation and computed technical efficiency. Appl Econ Lett 8:
295–297
40. Thomas AC, Tauer LW (1994) Linear input aggregation bias in nonparametric technical
efficiency measurement. Can J Agric Econ 42:77–86
41. Varian HR (1984) The nonparametric approach to production analysis. Econometrica
52(3):579–597
42. Wang HJ (2002) Heteroscedasticity and non-monotonic efficiency effects of a stochastic
frontier model. J Prod Anal 18:241–253
27 Airline Economics: A Survey of Applied Issues in the Performance of the US and International Airline Industry

Levent Kutlu, Daniel Prudencio, and Robin C. Sickles

Contents
Introduction
Mergers, Alliances, Vertical Integration and Collusion
  Mergers
  Collusive Behavior
  On Collaboration and Vertical Integration
Financial Struggle in the Airline Industry
Pricing and Differentiation by Heterogeneous Characteristics
Market Power and Price Premium
  Threat of Entry
  Market Power and Efficiency
Airline Efficiency
Economic Impact of Delays
Governance and Airport Efficiency
Deregulation
Conclusions
References

Prepared for Volume II of the Handbook of Production Economics, Robert Chambers, Subal
Kumbhakar, and Subhash Ray (eds.), New York: Springer, in progress
L. Kutlu
Department of Economics and Finance, University of Texas Rio Grande Valley, Edinburg,
TX, USA
D. Prudencio · R. C. Sickles
Department of Economics, Rice University, Houston, TX, USA

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_1
1118 L. Kutlu et al.

Abstract

In this survey, we consider applied issues in the productivity and efficiency of the
US and the international airline industry. We discuss the implications of mergers
and alliances, vertical integration, collusive behavior, bankruptcy, pricing and
differentiation by heterogeneous service and network characteristics, entry and
competition, airport governance, inefficiency, and deregulatory dynamics. Due to
legacies of regulations in the US, European, and international airline industry and
the existence of subsidized national flag carriers, there are substantial differences
in the level of productivity in the provision of airline services across different
carriers and over time. Thus, we also examine various treatments in the applied
airline economics literature that have modeled and measured the existence of
time-varying inefficiency and its persistence, brought about in part by the long-run
inefficiency of incumbent firms operating in non-contested markets. The
formal rules and dynamics of regulation and deregulatory initiatives in the USA,
Europe, and in international markets are also discussed, as well as the economic
impacts of delays, the presence and exploitation of market power in projected
niche markets, and the price premiums that go hand-in-hand with market
power.

Keywords

Airline industry · Competition · Mergers · Productivity · Efficiency ·
Innovation · Deregulation · Networks

Introduction

Our survey of applied issues in the productivity and efficiency of the US and
international airline industry will focus on a number of related issues. These include
the formation of mergers and alliances, the exploitation by airline firms of the
economies of vertical integration, as well as collusive behavior. We also will touch
on issues of bankruptcy, pricing and differentiation by heterogeneous service and
network characteristics, and how the threat of entry may have provided some com-
petitive behaviors by firms as they acquired larger market sizes via contestability
of those larger markets. Due to legacies of regulations in the US, European, and
international airline industry and the existence of subsidized national flag carriers,
substantial inefficiencies have also existed in the provision of airline services.
Thus, we also will examine various treatments in the applied airline economics
literature that have modeled and measured the existence of time-varying inefficiency
and its persistence, as evidenced by the long-run inefficiency of incumbent firms
operating in noncontested markets. The formal rules and dynamics of regulation
and deregulatory initiatives in the USA, Europe, and in international markets will be
discussed, as well as the economic impacts of delays, the presence and exploitation
of market power in projected niche markets, and the price premiums that go hand-in-hand
with market power. We also discuss recent work on the economics of airport
governance, an issue that has had relatively little written on it in the economics
literature although it is probably the most important constraining factor in the
ability of airline firms to be productive and efficient. In the course of our survey, we
will at times provide a bit more analytic and modeling detail in order to provide the
reader with technique as well as an historical overview.
Our survey chapter begins with a discussion of the industrial organization of the
airline industry, focusing on applied work that addresses mergers, alliances, vertical
integration, and collusion. In section “Mergers, Alliances, Vertical Integration and
Collusion,” we focus on the US industry in regard to such matters, as its industry
structure has been driven much more by market forces than by national carrier
sovereignty, which has been the case in much of Europe and in Asia, Africa, and
Latin America. Section “Financial Struggle in the Airline Industry” provides a brief
discussion of entry and exit studies and those that focus on the puzzling record of
financial instability in the airline industry. Legacy carriers have opted to expand
networks as a way to differentiate their product, and they
have done this largely by mergers and acquisitions. However, their cost structures
post-merger and acquisition are not clearly lower. Our next section “Pricing and
Differentiation by Heterogeneous Characteristics” explores issues of pricing and its
differentiation in a setting of heterogeneous demand and supply characteristics. The
impact of competition in a setting in which there is heterogeneity in consumers’
willingness-to-pay and brand loyalty may have different impacts (positive or
negative) on different sets of consumers. The relationship between market power
and efficiency is taken up in section “Market Power and Price Premium,” focusing
on structural dynamic modeling of efficient super-game equilibrium of airlines
competing in the same city-pair market, while section “Airline Efficiency” deals
more explicitly with the form of and source of the inefficiency. The economic
impact of delays is taken up in section “Economic Impact of Delays.” As air
traffic grows and capacity is constrained by local zoning or other public policy
priorities, delays are a natural by-product of growing airport congestion. This
relates in part to the topic of governance and airport efficiency that is taken up
in section “Governance and Airport Efficiency.” The section surveys studies that
estimate the relative efficiency differences among US airports, controlling for airport
specific heterogeneity, and speaks to efficiency impacts of local authority, hub size,
and availability of multiple urban airports. Deregulation is taken up in section
“Deregulation,” beginning with the 1978 Airline Deregulation Act and continuing
through a series of transitional changes in regulatory policy in the USA and in
Europe. A brief conclusion is provided in section “Conclusions.”

Mergers, Alliances, Vertical Integration and Collusion

The airline industry has experienced a rise in market concentration in the last
two decades. Whereas in 2008 six legacy airlines served the market, mergers in
the industry decreased the number of legacy firms to three: American, Delta, and
United. Furthermore, low-cost carriers have also experienced mergers, either with
other low-cost carriers or with legacy firms. As such, by 2015, Southwest Airlines,
along with the three legacy firms, controlled 80% of the airline traffic in the
continental US [91].
The recent consolidation of the market allowed firms to expand their networks
and to increase efficiency; nevertheless, such consolidation raises antitrust
concerns. For the purpose of this review, we are therefore interested in how mergers
between large firms influence the structure of the market, how collusive behavior
may take place in an oligopolistic market, and finally, how collaboration takes place
in the form of code-sharing.

Mergers

When antitrust authorities evaluate mergers, they assess how market power might
be created and whether barriers to entry may be enhanced by the merger. Both
considerations are assessed for the short and the long run. Early work by Nevo [76]
has been used to evaluate the short run implications of a merger, and recent
contributions in entry and dynamic games ([14, 36, 71], and [2, 3]) have improved our
understanding of the long-term effects of a merger. Since the early work on the
short-term effects of a merger is well documented, we briefly describe the latter.
Li et al. [71] analyze the endogenous service choice of a carrier to offer non-
stop or connecting services on a route after a merger takes place. Their model
has a standard two-stage structure where carriers choose their type of service and
then choose equilibrium prices. Importantly, they assume complete information on
the qualities and costs of the type of service provided (non-stop
or connecting). The key insight of the paper is that the choice
of service (to offer a non-stop or a connecting flight) reveals information about
the firm, specifically about its efficiency in providing the service selected. The
authors show that accounting for selection of the firms able to compete in the
non-stop market is important, since we may otherwise overestimate the probability that
other firms may enter a market where a new merger has taken place. Interestingly,
Ciliberto et al. [36] reach similar conclusions, although they do not focus on
the type of service provided, but rather on the endogenous entry decision of the
firms. The authors find that not accounting for endogenous entry to a market leads to
overestimation of demand elasticities, which leads to biased markups, and therefore
incorrect implications when analyzing the market power generated by a merger.
When simulating a merger, the authors find that the new merged firm has strong
incentives to enter new markets, and that it faces a stronger threat of entry from
other legacy firms, rather than from low-cost carriers.
Complementing the static analyses that measure the short run effects of a
proposed merger, recent contributions in dynamic games by Benkard et al. [14] and
Aguirregabiria and Ho [2, 3] seek to better understand how long increases in
concentration due to a merger are likely to prevail, as well as their impact on prices
in the medium and long run. These methods are useful complements to, rather than
substitutes for, the more widely used static models. For example, Benkard et al.
[14], building on a two-step estimator by Bajari et al. [10], estimate policy functions
based on historical premerger data, and then simulate future industry outcomes
both with and without the proposed merger. The authors find that low-cost carriers
play an important role in creating offsetting entry, and that the size of an airline’s
network at each end of the route is an important determinant of entry. Hence, the
empirical model will predict offsetting entry after a merger takes place on routes
where there is a potential entrant with a rich route network close to the route being
contested.
In a counterfactual analysis, they find that, when the dynamic aspects of a merger
are considered, some mergers that were not authorized might have been allowed to go
through, such as the attempted merger of United Airlines (UA) with US Airways. Given
that the overlap of the networks of UA and US Airways was in areas where
low-cost carriers were operating and expanding, the decrease in competition (due to
the reduction of flights operated by both airlines) may have created incentives for
the low-cost carriers to enter the less competitive routes mainly served by the newly
merged firm.
Whenever the large network of a low-cost carrier does not overlap with the network
of the proposed merged firm, antitrust authorities can implement measures to
promote the entry of low-cost carriers, so as to avoid monopolistic practices by
the newly merged company. Zhang et al. [91] evaluate one such measure: forcing
the merging companies to divest assets
as a condition for merger approval. Specifically, they evaluate the divestiture of
the control of gates, and find that this measure is effective at decreasing rates for
both merging airlines (3% decrease) and non-merging ones (1% decrease) at
affected airports, relative to airports where no forced divestiture took place.
Understanding the long-term dynamics in a market is important in order to estimate
more accurately the potential changes in the market due to a merger, and to act
accordingly. Likewise, understanding the constraints on entering a market
can help antitrust authorities implement measures that prevent monopolistic
behavior.

Collusive Behavior

In oligopolistic markets, even if two large firms do not merge, collusive behavior
can still have a corrosive effect on prices and consumer welfare. In earlier work
on collusion, Alam et al. [5] analyze the long run pricing behavior of carriers
competing on the same route. If firms sustain a price relationship, the series will
be cointegrated when looking at the mean price, or stationary when considering
the price distribution. Therefore, testing for stationarity and cointegration allows
the authors to draw conclusions about parallel pricing. Using an index of price
dispersion, the authors find long-run stability in approximately two-thirds of their
sample. Similarly, Fang and Sickles [41] develop a dynamic model of collusion in
airport-pair routes for selected US airlines and present an estimation procedure based
on Kalman-filtering techniques. Nevertheless, we should note that, as pointed out by
Motta [75], parallel pricing can also be present in competitive markets.
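The logic of testing for a sustained price relationship can be illustrated with a two-step, residual-based cointegration check in the spirit of Engle and Granger. This is only a sketch and not the procedure used by Alam et al. [5]: the function names, the simulated fare series, and the omission of a constant and of lag augmentation in the second step are all assumptions made here for brevity.

```python
import random


def ols(x, y):
    """Return (intercept, slope) from regressing y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def dickey_fuller_t(u):
    """t-statistic on rho in du_t = rho * u_{t-1} + e_t (no constant, no lags)."""
    lagged = u[:-1]
    diffs = [u[t] - u[t - 1] for t in range(1, len(u))]
    sxx = sum(v * v for v in lagged)
    rho = sum(v * d for v, d in zip(lagged, diffs)) / sxx
    resid = [d - rho * v for v, d in zip(lagged, diffs)]
    sigma2 = sum(e * e for e in resid) / (len(diffs) - 1)
    return rho / (sigma2 / sxx) ** 0.5


def cointegration_t(fare_a, fare_b):
    """Step 1: regress fare_a on fare_b. Step 2: unit-root test on the residuals."""
    intercept, slope = ols(fare_b, fare_a)
    resid = [a - intercept - slope * b for a, b in zip(fare_a, fare_b)]
    return dickey_fuller_t(resid)


# Hypothetical simulated mean fares for two carriers on one route.
rng = random.Random(0)
fare_b = [200.0]
for _ in range(299):
    fare_b.append(fare_b[-1] + rng.gauss(0.0, 2.0))  # carrier B follows a random walk
# Carrier A tracks carrier B up to stationary noise, so the two series are cointegrated.
fare_a = [20.0 + 0.9 * b + rng.gauss(0.0, 1.0) for b in fare_b]

t_stat = cointegration_t(fare_a, fare_b)
```

A strongly negative `t_stat` points toward a stable long-run price relationship. The statistic should be compared against critical values tabulated for residual-based cointegration tests rather than the standard t table, and, as the text notes, such a relationship is not by itself evidence of collusion.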
In recent work, Ciliberto and Williams [34] aim to find what facilitates
collusion. To test for collusion, the authors nest conduct parameters into a standard
oligopoly model with differentiated products, where firms compete in prices. The
conduct parameters are estimated using variation in multimarket contact across
local airline markets. As intuition suggests, the authors find that multimarket contact
facilitates collusion among airlines. Carriers that meet in many markets simultaneously
seem to sustain almost perfect coordination, and importantly, the cross-price elasticity
seems to play a key role in determining the impact of multimarket contact on
collusive behavior. Finally, Ciliberto et al. [37] formulate two additional tests for
collusion based on the following insights. First, colluding firms will reduce pair-wise
differences in prices within a market, and second, in order to avoid informational
costs, colluding firms will sacrifice efficiency in production by increasing price
rigidity. The authors then look for contexts in the airline industry that may be
consistent with these two implications, and find that greater multimarket
contact leads to pricing patterns consistent with both implications,
whereas code-share agreements are consistent with only the second implication.

On Collaboration and Vertical Integration

Another way that carriers have looked to strengthen their market power or differen-
tiate their product is through formal collaboration with other carriers to broaden the
set of services the carrier can provide, or to expand their networks. One of the most
common alliances is code-sharing, which is an agreement between two carriers that
allows them to sell seats on specific flights that are operated by one of the parties.
Code-sharing is attractive for efficiency and competitive motives as it effectively
widens the network of a carrier. Code-sharing may also eliminate the double
marginalization that would exist had the airlines independently determined the price
for different segments of a trip [25, 27]. Regarding the competitive motive, it may
be used as a means toward preserving or increasing market power. As Bamberger
et al. [11] argue, it may give the allied firms a broader reach in their advertising
and service, which may broaden their consumer base while the advertising costs are
shared with the allied firm. Others, such as Goetz and Shapiro [51], argue that
code-sharing may be used as a preemptive action when a legacy firm faces a threat of
entry from a low-cost carrier.
The gains in efficiency by the use of code-sharing may provide a larger network
for consumers at lower prices; nevertheless, the benefits to the consumer may be
hampered when one of the affiliated firms in the code-share agreement also provides
a single-carrier product in the market concerned [49]. In such instances, double
marginalization may not be eliminated by code-sharing, as the firm offering a
competing product will have incentives to soften competition with its own product,
and hence, it will negotiate higher prices in the segment where it has a code-share
agreement.
In an empirical application of the effect of code-sharing on consumer welfare,
Armantier and Richard [8] study the impact of the 1999 domestic code-share
agreement between Continental Airlines (CA) and Northwest Airlines (NW). The authors
find that the consumer’s valuation of an airline product is significantly
affected not only by its price, but also by the flight duration and the time spent in
transit at an intermediate airport. They also find that the code-sharing did
not have a significant impact on consumer welfare on average; nevertheless, this is
not to say that the program had a null effect on all consumers. The program did
increase the average surplus of connecting passengers but decreased it for non-stop
passengers. This result may be attributed in large part not to changes in price, but
to changes in the product characteristics.
Apart from code-sharing, another frequent means of collaboration between
airlines is the subcontracting of services for specific routes. Frequently, major
airlines subcontract portions of their network to a local carrier, which may be
owned by the major airline or may be independent. Forbes and Lederman [45]
study the effects of vertical integration on operational performance. Specifically,
they analyze whether, in a particular airport, an airline’s use of an owned carrier
rather than an independent one affects delays and cancellations of the airline’s own
flights out of that airport. Using an IV estimation procedure, they find that integrated
major carriers perform systematically better than nonintegrated ones on the same day.
Moreover, differences in efficiency seem to be more pronounced during days with
extreme weather and when airports are more congested. These results are consistent
with previous findings by the authors in Forbes and Lederman [44], who find that
airlines are more likely to use owned regional carriers on city pairs with more
complex schedules, and therefore, where administrative decisions about crews and
delays need to be made more frequently.

Financial Struggle in the Airline Industry

Although air transport is one of the most important means of transportation in the
US, the historically poor financial record of the airline industry remains a puzzle. For
a review of possible causes see Borenstein [17], who considers the following factors:
exogenous cost drivers such as an increase in taxes and the price of fuel, demand
shocks, and entry and expansion of low-cost carriers. During the last decades, legacy
carriers have maintained higher costs than low-cost carriers and have seen their
price premia consistently decline; hence, although the exogenous shocks affect all
companies, legacy airlines have fared worse than low-cost carriers. To differentiate
their product, legacy carriers have opted to expand their networks through mergers
and acquisitions; nevertheless, it is not clear that these mergers have allowed
them to decrease their cost gap. Berry and Jia [15] consider these causes and
estimate a structural model of the industry to estimate the impact of demand and
supply changes on profitability. On the demand side, the authors find that compared
with the late 1990s, the price elasticity of air travel demand in 2006 increased by
8%. Also, passengers showed a stronger preference for direct flights. On the supply
side, changes in marginal cost favored direct flights. Together with the expansion
of low-cost carriers, the previous three factors account for more than 80% of the
observed reduction in legacy carriers’ profits, of which 50% can be attributed to
changes in demand.
Recent studies have started to look into the impact that filing for bankruptcy has
on the quantity of services provided and on the fares charged. As expected,
filing for bankruptcy decreases the level of services provided and the size of
the network served [21, 31]; nevertheless, there is less of a consensus over its
effect on prices. In early work, Borenstein and Rose [20] find little evidence
that bankruptcy filing affected pricing behavior, whereas Ciliberto and Schenone
[31] find that airlines lower their route-specific
prices while under bankruptcy protection, and increase them after emerging from
bankruptcy. Importantly, during the studied period, they did not find similar patterns
in the behavior of the airline’s competitors.

Pricing and Differentiation by Heterogeneous Characteristics

In markets characterized by few competitors, the capacity of large airlines to
influence market prices has sparked an interest in better understanding
their pricing behavior. Since the early work of Borenstein and Rose [19], many
studies have focused on the relationship between price dispersion and
competition, with little consensus on the direction of this relationship. Whereas
Borenstein and Rose [19] find a positive relationship between price dispersion
(as measured by the Gini Index) and competition, Gerardi and Shapiro [50] find the
opposite. Recent work by Chandra and Lederman [30] and Chakrabarty and Kutlu
[29] seems to explain the mixed results in the literature. On the one hand, Chakrabarty
and Kutlu [29] find an S-shaped relationship between price dispersion and market
concentration. On the other hand, Chandra and Lederman [30] show theoretically that
if there is heterogeneity in consumers’ willingness-to-pay and in their brand loyalty,
then competition may decrease price differences between some consumers while
increasing them between others; hence the ambiguous result on price dispersion and
competition.
More recently, Kutlu and Wang [66] examine the effect of market power and
marginal cost efficiency on the US airline price dispersion. They use the same
dataset that Kutlu and Wang [65] used so that the city-pairs originate from Chicago.
They model price dispersion by the following equation:

$$
Y_{irt} = \beta_0 + \beta_1 \mathit{MP}_{irt} + \beta_2 \mathit{EFF}_{irt} + f(\mathit{COST}_{it}; \delta) + g(\mathit{PRODATT}_{irt}; \gamma) + h(\mathit{POPATT}_{tr}; \eta) + F_i + R_r + T_t + \varepsilon_{irt}, \tag{1}
$$

where Y is a price dispersion measure (either the Gini coefficient for ticket prices or
the difference between the 90th and 10th percentiles of ticket prices); MP is a market
power measure (either the conduct parameter or the Herfindahl-Hirschman Index); EFF is
the marginal cost efficiency; f(.) is a function of cost-related variables; g(.) is a
function of product attributes; h(.) is a function of population attributes;
$F_i$, $R_r$, and $T_t$ are firm, route, and time dummies; and $\varepsilon_{irt}$ is
the error term. The indices i, r, and t are for airline, route (city-pair), and time.
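To make the dependent and market power variables concrete, the following sketch computes the two dispersion measures mentioned above (the Gini coefficient and the 90-10 percentile gap) and a Herfindahl-Hirschman Index. The function names, the linear-interpolation percentile rule, and the sample fares are illustrative assumptions, not the construction used by Kutlu and Wang [66].

```python
def gini(prices):
    """Gini coefficient of a list of non-negative ticket prices."""
    xs = sorted(prices)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive price")
    # Formula on sorted data: G = 2 * sum(rank * x) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def percentile(prices, q):
    """Percentile with linear interpolation between order statistics, q in [0, 100]."""
    xs = sorted(prices)
    pos = (len(xs) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(xs) - 1)
    return xs[lower] + (xs[upper] - xs[lower]) * (pos - lower)


def p90_p10_gap(prices):
    """Difference between the 90th and 10th percentiles of ticket prices."""
    return percentile(prices, 90) - percentile(prices, 10)


def hhi(shares):
    """Herfindahl-Hirschman Index from market shares expressed as fractions."""
    return sum(s * s for s in shares)


# Hypothetical fares observed for one airline-route-quarter cell.
fares = [100.0, 200.0, 300.0, 400.0, 500.0]
dispersion_gini = gini(fares)
dispersion_gap = p90_p10_gap(fares)
concentration = hhi([0.5, 0.3, 0.2])
```

Identical fares give a Gini coefficient of zero, while a cell in which a single ticket accounts for all revenue approaches the upper bound of (n - 1)/n.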
In their analysis, their benchmark measure for price dispersion is the Gini
coefficient, which reflects price inequality across the entire range of different prices.
The estimation results of Kutlu and Wang [66] show that the marginal effect of
conduct may be over-estimated if the marginal cost efficiency is omitted from the
regression. The marginal cost efficiency has a negative effect on price dispersion and
seems to be a relatively important determinant of price dispersion. This highlights
another important aspect of efficiency, as marginal cost efficiency not only directly
affects deadweight loss through suboptimal behavior of airlines but also affects
welfare through its effect on price dispersion. Another finding that Kutlu and Wang
[66] mention is that the marginal cost efficiency is more effective in leisure routes
compared with big city routes. The data show that the airlines are more successful
in identifying different customer types in big city routes. The ability to identify
customer types, along with lower costs on big city routes, may give airlines
room for more price discrimination.
Other papers, rather than studying how the market structure affects prices, have
instead looked at how the ability to price discriminate over time affects consumer
welfare. Lazarev [69] proposes a structural model where price discrimination can
affect consumer welfare through three channels: by affecting the quantity of the good,
by affecting its quality, and through misallocation of products among buyers.
Compared to an ideal allocation that maximizes social welfare, the profit-maximizing
allocation results in a welfare loss of 21% for the consumers. To understand
how intertemporal price discrimination contributes to this loss, the author tests
counterfactual market designs. When allowing a secondary market for the resale
of tickets, the author finds that tickets for leisure travelers would decrease, while
tickets for business travelers would increase. Social welfare would increase by almost
12%, driven by the welfare gains of the business travelers, but airlines’ profits
would decrease by 28%.
Also considering the effect of price discrimination on the allocation of products,
Aryal et al. [9] develop a model of intertemporal and intratemporal price
discrimination to analyze how discriminatory mechanisms could be used to
remove sources of inefficiency. They focus on measuring the inefficiency in the
allocation of products, and on identifying the portions attributable to two sources of
inefficiency: asymmetric information and stochastic demand. The authors find that
compared to the first-best welfare, the current pricing practices result in a 19%
welfare loss. Of this loss, stochastic demand and asymmetric information account
for 65% and 35%, respectively.
Another method that enables airlines to price discriminate is the unbundling of
services. In 2000, a flight ticket would generally include the possibility to check
a bag along with a carry-on, and the choice of a seat and the provision of snacks were
benefits taken for granted. In contrast, most airlines today charge fees for every
additional service. In general, most airlines now provide a basic package with the
lowest level of services, and then offer a set of optional purchases for various fees.
Unbundled fares were pioneered by low-cost carriers (LCC) in order to attract the
more price-sensitive passengers, and were later adopted by flagship
carriers. Recently, Howell and Grifell-Tatjé [58] have studied this rather overlooked
phenomenon in the economics literature. The authors both study the history of the
unbundling process and perform a case analysis of Frontier Airlines, an LCC based
in Denver, Colorado.
Frontier Airlines is an LCC operating under a hub and spoke framework. In
2008, its operating cost and passenger revenue were 11.2 and 10.5 cents per
available seat mile (ASM), underperforming the average of low-cost carriers, with
operating costs and passenger revenue of 10.5 and 11.7 cents per ASM. That year it
averaged approximately $3.7 in ancillary revenue per passenger, and altogether it had
an operating loss of $17 million. By 2016, the picture had changed drastically. Although
the operating cost and the passenger revenue had both dropped by 7.6 and 7.9 per
ASM, the ancillary profits had risen to $48.33 per passenger, and its reported
operating income to $317 million.
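The 2008 comparison can be made explicit with a small piece of arithmetic on the per-ASM figures quoted above. The helper name is hypothetical, and the margin shown is based on passenger revenue only, so it leaves out ancillary revenue and is not the carrier's full operating margin.

```python
def passenger_margin_per_asm(passenger_revenue_cents, operating_cost_cents):
    """Passenger revenue minus operating cost, in cents per available seat mile."""
    return round(passenger_revenue_cents - operating_cost_cents, 1)


# 2008 figures quoted in the text (cents per ASM).
frontier_2008 = passenger_margin_per_asm(10.5, 11.2)
lcc_average_2008 = passenger_margin_per_asm(11.7, 10.5)

# How far Frontier trailed the low-cost carrier average on this measure.
gap_2008 = round(lcc_average_2008 - frontier_2008, 1)
```

On these numbers Frontier's passenger revenue fell short of its operating cost per ASM, while the average low-cost carrier covered its operating cost from passenger revenue alone.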
Importantly for the case study, in 2014, Frontier announced a stripped-down cost
structure, reduced its average economy fare by 12%, and unbundled the low-cost
economy tickets. The observed change in strategy in 2014 allows the authors to
study how the unbundling changed Frontier’s revenue stream and the structure
of the markets in which it operated. More broadly, the authors study whether the
segmentation of services allows for productivity change.
Finally, given that the ability of a firm to price discriminate is interlinked with
the market power it holds, the studies that look into price discrimination also
consider how this affects the market power of a firm. In this line of thought,
Kutlu and Sickles [64] propose a market power measure that is designed to capture
price discrimination. This exercise is important because in industries where price
discrimination is common, often mergers are analyzed under the framework of a
single price. The lack of consideration of price discrimination by antitrust authorities
when looking at market power could lead them to block socially beneficial mergers
or to accept harmful ones.

Market Power and Price Premium

Understanding the sources of market power is important both for antitrust
authorities, to propose effective measures when market power exceeds a threshold, and
for firms, to better understand the obstacles to entering a market.
Recent literature has shown that firms may seek to strengthen their market power
through strategic control of airport facilities, specifically the gates. Others have
pointed out the importance of frequent flyer programs in strengthening the loyalty of
consumers. Regarding the former, Ciliberto and Williams [33] study the importance
of the control of gates as a determinant of the hub premium, and find that when the
percentage of gates controlled by the carrier in a hub increases from 10% to 30%,
fares increase by 3%. Similarly, if there are limits to the fees that the
airline can charge for subleasing the gates, then prices decrease. Nevertheless, both
the control of gates and restrictions on sublease fees have an impact on fares when
gates are scarce relative to the number of departures from that airport.
With respect to the importance of frequent flyer programs (FFPs), Lederman [70]
studies the impact of enhancements to an airline’s FFP on demand. Lederman shows
that enhancements to an airline’s FFP are correlated with increases in demand on the
routes that depart from the airports at which it is dominant. Also, the author shows
that such an upgrade in the FFP results in a new equilibrium with fewer passengers and
higher fares, which is consistent with the idea that FFPs are especially valuable for
the passengers that already have a high valuation of the airline.
The relationship between market power and both the control of gates and the
use of FFPs highlights the importance of considering variables other than prices when
studying market power. Consistent with this consideration, Röller and Sickles [83]
study how sensitive the measure of market power is when competition is on both
prices and capacity, rather than on prices only. The authors propose a structural
model with a two-stage setup. In the first stage, firms make capacity decisions,
followed by a price-setting game in the second stage. They study the European
market for the period 1976 to 1990, and find that in the two-stage setup,
market conduct is less collusive than in a one-stage specification. Overall, in order
to avoid a biased measure of market power, it is important to understand the barriers
to entry, and to consider variables other than prices to strengthen our understanding
of the strategic behavior of firms.

Threat of Entry

The consolidation of the airline industry has increased the network size of the
remaining firms in the market, but as airlines seek to expand, we might expect that
incumbent carriers at particular airports act strategically to deter entry in order
to avoid competition. To this end, Ciliberto and Zhang [35] estimate three different
models for entry games in order to compare their fit. They estimate a
simultaneous game with complete information, as in Ciliberto and Tamer [32], and
two sequential games, with and without strategic entry deterrence. The authors show
that the sequential game with strategic deterrence provides the best fit
to the data, lending some support to the view that incumbents act strategically to
deter the entry of new firms.
As to the specific actions that airlines may take to deter entry, Goolsbee
and Syverson [54] find evidence that incumbent firms under threat of entry
cut fares significantly. Nevertheless, the authors find little support for strategic
investment/excess capacity as a preemptive action. Consistent with deterrence, when
the entry of a competitor is guaranteed, firms do not seem to lower prices prior to
the entry of the new firm. Although the authors do not find preemptive strategic
investment, Parise [80] finds that when legacy firms are under threat of entry by a
low-cost carrier, they change their debt structure. Incumbents increase debt maturity
before entry occurs, which allows them to reduce rollover risk.

Market Power and Efficiency

Kutlu and Sickles [63] examine the relationship between market power and
efficiency in two US city pairs (Chicago-San Diego and Chicago-Salt Lake City).
For this purpose, they develop a dynamic conduct parameter game where, in the
full market power scenario, the airlines play an efficient supergame equilibrium. As
in Rotemberg and Saloner [85], they assume a full-information environment for the
airlines. At the beginning of each time period, the airlines know the demand and cost
shocks before making simultaneous decisions, which become common knowledge.
Since the shocks are observable by the airlines, they can adjust their quantities and
dampen the profits strategically. If the cost and demand shocks are such that the
incentive to deviate is high, then the airlines adjust their quantities to lower their
profits relative to the case in which the incentive to deviate is lower. While this may
look like a price war, this is an attempt to prevent deviation from the equilibrium.
In their model, the dynamic optimization of an airline i with full market power is
given by:

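$$
\begin{aligned}
Q_t^{*}(S_t, \delta) = \arg\max_{Q_t,\, s_t} \; & \sum_i \pi_{it}(s_{it} Q_t; S_t) \quad \text{s.t.} \\
& \pi_{it}^{b}(Q_t; S_t) + \sum_k \delta^{k} E_t\!\left[\pi_{it}^{r}(S_{t+k})\right] \\
& \quad \le \pi_{it}(s_{it} Q_t; S_t) + \sum_k \delta^{k} E_t\!\left[\pi_{it}^{*}(S_{t+k})\right] \quad \forall i,
\end{aligned} \tag{2}
$$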

where $s$ is the market share; $Q$ is the total market quantity; $\pi$ is the realized
profit based on the actions taken; $\pi^b$ is the best response profit; $\pi^r$ is the
profit for the retaliation period; $\pi^*$ is the profit when the collusion is
sustained; $S_t$ is a vector of factors that represents the state of the world at
time $t$; and $\delta$ is the discount factor.
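The role of the incentive compatibility constraint in Eq. 2 can be illustrated with a deliberately simplified check. The sketch assumes that the sums run over k = 1, 2, ... and that expected future profits are constant over time, so that each sum collapses to δ/(1 − δ) times a per-period profit; neither assumption is stated in the source, and the function names and profit figures are hypothetical.

```python
def present_value_factor(delta):
    """Sum of delta**k over k = 1, 2, ... for a discount factor in [0, 1)."""
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must lie in [0, 1)")
    return delta / (1.0 - delta)


def collusion_sustainable(pi_now, pi_best_response, pi_retaliation, pi_star, delta):
    """True if deviating today is no more profitable than complying.

    Deviation: best-response profit today, retaliation profits afterwards.
    Compliance: collusive profit today, sustained collusive profits afterwards.
    """
    pv = present_value_factor(delta)
    deviate = pi_best_response + pv * pi_retaliation
    comply = pi_now + pv * pi_star
    return deviate <= comply


# A favorable shock raises the temptation to deviate and breaks the constraint ...
high_shock = collusion_sustainable(12.0, 20.0, 5.0, 10.0, 0.5)
# ... while a less ambitious collusive outcome today restores it.
dampened = collusion_sustainable(11.0, 15.5, 5.0, 10.0, 0.5)
```

The second call mirrors the mechanism described in the text: when the incentive to deviate is high, current collusive profits are scaled back so that the constraint holds again.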
When we allow conducts that are less than full market power, the first order
condition for the conduct parameter game counterpart of Eq. 2 would be given by:

$$
\theta_t P'(Q_t^{*})\, Q_t^{*} + MK_t - \mu_t^{*} = 0, \tag{3}
$$

where $\theta$ is the conduct parameter; $P(\cdot)$ is the inverse demand function;
$MK$ is the market share weighted price-marginal cost markup (i.e.,
$MK_t = \sum_i s_{it} MK_{it} = \sum_i s_{it}(P_{it} - MC_{it})$); and $\mu^*$ is a
dynamic term that reflects the incentive compatibility
constraint. If $\mu^* = 0$, this first order condition represents the static game scenario.
Otherwise, we would have a dynamic conduct parameter game. Kutlu and Sickles
[63] assume that $\mu^*$ is a linear function of cost and demand shocks as well as (market
share weighted) cost inefficiency of airlines for a city-pair.
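As a rough illustration of how the pieces of the first order condition fit together, the sketch below computes the market share weighted markup and then solves the condition for the conduct parameter, reading the first term of Eq. 3 as the conduct parameter times the slope of inverse demand times quantity. In the papers surveyed the conduct parameter is estimated econometrically rather than backed out in this way, and all function names and numbers here are hypothetical.

```python
def weighted_markup(shares, prices, marginal_costs):
    """MK = sum_i s_i * (P_i - MC_i), the market share weighted markup."""
    return sum(s * (p - mc) for s, p, mc in zip(shares, prices, marginal_costs))


def implied_conduct(markup, demand_slope, quantity, mu_star=0.0):
    """Solve theta * P'(Q) * Q + MK - mu_star = 0 for theta.

    demand_slope is P'(Q), which is negative for downward-sloping demand.
    mu_star = 0 corresponds to the static game scenario.
    """
    return (mu_star - markup) / (demand_slope * quantity)


# Hypothetical two-airline city-pair.
mk = weighted_markup([0.6, 0.4], [200.0, 180.0], [150.0, 150.0])
theta_static = implied_conduct(mk, demand_slope=-0.1, quantity=1000.0)
theta_dynamic = implied_conduct(mk, demand_slope=-0.1, quantity=1000.0, mu_star=10.0)
```

With the same observed markup, a positive dynamic term implies a different conduct parameter than the static reading, which is one way to see why ignoring the dynamic term can distort inference about market power.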
Since Kutlu and Sickles [63] do not have city-pair-specific airline cost data,
they estimate an industry-specific stochastic frontier cost function for airlines. The
quarterly dataset that they use for cost function estimation has 11 airlines and
covers the years between 1980 and 1993; it is collected from the Department of
Transportation’s Form 41/T100. For the conduct parameter estimation, they only
use data for the top six airlines for the specific city-pairs that they study. The cost
data have more airlines because they cover the whole US airline industry. Their cost
function has four inputs: labor, energy, flight capital, and materials, the last being a
residual category that includes supplies, outside services, and non-flight capital. They
calculate the price data using the multilateral Törnqvist-Theil index number procedure.
The output variables are the number of enplanements, and scheduled and non-scheduled
(cargo and charter operations) passenger revenue miles. In order to control for different
characteristics of the output, they include load factor and stage length variables.
Also, they include average size (measured in seats) and fuel efficiency variables
to describe flight capital. Finally, they use seasonal dummies and event dummies,
which control for some important events that occurred during the study period (Iran–Iraq
war; Gulf war; air traffic strike; and dummies for airline mergers and strikes).
They estimate the cost function using the distribution-free within estimator of
Cornwell et al. [38] (CSS). In contrast to distribution-based models, the distribution-free
CSS model can be estimated by a fixed-effects type estimator that allows the fixed
effects to vary over time. The efficiencies in the CSS model can be correlated with
the regressors. However, the estimated effects capture both heterogeneity and efficiency. Among
others, see Greene [55, 56], Wang and Ho [90], and Kutlu et al. [67], who propose
models that can disentangle efficiency and heterogeneity. Another issue with the
CSS model is that the efficiency estimates are not robust to outliers. Hence, some
researchers drop observations whose fixed-effects values fall in the top or bottom 5%.
After estimating the cost function parameters, they calculate marginal
costs specific to each airline and city-pair using the corresponding enplanement and
miles flown data for each airline. When calculating marginal costs, they use
the distance of the relevant city-pair as well as airline fixed effects. In particular, they
assume that the overall MC is the MC of passenger revenue output times miles flown
plus the MC of enplanement.
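The aggregation of the two marginal cost components can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name and the numbers are hypothetical, and only the formula (MC of passenger revenue output times miles flown, plus MC of enplanement) is taken from the text.

```python
def route_marginal_cost(mc_revenue_mile, miles_flown, mc_enplanement):
    """Overall marginal cost of carrying one more passenger on a city-pair.

    mc_revenue_mile : marginal cost of one passenger revenue mile (hypothetical units: $)
    miles_flown     : distance of the city-pair in miles
    mc_enplanement  : marginal cost of one additional enplanement ($)
    """
    return mc_revenue_mile * miles_flown + mc_enplanement


# Hypothetical values chosen only to show the arithmetic.
mc = route_marginal_cost(mc_revenue_mile=0.08, miles_flown=1250.0, mc_enplanement=22.0)
print(round(mc, 2))
```

The distance term makes the marginal cost route-specific even though the cost function itself is estimated at the industry level.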
While Kutlu and Sickles [63] find evidence for a dynamic game for the Chicago-Salt
Lake City city-pair, they do not find such evidence for Chicago-San Diego. Their
results accord with Hicks’ quiet life hypothesis, which argues that there is a positive
relationship between market power and inefficiency. Finally, they argue that
ignoring either dynamic environments or inefficiency may lead to serious miscalculation
of deadweight loss.
Duygun et al. [40] estimate a (stochastic frontier) production function for US
airlines using a dataset that covers the years between 1999 and 2009, a
period in which airlines faced serious financial troubles. Their main focus is to illustrate
how the Kalman filter estimation method performs when estimating a production
function and airline (firm) technical efficiency.
Their dataset comes mainly from the International Civil Aviation Organization
and includes 35 airlines and 298 observations. When constructing the output and
input variables, they follow Sickles [86] and Sickles et al. [87]. Their dataset
includes four input variables: the flight capital (K, quantity of planes), labor
(L, quantity of pilots, cabin crew, mechanics, passenger and aircraft handlers, and
other labor), fuel (F, quantity of barrels of fuel), and materials (M, quantity index of
supplies, outside services, and non-flight equipment). However, their final models
focus on value added from capital and labor by netting out from revenue output
(RTK, revenue ton miles), the value of the intermediate energy, and materials. In
order to control airline heterogeneity, their production function also includes load
factor, stage length, jet engine aircraft percentage, and average plane size.
Due to multicollinearity issues, they focus on a restricted translog model,
which does not include interaction terms between inputs and control variables. They
estimate the value-added production function via three estimators: the Battese and
Coelli [13] (BC) estimator, the within estimator of Cornwell et al. [38] (CSS), and
a Kalman filter estimator (KFE), which is proposed in their paper. Since the KFE is
the central part of their empirical analysis, we describe it below.
Duygun et al. [40] consider the following Kalman filter model for production
function:

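$$
\begin{aligned}
y_{it} &= \sum_{j} \beta_j x_{jit} + \mu_{it} + \varepsilon_{it} \\
\mu_{i,t+1} &= \mu_{it} + e_{it},
\end{aligned}
\tag{4}
$$

where $\varepsilon_{it} \sim N(0, \sigma_\varepsilon^2)$ and $e_{it} \sim N(0, \sigma_e^2)$ are independently distributed error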
terms. Here μit is a state variable, which controls for unobserved time-varying
heterogeneity. This method can be used for estimating both neoclassical production
functions that do not assume inefficiency and stochastic frontier production
functions. In the stochastic frontier production function setting, Duygun et al.
[40] interpret $\mu_{it}$ as a term that represents the extent of efficiency.
The efficiency estimates are calculated by
$E_{it} = \exp\{-(\max_i \hat{\mu}_{it} - \hat{\mu}_{it})\}$, where $\hat{\mu}_{it}$ is the
estimate of $\mu_{it}$. Compared to KFE, relatively inflexible stochastic frontier models
(e.g., BC and CSS) are more likely to fail to capture potentially complex time-varying
patterns of inefficiency. Hence, KFE is especially useful when the efficiency
of airlines is suspected to vary over time.
Since μit is a random walk process and hence non-stationary, they use a diffuse
prior for its initial value. In particular, they assume that the mean squared error
matrix of the initial states is a constant multiple of the identity matrix, where the
constant is a large number. They suggest that an alternative is to use exact diffuse
priors. One of the difficulties of estimating Kalman filter models is that, due to
numerical rounding errors, some of the matrices used in the Kalman filter recursive
equations may end up being non-positive definite (when they should be positive
definite). This may cause numerical instability in the optimization process. In order
to avoid such issues, Duygun et al. [40] use a square-root Kalman filter, which is
computationally slower but more stable.
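To make the mechanics concrete, the following is a minimal Python sketch of a scalar Kalman filter for the random walk state in equation (4), followed by the efficiency calculation. It rests on several simplifying assumptions that are not from the source: the frontier part has already been subtracted, so each firm's input series is a residual; the two variances are treated as known rather than estimated; the diffuse prior is approximated by a large initial variance `kappa`; and the ordinary filter is used instead of the square-root version. All names are hypothetical.

```python
import math


def local_level_filter(resid, sigma_eps2, sigma_e2, kappa=1e7):
    """Filtered estimates of mu_t for one firm.

    resid      : list of y_t minus the fitted frontier part (assumed given)
    sigma_eps2 : variance of the measurement error epsilon
    sigma_e2   : variance of the state innovation e
    kappa      : large initial state variance approximating a diffuse prior
    """
    m, p = 0.0, kappa          # prior mean and variance of the state
    filtered = []
    for r in resid:
        f = p + sigma_eps2     # variance of the one-step prediction error
        k = p / f              # Kalman gain
        m = m + k * (r - m)    # update the state mean with the new observation
        p = (1.0 - k) * p      # update the state variance
        filtered.append(m)
        p = p + sigma_e2       # predict: random walk adds innovation variance
    return filtered


def efficiencies(mu_hat):
    """E_it = exp(-(max_i mu_it - mu_it)), computed period by period."""
    firms = list(mu_hat)
    periods = len(mu_hat[firms[0]])
    best = [max(mu_hat[f][t] for f in firms) for t in range(periods)]
    return {f: [math.exp(-(best[t] - mu_hat[f][t])) for t in range(periods)]
            for f in firms}


# Hypothetical residuals for two airlines over three periods.
resid = {"A": [0.10, 0.12, 0.15], "B": [-0.20, -0.10, 0.00]}
mu_hat = {f: local_level_filter(r, sigma_eps2=0.01, sigma_e2=0.01)
          for f, r in resid.items()}
eff = efficiencies(mu_hat)
```

In this toy example airline A has the larger state in every period and therefore defines the frontier, while the efficiency of airline B rises over time as its state moves toward that of A.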
The estimate of the median returns to scale is 0.88 and the mean efficiency
estimate is 58%. A common finding in the airline literature is that firms operate in
a constant returns to scale environment. However, Basu and Fernald [12] show that
the value-added estimate of returns to scale is farther away from unity
than the corresponding gross output estimate. That is, it is smaller (greater)
than the corresponding gross output estimate when there are decreasing (increasing)
returns to scale. Therefore, they argue that their returns to scale estimate is within a
reasonable range.

Airline Efficiency

Kutlu and Wang [65] estimate firm-route-quarter specific conducts and marginal
cost efficiencies of US airlines for Chicago-based routes without using route-level
cost data. Their main data source is the Passenger Origin-Destination Survey
of the US Department of Transportation (DB1B dataset), which is a 10% random
sample of all tickets for US domestic flights. The quarterly dataset
covers the years between 1999 and 2009. In their study, they assume that a market is a
city-pair (route). Following Borenstein [16] and Brueckner et al. [26], they assume
that the ticketing carrier is the relevant airline.
In contrast to the conventional stochastic frontier cost function models, which
estimate cost efficiency, they estimate what they call “marginal cost efficiency.”
They argue that from the viewpoint of an antitrust authority who aims to analyze
the welfare, marginal cost efficiency is a more relevant concept as the calculation
of welfare utilizes marginal cost rather than total cost. Of course, marginal cost is
not directly observed and thus their estimation method is unconventional as well.
In particular, they estimate the marginal cost efficiency through a system of demand
and supply relation equations, which is derived from a conduct parameter game. One
advantage of this system of equations is that estimation of marginal cost efficiency
does not require cost data under certain identification conditions. In line with the
conduct parameter approach, the implied marginal cost and corresponding marginal
cost efficiency are deduced from demand and cost shifters. Another advantage of this
estimation strategy is that market power (i.e., conduct parameter) and marginal cost
efficiency are estimated jointly. However, since what is being estimated is a system
of equations with endogenous variables, their methodology requires instrumental
variables in order to obtain consistent parameter and marginal cost efficiency
estimates.
For simplicity, for now we drop the route index. The demand-supply relation
system that they estimate is described by:

$$
\begin{aligned}
P_{it} &= P(Q_t; X_{d,t}) + \varepsilon_{it}^{d} \\
\ln P_{it} &= \ln c_{it}^{*}(X_{c,t}) + g_{it} + u_{it} + \varepsilon_{it}^{s},
\end{aligned}
\tag{5}
$$

where $P(\cdot)$ is the inverse demand function; $Q_t$ is the total output in a route; $X_{d,t}$ is a
vector of variables that affect demand; $c_{it}^{*}$ is the deterministic part of the marginal
cost; $X_{c,t}$ is a vector of variables that affect the marginal cost;
$g_{it} = g(\theta_{it}, s_{it}, E_t) = -\ln\left(1 - \frac{s_{it}}{E_t}\theta_{it}\right) \ge 0$
is a term that reflects the effect of market power on the price;
$\theta_{it}$ is the conduct parameter; $s_{it}$ is the market share of airline $i$ at time $t$; and
$E_t = -\frac{\partial Q_t}{\partial P_t}\frac{P_t}{Q_t}$ is the (absolute value of the) elasticity of demand. Three benchmark values
for the conduct parameter that represent perfect competition, Cournot competition,
and joint profit maximization are $\theta_{it} \in \{0, 1, 1/s_{it}\}$, respectively. While in theory the marginal cost
function can be chosen to depend on quantity, in their empirical application they
assume that the marginal cost is constant, i.e., it is invariant to the quantity.
Note that the marginal cost still depends on $X_{c,t}$.
The identification of parameters of this model requires stronger assumptions
compared to the parameters of a standard demand-supply system. In addition to
the standard assumptions for identifying parameters in a demand-supply system,
we need certain restrictions on the functional forms for demand and marginal cost
functions. Bresnahan [23], Lau [68], Bresnahan [24], Perloff et al. [81], and Kutlu
and Wang [65] exemplify some studies that provide detailed treatments of these
identification conditions for conduct parameter models. Besides this, we also need
to identify the parameters of the inefficiency term, which would require relevant
identification assumptions from the stochastic frontier literature in the presence of
endogenous regressors. Amsler et al. [6], Kutlu and Wang [66], and Kutlu et al. [67]
discuss these identification conditions.
Although it is possible to estimate inverse demand and supply relation equations
simultaneously, Kutlu and Wang [65] estimate an inverse demand function first; and
then plug in relevant parameters obtained from the first stage into the supply relation
equation, which is then estimated by a variation of the control function approach for
stochastic frontier models that was first introduced by Kutlu [61] and further developed
by Karakaplan and Kutlu [59, 60].
For the inverse demand function, they choose the following functional form:

$$
\ln P_{itr} = \beta_0 + \beta_1 \ln Q_{tr} + \beta_2 \ln PCI_{tr} \ln Q_{tr} + f_d(X_{d,itr}) + \varepsilon_{itr}^{d}, \tag{6}
$$

where $f_d(\cdot)$ is a function of demand related variables $X_{d,itr}$; $Q_{tr}$ is the total demand
at time $t$ in route $r$; $PCI_{tr}$ is population weighted per capita income for the relevant
route (city-pair); and $\varepsilon_{itr}^{d}$ is the error term. They assume that $\ln Q_{tr}$ and $\ln PCI_{tr} \ln Q_{tr}$
are endogenous; and are instrumented by logarithms of input prices (labor, capital,
and energy), logarithm of number of passengers for other routes; as well as two other
variables that they construct, which are inspired by the price dispersion literature
(Geometric market share and its product with the number of firms). Using the
parameter estimates obtained from this model, they predict the demand elasticity
specific to a route.
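The step from the estimated inverse demand function to a route-specific elasticity can be illustrated with a short Python sketch. It assumes, as a simplification not stated in the source, that the other demand shifters do not depend on the route quantity, so that the slope of log price with respect to log quantity is β1 + β2 ln PCI and the (absolute) demand elasticity is the negative reciprocal of that slope. The function name and coefficient values are hypothetical.

```python
import math


def demand_elasticity(beta1, beta2, pci):
    """Absolute demand elasticity implied by the log-log inverse demand.

    d ln P / d ln Q = beta1 + beta2 * ln(PCI), so E = -1 / (that slope).
    """
    slope = beta1 + beta2 * math.log(pci)
    if slope >= 0:
        # An upward-sloping inverse demand has no meaningful elasticity here.
        raise ValueError("inverse demand must slope downward")
    return -1.0 / slope


# Hypothetical coefficients and per capita income for one route-quarter.
E = demand_elasticity(beta1=-1.0, beta2=0.05, pci=40000.0)
print(round(E, 3))
```

Because income enters through the interaction term, two routes with the same quantity can have different elasticities.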
When estimating the supply relation, they assume that gitr is a function of
concentration ratio for top four airlines (CR4 ), logarithm of city-pair distance,
market share of airline, trend, and demand elasticity for the city-pair market. In
particular, they model gitr as follows

$$
g_{itr} = \frac{B_{itr} \exp(X_{g,itr}\beta_g)}{1 + \exp(X_{g,itr}\beta_g)}, \tag{7}
$$

where $B_{itr} = -\ln\left(1 - \frac{1}{E_{tr}}\right)$ so that $g_{itr} \in [0, B_{itr}]$ and $X_{g,itr}$ is the vector of
variables that explain $g_{itr}$. Then, they estimate the conduct as follows:

$$
\hat{\theta}_{itr} = \hat{E}_{tr}\left(1 - \exp(-\hat{g}_{itr})\right), \tag{8}
$$

where $\hat{g}_{itr}$ is the estimate of $g_{itr}$.
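The bounded market power term and its inversion can be sketched in Python as follows. This is an illustrative reading rather than the authors' implementation, and the function names are hypothetical. One point deserves care: under the definition of g in equation (5), the quantity E(1 − exp(−g)) equals the product of market share and conduct, so the sketch returns that product and leaves any division by market share to the caller.

```python
import math


def market_power_term(xb, elasticity):
    """g = B * exp(xb) / (1 + exp(xb)) with B = -ln(1 - 1/E), as in equation (7).

    xb         : linear index X_g * beta_g
    elasticity : absolute demand elasticity E, which must exceed 1 for B to exist
    """
    if elasticity <= 1.0:
        raise ValueError("B is only defined for elasticity greater than 1")
    b = -math.log(1.0 - 1.0 / elasticity)
    return b / (1.0 + math.exp(-xb))   # same value as B*exp(xb)/(1+exp(xb))


def share_weighted_conduct(g, elasticity):
    """Invert g = -ln(1 - s*theta/E) for the product s*theta."""
    return elasticity * (1.0 - math.exp(-g))


# Round trip for a Cournot firm (theta = 1) with market share 0.25 and E = 2.
E, s, theta = 2.0, 0.25, 1.0
g_cournot = -math.log(1.0 - s * theta / E)
recovered_theta = share_weighted_conduct(g_cournot, E) / s
```

The logistic form keeps g strictly between zero and B, which corresponds to conduct between perfect competition and joint profit maximization.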

Finally, they model the marginal cost as a (restricted) translog function of
logarithms of output and input prices (labor price, capital price, energy price);
logarithms of average number of flight segments, average stage length, average fleet
size, and flight distance; online rate; low-cost carrier (LCC) dummy; and quarter,
year, and airline dummies.
They find that relative to the non-LCCs, the conducts of LCCs are closer to
perfectly competitive values; and relative to LCCs, the conducts of non-LCCs
are closer to Cournot competition values. When inefficiency is ignored in the
estimations, the conduct parameter estimates are generally smaller and closer to
perfect competition values. Hence, it is essential to control for inefficiency. They
find that the conduct parameter is positively related to CR4 , which indicates
that higher concentration helps increase market power. The median efficiency
estimate for the whole sample is 82.6% and for the non-LCCs is 84.4%, which
indicates that LCCs and non-LCCs have similar marginal cost efficiencies. Finally,
they argue that their results are in line with Hicks’ [57] quiet life hypothesis as
inefficiency is positively related to CR4 .

Economic Impact of Delays

Air traffic delays are one of the most common complaints of airline passengers
[43], and although the percentage of delayed flights has gone down in recent
decades, the increase in demand and in the complexity of airline networks may
hinder the achievement of on-time performance targets. As the demand for air
transportation grows, growing congestion at airports may lead to greater delays.
Moreover, as competition increases among carriers at an airport, departure times
may be less differentiated if the route is served by competing airlines rather
than by a single firm [18], which further increases congestion.
Delays not only have a negative impact on consumers, whose time is
lost and whose flight experience deteriorates, but also reduce their willingness
to pay for a ticket, and therefore affect airlines’ profits. In early work, Forbes
[43] used a legislative change in the takeoff and landing restrictions at an airport that
resulted in longer delays, and used this exogenous variation to study the impact of delays
on prices. For connecting passengers, the author found that prices fell by $0.77 per
additional minute of delay, while for direct passengers they fell by $1.42. This reduction
was larger on competitive routes, which showed a reduction of $2.44.
To ameliorate congestion and delays, economists have long proposed the use
of pricing mechanisms (Morrison and Winston [74]; Aravena et al. [7]) rather than
direct allocation of space, which would delegate the production decision from the
airport to the airlines, so that the better-informed agent chooses the production
level. A complementary strategy taken by the Department of Transportation (DOT)
since 1987 has been to inform consumers about the relative on-time performance
of the largest airlines, so as to incentivize airlines to reduce their delays. To this
end, the DOT produces a ranking of airlines based on the fraction of their flights
that arrive less than 15 min late. Although this program was created to encourage
airlines to improve their on-time performance, it incentivized airlines to focus on
the routes with performance just under the threshold. Forbes et al. [46] analyze
this unintended incentive and find heterogeneous responses across
airlines. The airline characteristics that matter for the response
seem to be the technology used to compute on-time performance
(whether it is done manually or automatically) and whether the airline has a bonus package
for its employees that rewards efficiency. The heterogeneity in results indicates that
the effect of a disclosure program depends not only on the incentives created by
the program, but also on the internal organizational practices of the firm affected.
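The threshold-based ranking can be illustrated with a small Python sketch. The carrier names and delay figures are invented for illustration, and the function is a hypothetical helper rather than the DOT's procedure; the only element taken from the text is that a flight counts as on time when it arrives less than 15 min late.

```python
def on_time_rate(arrival_delays_min, threshold=15):
    """Fraction of flights arriving less than `threshold` minutes late."""
    flights = list(arrival_delays_min)
    if not flights:
        raise ValueError("need at least one flight")
    return sum(d < threshold for d in flights) / len(flights)


# Hypothetical arrival delays in minutes (negative means early).
delays = {
    "Carrier X": [-3, 0, 14, 16, 40],
    "Carrier Y": [5, 12, 14, 14, 90],
}

# Rank carriers from best to worst on-time rate.
ranking = sorted(delays, key=lambda c: on_time_rate(delays[c]), reverse=True)
mean_delay = {c: sum(d) / len(d) for c, d in delays.items()}
```

In this invented example Carrier Y ranks first on the on-time rate even though its mean delay is about twice that of Carrier X, which shows why a single threshold rewards moving flights just below 15 min rather than reducing delays overall.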
A second unintended consequence of the 15 min threshold is that airlines may
increase their flight schedule in order to have more flexibility in meeting the
performance target. In a recent study, Forbes et al. [47] show that flight schedules
have lengthened over time. For flights by the same airline on the same route in the
same month of the year, flight schedules in 2016 were around 8 min longer than
in 1990. Thus, even though the official rate of delays has decreased, passengers
are taking more time to reach their destination. Interestingly, this is not only true
for the airlines that have long reported their on-time performance, but recent work
by Forbes et al. [48] also shows that when an airline or their competitors become
subject to a disclosure requirement for on time performance, they adjust their targets
and lengthen their schedule.
This research shows that it is not straightforward to evaluate the operational
performance of an airline. Future research should therefore place greater emphasis
on multiple dimensions of quality rather than looking at a single dimension.
An index that could be used when considering more dimensions is the Airline
Quality Rating (AQR), developed by the Embry-Riddle Aeronautical University.
This index includes information on mishandled baggage, consumer complaints, on-time
performance, and involuntary denied boarding.

Governance and Airport Efficiency

Abrate and Erbetta [1] and Voltes-Dorta and Lei [89] provide excellent summaries
of the literature on airport costs, efficiency, productivity, and type of ownership. In
what follows, we briefly summarize papers related to airport governance, efficiency,
and productivity, and present one study in more detail. Oum et al. [77] argue that
managerial autonomy is important for airport efficiency. Oum et al. [78] and Oum
et al. [79] find that privatized airports are more efficient than
public sector airports, which in turn are more efficient than public-private
structure airports. In line with these findings, Bottasso et al. [22] find no productivity
effect associated with mixed ownership structures for airports. Martin et al. [73]
and Martin and Voltes-Dorta [72] find that multi-airport systems have higher unit
costs as they operate under unexhausted scale economies. While these studies
illustrate that airport governance is a relevant factor, Vasigh et al. [88] argue that
market structure and competition may be more relevant for airport productivity and
efficiency. Craig et al. [39] analyze two types of public ownership: airport authorities
and city-owned. Similar to Zhao et al. [92], they find that airports governed by airport
authorities are more efficient than city-owned airports.
As an example of an empirical application, we review the work of Kutlu and
McCarthy [62], who use stochastic frontier analysis to examine the effects of
governance types for commercial airports on cost efficiency. They contribute to the
literature in at least three ways. First, they examine relative efficiency differences
across four US commercial airport ownership types (city, county, state, airport
authority). Second, for the first time in the airport efficiency literature, they control
for airport specific heterogeneity and illustrate the importance of controlling for
heterogeneity when estimating airport cost efficiency. Third, they analyze the
efficiency effects of local ownership, hub size, and multiple airport metropolitan
areas via a series of counterfactual analyses.
Their dataset includes 24 medium and 26 large hub commercial airports in
the USA, and covers years between 1996 and 2008. Using data from Bureau
of Transportation Statistics, the FAA, and the National Flight Data Center, they
estimate a variable cost function with one output and three inputs where the output
variable is airline departures; and the input variables are labor, contracting and
repair/maintenance, and general airport operations. While they do not have input
prices specific to airports, they have data on national and MSA price indices for
each input, which they use to construct indices for input prices in a way that
generates variation across geography and time. Their model also includes a quasi-fixed
factor, the effective number of standard runways, and the following control variables: the share
of international departures, the share of freight, the share of revenues from parking,
the share of revenues from retail activities, year dummy variables, and airport dummy
variables. They include airport-specific dummy variables to control for airport-specific
heterogeneity, which is in line with Greene [55, 56].
In particular, they estimate the following stochastic frontier model for variable
cost:

$$
\begin{aligned}
\ln VC_{it} &= f(x_{it}) + u_{it} + v_{it} \\
S_{it} &= S_{it}^{*} + w_{it} \\
u_{it} &\sim N^{+}\left(0, \sigma_u^2(z_{it})\right) \\
v_{it} &\sim N\left(0, \sigma_v^2\right) \\
w_{it} &\sim N\left(0, \Omega_w\right),
\end{aligned}
\tag{9}
$$

where $VC$ is the variable cost; $f(\cdot)$ is a function that represents the deterministic
part of the variable cost frontier; $x$ is a vector of variables included in the frontier;
$u \ge 0$ is a one-sided error term representing inefficiency; $v$ is the usual two-sided
error term; $S$ and $S^{*}$ are vectors that represent the observed and explained parts of
the input shares; $w$ is a vector of error terms with a multivariate normal distribution; $z$ is
a vector of environmental variables that explain inefficiency; and
$\sigma_u^2(z_{it}) = \exp(z_{it}'\delta)$. Assuming that $u$, $v$, and $w$ are independent, the corresponding marginal
log-likelihood function for panel unit $i$ at time $t$ is given by:

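$$
\ln L_{it} = \text{const.} - \ln \sigma_s + \ln \Phi\left(\frac{\varepsilon_{it}\lambda}{\sigma_s}\right) - \frac{1}{2}\ln|\Omega_w| - \frac{1}{2} w_{it}' \Omega_w^{-1} w_{it}, \tag{10}
$$

where $\Phi$ is the cumulative distribution function of the standard normal distribution,
$\varepsilon_{it} = u_{it} + v_{it}$, $\sigma_s^2 = \sigma_u^2 + \sigma_v^2$, $\lambda = \sigma_u/\sigma_v$, and $\Omega_w$ is the covariance matrix of $w_{it}$.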
Note that this model differs from the conventional stochastic frontier models of
cost function as it includes input share equations to improve statistical efficiency.
Moreover, as mentioned earlier, the model controls for unobserved heterogeneity by
including airport dummy variables in the frontier. Kutlu and McCarthy [62] argue
that controlling for heterogeneity may be essential when estimating efficiencies of
airports. When heterogeneity is ignored, one potential problem is that inefficiency and
heterogeneity may be confused in the estimation; furthermore, regularity conditions
may be violated. When the authors ignored heterogeneity, 85 of 649 observations
violated regularity conditions (concavity and monotonicity) for a cost function,
while when they controlled for heterogeneity, only 3 of 649 observations did. A
violation of regularity conditions is more serious for a stochastic frontier variable
cost function than for its neoclassical counterpart, which assumes no inefficiency.
The reason is that the efficiency measurements are invalid for those observations that
violate regularity conditions. This situation would also contaminate the parameter
estimates.
Based on this model, the main findings of Kutlu and McCarthy [62] are summarized
as follows. The median cost efficiency estimate for the US airports is 87.6%.
For medium hub airports, airports owned by city or airport/port authorities have
9.6% higher variable cost due to cost inefficiency compared to county or state-owned
airports. Moreover, for medium airports owned by the city or airport/port
authorities, multiple-airport cities have 8.6% higher variable costs compared to
single-airport cities. They do not find much difference in cost efficiency between
medium and large hubs.

Deregulation

Finally, we review the impact of the 1978 Airline Deregulation Act (ADA) on the
market. Many studies focus on analyzing how the US market changed ex-post the
ADA, and on how the US market progressed in comparison to other similar but
more regulated markets, such as the European one. In early work on the former,
Sickles et al. [87] develop a model of allocative distortions, and find that, during the
period of deregulation, the airline industry decreased both the total cost of allocative
distortions and the relative level of input and output allocative distortions.
For the period of deregulation in the USA, Good et al. [52] compare the
technical efficiency and productivity growth of the four largest European carriers
with eight of their American counterparts. The period analyzed is between 1976
and 1986, which is particularly useful since it starts at the beginning of a period
in which informal steps were taken toward deregulation in America, and ends just
prior to deregulation reforms in Europe. Based on Cornwell et al. [38], they find
a rather large gap in efficiency which, if eliminated, would translate into savings
of approximately $4.5 billion per year. The subsequent studies in the literature
focused on finding possible explanations for this gap.
Prior to the deregulation reforms in Europe, the European carriers, mostly state
owned, were sheltered from competition and heavily subsidized, and the fares to
international destinations were set by bilateral treaties. During this period, the
average fares in Europe were consistently higher than the fares in the USA for routes
of similar distance. To explain the price difference, Captain and Sickles [28] study
European airlines in an oligopolistic structure with product differentiation, and find
evidence of price/cost markups, mainly driven by the technically inefficient use of
inputs, in particular of labor, whose wages were in excess of its marginal
revenue product. In complementary work, Captain and Sickles [28] and Postert and Sickles
[82] test the extent to which market concentration also plays a role in explaining
higher fares. Interestingly, despite inelastic demand, there was little evidence of
large deviations from competitive pricing or of monopolistic market concentration as an
explanation of the higher fares in Europe. This result further strengthens previous
findings that the gap in efficiency was mainly the result of differences in the cost
structure of both industries.
A more recent study that revisits the efficiency comparison of the USA and
Europe is Good et al. [53]. The authors develop a dynamic industry model to test
the counterfactual of how the European industry would have developed, had they
deregulated in 1979 alongside the USA. Through simulation, they solve for optimal
levels of employment, network, and fleet size for the period 1979–1990. The results
reveal various sources of forgone earnings, mainly the need for European carriers to
expand their networks in order to take advantage of returns to density. A first policy
in this direction could be an increase in alliances and further use of code sharing.
Now, when looking only at the US market, it is of interest to see how the industry
in general developed after the deregulation and to empirically test the relationship
between an increase in competition and technical efficiency. Throughout the
deregulation period and after, we expect two patterns in the time series of the
airlines’ efficiency scores: they should be cointegrated and they should
converge. To see why we expect these patterns, consider a market with two airlines.
If efficiency-enhancing practices are adopted by one airline but not followed by the
other, the efficiency scores of these two airlines would move apart. In time, the
airline that fails to follow the innovation is eventually driven out of the market. Hence,
in the long run, the efficiency scores should be cointegrated among the remaining
airlines in a market. Second, the efficiency scores should converge as increases in
competition force airlines to close efficiency gaps. In line with this argument, Alam
and Sickles [4] analyze the time series of technical efficiency of US airlines, using
quarterly observations from 1970 to 1990.
As an efficiency measure, the authors use an output-based distance function
(OD). In order to define OD, let $y$ denote a vector of outputs of length
$K$ and $x$ a vector of inputs of length $J$. The production technology
is $S = \{(x, y) \mid x \in \mathbb{R}^J_+,\ y \in \mathbb{R}^K_+,\ (x, y) \text{ is feasible}\}$, and OD is defined as
$OD(x, y) = \min\{\lambda \mid (x, y/\lambda) \in S\}$. Notice that an output-efficient firm has a score
of one. Likewise, an output-inefficient firm has $OD(x, y) < 1$. The authors use two
methods to estimate the model: first data envelopment analysis (DEA),
and then the free disposal hull (FDH) method. Since these methods are related, we
only explain DEA and point out the differences where they exist.
The DEA is a linear programming method where we solve:

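$$
\begin{aligned}
&[OD(x_{nt}, y_{nt})]^{-1} = \max \lambda_{nt}, \quad \text{subject to:} \\
&\lambda_{nt}\, y_{knt} \le \sum_{n}\sum_{t} w_{nt}\, y_{knt}, \quad k = 1, \dots, K, \\
&\sum_{n}\sum_{t} w_{nt}\, x_{jnt} \le x_{jnt}, \quad j = 1, \dots, J, \\
&w_{nt} \ge 0, \quad n = 1, \dots, N;\ t = 1, \dots, T,
\end{aligned}
\tag{11}
$$

where $w_{nt}$ stands for weights and where the conditions imposed give constant
returns to scale (CRS). The FDH frontier is obtained by replacing the last restriction
by $\sum_{n}\sum_{t} w_{nt} = 1$, $w_{nt} \in \{0, 1\}$, $\forall\, n, t$.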
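Because the FDH restriction selects exactly one observed unit as the reference, the FDH version of the output distance function can be computed by enumeration instead of linear programming. The Python sketch below is a minimal illustration under that reading; the function name and the data are hypothetical, outputs are assumed strictly positive, and it is not the authors' code.

```python
def fdh_output_distance(inputs, outputs, o):
    """Output distance OD for unit `o` under the free disposal hull.

    inputs, outputs : lists of per-unit input and output vectors
    o               : index of the unit being evaluated

    Among all observed units that use no more of any input than unit o,
    find the largest factor by which unit o's outputs could be scaled up
    while staying weakly below that unit's outputs. OD is its reciprocal.
    """
    x_o, y_o = inputs[o], outputs[o]
    best = 0.0
    for x_j, y_j in zip(inputs, outputs):
        if all(a <= b for a, b in zip(x_j, x_o)):          # unit j dominates in inputs
            scale = min(yj / yo for yj, yo in zip(y_j, y_o))
            best = max(best, scale)
    return 1.0 / best   # unit o itself guarantees best >= 1, so OD <= 1


# Hypothetical data: three airline-period observations, one input, one output.
inputs = [[2.0], [4.0], [4.0]]
outputs = [[2.0], [4.0], [3.0]]
scores = [fdh_output_distance(inputs, outputs, i) for i in range(len(inputs))]
```

In this toy dataset the third observation uses the same input as the second but produces less output, so its score is 0.75 while the other two observations lie on the FDH frontier with a score of one.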
In a second stage, the estimates from DEA or FDH are
regressed on firm characteristics, along with firm and time trends. Here,
the residuals provide the appropriate performance measure.
For the co-integration test, the authors use the Dickey-Fuller test, and for
convergence, they make use of a Malmquist productivity index procedure. This
method accounts for changes both in technical efficiency (catching-up effect)
and in the frontier technology (innovation). For this last approach, we need
to calculate output distance functions between periods. Letting

$$
OD^{t}(x_{t+1}, y_{t+1}) = \min\{\lambda \mid (x_{t+1}, y_{t+1}/\lambda) \in S_t\},
$$

and

$$
OD^{t+1}(x_t, y_t) = \min\{\lambda \mid (x_t, y_t/\lambda) \in S_{t+1}\},
$$

the Malmquist productivity index is then defined as:

$$
\begin{aligned}
M(x_{t+1}, y_{t+1}, x_t, y_t) &= \frac{OD^{t+1}(x_{t+1}, y_{t+1})}{OD^{t}(x_t, y_t)} \left[ \frac{OD^{t}(x_{t+1}, y_{t+1})}{OD^{t+1}(x_{t+1}, y_{t+1})} \, \frac{OD^{t}(x_t, y_t)}{OD^{t+1}(x_t, y_t)} \right]^{1/2} \\
&= E_{t+1} \times A_{t+1},
\end{aligned}
$$

where $E_{t+1}$ reflects the change in relative efficiency, and $A_{t+1}$ the change in
technology, between $t$ and $t + 1$. The authors find evidence that the US airline
industry is consistent with both co-integration and convergence of the efficiency
scores.
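The decomposition can be checked numerically with a short Python sketch. The four distance function values are hypothetical inputs that would in practice come from DEA or FDH, and the function and argument names are assumptions made for illustration.

```python
import math


def malmquist(d_t_t, d_t_t1, d_t1_t, d_t1_t1):
    """Malmquist index and its two components.

    Argument d_a_b stands for OD^a(x_b, y_b): the distance of the period-b
    observation measured against the period-a technology.
    Returns (M, efficiency_change, technical_change).
    """
    efficiency_change = d_t1_t1 / d_t_t                        # E_{t+1}: catching up
    technical_change = math.sqrt((d_t_t1 / d_t1_t1) * (d_t_t / d_t1_t))  # A_{t+1}
    return efficiency_change * technical_change, efficiency_change, technical_change


# Hypothetical distance values for one airline between t and t+1.
M, E_change, A_change = malmquist(d_t_t=0.8, d_t_t1=1.1, d_t1_t=0.7, d_t1_t1=0.9)

# Equivalent geometric-mean form of the same index, used as a cross-check.
M_alt = math.sqrt((1.1 / 0.8) * (0.9 / 0.7))
```

A value of M above one indicates productivity growth; here both components exceed one, so the airline both moves closer to the frontier and benefits from an outward shift of the frontier.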
Nevertheless, although the effect of deregulation has been positive (the industry
has shown lower fares, and an increase in non-stop flights and productivity [84]),
the quality of the services has not fared as well. For example, an increasing reliance
on a hub and spoke system can bring inconvenience to consumers. Färe et al.
[42] consider service quality effects on consumers when estimating total factor
productivity (TFP), and find that accounting for service quality lowers the rate of
TFP growth. This is not to say that a re-regulation of the industry would be advised, but
rather that further care should be taken to consider the effects on the quality of
services when analyzing the impact of deregulation on markets.

Conclusions

Our Handbook Chapter has provided an overview of the empirical literature that
speaks to the productivity and efficiency of the US and International Airline
Industry. As with any survey of such a vast literature, coverage of all papers and
of all topics was not possible. We have tried to focus our attention on issues and
empirical findings that speak to the theme of this Handbook Volume II but leveraged
by the modelling approaches and theoretical perspectives from Volume I of this
Handbook. We trust we have succeeded in this endeavor.

References
1. Abrate G, Erbetta F (2010) Efficiency and patterns of service mix in airport companies: an
input distance function approach. Transp Res E 46(5):693–708
2. Aguirregabiria V, Ho CY (2010) A dynamic game of airline network competition: hub-and-
spoke networks and entry deterrence. Int J Ind Organ 28(4):377–382
3. Aguirregabiria V, Ho CY (2012) A dynamic oligopoly game of the U.S. airline industry:
estimation and policy experiments. J Econom 168(1):156–173
4. Alam IMS, Sickles RC (2000) Time series analysis of deregulatory dynamics and technical
efficiency: the case of the U.S. airline industry. Int Econ Rev 41(1):203–218
5. Alam IMS, Ross LB, Sickles RC (2001) Time series analysis of strategic pricing behavior in
the US airline industry. J Prod Anal 16:49–62
6. Amsler C, Prokhorov A, Schmidt P (2016) Endogeneity in stochastic frontier models.
J Econom 190(2):280–288
7. Aravena O, Basso LJ, Figueroa N (2019) Effects of asymmetric information on airport
congestion management mechanisms. Int J Ind Organ 62(C):4–27
8. Armantier O, Richard O (2008) Domestic airline alliances and consumer welfare. Rand J Econ
39(3):875–904
9. Aryal G, Murry C, Williams JW (2018) Price discrimination in international airline markets.
SSRN Working Paper
10. Bajari P, Benkard CL, Levin J (2007) Estimating dynamic models of imperfect competition.
Econometrica 75(5):1331–1370
11. Bamberger GE, Carlton DW, Neumann LR (2004) An empirical investigation of the competi-
tive effects of domestic airline alliances. J Law Econ 47(1):195–222
12. Basu S, Fernald JG (1997) Returns to scale in U.S. production: estimates and implications.
J Polit Econ 105(2):249–283
13. Battese GE, Coelli TJ (1992) Frontier production functions, technical efficiency and panel data
with application to paddy farmers in India. J Prod Anal 3:153–169
14. Benkard CL, Bodoh-Creed A, Lazarev J (2019) Simulating the dynamic effects of horizontal
mergers: US. Working Paper
15. Berry S, Jia P (2010) Tracing the woes: an empirical analysis of the airline industry. Am Econ
J: Microecon 2(August):1–43
16. Borenstein S (1989) Hubs and high fares: dominance and market power in the U.S. airline
industry. Rand J Econ 20:344–365
17. Borenstein S (2011) Why can’t U.S. airlines make money? AEA Pap Proc 101(3):233–237
18. Borenstein S, Netz J (1999) Why do all the flights leave at 8 am?: Competition and departure-
time differentiation in airline markets. Int J Ind Organ 17:611–640
19. Borenstein S, Rose NL (1994) Competition and price dispersion in the U.S. airline industry.
J Polit Econ 102(4):653–683
20. Borenstein S, Rose NL (1995) Bankruptcy and pricing behavior in U.S. airline markets. AEA
Pap Proc 85(2):397–402
21. Borenstein S, Rose NL (2003) The impact of bankruptcy on airline service levels. AEA Pap
Proc 93:415–419
22. Bottasso A, Conti M, Piga C (2012) Low cost carriers and airports’ performance: empirical
evidence from a panel of UK airports. Ind Corp Change 22:745–769
23. Bresnahan TF (1982) The oligopoly solution is identified. Econ Lett 10:87–92
24. Bresnahan TF (1989) Studies of industries with market power. In: Schmalensee R, Willig R
(eds) The handbook of industrial organization, vol 2. Elsevier B.V, Amsterdam, pp 1011–1057
25. Brueckner JK, Proost S (2010) Carve-outs under airline antitrust immunity. Int J Ind Organ
28(6):657–668
26. Brueckner J, Dyer N, Spiller PT (1992) Fare determination in airline hub-and-spoke networks.
Rand J Econ 23:309–333
27. Brueckner JK, Lee DN, Singer ES (2011) Alliances, codesharing, antitrust immunity, and
international airfares: do previous patterns persist? J Comp Law Econ 7(3):573–602
28. Captain PF, Sickles RC (1997) Competition and market power in the European airline industry:
1976–90. Manag Decis Econ 18(3):209–225
29. Chakrabarty D, Kutlu L (2014) Competition and price dispersion in the airline markets. Appl
Econ 46:3421–3436
30. Chandra A, Lederman M (2018) Revisiting the relationship between competition and price
discrimination. Am Econ J Microecon 10(2):190–224
31. Ciliberto F, Schenone C (2012) Bankruptcy and product-market competition: evidence from
the airline industry. Int J Ind Organ 30:564–577
32. Ciliberto F, Tamer E (2009) Market structure and multiple equilibria in airline markets.
Econometrica 77(6):1791–1828
33. Ciliberto F, Williams JW (2010) Limited access to airport facilities and market power in the
airline industry. J Law Econ 53(3):467–495
34. Ciliberto F, Williams JW (2014) Does multimarket contact facilitate tacit collusion? Inference
on conduct parameters in the airline industry. Rand J Econ 45(4):764–791
35. Ciliberto F, Zhang Z (2017) Multiple equilibria and deterrence in airline markets. Econ Inq
55(1):319–338
36. Ciliberto F, Murry C, Tamer E (2018) Market structure and competition in airline markets.
SSRN Working Paper
37. Ciliberto F, Watkins E, Williams JW (2019) Collusive pricing patterns in the U.S. airline
industry. Int J Ind Organ 62:136–157
38. Cornwell C, Schmidt P, Sickles RC (1990) Production frontiers with time-series variation in
efficiency levels. J Econom 46:185–200
39. Craig S, Airola J, Tipu M (2012) General purpose or special district governance? Technical
efficiency vs. rent dissipation in airport finances. Public Finance Rev 40:712–735
40. Duygun M, Kutlu L, Sickles RC (2016) Measuring productivity and efficiency: a Kalman filter
approach. J Prod Anal 46:155–167
41. Fang Y, Sickles RC (2007) A dynamic model of airline competition. Rev Netw Econ 6(3):355–
371
42. Färe R, Grosskopf S, Sickles RC (2007) Productivity? of U.S. airlines after deregulation.
J Transp Econ Pol 41(1):93–112
43. Forbes SJ (2008) The effect of air traffic delays on airline prices. Int J Ind Organ 26:
1218–1232
44. Forbes SJ, Lederman M (2009) Adaptation and vertical integration in the airline industry. Am
Econ Rev 99(5):1831–1849
45. Forbes SJ, Lederman M (2010) Does vertical integration affect firm performance? Evidence
from the airline industry. Rand J Econ 41(4):765–790
46. Forbes SJ, Lederman M, Tombe T (2015) Quality disclosure programs and internal organiza-
tional practices: evidence from airline flight delays. Am Econ J Microecon 7(2):1–26
47. Forbes SJ, Lederman M, Yuan Z (2017) Do airlines pad their schedules? Working Paper
48. Forbes SJ, Lederman M, Wither MJ (2019) Quality disclosure when firms set their own quality
targets. Int J Ind Organ 62:228–250
49. Gayle PG (2013) On the efficiency of codeshare contracts between airlines: is double
marginalization eliminated? Am Econ J Microecon 5(4):244–273
50. Gerardi KS, Shapiro AH (2009) Does competition reduce price dispersion? New evidence from
the airline industry. J Polit Econ 117(1):1–37
51. Goetz CF, Shapiro AH (2012) Strategic alliance as a response to the threat of entry: evidence
from airline code sharing. Int J Ind Organ 30(6):735–747
52. Good DH, Nadiri MI, Röller LH, Sickles RC (1993) Efficiency and productivity growth
comparisons of European and U.S. air carriers: a first look at the data. J Prod Anal 4(1–2):
115–125
53. Good DH, Sickles RC, Weiher JC (2008) A hedonic price index for airline travel. Rev Income
Wealth 54(3):438–465
54. Goolsbee A, Syverson C (2008) How do incumbents respond to the threat of entry on their
networks? Evidence from the major airlines. Q J Econ 123(4):1611–1633
55. Greene WH (2005a) Fixed and random effects in stochastic frontier models. J Prod Anal 23:
7–32
56. Greene WH (2005b) Reconsidering heterogeneity in panel data estimators of the stochastic
frontier model. J Econom 126:269–303
57. Hicks JR (1935) Annual survey of economic theory: the theory of monopoly. Econometrica
3:1–20
58. Howell C, Grifell-Tatjé E (2019) Unbundled airfares: a comprehensive analysis. Working paper
59. Karakaplan MU, Kutlu L (2017a) Handling endogeneity in stochastic frontier analysis. Econ
Bull 37:889–901
60. Karakaplan MU, Kutlu L (2017b) Endogeneity in panel stochastic frontier models: an
application to the Japanese cotton spinning industry. Appl Econ 49:5935–5939
61. Kutlu L (2010) Battese-Coelli estimator with endogenous regressors. Econ Lett 109:79–81
62. Kutlu L, McCarthy P (2016) U.S. airport governance and efficiency. Transp Res E Logist
Transp Rev 89:117–132
63. Kutlu L, Sickles RC (2012) Estimation of market power in the presence of firm level
inefficiencies. J Econom 168:141–155
64. Kutlu L, Sickles RC (2017) Measuring market power when firms price discriminate. Empir
Econ 53(1):287–305
65. Kutlu L, Wang R (2018a) Estimation of cost efficiency without cost data. J Prod Anal 49:
137–151
66. Kutlu L, Wang R (2018b) Price dispersion, competition, and efficiency - evidence from
Chicago based routes. J Transp Econ Pol 52(1):23–44
67. Kutlu L, Tran KC, Tsionas MG (2019) A time-varying true individual effects model with
endogenous regressors. J Econom 211(2):539–559
68. Lau LJ (1982) On identifying the degree of competitiveness from industry price and output
data. Econ Lett 10:93–99
69. Lazarev J (2013) The welfare effects of intertemporal price discrimination: an empirical
analysis of airline pricing in U.S. monopoly markets. Working Paper
70. Lederman M (2007) Do enhancements to loyalty programs affect demand? The impact of
international frequent flyer partnerships on domestic airline demand. Rand J Econ 38(4):1134–
1158
71. Li S, Roberts J, Mazur J, Park Y, Zhang J (2018) Endogenous and selective service choices
after airline mergers. NBER Working Paper 24214
72. Martin JC, Voltes-Dorta A (2011) The econometric estimation of airports’ cost function. Transp
Res B 45:112–127
73. Martin JC, Roman C, Voltes-Dorta A (2011) Scale economies and marginal costs in Spanish
airports. Transp Res E 47:238–248
74. Morrison SA, Winston C (2007) Another look at airport congestion pricing. AEA Pap Proc
97(5):1970–1977
75. Motta M (2004) Competition policy. Theory and practice. Cambridge University Press,
Cambridge, MA
76. Nevo A (2000) Mergers with differentiated products: the case of the ready-to-eat cereal
industry. Rand J Econ 31(3):395–421
77. Oum T, Yu C, Fu X (2003) A comparative analysis of productivity performance of the world’s
major airports: summary report of the ATRS global airport benchmarking research report—
2002. J Air Transp Manag 9:285–297
78. Oum T, Adler N, Yu C (2006) Privatization, corporatization, ownership forms and their effects
on the performance of the world’s major airports. J Air Transp Manag 12:109–121
79. Oum T, Yan J, Yu C (2008) Ownership forms matter for airport efficiency: a stochastic frontier
investigation of worldwide airports. J Urban Econ 64:422–435
80. Parise G (2018) Threat of entry and debt maturity: evidence from airlines. J Financ Econ
127(2):226–247
81. Perloff JM, Karp LS, Golan A (2007) Estimating market power and strategies. Cambridge
University Press, Cambridge, MA
82. Postert AK, Sickles RC (1999) Competition in the European airline industry. In: Gaudry M,
Mayes R (eds) Taking stock of air liberalization. Centre for Research on Transportation 25th
Anniversary Series 1971–1996. Springer, Boston, pp 39–57
83. Röller LH, Sickles RC (2000) Capacity and product market competition: measuring market
power in a ‘fat-cat’ industry. Int J Ind Organ 18:845–865
84. Rose NL (2012) After airline deregulation and Alfred E. Kahn. AEA Pap Proc 102(3):376–380
85. Rotemberg JJ, Saloner G (1986) A supergame-theoretic model of price wars during booms.
Am Econ Rev 76:390–407
86. Sickles RC (1985) A nonlinear multivariate error-components analysis of technology and
specific factor productivity growth with an application to the U.S. airlines. J Econom 27:61–78
87. Sickles RC, Good DH, Johnson RL (1986) Allocative distortions and the regulatory transition
of the U.S. airline industry. J Econom 33:143–163
88. Vasigh B, Erfani GR, Sherman BW (2014) Airport performance and ownership structure:
evidence from the United Kingdom, United States, and Latin America. J Aviat Technol Eng
4:40–49
89. Voltes-Dorta A, Lei Z (2013) The impact of airline differentiation on marginal cost pricing at
UK airports. Transp Res A 55:72–88
90. Wang HJ, Ho CW (2010) Estimating fixed-effect panel stochastic frontier models by model
transformation. J Econom 157:286–296
91. Zhang Z, Ciliberto F, Williams J (2017) Effects of mergers and divestitures on airline fares.
Transp Res Rec 2603:98–104
92. Zhao Q, Choo YY, Oum TH (2014) The effect of governance forms on North American airport
efficiency: a comparative analysis of airport authority vs. government branch. Transp Res
Forum 53:93–110
Globalization, Innovation, and Productivity
28
Shunan Zhao and Man Jin

Contents
Introduction
International Trade and Productivity
  Modeling Productivity and Trade Effects
FDI and Productivity
  Learning and Spillovers
  Persistent Benefits of FDI
  Empirical Models to Measure FDI Spillovers
Innovation and Productivity
  Innovation Measures
  R&D Measure and Modeling
  Financial Constraints, R&D, and Productivity
Conclusion
Cross-References
References

Abstract

We comprehensively review previous studies examining the effects of innovation
and globalization on productivity. Innovation is measured by R&D investment,
patents, and novel business models and production practices. Globalization
takes the form of international trade in goods and services and foreign direct
investment (FDI). In our review, we first present the mechanisms through which
a form of innovation or globalization theoretically affects productivity, and
then evaluate different model settings. We also summarize the empirical
evidence from both macro and micro analyses, such as the heterogeneous effects
on productivity across different sectors and economic regions. Lastly, we discuss
the related measurement, data source, identification, and modeling issues.

S. Zhao · M. Jin
Department of Economics, Oakland University, Rochester, MI, USA

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_28

Keywords

Productivity · Trade · FDI · R&D · Patents · Financial constraints

Introduction

In 2017, the G7 countries (Canada, France, Germany, Italy, Japan, the United
Kingdom, and the United States), the seven largest advanced economies in the
world, accounted for about 80% of the world’s research and development
(R&D) expenditures, according to data from the United Nations Educational, Scientific
and Cultural Organization. R&D is considered a key driver of these countries’
economic and productivity growth. In the same year, the World Bank reported
that the five countries/regions with the fastest real GDP growth were Libya,
Guinea, Ethiopia, Macao, and Maldives. China, the growth “star” of the past four
decades, ranked 16th with an annual growth rate of 6.90%. The traditional
catch-up effect helps explain the fast growth in these underdeveloped economies,
but, perhaps more importantly, most of these countries are very export-oriented. For
example, the export-to-GDP ratio is 67.6% in Maldives, which is more than twice
the world average. Exporting, along with other globalization initiatives, gives
these countries more opportunities to learn cutting-edge technologies from
abroad, and the export-fueled “scale effect” primarily explains their fast pace of
growth.
Examining the relationship among globalization, innovation, and productivity is
one of the most important topics in economics. Globalization is a process of growing
interdependence of world economies, brought about by cross-border trade (exports and
imports) in goods and services, technology, and flows of investment, people,
and information. Innovation can be broadly defined as “production or adoption,
assimilation, and exploitation of value-added novelty in economic and social
spheres; renewal and enlargement of products, services, and markets; development
of new methods of production; and the establishment of new management systems”
[19]. According to the theory of production, total factor productivity is the part of
output that cannot be explained by the inputs. It includes two components: one is
the temporary productivity shock due to the macroeconomic environment, and the other
is persistent productivity due to technological innovation, managerial efficiency,
industrial organization, globalization, etc. Growth in total factor productivity
is largely explained by growth in persistent productivity [65].
In this chapter, we review studies examining the effects of innovation and
globalization on productivity. Innovation is measured by R&D investment, patents,
novel business models, and production practices. Globalization takes the form of
international trade in goods and services and foreign direct investment. Theoretically,
most economists agree that globalization in general has a positive effect
on productivity, although the effect is heterogeneous across business
sectors and economic regions. What attracts most economists’ attention is the
mechanisms through which globalization facilitates productivity growth. Here we
discuss various mechanisms in the literature. We also summarize empirical
evidence, from both macro and micro analyses, and discuss related measurement,
identification, and modeling issues. While it is impossible to include all influential
studies on this long-standing topic, we structure the chapter according to our own
knowledge and research interests, hoping that it will provide guidance for future
research.

International Trade and Productivity

The relationship between international trade/trade policies and productivity has
been studied extensively for many years. There are now relatively more discussions
and examinations at the macro level than in previous years. Over the last two
to three decades, owing to the increased availability of international trade data and
firm-level datasets in various countries and industries, firm- or plant-level studies
have become more common. In general, there are two opposing views on this
relationship. Advocates of trade liberalization argue that international trade can
increase productivity at the plant, industry, and country levels through multiple
channels, but critics argue that the removal of inward trade barriers could be a
double-edged sword that may hurt the domestic economy. Although the two views
get almost equal attention in public news and political debates, the former
dominates the latter in academic studies.
To be more specific, economic theories suggest that international trade has
positive effects on productivity through the following channels:

1. Competition: exposure to international trade increases competition in the product
market. This gives firms more incentive to improve existing production technologies,
for example by investing more in R&D or adopting
better corporate management practices [6]. This implicit “challenge response”
mechanism should work through both export expansion and import liberalization.
2. Reallocation: competition also reduces markups and decreases a firm’s profit
margin [47]. Through competition, trade liberalization forces the least productive
firms to exit and reallocates market shares toward more productive exporting
firms (e.g., [52, 56]). Therefore, the aggregate productivity at the sector level
increases with trade.
3. Learning: exporting firms may benefit from their foreign buyers’ technical and
managerial expertise or the expertise of other foreign contacts (e.g., competitors,
suppliers, or scientific agents) through the supply chain. Exporters can learn
from product quality improvements, shipment size, or, even more directly, from
specific investment requirements. This mechanism, whereby firms improve their
productivity after entering export markets, is called “learning by exporting.” The
importance of the learning effect goes beyond an academic issue and has been
used to review government aid for the internationalization of firms.
4. Input access: international trade allows firms to get access to high-quality and
cheaper equipment and intermediate inputs [5]. Therefore, firms can produce
output of high quality at a lower cost. More importantly, imported equipment and
intermediate goods may embody better production technologies and the import of
specialized inputs might facilitate learning about the product, spurring imitation
or innovation of a competing product.
5. Scale economies: exposure to international trade means a larger market for
domestic firms. The existence of scale economies implies that the market widened
by trade should lead to reductions in production costs and increases in
productivity. The argument is usually made in terms of the benefits of expansion
in demand through increased exports.
6. Scope economies: with rapid progress of communication and technology, pro-
duction processes increasingly involve global value chains spanning across
multiple countries, with different stages of the production taking place in
several countries or regions. International trade provides opportunities for further
deepening product specialization, both vertically and horizontally, which in turn
increases productivity.

According to the above channels, international trade can increase the productivity of
both trading and non-trading firms, whether they import or export. Countries at
all stages of development can benefit from international trade.
However, some policy makers and researchers have also expressed their concern
that, depending on its impact on competition and cost reducing incentives of
producers in the medium to long run [54], international trade, even excessive export,
may sabotage the development of an economy and decrease firm productivity.
Import substitution industrialization (ISI), which advocates replacing imports with
domestic production, is a trade and economic policy used widely in developing
countries. It protects the “infant industries,” which, by definition, are in their early
stages and are incapable of competing with established competitors abroad until
they have grown up and become internationally competitive. Export expansion
policies such as excessive export subsidies may distort incentives and lead to
increasing inefficiency, such as the overuse of inputs. In addition, skeptics argue that
domestic firms may not be able to realize productivity gains because they are unable
to adapt foreign technologies to local methods of production or because domestic
firms face binding credit constraints that prevent the expansion of efficient industries
as well as investment in new technology.
Empirically, to identify the effects of international trade on productivity, one
needs to address the endogeneity of trade. Early studies in development
economics consider a cross-country regression of a productivity measure (labor
productivity or total factor productivity) on the export-to-GDP ratio. Such regressions
typically find a moderately positive relationship. However, as pointed out by
Frankel and Romer [28], Alcalá and Ciccone [4], and many others, this relationship
may not reflect a causal effect of trade on productivity. The problem is that the trade
share may be endogenous because countries with higher productivity, for reasons
other than trade, may trade more. Using measures of countries’ trade policies in
place of, or as an instrument for, the trade share in the regression does not solve
the problem, since trade policies may also be correlated with determinants of aggregate
productivity omitted from the regression.
Frankel and Romer [28] argue that countries’ geographic characteristics have
important effects on trade and are plausibly uncorrelated with other determinants
of productivity and income. They construct measures of the geographic component
of countries’ trade and use those measures to obtain instrumental-variable estimates
of the effect of trade on productivity. The results provide no evidence that ordinary
least-squares estimates overstate the effect of trade. However, Irwin and Tervio [38]
find that Frankel and Romer’s [28] result is not robust to the inclusion of latitude, or
additional geographic variables, in the regression. Rodriguez and Rodrik [60] argue
that the geographic attributes may be proxies of institutional quality. Therefore, the
captured positive effect might not be due to trade. Alcalá and Ciccone [4] use a new
measure of trade which eliminates distortions due to cross-country differences in the
relative price of non-tradable goods, and simultaneously addresses the endogeneity
of trade and missing control of institutional quality. They still find that international
trade has an economically significant and statistically robust positive effect on
productivity. To sum up, although most economists argue that international trade
has positive effects on productivity, which has been demonstrated by different
theoretical models [24, 52], it is not easy to find convincing empirical evidence
to support the argument using macro-level data.
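The instrumental-variable logic used in this literature can be illustrated with a small simulation. The sketch below is not a replication of any study cited above: the data are simulated, the variable names (`geo`, `trade`, `log_prod`, `u`) and all coefficients are hypothetical, and the two-stage least squares routine is a bare-bones NumPy implementation. It assumes a single unobserved determinant `u` that raises both the trade share and productivity, so OLS is biased upward by construction; in actual data the sign and size of any bias are empirical questions.

```python
import numpy as np

def ols(y, X):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_sls_slope(y, x_endog, z):
    """2SLS slope of y on x_endog, instrumented by z (constant included)."""
    n = len(y)
    Z = np.column_stack([np.ones(n), z])
    # First stage: project the endogenous regressor on the instrument.
    x_hat = Z @ ols(x_endog, Z)
    # Second stage: regress y on the first-stage fitted values.
    X_hat = np.column_stack([np.ones(n), x_hat])
    return ols(y, X_hat)[1]

rng = np.random.default_rng(0)
n = 20_000
geo = rng.normal(size=n)   # hypothetical geographic component of trade
u = rng.normal(size=n)     # unobserved determinant of productivity
# The trade share depends on geography and on the unobserved determinant.
trade = 0.8 * geo + 0.6 * u + rng.normal(size=n)
true_effect = 0.5          # assumed causal effect of trade on log productivity
log_prod = true_effect * trade + 1.0 * u + rng.normal(size=n)

beta_ols = ols(log_prod, np.column_stack([np.ones(n), trade]))[1]
beta_iv = two_sls_slope(log_prod, trade, geo)
print(f"OLS slope: {beta_ols:.3f}, 2SLS slope: {beta_iv:.3f}")
```

In this setup the instrument is valid only because `geo` is generated independently of `u`. The critique that geography may proxy for institutional quality corresponds to a correlation between the two, in which case the 2SLS estimate would be biased as well.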
Micro-econometric analyses of plant-level datasets provide us with better iden-
tification settings for the effects of trade on productivity. Early studies, such as
Bernard and Jensen [9] and Aw et al. [7], explore firm/plant-level datasets from
countries/regions including the USA, Taiwan, Colombia, Mexico, and Morocco,
and find that more productive firms self-select into foreign markets. There is no
evidence that exporting increases the growth rates of plant productivity. At the
industry level, aggregate productivity benefits from the resource reallocations from
the less productive firms to the more productive ones. To address the endogeneity
issue of a firm’s trade decision, recent studies often rely on certain exogenous policy
changes, since, in contrast to the nexus of international trade, trade policies, and
aggregate productivity at the macro level, a firm’s trade behavior is more likely to be
the only channel through which trade policy changes affect the firm’s productivity.
For example, Pavcnik [56] investigates the effects of trade on plant productivity
with the help of trade liberalization in Chile. She also distinguishes between traded-
goods and nontraded-goods sectors to separate productivity effects stemming from
liberalized trade from the productivity variation stemming from other sources.
Baldwin and Gu [8] examine how Canadian manufacturing plants responded to
reductions in tariff barriers between Canada and the rest of the world between 1984 and
1996. Muendler [53] studies how reduced inward trade barriers affected productivity
in Brazil from 1986 to 1998. Amiti and Konings [5] estimate the productivity
gains from reducing tariffs on final goods and intermediate inputs in Indonesia.
De Loecker [22] studies the productivity effects of reduced trade protection in
the Belgian textile market. Topalova and Khandelwal [61] examine the causal link
between changes in tariffs and firm productivity in India. Yu [64] explores how
reduced tariffs on imported inputs and final goods affect the productivity of large
Chinese trading firms. All these studies find significantly positive effects of trade on
firm-level productivity, although the magnitude varies by country and method.

Modeling Productivity and Trade Effects

In this subsection, we summarize common approaches to modeling productivity and the
effects of international trade on productivity. These approaches are quite general in
the sense that they can be easily modified to study the effects of other productivity
determinants, such as FDI and R&D, which we discuss later in this chapter.
Here, we describe a general framework for modeling the relationship between
productivity and its determinants, leaving FDI- and R&D-specific estimation issues
to their respective sections.
The standard approach to empirically examining the relationship between pro-
ductivity and its determinants, such as globalization and innovation, usually has two
steps: first, productivity is estimated or calculated from certain methods; second,
the estimated/calculated productivity is regressed on a set of control variables
including the proxies of globalization or innovation. For robustness checks, multiple
productivity measures may be used.
In the process, there are at least three widely used productivity measures. The
first one is labor productivity, which is obtained by dividing value added by the
number of hours worked or the number of workers. Labor productivity is simple
to calculate, but it does not account for other inputs in a value-added production
process. Therefore, labor productivity may not be a good productivity measure for
capital-intensive production processes [13].
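A short numerical example may help show why labor productivity can mislead when capital intensity differs. The sketch below assumes a Cobb-Douglas value-added technology with a capital elasticity of 0.4; the functional form, the parameter value, and the two hypothetical firms are illustrative assumptions, not taken from the studies cited here. Both firms have the same TFP level and the same hours worked, yet the capital-intensive firm shows much higher labor productivity.

```python
def cobb_douglas(tfp, capital, labor, alpha=0.4):
    """Value added under an assumed Cobb-Douglas technology."""
    return tfp * capital**alpha * labor**(1 - alpha)

def labor_productivity(value_added, hours):
    """Value added per hour worked."""
    return value_added / hours

tfp = 1.0      # identical TFP level for both hypothetical firms
hours = 100.0  # identical hours worked

va_low_k = cobb_douglas(tfp, capital=50.0, labor=hours)
va_high_k = cobb_douglas(tfp, capital=500.0, labor=hours)

lp_low_k = labor_productivity(va_low_k, hours)
lp_high_k = labor_productivity(va_high_k, hours)

# With ten times the capital, labor productivity is higher by a factor of
# 10**0.4 even though TFP is the same in both firms.
ratio = lp_high_k / lp_low_k
print(f"labor productivity ratio: {ratio:.3f}")
```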
The other two most common approaches to measuring productivity are residual-based.
They begin by assuming a neoclassical production function:

$$Y = F(X; t), \qquad (1)$$

where Y is the single output in the production process, X is a K × 1 vector of inputs,
and t is a time index that allows the function to be shifted by technological innovations
or improvements. The total factor productivity (TFP) growth is the portion
of growth in output not explained by growth in traditionally measured inputs (X).
Without assuming a specific structure of F(·), we can write the TFP growth rate as:

$$g_{TFP} = \frac{\dot{Y}}{Y} - \sum_{k=1}^{K} \theta_k \frac{\dot{X}_k}{X_k} \qquad (2)$$

Here a dot over a variable denotes its total derivative with respect to time, and $\theta_k$
is the elasticity of output with respect to the kth factor input. If one replaces the time
derivatives with discrete changes and the output elasticities with average factor shares
of the current and previous periods, then Eq. (2) leads to the benchmark of the second
widely used productivity measure, the well-known Tornqvist TFP index. The
traditional Tornqvist index can be restrictive, and some refinements can be made
based on Eq. (2) to accommodate factors such as scale economies and imperfect
market competition [62]. Finally, if we assume the production function is Hicks-
neutral, we can rewrite Eq. (1) as:

$$Y = A \cdot G(X). \qquad (3)$$

Here, A provides the third productivity measure. Both A itself and its growth rate,
$g_{TFP} = \dot{A}/A$, can be calculated once the function G(X) has been estimated.
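The discrete approximation of Eq. (2) that underlies the Tornqvist index can be sketched in a few lines. The function below replaces time derivatives with log differences and output elasticities with the average of current and previous factor shares. The function name, the array layout, and the example data are hypothetical; the example assumes a Cobb-Douglas technology with constant shares of 0.6 and 0.4 and a TFP level growing by 0.02 log points per period, so the index should return that growth rate exactly.

```python
import numpy as np

def tornqvist_tfp_growth(output, inputs, shares):
    """Period-to-period log TFP growth.

    output: array of shape (T,)   output levels
    inputs: array of shape (T, K) input levels
    shares: array of shape (T, K) factor shares (each row sums to one)
    Returns an array of shape (T - 1,).
    """
    output = np.asarray(output, dtype=float)
    inputs = np.asarray(inputs, dtype=float)
    shares = np.asarray(shares, dtype=float)
    dln_y = np.diff(np.log(output))            # discrete analogue of Y-dot / Y
    dln_x = np.diff(np.log(inputs), axis=0)    # discrete analogue of X-dot / X
    avg_shares = 0.5 * (shares[1:] + shares[:-1])
    return dln_y - (avg_shares * dln_x).sum(axis=1)

# Hypothetical data: five periods, two inputs (labor and capital).
t = np.arange(5)
labor = np.array([100.0, 102.0, 105.0, 107.0, 110.0])
capital = np.array([50.0, 53.0, 55.0, 60.0, 62.0])
tfp_level = np.exp(0.02 * t)
output = tfp_level * labor**0.6 * capital**0.4
shares = np.tile([0.6, 0.4], (5, 1))

growth = tornqvist_tfp_growth(output, np.column_stack([labor, capital]), shares)
print(growth)
```

With observed rather than constructed data, the shares vary over time and the index only approximates the underlying TFP growth, which is one reason for the refinements mentioned above.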
Despite its popularity, the above two-step approach often provides an estimated
relationship between productivity and its determinants that is more appropriately
interpreted as correlation rather than causality. The reason is an
inconsistency between the first and second steps: the impact
of trade, FDI, or R&D is ignored (or, equivalently, assumed nonexistent) in the
productivity estimation, and then the estimated productivity is regressed on these
determinants in search of a connection. For the same reason, the two-step
approach ignores endogenous factor demands, which in turn leads to biased
estimates of output elasticities, the production function, and productivity. To see this
clearly, consider the logarithm of Eq. (3):

$$y = g(X) + \varepsilon \qquad (4)$$

where the lower-case letters stand for the log values of their upper-case counterparts
and $A = e^{\varepsilon}$. Besides random shocks, the error term $\varepsilon$ also captures the effects of
globalization and innovation on output, which may cause correlation between X and
$\varepsilon$, since input choices in production may be affected by firms’/countries’ trade and
innovation activities. Therefore, a simple ordinary least squares (OLS) regression
based on Eq. (4) will lead to biased estimates of g(X) and the output elasticities,
which in turn lead to biased productivity estimates.
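The bias can be seen in a small simulation. The sketch below assumes a linear g with a single log input and a made-up data-generating process in which the input responds to the productivity part of the error; all parameter values are illustrative. It compares naive OLS with an infeasible benchmark that controls for productivity directly.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
beta_true = 0.6                        # assumed output elasticity

omega = rng.normal(size=n)             # productivity part of the error, known to the firm
noise = rng.normal(scale=0.1, size=n)  # pure random shock
x = 0.8 * omega + rng.normal(size=n)   # log input chosen in response to productivity
y = beta_true * x + omega + noise      # Eq. (4) with linear g and error = omega + noise

def ols_slope(regressors, target):
    """Coefficient on the first regressor from an OLS fit with an intercept."""
    design = np.column_stack([np.ones(len(target)), regressors])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[1]

beta_naive = ols_slope(x, y)                                 # productivity left in the error
beta_controlled = ols_slope(np.column_stack([x, omega]), y)  # infeasible benchmark
print(beta_naive, beta_controlled)
```

Because the input is positively correlated with the omitted productivity term in this setup, the naive slope overstates the elasticity, while the benchmark recovers a value close to the assumed one.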
There are different ways to address the above endogeneity issue; the key
idea is to account for the effects of globalization and innovation on production
when productivity is estimated. One can estimate an augmented production function
(or input distance function) that incorporates productivity determinants; see
Eberhardt et al. [25] for an example. Alternatively, within the framework of Olley and
Pakes [55] and Levinsohn and Petrin [48], firm productivity and its determinants can
be modeled in a way that is robust to this issue. Assume $A_t = e^{\omega_t + \eta_t}$,
where log TFP is broken into two parts, $\omega_t$ and $\eta_t$, and the subscript t stands
for time. The term $\eta_t$ measures unpredictable random shocks to output, and $\omega_t$
is persistent productivity. The effects of globalization and innovation on production
can be incorporated by assuming that $\omega_t$ follows a first-order Markov process,
that is, $P(\omega_t \mid I_t) = P(\omega_t \mid \omega_{t-1}, G_{t-1}, R\&D_{t-1})$, where $I_t$ represents the information
available at time t, and $G_{t-1}$ and $R\&D_{t-1}$ are globalization and innovation
measures at period t − 1, respectively. With the above augmented productivity
evolution process, the production function as well as output elasticities can be
consistently estimated following the methods proposed by Ackerberg et al. [1] or
Gandhi et al. [29].
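To make the augmented Markov process concrete, the sketch below simulates a linear version of it and recovers the parameters by pooled least squares. It is deliberately simplified: it assumes a linear conditional mean and treats $\omega_t$ as observed, whereas in practice productivity is latent and has to be recovered with the proxy-variable methods cited above. All variable names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_firms, n_periods = 500, 6
rho, gamma_g, gamma_rd = 0.7, 0.2, 0.1       # assumed, illustrative parameters

g = rng.normal(size=(n_firms, n_periods))    # hypothetical globalization measure G
rd = rng.normal(size=(n_firms, n_periods))   # hypothetical innovation measure R&D
omega = np.zeros((n_firms, n_periods))
omega[:, 0] = rng.normal(size=n_firms)
for t in range(1, n_periods):
    xi = rng.normal(scale=0.1, size=n_firms)  # productivity innovation
    omega[:, t] = (rho * omega[:, t - 1] + gamma_g * g[:, t - 1]
                   + gamma_rd * rd[:, t - 1] + xi)

# Regress omega_t on (omega_{t-1}, G_{t-1}, R&D_{t-1}), pooling firms and periods
target = omega[:, 1:].ravel()
design = np.column_stack([
    np.ones(target.size),
    omega[:, :-1].ravel(),
    g[:, :-1].ravel(),
    rd[:, :-1].ravel(),
])
coef, *_ = np.linalg.lstsq(design, target, rcond=None)
rho_hat, gamma_g_hat, gamma_rd_hat = coef[1:]
print(rho_hat, gamma_g_hat, gamma_rd_hat)
```

The point of the sketch is only the timing: lagged globalization and innovation measures enter the conditional mean of current productivity, so their effects are part of the productivity process rather than a second-step regression.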

FDI and Productivity

Learning and Spillovers

The existence of productivity effects stemming from FDI is well documented in
the literature. Most economists, as well as policy makers, argue that FDI often
involves the transfer of knowledge from one country/firm to another [43], and that it
facilitates access to intangible productive “knowledge” assets from abroad,
such as new technology, proprietary know-how, more efficient and innovative
marketing and management practices, established relational networks, and
reputation, which can boost the productivity of domestic firms. Multinational enterprises
(MNEs) and their main vehicle, FDI, are often studied together since, by definition,
an MNE is an enterprise which owns, controls, and manages value-adding activities
in more than one country and whose business activities are financed by FDI
[14]. MNEs and FDI provide multiple direct and indirect channels through which
domestic firms can improve their production technology. These channels include
direct technology transfer across international borders by sharing technology among
multinational parents and subsidiaries [50], labor training and turnover [27], and
provision of high-quality intermediate inputs [59]. MNEs also transfer technology
to local suppliers as part of a strategy to build efficient supply chains for overseas
operations [39]. By doing so, MNEs enjoy a private benefit from lowering the cost
of non-labor inputs. Moreover, to reduce firm-specific risk, MNEs may have
incentives to diffuse technology more widely than to a single upstream vendor, either by
direct technology transfer to additional firms or by encouraging spillovers from
the original recipient. Blomström and Kokko [11] provide detailed discussions of
possible spillover channels associated with FDI. Since international trade is closely
related to MNEs and FDI, it is not surprising to see that they share some common
productivity-improvement channels.
Empirically, studies provide mixed evidence on the effects of FDI.
Consistent with the above theoretical arguments, the majority of studies find
robust evidence that FDI has positive effects on productivity both for immediate
recipients, who benefit from direct learning of foreign knowledge, and for
other domestic firms in the region, who benefit from technology/knowledge
spillovers [21, 39, 44, 49]. Others find that FDI may have insignificant,
or even negative, spillover effects on productivity for reasons such as data
limitations [43], heterogeneity across industries/countries [63], absorptive capacity
[26], or “business-stealing” competition [3]. For example, consider the US-owned
multinational company General Motors (GM), one of the largest manufacturers
of highly specialized automobiles in the world. GM entered China in 1997, and
since then it has established 11 joint ventures and two wholly owned foreign
enterprises. GM has a complicated relationship with Chinese domestic automakers:
it simultaneously collaborates and competes with local businesses. On the one hand,
the presence of GM hurts domestic firms such as SAIC Motor, one of
the “big four” oligopolistic automakers in China, by “stealing” SAIC’s market share.
On the other hand, SAIC benefits from GM’s mature production and management
practices and its cutting-edge technology. The net result of these two opposing
effects could be an insignificant or negative FDI effect.
Xu [63] uses data on US outward FDI into 40 countries between 1966
and 1994 and finds that technology transfer by US MNEs contributes to
productivity growth in developed countries but not in developing countries. He
shows that a country needs to reach a minimum human capital threshold to
benefit from the technology transfer. Xu’s [63] analysis is at the country level
and may be subject to aggregation bias because of heterogeneity across
sectors and across firms. To explore the heterogeneity of spillovers at the firm level,
Fons-Rosen et al. [26] investigate the types of domestic firms that overall benefit
from FDI. They develop a sector-level measure of technology closeness based on
whether firms within a sector hold similar patents. They construct measures of
horizontal and vertical technology-weighted FDI, as a fraction of output produced
by foreign-owned firms weighted by the technological closeness measure for the
sector. Using these technology-weighted FDI measures, they find that domestic
firms that are technologically close to MNEs become more productive following
FDI, while firms that produce similar goods to MNEs, but are not technologically
close, become less productive.
Aitken and Harrison [3] find a negative relationship between FDI and produc-
tivity of domestically owned plants using Venezuelan plant-level data, which they
interpret as a market-stealing or competition effect. Controlling for the increased
competition through FDI, Griffith et al. [31] find a significant positive FDI spillover
effect using UK plant-level data. Keller and Yeaple [44] estimate international
technology spillovers to US manufacturing firms and find a positive and
economically important spillover effect. According to their estimates, FDI accounts
for about 11% of US manufacturing productivity growth. Blalock and Gertler
[10] investigate the effect of FDI on local supplier productivity by estimating
a production function using a rich panel dataset of local- and foreign-owned
Indonesian manufacturing firms. Overall, the realized productivity gain is more
than 2%. They also examine the market and welfare effects of technology diffusion
from FDI and find that FDI increases the output and profits of both upstream
and downstream firms. Meanwhile, FDI decreases prices in both upstream and
downstream markets.

Persistent Benefits of FDI

How persistent are the benefits of FDI? Is the superior performance of foreign
affiliates due to a one-time know-how transfer? Or does it depend on the continuous
flow of knowledge and headquarter services from the parent firm? These questions
matter profoundly for policy making. Public policies aimed at attracting FDI are
common in both developing and developed economies. Foreign investors are often
given tax incentives/breaks in the hope that their affiliates will become a source of
knowledge spillovers to domestic firms. How long they can remain such a source
enters the government’s cost-benefit calculation of attracting FDI. The duration of
a tax break is usually prescribed by law, and a tax break cannot be awarded after the
foreign parent leaves. If foreign affiliates retain their productivity advantage even
after their foreign parents leave, the value proposition of such tax policies would be
much greater than in the scenario where the advantage evaporates once the
parent divests.
Javorcik and Poelhekke [40] study these issues by examining the development
of former foreign affiliates that were sold by their parents to local owners. They
use Indonesian plant-level data from 1990 to 2009 and focus on plants that were
at least 50% foreign owned and whose foreign ownership then dropped to less than
10% (a standard threshold used in the literature to denote FDI) and remained so for
at least 3 years. They find that divestment is associated with a 0.038 log point productivity
drop among divested plants. The decline starts in the year of ownership change and
persists over time. They also find an increasingly large output drop, ranging from
0.35 log points in the year of divestment to 0.54 log points 2 years later, which
is driven by the decline in export sales. Their results suggest that the benefits of
foreign ownership, at least partially, depend on continuous injections of knowledge
and access to headquarter services. Any externalities associated with the presence
of foreign affiliates may fade away after foreign owners leave. Malikov and Zhao
[49] study intra-industry FDI-facilitated productivity spillovers in China. They find that
the magnitude of productivity spillovers may not depend on the ownership of firms.
The primary beneficiaries of direct knowledge transfers generally appear to be
the wholly domestically owned firms. This suggests that the bulk of the productivity
boost comes from learning new knowledge and acquiring new technology from the
initial foreign investors immediately after the domestic firm gains direct access to them.
The indirect technology spillovers may be long-lasting, but the direct learning
from foreign investors among immediate recipients is more likely to be a one-time
knowledge transfer.

Empirical Models to Measure FDI Spillovers

Keller [43] summarizes the empirical approaches employed to study international
technology diffusion, including case studies and econometric methods
such as association studies, structural studies, and general equilibrium analysis. Here,
we pick out two commonly used regression approaches and briefly discuss their pros
and cons.

Reduced-Form Model
Generally, the reduced-form model tests whether a specific foreign activity (FA)
leads to a particular domestic technology outcome (DTO) through a regression in
the form of

$$DTO = \psi(FA, X) + u \qquad (5)$$

where ψ(·) usually takes a linear form, X is a vector of control variables, and u
is the noise term. For example, in Aitken and Harrison’s [3] analysis, FA is the
industry share of employment in foreign-owned firms, and DTO is the growth of
domestic firm productivity. Most studies routinely estimate FDI-driven technological
spillovers by (linearly) regressing firm-level measures of productivity/output on
various industry-level FDI aggregates, which reflect the overall extent of foreign
presence in the sector. Since the reduced-form model has low data requirements and
is easy to implement, it is probably the most commonly used approach to examine
FDI externalities.
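A minimal sketch of such a reduced-form regression is given below, assuming a linear ψ. It first builds the industry-level foreign activity variable as the share of industry employment in foreign-owned firms (as in the Aitken and Harrison example) and then regresses a domestic-firm outcome on it. The firm records, the control variable, and the outcome are made up, and the outcome is generated without noise so that the fit recovers the assumed coefficients exactly.

```python
import numpy as np

# Hypothetical firm-level records: industry id, employment, foreign-ownership flag
industry = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
employment = np.array([100.0, 50.0, 50.0, 80.0, 20.0, 100.0, 30.0, 30.0, 40.0])
foreign = np.array([1, 0, 0, 0, 1, 0, 0, 0, 0])

def foreign_employment_share(industry, employment, foreign):
    """Industry share of employment in foreign-owned firms (the FA variable)."""
    n_industries = int(industry.max()) + 1
    total = np.bincount(industry, weights=employment, minlength=n_industries)
    foreign_emp = np.bincount(industry, weights=employment * foreign,
                              minlength=n_industries)
    return foreign_emp / total

fa_by_industry = foreign_employment_share(industry, employment, foreign)
fa = fa_by_industry[industry]              # attach the industry aggregate to each firm

# Made-up outcome for illustration: DTO = 0.05 + 0.3 * FA + 0.1 * X (no noise)
rng = np.random.default_rng(2)
control = rng.normal(size=industry.size)   # hypothetical control variable X
dto = 0.05 + 0.3 * fa + 0.1 * control

domestic = foreign == 0                    # Eq. (5) is estimated on domestic firms
design = np.column_stack([np.ones(domestic.sum()), fa[domestic], control[domestic]])
coef, *_ = np.linalg.lstsq(design, dto[domestic], rcond=None)
print(fa_by_industry, coef)
```

With real data the outcome contains unobservables that may be correlated with foreign presence, which is why the coefficient on FA from this kind of regression should be read with caution.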
The reduced-form model is subject to issues common to any regression model,
such as individual/time heterogeneity and endogeneity. As in most empirical
studies, finding good instrumental variables for the foreign activity variables may
be challenging. Moreover, such a coarse “reduced-form” formulation of FDI
spillovers treats foreign knowledge diffusion across firms as a black-box process
without attempting to establish links between peer firms. It thereby obfuscates the
measurement of positive “technological spillover” externalities, which are widely
conceptualized as productivity flows from more efficient MNEs to less efficient
domestic firms, and it may confound such spillover effects with negative externalities
like “business-stealing” effects. This is, arguably, one of the reasons why positive
horizontal FDI spillovers have been elusive for empiricists. Keller [43] also points
out that the reduced-form model precludes a precise interpretation of the results,
since the black-box process cannot reflect a particular mechanism of FDI spillovers.

Structural Model
Different from the reduced-form approach, the structural model approach is more
specific about the spillover mechanism, which helps in interpreting the results.
Generally, the structural model can be written as

$$DTO = \psi(M, FT, X) + u \qquad (6)$$

where the foreign technology variable, FT, replaces the foreign activity (FA) variable
of Eq. (5), and the specification adds a specific channel or mechanism of
diffusion (denoted by M). An example of this model is the study by De
La Potterie and Lichtenberg [21], who estimate the following regression:

$$\log TFP = \alpha + \alpha_d \log R\&D + \alpha_f \log FR\&D + u \qquad (7)$$

where log TFP is the logarithm of TFP, R&D represents the domestic R&D capital
stock, and FR&D represents the inward/outward FDI-weighted foreign R&D
capital stock. In Eq. (7), the coefficient $\alpha_f$ can be used to test whether FDI
serves as a channel for international technology diffusion.
De La Potterie and Lichtenberg [21] calculate total factor productivity based
on a constant-returns-to-scale production function, using the average share of capital
income as the elasticity. They thus impose strong assumptions on the production
technology. Malikov and Zhao [49] provide another example, in which they apply
the structural model to examine FDI-facilitated productivity spillovers and
estimate the production function, productivity, and spillover effects jointly. The key
component of their method is to model the productivity evolution process as follows:

$$\omega_{it} = h\Big(\omega_{it-1},\, G_{it},\, \sum_{j \neq i} s_{ijt-1}\,\omega_{jt-1}\Big) + \xi_{it} \qquad (8)$$

where the subscript i is a firm index, t is a year index, $\omega_{it}$ is a productivity measure,
and G represents a firm’s foreign capital stock. The weights $\{s_{ijt-1}\}$ embody
peer connections between firms based on their exposure to foreign technology: the
weight $s_{ijt-1}$ for each firm i is constructed based on the foreign equity share of its
foreign peers j in the industry (and the geographic locality). Therefore, the weighted
aggregate of other foreign firms’ productivity, $\sum_{j \neq i} s_{ijt-1}\,\omega_{jt-1}$, captures potential
positive FDI externalities, giving rise to productivity spillovers from other firms with
foreign capital in the industry.
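The spillover term in Eq. (8) can be sketched as a matrix-vector product. The exact weighting scheme is not spelled out here, so the code below makes two labeled assumptions: the weight on peer j is proportional to j's foreign equity share, and weights are row-normalized to sum to one within each firm's industry group. The function name and the example data are hypothetical.

```python
import numpy as np

def spillover_aggregate(foreign_equity_share, omega_lag, same_group):
    """Weighted aggregate sum_{j != i} s_ij * omega_j of peers' lagged productivity.

    foreign_equity_share: shape (N,), foreign equity share of each firm at t-1
    omega_lag: shape (N,), productivity of each firm at t-1
    same_group: shape (N, N) boolean, True where firms i and j share an industry
                (and, if desired, a locality)
    """
    share = np.asarray(foreign_equity_share, dtype=float)
    omega_lag = np.asarray(omega_lag, dtype=float)
    # Assumption: s_ij is proportional to peer j's foreign equity share
    weights = np.where(same_group, share[np.newaxis, :], 0.0)
    np.fill_diagonal(weights, 0.0)                       # exclude j == i
    row_sums = weights.sum(axis=1, keepdims=True)
    # Assumption: row-normalize; rows without foreign peers keep zero weights
    weights = np.divide(weights, row_sums, out=np.zeros_like(weights),
                        where=row_sums > 0)
    return weights @ omega_lag

# Made-up example: three firms in industry 0 and one firm alone in industry 1
industry = np.array([0, 0, 0, 1])
same_group = industry[:, None] == industry[None, :]
shares = [0.5, 0.0, 1.0, 0.8]
omega_prev = [1.0, 2.0, 3.0, 4.0]
spill = spillover_aggregate(shares, omega_prev, same_group)
print(spill)
```

A wholly domestic firm (share zero) contributes nothing to its peers' aggregate but still receives one, which matches the idea that spillovers flow from firms with foreign capital to the other firms in the industry.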

Innovation and Productivity

Innovation Measures

Studies of growth based on the aggregate production function have revealed
that growth in the traditional factor inputs, such as capital and labor, can
explain only about half of the output growth in the USA and many other countries.
The remaining portion is ascribed to growth in productivity or to broadly defined
technological progress. As noted in the introduction, Crossan and Apaydin
[19] give a comprehensive definition of innovation, which encompasses almost all
channels of productivity growth. A large body of literature has emerged to
find more specific and trackable measures of innovative activities and to examine
their impacts on productivity. Even so, measuring innovation in a form that is
useful for statistical analysis has proven challenging. The central problem is
data availability. Previous work has mostly used two measures of innovative
activities: R&D expenditures and patent counts. Doraszelski and Jaumandreu [23]
use the Encuesta sobre Estrategias Empresariales (ESEE) survey to investigate the
relationship between R&D and productivity. The ESEE survey of Spanish firms is a
comprehensive and well-designed survey, which includes questions related to patents,
innovation (product, process, organizational and marketing innovations), payments
and income for licenses and technical assistance, salaries of R&D personnel,
and complementary technological activities such as collaboration with customers
and suppliers. Jin et al. [41] examine the nexus of financial constraints, R&D, and
productivity using a survey of all large- and medium-sized Chinese manufacturing
firms. This survey is conducted by China’s National Bureau of Statistics (NBS)
and provides data on R&D expenditures after 2004. Unlike the ESEE, the NBS
questionnaires on R&D are not designed to take all R&D-related expenditures
into consideration (see [41] for a detailed comparison). Lach [46] uses data on
patent counts to assess the contribution of knowledge to productivity growth at the
industry level. The output of Robert Evenson’s Yale-Canada Concordance Project,
described by Kortum and Putnam [45], provides the patent data. Chen and Yang [17]
employ both R&D stock and patent counts to investigate the relationship among
technological knowledge, spillovers, and productivity of Taiwanese manufacturing
firms from 1990 to 1997. Their patent data are collected from the computer file of
patents from the Taiwan Intellectual Property Office. The Compustat and the China
Stock Market and Accounting Research (CSMAR) databases also provide patent
counts for US and Chinese publicly listed firms, respectively [26, 42].
Hall [33] points out that, as measures of innovation, R&D expenditures and
patent counts have both positive and negative attributes. They both pertain primarily
to technological innovation and are more suited to measuring innovation in man-
ufacturing firms than in firms in other areas, such as services. R&D expenditures
have the advantage that they are denominated in comparable units and represent
a firm’s decision variable. They reflect a firm’s appropriate level of innovative
activities. For the same reason, R&D expenditures are only an input to innovation
and cannot tell us about innovation success. In contrast, the patent count is a measure
of invention success, and can be considered at least a partial measure of innovation
output. But the counts are inherently very noisy – it is likely that a few patents are
associated with very valuable inventions and most others are associated with little
value. The extent of their innovation coverage varies by sector, with sectors like
pharmaceuticals and instruments making heavy use of patents while other sectors
use them sparingly.
Recognizing the disadvantages of R&D expenditures and patent counts,
economists have begun to examine innovation more broadly and to search for
alternative measures, including those more suitable for the services sector. Several
such measures of innovative activities have been employed, including the use of
information and communication technologies (ICT), business model patents,
expenditure on organizational innovation, and marketing expenditure related to new
products. Polder et al. [58] consider ICT usage as an additional innovation
input, separate from R&D, that increases firm productivity over time. Based on a
data set merging different surveys (i.e., the Community Innovation Survey (CIS),
the business ICT survey, and the Investment Statistics and Production Statistics), they
find that ICT is the most important factor for innovation success in the services sector.
ICT investment and the use of broadband connectivity and e-commerce positively
affect product, process, and organizational innovations alike. Martin and Nguyen-Thi
[51] combine the CIS with the annual survey on ICT usage and e-commerce in
enterprises in Luxembourg and find a positive effect of ICT use on labor productivity
in the services sector. A business model is a combination of technologies and
markets, which requires strategies and resources to overcome mis-combinations
[30]. Following an outdated business model can cause firms to miss a potentially
valuable use of technology and thus an opportunity
to increase productivity. To overcome the challenges of rising development
costs and shorter product life cycles, companies must be innovative in their business
models. To measure business model innovation, one can use business model
patent data, which are available in a worldwide patent database, the Global Patent
Analysis Service System (G-PASS). It covers all major countries and regions (e.g., the
USA, Europe, China, Japan, Korea, Germany, France, the United Kingdom, and Canada)
from 1960 to the present.
Patent counts and other innovation counts suffer from a noncomparability issue:
some patents create whole new markets, whereas others are useful but trivial. To
overcome it, one can use at least two other measures that quantify quality.
First, as pointed out by Hall [33], innovation surveys typically have information
on the share of the firm’s sales that result from innovations introduced during
a preceding period (usually the past 3 years). The share of sales of innovative
products provides a good
indicator of the overall importance of the innovations. However, it is useful only
for innovations in goods and services and cannot be used to capture process or
organizational innovations. Second, one can use the patent citations to measure the
importance of a patent. Different from a simple patent count, citations can capture
not only the quantity of ideas produced but also the quality of those ideas [18]. The
National Bureau of Economic Research (NBER) patent database provides detailed
information on almost 3 million US patents granted from 1963 to 1999, and it
includes all citations made to these patents between 1975 and 1999. This database
also has other original measures such as indices of originality and generality. See
Hall et al. [34] for a detailed description. An updated version of this database
includes data up to 2006.

R&D Measure and Modeling

Among all the innovation measures mentioned above, R&D is the most intensively
examined in previous studies. Firms invest in R&D to achieve productivity gains
through innovations resulting from their investments, and there is a large body of
literature estimating returns to R&D. The common approach in these studies
is to adopt Griliches’ [32] knowledge production framework, which augments
a standard production function with a measure of the current stock of technical
knowledge (see the survey in [35]). In general, we can write the Griliches production
function g(·) as:

$$y_t = g(x_t, r_t) + \varepsilon_t \qquad (9)$$

where $y_t$, $x_t$, and $r_t$ are (log) output, standard factor inputs (such as capital, labor, and
materials), and knowledge capital at time t, respectively. The knowledge capital is
usually calculated through the perpetual inventory method from the observed R&D
expenditures, that is, $r_t = (1 - \delta) r_{t-1} + R\&D_t$, where δ is the depreciation rate.
Doraszelski and Jaumandreu [23] argue that the widely used Griliches framework
has two potential problems: the depreciation rate is difficult to estimate, and the
available history of R&D expenditures is short. Therefore, they explore the law of
motion of (log) productivity, viz., $\varepsilon_{t+1} = E[\varepsilon_{t+1} \mid \varepsilon_t, R\&D_t] + \mathrm{error}_{t+1}$, in
which no calculation of the knowledge stock is needed. The effects of previous R&D
investment on productivity are captured by contemporaneous productivity ($\varepsilon_t$),
and the current-period $R\&D_t$ interacts with $\varepsilon_t$; together they determine future
productivity.
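The perpetual inventory recursion is easy to state in code. The sketch below applies it to a stock in levels; the function name, the 15% depreciation rate, the zero initial stock, and the expenditure series are illustrative assumptions rather than values taken from any of the cited studies.

```python
def knowledge_stock(rd_expenditures, depreciation_rate, initial_stock=0.0):
    """Perpetual inventory: stock_t = (1 - delta) * stock_{t-1} + R&D_t."""
    if not 0.0 <= depreciation_rate <= 1.0:
        raise ValueError("depreciation_rate must lie in [0, 1]")
    stocks = []
    stock = initial_stock
    for rd in rd_expenditures:
        stock = (1.0 - depreciation_rate) * stock + rd
        stocks.append(stock)
    return stocks

rd_series = [10.0, 10.0, 10.0]                       # made-up R&D expenditures
stocks = knowledge_stock(rd_series, depreciation_rate=0.15)
stocks_fast = knowledge_stock(rd_series, depreciation_rate=0.30)
print(stocks, stocks_fast)
```

Both problems raised by Doraszelski and Jaumandreu show up directly: the resulting stock depends on the assumed depreciation rate, and with a short expenditure history it also depends heavily on the assumed initial stock.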
However, neither approach accounts for spillovers of R&D investment
during estimation. Because knowledge is non-excludable and inexhaustible,
previous studies show that investment in R&D by private corporations
and countries “spills over” for third parties to exploit [25]. To empirically
measure the effect of spillovers on productivity, previous studies typically model
the (prespecified) spatial correlation of previously estimated productivity by applying
spatial econometric tools, such as the spatial autoregressive (SAR) model and the
spatial error model (SEM), in a two-step approach. However, spillovers induce
cross-sectional dependence across units, which may lead to inconsistent productivity
estimates in the first step [57]. Eberhardt et al. [25] and Malikov and Zhao [49] argue
that spillovers should be accounted for during productivity estimation. To control for
spillover effects, they apply the interactive fixed-effects model and a dynamic law of
motion of productivity, respectively.

Financial Constraints, R&D, and Productivity

While R&D investment is a crucial driver of productivity growth, financing R&D
tends to be difficult for reasons such as lack of collateral value, uncertainty,
and asymmetric information associated with R&D activities [16]. Intuitively, it
seems natural that financial constraints would have a negative effect on firms’ R&D
investment and thus decrease firm productivity. However, the results from empirical
studies are mixed. Some studies find a strong link between R&D and
measures of financial status, such as cash flow and internal and external equity finance
[15, 37]. Others find a weak correlation, or no
correlation at all, between R&D and cash flow [12, 36]. These inconclusive results may
be due to market/project heterogeneity, sample selection, and data limitations. For
example, Czarnitzki and Hottenrott [20] find that R&D expenditures on cutting-edge
projects are curtailed by financial constraints, while routine R&D investment is not.
Brown et al. [16] find strong evidence that the availability of finance matters for
R&D once (i) firms’ efforts to fund R&D with cash flow and (ii) firms’ use of
external equity finance are directly controlled for.
Recent studies also find a nonlinear relationship between financial constraints
and productivity. Jin et al. [41] find that financial constraints have non-monotonic
effects on productivity. Severely constrained firms cannot afford sufficient investment
in R&D activities, resulting in productivity loss. Unconstrained firms may not
have incentives to invest in R&D, which also results in low productivity. Aghion
et al. [2] argue that better access to credit makes it easier for entrepreneurs to
innovate, but it also allows less efficient incumbent firms to remain in the
market, thereby discouraging the entry of new and potentially more efficient innovators.

Conclusion

Globalization is one of the fundamental issues in production and development
economics. International trade (import and export) and FDI are among the key
components of the globalization process. Theoretical and empirical studies, based
on different model settings and samples, have identified various mechanisms
through which trade and FDI can increase productivity at both macro and micro
levels, even though heterogeneity exists due to reasons such as competition and
learning capacity.
Innovation and productivity growth are intimately related. Different types of
innovative activities have been studied, although measuring innovation
precisely remains a challenge. Commonly used innovation measures include R&D
expenditures, patent counts, patent citations, business model patents, and the use of
information and communication technologies. R&D expenditure has been studied most
intensively. The discussions of other measures are based mostly on empirical findings.
We look forward to more theoretical frameworks in economics that connect these
measures to firms’ decisions.
Finally, the consistent estimation of productivity and the identification of productivity
effects stemming from trade, FDI, R&D, and spillovers can be challenging
because of various endogeneity issues (e.g., the simultaneity problem associated with
input choices and latent productivity, the omitted variable bias due to the connection
between globalization/innovation and firm input usage, and the cross-sectional
dependence resulting from technology spillovers). Fortunately, each of them can be
addressed by using appropriate econometric tools, so good-quality future empirical
studies can, and should, take them into consideration.

Cross-References

Cost, Revenue, and Profit Function Estimates

Economics of Externalities: An Overview
Scale Elasticity and Returns to Scale

References
1. Ackerberg DA, Caves K, Frazer G (2015) Identification properties of recent production
function estimators. Econometrica 83(6):2411–2451
2. Aghion P, Bergeaud A, Cette G, Lecat R, Maghin H (2019) The inverted-U relationship
between credit access and productivity growth. Economica 86:1–31
3. Aitken BJ, Harrison AE (1999) Do domestic firms benefit from direct foreign investment?
Evidence from Venezuela. Am Econ Rev 89(3):605–618
4. Alcalá F, Ciccone A (2004) Trade and productivity. Q J Econ 119(2):613–646
5. Amiti M, Konings J (2007) Trade liberalization, intermediate inputs, and productivity: evidence
from Indonesia. Am Econ Rev 97(5):1611–1638
6. Aw B-Y, Hwang AR (1995) Productivity and the export market: a firm-level analysis. J Dev
Econ 47(1995):313–332
7. Aw B-Y, Chung S, Roberts MJ (2000) Productivity and turnover in the export market: micro-
level evidence from the Republic of Korea and Taiwan (China). World Bank Econ Rev
14(1):65–90
8. Baldwin JR, Gu W (2004) Trade liberalization: export-market participation, productivity
growth, and innovation. Oxf Rev Econ Policy 20(3):372–392
9. Bernard A, Jensen JB (1995) Exporters, jobs, and wages in U.S. manufacturing: 1976–1987.
Brook Pap Econ Act 26, issue 1995 Microeconomics:67–119
10. Blalock G, Gertler PJ (2008) Welfare gains from foreign direct investment through technology
transfer to local suppliers. J Int Econ 74(2):402–421
11. Blomström M, Kokko A (1998) Multinational corporations and spillovers. J Econ Surv
12(3):247–277
12. Bond S, Harhoff D, Van Reenen J (2003) Investment, R&D and financial constraints in Britain
and Germany. CEP Discussion papers dp0595, Centre for Economic Performance, LSE
13. Brandt L, Van Biesebroeck J, Zhang Y (2012) Creative accounting or creative destruction?
Firm-level productivity growth in Chinese manufacturing. J Dev Econ 97(2):339–351
14. Brewer TL, Young S (2000) The multilateral investment system and multinational enterprises.
OUP Catalogue
15. Brown JR, Fazzari SM, Petersen BC (2009) Financing innovation and growth: cash flow,
external equity, and the 1990s R&D boom. J Financ 64(1):151–185
16. Brown JR, Martinsson G, Petersen BC (2012) Do financing constraints matter for R&D? Eur
Econ Rev 56(8):1512–1529
17. Chen J-R, Yang C-H (2005) Technological knowledge, spillover and productivity: evidence
from Taiwanese firm level panel data. Appl Econ 37(20):2361–2371
18. Correa JA, Ornaghi C (2014) Competition & innovation: evidence from US patent and
productivity data. J Ind Econ 62(2):258–285
19. Crossan MM, Apaydin M (2010) A multi-dimensional framework of organizational innovation:
a systematic review of the literature. J Manag Stud 47(6):1154–1191
20. Czarnitzki D, Hottenrott H (2011) Financial constraints: routine versus cutting edge R&D
investment. J Econ Manag Strateg 20(1):121–157
21. De La Potterie BVP, Lichtenberg F (2001) Does foreign direct investment transfer technology
across borders? Rev Econ Stat 83(3):490–497
22. De Loecker J (2011) Product differentiation, multiproduct firms, and estimating the impact of
trade liberalization on productivity. Econometrica 79(5):1407–1451
23. Doraszelski U, Jaumandreu J (2013) R&D and productivity: estimating endogenous productiv-
ity. Rev Econ Stud 80(4):1338–1383
24. Eaton J, Kortum S (1996) Trade in ideas patenting and productivity in the OECD. J Int Econ
40(3–4):251–278
25. Eberhardt M, Helmers C, Strauss H (2013) Do spillovers matter when estimating private returns
to R&D? Rev Econ Stat 95(2):436–448
26. Fons-Rosen C, Kalemli-Ozcan S, Sorensen BE, Villegas-Sanchez C, Volosovych V (2017) For-
eign investment and domestic productivity: identifying knowledge spillovers and competition
effects. Working paper 23643. National Bureau of Economic Research
27. Fosfuri A, Motta M, Ronde T (2001) Foreign direct investment and spillovers through workers’
mobility. J Int Econ 53(1):205–222
28. Frankel JA, Romer DH (1999) Does trade cause growth? Am Econ Rev 89(3):379–399
29. Gandhi A, Navarro S, Rivers D (2018) On the identification of gross output production
functions. Centre for Human Capital and Productivity (CHCP), University of Western Ontario
30. Gassmann O, Frankenberger K, Csik M (2014) The business model navigator: 55 models that
will revolutionise your business. Pearson, Harlow
31. Griffith R, Redding S, Simpson H (2003) Productivity convergence and foreign ownership at
the establishment level. CEP discussion papers dp0573. Centre for Economic Performance,
LSE
32. Griliches Z (1979) Issues in assessing the contribution of research and development to
productivity growth. Bell J Econ 10(1):92–116
33. Hall BH (2011) Innovation and productivity. National Bureau of Economic Research, Cam-
bridge, Massachusetts
34. Hall BH, Jaffe AB, Trajtenberg M (2001) The NBER patent citation data file: lessons, insights
and methodological tools. National Bureau of Economic Research
35. Hall BH, Mairesse J, Mohnen P (2010) Measuring the returns to R&D. In: Handbook of the
economics of innovation, vol 2, pp 1033–1082. Elsevier
36. Harhoff D (2000) Are there financing constraints for R&D and investment in German
manufacturing firms? In: The economics and econometrics of innovation. Springer,
New York, pp 421–456
37. Himmelberg CP, Petersen BC (1994) R&D and internal finance: a panel study of small firms
in high-tech industries. Rev Econ Stat 76(1):38–51
38. Irwin DA, Terviö M (2002) Does trade raise income? Evidence from the twentieth century. J
Int Econ 58(1):1–18
39. Javorcik B (2004) Does foreign direct investment increase the productivity of domestic firms?
In search of spillovers through backward linkages. Am Econ Rev 94(3):605–627
40. Javorcik B, Poelhekke S (2017) Former foreign affiliates: cast out and outperformed? J Eur
Econ Assoc 15(3):501–539
41. Jin M, Zhao S, Kumbhakar SC (2019) Financial constraints and firm productivity: evidence
from Chinese manufacturing. Eur J Oper Res 275(3):1139–1156
42. Jin M, Tian H, Kumbhakar S (2020) How to compete and survive: the impact of information
asymmetry on productivity. J Prod Anal 53:107–123
43. Keller W (2004) International technology diffusion. J Econ Lit 42(3):752–782
44. Keller W, Yeaple SR (2009) Multinational enterprises, international trade, and productivity
growth: firm-level evidence from the United States. Rev Econ Stat 91(4):821–831
45. Kortum S, Putnam J (1989) Estimating patents by industry: part I and II. Unpublished
Manuscript, Yale University
46. Lach S (1995) Patents and productivity growth at the industry level: a first look. Econ Lett
49(1):101–108
47. Levinsohn J (1993) Testing the imports-as-market-discipline hypothesis. J Int Econ 35(1–2):
1–22
48. Levinsohn J, Petrin A (2003) Estimating production functions using inputs to control for
unobservables. Rev Econ Stud 70(2):317–341
49. Malikov E, Zhao S (2019) Cross-firm productivity spillovers in the presence of foreign
investments. Working paper
50. Markusen JR, Maskus KE (2002) Discriminating among alternative theories of the multina-
tional enterprise. Rev Int Econ 10(4):694–707
51. Martin L, Nguyen-Thi TU (2015) The relationship between innovation and productivity based
on R&D and ICT use: an empirical analysis of firms in Luxembourg. Rev Écon 66(6):1105–
1130
52. Melitz MJ (2003) The impact of trade on intra-industry reallocations and aggregate industry
productivity. Econometrica 71(6):1695–1725
53. Muendler M-A (2004) Trade, technology and productivity: a study of Brazilian manufacturers
1986–1998. CESifo working paper series 1148. CESifo Group, Munich
54. Nishimizu M, Robinson S (1984) Trade policies and productivity change in semi-industrialized
countries. J Dev Econ 16(1–2):177–206
55. Olley GS, Pakes A (1996) The dynamics of productivity in the telecommunications equipment
industry. Econometrica 64(6):1263–1297
56. Pavcnik N (2002) Trade liberalization, exit, and productivity improvements: evidence from
Chilean plants. Rev Econ Stud 69(1):245–276
57. Pesaran MH (2006) Estimation and inference in large heterogeneous panels with a multifactor
error structure. Econometrica 74(4):967–1012
58. Polder M, van Leeuwen G, Mohnen P, Raymond W (2009) Productivity effects of innovation
modes. MPRA paper 18893. University Library of Munich
59. Rodriguez-Clare A (1996) Multinationals, linkages, and economic development. Am Econ
Rev 86(4):852–873
60. Rodriguez F, Rodrik D (2000) Trade policy and economic growth: a skeptic’s guide to the
cross-national evidence. Natl Bur Econ Res Macroecon Annu 15:261–325
61. Topalova P, Khandelwal A (2011) Trade liberalization and firm productivity: the case of India.
Rev Econ Stat 93(3):995–1009
62. Tybout JR (1992) Linking trade and productivity: new research directions. World Bank Econ
Rev 6(2):189–211
63. Xu B (2000) Multinational enterprises, technology diffusion, and host country productivity
growth. J Dev Econ 62(2):477–493
64. Yu M (2015) Processing trade, tariff reductions and firm productivity: evidence from Chinese
firms. Econ J 125(585):943–988
65. Zhao S, Jin M, Kumbhakar SC (2020) Estimation of firm productivity in the presence of
spillovers and common shocks. Empir Econ, forthcoming
29 Empirical Analysis of Production Economics: Applications to Banking
Stephen M. Miller

Contents
Introduction
Production Economics in Banking Research
  Organizing Production: How to Measure Output
  Productivity Growth
  Bank and Banking Profitability
  Economies of Scale
  Economies of Scope
  Efficiency
  Bank Failure
Conclusion
Cross-References
References

Abstract

Volume II of the Handbook of Production Economics provides surveys of
empirical applications of the neoclassical production economics discussed in
Volume I. This chapter examines empirical applications in banking that now
enter what we can categorize as accumulated, accepted knowledge or wisdom.
To begin, we consider how to measure output. Two basic approaches exist:
the production and intermediation specifications. The treatment of deposits
differentiates these two specifications, whereby the production approach takes
deposits as an output and the intermediation approach takes deposits as an input.
Then, this chapter proceeds to discuss various issues in bank production: bank
productivity growth, bank and banking industry profitability, economies of scale
and scope in banking, and bank efficiency. Bank efficiency includes efficiency
as measured by the production function, the cost function, the revenue function,
the profit function, and the efficient frontier between expected return and risk.
The chapter concludes with an analysis of predicting problem banks and/or bank
failures.

S. M. Miller
Department of Economics, Lee Business School, University of Nevada, Las Vegas, Las Vegas,
NV, USA

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_29

Keywords

Banking · Productivity · Profitability · Scale · Scope · Efficiency · Expected return ·
Risk · Failure prediction

Introduction

The US banking industry experienced significant change over the last four decades.
The financial reforms adopted due to the Great Depression promoted a tension
between federal regulations and their implementation and enforcement. While reg-
ulation generally restricted geographic and product-line expansion of commercial
banking activities, financial innovation reacted against these regulatory constraints.
That is, bankers applied pressure for less regulation and more freedom of action.
Such activity generally produced implementation and enforcement of existing regu-
lations that loosened regulatory constraints. In addition, financial innovations aimed
to avoid or work around existing regulations. Many prohibitions on geographic and
product-line expansion of commercial banking activities proved less effective than
the regulators originally planned.
The process of deregulation from the regulatory environment implemented in
response to the Great Depression began in the early 1980s with the Depository
Institutions Deregulation and Monetary Control Act of 1980. The process continued through the
passage of the Riegle-Neal Interstate Banking and Branching Efficiency Act of
1994 and essentially ended with the adoption of the Gramm–Leach–Bliley Act, also
known as the Financial Services Modernization Act of 1999. This environment of
deregulation or freeing up of commercial banking activities in the late 1980s and
early 1990s also saw the largest number of commercial bank failures and mergers
since the Great Depression. This reduction in the number of institutions through
failure and merger (either voluntary or FDIC arranged) significantly lowered the
number of commercial banks to under 8000 by 2008 from a previous peak of
nearly 15,000 in the early 1980s. Today, the number of commercial banks falls
below 5000. This process of deregulation and consolidation in the industry also
accelerated a trend toward more complex banking organizations and activities.
The conglomeration of different financial services under the umbrella of one legal
organization, the bank holding company, increased the complexity of financial
services offered by banks, such as derivatives and mortgage (and other credit)
securitizations.
These new financial innovations not only rightly received accolades for their
ability to diversify risk, but they also brought new levels of risk to bank operations.
For example, the subprime mortgage crisis of 2007 led to the bailout of US banks
that had become “too big to fail.” This ultimately precipitated the passage of the
Dodd-Frank Wall Street Reform and Consumer Protection Act in 2010 that was
designed to address many of the apparent weaknesses within the US financial system
that developed because of the deregulation process. In other words, a significant
reregulation of the industry began to occur.
In sum, the banking industry with its fundamental role in the operation of the
economy and its experience with regulatory change provides a unique industry to
study. In addition, the availability of income and balance sheet data on individual
banks and/or bank-holding companies provides the grist for analysis of various
aspects of production economics. Volume II of the Handbook of Production
Economics provides surveys of empirical applications of the neoclassical production
economics discussed in Volume I. This chapter examines empirical applications in
banking that now enter what we can categorize as accumulated, accepted knowledge
or wisdom.

Production Economics in Banking Research

This section examines a variety of issues of production economics in the banking
industry. Topics include productivity and profitability, economies of scale and scope,
production, cost, revenue, and profit efficiency, and predicting (identifying) problem
banks and/or failed banks. Most papers included in this chapter consider US banking
data with a few exceptions. Typically, research projects using US banking data
quickly translate into similar research projects using data from other developed
and/or developing countries.

Organizing Production: How to Measure Output

At the outset, important problems emerge in defining what a commercial bank
produces or how to measure commercial bank output. In general, the existing
literature in banking adopts two main specifications of how a bank operates in
the banking industry: the production and intermediation specifications [1]. The
production approach specifies the bank as producing loans, deposits, and other
financial services, using inputs such as labor and capital. The intermediation
approach specifies the bank as intermediating financial services between spenders
(borrowers) and savers (lenders) and producing loans and other interest-earning
assets using deposits and labor [2]. The production and intermediation specifications
identify deposits, in turn, as an output and as an input, respectively.
Other specifications also exist, although they appear less frequently in the literature.
The asset specification aligns itself with the intermediation approach [3]. That
is, deposits and other liabilities along with labor and capital produce output
(assets), and only bank assets, such as loans or investments, generate revenue
[4]. Alternatively, the value-added (revenue) specification in Data Envelopment
Analysis (DEA) models identifies outputs as components of a portfolio that create
value [5]. Therefore, this specification postulates deposits and loans as outputs.
Further, the operating (income) specification models the bank as generating the
revenue necessary to cover its total cost or expenses. This specification iden-
tifies interest and noninterest expenses as inputs and interest and noninterest
income as outputs [6]. Finally, Holod and Lewis [7] specify a two-stage DEA
model. In stage one, the bank employs labor, capital, and other inputs to produce
deposits. Then, in stage two, the bank employs deposits with other inputs to
produce interest-earning assets (output), treating deposits as an intermediate output
(input).

Productivity Growth

Solow’s [8] path-breaking work on growth accounting spawned an ongoing literature
on measuring productivity and/or technological growth. Such measurement of
productivity growth, using parametric and nonparametric methods, also exists in the
banking industry.
Several papers employ parametric models (e.g., [9–12]) to examine either total
factor productivity growth or technological progress in the US commercial banking
industry during the 1980s. Humphrey [10] calculates total factor productivity
growth using growth-accounting models. He determines that the average annual
productivity change varied between −0.07%, using the production method, and
0.6%, using the cost method, between 1977 and 1987.
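The growth-accounting logic behind such estimates can be sketched in a few lines. The sketch below is a generic Solow-residual calculation, not Humphrey's exact specification: the function name `tfp_growth`, the three input categories, and all growth rates and cost shares are hypothetical values chosen for illustration.

```python
def tfp_growth(output_growth, input_growth, cost_shares):
    """Solow-residual TFP growth: output growth minus share-weighted input growth.

    All rates are decimals (0.03 means 3% per year). The input names and the
    use of cost shares as weights are assumptions made for this illustration.
    """
    if set(input_growth) != set(cost_shares):
        raise ValueError("input_growth and cost_shares must cover the same inputs")
    if abs(sum(cost_shares.values()) - 1.0) > 1e-9:
        raise ValueError("cost shares must sum to one")
    weighted_inputs = sum(cost_shares[name] * input_growth[name] for name in input_growth)
    return output_growth - weighted_inputs


# Hypothetical annual growth rates for one bank.
example = tfp_growth(
    output_growth=0.030,
    input_growth={"labor": 0.010, "capital": 0.040, "funds": 0.035},
    cost_shares={"labor": 0.30, "capital": 0.20, "funds": 0.50},
)
```

With these made-up numbers, share-weighted input growth is 2.85% against output growth of 3%, so the residual is small. This mirrors the near-zero productivity change reported above only qualitatively.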
The lack of productivity growth is traced to several factors. First, before
deregulation, US commercial banks competed significantly through “brick and
mortar” investment, which offered convenience to bank customers through a large
branching network (where permitted). Second, financial innovation and deregulation
during the 1970s and 1980s pushed the cost of funds to higher levels for commercial
banks due to more intense competition. Third, banks failed to adjust sufficiently or
quickly enough on their investment in branching networks to compensate for the
higher costs of funds. In sum, total factor productivity, measured either through the
production or cost approaches, fell.
Humphrey [11] and Hunter and Timme [12] estimate cost functions to measure
technological change in the US commercial banking industry. Technological change
excludes scale-economy effects. Both papers find small or negligible technological
change in the 1977–1988 period, supporting the findings of Humphrey [10].
Finally, Bauer, et al. [9] implement both stochastic and thick-frontier specifi-
cations and estimate total factor productivity growth, using a sample of 683 banks.
They discover that total factor productivity growth varied from −2.28 to 0.16% over
the 1977–1988 sample period.
Some papers employ nonparametric, rather than parametric, specifications to
measure productivity and technology. Caves, et al. [13] employ the Malmquist
productivity index, using the “proportional scaling” method introduced by Malmquist
[14], to measure productivity growth.1 The Malmquist productivity index provides
a nonparametric alternative to the prior parametric specifications. To derive the
Malmquist index, Caves, et al. [13] adopt a translog model for the distance functions
and assume that the firms operate efficiently. Färe et al. [16] allow inefficiencies
in firm operation and model the technology as piecewise linear. They assume
constant returns to scale and decompose the Malmquist index into efficiency change
and technical change index components. A series of other papers develop further
decompositions of the Malmquist productivity index (e.g., [17–20]).
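A minimal sketch may help fix ideas about the decomposition into efficiency change and technical change. It assumes the simplest possible setting, one input and one output under constant returns to scale, so that the output distance function reduces to a ratio and no linear programming is needed. The function names and the two-bank data are hypothetical; actual studies use multiple inputs and outputs and solve DEA programs for each distance function.

```python
import math


def crs_output_distance(x, y, reference):
    """Output distance of (x, y) from a CRS frontier, one input and one output.

    In this special case the frontier is the ray through the highest
    output-input ratio observed in the reference set of (input, output) pairs.
    """
    best_ratio = max(out / inp for inp, out in reference)
    return (y / x) / best_ratio


def malmquist(obs_t, obs_t1, ref_t, ref_t1):
    """Return (index, efficiency_change, technical_change) for one firm."""
    d_t_t = crs_output_distance(*obs_t, ref_t)      # period-t data, period-t frontier
    d_t_t1 = crs_output_distance(*obs_t1, ref_t)    # period-(t+1) data, period-t frontier
    d_t1_t = crs_output_distance(*obs_t, ref_t1)    # period-t data, period-(t+1) frontier
    d_t1_t1 = crs_output_distance(*obs_t1, ref_t1)  # period-(t+1) data, period-(t+1) frontier
    efficiency_change = d_t1_t1 / d_t_t
    technical_change = math.sqrt((d_t_t1 / d_t1_t1) * (d_t_t / d_t1_t))
    index = math.sqrt((d_t_t1 / d_t_t) * (d_t1_t1 / d_t1_t))
    return index, efficiency_change, technical_change


# Hypothetical (input, output) data for two banks in two periods.
ref_t = [(2.0, 2.0), (4.0, 3.0)]
ref_t1 = [(2.0, 2.4), (4.0, 4.0)]
index, eff_change, tech_change = malmquist(ref_t[1], ref_t1[1], ref_t, ref_t1)
```

For the second bank in this toy data, the index equals the product of the two components: the bank moves closer to the frontier while the frontier itself shifts outward.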
Since many DEA studies of bank efficiency employ cross-sectional data, they
cannot measure productivity change. Using a sample from 1980 to 1985, Elyasiani
and Mehdian [21] calculate technological change for large US commercial banks,
but not their productivity growth. They identify significant technological advance-
ment of about 2.6% per year over their 5-year sample period, classifying this
technological progress as a non-neutral and labor-biased technical change.
Berg et al. [22] employ the Malmquist index to measure productivity growth in
Norwegian banking during its deregulation. The analysis examines the 1980–1989
period, with the number of banks falling from 346 in 1980 to 178 in 1989. They
use the value-added method to measure output. They consider productivity growth
for the banking frontier as well as for individual banks, finding productivity decline
prior to deregulation and productivity growth after deregulation. They also find less
dispersion of productivity levels because of deregulation.
Wheelock and Wilson [23] use a sample of all US banks between 1984 and 1993,
omitting banks with missing observations or no loans, to construct the Malmquist
index to determine productivity change. They measure the Malmquist index as
the ratio of variable returns to scale (VRS) distance functions and decompose
this index into two factors: technical efficiency change and technical change. They
find that productivity generally declined for all banks. The subsample
of banks that operated with more than $300 million in assets continuously from
1983 to 1993 experienced increased productivity from 1983 to 1989, but decreased
productivity after 1989 that offset the productivity increase in the 1980s.
Mukherjee, et al. [24] estimate productivity growth for 201 large US commercial
banks in the postderegulation period from 1984 to 1990, using the DEA approach
and Malmquist productivity indexes. They isolate the contributions of technical
change, technical efficiency change, and scale efficiency change to productivity
growth. They find that overall productivity grew by 4.5% per
year, on average. Productivity actually declined, however, by 7.61% and 0.33%
between 1984 and 1985 and between 1988 and 1989, respectively. Second-stage
panel regressions document that larger productivity growth associates with larger
asset size and specialization of product mix, whereas lower productivity growth
associates with higher net charge-offs.

1 Färe, et al. [15] provide a comprehensive survey of the theory and development of the Malmquist
productivity index.

Bank and Banking Profitability

As financial institutions, banks generate profit in large measure by managing their
portfolios of assets and liabilities, that is, their balance sheets. Of course, banks
also hire labor and rent (own) capital to run their operations in addition to selling
services, which also appear in the income statements of banks along with the income
and expenses from managing their assets and liabilities. Nonetheless, managing
the portfolio of assets and liabilities provides an important component of bank
profitability. Using Consolidated Report of Condition and Income (Call Report)
data, researchers have considered the important determinants of individual US bank
profitability as well as the dynamic adjustment of industry profitability through the
growth and structure of the industry.

Determinants of Bank Profitability2

Numerous studies directly attempt to identify the characteristics associated with
bank profitability. Researchers frequently measure bank profitability by return on
assets (ROA) or return on equity (ROE), where a measure of net income captures the
profit from the income statement and total assets or equity comes from the balance
sheet. Explanatory variables typically include bank-specific variables (internal
factors) and macroeconomic variables (external factors). Bank-specific variables
include the effect on bank profitability of some financial ratios that measure
asset (lending and investment) management (e.g., total loans to total assets),
liability (funding) management (e.g., total deposits to total assets), productivity and
efficiency (e.g., total noninterest income to total income), and asset quality (e.g.,
provisions for loan losses to total loans). The role of macroeconomic variables
in affecting bank profitability involves changes over time. Thus, studies of bank
profitability that include both bank-specific variables as well as macroeconomic
variables generally employ time-fixed effects in a panel data setting to control for
macroeconomic variables.
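The profitability measures and bank-specific ratios named above are simple quotients of income-statement and balance-sheet items. The sketch below shows one way to compute them. The class `BankStatement`, its field names, and the figures are hypothetical and do not correspond to actual Call Report item codes.

```python
from dataclasses import dataclass


@dataclass
class BankStatement:
    """Hypothetical income-statement and balance-sheet items for one bank-year."""
    net_income: float
    total_assets: float
    total_equity: float
    total_loans: float
    total_deposits: float
    noninterest_income: float
    total_income: float
    loan_loss_provisions: float


def profitability_ratios(bank):
    """Return ROA, ROE, and the example explanatory ratios listed in the text."""
    return {
        "roa": bank.net_income / bank.total_assets,
        "roe": bank.net_income / bank.total_equity,
        "loans_to_assets": bank.total_loans / bank.total_assets,        # asset management
        "deposits_to_assets": bank.total_deposits / bank.total_assets,  # liability management
        "noninterest_income_share": bank.noninterest_income / bank.total_income,
        "provisions_to_loans": bank.loan_loss_provisions / bank.total_loans,  # asset quality
    }


# Made-up figures (say, in millions of dollars).
ratios = profitability_ratios(BankStatement(
    net_income=12.0, total_assets=1000.0, total_equity=100.0, total_loans=600.0,
    total_deposits=800.0, noninterest_income=20.0, total_income=80.0,
    loan_loss_provisions=6.0,
))
```

In a panel study, ratios like these would be computed for each bank-year and used as regressors alongside time-fixed effects.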
Several authors provide a mixture of findings on US bank performance with
samples of various sizes of banks and over various periods of time (e.g., [25–28]). A
few consistent explanations of good bank performance do surface frequently in these
papers. High-performance banks possess low noninterest expense, and researchers
frequently attribute good bank performance to quality management. Those factors
that associate with high-performance banks do shift somewhat from study to study.
Part of this diversity of findings reflects different samples of bank sizes and different
sample periods.
For example, Gup and Walter [25] examine the characteristics of top-performing
small banks. The results for small banks differ significantly from the general results of the
other studies. High-performance small banks possess more securities, higher-quality
loans, more capital, and more demand deposits.
A large literature focuses on banks in countries other than the United States.
Some of the more frequently cited papers include [29–33].

2 Some papers use other performance measures besides bank profit such as net interest margin to
total assets, noninterest expenses to assets, nonperforming loans to total loans, or loan losses to
total loans (e.g., [25–27]).

Dynamic Adjustment of Banking Industry Profitability

The dynamic adjustment of banks and their portfolios within that industry affects
overall industry profitability. The US banking industry developed within an environ-
ment of strong aversion to concentrations of power and with significant regulation
in the banking sector enacted in response to the Great Depression. Due to this
history, US banking activity and regulation generated an industry encompassing
many more banks than the norm around the world.3 During the 1970s, banks
frequently developed financial innovations to circumvent existing regulation. The
ATM is the classic example. Commercial banks created ATMs to circumvent
branching restrictions and reduce costs. (The ATM is not a branch but functions
as a minibranch.) Those innovations gradually eroded the control of existing
regulations, ultimately rendering much of the regulatory superstructure erected
during the Great Depression much less effective. Thus, the last two decades of the
twentieth century saw a series of deregulatory actions that significantly freed banks
from the regulatory control enacted during the Great Depression. For example, the
prohibition of intrastate and interstate banking slowly disappeared, first with a series
of relaxations of regulation on a state-by-state basis, then with increasing state-level
actions permitting interstate banking activity through multibank-holding companies,
and finally with the adoption of full interstate banking with the passage of the
Interstate Banking and Branching Efficiency Act of 1994. In sum, the geographic
deregulation of banking activity at the state and national levels provides an unusual
real-world experiment on the effects of such deregulation on banking behavior and
performance.4
A noneconometric study by Duca and McLaughlin [42] develops a taxonomy
of changes affecting bank profitability from 1985 to early 1990. Their discussion
relies on balance sheet and income statement data for all insured commercial banks
as well as for these banks separated into different size categories. Several general
conclusions emerge from their analysis. First, they determine that variations in
loan-loss provisions largely explain variations in bank profitability. In other words,
the variation in net income drops dramatically after purging loan-loss provisions.
Second, they note that total real estate loans grew as a fraction of interest-earning
assets at all insured commercial banks and across all size categories reported. That
is, the banking industry uniformly increased its exposure to real estate loans in the
late 1980s.

3 Kane [34] provides an excellent historical account of the deregulatory movements in the US
banking sector.
4 Existing work examines the effects of deregulation on various banking issues. For example, how
did deregulation affect new charters, failures, and mergers [35–37] as well as bank performance
[37–41]?

Decomposing industry performance measures typically adopts the approaches of
Bailey, et al. [43] and Haltiwanger [44], which trace back to the work of Bennet
[45] on decomposition. Bailey, et al. [43] provide an algebraic decomposition of
an industry’s total factor productivity (TFP) growth into three effects – “within,”
“between,” and “net-entry” effects. The within effect measures the contribution of
surviving firms toward TFP growth. The between (or reallocation) effect measures
the contribution of changing market share of surviving firms toward TFP growth,
while the net-entry effect measures the contribution of firms’ entry into and exit
from the industry toward TFP growth.
Haltiwanger [44] extends the Bailey et al. [43] approach and separates the effects
of firm entry and exit from the industry. Moreover, he also divides the between
effect into two components – the “share” and “covariance” effects. The share effect
measures the contribution toward aggregate TFP growth of the changing share of
firms while the covariance effect measures the contribution toward aggregate TFP
growth of the changing share of firms times the changing TFP growth of firms.5
Stiroh [46], using US banking data, further decomposes Haltiwanger’s [44] method
by dividing banks into those that acquired other banks and those that did not.
Such decomposition methods share a common index-number issue – the base-
year choice. Bailey, et al. [43], Haltiwanger [44], and Stiroh [46] all choose the
initial year as the base. Thus, the within effect measures the change in the performance
measure at the firm level between the initial and final years weighted by the
firm’s industry share in the initial year. Alternatively, another decomposition exists of
within, between (reallocation), entry, and exit effects where the final year provides
the base. That is, the within effect weights the change in the performance measure
between the initial and final years for each firm by the firm’s industry share in the
final year.
Finally, a Bennet [45] dynamic decomposition combines these two dynamic
decompositions into a simple average.6 Thus, the weights for the within, between
(reallocation), entry, and exit effects all equal simple averages of the initial- and
final-year weights. In addition, the Bennet dynamic decomposition of the industry
eliminates the covariance effect derived by Haltiwanger [44].7 Jeon and Miller [54]
modify the Bennet [45] method to consider geographic (regional) effects. Their
extension explores the effects, if any, of the deregulation of geographic restrictions
on the state-level decomposition of bank performance (return on equity). In that
analysis, they control for competition and the state of the economy in each state,
employing fixed- and random-effects regressions in the panel database across the
50 states and the District of Columbia from 1976 to 2005.

5 As illustrated below, the covariance effect emerges because of the decomposition method. The
Bennet [45] decomposition method causes the covariance effect to disappear.
6 This discussion possesses an analogy to the price index literature. The Laspeyres [47] price
index uses in the numerator the sum of the current prices times base-period quantities, and in
the denominator the sum of base prices times base-period quantities. The Paasche [48] price
index uses in the numerator the sum of the current prices times current-period quantities, and
in the denominator the sum of base prices times current-period quantities. The Fisher [49] ideal
price index, then, forms a geometric mean of the Laspeyres and Paasche indices. Pigou [50] also
proposed the ideal price index. Bennet [45] specifies the analogy to the Fisher ideal price index
for changes in revenue – the sum of prices times quantities. Diewert [51] provides an extensive
discussion of the Bennet index, showing that the Bennet index equals the arithmetic average of the
Laspeyres and Paasche difference index analogies.

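The index-number analogy drawn in the footnote on price indices can be illustrated numerically. The sketch below computes the Laspeyres, Paasche, and Fisher price indices as described there, plus a Bennet-type price indicator formed as the arithmetic average of the Laspeyres and Paasche price differences. The function names and the two-good data are hypothetical.

```python
import math


def laspeyres(p0, p1, q0):
    """Current prices over base prices, both weighted by base-period quantities."""
    return sum(p * q for p, q in zip(p1, q0)) / sum(p * q for p, q in zip(p0, q0))


def paasche(p0, p1, q1):
    """Current prices over base prices, both weighted by current-period quantities."""
    return sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p0, q1))


def fisher(p0, p1, q0, q1):
    """Geometric mean of the Laspeyres and Paasche indices."""
    return math.sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))


def bennet_price_indicator(p0, p1, q0, q1):
    """Price changes weighted by average quantities.

    This equals the arithmetic mean of the Laspeyres-type difference
    (base-quantity weights) and the Paasche-type difference (current-quantity weights).
    """
    return sum(0.5 * (a0 + a1) * (b1 - b0) for b0, b1, a0, a1 in zip(p0, p1, q0, q1))


# Hypothetical prices and quantities for two goods in two periods.
p0, p1 = [1.0, 2.0], [2.0, 3.0]
q0, q1 = [10.0, 5.0], [8.0, 6.0]
```

The ratio indices average multiplicatively (Fisher) while the difference form averages additively (Bennet), which is the sense in which the two are analogues.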
Researchers decompose average bank profitability in the industry, measured by
ROA or ROE, into factors attributable to improved profitability of individual banks
(the “within” effect), shifts of resources from less to more profitable banks (the
“between” effect), entries of more profitable banks (the “entry” effect), and exits
of less profitable banks (the “exit” effect). Jeon and Miller [54] apply their
modification of the Bennet dynamic decomposition to ROE in the commercial
banking industry between 1976 and 2005, where the microeconomic unit equals
the bank. They find that the between and exit (within and entry) effects contribute
positively (negatively) and strongly to the banking industry’s trend ROE. Moreover,
together the four components produce a negligible net effect over the 30-year period.
Interestingly, although all four effects explain the cumulative long-run change in
return on equity, they report that the within effect dominates the between, entry, and
exit effects on a year-to-year basis. Thus, the within effect dominates the cyclical
movements in bank performance, but the trend movement in bank performance
reflects the between, entry, and exit effects.
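
The four effects can be illustrated with a minimal sketch. It assumes a Griliches and Regev style formulation in which continuing banks carry the simple average of their initial- and final-year weights and every deviation is measured against the average of the two industry performance levels. The function name, the dictionary inputs, and the sample figures are hypothetical, and the code does not reproduce the Jeon and Miller [54] regional modification.

```python
def bennet_decomposition(share0, perf0, share1, perf1):
    """Split the change in share-weighted industry performance into
    within, between, entry, and exit effects using averaged weights.

    share0/share1 map bank id -> market share (each sums to one);
    perf0/perf1 map bank id -> performance (e.g., ROE) in each year.
    """
    industry0 = sum(share0[b] * perf0[b] for b in share0)
    industry1 = sum(share1[b] * perf1[b] for b in share1)
    # Reference level: average industry performance over the two years (assumption).
    ref = 0.5 * (industry0 + industry1)

    stayers = share0.keys() & share1.keys()
    entrants = share1.keys() - share0.keys()
    exiters = share0.keys() - share1.keys()

    # Within: performance change of continuing banks at average weights.
    within = sum(0.5 * (share0[b] + share1[b]) * (perf1[b] - perf0[b])
                 for b in stayers)
    # Between: share change weighted by average performance relative to ref.
    between = sum((share1[b] - share0[b])
                  * (0.5 * (perf0[b] + perf1[b]) - ref) for b in stayers)
    # Entry: positive when entrants outperform the reference level.
    entry = sum(share1[b] * (perf1[b] - ref) for b in entrants)
    # Exit: positive when exiting banks underperformed the reference level.
    exit_ = -sum(share0[b] * (perf0[b] - ref) for b in exiters)

    return {"within": within, "between": between, "entry": entry,
            "exit": exit_, "total": industry1 - industry0}


# Hypothetical data: bank C exits and bank D enters between the two years.
effects = bennet_decomposition(
    {"A": 0.50, "B": 0.30, "C": 0.20}, {"A": 0.10, "B": 0.08, "C": 0.02},
    {"A": 0.55, "B": 0.25, "D": 0.20}, {"A": 0.11, "B": 0.07, "D": 0.12},
)
print(effects)
```

Because market shares sum to one in both years, the four effects add up exactly to the change in industry performance, and no covariance term appears.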
The growing market share of high-performance banks at the expense of low-
performance banks explains the trend movement in bank performance over the
sample period.8 That is, the trend movement in industry performance reflects
a process of creative destruction, whereas the cyclical, year-to-year movement
primarily reflects the fortune of individual bank performance (within effect). Thus,
creative destruction generally confines itself to the long-run trend adjustments but
not to the short-run, cyclical adjustments. The 1993–2005 period after the banking
crisis provides the exception to this rule whereby creative destruction played a more
important role in short-run, cyclical movements in bank performance.

7 Griliches and Regev [52] employ this decomposition method in their study of firm productivity in
Israeli industry. Scarpetta et al. [53] briefly describe the Griliches and Regev [52] and Haltiwanger
[44] methods of decomposition, noting how they differ. Jeon and Miller [54], however, link the
differences to the base-year weighting issue. Finally, Bartelsman et al. [55] note that the covariance
term disappears for their decomposition. Balk [56] also provides an extensive review of the Bennet
[45] decomposition in terms of productivity changes.
8 Stiroh and Strahan [57], using a different methodology, reach a similar conclusion.

Economies of Scale

Research on banking as an application of production economics began in earnest
around five decades ago with a series of papers that explored the existence of
economies and diseconomies of scale. Numerous studies of scale economies in banking
exist, generally tracing their roots to the work by Benston [58] and Bell and Murphy
[59]. Clark [60] reviews most studies through the early phase of deregulation,
including economies of scope. These papers, using samples of smaller banks with
under $1 billion in assets, generally conclude that economies of scale quickly
disappear and that further increases in bank size lead to diseconomies of scale.
The extrapolation of findings for samples of smaller banks to large banks proved
problematic. First, small and large banks provide different services to a different
mix of markets and customers, suggesting that small and large banks probably face
different cost structures. Second, the deregulation in banking probably rendered the
results of prior studies moot since the initial studies occurred before the more recent
significant deregulation.
Several papers examine banks with deposits above $1 billion (e.g., [61–65]).
These studies, however, except for Nelson [63] and Noulas et al. [64], treat output
as a single composite commodity and do not disaggregate output. To aggregate,
one must assume that the cost function is separable in output. That is, the relative
marginal cost of any two outputs proves independent of input prices. If this
separability assumption fails to hold, then the estimation may produce biased scale
results. Noulas et al. [64] disaggregate output to test the separability assumption,
finding that it does not hold.
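
The separability condition can be stated compactly. The notation here is an assumption introduced for illustration and does not come from the chapter: C(y, w) denotes the cost function with output vector y and input price vector w, and h(y) denotes the composite output. Aggregation presumes that cost depends on outputs only through h(y), which implies that ratios of marginal costs do not vary with input prices.

```latex
C(y, w) = \tilde{C}\bigl(h(y), w\bigr)
\quad\Longrightarrow\quad
\frac{\partial}{\partial w_k}
\left(
  \frac{\partial C(y, w)/\partial y_i}{\partial C(y, w)/\partial y_j}
\right) = 0
\qquad \text{for all } i, j, k .
```

Under the left-hand form, each marginal cost equals \tilde{C}_h h_i, so the ratio reduces to h_i / h_j, which contains no input prices.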
Since this original work on economies of scale in banking, banking organizations
have grown much larger. The experience of the
Great Recession also ignited the debate on “too-big-to-fail” institutions. As such, the
determination of when banking institutions exhaust economies of scale becomes an
important public policy issue. As Humphrey [61] noted, estimates of scale economies
based on a single year’s cross section do not generalize to other years. Thus, Wheelock
and Wilson [66] examine the extent of scale economies for 1986, 1996, 2006, and
2015, using cost, revenue, and profit functions. They derive estimates of returns to
scale from a nonparametric, local-linear estimator. They find that, for many
institutions, increasing returns to scale persisted at ever larger sizes as
time moved from 1986 to 1996 to 2006 to 2015. Their strongest findings relate to
economies of scale derived from cost data, whereas the revenue and profit estimates
proved more mixed.

Economies of Scope

Economies and diseconomies of scale give information about optimal bank size.
Scale effects, however, do not consider the growth of banking institutions by
diversification of the financial products offered within the institution, that is, the
multiproduct nature of bank production. To examine this issue, researchers focus
on economies and diseconomies of scope, considering the potential cost
complementarities associated with multiproduct production. Several authors develop the
original research on economies and diseconomies of scope (e.g., [67–70]). These
papers, however, fail to document evidence of economies of scope. They do find
limited evidence of cost complementarities between some pairs of outputs. Ferrier
and Lovell [71] report limited pairwise cost complementarities. Kim [72] and Buono
and Eakin [73] find economies of scope. In contrast, Cebenoyan [74] documents
diseconomies of scope. Mester [75] discovers evidence of significant economies of
scope for savings and loan associations, estimating a translog cost function. She
uses a sample of 1115 US savings and loans with positive equity in 1991. Mester
[76], however, uncovers no evidence of economies of scope, estimating a translog
cost function. She employs a sample of 214 3rd Federal Reserve District banks in
1991–1992.
This initial research on scope effects in banking generally focuses on small
banks with less than $1 billion in assets. Some research did consider large banks
of more than $1 billion in assets. Rangan et al. [77] estimate scope effects for
different sizes of banks, reporting diseconomies of scope for large banks. They
actually measure cost subadditivity rather than scope economies since they do not
permit specialized production. Hunter, et al. [78] also consider cost subadditivity for
large banks, finding no subadditivity in the bank cost function and no measurable
cost complementarities. Pulley and Braunstein [79] and Pulley and Humphrey [80]
document widely varying scope estimates, using a translog cost function, when
outputs take on zero values. They then employ a “composite function” that does
not exhibit such instability and find economies of scope.
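
A small sketch shows why zero output values matter. The usual scope measure compares the cost of specialized production, which sets one output to zero, with the cost of joint production, whereas a translog cost function enters outputs in logarithms and is therefore undefined at zero. The quadratic cost function, its parameter values, and all function names below are hypothetical and serve only to illustrate the measure; they are not the composite function of Pulley and Braunstein [79].

```python
import math


def quadratic_cost(y1, y2, fixed=10.0, a1=2.0, a2=3.0, comp=0.05):
    """Hypothetical two-output cost function; comp > 0 is a cost complementarity."""
    return fixed + a1 * y1 + a2 * y2 - comp * y1 * y2


def scope_economies(cost, y1, y2):
    """[C(y1, 0) + C(0, y2) - C(y1, y2)] / C(y1, y2); positive means scope economies."""
    joint = cost(y1, y2)
    return (cost(y1, 0.0) + cost(0.0, y2) - joint) / joint


def log_output_defined(y):
    """A translog term needs log(y), which does not exist at y = 0."""
    try:
        math.log(y)
        return True
    except ValueError:
        return False


print(scope_economies(quadratic_cost, 10.0, 10.0))  # positive: joint production is cheaper
print(log_output_defined(0.0))                      # False: log(0) is undefined
```

Because the logarithm cannot be evaluated at zero, translog studies must evaluate specialized production at small positive output levels instead, and the resulting scope estimate can depend heavily on the value chosen.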
Noulas et al. [81] determine that the ordinary translog cost function generates
unreliable estimates of scope effects, showing that the scope measures change in
size and magnitude considerably in a sensitivity analysis. Moreover, at the point of
evaluation of scope effects, the regularity conditions fail to hold, suggesting that all
studies of scope effects using a translog cost function are suspect. Instead, using a
hybrid cost function (i.e., a Box-Cox transformation), Noulas et al. [81] discover
economies of scope for the only specification where regularity conditions hold at
the point of scope evaluation. Furthermore, the results are stable, as revealed by
evaluating the hybrid cost function at values of output close to zero.
Mitchell and Onvural [82] employ a Fourier flexible functional form cost
function to estimate scope effects for large banks. They conclude
that banks experience little or no cost gains from producing an output as a single
bank or as a part of a larger organization. That is, economies of scope do not seem
too important.
The prior work seeks evidence on economies of scope in the cost function.
Economies of scope, however, may appear in the revenue function. Berger et al.
[83] search for evidence of economies of scope in the bank revenue function. They find
that revenue economies of scope prove insignificant.

The existing literature supports the conclusion that little, or unclear, evidence
exists that commercial banks experience economies of scope. That is, the majority
of studies find no, or unclear, evidence of such scope economies.

Efficiency

Shortly after the search for economies and diseconomies of scale in banking
began, another related line of research started on measuring the efficiency of bank
production, using frontier-modeling techniques. Several authors develop the basic
methods for measuring inefficiency in production (e.g., [84–86]), which involves
deviations of actual from best-practice outcomes. That is, the best-practice frontier
provides the benchmark of production against which the researcher measures actual
production. Various mathematical programming and statistical techniques exist for
the construction or estimation of the best-practice frontier.
Two general categories of frontiers exist – deterministic and stochastic frontiers.
Both techniques encompass the observed data, but in differing ways. Deterministic
frontiers, once constructed, remain fixed in space and encompass all sample
observations. Consequently, a small subset of data supports the frontier, making
it more prone to sampling, outlier, and statistical-noise errors, which can distort
the efficiency measure. Two different methods construct deterministic frontiers.
Mathematical programming methods assume no statistical noise, an unreasonable
assumption for large economic data sets [87], while the statistical techniques assert
that random shocks, statistical noise, and firm-specific effects combined reflect
inefficiency, also an unreasonable assumption [88].
Stochastic frontiers avoid some of the problems associated with deterministic
frontiers by explicitly considering the stochastic properties of the data and distin-
guishing through a composite error term between firm-specific effects that relate to
inefficiency and random shocks or statistical noise. Here, the frontier can shift from
one observation to the next, being random rather than exact. Two types of stochastic
frontier models exist – parametric and nonparametric. Parametric versions, on the
one hand, have a longer record of development and accomplishment. Nonparametric
versions, on the other hand, are newer approaches, with current research effort
focusing on making the nonparametric methods more efficient [89].
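
For a production frontier, the composite error term is conventionally written as follows. The symbols are assumptions introduced for illustration: y_i is (log) output of firm i, f(x_i; β) is the frontier, v_i is two-sided statistical noise, and u_i is the one-sided inefficiency term.

```latex
y_i = f(x_i; \beta) + v_i - u_i ,
\qquad u_i \ge 0 ,
\qquad \varepsilon_i \equiv v_i - u_i .
```

For a cost frontier the sign of the one-sided term reverses, so that inefficiency raises cost above the frontier.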
Parametric stochastic frontiers confront researchers with other issues, however.
First, researchers must adopt an explicit functional form to represent the production
or cost function, which risks misspecification. Thus, researchers adopt flexible
functional forms, such as the translog or the Fourier flexible functional form, to
address such concerns to some extent.
Second, researchers must make strong distributional assumptions on the one-
sided error term in cross section studies. Some evidence exists suggesting that the
distributional assumptions do not exert a dramatic effect on the ranking of agents
based on inefficiency estimates (e.g., [90, 91]). While the relative ranking of agents
seems largely invariant based on inefficiency scores, the inefficiencies differ over
alternative distributional assumptions on the one-sided error term, with “the single
parameter models . . . providing a more pessimistic impression than warranted” [91,
p. 158].
Berger and Humphrey [92] introduce the “thick-frontier” model, where the
frontier depends on a group of “best-practice” firms and where weaker distributional
assumptions support the estimation. For multiple-input/multiple-output models,
however, this thick-frontier approach may prove problematic because the ordering
criterion implies a different model from that estimated.
Berger and Humphrey [93] survey 130 frontier efficiency studies of financial
institutions in 21 countries. They evaluate the methods and estimates of the 130
papers, identifying similarities and differences in the modeling approaches and
resulting efficiency estimates. Finding that differences existed in the results of
these studies, they offer suggestions for improving the consistency, accuracy, and
usefulness of future work. They conclude by recommending areas of future research.
Sickles et al. [94] provide a summary of various methods to estimate productive
efficiency for firms, sectors, and countries. They also provide references and
internet links to data sources and R codes to implement the analyses. The authors
describe the approaches used to estimate production frontiers using stochastic
frontier analysis (SFA) and data envelopment analysis (DEA). The R codes estimate
production efficiency with various methods, including “ . . . time invariant fixed
effects, correlated random effects, and uncorrelated random effects panel stochastic
frontier estimators, time varying fixed effects, correlated random effects, and
uncorrelated random effects estimators, semi-parametric efficient panel frontier
estimators, factor models for cross-sectional and time-varying efficiency, bootstrap-
ping methods to develop confidence intervals for index number-based productivity
estimates and their decompositions, DEA and Free Disposable Hull estimators” [94,
p 267].
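
Of the estimators in that list, the free disposal hull (FDH) is simple enough to sketch without a linear programming solver. The code below is an illustrative input-oriented version written for this passage, not the R code of Sickles et al. [94]. The function name and the sample data are hypothetical, and all input quantities are assumed strictly positive.

```python
def fdh_input_efficiency(inputs, outputs, o):
    """Input-oriented FDH score for unit o (1.0 means no observed unit dominates it).

    inputs[j] and outputs[j] are lists of input and output quantities for unit j.
    """
    best = 1.0
    for j in range(len(inputs)):
        # Unit j can serve as a benchmark only if it produces at least as
        # much of every output as unit o.
        if all(yj >= yo for yj, yo in zip(outputs[j], outputs[o])):
            # Proportional scaling of unit o's inputs needed to cover unit j's inputs.
            theta = max(xj / xo for xj, xo in zip(inputs[j], inputs[o]))
            best = min(best, theta)
    return best


# Hypothetical sample: three banks, two inputs, one output.
inputs = [[2.0, 4.0], [4.0, 8.0], [3.0, 3.0]]
outputs = [[10.0], [10.0], [5.0]]
scores = [fdh_input_efficiency(inputs, outputs, o) for o in range(len(inputs))]
print(scores)  # [1.0, 0.5, 1.0]
```

In this sample the second bank produces the same output as the first with twice the inputs, so its score of 0.5 says it could halve every input and still match an observed bank.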

Production Efficiency
At the most basic level, we can consider the production efficiency of banks.
That is, do banks produce the maximum output for given inputs or employ the
minimum inputs for given outputs? This level of analysis focuses on the production
function.
Rangan et al. [95] compute a nonparametric frontier to estimate technical
efficiency of a sample of 215 US banks. They find that banks could reduce their
input use by 70% and still produce the same output, where most of this reflected
pure technical inefficiency from wasted inputs rather than from scale inefficiency.
Further, they show that larger banks exhibit more technical efficiency.
Elyasiani and Mehdian [96] collect a random sample of 144 banks, including
small (78) and large (66) banks, in 1985 and consider the total, technical, and
scale efficiency of these banks. The analysis includes four inputs (labor, capital,
deposits excluding large CDs, and large CDs) and one output (total revenue). They
employ a deterministic statistical frontier method, imposing ray-homotheticity [97].
They shift the estimated production function by its intercept until no residuals
are positive and at least one residual is zero. They conclude that larger banks
prove more efficient and that much of the improved efficiency comes from
improved scale efficiency, unlike Rangan et al. [95] who report the opposite
finding.
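
The intercept shift described for Elyasiani and Mehdian [96] corresponds to what is commonly called corrected ordinary least squares. The sketch below assumes a single regressor and invented log-scale data purely for illustration; it does not impose ray-homotheticity or use the four inputs of the original study.

```python
def cols_frontier(x, y):
    """Fit y = a + b*x by OLS, then raise the intercept until no residual is positive."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Shift by the largest residual: the best performer then lies on the frontier.
    shift = max(residuals)
    frontier_intercept = intercept + shift
    # Distance of each observation below the shifted frontier.
    inefficiency = [shift - r for r in residuals]
    return frontier_intercept, slope, inefficiency


# Hypothetical log input and log output for five banks.
log_x = [1.0, 2.0, 3.0, 4.0, 5.0]
log_y = [1.0, 2.5, 2.9, 4.2, 4.8]
a, b, ineff = cols_frontier(log_x, log_y)
print(a, b, ineff)
```

Every inefficiency value is nonnegative and exactly one equals zero here, which is the deterministic-frontier property discussed earlier: all deviation from the frontier, including noise, counts as inefficiency.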
Elyasiani and Mehdian [21] also consider the determination of technical effi-
ciency, using a nonparametric linear programming approach to calculating effi-
ciency. They collect a sample of 191 large banks that match in 1980 and 1985. The
production process includes four outputs (investments, real estate loans, commercial
and industrial loans, and other loans) and four inputs (time and saving deposits plus
CDs, demand deposits, capital, and labor). They conclude that significant technical
progress occurred between 1980 and 1985.
Miller and Noulas [98] consider the technical efficiency of 201 large banks
(banks with assets over $1 billion) from 1984 to 1990, employing DEA.
They average the data for each bank over the 6 years of data. The analysis includes
four inputs (total transactions deposits, total nontransactions deposits, total interest
expense, and total noninterest expense) and six outputs (commercial and industrial
loans, consumer loans, real estate loans, investments, total interest income, and
total noninterest income). They conclude that large banks exhibit low technical
inefficiency, about 5%. Moreover, larger, more profitable banks exhibit
lower technical inefficiency.
Elyasiani and Wang [99] examined the productive efficiency of Bank Holding
Companies (BHC) in the United States. They calculate total factor productivity
(TFP) using the Malmquist index, applying data envelopment analysis (DEA).
They determined the change in TFP for their sample of BHCs using data from
1997 through 2007. The question focused on the effect of BHCs’ attempts to
diversify their business across banking, securities, and insurance activities using
Section 20 subsidiaries prior to the adoption of the Gramm-Leach-Bliley Act
(GLBA) of 1999 and using financial holding companies after the GLBA. They
concluded that technical efficiency correlated negatively with business diversifica-
tion and that changes in diversification do not significantly affect TFP. The driving
force behind their findings was whether BHCs experienced first mover advantage
by employing Section 20 diversification prior to the adoption of the GLBA in
1999.

Cost Efficiency
Initial work on bank efficiency concentrates on cost efficiency, which directly relates
to the concept of economies and diseconomies of scale and scope. These
cost-efficiency analyses generally report significant overall technical and/or allocative
inefficiencies in commercial banking that tend to decrease with bank size. These
studies (e.g., [62, 70, 75, 76, 92, 100–102]), however, do not give much evidence
across the range of banks of differing sizes, since they either exclude large
commercial banks (i.e., banks with more than $1 billion in assets) or include a
relatively small number of large banks in their samples. In other words, one cannot
draw inferences about small-, medium-, and large-bank performance based on these
studies. Berger and Humphrey [92], using the thick-frontier method, provide the
exception, as they consider inefficiency for the sample of all commercial banks
in 1984, finding, among other things, that differences in inefficiency dominate
differences in scale and product mix when explaining differences in average cost
across banks.
Kaparakis et al. [103] extend the existing cost-efficiency literature in two direc-
tions. First, they produce inefficiency estimates for a large sample of commercial
banks (i.e., 5548), including large banks with assets exceeding $1 billion, based on a
flexible stochastic cost frontier. Thus, they directly compare the efficiencies of small,
medium, and large banks, precluding unnecessary out-of-sample extrapolations.
Second, they estimate the effects, if any, of external factors (e.g., state popula-
tion density), enacted policies (e.g., state branching regulations), and managerial
qualities (e.g., portfolio riskiness) on the variability of bank efficiency, finding that
generally banks exhibit less efficiency with increasing size, reversing the finding
noted above.

Revenue Efficiency
As noted above, efficiency studies began with cost efficiency. Since the microe-
conomic theory of production begins with cost minimization and progresses to
profit maximization, it makes sense to see a migration toward profit efficiency
(see next section) as the next step from cost efficiency. Some researchers, however,
take an intermediate step and consider revenue efficiency, where much of the work
examines countries other than the United States.
English et al. [104] employed a sample of 442 small banks that participated in
the Federal Reserve’s Functional Cost Analysis program in 1982. They apply the
output distance function to calculate output allocative and technical efficiency. The
duality between the output distance function and the revenue function permits the
derivation of the shadow prices of outputs for individual banks. Then the comparison
of shadow and actual prices of outputs allows a determination of whether banks
select an allocatively efficient output mix. They conclude that this sample of small
banks exhibits significant technical inefficiencies and does not maximize revenue.
Rogers [105] estimates cost, revenue, and profit efficiency using models that
include and exclude a nontraditional measure of output – net noninterest income. He
adopts stochastic translog specifications for the cost, revenue, and profit functions
and uses a panel data sample of US commercial banks from 1991 to 1995 from
the Sheshunoff, Ferguson and Co. Bank Source CD. He reports that cost and profit
efficiency improve after including nontraditional output, whereas revenue efficiency
declines.

Profit Efficiency
Banker and Maindiratta [106] carry the method for measuring cost efficiency over
to profit efficiency. They extend the approach of Farrell [85], who offers a procedure
for calculating cost efficiency and decomposing it into technical and allocative
components, to generate multiplicative technical and allocative profit efficiency
measures. Several authors measure profit efficiency by the ratio of actual to
optimal profit, given input and output prices (e.g., [106–108]). Thus, the ratio of
the firm’s actual profit to its input-oriented technically efficient profit measures
technical efficiency, which creates several problems. First, prices affect the technical
efficiency component in this multiplicative decomposition, which contrasts with
the standard cost and revenue technical efficiency measures that do not depend
on prices. Second, when the firm earns negative profit and optimal profit remains
positive, then measured profit efficiency becomes negative. If optimal profit also
becomes negative, then the profit efficiency ratio becomes positive, but greater than
one.
Banker and Maindiratta [106] note this latter problem and propose an alternative
decomposition based on profit differences. In an earlier paper, Nerlove [109]
measures profit efficiency as the difference between optimal and actual profit, which
he calls lost profit. Nerlove [109] notes that this profit efficiency measure varies with
proportional price changes. To address this issue, Chambers et al. [110] modify
Nerlove [109] and normalize profit using the value of the reference input-output
combination.
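
A few lines of arithmetic make these measurement problems concrete. The sketch assumes a single output and a single input, and the prices, quantities, function names, and reference bundle (g_y, g_x) are hypothetical values chosen for illustration, not figures from the cited papers.

```python
def profit(p, y, w, x):
    """Profit with one output (price p, quantity y) and one input (price w, quantity x)."""
    return p * y - w * x


def ratio_efficiency(actual, optimal):
    """Multiplicative measure: actual profit divided by optimal profit."""
    return actual / optimal


def lost_profit(actual, optimal):
    """Difference measure: optimal profit minus actual profit."""
    return optimal - actual


def normalized_inefficiency(actual, optimal, p, w, g_y, g_x):
    """Lost profit divided by the value of a reference output-input bundle."""
    return (optimal - actual) / (p * g_y + w * g_x)


# The ratio misbehaves once profits turn negative.
print(ratio_efficiency(-10.0, 5.0))    # -2.0: negative "efficiency"
print(ratio_efficiency(-10.0, -5.0))   # 2.0: positive but greater than one

# Doubling all prices doubles lost profit but leaves the normalized measure unchanged.
for scale in (1.0, 2.0):
    p, w = 2.0 * scale, 1.0 * scale
    actual = profit(p, 10.0, w, 12.0)
    optimal = profit(p, 15.0, w, 14.0)
    print(lost_profit(actual, optimal),
          normalized_inefficiency(actual, optimal, p, w, g_y=1.0, g_x=1.0))
```

The normalization cancels any proportional change in prices, which is the property the difference measure alone lacks.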
Berger et al. [111] use the distribution-free method to consider profit inefficiency.
They decompose profit inefficiency into technical and allocative components, using
estimated shadow prices that can differ from actual prices. Technical inefficiency
measures the movement from the actual production point to the frontier, based on
the shadow prices; allocative inefficiency then measures the movement along the
frontier from the shadow prices to the actual prices. They discover that large profit
inefficiencies exist that include “half of all potential variable profit lost due to
inefficiency” [p 328]. Technical inefficiencies dominate allocative inefficiencies,
where the “technical components averaged between about two and five times as
great as the allocative components” [p 328], and output inefficiency exceeds
input inefficiency.
Sahoo et al. [112] extend the work of Fukuyama and Weber [113] and Färe
and Grosskopf [114], who likewise extend the work of Tone [115] on measuring
cost and revenue efficiency as well as profit inefficiency. Sahoo et al. [112] develop
new directional cost-based (DCE) and revenue-based (DRE) measures of
efficiency that satisfy translational invariance and strong monotonicity. They then
decompose the DCE and DRE measures into directional value-based technical
(TE) and allocative (AE) efficiency components. Although this paper is primarily
theoretical, the empirical application uses a sample of 50 US banks from 1996.

Nonperforming Loans and Bank Efficiency

One strand of literature examines the effect of bad outcomes, typically
nonperforming loans, on bank efficiency. Two of the original papers in this literature are
Berger and DeYoung [116] and Mester [117], both of which use the stochastic frontier
method of estimation. Berger and DeYoung [116] outline how nonperforming loans
can affect bank efficiency measurement. They identify four channels of potential
influence – “bad luck,” “bad management,” “skimping,” and “moral hazard.” The
bad-luck hypothesis reflects events external to the bank. For example, the coronavirus
pandemic leads to a series of bankruptcies among a number of a bank’s borrowers. The
other three hypotheses all reflect internal events. The bad-management hypothesis
refers to poor senior management practice in managing the portfolio of assets as
well as the day-to-day operations of the bank. The skimping hypothesis captures the
trade-off between allocating resources to the underwriting process and monitoring
activities against future loan performance problems. That is, reducing the short-run
costs of underwriting and monitoring loans will appear to improve bank efficiency,
since the bank lowers costs for the same quantity of loans. But as time passes, the
skimping activity will reveal itself in more nonperforming loans. The moral hazard
hypothesis sees lower-capitalized banks taking on more risk. Berger and DeYoung
[116] note that moral hazard does not directly affect bank cost and efficiency, but it
does contribute to nonperforming loans.
Since the bad-luck hypothesis reflects external influences, researchers need to
incorporate nonperforming loans into their estimates of cost, revenue, and profit
efficiency. Doing so will eliminate the extra costs of dealing with nonperform-
ing loans that reflect external factors and not managerial inefficiency. For the
bad-management and skimping hypotheses, since they reflect internal influences,
researchers should not incorporate nonperforming loans into their estimates of cost,
revenue, and profit efficiency as this will tend to overstate efficiency as costs are
artificially inflated. Berger and DeYoung find that bad luck and bad management
both generate significant effects. Skimping also provides a significant effect, but
its magnitude is smaller. Podpiera and Weill [118] reconsider the bad luck and
bad management hypotheses in emerging markets, finding support for the bad
management hypothesis. Koutsomanoli-Filippaki and Mamatzakis [119] slightly
redefine the hypotheses as the “bad luck,” “bad management,” “skimping/moral
hazard,” and “risk averse” hypotheses, finding that most evidence supports the
bad luck hypothesis with some evidence for the skimping/moral hazard hypothesis
for a sample of EU banks. Further, Mamatzakis et al. [120] follow the definitions
in Koutsomanoli-Filippaki and Mamatzakis [119] and find evidence of skimping/moral
hazard for bankrupt loans and evidence of bad luck for restructured
loans, using a sample of Japanese banks.

Return and Risk Efficiency

Implicitly, models of cost, revenue, and profit inefficiency assume risk-neutral bank
managers. That is, the decision rule maximizes or minimizes the expected outcome,
with no concern for the level of risk in the bank’s operation. More
recent research focuses on managing information asymmetries, adverse selection
and moral hazard issues, and risk in addition to bank returns. This literature models
the bank manager as maximizing utility, where the utility function includes expected
return on bank operations and a measure of risk. Hughes and Mester [121] provide
an overview of this literature.
Hughes et al. [122–124] associate the risk of managerial decisions with the
selection of production plans. Thus, the utility of management depends on profit and
the choice of outputs and inputs (i.e., the production plan). A subjective probability
distribution is associated with each production plan, according to the beliefs of the
managers about the relationship between future economic states and the production
plans in the determination of profit. Thus, the managerial utility function of profit
and its risk reflects how the production plans underlie the determination of profit
and its risk. They employ the Almost Ideal Demand System (AIDS) developed
by Deaton and Muellbauer [125], since this specification leads to the translog
specification when bank managers are risk neutral. The frontier that emerges trades
off expected return and risk, which represents the efficient frontier in portfolio
theory.
In a comment on Hughes et al. [122], Hunter [126] refers to their model as
coming from the third generation. The first-generation models focused on the cost
function of a single output that allowed the measurement of economies and dis-
economies of scale. In second-generation models, banks produce multiple outputs
and researchers employ modern duality theory to uncover measures of productive
efficiency such as ray and expansion-path scale economies, economies of scope,
subadditive cost functions, technological change, and measures of managerial
inefficiency. Hughes et al. [122] develop a model that comes from the third-
generation and focuses on bank profit coupled with uncertainty and risk aversion.
Hughes et al. [122] employ a sample of 443 bank-holding companies in 1994
whose assets ranged from $32.5 million to $249.7 billion. They calculate volatility
of deposits as their risk measure. They find that bank managers do not operate in
a risk-neutral manner and do not maximize profit. They also discover evidence
supporting scale economies in all cases and significant inefficiency of opera-
tion. Finally, they conclude that more geographic and/or depositor diversification
increases the bank’s expected return and that more branches move the institution
toward the frontier reducing inefficiency.
Färe et al. [127] use the directional distance function to derive profit inefficiency,
normalizing by revenue plus cost. Technical efficiency improves with bank size, and
profit inefficiency relates more to allocative than technical sources. They find that
profit inefficiency and its allocative part decrease with bank size. They use 1000
banks ranging from $8 million to $150 billion in assets and 114 banks above $300
million in assets.
A major conclusion from this literature relates back to the issue of economies
of scale in banking. The history of measuring economies of scale led to different
outcomes, depending on the range of bank sizes in the sample considered. That is, as
researchers used samples of larger and larger banks, the range of economies of scale
also expanded to much larger sizes. This new literature on estimating the expected
return and risk frontier in banking also permits the estimation of economies of scale.
Hughes et al. [128] determine that economies of scale depend on how the researcher
models capital structure and risk taking. Larger banks exhibit economies
of scale that increase with bank size along the value-maximizing expansion path
rather than the cost-minimizing path. In addition, lower risk through diversification
is associated with larger scale economies.
Hughes and Mester [129] argue that recent advances in the modeling of bank
decision-making that accounts for the endogeneity of risk in the model affect the
production, revenue, cost, and profit of banking institutions. Thus, for example,
unless one accounts for the endogenous nature of risk taking in banking, one will
mis-measure economies of scale. They state that banking studies of large banks that
ignore the role of risk do not find evidence of economies of scale. Those studies of
large banks, however, that do incorporate a role for endogenous risk find evidence
of substantial economies of scale for large banks.
Hughes and Mester [129] also report that modeling bank decision-making with
endogenous risk leads to performance at large banks that may threaten financial
stability. As such, this finding implies an important role for capital regulation of
large financial institutions.

Bank Failure

Bank failures generally occur during economic distress. Some studies link bank
operating and capital characteristics and the likelihood of failure. Bank failures
destroy banking relationships, lead to reductions in small business lending, and
impair the public’s trust in the banking system. Bank failures prove costly to the US
banking system, and during economic downturns bank failures can lead to further
declines in output [130].
Bank failure prediction models provide regulators with a tool that aids in
preventing bank closures and, thus, reduces the adverse effect of bank failure on
communities. Jagtiani et al. [131] examine the Early Warning System (EWS) models
that the Federal Reserve and the Office of the Comptroller of the Currency use to
predict problem banks.

Economic Effects of Bank Failures

Most bank failures occur during periods of economic decline. In his review of
the aggregate banking data, Ashcraft [130] finds a statistically and economically
significant relationship between bank failures and real declines in county income.
Boyd et al. [132] find that bank failures lead to prolonged, significant declines in
real GDP growth, especially for less developed countries. They conclude that the
effect of failed banks on output may persist longer than previously thought and,
in some local areas, may prove long-term. Dell’Ariccia et al. [133]
control for economic contractions and currency crises in their study of the effect
of banking failures on real economic variables. They find that banking crises cause
declines in real economic growth because of less lending. Driscoll [134] uses an
instrumental variable approach, however, to determine that banking shocks do not
have a statistically significant relationship with real output at the macroeconomic
level.

Early Warning System Models

Bank failure prediction models, also called early warning systems, use a variety of
econometric models. Crowley and Loviscek [135] review four methods by using
annual data to determine which model would most accurately predict the failure of
small banks in 1984. They find that logit and probit models deliver higher accuracy
than linear probability and discriminant analysis models. Jagtiani et al. [131] also
compare logit models to nonparametric models. Simple logit estimation techniques
prove as valuable as more complex logit and nonparametric models in predicting
future bank failures.
Kolari et al. [136] use logit analysis to correctly predict large US commercial
bank failures with 96% accuracy 1 year prior to failure and 95% accuracy 2 years
prior to failure. Their bank sample included quarterly data on 50 large banks with
assets greater than $250 million that failed between 1989 and 1992. The authors
also employ a trait recognition model with multiple variable interactions.
Jagtiani et al. [131] develop a logit model with a primary focus on capital ratios.
The basic logit model with two capital ratios appeared to predict capital inadequacy
as accurately as more complex computer-based systems, especially 1 year prior to
failure. Bank capital became a critical component of bank examinations during the
savings and loan (S&L) crisis, specifically with the advent of the Basel I Accord in
1988, which established minimum capital requirements for banks.
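
A minimal sketch of a two-ratio logit early warning model may help fix ideas. Everything in it is an assumption made for illustration: the data are simulated, the variable names (equity_ratio, risk_cap_ratio) and the coefficient values are hypothetical, and the Newton-Raphson fit stands in for whatever estimation routine the cited studies used. It is not the specification of Jagtiani et al. [131].

```python
import numpy as np

def predict_prob(X, beta):
    """Logistic probability of failure for each row of X."""
    return 1.0 / (1.0 + np.exp(-X @ beta))

def fit_logit(X, y, n_iter=25, tol=1e-8):
    """Fit a logit model by Newton-Raphson; X must include a constant column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = predict_prob(X, beta)
        grad = X.T @ (y - p)                  # score vector
        weights = p * (1.0 - p)
        hess = X.T @ (X * weights[:, None])   # negative Hessian of the log-likelihood
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Simulated banks (hypothetical values, not real supervisory data).
rng = np.random.default_rng(0)
n = 2000
equity_ratio = rng.normal(0.10, 0.03, n)     # assumed equity-to-assets ratio
risk_cap_ratio = rng.normal(0.12, 0.04, n)   # assumed risk-based capital ratio
X = np.column_stack([np.ones(n), equity_ratio, risk_cap_ratio])

# Assumed data-generating process: more capital lowers failure probability.
true_beta = np.array([2.0, -30.0, -20.0])
fail = (rng.random(n) < predict_prob(X, true_beta)).astype(float)

beta_hat = fit_logit(X, fail)
prob = predict_prob(X, beta_hat)

# In-sample share of banks classified correctly at a 0.5 cutoff.
accuracy = np.mean((prob > 0.5) == (fail == 1.0))
print("estimated coefficients:", beta_hat)
print("in-sample accuracy:", accuracy)
```

Because failures are rare in such samples, a 0.5 cutoff classifies most banks as survivors; an applied early warning system would typically choose a lower cutoff and evaluate the model out of sample.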
Researchers consider other determinants of bank failure, including Moody’s debt
ratings, complex accounting ratios, and bond spreads. Jagtiani and Lemieux [137]
find that bond spreads can signal trouble in the financial markets beginning up to six
quarters before a failure. Bond spreads for banks that will probably fail exceed the
bond spreads for healthy banks. Henebry [138] uses cash-flow spreads to develop a
bank failure prediction model based on bank failures from 1986 to 1990. Jeon and
Miller [37] discover from a failure regression that deregulation did not significantly
affect bank failures and banks with high noninterest income were more likely
to fail.
DeYoung [139] uses hazard function analysis of annual data to model the
likelihood of failure for new commercial banks. The author considers two major
issues while shaping the model: banking conditions in the metropolitan market and
different legal policies across states. Bank failure studies inform policy makers and
federal regulators (e.g., [135, 140, 141]).
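
The idea behind hazard analysis can be sketched with the simplest possible case, a constant (exponential) hazard with right censoring. This is a deliberately simplified illustration under stated assumptions, not the model in DeYoung [139]: the data are simulated, the 10-year observation window and the 5% annual hazard are hypothetical, and no covariates such as market conditions or state policies are included.

```python
import numpy as np

def exponential_hazard_mle(duration, failed):
    """MLE of a constant hazard rate: observed failures / total time at risk."""
    return failed.sum() / duration.sum()

rng = np.random.default_rng(1)
n = 5000
true_hazard = 0.05    # assumed annual failure hazard for new banks
window = 10.0         # assumed years of observation before censoring

# Latent time to failure for each bank, then censor at the end of the window.
latent = rng.exponential(1.0 / true_hazard, n)
duration = np.minimum(latent, window)        # years each bank is observed
failed = (latent <= window).astype(float)    # 1 if failure is observed, 0 if censored

hazard_hat = exponential_hazard_mle(duration, failed)

# Implied probability that a new bank survives its first 5 years.
survival_5y = np.exp(-hazard_hat * 5.0)
print("estimated hazard:", hazard_hat)
print("implied 5-year survival probability:", survival_5y)
```

Censored banks contribute time at risk but no failure event, which is why the estimator divides by total exposure rather than by the number of banks.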
Bank failures receive significant attention in the literature because they
are costly and compromise the safety and soundness of the banking system.
Early warning systems have been developed to predict the likelihood and timing of
bank failures, and several variations have been evaluated to improve the accuracy
and predictive power of these models.

Conclusion

This chapter considers various applications of production economics to the banking
sector. The chapter first considers the issue of how to measure output. Two basic
approaches exist – the production and intermediation specifications. The treatment
of deposits differentiates these two specifications, whereby the production approach
takes deposits as an output and the intermediation approach takes deposits as an
input. Then, this chapter proceeds to discuss various issues in bank production –
bank productivity growth, bank and banking industry profitability, economies of
scale and scope in banking, and bank efficiency. Bank efficiency includes efficiency
as measured by the production function, the cost function, the revenue function,
the profit function, and the efficient frontier between expected return and risk. The
chapter concludes with an analysis of predicting problem banks and/or bank failures.

Cross-References

Bad Outputs
Cost, Revenue, and Profit Function Estimates
Data Envelopment Analysis: A Nonparametric Method of Production Analysis
Distance Functions in Production Economics
Index Numbers and Productivity Measurement
Stochastic Frontier Analysis: Foundations and Advances I
Stochastic Frontier Analysis: Foundations and Advances II

References
1. Humphrey DB (1985) Cost and scale economies in bank intermediation. In: Aspinwall R,
Eisenbeis R (eds) Handbook for banking strategy. Wiley & Sons, New York
2. Sealey C, Lindley J (1977) Inputs, outputs, and a theory of production and cost at depository
financial institutions. J Finance 32:1251–1266. https://doi.org/10.1111/j.1540-6261.1977.
tb03324.x
3. Camanho AS, Dyson RG (2005) Cost efficiency, production and value-added models in the
analysis of bank branch performance. J Oper Res Soc 56:483–494. https://doi.org/10.1057/
palgrave.jors.2601839
4. Ray SC, Das A (2010) Distribution of cost and profit efficiency: evidence from Indian
banking. Eur J Oper Res 201:297–307. https://doi.org/10.1016/j.ejor.2009.02.030
5. Drake L, Hall MJB, Simper R (2006) The impact of macroeconomic and regulatory factors on
bank efficiency: a non-parametric analysis of Hong Kong’s banking system. J Bank Finance
30:1443–1466. https://doi.org/10.1016/j.jbankfin.2005.03.022
6. Das A, Ghosh S (2006) Financial deregulation and efficiency: an empirical analysis of Indian
banks during the post reform period. Rev Financ Econ 15:193–221. https://doi.org/10.1016/j.
rfe.2005.06.002
7. Holod D, Lewis HF (2011) Resolving the deposit dilemma: a new DEA bank efficiency
model. J Bank Finance 35:2801–2810. https://doi.org/10.1016/j.jbankfin.2011.03.007
8. Solow R (1957) Technical change and the aggregate production function. Rev Econ Stat
39:312–320. https://doi.org/10.2307/1926047
9. Bauer PW, Berger AN, Humphrey DB (1993) Efficiency and productivity growth in U.S.
banking. In: Fried H, Lovell CAK, Schmidt S (eds) The measurement of productive efficiency.
Oxford University Press, New York, pp 386–413
10. Humphrey DB (1991) Productivity in banking and effects from deregulation. Federal Reserve
Bank of Richmond, Econ Rev 77:16–28
11. Humphrey DB (1993) Cost and technical change: Effects from bank deregulation. J Prod Anal
4:9–34. https://doi.org/10.1007/978-94-011-2200-9_2
12. Hunter WC, Timme SG (1991) Technological change in large U.S. banks. J Bus 64:339–362.
https://doi.org/10.1086/296541
13. Caves DW, Christensen LR, Diewert W (1982) The economic theory of index numbers and
the measurement of input, output and productivity. Econom 50:1393–1413. https://doi.org/
10.2307/1913388
14. Malmquist S (1953) Index numbers and indifference surfaces. Trabajos de Estadística 4:209–
242. https://doi.org/10.1007/BF03006863
15. Färe R, Grosskopf S, Roos P (1998) Malmquist productivity indexes: a survey of theory and
practice. In: Färe R, Grosskopf S, Russell RR (eds) Index numbers: essays in honour of Sten
Malmquist. Springer, Dordrecht
16. Färe R, Grosskopf S, Lindgren B, Roos P (1992) Productivity changes in Swedish pharmacies
1980–1989: a non-parametric Malmquist approach. J Prod Anal 3:85–101. https://doi.org/10.
1007/BF00158770
17. Färe R, Grifell-Tatjé E, Grosskopf S, Lovell CAK (1997) Biased technical change and the
Malmquist productivity index. Scand J Econ 99:119–127. https://doi.org/10.1111/1467-9442.
00051
18. Färe R, Grosskopf S, Norris M, Zhang Z (1994) Productivity growth, technical progress and
efficiency change in industrialized countries. Am Econ Rev 84:66–83. www.jstor.org/stable/
2117971
19. Lovell CAK (2003) The decomposition of Malmquist productivity indexes. J Prod Anal
20:437–458. https://doi.org/10.1023/A:1027312102834
20. Ray SC, Desli E (1997) Productivity growth, technical progress, and efficiency change
in industrialized countries: comment. Am Econ Rev 87:1033–1039. www.jstor.org/stable/
2951340
21. Elyasiani E, Mehdian SM (1990b) A nonparametric approach to measurement of efficiency
and technological change: the case of large U.S. commercial banks. J Finance Serv Res 4:157–
168. https://doi.org/10.1007/BF00352569
22. Berg SA, Førsund F, Jansen ES (1992) Malmquist indices of productivity growth during the
deregulation of Norwegian banking, 1980–1989. Scand J Econ 94:211–228. https://doi.org/
10.2307/3440261
23. Wheelock DC, Wilson PW (1999) Technical progress, inefficiency, and productivity change
in U.S. banking, 1984–1993. J Money Credit Bank 31:212–234. https://doi.org/10.2307/
2601230
24. Mukherjee K, Ray SC, Miller SM (2001) Productivity growth in large US commercial banks:
the initial post-deregulation experience. J Bank Finance 25:913–939. https://doi.org/10.1016/
S0378-4266(00)00103-5
25. Gup BE, Walter JR (1989) Top performing small banks: making money the old-fashioned
way. Federal Reserve Bank of Richmond, Econ Rev:23–31
26. Kwast ML, Rose JT (1982) Pricing, operating efficiency, and bank profitability among
large commercial banks. J Bank Finance 6:233–254. https://doi.org/10.1016/0378-
4266(82)90035-8
27. Wall L (1985) Why are some banks more profitable than others? J Bank Res Win-
ter: 240–256. https://www.researchgate.net/publication/288911761_Why_are_some_banks_
more_profitable_than_others
28. Miller SM, Noulas AG (1997) Portfolio mix and large-bank profitability in the USA. Appl
Econ 29:505–512. https://doi.org/10.1080/000368497326994
29. Athanasoglou PP, Brissimis SN, Delis MD (2008) Bank-specific, industry-specific and
macroeconomic determinants of bank profitability. J Inter Finance Markets Inst Money
18:121–136. https://doi.org/10.1016/j.intfin.2006.07.001
30. Bourke P (1989) Concentration and other determinants of bank profitability in Europe,
North America and Australia. J Bank Finance 13:65–79. https://doi.org/10.1016/0378-
4266(89)90020-4
31. Demirgüç-Kunt A, Huizinga H (1999) Determinants of commercial bank interest margins and
profitability: some international evidence. World Bank Econ Rev 13:379–408. https://doi.org/
10.1093/wber/13.2.379
32. Flamini V, Schumacher L, McDonald CA (2009) The determinants of commercial bank
profitability in Sub-Saharan Africa, IMF Working Paper, WP/09/15
33. Molyneux P, Thornton J (1992) Determinants of European bank profitability: a note. J Bank
Finance 16:1173–1178. https://doi.org/10.1016/0378-4266(92)90065-8
34. Kane E (1996) De jure interstate banking: why only now? J Money Credit Bank 28:141–161.
https://doi.org/10.2307/2078020
35. Amos OM Jr (1992) Regional distribution of bank closings in the United States from 1982 to
1988. South Econ J 58:805–815. https://doi.org/10.2307/1059846
36. Cebula RJ (1994) The regional distribution of bank closings in the United States: an extension
of the Amos analysis. South Econ J 61:202–208. https://doi.org/10.2307/1060142
37. Jeon Y, Miller SM (2007b) Has deregulation affected births, deaths, and marriages in the U.S.
commercial banking industry? Econ Inq 45:325–341. https://doi.org/10.1111/j.1465-7295.
2006.00037.x
38. Berger AN, Mester LJ (2003) Explaining the dramatic changes in performance of US banks:
technological change, deregulation, and dynamic changes in competition. J Financ Intermed
12:57–95. https://doi.org/10.1016/S1042-9573(02)00006-2
39. Jayaratne J, Strahan PE (1997) The benefits of branching deregulation. Federal Reserve Bank
of New York, Policy Rev:13–29
40. Jayaratne J, Strahan PE (1998) Entry restrictions, industry evolution and dynamic efficiency:
evidence from commercial banking. J Law Econ 41:239–273. https://doi.org/10.1086/467390
41. Tirtiroğlu D, Daniels K, Tirtiroğlu E (2005) Deregulation, intensity of competition, industry
evolution and the productivity growth of US commercial banks. J Money Credit Bank 37:339–
360. https://doi.org/10.1353/mcb.2005.0025
42. Duca JV, McLaughlin MM (1990) Developments affecting the profitability of commercial
banks. Fed Reserv Bull 76:461–484
43. Baily MN, Hulten C, Campbell D (1992) The distribution of productivity. Brook Pap Econ
Act Micro 1:187–267
44. Haltiwanger JC (1997) Measuring and analyzing aggregate fluctuations: the importance
of building from microeconomic evidence. Federal Reserve Bank St. Louis, Rev 79:
55–77
45. Bennet TL (1920) The theory of measurement of changes in the cost of living. J Royal Stat
Soc 83:455–462. https://doi.org/10.2307/2340960
46. Stiroh KJ (2000) Compositional dynamics and the performance of the U.S. banking industry,
Federal Reserve Bank of New York, Staff Reports #98, (September)
47. Laspeyres E (1871) Die berechnung einer mittleren warenpreissteigerung. Jahrbücher für
Nationalökonomie und Statistik 16:296–314
48. Paasche H (1874) Über die preisentwicklung der letzten jahre nach den hamburger börsen-
notirungen. Jahrbücher für Nationalökonomie und Statistik 23:168–178
49. Fisher I (1922) The making of index numbers. Houghton Mifflin, Boston
50. Pigou AC (1920) The economics of welfare. Macmillan, London
51. Diewert WE (2005) Index number theory using differences rather than ratios. Am J Econ
Sociol 64:311–360. https://doi.org/10.1111/j.1536-7150.2005.00365.x
52. Griliches Z, Regev H (1995) Firm productivity in Israeli industry 1979–1988. J Econom
65:175–203. https://doi.org/10.1016/0304-4076(94)01601-U
53. Scarpetta S, Hemmings P, Tressel T, Woo J (2002) The role of policy and institutions
for productivity and firm dynamics: evidence from micro and industry data, Organization
for Economic Cooperation and Development, Economics Department Working Paper 329.
https://doi.org/10.2139/ssrn.308680
54. Jeon Y, Miller SM (2007a) A Bennet decomposition of industry dynamics: an application
to the nationwide and state level U.S. banking industry. SSRN: https://ssrn.com/abstract=
1006307
55. Bartelsman E, Haltiwanger J, Scarpetta S (2005) Microeconomic evidence of creative destruc-
tion in industrial and developing countries. SSRN: https://papers.ssrn.com/sol3/papers.cfm?
abstract_id=612230
56. Balk BM (2003) The residual: On monitoring and benchmarking firms, industries, and
economies with respect to productivity. J Prod Anal 20:5–47. https://doi.org/10.1023/A:
1024817024364
57. Stiroh KJ, Strahan PE (2003) Competitive dynamics of deregulation: evidence from
U.S. banking. J Money Credit Bank 35:801–828. https://doi.org/10.1111/j.1538-4616.2008.
00141.x
58. Benston GJ (1965) Branch banking and economies of scale. J Finance 20:312–331. https://
doi.org/10.1111/j.1540-6261.1965.tb00212.x
59. Bell FW, Murphy NB (1968) Costs in commercial banking: a quantitative analysis of bank
behavior and its relation to bank regulation, Federal Reserve Bank of Boston, Research report
no. 41
60. Clark JA (1988) Economies of scale and scope at depository financial institutions: a review
of the literature. Econ Rev, Federal Reserve Bank of Kansas City, September/October, 16–33
61. Humphrey DB (1987) Cost dispersion and the measurement of economies in banking. Federal
Reserve Bank of Richmond, Econ Rev 73:24–38
62. Hunter WC, Timme SG (1986) Technical change, organizational form, and the structure of
the bank production. J Money Credit Bank 18:152–166. https://doi.org/10.2307/1992199
63. Nelson RW (1985) Branching, scale economies, and banking cost. J Bank Finance 9:177–191.
https://doi.org/10.1016/0378-4266(85)90016-0
64. Noulas AG, Ray S, Miller SM (1990) Returns to scale and input substitution for large U.S.
banks. J Money Credit Bank 22:94–108. https://doi.org/10.2307/1992130
65. Shaffer S, David E (1991) Economies of superscale in commercial banking. Appl Econ
23:283–293. https://doi.org/10.1080/00036849100000135
66. Wheelock DC, Wilson PW (2018) The evolution of scale economies in U.S. banking. J Appl
Econ 33:16–28. https://doi.org/10.1002/jae.2579
67. Benston GJ, Berger A, Hanweck GA, Humphrey DB (1983) Economies of scale and scope
in banking, Research papers in Banking and Financial Economics 64. Board of Governors of
the Federal Reserve System. https://ideas.repec.org/p/fip/fedgbf/64.html
68. Berger AN, Hanweck GA, Humphrey DB (1987) Competitive viability in banking: scale,
scope, and product mix economies. J Monet Econ 20:501–520. https://doi.org/10.1016/0304-
3932(87)90039-0
69. Kolari J, Zardkoohi A (1987) Bank costs, structure, and performance. Lexington Books,
Lexington
70. Lawrence C, Shay R (1986) Technology and financial intermediation in a multiproduct
banking firm: an econometric study of U.S. banks 1979–1982. In: Lawrence C, Shay R (eds)
Technological innovation, regulation and the monetary economy. Ballinger, Cambridge, MA
71. Ferrier GD, Lovell CAK (1987) Estimating economies and efficiencies in production: An
evaluation of alternative approaches. In: Proceedings of the business and economic statistics
section. American Statistical Association, Alexandria
72. Kim M (1986) Banking technology and the existence of a consistent output aggregate. J
Monet Econ 18:181–195. https://doi.org/10.1016/0304-3932(86)90075-9
73. Buono M, Eakin B (1990) Branching restrictions and banking costs. J Bank Finance 14:1151–
1162. https://doi.org/10.1016/0378-4266(90)90006-N
74. Cebenoyan AS (1990) Scope economies in banking: the hybrid Box-Cox function. Financ
Rev 25:115–125. https://doi.org/10.1111/j.1540-6288.1990.tb01291.x
75. Mester LJ (1993) Efficiency in the savings and loan industry. J Bank Finance 17:267–286.
https://doi.org/10.1016/0378-4266(93)90032-9
76. Mester LJ (1996) A study of bank efficiency taking into account risk-preferences. J Bank
Finance 20:1025–1045. https://doi.org/10.1016/0378-4266(95)00047-X
77. Rangan N, Zardkoohi A, Kolari J, Fraser D (1989) Production costs for consolidated
multibank holding companies compared to one-bank organizational forms. J Econ Bus
41:317–325. https://doi.org/10.1016/0148-6195(89)90028-3
78. Hunter WC, Timme SG, Yang WK (1990) An estimation of cost subadditivity and multi-
product production in large U.S. banks. J Money Credit Bank 22:504–525. https://doi.org/10.
2307/1992434
79. Pulley L, Braunstein Y (1992) A composite cost function for multiproduct firms with an
application to economies of scope in banking. Rev Econ Stat 74:221–230. https://doi.org/10.
2307/2109653
80. Pulley LB, Humphrey DB (1993) The role of fixed costs and cost complementarities in
determining scope economies and the cost of narrow banking proposals. J Bus 66:437–462.
www.jstor.org/stable/2353208
81. Noulas AG, Miller SM, Ray S (1993) Regularity conditions and scope estimates: the case of
large-sized U.S. banks. J Finance Serv Res 7:235–248. https://doi.org/10.1007/BF01047012
82. Mitchell K, Onvural NM (1996) Economies of scale and scope at large commercial banks:
evidence from the Fourier flexible functional form. J Money Credit Bank 28:178–199. https://
doi.org/10.2307/2078022
83. Berger AN, Humphrey DB, Pulley LB (1996) Do consumers pay for one-stop banking?
Evidence from an alternative revenue function. J Bank Finance 20:1601–1621. https://doi.
org/10.1016/S0378-4266(96)00028-3
84. Debreu G (1951) The coefficient of resource utilization. Econom 19:273–292. http://doi.org/
0012-9682(195107)19:3<273:TCORU>2.0.CO;2-4
85. Farrell MJ (1957) The measurement of productive efficiency. J Royal Stat Soc Ser A 120:253–
281. https://doi.org/10.2307/2343100
86. Koopmans TC (1951) An analysis of production as an efficient combination of activities. In:
Koopmans TC (ed) Activity analysis of production and allocation, Cowles Commission for
Research in Economics, Monograph 13. John Wiley and Sons, Inc, New York
87. Schmidt P (1985–86) Frontier production functions. Econom Rev 4:289–328. https://doi.org/
10.1080/07474938608800089
88. Førsund FR, Lovell CAK, Schmidt P (1980) A survey of frontier production functions and
of their relationship to efficiency measurement. J Econom 13:5–25. https://doi.org/10.1016/
0304-4076(80)90040-8
89. Simar L, van Keilegom I, Zelenyuk V (2017) Nonparametric least squares methods for
stochastic frontier models. J Prod Anal 47:189–204. https://doi.org/10.1007/s11123-016-
0474-2
90. Cowing TG, Reifschneider D, Stevenson RE (1983) A comparison of alternative frontier
cost function specifications. In: Dogramaci A (ed) Developments in econometric analyses
of productivity, Ch. 4. Kluwer-Nijhoff
91. Greene WH (1990) A Gamma-distributed stochastic frontier model. J Econom 46:141–163.
https://doi.org/10.1016/0304-4076(90)90052-U
92. Berger AN, Humphrey DB (1991) The dominance of inefficiencies over scale and prod-
uct mix economies in banking. J Monet Econ 28:117–148. https://doi.org/10.1016/0304-
3932(91)90027-L
93. Berger AN, Humphrey D (1997) Efficiency of financial institutions: international survey and
directions for future research. Eur J Oper Res 98:175–212. https://doi.org/10.1016/S0377-
2217(96)00342-6
94. Sickles RC, Song W, Zelenyuk V (2020) Econometric analysis of productivity: theory and
implementation in R. In: Vinod H, Rao CR (eds) Financial, macro and micro econometrics
using R, vol 42, 1st edn, pp 267–297
95. Rangan N, Grabowski R, Aly HY, Pasurka C (1988) The technical efficiency of U.S. banks.
Econ Lett 28:169–175. https://doi.org/10.1016/0165-1765(88)90109-7
96. Elyasiani E, Mehdian SM (1990a) Efficiency in the commercial banking industry: a produc-
tion frontier approach. Appl Econ 22:539–551. https://doi.org/10.1080/00036849000000010
97. Färe R, Shephard RW (1977) Ray-homothetic production functions. Econom 45:133–146.
https://doi.org/10.2307/1913291
98. Miller SM, Noulas AG (1996) The technical efficiency of large bank production. J Bank
Finance 20:495–509. https://doi.org/10.1016/0378-4266(95)00017-8
99. Elyasiani E, Wang Y (2012) Bank holding company diversification and production efficiency.
Appl Financ Econ 22:1409–1428. https://doi.org/10.1080/09603107.2012.657351
100. Aly HY, Grabowski R, Pasurka C, Rangan N (1990) Technical, scale and allocative efficien-
cies in U.S. banking: an empirical investigation. Rev Econ Stat 72:211–218. https://doi.org/
10.2307/2109710
101. Ferrier GD, Lovell CAK (1990) Measuring cost efficiency in banking: econometric
and linear programming evidence. J Econom 46:229–245. https://doi.org/10.1016/0304-
4076(90)90057-Z
102. Grabowski R, Rangan N, Rezvanian R (1994) The effect of deregulation on the efficiency of
U.S. banking firms. J Econ Bus 46:39–54. https://doi.org/10.1016/0148-6195(94)90020-5
103. Kaparakis EI, Miller SM, Noulas AG (1994) Short-run cost inefficiency of commercial banks:
a flexible stochastic frontier approach. J Money Credit Bank 26:875–893. https://doi.org/10.
2307/2077953
104. English M, Grosskopf S, Hayes K, Yaisawarng S (1993) Output allocative and technical
efficiency of banks. J Bank Finance 17:349–366. https://doi.org/10.1016/0378-
4266(93)90036-D
105. Rogers KE (1998) Nontraditional activities and the efficiency of US commercial banks. J
Bank Finance 22:467–482. https://doi.org/10.1016/S0378-4266(98)00020-X
106. Banker R, Maindiratta A (1988) Nonparametric analysis of technical and allocative efficien-
cies in production. Econom 56:1315–1332. https://doi.org/10.2307/1913100
107. Berger AN, Mester LJ (1997) Inside the black box: what explains differences in the
efficiencies of financial institutions? J Bank Finance 21:895–967. https://doi.org/10.1016/
S0378-4266(97)00010-1
108. Cooper WW, Seiford L, Tone K (2000) Data envelopment analysis. Kluwer Academic
Publishers, Boston
109. Nerlove M (1965) Estimation and identification of Cobb-Douglas production functions. Rand
McNally Company, Chicago
110. Chambers RG, Chung Y, Färe R (1998) Profit, directional distance functions, and
Nerlovian efficiency. J Optimization Theory Appl 98:351–364. https://doi.org/10.1023/A:
1022637501082
111. Berger AN, Hancock D, Humphrey D (1993) Bank efficiency derived from the profit function.
J Bank Finance 17:317–347. https://doi.org/10.1016/0378-4266(93)90035-C
112. Sahoo BK, Mehdiloozad M, Tone K (2014) Cost, revenue and profit efficiency measurement
in DEA: a directional distance function approach. Eur J Oper Res 237:921–931. https://doi.
org/10.1016/j.ejor.2014.02.017
113. Fukuyama H, Weber WL (2009) Output slacks-adjusted cost efficiency and value-based
technical efficiency in DEA models. J Oper Res Soc Jpn 52:86–104. https://doi.org/10.15807/
jorsj.52.86
114. Färe R, Grosskopf S (2006) Resolving a strange case of efficiency. J Oper Res Soc 57:1366–
1368. https://doi.org/10.1057/palgrave.jors.2602109
115. Tone K (2002) A strange case of the cost and allocative efficiencies in DEA. J Oper Res Soc
53:1225–1231. https://doi.org/10.1057/palgrave.jors.2601438
116. Berger AN, DeYoung R (1997) Problem loans and cost efficiency in commercial banks. J
Bank Finance 21:849–870. https://doi.org/10.1016/S0378-4266(97)00003-4
117. Mester LJ (1997) Measuring efficiency at U.S. banks: accounting for heterogeneity is
important. Eur J Oper Res 98:230–242. https://doi.org/10.1016/S0377-2217(96)00344-X
118. Podpiera J, Weill L (2008) Bad luck or bad management? Emerging banking market
experience. J Financ Stab 4:135–148. https://doi.org/10.1016/j.jfs.2008.01.005
119. Koutsomanoli-Filippaki A, Mamatzakis E (2009) Performance and Merton-type default risk
of listed banks in the EU: a panel VAR approach. J Bank Finance 33:2050–2061. https://doi.
org/10.1016/j.jbankfin.2009.05.009
120. Mamatzakis E, Matousek R, Vu AN (2016) What is the impact of bankrupt and restructured
loans on Japanese bank efficiency? J Bank Finance 72:S187–S202. https://doi.org/10.1016/j.
jbankfin.2015.04.010
121. Hughes JP, Mester LJ (2010) Efficiency in banking: theory, practice, and evidence. In: Berger
AN, Molyneux P, Wilson JOS (eds) The Oxford handbook of banking, Ch. 18. Oxford
University Press, New York, pp 463–485
122. Hughes JP, Lang W, Mester LJ, Moon C-G (1996) Efficient banking under interstate
branching. J Money Credit Bank 28:1045–1071. https://doi.org/10.2307/2077940
123. Hughes JP, Lang W, Mester LJ, Moon C-G (1999) The dollars and sense of bank consolida-
tion. J Bank Finance 23:291–324. https://doi.org/10.1016/S0378-4266(98)00088-0
124. Hughes JP, Lang W, Mester LJ, Moon C-G (2000) Recovering risky technologies using the
almost ideal demand system: an application to U.S. banking. J Finance Serv Res 18:5–27.
https://doi.org/10.1023/A:1026554922476
125. Deaton A, Muellbauer J (1980) An almost ideal demand system. Am Econ Rev 70:312–326.
www.jstor.org/stable/1805222
126. Hunter WC (1996) Comment on efficient banking under interstate branching. J Money Credit
Bank 28:1072–1075. https://doi.org/10.2307/2077941
127. Färe R, Grosskopf S, Weber W (2004) The effect of risk-based capital require-
ments on profit efficiency in banking. Appl Econ 36:1731–1743. https://doi.org/10.1080/
0003684042000218525
128. Hughes JP, Mester LJ, Moon C-G (2001) Are scale economies in banking elusive or
illusive? Evidence obtained by incorporating capital structure and risk-taking into mod-
els of bank production. J Bank Finance 25:2169–2208. https://doi.org/10.1016/S0378-
4266(01)00190-X
129. Hughes JP, Mester LJ (2019) The performance of financial institutions modeling, evidence,
and some policy implications. In: Berger AN, Molyneux P, Wilson JOS (eds) The Oxford
handbook of banking, Ch. 8. Oxford University Press, Oxford, pp 230–261
130. Ashcraft AB (2005) Are banks really special? New evidence from the FDIC-induced failure of
healthy banks. Am Econ Rev 95:1712–1730. https://doi.org/10.1257/000282805775014326
131. Jagtiani J, Kolari J, Lemieux C, Shin H (2003) Early warning models for bank supervision:
simpler could be better. Federal Reserve Bank of Chicago, Econ Perspect Q3:49–60
132. Boyd J, Kwak SK, Smith B (2005) The real output losses associated with modern banking
crises. J Money Credit Bank 37:977–999. https://doi.org/10.1353/mcb.2006.0002
133. Dell’Ariccia G, Detragiache E, Rajan R (2008) The real effect of banking crises. J Financ
Intermed 17:89–112. https://doi.org/10.1016/j.jfi.2007.06.001
134. Driscoll JC (2004) Does bank lending affect output? Evidence from the U.S. states. J Monet
Econ 51:451–471. https://doi.org/10.1016/j.jmoneco.2004.01.001
135. Crowley FD, Loviscek AL (1990) New directions in predicting bank failures: the case
of small banks. North Am Rev Econ Finance 1:145–162. https://doi.org/10.1016/1042-
752X(90)90011-4
136. Kolari J, Glennon D, Shin H, Caputo M (2002) Predicting large US commercial bank failures.
J Econ Bus 54:361–387. https://doi.org/10.1016/S0148-6195(02)00089-9
137. Jagtiani J, Lemieux C (2001) Market discipline prior to bank failure. J Econ Bus 53:313–324.
https://doi.org/10.1016/S0148-6195(00)00046-1
138. Henebry KL (1996) Do cash flow variables improve the predictive accuracy of a Cox
proportional hazards model for bank failure? Q Rev Econ Finance 36:395–409. https://doi.
org/10.1016/S1062-9769(96)90023-X
139. DeYoung R (1999) Birth, growth, and life or death of newly chartered banks. Federal Reserve
Bank of Chicago, Econ Perspect 23:18–34
140. Kishan RP, Opiela TP (2000) Bank size, bank capital, and the bank lending channel. J Money
Credit Bank 32:121–141. https://doi.org/10.2307/2601095
141. Molina CA (2002) Predicting bank failures using a hazard model: the Venezuelan banking
crisis. Emer Market Rev 3:31–50. https://doi.org/10.1016/S1566-0141(01)00029-2
30 Applications of Production Economics in Education
Jill Johnes

Contents
Introduction
Cost Functions and Economies of Scale and Scope
  Background on Cost Concepts
  Estimating Cost Functions in Education and Higher Education: Challenges and Methodology
  Estimation Approach: SFA Versus DEA
  Findings from the Literature
  Recent Developments in Estimating Cost Functions
  Policy Implications and Future Work
Production Functions, Distance Function, Shadow Prices, and Elasticities
  Background on Production Concepts
  Estimating Distance Functions in Education and Higher Education: Challenges and Methodology
  Findings from the Literature
  Policy Implications and Future Work
Efficiency, Productivity Change, and Analyses of Factors Underlying Efficiency
  Background on Efficiency Concepts
  Findings from the Literature
  Recent Developments in Efficiency Measurement
  Policy Implications and Future Work
Level of Analysis
  Individual-Level Analyses
  Funding Area Analyses
  National-Level Analyses
Conclusions
Cross-References
References

J. Johnes
Huddersfield Business School, University of Huddersfield, Huddersfield, UK
e-mail: [email protected]

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_30

Abstract

This chapter provides a comprehensive survey of the existing literature on
production economics from the education perspective, bringing together findings
from the education costs, production, and efficiency contexts and relating to all
levels of education including primary, secondary (both compulsory and non-compulsory),
and higher education.

Keywords

Education · Higher education · Costs · Production · Economies ·
Elasticities · Efficiency

JEL Classification Numbers

D24, I20, I21, I23, L25

Introduction

This chapter focuses on production economics in the context of the education sector.
This sector makes an interesting case study because of its particular characteristics,
which derive from the fact that returns to education can be both private (accruing
to the individual in terms of higher salaries) and social (accruing to society in
terms of increased productivity and economic growth). As a consequence of the
benefits accruing to society as a whole from individuals being educated, education
institutions are often publicly funded, although the extent of the public funding
likely varies by level of education (and country). The public funding aspect of
education affects costs, production, and efficiency in that sector, and these are all
relevant in the production economics context.
It is worth considering two broad components of education which we will term
(a) higher or tertiary education, encompassing non-compulsory education for post-
18-year-olds often in universities, and (b) education, encompassing primary and
secondary education. The latter is largely compulsory, at least up to the age of
around 16 years, particularly in developed countries, and predominantly publicly
funded. The former is not compulsory, but is also in receipt of substantial public
funding, particularly in developed countries, since there are still considered to be
some benefits of higher education accruing to society (as well as the individual). In
addition to these categories, education can be provided at many levels to adults
(typically 25 years plus). Adult education is provided very differently from the
traditional primary, secondary, and higher education levels, with provision often
being in the form of modules offered through a blended or distance medium [114].
Given the frequent lack of data on adult education provision, this chapter makes
only limited reference to this particular sector.
To illustrate the importance of education, in 2015 average spending on education
institutions across all levels of education in OECD countries was some 5% of GDP,
varying from around 3% to 6%. On average in the OECD, around 70% of this
spending went to non-tertiary education institutions (as might be expected),
equivalent to 3.5% of GDP, with a variation of 3% to 4.5% (see Education at a
Glance 2018, https://read.oecd-ilibrary.org/education/education-at-a-glance-2018_eag-2018-en#page260,
accessed June 14th 2019). The largely publicly funded
nature of organizations in the education and higher education sectors therefore
makes this an interesting focus in the context of production economics. Such
organizations are generally non-profit making. Yet the amount of public funds
received by schools and universities and the role these institutions play in driving
growth in the economy make it imperative for them to be run efficiently and
effectively. Empirical knowledge of production concepts such as the size
of economies of scale or scope, efficiency levels, and possibilities for substitution
between inputs (or, indeed, between outputs) is therefore important in the education
context.
While the education and higher education sectors of many countries comprise
largely publicly funded institutions, privately funded institutions also exist, to a
greater or lesser extent, at all levels of education. The focus of this chapter is
generally on the non-profit, largely publicly funded provision, but private sector
examples will be reviewed as appropriate.
This chapter is in six sections of which this introduction is the first. Section “Cost
functions and economies of scale and scope” focuses on cost functions in education
including concepts, estimation, and findings from the literature. The section ends
with some recent developments, policy implications, and suggestions for future
work. Output distance functions are the subject of Section “Production functions,
distance function, shadow prices, and elasticities”, which examines concepts, esti-
mation, and findings from the literature before concluding with policy implications
and possible topics for future exploration. Section “Efficiency, productivity change,
and analyses of factors underlying efficiency” turns to efficiency and productivity
change including concepts, findings from the literature, recent developments, policy
implications, and future work. Level of analysis is the focus of Section “Level
of analysis” which examines and reviews the literature on various possibilities
including individuals, funding areas, and countries. Final conclusions are drawn
in Section “Conclusions”, which also suggests areas for future applications of
production economics in the education and higher education contexts.

Cost Functions and Economies of Scale and Scope

Schools and higher education institutions (HEIs) are multi-product organizations.
School pupils, for example, are taught and attain qualifications in multiple subjects
in schools; universities produce outcomes from teaching, research, and third mission
activities. This leads to the estimation of multi-product cost functions [37] in the
education and higher education contexts and permits the testing of a number of key
production economics concepts such as:

• Existence or otherwise of economies of scale
• Existence or otherwise of economies of scope
• Extent of substitution possibilities through evaluating elasticities of substitution

Full details on the theory underpinning cost functions and economies of scale can
be found in Chaps. 16 and 17 (respectively) of Volume I of this publication. Each of
these concepts will be considered briefly in the context of the empirical literature in
this section.

Background on Cost Concepts

In a multi-product production situation such as we have in education and higher
education, the cost relationship is

\[ C(y) = f(y; p) \tag{1} \]

where
$y$ is the vector of outputs
$p$ is the vector of input prices

In order to estimate this function empirically, the researcher must select a
functional form which should:

• Be consistent with cost minimization given outputs and input prices, i.e., it must
be a non-negative and non-decreasing function.
• Provide predictions of costs when the value of one or more outputs is zero. This is
particularly needed in order to derive estimates of economies of scale and scope
and precludes cost functions in logarithms such as the Cobb-Douglas.
• Allow for the existence of scale or scope economies or diseconomies, without
enforcing their existence.

Functional forms which fulfill these criteria and which have been used in
empirical studies include the constant elasticity of substitution (CES), quadratic,
and hybrid translog. Each has various advantages and
disadvantages in terms of estimation, a brief overview of which can be found in
Johnes et al. [195], while a detailed comparison of the merits of the translog over
the Cobb-Douglas can be found in Gronberg et al. [141].
In this multi-product case, there are two concepts relating to economies of scale
[181]. Ray economies of scale are the savings in costs occurring when all outputs
increase (while holding the output mix constant). Product-specific economies of
scale are the cost savings which occur when one output increases and all other
outputs remain at fixed production levels [197]. If we assume that we have K inputs
(indexed k = 1, . . . , K) and M outputs (indexed m = 1, . . . , M), these concepts
can be denoted for the general case as follows:

\[ S_R = \frac{C(y)}{\sum_{m} y_m C_m(y)} \tag{2} \]

where
$S_R$ represents ray economies of scale
$y_m$ is the mth output
$C_m(y) = \partial C(y)/\partial y_m$ is the marginal cost of producing the mth output

\[ S_m(y) = \frac{AIC(y_m)}{C_m(y)} \tag{3} \]

where
$S_m(y)$ denotes product-specific economies of scale relating to product m (where m = 1, . . . , M)
$AIC(y_m) = [C(y_M) - C(y_{M-m})]/y_m$ is the average incremental cost of producing output m
$C(y_M)$ is the total cost of producing all M outputs
$C(y_{M-m})$ is the total cost of producing all M outputs except output m

Values above (below) 1 indicate the presence of economies (diseconomies) of
scale in the estimated long-run cost equation. Evaluating these measures can be
useful from a policy viewpoint in determining, for example, whether an expansion in
provision is best effected through increasing the size of existing providers (schools
or universities) or, if diseconomies of scale are observed in the sector, by introducing
entirely new providers.
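
As a purely illustrative sketch of Eqs. (2) and (3), the following Python fragment evaluates ray and product-specific economies of scale for a hypothetical two-output quadratic cost function. The coefficients (A0, A, B), the function names, and the output vector are assumptions made for illustration only; they are not estimates from any study cited in this chapter.

```python
# Hypothetical two-output quadratic cost function:
#   C(y) = A0 + sum_m A[m]*y[m] + 0.5 * sum_m sum_n B[m][n]*y[m]*y[n]
# All coefficients are illustrative assumptions, not empirical estimates.
A0 = 100.0                       # fixed cost
A = [2.0, 5.0]                   # linear output terms
B = [[-0.002, 0.001],            # symmetric quadratic and interaction terms
     [0.001, 0.010]]


def cost(y):
    """Total cost C(y); defined even when some outputs are zero."""
    idx = range(len(y))
    linear = sum(A[m] * y[m] for m in idx)
    quad = 0.5 * sum(B[m][n] * y[m] * y[n] for m in idx for n in idx)
    return A0 + linear + quad


def marginal_cost(y, m):
    """C_m(y) = dC/dy_m for the quadratic form above (B is symmetric)."""
    return A[m] + sum(B[m][n] * y[n] for n in range(len(y)))


def ray_scale(y):
    """Eq. (2): S_R = C(y) / sum_m y_m C_m(y)."""
    return cost(y) / sum(y[m] * marginal_cost(y, m) for m in range(len(y)))


def product_specific_scale(y, m):
    """Eq. (3): S_m = AIC(y_m) / C_m(y), with AIC the average incremental cost."""
    y_without_m = list(y)
    y_without_m[m] = 0.0
    aic = (cost(y) - cost(y_without_m)) / y[m]
    return aic / marginal_cost(y, m)


if __name__ == "__main__":
    y = [100.0, 50.0]            # assumed output levels
    print("ray economies of scale:", ray_scale(y))
    for m in range(len(y)):
        print("product-specific, output", m + 1, ":", product_specific_scale(y, m))
```

With these assumed coefficients and y = (100, 50), the ray measure and the measure for the first output exceed 1, while the measure for the second output falls below 1. The quadratic form is used here because, unlike a form in logarithms, it yields a cost prediction when one output is set to zero, which Eq. (3) requires.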
Economies of scope, in contrast, occur when it is less costly to produce a number
of outputs together rather than to produce each output independently in its own
specialist production unit [181]. As with economies of scale, in this multi-product
case, we have two concepts relating to economies of scope. Global economies of
scope occur when the costs of producing all outputs together in a single firm are
less than the sum of the costs of producing each output in a separate firm. Product-
specific economies of scope for product m arise when the costs of producing all
outputs together in a single firm are less than the sum of costs of producing output
m in a separate firm and all outputs apart from m in another firm [197]. These can
be denoted in the general case as follows:

\[ S_G = \frac{\sum_{m} C(y_m) - C(y)}{C(y)} \tag{4} \]

where
$S_G$ denotes global economies of scope
$C(y_m)$ is the cost of producing output m alone

\[ SC_m = \frac{C(y_m) + C(y_{M-m}) - C(y)}{C(y)} \tag{5} \]

where
$SC_m$ denotes product-specific economies of scope for output m
Economies of scope can arise if it is possible to spread the costs of central
services across an array of outputs. It is likely that both schools and HEIs benefit
from producing their outputs in one production unit as they may be able to spread
the costs of capital and administration across their different outputs whether it
is teaching across different disciplines (schools and universities) or teaching and
research (universities). The degree of scope economies in universities depends
on the extent to which the products (e.g., research and teaching) are produced
jointly as opposed to separately, and this issue is considered further below in the
context of the empirical literature. The empirical evaluation of these measures can
provide useful policy and managerial insights into the degree to which organizations
should become more (or less) specialized in the outputs produced. In the higher
education case, for example, economies of scope can indicate whether HEIs should
be research-focused or teaching-focused or even whether they should specialize in
a specific discipline (such as arts or medicine).
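
The scope measures in Eqs. (4) and (5) can be sketched in the same way. The fragment below is a minimal illustration, assuming a hypothetical two-output quadratic cost function with a fixed cost that is shared when outputs are produced jointly; all coefficients and helper names (`only`, `without`) are invented for the example. Under the definitions in Eqs. (4) and (5), a positive value indicates economies of scope.

```python
# Hypothetical two-output quadratic cost function; the coefficients are
# illustrative assumptions, not estimates from any cited study.
FIXED = 100.0                    # central costs incurred by every producing unit
LINEAR = [2.0, 5.0]
QUAD = [[-0.002, 0.001],
        [0.001, 0.010]]


def cost(y):
    """Total cost C(y) of a unit producing output vector y."""
    idx = range(len(y))
    linear = sum(LINEAR[m] * y[m] for m in idx)
    quad = 0.5 * sum(QUAD[m][n] * y[m] * y[n] for m in idx for n in idx)
    return FIXED + linear + quad


def only(y, m):
    """Output vector of a specialist unit producing output m alone."""
    return [ym if i == m else 0.0 for i, ym in enumerate(y)]


def without(y, m):
    """Output vector of a unit producing every output except m."""
    return [0.0 if i == m else ym for i, ym in enumerate(y)]


def global_scope(y):
    """Eq. (4): S_G = [sum_m C(y_m) - C(y)] / C(y)."""
    separate = sum(cost(only(y, m)) for m in range(len(y)))
    return (separate - cost(y)) / cost(y)


def product_scope(y, m):
    """Eq. (5): SC_m = [C(y_m) + C(y_{M-m}) - C(y)] / C(y)."""
    return (cost(only(y, m)) + cost(without(y, m)) - cost(y)) / cost(y)


if __name__ == "__main__":
    y = [100.0, 50.0]            # assumed output levels
    print("global economies of scope:", global_scope(y))
    print("product-specific, output 1:", product_scope(y, 0))
```

In this assumed example the saving from sharing the fixed cost outweighs the positive interaction term between the two outputs, so the global measure is positive. With only two outputs the product-specific measure coincides with the global one; the two differ once there are three or more outputs.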

Estimating Cost Functions in Education and Higher Education: Challenges and Methodology

Knowing the parameters of the estimated cost function in an education context can
clearly offer useful insights to managers and policy-makers alike. But the imple-
mentation of the cost function methodology in education is not easy, and decisions
regarding specification (e.g., of costs, outputs, and functional form) and estimation
approach in particular can potentially affect outcomes and conclusions drawn from
any cost function analysis. The first major challenge in estimating education cost
functions is identifying what is meant by “costs.” Costs (or expenditure) can be
allocated to various categories. For universities these might include administration
versus academic expenditure; research versus teaching expenditures; and recurrent
versus capital expenditures. For schools these might be instructional and non-
instructional expenditures or total fee revenue. Many empirical studies are interested
in total recurrent expenditure, and this is the typical definition of costs. There
are, however, exceptions. Some studies, for example, have focused specifically on
administration (rather than total recurrent) costs in the context of universities [64,
84].
When estimating cost functions, there is an underlying assumption that education
providers are seeking to minimize their costs. Given that such organizations are
typically in receipt of public funds, this assumption is open to debate. Indeed, an
early examination of costs of universities in the USA suggests that universities do
not minimize costs but rather spend all the income they receive [50]. This view is
challenged [225] on the premise that providers with diverse sources of funding (such
as universities) are more likely to adopt optimizing behavior than when receiving all
funding from the public purse. More recently, the marketization of, and increasing
competition in, higher education sectors across the world following the global
financial crash and subsequent constraints on public funding have put increasing
pressure on higher education providers to minimize their costs. Similar pressures
can also be seen in the education sector, where policies to increase competition
among publicly funded schools in some countries are also likely to encourage
cost-minimizing behavior.
While the outputs of schools and HEIs may seem obvious, their precise measurement
is not so clear-cut. Studies of schools often use test or examination performance or
graduate numbers (by subject) to reflect teaching outputs and their quality [34, 35,
41, 51, 59, 69, 109, 152, 165, 229, 233, 234, 239, 277, 278]. Where these are not
available, student enrolment numbers might be substituted [31, 59, 62, 166, 167,
218, 230, 268, 271, 272, 294, 295, 326]. But the disadvantage of enrolment figures
is that they fail to reflect output quality. Since schools’ outputs are affected by their
environment, variables reflecting the background of the pupils who attend the school
and the quality of teachers, as well as other contextual variables relating to pupils,
families, the school, or its location, are often added to the cost equation to take these
factors into account [33, 41, 51, 69, 107, 109, 141, 142, 152, 165, 167, 327].
Universities produce outputs which can be categorized as teaching, research, or
third mission. Student numbers are commonly used to reflect teaching outputs [3,
18, 28, 39, 40, 158, 168, 183, 195, 197, 235, 249, 297, 301, 320], often categorized
by level (e.g., undergraduate or postgraduate) and broad disciplines (such as science,
non-science, and medicine), but various problems arise, not least of which is the
issue of the quality of teaching output. Graduate numbers have been used in preference
to student numbers in order to try to capture quality [27], but this ignores the quality
of degrees obtained by different graduates. Quality is addressed in various ways
including adding variables to reflect “quality” to the cost equation such as average
entry qualifications of intake or a value-added measure [195, 197, 297, 309].
These outputs can be seen as the short-term outcomes of higher education. Long-
term benefits from taking a higher degree might be measured using labor market
metrics such as numbers of graduates achieving a job or graduates’ starting salary
[7, 47, 176, 212, 222].
Measuring research is also problematic. Nationally organized research rating
exercises (such as the Research Excellence Framework in the UK) have measures of
both quantity and quality [136], but these are available only at intervals and therefore
not always a reflection of current position. Citation and publication counts can also
be used (as in, e.g., [101, 182, 248]) but can be difficult to obtain and may not reflect
the current output. As a consequence, many studies resort to input measures, such as
competitively won grant income, rather than output measures, in an effort to capture
quality and quantity of current activity.
Outcomes from third mission activities are the most difficult to measure in
the higher education context with many empirical studies not even attempting it,
although its omission will inevitably lead to problems of bias in the estimated cost
function (some exceptions include [183, 195, 197]; and [301], where university
income from other services rendered is included to reflect third mission activities).
Input prices should also be included in the cost function if these vary across
production units, with many studies including the price of capital and/or the price
of labor [86, 134, 135, 226]. Average salary is commonly used to reflect the price
of labor, but this, of course, may be more of a reflection of the distribution of staff
across grades in their organization than an indication of the price of labor. If prices
are not known precisely but are known to vary by broad geographical location of the
university, a location indicator can be incorporated in the function as a proxy [183,
195, 197, 301]. Capturing price variations is likely to be more important
in inter-country than in within-country studies.
Different organizations have different missions or objectives. In the higher
education sector, some universities might choose to focus more on, for example,
research as opposed to teaching, and vice versa. In the education context, some
countries distinguish between schools with vocational as opposed to academic
routes. An underlying assumption when estimating a cost function for an industry is
that the firms within it all have similar objectives [131]. If this assumption does not
hold, then cost functions might be estimated separately for different mission groups
[86, 183, 195, 197, 301, 323, 325]. Recent advances in estimation techniques also
provide alternative approaches for this situation (see Section “Recent developments
in estimating cost functions”).
A full and detailed examination of all these issues can be found in Johnes et al.
[195].
In practice, the translog cost functional form provides further opportunities for
estimating additional quantities of interest, namely, elasticities of substitution. We
can write the translog cost function as [152]

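\[
\begin{aligned}
\ln C ={} & \delta_0 + \sum_{m=1}^{M} \delta_m \ln y_m
  + \frac{1}{2} \sum_{m=1}^{M} \sum_{n=1}^{M} \delta_{mn} \ln y_m \ln y_n \\
& + \sum_{k=1}^{K} \mu_k \ln P_k
  + \frac{1}{2} \sum_{k=1}^{K} \sum_{l=1}^{K} \mu_{kl} \ln P_k \ln P_l
  + \sum_{m=1}^{M} \sum_{k=1}^{K} \gamma_{mk} \ln y_m \ln P_k + \varepsilon
\end{aligned}
\tag{6}
\]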

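We assume the following conditions:

(a) Linear homogeneity of degree +1 in input prices

\[ \sum_{k=1}^{K} \mu_k = 1 \tag{7a} \]

\[ \sum_{l=1}^{K} \mu_{kl} = \sum_{k=1}^{K} \gamma_{mk} = 0 \tag{7b} \]

(b) Symmetry

\[ \delta_{mn} = \delta_{nm} \tag{8a} \]

\[ \mu_{kl} = \mu_{lk} \tag{8b} \]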

We can derive a set of input share equations ($S_k$) from Eq. (6) as follows:

\[ \frac{\partial \ln C}{\partial \ln P_k} = S_k = \mu_k + \sum_{l=1}^{K} \mu_{kl} \ln P_l + \sum_{m=1}^{M} \gamma_{mk} \ln y_m \tag{9} \]
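
The restrictions in Eqs. (7a) to (8b) and the share equations in Eq. (9) can be checked numerically. The following is a minimal sketch, assuming two inputs and one output with hypothetical translog parameters (MU, MU_KL, GAMMA) and hypothetical prices and output; none of these values comes from the studies cited here. When the restrictions hold, the implied input shares sum to one.

```python
import math

# Hypothetical translog parameters for K = 2 inputs and M = 1 output.
# These values are illustrative assumptions, not empirical estimates.
MU = [0.6, 0.4]                          # mu_k
MU_KL = [[0.05, -0.05],                  # mu_kl (symmetric, rows sum to zero)
         [-0.05, 0.05]]
GAMMA = [[0.02, -0.02]]                  # gamma_mk, indexed GAMMA[m][k]


def restrictions_hold(mu, mu_kl, gamma, tol=1e-9):
    """Check Eqs. (7a), (7b) and (8b) for the price-related parameters."""
    n_inputs = len(mu)
    homogeneity = abs(sum(mu) - 1.0) < tol                          # (7a)
    price_rows = all(abs(sum(row)) < tol for row in mu_kl)          # (7b), mu_kl
    output_rows = all(abs(sum(row)) < tol for row in gamma)         # (7b), gamma_mk
    symmetry = all(abs(mu_kl[k][l] - mu_kl[l][k]) < tol
                   for k in range(n_inputs) for l in range(n_inputs))  # (8b)
    return homogeneity and price_rows and output_rows and symmetry


def input_shares(prices, outputs, mu=MU, mu_kl=MU_KL, gamma=GAMMA):
    """Eq. (9): S_k = mu_k + sum_l mu_kl ln P_l + sum_m gamma_mk ln y_m."""
    ln_p = [math.log(p) for p in prices]
    ln_y = [math.log(y) for y in outputs]
    shares = []
    for k in range(len(mu)):
        price_part = sum(mu_kl[k][l] * ln_p[l] for l in range(len(ln_p)))
        output_part = sum(gamma[m][k] * ln_y[m] for m in range(len(ln_y)))
        shares.append(mu[k] + price_part + output_part)
    return shares


if __name__ == "__main__":
    shares = input_shares(prices=[1.2, 0.8], outputs=[100.0])
    print("restrictions hold:", restrictions_hold(MU, MU_KL, GAMMA))
    print("input shares:", shares, "sum:", sum(shares))
```

In applied work the share equations are typically estimated jointly with the cost function, with the restrictions imposed rather than merely checked; the sketch only illustrates how the restrictions constrain the shares.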

Two possible measures of elasticity, the Allen and Morishima elasticities of
substitution, can be estimated from this set of equations. The
Allen elasticity of substitution measures the impact of a change in the price of the
kth input on the demand for the lth input with output held constant. This is estimated
in this cost function context (denoted by superscript C) by

\[ A^{C}_{kl} = \frac{\mu_{kl} + S_k S_l}{S_k S_l} \quad \text{for } k \neq l \tag{10a} \]

\[ A^{C}_{kk} = \frac{\mu_{kk} + S_k (S_k - 1)}{S_k^{2}} \quad \text{for } k = l \tag{10b} \]

A positive (negative) value suggests that inputs k and l are substitutes (complements).
The Morishima elasticity of substitution can be estimated in this cost
function context (denoted by superscript C) by:

\[ M^{C}_{kl} = S_l \left( A^{C}_{kl} - A^{C}_{ll} \right) \tag{11} \]

The two measures are different when there are more than two inputs and the
production technology is represented by the translog as here [152]. The Allen
elasticities are symmetric, whereas the Morishima elasticities are not. As such,
the Allen and Morishima elasticities of substitution may not provide consistent
conclusions (see [152] for more details).
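
To make the contrast between the two measures concrete, the sketch below applies Eqs. (10a), (10b), and (11) to a hypothetical three-input example. The second-order price parameters (MU_KL) and input shares (SHARES) are assumed values chosen to satisfy the symmetry and homogeneity restrictions; they are not taken from [152] or any other study.

```python
# Hypothetical second-order price parameters mu_kl for K = 3 inputs
# (symmetric, each row sums to zero) and assumed input shares S_k.
MU_KL = [[0.10, -0.06, -0.04],
         [-0.06, 0.08, -0.02],
         [-0.04, -0.02, 0.06]]
SHARES = [0.5, 0.3, 0.2]


def allen_elasticities(mu_kl, shares):
    """Eqs. (10a) and (10b): matrix of Allen elasticities of substitution."""
    n = len(shares)
    allen = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for l in range(n):
            if k == l:
                allen[k][l] = (mu_kl[k][k] + shares[k] * (shares[k] - 1.0)) / shares[k] ** 2
            else:
                allen[k][l] = (mu_kl[k][l] + shares[k] * shares[l]) / (shares[k] * shares[l])
    return allen


def morishima_elasticities(allen, shares):
    """Eq. (11): M_kl = S_l * (A_kl - A_ll)."""
    n = len(shares)
    return [[shares[l] * (allen[k][l] - allen[l][l]) for l in range(n)]
            for k in range(n)]


if __name__ == "__main__":
    allen = allen_elasticities(MU_KL, SHARES)
    morishima = morishima_elasticities(allen, SHARES)
    print("Allen A_12, A_21:", allen[0][1], allen[1][0])
    print("Morishima M_12, M_21:", morishima[0][1], morishima[1][0])
```

In this assumed example the Allen elasticities for inputs 1 and 2 are equal in both directions, whereas the corresponding Morishima elasticities differ, which illustrates the asymmetry noted above. A useful sanity check on any estimated set of Allen elasticities is that the share-weighted sum across each row is zero.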

Estimation Approach: SFA Versus DEA

The idea that production might vary by mission group leads on to a more general
concern regarding production and technology and their implications for estimating
empirical cost functions. Clearly concepts such as scale economies require estima-
tion of a long-run cost function. The challenge for empirical researchers is that they
must assume that their sample of production units (schools or HEIs) is operating
on the long-run cost function in the time period under study [131]. In reality, this
is unlikely to be the case, and the observations will be a mix of those operating
in a long-run equilibrium, those operating in a short-run equilibrium, and those in
either a short-run or a long-run position not operating efficiently. This might be
a particularly pertinent consideration if the sector under study is going through a
period of rapid change, in which case organizations may be in various short-run
equilibria as they move toward their long-run positions [57].
Some of the earliest empirical cost functions in higher education are estimated
using ordinary least squares (OLS) and a linear functional form [309, 310]; the latter
therefore largely precludes the existence of economies of scale and scope, and the
former simply estimates a line of best fit through all the data regardless of position
(short run versus long run; efficient versus inefficient). The seminal work of Cohn
et al. [86] incorporates the multi-product nature of production drawing on the work
of Baumol et al. [37], but the estimation method (OLS applied to cross-sectional
data) still does not address the issue of observations at different production points.
This is also the case for many subsequent studies (see Section “Findings from the
literature”).
Early school cost functions use OLS and a quadratic function, permitting
estimation of optimum size and scale economies, but precluding economies of
scope [41, 60, 254]. The multi-product nature of school production is recognized
in later studies by using a translog functional form, with applications in Bolivia,
the USA, and Flanders [62, 166, 295]. But the estimation methods do not allow for
inefficiencies or data points being at different production points (short run versus
long run).
There has been a growing recognition that using methods which preclude the
possibility of inefficiency is a problem [95], especially as the non-profit nature
of education does not naturally provide the incentives for efficiency which prevail
in a private sector setting. The more widespread availability of frontier estimation
methods, which allow for inefficient operations, has led to their increasingly
becoming the customary approach when estimating empirical
cost functions in education. A cost function estimated using frontier techniques
envelops the data; thus its position is determined by the outermost data points
which are, in turn, likely to be those in a long-run equilibrium and/or most efficient
in the sector.
Frontier estimation techniques can be parametric, such as the family of estima-
tion methods falling under the umbrella of stochastic frontier analysis (SFA) [20], or
non-parametric, with data envelopment analysis (DEA) being a common approach
in this context [74, 75]. Full details of SFA and DEA can be found in Chaps. 12, 13
and 14 of Volume I of this publication.
SFA assumes an error comprising two components – one a normally distributed
random error and the other a one-sided term, often following a half-normal or expo-
nential distribution and attributed to inefficiency. The basic SFA model produces
estimates of the cost function parameters which are identical for all organizations in
the data set. The advantage of the approach is that the significance of the parameters
can be tested [85, 283], and they can be used to produce estimates of economies of
scale and scope as well as elasticities of substitution. As a consequence, SFA has
been used in many empirical cost function studies (see Section “Findings from the
literature”).
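
The composed error described above can be illustrated with a short sketch. It assumes the normal/half-normal specification for a cost frontier, in which the error is ε = v + u, with v normally distributed with standard deviation σ_v and u ≥ 0 half-normal with scale σ_u, so that inefficiency raises cost above the frontier. Under that assumption the density of ε is (2/σ) φ(ε/σ) Φ(ελ/σ), where σ² = σ_v² + σ_u² and λ = σ_u/σ_v. The function names and parameter values are hypothetical, and the fragment evaluates the likelihood for given residuals only; it is not a full estimator.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def composed_error_density(eps, sigma_v, sigma_u):
    """Density of eps = v + u (cost frontier, normal/half-normal assumption).

    sigma_v must be strictly positive; sigma_u = 0 means no inefficiency.
    """
    sigma = math.sqrt(sigma_v ** 2 + sigma_u ** 2)
    lam = sigma_u / sigma_v
    return (2.0 / sigma) * norm_pdf(eps / sigma) * norm_cdf(eps * lam / sigma)


def log_likelihood(residuals, sigma_v, sigma_u):
    """Sum of log densities over cost-function residuals.

    Very large negative residuals can underflow the density to zero, in which
    case math.log raises an error; a production estimator would guard this.
    """
    return sum(math.log(composed_error_density(e, sigma_v, sigma_u))
               for e in residuals)


if __name__ == "__main__":
    residuals = [0.05, 0.30, -0.10, 0.45, 0.12]     # assumed residuals
    print(log_likelihood(residuals, sigma_v=0.2, sigma_u=0.3))
```

In estimation, this log-likelihood would be maximized jointly over the cost function parameters (which determine the residuals) and the two variance parameters. When σ_u is zero the density collapses to the ordinary normal density, which is why a model estimated without an inefficiency term appears as a special case.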
DEA is a non-parametric approach often used to derive estimates of organi-
zations’ efficiencies and makes no assumptions regarding functional form. This
means that there are no problems with misspecification. Moreover, the linear
programming method of estimation means that DEA easily accommodates multiple
inputs and multiple outputs [229, 233]. Furthermore, it provides weights of inputs
and outputs which vary by each organization (or decision-making unit – DMU)
in the sample and which maximize the efficiency score subject to weights being
positive and universal [84]. The inter-institutional variation in weights can be
particularly advantageous in the context of education and higher education where
we have noted that mission can vary by production unit. DEA can also provide
useful benchmarking information for managers to help them improve performance.
But estimates of economies of scale and scope, and of elasticities of substitution,
are more difficult to derive in the non-parametric context (see [301] for an
example).
In the context of cost functions where there is a single input (expenditure) and
multiple outputs, a parametric technique is typically preferred over a non-parametric
one, although there are exceptions as will be discussed in Section “Findings from
the literature”. As already noted, however, expenditure itself can be divided into
different categories, thereby leading to a multi-input multi-output situation. DEA
can therefore be advantageous where the underlying cost components are known
and of interest [18, 19, 27, 39, 40].

Findings from the Literature

We will focus in this section largely on literature which incorporates the multi-
product nature of education into the estimated cost functions [37]. This literature
can be divided into parametric and non-parametric studies, and the former can be
further divided into those using frontier estimation methods and those not doing so.

Parametric, Non-frontier Estimation

The school context is complex with some studies examining scale at a funding
area level (such as school districts in the USA or local education authorities in the
UK), others looking at the school level, and others still considering both. There
is evidence of scale economies in funding areas [22, 60] and in funding areas
up to a certain size [109, 327]; and a study of only large school districts finds
no evidence of scale economies [273]. There is also evidence of scale economies
in schools [41, 51, 107, 166]. When the two are considered together (schools
and funding districts), economies of scale are found in schools [224] and in both
schools and school funding areas [70]. In a rare study of secondary schools which
incorporates multiple outputs (relating to education clusters), ray economies of
scale are observed along with product-specific economies in six of the seven fields
considered and also global economies of scope [295]. Economies of scale are
confirmed in a study of school districts which are assumed to produce two outputs,
namely, primary and secondary education, but in this case there are no economies of
scope [62], and this aligns with findings from an earlier multi-product cost study by
Jimenez [166].
In the context of higher education, and across various developed countries, cost
functions estimated using parametric non-frontier estimation methods tend to find
that there are ray economies of scale but that the evidence on economies of scope is
more mixed [86, 99, 110, 134, 135, 213–215, 280, 319].
1204 J. Johnes

Parametric, Frontier Estimation

Much of the literature on estimates of scale economies in the context of schools
is based on non-frontier estimation. There are some exceptions where both non-
frontier and frontier methods are applied [109], and these find that the coefficients
for the frontier model are similar to those estimated using non-frontier methods.
More recently, Gronberg et al. [143] use SFA in the school context to investigate the
economies deriving from consolidation of school districts in Texas. They find that
economies of scale can be gained from consolidating very small school districts
(producing primary and secondary education). Other studies which use frontier
methods have largely focused on efficiency aspects rather than scale economies,
and so will be reviewed in Section “Efficiency, productivity change, and analyses of
factors underlying efficiency”.
There are many studies of economies of scale and scope in higher education
using parametric, frontier estimation methods to estimate the multi-product cost
function (e.g., [3, 10, 13, 164, 169, 183–185, 195, 197, 249, 297, 301]). While
there is some variation in findings, typically the studies using these methods find
that ray economies of scale are exhausted; an exception relates to Japanese private
universities where many HEIs enjoy economies of scale [249]. Some also find
there are product-specific scale economies relating to research and/or postgraduate
outputs [14, 15, 187, 188, 249]. Most of the evidence on scope economies, however,
points to diseconomies of scope both globally and (where calculated) for individual
products [164, 170, 183, 195, 197, 249].

Non-parametric Frontier Estimation

The need for cost function parameter estimates with which to derive measures of
scale and scope economies means that cost function studies using non-parametric
frontier estimation methods typically examine efficiency rather than the specific
cost function concepts discussed above. As such, a review of these studies will
be presented in Section “Efficiency, productivity change, and analyses of factors
underlying efficiency”. An exception is Thanassoulis et al. [301], who examine costs
in English higher education and find that student numbers could be expanded
by exploiting currently unexploited scale and scope economies [301, 302].

Summary
Aside from the fact that the findings provide some mixed messages, there are some
additional caveats. In the context of schools, most studies do not include externality
costs of increasing school size. As a school increases its size, for example, student
discipline issues increasingly arise, and the crime and violence which this may
engender impose external costs on pupils, families, and society more generally,
which are not taken into account in a standard school cost function [122]. This
suggests that care should be taken when interpreting the results of standard cost
functions. An area of future research which would be particularly useful to managers
and policy-makers alike would therefore revolve around developing empirical cost
functions for schools which incorporate these externalities.
Another caveat is that if recommendations regarding scale economies are derived
from cost functions which inadequately measure quality of outputs, there may be a
detrimental effect on pupil outcomes from increasing school size. The relationship
between size and outcomes can be examined more closely using a production func-
tion approach, and we will consider this further in Section “Production functions,
distance function, shadow prices, and elasticities”, while the issue of increasing
school size and pupil outcomes is investigated in depth by Schiltz and De Witte
[281]. Caveats regarding inadequate measurement of quality apply equally in the
higher education context.

Recent Developments in Estimating Cost Functions

Education providers, whether at primary, secondary, or tertiary levels, can vary
widely in terms of, for example, their mission, size, and history. Such diversity,
if not taken into account, can potentially affect parameter estimates and hence the
estimated economies of scale and scope. Inclusion in the cost function of contextual
variables to reflect defined characteristics, such as mission or region, is one approach
to addressing the diversity issue [323, 325]; another approach is to estimate cost
functions within pre-defined groups based on perceptions about what characteristics
ought to affect cost function parameters. These characteristics might be type of
institution such as public or private [86, 165], or mission group [183, 195, 197,
301]. Such an approach can identify differences in estimated parameters across the
specified groups.
Thus, known characteristics of institutions affect costs, but there may also
be unobserved characteristics which affect costs. Random parameter (RP)
SFA [140, 305] and latent class (LC) SFA [221, 253] allow both observable and
unobservable characteristics to be taken into account in the estimation of parameters
and efficiency scores – explored further in Section “Efficiency, productivity change,
and analyses of factors underlying efficiency” [185]. In particular, RP SFA, which
requires panel data for estimation, allows parameters to vary by each individual
provider, while LC SFA, which can be applied to cross-sectional data, permits
parameters to be derived for groups of HEIs – however the groups are not pre-
defined by the analyst, but rather they are determined by the data. These methods
not only lead to different parameters across (groups of) institutions but also allow the
calculation of scale and scope economies by individual provider or by group. Studies
adopting RP SFA have found evidence that ray economies of scale are typically
exhausted or decreasing and diseconomies of scope are observed. There are some
product-specific economies of scale, but these vary from study to study [14, 15,
183, 187, 188, 197]. When LC SFA methods are used, the findings on economies
of scale and scope vary from group to group [184, 185]. While the LC approach is
attractive in defining groups based on the data, the disadvantage is that the results
can be difficult to interpret if the composition of the resulting groups does not align
with any obvious patterns.
Policy Implications and Future Work

Empirical estimations regarding economies of scale and scope whether at school or
university level are policy relevant as they can feed into considerations regarding
potential consolidation – this might be at funding unit level (such as districts) or at
organization level. Thus policies to amalgamate schools or to merge universities can
be developed from such empirical work.
However, the findings reported above at both school and university levels are
often mixed or conflicting, and it is difficult to develop coherent policies on such
a basis. Future work which examines the reasons why findings on economies of
scale and scope vary by type of organization might therefore help to better inform
policy [159]. A useful contribution to the literature undertakes a meta-analysis
of cost function studies to identify reasons for the mixed findings in the higher
education context [324]. It seems that estimates of scale efficiency vary according
to model specification and functional form assumed and whether or not managerial
efficiency is taken into account (a quadratic cost function in particular seems to
lead to a conclusion of diseconomies of scale, as does a model which accounts
for inefficiency). Estimates of scope efficiency are affected not just by model
specification but also by period covered by the study, sample size, and type of
data. In particular, estimates derived from older, cross-sectional data from small or
developing country samples of universities are likely to lead to the conclusion that
scope economies exist [324]. At school level, a meta-regression analysis of optimum
school size based on 10 studies with 22 estimates finds the optimum school size to
be around 1543 pupils [88]. These studies provide useful insights, and much more
work of this kind would be welcome, particularly at the school level.
A less explored area of research concerns the derivation of elasticities from
parametric cost functions. Both Allen and Morishima elasticities (defined in Section
“Estimating cost functions in education and higher education: Challenges and
methodology”) can provide useful details about substitution possibilities between
inputs. One example in higher education [318] suggests that
within universities it is easier to switch into capital inputs than into academic or non-
academic labor; indeed the substitution possibilities between the two types of labor
seem limited. Examples from the school context suggest that instructional, support,
and administrative inputs are generally substitutable [62, 152, 166], although Allen
elasticities imply less ability to substitute between the non-instructional input and
others [152]. Much more work could be undertaken in this context to provide useful
policy insights.

Production Functions, Distance Function, Shadow Prices, and Elasticities

Concepts relating to the multi-product nature of education and higher education can
also be examined in a production function context. Given both the multi-input and
multi-output nature of production, estimation of production-related concepts leads
to the output distance function approach which has numerous advantages: it does
not assume any particular optimizing behavior on the part of the firms, which is an
advantage in the non-profit context in which schools and universities often operate;
it does not require a knowledge of prices of either inputs or outputs, the latter being
particularly useful in education and higher education where teaching outputs, for
example, are difficult to value; and it does not require prices to be exogenous [81,
82, 308]. In the context of education and higher education, the concepts of interest
in the production setting include:

• Existence or otherwise of returns to scale
• Existence or otherwise of returns to scope
• Extent of substitution possibilities in the production relationship through evalua-
tion of elasticities of substitution (between inputs)
• Extent of complementarity or substitutability between the outputs through
evaluation of elasticities of substitution (between outputs)

Full details of the theory underpinning production and the related concepts
(including elasticities) can be found in Chaps. 3 and 22 of Volume I of this
publication. Each of these concepts will be considered briefly in the context of the
empirical literature in this section.

Background on Production Concepts

We assume that schools or universities produce multiple outputs from a variety of
inputs. Let us assume, as in Section “Cost functions and economies of scale and
scope”, that providers – be they schools or HEIs – use a vector of inputs $x \in \mathbb{R}^{K}_{+}$ to
produce a vector of outputs $y \in \mathbb{R}^{M}_{+}$. We assume that providers focus on producing
outputs relative to given inputs (an output-oriented approach) and hence define the
production technology for a provider as

\[ P(x) = \left\{ y \in \mathbb{R}^{M}_{+} : x \text{ can produce } y \right\} \tag{12} \]

where y is the vector of outputs already defined and x is the vector of inputs.

The output distance function [287], denoted by D(x, y), is defined on the output
set P(x) as

\[ D(x, y) = \min_{\theta} \left\{ \theta : (y/\theta) \in P(x) \right\} \tag{13} \]
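To make the definition in Eq. (13) concrete, the following minimal sketch evaluates the output distance function by bisection for a purely hypothetical technology with one input and two outputs, P(x) = {y : y1² + y2² ≤ x}. The technology, the function names, and the numerical values are illustrative assumptions and are not taken from the studies reviewed here. For this technology the distance has the closed form D(x, y) = √(y1² + y2²)/√x, so the numerical result can be checked, along with homogeneity of degree +1 in outputs.

```python
import math

def in_output_set(x, y):
    """Hypothetical technology (an assumption): P(x) = {y : y1^2 + y2^2 <= x}."""
    return y[0] ** 2 + y[1] ** 2 <= x

def output_distance(x, y, tol=1e-12):
    """Smallest theta > 0 such that y / theta lies in P(x), found by bisection."""
    lo, hi = 0.0, 1.0
    # Enlarge the upper bound until the scaled-down output vector is feasible.
    while not in_output_set(x, [v / hi for v in y]):
        hi *= 2.0
    # Feasibility is monotone in theta, so bisection brackets the minimum.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_output_set(x, [v / mid for v in y]):
            hi = mid
        else:
            lo = mid
    return hi

x = 25.0
y = [3.0, 4.0]
d = output_distance(x, y)                             # observation on the frontier
d_scaled = output_distance(x, [2.0 * v for v in y])   # doubling outputs doubles D
closed_form = math.hypot(*y) / math.sqrt(x)
```

A value of D(x, y) equal to 1 places the provider on the frontier of P(x); values below 1 indicate that outputs could be expanded proportionally with the same inputs.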

The output distance function is non-decreasing, convex, and positively linearly
homogeneous (i.e., homogeneous of degree +1) in outputs y, and can be used to derive
shadow prices and substitution properties. We define shadow prices of inputs as

\[ \frac{\partial D(x, y)}{\partial x_k} \tag{14} \]

The marginal rate of technical substitution between inputs k and l ($MRTS_{kl}$)
reflects the slope of the isoquant, provides a measure of substitutability between
inputs k and l, and is derived from the ratio of input shadow prices:

\[ MRTS_{kl} = \frac{\partial D(x, y)/\partial x_k}{\partial D(x, y)/\partial x_l} \tag{15} \]

This statistic is affected by the units in which inputs are measured, and so it is
conventional to calculate a normalized $MRTS_{kl}$:

\[ sub_{kl} = \frac{\partial D(x, y)/\partial x_k}{\partial D(x, y)/\partial x_l} \cdot \frac{x_k}{x_l} \tag{16} \]

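If $sub_{kl} > 1$ ($sub_{kl} < 1$), it is difficult (easy) to substitute out of input k into input
l [259]. An alternative measure of substitutability is the Allen elasticity defined in
this output distance function context as

\[ A_{kl}(x, y) = \frac{D(x, y)\, D_{kl}(x, y)}{D_k(x, y)\, D_l(x, y)} \tag{17} \]

where $D_k$ and $D_{kl}$ denote the first and second partial derivatives of D with respect to
the inputs indicated by the subscripts.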

If Akl (x, y) > 0 (<0), it is difficult (easy) to substitute between the two inputs.
When the number of inputs exceeds two, there are many directions in which the
curvature of the isoquant can be measured, and thus subkl and Akl (x, y) can be
unsatisfactory in reflecting substitutability in this situation. The Morishima elasticity
of substitution appears to be a more satisfactory measure of substitutability in
the multiple input case [46]. The (indirect) Morishima elasticity of substitution is
defined in this output distance function context as [259]

\[ M_{kl}(x, y) = -\frac{d \ln \left[ D_k(x, y)/D_l(x, y) \right]}{d \ln \left[ x_k/x_l \right]} = x_k \frac{D_{kl}(x, y)}{D_l(x, y)} - x_k \frac{D_{kk}(x, y)}{D_k(x, y)} \tag{18} \]

This gives the percentage change in the slope of the MRTS brought about by
a percentage change in ratio of inputs. If xk and xl are highly substitutable, values
will be small (less than or equal to zero); and the elasticity rises if substitutability
possibilities between the inputs xk and xl are limited. The Morishima elasticity (in
contrast to the Allen elasticity) is asymmetric such that Mkl (x, y) will not normally
equal Mlk (x, y) [145].
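As an illustration of Eqs. (16)–(18), the sketch below approximates the required first and second derivatives by central finite differences and evaluates the normalized MRTS, the Allen elasticity, and the Morishima elasticity. The distance function used, D(x, y) = |y|/(x1^A · x2^B), and the values of A and B are hypothetical choices made only so that the results can be checked: for this form the three measures reduce analytically to A/B, 1, and 1, respectively. Empirical work would instead use derivatives of an estimated distance function.

```python
import math

A, B = 0.6, 0.3  # hypothetical input exponents (assumption)

def dist(x, y):
    """Hypothetical output distance function D(x, y) = |y| / (x1^A * x2^B)."""
    return math.hypot(y[0], y[1]) / (x[0] ** A * x[1] ** B)

def d1(x, y, k, h=1e-4):
    """Central-difference estimate of D_k = dD/dx_k."""
    up, dn = list(x), list(x)
    up[k] += h
    dn[k] -= h
    return (dist(up, y) - dist(dn, y)) / (2 * h)

def d2(x, y, k, l, h=1e-4):
    """Central-difference estimate of D_kl = d2D/(dx_k dx_l); also valid for k == l."""
    def shifted(sk, sl):
        z = list(x)
        z[k] += sk * h
        z[l] += sl * h
        return dist(z, y)
    return (shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)

def normalized_mrts(x, y, k, l):
    """Eq. (16): ratio of input shadow prices scaled by x_k / x_l."""
    return d1(x, y, k) / d1(x, y, l) * x[k] / x[l]

def allen(x, y, k, l):
    """Eq. (17): D * D_kl / (D_k * D_l)."""
    return dist(x, y) * d2(x, y, k, l) / (d1(x, y, k) * d1(x, y, l))

def morishima(x, y, k, l):
    """Eq. (18): x_k * D_kl / D_l - x_k * D_kk / D_k."""
    return x[k] * d2(x, y, k, l) / d1(x, y, l) - x[k] * d2(x, y, k, k) / d1(x, y, k)

x, y = [2.0, 3.0], [3.0, 4.0]
sub_01 = normalized_mrts(x, y, 0, 1)
allen_01 = allen(x, y, 0, 1)
morishima_01 = morishima(x, y, 0, 1)
```

The Morishima measure coincides in both directions for this particular form; with a more flexible form such as the translog, the values for (k, l) and (l, k) will generally differ, which is the asymmetry noted above.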

Estimating Distance Functions in Education and Higher Education: Challenges and Methodology

Many of the challenges of empirical estimation in the production context are the
same as those already discussed in Section “Estimating cost functions in education
and higher education: Challenges and methodology” in the cost context and will not
be discussed further here. In addition, just as there are problems in specifying output
measures in education and higher education, there are also challenges in identifying
satisfactory measures of inputs to the education production process.
Inputs to schools are broadly defined as labor, capital, and pupils. Inputs
to higher education are similarly defined as labor, capital, and students. Labor
inputs can be categorized by type of labor (teaching or support) and might be
standardized by number of students to give a staff to student ratio [56, 241, 255,
267, 328]. Staff quality might be incorporated by using qualifications, experience
indicators, or salary data [19, 73, 142, 199, 285, 291]. Capital normally encompasses
buildings, computers, and library facilities [18, 19, 24, 175, 178]. Measuring pupil
or student input is possibly the biggest challenge as pupils vary by ability and
socioeconomic, personal, and family circumstances, and these can all affect their
success in school or higher education, and so should be incorporated in some way.
Finally, the environment in which a provider is located can impact its outcomes,
and so variables to reflect the economic characteristics of the location are typically
included, particularly at school level (see [102] for a full overview of all possible
inputs).
It is worth noting that providers, particularly in education where catchment areas
are relatively local, have varying amounts of control over the different inputs. Thus,
they can certainly control how much labor or capital is employed, but they cannot
control the economic or demographic characteristics of the environment in which
they operate. It is therefore important to take account of these non-discretionary
inputs as they might impact the estimated parameters and hence measures of returns
to scale and scope.
The distance function assumes constant technology across all providers. As in
the case of costs, differences in mission or other potentially relevant attributes (such
as size) can be accommodated in various ways, the most straightforward one being
inclusion of additional variables to reflect these differences.
Perhaps the main challenge in the production context is the estimation of the out-
put distance function. If we assume that the outputs in the multi-output, multi-input
case are entirely separately produced, then we can undertake estimation separately
for each output. This means that for each output, estimation of the distance function
simply reverts to estimation of the production function in a single output production
context. This is the approach taken in many studies of production in the education
context [4, 91, 92, 112, 201, 239, 285] and in early production studies in the higher
education context [171, 189–193]. This approach, however, does not capture the
“jointness” of production potentially observed in education and higher
education (for example, research undertaken by academic staff in universities may also
feed into their research-led teaching [104]) and therefore precludes an examination
of substitutability [77–79, 151].
The multi-input, multi-output nature of production raises potential difficulties
of estimation. The non-parametric approach underpinned by linear programming
techniques can accommodate both multiple inputs and multiple outputs, but the
derivation of the concepts of interest (such as elasticities or returns to scale or
scope) can be more difficult. The ease of application, however, makes the
non-parametric approach a popular choice in the estimation of education and higher
education distance functions with multiple inputs and multiple outputs.
The parametric approach allows for stochastic errors and provides estimates
from which returns to scale and scope and elasticities can be derived, but the
estimation is more problematic as it is typical to have a single output related to
multiple inputs or a single input (often costs as explored in Section “Background
on cost concepts”) related to multiple outputs. As a consequence, there is limited
application of parametric methods to the empirical estimation of multi-input
multi-output distance functions in education, although there is a growing literature. A
disadvantage of the parametric approach is that results may be sensitive to the
choices of functional form and error distribution.
The practical application of a parametric distance function requires a functional
form that (a) is flexible; (b) is easy to estimate; and (c) permits the imposition of
homogeneity [83]. The translog fulfills all three criteria and is the functional form
of choice in many education applications. We assume as before that providers use
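inputs $x_k$ (k = 1, . . . , K) to produce outputs $y_m$ (m = 1, . . . , M). The translog distance
function is defined as

\[
\begin{aligned}
\ln D(x, y) = {} & \alpha_0 + \sum_{m=1}^{M} \alpha_m \ln y_m + \frac{1}{2} \sum_{m=1}^{M} \sum_{n=1}^{M} \alpha_{mn} \ln y_m \ln y_n + \sum_{k=1}^{K} \beta_k \ln x_k \\
& + \frac{1}{2} \sum_{k=1}^{K} \sum_{l=1}^{K} \beta_{kl} \ln x_k \ln x_l + \sum_{k=1}^{K} \sum_{m=1}^{M} \delta_{km} \ln x_k \ln y_m + \varepsilon
\end{aligned}
\tag{19}
\]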

Distance function restrictions require that the following conditions hold:

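(a) Homogeneity of degree +1 in outputs

\[ \sum_{m=1}^{M} \alpha_m = 1 \quad \text{and} \tag{20a} \]

\[ \sum_{n=1}^{M} \alpha_{mn} = 0, \quad m = 1, 2, \ldots, M \quad \text{and} \tag{20b} \]

\[ \sum_{m=1}^{M} \delta_{km} = 0, \quad k = 1, 2, \ldots, K \tag{20c} \]

(b) Symmetry

\[ \alpha_{mn} = \alpha_{nm}, \quad m, n = 1, 2, \ldots, M \quad \text{and} \tag{21a} \]

\[ \beta_{kl} = \beta_{lk}, \quad k, l = 1, 2, \ldots, K \tag{21b} \]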
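The role of restrictions (20a)–(20c) can be checked numerically. The sketch below evaluates the deterministic part of the translog in Eq. (19) (the error term is omitted) for M = 2 outputs and K = 2 inputs, using hypothetical parameter values chosen only to satisfy the homogeneity and symmetry conditions. If the restrictions hold, scaling all outputs by ω should raise ln D by exactly ln ω.

```python
import math

# Hypothetical parameter values (assumptions), chosen to satisfy Eqs. (20)-(21).
ALPHA0 = 0.1
ALPHA = [0.6, 0.4]                        # sums to 1 (20a)
ALPHA2 = [[0.1, -0.1], [-0.1, 0.1]]       # rows sum to 0 (20b), symmetric (21a)
BETA = [-0.5, -0.4]
BETA2 = [[0.02, -0.01], [-0.01, 0.03]]    # symmetric (21b)
DELTA = [[0.05, -0.05], [-0.02, 0.02]]    # DELTA[k][m]; each row sums to 0 (20c)

def ln_translog(x, y):
    """Deterministic part of Eq. (19): ln D(x, y) without the error term."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    K, M = len(lx), len(ly)
    out = ALPHA0
    out += sum(ALPHA[m] * ly[m] for m in range(M))
    out += 0.5 * sum(ALPHA2[m][n] * ly[m] * ly[n] for m in range(M) for n in range(M))
    out += sum(BETA[k] * lx[k] for k in range(K))
    out += 0.5 * sum(BETA2[k][l] * lx[k] * lx[l] for k in range(K) for l in range(K))
    out += sum(DELTA[k][m] * lx[k] * ly[m] for k in range(K) for m in range(M))
    return out

x, y, omega = [2.0, 3.0], [4.0, 5.0], 1.7
# Homogeneity of degree +1 in outputs: ln D(x, omega * y) - ln D(x, y) = ln(omega).
gap = ln_translog(x, [omega * v for v in y]) - ln_translog(x, y)
```

Changing any of the restricted parameters (for example, setting ALPHA to values that do not sum to one) breaks the equality, which is why the restrictions are imposed in estimation rather than left to the data.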

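The homogeneity in output restriction means that D(x, ωy) = ωD(x, y) and hence
the Mth output can be chosen arbitrarily such that $\omega = 1/y_M$. Thus Eq. (19) can be
written as

\[
\begin{aligned}
-\ln y_M = {} & \alpha_0 + \sum_{m=1}^{M-1} \alpha_m \ln \frac{y_m}{y_M} + \frac{1}{2} \sum_{m=1}^{M-1} \sum_{n=1}^{M-1} \alpha_{mn} \ln \frac{y_m}{y_M} \ln \frac{y_n}{y_M} \\
& + \sum_{k=1}^{K} \beta_k \ln x_k + \frac{1}{2} \sum_{k=1}^{K} \sum_{l=1}^{K} \beta_{kl} \ln x_k \ln x_l \\
& + \sum_{k=1}^{K} \sum_{m=1}^{M-1} \delta_{km} \ln x_k \ln \frac{y_m}{y_M} + \varepsilon
\end{aligned}
\tag{22}
\]

where the error ε = − ln D(x, y)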
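In practice, Eq. (22) is estimated as a regression of −ln y_M on the normalized outputs and the inputs. A minimal sketch of how one observation might be turned into a dependent variable and a row of regressors is given below. The function name and the ordering of the regressors are assumptions made for illustration; symmetry (Eqs. 21a and 21b) is used so that each distinct pair of variables appears once, with a weight of ½ on squared terms and 1 on cross-products.

```python
import math

def normalized_row(x, y):
    """Return (-ln y_M, regressors) for Eq. (22), with the last output as numeraire."""
    ln_x = [math.log(v) for v in x]
    ln_r = [math.log(v / y[-1]) for v in y[:-1]]   # ln(y_m / y_M), m = 1..M-1

    def second_order(terms):
        # One regressor per distinct pair: 0.5 * t_i^2 and t_i * t_j for i < j.
        cols = []
        for i in range(len(terms)):
            for j in range(i, len(terms)):
                weight = 0.5 if i == j else 1.0
                cols.append(weight * terms[i] * terms[j])
        return cols

    row = [1.0]                                    # intercept (alpha_0)
    row += ln_r                                    # alpha_m terms
    row += second_order(ln_r)                      # alpha_mn terms
    row += ln_x                                    # beta_k terms
    row += second_order(ln_x)                      # beta_kl terms
    row += [lx * lr for lx in ln_x for lr in ln_r] # delta_km terms
    return -math.log(y[-1]), row

# Hypothetical observation with K = 2 inputs and M = 2 outputs.
dep, row = normalized_row([1.0, 1.0], [4.0, 2.0])
```

Stacking such rows over providers gives the data matrix for a least-squares or SFA estimation; the coefficients involving the Mth output can then be recovered from restrictions (20a)–(20c).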

The output distance function can be used to provide estimates of returns to scale
as follows. The output elasticity for input k is

\[ -\varepsilon_{D, x_k} = -\frac{\partial \ln D}{\partial \ln x_k} = \frac{\partial \ln y_M}{\partial \ln x_k} = \varepsilon_{y_M, x_k} \tag{23} \]

and measures the percentage change in output yM if xk changes by 1%, holding the
output ratios constant [258].
If we sum these output elasticities over k, we get

\[ \varepsilon_{y_M, x} = \sum_{k=1}^{K} \varepsilon_{y_M, x_k} \tag{24} \]

which is analogous to a returns to scale estimate. If $\varepsilon_{y_M, x} > 1$, we have increasing
returns to scale as a 1% increase in all inputs results in a more than 1% increase in output
expansion (with proportional changes in all outputs) [258].
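For the translog in Eq. (19), differentiating with respect to ln x_k and using symmetry gives ∂ ln D/∂ ln x_k = β_k + Σ_l β_kl ln x_l + Σ_m δ_km ln y_m, so Eqs. (23) and (24) can be evaluated at any data point once the parameters are known. The sketch below does this with hypothetical parameter values (not estimates from any study cited here) and classifies the result.

```python
import math

# Hypothetical translog parameters (assumptions) for K = 2 inputs, M = 2 outputs.
BETA = [-0.6, -0.5]
BETA2 = [[0.02, -0.01], [-0.01, 0.03]]   # symmetric
DELTA = [[0.05, -0.05], [-0.02, 0.02]]   # DELTA[k][m]

def output_elasticities(x, y):
    """Eq. (23): minus the derivative of ln D with respect to each ln x_k."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    result = []
    for k in range(len(x)):
        slope = BETA[k]
        slope += sum(BETA2[k][l] * lx[l] for l in range(len(x)))
        slope += sum(DELTA[k][m] * ly[m] for m in range(len(y)))
        result.append(-slope)
    return result

def returns_to_scale(x, y):
    """Eq. (24): sum of the output elasticities over all inputs."""
    return sum(output_elasticities(x, y))

def classify(rts, tol=1e-9):
    if rts > 1 + tol:
        return "increasing"
    if rts < 1 - tol:
        return "decreasing"
    return "constant"

# At the point where all variables equal one, the logs vanish and only BETA matters.
rts_unit = returns_to_scale([1.0, 1.0], [1.0, 1.0])
label_unit = classify(rts_unit)
```

Because the elasticities depend on the input and output levels, the scale measure is provider-specific; studies typically report it at the sample mean or summarize its distribution across providers.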

Findings from the Literature

Much of the literature regarding output distance functions focuses on deriving
estimates of efficiency (the subject of Section “Efficiency, productivity change,
and analyses of factors underlying efficiency”) rather than on the production
concepts referred to in this section. Early efforts to model the higher education
production function in a framework where multiple outputs are produced from
multiple inputs employ canonical correlation estimation methods and find a degree
of substitutability between inputs based on data on individual university students
[78, 79]. Most recently an SFA translog output distance function suggests that returns
to scale appear to be exhausted across the English higher education sector. Based
on estimates of Allen and Morishima elasticities, substitution is difficult between
academic and non-academic staff (a similar result to the school context reported
in Section “Recent developments in estimating cost functions”) and much easier
between academic staff and capital inputs [177].
Returns to scale can also be established using the non-parametric DEA approach.
Where this has been used in the higher education context, the findings generally
point to the prevalence of constant or decreasing returns to scale [38, 80, 177].
In the context of schools, the parametric approach taken has been to estimate
a single output production function with multiple inputs typically to have a better
understanding of returns to scale and optimal school size. The literature is somewhat
simplistic in its approach with a surprisingly limited focus on functional form
[22] especially compared to cost function studies (at school and higher education
level) and production function studies in the higher education context. The output
measures used are largely based on average test score, and efficiency is typically
not included (exceptions are [106, 223]). The limited evidence from this arena
suggests that returns are more often constant or decreasing [106, 123, 128, 298]
than increasing [121, 209].

Policy Implications and Future Work

The work on elasticities of substitution reveals some interesting differences in
opportunities to substitute between inputs across HEIs which subsequently
merge, those which do not merge at all, and post-merger institutions [177]. Greatest
opportunities for substitution are generally observed for HEIs which will subse-
quently merge. Institutional merger is sometimes considered as a policy initiative by
governments (Cai and Yang [61] summarize merger activity across countries), and
so this observation is important as it suggests that institutions which do not have the
appropriate initial characteristics prior to merger may not reap the potential rewards
(see Section “Policy implications and future work” for more on the efficiency
effects of mergers in higher education). More work is needed to investigate these
findings further and to confirm whether initial characteristics of providers are indeed
important in determining success following merger.

Efficiency, Productivity Change, and Analyses of Factors Underlying Efficiency

A by-product of the frontier estimation techniques applied in the costs or production
contexts is that they also lead to the derivation of measures of efficiency for
providers in the sample. By choosing a frontier estimation method, the researcher
is therefore also able to undertake a detailed examination of efficiency and, if panel
data is available, productivity of organizations. Such analysis is particularly impor-
tant in the education and higher education contexts where the non-profit nature of the
sector makes traditional financial ratios inappropriate for performance measurement
[42], yet the public funding aspect makes it crucial to understand whether resources
are being used efficiently. The interest in efficiency, and the availability of a wealth
of data on inputs and outputs, in education and higher education sectors around the
world has led to a large literature on education efficiency and productivity, a review
of which can be found in Johnes [180].
Efficiency should not be confused with effectiveness: the latter relates to doing
the right things – in education it means having the right quantity of outputs –
while the former relates to doing things right, where in education it means using
scarce resources to produce the highest possible outputs [76, 127]. Typically,
efficiency receives the greater attention in the literature, and this will be the focus
of this section. It should be noted, however, that one novel publication looks at,
distinguishes, and provides comparative measures of both concepts (efficiency and
effectiveness), with an application in the secondary schooling context, and this will
be reviewed further below [76].

Background on Efficiency Concepts

Efficiency work is rooted in the seminal contribution of Farrell [120], and the two
main approaches used to derive and examine efficiency are SFA and DEA (already
discussed in Section “Cost functions and economies of scale and scope”). These
methods can be used to derive various measures of efficiency based on cost (or
input distance) functions and output distance functions [200]. From a cost point
of view, the parametric measure of efficiency is derived from the error term of, for
example, Eq. (6), i.e., as ε = v + u where v is a stochastic error and u is the one-sided
efficiency term. In the production context, the parametric estimate of efficiency is
derived from, for example, Eq. (22). The distance measure, lnD(x, y), is the quantity
of interest in Eq. (19) as this provides a measure of efficiency, and this is derived
from the error term in Eq. (22), which is typically assumed to be split into two
components, i.e., ε = v − u where v is a stochastic error and u is the one-sided
efficiency term.
The non-parametric measure of efficiency is often derived from the DEA
approach such that D(x, y) is defined as [74, 75]

\[ D(x, y) = \frac{\sum_{m=1}^{M} a_m y_m}{\sum_{k=1}^{K} b_k x_k} \tag{25} \]

where $y_m$ and $x_k$ are as already defined, $a_m$ is the weight applied to output m,
and $b_k$ is the weight applied to input k. For each DMU, the weights are found by
maximizing efficiency subject to the constraints that weights must be non-zero and
universal. DEA can be applied in the context of constant returns to scale (CRS) or
variable returns to scale (VRS). A DMU is fully efficient if D(x, y) = 1.
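The ratio in Eq. (25) and the constraint that weights be "universal" can be illustrated without solving the full optimization. In the sketch below, a candidate set of weights is applied to every DMU and is treated as admissible only if all weights are strictly positive (following the earlier description of the weights as positive) and no DMU scores above 1. The data, weights, and function names are hypothetical; a complete DEA would instead solve a linear program for each DMU to find its most favorable admissible weights.

```python
def ratio_efficiency(a, b, y, x):
    """Eq. (25): weighted sum of outputs divided by weighted sum of inputs."""
    return sum(am * ym for am, ym in zip(a, y)) / sum(bk * xk for bk, xk in zip(b, x))

def weights_are_admissible(a, b, dmus, tol=1e-9):
    """Weights must be positive and must give every DMU a score of at most 1."""
    if min(a) <= 0 or min(b) <= 0:
        return False
    return all(ratio_efficiency(a, b, y, x) <= 1 + tol for x, y in dmus)

# Hypothetical data: three DMUs, one input and two outputs each, stored as (x, y).
dmus = [
    ([10.0], [20.0, 5.0]),
    ([10.0], [10.0, 10.0]),
    ([10.0], [5.0, 5.0]),
]

# Candidate weights for evaluating the first DMU (illustrative values only).
a, b = [2.0, 1.0], [4.5]
admissible = weights_are_admissible(a, b, dmus)
scores = [ratio_efficiency(a, b, y, x) for x, y in dmus]

# Lowering the input weight inflates every score and violates the constraint.
inflated_ok = weights_are_admissible(a, [3.0], dmus)
```

Under these candidate weights the first DMU attains the maximum admissible score, so no other admissible weights could rate it more highly; the other DMUs would each be assessed again under their own most favorable weights.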
In establishing the efficiency of an organization, we therefore examine its
observed production/costs relative to best practice in the entire industry. As such,
the frontier methodology provides a benchmark which an inefficient provider can
use to help it to become more efficient and ultimately to move on to the best practice
frontier.
When we have a panel of data, bringing in a time dimension (denoted by t and
by t + 1), we are able to perform an analysis of productivity change which can be
measured using the Malmquist productivity index [228], developed by Caves et al.
[66] and further by Färe et al. [119], which is derived as follows for the output
distance function (where superscripts on D denote the time period of the reference
frontier and subscripts on x and y denote the time period of the observation):

\[ M(x_{t+1}, y_{t+1}, x_t, y_t) = \left[ \frac{D^{t}(x_{t+1}, y_{t+1})}{D^{t}(x_t, y_t)} \cdot \frac{D^{t+1}(x_{t+1}, y_{t+1})}{D^{t+1}(x_t, y_t)} \right]^{1/2} \tag{26} \]

Notation is as defined earlier, and $D^{t}(x_{t+1}, y_{t+1})$ denotes the distance of the
period t+1 observation from the period t frontier. If the Malmquist productivity
change index exceeds unity, there has been an improvement in productivity between
periods t and t+1. Values less than 1 suggest the converse.
The change in the production position of a provider over the two time periods
has two underlying determinants: first, the provider can produce more because the
output distance frontier for the sector has moved outward, and therefore the potential
for production across all providers is expanded; second, the provider’s position
relative to the time-relevant frontier can change. The Malmquist productivity index
can be decomposed into two components as follows [118, 119]:

\[ M(x_{t+1}, y_{t+1}, x_t, y_t) = \frac{D^{t+1}(x_{t+1}, y_{t+1})}{D^{t}(x_t, y_t)} \left[ \frac{D^{t}(x_{t+1}, y_{t+1})}{D^{t+1}(x_{t+1}, y_{t+1})} \cdot \frac{D^{t}(x_t, y_t)}{D^{t+1}(x_t, y_t)} \right]^{1/2} \tag{27} \]

The first component, $D^{t+1}(x_{t+1}, y_{t+1}) / D^{t}(x_t, y_t)$, measures the change in technical
efficiency over the two periods (i.e., whether or not the unit is getting
closer to its efficiency frontier over time), and the second component,
$\left[ \frac{D^{t}(x_{t+1}, y_{t+1})}{D^{t+1}(x_{t+1}, y_{t+1})} \cdot \frac{D^{t}(x_t, y_t)}{D^{t+1}(x_t, y_t)} \right]^{1/2}$, measures the change in technology over the
two time periods (i.e., whether or not the frontier is shifting out over time). Values
of either of these components of greater (less) than unity suggest improvement
(deterioration) in the measure.
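Given the four distance values for a provider, Eqs. (26) and (27) are simple to compute. The sketch below uses hypothetical distances (in applied work these would come from DEA or SFA estimates) and confirms that the efficiency-change and technology-change components multiply to the Malmquist index. The function and argument names are illustrative: d_a_b denotes the distance of the period-b observation from the period-a frontier.

```python
import math

def malmquist(d_t_t, d_t_t1, d_t1_t, d_t1_t1):
    """Eq. (26): geometric mean of the period-t and period-(t+1) based ratios."""
    return math.sqrt((d_t_t1 / d_t_t) * (d_t1_t1 / d_t1_t))

def efficiency_change(d_t_t, d_t1_t1):
    """First component of Eq. (27): movement relative to the contemporaneous frontier."""
    return d_t1_t1 / d_t_t

def technology_change(d_t_t, d_t_t1, d_t1_t, d_t1_t1):
    """Second component of Eq. (27): shift of the frontier between the two periods."""
    return math.sqrt((d_t_t1 / d_t1_t1) * (d_t_t / d_t1_t))

# Hypothetical distance values for a single provider (assumptions).
d = {"d_t_t": 0.8, "d_t_t1": 1.1, "d_t1_t": 0.7, "d_t1_t1": 0.9}

index = malmquist(**d)
ec = efficiency_change(d["d_t_t"], d["d_t1_t1"])
tc = technology_change(**d)
```

In this illustration all three values exceed unity, so the provider has moved closer to its frontier and the frontier itself has shifted outward, both contributing to the measured productivity improvement.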

Findings from the Literature

Efficiency
There is a huge literature reporting findings on efficiency in both education and
higher education, and various reviews can be found (see, e.g., [53, 172, 179, 186,
302, 317]) including a particularly detailed one [102]. This section provides a brief
overview of that literature.
In the context of schools and further education institutions, and taking a
production perspective, mean efficiency varies from just under 0.6 to well over 0.9
using parametric estimation methods [71, 91, 92, 106, 124, 146, 148, 201, 210,
244, 296]. A similar spread of mean efficiency scores is observed when using non-
parametric methods [5, 21, 43, 44, 48, 49, 53, 54, 59, 68, 72, 73, 118, 150, 153, 154,
157, 198, 202, 211, 230–232, 244, 246, 251, 252, 260, 262–266, 276, 279, 296, 303,
307]. For most of these studies, values are typically at the higher end of the range but
depend on model specification, context of the sample, type of schools (e.g., public
or private), and (in the case of DEA) whether constant or variable returns to scale
are assumed, with the latter providing higher mean estimates. An exception to these
studies is in the context of Australian schools [69] where mean efficiency is around
0.4 for primary schools and 0.5 for secondary schools – these results are discussed
further later in this section.
When a cost perspective is taken, mean efficiency is found to be relatively high
with a range of 0.83–0.96 using parametric methods [33, 142, 278] and 0.664–0.95
using non-parametric methods [31, 34, 35, 156, 229, 230, 275, 277, 278].
Most studies at university level use non-parametric methods (often DEA) in a
production context to estimate efficiency. Such studies, which cover an array of
university sectors, find average efficiency to be relatively high. Mean values tend
to fall in the range 0.5–0.97 [1, 3, 16, 38, 115, 178, 212, 257, 268], but there
are some models which yield mean efficiency below 0.5 [108, 217, 242, 313].
Parametric estimation methods applied in a production context yield relatively low
mean efficiency scores of the order 0.5–0.8 [178]. Mean efficiency derived from
cost function studies falls in a similar range of around 0.5 upward with smaller,
specialist institutions more likely to exhibit lower average efficiency [132, 164, 183,
185, 195–197, 297, 301].
Only a few studies have compared efficiency values of providers derived
using alternative methods. While efficiencies from parametric and non-parametric
estimations of cost or output distance functions are often significantly correlated
[170, 178, 208, 240], these correlations are not always particularly strong, suggesting
that different estimation methods can lead to different conclusions.
These findings on efficiency levels are interesting insofar as they lead to questions
as to why one provider is substantially more (or less) efficient than another.
It should be remembered, however, that they are only estimates; the possibility
of providing standard errors around the efficiency scores allows the researcher
to establish whether there are significant differences between providers. Where
this has been done, the conclusion is that there are significant differences only
between the best and worst performers [173, 178, 257]. It should be noted that
the estimation methods assume that the units under examination are comparable –
in terms of, for example, their production technology or environment. If such
differences between institutions exist but are not allowed for, this might be captured
in the efficiency score, and hence these scores should be interpreted with caution
[184, 185].
The differences between institutions and the subsequent questions raised by
efficiency analyses often lead to a second-stage investigation as to what factors
might actually influence how efficiently an institution can operate. There is a
considerable literature examining the determinants of efficiency at both school and
higher education levels. Methods of analysis vary. Early studies typically use DEA
followed by a Tobit approach to accommodate the contention that the dependent
variable (efficiency score) is a censored variable taking values between 0 and 1 [2,
49, 53, 54, 71, 91, 198, 202, 211, 216, 237, 240, 247, 265, 278, 279, 284]. Later
studies argue that the dependent variable is not censored but fractional [238] and
that the appropriate second-stage analysis should take an OLS estimation approach,
with White heteroskedasticity-consistent standard errors, which produces consistent
estimators for large samples [161, 238]. A regression approach (or suitable panel
data methodology) is used in the second stage in a number of studies [59, 156, 157,
230, 241, 264, 266, 313].
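The OLS second stage with heteroskedasticity-consistent standard errors can be sketched as follows. This is a minimal illustration, assuming a single explanatory variable and the simplest (HC0) form of the White estimator; the function name and data are hypothetical, and an applied study would normally use several covariates and an established statistics package.

```python
import math

def ols_with_white_se(z, scores):
    """Regress efficiency scores on one covariate z (with intercept).

    Returns (intercept, slope, robust_se_of_slope), where the standard
    error is the White HC0 estimate for the slope.
    """
    n = len(z)
    z_bar = sum(z) / n
    s_bar = sum(scores) / n
    sxx = sum((zi - z_bar) ** 2 for zi in z)
    sxy = sum((zi - z_bar) * (si - s_bar) for zi, si in zip(z, scores))
    slope = sxy / sxx
    intercept = s_bar - slope * z_bar
    residuals = [si - intercept - slope * zi for zi, si in zip(z, scores)]
    # HC0: each observation's squared residual weights its contribution.
    robust_var = sum((zi - z_bar) ** 2 * e ** 2 for zi, e in zip(z, residuals)) / sxx ** 2
    return intercept, slope, math.sqrt(robust_var)

# Hypothetical first-stage DEA scores and one school-level covariate.
covariate = [1.0, 2.0, 3.0, 4.0, 5.0]
dea_scores = [0.62, 0.70, 0.81, 0.77, 0.90]
intercept, slope, se = ols_with_white_se(covariate, dea_scores)
print(f"slope={slope:.4f}, robust se={se:.4f}")
```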
Separate second-stage analyses, such as those referred to above, have been
criticized. When using SFA to derive the efficiency scores, these scores are assumed
to be independently and identically distributed. Yet in the second stage, they are
assumed to be affected by factors relating to, for example, the DMU. Models which
address this issue have been devised for both cross-sectional and panel data [36,
163, 219, 270], and such methods which simultaneously apply SFA and investigate
the determinants of efficiency have been applied in the education context (see, e.g.,
[17, 240, 297, 328]).
Analyses of efficiency have uncovered a vast array of determinants of efficiency.
At the school level, school-related determinants including per pupil expenditures on
teachers, teacher salary, physical resource expenditure, and scale (school or class)
have all been found to be important, although direction of effect can vary from study
to study [53, 59, 202, 211, 251, 264, 278].
Pupil discipline record, absenteeism, and having pupils with special educational
needs also affect school efficiency [49, 91, 227, 230, 312] as does type of school
such as selective and single sex girls’ schools [53, 59]. Factors relating to the pupils
themselves are also highly important in determining the efficiency of schools. Such
factors include ethnic background, socioeconomic status, and parental education [5,
91, 157, 211, 237, 251, 265, 266].
Variables relating to the wider region in which the school is located are also
important in determining efficiency levels of schools. These include variables
indicating the unemployment rate and the wealth and educational attainment of
inhabitants of the area [5, 54, 157, 201, 251, 265, 278]. Direction of relationship
between such variables and efficiency can vary according to study. Finally, political
factors have also been found to play a part in determining school efficiency. These
include the source of funding (particularly deriving from local sources) and political
leaning of residents of an area, both of which can affect efficiency [4, 49, 312].
The array of variables affecting efficiency is therefore vast (more information can
be found in Burney et al. [58]), and the variables differ in terms of what the school
can control (such as resources) and what it cannot (such as characteristics of the
pupils in the catchment area and the regional environment). Clearly this distinction
is important in terms of developing policies to improve efficiency. As an initial
step, the second-stage analysis can be used to compute a revised efficiency score
which takes into account the variables. One study which does this finds that mean
efficiency in primary schools rises from 0.4 to 0.9 and in secondary schools from
0.5 to 0.9 [69]. This demonstrates the effect these variables can have in explaining
inter-institutional differences in efficiency, and managers and policy-makers should
be aware of this.
Similarly useful results are found in the higher education context. University-
related factors include provider size and composition, age, governance (such as
public or private), source of funding, geographical location, as well as staff
characteristics such as gender, age, and ethnicity [216, 240, 241, 284, 297, 313, 315,
328]. The influence of student characteristics on efficiency is less well investigated
[297].
It is worth ending this sub-section with a quick note on effectiveness. Cherchye
et al. [76] define a measure of effectiveness for organizations by assuming constant
resources; in practice this means applying the CRS DEA framework with resources
equal to unity for all DMUs in order to derive an effectiveness score. In applying
this methodology to Flemish secondary schools, they find that performance can
be improved more by improving efficiency (as there is unexploited production
capacity) than effectiveness. It will be interesting to see this methodology applied
to different sectors and countries.

Productivity
Measures of productivity have typically been undertaken using non-parametric
approaches in education and higher education. In applications to higher education
sectors as diverse as the UK, Italy, Spain, China, Australia, Australasia, and Iran,
productivity growth is found, and this appears to be more a consequence of
technology change (the frontier shifting out) than of efficiency change (inefficient
units getting closer to the frontier) [32, 126, 175, 178, 196, 197, 235, 250, 320].
There are, however, some exceptions where productivity has increased but due
to efficiency rather than technology change [130, 269]. When samples are split,
for example, by mission group, findings are more nuanced with some groups
experiencing productivity decline, and this too is a consequence of a shifting frontier
[301, 302]. In the context of productivity improvement, it is hypothesized that recent
innovations to higher education such as e-learning support for teaching and digital
support enabling and supporting research networks may well be reasons for the
frontier being pushed out. The inefficient universities may find it difficult to keep
pace with the changing technologies.
In the context of schooling, we find similar results regarding productivity
change and the underlying cause being technology improvements at both the
post-compulsory [54] and secondary school levels [113, 255] in the UK and
Canada. Where productivity is found to decline [260, 303], this is also related to
technological performance rather than efficiency decline. Johnson and Ruggiero
[199] take the Malmquist decomposition one step further by adding in a component
relating to environmental harshness. In a practical application to Ohio school
districts, the approach reveals that while technological progress drives productivity
change in top-performing school districts, it is the environmental harshness which is
the most important driver for low-performing districts. A similar approach is applied
to Dutch schools and also provides useful insights [55].

Recent Developments in Efficiency Measurement

Many developments covered in earlier sections are relevant here. Heterogeneity
among providers, and how it is addressed, is an important factor in efficiency
studies. Some researchers choose to divide their sample based on a known char-
acteristic, such as public or private funding [108, 212, 231], or by mission group.
More recently, developments in the methodological approaches are used to address
heterogeneity in the efficiency context. Thus LC and RP SFA, while providing
different parameters by group or unit (respectively), also provide different efficiency
scores by group or unit.
We have referred throughout this chapter to the issue of institutional diversity
in education and higher education sectors and considered ways in which diversity
has been handled. Another emerging approach in the efficiency context (based
on cost functions) is one which distinguishes between transient and permanent
efficiency [89, 90, 125, 220, 306, 311]. The underlying premise is that some
differences between organizations arise from a historical and geographical context
which the education provider cannot alter. Inefficiency differences arising from
such structural variations should be addressed differently from those arising from
transient (or short-term) factors. The precise approaches differ in subtle ways.
An SFA approach which allows for unobserved heterogeneity
and incorporates the premise of transient and permanent inefficiency
[220, 306] has been applied in the higher education context [11, 139]. It seems
that for German and Italian universities, transient efficiency is relatively high,
while persistent efficiency is much lower. Papadimitriou and Johnes [256] use an
approach developed by Filippini and Greene [125] and also find that persistent
efficiency is lower than transient efficiency in the English higher education sector.
Policies for improving efficiency therefore likely need to be adapted in light of this
finding: a low persistent efficiency value, for example, suggests a need for structural
changes.
An aspect of production analyses which we have not yet explored is that of
complexities in the production process. So far we have assumed that all inputs go
into a “black box” at the start of production, and all outputs come out of it at the end
point. In reality, the “black box” may be hiding a more complex production process
whereby some inputs may produce a set of outputs at one stage, and then (some of)
these outputs, possibly along with other inputs, then become inputs into a second
stage of production which produces more outputs. Where a production process can
be divided into a series of sub-processes, a standard DEA fails to account for the
efficiency of each sub-process. By ignoring such complexities, the standard DEA
might bias efficiency estimates [203, 206] and conceal useful information
about efficiency of each of the stages. Network DEA (NDEA) [117, 304] takes into
account such complexities of production and provides estimates of efficiency at each
stage. A number of studies have applied a network DEA approach mainly in the
higher education context [176, 222, 322].
A network approach, whereby outputs such as student satisfaction and student
achievement are assumed to happen in a first stage while employment outcomes
happen in a second stage (where student achievement is an input into that second
stage), reveals considerably more discrimination in terms of HEIs identified as
efficient. Moreover, the second stage (production of student outcomes in the labor
market) is less efficient than the first stage, thereby providing managers with useful
information on where they should concentrate their efforts in terms of improving
efficiency [176, 222]. Indeed, an analysis of the factors underpinning each of the
sets of efficiencies (stage 1 and stage 2) indicates that there are different reasons
for differential performance in each case and hence provides more information for
managers and policy-makers [222]. More work of this type at both school and higher
education level would be useful.
We have noted in Section “Findings from the literature” above the many studies
which employ a second-stage analysis to explore the variables which might impact
efficiency scores. However, such studies are valid only if the separability condition
between the input-output space of the first stage and the space of the external
factors in the second stage holds. In the situation where the separability condition
does not hold, then a conditional DEA model is the appropriate approach [67,
96, 97]. While it is important to check that the separability condition holds [288,
289], and a test of the validity of the separability assumption is available [98],
studies which investigate the issue of separability and apply a conditional non-
parametric approach are relatively rare to date (see, e.g., [45, 93, 94]). The early
indication is that academic or school-related variables may be less important than
economic and cultural indicators. A particularly novel and interesting application
of the conditional efficiency model investigates efficiency of the provision of adult
education programs in Flanders [282]. This work suggests that characteristics of
the adult learners and homogeneity among the teachers on programs are important
determinants of managerial efficiency in the adult education context. Clearly more
work using this approach is required at all levels of education.
The Malmquist approach has been extended to allow comparisons of perfor-
mance between groups rather than time periods [63], and this has further been
extended to examine and compare patterns of change across groups over time [23].
For example, in the context of schools in the Basque country in Spain, this approach
establishes that privately run schools have consistently better performance and that
this is because of superior technological performance. The methodology can also
be applied when there are more than two groups. When Ohio school districts are
assigned to five groups based on environmental harshness, the Malmquist decompo-
sition shows that productivity is largely explained by environmental harshness and
that technological progress is also hampered by the harshness of the environment
[199]. Distinctions are also found between public and private universities in Spain
with private universities outperforming their public counterparts at the start of the
study period, but the Malmquist decomposition reveals that the public universities
catch up over the period [100].

Policy Implications and Future Work

While average efficiency is generally found to be high in many education studies,
there is typically a spread of performance across providers, and this means that
the results can potentially be useful at a policy level. Efficiency-based funding
[116], for example, is one aspect where there has been relatively little work, but
the applications that exist suggest some potential for efficiency improvements by
distributing resources based on efficiency. Sexton et al. [286] provide an example
of an efficiency-based state funding scheme for HEIs underpinned by DEA. Such a
scheme, which would encourage HEIs to behave in such a way as to be consistent
with government or state objectives, would reap potential savings of 9% across the
sector, with differential savings observed in each provider. A particular advantage
of the approach is that, as efficiency improves relative to a given DEA frontier,
any subsequent DEA will produce an improved frontier against which efficiency
will be measured, and so there is a natural tendency of the approach for ongoing
improvement [286].
A drawback of the approach is that efficiency estimates based on annual
estimations can fluctuate from one year to another, meaning that there is potential
for instability in resource allocations [115]. A reduction in sensitivity might be
achieved by using a moving average over several years. In addition, an efficiency-
based funding scheme may not be appropriate if there is little significant deviation
in efficiency across providers. In such cases, the studies should instead be used to
provide institutions with useful information on benchmarking and examples of good
practice [185].
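The smoothing idea mentioned above can be sketched in a few lines. This is a minimal illustration, assuming a trailing moving average with a three-year window; the window length, function name, and annual scores are hypothetical rather than taken from any of the cited studies.

```python
def moving_average(scores, window=3):
    """Trailing moving average of annual efficiency scores.

    The first (window - 1) years have no smoothed value, so the result
    is shorter than the input by window - 1 entries.
    """
    if window < 1 or window > len(scores):
        raise ValueError("window must be between 1 and the number of years")
    return [
        sum(scores[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(scores))
    ]

# Hypothetical annual efficiency estimates for one provider.
annual = [0.82, 0.95, 0.78, 0.91, 0.85]
smoothed = moving_average(annual, window=3)
print("spread of annual scores:  ", round(max(annual) - min(annual), 3))
print("spread of smoothed scores:", round(max(smoothed) - min(smoothed), 3))
```

A funding allocation tied to the smoothed series would vary less from year to year than one tied to the raw annual estimates, at the cost of responding more slowly to genuine changes in performance.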
Even where efficiency does apparently vary substantially across providers, we
know from the second-stage analyses undertaken in previous studies that efficiency
is affected by various factors and some of these are outside the control of
the institutions. Strategies to improve efficiency must therefore be nuanced. For
example, if efficiency is affected by the ethnic mix of pupils [54], providers can do
little to alter that. Instead, they must focus on ways in which to improve outcomes
of the at-risk groups, and this may then impact on efficiency. The importance of
variables reflecting the conditions in the wider environment means that local and
government policies to improve economic conditions in a catchment area can also
impact school efficiency.
The introduction of increased competition in school sectors has been a deliberate
policy of some governments (e.g., in the UK) to improve school performance and
efficiency. There are various studies which have specifically examined the impact of
increased competition on efficiency in various state school sectors [6, 8, 52, 53, 157].
With one exception [148], these studies find that the larger the number of schools
in a region, the higher the schools’ efficiency. Some studies find that competition
from private schools impacts on efficiency in publicly funded schools [7, 8, 243],
although the effect quickly diminishes as distance from the school increases [201].
Competition has also been investigated as a driver of efficiency in higher education,
where it has been found to have a positive effect in the Canadian higher education
context, although not always significantly so [240, 241].
A final example of how efficiency analyses might inform policy arises in the
context of mergers. Theoretically, a merger might be expected to have benefits in
terms of increased efficiency accruing from returns to scale or returns to scope
where the merging providers have complementary offerings [155, 293]. A suite of
papers utilizing a sample of data relating to English higher education suggest that,
typically, efficiency improves following merger but that the benefits accrue in the
years immediately following the merger and do not continue indefinitely [178, 194,
257]. There is scope for more work into the evolution over time of the effects of
merger on subsequent efficiency.

Level of Analysis

In the preceding sections, we have made little reference to the level of the analyses
undertaken. In many cases, the estimations, be they cost functions or output distance
functions, are at provider level. There are some exceptions in the schooling context,
where the level might equally well be the funding region (such as school district in
the US context or local education authority – LEA – in the UK context). The review
of efficiency in education by De Witte and López-Torres [102] confirms the provider
(defined as organization, school, department, etc.) as the typical unit of analysis in
such studies: of 223 papers relating to efficiency in the education context over the
period 1977–2015, 147 are at the organization level (with 89 relating to HEIs and 58
to schools); 44 focus on the funding district, county, or city level, while 9 studies are
at the level of the country, and 23 at the level of the individual student. A number of
these studies are of note because they focus on a particular discipline or department
[30, 65, 87, 105, 132, 205, 207, 236, 284, 292] or a support service [64, 204, 245,
267, 290]. In this section we take a brief look at the studies undertaken at individual-,
funding area-, and national-level analyses to see what additional information they
provide, and what challenges arise, in the context of production economics.

Individual-Level Analyses

Individual-level studies are not uncommon in the schooling literature relating to
education production functions, which has long recognized that pupils are nested
within schools and hence the data are hierarchical in nature. As such, multi-level
modelling (MLM) has been developed to estimate such functions while allowing
for within-unit variations [137, 138, 316]. Recognition of the hierarchical structure
avoids issues such as aggregation bias and mis-estimated parameters, and the
MLM approach is sufficiently flexible that it can allow both intercept and slope
coefficients to vary. An additional advantage of such an approach is that it is
possible to disentangle the effects of both pupils and schools on their outcomes.
The disadvantage is that MLM is not a frontier estimation technique, and so there is
no allowance for inefficiency in the education production function.
An alternative approach which allows for inefficiency is to apply DEA to
individual-level data. Such an approach has been taken in a small number of studies
in the schooling context [261, 299, 300]. By using a meta-frontier type of approach,
it is possible to decompose overall efficiency for a pupil into that attributable
to the pupil him-/herself and that attributable to the school (assuming just pupil
and school levels – additional levels are possible). By careful aggregation of the
pupil efficiencies [302], schools derive more information as to the source of their
shortcomings (pupil or school) and can devise appropriate initiatives accordingly.
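The pupil and school decomposition can be illustrated with a deliberately simplified sketch. It assumes a single input and a single output under constant returns to scale, in which case DEA efficiency reduces to a unit's output-input ratio divided by the best ratio in the reference set, and it assumes a multiplicative decomposition (overall = pupil-within-school x school). The data, school labels, and function name are hypothetical; a real application would solve a linear program with multiple inputs and outputs.

```python
def ratio_efficiency(units, reference):
    """Single-input, single-output CRS efficiency.

    Each unit is an (input, output) pair; efficiency is the unit's
    output/input ratio relative to the best ratio in the reference set.
    """
    best_ratio = max(output / inp for inp, output in reference)
    return [(output / inp) / best_ratio for inp, output in units]

# Hypothetical pupils: (prior attainment, exam score), grouped by school.
schools = {
    "A": [(50, 60), (40, 44), (60, 66)],
    "B": [(50, 50), (45, 36), (55, 49.5)],
}
all_pupils = [pupil for pupils in schools.values() for pupil in pupils]

decomposition = {}
for name, pupils in schools.items():
    overall = ratio_efficiency(pupils, all_pupils)  # against the pooled (meta) frontier
    within = ratio_efficiency(pupils, pupils)       # against the pupil's own school frontier
    school_effect = [o / w for o, w in zip(overall, within)]
    decomposition[name] = list(zip(overall, within, school_effect))

for name, rows in decomposition.items():
    for overall, within, school_effect in rows:
        print(name, round(overall, 3), round(within, 3), round(school_effect, 3))
```

In this toy setting the school component is identical for all pupils in a school, because it is simply the gap between the school's own frontier and the pooled frontier.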
Applications of individual-level DEA in universities are also relatively rare. Find-
ings from such studies suggest that efficiencies derived from aggregate university
level analyses incorporate both individual and institution performance components;
an individual-level DEA, meanwhile, provides more detailed information about the
source of the inefficiency, i.e., student or university [174]. A comparison of MLM
and individual-level DEA applied to the same data set finds interesting differences
in the performance rankings of universities based on the two approaches, and these
are particularly relevant for the best- and worst-performing HEIs [173]. This is
in contrast to findings at school level; De Witte et al. [103] find more alignment
between their results from MLM and an individual-level non-parametric approach
using a sample of school pupils.

Funding Area Analyses

While not as prolific in number as organization studies, papers focusing on
efficiency within funding areas in education are nevertheless reasonably numerous.
They mostly relate to school-level education and are based on both parametric
and non-parametric approaches. One of the earliest such studies utilizes maximum
likelihood and corrected ordinary least squares to estimate efficiency among local
education authorities in providing schooling in England, using a cost function
approach [33]. The level of estimated efficiency depends on whether the approach
is deterministic (with efficiency levels around 83% to 89%) or stochastic (with
efficiency levels much higher at well over 90%).
Experimentation with efficiency measurement continues in the context of fund-
ing areas with a comparison of ratios (comparing a single output to a single input,
e.g., cost per student graduated) and efficiencies derived from a variety of DEA
models [111]. There are significant inconsistencies between the ratios and DEA
efficiency measures, which is not surprising as the ratios fail to take into account
the multi-input multi-output nature of production. Subsequent studies largely use
standard frontier techniques such as DEA and SFA (including conditional and
network DEA), applied in cost or production settings, and generally establish similar
levels of efficiency to the earliest studies [31, 129, 144, 147, 149, 199, 201].
A non-frontier strand of literature employs a (modified) quadriform approach
[160] to the identification of efficiency among school funding areas [162]. The
modified quadriform approach is a means whereby performance of units can be
displayed in a two-dimensional depiction. Specifically, costs are regressed on a
set of uncontrollable school characteristics, and school output (such as graduation
rate) is regressed on the same set of characteristics. The resulting residuals from
each regression equation are plotted for each school district, and performance is
examined in quadrants ranging from efficient (described as low input and high
output) through effective, ineffective, and finally inefficient (described as high input
and low output). While interesting and easy to interpret, such an approach does not
adequately account for the multi-dimensional nature of production, is non-frontier,
and relies on regression residuals which contain both unexplained variation and
random error. An adaptation to provide a buffer around residuals which are low
in magnitude (and therefore such districts can be assumed to be performing as
expected) addresses the latter point to some extent, but other drawbacks remain.
A comparison of the quadriform approach with frontier techniques can be found in
Rolle [274].
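The mechanics of the quadriform classification described above can be sketched as follows. This is a minimal illustration, assuming a single uncontrollable characteristic and no buffer around small residuals. The text only defines the efficient and inefficient quadrants explicitly, so the assignment of "effective" to high input with high output and "ineffective" to low input with low output is an assumption here, as are the function names and data.

```python
def ols_residuals(x, y):
    """Residuals from an OLS fit of y on a single regressor x with intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]

def quadriform(characteristic, cost, output):
    """Label each district by the signs of its cost and output residuals."""
    labels = []
    for cost_res, out_res in zip(ols_residuals(characteristic, cost),
                                 ols_residuals(characteristic, output)):
        if cost_res < 0 and out_res >= 0:
            labels.append("efficient")      # lower cost, higher output than expected
        elif cost_res >= 0 and out_res >= 0:
            labels.append("effective")      # assumed: higher cost, higher output
        elif cost_res < 0 and out_res < 0:
            labels.append("ineffective")    # assumed: lower cost, lower output
        else:
            labels.append("inefficient")    # higher cost, lower output than expected
    return labels

# Hypothetical districts: one uncontrollable characteristic, cost, graduation rate.
labels = quadriform([1, 2, 2, 1], [10, 10, 14, 14], [80, 60, 80, 60])
print(labels)
```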
Higher education studies rarely feature in the funding area context, mainly
because higher education is often a national (not regional) responsibility – hence
national-level analyses are more appropriate, and these are discussed in the next
section. An exception is a study of Chinese higher education at the level of Chinese
provinces which takes a production function approach [321] and where efficiency
levels are found to be relatively low (with mean technical efficiency of under 40%).
Such funding area studies can provide useful insights into efficiency or (in the
rare cases where it is calculated) productivity [255] for the funding providers. The
relationship between the funding area and organizations within it is rarely utilized;
a network approach by Grosskopf et al. [149] is an example where the relationship is
incorporated into the model. A meta-frontier analysis of schools within funding areas
might also provide a useful extension to this particular body of literature.

National-Level Analyses

The benchmarking advantages of such tools as DEA are well known. As austerity
measures have been introduced in various education and higher education sectors
around the world in the last decade, there has been an increasing recognition that
international comparisons are necessary to provide benchmarks of good practice
which may be outside of national boundaries. Combined with this, the last decade
has seen a constant improvement in the availability of data at all levels across
countries meaning it is now increasingly possible to make such international
comparisons and to identify exemplars of good practice across countries for national
governments to emulate.
Studies which make international comparisons – whether at school or higher
education level – fall into two categories. There are those which use provider-level
data across two or more countries and then frequently take a meta-frontier approach
to make cross-country comparisons [11, 12, 16, 314, 315]; and there are those which
use national-level data (i.e., the nation is the DMU) to derive their results [7, 9, 25,
26, 29, 47, 133]. Interesting differences between countries can be found. Agasisti
and Pérez-Esparrells [16], for example, compare universities in Italy and Spain and
find, using the Malmquist productivity index approach, that productivity has been
rising in both countries over the study period. In comparing the countries however, it
appears that technological change underpins productivity increases in Italy, whereas
it is efficiency gains which underpin the observation for Spain.
There is no doubt that such studies will proliferate as more data becomes
available, and that is beneficial so long as results are treated with caution. There
are various problems with cross-country comparisons and in particular the latter
approach. It is extremely difficult to get comparable data on costs or inputs and
outputs at the national level. The assumption that production technology and
environment are the same across diverse sets of countries is open to serious doubt.
Thus if a national-level study is to be undertaken, it is advisable either to use
individual providers to seek useful insights into education provision across countries
using a meta-frontier type of approach, or, if national-level data are to be used, then
a parametric estimation approach which allows for unobserved heterogeneity should
ideally be adopted. There is scope for much more work in this context.

Conclusions

This chapter has examined empirical findings relating to production economic
concepts in the context of education and higher education. Education is an important
sector of any economy as the benefits (in terms of increased productivity) accrue to
both the individuals who consume the education and also society as a whole. This
is particularly the case for primary and secondary education, which are typically
compulsory in many countries, and to a limited extent for tertiary education as
well. As such, education and higher education are in receipt of publicly allocated
funds, potentially making the incentives for efficient operation less compelling
than in a private sector. The public funding of all levels of education, combined
with the incentives and pressures which that imposes on the providers operating
in the sector, makes education and higher education interesting sectors in which to
examine concepts from production economics.
This chapter examines findings relating to costs, production, and efficiency
in education and higher education and contributes to the production economic
literature by bringing together the findings of these diverse literatures, at all levels
of education, into one repository. The review has uncovered a number of key areas
for future research.
The mixed findings emerging from all topics in both education and higher
education contexts make it difficult for managers and policy-makers to draw a
consistent message on, for example, the existence (or otherwise) of economies of
scale or scope, the degree of substitutability between different inputs, the extent of
inefficiency in the sectors, and the identification of factors affecting efficiency. This
points to a pressing need for more detailed analyses of the literature to provide a
framework explaining why results vary and hence to permit users of the work to make
informed decisions. A key contribution in this area is by Zhang and Worthington
[324], who undertake a meta-regression analysis of the empirical cost function
literature in higher education. They are able to identify reasons why the findings
on economies of scale and scope vary across studies. More studies of this type
in the education context, or relating to output distance functions and efficiency (at
both school and higher education levels), are also needed.
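As a rough illustration of what a meta-regression involves, the sketch below regresses study-level estimates on a single study characteristic, weighting each study by the inverse of its sampling variance. It is a minimal sketch under stated assumptions and does not reproduce the specification used by Zhang and Worthington [324]: the function name, the single moderator, and every number in the example are hypothetical and are not taken from any study.

```python
def weighted_meta_regression(effects, std_errors, moderator):
    """Inverse-variance weighted least squares of effect_i on a + b * moderator_i.

    effects    : study-level estimates (e.g. a reported scale-economy measure)
    std_errors : standard errors of those estimates
    moderator  : one study characteristic (e.g. a 0/1 dummy for panel data)
    Returns (intercept, slope).
    """
    if not (len(effects) == len(std_errors) == len(moderator)):
        raise ValueError("all inputs must have the same length")

    # More precise studies (smaller standard error) receive more weight.
    weights = [1.0 / se ** 2 for se in std_errors]
    w_sum = sum(weights)

    # Weighted means of the moderator and of the effects.
    x_bar = sum(w * x for w, x in zip(weights, moderator)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, effects)) / w_sum

    # Weighted sums of squares and cross-products.
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, moderator))
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, moderator, effects))
    if sxx == 0:
        raise ValueError("moderator has no variation across studies")

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    effects = [1.05, 1.20, 0.98, 1.15, 1.10]
    std_errors = [0.05, 0.10, 0.04, 0.08, 0.06]
    uses_panel_data = [0, 1, 0, 1, 1]  # hypothetical study characteristic
    a, b = weighted_meta_regression(effects, std_errors, uses_panel_data)
    print(f"intercept={a:.3f}, slope on moderator={b:.3f}")
```

In this sketch the slope indicates how the reported estimate differs, on average, between studies with and without the characteristic. A fuller analysis would include several moderators and report standard errors for the coefficients.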
In terms of factors affecting efficiency, the conditional DEA approach offers
a rigorous methodology for identifying the variables which matter most for
efficiency. This knowledge is essential in determining strategies for
improving efficiency, and hence getting more value for public funds, and in particular
in revealing whether institution-level, regional-level, or national-level policies will
be most effective.
While economies of scale and scope (and returns to scale and scope in the
production context) are relatively well researched, there is much less empirical
research into elasticities of substitution between inputs (or between outputs). In
times of public funding constraints, such information could be particularly useful
to managers and policy-makers. Similarly, more work on the potential benefits of
performance-based funding would be welcome.
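To make the scale and scope concepts concrete, the sketch below computes the standard measures of ray economies of scale and global economies of scope associated with Baumol, Panzar and Willig [37] for a two-output cost function. Ray economies of scale are measured as C(y) divided by the sum of each output multiplied by its marginal cost, with values above 1 indicating economies of scale. Global economies of scope are measured as the proportionate cost saving from joint rather than separate production, with values above 0 indicating economies of scope. The quadratic functional form, the class and function names, and all parameter values are assumptions made purely for illustration; they are not estimates from any study reviewed in this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuadraticCost:
    """Hypothetical two-output quadratic cost function (illustration only)."""
    fixed: float  # fixed cost, incurred whenever any output is produced
    a1: float     # linear cost coefficient on output 1 (e.g. teaching)
    a2: float     # linear cost coefficient on output 2 (e.g. research)
    b11: float    # quadratic coefficient on output 1
    b22: float    # quadratic coefficient on output 2
    b12: float    # interaction term; negative values mean cost complementarity

    def cost(self, y1, y2):
        """Total cost of producing the output bundle (y1, y2)."""
        if y1 == 0 and y2 == 0:
            return 0.0
        return (self.fixed + self.a1 * y1 + self.a2 * y2
                + 0.5 * self.b11 * y1 ** 2 + 0.5 * self.b22 * y2 ** 2
                + self.b12 * y1 * y2)

    def marginal_costs(self, y1, y2):
        """Partial derivatives of cost with respect to each output."""
        mc1 = self.a1 + self.b11 * y1 + self.b12 * y2
        mc2 = self.a2 + self.b22 * y2 + self.b12 * y1
        return mc1, mc2


def ray_economies_of_scale(c, y1, y2):
    """C(y) / sum_i y_i * MC_i(y); a value above 1 indicates economies of scale."""
    mc1, mc2 = c.marginal_costs(y1, y2)
    return c.cost(y1, y2) / (y1 * mc1 + y2 * mc2)


def global_economies_of_scope(c, y1, y2):
    """[C(y1, 0) + C(0, y2) - C(y1, y2)] / C(y1, y2); above 0 indicates scope economies."""
    joint = c.cost(y1, y2)
    return (c.cost(y1, 0) + c.cost(0, y2) - joint) / joint


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to make the arithmetic easy to follow.
    c = QuadraticCost(fixed=100.0, a1=2.0, a2=3.0, b11=0.01, b22=0.02, b12=-0.005)
    print(round(ray_economies_of_scale(c, 100, 50), 3))
    print(round(global_economies_of_scope(c, 100, 50), 3))
```

With these hypothetical parameters, the joint cost of the bundle (100, 50) is 500, the output-weighted marginal costs sum to 450, and the stand-alone costs sum to 625, so the two measures are roughly 1.11 and 0.25. Because results of this kind depend on the functional form and on the output levels at which they are evaluated, comparisons across studies need care.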
Finally, there is considerably more scope for education studies which make
comparisons across countries. These might use national-level data, in which case
appropriate methods which take into account unobserved heterogeneity should
be applied. The increasing availability of large individual-level data
sets also offers opportunities for cross-country findings. However, more work is
required on the application of frontier methods in the individual-level context and
on using these results to derive insights into concepts, such as efficiency, relating to
providers and even nations.
Empirical applications of production economics to education and higher
education have a long and fruitful history and are set to continue to provide useful
information to managers and policy-makers alike.

Cross-References

Elasticities of Substitution
Stochastic Frontier Analysis: Foundations and Advances I
Stochastic Frontier Analysis: Foundations and Advances II

References
1. Abbott M, Doucouliagos C (2000) Technical and scale efficiency of vocational education and
training institutions: the case of the New Zealand polytechnics. N Z Econ Pap 34(1):1–23
2. Abbott M, Doucouliagos C (2002) A data envelopment analysis of the efficiency of Victorian
TAFE institutes. Aust Econ Rev 35(1):55–69
3. Abbott M, Doucouliagos H (2003) The efficiency of Australian universities: a data envelopment
analysis. Econ Educ Rev 22:89–97
4. Adkins LC, Moomaw RL (2003) The impact of local funding on the technical efficiency of
Oklahoma Schools. Econ Lett 81(1):31–37
5. Afonso A, St. Aubyn M (2006) Cross-country efficiency of secondary education provision: a
semi-parametric analysis with non-discretionary inputs. Econ Model 23(3):476–491
6. Agasisti T (2011a) Does competition affect schools’ performance? Evidence from Italy
through OECD-PISA data. Eur J Educ 46(4):549–565
7. Agasisti T (2011b) Performances and spending efficiency in higher education: a European
comparison through non-parametric approaches. Educ Econ 19(2):199–224
8. Agasisti T (2013) Competition among Italian junior-secondary schools: a variance-
decomposition empirical analysis. Ann Public Coop Econ 84(1):17–42
9. Agasisti T (2014) The efficiency of public spending on education: an empirical comparison
of EU countries. Eur J Educ 49(4):543–557
10. Agasisti T (2016) Cost structure, productivity and efficiency of the Italian public higher
education industry 2001–2011. Int Rev Appl Econ 30(1):48–68
11. Agasisti T, Gralka S (2019) The transient and persistent efficiency of Italian and German
universities: a stochastic frontier analysis. Appl Econ 51:1–19
12. Agasisti T, Johnes G (2009a) Beyond frontiers: comparing the efficiency of higher education
decision-making units across more than one country. Educ Econ 17(1):59–79
13. Agasisti T, Johnes G (2009b) Cost structure, efficiency and heterogeneity in US higher
education: an empirical analysis, LUMS working paper 2009/013. Lancaster University
Management School, Lancaster
14. Agasisti T, Johnes G (2010) Heterogeneity and the evaluation of efficiency: the case of Italian
universities. Appl Econ 42(11):1365–1376
15. Agasisti T, Johnes G (2015) Efficiency, costs, rankings and heterogeneity: the case of US
higher education. Stud High Educ 40(1):60–82
16. Agasisti T, Pérez-Esparrells C (2010) Comparing efficiency in a cross-country perspective:
the case of Italian and Spanish state universities. High Educ 59(1):85–103
17. Agasisti T, Barra C, Zotti R (2016) Evaluating the efficiency of Italian public universities
(2008–2011) in presence of (unobserved) heterogeneity. Socio Econ Plan Sci 55:47–58
18. Ahn T, Seiford LM (1993) Sensitivity of DEA to models and variable sets in a hypothesis
test setting: the efficiency of university operations. In: Ijiri Y (ed) Creative and innovative
approaches to the science of management. Quorum Books, Westport, pp 191–208
19. Ahn T, Arnold V, Charnes A, Cooper WW (1989) DEA and ratio efficiency analyses for
public institutions of higher learning in Texas. Res Gov Nonprofit Account 5:165–185
20. Aigner D, Lovell CAK, Schmidt P (1977) Formulation and estimation of stochastic frontier
production models. J Econ 6:21–37
21. Al-Enezi M, Burney NA, Johnes J, Al-Musallam M (2010) An assessment of efficiency of
public schools in Kuwait using data envelopment approach (DEA) and Tobit regression. J
Dev Econ Policies 12(2):5–33
22. Andrews M, Duncombe W, Yinger J (2002) Revisiting economies of size in American
education: are we any closer to a consensus. Econ Educ Rev 21:245–262
23. Aparicio J, Crespo-Cebada E, Pedraja-Chaparro F, Santín D (2017) Comparing school
ownership performance using a pseudo-panel database: a Malmquist-type index approach.
Eur J Oper Res 256(2):533–542
24. Arcelus FJ, Coleman DF (1997) An efficiency review of university departments. Int J Syst
Sci 28(7):721–729
25. Aristovnik A, Obadić A (2011) The funding and efficiency of higher education in Croatia and
Slovenia: a nonparametric comparison. Munich Personal RePEc Archive 31466.
http://mpra.ub.uni-muenchen.de/31466/1/Pula-Obadic-Aristovnik-HE-Nonparametric_final.pdf
26. Aristovnik A, Obadić A (2014) Measuring relative efficiency of secondary education in
selected EU and OECD countries: the case of Slovenia and Croatia. Technol Econ Dev Econ
20(3):419–433
27. Athanassopulos A, Shale EA (1997) Assessing the comparative efficiency of higher education
institutions in the UK by means of data envelopment analysis. Educ Econ 5(2):117–135
28. Avkiran NK (2001) Investigating technical and scale efficiencies of Australian universities
through data envelopment analysis. Socio Econ Plan Sci 35:57–80
29. Azar Dufrechou P (2016) The efficiency of public education spending in Latin America: a
comparison to high-income countries. Int J Educ Dev 49:188–203
30. Aziz NAA, Janor RM, Mahadi R (2013) Comparative departmental efficiency analysis within
a university: a DEA approach. Procedia Soc Behav Sci 90:540–548
31. Banker RD, Janakiraman S, Natarajan R (2004) Analysis of trends in technical and allocative
efficiency: an application to Texas public school districts. Eur J Oper Res 154(2):477–491
32. Barra C, Zotti R (2013) Measuring teaching and research efficiency in higher education using
data envelopment analysis. A case study from the University of Salerno. University of Salerno
CELPE, Interdepartmental Centre for Research in Labour Economics and Economic Policy,
Working papers number 3/2013.
https://www.academia.edu/4905831/Measuring_teaching_and_research_efficiency_in_higher_education_using_data_envelopment_analysis._A_case_study_from_the_University_of_Salerno
33. Barrow MM (1991) Measuring local education authority performance: a frontier approach.
Econ Educ Rev 10(1):19–27
34. Bates JM (1993) The efficiency of local education authorities. Oxf Rev Educ 19(3):277–289
35. Bates JM (1997) Measuring predetermined socioeconomic ‘input’ when assessing the
efficiency of educational outputs. Appl Econ 29(1):85–93
36. Battese GE, Coelli TJ (1995) A model for technical inefficiency effects in a stochastic frontier
production function for panel data. Empir Econ 20(2):325–332
37. Baumol WJ, Panzar JC, Willig RD (1982) Contestable markets and the theory of industry
structure. Harcourt Brace Jovanovich, London
38. Bayraktar E, Tatoglu E, Zaim S (2013) Measuring the relative efficiency of quality management
practices in Turkish public and private universities. J Oper Res Soc 64(12):1810–1830
39. Beasley JE (1990) Comparing university departments. Omega 18(2):171–183
40. Beasley JE (1995) Determining teaching and research efficiencies. J Oper Res Soc 46(4):441–452
41. Bee M, Dolton PJ (1985) Costs and economies of scale in UK private schools. Appl Econ
17:281–290
42. Berkner LV (1966) Management in the university. Manag Sci 12(10):183–191
43. Bessent AM, Bessent EW (1980) Determining the comparative efficiency of schools through
data envelopment analysis. Educ Adm Q 16(2):57–75
44. Bessent AM, Bessent EW, Kennington J, Reagan B (1982) An application of mathematical
programming to assess productivity in the Houston independent school district. Manag Sci
28(12):1355–1367
45. Blackburn V, Brennan S, Ruggiero J (2014) Measuring efficiency in Australian Schools: a
preliminary analysis. Socio Econ Plan Sci 48(1):4–9
46. Blackorby C, Russell R (1989) Will the real elasticity of substitution please stand up? (A
comparison of Allen/Uzawa and Morishima elasticities). Am Econ Rev 79:882–888
47. Bogetoft P, Heinesen E, Tranæs T (2015) The efficiency of educational production: a
comparison of the Nordic countries with other OECD countries. Econ Model 50:310–321
48. Bonesrønning H, Rattsø J (1994) Efficiency variation among the Norwegian high schools:
consequences of equalization policy. Econ Educ Rev 13(4):289–304
49. Borge L-E, Naper LR (2005) Efficiency potential and efficiency variation in Norwegian lower
secondary schools. In: CESifo working papers, Working paper no. 1624.: 26pp. CESifo,
Munich
50. Bowen HR (1981) The costs of higher education: How much do colleges and universities
spend per student and how much should they spend? Jossey-Bass, San Francisco
51. Bowles TJ, Bosworth R (2002) Scale economies in public education: evidence from school
level data. J Educ Financ 28(Fall):285–300
52. Bradley S, Taylor J (2002) The effect of the quasi–market on the efficiency–equity trade–off
in the secondary school sector. Bull Econ Res 54(3):295–314
53. Bradley S, Johnes G, Millington J (2001) The effect of competition on the efficiency of
secondary schools in England. Eur J Oper Res 135(3):545–568
54. Bradley S, Johnes J, Little A (2010) The measurement and determinants of efficiency and
productivity in the further education sector in England. Bull Econ Res 62(1):1–30
55. Brennan S, Haelermans C, Ruggiero J (2013) Nonparametric estimation of education
productivity incorporating nondiscretionary inputs with an application to Dutch schools. Eur
J Oper Res 234(3):809
56. Breu TM, Raab RL (1994) Efficiency and perceived quality of the nation’s ‘Top 25’ national
universities and national liberal arts colleges: an application of data envelopment analysis to
higher education. Socio Econ Plan Sci 28(1):33–45
57. Brinkman PT, Leslie LL (1986) Economies of scale in higher education: sixty years of
research. Rev High Educ 10(1):1–28
58. Burney NA, Johnes J, Al-Enezi M, Al-Ramadhan M, Al-Musallam M, Hamada S, Al-
Musallam N, Al-Essa G, Al-Khayat A, Al-Husaini F (2009) An assessment of cost structure
and efficiency of public schools in Kuwait, Report no. KISR9673. Kuwait Institute for
Scientific Research, Kuwait
59. Burney NA, Johnes J, Al-Enezi M, Al-Musallam M (2013) The efficiency of public schools:
the case of Kuwait. Educ Econ 21(4):360–379
60. Butler RJ, Monk DH (1985) The cost of public schooling in New York State: the role of scale
and efficiency in 1978–79. J Hum Resour 20(3):361–381
61. Cai Y, Yang X (2016) Mergers in Chinese higher education: lessons for studies in a global
context. Eur J High Educ 6(1):71
62. Callan SJ, Santerre RS (1990) The production characteristics of local public education: a
multiple product and input analysis. South Econ J 57(2):468–480
63. Camanho AS, Dyson RG (2006) Data envelopment analysis and Malmquist indices for
measuring group performance. J Prod Anal 26:35–49
64. Casu B, Thanassoulis E (2006) Evaluating cost efficiency in central administrative services in
UK universities. Omega 34(5):417–426
65. Casu B, Shaw D, Thanassoulis E (2005) Using a group support system to aid input-output
identification in DEA. J Oper Res Soc 56(12):1363–1372
66. Caves DW, Christensen LR, Diewert WE (1982) The economic theory of index numbers and
the measurement of input, output, and productivity. Econometrica 50(6):1393–1414
67. Cazals C, Florens J-P, Simar L (2002) Nonparametric frontier estimation: a robust approach.
J Econ 106(1):1–25
68. Chakraborty K (2003) Measuring the productive efficiency and cost of public education.
Southwest Econ Rev 30(1):23–33
69. Chakraborty K, Blackburn V (2013) Efficiency and equity in funding for government schools
in Australia. Aust Econ Pap 52(3/4):127–142
70. Chakraborty K, Biswas B, Lewis WC (2000) Economies of scale in public education: an
econometric analysis. Contemp Econ Policy 18(2):238–247
71. Chakraborty K, Biswas B, Lewis WC (2001) Measurement of technical efficiency in public
education: a stochastic and nonstochastic production function approach. South Econ J
67(4):889–905
72. Chalos P (1997) An examination of budgetary inefficiency in education using data envelopment
analysis. Financ Account Manage 13(1):55–69
73. Chalos P, Cherian J (1995) An application of data envelopment analysis to public
sector performance measurement and accountability. J Account Public Policy 14:143–160
74. Charnes A, Cooper WW, Rhodes E (1978) Measuring the efficiency of decision making units.
Eur J Oper Res 2(4):429–444
75. Charnes A, Cooper WW, Rhodes E (1979) Measuring the efficiency of decision making units:
a short communication. Eur J Oper Res 3(4):339
76. Cherchye L, De Witte K, Perelman S (2019) A unified productivity-performance approach
applied to secondary schools. J Oper Res Soc 70(9):1522–1537
77. Chizmar JF, McCarney BJ (1984) An evaluation of ‘trade-offs’ implementation using
canonical estimation of joint educational production functions. J Econ Educ 15(1):11–20
78. Chizmar JF, Zak TA (1983) Modeling multiple outputs in educational production functions.
Am Econ Rev 73(2):18–22
79. Chizmar JF, Zak TA (1984) Canonical estimation of joint educational production functions.
Econ Educ Rev 3(1):37–43
80. Clermont M, Dirksen A, Dyckhoff H (2015) Returns to scale of business administration
research in Germany. Scientometrics 103(2):583–614
81. Coelli TJ (2000) On the econometric estimation of the distance function representation of a
production technology, Working paper 2000/042. Université catholique de Louvain, Center
for Operations Research and Econometrics (CORE), Louvain
82. Coelli TJ, Perelman S (1999) A comparison of parametric and non-parametric distance
functions: with application to European railways. Eur J Oper Res 117:326–339
83. Coelli TJ, Perelman S (2000) Technical efficiency of European railways: a distance function
approach. Appl Econ 32:1967–1976
84. Coelli TJ, Rao DSP, O’Donnell CJ, Battese GE (2005) An introduction to efficiency and
productivity analysis. Springer, New York
85. Cohn E, Rossmiller RA (1987) Research on effective schools: implications for less developed
countries. Comp Educ Rev 31(3):377–399
86. Cohn E, Rhine SLW, Santos MC (1989) Institutions of higher education as multi-product
firms: economies of scale and scope. Rev Econ Stat 71(2):284–290
87. Colbert A, Levary RR, Shaner MC (2000) Determining the relative efficiency of MBA
programs using DEA. Eur J Oper Res 125(3):656–669
88. Colegrave AD, Giles MJ (2008) School cost functions: a meta regression analysis. Econ Educ
Rev 27(6):688–696
89. Colombi R (2013) Closed skew normal stochastic frontier models for panel data. In: Studies in
theoretical and applied statistics, selected papers of the Statistical Societies. Springer, Berlin,
Heidelberg, pp 177–186
90. Colombi R, Kumbhakar SC, Martini G, Vittadini G (2014) Closed-skew normality in stochastic
frontiers with individual effects and long/short-run efficiency. J Prod Anal 42(2):123–136
91. Conroy SJ, Arguea NM (2008) An estimation of technical efficiency for Florida public
elementary schools. Econ Educ Rev 27(6):655–663
92. Cooper ST, Cohn E (1997) Estimation of a frontier production function for the South Carolina
educational process. Econ Educ Rev 16(3):313–327
93. Cordero JM, Santín D, Simancas R (2017) Assessing European primary school performance
through a conditional nonparametric model. J Oper Res Soc 68(4):364–376
94. Cordero JM, Polo C, Santín D, Simancas R (2018) Efficiency measurement and cross-country
differences among schools: a robust conditional nonparametric analysis. Econ Model 74:45–60
95. Costrell R, Hanushek E, Loeb S (2008) What do cost functions tell us about the cost of an
adequate education? Peabody J Educ 83(2):198–223
96. Daraio C, Simar L (2005) Introducing environmental variables in nonparametric frontier
models: a probabilistic approach. J Prod Anal 24(1):93–121
97. Daraio C, Simar L (2007) Conditional nonparametric frontier models for convex and
nonconvex technologies: a unifying approach. J Prod Anal 28(1):13–32
98. Daraio C, Simar L, Wilson PW (2018) Central limit theorems for conditional efficiency
measures and tests of the ‘separability’ condition in non-parametric, two-stage models of
production. Econ J 21(2):170–191
99. de Groot H, McMahon WW, Volkwein JF (1991) The cost structure of American research
universities. Rev Econ Stat 73(3):424–431
100. de la Torre EM, Gómez-Sancho J-M, Perez-Esparrells C (2017) Comparing university
performance by legal status: a Malmquist-type index approach for the case of the Spanish
higher education system. Tert Educ Manag 23(3):206
101. De Witte K, Hudrlikova L (2013) What about excellence in teaching? A benevolent ranking
of universities. Scientometrics 96(1):337–364
102. De Witte K, López-Torres L (2017) Efficiency in education: a review of literature and a way
forward. J Oper Res Soc 68(4):339–363
103. De Witte K, Thanassoulis E, Simpson G, Battisti G, Charlesworth-May A (2010) Assessing
pupil and school performance by non-parametric and parametric techniques. J Oper Res Soc
61(8):1224–1237
104. De Witte K, Rogge N, Cherchye L, Van Puyenbroeck T (2013) Economies of scope in
research and teaching: a non-parametric investigation. Omega 41(2):305–314
105. Dehnokhalaji A, Korhonen PJ, Köksalan M, Nasrabadi N, Wallenius J (2010) Efficiency
analysis to incorporate interval-scale data. Eur J Oper Res 207(2):1116
106. Deller SC, Rudnicki E (1993) Production efficiency in elementary education: the case of
Maine public schools. Econ Educ Rev 12(1):45–57
107. Dougherty CRS (1990) Unit costs and economies of scale in vocational and technical
education: evidence from the People’s Republic of China. Econ Educ Rev 9(4):389–394
108. Duh R-R, Chen K-T, Lin R-C, Kuo L-C (2014) Do internal controls improve operating
efficiency of universities? Ann Oper Res 221(1):173–195
109. Duncombe W, Miner J, Ruggiero J (1995) Potential cost savings from school district
consolidation: a case study of New York. Econ Educ Rev 14(3):265–284
110. Dundar H, Lewis DR (1995) Departmental productivity in American universities: economies
of scale and scope. Econ Educ Rev 14(2):119–144
111. Engert F (1996) The reporting of school district efficiency: the adequacy of ratio measures.
Public Budg Financ Manag 8(2):247–271
112. Engle PL, Black MM, Behrman JR, de Mello MC, Gertler PJ, Kapiriri L, Martorell R, Young
ME, International Child Development Steering Group (2007) Strategies to avoid the loss of
developmental potential in more than 200 million children in the developing world. Lancet
369(9557):229–242
113. Essid H, Ouellette P, Vigeant S (2014) Productivity, efficiency, and technical change of
Tunisian schools: a bootstrapped Malmquist approach with quasi-fixed inputs. Omega
42(1):88–97
114. Eurydice (2018) Adult education and training. From
https://eacea.ec.europa.eu/national-policies/eurydice/content/adult-education-and-training-3_en. Accessed 13 Jan 2020
115. Fandel G (2007) On the performance of universities in North Rhine-Westphalia, Germany:
government’s redistribution of funds judged using DEA efficiency measures. Eur J Oper Res
176(1):521–533
116. Fandel G, Gal T (2001) Redistribution of funds for teaching and research among universities:
the case of North Rhine-Westphalia. Eur J Oper Res 130(1):111–120
117. Färe R (1991) Measuring Farrell efficiency for a firm with intermediate inputs. Acad Econ
Pap 19(2):329–340
118. Färe R, Grosskopf S, Weber WL (1989) Measuring school district performance. Public Financ
Q 17(4):409–428
119. Färe R, Grosskopf S, Norris M, Zhang Z (1994) Productivity growth, technical progress, and
efficiency change in industrialized countries. Am Econ Rev 84(1):66–83
120. Farrell M (1957) The measurement of productive efficiency. J R Stat Soc Ser A 120(3):253–281
121. Ferguson RF (1991) Paying for public education: new evidence on how and why money
matters. Harv J Legis 28(2):465
122. Ferris JS, West EG (2004) Economies of scale, school violence and the optimal size of
schools. Appl Econ 36(15):1677–1684
123. Fetler M (1989) School dropout rates, academic performance, size, and poverty: correlates of
educational reform. Educ Eval Policy Anal 11(4):109
124. Fieger P, Villano R, Cooksey R (2016) Efficiency of Australian technical and further education
providers. Int J Train Res 14(1):62–75
125. Filippini M, Greene W (2016) Persistent and transient productive inefficiency: a maximum
simulated likelihood approach. J Prod Anal 45(2):187–196
126. Flegg T, Allen D, Field K, Thurlow TW (2004) Measuring the efficiency of British
universities: a multi-period data envelopment analysis. Educ Econ 12(3):231–249
127. Førsund FR (2017) Measuring effectiveness of production in the public sector. Omega 73:93–103
128. Fowler WJ, Walberg HJ (1991) School size, characteristics, and outcomes. Educ Eval Policy
Anal 13(2):189–202
129. Fukuyama H, Weber WL (2002) Evaluating public school district performance via DEA gain
functions. J Oper Res Soc 53(9):992–1003
130. García-Aracil A (2013) Understanding productivity changes in public universities: evidence
from Spain. Res Eval 22(5):351–368
131. Getz M, Siegfried JJ, Zhang H (1991) Estimating economies of scale in higher education.
Econ Lett 37(2):203–208
132. Giménez VM, Martínez JL (2006) Cost efficiency in the university: a departmental evaluation
model. Econ Educ Rev 25:543–553
133. Giménez VM, Prior D, Thieme C (2007) Technical efficiency, managerial efficiency and
objective-setting in the educational system: an international comparison. J Oper Res Soc
58(8):996–1007
134. Glass JC, McKillop DG, Hyndman NS (1995a) The achievement of scale efficiency in UK
universities: a multiple-input multiple-output analysis. Educ Econ 3(3):249–263
135. Glass JC, McKillop DG, Hyndman NS (1995b) Efficiency in the provision of university
teaching and research: an empirical analysis of UK universities. J Appl Econ 10(1):61–72
136. Glass JC, McCallion G, McKillop DG, Rasaratnam S, Stringer KS (2006) Implications of
variant efficiency measures for policy evaluation in UK higher education. Socio Econ Plan
Sci 40:119–142
137. Goldstein H (1987) Multilevel models in educational and social research. Charles Griffin,
London
138. Goldstein H (1997) Methods in school effectiveness research. Sch Eff Sch Improv 8(4):369–395
139. Gralka S (2018) Persistent inefficiency in the higher education sector: evidence from
Germany. Educ Econ 26(4):373–392
140. Greene W (2005) Reconsidering heterogeneity in panel data estimators of the stochastic
frontier model. J Econ 126:269–303
141. Gronberg TJ, Jansen DW, Taylor LL (2011) The adequacy of educational cost functions:
evidence from Texas. Peabody J Educ 86(1):3–27
142. Gronberg TJ, Jansen DW, Taylor LL (2012) The relative efficiency of charter schools: a cost
frontier approach. Econ Educ Rev 31(2):302–317
143. Gronberg TJ, Jansen DW, Karakaplan MU, Taylor LL (2015) School district consolidation:
market concentration and the scale-efficiency tradeoff. South Econ J 82(2):580–597
144. Grosskopf S, Moutray C (2001) Evaluating performance in Chicago public high schools in
the wake of decentralization. Econ Educ Rev 20:1–14
145. Grosskopf S, Hayes KJ, Hirschberg JG (1995) Fiscal stress and the production of public
safety: a distance function approach. J Public Econ 57:277–296
146. Grosskopf S, Hayes KJ, Taylor LL, Weber WL (1997) Budget-constrained frontier measures
of fiscal quality and efficiency in schooling. Rev Econ Stat 79:116–124
147. Grosskopf S, Hayes KJ, Taylor LL, Weber WL (1999) Anticipating the consequences of
school reform: a new use of DEA. Manag Sci 45(4):608–620
148. Grosskopf S, Hayes KJ, Taylor LL, Weber WL (2001) On the determinants of school district
efficiency: competition and monitoring. J Urban Econ 49:453–478
149. Grosskopf S, Hayes K, Taylor LL, Weber W (2015) Centralized or decentralized control of
school resources? A network model. J Prod Anal 43(2):139–150
150. Gstach D, Somers A, Warning S (2003) Output specific efficiencies: the case of UK
private secondary schools, Department of Economics working paper series 84.: 16pp. Vienna
University of Economics and Business, Vienna
151. Gyimah-Brempong K, Gyapong AO (1991) Characteristics of education production functions:
an application of canonical regression analysis. Econ Educ Rev 10(1):7–17
152. Gyimah-Brempong K, Gyapong AO (1992) Elasticities of factor substitution in the production
of education. Econ Educ Rev 11(3):205–217
153. Haelermans C, De Witte K (2012) The role of innovations in secondary school performance –
evidence from a conditional efficiency model. Eur J Oper Res 223(2):541–549
154. Haelermans C, Ruggiero J (2013) Estimating technical and allocative efficiency in the public
sector: a nonparametric analysis of Dutch schools. Eur J Oper Res 227(1):174–181
155. Harman G (2000) Institutional mergers in Australian higher education since 1960. High Educ
Q 54(4):343–366
156. Harrison J, Rouse P (2002) Measuring the performance of Auckland secondary schools: a pilot
study using data envelopment analysis. University of Auckland, Department of Accounting and
Finance, New Zealand
157. Harrison J, Rouse P (2014) Competition and public high school performance. Socio Econ
Plan Sci 48(1):10
158. Hashimoto K, Cohn E (1997) Economies of scale and scope in Japanese private universities.
Educ Econ 5(2):107–115
159. Hemelt SW, Stange KM, Furquim F, Simon A, Sawyer JE (2018) Why is math cheaper
than English? Understanding cost differences in higher education, NBER working paper no.
25314. National Bureau of Economic Research, Cambridge
160. Hickrod GA, Liu C-C, Arnold R, Chaudhari R, Frank L, Franklin D, Polite MM, Pruyne
GB, Ward JG (1989) The biggest bang for the buck: An initial report on technical efficiency
in Illinois K-12 schools. Center for the Study of Educational Finance, Illinois State University,
Normal
161. Hoff A (2007) Second stage DEA: comparison of approaches for modelling the DEA score.
Eur J Oper Res 181(1):425–435
162. Houck EA, Rolle RA, He J (2010) Examining school district efficiency in Georgia. J Educ
Financ 35(4):331–357
163. Huang CJ, Liu J-T (1994) Estimation of a non-neutral stochastic frontier production function.
J Prod Anal 5(2):171–180
164. Izadi H, Johnes G, Oskrochi R, Crouchley R (2002) Stochastic frontier estimation of a CES
cost function: the case of higher education in Britain. Econ Educ Rev 21(1):63–71
165. James E, King EM, Suryadi A (1996) Finance, management, and costs of public and private
schools in Indonesia. Econ Educ Rev 15(4):387–398
166. Jimenez E (1986) The structure of educational costs: multiproduct cost functions for primary
and secondary schools in Latin America. Econ Educ Rev 5(1):25–39
167. Jimenez E, Paqueo V (1996) Do local contributions affect the efficiency of public primary
schools? Econ Educ Rev 15(4):377–386
168. Johnes J (1990) Unit costs: some explanations of the differences between UK universities.
Appl Econ 22(7):853–862
169. Johnes G (1996) Multi-product cost functions and the funding of tuition in UK universities.
Appl Econ Lett 3:557–561
170. Johnes G (1997a) Costs and industrial structure in contemporary British higher education.
Econ J 107:727–737
171. Johnes J (1997b) Inter-university variations in undergraduate non-completion rates: a statistical
analysis by subject of study. J Appl Stat 24(3):343–361
172. Johnes J (2004) Efficiency measurement. In: Johnes G, Johnes J (eds) International handbook
on the economics of education. Edward Elgar, Cheltenham, pp 613–742
173. Johnes J (2006a) Measuring efficiency: a comparison of multilevel modelling and data
envelopment analysis in the context of higher education. Bull Econ Res 58(2):75–104
174. Johnes J (2006b) Measuring teaching efficiency in higher education: an application of data
envelopment analysis to economics graduates from UK universities 1993. Eur J Oper Res
174:443–456
175. Johnes J (2008) Efficiency and productivity change in the English higher education sector
from 1996/97 to 2004/05. Manch Sch 76(6):653–674
176. Johnes G (2013) Efficiency in higher education institutions revisited: a network approach.
Econ Bull 33(4):2698–2706
177. Johnes J (2014a) Efficiency and input substitutability in English higher education 1996/97 to
2008/09. INFORMS, San Francisco, USA, 9th – 12th November
178. Johnes J (2014b) Efficiency and mergers in English higher education 1996/97 to 2008/9:
parametric and non-parametric estimation of the multi-input multi-output distance function.
Manch Sch 82(4):465–487
179. Johnes J (2015) Operational research in education. Eur J Oper Res 243(3):683–696
180. Johnes G (2020a) Economic approaches to school efficiency. In: Bradley S, Green C (eds)
The economics of education: a comprehensive overview. Elsevier, London, pp 479–489
181. Johnes J (2020b) Scale and scope economies, higher education. In: Teixeira PN, Shin J-C,
Amaral A, Bernasconi A, Magalhaes A, Kehm BM, Stensaker B, Choi E, Balbachevsky
E, Hunter F, Goastellec G, Mohamedbhai G, de Wit H, Valimaa J, Rumbley L, Unangs
L, Klemencic M, Langa P, Yang R, Nokkal T (eds) International encyclopedia of higher
education systems and institutions. Springer, Dordrecht
182. Johnes G, Johnes J (1993) Measuring the research performance of UK economics depart-
ments: an application of data envelopment analysis. Oxf Econ Pap 45:332–347
183. Johnes G, Johnes J (2009) Higher education institutions’ costs and efficiency: taking the
decomposition a further step. Econ Educ Rev 28(1):107–113
184. Johnes J, Johnes G (2013) Efficiency in the higher education sector: a technical exploration.
Department for Business Innovation and Skills, London
185. Johnes G, Johnes J (2016) Costs, efficiency and economies of scale and scope in the English
higher education sector. Oxf Rev Econ Policy 32(4):596–614
186. Johnes G, Johnes J (2019) Panel data in educational research. In: Tsionas M (ed) Panel data
econometrics: empirical applications. Elsevier, Amsterdam
187. Johnes G, Salas Velasco M (2007) The determinants of costs and efficiencies where producers
are heterogeneous: the case of Spanish universities. Econ Bull 4(15):1–9
188. Johnes G, Schwarzenberger A (2011) Differences in cost structure and the evaluation of
efficiency: the case of German universities. Educ Econ 19(5):487–499
189. Johnes J, Taylor J (1989a) An evaluation of performance indicators based upon the first
destination of university graduates. Stud High Educ 14(2):219–235
190. Johnes J, Taylor J (1989b) The first destination of new graduates: comparisons between
universities. Appl Econ 21(3):357–373
191. Johnes J, Taylor J (1989c) Undergraduate non-completion rates: difference between UK
universities. High Educ 18(2):209–225
192. Johnes J, Taylor J (1990a) Determinants of student wastage in higher education. Stud High
Educ 15(1):87–99
193. Johnes J, Taylor J (1990b) Performance indicators in higher education. Society for Research
into Higher Education and Open University Press, Buckingham
194. Johnes J, Tsionas M (2014) University mergers in England: effects on efficiency. European
Workshop on Efficiency and Productivity Analysis, 15–18 June Helsinki, Finland
195. Johnes G, Johnes J, Thanassoulis E, Lenton P, Emrouznejad A (2005) An exploratory analysis
of the cost structure of higher education in England, Research report 641. Department for
Education and Skills, London
196. Johnes G, Camanho AS, Portela MCAS (2008a) Assessing efficiency of Portuguese universi-
ties through parametric and non-parametric methods. Port J Manag Stud 13(1):39–66
197. Johnes G, Johnes J, Thanassoulis E (2008b) An analysis of costs in institutions of higher
education in England. Stud High Educ 33(5):527–549
1234 J. Johnes

198. Johnes J, Bradley S, Little A (2012) Efficiency in the further education sector in England.
Open J Stat 2(1):131
199. Johnson AL, Ruggiero J (2014) Nonparametric measurement of productivity and efficiency
in education. Ann Oper Res 221:197–210
200. Jondrow J, Lovell CAK, Materov IS, Schmidt P (1982) On the estimation of technical
inefficiency in the stochastic frontier production function model. J Econ 19(2–3):233–238
201. Kang B-G, Greene KV (2002) The effects of monitoring and competition on public education
and outputs: a stochastic frontier approach. Public Financ Rev 30(1):3–26
202. Kantabutra S, Tang JCS (2006) Urban-rural and size effects on school efficiency: the case of
Northern Thailand. Leadersh Policy Sch 5(4):355–377
203. Kao C (2014) Network data envelopment analysis: a review. Eur J Oper Res 239(1):1–16
204. Kao C, Hung H-T (2003) Ranking university libraries with a posteriori weights. Libri 53:282–
289
205. Kao C, Hung H-T (2008) Efficiency analysis of university departments: an empirical study.
Omega 36(4):653–664
206. Kao C, Hwang S-N (2008) Efficiency decomposition in two-stage data envelopment analysis:
an application to non-life insurance companies in Taiwan. Eur J Oper Res Soc 185(1):418–
429
207. Kao C, Liu S-T (2000) Data envelopment analysis with missing data: an application to
university libraries in Taiwan. J Oper Res Soc 51(8):897–905
208. Kempkes G, Pohl C (2010) The efficiency of German universities – some evidence from
nonparametric and parametric methods. Appl Econ 42(16):2063–2079
209. Kenny LW (1982) Economies of scale in schooling. Econ Educ Rev 2(1):1–24
210. Kirjavainen T (2007) Efficiency of Finnish upper secondary schools: an application of
stochastic frontier analysis with panel data, VATT discussion paper 428. Government Institute
for Economic Research, Helsinki
211. Kirjavainen T, Loikkanen HA (1998) Efficiency differences of Finnish senior secondary
schools: an application of DEA and Tobit analysis. Econ Educ Rev 17(4):377–394
212. Kong W-H, Fu T-T (2012) Assessing the performance of business colleges in Taiwan using
data envelopment analysis and student based value-added performance indicators. Omega
40(5):541–549
213. Koshal RK, Koshal M (1999) Economies of scale and scope in higher education: a case of
comprehensive universities. Econ Educ Rev 18:269–277
214. Koshal RK, Koshal M (2000) Do liberal arts colleges exhibit economies of scale and scope?
Educ Econ 8(3):209–220
215. Koshal RK, Koshal M (2001) Multi-product total cost function for higher education: a case
of bible colleges. Econ Educ Rev 20:297–303
216. Kounetas K, Anastasiou A, Mitropoulos P, Mitropoulos I (2011) Departmental efficiency
differences within a Greek university: an applications of DEA and Tobit analysis. Int Trans
Oper Res 18(5):545–559
217. Kuah CT, Wong KY (2011) Efficiency assessment of universities through data envelopment
analysis. Proc Comput Sci 3:499–506
218. Kumar RC (1983) Economies of scale in school operation: evidence from Canada. Appl Econ
15(3):323–340
219. Kumbhakar SC, Ghosh S, McGuckin JT (1991) A generalized production frontier approach
for estimating determinants of inefficiency in U.S. dairy farms. J Bus Econ Stat 9(3):279–286
220. Kumbhakar SC, Lien G, Hardaker JB (2014) Technical efficiency in competing panel data
models: a study of Norwegian grain farming. J Prod Anal 41(2):321–337
221. Lazarsfeld PF, Henry NW (1968) Latent structure analysis. Houghton Mifflin, New York
222. Lee B, Johnes J (2019) A network DEA of higher education teaching: new evidence from
the UK teaching excellence framework. European Workshop on Efficiency and Productivity
Analysis London
223. Lee VE, Smith JB (1997) High school size: which works best and for whom? Educ Eval
Policy Anal 19(3):205–227
30 Applications of Production Economics in Education 1235

224. Lewis WC, Chakraborty K (1996) Scale economies in public education. J Reg Anal Policy
26(1):23–35
225. Lloyd PJ, Morgan MH, Williams RA (1993) Amalgamations of universities: are there
economies of scale or scope. Appl Econ 25:1081–1092
226. Longlong H, Fengliang L, Weifang M (2009) Multi-product total cost functions for higher
education: the case of Chinese research universities. Econ Educ Rev 28(4):505–511
227. Lovell CAK, Walters LC, Wood LL (1994) Stratified models of education production
using modified data envelopment analysis and regression analysis. In: Charnes A, Cooper
WW, Lewin AY, Seiford LM (eds) Data envelopment analysis: theory, methodology and
applications. Kluwer Academic, Dordrecht
228. Malmquist S (1953) Index numbers and indifference surfaces. Trab Estad 4:209–242
229. Mancebon M-J, Bandrés E (1999) Efficiency evaluation in secondary schools: the key role of
model specification and of ex post analysis of results. Educ Econ 7(2):131–152
230. Mancebón M-J, Mar Molinero C (2000) Performance in primary schools. J Oper Res Soc
51(7):843–854
231. Mancebón M-J, Muñiz M (2008) Private versus public high schools in Spain: disentangling
managerial and programme efficiencies. J Oper Res Soc 59(7):892–901
232. Mancebón M-J, Calero J, Choi Á, Ximénez-de-Embún DP (2012) The efficiency of public
and publicly subsidized high schools in Spain: evidence from PISA-2006. J Oper Res Soc
63:1516–1533
233. Mante B (2001) Measuring the performance of state secondary schools in Victoria: an
application of data envelopment analysis. Educ Res Perspect 28(1):105–133
234. Mante B, O’Brien G (2002) Efficiency measurement of Australian public sector organisations:
the case of state secondary schools in Victoria. J Educ Adm 40(3):274–298
235. Margaritis D, Smart W (2011) Productivity changes in Australasian universities 1997–2005: a
Malmquist analysis. 52nd Annual Conference of the New Zealand Association of Economics,
Wellington, New Zealand, 29th June to 1st July
236. Mayston DJ (2014) Effectiveness analysis of quality achievements for university Departments
of Economics. Appl Econ 46(31):3788–3797
237. McCarty TA, Yaisawarng S (1993) Technical efficiency in New Jersey school districts. In:
Fried HO, Lovell CAK, Schmidt SS (eds) The measurement of productive efficiency. Open
University Press, Oxford
238. McDonald J (2009) Using least squares and tobit in second stage DEA efficiency analyses.
Eur J Oper Res 197(2):792–798
239. McEwan PJ, Carnoy M (2000) The effectiveness and efficiency of private schools in Chile’s
voucher system. Educ Eval Policy Anal 22(3):213–239
240. McMillan ML, Chan WH (2006) University efficiency: a comparison and consolidation of
results from stochastic and non-stochastic methods. Educ Econ 14(1):1–30
241. McMillan ML, Datta D (1998) The relative efficiencies of Canadian universities: a DEA
perspective. Can Public Policy 24(4):485–511
242. Mikušová P (2015) An application of DEA methodology in efficiency measurement of the
Czech public universities. Proc Econ Financ 25:569–578
243. Misra K, Grimes PW, Rogers KE (2012) Does competition improve public school efficiency?
a spatial analysis. Econ Educ Rev 31:1177–1190
244. Mizala A, Romaguera P, Farren D (2002) The technical efficiency of schools in Chile. Appl
Econ 34(12):1533–1552
245. Moreno AA, Tadepalli R (2002) Assessing academic department efficiency at a public
university. Manag Decis Econ 23(7):385–397
246. Muñiz M (2002) Separating managerial inefficiency and external conditions in data envelop-
ment analysis. Eur J Oper Res 143:625–643
247. Naper LR (2010) Teacher hiring practices and educational efficiency. Econ Educ Rev
29(4):658–668
248. Nazarko J, Šaparauskas J (2014) Application of DEA method in efficiency evaluation of
public higher education institutions. Technol Econ Dev Econ 20(1):25–44
1236 J. Johnes

249. Nemoto J, Furumatsu N (2014) Scale and scope economies of Japanese private universities
revisited with an input distance function approach. J Prod Anal 41(2):213–226
250. Ng YC, Li S-k (2009) Efficiency and productivity growth in Chinese universities during the
post-reform period. China Econ Rev 20(2):183–192
251. Noulas AG, Ketkar KW (1998) Efficient utilization of resources in public schools: a case of
New Jersey. Appl Econ 30:1299–1306
252. Oliveira MA, Santos C (2005) Assessing school efficiency in Portugal using FDH and
bootstrapping. Appl Econ 37:957–968
253. Orea L, Kumbhakar SC (2004) Efficiency measurement using a latent class stochastic frontier
model. Empir Econ 29(1):169–183
254. Osburn DD (1970) Economies of size associated with public high schools. Rev Econ Stat
52(1):113–115
255. Ouellette P, Vierstraete V (2010) Malmquist indexes with quasi-fixed inputs: an application
to school districts in Québec. Ann Oper Res 173(1):57–76
256. Papadimitriou M, Johnes J (2016) Persistent and transient cost inefficiency in the English
higher education sector: a generalised true random effects model. 4th Workshop on Efficiency
in Education, Politecnico di Milano, Milan, Italy, October
257. Papadimitriou M, Johnes J (2018) Does merging improve efficiency? A study of English
universities. Stud High Educ 44:1–21
258. Paul CJM, Nehring R (2005) Product diversification, production systems, and economic
performance in US agricultural production. J Econ 126:525–548
259. Paul CJM, Johnston WE, Frengley GAG (2002) Post-reform substitution and cost efficiency
in the New Zealand agricultural sector. Int J Bus Econ 1(2):135–146
260. Podinovski VV, Ismail I, Bouzdine-Chameeva T, Zhang W (2014) Combining the assump-
tions of variable and constant returns to scale in the efficiency and evaluation of secondary
schools. Eur J Oper Res 239(2):504–513
261. Portela MCAS, Thanassoulis E (2001) Decomposing school and school-type efficiency. Eur
J Oper Res 132(2):357–373
262. Portela MCAS, Camanho AS, Borges D (2012) Performance assessment of secondary
schools: the snapshot of a country taken by DEA. J Oper Res Soc 63(8):1098–1115
263. Primont DF, Domazlicky B (2006) Student achievement and efficiency in Missouri schools
and the No Child Left Behind Act. Econ Educ Rev 25:77–90
264. Ramanathan R (2001) A data envelopment analysis of comparative performance in the
Netherlands. Opsearch 38(2):160–182
265. Rassouli-Currier S (2007) Assessing the efficiency of Oklahoma public schools: a data
envelopment analysis. Southwest Econ Rev 34(1):131–144
266. Ray SC (1991) Resource-use efficiency in public schools: a study of Connecticut data. Manag
Sci 37(12):1620–1628
267. Ray SC, Jeon Y (2008) Reputation and efficiency: a non-parametric assessment of America’s
top-rated MBA programs. Eur J Oper Res 189(1):245–268
268. Ray SC, Mukherjee K (1998) Quantity, quality, and efficiency for a partially super-additive
cost function: Connecticut public schools revisited. J Prod Anal 10(1):47–62
269. Rayeni MM, Saljooghi FH (2010) Network data envelopment analysis model for estimating
efficiency and productivity in universities. J Comput Sci 6(11):1235–1240
270. Reifschneider D, Stevenson RE (1991) Systematic departures from the frontier: a framework
for the analysis of firm inefficiency. Int Econ Rev 32(3):715–723
271. Riew J (1966) Economies of scale in high school operation. Rev Econ Stat 48(3):
280–287
272. Riew J (1986) Scale economies, capacity utilization, and school costs: a comparative analysis
of secondary and elementary schools. J Educ Financ 11(Spring):433–446
273. Robertson FW (2007) Economies of scale for large school districts: a national study with
local implications. Soc Sci J 44(4):620–629
274. Rolle A (2004) Out with the old – in with the new: thoughts on the future of educational
productivity research. Peabody J Educ 79(3):31–56
30 Applications of Production Economics in Education 1237

275. Ruggiero J (1999) Non-parametric analysis of educational costs. Eur J Oper Res 119:
605–612
276. Ruggiero J (2000) Nonparametric estimation of returns to scale in the public sector with an
application to the provision of educational services. J Oper Res Soc 51:906–912
277. Ruggiero J (2007) Measuring the cost of meeting minimum educational standards: an
application of data envelopment analysis. Educ Econ 15(1):1–13
278. Ruggiero J, Vitaliano DF (1999) Assessing the efficiency of public schools using data
envelopment analysis and frontier regression. Contemp Econ Policy 17(3):321–331
279. Ruggiero J, Duncombe W, Miner J (1995) On the measurement and causes of technical
inefficiency in local public services: with an application to public education. J Public Adm
Res Theory 5(4):403–428
280. Sav GT (2011) Panel data estimates of public higher education scale and scope economies.
Atl Econ J 39(2):143–153
281. Schiltz F, De Witte K (2017) Estimating scale economies and the optimal size of school
districts: a flexible form approach. Br Educ Res J 43(6):1048–1067
282. Schiltz F, De Witte K, Mazrekaj D (2019) Managerial efficiency and efficiency differentials in
adult education: a conditional and bias-corrected efficiency analysis. Ann Oper Res 288:529
283. Schmidt P (1985–1986) Frontier production functions. Econ Rev 4(2):289–328
284. Selim S, Bursalioglu SA (2013) Analysis of the determinants of universities efficiency in
Turkey: application of the data envelopment analysis and panel Tobit model. Procedia Soc
Behav Sci 89:895–900
285. Sengupta JK, Sfeir RE (1986) Production frontier estimates of scale in public schools in
California. Econ Educ Rev 5(3):297–307
286. Sexton TR, Comunale CL, Gara SC (2012) Efficiency-based funding for public four-year
colleges and universities. Educ Financ Policy 7(3):331–359
287. Shephard RW (1970) Theory of cost and production frontiers. Princeton University Press,
Princeton
288. Simar L, Wilson PW (2007) Estimation and inference in two-stage, semi-parametric models
of production processes. J Econ 136(1):31–64
289. Simar L, Wilson PW (2011) Two-stage DEA: caveat emptor. J Prod Anal 36(2):205–218
290. Simon J, Simon C, Arias A (2011) Changes in productivity of Spanish university libraries.
Omega 39(5):578–588
291. Sinuany-Stern Z, Mehrez A, Barboy A (1994) Academic departments efficiency via DEA.
Comput Oper Res 21(5):543–556
292. Sîrbu A, Cimpoieş D, Racul A (2016) Use of data envelopment analysis to measure the
performance efficiency of academic departments. Agric Agric Sci Procedia 10:578–585
293. Skodvin O-J (1999) Mergers in higher education – success or failure? Tert Educ Manag
5(1):65–80
294. Smet M (2001) Determining the optimal size of study fields in Flemish secondary education.
Econ Educ Rev 20(5):443–457
295. Smet M, Nonneman W (1998) Economies of scale and scope in Flemish secondary schools.
Appl Econ 30(9):1251–1258
296. Smith PC, Street A (2006) Analysis of secondary school efficiency: final report. Department
for Education and Skills, Nottingham
297. Stevens PA (2005) A stochastic frontier analysis of English and Welsh universities. Educ Econ
13(4):355–374
298. Summers AA, Wolfe BL (1977) Do schools make a difference? Am Econ Rev 67(4):
639–652
299. Thanassoulis E (1999) Setting achievement targets for school children. Educ Econ 7(2):
101–119
300. Thanassoulis E, Portela MCAS (2002) School outcomes: sharing the responsibility between
pupil and school. Educ Econ 10(2):183
301. Thanassoulis E, Kortelainen M, Johnes G, Johnes J (2011) Costs and efficiency of higher
education institutions in England: a DEA analysis. J Oper Res Soc 62(7):1282–1297
1238 J. Johnes

302. Thanassoulis E, De Witte K, Johnes J, Johnes G, Karagiannis G, Portela MC (2016)

Applications of data envelopment analysis in education. In: Zhu J (ed) Data envelopment
analysis: a handbook of empirical studies and applications. Springer-Verlag, New York, pp
367–438
303. Thieme C, Prior D, Tortosa-Ausina E (2013) A multilevel decomposition of school perfor-
mance using robust nonparametric frontier techniques. Econ Educ Rev 32:104–121
304. Tone K, Tsutsui M (2009) Network DEA: a slacks-based measure approach. Eur J Oper Res
Soc 197:243–252
305. Tsionas EG (2002) Stochastic frontier models with random coefficients. J Appl Econ 17:
127–147
306. Tsionas EG, Kumbhakar SC (2014) Firm heterogeneity, persistent and transient technical
inefficiency: a generalized true random-effects model. J Appl Econ 29(1):110–132
307. Tyagi P, Yadav SP, Singh SP (2009) Relative performance of academic departments using
DEA with sensitivity analysis. Eval Program Plann 32(2):168–177
308. Uri ND (2003) The adoption of incentive regulation and its effect on technical efficiency in
telecommunications in the United States. Int J Prod Econ 86:21–34
309. Verry DW, Davies B (1976) University costs and outputs. Elsevier, Amsterdam
310. Verry DW, Layard PRG (1975) Cost functions for university teaching and research. Econ J
85:55–74
311. Vittadini G, Colombi R, Martini G (2011) A stochastic frontier model with short-run and
long-run inefficiency random effects. Universita Di Bergamo, Bergamo
312. Waldo S (2007) Efficiency in Swedish public education: competition and voter monitoring.
Educ Econ 15(2):231–251
313. Warning S (2004) Performance differences in German higher education: empirical analysis
of strategic groups. Rev Ind Organ 24(4):393–408
314. Wolszczak-Derlacz J (2017) An evaluation and explanation of (in)efficiency in higher
education institutions in Europe and the U.S. with the application of two-stage semi-
parametric DEA. Res Policy 46(9):1595–1605
315. Wolszczak-Derlacz J, Parteka A (2011) Efficiency of European public higher education
institutions: a two-stage multi-country approach. Scientometrics 89(3):887–917
316. Woodhouse G, Goldstein H (1988) Educational performance indicators and LEA league
tables. Oxf Rev Educ 14(3):301–320
317. Worthington AC (2001) An empirical survey of frontier efficiency measurement techniques
in education. Educ Econ 9(3):245–268
318. Worthington AC, Higgs H (2008) An empirical note on factor price elasticities and substitu-
tion in Australian higher education. SSRN. http://ssrn.com/abstract=1313527 or https://doi.
org/10.2139/ssrn.1313527
319. Worthington AC, Higgs H (2011) Economies of scale and scope in Australian higher
education. High Educ 61(4):387–414
320. Worthington AC, Lee BL (2008) Efficiency, technology and productivity change in Australian
universities 1998–2003. Econ Educ Rev 27(3):285–298
321. Wu J, Zhang G, Zhu Q, Zhou Z (2020) An efficiency analysis of higher education institu-
tions in China from a regional perspective considering the external environmental impact.
Scientometrics 122(1):57–70
322. Yang G-L, Fukuyama H, Song Y-Y (2018) Measuring the inefficiency of Chinese research
universities based on a two-stage network DEA model. J Informetrics 12(1):10–30
323. Zhang L-C, Worthington AC (2017) Scale and scope economies of distance education in
Australian universities. Stud High Educ 42(9):1785–1799
324. Zhang L-C, Worthington AC (2018) Explaining estimated economies of scale and scope in
higher education: a meta-regression analysis. Res High Educ 59(2):156–173
325. Zhang L-C, Worthington AC, Hu M (2017) Cost economies in the provision of
higher education for international students: Australian evidence. High Educ 74(4):
717–734
30 Applications of Production Economics in Education 1239

326. Zimmer R, Buddin R (2009) Is charter school competition in California improving the
performance of traditional public schools? Public Adm Rev 69(5):831–845
327. Zimmer T, DeBoer L, Hirth M (2009) Examining economies of scale in school consolidation:
assessment of Indiana school districts. J Educ Financ 35(2):103–127
328. Zoghbi AC, Rocha F, Mattos E (2013) Education production efficiency: evidence from
Brazilian universities. Econ Model 31:94–103
31 Dairy Farming from a Production Economics Perspective: An Overview
of the Literature

Boris E. Bravo-Ureta, Alan Wall, and Florian Neubauer

Contents
Introduction
Early Uncovering of Basic Relationships
Technical Efficiency
Output Growth and Total Factor Productivity
Cost Function Approaches: Efficiency and Economics of Scale, Size, and Scope
Technology Adoption
Supply Response and Government Intervention
Risk and Uncertainty
Sustainability
Concluding Remarks
Cross-References
References

B. E. Bravo-Ureta · F. Neubauer
Agricultural and Resource Economics, University of Connecticut, Storrs, CT, USA

A. Wall
Department of Economics, University of Oviedo, Oviedo, Spain

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_31

Abstract

The purpose of this chapter is to give a broad overview of the published empirical
work on the production economics of dairy farming as well as an outlook on
future challenges for this area of research. This chapter shows that the vast
production economics literature on dairy farming has been used to address a
wide variety of topics including efficiency and productivity, technology adoption,
economies of size, scale and scope, the effects of government intervention
policies in the sector, the effect of risk and uncertainty, and issues relating
to sustainability including climatic effects, animal welfare, and environmental
efficiency. Dairy farming faces important challenges, particularly with regard to
environmental sustainability, animal welfare, structural changes, and input and
output price volatility, all of which provide fertile ground for future production
economics research in dairy. The conceptual frameworks and empirical analyses
reviewed in this chapter show that production economists have several tools
at their disposal to carry out studies related to these challenges and thereby
contribute to policy analyses and formulation. Moreover, the role of produc-
tion economists, working with scientists in various other disciplines, will be
paramount in the search for avenues to improve the overall productivity of dairy
farming while offering policymakers sound advice on sustainable technologies
and tools to deal with greater risk and uncertainty.

Keywords

Dairy · Milk · Production · Economics · Productivity · Efficiency · Cost ·
Profit · Supply · Technology · Size · Scale · Weather · Climate ·
Sustainability

Introduction

Dairy farming has been and remains a very important agricultural activity around the
globe. According to the Food and Agriculture Organization (FAO) [1], milk is one
of the most valuable farm products and over 130 million farms have dairy cattle,
with significant variability in cow numbers per farm across countries. Moreover,
cows represent an important source of livelihoods, employment, food and wealth
among rural households. In developing countries, dairy animals are an important
mechanism for the empowerment of women. On the consumption side, fluid milk
and dairy products are a significant source of energy, protein and micronutrients.
These features can play a key role in decreasing hunger, and in enhancing food
security, nutritional levels and diets.
Worldwide demand for food is expected to double by 2050 as a result of growing
population and incomes. To satisfy this rise in demand, farms around the world
will have to produce significant quantities of additional food. This rise will present
important opportunities and challenges to global agricultural systems and the dairy
sector is no exception [2].
The objective of this chapter is to provide an overview of the production
economics literature centering on the dairy sector, specifically on dairy farming
and milk production rather than on the dairy products manufacturing sector. To
provide some historical context, section “Early Uncovering of Basic Relationships”
reviews early studies that sought to uncover basic production relationships, and
this brings us to the 1970s. Section “Technical Efficiency” focuses on the literature
analyzing technical efficiency, followed in section “Output Growth and Total Factor
Productivity” by an exposition of studies dealing with output growth and total
factor productivity using primal (production) approaches. Section “Cost Function
Approaches: Efficiency and Economics of Scale, Size, and Scope” reviews studies
using cost function models to measure efficiency as well as economies of size,
scale and scope. Section “Technology Adoption” covers technology adoption, while
section “Supply Response and Government Intervention” considers studies that
analyze supply response and government (intervention) policies in the dairy sector,
especially quotas and price supports. Section “Risk and Uncertainty” tackles the
literature on risk and uncertainty and section “Sustainability” reviews studies that
have examined the sustainability of dairy production in a broad sense, including
papers on weather and climate, animal welfare, and environmental efficiency. The
final section concludes and comments on the role of production economics research
with regard to future challenges facing dairy farming.

Early Uncovering of Basic Relationships

How far to go back in time as well as the specific papers to include in a chapter like
the present one is a matter of judgement and ultimately somewhat arbitrary. We have
chosen the paper coauthored by Mordecai Ezekiel (1927), an eminent agricultural
economist with a prolific professional career within the USA, as the beginning of
our exposition [3].
Ezekiel, McNall and Morrison (1927) set out to provide a scientific understanding
of why farm milk production records show considerable variability in per-unit
milk costs across farmers. The authors reasoned that this variability had two main
sources: (1) the efficiency in the use of the milk production technology in terms of
the mix of inputs used to generate the observed level of milk output; and (2) the
effectiveness of the combination of factors considering their costs and output value
under prevailing economic conditions. Using data gathered from Wisconsin dairy
farmers, the authors documented “great variation from farm to farm in average
production per cow” (p. 3). After conducting detailed statistical analyses, they
reported that the essential problem in dairy farming was the technical relationship
between feeding rations and milk output. A general conclusion was that their
farm-level results were consistent with those obtained by other scientists in feeding
experiments.
In a 1932 report, Ezekiel, Rauchenstein and Wells focused on the response of
milk production to changes in price [4]. These authors drew from earlier analyses
and concluded that 53–88% of the nonseasonal changes in milk production in the
winter season could be explained by changes in the milk-feed price ratio. The most
elastic milk supply was found in Vermont, followed by Baltimore, and Saint Paul-
Minneapolis, although the authors concluded that farmers reacted in similar ways
despite differences between the three areas.
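
As a rough illustration of the kind of relationship described above, the following sketch computes a milk-feed price ratio and the share of output variation it explains (the R² of a simple linear regression). All figures are hypothetical and are not taken from Ezekiel, Rauchenstein and Wells; the function name `simple_ols` and the variable names are assumptions made only for this example.

```python
# Illustrative sketch only: every number below is made up for demonstration
# and does not reproduce any data set from the studies cited in the text.

def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares; return slope, intercept, R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy ** 2) / (sxx * syy)  # share of the variation in y explained by x
    return slope, intercept, r2

# Hypothetical winter-season observations (one entry per year).
milk_price = [2.10, 2.30, 2.00, 2.60, 2.80, 2.40]   # price per unit of milk
feed_price = [1.40, 1.35, 1.50, 1.30, 1.25, 1.45]   # price per unit of feed
milk_output = [100, 108, 95, 120, 131, 104]         # index of milk production

# The milk-feed price ratio: units of feed that one unit of milk can buy.
ratio = [m / f for m, f in zip(milk_price, feed_price)]

slope, intercept, r2 = simple_ols(ratio, milk_output)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R^2 = {r2:.2f}")
```

An R² between 0.53 and 0.88 in a regression of this type would correspond to the 53–88% range reported in the text, although the original analyses may have used a different specification.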
In a more theoretical contribution, Cassels (1933) stated that progress had been
made on empirical investigations on the responsiveness of supply to price and
highlighted several issues that deserved additional work [5]. The author argued that
the market and long-run supply curves are difficult, if not impossible, to estimate
statistically. In contrast, he concluded that the short-run curve was easier to estimate
and closer to the studies that were being conducted at the time. Cassels emphasized
that supply curves depend on the time period considered and are not reversible,
where the latter implies a higher elasticity when prices go up compared to when
prices decline.
Cassels and Malenbaum (1938) expressed skepticism regarding statistical anal-
yses of supply response [6]. They noted that Cassels (1937), contrary to Ezekiel,
Rauchenstein and Wells (1932), found no evidence of a high correlation between
milk-feed price ratios and output in Vermont, even though the methods and the
data used were similar (the period of analysis, however, differed) [7]. Cassels and
Malenbaum (1938) investigated this disparity and concluded that omitted variables
that could not be included in the studies had to be responsible for the disparities
and cautioned that results from statistical analyses could not be taken as economic
laws [6].
The early work by Cassels and others was recognized by several authors in
the literature on supply response in later decades. In particular, Halvorson (1958)
highlights the importance of both the early Cassels contributions and the book
Interregional Competition in Agriculture by Mighell and Black (1951) [8, 9]. As
better data became available, many influential supply response studies appeared in
the 1950s and 1960s, including Brandow (1953), Halvorson (1958), Schuh (1957),
Cochrane (1958), Cowling and Gardner (1964), Wipf and Houck (1967) and Wilson
and Thompson (1967), all of which used regression techniques [8, 10–15]. The
econometric literature on supply response was influenced heavily by the seminal
work of Nerlove in the mid-1950s and was carried on into the 1970s by Chen,
Courtney and Schmitz (1972), and Prato (1973) [16, 17]. An extensive review of
supply response papers in the period influenced by the Nerlove model in agricultural
commodities, including milk, can be found in Askari and Cummings (1977) [18].
Linear programming models were also used to study supply response in the 1950s
and 1960s by Faris and McPherson (1957), Cowling and Baker (1963) and Kelley
and Knight (1965). Some drawbacks regarding the use of this technique in this
setting were discussed by Barker (1965) [19–22].
Jensen (1940) focused on the estimation of “input-output relationships in milk
production [in] an attempt to accomplish quantitative verification of theory and to
obtain data that will be useful to economists, technical specialists and producers in
determining the most economic organization and adjustment of production” (p. 249)
[23]. He provided a critical overview of several research reports published in the
1920s and 1930s relying on data from farm records and argued that data generated
from experiments designed specifically to investigate input-output associations were
more desirable. He went on to assert that the determination of these physical
relationships is not economic but purely technical. He then used linear regression
to estimate parabolic production functions to explain the variation in milk output in
terms of feed inputs where the unit of observation is a cow and the data came from
several experiments conducted in different research stations in the USA.
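The kind of parabolic (quadratic) production function Jensen estimated can be sketched as follows. This is a minimal illustration, not Jensen's actual specification or data: the feed and milk figures, the prices, and the function names `fit_quadratic` and `optimal_feed` are all hypothetical. The sketch fits milk = a + b*feed + c*feed^2 by ordinary least squares and then finds the feed level at which the marginal product of feed equals the feed-milk price ratio.

```python
def fit_quadratic(x, y):
    """OLS fit of y = a + b*x + c*x**2 via the normal equations."""
    # Build X'X and X'y for the regressors (1, x, x^2).
    rows = [(1.0, xi, xi * xi) for xi in x]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    # Solve the 3x3 system by Gaussian elimination with partial pivoting.
    m = [xtx[i] + [xty[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            m[r] = [mr - f * mc for mr, mc in zip(m[r], m[col])]
    # Back-substitution.
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(m[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = (m[i][3] - tail) / m[i][i]
    return tuple(coef)


def optimal_feed(b, c, feed_price, milk_price):
    """Feed level where marginal product b + 2*c*x equals the price ratio."""
    return (feed_price / milk_price - b) / (2.0 * c)


# Hypothetical data generated from y = 10 + 2.0*x - 0.05*x**2 (illustrative only).
feed = [0.0, 4.0, 8.0, 12.0, 16.0, 20.0]
milk = [10 + 2.0 * f - 0.05 * f * f for f in feed]
a, b, c = fit_quadratic(feed, milk)
x_star = optimal_feed(b, c, feed_price=0.5, milk_price=1.0)
```

Because the fitted coefficient on the squared term is negative, the marginal product of feed declines, so a higher feed-milk price ratio implies a lower optimal feeding rate.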
Jensen’s work inspired a number of agricultural economists to pursue quantitative
production studies in dairy and other types of farming. It is interesting to
observe that in the late 1930s an important policy concern was soil management,
while governmental programs were being enacted to promote the conservation of
natural resources. A net effect expected at that time was an increase in forage
production that was suitable for dairy farming. Understanding how best to utilize
an expanding availability of forage crops along with technological improvements
and price changes became increasingly important in order to provide sound advice
to farmers, extensionists and policymakers [24].
An early study using regression analysis along with records from operating farms
was published by Herrmann (1943) who was interested in determining the most
profitable feeding rate for commercial dairy herds in West Virginia [25]. The author
concluded that his analysis provided good foundation for adjusting feeding rates in
response to different input-output price ratios. In the 1950s and 1960s, considerable
work using regression analyses of experimental data was undertaken to uncover the
nature of the production function for milk, including alternative functional forms,
and to analyze diminishing returns of different feeds, marginal rates of technical
substitution, the geometry of isoquants and expansion paths, and related issues con-
cerning dairy rations [26–30]. Heady and Dillon (1961) published a comprehensive
examination of agricultural production function models and of empirical work done
up to the late 1950s [31]. This publication appears to be the first to introduce a
second-degree polynomial version of a standard two-input Cobb-Douglas model (p.
205), which over the years has become widely known as the translog model. The lat-
ter name comes from a research note published by Christensen, Jorgenson and Lau
(1971) where they discuss the transcendental logarithmic function, abbreviating it as
“Trans-Log” [32].
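For reference, a two-input translog production function can be written as below. The notation (y for output, x_1 and x_2 for inputs, and the Greek parameters) is an assumption made here for illustration and is not taken from Heady and Dillon or from Christensen, Jorgenson and Lau.

```latex
\ln y = \alpha_0 + \alpha_1 \ln x_1 + \alpha_2 \ln x_2
      + \tfrac{1}{2}\beta_{11} (\ln x_1)^2
      + \tfrac{1}{2}\beta_{22} (\ln x_2)^2
      + \beta_{12} \ln x_1 \ln x_2
```

Setting \beta_{11} = \beta_{22} = \beta_{12} = 0 recovers the Cobb-Douglas form, which is the sense in which the translog is a second-degree polynomial extension of it.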
Among other early studies, Aune and Day (1959) used data from Minnesota
dairy producers for 1956 and 1957 along with regression analysis to exam-
ine the relationship between labor used in different farm activities in herds
varying in size [33]. Waugh (1951) used linear programming, a new technique
at that time, to determine least cost combinations of feed inputs that would
fulfill or exceed various nutritional requirements, while Weeks (1964) applied
linear programming to find profit-maximizing dairy rations incorporating wheat
as a feed source using parametric variation in wheat prices [34, 35]. Related
dairy production work includes Coffey and Toussaint (1963), Heady, Madden,
Jacobson and Freeman (1964), and Paris, Malossini, Pilla and Romita (1970)
[36–39].
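The least-cost ration problem mentioned above can be illustrated with a small sketch. It assumes only two feeds so that the linear program can be solved by enumerating the vertices of the feasible region; the nutrient coefficients, requirements, and prices are hypothetical, and the function name `least_cost_ration` is introduced here only for illustration. Applied work would normally use a general-purpose linear programming solver.

```python
from itertools import combinations


def least_cost_ration(costs, requirements):
    """Two-feed least-cost ration by vertex enumeration.

    costs: (c1, c2), price per unit of each feed (assumed positive).
    requirements: list of (a1, a2, b) meaning a1*x1 + a2*x2 >= b.
    Returns (cost, x1, x2) at the cheapest feasible vertex, or None.
    """
    # Treat non-negativity as two extra constraints: x1 >= 0 and x2 >= 0.
    cons = list(requirements) + [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    best = None
    for (a1, a2, b), (d1, d2, e) in combinations(cons, 2):
        det = a1 * d2 - a2 * d1
        if abs(det) < 1e-12:
            continue  # parallel boundaries never intersect
        # Intersection of the two boundary lines (Cramer's rule).
        x1 = (b * d2 - a2 * e) / det
        x2 = (a1 * e - b * d1) / det
        # Keep the vertex only if it satisfies every constraint.
        if all(p * x1 + q * x2 >= r - 1e-9 for p, q, r in cons):
            cost = costs[0] * x1 + costs[1] * x2
            if best is None or cost < best[0]:
                best = (cost, x1, x2)
    return best


# Hypothetical example: energy 2*x1 + 4*x2 >= 40, protein 3*x1 + 1*x2 >= 30.
ration = least_cost_ration((1.0, 1.5), [(2.0, 4.0, 40.0), (3.0, 1.0, 30.0)])
```

With positive feed prices the minimum is attained at a vertex of the feasible region, which is why checking the pairwise intersections of the constraint boundaries is sufficient in this two-feed case.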
This brief review of early quantitative studies that examine production economic
issues in dairy farming shows that emphasis was placed on the generation of
information that had a direct bearing on farm management decisions. Another
feature of this work was the close collaboration between agricultural economists
and other disciplines involved in dairy science which explains the wide use of
experimental data in these economic analyses.

Technical Efficiency

This section considers studies that examine the technical efficiency (TE for the
remainder of the chapter) dimension of dairy farm productivity using primal models.
Since many of the empirical papers included in the remainder of this chapter use
frontier techniques, it is useful to provide an overview of the frontier methodology
before moving on to dairy studies.
The seminal 1957 article by Farrell sets the foundation for frontier function
research, which has become a significant subfield in production economics [40–42].
Farrell specified a best practice constant returns to scale production frontier and
then defined technical, allocative, and economic efficiency. Farrell and Fieldhouse
(1962) extended the 1957 model to accommodate increasing returns to scale [43].
Estimation methods of the best practice frontier have evolved significantly over
the past few decades and a rich menu is now available including parametric,
nonparametric, stochastic, and deterministic formulations [44].
The origin of nonparametric frontiers is Farrell (1957), and this was followed
by conceptual and empirical work by agricultural economists at Berkeley including
Boles (1966 and 1971), Bressler (1966) and Seitz (1970) [40, 45–48]. However, the
preceding work was largely ignored by agricultural economists, until it was brought
to the forefront by Charnes, Cooper and Rhodes (1978), who introduced the data
envelopment analysis (DEA) concept, and by Banker, Charnes and Cooper (1984)
[49, 50]. Nonparametric measures of efficiency are now commonly obtained using
DEA, which relies on mathematical programming techniques. This area of research,
both in terms of new models and applications, has seen a vigorous evolution (e.g.,
Färe, Grosskopf and Lovell 1985; Simar and Wilson 2007; Färe, Grosskopf and
Margaritis 2008) [51–53].
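A minimal sketch of the idea behind these nonparametric measures follows. It assumes the simplest possible case, one input, one output, and constant returns to scale, in which the DEA linear program collapses to comparing each unit's output-input ratio with the best observed ratio. The herd sizes and milk volumes are hypothetical, and `crs_efficiency_1x1` is a name introduced here for illustration; multi-input, multi-output DEA requires solving a linear program for each unit.

```python
def crs_efficiency_1x1(inputs, outputs):
    """Farrell-type TE scores for one input and one output under CRS.

    Each unit is compared with the highest observed output/input ratio,
    which defines the best practice frontier in this special case.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


# Hypothetical farms: cows (input) and milk in tonnes (output).
cows = [100.0, 80.0, 120.0]
milk = [800.0, 720.0, 840.0]
scores = crs_efficiency_1x1(cows, milk)
```

A score of 1 identifies a unit on the frontier; a score below 1 gives the proportion of the best observed productivity that the unit achieves.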
Aigner and Chu (1968) proposed the first parametric production frontier model,
deterministic in nature and estimated with linear or quadratic programming [54].
Following them, Timmer (1971) formulated a probabilistic production frontier
model, still estimated with mathematical programming, which he applied to US
state-level agricultural data [55]. The parametric stochastic production frontier
(SPF) framework was introduced, almost simultaneously, by Aigner, Lovell and
Schmidt (1977) and Meeusen and van den Broeck (1977) [56, 57]. The SPF has also
seen rapid evolution in recent years, particularly in applications that make use of
panel data (e.g., Schmidt and Sickles 1984; Battese and Coelli 1995; Greene 2005;
Greene 2008; Tsionas and Kumbhakar 2014; Kumbhakar, Wang and Horncastle
2015; Filippini and Greene 2016) [58–64].
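In a commonly used notation (assumed here, not quoted from the papers cited), the stochastic production frontier for unit i can be written with a composed error term:

```latex
\ln y_i = \mathbf{x}_i'\boldsymbol{\beta} + v_i - u_i, \qquad
v_i \sim N(0, \sigma_v^2), \quad u_i \ge 0, \qquad
TE_i = \exp(-u_i)
```

Here v_i captures statistical noise and u_i captures technical inefficiency. A deterministic frontier of the Aigner and Chu type corresponds to dropping v_i, so that all deviations from the frontier are attributed to inefficiency.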
An initial shortcoming of the SPF was that it could measure only average TE
for the sample. This limitation was alleviated by Jondrow, Lovell, Materov and
Schmidt (1982) who developed an approach to obtain individual TE scores for each
observation given cross-sectional data [65]. Schmidt and Sickles (1984) presented
various approaches for calculating individual scores for panel data models including
fixed effects [58]. Battese and Coelli (1988) extended the Jondrow, Lovell, Materov
and Schmidt (1982) approach to panel data and applied their method to study TE
for a sample of milk producers from the Australian Dairy Industry Survey for New
South Wales and Victoria for the years 1978–1979, 1979–1980 and 1980–1981 [66].
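The observation-specific estimators discussed above can be sketched for the cross-sectional normal/half-normal case. The sketch assumes a production frontier with composed residual eps = v - u and treats the standard deviations sigma_u and sigma_v as already estimated; the numerical values and the function names are hypothetical. `jlms_inefficiency` returns the conditional mean E[u | eps] and `bc_efficiency` returns E[exp(-u) | eps].

```python
import math


def _norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def conditional_moments(eps, sigma_u, sigma_v):
    """Location and scale of the truncated normal distribution of u given eps."""
    sigma_sq = sigma_u ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_u ** 2 / sigma_sq
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma_sq)
    return mu_star, sigma_star


def jlms_inefficiency(eps, sigma_u, sigma_v):
    """Conditional mean E[u | eps] under the normal/half-normal model."""
    mu_star, sigma_star = conditional_moments(eps, sigma_u, sigma_v)
    z = mu_star / sigma_star
    return mu_star + sigma_star * _norm_pdf(z) / _norm_cdf(z)


def bc_efficiency(eps, sigma_u, sigma_v):
    """Conditional expectation E[exp(-u) | eps] under the same model."""
    mu_star, sigma_star = conditional_moments(eps, sigma_u, sigma_v)
    z = mu_star / sigma_star
    ratio = _norm_cdf(z - sigma_star) / _norm_cdf(z)
    return ratio * math.exp(-mu_star + 0.5 * sigma_star ** 2)


# Hypothetical residuals and variance parameters (illustrative only).
residuals = [-0.5, 0.0, 0.5]
u_hat = [jlms_inefficiency(e, 1.0, 1.0) for e in residuals]
te_hat = [bc_efficiency(e, 1.0, 1.0) for e in residuals]
```

More negative residuals translate into larger estimated inefficiency and lower TE. The sketch is not numerically robust for very large positive residuals, where the normal CDF in the denominator approaches zero.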
Another aspect of frontier models that has received considerable attention is the
explanation of TE, or what has been called inefficiency effects. The initial approach
was in two stages, where TE was first generated parametrically or nonparametrically
and then the TE scores were regressed on a set of variables. This two-stage approach
has been criticized by several authors for introducing statistical bias and this led
Fried, Lovell and Schmidt (2008) to write: “We hope to see no more two-stage
models” (p. 39) [44]. Several one-step SPF models have been introduced including
Battese and Coelli (1992), which uses panel data where the temporal pattern of TE is
the same for all units in the sample, and Battese and Coelli (1995), which allows for
time-varying inefficiency effects and accommodates explanatory variables related
to technical inefficiency [59, 67]. Two-step inefficiency effect models have also
been applied and criticized in the DEA literature (e.g., Simar and Wilson 2008)
and alternatives have been proposed (Simar and Wilson 2007) [52, 68].
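The two Battese and Coelli specifications can be summarized compactly. The notation below is an assumption made for illustration: u_{it} is the inefficiency of unit i in period t, T is the last period, \eta is a decay parameter, z_{it} is a vector of explanatory variables, and N^+ denotes a normal distribution truncated at zero.

```latex
\text{Battese and Coelli (1992):} \quad
u_{it} = \exp\{-\eta\,(t - T)\}\, u_i, \qquad u_i \sim N^{+}(\mu, \sigma_u^2)

\text{Battese and Coelli (1995):} \quad
u_{it} \sim N^{+}(\mathbf{z}_{it}'\boldsymbol{\delta}, \sigma_u^2)
```

In the first specification a single parameter \eta governs how inefficiency evolves for every unit. In the second, the mean of the pre-truncation distribution depends on z_{it}, so the inefficiency effects are estimated jointly with the frontier in one step.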
Production frontier methods have been applied to a number of industries, sectors
and subsectors and a primary interest has been the measurement of TE, which is
the focus in this section [44]. An issue of substantial importance in the production
frontier literature is the connection between managerial performance and TE, a link
made by Farrell (1957) and subsequently by Martin and Page (1983) and Triebs and
Kumbhakar (2018) [40, 69, 70]. A similar link between firm effects and managerial
performance/ability and TE has been made in the nonfrontier literature by a number
of authors including Hoch (1955), Mundlak (1961), Mundlak and Hoch (1965), and
Hoch (1976) [71–74].
The remainder of this section will highlight applications of frontier methods
that center primarily on the TE analysis of dairy production using primal (input
and output quantity based) methods. This discussion begins with applications of
nonparametric frontiers. The first published nonparametric study of dairy production
seems to be Grisley and Mascarenhas (1985) who used the Boles (1971) approach to
measure efficiency of Pennsylvania farms using records for 1981 and 1982 [46, 75].
Weersink, Turvey and Godah (1990) used the nonparametric methodology, based on
Färe, Grosskopf, and Lovell (1985), to examine TE for a cross-section of Ontario
farms for 1987 [51, 76]. Overall TE was decomposed into pure TE (producing
below the frontier), congestion (input overuse), and scale efficiency (deviations from
constant returns to scale).
Cloutier and Rowley (1993) applied DEA to examine TE for a Quebec sample
for 1988 and 1989 and that same year Tauer (1993) examined both short- and long-
run TE and allocative efficiency (AE) for a sample of New York producers for 1990
[77, 78]. Fraser and Cordina (1999) used an input-oriented DEA model to assess
TE for farms in 1994/1995 and 1995/1996 located in Northern Victoria, Australia
[79]. Asmild, Hougaard, Kronborg and Kvist (2003) pointed out that TE scores
derived from DEA were common in selecting benchmark firms [80]. They then used
multidirectional efficiency analysis (MEA) to argue that it is desirable to separate
benchmark selection from efficiency measurement. The authors compared MEA
with DEA scores and found a high correlation but noted considerable differences
between small and large farms in Denmark. Mugera (2013) introduced an innovation
in dairy farm TE analysis where fuzzy set theory was combined with DEA to
examine cases when both inputs and outputs are measured imprecisely [81].
Hansson, Manevska-Tasevska and Asmild (2018) explored another angle based
on MEA [82]. They argued that the common interpretation of technical inefficiency
as waste due to input overuse might instead represent a sensible handling of risk
and uncertainty, a behavior referred to as rational inefficiency. Employing MEA,
they found support for the rational inefficiency hypothesis and concluded that, for
a sample of Swedish farms, what might appear as inefficiency represents rational
management behavior.
The following paragraphs provide an overview of applications of parametric
frontiers, beginning with cross-sectional studies, followed by panel data appli-
cations. Müller (1974) appears to be the first to discuss dairy farm TE in the
context of a production frontier [83]. Interestingly, Müller argued that TE is
conceptually inadequate from a theoretical standpoint. He develops a model where
proxy variables for information are introduced to account for observed efficiency
differences. Using data from California’s San Joaquin Valley, he fitted a modified
Cobb-Douglas production function using ordinary least squares; thus, this is a
nonfrontier TE efficiency paper and the author concluded that his approach was
“conceptually and analytically superior to the methodology of frontier production
functions” (p. 730). It is interesting to note that the Müller paper was published three
years before the seminal work by Aigner, Lovell and Schmidt (1977) and Meeusen
and van den Broeck (1977) [56, 57].
The first published paper to apply a parametric production frontier approach
to study TE in dairy farming seems to be Bravo-Ureta (1986), who implemented
a Cobb-Douglas probabilistic production frontier à la Timmer (1971) based on
data from New England, USA, for the year 1980 [55, 84]. Soon after, Dawson
(1987) published what appears to be the first SPF study of dairy farming, which
utilized data from England and Wales [85]. Bravo-Ureta and Rieger (1990) used
farm records for New England and New York to evaluate the robustness of TE
scores with respect to four different parametric frontier models estimated separately
for 1982 and 1983 [86]. TE scores varied markedly across models but were highly
correlated so that the ordinal rankings were similar. Bravo-Ureta and Rieger (1991)
extended the Kopp and Diewert efficiency decomposition methodology from a
deterministic to a stochastic framework [87]. Their method permitted the estimation
of TE, AE, and economic efficiency (EE) using only estimates from an SPF
model. Technical, allocative, and scale inefficiencies were analyzed by Kumbhakar,
Biswas, and Bailey (1989) for Utah dairy farmers using a system of simultaneous
equations comprising unconditional input demand and output supply functions that
incorporate the three types of inefficiency [88]. Large farms were more technically
efficient than small farms and most farms in all size categories were found to be
scale efficient.
Mbaga, Romain, Larue and Lebel (2003) provided another look at the robustness
of TE measures using data from Quebec to compare the effect of the Cobb-
Douglas, translog, and generalized Leontief (GL) functional forms under different
distributional assumptions for the inefficiency term; DEA measures were also included [89].
The TE scores from all parametric models were highly correlated but correlations
between the parametric and the DEA TE scores were low. Another evaluation
across methodologies was provided by Balcombe, Fraser and Kim (2006) who
compared a standard SPF, a Bayesian SPF, and DEA for dairy farms located in the
River Murray region of Victoria and New South Wales, Australia, for 1999/2000 [90].
Spearman rank correlation coefficients revealed a high positive and statistically
significant association for the TE scores across all models considered.
Moving on to panel data studies, Heshmati and Kumbhakar (1994) introduced
a two-step model to separately identify time-invariant farm heterogeneity, farm-
and time-variant TE, and technological progress [91]. Kumbhakar and Heshmati
(1995) extended the model developed in their previous paper by separating overall
inefficiency into firm- and time-invariant components, as well as firm- and time-
variant components [92]. Applications were provided in both papers to Swedish
dairy farms.
Cuesta (2000) generalized the 1992 Battese and Coelli approach by allowing the
temporal pattern of inefficiency to differ across firms and provided an application to
Spanish dairy farms [93]. The Battese and Coelli (1995) panel stochastic frontier
model was used by Kompas and Che (2006) to estimate TE for an unbalanced
panel of Australian dairy farms observed in the years 1996, 1998, and 2000
[59, 94]. They found production to be characterized by constant returns to scale,
with TE determined by the type of dairy shed used, feed concentration, and the
number of dairy cows milked at peak season. The true fixed effects (TFE) and true
random effects (TRE) models introduced by Greene [60] were used by Abdulai and
Tietje (2007) to estimate time-invariant unobserved firm-specific heterogeneity for a
sample of German dairy farms [95]. They also used the results from Mundlak (1978)
[96] to account for possible correlation between heterogeneity and the regressors
in the SPF. They found considerable variability in the results across models and
concluded that unobserved heterogeneity needs careful consideration.
Noting that SPF models typically utilize output-oriented (OO) TE measures,
Kumbhakar and Tsionas (2008) used a nonhomogeneous SPF to calculate and
compare OO and input-oriented (IO) measures and returns to scale based on
balanced panel data for Spanish dairy farms for 1993–1998 [97]. The authors
showed that the econometric models for the IO and OO measures are different
for nonhomogeneous functional forms, so the resulting TE, returns to scale (RTS),
and partial elasticities of production also differ. These differences become more
pronounced the larger the deviation of RTS from unity.
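The link between the two orientations can be made explicit under a simplifying assumption. Suppose, purely for illustration, that the production function f is homogeneous of degree r, with u denoting output-oriented and \eta input-oriented inefficiency (notation assumed here):

```latex
\text{OO:} \quad y = f(\mathbf{x})\, e^{-u}, \qquad
\text{IO:} \quad y = f(\mathbf{x}\, e^{-\eta}), \qquad u, \eta \ge 0

f(\mathbf{x}\, e^{-\eta}) = e^{-r\eta} f(\mathbf{x})
\;\; \Longrightarrow \;\; u = r\,\eta
```

In the homogeneous case the two measures differ only by the factor r and coincide when returns to scale equal one. With a nonhomogeneous form such as the translog, no such proportionality holds, which is consistent with the two orientations leading to different econometric models.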
Input distance functions have also been used to study TE in dairy farming.
Rasmussen (2010) used a stochastic input distance function based on the Battese and
Coelli (1992) specification to examine scale efficiency for an unbalanced panel of
Danish dairy farms for a 22-year period (1985–2006) [67, 98]. He found that average
TE remained constant over time, but it was lower for older farmers and larger
herds. Tsionas, Kumbhakar and Malikov (2015) estimated technical and allocative
inefficiency for a panel of Norwegian dairy farms using a translog input distance
function augmented by the set of independent first-order conditions, which makes it
possible to address input endogeneity [99].
Much of the efficiency analysis in the economics literature has been based
on static production models where firms are assumed to adjust their input levels
instantaneously ignoring input fixity and adjustment costs over time. Dynamic
models have been developed to deal with these limitations. Silva and Stefanou
(2007) presented an intertemporal cost minimization nonparametric model to get
short- and long-run dynamic measures of TE, AE and EE for each unit and year in a
sample of Pennsylvanian dairy farms for the period 1986–1992 [100]. Serra, Oude
Lansink and Stefanou (2011) implemented a quadratic specification of a parametric
directional distance function to obtain TE, AE, and cost efficiency measures for a
panel of Dutch farms and found that the dynamic efficiency rankings were consistent
with those reported in the static literature [101].
Emvalomatis, Stefanou and Oude Lansink (2011) presented a stochastic distance
function that accounted for possible persistence of TE over time (autocorrelation)
[102]. In an application to German and Dutch farmers, they reported that technical
inefficiency was persistent over time. A similar result was found by Skevas,
Emvalomatis and Brümmer (2018), who implemented a dynamic stochastic output
distance function to account for firm-level heterogeneity in long-run TE using a
Bayesian estimation approach [103]. In an application to German dairy farms, they
found that technical inefficiency was persistent over time and that average long-run
TE was consistent with significant adjustment costs.
The final papers considered in this section comprise dairy studies using metafron-
tiers and latent class stochastic frontiers, which are methodologies used to tackle
the presence of different technologies within a sample. Battese, Rao and O’Donnell
(2004) define the metafrontier as “an overarching function of a given mathematical
form that encompasses the deterministic components of the SPF for the firms
that operate under the different technologies” (p. 92) [104]. Refinements were
incorporated by O’Donnell, Rao and Battese (2008) [105]. More recently, Huang,
Huang, and Liu (2014); Amsler, O’Donnell and Schmidt (2017); and Amsler,
Chen, Schmidt and Wang (2020) have developed alternative stochastic metafrontier
models [106–108].
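The basic metafrontier decomposition can be written as follows. The notation is assumed for illustration: f^k is the deterministic frontier of group k, f^* is the metafrontier that envelops all group frontiers, and TGR denotes the technology gap ratio.

```latex
TGR_i^k = \frac{f^k(\mathbf{x}_i)}{f^*(\mathbf{x}_i)} \le 1, \qquad
TE_i^* = TE_i^k \times TGR_i^k
```

TE relative to the metafrontier is thus the product of TE relative to the unit's own group frontier and the gap between that group frontier and the metafrontier.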
While the metafrontier approach has been employed in studies on several sectors,
relatively few applications to dairy farming are available. The first such application
of the deterministic metafrontier model to dairy farm data was by Moreira and
Bravo-Ureta (2010) in their analysis of farms located in Argentina, Chile and
Uruguay [109]. Latruffe, Fogarasi and Desjeux (2012) used the metafrontier
approach to compare the productivity of Hungarian and French dairy farms (field
crop farms were also included) and Jiang and Sharp (2015) used the deterministic
approach to compare the performance of dairy farms located in the North and South
Island of New Zealand [110, 111]. The only stochastic metafrontier study found
for milk producers to date is Alem, Lien, Hardaker and Guttormsen (2019), who
analyzed a rich panel dataset to evaluate dairy farm performance in five regions in
Norway [112].
An alternative to metafrontiers to capture intra-sample differences in technology
is the latent class stochastic frontier model proposed by Orea and Kumbhakar (2004)
[113]. This model was used by Álvarez and del Corral (2010) to estimate the
technology of a sample of Spanish dairy farms that included intensive and extensive
farms [114]. They found that intensive farms were more productive and more
technically efficient than farms using the extensive technology. Another application
of latent class stochastic frontiers to Spanish dairy farms was provided by Orea,
Pérez-Méndez and Roibas (2015) who used this model to investigate the effects
of land fragmentation on technology choice and TE [115]. They found that land
fragmentation reduced the probability of adopting extensive production processes
and had a greater effect on TE in extensive farms.
Readers interested in a more detailed discussion of the TE literature focusing on
dairy and other agricultural products can go to various meta-analyses that have been
undertaken including Bravo-Ureta, Solís, Moreira López, Maripani, Thiam and
Rivas (2007), Moreira and Bravo-Ureta (2009), and Ogundari (2014) [116–118].

Output Growth and Total Factor Productivity

This section reviews papers that have analyzed output growth and total factor
productivity (TFP) in dairy farming. It starts by providing a brief overview of
key methodological contributions that have been made to decompose and better
understand the sources of output and TFP growth.
An early contribution toward TFP decomposition was Caves, Christensen and
Diewert (1982a) who relied on Diewert (1976) to define “superlative” output and
input indexes based on a constant returns to scale translog transformation function
consistent with the Törnqvist index [119, 120]. Soon after, Caves, Christensen and
Diewert (1982b) established the relationship between Törnqvist input and output
indexes and the Malmquist index [121]. Contemporaneously, Nishimizu and Page
(1982) stated that technological progress and TE have a common foundation in
the production function; however, they argued “that applied work in these fields
has evolved largely independently” (p. 920) [122]. These authors were pioneers in
incorporating TE in addition to technological progress as a source of productivity
change and they estimated a deterministic production frontier following the linear
programming model introduced by Aigner and Chu (1968) [54].
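The Törnqvist index mentioned above can be sketched in a few lines. The quantities and values below are hypothetical, and the function names are introduced here for illustration only. The log of the index is a weighted sum of log quantity changes, with weights equal to the average value shares of the two periods; a TFP index is then the ratio of an output index to an input index.

```python
import math


def tornqvist_index(q0, q1, v0, v1):
    """Törnqvist quantity index between period 0 and period 1.

    q0, q1: quantities of each item in the two periods.
    v0, v1: values (price times quantity) used to form the shares.
    """
    s0 = [v / sum(v0) for v in v0]
    s1 = [v / sum(v1) for v in v1]
    log_index = sum(
        0.5 * (a + b) * math.log(y / x)
        for a, b, x, y in zip(s0, s1, q0, q1)
    )
    return math.exp(log_index)


def tfp_index(output_index, input_index):
    """TFP change as the ratio of output growth to input growth."""
    return output_index / input_index


# Hypothetical two-input example (e.g., feed and labor), illustrative only.
input_growth = tornqvist_index([10.0, 5.0], [11.0, 5.0], [40.0, 60.0], [45.0, 55.0])
tfp_growth = tfp_index(1.10, input_growth)
```

If every quantity changes by the same factor, the index equals that factor regardless of the shares, because the average shares sum to one.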
Färe, Grosskopf, Norris and Zhang (1994) implemented a nonparametric math-
ematical programming framework to decompose Malmquist productivity indexes
into changes in technological progress and TE for 17 OECD countries. They referred
to the former component as shifts in the world frontier coming from innovation
while changes in TE represent movements from individual countries toward the
world frontier or catching up [123]. Kumbhakar and Lovell (2000) estimated a
translog stochastic production frontier, calculated productivity change, and then
decomposed the latter into three elements: technological progress, returns to scale,
and time-varying TE [124]. They also presented various decompositions for dual
cost and profit frontiers as well as multi-input multi-output technologies. More
recently, O’Donnell (2016 and 2018) introduced the “proper” TFP index, which
needs to satisfy a number of axioms and is consistent with measurement theory
[125, 126]. O’Donnell decomposed TFP changes into various components including
scale, technology, TE, environmental, and statistical noise. Moreover, he argues that
commonly used TFP indexes (e.g., Fisher and Törnqvist) are not “proper” because
they do not comport with measurement theory and violate important index number
axioms.
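The Malmquist decomposition of Färe, Grosskopf, Norris and Zhang (1994) described above can be sketched as follows, assuming that the four output distance function values have already been computed (for example, by DEA). The argument names and the numerical values are hypothetical: `d0_x0` is the distance of period-t data from the period-t frontier, `d1_x1` the period-(t+1) data from the period-(t+1) frontier, and `d0_x1` and `d1_x0` are the mixed-period evaluations.

```python
import math


def malmquist_decomposition(d0_x0, d1_x1, d0_x1, d1_x0):
    """Return (efficiency change, technical change, Malmquist index)."""
    # Catching up: change in distance to the contemporaneous frontier.
    eff_change = d1_x1 / d0_x0
    # Frontier shift: geometric mean of the shift measured at each period's data.
    tech_change = math.sqrt((d0_x1 / d1_x1) * (d0_x0 / d1_x0))
    return eff_change, tech_change, eff_change * tech_change


# Hypothetical distance function values (illustrative only).
ec, tc, mi = malmquist_decomposition(d0_x0=0.8, d1_x1=0.9, d0_x1=1.1, d1_x0=0.7)
```

The product of the two components equals the geometric mean of the Malmquist indexes defined relative to the period-t and period-(t+1) technologies, so values above one indicate productivity growth.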
Turning now to empirical dairy studies that have focused on output growth
and TFP, Ahmad and Bravo-Ureta (1995) used alternative fixed effects production
functions and SPF models to measure TE and to decompose dairy farm production
growth into TE, technological progress, and input growth for Vermont dairy farms
using the Nishimizu and Page (1982) framework [122, 127]. Weersink and Tauer
(1990) assessed the regional and temporal impacts of technological progress in
the US dairy sector using a dynamic dual optimization model [128]. They found
that the major milk-producing regions of the Northeast and Lake States benefited
most from existing trends in productivity at that time. Weersink and Tauer (1991)
applied multivariate Granger causality tests to investigate the relationship between
productivity per cow and farm size using US data for the period 1964–1987, finding
that the relationships varied considerably across states [129].
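The output growth decomposition used in this line of work can be written compactly. Assume, for illustration only, a production frontier y = f(x, t) exp(-u) with output elasticities \varepsilon_j and dots denoting growth rates (notation introduced here, not quoted from the papers cited):

```latex
\dot{y} = \sum_j \varepsilon_j \dot{x}_j
        + \frac{\partial \ln f(\mathbf{x}, t)}{\partial t}
        - \frac{\partial u}{\partial t}
```

The three terms correspond to input growth, technological progress, and TE change, respectively.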
Parametric output distance functions have been used to decompose TFP growth
by Brümmer, Glauben and Thijssen (2002); Newman and Matthews (2006); Emval-
omatis (2012); and Cechura, Grau, Hockmann, Levkovych and Kroupova (2017)
[130–133]. In the first of these papers, Brümmer, Glauben, and Thijssen (2002)
used an output-oriented approach to decompose TFP growth into technological
progress, TE change, AE change and scale components for Dutch, German and
Polish farms over the period 1991–1994 [130]. They found that productivity growth
in Germany and Poland was driven mainly by technological progress, whereas for
the Netherlands the AE component was the most important driver. Newman and
Matthews (2006) decomposed TFP into technological progress, TE change and
changes in scale efficiency for Irish farms for the years 1984–2000 and found
that productivity growth over the period was driven entirely by technological
progress [131]. Emvalomatis (2012) used a random coefficients specification of
an output distance function to measure and decompose productivity in German
dairy farming [132]. TFP growth was decomposed into technological progress, TE
change, and a scale effect. Average TFP growth over the period was estimated
at 1.1%, which was driven primarily by technological progress. Cechura, Grau,
Hockmann, Levkovych and Kroupova (2017) used a stochastic output distance
function to analyze productivity for 24 EU member states and decomposed TFP
into scale, TE, technological progress, and heterogeneity effects [133]. Among their
results was that the impact of technological progress in Eastern Europe was lower
than in the rest of the EU.
Parametric input distance functions were used to analyze productivity change
by Sipiläinen, Kumbhakar and Lien (2014), Sauer and Latacz-Lohmann (2015),
and Singbo and Larue (2016) [134–136]. Sipiläinen, Kumbhakar and Lien (2014)
analyzed the profitability and productivity dynamics of Finnish and Norwegian
farms by estimating a translog input distance function and decomposing prof-
itability change into output growth, output price change, input price change,
technological progress, scale, markup, and TE change [134]. This decomposi-
tion permits TFP change to be obtained from profitability change. Sauer and
Latacz-Lohmann (2015) analyzed productivity for a sample of German dairy
farms where TFP change was measured with a Luenberger index and decom-
posed into efficiency change and technological progress [135]. The latter was
found to be the main driver of TFP change while innovation, proxied by net
investment, contributed to productivity change through its effect on technolog-
ical progress. Singbo and Larue (2016) estimated a stochastic input distance
function for a sample of Quebec dairy farms to decompose TFP into techno-
logical progress, TE change, scale efficiency change, and an input-mix com-
ponent, finding that the farms in their sample were operating at suboptimal
scale [136].
An application using a parametric stochastic production frontier as opposed to
multi-output distance functions is Moreira and Bravo-Ureta (2016), who analyzed
TFP change for a panel of Chilean dairy farms [137]. Decomposing TFP change
into technological progress, TE change, scale efficiency change, and allocative
change, they found that technological progress was the greatest contributor to
TFP change. No relationship was found between farm size and productivity
growth.
DEA methods have also been applied in the analysis of TFP in dairy farming.
Tauer (1998) examined the productivity change of New York dairy farms over
the period 1985–1993 with Malmquist indices using DEA [138]. Productivity
change was decomposed into TE change and technological progress and positive
annual productivity growth was found to come mainly from technological progress.
Latruffe, Fogarasi and Desjeux (2012), referred to briefly in the previous section,
used a DEA metafrontier approach to compare the productivity of Hungarian
and French dairy farms for the 2001–2007 period [110]. They found no differ-
ences in productivity between the countries based on their own frontiers. Using
the metafrontier, however, they found that Hungarian farms were more
productive.
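As a purely illustrative aside, the Malmquist decomposition used in DEA studies such as Tauer (1998) can be sketched in a few lines of Python. The sketch assumes the standard output-oriented decomposition of the index into TE change and technological change, and takes as given four distance function values that would in practice be estimated by DEA. The function name, argument names, and numbers are hypothetical and are not taken from any of the studies cited.

```python
import math


def malmquist(d_t_obs_t: float, d_t_obs_t1: float,
              d_t1_obs_t: float, d_t1_obs_t1: float):
    """Output-oriented Malmquist index and its two components.

    d_t_obs_t   : period-t observation measured against the period-t frontier
    d_t_obs_t1  : period-(t+1) observation measured against the period-t frontier
    d_t1_obs_t  : period-t observation measured against the period-(t+1) frontier
    d_t1_obs_t1 : period-(t+1) observation measured against the period-(t+1) frontier
    """
    # TE change: movement of the farm relative to its contemporaneous frontier.
    eff_change = d_t1_obs_t1 / d_t_obs_t
    # Technological change: geometric mean of the frontier shift at both observations.
    tech_change = math.sqrt((d_t_obs_t1 / d_t1_obs_t1) * (d_t_obs_t / d_t1_obs_t))
    return eff_change * tech_change, eff_change, tech_change


# Hypothetical distance function values for one farm in two adjacent years.
index, eff_change, tech_change = malmquist(0.80, 0.95, 0.70, 0.84)
```

With these made-up values, TE change equals 0.84/0.80 = 1.05 and the larger part of measured growth is attributed to the frontier shift, which mirrors the pattern of technology-driven growth reported in the studies above.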
Finally, Jang and Du (2019) estimated productivity for dairy farms in the
USA using recent developments in the literature on control functions to correct
for biases induced by simultaneity and sample selection [139]. Computing
time-varying productivity at the farm level, they found that surviving farms
contribute more to regional productivity growth than entering and exiting farms.
Herd size variation was negatively associated with productivity at the state
level.
To summarize, this section showed that stochastic input and output distance
functions have been used widely for the measurement and decomposition of pro-
ductivity in dairy farming, though DEA approaches have also been used. Regardless
of the method used, the studies reviewed highlight the importance of technological
progress as a driver of productivity growth.

Cost Function Approaches: Efficiency and Economies of Scale, Size, and Scope

The literature on TE and productivity reviewed in the previous two sections involved
primal approaches (i.e., production and distance frontiers). This section turns to
studies of efficiency and productivity that rely primarily on cost function models,
providing an overview of selected articles that have dealt with cost efficiency and
economies of scale, size, and scope. To clarify concepts, economies of size is a
measure of the relative change in output with respect to changes in all inputs along
the expansion path. In contrast, economies of scale measures how output responds
when all inputs are changed in the same proportion, i.e., along a ray through the
origin or the scale line. If the technology is homothetic then the expansion and scale
lines are equivalent and thus economies of size and scale correspond to each other
[140]. Economies of scope are present when “it is less costly to combine two or
more product lines in one firm than to produce them separately” [141] (p. 268).
Hoch (1976) is an early example of using a production function to investigate scale
economies in dairy farming [74].
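To make these concepts concrete, the following Python sketch computes a size-economies measure (the ratio of average to marginal cost) and a scope-economies measure (the relative cost saving from joint production) for a hypothetical two-output cost function. The functional form, the parameter values, and all function names are assumptions chosen for illustration only; they do not come from the studies reviewed here.

```python
def cost(milk: float, crops: float) -> float:
    """Hypothetical cost of producing milk and crops on one farm."""
    if milk == 0 and crops == 0:
        return 0.0
    shared_overhead = 100.0  # assumed fixed cost, incurred once per farm
    return shared_overhead + 2.0 * milk + 0.01 * milk**2 + 3.0 * crops


def size_economies(milk: float, h: float = 1e-4) -> float:
    """Average cost divided by marginal cost for a milk-only farm.

    Values above 1 indicate economies of size, values below 1 diseconomies.
    """
    # Central finite difference approximates the marginal cost of milk.
    marginal_cost = (cost(milk + h, 0.0) - cost(milk - h, 0.0)) / (2 * h)
    average_cost = cost(milk, 0.0) / milk
    return average_cost / marginal_cost


def scope_economies(milk: float, crops: float) -> float:
    """Relative cost saving from joint rather than separate production.

    Positive values indicate economies of scope.
    """
    separate = cost(milk, 0.0) + cost(0.0, crops)
    joint = cost(milk, crops)
    return (separate - joint) / joint


small, efficient, large = (size_economies(y) for y in (50.0, 100.0, 200.0))
scope = scope_economies(50.0, 20.0)
```

In this example average cost is U-shaped with its minimum at 100 units of milk, so the size measure exceeds one below that output and falls below one above it. The positive scope measure arises solely from the assumed shared overhead.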
Before looking at papers dealing with cost efficiency per se, it is important to
note that the issue of determining the most appropriate specification of technical
inefficiency in a cost frontier was addressed by Orea, Roibás and Wall (2004) [142].
They estimated three versions of a translog cost frontier for a panel of Spanish dairy
farms which differ according to the way technical inefficiency is introduced: input
oriented, output oriented, and hyperbolic. The Vuong test was used to select the most
appropriate model, and the authors found that the cost frontier with input-oriented
TE was the best choice for their data.
Cost efficiency in dairy was treated by, among others, Lund, Jacobsen and
Hansen (1993); Cocchi, Bravo-Ureta and Cooke (1998); Maietta (2000); Hailu,
Jeffrey, and Unterschultz (2005); and Álvarez, del Corral, Solís and Pérez (2008),
which are now examined in turn [143–147]. Lund, Jacobsen and Hansen (1993)
used linear programming methods to study the cost efficiency of a sample of Danish
dairy farms, finding that pure cost inefficiencies were much more important than
inefficiencies arising from suboptimal farm size [143]. Cocchi, Bravo-Ureta and
Cooke (1998) derived cost efficiency indexes for US and Canadian dairy farmers
and decomposed them into technological progress, regional competitive advantage,
and economies of size [144]. A policy implication of their results is that the
elimination of dairy support programs would accelerate the trend towards fewer
and larger farms, with these changes varying across states. Maietta (2000) used
a shadow cost model to decompose cost inefficiency into technical and allocative
inefficiencies for a panel of Italian dairy farms [145]. Excess costs were found to
stem primarily from technical inefficiency. Hailu, Jeffrey and Unterschultz (2005)
estimated non-homothetic frontier cost functions with and without local concavity in
input prices imposed to analyze cost efficiency of Alberta and Ontario dairy farms
[146]. They found that imposing curvature had little effect on the cost efficiency
estimates but did yield more plausible estimates of own-price and cross-price
elasticities. Finally, Álvarez, del Corral, Solís and Pérez (2008) estimated separate
stochastic average total cost frontiers for intensive and extensive dairy farms in
northern Spain [147]. They found that intensive farms produced at lower average
total cost and showed greater levels of efficiency than extensive farms.
Economies of size and scale in dairy farming have received considerable atten-
tion. The relationship between farm size and productivity has important implications
for farm policy, particularly in developing countries [148]. The structure of agricul-
ture and the resilience of small-scale farms have been the focus of considerable
discussion over the years and have regained popularity as a topic in recent times
[149]. The sharp decline in the number of dairy farms motivated Matulich (1978),
which is a good source for previous scale/size studies, to analyze the cost structure of
large specialized Californian dairy farms (375–3600 cows) [150]. The estimated
short- and long-run average cost (LRAC) curves revealed that “significant
economies of size were evident up to 750-cow herds” (p. 645), with the LRAC
becoming quite flat thereafter.
The influence of managerial ability on economies of size was analyzed by
Dawson and Hubbard (1987) for English dairy farms. These authors used cross-sectional
data for 1980–1981 and included “margin over feed cost per liter
of milk” as a proxy for management [151]. Estimated output was incorporated into
a translog LRAC function along with the management proxy and they found that the
LRAC was U-shaped but exhibited more pronounced economies than diseconomies
of size, with better managers producing any output amount at a lower average
cost. In a similar formulation to Dawson and Hubbard (1987), Alvarez and Arias
(2003) assumed that managerial ability is fixed and used a two-step procedure to
estimate an average cost model [152]. Using Spanish dairy farm data, they found
that managerial ability played a key role in the level of size economies achieved
and recommended that policies promoting growth in farm size be complemented
with actions that enhance managerial capacity, such as suitable extension programs.
Mukhtar and Dawson (1990) also estimated an LRAC function for a sample of
English and Welsh dairy farmers and reported considerable economies of size at
small levels of production but less pronounced diseconomies at larger levels [153].
Moschini (1990) used nonparametric and semi-parametric methods to analyze
scale economies for Ontario dairy farms [154]. Using a multiproduct cost function,
he found that the technology exhibited substantial scale economies. Løyland and
Ringstad (2001) examined economies of scale in Norwegian dairy farming by
applying variations of the standard Cobb-Douglas form that allowed for U-shaped
cost functions [155]. These authors found that exploiting scale economies fully
would lead to markedly lower cost and a substantial drop in farm numbers.
Mosheim and Lovell (2009) found substantial scale economies in an application
using US survey data and stressed the importance of taking technical and allocative
inefficiency into account when estimating scale economies [156].
The static concepts of scale and scope economies were extended into a dynamic
adjustment-cost framework by Fernández-Cornejo, Gempesaw, Elterich and Ste-
fanou (1992) [157]. In their empirical work, they estimated dynamic measures of
scope and scale for a sample of German dairy farmers observed over the period
1981–1988 covering the period when production quotas were introduced. They
found that as size increases, so does output specialization. Wieck and Heckelei
(2007), in a study of eight important European dairy farming regions from five
countries (Denmark, France, Germany, the Netherlands and the UK), found that farm
specialization, as well as large milk output, was associated with lower marginal
costs [158].
Alem, Lien, Kumbhakar and Hardaker (2019) analyzed economies of scale and
scope for a panel of Norwegian farms using a system of equations including a flexible
translog cost function and input shares. They found that farm costs can be lowered
significantly by increasing scale, i.e., producing larger crop and dairy output, and scope,
i.e., producing both crops and milk on the same farm rather than in separate units [159].
Intertemporal cost-minimizing behavior among US dairy farmers was tested
by Silva and Stefanou (2003) [160]. They developed a nonparametric dynamic
dual cost approach to recover technological information from intertemporal cost-
minimizing behavior without imposing a parametric functional form on the tech-
nology. In an application to a panel of Pennsylvania dairy operators, intertemporal
cost-minimizing behavior was not supported and the joint hypothesis of constant
returns to scale and dynamic cost minimization was rejected.
Finally, behavioral objectives, and in particular the issue of whether farms behave
in accordance with profit maximization or cost minimization, were addressed by
Tauer (1995) who tested the weak axiom of profit maximization and the weak
axiom of cost minimization for a sample of New York state dairy farms [161].
He found that farmers were not successful in maximizing profits but came closer
to cost-minimization behavior. Stefanou and Saxena (1988), who developed a
generalization of the dual (nonfrontier) profit function to test for the influence of
training variables on allocative efficiency, reported that Pennsylvania dairy farmers
allocated variable inputs to maximize production [162].
In sum, these cost-based studies show that the nature of the LRAC curve is
far from universal, though several studies have pointed to U-shaped curves where
cost per unit first declines (economies) as output increases, reaches a minimum,
and then increases (diseconomies) as output continues to rise. Economies at low
levels of output tend to be more important than diseconomies at larger levels. The
empirical evidence on economies of scope is mixed, with some authors reporting
that greater farm specialization is associated with lower costs and others finding
that costs could be reduced by producing multiple outputs. A final common finding
is that managerial ability plays a crucial role in attaining cost advantages.

Technology Adoption

The study of technology adoption, an area of enquiry pioneered by Griliches (1957),
has a long tradition in agricultural economics [163]. This section provides an
overview of technology adoption studies in dairy farming, covering animal-related
technologies including growth hormones and genetic selection, the choice between
conventional and organic technologies, and intensive and extensive technologies.
Starting with animal-related technologies, Giesen, Oskam and Berentsen (1989)
examined the expected profitability of adopting bovine somatotropin (BST) – or
bovine growth hormone – in the Netherlands, a topic that received considerable
attention in the 1980s [164]. The authors used representative farm and linear
programming methods and found positive effects at the farm and national levels.
A similar study is Marion and Wills (1990) who focused on Wisconsin farms and
concluded that the expected economic effects would be significant but lower than
reported in other studies, and sensitive to economic assumptions [165]. Stefanides
and Tauer (1999) used econometric procedures to correct for selectivity bias in the
adoption of BST for a panel of New York dairy farms and concluded that BST had a
positive effect on output per cow but a negligible impact on profits [166]. The article
includes a review of earlier studies on BST adoption aiming to identify the relation
between socioeconomic characteristics of farmers and BST adoption intentions.
Foltz and Chang (2002) studied the adoption and profitability of BST for a sample
of Connecticut dairy farms surveyed in 1999 [167]. Adoption was estimated with
probit and tobit models, and then adoption was endogenized in estimates of milk
production and farm profit rates. In line with Stefanides and Tauer (1999), they
found that adoption significantly increased milk production but had virtually no
impact on profit per cow.
The effects of adopting dairy management information systems (DMIS) were
analyzed by Tomaszewski, van Asseldonk, Dijkhuizen and Huirne (2000) for Dutch
dairy farms using regression analysis of panel data (1987–1996) for a group of
adopters and a similar group of nonadopters (i.e., controls) [168]. They found
that adopters of the technology achieved significant increases in milk
and protein production per cow as well as a shorter calving interval compared to
nonadopters.
The effect of genetic selection on productive performance has also received
attention in the literature. Roibás and Alvarez (2010) analyzed the impact of genetics
on the profitability of dairy farming in northern Spain [169]. Employing an SPF
translog model along with two auxiliary feed expense equations, they found that
changes in the genetic index over time had a positive impact on profits and that this
effect was significantly greater for better managed farms (i.e., those with higher TE).
Atsbeha, Kristofersson and Rickertsen (2012) examined the effect of genetics on the
productivity of a panel of Icelandic dairy farms and found that 19% of the average
annual productivity growth rate was contributed by the breeding technology [170].
For recent studies that measure the effects of genetic selection on efficiency and farm
profits, see Whitt, Tauer and Huson (2019) for US dairy farms and Pérez-Méndez,
Roibás and Wall (2020) for dairy farms in Spain [171, 172].
The comparison of organic and conventional technologies was studied by
Kumbhakar, Tsionas and Sipiläinen (2009) who used an SPF model to jointly
estimate the underlying technology and its adoption, considering that the latter is
both endogenous and affected by TE [173]. In an application to Finnish farms, they
found the conventional technology to be more productive. Mayen, Balagtas and
Alexander (2010) also examined the impact of adopting organic dairy production
technology on productivity and TE [174]. Using USDA data for 2005, they applied
propensity score matching techniques to account for selectivity bias from observable
variables. Organic and conventional producers exhibited different frontiers and
again the former group was less productive than the latter while both groups
exhibited similar average TE.
In a study based on ARMS data for 2003, Nehring, Gillespie, Sandretto and
Hallahan (2009) explored whether small dairy farms could compete with large
ones using conventional and pasture-based technologies [175]. Results of a translog
stochastic input distance function revealed that farms using a pasture-based system
had more room to exploit scale economies than conventional operations. Ma,
Bicknell and Renwick (2019) investigated the effect on TE of switching from pasture-based
to more intensive feeding technologies, using a fixed effects SPF model for
New Zealand dairy farms [176]. The findings showed that more intensive feeding
technologies, herd size, and milking frequency all had a positive effect on TE.
Intensive systems had also been found to be more efficient for Spanish farms by
Álvarez, del Corral, Solís and Pérez (2008) [147].
Overall, several of the studies considered in this section reveal the potential of
various technologies such as BST, DMIS, and genetic selection to generate positive
effects on production, profits, TE and scale economies. Moreover, conventional
farming appears to be more productive than organic farming, and intensive
technologies more productive than extensive ones.

Supply Response and Government Intervention

Supply elasticities have an important influence on the impact of government policies
designed to support milk producers, a common feature of the dairy sector across
many countries. This section reviews the production economics literature that has
dealt with supply elasticities and implications of government intervention in dairy
farming.
There is a long literature on milk supply response and the estimation of the
related elasticities. The early contributions have been reviewed in Sect. 2, so here
the focus is on salient papers from the 1980s onwards. Levins (1982) stated that
much of the econometric literature on milk supply response in the 1970s relied
on the Nerlove (1956) partial adjustment model and the polynomial distributed lag
model (Chen, Courtney and Schmitz 1972) [16, 177, 178]. The author argued that
direct estimation of lagged price parameters in milk supply response models offered
advantages relative to alternative models in terms of theoretical simplicity and the
elimination of assumed a priori parameter restrictions. Early research on the factors
determining milk supply was summarized by Buckwell (1984) who contended
that concentrating on price-output relationships does not provide a satisfactory
explanation of milk production [179]. He provided the first specific application to
the dairy sector of the theory advanced by Kislev and Peterson (1982), namely that
changes in farm size can be explained by changes in relative factor prices, given
fixed labor input [180]. Using UK data, he found that growth in milk output was
determined mainly by growth in average herd size, which was in turn determined by
the relative price of capital to labor. Howard and Shumway (1988) highlighted the
role of herd size in the long-run elasticity of milk supply reported in various studies
and applied a dynamic dual model to the US dairy industry to estimate the rate of
adjustment of herd size and labor to their optimal values [181]. They concluded that
dairy policies such as price support programs would need to last at least 10 years
to achieve their full potential effects.
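The distinction between short-run and long-run supply response that underlies partial adjustment models of the Nerlove type can be illustrated with a minimal Python sketch. It assumes a log-linear desired supply with no intercept and a constant adjustment speed. The parameter values, the function name, and the 10% price change are hypothetical and are not estimates from any of the studies discussed.

```python
def supply_path(beta: float, gamma: float, log_price_change: float, periods: int):
    """Log-output response to a permanent price change under partial adjustment.

    Desired supply:  q*_t = beta * p_{t-1}                      (logs, constant dropped)
    Adjustment:      q_t - q_{t-1} = gamma * (q*_t - q_{t-1}),  with 0 < gamma <= 1
    """
    q = 0.0  # log output measured as a deviation from its initial level
    path = []
    for _ in range(periods):
        # Each period closes a fraction gamma of the gap to desired supply.
        q = q + gamma * (beta * log_price_change - q)
        path.append(q)
    return path


beta, gamma = 1.2, 0.15  # assumed long-run elasticity and adjustment speed
path = supply_path(beta, gamma, log_price_change=0.10, periods=40)

short_run_elasticity = path[0] / 0.10  # equals gamma * beta
long_run_elasticity = beta             # limit as the adjustment is completed
```

Under these assumed values the first-period response is inelastic (gamma times beta) even though the long-run response is elastic, and the gap closes only gradually, which is the qualitative pattern behind the long policy horizons discussed above.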
An influential paper by Chavas and Klemme (1986) presented a dynamic model
of herd composition and supply response in the US dairy sector, considering the
herd as a capital good [182]. They modeled US milk production using yearly data
over the period 1960–1982 by bringing together a dynamic model of the aggregate
dairy herd’s size and age structure with a cow productivity equation. They found
small short-run supply elasticities, implying that price supports would not lead to
notable excess supply in the short run. As farmers react strongly to relative price
changes in the long run, however, they concluded that it would be difficult to reduce
excess supply once herd size has been expanded.
Several subsequent papers built on the work of Chavas and Klemme (1986).
Chavas and Kraus (1990) found that the short-run milk supply response to market
prices was very inelastic for five US Lake States during 1950–1985 and that it would
take at least seven years for this response to become elastic [183]. Adelaja (1991)
conceptualized the relationships between long- and short-run supply elasticities for
yields, farm size, and herd population for northeastern US dairy farms over the
period 1971–1985 [184]. Price responsiveness was found to decrease with farm size
in the short run, whereas in the long run large farms are more price responsive
than smaller ones. Also highlighted were the implications of changes in price
support on the distribution of revenues within the sector. Bozic, Kanter and Gould
(2012) examined the evolution of the long-run US milk supply elasticity based on
aggregate national data covering the period 1975–2010 [185]. They reported large
differences between short-run and long-run responses to price changes, leading them
to conclude that policymakers must take medium- and long-run policy impacts into
account. Furthermore, the responsiveness to feed prices implies that dairy policy
should focus on managing dairy farm profit margins rather than revenue streams.
Short-, intermediate-, and long-run price elasticities of output and inputs were
analyzed by Thijssen (1994), who estimated a dynamic factor demand model for
Dutch dairy farms [186]. Investments were found to be sensitive to price and
technological progress, but price elasticities of output and variable inputs were
found to be small in the short run and even in the long run, limiting the effectiveness
of price policies on output supply and variable input demand. A restricted profit
function was used by Quiroga and Bravo-Ureta (1992) to analyze the structure of
dairy technology for a sample of Vermont dairy farmers [187]. They found that
observed levels of quasi-fixed inputs were significantly lower than their long-run
optimal values and concluded that lower milk prices would threaten the viability of
small- and medium-sized farms in the short run and decrease optimal farm size in
the long run.
Technological progress can counteract the effect of price changes on milk supply.
This was explored by Blayney and Mittelhammer (1990), who used aggregate
production and profit functions to decompose milk supply response into technology
and price effects for the Washington state dairy sector, and by Munshi and Parikh (1994)
for India [188, 189]. Blayney and Mittelhammer (1990) reported that technological
advances outweighed the reduction in supply caused by lower milk prices, while Munshi and
Parikh (1994) found that the dissemination of information may be more effective
than strategies geared at providing technical inputs.
The remainder of this section deals more explicitly with the effects of quotas
and price subsidies. The impact of quota policies has been a popular topic of
research. Stefanou, Fernández-Cornejo, Gempesaw and Elterich (1992) derived and
estimated intertemporal cost-minimizing investment and variable demand functions
to compare pre- and post-quota period producer responses of German dairy farmers
in the early 1980s [190]. They found evidence of considerable excess production
capacity and that the introduction of the quota changed variable input responses and
investment behavior.
The costs of quantity restrictions and transfer costs in dairy quota exchanges
were examined by Boots, Oude Lansink and Peerlings (1997) for Dutch dairy farms
[191]. Estimating a system of input demand and output supply, they found that the
free trade of quotas would increase profits by 9%, and that small farmers would
gain more than large farmers from quota trading in a context of trade restrictions.
Similarly, Sauer (2010) analyzed the effect of quota transferability on the production
structure of Danish conventional and organic farms [192]. He found an increase in
overall market efficiency over time due to success in allocating quotas to the more
efficient farms as well as an upward shift towards organic milk production.
Colman, Burton, Rigby and Franks (2002) estimated a cost function for a cross-
section of UK dairy farmers to simulate the adjustment in the sector to different
dairy policy reform scenarios and concluded that incomes would be maintained only
if herd size increased considerably and the number of producers declined [193].
Pierani and Rizzi (2003) examined the effect of the introduction of a milk quota in
1984 on the behavior of Italian farmers and provided evidence of a rigid productive
structure during the pre- and post-quota period [194].
The relation between quota values and economic efficiency was analyzed by
Álvarez, Arias and Orea (2006) for a sample of Spanish farmers [195]. They argued
that efficient firms should be net purchasers while inefficient firms should be net
sellers of milk quotas. Their results suggested that economic efficiency was far
more important than farm size in explaining quota values, with the implication that
authorities should take efficiency into account when allocating quotas. Kumbhakar,
Lien, Flaten and Tveteras (2008) investigated the effect of quota policy on the output
growth of a panel of Norwegian farms observed over the period 1976–2005 [196].
They showed that quota regulations had a negative effect on output growth and
technological progress and suggested that a policy that does not have productive
performance as a primary objective may have contributed to the increase in off-farm
work captured in the dataset. The explanation of inefficiency under a quota regime
was analyzed by Ang and Oude Lansink (2018) using a DEA framework where
dynamic profit inefficiency was decomposed into technical and allocative ineffi-
ciency [197]. The authors reported considerable dynamic profit inefficiency among
Belgian dairy farmers, which was primarily attributed to allocative inefficiency in
the outputs and variable inputs.
The effects of deregulation and milk policy reform in the EU have been
investigated by several authors. De Frahan, Baudry, De Blander, Polomé and
Howitt (2011) analyzed the farm-level supply and income effects of removing milk
quotas and reducing producer prices while increasing direct compensatory payments
for a sample of Belgian dairy farmers covering the period 1996–2006 [198].
They found that quota removals with a 20% drop in milk prices would maintain
aggregate milk supply and farm income at their reference levels. Latruffe, Bravo-
Ureta, Carpentier, Desjeux, and Moreira (2017) analyzed the relationship between
agricultural subsidies and dairy farm TE in nine Western European countries [199].
Estimating a stochastic production frontier using a method of moments estimator
that addressed input endogeneity, they found that decoupling following the 2003
CAP Reform weakened the link between subsidies and TE but did not change
the direction of this link, except in Italy. Frick and Sauer (2018) analyzed the
relationship between deregulation, represented by the phasing out of the EU milk
quota, and efficient resource allocation [200]. Using data for German dairy farms
for the period 2000–2014, they found evidence that deregulation led to higher
productivity through greater reallocation of resources towards more productive
farms but found no effect of output price risk on this reallocation.
While the papers just mentioned were centered on the EU, several authors have
studied the effects of intervention policies on efficiency for other regions. Thus,
Bezlepkina, Oude Lansink, and Oskam (2005) analyzed the effect of subsidies
in dairy farming in the Moscow region of Russia [201]. They concluded that
while subsidies had a distorting effect on the input-output mix, they relieved credit
constraints while improving AE. Slade and Hailu (2016) studied the cost efficiency
of dairy farms in Ontario and New York State between 2005 and 2007, which
operated under two different regulatory regimes [202]. Farmers in Ontario received
higher government support and were found to be more cost inefficient than those in
New York, mainly due to greater allocative inefficiency, leading the authors to conclude that managers
in more competitive environments make better decisions. Larue, Singbo and Pouliot
(2017) evaluated the impact of supply management policies on Quebec farms [203].
They reported that a fall in the volume of production quota traded had large effects
on TE and concluded that exchange regulations, such as price ceilings, should be
removed in order to make production quota available to farms willing to make
technological innovations and herd size adjustments. Komaki and Penzer (2005)
used a structural time series model to estimate price elasticities for two Japanese
regions over the period 1970–1997, which covered major dairy policy changes in
the form of price supports and quotas [204].
Overall, the supply response literature generally points to relatively inelastic
short-run milk supply and much higher long-run elasticity, with adjustment in the
long run potentially taking several years. The role of relative factor prices and herd
size has been emphasized in long-run adjustment and technological progress has
been shown to be able to counteract the long-run impact of price adjustment on
output. Results indicate that tradeable quotas can increase market efficiency while
contributing to technological progress and herd size adjustment. Studies have also
supported the notion that deregulation can have a positive effect on dairy farm
productivity.

Risk and Uncertainty

Risk and uncertainty are inherent to agricultural activities and have been the subject
of a large theoretical and empirical literature in agricultural economics (see, for
example, the review by Moschini and Hennessy 2001 [205]). The dairy sector
is no exception, with producers facing multiple risks including production risk,
volatile output qualities, as well as institutional and market price risk. Finger,
Dalhaus, Allendorf and Hirsch (2018) provide relevant references on these sources
of risk [206]. As argued by Antle and Goodger (1984), the stochastic properties of
production technologies may have important effects on producer behavior so that
an important task for economists is to estimate and test stochastic technologies and
use the estimates to evaluate alternative decision models [207]. In the dairy farming
literature, the most common models used to estimate stochastic technologies have
been the heteroskedastic Just-Pope production function [208] and variants of the
moments-based approach of Antle (1983) [209]. In recent years, state-contingent
production models, which consider that producers can manage uncertainty through
the allocation of productive inputs to different states of nature, have become
increasingly popular in agricultural economics in general, though applications to
dairy have been scarce [210, 211].
Antle and Goodger (1984) used the flexible moments-based approach of Antle (1983)
to estimate the stochastic structure of large-scale dairy farms in California and
highlighted the importance of generalizing decision models to incorporate third
and potentially higher moments [207]. Their results indicated that
capital-intensive farms were riskier. Comparing a mean-variance (MV) criterion
with the mean-variance-skewness (MVS) criterion yielded a particularly interesting
result: the MV criterion indicated that the effects of uncertainty were unimportant
for decision-making, whereas the MVS criterion showed that uncertainty had impor-
tant effects on optimal decisions. Finger, Dalhaus, Allendorf and Hirsch (2018)
followed Antle (1983) by assuming that producers choose inputs to maximize utility
taking into account the moments of the different outputs produced [206]. They
explored the determinants of German dairy farmers’ risk exposure in an analysis
combining downside risk, the effect of climatic extremes, and animal health in a
multi-output stochastic production framework. They found that animal health affects
average revenues and production risk. An interesting policy implication was that
income stabilization measures may encourage riskier production decisions at the
expense of animal health.
The Just-Pope approach was used by Tveteras, Flaten and Lien (2011) and Orea
and Wall (2012) [212, 213]. Tveteras, Flaten and Lien (2011) estimated a multi-
output version of the Just-Pope production function for a panel of Norwegian
dairy farms, specifying mean, variance, and covariance production functions.
They found that inputs primarily increase output variance, with the risk-reducing
effects of inputs mainly present in the covariance functions. They also found that
technological progress shifted the profit distribution, increasing mean profit but also
the variance of profit, so that no welfare improvement was found for risk averse
farmers. Orea and Wall (2012) used the Just-Pope approach in an analysis of how
technological progress, production risk, and risk attitudes interact to affect producer
welfare. Using a sample of Spanish dairy farms, they presented a welfare index
comprising total factor productivity (TFP), production risk, and risk preferences.
The study showed how producer welfare may fall despite positive TFP growth when
this growth is associated with an increase in production risk and farmers are risk
averse.
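The logic of the Just-Pope approach can be sketched with a single input. Output is written as y = f(x) + h(x)^(1/2) ε, so that an input can raise mean output through f while raising or lowering output variance through h. The sketch below is illustrative only: the linear mean function, the exponential variance function, the simulated data, and all function names are assumptions made here, not the specifications used in the studies cited. It follows a simple two-step route (least squares for the mean, then a regression of log squared residuals on the input).

```python
import math
import random


def ols_simple(x, y):
    """Least squares of y on a constant and x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def just_pope_two_step(x, y):
    """Two-step sketch: mean function first, then the variance (risk) function."""
    # Step 1: mean function f(x) = a + b*x (assumed linear for illustration).
    a, b = ols_simple(x, y)
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Step 2: variance function h(x) = exp(g0 + g1*x). Regressing ln(e^2) on x
    # gives a consistent slope g1; the intercept is biased and not used here.
    log_sq = [math.log(max(e * e, 1e-300)) for e in residuals]
    g0, g1 = ols_simple(x, log_sq)
    return {"mean": (a, b), "risk": (g0, g1)}


def simulate(n, seed=0):
    """Hypothetical data with a risk-increasing input (true g1 = 0.5)."""
    rng = random.Random(seed)
    x = [rng.uniform(0.0, 4.0) for _ in range(n)]
    y = [
        1.0 + 2.0 * xi + math.sqrt(math.exp(-1.0 + 0.5 * xi)) * rng.gauss(0.0, 1.0)
        for xi in x
    ]
    return x, y


x, y = simulate(2000, seed=1)
fit = just_pope_two_step(x, y)
# A positive risk slope marks the input as risk-increasing, a negative one as risk-reducing.
print("mean slope:", fit["mean"][1], "risk slope:", fit["risk"][1])
```

The sign of the estimated risk slope is the quantity of interest: it separates risk-increasing from risk-reducing inputs, which is the distinction drawn in the studies above. A full application would use several inputs, a flexible functional form, and a final weighted least squares step for efficiency.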
An alternative approach was used by Lien, Kumbhakar and Hardaker (2017),
who estimated a translog input distance function incorporating three risk-related
indices as variables along with those defining the technology [214]. In an application
analyzing the effects of risk on productivity for Norwegian farms they found that
decreases in risk aversion, increases in optimism of risk perceptions, and increases
in risk management skills all lead to higher productivity.
The impact of price uncertainty has also received attention in the production
economics-oriented dairy farming literature. Melhim and Shumway (2011) used
the flexible mean-SD utility approach introduced by Saha (1997) to estimate risk
preferences and derive dual estimates for scope, product-specific, and multi-output
scale economies under price uncertainty for a sample of US dairy farms for 2000
[215, 216]. Dairy producers were found to be risk averse and the authors asserted
that ignoring these risk preferences would lead to underestimation of the effect
of scope economies for large farms, highlighting the importance of incorporating
risk preferences when estimating scope and scale economies. Pieralli, Hüttel and
Odening (2017) analyzed how milk price uncertainty and TE affect the decision of
farmers to abandon milk production in a real options model [217]. Using German
data, they found that abandonment was related to lower efficiency in milk production
but that greater price volatility reduced the probability of abandonment.
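For readers less familiar with the scope measure mentioned above, the standard deterministic definition compares the cost of producing two outputs separately with the cost of producing them jointly. The sketch below is a minimal illustration under assumptions made here: the cost function and its coefficients are hypothetical, and the calculation ignores the price uncertainty and risk preferences that are central to the study cited.

```python
def scope_economies(cost, y1, y2):
    """Degree of scope economies: [C(y1, 0) + C(0, y2) - C(y1, y2)] / C(y1, y2).

    Positive values indicate that joint production is cheaper than
    separate production (economies of scope).
    """
    joint = cost(y1, y2)
    separate = cost(y1, 0.0) + cost(0.0, y2)
    return (separate - joint) / joint


def example_cost(y1, y2):
    """Hypothetical cost function: a shared fixed cost plus a cost complementarity."""
    if y1 == 0 and y2 == 0:
        return 0.0
    return 10.0 + 2.0 * y1 + 3.0 * y2 - 0.1 * y1 * y2


# y1 could be milk and y2 another farm output; the quantities are illustrative.
sc = scope_economies(example_cost, 5.0, 4.0)
print(sc)
```

In this example the shared fixed cost and the negative interaction term both make joint production cheaper, so the measure is positive.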
Finally, the state-contingent production approach was used by Mallawaarachchi,
Nauges, Sanders and Quiggin (2017) to estimate a stochastic technology for a panel
dataset of Australian irrigated dairy farms [218]. Defining two states of nature
to reflect favorable and unfavorable conditions of water availability, they found
that the production technology was consistent with state-contingent technology.
Thus, when seasonal conditions were variable, milk production was relatively
stable, with farmers using state-allocable inputs to manage their exposure to
unfavorable conditions. This implies that producers responded to information by
reallocating inputs towards states of nature that appear more likely in the light of
new information.
While there is a sizeable literature on risk and uncertainty in agriculture in
general, applications to dairy have been surprisingly scarce. What little literature
there is seems to point to risk aversion being negatively related to productivity and
highlights that technological progress may not improve farmers’ welfare if they are
risk averse.

Sustainability

This section reviews dairy production economics papers that deal with sustainability.
A broad view of sustainability is considered, encompassing the effects of
weather and climate change, animal welfare, and environmental efficiency.
There is a small but growing literature on the effects of weather and climate
change on production performance. Mukherjee, Bravo-Ureta and De Vries (2013)
appears to be the first paper to use an SPF framework to study the effects of climatic
variables on dairy production [219]. Using the Battese and Coelli (1995) approach
they found a significant negative effect of heat stress on farm productivity and a
robust positive return to investing in fans combined with sprinklers as an adaptation
mechanism [59]. Key and Sneeringer (2014) also applied the Battese and Coelli
(1995) SPF approach to examine the relationship between heat stress, output losses,
and TE [220]. Using data for US conventional dairy farms they reported a negative
association between TE and expected heat stress levels. The influence of weather
on milk production was examined for a panel of Spanish dairy farms by Pérez-
Méndez, Roibás and Wall (2019) [221]. In their model, weather can affect milk
output through direct effects on cow and forage performance, and indirectly through
other inputs. They found that warm weather had a substantial positive impact on
milk output due to better forage production.
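For reference, the Battese and Coelli (1995) model used in several of these studies can be written as follows. The notation is introduced here for illustration; which climatic variables enter the frontier regressors x and which enter the inefficiency determinants z differs across the papers cited and is not specified here.

```latex
\ln y_{it} = x_{it}'\beta + v_{it} - u_{it}, \qquad
v_{it} \sim N(0, \sigma_v^{2}), \qquad
u_{it} \sim N^{+}(z_{it}'\delta, \sigma_u^{2}), \qquad
TE_{it} = \exp(-u_{it})
```

Here y is output for farm i in period t, v is symmetric noise, and u is a non-negative inefficiency term obtained by truncating a normal distribution at zero. A variable such as a heat stress index can therefore affect output by shifting the frontier (through x) or by changing technical efficiency (through z).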
The consequences of climatic conditions on farm productivity were assessed by
Qi, Bravo-Ureta and Cabrera (2015) using panel data for Wisconsin dairy farms
[222]. Their results showed that higher summer and autumn temperatures had a
negative effect on milk output. Warmer winters and springs had a positive effect
whereas higher precipitation led consistently to negative outcomes. The combined
climatic effects on milk output over the 17-year period studied were negative. In
a related paper, Njuki, Bravo-Ureta and Cabrera (2020) used a random parameters
SPF model and examined the differential effects of weather and climate on TFP
change [223]. Annual TFP growth was 2.16%, coming mainly from technological
progress.
Animal welfare has been a theme of increasing attention among consumers,
producers, and policymakers in recent years [224], and there is a small but growing
production economics literature dealing with milk production. Measures to improve
animal welfare generally come at a cost, and the cost-effectiveness of measures to
alleviate disease and heat stress has been addressed by several authors. For example,
Chi, Weersink, Van Leeuwen and Keefe (2002) designed a model to determine
optimal cost-minimizing strategies for managing four infectious diseases that can
significantly diminish milk production [225]. Gunn, Holly, Veith, Buda, Prasad,
Rotz, Soder and Stoner (2019) studied the cost-effectiveness of alternative heat
abatement strategies to reduce heat exposure in US dairy [226].
The following paragraphs focus on studies relating animal welfare to
productive performance (fundamentally TE), beginning with parametric studies
and then turning to nonparametric studies, which have been far more numerous.
Among the parametric studies, Lawson, Agger, Lund and Coelli (2004) used
stochastic frontier analysis to study the TE outcomes associated with various
illnesses and reproductive disorders for Danish dairy farms [227]. They found that
the adverse effects of reproductive disorders on milk production were compensated
by good managerial decisions in efficient farms. Atsbeha, Kristofersson, and
Rickertsen (2012) estimated a cost system to explore the connection between genetic
progress, animal health and fertility characteristics, and variable costs for a panel of
Norwegian dairy farms observed over the period 1999–2007 [170]. Their results
pointed to significant variable cost savings from genetic progress in production,
health, and fertility traits. Finally, Pérez-Méndez, Roibás and Wall (2020) assessed
how health, reproductive conditions, and genetic selection can affect TE and
profits using a stochastic frontier approach for a panel dataset of Spanish dairy
farms [172].
DEA methods feature far more prominently in the animal welfare literature.
Hansson and Öhlmér (2008) used DEA to evaluate the effects of animal breeding,
health, and feeding decisions on the efficiency of Swedish dairy farms and found that
breeding practices were important [228]. Hansson, Szczensa-Rundberg and Nielsen
(2011) analyzed the effects of preventative measures against mastitis on dairy farm
efficiency, again using DEA for Swedish dairy farms [229]. Barnes, Rutherford,
Langford and Haskell (2011) reported a negative association between lameness
and TE for a sample of British dairy farms [230]. Another strand of the literature
explores the possibility that farmers may deliberately trade off use value (i.e.,
priced) and nonuse value in the management of livestock. This so-called “rational
inefficiency” literature applied to dairy farming assumes that some farmers may
supply animal welfare beyond levels that would be optimal from a purely financial
perspective due to the existence of nonuse value. Hansson, Manevska-Tasevska and
Asmild (2018) provide a discussion of this literature and an application to Swedish
dairy farms using multidirectional efficiency analysis (MEA) [82].
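The idea behind the DEA scores used in these studies can be illustrated with the simplest possible case. With one input, one output, and constant returns to scale, the DEA efficiency score of a farm reduces to its output-input ratio divided by the best ratio in the sample. The sketch below rests on assumptions made here: the data are hypothetical and the function name is illustrative. Empirical studies such as those cited work with several inputs and outputs and solve a linear program for each farm.

```python
def dea_crs_single(inputs, outputs):
    """DEA technical efficiency under constant returns to scale
    for the one-input, one-output case.

    Each farm's score is its productivity ratio relative to the best
    observed ratio, so scores lie in (0, 1] and frontier farms score 1.
    """
    if not inputs or len(inputs) != len(outputs):
        raise ValueError("inputs and outputs must be non-empty and of equal length")
    if any(x <= 0 for x in inputs) or any(y < 0 for y in outputs):
        raise ValueError("inputs must be positive and outputs non-negative")
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    if best == 0:
        raise ValueError("at least one farm must produce positive output")
    return [r / best for r in ratios]


# Hypothetical data: feed use and milk output for three farms.
feed = [10.0, 20.0, 5.0]
milk = [20.0, 30.0, 10.0]
scores = dea_crs_single(feed, milk)
print(scores)
```

A second-stage comparison of such scores across farms with different health, breeding, or feeding practices is the kind of exercise carried out in the studies above, although the appropriate inference procedure for that second stage is a separate issue.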
The last group of papers reviewed in this section addresses environmental efficiency,
which can be defined in various ways. Reinhard, Lovell and Thijssen (1999)
is an early study that applied econometric procedures to examine environmental
efficiency in dairy farming [231]. Using data on Dutch dairy farms, they measured
environmental efficiency through an input that captured the excess nitrogen derived
from overuse of manure and chemical fertilizer. Reinhard and Thijssen (2000) used
similar data to estimate input-oriented TE and mean nitrogen efficiency, where the
latter was an aggregate measure incorporating the nitrogen content of all relevant
inputs and output [232]. Shortall and Barnes (2013) investigated the connection
between TE and environmental efficiency for Scottish farms where they defined
greenhouse gas (GHG) emissions as an environmental bad [233]. They reported
that farms that exhibited higher TE were larger or had higher output per cow
and performed at a higher efficiency regarding GHG emissions. The concept of
eco-efficiency to measure environmental performance was used by Pérez-Urdiales,
Oude Lansink and Wall (2016) to evaluate the environmental performance of a
sample of Spanish dairy farms [234]. Analyzing the role of farmers’ socioeconomic
characteristics and attitudes in explaining eco-efficiency, they found that younger
farmers, farmers who plan to continue operating in the foreseeable future, and those
who participate more in training schemes were more eco-efficient.
The consideration of environmental goods as outputs has been addressed by
several authors. Peerlings and Polman (2004) argued that agriculture produces both
commodity and noncommodity products, where the latter in some cases are public
goods [235]. They formulated a symmetric normalized quadratic profit function
system to study milk and wildlife/landscape services as joint products. Using Dutch
data, they found that wildlife/landscape services and milk are substitutes, and that
most farms in the sample exhibited diseconomies of scope and would benefit from
specializing. Gullstrand, De Blander and Waldo (2014), following Peerlings and
Polman (2004), examined the connection between providing for biodiversity and the
cost structure of dairy farms in Sweden [236]. They found milk and beef output to
be substitutes for biodiversity provision. In contrast, they reported a complementary
relationship between biodiversity provision and crop production. Areal, Tiffin and
Balcombe (2012) redefined TE by incorporating the provision of environmental
goods as one of the outputs [237]. Using data on dairy farms in England and
Wales, they determined that farm efficiency rankings change when provision of
environmental outputs by farms is incorporated in the analysis.
In two related studies, Njuki and Bravo-Ureta (2015) and Njuki, Bravo-Ureta and
Mukherjee (2016) applied directional distance functions along with the generalized
true random effects estimator to analyze the trade-off between milk production (a
good output) and an undesirable or bad output [238, 239]. Their results indicated
that policies designed to curb emissions would have heterogeneous effects across
space and farm size and thus need to be evaluated and implemented with care
to minimize adjustment costs. The possible effects of environmental regulation
on the location of dairy production in the USA were investigated by Isik (2004)
[240]. After developing a behavioral model of location and production, spatial
lag models were estimated using US county-level agricultural and economic data.
Differences in state environmental regulation were found to have contributed to
dairy farms relocating to states with less strict regulation. A policy implication is
that harmonization of environmental rules by increasing regulation in states with
laxer regimes would increase concentration of dairy production in states with more
stringent levels of existing regulation.
This section finishes with two studies relating compliance with environmental
standards and production. Samson, Gardebroek and Jongeneel (2017) set up a
microeconomic model to estimate separate production functions for milk, feed,
and roughage, as well as a manure production function derived from agronomic
standards [241]. The focus was on analyzing the costs and benefits of dairy farm size
growth in a context where policy changes such as milk quotas and environmental
constraints on manure production and handling needed to be accounted for. Zhang
(2018) examined the compliance of large confined animal facilities (CAFs) with
practice-based air quality regulation in California’s San Joaquin Valley and the associated
costs and economic performance on regulated farms [242]. The regulation
under study was found to have a nonsignificant effect on the total costs of producing
milk for the regulated farmers relative to the control group.
Summing up, the tools from production economics have proven useful in the
study of a wide range of topics related to the fundamentals of sustainability in
dairy farming. The studies have documented the impact of weather and climate on
productive performance and have shed light on the relation between animal welfare
and output. The production of bad outputs such as pollution has also been studied,
as well as the impact of regulatory reform. These studies constitute a valuable
body of work whose results provide well-informed advice and recommendations
to stakeholders on the increasingly important issue of sustainable production.

Concluding Remarks

As shown in this chapter, the production economics literature on dairy farming is
vast and has been used to address a wide variety of topics including: efficiency
and productivity; technology adoption; economies of size, scale, and scope; the
effects of government policies; the effect of risk and uncertainty; and issues relating
to sustainability including the weather and climate change, animal welfare, and
environmental efficiency. Given the importance and changing nature of the dairy
sector, this literature can be expected to continue to grow.
Dairy farming and milk consumption offer many societal benefits but there are
important challenges that will deserve attention from the research community. A
major growing concern has to do with the adverse environmental impact of dairy
production. Dairy farming offers the possibility to increase human welfare, but also
contributes to environmental degradation [2], the effects of which are far-reaching
and include GHG emissions and soil and water contamination. Consequently, a
significant challenge for the dairy sector will be to reduce its adverse environmental
impact while meeting the increasing demand for dairy products of societies around
the world. One would expect that the methodological arsenal that has been
developed in the production economics literature will be increasingly applied to
examine the environmental effects of dairy farming along with potential policy
responses. The role of production economists, working with scientists in various
other disciplines, will be paramount in the search for avenues to improve the
overall productivity of dairy farming while offering policymakers sound advice on
sustainable technologies and tools to deal with greater risk and uncertainty.
On the other hand, the sector not only contributes to environmental degradation
but is increasingly being affected by changing environmental conditions. Rising
average temperatures, droughts, and floods jeopardize forage and grazing, imposing
additional heat stress on animals and added burden from disease [243]. Another
important issue is animal welfare, with increasing pressure from consumers and
policymakers towards more stringent animal welfare standards. The challenges from
climate change and animal welfare coupled with globalization and a fast-changing
technological environment are putting additional pressure on the industry’s sustainability
[243].
The future evolution of input and output prices is also a matter of concern.
Increasing pressure from climate change on the availability of feed, as well as
higher demand for animal products and grains used for biofuels are all expected
to put upward pressure on feed and land prices [244]. In addition, farmers often face
highly volatile dairy prices, leading to a risky environment for investors and farm
operators.
Another challenge for the dairy sector concerns the structure of the industry.
In some developed countries dairy farms are highly specialized. While this can
provide cost advantages, dependence on a single commodity exposes producers
to economic shocks and the ensuing price and income variability. Furthermore, an
ageing farmer population generally and the small size of dairy farms found in many
developing countries also present challenges to dairy farming. In addition, farmers
often occupy a relatively fragile position in the food supply chain, with less
bargaining power than the processing or retail links in the value chain, thus making
them vulnerable to asymmetric market power.
These challenges provide fertile ground for future production economics research
in dairy. The conceptual frameworks and empirical analyses that have been reviewed
in this chapter show that production economists have several tools at their disposal
to carry out studies related to these challenges and thus contribute to policy analyses
and formulation.

Cross-References

Data Envelopment Analysis: A Nonparametric Method of Production Analysis
Dynamic Analysis of Production
Neoclassical Production Economics: An Introduction
Stochastic Frontier Analysis: Foundations and Advances I
Stochastic Frontier Analysis: Foundations and Advances II

Acknowledgments Boris E. Bravo-Ureta acknowledges partial support from USDA-NIFA Grant
#2016-67024-24760. Alan Wall is grateful for support from the Spanish Ministry of Economics,
Industry and Competitiveness grant ECO2017-85788-R.

References
1. Food and Agriculture Organization (FAO) (2016) The global dairy sector: facts.
https://www.fil-idf.org/wp-content/uploads/2016/12/FAO-Global-Facts-1.pdf. Accessed 4 June 2020
2. Food and Agriculture Organization (FAO) (2019) Climate change and the global dairy cattle
sector: the role of the dairy sector in a low-carbon future. FAO, Roma
3. Practices responsible for variations in physical requirements and economic costs of milk
production on Wisconsin dairy farms. Wisconsin Agricultural Experiment Station Bulletins 79
4. Ezekiel M, Rauchenstein E, Wells O (1932) Farmers’ response to price in the production of
market milk. United States Department of Agriculture, Bureau of Agricultural Economics,
Washington, DC
5. Cassels JM (1933) The nature of statistical supply curves. J Farm Econ 15:378–387.
https://doi.org/10.2307/1231011
6. Cassels JM, Malenbaum W (1938) Doubts about statistical supply analysis. J Farm Econ
20:448–461. https://doi.org/10.2307/1231146
7. Cassels J (1937) A study of fluid milk prices. Harvard University Press, Cambridge
8. Halvorson HW (1958) The response of milk production to price. J Farm Econ 40:1101–1113.
https://doi.org/10.2307/1234979
9. Mighell R, Black J (1951) Interregional competition in agriculture: with special reference to
dairy farming in the Lake States and New England. Harvard University Press, Cambridge
10. Changes in milk production in the United States, 1924–51. Pennsylvania Agricultural
Experiment Station Progress Report 97
11. Schuh C (1957) The supply of fluid milk in the Detroit milk shed as affected by cost of
production. Michigan State University, East Lansing
12. Cochrane W (1958) Farm prices: myth and reality. University of Minnesota Press
13. Cowling K, Gardner TW (1964) Milk supply response: an interbreed analysis. Statistician
14:255–266. https://doi.org/10.2307/2987047
14. Wipf L, Houck JP (1967) Milk supply response in the United States: an aggregate analysis.
Department of Applied Economics, University of Minnesota
15. Wilson RR, Thompson RG (1967) Demand, supply, and price relationships for the dairy
sector, post-World War II period. J Farm Econ 49:360–371. https://doi.org/10.2307/1237206
16. Chen D, Courtney R, Schmitz A (1972) A polynomial lag formulation of milk production
response. Am J Agric Econ 54:77–83. https://doi.org/10.2307/1237736
17. Prato AA (1973) Milk demand, supply, and price relationships, 1950–1968. Am J Agric Econ
55:217–222. https://doi.org/10.2307/1238441
18. Askari H, Cummings JT (1977) Estimating agricultural supply response with the Nerlove
model: a survey. Int Econ Rev 18:257–292. https://doi.org/10.2307/2525749
19. Faris JE, McPherson WW (1957) Application of linear programming in an analysis of
economic changes in farming. Rev Econ Stat 39:421–434. https://doi.org/10.2307/1927013
20. Cowling K, Baker C (1963) A polyperiod model for estimating the supply of milk. J Agric
Econ Res 15:15–23
21. Kelley PL, Knight DA (1965) Short-run elasticities of supply for milk. J Farm Econ 47:93–104.
https://doi.org/10.2307/1236156
22. Barker R (1965) Appropriate methods for estimating the short-run elasticity of supply for
milk. J Farm Econ 47:840–843. https://doi.org/10.2307/1236295
23. Jensen E (1940) Determining input-output relationships in milk production. J Farm Econ
22:249–258. https://doi.org/10.2307/1232051
24. Christensen RP (1941) Expectation and performance. Related to conservation and production
adjustments in the midwest dairy region. J Farm Econ 23:632–645.
https://doi.org/10.2307/1231562
25. Herrmann LF (1943) Diminishing returns in feeding commercial dairy herds. J Farm Econ
25:397–409. https://doi.org/10.2307/1231708
26. Heady EO (1951) A production function and marginal rates of substitution in the utilization
of feed resources by dairy cows. J Farm Econ 33:485–498. https://doi.org/10.2307/1233559
27. Redman JC (1952) Economic aspects of feeding for milk production. J Farm Econ 34:333–345.
https://doi.org/10.2307/1233691
28. Rauchenstein E (1953) Forage-grain substitution: its importance in the economics of milk
production. J Farm Econ 35:562–571. https://doi.org/10.2307/1233369
29. Heady EO, Schnittker J, Bloom S, Jacobsen NL (1956) Isoquants, isoclines and economic
predictions in dairy production. J Farm Econ 38:763–779. https://doi.org/10.2307/1234460
30. Heady EO (1957) An econometric investigation of the technology of agricultural production
functions. Econometrica 25:249–268. https://doi.org/10.2307/1910253
31. Heady E, Dillon J (1961) Agricultural production functions. Iowa State University Press,
Ames
32. Christensen L, Jorgenson D, Lau L (1971) Conjugate duality and the transcendental logarithmic
production function. Econometrica 39:255–256
33. Aune HJ, Day LM (1959) Determining the effect of size of herd and equipment on dairy chore
labor. J Farm Econ 41:569–583. https://doi.org/10.2307/1234879
34. Waugh FV (1951) The minimum-cost dairy feed (an application of “linear programming”).
J Farm Econ 33:299–310. https://doi.org/10.2307/1233608
35. Weeks EE (1964) Maximum profit dairy rations that include wheat. J Farm Econ 46:580–587.
https://doi.org/10.2307/1236441
36. Coffey JD, Toussaint WD (1963) Some economic aspects of free-choice feeding of dairy
cows. J Farm Econ 45:1213–1218. https://doi.org/10.2307/1236794
37. Heady EO, Madden JP, Jacobson NL, Freeman AE (1964) Milk production functions
incorporating variables for cow characteristics and environment. J Farm Econ 46:1–19.
https://doi.org/10.2307/1236469
38. Hoover LM, Kelley PL, Ward GM, Feyerherm AM, Chaddha R (1967) Economic relationships
of hay and concentrate consumption to milk production. J Farm Econ 49:64–78.
https://doi.org/10.2307/1237068
39. Paris Q, Malossini Davis F, Pilla AM, Romita A (1970) A note on milk production functions.
Am J Agric Econ 52:594–598. https://doi.org/10.2307/1237269
40. Farrell MJ (1957) The measurement of productive efficiency. J R Stat Soc Ser A (Gen)
120:253–281. https://doi.org/10.2307/2343100
41. Debreu G (1951) The coefficient of resource utilization. Econometrica 19:273–292
42. Koopmans T (1951) Analysis of production as an efficient combination of activities. In:
Koopmans T (ed) Activity analysis of production and allocation. J. Wiley, New York
43. Farrell MJ, Fieldhouse M (1962) Estimating efficient production functions under increasing
returns to scale. J R Stat Soc Ser A (Gen) 125:252–267. https://doi.org/10.2307/2982329
44. Fried H, Lovell C, Schmidt S (2008) Efficiency and productivity. In: The measurement of
productive efficiency and productivity growth. Oxford University Press, pp 3–91
45. Boles JN (1966) Efficiency squared – efficient computation of efficiency indexes. In:
Proceedings of the 39th Annual Meeting (Western Farm Economics Association) 39 (August
15–17, 1966), pp 137–142
46. Boles JN (1971) The 1130 Farrell efficiency system – multiple products, multiple factors.
Giannini Foundation of Agricultural Economics
47. Bressler RG (1966) The measurement of productive efficiency. In: Proceedings of the 39th
Annual Meeting (Western Farm Economics Association) 39 (August 15–17, 1966), pp 129–136
48. Seitz WD (1970) The measurement of efficiency relative to a frontier production function.
Am J Agric Econ 52:505–511. https://doi.org/10.2307/1237257
49. Charnes A, Cooper W, Rhodes E (1978) Measuring efficiency of decision making units. Eur
J Oper Res 2:429–444
50. Banker RD, Charnes A, Cooper WW (1984) Some models for estimating technical and
scale inefficiencies in data envelopment analysis. Manag Sci 30:1078–1092.
https://doi.org/10.1287/mnsc.30.9.1078
51. Färe R, Grosskopf S, Lovell C (1985) The measurement of efficiency of production.
Kluwer-Nijhoff, New York
52. Simar L, Wilson PW (2007) Estimation and inference in two-stage, semi-parametric models
of production processes. J Econ 136:31–64. https://doi.org/10.1016/j.jeconom.2005.07.009
53. Färe R, Grosskopf S, Margaritis D (2008) Efficiency and productivity: Malmquist and more.
In: Fried H, Lovell C, Schmidt S (eds) The measurement of productive efficiency and
productivity growth. Oxford University Press
54. Aigner D, Chu S (1968) On estimating the industry production function. Am Econ Rev
58:826–839
55. Timmer CP (1971) Using a probabilistic frontier production function to measure technical
efficiency. J Polit Econ 79:776–794. https://doi.org/10.1086/259787
56. Aigner D, Lovell CAK, Schmidt P (1977) Formulation and estimation of stochastic frontier
production function models. J Econ 6:21–37. https://doi.org/10.1016/0304-4076(77)90052-5
57. Meeusen W, van Den Broeck J (1977) Efficiency estimation from Cobb-Douglas production
functions with composed error. Int Econ Rev 18:435–444. https://doi.org/10.2307/2525757
58. Schmidt P, Sickles R (1984) Production frontiers and panel data. J Bus Econ Stat 2:367–374
59. Battese GE, Coelli TJ (1995) A model for technical inefficiency effects in a stochastic
frontier production function for panel data. Empir Econ 20:325–332.
https://doi.org/10.1007/BF01205442
60. Greene W (2005) Fixed and random effects in stochastic frontier models. J Prod Anal 23:7–32.
https://doi.org/10.1007/s11123-004-8545-1
61. Greene WH (2008) The econometric approach to efficiency analysis. In: Fried HO, Lovell
CAK, Schmidt SS (eds) The measurement of productive efficiency and productivity change.
Oxford University Press, pp 92–250
62. Tsionas EG, Kumbhakar SC (2014) Firm heterogeneity, persistent and transient technical
inefficiency: a generalized true random-effects model. J Appl Econ 29:110–132.
https://doi.org/10.1002/jae.2300
63. Kumbhakar SC, Wang H, Horncastle A (2015) A practitioner’s guide to stochastic frontier
analysis using Stata. Cambridge University Press, New York
64. Filippini M, Greene W (2016) Persistent and transient productive inefficiency: a maximum
simulated likelihood approach. J Prod Anal 45:187–196.
https://doi.org/10.1007/s11123-015-0446-y
65. Jondrow J, Knox Lovell CA, Materov IS, Schmidt P (1982) On the estimation of technical
inefficiency in the stochastic frontier production function model. J Econ 19:233–238.
https://doi.org/10.1016/0304-4076(82)90004-5
66. Battese G, Coelli T (1988) Prediction of firm-level technical efficiencies with a generalized
frontier production function and panel data. J Econ 38:387–399
67. Battese GE, Coelli TJ (1992) Frontier production functions, technical efficiency and panel
data: with application to paddy farmers in India. J Prod Anal 3:153–169
68. Simar L, Wilson P (2008) Statistical inference in nonparametric models: recent developments
and perspectives. In: Fried H, Lovell C, Schmidt S (eds) The measurement of productive
efficiency and productivity growth. Oxford University Press
69. Martin JP, Page JM (1983) The impact of subsidies on X-efficiency in LDC industry: theory
and an empirical test. Rev Econ Stat 65:608–617. https://doi.org/10.2307/1935929
70. Triebs TP, Kumbhakar SC (2018) Management in production: from unobserved to observed.
J Prod Anal 49:111–121. https://doi.org/10.1007/s11123-018-0526-x
71. Hoch I (1955) Estimation of production function parameters and testing for efficiency, report
of the Montreal meeting, September 10–13, 1954. Econometrica 23:324–337
72. Mundlak Y (1961) Empirical production function free of management bias. J Farm Econ
43:44–56
73. Mundlak Y, Hoch I (1965) Consequences of alternative specifications in estimation of Cobb-
Douglas production functions. Econometrica 33:814–828. https://doi.org/10.2307/1910358
74. Hoch I (1976) Returns to scale in farming: further evidence. Am J Agric Econ 58:745–749.
https://doi.org/10.2307/1238820
75. Grisley W, Mascarenhas J (1985) Operating cost efficiency on Pennsylvania dairy farms.
Northeast J Agric Resour Econ 14:88–95. https://doi.org/10.1017/S0899367X00000817
76. Weersink A, Turvey CG, Godah A (1990) Decomposition measures of technical efficiency
for Ontario dairy farms. Can J Agric Econ 38:439–456
77. Cloutier L, Rowley R (1993) Relative technical efficiency: data envelopment analysis and
Quebec’s dairy farms. Can J Agric Econ 41:169–176
78. Tauer L (1993) Short-run and long-run efficiencies of New York dairy farms. Agric Resour
Econ Rev 22:1–9
79. Fraser I, Cordina D (1999) An application of data envelopment analysis to irrigated dairy
farms in Northern Victoria, Australia. Agric Syst 59:267–282.
https://doi.org/10.1016/S0308-521X(99)00009-8
80. Asmild M, Hougaard JL, Kronborg D, Kvist HK (2003) Measuring inefficiency via potential
improvements. J Prod Anal 19:59–76
81. Mugera AW (2013) Measuring technical efficiency of dairy farms with imprecise data: a fuzzy
data envelopment analysis approach. Aust J Agric Resour Econ 57:501–520. https://doi.org/
10.1111/1467-8489.12008
82. Hansson H, Manevska-Tasevska G, Asmild M (2018) Rationalising inefficiency in agricultural
production – the case of Swedish dairy agriculture. Eur Rev Agric Econ:1–24. https://
doi.org/10.1093/erae/jby042
83. Müller J (1974) On sources of measured technical efficiency: the impact of information. Am
J Agric Econ 56:730–738. https://doi.org/10.2307/1239302
84. Bravo-Ureta BE (1986) Technical efficiency measures for dairy farms based on a probabilistic
frontier function model. Can J Agric Econ 34:399–415
85. Dawson P (1987) Farm-specific technical efficiency in the England and Wales dairy sector.
Eur Rev Agric Econ 14:383–394
86. Bravo-Ureta B, Rieger L (1990) Alternative production frontier methodologies and dairy farm
efficiency. J Agric Econ 41:215–226
87. Bravo-Ureta B, Rieger L (1991) Dairy farm efficiency measurement using stochastic frontiers
and neoclassical duality. Am J Agric Econ 73:421–428
88. Kumbhakar S, Biswas B, Bailey D (1989) A study of economic efficiency of Utah dairy
farmers: a system approach. Rev Econ Stat 71:595–604
89. Mbaga MD, Romain R, Larue B, Lebel L (2003) Assessing technical efficiency of Quebec
dairy farms. Can J Agric Econ 51:121–137. https://doi.org/10.1111/j.1744-7976.2003.
tb00169.x
90. Balcombe K, Fraser I, Kim JH (2006) Estimating technical efficiency of Australian dairy
farms using alternative frontier methodologies. Appl Econ 38:2221–2236. https://doi.org/10.
1080/00036840500427445
91. Heshmati A, Kumbhakar SC (1994) Farm heterogeneity and technical efficiency: some results
from Swedish dairy farms. J Prod Anal 5:45–61. https://doi.org/10.1007/BF01073597
92. Kumbhakar SC, Heshmati A (1995) Efficiency measurement in Swedish dairy farms: an
application of rotating panel data, 1976–88. Am J Agric Econ 77:660–674. https://doi.org/
10.2307/1243233
93. Cuesta RA (2000) A production model with firm-specific temporal variation in technical
inefficiency: with application to Spanish dairy farms. J Prod Anal 13:139–158
94. Kompas T, Che TN (2006) Technology choice and efficiency on Australian dairy farms. Aust
J Agric Resour Econ 50:65–83. https://doi.org/10.1111/j.1467-8489.2006.00314.x
95. Abdulai A, Tietje H (2007) Estimating technical efficiency under unobserved heterogeneity
with stochastic frontier models: application to northern German dairy farms. Eur Rev Agric
Econ 34:393–416. https://doi.org/10.1093/erae/jbm023
96. Mundlak Y (1978) On the pooling of time series and cross section data. Econometrica 46:69–
85. https://doi.org/10.2307/1913646
97. Kumbhakar SC, Tsionas EG (2008) Estimation of input-oriented technical efficiency using
a nonhomogeneous stochastic production frontier model. Agric Econ 38:99–108. https://doi.
org/10.1111/j.1574-0862.2007.00285.x
98. Rasmussen S (2010) Scale efficiency in Danish agriculture: an input distance-function
approach. Eur Rev Agric Econ 37:335–367. https://doi.org/10.1093/erae/jbq023
99. Tsionas EG, Kumbhakar SC, Malikov E (2015) Estimation of input distance functions: a
system approach. Am J Agric Econ 97:1478–1493. https://doi.org/10.1093/ajae/aav012
100. Silva E, Stefanou SE (2007) Dynamic efficiency measurement: theory and application. Am J
Agric Econ 89:398–419. https://doi.org/10.1111/j.1467-8276.2007.00999.x
101. Serra T, Lansink AO, Stefanou SE (2011) Measurement of dynamic efficiency: a directional
distance function parametric approach. Am J Agric Econ 93:756–767. https://doi.org/10.
1093/ajae/aaq175
102. Emvalomatis G, Stefanou SE, Lansink AO (2011) A reduced-form model for dynamic
efficiency measurement: application to dairy farms in Germany and the Netherlands. Am J
Agric Econ 93:161–174. https://doi.org/10.1093/ajae/aaq125
103. Skevas I, Emvalomatis G, Brümmer B (2018) Heterogeneity of long-run technical efficiency
of German dairy farms: a Bayesian approach. J Agric Econ 69:58–75. https://doi.org/10.1111/
1477-9552.12231
104. Battese GE, Rao DSP, O’Donnell CJ (2004) A metafrontier production function for estimation
of technical efficiencies and technology gaps for firms operating under different technologies.
J Prod Anal 21:91–103. https://doi.org/10.1023/B:PROD.0000012454.06094.29
105. O’Donnell CJ, Rao DSP, Battese GE (2008) Metafrontier frameworks for the study of firm-
level efficiencies and technology ratios. Empir Econ 34:231–255. https://doi.org/10.1007/
s00181-007-0119-4
106. Huang CJ, Huang T-H, Liu N-H (2014) A new approach to estimating the metafrontier
production function based on a stochastic frontier framework. J Prod Anal 42:241–254.
https://doi.org/10.1007/s11123-014-0402-2
107. Amsler C, O’Donnell CJ, Schmidt P (2017) Stochastic metafrontiers. Econ Rev 36:1007–
1020. https://doi.org/10.1080/07474938.2017.1308345
108. Amsler C, Chen YY, Schmidt P, Wang HJ (2020) A hierarchical panel data stochastic frontier
model for the estimation of stochastic metafrontiers. Empir Econ. https://doi.org/10.1007/
s00181-020-01929-w
109. Moreira VH, Bravo-Ureta BE (2010) Technical efficiency and metatechnology ratios for dairy
farms in three southern cone countries: a stochastic meta-frontier model. J Prod Anal 33:33–
45. https://doi.org/10.1007/s11123-009-0144-8
110. Latruffe L, Fogarasi J, Desjeux Y (2012) Efficiency, productivity and technology comparison
for farms in central and Western Europe: the case of field crop and dairy farming in Hungary
and France. Econ Syst 36:264–278. https://doi.org/10.1016/j.ecosys.2011.07.002
111. Jiang N, Sharp B (2015) Technical efficiency and technological gap of New Zealand dairy
farms: a stochastic meta-frontier model. J Prod Anal 44:39–49. https://doi.org/10.1007/
s11123-015-0429-z
112. Alem H, Lien G, Hardaker JB, Guttormsen A (2019) Regional differences in technical
efficiency and technological gap of Norwegian dairy farms: a stochastic meta-frontier model.
Appl Econ 51:409–421. https://doi.org/10.1080/00036846.2018.1502867
113. Orea L, Kumbhakar SC (2004) Efficiency measurement using a latent class stochastic frontier
model. Empir Econ 29:169–183. https://doi.org/10.1007/s00181-003-0184-2
114. Alvarez A, del Corral J (2010) Identifying different technologies using a latent class model:
extensive versus intensive dairy farms. Eur Rev Agric Econ 37:231–250. https://doi.org/10.
1093/erae/jbq015
115. Orea L, Perez JA, Roibas D (2015) Evaluating the double effect of land fragmentation on
technology choice and dairy farm productivity: a latent class model approach. Land Use
Policy 45:189–198. https://doi.org/10.1016/j.landusepol.2015.01.016
116. Bravo-Ureta BE, Solís D, Moreira López VH, Maripani JF, Thiam A, Rivas T (2007)
Technical efficiency in farming: a meta-regression analysis. J Prod Anal 27:57–72. https://
doi.org/10.1007/s11123-006-0025-3
117. Moreira López VH, Bravo-Ureta BE (2009) A study of dairy farm technical efficiency using
meta-regression: an international perspective. Chilean J Agric Res 69:214–223. https://doi.
org/10.4067/S0718-58392009000200011
118. Ogundari K (2014) The paradigm of agricultural efficiency and its implication on food
security in Africa: what does meta-analysis reveal? World Dev 64:690–702. https://doi.org/
10.1016/j.worlddev.2014.07.005
119. Caves DW, Christensen LR, Diewert WE (1982) Multilateral comparisons of output, input,
and productivity using superlative index numbers. Econ J 92:73–86. https://doi.org/10.2307/
2232257
120. Diewert WE (1976) Exact and superlative index numbers. J Econ 4:115–145. https://doi.org/
10.1016/0304-4076(76)90009-9
121. Caves DW, Christensen LR, Diewert WE (1982) The economic theory of index numbers and
the measurement of input, output, and productivity. Econometrica 50:1393–1414. https://doi.
org/10.2307/1913388
122. Nishimizu M, Page JM (1982) Total factor productivity growth, technological progress and
technical efficiency change: dimensions of productivity change in Yugoslavia, 1965–78. Econ
J 92:920–936. https://doi.org/10.2307/2232675
123. Färe R, Grosskopf S, Norris M, Zhang Z (1994) Productivity growth, technical progress, and
efficiency change in industrialized countries. Am Econ Rev 84:66–83
124. Kumbhakar SC, Lovell C (2000) Stochastic frontier analysis. Cambridge University Press
125. O’Donnell CJ (2016) Using information about technologies, markets and firm behaviour
to decompose a proper productivity index. J Econ 190:328–340. https://doi.org/10.1016/j.
jeconom.2015.06.009
126. O’Donnell C (2018) Productivity and efficiency analysis: an economic approach to measuring
and explaining managerial performance. Springer
127. Ahmad M, Bravo-Ureta BE (1995) An econometric decomposition of dairy output growth.
Am J Agric Econ 77:914–921. https://doi.org/10.2307/1243814
128. Weersink A, Tauer LW (1990) Regional and temporal impacts of technical change in the U.S.
dairy sector. Am J Agric Econ 72:923–934. https://doi.org/10.2307/1242624
129. Weersink A, Tauer LW (1991) Causality between dairy farm size and productivity. Am J Agric
Econ 73:1138–1145. https://doi.org/10.2307/1242442
130. Brümmer B, Glauben T, Thijssen G (2002) Decomposition of productivity growth using
distance functions: the case of dairy farms in three European countries. Am J Agric Econ
84:628–644. https://doi.org/10.1111/1467-8276.00324
131. Newman C, Matthews A (2006) The productivity performance of Irish dairy farms 1984–
2000: a multiple output distance function approach. J Prod Anal 26:191–205. https://doi.org/
10.1007/s11123-006-0013-7
132. Emvalomatis G (2012) Productivity growth in German dairy farming using a flexible
modelling approach. J Agric Econ 63:83–101. https://
doi.org/10.1111/j.1477-9552.2011.00312.x
133. Cechura L, Grau A, Hockmann H, Kroupova Z, Levkovych I (2017) Catching up or falling
behind in European agriculture: the case of milk production. J Agric Econ 68:206–227
134. Sipiläinen T, Kumbhakar SC, Lien G (2014) Performance of dairy farms in Finland and
Norway from 1991 to 2008. Eur Rev Agric Econ 41:63–86. https://doi.org/10.1093/erae/
jbt012
135. Sauer J, Latacz-Lohmann U (2015) Investment, technical change and efficiency: empirical
evidence from German dairy production. Eur Rev Agric Econ 42:151–175. https://doi.org/10.
1093/erae/jbu015
136. Singbo A, Larue B (2016) Scale economies, technical efficiency, and the sources of total
factor productivity growth of Quebec dairy farms. Can J Agric Econ/Revue canadienne
d’agroeconomie 64:339–363. https://doi.org/10.1111/cjag.12077
137. Moreira VH, Bravo-Ureta BE (2016) Total factor productivity change in dairy farming:
empirical evidence from southern Chile. J Dairy Sci 99:8356–8364. https://doi.org/10.3168/
jds.2016-11055
138. Tauer L (1998) Productivity of New York dairy farms measured by nonparametric Malmquist
indices. J Agric Econ 49:234–249
139. Jang H, Du X (2019) Evolving techniques in production function identification illustrated in
the case of the US dairy. Appl Econ 51:1463–1477
140. Chambers RG (1988) Applied production analysis: a dual approach. Cambridge University
Press
141. Panzar JC, Willig RD (1981) Economies of scope. Am Econ Rev 71:268–272
142. Orea L, Roibás D, Wall A (2004) Choosing the technical efficiency orientation to analyze
firms’ technology: a model selection test approach. J Prod Anal 22:51–71. https://doi.org/10.
1023/B:PROD.0000034691.84524.b2
143. Lund M, Jacobsen BH, Hansen LCE (1993) Reducing non-allocative costs on Danish dairy
farms: application of non-parametric methods. Eur Rev Agric Econ 20:327–341. https://doi.
org/10.1093/erae/20.3.327
144. Cocchi H, Bravo-Ureta BE, Cooke S (1998) A growth accounting analysis of cost efficiency
in milk production for six northern states in the United States. Can J Agric Econ 46:287–296.
https://doi.org/10.1111/j.1744-7976.1998.tb00086.x
145. Maietta OW (2000) The decomposition of cost inefficiency into technical and allocative
components with panel data of Italian dairy farms. Eur Rev Agric Econ 27:473–495
146. Hailu G, Jeffrey S, Unterschultz J (2005) Cost efficiency for Alberta and Ontario dairy farms:
an interregional comparison. Can J Agric Econ 53:141–160. https://doi.org/10.1111/j.1744-
7976.2005.00314.x
147. Alvarez A, del Corral J, Solís D, Pérez JA (2008) Does intensification improve the economic
efficiency of dairy farms? J Dairy Sci 91:3693–3698. https://doi.org/10.3168/jds.2008-1123
148. Carletto C, Savastano S, Zezza A (2013) Fact or artifact: the impact of measurement errors
on the farm size–productivity relationship. J Dev Econ 103:254–261. https://doi.org/10.1016/
j.jdeveco.2013.03.004
149. Rada NE, Fuglie KO (2019) New perspectives on farm size and productivity. Food Policy
84:147–152. https://doi.org/10.1016/j.foodpol.2018.03.015
150. Matulich SC (1978) Efficiencies in large-scale dairying: incentives for future structural
change. Am J Agric Econ 60:642–647. https://doi.org/10.2307/1240249
151. Dawson P, Hubbard L (1987) Management and size economies in the England and Wales
dairy sector. J Agric Econ 38:27–38
152. Alvarez A, Arias C (2003) Diseconomies of size with fixed managerial ability. Am J Agric
Econ 85:134–142. https://doi.org/10.1111/1467-8276.00108
153. Mukhtar SM, Dawson PJ (1990) Herd size and unit costs of production in the England
and Wales dairy sector. J Agric Econ 41:9–20. https://doi.org/10.1111/j.1477-9552.1990.
tb00615.x
154. Moschini G (1990) Nonparametric and semiparametric estimation: an analysis of multiproduct
returns to scale. Am J Agric Econ 72:589–596. https://doi.org/10.2307/1243028
155. Løyland K, Ringstad V (2001) Gains and structural effects of exploiting scale-economies in
Norwegian dairy production. Agric Econ 24:149–166. https://doi.org/10.1111/j.1574-0862.
2001.tb00020.x
156. Mosheim R, Lovell CAK (2009) Scale economies and inefficiency of U.S. dairy farms. Am J
Agric Econ 91:777–794. https://doi.org/10.1111/j.1467-8276.2009.01269.x
157. Fernandez-Cornejo J, Gempesaw CM, Elterich JG, Stefanou SE (1992) Dynamic measures of
scope and scale economies: an application to German agriculture. Am J Agric Econ 74:329–
342. https://doi.org/10.2307/1242487
158. Wieck C, Heckelei T (2007) Determinants, differentiation, and development of short-term
marginal costs in dairy production: an empirical analysis for selected regions of the EU. Agric
Econ 36:203–220. https://doi.org/10.1111/j.1574-0862.2007.00199.x
159. Alem H, Lien G, Kumbhakar SC, Hardaker JB (2019) Are diversification and structural
change good policy? An empirical analysis of Norwegian agriculture. J Agric Appl Econ
51:1–26. https://doi.org/10.1017/aae.2018.17
160. Silva E, Stefanou S (2003) Nonparametric dynamic production analysis and the theory of
cost. J Prod Anal 19:5–32
161. Tauer LW (1995) Do New York dairy farmers maximize profits or minimize costs? Am J
Agric Econ 77:421–429. https://doi.org/10.2307/1243551
162. Stefanou SE, Saxena S (1988) Education, experience, and allocative efficiency: a dual
approach. Am J Agric Econ 70:338–345. https://doi.org/10.2307/1242074
163. Griliches Z (1957) Hybrid corn: an exploration in the economics of technological change.
Econometrica 25:501–522. https://doi.org/10.2307/1905380
164. Giesen GWJ, Oskam AJ, Berentsen PBM (1989) Expected economic effects of BST in the
Netherlands. Agric Econ 3:231–248
165. Marion BW, Wills RL (1990) A prospective assessment of the impacts of bovine somatotropin:
a case study of Wisconsin. Am J Agric Econ 72:326–336. https://doi.org/10.2307/
1242336
166. Stefanides Z, Tauer LW (1999) The empirical impact of bovine somatotropin on a group of
New York dairy farms. Am J Agric Econ 81:95–102. https://doi.org/10.2307/1244453
167. Foltz JD, Chang H (2002) The adoption and profitability of rbST on Connecticut dairy farms.
Am J Agric Econ 84:1021–1032. https://doi.org/10.1111/1467-8276.00364
168. Tomaszewski MA, Asseldonk MAPM, Dijkhuizen AA, Huirne RBM (2000) Determining
farm effects attributable to the introduction and use of a dairy management information
system in the Netherlands. Agric Econ 23:79–86. https://doi.org/10.1111/j.1574-0862.2000.
tb00085.x
169. Roibas D, Alvarez A (2010) Impact of genetic progress on the profits of dairy farmers. J Dairy
Sci 93:4366–4373. https://doi.org/10.3168/jds.2010-3135
170. Atsbeha DM, Kristofersson D, Rickertsen K (2012) Animal breeding and productivity growth
of dairy farms. Am J Agric Econ 94:996–1012. https://doi.org/10.1093/ajae/aas033
171. Whitt CE, Tauer LW, Huson H (2019) Bull efficiency using dairy genetic traits. PLoS One
14:1–14. https://doi.org/10.1371/journal.pone.0223436
172. Pérez-Méndez JA, Roibás D, Wall A (2020) Somatic cell counts, reproduction indicators, and
technical efficiency in milk production: a stochastic frontier analysis for Spanish dairy farms.
J Dairy Sci 103:7141–7154. https://doi.org/10.3168/jds.2019-17146
173. Kumbhakar SC, Tsionas EG, Sipiläinen T (2009) Joint estimation of technology choice and
technical efficiency: an application to organic and conventional dairy farming. J Prod Anal
31:151–161. https://doi.org/10.1007/s11123-008-0081-y
174. Mayen CD, Balagtas JV, Alexander CE (2010) Technology adoption and technical efficiency:
organic and conventional dairy farms in the United States. Am J Agric Econ 92:181–195.
https://doi.org/10.1093/ajae/aap018
175. Nehring R, Gillespie J, Sandretto C, Hallahan C (2009) Small U.S. dairy farms: can they
compete? Agric Econ 40:817–825. https://doi.org/10.1111/j.1574-0862.2009.00418.x
176. Ma W, Bicknell K, Renwick A (2019) Feed use intensification and technical efficiency of
dairy farms in New Zealand. Aust J Agric Resour Econ 63:20–38. https://doi.org/10.1111/
1467-8489.12283
177. Levins RA (1982) Price specification in milk supply response analysis. Am J Agric Econ
64:286–288. https://doi.org/10.2307/1241137
178. Nerlove M (1956) Estimates of the elasticities of supply of selected agricultural commodities.
J Farm Econ 38:496–509. https://doi.org/10.2307/1234389
179. Buckwell A (1984) Herd structure and milk supply response. J Agric Econ 35:341–353
180. Kislev Y, Peterson W (1982) Prices, technology, and farm size. J Polit Econ 90:578–595.
https://doi.org/10.1086/261075
181. Howard WH, Shumway CR (1988) Dynamic adjustment in the U.S. dairy industry. Am J
Agric Econ 70:837–847. https://doi.org/10.2307/1241925
182. Chavas J, Klemme RM (1986) Aggregate milk supply response and investment behavior on
U.S. dairy farms. Am J Agric Econ 68:55–66. https://doi.org/10.2307/1241649
183. Chavas J-P, Kraus AF (1990) Population dynamics and milk supply response in the U.S. lake
states. J Agric Econ 41:75–84. https://doi.org/10.1111/j.1477-9552.1990.tb00620.x
184. Adelaja AO (1991) Price changes, supply elasticities, industry organization, and dairy output
distribution. Am J Agric Econ 73:89–102. https://doi.org/10.2307/1242886
185. Bozic M, Kanter CA, Gould BW (2012) Tracing the evolution of the aggregate U.S. milk
supply elasticity using a herd dynamics model. Agric Econ 43:515–530. https://doi.org/10.
1111/j.1574-0862.2012.00600.x
186. Thijssen G (1994) Supply response and dynamic factor demand of Dutch dairy farms. Eur
Rev Agric Econ 21:241–258. https://doi.org/10.1093/erae/21.2.241
187. Quiroga R, Bravo-Ureta B (1992) Short- and long-run adjustments in dairy production: a
profit function analysis. Appl Econ 24:607–616
188. Blayney DP, Mittelhammer RC (1990) Decomposition of milk supply response into technology
and price-induced effects. Am J Agric Econ 72:864–872. https://doi.org/10.2307/
1242618
189. Munshi K, Parikh K (1994) Milk supply behavior in India: data integration, estimation and
implications for dairy development. J Dev Econ 45:201–223
190. Stefanou SE, Fernandez-Cornejo J, Gempesaw CM, Elterich JG (1992) Dynamic structure of
production under a quota: the case of milk production in the Federal Republic of Germany.
Eur Rev Agric Econ 19:283–299. https://doi.org/10.1093/erae/19.3.283
191. Boots M, Lansink AO, Peerlings J (1997) Efficiency loss due to distortions in Dutch milk
quota trade. Eur Rev Agric Econ 24:31–46. https://doi.org/10.1093/erae/24.1.31
192. Sauer J (2010) Deregulation and dairy production systems: a Bayesian distance function
approach. J Prod Anal 34:213–237. https://doi.org/10.1007/s11123-010-0179-x
193. Colman D, Burton M, Rigby D, Franks J (2002) Structural change and policy reform in
the UK dairy sector. J Agric Econ 53:645–663. https://doi.org/10.1111/j.1477-9552.2002.
tb00042.x
194. Pierani P, Rizzi PL (2003) Technology and efficiency in a panel of Italian dairy farms: an
SGM restricted cost function approach. Agric Econ 29:195–209. https://doi.org/10.1111/j.
1574-0862.2003.tb00157.x
195. Alvarez A, Arias C, Orea L (2006) Explaining differences in milk quota values: the role
of economic efficiency. Am J Agric Econ 88:182–193. https://doi.org/10.1111/j.1467-8276.
2006.00846.x
196. Kumbhakar SC, Lien G, Flaten O, Tveterås R (2008) Impacts of Norwegian milk quotas on
output growth: a modified distance function approach. J Agric Econ 59:350–369. https://doi.
org/10.1111/j.1477-9552.2008.00154.x
197. Ang F, Lansink AO (2018) Decomposing dynamic profit inefficiency of Belgian dairy farms.
Eur Rev Agric Econ 45:81–99
198. Henry de Frahan B, Baudry A, De Blander R, Polome P, Howitt R (2011) Dairy farms without
quotas in Belgium: estimation and simulation with a flexible cost function. Eur Rev Agric
Econ 38:469–495. https://doi.org/10.1093/erae/jbr013
199. Latruffe L, Bravo-Ureta BE, Carpentier A, Desjeux Y, Moreira VH (2017) Subsidies and
technical efficiency in agriculture: evidence from European dairy farms. Am J Agric Econ
99:783–799. https://doi.org/10.1093/ajae/aaw077
200. Frick F, Sauer J (2018) Deregulation and productivity: empirical evidence on dairy production.
Am J Agric Econ 100:354–378
201. Bezlepkina IV, Lansink AGJMO, Oskam AJ (2005) Effects of subsidies in Russian dairy
farming. Agric Econ 33:277–288. https://doi.org/10.1111/j.1574-0864.2005.00067.x
202. Slade P, Hailu G (2016) Efficiency and regulation: a comparison of dairy farms in Ontario
and New York state. J Prod Anal 45:103–115. https://doi.org/10.1007/s11123-015-0456-9
203. Larue B, Singbo A, Pouliot S (2017) Production rigidity, input lumpiness, efficiency, and
the technological hurdle of Quebec dairy farms. Can J Agric Econ/Revue canadienne
d’agroeconomie 65:613–641. https://doi.org/10.1111/cjag.12156
204. Komaki T, Penzer J (2005) Estimation of time-varying price elasticity in 1970–1997 Japanese
raw milk supply by structural time-series model. Agric Econ 32:1–14. https://doi.org/10.1111/
j.0169-5150.2005.00001.x
205. Moschini G, Hennessy D (2001) Uncertainty, risk aversion, and risk management for
agricultural producers. In: Gardner B, Rausser G (eds) Handbook of agricultural economics.
Elsevier Science, Amsterdam
206. Finger R, Dalhaus T, Allendorf J, Hirsch S (2018) Determinants of downside risk exposure
of dairy farms. Eur Rev Agric Econ 45:641–674. https://doi.org/10.1093/erae/jby012
207. Antle JM, Goodger WJ (1984) Measuring stochastic technology: the case of Tulare milk
production. Am J Agric Econ 66:342–350. https://doi.org/10.2307/1240801
208. Just R, Pope R (1978) Stochastic specification of production functions and economic
implications. J Econ 7:67–86
209. Antle J (1983) Testing the stochastic structure of production: a flexible moment-based
approach. J Bus Econ Stat 1:192–201
210. Chambers RG, Quiggin J (2000) Uncertainty, production, choice, and agency: the state-
contingent approach. Cambridge University Press
211. O’Donnell CJ, Griffiths WE (2006) Estimating state-contingent production frontiers. Am J
Agric Econ 88:249–266. https://doi.org/10.1111/j.1467-8276.2006.00851.x
212. Tveteras R, Flaten O, Lien G (2011) Production risk in multi-output industries: estimates
from Norwegian dairy farms. Appl Econ 43:4403–4414. https://doi.org/10.1080/00036846.
2010.491461
213. Orea L, Wall A (2012) Productivity and producer welfare in the presence of production risk.
J Agric Econ 63:102–118. https://doi.org/10.1111/j.1477-
9552.2011.00320.x
214. Lien G, Kumbhakar S, Hardaker J (2017) Accounting for risk in productivity analysis: an
application to Norwegian dairy farming. J Prod Anal 47:247–257
215. Melhim A, Shumway CR (2011) Enterprise diversification in US dairy: impact of risk
preferences on scale and scope economies. Appl Econ 43:3849–3862. https://doi.org/10.1080/
00036841003724478
216. Saha A (1997) Risk preference estimation in the nonlinear mean standard deviation approach.
Econ Inq 35:770–782. https://doi.org/10.1111/j.1465-7295.1997.tb01963.x
217. Pieralli S, Hüttel S, Odening M (2017) Abandonment of milk production under uncertainty
and inefficiency: the case of western German farms. Eur Rev Agric Econ 44:425–454
218. Mallawaarachchi T, Nauges C, Sanders O, Quiggin J (2017) State-contingent analysis of
farmers’ response to weather variability: irrigated dairy farming in the Murray Valley,
Australia. Aust J Agric Resour Econ 61:36–55. https://doi.org/10.1111/1467-8489.12193
219. Mukherjee D, Bravo-Ureta BE, De Vries A (2013) Dairy productivity and climatic conditions:
econometric evidence from South-eastern United States. Aust J Agric Resour Econ 57:123–140.
https://doi.org/10.1111/j.1467-8489.
2012.00603.x
220. Key N, Sneeringer S (2014) Potential effects of climate change on the productivity of U.S.
dairies. Am J Agric Econ 96:1136–1156. https://doi.org/10.1093/ajae/aau002
221. Perez-Mendez JA, Roibas D, Wall A (2019) The influence of weather conditions on dairy
production. Agric Econ 50:165–175
222. Qi L, Bravo-Ureta BE, Cabrera VE (2015) From cold to hot: climatic effects and productivity
in Wisconsin dairy farms. J Dairy Sci 98:8664–8677. https://doi.org/10.3168/jds.2015-9536
223. Njuki E, Bravo-Ureta BE, Cabrera VE (2020) Climatic effects and total factor productivity:
econometric evidence for Wisconsin dairy farms. Eur Rev Agric Econ 47:1276–1301. https://
doi.org/10.1093/erae/jbz046
224. Dawkins MS (2017) Animal welfare and efficient farming: is conflict inevitable? Anim Prod
Sci 57:201–208. https://doi.org/10.1071/AN15383
225. Chi J, Weersink A, Van Leeuwen JA, Keefe GP (2002) The economics of controlling
infectious diseases on dairy farms. Can J Agric Econ 50:237–256. https://doi.org/10.1111/
j.1744-7976.2002.tb00335.x
226. Gunn KM, Holly MA, Veith TL, Buda AR, Prasad R, Rotz CA, Soder KJ, Stoner AMK (2019)
Projected heat stress challenges and abatement opportunities for U.S. milk production. PLoS
One 14:1–21. https://doi.org/10.1371/journal.pone.0214665
227. Lawson LG, Agger JF, Lund M, Coelli T (2004) Lameness, metabolic and digestive disorders,
and technical efficiency in Danish dairy herds: a stochastic frontier production function
approach. Livest Prod Sci 91:157–172. https://doi.org/10.1016/j.livprodsci.2004.07.016
228. Hansson H, Öhlmér B (2008) The effect of operational managerial practices on economic,
technical and allocative efficiency at Swedish dairy farms. Livest Sci 118:34–43. https://doi.
org/10.1016/j.livsci.2008.01.013
229. Hansson H, Szczensa-Rundberg M, Nielsen C (2011) Which preventive measures against
mastitis can increase the technical efficiency of dairy farms? Animal 5:632–640. https://doi.
org/10.1017/S1751731110002247
230. Barnes AP, Rutherford KMD, Langford FM, Haskell MJ (2011) The effect of lameness
prevalence on technical efficiency at the dairy farm level: an adjusted data envelopment
analysis approach. J Dairy Sci 94:5449–5457. https://doi.org/10.3168/jds.2011-4262
231. Reinhard S, Lovell CAK, Thijssen G (1999) Econometric estimation of technical and
environmental efficiency: an application to Dutch dairy farms. Am J Agric Econ 81:44–60.
https://doi.org/10.2307/1244449
232. Reinhard S, Thijssen G (2000) Nitrogen efficiency of Dutch dairy farms: a shadow cost system
approach. Eur Rev Agric Econ 27:167–186
233. Shortall OK, Barnes AP (2013) Greenhouse gas emissions and the technical efficiency of
dairy farmers. Ecol Indic 29:478–488. https://doi.org/10.1016/j.ecolind.2013.01.022
234. Pérez Urdiales M, Lansink AO, Wall A (2016) Eco-efficiency among dairy farmers: the
importance of socio-economic characteristics and farmer attitudes. Environ Resour Econ
64:559–574. https://doi.org/10.1007/s10640-015-9885-1
235. Peerlings J, Polman N (2004) Wildlife and landscape services production in Dutch dairy
farming; jointness and transaction costs. Eur Rev Agric Econ 31:427–449. https://doi.org/
10.1093/erae/31.4.427
236. Gullstrand J, De Blander R, Waldo S (2014) The influence of biodiversity provision on the
cost structure of Swedish dairy farming. J Agric Econ 65:87–111. https://doi.org/10.1111/
1477-9552.12028
237. Areal FJ, Tiffin R, Balcombe KG (2012) Provision of environmental output within a multi-
output distance function approach. Ecol Econ 78:47–54. https://doi.org/10.1016/j.ecolecon.
2012.03.011
238. Njuki E, Bravo-Ureta BE (2015) The economic costs of environmental regulation in U.S.
dairy farming: a directional distance function approach. Am J Agric Econ 97:1087–1106.
https://doi.org/10.1093/ajae/aav007
239. Njuki E, Bravo-Ureta BE, Mukherjee D (2016) The good and the bad: environmental
efficiency in Northeastern U.S. dairy farming. Agric Resour Econ Rev 45:22–43. https://doi.
org/10.1017/age.2016.1
240. Isik M (2004) Environmental regulation and the spatial structure of the U.S. dairy sector. Am
J Agric Econ 86:949–962. https://doi.org/10.1111/j.0002-9092.2004.00645.x
241. Samson GS, Gardebroek C, Jongeneel RA (2017) Analysing trade-offs between milk, feed
and manure production on Dutch dairy farms. Eur Rev Agric Econ 44:475–498. https://doi.
org/10.1093/erae/jbw025
242. Zhang W (2018) Costs of a practice-based air quality regulation: dairy farms in the San
Joaquin Valley. Am J Agric Econ 100:762–785. https://doi.org/10.1093/ajae/aax085
243. von Keyserlingk MAG, Martin NP, Kebreab E, Knowlton KF, Grant RJ, Stephenson M,
Sniffen CJ, Harner JP, Wright AD, Smith SI (2013) Invited review: sustainability of the US
dairy industry. J Dairy Sci 96:5405–5425. https://doi.org/10.3168/jds.2012-6354
244. Hemme T, Otte J (2010) Status and prospects for smallholder milk production: a global
perspective. Food and Agriculture Organization of the United Nations (FAO)
32 Performance Evaluation of Mutual Funds Using Frontier Methods
Subrata Sarkar

Contents
Introduction
  Background
  Brief Methodology of DEA and SFA
DEA and Performance Evaluation of Mutual Funds
  Early Attempts
  Introducing Additional Variables in the DEA Models
  Network DEA
  Introducing Stochasticity into DEA Models
Other Nonparametric and Partial Frontier Measures
  Other Nonparametric Measures
  Partial Frontier Measures
Stochastic Frontier Analysis
Conclusion
References

Abstract

The mutual funds industry in an economy plays an important role in channeling
savings into investments. Given the limited amount of savings, economic
efficiency requires that such savings be allocated to firms with the highest
return on investments, adjusted for risk. Performance evaluation of mutual funds,
therefore, becomes important. In this chapter, we survey the extant literature on
this subject, outlining the important works that have been carried out to date.

I thank Koustuv Saha for excellent research assistance in preparation of this manuscript. The usual
disclaimer applies.

S. Sarkar
Indira Gandhi Institute of Development Research, Mumbai, India
e-mail: [email protected]

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_32

Keywords

Performance evaluation · Mutual funds · Data envelopment analysis ·
Stochastic frontier analysis

Introduction

Background

The mutual funds industry in an economy plays an important role in channeling
savings into investments. Given the limited amount of savings, economic
efficiency requires that such savings be allocated to firms with the highest return on
investments, adjusted for risk. Performance evaluation of mutual funds, therefore,
becomes important.
Early attempts at appraising the performance of managed portfolios were
confined to comparing mutual fund returns against the returns from
a randomly chosen unmanaged “market” portfolio or the return of a weighted
average of many market portfolios. However, since higher expected returns are
invariably associated with higher risks, the average return of a portfolio
proved to be an incomplete yardstick for evaluating mutual fund
performance. Therefore, the second-generation measures introduced in
the 1960s evaluated the performance of managed funds, such as mutual funds, on the
basis of returns maximization subject to the risks involved. However, the lack of
consensus in the literature about the appropriate measure of risk led to a similar lack of
consensus about the appropriate risk-adjusted measure for evaluating mutual fund
performance [40]. The most popular measures were (i) the Sharpe
ratio, which measures the excess return earned by the portfolio per unit of its total risk,
(ii) the Treynor ratio, which measures the excess return earned by the portfolio per
unit of nondiversifiable market risk, and (iii) Jensen’s alpha, which measures
the excess return earned by a portfolio over an unmanaged portfolio with identical
market risk [25]. These indices are still much in use in both academia and the mutual
fund industry. It is important to understand the theoretical underpinnings of these
three measures to set the context for the evolution of frontier methods in assessing
the performance of mutual funds.
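The three second-generation measures described above can be illustrated with a minimal sketch. The function name, the use of sample moments, the constant risk-free rate, and the return series are all assumptions made for illustration; they are not taken from the studies cited in this chapter.

```python
import numpy as np

def risk_adjusted_measures(fund, market, risk_free=0.0):
    """Sample estimates of the Sharpe ratio, Treynor ratio and Jensen's alpha.

    fund, market: per-period returns of the fund and of a market proxy.
    risk_free: per-period risk-free rate (assumed constant here).
    """
    excess_fund = np.asarray(fund, dtype=float) - risk_free
    excess_market = np.asarray(market, dtype=float) - risk_free
    # Beta: nondiversifiable (market) risk of the fund.
    beta = (np.cov(excess_fund, excess_market, ddof=1)[0, 1]
            / np.var(excess_market, ddof=1))
    # Sharpe: mean excess return per unit of total risk.
    sharpe = excess_fund.mean() / excess_fund.std(ddof=1)
    # Treynor: mean excess return per unit of market risk.
    treynor = excess_fund.mean() / beta
    # Jensen's alpha: return over an unmanaged portfolio with the same beta.
    alpha = excess_fund.mean() - beta * excess_market.mean()
    return {"sharpe": sharpe, "treynor": treynor,
            "jensen_alpha": alpha, "beta": beta}

# Hypothetical return series: the fund simply doubles the market exposure.
market_returns = [0.01, 0.03, -0.01, 0.05]
fund_returns = [0.02, 0.06, -0.02, 0.10]
measures = risk_adjusted_measures(fund_returns, market_returns)
```

In this constructed case the fund is a leveraged copy of the market, so its beta is two and its alpha is zero, which shows that higher raw returns alone do not imply superior risk-adjusted performance.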
The Treynor ratio and Jensen’s alpha are both based on the capital asset pricing
model (CAPM). These methods appraise the performance of individual mutual
funds against a benchmark portfolio, which is the efficient market portfolio. Roll
[47] criticized this method of portfolio evaluation on three counts, namely (i)
these measures require the identification of a market portfolio which includes all
assets, both marketable and nonmarketable, and is, therefore, impossible to observe,
(ii) these measures use the market portfolio as the benchmark, and, therefore,
require a test of the joint hypotheses that the market portfolio is mean-variance
efficient and that CAPM is valid, which is impossible to carry out due to the
problem of measuring the true market portfolio, and finally (iii) if an ex-post
inefficient benchmark is used for performance evaluation, then any ranking is
possible depending on the benchmark used and slight changes to the benchmark
can completely reverse the performance rankings. Sharpe’s ratio circumvents these
problems as it does not necessarily use the market portfolio as the benchmark, but
nevertheless runs into the problem of arriving at a consensus of what the benchmark
portfolio should be.
Notwithstanding the theoretical challenges faced by these second-generation
measures for evaluating mutual fund performance, all these measures had a common
shortcoming, namely, that in evaluating portfolio performance exclusively in the
risk-return space, they systematically ignored the role of the fund manager in
achieving returns while minimizing risks. Put differently, these measures ignored
managerial effort as a crucial input in return discovery. If collection and implemen-
tation of information are costly, better-informed managers should be able to generate
higher returns than less-informed managers. In addition, portfolio managers with
superior abilities may be able to charge higher fees to obtain economic rent [52]. If
so, measures that recognize the connection between returns and cost of information
and managerial fees are likely to outperform those which do not [50].
To overcome these shortcomings, new methods of appraising mutual fund
performance that rely on frontier analysis were introduced in the late 1990s. In
frontier analysis, which has its origin in economics, the production frontier represents
the theoretical limit of output that can be produced given the amount of inputs and
the technology. The aim of a production unit is to achieve an input-output combination
that lies on the production possibility frontier, though in reality it may fall short of it
for any number of reasons. The notion of a shortfall provides a theoretically grounded
concept of inefficiency which can be measured [25] and used to
construct a ranking of production units in terms of their efficiency, with the smallest
deviation from the frontier being associated with the most efficient unit.
The two most popular frontier-based methods that were developed were the
nonparametric data envelopment analysis (DEA) and the parametric stochastic
frontier analysis (SFA). The key differences between the two approaches lie in the
assumptions imposed on the functional form of the efficiency frontier, in whether
random error, which may produce transitory positive or negative
deviations in outputs, inputs, costs, or profits, is allowed for, and, where random
error is allowed, in the distributional assumptions imposed to distinguish
inefficiency from random disturbances [5, 15]. The development of
these two frontier techniques opened the floodgates for performance evaluation in a
variety of industries, which now had a strong theoretical basis.
The earliest attempts at applying frontier methods for evaluating mutual fund
performance were published over two decades ago [42]. Since then a considerable
body of work on mutual fund appraisal using frontier methods has developed
in the literature. However, these studies have been overwhelmingly based on the
DEA technique. Basso and Funari [12] list 76 papers written between 1997 and
2014 which studied mutual fund performance based on the DEA. By contrast, there
have been only three published studies of mutual fund performance that employed
stochastic frontier analysis (SFA) [4, 5, 48]. The reason for the dominance of the
DEA technique in the performance evaluation of the mutual funds industry, and
indeed of many service industries, may lie in the relative flexibility it offers in
constructing the benchmark, or the most efficient production unit, without
requiring a parametric specification as the SFA does.

Brief Methodology of DEA and SFA

The DEA method was introduced by Charnes et al. [18] in the context of a
production unit operating with a constant returns to scale (CRS) production function. The
DEA generates an index of performance for each decision-making unit as the ratio of
its weighted outputs to its weighted inputs. Unlike parametric methods, the weights
for the output-input ratio are not derived from any underlying model of the
preference structure of the decision-maker. Rather, the weights are determined by solving
a fractional programming problem. Each decision-making unit is assigned a distinct
set of weights which maximizes its output-input ratio. Thus, if a decision-making
unit turns out to be inefficient using the most favorable weights, it can be safely
concluded that it will be inefficient for all other combinations of weights as well.
The SFA was proposed by Aigner et al. [3] and Meeusen and van den Broeck
[39] in independent studies. This approach estimates a flexible functional form,
whether cost, profit, or the production function, as the frontier along with an error
term to signify that the frontier itself is random due to exogenous shocks which
are beyond the control of the manager, or more generally, a decision-making unit.
Given inputs, the deviation of actual output from the frontier output is the sum
of this random shock and another error term that represents the inefficiency of
the decision-making unit. Since no manager can produce more than the frontier,
the inefficiency-related error term is always nonnegative and orthogonal to the
estimated frontier. The orthogonality ensures that the estimated inefficiency scores
are uncorrelated with the regressors and any scale economies [5, 24]. The addition of
the stochastic error term in the SFA makes it more robust to noise and measurement
errors which are potential problems for DEA. In fact, using a deterministic technique
like the DEA to determine the efficient portfolio frontier has been criticized on the
grounds that the phenomenon being studied is strongly stochastic in nature [16].
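In symbols, the composed-error structure described above is commonly written as follows. This is only a sketch: the log-linear form, the notation (y_j, x_j, β, v_j, u_j), and the half-normal assumption for the inefficiency term are conventions that are standard in the SFA literature and are assumed here for illustration rather than taken from this chapter.

```latex
\begin{aligned}
\ln y_j &= f(x_j;\beta) + v_j - u_j, \\
v_j &\sim N(0,\sigma_v^2), \qquad u_j \ge 0, \quad u_j \sim N^{+}(0,\sigma_u^2), \\
TE_j &= \exp(-u_j) \in (0,1].
\end{aligned}
```

Here v_j is the two-sided random shock that makes the frontier stochastic, u_j is the one-sided inefficiency term, and TE_j is the implied technical efficiency of unit j.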
Notwithstanding these concerns, the relative popularity of DEA as a method for
evaluation for mutual funds, and indeed for evaluation of most service industries, is
due to the fact that it does not require any a priori specification of the form of the
production frontier other than convexity or any assumptions about the distributions
for the error term as required in SFA. This is a great advantage for empirical
measurement. For manufacturing units that produce clearly identified outputs with
clearly identified inputs and a well-specified technology, explicit specification of
functional forms of the frontier that exploit the theoretical properties of production
(cost) is relatively easier. This is not the case for service industries like mutual funds
where the technology that transforms inputs into outputs may itself not be clearly
defined, and where even the identification of outputs and inputs may involve
considerable subjectivity.

DEA and Performance Evaluation of Mutual Funds

The basic idea behind applying the DEA to evaluate mutual fund performance is
that fund performance is characterized by multiple attributes which can be broadly
classified as outputs or benefits, like mean returns, and inputs or cost, like risk,
transaction costs, administration fees, loads, and minimum initial investment. Thus
the first study to apply the DEA to mutual funds was motivated by the need to
include transaction costs and management fees in the analysis [42]. Since then DEA
applications to mutual funds have evolved and been extended in several directions.
These advancements can be broadly classified into the following three categories:

(i) Addition of new variables, such as stochastic dominance indicators on the output
side [8, 19, 36], higher moments like skewness and kurtosis on the input side
[28, 43], ethical indicators on the input side [9], and quantile-based measures
like VaR and CVaR on the input side [19].
(ii) Development of network DEA models with multiple stages (Premachandra
2012).
(iii) Introduction of stochasticity to the deterministic DEA approach [1, 35, 44].

We cover each of these advancements in the following subsections. We begin by
describing the DEA approach as applied in the early attempts to evaluate mutual
fund performance.

Early Attempts

Murthi et al. [42] identified three major issues which the earlier measures of
portfolio evaluation, like Jensen’s alpha and the Sharpe ratio, were unable to
address: the appropriate benchmark for comparison; the role of market timing; and
the endogeneity of transaction costs. They developed their new DEA-based index
to address two of these concerns. First, by comparing the performance of individual
funds with the performance of an endogenously created best-performing fund from
the sample of all available funds, their DEA portfolio efficiency index
circumvented the problem of validating the choice of an exogenous benchmark.
Second, by incorporating transaction costs like turnover into the constructed
index as inputs, they avoided endogeneity issues. Murthi et al. [42]
modeled the DEA portfolio efficiency index (DPEI) on the Sharpe index. The
Sharpe index is given as the ratio between R and σ, where R is the difference between
the actual return and the risk-free return, and σ is the standard deviation of the
portfolio’s return. They defined the DPEI as:

$$
\mathrm{DPEI} = \frac{R}{\sum_{i} \omega_i X_i + \upsilon \sigma} \tag{1}
$$

where $X_i$ refers to transaction costs such as the expense ratio, which accounts for
management fees, marketing expenses, and other operational expenses; the load, which
measures the amount that investors have to pay at the time of entering or exiting
the fund; and turnover, which captures the trading activity of the fund manager. The
weights $\omega_i$ and $\upsilon$ associated with $X_i$ and $\sigma$ were determined by solving an
optimization problem, which Murthi et al. [42] based on the
DEA model proposed by Charnes et al. [18].
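In particular, Murthi et al. [42] derived their DPEI measure by solving a
fractional programming problem that maximizes the weighted output-input ratio
of a mutual fund and can be represented as:

$$
\begin{aligned}
\max_{v_i,\,u_r} \quad & h_0 = \frac{\sum_{r=1}^{t} u_r y_{rj_0}}{\sum_{i=1}^{m} v_i x_{ij_0}} \\
\text{subject to} \quad & \frac{\sum_{r=1}^{t} u_r y_{rj}}{\sum_{i=1}^{m} v_i x_{ij}} \le 1, \quad j = 1, \ldots, n, \\
& u_r \ge \varepsilon, \quad r = 1, \ldots, t, \\
& v_i \ge \varepsilon, \quad i = 1, \ldots, m,
\end{aligned} \tag{2}
$$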

where $\varepsilon$ is a small positive number (a non-Archimedean constant) which prevents
the weights from vanishing, $n$ is the number of mutual funds indexed by $j$, $m$ is
the number of inputs indexed by $i$, $t$ is the number of outputs indexed by $r$, $x_{ij}$ is the
amount of input $i$ for unit $j$, $y_{rj}$ is the amount of output $r$ for unit $j$, $v_i$ is the weight
assigned to input $i$, and $u_r$ is the weight assigned to output $r$.
The subscript 0 refers to the particular fund being evaluated. The optimal values
of $u_r$ and $v_i$ obtained from the above optimization problem indicate the position of
the fund relative to the Pareto-efficient frontier. Thus, the above program finds the
weights that maximize the ratio of the weighted sum of the outputs to the weighted
sum of the inputs of a fund, subject to the condition that all such ratios are less than
or equal to one. The efficiency measure thus obtained is the best that the fund can
achieve under any admissible set of weights, thereby giving the benefit of the doubt to
the mutual fund under evaluation. The problem assigns the value 1 to $h_0$ for the
best-practice fund in terms of the output-input ratio. The weights of the best-practice
fund determine the slope of the straight line connecting the fund and the origin
in the mean-variance or return-cost space (Fig. 1). This straight line is the DEA
efficiency frontier. The slack variables obtained from the programming problem, if
nonzero, indicate the extent to which each input can be reduced to achieve a
relative efficiency of one. Thus the optimization process also identifies the particular
areas in which management is inefficient [21, 42].
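For the actual optimization exercise, the fractional problem in Eq. 2 is converted
into a more convenient linear programming problem (LPP) by normalizing the
weighted sum of inputs of the fund under evaluation to one, that is, by setting
$\sum_i \omega_i X_{i0} + \upsilon \sigma_0 = 1$ (in the notation of Eq. 2, $\sum_{i=1}^{m} v_i x_{ij_0} = 1$).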

Fig. 1 DEA efficiency frontier is the straight line connecting origin with best-performing fund
under CRS [21]

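$$
\begin{aligned}
\text{Maximise} \quad & \sum_{r=1}^{t} u_r y_{rj_0} \\
\text{subject to} \quad & \sum_{i=1}^{m} v_i x_{ij_0} = 1, \\
& \sum_{r=1}^{t} u_r y_{rj} - \sum_{i=1}^{m} v_i x_{ij} \le 0, \quad j = 1, \ldots, n, \\
& -u_r \le -\varepsilon, \quad r = 1, \ldots, t, \\
& -v_i \le -\varepsilon, \quad i = 1, \ldots, m.
\end{aligned} \tag{3}
$$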

This LPP has t + m variables (the t output weights and m input weights) and
n + t + m + 1 constraints. The LPP thus described is the input-oriented Charnes
et al. [18] model. It is one of the simplest and most widely applied DEA models in
mutual fund performance analysis [2, 8, 26, 31, 42, 50].
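The linear program in Eq. 3 can be solved with any LP solver. The following is a minimal sketch using SciPy's `linprog`; the function name `ccr_efficiency`, the value chosen for ε, and the fund data are hypothetical and serve only to illustrate the structure of the problem, not to reproduce any of the cited studies.

```python
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, j0, eps=1e-6):
    """Solve the multiplier LPP of Eq. 3 for the fund in column j0.

    X: (m, n) array of inputs, Y: (t, n) array of outputs, for n funds.
    Returns the efficiency score and the optimal weights (u, v).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    m, n = X.shape
    t = Y.shape[0]
    # Decision vector z = (u_1..u_t, v_1..v_m); linprog minimizes, so negate.
    c = np.concatenate([-Y[:, j0], np.zeros(m)])
    # sum_r u_r y_rj - sum_i v_i x_ij <= 0 for every fund j.
    A_ub = np.hstack([Y.T, -X.T])
    b_ub = np.zeros(n)
    # Normalization: sum_i v_i x_ij0 = 1.
    A_eq = np.concatenate([np.zeros(t), X[:, j0]])[None, :]
    b_eq = np.array([1.0])
    # u_r >= eps and v_i >= eps.
    bounds = [(eps, None)] * (t + m)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun, res.x[:t], res.x[t:]

# Hypothetical data for four funds: two inputs and one output.
X = np.array([[0.10, 0.15, 0.20, 0.12],       # risk (standard deviation)
              [0.010, 0.012, 0.020, 0.015]])  # expense ratio
Y = np.array([[0.05, 0.08, 0.09, 0.04]])      # mean excess return
scores = [ccr_efficiency(X, Y, j)[0] for j in range(X.shape[1])]
```

Each fund is scored with its own most favorable weights, so one LP is solved per fund. A score close to one marks a fund on the CRS frontier, and lower scores measure the proportional shortfall relative to it.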
Murthi et al. [42] used the DPEI to evaluate 731 US mutual funds across seven
categories, namely aggressive growth, asset allocation, equity income, growth,
growth income, and balanced income, using data from the Morningstar database.
They found strong evidence that mutual funds are mean-variance efficient and that
efficiency is not related to transaction costs. However, in a later study, Tarim and
Karan [50], using the DPEI but with additional restrictions on the weights, were
unable to validate the mean-variance efficiency observed in Murthi et al. [42] for a
sample of 191 Turkish mutual funds. The additional restrictions on weights used by
Tarim and Karan [50] prevented extreme variations in the input and output weights.
They pointed out that since DEA assesses the efficiency of decision-making units
using the most favorable weights, it may lead to outlier decision-making units being
classified as efficient, or to a factor that is pertinent, with a nontrivial weight,
for one fund receiving a zero weight for another. Comparing the
results from the no-bounds DEA and the DEA with added restrictions, they showed
for their sample of funds that the presence of outliers may lead to the false conclusion
that funds are mean-variance efficient under the no-bounds DEA. The mean-variance
efficiency hypothesis remains much contested in the empirical literature, and a number
of DEA-based studies have found evidence both for and against it.
Published in the same year as Tarim and Karan [50], Basso and Funari [8]
illustrated the usefulness of the cross-efficiency matrix as an alternative strategy to
cope with extreme variations in input-output weights for a given sample of funds.
The element in row i and column j of the cross-efficiency matrix represents the
efficiency ratio of the mutual fund i obtained with the weights that are optimal for
mutual fund j. Thus, a series of efficiency scores are generated for each fund and
comparisons between them take into account the average efficiency score based on
the different input-output weights.
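A minimal sketch of the cross-efficiency matrix described above is given below. It assumes the CRS multiplier model of Eq. 3 as the source of each fund's optimal weights; the function names and the choice of ε are hypothetical. Optimal DEA weights need not be unique, so in practice the matrix can depend on which optimal solution the solver returns.

```python
import numpy as np
from scipy.optimize import linprog

def ccr_weights(X, Y, j0, eps=1e-6):
    """Optimal output weights u and input weights v for fund j0 (Eq. 3)."""
    m, n = X.shape
    t = Y.shape[0]
    c = np.concatenate([-Y[:, j0], np.zeros(m)])
    A_ub = np.hstack([Y.T, -X.T])
    A_eq = np.concatenate([np.zeros(t), X[:, j0]])[None, :]
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=[1.0],
                  bounds=[(eps, None)] * (t + m), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:t], res.x[t:]

def cross_efficiency(X, Y):
    """E[i, j] = efficiency ratio of fund i under the weights optimal for fund j."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = X.shape[1]
    E = np.empty((n, n))
    for j in range(n):
        u, v = ccr_weights(X, Y, j)
        # Weighted outputs over weighted inputs for every fund at once.
        E[:, j] = (u @ Y) / (v @ X)
    return E

# Hypothetical single-input, single-output example with three funds.
X = np.array([[1.0, 2.0, 4.0]])
Y = np.array([[1.0, 1.0, 3.0]])
E = cross_efficiency(X, Y)
average_score = E.mean(axis=1)  # row averages used to compare funds
```

The diagonal of `E` reproduces each fund's own DEA score, while the row averages temper scores that rely on an extreme, fund-specific choice of weights.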

Variable Returns to Scale

A major drawback of the Charnes et al. [18] DEA is that it ignores the effect of
returns to scale on a decision-making unit’s performance, since it assumes a CRS
efficiency frontier. Choi and Murthi [21] were the first to apply the Banker et al.
[7] DEA model, which allows variable returns to scale (VRS), to mutual fund
evaluation. The general Banker et al. [7] DEA model is given as the following
fractional programming problem:

$$
\begin{aligned}
\max_{u_r,\,v_i,\,u_0} \quad & \frac{\sum_{r=1}^{t} u_r y_{rj_0} - u_0}{\sum_{i=1}^{m} v_i x_{ij_0}} \\
\text{subject to} \quad & \frac{\sum_{r=1}^{t} u_r y_{rj} - u_0}{\sum_{i=1}^{m} v_i x_{ij}} \le 1, \quad j = 1, \ldots, n,
\end{aligned} \tag{4}
$$

where the restrictions and notation from Eq. 2 still hold and $u_0$ is the additional
free variable introduced to characterize returns to scale for the mutual fund. The
modified output-input ratio, $I$, proposed in Choi and Murthi [21] was given by

$$
I = \frac{R_0 - u_0}{\sum_{i} \omega_i X_i + \upsilon \sigma} \tag{5}
$$

Depending on whether $u_0$ is positive, zero, or negative, it is possible to infer
increasing, constant, or decreasing returns to scale for the fund under consideration.
The DEA efficiency score, $I$, can be interpreted as the ratio of the fund’s Sharpe
index to that of the best-performing fund, assuming $u_0$ and $\upsilon$ are zero. The optimal
$u_0^*$ gives the intercept of the tangent line on the facet of the efficiency frontier. The
slope of the tangent line is given by the optimum weights (Fig. 2).

Fig. 2 The DEA efficiency frontier with VRS fund performance

Note that Fig. 1 assumes constant returns to scale among mutual funds, and,
therefore, the DEA efficiency frontier, which is determined by the best-practice
fund, is a straight line connecting the origin and the risk-adjusted return of the
best-performing fund. In contrast, Fig. 2 shows an efficient frontier consisting of three
linear segments, each depicting different returns to scale: first increasing
returns to scale, then constant returns to scale (the 45-degree line), and finally
decreasing returns to scale.
Choi and Murthi [21] used the modified DEA performance index to evaluate
the same sample of mutual funds considered in Murthi et al. [42]. They found that
about 90% of aggressive growth funds showed increasing returns to scale with these
funds generally also exhibiting the highest gross returns. This, they argued, implied
that the superior performance of the aggressive growth funds in terms of returns
was due to increasing returns to scale over and above what could be attributed to
taking higher risks. The relative ranking of the funds using the new index did not
significantly change from the rankings obtained using DPEI and the authors found
that the mean-variance efficiency property continued to hold in the relative sense.
Another property of the Banker et al. [7] DEA which made it particularly suited
for mutual fund evaluation is its translation invariance. For the Charnes et al.
[18] DEA model to yield accurate results the output and input variables have to
be nonnegative. However, in empirical applications to mutual fund evaluation, it
is often possible to have negative returns for the output variable. The translation
invariance property of the Banker et al. [7] DEA means that a suitable constant
(greater than the absolute value of the negative output) can be added to the negative
output value without affecting the results of the DEA exercise. The input-oriented
Banker et al. [7] model is suitable for this purpose since it is translation invariant
with respect to its outputs, whereas the output-oriented Banker et al. [7] model is
translation invariant with respect to its inputs.
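The translation invariance property can be checked numerically. The sketch below uses the envelopment (dual) form of an input-oriented VRS model, which is a standard way of writing the Banker et al. [7] model but is assumed here rather than quoted from the chapter; the function name and the fund data are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def bcc_input_efficiency(X, Y, j0):
    """Input-oriented VRS efficiency of fund j0 (envelopment form).

    Minimize theta subject to
      sum_j lambda_j x_ij <= theta * x_ij0   for each input i,
      sum_j lambda_j y_rj >= y_rj0           for each output r,
      sum_j lambda_j = 1, lambda_j >= 0.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    m, n = X.shape
    t = Y.shape[0]
    # Decision vector z = (theta, lambda_1..lambda_n).
    c = np.concatenate([[1.0], np.zeros(n)])
    A_inputs = np.hstack([-X[:, [j0]], X])
    A_outputs = np.hstack([np.zeros((t, 1)), -Y])
    A_ub = np.vstack([A_inputs, A_outputs])
    b_ub = np.concatenate([np.zeros(m), -Y[:, j0]])
    # Convexity constraint that distinguishes VRS from CRS.
    A_eq = np.concatenate([[0.0], np.ones(n)])[None, :]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (n + 1), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.fun

# Hypothetical funds: one input (risk) and one output (mean return),
# where the first fund has a negative mean return.
X = np.array([[0.5, 1.0, 2.0, 1.5]])
Y = np.array([[-0.02, 0.03, 0.05, 0.01]])
shift = 0.10  # any constant larger than the most negative return
raw = [bcc_input_efficiency(X, Y, j) for j in range(4)]
shifted = [bcc_input_efficiency(X, Y + shift, j) for j in range(4)]
```

Because the intensity weights sum to one, adding the same constant to every fund's output changes both sides of each output constraint by that constant, so `raw` and `shifted` coincide. The same argument does not carry over to the CRS model, which lacks the convexity constraint.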
The input-oriented Banker et al. [7] DEA [20, 28–31, 38, 43] is specified as an
input minimization problem given a fixed level of outputs. On the other hand, the
output-oriented Banker et al. [7] DEA [13, 33, 38, 53] maximizes the outputs given
a fixed level of inputs. Of the two, the input-oriented Banker et al. [7] DEA is
the more popular.
However, recently Basso and Funari [12] argued that the input-oriented DEA
is not the most suitable method to evaluate mutual funds. They reason that since
the average investor is mainly interested in maximizing the mean returns and other
output variables, without increasing the costs, the output-oriented model is more
suitable for the analysis. To overcome the issue of negative mean returns, Basso
and Funari [12] use the final value of the fund as output in their output-oriented
DEA. Since the final value of investment is always positive, by doing so, Basso and
Funari [12] successfully sidestep the problem of having negative output variables.
They admit, however, that using the final value of the fund could potentially make
the DEA outcome sensitive to the choice of the holding period. Other techniques
adopted in the literature to deal with the possibility of negative mutual fund returns
include using an additive DEA model instead of a multiplicative model [36];
using slack-based models [34]; and using directional distance-based DEA [49].
However, as Basso and Funari [12] contend, while these approaches are technically
consistent, they tend to make the efficiency measures difficult to interpret
from a financial point of view.

Multihorizon DEA
Yet another interesting use of DEA to evaluate mutual fund performance is that
presented in Morey and Morey [41]. Whereas Murthi et al. [42], and almost all other
studies thereafter, used DEA primarily because it allowed the explicit incorporation
of transaction costs into the evaluation process, Morey and Morey [41] used the
DEA to develop a multihorizon approach by incorporating the average monthly
returns and risks over three different time horizons as outputs. They justified
their methodological approach by drawing an analogy to the “typical investor”
who, when deliberating over which mutual fund to invest in, has access to
information on the fund’s performance over multiple time horizons rather than
a single time horizon. Morey and Morey [41] use their DEA model to evaluate
26 “aggressive growth” mutual funds from the Morningstar database. They used
monthly percentage return data to calculate the mean monthly returns, variances,
and covariances for each mutual fund and each of three horizon lengths – 3 years,
5 years, and 10 years. They employed both the input-oriented and output-oriented
DEA approaches and compared the fund rankings with a ranking order based on the
Morningstar rating system. Through their empirical exercise, they showed that the
DEA-based method ranked the funds in a significantly different order to that of the
Morningstar rating system.

Introducing Additional Variables in the DEA Models

Over the years the basic DEA models for evaluating mutual fund performance have
been extended to include various other measures to incorporate the varying nature
of risk preference of investors. These risk measures have been introduced as new
output and input variables in the traditional DEA models.

Stochastic Dominance in DEA Models: CARA, IARA, and DARA

Basso and Funari [8] constructed a DEA mutual fund performance index with
a stochastic dominance indicator and returns as the two outputs. The stochastic
dominance indicator was included to reflect the investor’s preference structure and
the time occurrence of the returns.
Stochastic dominance techniques allow the partial ranking of random variables,
assuming that investors prefer those alternatives that maximize their expected utility.
The assumptions underlying stochastic dominance rules concern the signs of the
successive derivatives of the investor’s utility function.
The first-order stochastic dominance relation between two random portfolios X
and Y follows from the principle of non-satiety and is defined as follows: X is said
to dominate Y according to the first-order stochastic dominance criterion if for all
nondecreasing utility functions U we have E(U (X)) ≥ E(U (Y)) and there exists
a nondecreasing utility function U* such that E(U*(X)) > E(U*(Y)). The second-
order stochastic dominance between the random portfolios X and Y arises from the
principles of non-satiety and risk aversion; X is said to dominate Y according to the
second-order stochastic dominance criterion if for all nondecreasing and concave
utility functions U we have E(U (X)) ≥ E(U (Y)) and there exists a nondecreasing
and concave utility function U* such that E(U*(X)) > E(U*(Y)).
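For empirical return samples, both criteria reduce to simple comparisons of sorted
observations. The sketch below is a minimal illustration, assuming two samples of equal
size whose observations are equally likely; the function names are hypothetical and are
not taken from the cited studies.

```python
from itertools import accumulate


def first_order_dominates(x, y):
    """True if sample x dominates sample y by first-order stochastic dominance.

    Assumes equal-sized samples with equally likely observations, so the
    criterion reduces to comparing the sorted samples element by element.
    """
    if len(x) != len(y):
        raise ValueError("samples must have the same length")
    pairs = list(zip(sorted(x), sorted(y)))
    return all(a >= b for a, b in pairs) and any(a > b for a, b in pairs)


def second_order_dominates(x, y):
    """True if sample x dominates sample y by second-order stochastic dominance.

    Under the same assumptions, the criterion compares the running sums of
    the sorted samples (the k worst outcomes of x against the k worst of y).
    """
    if len(x) != len(y):
        raise ValueError("samples must have the same length")
    pairs = list(zip(accumulate(sorted(x)), accumulate(sorted(y))))
    return all(a >= b for a, b in pairs) and any(a > b for a, b in pairs)
```

With x = [2, 2] and y = [1, 3], the two samples have the same mean, so neither dominates
at first order, but the less dispersed sample x dominates y at second order.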
Another widely accepted stochastic dominance criterion concerns the assumption
of decreasing absolute risk aversion (DARA) of the investors’ utility functions;
X is said to dominate Y according to the DARA dominance criterion if for all
utility functions U satisfying the DARA assumption we have E(U (X)) ≥ E(U (Y))
and there exists a DARA utility function U* such that E(U*(X)) > E(U*(Y)). The
stochastic dominance rules get weaker with successive orders, so a prospect which
dominates another under order n also dominates it under every order higher than n.
Thus, the higher the order of the dominance rule, the larger is the number of
dominance relations observed. However, a higher-order rule also places more
restrictions on the utility function of the investor, implying that the resulting
efficient set is suitable for a smaller group of investors and involves a loss in
generality.
For the purposes of their DEA analysis, Basso and Funari [8] used the decreasing
absolute risk aversion (DARA) rule which is based on a widely accepted hypothesis,
is more selective than the first three orders of stochastic dominance, and can be
computationally tested using a convenient dynamic programming algorithm. To
define the stochastic dominance indicator, they began by taking the past returns
of the mutual fund over a convenient time period and dividing that period into
subperiods. They then constructed the stochastic dominance indicator for fund j
as the relative number of periods in which fund j was not dominated by any of the
other funds in the sample.

\[
d_j = \frac{\text{number of nondominated subperiods for fund } j}{\text{total number of subperiods}}
\]
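The indicator can be sketched in a few lines. One assumption is worth stating loudly:
the DARA test used by Basso and Funari [8] relies on a dynamic programming algorithm
that is not reproduced here, so the sketch swaps in a second-order stochastic dominance
check on equal-sized samples as a stand-in. The data layout and function names are
hypothetical.

```python
from itertools import accumulate


def ssd_dominates(x, y):
    """Second-order dominance of x over y for equal-sized, equally likely samples.

    Stand-in for the DARA dominance test, which is NOT implemented here.
    """
    pairs = list(zip(accumulate(sorted(x)), accumulate(sorted(y))))
    return all(a >= b for a, b in pairs) and any(a > b for a, b in pairs)


def dominance_indicator(returns_by_fund):
    """Share of subperiods in which each fund is not dominated by any other fund.

    returns_by_fund maps a fund name to a list of subperiods, each subperiod
    being a list of returns; all funds must cover the same subperiods.
    """
    funds = list(returns_by_fund)
    n_subperiods = len(returns_by_fund[funds[0]])
    indicator = {}
    for j in funds:
        nondominated = 0
        for t in range(n_subperiods):
            dominated = any(
                ssd_dominates(returns_by_fund[k][t], returns_by_fund[j][t])
                for k in funds
                if k != j
            )
            if not dominated:
                nondominated += 1
        indicator[j] = nondominated / n_subperiods
    return indicator


# Hypothetical data: two funds, three subperiods of two returns each.
example = {
    "A": [[2, 2], [0, 4], [1, 5]],
    "B": [[1, 3], [1, 2], [2, 3]],
}
d = dominance_indicator(example)
```

In this toy sample, fund B is dominated by fund A only in the first subperiod, so the
indicator is 1 for fund A and 2/3 for fund B.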

They defined the two-output Charnes et al. [18] DEA portfolio performance
index IDEA_2 as the optimal value of the objective function of the following
fractional programming problem:

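\[
\begin{aligned}
\max_{u_r,\, v_i,\, w_i} \quad & \frac{u_1 o_{j_0} + u_2 d_{j_0}}{\sum_{i=1}^{h} v_i q_{i j_0} + \sum_{i=1}^{k} w_i c_{i j_0}} \\
\text{subject to} \quad & \frac{u_1 o_j + u_2 d_j}{\sum_{i=1}^{h} v_i q_{ij} + \sum_{i=1}^{k} w_i c_{ij}} \le 1, \quad j = 1, \dots, n \\
& u_r \ge \varepsilon, \quad r = 1, 2 \\
& v_i \ge \varepsilon, \quad i = 1, \dots, h \\
& w_i \ge \varepsilon, \quad i = 1, \dots, k
\end{aligned}
\tag{6}
\]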

where j = 1, ..., n indexes the n mutual funds, r = 1, 2 indexes the two output
variables, $d_j$ denotes the relative number of subperiods in which fund j was not
dominated by any of the other funds in the sample, $o_j$ denotes the expected return
of the fund, $q_{1j}, \dots, q_{hj}$ denote the h risk measures for fund j,
$c_{1j}, \dots, c_{kj}$ denote the k subscription or transaction costs for fund j, and
$u_1$, $u_2$, $v_i$, and $w_i$ are the weights to be determined through the fractional
programming process. Basso and Funari [8] carried out two rounds of DEA calculations
for a sample of 47 Italian mutual funds over the years 1997–1999, once with the
IDEA_2 index and once with the IDEA_1 index, which is the same as the IDEA_2 index
except that it excludes the stochastic dominance indicator. They considered the
standard deviation of the portfolio returns, the square root of the half variance,
and the β coefficient as the risk measures and included a subscription cost and a
redemption cost as the two transaction costs. These measures served as inputs to the
evaluation process.
Comparing the results from the two DEA analyses revealed that the inclusion of
the stochastic dominance indicator as an output produced a significantly different
ranking among the funds and generated a larger set of efficient funds. Tavakoli
Baghdadabad et al. [51] extended the stochastic dominance approach presented in
Basso and Funari [8] to both constant absolute risk aversion (CARA) and increasing
absolute risk aversion (IARA). They applied the three DEA models (corresponding to
the DARA, CARA, and IARA stochastic dominance rules) to a sample of 17,555 US mutual
funds. Their findings showed that the average fund efficiency values differ
considerably across the stochastic dominance rules. The calculated average
efficiencies were highest under the DARA rule, whereas the average efficiencies
under CARA and IARA were negligible.
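In practice a fractional program such as (6) is usually solved in linear form. The
block below is a sketch of the standard Charnes-Cooper normalization applied to (6),
in which the denominator of the fund under evaluation is fixed at one; it assumes, as
is customary in the DEA literature, that the non-Archimedean lower bounds carry over
unchanged, and it is not quoted from Basso and Funari [8].

```latex
\[
\begin{aligned}
\max_{u_r,\, v_i,\, w_i} \quad & u_1 o_{j_0} + u_2 d_{j_0} \\
\text{subject to} \quad & \sum_{i=1}^{h} v_i q_{i j_0} + \sum_{i=1}^{k} w_i c_{i j_0} = 1 \\
& u_1 o_j + u_2 d_j - \sum_{i=1}^{h} v_i q_{ij} - \sum_{i=1}^{k} w_i c_{ij} \le 0,
  \quad j = 1, \dots, n \\
& u_r \ge \varepsilon, \quad r = 1, 2; \qquad
  v_i \ge \varepsilon, \quad i = 1, \dots, h; \qquad
  w_i \ge \varepsilon, \quad i = 1, \dots, k
\end{aligned}
\]
```

The ratio constraints of (6) become linear once both sides are multiplied by the
positive denominator, and the optimal value of the linear program equals the index
value for fund $j_0$.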
Lozano and Gutiérrez [36] posited six different DEA models which used return,
risk, and safety measures that had previously been proved to be consistent with
second-order stochastic dominance when used in portfolio optimization models.
They did this to better account for the effect of diversification on the overall risk of
holding a mutual fund. They argue that conventional DEA models that compute the
risk measure of the benchmark portfolio as a linear combination of the risk measures
of the intervening mutual funds do not take into account the diversification effects
of mutual funds. This can result in significant overestimation of the risk measure
and lead to underestimation of the efficiency scores for some mutual funds.

Risk Measures
Chen et al. [20] used the IDEA_g index to rank Chinese mutual funds over the years
1999–2001. The IDEA_g index has the same set of inputs as IDEA_2, but in addition to
returns and stochastic dominance as outputs, it also includes several traditional
measures of mutual fund performance such as the Sharpe ratio, the reward-to-half-variance
ratio, the Treynor ratio, and Jensen’s alpha.
Chen et al. [20] were among the first to address a controversy surrounding
the choice of risk measures for DEA mutual fund appraisal. Most DEA studies
on mutual funds included only the first moment (mean) and the second moment
(variance) among the input-output variables. This is adequate as long as asset
returns are normally distributed. However, a large number of empirical studies have
shown that this assumption does not hold. Other scholars have shown that investors’
utility functions are not quadratic, and that investors prefer higher skewness and
lower kurtosis. A more positively skewed returns distribution means a higher
probability of large above-average returns, and a higher kurtosis means a higher
chance of the fund undergoing a major change. So skewness is to be preferred and
kurtosis is to be avoided [20, 28].
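The third and fourth moments referred to above can be computed directly from a return
sample. The following sketch uses the population (divide-by-n) moment estimators; the
function names are illustrative and are not taken from the cited studies.

```python
from statistics import fmean


def central_moment(returns, k):
    """k-th central moment of the sample (population form, divides by n)."""
    mean = fmean(returns)
    return fmean([(r - mean) ** k for r in returns])


def skewness(returns):
    """Third standardized moment; positive values indicate a longer right tail."""
    return central_moment(returns, 3) / central_moment(returns, 2) ** 1.5


def kurtosis(returns):
    """Fourth standardized moment (not excess); equals 3 for a normal distribution."""
    return central_moment(returns, 4) / central_moment(returns, 2) ** 2
```

In the DEA models discussed here, the skewness value would enter as an output and the
kurtosis value as an input.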
Chen et al. [20] introduced value at risk (VaR) and conditional value at risk
(CVaR) as additional inputs into the IDEA_2 index of Basso and Funari [8]
to accommodate the skewness and leptokurtosis observed in the distribution of
returns of actively managed funds. VaR and CVaR are quantile-based measures,
which makes them suitable for measuring risk when distributions are asymmetric,
as the investor is more likely to be concerned about the risk of a loss than of a
gain. A portfolio’s VaR, given a prespecified level of confidence and a particular
time horizon, is the maximal loss that one expects to suffer at that confidence
level from holding that portfolio over that time horizon. The CVaR is the conditional
expectation of losses exceeding VaR in a specified period at the given confidence
level. Alongside the quantile-based measures, Chen et al. [20] also included
conventional measures of risk like alpha, beta, and the square root of the lower
semi-variance. They argued this was necessary because quantile-based measures focus
only on the lower tail of the distribution of returns and, therefore, need
to be combined with other risk measures so that all the risk characteristics
of the fund’s returns can be modeled simultaneously. They applied their input-
oriented index to a sample of Chinese mutual funds, once with VaR and CVaR
included and once without them. They found that a proper combination of VaR
and CVaR with other risk measures can reflect mutual funds’ risk properties more
comprehensively and thus better measure the overall performance of mutual
funds.
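A minimal historical-simulation sketch of the two quantile-based measures follows. It
is one of several possible estimators and rests on stated assumptions: losses are the
negatives of returns, VaR is taken as the empirical quantile of the losses, and CVaR is
approximated by the mean of the losses at or beyond VaR. The function name and sample
data are hypothetical.

```python
import math


def historical_var_cvar(returns, confidence=0.95):
    """Empirical VaR and CVaR of a return sample, both expressed as losses."""
    if not returns:
        raise ValueError("returns must not be empty")
    losses = sorted(-r for r in returns)  # ascending: largest loss is last
    # Index of the smallest loss whose empirical CDF reaches the confidence level.
    idx = min(len(losses) - 1, max(0, math.ceil(confidence * len(losses)) - 1))
    var = losses[idx]
    tail = losses[idx:]  # losses at or beyond VaR (includes the VaR observation)
    cvar = sum(tail) / len(tail)
    return var, cvar


sample = [-0.10, -0.05, -0.02, 0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06]
var_90, cvar_90 = historical_var_cvar(sample, confidence=0.9)
```

For the ten hypothetical returns above, the 90% VaR is a loss of 0.05 and the CVaR
averages the two worst losses, 0.05 and 0.10.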
For reasons similar to Chen et al. (2006), Guo et al. [28] and Pendaraki
[43] each constructed an input-oriented Banker et al. [7] type DEA model
in which they incorporated skewness directly as one of the outputs and
kurtosis as one of the inputs. Both found that including the higher
moments in the DEA measure changed the ranking of the mutual funds
and increased both the average efficiency scores and the number of efficient
funds.

DEA with Ethical Measures

Of late, ethical and socially responsible investment has come into focus in the
evaluation of mutual fund performance. Ethical or socially responsible funds allow
investors the opportunity to support, through their investments, social causes or
companies and nongovernmental organizations which are sensitive to social, cultural,
and environmental issues. However, these ethical features could involve
a financial penalty for the investors, as socially responsible investment is
possible only in a subset of economic activities, namely those that avoid areas that
could be considered morally dubious, such as weapons manufacturing, gambling
organizations, alcohol production, and activities which may pollute the environment.
Since socially responsible investing fulfills two different needs, namely the need to
obtain satisfactory returns and the need to be ethical, performance indicators that
take into account only the financial objective will tend to be biased against the funds
committed to socially responsible investment.
To address this problem, Basso and Funari [9] proposed using a two-output
DEA model, consisting of financial returns and an indicator for ethical behavior
as the two outputs. Two issues arose while incorporating ethical behavior in the
DEA model. First, usually the only information available on ethical behavior was
binary in nature or at most categorical. In this case, a basic DEA model which
uses continuous variables is inappropriate. Second, the ethical level is chosen by
savers a priori and cannot be arbitrarily modified, so that any proxy for ethical
behavior needs to be considered as an exogenously fixed variable. This led Basso
and Funari [9] to develop three DEA models: first, a simple two-output Charnes
et al. [18] DEA model which included an output indicator for ethical behavior;
second, an intermediate model which took the ethical level as an exogenously fixed
variable; and third, the most appropriate model, a categorical model with
an exogenously fixed output.
For exposition purposes, we illustrate the third model which is based on the DEA
model developed by Banker and Morey [6].
For the simple case where dj is a binary variable that takes the value 0 to indicate
ethical investing and 1 to indicate otherwise, the DEA model is the dual of the linear
programming problem in output orientation. Assuming a set of n mutual funds j = 1,
. . . , n with risky returns Rj , the following notations are introduced:

$o_j$: a return measure of fund j
$q_{1j}, \dots, q_{hj}$: h risk measures of fund j
$c_{1j}, \dots, c_{kj}$: k subscription costs for fund j
$z_0$: a dual variable associated with the equality constraint
$\lambda_j$: dual variables associated with the mutual fund constraints
$d_{j_0}$: the ethical measure variable for the mutual fund being evaluated
$s_1^+$: dual variable associated with the output weight constraint
$s_i^-$: dual variables associated with the input weight constraints
$\varepsilon$: a non-Archimedean constant.

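The associated linear programming problem can be written as:

\[
\begin{aligned}
\max \quad & z_0 + \varepsilon \sum_{i=1}^{h} s_i^- + \varepsilon \sum_{i=1}^{k} s_{h+i}^- + \varepsilon s_1^+ \\
\text{subject to} \quad & \sum_{j=1}^{n} q_{ij} \lambda_j + s_i^- = q_{i j_0}, \quad i = 1, \dots, h \\
& \sum_{j=1}^{n} c_{ij} \lambda_j + s_{h+i}^- = c_{i j_0}, \quad i = 1, \dots, k \\
& z_0 o_{j_0} - \sum_{j=1}^{n} o_j \lambda_j + s_1^+ = 0 \\
& \sum_{j=1}^{n} \frac{q_{1j}}{q_{1 j_0}} d_j \lambda_j \le d_{j_0} \\
& \lambda_j \ge 0, \quad j = 1, \dots, n \\
& s_i^- \ge 0, \quad i = 1, \dots, h + k \\
& s_1^+ \ge 0
\end{aligned}
\tag{7}
\]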

The constraint associated with the binary variable for the ethical status of the
fund,

\[
\sum_{j=1}^{n} \frac{q_{1j}}{q_{1 j_0}} d_j \lambda_j \le d_{j_0}, \tag{8}
\]

is redundant for nonethical funds ($d_{j_0} = 1$), but for ethical funds ($d_{j_0} = 0$)
it requires that the multipliers associated with the nonethical funds be zero
($\lambda_j = 0$ if $d_j = 1$).
This model could be easily adapted to cases where ethical funds are measured on a
scale instead of a simple binary rule. Basso and Funari [9] tested the applicability
of the DEA models empirically using randomly generated data. Since then, several
other studies have extended this approach to capture the effect of variable returns to
scale (for example, [13]) and to be robust to negative returns (for example, [10, 11]).
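The two properties claimed for the categorical constraint can be verified in a few
lines. The sketch below assumes, as seems natural for a risk measure, that
$q_{1j} > 0$ for every fund; it uses only the constraints of the model above.

```latex
% Case 1: the fund under evaluation is ethical, d_{j_0} = 0.
% Every term of the sum is nonnegative, so the sum can be <= 0 only if each term vanishes.
\[
\sum_{j=1}^{n} \frac{q_{1j}}{q_{1 j_0}}\, d_j \lambda_j \le 0
\;\Longrightarrow\; d_j \lambda_j = 0 \ \text{for all } j
\;\Longrightarrow\; \lambda_j = 0 \ \text{whenever } d_j = 1 .
\]
% Case 2: the fund under evaluation is nonethical, d_{j_0} = 1.
% Use d_j <= 1 and the first risk constraint (i = 1) with its slack s_1^- >= 0.
\[
\sum_{j=1}^{n} \frac{q_{1j}}{q_{1 j_0}}\, d_j \lambda_j
\;\le\; \frac{1}{q_{1 j_0}} \sum_{j=1}^{n} q_{1j} \lambda_j
\;=\; \frac{q_{1 j_0} - s_1^-}{q_{1 j_0}}
\;\le\; 1 = d_{j_0} .
\]
```

In the second case the constraint is implied by the first risk constraint, which is
why it never binds for a nonethical fund.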

Network DEA

Conventional DEA models evaluate mutual fund performance as a single-stage
production process and, therefore, fail to consider the internal structure by which
the mutual funds are managed [27]. Recently, network DEA models have been
developed that explicitly consider the different activities that make up a mutual
fund’s management process as separate stages, thereby allowing better modeling
and better identification of the sources of inefficiency. Network DEA models are
in the early stages of their development, and as of 2018 there have been
three studies that have used a network DEA approach to appraise mutual fund
performance [27, 45, 46].

Premachandra et al. [45], who were the first to implement a network DEA model
in the context of mutual funds using a two-stage DEA model, decomposed overall
efficiency into (i) operational efficiency, which reflects how well a fund family has
managed its resources in securing or generating funds for the family, and
(ii) portfolio efficiency, which measures how well a mutual fund family manages its
investment portfolio to realize returns subject to a chosen set of factors that may
influence them.
They modeled fund management as a two-stage process. In the first stage, the
operational management stage, the fund tries to attract monies from investors
and incurs management fees ($X_1^1$) and marketing and distribution expenses
($X_2^1$). The output variable of the first stage is the net asset value generated
($z_1$). The second stage is the portfolio management stage. Here the net asset value
($z_1$), the fund size ($X_1^2$), the net expense ratio ($X_2^2$), the turnover ratio
($X_3^2$), and the standard deviation of the returns over the past three years
($X_4^2$) are the inputs that go into producing the mean returns of the portfolio
($y_1$). The net asset value, which is the output of the first stage, becomes an
intermediate input into the second stage and forms the connecting link between the
two stages.
It is possible to solve the two stages of the problem separately as two standalone
DEA problems by assuming that they operate independently of each other. However,
ignoring the fact that the outputs from the first stage are inputs to the
second stage can lead to various inconsistencies in the evaluations. In an earlier
work in the literature on production economics, Kao and Hwang [32] offered a
possible solution by pointing out that an overall measure of efficiency can be
expressed as the product of the efficiencies of the two stages under the constant
returns to scale assumption. Later, Chen et al. [20] extended this approach by using
an additive efficiency decomposition under both the constant and the variable returns
to scale assumptions. Premachandra et al. [45] employed the two-stage DEA model
under the variable returns to scale assumption illustrated in Chen et al. [20].
The choice of the model was motivated by the translation invariance property
of the variable returns to scale DEA model, which makes it better suited to
situations where some of the variables (e.g., mutual fund returns) take negative
values.
The DEA-based procedure used in Premachandra et al. [45] can be described
using a general two-stage DEA network structure for the mutual fund family j
with $i_1$ inputs to the first stage denoted by
$X_j^1 = (x_{1j}^1, x_{2j}^1, \dots, x_{i_1 j}^1)$; $i_2$ inputs to the second stage
denoted by $X_j^2 = (x_{1j}^2, x_{2j}^2, \dots, x_{i_2 j}^2)$; D intermediate measures
denoted by $z_{dj}$ (d = 1, ..., D); and s outputs from the second stage denoted by
$y_{rj}$ (r = 1, ..., s). With respect to the empirical example in Premachandra et al.
[45] (illustrated in Fig. 3), $X^1$ has two input variables, $X^2$ has four input
variables, z has one variable, and y has one variable. Following Banker et al. [7],
the VRS efficiency score of a decision-making unit at the first and second stages can
be calculated using models (9) and (10), respectively. The first stage is given by:

[Figure 3: Two-stage network. First stage (operational management function): inputs
are management fees and marketing and distribution fees; output is net asset value.
Second stage (portfolio management function): inputs are net asset value, fund size,
net expense ratio, turnover, and standard deviation; output is average returns.]

Fig. 3 Structure of the network DEA model in Premachandra et al. [45]

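\[
\begin{aligned}
\text{Maximise} \quad & \frac{\sum_{d} \eta_d^1 z_{d0} + u^1}{\sum_{i_1} v_{i_1}^1 x_{i_1 0}^1} \\
\text{s.t.} \quad & \frac{\sum_{d} \eta_d^1 z_{dj} + u^1}{\sum_{i_1} v_{i_1}^1 x_{i_1 j}^1} \le 1, \quad j = 1, 2, \dots, n \\
& v_{i_1}^1,\ \eta_d^1 \ge \varepsilon; \quad u^1 \text{ is free}
\end{aligned}
\tag{9}
\]

where $v_{i_1}^1$ and $\eta_d^1$ are the weights for the inputs and intermediate
variables, to be obtained through the optimization of the first stage, and $u^1$ is a
free variable associated with the returns to scale for the DEA in stage 1. The second
stage can be expressed as follows: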

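\[
\begin{aligned}
\text{Maximise} \quad & \frac{\sum_{r} u_r y_{r0} + u^2}{\sum_{d} \eta_d^2 z_{d0} + \sum_{i_2} v_{i_2}^2 x_{i_2 0}^2} \\
\text{s.t.} \quad & \frac{\sum_{r} u_r y_{rj} + u^2}{\sum_{d} \eta_d^2 z_{dj} + \sum_{i_2} v_{i_2}^2 x_{i_2 j}^2} \le 1, \quad j = 1, 2, \dots, n \\
& v_{i_2}^2,\ u_r,\ \eta_d^2 \ge \varepsilon; \quad u^2 \text{ is free}
\end{aligned}
\tag{10}
\]

where $v_{i_2}^2$, $u_r$, and $\eta_d^2$ are the weights for the inputs, intermediate
measures, and outputs of the second stage, to be obtained through optimization of the
second stage, and $u^2$ is the free variable associated with returns to scale for the
DEA in the second stage.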
Overall efficiency is then calculated as the weighted average of the efficiency
scores from the first and the second stage as:

\[
w_1 \cdot \frac{\sum_{d} \eta_d^1 z_{d0} + u^1}{\sum_{i_1} v_{i_1}^1 x_{i_1 0}^1}
+ w_2 \cdot \frac{\sum_{r} u_r y_{r0} + u^2}{\sum_{d} \eta_d^2 z_{d0} + \sum_{i_2} v_{i_2}^2 x_{i_2 0}^2}
\tag{11}
\]

The weights w1 and w2 sum up to unity and are defined by the user to reflect the
relative importance of each stage to overall performance. Premachandra et al. [45]
used the share of the total resources devoted in each stage to determine the weights.
For their empirical exercise, they used data from 66 large US mutual fund families
consisting of 1296 individual mutual funds. Using the two-stage decomposition
process, they demonstrated that, in addition to identifying the efficient funds, they
were also able to quantify the relative contributions of portfolio management
efficiency and operational efficiency to overall efficiency.
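Once the two stage scores are available, the aggregation step is simple arithmetic.
The sketch below contrasts the additive (weighted average) aggregation with the
multiplicative decomposition attributed above to Kao and Hwang [32]. The stage scores
and weights are made-up numbers, and the function names are hypothetical; the sketch
does not solve the stage models themselves.

```python
def additive_overall_efficiency(e1, e2, w1, w2):
    """Weighted average of the stage efficiencies; the weights must sum to one."""
    if min(w1, w2) < 0 or abs(w1 + w2 - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to one")
    return w1 * e1 + w2 * e2


def multiplicative_overall_efficiency(e1, e2):
    """Product of the stage efficiencies (constant returns to scale setting)."""
    return e1 * e2


# Hypothetical fund family: operational score 0.8, portfolio score 0.5,
# with 40% of the importance assigned to the first stage.
overall_additive = additive_overall_efficiency(0.8, 0.5, w1=0.4, w2=0.6)
overall_product = multiplicative_overall_efficiency(0.8, 0.5)
```

With these illustrative numbers the additive measure is 0.62 and the product is 0.40,
which shows how strongly the choice of aggregation can affect the overall score.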
Galagedera et al. [27] extended the two-stage approach developed by Premachandra
et al. [45] by adding a third stage to the management process. In the first stage
of the two-stage model in Premachandra et al. [45], management fees and marketing
and distribution fees were the inputs that went into producing net asset value (NAV).
Galagedera et al. [27] divided this stage into two. In their model, the first stage,
the operational management stage, had the same inputs as in Premachandra et al.
[45], but its output was now the fund size. The new second stage, the resource
management stage, took the intermediate output, fund size, and two inputs,
turnover ratio and expense ratio, to produce the NAV as its output.
The NAV was then used as an input to the final stage, which produced returns as the
output after combining NAV with various risk measures. They argued that though
NAV and fund size were distinct measures, they could not both be included as outputs
or both as inputs, since both were related to the scale of operations and, therefore,
highly correlated. By considering NAV as total funds transformed through a resource
management process, NAV can be included as an output while total funds can be
considered as an input to the second stage.

Introducing Stochasticity into DEA Models

Mutual fund returns generation is a stochastic process. Studying stochastic
processes using a nonparametric technique like the DEA leaves one at risk of
confusing chance events with what is actually a product of the manager’s efficiency
or lack thereof. Parametric techniques like stochastic frontier estimation (SFE),
which we cover later in the review, are immune to this shortcoming, though they have
their own set of drawbacks. Several studies have therefore attempted to incorporate
elements of stochasticity into the DEA framework itself.
One of the earliest studies in this respect was by Premachandra et al. [44], who
constructed a spreadsheet-based stochastic DEA (SDEA) model based on Monte Carlo
simulations using the @RISK add-in for Microsoft Excel. They used the SDEA
to model an asset allocation decision facing a fund manager who must choose a
stock and money market investment portfolio that will perform efficiently relative
to a number of important alternative portfolios selected from the same potential
investments. The money market investments were a risk-free debt market investment
and the potential stock market investments consisted of market indexing investment
as well as small, medium, and large company investments. The risk-free market,
small company, medium company, and large company investments were combined
in different ways to form alternative portfolios so that all major asset allocation
portfolio strategies were covered. The two inputs were the total dollar value initially
invested in the risky holdings (X1 ) and the dollar value of the initial risk-free
investment. The single stochastic output was the portfolio’s market value at the end
of a time period minus the comparison benchmark return which would have been
generated by a market index investment strategy. The @RISK simulation package
produced summary statistics for each of the alternative portfolios, which were used
to compare their relative efficiencies. The numerical SDEA model predicted future
underperformance by identifying portfolios that were grossly inefficient and did a
better job than the standard Markowitz mean-variance portfolio ranking approach.
Lamb and Tee [35] used bootstrapping, a technique similar to Monte Carlo
simulation, to produce estimates of fund efficiency. Whereas Premachandra et al.
[44] modeled the DEA problem as the decision facing the mutual fund manager,
Lamb and Tee [35] used the SDEA to model the decision facing investors choosing
between alternative mutual funds to invest in. The two input variables used were
max(CVaR, 0) and max(SD, 0), where CVaR is the conditional value at risk and
SD is the difference between the lower semi-deviation and the mean. The two output
variables were the mean returns and max(−CVaR, 0). The choice of the input and
output variables was motivated by the desire to make the DEA model sensitive
to diversification [36]. Lamb and Tee [35] used the bootstrap method to estimate
the biases and subsequently constructed bias-corrected estimates of the DEA
efficiency measures. They found that the efficiency estimates had significant biases
and, more crucially, that the magnitude of the bias was fund specific. They also noted
that the rank ordering of the funds was markedly different when using the bias-
corrected efficiency estimates than when using the raw efficiency measures. The
confidence limits were estimated using a bootstrap method, and the mutual funds
were partially ranked using a matched-pairs test. They also estimated diversification-
consistent DEA efficiency measures using the bootstrap methods.
Hu and Chang [31] presented an approach different from the above two studies
in terms of identifying stochastic elements using the DEA framework. Rather than
employing an SDEA, they designed a three-stage model to decompose mutual fund
performance and obtain pure managerial performance. In the first stage, they ran
an output-oriented CRS DEA model using around 156 funds from the Morningstar
database for the period 2005–2006. In the second stage, the output slacks obtained
in the previous stage were used as the dependent variable in a stochastic frontier
regression with various attributes of the fund and the investor as the independent
variables. This was done to decompose the output slacks obtained in the first
stage into three components: one reflecting environmental influences,
one reflecting luck, and one reflecting managerial
inefficiency. The stochastic frontier regression estimated was of the following form:

\[
s_i = f(z_i; \beta) + \nu_i - u_i, \quad \nu_i \sim N(0, \sigma_\nu^2) \ \text{and} \ u_i \sim N^{+}(0, \sigma_u^2), \quad \forall i = 1, \dots, I
\tag{12}
\]

where $s_i$ is the output slack from the first-stage DEA, $z_i$ is a vector of N
observable environmental variables like fund characteristics and manager attributes,
$f(z_i; \beta)$ is the deterministic component of the slack variable with parameter
vector $\beta$, $u_i$ is the
nonnegative managerial inefficiency term, and $\nu_i$ is the stochastic term
representing statistical noise. After estimating the sources of inefficiency, this
information was used to adjust the output data to better reflect pure managerial
effort. This was done by adjusting upwards the output variables of each fund in
proportion to the quantum of disadvantage faced by the fund on account of its
relatively unfavorable environment or relatively bad luck. In the third and final
stage, the adjusted output variables were used to rerun the CRS DEA from the first
stage. They found that the number of efficient funds fell significantly in the
third-stage DEA as compared to the first-stage DEA, implying that the variations in
original underperformance could largely be explained by fund characteristics, manager
attributes, and luck.

Other Nonparametric and Partial Frontier Measures

Other Nonparametric Measures

There are some nonparametric methods other than the DEA to evaluate mutual fund
performance, though such methods are scarce. Among them is the Free Disposal
Hull (FDH) technique, which is the nonconvex counterpart of the DEA. It shares
the same underpinnings as the DEA; however, unlike the DEA, it does not rest on
the hypothesis of convexity of the attainable set. While the fact that it does not
require convexity makes the FDH more flexible than the DEA, it also makes the
technique susceptible to what is known as the scarcity bias. Because the FDH drops
the convexity assumption of the DEA, if a given fund cannot be compared with the
other funds because of their input-output combinations, it is classified as efficient
by default. Thus, despite being more flexible than the DEA, the FDH has trouble
discriminating between and ranking alternatives in the absence of a sufficient number
of similar mutual funds [37]. No studies have applied the FDH exclusively to mutual
fund evaluation; rather, it has been used in conjunction with other methods
like the DEA and partial frontier methods [23, 37, 49]. What is more noteworthy
is the applicability of new partial frontier techniques, like the order-m [17] and
order-α [22] estimators, to mutual fund evaluation and the advantages these methods
offer over full frontier methods like the DEA and the FDH.

Partial Frontier Measures

There are two major limitations of full frontier methods like the DEA and the
FDH, which partial frontier methods can overcome, namely (i) their susceptibility
to outliers and extreme values, and (ii) the curse of dimensionality. Full frontier
methods like the DEA and FDH measure efficiency and inefficiency using the
absolute minimum achievable level of inputs (given the output) as the benchmark.
Partial frontier techniques (specifically the order-m frontier) on the other hand
use the expected minimum achievable level of input which is a less extreme
benchmark. As a result, partial frontier methods are more robust to outliers than
full frontier methods since they do not try to envelop all observed points, no matter
how extreme. Another way in which partial frontier methods offer an advantage
over full frontier methods is by avoiding the curse of dimensionality. Due to the
curse of dimensionality, for nonparametric methods like the DEA and FDH, as
the dimensions of the input-output space increase, the number of observations
needed to maintain the original level of precision also increases rapidly. Partial
frontier techniques like order-m [17, 37] and order-α [37], on account of being
$\sqrt{n}$-consistent, are less susceptible to the curse of dimensionality.
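Without going into the formal estimators, the intuition behind the order-m benchmark
can be conveyed with a small Monte Carlo sketch. It is an illustration under
simplifying assumptions (input orientation, draws with replacement from the funds that
produce at least as much output as the fund under evaluation); the function name,
defaults, and data are hypothetical.

```python
import random


def order_m_input_efficiency(inputs, outputs, i, m, n_draws=200, seed=0):
    """Monte Carlo approximation of an input-oriented order-m score for unit i.

    Each replication benchmarks unit i against m peers drawn at random, so the
    score is an expected minimum rather than the absolute minimum used by FDH.
    Scores above 1 are possible when unit i outperforms its random peers.
    """
    rng = random.Random(seed)
    # Input ratios of all peers producing at least as much output as unit i.
    peer_ratios = [
        max(a / b for a, b in zip(x_j, inputs[i]))
        for x_j, y_j in zip(inputs, outputs)
        if all(a >= b for a, b in zip(y_j, outputs[i]))
    ]
    total = 0.0
    for _ in range(n_draws):
        total += min(rng.choices(peer_ratios, k=m))
    return total / n_draws


# Hypothetical funds: inputs = (risk, cost), output = (return,)
om_inputs = [(2.0, 1.0), (4.0, 2.0), (1.0, 5.0)]
om_outputs = [(1.0,), (1.0,), (3.0,)]
small_m = order_m_input_efficiency(om_inputs, om_outputs, i=1, m=1)
large_m = order_m_input_efficiency(om_inputs, om_outputs, i=1, m=50)
```

For a small m the benchmark is lenient and the score for the second fund stays well
above its FDH value; as m grows the score approaches the FDH score of 0.5, which is
the sense in which the partial frontier is a less extreme benchmark.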
Since the methodologies of these techniques are quite involved, we do not
describe them in the present review. Matallín-Sáez et al. [37] provide an informative
discussion of the methodology of partial frontier techniques. The next section
provides a summary of studies applying SFA to the context of mutual funds.

Stochastic Frontier Analysis

Though nonparametric methods such as the DEA have come to dominate the
literature on mutual fund evaluation, the stochastic frontier approach (SFA) offers a
theoretically sound alternative to model and evaluate the efficiency of mutual funds.
Indeed, there is reason to prefer the SFA over the DEA when evaluating any
decision-making units which operate under uncertain conditions, not all of which are
within the control of the “managers” of these units. The advantages of the SFA
technique, notably its robustness to outliers and extreme values, are due to the
incorporation of a stochastic error term. Notwithstanding this, the need to specify
explicitly the functional form for the frontier and the distributional assumptions
for the two error terms related to noise and efficiency makes SFA rather challenging,
especially for service industries like mutual funds. As a result, studies
that take an SFA approach to mutual fund analysis are very few. At the time of
writing this review, there were only three such published studies. A brief discussion
of the three studies follows.
Annaert et al. [4] applied SFA to evaluate a sample of European mutual funds.
As a preliminary indication of fund performance, they computed Jensen’s alpha
for each of the 179 funds. Jensen’s alpha ($\alpha_p$) is given as the difference
between the excess return earned by the mutual fund portfolio and the expected excess
return:

\[
\alpha_p = \bar{R}_p - E(R_p) = \bar{R}_p - \beta_p E(R_m) \tag{13}
\]

where $\bar{R}_p$ is the actual average excess return of fund p; $\beta_p$ is its beta
coefficient, which is a measure of the fund’s volatility relative to the market
portfolio; and $E(R_m)$ is the expected excess return on the market portfolio. The
second equality in Eq. 13 follows from the CAPM (though other asset pricing models can
also be assumed to hold). The alpha for each fund was obtained as the intercept term
in a time-series regression of the fund’s excess returns ($R_{pt}$) on a proxy for the
market portfolio return ($R_{mt}$). The estimated beta coefficients centered around
one, indicating the relevance of the benchmark portfolio. The average alpha
coefficient was close
to zero, indicating that the individual funds did not outperform the benchmark
portfolio.
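The time-series regression behind Jensen’s alpha is an ordinary least squares fit with
a single regressor. A minimal sketch follows, assuming excess returns for the fund and
for the market proxy are already available as equal-length lists; the function name and
the data are illustrative.

```python
from statistics import fmean


def jensen_alpha_beta(fund_excess, market_excess):
    """OLS intercept (alpha) and slope (beta) of fund on market excess returns."""
    if len(fund_excess) != len(market_excess):
        raise ValueError("return series must have the same length")
    mean_m, mean_f = fmean(market_excess), fmean(fund_excess)
    s_mm = sum((m - mean_m) ** 2 for m in market_excess)
    s_mf = sum((m - mean_m) * (f - mean_f)
               for m, f in zip(market_excess, fund_excess))
    beta = s_mf / s_mm
    alpha = mean_f - beta * mean_m  # average fund return minus beta times market
    return alpha, beta


# Hypothetical series built with alpha = 0.002 and beta = 1.5 and no noise.
market = [0.01, -0.02, 0.03, 0.00]
fund = [0.002 + 1.5 * m for m in market]
alpha_hat, beta_hat = jensen_alpha_beta(fund, market)
```

Because the toy series contains no noise, the fit recovers the alpha and beta used to
construct it; with real data the intercept would carry sampling error.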
According to the efficient market hypothesis, alpha, which measures the expected
abnormal returns for a mutual fund, should be zero, or even negative in case a fund
incurs excessive transaction or management costs. However, in a given sample some
funds will likely be found to exhibit positive abnormal returns due to sampling
noise. Hence, in the cross-sectional regression of mutual fund returns on their beta
coefficients, given by:

\[
\bar{R}_p = \gamma_0 + \gamma_1 \hat{\beta}_p + \alpha_p \tag{14}
\]

the expected abnormal returns (α p ) will be zero or negative depending on the fund’s
transaction or management costs. Significantly positive abnormal returns (α p ) for
any fund in the sample is purely due to sampling error. The regression coefficient
γ 0 should be zero and the coefficient γ 1 should be the average excess return on
the market portfolio as long as the CAPM holds. The CAPM says that the returns
on a mutual fund are proportional to its systematic risk. To increase the statistical
power of the test, the author’s augmented Eq. 14 with a composed error, consisting
of a symmetric disturbance capturing measurement error (vp ) and a nonnegative
disturbance term, modeling the level of efficiency (ξ p ):

\[ R_p = \gamma_0 + \gamma_1 \hat{\beta}_p - \xi_p + v_p \tag{15} \]

Both error terms are independent of each other and across funds. The nonnegativ-
ity of ξ p follows from the assumption that funds cannot systematically outperform
the benchmark portfolio. Recognizing that one of the major concerns against the
parametric techniques is that the assumption made about the distribution of the error
terms can be restrictive, Annaert et al. [4] used three different gamma distributions
for ξ p , and pooled them together in a Bayesian framework. The error term vp was
assumed to follow the usual normal distribution with mean zero and variance σ².
The β̂p for each fund was estimated by regressing fund excess returns on Rmt for
the previous 24–36 months. The estimated slope coefficients were taken as the β̂p ’s.
Using this framework, Annaert et al. [4] estimated five models for mutual fund
efficiency: a baseline model based on Eq. 14; three models based on Eq. 15 and the
three assumed distributions of ξ p ; and the pooled model.
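To see why the one-sided term in Eq. 15 matters, the following sketch simulates fund returns with a composed error and fits a plain least-squares regression to them. It is only an illustration of the composed-error structure, not the Bayesian estimation used by Annaert et al. [4]; the parameter values and the single gamma distribution chosen for ξ p are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
n_funds = 5000

# Hypothetical cross-sectional parameters (not taken from the study).
gamma0, gamma1 = 0.0, 0.006

beta_hat = rng.normal(1.0, 0.2, n_funds)            # betas centered around one
v = rng.normal(0.0, 0.002, n_funds)                 # symmetric noise v_p
xi = rng.gamma(shape=2.0, scale=0.001, size=n_funds)  # nonnegative inefficiency xi_p

# Eq. 15: R_p = gamma0 + gamma1 * beta_hat_p - xi_p + v_p
R = gamma0 + gamma1 * beta_hat - xi + v

# Plain OLS ignores the one-sided term, so its intercept absorbs -E[xi]
# instead of recovering gamma0.
X = np.column_stack([np.ones(n_funds), beta_hat])
ols_intercept, ols_slope = np.linalg.lstsq(X, R, rcond=None)[0]
mean_xi = xi.mean()

print(f"OLS intercept: {ols_intercept:.5f}")
print(f"gamma0 - mean(xi): {gamma0 - mean_xi:.5f}")
```

Under these assumptions the least-squares intercept sits below γ 0 by roughly the mean inefficiency, which is the part of the error that a frontier model tries to separate from the symmetric noise.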
Their results showed that individual mutual fund efficiencies exhibited large
variability. While the most efficient mutual funds were located very near to the
efficiency frontier, the least efficient funds were located approximately at a level
50% below the frontier. They used combinations of Spearman and Pearson correlation
tests to investigate the relationship between fund efficiency and fund characteristics
such as fund size, age, and historical performance. Though they failed to find any
significant relationship between efficiency and age, both size and past performance
were revealed to have a significant positive effect on efficiency.
Building on the work of Annaert et al. [4], Santos et al. [48] implemented a
similar SFA approach based on the CAPM functional form to evaluate Brazilian
stock mutual funds. They assumed a half-normal distribution for the efficiency term
ξ p . Since their approach was very similar to that of the earlier paper, it will not be
discussed in this review, in the interest of brevity.
Babalos et al. [5] offer the only example of the SFA technique being employed to
investigate mutual fund performance in the post-financial crisis era. They
tested two hypotheses: first, whether an increase in a fund’s risk raises its
efficiency, and second, whether an increase in fund size raises its
efficiency. Post-2008, the question of how
fund managers will respond to a rise in the degree of risk associated with a mutual
fund has gained a special significance.
Fund efficiency was measured using a distribution-free approach based on Berger
[14]. This was done to avoid the usual pitfall of SFA which comes from its heavy
reliance on the assumed underlying error distribution. They carried out their analysis
for a panel dataset of US no-load mutual funds observed over the years 2002–2010.
The choice to only model no-load funds meant that they did not have to deal with
any confusion regarding whether to add sales cost to the mutual fund’s operating
cost since no-load funds are directly distributed to investors and do not charge sales
costs. They did not rely on the CAPM relation to benchmark fund efficiency. The
basic functional form for the frontier they estimated is given by:

\[ R_{it} = f(N_{it}, Z_{it}) + \vartheta_{it} + u_{it} \tag{16} \]

where Rit is the observed returns for fund i in year t, N is a vector of fund-specific
variables like expense ratio and turnover ratio affecting its return, Z is a vector of
control variables like volatility index and bond quality spread, ϑit corresponds to
random fluctuations and follows a symmetric normal distribution, and uit represents
the fund’s efficiency compared to the best practice level. The selection of inputs and
outputs was based on earlier nonparametric studies like Murthi et al. [42], Basso
and Funari [8, 9], Daraio and Simar [23], and Matallín-Sáez et al. [37]. In the
empirical estimation, Eq. 16 was fitted in a flexible translog specification to account
for nonlinearities. The stochastic frontier model was estimated using a seemingly
unrelated regression (SUR) framework. They estimated the mean efficiency scores
for US no-load funds for each of the years 2002–2010 and efficiency scores for
different categories of funds such as small-cap, large-cap, financial, and
technology funds. The average efficiency scores across all funds were quite high
at 81% and the dispersion of the scores was found to be highest in 2002 and 2008,
which is to be expected considering the effect of the credit crunch on the global
financial market. Among the different mutual fund classes, large-cap fund categories
had the highest efficiency score and financial sector funds were also found to have
performed well considering the effects of the GFC.
Following the estimation of the efficiency scores, they employed a panel VAR
(vector auto-regression) to explore the causal relationship between efficiency and
some of its determinants like fund size, risk, flows (as measured by percentage asset
growth rate net of appreciation), and Morningstar ratings. Anticipating the risk of
endogeneity bias in using simple OLS in this context, they used a VAR framework
instead since it allows all the variables in the analysis to be entered as endogenous
variables.
Interpreting the results from the IRFs (impulse response functions) and the
variance decomposition, they showed that there was a two-way relation between
risk and fund performance. Higher risk-taking led to better performance, while a
decline in the fund’s efficiency was associated with a rise in riskiness. Contrary to
Annaert et al. [4], they found the relationship between fund efficiency and size to be
negative.

Conclusion

The motivation behind the present paper was to provide a comprehensive review
of the diverse body of literature on mutual fund performance evaluation. The focus
was on displaying the variety of methodological approaches adopted by the studies
so far. The earliest attempts at ranking mutual funds used simplistic measures
which relied on comparing mutual fund returns with benchmark market returns.
Later measures like the Sharpe ratio, Treynor ratio, and Jensen’s alpha, which
are still among the most popular measures for mutual fund evaluation, also took
into consideration the trade-off between returns and risk associated with financial
assets like mutual funds. However, it was only in the last decade of the twentieth
century that advancements in frontier analysis techniques allowed for a truly holistic
approach to measuring mutual fund efficiency. The two best-known methods of
frontier analysis are the DEA and the SFA. Of the two, the DEA has been
more popular among researchers studying mutual fund performance, though both
methods have their advantages and shortcomings. Because the DEA does not
require any prespecified functional form for the frontier, it can be easier to apply to
mutual fund evaluation, since mutual funds do not involve a traditional
production process. The SFA, by contrast, is more robust to outliers and sampling
errors because it includes a stochastic error term. Another caveat of the DEA
which is particularly relevant in the case of developing countries is the need for
the sample of funds to include at least some efficient funds for the DEA rankings
to provide accurate indications that scarce financial capital is being allocated to its
most efficient use. Nonparametric methods like the DEA will always identify at least
one fund as “efficient” irrespective of whether the fund is actually investing in the
best available projects in the economy. While this may not be a concern in developed
economies where capital markets are well developed and many efficient funds
exist, this concern could be real for emerging markets with less-developed financial
markets. In such cases, parametric methods like the SFA can be useful as these
models specify the benchmark frontier exogenously based on a normative concept of
what constitutes efficient investment using theoretical models of corporate finance.
It is clear that both of the techniques have their own strengths and weaknesses, and
there is evidence of these two strands of mutual fund research informing each other.
It is hoped that future research into the mutual fund industry will demystify the
process of fund management and thereby catalyze mutual fund evaluation literature,
especially parametric methods of evaluation.

References
1. Abdelsalam O, Duygun M, Matallín-Sáez JC, Tortosa-Ausina E (2014) Do ethics imply
persistence? The case of Islamic and socially responsible funds. J Bank Financ 40:182–194
2. Adeli O (2013) Measuring the efficiency of mutual funds. J Oper Res Appl (J Appl Math)
9:27–41
3. Aigner D, Lovell CK, Schmidt P (1977) Formulation and estimation of stochastic frontier
production function models. J Econ 6(1):21–37
4. Annaert J, Van Den Broeck J, Vander Vennet R (2003) Determinants of mutual fund
underperformance: a Bayesian stochastic frontier approach. Eur J Oper Res 151(3):617–632
5. Babalos V, Mamatzakis EC, Matousek R (2015) The performance of US equity mutual funds.
J Bank Financ 52:217–229
6. Banker RD, Morey RC (1986) Efficiency analysis for exogenously fixed inputs and outputs.
Oper Res 34(4):513–521
7. Banker RD, Charnes A, Cooper WW (1984) Some models for estimating technical and scale
inefficiencies in data envelopment analysis. Manag Sci 30(9):1078–1092
8. Basso A, Funari S (2001) A data envelopment analysis approach to measure the mutual fund
performance. Eur J Oper Res 135(3):477–492
9. Basso A, Funari S (2003) Measuring the performance of ethical mutual funds: a DEA approach.
J Oper Res Soc 54(5):521–531
10. Basso A, Funari S (2005) Performance evaluation of ethical mutual funds in slump periods.
Rendiconti per gli Studi Economici Quantitativi 2005:89–105
11. Basso A, Funari S (2008) DEA models for ethical and non ethical mutual funds. Math Methods
Econ Finance 2(1):21–40
12. Basso A, Funari S (2016) DEA performance assessment of mutual funds. In: Data envelopment
analysis. Springer, Boston, pp 229–287
13. Basso A, Funari S (2017) The role of fund size in the performance of mutual funds assessed
with DEA models. Eur J Financ 23(6):457–473
14. Berger AN (1993) Distribution-free estimates of efficiency in the U.S. banking industry and
tests of the standard distributional assumptions. J Prod Anal 4:261–292
15. Berger AN, Humphrey DB (1997) Efficiency of financial institutions: international survey and
directions for future research. Eur J Oper Res 98(2):175–212
16. Briec W, Lesourd JB (2000) The efficiency of investment fund management: an applied
stochastic frontier model. In: Advances in quantitative asset management. Springer, Boston,
pp 41–59
17. Cazals C, Florens JP, Simar L (2002) Nonparametric frontier estimation: a robust approach. J
Econ 106(1):1–25
18. Charnes A, Cooper WW, Rhodes E (1978) Measuring the efficiency of decision making units.
Eur J Oper Res 2(6):429–444
19. Chen Z, Lin R (2006) Mutual fund performance evaluation using data envelopment analysis
with new risk measures. OR Spectr 28(3):375–398
20. Chen Y, Cook WD, Li N, Zhu J (2009) Additive efficiency decomposition in two-stage DEA.
Eur J Oper Res 196(3):1170–1176
21. Choi YK, Murthi BPS (2001) Relative performance evaluation of mutual funds: a non-
parametric approach. J Bus Financ Acc 28(7–8):853–876
22. Daouia A, Simar L (2007) Nonparametric efficiency analysis: a multivariate conditional
quantile approach. J Econ 140(2):375–400
23. Daraio C, Simar L (2006) A robust nonparametric approach to evaluate and explain the
performance of mutual funds. Eur J Oper Res 175(1):516–542
24. Ferrier GD, Lovell CK (1990) Measuring cost efficiency in banking: econometric and linear
programming evidence. J Econ 46(1–2):229–245
25. Galagedera DU (2002) A survey on investment performance appraisal with special reference
to data envelopment analysis. Monash University, Victoria
26. Galagedera DU, Silvapulle P (2002) Australian mutual fund performance appraisal using data
envelopment analysis. Manag Financ 28(9):60–73
27. Galagedera DU, Roshdi I, Fukuyama H, Zhu J (2018) A new network DEA model for
mutual fund performance appraisal: an application to US equity mutual funds. Omega 77:
168–179
28. Guo J, Ma C, Zhou Z (2012) Performance evaluation of investment funds with DEA and higher
moments characteristics: financial engineering perspective. Syst Eng Procedia 3:209–216
29. Haslem JA, Scheraga CA (2003) Data envelopment analysis of Morningstar’s large-cap mutual
funds. J Invest 12(4):41–48
30. Haslem JA, Scheraga CA (2006) Data envelopment analysis of Morningstar’s small-cap mutual
funds. J Invest 15(1):87–92
31. Hu JL, Chang TP (2008) Decomposition of mutual fund underperformance. Appl Financ Econ
Lett 4(5):363–367
32. Kao C, Hwang SN (2008) Efficiency decomposition in two-stage data envelopment analysis:
an application to non-life insurance companies in Taiwan. Eur J Oper Res 185(1):418–429
33. Khedmatgozar HR, Kazemi A, Hanafizadeh P (2013) Mutual fund performance evaluation: a
value efficiency analysis approach. Int J Electron Finance 7(3–4):263–280
34. Kumar UD, Roy AB, Saranga H, Singal K (2010) Analysis of hedge fund strategies using
slack-based DEA models. J Oper Res Soc 61(12):1746–1760
35. Lamb JD, Tee KH (2012) Resampling DEA estimates of investment fund performance. Eur J
Oper Res 223(3):834–841
36. Lozano S, Gutiérrez E (2008) Data envelopment analysis of mutual funds based on second-
order stochastic dominance. Eur J Oper Res 189(1):230–244
37. Matallín-Sáez JC, Soler-Domínguez A, Tortosa-Ausina E (2014) On the informativeness of
persistence for evaluating mutual fund performance using partial frontiers. Omega 42(1):
47–64
38. McMullen PR, Strong RA (1998) Selection of mutual funds using data envelopment analysis.
J Bus Econ Stud 4(1):1
39. Meeusen W, van Den Broeck J (1977) Efficiency estimation from Cobb-Douglas production
functions with composed error. Int Econ Rev 435–444
40. Modigliani F, Modigliani L (1997) Risk-adjusted performance. J Portf Manag 23(2):45–54
41. Morey MR, Morey RC (1999) Mutual fund performance appraisals: a multi-horizon perspec-
tive with endogenous benchmarking. Omega 27(2):241–258
42. Murthi BPS, Choi YK, Desai P (1997) Efficiency of mutual funds and portfolio performance
measurement: a non-parametric approach. Eur J Oper Res 98(2):408–418
43. Pendaraki K (2012) Mutual fund performance evaluation using data envelopment analysis with
higher moments. J Appl Finance Bank 2(5):97
44. Premachandra IM, Powell JG, Shi J (1998) Measuring the relative efficiency of fund manage-
ment strategies in New Zealand using a spreadsheet-based stochastic data envelopment analysis
model. Omega 26(2):319–331
45. Premachandra IM, Zhu J, Watson J, Galagedera DU (2012) Best-performing US mutual fund
families from 1993 to 2008: evidence from a novel two-stage DEA model for efficiency
decomposition. J Bank Financ 36(12):3302–3317
46. Premachandra IM, Zhu J, Watson J, Galagedera DU (2016) Mutual fund industry performance:
a network data envelopment analysis approach. In: Data envelopment analysis. Springer,
Boston, pp 165–228
47. Roll R (1977) A critique of the asset pricing theory’s tests part I: on past and potential testability
of the theory. J Financ Econ 4(2):129–176
48. Santos A, Tusi J, Da Costa N Jr, Da Silva S (2005) Evaluating Brazilian mutual funds with
stochastic frontiers. Econ Bull 13(2):1–6
49. Simar L, Vanhems A, Wilson PW (2012) Statistical inference for DEA estimators of directional
distances. Eur J Oper Res 220(3):853–864
50. Tarim SA, Karan MB (2001) Investment fund performance measurement using weight-
restricted data envelopment analysis: an application to the Turkish capital market. Russ East
Eur Finance Trade 37(5):64–84
51. Tavakoli Baghdadabad MR, Habibi Tanha F, Halid N (2013) The efficiency evaluation of
mutual fund managers based on DARA, CARA, IARA. J Bus Econ Manag 14(4):677–695
52. Titman S, Grinblatt M (1989) Mutual fund performance: an analysis of quarterly portfolio
holdings. J Bus 62(3)
53. Zhao XJ, Wang SY (2007) Empirical study on Chinese mutual Funds’ performance. Syst Eng
Theory Pract 27(3):1–11
33 Performance of Microfinance Institutions: A Review∗
Christopher F. Parmeter and Valentina Hartarska

Contents
Introduction and Overview
Production/Cost Environment of MFIs
What Efficiency Means to MFIs
How Efficiency Has Been Measured Across MFIs
  Two-Stage Analysis
Data Availability
  Data on Subsidies
Key Modeling Issues
  Selection
  Measurement Error
  Loans Versus Savings and Loans
  How to Quantify Outputs
Main Findings
  Returns to Scale
  Economies of Scope
  MFI Heterogeneity
  Women’s Impact on MFIs
  Governance and Performance
  Outreach and Mission Drift
  The Role of Risk

∗ We would like to thank Erika Schutt Pardo for comments on an earlier version of this chapter. All
errors are ours alone.
C. F. Parmeter
Department of Economics, University of Miami, Miami, FL, USA
e-mail: [email protected]
V. Hartarska
Auburn University, Auburn, AL, USA
e-mail: [email protected]

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_33
  Competition
Future Directions
Cross-References
References

Abstract

This review covers current issues that applied researchers assessing the perfor-
mance of MFIs are likely to encounter and should be cognizant of.

Keywords

Efficiency · Productivity · Microfinance · Economies of scope · Returns to scale

JEL Classification

C10, C13, C14, C50

Introduction and Overview

The primary goal of this chapter is to introduce the wide audience of this Handbook
to the empirical literature that has developed over the past 20 years or so relating
to the productivity and efficiency of microfinance institutions (MFIs). One may
question why an entire chapter needs to be devoted to such a topic. While MFIs
operate quite broadly in much the same way that traditional banks do, making loans
to clientele and in some cases offering to collect deposits, they face an entirely
different set of objectives and hurdles that traditional banks do not (see [11]).
MFIs typically lend to the poorest of the poor and cannot use standard screening
tools given the lack of credit record or collateral; the loan amounts are typically of
such a small magnitude that standard banks would have a difficult time profiting
from such loans. In addition, while most banks offer both savings and intermediate
loans, MFIs traditionally have serviced loans and have only recently begun holding
deposits which are mainly viewed as an additional product, rather than a strict source
of capital. Moreover, while the primary objective of a traditional bank is to turn a
profit, in many instances MFIs promote outreach to the unbankable at the expense of
increased profit. All of this taken together suggests that while methods of assessing
productivity/efficiency of MFIs may overlap with those used for traditional banks,
there are a host of specific issues that require direct attention and detail when
studying MFIs.
This is also true for data pertaining to MFIs. While nearly every country has a
central bank that collects, aggregates, and reports on the banking sector, no such
country-level body exists for MFIs. Thus, how and where data are collected and
reported is an important issue for the applied researcher interested in studying
MFIs. We pay special attention to data sources and threats to credibility and external
validity for said data sources here.
It is important to understand the performance of MFIs broadly as the industry
has changed substantially over the recent past. While MFIs have had a solid focus
on providing credit, the nature of the industry has evolved. This is due to the fact
that providing credit to the poor is generally a costly activity and may impinge on
the financial sustainability of MFIs. The origins of MFIs were such that NGOs and
donors would provide financial support to MFIs by offering loans at below-market
interest rates, helping them lend. More recently, however, a shift from this model of
subsidizing MFIs to a model of financial sustainability and cost-efficiency has taken
over. This model focuses on MFIs having the ability to cover their internal costs of
lending through the interest income generated from the corresponding loan portfolio
and by minimizing costs in the delivery of these loans.
A business model of cost-efficiency and financial stability is appropriate given
the many challenges that MFIs face. Since they operate under significant financial
constraints with a higher-risk clientele without collateral, coupled with intensive
labor use, to eschew complete reliance on subsidies, high interest rates on loans are
inevitable. These high interest rates can, unfortunately, run counter to many MFIs’
mission statements of outreach and poverty alleviation. Rosenberg et al. [118],
van Rooyen et al. [130], Dehejia et al. [42], and Sinclair [126] all conclude that
high interest rates are one of the central reasons why MFIs may fail to meet their
outreach platform. Other studies have found that direct or indirect subsidies help
keep interest rates lower, and recent studies (including studies of productivity) have shown that
continued subsidies play an important role in controlling costs and affect measures
of scale economies in MFIs [36].
The remainder of this survey will touch on the myriad empirical issues that have
arisen as academics have probed and studied the operating environment of MFIs,
their efficiency levels, the factors which impact these levels, and future issues that
deserve more attention.

Production/Cost Environment of MFIs

At the heart of performance evaluation of MFIs is whether they should be viewed
as profit-oriented, self-sustaining businesses or socially minded, nonprofit
organizations [27, 102]. As [27, pg. 28] ask, “Should MFIs be compared based on their
profitability or based on their outreach . . . .” How a researcher chooses to answer
this question lies at the heart of performance evaluation of MFIs and will shape the
policy narratives derived from it.
As different MFIs may have different targets, modelling how MFIs produce
output is a key challenge for empiricists. One can envision a setting where MFIs
produce outputs that maximize financial revenue (yield, for instance) as well as
outputs that maximize outreach (minimum average loan size, number of loans,
number of clients, etc.). In this instance a standard production environment is
unlikely to allow proper benchmarking of MFIs. Rather an output distance function
[27] or cost minimization framework [77] will be necessary.
For our purposes, we will focus on modeling of costs for MFIs. The cost function
is a natural fit for MFIs as they typically take outputs (loans/deposits) as given and
minimize input costs in the production of these outputs. The cost function is also
an empirically accessible tool to model multiple outputs, which as noted above is
important for MFIs when comparing across MFIs who may have different objectives
(profit vs. outreach). The basic cost function framework is

\[ \ln C_{it} = m(y_{it}, w_{it}, z_{it}; \beta) + \alpha_i + \varepsilon_{it}, \quad i = 1, \dots, n;\; t = 1, \dots, T, \tag{1} \]

where y it is the vector of outputs for MFI i in period t, wit is the vector of
input prices (cost of labor, capital carrying costs, etc.), zit is a vector of controls
which may affect the cost environment of the MFI (clientele served, operating area,
region, type of MFI, etc.), αi is an MFI-specific term which captures time-constant
heterogeneity across MFIs, and εit is an idiosyncratic shock.
When cost-efficiency is being studied, εit can be decomposed into a pure noise
effect, vit , and a one-sided inefficiency component, uit , which serves to raise costs
above their minimal level: εit = vit + uit . The parameters of the model can be
estimated once the cost function, m(y it , w it , zit ; β), is parametrically specified.1
A common empirical specification is the translog functional form. To write this in
matrix form, we let ln x it = (ln y it , ln wit ) denote the vector of outputs and input
prices in logarithmic form. We then have

\[ m(\ln x_{it}, z_{it}; \beta) = \ln x_{it}'\beta_1 + \ln x_{it}'\beta_2 \ln x_{it} + \beta_3' z_{it}, \tag{2} \]

where β = (β 1 , vec(β 2 ), β 3 ) and β 2 is an appropriately sized matrix of coefficients
that respects linear homogeneity and symmetry of the cost function.
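As a rough illustration of how the translog regressors in Equation (2) can be assembled, the sketch below builds the first-order terms, the unique second-order products, and the controls into one design matrix. The function name `translog_design` and the toy inputs are hypothetical. Symmetry is handled by keeping only products with j ≤ l, so the coefficient on an off-diagonal product corresponds to the sum of the two symmetric entries of β 2 ; linear homogeneity in input prices is not imposed here and would have to be added, for instance by normalizing cost and input prices by one input price before taking logarithms.

```python
import numpy as np

def translog_design(ln_x, z):
    """Build regressors for a translog function.

    ln_x : (n, k) array of logged outputs and input prices.
    z    : (n, q) array of controls.
    Returns an (n, k + k*(k+1)/2 + q) array: first-order terms,
    the unique second-order products ln x_j * ln x_l (j <= l), and z.
    """
    ln_x = np.asarray(ln_x, dtype=float)
    z = np.asarray(z, dtype=float)
    k = ln_x.shape[1]
    rows, cols = np.triu_indices(k)      # index pairs (j, l) with j <= l
    second_order = ln_x[:, rows] * ln_x[:, cols]
    return np.column_stack([ln_x, second_order, z])

# Toy example: two logged variables and one control for two observations.
ln_x_demo = np.log(np.array([[1.0, np.e], [np.e, np.e]]))
z_demo = np.array([[0.5], [1.5]])
X_demo = translog_design(ln_x_demo, z_demo)
print(X_demo.shape)
```

The resulting matrix can be passed to any linear estimator, with MFI-specific effects handled separately (for example by within-MFI demeaning).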
A serious empirical concern is how to deploy the translog functional form
when some MFIs do not produce all outputs. For example, in the study of scope
economies of MFIs, some MFIs may only offer loans, while others offer both loans
and deposits. In this case the pure translog functional form cannot be applied
directly, because the logarithm of a zero output is undefined.
A common, though empirically dubious, approach is to add a small number to
the corresponding 0 for the MFI-specific output prior to taking logarithms. While
many empirical studies uncover “reasonable” estimates of various metrics using this
approach, it does not parry the initial concern, and there is no consensus as to how
small a number should be added to the 0 output value prior to taking logarithms.
Going forward authors are counseled to avoid this practice, and if they do engage
in this behavior, they should be explicit as to which number they added, how robust
their results are to minor changes in this small number, and how many observations
in the overall dataset this impacts. A better alternative is to model separate translog
functional forms as in [96] or to use a different functional form such as generalized
quadratic or Leontief.
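The sensitivity concern can be made concrete with a small numerical sketch. The deposit volumes and the candidate constants below are hypothetical; the helper `log_with_shift` simply applies the add-a-constant adjustment discussed above so that its consequences can be inspected.

```python
import numpy as np

def log_with_shift(y, c):
    """Return ln(y + c), the ad hoc adjustment for zero-valued outputs."""
    return np.log(np.asarray(y, dtype=float) + c)

# Hypothetical deposit volumes; the first two MFIs hold no deposits.
deposits = np.array([0.0, 0.0, 1.5e5, 2.0e6])
constants = [1.0, 1e-3, 1e-6]

for c in constants:
    shifted = log_with_shift(deposits, c)
    print(f"c = {c:g}: ln(deposits + c) = {np.round(shifted, 2)}")

# Share of observations whose logged value is driven by the choice of c.
share_zero = float(np.mean(deposits == 0.0))
print(f"share of zero-output observations: {share_zero:.2f}")
```

For the deposit-taking MFIs the logged value is essentially unchanged across constants, while for the loan-only MFIs it moves with the arbitrary choice of c, and every second-order translog term involving that output moves with it. Reporting the constant used, the share of affected observations, and results under alternative constants follows directly from this.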

1 Nonparametric methods can also be deployed.

What Efficiency Means to MFIs

Efficiency improvement is meaningful to any business or sector. Efficiency has
perhaps more important implications for MFIs given their (potential) reliance on
subsidies [36, 40] and the dichotomy between outreach and operational perfor-
mance [27]. The majority of MFIs opened initially with the goal to serve the
poorest clientele. However, “mission drift” [100] may set in given the absence of
substantial scope and/or scale economies [96]. This is illustrated with the financial
sustainability of the Grameen Bank, a flagship MFI, whose performance improved
materially after changing its business model to start offering micro-savings (in
addition to loans). These performance improvements coincided with a simultaneous
abandonment of its poorest clients. This is not surprising: once an MFI is licensed
to collect savings deposits, it becomes subject to banking regulations, and the
additional stringent supervisory environment incentivizes profit-oriented MFIs
to curtail outreach to costly-to-reach customers [39].
Understanding the efficiency of a single MFI or of the industry as a whole has a
wide-ranging importance. The gradual withdrawal of subsidies heightens
the need for MFIs to demonstrate long-term viability and sustainability. There are
concerns that subsidies reduce MFI incentives to perform efficiently. As an example,
[29] show that MFIs in Central Asia and Eastern Europe that rely more heavily on
deposit funding and less on subsidies are more efficient than similar MFIs over time.
Using different methods, [79] find the opposite effect: “smart subsidies” [11], which
allow MFIs to build their infrastructure and develop institutional know-how, result
in MFIs which are more efficient than those that do not receive these subsidies.
Regardless of the type of subsidy, they need to be accounted for when assessing
MFI performance; as noted by [104, pg. 98], “. . . [MFIs’] performance on the basis
of traditional financial ratios without unearthing the degree of subsidy dependence
provides only a partial and often meaningless or misleading picture of the social
cost of maintaining the MFIs. . . .”
Cost-efficiencies are also important to MFIs given the increase in competition
over the past several decades. The entrance of a variety of for-profit MFIs has
led to concerns over higher than necessary interest rates, which can undermine
outreach objectives. This is characterized by [118, pg. 1]: “An interest charge
represents money taken out of clients’ pockets, and it is unreasonable if it not
only covers the costs of lending but also deposits ‘excessive’ profits into the
pockets of an MFI’s private owners. Even an interest rate that only covers costs
and includes no profit can still be unreasonable if the costs are excessively high
because of avoidable inefficiencies.” Thus, MFIs need to operate efficiently if they
are to provide microloans at rates that actually help their clientele and remain
financially stable.
To illustrate this point more deeply, consider recent research on six prominent
randomized controlled trials (RCTs) [14], which found only a small average impact
of microcredit access on new marginal borrowers. These modest impacts could still
produce sizable effects if MFI costs are proportionally small; thus the
importance of maintaining cost-efficiency becomes all the more apparent. More recent work
1314 C. F. Parmeter and V. Hartarska

has revisited these six randomized controlled trials (and others as well; see [41,
103, 111]) and raised questions concerning the role of MFIs in poverty reduction,
arguing that RCTs are best viewed as one of myriad approaches to evaluating the
impact of MFIs. Whichever side one falls on, it is apparent that MFIs need
to be cost-efficient to remain effective, irrespective of the wider-reaching effects of
offering access to loans/deposits to the unbankable (see also [38]).

How Efficiency Has Been Measured Across MFIs

The spectrum of empirical studies investigating technical and allocative efficiency
of MFIs has used a host of statistical methods. However, the majority of this
research has deployed either data envelopment analysis (DEA) or stochastic
frontier analysis (SFA). Both methods have their merits and drawbacks. DEA offers
the ability to model the production or cost environment in a fully nonparametric
setting and does not require distributional assumptions to identify inefficiency.
Unfortunately, these advantages come at the cost of assuming that idiosyncratic noise does
not exist in the model. In Equation (1) this is captured as $\varepsilon_{it} = u_{it}$. This makes DEA
susceptible to outliers, which are likely to exist in (especially cross-country) MFI
data [72, 73]. Methods do exist to handle potential outliers when using
DEA; we direct the curious reader to [125] for more details. Another empirical
concern with the deployment of DEA is that the nonparametric nature of
the method means that potentially large finite-sample biases may exist given the
dimensionality of the data. In this instance one may consider dimension reduction
methods [136].
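To make the outlier concern concrete, the following minimal sketch uses the special case of one input and one output under constant returns to scale, where the DEA efficiency score of a unit reduces to its output-input ratio divided by the best ratio observed in the sample. The data, the function name dea_crs_scores, and the single-input, single-output setup are illustrative assumptions and are not taken from any of the cited studies.

```python
def dea_crs_scores(inputs, outputs):
    """Constant-returns DEA scores for the one-input, one-output case.

    With a single input x and a single output y, the score of unit i is
    its productivity y_i / x_i divided by the best productivity in the
    sample. No allowance is made for noise in the data.
    """
    productivity = [y / x for x, y in zip(inputs, outputs)]
    best = max(productivity)
    return [p / best for p in productivity]


# Hypothetical MFIs: operating cost (input) and number of loans (output).
costs = [10.0, 20.0, 30.0, 40.0]
loans = [8.0, 18.0, 24.0, 28.0]

baseline = dea_crs_scores(costs, loans)

# Add one MFI with an implausibly high reported output (an outlier or a
# reporting error) and recompute the scores of the original four units.
with_outlier = dea_crs_scores(costs + [10.0], loans + [30.0])[:4]
```

In this toy sample a single extreme observation redefines the frontier and lowers the measured efficiency of every other unit, which is the sense in which a noise-free frontier is sensitive to outliers.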
SFA relies more heavily on parametric methods. Typically the researcher
must both specify the cost function (translog, quadratic, etc.) and
impose distributional assumptions on the composed error term. The most common
specification assumes $v_{it} \sim N(0, \sigma_v^2)$ and $u_{it} \sim N^{+}(0, \sigma_u^2)$, the
ubiquitous normal-half-normal specification popularized by [3]. The benefit of
using SFA over DEA is that random fluctuations can be more easily handled in one’s
analysis. Naturally one may ask whether this benefit outweighs the risk of imposing
erroneous parametric assumptions, and a large literature has developed studying this
question. Recent advances in SFA allow the production/cost technology to be
estimated in a nonparametric fashion [108] and to dispense with distributional
assumptions on the composed error [107, 128, 139]. We direct the reader to [106] and
the chapter of Kumbhakar et al. [91] in this Handbook for more detailed treatments
of various aspects of SFA. One important aspect of applying either SFA or DEA
to study efficiency of MFIs is the fact that the microfinance literature has reached a
consensus that the external operating environment impacts how MFIs behave and,
more importantly, that these factors need to be accounted for in studies of MFI
efficiency [2, 56]. Hartarska et al. [72] were the first to include these impacts in an
efficiency study of MFIs. Moreover, the vast majority of studies on MFI efficiency
find moderate levels of technical efficiency, indicating substantial room for
improvement. See [52] for a meta-analysis of this literature.
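Returning to the normal-half-normal specification described above, the sketch below codes the density of the composed error $\varepsilon = v + u$ for a cost frontier, together with the implied log-likelihood of a set of residuals. The function names and parameter values are hypothetical choices made for illustration. The final lines check numerically that the density integrates to one and that its mean equals $\sigma_u \sqrt{2/\pi}$, the mean of the half-normal inefficiency term.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def composed_error_pdf(eps, sigma_v, sigma_u):
    """Density of eps = v + u, v ~ N(0, sigma_v^2), u ~ N+(0, sigma_u^2).

    For a cost frontier inefficiency raises cost, so u enters with a
    positive sign and the density is skewed to the right.
    """
    sigma = math.sqrt(sigma_v ** 2 + sigma_u ** 2)
    lam = sigma_u / sigma_v
    return (2.0 / sigma) * norm_pdf(eps / sigma) * norm_cdf(lam * eps / sigma)


def log_likelihood(residuals, sigma_v, sigma_u):
    """Sum of log densities over cost-function residuals."""
    return sum(math.log(composed_error_pdf(e, sigma_v, sigma_u))
               for e in residuals)


# Numerical sanity check with illustrative parameter values.
sigma_v, sigma_u = 0.3, 0.5
step = 0.001
grid = [-5.0 + k * step for k in range(11001)]  # covers [-5, 6]
total_mass = sum(composed_error_pdf(e, sigma_v, sigma_u) for e in grid) * step
mean_eps = sum(e * composed_error_pdf(e, sigma_v, sigma_u) for e in grid) * step
expected_mean = sigma_u * math.sqrt(2.0 / math.pi)
```

In an application the residuals would depend on the cost function parameters, and the log-likelihood would be maximized jointly over those parameters and the two variance parameters.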
33 Performance of Microfinance Institutions: A Review 1315

Notable studies relying on DEA to estimate MFI-level technical efficiency
include [62, 93, 109, 122]. Across the spectrum of studies exploring efficiency
with DEA, many focus on MFIs in a particular country/region: [54] study MFIs
in Peru; [62] focus on MFIs across Latin America more broadly; [8] study
agricultural co-ops in Bulgaria; and [93] study Vietnamese MFIs, while Efendic
and Hadziahmetovic’s [49] work looks at MFIs in Bosnia and Herzegovina.
One of the more geographically comprehensive studies deploying DEA is [63]
who look at MFIs across Africa, Asia, and Latin America. A key finding is that non-
governmental microfinance institutions are the most efficient, a result consistent
with this type of MFIs’ fulfillment of their (competing) dual objectives: poverty
alleviation and maintaining financial sustainability.
An early study of MFI efficiency using SFA is [75], focusing on the Grameen
Bank. Their findings suggest that female-only branches are the most efficient among
the various Grameen Bank branches, but the average inefficiency across all branches
is in the 3–6% range, suggesting that the Grameen Bank is highly efficient to
begin with. Other recent studies include Servin et al. [123] who focus on technical
efficiency of Latin American MFIs, Hermes et al. [77] who use SFA to investigate
if a trade-off between outreach to the poor and technical efficiency of MFIs exists,
Gregoire and Tuya [60] who study efficiency of Peruvian MFIs, and Pal and Mitra
[105] who explore the linkages between number of borrowers per loan officer and
MFI asset quality (measured as portfolio value at risk) using data on 1,575 MFIs
spanning the period 2006–2013. Kendo [86] studies MFI efficiency in African
countries, using a panel of 163 MFIs over the 2004–2011 period. Bensalem and
Ellouze [21] study how the current wave of commercialization of MFIs impacts
both their financial and social efficiency using a sample of 162 MFIs over the period
2007–2013 across Africa, Asia, Latin America, and the Middle East.
Until recently, many SFA efficiency studies of MFIs, while having access to panel
data, did not fully embrace the panel nature of the data. That is, the time-constant
nature of MFI-specific heterogeneity was ignored. A step in this direction is the
“true random effects” estimator [59] deployed by [68], who study whether the presence
of a female CEO at the MFI is related to efficiency. They find MFIs with female
CEOs have significantly higher “outreach efficiency” (as measured by the number
of clients) than similar MFIs with male CEOs.

Two-Stage Analysis

In several studies of MFI efficiency, researchers have deployed what is known
as the two-stage approach [134].2 This approach estimates the cost function in
Equation (1) (or some equivalent); recovers technical efficiency scores, usually by
following the methodology of [85]; and then regresses these technical efficiency
scores on a set of MFI characteristics. This approach should be avoided at all costs.

2 Prominent examples include [21, 66, 75, 86].

Empirical results derived in this fashion are not to be trusted. The reason is
twofold. First, any set of characteristics that are omitted when the cost technology
is estimated will produce an omitted variable bias. Depending on the correlation
between traditional input prices and outputs and the omitted MFI characteristics,
these biases can be severe. This will undoubtedly impact not only the estimates
of the cost function parameters but also the subsequent residuals that are used to
produce the technical efficiency scores that feed into the second stage. Second,
even assuming away any impact of omitted variable bias, [134] have shown that
the estimated technical efficiency scores are underdispersed. This suggests that the
dependent variable in the second stage, the estimated technical efficiency scores, has
less variation than it should, which will adversely affect the quality of the estimates
in the second stage.
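One way to see why underdispersion arises is the law of total variance. The display below is a sketch under the assumption that the first-stage scores are built from the conditional mean of inefficiency given the composed error; the notation is ours and is not taken from [134].

```latex
\begin{align*}
\operatorname{Var}(u_{it})
  &= \operatorname{Var}\bigl(E[u_{it} \mid \varepsilon_{it}]\bigr)
   + E\bigl[\operatorname{Var}(u_{it} \mid \varepsilon_{it})\bigr] \\
  &\ge \operatorname{Var}\bigl(E[u_{it} \mid \varepsilon_{it}]\bigr).
\end{align*}
```

The inequality is strict whenever noise prevents $\varepsilon_{it}$ from revealing $u_{it}$ exactly, so the predicted inefficiencies vary less than the true ones.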
Given both of these concerns, studies attempting to discern the characteristics
of MFIs that improve technical efficiency should do so in a single-stage setting.
This can be done in a straightforward manner using either maximum likelihood (by
modeling the parameter $\sigma_u^2$ as a function of said determinants) or nonlinear least
squares. Nonlinear least squares is appealing because parametric
distributional assumptions are not necessary; one need only invoke the so-called scaling
property [7], which multiplicatively decomposes inefficiency into deterministic
and stochastic components. The multiplicative form of the decomposition allows
the model to be estimated without distributional assumptions, since only the mean
of the stochastic inefficiency component enters the model and can be captured as a single parameter.
Readers are directed to [106] for a textbook discussion of this issue.
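The scaling-property argument can be sketched as follows. The shorthand $c(y_{it}, w_{it}; \beta)$ for the log cost frontier, the determinants $z_{it}$, the scaling function $g$, and the parameter $\mu^{*}$ are our own notation, which may differ from that of Equation (1); the sketch also assumes that $u_{it}^{*}$ is independent of $z_{it}$ and of the frontier regressors.

```latex
\begin{align*}
u_{it} &= g(z_{it}; \delta)\, u_{it}^{*}, \qquad u_{it}^{*} \ge 0, \quad E[u_{it}^{*}] = \mu^{*}, \\
E[u_{it} \mid z_{it}] &= \mu^{*}\, g(z_{it}; \delta), \\
\ln C_{it} &= c(y_{it}, w_{it}; \beta) + \mu^{*}\, g(z_{it}; \delta) + \varepsilon_{it}^{*}, \\
\varepsilon_{it}^{*} &= v_{it} + g(z_{it}; \delta)\,\bigl(u_{it}^{*} - \mu^{*}\bigr).
\end{align*}
```

Because $\varepsilon_{it}^{*}$ has conditional mean zero, $(\beta, \delta, \mu^{*})$ can in principle be estimated by nonlinear least squares, with a common choice being $g(z_{it}; \delta) = \exp(z_{it}'\delta)$.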

Data Availability

There are several main sources of data that are used in empirical studies of MFI
performance and efficiency. First, many papers use hand-collected data from field
experiments (one prominent example is [14]).3 These data are not typically available
unless made public by either the authors of the study or via replication policies of
the journals (see the American Economic Review or Quarterly Journal of Economics
data policies, for instance).
Curated databases of various metrics for MFIs do exist, with the most prominent
being the MIX Market database (https://www.themix.org/). A free version exists
through the World Bank’s Data Catalog. The website provides information on over
2,000 microfinance institutions across all regions of the globe, dating back as far
as 1996. MIX, often referred to as the “Bloomberg of microfinance,” represents
a huge open data and transparency win for the entire microfinance industry. One
existing issue is that data at the MFI level are self-reported and some MFIs either
do not report or report incomplete information. This engenders potential concerns

3 Almost no studies of MFI efficiency collect data by hand from MFIs. Hartarska [64] is one
example that looks at performance of MFIs based on governance where the data is hand collected.

over endogeneity and self-selection when deploying this data. While a selection
bias is anticipated and MFIs reporting to the MIX Market likely represent the better
performers, [17] find that in the early years of MIX Market reporting, MFIs’ patterns
of reporting to the MIX Market are correlated with institutions’ region of operation,
mission, and size, which makes the data adequate for many purposes.4 MIX Market
data are used in numerous microfinance studies, including studies on the productivity and
efficiency of MFIs and on gender-related aspects.
Another database on MFIs is housed at the Center for Research on Social
Enterprise and Microfinance (CERSEM) at the University of Agder in Norway.5
Compared to the MIX Market data, which are used in most studies on the
performance of MFIs, the CERSEM dataset includes a representative sample of
much smaller MFIs, which are typical examples of firms working with both financial
and social goals. The dataset includes 660 MFIs around the globe. The dataset
is an unbalanced panel ranging from 1998 to 2015. CERSEM possesses several
advantages over MIX Market: (i) data are verified by professional external rating
agencies; (ii) data have less of a large-firm bias; and (iii) accurate information on
more variables is available in the CERSEM dataset allowing greater control for
important MFI characteristics.
A more specialized database, focusing exclusively on MFIs operating in Europe,
is available through the European Microfinance Network (EMN).6 This dataset
is compiled on a biennial basis, based on a survey of microfinance institutions
(MFIs) in Europe (see [28]). The MFIs surveyed by EMN self-report data mainly
about their mission, target, and social and financial impact. Currently the data
covers key institutional characteristics, outreach, social performance, and financial
performance across 444 MFIs for the period 2006–2015, 34 of which are observed
for at least 8 years in the reference time interval.
We note here that the MIX data is the dominant dataset that is used in the
literature when focusing on broad comparisons of MFIs globally. Many authors
collect data on specific MFIs in specific regions and conduct various surveys with
clientele or run field experiments to discern the impact that the MFIs are having
(e.g., [88]). However, this requires specialized skills, institutional knowledge, and
boots on the ground. The broad databases just listed here abstract from MFI-specific
issues. Moreover, recently many of the criticisms against MIX have begun to ebb
as the dataset remains up to date and publicly available, whereas both EMN and
CERSEM capture fewer underlying operational variables and are not as current as
MIX. In fact, to our knowledge, the last reporting year for CERSEM currently is
2017.

4 This is true for earlier versions; in later years of data collection as more MFIs understood that
reporting to MIX is a good way to attract investors and soft credits and grants, it has become less
so.
5 https://cersem.uia.no/dataset/
6 https://www.european-microfinance.org/

Each of the regional/global datasets discussed here has its own unique flavor,
and different researchers may have specific questions that one of the databases is
more apt to help answer. Thus, we view these datasets as complements to one
another rather than as direct competitors.

Data on Subsidies

Primarily due to data limitations, the role of the subsidy-efficiency trade-off is
still poorly understood, even though subsidies are present in about half of MFIs.
Monetary subsidies comprise pure or conditional grants, soft loans (i.e., preferential
debt issued at below-market conditions), and donations. In-kind donations include
payment of part of labor costs (usually for senior management and the board of directors),
buildings, equipment, and sponsored management information systems. Datasets
accessible to researchers contain accumulated donated equity data (from the balance
sheet), and much of the literature uses donated equity (contained in the MIX Market
data) to assess subsidies flowing to MFIs [79] or determines the subsidization
ratio as donated equity over total assets [25]. An alternative definition of subsidies
stems from the donations reported in the income statement [48]. Aside from
hand-collecting data designed to measure subsidies, to understand their effect one would
need to use the proprietary component of the MIX Market database and a variety
of adjustments made by MIX Market personnel. One such example is the work by
[40], who find relatively low levels of median subsidies, suggesting that even modest
benefits of microcredit could yield impressive cost-benefit ratios. Surprisingly, the
authors report that subsidies are skewed and most go to MFIs serving fewer poor
borrowers.
Using subsidy data is important because many empirical studies find that
ignoring subsidies leads to misleading results. It seems that subsidies have different
implications in deposit-collecting and in loans-only MFIs especially in nonprofit
institutions, where lack of subsidies may be associated with socially harmful
consequences [70, 88]. Studies that do not distinguish between business models
find contrasting results: subsidies are negatively related to outreach and
sustainability, may worsen efficiency, may increase costs over time, and may crowd out
deposit collection [5, 27, 29, 36].

Key Modeling Issues

Selection

Selection bias is a common ailment of many empirical microeconometric studies,
and microfinance is not immune to this. We see two main areas where selection
is potentially an issue for the researcher: First, are the databases that collect data
representative? Second, if we study both MFIs that offer strictly loans and those
MFIs that offer loans and deposits, are we properly accounting for the potential
choice the MFI made prior to operation to offer a specific product mix? Both of
these issues need to be seriously taken into account in one’s study of MFIs.
For example, as section “Economies of Scope” will detail for studying economies
of scope, MFIs are not randomly assigned to offer strictly loans or a mix of loans
and deposits but self-select. How this impacts the findings is an open question. Both
[72] and [44] estimate the degree of scope economies in microfinance institutions
but do not consider selection across product mix, estimating a single cost function
for all MFIs regardless of financial services. Malikov and Hartarska [96] go a step
further and directly model MFIs’ selection of the financial services they offer using
the approach of [92].
Cozarenco et al. [36] build on the insights of [96] by modeling selection while allowing the
technology of MFIs to vary across subsidization status as well as across the decision to
offer strictly loans or loans and deposits. This is an important step in understanding
the performance of MFIs.

Measurement Error

An issue that has received only marginal attention, but is important nonetheless,
is the quality of the data being used in the empirical literature. As we noted in
section “Data Availability”, the majority of data for studies of MFIs are collected
either through surveys and field experiments on the ground or through various
agencies, with MIX Market being the most popular database. MIX Market data has
been self-reported in the past, and this leads to potential concerns over the accuracy
of the data that is reported.
One aspect of the database that can be exploited to investigate concerns over
data quality is the diamond rating.7 MIX Market classified MFIs into diamonds
according to the availability of data provided by the MFIs. Diamonds range from 1
to 5 with higher numbers indicating greater transparency and (potential) reliability
of data. Al-Azzam and Parmeter [6] conduct their analysis of competition among
MFIs using all of the available data (diamond ratings 1–5) against a subset using
only MFIs with diamond ratings of 4 and 5. Their qualitative findings were identical,
thus allaying potential concerns that measurement issues were impacting their
findings.
We note that the original purpose of the diamond rating was not to capture
measurement error via the quality of the data but whether the MFIs provide complete
information on all indicators that they were asked to provide. Thus, for example,
an MFI that only provided data on the number of borrowers and dollar value of
their loan portfolio but not specific cost data would be rated as 1, while an MFI
that provided all necessary data would get a 5 rating. It does not necessarily mean
that the data reported are correct. However, MFIs that take the time to report on

7 We note that the diamond data are no longer available via the World Bank MIX Market database.

all aspects that were asked of them could be assumed to more reliably report those
numbers than MFIs that only report a limited set of measures.
Using historical MIX Market data with the diamond ratings is a strategy that
could be deployed in future analyses of MFI performance. While removing observations
may lead to sizable reductions in the sample size, doing so is useful for determining
how much influence data quality/reporting may have on one’s empirical results.
A related issue that has yet to receive attention in the performance evaluation
literature of MFIs is that of missing data. The common approach is to drop obser-
vations where any of the key variables are missing (costs, outputs, determinants of
inefficiency, etc.). However, this may have an undue influence on the analysis if
these variables are not missing at random.

Loans Versus Savings and Loans

A more recent issue that has arisen in the study of MFI performance is the (potential)
difference in those MFIs that offer purely microcredit and those that both offer
microcredit and take deposits. Inherently there are differences in these types of
MFIs both from the outreach perspective and their governance. While it is widely
acknowledged that savings are important to allow consumption smoothing over
income shocks that are likely to arise, the poor find it hard to save, chiefly because
resource scarcity often combines perversely with behavioral biases.
Cozarenco et al. [37] is one of the first studies to look at the separate characteris-
tics of MFIs that offer micro-savings and those that do not, while previous work has
evaluated the savings-collecting subgroup independently, for example, estimating
efficiency of cooperatives [69] or of municipal banks [94]. Armendariz de Aghion
and Morduch [11] note that micro-savings are often considered as the “forgotten
half,” hence the lack of academic attention. However, the regulatory environment
also plays a key role in the availability, or lack thereof, of micro-savings.
To protect clients of microfinance services, regulators typically create barriers that
make it costly for many MFIs to provide micro-savings accounts. Christen et al.
[32] point out that this reduces access to these products, so that a much
smaller fraction of the impoverished is reached by MFIs.
Micro-savings deposits come in two forms: compulsory and voluntary. Compulsory
savings represent “hidden collateral” [9], while voluntary savings represent
the true demand for micro-savings. To understand more of the underlying factors
that drive MFIs to take voluntary micro-savings, [37] estimate random effects probit
models using MIX Market data. Their summary statistics suggest that
MFIs taking voluntary savings are not significantly different in terms of financial
performance from those MFIs that do not take voluntary micro-savings but perform
worse in both outreach and the percentage of women served. One of their main
(unsurprising) findings is that MFIs collecting voluntary micro-savings are older.
This is because taking micro-savings requires regulatory compliance, which
is typically not feasible for new/young MFIs. Another important finding is that
subsidies crowd out micro-savings, hampering outreach efforts of MFIs.
The study of [37] is important for future research for several reasons. First,
the findings are useful to understand how better to promote micro-savings gener-
ically. Second, as we previously discussed, selection issues related to product
diversification are likely to arise in empirical work, and understanding what drives
the decision to offer micro-savings is key in specifying selection mechanisms.
Third, the performance of MFIs is likely dependent upon the product mix, and
so understanding decisions of both voluntary and compulsory micro-savings is
prudent prior to making comparisons across different MFIs. Fourth, the impact of
subsidies is an important area within MFI performance, and their study documents
an important result on the effects of subsidization on MFIs.

How to Quantify Outputs

When considering a financial institution, both number of accounts and volume of
accounts could be used as measures of output. However, the prevailing banking
literature uses volume of loans and deposits as banks’ outputs. This is for several
reasons. First, it has been argued that the intermediation approach [121] better
describes what a banking institution does and presumably achieves a more apt
description of the operating environment [55]. Second, variables which measure the
number of accounts are difficult to come by in existing banking databases, and, when
available, these data are viewed as unreliable. It has been argued, however, that in
some financial firms, output is better measured by the total number of accounts and
transactions as opposed to volume or value [129].
Given many MFIs’ stated outreach mission to reach more clients, microfinance
studies have argued that the number of clients or even the overall level of clientele
poverty, rather than the monetary equivalent of accounts and profit, is an appropriate
measure to gauge MFI performance [29]. A goal of outreach for MFIs is to service
the largest number of borrowers with small loans; thus the volume of loans and
the number of loans can/should affect the productivity of these institutions. Caudill
et al. [29] report that their results do not differ substantially from those obtained when only
the number of clients is used to measure lending. As such, subsequent studies have
used number of loans and clients to measure MFI performance through efficiency
[65] as well as scale and scope economies [72, 74].8
A detailed comparison of the impact of using volume versus number on MFI per-
formance was conducted in [73]. Their empirical work studies possible differences
in both the mean and the overall distribution of estimated scope economies using
either total accounts held by the MFI or the total dollar value of these accounts as
output measures within a semiparametric quadratic cost function. They also conduct
several tests to determine how different their estimates of scope economies are not
only at the mean (median) but also on the overall distribution of estimated scope

8 Hartarska et al. [68] use the number of clients (borrowers and savers) which equals the number
of borrowers in lending-only MFIs but captures better the output of savings-and-loans MFIs.

economies. Their main conclusion for future empirical work is that qualitatively
their estimates of MFI performance (scope economies) are similar both globally
and within specific regions. Some caution with this result is needed as their tests of
distributional equality suggest statistically meaningful (though economically small)
differences.
While [73] found that differences in their estimates exist, no clear-cut measure
of output emerged as the definitive measure. There are two implications of this for
empirical work. First, it may be preferable to use the active number of clients as
the appropriate measure of output for an MFI since this better reflects the outreach
mission statement. Second, given that access to reliable data on the number of active
borrowers or active savers is not typically available for the majority of datasets, or in
the case of the MIX Market database, for a select subset, their findings suggest that
the use of dollar values should still provide similar insights. Researchers assessing
performance of MFIs should attempt to use number of clients or loans and not the
dollar value if at all possible (though recognizing that data limitations may prevent
this in all cases).

Main Findings

Returns to Scale

Generically, returns to scale can be computed directly from the cost function
parameter estimates using $RTS = \left( \sum_{m=1}^{M} \partial \ln C / \partial \ln y_m \right)^{-1}$ for the M distinct outputs (here
this would most likely be volume or number of loans and possibly volume or number
of deposits).9 The importance of RTS as a performance measure is that this can yield
important insights into the overall health of the industry (either globally, within
a region, or within a given state/country). Increasing returns to scale suggest that
(proportional) cost reductions can be had while simultaneously increasing output
(which has connections to outreach).
The majority of existing studies find increasing returns to scale (IRS) suggesting
that the industry is progressing towards optimal size. This also presages future cost
savings as MFIs continue to grow. However, the finding of IRS is not fully robust.
For example, when outputs are measured as the number of active clients rather than
the volume of loans (or both), or when subsidies are properly accounted for, constant
and even decreasing returns to scale have been found for MFIs in both Eastern
Europe and Latin America [30, 36, 74].
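As a minimal sketch of the returns-to-scale calculation, the code below evaluates output cost elasticities for a translog-style cost function with two outputs and converts them into RTS, optionally adjusting for a quasi-fixed input. The coefficient values and function names are hypothetical, and the interactions between outputs and input prices are omitted for brevity, so this is an illustration of the formula rather than the specification used in any cited study.

```python
def output_elasticities(ln_y, alpha, alpha_mat):
    """Cost elasticities d ln C / d ln y_m for a translog in outputs.

    ln C = a0 + sum_m alpha[m] * ln_y[m]
              + 0.5 * sum_m sum_n alpha_mat[m][n] * ln_y[m] * ln_y[n] + ...
    with alpha_mat symmetric, so the elasticity with respect to output m
    is alpha[m] + sum_n alpha_mat[m][n] * ln_y[n].
    """
    n_outputs = len(ln_y)
    return [
        alpha[m] + sum(alpha_mat[m][n] * ln_y[n] for n in range(n_outputs))
        for m in range(n_outputs)
    ]


def returns_to_scale(elasticities, quasi_fixed_elasticity=0.0):
    """RTS as the inverse of the summed output elasticities.

    A nonzero quasi_fixed_elasticity (d ln C / d ln e) applies the
    adjustment for a quasi-fixed input.
    """
    return (1.0 - quasi_fixed_elasticity) / sum(elasticities)


# Hypothetical coefficients; outputs are normalized so that ln y = 0 at
# the sample mean, where the elasticities reduce to alpha.
alpha = [0.5, 0.3]
alpha_mat = [[0.02, -0.01], [-0.01, 0.04]]

elasticities_at_mean = output_elasticities([0.0, 0.0], alpha, alpha_mat)
rts_at_mean = returns_to_scale(elasticities_at_mean)
rts_quasi_fixed = returns_to_scale(elasticities_at_mean,
                                   quasi_fixed_elasticity=0.1)
```

Values of RTS above one indicate increasing returns to scale at the evaluation point, values below one indicate decreasing returns, and a value of one indicates constant returns.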

9 When one includes quasi-fixed inputs, e, then $RTS = \left( 1 - \partial \ln C / \partial \ln e \right) \Big/ \sum_{m=1}^{M} \partial \ln C / \partial \ln y_m$.

Economies of Scope

Another important metric that has been assessed broadly is economies of scope (also
known as economies of diversification). The industry has trended away from solely
lending activities into one that offers both savings and loans. As noted in [96, pg.
162], in the early 2000s roughly one third of MFIs offered both savings and loans,
whereas a decade later, over half of all MFIs were offering both products to their
clientele, a near 100% increase. This reflects the growing demand of the poor for
expanded financial services [34].
As mentioned earlier, with deposits comes additional oversight, which can be
costly. Thus, MFIs need to be sure that the regulatory burden is offset by offering
deposits. This is most commonly achieved through economies of scope. Economies
of scope can emerge from two different sources. First, cost reductions can be had
by distributing fixed costs across the product mix. Allocating fixed costs over the
services an MFI offers can contribute to scope economies when excess capital
capacity is reduced by providing both savings and loans rather than just by offering
loans. Second, cost complementarities may exist between different products. These
complementarities are derived when consumer information developed in the pro-
duction of either savings or loans is used to reduce the monitoring requirements of
the other product.
Scope economies for an MFI that offers loans ($y_1$) and deposits ($y_2$) exist,
in the traditional sense [18], if $C(y_1, 0) + C(0, y_2) > C(y_1, y_2)$. Here we have
simplified our notation from Equation (1) to make the discussion simpler. Pulley
and Braunstein [113] suggest the estimation of quasi scope economies, while
[114] define a normalized version of this criterion to assess scope economies.
This measure does not restrict the calculation of scope economies to the case of
perfectly specialized output, the counterfactual situation where one output is set to
zero. In settings where all firms offer both products, this is a useful tool to have
to approximate the appropriate curvature of the cost function. However, for MFIs,
where many MFIs do not offer deposits, and virtually no MFIs are specialized to
offer only deposits, this is not an issue.
In fact, a more pressing concern for the estimation of scope economies for MFIs
is the opposite: the required counterfactual extrapolation of $C(y_1, y_2)$ to the case of a
single output, known as the “excessive extrapolation” problem [51, 80]. Malikov
and Hartarska [96] deploy a modified version of the traditional measure of scope
economies accounting for excessive extrapolation. They do this in two ways. First,
rather than estimate a single cost technology for all MFIs, they estimate separate
technologies for loan-only and savings-and-loan MFIs. The estimation of the
loan-only cost technology yields $C(y_1)$ as opposed to $C(y_1, 0)$. Second, rather than shift
$y_1$ to 0 in the counterfactual analysis of the cost technology for the savings-and-loan
MFI, they down-weight $y_1$ by an amount such that each counterfactual MFI does not
produce less of each output than what appears in the data. Thus, the weight, $\omega$, is
such that $\omega y_1 \geq \min y_1$ (for loan-only MFIs) and $(1 - \omega) y_1 \geq \min y_1$ (for
savings-and-loan MFIs). Economies of scope are said to exist if
$C(\omega y_1) + C((1 - \omega) y_1, y_2) > C(y_1, y_2)$. Following [114], economies of scope can be calculated as

\[
ES(\omega) = \frac{C(\omega y_1) + C((1 - \omega) y_1, y_2) - C(y_1, y_2)}{C(y_1, y_2)}. \tag{3}
\]

The natural question here is what value to select for $\omega$, as there will exist an
admissible region of values that ensure above-minimum production of $y_1$. Malikov
and Hartarska [96] suggest a measure of global economies of scope, $\min_{\omega} ES(\omega)$.
When this smallest value of scope economies is positive, one can
conclude that economies of scope exist in the production of savings and loans.
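The calculation of the scope measure over the admissible weights can be sketched as follows. The two cost functions, their coefficients, the output levels, and the sample minima are purely hypothetical stand-ins for estimated loan-only and savings-and-loan technologies; only the structure of the calculation follows the description above.

```python
import math


def loan_only_cost(y1):
    """Hypothetical cost technology of loan-only MFIs, C(y1)."""
    return 5.0 + 0.5 * y1


def joint_cost(y1, y2):
    """Hypothetical cost technology of savings-and-loan MFIs, C(y1, y2)."""
    return 8.0 + 0.5 * y1 + 0.3 * y2 - 0.02 * math.sqrt(y1 * y2)


def scope_economies(omega, y1, y2):
    """ES(omega): relative cost saving from joint production."""
    split = loan_only_cost(omega * y1) + joint_cost((1.0 - omega) * y1, y2)
    together = joint_cost(y1, y2)
    return (split - together) / together


def admissible_omegas(y1, min_y1_loan_only, min_y1_joint, n_points=71):
    """Grid of weights keeping both counterfactual MFIs within the data.

    Requires omega * y1 >= min_y1_loan_only and
    (1 - omega) * y1 >= min_y1_joint.
    """
    lower = min_y1_loan_only / y1
    upper = 1.0 - min_y1_joint / y1
    if lower > upper:
        return []  # no admissible split for this MFI
    step = (upper - lower) / (n_points - 1)
    return [lower + k * step for k in range(n_points)]


# Hypothetical savings-and-loan MFI and sample minima of y1.
y1, y2 = 100.0, 50.0
omegas = admissible_omegas(y1, min_y1_loan_only=10.0, min_y1_joint=20.0)
es_values = [scope_economies(w, y1, y2) for w in omegas]

# Global measure: the smallest ES over the admissible region.
global_es = min(es_values)
omega_at_min = omegas[es_values.index(global_es)]
```

A positive value of global_es means that joint production is cheaper than every admissible split of lending across the two types of institution, which is the conservative criterion described above.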
One issue that the approach of [96] cannot deal directly with is the further
decomposition of economies of diversification into separate components for fixed
and complementary costs. The seminal work of [72], while potentially suffering
from excessive extrapolation, can investigate economies of scope broadly, as well as
decompose scope economies into separate pieces due to fixed and complementary
costs. This is due to the specific functional form that they specify and their use
of a semiparametric smooth coefficient model [76, 95]. This model is also used
in [44]. Knowledge of the magnitude of both of these components of economies
of diversification is important; as [72, pg. 391] note, “... cost complementarities
accrue to MFIs if the account information that is developed in the process of creating
deposits is subsequently used to help monitor and gather credit information on loans
for the same customer base. Spreading fixed costs over an enhanced product base
produces scope economies if the same set of tools required to manage deposits can
also be used to produce and monitor loans.”
Turning to results from applied papers focusing on economies of scope,
the first in this area were [72, 73]. More recent studies using a variety
of sophisticated econometric tools include [44, 96, 97]. Nearly all of the studies
investigating scope economies have found that they indeed exist; however, there
is substantial heterogeneity in the range of estimates, and earlier studies, using
standard methods (i.e., not accounting for excessive extrapolation), find much higher
levels than newer studies that take this into account. More specifically, it appears that
estimated economies of scope are modest for most small-size MFIs but can be quite
substantial for large-scale institutions.

MFI Heterogeneity

Many studies of MFI performance have documented substantial heterogeneity
across MFIs, which is not surprising. Differential performance has been observed
regionally [96], across MFI type [72], by area served (namely, rural versus urban
versus both as in [44]), by whether MFIs are subsidized [36], and by the composition of the
board of an MFI [68].
The work of [61] is also a recent example illustrating the diverse heterogeneity
that exists when assessing MFI performance. They study the impact of government
ideology on MFI sustainability, documenting diversity across left-wing and right-
wing ideologies of the governments in places where the MFIs operate. Gul et al. [61]
find that MFIs operating in a left-wing regime have higher portfolio growth rates and
lower funding, operating, and default costs. It appears that the electoral incentives
of left-wing governments impair the capacity of MFIs to increase financial revenue.
Returning to the issue of scope economies, [72, Table 2] find substantial
differences in economies of scope (generally) as well as fixed and complementary
costs which vary heavily by region. Average scope economies are 0.245 in MFIs
operating in Africa, whereas those in Latin America have scope economies on
average of 0.105. These stark differences suggest that regional heterogeneity is an
important component to understanding MFI performance.

Women’s Impact on MFIs

A majority of an MFI’s clientele are women, who typically benefit from smaller
loans than men [11]. A large literature has documented links between the gender of
the microfinance borrowers and the performance of the MFIs. Agier and Szafarz [1]
compare denial rates and loan sizes for male and female applicants with similar
expected creditworthiness at an MFI located in Rio de Janeiro to check for disparate
treatment linked to taste-based discrimination. This is an important first step in
understanding how MFIs perform based on gender given that MFIs commonly
offer fixed-interest loans and loan sizes are tailored to the expected creditworthiness
of the applicant [101]. This lending approach is derived from the need to keep
operating costs at a minimum and to help paper over any inefficiencies that may
exist. D’Espallier et al. [46] find that a higher percentage of female clients in MFIs
is associated with lower portfolio risk, fewer write-offs, and fewer provisions, all
else equal. More recently, [33] finds evidence of a trade-off between sustainability
and outreach depth, suggesting that, as women are poorer, fewer women would be
reached within a sustainable model.
Explicit research looking at potential differential effects of gender on performance
includes [16, 23, 46, 50, 68, 87]. Specifically, [46] find that a focus on women
in MFIs is significantly related to smaller loan sizes within the portfolio, the use
of collective lending methods, nonprofit status, and a broader orientation. Again,
the smaller loans lead to higher operating costs, but women also have lower default
rates, which may offset these costs. This balance is an important component in the
overall assessment of gender on MFI performance, and it connects to Hartarska and
Parmeter’s [71] recent work documenting economically meaningful differences in
cost elasticities from loan provision based on gender.

Governance and Performance

Another important area of focus on MFI performance is the role that leadership
plays. A variety of studies have evaluated how the gender composition of
the board of directors and upper management relates to MFI performance. At present, this
research has produced mixed results [66, 68]. Recent work points out that the ability of female
managers and credit boards to deliver depends on the gender composition and
matching of credit boards and managers [110]. Research also suggests differences
in efficiency in MFIs run by female CEOs [68], while female loan officers have
been documented to achieve distinctively different results with men and women
clients [19].
An early study on MFI performance and governance is [99]. They investigate
the impact of various governance mechanisms on both of the key tenets of MFIs’
mission platform, outreach and financial sustainability. They identify three main
channels: vertical, horizontal, and external. The vertical dimension consists of
the link between owners and management, the horizontal dimension captures the
link between the MFI and the customer, and the external dimension represents that of governance
itself. Mersland and Strøm [99] present an array of econometric evidence: (i)
financial performance improves when an internal auditor is present and also when
the directors are from the local area as opposed to international directors; (ii)
ownership type has no effect on performance; (iii) female CEOs lead to better
financial performance of the MFI; and (iv) few governance variables appear to
influence MFI performance, either statistically or economically. One finding that
deserves more attention, but has yet to receive it, is why a female CEO of an MFI
does not lead to broader outreach (see Table 6 of [99]).
Hartarska and Mersland [66] show that efficiency increases with board size
initially but then reaches a saturation point (their estimate is nine members)
and harms efficiency thereafter. They also demonstrate that donors’ presence on
the board is harmful for efficiency in reaching many poor clients. Lastly, [66]
do not find strong evidence of an effect of competition or the regulatory environment on MFI
performance/efficiency.
Looking specifically at governance and gender, many studies have found gender
effects on performance. Examples include Boehe and Cruz [24] who find that
female membership improves an MFI’s performance through enhanced debt repayment,
Strøm et al. [127] who find enhanced female presence on the board to be
positively related to MFI performance, and [23] who estimate a positive effect of
female loan officers on technical cost-efficiency of an MFI.

Outreach and Mission Drift

One benefit of subsidization of MFIs is that it can alleviate financial sustainability
concerns, promoting greater attention towards meeting social goals. As an example,
[45] show that when NGOs transform into commercial microfinance banks, their
reliance on subsidized funds declines in favor of deposits and commercial debt.
While these MFIs’ long-term profitability improves, it comes at the expense of
sharp increases in average loan size, which directly corresponds to diminished
outreach.
The forward-facing side of all MFIs is to fight against poverty (outreach).
However, empirically, one pertinent issue when assessing MFI performance and
outreach is how to measure both poverty and access to microcredit. The standard in
the literature [10, 38] is average loan balance per borrower or the total number of
loans or deposits taken. The smaller the average loan size, the deeper the outreach;
the higher the number of loans given out, the deeper the outreach.
Bos and Millone [27] find that disbursing larger loans implies a lower yield
on the gross loan portfolio and is correlated with higher personnel and financing
costs. Further, for their data, NGOs have lower costs per loan. Bos and Millone [27]
document that some MFIs can indeed combine the depth and breadth of outreach
and operate with above average levels of technical efficiency. Yet, they also find that
efficiency quickly decreases as the loan portfolio becomes larger.
More broadly, the literature points to the trade-off between outreach and financial
efficiency [39, 77, 116], and this suggests that there may be linkages to the loan
officers. Indeed, [119] focus on the role of microloan officers in the provision of
microfinance services by studying the preferences of microloan officers over loan
allocation. They examine whether, when given a choice, microloan officers select
the less advantaged client which is consistent with the stated outreach mission of
an MFI. This allows them to study the (potential) trade-off between better financial
results and achieving social goals. Similarly, [20] find that more experienced loan
officers serve fewer vulnerable clients.
The news is not all bad, however. Bos and Millone [27] also document that
MFIs which specifically target the poor, lend at a higher frequency to women,
and provide educational programs are more technically efficient. This finding runs
contrary to several earlier studies looking at outreach and MFI performance. This
result also builds on the work of [100], who find no evidence of mission
drift, suggesting that the profit motive incentivizes MFIs to seek out new markets
and to become more efficient [31]. See also [98] for more insights on mission drift
of MFIs.
Quayes [115] highlights that during periods of financial distress, there is
a potential for trade-offs between outreach and financial sustainability. Wagner
and Winkler [133] find credit growth and thus MFIs’ ability to reach the poor
dropped sharply after 2008. MFIs in countries with better institutional quality (more
advanced financial systems) were more resilient to the global financial crisis [124],
and MFIs in general were more resilient than traditional banks [132]. In line with
that, [135] finds differences in the reaction of MFIs’ productivity to the shock
of the 2008 financial crisis based on ownership type and organizational structure,
with microfinance banks and microfinance non-bank financial institutions suffering
the most, while non-government organizations (NGOs) and cooperatives were least
affected.

The Role of Risk

In the microfinance industry, the focus has been on default risk (or credit risk), mea-
sured by the proportion of loan portfolio overdue more than 30 days (also known as
portfolio at risk overdue more than 30 days or PAR30) since loans are mostly short
term in nature. Within the structural approach to efficiency, such risk is essential
to control for when modeling the cost structure of financial institutions, because
lower-quality assets (reflected in a higher non-performing loans ratio) require more
resources to manage a higher level of risk exposure, thereby raising the costs for MFIs.
Failure to account for riskiness/quality of loans may therefore produce misleading
results (e.g., see [81]). Thus, credit risk has been used consistently in most efficiency
and productivity studies (starting from [27, 29, 44, 65, 68, 74, 96, 97, 99]).
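The PAR30 measure described above can be made concrete with a short sketch. The loan records and field names below are hypothetical; the calculation follows the usual convention of dividing the outstanding balance of loans more than 30 days overdue by the total outstanding portfolio.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Loan:
    outstanding_balance: float  # remaining principal on the loan
    days_overdue: int           # days since the oldest missed payment (0 if current)


def portfolio_at_risk(loans: Iterable[Loan], threshold_days: int = 30) -> float:
    """Share of the outstanding portfolio held in loans overdue more than threshold_days."""
    loans = list(loans)
    total = sum(loan.outstanding_balance for loan in loans)
    if total <= 0:
        raise ValueError("portfolio has no outstanding balance")
    at_risk = sum(
        loan.outstanding_balance
        for loan in loans
        if loan.days_overdue > threshold_days  # strictly more than the threshold
    )
    return at_risk / total


# Hypothetical three-loan portfolio: only the 45-day-overdue loan counts toward PAR30.
example_portfolio = [Loan(500.0, 0), Loan(300.0, 45), Loan(200.0, 30)]
```

Note that the entire outstanding balance of an overdue loan is counted as at risk, not only the missed installments, which is why the measure is often treated as a conservative indicator of portfolio quality.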
Credit risk is essential to monitor because repayment problems among a few
microfinance clients may quickly spread to many clients, leading to a “borrowers’
run” [26], as in the case of Bolivian MFIs between 1996 and 2000 [131].
The role of credit risk has been explored in the context of geographic diversification,
showing that geographic diversification comes with more credit risks, attributable to
difficulty of monitoring remote operations especially in NGOs and in cooperatives
[138].
The overall risk level of the MFIs themselves has been evaluated in the context of
ratings of MFIs. Ratings of MFIs are done by five microfinance rating agencies and
cover a wider range of categories than is common for traditional banks, including
outreach, ownership, regulation, governance, clients, and financial products as well
as financial information. MFIs obtain a rating to signal to investors and donors
their quality and to raise more funds. Yet, rating by only a few selected MFI rating
agencies was found to help MFIs raise funds [67].
In addition to credit risk, MFIs also have financial risk (e.g., liquidity risk, market
risk – including interest rate risk and foreign exchange risk, investment portfolio
risk, and capital adequacy risk) and the risk of overall failure. These risks have
been evaluated in relation to MFI transparency and governance. Better transparency
and governance reduce financial risks in nonprofit MFIs; transparency seems to be
associated with increased credit risk, while the presence of insiders and international
directors is associated with increased failure risk [57].

Competition

While microfinance has operated as a mechanism to alleviate poverty, it has received
substantial support from donors, social investment funds, NGOs, and subsidies.
Over time, the increase in opportunities to turn profits has gradually moved
microfinance into the provision of financial services to the poor on a commercial
basis. This commercialization has brought with it competition following the entry
of for-profit MFIs and the transformation of many NGO-backed MFIs into for-
profit MFIs. As such, competition has gradually become an important facet of the
microfinance industry [112].
A variety of studies have begun to investigate the performance of MFIs based on
competition as well as their profit status. The existing body of empirical studies that
examine the impact of competition on interest rates has produced inconclusive and
counterintuitive results. Baquero et al. [15] examine the impact of competition on
interest rates and portfolio quality for both nonprofit and for-profit MFIs using the
Herfindahl-Hirschman Index. The authors find that in less concentrated markets,
for-profit MFIs charge lower interest rates and have better portfolio quality and
that nonprofit MFIs are insensitive to changes in market concentration. Mersland
and Strøm [99] construct a measure of competition based on the challenge of
competition as perceived by the MFIs and conclude that interest rates respond
positively to higher levels of competition. Distinguishing MFIs by profit status
and using the number of MFIs active in each country in 2009 as a measure of
competition, [117] finds that competition among nonprofit MFIs reduces interest
rates, while competition among for-profit MFIs increases interest rates. See [12] for
a theoretical discussion of interest rates and over-indebtedness for MFI clients.
Al-Azzam and Parmeter [6] study 1,997 MFIs between the years 2003 and
2016 using three different measures of competition (a Lerner Index, a Herfindahl
Index, and a geographical indicator) to evaluate the impact of competition on the
interest rate(s) charged by MFIs. Their results for the geographic indicator and the
Lerner Index display a consistent impact of competition on interest rates: increased
competition reduces interest rates charged by both for-profit and nonprofit MFIs.
This reduction in interest rates implies a greater concern for cost-efficiency as long-
term viability of MFIs is impacted if returns decrease (as measured through interest
rates) and inefficiencies remain.
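Two of the competition measures mentioned above have simple textbook definitions, sketched here for orientation. This is not the construction used in [6] or [15]: in applied work the marginal cost entering the Lerner Index has to be estimated from a cost function, and the market over which shares are computed has to be defined. The function names and any inputs are hypothetical.

```python
from typing import Sequence


def herfindahl_index(sizes: Sequence[float]) -> float:
    """Sum of squared market shares; 1/n for n equal-sized firms, 1.0 for a monopoly."""
    total = sum(sizes)
    if total <= 0:
        raise ValueError("total market size must be positive")
    return sum((size / total) ** 2 for size in sizes)


def lerner_index(price: float, marginal_cost: float) -> float:
    """Price-cost margin (P - MC) / P; 0 under marginal-cost pricing, higher with more market power."""
    if price <= 0:
        raise ValueError("price must be positive")
    return (price - marginal_cost) / price
```

For example, three MFIs with loan portfolios of 50, 30, and 20 (in any common unit) have shares of 0.5, 0.3, and 0.2 and a Herfindahl Index of 0.25 + 0.09 + 0.04 = 0.38. A lower Herfindahl Index indicates a less concentrated market, and a lower Lerner Index indicates less pricing power, so both fall as competition intensifies.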
As [118, pg. 1] note, “. . . there is widespread agreement, within the industry at
least, that in most situations MFIs ought to pursue financial sustainability by being
as efficient as they can and by charging interest rates and fees high enough to cover
the costs of their lending and other services.” This ties in with the work of Ghosh
and Van Tassel [58] who, assuming that MFIs vary in their operations, find that
competition over external funds can lead to higher aggregate poverty reduction. This
arises since the payment of higher returns for external funds forces higher interest
rates; it also redirects funds from inefficient MFIs to efficient MFIs. See also [47].

Future Directions

The business model for an MFI is quite challenging by definition. If success
were possible with standard approaches to banking, there would be no need for
microfinance to begin with. As with any industry, there is evolution. As firms evolve,
their performance is an important issue to focus on. The microfinance industry
has certainly evolved, migrating from a heavily subsidized industry which offered
microcredit to one with many banks being self-sufficient, offering loans and taking
deposits.10
As the dual missions of profitability and social outreach come into conflict with one
another, efficiency of MFIs is paramount to understand. For example, a new frontier
in the study of MFI performance is the impact of digital payments and mobile
money, both of which have the potential to augment the standard method of business
for MFIs, allowing greater reach to the poorest customers. This area has yet to be
rigorously studied.

10 This is not to say that subsidization does not exist, however; [40] find that subsidies are still
pervasive in the industry, representing on average 13 cents per dollar lent across all MFIs.

Another important aspect of the business model of MFIs is the impact that
women have. A large majority of MFIs have clientele, be it through savings or loans,
composed primarily of women. The poverty alleviation mission of many MFIs is
consistent with specifically targeting women because more women than men are
poor and because women are poorer than men [137]. While research exists on the
impacts of microfinance on a variety of aspects of women’s lives (empowerment,
health status, education of children), as well as on how adeptly MFIs themselves
are able to meet their missions to serve women [98], little is known about the cost
consequences of serving more women versus men. An important question to ask
is how costly it is to serve women versus men. Would MFIs elect to serve more
women because women have better repayment rates or stronger social networks? If
this is true, how might it be reflected in MFIs’ cost structure? These are insightful
but unanswered questions. A crucial issue to study moving forward is understanding
why, although the returns to the investments undertaken by women are, in general, much
lower than those to the investments undertaken by men, millions of poor women
successfully repay high interest rate loans (for one study of this, see [50]).
Properly embracing the panel nature of the available MFI data represents another
important direction for empirical research. This entails breaking down efficiency
into two distinct components, one a persistent effect that is constant over time and
another that is time-varying. These two disparate components can shed light onto
various performance aspects of MFIs and are important from a policy perspective
as each yields different implications. Colombi et al. [35] refer to time-varying
inefficiency as short-run inefficiency and mention that it can arise due to failure in
allocating resources properly in the short run. A variety of methods exist to estimate
models of this nature; see [90] for a simple OLS application, [35] for maximum
likelihood, and [53] for a simpler simulated maximum likelihood approach. These
new methods should be of interest to applied researchers studying the performance
of MFIs.
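To fix ideas, the kind of specification these methods target can be written as a four-component cost frontier. The notation below is an assumed sketch for illustration, not the exact parameterization of [35], [53], or [90]:

```latex
\[
\ln C_{it} = \alpha + x_{it}'\beta + \mu_i + v_{it} + \eta_i + u_{it},
\qquad \eta_i \ge 0, \quad u_{it} \ge 0,
\]
% C_{it}  : total cost of MFI i in period t
% x_{it}  : (logged) outputs and input prices
% \mu_i   : time-invariant MFI heterogeneity (not inefficiency)
% v_{it}  : two-sided statistical noise
% \eta_i  : persistent (long-run) inefficiency, constant over time
% u_{it}  : time-varying (short-run) inefficiency
```

Because this is a cost frontier, both inefficiency terms enter with a positive sign: each raises observed cost above the frontier. Separating η_i from μ_i is what distinguishes this setup from a standard panel model with a single firm effect, and it is the reason distributional assumptions or multi-step procedures are needed for estimation.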
The implications of economies from (geographic) market and product diversification
are yet to be identified. Consider rural-urban market diversification. Given that
MFIs continue to serve more clients in urban than in rural markets [68], we can ask
if specialization is less costly or if there are cost benefits from serving both markets.
Lending only in rural areas may have higher transaction, screening, and monitoring
costs associated with lower population density, underdeveloped infrastructure, and
limited entrepreneurial opportunities [29, 43, 56]. Dependence on agriculture leads
to mismatch between borrower repayment capacity and the frequent repayments
structure of a typical microfinance contract [11]. Yet, diversifying by serving
both rural and urban markets could capture important remittances flows and even
lower costs if seasonality of loans and savings in rural markets is compensated
for. Previous work on scope economies from savings and lending finds economies
associated with sharing fixed costs (i.e., infrastructure), likely very important in
low-density rural markets. Hartarska et al. [72] find no cost complementarities
associated with learning from savers that can be useful to lending and vice
versa. If future work finds diseconomies of market diversification, that is, if costs are
lower for MFIs specializing in rural-only or urban-only markets, this would have
important policy implications for donors, investors, and the larger microfinance
community.
A related issue is the question about (dis)economies from product market
diversification. This question can be addressed only now as the MIX Market dataset
has started to collect such information. The market and product cost economies can
be addressed separately or together. The nature of the lending technologies effective
in urban and rural areas may be sufficiently different so that the overall costs of
an MFI operating in both rural and urban markets do not decrease. For example,
to address adverse selection and moral hazard issues in rural markets, MFIs have
successfully used joint liability – group loans and village banks [4, 13, 120].
In both rural and urban markets, MFIs also use dynamic incentives – individual
contracts with progressively increased loan size [83, 89]. Thus, with the improved
scope and scale economies’ methods and specifications described in this chapter,
research can identify the cost implications from product and market diversification.
In other related financial industries such as banking and insurance, benefits from
specialization in one output over joint production of outputs are found and attributed
to the external environment in which these institutions operate, which is also an important issue
in the measurement of productivity, as this chapter shows [22, 81, 82].
We hope this chapter serves to illustrate the large literature that exists exploring
the performance of microfinance as well as to demonstrate that more work can and
should be done moving forward. Microfinance is a key avenue to mitigate poverty
worldwide, and the success of this industry is paramount for bringing individuals
and families out of poverty and expanding their ability to engage with the greater
economy.
One unstudied area on MFI performance is adverse incorporation. Howson [78]
highlights the impact that MFIs can have on adverse incorporation, emphasizing the
causal interaction of unequal power relations and the terms of access to state, mar-
ket, community, and household resources in perpetuating rural poverty. Howson’s
[78] study does not investigate performance of MFIs, only their potential impact and
place in adverse incorporation. More work in this area may help to further pin down
various outreach impacts of MFIs and overall financial performance.
Finally, the ideas in [84] represent another dimension along which to explore MFI
performance. They note that the majority of empirical work studying MFIs cannot
assess general equilibrium effects. While their research does not focus on MFI
performance, building the dual outreach/sustainability platform into their work
would be a useful extension.

Cross-References

Aggregation of Efficiency and Productivity: From Firm to Sector and Higher Levels
Cost Assessment of (Un)bundling: Separation of Vertically Integrated Public Utilities
Modeling Technical Change: Theory and Practice
Performance of Microfinance Institutions: A Review
Stochastic Frontier Analysis: Foundations and Advances I
Stochastic Frontier Analysis: Foundations and Advances II

References
1. Agier I, Szafarz A (2013) Microfinance and gender: is there a glass ceiling on loan size?
World Dev 42:165–181
2. Ahlin C, Lin J, Maio M (2011) Where does microfinance flourish? Microfinance institution
performance in macroeconomic context. J Dev Econ 95(2):105–120
3. Aigner DJ, Lovell CAK, Schmidt P (1977) Formulation and estimation of stochastic frontier
production functions. J Econ 6(1):21–37
4. Akram S, Routray JK (2013) Investigating causal relationship between social capital and
microfinance: implications for rural development. Int J Soc Econ 40:760–776
5. Al-Azzam M (2019) Financing microfinance institutions: subsidies or deposit mobilisation.
Appl Econ 51(15):1621–1633
6. Al-Azzam M, Parmeter CF (2019) Competition and microcredit interest rates: international
evidence. Empir Econ. Forthcoming
7. Alvarez A, Amsler C, Orea L, Schmidt P (2006) Interpreting and testing the scaling property
in models where inefficiency depends on firm characteristics. J Prod Anal 25(2):201–212
8. Amersdorffer F, Buchenrieder G, Bokusheva R, Wolz A (2015) Efficiency in microfinance:
financial and social performance of agricultural credit cooperatives in Bulgaria. J Oper Res
Soc 66(1):57–65
9. Armendáriz B (2011) Women and microsavings. In: Armendariz B, Labie M (eds) World
Scientific Publishing Co. Pte Ltd. Singapore, pp 503–515
10. Armendáriz B, Szafarz A (2011) On mission drift in microfinance institutions. In: Armendariz
B, Labie M (eds) World Scientific Publishing Co. Pte Ltd. Singapore, pp 341–366
11. Armendariz de Aghion B, Morduch J (2007) The economics of microfinance. MIT Press
12. Arnold LG, Booker B (2013) Good intentions pave the way to ... the local moneylender.
Econ Lett 118(3):466–469
13. Arnold LG, Reeder J, Steger S (2013) On the viability of group lending when microfinance
meets the market: a reconsideration of the Besley-Coate model. J Emerg Mark Financ 12:59–
106
14. Banerjee A, Karlan D, Zinman J (2015) Six randomized evaluations of microcredit: introduc-
tion and further steps. Am Econ J Appl Econ 7(1):1–21
15. Baquero G, Hamadi M, Heinen A (2018) Competition, loan rates, and information dispersion
in nonprofit and for-profit microcredit markets. J Money Credit Bank 50(5):893–937
16. Basharat B, Hudon M, Nawaz A (2015) Does efficiency lead to lower prices? A new
perspective from microfinance interest rates. Strateg Chang 24(1):49–66
17. Bauchet J, Morduch J (2010) Selective knowledge: reporting bias in microfinance data.
Perspect Global Dev Technol 9:240–269
18. Baumol WJ, Panzar J, Willig R (1982) Contestable markets and the theory of industry structure.
Harcourt, New York
19. Beck T, Behr P, Guettler A (2013) Gender and banking: are women better loan officers? Rev
Financ 17:1279–1321
20. Beisland LA, D’Espallier B, Mersland R (2019) The commercialization of the microfinance
industry: is there a ‘personal mission drift’ among credit officers? J Bus Ethics 158:119–134
21. Bensalem S, Ellouze A (2019) Profit-orientation and efficiency in microfinance industry: an
application of stochastic frontier approach. Inf Syst Oper Res 57(3):411–429
22. Berger AN, Cummins JD, Weiss MA, Zi H (2000) Conglomeration versus strategic focus:
evidence from the insurance industry. J Financ Intermed 9(4):323–362
23. Bibi U, Balli HO, Matthews CD, Tripe DWL (2018) Impact of gender and governance on
microfinance efficiency. J Int Financ Mark Inst Money 53:307–319
24. Boehe DM, Cruz L (2013) Gender and microfinance performance: Why does the institutional
context matter? World Dev 47:121–135
25. Bogan VL (2012) Capital structure and sustainability: an empirical study of microfinance
institutions. Rev Econ Stat 94(4):1045–1058
26. Bond P, Rai AS (2009) Borrower runs. J Dev Econ 88(2):185–191
27. Bos J, Millone M (2015) Practice what you preach: microfinance business models and
operational efficiency. World Dev 70:28–42
28. Botti F, Crosi M, Zacchia G (2018) A new European microfinance panel data set: The
European Microfinance Network survey 2006–2015. EMN Working Paper No.1-June 2018.
Available at https://www.european-microfinance.org/
29. Caudill S, Gropper D, Hartarska V (2009) Which microfinance institutions are becoming
more cost-effective with time? Evidence from a mixture model. J Money Credit Bank 41:651–
672
30. Caudill S, Gropper D, Hartarska V (2012) Microfinance institution costs: effects of gender,
subsidies and technology. J Financ Econ Policy 4:292–319
31. Christen R, Drake D (2002) Commercialization. The new reality of microfinance. Kumarian
Press, Bloomfield, pp 2–22
32. Christen RP, Lyman TR, Rosenberg R (2003) Microfinance consensus guidelines: guiding
principles on regulation and supervision of microfinance. CGAP and World Bank, Washing-
ton, DC
33. Churchill SA (2020) Microfinance financial sustainability and outreach: is there a trade-off?
Empir Econ 59:1329–1350
34. Collins D, Morduch J, Rutherford S, Ruthven O (2009) Portfolios of the poor: how the
World’s poor live on $2 a day. Princeton, New Jersey
35. Colombi R, Kumbhakar S, Martini G, Vittadini G (2014) Closed-skew normality in stochastic
frontiers with individual effects and long/short-run efficiency. J Prod Anal 42:123–136
36. Cozarenco A, Hartarska V, Szafarz A (2018) Too many cooks spoil the broth: the conflicting
impacts of subsidies and deposits on the cost-efficiency of microfinance institutions. Available
at SSRN: https://ssrn.com/abstract=3295173 or https://doi.org/10.2139/ssrn.3295173
37. Cozarenco A, Hudon M, Szafarz A (2016) What type of microfinance institutions supply
savings products? Econ Lett 140:57–59
38. Cull R, Demirgüç-Kunt A, Morduch J (2007) Financial performance and outreach: a global
analysis of leading micro banks. Econ J 117:107–133
39. Cull R, Demirgüç-Kunt A, Morduch J (2011) Does regulatory supervision curtail microfi-
nance profitability and outreach? World Dev 39(6):949–965
40. Cull R, Demirgüç-Kunt A, Morduch J (2018) The microfinance business model: enduring
subsidy and modest profit. World Bank Econ Rev 32(2):221–244
41. Dahal M, Fiala N (2020) What do we know about the impact of microfinance? The problems
of statistical power and precision. World Dev 128:104773
42. Dehejia R, Montgomery H, Morduch J (2012) Do interest rates matter? Credit demand in the
Dhaka slums. J Dev Econ 97(2):437–449
43. Dehem T, Hudon M (2013) Microfinance from the clients’ perspective: an empirical enquiry
into transaction costs in urban and rural India. Oxford Dev Stud 41:S117–S132
44. Delgado M, Parmeter CF, Hartarska V, Mersland R (2015) Should all microfinance institu-
tions mobilize microsavings? Evidence from economies of scope. Empir Econ 48(1):193–225
45. D’Espallier B, Goedecke J, Hudon M, Mersland R (2017) From NGOs to banks: does
institutional transformation alter the business model of microfinance institutions? World Dev
89:19–33
1334 C. F. Parmeter and V. Hartarska

46. D’Espallier B, Guérin I, Mersland R (2011) Women and repayment in microfinance: a global
analysis. World Dev 39:758–772
47. D’Espallier B, Hudon M, Szafarz A (2013) Unsubsidized microfinance institutions. Econ Lett
120(2):174–176
48. D’Espallier B, Hudon M, Szafarz A (2017) Aid volatility and social performance in
microfinance. Nonprofit Volunt Sect Q 46(1):116–140
49. Efendic V, Hadziahmetovic N (2017) The social and financial efficiency of microfinance
institutions: the case of Bosnia and Herzegovina. South East Eur J Econ Bus 12(2):85–101
50. Emran MS, Morshed AKMM, Stiglitz J (2011) Microfinance and missing markets. MPRA
Paper No. 41451
51. Evans DS, Heckman JJ (1984) A test for subadditivity of the cost function with an application
to the Bell system. Am Econ Rev 74(4):615–623
52. Fall F, Akim A, Wassongma H (2018) DEA and SFA research on the efficiency of
microfinance institutions: a meta-analysis. World Dev 107:176–188
53. Fillipini M, Greene WH (2016) Persistent and transient productive inefficiency: a maximum
simulated likelihood approach. J Prod Anal 45(2):187–196
54. Flückiger Y, Vassiliev A (2007) Efficiency in microfinance institutions: an application of data
envelopment analysis to MFIs in Peru. Palgrave Macmillan, London, pp 89–110
55. Freixas X, Rochet J (1997) Microeconomis of banking. Cambridge
56. Garmaise MJ, Natividad G (2010) Information, the cost of credit, and operational efficiency:
an empirical study of microfinance. Rev Financ Stud 23:2560–2590
57. Gebremariam HT (2017) Microfinance institutions’ transparency, governance and risk in Sub-
Saharan Africa. Unpublished Thesis, Doctor of Philosophy, University of Otago. Retrieved
from http://hdl.handle.net/10523/7754
58. Ghosh S, Van Tassel E (2011) Microfinance and competition for external funding. Econ Lett
112(2):168–170
59. Greene WH (2005) Reconsidering heterogeneity in panel data estimators of the stochastic
frontier model. J Econ 126(2):269–303
60. Gregoire JR, Tuya OR (2006) Cost efficiency of microfinance institutions in Peru: a stochastic
frontier approach. Latin Am Bus Rev 7(2):41–70
61. Gul FA, Podder J, Shahriar AZM (2017) Performance of microfinance institutions: does
government ideology matter? World Dev 100:1–15
62. Gutiérrez-Nieto B, Serrano-Cinca C, Mar M (2007) Microfinance institutions and efficiency.
Omega Int J Manag Sci 35:131–142
63. Haq M, Skully M, Pathan S (2010) Efficiency of microfinance institutions: a data envelopment
analysis. Asia-Pacific Financ Mark 17(1):63–97
64. Hartarska V (2005) Governance and performance of microfinance institutions in central and
eastern Europe and the newly independent stats. World Dev 33:1627–1643
65. Hartarska V, James N, Mersland R (2009) Scale economies in microfinance. Paper presented
at the First European research conference on microfinance, Brussels
66. Hartarska V, Mersland R (2012) What governance mechanisms promote efficiency in reaching
poor clients? Evidence from rated MFIs’, Eur Financ Manag 18:218–239
67. Hartarska V, Nadolnyak D (2008) Does rating help microfinance institutions raise funds?
Cross-country evidence. Int Rev Econ Financ 17(4):558–571
68. Hartarska V, Nadolnyak D, Mersland R (2014) Are women better bankers for the poor:
evidence from rural microfinance institutions? Am J Agric Econ 96:1291–1306
69. Hartarska V, Nadolnyak D, Shen X (2012) Efficiency in microfinance cooperatives. Iberoam
J Dev Stud 1(2):53–74
70. Hartarska V, Nadolnyak D, Shen X (2013) Cost function approach to MFI efficiency: the role
of subsidy and social output measures. In: Ronny Manos J-PG, Yaron J (eds) Innovations in
microfinance: case studies from developing countries and countries in transition. New York
71. Hartarska V, Parmeter CF (2020) Do women make the (MFI’s) world go round? University
of Miami Working Paper
33 Performance of Microfinance Institutions: A Review 1335

72. Hartarska V, Parmeter CF, Nadolnyak D (2011) Economies of scope of lending and mobilizing
deposits in microfinance institutions: a semiparametric analysis. Am J Agric Econ 93(2):389–
398
73. Hartarska V, Parmeter CF, Nadolnyak D, Zhu B (2010) Economies of scope for microfinance:
differences across output measures. Pac Econ Rev 15:464–481
74. Hartarska V, Shen X, Mersland R (2013) Scale economies and elasticities of substitution in
microfinance institutions. J Bank Financ 37:118–131
75. Hassan MK, Tufte DR (2001) The X-efficiency of a group-based lending institution: the case
of the Grameen Bank. World Dev 29(6):1071–1082
76. Henderson DJ, Parmeter CF (2015) Applied nonparametric econometrics. Cambridge Uni-
versity Press, Cambridge
77. Hermes N, Lensink R, Meesters A (2011) Outreach and efficiency of microfinance institu-
tions. World Dev 39:938–948
78. Howson C (2013) Adverse incorporation and microfinance among cross-border traders in
Senegal. World Dev 42:199–208
79. Hudon M, Traca D (2011) On the efficiency effects of subsidies in microfinance: an empirical
inquiry. World Dev 39(6):966–973
80. Hughes JP, Mester LJ (1993) A quality and risk-adjusted cost function for banks: evidence on
the “too-big-to-fail” doctrine. J Prod Anal 4(3):293–315
81. Hughes JP, Mester LJ (2013) Who said large banks don’t experience scale economies?
Evidence from a risk-return-driven cost function. J Financ Intermed 22:559–585
82. Hughes JP, Mester LJ (2015) Measuring the performance of banks: theory, practice, evidence,
and some policy implications. Oxford University Press, Oxford, pp 247–270
83. Hulme D, Mosley P (1996) Finance against poverty. Routledge, New York
84. Islam K, O’Gorman M (2019) Microcredit contract design: a macroeconomic evaluation.
World Dev 124:104634
85. Jondrow J, Lovell CAK, Materov IS, Schmidt P (1982) On the estimation of technical
efficiency in the stochastic frontier production function model. J Econ 19(2/3):233–238
86. Kendo S (2017) Do decision variables improve microfinance efficiency? A stochastic frontier
analysis for African countries. Strateg Chang 26(2):159–174
87. Kevane M, Wydick B (2001) Microenterprise lending to female entrepreneurs: sacrificing
economic growth for poverty alleviation? World Dev 29(7):1225–1236
88. Khachatryan K, Hartarska V, Grigoryan A (2017) Performance and capital structure of
microfinance institutions in Eastern Europe and Central Asia. J East Eur Econ 55:395–419
89. Kumar NK (2012) Dynamic incentives in microfinance group lending: an empirical analysis
of progressive lending mechanism. SAGE Open 2(2):1–9
90. Kumbhakar SC, Lien G, Hardaker JB (2014) Technical efficiency in competing panel data
models: a study of Norwegian grain farming. J Prod Anal. 41:321–337
91. Kumbhakar SC, Parmeter CF, Zelenyuk V (2020) Stochastic frontier analysis: foundations
and advances I. In: Ray S, Chambers R, Kumbhakar SC (eds) Handbook of production
economics. Springer Nature Singapore Pte Ltd. Forthcoming
92. Kyriazidou E (1997) Estimation of a panel data sample selection model. Econometrica
65(6):1335–1364
93. Lebovics M, Hermes N, Hudon M (2016) Are financial and social efficiency mutually
exclusive? A case study of Vietnamese microfinance institutions. Ann Public Coop Econ
87(1):55–77
94. León JV (2009) An empirical analysis of Peruvian municipal banks using cost-efficiency
frontier approaches. Can J Dev Stud 29(1–2):161–182
95. Li Q, Racine J (2007) Nonparametric econometrics: theory and practice. Princeton, NJ 08540
96. Malikov E, Hartarska V (2018) Endogenous scope economies in microfinance institutions. J
Bank Financ 93:162–182
97. Malikov E, Hartarska V, Mersland R (2020) Economies of diversification in microfinance:
evidence from quantile estimation on panel data. Financ Res Lett 34:101246
1336 C. F. Parmeter and V. Hartarska

98. Mersland R, Nyarko SA, Szafarz A (2019) Do social enterprises walk the talk? Assessing
microfinance performances with mission statements. J Bus Ventur Insights 11:e00217
99. Mersland R, Strøm R Ø (2009) Performance and governance in microfinance institutions. J
Bank Financ 33(4):662–669
100. Mersland R, Strøm RØ (2010) Microfinance mission drift? World Dev 38(1):28–36
101. Morduch J (1999) The microfinance promise. J Econ Lit 37:1569–1614
102. Morduch J (2000) The microfinance schism. World Dev 28(4):617–629
103. Morduch J (2020) Why RCTs failed to answer the biggest questions about microcredit impact.
World Dev 127:104818
104. Nawaz A (2010) Performance of microfinance: the role of subsidies. Savings Dev 34(1):
97–138
105. Pal D, Mitra SK (2017) Does the number of borrowers per loan officer influence microfinance
institution asset quality? A stochastic frontier analysis. Investigación Económica 76(300):81–
103
106. Parmeter CF, Kumbhakar SC (2014) Efficiency analysis: a primer on recent advances. Found
Trends Econ 7(3–4):191–385
107. Parmeter CF, Wang H-J, Kumbhakar SC (2017) Nonparametric estimation of the determinants
of inefficiency. J Prod Anal 47(3):205–221
108. Parmeter CF, Zelenyuk V (2019) Combining the virtues of stochastic frontier and data
envelopment analysis. Oper Res 67(6):1628–1658
109. Paxton J (2007) Technical efficiency in a semi-formal financial sector: the case of Mexico.
Oxford Bull Econ Stat 69(1):57–74
110. Périlleux A, Szafarz A (2015) Women leaders and social performance: evidence from
financial cooperatives in Senegal. World Dev 74:437–452
111. Pieters J, Klasen S (2020) Randomization for women’s economic empowerment? Lessons
and limitations of randomized experiments. World Dev 127:104820
112. Porteous N (2006) Competition and microcredit interest rates. CGAP Focus Note No. 33,
CGAP, Washington, DC
113. Pulley LB, Braunstein YM (1992) A composite cost function for multiproduct firms with an
application to economies of scope in banking. Rev Econ Stat 74:213–230
114. Pulley LB, Humphrey D (1993) The role of fixed costs and cost complementarities in
determining scope economies and the cost of narrow banking proposals. J Bus 66:437–462
115. Quayes S (2015) Outreach and performance of microfinance institutions: a panel analysis.
Appl Econ 47(18):1909–1925
116. Reichert P (2018) A meta-analysis examining the nature of trade-offs in microfinance. Oxford
Dev Stud 46(3):430–452
117. Roberts PW (2013) The profit orientation of microfinance institutions and effective interest
rates. World Dev 41:120–131
118. Rosenberg R, Gonzalez A, Narain S (2009) The new moneylenders: are the poor being
exploited by high microcredit interest rates? In: Occasional paper. CGAP, Washington, DC,
pp 1–28
119. Sagamba M, Shchetinin O, Yusupov N (2013) Do microloan officers want to lend to the less
advantaged? Evidence from a choice experiment. World Dev 42:182–198
120. Schurmann AT, Johnston H (2009) The group-lending model and social closure: microcredit,
exclusion, and health in Bangladesh. J Health Popul Nutr 27(4):518–527
121. Sealey Jr CW, Lindley JT (1977) Inputs, outputs, and a theory of production and cost at
depository financial institutions. J Financ 32(4):1251–1266
122. Serrano-Cinca C, Gutiérrez-Nieto B, Mar Molinero C (2011) Social and financial efficiency
of microfinance institutions. In: Armendariz B, Labie M (eds) World Scientific Publishing
Co. Pte Ltd. Singapore, pp 397–418
123. Servin R, Lensink R, van den Berg M (2012) Ownership and technical efficiency of
microfinance institutions: empirical evidence from Latin America. J Bank Financ 36(7):2136–
2144
33 Performance of Microfinance Institutions: A Review 1337

124. Silva AC, Chávez GA (2015) Microfinance, country governance, and the global financial
crisis. Ventur Cap 17(1–2):191–213
125. Simar L, Wilson PW (2013) Estimation and inference in nonparametric frontier models:
recent developments and perspectives. Found Trends Econom 5(2):183–337
126. Sinclair H (2012) Confessions of a microfinance heretic: how microlending lost its way and
betrayed the poor. Berrett-Koehler Publishers Inc., San Fancisco
127. Strøm R, D’Espallier B, Mersland R (2014) Female leadership, performance, and governance
in microfinance institutions. J Bank Financ 42:60–75
128. Tran KC, Tsionas EG (2009) Estimation of nonparametric inefficiency effects stochastic
frontier models with an application to British manufacturing. Econ Model 26:904–909
129. van Cayseele P, Wuyts C (2007) Cost efficiency in the European securities settlement and
depository industry. J Bank Financ 31(10):3058–3079
130. van Rooyen C, Stewart R, de Wet T (2012) The impact of microfinance in Sub-Saharan Africa:
a systematic review of the evidence. World Dev 40(11):2249–2262
131. Vogelgesang U (2003) Microfinance in times of crisis: the effects of competition, rising
indebtedness, and economic crisis on repayment behavior. World Dev 31(12):2085–2114
132. Wagner C (2012) From boom to bust: how different has microfinance been from traditional
banking? Dev Policy Rev 30(2):187–210
133. Wagner C, Winkler A (2013) The vulnerability of microfinance to financial turmoil – evidence
from the global financial crisis. World Dev 51:71–90
134. Wang H-J, Schmidt P (2002) One-step and two-step estimation of the effects of exogenous
variables on technical efficiency levels. J Prod Anal 18:129–144
135. Wijesiri M (2016) Weathering the storm: ownership structure and performance of microfi-
nance institutions in the wake of the global financial crisis Econ Model 57:238–247
136. Wilson PW (2018) Dimension reduction in nonparametric models of production. Eur J Oper
Res 267(1):349–367
137. World Bank (2012) World development report 2012: Gender equality and development,
Technical report. https://openknowledge.worldbank.org/handle/10986/4391
138. Zamore S, Beisland LA, Mersland R (2019) Geographic diversification and credit risk in
microfinance. J Bank Financ 109:105665
139. Zhou J, Parmeter CF, Kumbhakar SC (2020) Nonparametric estimation of the determinants
of inefficiency in the presence of firm heterogeneity. Eur J Oper Res 286(3):1142–1152
34 The Economics of Production in Marine Fisheries
Dale Squires and John Walden

Contents
Introduction
Vessel-Level Production
Capital
Labor
Management or Skipper Skill
Nonrivalrous Inputs
Resource Stock
Dual Representations of Technology
Product Transformation and Substitution Possibilities
Structure of Multiproduct Costs
Multiproduct Joint Production
Separability
Distance Functions
Technical Efficiency and Stochastic Production Frontiers
Rationing and Quotas
Le Chatelier Principle, Quotas, and Product Transformation Possibilities
Fishing Time
Technological Change
Productivity Growth
Bioeconomic Models
Effort as an Input
Concluding Remarks
References

D. Squires
NMFS, Southwest Fisheries Science Center, La Jolla, CA, USA
J. Walden
NMFS, Northeast Fisheries Science Center, Woods Hole, MA, USA

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_34

Abstract

Production economics is important to the economic analysis and public
regulation of fishing industries in order to address the market failure associated
with a common renewable resource stock. Application of production economics
arose out of bioeconomic analysis of the aggregate fishery production framework.
Production economics gained in importance as it, along with econometrics and
mathematical programming, developed as fields. Coupled with an industrial
organization orientation and public regulation focus, production economics
contributed to analyses of fishing industries, and addressed the underlying
market failure. Compared to bioeconomics, this reorientation shifted the focus
to shorter time periods and to the individual firm – usually the vessel – and to
multiproduct, multi-input production. Production economics now contributes to
further development of the bioeconomic model, and addresses additional sources
of market failure in fishing industries arising from pure and impure public
goods associated with new technology and biodiversity, as well as ecosystem
services impacted by fisheries. The chapter reviews the historical development
of production economics applied to fisheries and concludes with potential future
directions.

Keywords

Economics · Production · Marine fisheries · Common resources · Industrial organization · Productivity · Efficiency

Introduction

Beginning in the 1950s, the problem of the “commons,” particularly in relation to
fish resources, started to generate interest among economists. Economists sought
to answer questions centered on resource depletion, dissipation of economic rents
due to absent or ill-structured property rights, the optimal number of fishing vessels,
and appropriate harvest rates. Subsequent empirical work showed the accumulation
of physical capital starting to occur in fisheries [21, 186]. These studies were
among the first that used data collected from a specific fishery to show a result that
policymakers could use to restrict capital in a fishery. The introduction of economic
tools and resulting policy advice was timely because the biological models used
by regulators at the time were insufficient to provide coherent policy advice in the
presence of increasing capital, and perhaps more importantly, technological change.
Economic models and thought, which had been developing slowly, were well
positioned to fill this void and advise fishery managers. At the heart of
this guidance were models grounded in modern production economics.
The introduction of production economics into questions surrounding fishery
management began with seminal work by Gordon [90], Scott [197], and Smith
[202]. Gerhardsen [88] wrote another very early economics paper. These papers
established the theoretical basis for rent dissipation and overcapitalization under
open access and showed how the rent-maximizing fishery would entail lower levels of
capitalization (or effort). They assumed a fleet of homogeneous fishing vessels with
the same underlying production technology and cost functions, and a fish stock
assumed to be in a stationary state. The surplus production framework
and aggregate production function from the biologist Schaefer [193] were typically
adopted. The total number of vessels that could fish was usually viewed as the
control variable in the system, and the objective was to understand the relationship
between the number of vessels, harvest levels, and remaining stock size, within an
objective of maximizing economic rent. Overall, these aggregate models provided
policymakers with advice that could be used to limit fishing effort within the
constraints of the productivity of the underlying fish stocks. They also improved
the biological models of the time, which were devoid of economic thought.
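As a minimal numerical sketch of the aggregate framework described above, the following Python code sets up a static Gordon-Schaefer model with logistic stock growth and harvest proportional to effort and stock. The functional forms are the standard textbook ones, but the class name and the parameter values (r, k, q, p, c) are hypothetical and chosen only for illustration. Effort stands in for the number of homogeneous vessels, and an interior solution (c < p*q*k) is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GordonSchaefer:
    """Static Gordon-Schaefer fishery; all parameter values are illustrative."""
    r: float  # intrinsic growth rate of the stock
    k: float  # carrying capacity (stock units)
    q: float  # catchability coefficient
    p: float  # ex-vessel price per unit of catch
    c: float  # cost per unit of effort (e.g., per vessel)

    def stock(self, effort: float) -> float:
        # Stationary stock: growth r*S*(1 - S/k) equals harvest q*E*S.
        return max(self.k * (1.0 - self.q * effort / self.r), 0.0)

    def sustainable_yield(self, effort: float) -> float:
        return self.q * effort * self.stock(effort)

    def rent(self, effort: float) -> float:
        return self.p * self.sustainable_yield(effort) - self.c * effort

    def open_access_effort(self) -> float:
        # Entry continues until rent is fully dissipated (rent == 0).
        return (self.r / self.q) * (1.0 - self.c / (self.p * self.q * self.k))

    def rent_maximizing_effort(self) -> float:
        # First-order condition d(rent)/dE = 0; half of open-access effort.
        return 0.5 * self.open_access_effort()


# Hypothetical parameter values, not calibrated to any fishery.
fishery = GordonSchaefer(r=0.4, k=1000.0, q=0.001, p=10.0, c=2.0)
e_oa = fishery.open_access_effort()
e_mey = fishery.rent_maximizing_effort()
print(f"open access:  effort={e_oa:.1f}, rent={fishery.rent(e_oa):.2f}")
print(f"rent maximum: effort={e_mey:.1f}, rent={fishery.rent(e_mey):.2f}")
```

Under these assumptions, open-access effort is twice the rent-maximizing level and rent is fully dissipated, which is the overcapitalization result that the early papers established.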
The aggregate production framework was limiting, and by the late 1960s and
early 1970s researchers started to specify models based on modern production
theory focused on individual vessels as firms. In one of the first studies, Comitini
and Huang [46] estimated production functions for a panel of 32 fishing vessels.
This was followed by Carlson [39, 40], who used cross-sectional data from the
New England trawl fleet along with the tropical tuna purse seine fleet to estimate
generalized production functions for the two fleets. To move away from effort as an
aggregate input, which had been the dominant approach in the aggregate production
framework, Huang and Lee [113] and Anderson [9, 10] developed models that
recognized fishery production as a two-stage process, with an intermediate output
(effort) being used as an input for the final output (landings). This period firmly
established modern production theory as a tool that could be used to model the
fishing production process and provide relevant policy advice for managers.
Hannesson [99], the first to introduce modern concepts of production economics
and empirical analysis, applied separability to fishing effort specified as a composite
input. He also introduced (disembodied, exogenous) technological change, (deter-
ministic) frontier functions, technical inefficiency, and functional forms beyond the
Cobb-Douglas, notably the homothetic frontier of Zellner and Revankar [261] and
the translog. Influential work by Squires [205–211], Kirkley [122], and Kirkley and
Strand [126] introduced dual-based methods and other flexible functional forms to
examine the underlying multiproduct costs and production technology (especially
input and output substitution possibilities) and the nature of joint production
for multioutput, multi-input commercial fishing vessels and public regulation of
groundfish trawlers in New England. The dual-based approach readily allowed
disaggregation of aggregate output and input (i.e., effort) into individual products
and inputs, specifying exogenous prices as regressors, and testing for various types
of separability and joint production. Bjørndal [24], Dupont [56, 57], and Bjørndal
and Gordon [26] further developed the dual approach in fisheries, including the
impact of restricted inputs, while examining practical problems in the British
Columbia salmon, Scandinavian herring, and other fisheries and further introducing
contemporary empirical analysis.
In short, the empirical models developed during the 1980s and early 1990s
ushered the era of modern production economics into fisheries. These models
shifted analysis beyond the popular aggregate bioeconomic models of the time
to empirical firm-level analysis that applied rigorous microeconometric techniques
to cross-sectional and panel data on individual fishing vessels. They drew on
flexible functional forms; duality; concepts of multiproduct costs and production;
homotheticity, separability, and aggregation; joint production; technological
change; and the more advanced estimation techniques then emerging in production
economics and econometrics. These empirical analyses explicitly recognized the
multiple-output, multiple-input joint production nature of the fishing firm’s
production process, which had not been fully explored at that time, although it was
clearly recognized by Comitini and Huang [46, 47], Carlson [39, 40], Huang and Lee
[113], and many of the other early production economists.
Applying production concepts to multiproduct revenue- and profit-maximizing
firms with endogenous products in the face of public regulation addressing common
resource market failure led to a number of extensions of production economics to
topics relevant to fishing industries. Squires [206, 207, 209], Segerson and Squires
[198, 199], Kirkley and Squires [124], Färe et al. [70], and Lindebo and Vestergaard
[145] extended the economics of capacity and capacity utilization to maximization
of short-run profits or revenues with endogenous multiple outputs and accounting
for a second capital stock (in addition to physical capital), the natural capital
stock. Squires [206, 207, 209, 211] and Squires and Kirkley [216] extended the
short- and long-run multiproduct cost structure from firms minimizing the costs of
exogenous or predetermined single and multiple products to revenue- and profit-
maximizing firms with multiple endogenous products. Segerson and Squires [198]
further developed the ray measure of multiproduct returns to scale plus two original
alternatives. Herrick and Squires [105] and Squires [210, 212] extended total factor
productivity and index numbers to account for multiple species of the natural capital
stock using consistency of multistage aggregation and superlative index numbers.
Segerson and Squires [199], Squires [213, 215], Squires and Kirkley [216, 218],
Dupont and Gordon [58], Asche et al. [16], Ekerhovd [66], and Hansen and Jensen
[103] extended virtual price [161] and virtual quantity theory [160] to rations and
quotas and the related shadow price approach to individual nontransferable and
transferable quotas on catch and effort. Asche [13] developed a dynamic revenue
function with adjustment costs. Dupont [57] developed the relationship between
Diewert’s [54] elasticity of intensity and the conventional price elasticity of input
demand. Squires [215] extended the elasticity of intensity to allow for adding or
removing quantity controls (using virtual prices and quantities).
By the early 2000s, input and output distance functions were applied to com-
mercial fishing fleets at the vessel level [76, 124, 125, 247, 257]. Distance functions
were used in response to concerns about capacity and excess capacity in commercial
fisheries [60, 130, 131, 172, 250]. (See Squires and Segerson [220] for a review
of capacity and capacity utilization including fisheries.) Kirkley and Squires [124]
and Kirkley et al. [127] first applied data envelopment analysis (DEA) with output
distance functions and the stochastic production frontier (SPF) to analyze technical
efficiency and fishing capacity. Distance functions have subsequently been used
to answer questions about discards and product transformation possibilities, and to
value fishing vessel capital [73, 75, 173, 175, 194]. Distance functions are far less
restrictive than a production function based upon a specific functional form.
The balance of this chapter first discusses vessel-level production models, with a
closer look at capital, labor, management or skipper skill, and effort as inputs in the
vessel production function. Next, dual-based methods using revenue, profit, and cost
functions are reviewed. Distance function and directional distance function models
follow. We then discuss multiperiod production models and dynamics, productivity
measurement, and technical change, along with examples of economic growth
models applied to fisheries. Using the theory of production concepts developed in
the chapter, we conclude with a discussion of the aggregate production framework
typically found in bioeconomic models.

Vessel-Level Production

Comitini and Huang [46], in one of the first examples of vessel-level production,
specified and estimated Cobb-Douglas and constant elasticity of substitution (CES)
production functions. Results showed differences in vessel productivity, which they
attributed to the managerial ability of the captains (“skipper” effect). Carlson [39]
specified production functions for New England trawl and tropical tuna purse
seine vessels. The study was the first to extend capital input beyond a single
characteristic, such as vessel horsepower. Instead, capital input was defined as
gross tons, horsepower, hull construction (i.e., steel vs. wood), and vessel age.
Labor was still defined as crew size, and the study recognized the “skipper” effect
on vessel productivity. Productivity of the fishing grounds was accounted for by
home port dummy variables. Finally, although data were not available to test
for technological change, it was the first to recognize technological change as a
component of the production function. Other early empirical production function
studies included Comitini and Huang [47], MacSween [148], Liao [144], Buchanan
[32], Comitini [45], Hussen and Sutinen [116], Taylor and Prochaska [234], Strand
et al. [232], Holt [110], Hannesson [99], Kirkley [122], Staniford [230], Greenberg
and Herrmann [95], Agnello and Anderson [4], Bjørndal [25], and Campbell and
Lindner [36].
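To illustrate the kind of estimation these early studies performed, the sketch below fits a Cobb-Douglas vessel production function by ordinary least squares on logged data. The data are simulated, and the variable names (capital as horsepower, labor as crew size) and the true elasticities are assumptions made for illustration. It is not a reconstruction of the specification used in any of the cited studies.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_cobb_douglas(catch, capital, labor):
    """OLS on ln(catch) = a + b_k ln(capital) + b_l ln(labor) + error."""
    rows = [[1.0, math.log(k), math.log(l)] for k, l in zip(capital, labor)]
    y = [math.log(h) for h in catch]
    # Normal equations (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Simulated vessels (hypothetical data, not from any fishery).
rng = random.Random(0)
capital = [rng.uniform(100, 900) for _ in range(200)]  # e.g., horsepower
labor = [rng.uniform(2, 12) for _ in range(200)]       # crew size
catch = [
    3.0 * k**0.6 * l**0.3 * math.exp(rng.gauss(0.0, 0.05))
    for k, l in zip(capital, labor)
]

intercept, beta_k, beta_l = fit_cobb_douglas(catch, capital, labor)
print(f"capital elasticity={beta_k:.3f}, labor elasticity={beta_l:.3f}")
print(f"returns to scale={beta_k + beta_l:.3f}")
```

The sum of the estimated elasticities measures returns to scale. A CES form would generally require nonlinear estimation and is not covered by this sketch, which also ignores the input endogeneity issues that later work addressed.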
These early studies, consistent with the applied economics of the time, did not
consider the potential endogeneity of inputs (or simply appealed to the maximizing
of expected profit discussed by Zellner et al. [262]), areas fished, and ports, or fully
employ panel data techniques (one-, two-, or three-way fixed and random effects,
mixed effects). They largely preceded flexible functional forms for the production
function, generalized approaches to addressing error terms for heteroscedasticity
and serial correlation, or quasi-experimental methods (such as difference in differ-
ences) for causal inference on policy or other issues.
Other influential production function studies include Grafton et al. [94], who
evaluated the impact of individual transferable quotas in the earliest “modern”
microeconomics approach of evaluating natural experiments. Wolff et al. [260]
utilized three-way fixed effects to distinguish vessel and skipper effects. Horrace and
Schnier [112] specified fixed effects for area, and Natividad [159] utilized difference
in differences to evaluate the causal effect of individual transferable quotas (ITQs)
on catch and productivity of the Peruvian anchovy fishery. Zhang and Smith [263]
specified a two-stage estimation method to address the latent stock problem with
errors in the production function and stock dynamics. Fousekis and Kolonaris [82]
and Weninger and Strand [258] estimated the first multiproduct production (distance
and ray) functions.
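The difference-in-differences idea used in quasi-experimental policy evaluations such as the ITQ study mentioned above can be sketched in its simplest two-group, two-period form. The numbers below are hypothetical catch-per-trip values and the function name is an assumption. The estimate has a causal interpretation only under the parallel-trends assumption, and applied studies typically use regression versions with controls and fixed effects rather than this comparison of means.

```python
from statistics import mean


def difference_in_differences(treated_pre, treated_post, control_pre, control_post):
    """2x2 estimator: change in the treated mean minus change in the control mean."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical catch per trip before and after a quota program.
effect = difference_in_differences(
    treated_pre=[10.0, 12.0, 11.0],   # vessels later covered by the program
    treated_post=[15.0, 16.0, 17.0],
    control_pre=[9.0, 10.0, 11.0],    # comparison vessels never covered
    control_post=[11.0, 12.0, 13.0],
)
print(f"difference-in-differences estimate: {effect:.2f}")
```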

Capital

Nøstbakken et al. [165] list natural, physical, human, and immaterial capital as
distinct categories of capital in a fisheries framework. Natural capital refers to the
fish biomass $S_t$. Except in the case of a sole-owner fishery, natural capital ($S_t$) is
never under the control of an individual fishing vessel. Human capital refers to the
labor input, and has been studied by some in the context of skipper skill as discussed
below. Immaterial capital refers to assets such as fishing rights. In this section, we
focus on physical capital $K_t$, the vessel and equipment needed
to conduct fishing operations.
Aggregate industry K_t in a fishery is the sum of all the capital on individual
vessels participating in the fishery. The perpetual inventory method provides a more
comprehensive measure of K_t than simple vessel counts by accounting for entering
and exiting vessels, differences in productivity between newer and older vessels, and
depreciation [150]. This method, however, requires detailed data that are
often not collected regularly. The value of capital (K_t) can also be estimated through
the use of insurance surveys, surveys of secondhand prices, book values, or prices
of newly constructed vessels [183]. Kirkley and Squires [123] specified a hedonic
model to estimate vessel value for a fleet of vessels operating off the eastern US
coast. Färe et al. [75] estimated capital values for a group of fishing vessels in the
eastern USA based on secondhand sales advertisements from commercial vessel
brokers and an input distance function. Mean values for the capital inputs based on
the distance function model were used to construct a Lowe capital quantity index.
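
The perpetual inventory idea can be illustrated with a minimal sketch. It assumes a single homogeneous capital measure, a constant geometric depreciation rate, and known values of entering and exiting vessels in each period; the function name, its arguments, and all numbers are hypothetical and are not taken from [150]. Productivity differences between vessel vintages are ignored here.

```python
def perpetual_inventory(initial_stock, entries, exits, depreciation_rate):
    """Return the path of aggregate capital K_t, starting with K_0.

    Assumed recursion (illustrative only):
        K_t = (1 - depreciation_rate) * K_{t-1} + entries_t - exits_t
    """
    if len(entries) != len(exits):
        raise ValueError("entries and exits must cover the same periods")
    stocks = [initial_stock]
    for entering, exiting in zip(entries, exits):
        # Capital surviving from the previous period after depreciation.
        surviving = (1.0 - depreciation_rate) * stocks[-1]
        stocks.append(surviving + entering - exiting)
    return stocks


# Hypothetical fleet: capital values in arbitrary units, 7% depreciation.
capital_path = perpetual_inventory(
    initial_stock=100.0,
    entries=[10.0, 0.0, 20.0],
    exits=[0.0, 5.0, 0.0],
    depreciation_rate=0.07,
)
```

The returned list holds one more element than the number of periods, because it includes the initial stock.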
The vessel stock K_t needs to be turned into a flow of capital services. The
dual approach is through the user cost of capital or the Christensen-Jorgenson
capital services price [42]. Squires [206–208, 210–212], in the first estimates,
assumed an opportunity cost of K_t equal to the rate of return on a BAA-rated
bond, which is considered a “risky” bond, and an economic depreciation rate of
7%. Fisheries studies typically assume there is no “unanticipated revaluation” of
physical capital (capital gains or losses), although such revaluation could apply to
an asset like an ITQ share. Aggregate capital services costs aggregate over individual
asset types and ages. Primal measures of capital services are obtained by multiplying
K_t by a measure of time, such as days at sea or days fishing.
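
A minimal sketch of the dual and primal measures follows. It assumes the familiar user cost form, asset price times the sum of the opportunity rate and the depreciation rate less any revaluation rate; the function names, the vessel price, and the 8% bond rate are hypothetical, and only the 7% depreciation rate comes from the text.

```python
def user_cost_of_capital(asset_price, opportunity_rate, depreciation_rate,
                         revaluation_rate=0.0):
    """Annual user cost: price * (opportunity rate + depreciation - revaluation).

    revaluation_rate defaults to zero, matching the usual assumption of no
    capital gains or losses on physical capital.
    """
    return asset_price * (opportunity_rate + depreciation_rate - revaluation_rate)


def capital_services_flow(capital_stock, days_at_sea):
    """Primal measure: the capital stock scaled by a measure of time."""
    return capital_stock * days_at_sea


# Hypothetical vessel worth 500,000 with an assumed 8% bond rate.
annual_cost = user_cost_of_capital(
    asset_price=500_000.0, opportunity_rate=0.08, depreciation_rate=0.07
)
services = capital_services_flow(capital_stock=1.0, days_at_sea=180)
```

A nonzero `revaluation_rate` would be the natural place to represent capital gains or losses on an asset such as an ITQ share.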

Labor

The crew (i.e., labor) is typically rewarded using a “lay system” rather than an
hourly wage rate. These arrangements have existed for well over 100 years. Crew
is compensated with a percentage of the revenue from a fishing trip either before
or after trip costs are deducted. There are a wide variety of lay systems in place,
and who pays for the trip costs is an important part of each system, as well as
the split of revenue. When crew are paid through shared remuneration systems, the
shares paid to individual crew members are often not homogeneous, highlighting differences in
marginal productivity and payment of labor quasi rents.
Shared remuneration systems vary across the globe [96], including four remu-
neration systems commonly used in fisheries: (1) fixed remuneration systems, (2)
shared remuneration system: proportional to catch or revenues, (3) shared remu-
neration system: proportional to revenues minus operational costs, and (4) shared
remuneration system: proportional to profits. Sometimes fishermen’s remuneration
includes two or more features of these classifications.
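
The four classes of remuneration system can be compared with a small sketch. The system labels, argument names, and numbers are hypothetical, and real lay systems often combine several of these features.

```python
def crew_pay(system, revenue, operating_costs=0.0, fixed_costs=0.0,
             share=0.0, fixed_wage=0.0):
    """Total crew remuneration for one trip under a stylized lay system."""
    if system == "fixed":
        return fixed_wage
    if system == "revenue_share":
        # Proportional to catch value or revenues.
        return share * revenue
    if system == "net_revenue_share":
        # Proportional to revenues minus operational (trip) costs.
        return share * (revenue - operating_costs)
    if system == "profit_share":
        # Proportional to profits, here revenue less operating and fixed costs.
        return share * (revenue - operating_costs - fixed_costs)
    raise ValueError(f"unknown remuneration system: {system}")


# Hypothetical trip: the same 40% share gives different pay under each system.
trip = dict(revenue=10_000.0, operating_costs=3_000.0, fixed_costs=2_000.0)
pay_by_system = {
    name: crew_pay(name, share=0.4, **trip)
    for name in ("revenue_share", "net_revenue_share", "profit_share")
}
```

Holding the share constant, crew pay falls as more costs are deducted before the split, which is why who pays the trip costs matters as much as the share itself.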
The lay system, similar to crop sharing, is usually explained as a means of sharing
risk [231], and is widely used in fisheries worldwide [153]. Vestergaard [244]
recognized risk sharing in lay systems, but also characterized these arrangements as
a principal-agent issue. This is particularly evident if workers’ effort is unobservable
or its supervision is costly or ineffective [231]. Vessel owners often do not participate
in fishing, but instead hire a captain and crew to fish their vessel for extended periods
of time. The share system allows the owner to both share risk and lower their
monitoring cost. McConnell and Price [153] suggest that moral hazard and team
agency can explain the share system. Moral hazard exists because individual effort is
usually unobserved by the vessel owner, which leaves stochastic harvest as the only
observable output of crew effort. Team agency conflicts come about because individual
crew members independently allocate effort, which is both costly and unobservable [153].
The role of the lay system in commercial fisheries, and how it impacts vessel pro-
duction, has not been as extensively studied as other topics in fisheries production.
Early studies include Sutinen [233], Plourde and Smith [182], Craig and Knoeber
[48], and Matthiasson [151].
A lay system changes some of the behavioral assumptions which underlie the
usual neoclassical production model. Since the share system allows the crew to
share in revenue earned on a fishing trip, they may be earning more than their
opportunity cost, meaning they are earning an economic surplus (i.e., economic
rent) from their participation in the fishery [96]. Revenue maximization may then
be a better assumption than profit-maximizing or cost-minimizing behavior when
modeling fishing vessel behavior [13]. Moreover, incentives for investment in an
individual transferable quota, or ITQ system, may become distorted, so that the
presence of a share system could lead to either overinvestment or underinvestment
in vessels depending on the share which goes to the boat owner [100].
McConnell and Price [153] addressed whether the lay system distorts empirical
fishery production model results. They cautioned that the presence of a share system
may undermine econometric results (e.g., create biased and inconsistent parameter
estimates) from random utility models or dual-based production models, in which
input demand for labor cannot be derived. This limitation is overcome when
the analysis is an economic one, which uses the opportunity cost of labor as a shadow
wage rate, shadow prices for other inputs when appropriate, and capital services
prices, rather than a private or financial analysis, which uses observed prices [238–240].

Management or Skipper Skill

In fishing industries, the effect of the firm’s management is known as the skipper effect
or the “good captain hypothesis.” The firm’s management can also be viewed as
part of the larger issue of unobserved heterogeneity between firms and unob-
served inputs in general, which extends beyond the individual firm’s management
to include multiple unobserved factors that influence production [157]. When
management is specified as a residual, it includes the effects of factors that do
not depend on management, but rather on the firm’s particular environmental
conditions.
Vessel management and this residual have been addressed in several different
ways. Authors looked for skipper effects by examining the size of the residual
variance remaining in the analysis of vessels’ catch rates after accounting for vessel
characteristics and other inputs [2, 23, 87, 106, 168, 169, 235]. Researchers have
specified proxy variables for the age, education, and experience of managers or
principal components analysis of personal characteristics of managers to derive
a proxy variable for management. Comitini and Huang [46] and Campbell [35]
employed dummy variables based upon one or more knowledgeable experts’
subjective evaluations of ordinal levels of skipper skill. Holt [110] specified a
cardinal measure of fishing skill, based on the proportion of successful pursuits
adjusted for vessel characteristics and days of effort, designed to distinguish
between professional and nonprofessional vessels. Del Valle et al. [53] used the
ratio between the number of small landings and total landings as a measure of
skipper skill. As with all proxy variables, measurement error and bias can follow,
although the asymptotic bias expected from inclusion is generally smaller than from
exclusion. Endogenous regressors with biased and inconsistent parameter estimates
are also a danger.
Kirkley et al. [128] specified managerial efficiency or skipper skill as techni-
cal efficiency measured through a stochastic production frontier. Extending the
technical efficiency approach, Kirkley et al. [128], Sharma and Leung [200],
Vishwanathan et al. [246], and others included an additional equation explaining
the technical inefficiency term. Such studies were often unsuccessful or inconsistent
when attempting to explain technical inefficiency identified as skipper skill by
variables such as years of education or experience.
Squires and Kirkley [219] applied the panel data approach of fixed and random
effects for the combined effect of the vessel and skipper to distinguish productive
performance between vessels. Wolff et al. [260] specified a three-way fixed effects
model, distinguishing between the vessel and skipper fixed effect (along with time
effects). They further allowed for potential time-varying firm management, whereby
the skipper can learn through acquiring additional experience with the production
process over time and through length of job tenure with a vessel or firm. Tingley
et al. [238] utilized data envelopment analysis, along with the stochastic production
frontier, to evaluate skipper skill.

Nonrivalrous Inputs

A vessel’s production function depends not only on its rivalrous and excludable
(i.e., private) inputs but also its nonrivalrous and (partially) nonexcludable inputs.
The nonrival and (partially) excludable public good knowledge or ideas – new and
accumulated technology – can be embodied in both physical and human capital of
the fishing firm [221, 224]. Knowledge has an accompanying externality and social
learning external to individual producers [12, 187]. A firm’s production function
then depends on the level of knowledge in the economy or fisheries sector. One
firm’s innovation, adopted by other firms, enhances all firms’ productivity and
innovation.
Nonconvexities arise with knowledge embodied in accumulated and new tech-
nology. Once the high fixed cost of creating new and better knowledge is incurred,
the public good knowledge can be reused at little or no additional cost [12, 187].
This generates increasing returns to scale external to the individual producer over
all inputs – both rivalrous and excludable (private) and nonrivalrous and (partially)
nonexcludable (public) – in the production function.
New sources of knowledge – new technology – can either arise exogenous to
the sector, such as with information and communications technology embodied
in electronic and other equipment to find fish, or can arise endogenously through
research and development, such as biased technological change to reduce bycatch
by reducing the bycatch-target catch ratio. In either case, this new technology
becomes endogenous within the fisheries sector due to the producer’s investment
decisions required to implement it and the knowledge externality accompanying it.
Knowledge can be accumulated indefinitely without diminishing returns to physical
and human capital, leading to ongoing, endogenous economic growth in effort and
pressures upon the natural resource stock. Knowledge embodied in K_t intensifies
this process.

Resource Stock

The resource stock S, measured in biomass or numbers of fish, is not under the
control of the individual firm, and hence is an exogenous technological constraint
in a positive firm-level analysis rather than normative industry-level analysis
[126, 206, 207, 209, 212]. In the fisheries stock-flow production technology,
∂π(W, P; K, S)/∂S ≥ 0, i.e., an increase in S does not reduce restricted profits
π(W, P; K, S), and ∂π(W, P; K, S)/∂S = W_S, i.e., the firm’s shadow value for S.
A similar interpretation can be given to environmental parameters, such as sea
surface temperature, wind speed, etc. Similar considerations hold for the full static
equilibrium profit function and the full and partial static equilibrium cost and
revenue functions. S is treated like other quasi-fixed or fixed stocks such as K, except
in a dynamic approach in which K (with adjustment costs) is treated differently
than S.
Several specifications are possible for S [8, 15]. In cross-sectional studies, S is
treated as fixed, common to all vessels and equally distributed spatially, and not
explicitly specified. In time series, time series of cross-sections (pseudo-panel
data), or panel data, S has been measured with indices of abundance or actual measures
from population assessments [24, 56, 99]. In multispecies fisheries, aggregate
biomass is typically specified, if it is specified at all, due to the multicollinearity that
would otherwise arise. Dummy variables or time fixed effects are often specified,
especially in short panels [26, 198, 199, 206, 207, 211–242]. Time dummy variables
can capture not just S but also changes in disembodied technology, state of the
environment, regulations, and other factors that change over time. Squires [212] and
Pascoe et al. [175] used revenue shares to aggregate individual species measures of
abundance into a composite index of abundance. When population assessments are
unavailable, indices serving as proxy variables (with attendant issues of endogeneity
and measurement error) are sometimes specified [8, 15]. Comitini and Huang [46]
used catch per skate (a unit of longline gear). Kirkley et al. [127] used a vessel’s last tow’s
trips using a scallop dredge. Eggert [63] used overall average landings value and
Pascoe and Coglan [171] specified average catch value per hour fished. Kirkley
et al. [130, 131] used lagged average fleet-wide landings per unit effort. Similarly,
Pascoe et al. [175] used average fleet-wide catch per unit effort during the season’s
first week. Andersen [8] showed that production estimates (here from DEA) differ
depending on whether a stock index based on catch per unit effort is specified for
each primary species, a stock index from independent stock assessments is specified
for each primary species, or one composite stock index is included for each
observation based on the independent stock measures and the relative importance
of the primary species. If such catch indices are not properly specified (e.g.,
lagged and/or fleet wide), endogeneity and identification and proxy variable issues
can arise along with other limitations discussed by Andersen [8]. All approaches
implicitly assume that S is constant (not appreciably depleted) over the specified
time period.
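
As one illustration of a composite index, the sketch below weights species-level abundance indices by revenue shares. The arithmetic weighting, the function name, and the data are assumptions made for this example; the cited studies may use a different index formula.

```python
def composite_abundance_index(abundance, revenue):
    """Revenue-share-weighted average of species abundance indices.

    abundance and revenue are dicts keyed by species name.
    """
    if set(abundance) != set(revenue):
        raise ValueError("abundance and revenue must cover the same species")
    total_revenue = sum(revenue.values())
    if total_revenue <= 0:
        raise ValueError("total revenue must be positive")
    return sum(
        (revenue[species] / total_revenue) * abundance[species]
        for species in abundance
    )


# Hypothetical two-species fishery with abundance indexed to a base year.
abundance = {"cod": 1.2, "haddock": 0.8}
revenue = {"cod": 300.0, "haddock": 100.0}
stock_index = composite_abundance_index(abundance, revenue)
```

Because the weights are revenue shares, a species that dominates landings value also dominates the composite index.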
The stock elasticity, ∂ ln Y_t/∂ ln S_t, measures the impact of changes in S_t upon
Y_t in a production function. Conventional wisdom holds that the stock elasticity is
close to zero for pelagic stocks due to their schooling behavior [99, 240] and closer
to one for demersal stocks due to their more even spatial distribution [193]. The limited
empirical evidence places the stock elasticity between zero and one for cod and saithe,
at unity for pelagic herring and albacore, and at 0.39 for anchovy. Gordon and Hannesson
[91] find that the presence and size of the stock effect depend upon the time period
and overall state of technology.
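
The stock elasticity can be illustrated with a minimal log-log regression. The sketch assumes a catch equation of the form Y = qES^β with a unit effort elasticity, so that ln(Y/E) is linear in ln S; the data are synthetic and noise-free, and the endogeneity and proxy-variable issues discussed above are ignored.

```python
import math


def stock_elasticity(catch, effort, stock):
    """OLS slope of ln(catch per unit effort) on ln(stock)."""
    x = [math.log(s) for s in stock]
    y = [math.log(c / e) for c, e in zip(catch, effort)]
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Synthetic data generated with a true stock elasticity of 0.4.
stock = [80.0, 100.0, 120.0, 150.0]
effort = [10.0, 12.0, 9.0, 11.0]
catch = [0.5 * e * s ** 0.4 for e, s in zip(effort, stock)]
beta_hat = stock_elasticity(catch, effort, stock)
```

With noise-free data the estimate recovers the assumed elasticity up to floating-point error; real applications would estimate the effort elasticity jointly rather than fixing it at one.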

Dual Representations of Technology

Squires [206–211], Kirkley and Strand [126], Dupont [56, 57], Bjørndal [24], and
Bjørndal and Gordon [26] introduced the dual approach to econometrically analyze
the vessel-level technology, using cost, revenue, and profit functions rather than the
primal approach of the production function. The dual approach was accompanied
by the introduction of flexible functional forms, such as the translog, normalized
quadratic, and generalized Leontief, which allow less restrictive input and output
substitution possibilities and biased technological change. The disaggregated dual
approach opened up the possibility of many types of analyses consistent with
other areas in industrial organization and production economics that were under
development at the time.
The dual approach greatly facilitated examining the multiproduct firm by
allowing disaggregated outputs and inputs and exogenous prices as regressors.
Specifications were generally partial static equilibrium or short run, conditional
upon K_t. The dual approach has been used to estimate optimal vessel size [26, 207,
210, 211] and optimal engine power and headrope length [175]. It also allowed
researchers to examine product supply and product transformation possibilities
[122, 126, 205–207, 209–211], input demand and input substitution possibilities
[56, 206, 207, 209–211], the specification and testing of various types of joint
production, output and input-output separability, quantity controls (including ITQs)
through virtual prices and quantities and quota shadow prices (discussed below),
capacity and capacity utilization, and the multiproduct cost structure for revenue
and profit-maximizing firms [208, 210, 211, 216].

Product Transformation and Substitution Possibilities

The dual approach provides short-run Hicksian (net, compensated) and long-
run Marshallian (gross, uncompensated) output supply and derived input demand
price elasticities. Due to local Le Chatelier effects from the expansion effect,
long-run or Marshallian elasticities are more elastic than short-run or Hicksian
elasticities. Measures can also be obtained from directional distance functions
equal to the difference between strong and weak output disposal with efficient
production frontiers [194]. Most dual-based models show that own- and cross-price
elasticities of output supply and variable input derived demand are typically inelastic
across gear types, and that cross-price elasticities indicate a mixture of substitutes and
complements for both inputs and outputs. Some elastic responses
have been found in the long run using Morishima elasticities.
Output transformation possibilities reflect “selectivity” and “targeting” ability
[37, 38, 194]. Complementarity and substitutability and the degree of elasticity
reflect vessels’ ability to change product or input mix as they change when,
where, and how they fish, and ex ante they indicate how vessels might respond to
changes in input or output controls and whether or not vessels might discard quota
overages. Elasticities and product transformation and input substitution possibilities
change with changes in S and environment, and change when, where, and how
fishing occurs. Morishima elasticities of substitution are occasionally used [13, 142,
173, 174]. Differences in output disposability indicate limited output substitution
possibilities [194].
Substitution between unrestricted inputs (outputs) and restricted inputs (outputs)
can be evaluated by the elasticity of intensity [54]: ∂ ln X_i(W, P; K)/∂ ln K_k, where k
denotes a type of capital (or output) and a negative (positive) elasticity shows a
substitute (complementary) relationship. Studies examining the relationship between
currently restricted inputs (such as limited allowable fishing days) and unrestricted
inputs using the elasticity of intensity include Dupont [57] and Deacon et al. [51]
(who, along with Dupont [59], pay particular attention to the dissipation of rents),
Pascoe et al. [175], Hansen and Jensen [103], and Squires [215], who evaluated
the impact upon the elasticity of intensity of adding or dropping quantity
controls under the virtual quantity framework. Dupont [57] shows inelasticity and
complementarity between the restricted and unrestricted input (which limits rent
dissipation that would otherwise occur with input substitution), while Hansen and
Jensen [103] show slightly elastic substitution between restricted days and fuel and
no relationship between restricted days and vessel (capital).

Structure of Multiproduct Costs

The firm’s (vessel’s) multiproduct cost structure is central to analyses of multiproduct
industry structure and the impact of public regulation [20]. The multiproduct
cost structure was developed under the behavioral hypothesis of cost minimization
of a given, exogenous output vector. However, because multiproduct fishing vessels’
products are endogenous, Squires [207, 211] and Squires and Kirkley [216]
retrieved the multiproduct cost structure from the information contained in the
revenue and profit functions under the behavioral assumptions of revenue or profit
maximization with endogenous outputs, where the costs are shadow costs with the
revenue function.
Economies of scope measure the cost savings from producing multiple outputs
rather than producing each separately when production is (almost) joint in inputs
[20]. Scope economies derive from weak cost complementarities or fixed costs that
do not depend on the quantities of outputs produced but do vary with which outputs
are chosen (since that affects local cost complementarities or anticomplementarities)
[20, 93]. Squires [207, 210, 211] and Squires and Kirkley [216] extend the measure-
ment of scope economies, transray convexity, incremental and average incremental
costs, and weak cost complementarities to revenue- and profit-maximizing firms
with endogenous outputs using the revenue and profit functions. Empirical results
find both economies and diseconomies of scope in fishing vessels, either by directly
estimating from the cost, revenue, or (restricted) profit function or from cost
complementarities [7, 15, 119, 207, 208, 210, 211, 216, 256, 258]. Economies of
scope are largely found in trawl vessels (as expected) and diseconomies of scope
are found in the surf clam and ocean quahog fleet (reflecting the spatial stock
separation).
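
A minimal numerical sketch of the scope measure follows. It uses the usual two-output definition, the cost of separate production minus the cost of joint production, expressed as a share of joint cost; the cost function, its parameters, and the output levels are hypothetical and chosen only so that a shared fixed cost and a cost complementarity are both present.

```python
def hypothetical_cost(y1, y2, fixed_cost=10.0, complementarity=0.01):
    """Illustrative cost function with a shared fixed cost and a
    cost complementarity between the two outputs."""
    if y1 == 0 and y2 == 0:
        return 0.0
    return fixed_cost + y1 + y2 - complementarity * y1 * y2


def economies_of_scope(cost_fn, y1, y2):
    """(C(y1, 0) + C(0, y2) - C(y1, y2)) / C(y1, y2).

    A positive value indicates economies of scope, a negative value
    diseconomies of scope.
    """
    joint = cost_fn(y1, y2)
    separate = cost_fn(y1, 0.0) + cost_fn(0.0, y2)
    return (separate - joint) / joint


scope = economies_of_scope(hypothetical_cost, 10.0, 10.0)
```

In this example separate production duplicates the fixed cost and forgoes the complementarity, so the measure is positive.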
Product-specific returns to scale S_i[Y] measure the change in costs through
variation in the quantity of one product while holding other products’ quantities
constant [20]. Firms with increasing product-specific returns to scale have a cost
incentive to expand the scale of production of this product and may become
specialized in its production. Squires and Kirkley [216] develop and show how
to measure S_i[Y] and incremental and average incremental costs for the revenue-
or profit-maximizing firm. A sufficient condition for S_i[Y] can be obtained by
examining incremental marginal shadow costs or costs found from the diagonal
elements of the Hessian submatrix for outputs from the estimated parameters of
the profit or revenue function [207, 210, 211, 216]. Empirical results indicate
that different pelagic and demersal species are produced under conditions of both
increasing and decreasing product-specific economies of scale [7, 15, 55, 119, 199,
207–211, 216, 256]. Some species, such as those long lived and slow growing,
which are subject to increasing product-specific returns, can be vulnerable to
overharvesting due to the decreasing marginal production costs.
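
For reference, the standard cost-side definitions behind these measures can be written out as follows. This is the textbook formulation, with C(Y) the multiproduct cost function and Y_{-i} denoting the output vector Y with product i set to zero (notation introduced here for illustration); the shadow-cost versions for revenue- or profit-maximizing firms developed in [216] are not reproduced.

```latex
\begin{align}
IC_i(Y)  &= C(Y) - C(Y_{-i}), \\
AIC_i(Y) &= \frac{IC_i(Y)}{Y_i}, \\
S_i[Y]   &= \frac{AIC_i(Y)}{\partial C(Y)/\partial Y_i}.
\end{align}
```

Under these definitions, S_i[Y] greater than, equal to, or less than one corresponds to increasing, constant, or decreasing product-specific returns to scale, respectively.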
Multiproduct economies of scale are typically measured along a ray in output
space that keeps outputs in fixed proportions, although other measures exist [20,
198]. The revenues exceed, are less than, or equal to (long-run) costs as there are
decreasing, increasing, or locally constant long-run ray returns to scale. Increasing
multiproduct returns to scale are found by Hannesson [99], Bjørndal [24], Asche
et al. [17], Weninger [256], Bjørndal and Gordon [27], Felthoven and Paul [76],
Nesbøkken [164], and Lazkano [142] and decreasing multiproduct ray returns to
scale are found by Squires [206–209], Squires and Kirkley [216], Alam et al. [7],
Horrace and Schnier [112], and Hoff and Frost [107]. Increasing multiproduct ray
economies of scale are sometimes found in output-regulated fisheries that prevent
vessels from increasing the scale of production.
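
The link between ray returns to scale and the comparison of revenues with costs can be made explicit. The sketch below uses the standard cost-side measure and assumes, for illustration only, that each output price P_i equals marginal cost; R denotes total revenue.

```latex
\begin{align}
S_N(Y) &= \frac{C(Y)}{\sum_i Y_i \, \partial C(Y)/\partial Y_i}, \\
R &= \sum_i P_i Y_i = \sum_i Y_i \, \frac{\partial C(Y)}{\partial Y_i}
   \quad \text{when } P_i = \frac{\partial C(Y)}{\partial Y_i}, \\
S_N(Y) &= \frac{C(Y)}{R}.
\end{align}
```

Hence S_N(Y) < 1 (decreasing ray returns) implies R > C(Y), S_N(Y) > 1 (increasing ray returns) implies R < C(Y), and S_N(Y) = 1 implies R = C(Y).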
A cost function is subadditive at an output vector Y if and only if it is no more
costly to produce Y in a single firm than to divide its production among separate firms,
i.e., C(Y) ≤ C(Y_1) + C(Y_2) for output vectors Y_1 and Y_2 with Y_1 + Y_2 = Y [20].
Evans and Heckman [68, p. 615] stated: “Thus an
industry is a natural monopoly if a single firm can produce all relevant output
vectors more cheaply than two or more firms.” Cost subadditivity would suggest that
some form of fishermen’s monopoly is appropriate on private efficiency grounds
[167, 210, 211]. Squires [207, 211] develops local sufficient conditions for cost
subadditivity using revenue or (restricted) profit functions. Onofri and Francesc
[167] devise an additional test for cost subadditivity in the fishery sector. Squires
[211], Alam et al. [7], and Onofri and Francesc [167] reject cost subadditivity in
fishing industries. Decreasing multiproduct ray economies of scale would explain
the absence of cost subadditivity.
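
The subadditivity condition can be checked numerically for a given cost function. The following is a minimal sketch for two outputs and two firms that searches a grid of possible divisions of Y; the function names and both cost functions are hypothetical, and a grid search can only reject subadditivity, not prove it. It is not the test proposed in [167] or the local sufficient conditions of [207, 211].

```python
def is_subadditive_at(cost_fn, y1, y2, steps=10):
    """Check C(Y) <= C(Y_a) + C(Y_b) over grid splits with Y_a + Y_b = Y."""
    total = cost_fn(y1, y2)
    for i in range(steps + 1):
        for j in range(steps + 1):
            if (i, j) in ((0, 0), (steps, steps)):
                continue  # trivial split: one firm produces everything
            a1, a2 = y1 * i / steps, y2 * j / steps
            b1, b2 = y1 - a1, y2 - a2
            if cost_fn(a1, a2) + cost_fn(b1, b2) < total - 1e-9:
                return False  # two firms would be cheaper than one
    return True


def fixed_cost_technology(y1, y2):
    """Shared fixed cost with constant marginal costs."""
    if y1 == 0 and y2 == 0:
        return 0.0
    return 10.0 + y1 + y2


def convex_technology(y1, y2):
    """Rising marginal costs and no fixed cost."""
    return y1 ** 2 + y2 ** 2


single_firm_cheaper = is_subadditive_at(fixed_cost_technology, 10.0, 10.0)
splitting_cheaper = not is_subadditive_at(convex_technology, 10.0, 10.0)
```

The second technology shows why decreasing returns work against subadditivity: with rising marginal costs, dividing output between two firms lowers total cost.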

Multiproduct Joint Production

The nature of joint production, including the production possibilities frontier (PPF),
impacts model specification, spatial management, rights-based management, and
fisheries management in general. Changes in S_t shift the PPF in or out and twist it
in the stock-flow production process. Area fished does not alter the PPF (except in
a disaggregated model in which different areas contain different resource stocks).
Different areas, aggregations by age and size, etc. are readily accommodated under
block joint production as discussed below. Here we only discuss desirable products
and the most relevant types of joint production for two desirable species (products)
Y_1 and Y_2.
Joint-in-input quantities production arises when all inputs are used to produce
all outputs [98, 140]. The PPF implicitly assumes either a reasonably homogeneous
distribution of both species across all fishing grounds or aggregation across such
areas. Many of the ex ante analyses discussed above specified and tested this
type of PPF. This PPF could also be applied to outputs specified as species
area and to completely different resource stocks (and hence areas) as in Holzer
and De Piper [111]. If only some of the species are regulated by transferable
quotas, substitution from regulated to unregulated species may occur [16, 58,
66, 189]. Squires and Kirkley [218] developed the two-price, two-quantity direct
elasticity of transformation for two ITQ-regulated species and this type of
jointness.
Almost joint-in-input quantities production arises when the production process uses not only
standard inputs such as fuel and labor but also quasi-public inputs, especially the
vessel K_t, that cannot be explicitly allocated among nonjoint production processes
[141, 146]. Many of the ex ante analyses were short run in K_t and hence almost
joint-in-input quantities. Pascoe et al. [173] applied this approach in a multiproduct
distance function, and Hutniczak et al. [118] specified a restricted profit function.
Hansen and Jensen [103] recognized the issue of almost jointness, but specified
a distinct and unique model using days as an input and a multistage production
process.
Nonjointness-in-input quantities [98], also called output independence [138,
139], arises when: (1) there are separate production processes for each harvested
species or areas and (2) inputs are allocated between the different production
processes. Each production process can be separately regulated without affecting
production of the other processes because there are no technological or cost trade-
offs between the output of one activity and that of another [126, 205, 207]. No
empirical study has found nonjointness-in-input quantities in fisheries throughout
the species set.
Different multiproduct production processes can even be scalar multiples of one
another when there is either Leontief aggregation of inputs and outputs, so that
they are in fixed proportions, or a single input and output (homothetic input-output
separability and nonjointness-in-input quantities). This is the specification of linear
programming, which Reimer et al. [184] utilized for different areas. Each product
combination is produced in fixed proportions, and hence there are no changes in
product-species compositions for an individual production process. Instead, changes
in species mix arise through shifting to a different production process (which could
pertain to a different area or gear type).
Block jointness in inputs [33, 34, 134], also known as block output independence
[138, 139] and first discussed in fisheries by Squires [204], arises when there are
multiple but distinct and separate production processes that are joint within a range of
products but nonjoint between these processes. Squires and Kirkley
[216] found block jointness for Pacific coast groundfish trawlers.
Two forms of joint-in-input quantities production that directly address area and
area-specific species combinations, simple species aggregations for interrelated
species, or sequential production of different species, species groups, or gear
configurations are a block structure and almost jointness [146]. Hence, individual
species (or groups of species) could be partitioned into joint blocks (i.e., groups)
that capture product-species combinations and/or product-species groups and areas
(which can be a simultaneous decision, since area often defines species composition
and density) or different gear configurations. For example, there could be two separate
fishing grounds for species groups Y_1 and Y_2, with product transformation
possibilities within each block but not between blocks.
The block joint, almost joint, and nonjoint-in-input quantities specification could
be combined with a multinomial logit/probit model [35] or switching regression, or
random utility model [29] for choice of area, depending upon the vessel’s behavioral
assumption. Alternatively, product species could be explicitly defined by area and
species, with a three-way panel data specification to also allow for endogeneity in both
area and species [112]. Area dummy variables as regressors could potentially be
biased and inconsistent since they are potentially endogenous due to the choice of
fishing ground.
Almost joint-in-output quantities [140, 141] arises when there are multiple
production functions for each type of variable input with the exception of sharing
the fixed inputs. Such jointness can imply the sequential use of different inputs or
different fishing strategies as found by Hutniczak et al. [118]. Allocated fixed inputs
can create product interdependence that differs from technical interdependence.
Hansen and Jensen [103] develop a restricted profit function with allocable days
which is also discussed by Reimer et al. [184].
Empirical tests for gill net, trawl, purse seine, longline, and dredge vessels almost
always reject nonjointness in inputs or almost nonjointness-in-input quantities [7,
13, 15, 55, 118, 126, 173, 189, 199, 206, 207, 236]. Exceptions include Campbell
and Nicholl [38], who find nonjointness for generalist firms and nonjointness for purse
seine vessels (specialized firms), Squires and Kirkley [216] who find nonjointness
for one product and reject nonjointness for all others in a trawl fishery (giving a
block joint production process), Weninger [256] for the ocean quahog and surf clam
fishery, and Alam et al. [6] who find both input-output separability and nonjointness
in a Malaysian gillnet fishery, which implies that all supply equations are scalar
multiples of one another [98].

Separability

Aggregation of individual outputs, notably individual species, into a single output,
and of inputs, such as capital, fuel, and labor (L), into a composite input effort (E),
is the structure of the bioeconomic model. Aggregation occurs through either
Leontief-Sono separability, Hicks-Leontief aggregation, or the generalized compos-
ite commodity theorem [28, 143]. Input-output separability is the implicit structure
of the bioeconomic model, in which there is a single composite input and output,
and the marginal rate of transformation between outputs is independent of changes
in inputs, and the marginal rate of substitution between inputs is independent of
changes in outputs [99, 207]. Only the levels of catch and effort require regulation,
and regulation of the input (species) mix does not adversely affect the optimal
product (factor) combinations [207]. Input-output separability has largely been
empirically rejected [7, 13, 37, 38, 55, 118, 126, 173, 189, 207, 210, 211, 216,
256]. Two notable exceptions are Alam et al. [6] and Squires [206] who did not
reject input-output separability.
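The separability restriction can be stated compactly. The following is a sketch in generic notation, not the notation of any cited study: the symbols $T$, $g$, $f$, $M$, and $N$ are assumptions introduced here for illustration, with $g$ playing the role of the composite output and $f$ the role of composite effort.

```latex
% Input-output separability: the transformation function splits into an
% output aggregator g(y) and an input aggregator f(x).
\[
  T(y, x) = g(y_1, \ldots, y_M) - f(x_1, \ldots, x_N) = 0 .
\]
% Consequently, marginal rates of transformation do not depend on inputs,
% and marginal rates of substitution do not depend on outputs:
\[
  \frac{\partial}{\partial x_k}
  \left( \frac{\partial g / \partial y_i}{\partial g / \partial y_j} \right) = 0 ,
  \qquad
  \frac{\partial}{\partial y_i}
  \left( \frac{\partial f / \partial x_k}{\partial f / \partial x_l} \right) = 0 .
\]
```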

Distance Functions

The distance function starts from a set-theoretic foundation and does not require a
specific functional form in contrast to the production function. Distance functions
also allow specifying disaggregated outputs and inputs in the primal rather than
the dual specification. Distance functions have been used to estimate technical
efficiency [127, 128, 258, 180, 181], capacity [60, 124, 125, 126, 129, 145, 247],
productivity [76, 166, 252, 255], vessel valuation [75, 130, 131], targeting and
bycatch problems [72, 173, 174, 194], vessel buyback programs [250], and optimal
fleet size and the basis for vessel buyout programs [132, 250]. Dupont et al. [60]
and Herrero et al. [104] included slack variables in DEA models to allow for
nonradial changes in input and output mix. Walden and Tomberlin [249] introduced an
“order-m” frontier and free disposal hull to estimate fishing capacity [251].
Technical efficiency refers to the individual firm or vessel's level of production
given its bundle of rivalrous inputs and the states of technology, environment,
and resource stocks, relative to the best-practice frontier established by the
highest-achieving firms or vessels. Technical efficiency from an input orientation
is $TE_I(y, x) = 1/D_I(y, x) \le 1$, where $D_I(y, x)$ denotes an input-oriented
distance function. The $TE_I$ value indicates the factor by which a vessel would
have to scale its inputs downward to be technically efficient and operate on the
best-practice production frontier. If an output orientation is desired, TE is given
by $TE_O(x, y) = [D_O(x, y)]^{-1} \ge 1$, where $D_O(x, y)$ denotes an
output-oriented distance function [85]. A natural extension of measuring TE was
the estimation of vessel capacity using the Johansen [121] plant capacity
definition. The introduction of data envelopment analysis (DEA) [41] and the
stochastic production frontier (SPF) [5] was pivotal in using distance functions
to model vessel-level production.
Although these methods were introduced before 1980, it took some time for fisheries
researchers to integrate them in their work.
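As a minimal illustration of the input-oriented measure, the sketch below assumes the simplest possible case: one input, one output, and constant returns to scale, so that the best-practice frontier is the ray through the vessel with the highest catch per unit of input. The function name and the data are hypothetical; applied work would solve a DEA linear program or estimate a frontier instead.

```python
def input_te_crs(inputs, outputs):
    """Input-oriented technical efficiency for one input and one output
    under constant returns to scale (an assumption of this sketch).

    The frontier is the ray through the vessel with the highest output per
    unit of input, so TE_I = (y_i / x_i) / max_j (y_j / x_j), in (0, 1].
    """
    if len(inputs) != len(outputs) or not inputs:
        raise ValueError("inputs and outputs must be non-empty and of equal length")
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


# Hypothetical data: days at sea (input) and tonnes landed (output).
days = [10.0, 20.0, 30.0, 40.0]
catch = [50.0, 80.0, 150.0, 120.0]

te_input = input_te_crs(days, catch)
# Input distance function value, D_I = 1 / TE_I >= 1.
d_input = [1.0 / te for te in te_input]

for i, (te, d) in enumerate(zip(te_input, d_input)):
    print(f"vessel {i}: TE_I = {te:.3f}, D_I = {d:.3f}")
```

A vessel with $TE_I = 0.8$ could, under these assumptions, produce its observed catch with 80% of its observed input.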
The directional distance function, a generalization of the traditional distance
function, can be used to account for unintended outputs, such as bycatch. This
modeling of TE measures a vessel’s ability to expand its intended catch and contract
its bycatch given its input use. Such models to estimate vessel productivity and TE
include Weninger [257], who modeled the efficient production frontier for vessels
operating in the mid-Atlantic surf clam and ocean quahog fishery. Scheld and
Walden [194] examined TE for multispecies fisheries where the ability to catch one
species may be hindered due to regulations regarding catch of other species. The
stochastic multiproduct distance function has also been estimated to evaluate output
transformation possibilities in fisheries [81, 173, 174, 176, 184].

Technical Efficiency and Stochastic Production Frontiers

The stochastic production frontier is the most widely used specification in fisheries
to measure technical inefficiency or deviation from the best-practice frontier. The
stochastic production frontier relates a vessel’s maximum output to its inputs,
$X_{1it}, X_{2it}, \ldots, X_{Nit}$, while allowing for stochastic events. A second, simultaneously
estimated equation can explain the technical inefficiency according to exogenous or
predetermined variables.
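One common way to write such a two-equation system is sketched below. The log-linear (Cobb-Douglas) form and the symbols $\beta$, $\delta$, $z_{it}$, $v_{it}$, $u_{it}$, and $w_{it}$ are assumptions made for illustration; the studies cited in this section use a variety of functional forms and distributional assumptions.

```latex
% Frontier equation: composed error with symmetric noise v and
% one-sided technical inefficiency u.
\[
  \ln Y_{it} = \beta_0 + \sum_{n=1}^{N} \beta_n \ln X_{nit} + v_{it} - u_{it},
  \qquad u_{it} \ge 0 .
\]
% Technical efficiency of vessel i in period t.
\[
  TE_{it} = \exp(-u_{it}) \in (0, 1] .
\]
% Inefficiency equation: u explained by exogenous or predetermined
% variables z, with w chosen so that u remains nonnegative.
\[
  u_{it} = z_{it}'\delta + w_{it} .
\]
```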
Hannesson [99] applied the first production frontier, a deterministic one, to
Norwegian cod fisheries, in which the one-sided deviation from the frontier
captures both stochastic shocks and technical inefficiency. Kirkley et al. [127, 128]
followed with the stochastic frontier, identifying technical efficiency with skipper
skill (see the skipper skill section). These early papers specified only the production
frontier. Sharma and Leung [200], Vishwanathan et al. [246], and Squires et al.
[226, 228] first included the second equation to explain technical inefficiency.
Grafton et al. [94] first accounted for economic inefficiency, including both technical
and cost inefficiency, in a study of the impact of ITQs in the British Columbia
fishery for Pacific halibut. Kompas et al. [133] first related technical inefficiency to
input controls, showing that technically efficient fishers substituted unregulated for
regulated inputs and that technical efficiency declined with increasing restrictions
on production. These and many other studies that followed find different degrees
of technical inefficiency, and those relating technical inefficiency to measurable
attributes of skippers and crew find a wide range of results but typically do not
find a statistically valid relationship. Asche and Roll [14], using a shadow revenue
function, estimated revenue inefficiency and its decomposition into technical and
allocative inefficiency in the Norwegian groundfish fishery. Horrace and Schnier [112]
nonparametrically identify time-varying technical efficiency by exploiting the
spatial variation of vessels in three-dimensional (cross-sectional, time, and area)
panels in Bering Sea flatfish fisheries, where each vessel can move across space
and time.
Recent analyses focused upon the impact of rights-based management upon
technical efficiency. New [162] found that structural adjustment reducing fleet
size did not impact vessels’ technical efficiency in the Australian Eastern Tuna
and Billfish Fishery. Schnier and Felthoven [196] found that a vessel’s measure
of technical inefficiency is a significant and positive factor in explaining whether
it exits a fishery following ITQs. Huang et al. [114] found that participation in
the collective rights-based management system (“sectors”) of the New England
groundfish fishery impacted behavioral responses rather than technical efficiency
even though sector participation led to shifts in the production frontiers for trawl
and gillnet vessels. Estrada et al. [67] examined the impact on technical efficiency
of cooperative catch shares for artisanal vessels in anchovy and sardine fisheries in
south-central Chile using a difference-in-differences causal inference framework.
The cooperative catch shares reduced average technical efficiency, although the
impact on heterogeneity depended upon the characteristics of fishermen’s organi-
zations, so that greater cooperation among members increased technical efficiency.
Mainardi [149] specified and estimated two stochastic frontier semiparametric
models for a panel of Falkland Island fisheries over 2003–2014 that treat unobserved
heterogeneity as a finite mixture or discrete approximation to continuous parameter
variation, by adjusting for sample selection and latent classes, respectively. The
hypothesis of frontier-enhancing effects of the new ITQ/ITE regime is supported for
most, albeit not all, fishing companies. Evans et al. [69] estimate a cost frontier with
time-varying inefficiency to allow for spatial variation in unobserved productivity
effects and measure changes in technical efficiency, capital investment/divestment
incentives, and resource rent following ITQs.
Other approaches can measure TE. Salvanes and Steen [190] specified a thick
frontier, in which the best-practice frontier is determined by grouping together
vessels with the smallest estimated disturbances. Holloway et al. [109], Holloway
and Tomberlin [108], and Tomberlin and Holloway [239, 240] apply the Bayesian
approach to composed error models under alternative, hierarchical characteriza-
tions, and demonstrate the Bayesian approach to model comparisons using recent
advances in Markov Chain Monte Carlo methods. Collier et al. [44] evaluate a
California multiple-input, multiple-output fishery using a hybrid DEA stochastic
frontier model in which DEA is used in a first stage to measure aggregate
output used in the second stage, the stochastic production frontier. Pascoe et al.
[175], estimating a restricted multiproduct profit function, applied the fixed effects
approach of Schmidt and Sickles [195].

Rationing and Quotas

The microeconomic theory of quotas, rations, and other quantity controls allows
better understanding of their impact upon fishing vessels. These quantity controls
include nontransferable individual vessel quotas and individual transferable quotas
(ITQs) for catch and effort and limits on gear, fishing time, vessel size, and inputs
in general.
The microeconomic theory of rationing and quotas for firms, initially developed
in consumer theory [161] and international trade [160], was extended to production
theory by Fulginiti and Perrin [86], Squires and Kirkley [216–218], Segerson and
Squires [199], Squires [213, 215], Squires et al. [225], Vestergaard [243], and
Vestergaard et al. [245]. Using the virtual price framework of Neary and Roberts
[161], Squires and Kirkley [216–218], and Squires [215] showed that the unit rent
of an individual transferable quota, or ITQ, is the difference between the output
price and virtual price of the quota and forms the firm’s inverse derived demand
function for the ITQ. Firm inverse demand is horizontally summed to form the
market ITQ demand. In equilibrium, the horizontally summed aggregate inverse
ITQ demand curve equated to the exogenous aggregate supply curve (typically a
total allowable catch or effort) gives the market ITQ price. The Antonelli matrix
of changes in endogenous unit rents in response to exogenous marginal quota
changes gives this unit rent and forms the basis of ITQ price flexibilities. Squires
and Kirkley [218] also calculated the gains from trade, estimated the ITQ product
transformation frontier, and developed the direct elasticity of substitution (two price-
two output) between ITQs. Extending Neary’s [160] ex post framework using virtual
quantities – the dual to virtual prices, Squires [214, 215] evaluated the effects of
adding, subtracting, or changing existing quotas. The microeconomic theory of
rationing and quotas has been applied to the substitution of unregulated inputs for
regulated inputs in input-regulated fisheries, first described by Pearce and Wilen
[179] and Wilen [259], by Squires [206, 213–215] and Dupont [57], and to the
spillover effects between quota (including ITQ)-regulated species and unregulated
species [16, 58, 66, 117, 215].
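The market-clearing logic described above can be sketched numerically. The example assumes, purely for illustration, that each vessel's inverse derived demand for quota is linear, with unit rent $r = a_i - b_i q_i$; the intercepts, slopes, and total allowable catch are hypothetical numbers and the function names are not taken from any cited study.

```python
def quota_demand(rent, a, b):
    """Quota demanded by one vessel at a given unit rent, from the
    assumed linear inverse demand rent = a - b * q."""
    return max(0.0, (a - rent) / b)


def aggregate_demand(rent, vessels):
    """Horizontal sum of individual quota demands at a given unit rent."""
    return sum(quota_demand(rent, a, b) for a, b in vessels)


def clearing_rent(vessels, tac, tol=1e-10):
    """Unit rent at which aggregate quota demand equals the fixed supply
    (the total allowable catch), found by bisection."""
    lo, hi = 0.0, max(a for a, _ in vessels)
    if aggregate_demand(lo, vessels) <= tac:
        return 0.0  # the quota does not bind, so its price is zero
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if aggregate_demand(mid, vessels) > tac:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical (intercept, slope) pairs for three vessels and a TAC of 10.
vessels = [(10.0, 1.0), (8.0, 2.0), (6.0, 1.0)]
tac = 10.0

rent = clearing_rent(vessels, tac)
holdings = [quota_demand(rent, a, b) for a, b in vessels]
print(f"market ITQ price: {rent:.4f}")
print("equilibrium holdings:", [round(h, 4) for h in holdings])
```

Because supply is exogenous and vertical, a change in the total allowable catch moves the price along the summed demand curve, which is the sense in which the unit rent responds to marginal quota changes.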

Le Chatelier Principle, Quotas, and Product Transformation Possibilities

The Le Chatelier principle as applied to economics by Samuelson [191] shows
that there are behavioral implications of rationality which are only exhibited when
extra constraints are imposed or withdrawn. The Le Chatelier principle applies
when transitioning from short-run to longer-run production or adding or subtracting
or adjusting quantity controls, such as quotas or trip limits, or property rights or
other direct regulations that impact production. The local Le Chatelier principle
states that if variables in a system are chosen to optimize a function, then as a
result of an infinitesimal (i.e., marginal) change to the system, e.g., an extremely
small change in prices or quotas, the responsiveness of the chosen variables will
be reduced (increased) when extra constraints are added to (dropped from) the
optimization problem. The Le Chatelier principle can be local, corresponding to
marginal changes, or global, corresponding to nonmarginal changes [155]. The key
question is whether the change in production is due to a nonmarginal (discrete)
rather than marginal (infinitesimal, local) policy shock.
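A small numerical sketch of the local principle follows. It assumes a hypothetical two-species vessel with a quadratic joint cost function; the parameters `a` and `b`, the prices, and the function names are illustrative assumptions only. The quota on species 2 is set at the initial unconstrained optimum, so it is just binding, and the own-price supply response of species 1 is compared with and without it.

```python
def unconstrained_supply(p1, p2, a=2.0, b=1.0):
    """Profit-maximizing catches of two species with no quota.

    Assumed profit: p1*y1 + p2*y2 - 0.5*(a*y1**2 + 2*b*y1*y2 + a*y2**2),
    with a > |b| so that the cost function is strictly convex.
    """
    det = a * a - b * b
    y1 = (a * p1 - b * p2) / det
    y2 = (a * p2 - b * p1) / det
    return y1, y2


def quota_constrained_supply(p1, y2_quota, a=2.0, b=1.0):
    """Profit-maximizing catch of species 1 when species 2 is fixed at a quota."""
    return (p1 - b * y2_quota) / a


p1, p2, dp = 10.0, 8.0, 1.0

# Initial optimum; the quota on species 2 is set equal to its initial catch.
y1_start, y2_start = unconstrained_supply(p1, p2)
quota = y2_start

# Response of species 1 supply to a rise in its own price.
response_free = unconstrained_supply(p1 + dp, p2)[0] - y1_start
response_quota = quota_constrained_supply(p1 + dp, quota) - y1_start

print(f"response without quota: {response_free:.4f}")
print(f"response with quota:    {response_quota:.4f}")
```

Under these assumptions the response with the extra constraint (1/a) is never larger than the response without it (a/(a² − b²)), which is the local Le Chatelier result in its simplest form.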
The Le Chatelier principle can explain the failure of ex ante analyses in fisheries
to consider the full range of a vessel’s ability to adjust its catch and/or input mix
when there are nonmarginal (discrete) changes in policy, production constraints,
industry structure, markets, biology, environment, or prices. Squires and Kirkley
[216–218], Squires et al. [227], and Pascoe et al. [173, 174] analyzed ex ante the
introduction of ITQs in multispecies fisheries. Appealing to the local Le Chatelier
principle, they observed that ITQs may be ineffective due to limited substitution
possibilities between species, leading to potentially high discards or difficulties in
quota balancing. However, Sanchirico et al. [192], Branch and Hilborn [30], Abbott
et al. [1], Reimer et al. [184, 185], and Scheld and Walden [194], through ex post
empirical evaluation of multispecies fisheries with ITQs compared to production
prior to ITQs, found that vessels can frequently adjust their species mix far more
easily than the ex ante analyses anticipated. A global meta-analysis of 345
stocks showed that for many fisheries, management controls improve under ITQs
in terms of reduced variation in catch around quota targets [154], although counter-
evidence also exists [135]. ITQs can also reduce bycatch [62].
Reimer et al. [184, 185] posit that the estimated model must be “structural”
with respect to that shock, as discussed by Haavelmo [97] and Lucas [147] for
economics in general and for rights-based management in particular. That is, the
ex ante analyses’ assumptions of invariant behavioral equations are inconsistent
with dynamic maximizing behavior and changed incentives. Reimer et al. [184,
185] observe that revealed production possibilities are frequently constrained and
confounded by regulatory incentives, and that the empirically revealed production
set strongly depends on the institutional, economic, and biological setting in place
when fishing was observed.
The Le Chatelier principle more rigorously explains the failure of ex ante
models to anticipate vessels’ responses to introducing rights-based management.
When rights are introduced and existing direct regulations are reformed in marginal
ways, then the local Le Chatelier principle applies. Rights, however, are often
introduced into a deteriorated fishery and replace existing and highly restrictive
direct regulation or lead to its substantial modification. This situation constitutes a
nonmarginal policy shock and a change in complementary institutions, regulations,
and business practices, whether or not the property rights as transferable output
controls are simply marginal extensions of the existing regulations. The global Le
Chatelier principle framework then fully explains the observed results of greater
flexibility and is consistent with economic theory.

Fishing Time

A tension can exist between the standard economic approach to production, which
specifies variable inputs as some physical capital, notably gear and equipment, and
labor, materials, and energy, and the standard biological specification of days as
steaming, search, and fishing time. (A few studies examined fisheries regulated by
days, in which case days were specified as quasi-fixed. These studies include Dupont
[56, 57] and Hansen and Jensen [103].) The use of days to represent variable inputs
represents use of a proxy variable as a flow of energy and services from stocks of
labor, gear, and equipment. Proxy variables can introduce measurement error, in turn
leading to biased and inconsistent parameter estimates. Days as a variable input in
an econometric model can also be subject to endogeneity and identification issues,
in turn also leading to biased and inconsistent parameter estimates. The use of days,
depending upon its application, can implicitly assume either Leontief aggregation
or homothetic input separability.
Another tension exists between the stock of physical capital K and labor L and
flow of capital and labor services. When K and L are specified as fixed or quasi-fixed
factors, then production is conditional upon these stocks and issues do not arise.
When K and L stocks are specified as variable inputs, then the implicit assumption
is made that flows are proportional to stocks. When this assumption is invalid, then
biased and inconsistent estimates potentially arise due to the measurement error.

Technological Change

Technological change is one of the main driving forces behind the historical
development of fishing industries. Along with investment in $K_t$, technological
change impacts the status of the resource stocks and the extension of fishing grounds
by depth and geographical range, broadening of species harvested, and habitat
impact. Population biology addresses technological change through time-varying
catchability and changes in selectivity. Technological change can be exogenous,
endogenous, or both; oriented on the process (inputs) through process innovation
(and factor augmentation) or on outputs through product innovation (and product
augmentation); disembodied or embodied; and centered on target species or bycatch
species. Technological change can be oriented to the target species or desirable
outputs, the conventional approach. Technological change can also be oriented to
bycatch or habitat impact or undesirable outputs, which is biased technical change.
More generally technological change can be either neutral (typically Hicks neutral)
or biased in the outputs or inputs. Technological change can lead to lower costs per
unit of effort (input augmenting) or increased catch rates per unit of effort (output
augmenting) given $S_t$. Technological change can also lead to new species caught
through expanding the range of production and introducing new areas and depths to
fish with new gear and equipment.
Technological change on target species and input usage is largely exogenous to
the fisheries sector [221, 222, 224]. Endogeneity arises through any investment in
physical capital, by which the technological change is embodied, or in research
and development to adapt the external sources of new technology to fisheries.
Technological change to reduce bycatch and habitat impact is largely endogenous to
the fisheries sector. Endogenous technological change arises because technological
change typically requires research and development. Further endogeneity arises if
the new technology is embodied in new physical capital requiring investment. Mea-
surement of technical change in fisheries is typically based on primal specification of
technology, notably the production function, or cost diminution given output (dual),
over time [46, 130, 131]. Economic analysis of technological change essentially
treats it as a residual after all other inputs and control variables (technological
constraints) have been included in the model [204], including the resource stock
[212].
The most common specification of technological change in fisheries production
functions is disembodied and accounted for by a linear time trend, which implicitly
assumes that technological progress is Hicks neutral, exponential, and progresses
at a constant rate [46, 65, 80, 99, 110, 156, 158, 175, 201, 221, 224]. Pascoe
et al. [175] find a positive linear time trend and negative squared linear time
trend, indicating increasing Hicks-neutral technological change but at a declining
rate. Hannesson et al. [102] adopted the approach of Baltagi and Griffin [18],
which allows technology to progress at a variable rate for each individual year, to
evaluate technical change over 100 years in the Lofoten cod fishery, an analysis
extended by Kvamsdal [137]. Gordon and Hannesson [91] applied an ARMAX model to
the Norwegian winter herring fishery. Banks et al. [19] and Kirkley et al. [130,
131] analyzed embodied technical change in the Sète trawl fishery using dummy
variables. Kvamsdal [136] applied a structural time series model with a stochastic
trend to measure technological change in a Cobb-Douglas production function with
both single equation and multivariate models to the Norwegian Lofoten cod fishery.
Fissel and Gilbert [79] specified a compound Poisson process incorporated into the
catchability coefficient of the Schaefer production function. Gilbert and Yeo [89]
examined technology adoption patterns and productivity differences in a Malaysian
artisanal fishery to evaluate whether technology is a substitute or complement for
managerial skill, i.e., examining skill-diluting and skill-augmenting technological
change.
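The time-trend specifications described at the start of this discussion can be made concrete with a short sketch. It assumes a log-linear production function in which technology enters as exp(lam1*t + lam2*t**2); the parameter names and values are hypothetical and are not estimates from any cited study.

```python
import math


def tech_level(t, lam1, lam2=0.0):
    """Hicks-neutral technology index exp(lam1*t + lam2*t**2).

    With lam2 = 0 this is the linear time trend in logs, i.e., exponential
    technological progress at the constant rate lam1.
    """
    return math.exp(lam1 * t + lam2 * t * t)


def tech_rate(t, lam1, lam2=0.0):
    """Rate of technical change, the time derivative of the log index."""
    return lam1 + 2.0 * lam2 * t


# Hypothetical coefficients: positive linear term, negative squared term.
lam1, lam2 = 0.03, -0.0005

for year in (0, 10, 20, 30):
    print(f"t = {year:2d}: level = {tech_level(year, lam1, lam2):.4f}, "
          f"rate = {tech_rate(year, lam1, lam2):.4f}")
```

With a positive linear and a negative squared coefficient, the index keeps rising but at a declining rate until t = -lam1 / (2 * lam2), here year 30.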

Productivity Growth

Measurement of total factor productivity (TFP) growth in fisheries requires accounting
for all the sources of growth to disentangle changes in the TFP residual
from changes in $S_t$ and the environment. This section surveys measuring TFP
growth using the growth accounting framework of Solow [204] and economic index
numbers as opposed to econometrically estimating technological progress using a
production function to then infer productivity growth. Walden et al. [254] give an
additional survey of productivity change in fisheries.
Bell and Kinoshita [22] measured labor productivity and Kirkley [122], Norton
et al. [163], and Davis et al. [50] estimated TFP in a growth accounting framework
but without accounting for changes in $S_t$. Squires [208–213] and Herrick and Squires
[105] specified a growth accounting framework using economic index numbers
and recognized the stock-flow nature of the production technology to disentangle
changes in $S_t$ from input changes, while adjusting for the elasticity of output
with respect to $S_t$ and
variations in capacity utilization under both open access and the economic optimum
(sole owner). Squires [212, 213] developed superlative index number approaches to
consistently aggregate the resource stocks of individual species into an aggregate
$S$. Jin et al. [120], Arnason [11], Hannesson [101], Torres et al. [241], Eggert
and Tveterås [64], Pan and Walden [170], and Walden et al. [255] analyzed TFP
growth in different fisheries. Brandt [31], Walden et al. [252, 253], Färe et al. [74],
Solís et al. [203], and Thunberg et al. [237] evaluated the impact of ITQs upon
TFP growth. Felthoven and Paul [76], Squires et al. [229], and Paul et al. [178]
accounted for changes in the state of the environment. Squires and Vestergaard
[223] developed the impact of TFP growth on optimum resource use within the
context of the bioeconomic model. Pascoe et al. [177] further discuss the impact of
productivity growth (and TE) on maximum economic yield.
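A stylized version of stock-adjusted growth accounting is sketched below. It assumes, for illustration only, that TFP growth is the log change in output minus the cost-share-weighted log change in inputs minus the output elasticity of the stock times the log change in the stock; the function name, the shares, and the data are hypothetical, and capacity utilization adjustments are omitted.

```python
import math


def tfp_growth(y0, y1, x0, x1, shares, s0, s1, stock_elasticity):
    """Growth accounting residual adjusted for the resource stock.

    y0, y1: output in the base and comparison periods
    x0, x1: lists of input quantities in the two periods
    shares: input cost shares (assumed to sum to one)
    s0, s1: resource stock in the two periods
    stock_elasticity: assumed elasticity of output with respect to the stock
    """
    output_growth = math.log(y1 / y0)
    input_growth = sum(w * math.log(b / a) for w, a, b in zip(shares, x0, x1))
    stock_growth = math.log(s1 / s0)
    return output_growth - input_growth - stock_elasticity * stock_growth


# Hypothetical example: catch rises 10% with unchanged inputs.
inputs_0 = [50.0, 20.0]
inputs_1 = [50.0, 20.0]
cost_shares = [0.6, 0.4]

# If the stock is ignored (elasticity 0), all of the increase is read as TFP growth.
naive = tfp_growth(100.0, 110.0, inputs_0, inputs_1, cost_shares, 1000.0, 1100.0, 0.0)
# With a unit stock elasticity, the 10% stock increase explains it entirely.
adjusted = tfp_growth(100.0, 110.0, inputs_0, inputs_1, cost_shares, 1000.0, 1100.0, 1.0)

print(f"TFP growth ignoring the stock:   {naive:.4f}")
print(f"TFP growth adjusted for the stock: {adjusted:.4f}")
```

The comparison shows why failing to account for stock changes can attribute a biological recovery to productivity growth.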
Norton et al. [163] developed an index of profits or economic health that
incorporated TFP but did not account for changes in $S_t$. Fox et al. [83] fully
decomposed a profitability index, which included price, productivity, and capacity
utilization indexes and changes in $S_t$, to determine whether changes in productivity
or prices have the largest impact upon profitability change. Subsequent profitability
index analyses include Dupont et al. [61], Fox et al. [84], and Walden and Kitts
[248].
Two broad approaches have been used to construct economic index numbers
[254]. The first approach constructs superlative economic index numbers,
using prices as indicators of production elasticities, such as the Lowe, Törnqvist,
and Fisher ideal index numbers. The second approach constructs economic index
numbers using distance functions and linear programming. These methods construct
a production frontier based on observed values of inputs and outputs in different
time periods, which also allows decomposing productivity change into change in
technical, allocative, and scale efficiency. Some of the different indices that can
be constructed include the Malmquist, Hicks-Moorsteen, and Lowe.
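The price-based approach can be illustrated with the Fisher ideal and Törnqvist quantity indexes, using their standard textbook formulas. The prices and quantities below are hypothetical, and the function names are introduced here only for the example; a TFP index is then the ratio of an output quantity index to an input quantity index.

```python
import math


def laspeyres_quantity(p0, q0, q1):
    """Quantity index weighted by base-period prices."""
    return sum(p * q for p, q in zip(p0, q1)) / sum(p * q for p, q in zip(p0, q0))


def paasche_quantity(p1, q0, q1):
    """Quantity index weighted by comparison-period prices."""
    return sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p1, q0))


def fisher_quantity(p0, q0, p1, q1):
    """Fisher ideal index: geometric mean of Laspeyres and Paasche."""
    return math.sqrt(laspeyres_quantity(p0, q0, q1) * paasche_quantity(p1, q0, q1))


def tornqvist_quantity(p0, q0, p1, q1):
    """Tornqvist index: share-weighted geometric mean of quantity relatives."""
    v0 = [p * q for p, q in zip(p0, q0)]
    v1 = [p * q for p, q in zip(p1, q1)]
    s0 = [v / sum(v0) for v in v0]
    s1 = [v / sum(v1) for v in v1]
    log_index = sum(0.5 * (a + b) * math.log(y / x)
                    for a, b, x, y in zip(s0, s1, q0, q1))
    return math.exp(log_index)


# Hypothetical two-species catch and two-input (fuel, labor) data.
out_p0, out_q0 = [2.0, 5.0], [100.0, 40.0]
out_p1, out_q1 = [2.2, 4.5], [120.0, 42.0]
in_p0, in_q0 = [1.0, 3.0], [80.0, 30.0]
in_p1, in_q1 = [1.1, 3.2], [82.0, 31.0]

output_index = fisher_quantity(out_p0, out_q0, out_p1, out_q1)
input_index = fisher_quantity(in_p0, in_q0, in_p1, in_q1)
tfp_index = output_index / input_index

print(f"Fisher output index: {output_index:.4f}")
print(f"Fisher input index:  {input_index:.4f}")
print(f"TFP index:           {tfp_index:.4f}")
```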

Bioeconomic Models

Effort as an Input

Early bioeconomic models specified an aggregate production function coupled with
a biological model [90, 197, 202]. These early models, which grew out of biological
production models, typically specified a single composite rivalrous input “effort,”
E [49, 99, 126, 188, 206, 207]. Effort in biological models usually involved a time
component, such as days absent from port, or time the gear was in the water.
Huang and Lee [113] and Anderson [9, 10] were among the first economists to
question the biologists’ specification of effort. Anderson [9, 10] specified effort as
the output of a two-stage production process, although not founded upon theory
(Leontief-Sono homothetic separability or Leontief aggregation, exact separability,
and aggregate production functions). In contrast, Huang and Lee [113] discussed
effort within the context of separability and aggregate inputs and the conditions for
an aggregate production function. They specified the aggregate production function
in terms of individual inputs K and L rather than composite E. Anderson [9, 10],
interested in individual firms’ production functions but retaining E, constructed cost
curves based on E. Hannesson [99] provided a theoretically consistent framework of
separability and aggregation to composite E and defined a more general production
framework that integrated E with traditional neoclassical production theory.
Multiproduct technologies have a comparable two-stage optimization process
in which revenues are optimized in each stage of aggregation and with allocative
efficiency to form the composite output $Y_t$ [206, 207, 221, 224]. The separability
is exact, rather than approximate, since $Y_t = f(q, E_t, S_t)$ is an exact
representation of technology. Weak (and therefore also strong) Leontief-Sono
separability is not a sufficient condition for the first stage of aggregation.
Homothetic separability provides both a necessary and sufficient condition, which
requires a linearly homogeneous aggregator function [28, 92, Lemma 3.3a]. Linearly
homogeneous $E_t$ aggregator functions for the rivalrous inputs,
$E_t = g(X_{1t}, K_t)$, satisfy Fisher's factor reversal test, in which the cost of
the rivalrous inputs equals the product of $g(\cdot)$ and the corresponding implicit
price index [77]. (To simplify notation, let $X_{1t}$ denote a scalar composite of
rivalrous and excludable variable inputs and let $K_t$ denote the scalar composite
of rivalrous and excludable nominal physical capital stock in natural units, i.e.,
not in efficiency units.) $K_t$ is aggregated over different individual units of
capital of different vintages and levels of embodied capital according to specific
conditions [78]. Linear homogeneity of $g(\cdot)$ in rivalrous inputs also satisfies
the replication argument of production functions with rivalrous inputs [187].
Linear homogeneity of $g(\cdot)$ in rivalrous inputs gives the familiar
Graham-Schaefer production function in which $E_t$ enters linearly
($\beta_1 = 1$): $Y_t = qE_t^{\beta_1}S_t$.
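The replication argument can be checked numerically. The sketch below assumes a Cobb-Douglas effort aggregator with exponents summing to one, which is only one possible linearly homogeneous form; the exponent `alpha`, the catchability value, and the function names are hypothetical.

```python
def effort_aggregator(x1, k, alpha=0.4):
    """Linearly homogeneous effort aggregator g(X1, K).

    Assumed Cobb-Douglas form with exponents alpha and 1 - alpha, so that
    scaling both rivalrous inputs by c scales effort by c.
    """
    return x1 ** alpha * k ** (1.0 - alpha)


def graham_schaefer_catch(q, effort, stock):
    """Graham-Schaefer production function with unit exponents: Y = q * E * S."""
    return q * effort * stock


q, stock = 0.001, 5000.0  # hypothetical catchability and stock
x1, k = 30.0, 12.0        # hypothetical variable input and capital composites

effort = effort_aggregator(x1, k)
effort_doubled = effort_aggregator(2.0 * x1, 2.0 * k)

catch = graham_schaefer_catch(q, effort, stock)
catch_doubled = graham_schaefer_catch(q, effort_doubled, stock)

# Replicating the rivalrous inputs doubles effort and, for a given stock, catch.
print(f"effort ratio: {effort_doubled / effort:.6f}")
print(f"catch ratio:  {catch_doubled / catch:.6f}")
```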
Leontief aggregation requires $E_t = \min(AX_{1t}, BK_t)$, where $A$ and $B$ are
fixed coefficients and in which one of $X_{1t}$, $K_t$ is the limiting factor [3, 113].
Allowing for embodied technical change in $K_t$,
$\tilde{E}_t = \min(AX_{1t}, B\Psi_t K_t) = \min(AX_{1t}, BJ_t)$ [221, 224]. Either
$X_{1t}$ or $J_t$ ($J_t = \Psi_t K_t$, i.e., $K_t$ in efficiency units, where the
average embodied technical efficiency, $\Psi_t$, is defined as the weighted average
level of best-practice efficiency associated with each past vintage of investment)
will be partially idle in the sense that a small change in one input will not
affect output or factor prices. There will be historical partial surplus of either
$X_{1t}$ or $J_t$. The more general Hicks-Leontief composite commodity theorem
requires that the ratio of input prices or quantities of individual rivalrous inputs
comprising $E_t$ to the composite effort price or quantity is independent of time
[143]. The standard fisheries bioeconomic model often assumes the more restrictive
Leontief aggregation [43]. Specification of days as effort implicitly assumes one
of these forms of aggregation (homothetic Leontief-Sono, Leontief, or the
Hicks-Leontief composite commodity theorem) and forms a proxy variable.
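A minimal sketch of Leontief aggregation shows the limiting-factor property. The coefficients, input levels, and the `embodied_efficiency` argument (standing in for the average embodied technical efficiency that converts capital into efficiency units) are hypothetical values chosen for illustration.

```python
def leontief_effort(x1, k, a, b, embodied_efficiency=1.0):
    """Effort under Leontief aggregation: min(a * X1, b * efficiency * K).

    embodied_efficiency = 1.0 reproduces the case without embodied
    technical change.
    """
    return min(a * x1, b * embodied_efficiency * k)


a, b = 1.0, 2.0    # hypothetical fixed coefficients
x1, k = 10.0, 4.0  # hypothetical variable input and capital stock

# Capital is the limiting factor: a * x1 = 10 but b * k = 8.
base = leontief_effort(x1, k, a, b)

# A small increase in the non-limiting input leaves effort unchanged.
more_x1 = leontief_effort(x1 + 1.0, k, a, b)

# Embodied technical change raises capital in efficiency units (2 * 1.5 * 4 = 12),
# so the variable input becomes the limiting factor.
with_embodied = leontief_effort(x1, k, a, b, embodied_efficiency=1.5)

print(base, more_x1, with_embodied)
```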
With homothetic Leontief-Sono separability for rivalrous inputs and embodied
technological change, $\bar{\bar{E}}_t = f(X_{1t}, \Psi_t K_t) = f(X_{1t}, J_t)$ and
$J_t = \Psi_t K_t$ are themselves aggregates in full static equilibrium [221, 224].
As a linearly homogeneous function, $\bar{\bar{E}}_t$ can be written as
$\bar{\bar{E}}_t = \Psi_t f(\Psi_t^{-1} X_{1t}, K_t)$. Assuming a constant rate of
embodied technical change $\psi$ and constant capital share of income $M_2$ with
Cobb-Douglas functional form for production function $f(\cdot)$ implies that
$\Psi_t = e^{M_2 \psi t}$, where $\psi$ is the growth rate of embodied technical
change [115].
When the production technology is Cobb-Douglas or its Graham-Schaefer
form (exponents of one for $E_t$, $S_t$), and the production technology is an exact
representation of technology, then separability inflexibility, a restriction on the
technology, is required (Proposition 1 of Denny and Fuss [52] and Blackorby et al.
[28]). That is, the technology must be a Cobb-Douglas production function f (•) with
a translog effort aggregator function g(•) or f (•) must be translog and g(•) Cobb-
Douglas. Hicks-neutral, disembodied technical change independent of the rate of
rivalrous Kt formation can be specified as growing at a constant exponential growth
rate λ.
Nonconvexities arise with knowledge embodied in accumulated and new technol-
ogy, creating dynamic increasing returns to scale external to the individual vessel
over all inputs, both private and (partially) public, in the production function (as
discussed above). Endogeneity is created because the dynamic increasing returns
to scale are external to the individual production unit. This dynamic positive
externality continuously lowers unit costs with knowledge adoption, which incentivizes
further adoption of new technology, and thus the endogeneity.
The aggregate production function can be specified with disembodied and
embodied technical change, knowledge spillovers accompanying this technical
change, and homothetic exact Leontief-Sono separability [221, 224]. Effort is then
an unobserved composite, private (rivalrous) input. Endogenous nonrival knowledge
spillovers, measured by $\theta > 0$, give aggregate effective effort
$K_t^{\theta} E_t^{\beta_1}$, which in turn gives $\beta_1 + \theta > 0$ and
increasing returns to scale external to firm $i$ in time $t$. Firm $i$'s output in
time $t$ with homothetic exact Leontief-Sono separability is
$Y_{it} = \Lambda(J_t, t)\,\Psi_{it} f_{it}(\Psi_{it}^{-1} X_{1it}, K_{it})\,S_t^{\beta_2}$.
Assume additive separability across all firms $i$ to give an aggregate technology.
Let $\Lambda(J_t, t) = \Lambda(t) J_t^{\theta} = \Lambda(t)(\Psi_t K_t)^{\theta} = \Lambda(t)\Psi_t^{\theta} K_t^{\theta}$.
Letting $\Psi_t = e^{M_2 \psi t}$ and $\Lambda(t) = q e^{\lambda t - \mu(t, Z)}$,
where $q$ denotes the catchability coefficient, gives
$Y_t = q K_t^{\theta} E_t^{\beta_1} S_t^{\beta_2} e^{(\lambda + M_2(\theta + 1)\psi)t - \mu(t, Z)}$.
For the Graham-Schaefer specification, $\beta_1 = \beta_2 = 1$. Absence of the
external effect, $\theta = 0$, gives only rivalrous (private) inputs, giving
$K_t^{\theta} = 1$ and
$Y_t = q E_t^{\beta_1} S_t^{\beta_2} e^{(\lambda + M_2 \psi)t - \mu(t, Z)}$.

Firm $i$'s output in time $t$ with Leontief aggregation builds off of Clark [43],
specifying rivalrous $E_t$ as the rivalrous stock of physical capital $K_t$ formed
under Leontief aggregation and $K_t$ as the limiting factor [221, 224]. This allows
for explicit, intentional, endogenous net investment in $K_t$, and
$\beta_1 = \theta + 1$, where $\theta > 0$ is the positive knowledge spillover, 1 is
the rivalrous or private effect (each producer operates under the assumption of
constant returns to the inputs that the producer controls), and $\beta_1$ is the
aggregate effect. This specification assumes full capital and capacity utilization.
The aggregate production frontier under Leontief aggregation for $K_t$ as the
limiting factor is written
$Y_t = q K_t^{\theta + 1} S_t^{\beta_2} e^{(\lambda + M_2(\theta + 1)\psi)t - \mu(t, Z)} = q K_t^{\beta_1} S_t^{\beta_2} e^{(\lambda + M_2 \beta_1 \psi)t - \mu(t, Z)}$.
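To see how the spillover parameter shapes returns to scale, the aggregate frontier under homothetic separability can be evaluated directly. The sketch below assumes the multiplicative form with a knowledge spillover term in capital, effort and stock exponents, and an exponential term combining disembodied and embodied technical change net of inefficiency; all parameter values are hypothetical, and the inefficiency term is passed in as a plain number `mu` rather than modeled as a function.

```python
import math


def aggregate_catch(k, e, s, t, q, theta, beta1, beta2, lam, m2, psi, mu=0.0):
    """Aggregate catch q * K**theta * E**beta1 * S**beta2
    * exp((lam + m2 * (theta + 1) * psi) * t - mu).

    theta: knowledge spillover (external effect); theta = 0 removes it
    lam:   rate of disembodied technical change
    psi:   rate of embodied technical change; m2: capital share
    mu:    inefficiency term, treated here as a given number
    """
    growth = (lam + m2 * (theta + 1.0) * psi) * t
    return q * k ** theta * e ** beta1 * s ** beta2 * math.exp(growth - mu)


params = dict(q=0.001, beta1=1.0, beta2=1.0, lam=0.01, m2=0.3, psi=0.02)

# Without the external effect, capital has no separate influence on catch.
no_spillover = aggregate_catch(k=10.0, e=100.0, s=1000.0, t=0.0, theta=0.0, **params)

# With a spillover, doubling K and E raises catch by 2 ** (theta + beta1).
base = aggregate_catch(k=10.0, e=100.0, s=1000.0, t=0.0, theta=0.2, **params)
scaled = aggregate_catch(k=20.0, e=200.0, s=1000.0, t=0.0, theta=0.2, **params)

print(f"catch without spillover: {no_spillover:.4f}")
print(f"scale factor with spillover: {scaled / base:.4f}")
```

With unit exponents on effort and stock and no spillover, the expression at t = 0 and mu = 0 collapses to the Graham-Schaefer form q * E * S.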
Both specifications allow for Debreu-Farrell economic (technical, allocative, and
scale) inefficiency and disembodied technical change, learning by doing that can be
exogenous and endogenous to net investment, and embodied technical change with
accompanying knowledge spillovers and learning [224]. The Leontief aggregation
specification of effort, in contrast to the homothetic Leontief-Sono separable
specification of effort, explicitly explains, through endogenous and intentional net
investment in Kt , how embodied technological change that originates from sources
1364 D. Squires and J. Walden

external to the sector and accompanying endogenous knowledge spillovers are
endogenously introduced, through intentional investment in Kt, into the production
process in the fishery sector. The homothetic Leontief-Sono separable specification
provides a comprehensive specification of rivalrous Et using rivalrous X1t, Kt and
allows for input substitution as a source of endogenous growth, but does not account
for intentional net investment. The Leontief aggregation specification assumes that
rivalrous Kt is always the limiting factor without any input substitution as an
endogenous source of growth.

Concluding Remarks

Production economics is increasingly important to the economic analysis and
public regulation of fishing industries. Fisheries economics arose out of population
dynamics and a focus upon the long-term dynamics of the natural resource stock and
an aggregate production technology. These bioeconomic models aimed to obtain the
optimum fleet size and resource stock. Production economics gained in importance
as it developed as a field and was able to contribute to analyses of fishing industries
with an industrial organization orientation and public regulation to address the
market failure stemming from the common resource. This reorientation shifted the
focus to shorter time periods and to the individual firm – usually the vessel – and to
multiproduct, multi-input production. Other factors also contributed to the increased
application of production economics to the fishing industry: (1) the development of
large-scale databases at the vessel level, together with econometric software and estimation
procedures and (2) rapid development of production economics to address issues in
other economic sectors and the general economy.
Empirical production analyses show that the individual vessel’s multiproduct
production process is typically joint-in-input quantities, that a consistent aggregate
output or input seldom exists, and that input substitution and output transformation
possibilities are typically inflexible or limited. Moreover, the longer the time period
and the greater the adjustment of fixed or quasi-fixed inputs or even outputs, notably the
physical capital stock, the greater the input substitution and output transformation
possibilities due to the local Le Chatelier effect. Technical inefficiency
is pervasive, sometimes ranging widely between vessels, allocative inefficiency
exists, and technological progress is important. There is indeed skipper skill, but
it cannot be readily explained by measurable factors.
Production economics in fisheries has come full circle to refine the specification
of the original dynamic renewable resource economics (bioeconomic) models
from which fisheries economics originally emerged. The concept of “effort” as
an aggregate input used in early models has evolved substantially due to the
general production economics literature. The theory of homothetic separability,
aggregation, and index numbers makes clear that effort as a composite of rivalrous
and excludable (private) inputs (capital, labor, energy, materials, etc.) is an index
that requires consistent aggregation according to either Hicks-Leontief aggregation
or homothetic Leontief-Sono separability. The homothetically separable effort
34 The Economics of Production in Marine Fisheries 1365

aggregator function of rivalrous and excludable inputs is linear homogeneous
and the exponent of effort is one. Allocative efficiency of the rivalrous inputs is
accounted for in this effort aggregator function. The exponent of effort can exceed
one when there is the nonrival input knowledge (due to technological progress) and
fall short of one when there is congestion.
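Linear homogeneity of the effort aggregator can be illustrated with a simple example. The sketch below assumes a geometric (Cobb-Douglas) aggregator with shares summing to one, which is only one of many linear homogeneous forms and is not claimed to be the form used in the studies cited; the input names, quantities, and shares are hypothetical.

```python
def effort_index(inputs, shares):
    """Geometric (Cobb-Douglas) aggregator: E = prod_k x_k ** s_k with sum(s_k) == 1."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to one for linear homogeneity")
    effort = 1.0
    for name, share in shares.items():
        effort *= inputs[name] ** share
    return effort


# Hypothetical input quantities and shares.
shares = {"capital": 0.4, "labor": 0.3, "fuel": 0.3}
inputs = {"capital": 120.0, "labor": 6.0, "fuel": 900.0}
doubled = {name: 2.0 * qty for name, qty in inputs.items()}

e1 = effort_index(inputs, shares)
e2 = effort_index(doubled, shares)
print(e2 / e1)  # doubling every private input doubles the effort index
```

Because the aggregator itself is homogeneous of degree one, any exponent on effort above or below one in the catch equation must come from outside the private inputs, such as nonrival knowledge or congestion.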
Technological change has often been overlooked in fishery economics and
bioeconomic models, but is now more often incorporated, almost invariably as
Hicks-neutral and exogenous.
When the public good knowledge, in the form of disembodied and embodied
technical change, is incorporated into the production technology, the resulting
knowledge spillovers may create dynamic endogenous economies of scale
in rivalrous and nonrivalrous inputs.
Fisheries economics now routinely addresses the stock-flow production technology
by specifying biomass, time trends, or dummy variables. Studies
of total factor productivity growth disentangle growth in the productivity residual
from changes in the resource stock. Productivity growth is estimated either
econometrically, through estimates of technical change, or, more often, through growth accounting
and economic index numbers. Many fairly recent studies have examined changes in
total factor productivity due to the introduction of individual transferable quotas.
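The growth-accounting route can be sketched in a few lines. The example below assumes a Törnqvist input index and, as a loudly simplifying assumption, a stock elasticity of one, so that the stock's log change is subtracted one-for-one from output growth. The data are made up and the function names are hypothetical.

```python
import math


def tornqvist_input_growth(x0, x1, s0, s1):
    """Log change of a Tornqvist input index: average-share-weighted log changes."""
    return sum(0.5 * (s0[k] + s1[k]) * math.log(x1[k] / x0[k]) for k in x0)


def tfp_growth(y0, y1, x0, x1, s0, s1, stock0, stock1, stock_elasticity=1.0):
    """Output growth minus input growth minus the stock contribution (all in logs)."""
    output_growth = math.log(y1 / y0)
    input_growth = tornqvist_input_growth(x0, x1, s0, s1)
    stock_growth = stock_elasticity * math.log(stock1 / stock0)
    return output_growth - input_growth - stock_growth


# Hypothetical two-period data: catch rises 10% while the stock rises 6%.
x0 = {"capital": 100.0, "labor": 10.0}
x1 = {"capital": 102.0, "labor": 10.0}
s0 = {"capital": 0.6, "labor": 0.4}
s1 = {"capital": 0.6, "labor": 0.4}

residual = tfp_growth(1000.0, 1100.0, x0, x1, s0, s1, stock0=5000.0, stock1=5300.0)
print(round(residual, 4))
```

Without the stock adjustment, the whole of the catch increase not explained by inputs would be attributed to productivity, which is the confusion the studies described above aim to avoid.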
The impact of rights-based management, notably individual transferable quotas evaluated
before and after their introduction and occasionally against counterfactuals, is a major focus
of analysis in production economics. The analytical approach has ranged from
estimation of multiproduct cost functions and frontiers, to stochastic or deterministic
production frontiers, to measurement of total factor productivity growth
through economic index numbers. One major finding is that product transformation
possibilities become more flexible, due to local or global Le Chatelier effects
as various previous direct regulations are lifted, thereby unbinding production
possibilities. Technical and scale efficiency and total factor productivity tend to
increase in most fisheries after the introduction of rights, although not in all cases,
due to retirement of redundant capital, more efficient use of retained capital and
other inputs, and quota transfers from less efficient to more efficient vessels. Local or
global Le Chatelier effects clearly contribute. Most studies could not support causal
inference because they were “before-and-after” rather than “with-and-without,” where
the “without” is modeled by a counterfactual. The length of industry adjustment
following the introduction of rights-based management depends in part upon how
many direct regulations forming binding constraints were relaxed or eliminated
entirely, and upon the extent and timing of that relaxation, which in turn affects
the local or global Le Chatelier effects.
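The distinction between "before-and-after" and "with-and-without" comparisons can be made concrete. The sketch below uses a difference-in-differences calculation as one possible way to build the "without" counterfactual, assuming a comparable fleet that did not receive quotas and parallel trends between the two fleets. It is not the method of any study cited here, and all numbers are made up.

```python
def mean(values):
    return sum(values) / len(values)


def before_after(treated_before, treated_after):
    """Naive estimate: change in the regulated fleet's mean outcome."""
    return mean(treated_after) - mean(treated_before)


def with_and_without(treated_before, treated_after, control_before, control_after):
    """Difference-in-differences: treated change net of the comparison fleet's change."""
    return (before_after(treated_before, treated_after)
            - before_after(control_before, control_after))


# Hypothetical productivity indexes per vessel (made-up numbers).
itq_before, itq_after = [1.0, 1.2, 1.1], [1.5, 1.7, 1.6]
other_before, other_after = [0.9, 1.0, 1.1], [1.2, 1.3, 1.4]

naive = before_after(itq_before, itq_after)
did = with_and_without(itq_before, itq_after, other_before, other_after)
print(round(naive, 3), round(did, 3))
```

In this made-up example the naive comparison attributes the full change to the quota program, while the counterfactual comparison nets out the change that the comparison fleet experienced anyway.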
Accounting for full economic efficiency, accumulated and new technology, and
knowledge spillovers in bioeconomic models leads to a broader concept of the
dynamic economically efficient equilibrium (i.e., maximum economic yield) than
the original steady state and hence static scale efficiency. Since much of the ongoing
technological change is in the form of information and communications technol-
ogy originating external to the fisheries sector, when combined with knowledge
spillovers, such a dynamic economically efficient equilibrium leads to dynamic
increasing returns to scale, ongoing endogenous growth, and an optimum resource
stock that is less than that of maximum sustainable yield.
Best-practice econometric estimation in fisheries production is improving over
time. Attention is increasingly paid to identification issues, notably testing for
endogenous regressors (such as effort) and, when they are found, instrumental variable
estimation. Heteroscedasticity of a general nature is now routinely corrected through
Eicker-Huber-White methods. Serial correlation receives less attention. The use
of heteroscedasticity-and-autocorrelation-consistent standard errors, such as Newey-West
or Driscoll-Kraay, should be considered, given pervasive heteroscedasticity and
serial correlation due to searching behavior. Spatial autocorrelation seldom receives
attention. Panel data methods, notably fixed or random effects and their testing, are
now well recognized. Addressing sample selection bias requires further attention,
where this bias arises due to vessel entry and exit and the almost universal use
of fishery-dependent data and nonrandom vessel search for catch. Moreover, fisher
search is nonrandom or there can be gear saturation or density-dependent gear
avoidance behavior. Sample selection bias is particularly important for bioeconomic
models specifying an aggregate production function, since otherwise the results are
not representative of the population and maximum economic yield is biased. Unit
root and, if necessary, cointegration tests of time series data, including panel data,
for stationarity and degree of integration should become routine when the time series
is sufficiently long; otherwise regression may be spurious, and differencing may be
necessary before further regression analysis.
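To illustrate why serial correlation matters for standard errors, the sketch below computes a Newey-West style (Bartlett-weighted) standard error for a sample mean and compares it with the standard error that ignores autocorrelation. Applying it to a sample mean rather than to regression coefficients is a simplification chosen for brevity; the series is made up, and `newey_west_se_of_mean` is a hypothetical helper rather than a library function.

```python
def newey_west_se_of_mean(series, max_lag):
    """HAC standard error of a sample mean using Bartlett (Newey-West) weights."""
    n = len(series)
    mean = sum(series) / n
    dev = [value - mean for value in series]
    long_run_var = sum(d * d for d in dev) / n  # lag-0 autocovariance
    for lag in range(1, max_lag + 1):
        weight = 1.0 - lag / (max_lag + 1.0)     # Bartlett kernel weight
        autocov = sum(dev[i] * dev[i - lag] for i in range(lag, n)) / n
        long_run_var += 2.0 * weight * autocov
    return (long_run_var / n) ** 0.5


# Hypothetical catch-rate series with runs of good and bad fishing (made-up numbers).
cpue = [1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 2, 2, 2, 2]

se_naive = newey_west_se_of_mean(cpue, max_lag=0)  # ignores serial correlation
se_hac = newey_west_se_of_mean(cpue, max_lag=3)
print(se_naive, se_hac)
```

With positively autocorrelated data such as this series, the autocorrelation-robust standard error exceeds the naive one, so inference based on the naive figure would be overconfident.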
Where do fisheries vessel-level production studies consistent with production
economics head in the future? In the past, such studies largely followed new
developments in production economics and econometrics, or responded to the unique
feature in fisheries of transferable property rights (essentially an application of
quota and rationing theory), although the application of production economics to
fisheries also contributed to further development of production economics.
One topic that has received considerable attention is the spatial nature of
production due to the area-based nature of fisheries production. Future analysis
requires consistency with production economics and in particular the nature of
joint production and separability. Block jointness-in-input quantities (block out-
put independence) definitely requires further investigation and can contribute to
incorporating area fished. When such a block structure is found, then a discrete
choice model in the vein of Campbell [35] can be coupled with a primal or
dual specification of technology. A comparable development arose in agriculture
with allocable inputs. Random utility models of location can be made consistent
with econometrics through identification and accounting for potentially endogenous
inputs (such as distance or cost) as regressors, but they are difficult to pair with
production technologies (as in Campbell [35]) because their objective function is
utility maximization, whereas production technologies with endogenous catch assume
maximization of catch, revenue, or profit rather than utility.
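For readers unfamiliar with random utility models of location, the sketch below computes conditional logit choice probabilities over fishing areas. It assumes a linear utility index in expected revenue and distance with hypothetical coefficients; the area names and values are made up, and this is not the specification of Campbell [35] or of any other study cited here.

```python
import math


def site_utility(expected_revenue, distance, b_revenue=0.002, b_distance=-0.05):
    """Linear utility index for a fishing area (coefficients are hypothetical)."""
    return b_revenue * expected_revenue + b_distance * distance


def choice_probabilities(utilities):
    """Conditional logit probabilities: exp(V_j) / sum_k exp(V_k)."""
    v_max = max(utilities.values())  # subtract the max for numerical stability
    exp_v = {site: math.exp(v - v_max) for site, v in utilities.items()}
    total = sum(exp_v.values())
    return {site: value / total for site, value in exp_v.items()}


# Hypothetical areas: (expected revenue per trip, distance from port).
areas = {"inshore": (800.0, 10.0), "shelf": (1200.0, 30.0), "offshore": (1500.0, 60.0)}
utilities = {name: site_utility(rev, dist) for name, (rev, dist) in areas.items()}
probs = choice_probabilities(utilities)
print(probs)
```

The probabilities are driven by utility, not by catch, revenue, or profit alone, which is the source of the pairing difficulty noted above.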
Causal inference, including evaluating natural experiments, will grow in impor-
tance to evaluate policy. Explicit application of the local and global Le Chatelier
principle can evaluate the impact of the transition from direct regulation to rights-
based regulation. Accounting for bycatch and broader ecosystem impacts, which
requires proper specification of joint production, is promising. Fisheries economics
evolved from production functions with an aggregate output (a primal specification
of technology) to dual-based methods and has since shifted back to the primal problem,
now relying upon the directional distance function, in part because it
accommodates multiple outputs and joint production, including
undesirable outputs or public bads.
Analysis of technological change is yet another area. Most studies to date
have specified Hicks-neutral disembodied and exogenous technological change at
a constant rate measured through a linear time trend. Insufficient research has
been conducted on technological change that is biased in either (desirable or
undesirable) outputs or inputs. Endogenous technical change that is bycatch and
habitat saving (directed technical change) has received little attention, and can
draw from the considerable progress that has been made in general economics
in this area. Little is empirically known about the factors that induce or direct
biased technological change. Insufficient attention has been given to embodied
technological change and the impact that different vintages of the physical capital
stock, embodying different levels of technology, have upon harvesting. Similarly,
research on investment, economic depreciation, and capital accumulation could
receive additional attention.
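The Hicks-neutral, constant-rate specification described in this paragraph amounts to a linear time trend in the log of output. The sketch below simulates catch with effort and stock held fixed, a simplification made so that only the trend moves output, and recovers the rate by least squares. All values are made up, and in real data effort and stock would vary and enter as regressors.

```python
import math


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x with an intercept."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Simulated catch with constant effort and stock (made-up values).
q, effort, stock, true_rate = 0.001, 20.0, 1000.0, 0.03
years = list(range(10))
log_catch = [math.log(q * effort * stock * math.exp(true_rate * t)) for t in years]

estimated_rate = ols_slope(years, log_catch)
print(round(estimated_rate, 4))
```

A single slope of this kind cannot reveal technical change that is biased toward particular inputs or outputs, or that is embodied in capital of different vintages, which is the gap the paragraph identifies.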
Production economics can contribute to standardization of effort as practiced
by population biologists. Standardization in fisheries refers to combining disparate
technologies, each with different levels of productivity, into a single aggregate
technology and composite input, effort, that in turn enters into a population model
[152]. Topics within production economics that can contribute to standardization
include consistent aggregation across technologies, firms, and inputs, frontier
functions and economic efficiency (especially technical efficiency), multilateral
and bilateral economic index numbers, joint production, and technological change
that impacts time-varying catchability and could be biased. The related discipline
of econometrics can contribute to identification strategies for potentially endoge-
nous regressors in the catch-effort production technology and selection bias for
fishery-dependent data. Econometrics can also contribute to standard errors when
standardizing.
In sum, production economics in fisheries can both open up new areas of empir-
ical analysis and contribute to theoretically consistent specification of production
technology in both fisheries economics and population dynamics.

References
1. Abbott JK, Haynie AC, Reimer MN (2015) Hidden flexibility: institutions, incentives and the
margins of selectivity in fishing. Land Econ 91(1):169–195
2. Acheson JM (1981) Anthropology of fishing. Annu Rev Anthropol 10:275–316
3. Adadio T, Kirkley JE (1981) The examination of separability and aggregation of selected
fish prices: the composite commodity theorem under inexact ratios. Unpublished manuscript,
Northeast Fisheries Science Center, Woods Hole
4. Agnello RJ, Anderson LG (1989) Production relationships among interrelated fisheries. In:
Anderson LG (ed) Economic impacts of extended jurisdiction. Ann Arbor Science Publishers,
Ann Arbor
5. Aigner D, Lovell CK, Schmidt P (1977) Formulation and estimation of stochastic frontier
production function models. J Econ 6:21–37
6. Alam MF, Ishak HO, Squires D (1996) Sustainable resource use, economic development, and
public regulation: the multiproduct gill net fishery of Peninsular Malaysia. Environ Resour
Econ 7:117–132
7. Alam MF, Ishak HO, Squires D (2002) Sustainable fisheries development in the tropics:
trawlers and license limitation in Malaysia. Appl Econ 34:325–337
8. Andersen JL (2005) Production economic models of fisheries: vessel and industry analysis.
Unpublished Ph.D. dissertation, Fisheries Economics and Management Division, Food and
Resource Economics Institute, The Royal Veterinary and Agricultural University
9. Anderson LG (1976) The relationship between firm and fishery in common property fisheries.
Land Econ 52:179–191
10. Anderson LG (1978) Production functions for fisheries: comment. South Econ J 44(3):661–
666
11. Arnason R (2003) On productivity and productivity growth in the Icelandic fisheries. In:
Guðmundsson E, Valtysson HP (eds) Competitiveness within the global fisheries. University
of Akureyri, Akureyri
12. Arrow K (1962) The economic implications of learning by doing. Rev Econ Stud 29(3):155–
173
13. Asche F (2009) Adjustment cost and supply response: a dynamic revenue function. Land
Econ 85(1):201–215
14. Asche F, Roll KH (2018) Economic inefficiency in a revenue setting: the Norwegian whitefish
fishery. Appl Econ 50:6112–6127
15. Asche F, Bjørndal T, Eggert H, Frost H, Gordon D, Gudmundsson E, Hoff A, Jensen C, Pascoe
S, Sissener E, Tveterås R (2005) Modeling fishermen behavior under new management
regimes: final report. SNF report no. 28/05, Institute for Research in Economics and Business
Administration, Bergen
16. Asche F, Gordon D, Jensen C (2007) Individual vessel quotas and increased fishing pressure
on unregulated species. Land Econ 83(1):41–49
17. Asche F, Bjørndal T, Gordon DV (2009) Resource rent in individual quota fisheries. Land
Econ 85(2):280–292
18. Baltagi BH, Griffin JM (1988) A general index of technical change. J Polit Econ 96(1):20–41
19. Banks R, Cunningham S, Davidse WP, Lindebo E, Reed A, Sourisseau E, de Wilde JW
(2001) The impact of technological progress on fishing effort. Final report to the European
Commission
20. Baumol W, Panzar J, Willig R (1982) Contestable markets and the theory of industry structure.
Harcourt-Brace-Jovanovich, New York
21. Bell FW (1972) Technological externalities and common-property resources: an empirical
study of the US northern lobster fishery. J Polit Econ 80:148–158
22. Bell F, Kinoshita R (1973) Productivity gains in U.S. fisheries. Fish Bull 71:911–919
23. Bjarnason T, Thorlindsson T (1993) In defense of a folk model: the “skipper effect” in the
Icelandic cod fishery. Am Anthropol 95:371–394
24. Bjørndal T (1987) Production economics and optimal stock size in a North Atlantic Fishery.
Scand J Econ 89(2):145–164
25. Bjørndal T (1989) Production in a schooling fishery: the case of the North Sea herring fishery.
Land Econ 65(1):49–56
26. Bjørndal T, Gordon DV (1993) The opportunity cost of capital and optimal vessel size in the
Norwegian fishing fleet. Land Econ 69(1):98–107
27. Bjørndal T, Gordon DV (2000) The economic structure of harvesting for three ves-
sel types in the Norwegian spring-spawning herring fishery. Mar Resour Econ 15(4):
281–292
28. Blackorby C, Primont D, Russell R (1978) Duality, separability, and functional structure:
theory and economic applications. North-Holland, New York
29. Bockstael N, Opaluch J (1983) Discrete modeling of supply response under uncertainty: the
case of the fishery. J Environ Econ Manag 10:125–137
30. Branch TA, Hilborn R (2008) Matching catches to quotas in a multispecies trawl fishery:
targeting and avoidance behavior under individual transferable quotas. Can J Fish Aquat Sci
65(7):1435–1446
31. Brandt S (2007) Evaluating tradable property rights for natural resources: the role of strategic
entry and exit. J Econ Behav Organ 63(1):158–176
32. Buchanan N (1978) The fishing power of the Scottish inshore white fish vessels. Occasional
Paper Series No. 1. White Fish Authority, Fisheries Economics Research Unit, Edinburgh,
pp 29
33. Burmeister E, Kuga K (1970) The factor-price frontier, duality and joint production. Rev
Econ Stud 37(1):11–19
34. Burmeister E, Turnovsky SJ (1971) The degree of joint production. Int Econ Rev 12:99–105
35. Campbell HF (1991) Estimating the elasticity of substitution between restricted and unre-
stricted inputs in a regulated fishery: a probit approach. J Environ Econ Manag 20(3):
262–274
36. Campbell HF, Lindner RK (1990) The production of fishing effort and the economic
performance of license limitation programs. Land Econ 66(1):56–66
37. Campbell HF, Nicholl RB (1994) Can purse seiners target yellow fin tuna? Land Econ
70(3):345–354
38. Campbell HF, Nicholl RB (1995) Allocating yellowfin tuna between the multispecies purse
seine and longline fleets. Mar Resour Econ 10:35–58
39. Carlson EW (1973) Cross section production functions for North Atlantic groundfish and
tropical tuna seine fisheries. In Sokoloski A (ed) Ocean fishery management. NOAA technical
report, National Marine Service, CIRC
40. Carlson EW (1975) The measurement of relative fishing power using cross section production
functions. In: Pope JA (ed) Measurement of fishing effort. Rapports et procès-verbaux des
réunions, vol 168, The International Council for the Exploration of the Sea, Charlottenlund
41. Charnes A, Cooper WW, Rhodes E (1978) Measuring the efficiency of decision making units.
Eur J Oper Res 2:429–444
42. Christensen LR, Jorgenson DW (1969) The measurement of U.S. real capital input, 1929–
1967. Rev Income Wealth 15:293–320
43. Clark CW, Clarke F, Munro G (1979) The optimal exploitation of renewable resource stocks:
problems of irreversible investment. Econometrica 47(1):25–47
44. Collier TC, Mamula A, Ruggiero J (2014) Estimation of multi-output production functions in
commercial fisheries. Omega 42(1):157–165
45. Comitini S (1978) An economic analysis of the state of the Hawaiian skipjack tuna fishery.
Sea Grant technical report TR-78-01, University of Hawaii, Honolulu, p 46
46. Comitini S, Huang DS (1967) A study of production and factor shares in the halibut fishing
industry. J Polit Econ 75:366–372
47. Comitini S, Huang DS (1971) Licensing and efficiency: an empirical study of the Japanese
tuna fishing industry. Malaysian Econ Rev 16:252
48. Craig L, Kroeber CR (1992) Manager shareholding, the market for managers, and the
end-period problem: evidence from the U.S. whaling industry. J Law Econ Organ 8(3):
607–627
49. Cunningham S, Whitmarsh D (1980) Fishing effort and fishing policy. Mar Policy 5:309–316
50. Davis L, Gallman R, Hutchins T (1987) Technology, productivity, and profits: British-
American whaling competition in the North Atlantic, 1816–1842. Oxf Econ Pap 39:738–759
51. Deacon RT, Finnoff D, Tschirhart J (2011) Restricted capacity and rent dissipation in a
regulated open access fishery. Resour Energy Econ 33:366–380
52. Denny M, Fuss M (1977) The use of approximation analysis to test for separability and the
existence of consistent aggregates. Am Econ Rev 67(3):404–418
53. Del Valle I, Astorkiza K, Astorkiza I (2003) Fishing effort validation and substitution
possibilities among components: the case study of the VIII division European anchovy
fishery. Appl Econ 35(1):63–77
54. Diewert EW (1974) Applications of duality theory. In: Kendricks DA, Intriligator MD (eds)
Frontiers of quantitative economics, vol II. North-Holland, Amsterdam, pp 106–171
55. Diop H, Kazmierczak RF (1996) Technology and management in Mauritanian cephalopod
fisheries. Mar Resour Econ 11:71–84
56. Dupont DP (1990) Rent dissipation in restricted access fisheries. J Environ Econ Manag
19:26–44
57. Dupont DP (1991) Testing for input substitution in a regulated fishery. Am J Agric Econ
73:55–64
58. Dupont D, Gordon D (2007) Shadow prices for fishing quotas: fishing with econometrics.
In: Bjørndal T, Gordon DV, Arnason R, Sumaila UR (eds) Advances in fisheries economics.
Blackwell Publishing, Oxford, UK, pp 87–108
59. Dupont D, Grafton RQ (2011) Multi-species individual transferable quotas: the Scotia-Fundy
mobile gear groundfishery. Mar Resour Econ 15:205–220
60. Dupont DP, Grafton RQ, Kirkley J, Squires D (2002) Capacity utilization measures and excess
capacity in multi-product privatized fisheries. Resour Energy Econ 24:193–210
61. Dupont DP, Fox KJ, Gordon DV, Grafton RQ (2005) Profit and price effects of multi-species
individual transferable quotas. J Agric Econ 56(1):31–57
62. Edinger T, Baek J (2015) The role of property rights in bycatch reduction: evidence from the
British Columbia groundfish fishery. Fish Res 168:100–104
63. Eggert H (2001) Technical efficiency in the Swedish trawl fishery for Norway lobster. Essays
on fisheries economics. Doctoral thesis, Gothenburg University
64. Eggert H, Tveterås R (2013) Productivity development in Icelandic, Norwegian and Swedish
fisheries. Appl Econ 45(6):709–720
65. Eide A, Skjold F, Olsen F, Flaaten O (2003) Harvest functions: the Norwegian bottom trawl
cod fisheries. Mar Resour Econ 18:81–93
66. Ekerhovd N-A (2007) Individual vessel quotas and unregulated species: the Norwegian blue
whiting fishery. NHH Department of Economics discussion paper no. 5/2007
67. Estrada GAC, Suazo MAQ, Cid JDD (2018) The effect of collective rights-based management
on technical efficiency: the case of Chile’s common sardine and anchovy fishery. Mar Resour
Econ 33(1):87–112
68. Evans DS, Heckman JJ (1984) A test for subadditivity of the cost function with an application
to the Bell system. Am Econ Rev 74(4):615–623
69. Evans KS, Lian C, Weninger Q (2019) Property rights and the efficient extraction of common
pool resources: evidence from West Coast groundfish. Working Paper, University of Maine
70. Färe R, Grosskopf S (2006) New directions: efficiency and productivity. Kluwer Academic
Press, Boston. E-version: Springer
71. Färe R, Grosskopf S, Kirkley J (2000) Multiple output capacity measures and their relevance
for productivity. Bull Econ Res 52(2):101–113
72. Färe R, Kirkley JE, Walden JB (2006) Adjusting technical efficiency to reflect discarding: the
case of the US Georges Bank multi-species otter trawl fishery. Fish Res 78:257–265
73. Färe R, Kirkley JE, Walden JB (2011) Measuring fishing capacity when some outputs are
undesirable. East Econ J 37:553–570
74. Färe R, Grosskopf S, Walden JB (2015) Productivity change and fleet restructuring after
transition to individual transferable quota management. Mar Policy 62:318–325
75. Färe R, Grosskopf S, Walden JB (2017) Measuring capital value in a commercial fishery: a
distance function approach. Mar Policy 81:109–115
76. Felthoven RG, Paul CJM (2004) Directions for productivity measurement in fisheries. Mar
Policy 28:161–169
77. Fisher I (1922) The making of index numbers. Houghton Mifflin, Boston
78. Fisher FM (1969) The existence of aggregate production functions. Econometrica 37(4):553–
577
79. Fissel BE, Gilbert B (2010) Modeling technology shocks with compound Poisson processes:
an application to open access fisheries. Unpublished paper, Department of Economics,
University of California San Diego
80. Flaaten O (1987) Sesongvarierende Bestandstilgjengelighet og Produktfunksjoner i Lofot-
fisket. EP 7005/1-87, FORUT, University of Tromsø, p 49 (Production functions with seasonal
variations in the catchability coefficients – the case of the Lofoten cod fisheries, in Norwegian)
81. Fousekis P (2002) Distance vs. ray functions: an application to the inshore fishery of Greece.
Mar Resour Econ 17(4):251–267
82. Fousekis P, Kolonaris S (2003) Technical efficiency determination for fisheries: a study of
trammel netters in Greece. Fish Res 63(1):85–95
83. Fox K, Grafton R, Kirkley J, Squires D (2003) Property rights, productivity, and profits in a
common-pool resource. J Environ Econ Manag 46(1):156–177
84. Fox KJ et al (2006) Capacity reduction, quota trading and productivity: the case of the fishery.
Aust J Agric Econ 50(2):189–206
85. Fried HO, Lovell CK, Schmidt SS (2008) The measurement of productive efficiency and
productivity growth. Oxford University Press
86. Fulginiti L, Perrin R (1993) The theory and measurement of producer response under quota.
Rev Econ Stat 75:97–106
87. Gatewood JB (1984) Is the skipper effect really a false ideology? Am Ethnol 11:378–379
88. Gerhardsen GM (1952) Production economics of fisheries. Rev Econ 5(1):371
89. Gilbert B, Yeo BH (2014) Technological change and managerial ability: evidence from a
Malaysian artisanal fishery. Land Econ 90(2):352–371
90. Gordon HS (1954) The economic theory of a common-property resource: the fishery. J Polit
Econ 62(2):124–142
91. Gordon D, Hannesson R (2015) The Norwegian winter herring fishery: a story of technolog-
ical progress and stock collapse. Land Econ 91(2):362–385
92. Gorman WM (1959) Separable utility and aggregation. Econometrica 27:469–481
93. Gorman I (1985) Conditions for economies of scope in the presence of fixed costs. Rand J
Econ 16:431–436
94. Grafton RQ, Squires D, Fox KJ (2000) Private property and economic efficiency: a study of
a common-pool resource. J Law Econ 43(2):679–714
95. Greenberg JA, Herrman M (1993) Some economic impacts of pot limits in the Bristol Bay
red king crab fishery. In: Kruse G, Eggers DM, Marasco RJ, Pautzke C, Quinn TJ II (eds)
Management strategies for exploited fish populations. Alaska Sea Grant College programme
report no. 93-02, University of Alaska Fairbanks, College, pp 705–721
96. Guillen J, Macher C, Merzéréaud M, Boncoeur J, Guyader O (2015) Effects of the share
remuneration system on fisheries management targets and rent distribution. Mar Resour Econ
30:123–138
97. Haavelmo T (1944) The probability approach in economics. Econometrica 12(Suppl):1–118
98. Hall RE (1973) The specification of technology with several kinds of output. J Polit Econ
81:878–892
99. Hannesson R (1983) Bioeconomic production function in fisheries: theoretical and empirical
analysis. Can J Fish Aquat Sci 40:968–982
100. Hannesson R (2000) A note on ITQs and optimal investment. J Environ Econ Manag 40:181–
188
101. Hannesson R (2007) Growth accounting in a fishery. J Environ Econ Manag 53(3):364–376
102. Hannesson R, Salvanes K, Squires D (2010) The Lofoten fishery over hundred years. Land
Econ 86(4):746–765
103. Hansen LG, Jensen CL (2014) Jointness through vessel capacity input in a multispecies
fishery. Agric Econ 45:745–756
104. Herrero I, Pascoe S, Mardle S (2006) Mix efficiency in a multi-species fishery. J Prod Anal
25:231–241
105. Herrick S, Squires D (1990) On measuring fishing fleet productivity: development and
demonstration of an analytical framework. Fish Bull 88(1):85–94
106. Hilborn R, Ledbetter M (1985) Determinants of catching power in the British Columbia
salmon purse seine fleet. Can J Fish Aquat Sci 42:51–56
107. Hoff A, Frost H (2007) Optimal vessel quotas and capacity of a Danish trawler fleet segment:
a dual approach. Mar Resour Econ 22:1–14
108. Holloway G, Tomberlin D (2006) Bayesian hierarchical estimation of technical efficiency in
a fishery. Appl Econ Lett 17(2):210–204
109. Holloway G, Tomberlin D, Irz X (2005) Chapter 9. Hierarchical analysis of production
efficiency in a coastal trawl fishery. In: Scarpa R, Alberini A (eds) Applications of simulation
methods in environmental and resource economics, The economics of non-market goods and
resources, vol 6. Springer, Dordrecht, pp 159–185
110. Holt S (1982) Optimal technological progress and fishing behavior: the California swordfish
fishery. Center for Marine Studies, San Diego State University, San Diego, p 23
111. Holzer J, DePiper G (2019) Intertemporal quota arbitrage in multispecies fisheries. J Environ
Econ Manag 93:185–207
112. Horrace WC, Schnier KE (2010) Fixed-effect estimation of highly-mobile production
technologies. Am J Agric Econ 92(5):1432–1455
113. Huang DS, Lee CW (1976) Toward a general model of fishery production. South Econ J
43(1):846
114. Huang L, Ray S, Segerson K, Walden J (2018) Impact of collective rights-based fish-
eries management: evidence from the New England groundfish fishery. Mar Resour Econ
33(2):177–201
115. Hulten CR (1991) The measurement of capital. In: Fifty years of economic measurement:
the jubilee of the conference on research in income and wealth. University of Chicago Press,
Chicago, pp 119–158
116. Hussen A, Sutinen JG (1979) Estimation of production functions for the artisanal fishery
for the Gulf of Nicoya (Costa Rica). Working paper no. 1, International Center for Marine
Resource Development, University of Rhode Island, p 36
117. Hutniczak B (2014) Increasing pressure on unregulated species due to changes in individual
vessel quotas: an empirical application to trawler fishing in the Baltic Sea. Mar Resour Econ
29(3):201–217
118. Hutniczak B, Vestergaard N, Squires D (2019) Policy change anticipation in the buyback
context. Environ Resour Econ 73(1):111–132
119. Jensen CL (2002) Applications of dual theory in fisheries: a survey. Mar Resour Econ
17(4):309–334
120. Jin D, Thunberg E, Kite-Powell H, Blake K (2002) Total factor productivity change in the
New England groundfish fishery: 1964–1993. J Environ Econ Manag 44(3):540–556
121. Johansen L (1968) Production functions and the concept of capacity. In: Recherches récentes
sur la fonction de production, collection, economie mathématique et économétrie, 2, p 52
122. Kirkley J (1984) Chapter 3. An empirical examination of fisheries production relationships:
single and multiple species. In: Conrad J, Kirkley J, Squires D (eds) Lectures on the
economics of fisheries production, NOAA technical memorandum NMFS F/NWC-60. U.S.
Department of Commerce, Springfield, pp 66–93
123. Kirkley JE, Squires DE (1988) A limited information approach for determining capital stock
and investment in a fishery. Fish Bull 86(2):339–349
124. Kirkley JE, Squires DE (1999) Measuring capacity and capacity utilization in fisheries. In:
Greboval D (ed) Managing fishing capacity: selected papers on underlying concepts and
issues, FAO fisheries technical paper 386. Food and Agricultural Organization of the United
Nations, Rome, 75–200
125. Kirkley JE, Squires D (1999) Capacity and capacity utilization in fishing industries. Discus-
sion paper 99-16, Department of Economics, University of California San Diego, San Diego
126. Kirkley JE, Strand IE (1988) The technology and management of multi-species fisheries.
Appl Econ 20:1279–1292
127. Kirkley JE, Squires D, Strand IE (1995) Assessing technical efficiency in commercial
fisheries: the mid-Atlantic Sea scallop fishery. Am J Agric Econ 77:686–697
34 The Economics of Production in Marine Fisheries 1373

128. Kirkley J, Squires D, Strand IE (1998) Characterizing managerial skill and technical
efficiency in a fishery. J Prod Anal 9(2):145–160
129. Kirkley J, Morrison Paul C, Squires D (2002) Capacity and capacity utilization in common-
pool resource industries: definition, measurement, and a comparison of approaches. Environ
Resour Econ 22(1–2):71–97
130. Kirkley JE, Morrison Paul CJ, Cunningham S, Catanzano J (2004) Embodied and disembod-
ied technical change in fisheries: an analysis of the Sète trawl fishery, 1985–1999. Environ
Resour Econ 29:191–217
131. Kirkley JE, Walden J, Waters J (2004) Buyback programs: goals, objectives, and industry
restructuring in fisheries. J Agric Appl Econ 36:333–346
132. Kirkley JE, Walden JB, Waters J (2008) Chapter 15. Buyback programs and industry
restructuring in fisheries. In: Curtis R, Squires D (eds) Fisheries buybacks. Blackwell
Publishing, pp 227–238
133. Kompas T, Che TN, Grafton RQ (2006) Technical efficiency effects of input controls:
evidence from Australia’s banana prawn fishery. Appl Econ 26(15):1631–1641
134. Kurga K (1973) More about joint production. Int Econ Rev 14(1):196–210
135. Kuriyama PT, Branch TA, Bellman MA, Rutherford K (2016) Catch shares have not led to
catch-quota balancing in two North American multispecies trawl fisheries. Mar Policy 71:
60–70
136. Kvamsdal SF (2016) Technical change as a stochastic trend in a fisheries model. Mar Resour
Econ 31(4):403–419
137. Kvamsdal SF (2019) Indexing of technical change in aggregated data. Comput Econ
53(3):901–920
138. Laitinen K (1980) A theory of the multiproduct firm. North-Holland Press, Amsterdam
139. Laitinen K, Theil H (1978) Supply and demand of the multiproduct firm. University of
Chicago Press, Chicago
140. Lau L (1976) A characterization of the normalized restricted profit function. J Econ Theory
12:131–163
141. Lau L (1978) Chapter 3. Applications of profit functions. In: Fuss M, McFadden D (eds)
Production economics: a dual approach to theory and applications, vol I. North-Holland,
Amsterdam, pp 133–218
142. Lazkano I (2008) Cost structure and capacity utilisation in multi-product industries: an
application to the Basque trawl industry. Environ Resour Econ 41:189–207
143. Lewbel J (1996) Aggregation without separability: a generalized composite commodity
theorem. Am Econ Rev 86(3):524–543
144. Liao DS (1971) Profitability and productivity analysis for the southeastern Alaska salmon
fishery. U.S. National Marine Fisheries Service. Mar Fish Rev 38(4):11–14
145. Lindebo E, Hoff A, Vestergaard N (2007) Revenue-based capacity utilisation measures
and decomposition: the case of Danish North Sea trawlers. Eur J Oper Res 180:
215–227
146. Livernois J, Ryan D (1989) Testing for non-jointness in oil and gas exploration: a variable
profit function approach. Int Econ Rev 30(2):479–504
147. Lucas RE Jr (1976) Econometric policy evaluation: a critique. In: Brunner K, Meltzer A
(eds) The Phillips curve and the labor market, vol. 1 of Carnegie-Rochester conference in
public policy, a supplementation series to the Journal of Monetary Economics. North-Holland
Publishers, Amsterdam
148. MacSween I (1973) Measurement of fishing power. Miscellaneous reference paper, Fishery
Economics Research Unit, White Fish Authority, Edinburgh
149. Mainardi S (2019) Access fees and efficiency frontiers with selectivity and latent classes:
Falkland Islands fisheries. Mar Resour Econ 34(2):163–195
150. Malvarosa L, Daures F, Leblond E, Andersen JL, Andersen KN, Nielsen R, Anton S, Curtis
H, Buisman E, Salz P (2006) Evaluation of the capital value, investments and capital costs in
the fisheries sector. IREPA Onlus, rep. no FISH/2005/03
151. Matthiasson T (1999) Cost sharing and catch sharing. J Dev Econ 58:25–44
1374 D. Squires and J. Walden

152. Maunder MN, Punt A (2004) Standardizing catch and effort data: a review of recent
approaches. Fish Res 70:141–159
153. McConnell KE, Price M (2006) The lay system in commercial fisheries: origin and implica-
tions. J Environ Econ Manag 51:295–307
154. Melnychuk M, Essington TE, Branch TA, Heppell SS, Jensen OP, Link JS, Martell SJD,
Parma AM, Smith AD (2016) Which design elements of individual quota fisheries help to
achieve management objectives? Fish and Fisheries 17:126–142
155. Milgrom P, Roberts J (1996) The LeChatelier principle. Am Econ Rev 86(1):173–179
156. Mulazzani L, Malorgio G (2013) Regional management of multi-species fisheries on the basis
of shared stocks and property rights: a Mediterranean case. Sci Mar 77:439–448
157. Mundlak Y (1961) Empirical production function free of management bias. J Farm Econ
43:44–56
158. Murray J (2007) Constrained marine resource management. Unpublished Ph.D. dissertation,
Department of Economics, University of California San Diego
159. Natividad G (2016) Quotas, productivity, and prices: the case of anchovy fishing. J Econ
Manage Strat 25(1):220–257
160. Neary P (1985) International factor mobility, minimum wage rates, and factor-price equaliza-
tion: a synthesis. Q J Econ 100(3):551–570
161. Neary JP, Roberts K (1980) The theory of household behavior under rationing. Eur Econ Rev
13:25–42
162. New R (2012) Management changes and vessel-level technical efficiency in the Eastern
Tuna and Billfish Fishery: a stochastic frontier analysis. Paper presented at the Australian
Agricultural and Resource Economic Society conference, Fremantle, Feb 2012
163. Norton VJ, Miller MM, Kenney E (1985) Indexing the economic health of the U.S. fishing
industry’s harvesting sector, NOAA technical memorandum NMFS-F/NEC-40. Northeast
Fisheries Center, Woods Hole, p 42
164. Nøstbakken L (2006) Cost structure and capacity in the Norwegian pelagic fisheries. Appl
Econ 38:1877–1887
165. Nøstbakken L, Thébaud O, Sørensen LC (2011) Investment behaviour and capacity adjust-
ment in fisheries: A survey of the literature. Mar Resour Econ 26(4):95–117
166. Oliveira MM, Camanho AS, Gaspar MB (2013) The influence of catch quotas on the
productivity of the Portuguese bivalve Dredge fleet. ICES J Mar Sci 70(7): 1378–1388
167. Onofri L, Francesc M (2017) A test for cost subadditivity in the fishery sector. Fish Ocean
Opj 5(1):555651
168. Pálsson G, Durrenberger EP (1982) To dream of fish: the causes of Icelandic skippers’ fishing
success. J Anthropol Res 38:227–242
169. Pálsson G, Durrenberger EP (1984) Reply to Gatewood. Am Ethnol 11:379–380
170. Pan M, Walden J (2015) Measuring productivity in a shared stock fishery: a case study of the
Hawaii longline fishery. Mar Policy 62:302–308
171. Pascoe S, Coglan L (2002) The contribution of unmeasurable inputs to fisheries production:
an analysis of technical efficiency of fishing vessels in the English Channel. Am J Agric Econ
84:585–597
172. Pascoe S, Kirkley J, Greboval D, Morrison-Paul CJ (2003) Measuring and assessing capacity
in fisheries 2. Issues and methods, FAO fisheries technical paper 433/2. Food and Agriculture
Organization of the United Nations, Rome, p 130
173. Pascoe S, Koundouri P, Bjørndal T (2007) Estimating targeting ability in multi-
species fisheries: a primal multi-output distance function approach. Land Econ 83(3):
382–397
174. Pascoe S, Punt AE, Dichmont CM (2010) Targeting ability and output controls in Australia’s
multi-species Northern Prawn Fishery. Eur Rev Agric Econ 37(3):313–334
175. Pascoe S, Vieria S, Dichmont CM, Punt AE (2011) Optimal vessel size and output in the
Australian Northern Prawn Fishery: a restricted profit function approach. Aust J Agric Resour
Econ 55:107–125
34 The Economics of Production in Marine Fisheries 1375

176. Pascoe S, Coglan L, Punt AE, Dichmont CM (2012) Impacts of vessel capacity reduction
programmes on efficiency in fisheries: the case of Australia’s multispecies Northern Prawn
Fishery. J Agric Econ 63:425–443
177. Pascoe S, Hutton T, Coglan L, Nguyen VQ (2018) Implications of efficiency and productivity
change over the season for setting MEY-based trigger targets. Aust J Agric Resour Econ
62:199–216
178. Paul CJM, de M, Torres O, Felthoven RG (2009) Fishing revenue, productivity and product
choice in the Alaskan Pollock fishery. Environ Resour Econ 44:457–474
179. Pearce P, Wilen J (1979) The impact of Canada’s Pacific fleet salmon control program. J Fish
Res Board Can 36(7):764–769
180. Pinello D, Liontakis A, Sintori A, Tzouramani I, Polymeros K (2016) Assessing the efficiency
of small-scale and bottom trawlers in Greece. Sustainability 8:681. https://doi.org/10.3390/
su8070681
181. Pinello D, Gee J, Accadia P, Sabetella EC, Vitale S, Polymeros K, Fiorentino F (2018)
Efficiency of shallow- and deep-water trawling in the Mediterranean and its implications for
discard reduction. Sci Mar 82(S1):97–106
182. Plourde C, Smith JB (1989) Crop sharing in the fishery and industry equilibrium. Mar Resour
Econ 6:179–193
183. Pokki H, Virtanen J, Karvinen S (2018) Comparison of economic analysis with financial
analysis of fisheries: application of the perpetual inventory method to the finnish fishing fleet.
Mar Policy 95:239–247
184. Reimer MN, Abbott JK, Wilen JE (2017a) Fisheries production: Management institutions,
spatial choice, and the quest for policy invariance. Mar Resour Econ 32:143–168
185. Reimer MN, Abbott JK, Hayie AC (2017b) Empirical models of fisheries production:
Conflating technology with incentives? Mar Resour Econ 32:169–190
186. Roberts R (1965) A commercial fishery production function: the Skeena River sockeye
salmon gillnet fishery. Unpublished M.A. thesis, University of Victoria
187. Romer PM (1990) Endogenous technological change. J Polit Econ 98(5 Part 2):71–102
188. Rothschild BJ (1972) An exposition on the definition of fishing effort. Fish Bull 70(3):
671–679
189. Salvanes KG, Squires D (1995) Transferable quotas, enforcement costs and typical firms: an
empirical application to the Norwegian trawler fleet. Environ Resour Econ 6(1):1–21
190. Salvanes KG, Steen F (1994) Testing for relative performance between seasons in a fishery.
Land Econ 70(4):431–447
191. Samuelson P (1947) Foundations of economic analysis. Cambridge University Press, Cam-
bridge
192. Sanchirico JN, Holland D, Quigley K, Fina M (2006) Catch-Quota balancing in multispecies
individual fishing quotas. Mar Policy 30(6):767–785
193. Schaefer MB (1957) Some considerations of population dynamics and economics in relation
to the management of marine species. J Fish Res Board Can 14:669–681
194. Scheld AM, Walden J (2018) An analysis of fishing selectivity for northeast US multispecies
bottom trawlers. Mar Resour Econ 33(4):331–350
195. Schmidt P, Sickles R (1984) Production frontiers and panel data. J Bus Econ Stat 2:
367–374
196. Schnier KE, Felthoven RG (2013) Production efficiency and exit in rights-based fisheries.
Land Econ 89(3):538–557
197. Scott A (1955) The fishery: the objectives of sole ownership. J Polit Econ 63:116–124
198. Segerson K, Squires D (1990) On the measurement of economic capacity utilization for
multiproduct industries. J Econ 44:347–361
199. Segerson K, Squires D (1993) Multiproduct capacity utilization under regulatory constraints.
Rev Econ Stat 75(1):76–85
200. Sharma K, Leung P (1998) Technical efficiency of the long-line fishery in Hawaii: an
application of a stochastic production frontier. Mar Resour Econ 13:259–274
1376 D. Squires and J. Walden

201. Skjold F, Eide A, Flaaten O (1996) Production functions of the Norwegian bottom trawl
fisheries of cod in the Barent Sea. International council for the exploration of the sea, C.M.
1996/P:3, theme session on management faced with multiple objectives
202. Smith VL (1968) Economics of production from natural resources. Am Econ Rev 58:409–431
203. Solís D, del Corral J, Perruso L, Agar JA (2014) Evaluating the impact of individual fishing
quotas (IFQs) on the technical efficiency and composition of the US Gulf of Mexico red
snapper commercial fishing fleet. Food Policy 36:74–83
204. Solow RM (1955) The production function and the theory of capital. Rev Econ Stud 23:
101–108
205. Squires D (1984) Chapter 2. On the application of production theory to commercial fishing:
static analysis. In: Conrad J, Kirkley J, Squires D (eds) Lectures on the economics of
fisheries production, NOAA technical memorandum NMFS F/NWC-60. U.S. Department of
Commerce, Springfield
206. Squires D (1987) Fishing effort: its testing, specification, and internal structure in fisheries
economics and management. J Environ Econ Manag 14:268–282
207. Squires D (1987) Public regulation and the structure of production in multiproduct industries:
the New England otter trawl industry. Rand J Econ 18(2):232–247
208. Squires D (1987) Index numbers and productivity measurement in multispecies fisheries: an
application to the Pacific coast trawl fleet, La Jolla
209. Squires D (1987) Long-run profit functions for multiproduct firms. Am J Agric Econ 69:
558–569
210. Squires D (1988) Index numbers and productivity measurement in multispecies fisheries: an
application to the Pacific coast trawl fleet. NOAA technical report NMFS 67. National Marine
Fisheries Service, National Oceanic and Atmospheric Administration, Washington, DC, p 34.
https://www.st.nmfs.noaa.gov/spo/SPO/tr67opt.pdf. Accessed 20 Aug 2021
211. Squires D (1988) Production technology, costs, and multiproduct industry structure: an
application of the long-run profit function to the New England fishing industry. Can J Econ
21(2):359–378
212. Squires D (1992) Productivity measurement in common property resource industries: an
application to the Pacific coast trawl fishery. Rand J Econ 23:221–236
213. Squires D (1994) Firm behavior under input rationing. J Econ 61(2):235–257
214. Squires D (2007) Producer quotas and virtual quantities. Mimeo, Department of Economics,
University of California San Diego, San Diego
215. Squires D (2016) Firm behavior under quantity controls: the theory of virtual quantities. J
Environ Econ Manag 79:70–86
216. Squires D, Kirkley J (1991) Production quota in multiproduct Pacific fisheries. J Environ Econ
Manag 21(2):109–126
217. Squires D, Kirkley J (1995) Resource rents from single and multispecies individual transfer-
able quota programs. ICES J Mar Sci 52:153–164
218. Squires D, Kirkley J (1996) Individual tradable quotas in a multiproduct common property
industry. Can J Econ 24(2):318–342
219. Squires D, Kirkley J (1999) Skipper skill and panel data in fishing industries. Can J Fish
Aquat Sci 56:2011–2018
220. Squires D, Segerson K (2024) Capacity and capacity utilization. In: Ray SC, Chambers R,
Kumbhakar S (eds) Handbook of production economics, vol 1. Springer, Cham
221. Squires D, Vestergaard N (2013) Technical change and the commons. Rev Econ Stat
95(5):1769–1787
222. Squires D, Vestergaard N (2013) Technical Change in Fisheries. Mar Policy 42:286–292
223. Squires D, Vestergaard N (2015) Productivity growth, catchability, stock assessments, and
optimum renewable resource use. Mar Policy 62:309–317
224. Squires D, Vestergaard N (2018) Rethinking the commons problem: technological change,
knowledge spillovers, and social learning. J Environ Econ Manag 91:1–25
225. Squires D, Alauddin M, Kirkley J (1994) Individual transferable quota markets and invest-
ment decisions in the fixed gear sablefish industry. J Environ Econ Manag 27(2):185–204
34 The Economics of Production in Marine Fisheries 1377

226. Squires D, Grafton RQ, Alam MF, Omar IH (1998) Where the land meets the sea: integrated
sustainable fisheries development and artisanal fishing. Discussion paper 98-26. Department
of Economics, University of California, San Diego
227. Squires D, Campbell H, Cunningham S, Dewees C, Grafton RQ, Herrick SF Jr, Kirkley J,
Pascoe S, Salvanes K, Shallard B, Turris B, Vestergaard N (1998) Individual transferable
quotas in multispecies fisheries. Mar Policy 22(2):135–159
228. Squires D, Grafton RQ, Alam MF, Omar IH (2003) Technical efficiency in the Malaysian gill
net artisanal fishery. Environ Dev Econ 8:481–504
229. Squires D, Reid C, Jeon Y (2008) Productivity growth in natural resource industries and the
environment: an application to the Korean tuna purse-seine fleet in the Pacific Ocean. Int Econ
J 22(1):81–93
230. Staniford A (1988) The effects of the pot reduction in the South Australian southern zone
rock lobster fishery. Mar Resour Econ 4:271–288
231. Stiglitz JE (1974) Incentives and risk sharing in sharecropping. Rev Econ Stud 41:219–255
232. Strand IE, Kirkley JE, McConnell KE (1981) Economic analysis and the management of
Atlantic surf clams. In: Anderson LG (ed) Economic analysis for fisheries management plans.
Ann Arbor Science Publishing, Ann Arbor, pp 113–141
233. Sutinen JG (1979) Fishermen’s remuneration systems and implications for fisheries develop-
ment. Scott J Polit Econ 26:147–162
234. Taylor TG, Prochaska FJ (1981) Fishing power functions in aggregate bioeconomic models.
Staff paper no. 185, Department of Food and Resource Economics, University of Florida,
Gainesville
235. Thorlindsson T (1988) The skipper effect in the Icelandic herring fishery. Hum Organ 47:
199–212
236. Thunberg EM, Bresnyan EW, Adams CM (1995) Economic analysis of technical inter-
dependencies and the value of effort in a multi-species fishery. Mar Resour Econ 10:
59–76
237. Thunberg E, Walden J, Agar J, Felthoven R, Harley A, Kasperski S, Lee J, Lee T, Mamula A,
Stephen J, Strelcheck A (2015) Measuring changes in multi-factor productivity in U.S. catch
share fisheries. Mar Policy 62:294–301
238. Tingley D, Pascoe S, Coglan L (2005) Factors affecting technical efficiency in fisheries:
stochastic production frontier versus data envelopment analysis approaches. Fish Res 73:
363–376
239. Tomberlin D, Holloway G (2006) Bayesian ranking and selection of fishing boat efficiencies.
Mar Resour Econ 21(4):415–432
240. Tomberlin D, Holloway G (2007) Trip-level analysis of efficiency changes in Oregon’s
deepwater trawl fishery. FEEM working paper no. 86.2007. SSRN electronic journal, p 21
241. Torres MO, RG Felthoven (2014) Productivity Growth and Product Choice in Catch Share
Fisheries: The Case of Alaska Pollock. Mar Policy 50:280–289
242. Ultang Ø (1976) Catch per unit effort in the Norwegian purse seine fishery for Atlanto-
Scandian (Norwegian spring spawning) herring, FAO fisheries technical papers 155. Food
and Agriculture Organization of the United Nations, Rome, pp 91–101
243. Vestergaard N (1999) Measures of welfare effects in multiproduct industries: the case of
multispecies individual transferable quota fisheries. Can J Econ 32(2):729–743
244. Vestergaard N (2010) Chapter 42. Principal-agent problems in fisheries. In: Grafton RQ,
Hilborn R, Squires D, Tait M, Williams M (eds) Handbook of marine fisheries conservation
and management. Oxford University Press, Oxford, UK, pp 563–571
245. Vestergaard N, Jensen F, Jørgensen HP (2005) Sunk cost and entry-exit decisions under ITQs:
why industry restructuring is delayed. Land Econ 81(3):363–378
246. Viswanathan K, Ishak O, Jeon Y, Kirkley J, Squires D, Susilowati I (2002) Fishing skill
in developing country fisheries: The Kedah, Malaysia trawl fishery. Mar Resour Econ
16(4):293–313
247. Walden J, Kirkley JE (2000) Measuring technical efficiency and capacity in fisheries by
data envelopment analysis using the general algebraic modeling system (GAMS): workbook.
1378 D. Squires and J. Walden

U.S. Department of Commerce, National Oceanic and Atmospheric Administration, National

Marine Fisheries Service, Northeast Region, Northeast Fisheries Science Center
248. Walden JB, Kitts N (2014) Measuring fishery profitability: an index number approach. Mar
Policy 43:321–326
249. Walden JB, Tomberlin D (2010) Estimating fishing vessel capacity: a comparison of
nonparametric frontier approaches. Mar Resour Econ 25:23–36
250. Walden JB, Kirkley JE, Kitts AW (2003) A limited economic assessment of the Northeast
groundfish fishery buyout program. Land Econ 79:426–439
251. Walden J, Kirkley J, Färe R (2010) Chapter 40. Measuring and managing fishing capacity. In:
Grafton RQ, Hilborn R, Squires D, Tait M, Williams M (eds) Handbook of marine fisheries
conservation and management. Oxford University Press, Oxford, UK, pp 546–555
252. Walden JB, Kirkley JE, Färe R, Logan P (2012) Productivity change under an individual
transferable quota management system. Am J Agric Econ 94:913–928
253. Walden J et al (2014) Productivity change in U.S. catch share fisheries. NOAA tech-
nical memorandum NMFS-F/SPO. National Marine Fisheries Service, National Oceanic
and Atmospheric Administration, Washington, DC. p 122. https://spo.nmfs.noaa.gov/sites/
default/files/TM146_0.pdf. Accessed 20 Aug 2021
254. Walden J, Fissel B, Squires D, Vestergaard N (2015) Productivity change in commercial
fisheries: an introduction to the special issue. Mar Policy 62:289–293
255. Walden JB, Färe R, Grosskopf S (2017) Measuring change in productivity of a fishery with
the Bennet-Bowley indicator. Fish Bull 115(3):273–283
256. Weninger Q (1998) Assessing efficiency gains from individual transferable quotas: an
application to the mid-Atlantic surf clam and ocean quahog fishery. Am J Agric Econ 80:
750–764
257. Weninger Q (2001) An analysis of the efficient production frontier in the fishery: implications
for enhanced fisheries management. Appl Econ 33:71–79
258. Weninger Q, Strand IE (2003) An empirical analysis of production distortions in the mid-
Atlantic surf clam and ocean quahog fishery. Appl Econ 35(10):1191–1197
259. Wilen J (1979) Regulatory implications of alternative models of fishermen behavior. J Fish
Res Board Can 36(7):855–858
260. Wolff F-C, Guillotreau P, Squires D (2013) The firm’s management in production: manage-
ment, firm, and time effects in an Indian Ocean tuna fishery. Am J Agric Econ 95(3):547–567
261. Zellner A, Revankar NS (1969) Generalized production functions. Rev Econ Stud 36:241–258
262. Zellner A, Kmenta J, Drèze J (1966) Specification and estimation of Cobb-Douglas produc-
tion function models. Econometrica 34(4):784–795
263. Zhang J, Smith MD (2011) Estimation of a generalized fishery model: a two-stage approach.
Rev Econ Stat 93(2):690–699
Production Economics in Spatial Analysis
35
Luis Orea and Inmaculada C. Álvarez

Contents
Introduction
Knowledge Production Function and Spatial Economic Growth Models
Network Inputs
Transport Infrastructures
ICT and R&D Activities
Local Versus Global
Agglomeration Economies
Spatial Returns to Scale
Individual Elasticities
Mean Elasticities
Internal, External, and Total Returns to Scale
Economy-Wide Returns to Scale
Returns to Scale in Heterogeneous Coefficient Models
Spatial Econometric Models
Spatial Stochastic Frontier Models
Distribution-Free Models
Distribution-Based Models
Estimating Efficiency in Spatial Frontier Models
Spatial TFP Growth Decomposition
Final Remarks
Cross-References
References

L. Orea
Oviedo Efficiency Group, Department of Economics, University of Oviedo, Oviedo, Spain
I. C. Álvarez
Oviedo Efficiency Group, Department of Economics, Universidad Autónoma de Madrid,
Madrid, Spain

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_35
1380 L. Orea and I. C. Álvarez

Abstract

This chapter summarizes the empirical literature that uses a spatial analysis
framework in production economics. This literature takes advantage of the spatial
dimension of the data to capture the spillover effects of neighboring production
units. In the first three sections, we outline standard spatial extensions of the
neoclassical production models aiming to measure knowledge spillovers, the
effect of network inputs, and economies of agglomeration. The next four sections
are more methodological: they examine returns to scale and productivity growth
from both internal and external inputs, and summarize the spatial econometric
techniques used in frontier and non-frontier analyses of firms’ production. The
last section includes a set of final remarks regarding the application of spatial
econometric techniques in production analyses.

Keywords

Spatial econometrics · Stochastic frontier models · Production economics

Introduction

This chapter summarizes the empirical literature that uses a spatial analysis
framework in production economics. Broadly speaking, this literature incorporates
external returns from other (nearby) production units, extending the set of produc-
tion inputs in neoclassical production models. Most empirical models in this field
are estimated using individual (e.g., firms) or aggregate (e.g., regions) production
units. In some settings, these production units can be associated with locations and,
therefore, can be placed on the map. The spatial dimension of the data is used in
this literature to compute overall marginal products, returns to scale or productivity
growth measures from both internal and external factors, or simply to get better
parameter estimates. Other researchers have used the spatial information in frontier
production analyses to control for unobserved but spatially correlated variables
or to analyze interesting features of firms’ economic performance.
This chapter is organized as follows. Section “Knowledge Production Func-
tion and Spatial Economic Growth Models” introduces the so-called “knowledge
production functions” and formalizes the existence of knowledge spillovers in
this setting. We take advantage of this discussion to introduce the three most
popular production specifications in spatial econometrics: the spatial lag of X model
(SLX), the spatial autoregressive model (SAR), and the spatial Durbin model
(SDM). As firms’ productivity might be spatially correlated due to the existence
of knowledge spillovers, the standard neoclassical growth models (built from a
previously defined production function) can also be estimated using some of the
above spatial specifications. Section “Network Inputs” presents several papers that
argue that some inputs have network characteristics and generate external effects
on neighboring production units (regions). This is the case of transport infrastructure
and information and communication technologies (ICT). This section also includes
a discussion on the choice between local and global spillovers.
Section “Agglomeration Economies” outlines the empirical literature that examines
agglomeration economies, i.e., external returns from the concentration of economic
activity, via localization effects or urbanization economies.
The next four sections are more methodological. Section “Spatial Returns to
Scale” discusses how to measure internal, external, and total returns to scale
once a spatial specification of production units’ technology has been estimated,
either using homogeneous or heterogeneous coefficient models. This section also
discusses how to compute aggregate or economy-wide returns to scale from a set
of observations. Section “Spatial Econometric Models” provides a brief discussion
on the general estimation techniques that are typically used to estimate the spatial
models. Section “Spatial Stochastic Frontier Models” offers a brief discussion
of the small but growing literature on spatial stochastic frontier modeling. This
literature is reviving interest in spatial error models (SEM): stochastic frontier
models have two distinct random terms, noise and inefficiency, and spatial
correlation in each has different economic implications, so it matters which term
carries the spillovers. Section
“Spatial TFP Growth Decomposition” is devoted to the decomposition of spatial
measures of firms’ total factor productivity (TFP). We outline the main features of
two papers that have extended the standard TFP growth decomposition to include
both direct (own) and indirect (spillover) components. We conclude this chapter
with a set of final remarks in section “Final Remarks.”
It is worth mentioning that, for notational ease, we have developed this chapter
for panel data. We also confine our discussion to the estimation of (frontier)
production functions because other primal and dual representations of the technology
deserve similar comments.

Knowledge Production Function and Spatial Economic Growth Models

As the spatial terms in a standard neoclassical growth model appear due to the
existence of knowledge spillovers, we first introduce in this section the concept
of “knowledge production function” (KPF). We next summarize several empirical
papers that have estimated neoclassical growth models using spatial econometric
techniques.
The production of knowledge or ideas is crucial in the theory of innovation and
in the definition of optimal public policies. Pakes and Griliches [100] defined the
KPF as a function intended to represent the transformation process leading from
innovative inputs (e.g., R&D) to commercially valuable knowledge or innovative
output (e.g., patents). Most of the studies aiming to estimate a KPF start from
a Cobb–Douglas functional form where the level of technological knowledge ($A_{it}$)
depends on the amount of physical resources allocated to R&D activities ($z_{it}$). This
function can be enlarged to include idea-learning processes [68] that depend on
neighbors' knowledge (see, e.g., [12]):
1382 L. Orea and I. C. Álvarez

\[
\ln A_{it} = \alpha + \beta \ln z_{it} + \lambda \sum_{j \neq i}^{N} w_{ij} \ln A_{jt} + v_{it} \tag{1}
\]

where subscript $i$ (= 1, ..., N) stands for regions, $t$ (= 1, ..., T) stands for periods,
$z_{it}$ is a vector of knowledge determinants, $v_{it}$ is the traditional noise term, and the
weight term $w_{ij}$ formalizes the connectivity between firm i and firm j. According to
this specification, the learning process gives rise to knowledge spillovers if λ is positive.
Cunha and Neves [32] review the empirical literature on KPFs, identifying
a handful of papers that estimate KPFs using spatial econometric techniques.
For instance, Bottazzi and Peri [19] estimate a KPF where the regional (spatial)
spillovers decrease with the geographical distance between regions. The spatial
specification of this model is justified by theories of localization that argue that geo-
graphic proximity reduces the cost of accessing and absorbing knowledge spillovers
([57], p. 19). This explains why knowledge spillovers and spatial spillovers are
different but related concepts.
In the early articles, it was common to find spatial interdependence in the
determinants of knowledge, while more recently researchers also consider spatial
spillovers in knowledge itself. These considerations result in different spatial
specifications of KPF that correspond with different models widely known in the
spatial econometric literature [127]. On the one hand, the model that incorporates
the spatial lag of the dependent variable or the weighted average of neighboring
values of the dependent variable is the well-known spatial autoregressive model
(SAR).1 Equation (1) is an example of a SAR model. On the other hand, the model
that incorporates the spatial lag of the explanatory variables is the well-known
spatial lag of X model (SLX). For instance, Álvarez and Barbero [2] assume that
the level of technological knowledge depends on physical and human capital of both
the own region and neighboring regions. Their KPF can thus be written as:
\[
\ln A_{it} = \alpha + \beta \ln z_{it} + \theta \sum_{j \neq i}^{N} w_{ij} \ln z_{jt} + v_{it} \tag{2}
\]

where $z_{it}$ now stands for the physical and human capital variables. There is no
consensus about the preferred specification (SLX vs. SAR) or about whether the
spillovers are local or global. Although section “Local Versus Global” provides a
critical discussion on the nature of the spillovers generated by these two spatial
specifications, it is germane to mention here that the spillovers induced by the SAR
model in (1) are global in the sense that shocks disturbing a firm might affect all
other firms. In contrast, the SLX model in (2) yields more local spillover effects,
because they do not involve endogenous feedback effects from neighbors to their
neighbors and so on.
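
The local-versus-global distinction can be illustrated numerically. The following
Python sketch is not taken from the cited papers: it assumes four hypothetical units
arranged on a line, a row-standardized W, and made-up values for β, θ, and λ. It
compares the effect of another unit's knowledge determinant under the SLX model
in (2), where the cross effect is θ times w_ij, with the effect under the SAR model
in (1), where it is β times element (i, j) of (I − λW)^(-1), approximated here by a
truncated power series.

```python
def matmul(A, B):
    """Multiply two square matrices stored as lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def spatial_multiplier(W, lam, terms=200):
    """Approximate (I - lam*W)^(-1) by the series I + lam*W + lam^2*W^2 + ..."""
    n = len(W)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total, power = identity, identity
    for _ in range(terms):
        # power holds lam^k * W^k; add it to the running sum.
        power = [[lam * x for x in row] for row in matmul(power, W)]
        total = [[t + p for t, p in zip(t_row, p_row)]
                 for t_row, p_row in zip(total, power)]
    return total


# Hypothetical example: four units on a line (0-1-2-3), row-standardized weights.
W = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]
beta, theta, lam = 0.5, 0.2, 0.4  # illustrative values, not estimates

n = len(W)
# SLX, eq. (2): the effect of unit j's determinant on unit i is theta * w_ij.
slx = [[beta if i == j else theta * W[i][j] for j in range(n)] for i in range(n)]
# SAR, eq. (1): the effect is beta times element (i, j) of the spatial multiplier.
S = spatial_multiplier(W, lam)
sar = [[beta * S[i][j] for j in range(n)] for i in range(n)]

print("SLX effect of unit 3 on unit 0:", slx[0][3])
print("SAR effect of unit 3 on unit 0:", round(sar[0][3], 4))
```

Units 0 and 3 are not neighbors, so the SLX cross effect between them is zero,
whereas the SAR effect is small but positive because the shock travels through
units 1 and 2.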
Empirical results indicate that both types of spillovers might play an important
role in the production of knowledge [23, 61]. If so, the model that should be

1 The “spatial lag” terminology used in spatial econometrics was originally introduced by Anselin
[6].
35 Production Economics in Spatial Analysis 1383

estimated is the well-known spatial Durbin model (SDM), which can be written
as follows:
\[
\ln A_{it} = \alpha + \lambda \sum_{j \neq i}^{N} w_{ij} \ln A_{jt} + \beta \ln z_{it} + \theta \sum_{j \neq i}^{N} w_{ij} \ln z_{jt} + v_{it} \tag{3}
\]

Notice that, as customary in spatial econometrics, this model can be rewritten in
a simpler fashion using matrix notation as follows:

\[
\ln A_{t} = \alpha + \lambda W \ln A_{t} + \beta \ln z_{t} + \theta W \ln z_{t} + v_{t} \tag{4}
\]

where $\ln A_{t}$, $\ln z_{t}$, and $v_{t}$ are N×1 vectors, and the set of spatial weight terms in (3)
is now written using an N×N spatial weight matrix W, whose diagonal elements are
equal to zero and whose off-diagonal elements are nonzero if firm i is assumed
to be correlated with firm j. Quite often, the off-diagonal elements are equal to
one if both observations are located in adjacent locations. In many applications,
the weight matrix is also row-standardized by the number of adjacent units (i.e.,
$\sum_{j \neq i}^{N} w_{ij} = 1$). In this case, $W \ln A_{t}$ in (4) can be interpreted as the average value
of the technological knowledge of adjacent firms. The choice of a proper spatial
weight matrix is contentious. For instance, Tiefelsdorf et al. [121] point out that this
standardization procedure may emphasize the prevalence of the spatial dependence
on those units with fewer connections. Plümper and Neumayer [104] demonstrate
the enormous influence that the choice of functional form for the weighting matrix
can exert on the empirical results.
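
As a minimal sketch of the row standardization just described, the following Python
code builds W from a hypothetical binary contiguity matrix for four units and
computes the spatial lag W lnA_t. The adjacency pattern and the knowledge values
are made up for illustration.

```python
def row_standardize(adjacency):
    """Divide each row by its sum so that the weights of every unit add up to one."""
    weights = []
    for row in adjacency:
        total = sum(row)
        weights.append([x / total if total else 0.0 for x in row])
    return weights


def spatial_lag(W, values):
    """Return W times values, i.e., the weighted average of neighbors' values."""
    return [sum(w * v for w, v in zip(row, values)) for row in W]


# Hypothetical contiguity: 1 if two units are adjacent, 0 otherwise (zero diagonal).
adjacency = [[0, 1, 1, 0],
             [1, 0, 1, 0],
             [1, 1, 0, 1],
             [0, 0, 1, 0]]
ln_A = [1.0, 2.0, 3.0, 4.0]  # made-up log knowledge levels

W = row_standardize(adjacency)
lagged = spatial_lag(W, ln_A)
for i, value in enumerate(lagged):
    print(f"unit {i}: average ln A of adjacent units = {value:.3f}")
```

In this example unit 3 has a single neighbor, which therefore receives a weight of
one, whereas each neighbor of unit 2 receives a weight of one third. This asymmetry
is the kind of side effect of row standardization that the criticism above refers to.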
Given the relevance of geography in the diffusion of knowledge and R&D,
Fingleton and López-Bazo [44], Ertur and Koch [43], and Fischer [45] have
augmented the Solow neoclassical model by including both global and local spatial
autocorrelation on growth and convergence. The technology of the whole economy
is characterized by a Cobb–Douglas production function with constant returns to
scale in per worker terms:

\[
\ln y_{it} = \ln A_{it} + \alpha \ln k_{it} + v_{it} \tag{5}
\]

where yit = Yit /Lit is output per worker, kit = Kit /Lit is capital services per worker,
and Ait captures the level of technological knowledge. As in (2), the abovementioned
authors define the technological knowledge term using a SLX specification, i.e.:

\[
A_{it} = k_{it}^{\varphi} \prod_{j \neq i}^{N} k_{jt}^{\varphi \rho w_{ij}} \tag{6}
\]

where the technological parameter 0 < ϕ < 1 reflects the size of the home
externalities and ρ allows formalizing spatial interdependence by means of the
spatial weight terms $w_{ij}$. If we plug (6) into the neoclassical production function
(5), we get a (per worker) production function with spatial interactions:

\[
\ln y_{it} = \beta \ln k_{it} + \theta \sum_{j \neq i}^{N} w_{ij} \ln k_{jt} + v_{it} \tag{7}
\]

where β = α + ϕ and θ = ϕρ. Therefore, the existence of knowledge spillovers
explains why output per worker in region i depends on its own capital investment
but also on neighbors’ capital investment.
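
The substitution behind (7) can be written out step by step. The lines below are a
sketch that takes logs of the technology term in (6), abstracting from any constant
term, and inserts the result into (5):

```latex
\begin{aligned}
\ln A_{it} &= \varphi \ln k_{it} + \varphi\rho \sum_{j \neq i}^{N} w_{ij} \ln k_{jt}, \\
\ln y_{it} &= \ln A_{it} + \alpha \ln k_{it} + v_{it} \\
           &= (\alpha + \varphi) \ln k_{it}
              + \varphi\rho \sum_{j \neq i}^{N} w_{ij} \ln k_{jt} + v_{it}.
\end{aligned}
```

Matching coefficients with (7) gives β = α + ϕ for own capital and θ = ϕρ for
neighbors' capital.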
If we now introduce the production function (7) into a neoclassical growth model,
we can obtain the output per worker at the steady state and the speed of convergence
to the steady state. Interestingly enough, the resulting convergence equation includes
spatial lags in both the dependent and independent variables:

\[
\dot{y}_{it} = \alpha + \lambda \sum_{j \neq i}^{N} w_{ij} \dot{y}_{jt} + \beta z_{it} + \theta \sum_{j \neq i}^{N} w_{ij} z_{jt} + v_{it} \tag{8}
\]

where ẏit is the annual rate of growth of output per worker, zit is the logarithm
of the sum of the labor rate of growth, the rate of depreciation and the rate of
technical change, and the speed of convergence can be obtained from the estimated
λ parameter. For notational ease, we have omitted in this equation the initial output
(real income) per worker and the fraction of output that is saved. The spatial
convergence equation in (8) thus follows a SDM model and it predicts convergence
if output (per worker) growth is a negative function of initial output (not shown),
after controlling for the determinants of the steady state (i.e., labor rate of growth,
the rate of depreciation, and the rate of technical change) and the possible existence
of spatial interdependence among neighboring economies. More recent papers also
include other determinants of economic growth and convergence in regions and
countries, such as human capital and the public sector (e.g., [5]).
To finish this section, it is worth mentioning that the spatial specifications of the
above models come from economic theory. In general, one of the main criticisms
of spatial econometric models is their lack of a theoretical basis [31].
The spatial neoclassical growth model is therefore a remarkable exception, because
its spatial specification relies on the existence of knowledge spillovers and learning
processes.

Network Inputs

This section presents several papers that argue that some inputs have network char-
acteristics and generate external effects on neighboring production units (regions).
For instance, Munnell [93] points out that the transport infrastructure localized in a
region could benefit other regions. Stiroh [115, 116] and Griliches [59] also stated
that ICTs and R&D activities can be treated as network inputs because they can
generate externalities to other firms as well. In most cases, these inputs have a
“network” nature because they are in fact public goods, i.e., they are inputs that one
firm can use without reducing their availability to others and from which no one is
excluded.

Transport Infrastructures

Transport network or transport infrastructure has been considered one of the public
policy decisions that has the greatest impact on economic development, both for
its effect on the structure of the population, and its capacity to reduce costs and
increase production. This explains why many studies have tried to quantify the effect
of transport infrastructure on private production. For a summary of this literature,
see Cohen and Morrison [25] and Pereira and Andraz [103]. More specifically, after
the seminal paper by Aschauer [10], there is a wide literature that has extended the
traditional (aggregate) production function where the provision of infrastructures
($KP_{it}$) is complementary to labor ($L_{it}$) and private capital ($K_{it}$). Using a Cobb–
Douglas specification and assuming constant returns to scale in private inputs, this
production function can be written, after taking logs, as follows:

\[
\ln y_{it} = \ln A_{it} + \alpha \ln k_{it} + \beta \ln KP_{it} + v_{it} \tag{9}
\]

where yit and kit are again output and capital services per worker, vit is the noise
term, and Ait can be interpreted as a total factor productivity index.
Note that the provision of infrastructures enters the production function as a stan-
dard factor of production. Straub [117] states that the inclusion of the infrastructure
variables as simple inputs is questionable because, despite the increasing market
mediation of infrastructure, this type of capital is not completely remunerated
according to its marginal productivity in the real world. This has prompted several
authors (see, e.g., [34]; and [67]) to instead consider infrastructure as part of
the total factor productivity term (Ait ), i.e., as an efficiency-enhancing externality
specifically linked to the accumulation of infrastructure capital. As the Cobb–
Douglas production function in (9) does not allow researchers to distinguish the
direct effect of infrastructure (i.e., through the production of specific services)
from the indirect effect (i.e., the efficiency-enhancing infrastructure externalities),
Orea et al. [99] have pointed out that this problem can be addressed if we first
use a frontier specification of the production model and then we treat the set of
infrastructure variables as efficiency determinants.
In addition, transport infrastructures generate (spillover) effects outside the
geographical place where they are located, given their network characteristic. In
other words, firms located in a region use not only the infrastructure of their own
region but also the infrastructures located in neighboring regions. Therefore, firms
use two public infrastructures, not just one as is implicitly assumed in (9). For
this reason, the majority of the literature examining spillover effects attributed to
public infrastructure adopts a similar strategy, i.e., the addition of spatial lags of
$KP_{it}$ as a standard input. Therefore, the production function that is estimated is the
well-known SLX model:

\[
\ln y_{it} = \ln A_{it} + \alpha \ln k_{it} + \beta \ln KP_{it} + \theta \sum_{j \neq i}^{N} w_{ij} \ln KP_{jt} + v_{it} \tag{10}
\]

Depending on the strategy followed to define the spatial weights ($w_{ij}$), some
papers have confirmed the existence of positive spillover effects derived from
investment in transport infrastructures (e.g., [25] and [102]), while other studies find
negative spillover effects (e.g., [3, 4, 18]). The latter studies conclude that negative
spillover effects from transport infrastructures arise from factor migration or
when the set of neighboring regions is defined in economic terms (e.g., regions
that are competitors or have similar characteristics), while positive spillovers are
generally found when the neighboring firms are defined using geographical criteria.
The spillover effects also vary with the set of countries or regions examined
and with the specification of the model in levels or in rates of growth. Indeed, in
his review of the literature, Straub [117] finds that specifications using a standard
production function in levels are generally more supportive of a positive effect of
infrastructure than those using output (productivity) growth rates. He interprets this
result as an indication that transitory effects are more often observed than long-term
effects. This author also points out that in most cases, growth-accounting
studies find lower levels of infrastructure externalities for more developed countries
or regions than for developing ones.
Other authors analyze the impact of infrastructures provision on economic
performance using more comprehensive spatial models. Yu et al. [129] summarize
this specific literature and attribute their SDM production model to variations in
the rate of capital (capacity) utilization, an unobservable production driver in many
applications. To address this issue, Gajanan and Malhotra [47] suggest modeling
the rate of capital utilization as a function of the economic performance of the
neighboring provinces. This empirical strategy is justified on the grounds that each
region adjusts its rate of capital utilization to meet the output increases in
other regions [21]. In this sense, Arbues et al. [8] and Álvarez et al. [5] assume
that the flow of capital services per worker in (10) is $k_{it} = CU_{it} \cdot k_{it}^{*}$, where $k_{it}^{*}$
is the stock of capital per worker. They next define their capacity utilization rate as
$CU_{it} = \exp\left(\lambda_{Y} \sum_{j \neq i}^{N} w_{ij} \ln y_{jt} + \tau_{it}\right)$. Substituting this spatial specification of $CU_{it}$ into (10),
they obtain the following SDM model:
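\[
\ln y_{it} = \ln A_{it} + \lambda \sum_{j \neq i}^{N} w_{ij} \ln y_{jt} + \alpha \ln k_{it}^{*} + \beta \ln KP_{it} + \theta \sum_{j \neq i}^{N} w_{ij} \ln KP_{jt} + \varepsilon_{it} \tag{11}
\]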

where εit = vit + ατ it and λ = αλY . Notice that the production function (11)
depends on the capital stock kit∗ and the spatial lag of the dependent variable, while
the production function (10) only depends on capital services that were (implicitly)
assumed to be proportional to the capital stock.
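
The step from (10) to (11) can be sketched as follows, using only the definitions
of capital services and capacity utilization given above:

```latex
\begin{aligned}
\ln k_{it} &= \ln CU_{it} + \ln k_{it}^{*}
            = \lambda_{Y} \sum_{j \neq i}^{N} w_{ij} \ln y_{jt} + \tau_{it} + \ln k_{it}^{*}, \\
\alpha \ln k_{it} &= \alpha\lambda_{Y} \sum_{j \neq i}^{N} w_{ij} \ln y_{jt}
            + \alpha \ln k_{it}^{*} + \alpha\tau_{it}.
\end{aligned}
```

Replacing the capital services term in (10) with the second line yields the spatial
lag of output with coefficient λ = αλ_Y and the composed error ε_it = v_it + ατ_it.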

ICT and R&D Activities

ICT can be considered as a network input because it may enable a process innovation
itself (see, e.g., [15, 20]). The main debate in this literature has to do with the
empirical techniques used to link ICT and productivity growth. For instance, while
Jorgenson and Stiroh [70], Gordon [56], Van Ark et al. [123], and Van der Wiel
[124] use growth-accounting techniques with observed factor shares, other papers
(e.g., [116]) have questioned the use of growth-accounting techniques because
the estimated factor shares (from an estimated production function) do not often
coincide with their observed counterparts. Strobel [118] suggests that the divergence
in observed and estimated factor shares can be interpreted as evidence of the
existence of ICT spillovers.
As ICT can generate externalities to other firms, van Leeuwen and van der Wiel
[125, 126] among other authors introduce spatial lags of the ICT variable into their
production functions or growth accounting models. Using Dutch firm-level data,
they find that the ICT spillovers are an important source of TFP growth. They
also corroborate that the production function approach yields more significant and
plausible results than the growth accounting approach. More recent papers include
more sophisticated specifications of the ICT spillovers. For instance, Strobel [118]
includes the ICT spillovers as an intermediate input. While Bloom et al. [16] use
the degree of product market proximity to compute the weight matrix W, Lychagin
et al. [88] use the geographical distance. Interestingly, the latter weight matrix
seems to be more relevant for R&D spillovers than for ICT spillovers because
the network effects associated with ICT are not confined to a limited geographical
space.
Since the seminal contribution by Griliches [58], many empirical studies have
provided solid evidence about the impact of R&D activities on firms’ production
using either firm or regional level data. For firm level applications, see Hall and
Mairesse [62], Klette and Kortum [77], and Rogers [108]. Regarding the second set
of papers, Prenzel et al. [105] highlight the relevance of regional and geographical
characteristics in the impact of R&D investment on productivity.
It should be noted that R&D is an input that creates new ideas and innovation
[89] that other firms can copy, and thus an important aspect is the possibility of
externalities and knowledge spillovers [11]. In this case, as pointed out by Gråsjö
[57], the knowledge spillovers can be viewed as a pure externality, i.e., as an
unintended side effect of firms’ ordinary activities. Alternatively, knowledge can
be transmitted through explicit agreements to transfer knowledge. To capture these
externalities, Bloom et al. [16, 17] and other authors extend the production function
with variables measuring R&D spillovers based on spatial and technological
proximity, and find remarkable spillovers associated with R&D activities. Eberhardt
et al. [37] also find that the estimated returns to private R&D are seriously biased
if the R&D spillovers are ignored.

Local Versus Global

As shown before, the spatial production models allow us to examine the existence of
spatial spillovers associated with public infrastructures. The issue here is the selection
of an appropriate spatial specification in order to produce valid estimates and to
infer accurate predictions of the scope of the existing spillovers. In most of the
abovementioned papers, the authors use a specification of the production function
that corresponds with the SLX model.
In this sense, LeSage [83] states that most spatial spillovers are local in applied
regional science modeling, but remarks that a network input, such as a
highway, represents a resource shared by numerous locations and could thus
also cause global spillovers. If so, the SDM model should be chosen instead of the
SLX model. However, as pointed out by this author, the distinction between local
and global spillovers based on estimated SLX vs. SDM models could be somewhat
artificial, since it relies on the implicit assumption that the local spillovers in a SLX
model only involve adjacent neighbors, but not higher-order neighbors. This
assumption does not hold if the W matrix is defined in (very) broad terms, e.g., using
the inverse of the distance between all regions or adopting economic concepts of
distance. In this case, most spatial observations will be involved, as happens in the
SDM model, but now using a SLX model.
Despite the abovementioned discussion, LeSage and Pace [87] state that most
papers in this literature still maintain this artificial distinction because it facilitates
testing for local versus global specifications. However, this test is only informative if
the W matrix is defined in very narrow terms (using, e.g., first-order neighbors)
because the difference between the SLX and SDM models is then larger. The same
applies to SLX vs. SAR models. In this sense, Gibbons and Overman [48] show that
the reduced forms of these two models are very similar if the W matrix is broadly
defined.
In summary, the above discussion shows that it is necessary to pay much attention
to the spatial specification of the model when we aim to capture spillover effects.
However, this does not take place in practice. Indeed, as pointed out by Gibbons
and Overman [48] and Vega and Elhorst [127], many empirical applications lack a
proper justification for the selected spatial specification. More thoughts about this
can be found in the last section of this chapter.

Agglomeration Economies

A vast literature confirming the relation between productivity and economies
of density has appeared since the seminal paper by Ciccone and Hall [24] on
agglomeration economies. The theory of agglomeration economies proposes that
firms benefit from the concentration of economic activity, via localization effects
or urbanization economies [46]. Krugman [78] and Fujita et al. [46] show that
the presence of agglomeration or concentration economies in geographical space
might explain the existence of increasing returns to scale in many empirical
applications.
The empirical literature measuring the effect of agglomeration economies on
firms’ productivity is often based on estimated production functions, which contain
some representation of agglomeration economies.2 This literature is focused on the
relative importance of two different agglomeration economies. While localization
economies are caused by industrial concentration [26], urbanization economies are
associated with city size [36]. A survey of this literature can be found in Rosenthal and
Strange [112].
In general, there is no consensus in this literature about the production effect
of the different agglomeration measures [92], although these studies usually find
positive productivity gains from urban agglomeration [26, 38, 111]. This lack
of consensus has generated a heated debate on the level of disaggregation, the
specification of the model, the econometric methods, and the measurement of
agglomeration. Regarding the econometric issues, Combes and Gobillon [26]
propose several strategies to deal with potential endogeneity problems. Another source
of differences is the existence of missing production drivers positively correlated
with agglomeration, such as land, local public infrastructures [38], or natural
advantages of some locations [42]. Selection biases in location choice are also
expected because the productivity of firms can depend on the density of their
locations. In this sense, Combes et al. [29] developed a formal test that allows
examining whether firm selection explains spatial productivity differences.
Other differences are likely caused by the use of different spatial concentration
indexes that try to measure inequalities in the spatial distribution of economic
activity. In this sense, Combes and Overman [27] and Combes et al. [28] identify
six properties that an ideal index of spatial concentration should fulfill. As most
concentration indexes based on clusters of firms do not satisfy all properties,
Duranton and Overman [35] and Arbia et al. [7] suggest using distance-based spatial
concentration indexes.
It is worth mentioning that the abovementioned papers have to do with the spatial
location of the economic activity, but they do not use the SLX, SAR or SDM
models introduced in previous sections because this literature ignores the existence
of spillovers between “neighbors.” A remarkable exception is Han et al. [63] who
follow Ertur and Koch [43] and propose estimating an augmented version of the
production function (5), where the TFP term (Ait ) is modeled as a function of
two indicators of urban agglomeration and their spatial lags, and the TFP term of
neighboring cities:
\[
A_{it} = Z_{it}^{\beta} \prod_{j \neq i}^{N} \left( Z_{jt}^{\theta} A_{jt}^{\lambda} \right)^{w_{ij}} \tag{12}
\]

where Zit is a vector of two agglomeration measures. In particular, Han et al. [63]
assume that technological interdependence among cities operates through spatial
externalities, and the external effect of technology generated by specialization
and diversification agglomeration of manufacturing in one city extends across its
borders. They next plug (12) into (5) and estimate the following production function:

2 If real wages are proportional to labor productivity, this issue can also be examined
using wage functions.

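\[
\begin{aligned}
\ln y_{it} = {} & \beta \ln Z_{it} + \alpha \ln k_{it} + \lambda \sum_{j \neq i}^{N} w_{ij} \ln y_{jt} \\
& + \theta \sum_{j \neq i}^{N} w_{ij} \ln Z_{jt} - \alpha\lambda \sum_{j \neq i}^{N} w_{ij} \ln k_{jt} + v_{it}
\end{aligned} \tag{13}
\]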

For notational ease, we include in (13) neither dynamic terms nor the other
production drivers measuring urbanization, human capital, government intervention,
foreign direct investment, and energy consumption. The specification of knowledge
spillovers in the TFP term yields an SDM model. Han et al. [63] estimate this
model using spatial econometric techniques and find evidence of the existence of
spatial spillovers that are influenced by the use of economic and spatial definitions
of proximity in the weight matrix.
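
The origin of the negative coefficient on neighbors' capital in (13) can be seen by
writing out the substitution. The sketch below takes logs of (12), uses (5) to
express the TFP term of neighboring cities in terms of their output and capital,
and collects terms:

```latex
\begin{aligned}
\ln A_{it} &= \beta \ln Z_{it} + \theta \sum_{j \neq i}^{N} w_{ij} \ln Z_{jt}
            + \lambda \sum_{j \neq i}^{N} w_{ij} \ln A_{jt}, \\
\ln A_{jt} &= \ln y_{jt} - \alpha \ln k_{jt} - v_{jt}, \\
\ln y_{it} &= \beta \ln Z_{it} + \alpha \ln k_{it}
            + \lambda \sum_{j \neq i}^{N} w_{ij} \ln y_{jt}
            + \theta \sum_{j \neq i}^{N} w_{ij} \ln Z_{jt} \\
           &\quad - \alpha\lambda \sum_{j \neq i}^{N} w_{ij} \ln k_{jt}
            + v_{it} - \lambda \sum_{j \neq i}^{N} w_{ij} v_{jt}.
\end{aligned}
```

The spatially lagged noise term in the last line does not appear explicitly in (13);
treating it as absorbed into the error term is an assumption of this sketch.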

Spatial Returns to Scale

Glass et al. [53] introduce the idea of spatial returns to scale (RTS) by adapting well-known
concepts introduced by LeSage and Pace [86] in applied spatial econometrics
to the measurement of classic technology characteristics, such as elasticities and
returns to scale. They suggest computing three different returns to scale measures
(i.e., internal, external, and total) once a spatial SAR or SDM production function
is estimated. As the authors pointed out, these three RTS measures can also be
calculated using other primal and dual technology representations, such as cost,
profit, revenue, and input and output distance functions. To illustrate the differences
among these three measures, assume that we have already estimated the following
single input SDM production function:

\[
\ln Y_{it} = \alpha + \beta \ln X_{it} + \lambda \sum_{j \neq i}^{N} w_{ij} \ln Y_{jt} + \theta \sum_{j \neq i}^{N} w_{ij} \ln X_{jt} + v_{it} \tag{14}
\]

where $Y_{it}$ and $X_{it}$ are the output and input levels, λ is the spatial autoregressive
parameter, and $w_{ij}$ is an element of the spatial W matrix, which describes the strength
of the spatial interaction between the units. Note that the spatial autoregressive
parameter in (14) is common to all units. Therefore, the above equation is a
homogeneous coefficient spatial model. Notice that (14) can be rewritten using
matrix notation as follows:

\[
\ln Y_{t} = (I - \lambda W)^{-1} \left[ \alpha + \beta \ln X_{t} + \theta W \ln X_{t} + v_{t} \right] \tag{15}
\]

where $\ln Y_{t}$, $\ln X_{t}$, and $v_{t}$ are N×1 vectors, and W is an N×N matrix of spatial
weight terms. When the off-diagonal elements are equal to one for observations
in adjacent locations and the weight matrix is row-standardized, $W \ln X_{t}$ can be
interpreted as the average input level of adjacent firms or regions.

Individual Elasticities

Differentiating (15) with respect to own factor inputs (i.e., lnXt ) and the inputs of
all the other units in the sample (i.e., WlnXt ) yields the following matrix of direct
and indirect elasticities for each unit:

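\[
\begin{bmatrix}
e_{1} & e_{12} & \cdots & e_{1N} \\
e_{21} & e_{2} & \cdots & e_{2N} \\
\vdots & \vdots & \ddots & \vdots \\
e_{N1} & e_{N2} & \cdots & e_{N}
\end{bmatrix}
= (I - \lambda W)^{-1}
\begin{bmatrix}
\beta & w_{12}\theta & \cdots & w_{1N}\theta \\
w_{21}\theta & \beta & \cdots & w_{2N}\theta \\
\vdots & \vdots & \ddots & \vdots \\
w_{N1}\theta & w_{N2}\theta & \cdots & \beta
\end{bmatrix} \tag{16}
\]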

where ei = ∂ ln Yit /∂ ln Xit and ein = ∂ ln Yit /∂ ln Xnt are direct and indirect
elasticities, respectively. The direct elasticity in a spatial production function is the
rate of increase in a unit’s output following a (proportional) increase in its own input
variable(s). The indirect elasticity is the rate of increase in a unit’s output following
an increase in the factor inputs of another unit in the sample.
Both direct and indirect elasticities include a feedback effect, which passes through
other units via the spatial multiplier matrix and back to the unit which initiated the
change. It is worth mentioning that this feedback effect is also related to our
previous discussion on the existence of local vs. global effects in spatial models.
Notice in this sense that, as pointed out by LeSage [83], the global multiplier
$(I - \lambda W)^{-1}$ can be expressed as $I + \lambda W + \lambda^{2} W^{2} + \lambda^{3} W^{3} + \cdots$. This implies
that any elasticity can in turn be decomposed into a nonspatial elasticity via the
identity matrix, plus a local effect that produces impacts on production of first-order
neighboring units via $\lambda W$, and a global effect that arises from impacts on
production of second-order neighboring units via $\lambda^{2} W^{2}$, and so on. More precisely,
the indirect elasticity does not have a proper nonspatial effect because the
initial elasticity that is later multiplied by $(I - \lambda W)^{-1}$ already includes
the W matrix, and thus the first component can be interpreted as a local effect,
and the next components as global effects. Unlike the simple SLX model, which
only includes local effects, the above SDM production function has both
local and global effects (see [41, 86]; and [55]). Notice however that, as pointed out
before, the SLX model might also involve distant production units if the W matrix
is defined in broad terms.
As $e_{i}$ and $e_{in}$ are linear functions of the elements of the global multiplier
$(I - \lambda W)^{-1}$ and these elements are unit-specific, both direct and indirect elasticities
vary across units even though we have estimated common β and θ parameters
for all observations. Notice also that if the estimated spatial production function
uses a SAR specification and there are no local spatial spillovers associated with
the input variables (i.e., θ = 0), both direct and indirect effects are
simple adjustments of the original parameter. However, even if a SAR specification
is used, both elasticities still exhibit a nonlinear relationship with the underlying
model parameters.
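
The elasticity matrix in (16) is straightforward to compute once the parameters are
available. The following Python sketch is only an illustration: the four-unit W and
the values of β, θ, and λ are hypothetical, and the small Gauss-Jordan routine simply
stands in for whatever matrix inversion a numerical library would provide.

```python
def matmul(A, B):
    """Multiply two square matrices stored as lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse(M):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        pivot_row = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot_row] = aug[pivot_row], aug[c]
        pivot = aug[c][c]
        aug[c] = [x / pivot for x in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def elasticity_matrix(W, beta, theta, lam):
    """Return (I - lam*W)^(-1) (beta*I + theta*W), the right-hand side of (16)."""
    n = len(W)
    i_minus_lw = [[(1.0 if i == j else 0.0) - lam * W[i][j] for j in range(n)]
                  for i in range(n)]
    b_plus_tw = [[(beta if i == j else 0.0) + theta * W[i][j] for j in range(n)]
                 for i in range(n)]
    return matmul(inverse(i_minus_lw), b_plus_tw)


# Hypothetical example: four units on a line (0-1-2-3), row-standardized weights.
W = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]
beta, theta, lam = 0.6, 0.1, 0.3  # illustrative values, not estimates

E = elasticity_matrix(W, beta, theta, lam)
n = len(W)
direct = [E[i][i] for i in range(n)]
indirect = [sum(E[i][j] for j in range(n) if j != i) for i in range(n)]
print("direct elasticities:", [round(e, 4) for e in direct])
print("summed indirect elasticities:", [round(e, 4) for e in indirect])
```

Although β and θ are common to all units, the direct elasticities of the two interior
units differ from those of the two end units, because the diagonal elements of the
spatial multiplier depend on each unit's position in W.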

Mean Elasticities

LeSage and Pace [86] suggest reporting mean values of direct and indirect concepts
to facilitate interpretation. While the mean direct elasticity (hereafter DE) is
computed as a simple arithmetic average of the diagonal elements of (16), i.e.,
$DE = \frac{1}{N} \sum_{i=1}^{N} e_{i}$, the mean indirect elasticity (hereafter IE) is computed as the
cumulative sum of the off-diagonal elements of (16) from each row, averaged over
all rows, i.e., $IE = \frac{1}{N} \sum_{i=1}^{N} \sum_{n \neq i}^{N} e_{in}$. The mean total elasticity (hereafter TE)
is next computed as the sum of the mean direct and mean indirect elasticities, i.e.,
TE = DE + IE. To compute the t-statistics for the mean direct, mean indirect, and
mean total elasticities, LeSage and Pace [86] and Elhorst [41] propose conducting
Monte Carlo experiments that simulate the distribution of the mean elasticities
using the variance-covariance matrix associated with the ML estimates. Interestingly
enough, while Glass et al. [51] and Glass and Kenjegalieva [50] use Bayesian
simulation techniques to compute the t-statistics for the mean effects, they are
calculated using the delta method in Glass et al. [52, 54].
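
The simulation idea can be sketched in a few lines of Python. The example below is
a simplified illustration rather than the procedure of any of the cited papers: the
point estimates and standard errors are hypothetical, and the parameters are drawn
independently, i.e., the covariances of the variance-covariance matrix are set to
zero, which an actual application would not do.

```python
import random
import statistics


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse(M):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    n = len(M)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [x / pivot for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def mean_elasticities(W, beta, theta, lam):
    """Return (DE, IE, TE) computed from the elasticity matrix in (16)."""
    n = len(W)
    a = [[(1.0 if i == j else 0.0) - lam * W[i][j] for j in range(n)] for i in range(n)]
    b = [[(beta if i == j else 0.0) + theta * W[i][j] for j in range(n)] for i in range(n)]
    E = matmul(inverse(a), b)
    de = sum(E[i][i] for i in range(n)) / n
    ie = sum(E[i][j] for i in range(n) for j in range(n) if j != i) / n
    return de, ie, de + ie


W = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]
estimates = {"beta": 0.6, "theta": 0.1, "lam": 0.3}      # hypothetical point estimates
std_errors = {"beta": 0.05, "theta": 0.04, "lam": 0.05}  # hypothetical, zero covariances

rng = random.Random(12345)
draws = []
for _ in range(1000):
    # Draw each parameter from a normal distribution centered at its estimate.
    b, t, l = (rng.gauss(estimates[k], std_errors[k]) for k in ("beta", "theta", "lam"))
    draws.append(mean_elasticities(W, b, t, l))

point = mean_elasticities(W, **estimates)
for name, value, sims in zip(("DE", "IE", "TE"), point, zip(*draws)):
    t_stat = value / statistics.stdev(sims)
    print(f"{name}: estimate = {value:.3f}, simulated t-statistic = {t_stat:.2f}")
```

Each t-statistic divides the point estimate of the mean elasticity by the standard
deviation of its simulated values.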

Internal, External, and Total Returns to Scale

The main contribution of Glass et al. [53] is noticing that the above elasticities can
be interpreted as measures of the technology’s returns to scale in a spatial setting.
As the spatial effects in (14) are related to inputs, this allows extending a classical
characteristic of production to the spatial case.
They proposed three returns to scale measures: internal, external, and total. Using
the standard terminology in spatial econometrics, they can be alternatively labeled
as direct, indirect, and total returns to scale. The internal returns to scale is defined
as the rate of increase in a unit’s output following a proportional increase in its own
input variable(s). The unit-specific internal RTS can be simply computed as $e_{i}$
because only a single input has been considered. The external returns to scale are defined
as the rate of increase in a unit’s output following an increase in the inputs of all
the other units in the sample. Glass et al. [53] propose computing the unit-specific
external RTS using the simple sum of indirect elasticities, that is, $\sum_{n \neq i}^{N} e_{in}$. Finally,
total returns to scale is defined as the rate of increase in a unit’s output following a
simultaneous increase in its own inputs and the inputs of all the other units in the
sample. Therefore, the calculation of total returns to scale is based on all N units
in the sample simultaneously changing their inputs. The unit-specific total RTS can
be computed as $e_{i} + \sum_{n \neq i}^{N} e_{in}$. Glass et al. [53] also examine the concavity of the spatial
production function and find that all definitions of concavity (i.e., internal, external,
and total) in a spatial setting depend on the specification of the spatial weight matrix.
Glass et al. [53] also use simple arithmetic averages to summarize their RTS
results. While mean internal returns to scale are computed as $\frac{1}{N}\sum_{i=1}^{N} e_i$, mean
external returns to scale are computed as $\frac{1}{N}\sum_{i=1}^{N}\sum_{n \neq i} e_{in}$. Finally, mean total
returns to scale is the sum of the mean internal returns to scale and the mean
external returns to scale. Using these three mean values, the spatial production
function exhibits decreasing returns if RTS < 1. Constant returns appear if RTS = 1.
Finally, the spatial production function exhibits increasing returns if RTS > 1. The
above authors find positive labor and positive capital spillovers in their empirical
application to a set of European countries over the period 1990–2011. While they
cannot reject constant internal and external returns to scale, they reject constant
total returns to scale in favor of increasing total returns. Their findings thus provide
some empirical support for the endogenous growth theories which are based on the
assumption of increasing total returns to scale, although their increasing total returns
to scale are not caused by knowledge spillovers as in Romer [109, 110].

Economy-Wide Returns to Scale

The above mean RTS measures mimic the approach suggested by LeSage and Pace
[86] to summarize their marginal effects. Similar, but not the same, expressions can
be found if we aim to compute aggregate or economy-wide returns to scale from
the whole set of basic production units or regions. First notice that the aggregate or
economy-wide technology can be defined as:

$$Y_t = G(X_{1t}, \dots, X_{Nt}) = \sum_{i=1}^{N} Y_{it} = \sum_{i=1}^{N} f_i(X_{it}, X_{-it}) \tag{17}$$

where $X_{-it}$ is the vector of inputs of all the other units (regions) in the sample, and $f_i$
is the production function of unit i, which depends on its own inputs and the inputs
of other units in the presence of spatial spillovers. Notice that $G(X_{1t}, \dots, X_{Nt})$
is not separable in individual inputs because all regional outputs depend on own
and neighboring inputs. In contrast, the aggregate technology in a nonspatial
model is separable in individual inputs as it can be written as $Y = \sum_{i=1}^{N} f_i(X_i)$.
Differentiating the above economy-wide technology with respect to all inputs and
assuming that all individual inputs increase in the same proportion, we get after
some straightforward algebra that economy-wide returns to scale (hereafter, E) can
be measured as:

$$E = \sum_{i=1}^{N} s_i e_i + \sum_{i=1}^{N} s_i \sum_{n \neq i}^{N} e_{in} \tag{18}$$

It is worth mentioning that, if the estimated model is a SAR model, the
economy-wide measure of returns to scale is a simple weighted average of all direct
elasticities, i.e., $E = \sum_{i=1}^{N} s_i e_i$. In summary, we should use a weighted rather than a simple
arithmetic average of individual elasticities if we wish to compute economy-wide
RTS. In other words, the mean measures of total, direct, and indirect effects of
LeSage and Pace [86] and Glass et al. [53] cannot be used to measure aggregate
(economy-wide) RTS unless all units (regions) are of similar size.

Returns to Scale in Heterogeneous Coefficient Models

The increasing availability of large (panel) data sets explains why important
contributions have been made in recent years to estimate spatial models with
autoregressive coefficients that vary across units. For instance, Malikov and Sun
[90] and Sun and Malikov [120] compute the spatial autoregressive values from an
unknown smooth function of a set of environmental factors. Gude et al. [60] do
so using a parametric function and in a frontier setting. They use a heteroscedastic
version of the spatial stochastic frontier models introduced by Glass et al. [54] as
they allow for province-specific degrees of spatial dependence.
Allowing for unit-specific spatial coefficients leads not only to less biased
conclusions but also to richer ones (see, e.g., [60]), especially when the
spatial data represent firm-level rather than regional observations (see, e.g., [84]).
Sun and Malikov [120] add that such models are also useful in the
estimation of growth models, as this literature expects the intensity
of knowledge spillovers to depend greatly on the institutional and cultural compatibility
of neighboring countries.
As mentioned in section “Mean Elasticities,” LeSage and Pace [86] and Elhorst
[41] proposed using arithmetic averages of either diagonal or off-diagonal elements
of (16) to simplify the task of interpreting estimates of direct and indirect effects
from the model. However, LeSage and Chih [84] stated later that scalar summary
measures are not consistent with the notion of parameter heterogeneity. In this
case, we should report observation-level effects estimates. For the case of the
heterogeneous coefficient SAR panel model, the N diagonal elements of the matrix
should be provided to produce direct effects estimates for each of the units (regions).
As estimates of unit-specific indirect effects, it is recommended to follow the
proposal of LeSage and Chih [84] and use the cumulative sum of off-diagonal
elements in each row of (16).

Spatial Econometric Models

We provide in this section a brief discussion of the general estimation techniques
used to estimate spatial econometric models with cross-sectional and panel data
sets.
Earlier developments in the testing and estimation of SAR models in the context of
cross-sectional models have been summarized in Anselin [6]. Ord [95] first proposed
estimating SAR models by the method of maximum likelihood (ML). Kelejian
and Prucha [72, 73] later introduced the spatial two-stage least squares (2SLS)
and method of moments (MM) estimators. Numerous extensions have been made in
more recent articles. Kelejian and Prucha [74]
extended the model to a system of spatially interrelated equations, Kelejian and
Prucha [75, 76] introduced a method robust to heteroscedasticity and autocorrelation
in the disturbances of a spatial autoregressive model, and Lee [81] introduced a spatial
quasi-maximum likelihood estimator (QMLE), which is more appropriate when the
disturbances in the model are not truly normally distributed.
The availability of panel data sets has made it possible to observe individuals
characterized by spatial features over time. As panel econometric models are less
vulnerable to multicollinearity issues and allow controlling for unobserved but
time-invariant variables, many researchers have recently adapted (and extended) the
previous spatial econometric estimators to panel data settings. Elhorst [41] provides
a nice introduction to this literature. For instance, Kapoor et al. [71] and Mutl and
Pfaffermayr [94] have adapted the MM methods from cross-section to panel data.
Elhorst [39, 40] and Lee and Yu [82] introduced, respectively, the ML estimators of
the SLX and SEM models for panel data with fixed and random effects. Baltagi
and Liu [13] extended the traditional panel data error components estimator to the SAR
model. In parallel, there is a literature that has focused on nonparametric techniques,
as is the case with Bayesian spatial econometrics (see, for instance, [86]).
In recent years, several debates regarding spatial econometric techniques have
remained open. One of them concerns the choice of the W matrix representing spatial
linkages. Recent developments in spatial econometrics introduce new techniques
to automatically select the spatial weight matrix [130]. Another hot debate has
to do with the heterogeneous nature of the spatial autoregressive coefficients. As
anticipated in section “Returns to Scale in Heterogeneous Coefficient Models,”
the so-called heterogeneous spatial models allow measures of spatial dependence
specific to each observation, information that can be useful for policymakers’
decisions. Bayesian and Markov Chain Monte Carlo (MCMC) mixture estimation
methods [85] have mostly been proposed to achieve this flexibility in the spatial
autoregressive parameters.

Spatial Stochastic Frontier Models

Although there is an extensive spatial econometric literature dealing with spatial
interactions across spatial units, the literature on efficiency analysis has not
generally taken spatial effects into account. Several studies have found that failure to
account for spatial correlation effects in SF models results in biased estimates of
efficiency scores (e.g., [114]). For this reason, it is important to use an econometric
framework that allows controlling for the presence of cross-sectional dependence
when measuring the efficiency performance of spatially distributed production units.
It should be stressed that it is important to control not only for spatial spillovers
in efficiency analyses but also for the existence of heterogeneous spatial dependence
parameters. Indeed, Gude et al. [60] find that the standard efficiency estimates (i.e.,
the estimated “u” term) change considerably when they (incorrectly) use a common spatial
dependence parameter for all observations. The efficiency results are expected to
change even more if the total efficiency scores proposed by Glass et al. [54] are
computed, because these efficiency measures include direct and indirect spatial
effects that depend not only on the abovementioned “u” term but also on the
estimated spatial dependence parameters, which might vary significantly across
observations.
This section offers a brief discussion of the small but evolving literature on
spatial stochastic frontier modeling.3 This literature tries to overcome this issue
by including spatial autoregressive terms in their models. Generally speaking,
this literature can be split into two groups, depending on whether distributional
assumptions are made for the inefficiency term.

Distribution-Free Models

The first group of papers estimates panel spatial models based only on the
distribution of the noise term and without making any distributional assumption for the
inefficiency component of the error term. Examples of papers that belong to this
group are Druska and Horrace [33] and Glass et al. [51, 52]. The model estimated
in these papers can be summarized using the following single-input production
function:
$$\ln Y_{it} = \alpha_{it} + \beta \ln X_{it} + \lambda \sum_{j \neq i}^{N} w_{ij} \ln Y_{jt} + \theta \sum_{j \neq i}^{N} w_{ij} \ln X_{jt} + \varepsilon_{it} \tag{19}$$

where $Y_{it}$ and $X_{it}$ are respectively the output and input levels of unit i, $w_{ij}$ is the
(i, j) element of the spatial weight matrix W, λ is the
spatial autoregressive parameter, $\alpha_{it}$ is a unit-specific effect that can be defined as
time-invariant (i.e., $\alpha_{it} = \alpha_i$) or as an individual-specific parameterized function of
time (i.e., $\alpha_{it} = \alpha_{0i} + \alpha_{1i} t + \alpha_{2i} t^2$), and $\varepsilon_{it}$ is a noise term that might also be
spatially correlated as in Druska and Horrace [33]. The individual efficiency scores
are simply computed from the cross-sectional specific effects using the approach in
Schmidt and Sickles [113] (hereafter SS) and Cornwell et al. [30] (hereafter CSS).
In this setting, the observation with the largest individual effect in each period is
placed on the production frontier, and the efficiency estimate of every other
observation is the exponential of the difference between its own effect and the
best-performing individual effect. That is, efficiency is measured
as $EF_{it} = \exp(\alpha_{it} - \max_j \alpha_{jt})$.
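
The efficiency formula above is easy to illustrate numerically. The sketch below takes a hypothetical N x T array of estimated effects, built from the quadratic CSS-type specification with made-up coefficients, and applies the normalization period by period; the time-invariant SS case is obtained by passing a single column.

```python
import numpy as np

def relative_efficiency(alpha):
    """EF_it = exp(alpha_it - max_j alpha_jt) for an N x T array of effects."""
    alpha = np.asarray(alpha, dtype=float)
    return np.exp(alpha - alpha.max(axis=0, keepdims=True))

# Hypothetical CSS-type effects: alpha_it = a0_i + a1_i * t + a2_i * t^2.
a0 = np.array([1.0, 0.8, 0.5])
a1 = np.array([0.00, 0.05, 0.10])
a2 = np.array([0.0, 0.0, -0.005])
t = np.arange(1, 5)                                  # periods 1..4
alpha = a0[:, None] + a1[:, None] * t + a2[:, None] * t**2

EF_css = relative_efficiency(alpha)                  # time-varying scores, shape (3, 4)
EF_ss = relative_efficiency(a0[:, None])             # time-invariant SS case, shape (3, 1)
```

By construction the best unit in each period receives a score of one and all other scores lie between zero and one.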
Druska and Horrace [33] implicitly assumed in eq. (19) that λ = θ = 0.
They ignored any spatial correlation in the frontier because they interpreted
the production function as a purely deterministic (engineering) process where the
production units control all the inputs. This assumption allowed them to focus their
application on spatial correlations associated with the noise term as they developed
a spatial error (SEM) model with time-invariant fixed effects, which were used later
to calculate unit-specific efficiency scores using the SS estimator. SEM models
have not been very popular in non-frontier settings because the spatial dependence
that is accounted for in these models is not a representation of substantive economic
spillovers [86]. The residual term in this model is assumed to follow a SAR process,
that is, $\varepsilon_{it} = v_{it} + \rho \sum_{j \neq i}^{N} w_{ij} \varepsilon_{jt}$. The error term therefore consists of two
components, an idiosyncratic noise term ($v_{it}$) and a spatial component that relates a
unit’s random shocks to the random shocks of neighboring units. Glass et al. [51] is
a similar type of study, as they use the fixed effects from a SAR stochastic frontier
model to estimate time-varying efficiency using the CSS approach. They use
maximum likelihood techniques, as do Glass et al. [52], who extended the CSS
methodology to the spatial autoregressive case and estimated direct, indirect, and
total efficiencies for each production unit.

3 The content of this section draws heavily on Orea and Álvarez [97].

Distribution-Based Models

The second group of spatial stochastic frontier models follows most of the
nonspatial stochastic frontier literature by making assumptions about the distribution
of both the noise and inefficiency terms. This group is not entirely homogeneous,
as some papers allow the frontier to be spatially correlated across production
units (e.g., [1, 54]), while other papers allow the error terms to be spatially
correlated (e.g., [9, 114, 122]). Most of these models can be summarized using
the following single-input production function:

$$\ln Y_{it} = \alpha + \beta \ln X_{it} + \lambda \sum_{j \neq i}^{N} w_{ij} \ln Y_{jt} + \theta \sum_{j \neq i}^{N} w_{ij} \ln X_{jt} + v_{it} - u_{it} \tag{20}$$

This equation includes two error terms, vit and uit . While the former term
is a symmetric error term measuring pure random shocks, the latter term is a
nonnegative error term measuring unit-specific inefficiency. The above equation
describes the so-called spatial Durbin frontier (SDF) model proposed by Glass
et al. [54] that accounts for both local and global spatial interactions. If we assume
in (20) that θ = 0, we get the so-called spatial autoregressive stochastic frontier
(SARF) model. If in addition we assume that λ = 0, we get the traditional nonspatial
stochastic frontier model. If we, however, assume that one or both error terms
in (20) are spatially correlated, we get a spatial error stochastic frontier (SEF)
model. The latter model can be viewed as a vehicle to resuscitate the interest of
the scientific community, policymakers, and regulators in SEM models because,
unlike the traditional SEM model, we have two random terms in a frontier analysis
framework and controlling for spatial spillovers in both random terms does matter
due to the (different) economic consequences of such correlations. This conclusion
is motivated later in section “Final Remarks.”
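
The nesting of these specifications can be made concrete with a small data-generating sketch. Stacking (20) over units gives the reduced form $\ln Y_t = (I - \lambda W)^{-1}(\alpha + \beta \ln X_t + \theta W \ln X_t + v_t - u_t)$, and the restricted models follow by switching parameters off. The ring-shaped weight matrix, the parameter values, and the half-normal draw for the inefficiency term are assumptions made only for this illustration.

```python
import numpy as np

def frontier_log_output(ln_x, W, alpha, beta, lam, theta, v, u):
    """Reduced form of (20) for one period, solved for the vector ln Y."""
    n = ln_x.shape[0]
    rhs = alpha + beta * ln_x + theta * (W @ ln_x) + v - u
    return np.linalg.solve(np.eye(n) - lam * W, rhs)

rng = np.random.default_rng(42)
n = 4
W = np.zeros((n, n))
for i in range(n):                                   # hypothetical ring: two neighbors each
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5

ln_x = rng.normal(size=n)
v = rng.normal(scale=0.1, size=n)                    # symmetric noise
u = np.abs(rng.normal(scale=0.2, size=n))            # assumed half-normal inefficiency

ln_y_sdf = frontier_log_output(ln_x, W, 1.0, 0.6, lam=0.3, theta=0.1, v=v, u=u)   # SDF
ln_y_sarf = frontier_log_output(ln_x, W, 1.0, 0.6, lam=0.3, theta=0.0, v=v, u=u)  # SARF
ln_y_sf = frontier_log_output(ln_x, W, 1.0, 0.6, lam=0.0, theta=0.0, v=v, u=u)    # nonspatial
```

Setting both spatial parameters to zero reproduces the familiar nonspatial frontier, in which each unit's output depends only on its own input and error terms.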
Another restricted specification of (20) is provided by Adetutu et al. [1], who
propose a stochastic SLX frontier model because they only include spatial lags of
the exogenous variables as frontier determinants (i.e., they assume λ = 0 but
allow θ to take nonzero values). As they limit their analysis to local spatial
dependence, their model can be estimated using the standard procedures for the
nonspatial stochastic frontier. This model, however, overlooks any global spatial
dependence as they omit the endogenous autoregressive variable. Glass et al. [54]
estimate a SARF model that only accounts for global spillovers, and a SDF model
that accounts for both global and local spatial dependence. To minimize issues
relating to convergence, these authors adopt a pseudo maximum likelihood estimator
and estimate their model in two steps, first estimating a non-frontier SDM model
and then splitting the first-stage residuals into the idiosyncratic error and time-
variant efficiency. Gude et al. [60] generalize the above SARF and SDF models
in two aspects. First, they allow for heteroscedastic specifications of the inefficiency
term. Second, both models allow the researchers to identify the determinants of the
spatial dependence among the Spanish provinces. A parallel paper focusing on
EU regions is Ramajo and Hewings [107], who explicitly consider (common) spatial
spillover effects by including a spatial lag of the dependent variable in the frontier.
Another set of papers allows the inefficiency term to be spatially correlated. In
these papers, the one-sided error term consists of two components, an idiosyncratic
one and a spatial component that relates a unit’s inefficiency to the inefficiency
of neighboring units. Standard maximum likelihood techniques are not used here
because the addition of spatially lagged inefficiency terms does not yield a closed
form for the likelihood function, so several computational algorithms have been proposed
to conduct simulation-based inference and efficiency measurement. For instance,
Areal et al. [9] avoid this issue by using a Gibbs sampler and two Metropolis-
Hastings steps to estimate the spatial dependence of firms’ efficiency. A similar
model is proposed by Tsionas and Michaelides [122], who develop a Bayesian
estimator for a model that allows for spillover effects in inefficiency. Schmidt et al.
[114] also adopt a Bayesian approach to estimate a variety of spatial stochastic
frontier models. In contrast to the previous papers, their inefficiency term does
not follow a spatial autoregressive process, but it depends on a latent (unobserved)
local effect. In several specifications, they assume that the local effects follow a
conditional autoregressive distribution which depends on its neighbors. Similarly,
Herwartz and Strumann [64] estimate a frontier model with region-specific random
effects in the inefficiency term that allows for spatial dependence. As their like-
lihood function does not attain a closed-form solution, the model is estimated by
simulated ML.
Previous spatial stochastic frontier models have focused solely on spatial
spillovers in either the inefficiency term or the noise term. Thus, they have tended
to neglect one or the other of these sources of spatial correlation. Orea and Álvarez
[97] have recently proposed a new stochastic frontier model that permits separate
but simultaneous analyses of the spatial correlations of both noise and inefficiency
terms, which are likely to be of a different nature. Their model can be written as:

$$\ln Y_{it} = \beta \ln X_{it} + \tilde{v}_{it}(\rho) - \tilde{u}_{it}(\tau) \tag{21}$$

where now the noise and inefficiency terms are spatially correlated using spatial
moving average (SMA) or spatial autoregressive (SAR) stochastic processes, and
the coefficients ρ and τ measure the degrees of spatial correlation between firms’
noise and inefficiency terms, respectively. In order to get a closed form for the
likelihood function, Orea and Álvarez [97] assumed that the basic inefficiency term
uit possesses the scaling property in the sense that the idiosyncratic inefficiency
term can be written as a function of exogenous variables times an industry-
specific or economy-wide inefficiency term. The above specification implies that
the distribution of the inefficiency term is not affected by the spatial transformation.
This is the crucial aspect of the model that enables them to get a tractable likelihood
function that can be maximized using standard software. The above authors also
estimated a portion of the model using nonlinear least squares (NLLS) with no
distributional assumptions, except exogeneity of all explanatory variables.
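
One way to read the role of the scaling property is sketched below. The code assumes that the basic inefficiency term is $u_{it} = h_{it} u_t^*$, with a hypothetical scaling function $h_{it} = \exp(\delta z_{it})$ and a common one-sided term $u_t^*$, and that the spatial transformation is of the SAR type, $\tilde{u}_t = (I - \tau W)^{-1} u_t$. These functional forms and all numerical values are illustrative assumptions, not the specification estimated in [97].

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
W = np.zeros((n, n))
for i in range(n):                          # hypothetical ring-shaped weight matrix
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5

tau = 0.4                                   # spatial correlation of inefficiency
z = rng.normal(size=n)                      # hypothetical exogenous determinants
h = np.exp(0.5 * z)                         # assumed scaling function, strictly positive
u_star = abs(rng.normal())                  # common one-sided term (assumed half-normal)

u = h * u_star                              # basic inefficiency with the scaling property
M = np.linalg.inv(np.eye(n) - tau * W)      # SAR-type spatial transformation
u_tilde = M @ u                             # spatially correlated inefficiency
h_tilde = M @ h                             # spatially transformed scaling function
```

Under these assumptions `u_tilde` equals `h_tilde * u_star`: the spatial transformation only rescales the deterministic scaling function, while the random component keeps its original distribution.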

Estimating Efficiency in Spatial Frontier Models4

The presence of the endogenous autoregressive variable in the spatial frontier
model requires correcting the individual efficiency estimates, i.e., $\xi_{it} = \exp(-u_{it})$,
that have been obtained using Jondrow et al. [69] or Schmidt and Sickles [113].
Glass et al. [54] and Kutlu [80] have suggested two alternative methods to carry
out this adjustment. While Glass et al. [54] use the technical efficiency
measure $TE_t = (I - \lambda W)^{-1} \xi_t$, Kutlu [80] proposes estimating the total efficiency
as $TE_t = \exp[-(I - \lambda W)^{-1} u_t]$, where $\xi_t$ and $u_t$ stack the N unit-specific terms in
period t. Note that Kutlu’s efficiency calculation has the
global multiplier $(I - \lambda W)^{-1}$ inside the exponential operator, whereas Glass et al.’s
efficiency calculation has the global multiplier outside the exponential operator. This
subtle difference has important practical consequences.
If we apply the global multiplier after the exponential operator, $TE_{it}$ might be larger
than unity, whereas a score not exceeding unity is a necessary condition for total
efficiency to be well-defined.
In order to address this concern, Glass et al. [54] adapt the Schmidt and Sickles
[113] method and compute relative efficiencies by normalizing the above (absolute)
efficiencies by that of the most efficient observation. As pointed out by Kutlu [80], this
approach is, however, sensitive to the best performer in each period being an
outlier. His proposal is in line with the distribution-based methods, as he does not
need to carry out any posterior normalization: $(I - \lambda W)^{-1}$ is always nonnegative as
long as 0 ≤ λ < 1, so the exponent is nonpositive and the resulting score cannot
exceed unity.
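
A short numerical sketch makes the contrast between the two adjustments visible. The weight matrix, λ, and the inefficiency values are hypothetical; the normalization by the best performer follows the description of Glass et al. [54] given above.

```python
import numpy as np

# Hypothetical row-normalized weight matrix and inefficiency estimates.
W = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])
lam = 0.3
u = np.array([0.05, 0.20, 0.40])            # estimated inefficiency terms u_it
xi = np.exp(-u)                             # own efficiency scores

M = np.linalg.inv(np.eye(3) - lam * W)      # global multiplier (I - lam*W)^{-1}

te_glass_abs = M @ xi                       # multiplier outside the exponential
te_glass_rel = te_glass_abs / te_glass_abs.max()   # normalized by the best performer
te_kutlu = np.exp(-(M @ u))                 # multiplier inside the exponential
```

In this example the absolute scores with the multiplier outside the exponential exceed one, so the normalization step is needed, whereas the scores with the multiplier inside the exponential lie in the unit interval without any further adjustment.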

Spatial TFP Growth Decomposition

Interest in the analysis of productivity at the regional level has grown considerably in
recent years, as productivity growth is one of the most important drivers of
regional income. Thus, analyzing how regional productivity evolves over time is
essential to provide insights for the promotion of productivity growth in the future.
In the recent literature analyzing spillover effects on productivity at the regional level,
there is a general consensus about the importance of spillover effects. A relevant
contribution to this research topic using production functions is Baltagi et al. [14].
In addition to capturing spatial correlations in the model using a modified Hausman-
Taylor approach, that study allows for inter-sectoral spillovers that affect firms’
productivity in China’s chemical industry.

4 This subsection is inspired by Kutlu [80].
An estimated frontier production function can constitute the building block for
the measurement of TFP growth and its decomposition into its basic sources. The
traditional nonspatial TFP growth decomposition (see, e.g., [96]) includes three
components: changes in technical efficiency (EC), technical change (TC), and a
scale effect (SE) that relies on scale elasticity values and on changes in input
quantities, and therefore it vanishes under the assumption of constant returns to
scale or constant input quantities.
Glass et al. [51] extend the standard TFP growth decomposition to include direct
(own) and indirect (spillover) components using a spatial autoregressive production
frontier model. In particular, they estimate a fixed-effect SAR spatial panel model
using maximum likelihood techniques. Once the model is estimated, they compute
time-varying efficiency scores from the cross-sectional specific effects using CSS.
They next use the so-called quadratic identity lemma to obtain the following TFP
growth decomposition:

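$$\dot{TFP}_{it} = \dot{TE}_{it} + \eta^{Dir}_{it} + \sum_{k=1}^{K} \overline{SF}^{Dir}_{kit} \dot{X}_{kit} + \eta^{Ind}_{it} + \sum_{k=1}^{K} \overline{SF}^{Ind}_{kit} \dot{X}_{kit} \tag{22}$$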

where a dot over a variable stands for its rate of growth, a bar over a variable stands
for the arithmetic average over t and t − 1, $\eta_{it}$ and $e_{it}$ are output elasticities with respect
to time and input levels, $SF_{kit}$ is a scale factor that vanishes under constant returns
to scale, and Dir and Ind denote elasticities and scale factors which are calculated
using the relevant direct and indirect marginal effects. The first three terms in (22)
are respectively the direct (own) EC, TC, and SE productivity components that can
be found in nonspatial TFP growth decompositions. However, the above direct
components differ from the standard nonspatial ones because they contain feedback
effects, i.e., effects which pass through other units via the spatial multiplier matrix
and back to the unit which initiated the change. The last two components above are
indirect components associated respectively with technical change and the scale effect,
which do not appear in a nonspatial setting. A more compact decomposition can be
obtained if we aggregate the direct and indirect components above into two total
effects, that is:

$$\dot{TFP}_{it} = \dot{TE}_{it} + \eta^{Tot}_{it} + \sum_{k=1}^{K} \overline{SF}^{Tot}_{kit} \dot{X}_{kit} \tag{23}$$
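
The relation between the detailed decomposition (22) and the compact one (23) can be checked with a few lines of Python. The sketch assumes that each total term is simply the sum of its direct and indirect counterparts, as the aggregation described above suggests, and uses made-up values for a single unit and period with K = 2 inputs.

```python
import numpy as np

def tfp_growth_detailed(te_dot, eta_dir, eta_ind, sf_dir, sf_ind, x_dot):
    """Direct and indirect parts of (22); sf_* and x_dot are length-K arrays."""
    direct = te_dot + eta_dir + sf_dir @ x_dot     # EC + direct TC + direct SE
    indirect = eta_ind + sf_ind @ x_dot            # indirect TC + indirect SE
    return direct, indirect

def tfp_growth_compact(te_dot, eta_tot, sf_tot, x_dot):
    """Compact decomposition as in (23)."""
    return te_dot + eta_tot + sf_tot @ x_dot

# Hypothetical growth rates and scale factors.
te_dot = 0.010
eta_dir, eta_ind = 0.015, 0.004
sf_dir = np.array([0.02, -0.01])
sf_ind = np.array([0.005, 0.003])
x_dot = np.array([0.03, 0.01])

direct, indirect = tfp_growth_detailed(te_dot, eta_dir, eta_ind, sf_dir, sf_ind, x_dot)
tfp_22 = direct + indirect
tfp_23 = tfp_growth_compact(te_dot, eta_dir + eta_ind, sf_dir + sf_ind, x_dot)
```

Both routes give the same TFP growth rate; the detailed version additionally shows how much of it is attributable to spillovers.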

Later on, Glass and Kenjegalieva [50] extended their previous TFP growth
decomposition by adding spatial spillovers associated with the change in technical
efficiency. Their extension, however, relies on a different spatial stochastic frontier
model because they compute firms’ efficiency using the spatial SAR and Durbin
stochastic (cost) frontiers introduced by Glass et al. [53, 54] that are estimated using
(pseudo) maximum likelihood techniques.
Once individual efficiency scores are obtained, they estimate the direct, indirect,
and total efficiencies using the method outlined in section “Estimating Efficiency in
Spatial Frontier Models.” That is, direct efficiency for a unit is interpreted in the same way
as own efficiency from a nonspatial model but, in contrast, comprises own efficiency
plus efficiency feedback. The indirect efficiency is the sum of the efficiency
spillovers to a unit from all the other units in the sample. Total efficiency is the sum
of its direct and indirect efficiencies. They next include the growth of these three
efficiencies in their new spatial decomposition of TFP growth. In addition,
they include direct, indirect, and total allocative efficiency growth
components, which can be viewed as an extension of the allocative efficiency growth
component introduced by Kumbhakar and Lovell [79] in a nonspatial setting. They
propose a four-component spatial TFP growth decomposition:
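$$\dot{TFP}_{it} = \eta^{Tot}_{it} + \sum_{k=1}^{K} \overline{SF}^{Tot}_{kit} \dot{X}_{kit} + \dot{TE}^{Tot}_{it} + \sum_{k=1}^{K} \left( s^{Tot}_{kit} - e^{Tot}_{kit} \right) \dot{p}_{kit} \tag{24}$$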

where $s^{Tot}_{k}$ is the total input expenditure share weight and $p_k$ stands for input prices.
The first two components in (24) are the TC and SE productivity components that
already appeared separated into direct and indirect effects in eq. (22). The third
component captures the impact of a rise or fall in total efficiency, which Glass and
Kenjegalieva [50] in turn decompose into its direct and indirect parts. The last term
captures the effect of a change in total allocative efficiency (AE), which again can
be decomposed into direct and indirect changes in allocative efficiency.

Final Remarks

Spatial spillovers can be defined as the impact of changes to input (explanatory)
variables in a unit on the output (dependent) variable values in other units. Units
could be firms, cities, regions, and so forth depending on the nature of the study.
As Vega and Elhorst [127] point out, a valuable aspect of spatial econometric
models is that the magnitude and significance of spatial spillovers can be empirically
assessed. To achieve this aim, several spatial specifications have been proposed in
the literature that rely on imposing model structure in the form of a spatial weight
matrix W, which reduces the number of parameters to be estimated.
However, this literature has been criticized because it often lacks a theoretical
background and suffers from non-negligible identification problems: it is
generally difficult to distinguish different spatial models from each other without
assuming prior knowledge about the true data-generating process, which is often
not possessed in practice (see, e.g., [31, 101]). The same applies to the weight
matrix, as the true W is generally unknown. Several papers have tried to address this
issue by combining several spatial weight matrices that are often used in the spatial
econometric literature to capture spatial spillovers. To achieve this aim, Case et al.
[22] and Qu and Lee [106] used a known function of geographic and economic
distance between units. While Sun [119] does so using nonparametric techniques,
the generalized spatial stochastic frontier models introduced by Gude et al. [60] also
use a parametric function to estimate a combination of spatial weight matrices.
Another identification problem highlighted by Gibbons and Overman [48] and
Vega and Elhorst [127] occurs when the unknown parameters of a model cannot
be uniquely recovered from their reduced-form specification even if the spatial
econometric model and W are correctly specified. Gibbons and Overman [48]
propose the use of natural experiments and microeconomic data sets, a solution
that is often not feasible in standard applications in production economics, where it
is compulsory to use real data. Vega and Elhorst [127] suggest taking the SLX model
as the point of departure, unless the researcher has an underlying theory or coherent
economic argument pointing toward a different model. They show that the SLX
specification is not only more flexible in modeling spatial spillover effects than other
specifications but also the simplest one. Moreover, in contrast to other spatial
econometric models, standard instrumental variables (IV) approaches can be used
to investigate whether (part of) the input variables and their spatially lagged values
are endogenous.
The authors of this chapter share this view: applied researchers in production
economics are encouraged to find sound economic arguments to first justify
the existence of spatial spillovers, and second to select the appropriate spatial
specification of their production (cost, profit or distance) functions when spillovers
are expected. However, as the economic arguments in production economics are
of a different nature than in other research fields, the preferred specifications in
each field may differ. In this sense, it is worth mentioning that the SAR, SLX, or
SDM specifications are the preferred specifications in standard (i.e., non-frontier)
spatial econometric settings, because the spatial dependence that is accounted
for in the SEM model is often not a representation of substantive economic
spillovers. Notice as well that the spatial spillovers in the SAR, SLX, and SDM
specifications are treated as determinants of the estimated production/cost function,
i.e., as technological drivers. This treatment is, however, more difficult to justify in
production analyses using firm-level data. For instance, Orea et al. [98] did not use
a frontier-based spatial specification in their application to electricity distribution
firms because the Norwegian regulator did not see major systemic technical reasons
for the cost of an electricity distribution firm to be affected by those of an adjacent
firm to any significant degree. Similarly, Druska and Horrace [33] point out that
we do not need a model with spatial correlations in the frontier if the technology is
viewed as a purely deterministic (engineering) process where the firm controls all
the inputs.
Moreover, in a stochastic frontier analysis of firms’ efficiency, we have two
random terms. Orea and Álvarez [97] state that controlling for spatial spillovers in
both noise and inefficiency terms does matter due to the significant and different
economic consequences of such correlations. They argue that while the spatial
specification of the noise terms is likely capturing an environmentally induced
correlation, the spatial specification of the inefficiency term will likely capture
a kind of behavioral correlation. Both spatial specifications of the error terms
are important, although for different reasons. On one hand, a model specification
with spatial correlation in the noise term is useful as it accounts for unobserved
but spatially correlated variables that if ignored might result in biased estimates
of efficiency scores [98]. Thus, the effect of spatially correlated error terms in
a stochastic frontier model is not as benign as in a standard spatial econometric
model. On the other hand, a model with spatial correlation in the inefficiency term
is useful when firms tend to “keep an eye” on the decisions of peer firms
to overcome the limitations caused by a lack of information [91], when firms
are regulated using benchmarking techniques [97], or when they simply emulate each
other [9]. As these issues provide interesting information on firms’ performance, the
recent spatial stochastic frontier literature is resuscitating the interest of the scientific
community, policymakers, and managers in spatial error-based models, which have
not been very popular so far.
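The two channels can be illustrated with a small simulation of a composed error in which both the noise and the inefficiency terms are spatially correlated. This is a sketch under simplifying assumptions (spatial autoregressive processes for both terms, half-normal inefficiency shocks, and hypothetical parameter values). It is not the estimator or the exact specification of Orea and Álvarez [97] or of any other paper cited here.

```python
import numpy as np

def spatial_frontier_errors(W, lam, tau, sigma_v=0.1, sigma_u=0.2, seed=0):
    """Draw spatially correlated noise v and inefficiency u (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    I = np.eye(n)
    eps = rng.normal(scale=sigma_v, size=n)          # idiosyncratic noise shocks
    eta = np.abs(rng.normal(scale=sigma_u, size=n))  # half-normal inefficiency shocks
    v = np.linalg.solve(I - lam * W, eps)            # noise: can take either sign
    # With a nonnegative, row-normalized W and 0 <= tau < 1, u stays nonnegative
    u = np.linalg.solve(I - tau * W, eta)
    return v, u

# Four hypothetical firms on a line, each linked to its immediate neighbors
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
W = A / A.sum(axis=1, keepdims=True)

v, u = spatial_frontier_errors(W, lam=0.3, tau=0.5)
ln_frontier = np.full(4, 2.0)   # assumed log frontier output, identical across firms
ln_y = ln_frontier + v - u      # observed log output
efficiency = np.exp(-u)         # technical efficiency implied by u
```

Here `lam` governs the environmentally induced correlation in the noise, while `tau` governs the behavioral correlation in inefficiency. Setting either to zero switches off the corresponding channel.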
The spatial production models are also useful when some firms benefit from (best
practices implemented in) adjacent firms. As Vidoli et al. [128] pointed out, this
could especially be the case if local firms belong to communitarian networks (e.g.,
cooperatives) characterized by a collaborative environment, exchange of technical
advice and continuous interaction, or common technicians (consultants) are advising
all local firms. In this sense, the literature on spatial production economics summa-
rized in this chapter is highly related to emerging literature on network production
functions, where the network structure is endogenous. For instance, Horrace et al.
[66] develop a model where a worker’s productivity is a function of the productivities
of the co-workers on her team or, in our spatial framework, where a firm’s production
is a function of other firms’ production. Horrace and Jung [65] propose a similar
model but in a stochastic frontier framework, where worker-level inefficiency is
correlated with a manager’s selection of worker teams. As the endogeneity of the
network structure (i.e., the W matrix in a spatial setting) is of primary concern in
this literature, various estimation techniques have been recently developed in the
econometrics of network literature to address this issue. As Gibbons et al. [49] point
out, these methods are likely to be very helpful in spatial settings in order to deal with the
endogeneity of some popular economic-based weight matrices.

Cross-References

Economics of Externalities: An Overview

Neoclassical Production Economics: An Introduction
Scale Elasticity and Returns to Scale
Stochastic Frontier Analysis: Foundations and Advances I
Stochastic Frontier Analysis: Foundations and Advances II

Acknowledgments This chapter was supported by the Spanish Ministry of Economics, Industry
and Competitiveness (Grant MINECO-18-ECO2017-85788-R). The authors gratefully acknowledge
the two “Salvador de Madariaga” grants obtained from the Spanish Ministry of Science, Innovation
and Universities (Grants PRX19/00596 and PRX19/00589). We also wish to acknowledge helpful
suggestions from an anonymous reviewer.

References
1. Adetutu M, Glass AJ, Kenjegalieva K, Sickles RC (2015) The effects of efficiency and TFP
growth on pollution in Europe: a multistage spatial analysis. J Prod Anal 43:307–326
2. Álvarez I, Barbero J (2016) The public sector and convergence with spatial interdependence:
empirical evidence from Spain. Appl Econ 48(24):2238–2252
3. Álvarez I, Delgado MJ (2012) High capacity road networks and spatial spillovers in Spanish
regions. J Transp Econ Policy 46:281–292
4. Álvarez I, Condeço-Melhorado AM, Gutiérrez J, Zofío JL (2016a) Integrating network
analysis with the production function approach to study the spillover effects of transport
infrastructure. Reg Stud 50(6):996–1015
5. Álvarez I, Barbero J, Zofio JL (2016b) A spatial autoregressive panel model to analyze road
network spillovers on production. Transp Res A Policy Pract 93:83–92
6. Anselin L (1988) Spatial econometrics: methods and models. Kluwer, Dordrecht
7. Arbia G, Espa G, Giuliani D, Mazzitelli A (2010) Detecting the existence of space-
time clustering of firms. In: Arbia G, Kelejian H (eds) Advances in spatial econometrics,
pp 311–323
8. Arbues P, Baños JF, Mayor M (2015) The spatial productivity of transportation infrastructure.
Transp Res A Policy Pract 75:166–177
9. Areal FJ, Balcombe K, Tiffin R (2012) Integrating spatial dependence into stochastic
frontier analysis. Aust J Agric Resour Econ 56:521–541
10. Aschauer DA (1989) Is public expenditure productive? J Monet Econ 23(2):177–200
11. Audretsch DB, Feldman MP (1996) R&D spillovers and the geography of innovation and
production. Am Econ Rev 86:630–640
12. Audretsch DB, Feldman MP (2004) Knowledge spillovers and the geography of innovation.
In: Henderson JV, Thisse JF (eds) Handbook of regional and urban economics. Elsevier, San
Diego, pp 2713–2739
13. Baltagi BH, Liu L (2011) Instrumental variable estimation of a spatial autoregressive panel
model with random effects. Econ Lett 111(2):135–137
14. Baltagi BH, Egger P, Kesina M (2016) Firm-level productivity spillovers in China’s chemical
industry: a spatial Hausman-Taylor approach. J Appl Econ 31:214–248
15. Black SE, Lynch LM (2000) What’s driving the new economy: the benefits of workplace
innovation. NBER Working Paper No. 7479, January
16. Bloom N, Schankerman M, van Reenen J (2013) Identifying technology spillovers and
product market rivalry. Econometrica 81:1347–1393
17. Bloom N, Lucking B, van Reenen J (2018) Have R&D Spillovers Changed? CEP Discussion
Paper No 1548 May 2018
18. Boarnet MG (1998) Spillovers and the locational effects of public infrastructure. J Reg Sci
38:381–400
19. Bottazzi L, Peri G (2003) Innovation and spillovers in regions: evidence from European patent
data. Eur Econ Rev 47(4):687–710
20. Bresnahan TF, Brynjolfsson E, Hitt LM (2002) Information technology, workplace organiza-
tion, and the demand for skilled labor: firm-level evidence. Q J Econ 117:339–376
21. Burnside C, Eichenbaum M (1996) Factor-hoarding and the propagation of business-cycle
shocks. Am Econ Rev 86:1154–1174
22. Case AC, Rosen HS, Hines JR (1993) Budget spillovers and fiscal-policy interdependence:
evidence from the states. J Public Econ 52:285–307
23. Charlot S, Crescenzi R, Musolesi A (2015) Econometric modelling of the regional knowledge
production function in Europe. J Econ Geogr 15(6):1227–1259
24. Ciccone A, Hall RE (1996) Productivity and the density of economic activity. Am Econ Rev
86(1):54–70
25. Cohen JP, Morrison PCJ (2004) Public infrastructure investment, interstate spatial spillovers,
and manufacturing costs. Rev Econ Stat 86(2):551–560
26. Combes PP, Gobillon L (2015) The empirics of agglomeration economies. In: Duranton G,
Henderson JV, Strange WC (eds) Handbook of urban and regional economics, vol 5. Elsevier,
Netherlands
27. Combes P, Overman H (2004) The spatial distribution of economic activities in the European
Union. In: Henderson V, Thisse J-F (eds) Handbook of regional and urban economics: cities
and geography. Elsevier, Amsterdam
28. Combes PP, Mayer T, Thisse JF (2008) Economic geography: the integration of regions and
nations. Princeton University Press, Princeton
29. Combes PP, Duranton G, Gobillon L, Puga D, Roux S (2012) The productivity advantages of
large markets: distinguishing agglomeration from firm selection. Econometrica 80(6):2543–
2594
30. Cornwell C, Schmidt P, Sickles RC (1990) Production frontiers with cross-sectional and time-
series variation in efficiency levels. J Econ 46:185–200
31. Corrado L, Fingleton B (2012) Where is the economics in spatial econometrics? J Reg Sci
52(2):210–239
32. Cunha P, Neves T (2018) Spillovers in the production of knowledge: a meta-regression
analysis. Res Policy 47:750–767
33. Druska V, Horrace WC (2004) Generalized moments estimation for spatial panel data:
Indonesian rice farming. Am J Agric Econ 86:185–198
34. Duggal V, Saltzman C, Klein L (1999) Infrastructure and productivity: a nonlinear approach.
J Econ 92(1):47–74
35. Duranton G, Overman HG (2005) Testing for localization using micro-geographic data. Rev
Econ Stud 72(4):1077–1106
36. Duranton G, Puga D (2000) Diversity and specialization in cities: why, where and when does
it matter? Urban Stud 37(3):533–555
37. Eberhardt M, Helmers C, Strauss H (2013) Do spillovers matter when estimating private
returns to R&D? Rev Econ Stat 95(2):436–448
38. Eberts RW, McMillen DP (1999) Agglomeration economies and urban public infrastructure.
In: Cheshire P, Mills ES (eds) Handbook of regional and urban economics. Applied urban
economics, vol 3. Elsevier Science, Amsterdam, pp 1455–1495
39. Elhorst JP (2003) Unconditional maximum likelihood estimation of dynamic models for
spatial panels. Research Report 03C27. University of Groningen, Research Institute SOM
(Systems, Organizations and Management), Groningen
40. Elhorst JP (2010) Applied spatial econometrics: raising the bar. Spat Econ Anal 5(1):9–28
41. Elhorst JP (2014) Spatial panel models. In: Fischer MM, Nijkamp P (eds) Handbook of
regional science. Springer, Berlin
42. Ellison G, Glaeser EL (1999) The geographic concentration of industry: does natural
advantage explain agglomeration? Am Econ Rev Pap Proc 89(2):311–316
43. Ertur C, Koch W (2007) Growth, technological interdependence and spatial externalities:
theory and evidence. J Appl Econ 22:1033–1062
44. Fingleton B, López-Bazo E (2006) Empirical growth models with spatial effects. Pap Reg Sci
85:177–198
45. Fischer M (2011) A spatial Mankiw-Romer-Weil model: theory and evidence. Ann Reg Sci
47(2):419–436
46. Fujita M, Krugman PR, Venables AJ (1999) The spatial economy: cities, regions and
international trade. MIT Press, Cambridge
47. Gajanan S, Malhotra D (2007) Measures of capacity utilization and its determinants: a study
of Indian manufacturing. Appl Econ 39:765–776
48. Gibbons S, Overman HG (2012) Mostly pointless spatial econometrics? J Reg Sci 52:
172–191
49. Gibbons S, Overman HG, Patacchini E (2015) Spatial methods. In: Handbook of regional and
urban economics, vol 5. Elsevier, Netherlands, pp 115–168
50. Glass AJ, Kenjegalieva K (2019) A spatial productivity index in the presence of efficiency
spillovers: evidence for US banks 1992–2015. Eur J Oper Res 273:1165–1179
51. Glass AJ, Kenjegalieva K, Paez-Farrell J (2013) Productivity growth decomposition using a
spatial autoregressive frontier model. Econ Lett 119(3):291–295
52. Glass AJ, Kenjegalieva K, Sickles RC (2014) Estimating efficiency spillovers with state level
evidence for manufacturing in the US. Econ Lett 123(2):154–159
53. Glass AJ, Kenjegalieva K, Sickles RC (2016a) Returns to scale and curvature in the presence
of spillovers: evidence from European countries. Oxf Econ Pap 68(1):40–63
54. Glass AJ, Kenjegalieva K, Sickles RC (2016b) A spatial autoregressive stochastic frontier
model for panel data with asymmetric efficiency spillovers. J Econ 190(2):289–300
55. Golgher AB, Voss PR (2016) How to interpret the coefficients of spatial models: spillovers,
direct and indirect effects. Spatial Demogr 4(3):175–205
56. Gordon RJ (2000) Does the ‘New Economy’ measure up to the great inventions of the past? J
Econ Perspect 14(4):49–77
57. Gråsjö U (2006) Spatial spillovers of knowledge production: an accessibility approach
(Doctoral dissertation, Internationella Handelshögskolan)
58. Griliches Z (1979) Issues in assessing the contribution of research and development to
productivity growth. Bell J Econ 10(1):92–116
59. Griliches Z (1991) The search for R&D spillovers. Working Paper No. 3768, NBER Working
Papers Series
60. Gude A, Alvarez I, Orea L (2018) Heterogeneous spillovers among Spanish provinces: a
generalized spatial stochastic frontier model. J Prod Anal 50(3):155–173
61. Gumbau-Albert M, Maudos J (2009) Patents, technological inputs and spillovers among
regions. Appl Econ 41(12):1473–1486
62. Hall BH, Mairesse J (1995) Exploring the relationship between R&D and productivity in
French manufacturing firms. J Econ 65:263–293
63. Han F, Xie R, Fang J (2018) Urban agglomeration economies and industrial energy efficiency.
Energy 162:45–59
64. Herwartz H, Strumann C (2014) Hospital efficiency under prospective reimbursement
schemes: an empirical assessment for the case of Germany. Eur J Health Econ 15(2):
175–186
65. Horrace WC, Jung H (2018) Stochastic frontier models with network selectivity. J Prod Anal
50(3):101–116
66. Horrace WC, Liu X, Patacchini E (2016) Endogenous network production functions with
selectivity. J Econ 190(2):222–232
67. Hulten CR, Bennathan E, Srinivasan S (2005) Infrastructure, externalities, and economic
development: a study of Indian manufacturing industry. World Bank, Mimeo, Washington DC
68. Jaffe AB (1986) Technological opportunity and spillovers of R&D: evidence from firms’
patents, profits and market value. Am Econ Rev 76(5):984–1001
69. Jondrow J, Lovell CK, Materov IS, Schmidt P (1982) On the estimation of technical
inefficiency in the stochastic frontier production function model. J Econ 19(2–3):233–238
70. Jorgenson DW, Stiroh KJ (2000) Raising the speed limit: U.S. economic growth in the
information age. Brookings Papers on Economic Activity, pp 125–211
71. Kapoor M, Kelejian HH, Prucha IR (2007) Panel data models with spatially correlated error
components. J Econ 140(1):97–130
72. Kelejian HH, Prucha IR (1998) A generalized spatial two-stage least squares procedure for
estimating a spatial autoregressive model with autoregressive disturbances. J Real Estate
Financ Econ 17(1):99–121
73. Kelejian HH, Prucha IR (1999) A generalized moments estimator for the autoregressive
parameter in a spatial model. Int Econ Rev 40(2):509–533
74. Kelejian HH, Prucha IR (2004) Estimation of simultaneous systems of spatially interrelated
cross-sectional equations. J Econ 118(1–2):27–50
75. Kelejian HH, Prucha IR (2007) HAC estimation in a spatial framework. J Econ 140(1):
131–154
76. Kelejian HH, Prucha IR (2010) Specification and estimation of spatial autoregressive models
with autoregressive and heteroskedastic disturbances. J Econ 157(1):53–67
77. Klette TJ, Kortum S (2004) Innovating firms and aggregate innovation. J Polit Econ 112:986–
1018
78. Krugman P (1998) What’s new about the new economic geography? Oxf Rev Econ Policy
14(2):7–17. Oxford University Press
79. Kumbhakar SC, Lovell CAK (2000) Stochastic frontier analysis. Cambridge University Press,
Cambridge, UK
80. Kutlu L (2018) Estimating efficiency in a spatial autoregressive stochastic frontier model.
Econ Lett 163:155–157
81. Lee LF (2004) Asymptotic distributions of quasi-maximum likelihood estimators of spatial
autoregressive models. Econometrica 72:1899–1925
82. Lee L, Yu J (2010) Estimation of spatial autoregressive panel data models with fixed effects.
J Econ 154(2):165–185
83. LeSage JP (2014) What regional scientists need to know about spatial econometrics. Rev Reg
Stud 44:13–32
84. LeSage JP, Chih YY (2016) Interpreting heterogeneous coefficient spatial autoregressive
panel models. Econ Lett 142:1–5
85. LeSage JP, Chih YY (2017, in press) A Bayesian spatial panel model with heterogeneous
coefficients. Reg Sci Urban Econ 72
86. LeSage JP, Pace RK (2009) Introduction to spatial econometrics. CRC Press, Florida, FL
87. Lesage JP, Pace RK (2010) Spatial economic model. Handbook of applied spatial analysis.
Springer, Berlin/Heidelberg, pp 377–407
88. Lychagin S, Pinkse J, Slade ME, van Reenen J (2010) Spillovers in space: does geography
matter? NBER Working Paper No. 16188
89. Mairesse J, Mohnen P, Kremp E (2005) The importance of R&D and innovation for
productivity: a reexamination in the light of the French Innovation Survey. Ann Econ Stat
79(80):486–527
90. Malikov E, Sun Y (2017) Semiparametric estimation and testing of smooth coefficient spatial
autoregressive models. J Econ 199(1):13–34
91. Maté-Sánchez-Val M, López-Hernandez F, Mur-Lacambra J (2017) How do neighbouring
peer companies influence SMEs’ financial behaviour? Econ Model 63:104–114
92. Melo PC, Graham DJ, Noland RB (2009) A meta-analysis of estimates of urban agglomera-
tion economies. Reg Sci Urban Econ 39:332–342
93. Munnell AH (1992) Policy watch: infrastructure investment and economic growth. J Econ
Perspect 6(4):189–198. American Economic Association
94. Mutl J, Pfaffermayr M (2011) The Hausman test in a Cliff and Ord panel model. Econ J
14(1):48–76
95. Ord JK (1975) Estimation methods for models of spatial interaction. J Am Stat Assoc 70:
120–126
96. Orea L (2002) Parametric decomposition of a Generalized Malmquist productivity index. J
Prod Anal 18(1):5–22
97. Orea L, Álvarez I (2019) A new stochastic frontier model with cross-sectional effects in both
noise and inefficiency terms. J Econ 213(2):556–577
98. Orea L, Álvarez I, Jamasb T (2018) A spatial stochastic frontier model with omitted variables:
electricity distribution in Norway. Energy J 39(3):93–117
99. Orea L, Álvarez I, Servén L (2019) A two-level top-down decomposition of aggregate
productivity growth: the role of infrastructure. In: Lovell K, Pastor JT, Zhu Z, Aparicio J
(eds) Advances in efficiency and productivity II. Springer International Series in Operations
Research and Management Sciences, Berlin
100. Pakes A, Griliches Z (1984) Patents and R&D at the firm level: a first look. In: R&D, patents
and productivity. University of Chicago Press, Chicago, IL, pp 55–72
101. Partridge MD, Boarnet M, Brakman S, Ottaviano G (2012) Introduction: whither spatial
econometrics? J Reg Sci 52(2):167–171
102. Pereira AM, Andraz JM (2004) Public highway spending and State spillovers in the USA.
Appl Econ Lett 11:785–788
103. Pereira AM, Andraz JM (2013) On the economic effects of public infrastructure investment:
a survey of the international evidence. J Econ Dev 38(4):1–37
104. Plümper T, Neumayer E (2010) Model specification in the analysis of spatial dependence. Eur
J Polit Res 49(3):418–442
105. Prenzel P, Ortega-Aviles R, Cozza C, Piva M (2018) Interplay between regional and
industrial aspects in the R&D productivity link: evidence from Europe. Reg Stud 52(5):
659–672
106. Qu X, Lee L (2015) Estimating a spatial autoregressive model with an endogenous spatial
weight matrix. J Econ 184:209–232
107. Ramajo J, Hewings GJ (2018) Modeling regional productive performance using a spatial
stochastic frontier approach: new evidence for Europe (1995–2007). Reg Stud 52(10):1372–
1387
108. Rogers M (2010) R&D productivity: using UK firm-level data to inform policy. Empirica
37(3):329–359
109. Romer PM (1986) Increasing returns and long-run growth. J Polit Econ 94(5):1002–1037
110. Romer PM (1987) Growth based on increasing returns due to specialization. Am Econ Rev
72:56–62
111. Rosenthal SS, Strange WC (2004) Evidence on the nature and sources of agglomeration
economies. In: Henderson V, Thisse J-F (eds) Handbook of regional and urban economics,
vol 4. North-Holland, Amsterdam, pp 2119–2171
112. Rosenthal SS, Strange WC (2008) The attenuation of human capital spillovers. J Urban Econ
64:373–389
113. Schmidt P, Sickles RC (1984) Production frontiers and panel data. J Bus Econ Stat 2:
367–374
114. Schmidt AM, Moreira AR, Helfand SM, Fonseca TC (2009) Spatial stochastic frontier
models: accounting for unobserved local determinants of inefficiency. J Prod Anal 31(2):
101–112
115. Stiroh KJ (2002a) Information technology and the US productivity revival: what do the
industry data say? Am Econ Rev 92(5):1559–1576
116. Stiroh KJ (2002b) Are ICT spillovers driving the new economy? Rev Income Wealth
48(1):33–57
117. Straub S (2011) Infrastructure and development: a critical appraisal of the macro-level
literature. J Dev Stud 47(05):683–708
118. Strobel T (2016) ICT intermediates and productivity spillovers-evidence from German and
US manufacturing sectors. Struct Chang Econ Dyn 37:147–163
119. Sun Y (2016) Functional-coefficient spatial autoregressive models with nonparametric spatial
weights. J Econ 195(1):134–153
120. Sun Y, Malikov E (2018) Estimation and inference in functional-coefficient spatial autore-
gressive panel data models with fixed effects. J Econ 203(2):359–378
121. Tiefelsdorf M, Griffith DA, Boots B (1999) A variance-stabilizing coding scheme for spatial
link matrices. Environ Plan A 31:165–180
122. Tsionas EG, Michaelides PG (2016) A spatial stochastic frontier model with spillovers:
evidence for Italian regions. Scottish J Polit Econ 63(3):243–257
123. van Ark B, Inklaar R, McGuckin RH (2003) ICT and productivity in Europe and the United
States: where the differences come from, May
124. van der Wiel HP (2001) Does ICT boost Dutch productivity growth? CPB Document No 016
125. van Leeuwen G, van der Wiel HP (2003a) Do ICT spillovers matter? CPB Discussion Paper
No 26
126. van Leeuwen G, van der Wiel HP (2003b) Spillover effects of ICT. CPB Report No 3
127. Vega SH, Elhorst JP (2015) The SLX model. J Reg Sci 55(3):339–363
128. Vidoli F, Cardillo C, Fusco E, Canello J (2016) Spatial nonstationarity in the stochastic
frontier model: an application to the Italian wine industry. Reg Sci Urban Econ 61:153–164
129. Yu N, de Jong M, Storm S, Mi J (2013) Spatial spillover effects of transport infrastructure:
evidence from Chinese regions. J Transp Geogr 28:56–66
130. Zhang X, Yu J (2018) Spatial weights matrix selection and model averaging for spatial
autoregressive models. J Econ 203:1–18
36 Technical Efficiency and Its Determinants in the Manufacturing Sector:
What We Know and What We Should Know

Sumon Kumar Bhaumik

Contents
Introduction
Estimating TE/Efficiency
Factors Affecting TE/Efficiency
  External Factors
  Internal Factors
Discussion and Concluding Remarks
Cross-References
Appendix
References

Abstract

There is a large literature on the technical efficiency (TE) of manufacturing sector
firms in developing country contexts, much less so for developed country
contexts. Most empirical research uses either data envelopment analysis (DEA)
or stochastic frontier analysis (SFA) to estimate TE. Available evidence suggests
that TE of firms is low in developing countries, and there is considerable inter-
firm heterogeneity in TE. Firm-level TE is correlated with (or influenced by)
external factors such as liberalization of trade and foreign direct investment
(FDI) and overall economic liberalization, as also by internal factors such as
firm size, ownership, whether or not they operate in the formal sector, and use
and adoption of ICT. However, there is mixed evidence about the extent to which
liberal economic policies and firm characteristics such as size and ownership are
associated with higher TE. Future research should focus on the impact of the
institutional environment in which firms operate and their managerial capability
on firm-level TE.

S. K. Bhaumik
Sheffield University Management School, University of Sheffield, Sheffield, UK
IZA – Institute of Labor Economics, Bonn, Germany
Global Labor Organization, Geneva, Switzerland

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_36

Keywords

Technical efficiency · Manufacturing sector · Data envelopment analysis ·
Stochastic frontier analysis · Developing countries

Introduction

Broadly speaking, a firm can increase its productivity in two ways. If there is
slack in the production process such that factor inputs are not used efficiently,
then it can work on increasing its efficiency, i.e., generate more output for a given
combination of inputs or use fewer inputs to generate the same amount of output.
If it is already fairly efficient in the use of inputs, however, productivity increase
may require (disembodied or embodied)1 technical progress.2 Traditionally, much
of the discussion about productivity in the context of the manufacturing sector in the
developed economies (DEs) has been about technical progress, or about changes in
total factor productivity (TFP) and labor productivity.3 The genesis of this focus possibly lies
in the (implicit) assumptions that (a) DEs are contexts where there are functioning
markets for factor inputs and (b) profit maximizing firms operating in these contexts
would necessarily use an efficient input bundle to produce the optimum (or profit-
maximizing) output level, such that technical efficiency would be less of an issue.
Both these assumptions are brought into question in the context of less developed
economies (LDEs), of which emerging market economies (EMEs) are a much-
discussed subset. To begin with, these contexts are characterized by missing factor
markets, such that it may be difficult for firms (and their managers) to choose the
optimal input bundles. In some cases, this is on account of paucity of appropriate
factor inputs – chronic shortage of capital and skilled labor in LDEs has been much
discussed in the development economics literature – and, to the extent that these
factors are available, market failure may occur on account of high information and
transactions costs in these factor markets.4 At the same time, firms in LDEs are
characterized by weak management practices that are arguably correlated with state
and founder–family ownership of firms in these contexts [4]. Alternatively, these
contexts are characterized by missing markets for managerial capital [5], whether
on account of paucity of such capital or on account of the lack of a functioning
market for managerial talent.
In addition, developing countries are often characterized by traditional industries
such as textiles and food processing, and a large proportion of firms in these contexts
are small and far removed from the global technological frontier. For these industries
and firms, it is often meaningful to talk about the efficiency of the production
process, i.e., the efficiency with which inputs are converted into output, rather than
technical progress. That is not to say that technical progress is inconceivable in
such contexts, e.g., a move from handloom-based production to machine-based
production in the textiles industries can lead to technical progress because of the
technology embodied in the machines (or physical capital). But, by and large, the
discussion about efficiency is much more relevant in the context of LDEs than in
the context of DEs, and this is reflected in the large body of work on the efficiency – often,
specifically, technical efficiency (TE) – of firms within the literature on firm and industry
performance in LDEs.5
Much of this literature concludes that the average firm in most LDE industries
has a TE of about 0.6–0.7, and that the dispersion of TE among firms can be
quite large, e.g., 0.3–0.8. The relatively low average TE levels are consistent with the
aforementioned discussions about factor markets and managerial abilities in LDE
contexts. The large dispersion in TE levels, on the other hand, confirms the absence
of competitive markets, arguably on account of weak institutions, and other factors
such as family and state ownership of firms that rule out exit of inefficient firms over
time. More importantly, this literature identifies factors that are correlated with –
correlation is easier to establish than causality – high (and hence also low) levels of
TE.6 Broadly speaking, the literature identifies two sets of factors that are correlated
with TE/efficiency levels, namely, external factors such as the business environment
(and changes thereof, by way of policy changes and reforms) and internal factors
such as the size and ownership of companies.7

1 See Intriligator [1] and Jorgenson [2] for a discussion about disembodied and embodied technical
progress (change).
2 In principle, one can also talk about scale economies contributing to productivity growth. Coelli
et al. [3], for example, decompose total factor productivity (TFP) into three components, namely,
efficiency change, technical progress, and scale change. However, in the literature, much of the
focus is on technical efficiency and technical progress, and much less on scale economies.
3 Some papers that focus on TE (or change thereof) in DEs include [72]–[75].
4 The challenge posed by information and transactions costs is perhaps most evident in the market
for financial capital that is necessary to acquire physical capital, especially in contexts where
internal accrual of firms is insufficient for acquisition of physical capital. Factors such as weak
property rights, absence of rule of law, and weak enforcement of contracts make it difficult for
creditors to overcome the twin problems of adverse selection and moral hazard. At the same
time, investment in equity/shares of firms may be hindered by weak corporate governance that
is often associated with high levels of ownership concentration and, consequently, entrenchment
of incumbent managers.
5 There is, of course, a literature on TE in the context of DEs as well (e.g., [6]), but the
corresponding literature in the context of LDEs is arguably much larger.
6 Some of the relevant empirical papers estimate TE from data envelopment analysis (DEA) or
from the econometric estimation of a Cobb–Douglas or a translog production function – we call
this stochastic frontier analysis (SFA) – and, in the second stage of the analysis, regress TE on
its potential correlates. Some other papers simultaneously estimate a production function and the
influence of firm characteristics and external conditions such as the business environment
on firm-level inefficiency. Hence, while there is, strictly speaking, a difference between firm-level
efficiency and TE, we shall use these terms interchangeably.
In the rest of this chapter, we focus on these correlates of TE/efficiency.
We start with a discussion about the ways in which TE/efficiency is estimated in the
literature. Next, we discuss the empirical evidence reported in the literature and
discuss the mechanisms through which the aforementioned external and internal
factors affect efficiency. Finally, we reflect on the way forward for the TE literature
for the manufacturing sector.

Estimating TE/Efficiency

In the early literature on TE/efficiency in the manufacturing sector, inferences about
the efficiency of a certain type of firm relative to others were made on the basis of
simple regression models. Consider, for example, Tyler [7], who attempts to shed
light on whether, among other things, TE of foreign-owned or controlled firms is
different, on average, than that of domestic firms. His empirical strategy involves
estimating the extended Cobb–Douglas production function

\[
\ln X_i = \ln A + \gamma\, DF_i + \alpha \ln K_i + a_1 (DF_i \cdot \ln K_i) + \beta \ln L_i + a_2 (DF_i \cdot \ln L_i) \tag{1}
\]

where γ captures the TE of foreign firms relative to domestic firms and where a1
and a2 capture the differences in “the respective output elasticities for foreign firms
in relation to domestic firms” (p. 369). The model is estimated using ordinary
least squares (OLS), even though Tyler acknowledges the limitations of the OLS
estimator. Later in the paper, he tests for the adequacy of the Cobb–Douglas
specification – the alternative being the CES production function – and concludes
that “the Cobb-Douglas is an adequate model for explaining the production behavior
of foreign firms but not for domestic firms” (p. 372).
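
To make the role of the dummy variable in Eq. (1) concrete, the following is a minimal sketch in Python, not Tyler's own procedure. It builds made-up, noise-free data from hypothetical parameter values and recovers the coefficients by OLS through the normal equations. The list TRUE, the capital and labor series, and the helper functions solve_linear_system and ols are all assumptions introduced for illustration.

```python
import math

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical "true" parameters: ln A, gamma, alpha, a1, beta, a2
TRUE = [1.0, 0.2, 0.4, 0.05, 0.6, -0.03]

rows, log_output = [], []
for n in range(12):
    df = n % 2                         # 1 = foreign firm, 0 = domestic firm
    ln_k = math.log(1 + n)             # made-up capital series
    ln_l = math.log(2 + (n * n) % 7)   # made-up labor series
    regressors = [1.0, df, ln_k, df * ln_k, ln_l, df * ln_l]
    rows.append(regressors)
    # Noise-free log output generated from the assumed parameters
    log_output.append(sum(c * x for c, x in zip(TRUE, regressors)))

ln_a, gamma, alpha, a1, beta, a2 = ols(rows, log_output)
print(f"gamma (TE shift for foreign firms) = {gamma:.4f}")
print(f"a1, a2 (elasticity differences)    = {a1:.4f}, {a2:.4f}")
```

Because the data contain no noise, the estimates coincide with the assumed values up to rounding error; with real data, the limitations of OLS that Tyler acknowledges would apply.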
By the 1980s, the empirical modeling of TE had become more sophisticated. For
example, Page Jr [8] considers a translog production function
\[
\ln \hat{X}_s = \hat{\alpha}_0 + \sum_{m} \hat{\alpha}_m \ln Z_m(s) + \frac{1}{2} \sum_{m} \sum_{n} \hat{\beta}_{mn} \ln Z_m(s) \ln Z_n(s) \tag{2}
\]

where X̂s is an index of maximum potential output for firm s (s = 1, 2, ..., S),
Zm(s) are indices of input levels, and m, n = 1, 2, ..., N. The model parameters
are estimated using the optimization program

7 It can be argued that internal factors such as (family) ownership can themselves be determined
by external factors such as institutions, but we shall abstract from that discussion for the purposes
of this chapter.

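\[
\min \sum_{s=1}^{S} \left[ \hat{\alpha}_0 + \sum_{m} \hat{\alpha}_m \ln Z_m(s) + \frac{1}{2} \sum_{m} \sum_{n} \hat{\beta}_{mn} \ln Z_m(s) \ln Z_n(s) - \ln X_s \right] \tag{3}
\]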

and constraints are imposed to ensure that the production function is concave and
monotonic and that, for each firm s, the observed input–output combinations lie
on or below the frontier. The parameters of the model are obtained using linear
programming. It is easy to see that Xs = X̂s only if a firm operates on the production
frontier. Page Jr., therefore, is able to generate a firm-specific efficiency (or Farrell)
index [9], which is given by Xs/X̂s ≤ 1, and “the value of the index provides a
measure of relative technical efficiency” (p. 133). If the production process follows
constant returns to scale, the Farrell index also gives us the percentage by which the
firm’s unit cost can be reduced, given the current output level.8
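
Once the frontier parameters are available, the index itself is simple to compute. The sketch below evaluates the translog frontier of Eq. (2) and the ratio Xs/X̂s for a single firm. It does not solve the linear program in Eq. (3); the parameter values ALPHA0, ALPHA, and BETA, the input indices, and the function names are hypothetical and chosen only for illustration.

```python
import math

def log_frontier_output(alpha0, alpha, beta, inputs):
    """Translog frontier of Eq. (2): returns the log of potential output."""
    logs = [math.log(z) for z in inputs]
    first_order = sum(a * lz for a, lz in zip(alpha, logs))
    second_order = 0.5 * sum(
        beta[m][n] * logs[m] * logs[n]
        for m in range(len(logs))
        for n in range(len(logs))
    )
    return alpha0 + first_order + second_order

def farrell_index(observed_output, alpha0, alpha, beta, inputs):
    """Ratio of observed output to frontier output."""
    potential = math.exp(log_frontier_output(alpha0, alpha, beta, inputs))
    return observed_output / potential

# Hypothetical frontier parameters (illustrative values, not estimates)
ALPHA0 = 0.5
ALPHA = [0.6, 0.4]
BETA = [[0.02, -0.02], [-0.02, 0.02]]

inputs = [1.0, 1.0]                   # input indices Z_1(s), Z_2(s)
observed = 0.8 * math.exp(ALPHA0)     # firm produces 80% of its potential
index = farrell_index(observed, ALPHA0, ALPHA, BETA, inputs)
print(f"Farrell index = {index:.3f}; shortfall from frontier = {1 - index:.1%}")
```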
Since the 1990s, the empirical literature on TE/efficiency has been dominated by
(nonparametric) DEA models and (parametric) SFA models. The DEA approach
involves using linear programming to solve a constrained optimization problem
(e.g., [11]), and this optimization problem is given by
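\[
\max_{u,v}\; g_0 = \frac{u y_0}{\sum_{i=1}^{m} v_i x_{i0}} \tag{4}
\]

subject to

\[
\frac{u y_j}{\sum_{i=1}^{m} v_i x_{ij}} \le 1, \tag{5}
\]
\[
\sum_{i=1}^{m} v_i x_{ij} = 1 \tag{6}
\]
\[
u_i, v_i > 0 \tag{7}
\]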

where y0, x0, and g0 are output, input(s), and technical efficiency of firm 0, whose TE
is being measured, j = 1, 2, ..., n is an index for firms, and i is an index for
the inputs. The first constraint ensures that a firm is either on or below the efficient
frontier, the second constraint ensures that there is a unique combination of u and
v for which the objective function is maximized, and the third constraint ensures
that any increase in input will necessarily increase output. Further, the assumption
of constant returns to scale that is implicit in the optimization program can be easily

8 Variations of this approach have been used in other papers, without the linear programming
element. For example, Blomstrom [10] estimates an industry-level efficiency index as follows:
“First the efficiency frontier is obtained by choosing the size class within each four-digit industry
showing the highest value-added per employee. Value added per employee in this size class is
denoted y+. Then the industry average (denoted y) is calculated as the ratio of total value-added
in each industry to the total number of employees. The efficiency index, e, for each industry i, [is
then defined] as ei = yi+/yi.” (p. 102)

relaxed. The solution to this program results in a TE score between zero and one
for each firm.
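
For intuition, consider the special case of one output and one input, which is a simplifying assumption made here rather than the general program above. Under constant returns to scale the ratio program then reduces to comparing each firm's output-input ratio with the best ratio in the sample, so no linear programming solver is needed. The data and the function name ccr_efficiency_single are made up for illustration.

```python
def ccr_efficiency_single(outputs, inputs, firm):
    """CRS efficiency score of `firm` with one output and one input.

    With a single output and a single input, maximizing u*y0 / (v*x0)
    subject to u*yj / (v*xj) <= 1 for all j gives the firm's own
    output-input ratio divided by the largest ratio in the sample.
    """
    ratios = [y / x for y, x in zip(outputs, inputs)]
    return ratios[firm] / max(ratios)

# Made-up data for four firms
outputs = [10.0, 12.0, 9.0, 20.0]
inputs = [5.0, 8.0, 9.0, 10.0]

scores = [ccr_efficiency_single(outputs, inputs, j) for j in range(len(outputs))]
for j, score in enumerate(scores):
    print(f"firm {j}: TE = {score:.2f}")
```

Firms with the highest output per unit of input define the frontier and receive a score of one; with several inputs the weights u and v must instead be chosen by linear programming.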
An example of the SFA approach, on the other hand, is Sena [12] who estimates
a translog production function

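\[
\ln y_{it} = \alpha_k \ln x_{kit} + \frac{1}{2} \alpha_{kl} \ln x_{kit} \ln x_{lit} + \alpha_l \ln x_{lit} + \gamma t + \frac{1}{2} \gamma t^2 + \mu_k \ln x_{kit}\, t + \mu_l \ln x_{lit}\, t + \varepsilon_{it} \tag{8}
\]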

where y is output, k and l denote capital and labor, respectively, t is a time trend,
i = 1, 2, ..., I indexes firms, and t = 1, 2, ..., T indexes time periods. The error
term εit can be decomposed into an iid noise term vit and a one-sided (in)efficiency
term uit, εit = vit − uit. In this setup, TE is given by
TE = exp(−uit).
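
The link between the composed error and TE can be illustrated numerically. The sketch below assumes that log output equals a deterministic frontier plus noise minus inefficiency, in line with Eq. (10), and uses hypothetical values for one firm-year. It is not an estimator: in practice the inefficiency term is unobserved and has to be predicted from the estimated composed error.

```python
import math

def technical_efficiency(u):
    """TE = exp(-u) for a nonnegative inefficiency term u."""
    if u < 0:
        raise ValueError("inefficiency term must be nonnegative")
    return math.exp(-u)

# Hypothetical values for one firm-year
log_frontier = 2.0   # deterministic part of log output
v = 0.05             # two-sided noise
u = 0.20             # one-sided inefficiency

log_output = log_frontier + v - u                 # composed error is v - u
stochastic_frontier = math.exp(log_frontier + v)  # frontier including noise
te = technical_efficiency(u)
ratio = math.exp(log_output) / stochastic_frontier
print(f"TE = {te:.4f}, observed/frontier output = {ratio:.4f}")
```

The ratio of observed output to the stochastic frontier equals exp(−u), which is why TE lies between zero and one and equals one only when u is zero.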
Papers that examine the determinants of TE first obtain firm-year estimates of TE
using the DEA or the SFA approach and then regress these estimates on the relevant
firm characteristics and characteristics of the business environment within which
these firms operate. An example of this approach is Chirwa’s [13] examination of
the relationship between TE and the following variables: state ownership in a firm
(state), a dummy variable indicator of privatization (priv), the Herfindahl–Hirschman
index of market concentration (hhi), import competition (imps), capital intensity of
technology (kint), a dummy variable indicator of multinationality (mnc), a dummy
variable indicator of structural reform programs (saps), and a dummy variable
indicator of post-privatization enterprises (psepriv). The regression model is given by

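\[
TE_{jt} = \alpha_0 + \alpha_1 \mathit{state}_{jt} + \alpha_2 \mathit{priv}_{it} + \alpha_3 (\mathit{hhi}_{it}) + \alpha_4 \mathit{imps}_{it} + \alpha_5 \mathit{kint}_{jt} + \alpha_6 \mathit{mnc}_{jt} + \alpha_7 \mathit{saps}_{t} + \alpha_8 \mathit{psepriv}_{t} + \varepsilon_{jt} \tag{9}
\]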

where i is an indicator of the firm, j is an indicator of the industry, and t is the
time period. Such models can be estimated using fixed effects or, if there are
time-invariant variables of interest (e.g., ownership status), random effects
models. Some recent papers, however, also correct for potential endogeneity by
using variations of the generalized method of moments (GMM) estimator (e.g.,
[14]). Others have used the Tobit estimator [15].
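
As an illustration of the fixed effects option, the following sketch computes the within estimator for a single regressor. It is a simplified stand-in for Eq. (9), which has several regressors: the panel, the firm effects, the hhi series, and the slope of −0.5 are made-up values, and within_estimator is a hypothetical helper.

```python
def within_estimator(panel):
    """Fixed-effects (within) slope for a single regressor.

    `panel` maps a firm id to a list of (x, te) observations over time.
    Demeaning by firm removes any time-invariant firm effect.
    """
    num = den = 0.0
    for obs in panel.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        te_bar = sum(te for _, te in obs) / len(obs)
        for x, te in obs:
            num += (x - x_bar) * (te - te_bar)
            den += (x - x_bar) ** 2
    return num / den

# Made-up data: TE = firm effect - 0.5 * hhi, with no noise
firm_effects = {"A": 0.9, "B": 0.7, "C": 0.8}
hhi_paths = {
    "A": [0.10, 0.20, 0.30],
    "B": [0.40, 0.35, 0.25],
    "C": [0.15, 0.30, 0.20],
}
panel = {
    firm: [(h, firm_effects[firm] - 0.5 * h) for h in hhi_paths[firm]]
    for firm in firm_effects
}

slope = within_estimator(panel)
print(f"within estimate of the hhi coefficient = {slope:.4f}")
```

The same demeaning also removes any time-invariant regressor, which is why the random effects model is the usual alternative when variables such as ownership status are of interest.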
A variation of this empirical strategy is sometimes used by researchers who
adopt the SFA approach to estimate TE. It involves estimating the production
function

\[
y_{it} = f(X_{it}) + v_{it} - u_{it} \tag{10}
\]

where y is the output, X is a vector of inputs, $v_{it} \sim N(0, \sigma_v^2)$ is an iid error term, and
$u_{it} \sim N(\mu_{it}, \sigma_u^2)$ is a nonnegative inefficiency term. The inefficiency term is then
characterized as follows:

\[
\mu_{it} = \delta_0 + \delta_1 Z_{1it} + \delta_2 Z_{2it} + \cdots + \delta_n Z_{nit} \tag{11}
\]

The production function and the inefficiency equation can then be jointly estimated
to explore the relationship between inefficiency and the Z variables that can include
firm-level characteristics such as size (e.g., [16]) as well as environmental variables
such as institutional quality [17].
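
The sketch below shows how the Z variables shift inefficiency in such a model. It assumes that the inefficiency term is a normal variable truncated below at zero, so that its mean is μ + σu φ(μ/σu)/Φ(μ/σu). The δ coefficients, the two Z variables (firm size and institutional quality), and σu = 0.3 are hypothetical values, and the code evaluates the inefficiency equation rather than estimating it.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def location_parameter(delta, z_values):
    """mu_it = delta_0 + delta_1 * Z_1it + ... + delta_n * Z_nit."""
    return delta[0] + sum(d * z for d, z in zip(delta[1:], z_values))

def expected_inefficiency(mu, sigma_u):
    """Mean of a N(mu, sigma_u^2) variable truncated below at zero."""
    ratio = mu / sigma_u
    return mu + sigma_u * normal_pdf(ratio) / normal_cdf(ratio)

# Hypothetical coefficients: intercept, firm size, institutional quality
delta = [0.5, -0.1, -0.2]
small_weak = location_parameter(delta, [1.0, 0.5])
large_strong = location_parameter(delta, [3.0, 1.5])

for label, mu in (("small firm, weak institutions", small_weak),
                  ("large firm, strong institutions", large_strong)):
    mean_u = expected_inefficiency(mu, 0.3)
    print(f"{label}: mu = {mu:.2f}, E[u] = {mean_u:.3f}")
```

With negative δ coefficients, larger firms and better institutions lower μ and hence expected inefficiency; the signs here are assumed for illustration and are not taken from [16] or [17].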
While a number of methodologies have been used in the literature to explore
TE in the manufacturing sector, some of which have been described above, the
current literature almost exclusively uses DEA or SFA to estimate TE, and thereafter
uses a second-stage regression model to explore the relationship between TE and its
potential determinants. In other words, there is a large degree of agreement about
the methodological approach to estimating TE and examining its determinants, and
the debate, therefore, is about the conceptual issues and the evidence about the
aforementioned determinants. We discuss these issues and the evidence in the next
section.

Factors Affecting TE/Efficiency

In light of the discussion earlier in this chapter, it is not surprising that much of
the literature on TE/efficiency is in the context of LDEs and EMEs. Much of this
literature focuses on how efficiency is related to specific aspects of an economy (e.g.,
openness to trade and foreign direct investment (FDI)) and to specific firm
characteristics (e.g., ownership and formal/informal status of companies). Some
of the literature, however, focuses on “reforms,” which can include a number of
different things. For example, in the case of India, reforms during the 1980s and
1990s included elimination of the requirement to obtain government permission to
start businesses (the so-called license raj), liberalization of both current and capital
accounts of the balance of payments, reform of credit and capital markets, and partial
privatization of some state-owned financial and nonfinancial enterprises. Hence, it is
difficult to attribute the mixed evidence about whether or not broad-based “reforms”
contributed to TE improvement – Ray [18] and Din et al. [19] find that reforms did
improve TE in India and Pakistan, respectively, while Bhaumik and Kumbhakar [20]
did not find evidence of a significant change in TE in India9 – to specific aspects of
these reforms.
However, this research highlights interesting caveats that merit deeper exploration.
For example, Mukherjee and Ray [21] find that economic reforms in India
did not change the relative efficiency ranking of the states and that there is no
evidence of convergence in the distribution of (state-level) efficiency subsequent

9 Specifically, while Ray [18] finds that economic reforms in India, whose centerpiece was the
elimination of licensing requirements to facilitate entry, led to greater productivity growth largely
on account of TE improvement, Bhaumik and Kumbhakar [20] find that the median TE of all but one
industry (examined in the paper) declined between 1989–90 and 2000–01 and that change in TE
explains a very small proportion of the change in gross value added.

to the reforms. This raises questions about the role of initial conditions and path
dependence in determining the impact of reforms on TE/efficiency. Similarly,
Chirwa [22] finds that, in Malawi, structural changes improved TE/efficiency of
only those sectors which were inefficient before the reforms. This is not surprising
in itself; inefficient sectors have greater scope to improve. However, it is
certainly something about which policymakers should be mindful.
Next, we discuss the literature on how TE/efficiency (and its change) is related to
specific aspects of the business environment (e.g., infrastructure quality), government
policies (e.g., trade liberalization), and firm characteristics (e.g., ownership). We
first take a look at factors that are external to a firm, and then at firm
characteristics themselves. Note that we use the word “impact” – which implies
causality – somewhat loosely. In much of this literature, it is easier to establish
correlation than causality.

External Factors

Both cross-country and single-country studies have confirmed that factors such as
infrastructure quality, availability of factor inputs, competition, and government
regulations/policies are correlated with TE/efficiency levels of firms and industries.
Some of these relationships are easy to predict, e.g., TE improves with infrastructural
quality and availability of factor inputs. Mitra et al. [23] find that the
infrastructure elasticity of TE in the context of the Indian manufacturing sector
is 0.12 on average, and much higher for some industries. However, they also find
that some elements of infrastructure such as power shortages matter much more
than other elements such as transportation quality. Similarly, Hailu and Tanaka
[24] find that TE in Ethiopia, which is low on average and with a high level of
dispersion, is adversely affected by shortage in supply of raw materials. Wang
and Wong [25] argue that, in addition to having a direct effect on TE, in some
contexts, factor inputs such as skilled labor may have a moderating effect on
the relationship between TE and its correlates such as trade and foreign direct
investment (FDI).
From a conceptual perspective, TE should also be affected by competition –
competition is expected to induce profit-maximizing firms to become more efficient,
irrespective of whether the source of the competition is domestic or international.
In one of the few studies that directly link market structure with TE, Setiawan et al.
[26] find that industry concentration is negatively correlated with sector-level TE.
Evidence linking the drivers of market concentration, namely, entry and exit of
firms, with TE is found, interestingly, in a low productivity region of a developed
country. Harris [27] finds that the average level of TE is lower in Northern Ireland
than in the other parts of the UK, in part because there is little churning to facilitate
replacement of inefficient firms (which exit) with more efficient firms (that enter)
and, in part because the new firms that enter are not much more efficient than the

incumbent firms. Interestingly, however, there is little discussion in the literature

about how institutions that underpin entry, exit, and market concentration are related
to TE.10
The literature on the impact of trade and FDI on TE is much more extensive
but it is difficult to disentangle the competitive effects of trade and FDI from
factors such as learning-by-doing and spillover effects of FDI. In addition, this
literature has had to grapple with the possibility that higher productivity and TE
of exporting firms and those with foreign ownership may reflect self-selection,
whereby more productive firms are more likely to export and attract overseas
investment, rather than any causal impact of export and overseas investment on TE
and productivity. Evidence reported in the literature suggests that TE is positively
associated with export orientation [28, 29], but that efficiency gains from exporting
are largest for firms that are new to exporting, on account of both learning-by-
exporting and self-selection of the most efficient firms into the exporting cohort
[30]. Correspondingly, TE (and productivity) are positively associated with trade
liberalization [31–33], with TE gains for both import-substituting and export-
promoting industries [34], and negatively associated with high levels of protection
[32, 35]. Similarly, available evidence suggests that TE is positively associated
with foreign ownership and outward FDI activities (i.e., overseas investment)
of firms [36]. However, the benefits of sector-level inward FDI may only be
significant for industries that are regionally concentrated [37], suggesting that
aspects of economic geography may moderate the relationship between FDI and
TE.
Neither trade liberalization nor FDI may, however, be a panacea for TE improvement.
Trade liberalization, for example, may reduce efficiency of domestic firms,
on average, if they are not flexible enough to adjust to this change in their business
environment [38]. Similarly, while FDI may improve the TE of local firms that are
part of the multinational enterprises’ (MNE) supply chain, there may be a decline in
the TE of the domestic competitors of these foreign firms (and their suppliers) [39].
More generally, the reaction of domestic firms to trade liberalization and FDI may
depend on whether or not they (or the sectors to which they belong) are close to the
global technological frontier [40],11 such that there can be considerable variation in
interindustry and intra-industry impact of trade/openness and FDI on TE.

10 Indeed, in perhaps the only notable study that examines the relationship between formal
institutions and efficiency [17], the focus is on labor market institutions that directly matter more
for factors such as motivations of workers than for market concentration.
11 The broad intuition is that firms that are close to the global technological frontier would be

induced by overseas competition to become more efficient and/or invest in better technology, while
firms that are far from this frontier are unlikely to be willing to make these adjustments to their
production process.

Internal Factors

In the literature, the single most robust relationship between a firm attribute and TE
involves firm size; in nearly all relevant studies, larger firms have been associated
with higher levels of TE (see Appendix). This is as true in contexts in which
micro and small firms are not intrinsically inefficient but have lower TE levels
than larger firms [41], as in contexts where the average TE is low but larger firms
are relatively more efficient [42]. In an interesting deviation from this stylized
observation about the relationship between firm size and efficiency, Patibandla [43]
finds that medium-sized Indian firms are more efficient than both small and large
firms. This is consistent with an earlier study in the Indian context [8], which found
that firm size was positively associated with TE in only one of the four industries
that were examined in the study. Patibandla argues that this is on account of low
levels of organizational efficiency in larger firms, which is consistent with more
recent evidence that suggests that larger firms in developing countries are badly
managed, and that the challenges associated with poor management practices may
be particularly acute in organizational forms such as family firms. The importance
of managerial capability is also highlighted in the research of Page Jr [44] who finds
that TE in Ghana is positively associated with education and industry experience of
managers, and also with the ratio of expatriate managers to total managers.
Available evidence also provides links between TE and factors such as investment
in research and development (R&D) and information and communication
technology (ICT) that are correlated with a firm’s organizational structure and
strategic organization (e.g., Bayo-Moriones and Lera-Lopez [78]). Both R&D investment
[45, 46] and ICT investment [47, 48] are positively associated with TE. Similarly,
while deterioration in TE over time may sometimes be attributable to
poor investment planning and implementation [49], higher TE is associated with
firms that implement state-of-the-art management practices such as just-in-time
purchasing of materials and components that are used in the production process
[50]. Management, however, is a complex process that is difficult to view through
the prism of single factors such as R&D investment and supply chain management.
Good management requires ensuring that investment in any one firm-level resource
or capability is accompanied by investment in complementary resources that are
necessary to exploit the former. For example, Mahadevan [51] finds that TE may
be negatively associated with capital intensity, on account of unavailability of
workers with commensurate skills, implying that investment in capital without
commensurate investment in training may not lead to an improvement in TE.12

12 Indeed, the capital intensity of firms itself might not be optimal. If larger firms pay more for labor

than their smaller counterparts, for example, their use of capital may be more than what would have
been optimal if they paid the same per unit of labor as the smaller firms [14].
36 Technical Efficiency and Its Determinants in the Manufacturing . . . 1421

The relationship between ownership and TE is also not clear-cut. We have

already noted the challenges associated with organizational forms such as family
firms. Available empirical evidence suggests that, in some contexts, TE is positively
associated with foreign ownership and negatively associated with state ownership
[15].13 Correspondingly, privatization has been associated with an increase in
industry-level TE in some developing economies [13], while access to government
subsidies, which is generally much more relevant for state-owned firms than
for private and foreign firms, has been found to be positively associated with
inefficiency [52]. Evidence also suggests that increased concentration of ownership
in the hands of the largest shareholder and inclusion in (pyramidal) business groups,
which reduce Type I or principal–agent agency problems (albeit replacing them with
Type II or principal–principal agency problems), are also associated with higher
(or increase in) TE [53]. However, the difference in efficiency between state-owned
and private/foreign firms is not always significant [7]. Further, in some contexts,
the positive impact of certain types of ownership, e.g., that of foreign ownership in
Northern Ireland [54], decreases over time.14
In LDEs, the (in)formal status of firms is strongly correlated with their efficiency
levels, largely on account of the unfavorable business environment within which
these firms operate [55]. Specifically, TE of informal sector firms can be low
on account of factor market imperfections experienced by these firms [56], and
transactions costs for these firms can be particularly high in the credit market. In
the specific context of the capital market, however, the evidence is mixed. On the one
hand, firms that have access to credit from banks and other sources such as clients
and suppliers have higher TE than those that are reliant on friends, family, and their
own financial resources, and TE is highest for firms that have access to bank credit
[57]. On the other hand, the incentive to increase TE may be higher for firms that
experience financial constraints [12]. This literature, however, does not take into
consideration the possibility that operating in the formal sector may itself be a
strategic decision of the firm, which trades off the benefits of easier access to credit
and other factor inputs against the higher direct costs of doing business in the formal
sector.

13 This is consistent with a much wider literature on the relationship between private and foreign
ownership of firms and their performance.
14 In this context, one may draw on the international business literature which posits that foreign

firms may have “ownership advantages” in the form of better technology, managerial skills, and
even country-specific factors such as greater access to key resources, but that these firms also face
challenges when they operate overseas. In the context of LDEs and EMEs, the main challenge may
be the high transactions cost of operation in weak institutional contexts.

Discussion and Concluding Remarks

Since the notion of TE/efficiency follows from the microeconomic theory of
production, it is not surprising that much of the empirical literature about it is
concerned with issues such as firm characteristics (e.g., ownership), factor market
imperfections, and competition. Concerns about the impact of ownership and market
structures on firm performance are a natural extension of the neoclassical theory of
the firm. The essential argument is that private ownership and competition, whether
from domestic or foreign sources, give a firm the incentive to become efficient in,
among other things, its use of factor inputs. This process of efficiency gain may
be aided by increased x-efficiency, which itself may be a consequence of greater
competition, and access to well-functioning factor markets. More recent extensions
to this literature involve a focus on factors such as training (in addition to x-efficiency)
and agency problems (in addition to public versus private ownership). While the
empirical evidence is by no means unanimous, these core propositions about the drivers
of TE/efficiency find a fair amount of support in the literature; the caveats are discussed
above and further highlighted in the Appendix.
The question, therefore, is where the literature on TE (in the context of the
manufacturing sector) can go from here. Further estimations, using data from
different countries and time periods, and for different industries, can doubtless be
undertaken, using ever more sophisticated SFA estimators [76], for example, but
it is unclear whether these new estimates of TE would provide us with any new
insights unless we extend the literature in one of two directions. First, building on
the work of not only Bloom et al. [4] but also a large management literature, future
research should consider examining the role of managerial capability in influencing
TE/efficiency (and its increase). In LDE and EME contexts, a specific aspect of
management capability might be political connections of key managers such as the
chief executive officer (CEO) [58]. Second, building on the work of Bhaumik and
Dimova [17], future research may want to further explore how formal and informal
institutions affect TE. In other words, future research should perhaps unpack the
(neoclassical) firm further, on the one hand, and be more mindful about the role
of institutions, which are more structural in nature than policies and regulations, in
influencing the choices and decisions of the different stakeholders of the firm, the
aggregation of which is captured by performance measures such as TE.

Cross-References

Data Envelopment Analysis: A Nonparametric Method of Production Analysis

Modeling Technical Change: Theory and Practice
Neoclassical Production Economics: An Introduction
Stochastic Frontier Analysis: Foundations and Advances I
Stochastic Frontier Analysis: Foundations and Advances II

Appendix

| Country | Method | Reference | Empirical finding |
|---|---|---|---|
| Business environment | | | |
| Cross-country | SFA | [59] | Efficiency distribution for most sectors unaffected by explanatory variables, with weak positive association between high efficiency levels and the expansionary phase of the business cycle. |
| Cross-country | SFA | [79] | Considerable intercountry and intra-country dispersion in efficiency levels, and inefficiency is mostly associated with managerial experience, infrastructural quality, and competition. |
| Cross-country | SFA | [60] | Inefficiency is lower in countries that have more sophisticated production processes and higher capacity for innovation. |
| Cross-country | SFA | [25] | TE improvement is positively associated with infrastructure, political stability, and urbanization. Human capital matters as well; benefits from trade and FDI-driven technology transfer increase a country’s TE only after human capital level reaches a minimum threshold. |
| Ethiopia | SFA | [24] | TE is low and with high dispersion level. The main reason for low efficiency levels across sectors is shortage in supply of raw materials but low infrastructure quality and unfavorable government rules and regulations play a role as well. |
| India | | [23] and [81] | Infrastructure elasticity for TE is 0.12, on average, and higher for some industries. However, some elements of infrastructure such as power shortage matter much more than others such as transportation quality. |
| Competition | | | |
| Indonesia | DEA | [26] | Industry concentration is negatively associated with sector-level TE. |
| Northern Ireland | SFA | [27] | Average level of TE is lower in Northern Ireland than in other parts of the UK, in part because there is little churning to facilitate replacement of inefficient firms (which exit) with more efficient firms (that enter). Where entry does take place, the entering firms are not much more efficient than the incumbents. |
| Economic policies and reforms | | | |
| Bangladesh | | [34] | There was an increase in the overall TE of most industries over time, for both export-promoting and import-substituting industries. |

| Country | Method | Reference | Empirical finding |
|---|---|---|---|
| Chile | | [32] | There is little evidence of productivity improvement on account of trade liberalization but a greater reduction in protection levels is associated with a larger improvement in the average efficiency level. |
| China | | [33] | Average TE declined in the mid-1980s but has increased since 1992. There is considerable regional variation in TE and inefficiency is impacted by a variety of reforms related to privatization and trade and FDI liberalization, and infrastructure development. |
| Hungary | SFA | [52] | Inefficiency is higher for firms that are recipients of government subsidy and inefficiency also increases with the level of inefficiency. |
| India | DEA | [18] | Annual rate of productivity growth is higher in the post-reform period, in part, because of improvement in TE. |
| India | DEA | [21] | Economic reforms did not change the relative efficiency rankings of the states, and there is no evidence of convergence in the distribution of efficiency subsequent to the reforms. |
| India | SFA | [20] | Median TE of all but one industry declined between 1989–90 and 2000–01, and change in TE explains a very small proportion of the change in gross value added. |
| Malawi | SFA | [22] | Structural changes did not significantly affect the TE of sectors that were relatively efficient before the reforms, but reforms did have a positive impact on firm efficiency in the sector that was inefficient before the reforms. |
| Malawi | SFA | [13] | Privatization increased industry-level TE but the direct impact of privatization on firm-level TE was negative; efficiency of firms declined following privatization. |
| Pakistan | SFA & DEA | [19] | TE of most industries increased over time and can possibly be attributed to the reforms that were initiated in the late 1980s, aimed at increasing competition and improving the business environment. |
| Firm size and formality | | | |
| Chile | | [41] | Micro and small firms are not intrinsically inefficient; there are considerable variations across industries. On average, however, medium firms have higher efficiency levels than micro and small firms. Inter-firm differences in efficiency are explained by factors such as capital intensity. |

Country Method Reference Empirical finding

Cote d’Ivoire SFA [56] TE is low on account of input market
imperfections, and larger firms, those operating in
the formal sector and those that are part of
international networks, are better able to overcome
the challenges posed by these imperfections.
Cote d’Ivoire SFA [42] TE is low, on average, but larger firms are more
efficient and the lower TE of smaller firms can
perhaps be attributed, at least in part, to their
informality.
Cote d’Ivoire SFA [55] TE is lower for informal firms than for formal
sector firms, mainly on account of the unfavorable
business environment within which the former
operate.
Ghana SFA [14] Large firms pay more for labor and hence use
more capital than would be optimal if they paid the
same for labor as the smaller firms.
India SFA [8] Average TE is higher and dispersion of TE is
lower in the more modern industries, but firm size
is positively associated with TE in only one of the
four industries that were examined.
India SFA [43] TE is low among both small and large firms,
relative to the medium-sized firms. The
inefficiency of large firms can be attributed to low
levels of organizational efficiency.
India SFA Raj SN TE is low among informal sector firms and this
(2011) can largely be attributed to frictions (or high
transactions costs) they experience in the credit
and labor markets.
Kenya SFA [16] TE is positively associated with firm size.
Korea SFA [61] TE is positively associated with firm size in every
sector examined in the paper.
Philippines SFA [82] TE is positively associated with firm size.
Firm capability
China [46] TE is positively related to firm capability as
measured by capital intensity and R&D.
China DEA [62] TE is low for SOEs and productivity growth is
driven largely by technical progress. Best practice
SOEs are significantly different from average
SOEs in terms of technology, human capital
quality, and managerial capacity.
Ghana DEA [44] Average TE level is high, and TE is positively
associated with education and industry experience
of managers. The ratio of expatriate managers to
total managers is positively associated with TE as
well.
India SFA [80] TE is higher for firms that undertake R&D and
those that collaborate with foreign partners.
Italy SFA [12] Firms that experience financial constraints (i.e.,
difficulty in accessing external capital) have an
incentive to increase their TE over time.
Italy DEA [53] TE increases with percentage of shares owned by
the largest shareholder and is higher for firms that
belong to a pyramidal (business) group.
Italy SFA [47] ICT investment by firms is negatively associated
with their inefficiency levels.
Mexico SFA [57] Firms that have access to credit from banks,
moneylenders, clients, and suppliers have higher
TE than firms that are reliant on family, friends,
and their own financial resources. TE is highest for
firms that have access to bank credit.
Singapore SFA [51] TE is negatively associated with capital intensity.
One plausible explanation is that increase in
capital intensity has not been accompanied by
availability of workers with commensurate skills.
Taiwan SFA [45] Efficiency is positively associated with firm-level
investment in training and R&D.
Tunisia SFA [48] TE is low, on average, for Tunisian firms but
intensive use of ICT can increase efficiency levels
by around 5%.
USA SFA [50] TE is higher for firms whose management is
committed to implementing just-in-time
purchasing.
Yugoslavia [49] TE deteriorates over time, quite possibly on
account of poor investment planning and
implementation which, in turn, points at poor
management and coordination of intermediate
input supply.
Multiple factors
China [63] On average, TE is lower for SOEs than for
privately owned companies, and highest for
foreign firms. TE varies positively with firm size
and newness of fixed capital assets.
India SFA [64] Efficiency is higher for consumer goods industries
than for capital goods and intermediate goods
industries. Efficiency is positively associated with
factors such as skill and profit, but negatively
associated with capital intensity.
Indonesia SFA [65] Efficiency is higher for younger firms than for
older firms, for larger firms than for smaller firms,
and for domestic firms than for foreign firms.
Indonesia SFA [28] There is considerable inter-firm variation in TE,
and TE is positively associated with export
orientation and financial integration of firms, as
well as with female participation in the workforce.
Indonesia SFA [66] Larger firms are more efficient than smaller firms,
and private firms are, by and large, more efficient
than the public sector firms. The Asian crisis had a
negative impact on the growth rate of TE across all
sectors examined in the analysis.
Nepal SFA [35] Efficiency is positively associated with firm size
and negatively with capital intensity. It is also
adversely affected by high levels of protection.
Pakistan DEA [11] Efficiency is higher for newer firms, those that are
managed by entrepreneurs with at least primary
education, and those that are involved in
subcontracting.
Thailand SFA [67] TE varies inversely with firm size and is generally
higher for firms in urban areas. State ownership is
associated with declining TE over time, perhaps
because the more efficient firms are privatized.
Thailand SFA [68] TE is influenced by a number of factors such as
firm size, firm age, access to skilled labor,
ownership, and location.
Turkey SFA [69] TE is higher for larger firms and those located in
the metropolitan areas or their hinterlands. On
average, TE is also higher for private enterprises
than for public sector enterprises, but this is mostly
relevant for the post-1982 period.
Turkey SFA [70] There are considerable inter-sectoral differences in
efficiency, and efficiency is influenced by a
number of factors such as legal status of firms,
firm size, forms of contracting, and location.
UK SFA [6] Market concentration has a curvilinear relationship
with efficiency. Higher export and import intensity
are associated with higher spreads of efficiency
within industries. Capital intensity is negatively
associated with efficiency, possibly because large
sunk costs make it difficult for firms to alter their
behavior as demand etc. change.
Vietnam SFA [83] TE is similar, on average, for state- and
private-owned domestic firms, but is lower for
foreign-invested sectors. TE also increases with
greater compliance with the labor code, and export
orientation of firms.
Ownership and organizational form
Brazil [7] TE of foreign firms is not significantly different from
that of domestic firms, but the former enjoy
greater returns to scale and have greater elasticities
of substitution.
China [15] Within the public sector, TE is highest for
relatively large TVEs and lowest for SOEs; COEs
are less efficient than TVEs but more efficient
than SOEs.
India DEA [71] Cooperative firms are more efficient than their
counterparts but the result is influenced by the
choice of the sample.
Northern Ireland SFA [54] TE of Northern Irish firms increased over time.
The positive impact of foreign ownership on TE,
however, decreased over time, and the largest
increases in TE were observed among
domestically owned Northern Irish firms.
Trade and FDI
China [29] TE is positively associated with export orientation
and FDI intensity of industries, even though it is
also associated with factors such as firm size and
capital intensity. There is also regional variation in
TE.
Cross-country SFA [30] Efficiency gains from exporting are large and are
largest for new entrants to exporting. This is driven
by both learning by exporting and self-selection of
the most efficient firms into exporting.
India [31] Trade liberalization was positively associated with
productivity in four out of the six industries
examined.
Indonesia SFA & DEA [39] Foreign firms are more efficient than domestic
firms, and the former may increase the inefficiency
of the latter. In particular, FDI may have a negative
impact on TE changes of domestic competitors
and a positive impact on TE changes of domestic
suppliers.
Spain SFA [38] Trade liberalization was negatively associated with
efficiency, in part, on account of the lack of
flexibility on the part of the firms to adjust to the
resultant change in the environment in which they
operate.
Taiwan SFA [36] TEs of manufacturing firms are positively
associated with their OFDI activities.
UK SFA [37] TE is positively associated with the extent of
foreign ownership in the domestic industry, but the
spillover effect is only significant for industries of
above-average regional concentration.

References
1. Intriligator MD (1965) Embodied technical change and productivity in the United States 1929–
1958. Rev Econ Stat 47(1):65–70
2. Jorgenson DW (1966) The embodiment hypothesis. J Polit Econ 74(1):1–17
3. Jorgenson DW (1966) The embodiment hypothesis. J Polit Econ 74(1):1–17
4. Coelli T, Rao D, O’Donnel C, Battese G (2005) An introduction to efficiency and productivity
analysis. Springer, New York
5. Bloom N, Genakos C, Sadun R, Van Reenen J (2012) Management practices across firms and
countries. Acad Manag Perspect 26(1):12–33
6. Bruhn M, Karlan D, Schoar A (2010) What capital is missing in developing countries? Am
Econ Rev 100(2):629–633
7. Green A, Mayes D (1991) Technical efficiency in manufacturing industries. Econ J
101(406):523–538
8. Tyler WG (1978) Technical efficiency and ownership characteristics of manufacturing firms in
a developing country: a Brazilian case study. Weltwirtschaftliches Arch 114(2):360–379
9. Page JM Jr (1984) Firm size and technical efficiency: applications of production frontiers to
Indian survey data. J Dev Econ 16:129–152
10. Farrell MJ (1957) The measurement of productive efficiency. J R Stat Soc A 120(Part 3):253–
281
11. Blomstrom M (1986) Foreign investment and productive efficiency. J Ind Econ 35(1):97–110
12. Burki AA, Terrell D (1998) Measuring production efficiency of small firms in Pakistan. World
Dev 26(1):155–169
13. Sena V (2006) The determinants of firm performance: can finance constraints improve
technical efficiency? Eur J Oper Res 172:311–325
14. Chirwa EW (2004) Industry and firm effects of privatization in Malawian oligopolistic
manufacturing. J Ind Econ 52(2):277–290
15. Soderbom M, Teal F (2004) Size and efficiency in African manufacturing firms: evidence from
firm-level panel data. J Dev Econ 73(1):369–394
16. Zheng J, Liu X, Bigsten A (1998) Ownership structure and determinants of technical efficiency:
an application of data envelopment analysis to Chinese enterprises (1986–1990). J Comp Econ
26(3):465–484
17. Lundvall K, Battese GE (2000) Firm size, age and efficiency: evidence from Kenyan
manufacturing firms. J Dev Stud 36(3):146–162
18. Bhaumik SK, Dimova R (2014) Good and bad institutions: is the debate over? Cross-country
firm-level evidence from the textile industry. Camb J Econ 38(1):109–126
19. Ray SC (2002) Did India’s economic reforms improve efficiency and productivity? A
nonparametric analysis of the initial evidence from manufacturing. Indian Econ Rev New Ser
37(1):23–57
20. Din M, Ghani E, Mahmood T (2007) Technical efficiency of Pakistan’s manufacturing sector:
a stochastic frontier and data envelopment analysis. Pak Dev Rev 46(1):1–18
21. Bhaumik SK, Kumbhakar SC (2010) Is the post-reform growth of the Indian manufacturing
sector efficiency driven? Empirical evidence from plant level data. J Asian Econ 21(2):
219–232
22. Mukherjee K, Ray S (2004) Technical efficiency and its dynamics in Indian manufacturing: an
inter-state analysis, Economics working paper no. 200418. University of Connecticut
23. Chirwa EW (2000) Structural adjustment programmes and technical efficiency in Malawian
manufacturing sector. Afr Dev Rev 12(1):89–113
24. Mitra A, Varoudakis A, Vegazones-Varoudakis M-A (2002) Productive and technical efficiency
in Indian states’ manufacturing: the role of infrastructure. Econ Dev Cult Chang 50(2):395–426
25. Hailu KB, Tanaka M (2015) A “true” random effects stochastic frontier analysis for technical
efficiency and heterogeneity: evidence from manufacturing firms in Ethiopia. Econ Model
50:179–192
26. Wang M, Wong MCS (2012) International R&D transfer and technical efficiency: evidence
from panel study using stochastic frontier analysis. World Dev 40(10):1982–1998
27. Setiawan M, Emvalomatis G, Lansink AO (2012) The relationship between technical efficiency
and industrial concentration: evidence from the Indonesian food and beverages industry.
J Asian Econ 23(4):466–475
28. Harris RD (2001) Comparing regional technical efficiency in UK manufacturing plants: the
case of Northern Ireland. Reg Stud 35(6):519–534
29. Hill H, Kalirajan KP (1993) Small enterprise and firm-level technical efficiency in the
Indonesian garment industry. Appl Econ 25(9):1137–1144
30. Sun H, Hone P, Doucouliagos H (1999) Economic openness and technical efficiency: a case
study of Chinese manufacturing industries. Econ Transit 7(3):615–636
31. Bigsten A et al (2000) Exports and firm-level efficiency in African manufacturing, Working
paper no. WPS/2000–16. The World Bank, Washington, DC
32. Driffield NL, Kambhampati US (2003) Trade liberalization and the efficiency of firms in Indian
manufacturing. Rev Dev Econ 7(3):419–430
33. Tybout J, de Melo J, Corbo V (1991) The effects of trade reforms on scale and technical
efficiency: new evidence from Chile. J Ind Econ 31(3–4):231–250
34. Zhou X, Li K-W, Li W (2011) An analysis on technical efficiency in post-reform China. China
Econ Rev 22(3):357–372
35. Hossain MA, Karunaratne ND (2004) Trade liberalization and technical efficiency: evidence
from Bangladesh manufacturing sector. J Dev Stud 40(3):87–114
36. Oczkowski E, Sharma K (2005) Determinants of efficiency on least developed countries:
further evidence from Nepalese manufacturing firms. J Dev Stud 41(4):617–630
37. Yang S-F, Chen K-M, Huang T-S (2013) Outward foreign direct investment and technical
efficiency: evidence from Taiwan’s manufacturing firms. J Asian Econ 27(1):7–17
38. Driffield N, Munday M (2001) Foreign manufacturing, regional agglomeration and technical
efficiency in UK industries: a stochastic production frontier approach. Reg Stud 35(5):391–399
39. Jorge-Moreno JD, Carrasco OR (2015) Technical efficiency and its determinants factors in
Spanish textiles industry (2002–2009). J Econ Stud 42(3):346–357
40. Suyanto R, Salim R (2011) Foreign direct investment spillovers and technical efficiency in the
Indonesian pharmaceutical sector: firm level evidence. Appl Econ 45(3):383–395
41. Aghion P, Blundell R, Griffith R, Howitt P, Prantl S (2009) The effects of entry on incumbent
innovation and productivity. Rev Econ Stat 91(1):20–32
42. Alvarez R, Crespi G (2003) Determinants of technical efficiency in small firms. Small Bus
Econ 20(3):233–240
43. Chapelle K, Plane P (2005) Technical efficiency measurement within the manufacturing sector
in Cote d’Ivoire: a stochastic frontier approach. J Dev Stud 41(7):1303–1324
44. Patibandla M (1998) Structure, organizational behavior, and technical efficiency: the case of
an Indian industry. J Econ Behav Org 34(3):419–434
45. Page JM Jr (1980) Technical efficiency and economic performance: some evidence from
Ghana. Oxf Econ Pap 32(2):319–339
46. Aw BY, Batra G (1998) Technological capability and firm efficiency in Taiwan (China). World
Bank Econ Rev 12(1):59–79
47. Wu ZB, Yeung G, Mok V, Han Z (2007) Firm-specific knowledge and technical efficiency of
watch and clock manufacturing firms in China. Int J Prod Econ 107(2):317–332
48. Castiglione C (2012) Technical efficiency and ICT investment in Italian manufacturing firms.
Appl Econ 44(14):1749–1763
49. Mouelhi RBA (2009) Impact of the adoption of information and communication technologies
on firm efficiency in Tunisian manufacturing sector. Econ Model 26:961–967
50. Nishimizu M, Page JM Jr (1982) Total factor productivity growth, technological progress and
technical efficiency change: dimensions of productivity change in Yugoslavia, 1965–78. Econ
J 92(368):920–936
51. Kaynak H, Pagan JA (2003) Just-in-time purchasing and technical efficiency in the US
manufacturing sector. Int J Prod Res 41(1):1–14
52. Mahadevan R (2000) How technically efficient are Singapore’s manufacturing industries? Appl
Econ 32(15):2007–2014
53. Piesse J, Thirtle C (2000) A stochastic frontier approach to firm level efficiency, technological
change, and productivity during early transition in Hungary. J Comp Econ 28(3):473–501
54. Destefanis S, Sena V (2007) Patterns of corporate governance and technical efficiency in Italian
manufacturing. Manag Decis Econ 28(1):27–40
55. Sheehan M (1997) The evolution of technical efficiency in Northern Ireland manufacturing
sector, 1973–1985. Scott J Polit Econ 44(1):59–81
56. Roudaut N (2006) Influences of the business environment on manufacturing firms technical
efficiencies: the Cote d’Ivoire case. J Prod Anal 25(1):93–109
57. Sleuwaegen L, Goedhuys M (2003) Technical efficiency, market share and profitability of
manufacturing firms in Cote d’Ivoire: the technology trap. Camb J Econ 27(6):851–866
58. Hernandez-Trillo F, Pagan JA, Paxton J (2005) Start-up capital, microenterprises and technical
efficiency in Mexico. Rev Dev Econ 9(3):434–447
59. Xu N, Yuan Q, Jiang X, Chan KC (2015) Founder’s political connections, second generation
involvement, and family firm performance: evidence from China. J Corp Financ 33:243–259
60. Koop G (2001) Cross-sectoral patterns of efficiency and technical change. Int Econ Rev
42(1):73–103
61. Salas-Velasco M (2018) Production efficiency measurement and its determinants across OECD
countries: the role of business sophistication and innovation. Econ Anal Policy 57:6–73
62. Kim S (2003) Identifying and estimating sources of technical inefficiency in Korean manufac-
turing industries. Contemp Econ Policy 21(1):132–144
63. Zheng J, Liu X, Bigsten A (2003) Efficiency, technical progress, and best practice in Chinese
state enterprises. J Comp Econ 31(Part 1):134–152
64. Zhang X-G, Zhang S (2001) Technical efficiency in China’s iron and steel industry: evidence
from the new census data. Int J Appl Econ 15(2):199–211
65. Neogi C, Ghosh B (1994) Intertemporal efficiency variations in Indian manufacturing indus-
tries. J Prod Anal 5(3):301–324
66. Pitt MM, Lee L-F (1981) The measurement and sources of technical inefficiency in the
Indonesian weaving industry. J Dev Econ 9(1):43–64
67. Margono H, Sharma SC (2006) Efficiency and productivity analyses of Indonesian manufac-
turing industries. J Asian Econ 17(6):979–995
68. Charoenrat T, Harvie C (2014) The efficiency of SMEs in Thai manufacturing: a stochastic
frontier analysis. Econ Model 43:372–393
69. Charoenrat T, Harvie C, Amornkitvikai Y (2013) Thai manufacturing small and medium sized
enterprise technical efficiency: evidence from firm-level industrial census data. J Asian Econ
27:42–56
70. Onder AO, Deliktas E, Lenger A (2003) Efficiency in manufacturing industry of selected
provinces in Turkey: a stochastic frontier analysis. Emerg Mark Financ Trade 39(2):98–113
71. Taymaz E, Saatci G (1997) Technical change and efficiency in Turkish manufacturing
industries. J Prod Anal 8(4):461–475
72. Ferrantino MJ, Ferrier GD, Linvill CB (1995) Organizational form and efficiency: evidence
from Indian sugar manufacturing. J Comp Econ 21(1):29–53
73. Angeriz A, McCombie J, Roberts M (2006) Productivity, efficiency and technological change
in European Union regional manufacturing: data envelopment analysis approach. Manch Sch
74(4):500–525
74. Bernstein JI, Manuenas TP, Pashardes P (2004) Technical efficiency and US manufacturing
productivity growth. Rev Econ Stat 86(1):402–412
75. Kim S, Han G (2001) A decomposition of total factor productivity growth in Korean
manufacturing industries: a stochastic frontier approach. J Prod Anal 16(3):269–281
76. Martin-Marcos A, Suarez-Galvez C (2000) Technical efficiency of Spanish manufacturing
firms: a panel data approach. Appl Econ 32(10):1249–1258
77. Shee A, Stefanou SE (2015) Endogeneity corrected stochastic production frontier and technical
efficiency. Am J Agric Econ 97(3):939–952
78. Bayo-Moriones A, Lera-Lopez F (2007) A firm-level analysis of determinants of ICT adoption
in Spain. Technovation 27(6–7):352–366
79. Chaffai M, Kinda T, Plane P (2012) Textile manufacturing in eight developing countries: does
business environment matter for firm technical efficiency? J Dev Stud 48(10):1470–1488
80. Kalirajan K, Bhide S (2004) The post-reform performance of the manufacturing sector in India.
Asian Econ Pap 3(2):126–157
81. Mitra A, Sharma C, Vegazones-Varoudakis M-A (2012) Estimating impact of infrastructure on
productivity and efficiency of Indian manufacturing. Appl Econ Lett 19:779–783
82. Mini F, Rodriguez E (2000) Technical efficiency indicators in a Philippine manufacturing
sector. Int Rev Appl Econ 14(4):461–473
83. Pham HT, Dao TL, Reilly B (2010) Technical efficiency in the Vietnamese manufacturing
sector. J Int Dev 22:503–520
37 Application of Production Economics in the Electricity Distribution Sector
Ørjan Mydland and Gudbrand Lien

Contents
Introduction
Regulatory Systems
  Overview of Regulatory Frameworks
Efficiency Studies
  The Decision Problem
  Estimating Levels of Efficiency
  Review of Selected Studies
Economies of Scope and Scale
  Overview of Studies
  Empirical Comparison
Productivity and Productivity Change
  Measures of Productivity
  Empirical Comparison
Conclusion
Cross-References
References

Abstract

The companies in the electricity distribution sector are mainly natural
monopolies. Thus, in most countries they are regulated. It is therefore of interest
to investigate the efficiency and productivity of electricity distribution firms,
and to examine how the regulation of these firms affects their efficiency and
productivity over time. It is also of interest to examine economies of scope
and scale in the electricity distribution industry, since these potential economies
may be constrained under some regulatory regimes. In this chapter, we examine
empirical findings reported in earlier studies of the electricity distribution sectors,
with a focus on efficiency, economies of scope and scale, productivity, and
productivity change. Additionally, we report evidence on how regulatory regimes
affect and constrain the performance of firms, leading to the identification of
implications for policy settings. For future research, it would be desirable to
have more reliable and relevant data available to fortify conclusions and policy
recommendations.

Ø. Mydland · G. Lien
Inland School of Business and Social Sciences, Inland Norway University of Applied Sciences,
Lillehammer, Norway

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_37

Keywords

Regulatory regimes within the electricity distribution sector · Productivity and
efficiency analysis · Economies of scope and scale · Productivity change

Introduction

In this chapter, we summarize the empirical findings reported in earlier studies of
the electricity distribution sectors. The research literature on production economics
within the electricity distribution sector consists largely of research focusing
on efficiency, economies of scope and scale, and productivity and productivity
change. Additionally, regulatory regimes are often part of the focus within these
themes.
The electricity sector can be divided into three main activities: generation
of electricity; distribution and transmission of electricity; and sales/trading of
electricity. While generation and power trading are market-oriented and exposed
to competition because customers can buy electricity from different providers, the
customers of network services generally cannot choose between networks and are
charged by the service provider (distribution company) located in their specific
area. This means that the distribution companies have monopoly power within each
concession area, given that it would not be expedient to build parallel power lines to
introduce competition in this part of the electricity industry. There are two reasons
for this: first, society would not benefit from cities or landscapes with many more
power lines; and second, investment in building power grids is very large, and
therefore it is not possible for a second provider to compete effectively with the
established provider. Hence, the electricity distribution companies are regarded as
natural monopolies, and we expect average and marginal costs to be decreasing with
scale. As will be discussed later, this is the reason why this part of the electricity
industry is being regulated, to ensure that companies do not take advantage of
their monopoly power at the expense of consumers. By regulating the companies,
governments can improve the incentives for companies to increase productivity
and efficiency, thereby reducing costs and making it possible for prices paid by
customers to also be reduced.
In this chapter, we focus on four themes. In section “Regulatory Systems”, we
present an overview of different regulatory systems. Section “Efficiency Studies”
reports on efficiency studies, section “Economies of Scope and Scale” outlines
studies of economies of scope and scale, and section “Productivity and
Productivity Change” covers studies of productivity and productivity change.
Section “Conclusion” offers some final remarks.

Regulatory Systems

As electricity distribution companies have the characteristics of natural monopolies,
they are generally regulated to prevent or restrict their use of their monopoly powers.
Hence, an important aspect in studying the performance of electricity distribution
industries in different countries is the particular regulatory framework the com-
panies face. Because of the regulation of the electricity distribution companies,
efficiency and productivity analysis and other related issues have been subject to
increased interest by researchers working on electricity distribution companies,
providing examples of “real-world” applications of production economics of interest
to both consumers and policy makers. In this section, we review the similarities and
differences in regulatory frameworks across Europe and in the USA.

Overview of Regulatory Frameworks

To prevent electricity distribution companies from taking advantage of their market
power in order to set prices far above costs, thereby earning abnormal profit,
regulatory authorities have been established in most countries to regulate distribution
companies. Because it is also important that returns to investors are sufficiently
high to attract the needed levels of investment in power distribution networks, it
is necessary for the authorities to construct the regulation system such that investors
can earn an adequate rate of return on their capital, given the risks involved.
However, getting the balance right between the interests of consumers and investors
can be difficult.1
Regulatory systems are so diverse that, if we were to study the systems
across all countries, we suspect that we would not find any two of them to be
identical. However, we can classify the systems into two main categories:
“cost-plus” and “incentive” regulation, with many variations within these two categories.
Below we describe the two systems in turn and give some examples of how
regulation can reach its intended goals.

1 One possible approach is for the power grid to be wholly owned by the government. However,
the huge investment required may be too high for some countries to meet all costs of building and
maintaining the power grid. Quiggin [76] provides a discussion from an economic point of view
of the pros and cons of government ownership, as well as the costs and benefits of reforms of
electricity industries in past decades.

Cost-Plus Regulation
Cost-plus regulation was widely used in the twentieth century [20]. Under this
system, the regulated companies are allowed to set consumer prices equal to actual
costs plus an extra amount representing a reasonable rate of return on the invested
capital. The problem with this system is that it gives no incentives for companies
to increase productivity to reduce costs or limit cost rises. It is therefore likely that
cost-plus regulation will lead to higher prices for customers.
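The weak incentive under cost-plus regulation can be illustrated with a minimal sketch. The function name, the cost and capital figures, and the 6% allowed return are hypothetical values chosen for illustration, not figures from the studies cited. The sketch assumes only that allowed revenue equals actual cost plus an allowed return on invested capital.

```python
def cost_plus_revenue(actual_cost: float, invested_capital: float,
                      allowed_return: float) -> float:
    """Allowed revenue under a stylized cost-plus rule (illustrative only)."""
    return actual_cost + allowed_return * invested_capital


# Two hypothetical firms with the same capital but different operating costs.
capital = 500.0
allowed_return = 0.06  # assumed "reasonable" rate of return

lean_cost, wasteful_cost = 100.0, 120.0
lean_revenue = cost_plus_revenue(lean_cost, capital, allowed_return)
wasteful_revenue = cost_plus_revenue(wasteful_cost, capital, allowed_return)

# Profit is revenue minus actual cost: it does not depend on cost efficiency.
lean_profit = lean_revenue - lean_cost
wasteful_profit = wasteful_revenue - wasteful_cost

print(f"lean firm:     revenue={lean_revenue:.2f}, profit={lean_profit:.2f}")
print(f"wasteful firm: revenue={wasteful_revenue:.2f}, profit={wasteful_profit:.2f}")
```

In this stylized setting both firms earn the same profit, while customers of the higher-cost firm pay more, which is the incentive problem described above.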

Incentive Regulation
As the name indicates, an incentive regulatory system is meant to give the electricity
distribution companies incentives to increase productivity. This type of regulation
can take many forms. The regulator can set a price cap or a revenue cap for a
company, based on the performance of the company. For example, a price cap can
be designed so that the price increases that consumers face cannot exceed changes in
the consumer price index minus the expected productivity growth of the company.
Price cap regulation is often called “CPI-X regulation,” formerly known as “RPI-X
regulation,” and was first introduced by Stephen Littlechild in 1983 (see Littlechild
[59] and Amundsveen and Kvile [7] for more on this).2 Such a system will give the
company a clear incentive to increase productivity. The crucial and difficult issue
under such a regulatory system is to estimate the appropriate expected productivity
growth for the company. If the assumed productivity growth is set too low, the
company will have less incentive to increase productivity because it can earn above
normal profit. If expected productivity growth is set too high, the company may face
financial problems (see Coelli et al. [20] for further details).
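The role of the X factor can be sketched numerically. The example below is a simplified illustration under stated assumptions: the capped price grows at exactly CPI minus X each year, unit costs grow at CPI minus the firm's actual productivity growth, and price and unit cost both start at 100. The function names and all parameter values are hypothetical.

```python
def allowed_price_path(initial_price: float, cpi_inflation: float,
                       x_factor: float, years: int) -> list[float]:
    """Upper bound on the price in each year under a stylized CPI-X cap."""
    path = [initial_price]
    for _ in range(years):
        path.append(path[-1] * (1.0 + cpi_inflation - x_factor))
    return path


def margin_after(years: int, cpi_inflation: float, x_factor: float,
                 actual_growth: float, price0: float = 100.0,
                 cost0: float = 100.0) -> float:
    """Price minus unit cost after `years`, assuming the firm prices at the cap
    and its unit cost grows at CPI minus its actual productivity growth."""
    price = price0 * (1.0 + cpi_inflation - x_factor) ** years
    cost = cost0 * (1.0 + cpi_inflation - actual_growth) ** years
    return price - cost


# Hypothetical firm whose actual productivity growth is 2% per year.
for x in (0.01, 0.02, 0.03):
    margin = margin_after(years=5, cpi_inflation=0.03, x_factor=x,
                          actual_growth=0.02)
    print(f"X = {x:.2f}: margin after 5 years = {margin:+.2f}")
```

Under these assumptions, an X set below actual productivity growth leaves a positive margin, and an X set above it produces a negative margin, matching the two risks noted in the text.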
Under the revenue cap regulatory system, instead of setting a price cap, the
regulator can decide the maximum revenue the company can receive from its
operations. As a revenue cap system will affect the prices the company can
charge its customers, some call this approach an indirect price cap system [20].
Within the revenue cap system, there are also many ways of setting the revenue
cap. The Council of European Energy Regulators (CEER) [18] gives an overview of the
regulatory systems in most European countries, and also describes how price caps
and/or revenue caps are decided in each case. CEER [18] includes data from 26
European countries (Austria, Belgium, Croatia, Czech Republic, Denmark, Estonia,
Finland, France, Germany, Great Britain, Greece, Hungary, Iceland, Ireland, Italy,
Latvia, Lithuania, Luxembourg, Netherlands, Norway, Poland, Portugal, Romania,
Slovenia, Spain, and Sweden) submitted by the various national energy regulators.
Only three of these countries (Estonia, Greece, and Latvia) reported that they had a
cost-plus regulatory system. The majority of countries thus reported incentive-based
regulatory systems, with either a price cap, a revenue cap, or a combination of both.
As noted, in implementing such regulatory systems, it can be difficult to
determine the appropriate assumed increase in performance/productivity growth.
However, regulators can use benchmarking as a basis for making the companies
compete. By comparing productivity across companies, regulators can measure the
efficiency of each company, and then decide the revenue cap and/or price cap for
each company in the industry, incorporating a specified productivity improvement
based on the benchmarking information.

2 Retail price index (RPI) is an older measurement of inflation used in the UK. Today, the consumer
price index (CPI) serves as the main measure of inflation.

There are several ways to apply a revenue cap or price cap system in practice.
For example, in Norway, where a revenue cap (RC) is the main system used, the
cap for a given firm is calculated as: RC = (1 − ρ) × C + ρ × C*, where C is
the cost base (calculated from the company’s own costs), C* is the estimated cost
norm (from the benchmarking model), and ρ is a scalar between 0 and 1. The ρ
term dictates the strength of the incentives: the closer ρ is to 1, the more the cap
depends on the cost norm rather than on the firm’s own costs. Because 0 < ρ < 1,
the Norwegian model can be said to be a “hybrid” between cost-plus
and revenue cap systems. The last step in the Norwegian regulatory model is
a calibration procedure, where the regulator calibrates the estimated revenue cap
so that the sum of the revenue caps of all firms equals the sum of the costs in the
industry. Then, a firm with an average benchmarking score will receive a revenue
cap intended to give it an average rate of return (RoR), and a firm that scores higher
(lower) than the benchmarking average will receive a higher (lower) RoR [5]. The
advantage of this method is that, by measuring all firms, there is no need for the
regulator to set a target for the performance of each company. The “best” company
will receive the highest RoR, and the “worst” company will receive the lowest RoR.
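The revenue cap formula above can be illustrated with a minimal sketch. The cost figures and the value of ρ are hypothetical, and the calibration step is assumed here to be a simple proportional rescaling, since the text states only that the caps must sum to total industry cost and does not give the regulator's actual rule.

```python
def revenue_caps(cost_base, cost_norm, rho):
    """Uncalibrated caps: RC_i = (1 - rho) * C_i + rho * C*_i."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie between 0 and 1")
    return [(1.0 - rho) * c + rho * c_star
            for c, c_star in zip(cost_base, cost_norm)]


def calibrate(caps, cost_base):
    """Rescale caps so that they sum to total industry cost.

    Assumption: proportional rescaling. The regulator's actual
    calibration rule is not described in the text.
    """
    factor = sum(cost_base) / sum(caps)
    return [cap * factor for cap in caps]


cost_base = [100.0, 200.0, 150.0]   # hypothetical own costs C
cost_norm = [90.0, 210.0, 120.0]    # hypothetical cost norms C*
rho = 0.6                           # hypothetical incentive strength

caps = calibrate(revenue_caps(cost_base, cost_norm, rho), cost_base)
# Cap relative to own cost: a rough proxy for how generously each firm is treated.
rate = [cap / c for cap, c in zip(caps, cost_base)]
```

In this illustration the second firm, whose cost norm exceeds its own costs, ends up with a cap above its costs, while the other two firms receive caps below theirs. Setting rho to 0 would reproduce pure cost-plus regulation before calibration.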
However, one problem with benchmarking in an incentive regulation context is
that it is difficult to construct a model that gives correct measures of the performance
of each company, and one that is suitable for comparing all the companies. Even
if all electricity distribution companies in a country belong to the same industry
and thereby mainly have the same operations, there will be some heterogeneity
among the firms. To make the benchmarking model fair and to persuade the
electricity distribution companies to accept the regulatory model, the heterogeneity
among the firms must be taken into account.3 Furthermore, there is an issue
concerning strategic behavior in an incentive-based regulation system, as discussed,
for example, by Agrell and Teusch [3]. The strategic behavior of the regulated firm
can consist of deliberately reporting incorrect data to increase the efficiency results
in the regulation model. The firm can also make strategic investment decisions based
on how this will make the firm perform in the regulatory model. As illustrated by
Agrell and Teusch [3], a firm may find it is a good strategy to merge with one
or several of the most efficient firms, thereby making the merged business appear
relatively more efficient than the remaining unmerged firms. A general problem for
most regulators in the electricity distribution industry is that often the electricity
distribution companies are vertically integrated with the generation of electricity.
As the latter is in a competitive market, it might be tempting to transfer operation
costs to the regulated part of the company, to increase the “cost base” used in the
regulation model.

3 There are several cases where distribution companies have sued the regulatory authority. In some
cases, such action leads to rejection of the current regulation model, leading to reversion to cost-
plus regulation.
1438 Ø. Mydland and G. Lien

Efficiency Studies

The Decision Problem

Efficiency is an ex post measure of the extent to which managers have managed
their firms in an optimal way. In the literature, it is normal to define efficiency such
that it takes a value of 1.0 if the firm is managed in a fully efficient way, and a value
of zero in the unlikely case that it is managed totally inefficiently.
There is a range of different measures of efficiency, depending on the ways firms
operate or are regulated. Options include output-, input-, revenue-, cost-, profit-, and
productivity-oriented measures of efficiency (see O’Donnell [68], pp. 175–218 for
a detailed explanation of these efficiency measures). An output-oriented measure
is relevant in a situation where inputs are predetermined and the manager can
choose either the quantity of outputs or both the quantity of outputs and the mix
of outputs. When outputs are predetermined, an input-oriented measure is relevant.
If the input mix is predetermined, the relevant measure is input-oriented technical
efficiency. If the manager can choose both the input quantity and the input mix, the
relevant measure is input-oriented technical and mix efficiency. If the aim of the
manager is to minimize costs, and we assume outputs are predetermined, efficiency
is maximized by finding the optimal input mix. The input mix should accommodate
any economies of input substitution. For the case of cost minimization, the relevant
measure is cost efficiency, which can be decomposed into input-oriented technical
efficiency and input-oriented allocative efficiency (i.e., the degree of under- or
overutilization of inputs when the objective is to minimize costs).
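The decomposition of cost efficiency can be illustrated numerically. The sketch below uses the standard Farrell-type multiplicative form, cost efficiency = technical efficiency × allocative efficiency; all prices, input quantities, and the technical efficiency score are hypothetical, and the cost-minimizing bundle is simply assumed to be known.

```python
def cost(prices, quantities):
    """Total cost of an input bundle at given input prices."""
    return sum(w * x for w, x in zip(prices, quantities))


def decompose(prices, observed, cost_minimizing, theta):
    """Return (cost, technical, allocative) efficiency.

    theta is the input-oriented technical efficiency score, i.e. the
    factor by which all observed inputs could be scaled down while
    still producing the same outputs.
    """
    ce = cost(prices, cost_minimizing) / cost(prices, observed)
    te = theta
    ae = ce / te   # what remains after removing technical inefficiency
    return ce, te, ae


prices = [2.0, 1.0]            # hypothetical input prices
observed = [4.0, 8.0]          # hypothetical observed input bundle
cost_minimizing = [4.0, 2.0]   # hypothetical cost-minimizing bundle
theta = 0.75                   # hypothetical technical efficiency

ce, te, ae = decompose(prices, observed, cost_minimizing, theta)
```

Here the observed bundle costs 16, the radially contracted bundle costs 12, and the cost-minimizing bundle costs 10, so allocative efficiency is 10/12 and cost efficiency is 10/16.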
Several measures of outputs of electricity distribution companies have been used
in different studies and by different regulators around the world (e.g., [6, 42]).
A thorough discussion of choice of inputs and outputs for empirical analysis is
given in chapter “Empirical Analysis of Production Economics: Applications
to Banking”. The choice of output used depends on data availability. Among the
most frequently used output variables are size of the network (km), electricity
delivered, and total number of customers served; among the most frequently used
inputs are either total costs or capital, labor, and operational costs. Thus, electricity
distribution is a production process involving multiple outputs and multiple inputs,
implying that estimating a standard production function is not an option. In that
case, it is usual to assume that the outputs are more or less given, and the company
managers decide the level and mix of the inputs. The objective of an electricity
distribution company may be presumed to be to minimize costs given that their
outputs (electricity delivered, customers served, etc.) are exogenous, whereas their
inputs (the cost elements related to capital, labor, etc.) are endogenous. It follows
that the relevant performance measure is cost efficiency and, typically, either a cost
function or an input distance function has been estimated. In the case of the input
distance function, use is typically made of the duality between the input distance
function and the cost function.

Estimating Levels of Efficiency

The concept of efficiency measurement began with Farrell [29], who drew upon the
work of Debreu [24] and Koopmans [47].4 To estimate a firm’s level of efficiency
requires a benchmark or best practice for comparison. The standard technique is
to specify and estimate frontier models to identify the efficient frontier in terms of
input/output efficiency, cost efficiency, profit efficiency, etc.
Frontier models may be parametric or nonparametric. Parametric models are
based on an assumption that the data distribution can be defined in terms of a given
set of parameters, while for nonparametric models it is assumed that an infinite set
of parameters is needed to define a data distribution [77]. The most commonly used
nonparametric model is data envelopment analysis (DEA). DEA is an estimation
approach that involves enveloping a scatterplot of data points as tightly as possible
by forming multiple linear segments of the frontier. Application of DEA depends
on a few assumptions about the production technology. The main assumption is that
the technology set is convex. This kind of model is typically estimated using
mathematical programming. Estimation of parametric models using econometric
methods, typically a stochastic frontier model, is based on some a priori selected
parametric functional form.
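To make the idea of an enveloping frontier concrete, the following sketch covers only the simplest special case: one input, one output, and constant returns to scale. In that case the input-oriented DEA linear program has a closed-form solution, namely each firm's output/input ratio relative to the best observed ratio. With several inputs or outputs, or under variable returns to scale, a linear programming solver would be needed. The data are hypothetical.

```python
def dea_crs_scores(inputs, outputs):
    """Input-oriented CRS efficiency for one input and one output.

    With a single input and output, the CRS frontier is the ray through
    the origin with the highest observed output/input ratio.
    """
    productivity = [y / x for x, y in zip(inputs, outputs)]
    best = max(productivity)
    return [p / best for p in productivity]


inputs = [10.0, 20.0, 30.0, 40.0]    # hypothetical input quantities
outputs = [5.0, 16.0, 18.0, 20.0]    # hypothetical output quantities
scores = dea_crs_scores(inputs, outputs)
```

The second firm defines the frontier and receives a score of 1. A score of 0.75 for the third firm means it could, in principle, produce its output with 75% of its current input.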
A further classification is to distinguish between deterministic and stochastic
models. Stochastic models allow for individual observations to be somewhat
affected by random noise, generating an estimated frontier stripped of the effect
of random noise. In deterministic models, the noise is ignored and any variations
in data are considered to contain information about the efficiency of the firms.
Examples of deterministic and stochastic parametric models are corrected ordinary
least squares (COLS) and stochastic frontier analysis (SFA), respectively. Examples
of deterministic and stochastic nonparametric models are DEA and stochastic DEA,
respectively. There are many nonparametric DEA models and also many parametric
SFA models. Examples of the state of the art in the application of DEA and SFA
models in the literature on electricity distribution firms are the conditional DEA of
Bjørndal et al. [15] and Nieswand and Seifert [66], and the 4-component SFA models
of Kumbhakar and Lien [49], Filippini et al. [34], and Badunenko et al. [12]. In
addition, between nonparametric and parametric models lies a range of alternative
model specifications. In the electricity distribution industry, the semiparametric
stochastic nonparametric envelopment of data (StoNED) model of Kuosmanen [56]
is probably the most widely applied.
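As an illustration of a deterministic parametric frontier, the sketch below applies COLS to a log-linear cost function with a single output. This functional form and the data are assumptions made for illustration only. The frontier is the OLS line shifted down to the smallest residual, and all remaining deviation is attributed to inefficiency, which is exactly what makes the method deterministic.

```python
import math


def cols_cost_efficiency(outputs, costs):
    """COLS cost efficiency from ln(cost) = a + b * ln(output) + e."""
    ln_y = [math.log(y) for y in outputs]
    ln_c = [math.log(c) for c in costs]
    n = len(ln_y)
    mean_y = sum(ln_y) / n
    mean_c = sum(ln_c) / n
    # Ordinary least squares with one regressor.
    slope = (sum((a - mean_y) * (b - mean_c) for a, b in zip(ln_y, ln_c))
             / sum((a - mean_y) ** 2 for a in ln_y))
    intercept = mean_c - slope * mean_y
    residuals = [b - (intercept + slope * a) for a, b in zip(ln_y, ln_c)]
    # Shift the fitted line down so that it bounds all observations from below.
    shift = min(residuals)
    return [math.exp(-(e - shift)) for e in residuals]


outputs = [10.0, 20.0, 40.0, 80.0]   # hypothetical output levels
costs = [12.0, 20.0, 50.0, 70.0]     # hypothetical total costs
efficiency = cols_cost_efficiency(outputs, costs)
```

The firm with the smallest residual receives a score of exactly 1, and every other firm a score below 1. Because no noise term is separated out, a single firm with unusually low reported cost would lower the scores of all other firms.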
In the choice between DEA and SFA, a key question is whether one wants
flexibility in the frontier structure or precision in noise separation. DEA gives the
best fit to the empirical data, but the most robust estimation method is SFA, because

4 For reviews of models used and recent applications, see, for example, Kumbhakar and Lovell
[50], Coelli et al. [21], Bogetoft and Otto [16], Kumbhakar et al. [54], O’Donnell [68], and Sickles
and Zelenyuk [83].

it gives results that are not so sensitive to random noise in the data. The choice
may also depend on sample size, that is, the number of regulated companies. In
practical applications, the typically small sample sizes favor the choice of DEA.
Furthermore, the reliability of efficiency scores hinges on the ratio of variation
in efficiency to the variation in noise [11]. Flexible models that are robust in the
presence of random noise are generally seen as superior, and recent advances in this
direction include semi- and nonparametric SFA models (e.g., [73, 83]). Badunenko
et al. [11] provide a thorough analysis of the underlying assumptions of SFA and DEA
models, along with useful recommendations for regulators.
The methods of efficiency analysis used in the regulation of electricity distri-
bution companies vary between countries and sometimes also between states in a
country. Among OECD countries, deterministic methods such as DEA and COLS
are used in Australia, the Netherlands, Norway, Sweden, and the UK. In Finland,
StoNED is applied. In some countries, the parametric method SFA and related
frameworks are applied as complementary methods. In the US states of California
and Maine, total factor productivity (TFP) is used for benchmarking as an input
into the regulation process. There is also variation in how the regulators use
benchmarking results. In some countries, such as Norway, the UK, and the Netherlands,
the benchmarking results are used in a rather “mechanical” way, as an explicit part
of the regulatory process. In countries such as the USA, Australia, and Finland,
the benchmarking results are used only as an additional instrument for regulatory
decisions (Farsi et al. [30]).
The increasing use of frontier analysis in the electricity distribution sector
has raised concerns among regulators and companies regarding the reliability of
efficiency estimates (see, e.g., [82]). The empirical evidence suggests that the
efficiency estimates are sensitive to the benchmarking approach used (e.g., [34,
43, 49, 66]). The choice of frontier model and access to reliable data sets are two
obvious challenging issues. Other issues include how to handle noise in the data,
how to account for risk, and how to account for heterogeneity such as differences
in environmental factors between the distribution companies. These and related
challenges imply that the regulators and practitioners should take extra care in
choosing and applying efficiency models used to develop and apply regulations
intended to reward or punish companies to achieve better performance.

Review of Selected Studies

In this review of the efficiency literature, we focus only on modeling of efficiency in
the electricity distribution sector. Other parts of the energy sector are not included;
nor are analyses using SFA models of energy efficiency (e.g., [33]). We also exclude
models of market power (e.g., [52]). Hence, this is not a complete literature review
of efficiency. It mainly covers frequently cited efficiency studies of the electricity
distribution industry over the past two decades, along with a sample of some of the
most recently published studies within this field. In what follows, we first give a
chronological summary of the studies; second, methods applied are discussed; and
finally, the empirical findings in these studies are reviewed.

Chronology of Studies
Most of the studies of efficiency in the electricity distribution sector published
before about 2010 applied the nonparametric deterministic DEA method (Table 1),
while some applied COLS, SFA, or other frontier models. Among those published
after 2015, we found a higher proportion using the parametric SFA framework or
related approaches.
As indicated in Table 1, while there is considerable variation in the countries
examined in previous studies, quite a number of the newly published studies have
used data from and analyses of Norwegian distribution companies. This is mainly
because, for many years, the Norwegian regulator has made data used in the
regulation available to the public, whereas regulators in other countries generally
have not. This clearly represents a challenge because it limits the opportunity to
compare companies across different countries. In the future, easier access to reliable
and relevant electricity distribution data from a range of countries would increase the
scope for more useful comparative studies, potentially leading to better regulation
models.

Estimation Methods
In this section, we review the use of estimation methods in studies selected from
the literature. Tables 2 and 3, which include more details of the studies listed in
Table 1, together provide an overview of the estimation methods applied, functional
forms used, and the main empirical findings in these studies.
As noted, the nonparametric DEA method is widely used by regulators. It is
also the most frequently used method in the academic literature. Mathematical
programming is used to estimate the best practice frontier, which is then used
to evaluate the relative efficiency of different firms. The best practice frontier
is typically the boundary of the best possible technology, or production possibilities
set, estimated from the analyzed firms. The best possible technology is constructed according
to the minimal extrapolation principle. That is, DEA finds the smallest production
possibilities set consistent with the data (i.e., input-oriented, meaning use of the
least possible amount of inputs for a given output) while also satisfying certain
technological assumptions such as some form of convexity. The estimated frontiers
consist of multiple linear segments and, accordingly, are also known as
piecewise linear frontiers. DEA is based on the implicit assumption that there is no noise
in the data and that no information on the technology is missing. If the data used
are noisy, because of exogenous shocks, bad reporting practice, or ambiguity in
accounting practices, the results will not be valid. Consequently, firms would neither
be evaluated against the best possible standard, nor against a cautious standard [16].
Most of the earlier DEA studies listed in Tables 2 and 3 used the standard DEA
briefly described above.
Several attempts have been made to deal with noise in data in a DEA framework.
One approach is to use the bootstrapping technique, where step one involves

Table 1 Reviewed papers on the technical efficiency of electricity distributors

| Author(s) | Country | Sample | Method |
|---|---|---|---|
| Selected publications frequently cited up to 2015 | | | |
| Pacudan and De Guzman [72] | Philippines | Cross-sectional data. 15 firms | DEA |
| Resende [79] | Brazil | Cross-sectional 1997/98. 24 firms | DEA |
| Edvardsen and Førsund [26] | International | 1997, 5 European countries | DEA |
| Jamasb and Pollitt [43] | International | 63 utilities, 1997, 1998, and 1999, depending on country | DEA, COLS and SFA |
| Korhonen and Syrjänen [48] | Finland | 1998. 102 firms | DEA |
| Farsi and Filippini [31] | Switzerland | Balanced panel data, 1988–1996. 59 firms | SFA |
| Jamasb et al. [44] | USA | Cross-sectional data 2000. 28 firms | DEA |
| Agrell et al. [4] | Sweden | Unbalanced panel data 1996–2000. 238 firms | DEA |
| Giannakis et al. [38] | UK | 1991/92 to 1998/99. 12 utilities | DEA |
| Pombo and Taborda [75] | Colombia | Balanced panel data 1985–2001. 12 firms | DEA |
| Thakur et al. [86] | India | 2001–2002. 26 firms | DEA |
| Sadjadi and Omrani [80] | Iran | 2004. 38 firms | DEA, SFA |
| Growitsch et al. [39] | International | 2002. 499 firms | SFA |
| Cullmann [22] | Germany | 2001–2005. 200 firms | SFA |
| Kuosmanen [56] | Finland | 4-year averages over 2005–2008. 89 firms | StoNED |
| Çelen [19] | Turkey | 2002–2009. 21 firms | DEA |
| Dai and Kuosmanen [23] | Finland | 6-year averages over 2005–2010 | StoNED, DEA |
| Selected publications from 2015 | | | |
| Hafezalkotob et al. [41] | Iran | 2008. 38 firms | DEA |
| Kumbhakar et al. [53] | Norway | Unbalanced panel, 1998–2010. 128 firms | SFA |
| Mullarkey et al. [63] | Ireland | 2008. 26 firms | DEA |
| Orea et al. [70] | Norway | 2001–2004. 128 firms | SFA |
| Ervural et al. [28] | Turkey | 81 regions | DEA |
| Agrell and Brea-Solís [2] | Sweden | Balanced panel 2000–2006. 118 firms | Latent class SFA |
| Arcos-Vargas et al. [8] | Spain | 2011. 102 small firms | DEA |
| Kumbhakar and Lien [49] | Norway | Unbalanced panel 2000–2013. 134 firms | SFA |
| Li et al. [58] | Japan | Unbalanced panel data 1980–2010. 10 firms | Metafrontier |
| Orea and Jamasb [69] | Norway | Unbalanced panel 2004–2011. 129 firms | SFA |
| Bjørndal et al. [15] | Norway | Average data for 2008–2012. 123 firms | DEA, StoNED |
| Deng et al. [25] | China | 1999–2013. 31 firms | SFA |
| Filippini et al. [34] | New Zealand | 2000–2011. 28 firms | SFA |
| Makieła and Osiewalski [60] | Poland | | SFA |
| Orea et al. [71] | Norway | Balanced panel 2004–2011. 129 firms | SFA |
| Silva et al. [84] | 13 European companies | Cross-sectional over 2012–2014 | DEA, SFA with entropy estimation |
| Kumbhakar et al. [55] | Norway | Unbalanced panel 2000–2016. 146 firms | SFA |
| Musau et al. [64] | Norway | Unbalanced panel 2000–2016. 149 firms | SFA |
| Badunenko et al. [12] | Germany | Unbalanced panel 2006–2012. 242 firms | SFA |

obtaining the efficiency scores from the original DEA, and then, in step two, the
standard errors of the DEA estimators are obtained by bootstrapping [85]. An
alternative approach to bootstrapping for dealing with noise in data, called robust
optimization, was proposed by Sadjadi and Omrani [80]. However, as pointed out by
Coelli et al. [21], these DEA techniques for dealing with noise in data are designed
to deal with sampling variability, meaning that they indicate how the DEA estimates
would vary if a different random sample were to be selected. These approaches do
not account for random noise in the modeling of efficiency.
DEA results are influenced by operational heterogeneity among the electric-
ity distribution firms. To account for this heterogeneity, regulators often use
second-stage regressions to control for the observed differences in the operational
environments of firms on the estimated frontier (e.g., [4, 19, 48, 75]). The DEA
estimates (from the first step) are regressed on environmental variables, or z-variables,
to determine the effect of the operational environments on efficiency. The efficiency scores are
then adjusted to compensate for the effects of the z-variables.
A recent development to deal with observed heterogeneity is the conditional DEA
approach, as demonstrated by Bjørndal et al. [15] and Nieswand and Seifert [66].
They used a kernel estimation to restrict the reference sets (i.e., from the whole
sample to a restricted sample) with respect to z-variables, prior to measuring
actual performance, meaning that firms are compared only with others with similar
distribution environments.

Table 2 Selected publications frequently cited up to 2015. Estimation method, estimated technical efficiency, and main findings/implications

| Author(s) | Estimation method | Function | Technical efficiency | Findings/implications |
|---|---|---|---|---|
| Pacudan and De Guzman [72] | DEA | Inputs and outputs | 0.82 | Demand-side management incentives slightly improve efficiency |
| Resende [79] | DEA | Inputs and outputs | 0.78–0.84 | Access to reliable and relevant data is a challenge |
| Edvardsen and Førsund [26] | DEA | Inputs and outputs | 0.81 | Should apply a common technology for identifying multinational peers |
| Jamasb and Pollitt [43] | DEA, COLS and SFA | Inputs and outputs + input distance functions | DEA 0.54–0.79, COLS 0.60–0.63, SFA 0.62–0.72 | Estimated efficiency scores are sensitive to the applied frontier model and model specification |
| Korhonen and Syrjänen [48] | DEA-VRS | Inputs and outputs | 0.77 | |
| Farsi and Filippini [31] | Four SFA models | Cost function | | The ranking of firms varies significantly across models |
| Jamasb et al. [44] | DEA | Inputs and outputs | | |
| Agrell et al. [4] | DEA | Inputs and outputs | 0.68 | Focus on the theoretical foundation of the analysis |
| Giannakis et al. [38] | DEA, technical and service inefficiency | Inputs and outputs | CRS 0.76–0.90, VRS 0.83–0.05 | It is important to include service quality in the analysis |
| Pombo and Taborda [75] | DEA CRS + VRS | Inputs and outputs | 0.57–1.00 | Regulatory reform in 1994 created improved efficiency |
| Thakur et al. [86] | DEA, both TE and scale efficiency | Inputs and outputs | DEA-CRS 0.68, DEA-VRS 0.84 | |
| Sadjadi and Omrani [80] | DEA with uncertain data, SFA | Inputs and outputs with random component | SFA 0.79, DEA 0.80 | Efficiency estimates are smaller when uncertainty is accounted for |
| Growitsch et al. [39] | SFA | Input distance functions | Small utilities 0.45, medium 0.49, large 0.80 | Quality of services should be an integrated part of the efficiency analysis |
| Cullmann [22] | Latent class SFA | Input distance functions | Mean 0.91, Class 1 0.90, Class 2 0.92 | Allowing for different technologies results in more robust individual efficiency estimates |
| Kuosmanen [56] | StoNED | Cost function + inputs and outputs | 0.92 | Accounts for noise in the data and for heterogeneity |
| Çelen [19] | DEA + TOBIT | Inputs and outputs | 0.91–0.94 | Private ownership affects efficiency positively |
| Dai and Kuosmanen [23] | StoNED + DEA | Cost function + inputs and outputs | | Cluster-based analysis used to account for heterogeneity |

All parametric SFA approaches deal with random noise, caused by exogenous
shocks, bad reporting practice, or ambiguity in accounting practices. There also
exists a range of SFA panel data models, accounting for unobserved firm-specific
or time-specific heterogeneity (also called “common errors”) (e.g., [2, 31, 53, 70]).
However, using such models comes at a cost because, compared with DEA models,
parametric SFA models require more a priori assumptions about the structure of the
production possibility set. Using the SFA approach, it is presumed that the structure
of the production possibility set is known a priori through the specification of some
chosen functional form.
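For orientation, the composed-error structure that underlies these models can be written in a generic textbook form. The notation below is an assumption for illustration and is not taken from any particular study reviewed here: f and c denote the chosen functional forms for a production and a cost frontier, v is random noise, and u is the non-negative inefficiency term.

```latex
% Production frontier: inefficiency lowers output
\ln y_i = f(x_i;\beta) + v_i - u_i, \qquad u_i \ge 0, \qquad
\mathit{TE}_i = \exp(-u_i)

% Cost frontier: inefficiency raises cost
\ln C_i = c(y_i, w_i;\beta) + v_i + u_i, \qquad u_i \ge 0, \qquad
\mathit{CE}_i = \exp(-u_i)
```

Identification of u separately from v rests on distributional assumptions, for example a symmetric distribution for v and a one-sided distribution for u, which is one reason these models need more a priori structure than DEA.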
In addition to estimating the efficiency of firms, it is often of interest to learn
about factors that affect efficiency. In recent decades, there has been discussion
about the use of either a “one-step” model to simultaneously estimate the SFA model
and the way in which the efficiency term (u) depends on z-variables, versus using a
“two-step” procedure, where the first step is to estimate a standard SFA model, and
the second step is to estimate (via regression) the relationship between (estimated)
u and z. However, this issue is now resolved, and the efficiency determinants should
be estimated simultaneously as an integrated part of the SFA model [81, 89].
When including determinants of efficiency or z-variables in the SFA framework,
one can explain systematic differences in efficiency within and between firms. For
example, of the SFA studies listed in Table 3, Orea et al. [70], Agrell and Brea-
Solís [2], Kumbhakar and Lien [49], Li et al. [58], Orea and Jamasb [69], Deng
et al. [25], Orea et al. [71], Kumbhakar et al. [55] and Badunenko et al. [12] all
included efficiency determinants in their analyses. In the 4-component model (see
below), Badunenko et al. [12] included determinants of both transient and persistent
inefficiency.

Table 3 Selected publications from 2015. Estimation method, estimated technical efficiency, and main findings/implications

| Author(s) | Estimation method | Function | Technical efficiency | Findings/implications |
|---|---|---|---|---|
| Hafezalkotob et al. [41] | Robust data envelopment model (RDEA) | Inputs and outputs | 0.82–0.91 | Deals with discrete uncertain input and output data |
| Kumbhakar et al. [53] | “True” FE SFA | Inputs and outputs | 0.82–0.87 | |
| Mullarkey et al. [63] | DEA (6 alternative models) | Inputs and outputs | Vary from 0.65 to 0.91. Rural 0.91, urban 0.83 | Accounts for influence of heterogeneity on the DEA score |
| Orea et al. [70] | SFA combined with supervised dimension reduction | Cost function | 0.90–0.96 | Higher efficiency when controlling for environmental variables |
| Ervural et al. [28] | DEA | Inputs and outputs | | |
| Agrell and Brea-Solís [2] | Latent class SFA | Input distance functions | | Critique of latent class SFA |
| Arcos-Vargas et al. [8] | DEA | Inputs and outputs | 0.63–0.84 | |
| Kumbhakar and Lien [49] | 4-component SFA | Input distance functions | Long-run TE 0.53–0.93. Short-run TE 0.89–0.96 | A flexible model to explain firms’ observed practice and efficiency |
| Li et al. [58] | Metafrontier-based analysis | Metafrontier cost function | Group frontier 0.80 (1996–2010), Metafrontier 0.72 (1996–2010) | |
| Orea and Jamasb [69] | Nested latent class SFA | Cost function | ZISF 0.87, NLCSF 0.91 | Distinguishes between fully efficient and nonefficient firms |
| Bjørndal et al. [15] | Conditional DEA, StoNED | Inputs and outputs, Kernel function | Unconditional DEA 0.67, conditional DEA 0.70 | Accounting for heterogeneity in a DEA framework |
| Deng et al. [25] | Bayesian SFA | Input distance functions | 0.21–0.33 | Distinguishes between service quality and technical efficiency |
| Filippini et al. [34] | 4-component SFA | Cost function | 0.78–0.94 | Important to disentangle transient and persistent efficiency |
| Makieła and Osiewalski [60] | Bayesian SFA | Cost function | 0.90 | Framework where the model specification is determined based on information in the data |
| Orea et al. [71] | Spatial SFA | Cost function | 0.92 | How to control for unobserved environmental conditions |
| Silva et al. [84] | DEA, SFA with entropy estimation | Input distance functions | 0.12–1.00 | For small sample size, incomplete, and noisy data |
| Kumbhakar et al. [55] | A system approach, SFA + cost minimization | Production function + first-order conditions of cost min. | Transient TE 0.76–0.93. Persistent TE 0.95. Overall TE 0.76–0.93 | The costs of input misallocation of Norwegian electricity distribution firms are not negligible |
| Musau et al. [64] | A system approach, SFA + cost minimization | Production function + first-order conditions of cost min. | At mean: overall TE 0.85, transient TE 0.98, persistent TE 0.86 | Includes determinants of transient and persistent inefficiency, and determinants of allocative input misallocation |
| Badunenko et al. [12] | 4-component SFA with determinants | Input distance functions | At median: overall TE 0.68, transient TE 0.95, persistent TE 0.72 | Firms in East Germany achieve better persistent efficiency on average than those in West Germany |

Orea et al. [70] and Makieła and Osiewalski [60] also demonstrated how to apply
a form of machine learning to deal with a large set of environmental variables and
heterogeneity in an integrated panel data model. Orea et al. [71] presented a spatial
SFA model, which accounted for spatial heterogeneity.
An important recent development of SFA models is the 4-component model. In
this model, the first component captures the unobserved heterogeneity of firms, the
second component captures persistent inefficiency, the third component captures
transient inefficiency, and the last component captures random shocks. Application
of the 4-component model to electricity distribution has been demonstrated by
Kumbhakar and Lien [49], Filippini et al. [34], and Badunenko et al. [12]. Filippini
et al. [34] also developed a theoretical model that shows that an imperfectly
informed regulator may not be able to disentangle the persistent and transient parts
of cost efficiency. The empirical results of Filippini et al. [34], Kumbhakar and
Lien [49], and Badunenko et al. [12] show that the transient and persistent parts of
efficiency are quite different, and differ from efficiency measured by previous
approaches. From a regulatory point of view, these results imply that both persistent
and transient technical efficiency measures should be used in regulation.
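The four components described above can be summarized in a generic production-type specification. This is a sketch in assumed notation; the studies cited use cost or input distance functions, for which the signs on the inefficiency terms change accordingly.

```latex
y_{it} = \alpha_0 + f(x_{it};\beta) + \mu_i - \eta_i + v_{it} - u_{it}

% \mu_i        : unobserved firm heterogeneity
% \eta_i \ge 0 : persistent (time-invariant) inefficiency
% u_{it} \ge 0 : transient (time-varying) inefficiency
% v_{it}       : random shocks

\text{Overall efficiency}_{it} = \exp(-\eta_i) \times \exp(-u_{it})
```

Under this form, overall efficiency is the product of the persistent and transient parts, so a firm can look close to fully efficient in the short run while still carrying substantial persistent inefficiency.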
Applications of SFA to study the efficiency of the firms that distribute electricity
have overwhelmingly focused on technical inefficiency. The implicit assumption
is that all firms are either fully allocatively efficient (i.e., there is no input
misallocation), or that their level of allocative inefficiency is negligible and can
be ignored. The reason input misallocation has been ignored or not modeled in
earlier studies could be because of the unavailability of data. Moreover, modeling
input misallocation is complicated. In any event, studies that use the cost function
tend to ignore allocative inefficiency and focus on technical inefficiency, because
the cost function (or input distance function) typically used in the literature does
not allow for the separation of technical inefficiency and allocative inefficiency.
Kumbhakar and Wang [51] used a Monte Carlo study to illustrate that allocative
efficiency cannot be lumped together with technical efficiency in the estimation
of a cost function. Their results show that failure to include the cost of allocative
inefficiency or input misallocation in a cost function framework biases the cost
efficiency estimates. One modeling strategy to overcome this problem, illustrated
by Kumbhakar et al. [55], is to use a system approach, consisting of a production
function and the first-order conditions for cost minimization. Input misallocation
for each pair of inputs in that framework is modeled via the first-order conditions
of cost minimization. Their findings, based on data for the Norwegian electricity
distribution sector, show that the costs of input misallocation are not negligible.
In fact, the costs to the industry arising from input misallocation can be high, and
ranged on average for the firms analyzed from 9.4% to 10.9% in their analysis.
Kumbhakar et al. [55] included determinants of both transient and persistent
technical inefficiency, while Musau et al. [64] extended that model framework
to also include determinants of allocative inefficiency (input misallocation).
For a more thorough and technical description of SFA and DEA models see
Chaps. 8, “Stochastic Frontier Analysis: Foundations and Advances I”,
9, “Stochastic Frontier Analysis: Foundations and Advances II”, and 10, “Data
Envelopment Analysis: A Nonparametric Method of Production Analysis”.

Empirical Comparison
Figure 1 presents plots of estimated technical efficiency from the studies frequently
cited up to 2015 (Table 2), and Fig. 2 depicts the efficiency scores for those studies
published from 2015 (Table 3).
The average technical efficiency of the studies plotted in Figs. 1 and 2 is 0.77.
Compared with the studies cited up to 2015 (Fig. 1), there is, on average, a somewhat
higher efficiency score with less variability in the studies published from 2015
(Fig. 2). By our own calculations, we found that the estimates based on DEA
estimators are on average about 0.75, while estimates based on SFA are on average
about 0.84. Of course, these variations in estimated efficiency levels are probably
caused by a combination of differences between the countries investigated, the time
period analyzed, variable specifications, and estimation methods. Nevertheless, the
difference between the two averages highlights the variability of efficiency estimates
in published studies.

Fig. 1 Comparison of reported estimated technical efficiency in papers frequently cited up to
2015, using data from different countries/regions

Fig. 2 Comparison of reported estimated technical efficiency in papers cited from 2015, using
data from different countries

Economies of Scope and Scale

The concept of economies of scale dates back to Adam Smith and is a technical
concept that tells us what happens to output if we increase inputs. If we double
our inputs and output doubles, more than doubles, or less than doubles, then
we have constant, increasing, or decreasing returns to scale (RTS), respectively.
Economies of scale are considered to be a local measure, meaning that RTS can
differ for different parts of the production function. Hence, a measure of economies
of scale gives a good description of the local properties of the production function.
Furthermore, economies of scale are a useful tool in finding “optimal scale” (i.e.,
RTS = 1). A thorough treatment of scale elasticity and returns to scale can be found
in Chap. 17, “Scale Elasticity and Returns to Scale”.
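
The doubling argument above can be checked numerically. The following is a minimal sketch, assuming a hypothetical Cobb-Douglas production function whose parameters are purely illustrative (they do not come from any study in this chapter). It computes the scale elasticity as the derivative of log output with respect to a proportional scaling of all inputs, and also checks what happens to output when both inputs are doubled.

```python
import math


def output(x1, x2, a=0.6, b=0.5):
    """Hypothetical Cobb-Douglas production function (illustrative parameters)."""
    return x1 ** a * x2 ** b


def scale_elasticity(f, x1, x2, h=1e-6):
    """Numerical d ln f(t*x) / d ln t evaluated at t = 1 (central difference)."""
    up = math.log(f(x1 * (1 + h), x2 * (1 + h)))
    down = math.log(f(x1 * (1 - h), x2 * (1 - h)))
    return (up - down) / (math.log(1 + h) - math.log(1 - h))


# Local returns to scale at the input bundle (10, 20).
rts = scale_elasticity(output, 10.0, 20.0)

# Doubling both inputs: a ratio above 2 indicates increasing returns to scale.
doubled_ratio = output(20.0, 40.0) / output(10.0, 20.0)

print(f"RTS = {rts:.3f}, output ratio after doubling inputs = {doubled_ratio:.3f}")
```

For a Cobb-Douglas function the scale elasticity is the same at every input bundle; with a more flexible functional form the same routine would return different values at different points, which is the sense in which RTS is a local measure.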

The concept of economies of scope dates back to the work of William J. Baumol,
starting in 1974, and the work of John C. Panzar and Robert D. Willig, who
independently started similar work around the same time. Their main work is
collected in Baumol et al. [14]. The concept is to examine whether one multi-output
firm can produce two or more outputs more cheaply than several different firms
producing one output each. The existence of economies of scope means that there
are some positive synergies between different outputs produced by a single firm.
In the electricity industry, the two outputs can be present in a company that both
produces and distributes electricity. If we find the existence of economies of scope,
it means that the company is operating more efficiently than would be the case if
the two functions were completely separated. This section about economies of scale
and scope is rather brief; for a thorough treatment of both the theoretical concepts
and ideas and a critical review of the literature, see Chap. 45, “Cost Assessment
of (Un)bundling: Separation of Vertically Integrated Public Utilities”.
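
The comparison between one integrated firm and two specialized firms can be made concrete. The sketch below uses the measure commonly associated with Baumol et al. [14], namely the relative cost saving from joint production, EOS = [C(y1, 0) + C(0, y2) − C(y1, y2)] / C(y1, y2). The chapter does not state this formula explicitly, and the quadratic cost function and all coefficients here are hypothetical, chosen only for illustration.

```python
def cost(y_gen, y_dist):
    """Hypothetical quadratic cost function for generation and distribution.

    All coefficients are illustrative assumptions. The negative interaction
    term and the shared fixed cost are what create cost complementarity.
    """
    if y_gen == 0 and y_dist == 0:
        return 0.0
    fixed = 10.0
    return (fixed + 2.0 * y_gen + 3.0 * y_dist
            + 0.01 * y_gen ** 2 + 0.02 * y_dist ** 2
            - 0.015 * y_gen * y_dist)


def economies_of_scope(c, y_gen, y_dist):
    """Relative cost saving of joint production over two specialized firms."""
    separate = c(y_gen, 0.0) + c(0.0, y_dist)
    joint = c(y_gen, y_dist)
    return (separate - joint) / joint


eos = economies_of_scope(cost, 20.0, 30.0)
print(f"EOS = {eos:.3f}")  # a positive value indicates economies of scope
```

Note that evaluating the cost function at a zero output level is straightforward for a quadratic form, which is one reason this functional form is convenient in scope studies.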

Overview of Studies

Table 4 gives a chronological summary of most of the studies of economies of scope
and/or scale in the electricity distribution sector. As can be seen, most studies have
used a parametric estimation approach. Only one study of electricity utilities in
Spain [9] used the nonparametric DEA approach to analyze economies of scope
and scale. In the group of studies using parametric estimation approaches, a large
proportion used a quadratic cost function. The frequent choice of this functional
form is attributable to the problems that stem from the existence of zero value
outputs; zero values cannot be handled in a translog cost function because the log
of zero does not exist. There are several examples, albeit not within the electricity
sector, where the translog cost function has been used in scope studies, achieved
by replacing all the zero values by arbitrary small numbers. However, Fraquelli
et al. [36] showed that this is not an appropriate approach because the value chosen
for the arbitrary small number can have a substantial effect on the results. There
are also some problems with quadratic cost functions. Battese [13] showed that
if the number of zero values represents a large proportion of the total number of
observations, the parameter estimates may be biased because the zero values are not
really zero; they are nonexistent. In Triebs et al. [88], the flexible dummy variable
approach was applied. This approach has advantages in estimating economies of
scope and scale in industries with some multi-output firms when using quadratic,
translog, or any functional form because a different set of parameters is estimated
for the nonexisting/zero-value observations. It is also possible to test for different
technologies (e.g., one technology for the power producers and one technology for
the electricity distributors) when applying the flexible dummy variable approach.
Among the parametric studies, ordinary least squares (OLS), seemingly unrelated
regressions (SUR), and random effects models have all been used. Arocena [9]
showed that when the DEA approach is used, there is no need for assumptions on
functional form.
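
The zero-output problem described above is easy to see in a small numerical sketch. The code below is only an illustration of why the choice of substitute value matters for a log term; it is not the procedure used in any of the cited studies, and the substitute values are arbitrary assumptions.

```python
import math


def log_output_term(y, substitute):
    """Log of an output level, with a zero replaced by an arbitrary small number."""
    return math.log(y if y > 0 else substitute)


# The log of zero is undefined, so a translog term cannot be evaluated directly.
try:
    math.log(0.0)
    log_of_zero_defined = True
except ValueError:
    log_of_zero_defined = False

# Three equally arbitrary substitutes give very different regressor values.
substitutes = [1e-1, 1e-3, 1e-6]
log_values = [log_output_term(0.0, s) for s in substitutes]
spread = max(log_values) - min(log_values)

for s, v in zip(substitutes, log_values):
    print(f"substitute = {s:g}: log term = {v:.2f}")
print(f"spread across substitutes = {spread:.2f}")
```

Because the regressor for every zero-output firm moves by this much when the substitute changes, the estimated parameters, and hence the scope estimates, can move with it.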

Table 4 Summary of previous empirical scope and scale studies of combined generation and
transmission/distribution electricity companies

| Author(s) | Data | Functional form | Established method | Economies of scope and scale (a) |
|---|---|---|---|---|
| Kaserman and Mayo [46] | Cross-sectional (1981, US) | Quadratic cost function | OLS | Economies of scope (EOS) = 0.12 (at mean) |
| Kwoka [57] | Cross-sectional (1989, US) | Quadratic cost function | OLS | EOS = 0.27 (at median). Reports substantial costs of vertical integration, highest for the smallest utilities |
| Jara-Díaz et al. [45] | Panel data (1985–1996, Spain) | Quadratic cost function together with cost share equations | Seemingly unrelated regressions (SUR) | EOS = 0.065–0.28. Returns to scale (RTS) = 1.07 |
| Arocena [9] | Panel data (1989–1997, Spain) | Cost function | Data envelopment analysis (DEA) | EOS = 0.11–0.49. RTS: Reports that the scale efficiency of the largest units in the sample could improve by dividing them into smaller units |
| Piacenza and Vannoni [74] | Panel data (1994–2000, Italy) | Multiproduct & multistage Box–Cox transformed cost function | Nonlinear SUR | EOS = 0.24. RTS = 0.96. Reports findings of both vertical integration gains and horizontal economies of scope |
| Fetz and Filippini [32] | Panel data (1997–2005, Switzerland) | Quadratic cost function | Random effects GLS and random coefficient model | EOS = 0.50–0.60 (at median). RTS = 1.40–1.70 (at median). Presence of considerable economies of vertical integration and economies of scale for most companies |
| Arocena et al. [10] | Cross-sectional (2001, US) | Quadratic cost function together with cost share equations | SUR | EOS = 0.04–0.10. RTS = 1.01–1.03. Reports positive sample mean estimates of both vertical and horizontal economies |
| Meyer [61] | Panel data (2001–2008, US) | Quadratic cost function | OLS | EOS = 0.19–0.26, when separating generation from distribution and retail. Reports that, if generation and transmission remain integrated but are separated from distribution and retail, EOS = 0.08–0.10 |
| Triebs et al. [88] | Panel data (2000–2003, US) | Flexible technology translog cost functions with different specifications | SUR | EOS = 0.04 (0.40 when zeros are replaced by small numbers in the common cost function model). RTS = 1.10–1.13. Reports evidence of economies of scale and vertical economies of scope |
| Gugler et al. [40] | Panel data (2000–2010, 16 European countries) | Multistage quadratic cost function | Nonlinear SUR | EOS = 0.14–0.20. Reports that, at the median, integrated utilities have EOS = 0.14, while large-scale utilities have EOS = 0.20 |
| Mydland et al. [65] | Panel data (2004–2014, Norway) | Flexible technology translog and quadratic cost functions | Maximum likelihood, random effects | EOS = 0.10 (at median). RTS = 1.70, 1.15, 1.02 (at median), integrated, distribution, and generation, respectively |

(a) Estimates of economies of scale (measured by RTS) are for integrated firms

Five of the studies presented in Table 4 used data from the USA, and five from
European countries: two from Spain, and one each from Italy, Switzerland, and
Norway. In addition, Gugler et al. [40] examined economies of scope in 16 European
countries (Austria, Czech Republic, Finland, France, Germany, Greece, Hungary,
Italy, Latvia, Norway, Poland, Portugal, Spain, Sweden, Switzerland, and the UK).

Empirical Comparison

Estimates of economies of scope from all of the studies reported in Table 4 are
shown in Fig. 3, with results varying from 4% to 60%. The results are quite
consistent over the studies estimated with different methods and functional forms.
There is some variation in estimates between countries. The highest estimated
economies of scope reported are for Switzerland. The five studies from the USA
yielded some differences in estimates of economies of scope, ranging from below
10% to almost 30%.
In Fig. 4, the average estimated economies of scale are shown. Only Italy was
found to have decreasing RTS. All other studies estimated RTS above one, implying
a potential for cost reduction by increasing the size of the distribution firms.

Fig. 3 Comparison of the estimated economies of scope reported in studies using data from
different countries

Fig. 4 Comparison of the estimated returns to scale reported in studies using data from different
countries

Productivity and Productivity Change

Productivity is basically a measure of a firm’s performance in terms of the ratio
between outputs and inputs. The inputs here are the variables used to produce
an output (products and services), and they are controlled by the manager of the
production process. For variables that are controlled by the manager, it is possible
to add or subtract inputs or increase and decrease the amount of one or more
inputs. Variables that affect the production process but are not controlled by the
managers are often referred to as environmental variables, which can be weather
conditions, infrastructure, or legislation. A main part of the productivity theory
is the concept of production technology. Leaving environmental variables aside,
production technology can be said to be a technique, ability, or recipe to transform
inputs into outputs. In a historical context, we can say that the industrial revolution
represented a major shift in production technology when, for example, farmers went
from using animals to plow their fields to using tractor-drawn plows. If a farmer goes
from taking 2 weeks to plow the fields on his land, to taking only 2 days, because of
the replacement of animals by machinery, we would say this represents a change in
production technology.
Although the concept of productivity is quite simple and the studies on productivity
and productivity change include the same kinds of variables, there is debate about
how to construct indices to measure changes in the variable(s) over time. O’Donnell
[68] argued that indices must be consistent with measurement theory, and that they
should satisfy axioms Q1–Q8 listed in O’Donnell [67] (see footnote 5): Q1, weak
monotonicity; Q2, homogeneity of degree one; Q3, identity; Q4, homogeneity of degree
zero; Q5, proportionality; Q6, time–space reversal; Q7, transitivity; and Q8, circularity.
Two of the most important axioms are the transitivity and proportionality axioms.
The transitivity axiom is that a direct comparison of two firms should give the same
index number as an indirect comparison through a third firm. The proportionality
axiom is that, if firm A produced λ times as much as firm B, then the index that
compares the outputs of firm A with outputs of firm B must take value λ. Indices
satisfying axioms Q1–Q8 are called “proper indices” by O’Donnell [67, 68]. For a
further discussion of index numbers and productivity measurement see Chap. 19,
“Index Numbers and Productivity Measurement”.
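
The transitivity and proportionality axioms can be checked mechanically for any candidate index. The sketch below does so for a hypothetical fixed-weight linear output index; the weights and the output vectors of the three firms are illustrative assumptions, not data from any study cited here.

```python
def output_index(q_base, q_comp, weights=(0.4, 0.6)):
    """Hypothetical fixed-weight linear index comparing q_comp with q_base."""
    def aggregate(q):
        return sum(w * x for w, x in zip(weights, q))
    return aggregate(q_comp) / aggregate(q_base)


firm_a = (10.0, 5.0)
firm_b = (4.0, 8.0)
firm_c = (6.0, 6.0)

# Proportionality: a firm producing lam times firm B's outputs gets index lam.
lam = 3.0
scaled_b = tuple(lam * x for x in firm_b)
proportional = output_index(firm_b, scaled_b)

# Transitivity: comparing A with B directly equals comparing via firm C.
direct = output_index(firm_b, firm_a)
indirect = output_index(firm_b, firm_c) * output_index(firm_c, firm_a)

print(f"proportionality: index = {proportional:.3f} (lam = {lam})")
print(f"transitivity: direct = {direct:.4f}, indirect = {indirect:.4f}")
```

The index passes both checks because the same weights are used in every comparison. An index whose weights change with the pair of firms being compared would generally fail the transitivity check.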

Measures of Productivity

The Malmquist total factor productivity (TFP) index, introduced by Caves et al.
[17], is widely adopted in the studies of productivity and productivity change in
the electricity distribution industry (see footnotes 6 and 7). Total factor productivity can be decomposed
into several measures to explain the productivity results and changes over time. One
frequently used step is to decompose the Malmquist TFP index into the following
economically meaningful sources: 1) a technical efficiency change component, measured
relative to the best practice technologies; 2) a technical change component,
characterizing the shift in the best practice technologies; and 3) a scale component,
measuring the contribution of scale economies (see, e.g., [37]). Several other ways
to decompose the TFP index are possible. It is also possible to include some
environmental or z-variables into the TFP index in these kinds of analyses and
to decompose the TFP index to show the environmental effects (see, e.g., [68]).
Construction of the components requires some estimation procedures. The most
common methods for estimation of the decomposed components are variants of
either DEA or SFA.

5 Here Q refers to the output index. The arguments and interpretations are analogous for TFP, or for
input indices.
6 The Malmquist TFP index is also sometimes called any one of “Hicks–Moorsteen TFP index,”
“Hicks–Moorsteen index,” “Bjurek index,” “Bjurek productivity index,” or “Bjurek TFP index.”
See Sickles and Zelenyuk [83] for more on this.
7 Note that the Malmquist index does not satisfy all axioms Q1–Q8 listed in the previous subsection.

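
The three-component decomposition described in this subsection is usually written in multiplicative form. The expression below is a sketch under that assumption; the symbols TFPC, TEC, TC, and SC are our own shorthand for TFP change, technical efficiency change, technical change, and the scale component between periods t and t+1, and the exact definition of each component varies across studies.

```latex
\[
\mathrm{TFPC}_{t,t+1}
  = \mathrm{TEC}_{t,t+1} \times \mathrm{TC}_{t,t+1} \times \mathrm{SC}_{t,t+1},
\qquad
\ln \mathrm{TFPC}_{t,t+1}
  = \ln \mathrm{TEC}_{t,t+1} + \ln \mathrm{TC}_{t,t+1} + \ln \mathrm{SC}_{t,t+1}.
\]
```

In the logarithmic form the components add up, which is why results are often reported as the contribution of each source to the TFP growth rate.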
Table 5 gives a chronological summary of most of the studies of productivity
and productivity change in the electricity distribution industry. As shown, most
studies have used a nonparametric approach by applying DEA; only Tovar et al. [87]
used the parametric SFA method. Furthermore, all the studies used Malmquist TFP
indexes or some variation thereof. Three of the studies used data from Norwegian
electricity distribution companies, although for different time periods [27, 35, 62].
All three studies reported some productivity growth. Two of the studies in Table 5
used data from Brazil [78, 87] from the same period (1998–2005). Both reported
productivity growth mainly caused by technical change, even though the methods
chosen were different; the former used DEA while the latter used SFA. Table 5 also
includes one study using data from Australia for the period 1969–1999 [1].

Empirical Comparison

Figure 5 shows the TFP results for some of the studies presented in Table 5. There
are two studies from Norway for two different time periods. Førsund and Kittelsen
[35] reported a 2% increase in productivity using data for the period (1983–1989),
while Edvardsen et al. [27] reported an average increase in productivity of 5% over
two periods (1996–1997 and 1996–2003). Using data from Australia, Abbott [1]
reported an annual growth rate in productivity of 2.5% for 1969–1999. Using data
from Brazil, Ramos-Real et al. [78] reported an annual growth rate of 1.3% for
1998–2005.
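
Annual growth rates reported over periods of very different length are easier to compare once they are converted to cumulative changes. The sketch below assumes simple annual compounding and takes the number of years as the difference between the end and start years of each period; both are our assumptions and not calculations reported in the cited studies.

```python
def cumulative_growth(annual_rate, years):
    """Cumulative index level after compounding an annual growth rate."""
    return (1.0 + annual_rate) ** years


# Annual rates and periods as quoted in the text above.
australia = cumulative_growth(0.025, 1999 - 1969)  # Abbott [1]
brazil = cumulative_growth(0.013, 2005 - 1998)     # Ramos-Real et al. [78]

print(f"Australia 1969-1999: TFP index rises by a factor of {australia:.2f}")
print(f"Brazil 1998-2005: TFP index rises by a factor of {brazil:.2f}")
```

Under these assumptions, a 2.5% annual rate sustained over three decades roughly doubles TFP, whereas 1.3% over seven years raises it by just under a tenth.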

Conclusion

In this chapter, we have described different regimes in the regulation of electricity
distribution companies, and discussed the two main regulatory regimes of cost-plus
and incentive regulation. Furthermore, overviews of empirical studies on efficiency,
economies of scope and scale, and productivity and productivity change have been
provided. The studies presented are from 1991 to 2020/2021, a period during which
there have been substantial developments in the methods and knowledge relating to
the production economics of electricity distribution.
In most cases, we see that results regarding scale, scope, and productivity differ
more between the data used in each study (data from different countries) than
between different models, methods, and technical applications. However, this is not
entirely the case for the efficiency estimates. In addition to variation in efficiency
estimates between countries, the results are sensitive to how efficiency is modeled
and interpreted. We recommend that future empirical research and applications
should pay more attention to modeling and interpreting efficiency as well as to the
assumptions underlying each chosen model.

Table 5 Chronological summary of most of the studies of productivity and productivity change
within the electricity distribution industry

| Author(s) | Data set | Functional form/technology | Established method | Productivity/productivity change |
|---|---|---|---|---|
| Førsund and Kittelsen [35] | Panel data (1983–1989, Norway) | Piecewise linear frontier technology | DEA, Malmquist TFP index | Overall positive development in productivity of 2% per year, mainly caused by technical change |
| Abbott [1] | Panel data (1969–1999, Australia) | DEA technology | Malmquist TFP index | On average TFP increased with annual growth rate of 2.5% |
| Edvardsen et al. [27] | Panel data (1996–2003, Norway) | DEA technology | Bilateral Malmquist cost productivity index | Increase in productivity of 0.2% 1996–1997. Average productivity increase of 8.0% 1996–2003 |
| Ramos-Real et al. [78] | Panel data (1998–2005, Brazil) | DEA technology | DEA Malmquist approach, decomposing TFP using Malmquist TFP index | TFP index gives yearly growth rate of 1.3% 1998–2005. Technical change as main component with average growth of 2.1% per year |
| Tovar et al. [87] | Panel data (1998–2005, Brazil) | Translog input distance function | SFA, Malmquist index | Mainly caused by technical change, TFP increased in 1998–2003 and decreased in 2003–2005 |
| Miguéis et al. [62] | Yearly average panel data (2004–2007, Norway) | DEA technology | DEA, decomposed Malmquist index | On average the companies’ productivity increased between 2004 and 2005, did not change from 2005 to 2006, and decreased between 2006 and 2007 |

The studies reviewed in this chapter give estimates of levels of efficiency,
economies of scale, economies of scope, productivity change, and components
thereof. These estimates of performance should be useful for regulators, policy makers,
and perhaps also for individual firm managers and researchers. However, the
results would likely be more useful if there were more agreement on and adoption
of standard practices across studies. Such standards might include reporting the
determinants of, or reasons behind, the estimates of efficiency, scale, scope, and
productivity change. Of course, for future research in this direction, access to
relevant and reliable data is also a requirement.

Fig. 5 Comparison of TFP results from studies using data from different countries

It is highly likely that developments in business analytics will influence future
research. It would be interesting to consider how developments in areas such
as artificial intelligence, machine learning, and big data might open up new
possibilities for the collection and control of data. We expect these developments
to have significant effects on the efficiency of the collection, control, analysis, and
reporting of data by researchers, regulators, and companies.

Cross-References

Cost Assessment of (Un)bundling: Separation of Vertically Integrated Public Utilities
Data Envelopment Analysis: A Nonparametric Method of Production Analysis
Empirical Analysis of Production Economics: Applications to Banking
Index Numbers and Productivity Measurement
Scale Elasticity and Returns to Scale
Stochastic Frontier Analysis: Foundations and Advances I
Stochastic Frontier Analysis: Foundations and Advances II

Acknowledgments We would also like to thank participants in the project ElBench for providing
funding to complete this work.

References
1. Abbott M (2006) The productivity and efficiency of the Australian electricity supply industry.
Energy Econ 28(4):444–454
2. Agrell PJ, Brea-Solís H (2017) Capturing heterogeneity in electricity distribution operations: a
critical review of latent class modelling. Energy Policy 104:361–372
3. Agrell PJ, Teusch J (2020) Predictability and strategic behavior under frontier regulation.
Energy Policy 137:111140
4. Agrell PJ, Bogetoft P, Tind J (2005) DEA and dynamic yardstick competition in Scandinavian
electricity distribution. J Prod Anal 23(2):173–201
5. Amundsveen R, Kvile HM (2015) The development and application of an incentive regulation
model – a balancing act. The ICER Chronicle 3:17–23
6. Amundsveen R, Kvile HM (2016) Balancing incentives: the development and application of a
regulatory benchmarking model. In: Greene WH, Khalaf L, Sickles R, Veall M, Voia MC (eds)
Productivity and efficiency analysis. Springer, Cham, pp 233–247
7. Amundsveen R, Kvile HM (2017) Productivity development among Norwegian electricity
distribution system operators – the impact of capital assessment and technological progress.
Data Envelop Anal J 3(1–2):93–118
8. Arcos-Vargas A, Núñez-Hernández F, Villa-Caro G (2017) A DEA analysis of electricity
distribution in Spain: an industrial policy recommendation. Energy Policy 102:583–592
9. Arocena P (2008) Cost and quality gains from diversification and vertical integration in the
electricity industry: a DEA approach. Energy Econ 30(1):39–58
10. Arocena P, Saal DS, Coelli T (2012) Vertical and horizontal scope economies in the regulated
US electric power industry. J Ind Econ 60(3):434–467
11. Badunenko O, Henderson DJ, Kumbhakar SC (2012) When, where and how to perform
efficiency estimation. J R Stat Soc A Stat Soc 175(4):863–892
12. Badunenko O, Cullmann A, Kumbhakar SC, Nieswand M (2021) The effect of restructuring
electricity distribution systems on firms’ persistent and transient efficiency: the case of
Germany. Energy J 42(4)
13. Battese GE (1997) A note on the estimation of Cobb-Douglas production functions when some
explanatory variables have zero values. J Agric Econ 48(1–3):250–252
14. Baumol WJ, Panzar JC, Willig RD (1982) Contestable markets and the theory of industry
structure. Harcourt Brace Jovanovich, New York
15. Bjørndal E, Bjørndal M, Cullmann A, Nieswand M (2018) Finding the right yardstick:
regulation of electricity networks under heterogeneous environments. Eur J Oper Res 265(2):
710–722
16. Bogetoft P, Otto L (2010) Benchmarking with DEA, SFA, and R. Springer, New York
17. Caves DW, Christensen LR, Diewert WE (1982) The economic theory of index numbers and
the measurement of input, output, and productivity. Econometrica 50(6):1393–1414
18. CEER (Council of European Energy Regulators) (2019) Incentive regulation and benchmarking
work stream, Report on regulatory frameworks for European Energy Networks. CEER
report, C18-IRB-38-03
19. Çelen A (2013) Efficiency and productivity (TFP) of the Turkish electricity distribution
companies: an application of two-stage (DEA&Tobit) analysis. Energy Policy 63:300–310
20. Coelli T, Estache A, Perelman S, Trujillo L (2003) A primer on efficiency measurement for
utilities and transport regulators. The World Bank, Washington, DC
21. Coelli TJ, Rao DSP, O’Donnell CJ, Battese GE (2005) An introduction to efficiency and
productivity analysis. Springer, New York
22. Cullmann A (2012) Benchmarking and firm heterogeneity: a latent class analysis for German
electricity distribution companies. Empir Econ 42(1):147–169
23. Dai X, Kuosmanen T (2014) Best-practice benchmarking using clustering methods: application
to energy regulation. Omega 42(1):179–188
24. Debreu G (1951) The coefficient of resource utilization. Econometrica 19(3):273–292
25. Deng NQ, Liu LQ, Deng YZ (2018) Estimating the effects of restructuring on the technical
and service-quality efficiency of electricity companies in China. Util Policy 50:91–100
26. Edvardsen DF, Førsund FR (2003) International benchmarking of electricity distribution
utilities. Resour Energy Econ 25(4):353–371
27. Edvardsen DF, Førsund F, Hansen W, Kittelsen SAC, Neurauter T (2007) Productivity and
regulatory reform of Norwegian electricity distribution utilities. In: Coelli T, Lawrence D (eds)
Performance measurement and regulation of network utilities. Edward Elgar, Cheltenham, pp
97–131
28. Ervural BC, Ervural B, Zaim S (2016) Energy efficiency evaluation of provinces in Turkey
using data envelopment analysis. Soc Behav Sci 235:139–148
29. Farrell MJ (1957) The measurement of productive efficiency. J R Stat Soc 120:253–281
30. Farsi M, Fetz A, Filippini M (2007) Benchmarking and regulation in the electricity distribution
sector. In: Servizi pubblici: nuove tendenze nella regolamentazione nella produzione e nel
finanziamento. Franco Angeli, pp 159–176
31. Farsi M, Filippini M (2004) Regulation and measuring cost-efficiency with panel data models:
application to electricity distribution utilities. Rev Ind Organ 25(1):1–19
32. Fetz A, Filippini M (2010) Economies of vertical integration in the Swiss electricity sector.
Energy Econ 32(4):1325–1330
33. Filippini M, Hunt LC (2012) US residential energy demand and energy efficiency: a stochastic
demand frontier approach. Energy Econ 34(5):1484–1491
34. Filippini M, Greene W, Masiero G (2018) Persistent and transient productive inefficiency in a
regulated industry: electricity distribution. Energy Econ 69:325–334
35. Førsund FR, Kittelsen SA (1998) Productivity development of Norwegian electricity distribution
utilities. Resour Energy Econ 20(3):207–224
36. Fraquelli G, Piacenza M, Vannoni D (2004) Scope and scale economies in multi-utilities:
evidence from gas, water and electricity combinations. Appl Econ 36(18):2045–2057
37. Fried HO, Lovell CK, Schmidt SS (2008) Efficiency and productivity. In: Fried HO, Lovell
CK, Schmidt SS (eds) The measurement of productive efficiency and productivity growth, 3rd
edn. Oxford University Press, Oxford, pp 3–91
38. Giannakis D, Jamasb T, Pollitt M (2005) Benchmarking and incentive regulation of quality of
service: an application to the UK electricity distribution networks. Energy Policy 33(17):2256–
2271
39. Growitsch C, Jamasb T, Pollitt M (2009) Quality of service, efficiency and scale in network
industries: an analysis of European electricity distribution. Appl Econ 41(20):2555–2570
40. Gugler K, Liebensteiner M, Schmitt S (2017) Vertical disintegration in the European electricity
sector: empirical evidence on lost synergies. Int J Ind Organ 52(5):450–478
41. Hafezalkotob A, Haji-Sami E, Omrani H (2015) Robust DEA under discrete uncertain data: a
case study of Iranian electricity distribution companies. J Ind Eng Int 11(2):199–208
42. Jamasb T, Pollitt M (2001) Benchmarking and regulation: international electricity experience.
Util Policy 9(3):107–130
43. Jamasb T, Pollitt M (2003) International benchmarking and regulation: an application to
European electricity distribution utilities. Energy Policy 31(15):1609–1622
44. Jamasb T, Nillesen P, Pollitt M (2004) Strategic behaviour under regulatory benchmarking.
Energy Econ 26(5):825–843
45. Jara-Díaz S, Ramos-Real FJ, Martínez-Budría E (2004) Economies of integration in the
Spanish electricity industry using a multistage cost function. Energy Econ 26(6):995–1013
46. Kaserman DL, Mayo JW (1991) The measurement of vertical economies and the efficient
structure of the electric utility industry. J Ind Econ 39(5):483–502
47. Koopmans T (1951) Activity analysis of production and allocation. Wiley, New York
48. Korhonen PJ, Syrjänen MJ (2003) Evaluation of cost efficiency in Finnish electricity distribution.
Ann Oper Res 121(1–4):105–122
49. Kumbhakar SC, Lien G (2017) Yardstick regulation of electricity distribution – disentangling
short-run and long-run inefficiencies. Energy J 38(5):17–37
50. Kumbhakar SC, Lovell CAK (2000) Stochastic frontier analysis. Cambridge University Press,
Cambridge, UK
51. Kumbhakar SC, Wang HJ (2006) Pitfalls in the estimation of a cost function that ignores
allocative inefficiency: a Monte Carlo analysis. J Econ 134(2):317–340
52. Kumbhakar SC, Baardsen S, Lien G (2012) A new method for estimating market power with
an application to Norwegian sawmilling. Rev Ind Organ 40(2):109–129
53. Kumbhakar SC, Amundsveen R, Kvile HM, Lien G (2015a) Scale economies, technical change
and efficiency in Norwegian electricity distribution, 1998–2010. J Prod Anal 43(3):295–305
54. Kumbhakar SC, Wang HJ, Horncastle AP (2015b) A practitioner’s guide to stochastic frontier
analysis using Stata. Cambridge University Press, New York
55. Kumbhakar SC, Mydland Ø, Musau A, Lien G (2020) Disentangling cost of persistent and
transient technical and input misallocation: the case of Norwegian electricity distribution firms.
Energy J 41(3):143–160
56. Kuosmanen T (2012) Stochastic semi-nonparametric frontier estimation of electricity distribution
networks: application of the StoNED method in the Finnish regulatory model. Energy
Econ 34(6):2189–2199
57. Kwoka JE (2002) Vertical economies in electric power: evidence on integration and its
alternatives. Int J Ind Organ 20(5):653–671
58. Li HZ, Kopsakangas-Savolainen M, Xiao XZ, Lau SY (2017) Have regulatory reforms
improved the efficiency levels of the Japanese electricity distribution sector? A cost
metafrontier-based analysis. Energy Policy 108:606–616
59. Littlechild S (2014) RPI-X, competition as a rivalrous discovery process, and customer
engagement–Paper presented at the conference the British utility regulation model: beyond
competition and incentive regulation? Util Policy 31:152–161
60. Makieła K, Osiewalski J (2018) Cost efficiency analysis of electricity distribution. Energy J
39(4):31–56
61. Meyer R (2012) Economies of scope in electricity supply and the costs of vertical separation
for different unbundling scenarios. J Regul Econ 42(1):95–114
62. Miguéis VL, Camanho AS, Bjørndal E, Bjørndal M (2012) Productivity change and innovation
in Norwegian electricity distribution companies. J Oper Res Soc 63(7):982–990
63. Mullarkey S, Caulfield B, McCormack S, Basu B (2015) A framework for establishing
the technical efficiency of electricity distribution counties (EDCs) using data envelopment
analysis. Energy Convers Manag 94:112–123
64. Musau A, Kumbhakar SC, Mydland Ø, Lien G (2021) Determinants of allocative and technical
inefficiency in stochastic frontier models: an analysis of Norwegian electricity distribution
firms. Eur J Oper Res 288:983–991
65. Mydland Ø, Kumbhakar SC, Lien G, Amundsveen R, Kvile HM (2020) Economies of scope
and scale in the Norwegian electricity industry. Econ Model 88:39–46
66. Nieswand M, Seifert S (2018) Environmental factors in frontier estimation–A Monte Carlo
analysis. Eur J Oper Res 265(1):133–148
67. O’Donnell CJ (2016) Using information about technologies, markets and firm behaviour to
decompose a proper productivity index. J Econ 190(2):328–340
68. O’Donnell CJ (2018) Productivity and efficiency analysis: an economic approach to measuring
and explaining managerial performance. Springer Nature, Singapore
69. Orea L, Jamasb T (2017) Regulating heterogeneous utilities: a new latent class approach with
application to the Norwegian electricity distribution networks. Energy J 38(4):101–128
70. Orea L, Growitsch C, Jamasb T (2015) Using supervised environmental composites in
production and efficiency analyses: an application to Norwegian electricity networks. Compet
Regul Netw Ind 16(3):260–287
71. Orea L, Álvarez IC, Jamasb T (2018) A spatial stochastic frontier model with omitted variables:
electricity distribution in Norway. Energy J 39(3):93–116
72. Pacudan R, De Guzman E (2002) Impact of energy efficiency policy to productive efficiency
of electricity distribution industry in the Philippines. Energy Econ 24(1):41–54
73. Parmeter CF, Kumbhakar SC (2014) Efficiency analysis: a primer on recent advances. Found
Trends Econometr 7:191–385, Now
74. Piacenza M, Vannoni D (2009) Vertical and horizontal economies in the electric utility
industry: an integrated approach. Ann Public Coop Econ 80(3):431–450
75. Pombo C, Taborda R (2006) Performance and efficiency in Colombia’s power distribution
system: effects of the 1994 reform. Energy Econ 28(3):339–369
76. Quiggin J (2001) Market-oriented reform in the Australian electricity industry. Econ Labour
Relat Rev 12(1):126–150
77. Racine JS (2019) An introduction to the advanced theory of nonparametric econometrics: a
replicable approach using R. Cambridge University Press, Cambridge
78. Ramos-Real FJ, Tovar B, Iootty M, De Almeida EF, Pinto Jr. HQ (2009) The evolution and
main determinants of productivity in Brazilian electricity distribution 1998–2005: An empirical
analysis. Energy Econ 31(2):298–305
79. Resende M (2002) Relative efficiency measurement and prospects for yardstick competition in
Brazilian electricity distribution. Energy Policy 30(8):637–647
80. Sadjadi SJ, Omrani H (2008) Data envelopment analysis with uncertain data: an application
for Iranian electricity distribution companies. Energy Policy 36(11):4247–4254
81. Schmidt P (2011) One-step and two-step estimation in SFA models. J Prod Anal 36(2):201–203
82. Shuttleworth G (2005) Benchmarking of electricity networks: practical problems with its use
for regulation. Util Policy 13(4):310–317
83. Sickles RC, Zelenyuk V (2019) Measurement of productivity and efficiency: theory and
practice. Cambridge University Press, Cambridge
84. Silva E, Macedo P, Soares I (2019) Maximum entropy: a stochastic frontier approach for
electricity distribution regulation. J Regul Econ 55(3):237–257
85. Simar L, Wilson P (1998) Sensitivity of efficiency scores: How to bootstrap in nonparametric
frontier models. Manag Sci 44(1):49–61
86. Thakur T, Deshmukh SG, Kaushik SC (2006) Efficiency evaluation of the state owned electric
utilities in India. Energy Policy 34(17):2788–2804
87. Tovar B, Ramos-Real FJ, De Almeida EF (2011) Firm size and productivity. Evidence from
the electricity distribution industry in Brazil. Energy Policy 39(2):826–833
88. Triebs TP, Saal DS, Arocena P, Kumbhakar SC (2016) Estimating economies of scale and
scope with flexible technology. J Prod Anal 45(2):173–186
89. Wang HJ, Schmidt P (2002) One-step and two-step estimation of the effects of exogenous
variables on technical efficiency levels. J Prod Anal 18(2):129–144
38 Production and the Environment
Moriah Bostian and Tommy Lundgren

Contents
Introduction
Modeling Production and the Environment
Externalities, Efficiency, and Productivity
  Radial Models
  Non-radial Models
  Network and Multi-function Models
Valuation and Substitution
  Valuation
  Substitution
Environmental Policy and Firm Performance
Conclusion
References

We would like to thank Rolf Färe and Shawna Grosskopf, as well as our reviewer. This work has
greatly benefited from their comments.
M. Bostian
Department of Economics, Lewis & Clark College, Portland, OR, USA
Department of Economics, Centre for Environmental and Resource Economics (CERE), Umeå
University, Umeå, Sweden
e-mail: [email protected]
T. Lundgren
Department of Economics, Centre for Environmental and Resource Economics (CERE), Umeå
University, Umeå, Sweden
e-mail: [email protected]

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_38

Abstract

Production theory offers a mathematical framework for modeling important
relationships between production activities and the environment. These include
the generation and valuation of production-related environmental effects, envi-
ronmental contributions to production processes, and production effects of
environmental management practices. In this chapter, we review the seminal
and recent empirical work in each of these areas. We anchor our review to
multi-input/multi-output production processes, as these make up a large share
of environmental applications in the field, and their associated models offer the
practitioner considerable flexibility in terms of specification and estimation.

Keywords

Pollution-generating technology · Externalities · Valuation ·
Competitiveness · Environmental performance

Introduction

Production theory offers a mathematical framework for modeling important
relationships between production activities and the environment. These include the
generation and valuation of production-related environmental effects, environmental
contributions to production processes, and production effects of environmental
management practices. In this chapter, we review the seminal and recent empirical
work in each of these areas. We anchor our review to multi-input/multi-output
production processes, as these make up a large share of environmental applications
in the field, and their associated models offer the practitioner considerable flexibility
in terms of specification and estimation.
The chapter unfolds as follows: In section “Modeling Production and the
Environment,” we summarize the relevant theory for modeling multi-input/multi-
output production and the environment and the duality theory underlying production
and valuation. We intend this first section to serve as a unifying reference for
modeling pollution generation from production activities, the economic costs
of environmental protection, and efficiency and productivity. We also provide
additional references for the interested reader who wishes to delve deeper into the
theory. In the remaining sections, we explain how this basic modeling framework
has been adapted and applied to the environment in practice. We begin in sec-
tion “Externalities, Efficiency, and Productivity” with externalities. This includes
some consideration of what constitutes an externality in the production context,
applications to better understand how the production process generates externalities,
and how to incorporate externalities into measures of productivity and efficiency.
Then in section “Valuation and Substitution” we turn to the topic of valuation,
distinguishing between the benefits and costs of pollution abatement. Here we
review how others have used production theory to estimate prices for externalities,
as well as other non-marketed environmental goods and services. For our final
application area in section “Environmental Policy and Firm Performance,” we
examine firm competitiveness in relation to the environment. This includes a review
of the empirical evidence for and against the well-known Porter hypothesis and
the growing field of what is commonly termed “corporate social responsibility
(CSR).” We conclude in section “Conclusion” by considering the current trajectory
of empirical work in the field.

Modeling Production and the Environment

To examine production and the environment through the lens of production theory,
we begin with a general model of the production technology as a multi-input/multi-
output process. For notation, let $x \in \mathbb{R}^N_+$ represent a vector of inputs and
$y \in \mathbb{R}^M_+$ a vector of intended production outputs. In practice, producing y from x
can (and often does) also generate unintended environmental effects, the focus
of this chapter. Let $u \in \mathbb{R}^J_+$ represent these unintended environmental effects
(e.g., pollution). A common approach in the literature is to incorporate u into (1)
as an undesirable output, in order to model the production of both intended and
unintended outputs. The resulting technology is often labeled the “environmental
production technology,” “joint production technology,” or the “pollution-generating
technology.” We will use the latter here, abbreviated as PGT, as it speaks more
directly to the undesirable nature of u. We define the PGT, T , as

T = {(x, y, u) : x can produce y and u}, (1)

which can also be represented in terms of the feasible output set, P (x),

P (x) = {(y, u) : (x, y, u) ∈ T }, (2)

or the input requirement set, L(y, u),

L(y, u) = {x : (x, y, u) ∈ T }. (3)

Distance functions offer a functional representation of the PGT and are widely
applied in the field to incorporate environmental effects into estimates of produc-
tivity and efficiency, valuation, and policy analysis. These models take two general
forms, radial [92] distance functions and non-radial [70] or directional distance
functions [25]. The Shephard output and input distance functions, $D_O$ and $D_I$, are
defined, respectively, as

$$D_O(x, y, u) = \inf\{\theta : (y/\theta, u) \in P(x)\}, \quad 0 < \theta \le 1, \tag{4}$$

and

$$D_I(x, y, u) = \sup\{\lambda : x/\lambda \in L(y, u)\}, \quad \lambda \ge 1, \tag{5}$$

where $D_O$ measures the feasible radial expansion of intended outputs from the
origin and $D_I$ measures the feasible radial contraction of inputs to the origin. A
distance value $\theta = 1$ or $\lambda = 1$ implies that an observation is operating on the
corresponding output or input frontier. Numerous studies have also adapted the
radial input distance function to undesirable outputs, which we denote as $D_U$,

$$D_U(x, y, u) = \sup\{\lambda : (u/\lambda, x/\lambda) \in L(y)\}, \quad \lambda \ge 1. \tag{6}$$
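
As a minimal numerical sketch of definitions (4) and (5), the Python code below evaluates the Shephard output and input distance functions by bisection. The technology used here, y ≤ sqrt(x·u) with one input, one good, and one bad, is a hypothetical example chosen only because it can be checked by hand; it is not taken from the chapter, and the function names are likewise illustrative.

```python
import math


def feasible(x, y, u):
    """Hypothetical PGT (an assumption for illustration): y <= sqrt(x * u)."""
    return x >= 0 and u >= 0 and 0 <= y <= math.sqrt(x * u)


def shephard_output_distance(x, y, u, tol=1e-12):
    """inf{theta in (0, 1] : (y / theta, u) in P(x)}, found by bisection."""
    if not feasible(x, y, u):
        raise ValueError("the observation must be feasible")
    if y == 0:
        return 0.0
    lo, hi = 0.0, 1.0  # hi is always feasible; lo is infeasible in the limit
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(x, y / mid, u):
            hi = mid
        else:
            lo = mid
    return hi


def shephard_input_distance(x, y, u, tol=1e-12):
    """sup{lam >= 1 : x / lam in L(y, u)}, found by bracketing then bisection."""
    if not feasible(x, y, u):
        raise ValueError("the observation must be feasible")
    if y == 0:
        return math.inf  # inputs can shrink without bound when nothing is produced
    lo, hi = 1.0, 2.0
    while feasible(x / hi, y, u):  # double until the contraction is infeasible
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if feasible(x / mid, y, u):
            lo = mid
        else:
            hi = mid
    return lo


d_out = shephard_output_distance(4.0, 1.0, 2.0)
d_in = shephard_input_distance(4.0, 1.0, 2.0)
print(d_out, d_in)
```

For the observation (x, y, u) = (4, 1, 2), frontier output at the observed x and u is sqrt(8), so the output distance is 1/sqrt(8), roughly 0.354. The input can shrink by a factor of 8 before y = 1 becomes infeasible, so the input distance is 8.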

The directional output and input distance functions, $\vec{D}_O$ and $\vec{D}_I$, are defined,
respectively, as

$$\vec{D}_O(x, y, u; g_y, g_u) = \sup\{\beta : (y + \beta g_y, u - \beta g_u) \in P(x)\}, \quad \beta \ge 0, \tag{7}$$

and

$$\vec{D}_I(x, y, u; g_x) = \sup\{\beta : x - \beta g_x \in L(y, u)\}, \quad \beta \ge 0, \tag{8}$$

where $\vec{D}_O$ measures the feasible joint expansion of intended output in the direction
$g_y$ and contraction of unintended output in the direction $g_u$, while $\vec{D}_I$ measures the
feasible contraction of inputs in the direction $g_x$. A distance value $\beta = 0$ implies
that an observation is operating on the corresponding output or input frontier, and
the distance value increases with inefficiency.
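
To make the directional output distance function concrete, the following Python sketch computes the largest feasible step along a chosen direction by bisection. The technology y ≤ sqrt(x·u) is a hypothetical one-input, one-good, one-bad example introduced here only for illustration, and the function names are assumptions rather than anything defined in the chapter.

```python
import math


def feasible(x, y, u):
    """Hypothetical PGT (an assumption for illustration): y <= sqrt(x * u)."""
    return x >= 0 and u >= 0 and 0 <= y <= math.sqrt(x * u)


def directional_output_distance(x, y, u, gy, gu, tol=1e-12):
    """sup{beta >= 0 : (y + beta * gy, u - beta * gu) in P(x)}."""
    if not feasible(x, y, u):
        raise ValueError("the observation must be feasible")
    if gy < 0 or gu < 0 or gy + gu == 0:
        raise ValueError("the direction must be non-negative and non-zero")
    lo, hi = 0.0, 1.0
    # Double the step until the projected point leaves the output set.
    while feasible(x, y + hi * gy, u - hi * gu):
        lo, hi = hi, 2.0 * hi
    # Bisect between the last feasible and the first infeasible step.
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if feasible(x, y + mid * gy, u - mid * gu):
            lo = mid
        else:
            hi = mid
    return lo


beta_inside = directional_output_distance(4.0, 1.0, 2.0, gy=1.0, gu=1.0)
beta_frontier = directional_output_distance(4.0, 2.0, 1.0, gy=1.0, gu=1.0)
print(beta_inside, beta_frontier)
```

With direction (1, 1), the observation (x, y, u) = (4, 1, 2) can add one unit of the good while removing one unit of the bad, landing on the frontier point (y, u) = (2, 1), so the distance is 1. Evaluated at that frontier point, the distance is 0.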
Chung et al. [29] first introduced $\vec{D}_O$ in the context of pollution resulting from
production. One important distinction between the directional and radial models
for environmental applications is that while both models can be used to contract
undesirable production outputs, radial distance functions treat the bad output as an
input, as opposed to the directional distance function which explicitly models u as
an output. While the radial approach has been commonly used to model the PGT,
the treatment of bads as inputs can imply substitute relationships between pollution
and other inputs that may not exist in practice.
Both types of distance models and their corresponding production set represen-
tations satisfy key axioms and desirable mathematical properties from production
theory. Given the empirical focus of this chapter, we omit their discussion here and
instead point the reader to [25], as well as [46] for more theoretical detail. Bostian
et al. [18] also provide summary discussion of functional form and parameterization
of distance functions. Dakpo et al. [34] review recent nonparametric developments
for modeling the PGT.
To aid in interpreting the empirical work that follows, we illustrate the Shephard
and directional distance functions, along with their corresponding technology sets
in Fig. 1. In the left panel (i), the radial Shephard output distance function projects
observation A to point B on the production frontier, expanding both goods and bads.
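The directional output distance function instead projects observation A to point
C on the frontier, expanding goods and contracting bads. In the right panel (ii),
both models contract inputs to the frontier. The Shephard input distance function
contracts A radially to point B, while the directional input distance function
contracts A to point C in the direction gx.

[Figure 1, two panels. (i) PGT Output Set: output set P(x), with the bad output u on the horizontal axis and the good output y on the vertical axis, showing observation A, frontier points B and C, and the direction vector (gy, gu). (ii) PGT Input Set: input requirement set L(y, u), with inputs x1 and x2 on the axes, showing observation A, frontier points B and C, and the direction vector gx.]

Fig. 1 Radial and directional distance function models of the PGT. (i) Output distance, (ii) input
distance
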
Two important concepts follow from the production technology. The first is
technical efficiency, which measures the extent to which an observation attains
maximal output production and/or minimal input use, relative to the technology
frontier. The second is productivity, which measures the change in overall produc-
tion over time, accounting for both change in technical efficiency and change in
the production technology. Distance function values provide a measure of technical
efficiency, or inefficiency in the case of the directional distance function, by
modeling distance to the frontier. Using Fig. 1 to illustrate, the radial Shephard
efficiency measure is the ratio OA/OB in panel (i) and its inverse in panel (ii), while
the directional inefficiency measure is given by AC in both panels. Productivity
indexes (or indicators for additive models) can then be constructed from distance
and technology estimates across time. Two widely applied productivity measures
are the Malmquist productivity index [24, 43], in the case of Shephard distance
functions, and the Luenberger [70] productivity indicator [29], in the case of directional
distance functions. Both the Malmquist index and the Luenberger indicator can be used in an
environmental context to incorporate unintended outputs into composite productivity measures,
and both can be decomposed into separate measures of efficiency change and technology change.
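
The decompositions mentioned above can be sketched in a few lines of Python. The formulas used are the commonly cited geometric-mean form of the Malmquist index and arithmetic-mean form of the Luenberger indicator; the chapter does not write them out, so they are stated here as an assumption. The distance values are made-up numbers, not estimates from any study, and the field names are illustrative.

```python
import math
from typing import NamedTuple


class Distances(NamedTuple):
    """Four distance values for one producer observed in periods t and t+1.

    d_a_b is the distance of the period-b observation measured against the
    period-a technology (illustrative naming, with t1 standing for t+1).
    """
    d_t_t: float
    d_t_t1: float
    d_t1_t: float
    d_t1_t1: float


def malmquist(d):
    """Output-oriented Malmquist index from Shephard distances: M = EC * TC."""
    eff_change = d.d_t1_t1 / d.d_t_t
    tech_change = math.sqrt((d.d_t_t1 / d.d_t1_t1) * (d.d_t_t / d.d_t1_t))
    index = math.sqrt((d.d_t_t1 / d.d_t_t) * (d.d_t1_t1 / d.d_t1_t))
    return index, eff_change, tech_change


def luenberger(d):
    """Luenberger indicator from directional distances: L = EC + TC."""
    eff_change = d.d_t_t - d.d_t1_t1
    tech_change = 0.5 * ((d.d_t1_t1 - d.d_t_t1) + (d.d_t1_t - d.d_t_t))
    indicator = 0.5 * ((d.d_t_t - d.d_t_t1) + (d.d_t1_t - d.d_t1_t1))
    return indicator, eff_change, tech_change


# Made-up Shephard output distances (values above 1 can occur across periods).
m, m_ec, m_tc = malmquist(Distances(0.8, 1.0, 0.64, 0.8))
# Made-up directional distances (0 means on the frontier).
lu, lu_ec, lu_tc = luenberger(Distances(0.2, 0.05, 0.3, 0.1))
print(m, m_ec, m_tc)
print(lu, lu_ec, lu_tc)
```

In the first example efficiency is unchanged and the whole 25% productivity gain comes from technology change. In the second, the indicator of 0.175 splits into an efficiency gain of 0.1 and a technology gain of 0.075. A Malmquist value above 1 and a Luenberger value above 0 both signal productivity growth under these sign conventions.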
We note that the output set in panel (i) of Fig. 1 also illustrates two assumptions
commonly made in the empirical literature for the PGT P(x): null jointness and
weak disposability of u and y. The null joint condition implies that for (y, u) ∈
P(x), if u = 0, then y = 0; that is, intended output cannot be produced without also
generating some unintended output. Related to this, the unintended output also entails
some cost of disposal. While y and x are generally assumed freely disposable (i.e.,
the firm could always simply use fewer inputs to produce less output), u and y are
only weakly disposable. More formally, if (y, u) ∈ P (x) and 0 ≤ θ ≤ 1, then
(θy, θ u) ∈ P (x). Weak disposability imposes a physical trade-off between y and u
at the margin, where any reduction of the bad requires some proportional sacrifice
of the good. The weak disposability assumption has generated some debate
in the literature, particularly concerning the materials balance condition and other
physical laws governing mass and energy flows [30, 53, 55]. In response, [89] shows
that the model can be specified to make weak disposability consistent with materials
balance. Bostian et al. [18] also note that for empirical work, physical laws such as
materials balance are presumably borne out by the observed data, obviating the need
to impose the materials balance condition ex post.
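
The two assumptions can be checked numerically on a simple example. The Python sketch below tests null jointness and weak disposability on a grid for a hypothetical output set, P(x) = {(y, u) : 0 ≤ y ≤ sqrt(x·u)}, which is an illustrative assumption rather than a technology used in the chapter. It also shows that the bad is not freely disposable in this example.

```python
import math


def feasible(x, y, u):
    """Hypothetical output set P(x) = {(y, u) : 0 <= y <= sqrt(x * u)}."""
    return x >= 0 and u >= 0 and 0 <= y <= math.sqrt(x * u)


def null_joint(x, y_grid):
    """True if no positive good output on the grid is feasible when u = 0."""
    return all(not feasible(x, y, 0.0) for y in y_grid if y > 0)


def weakly_disposable(x, points, thetas):
    """True if every feasible (y, u) stays feasible when both are scaled by theta."""
    return all(
        feasible(x, t * y, t * u)
        for (y, u) in points
        if feasible(x, y, u)
        for t in thetas
    )


x = 4.0
grid = [i / 4 for i in range(0, 13)]        # 0.0, 0.25, ..., 3.0
points = [(y, u) for y in grid for u in grid]
thetas = [i / 10 for i in range(0, 11)]     # 0.0, 0.1, ..., 1.0

print(null_joint(x, grid))
print(weakly_disposable(x, points, thetas))
# Cutting the bad alone, holding the good fixed, leaves the output set.
print(feasible(x, 2.0, 1.0), feasible(x, 2.0, 0.5))
```

A grid check of this kind is only a sanity test, not a proof. For this example the properties also follow analytically, since θ ≤ sqrt(θ) for 0 ≤ θ ≤ 1.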
The framework above models the PGT in quantity space, in terms of the
physical relationships between production, abatement, and pollution. We review
related empirical applications to externalities, efficiency, and productivity in
section “Externalities, Efficiency, and Productivity.” Duality theory connects these
physical production relationships to price space, allowing for valuation of non-
marketed environmental effects, the focus of section “Valuation and Substitution.”
Namely, both the radial and directional input distance functions are dual to the firm's
cost function, while their output-oriented counterparts are dual to the firm's revenue function. A number
of empirical studies exploit these dual relationships to estimate shadow prices for
generated pollution, abatement, and changes to environmental quality [16, 42].
For a detailed exposition of the relevant duality theory, refer to Färe and Primont
[52] for radial models and [46] for directional models. Färe et al. [51] include a
comprehensive review of both the theory and shadow price applications. We review
related empirical applications in section “Valuation and Substitution,” for both costs
and benefits of generated pollution, abatement, and environmental quality.
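
As a hedged illustration of how duality delivers a shadow price, the Python sketch below uses a relation that is widely applied in this literature: with a known market price p for the good, the shadow price of the bad is q = −p · (∂D/∂u) / (∂D/∂y), where D is the directional output distance function. The chapter does not state this formula, so it is an assumption here, as is the hypothetical technology y ≤ sqrt(x·u), for which D has a closed form in the direction (1, 1). Derivatives are approximated by central differences.

```python
import math


def ddf(x, y, u):
    """Directional output distance in direction g = (1, 1) for the hypothetical
    technology y <= sqrt(x * u): the non-negative root of
    (y + beta)^2 = x * (u - beta).
    """
    b = 2.0 * y + x
    c = y * y - x * u
    return 0.5 * (-b + math.sqrt(b * b - 4.0 * c))


def shadow_price_of_bad(x, y, u, p, h=1e-6):
    """q = -p * (dD/du) / (dD/dy), using central finite differences."""
    dD_dy = (ddf(x, y + h, u) - ddf(x, y - h, u)) / (2.0 * h)
    dD_du = (ddf(x, y, u + h) - ddf(x, y, u - h)) / (2.0 * h)
    return -p * dD_du / dD_dy


beta = ddf(4.0, 1.0, 2.0)
q = shadow_price_of_bad(4.0, 1.0, 2.0, p=10.0)
print(beta, q)
```

For the observation (x, y, u) = (4, 1, 2) the distance is 1 and the projection onto the frontier is (y, u) = (2, 1). The frontier slope dy/du at that point is 1, so with p = 10 the shadow price of the bad is 10: removing one unit of the bad at the margin costs 10 in forgone revenue.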

Externalities, Efficiency, and Productivity

In practice, externalities frequently result from production activities. Examples
include soil and water pollution from agriculture and mining, as well as both local-
ized air pollution and greenhouse gas emissions from manufacturing and energy
sectors. Unlike other outputs of production, externalities are often unintended and
non-marketed. Reducing them requires some reallocation of resources away from
intended output, so that in the absence of regulation or appropriate policy incentives,
firms have little reason to factor externalities into their production decisions.
However, the production theoretical framework outlined in section “Modeling
Production and the Environment” can be used to better understand how externalities
are generated by the production process, to assess the potential for their reduction, and to
incorporate externalities into composite productivity measures.
That said, the framework provides little guidance in determining whether or
not production outputs, intended or otherwise, actually constitute externalities. For
example, waste generated from livestock might be considered an externality if
allowed to run off the farm into surrounding waterways, or an internalized cost if
the farm is charged a fee for any runoff that occurs. That same waste might even be
considered a desirable intermediate output/input if it is collected and subsequently
used to fertilize crops. Thus, for empirical application, the practitioner must first
identify the relevant production and environmental processes.

Radial Models

One of the first, and most widely cited, studies to include externalities in the
production technology is that of [41], who specify a hyperbolic distance function
to radially expand and contract desirable and undesirable outputs, respectively.
This hyperbolic specification can be considered a precursor to the more general
directional distance function, which can also be used to model joint expansion
and contraction of production inputs and outputs. In their application to US pulp
and paper production, the authors find that introducing the environmental objective
to contract effluent emissions generally decreases estimates of firm inefficiency.
In other words, without having to also reduce emissions, firms could produce
far greater levels of intended output with a given set of input resources. Other
hyperbolic distance applications include [78], who incorporate toxic release data
into efficiency estimates for US oil refineries; Yang et al. [98], who incorporate
energy use and emissions reductions into productivity measures for provinces
in China; Mamardashvili et al. [74], who incorporate nitrogen pollution into
efficiency estimates for Swiss dairy farms and estimate shadow prices (discussed
more in section “Valuation and Substitution”) for nitrogen production; Duman
and Kasman [40], who estimate potential CO2 reductions for EU member and
candidate countries, finding some evidence of convergence between the two groups;
and Peña et al. [82], who incorporate both desirable and undesirable externalities,
forest preservation and degraded land, into efficiency measures for ranching in the
Amazon.
We now turn to the more widely applied Shephard and directional distance
literature. Given their radial nature, Shephard distance functions are often used to
construct Malmquist-type ratio measures of good to bad output, sometimes termed
“environmental performance.” Färe et al. [47] introduce the use of Shephard-based
Malmquist indexes to construct an environmental performance index as the ratio
of two multilateral Malmquist productivity indexes, one measuring expansion of
desirable outputs and the other measuring contraction of undesirable outputs. They
apply this framework to emissions of CO2, SO2, and NOx for a cross section
of OECD countries, finding an average environmental performance index value of
roughly 1.10, which indicates proportionally larger increases in good outputs than in bad outputs.
Building on this static case, Kortelainen [63] introduces a dynamic Malmquist
index for environmental performance, along with its associated decomposition into
environmental efficiency change and environmental technical change. He applies
this framework to data on production value added and air pollution for the EU
member states from 1993 to 2000, finding an overall improvement in environmental
performance, driven mainly by environmental technical change. Zhou and Ang
[102] apply a similar Malmquist framework to CO2 emissions for OECD countries,
further decomposing productivity into multiple aspects of efficiency and technology
change, including the change in carbon factor, energy intensity, emissions, and GDP.
Färe et al. [44] apply a dynamic Malmquist index to air pollutants for US coal-
fired power plants, from 1998 to 2005. They find average overall improvements but
also a large amount of variation, with more than 25% of plants exhibiting deteri-
orating performance for the study period. Barnhart et al. [12] apply a multilateral
Malmquist framework to agricultural nitrogen and phosphorus pollution from crop
and livestock production in the US Mississippi-Atchafalaya river basin. They find
that for many of the corn- and soy-producing regions that are most targeted for
nutrient reduction, the production increases proportionally outweigh increases to
pollution, while overall environmental performance tends to be lowest for marginal
low-production lands. Bostian et al. [17] extend the Malmquist index approach
to a network technology setting, to incorporate investments in environmentally
friendly practices into measures of environmental performance, energy efficiency,
and overall productivity for Swedish pulp and paper firms. They find that these three
aspects of change in production generally move together for firms in the industry.

Non-radial Models

Chung et al. [29] introduce the use of the non-radial directional distance function
to model the joint production of intended outputs and unintended externalities.
They also develop a productivity index based on the directional distance function, which they term the
“Malmquist-Luenberger” index, with associated decomposition of efficiency change
and technology change. This represents the first empirical environmental application
of the directional distance framework, using a panel of Swedish pulp and paper
firms to incorporate multiple water pollutants into the production technology. They
find sustained average improvements to productivity over the study period, with
technology gains outweighing efficiency losses.
The early work by Chung et al. [29] has led to a proliferation of non-radial, directional
distance function approaches in the empirical literature. Zhang and Yongrok [101]
review this progression more comprehensively. We provide selected references
here. Weber and Domazlicky [95] provide an early application, adjusting US
manufacturing productivity estimates to also include toxic releases. Picazo-Tadeo
et al. [83] apply the directional distance framework to estimate both the regulated
(assuming costly disposal of bad outputs) and unregulated production technologies
(assuming free disposal of bad outputs) for firms in the Spanish ceramic tile industry.
They interpret the difference in frontier output between the two as the regulatory
impact, finding an aggregate regulatory impact of roughly 4.8% foregone production
value.
Kumar [64] uses a similar technology comparison, estimated with and without
free disposal of bad outputs, to consider the potential for productivity convergence in
the face of regulation for CO2 emissions. Application to a panel of both developed
and developing countries reveals that on average the difference in productivity
growth between the two technologies is positive for developed countries and nega-
tive for developing countries, suggesting a lack of convergence for environmentally
sensitive productivity. Focusing just on the OECD countries, Oh [81] estimates both
country/year-specific technologies and a global reference technology which pools
the panel in order to also estimate a technology gap component of productivity.
Contrary to Kumar [64], Oh [81] finds that environmentally sensitive Luenberger
productivity indicators fall behind those not accounting for pollution emissions,
when restricting the analysis to developed countries.
Zhang and Yongrok [100] employ a similar global, or metafrontier, approach
to fossil fuel-powered energy production in China, as do [99] for
China’s provincial industrial sector. Wei et al. [96] extend this non-radial
metafrontier approach to the 2015 Paris Agreement signatory countries, finding
technology progress to be the main driver of improvements to CO2 emissions
productivity.
Numerous applications consider agricultural water use and related non-point
source pollution. Färe et al. [49] use the directional output distance function
to estimate shadow prices for runoff from fertilizer and pesticides for the US
agricultural sector. Piot-Le and Le Moing [85] apply the Luenberger indicator to
examine the productivity effects of participation in agri-environmental programs for
French agriculture, finding support for the well-known Porter hypothesis (discussed
more in section “Environmental Policy and Firm Performance”) that participating
farms were more productive. Azad and Ancev [6] use the Luenberger indicator to
estimate productivity of water use in the Australian agricultural sector, while [7]
use the non-radial Russell graph measure to estimate a global efficiency index
for irrigation in Australia. Bostian et al. [20] incorporate the directional output
distance function into an integrated production-biophysical model to consider the
optimal spatial allocation of nitrogen fertilizer reduction in a US agricultural
watershed. More recently, [93] use the directional output distance function to
estimate shadow prices for reductions to deforestation in the Brazilian Amazon.
The authors then connect these shadow price estimates to a biophysical model for
carbon sequestration to estimate trade-off values for CO2 emissions and agricultural
production.

Network and Multi-function Models

The relationship between production and the environment often depends on a series
of processes across both space and time. The extension of standard single-stage
production technology estimation methods to a multi-function, network technology
framework can allow for more realistic structural representation of important
underlying processes. Førsund [53] outlines a theoretical justification for this
approach, while Bostian et al. [18] review the subsequent development of network methods
to estimate the PGT. Empirical work related to production and the environment
increasingly makes use of network methods in order to better assess efficiency,
productivity, and environmental performance. We review recent advances here.

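[Figure 2: aggregate inputs $x$ are split into $x^1$ and $x^2$, which enter subtechnologies $P^1$ and $P^2$. $P^1$ produces final outputs $(y^{1f}, u^{1f})$ and intermediate outputs $(y^{1i}, u^{1i})$; the intermediates enter $P^2$, which produces final outputs $(y^{2f}, u^{2f})$. Final outputs aggregate to $(y, u)$.]

Fig. 2 General two-stage network technology for goods and bads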

To begin, we use Fig. 2 (adapted from [18]) to illustrate a basic two-stage
network production technology for goods and bads, which can be extended to
multiple stages in practice. The network framework separates the single-stage
production technology into two underlying subtechnologies, $P^1$ and $P^2$; aggregate
inputs, $x$, into subtechnology inputs $x^1$ and $x^2$; and aggregate final outputs, $(y, u)$,
into subtechnology final outputs, $(y^{1f}, u^{1f})$ and $(y^{2f}, u^{2f})$. Intermediate inputs,
$(y^{1i}, u^{1i})$, connect $P^1$ to $P^2$, forming the network. The individual subtechnologies
can be modeled using the distance function methods outlined in section “Modeling
Production and the Environment.”
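
The bookkeeping behind the two-stage network can be sketched as a small Python data structure. This is only an illustration of how the flows fit together, under two assumptions that the text does not spell out: subtechnology inputs cannot exceed the aggregate input, and aggregate final outputs are the sums of the subtechnology final outputs. All names and numbers are hypothetical, and scalars stand in for what would be vectors in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoStageObservation:
    x: float     # aggregate input available to the producer
    x1: float    # input allocated to subtechnology P1
    x2: float    # input allocated to subtechnology P2
    y1f: float   # final good output of P1
    u1f: float   # final bad output of P1
    y1i: float   # intermediate good passed from P1 to P2
    u1i: float   # intermediate bad passed from P1 to P2
    y2f: float   # final good output of P2
    u2f: float   # final bad output of P2

    def inputs_within_budget(self, tol=1e-12):
        """Assumed allocation constraint: x1 + x2 <= x."""
        return self.x1 + self.x2 <= self.x + tol

    def stage2_inputs(self):
        """Everything entering P2: its own input plus the intermediates from P1."""
        return (self.x2, self.y1i, self.u1i)

    def final_outputs(self):
        """Assumed aggregation: final (y, u) is the sum over subtechnologies."""
        return (self.y1f + self.y2f, self.u1f + self.u2f)


obs = TwoStageObservation(
    x=10.0, x1=6.0, x2=4.0,
    y1f=3.0, u1f=1.0, y1i=2.0, u1i=1.5,
    y2f=2.5, u2f=0.5,
)
print(obs.inputs_within_budget(), obs.stage2_inputs(), obs.final_outputs())
```

In an empirical network model each subtechnology would get its own distance function or activity-analysis constraints, with the intermediate flows appearing as outputs of the first stage and inputs to the second.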
The general network framework in Fig. 2 can be tailored to more detailed
subtechnology processes for empirical application. Perhaps most prominent among
these, [79] decompose the technology into separate goods- and bads-generating
subtechnologies, using bads-generating and non-bads-generating inputs, respectively.
They term this approach “by-production” and include an empirical application
to US coal-fired power plants to illustrate. Chambers et al. [26] apply the by-
production approach to estimate shadow prices for nitrogen pollution from Catalan
farms. In a related application, [90] incorporate risk by estimating the by-production
technology within a state-contingent production framework. Dakpo et al. [35] apply
a similar framework to estimate shadow prices for greenhouse gas emissions from
French sheep farming, imposing an additional materials balance constraint for bads-
generating inputs across the subtechnologies, which is consistent with the critique
of Dakpo et al. [34]. Dakpo and Lansink [36] extend the by-production framework to the dynamic
case. Though not explicitly applying by-production, [54] develops a similar multi-
equation framework, termed “factorially determined multi-output production,” to
also satisfy the materials balance condition. Ray et al. [88] apply a by-production
model to country-level CO2 emissions, relaxing the weak disposability assumption
for bads-generating inputs and pollution outputs to joint disposability instead. Both
[73] and [66] estimate the by-production model econometrically, using Bayesian
methods. The former use the results to construct an environmental productivity
index, with decompositions for environmental efficiency and technology change,
while the latter impose cost-minimization constraints to correct for endogeneity of
production inputs. Arjomandi et al. [5] use the by-production framework to con-
struct a Luenberger-Hicks-Moorsteen productivity indicator, along with associated
decompositions, with application to global airline emissions. Murty and Russell
[80] show that by-production satisfies the standard production axioms for PGTs
and provide guidance for selecting functional form in estimation.
Another strand of the environmental network technology literature addresses pol-
lution abatement, both instead of and along with pollution generation. Indeed, [79]
include abatement as an intended output in the intended production subtechnology,
as does [54]. Färe et al. [45] develop a two-stage network, with a joint production
PGT in the first stage, where generated pollution and intended outputs from the first
stage serve as intermediate inputs in the second stage (similar to Fig. 2). They apply
this approach to US coal-fired power plants.
In a similar application, [56] adds a materials balance condition to the abatement
subtechnology. Lozano [68, 69] builds on this to also consider allocative efficiency
of abatement, in two related applications to US and EU power generation. Bi et al.
[15] apply a similar production-abatement network model to power generation in
China. Bostian et al. [18] extend the production-abatement network approach by
decomposing total abatement into “beginning of pipe” prevention activities in the
first stage and “end of pipe” treatment activities in the second stage, with application
to Swedish pulp and paper firms. The authors also consider the optimal allocation
of investment in abatement between prevention and treatment.
Network and multi-function approaches can also be used to model environmental
production processes over time and space, incorporating dynamic relationships
between production, abatement, and emissions. Färe et al. [48] apply a dynamic
model termed “time substitution” to estimate the optimal time path of GHG emis-
sions reductions for the Kyoto Protocol signatory countries, in light of technology
change. Zhou et al. [103] extend this approach to allow for spatial-temporal
substitutions of CO2 emissions reductions in China. Bostian et al. [20] and
Whittaker et al. [97] use bi-level optimization methods [11] to integrate the envi-
ronmental production framework with a biophysical model to estimate the optimal
spatial allocation of nitrogen reductions from agriculture in the USA. Bostian
et al. [18] model the optimal dynamic allocation of environmental investments and
expenditures to reduce GHG emissions in Swedish manufacturing. Cheng et al. [28]
model the optimal time path for emissions reductions in China, under increasingly
stringent emissions targets.

Valuation and Substitution

In this section we look at applications where the environment is treated as an input or
as an output in the production function. We focus on valuation and substitution, and we
briefly discuss shadow pricing and the marginal productivity of environmental factors. The
previous sections have primarily considered the costs of abatement or benefits of
pollution from the firm’s perspective. In this section we also consider the benefits of
environmental protection and distinguish these from abatement costs in the context
of production.
The production theoretical background on the valuation of benefits or costs
resulting from a change in the environment can be found in [76], which is based on
the standard case – without explicitly considering the environment – as described
in [60].
The theoretical background for empirical applications on the environmental input
substitution part stems from the seminal work on mass/material balance production
functions by [8] and by more recent elaborations on that work, e.g., [94].
A well-cited paper on undesirable output substitution and shadow pricing
undesirables is [42], where both deterministic and stochastic directional distance
function methods are used to estimate and illustrate the axiomatic models discussed
earlier in this chapter. See also [51] for more shadow pricing applications.

Valuation

The environment affects the production opportunities of firms. Think of the simplest
case: a single firm selling its output and buying its inputs on competitive markets.
Output $y$ depends on inputs $x$ and an environmental quality input $q$ (generally a
“good”). Now, suppose we can write the transformation function $t(y, x; q) = 0$.
The associated production function is assumed to be concave and non-decreasing
in inputs, including $q$. Let us suppose some environmental damage occurs, so that
$q$ decreases from $q^0$ to $q^1$. The resulting cost to the firm, given no price changes,
is simply the change in producer surplus (quasi-rents) or profits, $\pi(p, w, q^1) - \pi(p, w, q^0)$,
where $p$ and $w$ are prices of output and inputs, respectively. When
more than one firm is affected, the total change in producer surplus is simply
the sum of changes to profits across firms. We note that a given change to the
environment may benefit some firms while harming others. The classic laundry-
factory example illustrates this point, where pollution from the factory harms the
laundry but benefits the factory; the reverse holds for the case of abatement by
the factory. If complete data on profits before and after an environmental change
is available, then calculating the change in producer surplus becomes a simple
exercise. But this is rarely the case, and we have to attempt recovering essential
parts of the profit function or approximations. Note that here if the resulting change
in profits from a decrease in q is positive (e.g., due to avoided abatement costs), then
we are referring to the costs of environmental protection; if the resulting change is
negative, then we are referring to a benefit of environmental protection.
Assume a profit function $\pi(p, w, q) = \max_y \, [py - C(y, w, q)]$. Figure 3
illustrates the change in costs – and consequently producer surplus – for a given
change in the environment, $q$, from $q^0$ to $q^1$, holding the output price constant. Think
of this change as going from bad to worse in terms of pollution. Fewer resources
spent on internal pollution control mean lower costs for the firm, ceteris paribus.

Fig. 3 Production cost shift due to changing environmental quality (vertical axis: monetary units; horizontal axis: output; marginal cost curves $C_y(y, w, q^0)$ and $C_y(y, w, q^1)$; output price $p^0$; output levels $y(p^0, w, q^0)$ and $y(p^0, w, q^1)$)

The cost savings are represented by the area ABCD, i.e., the increase in producer
surplus. The approach illustrated in the figure would not be appropriate if our
interest lies in looking at the benefits of environmental improvements, since here
we are focusing solely on the costs. If the opposite shift in costs were to occur,
where a decrease in $q$ from $q^1$ to $q^0$ resulted in an increase in costs (e.g., the case
where more pollution reduces the productivity of other inputs), then this same area
ABCD would represent a benefit of pollution control to the firm. However, this
benefit to the firm may not represent the full benefit of pollution control to society;
that would depend on whether the pollution is fully internalized. The framework
says little about the negative externality since it is in the context of a single firm
and is focused on production possibilities rather than behavioral choices, which in
turn hinge on whether the costs of environmental degradation are internalized or
not. A comprehensive welfare measure would include both the benefits and costs
of a decrease in q, which cannot be generally represented by the framework in the
figure.
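As a rough numerical illustration of the area described for Fig. 3, the sketch below assumes a hypothetical quadratic cost function C(y) = 0.5 c y^2, so that marginal cost is linear in output. The slope values, the price, and the function names are made up for illustration and are not taken from the chapter. The sketch checks that the area between the two marginal cost curves below the price line equals the difference in maximized profits.

```python
def max_profit(price, slope):
    """Maximized profit and output for the assumed cost C(y) = 0.5 * slope * y**2."""
    output = price / slope  # first-order condition: price = marginal cost = slope * y
    return price * output - 0.5 * slope * output ** 2, output


def area_between_marginal_costs(price, slope_before, slope_after, steps=1000):
    """Area below min(price, old marginal cost) and above the new marginal cost.

    Uses a midpoint rule and assumes slope_after < slope_before, i.e. the
    change in the environment lowers the firm's marginal cost.
    """
    _, output_after = max_profit(price, slope_after)
    width = output_after / steps
    area = 0.0
    for i in range(steps):
        y = (i + 0.5) * width
        upper = min(price, slope_before * y)
        area += (upper - slope_after * y) * width
    return area


# Hypothetical numbers: marginal cost slope falls from 2 to 1 at a price of 10.
price = 10.0
profit_before, _ = max_profit(price, 2.0)
profit_after, _ = max_profit(price, 1.0)
surplus_change = profit_after - profit_before
area = area_between_marginal_costs(price, 2.0, 1.0)
print(surplus_change, area)
```

Under these assumptions the two numbers coincide, which is the sense in which the area in the figure measures the change in producer surplus.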
One obvious way to recover the production costs or benefits of changes in the
environment is to estimate a profit function and calculate the difference in value
evaluated at different levels of the environmental input. But adequate data on profits
are not generally available. Alternative approaches involve using the envelope
properties of profit, cost, and revenue functions to estimate behavioral supply or
demand functions which incorporate the environment. With the proper restrictions
on the profit function, these behavioral functions can provide a valuation measure
for production effects due to changes in the environment.
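The envelope argument can be sketched as follows. The notation follows the profit function above; the restriction that profits are zero at a zero output price (no fixed costs, so inaction is feasible) is an assumption added here for illustration, as one example of the "proper restrictions" mentioned in the text.

```latex
% Hotelling's lemma: supply and input demands are price derivatives of the profit function
\[
\frac{\partial \pi(p, w, q)}{\partial p} = y(p, w, q),
\qquad
\frac{\partial \pi(p, w, q)}{\partial w_i} = -x_i(p, w, q).
\]
% Assuming \pi(0, w, q) = 0, profit is the area above the supply curve,
% so the value of a change in q is the area between the two supply curves:
\[
\pi(p, w, q^1) - \pi(p, w, q^0)
= \int_0^{p} \left[ y(s, w, q^1) - y(s, w, q^0) \right] ds .
\]
```

An estimated supply function that includes $q$ is therefore enough to compute the change in producer surplus without observing profits directly.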
In the absence of adequate data to estimate output supply, another option can be
to instead estimate input demand functions. McConnell and Bockstael [76] show
that if the input is essential (zero amount of that input means zero output), then it
is possible to relate the change in input demand to producer surplus and changes in
welfare due to a change in environmental quality. The environment only affects a
firm’s production if the input is positive. Assume, for example, that one watershed
is polluted by industrial activity. The producer surplus associated with changes in
the pollutant can be estimated as the change in the area under the derived demand
for the contaminated water.
The researcher needs to be able to estimate supply and/or demand functions
which are continuous in prices and that include the environment explicitly. There
are examples in the literature where this approach has been used. One of very
few examples is [61], who derive a fishery supply function which is modeled as
depending on sub-aquatic vegetation levels. Another example is [77], where benefits
from a fishery are studied when derived demand and supply functions depend on
water quality.
Most applications are in developed countries where markets are “well-behaved”
and enough alternatives exist to reasonably expect supply and demand
functions to be continuous in prices. Applications in developing countries with
limited data or poorly functioning markets often resort to practical approaches that
attempt to approximate producer surplus effects. McConnell and Bockstael [76]
chronicle these practical approaches, which include (i) valuing changes in output
through a damage function, (ii) valuing only changes in output, and (iii) assessing
changes in costs.
More recent dual production economics approaches in the spirit of [61] are
scarce. A few can be found in [9], who reviews studies on valuing ecosystem
services as inputs in a fish habitat setting. Another more recent example also looking
at a supply function and estimating the value of changing ocean habitats is [57].
In sum, we see some interest in the literature in using production economics for
valuation when the firm is subject to changes in environmental factors. However,
even though the theoretical underpinnings are quite robust – as outlined in [76] –
the use of production economics for comprehensive welfare analysis is limited to
instances in which the environmental effect is borne solely by producers. This holds
if the environmental effect is fully internalized to the firm, or if the profit measure
evaluated includes the combined production benefits and costs to all firms, so that
the change in profit represents the change in total net benefits.
In comparison to using production economics to value change in producer
surplus, using primal or dual representations of technologies and looking at
substitutability, marginal productivity, or shadow pricing of environmental factors
are more common in the literature. We first turn to studies looking at marginal
productivity and shadow pricing, since those concepts are connected to valuation.

Marginal Productivity and Shadow Pricing

Marginal productivity and shadow price estimates of environmental factors can
be useful in evaluating policy measures or economic assessment of changes in
the environment. A number of authors and studies have taken this more primal
approach to evaluate the marginal productivity of environmental factors as inputs
in a production function. Note that these studies on marginal productivity provide
estimates of the benefits of environmental improvements, while shadow pricing
generally provides an estimate of the costs in terms of foregone production. The
production function does not have to be that of a firm, but can instead pertain to any
type of economic agent or decision-making unit amenable to analysis within this
framework.
For example, the marginal productivity of quality of ocean fish habitats has been
studied by, e.g., [1, 10, 75]. See [9] for more references on similar studies. These
studies analyze the potential increase in marginal productivity from improving the
ocean fish habitats.
In a similar vein, [62] examines Brazilian forests and the value of fallow
ecosystem services in shifting forest management cultivation, including hydrolog-
ical externalities that may affect other farms. The author estimates a production
function to assess the value of forest fallow and test whether it provides local
externalities to agricultural production.
Applications that shadow price bad outputs are quite common in the literature,
especially after the widely cited work of [42]. This influential paper uses a
quadratic directional output distance function to estimate the shadow price of SO2
for 209 electric utilities before (1993) and after (1997) implementation of Phase
I regulations of the US acid rain program. They find that the shadow price of
pollution increases over the study period, reflecting the increased stringency of
regulations and, as a consequence, the increased cost. While the majority of shadow
pricing examples pertain to quantitative changes in pollution levels, [16] apply this
framework to qualitative changes in the environment, estimating shadow prices for
improved wetland condition in an agricultural watershed.
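The shadow-pricing logic can be sketched with a small numerical example. The sketch assumes the formula commonly used in this literature, in which the shadow price of the bad output equals the price of the good output times the negative ratio of the distance function derivatives. The quadratic coefficients below are hypothetical values chosen only so that the translation property holds for the direction g = (1, -1); they are not estimates from [42], and inputs are held fixed and absorbed into the constant.

```python
# Hypothetical coefficients of a quadratic directional output distance function.
A0, BETA_Y, GAMMA_B, CURV = 1.0, -0.6, 0.4, -0.05


def distance(y, b):
    """Assumed D(y, b) for direction g = (1, -1), with inputs held fixed."""
    return (A0 + BETA_Y * y + GAMMA_B * b
            + 0.5 * CURV * y ** 2 + 0.5 * CURV * b ** 2 + CURV * y * b)


def gradient(y, b):
    """Partial derivatives of the assumed D with respect to y and b."""
    d_dy = BETA_Y + CURV * y + CURV * b
    d_db = GAMMA_B + CURV * b + CURV * y
    return d_dy, d_db


def shadow_price_bad(price_good, y, b):
    """Shadow price of the bad output: -p * (dD/db) / (dD/dy)."""
    d_dy, d_db = gradient(y, b)
    return -price_good * d_db / d_dy


# Hypothetical observation: good output 2, bad output 1, good output price 60.
shadow_price = shadow_price_bad(60.0, 2.0, 1.0)
# Translation property: moving by a along g lowers the distance by exactly a.
translation_gap = distance(2.0 + 0.3, 1.0 - 0.3) - (distance(2.0, 1.0) - 0.3)
print(shadow_price, translation_gap)
```

The resulting shadow price is expressed in units of the good output price per unit of the bad output, which is why it is read as the cost of abatement in terms of foregone production.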

Substitution

This section describes substitution relationships between the environment and
production, both from the input side and the output side. We start by looking at
the environment as an input. This input is not to be seen as an input that is used
up in the process, that is, as a productive input such as soil or water. Rather, it is
defined below as emissions or pollution that serve as an
input by enabling more production (cutting emissions thus restricts production). It
is important to understand the distinction between these two concepts of inputs.
The actions of firms usually involve the direct or indirect use of environmental
resources that are altered from a natural state to a more or less degraded state.
Ayres and Kneese [8] developed the material or mass balance principle to measure
this transformation, arguing that material inputs should be defined more broadly
to include water and air in addition to fuels and conventional material inputs. Van
den Bergh [94] claims that neoclassical production functions are not inherently
inconsistent with mass balances, but notes that empirical models based on produc-
tion economics often do not explicitly explore input substitution possibilities for
improving/degrading the environment.
In an attempt to close this gap in the literature, [31], building to some extent
on [23], develop a model for US electricity production that includes conventional
inputs, along with environmental resources as factors of production. This study
examines the case of sulfur emissions trading, where emissions serve as environ-
mental resources. The authors employ a dynamic factor demand model developed
originally by [84]; that is, quasi-fixed stocks potentially induce adjustment costs,
and firms minimize the expected sum of discounted (translog) costs so that factor
demands and Euler equations (for quasi-fixed factors) are given by the solution to
a dynamic optimization problem. Quasi-fixed factors are the stocks of capital and
permits, respectively. The user cost for permit stocks is proxied by the pollution
allowance price. Considine and Larson [31] motivate their specification of a mass
balance production function by referring to [8] and go on to argue that according
to the material balance relationship, the amount of environmental pollutants should
approximately equal the weight of energy and raw materials inputs, which include
minerals, water, air, and other environmental resources. Førsund [54] critiques this
setup, which we explain in more detail below.
Results in [31] suggest that considerable substitution possibilities exist between
environmental emissions and other inputs in US electric power generation. While
emissions prices are significant, relative fuel prices are more important in determin-
ing factor substitution than emission permit prices. Yet, the substitution elasticities
between labor, capital, and emissions are significant.
Considine and Larson [32] – using an approach similar to [31] – study fossil
and fossil-free fuel substitution in electricity production, following the introduction
of the European Union’s Emissions Trading System for greenhouse gas emissions.
Despite low emission permit prices, this study finds statistically significant substi-
tution between fossil fuels and fossil-free energy in electric power production.
Førsund [54] finds treating pollution as an input to some extent problematic,
arguing that this option is proposed without proper motivation in, e.g., the influential
textbook of [13]. In motivating this approach, it has been argued that good outputs
increase when pollution increases; fewer resources are used on pollution abatement,
and these resources are then reallocated to output production (e.g., [33]). However,
in the standard production economic setting, inputs are given, so there is no leeway
to use inputs for abatement without extending the model. Another explanation is
that generation of pollution needs services from the environment (nature) to take
care of these residuals and that such services working as inputs can be measured by
the volume or weight of the residuals (this is the motivation that [31] is embracing).
However, measuring environmental services this way, an increase in the use of the
environment cannot increase desirable output for given resources, because this is
impossible keeping inputs constant due to the materials balance condition. Førsund
[54]: “A partial increase in a residual as input cannot technically explain that a good
output increases by reasoning that inputs are reallocated from abatement activity to
the production of goods. Again, by definition, the inputs that are explicitly specified
in this relation must be kept constant. Having sort of additional inputs behind the
scene is not a very satisfactorily way of modelling.”
Now let us turn our attention to the case when the undesirable is treated as an
output or by-product in production. A string of studies have looked at substitution
among pollutants when they are considered as non-separable or joint outputs. These
include [2, 42, 48, 65].
Färe et al. [42] use a quadratic directional output distance function to measure
the Morishima output elasticity of substitution between US electricity production
and SO2. They suggest that the elasticity of substitution shows that the ability to
trade reductions in electricity production for reductions in emissions became more
difficult after the implementation of Phase I regulations of the acid rain program.
The elasticity is estimated using both deterministic (DEA) and stochastic (SFA)
techniques.
Kumar and Managi [65] test the implicit assumption in the empirical literature
that (i) production of desirable output, undesirable output, and abatement are
separable and (ii) that different undesirables can be abated separately. Using a
unique plant-level data set from India, they find sufficient evidence to reject the null
hypotheses of separability between marketable output and pollutants and between
different pollutants. Firms must incur abatement costs for reducing pollution
levels. In addition, they find statistically significant complement and substitute
relationships between water pollutants.
Färe et al. [48] use a directional distance function to estimate the Morishima
transformation elasticity between SO2 and NOx, as well as the desirable output
in US electricity-producing utilities during a period when they were subject to
regulations associated with the Clean Air Act Amendments of 1990. The main
finding is that SO2 and NOx are substitutes, which implies that any gains in benefits
from the reduction of one undesirable are being partially offset by any costs due to
increases in another undesirable.
Agee et al. [2] use a 10-year panel for 77 US electric utilities to estimate
a multiple-input, multiple-output directional distance function, combining good
inputs and a bad input to produce good outputs and bad outputs (SO2, NOx, and
CO2). They find that considerable jointness (substitutability) exists between SO2,
NOx, and CO2 emissions. They conclude that failure to account for this jointness
increases the cost of pollution control.

Environmental Policy and Firm Performance

Climate change and other rising environmental problems have motivated govern-
ments to introduce or plan various types of environmental policies to mitigate or
adapt to these problems. One concern is that strict environmental policies will set
back growth and hinder development, at least for standard goods. The conventional
or neoclassical view is that stricter policies related to the environment lead to
higher costs, lower productivity, and lower profitability.
The literature on competitiveness, economic performance, and environmental
and/or energy and climate policy is vast; there are at least five comprehensive
reviews published in the general area of environmental policy and firm performance
[4, 21, 38, 39, 58]. Interest in this line of research dates back to the early 1990s,
when Harvard professor M. Porter challenged the conventional wisdom about the
impact of environmental regulation on firms. Two influential works, [86] and
[87], introduce the argument that well-designed or “right-kind” of regulation could
actually increase competitiveness, widely known today as the Porter hypothesis.
In the empirical literature on the impacts of environmental regulations on firms’
performance, competitiveness is typically measured by trade, industry location,
employment, productivity, or innovation (or some combination of these variables).
Here we review some of the recent studies focusing on production measures such as
total productivity, efficiency, and technological change.
The review by [58] directly responds to Porter’s claim. The authors find
no systematic evidence supporting the revisionist hypothesis that environmental
regulations stimulate innovation/productivity and improved competitiveness.
Brännlund and Lundgren [21] conclude that the theoretical literature can identify
the (rather non-general) mechanisms that must exist for a Porter effect to occur, but
the empirical literature gives no general support for the Porter hypothesis; in terms
of productivity, the impact is typically negative (see also [22]).
Ambec et al. [4] reach a conclusion similar to that of [21], i.e., that the empirical
evidence on the Porter hypothesis in terms of productivity enhancements is mixed,
although more recent studies suggest somewhat clearer support.
Dechezleprêtre and Sato [38] conclude that evidence indicates that environmen-
tal regulation has both negative, short-term impacts on productivity in some sectors
and for some pollutants and positive productivity impacts in others; again the main
message is that results are mixed.
Dechezleprêtre et al. [39] observe that recent interest in a causal relationship
between environmental and economic performance is indicative of a new wave of
research; that is, being more “green” – be it because of internal or external policy –
not only affects firm costs but also potentially affects firm revenues. This connects
to some extent to the literature on corporate social responsibility (CSR), which we
do not include in this chapter. Dam et al. [37] provide a comprehensive exposition
of the theory underlying CSR and include recent applications for reference.
Before proceeding to recent empirical studies, we illustrate the Porter hypothesis
from a production economics perspective. How should we think about environmen-
tal policy and competitiveness? Figure 4 illustrates the basic Porter argument using
the relationship between the firm’s desirable and undesirable output (figure recreated
from [21]).
According to Porter, a regulation will highlight inefficiencies in a company. One
way to illustrate this in the pre-regulation period is to assume that a company is not
producing on the production possibilities frontier but rather at point C. The “right-kind”
of regulation of emissions from $z^0$ to $z^R$ would highlight inefficiencies, which
would allow the company to move (outward) to the production possibilities curve.
At point B, all inefficiencies are neutralized and the company increases production
(from $q^0$ to $q^R$), earns higher profits (from $\pi^0$ to $\pi^R$), and, at the same time, reduces
emissions. The hypothesis implicitly assumes that it is cost-free to move toward the
frontier. There could, of course, be several reasons why a company might not be
producing efficiently, or on the frontier.

Fig. 4 Illustration of the Porter hypothesis (vertical axis: production, $q$; horizontal axis: emissions, $z$; frontiers $q = f^0(z)$ and $q = f^R(z)$; points A to E; emission levels $z^R$ and $z^0$; output levels $q^0$, $q^R$, and $q^1$; profit levels $\pi^0$, $\pi^R$, and $\pi^1$)

The Porter hypothesis stipulates that environmental regulations have a “dynamic”
effect in that they stimulate innovation and new processes. Figure 4 illustrates this
by showing how the production possibilities frontier shifts upward, representing a
“new” technology $f^R(z)$. The production and emission levels at point B are now
inefficient under the new production technology, but the regulations also make this
inefficiency visible to the company. Ultimately, this means that the company will
move itself from point C to B and then move even further as a result of the new
technology to a point between D and E. Given stable prices (in both the product
itself and the “emissions” input), the firm maximizes profit at point E. We note
that this result of the Porter hypothesis relies on the implicit assumption that the
development of the new technology does not make use of the company’s alternative
productive resources; i.e., there are no crowding out effects, or at least they are very
small.
Below we look at the main results from some of the most influential papers
concerning the Porter hypothesis and its connection to productivity and/or effi-
ciency. Also, we provide a sample of more recent studies on the subject. Additional
references can be found in the reviews cited above.
In an important early contribution to the empirical literature, [59] divide the
Porter hypothesis into three distinct testable versions: (1) the “weak” version posits
that environmental regulation will stimulate environmental innovations; (2) the
“narrow” version asserts that flexible environmental policy regimes give firms
greater incentive to innovate than prescriptive regulations, such as technology
standards; and (3) the “strong” version stipulates that properly designed regulation
may induce cost-saving innovation that more than compensates for the cost of
compliance. Using US survey data on pollution costs and expenditures (PACE),
they find evidence supporting the weak and narrow versions but not for the strong.
The remaining studies included here mainly concern testing the strong version
within a productivity or efficiency modeling setup.
Berman and Bui [14] examine the effect of air quality regulation on productivity
in oil refineries of the Los Angeles Air Basin. They use local air pollution regulation
to estimate the effects on the demand for abatement capital during a period of
sharply increased regulation, 1979 to 1992. They construct measures of total factor
productivity (TFP) using data on physical quantities of inputs and outputs. Despite
high costs associated with regulation, TFP rose sharply between 1987 and 1992, a
period of decreased refinery productivity in other regions. The authors conclude
that abatement costs may grossly overstate the economic cost of environmental
regulation as abatement may very well increase TFP; this conclusion would not
rule out support for the strong version of the Porter hypothesis.
Shadbegian and Gray [91] investigate the impact of abatement expenditures on
productivity, using plant-level data for 68 pulp and paper mills, 55 oil refineries, and
27 steel mills for the 1979–1990 period. They estimate a Cobb-Douglas production
function to measure the contribution of capital, labor, and material inputs to output.
Their access to data on abatement expenditures allows them to distinguish between
productive and abatement expenditures for each input. They find that abatement
expenditures contribute little or nothing to production or productivity, a result that
somewhat contradicts [14].
Lanoie et al. [67] test the significance of all three different variants of the Porter
hypothesis. Their analysis draws upon a large database that includes observations
from approximately 4,200 facilities in 7 OECD countries. In general, and consistent
with [59], they find support for the weak and narrow version but no support for the
strong version.
Lundgren et al. [71] analyze productivity effects of the Swedish CO2 tax and
the European Union Emissions Trading System (EU ETS) for the Swedish pulp and
paper industry 1998–2008. They compute a Luenberger TFP indicator using DEA.
The results indicate that these climate policies had a modest impact on technological
development in the pulp and paper industry, and when significant, these effects were
negative. They conclude that when designing policy to mitigate CO2 emissions, it
is vital that the policy generates a carbon price that is high enough to put pressure
on technological development.
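For readers unfamiliar with the Luenberger indicator, the sketch below shows the standard additive form and its decomposition into efficiency change and technical change, given four directional distance function values. The function name, argument names, and numbers are hypothetical; in an application such as the one above, the four distance values would come from DEA programs, which are not reproduced here.

```python
def luenberger_indicator(dist_tech0_obs0, dist_tech0_obs1,
                         dist_tech1_obs0, dist_tech1_obs1):
    """Luenberger productivity indicator between period 0 and period 1.

    dist_techA_obsB is the directional distance of the period-B observation
    measured against the period-A technology. Positive values of the
    indicator signal productivity growth.
    """
    efficiency_change = dist_tech0_obs0 - dist_tech1_obs1
    technical_change = 0.5 * ((dist_tech1_obs1 - dist_tech0_obs1)
                              + (dist_tech1_obs0 - dist_tech0_obs0))
    total = 0.5 * ((dist_tech1_obs0 - dist_tech1_obs1)
                   + (dist_tech0_obs0 - dist_tech0_obs1))
    return total, efficiency_change, technical_change


# Hypothetical distance values for a single firm.
total, eff_change, tech_change = luenberger_indicator(0.20, 0.10, 0.30, 0.15)
print(total, eff_change, tech_change)
```

The decomposition is additive, so the efficiency and technical change components sum to the total indicator.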
Lundgren et al. [71] investigate how firm-level environmental performance
(EP) – a measure described earlier in this chapter – affects firm-level economic
performance, measured as profit efficiency (PE) in a stochastic frontier setting.
Analyzing firms in Swedish manufacturing from 1990 to 2004, their results show
that EP induced by environmental policy does not determine PE, while voluntary
or market-driven EP seems to have a significant and positive effect on firm PE in
most sectors. The evidence generally supports the idea that good EP is also good for
business (CSR), as long as EP is not brought on by external policy measures, in this
case a CO2 tax. Thus, the results provide no general support for the strong version of
the Porter hypothesis. However, the results also show no significant negative effect
of policy on efficiency. Using the same data set and modeling framework, similar
results are found in [50, ch. 5]; i.e., the effects of a CO2 tax on efficiency are
mixed, depending on what industry sector is analyzed; but some evidence is found
for a positive dynamic effect of the CO2 tax on those sectors that exhibit the largest
inefficiencies.
Albrizio et al. [3] investigate the impact of environmental policy on industry-
and firm-level productivity growth in a panel of OECD countries. To test the strong
version of the Porter hypothesis, they use a productivity model to allow for effects
of environmental policies. They find that a tightening of environmental policy
is associated with a short-term increase in industry-level productivity growth in
the most technologically advanced countries. This effect diminishes with distance
from the global productivity frontier. For the average firm, no evidence of the
Porter hypothesis is found. However, the most productive firms see a temporary
boost in productivity growth, while less productive firms experience a productivity
slowdown; this directly contradicts the findings in [50].
Rødseth [89] looks specifically at environmental regulations and their impli-
cations for allocative efficiency. He establishes a model framework that allows
disentangling managerial and regulatory-induced allocative efficiencies. Applying
DEA for estimation to a sample of 67 coal-to-gas substituting power plants
from 2002 to 2008, he calculates Nerlovian profit efficiencies and their technical
and allocative efficiency components. The empirical results illustrate that failing
to control for environmental regulations leads to overestimation of managerial
allocative efficiencies by ignoring compliance costs.
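As background, the standard Nerlovian decomposition can be written as below. The notation is an assumption introduced here for illustration: $\pi(p, w)$ is maximum profit, $(g_x, g_y)$ is the chosen direction vector, $\vec{D}_T$ is the directional technology distance function, and $AE$ is the allocative residual. The regulatory constraints that [89] adds to this framework are not reproduced.

```latex
\[
\underbrace{\frac{\pi(p, w) - (p\,y - w\,x)}{p\,g_y + w\,g_x}}_{\text{Nerlovian profit inefficiency}}
= \underbrace{\vec{D}_T(x, y; g_x, g_y)}_{\text{technical component}}
+ \underbrace{AE}_{\text{allocative component}}
\]
```

Because the allocative component is computed as a residual, any compliance cost left out of the profit benchmark ends up attributed to it.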
From this meta-review of earlier reviews, and considering the more recent results
discussed above, the jury is still out on the Porter hypothesis. While it is possible
to find cases and circumstances where it seems to hold, even in its strong version,
the literature lacks more general support. The literature analyzing environmental
policy and firm performance contains many applications of the toolbox of production
economics, some of them mentioned above, especially at the micro level. The Porter
literature runs parallel with the literature on self-regulation (CSR), as noted by
[39], where the holy grail is to find a positive relationship between doing good and
doing well; green business is good business. These two strands in the research on
environmental policy (internal or external) and economic performance have, to a
large extent, merged, with an increasing focus on the effects of voluntary
self-regulation on firm performance (see, e.g., [37] for a theoretical discussion,
or [27, 72] for empirical applications).

Conclusion

We intend for this chapter to serve as a starting point for those interested in
environmental applications of production theory. Importantly, we have by no means
provided a comprehensive survey. For instance, we omit the closely related topic of
energy and the environment, as one could easily fill another chapter on this alone.
The same is also true of agriculture. Instead, the selected topics reflect emphasis in
the field of environmental economics, as well as our own areas of expertise. With
this in mind, we conclude by considering the trajectory of empirical research in the
field.
Beginning with externalities and efficiency/productivity, there is growing work to
incorporate better structural representations of the production-pollution-abatement
technology nexus, largely through the use of multi-equation network models (see
section “Network and Multi-function Models”). Related to this is the incorporation
of environmental processes into the production model framework. This can include
specifying a materials balance condition (à la Førsund [54]) and, with recent
computational advances, integrating more complex biophysical models [20, 97].
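
As a reminder of what a materials balance condition involves, a generic form is sketched below. The notation is assumed for illustration: x and y are input and output vectors, a and b are vectors of material content per unit of input and output, and z is the residual released to the environment before any abatement.

```latex
% Materials balance condition (generic form; notation assumed)
\[
z = a^{\top}x - b^{\top}y, \qquad z \ge 0 .
\]
```

Under this condition, material entering production in the inputs that is not embodied in the intended outputs must appear as a residual, so for given outputs emissions can only fall if material-bearing inputs are reduced or substituted.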
Turning now to production, the environment, and valuation, we see only limited
efforts to value environmental changes by treating the environment as an input.
While the theory is well-founded [76] and application seems straightforward, still
we see surprisingly few empirical studies, most likely due to data limitations. The
majority of existing applications also relate to fisheries production, mainly in the
tropics.
Studies on substitution possibilities between the environment as an input and
conventional inputs are also quite scarce. The reference that stands out in the
literature is [31], which applies a dynamic factor demand model based on a dual
cost function.
However, substitution studies treating the environment as an output are more
common. The interest in assessing substitutability between different bads/pollutants
(non-separability) and between intended and unintended outputs increased
significantly after [42], who investigate substitutability and shadow pricing of
unintended outputs, using both deterministic and stochastic frontier methods.
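
The shadow pricing idea in this strand of work can be summarized by the following standard expression, given here as a sketch. The notation is assumed: y is a marketed good output with price p, b is an unintended output, and the directional output distance function is taken to be differentiable.

```latex
% Shadow price of an unintended output (standard duality result; notation assumed)
\[
q = -\,p\,
\frac{\partial \vec{D}_o(x,y,b;g_y,-g_b)/\partial b}
     {\partial \vec{D}_o(x,y,b;g_y,-g_b)/\partial y},
\]
```

Because the distance function is non-increasing in y and non-decreasing in b, the shadow price q is non-negative. It can be read as the good output revenue forgone per unit reduction in the unintended output at the frontier.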
The empirical literature on the relationship between environmental policy,
economic performance, and environmental performance is now colossal and still
growing. As noted by [39], the field is merging to some degree with the research on
CSR. That is, both external policies and internal management policy decisions affect
firm performance – both economic and environmental. We also see some interest
more recently in using production economics in this context, e.g., [27, 71, 72].

References
1. Aburto-Oropeza O, Ezcurra E, Danemann G, Valdez V, Murray J, Sala E (2008) Mangroves
in the Gulf of California increase fishery yields. Proc Natl Acad Sci 105:10456–10459
2. Agee MD, Atkinson SE, Crocker TD, Williams JW (2014) Non-separable pollution control:
implications for a CO2 emissions cap and trade system. Resour Energy Econ 36:64–82
3. Albrizio S, Kozluk T, Zipperer V (2017) Environmental policies and productivity growth:
evidence across industries and firms. J Environ Econ Manag 81:209–226
4. Ambec S, Cohen MA, Elgie S, Lanoie P (2013) The Porter hypothesis at 20: can environmen-
tal regulation enhance innovation and competitiveness? Rev Environ Econ Policy 7:2–22
5. Arjomandi A, Dakpo KH, Seufert JH (2018) Have Asian airlines caught up with European
airlines? A by-production efficiency analysis. Transp Res Part A Policy Pract
116:389–403
6. Azad MAS, Ancev T (2014) Measuring environmental efficiency of agricultural water use: a
Luenberger environmental indicator. J Environ Manag 145:314–320
7. Azad MAS, Ancev T, Hernandez-Sancho F (2015) Efficient water use for sustainable
irrigation industry. Water Resour Manag 29(5):1683–1696
8. Ayres RU, Kneese AV (1969) Production, consumption, and externalities. Am Econ Rev
59:282–297
9. Barbier EB (2007) Valuing ecosystem services as productive inputs. Econ Policy 22:
177–229
10. Barbier EB, Strand I (1998) Valuing mangrove-fishery linkages: a case study of Campeche,
Mexico. Environ Resour Econ 12:151–166
11. Bard J (1998) Practical bilevel optimization: algorithms and applications. Kluwer, The
Netherlands
12. Barnhart B, Bostian M, Whittaker G, Grosskopf S, Färe R (2016) Prioritizing conservation
for the reduction of Gulf Hypoxia using an environmental performance index. Ecol Indic 66:
235–241
13. Baumol WJ, Oates W (1988) The theory of environmental policy. Cambridge University
Press, Cambridge
14. Berman E, Bui LTM (2001) Environmental regulation and productivity: evidence from oil
refineries. Rev Econ Stat 83:498–510
15. Bi GB, Shao YY, Song W, Yang F, Luo Y (2018) A performance evaluation of China’s coal-
fired power generation with pollutant mitigation options. J Clean Prod 171:867–876
16. Bostian M, Herlihy AT (2014) Valuing tradeoffs between agricultural production and wetland
condition in the U.S. Mid-Atlantic region. Ecol Econ 105:284–291
17. Bostian M, Färe R, Grosskopf S, Lundgren T (2016) Environmental investment and firm
performance: a network approach. Energy Econ 57:243–255
18. Bostian M, Färe R, Grosskopf S, Lundgren T (2018) Network representations of pollution-
generating technologies. Int Rev Environ Resour Econ 11(3):193–231
19. Bostian M, Färe R, Grosskopf S, Lundgren T (2018) Prevention or cure? Abatement efficiency
in a network technology. CERE working paper, Umeå University
20. Bostian MB, Whittaker G, Barnhart B, Färe R, Grosskopf S (2015) Valuing water quality
tradeoffs at different spatial scales: an integrated approach using bilevel optimization. Water
Resour Econ 11:1–12
21. Brännlund R, Lundgren T (2009) Environmental policy without costs? A review of the Porter
hypothesis. Int Rev Environ Resour Econ 3:75–117
22. Brännlund R, Lundgren T (2010) Environmental policy and profitability: evidence from
Swedish industry. Environ Econ Policy Stud 12:59–78
23. Carlson C, Burtraw D, Cropper M, Palmer K (2000) Sulfur dioxide control by electric utilities:
what are the gains from trade? J Polit Econ 108:1292–1326
24. Caves DW, Christensen LR, Diewert WE (1982) The economic theory of index numbers and
the measurement of input, output, and productivity. Econometrica 50(6):1393–1414
25. Chambers R, Chung YH, Färe R (1996) Benefit and distance functions. J Econ Theory
70(2):407–419
26. Chambers RG, Serra T, Oude Lansink A (2014) On the pricing of undesirable state-contingent
outputs. Eur Rev Agric Econ 41:485–509
27. Chambers RG, Serra T (2018) The social dimension of firm performance: a data envelopment
approach. Empir Econ 54:189–206
28. Cheng SX, Liu W, Lu K (2018) Economic growth effect and optimal carbon emissions under
China’s carbon emissions reduction policy: a time substitution DEA approach. Sustainability
10(5):1543
29. Chung Y, Färe R, Grosskopf S (1997) Productivity and undesirable outputs. J Environ Manag
51(3):229–240
30. Coelli T, Lauwers L, Van Huylenbroeck G (2007) Environmental efficiency measurement and
the materials balance condition. J Prod Anal 28:3–12
31. Considine TJ, Larson DF (2006) The environment as a factor of production. J Environ Econ
Manag 52:645–662
32. Considine TJ, Larson DF (2012) Short term electric production technology switching under
carbon cap and trade. Energies 5:4165–4185
33. Cropper M, Oates W (1992) Environmental economics: a survey. J Econ Lit 30:675–740
34. Dakpo KH, Jeanneaux P, Latruffe L (2016) Modeling pollution-generating technologies
in performance benchmarking: recent developments, limits and future prospects in the
nonparametric framework. Eur J Oper Res 250(2):347–359
35. Dakpo KH, Jeanneaux P, Latruffe L (2017) Greenhouse gas emissions and efficiency in
French sheep meat farming: a non-parametric framework of pollution-adjusted technologies.
Eur Rev Agric Econ 44(1):33–65
36. Dakpo KH, Lansink AO (2019) Dynamic pollution-adjusted inefficiency under the by-
production of bad outputs. Eur J Oper Res 276(1):202–211
37. Dam L, Lundgren T, Scholtens B (2019) Environmental responsibility – theoretical perspec-
tive. In: McWilliams A, Rupp DE, Siegel DS, Stahl GK, Waldman DA (eds) Oxford handbook
of corporate social responsibility. Oxford University Press, Oxford
38. Dechezleprêtre A, Sato M (2017) The impacts of environmental regulations on competitive-
ness. Rev Environ Econ Policy 11:183–206
39. Dechezleprêtre A, Koźluk T, Kruse T, Nachtigall D, de Serres A (2019) Do environmental
and economic performance go together? A review of micro-level empirical evidence from the
past decade or so. Int Rev Environ Resour Econ 13:1–118
40. Duman YS, Kasman A (2018) Environmental technical efficiency in EU member and
candidate countries: a parametric hyperbolic distance function approach. Energy 147:
297–307
41. Färe R, Grosskopf S, Lovell CAK, Pasurka C (1989) Multilateral productivity comparisons
when some outputs are undesirable: a nonparametric approach. Rev Econ Stat 71(1):
90–98
42. Färe R, Grosskopf S, Noh DW, Weber W (2005) Characteristics of a polluting technology:
theory and practice. J Econ 126:469–492
43. Färe R, Grosskopf S, Norris M, Zhang Z (1994) Productivity growth, technical progress,
and efficiency change in industrialized countries. Am Econ Rev 84(1):66–83
44. Färe R, Grosskopf S, Pasurka C (2010) Toxic releases: an environmental performance index
for coal-fired power plants. Energy Econ 32(1):158–165
45. Färe R, Grosskopf S, Pasurka CA (2013) Joint production of good and bad outputs with
a network application. In: Shogren JF (ed) Encyclopedia of energy, natural resource, and
environmental economics, vol 2. Elsevier, Amsterdam, pp 109–118
46. Färe R, Grosskopf S (2004) New directions: efficiency and productivity. Kluwer Academic
Publishers, Boston
47. Färe R, Grosskopf S, Hernandez-Sancho F (2004) Environmental performance: an index
number approach. Resour Energy Econ 26(4):343–352
48. Färe R, Grosskopf S, Margaritis D, Weber WL (2012) Technological change and timing
reductions in greenhouse gas emissions. J Prod Anal 37:205–216
49. Färe R, Grosskopf S, Weber WL (2006) Shadow prices and pollution costs in US agriculture.
Ecol Econ 56(1):89–103
50. Färe R, Grosskopf S, Lundgren T, Marklund P-O, Wenchao Z (2016) The impact of climate
policy on environmental and economic performance: evidence from Sweden. Routledge
Focus/Taylor & Francis, London
51. Färe R, Grosskopf S, Margaritis D (2019) Pricing non-marketed goods using distance
functions. World Scientific Publishing (NOW), Singapore
52. Färe R, Primont D (1995) Multi-output production and duality: theory and applications.
Kluwer Academic Publishers, The Netherlands
53. Førsund F (2009) Good modeling of bad outputs: pollution and multiple-output production.
Int Rev Environ Resour Econ 3:1–38
54. Førsund F (2018) Multi-equation modeling of desirable and undesirable outputs satisfying
the materials balance. Empir Econ 54(1):67–99
55. Førsund F (2017) Pollution meets efficiency: multi-equation modelling of generation of
pollution and related efficiency measures. Memorandum No 09/2017, Department of
Economics, Oslo University
56. Hampf B (2014) Separating environmental efficiency into production and abatement effi-
ciency – a nonparametric model with application to U.S. power plants. J Prod Anal
41(3):457–473
57. Hoagland P, Kite-Powell HL, Jin D, Solow AR (2013) Supply-side approaches to the
economic valuation of coastal and marine habitat in the Red Sea. J King Saud Univ Sci
25:217–228
58. Jaffe AB, Peterson SR, Portney PR, Stavins RN (1995) Environmental regulation and the
competitiveness of US manufacturing: what does the evidence tell us? J Econ Lit 33:
132–163
59. Jaffe AB, Palmer K (1997) Environmental regulation and innovation: a panel data study. Rev
Econ Stat 79:610–619
60. Just RE, Hueth DL, Schmitz A (2004) The welfare economics of public policy: a practical
approach to project and policy evaluation. Edward Elgar Publishing, Cheltenham
61. Kahn JR, Kemp WM (1985) Economic losses associated with the degradation of an ecosys-
tem: the case of submerged aquatic vegetation in Chesapeake Bay. J Environ Econ Manag 12:
246–263
62. Klemick H (2011) Shifting cultivation, forest fallow, and externalities in ecosystem services:
evidence from the Eastern Amazon. J Environ Econ Manag 61:95–106
63. Kortelainen M (2008) Dynamic environmental performance analysis: a Malmquist index
approach. Ecol Econ 64(4):701–715
64. Kumar S (2006) Environmentally sensitive productivity growth: a global analysis using
Malmquist-Luenberger index. Ecol Econ 56(2):280–293
65. Kumar S, Managi S (2011) Non-separability and substitutability among water pollutants:
evidence from India. Environ Dev Econ 16:709–733
66. Kumbhakar SC, Tsionas EG (2016) The good, the bad and the technology: endogeneity in
environmental production models. J Econ 190(2):315–327
67. Lanoie P, Laurent-Lucchetti J, Johnstone N, Ambec S (2011) Environmental policy, inno-
vation and performance: new insights on the Porter hypothesis. J Econ Manag Strateg
20:803–842
68. Lozano S (2015) A joint-inputs Network DEA approach to production and pollution-
generating technologies. Expert Syst Appl 42(21):7960–7968
69. Lozano S (2017) Technical and environmental efficiency of a two-stage production and
abatement system. Ann Oper Res 255(1–2):199–219
70. Luenberger D (1995) Microeconomic theory. McGraw-Hill College, Singapore
71. Lundgren T, Marklund P-O, Samakovlis E, Wenchao Z (2015) Carbon prices and incentives
for technological development. J Environ Manag 150:393–403
72. Lundgren T, Wenchao Z (2017) Firm performance and the role of environmental management.
J Environ Manag 203:330–341
73. Malikov E, Kumbhakar SC, Tsionas EG (2015) Bayesian approach to disentangling technical
and environmental productivity. Econometrics 3(2):443–465
74. Mamardashvili P, Emvalomatis G, Jan P (2016) Environmental performance and shadow
value of polluting on Swiss dairy farms. J Agric Resour Econ 41(2):225–246
75. McArthur LC, Boland JW (2006) The economic contribution of seagrass to secondary
production in South Australia. Ecol Model 196:163–172
76. McConnell KE, Bockstael NE (2005) Valuing the environment as a factor of production. In:
Mäler K-G, Vincent JR (eds) Handbook of environmental economics, ch. 14, vol 2. Elsevier,
Amsterdam
77. McConnell KE, Strand IE (1989) Benefit from commercial fisheries when demand and supply
depend on water quality. J Environ Econ Manag 17:284–292
78. Mekaroonreung M, Johnson AL (2010) Estimating the efficiency of American petroleum
refineries under varying assumptions of the disposability of bad outputs. Int J Energy Sector
Manag 4(3):356–398
79. Murty S, Russell RR, Levkoff SB (2012) On modeling pollution-generating technologies.
J Environ Econ Manag 64:117–135
80. Murty S, Russell RR (2018) Modeling emission-generating technologies: reconciliation of
axiomatic and by-production approaches. Empir Econ 54(1):7–30
81. Oh DH (2010) A global Malmquist-Luenberger productivity index. J Prod Anal 34(3):
183–197
82. Peña CR, Serrano ALM, de Britto PAP, Franco VR, Guarnieri P, Thomé KM (2018)
Environmental preservation costs and eco-efficiency in Amazonian agriculture: application
of hyperbolic distance functions. J Clean Prod 197(1):699–707
83. Picazo-Tadeo AJ, Reig-Martinez E, Hernandez-Sancho F (2005) Directional distance func-
tions and environmental regulation. Resour Energy Econ 27(2):131–142
84. Pindyck R, Rotemberg JJ (1983) Dynamic factor demands and the effects of energy price
shocks. Am Econ Rev 73:1066–1079
85. Piot-Le Petit I, Le Moing M (2007) Productivity and environmental regulation: the effect of
the nitrates directive in the French pig sector. Environ Resour Econ 38(4):433–446
86. Porter M (1991) America’s green strategy. Sci Am 264:168
87. Porter M, van der Linde C (1995) Toward a new conception of the
environment-competitiveness relationship. J Econ Perspect 9:97–118
88. Ray SC, Mukherjee K, Venkatesh A (2018) Nonparametric measures of efficiency in the
presence of undesirable outputs: a by-production approach. Empir Econ 54(1):31–65
89. Rødseth K (2017) Axioms of a polluting technology: a materials balance approach. Environ
Resour Econ 67(1):1–22
90. Serra T, Chambers R, Lansink A (2016) Measuring technical and environmental efficiency in
a state-contingent technology. Eur J Oper Res 236(2):706–717
91. Shadbegian RJ, Gray WB (2005) Pollution abatement expenditures and plant-level produc-
tivity: a production function approach. Ecol Econ 54:196–208
92. Shephard RW (1970) Theory of cost and production functions. Princeton University Press,
Princeton
93. Silva FD, Perrin RK, Fulginiti LE (2019) The opportunity cost of preserving the Brazilian
Amazon forest. Agric Econ 50(2):219–227
94. Van den Bergh JM (1999) Materials, capital, direct/indirect substitution, and mass balance
production functions. Land Econ 75:547–561
95. Weber WL, Domazlicky B (2001) Productivity growth and pollution in state manufacturing.
Rev Econ Stat 83(1):195–199
96. Wei Y, Li Y, Wu MY, Li YB (2019) The decomposition of total-factor CO2 emis-
sion efficiency of 97 contracting countries in Paris Agreement. Energy Econ 78:
365–378
97. Whittaker G, Färe R, Grosskopf S, Barnhart B, Bostian MB, Mueller-Warrant G, Griffith S
(2017) Spatial targeting of agri-environmental policy using bilevel evolutionary optimization.
OMEGA, Int J Manag Sci 66, Part A:15–27
98. Yang F, Yang M, Nie H (2013) Productivity trends of Chinese regions: a perspective from
energy saving and environmental regulations. Appl Energy 110:82–89
99. Yao X, Guo CW, Shao S, Jiang ZJ (2016) Total-factor CO2 emission performance of China’s
provincial industrial sector: a meta-frontier non-radial Malmquist index approach. Appl
Energy 184:1142–1153
100. Zhang N, Choi Y (2013) Total-factor carbon emission performance of fossil fuel power
plants in China: a metafrontier non-radial Malmquist index analysis. Energy Econ 40:
549–559
101. Zhang N, Choi Y (2014) A note on the evolution of directional distance function and its
development in energy and environmental studies 1997–2013. Renew Sustain Energy Rev
33:50–59
102. Zhou P, Ang BW (2008) Decomposition of aggregate CO2 emissions: a production-theoretical
approach. Energy Econ 30(3):1054–1067
103. Zhou P, Sun ZR, Zhou DQ (2014) Optimal path for controlling CO2 emissions in China: a
perspective of efficiency analysis. Energy Econ 45:99–110
39 Applications of Production Theory in Transportation

Phill Wheat, Kristofer Odolinski, and Andrew Smith

Contents
Introduction
Competition and Governance in the Transportation Sector
Approaches to Production Analysis in Transportation
Key Features of Transport
  Outputs Used
  Input Prices
  Variable Cost Function
Cost and Efficiency Studies for Railways
  Infrastructure Studies
  Passenger Train Operations
Cost and Efficiency Studies in Other Transport Sectors
  Road Infrastructure
  Local Public Transport
  Air Transport
Conclusion
Cross-References
References

P. Wheat
Institute for Transport Studies, University of Leeds, Leeds, UK
e-mail: [email protected]
K. Odolinski
Institute for Transport Studies, University of Leeds, Leeds, UK

Society, Environment, and Transport, The Swedish National Road and Transport Research
Institute (VTI), Stockholm, Sweden
e-mail: [email protected]
A. Smith
Society, Environment, and Transport, The Swedish National Road and Transport Research
Institute (VTI), Stockholm, Sweden
e-mail: [email protected]

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_39
Abstract

In this chapter, the experience of production and cost analysis in transportation
is reviewed. The key production and cost analysis needs of the sector can be
identified as providing strategic operational insight; establishing evidence as to
which market and regulatory structures yield the best outcomes; and providing the
evidence base for regulatory scrutiny, either through yardstick competition or
more formal price cap regulation. As a result, transportation has provided the
motivation and illustration for many innovations within production, cost, and
efficiency methods, and this chapter brings to life the issues and solutions found
in the sector.

Keywords

Transport infrastructure · Public transport · Marginal cost · Cost efficiency ·
Railways · Roads · Air transport

Introduction

Analysis of production and cost in transportation has a long history, and the sector
has provided motivation for many innovations in the economist's production and cost
analysis toolbox. For example, the pioneering work of Caves et al. [1, 2] on US
railroads and airlines introduced the need to disentangle economies of density from
economies of scale in network industries. Transportation has also been a key sector
for illustrating methods used to analyze efficiency, including the use of Swiss
railways to illustrate panel data stochastic frontier models [3, 4] and models
involving non-normally distributed noise errors [5].
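
The distinction between density and scale can be illustrated with a small calculation. The sketch below uses the usual convention that returns to density hold the network fixed while returns to scale let the network grow with output. The function names and the elasticity values are hypothetical and chosen only for illustration; they are not estimates from [1, 2].

```python
def returns_to_density(output_elasticities):
    """Inverse of the summed cost elasticities of outputs, network size held fixed."""
    total = sum(output_elasticities)
    if total <= 0:
        raise ValueError("summed output elasticities must be positive")
    return 1.0 / total


def returns_to_scale(output_elasticities, network_elasticity):
    """Inverse of the summed cost elasticities of outputs and network size."""
    total = sum(output_elasticities) + network_elasticity
    if total <= 0:
        raise ValueError("summed elasticities must be positive")
    return 1.0 / total


# Hypothetical cost elasticities, for illustration only.
eps_outputs = [0.5, 0.3]   # e.g. two output measures
eps_network = 0.25         # e.g. route length

rtd = returns_to_density(eps_outputs)             # above 1: economies of density
rts = returns_to_scale(eps_outputs, eps_network)  # below 1: no economies of scale
print(f"RTD = {rtd:.3f}, RTS = {rts:.3f}")
```

With these illustrative numbers, running more traffic over a fixed network lowers average cost, while expanding network and traffic together does not, which is the kind of pattern the density/scale distinction is designed to reveal.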
The richness of data and applications in transportation arises for a number
of reasons. Firstly, some transportation could be considered a public good (for
example, non-tolled highways); however, transport supply often requires subsidy
or exclusivity to make provision viable and efficient (such as provision of infras-
tructure). This motivates a strong public interest in the sector, either through direct
government provision, subsidy to private providers and/or a system of economic
regulation. Secondly, the study of the economics of transportation is nontrivial given
the joint production of transport services (for example, a bus service provides many
journey possibilities along a route) and the temporal nature of transport services,
which implies non-storability. Ultimately, the public interest dimension, coupled with
the complexity of the sector, means datasets are often potentially rich – in terms of
input and output variables that are available for different locations and time
periods – and therefore transportation cost and production analysis has become a
major area of research. Such rich data enable a relatively complex characterization of
outputs and inputs in the analysis and, critically, an analysis of the influence of
quality.
The structure of this chapter is the following. A brief overview of competition
and governance in the transportation sector is provided in section “Competition
and Governance in the Transportation Sector”. Section “Approaches to Production
Analysis in Transportation” reviews approaches to production and cost analysis in
transportation. It is important to recognize the variety of approaches beyond the
econometric approach – including alternative top-down approaches and bottom-
up modelling approaches – which are used within transportation for both research
purposes and to inform operational decisions in the sector. Section “Key Features
of Transport” then considers the features of transportation which make production
analysis challenging. These include multiple outputs and multiple inputs, arising
from the fundamental non-storability of the transportation product. Section “Cost
and Efficiency Studies for Railways” presents empirical findings from the rail mode,
where substantive econometric works have been undertaken, to act as an illustration
of the analysis undertaken in transportation. Section “Cost and Efficiency Studies
in Other Transport Sectors” then briefly surveys work across other transport modes.
Section “Conclusion” concludes.

Competition and Governance in the Transportation Sector1

Transformations of the transport industry and the differing approaches taken by the
public sector were an early core subject of economic analysis, where researchers
tried to answer questions such as why the public sector owns and/or regulates the
production and pricing of certain goods and services, and whether the public sector
should be involved in this matter or not.
Ownership and regulation of the transportation sector have changed over the
years, with large shifts triggered by innovations or even by the introduction of new
modes of transport. For example, up until 1980, railways were often seen as a
typical example of a natural monopoly in which the government needed to intervene
and reduce market inefficiency by means of regulation. However, the benefits of
public ownership and regulation of railways can be questioned when competition
from other modes increases and more public subsidy is required to cover costs. This
can explain the vertical separation of the railway sector in many European
countries around the 1990s; Sweden was the first to do so, in 1988, mainly due to
the growth of required subsidies to the state-owned railway company [6]. Other
countries, such as the USA and Japan, chose a different path by keeping
infrastructure management and train operations vertically integrated. This can
partly be explained by a reliance on intermodal competition, as well as parallel
competition (e.g., two railway lines serving the same areas) and source competition
(i.e., a producer/customer can choose another source and customer/producer of a
certain good). Still, vertically integrated railways are often regulated; for
example, the railway firms in Japan are subject to yardstick competition, as
originally formulated by Shleifer [7].
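
The core idea of yardstick competition can be sketched in a few lines: each firm is benchmarked against the average cost of the other comparable firms, so its own reported cost does not influence its own benchmark. The function name, firm labels, and cost figures below are hypothetical, and the sketch is a simplified illustration of the principle rather than the rule applied by any actual regulator.

```python
def yardstick_benchmarks(unit_costs):
    """For each firm, return the mean unit cost of all *other* firms."""
    n = len(unit_costs)
    if n < 2:
        raise ValueError("yardstick competition needs at least two comparable firms")
    total = sum(unit_costs.values())
    # Leave-one-out mean: subtract the firm's own cost before averaging.
    return {firm: (total - cost) / (n - 1) for firm, cost in unit_costs.items()}


# Hypothetical unit costs for three comparable firms, for illustration only.
costs = {"A": 10.0, "B": 12.0, "C": 14.0}
benchmarks = yardstick_benchmarks(costs)

for firm, cost in costs.items():
    status = "above" if cost > benchmarks[firm] else "at or below"
    print(f"Firm {firm}: cost {cost:.1f} is {status} its benchmark {benchmarks[firm]:.1f}")
```

Because a firm's allowed price or revenue depends only on its peers' costs, it keeps any savings it achieves relative to the benchmark, which is the source of the incentive for cost reduction.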
Contestability is one explanation for the railway reforms chosen in Europe where
train operation was, to various degrees, subject to competition while infrastructure
management was kept as state-owned monopolies (apart from Great Britain where
the rail infrastructure company was privatized in 1996). Undoubtedly, rail
infrastructure has a high level of sunk costs and is less contestable than train
operations with rolling stock that may be used in another market – that is, train
operations do not have the same entry and exit barriers as infrastructure
provision. It was argued that the efficiency gains from competition between train
operators could outweigh any potential losses caused by economies of scale not being
fully exploited. However, the move towards open competition for train operations was
slow in Europe, with mainly competitive tendering being used. Still, Sweden
introduced open access to freight services in 1996 and to passenger services in
2011, while Germany at least nominally opened up for entry and on-the-tracks
competition right from the start of the vertical separation [8].

1 This section is partly based on Odolinski [77]
Infrastructure managers have used competition for the market to increase their
cost efficiency, but compared to other industries, the tendering of infrastructure
maintenance poses some challenges of its own. One example is the interdependence
with renewals, where maintenance activities are carried out on a structure
with a long service life, and both maintenance and renewals may affect future
performance of the assets. However, future contingencies make it costly to write
long-term maintenance contracts, which nonetheless have the benefit of inducing
more investments in quality as the producer can recover its investment costs.
Specifying and monitoring quality is thus an important aspect in infrastructure
provision. This can in turn impact the choice between competitive tendering and
in-house production. As pointed out by Hart et al. [9], in-house production can
be preferable to contracting out to private firms when important quality aspects
are ex-ante non-contractible and innovations in quality are not important. This
is the case even though an in-house production unit has weaker incentives for
investments in cost reductions, as well as for investments in quality, compared
to the contracted firm. The detrimental effect of contracting out in this case is
that the contracted firm will tend to focus too much on cost reductions relative to
quality. Overall, incentive structures that are beneficial in the short run need
not be so in the long run, which can especially be the case when managing an asset
with a long service life.
The internal organization of the infrastructure manager is also an important
aspect in achieving cost-efficient production. A large organization may well have
different working procedures between different units even though there is a central
planning unit or a manager giving (more or less clear) instructions. Estimates
of the relative cost inefficiency between units are therefore useful for internal
benchmarking. A related issue is the size of the contract areas chosen by the
infrastructure manager, which will determine whether economies of scale are fully
exploited or not.
There are clearly many different elements to consider when regulating or
reforming the transport sector, and the effects of reforms and incentive structures
are not obvious. Understanding the production and cost structure of this industry is
therefore vital.
To summarize, key production and cost analysis needs of the sector can be
identified as:

• Strategic operational insight: understanding how costs change with output to
inform operational and investment decisions. Essential here is the allowance for
the quality of output as well as a simple scale metric given the jointness and
non-storability of production.
• Establishing evidence as to which market and regulatory structures yield best
outcomes given the complexity in the transport sector and the inevitable trade-
offs between maintaining network coordination and avoiding useless duplication
of resources against potential inefficiency introduced by reducing competition
for or in the market.
• Providing the evidence base for regulatory scrutiny either through yardstick
competition or more formal price cap regulation.

In general, these issues would tend towards a direct analysis of costs. Indeed in
many transport settings, public or private firms can be stylized as delivering a set of
transport services (such as bus or train km) at a given quality for minimum cost, with
public authorities being tasked with design of the transport system. Thus it makes
sense to compare firms through the cost minimizing paradigm even if in reality there
is some inefficiency relative to this ideal.

Approaches to Production Analysis in Transportation2

In this section of the chapter, we review the context in which econometric methods
contribute to the broad understanding of production in transportation. There are
three main approaches for production and cost analysis in transportation:

• An econometric approach
• An accounting (cost allocation) approach
• A bottom-up approach using mechanistic models

The approach used need not be any one of these in its pure form; there are,
for example, hybrids or combinations of the different approaches. Moreover,
the modelling approaches can be either deterministic or stochastic, where the
former make predictions that exclude random variation, while the latter predict
distributions of potential outcomes.
The econometric approach tries to establish relationships with factors that may
explain variations in costs. This approach is often applied to analyze economies
of scale, scope, or density, or the impact of organizational changes such as (de-)
regulation. It is also a common approach when establishing relationships between
output and (aggregate or disaggregate) costs, such as the impact traffic has on costs
for providing rail infrastructure. This is the primary method used by economists and
is the primary subject of this chapter.
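
As a rough illustration of the econometric approach described above, the following sketch fits a log-log (Cobb-Douglas) cost function by ordinary least squares, so that the slope coefficients can be read as cost elasticities. Everything here is an assumption made for illustration: the data are simulated, the variable names (traffic, track length) are hypothetical, and the "true" elasticities are arbitrary values used only to generate the sample. It is not the specification of any study cited in this chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated sample of 200 hypothetical track sections (all values made up).
n = 200
log_traffic = rng.normal(loc=15.0, scale=1.0, size=n)   # log gross tonne-km
log_track_km = rng.normal(loc=4.0, scale=0.5, size=n)   # log track length
noise = rng.normal(scale=0.1, size=n)

# Arbitrary "true" elasticities, used only to generate the data.
TRUE_TRAFFIC_ELASTICITY = 0.3
TRUE_TRACK_ELASTICITY = 0.8
log_cost = (2.0
            + TRUE_TRAFFIC_ELASTICITY * log_traffic
            + TRUE_TRACK_ELASTICITY * log_track_km
            + noise)

# Ordinary least squares on the log-log cost function:
# ln C = a + b * ln(traffic) + c * ln(track length) + error
X = np.column_stack([np.ones(n), log_traffic, log_track_km])
coef, *_ = np.linalg.lstsq(X, log_cost, rcond=None)
intercept, traffic_elasticity, track_elasticity = coef

print(f"estimated cost elasticity of traffic:      {traffic_elasticity:.3f}")
print(f"estimated cost elasticity of track length: {track_elasticity:.3f}")
```

In applied work, richer functional forms (such as the translog) and panel or frontier estimators are typically used; the point of the sketch is only that the estimated coefficients are interpreted as elasticities of cost.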

2 This section is partly based on Odolinski and Wheat [51].

The accounting approach has been used to allocate costs to different activities
carried out in the provision of transport services. It is often used when the aim is
full cost recovery, where the users of the transport services bear the full cost. Hence,
the fees are usually set at average costs. This has long been the case within the civil
aviation industry, where States have agreed on international policies on airport cost-
recovery. The International Civil Aviation Organization has, for example, developed
a manual on airport economics [10] to assist in the management of airports and
describes the accounting method at length. There is also a tradition of using the
accounting approach in determining bus operating costs. Early reviews on these
costing models are made by, for example, Cherwony et al. [11] and Stopher et al.
[12], and Sinner [13] is a recent review and application of the cost allocation method
for bus and train lines. The accounting approach also has a history in railway
transportation. As noted by Braeutigam [14], the Interstate Commerce Commission
(regulatory commission in the USA) applied the accounting approach to railways
from the 1930s to the 1980s to determine the cost of transport services and the
revenue required to cover their costs. Moreover, the British rail regulator ORR3 used
a mix of the accounting approach and an engineering bottom-up approach prior to
2008 in order to determine the cost for infrastructure services.
The allocation of costs to different activities can be carried out in numerous
ways. Essentially it uses some set of allocation factors or principles to determine
the allocation of fixed and joint costs or identify incremental or avoidable costs.
Total costs are thus split into different categories or activities, in which the allocation
factors depend on the nature of the cost category or performed activity. For example,
time can be used to allocate costs for staff, while distance travelled, weight of the
vehicle, or consumed capacity may be used for operation or maintenance costs. Link
et al. [15] provide a list of different allocation procedures and allocation factors in
studies using cost allocation in road infrastructure analysis and in air transport.
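
The allocation logic described above can be sketched as follows. This is a minimal illustration under stated assumptions: the cost pools, allocation factors, service names, and all figures are hypothetical, and each pool is simply split in proportion to one allocation factor, which is only one of the many possible allocation principles.

```python
def allocate_costs(cost_pools, usage):
    """Split each cost pool across services in proportion to its allocation factor.

    cost_pools: {pool name: (total cost, name of allocation factor)}
    usage:      {allocation factor: {service: quantity of the factor used}}
    """
    services = {s for factor_usage in usage.values() for s in factor_usage}
    allocated = {s: 0.0 for s in services}
    for total_cost, factor in cost_pools.values():
        factor_total = sum(usage[factor].values())
        for service, amount in usage[factor].items():
            allocated[service] += total_cost * amount / factor_total
    return allocated


# Hypothetical example: staff costs allocated by time, maintenance by distance.
cost_pools = {
    "staff": (600.0, "hours"),
    "maintenance": (400.0, "vehicle_km"),
}
usage = {
    "hours": {"bus_line_A": 30.0, "bus_line_B": 10.0},
    "vehicle_km": {"bus_line_A": 100.0, "bus_line_B": 300.0},
}

allocated = allocate_costs(cost_pools, usage)
for service in sorted(allocated):
    print(service, allocated[service])
```

By construction the allocated amounts sum to total cost, which is why the method suits full cost recovery; the resulting figures are average costs per activity, not marginal costs.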
The accounting approach, however, has a set of weaknesses. As noted by
Waters [16], this approach struggles in determining the opportunity cost of the
assets, as well as in distinguishing between fixed and variable costs. Costs in the
accounting system are also often aggregated making it impossible to determine
a cost for an output that may be important for decision-making and efficient
management, for example, the cost of transport in peak hours. Another downside
is that the accounting approach calculates average costs for different activities and
not the marginal cost. From an efficiency perspective, the latter can be preferred in
decreasing cost industries as a basis for price regulation, and transport regulatory
bodies have often sought estimates on marginal costs, especially for railways. In
fact, the use of cost allocations for pricing purposes has been disapproved by
economists since the late 1800s, partly due to its rather arbitrary cost allocation
procedures, cf. Taussig [17] and Griliches [18].

3 As of 2021, the Office of Rail and Road; previously the Office of Rail Regulation and prior to
that the Office of the Rail Regulator.

Both the econometric and accounting approaches are known as top-down
approaches as they allocate total cost to several drivers of costs. The benefits of these
approaches are their simplicity and their use of actual costs. However, the resolution
available on the underlying process relationships through the top-down approach is
limited, as the underlying mechanisms are highly correlated or there are (unknown)
mediating variables.
The bottom-up (engineering) approach can be particularly useful in these cases
as it starts at a lower level to model the physical mechanisms behind the cost
relationships. This approach has been used to estimate the relative cost impact of
different vehicle types or specific characteristics of a vehicle. In short, simulations
are performed based on a set of engineering models to provide estimates of the
damage on the infrastructure caused by a certain characteristic of a vehicle or
train operation. Another application concerns damages done to the vehicle using
an infrastructure with a set of characteristics. These damages are then linked to
maintenance activities or costs that can be implemented in order to reduce the need
for future maintenance and minimize costs over the whole life cycle of the transport
system in question or its subsystems (i.e., vehicles, infrastructure, etc.). In doing
this, a cost function is useful (and required).
The strength of the bottom-up approach is that it explicitly models the underlying
mechanisms behind the cost relationships. However, the use of this approach is
limited by data availability. For example, cost data are usually more aggregated than
the predicted mechanisms. The bottom-up approach thus needs to link mechanisms
with costs at a more aggregate level. In doing this, the elasticities of production need
to be acknowledged. Using simple unit costs might, for example, be problematic
as it ignores factors such as economies of scale and/or scope. Such aspects
can be modelled with a statistical model. Hence, the bottom-up and top-down
approaches can complement each other and a combination of them can be useful
in transportation cost analyses.

Key Features of Transport4

In this section, the key features and issues in transport are reviewed. To illustrate,
we utilize the literature in railways. Table 1 summarizes the key studies and states
what function was estimated and the key inputs and outputs used. In the remainder
of this section, the key advances in the methodology associated with econometric
analysis of railway performance are surveyed.
The multi-output nature of transport has motivated recent studies to use either the
cost frontier or distance function. Distance functions are related to a multi-output
generalization of the production function (the transformation function) and yield
estimates of technical inefficiency through considering feasible radial expansions
(contractions) of outputs (inputs) with respect to the production set.

4 This section is based on an updated review by Wheat [91].

Table 1 Summary of the characteristics of parametric cost studies in railways

| Study | Sample | Function estimated | Inputs or prices used | Outputs used |
|---|---|---|---|---|
| Andrikopoulos and Loizides [19] | 1969–1993; European rail companies | Translog cost function. Returns to scale and productivity studied but no explicit allowance for inefficiency | Total cost. Includes capital costs (historic cost depreciation + interest) | Sum of passenger-km and freight tonne-km |
| Christopoulos et al. [114] | 1969–1992; European rail companies | Input-specific technical inefficiency via Generalized McFadden cost function | Total cost. Includes capital costs (historic cost depreciation + interest) | Total train-km |
| Coelli and Perelman [20] | 1988–1993; European rail companies | Deterministic production function | Number of employees; rolling stock capacity; route kilometers | Passenger-km and freight tonne-km |
| Coelli and Perelman [21] | 1988–1993; European rail companies | Deterministic production and input and output distance functions | Number of employees; rolling stock capacity; route kilometers | Passenger-km; freight tonne-km |
| Couto and Graham [22] | 1972–1999; 27 European railway companies | Short-run variable cost function with first-order cost shares to separate out technical and allocative inefficiency | Input prices for labor, service rendered by third parties, equipment (variable inputs) and measure of capital stock. Also some network characteristic variables | Two models: (1) passenger-km and freight tonne-km (final outputs); (2) passenger train-km and freight train-km (intermediate outputs) |
| Cowie and Riddington [23] | 1992; European rail companies | Deterministic production functions | Number of employees; capital (financial measure) | Passenger train-km; service provision index |
| Deprins and Simar [24] | 1970–1983; Europe + Japan rail companies | Deterministic production function | Number of employees; number of coaches/wagons; energy consumption; route kilometers | Total train-km |
| Farsi et al. [3] | 1985–1997; 50 railway companies in Switzerland | Various stochastic total cost frontier specifications examining the effect of controlling for time-invariant characteristics | Input prices: energy, labor, and capital | Passenger-km and freight tonne-km |
| Gathon and Perelman [25] | 1961–1988; European rail companies | Stochastic factor requirement function | Number of employees | Passenger train-km; freight train-km; route km |
| Gathon and Pestieau [26] | 1961–1988; European rail companies | Stochastic production function (also second stage regression) | Number of employees; number of rolling stocks; route kilometers | Sum of passenger tonne-km and freight tonne-km |
| Ivaldi and McCullough [27] | 1978–1997; 25 US Class 1 Railroads | Translog variable cost function | Prices: indexes of labor, equipment, fuel, and materials | Car miles of (a) bulk, (b) high value, (c) general traffic and replacement of ties installed (infrastructure output) and also average length of haul and length of road miles |
| Kumbhakar [28] | 1951–1975; 13 US Class 1 Railroads | Cobb-Douglas stochastic distance function with demand system to separate out technical and allocative inefficiency | Quantities of labor, energy, and capital | Passenger-km and freight tonne-km |
| Kumbhakar [29] | 1951–1975; 42 US Class 1 Railroads | Stochastic distance function with demand system to separate out technical and allocative inefficiency | Quantities of labor, energy, and capital | Passenger-km and freight tonne-km |
| Kumbhakar et al. [30] | 1971–1994; Europe rail companies | Output and input distance function in a latent class framework | Quantities of labor, energy, and capital | Passenger-km and freight tonne-km |
| Lan and Lin [31] | 1995–2002; 39 international railways | Two distance functions, one modelling technical efficiency, the other modelling service effectiveness | Efficiency model: number of passenger rolling units, number of employees. Effectiveness model: passenger train-km and freight train-km | Efficiency model: passenger train-km and freight train-km. Effectiveness model: passenger-km and freight tonne-km |
| Loizides and Tsionas [32] | 1969–1992; Europe rail companies | Short-run cost function (not frontier) with coefficients which vary by firm or year | Operating costs; capital stock (financial measure) | Passenger-km and freight tonne-km |
| Parisio [33] | 1973–1989; 8 European railway companies | Short-run variable cost function with first-order cost shares to separate out technical and allocative inefficiency | Input prices: labor, energy, materials. Length of track is the measure of the fixed input | Passenger-km and freight ton-km |
| Cantos and Villarroya [34] | 1970–1990; Europe rail companies | Stochastic cost function | Variable cost (excludes capital cost) | Passenger train-km; freight train-km |
| Cantos and Villarroya [35] | 1970–1990; Europe rail companies | Stochastic cost and revenue functions – operating costs, revenue | Labor price, energy price, material price – price of passenger and freight outputs (revenue model) | Passenger-km and freight tonne-km |
| Tsionas and Christopoulos [36] | 1969–1992; European rail companies | Stochastic production frontier with firm environmental variables as determinants of mean inefficiency | Number of employees; energy consumption; capital (financial measure) | Passenger-km and freight tonne-km |

For cost frontier models, both variable cost frontiers (in railways, [22, 27, 33])
and total cost frontiers [3, 34, 35] have been estimated; the choice depends
on whether the infrastructure is deemed quasi-fixed or variable. This decision is
partly determined by the robustness of the available capital stock level variable(s)
versus the capital price variable. For example, one challenge of the cost function is
the difficulty of developing data on input prices, particularly for infrastructure capital.
A variable cost function can be estimated instead, since it requires data on the levels of
capital and not their associated price.

Outputs Used

Network industries can be viewed as producing many different heterogeneous
outputs. Transport networks in particular, given the non-storability of the product,
the large number of origin and destination combinations, and the many different trip
purposes, produce a very large number of outputs. In the limit, transport could be
thought of as producing individual travel opportunities, by time, space, and purpose.
For public transport, such as railways, where the transport service serves many
points at many times of day, such a disaggregation of outputs is likely to be too
extreme to undertake meaningful econometric analysis. As such, a more pragmatic
approach has to be taken in specifying outputs.
Several common features of the output specification can be considered. First, two
general classifications of outputs are common [37]. One set is termed “available
outputs” which are measures of the service that the railway (or other transport
mode) produces (capital supplied) and which are available to customers to consume.
Examples include train-km, vehicle-km, and seat-km. The second set is termed
“revenue outputs” which are measures of consumed outputs. Examples include
passenger-km and tonne-km of freight hauled. These two sets could also be thought
of as intermediate versus final outputs of the transport system, although it must also
be borne in mind that the demand for public transport services is often a derived
demand.
When choosing whether to use available or revenue outputs, it is important to
consider what is required to be measured in the analysis and whether the implicit
assumptions on what is under the firm’s control versus what is exogenous are
reasonable. For example, using available outputs can be justified when considering
the performance of a railway manager where the required outputs from the railway
are heavily prescribed by a regulator or government. As such, the railway manager
does not have much discretion as to how many train-km, vehicle-km, etc. can be run.
This is instead set by the regulator. However, if analysis of the effect of government
policy is the aim of a study, then it is more appropriate to adopt revenue output
measures as policy makers have discretion in the specification of railway services to best
meet demand. Of course, available outputs might be used in this context, alongside
revenue outputs as a measure of characteristics (quality) of the revenue output.
Any measured inefficiency from models reflects both inefficiency of the man-
agers and of policy makers or regulators [37]. Lan and Lin [31] cite Fielding
et al. [38] who define specific terms for these concepts. They define the degree
of suboptimal transformation of inputs into intermediate outputs as “technical
inefficiency,” while they define the degree of suboptimal transformation of inputs into
final outputs as “technical ineffectiveness.” They define a further concept, “service
ineffectiveness,” as the degree of suboptimal transformation of intermediate outputs
into final outputs. They point out that it is the non-storability property of transport
outputs which requires such distinctions. Often of immediate concern is the
“technical inefficiency” concept, since the public transport undertaking (at least
in the short run) has to take its outputs as given. “Technical inefficiency” is
placed in quotation marks in order to distinguish the Fielding et al. concepts
from the definition of technical efficiency in production theory. In particular, in
this chapter, cost inefficiency is considered, which includes allocative as well as
technical inefficiency, even though this applies to the transformation of inputs into
intermediate outputs (and not final outputs). This is appropriate given that the chapter is
considering the cost characteristics and performance of different parts of a vertically
separated industry.
It is important to emphasize that network size is viewed in public transport
as a characteristic of transport outputs, since the size of the network affects the
scope of travel opportunities available to users. This is in contrast to the use of
network size as a proxy for the capital stock for which empirical estimation of
related coefficients has yielded counter-intuitive signs (see the discussion about
inputs below). Therefore, empirical evidence suggests that network size has a strong
relation to the output of the railway rather than as a measure of the stock of capital
of the railway.
The second general distinction that has been made is the need to distinguish
between scale and density effects. Density effects comprise the effect on costs of
increasing all outputs (in equal proportion) while holding network size constant.
Scale effects comprise the effect on costs of increasing all outputs and network
size in equal proportion. This distinction is important since it is often argued that
marginal costs in network transport industries are below average costs and this is
a problem in terms of opening such markets to competition. Specifically, there is
strong reason to suggest that the marginal cost of accommodating an additional
consumer using the current network size through greater utilization is very small,
while it is not clear that the marginal cost of expanding the network to accommodate
the marginal consumer (here marginal O-D pair) is less than average cost. This was
one argument for choosing a vertical separation between train operations and rail
infrastructure management in Europe, where the former was exposed to competition
and the latter was kept as state-owned monopolies.
Caves et al. [1, 2] outlined expressions for returns to scale (RtS) and returns to
density (RtD) in cost functions, which can be computed as follows:

$$\mathrm{RtS} = \frac{1}{\sum_{i=1}^{m-1} \varepsilon_{y_i} + \varepsilon_S} \tag{1}$$

$$\mathrm{RtD} = \frac{1}{\sum_{i=1}^{m-1} \varepsilon_{y_i}} \tag{2}$$

where εyi is the elasticity of cost with respect to the ith output (i = 1, . . . , m-1) and
εS is the elasticity of cost with respect to the network size variable.5
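
The two measures in Eqs. 1 and 2 can be sketched numerically as follows. The function names and the elasticity values are hypothetical and chosen only for illustration; in practice the elasticities would come from an estimated cost function.

```python
def returns_to_density(output_elasticities):
    """Eq. 2: inverse of the summed cost elasticities of the outputs."""
    return 1.0 / sum(output_elasticities)


def returns_to_scale(output_elasticities, network_elasticity):
    """Eq. 1: as Eq. 2, but the network size elasticity is added to the sum."""
    return 1.0 / (sum(output_elasticities) + network_elasticity)


# Hypothetical elasticities, e.g. for passenger and freight train-km.
eps_y = [0.35, 0.25]
eps_s = 0.50  # hypothetical elasticity of cost with respect to network size

rtd = returns_to_density(eps_y)
rts = returns_to_scale(eps_y, eps_s)
print(f"RtD = {rtd:.3f}, RtS = {rts:.3f}")
```

With these made-up values, RtD is about 1.67 (increasing returns to density: costs rise less than proportionally when traffic grows on a fixed network), while RtS is about 0.91 (costs rise slightly more than proportionally when outputs and network size grow together). The example is constructed to show that the two measures can point in different directions.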
The need to distinguish between scale and density effects or the choice between
revenue versus available outputs is only part of the wider issue of how to account
for the heterogeneity of railway outputs, as introduced at the start of this section.
One way to deal with the heterogeneity in outputs is to group outputs into m groups
and include a further set of r variables which characterize the outputs.

$$C(y_1, \ldots, y_m, q_1, \ldots, q_r, p_1, \ldots, p_n) \tag{3}$$

The move from potentially hundreds or thousands of outputs to a more manageable
number of m outputs is obviously a simplification. However, the inclusion
of output characteristic variables is an attempt to reintroduce heterogeneity in
outputs back into the model. Such variables may include revenue measures (such
as passenger-km and freight tonnes-hauled) where available measures are adopted
as output and vice versa. As such it can become ambiguous as to what variables
represent outputs versus output characteristics versus network size. By implication
it also means that in practice, the distinction between the “technical inefficiency”
and “technical ineffectiveness” of Fielding et al. [38], discussed earlier, is far from
clear (e.g., if train-km and passenger load factor enter the model).
The inclusion of characteristic variables in the cost function specification has
prompted new definitions of returns to scale and density to be proposed to allow
for the possibility of characteristics of outputs changing along with the outputs or
network size themselves. (See Oum and Zhang [39] for a discussion.) The ideas are
similar to the discussion in Caves et al. [40] regarding the need to consider changes
in unobserved network effects in RtS described above; however, in Oum and Zhang
[39], these relate to changes in observed rather than unobserved variables. These
ideas are applied to the analysis of train operating company (TOC) costs in section
“Input Prices”, where several scale and density measures are proposed taking into
account variations in output characteristics as well as “primary” outputs.
While this formulation does simplify the problem to a tractable level, the
resulting function may be very complicated, given the number of variables and
possible interaction and higher order terms for each. As a result, the cost function
may still not be suitably parsimonious. An alternative is the hedonic cost function
developed by Spady and Friedlaender [41] and applied, for example, by Wheat and
Smith [113] to train operating company data as described below.

5 For notational convenience and consistency with other equations which do not distinguish
between the network size variable and other outputs, the network size variable is treated as the mth
output and so only the first m-1 output elasticities are used in the RtD equation (which excludes
this output).

Input Prices

The measures of the price of inputs should reflect the opportunity cost of a unit of
those inputs. For example, the opportunity cost associated with 1 hour of labor is the
wage rate. Less obvious is the price of capital. It should reflect the hourly rental of
the capital. This is problematic to measure because of heterogeneity in capital (see
below) but also due to the fact that capital tends to be owned rather than leased.
Methods such as the perpetual inventory method (see Bishop and Thompson [42])
have been developed to better capture a measure of capital price.
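
A minimal sketch of the perpetual inventory idea is given below. It rests on assumptions that are not taken from this chapter: geometric depreciation at a constant rate, a known initial stock, and a simple user-cost rental price equal to the asset price times the sum of the interest and depreciation rates (ignoring taxes and capital gains). All figures are hypothetical.

```python
def perpetual_inventory(initial_stock, investments, depreciation_rate):
    """Accumulate a capital stock series: K_t = (1 - d) * K_{t-1} + I_t."""
    stocks = []
    stock = initial_stock
    for investment in investments:
        stock = (1.0 - depreciation_rate) * stock + investment
        stocks.append(stock)
    return stocks


def rental_price(asset_price, interest_rate, depreciation_rate):
    """Simplified user cost of capital per unit of stock (no taxes, no capital gains)."""
    return asset_price * (interest_rate + depreciation_rate)


# Hypothetical figures in constant prices.
stocks = perpetual_inventory(initial_stock=1000.0,
                             investments=[100.0, 100.0, 50.0],
                             depreciation_rate=0.05)
rental = rental_price(asset_price=1.0, interest_rate=0.04, depreciation_rate=0.05)

print("capital stock by year:", stocks)
print("rental price per unit of capital:", rental)
```

The stock series gives a quantity of capital and the rental price gives an annual price per unit of that stock; their product is one possible measure of the annual capital cost of owned assets.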
A further issue with the price of capital is the relationship between this and the
network size which could be viewed as a measure of capital. In particular, because
of a positive coefficient on miles of railroad, a negative marginal product of capital
is suggested [43]. However, it is clear that in a railway cost function, network size is
much more related to the scale of output of operation than a measure of the capital
stock of the network.
In practice, defining input prices raises a problem similar to that of defining
outputs, i.e., the problem of heterogeneity in inputs. For example, average salary
is likely to be a poor measure of the labor price as workers may work a different
number of hours across observations. Likewise, there is the possibility of a different
mix of workers across different observations. One firm may thus face higher
labor costs because it utilizes more expensive but higher skilled labor. This is
likely to distort coefficient estimates (and indeed estimates of inefficiency) due
to endogeneity of explanatory variables. The usual way to remedy this is to
disaggregate further the input prices in the model (such as wage rates per staff type),
but this adds to the number of coefficients to be estimated and the data may simply
not exist.

Variable Cost Function

As stated earlier, the transport sector is to a large extent a regulated industry, and
in many regulatory settings, it is often not reasonable to assume that the firm can
adjust the levels of all inputs. For example, the size and configuration of railway
infrastructure is often fixed. In these circumstances, the variable (short run) cost
function is appropriate. It can be derived using duality from a production function
under the assumption of cost minimization, a level of the fixed input(s), and prices
of the variable inputs. The resulting function for m outputs, n inputs (o of them
fixed) is given as:

$$V(y_1, \ldots, y_m, p_1, \ldots, p_{n-o}, z_1, \ldots, z_o) \tag{4}$$

where y and p are as before and zi represents the level of the ith fixed input. The
measure of variable cost in the function should only include the costs associated
with the variable inputs and not those associated with the fixed input(s).
The issues raised in the discussions above on output, functional form, and inputs
are applicable to the variable cost function in addition to the total cost function, but
the measurement of RtS and RtD is subtly different. The reason is that there is
a need to consider the effect of the fixed factor(s) when computing RtS and RtD.
Caves et al. [1, 2] give the expressions as:

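$$\mathrm{RtS} = \frac{1 - \sum_{i=1}^{o} \varepsilon_{z_i}}{\sum_{i=1}^{m-1} \varepsilon_{y_i} + \varepsilon_S} \tag{5}$$

$$\mathrm{RtD} = \frac{1 - \sum_{i=1}^{o} \varepsilon_{z_i}}{\sum_{i=1}^{m-1} \varepsilon_{y_i}} \tag{6}$$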

It is not entirely clear from the subsequent literature when (Eqs. 5 and 6) should
be employed vis-à-vis (Eqs. 1 and 2). For example, Wilson [43] uses two variables
to capture the fixed factor: length of railroad and average speed rating.
He defines RtS as (Eq. 5) except that εzi only includes the variable cost elasticity with
respect to length of railroad. This seems intuitive given that the line speed measure
is a characteristic of the track which may a priori not be expected to change with
size of network. However, RtD is given as (Eq. 2) rather than (Eq. 6) (Wilson [43],
footnote 20), which seems odd given the definitions in Caves et al. [1]. Caves et al.
[2] compute RtD for their variable cost specification as (Eq. 6) where εzi is the
cost elasticity with respect to capacity (defined as the sum of the annual service
flows (measured in constant 1977 dollars) from flight equipment and from ground
property and equipment – footnote 19).
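
To make the difference between the two sets of measures concrete, the sketch below evaluates Eqs. 5 and 6 alongside Eqs. 1 and 2 for the same elasticities. The function names and all elasticity values are hypothetical; the elasticity of variable cost with respect to the fixed input is assumed to be negative here (more of the fixed input lowers variable cost), which is an assumption and not a result from the studies cited.

```python
def variable_cost_rts(eps_y, eps_s, eps_z):
    """Eq. 5: (1 - sum of fixed-input elasticities) / (sum of output elasticities + eps_S)."""
    return (1.0 - sum(eps_z)) / (sum(eps_y) + eps_s)


def variable_cost_rtd(eps_y, eps_z):
    """Eq. 6: (1 - sum of fixed-input elasticities) / (sum of output elasticities)."""
    return (1.0 - sum(eps_z)) / sum(eps_y)


# Hypothetical elasticities of variable cost.
eps_y = [0.4, 0.3]   # outputs
eps_s = 0.4          # network size
eps_z = [-0.1]       # one fixed input (assumed to lower variable cost)

rts_eq5 = variable_cost_rts(eps_y, eps_s, eps_z)
rtd_eq6 = variable_cost_rtd(eps_y, eps_z)

# The same elasticities plugged into Eqs. 1 and 2, which ignore the fixed input.
rts_eq1 = 1.0 / (sum(eps_y) + eps_s)
rtd_eq2 = 1.0 / sum(eps_y)

print(f"Eq. 5 RtS = {rts_eq5:.3f}  vs  Eq. 1 RtS = {rts_eq1:.3f}")
print(f"Eq. 6 RtD = {rtd_eq6:.3f}  vs  Eq. 2 RtD = {rtd_eq2:.3f}")
```

With a negative fixed-input elasticity, the numerator in Eqs. 5 and 6 exceeds one, so both measures come out larger than their counterparts from Eqs. 1 and 2. This is one reason why it matters to state which expression is being reported.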
Clearly, either (Eq. 1/2) or (Eq. 5/6) could be valid measures of RtS and RtD in
a variable cost function; ultimately the two sets of measures are aimed at answering
subtly different questions. Equations (Eq. 1/2) measure how variable cost is
affected by changing scale and density, while (Eq. 5/6) measure how
total cost is affected by changing scale and density. What is important in any
analysis is to clearly state to what costs RtS/RtD relate. To some extent, this point is
moot for this chapter, since it is debatable whether what is being estimated is a total
or variable cost, especially in vertically separated industries such as railways. In
particular, we can consider the cost function (Eq. 3) to be a total cost function in the
sense that cost comprises all costs under direct control of train operating companies.
Similarly in (Eq. 4), the sum of maintenance and renewals cost is all that is in control
of the infrastructure manager, so again it is a total cost function. However, from the
perspective of a country’s railway, each cost set is only a part of the wider system.
Related to this discussion is whether network length is viewed as a fixed factor
in the variable cost function or simply an output to distinguish RtS from RtD (as it
is described in the total cost function). Clearly, this decision affects the choice
of which expression for RtS and RtD to adopt. Lee and Baumel [44]
point out that a model with a fixed factor included alongside a capital price in a cost
function violates the properties of both long-run and short-run cost functions. All in
all, it is not clear which measure of RtS and RtD to adopt in practice.

Cost and Efficiency Studies for Railways

In this section, we discuss in detail the application of production and cost econometrics
in railways. This mode has been chosen due to the extensive and diverse work
undertaken, which effectively illustrates the breadth of analysis undertaken within
the transport sector.

Infrastructure Studies

The railway system has long been considered a natural monopoly in which the mar-
ket equilibrium results in one firm producing railway services. In Europe, however,
many countries have separated train operations from infrastructure management
as evidence has suggested that train operators could lower their average costs.
This made it possible to introduce competition between train operating companies
by running train services on one or several lines while maintaining the natural
monopoly case of infrastructure management with one owner, the state. Many
European countries used in-house resources to maintain the infrastructure, yet there
are examples where these services have been contracted out. In Britain, this resulted
in concerns over the quality of the track, while Odolinski and Smith [45] found that
the use of competition for maintenance contracts in Sweden reduced costs by around
11% without any measurable falls in quality.
Many other countries have chosen a vertically integrated railway system and have
introduced regulations to reduce the negative effects of market power. Irrespective
of the solution used, the public sector needs to decide on the price to be set (or
allowed to be set), either through ownership or (de)regulation. There has therefore
been a large amount of work to understand railway infrastructure maintenance and
renewal cost from the perspective of quantifying the wear and tear by traffic on the
infrastructure network. In Europe in particular, this has been to inform marginal
cost-based pricing.

Marginal Cost Studies

Beginning with research by Johansson and Nilsson [46], there have been several
studies that have estimated variable cost functions for rail infrastructure mainte-
nance and for the sum of infrastructure maintenance and renewal costs. Studies
have utilized either track section or regional data. Most studies make use of
observations over a number of years. The time dimension makes it possible to
consider intertemporal effects. For example, maintenance costs in one year can have an
impact on the maintenance carried out in the subsequent year(s). The reason is that,
for example, a (sudden) change in traffic can imply a (temporary) deviation from
the original cost-minimizing plan. It can take time to adjust to the new situation due
to, for example, budget and/or planning restrictions. Indeed, intertemporal effects
have been found in rail infrastructure cost studies (see Odolinski and Nilsson [47])
as well as road infrastructure cost studies (see Haraldsson [48]).
There are two commonly defined (high level) cost categories relevant in deter-
mining infrastructure marginal wear and tear costs: maintenance cost and renewal
cost. Maintenance generally comprises expenditure on activities associated with
day-to-day upkeep of the infrastructure, while renewals comprise expenditure on
the like-for-like replacement of life-expired assets.
Both cost categories contain substantial elements that vary with traffic, and
so both should be analyzed in econometric modelling of marginal costs. However,
most studies in this area have considered maintenance expenditure only as the
dependent variable, and the limited number of studies that consider the sum of the
two categories suffer from poor fit. This is because renewals expenditure tends to
be lumpy (discrete in nature) and also depends on past, as well as current, traffic
levels.
In terms of the choice of the sum of maintenance and renewal cost versus analysis
of maintenance cost only, there is the obvious benefit of using the sum of these
cost categories as the dependent variable since this considers the majority of the
infrastructure manager’s activity that can (non-arbitrarily) be allocated to individual
track sections or areas. It also avoids problems associated with different definitions
of what exactly comprises maintenance versus renewal which can differ from zone
to zone within an infrastructure manager and particularly from one infrastructure
manager to another. However, there is less certainty that a cost function for
maintenance and renewals combined includes all of the appropriate variables,
given the dynamic and lumpy nature of renewals expenditure. This could bias
any efficiency estimates derived from the model. As such, a model for maintenance
only expenditure is still a useful complement to a model with both cost categories
as the dependent variable. Another approach is to model the dynamics between
maintenance and renewals, acknowledging that these activities are input substitutes
in most life cycle asset management. Wheat [49] and Odolinski and Wheat [50] are
examples of top-down approaches to infrastructure costs, while Gaudry et al. [52]
provide estimates based on a (bottom-up) optimization model for maintenance
and regeneration of rail infrastructure.
Models for renewal costs only have also been considered in the literature, using
either corner solution models or survival analysis to capture the impact traffic has
on this cost category (see Andersson et al. [53, 54] for railway examples; see
[55] for a survival analysis on road infrastructure reinvestments; and Odolinski and
Nilsson [112] for a comparison of the corner solution and survival model approach
in this context). The logic behind the cost impact is that a (temporary) increase in
traffic implies that future renewals will be carried out earlier than originally planned.
Note that a permanent change in traffic may also result in a change in the length of
the renewal intervals.
The type of outputs used are intermediate outputs. The primary reason for
this choice is that the motivation for the costing exercise was to derive marginal
costs with respect to intermediate outputs. This also corresponds to the type of
output which the infrastructure manager perceives, and so intermediate outputs
are appropriate for measuring the efficiency (as opposed to effectiveness, Lan and
Lin [31]) of this decision-making unit. The most popular output measure is gross
tonne-km as, relative to commonly available alternatives, this seems to be most
aligned to the true physical driver of damage and thus explains infrastructure costs.
There may be a priori reason to believe that there is benefit from distinguishing
between gross tonne-km of passenger and freight traffic as this would be more
cost reflective. While there has been some success in doing this (see [56] for a
synthesis of research), there is concern over the plausibility of the relative magnitudes
of marginal costs for the two traffic types. In particular, the estimates imply that freight
traffic does up to seven times less damage per gross tonne-km than passenger traffic (on
average across the network), which seems implausible. Therefore, most studies have
preferred to work with a single measure of output.
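As a minimal sketch of how a marginal wear and tear cost can be obtained from an estimated cost function, the example below fits a simple log-log (Cobb-Douglas) cost function by OLS and then uses the identity that marginal cost equals the cost elasticity times average cost. The functional form, the function names, and the synthetic data are all assumptions made for illustration; they are not taken from any of the studies cited here, which typically use richer specifications.

```python
import numpy as np


def fit_loglog_cost(cost, gross_tonne_km, track_km):
    """OLS fit of ln(cost) = a + b*ln(gross tonne-km) + c*ln(track-km)."""
    X = np.column_stack([
        np.ones(len(cost)),
        np.log(gross_tonne_km),
        np.log(track_km),
    ])
    coef, *_ = np.linalg.lstsq(X, np.log(cost), rcond=None)
    return coef  # array [a, b, c]


def marginal_cost(traffic_elasticity, cost, gross_tonne_km):
    """Marginal cost = cost elasticity times average cost per gross tonne-km."""
    return traffic_elasticity * cost / gross_tonne_km


# Synthetic track-section data, generated only to exercise the functions.
rng = np.random.default_rng(0)
n = 500
track_km = rng.uniform(5.0, 80.0, n)
gross_tonnes = rng.uniform(1e6, 2e7, n)        # traffic density per section
gross_tonne_km = track_km * gross_tonnes
TRUE_B, TRUE_C = 0.3, 0.6                      # assumed elasticities for simulation
log_cost = (2.0 + TRUE_B * np.log(gross_tonne_km)
            + TRUE_C * np.log(track_km) + rng.normal(0.0, 0.1, n))
cost = np.exp(log_cost)

a_hat, b_hat, c_hat = fit_loglog_cost(cost, gross_tonne_km, track_km)
mc = marginal_cost(b_hat, cost, gross_tonne_km)
print(f"estimated traffic elasticity: {b_hat:.3f}")
print(f"mean marginal cost per gross tonne-km: {mc.mean():.6f}")
```

Because the estimated traffic elasticity is below one, the marginal cost on every section is below average cost, which is why pricing at marginal cost does not recover the full maintenance budget.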
Much work has gone into trying to better characterize the nature of the
infrastructure. There seem to be three distinct measures of this input. First,
measures of what the infrastructure actually is, i.e., its characteristics. Second, what
the capability is of the infrastructure, given its composition, in terms of what quality
of train service it can support. Third, there are measures that describe the condition
of the infrastructure, although these are often interrelated with the second category.
Table 2 gives examples of measures for each category through review of those used
in several European studies. There are a limited number of condition variables used
in these studies. Potentially, the condition measures adopted by Kennedy and Smith
[63] (number of broken rails and infrastructure manager caused delays) could be
useful to incorporate into these cost functions. In particular, this can be used to
analyze the trade-off between costs for producing infrastructure services and costs
for unreliability.
Finally, Table 3 presents RtD and RtS from a selection of studies in the literature
that have examined infrastructure maintenance cost. The focus is on infrastructure
maintenance cost (as opposed to maintenance and renewal together). These studies
have all found increasing RtD, with elasticities of cost with respect to traffic density
of the order of 0.2–0.4 at the sample mean [56]. Less clear is the evidence on
RtS, with some studies finding large increasing RtS while other studies find only
small increasing RtS. However, the usefulness of the RtS measure here has to be
questioned, especially for studies that utilize observations by track sections (such
studies are Johansson and Nilsson [64], Tervonen and Idstrom [65], Munduch et al.
[60], Gaudry and Quinet [61], Andersson [59]). In particular, the length of a track
section has little to do with the organization of maintenance and renewal activities,
because typically maintenance/renewal teams are responsible for a number of track
sections. Thus when analyzing track section data, a more appropriate measure of
RtS would relate to the overall track-km maintained/renewed by each operational
Table 2 Infrastructure variables used in previous railway infrastructure cost studies
Great
Country Sweden Britain Sweden Austria France Switzerland Sweden Finland
Study Odolinski Wheat and Andersson [59] Munduch et al. Gaudry and Marti and Johansson Johansson
[57] Smith [58] [60] Quinet [61] N’schwander [62] and Nilsson and Nilsson
[46] [46]
Infrastructure Track length Track length Track section Track section Number of Track length Track length Track length
characteris- Number of Route distance length track apparatus Track distance Switches Switches
tics tracks length Route length Length of Whether the (route length) Bridges
Length of Length of Tunnels single-railed track is Length of switches Tunnels
switches switches Bridges tunnels in electrified Length of bridges
Tunnels Rail weight meters Route length Tunnels
Bridges Rail gradient Length of Number of Level crossings

Rail cant double-railed tracks Track radius
Curvature tunnels in Automatic Track gradient
Lubrication meters traffic control Noise/fire protection
Joints Track radius included or not Number of switches
Continuous Track gradient (by type)
welded rails Length of the Shafts
Frost protection switches Platform edge
Switches Station rails (as
Switch age percentage of
Sleeper age track length)
Rail age
Ballast age
Capability Maximum Continuously Rail weight Maximum line Maximum line Track Electrified
axle load welded rails Continuous speed speed quality Average
Track Maximum welded rails index speed
quality class line speed Track quality Secondary
Rail weight Maximum class lines
axle load
Condition Rail age Switch age Rail age Rail age
Sleeper age Sleeper age
Rail age
Ballast age
Source: Work carried out by Phil Wheat, ITS, University of Leeds. Reproduced from Link et al. [15]

Table 3 Estimates of returns to scale and density from infrastructure maintenance cost studies
Study                       Country              Returns to scale  Returns to density
Johansson and Nilsson [64]  Sweden               1.256             5.92
Johansson and Nilsson [64]  Finland              1.575             5.99
Tervonen and Idstrom [65]   Finland              1.325             5.74–7.51
Munduch et al. [60]         Austria              1.449–1.621       3.70
Gaudry and Quinet [61]      France               Not reported      2.70
Andersson [59]              Sweden               1.38              4.90
Wheat and Smith [58]        Britain              2.074             4.18
Smith et al. [66]           International study  1.11              3.25
Marti et al. [67]           Switzerland          Not reported      4.54
Smith and Wheat [68]        Britain              1.13              3.29
Odolinski [57]              Sweden               1.07              2.47
NERA [69]                   US                   1.15              2.85
Source: Amended from Wheat and Smith [58]

crew, which is likely to be greater than the track section-km and invariant across
track sections within each operational area. In preparing this chapter, no instance of
such a variable being used within these cost functions was found.
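The link between the reported returns measures and the underlying cost elasticities can be sketched as follows. The sketch assumes one common pair of definitions, in which RtD is the inverse of the cost elasticity with respect to traffic (network held fixed) and RtS is the inverse of the sum of the traffic and network-size elasticities. As noted above, studies differ in which expression they adopt, so this is an illustration and not a restatement of any one study's method. The function names are hypothetical.

```python
def returns_to_density(traffic_elasticity):
    """RtD under the assumed definition: 1 / (cost elasticity w.r.t. traffic)."""
    return 1.0 / traffic_elasticity


def returns_to_scale(traffic_elasticity, network_elasticity):
    """RtS under the assumed definition: 1 / (sum of traffic and network elasticities)."""
    return 1.0 / (traffic_elasticity + network_elasticity)


def implied_traffic_elasticity(rtd):
    """Invert the assumed RtD definition to recover the traffic elasticity."""
    return 1.0 / rtd


# Returns to density values taken from Table 3 (lowest and highest point estimates
# among the single-valued entries).
for study, rtd in [("Odolinski [57]", 2.47), ("Johansson and Nilsson [64], Finland", 5.99)]:
    print(f"{study}: implied traffic elasticity {implied_traffic_elasticity(rtd):.3f}")
```

Under this definition, values of RtD above one correspond to traffic elasticities below one, that is, to average cost falling as traffic on a fixed network rises.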
Further, more recent studies of rail infrastructure maintenance have been con-
ducted using French data for the purpose of setting track access charges – these
broadly support the results above (see Smith et al. [70]). This recent body of work
has also focused on establishing asset by asset elasticities within rail infrastructure
and expanding the literature on rail renewals marginal cost, building on the work of
Andersson et al. [54].

Efficiency Studies
Published research in the academic literature on performance of railway infrastruc-
ture managers is also limited. As with the train operating company research, all
(published) studies relate to the British infrastructure manager, but some do involve
international comparisons with other infrastructure managers.
At the 2003 regulatory review of the British infrastructure manager’s efficiency
performance, ORR commissioned LEK [71] to undertake internal benchmarking of
Network Rail. This looked at potential efficiency savings for various expenditure
categories based on comparisons across Network Rail’s operating areas (seven in
total). Some of the work involved statistical analysis but the analysis was far from
a top-down econometric efficiency study. Efficiency techniques employed were
limited to OLS adjusted by either a COLS shift or lower quartile shift.
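The two adjustments mentioned above can be sketched as follows, assuming a cost function estimated in logarithms. Under corrected OLS (COLS), the fitted line is shifted down to pass through the lowest residual, so the best unit defines the frontier. Under a lower quartile shift, the benchmark is the 25th percentile of the residuals instead. The seven data points are invented for illustration (seven matches the number of operating areas mentioned, but the values are not from LEK [71]), and capping efficiency at one for units below the quartile benchmark is an assumption of this sketch.

```python
import numpy as np


def ols_residuals(log_cost, log_drivers):
    """Residuals from an OLS regression of log cost on a constant and cost drivers."""
    X = np.column_stack([np.ones(len(log_cost)), log_drivers])
    coef, *_ = np.linalg.lstsq(X, log_cost, rcond=None)
    return log_cost - X @ coef


def cols_efficiency(residuals):
    """COLS: shift the frontier to the minimum residual; scores lie in (0, 1]."""
    return np.exp(residuals.min() - residuals)


def lower_quartile_efficiency(residuals):
    """Shift to the 25th percentile residual; units beating it are capped at 1."""
    shift = np.percentile(residuals, 25)
    return np.minimum(1.0, np.exp(shift - residuals))


# Hypothetical data for seven areas: cost and a single output (e.g., track-km).
output = np.array([120.0, 95.0, 150.0, 80.0, 110.0, 135.0, 100.0])
cost = np.array([60.0, 55.0, 70.0, 38.0, 58.0, 62.0, 45.0])

resid = ols_residuals(np.log(cost), np.log(output))
eff_cols = cols_efficiency(resid)
eff_lq = lower_quartile_efficiency(resid)
for i, (c, q) in enumerate(zip(eff_cols, eff_lq), start=1):
    print(f"area {i}: COLS {c:.3f}, lower quartile {q:.3f}")
```

The lower quartile benchmark is less demanding than COLS, since it does not attribute the whole gap to the single best unit to inefficiency. Both approaches are deterministic: neither separates inefficiency from statistical noise.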
A more rigorous econometric study aimed at measuring disparity between the
performances of individual geographical areas within the British infrastructure
manager was undertaken by Kennedy and Smith [63]. This internal benchmarking
study adopted both deterministic and stochastic input orientated distance function
models and utilized relatively robust data, sourced directly from the industry.
They considered, in two separate models, maintenance only cost and the sum of
maintenance and track renewal cost as inputs, combined with delay minutes and
broken rails as the two other inputs. The levels of these inputs were then assumed to
be endogenously determined given a set of outputs (hence the input orientation of
the distance function). The outputs were track-km and two traffic density variables –
freight tonne-km and passenger train-km both per track-km.
Their findings suggest that the infrastructure manager Railtrack (now replaced
by Network Rail) made substantial improvements in efficiency from privatization to
2000/2001, but then its efficiency deteriorated after this period. There was a key
event in October 2000 (the “Hatfield accident”) which for various reasons prompted
a revision in the behavior of the infrastructure manager and ultimately led to it going
into administration and being replaced by Network Rail. In particular, they find
that most of the earlier gains in efficiency were wiped out by the deterioration
post Hatfield. They conclude that the substantial variation in efficiency between
the geographical areas means that there were substantial opportunities to improve
performance going forward.
One issue with the modelling at geographical area level, as opposed to the track
section level analysis discussed earlier in respect of marginal cost estimation, is
that this can lead to a more limited number of variables being included in the
function to characterize the infrastructure. This is partly because of data challenges
in computing geographical averages from more disaggregate data and partly due
to smaller sample sizes restricting what can plausibly be done – this being typical
with regulatory applications of efficiency analysis across the network industries. A
key contribution of Kennedy and Smith [63] is that it represents an early attempt to
incorporate quality into the analysis alongside other variables – in this case, quality
being represented by number of broken rails (asset condition) and infrastructure-
caused delay minutes. Overall Kennedy and Smith demonstrated that suitable data
existed within infrastructure and could be used to find evidence of inefficient
practice through internal benchmarking of geographical units.
More recent work has sought to incorporate more variables into efficiency
analysis of rail infrastructure using geographical data. ORR [72] included measures
of speed and also criticality (impact of infrastructure failure on rail operations). This
was enabled by utilizing more disaggregated (maintenance units) data than the more
aggregate zonal data used in Kennedy and Smith [63], thus increasing the number of
units for estimation. In a similar way, Smith et al. [73] used maintenance unit data
within France and were able to include a richer specification of the technology, for
example, track age, number of bridges and tunnels, and track curvature measures.
Moving to country-level studies, econometric efficiency analysis of Network Rail
formed a very important part of the regulatory efficiency determination process,
starting with the 2008 Periodic Review determinations. This comprised two pieces
of analysis, both benchmarking studies utilizing international comparators. The
primary piece of analysis utilized a data set collected by the UIC (International
Union of Railways) and previously analyzed for the Lasting Infrastructure Cost
Benchmarking (LICB) project [74]. This was data for a selection of railway
infrastructure managers who were members of the UIC. The original LICB project
was based on adjusted average cost calculations. Thus, unit costs were computed,
but adjustments were made based on the characteristics of railways (see Smith and
Wheat [75] for a review of the adjustment factors). However, the subsequent work
sponsored by ORR undertook econometric efficiency analysis of the dataset (1993–
2006) [66, 76]. The preferred model utilized a time varying inefficiency model
which estimated firm specific paths of adjustment. The model found Network Rail
to be 60% efficient. This analysis demonstrated that international comparisons of
railway infrastructure managers could be made using econometric techniques.
A supporting piece of econometric analysis used a bespoke dataset collected
by ORR comprising five infrastructure managers. This dataset included observations
for regions within each infrastructure manager and, in some cases, data over time.
At the 2008 Periodic Review, this dataset was relatively new and so analysis was
limited to verification of the inefficiency estimates from the main LICB data analysis
(which were confirmed). Smith and Wheat [68] develop this analysis further.
In particular, models are proposed which best exploit the multi-level structure of
the data. A similar approach was used by Odolinski [77], who compares the cost
inefficiency between different infrastructure maintenance regions in Sweden, as well
as between contract areas within the regions.
Given the nature of the datasets, the international studies adopted as part of the
2008 Periodic Review made use of a more limited set of explanatory variables than
has been the case with marginal cost literature, where track section data permits
a very rich configuration of the cost structure. That said, the studies are more in
line with the wider cost modelling literature across other industries which typically
would include a small number of output variables in the cost function. One issue
facing rail, and transportation more widely as discussed earlier, is the problem
of accurately measuring input prices. In the rail infrastructure literature, these
issues have partly been addressed through using PPP exchange rate adjustments in
international work, or relying on the commonality of input prices within countries
for geographically based internal benchmarking work. As noted, the recent impulse
has been towards richer characterizations of the technology through utilizing more
disaggregate maintenance unit data for efficiency analysis purposes. However,
the richness of the rail infrastructure cost functions estimated for cost efficiency
purposes still does not match that of studies used to estimate marginal costs, where
the latter utilize much larger datasets at track section level.

Passenger Train Operations

There has been limited published work on the performance of passenger train
operating companies (TOCs). Papers on British TOCs have used a variety of
methods including nonparametric DEA [78–81] and index number approaches [82,
83], as well as parametric estimation of cost functions ([84]; Smith and Wheat [68]),
production functions [85], and distance functions (Affuso et al. [78, 79]). Clearly,
the former methods can only consider cost or technical efficiency and produce no
estimates regarding the actual cost structure.
The papers by Cowie consider three inputs: staff, rolling stock, and network.
This is deficient given that the network input is fixed and difficult to characterize.
Specifically, in Cowie [84, 85], route-km are used as the input for network; however,
as Cowie [80] acknowledges, this is a poor proxy for the true network input. Cowie
[80] replaces this with the cost of access for each TOC as measured by the charges
paid to Network Rail. Given the arbitrariness of the allocation of the fixed charge to
individual operators, the usefulness of this measure has to be questioned. Also, post
2002, the infrastructure manager was not fully funded by TOC access payments.
Instead, the Network Grant (direct payment from government) was introduced
alongside access charges. This further distorts any “price” for network access post
2000 (affecting the Cowie [80] study).
Given the regulatory regime, network access can easily be thought of as a pass-
through with respect to franchised TOCs since TOCs are compensated directly
with respect to changes in access charges as a condition of the franchise contract.
Thus, a cost function which considers TOC cost less access charges as the depen-
dent variable seems most appropriate, rather than the cost function estimated by
Cowie.
Affuso et al. [79] do not include any network inputs into their distance function,
but this simple exclusion does not seem optimal since the network may affect the
transformation function. It is considered that a better way to deal with this is to
estimate a variable cost function with infrastructure held fixed.
Turning to the results on TOC performance, all the studies ([78–85]; Smith and
Wheat [68]) report improved performance over the period from privatization to the
period 2000/2001. A consistent finding is that this improvement in performance, as
measured by a Malmquist total factor productivity measure, has tended to be driven
by positive technical change with only a small improvement in average technical
efficiency over the period. Thus, while the best performing TOCs seemed to be
improving up to 2000/2001, there was little evidence that all firms were converging,
i.e., that franchising was successfully driving out poor performance.
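The decomposition referred to above can be sketched as follows. The Malmquist index between periods t and t+1 is the product of an efficiency change term (catching up to the frontier) and a technical change term (movement of the frontier itself). The sketch assumes output-oriented distance functions and takes the four distance function values as given; in practice these would come from DEA or an estimated frontier. The orientation, the function name, and the numbers are illustrative assumptions and are not taken from the cited studies.

```python
import math


def malmquist(d_t_t, d_t_t1, d_t1_t, d_t1_t1):
    """Malmquist TFP index and its decomposition.

    Arguments are output distance function values D^a(x^b, y^b):
      d_t_t   : period-t technology, period-t data
      d_t_t1  : period-t technology, period-(t+1) data
      d_t1_t  : period-(t+1) technology, period-t data
      d_t1_t1 : period-(t+1) technology, period-(t+1) data
    Returns (tfp_change, efficiency_change, technical_change).
    """
    efficiency_change = d_t1_t1 / d_t_t
    technical_change = math.sqrt((d_t_t1 / d_t1_t1) * (d_t_t / d_t1_t))
    tfp_change = efficiency_change * technical_change
    return tfp_change, efficiency_change, technical_change


# Hypothetical firm: efficiency stays at 0.8 while the frontier moves out by 10%.
tfp, ec, tc = malmquist(d_t_t=0.8, d_t_t1=0.88, d_t1_t=0.8 / 1.1, d_t1_t1=0.8)
print(f"TFP change {tfp:.3f} = efficiency change {ec:.3f} x technical change {tc:.3f}")
```

The hypothetical firm mirrors the pattern described in the text: productivity grows because the frontier shifts, while the distance of the firm from the frontier is unchanged.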
Cowie [80] and Smith and Wheat [86] are the only studies to have considered
the period following the Hatfield accident in October 2000. Cowie’s study covered
the years 1996/1997–2003/2004, while Smith and Wheat extended the sample
to 2005/2006. Cowie found that, following Hatfield, there was a deterioration in
TFP and this was across all TOCs, i.e., was found to be as a result of negative
technical change growth rather than a deterioration in technical efficiency of a
subset of firms (see Figs. 2 and 3 in Cowie [80]). In fact, Cowie finds that average
technical efficiency improves over the post-Hatfield period. This suggests that, even
with the distribution of some firms moving to renegotiated contracts, franchising
had still begun to proliferate best practice across the industry. This finding has
to be moderated however by the finding that, overall, TFP was not found to be
substantially different at the end of the period than at the first year following
privatization. Smith and Wheat [86] also found that technical change was, in the
early years of their sample, beneficial in terms of lowering costs; however, following
the Hatfield accident, not only was there a statistically significant upward shift in
costs but the direction of technical change shifted, such that costs began to increase
over time. These observations are the same with respect to overall TFP in the Smith
and Wheat [86] model.
For the parametric studies, it should be possible to derive returns to scale
and density results from the models. Cowie [85] and Affuso et al. [79] do not
discuss these properties of their models. Furthermore, the fact that
the data do not appear to be normalized at the sample mean, coupled with the
adoption of Translog functional forms, means that the reported estimates cannot
be used to derive them. Of the nonparametric research, Merkert et al. [81]
did estimate a variable RtS model and found that British and Swedish TOCs were
below minimum efficient scale, while the large German operators were above.
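The point about normalization can be illustrated with a small sketch. In a Translog cost function, the cost elasticity with respect to an output depends on the levels of the variables as well as on the coefficients. If the data are scaled so that each variable equals one at the sample mean, the logarithms are zero there and the elasticity at the mean is simply the first-order coefficient. Without that scaling, a reader would need the sample means themselves to evaluate the elasticity. The two-variable specification, the coefficient values, and the data point below are hypothetical.

```python
import math


def translog_output_elasticity(b_y, b_yy, b_yn, y, n):
    """Cost elasticity w.r.t. output y in a two-variable Translog cost function:

    ln C = a + b_y ln y + b_n ln n + 0.5 b_yy (ln y)^2 + 0.5 b_nn (ln n)^2
           + b_yn ln y ln n
    so d ln C / d ln y = b_y + b_yy ln y + b_yn ln n.
    """
    return b_y + b_yy * math.log(y) + b_yn * math.log(n)


# Hypothetical coefficients (not taken from any cited study).
b_y, b_yy, b_yn = 0.85, 0.04, -0.03

# Mean-normalized data: both variables equal 1 at the sample mean.
at_mean = translog_output_elasticity(b_y, b_yy, b_yn, y=1.0, n=1.0)

# Away from the mean (here output 50% above, network 20% below the mean).
off_mean = translog_output_elasticity(b_y, b_yy, b_yn, y=1.5, n=0.8)

print(f"elasticity at the sample mean: {at_mean:.3f}")
print(f"elasticity away from the mean: {off_mean:.3f}")
```

The same reasoning applies to the network-size elasticity, so returns to scale and density at the sample mean can only be read directly from the first-order coefficients when the data have been normalized.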
Only Cowie [84] and Smith and Wheat [86] provide an explicit discussion of the
returns to scale properties of the models. Cowie defines returns to scale simply in
relation to his single output train-km (there are of course different possible measures
of RtS in this context such as returns to network size, train-km, and train length).
His results seem to suggest decreasing returns to scale at low train-km, but then
increasing RtS at higher train-km.
Smith and Wheat [86] put forward a model which yields estimates of the extent
of both returns to scale and returns to density, where the primary usage output is
train-km rather than train-hours. They found constant RtS and increasing RtD. A
subsequent study, Wheat and Smith [113], applied a hedonic cost function approach
and was also able to include train hours (and thus train speed) into the modelling
for the first time, and the method permitted estimates of returns to scale and
density to vary with the heterogeneity characteristics of output. This study reported
increasing returns to density for all types of operator but importantly found that
some operators were operating beyond efficient scale; the latter finding having
important implications for the optimal size of rail franchises.
Overall, the received studies on passenger train operations have concentrated on
technical change, cost efficiency, and overall TFP trends, and the cost structure
in respect of returns to scale and density. The motivation for concentrating on
these issues is, first, that the studies focus on Britain and, second, that at the time
the railway in Britain suffered a substantial cost shock which resulted in
several franchises getting into financial difficulty, so that cost pressures
prompted increased focus on the optimal size and structure of rail franchises.

Cost and Efficiency Studies in Other Transport Sectors

Road Infrastructure

Within the literature on road infrastructure costs, the seminal work by Small et al.
[87] has been followed by a number of empirical studies on the cost impact of road
traffic. This research aim is analogous to the work in railways to determine the
marginal wear and tear costs associated with running more traffic.
In addition to top-down econometric approaches, the dominant approach in this
mode, piloted by Small using simplifying assumptions, is to use engineering
(bottom-up) approaches, whether through survival analysis or through direct
laboratory experiment. Survival analysis is a statistical method using real data
on asset replacement times, to establish a deterioration elasticity, based on the
suggested model transformation as outlined in Lindberg [88].
There are also many examples of more standard top-down approaches analyzing
different types of road infrastructures (motorways, paved roads, gravel roads)
and types of activities (maintenance, renewals, winter road operations) (see, for
example, Link et al. [15]; Link [89]; Yarmukhamedov and Swärdh [90]).
A growing literature has developed concerning the efficiency of road infrastructure
managers. Studies include Wheat [92] for the UK, Massiani and Ragazzi
[93] for Italy, Welde and Odeck [94] for Norway, and Fritzsche [95] and Kalb [96]
for Germany. All studies use an econometric approach, including using stochastic
frontier analysis, while some have also considered a DEA approach (usually with a
second stage regression): Welde and Odeck [94] and Kalb [96].
In terms of the outputs considered in the road efficiency studies, these include
measures of both scale and density. Scale is the size of the road network (sometimes
broken down into different classes of roads) and density is the traffic usage intensity.
Particular consideration tends to be given to the extent of usage by heavy goods
vehicles as these are found to have a disproportionate impact on road damage. Other
outputs tend to be the quality of the infrastructure. This can be measured by a road
condition index or number of defects, but Wheat [92] also proposed using an index
of public satisfaction with highways. The logic for including this alongside a road
condition index was that quality of roads involves not just the engineered quality
of the road but also how responsive the road authority is to emerging
defects.
A further topic of investigation is the extent to which the introduction
of competitive tendering for road maintenance services has reduced costs.
Yarmukhamedov et al. [115] present evidence for Sweden using a panel data
cost function and include controls for contracting arrangement. They find that
competitive tendering to private sector organizations results in costs between 8%
and 20% lower than contracts to the state-owned companies, all other things equal.
This is in line with other evidence that they cite which suggests competitive
tendering reduced road maintenance costs (by around 20–30% in Australia [97],
10–35% in Canada [98], and 22–27% in Sweden [99]).
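Percentage effects of this kind are usually read off the coefficient on a contracting dummy variable. As a minimal sketch, and assuming the dependent variable is the logarithm of cost (an assumption of this example, since the specification is not given here), a dummy coefficient b translates into a proportional cost difference of exp(b) - 1. The function names are hypothetical.

```python
import math


def percent_cost_effect(dummy_coefficient):
    """Percentage cost difference implied by a dummy coefficient in a log-cost model."""
    return 100.0 * (math.exp(dummy_coefficient) - 1.0)


def coefficient_for_percent(percent_effect):
    """Inverse mapping: the log-cost dummy coefficient implying a given percentage effect."""
    return math.log(1.0 + percent_effect / 100.0)


# Coefficients that would correspond to the 8% and 20% cost reductions quoted above.
for pct in (-8.0, -20.0):
    b = coefficient_for_percent(pct)
    print(f"{pct:.0f}% cost difference <-> dummy coefficient {b:.4f}")
```

For small effects the coefficient and the percentage change are close, but the gap widens as the effect grows, so reading a coefficient of -0.22 as "22% lower" slightly overstates the saving.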

Local Public Transport

Local public transport includes bus services, tramways, and metros. Smith et al.
[100] synthesize the approaches, which are very diverse. The local nature of provision
permits more comparisons within a nation or region.
There is a large volume of research on the efficiency of public transport
operations. This is because there is a great variety of operating models such as
public ownership, private provision, tenders with revenue risk, or tenders with no
revenue risk. As such, there is a research need to understand which organizational
structure is best in which circumstances.
Usually researchers use “available outputs,” as opposed to “revenue outputs,”
as the measure of output. This helps to separately identify operator inefficiency
from local authority decisions to provide some loss-making but socially important
services. Since operators have little influence on utilization of capacity and only
provide services according to a specified timetable and tariff, “available outputs” are
a more appropriate measure. If the operators have more influence over what services
to run, or if the aim is to compare the efficiency of (possibly complex) organizational models
rather than firms, “revenue outputs” tend to be better, especially if we can capture
the effects of differences in operating conditions through additional independent
variables (such as speed, peak to off-peak kilometers ratio, yearly mileage of
vehicles, etc.). In some cases, total cost can be expressed as a function of outputs
and operating conditions variables and benchmarked. For example, Farsi et al.
[101] evaluated Swiss rural bus companies using vehicle-kilometers as an output.
The use of “available output” was extensively justified by the authors – they
state that frequency is regulated by public authorities, thus leading to oversupply
of capacity (which also varies between firms). In such conditions, revenue output
may be misleading. On the other hand in India, where operators have much
more freedom, Bhattacharyya et al. [102] used number of passenger-kilometers (a
“revenue output”) as the dependent variable.
Accounting for heterogeneity is very important. This is because of the effect of differing
external factors (such as network structure) on costs. In many cases, such differences
may be stronger than differences in efficiency. The approach is to include variables
that characterize the difference in output. Drawing again on Bhattacharyya et al.
[102], load factor (passenger-km to vehicle-km), vehicle utilization (average number
of kilometers travelled daily by a bus), number of breakdowns (per 10,000 vehicle-
kilometers), and fleet utilization (proportion of the number of vehicles on the road
to the total fleet) are candidate metrics to be included alongside measures of overall
output. Peak vehicle requirement is also an important measure to differentiate
between firms operating in different demand conditions [103].
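
As a minimal sketch of how the output-characterizing variables listed above could be computed from operator-level data, the following Python fragment derives the four ratios from raw totals. The field names, the 365-day year, and the numbers in the example are illustrative assumptions and are not taken from Bhattacharyya et al. [102]; in particular, vehicle utilization is assumed here to be measured per vehicle on the road.

```python
from dataclasses import dataclass


@dataclass
class OperatorYear:
    """Hypothetical annual totals for one bus operator (illustrative only)."""
    passenger_km: float      # revenue output
    vehicle_km: float        # available output
    fleet_size: float        # total vehicles owned
    vehicles_on_road: float  # average vehicles in service per day
    breakdowns: float        # breakdowns recorded in the year
    days: int = 365          # assumed length of the observation period


def heterogeneity_metrics(op: OperatorYear) -> dict:
    """Return the output-characterizing ratios described in the text."""
    return {
        # passenger-km per vehicle-km
        "load_factor": op.passenger_km / op.vehicle_km,
        # average km travelled daily by a bus in service (assumption)
        "vehicle_utilization": op.vehicle_km / (op.vehicles_on_road * op.days),
        # breakdowns per 10,000 vehicle-km
        "breakdowns_per_10k_vkm": op.breakdowns / op.vehicle_km * 10_000,
        # share of the fleet that is on the road
        "fleet_utilization": op.vehicles_on_road / op.fleet_size,
    }


# Made-up example operator
example = OperatorYear(
    passenger_km=36_500_000,
    vehicle_km=3_650_000,
    fleet_size=60,
    vehicles_on_road=50,
    breakdowns=73,
)
metrics = heterogeneity_metrics(example)
```

In a cost function, ratios of this kind would typically enter as additional regressors alongside the overall output measure, so that differences in operating conditions are not attributed to inefficiency.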
There is a broad consensus on a basic set of input variables across studies. These
include rolling stock, labor, and fuel inputs – this approach has been used since
the beginning of contemporary research and was used inter alia by Viton [104];
however, it may be extended.

Air Transport

Airport operations are another part of the transport sector that has been increasingly
privatized, with commercialized entities that rely on non-aeronautical services.
Government regulations have thus changed in this sector, moving from rate of return
and price cap regulation towards a more deregulated environment [105]. Still, many
airports are government owned with regulated charges, and research on marginal
costs has been conducted.

Bottasso and Conti [106] note that recent papers consider the multi-output
nature of the airports, producing both aeronautical and non-aeronautical services,
which in turn can be categorized into various outputs. For example, a distinction
between passenger and freight services can be made, and sometimes the passenger
and freight variables are used together with measures on air transport movements
(landings and take-offs of aircraft). Similar to railway cost functions, there are
examples of hedonic output functions for air transport: see, for example, Martin and
Voltes-Dorta [107] who use a hedonic specification for different aircraft operations
(or Gillen et al. [108], who use a hedonic output function for an airline cost
function). Regarding inputs, most studies use number of runways, airport surface
area, and number of check-in desks. Similar to the discussion for railway inputs, the
use of the perpetual inventory method has been suggested to better reflect the capital
price, since, for example, physical measures of capital may not properly reflect
this price in regulated airports (e.g., the gold-plating phenomenon is often found in
industries with rate of return regulation). Moreover, Bottasso and Conti [106]
note that the increased commercialization of the airport industry implies that the
exogeneity assumption of outputs in econometric cost estimations can be questioned
and needs more research.
Oum and Yu [109] and Yu [110] list examples of outputs and inputs used to
study the production and cost structure of airlines. Common outputs for scheduled
passenger services are number of passengers and revenue passenger miles, while
scheduled freight and mail services use revenue ton-miles. The so-called incidental
services (noncore activities of an airline such as catering and aircraft maintenance)
are often represented by an output quantity index. Input examples are labor, fuel,
flight equipment, and ground and property equipment. Yu [110] notes that some
studies use available seat miles and available ton-kilometers, either as outputs or
inputs.
Jara-Díaz et al. [111] summarize the airline market literature (covering 30 years
of research) and report the cost functions used and estimates on RtS and RtD.
Most studies find increasing RtD, and nearly constant RtS. Jara-Díaz et al. also
find evidence of economies of spatial scope, stating that there are advantages of
jointly serving markets (e.g., domestic and international services) that imply larger
networks.
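
To illustrate how findings of this kind are usually read off an estimated cost function, the sketch below computes returns to density (RtD) and returns to scale (RtS) from cost elasticities, using the convention commonly associated with Caves et al. [2]: RtD is the inverse of the sum of the output cost elasticities, while RtS also adds the cost elasticity with respect to network size. The elasticity values are made up for illustration and are not estimates from any of the studies cited here.

```python
def returns_to_density(output_elasticities):
    """RtD: inverse of the summed cost elasticities with respect to outputs."""
    return 1.0 / sum(output_elasticities)


def returns_to_scale(output_elasticities, network_elasticity):
    """RtS: as RtD, but network size is allowed to grow with output."""
    return 1.0 / (sum(output_elasticities) + network_elasticity)


# Illustrative (assumed) elasticities: two outputs and one network variable
output_elasticities = [0.60, 0.20]  # e.g., passenger and freight output
network_elasticity = 0.19           # e.g., number of points served

rtd = returns_to_density(output_elasticities)
rts = returns_to_scale(output_elasticities, network_elasticity)

print(f"RtD = {rtd:.3f}")
print(f"RtS = {rts:.3f}")
```

With these assumed values, RtD exceeds one while RtS is close to one, which is the pattern of increasing returns to density and nearly constant returns to scale described above.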

Conclusion

In this chapter, the experience of production and cost analysis in transportation has
been reviewed. Transportation is an active application of econometric techniques
with a significant public interest given that many public transportation services
require government subsidy and a degree of exclusivity in supply to make their
provision viable and efficient. Key production and cost analysis needs of the sector
can be identified as:
• Strategic operational insight: understanding how costs change with output to
inform operation and investment decisions. Essential here is the allowance for
the quality of output as well as a simple scale metric given the jointness and
non-storability of production.
• Establishing evidence as to which market and regulatory structures yield best
outcomes given the complexity in the transport sector and the inevitable trade-
offs between maintaining network coordination and avoiding useless duplication
of resources against potential inefficiency introduced by reducing competition
for or in the market.
• Providing the evidence base for regulatory scrutiny either through yardstick
competition or more formal price cap regulation.

The upshot of this is that transportation has provided the motivation and
illustration for many innovations within production, cost, and efficiency methods.
This chapter acts as a survey of the transportation literature, bringing to life the
issues and solutions found in the transportation sector. Some of the challenges of
understanding the cost structure of transportation and the optimal industry structures
remain, coupled with the introduction of new technologies, and the ability to model
cost-quality relationships, and so we expect research in transportation to continue to
be vibrant into the future. This will be supported by ever increasing datasets which
critically enable construction of measures of heterogeneity in output. This in turn
allows for the complex features of transportation networks to be better characterized,
which improves the robustness of findings for the transportation sector’s analysis
needs.

Cross-References

Airline Economics: A Survey of Applied Issues in the Performance of the US and
International Airline Industry
Cost Assessment of (Un)bundling: Separation of Vertically Integrated Public
Utilities

References
1. Caves DW, Christensen LR, Swanson JA (1981) Productivity growth, scale economies, and
capacity utilisation in U.S. Railroads, 1955–74. Am Econ Rev 71(5):994–1002
2. Caves DW, Christensen LR, Tretheway MW (1984) Economies of density versus economies
of scale: why trunk and local service airline costs differ. Rand J Econ 15(4):471–489
3. Farsi M, Filippini M, Greene WH (2005) Efficiency measurement in network industries:
application to the Swiss railway companies. J Regul Econ 28(1):69–90
4. Filippini M, Greene W (2016) Persistent and transient productive inefficiency: a maximum
simulated likelihood approach. J Prod Anal 45(2):187–196
5. Wheat P, Stead AD, Greene WH (2019) Robust stochastic frontier analysis: a student’s t-
half normal model with application to highway maintenance costs in England. J Prod Anal
51(1):21–38
6. Nilsson J-E (2002) Restructuring Sweden’s railways: the unintentional de-regulation. Swed
Econ Policy Rev 9:229–254
7. Shleifer A (1985) A theory of yardstick competition. RAND J Econ 16(3):319–327
8. Nash C, Nilsson J-E, Link H (2013) Comparing three models for introduction of competition
into railways. JTEP 47(2):191–206
9. Hart O, Shleifer A, Vishny RW (1997) The proper scope of government: theory and an
application to prisons. Q J Econ 112(4):1127–1161
10. ICAO (2013) Airport economics manual. Doc 9562, 3rd edn. International Civil Aviation
Organization
11. Cherwony W, Mundle SR, Porter BD, Gleichman GR (1982) Review of bus costing
procedures. Transp Res Rec 854:54–60
12. Stopher RP, Brandrup L, Lee B, Parry ST (1987) Development of a bus operating cost
allocation model compatible with UMTA urban transportation planning system models.
Transp Res Rec 1108:31–41
13. Sinner M, Weidmann U, Nash A (2018) Application of a cost-allocation model to
Swiss bus and train lines. Transp Res Rec 2672(8):431–442. https://doi.org/10.1177/0361198118772702
14. Braeutigam RR (1999) Learning about transport costs. In: Gomez-Ibanez JA, Tye WB,
Winston C (eds) Essays in transportation economics and policy: a handbook in honor of John
R. Meyer. Brookings Institution Press, Washington, DC
15. Link H, Stuhlemmer A (DIW Berlin), Haraldsson M (VTI), Abrantes P, Wheat P, Iwnicki
S, Nash C, Smith A, CATRIN (Cost Allocation of TRansport INfrastructure cost) (2008)
Deliverable D 1, Cost allocation practices in the European Transport Sector. Funded by Sixth
Framework Programme. VTI, Stockholm
16. Waters WG (1976) Statistical costing in transportation. Transp J 15(3):49–62
17. Taussig FW (1891) A contribution to the theory of railway rates. Q J Econ 5(4):438–465
18. Griliches Z (1972) Cost allocation in railroad regulation. Bell J Econ Manag Sci 3:26–41
19. Andrikopoulos A, Loizides J (1998) Cost structure and productivity growth in European
railway systems. Appl Econ 30:1625–1639
20. Coelli T, Perelman S (1999) A comparison of parametric and non-parametric distance
functions: with application to European Railways. Eur J Oper Res 117:326–339
21. Coelli T, Perelman S (2000) Technical efficiency of European Railways: a distance function
approach. Appl Econ 32:1967–1976
22. Couto A, Graham DJ (2008) The contributions of technical and allocative efficiency to the
economic performance of European Railways. Port Econ J 7:125–153
23. Cowie J, Riddington G (1996) Measuring the efficiency of European Railways. Appl Econ
28:1027–1035
24. Deprins D, Simar L (1989) Estimating technical inefficiencies with correction for environ-
mental conditions. Ann Public Coop Econ 60:81–102
25. Gathon HJ, Perelman S (1992) Measuring technical efficiency in European Railways: a panel
data approach. J Prod Anal 3:135–151
26. Gathon HJ, Pestieau P (1995) Decomposing efficiency into its managerial and its regulatory
components: the case of European Railways. Eur J Oper Res 80:500–507
27. Ivaldi M, McCullough GJ (2001) Density and integration effects on class I U.S. freight
railroads. J Regul Econ 19(2):161–182
28. Kumbhakar SC (1988a) Estimation of input-specific technical and allocative inefficiency in
Stochastic Frontier models. Oxf Econ Pap 40(3):535–549
29. Kumbhakar SC (1988b) On the estimation of technical and allocative inefficiency using
frontier functions: The case of U.S. Class I Railroads. Int Econ Rev 29(4):727–743
30. Kumbhakar SC, Orea L, Rodriguez-Alvarez A, Tsionas EG (2007) Do we estimate an input
or an output distance function? An application of the mixture approach to European Railways.
J Prod Anal 27:87–100
31. Lan LW, Lin ETJ (2006) Performance measurement for railway transport: stochastic distance
functions with inefficiency and ineffectiveness effects. JTEP 40(3):383–408
32. Loizides J, Tsionas EG (2002) Productivity growth in European Railways: a new approach.
Transp Res A 36(7):633–644
33. Parisio L (1999) A comparative analysis of European railroads efficiency: a cost frontier
approach. Appl Econ 31:815–823
34. Cantos P, Villarroya J (2000) Efficiency, technical change and productivity in the European
rail sector: a stochastic frontier approach. Int J Transp Econ 27(1):55–76
35. Cantos P, Villarroya J (2001) Regulation and efficiency: the case of European Railways.
Transp Res A 35(5):459–472
36. Tsionas EG, Christopoulos DK (1999) Determinants of technical inefficiency in European
Railways: simultaneous estimation of firm-specific and time-varying inefficiency. KON-
JUNKTURPOLITIK 45:240–256
37. Oum TH, Yu C (1994) Economic efficiency of railways and implications for public policy: a
comparative study of the OECD countries’ railways. JTEP 28:121–138
38. Fielding GJ, Babitsky TT, Brenner ME (1985) Performance evaluation for bus transit. Transp
Res 19A:73–82
39. Oum TH, Zhang Y (1997) A note on scale economies in transport. JTEP 31:309–315
40. Caves DW, Christensen LR, Tretheway MW, Windle RJ (1985) Network effects and the
measurement of returns to scale and density for U.S. Railroads. In: Daughety AF (ed)
Analytical studies in transport economics. Cambridge University Press, Cambridge, pp 97–
120
41. Spady RH, Friedlaender AF (1978) Hedonic cost functions for the regulated trucking industry.
Bell J Econ 9(1):159–179
42. Bishop M, Thompson D (1992) Regulatory reform and productivity growth in the UK’s public
utilities. Appl Econ 24:1181–1190
43. Wilson WW (1997) Cost savings and productivity in the railroad industry. J Regul Econ
11:21–40
44. Lee T, Baumel CP (1987) The cost structure of the U.S. Railroad industry under deregulation.
J Transp Res Forum 27(1):245–253
45. Odolinski K, Smith ASJ (2016) Assessing the cost impact of competitive tendering in rail
infrastructure maintenance services: evidence from the Swedish Reforms (1999 to 2011).
JTEP 50(1):93–112
46. Johansson P, Nilsson JE (2002) An economic analysis of track maintenance costs, deliverable
10 Annex A3 of UNITE (UNIfication of accounts and marginal costs for Transport Effi-
ciency), Funded by EU 5th Framework RTD Programme. ITS, University of Leeds, Leeds.
Online: http://www.its.leeds.ac.uk/projects/unite/
47. Odolinski K, Nilsson J-E (2017) Estimating the marginal maintenance cost of rail infrastruc-
ture usage in Sweden; does more data make a difference? Econ Transp 10:8–17
48. Haraldsson M (2007) Essays on transport economics. Doctoral thesis, Uppsala University
49. Wheat P (2015) The sustainable freight railway: designing the freight vehicle-track system for
higher delivered tonnage with improved availability at reduced cost SUSTRAIL, Deliverable
5.3: access charge final report annex 4, British Case Study
50. Odolinski K, Wheat P (2018) Dynamics in rail infrastructure provision: maintenance and
renewal cost in Sweden. Econ Transp 14:21–30
51. Odolinski K, Wheat P (2021) Rail cost functions. In: Vickerman, Roger (eds.) International
Encyclopedia of Transportation, 1:425–430, United Kingdom: Elsevier Ltd. https://doi.org/10.1016/B978-0-08-102671-7.10080-6
52. Gaudry M, Lapeyre B, Quinet E (2016) Infrastructure maintenance, regeneration and service
quality economics: a rail example. Transp Res B 86:181–210
53. Andersson M, Björklund G, Haraldsson M (2016) Marginal railway renewal costs: a survival
data approach. Transp Res A 87:68–77
54. Andersson M, Smith A, Wikberg Å, Wheat P (2012) Estimating the marginal
cost of railway track renewals using corner solution models. Transp Res A 46:954–964
55. Nilsson J-E, Svensson K, Haraldsson M (2015) Estimating the marginal costs for road
infrastructure reinvestment. CTS working paper 2015:5. CTS
56. Wheat P, Smith ASJ, Nash CA (2009) CATRIN (Cost Allocation of TRansport INfrastructure
cost), deliverable 8 – rail cost allocation for Europe. Funded by Sixth Framework Programme.
Coordinated by VTI, Stockholm
57. Odolinski K (2019) Estimating the impact of traffic on rail infrastructure maintenance costs:
the importance of axle loads. JTEP 53(3):258–274
58. Wheat P, Smith A (2008) Assessing the marginal infrastructure maintenance wear and tear
costs for Britain’s Railway network. JTEP 42(2):189–224
59. Andersson M (2006) Marginal railway infrastructure cost estimates in the presence of
unobserved effects. Case study 1.2D I Annex to Deliverable D 3 Marginal cost case studies for
road and rail transport, Information Requirements for Monitoring Implementation of Social
Marginal Cost Pricing, EU Sixth Framework Project GRACE (Generalisation of Research on
Accounts and Cost Estimation)
60. Munduch G, Pfister A, Sögner L, Stiassny A (2002) Estimating marginal costs for the
Austrian Railway system, working paper 78. Department of Economics, Vienna University
of Economics and B.A, Vienna
61. Gaudry M, Quinet E (2003) Rail track wear-and-tear costs by traffic class in France.
Universite de Montreal, Publication AJD-66
62. Marti M, Neuenschwander R (2006) Case study 1.2E: track maintenance costs in Switzerland,
annex to GRACE (Generalisation of Research on Accounts and Cost Estimation) Deliverable
D3: marginal cost case studies for road and rail transport. Funded by 6th Framework RTD
Programme. Ecoplan, Berne
63. Kennedy J, Smith ASJ (2004) Assessing the efficient cost of sustaining Britain’s rail network:
perspectives based on zonal comparisons. JTEP 38(2):157–190
64. Johansson P, Nilsson J (2004) An economic analysis of track maintenance costs. Transp
Policy 11(3):277–286
65. Tervonen J, Idstrom T (2004) Marginal rail infrastructure costs in Finland 1997–2002. Report
by the Finnish Rail Administration. Available at http://www.rhk.fi. Accessed 20 July 2005
66. Smith ASJ, Wheat PE, Nixon H (2008) International benchmarking of network rail’s
maintenance and renewal costs, joint ITS, University of Leeds and ORR report written as
part of PR2008, June 2008. Presentation available at http://www.rail-reg.gov.uk
67. Marti M, Neuenschwander R, Walker P (2009) CATRIN (cost allocation of transport
infrastructure cost), Deliverable 8 Annex 1B – rail cost allocation for Europe: track main-
tenance and renewal costs in Switzerland. Funded by the Sixth Framework Programme. VTI,
Stockholm
68. Smith ASJ, Wheat P (2012b) Estimation of cost inefficiency in panel data models with firm
specific and sub-company specific effects. J Prod Anal 37:27–40
69. NERA (2000) Review of Overseas Railway Efficiency: A Draft Final Report for the Office of
the Rail Regulator. NERA, London
70. Smith ASJ, Walker P, Wheat PE, Guiraud L, Silavong C (2017) Estimating the marginal
maintenance cost for the French railway network: a comparison of models. ITEA Conference,
Barcelona, p 20
71. LEK (2003) Regional benchmarking: report to network rail. ORR and SRA, London
72. Office of Rail Regulation (2018) PR18 econometric top-down benchmarking of Network
Rail, a report. http://orr.gov.uk/__data/assets/pdf_file/0011/27875/pr18-econometric-top-down-benchmarking-of-network-rail.pdf
73. Smith ASJ, Thiebaud JC, Wheat PE (2020) Efficiency analysis of the rail network in France.
mimeo
74. International Union of Railways (UIC) (2008) Lasting infrastructure cost benchmarking
(LICB). December 2008. Available at http://www.uic.org/IMG/pdf/li08C_sum_en.pdf.
Accessed 20 Dec 2013
75. Smith ASJ, Wheat P (2010) Sensitivity analysis on the UIC harmonisation factors. Report for
Rail Consult
76. Smith ASJ (2012) The application of stochastic frontier panel models in economic regulation:
experience from the European rail sector. Transp Res E 48:503–515
77. Odolinski K (2015) Reforming a publicly owned monopoly: costs and incentives in railway
maintenance. Doctoral dissertation, Örebro Studies in Economics 30
78. Affuso L, Angeriz A, Pollitt MG (2002) Measuring the efficiency of Britain’s privatised train
operating companies. Regulation initiative discussion paper series, no: 48. London Business
School
79. Affuso L, Angeriz A, Pollitt MG (2003) Measuring the efficiency of Britain’s
privatised train operating companies. mimeo (unpublished version provided by the
authors)
80. Cowie J (2009) The British passenger rail privatisation: conclusions on subsidy and efficiency
from the first round of franchises. JTEP 43(1):85–104
81. Merkert R, Smith ASJ, Nash CA (2009) Benchmarking of train operating firms – a transaction
cost efficiency analysis. J Transp Plann Technol 33(1):35–53
82. Cowie J (2002a) Subsidy and productivity in the privatised British passenger railway. Econ
Issues 7(1):25–37, 38
83. Smith ASJ, Nash C, Wheat P (2009) Passenger rail franchising in Britain – has it been a
success? Int J Transp Econ 36(1):33–62
84. Cowie J (2002b) The production economics of a vertically separated railway – the case of the
British train operating companies. Trasporti Europei 20–21:96–103
85. Cowie J (2005) Technical efficiency versus technical change – the British passenger train
operators. In: Hensher DA (ed) Competition and ownership in land passenger transport:
selected refereed papers from the 8th international conference (Thredbo 8) Rio de Janeiro,
September 2003. Elsevier, Amsterdam/London
86. Smith ASJ, Wheat P (2012a) Evaluating alternative policy responses to franchise failure:
evidence from the passenger rail sector in Britain. JTEP 46(1):25–49
87. Small K, Winston C, Evans CA (1989) Road work: a new highway pricing and investment
policy. The Brookings Institution, Washington, DC
88. Lindberg G (2002) Marginal costs of road maintenance for heavy goods vehicles on Swedish
roads. Annex 2 to Deliverable 10, UNITE (UNIfication of accounts and marginal costs for
Transport Efficiency), Research project funded by the Fifth EU Framework RTD Programme.
ITS, University of Leeds, Leeds
89. Link H (2014) A cost function approach for measuring the marginal cost of road maintenance.
JTEP 48(1):15–33
90. Yarmukhamedov S, Swärdh J-E (2016) Marginal cost of road maintenance and operation:
Swedish estimates based on data from 2004 to 2014. VTI notat 15A-2016
91. Wheat PE (2013) Econometric cost analysis in vertically separated railways. PhD Thesis,
University of Leeds. Available at: https://etheses.whiterose.ac.uk/7459/
92. Wheat PE (2017) Scale, quality and efficiency in road maintenance: evidence for English
local authorities. Transp Policy 59:46–53
93. Massiani J, Ragazzi G (2008) Costs and efficiency of highway concessionaires: a survey of
Italian operators. Eur Transp 38:85–106
94. Welde M, Odeck J (2011) The efficiency of Norwegian road toll companies. Util Policy
19:162–171
95. Fritzsche C (2018) Analyzing the efficiency of county road production – evidence from
Eastern German Counties. Ger Econ Rev 1–21. https://doi.org/10.1111/geer.12170
96. Kalb A (2014) What determines local governments’ cost-efficiency? The case of road
maintenance. Reg Stud 48(9):1483–1498. https://doi.org/10.1080/00343404.2012.731044
97. Lyon B, Dwyer A (2011) Road maintenance: options for reform. Report. Infrastructure
Partnerships Australia
98. ISTED (2002) La Lettre de I’STED, N26, October 2002
99. Arnek M (2002) Empirical essays on procurement and regulation. Doctoral thesis, Uppsala
University
100. Smith ASJ, Wheat P, Wolanski M (2015) Public transport operations costs. In: Nash C (ed)
Handbook of research methods and applications in transport economics and policy. Edward
Elgar
101. Farsi M, Filippini M, Kuenzle M (2006) Cost efficiency in regional bus companies. JTEP
41(3):345–361
102. Bhattacharyya A, Kumbhakar SC, Bhattacharyya A (1995) Ownership structure and cost
efficiency: a study of publicly owned passenger-bus transportation companies in India. J
Prod Anal 6:47–61
103. Pickrell DH (1985) Rising deficits and the uses of transit subsidies in the United States. JTEP
19(3):281–298
104. Viton PA (1981) A translog cost function for urban bus transit. J Ind Econ 29(3):287–304
105. Bitzan JD, Peoples JH (2017) The economics of airport operations. Advances in airline
economics, vol 6. Emerald Publishing
106. Bottasso A, Conti M (2017) The cost structure of the airport industry: methodological issues
and empirical evidence. In: Bitzan JD, Peoples JH (eds) The economics of airport operations.
Advances in airline economics. Emerald Publishing, pp 181–212
107. Martin JC, Voltes-Dorta A (2011) The econometric estimation of airports’ cost function.
Transp Res B Methodol 45(1):112–127
108. Gillen DW, Oum TH, Tretheway MW (1990) Airline cost structure and policy implications:
a multi-product approach for Canadian airlines. JTEP 24(1):9–34
109. Oum TH, Yu C (2001) Winning airlines: productivity and cost competitiveness of the world’s
major airlines. Springer Science + Business Media, LLC
110. Yu C (2017) Airline productivity and efficiency: concept, measurement, and applications. In:
Bitzan JD, Peoples JH, Wesley WW (eds) Airline efficiency. Advances in airline economics,
vol 5. Emerald Group Publishing Limited, pp 11–53
111. Jara-Díaz SR, Cortés CE, Morales GA (2013) Explaining changes and trends in the airline
industry: economies of density, multiproduct scale, and spatial scope. Transp Res E 60:13–26
112. Odolinski K, Nilsson J-E, Yarmukhamedov S, Haraldsson M (2020) The marginal cost of
track renewals in the Swedish railway network: Using data to compare methods. Economics
of Transportation 22:100170
113. Wheat PE, Smith ASJ (2015) Do the usual results of railway returns to scale and density
hold in the case of heterogeneity in outputs: a hedonic cost function approach. J Transp Econ
Policy 49(1):35–57
114. Christopoulos DK, Loizides J, Tsionas EG (2000) Measuring Input-Specific Technical Inef-
ficiency In European Railways: A Panel Data Approach. International Journal of Transport
Economics 27(2):147–171
115. Yarmukhamedov S, Smith ASJ, Thiebaud J-C (2020) Competitive tendering, ownership and
cost efficiency in road maintenance services in Sweden: A panel data analysis. Transportation
Research Part A: Policy and Practice 136:194–204
Productivity in Global Aquaculture
40
Frank Asche, Ruth Beatriz Mezzalira Pincinato, and Ragnar Tveteras

Contents
Introduction
Bioeconomic Modeling of Aquaculture Production
Bioeconomic Models
The Rotation Problem
Risk and Risk Aversion
Biological Shocks and Price Dynamics
Productivity in Aquaculture
Empirical Analyses of Productivity and Efficiency in Aquaculture
Analyses of Production Risk and Economic Risk
Analyses of Environmental Externalities
Analyses of Agglomeration Economies
Conclusion
Cross-References
References

Abstract

This chapter provides insights on productivity in aquaculture based on production
economic studies of several farmed species in different countries. We first
survey studies of bioeconomic modeling of aquaculture production, and the
farmer’s optimization problem. Next, we look at empirical studies of productivity
and efficiency. Aquaculture is an industry with considerable production and
price risk, and we survey econometric studies which estimate the structure
of risk and farmers’ risk preferences and behavior. Studies of relationships
between productivity and environmental factors, which are today one of the main
barriers for further production growth, are discussed separately. We also present
studies of agglomeration economies in aquaculture, including economies and
diseconomies of geographic farm density. Finally, we discuss the challenge of
growing aquaculture sustainably through productivity growth and lower external
environmental footprints.

F. Asche
School of Forest, Fisheries and Geomatics Sciences, Institute for Sustainable Food Systems and
Fisheries and Aquatic Sciences, University of Florida, Gainesville, FL, USA
R. B. Pincinato · R. Tveteras
UiS Business School, University of Stavanger, Stavanger, Norway

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_41

Keywords

Aquaculture · Bioeconomic modelling · Production risk · Productivity ·
Efficiency · Agglomeration economics

Introduction

Aquaculture has in recent decades been the world’s fastest growing food production
technology. Global production increased from 0.6 million metric tons in 1950 to 120
million metric tons in 2019, as shown in Fig. 1, implying that production increased
200 times. Productivity growth has been a central driver of this increase, caused by
innovations in, e.g., fish genetics, feed, fish health, and production equipment.
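
The growth figures quoted above can be checked with a few lines of arithmetic. The sketch below reproduces the 200-fold increase and derives the implied average annual growth rate, assuming constant compound growth between the two endpoint years (a simplification made here for illustration, not a claim from the FAO data).

```python
# Endpoint values quoted in the text (million metric tons)
production_1950 = 0.6
production_2019 = 120.0
years = 2019 - 1950  # 69 years

# Total growth factor over the period
growth_factor = production_2019 / production_1950

# Implied compound annual growth rate under constant growth (assumption)
cagr = growth_factor ** (1 / years) - 1

print(f"Growth factor: {growth_factor:.0f}x")
print(f"Implied average annual growth: {cagr:.1%}")
```

Under this assumption, the 200-fold increase corresponds to average growth of roughly 8 percent per year over the 69-year period.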
Aquaculture is farming in water. It takes place in fresh, brackish, and marine
waters. Aquaculture involves the release of fish, shrimp, or other aquatic species
into a confined water environment, feeding, and harvesting. The production process
also includes management of animal health and environmental risks through
regulation of water quality, medication, etc. Fish farmers will typically have some
kind of ownership rights to the farm locations, implying that they have some
degree of control over the production environment. The degree of control over the
biological production process from release of fingerlings into the water to harvesting
distinguishes aquaculture clearly from the production process in fisheries, discussed
by Squires and Walden in another chapter of this book, where there is typically very
limited control over the biological production processes in lakes and oceans.

Fig. 1 Global aquaculture production (million tons) from 1950 to 2019 by species groups
(fish, plants, molluscs, crustaceans, others). (Source: FAO [1])
Most aquaculture production technologies are not closed. Many aquaculture
technologies depend on inflow of water, and the qualities of that water in terms
of, e.g., temperature, oxygen content, acidity, salinity, and pathogens. Furthermore,
most aquaculture production technologies emit water to the surroundings and can
influence the environment through emissions of, e.g., nutrients, escaped aquatic
species, aquatic diseases, and antibiotic residue. Hence, aquaculture farms can both
be affected by biological and environmental external effects and cause external
effects on other farms and economic agents.
Why should we be interested in the productive performance of aquaculture?
First, a growing global population will benefit from an increasing supply of healthy
seafood, as it is rich in micronutrients and contains high levels of healthy omega-3
fatty acids. Compared with terrestrial farm animals, farmed fish are more efficient
converters of energy and protein. Global fisheries have plateaued at production
levels (Fig. 2) which are difficult to increase in a sustainable manner, as the world’s
stocks of wild fish by and large are fully exploited or overexploited. In fact, as shown
by Squires and Walden in this book, many of the studies on bioeconomic modeling
focus on the maximum sustainable/economic yield, which limits fleet size and how
much of the finite natural resource stock one can exploit. Thus, only aquaculture can
provide a significant increase in seafood supply.

Fig. 2 Global seafood production (million tons) from 1950 to 2019 by production technology
(capture fisheries, aquaculture). (Source: FAO [1])
An expansion in aquaculture production depends on the sector’s ability to sustain
or increase productivity levels such that consumers can afford aquaculture products.
Furthermore, aquaculture sectors face environmental and sustainability challenges
related to input use in production processes and environmental emissions [2].
Production growth depends on innovations and productivity growth leading to more
efficient use of scarce natural resources in fish feed and lower levels of harmful
emissions to the environment. In many countries, particularly in freshwater aqua-
culture, the potential to increase aquaculture production by expanding the present
aquaculture area and increasing water consumption is limited. Consequently, the
most sustainable way to increase aquaculture production is through intensification
of aquaculture by producing more seafood using the same area and water resources
[3]. Aquaculture also represents an opportunity for poverty alleviation and economic
growth in developing countries [4].
Both supply and demand side factors have driven growth in aquaculture pro-
duction over time [5, 6]. We can observe this when we examine price and
production development of individual farmed species. Markets for aquaculture
species are generally competitive. When prices fall significantly over a longer period
accompanied by an increase in production volume, this is typically an indication of
productivity growth contributing to lower production costs driving expansion of the
market. On the other hand, when prices increase substantially and production still
increases over a longer time period, this can be an indication of positive shifts in
demand for the species.
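
The reasoning in this paragraph can be summarized as a simple decision rule. The sketch below is illustrative only: the function name and the example percentage changes are hypothetical, and a sign pattern is an indication of the dominant driver, not a formal identification of supply and demand shifts.

```python
def likely_growth_driver(price_change: float, quantity_change: float) -> str:
    """Classify a long-run price/quantity pattern for a farmed species.

    Both arguments are changes over the period; only their signs are used.
    """
    if quantity_change <= 0:
        return "no production growth"
    if price_change < 0:
        # Falling prices with rising volume: costs fell and supply shifted out.
        return "productivity growth (supply shift)"
    if price_change > 0:
        # Rising prices with rising volume: demand shifted out.
        return "demand growth (demand shift)"
    return "indeterminate"


# Hypothetical long-run changes, in percent.
print(likely_growth_driver(price_change=-40.0, quantity_change=150.0))
print(likely_growth_driver(price_change=25.0, quantity_change=60.0))
```

In practice both curves usually shift at the same time, so the rule only indicates which shift dominated over the period.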
This chapter will shed more light on the patterns of productivity growth in
aquaculture and its determinants based on productivity studies of several species
in different countries. But first, we will take a look at price and production devel-
opments over time. The following figures provide examples from some aquaculture
species. Figure 3 shows the development in Atlantic salmon production and prices
over time. Until 2002, production growth was accompanied by declining prices,
suggesting that innovations leading to declining production costs were a central
driver [5–7]. In the following years, production has increased with increasing prices,
indicating that positive shifts in demand have been the dominant underlying driver
of market growth.
For seabream and seabass, we find, as shown in Figs. 4 and 5, that growth in the
early 2000s was accompanied by sharply declining prices, suggesting a substantial
underlying productivity growth allowing for cost reductions. Thereafter, production
has doubled, but prices have not experienced a significant decline, suggesting that
positive shifts in demand have driven market expansion.
For whiteleg shrimp, we find that production increased substantially from around
2000, accompanied by a significant decline in prices that started a few years earlier.
Again, this suggests innovations allowed larger volumes to be supplied to the market
at lower costs [6]. From 2006, production volumes continued to increase and more
than doubled, but then at price levels which are stable or higher, suggesting that
increased market demand was the main growth driver in the later period (Fig. 6).

[Figure 3: Atlantic salmon production (left axis, million tonnes) and prices (right axis, 2019 USD/kg), annual data 1990 to 2019]
Fig. 3 Atlantic salmon aquaculture production (million tons) and inflation adjusted prices
(USD/kg) from 1990 to 2019. (Source: FAO [1])

[Figure 4: gilthead seabream production (left axis, million tonnes) and prices (right axis, 2019 USD/kg), annual data 1990 to 2019]
Fig. 4 Gilthead seabream aquaculture production (million tons) and its prices (USD/kg) from
1990 to 2019. (Source: FAO [1])

Delgado et al. [8] and Kobayashi et al. [9] show that economic growth is the
most important factor in explaining the development of seafood consumption.
Three of the four largest countries in the world by population are the three
largest aquaculture producers, suggesting that domestic demand is a main driver
of aquaculture production. Abate, Nilsen, and Tveterås [10] find that the stringency
of environmental regulation has also been a central factor in explaining growth in
aquaculture production.

[Figure 5: European seabass production (left axis, million tonnes) and prices (right axis, 2019 USD/kg), annual data 1990 to 2019]
Fig. 5 European seabass aquaculture production (million tons) and its prices (USD/kg) from 1990
to 2019. (Source: FAO [1])

[Figure 6: whiteleg shrimp production (left axis, million tonnes) and prices (right axis, 2019 USD/kg), annual data 1990 to 2019]
Fig. 6 Whiteleg shrimp aquaculture production (million tons) and its prices (USD/kg) from 1990
to 2019. (Source: FAO [1])

Aquaculture production is found in most countries around the world [11].
However, Asia is the dominant region with 92% of global production, when we
include all aquaculture production, not only fish, crustaceans, and mollusks. China is
by far the world’s biggest producer, representing 57% of global aquaculture output.
Over time, the production shares of different world regions have changed
significantly, as shown in Fig. 7. Europe and North America have experienced
rather dramatic declines in their global production shares and have become large
net importers of farmed seafood. Asia has increased its production share most, while
Africa and South America have also experienced an increase, although from a very
low base. These production-share developments can probably be explained by a
mix of not only population growth, growth in GDP per capita, and environmental
and other regulation, but also productivity growth [8–11].

[Figure 7: quantity shares by year, 1950 to 2020; left axis: Asia quantity share (%); right axis: non-Asia quantity share (%); series: Asia, Africa, Europe, Oceania, Americas]
Fig. 7 Aquaculture production quantity share by region. (Source: FAO [1])

In this chapter, we survey studies of aquaculture productivity, and its determi-
nants. The chapter is organized as follows: In section “Bioeconomic Modeling
of Aquaculture Production,” we discuss bioeconomic modeling of aquaculture
production, and the farmer’s optimization problem. Section “Empirical Analyses
of Productivity and Efficiency in Aquaculture” provides an overview of empirical
studies of productivity and efficiency in aquaculture. Next, in section “Analyses of
Production Risk and Economic Risk,” we survey empirical studies of production
risk. Section “Analyses of Environmental Externalities” provides a survey on
empirical studies of relationships between productivity and environmental fac-
tors. In section “Analyses of Agglomeration Economies,” we survey studies of
agglomeration economies in aquaculture. Section “Conclusion” provides conclud-
ing comments.

Bioeconomic Modeling of Aquaculture Production

Understanding productivity developments in aquaculture requires an understanding
of the production processes in this sector, and further benefits from insights
provided by the literature on bioeconomic modeling of aquaculture. Aquaculture
has several inherent characteristics, most notably a long production period in
combination with biological and environmental shocks, which represent departures
from underlying assumptions of the standard model of the competitive firm, and have
significant consequences for observed productivity across firms and over time.
The aquaculture farmer’s economic optimization problem involves (1) choice
of production technology and investment in fixed capital equipment, (2) timing
release of fingerlings into the ponds or cages, (3) making decisions on feeding and
other decisions that affect the survival and growth of the biomass of fish, and (4)
harvesting the fish for sale [12]. Bioeconomic models can be applied in modeling of
this optimization problem [13], and we will look more into these in the following.

Bioeconomic Models

Several authors have examined the scientific production related to the use of bioe-
conomic modeling in aquaculture. Allen et al. [14] identified 22 specific examples
of modeling aquaculture production processes during the period of 1974–1983.
Leung [15] found 32 studies in which bioeconomic models were used in aquaculture
during the period of 1984–1993. Cacho [16] provides a review of bioeconomic
literature and presents 20 published papers from 1974 to 1995. Pomeroy et al. [17]
identified and surveyed 28 papers that used integrated models for fish production
during the period of 1994–2003, of which only seven were applied in the field of
aquaculture. Llorente and Luna [13] present 40 published papers on bioeconomic
modeling of aquaculture management during the period 2004–2015, which include
both optimization and simulation models.
Karp et al. [18] and Leung and Shang [19] considered the problem of determining
optimal harvest and restocking times and levels for farmed shrimp. Bjørndal
[20] developed the first optimal harvesting models for aquaculture based on the
forestry literature. Several authors have extended Bjørndal’s model to emphasize
specific aspects of the problem. Arnason [21] introduced dynamic behavior and
presented a general comparative dynamic analysis. Furthermore, he introduced
feeding as a decision variable. Heaps [22] modeled density-independent growth,
while Heaps [23] allowed for density-dependent growth and looked at the culling
of farmed fish. Mistiaen and Strand [24] demonstrated general solutions for optimal
feeding schedules and harvesting times under conditions of piecewise-continuous,
weight-dependent prices (harvest). Yu and Leung [25] develop a partial harvesting
model that addresses discrete partial harvesting and other partial harvesting using
impulsive control theory. They consider a general framework for identifying the
optimal time to harvest an aquacultural crop in the case of heterogeneous production
cycles.
In bioeconomic models, there is usually a growth and survival function for the
farmed fish, and it may also include other biological or environmental variables,
functions, and constraints which influence the production process and its effects
on the environment. The bioeconomic model can also include constraints on the
production process from government regulation and physical capacities of the
production system. Choice variables may be the timing and volume of fingerlings
released into the farm, feed volume, and timing and quantity of harvested fish.
The bioeconomic optimization problem consists of finding the values of the choice
variables that maximize the objective function, for example, profits.
Following Asche and Bjørndal [12], the farmer’s maximization problem can be
specified as follows:

\[
\max_{0 \le t \le T} \; \pi(t) = V(t)\,e^{-rt} - C_f \int_0^t F(u)\,R\,e^{-(M+r)u}\,du,
\]

where $\pi$ is profits, $t$ is time, $T$ is the life expectancy of the fish, $V(t)$ is gross fish biomass
value, $C_f$ is the feed cost per unit of feed, $F$ is the feed quantity per fish, $R$ is the number
of recruits (i.e., fingerlings released into the water), $M$ is the fish mortality rate, and $r$
is the interest rate. The gross biomass value is given by $V(t) = p(w)B(t)$, where $p$
is the sales price as a function of fish weight $w$ and $B$ is biomass volume. The biomass
is $B(t) = N(t)\,w(t)$, i.e., the product of the number of fish $N$ and the average weight of
the fish $w$. The feed quantity per fish at time $t$ is $F(t) = f_t\,w'(t)$, where $f_t$ is the feed
conversion ratio, i.e., the ratio of feed volume in kg to fish growth in kg, and $w'(t)$ is
the fish growth rate. The first-order condition for profit maximization can be derived
as:

\[
\pi'(t) = V'(t)\,e^{-rt} - rV(t)\,e^{-rt} - C_f F(t)\,R\,e^{-(M+r)t} = 0
\]

and can be rewritten as:

\[
\frac{p'(w)}{p(w)}\,w'(t^*) + \frac{w'(t^*)}{w(t^*)} = r + M + \frac{C_f F(t^*)}{p(w)\,w(t^*)}
\]
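
The step from the first-order condition to the harvesting rule can be spelled out as follows. This is a sketch that assumes the number of surviving fish is $N(t) = R\,e^{-Mt}$, which is implied by the mortality discounting in the feed-cost integral but is not stated explicitly in the text.

```latex
\begin{aligned}
V(t)  &= p\big(w(t)\big)\,R\,e^{-Mt}\,w(t), \\
V'(t) &= R\,e^{-Mt}\Big[\,p'(w)\,w'(t)\,w(t) + p(w)\,w'(t) - M\,p(w)\,w(t)\Big].
\end{aligned}
% Multiply the first-order condition by e^{rt} and divide by R e^{-Mt} p(w) w(t):
\[
\frac{p'(w)}{p(w)}\,w'(t) + \frac{w'(t)}{w(t)} - M - r - \frac{C_f F(t)}{p(w)\,w(t)} = 0 .
\]
```

Moving $M$, $r$, and the relative feed cost to the right-hand side and evaluating at $t = t^*$ gives the rule stated above.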

The marginal revenue per fish with respect to time is on the left-hand side, while
marginal cost per fish is on the right-hand side. $C_f F(t)$ is the feed cost per fish at
time $t$, and $p(w)w(t)$ is the value of the fish. $C_f F(t)/p(w)w(t)$ is thus the relative feed
cost, which combined with the interest rate $r$ and the mortality rate $M$ represents
the cost of not harvesting the fish at time $t$. The fish farmer will continue to feed
the fish as long as marginal revenue is higher than marginal cost, and harvest
when marginal revenue intersects marginal cost, as shown in Fig. 8. This figure
also shows the effect of introducing feed costs on the optimal harvesting time ($t_2^*$) as
compared to a situation without feed costs ($t_1^*$).
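
The harvesting rule can be illustrated numerically. The sketch below is not taken from Asche and Bjørndal [12]: it assumes a von Bertalanffy-type weight curve, a price that does not depend on fish weight (so the price-growth term drops out), and hypothetical parameter values. It solves for the harvest time with and without feed costs by bisection.

```python
import math

# Hypothetical growth curve; all parameter values are assumptions.
W_INF = 6.0   # asymptotic weight, kg
K = 0.8       # growth coefficient per year


def weight(t: float) -> float:
    """Average fish weight w(t) in kg at age t (years)."""
    return W_INF * (1.0 - math.exp(-K * t)) ** 3


def growth(t: float) -> float:
    """Growth rate w'(t), the time derivative of weight(t)."""
    e = math.exp(-K * t)
    return 3.0 * W_INF * K * e * (1.0 - e) ** 2


def marginal_gap(t, price, r, M, feed_cost, fcr):
    """Relative marginal revenue minus relative marginal cost per fish.

    With a weight-independent price, p'(w) = 0, so the left-hand side of the
    harvesting rule reduces to w'(t)/w(t). Feed per fish is F(t) = fcr * w'(t).
    """
    w, dw = weight(t), growth(t)
    marginal_revenue = dw / w
    marginal_cost = r + M + feed_cost * fcr * dw / (price * w)
    return marginal_revenue - marginal_cost


def optimal_harvest_time(price, r, M, feed_cost, fcr, lo=0.05, hi=20.0):
    """Find t* where the gap crosses zero, by bisection on [lo, hi]."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if marginal_gap(mid, price, r, M, feed_cost, fcr) > 0.0:
            lo = mid   # still worth keeping the fish in the water
        else:
            hi = mid
    return 0.5 * (lo + hi)


t_no_feed = optimal_harvest_time(price=5.0, r=0.05, M=0.10, feed_cost=0.0, fcr=1.2)
t_feed = optimal_harvest_time(price=5.0, r=0.05, M=0.10, feed_cost=1.2, fcr=1.2)
print(f"t1* (no feed costs):   {t_no_feed:.2f} years")
print(f"t2* (with feed costs): {t_feed:.2f} years")
```

Under these assumptions, feed costs raise the marginal cost of waiting, so the fish are harvested earlier and at a lower weight, consistent with the ordering of $t_2^*$ and $t_1^*$ in Fig. 8.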

The Rotation Problem

Among the most important decision-making activities in production planning is that
of determining the optimal rotation of live fish, i.e., finding the best sequence of
release of fingerlings and harvesting of fish. This has impact on the farm’s cash flow
as well as the allocation of limited production resources, such as feed, fish, space,
and environmental resources [16, 26].

[Figure 8: marginal revenue plotted against time, with marginal cost levels r + M and r + M + CfF/pw; harvest times t2* (with feed costs) and t1* (without feed costs) marked on the time axis, t2* earlier than t1*]
Fig. 8 The optimal harvesting time with and without feed costs. (Source: Asche and Bjørndal
[12], Fig. A2)

[Figure 9: time axis marked 0, t1, t2, t3, t4]
Fig. 9 The rotation problem in aquaculture. (Source: Asche and Bjørndal [12], Fig. 9.6)

The rotation problem in aquaculture, illustrated in Fig. 9, shares many features
with rotation problems in forestry and traditional terrestrial livestock production.
However, fish farming also exhibits specific features that demand a more flexible
model than those constructed for other sectors. In Guttormsen [26], an extended
version of the Faustmann model is presented, which is general enough to treat differ-
ent species and technologies. Two particularly important aspects of the problem are
emphasized: first, the possibilities for cycles in relative price relationships between
fish of different weight, and second, restrictions in release time for certain species.
An illustration of the model based on assumptions from salmon farming shows that
the inclusion of these two features has major influences on rotation time, and hence
harvest weight. Guttormsen [26] argues based on his extended Faustmann model
that a well-developed production plan can mean the difference between loss and
profit for a fish farm.
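
To illustrate why rotation matters, the sketch below compares a single production cycle with the classical Faustmann formulation from forestry, in which the same cycle is repeated indefinitely. It is not Guttormsen's extended model: it ignores feed costs, weight-dependent prices, and restrictions on release time, and the growth curve and all parameter values are hypothetical.

```python
import math

W_INF, K = 6.0, 0.8        # hypothetical growth parameters (kg, per year)
PRICE, R0 = 5.0, 1000.0    # hypothetical price per kg and number of recruits
M, r = 0.10, 0.05          # mortality and interest rates (per year)


def biomass_value(t: float) -> float:
    """V(t) = p * N(t) * w(t), with N(t) = R0 * exp(-M t)."""
    w = W_INF * (1.0 - math.exp(-K * t)) ** 3
    return PRICE * R0 * math.exp(-M * t) * w


def single_cycle_pv(t: float) -> float:
    """Present value of harvesting one cohort at age t."""
    return biomass_value(t) * math.exp(-r * t)


def infinite_rotation_pv(t: float) -> float:
    """Present value of repeating the same cycle forever (geometric sum)."""
    return single_cycle_pv(t) / (1.0 - math.exp(-r * t))


def argmax_on_grid(f, lo=0.1, hi=10.0, step=0.001):
    """Brute-force search for the t that maximizes f on an evenly spaced grid."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    return max(grid, key=f)


t_single = argmax_on_grid(single_cycle_pv)
t_rotation = argmax_on_grid(infinite_rotation_pv)
print(f"single cycle harvest age:      {t_single:.2f} years")
print(f"infinite rotation harvest age: {t_rotation:.2f} years")
```

With repeated cycles, keeping fish in the water delays all future cohorts, so the rotation age is shorter than the single-cycle harvest age. Price cycles and release-time restrictions, as emphasized in [26], would shift the result further.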

Risk and Risk Aversion

In modeling of the farmer’s economic optimization problem, one can assume a
deterministic setting with certain prices and absence of production risk. This is
often implicit in theoretical models of aquaculture [12] which maximize profits,
or in a multiperiod setting net present value of profits. However, risk is an
inherent characteristic of aquaculture, both in the production process and in markets,
leading to both output risk and price risk. Fish diseases, algae blooms, and other
biological and environmental shocks are sources of production risk. When risk is
incorporated and the farmer is assumed to be risk neutral, then the problem is one
of maximizing expected net present value. However, if the farmer is risk averse,
then the optimization problem becomes an expected utility maximization problem
[27–29, 101].
The expected utility (EU) model of the competitive firm is a member of a broad range of
maximization problems that have been considered in the EU theory of choice under
uncertainty. Many of these can be fitted into the general framework:

\[
\max_{\alpha} \; E\left[ U\big(\phi(\theta, \alpha, W_0)\big) \right]
\]

where $U(\cdot)$ is a von Neumann-Morgenstern utility function, $\alpha$ is a control variable
(assumed to take positive values), $\theta$ is an economically relevant random variable,
$W_0$ is initial wealth, and $\phi(\cdot)$ is a function mapping actions $\alpha$ and realizations of
$\theta$ into outcomes, normally taken to be wealth levels. In the theory of the firm, the
control variable $\alpha$ might be the production level $y$ or a vector of input levels $x$. The
random variable $\theta$ might be the production level $y$ or the output price $p$, or both. The
argument of the utility function, $\phi(\cdot)$, might be the profit function plus initial wealth
$W_0$.
Just and Pope [30] provided a foundation for modeling risky production technologies,
and for the econometric study of heteroskedastic production technologies, which is
also fruitful for the study of aquaculture. They
suggested eight postulates for the stochastic production function, which they claim to
be reasonable on the basis of a priori theorizing and observed behavior. Furthermore,
they specified a particular functional form which satisfies the eight postulates. This
is known as the Just-Pope production function, which is given by [30]

\[
y = f(x; \alpha) + h(x; \beta)\,\varepsilon, \tag{1}
\]

where $y$ is the output level, $x$ is a vector of input levels, $\varepsilon$ is a stochastic term, and
$E[\varepsilon] = 0$. The function $f(\cdot)$ is the mean production function, and $h(\cdot)$ is the variance
production function. The parameter vectors $\alpha$ and $\beta$ are the mean and variance
function parameters, respectively. In Eq. (1), the effect of input changes has been
separated into two effects: the effect on mean output and the effect on the variance
of output. The Just-Pope production function is a heteroskedastic specification,
because the variance of $y$ is a function of the input vector $x$. An advantage of the
Just-Pope model is that it allows us to analyze the effects of changing input levels on
mean output and output risk separately. This can be seen by deriving from Eq. (1)
the conditional variance of output, $\mathrm{Var}[y] = [h(x; \beta)]^2\,\mathrm{Var}[\varepsilon]$, and the conditional
mean output, $E[y] = f(x; \alpha)$.
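
The separation of mean and risk effects can be checked by simulation. The sketch below assumes power functions for both $f$ and $h$, a standard normal error term, a single input, and hypothetical parameter values; none of these come from Just and Pope [30]. It compares sample moments of simulated output with the theoretical values $f(x)$ and $[h(x)]^2\,\mathrm{Var}[\varepsilon]$.

```python
import random


def mean_output(x, a=2.0, b=0.5):
    """Mean production function f(x; alpha) = a * x**b (assumed form)."""
    return a * x ** b


def risk_scale(x, c=0.5, d=0.5):
    """Variance production function h(x; beta) = c * x**d (assumed form).

    With d > 0 the input is risk increasing; d < 0 would make it risk reducing.
    """
    return c * x ** d


def simulate_output(x, n, rng):
    """Draw n outputs y = f(x) + h(x) * eps with eps ~ N(0, 1)."""
    f, h = mean_output(x), risk_scale(x)
    return [f + h * rng.gauss(0.0, 1.0) for _ in range(n)]


def sample_moments(values):
    """Return the sample mean and the unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, var


rng = random.Random(0)
for x in (4.0, 9.0):
    mean, var = sample_moments(simulate_output(x, 50_000, rng))
    print(f"x={x}: mean {mean:.3f} vs f(x)={mean_output(x):.3f}, "
          f"variance {var:.3f} vs h(x)^2={risk_scale(x) ** 2:.3f}")
```

Raising the input level increases both mean output and its variance here because the assumed exponent in `risk_scale` is positive.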
There is a positive linear relationship between the moments of output and the
moments of profit under JP production risk (Eq. 1), with the mean and variance
of profit given by $E(\pi) = p\,f(x) - w'x = p\,E(y) - w'x$ and $\mathrm{Var}(\pi) = p^2\,\mathrm{Var}(y)$,
where $p$ is the output price and $w$ here denotes the vector of input prices.
In the extreme case of risk neutrality (i.e., $dU/d\mathrm{Var}(\pi) = 0$), the producer is only
concerned about mean profits (output) and ignores the output risk effects of input
choices. However, the model implies that with increasing risk aversion among
farmers, input choices reflect that reduction of profit (output) risk becomes more
important at the expense of increasing mean profit (output). Later, we will see how
this theoretical framework has been applied and extended in econometric studies of
risk in aquaculture.
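
The effect of risk aversion on input choice can be sketched with a linear mean-variance objective, $E(\pi) - (\lambda/2)\,\mathrm{Var}(\pi)$, where $\lambda$ is a risk-aversion coefficient. This objective, the single risk-increasing input, the power-function forms, and all parameter values are assumptions made for illustration, not a specification used in the studies cited.

```python
def expected_profit(x, p=2.0, w=1.0, a=2.0, b=0.5):
    """E(pi) = p * f(x) - w * x with f(x) = a * x**b (assumed form)."""
    return p * a * x ** b - w * x


def profit_variance(x, p=2.0, c=0.5, d=0.5, var_eps=1.0):
    """Var(pi) = p**2 * h(x)**2 * Var(eps) with h(x) = c * x**d (assumed form)."""
    return p ** 2 * (c * x ** d) ** 2 * var_eps


def best_input(risk_aversion, lo=0.01, hi=10.0, step=0.001):
    """Grid search for the input level maximizing E(pi) - (lambda/2) Var(pi)."""
    def utility(x):
        return expected_profit(x) - 0.5 * risk_aversion * profit_variance(x)

    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    return max(grid, key=utility)


for lam in (0.0, 1.0, 3.0):
    print(f"risk aversion {lam}: input level {best_input(lam):.2f}")
```

With $\lambda = 0$ the farmer maximizes mean profit only. As $\lambda$ rises, the chosen level of the risk-increasing input falls, trading mean profit for lower profit variance.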

Biological Shocks and Price Dynamics

Production risk does not only affect the individual farmer, but can also
influence the entire market. The effects of production risk caused by biological
shocks on aquaculture markets have been explored by Asche, Oglend, and Kleppe
[31]. They specify a partial equilibrium model for a biological production industry
(such as fish farming) and derive the conditions for profit-maximizing harvest and harvest
transitions in a competitive setting. In their example, the salmon farmer’s timing
decision to harvest depends on a limited availability of each year class fish stock,
and environmental shocks. These factors make the optimal transition between each
year class stock occur when the difference between the stocks’ marginal value is
at its maximum, which is contrary to what is found in Faustmann’s model. This harvest
strategy affects the market by creating a temporary price spike until it is normalized
by the marginal value of the new stock, contributing to a short-term salmon price
volatility. Furthermore, this model implies that biological shocks have effects on
productivity, also through changes in harvest decisions.

Productivity in Aquaculture

The relationships between different measures of economic and productivity
performance may not be straightforward in aquaculture. In a standard textbook model of
the competitive firm (producing one homogeneous product, with no uncertainty
or risk related to prices and output, and where firms only differ in the relationships
between input use and output as represented by the production technology f(x)),
the most profitable firms will also be the most productive as measured by a primal
measure of total factor productivity (TFP). In aquaculture this need not hold, and
fluctuations in output prices and the timing of harvesting are one explanation.
A firm may have high primal productivity
relative to other firms, as measured by the ratio of physical output quantity to inputs
(e.g., TFP), but have a profitability performance that is
relatively weaker due to the timing of harvesting and the realized market prices at the
time of harvesting.
Presence of market and production risk and risk aversion may also influence
productivity performance. Biological and environmental shocks may have adverse
effects on productivity. These shocks are typically not of the same magnitude
across farms and may consequently affect the relative productivity performance.
Moreover, production shocks can affect the productivity development over time, for
example, disease shocks leading to high loss of biomass and revenue tend to occur at
infrequent time intervals. Risk-averse farmers may make input choices that not only
reduce the level of risk, but also affect the ratio of expected and realized output
to input levels as measured by TFP. When farmers are heterogeneous with respect
to risk preferences, i.e., have different degrees of risk aversion, as Kumbhakar
[27, 28, 32] found, this can also affect the observed relative productivity performance
across farms as measured by TFP.

Empirical Analyses of Productivity and Efficiency in Aquaculture

Productivity and efficiency in the aquaculture industry have become the topic of
an increasing number of studies over time, as shown by Table 1. The literature
has been able to cover all the major farmed species – carp, tilapia, catfish,
salmon, seabass, seabream, shrimp, and oysters. Studies cover aquaculture sectors
worldwide. While most analyses have been performed of aquaculture production in
Asia (e.g., Bangladesh, China, India, and Vietnam), there are still many studies of
aquaculture sectors in Europe, North and South America, and Africa.
Early in the industry’s development, work on assessing the industry’s
performance applied structural production and cost models to estimate technical
change and returns to scale. For instance, Salvanes [92, 93] and Bjørndal and
Salvanes [94] found cost inefficiencies related to the overregulation of salmon farming
in Norway.
In addition, a detailed analysis of production inefficiency and its sources has
been essential to provide evidence to decision makers to increase productivity. The
main parametric approach applied in the literature for efficiency has been stochastic
frontier analysis (SFA). Its main advantage is the possibility of decomposing
the deviation from the frontier into stochastic noise and technical inefficiency in
production. However, it is also necessary to impose a particular parametric form
for the underlying technology.1 An alternative has been to use nonparametric
approaches such as Data Envelopment Analysis (DEA). This type of analysis
does not require a parametric assumption of the underlying technology. However,
the estimated frontier may be sensitive to stochastic noise in the data and can
be overestimated. Thus, the noise term should be introduced in nonparametric
frontier models and needs additional procedures (e.g., bootstrapping) for statistical
inferences [96].

1 Recently, nonparametric SF models have been developed to avoid this problem. See Parmeter
et al. [95], Zhou, Wang, and Kumbhakar (2020), and references therein.

Table 1 Studies on productivity and efficiency of the aquaculture industry

| Authors | Year | Region | Species | Approach |
|---|---|---|---|---|
| Sharma and Leung [33] | 1998 | Nepal | Carp | SFA |
| Iinuma, Sharma, and Leung [34] | 1999 | Malaysia | Carp polyculture | SFA |
| Sharma et al. [35] | 1999 | China | Carp polyculture | DEA |
| Dey et al. [36] | 2000 | Philippines | Tilapia | SFA |
| Karagiannis et al. [37] | 2000 | Greece | Seabream/seabass | SFA |
| Sharma and Leung [38] | 2000 | South Asia (India, Bangladesh, Pakistan, and Nepal) | Carp | SFA |
| Awoyemi et al. [39] | 2003 | Nigeria | Fish | SFA |
| Irz and Mckenzie [40] | 2003 | Philippines | Fish | SFA |
| Ara et al. [41] | 2004 | Bangladesh | Fish | SFA |
| Chiang et al. [42] | 2004 | Taiwan | Milkfish | SFA |
| Martinez-Cordero and Leung [43] | 2004 | Mexico | Shrimp | DEA |
| Dey et al. [44] | 2005 | China/India/Thailand/Vietnam | Fresh water pond polyculture system | SFA |
| Cinemre et al. [45] | 2006 | Black Sea region, Turkey | Trout | DEA |
| Kaliba and Engle [46] | 2006 | USA | Catfish | DEA |
| Kaliba, Engle, and Dorman [47] | 2007 | USA | Catfish | DEA |
| Alam and Murshed-e-Jahan [48] | 2008 | Bangladesh | Prawn/carp | DEA |
| Kareem, Aromolaran, and Dipeolu [49] | 2009 | Nigeria | Polyculture (Clarias gariepinus/Tilapia guinensis mainly) | SPF |
| Singh et al. [50] | 2009 | India | Fish | SFA |
| Chang et al. [51] | 2010 | Taiwan | Shellfish | DEA |
| Nilsen [52] | 2010 | Norway | Salmon | SFA |
| Ogundari and Akinbogun [53] | 2010 | Nigeria | Fish | SFA |
| Onumah et al. [54] | 2010 | Ghana | Fish | SFA |
| Alam [55] | 2011 | Bangladesh | Pangas | DEA |
| Nielsen [56] | 2011 | Denmark | Trout | DEA |
| Pantzios et al. [57] | 2011 | Greece | Seabream/seabass | SFA |
| Alam et al. [58] | 2012 | Bangladesh | Tilapia | SFA |
| Asche and Roll [59] | 2013 | Norway | Salmon | SFA |
| Asche, Guttormsen, and Nielsen [60] | 2013 | Norway | Salmon | DEA |
| Begum et al. [61] | 2013 | Bangladesh | Shrimp | SFA |
| Bukenya et al. [62] | 2013 | Uganda | Fish (catfish/Tilapia) | SFA |
| Arita and Leung [63] | 2014 | Hawaii | Several (catfish, crustacean, foodfish, ornamental, mollusk, and others) | DEA |
| Nguyen and Fisher [64] | 2014 | Vietnam | Shrimp | DEA |
| Schrobback, Pascoe, and Coglan [65] | 2014 | Australia | Oyster | DEA |
| Iliyasu and Mohamed [66] | 2015 | Malaysia | Fresh water aquaculture (shrimp, prawn, Tilapia, catfish, and carp) | DEA |
| Iliyasu, Mohamed, and Hashim [67] | 2015 | Malaysia | Fresh water aquaculture (shrimp, prawn, Tilapia, catfish, and carp) | DEA |
| Begum et al. [68] | 2016 | Bangladesh | Shrimp | SFA |
| Iliyasu and Mohamed [69] | 2016 | Malaysia | Fresh water aquaculture (shrimp, prawn, Tilapia, catfish, and carp) | DEA |
| Sandvold [70] | 2016 | Norway | Juveniles/salmon | SFA |
| Anh Ngoc et al. [71] | 2018 | Vietnam | Pangas | DEA |
| Nguyen et al. [72] | 2018 | Vietnam | Catfish | DEA |
| Ton Nu Hai et al. [73] | 2018 | Vietnam | Lobster | DEA |
| Bayazid et al. [74] | 2019 | Bangladesh | Floodplain | DEA |
| Forleo et al. [75] | 2019 | Italy | | DEA |
| Mitra et al. [76] | 2019 | Bangladesh | Tilapia/catfish | DEA |
| Rahman et al. [97] | 2019 | Bangladesh | Pangas/Tilapia | DEA |
| Rahman, Nielsen, and Khan [138] | 2019 | Bangladesh | Pangas/Tilapia | SFA |
| Rodrigues et al. [77] | 2019 | Brazil | Fingerlings | DEA |
| Scuderi and Chen [78] | 2019 | USA | Oyster | SFA |
| Aponte [79] | 2020 | Norway | Salmon | DEA |
| Aripin et al. [80] | 2020 | Malaysia | Seabass | DEA |
| Long et al. [81, 82] | 2020a;b | Vietnam | Shrimp | DEA |
| Mitra et al. [83] | 2020 | Bangladesh | Tilapia/catfish | DEA |
| Fernández Sánchez et al. [84] | 2020 | Mediterranean Sea | Seabream/seabass | SFA |
| Gutiérrez, Lozano, and Guillén [85] | 2020 | EU members | Several (fish and shellfish) | DEA |
| Ton Nu Hai and Speelman [86] | 2020 | Vietnam | Lobster | DEA |
| Ton Nu Hai, Meensel, and Speelman [87] | 2020 | Vietnam | Lobster | DEA |
| Long [88] | 2021 | Vietnam | Shrimp | DEA |
| Nielsen et al. [89] | 2021 | Mediterranean | Seabream/seabass | DEA |
| Khan, Roll, and Guttormsen [90] | 2021 | Bangladesh | Pangas | SFA |
| Hukom, Nielsen, and Nielsen [91] | 2021 | Indonesia | Shrimp polyculture | DEA |

SFA Stochastic Frontier Analysis, DEA Data Envelopment Analysis

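
The idea behind DEA can be shown in its simplest case. With one input, one output, and constant returns to scale, the input-oriented efficiency score of a farm equals its output/input ratio divided by the best ratio observed in the sample. The sketch below covers only this special case; with several inputs and outputs, each score is obtained by solving a linear program per farm. The farm labels and data are hypothetical.

```python
def dea_crs_single(inputs, outputs):
    """Input-oriented, constant-returns efficiency for one input and one output.

    Each farm is benchmarked against the highest output/input ratio in the
    sample, so the best-practice farms score 1 and all others score below 1.
    """
    ratios = {farm: outputs[farm] / inputs[farm] for farm in inputs}
    best = max(ratios.values())
    return {farm: ratio / best for farm, ratio in ratios.items()}


# Hypothetical farms: feed use and harvest, both in tonnes.
feed = {"A": 100.0, "B": 120.0, "C": 90.0, "D": 150.0}
harvest = {"A": 80.0, "B": 108.0, "C": 54.0, "D": 135.0}

scores = dea_crs_single(feed, harvest)
for farm, score in sorted(scores.items()):
    print(f"farm {farm}: efficiency {score:.3f}")
```

A score of 0.667 for a farm means that, judged against the best-practice farms, it could produce its current harvest with about two-thirds of its current feed use. Because the frontier is defined by observed farms, a single noisy observation can shift every other farm's score, which is the sensitivity to noise noted in the text.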
Both SFA and DEA approaches have been widely applied for estimating produc-
tivity and efficiency of aquaculture sectors (Table 1). In general, parametric studies
apply a stochastic production frontier model using a Cobb-Douglas specification,
or a more flexible form, such as the translog. Most nonparametric studies employ a
total factor productivity index with input-oriented specification [85]. Independent of
the approach, the technical efficiency (TE) and cost efficiency (CE) measurements
provide evidence that the industry has potential to increase its efficiency levels.
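
For readers less familiar with the parametric approach, a stochastic production frontier with a Cobb-Douglas specification can be written as below. This is a generic textbook form, not the specification of any particular study in Table 1, and the half-normal distribution for $u_i$ is one common choice that is assumed here for concreteness.

```latex
\[
\ln y_i = \beta_0 + \sum_{k} \beta_k \ln x_{ki} + v_i - u_i,
\qquad v_i \sim N(0, \sigma_v^2), \quad u_i \sim N^{+}(0, \sigma_u^2),
\]
\[
TE_i = \exp(-u_i) \in (0, 1].
\]
```

Here $v_i$ is symmetric noise and $u_i \ge 0$ is technical inefficiency for farm $i$. The translog form adds the second-order terms $\tfrac{1}{2}\sum_k \sum_l \beta_{kl} \ln x_{ki} \ln x_{li}$ to the right-hand side.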
Several drivers of efficiency have been identified in the literature. For the
parametric approach, it has been possible to estimate simultaneously the effect of
several factors affecting efficiency together with the frontier. For the nonparametric
approach, a second stage has been incorporated by running different regression
models (e.g., linear, truncated, tobit, logit, probit, and fractional) on the potential
explanatory variables of inefficiency.
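
The second-stage idea can be sketched with the simplest of the listed models, a linear regression of efficiency scores on one explanatory variable. The data and variable names below are hypothetical. Tobit, truncated, and fractional models account for scores being bounded between 0 and 1 and require dedicated estimation routines that are not shown here.

```python
def ols_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical first-stage efficiency scores and farmers' years of education.
education_years = [2, 4, 6, 8, 10]
efficiency = [0.55, 0.60, 0.70, 0.75, 0.85]

intercept, slope = ols_line(education_years, efficiency)
print(f"intercept {intercept:.3f}, slope {slope:.4f} per year of education")
```

A positive slope would be read as education being associated with higher efficiency, in line with the human capital findings discussed in the following paragraph.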
Human capital or skill is among the factors which have been studied. Education
level, training, and experience are common variables representing human capital
used to explain efficiency levels [36, 42, 45, 50, 51, 53, 54, 58, 61, 64–66, 68, 71–
73, 76, 78, 80–82, 87, 146]. It is expected that higher levels of education, training,
and experience will contribute to better informed decisions with respect to use and
allocation of production resources and inputs.
Other factors related to farmers and farm management include whether the farm is
owner-operated (i.e., company type [45, 46, 51, 59, 74]), whether aquaculture is
the farmer’s primary activity [38], household size, whether there is a hired manager,
and whether a record of activities is maintained [62]. Sharma et al. [35] and Hukom,
Nielsen, and Nielsen [91] show that comanaged farms perform better than farms without
comanagement. Comanaged farms require more communication and
collaboration among the farmers, which builds adaptive capacity and resilience,
leading to better performance. However, Forleo et al. [75] found the opposite for
Italian aquaculture firms, where the most inefficient farms are cooperatives.
Credit availability is also a key driver for efficiency in aquaculture farms. Several
studies of different species (trout, tilapia, and catfish) in different regions (Uganda
and Bangladesh) found that credit constraints are associated with lower productivity
[45, 62, 76, 97].
Given the diverse set of technologies used in aquatic farming (e.g., intensive
vs. extensive systems, ponds vs. cages, and small vs. large), farms’ technical
characteristics have also been considered as potential drivers of efficiency. For
instance, intensive carp farms in Nepal [33] and in Malaysia [34] and shrimp farms
in Vietnam [81] performed better than extensive ones. However, other studies show
that small extensive farmers can be more technically and economically efficient
[35, 48, 64]. Aponte [79] found that smaller salmon
firms had higher revenue productivity than larger firms. In general,
the farm operation scale (e.g., cage, pond, and farm area), specialization (mono
or polyculture), water management, quality and source, and other inputs used in
production (from temperature, pH, and nitrogen to smolt quantity and price, feed
and fertilizer management) have been considered aspects that influence production
efficiency.
Furthermore, environmental aspects influencing efficiency have received partic-
ular attention in the last decades. Diseases leading to slower growth and mortality
together with pollution and climate change negatively affect the farms’ performance
[43, 59, 64, 91]. However, farmers are able to innovate and learn to deal with these
issues. For instance, catfish farmers in Vietnam who had experienced some climate
change impacts in the past, such as flooding or salinity intrusion, had a better
productive performance [72].
Other exogenous variables that seem to influence how well farms perform
are related to access to markets and prices, and other complementary industries.
Ogundari and Akinbogun [53] found that access to markets was associated with
significantly lower technical inefficiency of farmers. Gutiérrez, Lozano, and Guillén
[85] suggested that countries with a significant and established capture fisheries
sector were also more efficient. This might be related to better infrastructure and
logistics already established to support the aquaculture industry.

Analyses of Production Risk and Economic Risk

Production and economic risk are present for various farmed species and countries.
In fact, these risks have contributed to boom and bust cycles in several aquaculture
sectors [98]. Often, feed and capital are found to be risk-increasing inputs and
labor to be risk reducing [29, 53, 99, 100].
Tveterås [29] estimated primal panel data models for risky production on an
unbalanced panel of Norwegian salmon farms using the Just-Pope model framework,
presented in section “Bioeconomic Modeling of Aquaculture Production,”
and analyzed how different specifications of functional forms and different estimators
influenced the empirical results. Feed and fish input were found to have
risk-increasing effects on output, while labor and materials input had risk-reducing
effects on output. In other words, a marginal increase in feed input increased the
variance of output while a marginal increase in labor reduced the output variance.
The finding that feed is risk increasing and labor risk reducing has been found
in several later studies, e.g., Tveterås [147], Kumbhakar [27, 28], Kumbhakar and
Tveterås [101], and Kumbhakar and Tsionas [102]. Labor plays a particularly
important role in production risk management. Farm workers’ main tasks are mon-
itoring of the live fish in the pens, biophysical variables (sea temperature, salinity,
oxygen concentration, algae concentrations, etc.), and the condition of the physical
production equipment (pens, nets, feeding equipment, anchoring equipment, etc.).
Thus, workers’ ability to detect and diagnose abnormal fish behavior, detect changes
in biophysical variables, and make prognoses on future development is crucial to
mitigate adverse production conditions and reduce production risk. There are several
explanations why feed may increase the level of output risk. The feed is not all
digested by the farmed fish, and residue is released into the environment as feed
waste or feces. Salmon compete with this released organic waste for the limited
oxygen available in the cages (organic waste consumes oxygen in the decomposition
process). In addition, feed waste can lead to the production of toxic by-products,
such as ammonia.
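
The notion of risk-increasing and risk-reducing inputs can be illustrated with a minimal sketch of a two-step Just-Pope estimation, y = f(x) + h(x)^(1/2) ε. Everything specific here is an assumption made for illustration and is not taken from Tveterås [29]: the data are simulated, the mean function f is linear, the variance function h is log-linear in the inputs, and the function name just_pope_two_step is hypothetical.

```python
import numpy as np

def ols(Z, t):
    """Least-squares coefficients of t on the columns of Z."""
    coef, *_ = np.linalg.lstsq(Z, t, rcond=None)
    return coef

def just_pope_two_step(y, X):
    """Two-step sketch of y = f(x) + h(x)^(1/2) * eps.

    Step 1: OLS of y on [1, X] gives the mean function (assumed linear).
    Step 2: OLS of log(residual^2) on [1, log X] gives the input
            elasticities of the variance function (assumed log-linear).
    """
    n = len(y)
    Z_mean = np.column_stack([np.ones(n), X])
    beta = ols(Z_mean, y)
    resid = y - Z_mean @ beta
    Z_var = np.column_stack([np.ones(n), np.log(X)])
    alpha = ols(Z_var, np.log(resid ** 2))
    return beta, alpha

# Simulated farms (illustrative only): feed raises output variance,
# labor lowers it.
rng = np.random.default_rng(0)
n = 5000
feed = rng.uniform(1.0, 5.0, n)
labor = rng.uniform(1.0, 5.0, n)
X = np.column_stack([feed, labor])
h = feed ** 0.8 * labor ** (-0.6)          # variance function h(x)
y = 2.0 + 1.5 * feed + 1.0 * labor + np.sqrt(h) * rng.standard_normal(n)

beta, alpha = just_pope_two_step(y, X)
print("mean coefficients (const, feed, labor):", beta)
print("variance elasticities (feed, labor):", alpha[1:])
```

A positive variance elasticity marks an input as risk increasing and a negative one as risk reducing. The intercept of the second-step regression is not a consistent estimate of the level of h, so only the slope coefficients should be interpreted.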
The models of Tveterås [29] accounted for technical change in both the mean
and variance production function. When farmers are risk averse, the change in the
variance function should also be accounted for in an analysis of technical change.
Tveterås finds that from 1985 to 1993, technical change led to not only higher
mean output for average input levels, but also to higher output risk conditional
on average input levels, as shown in Fig. 10. However, when technical change is
analyzed using the first-order stochastic dominance (FSD) criterion, it is found that
the improvement in mean output dominates the increase in output risk, as shown
in Fig. 11. First-order stochastic dominance is present if the cumulative distribution
function (cdf) of the new technology lies strictly to the right of the cdf of the
old technology for all values of output y. Then both risk-averse and risk-neutral
producers will prefer the new technology. FSD of the new technology relative to the
old technology also implies that risk-neutral and risk-averse farmers can agree that
there has been technical progress during the data period.

Fig. 10 Rate of technical change of the mean (TC) and variance production functions (TCV). (Source: Tveterås [29])

Fig. 11 Cumulative distribution function (cdf), F(y), of salmon output y evaluated at sample mean
input levels, with one curve per year from 1985 to 1993. (Source: Tveterås [29])
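
The first-order stochastic dominance criterion described in this passage can be checked numerically on two samples of output. The sketch below is illustrative only: the function name first_order_dominates and the sample values are hypothetical, and it uses the common textbook form of the criterion (the new cdf is never above the old one and is strictly below it somewhere) rather than strict inequality at every point.

```python
import numpy as np

def first_order_dominates(new, old):
    """Return True if the empirical cdf of `new` is never above that of
    `old` and is strictly below it at one or more points.

    Both empirical cdfs are step functions that only jump at sample
    values, so comparing them at the pooled sample values is sufficient.
    """
    new = np.sort(np.asarray(new, dtype=float))
    old = np.sort(np.asarray(old, dtype=float))
    points = np.union1d(new, old)
    F_new = np.searchsorted(new, points, side="right") / new.size
    F_old = np.searchsorted(old, points, side="right") / old.size
    return bool(np.all(F_new <= F_old) and np.any(F_new < F_old))

# Hypothetical output levels under an old and a new technology.
old_output = np.array([0.5, 0.6, 0.7, 0.8])
new_output = np.array([0.8, 0.9, 1.0, 1.1])
print(first_order_dominates(new_output, old_output))

# A pure increase in spread around the same mean gives no dominance
# in either direction.
narrow = np.array([0.9, 1.0, 1.1])
wide = np.array([0.5, 1.0, 1.5])
print(first_order_dominates(narrow, wide), first_order_dominates(wide, narrow))
```

The second pair shows why the criterion is demanding: a technology that only raises output variance without shifting the whole distribution to the right is not ranked by FSD.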
Unlike Tveterås, who only estimated the structure of production risk, Kumbhakar
[27] extended the production model to include not only risk, but also producer’s
attitude toward risk, and technical inefficiency. Kumbhakar’s specification did not
assume a parametric form of the utility function, or a specific distribution for the
error term representing production risk, and it allowed for a generalized technical
efficiency model.
In Kumbhakar [27], the model is also estimated on a sample of Norwegian
salmon farms. According to the econometric estimates, all farmers are risk averse.
Production risk is found to be increasing with feed and decreasing with labor
and capital. Furthermore, risk preference associated with production uncertainty is
found to be stronger than that of technical inefficiency. Technical inefficiency is
found to be positively related to feed and negatively related to labor and capital. The
mean technical inefficiency for the sample farms is found to be 7.9%.
Kumbhakar and Tveterås [101] use a similar model framework and the same data
as Kumbhakar [27]. Farm age is introduced as an additional variable, and the
relative risk premium is estimated. Empirical evidence of production risk and farm
heterogeneity is found based on the econometric estimates. Capital and labor are
found to be risk reducing, while feed and fish input are risk increasing. They
also find that farm age has a negative effect on mean output, with a sample mean
elasticity of −7.2%. The accumulation of organic sediments below the cages over
time may change the environmental conditions around and within the farm, resulting
in less oxygen available to the fish, along with a higher risk of disease. Hence,
the farm-age variable may capture both the firm’s positive learning effect and the
negative effect on fish health and welfare from the changes in the environmental
conditions. The results imply that the negative fish welfare effects dominate.
Moreover, Kumbhakar and Tveterås [101], not surprisingly, find salmon farmers
to be risk averse. However, they also find that farmers are downside risk averse.
This means that farmers generally avoid situations which offer the potential for
substantial gains, but which also leave them even slightly vulnerable to losses below
some critical level. To gain a better understanding of the importance of risk-averse
behavior as well as differences in the degree of risk aversion among farms, the
authors estimated the risk premium (RP), defined as the sure amount of money
satisfying E[U(π)] = U(E[π] − RP), for each farm in every year. Since the farms are
heterogeneous, it was considered more appropriate to focus on relative risk premium
(RRP) values, i.e., the RP as a percentage of mean profits, for the salmon farmers
in the sample. The estimated values of relative risk premiums were found to be
positive, but vary across farms, and over time. The overall mean RRP is 17.9%,
while the mean (by farm) RRP ranges from 11.5% to 31.5%. Overall, this empirical
evidence suggests substantial welfare loss associated with private risk bearing in
this industry.
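
To make the risk premium and relative risk premium concrete, the sketch below computes both for a simulated profit distribution. It reads the profit level on the right-hand side of the defining equation as expected profit, which is the standard textbook definition. The utility function is an assumption chosen purely for illustration (constant absolute risk aversion, U(π) = −exp(−aπ)); the studies discussed here do not impose a parametric utility function, and the profit data, the parameter a, and the function name risk_premium_cara are hypothetical.

```python
import numpy as np

def risk_premium_cara(profits, a):
    """Risk premium RP solving mean(U(pi)) = U(mean(pi) - RP)
    for the assumed utility U(pi) = -exp(-a * pi), with a > 0."""
    profits = np.asarray(profits, dtype=float)
    expected_utility = np.mean(-np.exp(-a * profits))
    # Invert U to get the certainty equivalent of the risky profit.
    certainty_equivalent = -np.log(-expected_utility) / a
    return profits.mean() - certainty_equivalent

# Simulated profits (illustrative): mean 10, standard deviation 1.
rng = np.random.default_rng(1)
profits = rng.normal(loc=10.0, scale=1.0, size=200_000)

rp = risk_premium_cara(profits, a=0.5)
rrp = rp / profits.mean()   # relative risk premium, RP / mean profit
print(f"RP = {rp:.3f}, RRP = {100 * rrp:.1f}% of mean profit")
```

For normally distributed profits and this utility function, the risk premium has the closed form a·σ²/2 (here 0.25), which gives a simple check on the numerical result.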
Kumbhakar and Tsionas [102] deal with nonparametric estimation of risk and the
risk preference function when producers face both production risk and output price
risk. Models are specified to estimate risk preference of individual producers under
(i) only production risk, (ii) only price risk, (iii) both production and price risks, (iv)
production risk with technical inefficiency, (v) price risk with technical inefficiency,
and (vi) both production and price risks with technical inefficiency. Norwegian
salmon farming data is used for an empirical application of some of the proposed
models. Based on their estimates, Kumbhakar and Tsionas find that salmon farmers
are, in general, risk averse. Labor is found to be output-risk decreasing while capital
and feed are found to be risk increasing.
The nonparametric estimation approach of Kumbhakar and Tsionas [102] does
not restrict the distribution of elasticity estimates to be symmetric. In fact, they find
that none of the distributions are symmetric but are all skewed to the right. Thus,
the median values of these elasticities are less than their mean values. For technical
change (TC), they find mean technical progress at the rate of 4.6% per year, with
a median value of TC 5.3%. A notable feature of the TC distribution is that it is
bimodal. The two modal values of TC are 2.5% and 7.5% per annum, respectively.
Although the mean TC is around 6% per year, some farms experienced technical
progress at the rate of 2.5% while other “leading” farms experienced a much higher
rate.
Kumbhakar and Tsionas report the frequency distribution of elasticities of the
risk function with respect to labor, capital, feed, age, and time. The mean (median)
values of these elasticities for labor, capital, feed, age, and time are −0.049
(−0.043), 0.016 (0.011), 0.085 (0.016), −0.001 (−0.001), and 0.002 (0.002),
respectively. The risk part of the production technology seems to be quite insensitive
to changes in the age (experience) of farmers. Similarly, no significant change in
production risk has taken place over time. The mean (median) values of relative
risk premium (RRP) in the model with both production risk and price risk are 0.087
(0.0522). In other words, the median farm is willing to pay 5.22% of the mean profit
as an insurance against possible profit loss due to both production risk and output
price uncertainty.
Production risk has also been studied for other countries and species. Ogundari
and Akinbogun [53] model technical efficiency with production risk in inputs
based on a sample of fish farms from Oyo State, Nigeria. The species cultured
by the sampled farms include tilapia, catfish, and carp. Clarias (catfish) is the
most frequently cultured fish, grown by over 80% of the sampled fish farms. The
authors applied a stochastic frontier model using a similar flexible risk specification
as Kumbhakar [27]. Their results suggest not only that labor, fertilizer, and feed
influence the mean fish output, but also that these inputs influence production risk.
As found in other studies (Tveterås 2000) [27, 28, 101, 102], fertilizer and feed seem
to increase risk, while labor is risk reducing. Thus, an average risk-averse farmer in
Oyo State in Nigeria is expected to use less fertilizer and feed and more labor
compared to a risk-neutral farmer. They also found that without accounting for the
flexible risk component in the production technology specification, the efficiency
score is overstated.
Sarker et al. [103] also provide empirical estimates of production risk and technical
efficiency, using an extended stochastic frontier model based on the Just and Pope
framework and Kumbhakar (2002). They focus on Thai Koi (Anabas testudineus)
in the northern part of Bangladesh, a high-valued species in the market, which
has experienced highly volatile output across different years and farms. Results
show that the main input for production is feed, which is also a risk-increasing
input together with fingerlings, labor, and salt. On the other hand, zeolite, a water-
purifying product, and pesticide appear to be risk reducing. As in Ogundari and
Akinbogun [53], the mean TE scores are overestimated for the conventional model
(0.96) in comparison to when accounting for the risk in the production model (0.73).
Another paper by Khan, Guttormsen, and Roll [100] also looks at the Bangladesh
aquaculture industry but focuses on estimating the structure of production risk
for the species Pangas (Pangasius hypophthalmus), which has in recent decades
become an important export-oriented farmed fish sector. They estimated stochastic
mean and risk production functions based on the Just–Pope framework. Their
results show significantly different production risk between small and large farms.
Moreover, in contrast to other studies (e.g., salmon farming), feed and capital seem
to reduce risk, while fingerlings and farm size (pond area) increase risk. In their
analysis, they find that investments in training and extension services could reduce
production risk. More efficient use of feed and other resources could also increase
production, in particular for small farms with capital constraints and facing high feed
prices. The authors report that smaller farms tend to use lower-quality or homemade
alternative feed due to barriers to access credit (e.g., needs for collateral). This
leads to productivity loss and increased production risk. The study highlights the
importance of studies in production risk, in particular in developing countries, where
farmers are more vulnerable to risk.

Analyses of Environmental Externalities

The environmental performance and sustainability of aquaculture have been subject
to increasing scrutiny over the last decades. There are concerns about both global
and local environmental issues. The use of feed ingredients which are scarce or
finite, e.g., fishmeal and fish oil, or have significant environmental footprints, has
been one type of concern [104–106]. Local pollution of water in terms of excessive
nutrient loads [64, 86, 107, 108] and genetic pollution of wild fish stocks through
escape of genetically modified farmed fish are further concerns [109]. In addition,
there are increasing concerns with respect to animal welfare due to excessive
mortality [86, 104].
All these environmental externalities can be regarded as bad outputs of the
production process in aquaculture. On that basis, several studies have specified
models that quantify the relationship between the good and bad outputs as well as
input use [108–110]. In fact, efficient regulations to address these environmental
issues require knowledge about these relationships. For instance, for outputs with a
joint production process, a reduction in the bad output due to regulations will also
mean a reduction in the good output. So, these relationships provide information on
producers’ incentives to address the negative externalities in the production process.
In order to investigate the incentives that Norwegian salmon farmers have to address
escapees, Pincinato, Asche and Roll [109] applied a multiproduct cost function
with escaped salmon as the bad output and farmed salmon as the good output.
They found that escapees have not affected the salmon production cost directly
or indirectly. This means there are relatively few private incentives to address
this issue. However, given the nonjoint production technology and input-output
separability, it is possible to design regulations that do not reduce the farm’s
economic efficiency.
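
As a sketch of what nonjointness means in a cost-function setting, consider the following notation, which is assumed here for illustration and is not taken from Pincinato, Asche and Roll [109]: w denotes input prices, y_g the good output (farmed salmon), and y_b the bad output (escaped salmon). A technology that is nonjoint in inputs has a cost function that is additive across outputs:

```latex
C(w, y_g, y_b) = C_g(w, y_g) + C_b(w, y_b)
\quad\Longrightarrow\quad
\frac{\partial^2 C}{\partial y_g \, \partial y_b} = 0 .
```

Under this assumed form, the marginal cost of the good output does not depend on the level of the bad output, which is one way to read the conclusion that escapees can be regulated without reducing the farm's economic efficiency.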
Nevertheless, several studies on efficiency in aquaculture point to environment-related
factors, such as water pollution, nutrient emissions, and diseases, as key
drivers of farm performance [59, 64, 86–88, 107, 111]. Long [88] found that lower
input levels of chemicals and drugs, and higher feed use per area, seem to increase
the cost efficiency of shrimp farming in Vietnam. Similar results are found for
lobster farms in Vietnam, for which more efficient use of inputs would improve
environmental performance and lower production costs [86]. In addition, Hukom,
Nielsen, and Nielsen [91] examined how environmental stressors such as industrial
pollution, disease, land use change, and domestic waste were perceived by small-
scale shrimp farmers in Indonesia. Their results suggest that farmers operating in
comanagement are more efficient and less frequently exposed to industrial pollution.
Inefficiency can also make environmental impacts worse, in particular by the
overuse of inputs such as feed. Asche, Roll, and Tveterås (2009) found not only that
both technical and allocative inefficiencies explain the level and variation of
Norwegian salmon farm costs, but also that major environmental impacts in the
production process are due to technical inefficiency. Thus, the degree of inefficiency
in the industry matters for designing environmental regulations and the industry’s
overall sustainability.

In general, improvements related to the industry’s environmental performance
have been reached mainly through technological innovation [112–115]. Feed containing
ingredients considered more sustainable (e.g., algae), better feed formulation
to avoid feed waste, vaccines, more environment-friendly treatments against parasites
(e.g., laser), and biofloc technology are just some examples of improvements
due to innovation that help reduce the pressure on the environment.
Sandvold and Tveterås [116] examined productivity development using a cost
function model for Norwegian juvenile salmon farms, whose output is a key input in
salmon production. The significant decline in production costs found in the study
is attributed to innovations in breeding, feed, equipment, fish health, and water
technology. These improvements have spilled over to salmon grow-out production
farms by increasing survival rates and reducing disease outbreaks after juveniles
were released to the sea.
Diseases and parasites have over the years caused several shocks to the
seafood supply [117, 118], and several studies have tried to estimate the costs
related to these externalities [86, 111, 117, 119, 120]. Abolofia, Asche, and Wilen
[119] found significant private cost related to sea lice, a parasite that is currently
considered one of the major environmental issues in salmon farming. Another
interesting example is the global shrimp industry, which has been hit by disease
outbreaks in several periods and places. Asche et al. [117] specified an economic
model to estimate the disease outbreaks’ impact on economic risk. When disease
outbreaks hit one region, it opens up a market opportunity for other regions to
exploit, by starting up and/or expanding their farming industry. This situation creates
incentives for relatively unsustainable production practices given the high short-run
profitability of the industry.
The literature so far has shown that, in general, addressing the environmental
externalities can lead to improved economic performance. This is essential given the
importance of farmed aquatic species as one of the most resource-efficient sources
of protein for the global food system [121, 122].

Analyses of Agglomeration Economies

One strand of studies of aquaculture productivity has focused on economic effects
of geographic colocation of aquaculture farms, and also colocation of sectors which
are related to aquaculture. These studies are concerned with the possibility that there
may be positive externalities between firms which can be realized when there is a
geographic agglomeration of related firms. Localization in a cluster of related firms
and institutions can yield economic benefits, so-called agglomeration economies, or
external returns to scale.
A distinction is often made in the literature between two types of external
agglomeration economies – localization and urbanization economies. The former
increase returns within a single or more narrowly defined industry (industry clusters)
and draw on seminal insights from Marshall [123], who argues that firms that
colocate could enjoy external economies because of exchange of inputs, expertise,
and division of labor [124]. The latter increase returns to a diversity of industries
in a regional or urban economy [125] and emphasize the positive externalities
associated with new ideas across different sectors, as suggested by Jacobs [126].
These agglomeration economies have also been referred to as intra- (localization)
and inter- (urbanization) clustering [127].
Agglomeration economies also have linkages to the concept of a regional
innovation system. A technological innovation system can be defined as “a dynamic
network of agents interacting in a specific economic/industrial area under a par-
ticular institutional infrastructure and involved in the generation, diffusion, and
utilization of technology” [128, 129]. Private companies in an aquaculture inno-
vation system include salmon farming companies and their suppliers, seafood
processing industry, etc. Universities and research institutes are not only important
institutions through the R&D they undertake, but also as suppliers of highly trained
labor and researchers. Innovation systems related to aquaculture have been studied
by Doloreux et al. [130] and Bergesen and Tveterås [131].
The latter study estimates models which aim to explain the determinants of
collaboration and innovation. Bergesen and Tveterås find, based on their estimated
models, that firms’ internal R&D resources are key to ensuring collaboration with
external organizations, in particular, research institutions. However, collaboration
with R&D institutions has a limited effect on firms’ probability of innovating. Innovation
rates are positively influenced by firms’ internal skills (i.e., R&D employees)
and firms’ collaboration with other firms in the value chain. In particular, innovation
seems to be concentrated at the beginning and end of the supply chain. For the
aquaculture industry, input suppliers were found to be highly innovative, while
farms incorporate innovations from their suppliers.
Tveteras [132], Tveteras and Battese [133], and Asche, Roll, and Tveteras [134]
find evidence of agglomeration economies in Norwegian salmon farming using
different econometric modeling approaches. Tveteras [132] estimates long- and
short-run flexible cost functions to test the structure of agglomeration economies
discussed in Porter [135] in the context of the salmon farming industry in Norway.
More specifically, these cost models make it possible to test effects of agglomeration
on firms’ costs, scale economies, and input demands. According to the results, which
seem to be robust to different econometric model specifications, there is evidence of
agglomeration economies that lead to cost savings in salmon farm production. These
savings are associated with increasing regional farm density and increasing regional
industry size. Moreover, the external economies are also significant compared to
the estimated internal scale economies. Under some circumstances, the estimated
models predict that smaller firms in regions with a large aquaculture industry have
lower production costs than larger firms located in regions with a small industry.
Sources of agglomeration economies may be thicker input markets, localized
knowledge spillovers, and complementarities due to better alignment of activities
[132]. There are benefits to the industry as a whole from sharing inputs such as the
industry’s physical infrastructure capital, research and development, and specialized
human capital. For instance, certain investments in capital will not be fully utilized
by one single firm, but by sharing the cost and use of certain specialized, highly
productive external inputs, firms can contribute to savings on materials and labor
inputs.
The aquaculture industry demands specialized expertise in management, export
marketing, production monitoring, veterinary services, fish and marine biology, feed
technology, etc. However, there is a minimum market size for certain specialized
services to be provided. Long geographic distances and high transportation costs
make regional input markets important. A bigger regional market can lead to more
productive and specialized inputs. This is particularly the case for many producer
services in the Norwegian salmon industry, which is spread over its long coastline.
Much of the productivity increase in salmon farming can be attributed to the
process of learning by doing. This process is in general localized given the uncertainty
and context specificity of the knowledge. In this case, knowledge transmission
depends on physical proximity, so that the level of knowledge spillovers increases
with the industry’s spatial density. This leads also to agglomeration economies in
salmon farming.
Moreover, the larger the regional industry size, the larger the number of workers and
the greater the diversity of human capital. So, one can expect that the increase in the
probability of new ideas and knowledge created by these factors, and exchanged in
many more places, will also influence the level of knowledge spillovers [136, 137].
There is indirect evidence from Tveterås’ [132] results of increasing negative
externalities due to fish diseases being associated with higher farm density. This might
dominate the positive agglomeration economies associated with physical proximity
when farm density becomes high.
Continuing with salmon farming, Tveterås and Battese [133] estimate a stochas-
tic frontier production function which accounts also for technical inefficiency,
and two external economy indexes – regional industry size and regional farm
density. The argument for this separation of effects is that localized markets for
highly productive, specialized inputs and localized knowledge spillovers can lead
to different regional production frontiers. Furthermore, increased levels of localized
knowledge spillovers and substitution of internal inputs with external inputs can
also lead to fewer errors in decision-making and execution of production tasks, thus
causing firms to move closer to the production frontier. The estimated econometric
models predict that an increase in regional industry size is associated with increases
in both best-practice output and the level of technical efficiency for farms in that
region. This implies increasing positive externalities, possibly due to knowledge
spillovers and increased supply of specialized external inputs, when the regional
industry grows. An increase in regional farm density has a negative effect on
frontier output but has a positive effect on the level of technical efficiency. Overall,
the effect of increasing regional farm density on output is negative, implying that
negative congestion externalities associated with fish diseases dominate the positive
externalities associated with knowledge spillovers and the sharing of specialized
inputs.
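
One way to write down a frontier model in which the two external economy indexes affect both best-practice output and technical efficiency is sketched below. The notation and the distributional assumptions are illustrative and are not taken from Tveterås and Battese [133]: RS and RD denote regional industry size and regional farm density in region r, x is the vector of farm inputs, and the inefficiency term is assumed to be truncated normal with a mean that is linear in the two indexes.

```latex
\begin{aligned}
\ln y_{it} &= \ln f(x_{it}, RS_{rt}, RD_{rt}; \beta) + v_{it} - u_{it}, \\
v_{it} &\sim N(0, \sigma_v^2), \qquad
u_{it} \sim N^{+}\!\left(\delta_0 + \delta_1 RS_{rt} + \delta_2 RD_{rt},\; \sigma_u^2\right), \\
TE_{it} &= \exp(-u_{it}).
\end{aligned}
```

In such a specification, the effect of an index on frontier output comes from its role in f, while its effect on technical efficiency comes from the sign of the corresponding δ. A negative δ means that a larger index value lowers expected inefficiency, and the two channels can have opposite signs, as reported for regional farm density.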
Asche, Roll, and Tveteras [134] conducted an empirical study of Norwegian
salmon aquaculture where a translog profit function was estimated. This allows one
to account for revenue effects in addition to productivity and cost advantages, which
can be important particularly in high-cost clusters because it also enables firms to
cover higher costs. Profits are a measure that accounts for potentially increased
economic returns to firms’ equity capital owners from agglomeration and thus also
provide an indication of the economic incentives of capital owners to contribute to
the geographic concentration of production activities.
Asche, Roll, and Tveteras [134] also distinguish between agglomeration exter-
nalities within salmon aquaculture (Marshallian externalities) and agglomeration
externalities across sectors (Jacobs externalities). By using firm-level panel data,
they distinguish between inter- and intraindustry agglomeration effects on firm
profitability. The results support the presence of inter-industry effects related to the
size of other seafood sectors and the food-processing sector. A larger manufacturing
sector, measured by regional employment, is not found to have any effect on profit
in salmon production. This indicates that the agglomeration effect on aquaculture
from related industries is limited to firms in the food sector.
Regional agglomeration externalities have also been present in the aquaculture
industry in Bangladesh. Rahman, Nielsen, and Khan [138] examine the influence
of these externalities on productivity and efficiency applying a stochastic frontier
and inefficiency model using several indices: regional industry size, regional farm
density, distances between farms and the point of sale of fish, and the point of
purchase of fingerlings and feed. They find that an increase in regional industry
size is associated with an increase in frontier output. However, an increase in
regional industry size is also associated with a decrease in the level of technical
efficiency of the farms in that region. The authors explain the increase in frontier
output by access to specialized inputs and services, found also in other studies [132,
139], while the decrease in the level of technical efficiency can be explained
by the affordability of specialized products. In some places, and in particular for
small-scale farmers [140], the prices of specialized inputs preclude them from producing
at the best-practice frontier. An alternative explanation given by the authors is
that larger industries can supply more to local markets, promoting competition
and resulting in lower prices received by farmers in areas with higher industry
concentrations.
As in salmon farming [133], there might be some negative externalities
associated with biophysical congestion in the aquaculture industry in Bangladesh.
However, the positive effects on the level of technical efficiency and the elasticity
of mean output associated with learning-by-doing and knowledge spillovers outweigh
the negative effects. The extensive nature of pond farming in Bangladesh may be
the reason for this result. This contrasts with what is found in intensive salmon
farming. Spatial and biosecurity planning may be key to avoiding negative externalities
associated with intensification and agglomeration.
The results from Rahman, Nielsen, and Khan [138] also indicate distance-related
externalities in aquaculture production. For instance, points of sale are important
for farmers to sell their product in higher-priced markets, e.g., larger urban markets
[141], and for farmers to have access to quality inputs, e.g., quality fingerlings
that help avoid losses and overstocking. In order to exploit these opportunities,
farmers must organize themselves (e.g., in cooperatives) and increase their bargaining
power, which can lead to higher selling prices and better terms on purchases [142]. Without this
organization, these opportunities are in some places taken by middlemen [143].

Conclusion

A growing world population will need more healthy seafood. Productivity growth is
a main determinant of the expansion of seafood supply from aquaculture. This chap-
ter has provided evidence that many factors influencing productivity and efficiency
in aquaculture have been studied in the literature, using data from a wide range of
species and countries. Studies may not provide a clear roadmap for improvement
of productive performance, but they certainly give a menu of factors which deserve
attention from industry and policy makers aiming to increase economic welfare.
Both internal and external influences on productive performance have been
estimated in the literature. Internal factors include not only conventional inputs such
as feed, fingerlings, labor and capital, and scale of operation, but also farms’ human
capital, ownership and management structure, and access to credit. External influ-
ences can be both negative and positive; they include fish diseases, environmental
emissions, and agglomeration externalities. One strand of the literature also provides
evidence of production risk, which is partly influenced by farmers’ decision-making,
and that fish farmers are generally risk averse.
Overall, the empirical results provide evidence that to achieve the productive
potential of aquaculture sectors, several sources of market failure – such as
externalities, risk, and risk aversion – have to be taken into account, clearly
suggesting a pivotal role for government policies and regulation. A “tragedy of
the commons” in the form of biological and environmental externalities that is not
sufficiently mitigated by government policies can affect both aquaculture sectors’
own productivity and producer welfare, as well as other sectors of the economy.
Achieving internalization of the costs of external effects among aquaculture firms
is a necessary but far from trivial task for governments, which will require different
public regulations and incentives depending on species and production country.
Sustainable growth of aquaculture seafood supply depends on intensification of
production, i.e., producing more using the same or less area, and at the same time
maintaining high fish welfare and sufficiently mitigating biological and
environmental externalities.
The literature finds not only productivity growth over time in many sectors, but
also that it can be uneven and negative in periods, partly due to biological and
environmental shocks such as fish diseases. The presence of significant technical
inefficiency is estimated in many aquaculture sectors, as well as factors explaining
it. In the infant or early stages of aquaculture sectors, there were many “low-hanging
fruits” to be picked that could contribute to increasing productivity, both innovations
that shifted the production frontier and other measures that could reduce technical
inefficiency [144]. Examples are innovations in feed and its raw materials. As
sectors have matured, fewer low-hanging fruits are available, but there are still several
avenues to increasing productivity significantly.
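The distinction between innovations that shift the production frontier and measures that reduce technical inefficiency can be illustrated with a Malmquist-type decomposition, in which the productivity change index is the product of an efficiency change term and a technical change term. The Python sketch below is illustrative only: the function name and all numbers are hypothetical and are not taken from any study cited in this chapter.

```python
def decompose_tfp_change(te_start, te_end, frontier_shift):
    """Split a productivity change index into two multiplicative parts.

    te_start, te_end : technical efficiency scores in (0, 1] at the start
                       and end of the period (hypothetical inputs).
    frontier_shift   : technical change index, e.g. 1.05 for a 5% outward
                       shift of the production frontier (hypothetical).
    Returns (efficiency_change, technical_change, tfp_change).
    """
    for score in (te_start, te_end):
        if not 0.0 < score <= 1.0:
            raise ValueError("efficiency scores must lie in (0, 1]")
    if frontier_shift <= 0.0:
        raise ValueError("frontier_shift must be positive")

    efficiency_change = te_end / te_start   # catching up to the frontier
    technical_change = frontier_shift       # movement of the frontier itself
    tfp_change = efficiency_change * technical_change
    return efficiency_change, technical_change, tfp_change


if __name__ == "__main__":
    # Hypothetical farm: efficiency rises from 0.70 to 0.77 while the
    # frontier shifts outward by 5%.
    ec, tc, tfp = decompose_tfp_change(0.70, 0.77, 1.05)
    print(f"efficiency change: {ec:.3f}")
    print(f"technical change:  {tc:.3f}")
    print(f"TFP change index:  {tfp:.3f}")
```

In this hypothetical case most of the productivity gain comes from catching up to the frontier rather than from the frontier moving, which is the kind of pattern the "low-hanging fruit" argument describes for young sectors.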
One avenue is selective breeding, which is a central form of innovation to
increase productivity in aquaculture, as pointed out by Gjedrem et al. [145]. For
salmon and shrimp species, selective breeding has been essential in increasing
growth rates, reducing mortality, and increasing quality. Still, most species are at an
early stage with respect to selective breeding programs, with less than 10% of aqua-
culture production based on genetically improved stocks in 2012. There is scope for
increasing productivity significantly through genetic innovation, as the gains in bio-
logical growth rates can be over 10% for each new generation. Aquaculture gener-
ally trails far behind plant and farm animal industries in utilizing selective breeding
as a tool to improve biological productivity, according to Gjedrem et al. [145].
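To give a sense of scale for per-generation genetic gains, the short Python sketch below compounds a constant gain over several generations. It assumes, purely for illustration, a fixed 10% gain in every generation; realized gains vary by species and breeding program, and the function name is hypothetical.

```python
def cumulative_gain(gain_per_generation, generations):
    """Cumulative proportional gain after compounding a constant
    per-generation gain (an illustrative simplification)."""
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return (1.0 + gain_per_generation) ** generations - 1.0


if __name__ == "__main__":
    # Assumed constant 10% gain per generation (hypothetical scenario).
    for n in (1, 3, 5, 10):
        print(f"{n:>2} generations: {cumulative_gain(0.10, n):.1%}")
```

Under this assumption the growth rate is about 61% higher after five generations, which illustrates why sustained breeding programs can matter more than any single generation's improvement.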
A systemic challenge in maintaining sufficient productivity growth is to further
develop aquaculture innovation systems, consisting of aquaculture value chain
companies and supporting private and public institutions, including government
agencies, education, and R&D institutions [131]. In successful innovation systems,
the private aquaculture value chain itself may be financing and investing in inno-
vations in some technology areas, for example, feed and production equipment. In
some other knowledge and technology areas, funding is provided or organized
by government because innovation is more prone to market failure in the form
of private underinvestment. This may, for example, be related to some biological
and environmental externalities where firms are not able to appropriate sufficient
profits from investing in innovation processes. Global aquaculture can provide
examples of innovation systems with high research and innovation investments
and high innovation output. But there is no global blueprint for development of
aquaculture innovation systems that can provide more sustainable production and
higher productivity because of the huge variation in biological, environmental,
technological, and institutional characteristics across species and countries.

Cross-References

The Economics of Production in Marine Fisheries

References
1. FAO (2021) Fishery and Aquaculture Statistics. Global production by production source
1950–2019. FishStatJ: universal software for fishery statistical time series. www.fao.org/
fishery/statistics/software/fishstatj/en
2. Bostock J, McAndrew B, Richards R et al (2010) Aquaculture: global status and trends. Philos
Trans R Soc B Biol Sci 365:2897–2912. https://doi.org/10.1098/rstb.2010.0170
3. Bosma RH, Verdegem MCJ (2011) Sustainable aquaculture in ponds: principles, practices
and limits. Livest Sci 139:58–68. https://doi.org/10.1016/j.livsci.2011.03.017
4. Neiland AE, Soley N, Varley JB, Whitmarsh DJ (2001) Shrimp aquaculture: economic
perspectives for policy development. Mar Policy 25:265–279. https://doi.org/10.1016/S0308-
597X(01)00017-3
5. Asche F (2008) Farming the sea. Mar Resour Econ 23:527–547. https://doi.org/10.2307/
42629678
6. Asche F, Roll KH, Tveterås R (2012a) Innovations through the supply chain and increased
production – the case of aquaculture. In: Frick J, Laugen B (eds) APMS 2011, IFIP AICT
384. Springer, Berlin/Heidelberg, pp 604–612
7. Asche F, Roll KH, Tveterås R (2012b) Innovations and productivity performance in
salmon aquaculture. In: Frick J, Laugen B (eds) APMS 2011, IFIP AICT 384. Springer,
Berlin/Heidelberg, pp 613–620
8. Delgado CL, Wada N, Rosegrant MW et al (2003) Outlook for fish to 2020: meeting global
demand. IFPRI and WorldFish Center, Washington, DC/Penang
9. Kobayashi M, Msangi S, Batka M et al (2015) Fish to 2030: the role and opportunity
for aquaculture. Aquac Econ Manag 19:282–300. https://doi.org/10.1080/13657305.2015.
994240
10. Abate TG, Nielsen R, Tveterås R (2016) Stringency of environmental regulation and
aquaculture growth: a cross-country analysis. Aquac Econ Manag 20:201–221. https://doi.
org/10.1080/13657305.2016.1156191
11. Garlock T, Asche F, Anderson J et al (2020) A global blue revolution: aquaculture growth
across regions, species, and countries. Rev Fish Sci Aquac 28:107–116. https://doi.org/10.
1080/23308249.2019.1678111
12. Asche F, Bjørndal T (2011) The economics of salmon aquaculture, 2nd edn. Wiley-Blackwell,
Chichester
13. Llorente I, Luna L (2016) Bioeconomic modelling in aquaculture: an overview of the
literature. Aquac Int 24:931–948. https://doi.org/10.1007/s10499-015-9962-z
14. Allen GP, Botsford LW, Schurr AM, Johnston WE (1984) Bioeconomics of aquaculture.
Elsevier, Amsterdam
15. Leung P (1994) Bioeconomic modeling in aquaculture after two decades. In: Shang YC,
Leung PS, Lee CS, Su MSLI (eds) Socioeconomics of aquaculture. Tungkang Marine
Laboratory, Tungkang, pp 115–137
16. Cacho OJ (1997) Systems modelling and bioeconomic modelling in aquaculture. Aquac Econ
Manag 1:45–64. https://doi.org/10.1080/13657309709380202
17. Pomeroy R, Bravo-Ureta BE, Solís D, Johnston RJ (2008) Bioeconomic modelling and
salmon aquaculture: an overview of the literature. Int J Environ Pollut 33:485–500. https://
doi.org/10.1504/IJEP.2008.020574
18. Karp L, Sadeh A, Griffin WL (1986) Cycles in agricultural production: the case of
aquaculture. Am J Agric Econ 68:553–561. https://doi.org/10.2307/1241540
19. Leung PS, Shang YC (1989) Modeling prawn production management system: a
dynamic Markov decision approach. Agric Syst 29:5–20. https://doi.org/10.1016/0308-
521X(89)90067-X
20. Bjørndal T (1988) Optimal harvesting of farmed fish. Mar Resour Econ 5:139–159
21. Arnason R (1992) Optimal feeding schedules and harvesting time in aquaculture. Mar Resour
Econ 7:15–35
22. Heaps T (1993) The optimal feeding of farmed fish. Mar Resour Econ 8:89–99. https://doi.
org/10.1086/mre.8.2.42629053
23. Heaps T (1995) Density dependent growth and the culling of farmed fish. Mar Resour Econ
10:285–298
24. Mistiaen JA, Strand I (1998) Optimal feeding and harvest time for fish with weight-dependent
prices. Mar Resour Econ 13:231–246. https://doi.org/10.1086/mre.13.4.42629239
25. Yu R, Leung P (2006) Optimal partial harvesting schedule for aquaculture operations. Mar
Resour Econ 21:301–315
26. Guttormsen AG (2008) Faustmann in the sea: optimal rotation in aquaculture. Mar Resour
Econ 23:401–410. https://doi.org/10.1086/mre.23.4.42629671
27. Kumbhakar SC (2002a) Risk preference and productivity measurement under output price
uncertainty. Empir Econ 27:461–472. https://doi.org/10.1007/s001810100091
28. Kumbhakar SC (2002b) Risk preferences and technology: a joint analysis. Mar Resour Econ
17:77–89
29. Tveterås R (1999) Production risk and productivity growth: some findings for Norwegian
salmon aquaculture. J Prod Anal 12:161–179. https://doi.org/10.1023/A:1007863314751
30. Just RE, Pope RD (1978) Stochastic specification of production functions and economic
implications. J Econom 7:67–86
31. Asche F, Oglend A, Selland Kleppe T (2017) Price dynamics in biological production
processes exposed to environmental shocks. Am J Agric Econ 99:1246–1264. https://doi.org/
10.1093/ajae/aax048
32. Kumbhakar SC (2002c) Specification and estimation of production risk, risk preferences and
technical efficiency. Am J Agric Econ 84:8–22. https://doi.org/10.2307/1245020
33. Sharma KR, Leung P (1998) Technical efficiency of carp production in Nepal: an application
of stochastic frontier production function approach. Aquac Econ Manag 2:129–140. https://
doi.org/10.1080/13657309809380224
34. Iinuma M, Sharma KR, Leung P (1999) Technical efficiency of carp pond culture in peninsula
Malaysia: an application of stochastic production frontier and technical inefficiency model.
Aquaculture 175:199–213. https://doi.org/10.1080/13657300009380268
35. Sharma KR, Leung P, Chen H, Peterson A (1999) Economic efficiency and optimum stocking
densities in fish polyculture: an application of data envelopment analysis (DEA) to Chinese
fish farms. Aquaculture 180:207–221. https://doi.org/10.1016/S0044-8486(99)00202-1
36. Dey MM, Paraguas FJ, Bimbao GB, Regaspi PB (2000) Technical efficiency of tilapia
growout pond operations in the Philippines. Aquac Econ Manag 4:33–47. https://doi.org/10.
1080/13657300009380259
37. Karagiannis G, Katranidis SD, Tzouvelekas V (2000) Measuring technical, allocative and
cost efficiencies of seabass and seabream farms in Greece. Aquac Econ Manag 4:191–208.
https://doi.org/10.1080/13657300009380269
38. Sharma KR, Leung P (2000) Technical efficiency of carp pond culture in South Asia: an
application of a stochastic meta-production frontier model. Aquac Econ Manag 4:169–189.
https://doi.org/10.1080/13657300009380268
39. Awoyemi TT, Amao JO, Ehirim NC (2003) Technical efficiency in aquaculture in Oyo State,
Nigeria. Indian J Agric Econ 58:812–819
40. Irz X, McKenzie V (2003) Profitability and technical efficiency of aquaculture sys-
tems in Pampaanga, Philippines. Aquac Econ Manag 7:195–211. https://doi.org/10.1080/
13657300309380340
41. Ara LA, Alam MF, Rahman MM, Jabbar MA (2004) Yield gaps, production losses and
technical efficiency of selected groups of fish farmers in Bangladesh. Indian J Agric Econ
59:808–818
42. Chiang FS, Sun CH, Yu JM (2004) Technical efficiency analysis of milkfish (Chanos
chanos) production in Taiwan – an application of the stochastic frontier production function.
Aquaculture 230:99–116. https://doi.org/10.1016/j.aquaculture.2003.09.038
43. Martinez-Cordero FJ, Leung PS (2004) Sustainable aquaculture and producer performance:
measurement of environmentally adjusted productivity and efficiency of a sample of shrimp
farms in Mexico. Aquaculture 241:249–268. https://doi.org/10.1016/j.aquaculture.2004.07.
028
44. Dey MM, Paraguas FJ, Srichantuk N et al (2005) Technical efficiency of freshwater pond
polyculture production in selected Asian countries: estimation and implication. Aquac Econ
Manag 9:39–63. https://doi.org/10.1080/13657300590961528
45. Cinemre HA, Ceyhan V, Bozoǧlu M et al (2006) The cost efficiency of trout farms in the
Black Sea Region, Turkey. Aquaculture 251:324–332. https://doi.org/10.1016/j.aquaculture.
2005.06.016
46. Kaliba AR, Engle CR (2006) Productive efficiency of catfish farms in Chicot county,
Arkansas. Aquac Econ Manag 10:223–243. https://doi.org/10.1080/13657300600985413
47. Kaliba AR, Engle CR, Dorman L (2007) Efficiency change and technological progress in the
U.S. catfish-processing sector, 1986 to 2005. Aquac Econ Manag 11:53–72. https://doi.org/
10.1080/13657300701202718
48. Alam F, Murshed-e-Jahan K (2008) Resource allocation efficiency of the prawn-carp
farmers of Bangladesh. Aquac Econ Manag 12:188–206. https://doi.org/10.1080/
13657300802332976
49. Kareem RO, Aromolaran AB, Dipeolu AO (2009) Economic efficiency of fish farm-
ing in Ogun State, Nigeria. Aquac Econ Manag 13:39–52. https://doi.org/10.1080/
13657300802679145
50. Singh K, Dey MM, Rabbani AG et al (2009) Technical efficiency of freshwater aquaculture
and its determinants in Tripura, India. Agric Econ Res Rev 22:186–195
51. Chang HH, Boisvert RN, Hung LY (2010) Land subsidence, production efficiency, and the
decision of aquacultural firms in Taiwan to discontinue production. Ecol Econ 69:2448–2456.
https://doi.org/10.1016/j.ecolecon.2010.07.020
52. Nilsen OB (2010) Learning-by-doing or technological leapfrogging: production frontiers and
efficiency measurement in Norwegian salmon aquaculture. Aquac Econ Manag 14:97–119.
https://doi.org/10.1080/13657301003776649
53. Ogundari K, Akinbogun OO (2010) Modeling technical efficiency with production risk: a
study of fish farms in Nigeria. Mar Resour Econ 25:295–308
54. Onumah EE, Brümmer B, Hörstgen-Schwark G (2010) Elements which delimitate technical
efficiency of fish farms in Ghana. J World Aquacult Soc 41:506–518. https://doi.org/10.1111/
j.1749-7345.2010.00391.x
55. Alam F (2011) Measuring technical, allocative and cost efficiency of pangas (Pangasius
hypophthalmus: Sauvage 1878) fish farmers of Bangladesh. Aquac Res 42:1487–1500.
https://doi.org/10.1111/j.1365-2109.2010.02741.x
56. Nielsen R (2011) Green and technical efficient growth in Danish fresh water aquaculture.
Aquac Econ Manag 15:262–277. https://doi.org/10.1080/13657305.2011.624574
57. Pantzios CJ, Karagiannis G, Tzouvelekas V (2011) Parametric decomposition of the input-
oriented Malmquist productivity index: with an application to Greek aquaculture. J Prod Anal
36:21–31. https://doi.org/10.1007/s11123-010-0202-2
58. Alam MF, Khan MA, Huq ASMA (2012) Technical efficiency in tilapia farming of
Bangladesh: a stochastic frontier production approach. Aquac Int 20:619–634. https://doi.
org/10.1007/s10499-011-9491-3
59. Asche F, Roll KH (2013) Determinants of inefficiency in Norwegian salmon aquaculture.
Aquac Econ Manag 17:300–321. https://doi.org/10.1080/13657305.2013.812154
60. Asche F, Guttormsen AG, Nielsen R (2013) Future challenges for the maturing Norwegian
salmon aquaculture industry: an analysis of total factor productivity change from 1996 to
2008. Aquaculture 396–399:43–50. https://doi.org/10.1016/j.aquaculture.2013.02.015
61. Begum EA, Hossain MI, Papanagiotou E (2013) Technical efficiency of shrimp farming in
Bangladesh: an application of the stochastic production frontier approach. J World Aquacult
Soc 44:641–654. https://doi.org/10.1111/jwas.12062
62. Bukenya JO, Hyuha TS, Molnar J, Twinamasiko J (2013) Efficiency of resource use among
pond fish farmers in Central Uganda: a stochastic frontier production function approach.
Aquac Econ Manag 17:148–170. https://doi.org/10.1080/13657305.2013.772264
63. Arita S, Leung P (2014) A technical efficiency analysis of Hawaii’s aquaculture industry.
J World Aquacult Soc 45:312–321. https://doi.org/10.1111/jwas.12124
64. Nguyen KT, Fisher TCG (2014) Efficiency analysis and the effect of pollution on shrimp
farms in the Mekong River Delta. Aquac Econ Manag 18:325–343. https://doi.org/10.1080/
13657305.2014.959209
65. Schrobback P, Pascoe S, Coglan L (2014) Impacts of introduced aquaculture species on
markets for native marine aquaculture products: the case of edible oysters in Australia. Aquac
Econ Manag. https://doi.org/10.1080/13657305.2014.926465
66. Iliyasu A, Mohamed ZA (2015) Technical efficiency of tank culture systems in Peninsular
Malaysia: an application of Data Envelopment Analysis. Aquac Econ Manag 19:372–386.
https://doi.org/10.1080/13657305.2015.1082118
67. Iliyasu A, Mohamed ZA, Hashim M (2015) Productivity growth, technical change and effi-
ciency change of the Malaysian cage fish farming: an application of Malmquist Productivity
Index approach. Aquac Int 23:1013–1024. https://doi.org/10.1007/s10499-014-9860-9
68. Begum MEA, Nastis SA, Papanagiotou E (2016) Determinants of technical efficiency of
freshwater prawn farming in southwestern Bangladesh. J Agric Rural Dev Trop Subtrop
117:99–112
69. Iliyasu A, Mohamed ZA (2016) Evaluating contextual factors affecting the technical effi-
ciency of freshwater pond culture systems in Peninsular Malaysia: a two-stage DEA approach.
Aquac Rep 3:12–17. https://doi.org/10.1016/j.aqrep.2015.11.002
70. Sandvold HN (2016) Technical inefficiency, cost frontiers and learning-by-doing in Norwe-
gian farming of juvenile salmonids. Aquac Econ Manag 20:382–398. https://doi.org/10.1080/
13657305.2016.1224659
71. Anh Ngoc PT, Gaitán-Cremaschi D, Meuwissen MPM et al (2018) Technical inefficiency of
Vietnamese pangasius farming: a data envelopment analysis. Aquac Econ Manag 22:229–243.
https://doi.org/10.1080/13657305.2017.1399296
72. Nguyen LA, Pham TBV, Bosma R et al (2018) Impact of climate change on the technical
efficiency of striped catfish, Pangasianodon hypophthalmus, farming in the Mekong Delta,
Vietnam. J World Aquacult Soc 49:570–581. https://doi.org/10.1111/jwas.12488
73. Ton Nu Hai A, Bui Dung T, Speelman S (2018) Analyzing the variations in cost-efficiency
of marine cage lobster aquaculture in Vietnam: a two-stage bootstrap DEA approach. Aquac
Econ Manag 22:458–473. https://doi.org/10.1080/13657305.2018.1429032
74. Bayazid Y, Umetsu C, Hamasaki H, Miyanishi T (2019) Measuring the efficiency of col-
lective floodplain aquaculture of Bangladesh using Data Envelopment Analysis. Aquaculture
503:537–549. https://doi.org/10.1016/j.aquaculture.2019.01.007
75. Forleo MB, Romagnoli L, Fanelli RM et al (2019) Assessing the efficiency of the Italian
aquaculture firms. Aquac Econ Manag 23:382–409. https://doi.org/10.1080/13657305.2019.
1641569
76. Mitra S, Khan MA, Nielsen R (2019) Credit constraints and aquaculture productivity. Aquac
Econ Manag 23:410–427. https://doi.org/10.1080/13657305.2019.1641571
77. Rodrigues WS, Mauad JRC, Vogel E et al (2019) Sustainability and technical efficiency of
fish hatcheries in the state of Mato Grosso do Sul, Brazil. Aquaculture 500:228–236. https://
doi.org/10.1016/j.aquaculture.2018.10.024
78. Scuderi B, Chen X (2019) Production efficiency in New England’s oyster aquaculture
industry. Aquac Econ Manag 23:45–64. https://doi.org/10.1080/13657305.2018.1449272
79. Aponte FR (2020) Firm dispersion and total factor productivity: are Norwegian salmon
producers less efficient over time? Aquac Econ Manag 24:161–180. https://doi.org/10.1080/
13657305.2019.1677803
80. Aripin A, Coglan L, Pascoe S, Hoang VN (2020) Productive efficiency and capacity
utilization of sea bass grow-out culture in peninsular Malaysia. Aquac Econ Manag 24:102–
121. https://doi.org/10.1080/13657305.2019.1661045
81. Long LK, Van Thap L, Hoai NT (2020a) An application of data envelopment analysis with
the double bootstrapping technique to analyze cost and technical efficiency in aquaculture:
do credit constraints matter? Aquaculture 525:735290. https://doi.org/10.1016/j.aquaculture.
2020.735290
82. Long LK, Van Thap L, Hoai NT, Pham TTT (2020b) Data envelopment analysis for analyzing
technical efficiency in aquaculture: the bootstrap methods. Aquac Econ Manag 24:422–446.
https://doi.org/10.1080/13657305.2019.1710876
83. Mitra S, Khan MA, Nielsen R, Islam N (2020) Total factor productivity and technical
efficiency differences of aquaculture farmers in Bangladesh: do environmental characteristics
matter? J World Aquacult Soc 51:918–930. https://doi.org/10.1111/jwas.12666
84. Fernández Sánchez JL, Llorente García I, Luna M (2020) Technical efficiency of sea bass
and sea bream farming in the Mediterranean Sea by European firms: a stochastic production
frontier (SPF) approach. Aquac Econ Manag 24:526–539. https://doi.org/10.1080/13657305.2020.1840660
85. Gutiérrez E, Lozano S, Guillén J (2020) Efficiency data analysis in EU aquaculture produc-
tion. Aquaculture 520:734962
86. Ton Nu Hai A, Speelman S (2020) Economic-environmental trade-offs in marine aquaculture:
the case of lobster farming in Vietnam. Aquaculture 516:734593. https://doi.org/10.1016/j.
aquaculture.2019.734593
87. Ton Nu Hai A, Van Meensel J, Speelman S (2020) The factors influencing environmental
performance of marine aquaculture: a combined material balance-based and meta-frontier
approach. J Clean Prod 269:122342. https://doi.org/10.1016/j.jclepro.2020.122342
88. Long LK (2021) Cost efficiency analysis in aquaculture: data envelopment analysis with
a two-stage bootstrapping technique. Aquac Econ Manag 10:1–20. https://doi.org/10.1080/
13657305.2021.1896605
89. Nielsen R, Ankamah-Yeboah I, Llorente I (2021) Technical efficiency and environmental
impact of seabream and seabass farms. Aquac Econ Manag 25:106–125. https://doi.org/10.
1080/13657305.2020.1840662
90. Khan MA, Roll KH, Guttormsen A (2021) Profit efficiency of Pangas (Pangasius hypoph-
thalmus) pond fish farming in Bangladesh – the effect of farm size. Aquaculture 539:736662.
https://doi.org/10.1016/j.aquaculture.2021.736662
91. Hukom V, Nielsen R, Nielsen M (2021) Effects of co-management on technical efficiency
and environmental stressors: an application to small-scale shrimp polyculture in Indonesia.
Aquac Econ Manag 20:1–26. https://doi.org/10.1080/13657305.2021.1897190
92. Salvanes KG (1989) The structure of the Norwegian fish farming industry: an empirical
analysis of economies of scale and substitution possibilities. Mar Resour Econ 6:349–373.
https://doi.org/10.1086/mre.6.4.42628827
93. Salvanes KG (1993) Public regulation and production factor misallocation. A
restricted cost function for the Norwegian aquaculture industry. Mar Resour Econ 8:50–64
94. Bjørndal T, Salvanes KG (1995) Gains from deregulation? An empirical test for efficiency
gains in the Norwegian fish farming industry. J Agric Econ 46:113–126. https://doi.org/10.
1111/j.1477-9552.1995.tb00756.x
95. Parmeter CF, Wang HJ, Kumbhakar SC (2017) Nonparametric estimation of the determinants
of inefficiency. J Prod Anal 47:205–221. https://doi.org/10.1007/s11123-016-0479-x
96. Simar L, Wilson PW (2007) Estimation and inference in two-stage, semi-parametric models
of production processes. J Econom 136:31–64. https://doi.org/10.1016/j.jeconom.2005.07.009
97. Rahman MT, Nielsen R, Khan MA, Asmild M (2019) Efficiency and production environmen-
tal heterogeneity in aquaculture: a meta-frontier DEA approach. Aquaculture 509:140–148.
https://doi.org/10.1016/j.aquaculture.2019.05.002
98. Arquitt S, Xu H, Johnstone R (2005) A system dynamics analysis of boom and bust in the
shrimp aquaculture industry. Syst Dyn Rev 21:305–324. https://doi.org/10.1002/sdr.313
99. Asche F, Tveterås R (1999) Modeling production risk with a two-step procedure. J Agric
Resour Econ 24:424–439. https://doi.org/10.2307/40987031
100. Khan A, Guttormsen A, Roll KH (2018) Production risk of pangas (Pangasius hypophthal-
mus) fish farming. Aquac Econ Manag 22:192–208. https://doi.org/10.1080/13657305.2017.
1284941
101. Kumbhakar SC, Tveterås R (2003) Risk preferences, production risk and firm heterogeneity.
Scand J Econ 105:275–293. https://doi.org/10.1111/1467-9442.t01-1-00009
102. Kumbhakar SC, Tsionas EG (2009) Nonparametric estimation of production risk
and risk preference functions. Adv Econ 25:223–260. https://doi.org/10.1108/S0731-
9053(2009)0000025010
103. Sarker MAA, Arshad FM, Alam MF et al (2016) Stochastic modeling of production risk and
technical efficiency of Thai koi (Anabas testudineus) farming in Northern Bangladesh. Aquac
Econ Manag 20:165–184. https://doi.org/10.1080/13657305.2016.1156189
104. Little DC, Young JA, Zhang W et al (2018) Sustainable intensification of aquaculture value
chains between Asia and Europe: a framework for understanding impacts and challenges.
Aquaculture 493:338–354. https://doi.org/10.1016/j.aquaculture.2017.12.033
105. Naylor RL, Hardy RW, Bureau DP et al (2009) Feeding aquaculture in an era of finite
resources. Proc Natl Acad Sci U S A 106:15103–15110. https://doi.org/10.1073/pnas.
0905235106
106. Tacon AGJ, Metian M (2008) Global overview on the use of fish meal and fish oil in
industrially compounded aquafeeds: trends and future prospects. Aquaculture 285:146–158.
https://doi.org/10.1016/j.aquaculture.2008.08.015
107. Anh PT, Kroeze C, Bush SR, Mol APJ (2010) Water pollution by Pangasius production in the
Mekong Delta, Vietnam: causes and options for control. Aquac Res 42:108–128. https://doi.
org/10.1111/j.1365-2109.2010.02578.x
108. Liu Y, Sumaila UR (2010) Estimating pollution abatement costs of salmon aquaculture: a
joint production approach. Land Econ 86:569–584
109. Pincinato RBM, Asche F, Roll KH (2021) Escapees in salmon aquaculture: a multi-output
approach. Land Econ (forthcoming). https://doi.org/10.3368/wple.97.2.090419-0128R
110. Nielsen R (2012) Introducing individual transferable quotas on nitrogen in Danish fresh water
aquaculture: production and profitability gains. Ecol Econ 75:83–90. https://doi.org/10.1016/
j.ecolecon.2012.01.002
111. Aponte FR, Tveterås S (2019) On the drivers of cost changes in the Norwegian salmon
aquaculture sector: a decomposition of a flexible cost function from 2001 to 2014. Aquac
Econ Manag 23:276–291. https://doi.org/10.1080/13657305.2018.1551438
112. Aerni P (2004) Risk, regulation and innovation: the case of aquaculture and transgenic fish.
Aquat Sci 66:327–341. https://doi.org/10.1007/s00027-004-0715-8
113. Asche F, Guttormsen AG, Tveterås R (1999) Environmental problems, productivity and
innovations in Norwegian salmon aquaculture. Aquac Econ Manag 3:19–29. https://doi.org/
10.1080/13657309909380230
114. Asche F, Smith MD (2018) Induced innovation in fisheries and aquaculture. Food Policy
76:1–7
115. Kumar G, Engle CR (2016) Technological advances that led to growth of shrimp, salmon,
and tilapia farming. Rev Fish Sci Aquac 24:136–152. https://doi.org/10.1080/23308249.2015.
1112357
116. Sandvold HN, Tveterås R (2014) Innovation and productivity growth in Norwegian pro-
duction of juvenile salmonids. Aquac Econ Manag 18:149–168. https://doi.org/10.1080/
13657305.2014.903313
117. Asche F, Anderson JL, Botta R et al (2020) The economics of shrimp disease. J Invertebr
Pathol 2020:107397. https://doi.org/10.1016/j.jip.2020.107397
118. Iversen A, Asche F, Hermansen Ø, Nystøyl R (2020) Production cost and competitiveness
in major salmon farming countries 2003–2018. Aquaculture 522:735089. https://doi.org/10.
1016/j.aquaculture.2020.735089
119. Abolofia J, Asche F, Wilen JE (2017) The cost of lice: quantifying the impacts of parasitic
sea lice on farmed salmon. Mar Resour Econ 32:329–349. https://doi.org/10.1086/691981
120. Dresdner J, Chávez C, Quiroga M et al (2019) Impact of Caligus treatments on unit costs of
heterogeneous salmon farms in Chile. Aquac Econ Manag 23:1–27. https://doi.org/10.1080/
13657305.2018.1449271
121. Froehlich HE, Runge CA, Gentry RR et al (2018) Comparative terrestrial feed and land use
of an aquaculture-dominant world. Proc Natl Acad Sci U S A 115:5295–5300. https://doi.org/
10.1073/pnas.1801692115
122. Poore J, Nemecek T (2018) Reducing food’s environmental impacts through producers and
consumers. Science 360:987–992. https://doi.org/10.1126/science.aaq0216
123. Marshall A (1920) Principles of economics. Macmillan, London
124. Paci R, Usai S (1999) Externalities, knowledge spillovers and the spatial distribution of
innovation. GeoJournal 49:381–390. https://doi.org/10.1023/A:1007192313098
125. Rosenthal SS, Strange WC (2003) Geography, industrial organization, and agglomeration.
Rev Econ Stat 85:377–393
126. Jacobs J (1969) The economy of cities. Vintage Books, New York
127. Melo PC, Graham DJ, Noland RB (2009) A meta-analysis of estimates of urban agglom-
eration economies. Reg Sci Urban Econ 39:332–342. https://doi.org/10.1016/j.regsciurbeco.
2008.12.002
128. Carlsson B, Stankiewicz R (1991) Evolutionary economics. J Evol Econ 1(2):93–118. https://
doi.org/10.4324/9781315493053
129. Freeman C (1995) The “national system of innovation” in historical perspective. Camb J Econ
19:5–24. https://doi.org/10.1093/oxfordjournals.cje.a035309
130. Doloreux D, Isaksen A, Aslesen HW, Melançon Y (2009) A comparative study of the
aquaculture innovation systems in Quebec’s coastal region and Norway. Eur Plan Stud
17:963–981. https://doi.org/10.1080/09654310902949240
131. Bergesen O, Tveterås R (2019) Innovation in seafood value chains: the case of Norway. Aquac
Econ Manag 23:1–29. https://doi.org/10.1080/13657305.2019.1632391
132. Tveteras R (2002) Industrial agglomeration and production costs in Norwegian salmon
aquaculture. Mar Resour Econ 17:1–22
133. Tveteras R, Battese GE (2006) Agglomeration externalities, productivity, and technical
inefficiency. J Reg Sci 46:605–625. https://doi.org/10.1111/j.1467-9787.2006.00470.x
134. Asche F, Roll KH, Tveterås R (2016) Profiting from agglomeration? Evidence from the
salmon aquaculture industry. Reg Stud 50:1742–1754. https://doi.org/10.1080/00343404.
2015.1055460
135. Porter ME (2000) Location, competition, and economic development: local clusters in a
global economy. Econ Dev Q 14:15–34. https://doi.org/10.1177/089124240001400105
136. Jovanovic B, Rob R (1989) The growth and diffusion of knowledge. Rev Econ Stud 56:569–
582. https://doi.org/10.2307/2297501
137. Rauch JE (1993) Productivity gains from geographic concentration of human capital:
evidence from the cities. J Urban Econ 34:380–400
138. Rahman MT, Nielsen R, Khan MA (2019) Agglomeration externalities and technical effi-
ciency: an empirical application to the pond aquaculture of Pangas and Tilapia in Bangladesh.
Aquac Econ Manag 23:158–187. https://doi.org/10.1080/13657305.2018.1531948
139. Otsuka A, Goto M, Sueyoshi T (2010) Industrial agglomeration effects in Japan: productive
efficiency, market access, and public fiscal transfer. Pap Reg Sci 89:819–840. https://doi.org/
10.1111/j.1435-5957.2010.00286.x
140. Ahmed N (2007) Economics of aquaculture feeding practices: Bangladesh. In: Hasan MR
(ed) Economics of aquaculture feeding practices in selected Asian countries. FAO, Rome, pp
33–64
141. Hu C, Zhang X, Reardon T, Hernandez R (2019) Value-chain clusters and aquaculture
innovation in Bangladesh. Food Policy 83:310–326. https://doi.org/10.1016/j.foodpol.2017.
07.009
142. Khan MA (2012) Efficiency, risk and management of fisheries sector in Bangladesh. PhD
thesis, Norwegian University of Life Sciences
143. Islam MS (2008) From pond to plate: towards a twin-driven commodity chain in Bangladesh
shrimp aquaculture. Food Policy 33:209–223. https://doi.org/10.1016/j.foodpol.2007.10.002
144. Watanabe WO, Losordo TM, Fitzsimmons K, Hanley F (2002) Tilapia production systems
in the Americas: technological advances, trends, and challenges. Rev Fish Sci 10:465–498.
https://doi.org/10.1080/20026491051758
145. Gjedrem T, Robinson N, Rye M (2012) The importance of selective breeding in aquaculture
to meet future demands for animal protein: a review. Aquaculture 350–353:117–129. https://
doi.org/10.1016/j.aquaculture.2012.04.008
146. Long LK (2021) Cost Efficiency Analysis in Aquaculture: Data Envelopment Analysis with
a Two-Stage Bootstrapping Technique. Aquaculture Economics and Management 0(0):1–20.
https://doi.org/10.1080/13657305.2021.1896605
40 Productivity in Global Aquaculture 1561

147. Tveterås R (2000) Flexible Panel Data Models for Risky Production Technologies with an
Application to Salmon Aquaculture, Econometric Reviews 19:367–389
148. Zhou J, Parmeter CF, Kumbhakar SC (2020) Nonparametric estimation of the determinants
of inefficiency in the presence of firm heterogeneity. Eur J Oper Res 286:1142–1152. https://
doi.org/10.1016/j.ejor.2020.04.005
41 Benchmarking in the European Water Sector
Applications of Production Economics

Alan Horncastle, Joseph Duffy, Chien Xen Ng, and Peter Krupa

Contents
Introduction
Why Benchmarking Is Important in the Water Sector?
Benchmarking Techniques in Regulation: An Introduction
England and Wales: Cost Benchmarking Prior to PR14
The Structure of the Water Sector and Regulation in England and Wales
Ofwat’s Approach Prior to PR14
The Use of Ofwat’s Approach Elsewhere and a Change in Approach
England and Wales: Cost Benchmarking from PR14 Onwards
Ofwat’s Approach in PR14: A Change in Direction
Ofwat’s Approach to Cost Benchmarking in PR19
Definition of the Inputs
Identifying the Outputs and Other Drivers of Costs
Data Collection, Validation, and Consultation
Benchmarking
Forecasting Future Efficient Cost Levels
Wholesale Enhancement Expenditure
Northern Ireland

The authors would like to thank Julie Skovgaard Hansen and Emil Heesche (DCCA) and Laura
Brien (CRU) for helpful comments on respective sections; Subal Kumbhakar, and two anonymous
referees, for general comments and suggestions; Pierpaolo Perna for support on the approach in
Italy; Oxera’s efficiency team (including Charles Blake, Simona Castellini, Srini Parthasarathy and
Hannes Seidel), particularly on all the work undertaken as part of PR19, many insights from which
we have incorporated in the relevant sections; and Patricia Taylor for supporting us with all the
research.
Any views expressed in this chapter are solely those of the authors and not of Oxera.
A. Horncastle · J. Duffy · C. X. Ng · P. Krupa
Oxera Consulting, Oxford, UK
e-mail: [email protected]; [email protected]; [email protected];
[email protected]

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_42
Historical and Industry Context
Efficiency Benchmarking in Northern Ireland
Future Potential Changes in Approach
Scotland
Historical and Industry Context
Efficiency Benchmarking in Scotland
A Change in Direction
Ireland
Historical and Industry Context
Efficiency Benchmarking in Ireland
Denmark
Historical and Industry Context
Recent Regulatory Framework Changes
Efficiency Benchmarking in Denmark
Italy
Historical and Industry Context
Cost Benchmarking
Areas for Further Development
Input Definition: Modelled Expenditure
Input Definition: Accounting for the Investment Cycle
Output Definition: Multiple Outputs and Cost–Service Trade-Offs
Benchmarking: Input Requirement Functions
Benchmarking: Functional Form
Forecasting Efficient Costs: Identifying “Efficient” Cost Levels, While Accounting for Error and Heterogeneity, and Alternative Estimation Approaches
Forecasting Efficient Costs: The Consistency of Catch-Up, Frontier Shift and Input Price Inflation Assumptions
International Comparisons
Concluding Comments
Cross-References
References

Abstract

This chapter reviews the use of cost (and output) benchmarking within the
economic regulation of the water sector in the EU. Unlike in other utility sectors,
such as energy, economic regulation and benchmarking are not yet widespread
across the water sector. However, where cost benchmarking is used, we review
the approaches taken by water regulators, setting out the key steps in the process
and the issues that have been raised with the approaches taken. Given the
longer history of such benchmarking in England and Wales and the fact that
its framework has been used by other water regulators, we initially focus on
England and Wales before covering some other examples of cost benchmarking
across Europe. We finish by considering potential areas for future development.

Keywords

Regulation · Efficiency · Stochastic frontier models · Data envelopment analysis · Production · Cost
Introduction

The water sector is a natural monopoly. In the absence of competition, water compa-
nies may incur higher than efficient costs or provide a poorer quality of service. To
protect customers, these companies can be regulated, with regulators assessing what
cost reductions and/or service improvements are possible by comparing the relative
performance of companies. This process is called benchmarking. In this chapter,
we discuss the approaches that economic regulators have used to benchmark water
companies across Europe. The chapter is organized as follows.
In section “Why Benchmarking Is Important in the Water Sector?,” we set out
the importance of benchmarking in the water sector. In particular, the water sector
provides an essential service that has important impacts on both public health and
the environment, yet water is generally provided by natural monopolies, which, left
unchecked, may not produce optimal outcomes. As such, economic regulation is
often applied to the sector and, within such regulation, benchmarking is usually
applied.
In section “Benchmarking Techniques in Regulation: An Introduction,” we pro-
vide a brief overview of the different benchmarking techniques used by economic
regulators.
In the next seven sections, we examine the benchmarking approach undertaken
by those European regulators that implement benchmarking using econometric
modelling, such as ordinary least squares (OLS), random effects (RE) or stochastic
frontier analysis (SFA), or data envelopment analysis (DEA).
We first examine the approach of Ofwat, the regulator in England and Wales, from
privatization in 1989 up to PR09, the price review in 2009, in section “England
and Wales: Cost Benchmarking Prior to PR14,” before moving on to examine its
approach in PR14 and PR19 in section “England and Wales: Cost Benchmarking
from PR14 Onwards.” In all price control reviews, Ofwat has used econometric
modelling (either OLS or RE) of costs to benchmark the efficiency of the companies
it regulates. This review over a number of price controls is of interest as a number
of analytical and practical issues have been discussed and examined over time,
including in-depth reviews during appeals of the regulator’s final determinations.
As a result, Ofwat’s approach has evolved over time, although it has not used either
SFA or DEA as its primary analytical tool in any price control review. Quite closely
related are the approaches of the regulators in Northern Ireland, Scotland, and Ire-
land, which we cover more briefly in sections “Northern Ireland,” “Scotland,” and
“Ireland,” respectively. We note, however, that WICS, the regulator in Scotland, has
moved away from econometric benchmarking approaches entirely in recent years.
In section “Denmark,” we set out the approach of the DCCA, the competition
authority in Denmark with the remit to regulate water companies, which is based
on SFA and DEA. The setting for these models is a cross-sectional data set. Given
the relatively large number of varied water companies that are benchmarked, the
regulator pays particular attention to identifying and dealing with outliers. Finally,
in section “Italy,” we review the use by ARERA, the regulator in Italy, of panel SFA
models for setting efficient cost allowances for the period 2020–2023.
We conclude, in section “Areas for Further Development,” by drawing together
some of the key themes across these jurisdictions and examining potential areas
for future development for those jurisdictions that have already implemented cost
benchmarking and those that are considering introducing such benchmarking going
forward.

Why Benchmarking Is Important in the Water Sector?

As stated above, the water sector, in particular its network of pipes, is a natural
monopoly providing an essential service for customers. In areas of the value chain
outside of these network services, it has been possible to introduce competition. In
particular, some retail services have been separated from wholesale services and
opened up to competition in Scotland, England, and Wales. In England, a subset
of the activities required to connect new developments to the water network is also
open to competition, although the number of market participants varies regionally
(see Ofwat [89]). In England, the Department for Environment Food and Rural
Affairs [33] set out the UK Government’s strategic policy statement for the water
regulator, Ofwat, to promote further competitive markets in water resources and
bioresources. However, these contestable parts of the value chain represent only a
small proportion of costs in the sector and in other European countries even these
small parts of the value chain have not been opened up to competition. As such,
monopoly provision of water services remains, by far, the main form of supply
across Europe.
In contrast to the outcomes from competitive markets, monopolies may not
produce outputs at a price or service level that is optimal for consumers. Prices
can be too high and service performance too low.
In addition to the issues that would typically arise from a natural monopoly,
the water sector provides an essential service and impacts the lives of citizens in
a number of ways. The OECD [111] sets out the additional challenges faced by the
water and wastewater sector as:

• The responsibility of the sector to balance a range of economic, social, and
environmental interests in its activities
• The important externalities generated by the sector, in particular with regard to
public health, the economy, and the environment
• The generally fragmented nature of actors in the sector, both horizontally and
vertically

As such, some form of government control or economic regulation is often
applied in the sector. Such control can be in the form of legislation, direct public
ownership, or both. In either case, governments or local authorities directly control
outcomes and prices.
While each member state has its own national legislation governing the water
sector, there are also Europe-wide directives. A key directive for members of
the European Union is the EU Water Framework Directive (WFD); see European
Parliament and the Council [41]. This sets out a number of important objectives
for the sector, including environmental objectives as well as the principle of cost
recovery for water services in Article 9 and is applicable regardless of ownership
[41]. According to the European Commission [40], “a number of Member States
have upgraded their water pricing” but progress is still needed particularly as
increased investments are essential to meet the objectives of the WFD. In some
countries, prices do not cover all costs and general government funds are used to
subsidize services (e.g., Ireland and Greece).
It is not necessary for water networks to be publicly owned for government
or local authorities to exert control. Indeed, given that water networks require
significant investment to renew and maintain infrastructure (parts of which are
old and, in some cases, have been historically under-maintained) and comply with
environmental and other obligations, it is often desirable to attract investment from
private capital in some form. The participation of the private sector in the provision
of water services may also assist in cost reflectivity. Where private participation is
involved, different approaches can be taken to try to deliver better outcomes.
Concessionary contracts can be tendered – that is, competition for the market –
delegating management responsibility and (potentially) financing, while maintain-
ing public ownership and control over outcomes and prices. Alternatively, large
discrete capital schemes, such as Thames Tideway in the UK, can be tendered
out (see Ofwat [79]). That is, competition for the market can be introduced. The
competitive tendering process aims to achieve efficient delivery of services as
companies compete with each other for the opportunity to serve the market over
a specified period. As such, the objective is to secure the bidder offering a better
combination of outcomes and prices than competing bidders.
An alternative approach is to have private ownership, but to impose controls
either on returns to private capital only (i.e., “cost-plus” or rate of return regulation)
or on prices (including costs and returns) and outcomes (i.e., price or revenue caps).
In this chapter, we consider regulatory models which impose controls on prices
and outcomes, as a more relevant context for the application of cost benchmarking.
It should be noted that private ownership is not a prerequisite for such economic
regulation – indeed, many publicly owned companies are also subject to controls on
prices or outcomes – but economic regulation is considered a necessary condition
for the viability of private ownership, in the absence of a prescriptive concession
agreement.
In addition to environmental, safety, and water quality regulation, the aim of this
type of economic regulation is to achieve the optimal levels of price and outcomes
for customers, the environment and stakeholders as a whole. The “optimal” level
might, for example, reflect the priorities of society such as affordable billing for
customers in financial difficulty.
Figure 1 sets out when economic regulation (and implicitly benchmarking) may
be necessary in the context of private ownership, from the perspective of ensuring
that prices are fair and that service quality does not suffer. However, given the
critical importance of the water sector, even where competitive forces in the market
or for the market can be established and work well, government intervention to
control outcomes may still be necessary.

[Figure 1 (flow diagram): Can a competitive market be introduced, and will it deliver the best outcomes for the sector? If yes, then market forces may remove the need for regulation. If not, can competition for the market be introduced, and will it deliver the best outcomes for the sector? If yes, then market forces may remove the need for regulation. If not, then services can be delivered by a monopoly, subject to economic regulation, and regulation is essential to ensure prices are fair.]

Fig. 1 Is direct economic regulation necessary in the context of private ownership?

Independent economic regulators aim to achieve efficient delivery through
a series of incentives and cost and output benchmarking. Often this takes the
form of RPI-X regulation, whereby prices are allowed to increase in line with
general inflation less some factor reflecting, among other things, expected efficiency
improvements. Regulators often set prices for a fixed period of time, incentivizing
companies to outperform their allowance by improving their efficiency, while still
delivering the required outcomes. In order to determine these allowed prices or
revenues, regulators must determine the efficient cost level required to deliver a
number of outcomes over the period.
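
As a rough illustration of the RPI-X mechanism described above, the sketch below rolls a price cap forward over a control period. The function name, the simple additive form (1 + RPI − X), and all figures are illustrative assumptions rather than any regulator's actual formula or data.

```python
def rpi_x_price_path(initial_price, rpi_forecasts, x_factor):
    """Return the capped price for each year of a control period.

    Assumes the simple additive form P_t = P_{t-1} * (1 + RPI_t - X);
    real price controls typically include further adjustment terms.
    """
    prices = []
    price = initial_price
    for rpi in rpi_forecasts:
        price *= 1 + rpi - x_factor
        prices.append(price)
    return prices


# Illustrative numbers only: 3% inflation in each year and X = 1%.
path = rpi_x_price_path(100.0, [0.03, 0.03, 0.03], 0.01)
print([round(p, 2) for p in path])
```

Under these assumptions, prices rise in nominal terms but fall relative to general inflation whenever X is positive, which is the sense in which X embeds the expected efficiency improvement.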
The outcomes the firm is expected to achieve can be determined by the gov-
ernment, environmental regulators, consumers, and/or other stakeholders. Although
it is less straightforward to represent the views of consumers, consumer priorities
can be determined through willingness to pay surveys and/or representation by
a representative body. To inform all of these interactions, economic regulators
often undertake output or service performance benchmarking. This generally takes
the form of simple comparisons of metrics or ratios, such as the amount of
leakage per km length of mains, and the identification of some best practice
benchmark (for other approaches, including that of ARERA, see the discussion
below).
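
The simple ratio comparisons described above can be sketched as follows. The company names and figures are hypothetical, and taking the lowest observed ratio as the best practice benchmark is an assumption made for illustration; a regulator may choose a different reference point.

```python
def leakage_per_km(leakage_ml_per_day, mains_km):
    """Leakage normalised by mains length (Ml/day per km) for each company."""
    return {name: leakage_ml_per_day[name] / mains_km[name]
            for name in leakage_ml_per_day}


def gap_to_best_practice(metric_by_company):
    """Gap of each company to the lowest (best) observed value of the metric."""
    best = min(metric_by_company.values())
    return {name: value - best for name, value in metric_by_company.items()}


# Hypothetical companies and figures, for illustration only.
leakage = {"A": 120.0, "B": 90.0, "C": 200.0}
mains = {"A": 10000.0, "B": 6000.0, "C": 25000.0}

ratios = leakage_per_km(leakage, mains)
gaps = gap_to_best_practice(ratios)
best_company = min(ratios, key=ratios.get)
print(best_company, gaps)
```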
The approach taken to cost benchmarking is the most likely to involve the application
of theories and techniques from production economics. In order to determine
an efficient cost level, economic regulators often undertake cost benchmarking using
approaches such as econometric modelling or DEA. Here, regulators compare costs
between service providers, taking into account the outputs to be delivered, such
as water delivered or number of properties served, in order to establish, for each
company, what the efficient cost level should be.
This chapter reviews the use of cost (and output) benchmarking within the
economic regulation of the water sector. While the alternative models discussed –
direct government control, control through legislation, competition for the market
or competition in the market – all benefit from cost and output benchmarking, it is
most relevant to independent sector regulators setting prices. As such, applications
of benchmarking are most advanced where an independent sector regulator has been
established, and this chapter draws from such jurisdictions.
Which approach to control and form of benchmarking is most appropriate
depends on the structure of the sector in the particular country, which is often
fragmented (as noted above). Of the 27 EU member states, many are unregulated,
either based on local municipalities with thousands of local authority operators (e.g.,
Germany) or run on a concessions basis whereby private participation is secured
through a tendering process (e.g., France).
A relatively small number of regulators have implemented formal cost (or output)
benchmarking using either an econometric approach or DEA. In the remaining
sections of this chapter, we provide more details on the form of benchmarking
undertaken in some of these.

• England and Wales: A key focus for this chapter, Ofwat, the water regulator in
England and Wales, has used formal cost and output benchmarking from the
first price review in 1994 after the initial setting of price allowances following
privatization in 1989 [67, 151].
• Northern Ireland: Largely follows the framework developed by Ofwat, taking
into account the relevant characteristics of Northern Ireland Water, relative to
companies operating in England and Wales.
• Scotland: Historically adopted a similar benchmarking approach to Ofwat, in the
context of the unique operating characteristics faced by Scottish Water relative
to comparator companies in England and Wales. In recent years, WICS (the
regulator of Scottish Water) has moved away from using cost benchmarking
to determine allowed costs, with an alternative regulatory model to reduce
information asymmetry.
• Ireland: Largely follows the framework developed by Ofwat, taking into account
the relevant characteristics of Irish Water, relative to companies operating in
England and Wales.
• Denmark: The Danish Competition and Consumer Authority (DCCA) currently
undertakes cost benchmarking every 2 years, though this will move to every
4 years from 2022. DCCA uses both DEA and SFA to identify the potential for
efficiency improvements.
• Italy: In 2019, ARERA, the Italian water regulator, used a number of SFA models
for setting efficient cost allowances for the period 2020–2023.
Benchmarking Techniques in Regulation: An Introduction

In this section, we provide an overview of how cost (and output) benchmarking
methods may be applied in regulatory settings. The two main benchmarking
techniques used to estimate the cost frontier are econometric approaches (including
SFA) and DEA.
A stylized SFA model can be written as:

c = f(w, y) + v + u

where the dependent variable c is costs, f(w, y) is a cost function with input prices
w and outputs y, v is inefficiency (a one-sided term that raises costs above the frontier), and u is noise (a symmetric random error). In practice, input requirement
functions are also estimated, where cost drivers are used as explanatory variables
instead of outputs and input prices. For a detailed discussion of SFA models and
how they may be estimated, see Kumbhakar et al. [57].
Regulators have used various cost measures, including operating expenditure
(OPEX), capital expenditure (CAPEX), and total expenditure (TOTEX). The cost
drivers used by regulators include such factors as scale (such as number of properties
served), the density of area served by the water company, the topography of the
region, and the quality of the water abstracted or the required complexity of the
treatment process. We refer the reader to sections “Denmark” and “Italy” for exam-
ples of SFA models used by economic regulators, and, while the regulators discussed
in sections “England and Wales: Cost Benchmarking Prior to PR14,” “England and
Wales: Cost Benchmarking from PR14 Onwards,” “Northern Ireland,” “Scotland,”
and “Ireland” do not use SFA, the cost drivers discussed are also relevant.
Once the SFA model has been estimated, it can be used to estimate the efficient
cost levels for each company. However, in practice, each regulator uses the results
from estimated efficiency models in different ways. For example, ARERA, the
Italian water regulator, uses efficiency scores to place companies into clusters as an
input to determine cost sharing rates (see section “Italy”). Regulators, such as Ofwat,
that use econometric modelling (but not SFA) need to identify a benchmark in order
to estimate efficient cost levels. This is achieved by making ad hoc assumptions
such as using the upper quartile as the benchmark. For further details, see sections
“England and Wales: Cost Benchmarking Prior to PR14” and “England and Wales:
Cost Benchmarking from PR14 Onwards.”
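
To make the upper quartile assumption concrete, the sketch below fits a log-linear cost model by OLS on a single cost driver, expresses each company's actual cost as a ratio of its modelled cost, and sets the benchmark at the upper quartile of those ratios. The single-driver specification, the function names, and the data are illustrative assumptions and do not reproduce Ofwat's models.

```python
import math
from statistics import mean, quantiles


def fit_log_linear(costs, drivers):
    """OLS of ln(cost) on ln(driver) with an intercept (one regressor)."""
    x = [math.log(d) for d in drivers]
    y = [math.log(c) for c in costs]
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def upper_quartile_allowances(costs, drivers):
    """Return cost ratios, the upper quartile ratio, and efficient costs."""
    a, b = fit_log_linear(costs, drivers)
    predicted = [math.exp(a + b * math.log(d)) for d in drivers]
    # Ratio below 1 means the company spends less than the model predicts.
    ratios = [c / p for c, p in zip(costs, predicted)]
    # The most efficient quarter of companies lies at or below this ratio.
    uq_ratio = quantiles(ratios, n=4, method="inclusive")[0]
    allowances = [p * uq_ratio for p in predicted]
    return ratios, uq_ratio, allowances


# Hypothetical data: cost and a scale driver (e.g., properties served).
costs = [100.0, 220.0, 290.0, 450.0, 480.0]
drivers = [10.0, 20.0, 30.0, 40.0, 50.0]
ratios, uq_ratio, allowances = upper_quartile_allowances(costs, drivers)
print(round(uq_ratio, 3), [round(a, 1) for a in allowances])
```

The choice of the 25th percentile of the cost ratio reflects the ad hoc nature of the benchmark: a stricter or more lenient percentile could be substituted without changing the rest of the calculation.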
Cost frontiers may also be estimated using DEA. In a regulatory setting, building a DEA
model involves specifying a set of inputs and outputs, and the model is typically
input oriented. For a detailed discussion of DEA, see Thanassoulis [144]. The inputs
and outputs can be similar to those used in the econometric models discussed above.
For example, the DCCA, the Danish water regulator, uses TOTEX as the input and
composite “grid” variables formed by weighting assets as the outputs (see section
“Denmark”).
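
A minimal sketch of an input-oriented DEA score is given below for the special case of one input and one output under constant returns to scale, where the score reduces to each company's output/input ratio relative to the best ratio in the sample. This is a simplification for illustration: with several outputs, as in the DCCA's models, each score would instead be obtained by solving a linear program. The data and names are hypothetical.

```python
def dea_crs_single_io(inputs, outputs):
    """Input-oriented DEA efficiency scores under constant returns to scale,
    for the special case of one input and one output.

    In this case the score of company i equals its output/input ratio
    divided by the highest output/input ratio in the sample.
    """
    productivity = [y / x for x, y in zip(inputs, outputs)]
    best = max(productivity)
    return [p / best for p in productivity]


# Hypothetical data: TOTEX as the input and one composite output measure.
totex = [100.0, 120.0, 90.0]
output = [50.0, 48.0, 45.0]

scores = dea_crs_single_io(totex, output)
# Efficient input level: the input the frontier implies for the same output.
efficient_totex = [s * x for s, x in zip(scores, totex)]
print(scores, efficient_totex)
```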
Once the DEA model has been estimated, it can be used to estimate the efficient
cost levels for each company. The DCCA, in deriving a final efficiency estimate for
each company, uses the most favorable estimated efficiency score for each company
from its SFA and DEA modelling, with the resultant cost reduction capped at 2%
p.a. (see section “Denmark”).
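
The combination rule described above can be sketched as follows. The sketch assumes efficiency scores between 0 and 1 and, purely for illustration, treats one minus the more favorable score as the cost reduction before the cap is applied; how the DCCA actually converts scores into annual requirements is not specified here.

```python
def capped_annual_requirement(sfa_score, dea_score, annual_cap=0.02):
    """Pick the more favorable of two efficiency scores and cap the
    implied cost reduction.

    Scores are in (0, 1], with 1 meaning fully efficient. Treating
    (1 - score) as the reduction before the cap is an assumption
    made for this sketch.
    """
    best_score = max(sfa_score, dea_score)
    implied_reduction = 1.0 - best_score
    return min(implied_reduction, annual_cap)


# Hypothetical scores for three companies: (SFA score, DEA score).
for sfa, dea in [(0.85, 0.90), (0.99, 0.97), (1.00, 0.80)]:
    print(sfa, dea, round(capped_annual_requirement(sfa, dea), 4))
```

Taking the more favorable score gives each company the benefit of the doubt between the two methods, and the cap limits the requirement for companies that both models rank as far from the frontier.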

England and Wales: Cost Benchmarking Prior to PR14

In this section, we review the cost benchmarking approaches used in the England
and Wales water sector in price controls up to PR09 (i.e., from privatization,
in 1989, up to 2015). The regulation of this water sector is perhaps more
developed and longer standing than in many other European countries. In addition,
other regulators in Great Britain and Ireland have often used Ofwat’s approach
to benchmark their regulated water company against those in England and Wales.
As such, we set out a relatively detailed history of the approach used in England
and Wales, and focus on differences from this in the sections that follow, particularly for
regulatory regimes in Great Britain and Ireland.

The Structure of the Water Sector and Regulation in England and Wales

Water services are provided by privately owned companies in England. Glas Cymru,
the owner of the majority provider of water services in Wales (Dwr Cymru or Welsh
Water), is a company limited by guarantee, with no shareholders, and, as such, any
financial surpluses are retained for the benefit of its customers.
Most water and sewerage companies are regional monopolies. Competition was
introduced for the water supply of large non-domestic customers (above 50 Ml/year)
in England in 2003 (see Ofwat and the Department for Environment, Food and Rural Affairs [109]),
although domestic customers and small non-domestic customers could still not
choose or switch their supplier. Over time, the size threshold for non-domestic
customers able to choose or switch their supplier has fallen, with the entire market
for non-domestic customers opened in 2017 [84].
As such, the industry is regulated by an independent economic regulator,
Ofwat, to ensure better outcomes for customers. Of key relevance to production
economics – and cost and quality of service benchmarking – is that one of
Ofwat’s duties is to “promote economy and efficiency by water companies” (see
Water Industry Act [152] as amended, Sect. 2). As such, as part of its price
control reviews, Ofwat has always undertaken cost and outcomes benchmarking in
some form.
Ofwat’s approach to quality of service benchmarking has taken the form of
comparative metrics or key performance indicators (KPIs), separate from cost
benchmarking. Therefore, in this section, we focus on Ofwat’s approach to cost
benchmarking.

Ofwat’s Approach Prior to PR14

We start with a review of past price control reviews in this section and then examine
Ofwat’s approach to cost benchmarking in PR14 and PR19, the price control
reviews for the periods 2015–2019 and 2020–2024, in section “England and Wales:
Cost Benchmarking from PR14 Onwards.”

Modelled Costs
Ofwat considered expenditure within three categories:

• Operating expenditure – regular day-to-day expenditure required to deliver water
and wastewater services (including labor, energy, and chemicals)
• Capital maintenance – expenditure incurred to maintain the long-term capability
of the existing asset base, such as restoring a damaged water main
• Capital enhancement – expenditure required to make additions to the asset base and
enhance service, such as construction of a new water main or modification to a
treatment works to improve water quality

Up to the price control review in 2009, PR09, Ofwat modelled operating expen-
diture and capital expenditure separately (for example, see Ofwat [75]). Operating
expenditure was modelled using econometric models; capital maintenance was
assessed using both econometric modelling and unit cost comparisons; and capital
enhancement was assessed using unit cost comparisons only. Unit cost comparisons
for capital maintenance and enhancement were set out in the “cost base report” [25,
72, 74].
In the research commissioned by Ofwat for the 1994 price control review,
PR94, Stewart ([136], p. 1) stated that “in principle economic efficiency requires
companies to minimise costs in total and hence this variable [total expenditure]
should be the focus of attention. However, there are problems with the definition
of total costs in the water industry. Current cost operating profit reflects the return
actually earned, rather than the cost of capital [ . . . ] profits in the water industry
reflect to a considerable extent both historical factors and future investment needs
[ . . . ] An alternative approach would attempt to derive capital costs by applying
the cost of capital to (current cost, depreciated) asset values. However, there are
difficulties of both principle and practice in doing this” [primarily measurement and
consistency issues].
Given the now much longer time series of data available, the cost definition
has been reexamined by Ofwat in more recent price control reviews (see section
“England and Wales: Cost Benchmarking from PR14 Onwards”).

Model Specifications
In order to benchmark water companies’ costs on a like-for-like basis, Ofwat
developed a series of input requirement functions, whereby costs are explained using
a number of cost drivers. For further discussion on input requirement functions, see
Diewert [34], Kumbhakar and Heshmati [54], and Kumbhakar et al. ([57], Sect. 2.7).
This was applied instead of a cost function approach, whereby costs are a function
of outputs and input prices, see Kumbhakar et al. ([57], Sect. 4.2).
As set out in Ofwat [68], the functional forms of its models were either linear or
log-log, and Ofwat modelled costs at a functional level and an overall service level
(for water services).
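As a sketch of the two functional forms mentioned here, let C_i denote modelled cost for company i, x_{ki} its k-th cost driver, and u_i the residual. This notation is assumed for illustration and is not taken from Ofwat [68].

$$
\text{linear:}\quad C_i = a + \sum_k b_k\,x_{ki} + u_i,
\qquad
\text{log-log:}\quad \ln C_i = a + \sum_k b_k \ln x_{ki} + u_i
$$

In the log-log form each b_k can be read as an elasticity: a 1% increase in driver k is associated with an approximate b_k% change in cost. Neither form contains input prices, which is what separates these input requirement functions from a cost function.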
For water services, prior to subsequent mergers, Ofwat had 32 observations
(10 WASCs and 22 WOCs) and modelled expenditure on overall water services,
business activities, resources and treatment, and distribution. Ofwat subsequently
also introduced separate models of power costs, stating that “many companies
have made savings on power expenditure and it was considered that it might be
useful to look at this cost separately. Another reason is that the allocation of power
expenditure between distribution and resources and treatment can be problematic” –
see Ofwat ([69], Sect. 2.1.3).
For wastewater, Ofwat had fewer observations (only ten WASCs) and thus
used a combination of econometric models (at a works or area level) and unit
cost models. Ofwat used the following models for wastewater: business activities
(unit cost model), small treatment works (unit cost model), large treatment works
(econometric model at works level), sewerage area (econometric model at an area
level), and sludge treatment and disposal.
Ofwat estimated the models using cross-sectional data, focusing on econometric
(OLS) modelling to determine efficient expenditure. However, Ofwat also under-
took or commissioned studies using DEA and SFA during various price control
reviews. In PR94, for example, Ofwat used DEA results (see section “Forecasting
Efficient Costs: Identifying “Efficient” Cost Levels, While Accounting for Error
and Heterogeneity, and Alternative Estimation Approaches”) as a form of cross-
check on its econometric results, improving a company’s outcome if the DEA result
was significantly better than the OLS result – see the Monopolies and Mergers
Commission (MMC [59], p. 415).

Establishing Efficient Costs

In Stewart [137], the accuracy of the estimated residuals was also examined (using
both SFA and estimating the confidence intervals around the OLS residuals),
identifying some companies whose costs were significantly above, at or below
average. Stewart ([136], p. 26) stated, “as a final word of caution, it should be
reiterated that the inefficiency of a company is an inherently ‘residual’ concept.
The accuracy of the measures derived depends on the extent to which we are able to
control for all relevant cost drivers.”
From PR94 until PR09, to establish future efficient costs, Ofwat first identified
a benchmark using a number of criteria (which themselves changed slightly over
time). For example, the criteria used in PR09 were summarized in a subsequent
report by the Competition Commission ([28], Appendix F, para 6) and included:
Ofwat must have no concern about the company’s data or the independence of
the company’s data; the company must have no unusual exogenous characteristics
which significantly reduce its costs; and the company’s turnover must represent a
reasonable proportion of the industry. A “reasonable proportion” was defined as the
size of the smallest WASC, around 2.5–3% of water services turnover.
Having established the cost efficiency benchmark, Ofwat categorized companies
into various efficiency bands. For example, Ofwat [67] explains that, in PR94, three
bands were used – “more efficient,” “average,” and “less efficient.” This seems
to have been based on the analysis by Stewart [137–138], discussed above. In
later reviews, the differentiation was extended and bands A–E were used, starting
with Band A representing companies estimated to be 0–5% from the benchmark,
and these were subsequently further subdivided into upper and lower bands (see
Competition Commission [27], Appendix K). For setting cost allowances, Ofwat
then took the mid-point of each half band as representative of all companies in that
half band.
Ofwat then applied a percentage catch-up assumption rather than requiring
companies to fully catch-up to its estimated benchmark. That is, for the next price
control period, companies were allowed their current cost level less a percentage
reduction based on catching up by a certain percentage to the benchmark. In PR94,
for example, Ofwat [67] set this at around half of the OPEX efficiency gap, while in
PR09 Ofwat set this at 60% (see Ofwat ([74], p. 107) and Competition Commission
[27]). In addition, a “glide path” was used, whereby companies had to achieve this
catch up over a number of years (generally 5 years for OPEX, while for CAPEX,
the timeframe altered between price control reviews).
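The catch-up and glide-path arithmetic described above can be sketched as follows. The company figures are hypothetical, and the equal annual steps are an assumption: the text says only that the catch-up was achieved over a number of years.

```python
def glide_path_allowances(current_cost, benchmark_cost, catch_up_share, years):
    """Return yearly cost allowances that close part of the gap to the benchmark.

    Assumption: the required reduction is phased in equal annual steps.
    """
    gap = current_cost - benchmark_cost        # efficiency gap in cost units
    total_reduction = catch_up_share * gap     # share of the gap to be closed
    step = total_reduction / years             # equal reduction in each year
    return [current_cost - step * (t + 1) for t in range(years)]


# Hypothetical company: cost of 100, benchmark of 80, 60% catch-up over 5 years.
allowances = glide_path_allowances(100.0, 80.0, 0.60, 5)
print([round(a, 2) for a in allowances])
```

With a 60% catch-up the company ends the period 12 cost units lower rather than the full 20, so any saving beyond that level is retained for the remainder of the control period.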
Ofwat’s reasoning, as reported in Competition Commission ([26], Appendix 4.1,
para. 19), was initially that “this cautious approach reflects the possibility of
errors in the DGWS’s work, the difficulty of identifying the efficiency frontier,
and the need to offer an incentive to outperform” [emphasis added]. Subsequently,
in PR04, Ofwat [70] emphasized the incentive properties of the approach and, in
particular, its “carrot and stick,” whereby the catch-up target was a “stick” for
companies to improve efficiency, while the remaining gap provided a “carrot,” as
any efficiency gains achieved over and above this were kept by the company for
the duration of the price control period. Separately, modelling or data errors were
accounted for through an explicit adjustment to the estimated residuals –
a 10% adjustment to the residuals in water and a 20% adjustment in sewerage, see
Ofwat ([70], p. 154).
In addition to catching up to the estimated benchmark, Ofwat would also set
companies targets to reduce their costs further through technological progress. As
with the catch-up element, this was based on a proportion of what Ofwat considered
could be achievable through frontier shift. The frontier shift assumption itself was
determined through an exercise separate from the relative efficiency assessment,
examining productivity in the UK economy. For example, see Ofwat ([67], p. 31).
As well as periodically setting future efficient cost allowances at price control
reviews, Ofwat also published annual reports on companies’ cost efficiency. This
gave companies a reputational incentive to improve their efficiency. Having worked
with companies and discussed this issue, the authors are aware that such reporting
of efficiency rankings did indeed have strong incentive properties.

The Use of Ofwat’s Approach Elsewhere and a Change in Approach

Following PR09, Ofwat changed its overall regulatory approach significantly,
moving to a TOTEX regime in line with regulatory developments at other network
regulators in Great Britain, such as the energy regulator, Ofgem (see Ofgem [63]).
The implications of this change in approach for cost benchmarking are set out in
section “England and Wales: Cost Benchmarking from PR14 Onwards.”
However, other water regulators in Great Britain and Ireland have used or still use
an approach similar to Ofwat’s PR09 approach (see sections “England and Wales:
Cost Benchmarking from PR14 Onwards,” “Northern Ireland,” and “Scotland”),
although some have also indicated a need to change (for example, see UR [147],
p. 22) or no longer use cost benchmarking to set allowed expenditure (see WICS
[156], p. 8).

England and Wales: Cost Benchmarking from PR14 Onwards

In PR14, Ofwat amended its approach to quality of service, introducing so-called
“performance commitments” (or targets for quality of service performance)
and associated outcome delivery incentives. These were developed following
extensive customer engagement, willingness-to-pay surveys, and evidence from
companies. At PR19, this approach was broadened to include a number of “common
performance commitments” and many more company-specific performance
commitments for each company. For details, see Ofwat [90]. Because any comparative
analysis is based on simple metrics, we do not cover this benchmarking
further.
In this section, we review the cost benchmarking approach used by Ofwat during
PR14 and PR19, the price control reviews for the periods 2015–2019 [77] and
2020–2024 [104], respectively. From the perspective of benchmarking, these price
controls can be characterized by Ofwat moving to a TOTEX approach, whereby
the regulator seeks to remove “any undesirable incentives for companies to seek
capital expenditure-intensive solutions where there may be better alternatives,”
Ofwat ([77], p. 5). This is achieved by modelling operating expenditure and capital
expenditure together, as well as changing other parts of the regulatory regime
outside the scope of this chapter.

Ofwat’s Approach in PR14: A Change in Direction

In recent price reviews, starting with PR14, Ofwat changed its cost assessment
framework, which, until that point, had been relatively unchanged since 1994. Ofwat
[77, 78] and CEPA [13] set out the key changes, including:

• Modelling total expenditure (TOTEX), given a perception that, under the previous
regulatory approach of modelling costs separately, companies were incentivized
towards a capital expenditure bias – see Ofwat [76] and Ofwat ([77], Sect.
5.2) for a discussion on this possible bias and Ofwat’s response. In particular,
Ofwat considered that the structure of incentives within its regulatory framework
was such that underperformance (i.e., spending more than expected) was less
costly for solutions weighted towards capital expenditure than for those with
a greater proportion of operating expenditure. See section “Input Definition:
Modelled Expenditure” for a discussion on TOTEX modelling.
• The introduction of more flexible functional forms, namely partial translog
models (see sections “Functional Form, Economies of Scale, Size, and Scope” and
“Benchmarking: Functional Form”).
• Use of panel data (i.e., using data across the various regulated companies and
over time). As shown in Kumbhakar and Horncastle [55], moving to a panel data
approach for modelling costs in the England and Wales water sector can result
in “a considerable increase in precision” of the modelling compared to Ofwat’s
previous cross-sectional modelling approach.
• Use of both pooled OLS and RE estimation approaches (see sections “Model
Development and Model Selection,” “Wholesale Enhancement Expenditure,” and
“Forecasting Efficient Costs: Identifying “Efficient” Cost Levels, While Accounting
for Error and Heterogeneity, and Alternative Estimation Approaches”).
• In PR14, modelling water services at the water service level and, in PR19,
modelling across the value chain (a slightly different activity split compared to
the previously used functions) and at the aggregate level (see section “Definition
of the Inputs”).
• Using the upper quartile (UQ) as the benchmark (see sections “Forecasting
Future Efficient Cost Levels,” 4.8, and “Forecasting Efficient Costs: Identifying
“Efficient” Cost Levels, While Accounting for Error and Heterogeneity, and
Alternative Estimation Approaches”).
• Using forecasts of cost drivers (from companies’ or Ofwat’s own assumptions)
to forecast efficient costs, rather than setting percentage cost reductions from
companies’ own current cost levels (see section “Forecasting Future Efficient
Cost Levels”).
• No longer publishing annual efficiency reports.
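To illustrate how an upper quartile benchmark combines with forecast cost drivers, the following is a minimal sketch. The company data, the actual-over-predicted efficiency score, and the interpolation rule for the quartile are all assumptions made for illustration, not Ofwat’s published calculation.

```python
def percentile(values, q):
    """Linear-interpolation percentile of `values` for q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)       # fractional index into the sorted list
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


# Hypothetical historical data: actual and model-predicted cost per company.
actual = {"A": 95.0, "B": 110.0, "C": 100.0, "D": 120.0, "E": 90.0}
predicted = {"A": 100.0, "B": 100.0, "C": 100.0, "D": 100.0, "E": 100.0}

# Efficiency score: actual over predicted cost, so lower means more efficient.
scores = {name: actual[name] / predicted[name] for name in actual}

# Upper quartile benchmark: the 25th percentile of the scores.
uq_score = percentile(scores.values(), 0.25)

# Efficient allowance: cost predicted from forecast drivers, scaled to the UQ.
forecast_predicted_cost = 105.0        # hypothetical prediction for one company
allowance = forecast_predicted_cost * uq_score
print(uq_score, allowance)
```

Because the allowance is built from predicted costs rather than from the company’s own current spending, a company that already spends less than its allowance keeps the difference, whereas the earlier approach applied a percentage reduction to each company’s own cost level.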

However, Ofwat’s cost assessment modelling in PR14 was criticized by Bristol
Water and the Competition and Markets Authority (CMA) in Bristol Water’s appeal
of PR14 (see Bristol Water [10], Sect. 11, and CMA [23]). Areas of criticism
included:

• Level of aggregation – In PR14, Ofwat did not undertake cost modelling below
the aggregate water service level (though Ofwat did separately model network
and treatment costs for wastewater services). The CMA considered that it was
“ambitious to seek to model the entire wholesale water business through this
type of high-level econometric model, which may fail to take proper account of
the wide range of factors that affect companies’ expenditure requirements.”
• Investment timing – The CMA noted that companies’ investment requirements
vary over time and thus differences between companies in total cash expenditure
may be reflective of differences in their investment requirements and not
efficiency.
• TOTEX models – Ofwat’s models included enhancement expenditure. The CMA
considered that there were likely to be substantial differences between water
companies, and over time, in enhancement expenditure requirements, which did
not seem to be sufficiently taken account of in Ofwat’s models.
• The CMA considered that the estimated coefficients were counterintuitive in
some cases and some specified relationships did not make sense (e.g., taking
logarithms of variables expressed as proportions).
• The CMA considered that some models used a relatively large number of
explanatory variables compared to the sample size (e.g., one model had 27
explanatory variables and a sample size of 90 observations).
• Translog models – Given the relatively small sample size, the CMA considered
that the translog structure seemed “overly ambitious” and had, in practice,
“compromised the results” as, for example, one of Ofwat’s models implied a
form of diseconomies of scale which the CMA found counterintuitive.
• Upper quartile benchmark – The CMA report considered that using an upper
quartile benchmark could be overly demanding and instead used an average
benchmark. This, it stated, was a judgment in light of the issues it had identified
both in its review of Ofwat’s econometric models and from its development of
alternative models.
• Future cost prediction – Ofwat’s econometric models included a time trend. Its
future expenditure allowances took account of these time trends, implying an
annual change in costs of around RPI + 0.4% p.a. The CMA considered this
“overly generous.” Instead, the CMA applied a cost trend of RPI–1% p.a., to
capture the impact of input price inflation and productivity improvements.
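The difference between the two cost trends compounds over a control period. The sketch below reads “RPI + 0.4%” and “RPI–1%” as annual changes in costs relative to RPI and assumes a five-year horizon with annual compounding; both are illustrative assumptions.

```python
def real_cost_index(annual_real_change, years):
    """Cost index relative to RPI (starting from 1.0) at the end of each year."""
    return [(1.0 + annual_real_change) ** t for t in range(1, years + 1)]


ofwat_path = real_cost_index(0.004, 5)   # RPI + 0.4% p.a. (time trend in the models)
cma_path = real_cost_index(-0.01, 5)     # RPI - 1% p.a. (trend applied by the CMA)

# Gap between the two trends in the final year of the assumed period.
gap = ofwat_path[-1] - cma_path[-1]
print(round(ofwat_path[-1], 3), round(cma_path[-1], 3), round(gap, 3))
```

Under these assumptions, allowed costs end the period about 2% above an RPI-indexed level on the first trend and about 5% below it on the second, a gap of roughly 7 percentage points.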

This backdrop formed the basis for Ofwat’s development of its approach for
PR19, which we cover in the next section. Ofwat [85] stated “our approach has
taken into account learnings from PR14, industry feedback, and the Competition
and Markets Authority (CMA) reference on Bristol Water’s PR14 price controls.”

Ofwat’s Approach to Cost Benchmarking in PR19

Ofwat’s cost modelling in PR19 can be split into the following key steps (Fig. 2):

1. Definition of the inputs, or cost base, to be modelled, i.e.:
(a) The activities across the value chain to be modelled
(b) The cost elements to be modelled
2. Identifying the outputs and other drivers of cost
3. Data collection, validation, and consultation
4. Benchmarking using econometric models
5. Forecasting efficient cost levels for each company for the next control period

These steps are examined below in turn.

[Figure 2: flow diagram with the boxes “Definition of the inputs” and “Identifying the outputs and other drivers of costs”; “Data collection, validation and consultation” (costs, outputs, input prices, other cost drivers including environmental factors); “Benchmarking” (specification of the model(s), model development/selection including checking against economic and operational insight, model estimation); and “Forecasting efficient cost levels”.]

Fig. 2 Ofwat’s key benchmarking steps in PR19

Definition of the Inputs

In its initial assessment of plans (IAP), Ofwat [91–94] benchmarked
water companies through the use of econometric models of base expenditure
(or “BOTEX”), which consists of operating expenditure and capital maintenance
expenditure. In its subsequent slow-track draft determinations (i.e., those for
companies that submitted business plans that Ofwat deemed to be cost inefficient
and to require a further challenge) and final determinations, Ofwat [95, 96]
modelled BOTEX plus (consisting of BOTEX as defined above plus some elements of
enhancement expenditure, primarily relating to growth). For water services, these enhancement
costs were new developments, new connections, and addressing low pressure; for
wastewater services, these enhancement costs were new developments and growth,
growth at sewage treatment works, and reduced flooding risk for properties. For
water, the enhancement expenditure in each year was added to BOTEX, while
for wastewater, the average enhancement expenditure over the sample period
(2011/2012–2018/2019) was used.
Aside from growth, there are many other types of enhancement expenditure.
At the PR19 final determinations, there were 40 separate enhancement categories
across both water and wastewater. Examples include meeting legislatively mandated
environmental obligations, improving the quality of water supply and building a
more resilient water supply. These remaining elements of enhancement expenditure
in the slow-track draft and final determinations were considered separately from
BOTEX plus, as Ofwat ([93], p. 7) considered that “enhancement costs tend to be
non-routine and company specific.”
The reasoning for moving to modelling BOTEX plus in the slow-track draft
determinations was set out in Ofwat [95] and can be summarized as follows:

• Ofwat considered the expenditure to be “routine” – companies have incurred it
in the past and will incur it in the future.
• Growth-related enhancement can be explained with similar cost drivers to
operational and capital maintenance (e.g., company scale).
• Ofwat did not expect to see a significant step change in drivers of growth
enhancement expenditure during PR19.

Some companies criticized Ofwat’s inclusion of growth expenditure along with
base expenditure, but Ofwat maintained its approach in the final determinations (see
section “Input Definition: Modelled Expenditure”).
In response to DEFRA [33] and the resultant need to promote upstream markets
for water resources and bioresources, Ofwat [81] set out its methodological
framework for PR19, including setting separate price controls for wholesale water,
water resources, wholesale wastewater and bioresources, and retail. This included
the need to model at a more disaggregated level than that undertaken in PR14. This
change in approach was also consistent with issues raised in CMA [23] around the
level of aggregation of the modelling in PR14. Though, as noted in Oxera ([114],
p. 1) about Ofwat’s previous functional modelling approach, “Ofwat’s modelling
is undertaken using functional models for different cost areas. This requires costs
are separable across the different water activities and that cost allocation across the
water companies is consistent.” Disaggregation also makes it difficult to capture
economies of scale and of scope, although Saal and Parker [127] did not find any
evidence of economies of scope between water and wastewater. Similarly, more
recently, Saal ([125], p. 3) stated that “such an approach assumes that complex
multiple output systems can be fully separated.”
At the most granular level, Ofwat subdivided wholesale water services into
four separate parts of the value chain – water resources (abstraction), raw water
distribution (delivery to a water treatment works), water treatment (treatment
of raw water), and treated water distribution (delivery of treated water to con-
sumers). Wastewater was subdivided into five separate parts of the value chain –
sewage collection, sewage treatment, sludge transport, sludge treatment, and sludge
disposal.
However, Ofwat ([94], p. 11) acknowledged that there may be “interaction[s]
between services of the value chain” and “inherent choices and trade-offs across the
value chain.” Therefore, Ofwat aggregated water resources, raw water distribution,
and water treatment into “water resources plus.”

[Figure 3: value chain diagram with boxes for water resources, raw water distribution, water treatment, and treated water distribution; the first three are grouped as “water resources plus,” and all four together as “aggregate.”]

Fig. 3 Value chain in wholesale water. (Source: Ofwat [94]. Checked boxes indicate that Ofwat
modelled the part of the value chain)

Ofwat “triangulated,” or averaged, across the outcomes from:

• the sum of allowances from “bottom up” models at the water resources plus and
treated water distribution level; with
• models at the aggregate water service level.

This is set out in Fig. 3.
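The triangulation step can be sketched as a simple average of the two routes to a wholesale water allowance. The equal weighting follows the description of triangulation as averaging; the figures are hypothetical.

```python
def triangulate(bottom_up_parts, aggregate_estimate):
    """Average the summed bottom-up allowances with the aggregate-model allowance."""
    bottom_up_total = sum(bottom_up_parts.values())
    return 0.5 * (bottom_up_total + aggregate_estimate)


# Hypothetical allowances for one company, in arbitrary cost units.
bottom_up = {"water resources plus": 40.0, "treated water distribution": 55.0}
aggregate = 101.0   # allowance from the aggregate wholesale water models

allowance = triangulate(bottom_up, aggregate)
print(allowance)
```

Averaging in this way limits the influence of any single level of aggregation: an error in cost allocation between the bottom-up parts affects only half of the final figure.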

Ofwat ([94], p. 11) stated that it found that these models had more reasonable
ranges of estimated company efficiencies, and coefficients that were more aligned
with economic intuition, than models at more disaggregated levels, such as
separate water resources models. In its industry consultation on the models to be
used at the price review, Ofwat ([85], p. 13) also considered that models at this higher
level of aggregation were less susceptible to misallocations of costs across services.
Ofwat’s approach to wholesale wastewater was similar to that for wholesale water.
Ofwat aggregated sludge transport, sludge treatment, and sludge disposal into
“bioresources,” adding sewage treatment to create “bioresources plus.” This is set out
below (Fig. 4).
Unlike water services, no aggregate wastewater models were developed (see
Ofwat [94], p. 25). This modelling decision assumes separability (see Oxera [114]).
Another challenge is that population density has different effects on expenditure in
different parts of the value chain, such that costs are interrelated, as set out by South
West Water ([134], p. 9):

[population density] affects the wastewater network differently to the water network
because the sewer network is expensive to construct, so sparse networks are generally built
to serve small catchment areas. Densely populated urbanised areas have increased network
costs associated with operating in congested areas but have reduced treatment and sludge
costs as a result of having larger treatment works serving densely populated areas. Sparsely
populated rural areas have increased treatments costs due to requiring many small works
scattered across a sparsely populated region

[Figure 4: value chain diagram with boxes for sewage collection, sewage treatment, sludge transport, sludge treatment, and sludge disposal; the three sludge activities are grouped as “bio-resources,” these plus sewage treatment as “bio-resources plus,” and all five together as “aggregate.”]

Fig. 4 Value chain in wholesale wastewater. (Source: Ofwat [94]. Checked boxes indicate that
Ofwat modelled the part of the value chain)

By aggregating the results of these models before establishing the benchmark,
Ofwat reduced the risk of setting unachievable benchmarks. However, the lack of an
aggregate wastewater model removes the ability to cross-check such disaggregated
modelling.

Identifying the Outputs and Other Drivers of Costs

Having defined the inputs or costs to be modelled, the next key step is to decide on
what else should be included in the model.
As with previous price controls, Ofwat did not define formal cost models.
Instead, Ofwat focused on choosing an appropriate set of “cost drivers.” Input prices
were not included in Ofwat’s models, although Ofwat did examine the impact of
regional wages. As such, Ofwat’s cost benchmarking models are not cost functions
in the formal sense. These model formulations are sometimes referred to as input
requirement functions (as mentioned above).
For wholesale water and wastewater, Ofwat [94] found that four key categories
of cost drivers were consistently important in explaining variations in
costs across companies:

• Scale variables, to measure the size of the network and/or level of output
• Complexity variables, to capture the complexity of required treatment or the
complexity of the network
• Topography variables, to capture energy requirements for transporting or
pumping water or wastewater
• Density variables, to capture economies of scale at the treatment level and costs
resulting from operating in highly dense (or sparse) areas

It appears that the consultation process (see section “Data Collection, Validation,
and Consultation”) was successful in ensuring that some industry insight was
embedded within the modelling framework, as these categories are similar to
those suggested by some of the water companies as part of the cost assessment
consultation – for example, see South West Water ([134], Sect. 3).

Data Collection, Validation, and Consultation

The critical next steps in any regulatory cost assessment exercise are: collating
relevant data (including ensuring all the key cost drivers are collated), ensuring the
data is consistent across companies, and ensuring that the models are aligned with
industry and economic insight.
During 2016 and 2017, Ofwat ran a series of cost assessment working groups
with the industry to develop the data and cost assessment tools for PR19. In July
2017, companies submitted data on costs and cost drivers for wholesale water and
wastewater services over the 6-year period, from 2011–2012 to 2016–2017. The
data was subject to extensive quality assurance and was shared with the industry.
In March 2018, Ofwat issued a cost assessment consultation, Ofwat [85].
Thirteen water companies and Ofwat submitted a number of cost models across the
value chain. In total, 382 models were submitted. Each company then commented
on the models that had been submitted. In February 2019, Ofwat published its
approach and decisions regarding econometric modelling for PR19, including its
model specifications, in Ofwat [94].

Benchmarking

The next step in Ofwat’s framework was to undertake the benchmarking exercise
itself, which first involved the development of a number of econometric models for
each part of the value chain in order to compare costs across companies.
In wholesale water services, Ofwat specified five econometric models (two for
water resources plus; one for treated water distribution; and two for wholesale
water). In wholesale wastewater, Ofwat specified eight econometric models (two
each for sewage collection, sewage treatment, bioresources, and bioresources plus).

Model Development and Model Selection

As set out in Ofwat [85], Ofwat’s approach to model development and assessment
was as follows:

• Engineering, operational, and economic understanding was used to specify an
econometric model and form expectations about the relationship between cost
and cost drivers in the model.
• The resultant estimated coefficients were assessed as to whether they were of the
right sign and plausible magnitude.
• The estimated coefficients were examined for robustness, including whether
they were stable and consistent across different specifications and statistically
significant. On this issue, Ofwat stated that they “do not consider that the
common thresholds of statistical significance (e.g., 95% significance) need to
be strictly followed for our model selection. The size of the sample has a large
effect on statistical significance. With a relatively small sample we are careful not
to dismiss mechanistically variables that are not strictly statistically significant,
so long as the significance is still reasonable and the estimation seems robust,”
Ofwat ([85], p. 9).
• Ofwat checked the consequences/risk of perverse incentives of including endoge-
nous cost drivers.
• Ofwat examined the statistical validity of the model.

With regard to its estimation approach, for its initial assessment of plans, Ofwat
([94], p. 7) decided to use an RE specification. Ofwat justified this choice on the
grounds that RE reflected the panel structure of the data and that the statistical
significance of the coefficients and Breusch-Pagan tests supported its use over OLS.
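For readers unfamiliar with the test, the following is a minimal sketch of the Breusch-Pagan Lagrange multiplier statistic for random effects, computed from pooled OLS residuals. It assumes a balanced panel and uses hypothetical residuals; it is not a reproduction of Ofwat’s calculation.

```python
def breusch_pagan_lm(residuals_by_company):
    """Breusch-Pagan LM statistic for random effects in a balanced panel.

    `residuals_by_company` maps each company to its pooled OLS residuals,
    one per year. Under the null of no company-specific effect, the
    statistic is asymptotically chi-squared with one degree of freedom.
    """
    groups = list(residuals_by_company.values())
    n = len(groups)                    # number of companies
    t = len(groups[0])                 # number of years per company
    if any(len(g) != t for g in groups):
        raise ValueError("this sketch handles balanced panels only")
    sum_sq_group_totals = sum(sum(g) ** 2 for g in groups)
    sum_sq_residuals = sum(e ** 2 for g in groups for e in g)
    ratio = sum_sq_group_totals / sum_sq_residuals
    return n * t / (2.0 * (t - 1)) * (ratio - 1.0) ** 2


# Hypothetical residuals: companies A and B are persistently above/below zero.
residuals = {
    "A": [0.10, 0.12, 0.08],
    "B": [-0.09, -0.11, -0.10],
    "C": [0.01, -0.02, 0.01],
}
lm = breusch_pagan_lm(residuals)
CHI2_1_CRITICAL_5PCT = 3.841           # 5% critical value, chi-squared(1)
print(lm > CHI2_1_CRITICAL_5PCT)
```

A statistic above the critical value rejects the null of no company-specific effect, which favours RE over pooled OLS. Residuals that keep the same sign for a company across years, as for A and B here, are what drive the statistic up.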

Estimated Wholesale BOTEX Plus Cost Models

The resultant models typically included one scale driver and up to four other cost
drivers. Saal and Nieswand [126] criticized this as being too restrictive. It could be
argued that using models with different scale drivers in different parts of the value
chain partly captures the multiple-output production process of water and
wastewater service provision. However, the differences are quite limited in practice
and no quality of service measures were included in Ofwat’s core suite of models.
As noted above, quality of service and outcomes more generally are regulated
separately by Ofwat. This separation was criticized by some water companies and is
examined in more detail in section “Input Definition: Accounting for the Investment
Cycle.”
In water, Ofwat separately modelled three parts of the value chain – two at the
disaggregate level (water resources plus, and treated water distribution) and one
aggregate (wholesale water). Ofwat [96] sets out the estimated models for water
service at final determinations. These are generally of the form,

$$
\ln(\text{BOTEX plus}_{it}) = a + b_1\,\text{scale}_{it} + b_2\,\text{treatment complexity}_{it} + b_3\,\text{topography}_{it} + b_4\,f(\text{density}_{it}) + u_{it} \tag{1}
$$

and are set out in Table 1 (we indicate the correspondence between the specific
variable in the table and the general cost driver area in the formula with the relevant
coefficient).
In wastewater, Ofwat separately modelled four parts of the value chain, all
at the disaggregate level (sewage collection, sewage treatment, bioresources, and
bioresources plus). Ofwat [96] sets out the estimated models for wastewater service
at final determinations. These are generally of the form,

\[
\begin{aligned}
\ln \text{BOTEX plus}_{it} = {} & a + b_1\,\text{scale}_{it} + b_2\,\text{size profile of works}_{it} \\
& + b_3\,\text{topography}_{i} + b_4\,\text{treatment complexity}_{it} \\
& + b_5\,\text{density}_{it} + u_{it}
\end{aligned}
\tag{2}
\]

Table 1 Water BOTEX econometric models

Dependent variable (log): WRP1 and WRP2 = water resources + raw water distribution + water treatment; TWD1 = treated water distribution; WW1 and WW2 = wholesale water total

Model name                                                        WRP1            WRP2            TWD1            WW1             WW2
Connected properties (log), b1                                    1.007***        1.007***        -               1.034***        1.020***
Lengths of main (log), b1                                         -               -               1.049***        -               -
Water treated at works of complexity levels 3–6 (%), b2           0.008***        -               -               0.005***        -
Weighted average treatment complexity (log), b2                   -               0.486***        -               -               0.568***
Number of booster pumping stations per lengths of main (log), b3  -               -               0.455***        0.231**         0.256***
Weighted average density (log), b4                                −1.647***       −0.981**        −3.120***       −2.220***       −1.789***
Squared term of log of weighted average density, b4               0.103***        0.056 (0.120)   0.248***        0.156***        0.125***
Constant term, a                                                  −4.274**        −6.607***       5.686***        −1.106 (0.483)  −2.725**
Overall R-squared                                                 0.93            0.92            0.97            0.98            0.98
Number of observations                                            141             141             141             141             141
Source: Ofwat ([96], p. 162)
Note: The dependent variable is modelled base costs in 2017/2018 prices, using the CPIH
adjustment. P values expressed in parentheses are based on clustered standard errors at the
company level. *, **, and *** denote significance at 10%, 5%, and 1%, respectively. A dash
indicates that the variable is not included in the model

Table 2 Wastewater BOTEX econometric models

Model name   SWC1  SWC2  SWT1  SWT2  BR1  BR2  BRP1  BRP2
Dependent variable (log): sewage collection (SWC1, SWC2); sewage treatment (SWT1, SWT2); bioresources (BR1, BR2); bioresources + sewage treatment (BRP1, BRP2)
Sewer length (log), b1   0.839***  0.896***
Load (log), b1   0.779***  0.773***  0.765***  0.762***
Sludge produced (log), b1   1.274***  1.265**
Load treated in size bands 1–3 (%), b2   0.045***  0.057**  0.038*
Load treated in size band 6 (%), b2   −0.013**  −0.011**
Pumping capacity per sewer length (log), b3   0.317*  0.606***
Load with ammonia consent below 3 mg/l (%), b4   0.004***  0.004***  0.005***  0.005***
Number of properties per sewer length (log), b5   0.998**
Weighted average density (log), b5   0.178 (0.146)  −0.295**
Sewage treatment works per number of properties, b4/b5   0.397*
Constant term, a   −8.124***  −6.416***  −5.228***  −3.988***  −0.389 (0.648)  0.994*  −4.753***  −3.709***
Overall R-squared   0.93  0.88  0.88  0.87  0.82  0.79  0.92  0.92
Number of observations   80  80  80  80  80  80  80  80
Source: Ofwat ([96], p. 163)
Note: The dependent variable is modelled base costs in 2017/2018 prices, using the CPIH adjustment. P values expressed in parentheses are based on clustered
standard errors at the company level. *, **, and *** denote significance at 10%, 5%, and 1%, respectively

• First, where alternative models were used, Ofwat took the average of the model
predictions within each part of the value chain (e.g., WRP1 and WRP2, and WW1
and WW2) to arrive at a prediction for that part of the value chain (e.g., WRP
overall, and WW overall, respectively). Since there was only one TWD model,
there was no averaging undertaken for this part of the value chain.
• Ofwat then constructed a “bottom-up” and “top-down” view of the company’s
efficient costs. In water, the bottom-up view was arrived at by summing together
water resources plus (WRP overall) and treated water distribution (TWD), while
the top-down view was provided by WW overall.
• To arrive at an overall cost prediction for water, an average of the “bottom-up”
and “top-down” views was taken.

Similarly, for wastewater, a cost prediction for each part of the value chain was
arrived at by averaging its constituent models. For example, the SWC cost prediction
was calculated by averaging the cost predictions from the two models “SWC1” and
“SWC2.” Then, the bottom-up view was the sum of sewage collection (SWC),
sewage treatment (SWT), and bioresources (BR), while the top-down view was the
sum of sewage collection (SWC) and bioresources plus (BRP). These top-down and
bottom-up views are then also averaged to arrive at the final overall cost prediction.
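The triangulation steps described above can be sketched as follows. This is a minimal illustration that assumes each model's prediction has already been converted from logs into cost levels; the function names and the figures used in the test values are hypothetical.

```python
from statistics import mean


def water_prediction(pred):
    """Combine model-level predictions (in cost levels) into one water view."""
    wrp = mean([pred["WRP1"], pred["WRP2"]])  # water resources plus
    twd = pred["TWD1"]                        # single model, so no averaging
    ww = mean([pred["WW1"], pred["WW2"]])     # wholesale water (top-down view)
    bottom_up = wrp + twd
    return mean([bottom_up, ww])


def wastewater_prediction(pred):
    """Combine model-level predictions into one wastewater view."""
    swc = mean([pred["SWC1"], pred["SWC2"]])  # sewage collection
    swt = mean([pred["SWT1"], pred["SWT2"]])  # sewage treatment
    br = mean([pred["BR1"], pred["BR2"]])     # bioresources
    brp = mean([pred["BRP1"], pred["BRP2"]])  # bioresources plus
    bottom_up = swc + swt + br
    top_down = swc + brp
    return mean([bottom_up, top_down])


# Made-up predictions for one company, for illustration only.
water = water_prediction(
    {"WRP1": 100.0, "WRP2": 110.0, "TWD1": 200.0, "WW1": 300.0, "WW2": 320.0})
```

Note that sewage collection enters both the bottom-up and the top-down wastewater views, so the averaging only affects the treatment and bioresources components.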

Estimated Retail Costs Models

For retail services, Ofwat modelled total retail costs, bad debt, and other retail costs.
The models are set out below. These are generally of the form,

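\[
\begin{aligned}
\ln(\text{total retail costs per household}_{it}) = {} & a + b_1\,\text{average bill size}_{it} \\
& + b_2\,\text{propensity to default}_{it} \\
& + b_3\,\text{net migration}_{i} \\
& + b_4\,\text{dual service customers}_{it} \\
& + b_5\,\text{metered customers}_{it} \\
& + b_5\,\text{number of households}_{it} \\
& + u_{it}
\end{aligned}
\tag{3}
\]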

and are set out below (we indicate the correspondence between the specific variable
in the table and the general cost driver area in the formula with the relevant
coefficient) (Tables 3 and 4).
In order to arrive at an overall cost prediction from all these models, Ofwat [96]
followed a similar approach as that set out above for wholesale services, placing
25% weight on the bottom-up models and 75% weight on the top-down retail cost
models. A greater weight was placed on top-down models to reflect Ofwat’s view
of relative model quality, with the wider range of efficiency scores in the bad debt
models given as specific evidence, Ofwat ([96], p. 119).
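The weighting can be written out explicitly. The sketch below assumes the bottom-up and top-down retail views have already been formed (for example, by averaging the relevant models); the function name and the figures are illustrative only.

```python
def retail_prediction(bottom_up, top_down, weight_bottom_up=0.25):
    """Weighted retail cost view: 25% bottom-up, 75% top-down by default."""
    return weight_bottom_up * bottom_up + (1 - weight_bottom_up) * top_down


# Hypothetical costs per household from the two views.
combined = retail_prediction(bottom_up=40.0, top_down=44.0)
```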

Table 3 Bottom-up retail cost models

Model name   RDC1  RDC2  ROC1  ROC2
Dependent variable (log): bad debt and bad debt management costs per household (RDC1, RDC2); other retail costs per household (ROC1, ROC2)
Average bill size (log), b1   1.190***  1.158***
Proportion of households with default (%), b2   0.067***
Proportion of households income deprived (%), b2   0.076***
Total migration (% of population), b3   0.035**
Proportion of dual service households (%), b4   0.002*  0.002**
Proportion of metered households (%), b5   0.007***  0.007***
Number of connected households (log), b1   −0.039 (0.394)
Constant term, a   −6.032***  −5.680***  2.400***  2.909***
Overall R-squared   0.77  0.78  0.13  0.15
Number of observations   105  105  105  105
Source: Ofwat ([96], p. 164)

Table 4 Top-down retail cost models

Model name   RTC1  RTC2  RTC3
Dependent variable (log): total retail costs per household
Average bill size (log), b1   0.458***  0.526***  0.603***
Proportion of households with default (%), b2   0.024 (0.106)  0.030**
Proportion of households income deprived (%), b2   0.059***
Total migration (% of population), b3   0.037**
Proportion of metered customers (%), b5   0.004 (0.321)  0.004 (0.206)  0.002 (0.436)
Number of connected households (log), b1   −0.059*  −0.116*
Constant term, a   −0.014 (0.980)  0.226 (0.653)  0.200 (0.564)
Overall R-squared   0.67  0.70  0.71
Number of observations   105  105  105
Source: Ofwat ([96], p. 164)

Functional Form, Economies of Scale, Size, and Scope

In PR19, Ofwat simplified its models relative to those it used in PR14, and no longer
used semi-translog specifications. Generally, the models used a log–log functional
form, though Ofwat included squared terms on density/sparsity-related measures to
pick up a U-shaped impact for wholesale water services (costs are expected to be
higher in both dense and sparse regions, relative to regions of average density) – see
Table 1. On this issue, Ofwat ([93], p. 10) stated:

While the translog has appealing properties in that estimated elasticities vary with company
size, in practice we find individual company elasticities can have a counter-intuitive sign,
that some translog terms were highly insignificant and (individually) unstable, and that
the specification takes up degrees of freedom that could be dispensed with more relevant
cost drivers. Instead, for PR19 we built our models ‘bottom up’ by considering the main
cost drivers in each service. We include non-linear terms where their inclusion aligns with
economic or engineering rationale but not for the purpose of fitting with a preconceived
functional form.
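The U-shaped density effect can be made concrete with the WRP1 coefficients reported in Table 1. With a linear and a squared term in log density, the elasticity of cost with respect to density is b + 2c ln(density), which is zero at ln(density) = −b/(2c). The sketch below is illustrative only; the function names are hypothetical and the units of the density measure are not restated here.

```python
import math


def density_elasticity(b_linear, b_squared, log_density):
    """d ln(cost) / d ln(density) for a quadratic in log density."""
    return b_linear + 2 * b_squared * log_density


def density_turning_point(b_linear, b_squared):
    """Log density at which the elasticity changes sign."""
    return -b_linear / (2 * b_squared)


# WRP1 coefficients on weighted average density and its square (Table 1).
log_density_star = density_turning_point(-1.647, 0.103)
density_star = math.exp(log_density_star)  # same units as the density measure
```

Below the turning point the elasticity is negative (costs fall as density rises), and above it the elasticity is positive, which is the U-shape described in the text.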

For water services, Ofwat modelled the WASCs and WOCs together, the data set
consisting of 17 companies, which vary significantly in size. The largest company,
Thames Water, is more than 50 times the size of some of the smallest companies (on
the basis of costs). For wastewater services, there are 11 WASCs, which are more
similar in size with the exception of the recently created Hafren Dyfrdwy.
As discussed above, in PR14, Ofwat used semi-translog specifications – see
CEPA [13]. This model estimated varying economies of scale across the sector.
However, this modelling was criticized in the subsequent appeal – Bristol Water
([11], p. 50) stated “it is important that the model is consistent with economic theory.
Oxera has shown that this is not the case with respect to Ofwat’s translog model.”
The CMA [23] similarly stated that “Ofwat’s refined base expenditure models
implied a form of diseconomies of scale with respect to the size of a company’s
customer base, which we found to be counter-intuitive.”
For PR19, the models used were of a log–log functional form and economies
of scale or constant returns to scale were estimated. For example, the value of the
scale coefficient in Ofwat’s model for water distribution (TWD1) is approximately 1.0, i.e., for a
10% increase in length of mains, costs increase by approximately 10%. Similarly, the value of the
scale coefficient in Ofwat’s sewage collection model (SWC1) is 0.84, i.e., for a 10%
increase in scale, costs increase by approximately 8.4%.
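The percentage figures quoted are the usual first-order reading of a log–log coefficient. For a discrete change, the implied cost change is (1 + g)^b − 1, where g is the proportional change in the driver and b is the coefficient. A minimal sketch (the function name is hypothetical):

```python
def pct_cost_change(elasticity, pct_driver_change):
    """Exact proportional cost change implied by a log-log coefficient."""
    return (1 + pct_driver_change) ** elasticity - 1


swc1 = pct_cost_change(0.839, 0.10)  # sewer length coefficient, Table 2
twd1 = pct_cost_change(1.049, 0.10)  # lengths of main coefficient, Table 1
```

For a 10% change in the scale driver, the exact figures are close to the elasticity-based approximations, so the simplification in the text is harmless at this magnitude.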
For retail, if a WASC provides both water and wastewater services, then clearly there will only
be one bill. Where a WOC operates in a WASC area, there may be two separate
bills, or companies may agree to send a combined bill. As such, there can be an
issue when comparing costs between companies. On this issue, Ofwat ([94], p. 32)
states, “Dual service customers receive both water and wastewater services from the
same company. Dual customers may generate more contact and enquiries relative to
single service customers, which in turn drives customer service costs.”
To address this issue, for other retail costs, Ofwat modelled costs per household,
with dual customers per household as one of the drivers – i.e., the models are used to
estimate the economies of scale in serving dual customers through the use of the proportion
of dual customers and combined bills. The coefficient is positive with a value of
0.002. In contrast, the bad debt and total retail cost models are unit cost models. As
such, an increase in the number of total customers, independent of whether they are
dual or single, by 1% increases the predicted costs by 1%.

Forecasting Future Efficient Cost Levels

Ofwat used the above econometric models to arrive at a view of each company’s
efficient costs for the next control period, AMP7 (2020/2021–2024/2025). This was
achieved in four steps.
First, Ofwat accounted for company-specific factors that were not accounted
for in its modelling, as “statistical models are not perfect and cannot take into
account all relevant factors that affect costs. There may be instances where an
adjustment is required to correct these imperfections,” Ofwat ([81], p. 148). As
such, Ofwat’s framework allowed companies to submit so-called “cost adjustment
claims” in their business plans, whereby companies presented evidence of unique
operating circumstances, legal requirements, or atypical expenditure which drive
higher efficient costs for a company relative to its peers. Ofwat would then assess
these claims and make adjustments where, for example, it considered that the claim
was not captured by its modelling, was material, was outside management control,
had been mitigated to the extent possible, and the evidence on its impact on efficient
costs was robust. For details, see Ofwat ([81, 82, 86, 87, 93], Sect. 7, [96], Sect. 9).
Second, Ofwat estimated a historical benchmark and estimated an efficiency
challenge to this. At the IAP and draft determinations, this historical benchmark
was the “upper quartile.” That is, a corrected OLS (COLS) style approach is used
but with the benchmark given by the upper quartile (i.e., the fifth company for water
services, and between the third and fourth company for wastewater services). Ofwat
([93], p. 11) stated that “the upper quartile level recognises imperfections of statistical
analysis.” In its final determinations, Ofwat ([96], Sect. 3.1.3) moved the benchmark
to be the fourth ranked company for water services and the third ranked company for
wastewater services. Ofwat considered that: (i) “following changes to our data and
modelling approach . . . , the stringency of the historical upper quartile as a catch
up efficiency challenge has reduced” and (ii) “the cost adjustment claims Ofwat
allowed for were one-sided in most cases, increasing allowances for companies,”
Ofwat ([96], pp. 31–33). As such, Ofwat considered it appropriate to strengthen the
challenge for final determinations.
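The rank positions quoted above are what a 25th percentile with linear interpolation gives for samples of 17 and 10 companies. The sketch below assumes that an efficiency score is the ratio of actual to modelled cost (so lower is better) and that the benchmark score is applied multiplicatively to a company's modelled cost; both are simplifying assumptions, and the function names are hypothetical.

```python
def upper_quartile(scores):
    """25th percentile of efficiency scores, using linear interpolation."""
    ranked = sorted(scores)
    position = 0.25 * (len(ranked) - 1)  # zero-based rank, may be fractional
    lower = int(position)
    frac = position - lower
    if frac == 0:
        return ranked[lower]
    return ranked[lower] + frac * (ranked[lower + 1] - ranked[lower])


def ranked_benchmark(scores, rank):
    """Score of the company ranked `rank` (1 = most efficient)."""
    return sorted(scores)[rank - 1]


def efficient_cost(modelled_cost, benchmark_score):
    """Modelled cost after the catch-up challenge (assumed multiplicative)."""
    return modelled_cost * benchmark_score
```

With 17 scores the interpolated position is exactly the fifth-ranked company, and with 10 scores it falls a quarter of the way between the third and fourth. Moving to a fixed rank, as in the final determinations, corresponds to `ranked_benchmark(scores, 4)` for water and `ranked_benchmark(scores, 3)` for wastewater.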
Third, Ofwat generated cost predictions for each company by taking the model
coefficients estimated over the historical period and applying these to forecasts of company
cost drivers over AMP7. Typically, these forecasts were either: those developed by
the companies as part of their business plan submissions; derived by Ofwat using
third party sources (such as the ONS, the UK government statistics department), a
linear time trend or average of the past values for the cost driver; or a combination
of the companies’ and Ofwat’s views, Ofwat [96]. Ofwat ([96], p. 23), stated
that “it is important to protect customers from potentially inflated forecasts that
feed into cost estimates . . . It is therefore an important part of our incentive
based regulation to develop an independent view of cost drivers over the forecast
period . . . .” It considered that its mixed approach “better reflects what companies
are expected to deliver during 2020–25 consistent with our final determinations,
while maintaining the properties of our incentive-based regulation.” The historical
benchmark challenge was applied to these forecasts.
Finally, Ofwat also applied a frontier-shift challenge of 1.1% p.a. over the period
2020/2021–2024/2025. This was based primarily on separate analysis using total
factor productivity (TFP) growth rates using the EU KLEMS database, Stehrer et al.
[135]. For details, see Europe Economics [37–39]. Ofwat also allowed for real input
price inflation.
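To give a sense of scale, a frontier shift of 1.1% p.a. compounds to a reduction of a little over 5% by the fifth year of the control period. The sketch below assumes the shift is compounded from the first year of the period and ignores the offsetting allowance for real input price inflation; the exact mechanics of Ofwat's calculation are not reproduced here, and the function name is hypothetical.

```python
def apply_frontier_shift(costs_by_year, annual_shift=0.011):
    """Compound an annual frontier-shift reduction across the control period."""
    return [cost * (1 - annual_shift) ** (year + 1)
            for year, cost in enumerate(costs_by_year)]


# Flat illustrative cost profile of 100 per year over the five years of AMP7.
shifted = apply_frontier_shift([100.0] * 5)
```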
For retail expenditure, Ofwat used a different approach. In its initial assessment
of plans and fast-track draft determinations (i.e., determinations for three companies
that submitted high quality business plans, Ofwat [106], p. 2), instead of calculating
a historical upper quartile benchmark and overlaying a frontier shift assumption,
Ofwat [93] used a forward-looking upper quartile benchmark. For the slow-track
draft determinations and final determinations, Ofwat used the average of the
historical UQ and forward-looking UQ-based results.
Ofwat [93] stated that this choice of a forward-looking UQ was driven by the
decline in companies’ projected costs over AMP7 relative to the current level of
expenditure. In the final determinations, Ofwat ([96], p. 121) expanded on this
reasoning, stating that “the retail control has started as recently as 2015 and retail
services can transform more quickly than wholesale services . . . The fact that the
majority of companies submitted forecasts that are significantly more efficient than
historical expenditure is evidence of the pace at which this service is transforming.
It is important that customers share the benefits.”

Wholesale Enhancement Expenditure

Ofwat used a slightly different approach for enhancement expenditure. In contrast
to BOTEX plus, where all constituent costs were aggregated together and modelled,
each type of enhancement activity was assessed separately and efficiency challenges
were generally set at the individual activity level. Exceptions to this were made for
some related types of expenditure such as supply-demand balance expenditure and
the water industry national environmental program (WINEP) expenditure.
The efficiency challenge depended on “the quality of the model and the spread
of company cost projections around [Ofwat’s] benchmarks” (see Ofwat [93], p. 16).
The final view of efficient cost was the minimum of the efficient cost estimated
by modelling and the company’s requested enhancement spend. In total, Ofwat
examined 16 separate capital enhancement activities for water services and 24
separate capital enhancement activities for wastewater services, for a total of 40
activities across both water and wastewater.
Of these 40 enhancement activities, 12 were examined using econometric bench-
marking analysis, while the others were based on an examination of companies’
business plans. Ofwat’s preferred assessment method was econometric benchmark-
ing analysis, stating “Our preferred method of assessment is benchmarking analysis
of forecast costs. Where the investment area does not lend itself to statistical
modelling we rely more on the evidence provided by companies in their business
plans,” Ofwat ([96], p. 49).
Where Ofwat did use an econometric approach, the estimation approach was
relatively similar to the BOTEX approach described above. The benchmarking
models were typically limited to only one to two cost drivers. However, there were
three important dimensions in which its approach to enhancement differed from that for
BOTEX:

• Use of forecast data – BOTEX plus models were estimated on outturn, or histor-
ical data, while enhancement models were sometimes estimated on forecast data
only, or a mixture of both historical and forecast data. Generally, Ofwat modelled
enhancement cost data over the business plan forecast period (2020/2021–
2024/2025), although in some areas, it also took into account historical data
(2011/2012–2017/2018) to inform the allowance. However, this was typically
modelled in a separate model to the forecast data, rather than combining all the
data into one model; for an example, see Ofwat [100].
• Choice of benchmark – The efficiency challenge applied varied across different
activities, including an average benchmark, a company-specific challenge based
on its historical base expenditure performance or an upper quartile benchmark.
• Modelling was undertaken on both cross-sectional and panel data – While data
is available for each company over a number of years, in some areas, Ofwat
collapsed the data to one observation per company over the assessed period
by summing across expenditure and cost driver. In other areas, the panel data
structure (i.e., data across companies over time) was retained. This was typically
modelled econometrically using the RE estimator, rather than the OLS estimator.
However, Ofwat occasionally used the OLS estimator instead, for example, see
Ofwat [100].

The models were used to generate cost predictions, efficiency scores, and in turn,
the efficiency challenge. This efficiency challenge varied across models.
As an example of how Ofwat’s benchmarking approach works for one area of
enhancement expenditure, we set out the process for the “phosphorus removal”
enhancement activity, which is associated with removing phosphorus from waste
load to prevent eutrophication. This activity forms a part of a wider WINEP
program. To assess phosphorus removal costs, Ofwat specifies a simple econometric
model. This generates predicted costs for each company. However, rather than
calculating the catch-up challenge for just phosphorus removals, Ofwat does so
across all enhancement activities within WINEP. Ofwat ([96], p. 62) considers that
this is appropriate because it accounts for potential cost allocation issues and the
accuracy of the individual models used for each enhancement activity.
Given the number of activities examined by Ofwat, we do not examine the
other 11 individual modelling areas. For details on these models, the interested
reader should review Ofwat [98–99]. One key difference between BOTEX and
enhancement modelling is that the modelled benchmark did not necessarily form the
efficiency challenge presented to companies. Citing the need to “[protect] customers
from paying for inefficient, unrequired or undelivered investment in the control
period,” Ofwat capped the efficient expenditure level in an enhancement area to the
minimum of the model prediction and the company’s submitted cost level, Ofwat
([93], p. 15).

[Figure: sequence of assessment gates. Gate 1: need for investment? Gate 2: need for adjustment? Gate 3: management control? Gate 4: best option for customers? Gate 5: robust evidence? Gate 6: customers protected? Gates 7 and 8: affordable and board-assured?]
Fig. 5 Shallow/deep-dive gates. (Source: Oxera ([118], p. 12), based on Ofwat ([93], p. 17))

Companies have raised a number of challenges in response to Ofwat’s approach
to enhancement modelling. For example, Ofwat collapses the panel data structure of
phosphorus removals costs and cost drivers into a cross-sectional dataset, meaning
that there are only 10 observations in total (one for each of the WASCs). As a result,
Ofwat only accounted for a limited number of cost drivers. For instance, Yorkshire
Water ([163], p. 63) argues that Ofwat’s approach does not fully account for the
cost differences between different phosphorus removal solutions that are driven by
differences in legislative obligations. Furthermore, Yorkshire Water [163] argues
that the limited sample and reliance on forecast data mean that the use of an upper
quartile benchmark is too challenging and risks classifying noise as inefficiency.
In enhancement activities where Ofwat did not adopt an econometric bench-
marking approach, Ofwat relied more heavily on the written evidence provided in
company business plans. In these areas, either a so-called “shallow” or a “deep-dive”
approach was taken, depending on whether the expenditure for a company was less
than or greater than 0.5% of water/wastewater total expenditure (TOTEX).
Ofwat’s shallow/deep-dives were a quantitative and qualitative review of the
companies’ enhancement submission in this area. Expenditure was assessed relative
to a number of “gates,” as set out in Fig. 5. Ofwat applied a 20% challenge
to expenditure that passed the first three gates and where it judged there to be
insufficient evidence of optioneering (i.e., considering all relevant solutions) and/or
robust evidence of efficient costs. In some instances, Ofwat applied an additional
company-specific efficiency challenge. An assessment of affordability and board
assurance was necessary only for the largest schemes.
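The materiality threshold and the 20% challenge can be expressed as two simple rules. The sketch below is a simplified reading of the text: it only covers expenditure that has already passed the first three gates, it treats spend exactly at the threshold as a shallow dive (the text does not specify this case), and it omits any additional company-specific challenge. The function names are hypothetical.

```python
def review_type(item_spend, totex, threshold=0.005):
    """Classify a review by the 0.5% of TOTEX materiality threshold."""
    return "deep dive" if item_spend > threshold * totex else "shallow dive"


def challenged_spend(requested, evidence_sufficient, challenge=0.20):
    """Allowance for spend that has passed the first three gates.

    A 20% challenge applies where evidence of optioneering and/or
    efficient costs is judged insufficient.
    """
    if evidence_sufficient:
        return requested
    return requested * (1 - challenge)
```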

Northern Ireland

In this section, we examine the cost benchmarking approach used in the Northern
Irish water sector, which draws heavily on the approach used by Ofwat in PR09.

Historical and Industry Context

Water and wastewater services in Northern Ireland are provided by NI Water, which
has dual status as a government-owned company and a nondepartmental public
body.
The Department for Infrastructure is the sponsor and sole shareholder of NI
Water. It monitors financial and performance reporting against nominated outputs
as determined by the Utility Regulator (see below), has a challenge and advocacy
role, and is responsible for paying the customer subsidy to NI Water.
Water and sewerage services are regulated by the Northern Ireland Authority for
Utility Regulation, known as the Utility Regulator or UR (previously Ofreg).

Efficiency Benchmarking in Northern Ireland

PC15 was the third price control for NI Water covering a 6-year period from
2015–2016 to 2020–2021. The final determination for PC15 included separate
cost benchmarking of operating expenditure and capital expenditure, following
approaches previously used by Ofwat (in PR09 and before). The UR benchmarks
NI Water’s costs against England and Wales water companies.
Given the similarity to Ofwat’s approach, we only provide a high-level summary
here. The interested reader should review the references provided.
For operating expenditure, UR’s approach at PC15 included the following
[146]:

• Establish NI Water’s baseline OPEX.
• Adjust for additions (and reductions) to base costs (e.g., additional OPEX due to
new legal standards, improved drinking water, or treatment standards).
• Assess transformation costs (in recognition that significant change was required
to improve efficiency).
• Assess OPEX from capex requirements (i.e., new expenditure arising from the
capital program).
• Determine allowances for special factors (rural/dispersed population, regional
wages, electricity prices, legacy specialist wastewater treatment technology) and
atypical expenditure.
• Undertake cost benchmarking using unit cost and econometric modelling of costs
for different functions (such as distribution, or resources and treatment). That is,
a similar approach to that used by Ofwat in PR09 (see Ofwat [75]).
• Estimate the relative efficiency gap between NI Water and the upper quartile.
• Overlay assumptions on the frontier shift.
• Consider how public private partnerships/private finance initiative (PPP/PFI)
costs should be treated.
• Review NI Water proposals.

For capital expenditure, the UR followed a similar approach to that used by
Ofwat in PR09. Two approaches were used – capital maintenance expenditure
models (unit cost models or econometric models) and the “cost base.”
The cost base was used to assess the relative efficiency of water and sewerage
companies in procuring and delivering capital projects. It compares unit
costs across a wide range of standard water and sewerage capital schemes
(e.g., mains laying, mains rehabilitation, meter installation). For PC15, the
UR primarily used the upper quartile as its benchmark, with cost base derived
efficiencies triangulated against capital procurement efficiencies. The latter were
derived through a panel of reporters drawn from SMEs who examined and made
recommendations on how NI Water might, over the course of the control period,
improve procurement processes and practice to achieve enhanced efficiencies going
forward.

Future Potential Changes in Approach

For PC21, which covers 2021–2027, the UR does not envisage following
the same approach to operational efficiency it took at PC10 and PC15 for
assessing efficiencies, as this “no longer remains appropriate,” UR ([147],
p. 22). Its approach to efficiencies in PC21 may include the following elements
[148–149]:

• Engaging with NI Water at a Cost Assessment Working Group (CAWG)
• Setting a challenging efficiency target for NI Water, while recognizing NI Water’s
progress in delivering efficiencies
• Using a pooled dataset, including comparable data from the England and Wales
companies
• Benchmarking capital maintenance expenditure and/or BOTEX modelling
• Accounting for special factors and atypical expenditure
• Using COLS models
• Reserving judgment on the specific rate of catch-up
• Adopting a “triangulated” approach of combining econometric analysis, exam-
ination of frontier shift, and experience of rapid reductions in expenditure in
similar regulated industries

At the time of writing, the UR has not yet published its draft or final decisions.
Therefore, its methodology may be subject to further change.

Scotland

In this section, we examine the cost benchmarking approach used in the Scottish
water sector, which used to be based on the approach used by Ofwat in PR09, but
has subsequently moved away from econometric cost benchmarking.

Historical and Industry Context

Scottish Water is responsible for providing water and wastewater services to
household customers and wholesale licensed providers. It is a public corporation
accountable to Scottish Ministers and the Scottish Parliament.
The Water Industry Commission for Scotland (WICS) is the economic regulator
for Scottish Water, setting charges and reporting on costs and performance. The
regulatory model in Scotland is in the process of changing from a traditional
“adversarial” model to a more collaborative approach, characterized as Ethical
Based Regulation (EBR), WICS [158]. This change has had implications for the
role of benchmarking in the regulatory process, in particular reducing the need for
benchmarking as a tool to reduce information asymmetry between regulator and
company [158].

Efficiency Benchmarking in Scotland

Under the previous regulatory model, the price setting process took place (typically)
every 6 years. The process is set out in Fig. 6.

[Figure: five-stage process. (1) Scottish government sets draft objectives for Scottish Water to achieve over the 6 year period; (2) Scottish Water sets out a detailed business plan to meet these objectives to submit to WICS; (3) WICS reviews the business plan and determines draft price caps; (4) Scottish government and Scottish Water respond to WICS draft determination; (5) WICS finalises price cap for the 6 year period in its final determination. Annotations: “Regulator uses benchmarking to determine a sufficient price cap for the company to finance its plan” and “Potential for company to determine the cost to deliver outcomes in alignment with government objectives”]
Fig. 6 Benchmarking under the previous regulatory model. (Source: WICS ([159], p. 9). Note:
Quality regulators – such as the Drinking Water Quality Regulator and Scottish Environment
Protection Agency – also contributed towards the objectives set for the water sector)

The role for and nature of benchmarking has changed over time. At the first
strategic review of charges for Scottish Water – Strategic Review of Charges 2002–
2006 – econometric benchmarking, in combination with bottom-up engineering
evidence, was used to determine the potential for efficiency savings at Scottish
Water. In addition, an estimate of the benefits of merging the three regional
authorities into a single company was also overlaid, WICS [153]. The resulting
efficiency challenge was substantial, with expected OPEX efficiencies of 37% by
2006 and a CAPEX efficiency target of 34%, WICS ([153], p. 15). As such,
WICS provided for additional costs up front in order to facilitate the efficiency
improvements, so-called spend to save.
At this early stage in Scottish water regulation, the benchmarking techniques
drew substantially from the models used in England and Wales in PR09 and before.
As such, the key issues discussed in the section above on benchmarking in England
and Wales are also relevant for Scotland.
WICS ([153], p. 14) stated:
The Office of Water Services (Ofwat), in conjunction with Professor Mark Stewart at
the University of Warwick, developed these econometric models [the models used to
benchmark Scottish Water]. The models were used in the 1994 and 1999 price reviews
in England and Wales. They have been held out as an example of good practice by the
Cabinet Office and were reviewed by the Competition Commission last year. I have made
only marginal adjustments to these models to ensure that they take fully into account the
Scottish operating environment.

Although adjustments may have been “only marginal,” a persistent issue with
benchmarking Scottish Water to companies in England and Wales has been the
magnitude of unique or special factors related to operating in Scotland. WICS
([155], p. 95) sets out that these include, but are not limited to:

• Scotland’s geography (size, remote islands, long coastline, topography)
• Its population settlement patterns (remote communities, concentrated dense
urban areas)
• The extent of the assets required to serve customers in Scotland (long mains,
small isolated treatment works)
• The quality of the assets inherited by Scottish Water (condition and performance
of the mains, sewers, treatment works, pumps)
• The nature of the customer base
• The fact that Scottish Water is in public ownership (political interest, Scottish
Water’s duty to Scotland, remit and freedom of management)
• The short time that Scottish Water has had to mature and improve

Another complicating factor for the benchmarking has been Scottish Water’s
inheritance of nine long-term Public Private Partnership contracts to operate
wastewater treatment. These cover around 50% of Scotland’s wastewater treatment
and 80% of its sludge treatment, WICS (ca. [157]). As the fees that Scottish Water
pays for these contracts (around 10% of its annual spending) were set before the
current company came into being, they are not controllable by current management.
Thus, the expenditure cannot be considered as controllable or comparable to
Scottish Water’s internal wastewater treatment expenditure and cannot be compared
to companies in England and Wales without adjustment. Many of these contracts
are not due to expire until the 2030s, WICS (ca. [157]).
Over the course of subsequent reviews (2006–2010, 2010–2015) benchmarking
with England and Wales has been applied in a number of ways by WICS and Scot-
tish Water. Scottish Water [130] identified the main areas in which benchmarking
has been used as:

• Service levels using the Overall Performance Assessment (OPA).

• Operating expenditure using an econometric modelling approach.
• Capital maintenance using an econometric modelling approach.
• Capital enhancement using a cost base approach.

A Change in Direction

In more recent reviews (SRC15 and SRC21), WICS has moved away from
econometric benchmarking techniques. This has been driven by changes to the
way Scottish Water is regulated and the position of the regulator on the efficacy
of econometric benchmarking in the current context of the Scottish water sector.
The move in regulatory reviews from 2015 onwards has been to empower Scottish
Water to take full responsibility for its business plan. The combination of greater
transparency (removing information asymmetries) and transferring ownership of
determining what outputs and outcomes are required removes some of the need
for benchmarking to increase the information available to the regulator and set hard
budget constraints for companies. However, benchmarking can still play a role in
informing the decision-making process even outside of a traditional “adversarial”
regulatory framework, WICS [156].
WICS and Scottish Water have used higher level metrics, such as average unit
OPEX, to inform the potential for efficiency improvements going forwards. Moving
away from econometric benchmarks has been justified in WICS ([156], p. 8) on the
following basis:

Comparative benchmarking with the companies in England and Wales using econometric
models – which has proved so useful in the past to drive improvements in Scotland – is no
longer entirely adequate. The scale of Scottish Water’s improvement means that we would
have to use more intrusive approaches to identify and measure gaps in performance. Ofwat,
the economic regulator of the water sector in England and Wales, also appears to be moving
away from its historical approach to econometric modelling.

We note that, while Ofwat’s approach has changed from price reviews prior to
PR14, it has continued to place benchmarking methods at the center of its approach
to cost assessment. In considering a comparison of the two approaches, it is also
instructive to consider the different contexts the regulators operate within: WICS
regulates a single public corporation owned by and accountable to government,
while Ofwat regulates 17 companies – 16 of which are either publicly traded or
owned by private equity.

Ireland

In this section, we examine the cost benchmarking approach used in the Irish water
sector, which draws heavily on the approach used by Ofwat in PR09.

Historical and Industry Context

In Ireland, water services are provided by state-owned Irish Water, which was
established as a single national public water utility in 2013. Local authorities
currently continue to act as agents for Irish Water, providing services under Service
Level Agreements. The Commission for Regulation of Utilities (CRU), previously
the Commission for Energy Regulation (CER), is Irish Water’s economic regulator.
The previous revenue control covered the period from 2017 until 2019, having been
extended by 1 year, CER [16] and CRU [18], while the current revenue control,
RC3, covers the 5-year period from 2020 until 2024, CRU [20]. The Water Services
Act [161] sets out that Irish Water’s revenue will be recovered through a mixture of
Government subvention and customer charges.
The CRU determines Irish Water’s revenue allowance (i.e., the level of funding
Irish Water can collect from its customers). This involves the CRU reviewing
Irish Water’s submissions, benchmarking Irish Water’s proposed costs against
comparator companies, undertaking a public consultation process, and then setting
revenue allowances. The rest of this section is based upon the consultation paper
published by CRU, as, at the time of writing, the final decision had not been made.

Efficiency Benchmarking in Ireland

With regards to benchmarking, the Commission for Regulation of Utilities in Ireland
(CRU) assessed operating expenditure and capital expenditure separately.
For operating costs, CRU [20] reviewed Irish Water’s costs and benchmarked
them against water and wastewater utilities in other jurisdictions (including England
and Wales water and wastewater companies, Scottish Water, and Northern Ireland
Water). CRU reviewed Irish Water’s expenditure proposals and used unit cost
comparisons and econometric benchmarking.
For water services, OLS was used to model operating costs with one cost driver
and time dummies. The cost driver was a composite scale variable (a weighted
combination of distribution input, population and mains length), NERA [60].

$$\mathrm{CSV} = (\text{Water Delivered})^{w_1} \times (\text{Connected Properties})^{w_2} \times (\text{Mains Length})^{w_3} \qquad (4)$$

where w1, w2, and w3 are weightings on each of the scale variables.
For wastewater services, operating costs were regressed against a composite
scale variable (consisting of properties connected and sewer length), the number
of WWTW and time dummies.
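
As a minimal sketch of how a composite scale variable of the form in Eq. (4) can be computed, the snippet below multiplies the three scale variables raised to their weights. The function names, the weights, and the input values are illustrative assumptions and are not taken from NERA [60] or CRU [20].

```python
import math


def composite_scale_variable(water_delivered, connected_properties, mains_length,
                             w1, w2, w3):
    """Weighted geometric combination of three scale variables, as in Eq. (4)."""
    return (water_delivered ** w1) * (connected_properties ** w2) * (mains_length ** w3)


def log_composite_scale_variable(water_delivered, connected_properties, mains_length,
                                 w1, w2, w3):
    """Same quantity in logs: a weighted sum of the logged scale variables."""
    return (w1 * math.log(water_delivered)
            + w2 * math.log(connected_properties)
            + w3 * math.log(mains_length))


# Hypothetical weights and inputs, chosen only to illustrate the calculation.
weights = (0.5, 0.3, 0.2)
csv_value = composite_scale_variable(250.0, 120_000.0, 4_000.0, *weights)
log_csv_value = log_composite_scale_variable(250.0, 120_000.0, 4_000.0, *weights)
print(csv_value, log_csv_value)
```

In a log-linear cost model, the logged form is convenient because the composite enters the regression as a single explanatory variable.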
CRU [20] stated that the impact of Irish Water’s specific characteristics on
its comparative efficiency was considered, including Irish Water’s
higher wage costs and its greater length of water network per connection. On
the former, a wage adjustment was made, scaling costs up or down in order to
improve comparability across companies before conducting cost benchmarking, as
adopted by Ofgem [65] (the GB energy regulator) in its electricity distribution price
control, RIIO-ED1. On the latter, CRU [20] noted that models developed by Ofwat,
CMA (the UK competition authority) and UREGNI (the Northern Ireland regulator)
tended to show that the number of connections (rather than network length) is the
main cost driver.
CRU [20] considered that the average level of operating costs was an appropriate
target for Irish Water to move to over time. CRU [20] allowed Irish Water to reduce
its costs over the RC3 period towards an efficient level of costs, as CRU considered
that an immediate reduction would likely have a negative impact on the level of
service. CRU also examined rates of improvement achieved elsewhere, in order to
assess expected rates of improvement that Irish Water could achieve.
For capital expenditure, CRU used a bottom-up/engineering approach, exam-
ining costs at a project level. As set out in Jacobs [45] and CRU [20], the
review considered Irish Water’s maturity and approach to planning, prioritizing,
and optimizing the work identified, including the need and timing of invest-
ments, to meet its obligations. Cost estimating processes were reviewed and
a sample of projects examined in more detail to confirm whether the scope
of work was appropriate, and whether the costing processes were applied as
anticipated.
In its final decision, for operating costs, CRU [22] required Irish Water to
meet a 4% per annum efficiency gain on operating costs, starting with 2% in
2020 and rising to 6% in 2024. For capital expenditure, CRU [22] imposed a 3%
per annum efficiency challenge. However, for spend that was already committed,
comprising approximately one-third of capital investment, CRU [22] did not impose
an efficiency challenge.
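
The arithmetic of the operating cost challenge can be illustrated with a short sketch. The source gives only the 2020 rate (2%), the 2024 rate (6%), and the 4% per annum figure; the linear path between the endpoints and the compounding of each year's rate on the previous year's cost level are assumptions made here for illustration.

```python
def linear_efficiency_path(start_rate, end_rate, years):
    """Assumed linear interpolation of annual efficiency rates between two endpoints."""
    step = (end_rate - start_rate) / (len(years) - 1)
    return {year: start_rate + k * step for k, year in enumerate(years)}


def cumulative_reduction(rates):
    """Cumulative cost reduction if each rate compounds on the prior year's level."""
    remaining = 1.0
    for rate in rates:
        remaining *= 1.0 - rate
    return 1.0 - remaining


path = linear_efficiency_path(0.02, 0.06, [2020, 2021, 2022, 2023, 2024])
average_rate = sum(path.values()) / len(path)
cumulative = cumulative_reduction(path.values())
print(path, average_rate, cumulative)
```

Under these assumptions the annual rates average 4% and compound to a reduction of roughly 18.5% over the five years.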
In addition to the revenue control review, the CRU has developed a performance
assessment framework for Irish Water, CER [17] and CRU [21]. This Framework
provides a structured way for the CRU to assess Irish Water’s performance over
time. Irish Water’s Performance Assessment Reports are published approximately
every year, CRU [19]. CRU also notes that the publication of reports under the
Framework incentivizes Irish Water to improve its performance and service delivery
and allows stakeholders to monitor that performance. This is in contrast to Ofwat,
the water regulator for England and Wales, which discontinued publication of its annual efficiency
reports.
The key performance indicators or metrics used in Irish Water’s Performance
Assessment are categorized under the following headings:

• Customer service
• Environmental performance
• Water supply – quality of service
• Security of water supply
• Sewerage service

Denmark

In this section, we examine the cost benchmarking approach used in the Danish
water sector – a combination of DEA and SFA.

Historical and Industry Context

The Danish drinking water sector is highly decentralized. It consists of approximately
2600 public waterworks, with around 87 municipally owned drinking water
companies, which in total comprise approximately 330 waterworks. The remainder
are privately owned, either as independent individual waterworks or collected
together into small utility companies with additional facilities, usually owned by
the consumer, Danish Water and Wastewater Association ([30], p. 8).
All companies with more than 200,000 m3 of water p.a. are regulated, while around
300 of the largest companies (providing more than 800,000 m3 water p.a.) are
subject to a benchmarking exercise covered by the Danish Water Sector Act [31],
Konkurrence- og Forbrugerstyrelsen ([47], p. 4). The Act provides rules for the
water supply companies to keep their revenues within a set limit (a revenue cap)
and stipulates rules for efficiency requirements. The current timetable is that this will
change from 2022 onwards, after which privately owned companies, which provide
less than 800,000 m3 water p.a., will be able to choose to withdraw from regulation
entirely. All companies who remain regulated will be part of the benchmarking
exercise (so the benchmarking will not just include those with more than 800,000 m3
water p.a.).
The Secretariat for Water Supply (Forsyningssekretariatet) is the Danish eco-
nomic regulator for water supply companies. The Secretariat is a part of the Danish
Competition and Consumer Authority (DCCA) under the Ministry of Industry,
Business and Financial Affairs.
Cost benchmarking was first used by the regulator to set individual requirements
for efficiency improvements in 2012. The Secretariat conducted yearly operating
expenditure (OPEX) benchmarking of Danish water companies above a size
threshold to uncover possible efficiency improvements. Companies smaller than this
threshold (providing below 800,000 m3 of water per year) are exempted, but can
voluntarily participate and need to meet a general efficiency requirement of 1.7% of
total costs.

Recent Regulatory Framework Changes

In 2015, there was cross-party agreement for a new regulatory framework for the
water sector, Danish cross-party water sector control [29]. On 10 October 2016, this
agreement was passed into law. It was amended on 28 June 2018 [35]. One of the
key aims of this new regime was to promote more efficiency in the water sector. The
overall goal was to achieve 1.3 billion DKK efficiency improvements in the Danish
water sector from 2015 to 2020 ([35], p. 2).
To allow companies to budget over a longer time horizon and reduce the
administrative burden associated with regulation, multi-year binding price ceilings
have been introduced. After a phasing-in period of one- and multi-year frameworks
from 2017 onwards, 4-year revenue caps will be introduced from 2022 onwards
for wastewater companies and from 2023 onwards for water companies ([29],
p. 3). Currently, the benchmarking is undertaken and the revenue caps adjusted
every 2 years, with water utilities benchmarked in even-numbered years and the
wastewater utilities benchmarked in odd-numbered years. However, companies
that were identified as fully efficient at the previous review have already received
4-year revenue caps.
The required efficiency improvements are based on estimates of the potential
for frontier shift and catch-up. For all water companies which process more than
800,000 m3 water per year:

• The annual frontier shift for all companies is calculated in a separate exercise
from the benchmarking exercise. It is based on the productivity performance in
the construction sector and in the market economy, using information published
by Statistics Denmark, Energi-, Forsynings- og Klimaministeriet ([35], Chaps. 5
and 6).
• The individual catch-up efficiency requirement is based on cost benchmarking
using DEA and SFA. The outcome is capped at 2% p.a., Danish cross-party water
sector control ([29], p. 2).

The regulatory framework has also been designed to encourage greater consol-
idation in the water sector and, thus, improve scale efficiency. This is achieved, in
part, by excluding water companies’ merger related counselling expenditure from
cost benchmarking, Danish cross-party water sector control ([29], p. 6).

Efficiency Benchmarking in Denmark

To establish the catch-up efficiency requirements, a regular benchmarking exercise
is conducted by the Forsyningssekretariatet. The benchmarking exercise currently
takes place every 2 years, but will occur every 4 years from 2022 onwards, Danish
cross-party water sector control ([29], p. 3).
First, the water companies report information on their underlying cost drivers,
investments, and operating costs. These reports are quality assured. For water
companies that fail to submit financial information by the deadline, the Forsyn-
ingssekretariatet will issue an efficiency target based on estimated numbers, up to a
maximum efficiency target of 2% p.a., Danish cross-party water sector control ([29],
Chap. 5, paragraph 9, Sect. 6).
Second, the models are then defined in terms of inputs and outputs:

• In 2017, DCCA introduced a total expenditure (TOTEX) benchmarking
framework. Modelled costs consist of the sum of operating expenditure, depreciation,
and financial expenses. The costs are the actual costs in 2016 less depreciation
for certain investments that are in the category of “other assets.” Investment is
categorized as “other” if it is relatively unique and there is no corresponding
output driver for these costs. However, even if excluded from the modelling,
if a company efficiency challenge is identified in the benchmarking, it will still
be applied to such costs, Konkurrence- og Forbrugerstyrelsen ([47], pp. 15–16).
TOTEX, as assessed by the DCCA, also excludes noncontrollable costs (such as
taxes) which do not have an efficiency challenge applied.
• Two output variables (cost drivers) are used to explain the differences in efficient
expenditure across water companies, Konkurrence- og Forbrugerstyrelsen ([47],
p. 10). Based on the information provided by the water companies, two network
volume measures are calculated, which are intended to describe the totality of
the companies’ activities. The cost drivers are both based on the cost of asset
replacement:
– OPEX grid volume is intended to capture OPEX.
– CAPEX grid volume is intended to capture CAPEX.

Both cost drivers are adjusted in some years based on the density of the popula-
tion that the utility serves and/or the age of the network if there is statistical evidence
to support such an adjustment, Konkurrence- og Forbrugerstyrelsen (2019a, p. 14).
The water companies are then benchmarked using these grid volume measures
and water companies’ actual costs using DEA and SFA. For DEA, an input-
oriented DEA model with constant returns to scale is used, Konkurrence- og
Forbrugerstyrelsen [48]. For SFA, a Cobb–Douglas functional form is used (i.e., the
inputs and outputs are modelled in logs) and a half-normal distribution is assumed
for the inefficiency term, Konkurrence- og Forbrugerstyrelsen ([46], p. 6).
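
For reference, the textbook forms of the two estimators named above can be written out as follows. This is a sketch of the standard specifications, not necessarily the DCCA's exact implementation. The notation is assumed here: $x_i$ is the modelled cost of company $i$, $y_{1i}$ and $y_{2i}$ are its OPEX and CAPEX grid volumes, $\lambda_j$ are peer weights, and company $0$ is the one being assessed.

```latex
% Input-oriented DEA with constant returns to scale (no constraint on the sum of weights)
\min_{\theta,\ \lambda}\ \theta
\quad \text{s.t.} \quad
\sum_{j} \lambda_j x_j \le \theta\, x_0, \qquad
\sum_{j} \lambda_j y_{rj} \ge y_{r0} \ \ (r = 1, 2), \qquad
\lambda_j \ge 0 .

% Cobb-Douglas stochastic cost frontier with half-normal inefficiency
\ln x_i = \beta_0 + \beta_1 \ln y_{1i} + \beta_2 \ln y_{2i} + v_i + u_i, \qquad
v_i \sim N(0, \sigma_v^2), \qquad
u_i \sim \left| N(0, \sigma_u^2) \right| .
```

In the DEA problem, $\theta \le 1$ is the efficiency score of the assessed company. In the SFA model, $u_i \ge 0$ raises cost above the frontier, while $v_i$ captures symmetric noise.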
As part of this benchmarking, the DCCA identifies outliers to be excluded in both
its DEA and SFA analysis, taking a qualitative approach supported by the following
procedures, Konkurrence- og Forbrugerstyrelsen ([49], p. 16):

• In DEA, the so-called “super-efficiency” criterion is used in support of DCCA’s
qualitative assessment of whether a company should be considered an outlier.
This involves estimating each company’s efficiency against a frontier that
excludes the company itself. If the super efficiency score exceeds a particular
limit, then the company is considered a potential outlier. The limit used is
q(75) + 1.5 × (q(75) − q(25)), where q(75) and q(25) are the 75th and 25th
percentiles, respectively. Furthermore, companies forming the frontier in DEA
are contacted to assess whether or not they are representative, Konkurrence- og
Forbrugerstyrelsen ([47], p. 16).
• In SFA, Cook’s Distance is used in support of DCCA’s qualitative outlier
identification. If the maximum Cook’s Distance exceeds a certain limit, given
by 4/(N − k − 1) (where N is the number of companies in the regression and k is the
number of outputs), then the company associated with the maximum Cook’s
Distance is considered a potential outlier. If removed, this procedure is then
repeated.
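
The two screening limits described above can be sketched in a few lines. The super-efficiency scores and Cook's Distances are taken as given inputs (they would come from the DEA and SFA estimations), the company names and values are hypothetical, and the quantile interpolation method is an assumption, since the source does not specify one.

```python
import statistics


def super_efficiency_limit(scores):
    """Upper limit q(75) + 1.5 * (q(75) - q(25)) on super-efficiency scores."""
    # The "inclusive" interpolation method is an assumption made for this sketch.
    q25, _median, q75 = statistics.quantiles(scores, n=4, method="inclusive")
    return q75 + 1.5 * (q75 - q25)


def cooks_distance_limit(n_companies, n_outputs):
    """Limit 4 / (N - k - 1) on the maximum Cook's Distance."""
    return 4.0 / (n_companies - n_outputs - 1)


def potential_dea_outliers(scores_by_company):
    """Companies whose super-efficiency score exceeds the limit."""
    limit = super_efficiency_limit(list(scores_by_company.values()))
    return sorted(name for name, score in scores_by_company.items() if score > limit)


def potential_sfa_outlier(cooks_by_company, n_outputs):
    """Company with the maximum Cook's Distance, if it exceeds the limit."""
    name = max(cooks_by_company, key=cooks_by_company.get)
    limit = cooks_distance_limit(len(cooks_by_company), n_outputs)
    return name if cooks_by_company[name] > limit else None


# Hypothetical inputs for illustration only.
dea_scores = {"A": 0.6, "B": 0.8, "C": 1.0, "D": 1.2, "E": 3.0}
cooks = {"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.4, "E": 2.5}
print(potential_dea_outliers(dea_scores), potential_sfa_outlier(cooks, n_outputs=2))
```

Both functions only flag potential outliers; as the text notes, the final decision rests on the DCCA's qualitative assessment, and the SFA check is repeated after each removal.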

The qualitative assessment involves contacting all companies which constitute
the frontier to assess whether they have some exceptional favorable conditions
which are not comparable with the remaining companies. In addition, if a company
only participates in one of the two possible wastewater activities (collection or
treatment), then it is not allowed to constitute the general frontier for companies
conducting both activities. When estimating the specialized collection (transporta-
tion) companies’ efficiency, all collection (transportation) companies are included
in the benchmarking.
Table 5 presents the parameter estimates from the estimated SFA models for
water for the year 2019–2020.
This establishes the efficiency frontier, which is used to calculate efficiency
scores for each company. DCCA also calculates efficiency scores for each outlier
company for both DEA and SFA. To do this, the outlier is added back into the
model and the model is reestimated to obtain the efficiency score for the outlier.
This procedure is then repeated, where each outlier is added back in to the model
one at a time.
The efficiency scores are corrected for special conditions. This correction is
intended to ensure that account is taken of companies’ individual circumstances.
This efficiency score is then used to calculate the companies’ efficient cost level.
Table 5 Parameter estimates for SFA models

                                     Unadjusted   Age-adjusted   Density-adjusted
Dependent variable                   TOTEX excluding noncontrollable costs (all three models)
Constant                             0.928*       0.0887*        0.613*
OPEX grid volume                     0.758***     0.823***       0.556***
CAPEX grid volume                    0.329***     0.264***       0.517***
Ratio of variance of inefficiency
  to variance of noise               3.22**       2.67**         2.90**
Number of outliers                   4            4              4
Source: Konkurrence- og Forbrugerstyrelsen ([50], Tables 3.1–3.3)

In deriving a final efficiency estimate for each company, the highest estimated
efficiency score (and thus most favorable result) for each company is used, a
so-called “best of two” approach, see Konkurrence- og Forbrugerstyrelsen ([51],
p. 22) and Konkurrence- og Forbrugerstyrelsen [52]. As stated above, the cost
reduction is also capped at 2% p.a., see Danish cross-party water sector control
([29], Chap. 5, paragraph 9, Sect. 6).
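
The “best of two” rule and the 2% cap can be sketched as follows. The sketch assumes that efficiency scores lie in (0, 1] with 1 on the frontier, and that the efficient cost level is the score multiplied by actual cost. The source does not state how a score is converted into an annual requirement, so that figure is supplied by the caller. All names and numbers are hypothetical.

```python
def best_of_two(dea_score, sfa_score):
    """Use the higher (most favorable) of the DEA and SFA efficiency scores."""
    return max(dea_score, sfa_score)


def efficient_cost_level(actual_cost, efficiency_score):
    """Assumed mapping: efficient cost = efficiency score x actual cost."""
    return actual_cost * efficiency_score


def capped_requirement(uncapped_annual_requirement, cap=0.02):
    """Apply the 2% p.a. cap to an annual catch-up requirement."""
    return min(uncapped_annual_requirement, cap)


# Hypothetical company: DEA score 0.85, SFA score 0.90, actual cost 100.
score = best_of_two(0.85, 0.90)
print(score, efficient_cost_level(100.0, score), capped_requirement(0.05))
```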

Italy

In this section, we examine the cost benchmarking approach developed by the Italian
water regulator, ARERA, in 2019.

Historical and Industry Context

The Italian drinking water sector is highly fragmented. Around 2000 entities provide
water services in Italy. The vast majority of these are municipalities or other public
entities serving around 17% of the Italian population. Only in a limited number
of instances are water utilities part of listed mixed ownership multi-utility groups
or owned by private investors. Both national and local authorities are
responsible for the water sector in Italy.
The national water regulatory authority (Autorità di regolazione per Energia,
Reti e Ambiente, ARERA) establishes tariff rules on the basis of a common
methodology, in place since 2012. In 2017, ARERA introduced quality of service
regulation, including rules to determine financial rewards and penalties as well as
reputational incentives related to greater transparency in quality standards.
The third regulatory period for water distribution started in 2020, with three main
objectives: fostering investments, sector consolidation, and cost efficiency.
With regards to cost efficiency, proposals were first published in 2019 to use a
formal cost benchmarking approach for the next price control period – albeit not
to determine allowed expenditure but to determine the proportion of any historical
underperformance or overspending that should be passed on to customers through
higher tariffs. It is expected that the scope of the cost benchmarking approach will
be further strengthened in subsequent regulatory reviews.

Cost Benchmarking

ARERA [5] set out the proposals for the regulatory framework for the period 2020–
2023, which considered different approaches to OPEX efficiency benchmarking.
The proposed approach included:

• Using a panel data set of 98 companies over 4 years (2014–2017), covering a
population of around 42 million, or around 70% of the population
• Using cost models with a Cobb–Douglas functional form and the following
variables:
– Inputs: operating costs
– Input prices: cost of electricity supply (PE), labor costs (PL), wholesale water
purchase cost (WS)
– Outputs: volume of water invoiced (V), length of network (L), resident
population (Pa), population equivalent (AE), availability and reliability of
measurement data (PREQ1_4), compliance with the legislation on urban
wastewater management (PREQ3), water losses (M1a)
• Using SFA models of Battese and Coelli [9], Battese and Coelli [8], and Pitt and
Lee [122]

Table 6 provides an overview of model coefficients.

The outcome of this modelling is estimated inefficiency scores ranging between
1% and 82%, with average inefficiency scores of 22–29%.
Given the lower estimated inefficiency scores from the Pitt and Lee [122] model,
ARERA [5] proposed, and confirmed in [6], using that model. The Pitt and Lee
model is also the only model in which the factor M1a (the technical quality
variable measuring linear water losses) is statistically significant, which
constituted an additional reason to consider it as a candidate for assessing
cost efficiency.
Rather than using the efficiency scores to estimate efficiency targets, ARERA
instead groups companies based on two metrics: the companies’ historical unit
OPEX and the companies’ “efficient” unit OPEX derived from the model prediction.
ARERA sorts companies into one of six classes (based on the level of historical
outturn unit OPEX) and one of three clusters (based on its econometric model’s
prediction of historical efficient unit OPEX), and determines the underperformance
sharing rate on this basis.
In other words, the model is used to establish which proportion of any underper-
formance or overspending in 2016 should be passed on to customers through higher
tariffs in “controllable” OPEX over the next regulatory period (2020–2023).

Table 6 SFA model coefficients

Variable    Battese and Coelli [9]   Battese and Coelli [8]   Pitt and Lee [122]
            model                    model                    model
PE          0.907**                  0.932***                 1.032***
PL          0.261***                 0.260***                 0.282***
WS          0.664***                 0.661***                 0.784***
V           0.210***                 0.209***                 0.226***
L           0.142***                 0.141***                 0.146***
Pa          0.510***                 0.512***                 0.469***
AE          0.118***                 0.118***                 0.142***
PREQ1_4     −0.099                   −0.101                   −0.075
PREQ3       −0.037                   −0.037                   −0.061
M1a         0.023                    0.023                    0.028*
Constant    3.381***                 3.378***                 3.277***
Source: ARERA [5]
Note: *p < 0.05; **p < 0.01; ***p < 0.001

The sharing rate is determined by the γ parameter (shown as G in Table 7) in the following formula:

$$Opex_{end}^{a} = Opex_{end}^{2018} \cdot \prod_{t=2019}^{a} \left(1 + I_t\right) - \left(1 + \gamma_{i,j}^{OP}\right) \cdot \max\{0;\ \Delta Opex\} \qquad (5)$$

For each year, a = {2020, 2021, 2022, 2023}, “controllable” OPEX $Opex_{end}^{a}$ is
determined as follows:

• $Opex_{end}^{2018}$ is the cost component defined by MTI-2, calculated to determine the 2018
tariff.
• $I_t$ is the inflation rate.
• $\Delta Opex$ represents the difference between allowed controllable OPEX in 2016
and outturn OPEX, $CO_{eff}$, in the same year.
• $\gamma_{i,j}^{OP}$ is the operator coefficient. It is determined by the operator class i (relating
to the operators’ historical outturn unit costs) and the operator cluster j (relating
to the prediction of efficient historical unit costs derived from ARERA’s SFA
model).
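
A minimal sketch of the calculation in Eq. (5) follows. It reads the formula as the 2018 cost component indexed by cumulative inflation from 2019 to year a, minus (1 + γ) times the positive part of the 2016 difference between allowed and outturn OPEX. That reading, the function name, and all numerical inputs are assumptions made for illustration.

```python
def controllable_opex(base_opex_2018, inflation_by_year, target_year, gamma, delta_opex):
    """Controllable OPEX for target_year under the assumed reading of Eq. (5)."""
    # Cumulative inflation indexation from 2019 up to and including target_year.
    indexation = 1.0
    for year in range(2019, target_year + 1):
        indexation *= 1.0 + inflation_by_year[year]
    # Only a positive 2016 difference (allowed minus outturn) triggers a deduction.
    deduction = (1.0 + gamma) * max(0.0, delta_opex)
    return base_opex_2018 * indexation - deduction


# Hypothetical inputs: base of 100, 1% inflation in 2019 and 2020, difference of 10.
inflation = {2019: 0.01, 2020: 0.01}
print(controllable_opex(100.0, inflation, 2020, gamma=-0.5, delta_opex=10.0))
print(controllable_opex(100.0, inflation, 2020, gamma=-1.0, delta_opex=10.0))
```

With γ = −100% the second term vanishes and the allowance is simply the indexed 2018 component; with γ = 0% the full positive difference is deducted.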

Table 7 shows the operator clustering matrix, which determines the gamma
parameter.
Table 7 OPEX clustering approach and sharing rate (G) setting

                               SFA-predicted    SFA-predicted    SFA-predicted
                               unit OPEX,       unit OPEX,       unit OPEX,
Sharing rate (G)               cluster j = A    cluster j = B    cluster j = C
Outturn unit OPEX, i = A       −90%             −100%            −100%
Outturn unit OPEX, i = B1      −88%             −90%             −100%
Outturn unit OPEX, i = B2      −83%             −90%             −100%
Outturn unit OPEX, i = C1      −75%             −83%             −90%
Outturn unit OPEX, i = C2      −50%             −75%             −90%
Outturn unit OPEX, i = Cover   0%               −50%             −88%
Source: ARERA [5]
Note: Unit OPEX value ranges are ranked in increasing order (i.e., cluster A is characterized by
the lowest unit OPEX)

To illustrate how this application of benchmarking works, it is instructive to
consider a few examples. If a company is in class A (the first row) and cluster C (the
rightmost column), implying that it has incurred low unit costs and the econometric
model predicted that it should have high unit costs, then the underperformance
sharing rate is 100% – i.e., the company’s customers bear the full costs of any
historical cost overrun that the company incurred through higher bills over the
regulatory period. By contrast, if a company has the highest level of outturn
unit costs (class Cover, the bottom row), but ARERA’s econometric model predicts
an average unit cost (cluster B, the middle column), then it has an underperformance
sharing rate of 50% – i.e., any cost overrun is split equally between the company and
customer bills.
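
The class and cluster lookup in Table 7 can be expressed as a simple nested mapping. The values below are transcribed from the table as printed (expressed as fractions); the dictionary and function names are illustrative assumptions.

```python
# Sharing rate (G) by operator class i (rows) and cluster j (columns), from Table 7.
SHARING_RATE = {
    "A":     {"A": -0.90, "B": -1.00, "C": -1.00},
    "B1":    {"A": -0.88, "B": -0.90, "C": -1.00},
    "B2":    {"A": -0.83, "B": -0.90, "C": -1.00},
    "C1":    {"A": -0.75, "B": -0.83, "C": -0.90},
    "C2":    {"A": -0.50, "B": -0.75, "C": -0.90},
    "Cover": {"A":  0.00, "B": -0.50, "C": -0.88},
}


def sharing_rate(operator_class, cluster):
    """Look up G for a given outturn class i and SFA-predicted cluster j."""
    return SHARING_RATE[operator_class][cluster]


# The two worked examples from the text.
print(sharing_rate("A", "C"))      # class A, cluster C
print(sharing_rate("Cover", "B"))  # class Cover, cluster B
```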
In its final decision, ARERA [6] stated that they will consider reviewing the
methodology set out above for the second half of the price control period (2022–
2023).

Areas for Further Development

In this section, we draw together some of the potential issues with the current
application of cost benchmarking within the European water sector. We restrict the
scope of this section to specific technical issues with cost or output benchmarking
in regulatory jurisdictions where it is formally applied. Even then, the issues set out
here are not exhaustive.
This section does not address justification for/arguments against the decision to
not apply formal benchmarking in countries outside the jurisdictions covered in this
chapter.

Input Definition: Modelled Expenditure

For an efficiency benchmarking assessment to provide a true reflection of a
company’s relative efficiency, the modelled costs should be defined as broadly as
possible, as there are trade-offs between cost categories – companies may focus on
OPEX solutions or CAPEX solutions, and benchmarking on one cost category may
result in unachievable targets. This creates an issue in that CAPEX can be lumpy
and thus difficult to compare across companies. Two approaches have been adopted
with regards to this issue.

• Many water regulators have used a “cash cost” approach to a total cost
measurement, with annual or average enhancement expenditure and capital
maintenance expenditure added to annual operating expenditure. For example, in
PR19, Ofwat models operating expenditure combined with capital maintenance
expenditure on an annual basis, while mostly assessing enhancement on
an average basis. This is similar in style to that adopted by other water
regulators that have followed Ofwat’s approach – i.e., CRU, UR, and WICS
(historically) – and ARERA, whose cost benchmarking models include
only OPEX. In ARERA’s case, the exclusion of capital costs means that
trade-offs between OPEX and capital costs may not be accounted for.
• Alternative cost measures, such as a capital stock approach or an economic
cost approach, could also be used (see section “Input Definition: Accounting
for the Investment Cycle”). For example, the DCCA models TOTEX as a
combination of OPEX, depreciation, and financial expenses.

Portela [123] provides further discussion on why, in a cost assessment exercise
where prices are not taken into account (which is often the case in regulatory
settings), an overall cost figure should be used.
Another key issue when determining the cost base is to ensure that there is an
appropriate correspondence between the inputs and the outputs used in the model.
A case study on these issues, based on Ofwat’s approach at PR19, is provided below.

Ofwat’s Approach to Input Definition at PR19

In its draft determinations for PR19, Ofwat extended its cost definitions
from BOTEX (OPEX and capital maintenance) to include some enhancement
expenditure areas, including growth expenditure, calling the new definition
BOTEX plus. Ofwat argued that growth expenditure was more “routine”
than other enhancement expenditure. In their responses to the draft determina-
tions, a number of companies considered that Ofwat’s BOTEX plus modelling
approach failed to properly account for the added enhancement expenditure
areas, see, for example, Anglian Water [2].
In its final determinations, Ofwat ([96], p. 20) “accept[ed] that the inte-
grated models may suffer from missing growth variables and that may lead
to the base econometric models only funding the average historical growth
rate across the industry.” As such, Ofwat slightly amended its approach,
providing additional cost allowances for those companies in high growth
regions and lower cost allowances for those companies in low growth regions.
Nevertheless, Ofwat’s BOTEX plus approach might still be susceptible to
some of the issues identified by companies, as the additional allowance is
based on unit growth costs of the upper quartile BOTEX plus companies.
Two alternative approaches may be worth pursuing going forward.

• Ofwat’s BOTEX plus models could have potentially been improved by
considering growth related cost drivers and reformulating the general
model specification prior to testing down. That is, a renewed model devel-
opment exercise could be undertaken. Indeed, such models were submitted
by some companies as part of Ofwat’s cost assessment consultation [85] –
for example, see some of the models submitted by South West Water,
Ofwat ([88], pp. 21–22; pp. 31–32; pp. 62–64; pp. 73–75) and South
West Water ([134], p. 1). However, the authors understand that such
extensions were harder to develop using the data set available for the final
determinations than at the modelling consultation stage.
• Alternatively, growth expenditure could be modelled separately, as per
Ofwat’s approach with other enhancement cost areas. That is, growth
expenditure could be modelled in a similar way to Ofwat’s growth models
in IAP (see Ofwat [97]), but using more sophisticated models as its growth
models at IAP included only one cost driver (number of new connections).
For example, see Anglian Water [4] and Anglian Water [3]. This approach
does, however, assume that enhancement costs are separable from base
expenditure.

This issue is yet to be resolved at the time of writing. Following the
PR19 final determinations, four companies appealed Ofwat’s decision. In
their statements of case to the CMA, which set out their key arguments
against Ofwat’s final determinations, several of these companies described
issues with Ofwat’s approach that potentially understated a higher growth
cost, see Oxera [120], Anglian Water [4], and Bristol Water [12]. In its
provisional findings (CMA [24]), the CMA agreed that Ofwat’s approach
was “imperfect” and “has some limitations,” but provisionally concluded that
Ofwat’s approach was a “sensible and pragmatic approach.”

Regulators must manage the trade-off between including more of the cost base
in a single benchmarking model (rather than separate models) – making it more
challenging to account for all the relevant drivers and to appropriately measure
capital expenditure – and including a smaller proportion of the cost base (such as
OPEX only benchmarking) – potentially failing to account for cost trade-offs and
cost allocation issues. An issue that regulators face in applying empirical methods
from production economics to regulated companies is the scope for regulatory
approaches to bias company behavior; for instance, benchmarking only OPEX can
lead to companies seeking CAPEX solutions even where this is not necessarily the
most cost-effective solution. This has resulted in several regulators using broader
cost definitions.

Input Definition: Accounting for the Investment Cycle

A related input definition concern in the water sector is that a significant propor-
tion of the cost base funds the construction and maintenance of very long-lived
infrastructure assets. Indepen [43] identified that water and wastewater services are
among the most capital-intensive sectors in the UK economy. This is likely to also
be the case outside the UK. Incidentally, often the justification for regulating such
companies (and, implicitly, the use of benchmarking) is that it would be uneconomic
for another company to construct similar assets to serve a particular area, making it
uneconomic for competitors to enter the market.
This creates the following problems for benchmarking such companies:

• The drivers of capital expenditure are often difficult to measure or at least
measure consistently (for example, the amount of remaining capacity in the water
network, or the condition of the underlying asset base).
• The average asset life of the infrastructure typically significantly exceeds the
period of data available to conduct the analysis (stretching to over 100 years for
certain types of pipe) – implying that not all companies need to undertake the
same level of renewal activity over a typical analysis period.
• A consequential outcome of underinvestment is an increase in the probability
of low likelihood catastrophic events (due to a reduction in the underlying asset
condition), which are challenging to measure directly as any impact may not
occur for many years and is unlikely to have been observed in the analysis
period.

In the context of assessing Ofwat’s PR14 models as part of the Bristol Water
appeal, CMA [23] noted that companies’ investment requirements vary over time –
and that these requirements are not captured by any explanatory variables to
control for these differences. Therefore, differences between companies in total cash
expenditure may be reflective of differences in their investment requirements and
not efficiency. This criticism remains valid for Ofwat’s models at PR19. In a recent
publication on asset health, Ofwat [83], no mention is made of the relationship
between asset health and cost benchmarking assessments, although a key focus of
the document is how to ensure companies measure asset health on a consistent basis.
Indeed, Ofwat has more recently initiated an industry project on asset resilience (see
Ofwat [107]).
In Scotland, the regulator has stated that a key reason for moving away from setting
hard budget constraints is the restriction this places on the ability of the regulated
water company to take a long-term approach to managing asset replacement (WICS [160]).
In particular, the concern of the regulator is that, by setting a fixed price cap, the
company may be incentivized to take forward the interventions that require the
smallest cash outlay in the short run, rather than taking a long-term approach, WICS
[160]. This was a key reason for WICS to stop using benchmarking to determine
an efficient cost level, in contrast to other regulatory jurisdictions set out in this
chapter.
This concern relating to the incentives that price cap regulatory regimes create for
investment is also raised in Ofwat ([83], p. 68), which mentions potential problems
around short term thinking imposed by the 5-year regulatory cycle, in particular,
“ . . . we were a little concerned to hear that some considered the 5 year planning
cycle to be a potential barrier to innovation, in the belief that it meant that investment
returns need to be recouped in the 5 year period and this does not encourage
taking a risk on new technology. This viewpoint requires further understanding,
consideration and resolution.”
Potential benchmarking solutions to address this issue are set out below.

• Using a capital stock or economic cost definition for modelled costs rather than
cash costs (as per DCCA’s approach, see Konkurrence- og Forbrugerstyrelsen
[47]), or cross-checking the efficiency assessment of cash costs with assessments
derived from asset values and the implied cost to replace the existing asset base.
Saal et al. ([129], p. 49, pp. 78–79, p. 98, p. 114) criticized Ofwat’s cash cost
approach, stating, “all OPEX is a cost while CAPEX, regardless of whether it is
for enhancements or maintenance, is investment which contributes to a capital
stock. This capital stock then has associated depreciation and capital financing
costs.”
• Using a cash cost approach but extending the timeframe of the dataset used for
analysis or carefully considering what benchmarking period is appropriate (i.e.,
avoiding potential troughs in expenditure for companies that define the frontier).
On this, Ofwat [105] state “We agree that capital maintenance is ‘lumpy’ and in
some periods companies may need to spend more than in others. To address the
issue of lumpy expenditure and ensure that we are setting an efficient allowance
for the long term, we use eight years of data, which is the longest historical data
set we have ever used in models, to ensure that our input data includes a wide
range of company peaks, troughs and atypical lumps.”
• Including measures of asset health or risk and/or measures of company activity –
such as pipes replaced as in Ofwat ([88], pp. 52–53) – to address asset health
as cost drivers and to ensure that companies are incentivized to maintain a high
level of asset quality, while being careful to avoid perverse incentives of gold
plating. Variables previously considered in capital maintenance modelling in
England and Wales can be found in Ofwat [71] – these included measures of:
the size of the asset base (such as modern equivalent asset value); asset type
(such as large bore water mains); and asset condition (such as the proportion
of assets by condition grade). The incentive structure around such drivers needs
to be considered carefully to avoid “gold plating” – i.e., companies delivering a
program of asset replacement beyond that desired by stakeholders. Indeed, the
CMA, in its redetermination of PR19, was “concerned that [such] measures are
within the control of a company” (CMA [24], p. 136). Age of the asset base has,
however, been used, for example, the DCCA adjust their CAPEX grid volume
measure based on the age of the network and Anglian Water [1] included age in
some of their cost models. With regards to asset age, Ofwat has further argued
that “asset age does not directly correlate to asset performance or service to
customers in the water sector” and provide an example of companies claiming
higher costs for relatively new assets due to “plastic pipes installed in the 1960s
and 1970s” (see Ofwat [105], p. 6). Ultimately, in the case of the redetermination
of PR19, the CMA provisionally concluded that “there should be no systematic
underfunding in the long run” and “provisionally decide[d] not to adjust [its]
approach to setting capital maintenance allowances” (see CMA [24], p. 136 and
139).
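
The contrast between a cash cost definition and a capital stock (economic cost) definition can be illustrated with a minimal sketch. All figures below, including the depreciation rate and cost of capital, are hypothetical, and the perpetual-inventory rule used to roll the capital stock forward is an assumption made for illustration rather than any regulator's method.

```python
# Hypothetical figures for one company (all in GBP m); not taken from any regulator.
opex = [50.0, 51.0, 50.0, 52.0, 51.0]
capex = [10.0, 80.0, 12.0, 9.0, 11.0]   # lumpy: one large renewal in year 2
opening_capital_stock = 600.0
depreciation_rate = 0.03                # assumed
cost_of_capital = 0.04                  # assumed


def cash_costs(opex, capex):
    """Cash cost definition: OPEX plus CAPEX in the year it is spent."""
    return [o + c for o, c in zip(opex, capex)]


def economic_costs(opex, capex, k0, dep, r):
    """Economic cost definition: OPEX plus depreciation and financing cost
    on the opening capital stock, with CAPEX added to the stock."""
    costs, k = [], k0
    for o, c in zip(opex, capex):
        costs.append(o + (dep + r) * k)
        k = k * (1.0 - dep) + c         # perpetual-inventory update (assumed)
    return costs


def spread(xs):
    """Range between the highest and lowest annual cost."""
    return max(xs) - min(xs)


cash = cash_costs(opex, capex)
econ = economic_costs(opex, capex, opening_capital_stock,
                      depreciation_rate, cost_of_capital)

print("cash costs:    ", [round(x, 1) for x in cash])
print("economic costs:", [round(x, 1) for x in econ])
print("range of cash costs:    ", round(spread(cash), 1))
print("range of economic costs:", round(spread(econ), 1))
```

In this illustrative data the cash series spikes in the renewal year, whereas the economic cost series moves only gradually, which is why the choice of cost definition matters when the benchmarking period is short relative to asset lives.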

Key to any approach designed to ensure that companies have enough money to
fund costs associated with asset replacement will be to ensure that customers do
not fund the consequences of historical underinvestment.
Cost benchmarking approaches to set capital expenditure allowances remain
more controversial than in other areas, such as operating expenditure. Given the
difficulties in capturing all drivers of capital maintenance expenditure within a
DEA or econometric model, regulators have often considered other approaches
to either set a capital maintenance allowance or to cross-check the outcome from
econometric models; examples include the Common Framework (UKWIR
[145] and Ofwat [73]), the Asset Management Assessment or AMA [27], and broad
equivalence [154].
One development that could potentially increase the robustness of
benchmarking capital maintenance is the development of more sophisticated indices
of risk, such as that being developed by Ofgem and the industry for assessment of
gas distribution costs (see Ofgem [66]). Ofwat’s recently initiated project on asset
resilience (see Ofwat [107]) may also result in similar measures being developed in
the England and Wales water industry. The use of such drivers in cost benchmarking
models could address several of the challenges set out in this section.

Output Definition: Multiple Outputs and Cost–Service Trade-Offs

Once appropriate inputs have been defined, there are two issues that regulators have
faced in including outputs (including quality) in their models, particularly given
constraints imposed by small datasets and limited data availability:

• Controlling for multiple outputs, such as the number of customers served, the
number of connected properties, the amount of water delivered, the sewage load
treated, and the size of network (capturing the distance water or sewage needs to be
pumped or transported).
• Controlling for quality of service.

With regard to controlling for multiple outputs, regulators have historically
employed a range of approaches, including:

• Controlling for multiple outputs in the same model, such as ARERA’s inclusion
of the volume of water invoiced, the length of network, resident population, and
population equivalent in its models; DCCA’s use of aggregate measures such as
OPEX or CAPEX grid volumes; or the CRU’s and UK energy regulators’ use of
composite scale variables (see NERA [60], Ofgem [64] and Ofgem [65]).
• Using a single key output as a scale driver, and using other normalized cost
drivers to pick up the impact of delivering other outputs (such as the PR19 approach
in Ofwat [96] of controlling for sewer length as the main scale driver and the
number of properties per sewer length to capture the higher costs associated with
more properties to collect sewage from for a given sewer length).
• Ofwat’s PR19 approach of modelling across the value chain and controlling for
the most relevant driver at each level of the value chain, for instance using sewer
length as the most relevant variable for sewer collection and load as the most
relevant variable for sewer treatment (see Ofwat [96]).
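
The second approach can be written out to show how it relates to the first. As an illustrative sketch (the symbols $C$, $L$, $P$, $\alpha$, $\beta$, and $\gamma$ are notation introduced here, not taken from Ofwat's models), let $C$ be cost, $L$ sewer length, and $P$ the number of properties:

```latex
\ln C = \alpha + \beta \ln L + \gamma \ln\!\left(\frac{P}{L}\right) + \varepsilon
      = \alpha + (\beta - \gamma)\,\ln L + \gamma \ln P + \varepsilon .
```

In a log-linear specification, the normalized driver is therefore a reparameterization of a model that includes both outputs. Under this notation, $\beta$ is the elasticity of cost with respect to sewer length holding properties per length constant, while $\beta - \gamma$ is the elasticity with respect to sewer length holding the number of properties constant.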

As companies generally need to increase inputs in order to improve service
quality and given that quality of service is a key outcome for consumers, the
environment, and other stakeholders, it is important that regulators account for any
differences in quality of service and outputs delivered in determining the regulatory
settlement. This has been considered within regulatory frameworks either within the
cost benchmarking assessment itself or separately, for example, through financial
incentives for companies to deliver a relatively high level of service quality. The
latter has tended to be the approach adopted by regulators historically.

• In PR19, while some quality related measures were included, Ofwat’s cost
models primarily excluded quality of service measures, which were determined
and incentivized separately (see below).
• Ireland and Northern Ireland adopt largely the same framework, with quality of
service considered outside of benchmarking.
• In Scotland, when WICS used benchmarking to set price control limits, it used
a broadly similar framework to Ofwat. As set out above, quality of service
performance was benchmarked against that of companies in England and Wales,
but this was separate from cost assessment.
• In Denmark, quality of service measures are currently not directly controlled for
in the cost benchmarking framework, although the DCCA is considering accounting
for quality in future modelling (see Konkurrence- og Forbrugerstyrelsen
[53]).
• In Italy, the regulator has identified compliance with the legislation on urban
wastewater management and water losses as two quality of service related output
variables it will aim to control for (see ARERA [5]). ARERA [5] also used a driver
to capture the availability and reliability of measurement data as an additional
control.

A case study on these issues, based on Ofwat’s approach at PR19, is provided
below.

Ofwat’s Approach to Accounting for Quality of Service at PR19

In the England and Wales water sector, a number of water companies consid-
ered that Ofwat’s separate approaches to cost assessment and service outcome
(or performance commitments) in PR19 ignored the trade-offs between costs
and performance – see, for example, Yorkshire Water [162]. Companies
argued that more output and service performance measures, and the drivers
of these (e.g., legislative changes), should be included within the cost models,
as the exclusion of such measures can result in biased efficiency estimates.
While some output measures were included by Ofwat, such as consent levels
(i.e., the quality of discharges legally allowed following treatment), other
important quality measures, required or targeted to improve substantially over
AMP7 (Asset Management Period 7, over 2020/2021–2024/2025), were not
included. Some modelling including additional service performance measures
and the drivers of these was submitted by a number of companies – for
example, see Oxera [119], NERA [61], and Yorkshire Water [163].
Given this, Ofwat ([101], pp. 36–42) further examined this issue for its
final determinations, but generally maintained its approach of separately
accounting for service performance and cost performance. However, Ofwat
[96] did make some adjustments to its modelling and tried to account for the
additional costs of: phosphorus removal, by including legislative changes in
some of its modelling, and adjusted its allowance for Yorkshire Water [102];
and leakage, by including the distance from the upper quartile 2024–2025
leakage target in the model, and adjusted its allowance for Anglian Water
(Ofwat [96], p. 35; Ofwat [103]; PwC [124]; and Ofwat [105]).
Nevertheless, the issue remains important going forward and has continued
to be debated as part of the appeals of PR19 – see Oxera [120], Anglian Water
[4], Bristol Water [12], Northumbrian Water [62], and Yorkshire Water [163].
In its redetermination of PR19, the CMA provisionally concluded that quality
of service measures were “substantially under management control” and their
inclusion in the cost models “is likely to lead to endogeneity problems and
thus biased coefficient estimates” (see CMA [24], pp. 125–131). However,
this ignores the omitted variable bias problem caused by not including such
variables.

Clearly, accounting for quality of service is a key consideration for cost bench-
marking exercises and, while endogeneity issues require careful consideration,
ignoring quality of service can bias the results. In terms of further development, the
large number of potential outputs and service performance measures for the water
industry suggests that it may not be possible to estimate a model that captures all of
these measures separately. It may, therefore, be necessary to:

• Focus on capturing only the key service performance measures, see ARERA [5]
• Create composite output measures (similar in concept to DCCA’s aggregate
asset-based output measures, see Konkurrence- og Forbrugerstyrelsen [47];
CRU’s use of a composite scale variable, see NERA [60]; or Ofwat’s previous
use of service incentive mechanism (SIM) scores, see Ofwat [80])
• Use such models to quantify company-specific adjustments outside of the main
industry-wide models (for example, Ofwat [96])
• Adjust costs prior to cost benchmarking based on willingness to pay evidence
(see, for example, Energiavirasto [36] for such an approach in the energy sector)

Benchmarking: Input Requirement Functions

In developing cost benchmarking models, regulators have used both input require-
ment functions and cost functions.
In Denmark, the regulator’s model consists of one input (total expenditure)
and two “outputs.” These two output measures are both asset-based measures of
network volume, with some adjustment for the density of the population and the age
of the network. The two output measures are not outputs of the production process.
As such, the model captures the costs expected for a particular size of network.
In England and Wales, Ofwat’s PR19 models, while including output measures,
did not include any input prices; as such, the models were not cost functions but
input requirement functions. Ofwat did, however, examine the impact of regional
wages but found these not to be significant cost drivers. Ofwat ([94], p. 15) stated:
“We have consistently found that the regional wage level is not a robust cost driver.
In many specification the variable has very low predictive power, and sometimes
it showed a counterintuitive negative sign (albeit statistically insignificant) [ . . . ]
We recognise that variation in labour cost can have an impact on costs although
companies can exercise control to mitigate this impact. We consider also that the
inclusion of a density variable, and a square of density, in our models, capture the
effect of regional wage as the two are correlated.”
Ofwat’s models up to PR09 also did not include input prices (though differences
in regional wages were accounted for separately). As such, other water
regulators that have historically followed a similar approach to Ofwat – for example,
CRU, UR, WICS (historically) – have also used input requirement functions, as has
the DCCA.
In contrast, in PR14, Ofwat included regional wages in its cost models, although
only one input price was captured. Its application was subsequently criticized
in CMA [23] as the estimated coefficient for regional wages varied substantially
across Ofwat’s PR14 models, and, in some cases, a coefficient greater than one had
been estimated. The CMA’s own models developed as part of the appeal of PR14
included regional wages, but any specifications that resulted in negative coefficients
or coefficients greater than 1 were dropped.
In Italy, ARERA’s model has three input prices (cost of electricity supply, labor
costs, wholesale water purchase cost) and four outputs, plus a number of compliance
dummies. The models are thus closer, conceptually, to cost functions. However, the
unit costs that are included as proxies for input prices are endogenously (rather
than exogenously) determined variables. For example, the labor price variable is
based on each company’s own employment costs. As such, any inefficiency in
these variables (e.g., paying above the market wage rate) is not captured in the final
estimated inefficiency. Instead, regional market prices, i.e., the regional wage rate
using government statistics, should be used. In addition, it appears that the provision
of reliable data is included in the modelling by way of an incentive effect to improve
the provision of such data. However, it is unclear if the impact of such variables can
be appropriately picked up in an econometric cost model. Such factors may be better
incentivized outside the cost benchmarking.
This area warrants further examination. While a possible extension of the input
requirement function is relatively straightforward in principle, experience so far,
such as that of Ofwat at PR19, has demonstrated that this may be challenging in
practice.
If there is minimal variation in input prices across regions, then this is possibly
a less significant issue in practice. It may be that some input prices, such as raw
material prices or capital input prices, do not greatly vary across regions within a
country. Alternatively, where there is some variation in input prices, such as regional
wages, the variation may still be limited. For example, London may have relatively
high wages compared to the rest of England and Wales, but there may not be
much variation across the remaining regions. It is also possible that the variation in
regional wages may be correlated with other factors such as density, and so already
partly captured through existing cost drivers.
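The distinction drawn in this section can be sketched in log-linear form. The notation is introduced here purely for illustration: $C_i$ is modelled cost for company $i$, $y_{ki}$ are outputs or other cost drivers, and $w_{ji}$ are exogenous input prices.

```latex
\begin{aligned}
\text{Input requirement function:}\quad
  & \ln C_i = \alpha + \sum_k \beta_k \ln y_{ki} + \varepsilon_i \\
\text{Cost function:}\quad
  & \ln C_i = \alpha + \sum_k \beta_k \ln y_{ki} + \sum_j \gamma_j \ln w_{ji} + \varepsilon_i,
  \qquad \sum_j \gamma_j = 1 .
\end{aligned}
```

The restriction $\sum_j \gamma_j = 1$ imposes linear homogeneity of cost in input prices. When only one input price, such as the regional wage, is included and the remaining prices are assumed not to vary across companies, its coefficient in this form can be read as the cost share of that input and would be expected to lie between 0 and 1, which is in line with the CMA dropping specifications with negative coefficients or coefficients greater than 1.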

Benchmarking: Functional Form

Ofwat’s models in PR19 were simpler than those used in PR14, with log linear
functional forms for all cost drivers other than density, which was also included
in squared form to capture a “U-shape” effect of density/sparsity. Other water
regulators have similarly used relatively simple log-linear functional forms – with
CRU, UR, and WICS (historically) following Ofwat’s approach pre PR14 and
ARERA and DCCA (for its SFA models) using a Cobb–Douglas functional form.
While Ofwat’s approach might overcome some of the criticisms in CMA [23],
it may be that the relationships are more complex and that it is most likely that
the relationships are not the same for all companies. Criticisms of such possible
over-simplicity were made in Saal and Nieswand [126]. While this might not make
a significant difference for the industry as a whole, it could result in an excessive
challenge or a windfall gain for individual companies. For example, Portsmouth
Water was estimated by Ofwat to be 16% more efficient than its benchmark in PR19
(though Ofwat capped this at 10% – see Ofwat [96], p. 13). This could be the result
of an inappropriate model specification for Portsmouth Water and/or the impact of
Ofwat’s choice of an ad hoc blanket benchmark (see below).
Similarly, the models in ARERA [5] appear to estimate close to constant returns
to scale, but with such a wide variation in company scale, a more flexible functional
form may be more appropriate to better capture the relationship between company
size and cost.
A potential solution to this problem is to estimate more flexible functional forms,
such as the translog model, and then test whether the estimated company-specific
elasticities are aligned with economic and engineering insight for each company. As
discussed above, the semi-translog models used in PR14 did not always align with
economic and engineering insight. This was shown to be the case in the appeal and
subsequently amended by the CMA.
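As a minimal sketch of the check described above, the following computes company-specific elasticities from a two-driver translog model and flags values outside a plausibility range. The coefficients, the company data, and the range itself are hypothetical and chosen only for illustration; they are not estimates from any regulatory data set.

```python
import math

# Hypothetical translog coefficients (not estimated from any real data set):
# ln C = a + b_y*ln y + b_d*ln d
#        + 0.5*b_yy*(ln y)^2 + 0.5*b_dd*(ln d)^2 + b_yd*ln y*ln d
coef = {"b_y": 0.90, "b_d": -0.40, "b_yy": 0.04, "b_dd": 0.30, "b_yd": -0.05}

# Hypothetical companies: (name, scale driver y, density d), both divided by
# the sample mean so that the "average" company has y = d = 1.
companies = [("A", 0.2, 0.5), ("B", 1.0, 1.0), ("C", 5.0, 2.5)]


def elasticities(y, d, c):
    """Derivatives of ln C with respect to ln y and ln d at (y, d)."""
    ly, ld = math.log(y), math.log(d)
    e_y = c["b_y"] + c["b_yy"] * ly + c["b_yd"] * ld
    e_d = c["b_d"] + c["b_dd"] * ld + c["b_yd"] * ly
    return e_y, e_d


def check(companies, c, lo=0.0, hi=1.5):
    """Flag scale elasticities outside an assumed plausibility range (lo, hi)."""
    rows = []
    for name, y, d in companies:
        e_y, e_d = elasticities(y, d, c)
        rows.append((name, e_y, e_d, lo < e_y < hi))
    return rows


results = check(companies, coef)
for name, e_y, e_d, ok in results:
    print(f"{name}: scale elasticity={e_y:.3f}, "
          f"density elasticity={e_d:.3f}, plausible={ok}")
```

Because the elasticities vary with each company's own scale and density, a specification can look sensible at the sample mean and still imply implausible relationships for companies far from it, which is the reason for testing company by company.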
Non-parametric or semi-parametric econometric approaches or DEA could also
be investigated. Indeed, as well as SFA, the DCCA has used DEA to estimate cost
efficiency. While DEA is potentially more flexible, the DCCA used an input-oriented DEA
model with constant returns to scale (see Konkurrence- og Forbrugerstyrelsen [50],
p. 8).
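To make the mechanics of an input-oriented, constant-returns-to-scale DEA model with one input and two outputs concrete, a minimal sketch follows. The data are hypothetical and this is not the DCCA's implementation. With a single input, the multiplier problem collapses to a two-variable linear program, which is solved here by enumerating vertices rather than with an LP solver; that shortcut is an illustrative choice suited only to this small case and assumes strictly positive inputs and outputs.

```python
from itertools import combinations


def dea_crs_one_input_two_outputs(x, y1, y2):
    """Input-oriented CRS DEA scores for one input and two outputs.

    With a single input, the multiplier LP for company o reduces to
        max  w1*r1_o + w2*r2_o
        s.t. w1*r1_j + w2*r2_j <= 1 for every company j,  w1, w2 >= 0,
    where r_j is outputs per unit of input. A two-variable LP attains its
    optimum at a vertex, so the vertices are simply enumerated.
    """
    r = [(a / c, b / c) for a, b, c in zip(y1, y2, x)]
    # Each line is (a1, a2, b), meaning a1*w1 + a2*w2 = b.
    lines = [(r1, r2, 1.0) for r1, r2 in r] + [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    tol = 1e-9
    vertices = []
    for (a1, a2, b), (c1, c2, d) in combinations(lines, 2):
        det = a1 * c2 - a2 * c1
        if abs(det) < tol:
            continue  # parallel lines have no unique intersection
        w1 = (b * c2 - a2 * d) / det   # Cramer's rule
        w2 = (a1 * d - b * c1) / det
        feasible = w1 >= -tol and w2 >= -tol and all(
            w1 * r1 + w2 * r2 <= 1.0 + tol for r1, r2 in r)
        if feasible:
            vertices.append((w1, w2))
    return [max(w1 * r1 + w2 * r2 for w1, w2 in vertices) for r1, r2 in r]


# Hypothetical data: one input (total expenditure) and two output measures.
x = [100.0, 120.0, 80.0, 150.0]
y1 = [50.0, 40.0, 20.0, 54.0]
y2 = [10.0, 30.0, 24.0, 27.0]

scores = dea_crs_one_input_two_outputs(x, y1, y2)
for name, s in zip("ABCD", scores):
    print(f"company {name}: efficiency score = {s:.3f}")
```

A score of 1 places a company on the frontier; a score below 1 is the proportion of its input that a scaled combination of peers would need to produce the same outputs. Under constant returns to scale, peers of very different size can be scaled up or down to form that comparison.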

Forecasting Efficient Costs: Identifying “Efficient” Cost Levels, While Accounting for Error and Heterogeneity, and Alternative Estimation Approaches

Having developed the cost models, a critical step is then to use the models to
estimate each regulated company’s efficient cost level.
With regards to this step, approaches that do not use SFA or DEA (including
those used by Ofwat, CRU, UR, and WICS (historically)) have a number of related
issues:

• Ad hoc blanket benchmark identification – these regulators have focused on
defining the frontier through the use of an upper quartile adjustment (or, in
Ofwat’s final determinations at PR19, the third or fourth ranked company),
applied to either OLS or RE models. However, this adjustment is a subjective
judgment; the benchmark could equally have been drawn, for example, at the
average. Indeed, CRU [20] considered that the average level of operating costs
was an appropriate target for Irish Water. Similarly, in Bristol Water’s appeal
of Ofwat’s PR14 final determination, the CMA ([23], p. 117) chose an average
benchmark, in part, because it was “concerned that an efficiency benchmark
based on an upper quartile efficiency concept would be overly demanding if
applied to the results of the econometric models that we used. This was a
judgment in the light of the issues we had identified both from our review of
Ofwat’s econometric models and from our development of alternative models.”
Ad hoc blanket adjustments, like the upper quartile, also assume that the same
degree of noise is present for each company (although this point is most pertinent
to Ofwat as the other examples relate to the identification of the efficient cost
level for only one company). In contrast, it is well known in econometrics that
the accuracy of the model prediction decreases as you move farther out from
the central data (i.e., noise is company specific) and the accuracy of the model
prediction decreases as the sample size decreases.
• Limited estimation approaches – these regulators relied on either OLS or RE
outcomes and did not always use alternative estimation approaches (such as DEA
or SFA) that might have resulted in different outcomes for individual companies.
An interesting contrast can be made with the German energy sector, in which the
legislation underpinning the regulatory framework states that both DEA and SFA
should be used to estimate efficiency (see [110], Anreizregulierungsverordnung,
AregV, Sect. 12). The best outcome from both the DEA and SFA models is
then used to set companies’ efficiency scores, subject to a minimum 60%. This
benefit of doubt approach is similar to that also used by DCCA. Similarly, in
PR94, Ofwat used DEA as a cross-check on the outcomes from its econometric
modelling. The MMC ([59], p. 415) noted that “in most cases the [DEA-based]
results were similar to those of the regressions. If they were significantly better,
[Ofwat] moved the company up one band” (see also Thanassoulis [141] and
Thanassoulis [142]). In addition, some water sector regulators have mitigated
against the reliance on one top-down estimation approach by using bottom-up
assessments to cross-check the outcomes from the econometric analysis.
• Heterogeneity – when estimating inefficiency using the results from the RE
models, Ofwat ([93], p. 11) implicitly assumed that, with the exception of the
subsequent upper quartile adjustment, all of the estimated company-specific
effect is due to inefficiency. However, given the use of relatively parsimonious
models, it is likely that the company-specific effect is partly due to company
heterogeneity/specific drivers not included in the model of the expenditure, and
not inefficiency.
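
The blanket nature of an upper quartile adjustment can be seen in a minimal sketch. The cost figures are hypothetical, and the calculation (scaling every company's predicted cost by the first quartile of the actual-to-predicted cost ratios) is a simplified illustration of the general idea rather than a reproduction of any regulator's exact procedure.

```python
import statistics

# Hypothetical actual and model-predicted costs (GBP m) for eight companies.
actual = [102.0, 95.0, 130.0, 88.0, 110.0, 99.0, 121.0, 105.0]
predicted = [100.0, 100.0, 120.0, 95.0, 105.0, 104.0, 110.0, 100.0]


def cost_ratios(actual, predicted):
    """Actual over predicted cost; below 1 means cheaper than the model predicts."""
    return [a / p for a, p in zip(actual, predicted)]


def upper_quartile_allowances(actual, predicted):
    """Scale every company's predicted cost by the upper-quartile cost ratio."""
    ratios = cost_ratios(actual, predicted)
    # The first quartile of the ratios marks the upper quartile of efficiency.
    uq_ratio = statistics.quantiles(ratios, n=4, method="inclusive")[0]
    return uq_ratio, [p * uq_ratio for p in predicted]


def average_allowances(actual, predicted):
    """Alternative benchmark drawn at the average cost ratio."""
    avg_ratio = statistics.mean(cost_ratios(actual, predicted))
    return avg_ratio, [p * avg_ratio for p in predicted]


uq_ratio, uq_allow = upper_quartile_allowances(actual, predicted)
avg_ratio, avg_allow = average_allowances(actual, predicted)
print(f"upper quartile ratio: {uq_ratio:.4f}")
print(f"average ratio:        {avg_ratio:.4f}")
```

The same scalar is applied to every company regardless of how precisely the model predicts that company's costs, and moving the benchmark from the upper quartile to the average changes every allowance by the same proportion. This is the sense in which the choice is a judgment rather than an estimate.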

With regards to heterogeneity, these company-specific effects could be controlled
for by removing the company-specific effect from the estimated inefficiency (as per
a true RE model approach as in Greene [42], pp. 7–32) or certain panel SFA models
could be used to control for these effects. An alternative approach, deployed by both
DCCA and Ofwat, is to adjust for special characteristics outside of the modelling to
try to account for companies’ individual circumstances.
For example, heterogeneity is particularly pronounced in Italy – the 98 compa-
nies in ARERA’s SFA modelling are very diverse – for example, their operating
costs vary from €1.6 m to €340 m in 2017, and their length of network varies from
151 km to 19,783 km. As such, accounting for this heterogeneity is particularly
important in this instance. However, other than the included cost drivers, ARERA’s
models do not allow for company heterogeneity and so can conflate inefficiency
with company specific factors and thus may overestimate inefficiency. This could
be mitigated by using models that capture company heterogeneity – such models
include Kumbhakar and Heshmati [54], Greene [42], Wang and Ho [150], Colombi
et al. [15], and Kumbhakar et al. [56].
The DCCA’s SFA model uses cross-sectional data, so the SFA models that
can control for heterogeneity are not applicable. Instead, the DCCA controls for
heterogeneity by examining whether each company is an outlier (see Konkurrence-
og Forbrugerstyrelsen [47]). As discussed above, in addition to its qualitative assess-
ment, the DCCA undertakes a quantitative assessment. For the DEA modelling,
potential outliers are identified using “super-efficiency,” while for SFA, Cook’s
Distance is used to identify potential outliers. As part of the qualitative assess-
ment, companies are contacted to assess whether or not they are representative.
Adjustments are also made for special characteristics outside of the modelling,
Konkurrence- og Forbrugerstyrelsen ([47], p. 16).
In addition to accounting for heterogeneity, SFA could be used to account for
modelling noise, providing company-specific separation of noise and inefficiency,
and confidence intervals around the inefficiency estimates. While distributional
assumptions are required, SFA avoids the need for the identification of an ad hoc
blanket benchmark.
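The company-specific separation of noise and inefficiency can be illustrated with a minimal sketch of the conditional mean estimator of Jondrow et al. for a normal/half-normal cost frontier, in which the composed residual is the sum of noise $v$ and inefficiency $u \ge 0$. The residuals and the variance parameters below are hypothetical; in practice the parameters would be estimated by maximum likelihood, and other distributional assumptions lead to different formulas.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def jlms_cost_inefficiency(eps, sigma_v, sigma_u):
    """E[u | eps] for a normal/half-normal cost frontier with eps = v + u."""
    sigma = math.sqrt(sigma_v ** 2 + sigma_u ** 2)
    sigma_star = sigma_u * sigma_v / sigma
    z = eps * (sigma_u / sigma_v) / sigma   # conditional mean over conditional s.d.
    return sigma_star * (z + norm_pdf(z) / norm_cdf(z))


# Hypothetical composed residuals (log cost units) and assumed parameters.
residuals = [-0.10, 0.00, 0.05, 0.20]
sigma_v, sigma_u = 0.08, 0.12

estimates = [jlms_cost_inefficiency(e, sigma_v, sigma_u) for e in residuals]
for e, u_hat in zip(residuals, estimates):
    print(f"residual {e:+.2f}: E[u|eps] = {u_hat:.4f}, "
          f"cost efficiency = {math.exp(-u_hat):.3f}")
```

Each company's residual is shrunk toward the frontier by an amount that depends on the relative size of the noise and inefficiency variances, so only part of any residual is attributed to inefficiency. No separate benchmark percentile needs to be chosen.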
While not used by Ofwat in PR19, the potential use of SFA was discussed, but
the approach was dismissed based on factual errors. For example, in CEPA ([14],
p. 38), some of the reasons for dismissing its use included “since SFA is not a
statistical technique, it is not possible to implement tests to evaluate the accuracy
of the results”; and “although [SFA models have] been considered by regulators,
they are rarely pursued.” Clearly, SFA is a statistical technique, enabling testing
to be undertaken, and has widely been used in regulatory contexts across sectors
and jurisdictions by both regulators and companies – see, for example, ORR [112],
Deloitte [32], Swiss Economics, Sumicsid and IAEW [140], and Oxera [115].
Evidence using SFA has also been used in the England and Wales water sector – see,
for example, Stewart [137–138], Oxera [114], Oxera [116], and Saal et al. [128].
DEA can also potentially help with heterogeneity as companies’ efficiency
will be estimated through comparisons with peer companies that have similar
characteristics (to the extent that those characteristics are included in the model).
In terms of specific applications of DEA in water regulatory assessments:

• DCCA’s DEA model consists of one input and two output measures, under the
assumption of constant returns to scale (which assumes companies can be scaled
up or down to form virtual peers). See Konkurrence- og Forbrugerstyrelsen
[47]. As such, the model’s ability to account for heterogeneity is quite limited,
although the DCCA does correct the estimated efficiency scores for special
conditions. (Clearly, alternative returns-to-scale assumptions could also be
adopted.)
• In PR94, while econometric modelling was the focus of Ofwat’s assessment,
separate DEA models were carried out for water distribution, water treatment,
and sewerage services, with the models based on the econometric models. As
the models included multiple output measures (e.g., properties, length of mains,
and water delivered for the water distribution model; population, length of
sewers, area served, and pumping capacity for sewerage network) slightly greater
heterogeneity was accounted for (although constant returns to scale was also
assumed to be consistent with the estimated econometric models at the time).
See Thanassoulis [141], Thanassoulis [142], and Thanassoulis [144].
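As a rough illustration of the kind of model described in the bullets above, the following sketch solves an input-oriented, constant-returns-to-scale DEA programme for one input and two outputs using scipy.optimize.linprog. The data are hypothetical and the function name dea_crs_input is an assumption; this is not the DCCA's or Ofwat's code. Setting exclude_self=True drops the assessed company from its own peer set, which gives a super-efficiency score of the type used for outlier screening.

```python
import numpy as np
from scipy.optimize import linprog

def dea_crs_input(X, Y, exclude_self=False):
    """Input-oriented CRS DEA scores.

    X: (n, m) array of inputs, Y: (n, s) array of outputs.
    For each company o, minimise theta subject to
      sum_j lambda_j * x_j <= theta * x_o,  sum_j lambda_j * y_j >= y_o,  lambda >= 0.
    """
    n, m = X.shape
    s = Y.shape[1]
    scores = np.empty(n)
    for o in range(n):
        peers = [j for j in range(n) if not (exclude_self and j == o)]
        k = len(peers)
        c = np.r_[1.0, np.zeros(k)]                    # variables: theta, lambdas
        a_inputs = np.c_[-X[o].reshape(m, 1), X[peers].T]
        a_outputs = np.c_[np.zeros((s, 1)), -Y[peers].T]
        res = linprog(
            c,
            A_ub=np.r_[a_inputs, a_outputs],
            b_ub=np.r_[np.zeros(m), -Y[o]],
            bounds=[(0, None)] * (k + 1),
            method="highs",
        )
        scores[o] = res.fun
    return scores

# Hypothetical data: one input (cost) and two outputs for four companies.
X = np.array([[100.0], [120.0], [150.0], [200.0]])
Y = np.array([[50.0, 10.0], [40.0, 20.0], [60.0, 15.0], [70.0, 18.0]])

scores = dea_crs_input(X, Y)
super_scores = dea_crs_input(X, Y, exclude_self=True)
print(np.round(scores, 4), np.round(super_scores, 4))
```

With only one input and two outputs, every company is compared against a small number of peers, which is why such a model can capture only limited heterogeneity.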

With regard to separating noise and inefficiency, although not a direct solution,
DEA is a more rigorous approach to establishing an efficiency frontier than using an
upper quartile adjustment to OLS or RE, and can be extended to account for noise
(see Simar and Wilson [131], and Simar and Wilson [132]).

Forecasting Efficient Costs: The Consistency of Catch-Up, Frontier Shift and Input Price Inflation Assumptions

The potential to catch-up to the frontier (technical efficiency), the potential for
further frontier shift (technical change) improvements, and the impact of input price
inflation are interrelated. Clearly catch-up and frontier shift are closely related to
1620 A. Horncastle et al.

each other as the former is measured relative to the position of the frontier. Equally,
the input price inflation that a sector experiences is affected by the productivity
performance of the sectors related to those inputs. Productivity growth typically
drives increases in real wages such that, in the long run, economy-wide real wage
growth is broadly in line with economy-wide labor productivity growth (see, for
example, the International Labour Organisation [44], p. 10).
The expectations for changes in all these elements are clearly specific to the
industry in question. For example, the digital sector clearly has greater rates of
technological progress than the water sector.
As such, it is important that regulators ensure consistency between the catch-up,
frontier shift, and input price inflation assumptions.
As discussed above, the DCCA’s frontier-shift challenge was based on analysis
separate from its relative efficiency analysis. Ofwat’s frontier-shift challenge was
likewise based on analysis separate from both its relative efficiency analysis and
its analysis of input price inflation. These approaches may, therefore, result in
inconsistencies.
Indeed, with regard to Ofwat’s approach, some companies argued that an
inconsistency had occurred with Ofwat’s initial assumptions of a frontier shift
of 1.5% p.a. and no input price inflation allowance. Ofwat ([95], p. 139) stated
“Economic Insight and Oxera argue that it is inconsistent to assume that there will
be no real wage growth yet still apply a frontier shift which implies significant
growth in productivity, as wage rates and labour productivity will be linked.”
In its draft determinations, Ofwat ([95], p. 129) made some allowance of input
price inflation, stating “In contrast to the initial assessment, we are including
a real price effect adjustment for real wages to reflect improvements in labour
productivity.” This is one way of improving consistency – namely, to reassess
the outcomes from separate analyses and try to ensure consistency in those
assumptions by making some adjustments. Such an approach may not, however,
guarantee consistency.
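The arithmetic behind the companies' argument can be sketched as follows. The multiplicative form, the labour share of 40%, and the real wage growth of 1% p.a. are assumptions chosen for illustration only; they are not Ofwat's figures or method. Only the 1.5% p.a. frontier shift comes from the text above.

```python
def efficient_cost_path(base_cost, frontier_shift, real_price_effect, years):
    """Efficient cost index under constant annual assumptions.

    Each year costs are scaled by (1 + real_price_effect) * (1 - frontier_shift).
    This multiplicative form is an illustrative assumption.
    """
    growth = (1.0 + real_price_effect) * (1.0 - frontier_shift)
    return [base_cost * growth**t for t in range(years + 1)]

# Hypothetical inputs (assumptions, not regulatory values)
labour_share = 0.4
real_wage_growth = 0.01

# Frontier shift of 1.5% p.a. with no input price inflation allowance
no_rpe = efficient_cost_path(100.0, 0.015, 0.0, 5)
# Same frontier shift, but with real wage growth passed through the labour share
with_rpe = efficient_cost_path(100.0, 0.015, labour_share * real_wage_growth, 5)

print(round(no_rpe[-1], 2), round(with_rpe[-1], 2))
```

The gap between the two paths shows how much of the allowance depends on whether productivity growth and real wage growth are treated consistently.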
Another approach would be to model all components (technical efficiency,
technical change, and input price inflation) simultaneously. This could be achieved
using SFA and a subsequent decomposition of the components – see, for example,
Kumbhakar et al. ([57], Chap. 11), Ashton [7], and Saal et al. [128]. (Clearly, to also
be able to decompose the impact of input price inflation, input prices would need
to be included in the models, i.e., a cost function would first need to be modelled).
The same decomposition can be achieved using DEA and Malmquist Indices – see
Maniadakis and Thanassoulis [58]. In this case, the critical assumption is that past
performance in each of these elements is a good predictor for future performance.
As such, further adjustments may still be required.
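For orientation, the sketch below shows the standard (Färe-type) decomposition of a Malmquist productivity index into efficiency change (catch-up) and technical change (frontier shift), given four distance-function values. It is not the cost Malmquist index of Maniadakis and Thanassoulis [58], which adds price-related components, and the numerical values are hypothetical.

```python
import math

def malmquist(d_t_t, d_t_t1, d_t1_t, d_t1_t1):
    """Decompose a Malmquist index into catch-up and frontier shift.

    d_a_b is the distance (efficiency score) of the period-b observation
    measured against the period-a frontier, e.g. d_t_t1 evaluates the
    period t+1 observation against the period t frontier.
    """
    efficiency_change = d_t1_t1 / d_t_t
    technical_change = math.sqrt((d_t_t1 / d_t1_t1) * (d_t_t / d_t1_t))
    return efficiency_change * technical_change, efficiency_change, technical_change

# Hypothetical distance values for one company
index, catch_up, frontier_shift = malmquist(
    d_t_t=0.80, d_t_t1=1.00, d_t1_t=0.72, d_t1_t1=0.90
)
print(round(index, 4), round(catch_up, 4), round(frontier_shift, 4))
```

Historical values of the two components could then be used as a starting point for forward-looking assumptions, subject to the caveat above that past performance may not predict future performance.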
Such a decomposition for SFA requires careful consideration of the pattern of
efficiency over time. There are quite a few approaches that can be taken. On this
point, we note that the models used by ARERA are quite restrictive in how they
allow inefficiency to change over time. For example, Pitt and Lee [122] and Battese
and Coelli [8] have time invariant inefficiency, i.e., these models assume inefficiency
does not change over time, while Battese and Coelli [9] assume a fixed (exponential)
shape to inefficiency over time. Less restrictive models, such as Kumbhakar and
Heshmati [54], could instead be used.
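The restriction can be seen in a small sketch of the Battese and Coelli [9] time pattern, u_it = exp(-eta (t - T)) u_i. With eta = 0 the pattern collapses to time-invariant inefficiency, as in the Pitt and Lee [122] and Battese and Coelli [8] models; with eta different from zero every company follows the same exponential shape. The parameter values are hypothetical.

```python
import math

def bc92_inefficiency(u_i, eta, n_periods):
    """Inefficiency path u_it = exp(-eta * (t - T)) * u_i for t = 1..T."""
    return [u_i * math.exp(-eta * (t - n_periods)) for t in range(1, n_periods + 1)]

def technical_efficiency(u_path):
    """Convert inefficiency terms into efficiency scores exp(-u)."""
    return [math.exp(-u) for u in u_path]

# Hypothetical company with u_i = 0.2 observed over five periods
time_invariant = bc92_inefficiency(0.2, 0.0, 5)   # eta = 0: no change over time
decaying = bc92_inefficiency(0.2, 0.1, 5)         # eta > 0: inefficiency falls each period

print([round(e, 3) for e in technical_efficiency(time_invariant)])
print([round(e, 3) for e in technical_efficiency(decaying)])
```

Because a single eta governs all companies, the model cannot represent one company catching up while another falls behind, which is one reason less restrictive specifications may be preferred.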

International Comparisons

An additional complexity for UR and CRU is the necessity to compare NI Water
and Irish Water to operators outside the country. This makes achieving like-for-like
comparisons more complicated.
The first key issue is data comparability. In the case of NI Water, the company
was required to submit data similar to those previously collected in England and
Wales through Ofwat’s now-disused June Returns [108]. On this basis, most of NI
Water’s dataset was comparable to the comparator set used by the UR, save for
minor differences in, for example, the pollution categorization regime.
There is also a need to convert to the same currency, if necessary. In CRU’s
case, costs for the England and Wales companies were converted into euros using
the OECD’s Purchasing Power Parities (PPP) for private consumption. Clearly,
there are implementation issues and uncertainties around conversion rates so some
sensitivity testing may be required.
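A sensitivity test of the kind suggested above could be as simple as the following sketch. The conversion rate of 1.2 euros per pound and the 5% tolerance are hypothetical values, not the OECD PPP rate or any range used by CRU.

```python
def convert_with_sensitivity(cost_gbp, ppp_eur_per_gbp, tolerance=0.05):
    """Convert a sterling cost to euros and report a low/high range.

    The tolerance is applied symmetrically around the central rate.
    """
    central = cost_gbp * ppp_eur_per_gbp
    return {
        "low": central * (1.0 - tolerance),
        "central": central,
        "high": central * (1.0 + tolerance),
    }

# Hypothetical comparator cost of 100 (GBP million) and a hypothetical PPP rate
converted = convert_with_sensitivity(100.0, 1.2)
print(converted)
```

Re-running the benchmarking models at the low and high values would indicate whether the estimated efficiency gap is robust to the conversion rate chosen.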
The second key issue is the ability to control for differences between companies.
This is more complicated in an international setting. This can be accounted for in a
number of ways, including:

• Accounting for differences through the variables chosen in the model, ensuring,
in particular, that the key differentiating characteristics between countries are
accounted for. CRU [20] stated that it considered Irish Water’s greater length
of water network per connection but noted that models developed by other
regulators tended to show that the number of connections, rather than network
length, was the main cost driver.
• Accounting for potential differences in technology (that is, differences in the
engineering process that companies use, or could use, as well as the physical
and policy environments in which the companies operate). This implies that it is
important to test whether the estimated (DEA or SFA) frontier differs by country.
• Making pre- or post-modelling adjustments to account for atypical character-
istics. For example, the UR made some allowances through the introduction
of negative special cost factors after conducting cost benchmarking, such as a
regional wage adjustment for wage differentials between Ireland and England
and Wales. Similarly, CRU [20] accounted for Irish Water’s higher wage costs
by scaling costs up or down before conducting cost benchmarking.

With any international benchmarking exercise, a number of issues need to be
accounted for. For the interested reader, Oxera [113] summarizes a number of
issues with international benchmarking and how they can be mitigated in relation
to ORR’s benchmarking of Network Rail with other European network operators
[133]. Similarly, Oxera [121] summarizes a number of issues with international
benchmarking and how they can be mitigated in relation to CEER/Sumicsid’s
benchmarking of European TSOs [139]. Many of the issues discussed in these
papers are also relevant to international benchmarking in the water sector.

Concluding Comments

Although economic regulation and benchmarking in the water sector are not
widespread across Europe, some cost benchmarking is undertaken by a number of
European regulators. With the exception of those regulators that have benchmarked
the regulated water company with companies in another jurisdiction (namely,
England and Wales), the approaches taken have varied considerably across
regulators.
This variation has been in, among other things: the form of the benchmarking
approach taken (e.g., OLS, RE, SFA, or DEA); the costs benchmarked (e.g., OPEX,
BOTEX, TOTEX); the outputs and other cost drivers used in the models; and
how future efficient costs are ultimately established. That is, there has been little
consensus on a number of fundamental aspects of the cost benchmarking exercise.
While regulators need to be pragmatic and take into account local issues, including
differences in the regulatory framework, the variation in approaches is perhaps
somewhat surprising. As such, despite cost benchmarking having been undertaken
in the water sector for a number of decades, there remain a number of areas
for future research and development.

Cross-References

Application of Production Economics in the Electricity Distribution Sector

Empirical Analysis of Production Economics: Applications to Banking

References
1. Anglian Water (2017) Water industry cost modelling: Anglian Water’s approach and initial
results, September
2. Anglian Water (2019a) PR19 draft determination representation, August
3. Anglian Water (2019b) PR19 draft determinations: deep dive on growth expenditure, August
4. Anglian Water (2020) PR19 CMA redetermination statement of case, April
5. ARERA (2019a) Water tariff method for the third regulatory period (MTI-3): consultation
document 402/2019/R/IDR, October
6. ARERA (2019b) Water tariff method for the third regulatory period (MTI-3). Decision
580/2019/R/IDR, December
7. Ashton JK (2000) Total factor productivity growth and technical change in the water and
sewerage industry. Serv Ind J 20(4):121–130
8. Battese GE, Coelli TJ (1988) Prediction of firm-level technical efficiencies with a generalized
frontier production function and panel data. J Econ 38:387–399
9. Battese GE, Coelli TJ (1992) Frontier production functions, technical efficiency and panel
data: with application to paddy farmers in India. J Prod Anal 3:153–169
10. Bristol Water (2015a) Bristol Water statement of case
11. Bristol Water (2015b) Bristol Water’s reply to Ofwat’s response to the statement of case, 13
April
12. Bristol Water (2020) PR19 redetermination Bristol Water: statement of case (non-
confidential), April
13. Cambridge Economic Policy Associates (CEPA) (2014) Ofwat: cost assessment – advanced
econometric models, March
14. Cambridge Economic Policy Associates (CEPA) (2018) PR19 econometric benchmarking
models, March
15. Colombi R, Martini G, Vittadini G (2011) A stochastic frontier with short-run and long-
run inefficiency random effects. Department of Economics and Technology Management,
University of Bergamo, Working paper series
16. Commission for Economic Regulation (CER) (2016a) Irish Water second revenue control
2017–2018, December
17. Commission for Economic Regulation (CER) (2016b) Irish Water performance assessment:
framework of reporting metrics decision paper, November
18. Commission for Regulation of Utilities (CRU) (2018a) Irish Water revenue control 2019
revenue control 2 (2017/2018) one-year extension, September
19. Commission for Regulation of Utilities (CRU) (2018b) Irish Water performance assessment
report no. 3, November
20. Commission for Regulation of Utilities (CRU) (2019a) Irish Water revenue control revenue
control 3 (2020–2024), July
21. Commission for Regulation of Utilities (CRU) (2019b) Irish Water performance assessment
framework – 2017 implementation update, July
22. Commission for Regulation of Utilities (CRU) (2019c) Revenue control 3 (2020–2024),
December
23. Competition and Markets Authority (CMA) (2015) Bristol Water plc A reference under
section 12(3)(a) of the Water Industry Act 1991 report, October
24. Competition and Markets Authority (CMA) (2020) Anglian Water Services Limited, Bristol
Water plc, Northumbrian Water Limited and Yorkshire Water Services Limited price
determinations provisional findings, September. https://www.gov.uk/cma-cases/ofwat-price-determinations
25. Competition Commission (2000) Mid Kent Water Plc: a report on the references under
sections 12 and 14 of the Water Industry Act 1991. Chapter 6, paragraph 6.129–6.139 and
Appendix 6.6, paragraph 22–33
26. Competition Commission (2002) Vivendi Water UK PLC and First Aqua (JVCo) Limited: a
report on the proposed merger
27. Competition Commission (2010) Bristol Water plc A reference under section 12(3)(a) of the
Water Industry Act 1991, August
28. Competition Commission (2012) South Staffordshire Plc/Cambridge Water PLC merger
inquiry: a report on the completed acquisition by South Staffordshire Plc of Cambridge Water
PLC
29. Danish Cross-Party Water Sector Control (2015) Aftale mellem regeringen
(Socialdemokraterne og Det Radikale Venstre) og Venstre, Dansk Folkeparti, Enhedslisten,
Socialistisk Folkeparti og Det Konservative Folkeparti om en ny og forbedret regulering af
den danske vandsektor, April
30. Danish Water and Wastewater Association (2017) Water in figures
31. Danish Water Sector Act (2016)
32. Deloitte (2016) Econometric benchmarking in the UK postal sector. Final report. Produced
for Ofcom, May
33. Department for Environment, Food and Rural Affairs (DEFRA) (2017) The government’s
strategic priorities and objectives for Ofwat. Presented to parliament pursuant to section 2A of the
Water Industry Act 1991, September
34. Diewert WE (1974) Functional forms for revenue and factor requirement functions. Int Econ
Rev 15:119–130
35. Energi-, Forsynings- og Klimaministeriet (2018) Bekendtgørelse om økonomiske rammer for
vandselskaber, March
36. Energiavirasto (2015) Regulation methods in the fourth regulatory period of 1 January 2016–
31 December 2019 and the fifth regulatory period of 1 January 2020–31 December 2023,
November
37. Europe Economics (2019a) Real price effects and frontier shift, January
38. Europe Economics (2019b) Real price effects and frontier shift – updated assessment, July
39. Europe Economics (2019c) Real price effects and frontier shift – final assessment and
response to company representations, December
40. European Commission (2019) Report from the Commission to the European Parliament and
the Council on the implementation of the Water Framework Directive (2000/60/EC) and
the Floods Directive (2007/60/EC) Second River Basin Management Plans First Flood Risk
Management Plans, February
41. European Parliament and the Council (2000) Directive 2000/60/EC of the European Parlia-
ment and of the Council establishing a framework for the Community action in the field of
water policy. Off J 327, 22 Dec, pp 1–73
42. Greene W (2005) Fixed and random effects in stochastic frontier models. J Prod Anal 23(1):7–
32
43. Indepen (2017) Capital maintenance: summary of issues from the interviews, WICS, March
44. International Labour Organisation (2014) Global wage report 2014/15
45. Jacobs (2019) Technical and economic support for the Irish Water Revenue Control 3: the
Commission for Regulation of Utilities review of Irish Water’s RC3 capital investment plan,
July
46. Konkurrence- og Forbrugerstyrelsen (2016) Bilag 7 SFA-Modellen
47. Konkurrence- og Forbrugerstyrelsen (2017) Totaløkonomisk benchmarking. Fastsættelse af
individuelle effektiviseringskrav i de økonomiske rammer for 2018–2019 for spildevandssel-
skaber
48. Konkurrence- og Forbrugerstyrelsen (2018a) Bilag 6. R-Koder til brug for benchmarking af
drikkevandselskaber, August
49. Konkurrence- og Forbrugerstyrelsen (2018b) Totaløkonomisk benchmarking for drikkevands-
selskaber Model for beregning af individuelle effektiviseringskrav i de økonomiske rammer
for 2019–2020
50. Konkurrence- og Forbrugerstyrelsen (2018c) Bilag 3. Fronterne i DEA og SFA, August
51. Konkurrence- og Forbrugerstyrelsen (2019a) Totaløkonomisk benchmarking for drikkevands-
selskaber, August
52. Konkurrence- og Forbrugerstyrelsen (2019b) Benchmarking. https://www.kfst.dk/vandtilsyn/benchmarking/.
Accessed 4 May 2020
53. Konkurrence- og Forbrugerstyrelsen (2019c) Velfungerende Markeder: Benchmarking Og
Forsyningssikkerhed I Forsyningssektoren
54. Kumbhakar S, Heshmati A (1995) Efficiency measurement in Swedish dairy farms: an
application of rotating panel data, 1976–88. Am J Agric Econ 77:660–674
55. Kumbhakar S, Horncastle A (2010) Improving the econometric precision of regulatory
models. J Regul Econ 38(2):144–166
56. Kumbhakar SC, Lien G, Hardaker JB (2014) Technical efficiency in competing panel data
models: a study of Norwegian grain farming. J Prod Anal 41(2):321–337
57. Kumbhakar S, Wang H-J, Horncastle A (2015) A practitioner’s guide to stochastic frontier
analysis using Stata. Cambridge University Press, Cambridge
58. Maniadakis N, Thanassoulis E (2004) A cost Malmquist productivity index. Eur J Oper Res
154(2):396–409
59. Monopolies and Mergers Commission (MMC) (1995) South West Water Services Ltd: a
report on the determination of adjustment factors and infrastructure charges for South West
Water Services Ltd
60. Nera (2019a) IW RC3 (2020–2024) Look Forward Submission Assessment: Opex, July
61. Nera (2019b) Assessing Ofwat’s funding and incentive targets for leakage reduction,
March
62. Northumbrian Water (2020) NWL statement of case PR19 CMA redetermination, April
63. Ofgem (2010) Handbook for implementing the RIIO model, October
64. Ofgem (2012) RIIO-GD1: final proposals – supporting document – cost efficiency, December
65. Ofgem (2014) RIIO-ED1: final determinations for the slow-track electricity distribution
companies business plan expenditure assessment: final decision, November
66. Ofgem (2015) Network output measures health & risk reporting methodology & framework.
A common methodology framework, adopted by all Gas Distribution Networks, for the
assessment, forecasting and regulatory reporting of asset risk, September
67. Ofwat (1994a) Future charges for water and sewerage companies: the outcome of the periodic
review, July
68. Ofwat (1994b) 1993–94 report on the cost of water delivered and sewage collected
69. Ofwat (1998) Assessing the scope for future improvements in water company efficiency
70. Ofwat (2003a) Setting water and sewerage price limits for 2005–10: framework and approach,
March
71. Ofwat (2003b) Record of workshop on the capital maintenance econometric modelling
explanatory variables, April
72. Ofwat (2008a) Cost base feedback report, August 2008
73. Ofwat (2008b) Setting price limits for 2010–15: framework and approach, March
74. Ofwat (2009a) Future water and sewerage charges 2010–15: final determinations
75. Ofwat (2009b) Relative efficiency assessments 2008–09 – supporting information
76. Ofwat (2011) Capex bias in the water and sewerage sectors in England and Wales – substance,
perception or myth? A discussion paper
77. Ofwat (2013) Setting price controls for 2015–20 – final methodology and expectations for
companies’ business plans, July
78. Ofwat (2014) Setting price controls for 2015–20 final price control determination notice:
policy chapter A3 – wholesale water and wastewater costs and revenues, December
79. Ofwat (2015a) Thames Tideway Tunnel infrastructure provider project licence.
https://www.ofwat.gov.uk/regulated-companies/markets/direct-procurement/thames-tideway/
80. Ofwat (2015b) Service incentive mechanism – guidance for collating customer service
information for calculating the SIM score, March
81. Ofwat (2017a) Delivering Water 2020: our final methodology for the 2019 price review,
December
82. Ofwat (2017b) Delivering Water 2020: our final methodology for the 2019 price review.
Appendix 11: securing cost efficiency, December
83. Ofwat (2017c) Targeted review of asset health and resilience in the water industry – report
from CH2M, September
84. Ofwat (2017d) Business retail market. https://www.ofwat.gov.uk/regulated-companies/markets/business-retail-market/.
Accessed 6 Nov 2020
85. Ofwat (2018a) Cost assessment for PR19: a consultation on econometric cost modelling,
March
86. Ofwat (2018b) IN 18/02: price review early submissions on 3 May 2018 for performance
commitment definitions and cost adjustment claims, March
87. Ofwat (2018c) IN 18/11: enhancement expenditure – setting expectations for well-evidenced
proposals and clarifying interaction with cost adjustment claims, June
88. Ofwat (2018d) Cost assessment for PR19 – a consultation on econometric cost modelling:
appendix 1 – modelling results, March
89. Ofwat (2019a) Consultation on charging rules for new connections and new developments for
English companies from April 2020, April
90. Ofwat (2019b) PR19 final determinations: delivering outcomes for customers policy
appendix, December
91. Ofwat (2019c) PR19 initial assessment of plans: summary of test area assessment, January
92. Ofwat (2019d) PR19 initial assessment of plans – overview of company categorisation,
January
93. Ofwat (2019e) Technical appendix 2 – securing cost efficiency, January
94. Ofwat (2019f) Supplementary technical appendix: econometric approach, February
95. Ofwat (2019g) PR19 draft determinations: securing cost efficiency technical appendix, July
96. Ofwat (2019h) PR19 final determinations: securing cost efficiency technical appendix,
December
97. Ofwat (2019i) Initial assessment of business plans – cost assessment models
98. Ofwat (2019j) Slow track and significant scrutiny draft determinations models
99. Ofwat (2019k) Final determinations – cost assessment models
100. Ofwat (2019l) Final determinations. Wholesale wastewater enhancement feeder model: first
time sewerage. https://www.ofwat.gov.uk/wp-content/uploads/2019/12/FM_E_WWW_first-time-sewerage_FD.xlsx
101. Ofwat (2019n) PR19 final determinations – overall level of stretch across costs, outcomes and
allowed return on capital appendix, December
102. Ofwat (2019o) Final determinations. Wholesale wastewater enhancement feeder model: P-removal.
https://www.ofwat.gov.uk/wp-content/uploads/2019/12/FM_E_WWW_p-removal_FD.xlsx
103. Ofwat (2019p) Final determinations. 5 Base adjustment model.
https://www.ofwat.gov.uk/wp-content/uploads/2019/12/FM_Base_adjustments_FD.xlsx
104. Ofwat (2019q) Final determinations. https://www.ofwat.gov.uk/regulated-companies/price-review/2019-price-review/final-determinations/.
Accessed 4 May 2020
105. Ofwat (2019r) PR19 final determinations: Anglian Water – cost efficiency additional infor-
mation appendix, December
106. Ofwat (2019s) PR19 initial assessment of plans: summary of test area assessment, January
107. Ofwat (2020) Asset resilience project kick off meeting, November
108. Ofwat (n.d.) Historic performance. https://www.ofwat.gov.uk/regulated-companies/company-obligations/performance/.
Accessed 4 May 2020
109. Ofwat and the Department for Environment, Food and Rural Affairs (2006) The development of the water
industry in England and Wales. 27 January. https://www.ofwat.gov.uk/wp-content/uploads/2015/11/rpt_com_devwatindust270106.pdf.
Accessed 6 Nov 2020
110. Ordinance on Incentive Regulation of Energy Supply Networks (Incentive Regulation
Ordinance). Verordnung über die Anreizregulierung der Energieversorgungsnetze (Anreizreg-
ulierungsverordnung – ARegV)
111. Organisation for Economic Cooperation and Development (OECD) (2015) The governance
of water regulators. OECD studies on water. OECD Publishing, Paris
112. ORR (2018) PR18 econometric top-down benchmarking of Network Rail: a report, July
113. Oxera (2009) Recommendations on how to model efficiency for future price reviews. Prepared
for Office of Rail Regulation, November
114. Oxera (2010) Bristol Water’s efficiency: an assessment of relative operating efficiency for
water services
115. Oxera (2013) Recommendations on cost assessment approaches for RIIO-ED1. Produced for
ENW Ltd
116. Oxera (2018) South East water wholesale BOTEX assessment
117. Oxera (2019a) Ofwat’s enhancement modelling approaches at the IAP: a review, March
118. Oxera (2019b) WINEP: phosphorous removal, August
119. Oxera (2020a) PR19 redeterminations: companies’ statements of case – cost allowance,
April
120. Oxera (2020b) A critical assessment of TCB18 electricity, April
121. Pitt MM, Lee L (1981) The measurement and sources of technical inefficiency in the
Indonesian weaving industry. J Dev Econ 9(1):43–64
122. Portela M (2014) Value and quantity data in economic and technical efficiency measurement.
Econ Lett 124:108–112
123. PwC (2019) Funding approaches for leakage reduction. Report for Ofwat, December
124. Saal D (2018) Comments on CEPA’s methodological approach in its PR19 econometric
benchmarking models for Ofwat, May
125. Saal D, Nieswand M (2019) A review of Ofwat’s January 2019 wholesale water and
wastewater Botex cost assessment modelling for PR19, March
126. Saal DS, Parker D (2000) The impact of privatization and regulation on the water and
sewerage industry in England and Wales: a translog cost function model. Manag Decis Econ
21(6):253–268
127. Saal D, Parker D, Weyman-Jones T (2007) Determining the contribution of technical,
efficiency and scale change to productivity growth in the privatized English and Welsh water
and sewerage industry: 1985–2000. J Prod Anal 28:127–139
128. Saal D, Ferrari A, Nieswand M (2017) Independent review of Anglian Water’s preliminary
regulatory cost modelling for PR2019, September
129. Scottish Water (2012) Submission to Scottish parliament, September
130. Simar L, Wilson PW (2000) A general methodology for bootstrapping in non-parametric
frontier models. J Appl Stat 27(6):779–802
131. Simar L, Wilson PW (2011) Inference by the m out of n bootstrap in nonparametric frontier
models. J Prod Anal 36:33–53
132. Smith A (2008) International benchmarking of Network Rail’s maintenance and renewal
costs: an econometric study based on the LICB dataset (1996–2006): report for the Office
of Rail Regulator, October
133. South West Water Limited (2018) Cost model consultation response, May
134. Stehrer R, Bykova A, Jäger K, Reiter O, Schwarzhappel M (2019) Industry level growth and
productivity data with special focus on intangible assets, October. https://euklems.eu/
135. Stewart M (1993a) Ofwat research paper number 2: modelling water costs 1992–93: further
research into the impact of operating conditions on company costs: main report, December
136. Stewart M (1993b) Ofwat research paper number 4: modelling sewage treatment costs 1992–
93: research into the impact of operating conditions on the costs of sewage treatment: main
report, December
137. Stewart M (1994) Ofwat research paper number 3: modelling sewerage costs 1992–93:
research into the impact of operating conditions on the costs of the sewerage network: main
report, January
138. Sumicsid (2019) Pan-European cost-efficiency benchmark for electricity transmission system
operators main report, July
139. Swiss Economics, Sumicsid and IAEW (2019) Efficiency comparison of Electricity Distribution
System Operators for the third regulatory period (EVS3). Produced for Bundesnetzagentur,
April
140. Thanassoulis E (2000a) DEA and its use in the regulation of water companies. Eur J Oper
Res 127(1):1–13
141. Thanassoulis E (2000b) The use of data envelopment analysis in the regulation of UK water
utilities: water distribution. Eur J Oper Res 126(2):436–453
142. Thanassoulis E (2001) Introduction to the theory and application of data envelopment
analysis. Kluwer, Dordrecht
143. Thanassoulis E (2002) Comparative performance measurement in regulation: the case of
English and Welsh sewerage services. J Oper Res Soc 53:292–302
144. UK Water Industry Research Limited (UKWIR) (2002) Capital maintenance planning: a
common framework
145. Utility Regulator (UR) (2014) Water & sewerage services, price control 2015–21. Final
determination – main report, December
146. Utility Regulator (UR) (2018) Price control for water and sewerage services 2021–2027: our
overall approach, June
147. Utility Regulator (UR) (2019a) PC21 Opex ‘minded to’ methodology, March
148. Utility Regulator (UR) (2019b) PC21 Capex ‘minded to’ methodology, March
149. Wang H-J, Ho C-W (2010) Estimating fixed-effect panel stochastic frontier models by model
transformation. J Econ 157:286–296
150. Water Act (1989) (c.15) UK government
151. Water Industry Act (1991) (c.15) UK government
152. Water Industry Commission for Scotland (WICS) (2001) Strategic review of Charges 2002–
06, November
153. Water Industry Commission for Scotland (WICS) (2004a) Our work in regulating the Scottish
water industry: the calculation of prices, September
154. Water Industry Commission for Scotland (WICS) (2004b) Our work in regulating the Scottish
water industry: the scope for operating cost efficiency, October
155. Water Industry Commission for Scotland (WICS) (2013) Strategic review of charges 2015–
21: innovation and choice, May
156. Water Industry Commission for Scotland (WICS) (ca. 2014) Staff paper 5: public private
partnership costs. https://www.watercommission.co.uk/UserFiles/Documents/Staff%20paper%205.pdf
157. Water Industry Commission for Scotland (WICS) (2018) Strategic review of charges 2021–
27: methodology refinements and clarifications, November
158. Water Industry Commission for Scotland (WICS) (2019) What can regulation achieve, August
159. Water Industry Commission for Scotland (WICS) (2020) Prospects for prices. Strategic
review of charges 2021–27. Final decision paper, February
160. Water Services Act (2017) Government of Ireland
161. Yorkshire Water (2019) Cost efficiency – Yorkshire Water draft determination representation
REDACTED
162. Yorkshire Water (2020) PR19 Redetermination Yorkshire Water Services: statement of case,
April
The Economics of Sports
42
Joshua Congdon-Hohman and Victor Matheson

Contents
Introduction
Team Sports Versus Individualistic Sports
What Is a Sports Team’s Objective Function?
Sports Economics and the Production Function for Attendance and Revenue Success (Off-Field Success)
  Stadiums and Sports Infrastructure
  Producing Attendance and Revenue: Teams
  Producing Attendance and Revenue: Leagues
Sports Economics and the Production Function for On-Field Success
  Measuring Monopsony Power
  Management and Strategic Efficiency
  Optimal Levels and Distribution of Inputs
  Worker Effort and Compensation
Conclusion
References

Abstract

Spectator sports present a unique area of study for economists both because
of their global popularity and because they require a high level of cooperation
among rival firms to produce the product. This chapter explores how leagues
allocate labor among teams to maximize member revenues. It also examines
whether individual teams try to maximize on-field success (in terms of wins) or
off-field success (in terms of revenues or profits) and how expenditures on talent
and sports infrastructure contribute to these goals.

J. Congdon-Hohman · V. Matheson

College of the Holy Cross, Worcester, MA, USA

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_43

Keywords

Sports · Marginal revenue product · Labor economics · Attendance · Stadiums

Introduction

Spectator sports represent a distinctive area of study for economists for a wide
variety of reasons. First, sports command a degree of attention within popular
culture that far exceeds their financial contribution. Even before COVID-19 wreaked
havoc on the bottom lines of sports promoters across the world, the combined revenue
of the five largest professional sports leagues in the United States (the
National Football League (NFL), Major League Baseball (MLB), the National
Basketball Association (NBA), the National Hockey League (NHL), and Major
League Soccer (MLS)) was only around US$40 billion in 2018.
Adding in the “Big 5” European soccer leagues (Germany’s Bundesliga, England’s
Premier League, Italy’s Serie A, Spain’s La Liga, and France’s Ligue 1) raises
the total by another roughly $17 billion. Including Formula One auto racing,
Nippon Professional Baseball, the Australian Football League (AFL), the Indian
Premier League (IPL – Cricket), the Professional Golfers Association (PGA), the
International Olympic Committee (IOC), and FIFA (soccer) brings the worldwide
total average annual revenue to around $65 billion, a combined figure that would not
even place the industry among the 100 largest firms in the world. However, these sports dominate
media coverage around the world and can capture the attention of entire countries.
Indeed, one of the most visible and dramatic signs that the coronavirus had moved
from “potential threat” to “critical global pandemic” was the empty stadiums and
arenas across the globe.
Next, while sports remain a relatively small segment of the economy, they have
experienced rapid growth over the past century. Technology in the form of television
and other forms of media distribution has aided the growth of sports. In most high-
level leagues the amount of revenue generated by media rights exceeds the revenue
generated by live, in-person attendance. In addition, it has been generally observed,
especially in more recent studies, that attendance at games is a luxury good. As
incomes grow, consumers have more disposable income and more free time to spend
on entertainment [102].
Third, in order to produce their product, sports require a level of cooperation among
erstwhile competitors that is uncommon in other industries. Thus, while firms
in traditional industries such as consumer electronics, air travel, or automobiles
would normally try to drive their competitors out of business in order to capture
the entire market for themselves, in sports, a team that successfully drove each of its
important rivals from the market would find itself unable to produce its product at
all. Indeed, this distinction was at the heart of Walter Neale’s [85] seminal work on
sports economics. Neale dubbed this phenomenon the “Louis-Schmeling Paradox”
after the prominent heavyweight boxers Joe Louis and Max Schmeling. Neale notes
that without a prominent opponent like Max Schmeling, Louis would have “no one
to fight and therefore no income.” In few other sectors of the economy is a firm’s
production function so closely dependent on the existence of a competing firm in
the same industry.
Finally, sports provide researchers with a vast trove of publicly available data
that is often not found in other industries. For example, perhaps more than any other
industry, scholars can examine the contributions of individual workers to the success
of the firm as a whole since the sports pages routinely publish statistics on specific
players. In addition, sports provide data at a level of frequency that is rare in other
industries. While national accounting practices may require firms to release financial
information only on an annual or, perhaps, quarterly basis, sports performance data
is typically available on a game-by-game basis that may be as frequent as daily
matches or even individual events within a competition. One should keep in mind,
however, that while sports benefit from a wealth of easily accessible productivity
data, analysis of the industry has generally been hampered by a lack of profit or
revenue data. Many sports teams, especially in North America, are privately held
businesses and are therefore not required to release audited financial data, unlike
the publicly held firms that dominate many other industries.
Given the unique nature of sports, beginning in the 1950s, academic economists
began to devote serious attention to the industry. In the late 1990s, industry
practitioners began to adopt the quantitative methodology first proposed by
economists, leading to a statistics-based revolution that has swept through
essentially every sport and league. Today nearly every major sports team employs
experts in quantitative analysis in order to optimize team performance. This chapter
examines this ongoing revolution and how economists measure production and
productivity in spectator sports.

Team Sports Versus Individualistic Sports

While individualistic sports such as tennis, golf, athletics, horse and auto racing,
and combat sports such as wrestling, mixed martial arts, and boxing
do play an important role in the world of sports, the general interest in as well as
the revenue generated by these sports is small relative to that generated by team
sports. Furthermore, the economic analysis of individualistic sports is generally less
complex, or at least less distinctive, than that of team sports. In addition, there are fewer
cross-country differences in the organization of individual sports as opposed to team
sports. For example, marathons and golf tournaments are organized similarly in
most parts of the world while sports leagues have vastly different organizational
principles in North America as opposed to the major leagues in Europe [112].
Therefore, the primary focus of this chapter will be on production in team sports;
however, it is useful to at least include a brief overview of individual sports here.
The majority of the academic work on individual sports has typically focused
on the incentive structures that lead to the maximum effort on the part of the
participants. The literature includes theoretical models of effort [114] which
generally suggest a payoff function for individual contestants that is increasing in
prize money, decreasing in effort, and affected by the number and distribution of
the participants. More commonly, given the fact that sporting contests provide easy
access to data, the results of individual sports contests can also be used to test the
hypotheses of sports-specific tournament theory or more general labor economics
tournament models such as that of Lazear and Rosen [70]. Empirical analyses of
pay and performance can be found for tennis [93], golf [11, 29], distance running
[40, 77], weight lifting [87], and horse racing [32, 49].

What Is a Sports Team’s Objective Function?

In most industries, the objective function of a firm’s manager is assumed to simply
be the maximization of profit. This is not necessarily the assumption made in
sports. In sports, it is common to assume that sports teams could have one of two
possible objective functions. The first possibility is that sports teams are just like
any other firm and operate like profit-maximizers. The other option is that teams
are simply trying to win as many games as possible subject to being able to stay
in business. This option is typically referred to as “win-maximization” or “utility-
maximization.” The concept of teams as win-maximizers was first developed,
apparently independently, by both Sloane [105] and El-Hodiri and Quirk [31].
It is plausible to believe that sports teams may choose to operate as utility-
maximizers rather than profit-maximizers for at least two reasons. First, many
professional sports teams, including many of the biggest soccer teams in the world,
such as Barcelona and Real Madrid, operate as clubs where the owners of the team
are the fans themselves. Fans purchase memberships in these clubs not to make
money on their investment but to qualify for the right to purchase tickets to games
and to have a say in electing the team’s management. Indeed, it would be
implausible for a fan-based ownership group to act in order to earn
money rather than win games.
Second, ownership of a sports team may bring nonmonetary benefits to the
owner in terms of fame or popularity or can be seen as a form of entertainment
or consumption akin to a hobby. For example, on-again, off-again Italian Prime
Minister Silvio Berlusconi initially came to prominence at least in part as the
owner of AC Milan, one of the most popular and successful soccer teams in Italy. Similarly,
Russian oligarch Roman Abramovich, the owner of the Chelsea soccer club of the
English Premier League, has happily acknowledged that he is in the sports team
ownership business not for the profits, but instead for the glory.
The concept of win-maximization typically only applies to the labor or talent
acquisition side of a sports team. A win-maximizing team will attempt to acquire
players beyond the point at which the marginal revenue derived from the additional
talent will cover the marginal cost of paying for that talent. However, both win-
maximizing and profit-maximizing teams are likely to behave similarly on the
revenue side of the ledger. Because the vast majority of costs in operating a sports
team are fixed in nature, at least over a given season, once a specific level of
talent is chosen, profit-maximizers will largely want to engage in pricing strategies
that maximize revenues. But win-maximizers will also want to engage in pricing
strategies for tickets, media, concessions, and sponsorships (as well as appropriate
marketing and promotions) that maximize revenue as they want to generate funds
in order to be able to fund the highest possible level of talent acquisition while still
remaining financially solvent.
Finally, it is important to note that there are important differences between the
objective function of a league and the objective functions of the individual teams
that make up a league. Because competitive sports are a zero-sum game (for every
team that generates a win there must be a corresponding team that takes a loss), it is
impossible for a league to operate in a win-maximizing fashion. It is also the case
that the behavior that maximizes profit for the league as a whole may not maximize
profit for specific individual teams within that league.
Of course, even teams that are profit-maximizers are also concerned with the
production of on-field success as it is an empirical fact that team quality is strongly
correlated with demand and therefore revenue generated. The remainder of this
chapter examines how teams and leagues generate revenue and attendance and then
examines how teams generate on-field success.

Sports Economics and the Production Function for Attendance and Revenue Success (Off-Field Success)

Stadiums and Sports Infrastructure

Most of the literature on both on-field and off-field sports team and league success
focuses on the labor side of the production function, or the athletes themselves.
However, capital, specifically stadium and arena infrastructure, also plays an
important role in generating revenue and potentially generating on-field success as
well (primarily through the generation of additional revenue that can be used to
purchase additional labor talent).
The bulk of the research in stadium economics focuses on the public finance
aspects of stadiums and arenas in an attempt to answer the question of whether
these facilities represent a wise investment of taxpayer dollars. Overwhelmingly,
independent research on sports facilities, from the very earliest studies in the late
1980s and early 1990s [5, 6], through the US stadium boom in the 1990s and
2000s [17], and up to the current day [52, 78], finds that new stadiums bring little
or no net economic benefit to the cities where they are constructed, although
occasionally significant neighborhood effects can be uncovered.
However, even if stadiums are not winners from a public finance standpoint, it is
quite clear that new stadiums can result in substantial benefits to the teams playing
in them. These benefits come in three forms. First, new stadiums increase attendance
and total ticket sales. Second, new stadiums allow teams to sell tickets at a higher
price. Finally, new stadiums increase the ability of teams to sell complementary
goods and services like concessions, premium seating, merchandise, and parking.
The higher attendances experienced by sports teams following a stadium con-
struction project are not generally the result of increasing stadium capacity. In
fact, most Major League Baseball and Major League Soccer stadiums built during
the most recent stadium boom in the United States since 1990 were smaller than
the stadiums they replaced, and most stadiums in the National Football League,
National Basketball Association, and National Hockey League constructed during the
same period were roughly equal in size to their predecessors [72]. Instead,
most researchers attribute the increase in attendance to a “novelty effect” or
“honeymoon effect,” that is, the interest among fans in seeing a game in the new
facility. This effect appears to persist for between five and ten seasons in most
sports including baseball, basketball, football [18], hockey [106], and minor league
baseball [2].
In addition to the novelty effect of new stadiums, sports facilities in the past
three decades have tended to substitute quality for quantity, improving the game
experience for fans and thereby also increasing fans’ willingness to pay for the
experience. Because the “best seats in the house” are by their very nature limited
in supply, courtside seats also serve as a positional good or an example of Veblen’s
[115] “conspicuous consumption.” Newer sports facilities have generally increased
luxury amenities in order to cater to the very highest spending spectators by
emphasizing luxury boxes, club seats, and exclusive lounge areas [72].
Finally, it is clear that teams produce a multifaceted good for live audiences,
including both on-field entertainment and other in-stadium offerings such
as concessions, merchandise, and parking [19]. In earlier days, food and beverage
options at stadiums were quite limited. In fact, in the late 1800s as baseball began to
emerge as “America’s pastime,” one major difference between the National League
and its rival league the American Association was the sale of alcohol on the grounds.
The practice was prohibited at National League stadiums but allowed at American
Association parks, earning the latter league the moniker “the Beer and Whiskey
League” [86]. Modern stadiums dedicate significantly more space and attention to
the sale of ancillary products, which, along with the expansion of luxury amenities,
explains why the amount of capital dedicated to the construction of stadiums has
increased so rapidly in recent years.

Producing Attendance and Revenue: Teams

The standard production model for a sports team is that a team chooses a level
of spending on talent in order to either maximize profit or maximize wins subject
to a budget constraint. It is assumed, and also verified empirically (see Coates
and Humphreys [19] and García and Rodríguez [44] among many others), that
all other things equal, revenues rise with on-field success. On-field success rises
with spending on talent. Investment in talent, however, eventually
exhibits diminishing marginal returns for two reasons. First, if a team becomes
too good relative to its competition, fan interest may wane. The “uncertainty of
outcome hypothesis,” a concept dating back to Rottenberg’s [94] seminal paper on
sports economics, states that fan interest, and therefore team revenue, is dependent
on having a relatively close contest. Thus, at a certain point the purchase of
additional talent by a team does not generate additional revenue as the benefits from
increased on-field performance are balanced out by the harm associated with lower
uncertainty of outcome.
In addition, as also noted by Rottenberg [94], because every sport limits the
number of players that can participate in a contest, purchasing additional talent
eventually results in lower marginal product with respect to on-field performance.
There is only so much playing time, so eventually additional talent will just end up
sitting on the bench. Thus, a team that is a profit maximizer will purchase talent up
to the point where the marginal revenue they receive from the additional on-field
success that that talent produces, or the marginal revenue product (MRP), is equal
to the marginal cost of that talent. A win-maximizer will purchase talent up to the
point where profit is zero.
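The two decision rules can be compared numerically. The following is a minimal sketch, not the authors' model: the concave revenue function, the constant cost per unit of talent, and the parameter values A, B, and C are all hypothetical assumptions, chosen only so that revenue flattens as talent is added.

```python
import math

# Hypothetical parameters (assumptions for illustration only).
A = 100.0  # upper bound on revenue as talent grows
B = 0.05   # how quickly revenue flattens as talent is added
C = 2.0    # constant marginal cost of one unit of talent


def revenue(t):
    """Revenue rises with talent t but with diminishing marginal returns."""
    return A * (1.0 - math.exp(-B * t))


def profit(t):
    return revenue(t) - C * t


def choose_talent(step=0.001, t_max=100.0):
    """Grid search for the profit-maximizing and win-maximizing talent levels."""
    n = int(round(t_max / step))
    grid = [i * step for i in range(n + 1)]
    # Profit-maximizer: talent level with the highest profit (MRP = marginal cost).
    t_profit = max(grid, key=profit)
    # Win-maximizer: largest talent level that still leaves non-negative profit.
    t_win = max(t for t in grid if profit(t) >= 0.0)
    return t_profit, t_win


t_profit, t_win = choose_talent()
print(f"profit-maximizing talent: {t_profit:.3f}")
print(f"win-maximizing talent:    {t_win:.3f}")
```

Under these assumed functional forms, the win-maximizer always chooses more talent than the profit-maximizer, because it keeps hiring past the point where marginal revenue product equals marginal cost and stops only when profit reaches zero.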
Obviously, many demand-side factors also go into determining attendance
including the number of buyers (i.e., the size of the market), weather, rivalries,
promotional events, the availability of other forms of entertainment, etc., but
ultimately, the primary production decision by the team is the level of talent to use
as an input to the production process. It is important to note that because fans in
different markets will not react in the same way to additional purchases of talent,
either due to differences in market size or other differences in the local market, there
is no reason to believe that teams will all decide to purchase the same level of talent.
However, due to the uncertainty of outcome hypothesis, it is likely that teams will
wish to form leagues with other teams of similar levels of overall talent [28].

Producing Attendance and Revenue: Leagues

The sporting industry is unusual in that each individual firm requires the
cooperation of an erstwhile competitor in order to produce its product. Thus,
individual teams from the very beginning of organized spectator sports have joined
together to form leagues. There are at least three reasons why the formation of
leagues provides a production advantage to their member teams. The first is that a
league provides a uniform set of playing rules for competition. After all, the game of
football looks much different if you play by the Rugby rules, where extreme physical
contact and the use of hands are allowed, as opposed to playing with Association rules,
where handling the ball is not allowed and the level of physical contact is more
strictly controlled. (Indeed, it is from the term As-“soc”-iation rules that the word
“soccer” derives.) In many ways this is no different than industry groups
that set uniform technical standards for things like electrical voltages or standardized
sizing.
Leagues also form to allow the members to act as a cartel in order to exert
monopoly (or monopsony) power on the sale of their product or the purchase of
inputs. In terms of sale of the product, cartels in closed leagues like those
in the United States typically attempt to carve out geographical monopolies
for each franchise in the league, leading to only relatively rare cases where more
than one team serves a particular metropolitan area. In so-called open leagues as
are seen in Europe (also known as promotion-relegation leagues), the membership
of the leagues changes year to year as the worst performing teams are relegated
to lower level leagues and the best performing teams from lower divisions are
promoted to higher level leagues. Since open leagues have far less control over
which teams belong to the league at any given time and since individual teams
in open leagues do not have the ability to exclude competitors from large
markets, it is common to see multiple teams in large European metro areas like
London or Madrid [72].
Of course, leagues have long acted as cartels in terms of the purchase of labor.
Essentially every league has had at some point in its history some version of the
reserve clause, a rule that binds a particular player to a specific team and prevents
other teams in the league from bidding for that player’s services. Beginning in the
1970s, players’ unions in various sports began to successfully fight back against
league labor cartels, winning the right to at least some form of free agency, that
is, the right to freely contract with any team in the league. That being said, many
sports leagues still exert significant monopsony power on their players. The most
notable example is the National Collegiate Athletic Association (NCAA) which has
imposed amateurism on the athletes of its member institutions despite these institu-
tions generating combined revenues from athletics in the billions of dollars [59].
The final reason for league formation is the one most distinctive to sports. In order to
provide an interesting product on the field or court that will be in high demand,
the two opponents must be of relatively similar quality [34], a factor known as
competitive balance. Competitive balance can mean many different things. It can
describe whether a particular season or contest is close in nature, known as “intra-
season or within season competitive balance.” And it can also refer to whether the
same teams tend to win year after year even if the competition within any given
season is fairly close. This is known as “inter-season or between season competitive
balance.” Significant attention has been paid in the academic literature as to how
one measures different types of competitive balance [51].
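As an illustration of within-season measurement, the sketch below computes one widely used statistic that this chapter does not itself spell out: the ratio of the actual standard deviation of team winning percentages to the "idealized" standard deviation that would arise if every game were a coin flip (often called the Noll-Scully ratio). The choice of this measure and the win percentages used are assumptions made purely for illustration.

```python
import math
from statistics import pstdev


def balance_ratio(win_pcts, games_per_team):
    """Actual spread of win percentages relative to a coin-flip league.

    A value near 1 suggests a balanced season; larger values suggest
    that strong and weak teams are further apart than chance alone implies.
    """
    actual_sd = pstdev(win_pcts)
    # Standard deviation of win percentage if each game were a 50/50 draw.
    ideal_sd = 0.5 / math.sqrt(games_per_team)
    return actual_sd / ideal_sd


# Hypothetical six-team leagues playing an 82-game season (made-up numbers).
balanced = [0.52, 0.50, 0.48, 0.51, 0.49, 0.50]
lopsided = [0.80, 0.70, 0.55, 0.45, 0.30, 0.20]

print(balance_ratio(balanced, 82))
print(balance_ratio(lopsided, 82))
```

A between-season measure would instead track how persistent each team's standing is from one season to the next, which requires several seasons of data rather than a single cross-section.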
The next step in determining an optimal league production function is to estimate
what the optimal level of competitive balance should be in the league in order to
maximize the joint profits of the league members. While this is a basic element
of the many theoretical models of sports league formation such as El-Hodiri and
Quirk [31], Fort and Quirk [34], or Dietl, Grossmann, and Lang [27], empirically
measuring the effect of competitive balance on attendance or league revenues has been
notoriously difficult (see, e.g., Schmidt and Berri [95] or Forrest and Simmons
[33]). Indeed, it is even difficult (or at least it is rare in the literature) to develop
a model for consumer preferences that leads to fans actually wanting their team
to lose on occasion (Humphreys and Miceli [53]). Most studies suggest, however,
that extremely unbalanced competition, at least, is clearly harmful to overall league
economic performance [51].
The final step in designing an optimal league production function is to implement
league policies that result in the desired level of competitive balance that produces
the maximum profit or attendance. Potential league policies might include roster
limits, restrictions on player allocation through actions like reverse order drafts or
various forms of the reserve clause, salary caps or floors, and revenue sharing. It is
important to note that both theoretical models and the observed reality suggest
that revenue sharing alone will be insufficient to promote the desired level of
competitive balance unless policies are also put into place that provide incentives for
weaker teams to actually spend any revenue transfers [61]. Furthermore, Rottenberg
himself stated in his seminal work that the reserve clause that restricted the freedom
of players’ movements in the name of competitive balance would also be an
ineffective tool to promote this goal as the best players would ultimately end up
with the teams that valued their services the most. Thus, the reserve clause contributed
little to competitive balance and only served to reduce players’ bargaining power
and hence their earnings. This concept has been dubbed “Rottenberg’s Invariance
Principle.”

Sports Economics and the Production Function for On-Field Success

The detailed production data available in sports allows for a number of applications
examining the production process, optimization, and the value of various types
of inputs. By using the detailed performance measures for athletes in various
sports and examining team success as the desired output of production, researchers
have been able to test economic theories in ways that are not feasible for the
production processes of other industries. Specifically, academic studies have tested
theories regarding the value of monopsony power in labor markets, the impact of
management efficiency, input optimization in regard to equality of labor skill and
compensation within an organization, and evidence of shirking under
various forms of compensation. In most cases, the first step in the analysis deals
with estimating the production process itself, either through a production function
or a production frontier.
The initial attempts to identify a team’s production process came in professional
baseball. In addition to being the most popular team sport in America for almost a
century, baseball has unique characteristics that allow individual worker contribu-
tions to the production process to be identified. Specifically, a baseball game is a
combination of predominantly individual activities. A single defensive player (the
pitcher) is responsible for initiating a play by throwing a baseball at a specific speed
and location and with an intended amount of movement as it travels through the air. A
single offensive player (the batter) attempts to hit the ball in a way that will allow
him to gain as many bases as possible. Teammates are part of each play as either
offensive players who have reached bases in previous at-bats and will attempt to
advance on a particular hit ball, or as defensive players who will attempt to stop the
batter and other offensive players from progressing after a ball has been hit.

Measuring Monopsony Power

Economists began to empirically examine the wage structure in baseball in the mid-
1970s, but the assertion that sports leagues had monopsony power in the labor
market for athletes was identified much earlier. Rottenberg [94] first identified
and codified the idea that Major League Baseball could be characterized as the
combination of a number of different inputs (players) and that the owners as a
collective group had market power as the only employer of highly skilled baseball
players. Notably, US antitrust law carves out a formal legal exemption for
professional baseball. Davenport [23] also cited monopsonistic power when
discussing the faster growth in baseball team revenues than player wages. Due to
this monopsonistic power, he asserted that players must be paid a wage between
their marginal revenue product and the wage they would earn in a non-baseball job.
Gerald Scully [99] was the first to attempt to statistically measure the degree to
which Major League Baseball teams used their monopsony power to pay players
wages that were below their marginal revenue product (MRP). Scully first set out
to predict revenue as it related to a team’s on-field success, controlling for market
and stadium characteristics. Once Scully had an estimate of the value of winning
baseball games, he turned to estimating the contribution of each team’s statistical
output to the success of the team; these estimates were then used to value the contribution
of each player’s individual statistical accomplishments. With these values, Scully
estimated each individual player’s contribution to a team’s predicted revenue, which
then could be compared to the wages that the players were paid by the owners.
Ultimately, Scully found that the most skilled baseball players were only paid 10–
20% of their net MRP. This high degree of monopsonistic power reflects the labor
market structure under the reserve clause. Individual players were claimed by teams
when they entered the league and were not allowed to negotiate a contract with any
other team at any point in their career without the permission of the team that owned
the exclusive rights to that player.
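The logic of this two-step procedure can be sketched in a few lines. The example below is a deliberately simplified illustration and not Scully's actual specification: it uses a single regressor in each step, omits the market and stadium controls, and relies on made-up team data. The names `team_stat`, `win_pct`, `revenue`, and `player_mrp` are hypothetical.

```python
def ols_slope(x, y):
    """Slope from a one-regressor least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical team-level data (made up for illustration).
team_stat = [0.35, 0.38, 0.40, 0.42, 0.45]     # a team performance index
win_pct = [0.45, 0.48, 0.50, 0.52, 0.55]       # season winning percentage
revenue = [140.0, 146.0, 150.0, 154.0, 160.0]  # revenue in $ millions

# Step 1: value of on-field success (revenue per unit of winning percentage).
revenue_per_win_pct = ols_slope(win_pct, revenue)

# Step 2: contribution of team performance statistics to winning.
win_pct_per_stat = ols_slope(team_stat, win_pct)


def player_mrp(stat_contribution):
    """Revenue attributed to a player who adds stat_contribution to the team index."""
    return revenue_per_win_pct * win_pct_per_stat * stat_contribution


mrp = player_mrp(0.02)       # hypothetical player contribution
wage = 1.0                   # hypothetical wage in $ millions
share_of_mrp = wage / mrp    # fraction of estimated MRP actually paid
print(mrp, share_of_mrp)
```

Chaining the two slopes converts a player's statistical output into wins and then into revenue, which is what allows the estimated MRP to be compared directly with the wage paid.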
After Scully’s seminal work, others applied variations on his model to evaluate
monopsonistic power as the structure of the labor market in baseball transformed
beginning in the 1970s. In that decade, players in Major League Baseball gained
the right to appeal team salary offers to an arbitrator after a few years in the
league (“arbitration”) and the right to freely negotiate with all league teams after
a longer period (“free agency”). Sommers and Quinton [109] examined the salary
outcomes of the first group of players to negotiate contracts through free agency.
As a competitive labor market model would suggest, they found that players were
able to negotiate a wage near their estimated MRP. Other players who were still
restricted were found to continue to be underpaid. Scully [100] revisited his model
to examine the outcomes of free agent contracts in the mid-1980s. He found that
free agents were paid only 28% of their MRP. Such a low value was likely due
to collusion among owners to not compete for free agent players in this period.
Zimbalist [120] found that the average free agent was actually slightly overpaid
relative to their estimated MRP using a modified approach to Scully’s analysis,
but players earned significantly less than their estimated MRP prior to meeting the
tenure requirements for free agency. Rather than looking across classes of players
based on tenure, Rockerbie [92] focused on different outcomes within the class
of free agent players. He found that those players who earn the highest contracts
in free agency tend to be paid more than their estimated MRPs while all other
free agents earn at or below their estimated MRPs. Humphreys and Pyun [54]
used a modified-Scully approach to examine the evolution of monopsony power
by examining the monopsony exploitation ratios (MER) across changes to the
collective bargaining agreement between baseball owners and the players’ union.
They found free agents were getting paid salaries closer to their predicted MRP
with each progressive agreement, but other classes of players with less tenure did
not see similar gains.
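
The comparisons reported above reduce to a ratio of salary to estimated MRP. The sketch below is illustrative only. The function names, the two-step form of MRP (wins added by the player multiplied by the revenue value of a win), and the definition of the exploitation ratio as one minus the salary share of MRP are assumptions made for exposition, not the specification of any cited study, and the player figures are invented.

```python
def scully_mrp(marginal_wins: float, revenue_per_win: float) -> float:
    """Two-step MRP: wins added by the player times the revenue value of a win."""
    return marginal_wins * revenue_per_win


def salary_share_of_mrp(salary: float, mrp: float) -> float:
    """Fraction of estimated MRP that the player actually receives."""
    if mrp <= 0:
        raise ValueError("MRP must be positive")
    return salary / mrp


def exploitation_ratio(salary: float, mrp: float) -> float:
    """Assumed definition: (MRP - salary) / MRP. Positive values mean underpayment."""
    return 1.0 - salary_share_of_mrp(salary, mrp)


# Hypothetical player: adds 2.5 wins, and each win is worth $4.0 million in revenue.
mrp = scully_mrp(2.5, 4_000_000.0)
salary = 2_800_000.0
share = salary_share_of_mrp(salary, mrp)
mer = exploitation_ratio(salary, mrp)
print(f"MRP = {mrp:,.0f}; salary share = {share:.2f}; exploitation ratio = {mer:.2f}")
```

A ratio near zero corresponds to the competitive outcome in which salary is close to MRP, while a negative ratio corresponds to the overpayment reported for some free agents.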
The Scully approach to measuring a player’s MRP and evaluating the level of
monopsonistic power in baseball has not been without its detractors. As exemplified
by Bradbury [10] and Krautmann [64], the disagreement is focused primarily on
the estimation of the value of players, not on the production function estimates.
The Scully approach estimates the value of team success from approximations
of team revenue streams. The critique is that, rather than relying on a poor
approximation of revenue and of the value of a win, researchers could treat the
wages of free agents as their MRP: the market for these players should be competitive,
so a wage offer reveals an owner’s estimate of the player’s MRP. The
production function estimate can then be used to identify the value of individual
production.
A number of papers have been written comparing the traditional estimates of
MRP to various alternative techniques. Fort and Quirk [35] found that estimates
using the Scully approach overstate the size of the monopsonistic salary suppression.
Krautmann [63] compared traditional MRP estimates and those based on the
free market revelations of MRP through free agent contracts. He concluded that the
traditional methods overvalued players’ MRPs. Based on these alternative methods,
Krautmann concluded that only those players who are not eligible for arbitration are
underpaid (at only 25% of their estimated MRP), while those players who are eligible
for arbitration are rewarded in the same way as free agents, whose pay is assumed
to equal their MRP. Krautmann, Gustafson, and Hadley [67] find similar results,
though they find slight differences based on a player’s race and identify that the
underpayment of players only recoups a portion of the authors’ estimated player
development costs. Fort, Lee, and Oh [37] found that the accuracy of traditional
model estimates of MRP and therefore monopsonistic power are related to the
markets in which teams play. Specifically, they found that MRP is overestimated
for teams in small markets (those in cities with smaller populations), while it is
underestimated for teams in larger markets.
Monopsonistic power in the labor market has also been tested for other North
American sports leagues that share a similar labor market structure to Major
League Baseball. Scott, Long, and Somppi [98] found that players in the National
Basketball Association (NBA) are paid at levels below their MRP when they
first enter the league and they are restricted to playing for only one team. Once
professional basketball players reach the requisite tenure, they are free to negotiate
with all teams and the authors find that their compensation rises to the level of
their MRP. A few years later, Atkinson, Stanley, and Tschirhart [4] extended this
type of analysis to football and the National Football League (NFL). The authors
identified an overpayment of players beyond what simple profit maximization might
suggest, which they believe suggests that owners’ maximization problem must
also include personal, nonmonetary-based utility. Brown [12] also examined the
NFL to identify the degree to which monopsonistic exploitation in college may be
offset by professional earnings. Though college football players receive limited,
education-based compensation, the author found sizeable values for their MRP
after controlling for different levels of revenue-generating abilities through quantile
regression methods. Brown estimated that only 33–38% of college players earn NFL
incomes that offset their non-realized MRP during college play. Krautmann, von
Allmen, and Berri [68] directly compared the monopsonistic power across the three
largest North American sports leagues: the NFL, MLB, and the National Basketball
Association (NBA). They found that owners in all three leagues exercised
monopsony power during the period when player movement between teams is
limited. When players are most restricted immediately after entering each league,
the authors find that baseball players make the lowest percentage of their MRP, but
basketball players provide the largest value of surplus for owners.

Management and Strategic Efficiency

Often missing in the models described above is the role of team management and
strategy in the estimation of the production function. The contribution of management
and production methods is an important question in all firms’ production as
owners look to get the most output possible from given levels of labor and capital.
In sports, the quantifiable nature of both inputs and outputs makes the calculation
of production efficiency more testable than it is in most industries. Though baseball
has the most easily discernible measures of individual labor inputs, the examination
of efficiency is much more reliant on aggregate inputs and therefore is more easily
examined in the context of many different sports.
Managerial efficiency was first explicitly quantified in Zak, Huang, and Siegfried
[118] when estimating a production function for a small number of teams in the
NBA. The authors used the measurable activities in basketball including shooting
percentages, rebounds, fouls, steals, and blocks to estimate the production function.
Teams were then assigned a measure of production efficiency based on how close
each team’s production was to its frontier. Later, Zech [119] included a measure of
management that was insignificant in his estimation of the baseball production
function. Porter and Scully [89] examined managerial efficiency and a manager’s
MRP, finding that managers contribute significantly to the production process and
that the MRP of the best managers was similar to the MRP of star players. Kahn [58]
found similar evidence that good baseball managers increase win production, and
also that good management can improve individual player performance. Scully
[101] identified a strong link between a manager’s estimated efficiency and the
likelihood of remaining in the position, supporting the earlier work of Chapman
and Southwick [16] that suggested variation in manager productivity is based on the
quality of the job match, with highly productive and mutually beneficial matches
likely to continue. More recently, Volz [116] used data envelopment analysis
techniques to show that a baseball team’s production efficiency was more important
than winning percentage in predicting the retention of managers.
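
To make the idea of distance from a production frontier concrete, the sketch below uses corrected ordinary least squares (COLS), a simple deterministic frontier. This is a swapped-in technique chosen for brevity, not the stochastic frontier or data envelopment methods used in the studies cited here. The single-input log-linear form, the function name, and the team data are all hypothetical.

```python
import math


def cols_efficiency(inputs, outputs):
    """Corrected-OLS technical efficiency for a one-input log-linear frontier.

    Fits log(output) = a + b * log(input) by OLS, shifts the fitted line up
    until it passes through the largest residual, and scores each team by
    exp(residual - largest residual), so the best team scores exactly 1.0.
    """
    x = [math.log(v) for v in inputs]
    y = [math.log(v) for v in outputs]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("inputs must vary across teams")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    best = max(residuals)
    return [math.exp(r - best) for r in residuals]


# Hypothetical data: a playing-talent index and season wins for four teams.
talent = [80.0, 100.0, 120.0, 140.0]
wins = [70.0, 95.0, 90.0, 110.0]
efficiency = cols_efficiency(talent, wins)
for t, w, e in zip(talent, wins, efficiency):
    print(f"talent={t:.0f} wins={w:.0f} efficiency={e:.3f}")
```

Because the frontier is deterministic, every shortfall is attributed to inefficiency. Stochastic frontier models instead separate random noise from inefficiency, which is one reason the applied literature prefers them.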
In addition to the work of Zak, Huang, and Siegfried [118] on the NBA, other
researchers examined managerial efficiency in sports other than baseball
relatively early. Carmichael and Thomas [13] examined the production inefficiency
in English rugby by estimating the production function for Rugby Football League
teams and identifying teams’ potential given their available resources. Dawson
and Dobson [24] extended this type of analysis to measure managerial efficiency
in English soccer. They examined which traits and experiences were associated
with managers who brought their teams closer to the production
frontier given the team’s quality. They found that prior experience and familiarity
with the specific club through prior affiliation led to the highest efficiency
gains. Frick and Simmons [41] examined managerial quality in the German
Bundesliga. They found that quality coaching improved team performance by
limiting technical inefficiency and that coaches were paid below their estimated
MRP. Using ice hockey, Kahane [56] estimated the stochastic production frontier
to identify inefficiencies in the NHL. He found that inefficiencies were associated
with coaching ability, as well as team ownership, management experience, and
the share of players from specific areas. Fort, Lee, and Berri [36] used a similar
approach to estimate the technical inefficiency of basketball coaches in the NBA.
Like Scully [101], they found that job retention was strongly linked to a coach’s
technical efficiency.
In addition to manager efficiency, some have used production function analysis
to evaluate various sports strategies. In cricket, Schofield [96] examined data from
English county cricket to evaluate on-field strategies in addition to player selection,
lineups, and development of particular skills. Carmichael, Thomas, and Ward [14]
estimated the production function using the Opta Index as a quantifiable measure
of player contributions in English Premiership football. They identified the specific
activities on the pitch that most closely determined wins and identified the positive
impact of “aggressive play.” In a follow-up using the Italian Serie A football
league, Carmichael, Rossi, and Thomas [15] constructed composite measures of
performance using factor loading and identified offensive performance as more
critical than defensive performance. In baseball, Lee [71] evaluated a managerial
strategy referred to as “small ball.” He found that aggressive baserunning and
sacrificial activities to move runners closer to scoring were detrimental to the
ultimate goal of scoring as many runs as possible and a source of production
inefficiency.

Optimal Levels and Distribution of Inputs

In addition to evaluating the production efficiency with a given set of inputs,
sports data has also been used to test theories regarding the level of various inputs
and the optimal mix. Primarily, this literature examines whether the level and
distribution of spending is directly related to team success, both through winning
and through revenue earned. These examinations are testing the efficiency of
personnel management and the nature of demand for the product. Optimizing
the level of inputs generalizes most obviously to non-sports industries, as it
essentially asks whether sheer expenditure on skilled workers leads to positive
outcomes. The optimal mix of production inputs is also generalizable
and is often studied to address worker motivation based on
earnings differentials between coworkers.
When investigating the relationship between payroll and on-field success,
researchers have found evidence that the structure of the labor market is a key
determinant. Early examinations of the relationship between performance and
payroll took a relatively naive approach. Szymanski and Smith [113] found that
higher payrolls in English football (soccer) were positively related to a team’s
on-field success and profits, but this relationship turned negative when controlling
for a team’s endowments such as market and stadium sizes. Quirk and Fort [90]
examined correlations between winning percentages and payroll rank across the four
major North American sports leagues and found a small, but positive relationship
in only the NHL and NFL. Hall, Szymanski, and Zimbalist [47] more formally
tested the relationship between payroll and team success in both MLB and English
soccer using Granger causality tests. In baseball, they found only a weak causal
relationship running from performance to payroll which strengthened as league
revenue became more disparate in the late 1990s. In soccer, a Granger causality test
could not reject the hypothesis that payroll improved performance. Hall et al. believe
that freer player movement in soccer is likely responsible for the difference in these
results. Similarly, Simmons and Forrest [104] found that the relationship between
payroll and performance is weaker in North American sports leagues compared
to European football leagues. They also believe that labor market interventions in
North American sports leagues are likely responsible for this difference. Payroll
efficiency has also been tested using data envelopment analysis, with Einolf [30]
and Lewis, Sexton, and Lock [76] finding lower levels of efficiency for large-
market baseball teams, though Einolf did not find a similar relationship for NFL
teams. Data envelopment analysis has also been used to examine the efficiency of
player contracts [50] and hall of fame voting in baseball [82].
In an attempt to address expenditure efficiency, sports management has renewed
its focus on analytics to find undervalued player attributes. This relatively new
trend was popularized in the book Moneyball by Michael Lewis [75] which focused
on player contributions in MLB. Lewis examined the approach of a general manager
in baseball who took advantage of undervalued statistical contributions to the
production process to optimize team output despite a relatively small player payroll.
These advantages tend to be short lived, as the labor market responds to successful
roster management strategies for baseball, as documented in Hakes and Sauer
[46] and Congdon-Hohman and Lanning [21]. A similar analytics sea change has
occurred across many sports.
Rather than examining total payroll, many researchers have used sports data
to examine the question of how to distribute that payroll amongst the employees.
There is a debate in the broader economic literature as to how workers respond to
a large variance in wages between workers. Tournament theory, as established by
Lazear and Rosen [70] and Milgrom and Roberts [83], suggests that disparity in
compensation motivates workers to put forth more effort due to the clear rewards
established in the wage structure, which in turn increases overall worker productivity.
The cohesion theory (as established by Akerlof and Yellen [3] and Levine [74])
suggests that a compressed wage distribution promotes harmony in the workplace
and results in higher worker productivity. Sports data is particularly appropriate
to test these competing hypotheses since the nature of the work is cooperative,
production is meticulously measured, and salaries are known to coworkers and the
researcher. Additionally, since team production is often the output, most sports are
ripe for examination.
In most sports, the results when estimating wage inequality (as measured in various
ways, including Gini coefficients, Herfindahl-Hirschman Indexes [HHI], and
other measures of salary variance) as part of the team’s production functions suggest
a negative relationship between payroll disparity and team success. Using baseball
data, Richards and Guell (variance, [91]), Bloom (Gini, [9]), Depken (HHI, [26]),
Frick, Prinz, and Winkelmann (Gini, [43]), Jewell and Molina (Gini, [55]), and
DeBrock, Hendricks, and Koenker (HHI, [25]) all found a strong, negative
relationship between wage inequality and a team’s
on-field success. That said, DeBrock, Hendricks, and Koenker [25] found that wage
differences driven by differences in player quality did not lead to the same negative
effect.
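
The two inequality measures named most often above can be computed directly from a team’s salary list. The sketch below is a minimal illustration using the standard textbook formulas. The function names and the two payrolls are hypothetical, and the cited studies may apply different normalizations or sample restrictions.

```python
def gini(salaries):
    """Gini coefficient of non-negative salaries (0 means perfectly equal pay)."""
    values = sorted(salaries)
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive salary")
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def hhi(salaries):
    """Herfindahl-Hirschman Index of payroll shares.

    Equals 1/n when pay is equal and approaches 1 as one player takes the payroll.
    """
    total = sum(salaries)
    if total <= 0:
        raise ValueError("payroll must be positive")
    return sum((x / total) ** 2 for x in salaries)


# Hypothetical four-player payrolls (in $ millions) with the same total of 20.
equal_team = [5.0, 5.0, 5.0, 5.0]
star_team = [1.0, 1.0, 1.0, 17.0]
for name, payroll in (("equal", equal_team), ("star", star_team)):
    print(f"{name}: Gini = {gini(payroll):.2f}, HHI = {hhi(payroll):.2f}")
```

Both payrolls total the same amount, so any difference in the measures reflects distribution alone, which is the variation these studies relate to team success.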
Research examining the impact of wage inequality in professional ice hockey,
soccer, and American football has found a similar relationship. Sommers [108]
found weak evidence that inequality (as measured by a Gini coefficient) had a
negative impact on team success in the NHL, while Stefanec [110] found a stronger
negative result when looking at wage disparity within NHL position groups on
a team. Similarly, Franck and Nüesch (Gini and measure of variance, [38]) and
Coates, Frick, and Jewell [20] found negative relationships between wage disparity
and team outcomes in the German Bundesliga and American Major League Soccer,
with Franck and Nüesch [38] also finding that teams with higher inequality in pay
played more individualistically. In American football, Frick, Prinz, and Winkelmann
[43] found a negative but insignificant relationship between wage inequality and
team success using data from the NFL, while Mondello and Maxcy [84] found
similar results but also that wage dispersion improves team revenue production.
Research examining the NBA produced one of the few results supporting the
tournament theory regarding wage inequality and worker motivation. Frick, Prinz,
and Winkelmann [43] found that the NBA was the only major sports league in North
America where teams did better with more wage disparity (measured using a Gini
coefficient). Using different methodology and an HHI to measure wage inequality,
Berri and Jewell [7] find no evidence of a link between wage inequality and firm
productivity. Simmons and Berri [103] used changes to the salary structure in
the NBA due to changes to the league’s collective bargaining agreement with the
players’ union to show that salary dispersion based on variation in individuals’ talent
levels showed a positive relationship to winning but “unjustified” inequality had no
significant impact. Katayama and Nuch [60] found no relationship between salary
dispersion and team performance using the generalized method of moments (GMM) to
control for the effect of prior team success on current salary distribution.
Disparity in worker characteristics other than salary has also been examined
using sports data. Using data from MLB, Papps, Bryson, and Gomez [88] found
that baseball teams with a more even distribution of talent perform better than teams
with a more disparate distribution, though also that there is an optimal mix of talent
that outperforms teams with a very low dispersion. Gelade [45] also found that
a large spread in abilities in European soccer led to negative team performance,
though with more goals scored for the team and its opponents. Kahane, Longley,
and Simmons [57] examined the impact of cultural heterogeneity in the workplace
through nationality in the NHL. They found that having more European players from a
particular country improves team performance, but that drawing foreign players from
many different countries offsets the gains from player diversity.

Worker Effort and Compensation

Sports data has been used extensively to examine the value of players as discussed
above, but also to examine whether players modify their behavior and output based
on the inherent incentives of the labor market structure in their sport and from their
individual contract. Sports performance data is particularly apt for testing agency theory,
which examines misaligned incentives between the principal
(owners) and the agents (players). Though the principal-agent problem has been
investigated in other contexts (CEO pay in Abowd [1] and transitions from hourly
pay to piece rates in Lazear [69], for example), the nature of professional sports
contracts gives researchers a clear case where incentives for players change at
clearly distinguishable points.
Again, professional baseball is the focus of much of the research on opportunistic
behavior by labor because of the individual nature of the sport. Krautmann [62]
and Scoggins [97] examined the period immediately following the introduction
of free agency among players and found no evidence of improved performance
before a new contract was signed nor evidence of a decrease in performance after
(often referred to as “shirking”). Using later data, Sommers [107], Maxcy [80],
Maxcy, Fort, and Krautmann [81], and Krautmann and Donley [65] also found
no evidence of strategic behavior among baseball players following a new, long-
term contract. However, Sommers [107] did find evidence of shirking following
arbitration contracts. Woolway [117] found that players’ MRP declined in the year
following a new long-term contract. Krautmann and Solow [66] found evidence of
shirking, but only among players identified as unlikely to play after their current
contract expired. Many studies did find evidence that players spend more time
unavailable to play due to injury after signing a contract (Lehn [73], Scoggins [97],
and Maxcy, Fort, and Krautmann [81]), but this cannot necessarily be attributed
to shirking as teams may instead be protecting the player from more severe injury
given their long-term investment in the player’s labor.
Outside of baseball, basketball has been the most fertile ground for similar
examinations of player performance following long-term contracts, with
mixed results. Harder [48] found that over-rewarded players tend to play in a
more team-oriented manner than under-rewarded players in the NBA, suggesting
free agents respond positively following a new contract. Berri and Krautmann [8]
find that a player’s measured output declines following a long-term contract, but
their MRP does not. Stiroh [111] found evidence of opportunistic behavior with
players’ performances improving prior to signing a multi-year contract but declining
following the contract. He found this difference extended to team production, as
teams with many players on expiring contracts improved their play while teams with
many players recently signed to multi-year contracts showed a decline in performance.
Opportunistic behavior has also been tested in other sports. Fernie and Metcalf
[32] examined the horse racing industry and found that jockeys performed worse
when they were paid on a non-contingent basis through guaranteed salaries. In
the NFL, Conlin and Emerson [22] found that players in the last year of their
contract started more games than predicted (suggesting more effort) and Frick,
Dilger, and Prinz [42] found that teams that devoted a larger portion of their payroll
to guaranteed bonuses performed worse than teams with a lower portion. Frick [39]
used quantile regression to show that players in the German Bundesliga performed
significantly better in the final year of their contracts.

Conclusion

Sports teams and athletes bring an interesting theoretical twist to standard economic
models of competition and production by requiring cooperation among
competitors to produce their product. In addition, professional sports’ widespread
popular appeal and growing economic importance, together with the industry’s
ready availability of highly specialized data, have made sports economics a rapidly
growing field within economics. One study showed that the frequency of published papers
covering sports economics topics in a general interest journal had risen by a factor
of nearly 20 between the 1970s and the 2010s [79], and the number of colleges
and universities offering sports economics courses had risen from just a handful
nationwide in the 1990s to well over 100 by 2020. In fact, sports economics even
earned its own JEL code in 2015.
At the same time, the high stakes world of professional sports labor markets,
where the contract for a single athlete can now reach into the hundreds of millions
of dollars, has increasingly turned to the sports analytics models first developed by
economists like Gerald Scully, ushering in an era of quantitative analysis in sports
that was unheard of even 25 years ago. A modern general manager or vice president
of player acquisition is more likely to be an Ivy League-trained economist than a
former star player, and there is every indication that sports metrics will be even
more integral to the game in the future. Indeed, it seems clear that sports economics
has only just kicked off and that there is a long game ahead for the discipline.

References
1. Abowd JM (1990) Does performance-based managerial compensation affect corporate
performance? Ind Labor Relat Rev 43(3):52S–73S
2. Agha N (2013) The economic impact of stadiums and teams: the case of minor league
baseball. J Sports Econ 14(3):227–252
3. Akerlof GA, Yellen JL (1990) The fair wage-effort hypothesis and unemployment. Q J Econ
105(2):255–283
4. Atkinson SE, Stanley LR, Tschirhart J (1988) Revenue sharing as an incentive in an agency
problem: an example from the National Football League. Rand J Econ 19(1):27–43
5. Baade R, Dye R (1988) An analysis of the economic rationale for public subsidization of
sports stadiums. Ann Reg Sci 22:37–47
6. Baade R, Dye R (1990) The impact of stadium and professional sports on metropolitan area
development. Growth Chang 21:1–14
7. Berri DJ, Jewell RT (2004) Wage inequality and firm performance: professional basketball’s
natural experiment. Atl Econ J 32(2):130–139
8. Berri DJ, Krautmann AC (2006) Shirking on the court: testing for the incentive effects of
guaranteed pay. Econ Inq 44(3):536–546
9. Bloom M (1999) The performance effects of pay dispersion on individuals and organizations.
Acad Manag J 42(1):25–40
10. Bradbury JC (2013) What is right with Scully estimates of a player’s marginal revenue
product. J Sports Econ 14(1):87–96
11. Brown J (2011) Quitters never win: the (adverse) incentive effects of competing with
superstars. J Polit Econ 119(5):982–1013
12. Brown R (2012) Do NFL player earnings compensate for monopsony exploitation in college?
J Sports Econ 13(4):393–405
13. Carmichael F, Thomas D (1995) Production and efficiency in team sports: an investigation of
rugby league football. Appl Econ 27(9):859–869
14. Carmichael F, Thomas D, Ward R (2000) Team performance: the case of English premiership
football. Manag Decis Econ 21(1):31–45
15. Carmichael F, Rossi G, Thomas D (2017) Production, efficiency, and corruption in Italian
Serie A football. J Sports Econ 18(1):34–57
16. Chapman KS, Southwick L (1991) Testing the matching hypothesis: the case of Major League
Baseball. Am Econ Rev 81(5):1352–1360
17. Coates D, Humphreys BR (1999) The growth effects of sport franchises, stadia, and arenas. J
Policy Anal Manage 18:601–624
18. Coates D, Humphreys BR (2005) Novelty effects of new facilities on attendance at professional
sporting events. Contemp Econ Policy 23:436–455
19. Coates D, Humphreys BR (2007) Ticket prices, concessions and attendance at professional
sporting events. Int J Sport Financ 2:161–170
20. Coates D, Frick B, Jewell RT (2016) Superstar salaries and soccer success: the impact of
designated players in Major League Soccer. J Sports Econ 17(7):716–735
21. Congdon-Hohman JM, Lanning JA (2018) Beyond moneyball: changing compensation in
MLB. J Sports Econ 19(7):1046–1061
22. Conlin M, Emerson PM (2003) Multidimensional separating equilibria and moral hazard: an
empirical study of National Football League contract negotiations. Rev Econ Stat 85(3):760–765
23. Davenport DS (1969) Collusive competition in Major League Baseball: its theory and
institutional development. Am Econ 13(2):6–30
24. Dawson P, Dobson S (2002) Managerial efficiency and human capital: an application to
English association football. Manag Decis Econ 23(8):471–486
25. DeBrock L, Hendricks W, Koenker R (2004) Pay and performance: the impact of salary
distribution on firm-level outcomes in baseball. J Sports Econ 5(3):243–261
26. Depken CA II (2000) Wage disparity and team productivity: evidence from Major League
Baseball. Econ Lett 67(1):87–92
27. Dietl H, Grossmann M, Lang M (2011) Competitive balance and revenue sharing in sports
leagues with utility-maximizing teams. J Sports Econ 12(3):284–308
28. Eckard EW (2017) The uncertainty-of-outcome hypothesis and the industrial organization
of sports leagues: evidence from U.S. college football. J Sports Econ 18(3):298–317
29. Ehrenberg R, Bognanno M (1990) Do tournaments have incentive effects? J Polit Econ
98(6):1307–1324
30. Einolf KW (2004) Is winning everything? A data envelopment analysis of Major League
Baseball and the National Football League. J Sports Econ 5(2):127–151
31. El-Hodiri M, Quirk J (1971) An economic model of a professional sports league. J Polit Econ
79(6):1302–1319
32. Fernie S, Metcalf D (1999) It’s not what you pay, it’s the way that you pay it, and that’s what
gets results: jockeys’ pay and performance. Labour 13(2):385–411
33. Forrest D, Simmons R (2002) Outcome uncertainty and attendance demand in sport: the case
of English soccer. J R Stat Soc Ser D (The Statistician) 51(2):229–241
34. Fort R, Quirk J (1995) Cross-subsidization, incentives, and outcomes in professional team
sports leagues. J Econ Lit 33(3):1265–1299
35. Fort R, Quirk J (1996) Overstated exploitation: monopsony versus revenue sharing in sports
leagues. In: Fizel J, Gustafson E, Hadley L (eds) Baseball economics: current issues. Praeger,
Westport, pp 159–178
36. Fort R, Lee YH, Berri D (2008) Race, technical efficiency, and retention: the case of NBA
coaches. Int J Sport Financ 3(2):84–97
37. Fort R, Lee YH, Oh T (2019) Quantile insights on market structure and worker salaries:
the case of Major League Baseball. J Sports Econ 20(8):1066–1087
38. Franck E, Nüesch S (2011) The effect of wage dispersion on team outcome and the way team
outcome is produced. Appl Econ 43(23):3037–3049
39. Frick B (2011) Performance, salaries, and contract length: empirical evidence from German
soccer. Int J Sport Financ 6(2):87–118
40. Frick B, Prinz J (2007) Pay and performance in professional road racing: the case of city
marathons. Int J Sport Financ 2(1):25–35
41. Frick B, Simmons R (2008) The impact of managerial quality on organizational performance:
evidence from German soccer. Manag Decis Econ 29(7):593–600
42. Frick B, Dilger A, Prinz J (2002) Labor market regulation and post-contractual opportunism:
the effects of signing bonuses on player performance in the National Football League. Z
Betriebswirt 72:163–180
43. Frick B, Prinz J, Winkelmann K (2003) Pay inequalities and team performance: evidence
from the major North American leagues. Int J Manpow 24:472–488
44. García J, Rodríguez P (2002) The determinants of football match attendance revisited:
empirical evidence from the Spanish Football League. J Sports Econ 3(1):18–38
45. Gelade GA (2018) The influence of team composition on attacking and defending in football.
J Sports Econ 19(8):1174–1190
46. Hakes JK, Sauer RD (2006) An economic evaluation of the Moneyball hypothesis. J Econ
Perspect 20(3):173–185
47. Hall S, Szymanski S, Zimbalist AS (2002) Testing causality between team performance and
payroll: the cases of Major League Baseball and English soccer. J Sports Econ 3(2):149–168
48. Harder JW (1992) Play for pay: effects of inequity in a pay-for-performance context. Adm
Sci Q 37(2):321–335
49. Higgins R, Tollison R (1990) Economics at the track. In: Goff B, Tollison R (eds)
Sportometrics. Texas A&M University Press, College Station, pp 15–34
50. Howard LW, Miller JL (1993) Fair pay for fair play: estimating pay equity in professional
baseball with data envelopment analysis. Acad Manag J 36(4):882–894
51. Humphreys B (2002) Alternative measures of competitive balance in sports leagues. J Sports
Econ 3(2):133–148
52. Humphreys BR (2019) Should the construction of new professional sports facilities be
subsidized? J Policy Anal Manage 38(1):264–270
53. Humphreys BR, Miceli TJ (2019) The peculiar preferences of sports fans: toward a
preference-based motivation for the UOH. J Sports Econ 20(6):782–796
54. Humphreys BR, Pyun H (2017) Monopsony exploitation in professional sport: evidence from
Major League Baseball position players, 2000–2011. Manag Decis Econ 38(5):676–688
55. Jewell RT, Molina DJ (2004) Productive efficiency and salary distribution: the case of US
Major League Baseball. Scott J Polit Econ 51(1):127–142
56. Kahane LH (2005) Production efficiency and discriminatory hiring practices in the National
Hockey League: a stochastic frontier approach. Rev Ind Organ 27(1):47–71
57. Kahane L, Longley N, Simmons R (2013) The effects of coworker heterogeneity on firm-
level output: assessing the impacts of cultural and language diversity in the National Hockey
League. Rev Econ Stat 95:302–314
58. Kahn LM (1993) Managerial quality, team success, and individual player performance in
Major League Baseball. Ind Labor Relat Rev 46(3):531–547
59. Kahn LM (2007) Markets: cartel behavior and amateurism in college sports. J Econ Perspect
21(1):209–226
60. Katayama H, Nuch H (2011) A game-level analysis of salary dispersion and team perfor-
mance in the National Basketball Association. Appl Econ 43(10):1193–1207
61. Késenne S (2000) Revenue sharing and competitive balance in professional team sports. J
Sports Econ 1(1):56–65
62. Krautmann AC (1990) Shirking or stochastic productivity in Major League Baseball? South
Econ J 56(4):961–968
63. Krautmann AC (1999) What’s wrong with Scully-estimates of a player’s marginal revenue
product? Econ Inq 37(2):369–381
64. Krautmann AC (2013) What is right with Scully estimates of a player’s marginal revenue
product: reply. J Sports Econ 14(1):97–105
65. Krautmann AC, Donley TD (2009) Shirking in Major League Baseball revisited. J Sports
Econ 10(3):292–304
66. Krautmann AC, Solow JL (2009) The dynamics of performance over the duration of Major
League Baseball long-term contracts. J Sports Econ 10(1):6–22
67. Krautmann AC, Gustafson E, Hadley L (2000) Who pays for minor league training costs?
Contemp Econ Policy 18(1):37–47
68. Krautmann AC, Von Allmen P, Berri D (2009) The underpayment of restricted players in
North American sports leagues. Int J Sport Financ 4(3):161–175
69. Lazear EP (1986) Salaries and piece rates. J Bus 59(3):405–431
70. Lazear EP, Rosen S (1981) Rank-order tournaments as optimum labor contracts. J Polit Econ
89(5):841–864
71. Lee YH (2011) Is the small-ball strategy effective in winning games? A stochastic frontier
production approach. J Prod Anal 35(1):51–59
72. Leeds M, von Allmen P, Matheson V (2018) The economics of sports, 6th edn. Routledge,
Boston
73. Lehn K (1982) Property rights, risk sharing, and player disability in Major League Baseball.
J Law Econ 25(2):343–366
74. Levine DI (1991) Cohesiveness, productivity, and wage dispersion. J Econ Behav Organ
15(2):237–255
75. Lewis M (2003) Moneyball: the art of winning an unfair game. W. W. Norton, New
York/London
76. Lewis HF, Sexton TR, Lock KA (2007) Player salaries, organizational efficiency, and
competitiveness in Major League Baseball. J Sports Econ 8(3):266–294
77. Maloney M, McCormick R (2000) The response of workers to wages in tournaments:
evidence from foot races. J Sports Econ 1(2):99–123
78. Matheson V (2019) Is there a case for subsidizing sports stadiums? J Policy Anal Manage
38(1):271–277
79. Matheson V, von Allmen P (2014) Introduction to the symposium on sports economics. East
Econ J 40(4):456–458
80. Maxcy JG (1997) Do long-term contracts influence performance in Major League Baseball?
In: Hendricks W (ed) Advances in the economics of sport: volume 2. JAI Press, Greenwich,
pp 157–176
81. Maxcy JG, Fort RD, Krautmann AC (2002) The effectiveness of incentive mechanisms in
Major League Baseball. J Sports Econ 3(3):246–255
82. Miceli TJ, Volz BD (2012) Debating immortality: application of data envelopment analysis
to voting for the baseball hall of fame. Manag Decis Econ 33(3):177–188
83. Milgrom P, Roberts J (1988) An economic approach to influence activities in organizations.
Am J Sociol 94:S154–S179
84. Mondello M, Maxcy J (2009) The impact of salary dispersion and performance bonuses in
NFL organizations. Manag Decis 47(1):110–123
85. Neale W (1964) The peculiar economics of professional sports. Q J Econ 78(1):1–14
86. Nemec D (2004) The beer and whisky league: the illustrated history of the American
Association – Baseball’s Renegade Major League. The Lyons Press, Guilford
87. Nutting A (2008) To the slimmer go the spoils: heterogeneous responses to bodyweight
incentives in Olympic weightlifting tournaments. East Econ J 34:375–389
88. Papps KL, Bryson A, Gomez R (2011) Heterogeneous worker ability and team-based
production: Evidence from Major League Baseball, 1920–2009. Labour Econ 18(3):
310–319
89. Porter PK, Scully GW (1982) Measuring managerial efficiency: the case of baseball. South
Econ J 48(3):642–650
90. Quirk J, Fort RD (1999) Hard ball: the abuse of power in pro team sports. Princeton University
Press, Princeton
91. Richards DG, Guell RC (1998) Baseball success and the structure of salaries. Appl Econ Lett
5(5):291–296
92. Rockerbie D (2010) Marginal revenue product and salaries: Moneyball redux. University
Library of Munich, Munich
93. Rosen S (1986) Prizes and incentives in elimination tournaments. Am Econ Rev 76(4):701–
715
94. Rottenberg S (1956) The baseball players’ labor market. J Polit Econ 64(3):242–258
95. Schmidt M, Berri DJ (2001) Competitive balance and attendance: the case of Major League
Baseball. J Sports Econ 2(2):145–167
96. Schofield JA (1988) Production functions in the sports industry: an empirical analysis of
professional cricket. Appl Econ 20(2):177–193
97. Scoggins JF (1993) Shirking or stochastic productivity in Major League Baseball: comment.
South Econ J 60(1):239–240
98. Scott FA, Long J, Somppi K (1985) Salary vs. marginal revenue product under monopsony
and competition: the case of professional basketball. Atl Econ J 13(3):50–59
99. Scully GW (1974) Pay and performance in Major League Baseball. Am Econ Rev 64(6):915–
930
100. Scully GW (1989) The business of Major League Baseball. University of Chicago Press,
Chicago
101. Scully GW (1994) Managerial efficiency and survivability in professional team sports. Manag
Decis Econ 15(5):403–411
102. Siegfried J, Peterson T (2000) Who is sitting in the stands? The income levels of sports fans.
In: Kern WS (ed) The economics of sports. UpJohn, Kalamazoo, pp 51–73
103. Simmons R, Berri DJ (2011) Mixing the princes and the paupers: pay and performance in the
National Basketball Association. Labour Econ 18(3):381–388
104. Simmons R, Forrest D (2004) Buying success: team performance and wage bills in US and
European sports leagues. In: Fort R, Fizel J (eds) International sports economics comparisons.
Praeger, Westport/London
105. Sloane P (1971) The economics of professional football: the football club as a utility
maximiser. Scott J Polit Econ 18(2):121–146
106. Soebbing B, Mason DS, Humphreys BR (2016) Novelty effects and sports facilities in smaller
cities: evidence from Canadian hockey arenas. Urban Stud 53(8):1674–1690
107. Sommers PM (1993) The influence of salary arbitration on player performance. Soc Sci Q
74(2):439–443
108. Sommers PM (1998) Work incentives and salary distributions in the National Hockey League.
Atl Econ J 26(1):119
109. Sommers PM, Quinton N (1982) Pay and performance in Major League Baseball: the case of
the first family of free agents. J Hum Resour 17(3):426–436
110. Stefanec N (2012) Does pay disparity really hamstring team performance? A study of
professional ice hockey players. Sociol Mind 2(4):447–457
111. Stiroh KJ (2007) Playing for keeps: pay and performance in the NBA. Econ Inq 45(1):145–
161
112. Szymanski S (2003) The economic design of sporting contests. J Econ Lit 41(4):1137–1187
113. Szymanski S, Smith R (1997) The English football industry: profit, performance and
industrial structure. Int Rev Appl Econ 11(1):135–153
114. Szymanski S, Valletti TM (2005) Incentive effects of second prizes. Eur J Polit Econ
21(2):467–481
115. Veblen T (1899) The theory of the leisure class. Oxford University Press; Reissue edition,
Oxford, UK
116. Volz B (2009) Minority status and managerial survival in Major League Baseball. J Sports
Econ 10(5):522–542
117. Woolway MD (1997) Using an empirically estimated production function for Major League
Baseball to examine worker disincentives associated with multi-year contracts. Am Econ
41(2):77–83
118. Zak TA, Huang CJ, Siegfried JJ (1979) Production efficiency: the case of professional
basketball. J Bus 52(3):379–392
119. Zech CE (1981) An empirical estimation of a production function: the case of Major League
Baseball. Am Econ 25(2):19–23
120. Zimbalist A (1992) Baseball and billions. Basic Books, New York
The Effects of Management on Production:
A Survey of Empirical Studies 43
Alecos Papadopoulos

Contents
Introduction
Content and Structure of the Chapter
Management as a Latent Variable
Panel Data Models with Management as the Firm’s Individual Effect
Structural Latent Variable Models
Stochastic Frontier Models
Estimating Management Effects using Data Envelopment Analysis
Empirical Studies using Proxies for Management
Management Effects in Professional Sports
Measuring Management
Measuring Management Through National and International Surveys
The World Management Survey Project
The Workplace Employment Relations Survey (WERS), UK
Some Conclusions
References

A. Papadopoulos
Athens University of Economics and Business, Athens, Greece
e-mail: [email protected]

S. C. Ray et al. (eds.), Handbook of Production Economics,
https://doi.org/10.1007/978-981-10-3455-8_45

Abstract

We review econometric studies that attempt to estimate the effects of management
on production, whether on output, productivity, or efficiency. We group the
studies mainly by a methodological criterion: whether they treat management as
a latent variable, whether they proxy it by some other variable(s), or whether they
attempt to construct a direct measure of management and use it as a regressor
in an econometric model. A large part of the literature uses data from small-scale
agriculture, while in recent years, national surveys have started to collect
data related to management and management practices from various industries
more systematically. Rather than being mentioned by telegraphic references, most
of the studies presented are given a somewhat detailed summary so that the reader
can acquire a good sense of the methodological choices made, the estimation
techniques adopted, and the results obtained on the effects of management.

Keywords

Management · Production · Efficiency · Productivity

JEL classification

D24, M11, M20

Introduction

Production economics has the primary task of formulating a system of internally
consistent and mathematically rigorous concepts, a system that can represent in
abstract the myriad production processes observed in the real world and then
be transparently used to obtain testable insights and tools for empirical analysis.
Fundamental among these concepts are the production “factors/inputs” and their
interrelation through the production function, properties like the law of diminishing
returns, static phenomena like economies of scale, and dynamic forces like technical
change... but not efficiency. Efficiency (in the use of resources viewed from many
different angles) is not an entity in this system, but an outcome. Students of
economics who have their first contact with the theory of production are usually
surprised to learn that the production function is defined as a frontier, representing
the maximum output to be produced from a given combination of inputs and
technology, namely, the efficiently produced output. “But”, they ask, “how can we
ascertain that the firm is fully efficient? Isn’t this what economics is all about?
Analyzing the current use of scarce resources in order to help and promote full
efficiency in their use? If use of resources is assumed already efficient, what is
our purpose here?” It is then the task of instructors to help these students realize
that in order to target something (full efficiency), one must first define it and then
measure the distance of our semi-Aristotelian world from this Platonic ideal. But
this means that the raison d’être of economics, the efficient use of scarce resources,
is ultimately an empirical matter.
The fact that the achieved degree of efficiency depends on “organization” was
apparently never in doubt. To quote Alfred Marshall, “Writers on social science
from the time of Plato downwards have delighted to dwell on the increased efficiency
which labour derives from organization.”1 Remaining two centuries in the past,
Francis A. Walker, then president of both the American Economic Association and
the American Statistical Association, offered in Walker [88, p. 275] a vivid
description of what it means to achieve efficiency in production: “The excess of
produce which we are contemplating comes from directing force to its proper object
by the simplest and shortest ways; from saving all unnecessary waste of materials
and machinery; from boldly incurring the expense –the often large expense– of
improved processes and appliances, while closely scrutinizing outgo and practising
a thousand petty economies in unessential matters;(...).”

1 Marshall [56], Book IV, chapter VIII. We note that Book IV “The Agents of Production” was one
of the parts of Marshall’s Principles that remained rather unchanged from its 1st edition in 1890 to
its 8th, 30 years later.

However, Walker was not examining efficiency; his article and the above quote
were about the source of “business profits,” understood as the surplus over and above
labor costs, capital rent, expenses, and even the wage a business owner could claim
for their time spent. Still, the fact that his description is so close to what any business
professional would instinctively think to say in order to describe how management
should go about achieving efficiency impresses powerfully the strong connection
between the two. To return to Marshall, it is to be noted that he uses the term “Agents
of Production” for the title of Book IV in his Principles (rather than “inputs”), and
these are, as we read in the subtitle, “Land, Labour, Capital and Organization.” So
“organization” was for Marshall an agent of production, and reading through Book
IV, we can safely map to this concept what we today call “management.”
As years passed and the field of efficiency and productivity analysis arose and
progressed, the conviction that management was the champion of efficiency in a
firm sailed through as a self-evident consensus. In the seminal Farrell [29] paper,
the author stated that “technical efficiency indicates the gain that can be achieved
by simply ‘gingering-up’ the management.” Half a century later, Fried, Lovell,
and Schmidt [34, p. viii], wrote “Ultimate responsibility for performance rests
with management. We believe that inefficiency arises from the varying abilities
of managers....” It appears that even when we acknowledge that the inefficiency
of a firm does not only come from failings in internal planning, monitoring, and
controlling, but that it is also affected by external factors like regulatory regimes,
cultural trends, and other socioeconomic forces, still, we treat management as
accountable for any inefficiency and as responsible for reducing it.
However, accountability and responsibility are contractual concepts (formal or
implicit), not causal. Casual observation indicates that most of the various stake-
holders of a firm tend to simplify their lives by blaming management for every bit
of inefficiency (which is causally inaccurate) and also praising solely management
for every bit of the firm’s success (which again is causally inaccurate). The duty
of scientific research is to go beyond these social tendencies and arrangements
and to provide hard evidence, eventually quantitative, of what the effects of
management on a firm are – and we focus on the effects of management on production.
We write “production” as the comprehensive word to cover “output,” “produc-
tivity,” and “efficiency.” Even though a clear analytical distinction between these
concepts can certainly be drawn [see for example Sickles and Zelenyuk [77,
p.97]], changes in efficiency will affect also productivity and output. Suppose
that a “simpler and shorter way” has been introduced in a production process,
perhaps after management overheard a conversation of competitors at the yearly
gala of the local trade association. Suppose that this change results in less labor
needed to produce given levels of output: efficiency has just been increased. But
“labor productivity” (output/labor ratio) has also increased. Moreover, if demand
conditions are favorable, the surplus labor may be retained and output may actually
increase.
These interconnections have consequences as regards how we view the
material here: some studies estimate the effects of management on output, others
on productivity (exclusively of labor), yet others on efficiency itself. We did not use
this as a criterion to group the studies, because we felt that we would misleadingly
create the impression that these are distinct sub-areas of research... They could and
maybe they should be. However, the main challenge was and still is to “measure
management” or at least its effects with some acceptable degree of accuracy. When
we arrive at that point, its effects through different routes and on different aspects of
production could be systematically analyzed. For the time being, the survey of the
literature indicates that we are still in an exploratory stage.
Marshall kept insisting that “it seems best sometimes to reckon Organization
apart as a distinct agent of production,”2 but when Paul Samuelson wrote his
seminal book, his stance was skeptical and his suggestion different: “It is useful,
I believe, to avoid the expression ‘factor of production’ entirely. This has been
used in at least two senses, neither of which is quite satisfactory. First, it has been
used to denote broad composite quantities such as ‘labor, land and capital.’ On
the other hand, it has been used to denote any aspect of the environment which
has any influence on production. I suggest only “inputs” be explicitly included
in the production function and that this term be confined to denote measurable
quantitative goods or services.”3 Note that “organization” was no longer mentioned
as a production input alongside the other, familiar ones.
Samuelson had his focus on the practical needs for quantitative analysis, and he
certainly knew that an easily obtained measure of management was not available.
Economics is acquainted with complex production inputs, “capital” being the obvi-
ous example. In principle, we could settle with measuring management in the same
way that we measure capital: using a monetary measure and summing up the salaries
of managerial positions, the expenses of monitoring, and information-processing
software and related costs. However, that kind of cost analysis is not usually done
by firms, and when it is done, it is not available to outside researchers. In light of this,
the profession followed Samuelson’s reasonable advice, and management was placed
early on in the “unobservable/latent variable” category. The present survey is also
a testimony to the imagination of researchers that tried to come up with very diverse
conceptual and technical methods to measure management, proxy it, and estimate
its effects on production. This at the same time means that there are no established
methodological paradigms on how to proceed, and each researcher, given also what
kind of data are available, almost starts from the beginning.

2 Marshall [56], p.115.

3 Samuelson [74], p. 84.
Francis Walker’s era is pictured as that of the individual business owner fighting
alone against entropy, competition, and adverse chance events. This picture has
persisted in agriculture even to our day, and many studies are concerned with
management in agriculture, likely due to the availability of data that relate somehow
to management, but also because of the sensitive sociopolitical status of the industry
that caters to the daily biological needs of the human race, as well as its central
importance in underdeveloped and developing countries. However, this situation
characterizes in general the small-and-medium enterprises (SMEs) that make up
almost all of the firm population globally. In the modern industrial and post-
industrial economy, “management” also arose as a system of control, alongside
the realization that it can be abstracted, analyzed, and taught. This opened the
way for general methodologies to measure management, which led to projects like
the “World Management Survey” to take root and offer a more widely applicable
methodology to measure it (see section “The World Management Survey Project”).
We can crystallize this by making the distinction between “leadership” and “man-
agement.”4 Some studies focus on both (and in some cases they found them to be
incompatible), others on management “only.” While the single most comprehensive
word that conveys the function of management is “control,” decomposing the main
functions of management (planning, monitoring, intervening) makes the case that
the effects of management on production are bound to be transmitted through many
channels: management as another input or a technology shifter (or a technology
itself), management as affecting the structure of the production function (and so
the effects of other inputs on production), management as affecting efficiency...
So, not only is management hard to measure, but its effects are also hard to disentangle.
Nevertheless, scholars did not shy away from the challenge, and this survey attempts
to bring together very different approaches and applied methodologies that span 70
years of research, ultimately to see if we can arrive at some general conclusions. We
think we can, and we present them at the end.

Content and Structure of the Chapter

This being a chapter in a handbook of production economics, the focus is on studies
using econometric methods related to production. However, this is not the only way
by which scholars have approached the phenomenon of management. Consequently,
certain strands of economics and other disciplines that deal with management are
not surveyed here.
We did not engage with the empirical literature coming from management
science: being a different discipline, it has a different vocabulary, concepts, and
models, and it would be infeasible to present those alongside the empirical studies
coming from this field.5

4 Lucas’s [53] theoretical model remains to this day the simplest and most elegant way to
distinguish between the two while examining them together.

The effects of management in agriculture have been under intense scrutiny and
study from the beginning of the twentieth century, initially under the label “Farm
management.”6 From this field, we excluded studies that are concerned with the
effects of implementing certain operational production techniques, because these
relate essentially to the effects that technology in its conventional sense has on
production and not to those of management viewed as a control mechanism.
A large literature exists examining the relation between the financial performance
of firms and their top management. We do not review this literature, because our
concern is the effects of management as a control mechanism in production and
production efficiency, rather than on the intertemporal financial fortunes of a firm
that are mainly linked to the “leadership” function.
With some difficulty, we also decided to not review the empirical literature
coming from personnel economics and from insider econometrics. The reasons
that finally weighed in for this decision were two: the first is that both fields
focus intensely on examining the effects of “human resources management on
worker productivity.” One could argue that management exists mostly because
humans exist in the production process, and so human resources management is
perhaps the most important component of it. However, we are interested in a more
encompassing view of management that allows for more general conclusions as
regards the management effects on production. The second reason is that there exist
relatively recent comprehensive presentations and surveys of these fields, Lazear
and Oyer [50] for personnel economics, and Ichniowski and Shaw [45] for insider
econometrics, both in the 2013 Handbook of Organizational Economics. In contrast,
to our knowledge the research we review here has not been covered in a unified
way before. Interested readers should by all means consult these surveys, if only
for the methodological insights about what it means to conduct quantitative
microscopic analysis in the present day.
Even with these exclusions, the survey is not exhaustive and it could not
have been. However, we tried to provide a representative sample of the various
approaches that have been applied over the decades. We allowed for room as regards
the definition of management and looked carefully at how the authors themselves
understood the concept, argued for their methodology, and interpreted their results.
Rather than writing a condensed “list of references” and a “guide to literature”
only, we decided to present each study in some detail so that the chapter can be
read as a self-contained introduction to our topic. These presentations are mostly
descriptive with only occasional light commentary. Our assessment of the various
methodological strands is reserved for the Conclusions section.

5 Some papers published in management journals were included because their approach and
methodology were fully econometric.
6 See McCown, Brennan, and Parton [60] and McCown and Parton [59] for a critical look at the
history and evolution of this field.

Because we survey empirical studies where the chosen applied model and
method are as important as the underlying theoretical concepts, we opted for
a methodological criterion to organize the material. The surveyed works are
grouped in sections based on how management is treated as a component in
their estimation model: there are studies that treat it as an unobservable, and
they attempt to capture indirectly its effects by different methodologies. There
are studies that proxy management, most of the time by more than one
variable. Here one finds the most variety in research approaches: in order to
proxy management, researchers use variables related to the human capital of
managers (age, education, experience), various “informational/knowledge” variables
(like access to consulting services or supporting public agencies), indicators of
management practices, goal achievement, or financial indicators like mixed oper-
ating margins. One could argue that when these variables relate “directly” to
managerial aspects like information flows (reporting/monitoring) and management
practices (controlling), the authors do not “proxy” management, but they actually
use a direct measure of it, only they do not construct a single comprehensive
management index/variable, but they use it in disaggregated form, much like one
could take the production input “labor,” disaggregate it into different categories
(e.g., per education level), and use these as autonomous production factors in
a regression specification. However, we decided to include these studies in the
section dealing with proxying management because the “disaggregation” alluded to
is only conjectured by the reader, while the authors of these studies do not
reflect as to whether, and how, the various covariates they use could be mean-
ingfully aggregated into a comprehensive single management variable. Finally,
there are those studies that boldly construct a single measure of management, and
in acknowledgment of their intellectual courage (or risk-taking), we give them
the last consideration, not least because these include organized efforts through
(inter)national surveys that are bound to provide large and accessible data sets that
include a measure of management, opening the way for a more systematic research
on the effects of management on production. Nevertheless, we acknowledge that
whether a study is “proxying” or “measuring” management is in many cases open to
debate.
Some studies examine the effects of management on output and/or productivity
(in its quantitative sense i.e., as output per unit of some input), others study the
management effects on some indicator of financial “success”/efficiency, and some
do both. In practice, this comes down to what was the dependent variable(s) in the
econometric specification of each paper. We did not use this as a criterion to organize
the material, but we are reporting what was the dependent variable in all cases. The
only industry that was granted a separate subsection was that of professional sports,
because of the gamely mental leap required to think of “output” as match points or
games won. Apart from that, we resisted the temptation to try to sub-group the other
studies per topic or research focus in each section, because our main purpose here
is not to write mini-reviews per industry but to provide as panoramic a view as possible
of the relation between management and production. Consequently, in every
section the studies are presented in chronological order.
Specifically, the survey is structured as follows: the section “Management as a
Latent Variable” presents studies that treat management as a latent variable, having
some or no data related to it, but that nevertheless attempt to measure its effects
on output. The section “Empirical Studies using Proxies for Management” presents
studies that proxy management by one or many variables that the authors argue are
linked to management and reflect its “level,” “intensity,” or “quality.” This section
contains also the subsection on professional sports. The section “Measuring Man-
agement” presents the studies that attempt to construct a measure of management
and use it for estimation purposes. Weight is given to the appearance of national
surveys that collect data related to management and the studies that use them. The
final section contains our conclusions.

Management as a Latent Variable

Four different approaches have been used to estimate the effects of management
on output while treating it as an unobservable: the individual effects panel data
model, the structural latent-variables framework, stochastic frontier analysis (SFA),
and data envelopment analysis (DEA).

Panel Data Models with Management as the Firm’s Individual Effect

Management as a latent production factor was apparently what inspired Yair
Mundlak to pioneer the individual effects panel data model. Hoch [41] had obtained
a measure of “entrepreneurial capacity,” essentially advancing the view that the
contribution of management to production hides in the regression residuals.
Building on these ideas, Mundlak [64] showed how, with a panel data set of
just two observations along the time dimension and a constant-returns-to-scale
assumption, one could obtain a measure of management and of its output elasticity.
He modeled the (unobservable) management variable as another input in a Cobb-
Douglas function with its own elasticity, Q = B0 Xb1 M c (we assume a single
additional input for simplicity). Using lower-case letters for variables in logarithms
and adding a random disturbance, the panel data specification becomes

qit = b0 + b1 xit + cmi + vit , i = 1, ..., N, t = 1, ...T.

So the management variable mi was mapped to the time-invariant “individual

effect” for each firm, conceptually multiplied by the management output elasticity
c. Mundlak re-specified the equation to express the management variable as a
deviation from its sample mean m̄:

qit = (b0 + cm̄) + b1 xit + c (mi − m̄) + vit.

43 The Effects of Management on Production: A Survey of Empirical Studies 1659

This allowed him to apply what is now called the Least Squares Dummy
Variables (LSDV) estimator, and he obtained estimates for the slope coefficients
using the “within” estimator. Then, pooling the data, he obtained an estimate
for $(b_0 + c\,\bar{m})$. This gives an estimated series for $c\,(m_i - \bar{m})$, as a product. The
assumption of constant returns to scale in inputs including management, $b_1 + c = 1$,
permitted him to estimate the management output elasticity as $\hat{c} = 1 - \hat{b}_1$ and then to
divide the estimated $c\,(m_i - \bar{m})$ by $\hat{c}$ to obtain a data series on management in
mean-deviation form. The unobservable had been revealed and estimated. Mundlak also discussed
the inclusion of a “time effect,” especially when the time dimension of the panel is
longer than two periods. He provided an empirical illustration using a panel data set of
family farms from Israel, with $N = 66$, $T = 5$. He ran logarithmic Cobb-Douglas
regressions with the management and time effects either restricted to zero or left free,
and found that the elasticity estimates of the inputs changed substantially
as the specification changed. He also found that the estimated management series
was correlated with the regressors. Finally, he computed that the management output
elasticity under constant returns to scale (the value of $c$) was 0.205 when a time
effect was included and dropped to 0.122 when it was not.
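The steps just described can be traced in a minimal simulation sketch. It is an illustration under stated assumptions, not a reproduction of Mundlak's computations: the data are simulated, all parameter values and names (`b0`, `b1`, `c`, `m`, `x`, `q`) are hypothetical, a single conventional input is used, and the estimate of the constant is taken as the mean of the estimated firm intercepts.

```python
import random

random.seed(0)

# Hypothetical "true" values for the simulation (assumptions, not estimates
# from any study). Constant returns to scale hold by construction: b1 + c = 1.
N, T = 200, 5
b0, b1, c = 1.0, 0.8, 0.2


def mean(values):
    return sum(values) / len(values)


def corr(u, v):
    """Sample correlation coefficient of two equal-length lists."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


# Latent (log) management m_i; the input x_it is correlated with it.
m = [random.gauss(0, 1) for _ in range(N)]
x = [[0.5 * m[i] + random.gauss(0, 1) for _ in range(T)] for i in range(N)]
q = [[b0 + b1 * x[i][t] + c * m[i] + random.gauss(0, 0.1) for t in range(T)]
     for i in range(N)]

# Step 1: "within" estimator of b1 (deviations from firm means remove c*m_i).
x_bar = [mean(row) for row in x]
q_bar = [mean(row) for row in q]
num = sum((x[i][t] - x_bar[i]) * (q[i][t] - q_bar[i])
          for i in range(N) for t in range(T))
den = sum((x[i][t] - x_bar[i]) ** 2 for i in range(N) for t in range(T))
b1_hat = num / den

# Step 2: firm intercepts estimate b0 + c*m_i; their mean estimates b0 + c*m_bar.
intercepts = [q_bar[i] - b1_hat * x_bar[i] for i in range(N)]
const_hat = mean(intercepts)
cm_dev_hat = [a - const_hat for a in intercepts]  # estimates c*(m_i - m_bar)

# Step 3: impose constant returns to scale to separate c from m_i.
c_hat = 1 - b1_hat
m_dev_hat = [d / c_hat for d in cm_dev_hat]  # management in mean-deviation form

print(f"b1_hat = {b1_hat:.3f}, c_hat = {c_hat:.3f}")
print(f"corr(recovered, true management) = {corr(m_dev_hat, m):.3f}")
```

The within estimate of `b1` does not rely on constant returns to scale, but the scale of the recovered management series does: if `b1 + c` differed from one, dividing by `c_hat` would mis-scale the series.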
Hoch [42] presented more fully the panel data model with both “firm-specific”
and “time” effects, and he also allowed for the possibility that the firm may not
be strictly profit-maximizing, either because of external constraints or because it
pursues other goals in the specific period under examination. In this paper, the author called
the “firm-specific” effect an indicator of “technical efficiency,” in line with the
terminology adopted in efficiency analysis and SFA. He presented an empirical
study of 63 Minnesota farms over the period 1946–1951. As in Mundlak [64],
here too most of the output elasticities of conventional inputs changed when time
and/or firm effects were allowed, but the most striking result was the reduction of
the output elasticity of labor to around 0.05, from a value of 0.256 in the restricted
regression (zero time and firm effects). Hoch again adopted the “entrepreneurial
capacity” interpretation of the firm effect. Viewed as another input, it had an implied
elasticity under constant returns to scale of 0.17. Viewed as an output shifter (translated
back into levels), the value of the fixed firm effect ranged in the interval [0.7, 1.4]:
this implied that in some cases the management factor dragged output below
its “unmanaged” level, while in other cases it expanded output above this level. In
this result, we have a confounded trace of the “battle” between the management
factor and inefficiency that characterizes the two-tier stochastic frontier model of
production, which is presented later in this survey.
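A two-way specification of this kind can be sketched on simulated data. The sketch below is only an illustration under assumptions: the panel is balanced, there is a single input, all values are hypothetical, and the firm effects are normalized to have mean zero in logarithms, so that a shifter of 1 corresponds to the average firm rather than to an “unmanaged” level. It contrasts the restricted (pooled) regression with the two-way “within” estimator and then translates the firm effects back into levels.

```python
import math
import random

random.seed(1)

# Simulated balanced panel; all values are hypothetical.
N, T = 300, 6
b_true = 0.5
firm = [random.gauss(0, 0.3) for _ in range(N)]  # firm effects (in logs)
year = [random.gauss(0, 0.1) for _ in range(T)]  # time effects (in logs)
# The input is correlated with the firm effect, which biases pooled OLS.
x = [[3 * firm[i] + year[t] + random.gauss(0, 1) for t in range(T)]
     for i in range(N)]
q = [[b_true * x[i][t] + firm[i] + year[t] + random.gauss(0, 0.1)
      for t in range(T)] for i in range(N)]


def mean(values):
    return sum(values) / len(values)


def flatten(z):
    return [v for row in z for v in row]


def two_way_demean(z):
    """Subtract firm means and time means, then add back the grand mean."""
    row = [mean(z[i]) for i in range(N)]
    col = [mean([z[i][t] for i in range(N)]) for t in range(T)]
    grand = mean(row)
    return [[z[i][t] - row[i] - col[t] + grand for t in range(T)]
            for i in range(N)]


def slope(xs, ys):
    """Least squares slope of ys on xs (flat lists, intercept included)."""
    mx, my = mean(xs), mean(ys)
    return (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
            / sum((a - mx) ** 2 for a in xs))


# Restricted regression: firm and time effects set to zero (pooled OLS).
b_pooled = slope(flatten(x), flatten(q))

# Unrestricted regression: two-way "within" estimator.
b_within = slope(flatten(two_way_demean(x)), flatten(two_way_demean(q)))

# Firm effects normalized to mean zero, then translated back into levels.
x_row = [mean(r) for r in x]
q_row = [mean(r) for r in q]
x_all, q_all = mean(x_row), mean(q_row)
effects = [(q_row[i] - q_all) - b_within * (x_row[i] - x_all) for i in range(N)]
shifters = [math.exp(e) for e in effects]

print(f"pooled slope = {b_pooled:.3f}, within slope = {b_within:.3f}")
print(f"shifter range = [{min(shifters):.2f}, {max(shifters):.2f}]")
```

Under this normalization some shifters necessarily lie below 1 and others above it, which is the pattern that the interval reported by Hoch displays.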
Massell [57] applied essentially the same methodology as the previous authors,
but his sample of twenty peasant farms from Rhodesia (now Zimbabwe) for the
1961–1962 crop year was not a conventional panel data set but a cross-sectional
sample with three different outputs per firm (corn, peanuts, millet). The author
specified separate logarithmic production regressions for each output per firm and
allowed the output elasticities and the individual effect to differ per farm per
output. Because two inputs (a binary soil quality variable and an estimate of
fixed capital) were common to all three output equations per farm, the regressor matrix
became singular. In such a case, least squares estimates can be obtained by the
use of the unique Moore-Penrose generalized inverse, which gives, among all
coefficient vectors that minimize the Euclidean norm of the residuals, the one that
itself has minimum Euclidean norm. The author opted instead to run a
two-stage procedure, where initially each output was regressed on the common
inputs, and the residuals from this regression were used as the dependent variable
to estimate the elasticities of the other inputs and the individual effects. As regards
the individual/management effect values, translated back into levels, they ranged in
[0.97, 1.35]. As the author reports, combined with the recorded level of inputs, this
range implied that the best firm could produce double the output of the average
firm, and the average firm could produce double the output of the worst firm. Unlike
the results in Mundlak [64], here the management factor did not appear correlated
with the other inputs. Finally, the author detected a short one-year “learning curve”:
the mean management effect was low for firms operating just one year, while
for firms operating two years or more, it increased in the second year by a factor of
≈2.5 and remained statistically constant at that value.
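The generalized-inverse route mentioned above, which Massell did not take, can be illustrated with NumPy. This is a sketch on simulated data, not his specification: the regressors are hypothetical, and the singularity is created by making one column an exact sum of two others. It shows that every least squares solution yields the same residuals, while the Moore-Penrose solution is the one with the shortest coefficient vector.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)

# The last column is an exact linear combination of x1 and x2,
# so X'X is singular and the usual OLS formula cannot be applied.
X = np.column_stack([np.ones(n), x1, x2, x1 + x2])
y = 1.0 + 0.5 * x1 + 0.3 * x2 + rng.normal(scale=0.1, size=n)

rank = np.linalg.matrix_rank(X)  # 3 rather than 4

# Minimum-norm least squares solution via the Moore-Penrose inverse.
# rcond is set explicitly so the near-zero singular value is discarded.
beta_min_norm = np.linalg.pinv(X, rcond=1e-10) @ y

# Any other least squares solution differs by a vector in the null space of X.
null_vec = np.array([0.0, 1.0, 1.0, -1.0])  # X @ null_vec is zero
beta_other = beta_min_norm + 0.7 * null_vec

resid_min_norm = y - X @ beta_min_norm
resid_other = y - X @ beta_other

print("rank of X:", rank)
print("same residuals:", np.allclose(resid_min_norm, resid_other))
print("norms:", np.linalg.norm(beta_min_norm), np.linalg.norm(beta_other))
```

Because the individual coefficients on the collinear columns are not identified, the minimum-norm choice is a convention rather than an estimate of each elasticity, which may be one reason to prefer a two-stage procedure.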
Siebert and Zubanov [79] did not equate the individual effect with management,
but they treated it as a function of the latter. They looked at a single large UK-based
clothing retailer with more than 600 stores in the UK and many more internationally,
with the intent to estimate the store-manager effects.7 They used weekly data
for the period February 2005 to February 2006 (52 time periods) from 245 UK
stores. The stores were chosen so that the store manager had already spent a year as
manager of the store at the beginning of the sample and stayed on as manager
throughout the sample period. In this way, they controlled for any initial learning
curve and also for a changing management input. They estimated an individual
effects autoregressive panel data model with sales as the dependent variable
and then regressed the obtained series of store individual effects on controls and
six “Key Behavioral Indicators” (KBIs): sales focus, commercial awareness, developing
people, drive and personal development, leadership, and planning/organizing. These
indicators came from an evaluation system of store managers run by the company itself.
In regressions where a single KBI was included, each proved to be individually
significant, except planning/organizing. But when some combination or all of them were
used, only “commercial awareness” remained a significant factor in explaining the store’s
individual effect. Across specifications, this KBI had a rather stable sales
semi-elasticity estimate of 0.07–0.10, which approximates the mean total sales elasticity,
since the mean value of this KBI was roughly unity.
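The step from semi-elasticity to elasticity in the last sentence can be written out as follows. The notation is ours and the functional form is an assumption made for illustration: $S$ denotes sales, $K$ the “commercial awareness” score, and the store effect is taken to come from an equation in log sales.

```latex
% Assumed form: log sales are linear in the KBI score K, other things equal.
\beta = \frac{\partial \ln S}{\partial K} \quad \text{(semi-elasticity)},
\qquad
\varepsilon(K) = \frac{\partial \ln S}{\partial \ln K}
               = \frac{\partial \ln S}{\partial K}\, K = \beta K .
% Evaluated at the sample mean of the score, which is roughly unity:
\varepsilon(\bar{K}) = \beta \bar{K} \approx \beta \in [0.07,\ 0.10].
```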
Wolff, Squires, and Guillotreau [89] examined the “skipper effect” on the French
purse-seine fleet harvesting tuna in the Indian Ocean. They used a three-dimensional
panel data set that identified vessel and vessel captain per observation over the
period 1980–2007 (n = 1197). Their dependent variable was a measure of the
total tuna catch per skipper working on a vessel during a specific fishing year in
the sample. As inputs, they used industry-specific variables. They also included two
measures of skipper’s human capital: overall experience as captain in the Indian
Ocean and tenure on each vessel. Their individual effects model included three

7 The single-firm focus has the flavor of insider econometrics.

heterogeneity effects: a time effect, a vessel effect, and a skipper effect, and they
interpreted the human capital variables as the time-varying part of the managerial
effect of the captain. The estimation results showed that these variables had no
effect on output, leaving only the time-invariant skipper effect as a management
production factor. This time-invariance result of the management effect is important
because it comes from a sample that ran over 28 years. The previous studies
using the individual effects model worked with much shorter time intervals;
for example, Mundlak [64] was careful to qualify his method, stating that it was
appropriate if management stays fixed from one year to the next. On the other hand,
Kaldor [48] made the distinction between “co-ordination” and “supervision,” which
we can alternatively express as the distinction between “resource planning and
allocation” and “process monitoring and control,” and argued on theoretical grounds
that the first is inherently a fixed production input even in the long run. The
time-invariance result of the managerial effects of vessel skippers appears to echo his
argument. The authors performed a variance decomposition to assess the importance
of individual effects on output. The managerial fixed effect variance was 5.7% of
output variance. Another finding was that the vessel effect was negatively correlated
with the skipper effect: the authors commented that this was counterintuitive, contrary to
the expectation that more efficient managers will tend to be matched with more
efficient vessels. However, it is also plausible to argue that skippers are used as
“efficiency adjusters” for vessels by the firm’s higher management, so that a more
uniform efficiency level is attained across the vessels in the fleet, a point that the
authors also make.
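The logic of such a variance decomposition can be sketched with simulated effects. Everything below is hypothetical: the effects are drawn per observation for brevity (in an actual panel they would be constant within vessel and within skipper), the magnitudes are arbitrary, and the negative association between vessel and skipper effects is built in by assumption. The sketch is not the authors' procedure; it only shows how variance shares and covariance terms add up to the variance of output.

```python
import random

random.seed(2)
n = 1000  # simulated observations (hypothetical)


def mean(values):
    return sum(values) / len(values)


def cov(u, v):
    """Population covariance of two equal-length lists."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


time_eff = [random.gauss(0, 0.3) for _ in range(n)]
vessel = [random.gauss(0, 0.5) for _ in range(n)]
# Negative sorting of skippers to vessels, imposed by assumption.
skipper = [-0.3 * v + random.gauss(0, 0.2) for v in vessel]
noise = [random.gauss(0, 0.6) for _ in range(n)]
output = [t + v + s + e
          for t, v, s, e in zip(time_eff, vessel, skipper, noise)]

components = {"time": time_eff, "vessel": vessel,
              "skipper": skipper, "noise": noise}
var_y = cov(output, output)

# Share of output variance attributed to each component's own variance.
shares = {name: cov(comp, comp) / var_y for name, comp in components.items()}

# Variances plus all cross-covariances reproduce the output variance exactly.
cov_total = sum(cov(a, b)
                for a in components.values() for b in components.values())

corr_vessel_skipper = (cov(vessel, skipper)
                       / (cov(vessel, vessel) * cov(skipper, skipper)) ** 0.5)

for name, share in shares.items():
    print(f"{name:8s} share of output variance: {share:.3f}")
print(f"corr(vessel, skipper) = {corr_vessel_skipper:.3f}")
```

The shares do not sum to one when the effects are correlated; the negative vessel-skipper covariance accounts for the difference.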

Structural Latent Variable Models

Ford and Shonkwiler [30] developed a structural latent variable model to estimate
the effects of three dimensions of managerial ability (financial, dairy, and crop) on a
farm’s “financial success” (measured as net farm income). They related linearly each
of these three latent management variables to four distinct observable indicators:
financial indicators for the first (like equity-to-assets ratio and gross profit margin)
and (different) productivity and cost indicators for the other two. These were not
used as proxy variables for management, but as “intermediates” to eventually
estimate, by maximum likelihood, the direct effect of the latent management
variables on net farm income (in the same spirit that we use an instrumental
variable not to proxy a regressor but to estimate the regressor’s marginal effect).
That is, causality runs from the latent variables to the observables. Herd size was
also included in the specification, to account for scale effects. They applied their
model to a sample of 880 dairy farms from Pennsylvania using 1990 data. Financial
and especially dairy management were estimated as having a positive effect on
net farm income, while crop management appeared not to matter. An additional
interesting result was that the estimated correlations between the three management
latent variables were all negative (although small): focusing on one management
aspect appeared to antagonize the others.
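A generic system with the structure described above can be written as follows. The notation is assumed for illustration and is not taken from Ford and Shonkwiler [30]: $\eta_k$ are the three latent management variables, $y_{jk}$ their observable indicators, $\pi$ net farm income, and $h$ herd size, with all variables taken in deviations from their means.

```latex
% Measurement equations: four indicators for each latent management variable.
y_{jk} = \lambda_{jk}\,\eta_k + \varepsilon_{jk},
\qquad j = 1,\ldots,4, \quad k \in \{\text{financial},\ \text{dairy},\ \text{crop}\}
% Structural equation: direct effects of the latent variables on income.
\pi = \sum_{k} \gamma_k\,\eta_k + \delta\,h + \zeta
```

In such a system the loadings $\lambda_{jk}$ carry the direction of causality from the latent variables to the indicators, and the coefficients $\gamma_k$ are the direct effects of interest.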
Kalaitzandonakes and Dunn [47] looked at the relation of education to
managerial ability and technical efficiency. By applying three different methods (a
deterministic frontier model, an SFA model, and DEA), they showed that the
relation between technical efficiency and education is sensitive to the estimation
method chosen. In light of the different results obtained, they argued that we
can treat these estimated technical efficiency series as imperfect indicators, and so
functions, of the true latent technical efficiency and, at the same time, the latter as a
function of the true latent managerial ability. Education, farming experience, and
consulting/training services on new technologies (“agricultural extension”) were
used as the imperfect observable indicators of managerial ability. This led to a
structural latent variables system of equations that they applied to a sample of 82 corn
farms from Guatemala. They found a strong positive connection between technical efficiency
and managerial ability, as well as a weaker positive connection of the latter with
education. They also found that the weighted combination of the three estimated
technical efficiency series had a 0.97 coefficient of determination as a predictor of
the latent technical efficiency, which lends support to a practical “model averaging”
approach, even without the Bayesian machinery at play.
Mäkinen [54] made the distinction between “management ability” (which he linked
to personal characteristics and psychological make-up) and “management
capacity” (“having both the necessary personal characteristics and the skills to
deal with the decision-making system”). He operationalized the distinction into
“managerial thinking” (MT) and “management process effe

