Workshop 2
ANALYSIS
MOTIVATION
USEFULNESS OF SFA
STRUCTURE OF THIS PRESENTATION
Concept of efficiency
Estimation
• Part 2: Empirics:
• References
1. Kumbhakar, S.C. and Lovell, C.A.K. (2000), Stochastic Frontier Analysis, Cambridge University Press, U.K.
TECHNICAL EFFICIENCY
ƒ(L, K; β) = 2 L^{0.5} K^{0.5}
ƒ(9, 16; β) = 2 · 9^{0.5} · 16^{0.5} = 24
Y ≤ ƒ(x; β) = YM
• Figure 1
• TE = Y/YM, where 0 ≤ TE ≤ 1
• Y = YM · TE = ƒ(x; β) · TE
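The arithmetic on this slide can be checked with a short Python sketch. The observed output of 18 is a hypothetical value chosen for illustration, and the function names are assumptions of this sketch, not part of the slides.

```python
def frontier_output(labour, capital):
    """Maximum output from the slide's frontier f(L, K) = 2 * L^0.5 * K^0.5."""
    return 2.0 * labour ** 0.5 * capital ** 0.5


def technical_efficiency(observed_output, labour, capital):
    """TE = Y / YM, the ratio of observed output to frontier output."""
    return observed_output / frontier_output(labour, capital)


y_max = frontier_output(9, 16)            # frontier output YM
te = technical_efficiency(18.0, 9, 16)    # 18.0 is a hypothetical observed Y
print(y_max, te)
```

A producer on the frontier has TE = 1; any shortfall of observed output below YM lowers TE toward 0.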
STOCHASTIC FRONTIER
• Figure 2
• TE = Y / [ƒ(x; β)·exp(v)]
     = ƒ(x; β)·exp(v)·exp(−u) / [ƒ(x; β)·exp(v)]
     = exp(−u)
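The cancellation above can be illustrated numerically. This is a minimal sketch under stated assumptions: the one-input kernel, the noise scales, and the seed are all hypothetical choices made for the example.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible


def frontier(x, beta=2.0):
    """Hypothetical one-input deterministic kernel f(x; beta) = beta * sqrt(x)."""
    return beta * math.sqrt(x)


x = 9.0
v = random.gauss(0.0, 0.1)        # two-sided noise, may be positive or negative
u = abs(random.gauss(0.0, 0.2))   # one-sided inefficiency, u >= 0

y = frontier(x) * math.exp(v) * math.exp(-u)   # observed output
te = y / (frontier(x) * math.exp(v))           # TE relative to the stochastic frontier

print(te, math.exp(-u))
```

Whatever the draw of v, the ratio depends only on u, which is why efficiency is measured against the stochastic frontier rather than the deterministic kernel.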
COST EFFICIENCY OR ECONOMIC EFFICIENCY
Allocative inefficiency
• Figure 3
• Figure 4
COST EFFICIENCY
• Ei = c(yi, wi; β) exp{vi + ui}   (1)
Estimation
• First rewrite as
ln Ei = ln c(yi, wi; β) + ui + vi (3)
SPECIFICATION
• Deterministic Kernel
– Cobb-Douglas (in log form)
– Translog (a flexible functional form)
• Random Variables vi and ui
– vi ∼ iid N(0, σv²)
– ui ∼ iid N⁺(0, σu²)
– vi and ui are distributed independently of each other, and of the regressors.
• Given these assumptions, the log-likelihood function for a sample of size I is
$$ \ln L = K - I \ln\sigma + \sum_{i} \ln\Phi\left(\frac{\varepsilon_i \lambda}{\sigma}\right) - \frac{1}{2\sigma^2} \sum_{i} \varepsilon_i^2 \qquad (4) $$
where εi = ui + vi, σ² = σu² + σv², λ = σu/σv, K is a constant, and Φ(.) is the standard normal cumulative distribution function. We substitute ln Ei − ln c(yi, wi; β) in place of εi in the likelihood function.
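A minimal Python sketch of this log-likelihood, assuming the normal/half-normal cost frontier with εi = ui + vi. The function name and the residual values are hypothetical; the constant is taken as K = (I/2) ln(2/π), which follows from the density of εi. Note that math.log would fail if Φ underflowed to zero for an extremely negative argument, which this sketch does not guard against.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def cost_frontier_loglik(residuals, sigma_u, sigma_v):
    """Log-likelihood for the normal/half-normal cost frontier, eps = u + v."""
    sigma = math.sqrt(sigma_u ** 2 + sigma_v ** 2)
    lam = sigma_u / sigma_v
    n = len(residuals)
    constant = 0.5 * n * math.log(2.0 / math.pi)   # the constant K
    log_cdf_sum = sum(math.log(STD_NORMAL.cdf(e * lam / sigma)) for e in residuals)
    sum_sq = sum(e * e for e in residuals)
    return constant - n * math.log(sigma) + log_cdf_sum - sum_sq / (2.0 * sigma ** 2)


eps = [0.12, -0.05, 0.30, 0.02, 0.18]   # hypothetical residuals
print(cost_frontier_loglik(eps, sigma_u=0.2, sigma_v=0.1))
```

In estimation, the residuals would be recomputed from β at every iteration and this function maximized over β, σu and σv. As σu approaches zero the expression collapses to the ordinary normal log-likelihood, which is a convenient sanity check.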
DERIVATION OF LIKELIHOOD FUNCTION
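• The density function of vi is
$$ f(v_i) = \frac{1}{\sqrt{2\pi}\,\sigma_v} \exp\left(-\frac{v_i^2}{2\sigma_v^2}\right) $$
• The density function of ui (half-normal, ui ≥ 0) is
$$ f(u_i) = \frac{2}{\sqrt{2\pi}\,\sigma_u} \exp\left(-\frac{u_i^2}{2\sigma_u^2}\right) $$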
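• Given the independence assumption, the joint density function of ui and vi is the product of their individual density functions, and so,
$$ f(u_i, v_i) = \frac{2}{2\pi\sigma_u\sigma_v} \exp\left(-\frac{u_i^2}{2\sigma_u^2} - \frac{v_i^2}{2\sigma_v^2}\right) $$
• Since εi = vi + ui, the joint density function for ui and εi is:
$$ f(u_i, \varepsilon_i) = \frac{2}{2\pi\sigma_u\sigma_v} \exp\left(-\frac{u_i^2}{2\sigma_u^2} - \frac{(\varepsilon_i - u_i)^2}{2\sigma_v^2}\right) $$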
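• The marginal density function of εi is then obtained by integrating ui out of f(ui, εi), which yields
$$ f(\varepsilon_i) = \int_0^\infty f(u_i, \varepsilon_i)\,du_i = \frac{2}{\sqrt{2\pi}\,\sigma} \left[1 - \Phi\left(-\frac{\varepsilon_i\lambda}{\sigma}\right)\right] \exp\left(-\frac{\varepsilon_i^2}{2\sigma^2}\right) $$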
• The likelihood function of the sample is then, by independence, the product of the density functions of the individual observations.
$$ L(\text{sample}) = \prod_{i=1}^{I} f(\varepsilon_i) $$
• Taking the log of the likelihood function then yields the log-likelihood equation.
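The integration step in this derivation can be checked numerically. The sketch below integrates ui out of the joint density with a simple midpoint rule and compares the result with the closed-form marginal density; the parameter values, the integration cut-off, and the function names are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def joint_density(u, eps, sigma_u, sigma_v):
    """Joint density f(u, eps) of inefficiency u >= 0 and composed error eps = u + v."""
    scale = 2.0 / (2.0 * math.pi * sigma_u * sigma_v)
    return scale * math.exp(-u ** 2 / (2 * sigma_u ** 2)
                            - (eps - u) ** 2 / (2 * sigma_v ** 2))


def marginal_density_numeric(eps, sigma_u, sigma_v, steps=20000):
    """Integrate u out of f(u, eps) over [0, upper] with the midpoint rule."""
    upper = abs(eps) + 10.0 * (sigma_u + sigma_v)   # the tail beyond this is negligible
    h = upper / steps
    return h * sum(joint_density((k + 0.5) * h, eps, sigma_u, sigma_v)
                   for k in range(steps))


def marginal_density_closed(eps, sigma_u, sigma_v):
    """Closed form: 2/(sqrt(2 pi) sigma) * [1 - Phi(-eps lam / sigma)] * exp(-eps^2 / (2 sigma^2))."""
    sigma = math.sqrt(sigma_u ** 2 + sigma_v ** 2)
    lam = sigma_u / sigma_v
    tail = 1.0 - NormalDist().cdf(-eps * lam / sigma)
    return (2.0 / (math.sqrt(2 * math.pi) * sigma)) * tail * math.exp(-eps ** 2 / (2 * sigma ** 2))


numeric = marginal_density_numeric(0.25, sigma_u=0.3, sigma_v=0.2)
closed = marginal_density_closed(0.25, sigma_u=0.3, sigma_v=0.2)
print(numeric, closed)
```

The two numbers should agree to several decimal places, which confirms that completing the square in ui produces the Φ term.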
• Battese and Corra (1977) re-parameterization
– γ = σu² / (σu² + σv²) → bounded between 0 and 1
– λ = σu / σv → any non-negative value
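• If we use the γ parameterization, the log-likelihood function is given by:
$$ \ln L = K - I \ln\sigma + \sum_{i} \ln[1 - \Phi(z_i)] - \frac{1}{2\sigma^2} \sum_{i} \varepsilon_i^2 \qquad (5) $$
where, for the cost frontier with εi = ui + vi,
$$ z_i = -\frac{\varepsilon_i}{\sigma} \sqrt{\frac{\gamma}{1-\gamma}} $$
so that 1 − Φ(zi) = Φ(εiλ/σ), as in (4). For a production frontier with εi = vi − ui, the sign of zi is reversed.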
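The two parameterizations carry the same information, since λ = sqrt(γ/(1 − γ)). A short sketch of the conversions, with hypothetical values for σu and σv and helper names chosen for this example:

```python
import math


def to_gamma(sigma_u, sigma_v):
    """gamma = sigma_u^2 / (sigma_u^2 + sigma_v^2), bounded between 0 and 1."""
    return sigma_u ** 2 / (sigma_u ** 2 + sigma_v ** 2)


def to_lambda(sigma_u, sigma_v):
    """lambda = sigma_u / sigma_v, any non-negative value."""
    return sigma_u / sigma_v


def lambda_from_gamma(gamma):
    """Map gamma back to lambda; valid for 0 <= gamma < 1."""
    return math.sqrt(gamma / (1.0 - gamma))


gamma = to_gamma(0.3, 0.2)
lam = to_lambda(0.3, 0.2)
print(gamma, lam, lambda_from_gamma(gamma))
```

Because γ lives on a bounded interval, it is convenient for grid searches over starting values, whereas λ is unbounded above.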
Calculating Producer Specific Efficiency
• ε̂i = ln Ei − ln c(yi, wi; β̂) is a composite estimate of ui + vi.
• But it contains information about ui. If ε̂i is high, then chances are that ui is high, since the expectation of vi is zero.
• The conditional distribution of ui given εi can be exploited to get estimates of producer-specific inefficiency. This was first demonstrated by Jondrow, Lovell, Materov, and Schmidt (1982), and since then this decomposition has been known as the JLMS technique. Either the mean or the mode of this conditional distribution can be used.
• The conditional mean is
$$ E(u_i \mid \varepsilon_i) = \sigma_* \left[ \frac{\phi(\varepsilon_i\lambda/\sigma)}{1 - \Phi(-\varepsilon_i\lambda/\sigma)} + \frac{\varepsilon_i\lambda}{\sigma} \right] \qquad (6) $$
where σ* = σuσv/σ and φ(.) is the standard normal density function.
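The conditional mean can be computed directly from a residual and the two variance parameters. The sketch below assumes the normal/half-normal cost frontier with εi = ui + vi and σ* = σuσv/σ; the residual values are hypothetical, and using exp(−E(u | ε)) as the efficiency score is one common choice, adopted here as an assumption rather than as the slides' prescription.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def jlms_conditional_mean(eps, sigma_u, sigma_v):
    """E(u | eps) for the normal/half-normal cost frontier with eps = u + v."""
    sigma = math.sqrt(sigma_u ** 2 + sigma_v ** 2)
    lam = sigma_u / sigma_v
    sigma_star = sigma_u * sigma_v / sigma
    a = eps * lam / sigma
    ratio = STD_NORMAL.pdf(a) / (1.0 - STD_NORMAL.cdf(-a))
    return sigma_star * (ratio + a)


def cost_efficiency(eps, sigma_u, sigma_v):
    """A point estimate of efficiency, exp(-E(u | eps)), between 0 and 1."""
    return math.exp(-jlms_conditional_mean(eps, sigma_u, sigma_v))


for e in (-0.10, 0.05, 0.40):   # hypothetical residuals
    print(e, jlms_conditional_mean(e, 0.3, 0.2), cost_efficiency(e, 0.3, 0.2))
```

The estimate is positive even for negative residuals and increases with the residual, which matches the intuition that a high ε̂i signals a high ui.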
Alternative Distributional Assumptions
• Until now we have assumed that ui ∼ iid N⁺(0, σu²).
• A more general formulation is the truncated normal distribution, ui ∼ iid N⁺(µ, σu²).
• Alternative distributions, such as the exponential and the gamma, can also be used.
• Does the distributional assumption matter? Yes, it matters for the calculated efficiency levels.
• However, the ranking of the producers is much less sensitive to distributional assumptions.
• In panel data models, one can estimate efficiency WITHOUT making ANY assumption about the distribution of ui.
Analyzing Efficiency Behaviour
• Two questions:
– What is the behavior of efficiencies over
time? Are they increasing, decreasing or
constant?
– What explains the variations in inefficiencies among producers and across time?
• Time behavior
– Following Kumbhakar (1990), Battese and
Coelli (1992) proposed a simple model
that can be used to estimate the time
behavior of inefficiencies.
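In the Battese and Coelli (1992) specification, the inefficiency of producer i in period t is u_it = exp(−η(t − T)) · u_i, where T is the last period. A minimal sketch of the implied time path follows; the values of u_i and η are hypothetical, and the function name is an assumption of this example.

```python
import math


def inefficiency_path(u_i, eta, n_periods):
    """Return [u_i1, ..., u_iT] with u_it = exp(-eta * (t - T)) * u_i."""
    return [math.exp(-eta * (t - n_periods)) * u_i
            for t in range(1, n_periods + 1)]


path = inefficiency_path(u_i=0.2, eta=0.1, n_periods=4)   # hypothetical values
print(path)
```

With η > 0 inefficiency falls over time, with η < 0 it rises, and with η = 0 it is constant, which answers the first of the two questions posed on this slide.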
Explaining efficiency
• Certain factors influence the environment in
which production takes place
– degree of competitiveness
– input and output quality
– network characteristics
– ownership form
– changes in regulation
– management characteristics
• Two ways to handle them
– Include them as control variables in the production process. Under this interpretation, these variables influence the structure of the technology by which conventional inputs are converted into outputs, but not efficiency.
  E = c(w, y; γ) exp{v + u}   (9)
  The parameter vector γ (distinct from the variance ratio γ used earlier) now includes cost parameters as well as environmental parameters.
– Associate variation in estimated efficiency with variation in the exogenous variables.
• Early papers adopted a two-stage approach
– Stage 1: A stochastic frontier equation was estimated (excluding the exogenous variables), typically by MLE under the usual distributional and independence assumptions, and the regression residuals were decomposed using the JLMS technique.
– Stage 2: The estimated inefficiencies were regressed on exogenous variables to explain/locate the source of inefficiency.
• The two-stage approach is econometrically inconsistent because the iid assumption necessary to use the JLMS technique is contradicted in the second stage.
• Kumbhakar, Ghosh, and McGuckin (1991) and Reifschneider and Stevenson (1991) proposed a single-stage approach: all the parameters of the stochastic frontier function, as well as those of the inefficiency function, are estimated together in a single MLE procedure.
• The model
• The non-negativity requirement uit = (δ′zit + εit) ≥ 0 is modeled as εit ∼ N(0, σ²), with the distribution of εit bounded below by the variable truncation point −δ′zit.
• This is implemented as "Model 2" in the FRONTIER program.
• Interesting hypotheses to test
– Test: γ = δ0 = δ1 = · · · = δm = 0 → implies no inefficiency
– Test: δ1 = · · · = δm = 0 → implies the truncated normal distribution
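Both hypotheses are nested restrictions, so they can be tested with a likelihood-ratio statistic computed from the restricted and unrestricted log-likelihood values. The sketch below uses hypothetical log-likelihood values; the function name is an assumption of this example.

```python
def likelihood_ratio_statistic(loglik_restricted, loglik_unrestricted):
    """LR = -2 * [ln L(H0) - ln L(H1)]; non-negative when H0 is nested in H1."""
    return -2.0 * (loglik_restricted - loglik_unrestricted)


# Hypothetical log-likelihood values, for illustration only.
lr = likelihood_ratio_statistic(loglik_restricted=-105.0, loglik_unrestricted=-95.0)
print(lr)
```

The degrees of freedom equal the number of restrictions. Because γ = 0 lies on the boundary of the parameter space, the usual chi-square critical values are generally not appropriate for the first test; a mixed chi-square distribution is typically used instead.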
FRONTIER 4.1 -- A PROGRAM FOR ESTIMATING
STOCHASTIC FRONTIER PRODUCTION AND COST
FUNCTION
THE FRONTIER PROGRAM
• Website: http://www.uq.edu.au/economics/cepa/software.htm
REQUIREMENTS OF THE PROGRAM
The Data File
• The program assumes a linear functional form. Thus for
estimating a Cobb-Douglas production function, all the input
and output quantities should be logged.
EXECUTING THE PROGRAM
THE INSTRUCTION FILE
• As Coelli writes in his Guide, “The best way to describe how to use
the program is to provide some examples.” So we discuss the
examples given in the user's Guide. The data, program, and output
used in this presentation are all taken directly from the Guide.
• For simplicity there are only two production inputs in all cases. In
the cross-sectional examples there are 60 firms, while in the panel
data examples, there are 15 firms and 4 time periods.
EXAMPLE 1
• where Qi, Ki and Li are output, capital and labour, respectively, and
Vi and Ui are assumed to be normally and half-normally distributed,
respectively.
Table 1d - Listing of Instruction File EG1.INS
________________________________________________________________
1 1=ERROR COMPONENTS MODEL, 2=TE EFFECTS MODEL
eg1.dta DATA FILE NAME
eg1.out OUTPUT FILE NAME
1 1=PRODUCTION FUNCTION, 2=COST FUNCTION
y LOGGED DEPENDENT VARIABLE (Y/N)
60 NUMBER OF CROSS-SECTIONS
1 NUMBER OF TIME PERIODS
60 NUMBER OF OBSERVATIONS IN TOTAL
2 NUMBER OF REGRESSOR VARIABLES (Xs)
n MU (Y/N) [OR DELTA0 (Y/N) IF USING TE EFFECTS MODEL]
n ETA (Y/N) [OR NUMBER OF TE EFFECTS REGRESSORS (Zs)]
n STARTING VALUES (Y/N)
IF YES THEN BETA0
BETA1 TO
BETAK
SIGMA SQUARED
GAMMA
MU [OR DELTA0
ETA DELTA1 TO
DELTAK]
Table 1e - Listing of Output File EG1.OUT
________________________________________________________
Output from the program FRONTIER (Version 4.1)
mu is restricted to be zero
(… more output )
firm eff.-est.
1 0.65068880E+00
2 0.82889151E+00
3 0.72642592E+00
.
.
.
58 0.66471456E+00
59 0.85670448E+00
60 0.70842786E+00
EXAMPLE 2
• where Qi, Ki, Li and Vi are as defined earlier, and Ui has a truncated
normal distribution.
Table 2b - Listing of Shazam Instruction File EG2.SHA
__________________________________________________
read(eg2.dat) n t y x1 x2
genr ly=log(y)
genr lx1=log(x1)
genr lx2=log(x2)
genr lx1s=log(x1)*log(x1)
genr lx2s=log(x2)*log(x2)
genr lx12=log(x1)*log(x2)
file 33 eg2.dta
write(33) n t ly lx1 lx2 lx1s lx2s lx12
stop
__________________________________________________
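For readers without Shazam, the transformation in the genr lines above can be sketched in Python. The function name and the sample inputs are hypothetical; the column order follows the write statement (n, t, ly, lx1, lx2, lx1s, lx2s, lx12).

```python
import math


def translog_row(n, t, y, x1, x2):
    """Return [n, t, ly, lx1, lx2, lx1s, lx2s, lx12], mirroring the genr lines."""
    ly, lx1, lx2 = math.log(y), math.log(x1), math.log(x2)
    return [n, t, ly, lx1, lx2, lx1 * lx1, lx2 * lx2, lx1 * lx2]


# Hypothetical inputs chosen so the logs are easy to read: ln(e) = 1, ln(e^2) = 2, ln(1) = 0.
row = translog_row(1, 1, math.e, math.e ** 2, 1.0)
print(row)
```

The five regressors after ly correspond to the "5" entered for NUMBER OF REGRESSOR VARIABLES in the EG2 instruction file.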
Table 2d - Listing of Instruction File EG2.INS
________________________________________________________
1 1=ERROR COMPONENTS MODEL, 2=TE EFFECTS MODEL
eg2.dta DATA FILE NAME
eg2.out OUTPUT FILE NAME
1 1=PRODUCTION FUNCTION, 2=COST FUNCTION
y LOGGED DEPENDENT VARIABLE (Y/N)
60 NUMBER OF CROSS-SECTIONS
1 NUMBER OF TIME PERIODS
60 NUMBER OF OBSERVATIONS IN TOTAL
5 NUMBER OF REGRESSOR VARIABLES (Xs)
y MU (Y/N) [OR DELTA0 (Y/N) IF USING TE EFFECTS MODEL]
n ETA (Y/N) [OR NUMBER OF TE EFFECTS REGRESSORS (Zs)]
n STARTING VALUES (Y/N)
IF YES THEN BETA0
BETA1 TO
BETAK
SIGMA SQUARED
GAMMA
MU [OR DELTA0
ETA DELTA1 TO
DELTAK]
EXAMPLE 3
• where Ci, Qi, Ri and Wi are cost, output, capital price and labour
price, respectively, and Vi and Ui are assumed to be normally and
half-normally distributed, respectively.
Table 3b - Listing of Shazam Instruction File EG3.SHA
_____________________________________________________
read(eg3.dat) n t c q r w
genr lcw=log(c/w)
genr lq=log(q)
genr lrw=log(r/w)
file 33 eg3.dta
write(33) n t lcw lq lrw
stop
________________________________________________________
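Dividing cost and the capital price by the labour price, as the genr lines above do, is the standard way to impose linear homogeneity in input prices on the cost function. The sketch below mirrors that transformation and shows that doubling all prices (and hence cost) leaves the transformed row unchanged; the function name and the numbers are hypothetical.

```python
import math


def normalised_cost_row(n, t, cost, output, capital_price, labour_price):
    """Return [n, t, ln(c/w), ln(q), ln(r/w)], mirroring the genr lines in EG3.SHA."""
    return [n, t,
            math.log(cost / labour_price),
            math.log(output),
            math.log(capital_price / labour_price)]


# Hypothetical firm: doubling both prices doubles cost at unchanged input quantities.
base = normalised_cost_row(1, 1, cost=120.0, output=50.0,
                           capital_price=4.0, labour_price=8.0)
scaled = normalised_cost_row(1, 1, cost=240.0, output=50.0,
                             capital_price=8.0, labour_price=16.0)
print(base)
print(scaled)
```

The two regressors lq and lrw correspond to the "2" entered for NUMBER OF REGRESSOR VARIABLES in the EG3 instruction file.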
Table 3d - Listing of Instruction File EG3.INS
_______________________________________________________________
1 1=ERROR COMPONENTS MODEL, 2=TE EFFECTS MODEL
eg3.dta DATA FILE NAME
eg3.out OUTPUT FILE NAME
2 1=PRODUCTION FUNCTION, 2=COST FUNCTION
y LOGGED DEPENDENT VARIABLE (Y/N)
60 NUMBER OF CROSS-SECTIONS
1 NUMBER OF TIME PERIODS
60 NUMBER OF OBSERVATIONS IN TOTAL
2 NUMBER OF REGRESSOR VARIABLES (Xs)
n MU (Y/N) [OR DELTA0 (Y/N) IF USING TE EFFECTS MODEL]
n ETA (Y/N) [OR NUMBER OF TE EFFECTS REGRESSORS (Zs)]
n STARTING VALUES (Y/N)
IF YES THEN BETA0
BETA1 TO
BETAK
SIGMA SQUARED
GAMMA
MU [OR DELTA0
ETA DELTA1 TO
DELTAK]
EXAMPLE 4
Table 4a - Listing of Data File EG4.DAT
________________________________________________________
1. 1. 15.131 9.416 35.134
2. 1. 26.309 4.643 77.297
3. 1. 6.886 5.095 89.799
4. 1. 11.168 4.935 35.698
5. 1. 16.605 8.717 27.878
6. 1. 10.897 1.066 92.174
7. 1. 8.239 0.258 97.907
8. 1. 19.203 6.334 82.084
9. 1. 16.032 2.350 38.876
10. 1. 12.434 1.076 81.761
11. 1. 2.676 3.432 9.476
12. 1. 29.232 4.033 55.096
13. 1. 16.580 7.975 73.130
14. 1. 12.903 7.604 24.350
15. 1. 10.618 0.344 65.380
Table 4b - Listing of Shazam Instruction File EG4.SHA
_______________________________________________
read(eg4.dat) n t y x1 x2
genr ly=log(y)
genr lx1=log(x1)
genr lx2=log(x2)
file 33 eg4.dta
write(33) n t ly lx1 lx2
stop
__________________________________________________
Table 4d - Listing of Instruction File EG4.INS
________________________________________________________________
1 1=ERROR COMPONENTS MODEL, 2=TE EFFECTS MODEL
eg4.dta DATA FILE NAME
eg4.out OUTPUT FILE NAME
1 1=PRODUCTION FUNCTION, 2=COST FUNCTION
y LOGGED DEPENDENT VARIABLE (Y/N)
15 NUMBER OF CROSS-SECTIONS
4 NUMBER OF TIME PERIODS
60 NUMBER OF OBSERVATIONS IN TOTAL
2 NUMBER OF REGRESSOR VARIABLES (Xs)
y MU (Y/N) [OR DELTA0 (Y/N) IF USING TE EFFECTS MODEL]
y ETA (Y/N) [OR NUMBER OF TE EFFECTS REGRESSORS (Zs)]
n STARTING VALUES (Y/N)
IF YES THEN BETA0
BETA1 TO
BETAK
SIGMA SQUARED
GAMMA
MU [OR DELTA0
ETA DELTA1 TO
DELTAK]
EXAMPLE 5
• (4.5) The Battese and Coelli (1995) specification (Model 2).
Table 5c - Listing of Data File EG5.DTA
_____________________________________________________________________
1.000000 1.000000 2.716746 2.242410 3.559169 1.000000
2.000000 1.000000 3.269911 1.535361 4.347655 1.000000
3.000000 1.000000 1.929490 1.628260 4.497574 1.000000
.
.
.
13.00000 4.000000 3.149054 2.233128 4.467332 4.000000
14.00000 4.000000 3.123994 2.058473 4.099995 4.000000
15.00000 4.000000 3.119674 1.726510 3.789132 4.000000
_____________________________________________________________________
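The third to fifth columns of the first row in this listing appear to be the natural logs of the first row of the EG4.DAT listing (15.131, 9.416, 35.134). The short check below reproduces them; treating EG5.DTA as derived from the same raw data is an inference from the matching numbers, not something the slides state.

```python
import math

raw_row = (15.131, 9.416, 35.134)              # y, x1, x2 from the first row of EG4.DAT
listed_logs = (2.716746, 2.242410, 3.559169)   # columns 3 to 5, first row of EG5.DTA

computed = [math.log(value) for value in raw_row]
for got, listed in zip(computed, listed_logs):
    print(f"{got:.6f} vs {listed:.6f}")
```

In the rows shown, the sixth column equals the period index in the second column, which is consistent with a single inefficiency-effects regressor (one Z variable) in the EG5 instruction file.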
Table 5d - Listing of Instruction File EG5.INS
________________________________________________________________
2 1=ERROR COMPONENTS MODEL, 2=TE EFFECTS MODEL
eg5.dta DATA FILE NAME
eg5.out OUTPUT FILE NAME
1 1=PRODUCTION FUNCTION, 2=COST FUNCTION
y LOGGED DEPENDENT VARIABLE (Y/N)
15 NUMBER OF CROSS-SECTIONS
4 NUMBER OF TIME PERIODS
60 NUMBER OF OBSERVATIONS IN TOTAL
2 NUMBER OF REGRESSOR VARIABLES (Xs)
y MU (Y/N) [OR DELTA0 (Y/N) IF USING TE EFFECTS MODEL]
1 ETA (Y/N) [OR NUMBER OF TE EFFECTS REGRESSORS (Zs)]
n STARTING VALUES (Y/N)
IF YES THEN BETA0
BETA1 TO
BETAK
SIGMA SQUARED
GAMMA
MU [OR DELTA0
ETA DELTA1 TO
DELTAK]
APPENDIX - PROGRAMMER'S GUIDE
The start-up file FRONT41.000 is listed in Table A1. Ten values may be altered in
FRONT41.000. A brief description of each value is provided below.
Deregulation, Ownership, and Efficiency Change in Indian Banking:
An Application of Stochastic Frontier Analysis
Subal C. Kumbhakar
Department of Economics
State University of New York
Binghamton, NY 13902, USA
E-mail: [email protected]
and
Subrata Sarkar
Indira Gandhi Institute of Development Research
Gen. Vaidya Marg, Goregaon (East)
Mumbai 400 065, INDIA
E-mail: [email protected]
Table 3: Estimated Parameters of the Translog Cost Function
and the Simple Time Varying Inefficiency Function Based on
Battese and Coelli Model 1
LR test for one-sided error   571.22   292.22   349.54
d.o.f.                             3        3        3
All variables, except time, are logged as per the translog cost function. The second-order terms
are self-explanatory: y12 = y1*y2; y33 = y3*y3; etc.
Table 4: Mean Efficiency of Public and Private Banks
Based on Battese and Coelli Model 1
Mean Efficiency
Table 7: Estimated Parameters of the Inefficiency Function Based on
Battese and Coelli Model 2
Model A Model B Model C Model D
Pvt (private) and Dereg (deregulation) are dummy variables. Dereg=1 if year > 1992.
The variable t stands for time, with t=1 for the year 1986. The interaction between
time and the private dummy is captured by t*Pvt, and Dereg*t*Pvt is a
three-way interaction.
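The variable definitions in this note translate directly into code. The sketch below builds the regressors for one bank-year; the function name and dictionary keys are assumptions of this example, and which of these variables enters each of Models A to D is not recoverable from the table fragment.

```python
def inefficiency_regressors(year, is_private):
    """Build the z-variables described in the table note for one bank-year."""
    pvt = 1 if is_private else 0
    dereg = 1 if year > 1992 else 0
    t = year - 1985                      # t = 1 for the year 1986
    return {
        "Pvt": pvt,
        "Dereg": dereg,
        "t": t,
        "t*Pvt": t * pvt,
        "Dereg*t*Pvt": dereg * t * pvt,
    }


print(inefficiency_regressors(1995, is_private=True))
```

The three-way interaction is non-zero only for private banks after 1992, so its coefficient captures any post-deregulation change in the time trend of inefficiency for private banks.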
Figure 2: Average Efficiency of Banks by Ownership Group
[Line chart. Vertical axis: eff, ticks from 0.820 to 1.020. Horizontal axis: Year, 1986 to 2000. Series: Public Bank, Private Bank, Average bank performance.]
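Figure 1 (Deterministic frontier). [Plot labels: YM, YA; input level XA; horizontal axis X.]
Figure 2 (Stochastic frontier). [Plot labels: f(x) exp(vi) with vi < 0; YA; input level XA; horizontal axis X.]
Figure 3. [Axes X1 and X2; isoquant Q = 100; input bundle (X1A, X2A); lines labelled "Actual expenditure" and "Minimum expenditure"; Slope = w1/w2.]
Figure 4. [Axes X1 and X2; isoquants Q1 = 100 and Q2 = 75; points XA, C, W; points B, D, F on the X1 axis.]
Total inefficiency: BF
Technical inefficiency: BD
Allocative inefficiency: DF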