
guj75772_ch16.qxd 22/08/2008 07:13 PM Page 591

Chapter 16
Panel Data Regression Models
In Chapter 1 we discussed briefly the types of data that are generally available for empirical analysis, namely, time series, cross section, and panel. In time series data we observe the values of one or more variables over a period of time (e.g., GDP for several quarters or years). In cross-section data, values of one or more variables are collected for several sample units, or subjects, at the same point in time (e.g., crime rates for 50 states in the United States for a given year). In panel data the same cross-sectional unit (say a family or a firm or a state) is surveyed over time. In short, panel data have space as well as time dimensions.
We have already seen an example of this in Table 1.1, which gives data on eggs produced
and their prices for 50 states in the United States for years 1990 and 1991. For any given
year, the data on eggs and their prices represent a cross-sectional sample. For any given
state, there are two time series observations on eggs and their prices. Thus, we have in all
100 (pooled) observations on eggs produced and their prices.
Another example of panel data was given in Table 1.2, which gives data on investment, value of the firm, and capital stock for four companies for the period 1935–1954. The data for each company over the period 1935–1954 constitute time series data, with 20 observations; the data for all four companies for a given year are an example of cross-section data, with only four observations; and the data for all the companies for all the years are an example of panel data, with a total of 80 observations.
There are other names for panel data, such as pooled data (pooling of time series and cross-sectional observations), combination of time series and cross-section data, micropanel data, longitudinal data (a study over time of a variable or group of subjects), event history analysis (studying the movement over time of subjects through successive states or conditions), and cohort analysis (e.g., following the career path of 1965 graduates of a business school). Although there are subtle variations, all these names essentially connote movement over time of cross-sectional units. We will therefore use the term panel data in a generic sense to include one or more of these terms. And we will call regression models based on such data panel data regression models.
Panel data are now being used increasingly in economic research. Some of the well-known panel data sets are:
1. The Panel Study of Income Dynamics (PSID) conducted by the Institute of Social Research at the University of Michigan. Started in 1968, each year the Institute collects data on some 5,000 families about various socioeconomic and demographic variables.


592 Part Three Topics in Econometrics

2. The Bureau of the Census of the Department of Commerce conducts a survey similar to PSID, called the Survey of Income and Program Participation (SIPP). Four times a year respondents are interviewed about their economic condition.
3. The German Socio-Economic Panel (GESOEP) studied 1,761 individuals every year between 1984 and 2002. Information on year of birth, gender, life satisfaction, marital status, individual labor earnings, and annual hours of work was collected for each individual for the period 1984 to 2002.
There are also many other surveys that are conducted by various governmental agencies,
such as:
Household, Income and Labor Dynamics in Australia Survey (HILDA)
British Household Panel Survey (BHPS)
Korean Labor and Income Panel Study (KLIPS)
At the outset a warning is in order: The topic of panel data regressions is vast, and some of
the mathematics and statistics involved are quite complicated. We only hope to touch on some
of the essentials of the panel data regression models, leaving the details for the references.1 But
be forewarned that some of these references are highly technical. Fortunately, user-friendly
software packages such as LIMDEP, PC-GIVE, SAS, STATA, SHAZAM, and EViews, among
others, have made the task of actually implementing panel data regressions quite easy.

16.1 Why Panel Data?


What are the advantages of panel data over cross-section or time series data? Baltagi lists the following advantages of panel data:2
1. Since panel data relate to individuals, firms, states, countries, etc., over time, there is bound to be heterogeneity in these units. The techniques of panel data estimation can take such heterogeneity explicitly into account by allowing for subject-specific variables, as we shall show shortly. We use the term subject in a generic sense to include microunits such as individuals, firms, states, and countries.
2. By combining time series of cross-section observations, panel data give “more informative data, more variability, less collinearity among variables, more degrees of freedom and more efficiency.”
3. By studying the repeated cross section of observations, panel data are better suited to study the dynamics of change. Spells of unemployment, job turnover, and labor mobility are better studied with panel data.
4. Panel data can better detect and measure effects that simply cannot be observed in pure cross-section or pure time series data. For example, the effects of minimum wage laws

1. Some of the references are: G. Chamberlain, “Panel Data,” in Z. Griliches and M. D. Intriligator, eds., Handbook of Econometrics, vol. II, North-Holland Publishers, 1984, Chapter 22; C. Hsiao, Analysis of Panel Data, Cambridge University Press, 1986; G. G. Judge, R. C. Hill, W. E. Griffiths, H. Lutkepohl, and T. C. Lee, Introduction to the Theory and Practice of Econometrics, 2d ed., John Wiley & Sons, New York, 1985, Chapter 11; W. H. Greene, Econometric Analysis, 6th ed., Prentice-Hall, Englewood Cliffs, NJ, 2008, Chapter 9; Badi H. Baltagi, Econometric Analysis of Panel Data, John Wiley and Sons, New York, 1995; and J. M. Wooldridge, Econometric Analysis of Cross Section and Panel Data, MIT Press, Cambridge, Mass., 1999. For a detailed treatment of the subject with empirical applications, see Edward W. Frees, Longitudinal and Panel Data: Analysis and Applications in the Social Sciences, Cambridge University Press, New York, 2004.
2. Baltagi, op. cit., pp. 3–6.

on employment and earnings can be better studied if we include successive waves of minimum wage increases in the federal and/or state minimum wages.
5. Panel data enables us to study more complicated behavioral models. For example,
phenomena such as economies of scale and technological change can be better handled
by panel data than by pure cross-section or pure time series data.
6. By making data available for several thousand units, panel data can minimize the bias
that might result if we aggregate individuals or firms into broad aggregates.
In short, panel data can enrich empirical analysis in ways that may not be possible if we use only
cross-section or time series data. This is not to suggest that there are no problems with panel
data modeling. We will discuss them after we cover some theory and discuss some examples.

16.2 Panel Data: An Illustrative Example


To set the stage, let us consider a concrete example. Consider the data given in Table 16.1 on the textbook website, which were originally collected by Professor Moshe Kim and are reproduced from William Greene.3 The data analyze the costs of six airline firms for the period 1970–1984, for a total of 90 panel data observations.
The variables are defined as follows: I = airline id; T = year id; Q = output, in revenue passenger miles (an index number); C = total cost, in $1,000; PF = fuel price; and LF = load factor, the average capacity utilization of the fleet.
Suppose we are interested in finding out how total cost (C) behaves in relation to output (Q), fuel price (PF), and load factor (LF). In short, we wish to estimate an airline cost function.
How do we go about estimating this function? Of course, we can estimate the cost function for each airline using the data for 1970–1984 (i.e., a time series regression). This can be accomplished with the usual ordinary least squares (OLS) procedure. We will have in all six cost functions, one for each airline. But then we neglect the information about the other airlines which operate in the same (regulatory) environment.
We can also estimate a cross-section cost function (i.e., a cross-section regression). We will have in all 15 cross-section regressions, one for each year. But this would not make much sense in the present context, for we have only six observations per year and there are three explanatory variables (plus the intercept term); we will have very few degrees of freedom to do a meaningful analysis. Also, we will not “exploit” the panel nature of our data.
Incidentally, the panel data in our example is called a balanced panel; a panel is said to
be balanced if each subject (firm, individuals, etc.) has the same number of observations. If
each entity has a different number of observations, then we have an unbalanced panel. For
most of this chapter, we will deal with balanced panels. In the panel data literature you will
also come across the terms short panel and long panel. In a short panel the number of
cross-sectional subjects, N, is greater than the number of time periods, T. In a long panel, it
is T that is greater than N. As we discuss later, the estimating techniques can depend on
whether we have a short panel or a long one.
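The balanced/unbalanced distinction is easy to check mechanically by counting observations per unit. A minimal sketch (the function name and the toy panel below are our own illustration, not from the text):

```python
from collections import Counter

def is_balanced(unit_ids):
    """True if every cross-sectional unit appears the same number of times."""
    counts = Counter(unit_ids)
    return len(set(counts.values())) <= 1

# Toy version of the airline panel of Section 16.2: 6 airlines x 15 years
airline_panel = [i for i in range(6) for _ in range(15)]
print(is_balanced(airline_panel))        # True: every airline has 15 observations
print(is_balanced(airline_panel + [0]))  # False: airline 0 now has 16 observations
```

Note that with N = 6 and T = 15, this toy panel is also a long panel in the terminology just introduced.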
What, then, are the options? There are four possibilities:
1. Pooled OLS model. We simply pool all 90 observations and estimate a “grand”
regression, neglecting the cross-section and time series nature of our data.
2. The fixed effects least squares dummy variable (LSDV) model. Here we pool all 90
observations, but allow each cross-section unit (i.e., airline in our example) to have its
own (intercept) dummy variable.
3. William H. Greene, Econometric Analysis, 6th ed., 2008. Data are located at http://pages.stern.nyu.edu/~wgreen/Text/econometricanalysis.htm.

3. The fixed effects within-group model. Here also we pool all 90 observations, but for each airline we express each variable as a deviation from its mean value and then estimate an OLS regression on such mean-corrected or “de-meaned” values.
4. The random effects model (REM). Unlike the LSDV model, in which we allow each
airline to have its own (fixed) intercept value, we assume that the intercept values are a
random drawing from a much bigger population of airlines.
We now discuss each of these methods using the data given in Table 16.1. (See textbook
website.)

16.3 Pooled OLS Regression or Constant Coefficients Model


Consider the following model:

C_it = β_1 + β_2 Q_it + β_3 PF_it + β_4 LF_it + u_it    (16.3.1)
i = 1, 2, . . . , 6
t = 1, 2, . . . , 15

where i stands for the ith subject and t for the time period for the variables we defined previously. We have chosen the linear cost function for illustrative purposes, but in Exercise 16.10 you are asked to estimate a log–linear, or double-log, function, in which case the slope coefficients will give the elasticity estimates.
Notice that we have pooled together all 90 observations, but note that we are assuming the regression coefficients are the same for all the airlines. That is, there is no distinction between the airlines—one airline is as good as the other, an assumption that may be difficult to maintain.
It is assumed that the explanatory variables are nonstochastic. If they are stochastic, they are assumed to be uncorrelated with the error term. Sometimes it is assumed that the explanatory variables are strictly exogenous. A variable is said to be strictly exogenous if it does not depend on current, past, or future values of the error term u_it.
It is also assumed that the error term is u_it ~ iid(0, σ_u²), that is, it is independently and identically distributed with zero mean and constant variance. For the purpose of hypothesis testing, it may be assumed that the error term is also normally distributed. Notice the double-subscripted notation in Eq. (16.3.1), which should be self-explanatory.
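Since Table 16.1 itself lives on the textbook website, the sketch below illustrates what pooled OLS estimation of Eq. (16.3.1) amounts to, using simulated data in place of the airline numbers; the coefficient values and variable ranges are made up for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 6, 15          # six airlines, fifteen years
n = N * T             # 90 pooled observations

# Simulated stand-ins for Q, PF, and LF (the real data are in Table 16.1 online)
Q = rng.uniform(0.1, 2.0, n)
PF = rng.uniform(100.0, 1000.0, n)
LF = rng.uniform(0.4, 0.7, n)

beta_true = np.array([1.0, 2.0, 0.003, -4.0])   # arbitrary (beta_1, ..., beta_4)
X = np.column_stack([np.ones(n), Q, PF, LF])
C = X @ beta_true + rng.normal(0.0, 0.5, n)     # u_it ~ iid N(0, 0.25)

# Pooled OLS: one "grand" regression over all 90 observations
beta_hat, *_ = np.linalg.lstsq(X, C, rcond=None)
print(beta_hat)  # close to beta_true, since here the pooling assumptions hold by construction
```

The point of the simulation is the caveat that follows: pooled OLS recovers the true coefficients only because this artificial data-generating process really does have identical coefficients for every unit.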
Let us first present the results of the estimated equation (16.3.1) and then discuss some of the problems with this model. The regression results based on EViews, Version 6, are presented in Table 16.2.
If you examine the results of the pooled regression and apply the conventional criteria,
you will see that all the regression coefficients are not only highly statistically significant
but are also in accord with prior expectations and that the R2 value is very high. The only
“fly in the ointment” is that the estimated Durbin–Watson statistic is quite low, suggesting
that perhaps there is autocorrelation and/or spatial correlation in the data. Of course, as we
know, a low Durbin–Watson could also be due to specification errors.
The major problem with this model is that it does not distinguish between the various airlines nor does it tell us whether the response of total cost to the explanatory variables over time is the same for all the airlines. In other words, by lumping together different airlines at different times we camouflage the heterogeneity (individuality or uniqueness) that may exist among the airlines. Another way of stating this is that the individuality of each subject is subsumed in the disturbance term u_it. As a consequence, it is quite possible that the error term may be correlated with some of the regressors included in the model. If that is the case, the estimated coefficients in Eq. (16.3.1) may be biased as well as inconsistent.

TABLE 16.2

Dependent Variable: C
Method: Least Squares
Included observations: 90

                      Coefficient    Std. Error    t Statistic    Prob.
C (intercept)         1158559.       360592.7        3.212930     0.0018
Q                     2026114.        61806.95      32.78134      0.0000
PF                          1.225348       0.103722  11.81380     0.0000
LF                   -3065753.       696327.3       -4.402747     0.0000

R-squared             0.946093    Mean dependent var.    1122524.
Adjusted R-squared    0.944213    S.D. dependent var.    1192075.
S.E. of regression    281559.5    F-statistic            503.1176
Sum squared resid.    6.82E+12    Prob. (F-statistic)    0.000000
Durbin–Watson stat.   0.434162

Recall that one of the important assumptions of the classical linear regression model is that
there is no correlation between the regressors and the disturbance or error term.
To see how the error term may be correlated with the regressors, let us consider the following revision of model (16.3.1):

C_it = β_1 + β_2 PF_it + β_3 LF_it + β_4 M_it + u_it    (16.3.2)

where the additional variable M = management philosophy or management quality. Of the variables included in Eq. (16.3.2), only the variable M is time-invariant (or time-constant) because it varies among subjects but is constant over time for a given subject (airline). Although it is time-invariant, the variable M is not directly observable and therefore we cannot measure its contribution to the cost function. We can, however, do this indirectly if we write Eq. (16.3.2) as
C_it = β_1 + β_2 PF_it + β_3 LF_it + α_i + u_it    (16.3.3)

where α_i, called the unobserved, or heterogeneity, effect, reflects the impact of M on cost. Note that for simplicity we have shown only the unobserved effect of M on cost, but in reality there may be more such unobserved effects, for example, the nature of ownership (privately owned or publicly owned), whether it is a minority-owned company, whether the CEO is a man or a woman, etc. Although such variables may differ among the subjects (airlines), they will probably remain the same for any given subject over the sample period.
Since α_i is not directly observable, why not consider it random and include it in the error term u_it, and thereby consider the composite error term v_it = α_i + u_it? We now write Eq. (16.3.3) as:

C_it = β_1 + β_2 PF_it + β_3 LF_it + v_it    (16.3.4)
But if the α_i term included in the error term v_it is correlated with any of the regressors in Eq. (16.3.4), we have a violation of one of the key assumptions of the classical linear regression model—namely, that the error term is not correlated with the regressors. As we know, in this situation the OLS estimates are not only biased but they are also inconsistent.
There is a real possibility that the unobservable α_i is correlated with one or more of the regressors. For example, the management of one airline may be astute enough to buy futures contracts on fuel to avoid severe price fluctuations. This will have the effect of lowering the cost of airline services. Moreover, since α_i is common to all the observations on a given subject, it can be shown that cov(v_it, v_is) = σ_α², t ≠ s, which is non-zero, and therefore the (unobserved) heterogeneity induces autocorrelation and we will have to pay attention to it. We will show later how this problem can be handled.
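The bias from a unit effect that is correlated with a regressor is easy to see in a simulation. The sketch below is our own illustration, not the airline data: the unit effect α_i is deliberately built into the regressor, pooled OLS then overstates the slope, while de-meaning by unit removes α_i and the bias with it.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 200, 5
alpha = rng.normal(0.0, 1.0, N)          # unobserved heterogeneity, one draw per unit
unit = np.repeat(np.arange(N), T)

# Regressor deliberately correlated with the unit effect
x = alpha[unit] + rng.normal(0.0, 1.0, N * T)
y = 2.0 * x + alpha[unit] + rng.normal(0.0, 1.0, N * T)   # true slope is 2

# Pooled OLS: alpha ends up in the composite error v_it, which correlates with x
X = np.column_stack([np.ones(N * T), x])
b_pooled = np.linalg.lstsq(X, y, rcond=None)[0][1]

# De-meaning within each unit wipes out alpha entirely
x_dm = x - (np.bincount(unit, x) / T)[unit]
y_dm = y - (np.bincount(unit, y) / T)[unit]
b_within = (x_dm @ y_dm) / (x_dm @ x_dm)

print(b_pooled, b_within)  # pooled slope is biased upward (around 2.5 in this design); within slope is near 2
```

The de-meaning step previews the within-group estimator of Section 16.5.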

The question, therefore, is how we account for the unobservable, or heterogeneity, effect(s)
so that we can obtain consistent and/or efficient estimates of the parameters of the variables
of prime interest, which are output, fuel price, and load factor in our case. Our prime interest
may not be in obtaining the impact of the unobservable variables because they remain the
same for a given subject. That is why such unobservable, or heterogeneity, effects are called
nuisance parameters. How then do we proceed? It is to this question we now turn.

16.4 The Fixed Effect Least-Squares Dummy Variable (LSDV) Model
The least-squares dummy variable (LSDV) model allows for heterogeneity among subjects by allowing each entity to have its own intercept value, as shown in model (16.4.1). Again, we continue with our airlines example.

C_it = β_1i + β_2 Q_it + β_3 PF_it + β_4 LF_it + u_it    (16.4.1)
i = 1, 2, . . . , 6
t = 1, 2, . . . , 15
Notice that we have put the subscript i on the intercept term to suggest that the intercepts of the six airlines may be different. The difference may be due to special features of each airline, such as managerial style, managerial philosophy, or the type of market each airline is serving.
In the literature, model (16.4.1) is known as the fixed effects (regression) model (FEM). The term “fixed effects” is due to the fact that, although the intercept may differ across subjects (here the six airlines), each entity’s intercept does not vary over time; that is, it is time-invariant. Notice that if we were to write the intercept as β_1it, it would suggest that the intercept of each entity or individual is time-variant. It may be noted that the FEM given in Eq. (16.4.1) assumes that the (slope) coefficients of the regressors do not vary across individuals or over time.
Before proceeding further, it may be useful to visualize the difference between the
pooled regression model and the LSDV model. For simplicity assume that we want to
regress total cost on output only. In Figure 16.1 we show this cost function estimated for
two airline companies separately, as well as the cost function if we pool the data for the two

FIGURE 16.1  Bias from ignoring fixed effects. [Graph: total cost (Y_it) on the vertical axis against output (X_it) on the horizontal axis. Two parallel group lines, E(Y_it|X_it) = α_1 + βX_it for Group 1 and E(Y_it|X_it) = α_2 + βX_it for Group 2, with α_2 > α_1; a single line fitted through the pooled data has a biased slope when the fixed effects are ignored.]

companies; this is equivalent to neglecting the fixed effects.4 You can see from Figure 16.1
how the pooled regression can bias the slope estimate.
How do we actually allow for the (fixed effect) intercept to vary among the airlines? We can easily do this by using the dummy variable technique, particularly the differential intercept dummy technique, which we learned in Chapter 9. Now we write Eq. (16.4.1) as:

C_it = α_1 + α_2 D2_i + α_3 D3_i + α_4 D4_i + α_5 D5_i + α_6 D6_i + β_2 Q_it + β_3 PF_it + β_4 LF_it + u_it    (16.4.2)
where D2_i = 1 for airline 2, 0 otherwise; D3_i = 1 for airline 3, 0 otherwise; and so on. Notice that since we have six airlines, we have introduced only five dummy variables to avoid falling into the dummy-variable trap (i.e., the situation of perfect collinearity). Here we are treating airline 1 as the base, or reference, category. Of course, you can choose any airline as the reference point. As a result, the intercept α_1 is the intercept value of airline 1 and the other α coefficients represent by how much the intercept values of the other airlines differ from the intercept value of the first airline. Thus, α_2 tells by how much the intercept value of the second airline differs from α_1. The sum (α_1 + α_2) gives the actual value of the intercept for airline 2. The intercept values of the other airlines can be computed similarly. Keep in mind that if you want to introduce a dummy for each airline, you will have to drop the (common) intercept; otherwise, you will fall into the dummy-variable trap.
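The dummy-variable trap is a pure rank condition, and it can be verified numerically. A small sketch (our own illustration): with all six airline dummies plus a common intercept the design matrix is rank-deficient, while dropping one dummy restores full column rank.

```python
import numpy as np

N, T = 6, 15
airline = np.repeat(np.arange(N), T)   # unit index for each of the 90 rows

D = (airline[:, None] == np.arange(N)[None, :]).astype(float)  # all six dummies
ones = np.ones((N * T, 1))                                     # common intercept

X_trap = np.hstack([ones, D])        # intercept + 6 dummies: the dummies sum to the intercept column
X_ok = np.hstack([ones, D[:, 1:]])   # intercept + 5 dummies (airline 1 as base)

print(X_trap.shape[1], np.linalg.matrix_rank(X_trap))  # 7 columns, rank only 6
print(X_ok.shape[1], np.linalg.matrix_rank(X_ok))      # 6 columns, full rank 6
```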
The results of the model (16.4.2) for our data are presented in Table 16.3.
The first thing to notice about these results is that all the differential intercept coefficients are individually highly statistically significant, suggesting that perhaps the six airlines are heterogeneous and, therefore, the pooled regression results given in Table 16.2 may be suspect. The values of the slope coefficients given in Tables 16.2 and 16.3 are also different, again casting some doubt on the results given in Table 16.2. It seems model (16.4.1) is better than model (16.3.1). In passing, note that OLS applied to a fixed effect model produces estimators that are called fixed effect estimators.

TABLE 16.3

Dependent Variable: TC
Method: Least Squares
Sample: 1–90
Included observations: 90

                      Coefficient    Std. Error    t Statistic    Prob.
C (= α_1)             -131236.0      350777.1       -0.374129     0.7093
Q                     3319023.       171354.1       19.36939      0.0000
PF                          0.773071      0.097319   7.943676     0.0000
LF                   -3797368.       613773.1       -6.186924     0.0000
DUM2                   601733.2      100895.7        5.963913     0.0000
DUM3                  1337180.       186171.0        7.182538     0.0000
DUM4                  1777592.       213162.9        8.339126     0.0000
DUM5                  1828252.       231229.7        7.906651     0.0000
DUM6                  1706474.       228300.9        7.474672     0.0000

R-squared             0.971642    Mean dependent var.    1122524.
Adjusted R-squared    0.968841    S.D. dependent var.    1192075.
S.E. of regression    210422.8    F-statistic            346.9188
Sum squared resid.    3.59E+12    Prob. (F-statistic)    0.000000
Log likelihood       -1226.082    Durbin–Watson stat.    0.693288

4. Adapted from the unpublished notes of Alan Duncan.

We can provide a formal test of the two models. In relation to model (16.4.1), model (16.3.1) is a restricted model in that it imposes a common intercept for all the airlines. Therefore, we can use the restricted F test discussed in Chapter 8. Using formula (8.6.10), the reader can check that in the present case the F value is:

F = [(0.971642 − 0.946093)/5] / [(1 − 0.971642)/81] ≈ 14.99

Note: The restricted and unrestricted R² values are obtained from Tables 16.2 and 16.3, respectively. Also note that the number of restrictions is 5 (why?).
The null hypothesis here is that all the differential intercepts are equal to zero. The computed F value for 5 numerator and 81 denominator df is highly statistically significant. Therefore, we reject the null hypothesis that all the (differential) intercepts are zero. If the F value were not statistically significant, we would have concluded that there is no difference in the intercepts of the six airlines. In this case, we would have pooled all 90 of the observations, as we did in the pooled regression given in Table 16.2.
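The restricted F statistic is a one-liner to reproduce. One caveat: plugging in the rounded R² values printed in the tables gives roughly 14.6 rather than the 14.99 reported above (which reflects unrounded magnitudes); the conclusion is the same either way, since both values are far above the 5 percent critical value for (5, 81) df, about 2.33.

```python
# Restricted F test of the pooled model (16.3.1) against the LSDV model (16.4.2)
r2_r = 0.946093   # restricted R-squared, Table 16.2 (common intercept)
r2_ur = 0.971642  # unrestricted R-squared, Table 16.3 (differential intercepts)
m = 5             # number of restrictions: five differential intercepts set to zero
df_denom = 81     # 90 observations minus 9 estimated coefficients

F = ((r2_ur - r2_r) / m) / ((1.0 - r2_ur) / df_denom)
print(round(F, 2))  # about 14.6 with these rounded inputs
```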
Model (16.4.1) is known as a one-way fixed effects model because we have allowed the
intercepts to differ between airlines. But we can also allow for time effect if we believe that
the cost function changes over time because of factors such as technological changes, changes
in government regulation and/or tax policies, and other such effects. Such a time effect can be
easily accounted for if we introduce time dummies, one for each year from 1970 to 1984.
Since we have data for 15 years, we can introduce 14 time dummies (why?) and extend model
(16.4.1) by adding these variables. If we do that, the model that emerges is called a two-way
fixed effects model because we have allowed for both individual and time effects.
In the present example, if we add the time dummies, we will have in all 23 coefficients to estimate—the common intercept, five airline dummies, 14 time dummies, and three slope coefficients. As you can see, we will consume several degrees of freedom. Furthermore, if we decide to allow the slope coefficients to differ among the companies, we can interact the five firm (airline) dummies with each of the three explanatory variables and introduce differential slope dummy coefficients. Then we will have to estimate 15 additional coefficients (five dummies interacted with three explanatory variables). As if this is not enough, if we interact the 14 time dummies with the three explanatory variables, we will have in all 42 additional coefficients to estimate. As you can see, we will be left with hardly any degrees of freedom.
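The coefficient counting in the preceding paragraph can be checked directly (our own arithmetic sketch):

```python
N, T, k = 6, 15, 3                   # airlines, years, slope regressors

two_way = 1 + (N - 1) + (T - 1) + k  # intercept + 5 airline dummies + 14 time dummies + 3 slopes
firm_interactions = (N - 1) * k      # 5 airline dummies x 3 regressors
time_interactions = (T - 1) * k      # 14 time dummies x 3 regressors

total = two_way + firm_interactions + time_interactions
print(two_way, firm_interactions, time_interactions, total)  # 23 15 42 80 coefficients, against only 90 observations
```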

A Caution in the Use of the Fixed Effect LSDV Model


As the preceding discussion suggests, the LSDV model has several problems that need to be borne in mind:
First, if you introduce too many dummy variables, you will run up against the degrees of freedom problem. That is, you will lack enough observations to do a meaningful statistical analysis. Second, with many dummy variables in the model, both individual and interactive or multiplicative, there is always the possibility of multicollinearity, which might make precise estimation of one or more parameters difficult.
Third, in some situations the LSDV may not be able to identify the impact of time-invariant variables. Suppose we want to estimate a wage function for a group of workers using panel data. Besides wage, a wage function may include age, experience, and education as explanatory variables. Suppose we also decide to add sex, color, and ethnicity as additional variables in the model. Since these variables will not change over time for an individual subject, the LSDV approach may not be able to identify the impact of such time-invariant variables on wages. To put it differently, the subject-specific intercepts absorb all heterogeneity that may exist in the dependent and explanatory variables. Incidentally, the time-invariant variables are sometimes called nuisance variables or lurking variables.

Fourth, we have to think carefully about the error term u_it. The results we have presented in Eqs. (16.3.1) and (16.4.1) are based on the assumption that the error term follows the classical assumptions, namely, u_it ~ N(0, σ²). Since the index i refers to cross-section observations and t to time series observations, the classical assumption for u_it may have to be modified. There are several possibilities, including:
1. We can assume that the error variance is the same for all cross-section units or we can
assume that the error variance is heteroscedastic.5
2. For each entity, we can assume that there is no autocorrelation over time. Thus, in our
illustrative example, we can assume that the error term of the cost function for airline #1 is
non-autocorrelated, or we can assume that it is autocorrelated, say, of the AR(1) type.
3. For a given time, it is possible that the error term for airline #1 is correlated with the
error term for, say, airline #2.6 Or we can assume that there is no such correlation.
There are also other combinations and permutations of the error term. As you will quickly realize, allowing one or more of these possibilities will make the analysis that much more complicated. (Space and mathematical demands preclude us from considering all the possibilities. The references in footnote 1 discuss some of these topics.) Some of these problems may be alleviated, however, if we consider the alternatives discussed in the next two sections.

16.5 The Fixed-Effect Within-Group (WG) Estimator


One way to estimate a pooled regression is to eliminate the fixed effect, β_1i, by expressing the values of the dependent and explanatory variables for each airline as deviations from their respective mean values. Thus, for airline #1 we will obtain the sample mean values of TC, Q, PF, and LF (TC-bar, Q-bar, PF-bar, and LF-bar, respectively) and subtract them from the individual values of these variables. The resulting values are called “de-meaned” or mean-corrected values. We do this for each airline and then pool all the (90) mean-corrected values and run an OLS regression.
Letting tc_it, q_it, pf_it, and lf_it represent the mean-corrected values, we now run the regression:

tc_it = β_2 q_it + β_3 pf_it + β_4 lf_it + u_it    (16.5.1)

where i = 1, 2, . . ., 6, and t = 1, 2, . . ., 15. Note that Eq. (16.5.1) does not have an intercept term (why?).
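The within-group regression (16.5.1) produces the same slope estimates as the LSDV model (16.4.2); this can be verified on simulated data. A sketch under our own made-up data-generating process:

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 6, 15
unit = np.repeat(np.arange(N), T)
alpha = rng.normal(0.0, 3.0, N)                   # fixed effects beta_1i
x = rng.normal(0.0, 1.0, N * T)
y = alpha[unit] + 1.5 * x + rng.normal(0.0, 1.0, N * T)

# LSDV: one intercept dummy per unit (no common intercept), plus the regressor
D = (unit[:, None] == np.arange(N)[None, :]).astype(float)
b_lsdv = np.linalg.lstsq(np.hstack([D, x[:, None]]), y, rcond=None)[0][-1]

# Within-group: de-mean x and y by unit, regress without an intercept
x_dm = x - (np.bincount(unit, x) / T)[unit]
y_dm = y - (np.bincount(unit, y) / T)[unit]
b_wg = (x_dm @ y_dm) / (x_dm @ x_dm)

print(np.isclose(b_lsdv, b_wg))  # True: the two slope estimates are numerically identical
```

This numerical equality reflects the algebraic equivalence of the two models noted in the discussion of Tables 16.3 and 16.4.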
Returning to our example, we obtain the results in Table 16.4. Note: The prefix DM
means that the values are mean-corrected or expressed as deviations from their sample
means.
Note the difference between the pooled regression given in Table 16.2 and the pooled regression in Table 16.4. The former simply ignores the heterogeneity among the six airlines, whereas the latter takes it into account, not by the dummy variable method, but by eliminating it by differencing sample observations around their sample means. The difference between the two is obvious, as shown in Figure 16.2.
It can be shown that the WG estimator produces consistent estimates of the slope coefficients, whereas the ordinary pooled regression may not. It should be added, however,
5. STATA provides heteroscedasticity-corrected standard errors in panel data regression models.
6. This leads to the so-called seemingly unrelated regression (SURE) model, originally proposed by Arnold Zellner. See A. Zellner, “An Efficient Method of Estimating Seemingly Unrelated Regressions and Tests for Aggregation Bias,” Journal of the American Statistical Association, vol. 57, 1962, pp. 348–368.

600 Part Three Topics in Econometrics

TABLE 16.4
Dependent Variable: DMTC
Method: Least Squares
Sample: 1–90
Included observations: 90

Coefficient Std. Error t Statistic Prob.


DMQ 3319023. 165339.8 20.07396 0.0000
DMPF 0.773071 0.093903 8.232630 0.0000
DMLF -3797368. 592230.5 -6.411976 0.0000
R-squared 0.929366 Mean dependent var. 2.59E-11
Adjusted R-squared 0.927743 S.D. dependent var. 755325.8
S.E. of regression 203037.2 Durbin–Watson stat. 0.693287
Sum squared resid. 3.59E+12

FIGURE 16.2  The within-groups estimator.
[Figure: de-meaned observations plotted with output X*_it on the horizontal axis and total cost Y*_it on the vertical axis; the fitted line is E(Y*_it | X*_it) = βX*_it, the unit-specific intercepts α1, α2, . . . having been removed by de-meaning.]
Source: Alan Duncan, "Cross-Section and Panel Data Econometrics," unpublished lecture notes (adapted).

that WG estimators, although consistent, are inefficient (i.e., have larger variances)
compared to the ordinary pooled regression results.7 Observe that the slope coefficients of Q, PF, and LF are identical in Tables 16.3 and 16.4. This is because mathematically the
two models are identical. Incidentally, the regression coefficients estimated by the WG
method are called WG estimators.
One disadvantage of the WG estimator can be explained with the following wage
regression model:
W_it = β1i + β2 Experience_it + β3 Age_it + β4 Gender_it + β5 Education_it + β6 Race_it + u_it   (16.5.2)
In this wage function, variables such as gender, education, and race are time-invariant. If
we use the WG estimators, these time-invariant variables will be wiped out (because of
7 The reason for this is that when we express variables as deviations from their mean values, the variation in these mean-corrected values will be much smaller than the variation in the original values of the variables. In that case, the variation in the disturbance term u_it may be relatively large, thus leading to higher standard errors of the estimated coefficients.

Chapter 16 Panel Data Regression Models 601

differencing). As a result, we will not know how wage reacts to these time-invariant variables.8 But this is the price we have to pay to avoid the correlation between the error term (α_i included in v_it) and the explanatory variables.
Another disadvantage of the WG estimator is that, “. . . it may distort the parameter val-
ues and can certainly remove any long run effects.”9 In general, when we difference a vari-
able, we remove the long-run component from that variable. What is left is the short-run
value of that variable. We will discuss this further when we discuss time series economet-
rics later in the book.
In using LSDV we obtained direct estimates of the intercepts for each airline. How can
we obtain the estimates of the intercepts using the WG method? For the airlines example,
they are obtained as follows:

α̂_i = T̄C_i − β̂2 Q̄_i − β̂3 P̄F_i − β̂4 L̄F_i   (16.5.3)

where bars over the variables denote the sample mean values of the variables for the ith airline.
That is, we obtain the intercept value of the ith airline by subtracting from the mean
value of the dependent variable the mean values of the explanatory variables for that airline
times the estimated slope coefficients from the WG estimators. Note that the estimated
slope coefficients remain the same for all of the airlines, as shown in Table 16.4. It may be
noted that the intercept estimated in Eq. (16.5.3) is similar to the intercept we estimate in the standard linear regression model, as can be seen from Eq. (7.4.21). We leave it for
the reader to find the intercepts of the six airlines in the manner shown and verify that they
are the same as the intercept values derived in Table 16.3, save for the rounding errors.
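A quick numerical illustration of Eq. (16.5.3), again on a hypothetical toy panel rather than the airline data: once the WG slope is in hand, each unit's intercept is recovered from that unit's own sample means.

```python
# Recovering the unit intercepts via Eq. (16.5.3) on a hypothetical
# toy panel y_it = alpha_i + 2*x_it with alpha_i = 100*i (made-up data).
groups = {i: [(10 * i + t, 100 * i + 2 * (10 * i + t)) for t in range(1, 5)]
          for i in (1, 2, 3)}

# Within-group slope from the pooled de-meaned data (here exactly 2):
demeaned = []
for obs in groups.values():
    xs, ys = zip(*obs)
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    demeaned += [(x - mx, y - my) for x, y in zip(xs, ys)]
beta = sum(x * y for x, y in demeaned) / sum(x * x for x, _ in demeaned)

# Eq. (16.5.3): alpha_hat_i = ybar_i - beta_hat * xbar_i
alphas = {}
for i, obs in groups.items():
    xs, ys = zip(*obs)
    alphas[i] = sum(ys) / len(ys) - beta * sum(xs) / len(xs)

print(alphas)  # {1: 100.0, 2: 200.0, 3: 300.0}: the true intercepts
```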
It may be noted that the estimated intercept of each airline represents the subject-specific
characteristics of each airline, but we will not be able to identify these characteristics indi-
vidually. Thus, the α1 intercept for airline #1 represents the management philosophy of that
airline, the composition of its board of directors, the personality of the CEO, the gender of
the CEO, etc. All these heterogeneity characteristics are subsumed in the intercept value.
As we will see later, such characteristics can be included in the random effects model.
In passing, we note that an alternative to the WG estimator is the first-difference
method. In the WG method, we express each variable as a deviation from that variable’s
mean value. In the first-difference method, for each subject we take successive differences
of the variables. Thus, for airline #1 we subtract the first observation of TC from the second
observation of TC, the second observation of TC from the third observation of TC, and so
on. We do this for each of the remaining variables and repeat this process for the remaining
five airlines. After this process we have only 14 observations for each airline, since the first
observation has no previous value. As a result, we now have 84 observations instead of the
original 90 observations. We then regress the first-differenced values of the TC variable on
the first-differenced values of the explanatory variables as follows:

ΔTC_it = β2 ΔQ_it + β3 ΔPF_it + β4 ΔLF_it + (u_it − u_i,t−1)   (16.5.4)
   i = 1, 2, . . . , 6;   t = 2, 3, . . . , 15   (84 observations in all)

where ΔTC_it = (TC_it − TC_i,t−1). As noted in Chapter 11, Δ is called the first difference operator.10

8 This is also true of the LSDV model.
9 Dimitrios Asteriou and Stephen G. Hall, Applied Econometrics: A Modern Approach, Palgrave Macmillan, New York, 2007, p. 347.
10 Notice that Eq. (16.5.4) has no intercept term (why?), but we can include it if there is a trend variable in the original model.
In passing, note that the original disturbance term is now replaced by the difference
between the current and previous values of the disturbance term. If the original disturbance
term is not autocorrelated, the transformed disturbance is, and therefore it poses the kinds
of estimation problems that we discussed in Chapter 11. However, if the explanatory vari-
ables are strictly exogenous, the first difference estimator is unbiased, given the values of
the explanatory variables. Also note that the first-difference method has the same disad-
vantages as the WG method in that the explanatory variables that remain fixed over time for
an individual are wiped out in the first-difference transformation.
It may be pointed out that the first difference and fixed effects estimators are the same
when we have only two time periods, but if there are more than two periods, these estima-
tors differ. The reasons for this are rather involved and the interested reader may consult the
references.11 It is left as an exercise for the reader to apply the first difference method to our
airlines example and compare the results with the other fixed effects estimators.
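As a sketch of the first-difference method (toy panel again, not the airline data), note how both the fixed effect and any time-invariant regressor vanish on differencing:

```python
# First differences on a hypothetical toy panel: y_it = a_i + 2*x_it,
# plus a time-invariant regressor g_i (a gender-type dummy, made up here).
groups = {i: [(10 * i + t, 100 * i + 2 * (10 * i + t)) for t in range(1, 5)]
          for i in (1, 2, 3)}
g = {1: 0, 2: 1, 3: 1}   # constant over time within each unit

dx, dy, dg = [], [], []
for i, obs in groups.items():
    for (x0, y0), (x1, y1) in zip(obs, obs[1:]):
        dx.append(x1 - x0)
        dy.append(y1 - y0)
        dg.append(g[i] - g[i])   # always 0: differencing wipes it out

# No-intercept OLS on the differenced data recovers the slope:
fd_slope = sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)
print(fd_slope)   # 2.0; the fixed effect a_i has dropped out
print(set(dg))    # {0}: no variation left to estimate a coefficient on g
```

Each of the 3 units loses its first observation, leaving 3 × 3 = 9 differenced observations, just as the airline panel shrinks from 90 to 6 × 14 = 84.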

16.6 The Random Effects Model (REM)


Commenting on fixed effect, or LSDV, modeling, Kmenta writes:12
An obvious question in connection with the covariance [i.e., LSDV] model is whether the inclu-
sion of the dummy variables—and the consequent loss of the number of degrees of freedom—is
really necessary. The reasoning underlying the covariance model is that in specifying the regres-
sion model we have failed to include relevant explanatory variables that do not change over time
(and possibly others that do change over time but have the same value for all cross-sectional
units), and that the inclusion of dummy variables is a coverup of our ignorance.

If the dummy variables do in fact represent a lack of knowledge about the (true) model,
why not express this ignorance through the disturbance term? This is precisely the approach
suggested by the proponents of the so-called error components model (ECM) or random
effects model (REM), which we will now illustrate with our airline cost function.
The basic idea is to start with Eq. (16.4.1):

TC_it = β1i + β2 Q_it + β3 PF_it + β4 LF_it + u_it   (16.6.1)

Instead of treating β1i as fixed, we assume that it is a random variable with a mean value of β1 (no subscript i here). The intercept value for an individual company can be expressed as

β1i = β1 + ε_i   (16.6.2)

where ε_i is a random error term with a mean value of zero and a variance of σ_ε².
What we are essentially saying is that the six firms included in our sample are a drawing
from a much larger universe of such companies and that they have a common mean value
for the intercept (= β1 ). The individual differences in the intercept values of each company
are reflected in the error term εi .
Substituting Eq. (16.6.2) into Eq. (16.6.1), we obtain:

TC_it = β1 + β2 Q_it + β3 PF_it + β4 LF_it + ε_i + u_it
      = β1 + β2 Q_it + β3 PF_it + β4 LF_it + w_it      (16.6.3)

where

w_it = ε_i + u_it   (16.6.4)
11 See in particular Jeffrey M. Wooldridge, Econometric Analysis of Cross Section and Panel Data, MIT Press, Cambridge, Mass., 2002, pp. 279–283.
12 Jan Kmenta, Elements of Econometrics, 2d ed., Macmillan, New York, 1986, p. 633.
The composite error term w_it consists of two components: ε_i, which is the cross-section, or individual-specific, error component, and u_it, which is the combined time series and cross-section error component and is sometimes called the idiosyncratic term because it varies over cross-section (i.e., subject) as well as time. The error components model (ECM) is so named because the composite error term consists of two (or more) error components.
The usual assumptions made by the ECM are that

ε_i ∼ N(0, σ_ε²)
u_it ∼ N(0, σ_u²)
E(ε_i u_it) = 0;   E(ε_i ε_j) = 0   (i ≠ j)      (16.6.5)
E(u_it u_is) = E(u_it u_jt) = E(u_it u_js) = 0   (i ≠ j; t ≠ s)
that is, the individual error components are not correlated with each other and are not autocorrelated across both cross-section and time series units. It is also very important to note that w_it is not correlated with any of the explanatory variables included in the model. Since ε_i is a component of w_it, it is possible that the latter is correlated with the explanatory variables. If that is indeed the case, the ECM will result in inconsistent estimation of the regression coefficients.
Shortly, we will discuss the Hausman test, which will tell us in a given application if w_it is correlated with the explanatory variables, that is, whether ECM is the appropriate model.
Notice carefully the difference between FEM and ECM. In FEM each cross-sectional unit has its own (fixed) intercept value; in all there are N such values for the N cross-sectional units. In
ECM, on the other hand, the (common) intercept represents the mean value of all the (cross-sectional) intercepts, and the error component ε_i represents the (random) deviation of the individual intercept from this mean value. Keep in mind, however, that ε_i is not directly observable; it is what is known as an unobservable, or latent, variable.
As a result of the assumptions stated in Eq. (16.6.5), it follows that

E(w_it) = 0   (16.6.6)
var(w_it) = σ_ε² + σ_u²   (16.6.7)

Now if σ_ε² = 0, there is no difference between models (16.3.1) and (16.6.3) and we can simply pool all the (cross-sectional and time series) observations and run the pooled regression, as we did in Eq. (16.3.1). This is true because in this situation there are either no subject-specific effects or they have all been accounted for in the explanatory variables.
As Eq. (16.6.7) shows, the error term is homoscedastic. However, it can be shown that w_it and w_is (t ≠ s) are correlated; that is, the error terms of a given cross-sectional unit at two different points in time are correlated. The correlation coefficient, corr(w_it, w_is), is as follows:

ρ = corr(w_it, w_is) = σ_ε² / (σ_ε² + σ_u²);   t ≠ s   (16.6.8)
Notice two special features of the preceding correlation coefficient. First, for any given
cross-sectional unit, the value of the correlation between error terms at two different times
remains the same no matter how far apart the two time periods are, as is clear from
Eq. (16.6.8). This is in strong contrast to the first-order [AR(1)] scheme that we discussed
in Chapter 12, where we found that the correlation between periods declines over time.
Second, the correlation structure given in Eq. (16.6.8) remains the same for all cross-
sectional units; that is, it is identical for all subjects.
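Equation (16.6.8) is easy to evaluate from the variance components an estimation routine reports. Plugging in the two S.D. values shown in Table 16.5 below reproduces the "Rho" entry for the cross-section component:

```python
# Eq. (16.6.8) evaluated with the S.D. values reported in Table 16.5:
sigma_eps = 107411.2   # cross-section (random-effect) component
sigma_u = 210422.8     # idiosyncratic component

rho = sigma_eps ** 2 / (sigma_eps ** 2 + sigma_u ** 2)
print(round(rho, 4))   # 0.2067, the "Rho" shown for the cross-section component
```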
If we do not take this correlation structure into account, and estimate Eq. (16.6.3) by
OLS, the resulting estimators will be inefficient. The most appropriate method here is the
method of generalized least squares (GLS).
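Although the full GLS algebra is beyond our scope, for a balanced panel the GLS estimator can be obtained by "quasi-demeaning": subtracting a fraction θ of each unit's sample mean from every variable and then running OLS. A sketch of the standard balanced-panel formula for θ (stated here without derivation), evaluated with the variance components reported in Table 16.5 below:

```python
from math import sqrt

# Balanced-panel random-effects GLS via "quasi-demeaning": transform
# y_it - theta * ybar_i (and likewise each regressor), then run OLS.
# theta is the standard balanced-panel formula (stated, not derived).
# Variance components from Table 16.5 (S.D. column), T = 15 periods:
sigma_eps, sigma_u, T = 107411.2, 210422.8, 15

theta = 1 - sqrt(sigma_u ** 2 / (sigma_u ** 2 + T * sigma_eps ** 2))
print(round(theta, 2))  # about 0.55 for the airline variance components

# theta = 0 would reduce GLS to pooled OLS (no random effect);
# theta = 1 would reduce it to the fixed-effects (WG) estimator.
```

The RE estimator thus sits between pooled OLS and the within-group estimator, with θ governing how much of each unit's mean is removed.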
TABLE 16.5
Dependent Variable: TC
Method: Panel EGLS (Cross-section random effects)
Sample: 1–15
Periods included: 15
Cross-sections included: 6
Total panel (balanced) observations: 90
Swamy and Arora estimator of component variances

Coefficient Std. Error t Statistic Prob.


C 107429.3 30396.62 3.534251 0.0007
Q 2288588. 88172.77 25.95572 0.0000
PF 1.123591 0.083298 13.48877 0.0000
LF -3084994. 584373.2 -5.279151 0.0000
Effects Specification
S.D. Rho
Cross-section random 107411.2 0.2067
Idiosyncratic random 210422.8 0.7933

Firm    Effect
  1     -270615.0
  2     -87061.32
  3     -21338.40
  4     187142.9
  5     134488.9
  6     57383.00

We will not discuss the mathematics of GLS in the present context because of its com-
plexity.13 Since most modern statistical software packages now have routines to estimate
ECM (as well as FEM), we will present the results for our illustrative example only. But
before we do that, it may be noted that we can easily extend Eq. (16.4.2) to allow for a ran-
dom error component to take into account variation over time (see Exercise 16.6).
The results of ECM estimation of the airline cost function are presented in Table 16.5.
Notice these features of the REM. The (average) intercept value is 107429.3. The (differ-
ential) intercept values of the six entities are given at the bottom of the regression results.
Firm number 1, for example, has an intercept value which is 270615 units lower than the
common intercept value of 107429.3; the actual value of the intercept for this airline is
then −163185.7. On the other hand, the intercept value of firm number 6 is higher by 57383
units than the common intercept value; the actual intercept value for this airline is
(107429.3 + 57383), or 164812.3. The intercept values for the other airlines can be derived
similarly. However, note that if you add the (differential) intercept values of all the six air-
lines, the sum is 0, as it should be (why?).
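The zero-sum property is quick to confirm from the "Effect" column of Table 16.5:

```python
# Differential (random) intercepts of the six airlines from Table 16.5:
effects = [-270615.0, -87061.32, -21338.40, 187142.9, 134488.9, 57383.00]
total = sum(effects)
print(total)  # approximately 0 (only rounding error in the reported values)
```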
If you compare the results of the fixed-effect and random-effect regressions, you will see
that there are substantial differences between the two. The important question now is:
Which results are reliable? Or, to put it differently, which should be the choice between the
two models? We can apply the Hausman test to shed light on this question.
The null hypothesis underlying the Hausman test is that the FEM and ECM estimators
do not differ substantially. The test statistic developed by Hausman has an asymptotic χ2

13 See Kmenta, op. cit., pp. 625–630.
TABLE 16.6
Correlated Random Effects—Hausman Test
Equation: Untitled
Test cross-section random effects

Chi-Sq.
Test Summary Statistic Chi-Sq. d.f. Prob.
Cross-section random 49.619687 3 0.0000

Cross-section random effects test comparisons:


Variable Fixed Random Var(Diff.) Prob.
Q 3319023.28 2288587.95 21587779733. 0.0000
PF 0.773071 1.123591 0.002532 0.0000
LF -3797367.59 -3084994.0 35225469544. 0.0001

distribution. If the null hypothesis is rejected, the conclusion is that the ECM is not appro-
priate because the random effects are probably correlated with one or more regressors. In
this case, FEM is preferred to ECM. For our example, the results of the Hausman test are
as shown in Table 16.6.
The Hausman test clearly rejects the null hypothesis, for the estimated χ2 value for 3 df
is highly significant; if the null hypothesis were true, the probability of obtaining a chi-
square value of as much as 49.62 or greater would be practically zero. As a result, we can
reject the ECM (REM) in favor of FEM. Incidentally, the last part of the preceding table
compares the fixed-effect and random-effect coefficients of each variable and, as the last
column shows, in the present example the differences are statistically significant.
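The comparisons in the last part of Table 16.6 can be reproduced coefficient by coefficient: each statistic (b_FE − b_RE)²/Var(Diff.) is chi-square with 1 df under the null. (The joint Hausman statistic of 49.62 additionally uses the covariances among the coefficient differences, which Table 16.6 does not report.) A sketch:

```python
from math import erfc, sqrt

# Coefficient-by-coefficient Hausman comparisons from Table 16.6.
# Under H0 (no correlation between the effects and the regressors),
# stat = (b_FE - b_RE)^2 / Var(Diff.) is chi-square with 1 df.
rows = {  # variable: (fixed, random, variance of the difference)
    "Q":  (3319023.28, 2288587.95, 21587779733.0),
    "PF": (0.773071, 1.123591, 0.002532),
    "LF": (-3797367.59, -3084994.0, 35225469544.0),
}
results = {}
for name, (fe, re_, v) in rows.items():
    stat = (fe - re_) ** 2 / v
    p = erfc(sqrt(stat / 2))  # chi-square(1) survival probability
    results[name] = (stat, p)
    print(name, round(stat, 2), round(p, 4))  # matches the Prob. column
```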

Breusch and Pagan Lagrange Multiplier Test14


Besides the Hausman test, we can also use the Breusch–Pagan (BP) test to test the hypothesis that there are no random effects, that is, that σ_ε² in Eq. (16.6.7) is zero. This test is built into software packages such as STATA. Under the null hypothesis, BP follows a chi-square distribution with 1 df; there is only 1 df because we are testing the single hypothesis that σ_ε² = 0. We will not present the formula underlying the test, for it is rather complicated.
Turning to our airlines example, an application of the BP test produces a chi-square value
of 0.61. With 1 df, the p value of obtaining a chi-square value of 0.61 or greater is about 43 per-
cent. Therefore, we do not reject the null hypothesis. In other words, the random effects model
is not appropriate in the present example. The BP test thus reinforces the Hausman test, which
also found that the random effects model is not appropriate for our airlines example.
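For readers who do want the mechanics, the balanced-panel form of the BP statistic is compact: with pooled OLS residuals e_it, LM = NT/[2(T − 1)] · [Σ_i(Σ_t e_it)² / Σ_i Σ_t e_it² − 1]², distributed χ²(1) under the null. The sketch below applies this standard textbook formula to two hypothetical residual patterns (not the actual airline residuals):

```python
# Balanced-panel Breusch-Pagan LM statistic, computed from residuals
# grouped by cross-sectional unit; chi-square with 1 df under H0.
def bp_lm(resid_by_group):
    groups = list(resid_by_group)
    N, T = len(groups), len(groups[0])
    a = sum(sum(g) ** 2 for g in groups)       # sum_i (sum_t e_it)^2
    b = sum(e * e for g in groups for e in g)  # sum_i sum_t e_it^2
    return N * T * (a / b - 1) ** 2 / (2 * (T - 1))

# Two hypothetical residual patterns for N = 6, T = 15:
clustered = [[c] * 15 for c in (-2, -1, -1, 1, 1, 2)]  # pure firm effects
balanced = [[t - 8 for t in range(1, 16)] for _ in range(6)]  # sum to 0 per firm
print(bp_lm(clustered))  # 630: overwhelming evidence of random effects
print(bp_lm(balanced))   # about 3.21: below the 5 percent chi-sq(1) cutoff of 3.84
```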

16.7 Properties of Various Estimators15


We have discussed several methods of estimating (linear) panel regression models, namely,
pooled estimators, fixed effects estimators that include least squares dummy variable (LSDV)
estimators, fixed-effect within-group estimators, first-difference estimators, and random effects
estimators. What are their statistical properties? Since panel data generally involve a large num-
ber of observations, we will concentrate on the consistency property of these estimators.

14 T. Breusch and A. R. Pagan, "The Lagrange Multiplier Test and Its Application to Model Specification in Econometrics," Review of Economic Studies, vol. 47, 1980, pp. 239–253.
15 The following discussion draws on A. Colin Cameron and Pravin K. Trivedi, Microeconometrics: Methods and Applications, Cambridge University Press, New York, 2005, Chapter 21.
Pooled Estimators
Assuming the slope coefficients are constant across subjects, if the error term in Eq. (16.3.1)
is uncorrelated with the regressors, pooled estimators are consistent. However, as noted
earlier, the error terms are likely to be correlated over time for a given subject. Therefore,
panel-corrected standard errors must be used for hypothesis testing. Make sure the
statistical package you use has this facility, otherwise the computed standard errors may
be underestimated. It should be noted that if the fixed effects model is appropriate but we
use the pooled estimator, the estimated coefficients will be inconsistent.
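The point about panel-corrected standard errors can be illustrated with a cluster-robust ("sandwich") variance in the simplest possible case: one regressor, no intercept, and a hypothetical panel whose errors are perfectly correlated within subjects. The naive OLS standard error is then far too small:

```python
from math import sqrt

# Hypothetical worst case for naive OLS: the error is perfectly
# correlated within each subject (u_g constant over the subject's periods).
data = []   # (group, x, y) with y = 2*x + u_g
for grp, u in ((1, -1.0), (2, 0.0), (3, 1.0)):
    for t in range(1, 5):
        data.append((grp, float(t), 2.0 * t + u))

sxx = sum(x * x for _, x, _ in data)
b = sum(x * y for _, x, y in data) / sxx          # one regressor, no intercept
resid = [(grp, y - b * x, x) for grp, x, y in data]

# Naive OLS standard error: sqrt(sigma^2 / sum x^2)
sigma2 = sum(e * e for _, e, _ in resid) / (len(data) - 1)
se_naive = sqrt(sigma2 / sxx)

# Cluster-robust (sandwich) standard error:
# sqrt( sum_g (sum_t x_it * e_it)^2 / (sum x^2)^2 )
score = {}
for grp, e, x in resid:
    score[grp] = score.get(grp, 0.0) + x * e
se_cluster = sqrt(sum(s * s for s in score.values()) / sxx ** 2)

print(b, se_naive, se_cluster)  # slope 2.0; cluster SE well above the naive SE
```

With within-subject correlation this strong, the naive standard error understates the uncertainty by roughly half, which is exactly the danger flagged above.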

Fixed Effects Estimators


Even if it is assumed that the underlying model is pooled or random, the fixed effects
estimators are always consistent.

Random Effects Estimators


The random effects model is consistent even if the true model is the pooled estimator. How-
ever, if the true model is fixed effects, the random effects estimator is inconsistent.
For proofs and further details about these properties, refer to the textbooks of Cameron
and Trivedi, Greene, and Wooldridge cited in the footnotes.

16.8 Fixed Effects versus Random Effects Model: Some Guidelines


The challenge facing a researcher is: Which model is better, FEM or ECM? The answer hinges on the assumption we make about the likely correlation between the individual, or cross-section specific, error component ε_i and the X regressors.
If it is assumed that εi and the X’s are uncorrelated, ECM may be appropriate, whereas
if εi and the X’s are correlated, FEM may be appropriate.
The assumption underlying ECM is that the εi are random drawings from a much larger
population, but sometimes this may not be so. For example, suppose we want to study the
crime rate across the 50 states in the United States. Obviously, in this case, the assumption
that the 50 states are a random sample is not tenable.
Keeping this fundamental difference in the two approaches in mind, what more can we
say about the choice between FEM and ECM? Here the observations made by Judge et al.
may be helpful:16
1. If T (the number of time series data) is large and N (the number of cross-sectional units)
is small, there is likely to be little difference in the values of the parameters estimated by
FEM and ECM. Hence the choice here is based on computational convenience. On this
score, FEM may be preferable.
2. When N is large and T is small (i.e., a short panel), the estimates obtained by the two meth-
ods can differ significantly. Recall that in ECM β1i = β1 + εi , where εi is the cross-
sectional random component, whereas in FEM we treat β1i as fixed and not random. In the
latter case, statistical inference is conditional on the observed cross-sectional units in
the sample. This is appropriate if we strongly believe that the individual, or cross-sectional,
units in our sample are not random drawings from a larger sample. In that case, FEM is
appropriate. If the cross-sectional units in the sample are regarded as random drawings,
however, then ECM is appropriate, for in that case statistical inference is unconditional.
3. If the individual error component εi and one or more regressors are correlated, then the
ECM estimators are biased, whereas those obtained from FEM are unbiased.
16 Judge et al., op. cit., pp. 489–491.
4. If N is large and T is small, and if the assumptions underlying ECM hold, ECM estima-
tors are more efficient than FEM.
5. Unlike FEM, ECM can estimate coefficients of time-invariant variables such as gender and ethnicity. FEM does control for such time-invariant variables, but it cannot estimate them directly, as is clear from the LSDV and within-group estimators. Put differently, FEM controls for all time-invariant variables (why?), whereas ECM can estimate coefficients only for those time-invariant variables that are explicitly introduced in the model.
Despite the Hausman test, it is important to keep in mind the warning sounded by
Johnston and DiNardo. In deciding between fixed effects or random effects models, they
argue that, “ . . . there is no simple rule to help the researcher navigate past the Scylla of
fixed effects and the Charybdis of measurement error and dynamic selection. Although
they are an improvement over cross-section data, panel data do not provide a cure-all for all
of an econometrician’s problems.”17

16.9 Panel Data Regressions: Some Concluding Comments


As noted at the outset, the topic of panel data modeling is vast and complex. We have barely
scratched the surface. The following are among the many topics we have not discussed.
1. Hypothesis testing with panel data.
2. Heteroscedasticity and autocorrelation in ECM.
3. Unbalanced panel data.
4. Dynamic panel data models in which the lagged value(s) of the regressand appears as an
explanatory variable.
5. Simultaneous equations involving panel data.
6. Qualitative dependent variables and panel data.
7. Unit roots in panel data (on unit roots, see Chapter 21).
One or more of these topics can be found in the references cited in this chapter, and the
reader is urged to consult them to learn more about this topic. These references also cite
several empirical studies in various areas of business and economics that have used panel
data regression models. The beginner is well-advised to read some of these applications to
get a feel for how researchers have actually implemented such models.18

16.10 Some Illustrative Examples

EXAMPLE 16.1
Productivity and Public Investment

To find out why productivity has declined and what the role of public investment is, Alicia Munnell studied productivity data for the 48 continental U.S. states over the 17 years from 1970 to 1986, for a total of 816 observations.19 Using these data, we estimated the pooled regression in Table 16.7. Note that this regression does not take into account the panel nature of the data.
The dependent variable in this model is GSP (gross state product), and the explanatory variables are: PRIVCAP (private capital), PUBCAP (public capital), WATER (water utility capital), and UNEMP (unemployment rate). Note: L stands for natural log.

17 Jack Johnston and John DiNardo, Econometric Methods, 4th ed., McGraw-Hill, New York, 1997, p. 403.
18 For further details and concrete applications, see Paul D. Allison, Fixed Effects Regression Methods for Longitudinal Data Using SAS, SAS Institute, Cary, North Carolina, 2005.
19 The Munnell data can be found at www.aw-bc.com/murray.
EXAMPLE 16.1 (Continued)

TABLE 16.7
Dependent Variable: LGSP
Method: Panel Least Squares
Sample: 1970–1986
Periods included: 17
Cross-sections included: 48
Total panel (balanced) observations: 816

Coefficient Std. Error t Statistic Prob.


C 0.907604 0.091328 9.937854 0.0000
LPRIVCAP 0.376011 0.027753 13.54847 0.0000
LPUBCAP 0.351478 0.016162 21.74758 0.0000
LWATER 0.312959 0.018739 16.70062 0.0000
LUNEMP -0.069886 0.015092 -4.630528 0.0000
R-squared 0.981624 Mean dependent var. 10.50885
Adjusted R-squared 0.981533 S.D. dependent var. 1.021132
S.E. of regression 0.138765 F-statistic. 10830.51
Sum squared resid. 15.61630 Prob. (F-statistic) 0.000000
Log likelihood 456.2346 Durbin–Watson stat. 0.063016

All the variables have the expected signs and all are individually, as well as collectively,
statistically significant, assuming all the assumptions of the classical linear regression
model hold true.
To take into account the panel dimension of the data, in Table 16.8 we estimated a fixed
effects model using 47 dummies for the 48 states to avoid falling into the dummy-variable

TABLE 16.8
Dependent Variable: LGSP
Method: Panel Least Squares
Sample: 1970–1986
Periods included: 17
Cross-sections included: 48
Total panel (balanced) observations: 816

Coefficient Std. Error t Statistic Prob.


C -0.033235 0.208648 -0.159286 0.8735
LPRIVCAP 0.267096 0.037015 7.215864 0.0000
LPUBCAP 0.714094 0.026520 26.92636 0.0000
LWATER 0.088272 0.021581 4.090291 0.0000
LUNEMP -0.138854 0.007851 -17.68611 0.0000
Effects Specification
Cross-section fixed (dummy variables)
R-squared 0.997634 Mean dependent var. 10.50885
Adjusted R-squared 0.997476 S.D. dependent var. 1.021132
S.E. of regression 0.051303 F-statistic 6315.897
Sum squared resid. 2.010854 Prob. (F-statistic) 0.000000
Log likelihood 1292.535 Durbin–Watson stat. 0.520682
EXAMPLE 16.1 (Continued)

TABLE 16.9
Dependent Variable: LGSP
Method: Panel EGLS (Cross-section random effects)
Sample: 1970–1986
Periods included: 17
Cross-sections included: 48
Total panel (balanced) observations: 816
Swamy and Arora estimator of component variances

Coefficient Std. Error t Statistic Prob.


C -0.046176 0.161637 -0.285680 0.7752
LPRIVCAP 0.313980 0.029740 10.55760 0.0000
LPUBCAP 0.641926 0.023330 27.51514 0.0000
LWATER 0.130768 0.020281 6.447875 0.0000
LUNEMP -0.139820 0.007442 -18.78669 0.0000
Effects Specification
S.D. Rho
Cross-section random 0.130128 0.8655
Idiosyncratic random 0.051303 0.1345

trap. To save space, we only present the estimated regression coefficients and not the indi-
vidual dummy coefficients. But it should be added that all of the 47 state dummies were
individually highly statistically significant.
You can see that there are substantial differences between the pooled regression and
the fixed-effects regression, casting doubt on the results of the pooled regression.
To see if the random effects model is more appropriate in this case, we present the
results of the random effects regression model in Table 16.9.
To choose between the two models, we use the Hausman test, which gives the results
shown in Table 16.10.
Since the estimated chi-square value is highly statistically significant, we reject the
hypothesis that there is no significant difference in the estimated coefficients of the two
models. It seems there is correlation between the error term and one or more regressors.
Hence, we can reject the random effects model in favor of the fixed effects model. Note,
however, as the last part of Table 16.10 shows, not all coefficients differ in the two mod-
els. For example, there is not a statistically significant difference in the values of the
LUNEMP coefficient in the two models.

TABLE 16.10
Chi-Sq.
Test Summary Statistic Chi-Sq. d.f. Prob.
Cross-section random 42.458353 4 0.0000
Cross-section random effects test comparisons:
Variable Fixed Random Var (Diff.) Prob.
LPRIVCAP 0.267096 0.313980 0.000486 0.0334
LPUBCAP 0.714094 0.641926 0.000159 0.0000
LWATER 0.088272 0.130768 0.000054 0.0000
LUNEMP -0.138854 -0.139820 0.000006 0.6993
EXAMPLE 16.2
Demand for Electricity in the USA

In their article, Maddala et al. considered the demand for residential electricity and natural gas in 49 states in the USA for the period 1970–1990; Hawaii was not included in the analysis.20 They collected data on several variables; these data can be found on the book's website. In this example, we will only consider the demand for residential electricity. We first present the results based on the fixed effects estimation (Table 16.11) and then the random effects estimation (Table 16.12), followed by a comparison of the two models.

TABLE 16.11
Dependent Variable: Log(ESRCBPC)
Method: Panel Least Squares
Sample: 1971–1990
Periods included: 20
Cross-sections included: 49
Total panel (balanced) observations: 980

Coefficient Std. Error t Statistic Prob.


C -12.55760 0.363436 -34.55249 0.0000
Log(RESRCD) -0.628967 0.029089 -21.62236 0.0000
Log(YDPC) 1.062439 0.040280 26.37663 0.0000
Effects Specification
Cross-section fixed (dummy variables)
R-squared 0.757600 Mean dependent var. -4.536187
Adjusted R-squared 0.744553 S.D. dependent var. 0.316205
S.E. of regression 0.159816 Akaike info criterion -0.778954
Sum squared resid. 23.72762 Schwarz criterion -0.524602
Log likelihood 432.6876 Hannan-Quinn criter. -0.682188
F-statistic 58.07007 Durbin–Watson stat. 0.404314
Prob. (F-statistic) 0.000000

where Log(ESRCBPC) = natural log of residential electricity consumption per capita (in billion Btu), Log(RESRCD) = natural log of the real 1987 electricity price, and Log(YDPC) = natural log of real 1987 disposable income per capita.
Since this is a double-log model, the estimated slope coefficients represent elasticities.
Thus, holding other things the same, if real per capita income goes up by 1 percent, the
mean consumption of electricity goes up by about 1 percent. Likewise, holding other
things constant, if the real price of electricity goes up by 1 percent, the average con-
sumption of electricity goes down by about 0.6 percent. All the estimated elasticities are
statistically significant.
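These elasticity statements can be checked arithmetically from the coefficients in Table 16.11:

```python
from math import exp, log

# Elasticities from Table 16.11 (double-log model):
price_elast, income_elast = -0.628967, 1.062439

# Exact proportional change in consumption for a 1 percent change
# in the regressor, holding the other regressor fixed:
dq_price = exp(price_elast * log(1.01)) - 1    # price up 1 percent
dq_income = exp(income_elast * log(1.01)) - 1  # income up 1 percent
print(round(100 * dq_price, 2), round(100 * dq_income, 2))
```

A 1 percent price rise lowers consumption by about 0.62 percent, and a 1 percent income rise raises it by about 1.06 percent, in line with the "about 0.6" and "about 1" readings in the text.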
The results of the random effects model are as shown in Table 16.12.
It seems that there is not much difference in the two models. But we can use the
Hausman test to find out if this is so. The results of this test are as shown in Table 16.13.
Although the coefficients of the two models in Tables 16.11 and 16.12 look quite sim-
ilar, the Hausman test shows that this is not the case. The chi-square value is highly statis-
tically significant. Therefore, we can choose the fixed effects model over the random

20
G. S. Maddala, Robert P. Trost, Hongyi Li, and Frederick Joutz, “Estimation of Short-run and Long-run Elasticities of Demand from Panel Data Using Shrinkage Estimators,” Journal of Business and Economic Statistics, vol. 15, no. 1, January 1997, pp. 90–100.

EXAMPLE 16.2
(Continued)

TABLE 16.12
Dependent Variable: Log(ESRCBPC)
Method: Panel EGLS (Cross-section random effects)
Sample: 1971–1990
Periods included: 20
Cross-sections included: 49
Total panel (balanced) observations: 980
Swamy and Arora estimator of component variances

Variable              Coefficient    Std. Error    t-Statistic    Prob.
C                     -11.68536      0.353285      -33.07631      0.0000
Log(RESRCD)            -0.665570     0.028088      -23.69612      0.0000
Log(YDPC)               0.980877     0.039257       24.98617      0.0000
Effects Specification
S.D. Rho
Cross-section random 0.123560 0.3741
Idiosyncratic random 0.159816 0.6259
Weighted Statistics
R-squared 0.462591 Mean dependent var. -1.260296
Adjusted R-squared 0.461491 S.D. dependent var. 0.229066
S.E. of regression 0.168096 Sum squared resid. 27.60641
F-statistic 420.4906 Durbin–Watson stat. 0.345453
Prob. (F-statistic) 0.000000
Unweighted Statistics
R-squared 0.267681 Mean dependent var. -4.536187
Sum squared resid. 71.68384 Durbin–Watson stat. 0.133039

TABLE 16.13
Correlated Random Effects—Hausman Test
Equation: Untitled
Test cross-section random effects

Test Summary             Chi-Sq. Statistic    Chi-Sq. d.f.    Prob.
Cross-section random     105.865216           2               0.0000

Cross-section random effects test comparisons:

Variable        Fixed        Random       Var (Diff.)    Prob.
Log(RESRCD)     -0.628967    -0.665570    0.000057       0.0000
Log(YDPC)        1.062439     0.980877    0.000081       0.0000

effects model. This example brings out the important point that when the sample size is large,
in our case 980 observations, even small differences in the estimated coefficients of the two
models can be statistically significant. Thus, the coefficients of the Log(RESRCD) variable in
the two models look reasonably close, but statistically they are not.
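The chi-square value in Table 16.13 can be reproduced, approximately, from the comparison entries themselves. The sketch below uses only the reported diagonal Var (Diff.) terms, so it ignores any off-diagonal covariance and matches the reported 105.87 only up to rounding.

```python
import numpy as np

# Fixed- and random-effects coefficients and Var (Diff.) from Table 16.13
b_fe = np.array([-0.628967, 1.062439])     # Log(RESRCD), Log(YDPC)
b_re = np.array([-0.665570, 0.980877])
var_diff = np.array([0.000057, 0.000081])  # diagonal of Var(b_FE) - Var(b_RE)

# Hausman statistic: (b_FE - b_RE)' [Var(Diff.)]^{-1} (b_FE - b_RE),
# computed here with the diagonal approximation.
diff = b_fe - b_re
H = float(diff @ (diff / var_diff))
print(f"Hausman chi-square (2 df): {H:.2f}")
```

With the rounded table entries this gives about 105.6, confirming that the seemingly small coefficient differences are jointly highly significant.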

612 Part Three Topics in Econometrics

EXAMPLE 16.3
Beer Consumption, Income, and Beer Tax

To assess the impact of beer tax on beer consumption, Philip Cook investigated the relationship between the two, after allowing for the effect of income.21 His data pertain to 50 states and Washington, D.C., for the period 1975–2000. In this example we study the relationship of per capita beer sales to tax rate and income, all at the state level. We present the results of pooled OLS, fixed effects, and random effects models in tabular form in Table 16.14. The dependent variable is per capita beer sales.
These results are interesting. As per economic theory, we would expect a negative
relationship between beer consumption and beer taxes, which is the case for the three
models. The negative income effect on beer consumption would suggest that beer is an
inferior good. An inferior good is one whose demand decreases as consumers’ income
rises. Maybe when their income rises, consumers prefer champagne!
For our purpose, what is interesting is the difference in the estimated coefficients.
Apparently there is not much difference in estimated coefficients between FEM and ECM.
As a matter of fact, the Hausman test produces a chi-square value of 3.4, which is not
significant for 2 df at the 5 percent level; the p value is 0.1783.
The results based on OLS, however, are vastly different. The coefficient of the beer tax
variable, in absolute value, is much smaller than that obtained from FEM or ECM. The
income variable, although it has the negative sign, is not statistically significant, whereas
the other two models show that it is highly significant.
This example shows very vividly what could happen if we neglect the panel structure
of the data and estimate a pooled regression.

TABLE 16.14
Variable OLS FEM REM
Constant 1.4192 1.7617 1.7542
(24.37) (52.23) (39.22)
Beer tax −0.0067 −0.0183 −0.0181
(−2.13) (−9.67) (−9.69)
Income −3.54(e−6) −0.000020 −0.000019
(−1.12) (−9.17) (−9.10)
R2 0.0062 0.0052 0.0052

Notes: Figures in parentheses are the estimated t ratios. −3.54(e−6) = −0.00000354.

Summary and Conclusions

1. Panel regression models are based on panel data. Panel data consist of observations on the same cross-sectional, or individual, units over several time periods.
2. There are several advantages to using panel data. First, they increase the sample size
considerably. Second, by studying repeated cross-section observations, panel data are
better suited to study the dynamics of change. Third, panel data enable us to study
more complicated behavioral models.
3. Despite their substantial advantages, panel data pose several estimation and inference
problems. Since such data involve both cross-section and time dimensions, problems
that plague cross-sectional data (e.g., heteroscedasticity) and time series data (e.g.,
autocorrelation) need to be addressed. There are some additional problems as well,
such as cross-correlation in individual units at the same point in time.

21
The data used here are obtained from the website of Michael P. Murray, Econometrics: A Modern Introduction, Pearson/Addison-Wesley, Boston, 2006, but the original data were collected by Philip Cook for his book, Paying the Tab: The Costs and Benefits of Alcohol Control, Princeton University Press, Princeton, New Jersey, 2007.

4. There are several estimation techniques to address one or more of these problems. The
two most prominent are (1) the fixed effects model (FEM) and (2) the random effects
model (REM), or error components model (ECM).
5. In FEM, the intercept in the regression model is allowed to differ among individuals in
recognition of the fact that each individual, or cross-sectional, unit may have some special
characteristics of its own. To take into account the differing intercepts, one can use dummy
variables. The FEM using dummy variables is known as the least-squares dummy variable
(LSDV) model. FEM is appropriate in situations where the individual-specific intercept
may be correlated with one or more regressors. A disadvantage of LSDV is that it consumes
a lot of degrees of freedom when the number of cross-sectional units, N, is very large, in
which case we have to introduce N dummies (but suppress the common intercept term).
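The LSDV idea in point 5 can be verified in a few lines: regressing y on x plus one dummy per cross-sectional unit (suppressing the common intercept) yields exactly the same slope as the within (demeaned) estimator. The data below are simulated, with the unit effects deliberately correlated with the regressor, which is the case where FEM is appropriate.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 4, 10                       # 4 units observed over 10 periods
unit = np.repeat(np.arange(N), T)
alpha = rng.normal(size=N)         # unit-specific intercepts
x = rng.normal(size=N * T) + alpha[unit]   # x correlated with the effects
y = 2.0 * x + alpha[unit] + rng.normal(scale=0.1, size=N * T)

# LSDV: N dummies, no common intercept
D = (unit[:, None] == np.arange(N)).astype(float)
X_lsdv = np.column_stack([x, D])
b_lsdv = np.linalg.lstsq(X_lsdv, y, rcond=None)[0][0]

# Within estimator: demean y and x within each unit, then OLS with no intercept
x_dm = x - np.array([x[unit == i].mean() for i in range(N)])[unit]
y_dm = y - np.array([y[unit == i].mean() for i in range(N)])[unit]
b_within = (x_dm @ y_dm) / (x_dm @ x_dm)

print(b_lsdv, b_within)   # identical up to floating-point error
```

The equivalence is why, when N is very large, software estimates the within transformation rather than carrying N dummy columns.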
6. An alternative to FEM is ECM. In ECM it is assumed that the intercept of an individual
unit is a random drawing from a much larger population with a constant mean value. The
individual intercept is then expressed as a deviation from this constant mean value. One
advantage of ECM over FEM is that it is economical in degrees of freedom, as we do not
have to estimate N cross-sectional intercepts. We need only to estimate the mean value of
the intercept and its variance. ECM is appropriate in situations where the (random) inter-
cept of each cross-sectional unit is uncorrelated with the regressors. Another advantage
of ECM is that we can introduce variables such as gender, religion, and ethnicity, which
remain constant for a given subject. In FEM we cannot do that because all such variables
are collinear with the subject-specific intercept. Moreover, if we use the within-group estimator or the first-difference estimator, all such time-invariant variables are swept out.
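The last remark in point 6 is easy to see mechanically: demeaning within each unit turns any subject-constant variable into a column of zeros, so its coefficient cannot be estimated. A toy sketch (the gender variable here is hypothetical):

```python
import numpy as np

N, T = 3, 5
unit = np.repeat(np.arange(N), T)
gender = np.array([0, 1, 1])[unit]   # constant within each subject

# Within transformation: subtract each unit's time mean
means = np.array([gender[unit == i].mean() for i in range(N)])
gender_demeaned = gender - means[unit]

print(gender_demeaned)   # all zeros: the variable is swept out
```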
7. The Hausman test can be used to decide between FEM and ECM. We can also use the
Breusch–Pagan test to see if ECM is appropriate.
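The Breusch–Pagan LM test mentioned in point 7 is simple enough to compute by hand from the pooled-OLS residuals of a balanced panel: LM = [NT/(2(T - 1))] [Sum_i (Sum_t e_it)^2 / Sum_i Sum_t e_it^2 - 1]^2, which is chi-square with 1 df under the null of zero random-effect variance. A sketch, with synthetic residuals standing in for actual pooled-OLS residuals:

```python
import numpy as np

def breusch_pagan_lm(resid, N, T):
    """Breusch-Pagan (1980) LM test for random effects, balanced panel.

    resid: pooled-OLS residuals stacked unit by unit, length N*T.
    Under H0 (no random effect) the statistic is chi-square with 1 df.
    """
    e = resid.reshape(N, T)
    num = (e.sum(axis=1) ** 2).sum()   # sum_i (sum_t e_it)^2
    den = (e ** 2).sum()               # sum_i sum_t e_it^2
    return (N * T) / (2 * (T - 1)) * (num / den - 1) ** 2

# With purely random residuals (no unit effect) the statistic should be small
rng = np.random.default_rng(1)
lm = breusch_pagan_lm(rng.normal(size=49 * 20), N=49, T=20)
print(lm)
```

A large LM value rejects the null and favors the error components specification over pooled OLS.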
8. Despite its increasing popularity in applied research, and despite the increasing avail-
ability of such data, panel data regressions may not be appropriate in every situation.
One has to use some practical judgment in each case.
9. There are some specific problems with panel data that need to be borne in mind. The
most serious is the problem of attrition, whereby, for one reason or another, subjects of
the panel drop out over time so that over subsequent surveys (or cross-sections) fewer
original subjects remain in the panel. Even if there is no attrition, over time subjects may
refuse or be unwilling to answer some questions.

EXERCISES

Questions
16.1. What are the special features of (a) cross-section data, (b) time series data, and
(c) panel data?
16.2. What is meant by a fixed effects model (FEM)? Since panel data have both time and
space dimensions, how does FEM allow for both dimensions?
16.3. What is meant by an error components model (ECM)? How does it differ from
FEM? When is ECM appropriate? And when is FEM appropriate?
16.4. Is there a difference between LSDV, within-estimator, and first-difference models?
16.5. When are panel data regression models inappropriate? Give examples.
16.6. How would you extend model (16.4.2) to allow for a time error component? Write
down the model explicitly.
16.7. Refer to the data on eggs produced and their prices given in Table 1.1. Which model
may be appropriate here, FEM or ECM? Why?

16.8. For the investment data given in Table 1.2, which model would you choose—FEM
or REM? Why?
16.9. Based on the Michigan Income Dynamics Study, Hausman attempted to estimate
a wage, or earnings, model using a sample of 629 high school graduates, who
were followed for a period of six years, thus giving in all 3,774 observations. The de-
pendent variable in this study was logarithm of wage, and the explanatory variables
were: age (divided into several age groups); unemployment in the previous year;
poor health in the previous year; self-employment; region of residence (for a graduate from the South, South = 1 and 0 otherwise); and area of residence (for a graduate from a rural area, Rural = 1 and 0 otherwise). Hausman used both FEM and ECM.
The results are given in Table 16.15 (standard errors in parentheses).

TABLE 16.15
Wage Equations (Dependent Variable: Log Wage)
Source: Reproduced from Cheng Hsiao, Analysis of Panel Data, Cambridge University Press, 1986, p. 42. Original source: J. A. Hausman, “Specification Tests in Econometrics,” Econometrica, vol. 46, 1978, pp. 1251–1271.

Variable                           Fixed Effects       Random Effects
1. Age 1 (20–35)                    0.0557 (0.0042)     0.0393 (0.0033)
2. Age 2 (35–45)                    0.0351 (0.0051)     0.0092 (0.0036)
3. Age 3 (45–55)                    0.0209 (0.0055)    -0.0007 (0.0042)
4. Age 4 (55–65)                    0.0209 (0.0078)    -0.0097 (0.0060)
5. Age 5 (65– )                    -0.0171 (0.0155)    -0.0423 (0.0121)
6. Unemployed previous year        -0.0042 (0.0153)    -0.0277 (0.0151)
7. Poor health previous year       -0.0204 (0.0221)    -0.0250 (0.0215)
8. Self-employment                 -0.2190 (0.0297)    -0.2670 (0.0263)
9. South                           -0.1569 (0.0656)    -0.0324 (0.0333)
10. Rural                          -0.0101 (0.0317)    -0.1215 (0.0237)
11. Constant                          ——                0.8499 (0.0433)
S²                                  0.0567              0.0694
Degrees of freedom                  3,135               3,763

a. Do the results make economic sense?


b. Is there a vast difference in the results produced by the two models? If so, what
might account for these differences?
c. On the basis of the data given in the table, which model, if any, would you choose?
Empirical Exercises
16.10. Refer to the airline example discussed in the text. Instead of the linear model given
in Eq. (16.4.2), estimate a log–linear regression model and compare your results
with those given in Table 16.2.
16.11. Refer to the data in Table 1.1.
a. Let Y = eggs produced (in millions) and X = price of eggs (cents per dozen).
Estimate the model for the years 1990 and 1991 separately.
b. Pool the observations for the two years and estimate the pooled regression. What
assumptions are you making in pooling the data?
c. Use the fixed effects model, distinguishing the two years, and present the
regression results.
d. Can you use the fixed effects model, distinguishing the 50 states? Why or why not?
e. Would it make sense to distinguish both the state effect and the year effect? If so,
how many dummy variables would you have to introduce?
f. Would the error components model be appropriate to model the production of
eggs? Why or why not? See if you can estimate such a model using, say, EViews.

16.12. Continue with Exercise 16.11. Before deciding to run the pooled regression, you
want to find out whether the data are “poolable.” For this purpose you decide to use
the Chow test discussed in Chapter 8. Show the necessary calculations involved and
determine if the pooled regression makes any sense.
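The Chow test asked for in Exercise 16.12 compares the restricted (pooled) residual sum of squares with the sum of the RSS from the two separate regressions: F = [(RSS_R - RSS_UR)/k] / [RSS_UR/(n1 + n2 - 2k)], where RSS_UR = RSS1 + RSS2 and k is the number of parameters per equation. A minimal sketch of the mechanics, using made-up data for two groups whose parameters genuinely differ (illustrative data, not the egg data of Table 1.1):

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from an OLS fit of y on X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ beta
    return float(e @ e)

rng = np.random.default_rng(2)
n1 = n2 = 30
x1, x2 = rng.normal(size=n1), rng.normal(size=n2)
y1 = 1.0 + 0.5 * x1 + rng.normal(scale=0.3, size=n1)   # group 1
y2 = 3.0 - 0.5 * x2 + rng.normal(scale=0.3, size=n2)   # group 2: different parameters

X1 = np.column_stack([np.ones(n1), x1])
X2 = np.column_stack([np.ones(n2), x2])

k = 2                                                        # intercept and slope
rss_r = rss(np.concatenate([y1, y2]), np.vstack([X1, X2]))   # pooled (restricted)
rss_ur = rss(y1, X1) + rss(y2, X2)                           # separate (unrestricted)

F = ((rss_r - rss_ur) / k) / (rss_ur / (n1 + n2 - 2 * k))
# Compare with the F(k, n1 + n2 - 2k) critical value; F(2, 56) at 5% is about 3.16.
print(f"Chow F = {F:.1f}")
```

Here the F statistic is far above the critical value, so pooling these two groups would not be legitimate; with the egg data, the same calculation decides the question.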
16.13. Use the investment data given in Table 1.6.
a. Estimate the Grunfeld investment function for each company individually.
b. Now pool the data for all the companies and estimate the Grunfeld investment
function by OLS.
c. Use LSDV to estimate the investment function and compare your results with
the pooled regression estimated in (b).
d. How would you decide between the pooled regression and the LSDV regression?
Show the necessary calculations.
16.14. Table 16.16 gives data on the index of hourly compensation in manufacturing in U.S. dollars, Y (index, 1992 = 100), and the civilian unemployment rate, X (%), for Canada, the United Kingdom, and the United States for the period 1980–2006.
Consider the model:
Yit = β1 + β2 Xit + uit    (1)

TABLE 16.16
Unemployment Rate and Hourly Compensation in Manufacturing, in the United States, Canada, and the United Kingdom, 1980–2006
Source: Economic Report of the President, January 2008, Table B-109.

Year    COMP_U.S.   UN_U.S.   COMP_CAN   UN_CAN   COMP_U.K.   UN_U.K.
1980     55.9        7.1       49.0       7.3      47.1        6.9
1981     61.6        7.6       53.8       7.3      47.5        9.7
1982     67.2        9.7       60.1      10.7      45.1       10.8
1983     69.3        9.6       64.3      11.6      41.9       11.5
1984     71.6        7.5       65.0      10.9      39.8       11.8
1985     75.3        7.2       65.0      10.2      42.3       11.4
1986     78.8        7.0       64.9       9.3      52.0       11.4
1987     81.3        6.2       69.6       8.4      64.5       10.5
1988     84.1        5.5       78.5       7.4      74.8        8.6
1989     86.6        5.3       85.5       7.1      73.5        7.3
1990     90.5        5.6       92.4       7.7      89.6        7.1
1991     95.6        6.8      100.7       9.8      99.9        8.9
1992    100.0        7.5      100.0      10.6     100.0       10.0
1993    102.0        6.9       94.8      10.8      88.8       10.4
1994    105.3        6.1       92.1       9.6      92.8        8.7
1995    107.3        5.6       93.9       8.6      97.3        8.7
1996    109.3        5.4       95.9       8.8      96.0        8.1
1997    112.2        4.9       96.7       8.4     104.1        7.0
1998    118.7        4.5       94.9       7.7     113.8        6.3
1999    123.4        4.2       96.8       7.0     117.5        6.0
2000    134.7        4.0      100.0       6.1     114.8        5.5
2001    137.8        4.7       98.9       6.5     114.7        5.1
2002    147.8        5.8      101.0       7.0     126.8        5.2
2003    158.2        6.0      116.7       6.9     145.2        5.0
2004    161.5        5.5      127.1       6.4     171.4        4.8
2005    168.3        5.1      141.8       6.0     177.4        4.8
2006    172.4        4.6      155.5       5.5     192.3        5.5

Notes: UN = Unemployment rate (%).
COMP = Index of hourly compensation in U.S. dollars, 1992 = 100.
CAN = Canada.

a. A priori, what is the expected relationship between Y and X? Why?
b. Estimate the model given in Eq. (1) for each country.
c. Estimate the model, pooling all of the 81 observations.
d. Estimate the fixed effects model.
e. Estimate the error components model.
f. Which is a better model, FEM or ECM? Justify your answer. (Hint: Apply the Hausman test.)
16.15. Baltagi and Griffin considered the following gasoline demand function:*
ln Yit = β1 + β2 ln X2it + β3 ln X3it + β4 ln X4it + uit

where Y = gasoline consumption per car, X2 = real income per capita, X3 = real gasoline price, X4 = number of cars per capita, i = country (18 OECD countries in all), and t = time (annual observations, 1960–1978). Note: The values in the table are already logged.
a. Estimate the above demand function pooling the data for all 18 of the countries
(a total of 342 observations).
b. Estimate a fixed effects model using the same data.
c. Estimate a random components model using the same data.
d. From your analysis, which model best describes the gasoline demand in the
18 OECD countries? Justify your answer.
16.16. The article by Subhayu Bandyopadhyay and Howard J. Wall, “The Determinants of
Aid in the Post-Cold War Era,” Review, Federal Reserve Bank of St. Louis,
November/December 2007, vol. 89, number 6, pp. 533–547, uses panel data to
estimate the responsiveness of aid to recipient countries’ economic and physical
needs, civil/political rights, and government effectiveness. The data are for
135 countries for three years. The article and data can be found at http://research.stlouisfed.org/publications/review/past/2007, in the November/December (Vol. 89, No. 6) section. The data can also be found on the textbook website in
Table 16.18. Estimate the authors’ model (given on page 534 of their article) using
a random effects estimator. Compare your results with those of the pooled and fixed
effects estimators given by the authors in Table 2 of their article. Which model is
appropriate here, fixed effects or random effects? Why?
16.17. Refer to the airlines example discussed in the text. For each airline, estimate a time
series logarithmic cost function. How do these regressions compare with the fixed
effects and random effects models discussed in the chapter? Would you also esti-
mate 15 cross-section logarithmic cost functions? Why or why not?

*
B. H. Baltagi and J. M. Griffin, “Gasoline Demand in the OECD: An Application of Pooling and Testing Procedures,” European Economic Review, vol. 22, 1983, pp. 117–137. The data for 18 OECD countries for the years 1960–1978 can be obtained from http://www.wiley.com/legacy/wileychi/baltagi/supp/Gasoline.dat, or from the textbook website, Table 16.17.
