
Takashi Yamano

Fall Semester 2009

Lecture Notes on Advanced Econometrics

Lecture 11: Unobserved Effects and Panel Analysis


Panel Data

There are two types of panel data sets: pooled cross section data sets and longitudinal data sets. A pooled cross section data set collects cross-sectional data from the same population at different points in time, with observations sampled independently in each period. A longitudinal data set follows the same individuals, households, firms, cities, regions, or countries over time.

Many governments conduct nationwide cross-sectional surveys every year or every few years; a census is an example of such a survey. We can create a pooled cross section data set by combining these cross-sectional surveys over time. Because these surveys are often readily available from governments (although this is frequently not the case, because many government officials see no benefit in making publicly funded surveys public!), it is relatively easy to obtain pooled cross section data.

Longitudinal data, however, can provide much more detailed information, as we will see in this lecture. Because longitudinal data follow the same samples over time, we can analyze how the behavior of the sampled units changes over time.

Nonetheless, pooled cross section data can provide information that a single cross section cannot.

Pooled Cross Section Data

With pooled cross section data, we can examine changes in coefficients over time. For
instance,

yit = β0 + β1Tit + β2xit1 + β3xit2 + β4xit3 + uit        (1)

t = 1, 2
i = 1, 2, …, N, N+1, N+2, …, N+N
where Tit = 1 if t = 2 and Tit = 0 if t = 1.

The coefficient of the time dummy Tit measures a change in the constant term over time.
If we are interested in a change in a potential effect of one of the variables, then we can
use an interaction term between the time dummy and one of the variables:

yit = β0 + β1Tit + β2xit1 + δ(Tit × xit1) + β3xit2 + β4xit3 + uit        (2)

δ measures a change in the coefficient of x1 over time.
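As a minimal sketch in Stata (with hypothetical variable names: d2 for the time dummy and x1–x3 for the regressors), equation (2) could be estimated as:

* Hedged sketch of equation (2); variable names are hypothetical.
gen d2x1 = d2*x1
reg y d2 x1 d2x1 x2 x3
* The coefficient on d2x1 estimates delta, the change in the effect of x1 between the two periods.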

What if all of the coefficients change over time? To examine whether there is a structural change, we can use the Chow test. To conduct the Chow test, consider the following model:

yit = β0 + β1xit1 + β2xit2 + β3xit3 + uit        (3)

for t = 1 , 2.

We consider this model a restricted model because we are imposing the restriction that all of the coefficients remain the same over time. There are k + 1 restrictions (in this case 4 restrictions).

Unrestricted models are

yit = δ0 + δ1xit1 + δ2xit2 + δ3xit3 + eit     for t = 1        (4)

and

yit = α0 + α1xit1 + α2xit2 + α3xit3 + εit     for t = 2        (5)

The coefficients of the first model (t = 1) are not restricted to be the same as those in the second model (t = 2). If all of the coefficients remain the same over time, i.e., βj = δj = αj, then the sum of squared residuals from the restricted model (SSRr) should be approximately equal to the sum of the sums of squared residuals from the two unrestricted models (SSRur1 + SSRur2).

On the other hand, if there is a structural change, i.e., the coefficients change over time, then the sum of SSRur1 and SSRur2 should be smaller than SSRr, because the unrestricted coefficients fit the data more closely than the restricted ones. We therefore take the difference between SSRr and (SSRur1 + SSRur2) and examine whether the difference is statistically significant:

F = {[SSRr − (SSRur1 + SSRur2)] / (k + 1)} / {(SSRur1 + SSRur2) / [N1 + N2 − 2(k + 1)]}

This is called the Chow test.

Alternatively, we can create interaction terms between the time dummy and all of the independent variables (including the constant term, i.e., the time dummy itself) and conduct an F-test on the coefficients of these k + 1 interaction terms. This is equivalent to the Chow test.
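As a minimal sketch (again with hypothetical variable names, d2 being the second-period dummy), the interaction version of the test could be run as:

* Hedged sketch of the Chow test via interactions; variable names are hypothetical.
gen d2x1 = d2*x1
gen d2x2 = d2*x2
gen d2x3 = d2*x3
reg y x1 x2 x3 d2 d2x1 d2x2 d2x3
test d2 d2x1 d2x2 d2x3
* The joint F-test on the k+1 interaction terms reproduces the Chow test statistic.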

Difference-in-Differences Estimator

In many economic analyses, we are interested in policy variables, or politically interesting variables, and how these variables affect people’s lives. However, evaluating the impacts of various policies is difficult because most policies are not implemented under experimental designs.

For instance, suppose the government of a low-income country decides to invest in health facilities to improve child health, and suppose this particular government decides to start with the most needy communities. After some years, the government wants to evaluate the impacts of the investment in health facilities and conducts a cross-sectional survey. However, the government finds a negative correlation between newly built health facilities and child health. What happened?

[Figure: Child health (vertical axis) against time (periods 1 and 2). The upper line, E(Hit: z = 0), shows communities without investments; the lower line, E(Hit: z = 1), shows communities with investments. The gap between the two lines narrows between period 1 and period 2.]

The problem is that the government built health facilities in the communities with poor child health.

In the figure above, child health in a poor community i with the government investments (z) has improved over time, but its absolute level is still not as good as child health in the richer communities without the government investments. Thus, an OLS model with a dummy variable for the government investments in health facilities will find a negative coefficient on Z:

Hit = β0 + β1Zi + uit        (6)

for i = 1, …, N communities.

When we find a negative coefficient, or an effect opposite to what is expected, we call this reverse causality.

From the figure, it is obvious that we need to measure a difference between the two
groups for each time period and measure a net change in the differences over time:

δ = [E(Hi2: Z = 1) − E(Hi2: Z = 0)] − [E(Hi1: Z = 1) − E(Hi1: Z = 0)]        (7)

Although both differences are negative, the difference between the two groups in the second period is much smaller (in absolute value) than the difference in the first period. Thus, the net change is positive, and it measures the net impact of Z on H. We call the δ in (7) the difference-in-differences (DID) estimator.

The difference-in-differences estimator can be obtained by estimating the following model:

Hit = β0 + β1T + β2Zi + δ(T × Zi) + uit        (8)

where T is a dummy variable for the second period.

We can think of this example as a kind of omitted variables problem. We can rewrite (6) as

Hit = β0 + β1Zi + vi + uit        (9)

where vi is an important unobserved variable (an unobserved fixed effect) which is correlated with both the government investments and child health. Let’s say that vi measures the lack of basic infrastructure in community i: the larger vi is, the poorer the basic infrastructure. Because the government targets poor communities for the investments, vi and Zi are positively correlated. But vi and Hit are negatively correlated because vi measures the lack of basic infrastructure. Therefore, the estimated coefficient of Zi will be biased downward, which produces the appearance of reverse causality.

The First Differenced Estimation

Let’s go back to the DID estimator and rearrange it so that the first term measures the difference in Hit for community i over time:

δ = [E(Hi2: Z = 1) − E(Hi1: Z = 1)] − [E(Hi2: Z = 0) − E(Hi1: Z = 0)]        (10)

Here the first term measures the change over time for the treatment group (T) and the second term measures the change for the comparison group (C).

In regression form, we can also rearrange (8). Let’s write equation (8) with an unobserved fixed effect:

Hit = β0 + β1T + β2Zi + δ(T × Zi) + vi + uit        (11)

Now, the problem is that Zi could be correlated with vi, which may also be correlated with Hit. For the first period (thus T = 0), equation (11) is

Hi1 = β0 + β2Zi + vi + ui1

and for the second period (T = 1):

Hi2 = (β0 + β1) + (β2 + δ)Zi + vi + ui2

Then, by taking the first difference, we have

Hi2 − Hi1 = β1 + δZi + (ui2 − ui1)        (12)

Notice that the unobserved fixed effect vi has been excluded from this model because it is fixed over time. In the first-differenced equation (12), Zi will not be correlated with the error term.
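As a minimal sketch (hypothetical variable names: H for child health, Z for the program dummy, T for the second-period dummy), the DID model (8) and its first-differenced version (12) could be estimated as:

* Hedged sketch of the DID regression (8); variable names are hypothetical.
gen TZ = T*Z
reg H T Z TZ
* delta is the coefficient on TZ.

* Equivalent first-differenced version (12), one observation per community,
* where dH is H in period 2 minus H in period 1:
reg dH Z
* delta is the coefficient on Z; the constant estimates beta1.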

Quasi-experimental and Experimental Designs

From this point of view, it is obvious that under a nonrandom assignment of Z (a quasi-experimental design), δ in (8) could be biased because Z (a program indicator) could be correlated with unobserved factors that may also be correlated with H (the dependent variable).

In contrast, under a random assignment of Z (an experimental design), Z will not be correlated with any unobserved factors. Thus the difference-in-differences estimator will provide reliable estimates of the impacts of programs on outcomes.

Under an experimental design, the group of individuals or observations that receives benefits from a given policy is called a treatment group or an experimental group, and the group of non-beneficiaries is called a control group. Under a quasi-experimental design, the group of non-beneficiaries is called a comparison group; we reserve the name “control group” for experimental designs.

In social science, it is difficult to conduct experimentally designed programs because of ethical and political difficulties. But experimentally designed programs can provide very useful information about the effectiveness of public (or private) policies.

A recent example of an experimentally designed project is PROGRESA in Mexico, designed and researched by IFPRI. See www.ifpri.org.

For a summary of the U.S. experience, see Grossman (1994), “Evaluating Social Policies: Principles and U.S. Experience,” World Bank Research Observer, vol. 9: 159-80.

Thus, we have dealt with an omitted variable problem by taking a difference over time.
Next, we study the omitted fixed effect problem in general.

The Omitted Variables Problem Revisited

Suppose that the correctly specified regression model is

y = Xβ + u = X1β1 + X2β2 + u

where X1 and X2 have k1 and k2 columns, respectively. Suppose we regress y on X1 without including X2 (X2 represents the omitted variables). The OLS estimator is

β̂1 = (X1′X1)⁻¹X1′y = (X1′X1)⁻¹X1′(X1β1 + X2β2 + u)
    = β1 + (X1′X1)⁻¹X1′X2β2 + (X1′X1)⁻¹X1′u

By taking the expectation of both sides, we have

E(β̂1) = β1 + (X1′X1)⁻¹X1′X2β2 = β1 + δ̂12β2

Note that the second term involves δ̂12 = (X1′X1)⁻¹X1′X2, the matrix of slopes from least squares regressions of the columns of X2 on the columns of X1.

Thus, unless either δ̂12 = 0 or β2 = 0, β̂1 is biased.
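A small simulation with entirely hypothetical numbers can illustrate the formula: when x2 is omitted, the slope on x1 picks up δ12·β2 in addition to β1.

* Hedged sketch: simulating omitted variable bias (all values hypothetical).
clear
set seed 12345
set obs 1000
gen x1 = rnormal()
gen x2 = 0.5*x1 + rnormal()   // delta12 is about 0.5
gen y = 1 + 2*x1 + 3*x2 + rnormal()
reg y x1          // slope on x1 is about 2 + 0.5*3 = 3.5 (biased)
reg y x1 x2       // slope on x1 is about 2 (unbiased)
reg x2 x1         // recovers delta12, about 0.5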

To overcome the omitted variables problem, we can take two different approaches. The first is to use panel data. As you will see later, by using panel (longitudinal) data, we can eliminate unobserved variables that are specific to each sample unit and fixed (time-invariant) over time. Note, however, that this only eliminates the correlation between the independent variables and the fixed effect. If the independent variables are correlated with the error term, which contains time-varying unobserved characteristics, then the estimated coefficients will still be biased.

The second approach is to use instrumental variables that are correlated with the independent variables suspected of being correlated with the unobserved variables, but uncorrelated with the error term. Unlike the fixed effects model, the IV method eliminates any correlation between the independent variables and the error term. Thus, this method is theoretically appealing. The major problem with the IV method is the availability of plausible instrumental variables that are sufficiently correlated with the endogenous variables and uncorrelated with the error term. Often, if not always, it is very difficult to find plausible instrumental variables. We will discuss problems with the IV method elsewhere in the lecture notes.
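As a minimal sketch only (hypothetical variable names: x1 endogenous, z an instrument), a two-stage least squares estimation in Stata might look like:

* Hedged sketch: 2SLS with a hypothetical instrument z for the endogenous x1.
ivregress 2sls y x2 x3 (x1 = z), first
* The "first" option reports the first-stage regression of x1 on z, x2, and x3,
* which helps check whether z is sufficiently correlated with x1.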

Linear Unobserved Effects

What are unobserved variables? It is impossible to collect, in surveys, all of the variables that affect people’s economic activities. Thus, it is inevitable that our estimation models contain unobserved variables. What, then, should we do? First, we should characterize the possible unobserved variables.

The most common type of unobserved variable is a fixed effect. A fixed effect is a time-invariant characteristic of an individual or a group (or cluster). For instance, ai may represent a fixed characteristic of individual i. Another example is aj, which represents a fixed characteristic of a group (cluster) j; this could be a regional fixed effect or a cluster fixed effect.

Suppose we want to estimate the following model with a group fixed effect:

yit = β0 + β1xit1 + ... + βKxitK + vi + uit        (13)

In this case, as long as the unobserved variables (that are correlated with the independent variables and the dependent variable) are fixed characteristics of groups, we can eliminate the omitted variables problem by explicitly including group dummies:

yit = β0 + β1xit1 + ... + βKxitK + v2V2 + ... + vNVN + uit        (14)

For instance, it is common practice to include district or village dummies in cross-sectional data. In a cross-sectional data set, however, it is impossible to include individual dummies for all sample units because we have only one observation per unit: we would need n − 1 dummies for n observations.

If we have multiple observations for each sample unit (thus we need longitudinal data, not pooled cross-sectional data over time), then it is possible to include n − 1 dummies for s × n observations, where s is the number of observations per unit. We then estimate equation (14), which is called the Dummy Variable Regression model. In this model, we have eliminated the unobserved fixed effects by explicitly including individual dummy variables.
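As a minimal sketch (hypothetical variable names), equation (14) could be estimated by adding a full set of group dummies; Example 1 below does the same thing with firm dummies (i.fcode):

* Hedged sketch of the dummy variable regression (14); names are hypothetical.
xi: reg y x1 x2 x3 i.village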

A different way of eliminating the fixed effects is to use the first difference model, as we have seen earlier. Here let us reconsider the first difference model in a more general treatment. Suppose, again, that we have the following model for time t = 1 and t = 2:

yi1 = β0 + β1xi11 + ... + βKxi1K + vi + ui1

and

yi2 = β0 + β1xi21 + ... + βKxi2K + vi + ui2

By subtracting the model for t = 1 from the model for t = 2, we have

yi2 − yi1 = β1(xi21 − xi11) + ... + βK(xi2K − xi1K) + ui2 − ui1

or

Δyi = β1Δxi1 + ... + βKΔxiK + Δui        (15)

This is called the first difference model. Notice that the individual fixed effect, vi, has been eliminated. Thus, as long as the new error term is uncorrelated with the new independent variables, the estimators will be unbiased.

Some notes: First, a first-differenced independent variable, ΔxiK, must have some variation across i. For instance, because a gender dummy does not change over time, the first-differenced gender dummy is zero for all i. Thus, you cannot estimate coefficients on time-invariant independent variables in first difference models. Second, differenced independent variables lose variation, so the estimators often have large standard errors; a large sample helps to estimate the parameters precisely.

Fixed Effect Estimation

In the previous section, we studied the first differenced model, concerning the correlation between a policy variable and an unobserved fixed effect. Here, we generalize the model. Consider the following model with T periods and K variables:

yit = β0 + β1xit1 + ... + βKxitK + vi + uit        (16)

The omitted unobserved fixed effect could be correlated with any of the K independent variables.

To remove the fixed effect, one can subtract from each variable its mean over time for each unit i:

yit − ȳi = β1(xit1 − x̄i1) + ... + βK(xitK − x̄iK) + (vi − vi) + (uit − ūi)        (17)

As you can see, the unobserved fixed effect drops out of the model. This model is called the fixed effect (within) estimation. To estimate the fixed effect model, you transform each variable by subtracting its unit-specific mean and estimate OLS with the transformed data (the time-demeaned data). In STATA, you do not need to transform the data yourself; you just need to use the command “xtreg y x1 x2 … xk, fe i(id).” See the manuals under “xtreg.”
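As a minimal sketch (hypothetical variable names, with id identifying the panel unit), the time-demeaning can also be done by hand; the slope estimate should match xtreg, fe, although the reported standard errors differ slightly because of degrees-of-freedom corrections:

* Hedged sketch: manual within (time-demeaning) transformation; names are hypothetical.
bysort id: egen ybar = mean(y)
bysort id: egen x1bar = mean(x1)
gen ydm = y - ybar
gen x1dm = x1 - x1bar
reg ydm x1dm, noconstant
* The slope on x1dm matches the fixed effect estimate from: xtreg y x1, fe i(id)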

One drawback of the fixed effect estimation is that time-invariant variables are also excluded from the model. For instance, consider a typical wage model where the dependent variable is log(wage). Some individual characteristics, such as education and gender, are time-invariant (fixed over time). Thus, if you are interested in the effects of time-invariant variables, you cannot estimate the coefficients of such variables. What you can do, however, is estimate changes in the effects of such time-invariant variables over time.
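As a minimal sketch (hypothetical variable names: lwage for log wage, educ for years of schooling, d2 for the second-period dummy), the change in the return to education can be estimated even though educ itself drops out of the fixed effect model:

* Hedged sketch: a time-invariant variable interacted with a period dummy; names are hypothetical.
gen d2educ = d2*educ
xtreg lwage exper d2 d2educ, fe i(id)
* educ itself would be dropped, but the coefficient on d2educ measures
* how the effect of education changed between the two periods.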

Example 1: OLS, Fixed Effect, First-Differenced, and LSDV models

. use c:\docs\fasid\econometrics\homework\JTRAIN.dta;
. keep if year==1988|year==1989;
(157 observations deleted)
. replace sales=sales/10000;
(254 real changes made)

. ** OLS;
. reg hrsemp grant employ sales union d89;

Source | SS df MS Number of obs = 220


---------+------------------------------ F( 5, 214) = 2.07
Model | 64302349.8 5 12860470.0 Prob > F = 0.0703
Residual | 1.3291e+09 214 6210560.99 R-squared = 0.0461
---------+------------------------------ Adj R-squared = 0.0239
Total | 1.3934e+09 219 6362385.39 Root MSE = 2492.1

------------------------------------------------------------------------------
hrsemp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
grant | 824.4124 419.8616 1.964 0.051 -3.181518 1652.006
employ | -3.411366 4.232316 -0.806 0.421 -11.75373 4.931001
sales | .0431946 .3313198 0.130 0.896 -.6098735 .6962627
union | 942.0082 442.0818 2.131 0.034 70.61581 1813.401
d89 | -287.6238 338.4361 -0.850 0.396 -954.7191 379.4715
_cons | 156.7909 300.7423 0.521 0.603 -436.0056 749.5874
------------------------------------------------------------------------------

. ** Fixed Effect Model;


. xtreg hrsemp grant employ sales union d89, fe i(fcode);

Fixed-effects (within) regression Number of obs = 220


Group variable (i) : fcode Number of groups = 114

R-sq: within = 0.0322 Obs per group: min = 1


between = 0.0106 avg = 1.9
overall = 0.0212 max = 2

F(4,102) = 0.85
corr(u_i, Xb) = -0.0144 Prob > F = 0.4972

------------------------------------------------------------------------------
hrsemp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
grant | 846.6938 552.6861 1.532 0.129 -249.5565 1942.944
employ | 1.504937 20.69735 0.073 0.942 -39.54816 42.55803
sales | -.0847216 .9174962 -0.092 0.927 -1.904571 1.735128
union | (dropped)
d89 | -292.2915 368.5441 -0.793 0.430 -1023.297 438.714
_cons | 137.1313 917.5104 0.149 0.881 -1682.746 1957.009
------------------------------------------------------------------------------
sigma_u | 1742.3611
sigma_e | 2578.396
rho | .31349012 (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(113,102) = 0.87 Prob > F = 0.7716

. ** LSDV model;
. xi: reg hrsemp grant employ sales union d89 i.fcode;
i.fcode Ifcod1-157 (Ifcod1 for fcode==410032 omitted)

Source | SS df MS Number of obs = 220


---------+------------------------------ F(117, 102) = 0.92
Model | 715253529 117 6113278.03 Prob > F = 0.6706
Residual | 678108872 102 6648126.19 R-squared = 0.5133
---------+------------------------------ Adj R-squared = -0.0449
Total | 1.3934e+09 219 6362385.39 Root MSE = 2578.4

------------------------------------------------------------------------------
hrsemp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
grant | 846.6938 552.6861 1.532 0.129 -249.5565 1942.944
employ | 1.504937 20.69735 0.073 0.942 -39.54816 42.55803
sales | -.0847216 .9174962 -0.092 0.927 -1.904571 1.735128
union | -838.5586 5723.478 -0.147 0.884 -12191.05 10513.93
d89 | -292.2915 368.5441 -0.793 0.430 -1023.297 438.714
Ifcod2 | -192.7619 3889.521 -0.050 0.961 -7907.609 7522.085
Ifcod3 | -203.2925 4021.587 -0.051 0.960 -8180.091 7773.506

Output omitted…

. ** First Differenced Model;


. reg dhrsemp dgrant demploy dsales;

Source | SS df MS Number of obs = 106


---------+------------------------------ F( 3, 102) = 0.81
Model | 32189365.8 3 10729788.6 Prob > F = 0.4928
Residual | 1.3562e+09 102 13296252.1 R-squared = 0.0232
---------+------------------------------ Adj R-squared = -0.0055
Total | 1.3884e+09 105 13222924.6 Root MSE = 3646.4

------------------------------------------------------------------------------
dhrsemp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
dgrant | 846.6938 552.6861 1.532 0.129 -249.5565 1942.944
demploy | 1.504937 20.69735 0.073 0.942 -39.54816 42.55803
dsales | -.0847216 .9174962 -0.092 0.927 -1.904571 1.735128
_cons | -292.2915 368.5441 -0.793 0.430 -1023.297 438.714

End of Example 1

Example 2: Transforming The Panel Data

In general, panel data are stacked vertically. For instance, in JTRAIN.dta, the two observations (actually there are three years of observations, but I dropped one year) for each firm are stacked vertically:

. list fcode year d89 employ

firm code year d89 # of employees


1. 410032 1988 0 131
2. 410032 1989 1 123
3. 410440 1988 0 13
4. 410440 1989 1 14
5. 410495 1988 0 25
6. 410495 1989 1 24

To construct differenced variables, you need to “linearize” the vertical (long) data into a wide form with one row per firm. Here is an example:

. do "C:\WINDOWS\TEMP\STD0c0000.tmp"

. #delimit;
delimiter now ;
. clear;
. set more off;
. set matsize 800;
. set memory 100m;
(102400k)

. *** Obtain data from 1988;


. clear;
. use c:\docs\fasid\econometrics\homework\JTRAIN.dta;
. keep year fcode employ;
. keep if year==1988;
(314 observations deleted)
. rename employ employ88;
. drop year;
. sort fcode;
. save c:\docs\tmp\jtrain88.dta, replace;
file c:\docs\tmp\jtrain88.dta saved

. *** Obtain data from 1989;


. clear;
. use c:\docs\fasid\econometrics\homework\JTRAIN.dta;
. keep fcode year employ;
. keep if year==1989;
(314 observations deleted)
. rename employ employ89;
. drop year;
. sort fcode;
. save c:\docs\tmp\jtrain89.dta, replace;
file c:\docs\tmp\jtrain89.dta saved

. ** Combine the two years;


. clear;

. use c:\docs\tmp\jtrain89.dta;
. sort fcode;
. merge fcode using c:\docs\tmp\jtrain88.dta;
. gen demploy=employ89-employ88;
(11 missing values generated)

. list fcode employ89 employ88 demploy in 1/3;


fcode employ89 employ88 demploy
1. 410032 123 131 -8
2. 410440 14 13 1
3. 410495 24 25 -1

End of Example 2
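As an aside, a hedged sketch: Stata’s reshape command can produce the same wide-form data in one step (treat this as an illustration rather than part of the original example):

* Hedged sketch: the same wide-form data via reshape (one row per firm).
use JTRAIN.dta, clear
keep fcode year employ
keep if year==1988 | year==1989
reshape wide employ, i(fcode) j(year)
gen demploy = employ1989 - employ1988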
