Lab Introduction To STATA
STATA is a powerful statistical package which is also easy to use. This handout
presents the rudiments of STATA. It is by no means comprehensive, but it will allow
you to do basic panel regression analysis.
STATA's data editor works like a spreadsheet: within STATA, open the data editor (or type
edit) and enter the data manually. Each column represents a variable and each cell holds
the value of that variable for one observation.
STATA expects a single matrix or table of data from a single sheet, with at most one
line of text at the start defining the contents of the columns. Using your Windows
computer,
1. Start EXCEL
2. Enter data in rows and columns (or open the EXCEL file of interest)
3. Highlight the data of interest, then pull down Edit and choose Copy
4. Start STATA
5. Select the data editor
8. Exit edit
Click Close
9. Save the data file (*.dta) – any file name you like
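As an alternative to copy and paste, the worksheet can usually be read with a command. A minimal sketch, assuming Stata 12 or later and a hypothetical workbook named mydata.xlsx whose first row holds the variable names:

```stata
* Hypothetical file names; replace them with your own
* firstrow tells STATA that the first row contains variable names
import excel using "mydata.xlsx", firstrow clear
* Check the variable names and storage types
describe
* Store the result as a STATA data file
save "mydata.dta", replace
```

This follows the same rule as the manual route: one sheet, one table, and at most one header row.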
An example using STATA
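Suppose we are interested in estimating the following model based on our panel of
countries:
\ln FD_{it} = \beta_0 + \beta_1 \ln RGDPC_{it} + \beta_2 \ln INS_{it} + \beta_3 \ln FDI_{it} + \varepsilon_{it}
where i indexes countries, t indexes years, and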
FD = financial development (% of GDP)
RGDPC = real GDP per capita (constant 2005 US dollars)
INS = institutions (scaled 1 - 100)
FDI = foreign direct investment (US$)
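The regressions in this handout use the logged variables lfd, lrgdpc, lins and lfdi. A sketch of how they might be created and the panel declared, assuming the raw variables are named fd, rgdpc, ins and fdi (hypothetical names) and that code (numeric) and year identify the country and the period:

```stata
* Raw variable names (fd, rgdpc, ins, fdi) are assumed for illustration
generate lfd    = ln(fd)
generate lrgdpc = ln(rgdpc)
generate lins   = ln(ins)
generate lfdi   = ln(fdi)
* Declare the panel structure: country identifier first, then time variable
xtset code year
```

The xt commands used later (xtreg, xtpcse and so on) rely on this xtset declaration.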
• Pooled OLS Estimation (Command: regress or reg)
.regress lfd lrgdpc lins lfdi
------------------------------------------------------------------------------
lfd | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lrgdpc | -.0280911 .0665201 -0.42 0.673 -.1594256 .1032435
lins | .9882674 .2226742 4.44 0.000 .5486288 1.427906
lfdi | .1823524 .0336401 5.42 0.000 .1159348 .24877
_cons | -1.27406 .5580983 -2.28 0.024 -2.375945 -.1721739
------------------------------------------------------------------------------
We wish to test whether RE (GLS) is necessary or whether pooled OLS (simple OLS) will do.
In other words, we check whether the data contain country-specific effects, i.e. heterogeneity
(u_i). First estimate the random effects model:
.xtreg lfd lrgdpc lins lfdi, re
------------------------------------------------------------------------------
lfd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lrgdpc | .3001345 .0925208 3.24 0.001 .1187972 .4814719
lins | .3380844 .2032213 1.66 0.096 -.0602219 .7363908
lfdi | .0511819 .0380584 1.34 0.179 -.0234111 .1257749
_cons | .0057854 .8241524 0.01 0.994 -1.609524 1.621094
-------------+----------------------------------------------------------------
sigma_u | .42025734
sigma_e | .25797259
rho | .72631933 (fraction of variance due to u_i)
------------------------------------------------------------------------------
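The rho reported above can be reproduced by hand from sigma_u and sigma_e, since it is the share of the total variance attributed to u_i. A quick check using the figures in the output:

```stata
* rho = sigma_u^2 / (sigma_u^2 + sigma_e^2)
display .42025734^2 / (.42025734^2 + .25797259^2)
```

The result is approximately 0.726, in line with the rho row of the table.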
.xttest0
Estimated results:
| Var sd = sqrt(Var)
---------+-----------------------------
lfd | .440091 .6633935
e | .0665499 .2579726
u | .1766162 .4202573
Test: Var(u) = 0
chi2(1) = 469.12
Prob > chi2 = 0.0000
Hypotheses:
H0: \sigma_u^2 = 0 (pooled OLS model)
HA: \sigma_u^2 > 0 (random effects: heterogeneity)
▪ The p-value < 0.05 – Reject H0. The random effect model is more appropriate
than OLS (pooled OLS model). In other words there are country-specific
effects (heterogeneity) in the data.
▪ The second test that is commonly used in applied panel data analysis seeks
to determine which is more appropriate: Random or fixed effects?
H0: Cov(u_i, x_it) = 0 (no correlation between u_i and x_it: random effects)
HA: Cov(u_i, x_it) ≠ 0 (correlation between u_i and x_it: fixed effects)
.xtreg lfd lrgdpc lins lfdi, fe (fixed effects option: within-groups FE)
------------------------------------------------------------------------------
lfd | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lrgdpc | .5804713 .1503694 3.86 0.000 .2834632 .8774794
lins | .5043667 .222531 2.27 0.025 .0648258 .9439076
lfdi | -.0198562 .0476584 -0.42 0.678 -.1139906 .0742782
_cons | -2.239735 1.326785 -1.69 0.093 -4.860386 .3809163
-------------+----------------------------------------------------------------
sigma_u | .7311766
sigma_e | .25797259
rho | .88929927 (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(9, 157) = 35.40 Prob > F = 0.0000
‘rho’ is known as the intraclass correlation: 88.93% of the variance is due to
differences across panels.
For the F test that all u_i=0: H0: Common intercept; HA: Different intercepts.
The p-value < 0.05, so we reject H0: each country has a different intercept (this
justifies FE).
.est store fixed
.xtreg lfd lrgdpc lins lfdi, re
------------------------------------------------------------------------------
lfd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lrgdpc | .3001345 .0925208 3.24 0.001 .1187972 .4814719
lins | .3380844 .2032213 1.66 0.096 -.0602219 .7363908
lfdi | .0511819 .0380584 1.34 0.179 -.0234111 .1257749
_cons | .0057854 .8241524 0.01 0.994 -1.609524 1.621094
-------------+----------------------------------------------------------------
sigma_u | .42025734
sigma_e | .25797259
rho | .72631933 (fraction of variance due to u_i)
------------------------------------------------------------------------------
.hausman fixed
---- Coefficients ----
| (b) (B) (b-B) sqrt(diag(V_b-V_B))
| fixed . Difference S.E.
-------------+----------------------------------------------------------------
lrgdpc | .5804713 .3001345 .2803368 .1185364
lins | .5043667 .3380844 .1662823 .0906707
lfdi | -.0198562 .0511819 -.0710381 .028686
------------------------------------------------------------------------------
b = consistent under Ho and Ha; obtained from xtreg
B = inconsistent under Ha, efficient under Ho; obtained from xtreg
chi2(3) = (b-B)'[(V_b-V_B)^(-1)](b-B)
= 17.46
Prob>chi2 = 0.0006
▪ The p-value < 0.05, so we reject H0 and use the fixed effects model. In general, the
command sequence is:
. xtreg Y X1 X2, fe
. est store fixed
. xtreg Y X1 X2, re
. hausman fixed
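Applied to the example, the whole model-selection sequence might look as follows. This is a sketch that assumes the data have already been declared as a panel with xtset; the name random is an arbitrary label chosen here.

```stata
* Random effects, then the Breusch-Pagan LM test (RE versus pooled OLS)
xtreg lfd lrgdpc lins lfdi, re
xttest0
* Fixed effects and random effects, each stored under a name
xtreg lfd lrgdpc lins lfdi, fe
estimates store fixed
xtreg lfd lrgdpc lins lfdi, re
estimates store random
* Hausman test: consistent estimator (FE) first, efficient estimator (RE) second
hausman fixed random
```

Naming both sets of estimates makes the comparison explicit and avoids relying on whichever model happened to be estimated last.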
Diagnostic Checks
a. Multicollinearity
.regress lfd lrgdpc lins lfdi
------------------------------------------------------------------------------
lfd | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lrgdpc | -.0280911 .0665201 -0.42 0.673 -.1594256 .1032435
lins | .9882674 .2226742 4.44 0.000 .5486288 1.427906
lfdi | .1823524 .0336401 5.42 0.000 .1159348 .24877
_cons | -1.27406 .5580983 -2.28 0.024 -2.375945 -.1721739
------------------------------------------------------------------------------
.vif
Variable | VIF 1/VIF
-------------+----------------------
lrgdpc | 8.37 0.119413
lins | 7.02 0.142451
lfdi | 1.60 0.624341
-------------+----------------------
Mean VIF | 5.67
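Each VIF equals 1/(1 - R-squared) from a regression of that regressor on the remaining regressors. A sketch that reproduces the lrgdpc figure by hand, assuming the same estimation sample as the pooled OLS model:

```stata
* Auxiliary regression of one regressor on the other regressors
regress lrgdpc lins lfdi
* VIF for lrgdpc; e(r2) is the R-squared stored by regress
display 1/(1 - e(r2))
```

The displayed value should agree with the VIF reported for lrgdpc in the table.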
b. Heteroskedasticity
.xtreg lfd lrgdpc lins lfdi, fe (since our final model is FE)
Fixed-effects (within) regression Number of obs = 170
Group variable: code Number of groups = 10
F(3,157) = 12.25
corr(u_i, Xb) = -0.8095 Prob > F = 0.0000
------------------------------------------------------------------------------
lfd | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lrgdpc | .5804713 .1503694 3.86 0.000 .2834632 .8774794
lins | .5043667 .222531 2.27 0.025 .0648258 .9439076
lfdi | -.0198562 .0476584 -0.42 0.678 -.1139906 .0742782
_cons | -2.239735 1.326785 -1.69 0.093 -4.860386 .3809163
-------------+----------------------------------------------------------------
sigma_u | .7311766
sigma_e | .25797259
rho | .88929927 (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(9, 157) = 35.40 Prob > F = 0.0000
.xttest3
Conclusion: The p-value is < 0.05, so we reject H0 of constant variance across countries.
The variances are not constant (there is a heteroskedasticity problem).
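c. Serial Correlation
xtserial is a user-written command. If it is not yet installed, search for it (for example,
type findit xtserial), double click the http link in the results, then select (click here to install).
.xtserial lfd lrgdpc lins lfdi (no prior estimation is needed; you do not have to run the FE model first)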
Conclusion: The p-value is < 0.05, reject the H0. This means that there is a serial
correlation problem.
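The two diagnostic tests can be collected in one short sequence. Both xttest3 and xtserial are user-written commands; the sketch below assumes an internet connection, that xttest3 is still distributed through SSC, and that xtserial has already been installed.

```stata
* One-time installation of the user-written test (assumes SSC access)
ssc install xttest3
* Modified Wald test for groupwise heteroskedasticity; run after the FE model
xtreg lfd lrgdpc lins lfdi, fe
xttest3
* Wooldridge test for serial correlation; needs no prior estimation
xtserial lfd lrgdpc lins lfdi
```

In both tests a small p-value points to a problem: unequal error variances across countries for xttest3, first-order serial correlation for xtserial.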
How to rectify/overcome?
Refer to Table 1 (using the robust standard error commands).
Source: Hoechle, Daniel. (2014) “Robust Standard Errors for Panel Regressions with
Cross-Sectional Dependence”, Stata Journal, page 4.
http://fmwww.bc.edu/repec/bocode/x/xtscc_paper.pdf
In our example, the diagnostic checks indicate there are two problems:
(i) heteroskedasticity, and
(ii) serial correlation.
To rectify: use OLS with heteroskedasticity- and serial-correlation-robust
standard errors.
. regress lfd lrgdpc lins lfdi ndum1 ndum2 ndum3 ndum4 ndum5 ndum6
ndum7 ndum8 ndum9, cluster (code)
Linear regression Number of obs = 170
F( 2, 9) = .
Prob > F = .
R-squared = 0.8595
Root MSE = .25797
or
F(3,9) = 1.10
corr(u_i, Xb) = -0.8095 Prob > F = 0.3987
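The two alternatives above might be run as follows. The xtreg line is an assumption, since only its output is shown; clustering on code gives standard errors that are robust to heteroskedasticity and to serial correlation within a country. The last lines use xtscc, the user-written command described in the Hoechle paper cited above, assuming it can be installed from SSC.

```stata
* LSDV with country dummies and cluster-robust standard errors
regress lfd lrgdpc lins lfdi ndum1-ndum9, vce(cluster code)
* Within (FE) estimator with the same clustering
xtreg lfd lrgdpc lins lfdi, fe vce(cluster code)
* Driscoll-Kraay standard errors (user-written; assumes SSC access)
ssc install xtscc
xtscc lfd lrgdpc lins lfdi, fe
```

The coefficient estimates are the same as in the unadjusted FE model; only the standard errors and test statistics change.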
However, if the diagnostic checks indicate only a heteroskedasticity problem:
(ii) Panel-corrected Standard Errors, xtpcse
------------------------------------------------------------------------------
| Panel-corrected
lfd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lrgdpc | .3366847 .0698404 4.82 0.000 .1998 .4735694
lins | -.2289813 .2201351 -1.04 0.298 -.6604383 .2024756
lfdi | .047209 .0461653 1.02 0.306 -.0432733 .1376913
_cons | 2.030366 .7517767 2.70 0.007 .5569104 3.503821
-------------+----------------------------------------------------------------
rho | .9264655
------------------------------------------------------------------------------
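The output above reports a single rho, which is what xtpcse prints when a common AR(1) term is requested, so the command was probably of the following form. This is an inference from the output rather than the handout's own command, and the data must first be declared with xtset, including a time variable.

```stata
* Panel-corrected standard errors with a common AR(1) coefficient
xtpcse lfd lrgdpc lins lfdi, correlation(ar1)
* Variant that allows only for panel-level heteroskedasticity
xtpcse lfd lrgdpc lins lfdi, hetonly
```

The hetonly variant is the one that matches a diagnosis of heteroskedasticity without serial correlation.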
Summarize the above results:
Multicollinearity (mean VIF)     _    _    5.67                  _
Heteroskedasticity (χ²-stat)     _    _    641.79 (0.0000)***    _
Serial Correlation (F-stat)      _    _    266.28 (0.0000)***    _
1. Figures in the parentheses are t-statistics, except for Breusch-Pagan LM test, Hausman
test, Heteroskedasticity and Serial Correlation tests, which are p-values.
2. ** and *** indicate the respective 5% and 1% significance levels.
d. Cook’s Distance Outlier Test
Collect the Cook’s Distance residuals then generate the cutoff point, then list the
outliers
d1          cutoff
.0002801    0
.0012586    0
.0005517    0
.010484     0
.0432518    1
.0238333    1
.0214515    0
.0198299    0

     +----------------------+
     | country           d1 |
     |----------------------|
 37. | Indonesia   .0432518 |
 38. | Indonesia   .0238333 |
 52. | Japan       .0235541 |
 53. | Japan       .0329401 |
 54. | Japan       .0389948 |
     |----------------------|
138. | Thailand    .0327251 |
154. | Vietnam     .0366815 |
155. | Vietnam     .0288114 |
156. | Vietnam      .025521 |
     +----------------------+
The first two observations flagged as outliers (cutoff = 1) are Indonesia (obs 37, 38).
Regress the model without outliers
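One common way to carry out these steps is sketched below. It assumes the pooled OLS model and the conventional cutoff of 4/N, which is consistent with the flags shown above (4/170 is about .0235); the handout's own commands may have differed.

```stata
* Cook's distance is available through predict after regress
regress lfd lrgdpc lins lfdi
predict d1, cooksd
* Flag observations whose distance exceeds 4/N (e(N) = number of observations)
generate cutoff = d1 > 4/e(N) if !missing(d1)
list country d1 if cutoff == 1
* Re-estimate without the flagged observations
regress lfd lrgdpc lins lfdi if cutoff == 0
```

Comparing the two sets of coefficients shows how much the flagged observations influence the results.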
Another Fixed Effect Estimation:
.regress lfd lrgdpc lins lfdi ndum2 ndum3 ndum4 ndum5 ndum6 ndum7
ndum8 ndum9 ndum10
Source | SS df MS Number of obs = 170
-------------+------------------------------ F( 12, 157) = 80.05
Model | 63.9270465 12 5.32725388 Prob > F = 0.0000
Residual | 10.4483277 157 .066549858 R-squared = 0.8595
-------------+------------------------------ Adj R-squared = 0.8488
Total | 74.3753742 169 .440090972 Root MSE = .25797
------------------------------------------------------------------------------
lfd | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lrgdpc | .5804713 .1503694 3.86 0.000 .2834632 .8774794
lins | .5043667 .222531 2.27 0.025 .0648258 .9439076
lfdi | -.0198562 .0476584 -0.42 0.678 -.1139906 .0742782
ndum2 | -1.688276 .457577 -3.69 0.000 -2.592077 -.7844744
ndum3 | -1.162159 .1163148 -9.99 0.000 -1.391903 -.932415
ndum4 | -1.703402 .5676667 -3.00 0.003 -2.824652 -.5821534
ndum5 | -1.742081 .4564517 -3.82 0.000 -2.643659 -.8405022
ndum6 | -.9299607 .2986868 -3.11 0.002 -1.519924 -.3399978
ndum7 | -1.126768 .1444713 -7.80 0.000 -1.412127 -.8414101
ndum8 | -2.198801 .5020856 -4.38 0.000 -3.190515 -1.207087
ndum9 | -.4354066 .1918365 -2.27 0.025 -.81432 -.0564931
ndum10 | -.2054068 .1297699 -1.58 0.115 -.4617269 .0509134
_cons | -1.120509 1.123154 -1.00 0.320 -3.33895 1.097933
------------------------------------------------------------------------------
Individual and Time-Specific Effects
Summary: The Fixed Effects Model (Least Squares Dummy Variable Model)
1) Firm FE
Although there are no significant temporal/time effects, there are significant differences
among firms in this type of model. While the intercept is cross-section (group) specific and in
this case differs from firm to firm, it may or may not differ over time.
2) Year/Time FE
In this case, the model would have no significant firm differences but might have
autocorrelation owing to time-lagged temporal effects. The residuals of this kind of model
may have autocorrelation in the process. In this case, the variables are homogenous across
the firms. They could be similar in region or area of focus. For example, technological
changes or national policies would lead to group specific characteristics that may affect
temporal changes in the variables being analyzed. We could account for the time effect over
the t years with t-1 dummy variables on the right-hand side of the equation.
There is another fixed effects panel model where the slope coefficients are constant, but the
intercept varies over firm as well as time. We would have a regression model with i-1 firm
dummies and t-1 time dummies. The model could be specified as follows:
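A generic two-way specification consistent with this description can be written as below. The symbols are chosen here for illustration and are assumptions: N firms, T periods, F_{j,i} = 1 if i = j (firm dummy), P_{s,t} = 1 if t = s (time dummy), and a common slope vector \beta.

```latex
y_{it} = \alpha + \sum_{j=2}^{N} \gamma_j F_{j,i} + \sum_{s=2}^{T} \delta_s P_{s,t} + x_{it}'\beta + \varepsilon_{it}
```

The first firm and the first period are the omitted base categories, which is why only N-1 firm dummies and T-1 time dummies appear.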
Testing for Individual Effects / Specific Effects for LSDV
Now suppose we would like to know whether the differences in the country effects are
statistically significant (that is, whether these countries can share a common intercept).
How do we do that?
Option 1: Perform the joint test by restricting the country dummies using F-statistics
(after running the LSDV above)
.regress lfd lrgdpc lins lfdi ndum2 ndum3 ndum4 ndum5 ndum6 ndum7
ndum8 ndum9 ndum10
Source | SS df MS Number of obs = 170
-------------+------------------------------ F( 12, 157) = 80.05
Model | 63.9270465 12 5.32725388 Prob > F = 0.0000
Residual | 10.4483277 157 .066549858 R-squared = 0.8595
-------------+------------------------------ Adj R-squared = 0.8488
Total | 74.3753742 169 .440090972 Root MSE = .25797
------------------------------------------------------------------------------
lfd | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lrgdpc | .5804713 .1503694 3.86 0.000 .2834632 .8774794
lins | .5043667 .222531 2.27 0.025 .0648258 .9439076
lfdi | -.0198562 .0476584 -0.42 0.678 -.1139906 .0742782
ndum2 | -1.688276 .457577 -3.69 0.000 -2.592077 -.7844744
ndum3 | -1.162159 .1163148 -9.99 0.000 -1.391903 -.932415
ndum4 | -1.703402 .5676667 -3.00 0.003 -2.824652 -.5821534
ndum5 | -1.742081 .4564517 -3.82 0.000 -2.643659 -.8405022
ndum6 | -.9299607 .2986868 -3.11 0.002 -1.519924 -.3399978
ndum7 | -1.126768 .1444713 -7.80 0.000 -1.412127 -.8414101
ndum8 | -2.198801 .5020856 -4.38 0.000 -3.190515 -1.207087
ndum9 | -.4354066 .1918365 -2.27 0.025 -.81432 -.0564931
ndum10 | -.2054068 .1297699 -1.58 0.115 -.4617269 .0509134
_cons | -1.120509 1.123154 -1.00 0.320 -3.33895 1.097933
------------------------------------------------------------------------------
.test ndum2 ndum3 ndum4 ndum5 ndum6 ndum7 ndum8 ndum9 ndum10
( 1) ndum2 = 0
( 2) ndum3 = 0
( 3) ndum4 = 0
( 4) ndum5 = 0
( 5) ndum6 = 0
( 6) ndum7 = 0
( 7) ndum8 = 0
( 8) ndum9 = 0
( 9) ndum10 = 0
F( 9, 157) = 35.40
Prob > F = 0.0000
Therefore, we reject the null hypothesis of a common intercept for all countries (since
the p-value < 0.05). Thus, the countries have different intercepts (there is a fixed
effect).
Note: if there are too many country dummies to type out, use the “testparm” command; the
same result will be obtained.
.regress lfd lrgdpc lins lfdi ndum2-ndum10
Source | SS df MS Number of obs = 170
-------------+------------------------------ F( 12, 157) = 80.05
Model | 63.9270465 12 5.32725388 Prob > F = 0.0000
Residual | 10.4483277 157 .066549858 R-squared = 0.8595
-------------+------------------------------ Adj R-squared = 0.8488
Total | 74.3753742 169 .440090972 Root MSE = .25797
------------------------------------------------------------------------------
lfd | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lrgdpc | .5804713 .1503694 3.86 0.000 .2834632 .8774794
lins | .5043667 .222531 2.27 0.025 .0648258 .9439076
lfdi | -.0198562 .0476584 -0.42 0.678 -.1139906 .0742782
ndum2 | -1.688276 .457577 -3.69 0.000 -2.592077 -.7844744
ndum3 | -1.162159 .1163148 -9.99 0.000 -1.391903 -.932415
ndum4 | -1.703402 .5676667 -3.00 0.003 -2.824652 -.5821534
ndum5 | -1.742081 .4564517 -3.82 0.000 -2.643659 -.8405022
ndum6 | -.9299607 .2986868 -3.11 0.002 -1.519924 -.3399978
ndum7 | -1.126768 .1444713 -7.80 0.000 -1.412127 -.8414101
ndum8 | -2.198801 .5020856 -4.38 0.000 -3.190515 -1.207087
ndum9 | -.4354066 .1918365 -2.27 0.025 -.81432 -.0564931
ndum10 | -.2054068 .1297699 -1.58 0.115 -.4617269 .0509134
_cons | -1.120509 1.123154 -1.00 0.320 -3.33895 1.097933
------------------------------------------------------------------------------
.testparm ndum2-ndum10
F( 9, 157) = 35.40
Prob > F = 0.0000
Note: This F-statistic is the same as the "F test that all u_i=0" reported at the bottom
of the xtreg Y X1 X2, fe output.
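If the dummies have not been created by hand, factor-variable notation gives the same test. A sketch, assuming Stata 11 or later and that code is the numeric country identifier:

```stata
* i.code creates the country dummies on the fly (the first country is the base)
regress lfd lrgdpc lins lfdi i.code
* Joint F test that all country dummies are zero
testparm i.code
```

This avoids typing or generating ndum2 to ndum10 when the number of countries is large.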
Testing for time-effect
Back to our example: since the final model is FE, we would now like to test whether
time fixed effects are needed when running an FE model.
This is a joint test of whether the dummies for all years are equal to 0; if they are, no
time fixed effects are needed.
Time fixed effects can be tested by creating time dummy variables.
Command to generate time dummies in the panel data (here “year” is the variable that
represents the time unit; refer to the variable's name in your dataset):
.tabulate year, generate (tdum)
.regress lfd lrgdpc lins lfdi tdum2-tdum17
------------------------------------------------------------------------------
lfd | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lrgdpc | -.0191519 .0698408 -0.27 0.784 -.1571507 .118847
lins | .917666 .2383268 3.85 0.000 .4467547 1.388577
lfdi | .2102009 .039418 5.33 0.000 .1323148 .2880871
tdum2 | .119034 .2025609 0.59 0.558 -.2812072 .5192751
tdum3 | .102125 .2028266 0.50 0.615 -.2986411 .5028912
tdum4 | -.0540308 .2031446 -0.27 0.791 -.4554253 .3473637
tdum5 | -.094798 .2032939 -0.47 0.642 -.4964875 .3068915
tdum6 | -.0256841 .2035174 -0.13 0.900 -.4278153 .3764471
tdum7 | -.0059049 .2042874 -0.03 0.977 -.4095575 .3977477
tdum8 | -.0190396 .2047366 -0.09 0.926 -.4235797 .3855005
tdum9 | -.0556393 .2052772 -0.27 0.787 -.4612477 .349969
tdum10 | -.1307652 .2055285 -0.64 0.526 -.53687 .2753396
tdum11 | -.1551338 .2075592 -0.75 0.456 -.565251 .2549835
tdum12 | -.1754475 .2089743 -0.84 0.402 -.588361 .237466
tdum13 | -.151699 .2097628 -0.72 0.471 -.5661704 .2627724
tdum14 | -.1072258 .2109714 -0.51 0.612 -.5240852 .3096337
tdum15 | -.1103227 .2134789 -0.52 0.606 -.5321369 .3114914
tdum16 | -.1038899 .2138792 -0.49 0.628 -.526495 .3187151
tdum17 | -.1108308 .2144058 -0.52 0.606 -.5344763 .3128147
_cons | -1.31946 .6046789 -2.18 0.031 -2.514248 -.1246719
------------------------------------------------------------------------------
.testparm tdum*
( 1) tdum2 = 0
( 2) tdum3 = 0
( 3) tdum4 = 0
( 4) tdum5 = 0
( 5) tdum6 = 0
( 6) tdum7 = 0
( 7) tdum8 = 0
( 8) tdum9 = 0
( 9) tdum10 = 0
(10) tdum11 = 0
(11) tdum12 = 0
(12) tdum13 = 0
(13) tdum14 = 0
(14) tdum15 = 0
(15) tdum16 = 0
(16) tdum17 = 0
F( 16, 150) = 0.29
Prob > F = 0.9969
H0: No time effects
HA: There are time effects
Similar result to the “short-cut command” above. Again, we fail to reject the null
hypothesis. Therefore, time effects are not needed.
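The same idea can be applied directly within the FE model using factor variables. A sketch, assuming Stata 11 or later and that the panel has been declared with xtset code year:

```stata
* FE model with a full set of year dummies (the first year is the base)
xtreg lfd lrgdpc lins lfdi i.year, fe
* Joint test that all year dummies are zero (H0: no time effects)
testparm i.year
```

A large p-value here leads to the same conclusion as above: time effects are not needed.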
Other STATA Commands
• To create descriptive statistics of one or more variables, type
summarize var1 var2
• When there are typos in the data or you want to recode the values of a variable, type
recode var1 460304.1=460304
• To generate a first-differenced variable, say the first difference of log salary, type
generate fdlnsalary = lnsalary - lnsalary1
Example:
Time dummy:
tabulate year, generate (tdum)
Country dummy:
tabulate code, generate (ndum)
Firm dummy:
tabulate firm, generate (fdum)
• outreg2: transfers the output to MS Word or Excel as a formatted table (saves time)
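A few of these commands in use. The sketch assumes a panel declared with xtset (so that the time-series operator D. is available), the variable names from the example above, a hypothetical output file results.doc, and that the user-written outreg2 can be installed from SSC.

```stata
* Descriptive statistics (command names are case-sensitive, so use lower case)
summarize lfd lrgdpc lins lfdi
* First difference within each country using the D. operator
xtset code year
generate dlfd = D.lfd
* Export the most recent regression to a Word-readable table
ssc install outreg2
xtreg lfd lrgdpc lins lfdi, fe
outreg2 using results.doc, replace ctitle(FE)
```

Using D. instead of subtracting a hand-made lag keeps the difference from running across the boundary between two countries.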
Lab Exercise / Hands-On Session I(a)
a. Estimate the above model using pooled OLS. Interpret your results. When would this
estimation strategy be justified? Check for multicollinearity using variance inflation factor
(vif). What do you conclude?
b. Estimate the same model using the Random Effects (RE) estimator. Are Random Effects
justified?
c. Next, estimate the same model using the Fixed Effects (FE) estimator.
d. Conduct a Hausman test to compare the Fixed Effects (FE) and Random Effects (RE)
models. What do you conclude?
e. Test for heteroskedasticity for the Fixed Effect (FE) model. What do you conclude?
f. Test for serial correlation using xtserial command. What do you conclude?
g. Perform an estimation that can rectify the above (e) and (f) problem(s), and present the
result in the last column.
ln RGDPC
ln INS
ln FD
Breusch-Pagan _
LM test
Hausman test _
Observations
Multicollinearity
Heteroskedasticity
Serial Correlation
Notes: Figures in the parentheses are t-statistics, except for Breusch-Pagan LM test, Hausman test,
Heteroskedasticity and Serial Correlation tests, which are p-values. ** and *** indicate the respective
5% and 1% significance levels.
Lab Exercise / Hands-On Session I(b)
Greene (1997) provides a small panel data set with information on costs and output of 6
different firms, in 4 different periods of time (1955, 1960, 1965, and 1970). Your job is to
estimate a cost function using basic panel data techniques.
The data are shown below in stacked form, i.e., the first T lines (here T = 4) refer to
firm 1, the next T lines refer to firm 2, and so on. The columns are self-explanatory. To
facilitate your work, I have included firm-specific dummy variables for each firm,
represented by columns D1-D6.
Empirical Model
Lab Exercise / Hands-On Session I(c)
Capital Asset Pricing Model (CAPM)
The Fama and MacBeth (1973) test of the CAPM involves a two-step estimation procedure.
First, the betas are estimated in separate time-series regressions for each firm. Second,
for each separate point in time, a cross-sectional regression of the excess returns on the
betas is conducted:
R_{it} - R_{ft} = \lambda_0 + \lambda_m \beta_{Pi} + u_i
where the dependent variable, R_{it} - R_{ft}, is the excess return of stock i at time t and the
independent variable is the estimated beta for the portfolio (P) that the stock has been
allocated to. The betas of the firms themselves are not used on the RHS, but rather the
betas of portfolios formed on the basis of firm size. If the CAPM holds, then \lambda_0 should
not be significantly different from zero and \lambda_m should approximate the (time average)
equity market risk premium, R_m - R_f.
Fama and MacBeth proposed estimating this second stage (cross-sectional) regression
separately for each time period, and then taking the average of the parameter estimates to
conduct hypothesis tests. However, one could also achieve a similar objective using a panel
approach.
The attached Excel file (Panel CAPM) contains data on return and beta, which consists of
2500 UK firms for 11 years.
b. Next, estimate the same model using the Random Effects estimator. Are Random
Effects justified?
d. Compare the Fixed Effects and Random Effects estimates. Conduct a Hausman test
to compare the two models. What do you conclude?