
Analysing Panel Data

By
Amos Ganyam
Objectives

• Understand the panel data analysis framework
• Examine the steps in choosing between fixed effects, random effects and pooled OLS regression
• Interpret results from panel data analysis.
Panel Data Analysis Framework

[Framework flowchart] Panel data feed into descriptive statistics, correlation analysis and estimation. Pooled OLS is subjected to robustness tests (normality, multi-collinearity, heteroscedasticity). Panel regression compares fixed effects and random effects; under random effects, the Lagrange Multiplier test decides between random effects and pooled OLS.
Descriptive Statistics
• Presents a statistical summary of the variables in a data set.
• It usually reports statistics such as the number of observations, mean, standard deviation and range (minimum and maximum).
Variable Obs Mean Std. Dev. Min Max

edi 80 .015625 .0416007 0 .125


bm 80 4.45 1.231445 0 8
bi 80 60.35187 17.9657 0 90
bo 80 15.76394 22.28182 0 78.1954
age 80 3.709743 .329269 2.890372 4.174387
Pooled Ordinary Least Squares (OLS)
• Helps in assessing whether the study's model complies with the regression assumptions.
• If the LM test confirms pooled OLS, the pooled OLS results form the basis of inference.
Source SS df MS Number of obs = 80
F(4, 75) = 9.28
Model .045264274 4 .011316068 Prob > F = 0.0000
Residual .091454476 75 .001219393 R-squared = 0.3311
Adj R-squared = 0.2954
Total .13671875 79 .001730617 Root MSE = .03492

edi Coef. Std. Err. t P>|t| [95% Conf. Interval]

bm -.0067548 .0033423 -2.02 0.047 -.0134131 -.0000966


bi .0007474 .0002245 3.33 0.001 .0003002 .0011947
bo .0004459 .0001801 2.48 0.016 .000087 .0008047
age -.0533778 .0122258 -4.37 0.000 -.0777329 -.0290228
_cons .1915638 .0458552 4.18 0.000 .1002156 .2829121
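A sketch of the Stata command that produces pooled OLS output of this form, with the variable names taken from the table above:

```stata
* Pooled OLS regression of edi on the four predictors
regress edi bm bi bo age
```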
Robustness Tests-Normality
Assumption: The error term in a regression should be normally distributed.

Normality tests: Histogram, Jarque-Bera, Shapiro-Wilk, Skewness & Kurtosis.
Histogram Test for Normality
Assume a normal distribution if the histogram forms a bell shape.
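Assuming the residuals are saved after the pooled OLS regression, the histogram can be drawn in Stata as follows (the residual variable name `error` matches the later test tables):

```stata
* Save the residuals and plot them with a normal-density overlay
predict error, residuals
histogram error, normal
```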
Jarque-Bera Test for Normality
Assume a normal distribution if the p-value is greater than 0.05.

Jarque-Bera test for Ho: normality
Jarque-Bera normality test: 3.402 Chi(2) .1825
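The statistic above appears to come from a user-written Stata command rather than a built-in one; a hedged sketch, assuming the `jb` package from SSC and residuals stored in `error`:

```stata
* Jarque-Bera test on the saved residuals
* (jb is a user-written command: install once with "ssc install jb")
jb error
```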
Shapiro Wilk Test for Normality
Assume a normal distribution if the p-value is greater than 0.05.

Shapiro-Wilk W test for normal data

Variable Obs W V z Prob>z

error 80 0.97478 1.731 1.202 0.11463
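The Shapiro-Wilk output above is produced by Stata's built-in `swilk`, assuming the residuals are stored in a variable called `error`:

```stata
* Shapiro-Wilk W test for normality of the residuals
swilk error
```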


Skewness & Kurtosis Test for Normality
Assume a normal distribution if the p-value is greater than 0.05.
Skewness/Kurtosis tests for Normality


joint
Variable Obs Pr(Skewness) Pr(Kurtosis) adj chi2(2) Prob>chi2

error 80 0.2062 0.1228 4.13 0.1271
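The skewness/kurtosis output above is Stata's `sktest`, again run on the saved residuals:

```stata
* Joint skewness and kurtosis test for normality of the residuals
sktest error
```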


Treatment for Abnormality

• Remove outliers in the data set, or transform variables to logarithms or z-scores.
• Add more variables to your model.
• A model with more than 30 observations may not be affected by issues of non-normality (Singh, Lucas, Dalpatadu & Murphy, 2013).

Reference

Singh, A. K., Lucas, A. F., Dalpatadu, R. J., & Murphy, D. J. (2013). Casino games and the Central Limit Theorem. UNLV Gaming Research & Review Journal, 17(2), 45-61.
Robustness Tests-Multi-collinearity
• Assumption: Predictors should not be highly correlated with one another.
• The Variance Inflation Factor (VIF) tests for multi-collinearity. Multi-collinearity is absent when all VIF values are less than 10.

Variable   VIF    1/VIF
bm         1.10   0.911151
bi         1.05   0.948844
age        1.05   0.952492
bo         1.04   0.958067

Mean VIF   1.06
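In Stata the VIF table is obtained as a postestimation command after the pooled OLS regression:

```stata
* Variance inflation factors after the pooled OLS fit
regress edi bm bi bo age
estat vif
```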


Treatment for Multi-collinearity
• Remove outliers, or transform variables to logarithms or z-scores.
• Adjust your model to exclude variables with high VIF values.
Robustness Tests-Heteroscedasticity
• Assumption: The error term in a regression model should have a constant variance (homoscedasticity).
• When this assumption is violated, heteroscedasticity is present. Heteroscedasticity is present when the p-value is less than 0.05.

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity

Ho: Constant variance
Variables: fitted values of edi

chi2(1) = 16.88
Prob > chi2 = 0.0000
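The Breusch-Pagan / Cook-Weisberg output above comes from Stata's `estat hettest`, run after the pooled OLS regression:

```stata
* Breusch-Pagan / Cook-Weisberg test after the pooled OLS fit
regress edi bm bi bo age
estat hettest
```

If heteroscedasticity is detected, the model can be re-estimated with robust standard errors, e.g. `regress edi bm bi bo age, vce(robust)`.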
Treatment for Heteroscedasticity
• Include more predictors in the model.
• Estimate the final regression with robust standard errors.
Correlations
• Measures the direction and strength of association among the study variables.
• Also serves as a check for multi-collinearity: multi-collinearity is unlikely if all pairwise correlations are less than 0.8.
edi bm bi bo age

edi 1.0000
bm -0.1699 1.0000
bi 0.2660* 0.2134 1.0000
bo 0.1793 0.1647 0.0960 1.0000
age -0.3986* 0.1849 0.0877 0.1362 1.0000

Legend: * variables are significantly correlated; a minus sign indicates a negative correlation; no sign indicates a positive correlation.
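A correlation matrix with significance stars of this form can be produced in Stata with:

```stata
* Pairwise correlations, starring coefficients significant at the 5% level
pwcorr edi bm bi bo age, star(.05)
```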
Panel Regression Analysis
• Fixed effects: Controls for time-invariant features within an entity that may affect the predictors. These time-invariant features should be unique to an entity and must not be correlated across entities.

• Random effects: Assumes that the variation across entities is random and not correlated with the predictors; the entity-specific error term is uncorrelated with the predictors.

• Pooled OLS: Assumes that observations do not refer to a particular entity; entity effects are ignored.
Choosing Between Fixed & Random Effects

1. Are the observations from a random sample?
   • No: use fixed effects.
   • Yes: perform both random and fixed effects, then run the Hausman test.
2. Hausman test: is there a significant difference in the coefficients?
   • Yes: use fixed effects.
   • No: perform the LM test.
3. Does the LM test indicate the presence of random effects?
   • Yes: use random effects.
   • No: use pooled OLS.
Hausman Test
• Tests whether the differences in the coefficients of a panel regression model are systematic or not.

Coefficients
(b) (B) (b-B) sqrt(diag(V_b-V_B))
fixed random Difference S.E.

bm -.001899 -.0022298 .0003308 .0004002


bi .0002153 .0002525 -.0000372 .0000365
bo .0003813 .00039 -8.68e-06 .0000408
age -.0328429 -.0415827 .0087398 .0173111

b = consistent under Ho and Ha; obtained from xtreg
B = inconsistent under Ha, efficient under Ho; obtained from xtreg

Test: Ho: difference in coefficients not systematic

chi2(4) = (b-B)'[(V_b-V_B)^(-1)](b-B) = 1.47
Prob>chi2 = 0.8312

Use random effects when the p-value is greater than 0.05.
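The Hausman output above follows the usual Stata sequence: declare the panel, fit both models, store the estimates, then compare them. The panel identifier `firm` appears in the output; the time variable is assumed here to be `year`:

```stata
* Declare the panel structure (time variable assumed to be year)
xtset firm year

* Fit and store the fixed effects and random effects models
xtreg edi bm bi bo age, fe
estimates store fixed
xtreg edi bm bi bo age, re
estimates store random

* Hausman specification test comparing the two sets of coefficients
hausman fixed random
```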
Lagrange Multiplier Test
• Tests whether random effects exist in the model. Use random effects if the p-value is less than 0.05.

Breusch and Pagan Lagrangian multiplier test for random effects

edi[firm,t] = Xb + u[firm] + e[firm,t]

Estimated results:
            Var    sd = sqrt(Var)
edi    .0017306    .0416007
e       .000373    .0193145
u      .0011617    .0340844

Test: Var(u) = 0
chibar2(01) = 142.74
Prob > chibar2 = 0.0000
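The Breusch-Pagan LM output above is produced as a postestimation command after the random effects model:

```stata
* LM test for random effects, run immediately after xtreg, re
xtreg edi bm bi bo age, re
xttest0
```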
Pooled OLS Regression Results
Source SS df MS Number of obs = 80
F(4, 75) = 9.28
Model .045264274 4 .011316068 Prob > F = 0.0000
Residual .091454476 75 .001219393 R-squared = 0.3311
Adj R-squared = 0.2954
Total .13671875 79 .001730617 Root MSE = .03492

edi Coef. Std. Err. t P>|t| [95% Conf. Interval]

bm -.0067548 .0033423 -2.02 0.047 -.0134131 -.0000966


bi .0007474 .0002245 3.33 0.001 .0003002 .0011947
bo .0004459 .0001801 2.48 0.016 .000087 .0008047
age -.0533778 .0122258 -4.37 0.000 -.0777329 -.0290228
_cons .1915638 .0458552 4.18 0.000 .1002156 .2829121
Fixed Effects Regression Results
Fixed-effects (within) regression Number of obs = 80
Group variable: firm Number of groups = 8

R-sq: Obs per group:


within = 0.0981 min = 10
between = 0.3518 avg = 10.0
overall = 0.2889 max = 10

F(4,68) = 1.85
corr(u_i, Xb) = 0.2817 Prob > F = 0.1297

edi Coef. Std. Err. t P>|t| [95% Conf. Interval]

bm -.001899 .0019956 -0.95 0.345 -.0058812 .0020832


bi .0002153 .0001628 1.32 0.191 -.0001096 .0005403
bo .0003813 .0001622 2.35 0.022 .0000577 .000705
age -.0328429 .028639 -1.15 0.255 -.0899912 .0243055
_cons .1269071 .1066817 1.19 0.238 -.085973 .3397872

sigma_u .03338972
sigma_e .01931448
rho .74928194 (fraction of variance due to u_i)

F test that all u_i=0: F(7, 68) = 25.31 Prob > F = 0.0000
Random Effects Regression Results
Random-effects GLS regression Number of obs = 80
Group variable: firm Number of groups = 8

R-sq: Obs per group:


within = 0.0972 min = 10
between = 0.3547 avg = 10.0
overall = 0.2948 max = 10

Wald chi2(4) = 10.21


corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0370

edi Coef. Std. Err. z P>|z| [95% Conf. Interval]

bm -.0022298 .0019551 -1.14 0.254 -.0060617 .0016021


bi .0002525 .0001587 1.59 0.112 -.0000585 .0005636
bo .00039 .000157 2.48 0.013 .0000823 .0006977
age -.0415827 .0228149 -1.82 0.068 -.0862991 .0031337
_cons .1584183 .0858255 1.85 0.065 -.0097966 .3266333

sigma_u .03408438
sigma_e .01931448
rho .75693868 (fraction of variance due to u_i)
Key Statistics in Panel Regression Results

F-test / Wald statistics (model fitness test): The F-test or Wald test and its p-value signify the joint significance of the x variables in predicting the y variable. Do not proceed with your analysis if the p-value is greater than 0.05.

R-squared (within, between and overall): Measures the percentage of variation in the y variable explained by the x variables. "Within" measures variation within each entity, while "between" measures variation between entities.

Adjusted R-squared: Adjusts R-squared for the number of predictors in the model; the difference between R-squared and adjusted R-squared shows how much explanatory power is lost after the adjustment.

Coefficients: The constant is the value of y when all x variables are held at zero. The coefficients on x1, x2, ... xn give the change in y associated with a change in the corresponding x variable.

T-test and p-values: Used to test propositions and hypotheses for inference. P-values less than 0.05 are significant.
Thanks for Listening.
