
Violation of CLRM

Prof. Rishman Jot Kaur Chahal

HSC - 205

Department of Humanities and Social Sciences


Indian Institute of Technology Roorkee

1 / 46
Assumptions of OLS
Homoskedasticity: The variance of all the error terms is the same,
Var(u_i | X's) = constant.
No Autocorrelation: Error terms are mutually uncorrelated,
Corr(u_t, u_{t-i}) = 0. This problem arises mainly with time series data.
No Multicollinearity: no perfect (or very high) collinearity, which occurs when two or
more explanatory variables in a regression model are highly correlated.
All error terms are independent of all X variables (strict exogeneity).
We treat the explanatory variables as fixed (deterministic),

Cov(u_t, X's) = 0, or

u_1, u_2, ..., u_N are independent of the X's.


Correct Specification.
Error terms have mean zero: E(u_i | X's) = 0.
These assumptions taken together imply that E(Y_i | x_i) = x_i'β.
2 / 46
Homoscedasticity
What affects its existence?

Var(ε_i) = σ² for all i. If the error terms do not have constant variance,
they are said to be heteroscedastic.
Heteroscedasticity is likely to arise when there is:
Measurement error. For example, in a primary survey some
respondents might provide more accurate responses than others.

Subpopulation differences or other interaction effects. For example, the
effect of income (our X in the model) on expenditures (Y) differs for
whites and blacks. (Remember the problem arises from violating the
assumption that no such differences exist or that they have already been
incorporated into the model.)

Model misspecification. For instance, you should be using log Y rather
than Y in the model, or it is better to use both X and X² rather than
only X in the model.

3 / 46
Homoscedasticity
Example

Example: Consider a model to understand the effect of income on
household vacation expenditure. Assume that average annual family
income is $1,000 and average annual family vacation expenditure is $300.

Families with low incomes will spend relatively little on vacations, and
the variation in vacation expenditure across such families will be small.

But for families with large incomes, the amount of discretionary
income will be higher. The mean amount spent on vacations will be
higher, and there will also be greater variability among such families,
resulting in heteroscedasticity.

Another Example: Sales of larger firms might be more volatile than


sales of smaller firms.

4 / 46
Homoscedasticity
OLS Estimation in the presence of Heteroscedasticity

Retaining all other assumptions, let us introduce heteroscedasticity,
i.e. E(u_i²) = σ_i², in our two-variable model:

Y_i = β1 + β2 X_i + u_i

With our usual formula, the estimator of β2 is

β̂2 = Σx_i y_i / Σx_i²

Under homoscedasticity its variance is Var(β̂2) = σ² / Σx_i².

With heteroscedasticity, Var(β̂2) = Σx_i² σ_i² / (Σx_i²)².
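As a check on these formulas (not from the slides), here is a minimal Monte Carlo sketch in Python/NumPy; the sample size, the true β's, and the assumed variance function σ_i = 0.5·X_i are illustrative choices. The simulated β̂2 stays centred on the true value, but its variance matches the heteroscedastic formula Σx_i²σ_i²/(Σx_i²)², not σ²/Σx_i².

import numpy as np

rng = np.random.default_rng(0)
n, beta1, beta2 = 200, 2.0, 0.5
X = rng.uniform(1, 10, n)
x = X - X.mean()                      # deviations from the mean
sigma_i = 0.5 * X                     # assumed: sd of u_i grows with X_i

# Monte Carlo: estimate beta2 many times under heteroscedastic errors
b2 = []
for _ in range(5000):
    u = rng.normal(0, sigma_i)        # heteroscedastic errors, Var(u_i) = sigma_i^2
    y = beta1 + beta2 * X + u
    b2.append(np.sum(x * (y - y.mean())) / np.sum(x**2))   # beta2_hat = Sum(x*y)/Sum(x^2)
b2 = np.array(b2)

theoretical = np.sum(x**2 * sigma_i**2) / np.sum(x**2) ** 2   # heteroscedastic formula
print("mean of beta2_hat:", b2.mean())        # close to 0.5 -> still unbiased
print("Monte Carlo variance:", b2.var())
print("Sum(x^2 s^2)/(Sum x^2)^2:", theoretical)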

5 / 46
Homoscedasticity
Consequences of its non-existence

Heteroscedasticity does not result in biased parameter estimates.

OLS estimates are no longer BLUE. This means that among all the
unbiased estimators, OLS does not provide the estimate with the
smallest variance. So, our β̂2 is still unbiased and consistent, but it is
inefficient.

Standard errors are biased, which leads to biased test statistics and
confidence intervals.

Thank God!! Heteroscedasticity is not very problematic unless it is
"marked", and thus OLS estimation can be used without concern of
serious distortion.

However, it could be a serious issue with methods besides OLS.
For example, in logistic regression heteroscedasticity can produce
biased and misleading parameter estimates.
6 / 46
Homoscedasticity
Detecting Heteroscedasticity

Visually/graphically, one can detect the presence of heteroscedasticity
by inspecting the residuals plotted against the fitted values, or by
plotting the residuals against one of the X variables included in the
equation.

If the plot shows an uneven envelope of residuals, so that the width of
the envelope is considerably larger for some values of X than for
others, a more formal test for heteroscedasticity should be conducted.
7 / 46
Homoscedasticity
Detecting Heteroscedasticity cont...

Formal Tests:
Park Test: Park formalizes the graphical method by suggesting that
σ_i² is some function of the explanatory variable X_i:

σ_i² = σ² X_i^β e^{v_i}

or, in log form,

ln(σ_i²) = ln(σ²) + β ln X_i + v_i

where v_i is the stochastic disturbance term.

σ_i² is generally unknown, so Park suggests using the squared OLS
residuals e_i² as a proxy and running the following regression:

ln(e_i²) = α + β ln X_i + v_i

If β̂ is significant, it implies the presence of heteroscedasticity.

The Park test is thus a two-stage procedure: in the first stage we run
the OLS regression disregarding the heteroscedasticity question and
obtain e_i²; in the second stage we run the regression above.
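A minimal sketch of the two-stage Park test in Python with statsmodels; the simulated data-generating process (error standard deviation proportional to X) and the seed are illustrative assumptions, not from the slides. A small p-value on the second-stage slope signals heteroscedasticity.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 300
X = rng.uniform(1, 20, n)
u = rng.normal(0, 0.3 * X)            # assumed: variance rises with X -> heteroscedastic
Y = 1.0 + 2.0 * X + u

# Stage 1: OLS ignoring heteroscedasticity; keep the squared residuals
stage1 = sm.OLS(Y, sm.add_constant(X)).fit()
e2 = stage1.resid ** 2

# Stage 2: regress ln(e_i^2) on ln(X_i); a significant slope signals heteroscedasticity
stage2 = sm.OLS(np.log(e2), sm.add_constant(np.log(X))).fit()
print(stage2.params)        # [alpha, beta_hat]
print(stage2.pvalues[1])    # p-value for beta_hat; small -> heteroscedasticity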
8 / 46
Homoscedasticity
Detecting Heteroscedasticity cont...

White's Test: Consider a three-variable model:

Y_i = β1 + β2 X_2i + β3 X_3i + u_i

The test proceeds as follows:

Step 1: Estimate the model with the usual OLS procedure.
Step 2: Run the following (auxiliary) regression on the squared residuals:

û_i² = α1 + α2 X_2i + α3 X_3i + α4 X_2i² + α5 X_3i² + α6 X_2i X_3i + v_i

Step 3: Compute nR² from the above auxiliary regression.

Step 4: Under the null hypothesis of no heteroscedasticity, nR² ∼ χ²
with degrees of freedom equal to the number of regressors (excluding
the constant) in the auxiliary regression, here 5.
If the chi-square value obtained in the above step is greater than the
critical value at the chosen level of significance, we conclude that there
is heteroscedasticity. If not, it implies that
α2 = α3 = α4 = α5 = α6 = 0.
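The following sketch (Python with statsmodels and SciPy; the simulated data are an illustrative assumption) walks through the four steps of White's test manually rather than calling a packaged routine.

import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(2)
n = 400
X2, X3 = rng.uniform(1, 10, n), rng.uniform(1, 10, n)
u = rng.normal(0, 0.5 * X2)                      # assumed: heteroscedastic in X2
Y = 1 + 0.8 * X2 - 0.5 * X3 + u

# Step 1: original OLS regression
res = sm.OLS(Y, sm.add_constant(np.column_stack([X2, X3]))).fit()
u2 = res.resid ** 2

# Step 2: auxiliary regression on levels, squares and the cross product
Z = sm.add_constant(np.column_stack([X2, X3, X2**2, X3**2, X2 * X3]))
aux = sm.OLS(u2, Z).fit()

# Steps 3-4: nR^2 ~ chi-square with df = number of auxiliary regressors (here 5)
nR2 = n * aux.rsquared
crit = chi2.ppf(0.95, df=5)
print(nR2, crit, "heteroscedasticity" if nR2 > crit else "homoscedasticity")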
9 / 46
Homoscedasticity
Detecting Heteroscedasticity cont...
BPG
Breusch-Pagan-Godfrey Test: This test detects any linear form of
heteroscedasticity. This test is similar to the White’s Test. Consider
the k-variable linear regression model:

Yi = β1 + β2 X2i + ... + βk Xki + ui

Assume that the error variance σi2 is described as:

σi2 = f (α1 + α2 Z2i + ... + αm Zmi )

So, σ_i² is some function of the nonstochastic Z variables. (Remember
that some or all of the X's can serve as Z's.) Further assume that
σ_i² = α1 + α2 Z_2i + ... + αm Z_mi.
If α2 = α3 = ... = αm = 0, then σ_i² = α1, a constant, and the errors are
homoscedastic. Testing this hypothesis is the basic idea behind the test.
10 / 46
Homoscedasticity
Detecting Heteroscedasticity cont...

Following are the steps for the test:


Step 1: Same as White’s Test.
Step 2: Obtain σ̃² = Σû_i² / n.
Step 3: Construct the variable p_i = û_i² / σ̃², i.e. each squared residual
divided by σ̃².
Step 4: Regress p_i on the Z's as

p_i = α1 + α2 Z_2i + ... + αm Z_mi + v_i

Remember v_i is the residual term for this regression.

Step 5: Obtain the Explained Sum of Squares (ESS) from the above
equation and define

Φ = (1/2) ESS

Under the null, Φ ∼ χ²_{m−1}.
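A sketch of the BPG steps in Python; the data-generating process, with the error variance linear in a single Z, is an illustrative assumption, so here m = 2 and the statistic has 1 degree of freedom.

import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(3)
n = 400
Z = rng.uniform(1, 10, n)                    # assumed: the X variable also serves as Z
u = rng.normal(0, np.sqrt(1 + 0.5 * Z))      # assumed: variance is linear in Z
Y = 2 + 1.5 * Z + u

# Step 1: OLS and residuals
res = sm.OLS(Y, sm.add_constant(Z)).fit()
uhat2 = res.resid ** 2

# Steps 2-3: sigma_tilde^2 and the scaled squared residuals p_i
sigma2_tilde = uhat2.sum() / n
p = uhat2 / sigma2_tilde

# Steps 4-5: regress p_i on the Z's, take phi = ESS / 2 ~ chi2(m - 1)
aux = sm.OLS(p, sm.add_constant(Z)).fit()
phi = aux.ess / 2                            # explained sum of squares / 2
crit = chi2.ppf(0.95, df=1)                  # m - 1 = 1 regressor besides the constant
print(phi, crit, "reject homoscedasticity" if phi > crit else "do not reject")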
11 / 46
Homoscedasticity
Detecting Heteroscedasticity cont...

GQ
Goldfeld-Quandt Test: This method is applicable if it is assumed
that the heteroscedastic variance σ_i² is positively related to one of
the explanatory variables in the model.

Y_i = β1 + β2 X_i + u_i

σ_i² = σ² X_i²

where σ² is constant.
So, σ_i² would be larger for larger X_i.

12 / 46
Homoscedasticity
Detecting Heteroscedasticity cont...

Following are the steps of the test:


Step 1: Order or rank the observations in increasing order of X_i.
Step 2: Omit c central observations, where c is specified a priori,
and divide the remaining (n − c) observations into two groups of
(n − c)/2 observations each.
Step 3: Run 2 separate regressions for the two groups and obtain
their respective residual sums of squares, RSS1 and RSS2.
Step 4: Compute

λ = (RSS2 / df) / (RSS1 / df)

where df = (n − c)/2 − k and k is the number of estimated parameters.
λ follows an F-distribution. If λ is greater than the critical F value, the
null hypothesis of no heteroscedasticity is rejected.
Remember that the power of the GQ test depends on how c is chosen.
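A manual sketch of the Goldfeld-Quandt steps in Python; the simulated data, the choice c = 10 with n = 60, and k = 2 are illustrative assumptions.

import numpy as np
import statsmodels.api as sm
from scipy.stats import f

rng = np.random.default_rng(4)
n, c, k = 60, 10, 2                         # c central observations dropped; k parameters
X = rng.uniform(1, 10, n)
u = rng.normal(0, 0.4 * X)                  # assumed: sigma_i^2 grows with X_i
Y = 1 + 2 * X + u

# Step 1: sort observations by X
order = np.argsort(X)
Xs, Ys = X[order], Y[order]

# Step 2: drop c central observations, split the rest into two equal groups
m = (n - c) // 2
X1, Y1 = Xs[:m], Ys[:m]                     # small-X group
X2, Y2 = Xs[-m:], Ys[-m:]                   # large-X group

# Step 3: separate regressions, residual sums of squares
rss1 = sm.OLS(Y1, sm.add_constant(X1)).fit().ssr
rss2 = sm.OLS(Y2, sm.add_constant(X2)).fit().ssr

# Step 4: lambda = (RSS2/df) / (RSS1/df) ~ F(df, df) with df = (n - c)/2 - k
df = m - k
lam = (rss2 / df) / (rss1 / df)
crit = f.ppf(0.95, df, df)
print(lam, crit, "heteroscedasticity" if lam > crit else "no evidence")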

13 / 46
Homoscedasticity
Remedies

The remedy depends on whether σ_i² is known or not.
When σ_i² is known: Generalised Least Squares (GLS) or Weighted
Least Squares (for simplicity, WLS and GLS are used interchangeably here).

GLS is OLS on transformed variables that satisfy the standard
least-squares assumptions.

OLS does not take into consideration the unequal variability of the
dependent variable, but GLS takes this into account.

Y_i = β1 X_0i + β2 X_i + u_i

where X_0i = 1 for all i.

As σ_i² is known, divide the above equation by σ_i.

14 / 46
Homoscedasticity
Remedies cont...

Now the transformed error term has constant variance (equal to 1).

Thus,

Y_i/σ_i = β1 (X_0i/σ_i) + β2 (X_i/σ_i) + (u_i/σ_i)

Y_i* = β1* X_0i* + β2* X_i* + u_i*

The transformed model now satisfies homoscedasticity; this is the
whole purpose of the transformation.
How? Can you prove it?
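One way to see it in practice is a small sketch in Python (statsmodels), under the assumption that σ_i is known and using simulated data: running OLS on the transformed variables gives the same estimates as WLS with weights 1/σ_i².

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 200
X = rng.uniform(1, 10, n)
sigma_i = 0.5 * X                        # assumed known sigma_i
Y = 3 + 1.2 * X + rng.normal(0, sigma_i)

# GLS as OLS on the transformed variables: divide every term by sigma_i
Y_star = Y / sigma_i
X0_star = 1 / sigma_i                    # transformed "intercept" column X0i / sigma_i
X_star = X / sigma_i
gls = sm.OLS(Y_star, np.column_stack([X0_star, X_star])).fit()

# Equivalent shortcut: statsmodels WLS with weights 1 / sigma_i^2
wls = sm.WLS(Y, sm.add_constant(X), weights=1 / sigma_i**2).fit()
print(gls.params, wls.params)            # the two sets of estimates coincide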

15 / 46
Homoscedasticity
Remedies Example

Figure: Gujarati et al., 2009 fifth edition


16 / 46
Homoscedasticity
Remedies Example

Figure: Gujarati et al., 2009 fifth edition 17 / 46


Homoscedasticity
Remedies cont...

Consider the following model as:

Y = β1 + β2 X2 + β3 X3 + U (1)

Say E(U²) = σ² Z² (this is also the conditional variance of Y, since
Var(Y | X's) = Var(U)).

Divide Eq. 1 by Z. Our WLS model is:

Y/Z = β1 (1/Z) + β2 (X2/Z) + β3 (X3/Z) + U/Z

Thus we use the above WLS model to obtain BLUE estimators. How?

18 / 46
Homoscedasticity
Remedies cont...

Expectation of the squared transformed error from our WLS model:

E(U/Z)² = (1/Z²) E(U²) = (1/Z²) σ² Z² = σ²

which is constant. So, our weight is w = 1/Z.

Another Example: Consider EU 2 = σ 2 X2 for the above mentioned


model. What will be the weight?

19 / 46
Homoscedasticity
Remedies cont...

Another way of reducing heteroscedasticity is the log transformation


of the variables.

lnYi = β1 + β2 lnXi + ui

The log transformation compresses the scales in which the variables are
measured and thereby reduces the differences between values:
80 is ten times 8, but ln 80 (≈ 4.38) is only about twice ln 8 (≈ 2.08).

Caution: Here β2 measures the elasticity of Y with respect to X.
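A minimal sketch (Python, statsmodels) of fitting the log-log model and reading β2 as an elasticity; the constant-elasticity data-generating process with a true elasticity of 0.7 is an illustrative assumption.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 300
X = rng.uniform(1, 100, n)
Y = 5 * X**0.7 * np.exp(rng.normal(0, 0.2, n))   # assumed constant-elasticity relationship

loglog = sm.OLS(np.log(Y), sm.add_constant(np.log(X))).fit()
print(loglog.params[1])   # roughly 0.7: a 1% rise in X raises Y by about 0.7%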

20 / 46
Multicollinearity
What is the nature of Multicollinearity?

Originally, multicollinearity meant the existence of an exact or perfect
linear relationship among some or all explanatory variables of a
regression model.
Consider a k-variable regression with explanatory variables
X1, X2, X3, ..., Xk; an exact relationship can be written as:
λ1 X1 + λ2 X2 + ... + λk Xk = 0
where λ1, λ2, ..., λk are constants that are not all zero
simultaneously.
The X variables may also be correlated, but not perfectly, as follows:
λ1 X1 + λ2 X2 + ... + λk Xk + vi = 0
where vi is a stochastic error term.
Thus, the degree of multicollinearity matters.
21 / 46
Multicollinearity
Graphical View

22 / 46
Multicollinearity
Example

X1    X2    X3
10    50    52
15    75    75
18    90    97
24   120   129
30   150   152

Calculate r12. Is this perfect multicollinearity?

Is r23 a perfect or an imperfect correlation?
(Hint: one of the two correlations is perfect, the other is high but imperfect.)
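A quick check of the two correlations from the table, as a Python/NumPy sketch:

import numpy as np

X1 = np.array([10, 15, 18, 24, 30])
X2 = np.array([50, 75, 90, 120, 150])      # exactly 5 * X1
X3 = np.array([52, 75, 97, 129, 152])      # X2 plus small perturbations

r12 = np.corrcoef(X1, X2)[0, 1]
r23 = np.corrcoef(X2, X3)[0, 1]
print(r12)   # 1.0  -> perfect collinearity between X1 and X2
print(r23)   # close to but below 1 -> high, imperfect collinearity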

23 / 46
Multicollinearity
What is the nature of Multicollinearity?

Note that till now multicollinearity is defined only in the form of linear
relationship among X variables.

But it does not rule out the nonlinear relationships among them. For
example: Consider the following regression model:

Yi = β0 + β1 Xi + β2 Xi2 + β3 Xi3 + ui

where Y = total cost of production and X = output. Here the regressors
X_i, X_i², and X_i³ are functionally related, but the relationship is
non-linear.

Strictly the model does not violate the assumption of no


multicollinearity.

But there is a very high correlation between Xi , Xi2 and Xi3 which will
make it difficult to estimate the parameters with great precision.
24 / 46
Multicollinearity
Sources

Data collection method employed. For example: sampling over a


limited range of the values taken by the regressors in the population.

Modelling Constraint: Ex: C = f (Income, Education, Health, Asset)


where C represents consumption.

Model specification. For example, adding polynomial terms to a


regression model, especially when the range of the X variable is small.

Time Series Analysis: Most of the variables share a common trend


because of which there is high collinearity. For Example:
GDPt = f (Capitalt ).

25 / 46
Multicollinearity
Estimation in the presence of Perfect Multicollinearity

In case of perfect multicollinearity the regression coefficients remain


indeterminate and their standard errors are infinite.

Consider the following three-variable regression model (in deviation form):

y_i = β̂2 x_2i + β̂3 x_3i + û_i   (2)
With our usual OLS analysis we know that:

β̂2 = [ (Σ y_i x_2i)(Σ x_3i²) − (Σ y_i x_3i)(Σ x_2i x_3i) ] / [ (Σ x_2i²)(Σ x_3i²) − (Σ x_2i x_3i)² ]   (3)

β̂3 = [ (Σ y_i x_3i)(Σ x_2i²) − (Σ y_i x_2i)(Σ x_2i x_3i) ] / [ (Σ x_2i²)(Σ x_3i²) − (Σ x_2i x_3i)² ]

Assume that x_3i = λ x_2i, where λ is a nonzero constant. Now calculate β̂2.


26 / 46
Multicollinearity
Estimation in the presence of Perfect Multicollinearity

β̂2 = [ (Σ y_i x_2i)(λ² Σ x_2i²) − (λ Σ y_i x_2i)(λ Σ x_2i²) ] / [ (Σ x_2i²)(λ² Σ x_2i²) − λ² (Σ x_2i²)² ] = 0/0

Similarly you can prove that βˆ3 is also indeterminate.

So, in such cases there is no way to estimate β2 and β3 uniquely.
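Numerically, the same indeterminacy shows up as a singular X'X matrix. A small NumPy sketch (λ = 2 and the simulated y are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(7)
n, lam = 50, 2.0
x2 = rng.normal(0, 1, n)
x3 = lam * x2                              # perfect collinearity: x3 = lambda * x2
y = 1.0 * x2 + 0.5 * x3 + rng.normal(0, 0.1, n)

X = np.column_stack([x2, x3])
XtX = X.T @ X
print(np.linalg.matrix_rank(XtX))          # 1 < 2: the normal equations are singular
print(np.linalg.det(XtX))                  # ~0, so (X'X)^(-1) does not exist

# np.linalg.lstsq still returns *a* solution, but it is only one of infinitely many
# combinations of beta2 and beta3 with the same fit (only beta2 + lambda*beta3 is identified).
print(np.linalg.lstsq(X, y, rcond=None)[0])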

27 / 46
Multicollinearity
Estimation in the presence of Imperfect Multicollinearity

Generally, exact relationships among variables do not exist. Thus,
consider the following relationship:

x_3i = λ x_2i + v_i   (4)

where v_i is a stochastic error term with Σ x_2i v_i = 0.

In this case, obtain β̂2 by substituting Eq. 4 into Eq. 3:

β̂2 = [ (Σ y_i x_2i)(λ² Σ x_2i² + Σ v_i²) − (λ Σ y_i x_2i + Σ y_i v_i)(λ Σ x_2i²) ] / [ (Σ x_2i²)(λ² Σ x_2i² + Σ v_i²) − (λ Σ x_2i²)² ]

Note that if the v_i are very small (Σ v_i² ≈ 0), this indicates almost
perfect collinearity and we approach the indeterminate 0/0 case above.
28 / 46
Multicollinearity
Consequences

(Recall: the t-ratio, or t-statistic, is used in regression analysis to
determine the statistical significance of an individual coefficient. It tells
us whether the relationship between an independent variable and the
dependent variable is strong enough to be considered statistically
significant.)
OLS estimators still remain BLUE even if multicollinearity exists.
Then, what's the fuss?

Although BLUE, the OLS estimators will have large variances and
covariances, which makes precise estimation difficult.

Because of the large variances there will be wider confidence intervals,
which tend to accept the null hypothesis more readily.
(Recall t = coefficient / standard error of the coefficient.)

Also, because of the large variances and covariances, the t-ratios of one
or more coefficients will tend to be statistically insignificant even though
R² is very high. The t-ratio tests H0: a particular coefficient equals 0,
while R² measures the overall goodness of fit; insignificant individual
t-ratios together with a very high R² is a contradictory situation that
points to multicollinearity.

With this last point one can also detect the presence of
multicollinearity.
29 / 46
Multicollinearity
Detection: Large Variance and Covariance

Insignificant t-Ratios: Remember t = β̂2 / se(β̂2), where the null
hypothesis is, say, β2 = 0.

As the standard errors increase due to high collinearity, the computed
t-values decrease, making it harder to reject the null.

Remember that for model 2, the variances are

Var(β̂2) = σ² / [ Σ x_2i² (1 − r_23²) ]

Var(β̂3) = σ² / [ Σ x_3i² (1 − r_23²) ]

So, as r23 tends toward 1 i.e. as collinearity increases the variances of


the estimators will also increase.
30 / 46
Multicollinearity
Detection: VIF

The speed with which the variances and covariances increase can be seen
with the variance-inflation factor (VIF), defined as:

VIF = 1 / (1 − r_23²)

VIF shows how the variance of an estimator is inflated by the presence
of multicollinearity. As r_23² approaches 1, the VIF approaches infinity.

As the extent of collinearity increases, the variance of an estimator


increases, and in the limit it can become infinite.

If there is no collinearity between x2 and x3 , VIF will be 1.
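A sketch computing the VIF both from the formula above and from the auxiliary-regression route via statsmodels; the simulated x2 and x3 are illustrative assumptions.

import numpy as np
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(8)
n = 500
x2 = rng.normal(0, 1, n)
x3 = 0.95 * x2 + rng.normal(0, 0.3, n)      # highly (but not perfectly) collinear with x2

r23 = np.corrcoef(x2, x3)[0, 1]
print(1 / (1 - r23**2))                     # VIF from the formula above

# statsmodels computes the same quantity from the auxiliary regression R^2
exog = np.column_stack([np.ones(n), x2, x3])
print(variance_inflation_factor(exog, 1))   # VIF for x2
print(variance_inflation_factor(exog, 2))   # VIF for x3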

31 / 46
Multicollinearity
VIF

Thus, the variances of the β̂'s are directly proportional to the VIF:

Var(β̂2) = (σ² / Σ x_2i²) VIF

Var(β̂3) = (σ² / Σ x_3i²) VIF

32 / 46
Multicollinearity
Example One

Figure: Gujarati et al., 2009 fifth edition

33 / 46
Multicollinearity
Example Two

Figure: Gujarati et al., 2009 fifth edition

The inverse of the VIF is called tolerance (TOL):

TOL_j = 1 / VIF_j = (1 − R_j²)
34 / 46
Multicollinearity
Detection with a formal test

Auxiliary Regression: Consider y = f(x1, x2, ..., xk) with coefficient of
determination R²_y, and suppose x1 is highly correlated with the other
regressors.

Now run the following regression:

x1 = f(x2, x3, ..., xk)   (5)

and obtain its R²_{x1}.

Eq. 5 is known as an auxiliary regression.

If R²_{x1} > R²_y, multicollinearity is indicated (Klein's rule of thumb).
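A sketch of this rule of thumb in Python (statsmodels); the simulated data, in which x1 is almost a linear combination of x2 and x3 while y itself is noisy, are illustrative assumptions.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 300
x2, x3 = rng.normal(0, 1, n), rng.normal(0, 1, n)
x1 = 0.7 * x2 + 0.6 * x3 + rng.normal(0, 0.1, n)   # x1 nearly determined by x2, x3
y = 1 + 0.5 * x1 + 0.3 * x2 - 0.2 * x3 + rng.normal(0, 2.0, n)

R2_y = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2, x3]))).fit().rsquared

# Auxiliary regression: x1 on the remaining regressors
R2_x1 = sm.OLS(x1, sm.add_constant(np.column_stack([x2, x3]))).fit().rsquared

print(R2_x1, R2_y)
print("multicollinearity suspected" if R2_x1 > R2_y else "not flagged by this rule")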

35 / 46
Multicollinearity
Remedies

Dropping unimportant variables will dilute multicollinearity.


Transformation of the variable using logarithm or change to first
difference i.e. first difference regression model.
Using a priori information: Suppose we consider the model
Yi = β1 + β2 X2i + β3 X3i + vi
where Y = consumption, X2 = income, and X3 = wealth. But
suppose a priori we believe that β3 = 0.10β2 . Now run the following
regression :
Yi = β1 + β2 X2i + 0.10β2 X3i + vi

= β1 + β2 Xi + vi
where Xi = X2i + 0.10 X3i. Once you obtain β̂2, you can estimate β̂3 from
the postulated relationship.
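A sketch of imposing the prior restriction in Python (statsmodels); the variable names, the true coefficients (chosen so that β3 = 0.10 β2 actually holds), and the simulated data are illustrative assumptions.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
n = 200
income = rng.normal(100, 20, n)
wealth = 10 * income + rng.normal(0, 5, n)         # wealth highly collinear with income
consumption = 5 + 0.6 * income + 0.06 * wealth + rng.normal(0, 3, n)

# Impose the prior restriction beta3 = 0.10 * beta2 and regress on X = X2 + 0.10*X3
X = income + 0.10 * wealth
res = sm.OLS(consumption, sm.add_constant(X)).fit()
beta2_hat = res.params[1]
beta3_hat = 0.10 * beta2_hat                       # recovered from the postulated relationship
print(beta2_hat, beta3_hat)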
36 / 46
Multicollinearity
Remedies

Adding more observations.

Combining cross-sectional and time series data: It is a variant of a


priori information technique. Thus, we are pooling the data.

37 / 46
Autocorrelation
What is the nature of autocorrelation?

The violation of the CLRM assumption that E(u_i u_j) = 0 for i ≠ j is
known as autocorrelation.

Formally, the term autocorrelation is defined as “correlation between


members of series of observations ordered in time [as in time series
data] or space [as in cross-sectional data].”

Symbolically, autocorrelation implies E (ui uj ) ̸= 0 where i ̸= j.

Difference between autocorrelation and serial correlation:

Correlation between two series such as u1, u2, ..., u10 and u2, u3, ..., u11,
where the former is the latter series lagged by one time period, is
autocorrelation, whereas correlation between two different time series such
as u1, u2, ..., u10 and v1, v2, ..., v11, where u and v are different series,
is called serial correlation.
38 / 46
Autocorrelation
Visualization

Figure: Gujarati et al., 2009 fifth edition


39 / 46
Autocorrelation
Sources

Specification Bias: 1. Omitted Variable Bias: Researchers often


use the models that may not be perfect for the analysis.

Consider the following demand model:

Yt = β1 + β2 X2t + β3 X3t + β4 X4t + ut


where Y = quantity of demand of the product, X2 = Price of the
product, X3 = consumer income, X4 = price of the substitute product
and t = time.

But the researcher used the following model:


Yt = β1 + β2 X2t + β3 X3t + vt

Thus, this misspecified model has v_t = u_t + β4 X_4t, which will show a
systematic pattern in v_t, creating (apparent) autocorrelation.
40 / 46
Autocorrelation
Sources

Specification Bias: 2. Incorrect Functional Form: Consider the


correct model:

MarginalCost_i = β1 + β2 Output_i + β3 Output_i² + u_i

But if the researcher leaves the quadratic term out of her model, she
fits an incorrect linear cost curve, and the omitted curvature shows up
as a systematic pattern in the disturbances.

Seasonality of the data is another source of autocorrelation.

Data Manipulation: In time series work, quarterly data are often
derived from monthly data, but this conversion adds smoothness to the
data and dampens the fluctuations present in the monthly series. Such
data "massaging" can itself lead to a systematic pattern in the
disturbances.
41 / 46
Autocorrelation
Sources

Data Transformation: Consider the correct model as

Yt = β1 + β2 Xt + ut (6)

where Y = consumption expenditure (log), X = Income (log).


But this holds true even for the previous year

Yt−1 = β1 + β2 Xt−1 + ut−1 (7)

where Yt−1 , Xt−1 and ut−1 are the lagged values of Y, X and u
respectively.
Subtracting Eq. 7 from Eq. 6 gives the first-difference form

∆Y_t = β2 ∆X_t + ∆u_t

Here ∆u_t = u_t − u_{t−1} is autocorrelated (successive differences share
a common u term), whereas u_t was not.


These models are known as dynamic regression models.
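A quick NumPy sketch of the point above: even when u_t is white noise, its first difference ∆u_t is autocorrelated, with first-order autocorrelation close to −0.5.

import numpy as np

rng = np.random.default_rng(11)
u = rng.normal(0, 1, 100_000)          # u_t: independent draws, no autocorrelation
du = np.diff(u)                        # delta u_t = u_t - u_(t-1)

print(np.corrcoef(u[1:], u[:-1])[0, 1])    # ~0: u_t is not autocorrelated
print(np.corrcoef(du[1:], du[:-1])[0, 1])  # ~ -0.5: delta u_t is autocorrelated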
42 / 46
Autocorrelation
Sources

Nonstationarity
In simple terms, a stationary series is a flat-looking series, i.e. it moves
around a constant level.

Also known as a mean-reverting series.

Main property: the mean, variance, and autocovariances (which depend
only on the lag) are constant over time in a mean-reverting series.

Remember this is known as second-order (weak) stationarity. If the series
also has the same probability distribution at every point in time
(P(Y_t) = P(Y_{t−1}) = ...), it is known as a first-order stationary
series.

43 / 46
Autocorrelation
Visualization of Positive and negative Autocorrelation

Figure: Gujarati et al., 2009 fifth edition

44 / 46
Autocorrelation
OLS Estimation in the presence of Autocorrelation

Consider a two-variable regression model as:

Yt = β1 + β2 Xt + ut

Let us assume that E(u_t u_{t+s}) ≠ 0 for s ≠ 0.

Let us further assume that the disturbance, or error, terms are generated
by the following mechanism:

u_t = ρ u_{t−1} + ε_t   (8)

where −1 < ρ < 1; ρ is known as the coefficient of
autocovariance, and ε_t is a stochastic term that satisfies the
standard OLS assumptions.
Eq. 8 is known as a first-order autoregressive scheme, usually denoted
AR(1).
45 / 46
Autocorrelation
OLS Estimation in the presence of Autocorrelation

Note that for the AR(1) scheme:

Var(u_t) = E(u_t²) = σ_ε² / (1 − ρ²)

Cov(u_t, u_{t−s}) = E(u_t u_{t−s}) = ρ^s σ_ε² / (1 − ρ²)
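These two moments can be checked by simulation; the NumPy sketch below uses ρ = 0.7 and σ_ε = 1 as illustrative assumptions.

import numpy as np

rng = np.random.default_rng(12)
T, rho, sigma_eps = 200_000, 0.7, 1.0

# Generate an AR(1) disturbance series u_t = rho*u_(t-1) + eps_t
eps = rng.normal(0, sigma_eps, T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = rho * u[t - 1] + eps[t]
u = u[1000:]                                    # drop a burn-in period

theory_var = sigma_eps**2 / (1 - rho**2)
print(u.var(), theory_var)                      # both close to 1/(1 - 0.49) ~ 1.96
for s in (1, 2, 3):
    print(np.mean(u[s:] * u[:-s]), rho**s * theory_var)   # cov(u_t, u_(t-s))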

46 / 46
