Econometrics II Chapter Two

This document discusses violations of classical linear regression assumptions, including multicollinearity, heteroskedasticity, and autocorrelation. It defines each concept, describes potential sources, consequences, and methods for detecting them. Multicollinearity occurs when independent variables are highly linearly correlated. Heteroskedasticity is when the variance of the error term is not constant across observations. Autocorrelation happens when error terms are correlated across time or groups. Failing to meet assumptions reduces accuracy and can invalidate statistical tests.

Uploaded by

Genemo Fitala
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
54 views

Econometrics II Chapter Two

This document discusses violations of classical linear regression assumptions, including multicollinearity, heteroskedasticity, and autocorrelation. It defines each concept, describes potential sources, consequences, and methods for detecting them. Multicollinearity occurs when independent variables are highly linearly correlated. Heteroskedasticity is when the variance of the error term is not constant across observations. Autocorrelation happens when error terms are correlated across time or groups. Failing to meet assumptions reduces accuracy and can invalidate statistical tests.

Uploaded by

Genemo Fitala
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 40

CHAPTER TWO

VIOLATION OF CLASSICAL REGRESSION ASSUMPTIONS
Lecture Plan

2.1. Revision of the assumptions of OLS
2.2. The nature of violations of the assumptions
2.2.1. Multicollinearity
2.2.2. Heteroskedasticity
2.2.3. Autocorrelation

#. Review of the Assumptions of the Classical Regression Model (OLS)
Assumptions of the Classical Linear Regression Model:
1. The regression model is linear, correctly specified, and has an
additive error term.
2. The error term has a zero population mean.
3. All explanatory variables are uncorrelated with the error term.
4. Observations of the error term are uncorrelated with each other
(no serial correlation).
5. The error term has a constant variance (no heteroskedasticity).
6. No explanatory variable is a perfect linear function of any
other explanatory variable (no perfect multicollinearity).
7. The error term is normally distributed (optional; not required
for the OLS estimators to be BLUE).

#. Nature, Potential Sources, Consequences, Detection and Remedies of:
1. Multicollinearity
2. Heteroskedasticity
3. Autocorrelation
1. The Nature of Multicollinearity
o Multicollinearity denotes a linear relationship among the
independent (explanatory) variables.
1. Perfect Collinearity
o If the explanatory variables are perfectly linearly correlated,
i.e. the correlation coefficient between them is one, then the
parameters become indeterminate and their standard errors are
infinite.
o In this situation it is impossible to obtain numerical values for
each parameter separately, and the method of ordinary least
squares breaks down.
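A minimal numerical sketch (hypothetical data, written in Python with NumPy) of why perfect collinearity makes the parameters indeterminate: when one regressor is an exact linear function of another, the matrix X'X is singular, so the OLS normal equations have no unique solution.

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=20)
x2 = 3 * x1                               # x2 is an exact linear function of x1
X = np.column_stack([np.ones(20), x1, x2])
y = 1 + 2 * x1 + rng.normal(size=20)

XtX = X.T @ X
print(np.linalg.matrix_rank(XtX))         # 2, not 3: X'X is singular
# Trying to invert X'X fails or gives meaningless numbers, so the normal
# equations (X'X)b = X'y have no unique solution: the parameters are indeterminate.
```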
Multicollinearity
2. Orthogonal Collinearity
o If the correlation coefficient between the explanatory variables
is zero, the variables are said to be orthogonal.
o There are then no problems concerning the estimation of the
coefficients, and there is no need to perform a multiple regression
analysis: each parameter can be estimated by a simple regression
of y on the corresponding regressor.
Multicollinearity
o If multicollinearity is less than perfect, the
regression coefficients are determinate but have
large standard errors, which impairs the precision
of the coefficient estimates.
o In the case of near or high multicollinearity, the OLS
estimators have large variances even though they remain BLUE.
2. Potential Sources of Multicollinearity
There are several sources of multicollinearity:
1. The data collection method employed
o Sampling over a limited range of the values
taken by the regressors in the population
2. Constraints on the model or on the population being
sampled
3. Model specification
o Adding polynomial terms to a regression
o Use of highly correlated independent variables
o Use of many interaction terms in a model
Potential Sources of Multicollinearity
4. Over-determined model
o The model has more explanatory variables than
observations
5. Use of many dummy independent variables
The Detection of Multicollinearity
1. High correlation coefficients: pairwise correlations
among the independent variables are high (in
absolute value).
o Rule of thumb: if |correlation| > 0.8, severe
multicollinearity may be present.
2. High R² with low t-statistics: it is possible for
individual regression coefficients to be insignificant
while the overall fit of the equation is high.
The Detection of Multicollinearity
3. High Variance Inflation Factors (VIFs):
o The larger the VIF, the more "troublesome"
or collinear the variable Xj.
o If the VIF of a variable exceeds 10, which happens
when the auxiliary R² of Xj on the other regressors
exceeds 0.90, that variable is said to be highly
collinear.
o The closer the tolerance TOLj = 1/VIFj is to 1, the
greater the evidence that Xj is not collinear with
the other regressors.
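A minimal sketch of the detection checks above, assuming a hypothetical pandas DataFrame `df` whose columns are the explanatory variables: it prints the pairwise correlation matrix (flag |r| > 0.8) and returns each regressor's VIF (flag VIF > 10).

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

def collinearity_diagnostics(df: pd.DataFrame) -> pd.Series:
    """Print pairwise correlations and return the VIF of each regressor."""
    print(df.corr().round(2))                     # flag any |r| > 0.8
    X = sm.add_constant(df)                       # include a constant, as in the regression
    vifs = pd.Series({col: variance_inflation_factor(X.values, i)
                      for i, col in enumerate(X.columns) if col != "const"})
    return vifs                                   # flag VIF > 10 (tolerance 1/VIF close to 0)
```

Regressors with VIF above 10 (equivalently, tolerance below 0.10) are candidates for the remedies discussed next.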
Remedies for Multicollinearity
No single solution exists that will eliminate multicollinearity.
Certain approaches may be useful:
1. Do Nothing
o Live with what you have.
2. Drop a Redundant Variable
o If a variable is redundant, it should never have been
included in the model in the first place, so dropping it
is simply correcting a specification error. Use
economic theory to guide your choice of which variable
to drop.
Remedies for Multicollinearity
3. Transform the Multicollinear Variables
o Sometimes you can reduce multicollinearity by re-
specifying the model, for instance by creating a
combination of the multicollinear variables. As an
example, rather than including both GDP and population
in the model, include GDP/population (GDP per capita)
instead (see the sketch after this list).
4. Increase the Sample Size
o Increasing the sample size improves the precision of the
estimators and reduces the adverse effects of
multicollinearity, although adding data is often not
feasible.
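A minimal sketch of remedy 3, assuming a hypothetical DataFrame `df` with columns y, gdp and population (the names are illustrative, not from the source): replace the two collinear regressors with their ratio.

```python
import statsmodels.api as sm

df["gdp_per_capita"] = df["gdp"] / df["population"]      # combine the collinear pair
X = sm.add_constant(df[["gdp_per_capita"]])              # instead of gdp and population
model = sm.OLS(df["y"], X).fit()
print(model.summary())
```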
2. Heteroskedasticity
o The classical model assumes that the variance of the error
(disturbance) term is the same for all values of the
independent variables, i.e. homoskedasticity.
o This assumption is violated when the variance of the error
term differs across observations.
o Heteroskedasticity describes a situation where the variance
of the error or disturbance term differs for different
values of the independent variables.
Sources of heteroskedasticity
1. Error-learning models
o As people learn, their errors of behaviour become smaller
over time, and the error variance is expected to decrease.
2. Improvements in data collection techniques
o As data collection techniques improve, the variance of the
error term tends to decrease.
3. The presence of outliers
o An outlier (outlying observation) is an observation that is
very different, either very small or very large, relative to
the other observations in the sample. The inclusion or
exclusion of such an observation in a small sample can alter
the results of the regression analysis.
Sources of heteroskedasticity
4. Model misspecification
o Heteroskedasticity may arise if relevant variables have been
mistakenly omitted so that the model is incorrectly
specified. When the omitted variables are included in
the model, the non-constancy of the variance may disappear.
5. Skewness in the distribution of one or more regressors
included in the model
o Including explanatory variables whose distributions are
skewed may cause a heteroskedasticity problem; examples
of such variables are income and wealth.
Sources of heteroskedasticity
6. Incorrect data transformation
7. Incorrect functional form
o Linear vs. log-linear models
Consequences of heteroskedasticity
If the assumption of homoskedasticity is violated, it has
the following consequences:
1. Heteroskedasticity increases the variances of the
distributions of the OLS coefficients, thereby making
the OLS estimators inefficient.
2. The OLS estimators are inefficient: if the random
term Ui is heteroskedastic, the OLS estimates do not
have the minimum variance in the class of unbiased
estimators; therefore they are not efficient in either
small or large samples.
Consequences of heteroskedasticity
o Thus heteroskedasticity has a wide impact on hypothesis
testing: the conventional t and F statistics are no longer
reliable.
Detection of heteroskedasticity
There are informal and formal methods of detecting
heteroskedasticity.
I. Informal Methods
A. Nature of the problem
o Prior empirical work on similar problems may suggest
whether heteroskedasticity is likely.
B. Graphical method
o Plotting the squared residuals against the dependent
variable (or the fitted values) gives a rough indication
of whether heteroskedasticity exists.
o If a systematic pattern appears in the graph, it may
be an indication of heteroskedasticity.
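A minimal sketch of this graphical check, assuming an already-fitted statsmodels OLS result called `results` (hypothetical): plot the squared residuals against the fitted values (or the dependent variable) and look for a systematic pattern such as fanning out.

```python
import matplotlib.pyplot as plt

plt.scatter(results.fittedvalues, results.resid ** 2)   # or plot against the dependent variable
plt.xlabel("fitted values")
plt.ylabel("squared residuals")
plt.title("Rough check for heteroskedasticity")
plt.show()
```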
Detection of heteroskedasticity
II. Formal Methods
A. The Spearman rank-correlation test
o This is the simplest, approximate test for detecting
heteroskedasticity and can be applied to either small or
large samples.
o A high rank correlation coefficient between the residuals
and an explanatory variable suggests the presence of
heteroskedasticity. If there are several explanatory
variables, we may compute the rank correlation coefficient
between the residuals ei and each explanatory variable
separately.
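A minimal sketch of the rank-correlation idea, assuming hypothetical 1-D arrays `resid` (the OLS residuals) and `x` (one explanatory variable): a rank correlation between the absolute residuals and x that differs significantly from zero suggests heteroskedasticity; repeat for each regressor.

```python
import numpy as np
from scipy.stats import spearmanr

rho, pval = spearmanr(np.abs(resid), x)      # rank correlation of |e_i| with the regressor
print(f"Spearman rho = {rho:.3f}, p-value = {pval:.3f}")
```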
Detection of heteroskedasticity
B. The Breusch-Pagan Test
o This test is applicable to large samples, where the number of
observations (the sample size) is at least twice the number
of explanatory variables.
o If there are 3 explanatory variables (X1, X2, X3),
the sample size must be at least 6.
o If the computed test statistic is greater than the table value,
we reject the null hypothesis of homoskedasticity and accept
the alternative that there is heteroskedasticity.
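A minimal sketch of the Breusch-Pagan test with statsmodels, assuming a hypothetical DataFrame `df` with dependent variable y and regressors X1, X2, X3; the null hypothesis is homoskedasticity, so a small p-value points to heteroskedasticity.

```python
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

X = sm.add_constant(df[["X1", "X2", "X3"]])
results = sm.OLS(df["y"], X).fit()
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(results.resid, X)
print(f"LM = {lm_stat:.3f}, p-value = {lm_pvalue:.3f}")
```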
Detection of heteroskedasticity
C. White’s General Heteroskedasticity Test
o It is an LM test, but it has the advantage that it does not
require any prior knowledge about the pattern of
heteroskedasticity. The assumption of normality is also
not required. For these reasons it is considered one of
the more powerful tests of heteroskedasticity.
o Basic intuition → the test looks for systematic patterns
between the squared residuals and the explanatory
variables, their squares and their cross-products.
Detection of heteroskedasticity
Limitations of White’s test
i. When we have a large number of explanatory variables,
the number of terms in the auxiliary regression will be
so high that we may not have adequate degrees of
freedom.
ii. It is basically a large-sample test, so when we work
with a small sample it may fail to detect heteroskedasticity
even when such a problem is present.
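A minimal sketch of White's test, reusing the hypothetical `results` and `X` from the Breusch-Pagan sketch above; statsmodels builds the auxiliary regression on the regressors, their squares and cross-products internally.

```python
from statsmodels.stats.diagnostic import het_white

lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(results.resid, X)
print(f"White LM = {lm_stat:.3f}, p-value = {lm_pvalue:.3f}")
```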
Detection of heteroskedasticity
D. Goldfeld-Quandt Test
o It may be applied when one of the explanatory variables
is suspected to be the heteroskedasticity culprit.
o The basic idea is that if the variances of the
disturbances are the same across all observations (i.e.,
homoskedastic), then the variance of one part of the
sample should be the same as the variance of another
part of the sample.
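A minimal sketch of the Goldfeld-Quandt test, assuming hypothetical arrays `y` and `X` with the suspected "culprit" regressor in column 1 of X: the observations are ordered by that regressor, a middle portion is dropped, and the residual variances of the two sub-samples are compared with an F test.

```python
from statsmodels.stats.diagnostic import het_goldfeldquandt

f_stat, p_value, _ = het_goldfeldquandt(y, X, idx=1, drop=0.2)  # sort by column 1, drop middle 20%
print(f"Goldfeld-Quandt F = {f_stat:.3f}, p-value = {p_value:.3f}")
```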
Remedies of heteroskedasticity
1. Log-transformation of the data ⟶ a log-transformation
compresses the scales in which the variables are measured.
o So it helps to reduce the intensity of the heteroskedasticity
problem (see the sketch after this list). Obviously, this
method cannot be used where some variables take on zero or
negative values.
2. Using a suitable deflator, if available, to transform the
data series ⟶ the idea is to estimate the model using the
deflated variables so that more efficient estimates of the
parameters are obtained.


Remedies of heteroskedasticity
o But this process might lead to a 'spurious relationship'
between the variables when a common deflator is used
to deflate them.
3. When heteroskedasticity appears owing to the presence of
outliers, increasing the sample size might be helpful.
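A minimal sketch of the log-transformation remedy, assuming a hypothetical DataFrame `df` with strictly positive columns y, X1 and X2: estimating a log-log model compresses the scales and often dampens heteroskedasticity.

```python
import numpy as np
import statsmodels.api as sm

X_log = sm.add_constant(np.log(df[["X1", "X2"]]))     # only valid if all values are positive
log_results = sm.OLS(np.log(df["y"]), X_log).fit()
print(log_results.summary())
```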
3. Autocorrelation
o Autocorrelation refers to the internal correlation between
members of a series of observations ordered in time or space.
o Autocorrelation is a special case of correlation in which the
association is not between elements of two or more variables
but between successive values of one variable, whereas
correlation refers to the relationship between the values of
two or more different variables.
Sources of Autocorrelation
1. Omitted explanatory variables
o If an autocorrelated variable has been excluded from
the set of explanatory variables, then its influence will
be reflected in the random term U.
o If several autocorrelated explanatory variables are
omitted, the random term U may not be autocorrelated,
because the autocorrelation patterns of the omitted
variables may offset each other.
Sources of Autocorrelation
2. Mis-specification of the mathematical form of the
model
o If we use a mathematical form that differs from the
correct form of the relationship, the random term
may show serial correlation.
o Example: if we choose a linear function while the
correct form is non-linear, the values of U will be
correlated.
Sources of Autocorrelation
3. Mis-specification of the true random term U
o Many random factors, such as war, drought, weather
conditions and strikes, exert influences that are spread
over more than one period of time.
o Example: the effect of weather conditions on the
agricultural sector will influence the performance of other
economic variables for several periods in the future. A
strike in an organization affects the production process,
and the effect persists for several future periods. In such
cases the values of U become serially dependent, so that
autocorrelation arises.
Sources of Autocorrelation
4. Interpolation in the statistical observations
o Most time-series data involve some interpolation and
smoothing to remove seasonal effects, which averages the
true disturbances over successive time periods. As a
result, successive values of U are interrelated and show
an autocorrelation pattern.
Consequences of autocorrelation
1. The OLS estimators remain unbiased.
2. The OLS estimators are inefficient; hence they are not BLUE.
3. Standard errors are underestimated and t-statistics
inflated, which leads to accepting non-significant
variables as significant.
o The usual t-ratio and F-ratio tests give misleading
results.
o R² gives an overly optimistic view of the fit.
4. The estimated variances and covariances of the OLS
estimates are not valid.
o Predictions may have large variances.
5. Hypothesis tests are not valid.
o Confidence intervals are too narrow.
Detection of Autocorrelation
o The following are methods of detecting autocorrelation:
1. Graphical methods
2. Formal tests
1. Graphical methods
o There are two methods commonly used to obtain a rough
idea of the existence or absence of autocorrelation:
A. Plot the residuals against their own lag
B. Plot the residuals against time (t)
Detection of Autocorrelation
A. Plot the residuals against their own lag
o Since the residual ei = Yi − Ŷi is an estimate of the true
disturbance Ui, if the e's are found to be correlated this
suggests that the Ui are autocorrelated with each other.
o Autocorrelation may be positive or negative, but in most
practical cases it is positive. The main reason is that economic
variables tend to move in the same direction: in a boom, employment,
investment, output, GNP growth, consumption, etc. move upwards, and
the random term Ui follows the same pattern; in a recession all the
economic variables move downwards and the random term again
follows the same pattern.
Detection of Autocorrelation
B. Plot the residuals against time (t); there are two
possibilities.
i. If the signs of successive residuals (et and et−1)
change rapidly, we can say there is negative
autocorrelation.
ii. If the signs of successive residuals do not change
frequently, i.e. several positive values are followed
by several negative values, we can conclude that there
is positive autocorrelation.
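A minimal sketch of the two graphical checks, assuming a hypothetical 1-D NumPy array `resid` of OLS residuals ordered in time: a clear pattern in either plot suggests autocorrelation.

```python
import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.scatter(resid[:-1], resid[1:])          # residuals against their own lag: e_t vs e_{t-1}
ax1.set_xlabel("e(t-1)"); ax1.set_ylabel("e(t)")
ax2.plot(resid, marker="o")                 # residuals against time
ax2.set_xlabel("t"); ax2.set_ylabel("e(t)")
plt.show()
```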
Detection of Autocorrelation
2. Formal tests
1. The Durbin-Watson (DW) test
2. The von Neumann ratio
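A minimal sketch of the Durbin-Watson statistic with statsmodels, assuming a hypothetical fitted OLS result `results`: a value near 2 suggests no first-order autocorrelation, values well below 2 suggest positive autocorrelation, and values well above 2 suggest negative autocorrelation.

```python
from statsmodels.stats.stattools import durbin_watson

dw = durbin_watson(results.resid)
print(f"Durbin-Watson statistic = {dw:.3f}")
```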


Remedies for Autocorrelation
The appropriate remedy for autocorrelation depends on its source.
1. Include the omitted explanatory variables.
2. Use the appropriate mathematical form of the model.
3. If tests confirm a true autocorrelation problem, we may
use other estimation techniques such as GLS (see the
sketch below).
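A minimal sketch of a GLS-type remedy; the source only says "techniques such as GLS", so the AR(1) error structure and the use of statsmodels' GLSAR here are assumptions. It assumes hypothetical `y` and `X` (with a constant) ordered in time.

```python
import statsmodels.api as sm

gls_model = sm.GLSAR(y, X, rho=1)                     # GLS with AR(1) errors
gls_results = gls_model.iterative_fit(maxiter=10)     # Cochrane-Orcutt-style iteration
print(gls_results.summary())
```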

# END!!!!!!!
