Econometrics Chapter Four: PBMs
In the classical model we assumed that the disturbances $u_t$ have mean zero and constant variance, $var(u_t) = \sigma^2$, and that the errors corresponding to different observations are uncorrelated.
Now we address the following 'what if' questions in this chapter. What if the error variance is not constant over all observations? What if the different errors are correlated? What if the explanatory variables are correlated? We need to ask whether and when such violations of the basic classical assumptions are likely to occur. What are the consequences of such violations for the least squares estimators? How do we detect the presence of autocorrelation, heteroscedasticity, or multicollinearity? What are the remedial measures? In the subsequent sections, we attempt to answer these questions.
4.1 Heteroscedasticity
4.1.1 The Nature of Heteroscedasticity
In the classical linear regression model, one of the basic assumptions is that the probability distribution of the disturbance term remains the same over all observations of X; i.e. the variance of each $u_i$ is the same for all values of the explanatory variable. Symbolically,
$var(u_i) = E(u_i^2) = \sigma^2$, a constant for all i.
When this assumption does not hold, so that $var(u_i) = \sigma_i^2$ varies across observations, the disturbances are said to be heteroscedastic.
4.1.3. Reasons for Heteroscedasticity
There are several reasons why the variance of $u_i$ may vary. Some of these are:
1. Error-learning models: as people learn, their errors of behaviour become smaller over time. In this case $\sigma_i^2$ is expected to decrease. Example: as the number of hours of typing practice increases, the average number of typing errors, as well as their variance, decreases.
2. As data collection techniques improve, $\sigma_i^2$ is likely to decrease. Thus banks that have sophisticated data-processing equipment are likely to commit fewer errors in the monthly or quarterly statements of their customers than banks without such facilities.
3. Heteroscedasticity can also arise as a result of the presence of outliers. An outlier is an observation that is much different (either very small or very large) in relation to the other observations in the sample.
4.1.4. Consequences of Heteroscedasticity
1. The OLS estimators remain unbiased. For the model $Y_i = \alpha + \beta X_i + U_i$, the intercept estimator is $\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}$, so that
$E(\hat{\alpha}) = \alpha + \beta\bar{X} + E(\bar{U}) - E(\hat{\beta})\bar{X} = \alpha$, and similarly $E(\hat{\beta}) = \beta$;
i.e., the least squares estimators are unbiased even under the condition of heteroscedasticity. This is because the proof of unbiasedness does not make use of the homoscedasticity assumption.
2. Variance of OLS coefficients will be incorrect
Under homoscedasticity, $var(\hat{\beta}) = \sigma^2\sum k_i^2 = \dfrac{\sigma^2}{\sum x_i^2}$, but under the heteroscedastic assumption we shall have:
$var(\hat{\beta}) = \sum k_i^2\,var(Y_i) = \sum k_i^2\sigma_i^2 \neq \sigma^2\sum k_i^2$
$\sigma_i^2$ is no longer a finite constant; rather, it tends to change with the values of X, and hence cannot be taken outside the summation sign.
3. The OLS estimators will be inefficient: in other words, the OLS estimators do not have the smallest variance in the class of unbiased estimators and are therefore not efficient in either small or large samples. Under the heteroscedastic assumption, therefore:
$var(\hat{\beta})_{het} = \sum k_i^2\,var(Y_i) = \sum \dfrac{x_i^2}{(\sum x_i^2)^2}\,\sigma_i^2 = \dfrac{\sum x_i^2\sigma_i^2}{(\sum x_i^2)^2}$ ..............................(3.11)

Under homoscedasticity, $var(\hat{\beta})_{hom} = \dfrac{\sigma^2}{\sum x_i^2}$ ..............................(3.12)
These two variances are different. This implies that, under the heteroscedastic assumption, although the OLS estimator is unbiased, it is inefficient: its variance is larger than necessary. To see the consequence of using (3.12) instead of (3.11), let us assume that:
$\sigma_i^2 = k_i\sigma^2$
where the $k_i$ are some non-stochastic constant weights. This assumption merely states that the heteroscedastic variances are proportional to a constant $\sigma^2$, with weights $k_i$ that vary across observations. Substituting this value of $\sigma_i^2$ into (3.11), we obtain:
$var(\hat{\beta})_{het} = \dfrac{\sigma^2\sum k_i x_i^2}{(\sum x_i^2)^2} = \dfrac{\sigma^2}{\sum x_i^2}\cdot\dfrac{\sum k_i x_i^2}{\sum x_i^2} = var(\hat{\beta})_{hom}\cdot\dfrac{\sum k_i x_i^2}{\sum x_i^2}$ ..............................(3.13)
That is to say, if $x_i^2$ and $k_i$ are positively correlated, so that the second term of (3.13) is greater than 1, then $var(\hat{\beta})$ under heteroscedasticity will be greater than its variance under homoscedasticity. As a result, the true standard error of $\hat{\beta}$ will be underestimated if the homoscedastic formula is used. The t-value associated with it will accordingly be overestimated, which might lead to the conclusion that in a specific case at hand $\hat{\beta}$ is statistically significant (which in fact may not be true). Moreover, if we proceed with our model under the false belief of homoscedasticity of the error variance, our inference and prediction about the population coefficients will be incorrect.
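The practical import of equations (3.11)–(3.13) can be checked numerically. The sketch below is a minimal Monte Carlo illustration in Python (assuming numpy is available); the data-generating process with $\sigma_i^2 = \sigma^2 x_i^2$ and all parameter values are illustrative choices of ours, not taken from the text. It compares the conventional variance formula (3.12) with the empirical sampling variance of $\hat{\beta}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 5000
x = np.linspace(1.0, 10.0, n)          # fixed regressor values
xd = x - x.mean()                       # deviations from the mean
alpha, beta, sigma = 2.0, 0.5, 1.0

beta_hats, conv_vars = [], []
for _ in range(reps):
    # heteroscedastic errors: var(u_i) = sigma^2 * x_i^2 (illustrative assumption)
    u = rng.normal(0.0, sigma * x)
    y = alpha + beta * x + u
    yd = y - y.mean()
    b_hat = (xd @ yd) / (xd @ xd)       # OLS slope
    e = yd - b_hat * xd                 # residuals in deviation form
    s2 = (e @ e) / (n - 2)              # conventional estimate of sigma^2
    beta_hats.append(b_hat)
    conv_vars.append(s2 / (xd @ xd))    # homoscedastic formula (3.12)

print("empirical var of beta_hat :", np.var(beta_hats))
print("avg conventional variance :", np.mean(conv_vars))
# When sigma_i^2 rises with x_i, the conventional formula understates the true
# sampling variance, so t-ratios based on it are too large.
```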
4.1.5. Detecting Heteroscedasticity
We have observed that the consequences of heteroscedasticity for the OLS estimates are serious. As such, it is desirable to examine whether or not the regression model is in fact homoscedastic. There are two groups of methods for testing or detecting heteroscedasticity. These are:
i. informal methods
ii. formal methods
ii. Formal Methods
a. The Goldfeld-Quandt test
This popular method is applicable if one assumes that the heteroscedastic variance $\sigma_i^2$ is positively related to one of the explanatory variables in the regression model. For simplicity, consider the usual two-variable model:
$Y_i = \alpha + \beta X_i + U_i$
Suppose $\sigma_i^2$ is positively related to $X_i$ as:
$\sigma_i^2 = \sigma^2 X_i^2$ ..............................(3.15)
If equation (3.15) is appropriate, it would mean that $\sigma_i^2$ is larger, the larger the value of $X_i$. If that turns out to be the case, heteroscedasticity is most likely to be present in the model. To test this explicitly, Goldfeld and Quandt suggest the following steps:
Step 1: Order or rank the observations according to the values of $X_i$, beginning with the lowest X value.
Step 2: Omit c central observations, where c is specified a priori, and divide the remaining (n − c) observations into two groups, each of $\frac{n-c}{2}$ observations.
Step 3: Fit separate OLS regressions to the first $\frac{n-c}{2}$ observations and to the last $\frac{n-c}{2}$ observations, and obtain the respective residual sums of squares $RSS_1$ and $RSS_2$, with $RSS_1$ representing the RSS from the regression corresponding to the smaller $X_i$ values (the small-variance group) and $RSS_2$ that from the larger $X_i$ values (the large-variance group). Each of these RSS has $\frac{n-c}{2} - K$, or $\frac{n-c-2K}{2}$, df, where K is the number of parameters to be estimated, including the intercept term, and df is the degrees of freedom.
Step 4: Compute $\lambda = \dfrac{RSS_2/df}{RSS_1/df}$
If the $U_i$ are assumed to be normally distributed (which we usually do), and if the assumption of homoscedasticity is valid, then it can be shown that $\lambda$ follows the F distribution with numerator and denominator df each equal to $\frac{n-c-2K}{2}$:
$\lambda = \dfrac{RSS_2/[(n-c-2K)/2]}{RSS_1/[(n-c-2K)/2]} \sim F_{\left(\frac{n-c}{2}-K,\;\frac{n-c}{2}-K\right)}$
If in an application the computed $\lambda$ (= F) is greater than the critical F at the chosen level of significance, we can reject the hypothesis of homoscedasticity, i.e. we can say that heteroscedasticity is very likely.
Example: to illustrate the Goldfeld-Quandt test, we present in table 3.1 data on consumption expenditure in relation to income for a cross-section of 30 families. Suppose we postulate that consumption expenditure is linearly related to income but that heteroscedasticity is present in the data. We further postulate that the nature of the heteroscedasticity is as given in equation (3.15) above. The necessary reordering of the data for the application of the test is also presented in table 3.1.
Table 3.1 Hypothetical data on consumption expenditure Y($) and income X($). (Data ranked by
X values)
Y    X    (original order)        Y    X    (ranked by X values)
55 80 55 80
65 100 70 85
70 85 75 90
80 110 65 100
79 120 74 105
84 115 80 110
98 130 84 115
95 140 79 120
90 125 90 125
75 90 98 130
74 105 95 140
110 160 108 145
113 150 113 150
125 165 110 160
108 145 125 165
115 180 115 180
140 225 130 185
120 200 135 190
145 240 120 200
130 185 140 205
152 220 144 210
144 210 152 220
175 245 140 225
180 260 137 230
135 190 145 240
140 205 175 245
178 265 189 250
191 270 180 260
137 230 178 265
189 250 191 270
Dropping the middle four observations, the OLS regressions based on the first 13 and the last 13 observations and their associated residual sums of squares are shown next (standard errors in parentheses). Regression based on the first 13 observations:
Yi = 3.4094 + 0.6968 X i + ei
(8.7049) (0.0744)
R 2 = 0.8887
RSS1 = 377.17
df = 11
Regression based on the last 13 observations
Yi = −28.0272 + 0.7941 X i + ei
(30.6421) (0.1319)
R 2 = 0.7681
RSS 2 = 1536.8
df = 11
From these results we obtain: $\lambda = \dfrac{RSS_2/df}{RSS_1/df} = \dfrac{1536.8/11}{377.17/11} = 4.07$
The critical F value for 11 numerator and 11 denominator df at the 5% level is 2.82. Since the estimated F (= $\lambda$) value exceeds the critical value, we may conclude that there is heteroscedasticity in the error variance. However, if the level of significance is fixed at 1%, we may not reject the assumption of homoscedasticity (why?). Note that the p-value of the observed $\lambda$ is 0.014.
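As a check on the hand computation above, the following sketch (Python, assuming numpy and scipy are installed) carries out the same Goldfeld-Quandt steps on the table 3.1 data: rank by X, drop the 4 central observations, fit OLS to each half and form the F ratio. It should reproduce, up to rounding, the RSS values and the $\lambda$ reported above.

```python
import numpy as np
from scipy import stats

# Table 3.1: consumption Y and income X for 30 families (original order)
Y = np.array([55, 65, 70, 80, 79, 84, 98, 95, 90, 75, 74, 110, 113, 125, 108,
              115, 140, 120, 145, 130, 152, 144, 175, 180, 135, 140, 178, 191, 137, 189])
X = np.array([80, 100, 85, 110, 120, 115, 130, 140, 125, 90, 105, 160, 150, 165, 145,
              180, 225, 200, 240, 185, 220, 210, 245, 260, 190, 205, 265, 270, 230, 250])

def rss(y, x):
    """Residual sum of squares from a simple OLS regression of y on x."""
    b = np.cov(x, y, bias=True)[0, 1] / np.var(x)
    a = y.mean() - b * x.mean()
    e = y - a - b * x
    return e @ e

order = np.argsort(X)                  # Step 1: rank observations by X
Y_r, X_r = Y[order], X[order]
c = 4                                   # Step 2: omit 4 central observations
half = (len(X) - c) // 2                # 13 observations in each group

rss1 = rss(Y_r[:half], X_r[:half])      # small-X (small variance) group
rss2 = rss(Y_r[-half:], X_r[-half:])    # large-X (large variance) group
df = half - 2                           # (n - c)/2 - K, with K = 2
lam = (rss2 / df) / (rss1 / df)         # Step 4: the F ratio

print(f"RSS1 = {rss1:.2f}, RSS2 = {rss2:.2f}, lambda = {lam:.2f}")
print(f"5% critical F({df},{df}) = {stats.f.ppf(0.95, df, df):.2f}")
print(f"p-value = {stats.f.sf(lam, df, df):.3f}")
```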
There are also other tests for heteroscedasticity, such as Spearman's rank correlation test, the Breusch-Pagan-Godfrey test and White's general heteroscedasticity test. Read about them on your own.
4.1.6. Remedial Measures
If we apply OLS to a model in which $var(u_i)$ is not constant, the result, as shown above, is inefficient parameter estimates.
The remedial measure is to transform the model so that the transformed model satisfies all the assumptions of the classical regression model, including homoscedasticity. Applying OLS to the transformed variables is known as the method of Generalized Least Squares (GLS). In short, GLS is OLS on transformed variables that satisfy the standard least squares assumptions. The estimators thus obtained are known as GLS estimators, and it is these estimators that are BLUE.
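As an illustration of the transformation idea, suppose, as in the Goldfeld-Quandt illustration above, that $var(u_i)$ is proportional to $X_i^2$. Dividing the whole equation by $X_i$ then yields a model whose error $u_i/X_i$ has constant variance, so OLS on the transformed variables is GLS (here a weighted least squares). The sketch below is one way to carry this out in Python with numpy; the data are hypothetical and the proportional-variance assumption is only an example of a known error structure.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
X = rng.uniform(1.0, 10.0, n)
u = rng.normal(0.0, 2.0 * X)                    # heteroscedastic: var(u_i) proportional to X_i^2
Y = 5.0 + 1.5 * X + u

# Transformed model: Y/X = alpha*(1/X) + beta + u/X, whose error is homoscedastic.
Y_star = Y / X
Z = np.column_stack([1.0 / X, np.ones(n)])      # regressors: 1/X and a constant
coef, *_ = np.linalg.lstsq(Z, Y_star, rcond=None)
alpha_gls, beta_gls = coef
print(f"GLS (WLS) estimates: alpha = {alpha_gls:.2f}, beta = {beta_gls:.2f}")

# For comparison, plain OLS on the untransformed data:
W = np.column_stack([np.ones(n), X])
(alpha_ols, beta_ols), *_ = np.linalg.lstsq(W, Y, rcond=None)
print(f"OLS estimates:       alpha = {alpha_ols:.2f}, beta = {beta_ols:.2f}")
# Both are unbiased; the GLS estimates are the efficient ones under this error structure.
```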
4.2 Autocorrelation
4.2.1 The Nature of Autocorrelation
In our discussion of the simple and multiple regression models, one of the assumptions of the classicals is that $cov(u_i, u_j) = E(u_i u_j) = 0$ for $i \neq j$, which implies that successive values of the disturbance term U are temporally independent, i.e. a disturbance occurring at one point of observation is not related to any other disturbance. This means that when observations are made over time, the effect of a disturbance occurring at one period does not carry over into another period.
If the above assumption is not satisfied, that is, if the value of U in any particular period is correlated with its own preceding value(s), we say there is autocorrelation of the random variables. Hence, autocorrelation is defined as a 'correlation' between members of a series of observations ordered in time or space.
There are several reasons why serial or autocorrelation arises. Some of these are:
a. Cyclical fluctuations
Time series such as GNP, price index, production, employment and unemployment exhibit
business cycle. Starting at the bottom of recession, when economic recovery starts, most of these
series move upward. In this upswing, the value of a series at one point in time is greater than its
previous value. Thus, there is a momentum built in to them, and it continues until something
happens (e.g. increase in interest rate or tax) to slowdown them. Therefore, regression involving
time series data, successive observations are likely to be interdependent.
b. Specification bias
Let us see, one by one, how specification biases cause autocorrelation.
i. Exclusion of variables: as we have discussed in chapter one, there are several sources of the random disturbance term ($u_i$). One of these is the exclusion of variable(s) from the model. If the excluded variable itself changes systematically, the error term that absorbs it will show a systematic change as this variable changes. For example, suppose the correct demand model is given by:
$y_t = \beta_0 + \beta_1 x_{1t} + \beta_2 x_{2t} + \beta_3 x_{3t} + U_t$ ..............................(3.21)
where y is the consumption of beef, $x_3$ is the price of pork and $x_1$ and $x_2$ are other determinants of beef demand. Suppose, however, that we run the model omitting $x_3$:
$y_t = \beta_0 + \beta_1 x_{1t} + \beta_2 x_{2t} + V_t$ ..............................(3.22)
Now, if equation 3.21 is the 'correct' model or true relation, running equation 3.22 is tantamount to letting $V_t = \beta_3 x_{3t} + U_t$. And to the extent that the price of pork affects the consumption of beef, the error or disturbance term V will reflect a systematic pattern, thus creating autocorrelation. A simple test of this would be to run both equation 3.21 and equation
3.22 and see whether the autocorrelation, if any, observed in equation 3.22 disappears when equation 3.21 is run. The actual mechanics of detecting autocorrelation will be discussed later.
ii. Incorrect functional form: this is also one source of autocorrelation in the error term. Suppose the 'true' or correct model in a cost-output study is as follows:
$Marginal\ cost_i = \beta_0 + \beta_1\, output_i + \beta_2\, output_i^2 + U_i$ ..............................(3.23)
However, we incorrectly fit the following model:
$Marginal\ cost_i = \alpha_1 + \alpha_2\, output_i + V_i$ ..............................(3.24)
The marginal cost curve corresponding to the ‘true’ model is shown in the figure below along
with the ‘incorrect’ linear cost curve.
As the figure shows, between points A and B the linear marginal cost curve will consistently overestimate the true marginal cost, whereas outside these points it will consistently underestimate the true marginal cost. This result is to be expected because the disturbance term $V_i$ is, in fact, equal to $\beta_2\, output_i^2 + U_i$, and hence will catch the systematic effect of the $output^2$ term on marginal cost. In this case, $V_i$ will reflect autocorrelation because of the use of an incorrect functional form.
iii. Neglecting lagged terms from the model: if the dependent variable of a regression model is affected by the lagged value of itself or of an explanatory variable, and this lagged value is not included in the model, the error term of the incorrect model will reflect a systematic pattern, which indicates autocorrelation in the model. Suppose the correct model for consumption expenditure is:
$C_t = \alpha + \beta_1 y_t + \beta_2 y_{t-1} + U_t$ ..............................(3.25)
but we instead fit
$C_t = \alpha + \beta_1 y_t + V_t$ ..............................(3.26)
Then $V_t = \beta_2 y_{t-1} + U_t$, and the error term of the incorrect model will show a systematic pattern, reflecting autocorrelation.
Autocorrelation, as stated earlier, is a kind of lag correlation between successive values of the same variable. Thus, we treat autocorrelation in the same way as correlation in general. The simplest case is termed here autocorrelation of the first order. In other words, if the value of U in any particular period depends on its own value in the preceding period alone, we say that the U's follow a first order autoregressive scheme AR(1) (or first order Markov scheme), i.e.
$u_t = f(u_{t-1})$ ..............................(3.28)
If $u_t$ depends on the values of the two preceding periods, i.e. $u_t = f(u_{t-1}, u_{t-2})$, this form of autocorrelation is called a second order autoregressive scheme, and so on.
Generally, when autocorrelation is present, we assume the simplest form, first order autocorrelation:
$u_t = \rho u_{t-1} + v_t$ ..............................(3.30)
where $\rho$ is the coefficient of autocorrelation and $v$ is a random variable satisfying all the basic assumptions of ordinary least squares.
$\hat{\rho} = \dfrac{\sum_{t=2}^{n} u_t u_{t-1}}{\sum_{t=2}^{n} u_{t-1}^2}$ ..............................(3.31)
Given that for large samples $\sum u_t^2 \approx \sum u_{t-1}^2$, we observe that the coefficient of autocorrelation $\rho$ represents a simple correlation coefficient r:
$\hat{\rho} = \dfrac{\sum_{t=2}^{n} u_t u_{t-1}}{\sum_{t=2}^{n} u_{t-1}^2} = \dfrac{\sum u_t u_{t-1}}{\sqrt{\sum u_t^2}\sqrt{\sum u_{t-1}^2}} = r_{u_t u_{t-1}}$ ..............................(3.32)
$-1 \leq \hat{\rho} \leq 1$ since $-1 \leq r \leq 1$ ..............................(3.33)
This proves the statement that 'we can treat autocorrelation in the same way as correlation in general': from our statistics background we know that a simple correlation coefficient always lies between −1 and +1.
Our objective here is to obtain the value of $u_t$ in terms of the autocorrelation coefficient $\rho$ and the random variable $v_t$. The complete form of the first order autoregressive scheme may be set out as follows:
$U_t = f(U_{t-1}) = \rho U_{t-1} + v_t$
$U_{t-1} = f(U_{t-2}) = \rho U_{t-2} + v_{t-1}$
$U_{t-2} = f(U_{t-3}) = \rho U_{t-3} + v_{t-2}$
$U_{t-r} = f(U_{t-(r+1)}) = \rho U_{t-(r+1)} + v_{t-r}$
We make use of the above relations to perform continuous substitution in $U_t = \rho U_{t-1} + v_t$, as follows:
$U_t = \rho U_{t-1} + v_t$
$\;\; = \rho(\rho U_{t-2} + v_{t-1}) + v_t$, since $U_{t-1} = \rho U_{t-2} + v_{t-1}$
$\;\; = \rho^2 U_{t-2} + \rho v_{t-1} + v_t$
$\;\; = \rho^2(\rho U_{t-3} + v_{t-2}) + \rho v_{t-1} + v_t$
$\;\; = \rho^3 U_{t-3} + \rho^2 v_{t-2} + \rho v_{t-1} + v_t$
In this way, if we continue the substitution process for r periods (assuming that r is very large), we shall obtain:
$U_t = v_t + \rho v_{t-1} + \rho^2 v_{t-2} + \rho^3 v_{t-3} + \cdots$ ..............................(3.35)
Since $\rho^r \to 0$ as r grows (because $|\rho| < 1$), this can be written compactly as
$u_t = \sum_{r=0}^{\infty}\rho^r v_{t-r}$ ..............................(3.36)
Now, using this value of ut , let’s compute its mean, variance and covariance
1. To obtain mean:
$E(U_t) = E\left(\sum_{r=0}^{\infty}\rho^r v_{t-r}\right) = \sum_{r=0}^{\infty}\rho^r E(v_{t-r}) = 0$, since $E(v_{t-r}) = 0$ ..............................(3.37)
In other words, we found that the mean of autocorrelated U’s turns out to be zero.
2. To obtain the variance:
By the definition of variance (and because the v's are serially uncorrelated, so all cross-product terms vanish),
$var(U_t) = E\left(\sum_{r=0}^{\infty}\rho^r v_{t-r}\right)^2 = \sum_{r=0}^{\infty}(\rho^r)^2 E(v_{t-r}^2) = \sum_{r=0}^{\infty}\rho^{2r}\,var(v_{t-r})$
$\;\; = \sigma_v^2\sum_{r=0}^{\infty}\rho^{2r} = \sigma_v^2(1 + \rho^2 + \rho^4 + \rho^6 + \cdots) = \sigma_v^2\cdot\dfrac{1}{1-\rho^2}$
$var(U_t) = \dfrac{\sigma_v^2}{1-\rho^2}$ ..............................(3.38), since $|\rho| < 1$
Thus, the variance of the autocorrelated $u_i$ is $\dfrac{\sigma_v^2}{1-\rho^2}$, which is a constant value. From the above, the variance of $U_i$ depends on the nature of the variance of $v_i$: if the variance of $v_i$ is homoscedastic, $U_i$ is homoscedastic, and if $v_i$ is heteroscedastic, $U_i$ is heteroscedastic.
3. To obtain covariance:
$cov(U_t, U_{t-1}) = E(U_t U_{t-1})$ ..............................(3.39)
since $U_t = v_t + \rho v_{t-1} + \rho^2 v_{t-2} + \cdots$
and $U_{t-1} = v_{t-1} + \rho v_{t-2} + \rho^2 v_{t-3} + \cdots$
Substituting the above two equations into equation 3.39, we obtain
$cov(U_t, U_{t-1}) = E[(v_t + \rho v_{t-1} + \rho^2 v_{t-2} + \cdots)(v_{t-1} + \rho v_{t-2} + \rho^2 v_{t-3} + \cdots)]$
$\;\; = E[\{v_t + \rho(v_{t-1} + \rho v_{t-2} + \cdots)\}(v_{t-1} + \rho v_{t-2} + \rho^2 v_{t-3} + \cdots)]$
$\;\; = E[v_t(v_{t-1} + \rho v_{t-2} + \cdots)] + \rho E[(v_{t-1} + \rho v_{t-2} + \cdots)^2]$, and since $E(v_t v_{t-r}) = 0$,
$\;\; = 0 + \rho E[(v_{t-1} + \rho v_{t-2} + \cdots)^2]$
$\;\; = \rho E(v_{t-1}^2 + \rho^2 v_{t-2}^2 + \cdots + \text{cross products})$
$\;\; = \rho(\sigma_v^2 + \rho^2\sigma_v^2 + \cdots + 0)$
$\;\; = \rho\sigma_v^2(1 + \rho^2 + \rho^4 + \cdots)$
$\;\; = \dfrac{\rho\sigma_v^2}{1-\rho^2}$, since $|\rho| < 1$ ..............................(3.40)
$cov(U_t, U_{t-1}) = \rho\dfrac{\sigma_v^2}{1-\rho^2} = \rho\sigma_u^2$ ..............................(3.41)
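The moments derived in equations (3.37)–(3.41) can be verified by simulation. The following sketch (Python with numpy; the values of $\rho$ and $\sigma_v$ are arbitrary illustrative choices) generates a long AR(1) series $u_t = \rho u_{t-1} + v_t$ and compares its sample mean, variance and first-order autocovariance with the theoretical values.

```python
import numpy as np

rng = np.random.default_rng(42)
rho, sigma_v, T = 0.7, 1.0, 200_000

v = rng.normal(0.0, sigma_v, T)
u = np.empty(T)
u[0] = v[0]
for t in range(1, T):                      # u_t = rho*u_{t-1} + v_t
    u[t] = rho * u[t - 1] + v[t]

var_theory = sigma_v**2 / (1 - rho**2)     # equation (3.38)
cov_theory = rho * var_theory              # equation (3.41)

print("mean of u       :", u.mean(), "(theory: 0)")
print("var of u        :", u.var(), "(theory:", var_theory, ")")
print("cov(u_t, u_t-1) :", np.mean(u[1:] * u[:-1]), "(theory:", cov_theory, ")")
```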
We have seen that the ordinary least squares technique is based on a set of basic assumptions, some of which concern the mean, variance and covariance of the disturbance term. Naturally, therefore, if these assumptions do not hold, for whatever reason, the
estimators derived by the OLS procedure may not be efficient. We are now in a position to examine the effect of autocorrelation on the OLS estimators. The following are the effects on the estimators when the OLS method is applied in the presence of autocorrelation in the data.
1. The OLS estimators remain unbiased: we know that $\hat{\beta} = \beta + \sum k_i u_i$, and taking expectations still gives $E(\hat{\beta}) = \beta$.
2. The variance of the estimate $\hat{\beta}$ in the simple regression model will be biased downwards (i.e. underestimated) when the u's are autocorrelated.
3. If $var(\hat{\beta})$ is underestimated, $SE(\hat{\beta})$ is also underestimated, and this makes the t-ratio large. Such a large t-ratio may lead us to declare a coefficient statistically significant when in fact it is not.
4. A wrong testing procedure will, in turn, lead to wrong predictions and inferences about the characteristics of the population.
Different econometricians and statisticians suggest different testing methods, but the most frequently and widely used method among researchers is the following.
A. The Durbin-Watson d test: the most celebrated test for detecting serial correlation is the one developed by the statisticians Durbin and Watson. It is popularly known as the Durbin-Watson d statistic, which is defined as:
$d = \dfrac{\sum_{t=2}^{n}(e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$ ..............................(3.47)
Note that in the numerator of the d statistic the number of observations is n − 1, because one observation is lost in taking successive differences.
The use of the d statistic rests on the following assumptions:
1. The regression model includes an intercept term. If such a term is not present, as in the case of regression through the origin, it is essential to rerun the regression including the intercept term to obtain the RSS.
2. The explanatory variables, the X’s, are non-stochastic, or fixed in repeated sampling.
3. The disturbances $U_t$ are generated by the first order autoregressive scheme:
$U_t = \rho U_{t-1} + v_t$
4. The regression model does not include the lagged value of Y, the dependent variable, as one of the explanatory variables. Thus, the test is inapplicable to models of the following type:
$y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + \cdots + \beta_k X_{kt} + \gamma y_{t-1} + U_t$
where $y_{t-1}$ is the one-period lagged value of y; such models are known as autoregressive models. If the d test is mistakenly applied to them, the value of d will often be around 2, which is the value of d in the absence of first order autocorrelation. Durbin developed the so-called h-statistic to test serial correlation in such autoregressive models.
5. There are no missing observations in the data.
In using the Durbin-Watson test, it is therefore important to note that it cannot be applied if any of the above five assumptions is violated.
From equation 3.47, the value of
$d = \dfrac{\sum_{t=2}^{n}(e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$
Expanding the squared term in the numerator,
$d = \dfrac{\sum e_t^2 + \sum e_{t-1}^2 - 2\sum e_t e_{t-1}}{\sum e_t^2}$
Thus, since for large n $\sum e_t^2 \approx \sum e_{t-1}^2$,
$d \approx 2\left(1 - \dfrac{\sum e_t e_{t-1}}{\sum e_t^2}\right)$
but $\hat{\rho} = \dfrac{\sum e_t e_{t-1}}{\sum e_t^2}$ from equation (3.31), hence
$d \approx 2(1 - \hat{\rho})$
From the above relation, therefore:
if $\hat{\rho} = 0$, then $d \approx 2$
if $\hat{\rho} = 1$, then $d \approx 0$
if $\hat{\rho} = -1$, then $d \approx 4$
Thus we obtain two important conclusions:
i. The values of d lie between 0 and 4.
ii. If there is no autocorrelation ($\hat{\rho} = 0$), then $d \approx 2$.
Whenever, therefore, the calculated value of d turns out to be sufficiently close to 2, we accept the null hypothesis of no autocorrelation, and if it is close to zero or four, we reject that null hypothesis.
However, because the exact critical value of d is never known, there exist ranges of values within which we can either accept or reject the null hypothesis; we do not have a unique critical value of the d statistic. Instead, we have $d_L$ (a lower bound) and $d_U$ (an upper bound) for the critical values of d with which to accept or
reject the null hypothesis. For the two-tailed Durbin-Watson test, we can set out five regions for the values of d (see the graphical presentation in your text).
The mechanics of the D.W. test are as follows, assuming that the assumptions underlying the test are fulfilled:
➢ Obtain the computed value of d using the formula given in equation 3.47.
➢ For the given sample size and given number of explanatory variables, find the critical $d_L$ and $d_U$ values.
➢ Then apply the following decision rules:
1. If d is less than $d_L$ or greater than $(4 - d_L)$, reject the null hypothesis of no autocorrelation.
2. If d lies between $d_U$ and $(4 - d_U)$, accept the null hypothesis of no autocorrelation.
3. If, however, the value of d lies between $d_L$ and $d_U$, or between $(4 - d_U)$ and $(4 - d_L)$, the D.W. test is inconclusive.
Example 1. Suppose that, for a given regression, the critical values at the chosen level of significance are $d_L = 1.37$ and $d_U = 1.50$, so that $(4 - d_L) = 4 - 1.37 = 2.63$ and $(4 - d_U) = 4 - 1.50 = 2.50$. We compare the computed value of d with $d_L$, $d_U$, $(4 - d_L)$ and $(4 - d_U)$. Since the computed d turned out to be less than $d_L$, we reject the null hypothesis of no autocorrelation.
Example 2. Consider the model $Y_t = \alpha + \beta X_t + U_t$ with the following observations on X and Y:
X 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Y 2 2 2 1 3 5 6 6 10 10 10 12 15 10 11
Solution:
1. Regress Y on X, i.e. $Y_t = \alpha + \beta X_t + U_t$:
$\hat{\beta} = \dfrac{\sum xy}{\sum x^2} = \dfrac{255}{280} = 0.91$
$\hat{Y} = -0.29 + 0.91X$
2. Compute the residuals $e_t = Y_t - \hat{Y}_t$ and the Durbin-Watson statistic:
$d = \dfrac{\sum(e_t - e_{t-1})^2}{\sum e_t^2} = \dfrac{60.213}{41.767} = 1.442$
The values of $d_L$ and $d_U$ at the 5% level of significance, with n = 15 and one explanatory variable, are $d_L = 1.08$ and $d_U = 1.36$, so that $(4 - d_U) = 2.64$.
$d_U < d^* < 4 - d_U$, i.e. $1.36 < 1.442 < 2.64$, where $d^* = 1.442$.
Since $d^*$ lies between $d_U$ and $4 - d_U$, we accept $H_0$. This implies that there is no evidence of autocorrelation in the data.
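The arithmetic of Example 2 can be reproduced with a few lines of Python (numpy assumed):

```python
import numpy as np

X = np.arange(1, 16)
Y = np.array([2, 2, 2, 1, 3, 5, 6, 6, 10, 10, 10, 12, 15, 10, 11])

# OLS slope and intercept: beta = sum(xy)/sum(x^2) in deviation form
x, y = X - X.mean(), Y - Y.mean()
beta = (x @ y) / (x @ x)                       # 255/280 = 0.91
alpha = Y.mean() - beta * X.mean()             # about -0.29

e = Y - alpha - beta * X                       # residuals
d = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)   # Durbin-Watson d, equation (3.47)
rho_hat = 1 - d / 2                            # implied rho, from d = 2(1 - rho)
print(f"beta = {beta:.2f}, alpha = {alpha:.2f}, d = {d:.3f}, rho_hat = {rho_hat:.2f}")
```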
Although the D.W. test is extremely popular, it has one great drawback: if the computed d falls in the inconclusive zone or region, one cannot conclude whether autocorrelation does or does not exist. Several authors have proposed modifications of the D.W. test.
In many situations, however, it has been found that the upper limit $d_U$ is approximately the true significance limit. Thus, the modified D.W. test is based on $d_U$: in case the estimated d value lies in the inconclusive zone, one can use the following modified d test procedure. Given the level of significance $\alpha$:
1. $H_0: \rho = 0$ versus $H_1: \rho > 0$; reject $H_0$ at level $\alpha$ if $d < d_U$ (statistically significant positive autocorrelation).
2. $H_0: \rho = 0$ versus $H_1: \rho < 0$; reject $H_0$ at level $\alpha$ if $(4 - d) < d_U$ (statistically significant negative autocorrelation).
3. $H_0: \rho = 0$ versus $H_1: \rho \neq 0$; reject $H_0$ at level $2\alpha$ if $d < d_U$ or $(4 - d) < d_U$.
Since in the presence of serial correlation the OLS estimators are inefficient, it is essential to seek remedial measures. The remedy, however, depends on what knowledge one has about the nature of the interdependence among the disturbances, that is, on whether the coefficient of autocorrelation $\rho$ is known or not known.
A. When $\rho$ is known: when the structure of autocorrelation is known, i.e. $\rho$ is known, the appropriate corrective procedure is to transform the original model or data so that the error term of the transformed model is non-autocorrelated.
B. When $\rho$ is not known: we first estimate the coefficient of autocorrelation and then apply the appropriate measure accordingly.
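When $\rho$ is known (or has been estimated), the usual transformation is quasi-differencing: subtracting $\rho$ times the lagged equation gives $Y_t - \rho Y_{t-1} = \alpha(1-\rho) + \beta(X_t - \rho X_{t-1}) + v_t$, whose error $v_t$ is serially uncorrelated. Below is a minimal sketch of this idea in Python; the data are hypothetical and $\rho$ is treated as known, which is an assumption made only for the illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
T, rho = 200, 0.6
X = rng.uniform(0.0, 10.0, T)

# Generate AR(1) errors and the dependent variable
u = np.zeros(T)
for t in range(1, T):
    u[t] = rho * u[t - 1] + rng.normal()
Y = 3.0 + 2.0 * X + u

# Quasi-difference (dropping the first observation): Y* = Y_t - rho*Y_{t-1}, X* likewise
Y_star = Y[1:] - rho * Y[:-1]
X_star = X[1:] - rho * X[:-1]

# OLS on the transformed data; the intercept estimates alpha*(1 - rho)
Z = np.column_stack([np.ones(T - 1), X_star])
(c0, beta_hat), *_ = np.linalg.lstsq(Z, Y_star, rcond=None)
alpha_hat = c0 / (1 - rho)
print(f"alpha_hat = {alpha_hat:.2f}, beta_hat = {beta_hat:.2f}")
# With rho unknown, one could estimate it from the OLS residuals (e.g. rho_hat = 1 - d/2)
# and iterate; this is the idea behind the Cochrane-Orcutt type procedures.
```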
4.3 Multicollinearity
4.3.1 The Nature of Multicollinearity
Originally, multicollinearity meant the existence of a 'perfect', or exact, linear relationship among some or all explanatory variables of a regression model. For the k-variable regression involving explanatory variables $x_1, x_2, \ldots, x_k$, an exact linear relationship is said to exist if the following condition is satisfied:
$\lambda_1 x_1 + \lambda_2 x_2 + \cdots + \lambda_k x_k = 0$ ..............................(1)
where $\lambda_1, \lambda_2, \ldots, \lambda_k$ are constants that are not all zero simultaneously.
Today, however, the term multicollinearity is used in a broader sense to include the case of perfect multicollinearity, as shown by (1), as well as the case where the x-variables are intercorrelated but not perfectly so, as follows:
$\lambda_1 x_1 + \lambda_2 x_2 + \cdots + \lambda_k x_k + v_i = 0$ ..............................(2)
where $v_i$ is a stochastic error term.
4.3.2. Consequences of Multicollinearity
1. If multicollinearity is perfect, the regression coefficients of the X variables are indeterminate and their standard errors are infinite.
Proof: consider a multiple regression model with two explanatory variables, where the dependent and independent variables are expressed in deviation form as follows.
$y_i = \hat{\beta}_1 x_{1i} + \hat{\beta}_2 x_{2i} + e_i$
Recall the formulas of $\hat{\beta}_1$ and $\hat{\beta}_2$ from our discussion of multiple regression:
$\hat{\beta}_1 = \dfrac{\sum x_1 y\sum x_2^2 - \sum x_2 y\sum x_1 x_2}{\sum x_1^2\sum x_2^2 - (\sum x_1 x_2)^2}$
$\hat{\beta}_2 = \dfrac{\sum x_2 y\sum x_1^2 - \sum x_1 y\sum x_1 x_2}{\sum x_1^2\sum x_2^2 - (\sum x_1 x_2)^2}$
Assume $x_2 = \lambda x_1$ ..............................(3.32)
where $\lambda$ is a non-zero constant. Substituting (3.32) into the $\hat{\beta}_1$ formula above:
$\hat{\beta}_1 = \dfrac{\sum x_1 y\,\lambda^2\sum x_1^2 - \lambda\sum x_1 y\,\lambda\sum x_1^2}{\sum x_1^2\,\lambda^2\sum x_1^2 - (\lambda\sum x_1^2)^2} = \dfrac{0}{0}$,
which is indeterminate.
Applying the same procedure, we obtain a similar result (an indeterminate value) for $\hat{\beta}_2$. Likewise, from our discussion of the multiple regression model, the variance of $\hat{\beta}_1$ is given by:
$var(\hat{\beta}_1) = \dfrac{\sigma^2\sum x_2^2}{\sum x_1^2\sum x_2^2 - (\sum x_1 x_2)^2}$
Substituting $x_2 = \lambda x_1$ into the above variance formula, we get:
$var(\hat{\beta}_1) = \dfrac{\sigma^2\lambda^2\sum x_1^2}{\lambda^2(\sum x_1^2)^2 - \lambda^2(\sum x_1^2)^2} = \dfrac{\sigma^2\lambda^2\sum x_1^2}{0}$,
which is infinite.
These are the consequences of perfect multicollinearity. One may ask about the consequences of less than perfect correlation. In cases of near or high multicollinearity, one is likely to encounter the following consequences.
2. If multicollinearity is less than perfect (i.e. near or high multicollinearity), the regression coefficients are determinate.
Proof: Consider the two explanatory variables model above in deviation form.
The assumption $x_2 = \lambda x_1$ indicates perfect correlation between $x_1$ and $x_2$, because the change in $x_2$ is completely due to the change in $x_1$. Instead of exact multicollinearity, we may have:
$x_{2i} = \lambda x_{1i} + v_i$
where $\lambda \neq 0$ and $v_i$ is a stochastic error term such that $\sum x_{1i}v_i = 0$. In this case $x_2$ is not only determined by $x_1$ but is also affected by other factors captured by $v_i$ (the stochastic error term).
Substituting $x_{2i} = \lambda x_{1i} + v_i$ into the formula for $\hat{\beta}_1$ above:
$\hat{\beta}_1 = \dfrac{\sum x_1 y(\lambda^2\sum x_1^2 + \sum v_i^2) - (\lambda\sum x_1 y + \sum y v_i)\,\lambda\sum x_1^2}{\sum x_1^2(\lambda^2\sum x_1^2 + \sum v_i^2) - (\lambda\sum x_1^2)^2} \neq \dfrac{0}{0}$,
i.e. determinate. This proves that if we have less than perfect multicollinearity, the OLS coefficients are determinate.
The implication of the indeterminacy of the regression coefficients in the case of perfect multicollinearity is that it is not possible to observe the separate influences of $x_1$ and $x_2$. But such an extreme case is not very frequent in practical applications. Most data exhibit less than perfect multicollinearity.
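A quick numerical illustration of points 1 and 2 (a toy example of our own in Python, with simulated data) shows that when $x_2$ is an exact multiple of $x_1$ the cross-product matrix of the normal equations is singular, so no unique coefficients exist, while a small random disturbance in $x_2$ restores determinate (if imprecise) estimates.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 30
x1 = rng.normal(0.0, 1.0, n)
y = 1.0 * x1 + rng.normal(0.0, 1.0, n)       # data roughly in deviation form

def normal_matrix(x1, x2):
    """The 2x2 matrix of the normal equations for a two-regressor model."""
    X = np.column_stack([x1, x2])
    return X.T @ X

# Perfect collinearity: x2 = 2*x1 -> determinant of the normal matrix is (numerically) zero
x2_perfect = 2.0 * x1
print("det, perfect collinearity :", np.linalg.det(normal_matrix(x1, x2_perfect)))

# Near collinearity: x2 = 2*x1 + small noise -> determinate but imprecise estimates
x2_near = 2.0 * x1 + rng.normal(0.0, 0.05, n)
X = np.column_stack([x1, x2_near])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print("det, near collinearity    :", np.linalg.det(normal_matrix(x1, x2_near)))
print("coefficients (determinate):", b)
```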
3. If multicollinearity is less than perfect (i.e. near or high multicollinearity), the OLS estimators retain the property of BLUE.
Explanation: while we were proving the BLUE property of the OLS estimators in the simple and multiple regression models, we did not make use of the assumption of no multicollinearity. Hence, as long as the basic assumptions needed to prove the BLUE property are not violated, the OLS estimators are BLUE whether multicollinearity exists or not.
4. Although BLUE, the OLS estimators have large variances and covariances.
$var(\hat{\beta}_1) = \dfrac{\sigma^2\sum x_2^2}{\sum x_1^2\sum x_2^2 - (\sum x_1 x_2)^2}$
Dividing the numerator and the denominator by $\sum x_2^2$,
$var(\hat{\beta}_1) = \dfrac{\sigma^2}{\sum x_1^2 - \dfrac{(\sum x_1 x_2)^2}{\sum x_2^2}} = \dfrac{\sigma^2}{\sum x_1^2(1 - r_{12}^2)}$
where $r_{12}^2 = \dfrac{(\sum x_1 x_2)^2}{\sum x_1^2\sum x_2^2}$ is the square of the correlation coefficient between $x_1$ and $x_2$.
If $x_2 = \lambda x_{1i} + v_i$, what happens to the variance of $\hat{\beta}_1$ as $r_{12}^2$ rises? As $r_{12}$ tends to 1, i.e. as collinearity increases, the variance of the estimator increases, and in the limit, when $r_{12} = 1$, the variance of $\hat{\beta}_1$ becomes infinite.
Similarly, $cov(\hat{\beta}_1, \hat{\beta}_2) = \dfrac{-r_{12}\,\sigma^2}{(1 - r_{12}^2)\sqrt{\sum x_1^2\sum x_2^2}}$ (why?)
As $r_{12}$ increases toward one, the covariance of the two estimators increases in absolute value. The speed with which the variances and covariances increase can be seen with the variance-inflating factor (VIF), which is defined as:
$VIF = \dfrac{1}{1 - r_{12}^2}$
The VIF shows how the variance of an estimator is inflated by the presence of multicollinearity. As $r_{12}^2$ approaches 1, the VIF approaches infinity; that is, as the extent of collinearity increases, the variance of an estimator increases, and in the limit the variance becomes infinite. As can be seen, if there is no multicollinearity between $x_1$ and $x_2$, the VIF will be 1.
Using this definition we can express $var(\hat{\beta}_1)$ and $var(\hat{\beta}_2)$ in terms of the VIF:
$var(\hat{\beta}_1) = \dfrac{\sigma^2}{\sum x_1^2}\,VIF$ and $var(\hat{\beta}_2) = \dfrac{\sigma^2}{\sum x_2^2}\,VIF$
which shows that the variances of $\hat{\beta}_1$ and $\hat{\beta}_2$ are directly proportional to the VIF.
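The inflation of the variance can be traced directly with the VIF formula. The sketch below (Python, simulated data of our own) computes $r_{12}^2$ between two correlated regressors and the resulting $VIF = 1/(1 - r_{12}^2)$; strengthening the dependence between the regressors pushes the VIF, and hence the coefficient variances, upward.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 500

def vif_for(dependence):
    """Generate two regressors with the given dependence and return (r12^2, VIF)."""
    x1 = rng.normal(0.0, 1.0, n)
    x2 = dependence * x1 + rng.normal(0.0, 1.0, n)
    x1d, x2d = x1 - x1.mean(), x2 - x2.mean()
    r12_sq = (x1d @ x2d) ** 2 / ((x1d @ x1d) * (x2d @ x2d))
    return r12_sq, 1.0 / (1.0 - r12_sq)

for dep in (0.0, 1.0, 5.0, 20.0):
    r2, vif = vif_for(dep)
    print(f"dependence {dep:5.1f}:  r12^2 = {r2:.3f}   VIF = {vif:8.2f}")
# VIF = 1 when the regressors are uncorrelated and grows without bound as r12^2 -> 1.
```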
5. Because of the large variances of the estimators, which mean large standard errors, the confidence intervals tend to be much wider, leading to the acceptance of the 'zero null hypothesis' (i.e. that the true population coefficient is zero) more readily.
6. Because of the large standard errors of the estimators, the computed t-ratios will be very small, leading one or more of the coefficients to appear statistically insignificant when tested individually.
7. Although the t-ratios of one or more of the coefficients are very small (which makes those coefficients statistically insignificant individually), $R^2$, the overall measure of goodness of fit, can be very high.
Example: consider $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + v_i$.
In cases of high collinearity it is possible to find that one or more of the partial slope coefficients are individually statistically insignificant on the basis of the t-test, yet the $R^2$ in such situations may be so high, say in excess of 0.9, that on the basis of the F-test one can convincingly reject the hypothesis that $\beta_1 = \beta_2 = \cdots = \beta_k = 0$. Indeed, this is one of the signals of multicollinearity: insignificant t-values but a high overall $R^2$ (i.e. a significant F-value).
8. The OLS estimators and their standard errors can be sensitive to small changes in the data.
4.3.4 Detection of Multicollinearity
A recognizable set of symptoms of the existence of multicollinearity on which one can rely includes:
a. a high coefficient of determination ($R^2$);
b. high correlation coefficients among the explanatory variables ($r_{x_i x_j}$'s).
Note, however, that a high pairwise correlation coefficient is a sufficient but not a necessary condition for the existence of multicollinearity, because multicollinearity can also exist even if the correlation coefficients are low. The combination of all these criteria should, however, help the detection of multicollinearity.
4.3.4.1. Test of multicollinearity using auxiliary regressions
One way of finding out which X variable is related to the other X variables is to regress each $X_i$ on the remaining X variables and compute the corresponding $R^2$; each such regression is called an auxiliary regression, auxiliary to the main regression of Y on the X's. Then, following the relationship between F and $R^2$ established in chapter three under overall significance, the variable
$F_i = \dfrac{R^2_{x_i\cdot x_2 x_3\ldots x_k}/(k-2)}{(1 - R^2_{x_i\cdot x_2 x_3\ldots x_k})/(n-k+1)} \sim F_{(k-2,\; n-k+1)}$
follows the F distribution with (k − 2) and (n − k + 1) df, where n is the sample size, k is the number of parameters including the intercept term, and $R^2_{x_i\cdot x_2 x_3\ldots x_k}$ is the coefficient of determination in the regression of $X_i$ on the remaining X variables.
If the computed F exceeds the critical F at the chosen level of significance, it is taken to mean that the particular $X_i$ is collinear with the other X's; if it does not exceed the critical F, we say that it is not collinear with the other X's, in which case we may retain the variable in the model. If $F_i$ is statistically significant, we will have to decide whether the particular $X_i$ should be dropped from the model. Note also Klein's rule of thumb, which suggests that multicollinearity may be a troublesome problem only if the $R^2$ obtained from an auxiliary regression is greater than the overall $R^2$, that is, the $R^2$ obtained from the regression of Y on all the regressors.
4.3.4.2. Test of multicollinearity using eigenvalues and the condition index
Using eigenvalues we can derive a number called the condition number k as follows:
$k = \dfrac{\text{maximum eigenvalue}}{\text{minimum eigenvalue}}$
In addition, using these values we can derive the condition index (CI), defined as:
$CI = \sqrt{\dfrac{\text{maximum eigenvalue}}{\text{minimum eigenvalue}}} = \sqrt{k}$
Decision rule: if k is between 100 and 1000 there is moderate to strong multicollinearity, and if it exceeds 1000 there is severe multicollinearity. Alternatively, if CI ($= \sqrt{k}$) is between 10 and 30 there is moderate to strong multicollinearity, and if it exceeds 30 there is severe multicollinearity.
Example: if k = 123,864 and CI = 352, this suggests the existence of severe multicollinearity.
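Computationally, the condition number k and the condition index CI come from the eigenvalues of the X'X matrix (commonly after scaling each column). A brief sketch with simulated collinear data follows (Python, numpy; the data and the unit-length scaling convention are illustrative choices, not prescribed by the text):

```python
import numpy as np

rng = np.random.default_rng(9)
n = 200
x1 = rng.normal(0.0, 1.0, n)
x2 = x1 + rng.normal(0.0, 0.05, n)          # nearly collinear with x1
x3 = rng.normal(0.0, 1.0, n)                # unrelated regressor

X = np.column_stack([np.ones(n), x1, x2, x3])
Xs = X / np.linalg.norm(X, axis=0)          # scale columns to unit length
eigvals = np.linalg.eigvalsh(Xs.T @ Xs)     # eigenvalues of the (scaled) X'X matrix

k = eigvals.max() / eigvals.min()           # condition number
ci = np.sqrt(k)                             # condition index
print(f"condition number k = {k:.1f}, condition index CI = {ci:.1f}")
# By the rule of thumb above, CI between 10 and 30 suggests moderate to strong
# multicollinearity and CI above 30 suggests severe multicollinearity.
```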
4.3.4.4 Test of multicollinearity using Tolerance and variance inflation factor
$var(\hat{\beta}_j) = \dfrac{\sigma^2}{\sum x_j^2}\cdot\dfrac{1}{1 - R_j^2} = \dfrac{\sigma^2}{\sum x_j^2}\,VIF_j$
where $R_j^2$ is the $R^2$ in the auxiliary regression of $X_j$ on the remaining (k − 2) regressors and $VIF_j = 1/(1 - R_j^2)$. The inverse of the VIF is called the tolerance, $TOL_j = 1 - R_j^2$; the tolerance equals one if $X_j$ is not correlated with the other regressors, whereas
it is zero if $X_j$ is perfectly related to the other regressors. The VIF (or the tolerance) as a measure of collinearity is not free of criticism. As we have seen earlier, $var(\hat{\beta}_j) = \dfrac{\sigma^2}{\sum x_j^2}\,VIF_j$ depends on three factors: $\sigma^2$, $\sum x_j^2$ and $VIF_j$. A high VIF can be counterbalanced by a low $\sigma^2$ or a high $\sum x_j^2$. To put it differently, a high VIF is neither necessary nor sufficient for high variances and high standard errors. Therefore, high multicollinearity, as measured by a high VIF, may not necessarily cause high standard errors.
4.3.5. Remedial measures
It is more difficult to deal with models exhibiting multicollinearity than it is to detect the problem. Different remedial measures have been suggested by econometricians, depending on the severity of the problem, the availability of other sources of data and the importance of the variables that are found to be multicollinear in the model.
Some suggest that a minor degree of multicollinearity can be tolerated, although one should be a bit careful when interpreting the model under such conditions. Others suggest removing the variables that show multicollinearity if they are not important in the model; but by doing so, the desired characteristics of the model may be affected. However, the following corrective procedures have been suggested when the problem of multicollinearity is found to be serious.
1. Increase the size of the sample: it is suggested that multicollinearity may be avoided or reduced if the size of the sample is increased, because the variances and covariances of the OLS estimators are inversely related to the sample size. But we should remember that this will help only when the intercorrelation happens to exist in the sample but not in the population of the variables. If the variables are collinear in the population, increasing the size of the sample will not help to reduce multicollinearity.
2. Introduce an additional equation into the model: the problem of multicollinearity may be overcome by expressing explicitly the relationship between the multicollinear variables. Such a relation, in the form of an equation, may then be added to the original model. The addition of the new equation transforms our single-equation (original) model into a simultaneous equation model. The reduced form method (which is usually applied for estimating simultaneous equation models) can then be applied to avoid multicollinearity.