ch11 Heteroscedasticity

The document discusses heteroscedasticity, which occurs when the variance of the error term is not constant. It presents several potential causes of heteroscedasticity, including omitted variables, skewed regressors, and incorrect data transformations. It then compares ordinary least squares (OLS) estimation to generalized least squares (GLS) estimation, noting that GLS provides best linear unbiased estimates when heteroscedasticity is present. Several formal tests for detecting heteroscedasticity are also outlined, such as the Park test, Glejser test, and Goldfeld-Quandt test.

Uploaded by

Khirstina Curry
Copyright
© © All Rights Reserved
Heteroscedasticity

Chapter 11
The nature of heteroscedasticity

1. Error-learning models → the variance is expected to decrease
2. Discretionary income → the variance is expected to increase
3. Data-collection techniques improve → the variance is likely to decrease
4. Presence of outliers
5. Specification bias → some important variables are omitted from the model

01/06/22 Prepared by Sri Yani K 2


The nature of heteroscedasticity

6. Skewness in the distribution of one or more regressors included in the model
7. As David Hendry notes, heteroscedasticity can also arise because of:
   a) Incorrect data transformation
   b) Incorrect functional form
8. The problem of heteroscedasticity is likely to be more common in cross-sectional than in time-series data.



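OLS estimation in the presence of heteroscedasticity
• The two-variable model:
$$Y_i = \beta_1 + \beta_2 X_i + u_i$$
$$\hat{\beta}_2 = \frac{\sum x_i y_i}{\sum x_i^2} = \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{n \sum X_i^2 - \left(\sum X_i\right)^2}$$
$$\operatorname{var}(\hat{\beta}_2) = \frac{\sum x_i^2 \sigma_i^2}{\left(\sum x_i^2\right)^2}$$
where lowercase letters denote deviations from the sample means.
• $\hat{\beta}_2$ is still linear, unbiased, and consistent, but it no longer has minimum variance → it is not BLUE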
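The two formulas above can be sketched in plain Python. This is only an illustration: the data values and the function names `ols_slope` and `slope_variance` are assumptions, not part of the slides, and the error variances are treated as known.

```python
def ols_slope(x, y):
    """OLS slope: sum(x_i * y_i) / sum(x_i^2), with x_i and y_i in deviation form."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def slope_variance(x, sigma2):
    """var(beta2_hat) = sum(x_i^2 * sigma_i^2) / (sum(x_i^2))^2, x_i in deviation form."""
    n = len(x)
    x_bar = sum(x) / n
    dev2 = [(xi - x_bar) ** 2 for xi in x]
    return sum(d * s for d, s in zip(dev2, sigma2)) / sum(dev2) ** 2


# Hypothetical data, chosen only for illustration.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.1, 5.9, 8.2, 9.8]

b2 = ols_slope(X, Y)
# Constant variance (sigma_i^2 = 4): the formula collapses to sigma^2 / sum(x_i^2).
var_homo = slope_variance(X, [4.0] * len(X))
# Variance growing with X (sigma_i^2 = X_i^2): the homoscedastic formula no longer applies.
var_hetero = slope_variance(X, [xi ** 2 for xi in X])
print(b2, var_homo, var_hetero)
```

With a constant variance the general expression reduces to the familiar $\sigma^2 / \sum x_i^2$, which is why the usual OLS variance formula is a special case.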



The method of Generalized Least Squares (GLS)
• GLS is OLS applied to transformed variables that satisfy the standard least-squares assumptions
• GLS estimators are BLUE
• The two-variable model:
$$Y_i = \beta_1 + \beta_2 X_i + u_i$$
$$Y_i = \beta_1 X_{0i} + \beta_2 X_i + u_i, \quad \text{where } X_{0i} = 1 \text{ for each } i$$
Assume that the heteroscedastic variances $\sigma_i^2$ are known. Dividing through by $\sigma_i$:
$$\frac{Y_i}{\sigma_i} = \beta_1 \frac{X_{0i}}{\sigma_i} + \beta_2 \frac{X_i}{\sigma_i} + \frac{u_i}{\sigma_i}$$



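The method of Generalized Least Squares (GLS)
$$Y_i^* = \beta_1^* X_{0i}^* + \beta_2^* X_i^* + u_i^*$$
where the starred variables are the original variables divided by $\sigma_i$.
$$\operatorname{var}(u_i^*) = E\left(u_i^*\right)^2 = E\left(\frac{u_i}{\sigma_i}\right)^2$$
$$= \frac{1}{\sigma_i^2} E\left(u_i^2\right) \quad \text{since } \sigma_i^2 \text{ is known}$$
$$= \frac{1}{\sigma_i^2} \sigma_i^2 \quad \text{since } E\left(u_i^2\right) = \sigma_i^2$$
$$= 1$$
The variance of the transformed error is a constant, so the transformed model is homoscedastic.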



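GLS estimators
$$\frac{Y_i}{\sigma_i} = \hat{\beta}_1^* \frac{X_{0i}}{\sigma_i} + \hat{\beta}_2^* \frac{X_i}{\sigma_i} + \frac{\hat{u}_i}{\sigma_i}$$
$$Y_i^* = \hat{\beta}_1^* X_{0i}^* + \hat{\beta}_2^* X_i^* + \hat{u}_i^*$$
Minimize
$$\sum \hat{u}_i^{*2} = \sum \left(Y_i^* - \hat{\beta}_1^* X_{0i}^* - \hat{\beta}_2^* X_i^*\right)^2$$
that is,
$$\sum \left(\frac{\hat{u}_i}{\sigma_i}\right)^2 = \sum \left[\frac{Y_i}{\sigma_i} - \hat{\beta}_1^* \frac{X_{0i}}{\sigma_i} - \hat{\beta}_2^* \frac{X_i}{\sigma_i}\right]^2$$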



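GLS estimators
$$\hat{\beta}_2^* = \frac{\left(\sum w_i\right)\left(\sum w_i X_i Y_i\right) - \left(\sum w_i X_i\right)\left(\sum w_i Y_i\right)}{\left(\sum w_i\right)\left(\sum w_i X_i^2\right) - \left(\sum w_i X_i\right)^2}$$
$$\operatorname{var}(\hat{\beta}_2^*) = \frac{\sum w_i}{\left(\sum w_i\right)\left(\sum w_i X_i^2\right) - \left(\sum w_i X_i\right)^2}$$
where $w_i = 1 / \sigma_i^2$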



Difference between OLS and GLS

In OLS we minimize the sum of squared residuals:
$$\sum \hat{u}_i^2 = \sum \left(Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i\right)^2$$
In GLS we minimize a weighted sum of squared residuals, with $w_i = 1 / \sigma_i^2$:
$$\sum w_i \hat{u}_i^2 = \sum w_i \left(Y_i - \hat{\beta}_1^* - \hat{\beta}_2^* X_i\right)^2$$
Observations coming from a population with larger $\sigma_i$ get a relatively smaller weight, and those from a population with smaller $\sigma_i$ get a proportionally larger weight, in minimizing the RSS.
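The weighted estimator can be sketched with the closed-form expressions for the two-variable model. The function name `wls_two_variable`, the data, and the assumed standard deviations are illustrative assumptions; in practice the $\sigma_i$ are rarely known.

```python
def wls_two_variable(x, y, w):
    """Weighted least squares for Y = b1 + b2*X with weights w_i = 1/sigma_i^2.

    Returns (b1, b2, var_b2) from the closed-form two-variable expressions.
    """
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    denom = sw * swxx - swx ** 2
    b2 = (sw * swxy - swx * swy) / denom
    b1 = swy / sw - b2 * swx / sw      # weighted mean of Y minus b2 * weighted mean of X
    var_b2 = sw / denom
    return b1, b2, var_b2


# Hypothetical data; sigma_i is ASSUMED known and proportional to X_i.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.1, 5.9, 8.2, 9.8]
sigma = [0.5 * xi for xi in X]
weights = [1.0 / s ** 2 for s in sigma]

b1_gls, b2_gls, var_b2_gls = wls_two_variable(X, Y, weights)
# With equal weights the same function reproduces ordinary least squares.
b1_ols, b2_ols, var_b2_ols = wls_two_variable(X, Y, [1.0] * len(X))
print(b1_gls, b2_gls, b1_ols, b2_ols)
```

Setting every weight to 1 recovers OLS, which makes the difference between the two methods concrete: only the weighting of each squared residual changes.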



Consequences of using OLS in the presence of heteroscedasticity

$$\operatorname{var}(\hat{\beta}_2^*) \le \operatorname{var}(\hat{\beta}_2)$$

• Confidence intervals based on the latter will be unnecessarily wide
• The t and F tests are likely to give inaccurate results, and a coefficient may wrongly appear statistically insignificant
• If we persist in using the usual testing procedures despite heteroscedasticity, whatever conclusions we draw or inferences we make may be very misleading



Detection
Informal methods
1. Nature of the problem
2. Graphical method

Formal methods
1. Park test
2. Glejser test
3. Spearman’s rank correlation test
4. Goldfeld-Quandt test
5. Breusch-Pagan-Godfrey test
6. White’s general heteroscedasticity test
7. Koenker-Bassett test



Graphical method
[Figure: five panels plotting the squared residuals $\hat{u}^2$ against X]



Park test
• The variance is some function of the explanatory variable $X_i$:
$$\sigma_i^2 = \sigma^2 X_i^{\beta} e^{v_i} \quad \text{or} \quad \ln \sigma_i^2 = \ln \sigma^2 + \beta \ln X_i + v_i$$
Since $\sigma_i^2$ is generally unknown, use $\hat{u}_i^2$ as a proxy:
$$\ln \hat{u}_i^2 = \ln \sigma^2 + \beta \ln X_i + v_i = \alpha + \beta \ln X_i + v_i$$
If $\beta$ is statistically significant → heteroscedasticity
Problem: $v_i$ may not satisfy the OLS assumptions and may itself be heteroscedastic
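The two-stage procedure can be sketched as follows. The helper names `simple_ols` and `park_test`, the simulated data, and the seed are assumptions made for illustration; the t statistic would still need to be compared with a critical value from a t table.

```python
import math
import random


def simple_ols(x, y):
    """OLS of y on a constant and x. Returns (intercept, slope, se_slope, residuals)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in resid) / (n - 2)   # estimated error variance
    return intercept, slope, math.sqrt(s2 / sxx), resid


def park_test(x, resid):
    """Second stage: regress ln(resid^2) on ln(x); return (beta_hat, t_statistic)."""
    ln_u2 = [math.log(e ** 2) for e in resid]
    ln_x = [math.log(xi) for xi in x]
    _, beta, se_beta, _ = simple_ols(ln_x, ln_u2)
    return beta, beta / se_beta


# Simulated data (an assumption): the error standard deviation is proportional to X.
rng = random.Random(0)
X = [1.0 + 0.1 * i for i in range(200)]
Y = [2.0 + 3.0 * xi + rng.gauss(0.0, 0.5 * xi) for xi in X]

_, _, _, u_hat = simple_ols(X, Y)          # first stage: residuals from Y on X
beta_hat, t_stat = park_test(X, u_hat)     # second stage: the Park regression
print(f"beta = {beta_hat:.3f}, t = {t_stat:.2f}")
```

The sketch requires all $X_i$ to be positive and all residuals to be nonzero, because it takes logarithms of both.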



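Glejser test
Regress the absolute residuals on X using one of the following functional forms:
$$|\hat{u}_i| = \beta_1 + \beta_2 X_i + v_i \qquad |\hat{u}_i| = \beta_1 + \beta_2 \sqrt{X_i} + v_i$$
$$|\hat{u}_i| = \beta_1 + \beta_2 \frac{1}{X_i} + v_i \qquad |\hat{u}_i| = \beta_1 + \beta_2 \frac{1}{\sqrt{X_i}} + v_i$$
$$|\hat{u}_i| = \sqrt{\beta_1 + \beta_2 X_i} + v_i \qquad |\hat{u}_i| = \sqrt{\beta_1 + \beta_2 X_i^2} + v_i$$

Problem: the error term $v_i$ has a nonzero expected value, is serially correlated, and is itself heteroscedastic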



Spearman’s rank correlation test

Spearman’s rank correlation coefficient:

$$r_s = 1 - 6 \left[\frac{\sum d_i^2}{n\left(n^2 - 1\right)}\right]$$
where:
$d_i$ = difference in the ranks assigned to two different characteristics of the i-th individual or phenomenon
n = number of individuals or phenomena ranked



Spearman’s rank correlation test
1. Fit the regression to the data on Y and X and obtain the residuals $\hat{u}_i$
2. Take the absolute values of the residuals, rank both $|\hat{u}_i|$ and $X_i$, and compute Spearman’s rank correlation coefficient
3. Assuming that the population rank correlation coefficient $\rho_s$ is zero and n > 8, the significance of the sample $r_s$ can be tested by
$$t = \frac{r_s \sqrt{n - 2}}{\sqrt{1 - r_s^2}} \quad \text{with df} = n - 2$$

If t > the critical t → heteroscedasticity
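Steps 2 and 3 can be sketched as below. The absolute residuals are hypothetical numbers assumed to come from step 1, and the helper names `ranks` and `spearman_t` are illustrative. Ties are not handled, and the statistic is undefined when $r_s = \pm 1$.

```python
import math


def ranks(values):
    """Rank from 1 (smallest) to n. Ties are not handled in this sketch."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r


def spearman_t(abs_resid, x):
    """Return (r_s, t) for the rank correlation between |u_hat| and X."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(abs_resid), ranks(x)))
    rs = 1 - 6 * d2 / (n * (n ** 2 - 1))
    t = rs * math.sqrt(n - 2) / math.sqrt(1 - rs ** 2)   # df = n - 2
    return rs, t


# Hypothetical absolute residuals from a first-stage regression (step 1).
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
abs_u = [0.2, 0.1, 0.5, 0.4, 0.3, 0.9, 0.7, 0.8, 1.2, 1.0]

r_s, t_stat = spearman_t(abs_u, X)
print(f"r_s = {r_s:.4f}, t = {t_stat:.3f}, df = {len(X) - 2}")
```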



Goldfeld-Quandt test
• Assumes that the heteroscedastic variance, $\sigma_i^2$, is positively related to one of the explanatory variables in the regression model
• The two-variable model:
$$Y_i = \beta_1 + \beta_2 X_i + u_i$$
• $\sigma_i^2$ is positively related to $X_i$ as
$$\sigma_i^2 = \sigma^2 X_i^2$$
where $\sigma^2$ is a constant



Goldfeld-Quandt test
1. Order the observations according to the values of $X_i$, beginning with the lowest X value
2. Omit c central observations, and divide the remaining (n - c) observations into two groups of (n - c)/2 observations each
3. Fit separate OLS regressions to the first (n - c)/2 observations and the last (n - c)/2 observations, and obtain the residual sums of squares $RSS_1$ and $RSS_2$. Each RSS has
$$\frac{n - c}{2} - k \quad \text{or} \quad \frac{n - c - 2k}{2} \ \text{df}$$
where k is the number of parameters estimated, including the intercept
Goldfeld-Quandt test
4. Compute the ratio
$$\lambda = \frac{RSS_2 / \text{df}}{RSS_1 / \text{df}}$$
If the $u_i$ are assumed to be normally distributed and the assumption of homoscedasticity is valid, then it can be shown that $\lambda$ follows the F distribution with numerator and denominator df each equal to (n - c - 2k)/2. If $\lambda$ > the critical F, reject the hypothesis of homoscedasticity.
• The success of the GQ test depends on the value of c and on identifying the correct X with which to order the observations
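Steps 1 to 4 can be sketched for the two-variable model (k = 2) as follows. The function names, the simulated data, the seed, and the choice c = 4 are assumptions made for illustration; the resulting $\lambda$ would be compared with a critical value from an F table.

```python
import random


def rss_simple(x, y):
    """Residual sum of squares from OLS of y on a constant and x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def goldfeld_quandt(x, y, c, k=2):
    """Return (lambda, df); k is the number of estimated parameters per regression."""
    pairs = sorted(zip(x, y))                  # step 1: order by X
    n = len(pairs)
    m = (n - c) // 2                           # step 2: size of each group
    low, high = pairs[:m], pairs[n - m:]       # the c central observations are dropped
    rss1 = rss_simple([p[0] for p in low], [p[1] for p in low])     # step 3
    rss2 = rss_simple([p[0] for p in high], [p[1] for p in high])
    df = m - k                                 # (n - c - 2k) / 2
    return (rss2 / df) / (rss1 / df), df       # step 4


# Simulated data (an assumption): the error standard deviation is proportional to X.
rng = random.Random(1)
X = [float(i) for i in range(1, 31)]
Y = [2.0 + 3.0 * xi + rng.gauss(0.0, 0.5 * xi) for xi in X]

lam, df = goldfeld_quandt(X, Y, c=4)
print(f"lambda = {lam:.3f} with ({df}, {df}) df")
```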



Breusch-Pagan-Godfrey test
The k-variable regression model:
$$Y_i = \beta_1 + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i$$
Assume that the error variance $\sigma_i^2$ is described as
$$\sigma_i^2 = f\left(\alpha_1 + \alpha_2 Z_{2i} + \dots + \alpha_m Z_{mi}\right)$$
that is, $\sigma_i^2$ is some function of the nonstochastic variables Z.
Specifically, assume that
$$\sigma_i^2 = \alpha_1 + \alpha_2 Z_{2i} + \dots + \alpha_m Z_{mi}$$
that is, $\sigma_i^2$ is a linear function of the Z's.
If $\alpha_2 = \alpha_3 = \dots = \alpha_m = 0$ → $\sigma_i^2 = \alpha_1$ (a constant)



Breusch-Pagan-Godfrey test

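1. Estimate the model by OLS and obtain the residuals $\hat{u}_1, \dots, \hat{u}_n$
2. Obtain $\tilde{\sigma}^2 = \sum \hat{u}_i^2 / n$, that is, the ML estimator of $\sigma^2$
3. Construct variables $p_i$ defined as $p_i = \hat{u}_i^2 / \tilde{\sigma}^2$
4. Regress $p_i$ on the Z’s as $p_i = \alpha_1 + \alpha_2 Z_{2i} + \dots + \alpha_m Z_{mi} + v_i$
5. Obtain the ESS (explained sum of squares) and define
$$\Theta = \frac{1}{2}\left(\text{ESS}\right)$$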



Breusch-Pagan-Godfrey test
Assuming the $u_i$ are normally distributed, if there is homoscedasticity and if the sample size n increases indefinitely, then
$$\Theta \underset{asy}{\sim} \chi^2_{m-1}$$

If $\Theta$ > the critical $\chi^2$, reject the hypothesis of homoscedasticity
• The test is sensitive to the normality assumption

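The Breusch-Pagan-Godfrey steps can be sketched for the simplest case, a two-variable model with a single Z (so m = 2 and the statistic has 1 df). The function names, the simulated data, the seed, and the choice Z = X are assumptions made for illustration.

```python
import random


def fit_line(x, y):
    """OLS of y on a constant and x; returns the fitted values."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return [y_bar + slope * (xi - x_bar) for xi in x]


def bpg_statistic(x, y, z):
    """Theta = ESS / 2 from regressing p_i = u_hat_i^2 / sigma_tilde^2 on z."""
    n = len(y)
    fitted = fit_line(x, y)                               # step 1
    u2 = [(yi - fi) ** 2 for yi, fi in zip(y, fitted)]
    sigma2_ml = sum(u2) / n                               # step 2: ML estimator
    p = [v / sigma2_ml for v in u2]                       # step 3
    p_fit = fit_line(z, p)                                # step 4
    p_bar = sum(p) / n
    ess = sum((pf - p_bar) ** 2 for pf in p_fit)          # step 5
    return ess / 2


# Simulated data (an assumption): the error standard deviation is proportional to X.
rng = random.Random(2)
X = [1.0 + 0.1 * i for i in range(200)]
Y = [2.0 + 3.0 * xi + rng.gauss(0.0, 0.5 * xi) for xi in X]

theta = bpg_statistic(X, Y, X)       # Z is taken to be X itself
CRITICAL_CHI2_1DF_5PCT = 3.841       # 5% critical value of chi-square with 1 df
print(f"Theta = {theta:.3f}, reject homoscedasticity: {theta > CRITICAL_CHI2_1DF_5PCT}")
```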


White’s general heteroscedasticity test
• Does not rely on the normality assumption
• The three-variable regression model:
$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$$
• The k-variable regression model:
$$Y_i = \beta_1 + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i$$



White’s general heteroscedasticity
test
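The White test proceeds as follows:
1. Estimate the regression model and obtain the residuals $\hat{u}_i$
2. Estimate the following auxiliary regression:
$$\hat{u}_i^2 = \alpha_1 + \alpha_2 X_{2i} + \alpha_3 X_{3i} + \alpha_4 X_{2i}^2 + \alpha_5 X_{3i}^2 + \alpha_6 X_{2i} X_{3i} + v_i$$
3. $H_0$: there is no heteroscedasticity
4. Obtain $n R^2$ from the auxiliary regression:
$$n R^2 \underset{asy}{\sim} \chi^2_{\text{df}}$$
where df is the number of regressors in the auxiliary regression (excluding the constant)
5. If the computed $\chi^2$ > the critical chi-square, the conclusion is that there is heteroscedasticity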
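The procedure can be sketched for the three-variable model using only the standard library. The small solver, the function names, the simulated data, and the seed are assumptions made for illustration; a numerical library would normally replace the hand-written normal-equations solver.

```python
import random


def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(rows, y):
    """OLS via the normal equations; each row holds the regressors incl. the constant."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    return coef, fitted


def r_squared(y, fitted):
    y_bar = sum(y) / len(y)
    tss = sum((yi - y_bar) ** 2 for yi in y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return 1 - rss / tss


def white_statistic(x2, x3, y):
    """n * R^2 from the auxiliary regression of u_hat^2 (5 df in this model)."""
    _, fitted = ols([[1.0, a, b] for a, b in zip(x2, x3)], y)       # step 1
    u2 = [(yi - fi) ** 2 for yi, fi in zip(y, fitted)]
    aux = [[1.0, a, b, a * a, b * b, a * b] for a, b in zip(x2, x3)]
    _, u2_fit = ols(aux, u2)                                        # step 2
    return len(y) * r_squared(u2, u2_fit)                           # step 4


# Simulated data (an assumption): the error standard deviation is proportional to X2.
rng = random.Random(0)
n = 100
X2 = [rng.uniform(1.0, 10.0) for _ in range(n)]
X3 = [rng.uniform(1.0, 10.0) for _ in range(n)]
Y = [1.0 + 2.0 * a + 3.0 * b + rng.gauss(0.0, 0.5 * a) for a, b in zip(X2, X3)]

w_stat = white_statistic(X2, X3, Y)
CRITICAL_CHI2_5DF_5PCT = 11.07       # 5% critical value of chi-square with 5 df
print(f"nR^2 = {w_stat:.3f}, reject H0: {w_stat > CRITICAL_CHI2_5DF_5PCT}")
```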



Koenker-Bassett (KB) test
• Estimate the regression model
$$Y_i = \beta_1 + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i$$
• Obtain the estimated Y values and the residuals, and then estimate
$$\hat{u}_i^2 = \alpha_1 + \alpha_2 \left(\hat{Y}_i\right)^2 + v_i$$
• $H_0$: $\alpha_2 = 0$
If this is not rejected, conclude that there is no heteroscedasticity
The test is applicable even if the error term in the original model is not normally distributed
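The KB regression can be sketched for a two-variable model as follows. The helper names and the small data set are hypothetical, chosen only to illustrate the mechanics; the t statistic would be compared with a critical value from a t table.

```python
import math


def simple_ols(x, y):
    """OLS of y on a constant and x. Returns (intercept, slope, se_slope, residuals)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in resid) / (n - 2)
    return intercept, slope, math.sqrt(s2 / sxx), resid


def kb_test(x, y):
    """Regress u_hat^2 on Y_hat^2; return (alpha2_hat, t_statistic) for H0: alpha2 = 0."""
    b1, b2, _, resid = simple_ols(x, y)
    y_hat2 = [(b1 + b2 * xi) ** 2 for xi in x]
    u2 = [e ** 2 for e in resid]
    _, alpha2, se_alpha2, _ = simple_ols(y_hat2, u2)
    return alpha2, alpha2 / se_alpha2


# Hypothetical data in which the residual spread grows with X.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
Y = [2.1, 2.9, 3.9, 5.1, 6.3, 6.7, 7.7, 9.3]

alpha2_hat, t_stat = kb_test(X, Y)
print(f"alpha2 = {alpha2_hat:.6f}, t = {t_stat:.2f}, df = {len(X) - 2}")
```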



Remedial measures

• When $\sigma_i^2$ is known: the method of weighted least squares
• When $\sigma_i^2$ is not known:
   - The true $\sigma_i^2$ are rarely known
   - White’s heteroscedasticity-consistent variances and standard errors → robust standard errors
   - These can be larger or smaller than the uncorrected standard errors
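For the two-variable model, White's heteroscedasticity-consistent variance of the slope replaces the unknown $\sigma_i^2$ in $\sum x_i^2 \sigma_i^2 / (\sum x_i^2)^2$ with the squared residuals. A minimal sketch follows, with a hypothetical data set and function name; no small-sample correction is applied.

```python
import math


def ols_with_robust_se(x, y):
    """Return (slope, conventional_se, white_se) for OLS of y on a constant and x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    dev = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dev)
    b2 = sum(d * (yi - y_bar) for d, yi in zip(dev, y)) / sxx
    b1 = y_bar - b2 * x_bar
    resid = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]
    # Conventional variance: sigma_hat^2 / sum(x_i^2), valid under homoscedasticity.
    var_conv = (sum(e * e for e in resid) / (n - 2)) / sxx
    # White's variance: sum(x_i^2 * u_hat_i^2) / (sum(x_i^2))^2.
    var_white = sum(d * d * e * e for d, e in zip(dev, resid)) / sxx ** 2
    return b2, math.sqrt(var_conv), math.sqrt(var_white)


# Hypothetical data in which the residual spread grows with X.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
Y = [2.1, 2.9, 3.9, 5.1, 6.3, 6.7, 7.7, 9.3]

slope, se_conv, se_white = ols_with_robust_se(X, Y)
print(f"slope = {slope:.4f}, conventional se = {se_conv:.4f}, robust se = {se_white:.4f}")
```

In this particular data set the robust standard error happens to be smaller than the conventional one, which illustrates that the correction can move in either direction.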



Remedial measures
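Assumptions about the heteroscedasticity pattern
1. The error variance is proportional to $X_i^2$:
$$E\left(u_i^2\right) = \sigma^2 X_i^2$$
Divide the original model through by $X_i$:
$$\frac{Y_i}{X_i} = \frac{\beta_1}{X_i} + \beta_2 + \frac{u_i}{X_i} = \beta_1 \frac{1}{X_i} + \beta_2 + v_i$$
$$E\left(v_i^2\right) = E\left(\frac{u_i}{X_i}\right)^2 = \frac{1}{X_i^2} E\left(u_i^2\right) = \frac{1}{X_i^2} \sigma^2 X_i^2 = \sigma^2$$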

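The transformation for the case where the error variance is proportional to $X_i^2$ can be sketched as an ordinary regression of $Y_i / X_i$ on $1 / X_i$. Note the role reversal: the intercept of the transformed regression estimates $\beta_2$ and its slope estimates $\beta_1$. The function names and data are hypothetical.

```python
def line_fit(x, y):
    """OLS of y on a constant and x; returns (intercept, slope)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - slope * x_bar, slope


def fit_divided_by_x(x, y):
    """Estimate Y = b1 + b2*X by regressing Y/X on 1/X. Returns (b1_hat, b2_hat)."""
    y_star = [yi / xi for xi, yi in zip(x, y)]
    x_star = [1.0 / xi for xi in x]
    intercept, slope = line_fit(x_star, y_star)
    return slope, intercept      # slope on 1/X is b1; intercept is b2


# Hypothetical data; all X values must be nonzero for the division to be defined.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.1, 5.9, 8.2, 9.8]

b1_hat, b2_hat = fit_divided_by_x(X, Y)
print(f"b1 = {b1_hat:.4f}, b2 = {b2_hat:.4f}")
```

This is equivalent to weighted least squares on the original model with weights $w_i = 1 / X_i^2$.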


Remedial measures
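2. The error variance is proportional to $X_i$ (the square root transformation):
$$E\left(u_i^2\right) = \sigma^2 X_i$$
The model can be transformed:
$$\frac{Y_i}{\sqrt{X_i}} = \frac{\beta_1}{\sqrt{X_i}} + \beta_2 \sqrt{X_i} + \frac{u_i}{\sqrt{X_i}} = \beta_1 \frac{1}{\sqrt{X_i}} + \beta_2 \sqrt{X_i} + v_i$$
$$E\left(v_i^2\right) = E\left(\frac{u_i}{\sqrt{X_i}}\right)^2 = \frac{1}{X_i} E\left(u_i^2\right) = \frac{1}{X_i} \sigma^2 X_i = \sigma^2$$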



Remedial measures
3. The error variance is proportional to the square of the mean value of Y:
$$E\left(u_i^2\right) = \sigma^2 \left[E\left(Y_i\right)\right]^2$$
The transformed model:
$$\frac{Y_i}{E(Y_i)} = \frac{\beta_1}{E(Y_i)} + \beta_2 \frac{X_i}{E(Y_i)} + \frac{u_i}{E(Y_i)} = \beta_1 \frac{1}{E(Y_i)} + \beta_2 \frac{X_i}{E(Y_i)} + v_i$$
$$E\left(v_i^2\right) = \sigma^2$$



Remedial measures

4. A log transformation

$$\ln Y_i = \beta_1 + \beta_2 \ln X_i + u_i$$
very often reduces heteroscedasticity
• The log transformation compresses the scales of the variables
• The slope coefficient $\beta_2$ measures the elasticity of Y with respect to X



Conclusion on the remedial measures
• All the transformations discussed previously are ad hoc → we are speculating about the nature of $\sigma_i^2$
• Some problems:
1. Beyond the two-variable model, we do not know a priori which of the X variables should be chosen for transforming the data
2. The log transformation is not applicable if some of the Y and X values are zero or negative
3. The problem of spurious correlation
4. All testing procedures using the t test, F test, etc. are, strictly speaking, valid only in large samples

