Heteroskedasticity
The data points for different values of X are closely concentrated around the
regression line with equal spread (represented by the dashed lines) above and below it.
This is the situation of homoskedasticity.
On the other hand, although the data points in Figure 4.2 again refer to different
values of X, higher values of X now show a greater spread of data points around the
regression line. The opposite situation is displayed in Figure 4.3, where higher
values of X show a smaller spread of data points around the regression line.
Both cases represent heteroskedasticity; the former is referred to as increasing
heteroskedasticity and the latter as decreasing heteroskedasticity.
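The three patterns described above can be reproduced with simulated data. The following is a minimal sketch, not taken from the source: the regression line y = 2 + 0.5x, the sample size, and the three forms chosen for the standard deviation of the disturbance are all illustrative assumptions. It compares the variance of the OLS residuals for high values of X with that for low values of X.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed so the sketch is reproducible
n = 2000
x = rng.uniform(1.0, 10.0, size=n)  # hypothetical regressor values

def spread_ratio(sd):
    """Simulate y = 2 + 0.5*x + e, where e has standard deviation sd
    (one value per observation), fit a line by OLS, and return
    (residual variance for high X) / (residual variance for low X)."""
    y = 2.0 + 0.5 * x + rng.normal(0.0, sd)
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    high = x > np.median(x)
    return resid[high].var() / resid[~high].var()

ratios = {
    "homoskedastic": spread_ratio(np.full(n, 1.0)),  # constant spread
    "increasing": spread_ratio(0.3 * x),             # spread grows with X
    "decreasing": spread_ratio(3.0 / x),             # spread shrinks with X
}
for name, r in ratios.items():
    print(f"{name:>14}: high-X / low-X residual variance = {r:.2f}")
```

A ratio close to 1 corresponds to homoskedasticity, a ratio well above 1 to increasing heteroskedasticity, and a ratio well below 1 to decreasing heteroskedasticity.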
Sources of Heteroskedasticity
Heteroskedasticity in data may arise
for several reasons.
1. When we are dealing with
microeconomic or cross-section
data, we are very likely to have a
heteroskedasticity problem.
For instance, when data are collected from a cross-section of individuals on
their levels of consumption and income to estimate a consumption function,
we are likely to face the heteroskedasticity problem. This is because the
variance (or spread) of consumption at low levels of income is much smaller
than the variance of consumption at higher levels of income. This
happens because people with low incomes do not have much flexibility in
spending their income; a large proportion of it is spent on food, clothing,
and transportation. On the other hand, people with high incomes
have a much wider choice and flexibility in spending; some might consume
a lot while others might prefer investing in the share market. This produces
a high variance of consumption at high levels of income, which is a
typical case of increasing heteroskedasticity. As an example of
decreasing heteroskedasticity, we may refer to error-learning models. For
example, in a regression of students' scores on
hours spent on preparation, we are likely to observe decreasing
heteroskedasticity. This is because, compared to the variance of scores at
lower levels of preparation, the variance of scores will be smaller at higher
levels of preparation.
2. The presence of outliers in the data may
cause a heteroskedasticity problem.
As we know, an outlying observation
is much different from the rest of
the observations in the sample.
The presence of such an observation,
especially when the sample size is
small, can create a heteroskedasticity
problem.
3. Heteroskedasticity may arise if some relevant
variables have been mistakenly omitted so
that the model is incorrectly specified. For
instance, if we do not include the prices of
competing or complementary goods in the
demand function for a commodity, the
residuals obtained from the regression may
contain non-constant variance. However,
when the omitted variables are included in
the model, the non-constancy of variance
may disappear.
4. Including explanatory variables in the
model whose distributions are skewed
may cause a heteroskedasticity problem.
Examples of such variables are
income and wealth.
5. Heteroskedasticity may also arise due to
incorrect data transformations (e.g.,
ratios, first-differences, etc.) and
incorrect functional forms (e.g., linear
instead of log-linear models).
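Sources 1 and 5 above can be illustrated together with a simulated consumption function. This is a rough sketch under stated assumptions, not a result from the source: the data-generating process (log consumption linear in log income with a constant-variance disturbance), the parameter values, and the helper name high_low_variance_ratio are all hypothetical. Fitting consumption on income in levels gives residuals whose spread grows with income, while the log-log specification gives roughly constant spread.

```python
import numpy as np

rng = np.random.default_rng(1)                 # fixed seed for reproducibility
n = 4000
income = rng.uniform(10.0, 100.0, size=n)      # hypothetical income levels

# Assumed data-generating process (illustrative only):
# log C = 0.2 + 0.9 * log Y + e, where e has constant variance.
log_c = 0.2 + 0.9 * np.log(income) + rng.normal(0.0, 0.2, size=n)
consumption = np.exp(log_c)

def high_low_variance_ratio(x, y, split_on):
    """Fit y on x by OLS and return the residual variance in the upper
    half of split_on divided by the residual variance in the lower half."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    high = split_on > np.median(split_on)
    return resid[high].var() / resid[~high].var()

# Linear model in levels: the wrong functional form for this process.
ratio_linear = high_low_variance_ratio(income, consumption, income)
# Log-log model: matches the assumed process.
ratio_loglog = high_low_variance_ratio(np.log(income), log_c, income)

print(f"levels  : high-income / low-income residual variance = {ratio_linear:.2f}")
print(f"log-log : high-income / low-income residual variance = {ratio_loglog:.2f}")
```

In this setup the heteroskedasticity seen in the levels regression comes entirely from the choice of functional form, so it largely disappears once the model is estimated in logs. Real consumption data need not behave this way.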
CONSEQUENCES OF
HETEROSKEDASTICITY
Heteroskedasticity represents a
situation where the variance of the
disturbance term of the model is no longer
constant but varies across observations.
We therefore check whether the OLS estimators
continue to remain unbiased and
minimum-variance (best) when the
disturbance term of the model is
heteroskedastic.
Unbiasedness
β-hat remains unbiased when the disturbance
term of the model (ε_i) is heteroskedastic.
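A short derivation may help show why unbiasedness does not depend on equal variances. It is a sketch under assumed notation that the slides do not spell out: a two-variable model Y_i = β_0 + β_1 X_i + ε_i, with E(ε_i | X) = 0, mutually uncorrelated disturbances, and Var(ε_i | X) = σ_i².

```latex
\begin{align*}
\hat{\beta}_1 &= \frac{\sum_i x_i Y_i}{\sum_i x_i^2}
  = \beta_1 + \frac{\sum_i x_i \varepsilon_i}{\sum_i x_i^2},
  \qquad x_i = X_i - \bar{X}, \\
E(\hat{\beta}_1 \mid X) &= \beta_1
  + \frac{\sum_i x_i \, E(\varepsilon_i \mid X)}{\sum_i x_i^2} = \beta_1, \\
\operatorname{Var}(\hat{\beta}_1 \mid X) &=
  \frac{\sum_i x_i^2 \sigma_i^2}{\left(\sum_i x_i^2\right)^2}.
\end{align*}
```

The expectation step uses only E(ε_i | X) = 0, so the variances σ_i² play no role in it. They enter only the variance expression, which reduces to the familiar σ² / Σ x_i² when all σ_i² are equal.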
Bestness
β-hat is no longer a minimum-variance, and
hence best, estimator. As β-hat is
unbiased but not best, it is
inefficient.
Consistency
β-hat remains consistent when the disturbance term is
heteroskedastic.
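The unbiasedness and loss of "bestness" described above can be checked with a small Monte Carlo experiment. The sketch below is illustrative only: the true parameters, the sample size, and the assumption that the standard deviation of the disturbance is proportional to X are all hypothetical. Weighted least squares with the true variances is used as the comparison estimator. It is not discussed in the slides and is brought in only to show that a linear unbiased estimator with smaller variance than OLS exists.

```python
import numpy as np

rng = np.random.default_rng(42)       # fixed seed for reproducibility
n, reps = 50, 5000
beta0, beta1 = 1.0, 2.0               # hypothetical true parameters
x = np.linspace(1.0, 10.0, n)         # regressor held fixed across replications
sigma = 0.5 * x                       # assumed: disturbance sd grows with x

def slope(x, y, w):
    """Weighted least-squares slope; equal weights give the OLS slope."""
    xm = np.average(x, weights=w)
    ym = np.average(y, weights=w)
    return np.sum(w * (x - xm) * (y - ym)) / np.sum(w * (x - xm) ** 2)

ols = np.empty(reps)
wls = np.empty(reps)
for r in range(reps):
    y = beta0 + beta1 * x + rng.normal(0.0, sigma)
    ols[r] = slope(x, y, np.ones(n))
    wls[r] = slope(x, y, 1.0 / sigma**2)   # weights use the (known) true variances

print(f"OLS slope: mean = {ols.mean():.3f}, variance = {ols.var():.4f}")
print(f"WLS slope: mean = {wls.mean():.3f}, variance = {wls.var():.4f}")
```

Both sampling means should lie close to the true slope of 2, while the OLS slope shows the larger sampling variance. In practice the variances are unknown, so the weighted estimator here is an idealised benchmark rather than a feasible procedure.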
The OLS estimators
continue to remain unbiased and consistent under
heteroskedasticity.