
Sankhyā : The Indian Journal of Statistics

Special Issue on Quantile Regression and Related Methods


2005, Volume 67, Part 2, pp 335-358
© 2005, Indian Statistical Institute

The Glejser Test and the Median Regression


Marilena Furno
Università di Cassino, Italy

Abstract
The Glejser test is affected by a non-vanishing estimation effect in the presence of skewness. We show that such an effect occurs with contaminated errors as well, and that the skewness correction is inappropriate when there is contamination. We consider the use of residuals estimated with conditional median regression (least absolute deviations, or LAD) and discuss why LAD is effective against both skewness and contamination. The effectiveness of LAD is confirmed by simulations. With contaminated errors, both the standard and the skewness corrected Glejser tests perform poorly when based on least squares residuals. However, they perform very well when implemented using LAD residuals. The latter turn out to be a good alternative to bootstrap methods, which are generally used to resolve the discrepancy between the asymptotic and finite sample behaviour of a test.

AMS (2000) subject classification. Primary 62F03, 62J05; secondary 62P20.


Keywords and phrases. Glejser test, skewness, influence function, quantile
regression.

1 Introduction

The Glejser test is a well-known test for heteroscedasticity (Glejser, 1969), which is based on weak assumptions and is very easy to implement.
It checks for the presence of a systematic pattern in the variances of the
errors by estimating an auxiliary regression, where the absolute value of the
residuals of the main equation is the dependent variable. Godfrey (1996)
shows that this test is affected by an “estimation effect”, namely it over-
rejects the null hypothesis of homoscedasticity in the presence of skewed
error distributions, and this effect does not vanish asymptotically.1
In addition to the above estimation effect in the Glejser test due to
skewness, this paper shows the following.
1. Im (2000) and, independently, Machado and Santos Silva (2000) propose a skewness corrected Glejser test; we discuss both in the following section.

• Godfrey’s asymptotically non-vanishing estimation effect exists not only with skewed error distributions, but also with contaminated errors, which are symmetric and have tails heavier than normal. Contaminated errors are characterized by a greater probability of generating outliers, which affect the estimated coefficients and the residuals of the main equation, and thus the behaviour of the Glejser test.

• The use of robust residuals in the test function allows contamination to be controlled. Pagan and Hall (1983) stress the relevance of robust residuals for testing purposes. In particular, they suggest estimating the main equation by OLS and considering, for testing purposes, a robust estimator in the auxiliary regression. Instead, we propose and justify a different strategy, which involves robust estimation of the main equation and the use of the robust residuals to implement the diagnostic test. Once the residuals are robustly computed, the need to estimate the auxiliary regression robustly is marginal.

• Among all the robust estimators, the least absolute deviation (LAD) estimator is the most suitable one for computing the residuals needed to implement the Glejser test. We show that the normal equation defining the LAD estimator is part of the Taylor expansion approximating the Glejser test function. When we use a different robust estimator, this result no longer holds, and the degree of skewness affects the behaviour of the test. When we use ordinary least squares (OLS) or any other non-robust estimator, both skewness and contamination affect the behaviour of the test.

• The Glejser test computed with LAD residuals is asymptotically distributed as a χ².

The Monte Carlo experiments yield results in line with most of Godfrey’s findings: there is over-rejection of the null hypothesis in the Glejser test with skewed error distributions, and under-rejection of the null under normality when the skewness corrected Glejser test is used.
We include as a benchmark the Koenker test for heteroscedasticity, which is widely used and easy to implement. It requires the estimation of an auxiliary regression, where the dependent variable is the squared residual of the main equation, rather than its absolute value as in the Glejser test.
In our simulations, the Koenker test turns out to have good size in the
presence of skewness and contamination, when there is only one explanatory
variable. However, the OLS based Koenker test is undersized in the case of normality2 and over-rejects the true null also in the case of skewed and contaminated errors, when there are many explanatory variables. In addition,
our results show the following.

• The Glejser test over-rejects the true null also in the case of error contamination.

• The Glejser test is well behaved with asymmetric and contaminated errors if we use LAD instead of OLS residuals in the auxiliary regression. When the test is computed using LAD residuals, any skewness correction, as defined in Im (2000) or Machado and Santos Silva (2000), is redundant.

• The skewness corrected Glejser test, like the Koenker test, works properly with skewness and contamination when there is only one explanatory variable. However, with three or more explanatory variables, both tests over-reject the true null in many experiments with skewed errors and in all the experiments with contaminated errors.

• The skewness corrected test and the Koenker test, just like the simple Glejser test, work fine with contamination when there are many explanatory variables, if they are computed using LAD instead of OLS residuals.

• The presence of contamination in the explanatory variables causes some over-rejection in the LAD based tests, and this occurs because LAD is not designed to deal with outliers in the x’s.

Finally, Godfrey and Orme (1999), analysing the most popular tests of
heteroscedasticity, find that nominal and finite sample critical values are
substantially different. By using bootstrap methods they get good results,
although there is some over-rejection in the Glejser test with asymmetric
error distributions. Horowitz and Savin (2000) also consider bootstrap to
obtain a good estimator of the critical value. Our paper shows that good
results can be obtained using LAD residuals in the test functions. The LAD
based tests have empirical sizes close to, and in some cases better than, the
bootstrap estimates. With respect to the bootstrap, LAD is easier and faster
to estimate since it is computationally less intensive.3
2. This could be linked to the results of Tse (2002), who shows that, in the case of test functions based on estimated regressors, OLS leads to under-rejection of the null.
3. Indeed, quantile regression estimators are included in standard statistical packages and require neither specific programming nor a large number of replicates.

Section 2 defines the model and reviews the different versions of the Glejser test. Section 3 discusses the Taylor expansion and its connection with the
estimation effect. Section 4 summarizes the test functions and presents the
asymptotic distribution of the Glejser test based on LAD residuals. Sec-
tion 5 outlines the Monte Carlo experiments, and Sections 6 and 7 discuss
the results in terms of the size and the power. The last section draws the
conclusions.

2 Review of the Literature

Consider the linear regression model y_t = x_t β + u_t, where x_t is a (1, k) vector containing the t-th observation on all the k explanatory variables and β is a (k, 1) vector of unknown parameters. Here the u_t's, under the null, are i.i.d. errors with common distribution F, density f, zero median and constant variance σ². The vector b is a consistent estimator of β; û_t = y_t − x_t b_OLS is the OLS residual at time t, while ê_t = y_t − x_t b_LAD is the LAD residual at time t. Tests for heteroscedasticity based on artificial regressions define an auxiliary equation g(û_t) = α0 + z_t α, where α and z_t are q-dimensional vectors. The Glejser (1969) test considers the case q = 1, g(û_t) = |û_t|, z_t = x_t^r, r = ±0.5, ±1. The hypotheses being tested are:

H0: E(u_t²) = σ² = α0,
H1: E(u_t²) = σ_t² = α0 + z_t α.

We assume the following regularity conditions for the explanatory variables x_t and z_t:

a1) n^(-1) Σ_t x_t′ x_t → Ω_xx, positive definite;

a2) n^(-1) Σ_t (z_t − z̄)′ (z_t − z̄) → Ω_zz, positive definite;

a3) n^(-1) Σ_t (z_t − z̄)′ x_t → Ω_zx.

In the Glejser test, under the null hypothesis of homoscedasticity, the term nR² from the artificial regression, estimated with OLS, is assumed to be asymptotically distributed as a χ²(q), where n is the sample size and R² is the coefficient of determination of the auxiliary regression. A large value of nR² leads to the rejection of the null.4 The test is a check for orthogonality between the OLS residuals and z_t, i.e., on the significance of T(b_OLS) = n^(-1/2) Σ_t (z_t − z̄)′ |û_t|. Godfrey (1996) considers the Taylor expansion of T(b_OLS),

T(b_OLS) = T(β_OLS) + n^(-1/2) [∂T(β_OLS)/∂β_OLS] n^(1/2) (b_OLS − β_OLS) + o_p(1),
to evaluate the robustness of the test. But, since “differentiability conditions are violated and the standard Taylor series approach is not available”, instead of using the Taylor expansion, Godfrey approximates the test function T(b_OLS) with the expression

T(b_OLS) = T(β_OLS) + (1 − 2p*) Ω_zx n^(1/2) (b_OLS − β_OLS) + o_p(1),   (1)

where p* is the probability of non-negative errors, and Ω_zx is the limit of the sample covariance matrix n^(-1) Σ_t (z_t − z̄)′ x_t. The second term of equation (1) can be ignored if p* = 0.5 (e.g., when the error distribution is symmetric), even if Ω_zx ≠ 0, which is often the case. This leads Godfrey to conclude that the Glejser test is inadequate in the presence of skewness, due to estimation effects.

4. When deciding between the null and the alternative hypothesis, we are not interested in estimating σ² per se. The error variance is a nuisance parameter, and we need to check whether this parameter is constant over the sample since, under an unaccounted-for alternative of heteroscedasticity, any estimator of the vector of regression coefficients β is inefficient.
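To make the mechanics concrete, here is a minimal sketch of the standard test in Python with numpy, scipy and statsmodels (the paper's own simulations use Stata, so this is only an illustration; the data generating process and all variable names below are ours, not the paper's):

```python
# Minimal sketch of the standard (OLS based) Glejser test; illustrative only.
import numpy as np
import statsmodels.api as sm
from scipy import stats

def nR2_test(dep, Z):
    """nR^2 statistic and chi-square p-value for the auxiliary regression
    of `dep` on a constant and the q test variables in Z."""
    aux = sm.OLS(dep, sm.add_constant(Z)).fit()
    stat = len(dep) * aux.rsquared
    return stat, stats.chi2.sf(stat, df=Z.shape[1])

rng = np.random.default_rng(0)
n = 80
x = rng.uniform(0, 1, n)                     # one regressor; test variable z_t = x_t
X = sm.add_constant(x)
y = 0.3 + 0.6 * x + rng.standard_normal(n)   # homoscedastic errors: H0 is true
u_hat = sm.OLS(y, X).fit().resid             # OLS residuals of the main equation

stat, pval = nR2_test(np.abs(u_hat), x.reshape(-1, 1))  # dependent variable |u_hat|
print(f"Glejser: nR^2 = {stat:.2f}, p = {pval:.3f}")
```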
Im (2000) and Machado and Santos Silva (2000) independently present modified Glejser tests that correct for the skewness of the error distribution. Im (2000) approximates the Glejser test by

T(b_OLS) − T(β_OLS) = n^(-1/2) Σ_t (z_t − z̄)′ (|û_t| − |u_t|)
  = n^(-1/2) [Σ_{t∈n1} (z_t − z̄)′ (û_t − u_t) − Σ_{t∈n2} (z_t − z̄)′ (û_t − u_t)] + o_p(1)
  = −m n^(-1) Σ_t (z_t − z̄)′ x_t n^(1/2) (b_OLS − β_OLS) + o_p(1),   (2)

where the skewness coefficient m = 2Pr(u_t ≥ 0) − 1 is estimated by m̂ = (n1 − n2)/n, the difference between the number of positive residuals, n1, and the number of negative residuals, n2, over the sample size n. Im (2000) proposes to implement

the artificial regression

|ût | − m̂ût − µ̂ = α0 + zt α, (3)

where µ̂ is an estimate of µ = E|ut |. Under the null, nR2 from the artificial
regression in (3) is asymptotically distributed as a χ2(q) , and the result is not
affected by the degree of skewness of the error distribution.
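A short sketch of the corrected dependent variable, reusing u_hat, x and the nR2_test helper from the previous illustrative snippet (again an assumption-laden toy, not the paper's code):

```python
# Im's skewness-corrected dependent variable for the auxiliary regression (3).
m_hat = (np.sum(u_hat >= 0) - np.sum(u_hat < 0)) / len(u_hat)  # (n1 - n2)/n
mu_hat = np.abs(u_hat).mean()                                  # estimate of E|u_t|
dep_im = np.abs(u_hat) - m_hat * u_hat - mu_hat
stat, pval = nR2_test(dep_im, x.reshape(-1, 1))
```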
Machado and Santos Silva (2000) suggest replacing the dependent variable |û_t| of the artificial regression with û_t [I(û_t ≥ 0) − η], where I(·) is the indicator function of the event in parentheses, and η, defined as η = Pr(u_t ≥ 0), is estimated by η̂ = n^(-1) Σ_t I(û_t ≥ 0).5 The artificial regression becomes

û_t [I(û_t ≥ 0) − η̂] = α0 + z_t α.   (4)

5. Since the Im test and the Machado and Santos Silva test are numerically identical, in the following sections we focus the discussion on the Im test.

Finally, we consider another frequently implemented test, presented by Koenker (1981), which selects g(û_t) = û_t² as the dependent variable of the artificial regression:

û_t² = α0 + z_t α.   (5)

With i.i.d. errors having zero mean, finite variance and finite fourth moment, under the null, nR² from equation (5) is assumed to be asymptotically distributed as a χ²(q). The test function T(b_OLS) = n^(-1/2) Σ_t (z_t − z̄)′ û_t² checks the orthogonality between z_t and the squared residuals, and it does not violate the differentiability conditions, since

plim n^(-1/2) ∂T(β_OLS)/∂β_OLS = −2 plim n^(-1) Σ_t (z_t − z̄)′ x_t u_t = 0.

This allows Godfrey (1996) to assess the robustness of the Koenker test with respect to estimation effects. However, in their simulations, Godfrey and Orme (1999) find that the robustness of this test decreases when the number of test variables increases, and we will relate this result to the lack of robustness of the residuals of the main equation.
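In the same illustrative setting as the earlier snippets, the Koenker variant changes only the dependent variable of the auxiliary regression:

```python
# Koenker (1981): regress the squared residuals on the test variables.
stat, pval = nR2_test(u_hat ** 2, x.reshape(-1, 1))
```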

3 Estimation Effect in Taylor Expansions and Asymptotic Approximations

We propose to introduce the following approximations in the Taylor expansion.

(i) In view of the non-differentiability of the absolute value function in the Glejser test, we choose to use the sign(·) function in place of the derivative. In other words, we define ∂T(β)/∂β through ψ(u_t), where ψ(·) = sign(·), the directional derivative of the absolute value function.6 Our sign function ψ(·) replaces

• the term (1 − 2p*) of equation (1) (Godfrey, 1996),
• the sample splitting m̂ = (n1 − n2)/n of equation (3) (Im, 2000), or
• the term I(û_t ≥ 0) − η̂ in the artificial regression (4) (Machado and Santos Silva, 2000).

Since ψ(·) enters the normal equations defining the LAD estimator, this explains why LAD is the “natural choice of conditional location for the equivalence of T(b) and T(β) to hold under general conditions” (Machado and Santos Silva, 2000). In LAD, the number of positive residuals balances the number of negative residuals by construction, since the LAD fitted plane computes the conditional median of the y’s. In OLS, this is not necessarily the case, since the OLS fitted plane estimates the conditional mean of the y’s. In the case of asymmetric distributions, the mean differs from the median, leading to a number of positive residuals larger or smaller than the number of negative ones, according to the presence of positive or negative skewness.

(ii) Both in the Taylor expansion and in Godfrey’s asymptotic approximation (1), by considering β as a functional, we replace n^(1/2) (b − β) by its asymptotic representation, i.e., by the influence function (IF), provided the required regularity conditions are fulfilled. The IF is the derivative of the selected estimator with respect to contamination,

IF(b, F, y) = ∂b((1 − λ)F + λδ_y)/∂λ |_{λ=0}

(Hampel et al., 1986). In this derivative, the empirical distribution is expressed as the linear combination of F, the assumed distribution, and δ_y, the probability measure assigning unit mass at the point y, to account for the presence of an outlying observation in y. The term λ defines the degree of contamination of the empirical distribution, i.e., the percentage of outliers in the data. Therefore n^(1/2) (b − β) = n^(-1/2) Σ_t IF(b, F, x_t, y_t) + o_p(1) defines the impact of the degree of contamination λ on the regression coefficients and, through them, on the test function T(β).

6. Technically, at u_t = 0 the |u_t| function has no derivative. However, the directional derivative is a generally accepted approximation.

Considering (i) and (ii), the Taylor expansion of the Glejser test T(b_OLS) = n^(-1/2) Σ_t (z_t − z̄)′ |û_t| becomes:

T(b_OLS) − T(β_OLS)
  = n^(-1/2) [∂T(β_OLS)/∂β_OLS] n^(1/2) (b_OLS − β_OLS) + o_p(1)
  = [n^(-1) Σ_t ψ(u_t(β_OLS)) (z_t − z̄)′ x_t] [n^(-1/2) Σ_t IF(b_OLS, F, x_t, y_t)] + o_p(1)
  = [n^(-1) Σ_t ψ(u_t(β_OLS)) (z_t − z̄)′ x_t] [n^(-1/2) Σ_t (E x_t′ x_t)^(-1) u_t(β_OLS) x_t′] + o_p(1).   (6)

Equation (6) shows that the estimation effect depends upon the degree of skewness (which is only one of the possible imbalances), measured by the sign function, and is affected by the lack of boundedness of the influence function of b_OLS. Outliers can cause inconsistency, since IF(b_OLS, F, x_t, y_t) is not bounded in the errors: n^(-1/2) (b_OLS − β_OLS) = O_p(1), and OLS is not the appropriate estimator in the presence of outliers. Analogously, the impact of outliers can be seen in Godfrey’s asymptotic approximation (1):

T(b_OLS) = T(β_OLS) + (1 − 2p*) Ω_zx n^(1/2) (b_OLS − β_OLS) + o_p(1)
  = T(β_OLS) + (1 − 2p*) Ω_zx n^(-1/2) Σ_t IF(b_OLS, F, x_t, y_t) + o_p(1)
  = T(β_OLS) + (1 − 2p*) Ω_zx n^(-1/2) Σ_t (E x_t′ x_t)^(-1) u_t(β_OLS) x_t′ + o_p(1).   (7)

Once again, by replacing the term n^(1/2) (b_OLS − β_OLS) with the influence function of b_OLS, the impact of contamination on the Glejser test is self-evident. Godfrey shows the existence of an estimation effect when working with the residuals, while this effect is irrelevant when the true errors are in use. We relate this result to the lack of robustness of the OLS residuals.
The impact of error and/or data contamination does not vanish asymptotically, and it can be controlled only by implementing a robust estimator, which is characterized by a bounded IF. When we select a robust estimator for β, n^(1/2) (b_robust − β) = n^(-1/2) Σ_t IF(b_robust, F, x_t, y_t) + o_p(1) is bounded, asymptotically normal with zero mean and variance E(IF)², and the effect of contamination is under control. In particular, when we consider the LAD estimator,

n^(1/2) (b_LAD − β_LAD) = n^(-1/2) Σ_t IF(b_LAD, F, x_t, y_t) + O_p(n^(-1/4))
  = n^(-1/2) Σ_t (E x_t′ x_t)^(-1) ψ(u_t(β_LAD)) x_t′ / (2f(0)) + O_p(n^(-1/4)),

where f(0) is the height of the error density at the median. The LAD estimator is asymptotically normal, i.e., n^(1/2) (b_LAD − β_LAD) ∼ N(0, (2f(0))^(-2) Ω_xx^(-1)) (Koenker and Bassett, 1982). Its influence function IF(b_LAD, F, x_t, y_t) is bounded with respect to outlying errors,7 and, by construction, the positive and negative residuals are balanced. This makes any adjustment in the standard Glejser test unnecessary.

7. However, LAD is not bounded with respect to large x’s, which are unconstrained within the influence function. When using the OLS estimator, ψ(u_t(β_OLS)) = u_t(β_OLS): the sign function is not there in the IF, and both the errors and the x’s are unbounded.
Indeed, the Taylor expansion of the Glejser test based on LAD residuals, T(b_LAD) = n^(-1/2) Σ_t (z_t − z̄)′ |ê_t|, is:

T(b_LAD) = T(β_LAD)
  + [n^(-1) Σ_t ψ(u_t(β_LAD)) (z_t − z̄)′ x_t] [n^(-1/2) Σ_t (E x_t′ x_t)^(-1) ψ(u_t(β_LAD)) x_t′ / (2f(0))]
  + remainder,   (8)

which shows the robustness of the LAD based Glejser test with respect to both skewness and contamination, without the introduction of any correcting factor.
Summarizing, equations (6) and (7) show that both asymmetry and contamination can affect the behaviour of the Glejser test.
• With OLS, the skewness coefficient p* in equation (7) is free to take any value. In terms of the Taylor expansion (6), the sign function ψ(u_t(β_OLS)) is free to take any number of positive (or negative) values, since the OLS residuals are not necessarily balanced between positive and negative values. This is not the case in the LAD regression, where the number of positive residuals offsets the number of negative ones by construction.

• With OLS, the influence function IF(b_OLS, F, x_t, y_t) is not bounded, so that large residuals and/or outliers in the x’s can cause a drastic change in the vector of estimated coefficients b_OLS. This is not the case for IF(b_LAD, F, x_t, y_t), which is bounded with respect to large residuals (but not with respect to outliers in the x’s).
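Both bullet points can be checked numerically. The toy sketch below (Python with numpy and statsmodels; the paper's simulations use Stata, and all names here are ours) counts residual signs under skewed errors and then adds a single gross error to expose the unbounded influence on the OLS slope:

```python
# Toy check of the two points above: sign balance and bounded influence.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0, 1, n)
X = sm.add_constant(x)
u = rng.chisquare(2, n) - 2                     # right-skewed errors, zero mean
y = 0.3 + 0.6 * x + u

res_ols = sm.OLS(y, X).fit().resid
res_lad = sm.QuantReg(y, X).fit(q=0.5).resid    # median (LAD) regression
print("positive residuals, OLS:", np.sum(res_ols > 0))   # well below n/2 here
print("positive residuals, LAD:", np.sum(res_lad > 0))   # ~n/2 by construction

y_cont = y.copy()
y_cont[0] += 100.0                              # one outlying error in y
b_ols = sm.OLS(y_cont, X).fit().params[1]
b_lad = sm.QuantReg(y_cont, X).fit(q=0.5).params[1]
print(f"slope with one outlier: OLS = {b_ols:.2f}, LAD = {b_lad:.2f}")
```

Under this right-skewed design the OLS sign counts fall well below n/2 while the LAD counts stay at n/2 up to rounding, and the contaminated observation typically moves the OLS slope far more than the LAD slope.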

4 Asymptotic Distribution of the LAD Test Function

To compare the heteroscedasticity tests defined in Section 2, the selected linear regression model is estimated by both OLS and LAD. We have defined û_t = y_t − x_t b_OLS and ê_t = y_t − x_t b_LAD, which allow us to implement the following six artificial regressions,8 corresponding to six different tests of heteroscedasticity:

a) |û_t| = a0 + a z_t,   (G_OLS)
b) |û_t| − m̂ û_t − µ̂ = a0 + a z_t,   (Im_OLS)
c) û_t² = a0 + a z_t,   (K_OLS)
d) |ê_t| = a0 + a z_t,   (G_LAD)
e) |ê_t| − m̂ ê_t − µ̂ = a0 + a z_t,   (Im_LAD)
f) ê_t² = a0 + a z_t.   (K_LAD)

Equation (a) is the standard Glejser test (Glejser, 1969) implemented with
OLS residuals (GOLS ), while equation (d) is a Glejser test based on LAD
residuals (GLAD ). In (b), we consider the skewness adjusted Glejser test (Im,
2000), computed using OLS residuals (ImOLS ). In equation (e), we define
the same test using LAD residuals (ImLAD ). Equation (c) is the Koenker
test (Koenker, 1981) based on the squared OLS residuals (KOLS ). In (f ),
the same test is computed using squared LAD residuals (KLAD ).
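As an illustration that (d)-(f) differ from (a)-(c) only through the residuals of the main equation, the following sketch (again illustrative Python/statsmodels rather than the paper's Stata; nR2_test is the helper defined in the earlier snippet) estimates the main equation by median regression and feeds the LAD residuals into the same OLS auxiliary regressions:

```python
# LAD based versions (d)-(f): same OLS auxiliary regressions, robust residuals.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 80
x = rng.uniform(0, 1, n)
X = sm.add_constant(x)
y = 0.3 + 0.6 * x + rng.chisquare(2, n) - 2   # skewed, homoscedastic errors
Z = x.reshape(-1, 1)                          # test variable z_t = x_t

e_hat = sm.QuantReg(y, X).fit(q=0.5).resid    # LAD residuals of the main equation

m_hat = (np.sum(e_hat >= 0) - np.sum(e_hat < 0)) / n
mu_hat = np.abs(e_hat).mean()
for name, dep in [("G_LAD", np.abs(e_hat)),                            # (d)
                  ("Im_LAD", np.abs(e_hat) - m_hat * e_hat - mu_hat),  # (e)
                  ("K_LAD", e_hat ** 2)]:                              # (f)
    stat, pval = nR2_test(dep, Z)
    print(f"{name}: nR^2 = {stat:.2f}, p = {pval:.3f}")
```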
In the six auxiliary equations (a)–(f), the term nR² is asymptotically distributed as a χ²(q) under the null hypothesis of i.i.d. errors. Indeed, the distribution of the LAD based tests (d), (e) and (f) can be obtained as:

T(b_LAD) = n^(-1/2) Σ_t (z_t − z̄)′ |y_t − x_t b_LAD|
  = n^(-1/2) Σ_t (z_t − z̄)′ |y_t − x_t β_LAD + x_t (β_LAD − b_LAD)|
  = n^(-1/2) Σ_t (z_t − z̄)′ |y_t − x_t β_LAD| − n^(-1/2) Σ_t (z_t − z̄)′ x_t ψ(u_t(β_LAD)) (b_LAD − β_LAD) + o_p(1)
  = T(β_LAD) − n^(-1) Σ_t (z_t − z̄)′ x_t ψ(u_t(β_LAD)) O_p(1) + o_p(1)
  = T(β_LAD) + o_p(1),   (9)

where the last equality follows from the fact that the term n^(-1) Σ_t (z_t − z̄)′ x_t ψ(u_t(β_LAD)) O_p(1) can be merged with the o_p(1) remainder, which is possible for the median regression.9 The expansion in (9) is valid for any consistent estimator of β but, with skewed and contaminated errors, the last equality holds only for residuals computed from the median regression.10 By the Central Limit Theorem, under the null hypothesis of homoscedasticity, T(b_LAD) is asymptotically normal, and nR² from the auxiliary regressions (d), (e) and (f) is asymptotically distributed as χ²(q).

8. All the artificial regressions are estimated by OLS. The use of LAD in the auxiliary equations turns out to be of little help in our simulations, once the residuals from the main equation are robustly computed (these results are not reported).

5 Plan of the Experiments

The Monte Carlo study here implemented, far from being exhaustive,
replicates some of the experiments presented in other simulation studies on
the Glejser test.11 The data generating process is yt = constant + βxt + ut ,
where constant = 0.3 and β = 0.6. The x’s are generated from a uniform
distribution in the range (0,1), x1t ; from a log-normal, Λ(3,1), where log(x2t )
follows a N (3, 1); from a chi-square with one degree of freedom, x3t ; from a
contaminated normal, designed to have 90% of the observations generated by
a standard normal and the remaining 10% generated by a zero mean normal
with variance equal to 100, x4t; from a student-t with two degrees of freedom, x5t. We begin with a simple model with only one explanatory variable
and then we increase the number of explanatory variables in the regression.
Godfrey and Orme (1999) stress the relevance of checking the behaviour of
a test by employing multiple regression models, since “results obtained from
simple experimental designs may be an unreliable guide to finite sample performance”. Thus we gradually increase the number of explanatory variables
both in the main regression and in the equation defining heteroscedasticity.
This allows checking the stability of the results with respect to the increasing
complexity of a model. The errors are serially independent and independent
of the xt ’s. We consider the following error distributions: standard nor-
mal; chi-square with two degrees of freedom; log-normal, Λ(0,1); student-t
with five degrees of freedom; contaminated normal, where 90% of the ob-
servations are generated by a standard normal and the remaining 10% are
generated by a zero mean normal with variance equal to 100, CN(10%,100).

9. Equation (9) is slightly different from the asymptotic distribution of the Machado and Santos Silva (2000) test function: they define a skewness correcting factor, which we show to be unnecessary when implementing LAD.
10. Equation (9), with the absolute value replaced by the squared residuals, holds for the test (f) as well, due to the use of the LAD residuals, u_t(β_LAD), in the auxiliary equation.
11. The plan of the experiments takes into account some of the distributions and the models analysed in the simulations implemented by Godfrey (1996), Godfrey and Orme (1999), Im (2000), and Machado and Santos Silva (2000).

The chi-square and the log-normal distributions are used to analyse the behaviour of the tests in the presence of skewness, while the student-t and the contaminated distributions are characterized by thick tails and induce symmetrically distributed outliers. All the distributions, with the sole exception of the contaminated normal, are taken from other Monte Carlo studies. Our contribution is to introduce simulation experiments with a contaminated distribution both in the errors and in the explanatory variables, in order to verify the impact of outliers on the selected tests.
We consider two sample sizes, n=34 and n=80. The number of replicates
for each experiment is 10 000. We select as test variable zt = xt .
Under the alternative, the errors are defined as σ_t u_t, with σ_t = [0.2{1 + a z_t²/var(z_t)}]^(1/2), where z_t² is defined as a single variable in some models and as a vector of variables in others. Accordingly, a is either a single coefficient, set equal to 8, or a vector of coefficients, each equal to 8.12
Unlike previous simulation studies, we introduce contamination in the errors and in one of the explanatory variables; we extend the LAD approach to the modified Glejser and to the Koenker tests; and we consider both simple and multivariate regression models.
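As a rough sketch of how one cell of these experiments can be reproduced (in Python rather than the Stata used for the paper, with the replicate count cut down for speed; all names are ours), the snippet below estimates the empirical size of the OLS and LAD based Glejser tests under CN(10%,100) errors:

```python
# One size experiment: empirical rejection rates at the 5% nominal level
# under contaminated normal errors CN(10%, 100); the null (homoscedasticity)
# is true, so rates near 0.05 are good. Illustrative sketch only.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(3)
n, reps = 80, 500                       # the paper uses 10000 replicates
crit = stats.chi2.ppf(0.95, df=1)       # 3.84, one test variable
x = rng.uniform(0, 1, n)                # fixed design, z_t = x_t
X = sm.add_constant(x)

rej = {"G_OLS": 0, "G_LAD": 0}
for _ in range(reps):
    u = rng.standard_normal(n)
    mask = rng.uniform(size=n) < 0.10   # 10% of errors drawn from N(0, 100)
    u[mask] = 10.0 * rng.standard_normal(mask.sum())
    y = 0.3 + 0.6 * x + u
    residuals = {"G_OLS": sm.OLS(y, X).fit().resid,
                 "G_LAD": sm.QuantReg(y, X).fit(q=0.5).resid}
    for name, res in residuals.items():
        if n * sm.OLS(np.abs(res), X).fit().rsquared > crit:
            rej[name] += 1

print({name: count / reps for name, count in rej.items()})
```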

6 Results: Size

Table 1 reports the rejection rates of the tests under the null hypothesis
of homoscedasticity, at the 5% nominal level. In the table, the first column
gives the list of the tests analysed here, where the first three tests are the
OLS based ones, and the remaining three are the LAD based tests. All the
other columns report the rejection rates of these tests under different error
distributions. The table is further divided into eleven parts, I to XI, each
defining the different distributions generating the test variables xt : uniform
in the first part, x1t , log-normal in the second, x2t , chi-square in the third
part, x3t , contaminated normal in the fourth, x4t , and student-t in the fifth,
x5t . In these five parts, the term nR2 of the auxiliary regressions (a) through
(f ), under the null is assumed to be distributed as a χ2(1) with critical value
of 3.84 at the 5% nominal level. The remaining parts of the table present
models having more than one explanatory variable. In parts VI to VIII we
analyse a model with three explanatory variables, yt = 0.3+0.6x1t +0.6x2t +
0.6xjt +ut with j=3,4,5, and the test statistics are assumed to be distributed
as χ2(3) with critical value of 7.81 at the 5% level. Then we consider models
with four explanatory variables, yt = 0.3+0.6x1t +0.6x2t +0.6x3t +0.6xjt +ut
12. All the simulations are implemented using Stata, version 6.

with j=4,5, and the term nR2 is assumed to be distributed as a χ2(4) with
a critical value of 9.49. Finally, in the last part, we take into account a
model comprising all five explanatory variables, yt = 0.3 + 0.6x1t + 0.6x2t +
0.6x3t + 0.6x4t + 0.6x5t + ut . The test statistics are assumed to follow χ2(5)
with critical value of 11.1 at the 5% level.
To judge the size of the tests, we adopt the criteria in Godfrey (1996): rejection rates in the range 3.5%-6.6% can be considered “good”, while those falling in the range 2.6%-7.6% are “adequate”. In Table 1, the over-rejections are reported in bold. The tests in this table are computed using estimated residuals.

Table 1 shows the following when using OLS residuals in the test function.

(i) The Glejser test over-rejects the null not only with skewness but also
with contamination. This occurs in all the parts of the table, and the
first line of each part is in bold everywhere but with N (0, 1) errors and
with contaminated x’s. Indeed, in part IV the problem shared by all
the OLS based tests seems to be under-rejection.13

(ii) In the model with one explanatory variable, the OLS based Im test improves upon the OLS based Glejser test, yielding generally good rejection rates. However, it tends to over-reject the null with error contamination when the explanatory variable follows a chi-square distribution (part III), and with skewed errors, Λ(0,1), when the x’s are drawn from a uniform distribution and the sample is small (part I). In addition, the Im and the Koenker tests strongly under-reject the true null when xt is contaminated, with all the error distributions (part IV) and in both sample sizes, showing a tendency to under-reject even when the x’s come from a student-t distribution (part V).14

13. In comparing our results with those in Godfrey (the G(1.0) test, p. 287, 1996), we get very similar empirical sizes, coinciding up to the second digit, for the Glejser test in the case of log-normal and chi-squared errors. The main difference is in the experiments with student-t errors, where we find over-rejection in the Glejser test, and this is the case with contaminated errors as well. The latter experiment is not analysed in Godfrey nor in other simulation studies on the Glejser test.
14. The discrepancies between nominal and empirical sizes for the Im test reported in our simulations occur in experiments not implemented in Im (2000) or in other Monte Carlo studies.

Table 1. Glejser tests using the estimated residuals, n = 34, 80, OLS in the auxiliary regression, 10000 replicates, H0: σt = σ
I: x1t drawn from a uniform (0, 1)
N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80
GOLS .055 .051 .137 .137 .117 .121 .184 .182 .116 .084
ImOLS .048 .048 .066 .058 .064 .056 .070 .054 .056 .049
KOLS .049 .050 .066 .059 .063 .051 .064 .045 .044 .041
GLAD .040 .045 .047 .050 .047 .048 .049 .045 .039 .045
ImLAD .040 .044 .052 .053 .054 .051 .054 .047 .039 .045
KLAD .041 .048 .031 .033 .033 .032 .030 .028 .029 .036
II: x2t drawn from a log-normal (3,1)
N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80
GOLS .039 .039 .080 .083 .070 .072 .094 .092 .079 .074
ImOLS .033 .037 .039 .035 .038 .034 .044 .038 .060 .059
KOLS .024 .028 .041 .035 .042 .035 .049 .041 .049 .040
GLAD .045 .048 .034 .035 .040 .037 .040 .035 .050 .044
ImLAD .044 .048 .038 .037 .043 .039 .043 .036 .049 .044
KLAD .038 .040 .036 .033 .039 .034 .043 .038 .050 .045

III: x3t drawn from a χ2(1)


N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80
GOLS .043 .046 .092 .103 .079 .091 .102 .116 .089 .086
ImOLS .037 .043 .046 .042 .046 .043 .047 .046 .068 .070
KOLS .028 .034 .047 .042 .045 .045 .050 .049 .056 .048
GLAD .044 .052 .039 .041 .040 .043 .041 .044 .050 .047
ImLAD .044 .051 .045 .043 .045 .046 .044 .046 .050 .047
KLAD .043 .044 .039 .037 .041 .041 .044 .045 .056 .053
IV: x4t drawn from a contaminated normal (10%,100)
N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80
GOLS .023 .028 .042 .052 .034 .051 .048 .060 .041 .040
ImOLS .019 .026 .021 .022 .019 .026 .023 .023 .029 .032
KOLS .017 .024 .027 .030 .024 .037 .030 .042 .032 .034
GLAD .069 .079 .050 .053 .051 .057 .045 .054 .057 .059
ImLAD .068 .079 .054 .054 .056 .059 .047 .055 .056 .059
KLAD .054 .070 .044 .047 .042 .052 .044 .056 .052 .058

Table 1 (continued)
V: x5t drawn from a t(2)
N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80
GOLS .039 .037 .073 .072 .068 .065 .088 .081 .073 .067
ImOLS .034 .033 .036 .031 .038 .031 .041 .034 .055 .053
KOLS .025 .023 .037 .030 .035 .030 .042 .036 .043 .034
GLAD .043 .049 .034 .033 .037 .033 .034 .031 .044 .040
ImLAD .043 .048 .038 .035 .041 .034 .037 .032 .044 .040
KLAD .034 .038 .030 .028 .034 .031 .038 .033 .047 .041
VI: x1t , x2t and x3t together
N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80
GOLS .039 .043 .129 .152 .112 .121 .181 .197 .132 .119
ImOLS .031 .038 .058 .050 .061 .049 .077 .057 .086 .082
KOLS .029 .035 .061 .053 .063 .052 .078 .061 .069 .055
GLAD .034 .046 .039 .046 .044 .045 .050 .051 .057 .057
ImLAD .034 .046 .048 .050 .051 .050 .057 .053 .056 .057
KLAD .044 .052 .043 .043 .047 .039 .052 .052 .063 .061
VII: x1t , x2t and x4t together
N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80
GOLS .030 .038 .089 .127 .075 .104 .119 .161 .110 .088
ImOLS .025 .035 .040 .040 .042 .040 .052 .044 .081 .058
KOLS .022 .031 .050 .047 .050 .045 .065 .052 .072 .047
GLAD .048 .062 .045 .053 .047 .055 .053 .055 .073 .064
ImLAD .047 .062 .052 .058 .054 .059 .056 .057 .076 .064
KLAD .053 .066 .053 .049 .054 .051 .061 .056 .084 .066
VIII: x1t , x2t and x5t together
N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80
GOLS .042 .038 .121 .136 .103 .115 .166 .181 .118 .105
ImOLS .032 .036 .056 .045 .052 .045 .067 .055 .076 .072
KOLS .029 .032 .056 .044 .055 .049 .065 .055 .062 .048
GLAD .036 .046 .042 .043 .040 .043 .043 .047 .052 .054
ImLAD .036 .046 .051 .048 .048 .046 .050 .049 .052 .055
KLAD .041 .051 .036 .036 .043 .042 .045 .047 .057 .056

Table 1 (continued)
IX: x1t , x2t , x3t and x4t together
N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80
GOLS .033 .039 .109 .142 .096 .116 .150 .186 .113 .105
ImOLS .026 .035 .050 .046 .049 .047 .062 .050 .073 .076
KOLS .025 .031 .051 .052 .051 .053 .065 .064 .062 .058
GLAD .041 .057 .044 .055 .042 .052 .047 .056 .062 .072
ImLAD .040 .057 .057 .060 .051 .056 .057 .059 .064 .072
KLAD .051 .063 .044 .051 .042 .054 .053 .062 .073 .079

X: x1t , x2t , x3t and x5t together


N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80
GOLS .040 .042 .128 .147 .110 .125 .187 .205 .135 .123
ImOLS .032 .037 .056 .044 .057 .053 .077 .061 .090 .088
KOLS .030 .032 .061 .051 .058 .053 .079 .067 .076 .056
GLAD .034 .046 .039 .042 .040 .046 .053 .052 .061 .062
ImLAD .033 .046 .048 .048 .047 .052 .061 .054 .060 .063
KLAD .043 .054 .040 .042 .044 .049 .054 .058 .070 .070

XI: x1t , x2t , x3t , x4t and x5t together


N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80 n = 34 n = 80
GOLS .029 .033 .103 .137 .085 .119 .149 .193 .115 .116
ImOLS .022 .030 .048 .039 .046 .046 .066 .054 .083 .082
KOLS .019 .027 .050 .048 .049 .053 .070 .066 .073 .061
GLAD .031 .056 .038 .049 .040 .050 .050 .057 .069 .078
ImLAD .029 .056 .048 .056 .050 .057 .058 .060 .069 .077
KLAD .044 .070 .044 .049 .048 .055 .058 .069 .079 .087

Note: Under H0 , the “good” rejection rates are in the 3.5%-6.6% range. The over-rejections
are reported in bold.

(iii) When we consider the experiments with more than one explanatory
variable, over-rejection occurs in the presence of contamination and of
skewness for the OLS based Glejser test with both sample sizes and
with all the combinations of the explanatory variables here considered.
For the OLS based Im test, the over-rejection occurs in the presence
of error contamination regardless of the sample size (parts VI to XI),
while with log-normal errors and in small samples we find over-rejection
in parts VI, VIII and X. The OLS based Koenker test over-rejects in
small samples with error contamination, as can be seen in parts VI,

VII, X and XI, and with log-normal errors in parts VI, X, XI. Indeed,
both Im and Koenker have inadequate rates with log-normal errors
in the small sample (parts VI, X), while Im presents inadequate rates
with contaminated errors in parts VI, VII, X and XI.15

(iv) The Koenker test and, to a lesser extent, the Im test under-reject the null in the case of normality. In most of the experiments with normal errors, the Koenker test has inadequate rates (parts II, IV, V, VII, IX, XI). The under-rejection tends to disappear when n increases for the Im test, but not for the Koenker test.16

(v) The OLS based Im test substantially improves upon the OLS based Glejser test, although it does not close the gap between the actual and nominal size of the test when the errors are contaminated: the Im test presents inadequate rejection rates in parts VI, VII, X and XI, and adequate rates in parts III, VIII and IX, showing a tendency to over-reject with contaminated errors in seven out of eleven experiments.17

When using LAD residuals in the auxiliary regression, the simulations show the following.

(ia) The LAD based Glejser test does not over-reject with skewness in any of the experiments. With contamination, it shows a tendency to over-reject in parts VII, IX and XI, all characterized by the presence of x4t, that is, by contamination in the explanatory variable. This is a consequence of the definition of LAD, which does not control contamination in the x’s.18
15. Once again, the over-rejections of the Im test occur in experiments not implemented elsewhere. The result for the Koenker test agrees with the findings in Machado and Santos Silva (2000) and in Godfrey and Orme (1999), who point out the loss of robustness of the Koenker test when the number of explanatory variables increases, as can be seen in parts VI to XI of our Table 1.
16. This result for the Koenker and Im tests agrees with the findings of Godfrey and Orme (1999), and can be explained by the results in Tse (2002): in the case of test functions based on estimated regressors, OLS leads to under-rejection of the null.
17. Error contamination has not been analysed in other Monte Carlo studies. The contaminated distribution considered in Godfrey and Orme (1999) is contaminated in the center of the distribution and yields a bimodal error distribution. It differs from the one selected here, which is contaminated in the tails and is commonly used to model outliers.
18. Machado and Santos Silva (2000) implement the LAD based Glejser test in their simulations and find evidence of a good agreement between the nominal and empirical size of this test, as in our case. Their second set of variables is similar to the one used in our simulations. However, they do not consider the use of LAD residuals in the Koenker test or in their skewness adjusted Glejser test. In addition, they do not analyse the impact of error contamination, which affects the behaviour of their test particularly in models with more than one explanatory variable.

(iia) With one explanatory variable, the Im and the Koenker LAD based
tests work fine.
(iiia) With many explanatory variables, the Im and Koenker tests tend to over-reject the null only in those experiments including x4t. However, there are only two experiments with “inadequate” rates, for the Koenker test in parts VII and XI.
(iva) In the Koenker test, the under-rejection in the case of normality improves or disappears, with the sole exception of part IV, which once again involves the contaminated explanatory variable x4t.
(va) In most of the experiments of part IV the LAD based tests improve
upon the OLS based ones, although this estimator is not designed to
deal with outliers in the x’s.
Finally, in parts III, VI, VIII and X, and to a lesser extent in parts VII, IX and XI, the size distortion in the OLS-Glejser and OLS-Im tests does not disappear when the sample increases, while it disappears (in parts III, VI, VII, VIII, X) or decreases (in parts IX and XI) when the LAD residuals are used to implement the same tests.
These results imply that the Glejser test in conjunction with LAD residuals does not need any correction for skewness and/or contamination. The use of LAD residuals generally improves upon the results obtained with OLS residuals. The correction introduced by Im (2000) and Machado and Santos Silva (2000) does not work when there is more than one explanatory variable and the test function is estimated using OLS residuals. This is surprising, since the correction is designed to solve the over-rejection problem of the Glejser test with skewed errors. When the explanatory variable is contaminated, all the OLS based tests are undersized regardless of the error distribution. When there are as many as five explanatory variables, all the tests analysed here are undersized in the case of normality.

Godfrey and Orme (1999) propose the use of the bootstrap to estimate the size of Glejser-type tests. The models we implement in parts VI, VII and VIII of Table 1 are very similar to the model analysed in their article.19 Therefore, in Table 2, we report their bootstrap results and compare them with our results for the LAD based tests in the common experiments: standard normal, χ2(2), t5 and log-normal errors. The table shows that the two sets of results are comparable, but the bootstrap still causes over-rejections in the Glejser test with χ2(2) and with log-normal errors. The main advantage of using LAD over the bootstrap, however, is its reduced computational burden.

19. They take into account three explanatory variables that are realizations from a (1,31) uniform, a (3,1) log-normal, and an AR(1) process with coefficient 0.9. Their sample size is equal to 80.

Table 2. Empirical size computed using bootstrap and LAD, n = 80.
VI: x1t, x2t and x3t together
N (0, 1) χ2(2) t5 Λ(0, 1)
LAD bootstrap LAD bootstrap LAD bootstrap LAD bootstrap
Glejser 0.046 0.050 0.039 0.067 0.044 0.046 0.050 0.078
Im 0.046 0.051 0.048 0.049 0.051 0.048 0.057 0.059
Koenker 0.052 0.052 0.041 0.052 0.047 0.050 0.052 0.057
VII: x1t , x2t and x4t together
N (0, 1) χ2(2) t5 Λ(0, 1)
LAD bootstrap LAD bootstrap LAD bootstrap LAD bootstrap
Glejser 0.048 0.050 0.045 0.067 0.047 0.046 0.053 0.078
Im 0.047 0.051 0.052 0.049 0.054 0.048 0.056 0.059
Koenker 0.053 0.052 0.053 0.052 0.054 0.050 0.061 0.057
VIII: x1t , x2t and x5t together
N (0, 1) χ2(2) t5 Λ(0, 1)
LAD bootstrap LAD bootstrap LAD bootstrap LAD bootstrap
Glejser 0.036 0.050 0.042 0.067 0.040 0.046 0.043 0.078
Im 0.036 0.051 0.051 0.049 0.048 0.048 0.050 0.059
Koenker 0.041 0.052 0.036 0.052 0.043 0.050 0.045 0.057

Note: The table compares our results in Table 1 (parts VI to VIII) for the tests based on LAD residuals with the results in Godfrey and Orme (1999), computed using the bootstrap in the same experiments and with the same sample size (n = 80). Under H0, the “good” rejection rates are in the 3.5%-6.6% range; the “adequate” rejection rates are in the 2.6%-7.6% range. All the over-rejections are reported in bold.

7 Results: Power

Table 3 shows the power of the Glejser tests in the small sample, n = 34. Once again, the table is divided into parts, I to X. The first five parts consider the model with only one explanatory variable, and the true alternative is Ha: σt = [0.2{1 + 8 x_jt²/var(x_jt)}]^(1/2) for j = 1,2,3,4,5 in turn. The remaining parts consider multivariate models. In parts VI and VII, the model has three explanatory variables and the alternative is Ha: σt = [0.2{1 + 8 x_1t²/var(x_1t) + 8 x_2t²/var(x_2t) + 8 x_jt²/var(x_jt)}]^(1/2) for j = 3,4. In parts VIII and IX, the model has four explanatory variables, and the true alternative is Ha: σt = [0.2{1 + 8 x_1t²/var(x_1t) + 8 x_2t²/var(x_2t) + 8 x_3t²/var(x_3t) + 8 x_jt²/var(x_jt)}]^(1/2) for j = 4,5. In the final part of the table, there are five explanatory variables in the main equation, and the true alternative is Ha: σt = [0.2{1 + 8 x_1t²/var(x_1t) + 8 x_2t²/var(x_2t) + 8 x_3t²/var(x_3t) + 8 x_4t²/var(x_4t) + 8 x_5t²/var(x_5t)}]^(1/2). The auxiliary regressions implemented are those in equations a) to f) and, according to the model analysed in each part, we estimate them with the correct choice of explanatory variables.
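For completeness, a sketch of one such power experiment under the stated alternative (illustrative Python, not the paper's Stata code; all names are ours):

```python
# Power of the OLS based Glejser test when the alternative
# sigma_t = [0.2{1 + 8 x_t^2/var(x_t)}]^(1/2) is true.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(4)
n, reps = 34, 500
crit = stats.chi2.ppf(0.95, df=1)
x = rng.uniform(0, 1, n)
X = sm.add_constant(x)
sigma_t = np.sqrt(0.2 * (1 + 8 * x**2 / x.var()))   # heteroscedastic scale

rej = 0
for _ in range(reps):
    y = 0.3 + 0.6 * x + sigma_t * rng.standard_normal(n)
    u_hat = sm.OLS(y, X).fit().resid
    if n * sm.OLS(np.abs(u_hat), X).fit().rsquared > crit:
        rej += 1
print(f"empirical power of G_OLS at the 5% level: {rej / reps:.3f}")
```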
In this table, the highest rates of the OLS based tests are reported in
bold, while the highest rates of the LAD based tests are underlined.
Table 3. Power of the Glejser tests, n = 34, Ha: σt = {0.2[1 + 8 x_t²/var(x_t)]}^(1/2)
I: x1t drawn from a uniform (0,1)
N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
GOLS 0.926 0.809 0.823 0.648 0.360
ImOLS 0.913 0.643 0.680 0.417 0.257
KOLS 0.835 0.494 0.532 0.283 0.150
GLAD 0.916 0.659 0.715 0.435 0.273
ImLAD 0.914 0.685 0.739 0.456 0.275
KLAD 0.812 0.349 0.403 0.179 0.122
II: x2t drawn from a log-normal (3,1)
N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
GOLS 0.829 0.726 0.742 0.592 0.428
ImOLS 0.806 0.624 0.652 0.459 0.353
KOLS 0.776 0.599 0.620 0.439 0.289
GLAD 0.748 0.612 0.637 0.473 0.363
ImLAD 0.747 0.622 0.646 0.482 0.362
KLAD 0.748 0.542 0.571 0.394 0.278
III: x3t drawn from a χ2(1)
N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
GOLS 0.919 0.834 0.848 0.712 0.530
ImOLS 0.901 0.747 0.771 0.568 0.443
KOLS 0.874 0.713 0.734 0.543 0.356
GLAD 0.853 0.730 0.756 0.589 0.462
ImLAD 0.853 0.742 0.768 0.607 0.460
KLAD 0.846 0.645 0.682 0.487 0.343
IV: x4t drawn from a contaminated normal (10%,100)
N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
GOLS 0.341 0.227 0.216 0.199 0.194
ImOLS 0.320 0.173 0.167 0.141 0.149
KOLS 0.377 0.231 0.222 0.190 0.172
GLAD 0.482 0.432 0.442 0.394 0.276
ImLAD 0.480 0.434 0.443 0.397 0.276
KLAD 0.543 0.498 0.516 0.449 0.267

Table 3 (continued)
V: x5t drawn from a t(2)
N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
GOLS 0.770 0.686 0.699 0.562 0.407
ImOLS 0.749 0.585 0.614 0.434 0.336
KOLS 0.717 0.557 0.575 0.408 0.278
GLAD 0.683 0.567 0.593 0.456 0.351
ImLAD 0.682 0.580 0.603 0.467 0.352
KLAD 0.682 0.501 0.535 0.376 0.271
VI: x1t , x2t and x3t together
N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
GOLS 0.505 0.488 0.469 0.423 0.287
ImOLS 0.459 0.323 0.332 0.242 0.221
KOLS 0.423 0.296 0.297 0.222 0.181
GLAD 0.401 0.273 0.290 0.206 0.169
ImLAD 0.395 0.298 0.311 0.226 0.168
KLAD 0.403 0.229 0.240 0.178 0.158
VII: x1t , x2t and x4t together
N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
GOLS 0.493 0.449 0.430 0.365 0.234
ImOLS 0.448 0.287 0.294 0.203 0.172
KOLS 0.383 0.251 0.257 0.101 0.148
GLAD 0.442 0.304 0.318 0.216 0.182
ImLAD 0.435 0.326 0.338 0.231 0.181
KLAD 0.401 0.244 0.261 0.187 0.167
VIII: x1t , x2t , x3t and x4t together
N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
GOLS 0.373 0.387 0.357 0.345 0.247
ImOLS 0.334 0.242 0.243 0.202 0.192
KOLS 0.313 0.243 0.239 0.198 0.166
GLAD 0.312 0.234 0.240 0.197 0.177
ImLAD 0.305 0.253 0.255 0.210 0.177
KLAD 0.331 0.232 0.239 0.195 0.179
IX: x1t , x2t , x3t and x5t together
N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
GOLS 0.366 0.402 0.376 0.372 0.279
ImOLS 0.322 0.262 0.258 0.219 0.215
KOLS 0.318 0.257 0.259 0.215 0.184
GLAD 0.263 0.218 0.216 0.180 0.168
ImLAD 0.257 0.238 0.236 0.196 0.166
KLAD 0.310 0.215 0.219 0.178 0.172

X: x1t , x2t , x3t , x4t , and x5t together


N (0, 1) χ2(2) t5 Λ(0, 1) CN (10%, 100)
GOLS 0.279 0.332 0.302 0.314 0.237
ImOLS 0.239 0.207 0.206 0.186 0.182
KOLS 0.250 0.225 0.219 0.190 0.168
GLAD 0.222 0.187 0.194 0.171 0.169
ImLAD 0.214 0.206 0.211 0.185 0.169
KLAD 0.275 0.209 0.219 0.188 0.178

Note: The highest rates of the OLS based tests are in bold, those of the LAD based tests
are underlined.

In terms of power we can see the following.

(i) When using OLS residuals, the Glejser test has the highest rejection
rates, with few exceptions in part IV. This result is counterbalanced
by the poor performance of the Glejser test in terms of the size.

(ii) When using LAD residuals, the Im test presents the highest rates of
rejections in most of the experiments.

(iii) The number of rejections reached by the OLS based tests is generally higher than the number of rejections of the same tests computed using LAD residuals. The reduction in power seems to be the price paid for the good size of the LAD based tests.

(iv) Part IV, characterized by contaminated x’s, presents the lowest rejection rates among the experiments with only one explanatory variable (parts I to V). In this set of experiments, the LAD based tests have higher rejection rates than their OLS analogues, and the LAD based Koenker test is preferable.

(v) In the experiments with three or more explanatory variables, all the
tests have low power.

8 Conclusions

Godfrey (1996) proves that the Glejser test is non-robust in the presence of skewed errors. This paper adds that the Glejser test shows an estimation effect in the presence of error contamination as well. The choice of a robust estimator for the coefficients of the main equation can take care of the estimation effect due to error contamination. By choosing a robust estimator, and by using the robust residuals to implement the Glejser test, we can control the impact of contamination on the coefficient estimates and, consequently, the impact of contamination on the test function. In addition, we use the directional derivative to define the Taylor expansion of the Glejser test, and this coincides with the normal equation defining the LAD estimator. Thus LAD enters the asymptotic approximation of the Glejser test, and this explains why LAD is a “natural choice” for the implementation of the Glejser test in the case of skewness. We then discuss the asymptotic distribution of the LAD based Glejser test.
Our simulations show the validity of the LAD based Glejser test in empirical applications. On the basis of the results, we can state that the usual Glejser test based on OLS residuals over-rejects the null not only with skewed, but also with contaminated normal and student-t distributions, that is, with heavy-tailed errors. The latter cannot be properly analysed by the skewness corrected Glejser tests, since heavy tails and contamination do not necessarily cause asymmetry. Indeed, even the skewness corrected Glejser test over-rejects the true null in the experiments with error contamination, particularly in models having more than one explanatory variable.
However, when we build the usual Glejser test on LAD residuals, the
number of rejections is closer to the nominal size. The same can be said
for the recently proposed skewness adjusted Glejser test: the performance
of the Im test improves when we compute this test using LAD residuals.
Furthermore, the improvement achieved by using the LAD based tests is comparable, in terms of rejection rates, to that achieved by the bootstrap, but LAD has a very low computational burden.
Unfortunately, LAD is not designed to deal with outliers in the explanatory variables. Indeed, the Glejser test based on LAD residuals shows over-rejections in those experiments with contaminated x’s. However, this problem cannot be solved by the skewness corrected Glejser test, which causes over-rejection in the same experiments, when it is computed using either OLS or LAD residuals. The search for a different estimator is left to further research.

Acknowledgement. I wish to thank L. Godfrey and the referee for their extremely helpful suggestions. Errors remain the author’s responsibility.

References

Glejser, H. (1969). A new test for heteroscedasticity. J. Amer. Statist. Assoc., 64, 316-323.
Godfrey, L. (1996). Some results on the Glejser and Koenker tests for heteroscedasticity. J. Econometrics, 72, 275-299.
Godfrey, L. and Orme, C. (1999). The robustness, reliability and power of heteroscedasticity tests. Econometric Reviews, 18, 169-194.
Hampel, F., Ronchetti, E., Rousseeuw, P. and Stahel, W. (1986). Robust Statistics. Wiley, N.Y.
Horowitz, J. and Savin, N.E. (2000). Empirically relevant critical values for hypothesis tests: a bootstrap approach. J. Econometrics, 95, 375-389.
Im, K. (2000). Robustifying the Glejser test of heteroscedasticity. J. Econometrics, 97, 179-188.
Koenker, R. (1981). A note on studentizing a test for heteroscedasticity. J. Econometrics, 17, 107-112.
Koenker, R. and Bassett, G. (1982). Robust tests for heteroscedasticity based on regression quantiles. Econometrica, 50, 43-61.
Machado, J. and Santos Silva, J. (2000). Glejser’s test revisited. J. Econometrics, 97, 189-202.
Pagan, A.R. and Hall, A.D. (1983). Diagnostic tests as residual analysis. Econometric Reviews, 2, 159-218.
Tse, Y.K. (2002). Residual-based diagnostics for conditional heteroscedasticity models. Econometrics J., 5, 358-373.

Marilena Furno
Università di Cassino
Department of Economics

Address for correspondence:
Marilena Furno
Via Orazio 27D, 80122 Napoli, Italy.
E-mail: [email protected].

Paper received: June 2004; revised May 2005.
