0% found this document useful (0 votes)
23 views34 pages

Geweke, J Meese, R and Dent, W (1983) - Comparing Alternative Tests of Causality in Temporal Systems

Uploaded by

Nicolas HY
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
23 views34 pages

Geweke, J Meese, R and Dent, W (1983) - Comparing Alternative Tests of Causality in Temporal Systems

Uploaded by

Nicolas HY
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 34

Journal of Econometrics 21 (1983) 161-194.

North-Holland Publishing Company

COMPARING ALTERNATIVE TESTS OF CAUSALITY IN


TEMPORAL SYSTEMS

Analytic Results and Experimental Evidence*

John GEWEKE
University of Wisconsin, Madison, WI 53706, USA

Richard MEESE
University of California, Berkeley, CA 94720, USA

Warren DENT
Eli Lilly International Corporation, Indianapolis, IN 46206, USA

Received February 1982, final version received May 1982

This paper discusses eight alternative tests of the absence of causal ordering, all of which are
asymptotically valid under the null hypothesis in the sense that their limiting size is known.
Their behavior under alternatives is compared analytically using the concept of approximate
slope, and these results are supported by the outcomes of Monte Carlo experiments. The
implications of these comparisons for applied work are unambiguous: Wald variants of a test
attributed to Granger, and a lagged dependent variable version of Sims’ test introduced in this
paper, are equivalent in all relevant respects and are preferred to the other tests discussed.

1. Introduction

Granger (1969) has proposed a definition of ‘causality’ in economic systems


which has been applied frequently in empirical work. Briefly, a time series
{y,} ‘causes’ another time series {x,} if present x can be predicted better by
using past values of y than by not doing so, other relevant information
(including the past of x) being used in either case. When the domain of
relevant information is restricted to past values of x and y, x and y are
jointly covariance stationary time series with autoregressive representation,
the set of predictors is constrained to be linear in past x and y, and one’s

*Thanks to Susan Hudak for research assistance and R.R. Bahadur for helpful discussions on
many points raised in the paper. Responsibility for remaining errors resides with the authors.
Views expressed are not necessarily those of the Board of Governors or the Federal Reserve.
Most of the work reported here was undertaken while the second author was a graduate student
and the third was Visiting Associate Professor in the Economics Department, University of
Wisconsin, Madison. The first author acknowledges financial support from the National Science
Foundation.

03044076/83/000&0000/$03.00 0 1983 North-Holland


162 J. Geweke et al., Alternative tests of causality

criterion for comparison of predictors is mean square error, then the question
of whether {y,} causes (xt} is equivalent to the question of whether there
exists an autoregressive representation,

(l.la)

(l.lb)

in which e(L) = 0.
Since (1.1) is a set of population regression equations with serially
uncorrelated disturbances, a test of the proposition that y does not cause x is
immediately suggested: estimate (l.la) by ordinary least squares, and test the
hypothesis e(L) = 0 in the conventional way. We shall refer to such a test as a
‘Granger test’ since the restriction e(L)=0 stems directly from Granger’s
definition. Unless one has special knowledge which restricts d(L) and e(L) to
be known functions of parameters whose number is reasonably small relative
to sample size, however, this cannot be done. In actual applications [e.g.,
Sargent (1976) and Geweke (1978)] some arbitrary restrictions on the form of
d(L) and e(L) must be made before estimation and testing can proceed; for
example, d(L) and e(L) might be assumed to be polynomials of finite order,
or ratios of polynomials of finite order. If the restrictions are true, then the
asymptotic properties of conventional tests of e(L)=0 are well known. If they
are false, such tests will in general reject the null hypothesis e(L) =0
asymptotically even if the null is true.
Sims (1972) considered the implications of the condition e(L)=0 for the
projection of y, on current and past xt,

s=o
(1.2)
COV(Ut,Xt_-s)=O, SZO,

and the projection of y, on future, current and past x,,

S=-03
(1.3)
cov (v,, x,-J = 0, vs.

He showed that e(L)=0 - i.e., {y,> does not cause {xt> - if and only if the
second projection involves only current and past x,, so that f,(L)=f,(L). We
shall refer to a test of the hypothesis f,(L)=f,(L) as a ‘Sims test’ of the
hypothesis that (y,> does not cause {xt}. If such a test is to be implemented,
f,(L) andf,(L) must be restricted for the same reasons that d(L) and e(L) were
J. Geweke et al., Alternative tests of causality 163

restricted in conducting a Granger test. In addition, because u, and v, are


generally serially correlated, a feasible generalized least squares estimator
which restricts autocovariance functions of u, and v, to known functions of a
reasonable number of parameters must be employed in applications of a
Sims test. Again, the asymptotic bias against the null hypothesis which would
in general result from maintaining restrictions on f,(L) and f,(L) may be
removed by allowing the number of estimated parameters to increase with
sample size. As before, appropriate rates of parameter expansion are
unknown.
There is a variant of the Sims test which avoids the need to use a feasible
generalized least squares estimator or an ad hoc pretilter to cope with serial
correlation in {uJ. Premultiply the system of ,equations (1.1) by the matrix

where
c = cov (E,,6,),

to yield a new system in which the first equation is (l.la) and the second can
be expressed as

where g+(L) and f:(L) are functions of the parameters of (1.1). Since qt=
S,-(c/Q,~)E~,it is uncorrelated with E,, and hence Q is uncorrelated with current
x, as well as past x, and y,. The variables on the right side of (1.4) are
therefore predetermined, and qt is serially uncorrelated. Now suppose that v,
in (1.3) has autoregressive representation h(L)v,=w,, an assumption which is
no stronger than those employed in rigorous justifications for the use of
feasible generalized least squares in this equation. Let h(L) = 1 -h+(L) and
f:(L)= h(L)f,(L). Multiplying through (1.3) by h(L) and rearranging results in

(1.5)

The disturbance CO,is serially uncorrelated by construction. It is uncorrelated


with all xc-S [because it is a linear combination of disturbances in the
projection (1.3)] and with past values of y, (because these are linear
164 J. Geweke et al., Alternative tests of causality

combinations of all x,_, and past w,). Since h(L) is invertible, fr,=O for all
s >O if and only if& = 0 for all s > 0. A ‘Sims test’ of the hypothesis that {y,}
does not cause {xt} may therefore be conducted by basing a test off”;, = 0 for
all s<O on the estimates of some finite parameterization of (1.4) and (1.5) in
the conventional way. Again, the number of parameters in g+(L), h+(L),ff(L)
and f z(L) which are estimated must increase with sample size.
In this paper we report the results of an investigation into the actual
properties of several tests of the hypothesis that y does not cause x. This
investigation was undertaken for three reasons. First, as is evident from the
foregoing discussion, the theory which underlies these tests is only
asymptotic. Because of the need to work with a finite number of parameters
in a sample of finite size, the theory is even less operational than in the more
familiar case in which the parameterization is known: in the latter case we
know exactly how to construct the test statistic but are in doubt about its
precise interpretation in a sample of a given size, whereas in the tests
discussed above it is not known exactly how the test statistic should be
constructed or interpreted in any given situation.
Second, once a parameterization has been adopted, there remain several
ways in which linear restrictions may be tested. The most common are the
Wald, likelihood ratio, and LaGrange-multiplier tests discussed by Silvey
(1959); as Berndt and Savin (1977) have demonstrated, there are practical
situations in which the outcome depends on the test used.
Finally, and most important, as the results of more and more tests for the
absence of causal orderings are reported in the applied literature, it is
becoming evident [see Pierce and Haugh (1977)] that these results may
depend on which test is used.
The remainder of the paper is organized as follows. In the next section we
introduce eight alternative tests of the hypothesis that y does not cause x. All
are asymptotically valid in the sense that under the null hypothesis the
difference between the distribution of the test statistic and that of a sequence
of chi-square distributions vanishes as sample size increases. In section 3 we
compare the asymptotic behavior of all eight tests using the concept of
approximate slope [Bahadur (1960) and Geweke (1981 b)]. The design of a
Monte Carlo experiment conducted to assess the behavior of these
alternative tests in situations typical of those encountered in economic
applications is discussed in section 4. The outcome of the experiment is
reported in section 5, and a final section contains recommendations for
empirical work based on the results of our investigation.

2. Alternative tests for the absence of a causal ordering

In what follows, we shall assume that {x,} and {y,} are jointly stationary,
Gaussian, and have mean zero. The restriction that the set of predictors be
J. Geweke et al., Alternative tests of causality 165

linear in past x and y is then inconsequential, mean square error becomes


the natural metric for comparison of forecasts, and the relevant likelihood
functions are well known.
A Granger test of the hypothesis that {y,} does not cause {xc> is a test of
the restriction e(L)=0 in (l.la). If we write the univariate autoregressive
representation of x, as

x, = QJx, + it, (2.1)


then e(L) =0 is equivalent to {et> = {c,) and c(L) =d(L). If under the
maintained hypothesis d(L) were assumed to be a polynomial of order 1 and
e(L) were assumed to be a polynomial of order k, then the test of e(L) =0
could be based on the sums of squared residuals from ordinary least squares
estimates of the regression equations

x, = C(L)x, + St = f: CsXt __s + zrt, (2.2)


s=l

x,=D(L)x,+E(L)y,+E,= i Dsxt-s+ i: Q-s+&,. (2.3)


s=l s=1

If S& is the maximum likelihood estimate of var(b,) in (2.2) and 8: is the


maximum likelihood estimate of var(b,) in (2.3), then under the null
hypothesis the distribution of each of the statistics,

TfW = n(&$ -6:)/&j, (2.4)

T,GR
= n log (6$/r?;), (2.5)

T,GL= n(B$- c?;)/c?;, (2.6)

converges uniformly to that of a X’(k) variate as the size of the sample, n,


increases without bound. In the terminology of Silvey (1959), a test based on
TzW and this convergence result is a ‘Wald test’, while one based on TfR or
T,GL and the convergence result is a ‘likelihood ratio test’ or a ‘LaGrange-
multiplier test’, respectively.
In practice k and 1are not known, nor even known to be finite. Furthermore,
there are no known relations between 1 and n which guarantee the uniform
convergence of (2.4H2.6) to a X’(k) variate under the null hypothesis. In
practice, values of k and 1 are usually chosen according to procedures which
are at best informal, if not arbitrary.
166 J. Geweke et al., Alternative tests of causality

The restriction that f,(L)=f,(L) in (1.2) and (1.3) on which Sims tests are
based is equivalent to {u,>= {u,}. If under the maintained hypothesis f,(L)
were restricted to a polynomial in the terms Lpp, . . ., L- ‘, 0, L,. . ., Lq, then a
test could be based on the regression equations

Y,=F1(L)xt+ u,=f FlsXt-s+ u,, (2.7)


s=o

y,=F,(L)x,+ v,= $ FZsxtms+l/;. (2.8)


s= -p

When an ad hoc prelilter [say, R(L)] is used to cope with serial correlation,
y, and x, are replaced by yt = R(L)y, and xt = R(L)x,, respectively, in (2.7) and
(2.8). The equations are then estimated by ordinary least squares and the test
F,(L)= F,(L) is conducted using a conventional Wald test statistic. In more
sophisticated feasible generalized least squares procedures, the serial
correlation structure of the disturbances is estimated. In a typical two-step
estimator [e.g., Hannan (1963) or Amemiya (1973)], (2.8) is first estimated by
ordinary least squares. It is assumed that Szi= var (V,, . . ., V,) = C$(a,), where
the function s2”(.) is known but the m, x 1 vector a, is unknown. The
maximum likelihood estimate oi, (or its asymptotic equivalent) of a, is
computed by ignoring differences between ordinary least squares residuals for
(2.8) and I$ The vector of coefficients E, is then estimated by generalized
least squares, replacing 52: with ai(ai,), which provides a vector of residuals
I? To construct the Wald test statistic, (2.7) is then estimated by generalized
least squares, replacing 52: with Q;(&), yielding a vector of residuals 0”. The
test statistic is

FSW)= 0#2,“(oi”))
- lo” - V’(Q@“))- l P, (2.9)

The LaGrange-multiplier test statistic is constructed by replacing (2.8) with


(2.7) for purposes of obtaining an estimated serial correlation structure of the
disturbances. In an obvious notation, the test statistic which results is

FL) = o(sz;(a,))-l o- Pu(sz;(ai,))-


l 9,. (2.10)

Under the null hypothesis the distributions of (2.9) and (2.10) converge to
that of a x’(p) variate. When mt,, m,, p and q are unknown and perhaps not
even known to be finite, m,, m,, and q may be increased with n: as functions
of n, however, proper choices of m,, mV, and q have not been worked out.
In implementations of (1.4) and (1.5) the parameterization problem may be
managed just as it was in the Granger test based on (1.1). A test off;,=0 for
all s<O can be based on the sums of squared residuals from ordinary least
J. Geweke et al., Alternative tests of causality 167

squares estimates of the regression equations

y,=G+(L)y,+FT(L)x,+H,=
i G,'ytms+ f
s=o
F~,x,_,+H,, (2.11)
s=l

yt=H+(L)y,+F:(L)x,+ lq= i Il:y,_,+ f F;sx,_,+ K. (2.12)


s=l s= -p

If ~5; is the maximum likelihood estimate of var(H,) in (2.11) and 8& is the
maximum likelihood estimate of var (IQ in (2.12) then Wald, likelihood ratio
and LaGrange-multiplier tests may be based on the respective statistics

(2.13)

TzR = n In (r?&/&$), (2.14)

(2.15)

all of which have a limiting x’(p) distribution under the null hypothesis.

3. The approximate slopes of the alternative tests


In the comparison of alternative tests of the same hypothesis we are
concerned with adequacy of the limiting distributions under the null
hypothesis and the ability to reject the null under various alternatives, in
finite samples. In the absence of any analytical results for small samples,
these questions may be addressed by sampling experiments, but without
some sort of paradigm based on asymptotic considerations the results of
sampling experiments are difficult, if not impossible, to interpret. The results
of the previous section provide such a paradigm only when the null
hypothesis is true. When the null is false, a convenient paradigm is the
criterion of approximate slope, which was introduced by Bahadur (1960) and
whose econometric applications were discussed by Geweke (1981b). We shall
briefly recapitulate some of the points of the latter paper.
Suppose that test i rejects the null in favor of the alternative in a sample of
size n if the statistic Tk exceeds a critical value. When the liniting
distribution of TL under the null is chi-square, the approximate slope of test i
is the almost sure limit of Ti/n, which we shall denote Ti(0). The s:t 8
consists of all unknown parameters of the population distribution, perhaps
countably infinite in number, and the approximate slope in general depends
on the values of the elements of 6. Most test statistics are constructed in such
a way that Ti(i3) =0 for all 6 which satisfy the null hypothesis, and for all
other 8, T’(f.l)>O. The approximate slopes of alternative tests are related to
168 J. Geweke et al., Alternative tests of causality

their comparative behavior under the alternative in the following way. Let
n’(t*,/?; fI) denote the minimum number of observations required to insure
that the probability that the test statistic Ti exceeds a specified critical point
t* is at least 1 -p. Then

lim n’(t*, fi; ,e)/d(t*, p; e)= P(e)p(e);


t*+m
the ratio of the number of observations required to reject the alternative as
t* is increased without bound (or alternatively, the asymptotic significance
level of the test is reduced to zero) is inversely proportional to the ratio of
their approximate slopes. Similarly, if t’(n/?; 8) indicates the largest non-
rejection region (equivalently, smallest’ asymptotic significance level) possible
in a sample of size n if power 1 -b is to be maintained against the alternative
8. then

lim t’(n, /I; 0)/t’@, /I; 0) = T’(8)/T2(0).


“+U2

In this section we confine consideration to the probability limits of Tyn


for the various tests; demonstration of almost sure convergence appears to be
confounded by the absence of a known, finite parameterization.
The approximate slopes of the three variants of the Granger test are
readily calculated. If the parameters j and k in (2.3) increase suitably with
sample size, then c?is a,Z, where uz is the variance of the disturbance E, in
(l.laj. The increase in I guarantees that when the null hypothesis E(L)=0 is
imposed, the estimated variance of 2, in (2.2), 8$, will converge in probability
to the variance 0: of the innovation in the univariate autoregressive
representation of {xt}. The latter variance is

exp &_iln C&(41dl


R
,

where S,(A) denotes the spectral density of {xr} at frequency A [Whittle (1963,
ch. 2)]. Consequently the approximate slopes of the tests on (2.4)-(2.6) are

TGW = c$/a,z - 1 (3.1)

for the Wald variant,

TGR= In ($/a~) (3.2)

for the likelihood ratio variant, and

TGL = 1 - a,2/o: (3.3)


J. Geweke et al., Alternative tests of causality 169

for the LaGrange-multiplier variant. The problem of deriving the limits of


Tzrr’/n, TfR/n, and Tp/n is formally the same. With suitable expansion of
parameter spaces, we have [in view of (2.13)-(2.15) and the relation of (2.11)
to (1.4) and (2.12) to (1.5)]

Tsw = a;/~;- 1, (3.4)

TSR - In (~$/a& (3.5)

TsL = 1 - C&U;. (3.6)

Since x-l~ln(x)~l-(l/x) for all x21, TGWzTGRzTGL and


TsW 2 TSR 2 TsL, the inequalities being strict under all alternative hypotheses.
In view of the well-known relation which prevails among Wald, likelihood
ratio, and LaGrange-multiplier test statistics [e.g., Breusch (1979)], these
inequalities are not surpsing. To establish further orderings, it is
convenient to use two results in spectral analysis. The first is that in the
projection of y, on all x,, eq. (1.3), the spectral density of the residual is
S,(~)-IS,,(~~)1’/S,(~) [e.g., Whittle (1963, p. 99)]. The second is that if zt
is a multivariate stationary time series with autoregressive representation and
E, is the vector of innovations in the autoregressive representation, then

lnlvar(st)l=&_i lnlS,(l)ldA
x

[Rozanov (1967, p. 72)]. Since w, is the innovation in u, in (1.3),

ln (63 =k i ln (s,@)- I&,(412/&(4)


n
dl.

we also have

In(02) =&_yIn (S,(A))d;l,


7[

whence

In (I&$) =& j In (S,(A)S,(A) - IS,,(A)12)dl = In (o:c$ -c’).


n
But
a2a2
E 6 - c2 = a:~$ and hence at/o: = a~/~$.

It follows that
TGW = Tsw 9 TGR = TSR> TGL = T=
170 .I. Geweke et al., Alternative tests of causality

Consider now Tcsw’ and T (sL),the probability limits of TisW’/n and ThsL)/n.
It is instructive to consider successively sophisticated variants of tests in (2.7)
and (2.8). Since there are fewer regressors in (1.2) than in (1.4), and fewer
regressors in (1.3) than in (1.5), g,’ ZD,” and o~Z& Eqs. (1.2) and (1.4)
coincide, and 0.”=o; if and only if u, is serially uncorrelated; similarly,
ai=aE if and only if u, is serially uncorrelated. If u, and V, are serially
uncorrelated, then a test of F,(L)=F,(L) may be conducted in classical
fashion

= n(6:/8: - 1) and TisL) = n( 1- ~z/$),


TLsW’
T’SW’/na.s. T’sW’=&cJ,2-1,
n
T’=‘In z T’=’ = 1 - g; /a;.

In this case,

T’sw’ = Tsw and T’SL’ = TSL.

Now suppose that y, is replaced by yJ = R(L)y, and xt by XI = R&)x,, where


R(L) is an invertible one-sided filter, R(O)= 1. Transform all equations in
sections 1 and 2 accordingly, affixing the superscript ‘t’ to all variables and
parameters. Because R(L) is invertible and R(O)= 1, the parameters of the
autoregressions (l.l), (1.4), and (2.1) and their disturbances are unaffected
[e.g., &(L)=d(L), &=[f, and o$ =a:], as is the case in (1.5). In (1.3), fl(L)
=f,(L), but qft #gz although o, is the innovation in both u, and u!. Unless
{y,} does not cause (xr} the one-sided projection of (y,} on {x,} depends on
R(L): in (1.2), f!(L) #f,@), a$ # ~2, and the innovations in u? and u, will not
be the same.
Consider a generalized least squares variant of the Wald Sims test, with
R(L) chosen so that R(L)o, is serially uncorrelated; then

Tcsw) = a$/a;t - 1 = o$/ai - 1.

Since uf may be serially correlated, a$? o,$ = c,‘. Consequently, Tcsw) >=TSW,
with equality if and only if the one-sided projection of {yJ> on {XI} turns out
to have a serially uncorrelated residual.
In a generalized least squares variant of the LaGrange-multiplier Sims test,
with R(L) chosen so that R(L)u, is serially uncorrelated, TcsL)= 1 -azt/a$. In
general, R(L), azt, and aft will differ from their counterparts in the Wald
Sims test. Because R(L)u,#v~, neither u! nor ut need be serially uncorrelated.
The approximate slope of a Sims LaGrange-multiplier test is also sensitive to
prefiltering. Treating {y,} and {xJ as the prefiltered series, we see that f,(L)
J. Geweke et al., Alternative tests of causality 171

and 0,’ vary with the prefilter, and consequently so will writ and a,$. [This
problem does not arise in the Wald Sims test, in which R(L) is always chosen
so that the disturbance in the two-sided projection is w, after serial
correlation correction.]
There appears to be no ordering between Tcsw) and TcsL’, and numerical
examples in the next section provide instances in which TcSL)> Tsw and
T’SL’< TsL
In practice, R(L) cannot be chosen so that R(L)u, or R(L)v, is known to be
serially uncorrelated. Either R(L) is chosen to be an arbitrary but reasonable
approximation of the autoregressive representation of u, or u, [Sims (1972)]
or more elaborate procedures are used to estimate R(L) explicitly [Amemiya
(1973)] or implicitly [Hannan (1963)]. In all instances the parameterization
problem again arises. In the mathematical appendix of this paper we show
that for sufficiently slow expansions of the relevant parameter spaces, sums of
squared residuals (after correction for serial correlation) divided by sample
size will converge to cft and ~zt in the restricted and unrestricted models,
respectively. The approximate slopes of the tests based on feasible
generalized least squares estimators are therefore the same as those based on
generalized least squares estimators.

4. Experimental design

To supplement these analytic results, we conducted a series of Monte


Carlo experiments. There are several interesting issues which might be
addressed in such experiments; we have concentrated on the following:

(1) In a given population, what is the relative behavior of the tests discussed
above? Are the asymptotic distributions under the null adequate? Under
the alternative, do the approximate slopes of the different tests provide a
reliable indication of their relative abilities to reject the null hypothesis?
(2) Under the null, or under different alternatives in which approximate
slope is the same, is the performance of the tests sensitive to changes in
population characteristics such as autocorrelation or the lengths of
distributed lags?
(3) Under the alternative what is the effect of changing approximate slope
on test performance?
(4) How are the tests affected when the assumed parameterization is not that
of the population?
(5) In conducting a Sims test in which a correction for serial correlation is
necessary, what is the effect of preliltering?

It is impossible to deal with all of these issues systematically without


conducting a very expensive study. In our experiments, we have examined
172 J. Geweke et al., Alternative tests of causality

the performance of all the tests, thus dealing thoroughly with the first
question. These tests were applied in six different populations, providing
some evidence on the second point. In a few cases the chosen
parameterization was not compatible with the population parameterization,
so limited information about point (5) may be forthcoming. All Sims tests
requiring correlation for serial correlation were conducted with and without
pretilters. In every case we used 100 observations.
The six models used in the experiments are summarized in table 1. The
parameter b was chosen in each case so that the common approximate slopes
of the Wald variants of the Granger and Sims lagged dependent variable
tests would be about 0.109. This value was selected to provide an interesting
frequency of rejection for the hypothesis that x does not cause y, using
Wald’s (1943) result on the distribution of Wald test statistics under the
alternative hypothesis. In our notation, his result states that the asymptotic
distribution of the test statistic under the alternative is non-central chi-square
with degrees of freedom equal to the number of restricted parameters and
non-centrality parameter equal to the product of the approximate slope and
the number of observations. The implied rejection frequencies for the
Granger and Sims lagged dependent variable Wald tests in our experiments
are about 0.76 when four parameters are restricted under the null and about
0.55 when twelve parameters are restricted, if the significance level of the test
is five percent. The approximate slopes of the Sims Wald and LaGrange-
multiplier tests vary across the six experiments.

Table 1
Design of experiments.

Mode1A:(1-0.75L)*y,=1.0+bx,_,+6~,6~-1N(0,1).

Model B: Y,=1.0+b(3L+2LZ-L3)~,+~1, (l-0.75LJ2u,=8:, a:-IN(O,l).

In both models, (1 - BL)x, = 1.0 + i,, [,-IN (0,l).

{a:}, {a:}, {it} are individually and mutually uncorrelated at all leads and lags.

Model A” Model B”

0=0.5 0=0.8 0=0.99 0=0.5 0=0.8 0 = 0.99

b 0.291 0.215 0.113 0.076 0.086 0.087


TGL = T= 0.0985 0.0985 0.0987 0.0978 0.0989 0.0985
TGR = TSR 0.1037 0.1037 0.1037 0.1030 0.1041 0.1037
TGW = Gs+” 0.1093 0.1093 0.1095 0.1085 0.1098 0.1093
T’sw’ 0.1125 0.1240 0.1914 0.1118 0.1125 0.1155
T’SL’ 0.1011 0.1103 0.1612 0.0944 0.0954 0.0976
TtsL1(pretiltered) 0.0713 0.0711 0.0767 0.1066 0.1016 0.1006

BApproximate slopes are given for respective tests of the hypothesis that {xc} does not cause
IYJ.
J. Geweke et al., Alternative tests of causality 173

All artificial independent normal random variables were generated using


the algorithm of Marsaglia [Knuth (1969)]. An autoregressive process of
order k was generated from a sequence of independent, identically distributed
normal variables employing the unconditional distribution for the first
observation, and using the appropriate different conditional distributions
until the (k- l)th observation was generated, after which the conditional
distributions remain the same. The random number generator was allowed to
run continuously from one experiment to the next without restarting, so the
results of the six experiments are quasi-independent. Within each experiment,
the various tests and parameterizations were applied to the same data, which
should allow for a more exact comparison of test behavior than would be
obtained if the different tests were applied to quasi-independent data. All
simultations were performed on a Univac 1110 in double precision. In order
to calculate the test statistics reported in the text, 80 regressions with an
average of 16 regressors were run for each replication. In every case,
computations were performed using a Householder reduction technique so
that the sum of squared residuals could be calculated without the need to
solve for regression coefficients. In the computation of the Hannan efficient
estimates, 240 calls of a fast Fourier transform routine were required each
replication to transform the variables. Each replication required 75 CPU
seconds.
In each of the six models the hypothesis that {yt} does not cause {x,}
(which is true) and the hypothesis that {xt} does not cause {y,} (which is
false) were tested. For the hypothesis that y does not cause x the truncated
lag distribution on x usually used in Sims tests requiring correction for
autocorrelation of the disturbance term is exact in Model B (provided three
or more lags are used) but in Model A it is not. When the (false) hypothesis
that x does not cause y is tested using this method, the parameterization is
not exact for either model. Three variants of each of Models A and B,
corresponding to different degrees of serial correlation in x, were used. The
values of the parameters of eqs. (l.l), relevant for the Granger tests, of eqs.
(1.2) and (1.3) relevant for the Sims tests, and of eqs. (1.4) and (1.5), relevant
for the Sims tests with lagged dependent variables, are shown in tables 2 and
3. Table 2 provides the parameters pertinent to tests of the true hypothesis
{y,} does not cause {x,1, and table 3 provides those for the tests of the false
hypothesis that {xy) does not cause {y,}. In table 3, fi,, is tabulated
separately for the Wald and LaGrange-multiplier tests. These coefficients will
differ in general because the coefficients in the one-sided projection of {y,) on
(xt) depend on the filter used in forming y[ = R(L)y, and x,f = R(L)x,. If the
autocovariance function of the residual u, in the two-sided projection of {y,}
on {xt} is proportional to the autocovariance function of the residual V, in
the one-sided projection of {y,} on {x,>, then the filters are the same and the
f 1,s coincide. That is the case of Experiment A, as may be seen by comparing
S,(n) and S,(n).
174 J. Geweke et al., Alternative tests of causality

Table 2
Parameterizations of (l.lHl.5) in the experiments for H,: {yt} does not cause {x,}.

Model A Mqdel B

0=0.50 8=0.80 e=o.99 0=0.50 0=0.80 e=o.99

1.0000 1.0000 1.0000 1.0000 1.0000 1moo


-0.5000 -0.8000 - 0.9900 -0.5000 - 0.8000 - 0.9900
O.OQOO O.COoO 0.0000 0.0000 O.OQOO 0.0000
1 1 1 1

0.2910
0.0000 0.2150
0.0000 0.1130
0.0000 0.2280
0.0000 0.2580
0.0000 0.2610
0.0000

0.4911
0.4365 0.3628
0.3225 0.1907
0.1695 - 0.1520
0.0760 - 0.1720
0.0860 -0.0870
0.1740

0.4604
0.4911 0.3401
0.3628 0.1788
0.1907 0.0000
1 0.0000
1 0.0000
1
0.3625
0.3108
0.4143 0.2679
0.2296
0.3061 0.1408
0.1207
0.1609

0.2185
0.2622 0.1614
0.1937 0.0849
0.1018

0.1803 0.1332 0.0700


0.1475 0.1090 0.0573

go 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000


g1 - 1.5ooo - 1.5000 - 1.5000 - 1.5000 - 1.5000 - 1.5000
gz 0.5625 0.5625 0.5625 0.5625 0.5625 0.5625
g3 0.0000 o.omo 0.0000 0.0000 0.0000 0.0000
.r”:‘.o 0.0000
1 1
0.0000 1
0.0000 1
0.0000 1
0.0000 1
0.0000

;I:; 1
0.0000
0.2910 1
0.0000
0.2150 1
0.0000
cx1130 -0.1758
-0.1900
0.2280 -0.1989
-0.2150
0.2580 -0.2012
0.2175
0.2610

0.1995 0.2258 0.2284

$
:&1.7 1
- 0.0428
0.0000 1
- 0.0492
0.0000 1
-0.0489
0.0000

S”(O) 256.0000 256.0000 256.0000 256.0000 256.OCOO 256.0000


S,(0.25a) 3.9707 3.9707 3.9707 3.9707 3.9707 3.9707
S”(O.5OK) 0.4096 0.4096 0.4096 0.4096 0.4096 0.4096
S,(O.75rr) 0.1453 0.1453 0.1453 0.1453 0.1453 0.1453
S,(n) 0.1066 0.1066 0.1066 0.1066 0.1066 0.1066

In the Granger tests, 1 lagged values of the dependent variable and k


lagged values of the other variable (hypothesized not to cause the dependent
variable) are used; in our experiments we examined the combinations k= 4
and Z=4, 8, and 12, respectively, as well as the combination k= 1= 12. From
table 2 it may be seen that this parameterization is always exact for tests that
y does not cause x, but for tests that x does not cause y, k=4 does not allow
J. Geweke et al., Alternative tests of causality 175

Table 3
Parameterizations of (l.lHl.5) in the experiments for H,: {.q} does not cause {y,}.

Model A Model B

e=o.50 0=0.80 8=0.99 B=O.50 B=O.80 B=O.99

CO 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000


cl - 1.5493 - 1.5788 - 1.5977 - 1.4815 - 1.4884 - 1.5047
c2 0.6142 0.6239 0.6219 0.5855 0.5935 0.5973
c3 - 0.0044 -0.0001 - 0.0020 -0.0825 -0.0790 -0.0691
c4 - 0.0020 -0.0001 -0.0018 0.0550 0.0487 0.0410
c5 -0.0009 = -0.0016 -0.0135 -0.0116 -0.0102
cc -0.0004 -a -0.0014 0.0040 0.0033 0.0018
c-i -0.0001 -a -0.0013 -0.0016 -0.0012 -0.0015
cs -0.0001 -a -0.0011 0.0004 0.0003 - 0.0004

d, 1.0000 1.0000 1.oOOo 1.0000 1.0000 1.0000


d, - 1.5000 - 1.5000 - 1.5000 - 1.5000 - 1.5000 - 1.5000
d, 0.5625 0.5625 0.5625 0.5625 0.5625 0.5625
d, 0.0000 0.0000 0.0000 0.0000 0.0000 o.oOQo
d, 1 1 1 1 1 1
el 0.2910 0.2150 0.1130 0.2280 0.2580 0.2610
e2 0.0000 0.0000 0.0000 -0.1900 -0.2150 -0.2175
e3 1 1 1 -0.1757 -0.1989 -0.2012
e4 0.1995 0.2258 0.2284
e5 -0.0427 - 0.0484 - 0.0489
e6 0.0000 0.0000 0.0000
e7 1 1 1
Wald:

k:: -0.0100
0.0167 -0.0021
0.036 1 0.0243
0.0853 - 0.0039
0.0079 - 0.0049
0.0080 - 0.0034
0.0026

k: - 0.0007
0.0030 -0.0010
-0.0015 0.0224
0.0234 - 0.0145
0.0046 - 0.0105
0.0027 0.0119
0.0023

;:’1.6 _a
0.0001 --0.0007
0.0004
0.0005 0.0192
0.0203
0.0214 - 0.0002
0.0011
0.0014 ~ 0.0005
0.0001
0.0012 0.0047
0.0054

LaGrange:

k -0.0100
0.0167 -0.0021
0.0361 0.0243
0.0853 - 0.0064
0.0074 -0.0085
0.0106 -0.0193
0.0448

k --0.0007
0.0030 -0.0010
-0.0015 0.0224
0.0234 0.0037
0.0014 -0.0036
0.0041 -0.0068
0.0182

k _a
0.0001 --0.0007
0.0005 0.0203
0.0214 -0.0011
0.0004 -0.0011
0.0005 0.0026
0.0076
f: 0.0001 - 0.0004 0.0192 -0.X03 - 0.0002 0.0060

;2L~Z -a 0.0023 0.0156 _a 0.0001 0.0215


f*.-II -8 0.0032 0.0175 _a 0.0001 0.0220
k: I:” 0.000 I
0.0002 0.0062
0.0045 0.0220
0.0196 -0.0001
O.OOQl -a
O.Oc@l 0.0230
0.0226

0.0005 0.0086 0.0246 0.0009 0.0005 0.0239


0.0012 0.0120 0.0276 -0.0014 -0.0011 0.0234
0.0027 0.0166 0.0309 0.0092 0.0056 0.0287
176 J. Geweke et al., Alternative tests of causality

Table 3 (continued)

Model A Model B

0=0.50 8=0.80 fI=o.99 0 = 0.50 0=0.80 0=0.99

0.0060 0.0230 0.0347 -0.0121 -0.0109 0.0173


0.0132 0.0319 0.0388 0.0775 0.0548 0.0715
0.0293 0.0443 -0.2560 -0.2089 - 0.1507
0.0650 0.0614 0.0488 0.1229 0.1174 0.1296
0.1443 0.0851 0.0547 0.2492 0.2262 0.2230
-0.2620 - -0.1507 - -0.0529 -0.1988 -0.1518 - 0.1002
0.0295 0.0003 0.0101 0.0153 0.0079 0.0324
0.0133 0.0002 0.0090 -0.0101 -0.0101 0.0185
0.0060 0.0002 0.0081 0.0063 0.0026 0.027 1
0.0027 0.0001 0.0072 0.0004 - 0.0004 0.0245
0.0012 0.0001 0.0064 0.0011 0.0004 0.0244
0.0005 0.0001 0.0057 0.0002 O.cQOl 0.0236
0.0002 0.0000 0.0051 0.0002 0.0001 0.0231
0.0001 1 0.0046 0.0001 O.cQOl 0.0226
0.0001 0.0041 _a 0.0001 0.0220
_a 0.0036 _a -a 0.0215
_= 0.0032 _a -a 0.0210
_a 0.0029 _a -2 0.0205

go l.oooO 1.0000 1.oOOo 1.0000 1.0000 1.oooo


g1 -0.5000 -0.8000 - 0.9900 - 0.5000 -0.8000 - 0.9900
g2 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
g3 1 1 1 1 1 1
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
$:, 1 1 1 1 1 1
ho’ l.OQOO 1.0000 1.0000 1.0000 1.0000 1.0000
h, -0.4507 -0.7212 -0.8923 -0.5185 -0.8116 -0.9853
h, 0.0000 0.0000 0.0000 -0.0412 -0.0389 -0.0324
h, 1 1 1 0.0438 0.0527 0.0520
h, -0.0088 -0.0112 -0.0115
h, 0.0000 0.0000 0.0000
1 1 1
_a 0.0011 0.0032 _a _a 0.0010
_a 0.0016 0.0036 _a P 0.0010
0.0001 0.0022 0.0040 0.0001 0.0001 0.0010
o.OOQ2 0.0030 0.0045 - 0.0002 -0.0001 0.0009
0.0004 0.0041 0.0050 0.0009 0.0006 0.0014
0.0010 0.0057 0.0056 -0.0018 -0.0015 pa
0.002 1 0.0080 0.0063 0.0099 0.0065 0.0058
0.0047 0.0111 0.0071 -0.0168 -0.0154 -0.0107
0.0105 0.0153 0.0079 0.0833 0.0634 -0.0545
0.0234 0.0212 0.0089 -0.2953 -0.2527 -0.2205
0.0518 0.0295 0.0099 0.2518 0.2842 0.2763
0.1149 0.0409 0.0111 0.1995 0.1420 0.1037
-0.3270 -0.2121 -0.1017 -0.3450 -0.3516 -0.3328
0.1476 0.1090 0.0573 0.1157 0.1308 0.1324
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
1 1 1 1 1 1
J. Geweke et al., Alternative tests of causality 177

Table 3 (continued)

Model A Model B

l9=0.50 t’=O.80 0=0.99 0=0.50 8=0.80 8=0.99


__~
S”(O) 3.4096 14.5931 155.7753 3.9988 24.7180 2115.1299
S,(0.25n) 1.8183 2.2678 3.2567 1.7943 1.8741 1.9724
S”(O.5OX) 0.8550 0.7462 0.9711 0.7884 0.6108 0.7255
S”(O.757l) 0.5589 0.4466 0.5716 0.5407 0.3968 0.5646
S”(K) 0.4888 0.3829 0.4886 0.4467 0.3130 0.3679
S”(0) 2.9879 11.5976 77.7061 3.9942 24.7144 1745.0116
S”(0.25~) 1.5934 1.8023 1.6869 1.7498 1.8341 1.6194
S”(O.5Orr) 0.7492 0.5930 0.5018 0.6527 0.4997 0.4256
S”(0.75Z) 0.4898 0.3549 0.2947 0.4369 0.3129 0.2621
S”(K) 0.4283 0.3043 0.2517 0.4444 0.3086 0.2525

“Denotes coefficient less than 0.00005 in absolute value.

a complete parameterization of the effect of x on y. In the Sims tests based


on (1.2) and (1.3) p leading and 4 lagged values of the explanatory variable
(hypothesized not to be caused by the dependent variable) are used; we
examined the combinations p=q=4, p=4 and q= 8, p=4 and q= 12, and
p = q = 12. In the Sims tests based on (1.4) and (1.5) employing lagged
dependent variables, four lagged values of the dependent variable were
always used and combinations of leading and lagged values of the
explanatory variable were the same as those in the tests based on (1.2) and
(1.3). For tests that y does not cause x these parameterizations are always
exact, and for tests that x does not cause y, never.
Two techniques were used to correct for serial correlation of the
disturbance in the first group of Sims tests. In both procedures, ordinary
least squares residuals are used to estimate the variance matrix of the
regression equation disturbances. In the Hannan efficient method [Hannan
(1963)] the spectral density of the disturbances was estimated using an
inverted ‘I/’ spectral window with a base of 19 ordinates, applied to the 100
residual periodogram ordinates. The estimated spectral density was then used
to transform the data prior to a final estimation by ordinary least squares in
the way discussed in Geweke (1978). In Amemiya’s (1973) method an
autoregression of ordinary least squares residuals of length 8 was estimated,
and then used to transform the data prior to final estimation by ordinary
least squares. It can be seen from table 1 that the parameterization implicit
in Amemiya’s method is always exact in tests of the hypothesis that {y,} does
not cause {xt}. For tests of {x,} not cause {y,>, it is exact for Experiment A
({Us} and {v,} are both first-order autoregressive) and for Experiment B it is
inexact, but coefficients beyond the eighth in the autoregression are
negligible. The parameterization in Hannan’s method is never exact unless
178 .I. Geweke et al., Alternative tests of causality

the disturbances are serially uncorrelated. It is also less flexible than the one
for Amemiya’s method, employing a spectral estimator with only about five
effective degrees of freedom. The Hannan efficient estimator is especially ill-
suited to cope with the large variation in S,(n) and S,(2) near zero.
Both methods were also used after application of the prehlter (l -0.75L)’
to the data. For tests of the hypothesis that {y,} does not cause {xt> this
renders the disturbance term serially uncorrelated in all models. For tests of
{xt} not to cause { y,j the parameterization of Amemiya’s test is then inexact
in both experiments, but is a very good approximation. For those tests the
pretilter removes the peaks in S,(A) and S,(i) at 3.=O, whence the
parameterization in the Hannan efficient estimator should be more suitable.

5. The outcome of the experiments

The results of our experiments are summarized in tables 4 through 7. The


first two of these tables provide information about the distribution over the
100 replications of the various test statistics for the hypothesis that {y,} does
not cause {x,x, which is true. Tables 6 and 7 provide similar information for
tests of the hypothesis that {xt} does not cause {yt}. We shall discuss the
performance of the tests in these two situations in turn.
Table 4 indicates the mean and standard deviation of the ‘F’ statistic over
the 100 replications for the Wald and LaGrange variants of the Granger,
Sims (using both Hannan’s and Amemiya’s correction for serial correlation),
and Sims lagged dependent variable tests. Table 5 reports the number of
replications in which the hypothesis that {y,} does not cause {xt} was
rejected at the 5% and 10% significance levels, respectively. If the asymptotic
theory were exact, then with probability 0.95 the number of replications in
which rejection occurs would lie between 1 and 9 for the 5% level and
between 5 and 15 for the 10% level.
From these tables it can be seen that the experiments yield considtirable
information about differences in outcomes across tests, but much less about
differences in outcomes across models; since the experiments were designed
to do the former and not the latter, this is not surprising. Tests which require
correction for serial correlation in general do not behave well under the null
hypothesis. All Wald tests that involve a serial correlation correction yield
statistics that are too large and reject the null hypothesis too often. As
expected, preliltering improves the performance of these tests when Hannan’s
method is used, but has little effect for Amemiya’s procedure. The
distributions of the LaGrange-multiplier test statistics are more reasonable.
For the Hannan efficient estimator prefiltering is required for an adequate
distribution, but again seems to have few consequences for tests conducted
using Amemiya’s procedure. Except for the Hannan efficient method without
pretiltering, the distribution of the LaGrange-multiplier test statistics is close
to that of the asymptotic distribution, rejection frequencies for some
Table 4
Mean values of Fs under null hypothesis.”

Sims tests

Hannan ellicient Amemiya GLS


Asymptotic F Lagged dependent
distribution Granger tests variables Fillered Unfiltered Filtered UnIiltered
Model and
parameterization Graoger Sims Wald L&range Wald LaGrange Wald LaGrange Wald LaGrange Wald LaGrange Wald L&range

A,B=OS \
44 F(4,91) F(4.90) 1.11 (0.08) 1.03 (0.07) 1.20 (0.08) I.1 I (0.07) 1.22 (0.09) 1.07 (0.07) 2.25 (0.27) 1.82 (0.15) 1.92 (C !4) 1.43 (r).@9) 1.73 (O.15) 1.39 (0.12)
&x F(4,87) F(4.86) 1.12 (0.08) 1.04 (0.07) 1.22 (0.08) 1.12 (0.07) I.1 I (0.09) 0.97 (0.07) 2.15 10.24) 1.76 (0.15) 1.37 (0.11) 1.05 (0.07) 1.68 (0.13) 1.36 (0.09)
4-12 F(4.83) F(4,82) I.15 (0.09) 1.06 (0.07) 1.24 (0.09) 1.13 (0.07) 1.16 (o.loj 1.01 io.oxj I.87 io.lxj 1.57 {o.lzj 1.48 (0.13) I.14 (0.08) 1.69 (0.12) 1.34 (0.09)
12-12 F(l2.75) F(l2.74) Lo6 (0.05) 0.89 (0.04) 1.19 (0.06) 1.06 (0.04) 1.12 (0.06) 0.86 (0.03) 1.55 (0.12) 1.11 (0.05) 1.47 (0.07) 0.96 (0.03) I.57 (0.10) I.01 (0.04)

8,0=0.5

4-4 F(4.91) F(4,9o) I.00 (0.07) 0.94 (0.06) 0.96 (0.96) 0.90 (0.06) 1.02 (0.07) 0.92 (0.06) 1.61 (0.13) 1.49 (0.11) 1.26 (0.10) 1.01 (0.07) 1.31 (0.10) 1.05 (0.07)
4-8 F(4,87) F(4.86) 0.98 (0.07) 0.92 (0.06) 0.95 (0.06) 0.88 (0.05) 1.04 (0.08) 0.92 (0.06) I.40 (0.12) 1.29 iO.O9) 1.19 (0.10) 0.95 (0.07) 1.26 (0.09) 1.03 (0.07)
4-12 F(4,83) F(4.82) 0.98 (0.07) 0.92 (0.06) 0.97 (0.05) 0.w (0.05) 1.06 (0.08) 0.94 (0.06) 1.25 (0.08) 1.13 (0.07) 1.21 (0.10) 0.95 (0.07) I .25 (0.09) I.00 (0.06)
12-12 F(l2.75) F(l2,74) 0.95 (0.04) 0.81 (0.03) I.01 (0.04) 0.94 (0.03) 1.07 (0.05) 0.83 (0.03) 1.32 (0.08) 0.98 (0.04) 1.28 (0.07) 0.87 (0.03) 1.37 (0.08) 0.89 (0.03)

A,B=O.X

4-4 F(4.91) F(4.9’3) I. I I (0.08) I.04 (0.07) 1.09 (0.07) 1.01 (0.06) I.17 (0.08) I.04 (0.07) 2.35 (0.29) 1.89 (0.17) 1.63 (0.15) 1.23 (0.09) I.45 (0.14) I.13 (0.09)
4-8 F(4.87) F(4.86) I.09 (0.08) I.01 (0.07) 1.08 (0.08) I.00 (0.07) I.18 (0.09) 1.03 (0.07) 1.91 (0.17) I.65 (0.13) 1.33 (0.10) I.04 10.07) 1.38 (0.11) 1.14 (0.09)
‘%I2 F(4.83) F(4,82) I.14 (0.06) I.06 (0.07) 1.13 (0.08) 1.03 (0.07) 1.22 (0.10) 1.05 (0.08) 1.70 (0.15) 1.48 (0.11) 1.4o~o.tlj 1.09~o.oxj 1.46 (0.11) I.21 (0.08)
12-12 F(l2,75) F(12,74) 1.09 (0.04) 0.91 (0.03) 1.10 (0.04) I.00 (0.03) 1.20 (0.07) 0.92 (0.M) 1.46 (0.09) I.1 I (0.05) I.41 (0.06) 0.95 (0.03) I.44 (0.07) 0.99 (0.03)

8,8=0.X

4-4 F(4.91) F(4.90) 1.19 (0.07) I.1 I (0.06) I.09 (007) I 01 (0.06) I.21 (0.09) 1.06 (0.07) 1.96 (0.19) 1.72 (0.14) 1.48 (0.12) 1.11 (0.07) 1.49 (0.11) I.17 (0.08)
4-X F(4,xn F(4,86) 1.19 (0.08) I.10 (0.07) 1.05 (0.071 0.97 CO.061 1.22 (0.08) 1.07 (0.07) 1.69 (0.16) 1.49 (0.12) 1.37 (0.10) I.04 (0.07) 1.42 (0.10) I.15 (0.07)
4-12 F(4,83) F(4, x2) I.17 (0.07) 1.08 (0.06) 1.07 (0.07j 0.99 io.o6j 1.27 (0.08) I.11 (0.07) 1.46 (0 13) 1.31 (0.10) 1.46 (0.11) 1.12 (0.07) 1.45 (0.1 I) I.15 (0.08)
12-12 F(l2.75) F(l2,74) 1.15 (0.05) 0.96 (0.03) I.06 (0.05) 0.97 (0.03) 1.27 (0.06) 0.98 (0.04) 1.52 (0.1 I) 1.14 (0.05) 1.50 (0.06) 1.01 (0.03) 1.49 (0.07) I.01 (0.04)

A.H=O.99

&4 F(4.91) F(4,9o) 1.26 (0.10) 1.16 (0.08) 1.17 (0.09) 1.07 (0.08) 1.57 (0.15) 1.32 (0.111 1.47 (0.16) 1.29 (0.12) 1.61 (0.13) I.15 (0.09) 1.68 (0.15) 1.32 (0.10)
&x F(4.87) F(4.86) 1.26 (0.09) 1.16 (0.08) I.18 (0.09) I.08 (0.07) 1.43 io.lzj 1.22 io.loj I.41 (0.14) I24 (0.11) I.52 (0.13) 1.16 (0.09) I.60 (0.13) 1.30 (0.10)
‘Cl2 F(4.83) F(4.82) 1.27 (0.091 1.17 (0.08) I .I9 (0.09) I ox (0.08) 1.26 (0.10) 1.07 (0.08) 1.46 (0.14) I 28 (0.1 I) I.51 (0.13) I.13 (0.09) 1.59 (0.13) 1.32 (0.10)
12-12 F(12.75) F(l2.74) 1.14 (o.osj 0.94 (0.04) 1.31 (0.05) 1.16 (0.04) 1.30 (0.06) 0 97 (0.04) 1.65 (0.14) 1.16 (0.06) I.68 (0.08) I.05 (0.04) 1.90 (0.11) I.19 (0.04)

5, u = 0.99

‘+I F(4,91) F(4,9o) 1.08 (0.07) 1.01 (0.07) I .05 (0.07) 0.98 (0.06) 1.96 (0.21) 1.63 (0.15) I71 (0.14) I.52 (0.12) 1.95 (0.19) I 35 (0.10) 1.18 (0.08) I.10 (0.07)
4-x F(4.87) F(4,86) 1.10 (0.07) 1.02 (0.07) I.09 (0.08) I.00 (0.07) I.25 (0.09) 1.11 (0.07) 1.63 CO.131 1.45 CO.111 1.48 (0.10) I IX (0.08) 1.26 (0.09) I.16 (0.08)
4-12 F(4.83) F(4, X2) 1.10 (0.07) 1.02 (0.07) 1.12 (0.08) 103 (0.07) 1.22 (0.W) 1.08 (0.07) 1.53 io.13j 1.36 io.lij I.41 (0.10) 1.10 (0.07) 1.27 (0.09) 1.16 (0.08)
12-12 F(l2,75) F(12.74) I.09 (0.05) 0.91 (0.03) I.17 (0.06) 1.05 (0.04) I .25 (0.06) 0.96 (0.03) I.44 (0.08) I 11(0.05) I.59 (0.10) 1.02 (0.04) 1.30 (0.06) 1.04 (0.04)

“Standard errors of estimated means are shown parenthetically.


Table 5
Numbers of F’s in 5% and 10% critical regions under null hypothesis, in 100 replicatmns

Sims tests

Hannan etlicient Amemiya GLS


Asymptotic F Lagged dependent
distribution Grange1 tests variables Filtered Unfiltered Fdtered Unfiltered
Model and
parameterization Granger Sims Wald LaGrange Wald LaGrange Wald LaGrange Wald L&range Wald LaGrange Wald LaGrange

A,B=0.5

4-l F(491) F(4,90) 7, 14 4, I1 5. 14 5, 9 10,2l 6, 14 25, 37 20, 32 29, 38 14,23 28, 33 12,25
4-8 F(4,87) F(4,86) 7, 15 3. 11 7, I2 6, 8 7, 12 5. 7 24. 31 20.24 12.19 4, II 24, 32 11,20
4-12 F(4.83) F(4.82) IO.15 5, 12 9, 14 5, 10 9, 14 5:9 21.29 18, 21 15.25 6, 15 19, 33 10, I5
12-12 F(12,75) F(12, 74) 8, 11 2, 3 7, 16 5, 7 10, 13 1. 6 27, 31 12.15 24, 31 0, 5 24, 37 3, 6
B,B=OS

4-4 F(4.91) F(4,W) 3, 8 4, 7 2, 6 5, I1 2, 7 16, 29 14, 24 15, I9 5, 16 l1,17 5, 9


4-8 F(4,87) 04.86) 3, 4 :: 2, 6 1, 3 4, 8 4, 5 1920 10.18 11, I7 7. 10 10.16 6. 9
4-12 F(4,83) F(4.82) 2, 4 2. 2 1, 7 1, 3 4, 8 4, 4 10, I6 9, 11 11.14 4, 9 8: 14 3, 5
12-12 F(12.75) F(12,74) 1, 5 1. 1 2%9 0%1 6, 8 4 0 16, 22 5, 9 l2,17 0, 3 17.22 0, 2

A,0=0.8

4-4 q4,~91) F(4.90) 7, I4 4, 11 6, 10 4, 9 6, I5 2, 8 26, 33 23, 30 18,25 9, 15 15.20 7, 14


bs F(4, 87) F(4,86) 5, 19 3, 11 7, 13 3, 11 7, 12 5, 7 24, 32 20,26 13, 18 7, 13 l6,23 8, 18
4-12 F(4, 83) F(4,82) 9;16 5, I3 9. 13 4, I2 8, 17 4, 9 21, 28 18,27 16, 22 7, 15 l8,23 9, I7
12-12 F(12.75) F(12.74) 8, 11 0, 2 6. 8 0, 6 9, 15 2. 4 26, 31 9, 17 17, 25 0, 3 23, 30 0, 3
B.B=0.8

4-4 F(4,91) F(4,90) 7, 16 4. 7 3, 7 1, 3 12, 14 4, 6 32, 38 18,26 1, 4 7, 12 8, 18 6, 8


4-8 F(4,87) F(4,86) 11.19 5, 10 4, 6 1. 4 12,2l 3, 9 32.41 9, 21 2, 5 3, 9 1, 6 5, I2
&12 F(4, 83) F(4, 82) 11.22 4, 9 3, 8 3,4 15, 24 4, I2 31, 38 8, I3 3, 6 5, 12 2, 6 5, I1
12-12 F(12,75) F(l2, 74) 22.36 1, 2 3, 5 1, 3 26. 35 2, 5 30, 34 8, 15 0, 4 0, 3 0, 5 1, 4

A,tJ=0.99

4-l F(4, 91) F(4190) IO, 18 8, I4 7, 17 6, 10 16, 23 11, 18 16, 23 13.19 19.29 7, 12 17. 25 5. 18
4-8 F(4,87) F(4,86) 9, I4 7. 12 8, 18 5. IO 16. 24 11,17 13,20 10, 18 15, 23 5, I5 17,28 l1,20
4-12 F(4,83) F(4,82) 11.16 7, 11 8, I4 5;11 11; 19 6, 11 16, 25 10, 19 16.25 7, 17 19,30 13,19
12-12 F(12.75) F(12.74) 7, 17 1, 2 15.23 6, 13 14, 23 1, 8 32, 40 9, 17 34, 38 2, 4 41. 52 6, I4

B, 0 =0.99

4-4 F(4.91) F(4,90) 2, 14 2, 8 2, 11 2, 7 27, 31 24, 27 23, 28 17,24 25, 31 17.22 12.21 5, 13
4-8 F(4,87) F(4, 86) 5, 15 3, 8 8, 13 5, 12 IO, 16 7, 13 22, 29 14, 24 20,29 7, 17 15,21 8, 15
4-12 F(4.83) F(4>82) 7, 16 2, 11 7, 13 5, 11 IO, 14 6, 11 17.25 II.22 16,22 5, 13 10, 23 6, 12
12-12 F(l2, 75) F(l2,74) 7, I3 4 3 9, 19 3, 8 13,21 5, 5 23, 34 9, 14 28, 38 2, 5 29, 39 3. 6
J. Geweke et al., Alternative tests of causality 181

experiments being too large and for others too small but with no apparent
pattern.
Overall, the results of the experiments reflect unfavorably on the
performance under the null hypothesis of tests requiring correction for serial
correlation. The Wald tests reject too frequently and are sensitive to
pretiltering. LaGrange-multiplier tests are also sensitive to preliltering. It is
discouraging to note that the LaGrange-multiplier tests did not perform very
well unless the parameterization of the distribution of the disturbance was
exact (i.e., Amemiya’s method) or the prelilter was used, in which case the
disturbance is then serially uncorrelated before correction for serial
correlation. If one of these conditions is indeed required for an adequate
distribution of the test statistic then these methods are useless in applied
work since then neither the serial correlation nor the functional form of its
parameterization is known a priori. On the other hand, it should be noted
that none of these tests are based on the exact maximum likelihood estimates
presumed in the discussion in section 2. Hannan’s and Amemiya’s procedures
in the Wald tests each constitute the first step in an iterative scheme leading
to maximum likelihood estimates [Oberhofer and Kmenta (1974)], and in the
sample size of 100 used in these experiments this first step may not be a very
close approximation to exact maximum likelihood. This question might merit
further investigation.
By comparison, the behavior of the Granger tests and Sims tests
incorporating lagged dependent variables is excellent. Rejection frequencies
for the Granger tests lie outside the 95% confidence intervals (constructed
under the assumption that the asymptotic distribution theory is exact) in 25
of 96 cases, and for the Sims lagged dependent variable tests in 15 of 96
cases. There is some tendency for the LaGrange-multiplier to be more
adequate than the Wald variant in the case of the Granger tests, and
conversely for the Sims lagged dependent variable tests. In both instances
rejection frequencies run a little too high for Wald tests and too low for
LaGrange-multiplier tests. The distribution of these statistics is as good as or
better than the distribution of the LaGrange-multiplier statistics after
pretiltering in the tests requiring a serial correlation correction. These test
statistics have the further advantages that they are cheaper to compute and
their interpretation is unclouded by the preliltering problem. However, there is
an analog of the problem of parameterizing the serial correlation of the
disturbance in choosing the number of lagged values of the dependent
variable to be used. There is no indication that test results are sensitive to
this choice as was the case in the tests requiring - a serial correlation
correction, even though in some cases the parameterization chosen was not
the correct one (as may be seen by consulting table 2).
The distributions of the test statistics for the hypothesis that (x~} does not
cause {y,> are summarized in tables 6 and 7. Given the results presented in
tables 4 and 5, the behavior of the Granger and Sims lagged dependent
Table 6
Mean va1ucs oi Fs under alternative hypotheses.”

Sims tests - Filtered Hannan etlinent


Asymptotic F,
distributionb Wald tests LaGrange tests Wald LZlGKUlge
Model and
parameterizatlon Granger Sims GraIlger Sims (LDV) Population’ GKUlgtY slms Sample Population Sample

A,B=OS s10pe=0.1093 Slope=O.O985 Slope ~0.1125 Slope EO.0713

44 F(4.91) F(4,90) 3.67 (0.19) 3.75 (0.18) 3.73 3.07 (0.13) 3.10 (0.12) 6.71 (0.30) 3.81 4.63 (0.15)
4-X F(4.87) F(4.86) 3.53 (0.18) 3.63 (0.18) 3.73 2.95 (0.12) 2.99 (0.12) 6.50 (0.29) 3.81 4.48 (0.15)
4-12 F(4,83) F(4,82) 3.45 (0.17) 3.58 (0.18) 3.73 2.88 10.121 2.93 CO.12) 6.28 (0.28) 3.81 4.30 (0.14)
12-12 F(12, 75) 1.77 (0.07) 1.87 (0.07) 3.73 2.88 (0.1 I) 1.94 5
F(12. 74) 1.34 io.04) 1.54 io.zoj 1.78 (0.04)

B, =os Slope=O.1085 Slope =0.0978 Slope=O.lll8 Slope=O.l066 F


f
4-4 F(4.91) F(4,90) 3.75 (0.19) 3.64 (0.19) 3.71 3.13 (0.13) 3.02 (0.13) 5.86 (0.31) 3.80 4.15 (0.16)
P
‘&8 F(4.87) F(4,86) 3.58 (0.19) 3.51 (0.17) 3.71 2.97 (0.13) 2.90 (0.12) 5.65 (0.30) 3.80 3.97 (0.16)
4-12 F(4,83) 3.50 (0.19) 3.56 (0.19) 3.71 2.89 (0.13) 2.90 (0.13) 5.58 (0.30) 3.80 3.88 (0.15) z
F(4.82)
12-12 F(12,75) F(12,74) 1.95 (0.08) 1.90 (0.08) 1.90 1.45 (0.04) I.55 (0.05) 2.64 (0.11) 1.93 1.66 (0.05) a

A,6=0.8 s10pe=0.1093 Slope=O.O985 Slope=O.1240 slope=o.o7ll
?z
‘G-4 3.48 (0.20) 3.27 (0.20) 3.73 2.91 (0.14) 2.72 (0.14) 5.37 (0.27) 4.10 3.90 (0.15) IE:
F(4.91) ‘Y4.90)
4-8 3.43 (0.20) 3.24 (0.19) 3.73 2.85 (0.14) 268 (0.14) 5.25 (0.28) 4.10 3.77 (0.15) 2
F(4,87) F(4.86)
4-12 F(4.83) F(4.82) 3.35 (0.19) 3.1 I iO.18j 3.73 2.78 (0.13) 2.57 (0.13) 5.06 (0.27) 4.10 3.62 (0.14) 8.
12-12 F(12,75) F(l2.74) 1.83 (0.08) 1.86 (0.07) 1.91 1.37 (0.05) 1.52 (0.05) 2.73 (0.11) 2.03 1.70 (0.44) E
E
8,8=0.X Slope=O.1C98 Slope=O.O989 Slope=O.l125 Slope=O.IOl6
5
4-4 F(4.91) F(C 90) 3.40 (0.17) 3.54 (0.17) 3.75 2.88 (0.12) 2.95 (0.12) 4.88 (0.25) 3.81 3.63 (0.14)
4-8 F(4.87) 3.36 (0.17) 3.50 (0.19) 3.75 2.83 (0.12) 2.90 (0.13) 4.67 (0.24) 3.81 3.47 (0.14) T
F(4,86)
412 F(4.83) F(4,82) 3.24 (0.15) 3.32 (0.16) 3.75 2.73 (0.12) 2.75 (0.12) 4.44 (0.23) 3.81 3.32 (0.13) B
12-12 F(12.75) F(12,74) 1.81 (0.06) 1.89 (0.07) 1.92 1.37 (0.38) 1.55 (0.05) 225 (0.08) 1.94 I.52 (0.04) E

A.O=0.99 sl0pe=0.1095 Slope=O.O987 slope=o.i9l4 Slopc=O.O76i 3
4-l F(4.91) F(4,90) 3.62 (0.16) 2.11 (0.13) 3.74 3.05 (0.12) 1.85 (0.10) 3.85 (0.24) 5.79 294 (0.15)
4-8 F(4,87) F(4,86) 3.55 (0.16) 2.10 (0.14) 3.74 2.98 10.12) 1.83 10.11) 3.72 (0.24) 5.79 2.83 (0.15)
4-12 F(4,83) F(I82) 3.38 (0.16) 2.10 (0.13) 3.74 2.84 io.11; 1.82 iO.lOj 3.66 (0.23) 5.79 2.77 (0.14)
12-12 F(12,75) F(12.74) 1.76 (0.07) 1.88 (0.08) 1.91 1.34 (0.04) 1.52 (0.05) 2.29 (0.11) 2.60 1.49 (0.05)

B, 0 =0.99 s10pe=0.1093 Slope=00985 sl0pe=0.1155 s10pe=0.1006

44 F(4.91) F(4,90) 3.47 (0.17) 3.31 (0.18) 3.73 2.93 (0.13) 2.77 (0.13) 4.66 (0.25) 3.89 3.49 (0.15)
4-8 F(4.87) F(4.86) 3.38 (0.17) 3.23 (0.18) 3.73 2.84 (0.13) 2.69 (0.13) 4.59 (0.24) 3.89 3.37 (0.15)
4-12 F(4,83) F(4.82) 3.30 (0.17) 3.18 (0.18) 3.73 2.76 (0.12) 2.63 (0.13) 4.52 (0.24) 3.89 3.31 (0.14)
12-12 F(12,75) F(R 74) 1.81 (0.07) 1.82 (0.17) 1.91 1.37 (0.04) 1.50 (0.05) 2.48 (0.11) I.96 1.54 (0.05)

‘Standard errors of estimated means are shown parenthetically.


‘Asymptotic distribution under null hypothesis.
‘Mean of non-central chi-square with degrees of freedom equal to that in numerator of ‘F and non~ntrality parameter 100 times the approximate slop.
Table 7
Numbers of Fsin 5% and 10% critical
regionsunder alternative
hypothesis,in
100 replications.

Sims tests ~ Filtered


Hannan ellicient
AsymptoticF
distributions' Wald tests LeGrange tests Wald LaGrange
Mcdeland
parametetition Granger Sims Granger Sims(LDV) Populationb Granger Sii?.(LDV) Sample Population Sample

A,8=0.5

4-4 F(4.91) F(4,90) 76,X2 75.82 0.76,0.85 66,76 72.79 97,9x 0.77,0.86 93,98
&X F(4,X7) F(4.861 76.85 73,83 0.76.0.85 62.75 65.77 96.98 0.77,0.86 93,96
4-12 F(4.83) F(4,xzj 79:X6 15,x1 0.76;0.85 56,78 69.79 96;98 0.77,0.86 91,96
12-12 F(12,75) F(l2.74) 62.72 45.62 0 55,0.68 8,20 19,3x X2.92 0.57,0.69 38.56

B,B=O.S

4-4 F(4.91) F(4.90) 73.79 70,81 0.76.0.84 69.78 66,78 94,96 0.77,0.86 X5.94
‘a F(4,87) F(4,86) 72,77 74.79 0.76,0.84 63,72 64.75 90,94 0.77,0.86 x4,93
4-12 F(4,83) F(4,82) 68,75 67.76 0.76,0.84 57.72 59.74 92,95 0.77,0.86 x0,94
12-12 F(l2,75) F(12.74) 50.60 47.61 0.55,0 67 14,31 26,47 70.85 0.56,0.69 31.42

A,Q=O.8

44 F(4,91) F(4.90) 59.76 60.70 0.76,0.85 55,67 51,65 90,96 0.82,0.89 X2.92
4-x F(4,87) F(4.86) 63,71 64.76 0.76,0.85 54.68 49,64 88.94 0.82,0.89 78.89
4-12 .Y4,831 F(4.82) a,70 58,6X 0.76,0.85 55.69 50.64 X5.94 0.89,0.89 77,X6
12-12 F(l2.75) F(12,74) 36,50 41,56 0.55,0.68 15,24 72,37 Xl,86 0.62,0.73 32,52

B,B=O.S

44 F(4,91) F(4,90) 69.79 71,79 0.76,0.85 59.74 63.74 X7,83 0.78,0.86 78.88
‘&8 F(4.87) F(4,86) 64.73 73.77 0.76,0.85 58,70 65,75 87.91 0.78,0.86 7X.89
4-12 F(4,83) F(4,82) 64,76 71,76 0.76,0.85 58.70 65,72 85,90 0.78,0.86 74,84
12~12 F(l2.75) F(12.74) 39.59 49.61 0.55,0.68 8,24 25.44 62,79 0.57,0.69 17,33

A, 8=0.99

4-4 F(4,91) F(4.90) 12,X0 32.51 0.76,0.85 67.82 24.43 69.74 0.96,0.98 55.70
&8 F(4.87) F(4,X6) 72.81 33,51 0.76,0.85 65,X0 23.37 61,75 0.96,0.98 55.67
4-12 F(4.83) F(4,82) 68,X3 30,4x 0.76,0.85 61,76 20,39 63,72 0.96,0.98 56.65
12-12 F(l2,75) F(12.74) 42,53 4X,56 0.55.0.6X 8,24 28.46 64,70 0.85.0.91 17.37

B, 8 = 0.99

4-4 F(4.91) F(4.90) 68.80 59,72 0.76,0.85 58.73 54.66 79.87 0.78,0.87 70,83
&X F(4,X7) F(4.86) 66,76 56.70 0.76,0.85 53,69 51,60 81,89 0.78,0.87 67,84
4-12 F(4.83) F(4,82) 62,76 56,68 0.76,0.85 52,65 47,60 80,X9 0.78,0 87 65,81
12-12 F(12,75) F(12,74) 40,52 37,52 0.55,0.68 10.26 22.36 66,7X 0.58,0.70 22,40

'Asymptoticdistribution
under nullhypothesis.
184 J. Geweke et al., Alternative tests of causality

variable tests are the most interesting. However, we also present some
information on the distribution of the filtered Hannan efficient Sims test of
the hypothesis that {xi} does not cause {y,}, since its distribution under the
null hypothesis was among the best of the tests involving serial correlation
correction. The Wald variant of this test rejects the null very reliably, and its
mean value is much higher than those of the other tests whose behavior is
summarized in these two tables. The tendency for this test statistic to be
‘too large’ is again evident in the first four experiments where its rejection
frequency is greater than the asymptotic distribution theory would lead one
to expect, and its mean value exceeds the limiting value (calculated as the
mean of a non-central chi-square distribution with degrees of freedom equal
to ‘F’ numerator degrees of freedom and non-centrality parameter equal to
the approximate slope multiplied by 100 - the number of observations
divided by the numerator degrees of freedom). In the fifth experiment the
parameterization of the Hannan efficient estimator cannot approximate the
behavior of the spectral density of the disturbance of (1.3) at low frequencies
very well, and evidently this prevents the test statistic from attaining the
large values suggested by the asymptotic theory set forth above. In the sixth
experiment the distribution over the 100 replications matches the asymptotic
distribution nicely, a result which is probably a fortuitous coincidence due to
the tendency of Wald Hannan efficient test statistics to be too large
combined with a downward bias due to an inability to parameterize the
spectral density of the disturbance adequately.
The behavior of the LaGrange-multiplier variant of this test is more
difficult to explain. This test rejects the null with a frequency greater than
that of the Granger and Sims lagged dependent variable Wald tests in most
instances when four restrictions are implied by the null, and usually with a
frequency greater than that of the LaGrange-multiplier variants of those tests
when twelve restrictions are implied. For the samples generated here, small
sample considerations evidently dominate the asymptotic behavior
demonstrated in section 2.
The behavior of both the Wald- and LaGrange-multiplier variants of the
Granger and Sims lagged dependent variable tests under the alternative
seems to be well described by asymptotic theory. In most cases rejection
frequencies run a little less than their limiting values [Wald (1943)], a
tendency which becomes more pronounced as serial correlation in {x,}
increases. The parameterization of the Sims lagged dependent variable test is
ill-suited in Model B, 8 =0.99 (see table 3) for essentially the same reason
that the form of this test requiring correction for serial correlation was not
well parameterized, a fact which again leads to a rate of rejection well below
the asymptotic norm. No systematic differences between Granger and Sims
lagged dependent variable tests emerge in the experiments beyond those
which clearly seem due to the inadequacy of the parameterization of the
J. Geweke et al., Alternative tests of causality 185

latter in some experiments. For twelve restrictions LaGrange-multiplier test


statistics are much smaller than their Wald counterparts, a phenomenon also
observed when the null is true; this tendency seems likely to emerge
whenever the number of parameters estimated under the alternative becomes
a substantial fraction of the number of observations.

6. Conclusion
In this paper we have compared tests due to Granger (1969) and Sims
(1972) for the absence of a Granger causal ordering in stationary Gaussian
time series. Unlike previous efforts [e.g., Nelson and Schwert (1979)] our
comparison is based on asymptotic properties of these tests under the
alternative as well as the limiting behavior under the null and sampling
experiments. The analytic results on limiting behavior under the alternative
employ the concept of approximate slope, which is inversely proportional to
the number of observations required for a given power against the alternative
at small significance levels.
The implications of this study for empirical work are unambiguous: one
ought to use the Wald variant of either the Granger or Sims lagged
dependent variable tests described in the introduction. Both have three
important advantages over all other tests for the absence of Granger
causality which we have studied:

(1) The approximate slopes of these tests are at least as great as those of the
other tests (with the exception of the Wald Sims test) under all
alternatives.
(2) The sampling distribution of these tests under the null was very
satisfactory in our experiments, and under the alternative rejection
frequencies corresponded very closely to those indicated by the
asymptotic theory. By contrast, the sampling distribution of statistics for
Sims tests requiring correction for serial correlation conformed very
poorly to its limiting distribution and was sensitive to prefiltering under
the null hypothesis: under the alternative, rejection frequencies also
departed from their limiting values.
(3) The Granger and Sims lagged dependent variable tests are inexpensive to
compute and require only an ordinary least squares computer algorithm.
The variants of the Sims test which require correction for serial
correlation require much more computation time and more decisions
about parameterizations.

Our results are, of course, confined to time series which are stationary and
Gaussian, although the results of sections 2 and 3 will admit relaxation of
the normality assumptions employed in their derivation. It is difficult even to
186 J. Geweke et al., Alternative tests of causality

conjecture the outcome of a similar comparison for nonstationary time series,


for in that case Granger’s definitions will not apply unless some care is taken
with the kind of nonstationarity permitted. The results of sampling
experiments are very limited if interpreted in isolation. In the present
case, however, there exists asymptotic paradigms for the outcome of these
experiments under both the null and all alternatives. The experimental results
for our favored tests conform well to these paradigms, and we conjecture
that this result would emerge under variations on the experiments which
have been reported here.

Mathematical appendix

A.1. Definitions and notation

0) e, is an n x 1 vector with ith element unity and the rest zero.


(ii) F, is an n x n complex matrix, [FJjk =n-*exp(2zijk/n), i2 = - 1. Note
F”F, = F,F; = I,,.
(iii) The n x n matrix T, is a Toeplitz matrix if [ZJjk = tj-k.

(iv) The eigenvalues of any n x n symmetric matrix A are denoted


CL~(A)~~(~(A)~...~~L,(A).
(v) If A and B are n x n symmetric matrices and A-B is positive
semidetinite we write A @ B.

(vi) For any n x n matrix A, define the norms

IIAIII=(PI(A’A))‘, 11
All11
=n-’ fi (PL~(A’A))~.
Note that if A is symmetric,

lIAllr=y~I&‘% IIAIIII=n-’jilIP~L~;
J

if A is also positive semidefinite,

/Allr=~M &‘$r=n-’ j&(A)=n-‘tr(A).

(vii) If R(L) = Is”= 0 r,L” is an absolutely summable lag operator, its Fourier
transform is denoted @(n)= cs ,, r, exp (- ih).
(viii) For any function f(A) > 0 defined on [ - rc,n] and symmetric about zero,
Q.[f(A)] denotes the n x n diagonal matrix with jth entryf(271jln).
J. Geweke et al., Alternative tests ofcausality 187

A.2. Preliminaries

Lemma 1. If A and B are n x n symmetric matrices and A @ B, then


u,(A) >=u,(B), j = 1,. . ., n.

Proof: Bellman (1960, p. 115).

Proof For any n x 1 uectorx,x’B’A’ABx~x’B’BxllAll~, so B’A’AB @ B’BIIAI/,Z.


From Lemma 1, uj(B’A’AB) 5 p,{B'B)IIAIIf , j = 1,. . .) n, whence the result.

Lemma 3. Let {A,} and {B,}, n= 1,2,. . ., be two sequences of symmetric


positive definite matrices. If p,(A) zrn, >O and p,(B,,)z mB>O for all n and
lim,,, ~~An-Bn~\rr=O, then lim,,,~~A,‘-B,‘II,,=O.

Proof: Note that lIA;‘[(,<m;’ and IIB;ll)l<m;l, and apply Lemma 2 to


B,- ‘(B, - AJA, I.

Lemma 4. Zf A and B are symmetric, then ((A+BI(,,~IIAI~,,+(IB~~~~.

Proof Define the symmetric matrix operators [ ] + and [ ] - as follows:


For n x n diagonal D, [D] + = diag {(d, 1(, . . ., Id,,./} and [D] - E - [D] +. For
n x n symmetric A with diagonalization, A= PMP’ (M diagonal and P ortho-
normal), [A] + = P[M] +P’ and [A] - E P[M] -P’. For n x n symmetric A and B,
[A]‘+[B]‘>=A+B>=[A]-+[B]-. From Lemma 1, p$A]‘+[B]‘)>=
&A + B) 2 p,([A] - + [B] -), j = 1,. . ., n. Hence

IIA+BIIrr=n-r j$I I4(A+B)[Sn-’ f I~j(CAl’+CBl’)I


j=I
=n -t tr([A]‘)+n-‘tr([B]‘)

=IICAI+((II+((CBI+~~~~

= IIA~III’
+ IIBllrr~

Lemma 5. Let (Y”}, Y,zO, Vn, be a random sequence. Zf lim,,, E(Y,) =O, then
plim Y,= 0.

Proof Partition [0, cc) into the intervals [0, E] and (E,cc), where E is any
specified positive constant. Then E( Y,)2 E. P[ Y, > E), whence plim Y,= 0.

Lemma 6. Let T, be an n x n Toeplitz matrix, [Tnljk= tj-k, for which


188 J. Geweke et al., Alternative tests of causality

&T= _ cotjzj is convergent in an open annulus of the unit circle. Let

tj=& _i e’j*g@)d&
I
and suppose

mg> infg(il) > - co, M,=supg(q<cc.


I L

Zf H (*) is a continuous function defined on the interval Cm,, M,], then

lim n-l 2 H(&T,,))=& j HCg(A)]dA.


n+m j=O ?I

Proof Grenander and Szegij (1958, theorem 5.2).

Lemma 7. Let T, be an n x n Toeplitz matrix, [ xljk = tj-k, for which


cj”= _ m Itjl < CO. Let

tj =&5xeijAg(l)dl,
and let

C, =9&(A)] and .I, = F&F,.

Then

lim IIT,-J,llll=O.
n-tm

Proof. We closely follow Grenander and Szegii (1958, pp. 112-113) who
used weaker assumptions to show convergence in a weaker norm. Introduce
the Cesaro sums

gp(l)= 9 (l-_l/p)tje-ijZ= 2 tie-‘j”.


j= -p j= -m

Define the Toeplitz form [Kn]jkstJ-k, the diagonal matrix E, = 9Jg,(A)],


and the symmetric positive definite matrix L, = FbE,F,.
Applying Lemma 6 with H(x) = 1x1to the Toeplitz form T, - K,,

(A.11
J. Geweke et al., Alternative tests of causality 189

The right side of (A.l) is bounded above by 2x7= r (itjl/p +2xj”=,ltj(, which
can be made arbitrarily small by choosing p suffkiently large.
Observe that

[L,]jk= i exp(27ci(j--k)/n)gp(2Zj/n)= f ti_k+ln.


j=l 1=-m

The matrix K, - L, therefore has typical element

This expression is zero for lj-kl <n-p. Hence K,-L, has at most 2p non-
zero eigenvalues, and the absolute value of no eigenvalue can exceed
x:y”=_lt,l. Hence

which vanishes as n+co.


Finally,

converges to zero because the right side of (A.l) converges to zero. The result
follows from Lemma 4.

A.3. Main results


Lemma 8. Suppose {.q} is a zero-mean stationary Gaussian process with
continuous spectral density S,(A) bounded above and below by positive
constants. Let R(L) be an absolutely summable lag operator with all roots
outside an open annulus of the unit circle, let l?(A) be the Fourier transform of
R(L), and de$ne z: = R(L)z,. Then

plim n-l ,iI z:‘=& _j. ~~(4(2Sz(4d~.

Proof Since {zf*j is ergodic, plim n- ’ I:= r z:’ =var (zr). The process z: has
spectral density lR((n)12S,(1)[Fishman (1969, p. 41)], whence

var(zT)=& _i I&l)("S,(A)dA.
II
190 J. Geweke et al., Alternative tests ofcausality

Lemma 9. Let {zr} be the process defined in Lemma 8 and denote z,


= (z 1,. . -3 z,,)l. Let fi,, be a sequence of n x n positive definite matrices, and
)@A)]’ a continuous function defined on [-TC,TC] with positive lower and upper
bounds. If

then

plimn-‘[zh~,‘z,-z~F,~d,[I~((I)12]F~z,] =O. (A.3)

ProoJ: Let the bracketed symmetric matrix in (A.2) have canonical


factorization &I,$‘,, where P, is orthonormal and A, is diagonal. Let
z! = P,z,. Then

n-‘z’P’/i
nn n
p nnz =n-lzt’A n Zt =n-l
n n iil (di)'G 5 n- l i&
(z?i)21nll.

Because P, is orthonormal, var(z$)=var (z,). The expected value of the last


term is var (z,)n-’ CyZI I$[. But this term is non-negative, so by Lemma 5 it
converges in probability to zero.

Lemma 10. Let {zt} and R(L) be the process and lag operator defined in
Lemma 8,

plim n- ‘z.F”~“[lli(n)l’]F:z.=Zf;; _I I&(n)12S,(1)dJ.


R

Proof Let Z;,= (~7;).. ., zT)‘=&,, the finite Fourier transform of {zt}. Let zy
have moving average representation zy= Q(L)E~, in which the E, are i.i.d.
normal. Let En= (ET,.. ., E:)‘= FOE,; the ~7 are i.i.d. complex normal. We have

By virtue of Hannan (1970, corollary 5, p. 214) and the fact that the
(R”(2j/n)(2 are bounded above and below by positive constants uniformly in
n, the difference between the right side of (A.3) and

vanishes in probability. But the $! are i.i.d. complex normal and R”and Q”are
J. Geweke et al., Alternative tests of causality 191

continuous, so the last expression converges in probability to

Theorem 1. Let (zt}, {z:}, and R(L) be the processes and lag operator defined
in Lemma 8, and define fi,, as in Lemma 9. Given (A.2),

64.4)

Zf there exists E>O such that P[n,(fi,J <c]+O, P[,I,(C&) > E- ‘l-+0, then (A.2)
may be replaced by

plim Jlii,-F,~~[I~(~)J-‘]F:,II,,=O. (A.5)

Proof The conditions on &(fi,) and ,I,(@,) guarantee that llb,,llr and II&;‘//,
are bounded away from zero with probability 1, in the limit. The conditions
on R(L) guarantee that IR(in)l’ is bounded above and below by positive
constants for AE[-rr,rc]. Lemma 4 then guarantees the equivalence of (A.2)
and (A.5).
The result itself follows from Lemmas 8, 9, and 10, with application of
Lemma 4.

Remark 1. Theorem 1 is sufficient for the claim of the text that tests based
on a generalized least squares estimator with known variance matrix, and on
a feasible generalized least squares estimator with estimated fi,, [satisfying
(A.2) or (A.5)], have the same approximate slope; let z,= y,-c:= 1 bjxjf, where
x1*,. . ., xkt are the regressors and bj is any consistent estimator of Bj in the
projection Cj”=r ajxj, of yt on xIt,. . ., xkt.

Theorem 2 [Amemiya (1973) feasible generalized least squares]. Suppose


Q, is estimated as suggested by Amemiya (1973): the disturbances are assumed
to follow an autoregressive process of order p, and the autoregression is
estimated accordingly, using ordinary least squares residuals in lieu of the
disturbances. If the true autoregressive representation of the disturbances is
R(L)u,=e,, var (sJ= 1, R(L) satisfies the conditions of Lemma 8, and {u,} has
absolutely summable autocovariance function, then there exists a function p(n)
such that tfp=p(n) then (A.2) is satisfied.

Proof Suppose initially that p is fixed, and let R,(L) denote the lag operator
from the autoregression of u, on u,_ 1,. . ., u,-~. The roots.of R,(L) =EEO r,PLs
lie outside an open annulus of the unit circle [Grenander and Szegii (1958,
192 1. Geweke et al., Alternative tests of causality

pp. 4&42)]. Let

YP r;-r ri-2 .., r! r: 0 ... 0 0 0


0” r: r;_1 ... r5 rI r{ ... 0 0 0
G; =
(n-p)xn ... .. . . .. ... ... ... ... ... ... . .. ... .

I 0 0 0 ... 0 0 0 ... r$ r’; rz 1

We have

(i) lim1152;‘-G~G$,=O,

(ii) plirnII~~l-~~~~lI,,=O,

(iii) lim IIQ;’ -F,~,[I~(‘(I)12]FkJIII=0,

(iv) plim Il@‘GE- G;‘G$,=O.

The assertion (i) follows from the fact that 0;’ and GrGI: differ only in the
p x p submatrix consisting of their first p rows and columns. Hence
I;= I IP,~(K 1- G’G:)l is a fixed constant not depending on n. Similarly, (ii).
From Lemma 7,

and [Grenander and Szegii (1958, p. 66)]

,&U I mjn I@)l’.

(iii) then follows from Lemma 3. To establish (iv), note

Il~~cl~-G~G~lll,~;lGl~-G~lI,,.IIG~ll,+Il~~-Glllrr.IIG~llr.

Let x, be any m x 1 vector with &CC,,,= 1.

which is bounded uniformly above for all n. Likewise plim ~~G~~~l


is bounded
J. Geweke et al., Alternative tests of causality 193

above. Since plim ?: = r,“, s = 0,. . ., p,

>I
2 3

whence
plim
(
sup &F,p-rf)~Sp+~
[ X*+1s=o
=O,

The result for fixed p follows by application of Lemma 4 to (iHiv).

The assumptions on R(L) guarantee lim,,, 1&(,?)1” = lR(~)l’, whence

lim- ((F,~~Cl~~(n)(21F:,-Fn~~CI~(~)121FhJJrr=0.

p can be increased with n at a rate such that (A.2) obtains.

Theorem 3 (Hannan (1963) efficient estimation). Suppose Q, = F,g,,[&(I)]F:,,


where $,(A) is a consistent estimator of S,(A), the spectral density of the
disturbance. If the true autoregressive representation of the disturbance is
R(L)u,=e,, var(.sJ= 1, and R(L) satisfies the conditions of Lemma 7, then
(A.5) is satisfied.

Proof Immediate from the fact that S,(n) = lr?(n)l-*.

Remark 2. Theorems 2 and 3 illustrate applications of Theorem 1 to the


two estimators investigated in this paper. Presumably this theorem could be
applied in the case of other estimators as well.

References

Amemiya, T., 1973, Generalized least squares with an estimated autocovariance matrix,
Econometrica 41, 723-732.
Baliadur, R.R., 1960, Stochastics comparison of tests, Annals of Mathematical Statistics 31, 276
295.
Bellman, R.A., 1960, Introduction to matrix analysis (McGraw-Hill, New York).
Berndt, E.R. and N.E. Savin, 1977, Conflict among criteria for testing hypotheses in the
multivariate linear regression model, Econometrica 45, 1263-1278.
Box, G.E.P. and M.E. Muller, 1958, A note on the generation of random normal deviates,
Annals of Mathematical Statistics 27, 610-611.
Breusch, T.S., 1979, Conflict among criteria for testing hypothesis: Extension and comment,
Econometrica 47, 203-208.
Fishman, G., 1969, Spectral methods of econometrics (Harvard University Press, Cambridge,
MA).
Geweke, J., 1978, Testing the exogeneity specification in the complete dynamic simultaneous
equation model, Journal of Econometrics 7, 163-185.
Geweke; J., 1981a, A comparison of tests of the independence of two covariance-stationary time
series, Journal of the American Statistical Association 76, 363-373.
Geweke, J., 1981b, The approximate slopes of econometric tests, Econometrica 49, 1427-1442.
JE- B
194 J. Geweke et al., Alternative tests ofcausality

Granger, C.W.J., 1969, Investigating causal relations by econometric models and cross spectral
methods, Econometrica 37, 428438.
Grenander, U. and G. Szego, 1958, Toeplitz forms and their applications (University of
California Press, Berkekely, CA).
Hannan, E.J., 1963, Regression for time series, in: M. Rosenblatt, ed., Proceedings of a
symposium on time series analysis (Wiley, New York).
Hannan, E.J., 1970, Multiple times series (Wiley, New York).
Knuth, D.E., 1969, The art of computer programming, Vol. 2 (Addison-Wesley, New York).
Nelson, C.R. and G.W. Schwert, 1979, Tests for Granger/Wiener causality: A Monte Carlo
investigation, University of Rochester Graduate School of Management working paper 7905.
Oberhofer, W. and J. Kmenta, 1974, A general procedure for obtaining maximum likelihood
estimates in generalized regression models, Econometrica 42, 5799590.
Pierce, D.A. and L.D. Haugh, 1977, Causality in temporal systems: Characterizations and a
survey, Journal of Econometrics 5, 2655293.
Rozanov, Y.A., 1969, Stationary random processes (Holden-Day, San Francisco, CA).
Sargent, T.J., 1976, A classical macroeconometric model of the U.S., Journal of Political
Economy 87,207-237.
Silvey, N.E., 1959, The LaGrangian multiplier test, Annals of Mathematical Statistics 30, 389-
407.
Sims, C., 1972, Money, income and causality, American Economic Review 62, 54@552.
Sims, C., 1974, Distributed lags, in: M.D. Intrihgator and D.A. Kendrick, eds., Frontiers of
quantitative economics (North-Holland, Amsterdam).
Wald, A., 1943, Tests of statistical hypotheses concerning several parameters when the number
of observations is large, Transactions of the American Mathematical Society 54, 426485.
Whittle, P., 1963, Prediction and regulation by linear least-squares method (Van Nostrand,
Princeton, NJ).

You might also like