Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian Vector Autoregressive Models
Author(s): Søren Johansen
Source: Econometrica, Vol. 59, No. 6 (Nov., 1991), pp. 1551-1580
Published by: The Econometric Society
Stable URL: http://www.jstor.org/stable/2938278
BY SØREN JOHANSEN
The purpose of this paper is to present the likelihood methods for the analysis of
cointegration in VAR models with Gaussian errors, seasonal dummies, and constant
terms. We discuss likelihood ratio tests of cointegration rank and find the asymptotic
distribution of the test statistics. We characterize the maximum likelihood estimator of
the cointegrating relations and formulate tests of structural hypotheses about these
relations. We show that the asymptotic distribution of the maximum likelihood estimator
is mixed Gaussian. Once a certain eigenvalue problem is solved and the eigenvectors and
eigenvalues calculated, one can conduct inference on the cointegrating rank using some
nonstandard distributions, and test hypotheses about cointegrating relations using the χ² distribution.
1. INTRODUCTION AND SUMMARY
A LARGE NUMBER OF PAPERS are devoted to the analysis of the concept of cointegration, defined first by Granger (1981, 1983), Granger and Weiss (1983), and studied further by Engle and Granger (1987). Under this heading the topic has been studied by Stock (1987), Phillips and Ouliaris (1988), Phillips (1988, 1990), Johansen (1988b), and Johansen and Juselius (1990, 1991). The main statistical technique that has been applied is regression with integrated regressors, which has been studied by Phillips (1988), Phillips and Park (1988), Park and Phillips (1988, 1989), Phillips and Hansen (1990), Park (1988), and Sims, Stock, and Watson (1990). Similar problems have been studied under the name of common trends (see Stock and Watson (1988)).
The purpose of this paper is to present some new results on maximum likelihood estimators and likelihood ratio tests for cointegration in Gaussian vector autoregressive models which allow for a constant term and seasonal dummies. This brings in the technique of reduced rank regression (see Anderson (1951), Velu, Reinsel, and Wichern (1986), Ahn and Reinsel (1990), and Reinsel and Ahn (1990)), as well as the notion of canonical analysis (Box and Tiao (1981), Velu, Wichern, and Reinsel (1987), Peña and Box (1987), and the very elegant paper by Tso (1981)). In Johansen (1988b) the likelihood based theory was presented for such a model without constant term and seasonal dummies, but it turns out that the constant plays a crucial role for the interpretation of the model, as well as for the statistical and the probabilistic analysis.
A detailed statistical analysis illustrating the techniques by data on money demand from Denmark and Finland is given in Johansen and Juselius (1990), and the present paper deals mainly with the underlying probability theory that allows one to make asymptotic inference.
Consider a general VAR model with Gaussian errors written in the error correction form

ΔX_t = Γ_1 ΔX_{t-1} + ... + Γ_{k-1} ΔX_{t-k+1} + Π X_{t-k} + μ + Φ D_t + ε_t (t = 1, ..., T),

where the hypothesis of cointegration is formulated as H2: Π = αβ'. Let Z_0t = ΔX_t and Z_kt = X_{t-k}, let Z_1t stack the lagged differences, the dummies, and the constant, and write M_ij = T^{-1} Σ_{t=1}^T Z_it Z_jt' for the product moment matrices. Define the residuals

R_it = Z_it − M_i1 M_11^{-1} Z_1t (i = 0, k),

and the residual sums of squares

(2.5) S_ij = M_ij − M_i1 M_11^{-1} M_1j (i, j = 0, k).

The estimate of Γ for fixed values of α, β, and Λ is found to be

(2.6) Γ̂(α, β) = (M_01 − αβ'M_k1) M_11^{-1}.

Thus the residuals are found by regressing ΔX_t and X_{t-k} on the lagged differences, the dummies, and the constant. This gives the likelihood function concentrated with respect to the parameters Γ_1, ..., Γ_{k-1}, Φ, and μ:

(2.7) L_max^{-2/T}(α, β, Λ) = |Λ| exp{T^{-1} Σ_{t=1}^T (R_0t − αβ'R_kt)' Λ^{-1} (R_0t − αβ'R_kt)}.
For fixed β the maximum is attained at

(2.8) α̂(β) = S_0k β (β'S_kk β)^{-1},

(2.9) Λ̂(β) = S_00 − S_0k β (β'S_kk β)^{-1} β'S_k0,

together with

(2.10) L_max^{-2/T}(β) = |S_00| |β'(S_kk − S_k0 S_00^{-1} S_0k)β| / |β'S_kk β|.

This again is minimized by the choice β̂ = (v̂_1, ..., v̂_r), where V̂ = (v̂_1, ..., v̂_p) are the eigenvectors of the equation

(2.11) |λ S_kk − S_k0 S_00^{-1} S_0k| = 0,

normed by V̂'S_kk V̂ = I, and ordered by λ̂_1 > ... > λ̂_p > 0. The maximized likelihood function is found from

(2.12) L_max^{-2/T}(r) = |S_00| Π_{i=1}^r (1 − λ̂_i).
This procedure is given in Johansen (1988b) for the model without constant term and seasonal dummies, and consists of well known multivariate techniques from the theory of partial canonical correlations and reduced rank regression (see Anderson (1951) and Tso (1981)).
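The eigenvalue step just described can be sketched numerically. The following is a minimal illustration, not code from the paper: the residual matrices `R0` and `Rk` (from regressing ΔX_t and X_{t-k} on the lagged differences, the dummies, and the constant) are assumed given, and `scipy` is used to solve the generalized eigenvalue problem (2.11).

```python
import numpy as np
from scipy.linalg import eigh


def johansen_eigen(R0, Rk):
    """Solve |lambda*S_kk - S_k0 S_00^{-1} S_0k| = 0 as in (2.11).

    R0, Rk: (T, p) arrays of residuals (assumed already concentrated).
    Returns eigenvalues ordered lambda_1 >= ... >= lambda_p and
    eigenvectors V normed so that V' S_kk V = I.
    """
    T = R0.shape[0]
    S00 = R0.T @ R0 / T
    S0k = R0.T @ Rk / T
    Skk = Rk.T @ Rk / T
    # Symmetric generalized eigenproblem: (S_k0 S_00^{-1} S_0k) v = lambda S_kk v
    M = S0k.T @ np.linalg.solve(S00, S0k)
    eigvals, eigvecs = eigh(M, Skk)      # ascending order, v' S_kk v = 1
    order = np.argsort(eigvals)[::-1]    # reorder to lambda_1 >= ... >= lambda_p
    return eigvals[order], eigvecs[:, order]
```

The eigenvalues are squared partial canonical correlations between R0 and Rk, so they lie in [0, 1]; the first r eigenvectors give β̂ as in (2.12).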
To give an intuition for the above analysis, consider the estimate of Π in the unrestricted VAR model, given by Π̂ = S_0k S_kk^{-1}. Since the hypothesis of cointegration is the hypothesis of reduced rank of Π, it is intuitively reasonable to calculate the eigenvalues of Π̂ and check whether they are close to zero. This is the approach of Fountis and Dickey (1989). Another possibility is to calculate singular values, i.e. eigenvalues of Π̂'Π̂, since they are real and positive. It is interesting to see that the maximum likelihood estimation
THEOREM 2.1: The likelihood ratio test statistic for the hypothesis H2: Π = αβ' versus H1 is given by

(2.13) −2 ln(Q; H2|H1) = −T Σ_{i=r+1}^p ln(1 − λ̂_i),

whereas the likelihood ratio test statistic of H2(r) versus H2(r + 1) is given by

(2.14) −2 ln(Q; r|r + 1) = −T ln(1 − λ̂_{r+1}).

The statistic −2 ln(Q; H2|H1) has a limit distribution which, if α_⊥'μ ≠ 0, can be expressed in terms of a (p − r)-dimensional Brownian motion B with i.i.d. components as

(2.15) tr{∫(dB)F' [∫FF' du]^{-1} ∫F(dB)'},

where F is defined by

(2.16) F_i(t) = B_i(t) − ∫B_i(u) du (i = 1, ..., p − r − 1),

and

(2.17) F_{p−r}(t) = t − 1/2.

The test statistic −2 ln(Q; r|r + 1) is asymptotically distributed as the maximum eigenvalue of the matrix in (2.15).
If in fact α_⊥'μ = 0, then the asymptotic distributions of −2 ln(Q; H2|H1) and −2 ln(Q; r|r + 1) are given as the trace and the maximal eigenvalue respectively of the matrix in (2.15) with F(t) = B(t) − ∫B(u) du.
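Given the eigenvalues of (2.11), both statistics in Theorem 2.1 are immediate to compute. A small sketch (the function name and interface are ours, not the paper's):

```python
import numpy as np


def cointegration_rank_statistics(eigvals, T):
    """Statistics (2.13) and (2.14) from the ordered eigenvalues
    lambda_1 >= ... >= lambda_p of (2.11).

    For each r = 0, ..., p-1:
      trace[r]  = -T * sum_{i=r+1}^{p} ln(1 - lambda_i)   (H2(r) versus H1)
      maxeig[r] = -T * ln(1 - lambda_{r+1})               (H2(r) versus H2(r+1))
    """
    lam = np.asarray(eigvals, dtype=float)
    log1m = np.log(1.0 - lam)
    p = lam.size
    trace = np.array([-T * log1m[r:].sum() for r in range(p)])
    maxeig = -T * log1m
    return trace, maxeig
```

The limit distributions are the nonstandard ones of Theorem 2.1, so critical values must come from the appropriate tables, not from a χ² distribution.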
Here, and in the following, the integrals are all on the unit interval, where the Brownian motions are defined. Note that integrals of the form ∫FF' du are ordinary Riemann integrals of continuous functions and the result is a matrix of stochastic variables. The integral ∫F(dB)', however, is a matrix of stochastic integrals, defined as L2 limits of the corresponding Riemann sums.
This section is concluded by pointing out how one can analyze the model H2*. First note that if μ = αβ_0 then

αβ'X_{t−k} + μ = αβ'X_{t−k} + αβ_0 = αβ*'X*_{t−k},

for β*' = (β', β_0) and X*_{t−k}' = (X_{t−k}', 1). In these new variables the model is written

ΔX_t = αβ*'X*_{t−k} + Σ_{i=1}^{k−1} Γ_i ΔX_{t−i} + Φ D_t + ε_t (t = 1, ..., T).
THEOREM 2.2: Under the hypothesis H2*: Π = αβ' and μ = αβ_0, the likelihood ratio statistics −2 ln(Q; H2*|H1) and −2 ln(Q; H2*(r)|H2*(r + 1)) are distributed as the trace and maximal eigenvalue respectively of the matrix in (2.15), with F = (B', 1)'.

Since the parameters α and β are not identified we can only test restrictions that cannot be satisfied by normalization.
We consider here in detail a simple but important model for linear restrictions of the cointegrating space and the adjustment space that allows explicit maximum likelihood estimation:

(3.1) H3: β = Hφ and α = Aψ.

Note that H3 is a submodel of H2. The likelihood ratio test of the restrictions H3 in the model H2 will be discussed below. There are of course many other possible hypotheses on the cointegrating relations, but the ones chosen here are simple to analyze and have a wide variety of applications; see Johansen and Juselius (1990), Hoffman and Rasche (1989), and Kunst and Neusser (1990). Another class of hypotheses, of the form β = (Hφ, ψ), can be solved with similar methods (see Johansen and Juselius (1991), and Mosconi and Giannini (1992)). For more general hypotheses on β of the form h(β) = 0 one cannot in general prove the existence and uniqueness of the maximum likelihood estimator, but such hypotheses can be tested by likelihood ratio or Wald tests using the asymptotic distribution of β̂ derived in Appendix C.
Under hypothesis H3 we transform the matrices S_ij some more. Together with A (p × m) we consider a (p × (p − m)) matrix B = A_⊥ of full rank, such that B'A = 0, and introduce the notation

(3.2) |λ S_hh.b − S_ha.b S_aa.b^{-1} S_ah.b| = 0,

for eigenvalues λ̂_1 > ... > λ̂_s > 0 and eigenvectors v̂_1, ..., v̂_s. Then the estimate for Γ is found from (2.6), and the maximized likelihood function is found, as in (2.12), from the r largest eigenvalues.
THEOREM 3.2: The likelihood ratio test statistic of the restriction β = Hφ and α = Aψ versus H2 is given by

(3.9) −2 ln(Q; H3|H2) = T Σ_{i=1}^r ln{(1 − λ̂_i(H3))/(1 − λ̂_i(H2))},
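Given the two sets of eigenvalues, the statistic (3.9) is a one-liner, and Theorem 3.2 (together with the parameter count in its proof in Appendix B) gives its asymptotic χ² distribution. A hedged sketch, with the degrees of freedom `df` supplied by the caller:

```python
import numpy as np
from scipy.stats import chi2


def lr_restriction_test(lam_H3, lam_H2, T, r, df):
    """LR statistic (3.9) of H3 against H2:

        -2 ln Q = T * sum_{i=1}^{r} ln{(1 - lam_i(H3)) / (1 - lam_i(H2))},

    asymptotically chi^2(df) under H3. The two inputs are the ordered
    eigenvalues from the restricted (3.2) and unrestricted (2.11) problems.
    """
    lam3 = np.asarray(lam_H3, dtype=float)[:r]
    lam2 = np.asarray(lam_H2, dtype=float)[:r]
    stat = T * np.sum(np.log((1.0 - lam3) / (1.0 - lam2)))
    return stat, chi2.sf(stat, df)
```

Since H3 is a submodel of H2 one has λ̂_i(H3) ≤ λ̂_i(H2), so the statistic is nonnegative.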
the process to be integrated of order 1 (see (4.4) below), and we clarify the role of the constant term.
PROOF: If we multiply the equation (4.2) by ᾱ' and α_⊥', respectively, we get the equations

−ᾱ'αβ'X_t + ᾱ'Ψ(L)ΔX_t = ᾱ'(ε_t + μ + ΦD_t),

α_⊥'Ψ(L)ΔX_t = α_⊥'(ε_t + μ + ΦD_t),

where we use the notation ā = a(a'a)^{-1} for any matrix a of full rank. With this notation note that ā'a = I and aā' = āa' = a(a'a)^{-1}a', which is just the projection onto the space spanned by the columns of a. The process ΔX_t can be recovered from Z_t = β'X_t and Y_t = β_⊥'ΔX_t:

ΔX_t = (β̄β' + β̄_⊥β_⊥')ΔX_t = β̄_⊥Y_t + β̄ΔZ_t.

This gives the equations for Z_t and Y_t:

(4.12) −ᾱ'αβ'β̄Z_t + ᾱ'Ψ(L)β̄ΔZ_t + ᾱ'Ψ(L)β̄_⊥Y_t = ᾱ'(ε_t + μ + ΦD_t),

(4.13) α_⊥'Ψ(L)β̄ΔZ_t + α_⊥'Ψ(L)β̄_⊥Y_t = α_⊥'(ε_t + μ + ΦD_t).
The idea of the proof is now to show that the equations for the processes Z_t and Y_t constitute an invertible autoregressive model.
We write the equations for Z_t and Y_t as

A(L)(Z_t', Y_t')' = (ᾱ, ᾱ_⊥)'(ε_t + μ + ΦD_t),

with

A(z) = [ −ᾱ'αβ'β̄ + ᾱ'Ψ(z)β̄(1 − z)    ᾱ'Ψ(z)β̄_⊥ ]
       [ α_⊥'Ψ(z)β̄(1 − z)              α_⊥'Ψ(z)β̄_⊥ ].

For z = 1 this has determinant

|A(1)| = |−ᾱ'αβ'β̄| |α_⊥'Ψ(1)β̄_⊥|,

which is nonzero by assumptions (4.3) and (4.4). Hence z = 1 is not a root. For z ≠ 1 we use the representation

A(z) = (ᾱ, ᾱ_⊥)' Π(z) (β̄, β̄_⊥(1 − z)^{-1}),

which gives the determinant as

|A(z)| = |(ᾱ, ᾱ_⊥)| |Π(z)| |(β̄, β̄_⊥)| (1 − z)^{−(p−r)}.

This shows that all roots of |A(z)| = 0 are outside the unit disk, by the assumption about Π(z); see (4.1).
It follows that the system defined by (4.12) and (4.13) is invertible, and hence that Z_t and Y_t can be given such initial distributions that they become stationary. Hence also ΔX_t = β̄_⊥Y_t + β̄ΔZ_t is stationary apart from the contribution from the centered dummies. This proves (4.6) and (4.7).
If these initial distributions are expressed in terms of a doubly infinite sequence {ε_t}, then the process (Z_t', Y_t')' has the representation
(5.4) G_2(t) = t − 1/2 (t ∈ [0, 1]),

that is, the Brownian motion corrected for level and trend. The asymptotic conditional variance is

(∫GG' du)^{-1} ⊗ (c'Π'Λ^{-1}Πc)^{-1}.

process W enter the result. The trend is described by defining the last component of G by t.
Note that the limiting distribution for fixed G is Gaussian with mean zero and variance

(∫GG' du)^{-1} ⊗ (c'Π'Λ^{-1}Πc)^{-1},
which we call the limiting conditional variance. Thus the limiting distribution of T(β̂_c − β) is a mixture of Gaussian distributions. See Jeganathan (1988) for a general theory of locally asymptotically mixed normal models.
The result shows that if the β's are normalized by c one can find the limiting distribution of any of the coefficients and hence of any smooth function of the coefficients. Note, however, that if we were interested in the linear combination Tτ'(β̂_c − β) then the limiting distribution degenerates to zero, since τ'(I − β(c'β)^{-1}c')γ = 0. A different normalization, by T^{3/2}, is needed in this case. The result can be determined from the proof of Theorem 5.1 in Appendix C, but will not be explicitly formulated here.
Without proof we give the corresponding result for model H3*, i.e. when μ = αβ_0, β = Hφ, and α = Aψ. Introduce γ_H such that β and γ_H span sp(H), and define γ*_H = (γ_H', 0)' and τ* = (0', 1)'. The normalization by the p × r matrix c is now done as follows:

β̂*_c = β̂*(c'β̂)^{-1} = (β̂_c', β̂_0c')'.

Note that T(β̂*_c − β*) has the same limit distribution as that given by (5.1) for τ = 0.
As an example of an application of the results given above consider the following simple situation where r = 1, and where we want to test a linear constraint K'β = 0 on the cointegrating relation β' = (β_1, ..., β_p).
We formulate the result as a corollary.
|λ S_kk − S_k0 S_00^{-1} S_0k| = 0.

The remaining eigenvectors form v̂. A similar result holds for the model with no trend.
Note that the test statistic (5.11) is very easy to calculate once the basic eigenvalue problem (2.11) has been solved. After having picked out the eigenvector that estimates the cointegrating relation, one can apply the remaining eigenvectors to estimate the "variance" of the coefficients of the cointegrating relations.
Thus if there is only one cointegration vector β one can think of the matrix (λ̂_1^{-1} − 1)v̂v̂'/T as giving an estimate of the asymptotic "variance" of β̂.
If we want to derive a confidence interval for the parameter ρ = β_2/β_1 we define K' = (ρ_0, −1, 0, ..., 0), such that K'β = ρ_0β_1 − β_2, which is zero if ρ = ρ_0. Theorem 5.1 yields the result that
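A confidence set for ρ can then be obtained by test inversion: collect the values ρ_0 that the test does not reject. The sketch below is ours, not the paper's statistic (5.11): it assumes a normalized estimate `beta_hat` and a consistent estimate `V_hat` of its asymptotic "variance" (e.g. built from the remaining eigenvectors as just described), and uses a Wald form treated as χ²(1), which is justified by the mixed Gaussian limit.

```python
import numpy as np
from scipy.stats import chi2


def ratio_confidence_interval(beta_hat, V_hat, rho_grid, level=0.95):
    """Confidence set for rho = beta_2 / beta_1 by test inversion.

    For each rho_0 set K' = (rho_0, -1, 0, ..., 0), so K'beta = 0 iff
    rho = rho_0, and keep the rho_0 not rejected by the Wald statistic
    (K'beta_hat)^2 / (K'V_hat K), compared with the chi^2(1) quantile.
    V_hat is an assumed input here, not derived in this sketch.
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    crit = chi2.ppf(level, 1)
    kept = []
    for rho0 in rho_grid:
        K = np.zeros(beta_hat.size)
        K[0], K[1] = rho0, -1.0
        stat = (K @ beta_hat) ** 2 / (K @ V_hat @ K)
        if stat <= crit:
            kept.append(rho0)
    return kept
```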
Since we have proved in Theorem 4.1 that under certain conditions ΔX_t and β'X_t can be considered stationary, it follows that the stochastic components of Z_t' = (ΔX_{t−1}', ..., ΔX_{t−k+1}', D_t', 1) are stationary. Let Σ_kk = Var(X_{t−k}|Z_t). Since the process X_{t−k} is nonstationary, this variance clearly depends on t, but since β'X_{t−k} is stationary, β'Σ_kkβ does not depend on t. We shall indicate this by leaving out the dependence on t and defining

Var(ΔX_t, β'X_{t−k} | Z_t) = [ Σ_00     Σ_0kβ    ]
                             [ β'Σ_k0   β'Σ_kkβ ].
The first result concerns the relations between these variance-covariance matrices and the parameters α and β in the model H2:

(A.1) Σ_00 = αβ'Σ_k0 + Λ,

(A.2) Σ_0kβ = α(β'Σ_kkβ),

and hence

(A.3) Σ_00 = α(β'Σ_kkβ)α' + Λ.
These relations imply that

Λ^{-1} − Λ^{-1}α(α'Λ^{-1}α)^{-1}α'Λ^{-1} = α_⊥(α_⊥'Λα_⊥)^{-1}α_⊥'.

The first relation in (A.5) is proved the same way, by multiplying by (ᾱ, Σ_00α_⊥). The second equality in (A.5) follows from (A.3), since α_⊥'Σ_00 = α_⊥'Λ, and the third is proved as the first.
Finally (A.6) is proved by inserting α = Σ_0kβ(β'Σ_kkβ)^{-1}, such that (A.6) becomes
T^{-1/2} Σ_{i=1}^{[Tu]} ε_i ⇒ W(u) (u ∈ [0, 1]),

(A.7) T^{-1/2} γ'X_{[Tu]} = T^{-1/2} γ'C Σ_{i=1}^{[Tu]} ε_i + o_p(1) ⇒ γ'C W(u),

(A.8) T^{-1} τ'X_{[Tu]} = T^{-1} τ'C Σ_{i=1}^{[Tu]} ε_i + τ'τ[Tu]/T + o_p(1) →_p τ'τ u.
We do not give the asymptotic results in detail here, since the proofs are similar to those in Johansen (1988b), which are based on the results in Phillips and Durlauf (1986), and are simple consequences of the representation (4.11), but we summarize the results in two lemmas.
LEMMA A.3: For τ = Cμ ≠ 0 define B_T = (γ, T^{-1/2}τ) and define G' = (G_1', G_2) as in Theorem 5.1; then

(A.10) B_T'(S_k0 − S_kkβα') ⇒ ∫G(dW)'.

LEMMA A.4: Let B*_T = (γ*, T^{-1/2}τ*), and define G*' = (G_1*', G_2*), as in Theorem 5.2; then

T^{-1} Σ_{t=1}^T β'X_{t−k}, and
(B.2) |λS_kk − S_k0S_00^{-1}S_0k| = 0;

see (2.11). Let S(λ) = λS_kk − S_k0S_00^{-1}S_0k. We apply Lemma A.3 to investigate the asymptotic properties of S(λ) and apply the fact that the ordered solutions of (B.2) are continuous functions of the coefficient matrices.
As in Lemma A.3 we let γ be orthogonal to β and τ, such that (β, γ, τ) span R^p. We then find from Lemma A.3, that for B_T = (γ, T^{-1/2}τ) and A_T = (β, T^{-1/2}B_T), the roots of (B.2) converge to those of

(B.3) |λβ'Σ_kkβ − β'Σ_k0Σ_00^{-1}Σ_0kβ| |λ∫GG' du| = 0,

which has r positive roots and (p − r) zero roots. This shows that the r largest solutions of (B.2) converge to the roots of (B.3) and that the rest converge to zero.
Next consider the decomposition

|(β, B_T)'S(λ)(β, B_T)| = |β'S(λ)β| |B_T'{S(λ) − S(λ)β[β'S(λ)β]^{-1}β'S(λ)}B_T|,

and let T → ∞ and λ → 0 such that ρ = Tλ is fixed. From Lemma A.3 it follows that

where G_1(t) = γ'C(W(t) − ∫W(u) du) and G_2(t) = t − 1/2, and G' = (G_1', G_2).
In order to simplify this expression introduce the (p − r)-dimensional Brownian motion U = (α_⊥'Λα_⊥)^{-1/2}α_⊥'W, which has variance matrix I, and the (p − r + 1)-dimensional process F(t) = (U(t)' − ∫U'(u) du, (t − 1/2))'. We can then write (B.4) as
with

L = [ L_11   0 ]
    [ 0      1 ],

and L_11 = γ'β_⊥(α_⊥'Ψβ_⊥)^{-1}(α_⊥'Λα_⊥)^{1/2}, applying the representation (4.5) for the matrix C. The process F enters into the integrals with the factor L_11, which gives p − r − 1 linearly independent combinations of the components of U − ∫U(u) du. By multiplying by (L_11L_11')^{-1/2} we can turn these into orthonormal components, and by supplementing these vectors with an extra orthonormal vector, which is proportional to (α_⊥'Λα_⊥)^{-1/2}α_⊥'μ, we can transform the process U by an orthonormal matrix O to the process B = OU. Then the equation can be written as
where F is given by (2.16) and (2.17). This equation has p − r roots. Thus we have seen that the p − r smallest roots of (B.2) decrease to zero at the rate T^{-1}, and that Tλ̂ converge to the roots of (B.6). From the expression for the likelihood ratio test statistics we find that

−2 ln(Q; H2|H1) = T Σ_{i=r+1}^p λ̂_i + o_p(1) ⇒ tr{∫(dB)F'[∫FF' du]^{-1}∫F(dB)'}.
Note that if τ = 0, i.e. the linear trend is missing, then again applying Lemma A.3 we can choose γ = β_⊥, and the results have to be modified by leaving out the terms containing τ. The matrix L_11 is (p − r) × (p − r) and cancels in (B.5), so that the test of H2 in H1 is distributed as

(B.7) T_2 = tr{∫(dU)F'[∫FF' du]^{-1}∫F(dU)'},

with F(t) = U(t) − ∫U(u) du. This completes the proof of Theorem 2.1.
Proof of Theorem 2.2
The estimation under H2* involved the solution of the equation

(B.8) |λS*_kk − S*_k0S_00^{-1}S*_0k| = 0;

see (2.11) with the S_ij replaced by S*_ij. Let A*_T = (β*, T^{-1/2}B*_T) and multiply the matrix in (B.8) by A*_T and its transpose (see Lemma A.4), and let T → ∞. The roots of (B.8) converge to the roots of the equation

| λβ'Σ_kkβ − β'Σ_k0Σ_00^{-1}Σ_0kβ    0            |
| 0                                   λ∫G*G*' du   | = 0.
This shows that the r largest solutions of (B.8) converge to the roots of the same limiting equation as before (see (B.3)). Now multiply instead by B*_T and its transpose and let ρ = Tλ and λ → 0; then we obtain, by an argument similar to that given in the proof of Theorem 2.1, that in the limit the p − r + 1 smallest roots normalized by T will converge in distribution to the roots of the equation

|ρ∫G*G*' du − ∫G*(dW)'α_⊥(α_⊥'Λα_⊥)^{-1}α_⊥'∫(dW)G*'| = 0.

Again we can introduce the (p − r)-dimensional process U = (α_⊥'Λα_⊥)^{-1/2}α_⊥'W and cancel the matrix (α_⊥'Λα_⊥)^{1/2} to see that the test statistic has a limit distribution which is given by

(B.9) T*_2 = tr{∫(dU)F'[∫FF' du]^{-1}∫F(dU)'},

with F = (U', 1)'.
The result for the maximal eigenvalue follows similarly. This completes the proof of Theorem 2.2.
Proof of Theorem 2.3

From the relation, where Ū = ∫U(u) du, it follows that T*_2 = U(1)'U(1) + T_2 (see (B.7) and (B.9)). The likelihood ratio test statistic of H2* in H2 is the difference of the two test statistics considered in Theorem 2.1 and Theorem 2.2. Furthermore the test statistics have the same variables entering the asymptotic expansions, and hence the distribution can be found by subtracting the above random variables T_2 and T*_2, but U(1)'U(1) is χ²(p − r).
Proof of Theorem 3.1

We multiply (2.3) by A' and B' respectively and insert α = Aψ and β = Hφ and obtain

(B.10) A'Z_0t = A'ΓZ_1t + A'Aψφ'H'Z_kt + A'ε_t,

(B.11) B'Z_0t = B'ΓZ_1t + B'ε_t,

since B'Aψ = 0. These equations are analyzed by considering the contribution to the likelihood function from (B.11) and then the contribution from (B.10) given (B.11).
The contribution from (B.11), after having maximized with respect to the parameters (Γ_1, ..., Γ_{k−1}, Φ, μ), is

L_max^{-2/T} = |S_bb|/|B'B|.
The contribution from (B.10) given (B.11) is, after the initial maximization,

L_max^{-2/T}(ψ, φ, Λ_aa.b) = |Λ_aa.b| exp{T^{-1} Σ_{t=1}^T R_t'Λ_aa.b^{-1}R_t}/|A'A|,

where

R_t = R_at − Λ_abΛ_bb^{-1}R_bt − A'Aψφ'R_ht,

and R_at = A'R_0t and R_ht = H'R_kt. Minimizing with respect to the parameter Λ_abΛ_bb^{-1} gives rise to yet another regression, of R_at and R_ht on R_bt, and the estimate
For ψ̃ = A'Aψ, the likelihood function is reduced to the form (2.7) in terms of R_a.bt, R_h.bt, (ψ̃, φ), and Λ_aa.b. Hence the solution can be found by solving (3.2) and using α̂ = A(A'A)^{-1}ψ̃̂, together with the relations (2.8), (2.9), (2.12), and (2.6), which completes the proof of Theorem 3.1.
Proof of Theorem 3.2

The limit result follows from Theorem C.1, proved in Appendix C. What remains is to calculate the degrees of freedom. The matrix Π = αβ' = Aψφ'H' is identified, as is the matrix ψφ'. Now normalize φ to be of the form φ' = (I, φ_0') with φ_0 of dimension r × (s − r). Then there are rm + r(s − r) free parameters under the assumptions of H3. For m = s = p we get the result for H2, and the difference is the degrees of freedom for the test.
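The parameter count in this proof can be turned into a small helper; the function name is ours, and it simply reproduces the count rm + r(s − r) and takes the difference against the unrestricted case m = s = p:

```python
def df_H3_vs_H2(p, r, s, m):
    """Degrees of freedom for the LR test of H3: beta = H*phi (H is p x s)
    and alpha = A*psi (A is p x m), against H2, following the proof of
    Theorem 3.2: H3 has r*m + r*(s - r) free parameters, and setting
    m = s = p gives the count for H2."""
    free = lambda m_, s_: r * m_ + r * (s_ - r)
    return free(p, p) - free(m, s)
```

Note that the result simplifies to r(p − m) + r(p − s), and is zero when no restriction is imposed.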
APPENDIX C. ASYMPTOTIC INFERENCE
Proof of Theorem C.1
There is a qualitative difference between inference for β and that for the other parameters. It was proved by Stock (1987) that the regression estimate for β was superconsistent. This has consequences for the usual proof of asymptotic normality, as later exploited by Phillips (1990).
The idea can be illustrated as follows: Let θ = (θ_1, θ_2) denote all the parameters, and θ_2 the parameters in β. Let q denote the log likelihood function normalized by T, and q_1, q_12, etc. denote the derivatives with respect to θ_1, θ_1 and θ_2, etc. The usual asymptotic representation for the maximum likelihood estimator is

[−q_11]T^{1/2}(θ̂_1 − θ_1) = T^{1/2}q_1 + o_p(1)

and

[−T^{−v}q_22]T^{(v+1)/2}(θ̂_2 − θ_2) = T^{(1−v)/2}q_2 + o_p(1).

These equations are the ones we would get when conducting inference about θ_1 for fixed θ_2 and θ_2 for fixed θ_1. The same expansion will show that the likelihood ratio test statistic for a simple hypothesis about θ is

T(θ̂_1 − θ_1)'[−q_11](θ̂_1 − θ_1) + T^{(v+1)/2}(θ̂_2 − θ_2)'[−T^{−v}q_22]T^{(v+1)/2}(θ̂_2 − θ_2) + o_p(1).

This shows that the test statistic decomposes into a test for θ_1 and an independent test for θ_2.
The above argument indicates that inference about θ_2 can be conducted as if θ_1 were known, and vice versa.
We can prove the above property about the second derivatives of the likelihood function concentrated with respect to Λ, and we therefore deduce that inference about (Γ_1, ..., Γ_{k−1}, α, Φ, μ) can be conducted for fixed β; hence one can apply the well known results for asymptotic inference for the stationary processes ΔX_t and β'X_t. See Dunsmuir and Hannan (1976) for a general treatment of smooth hypotheses for stationary processes. The asymptotic distribution of μ̂ is somewhat more complicated, and will not be treated in detail here.
The asymptotic properties of estimators and test statistics are discussed here for a general smooth hypothesis on the cointegrating relations: β = β(θ), θ ∈ Θ ⊂ R^k, leaving the remaining parameters unrestricted. Let Dβ(u) denote the derivative of β(θ) with respect to θ in the direction u ∈ R^k, i.e. the p × r matrix with elements Σ_{s=1}^k u_s ∂β_ij(θ)/∂θ_s. We assume throughout that Dβ(u) has full rank for u ≠ 0, and that β'Dβ(u) = 0 for all u, where β is the value of the parameter for which the results are derived. This last condition can always be achieved by normalizing β(θ) by β, i.e. by considering β(θ)(β̄'β(θ))^{-1}. We also let Dβ denote the pr × k matrix with element ((i, j), s) equal to

∂β_ij(θ)/∂θ_s (i = 1, ..., p; j = 1, ..., r; s = 1, ..., k),

so that (Dβ)u = vec(Dβ(u)). The natural coordinate system in R^p is given by (β, γ, τ), and the
The asymptotic distribution of θ̂ is given by

(C.6) T(θ̂ − θ) ⇒ {Dβ'(∫GG' du ⊗ α'Λ^{-1}α)Dβ}^{-1}Dβ' vec(∫G(dV)'),

with

(C.7) {Dβ'(∫GG' du ⊗ α'Λ^{-1}α)Dβ}^{-1},

which we call the asymptotic conditional variance. Here V = (α'Λ^{-1}α)^{-1}α'Λ^{-1}W. It follows that the limit distribution of (Tγ, T^{3/2}τ)'(β(θ̂) − β) is given by
If k_1 = k, that is, if τ'Dβ(u) = 0 for all u, then the results should be modified by replacing G by G_1; see Theorem 5.1.
The likelihood ratio test statistic of a smooth hypothesis θ = θ(ξ), ξ ∈ R^s, s < k, is asymptotically distributed as χ² with k − s degrees of freedom. A similar result holds if α_⊥'μ = 0; only G should be replaced by G* (see Theorem 5.2).
for

ε_t(θ) = ΔX_t − ΔX̄ − Σ_{i=1}^{k−1} Γ_i(ΔX_{t−i} − ΔX̄_{−i}) − αβ(θ)'(X_{t−k} − X̄_k) − Φ(D_t − D̄).

Here the bar denotes average. The derivatives are most easily found by a Taylor expansion. Thus if q_θ(u) denotes the derivative of q with respect to θ in the direction u, we can find the derivative from the expansion q(θ + u, α, Λ) = q(θ, α, Λ) + q_θ(u) + O(|u|²). We then get derivatives

q_θ(u) = tr{Λ^{-1}M_0kDβ(u)α'},
and second derivatives, of which the leading term is

q_θθ(u, u) = −tr{Λ^{-1}αDβ(u)'M_kkDβ(u)α'} + ...,

together with similar expressions for mixed second derivatives. It follows from Lemma A.3 that all terms in the above derivatives are O_p(1), except the expression −tr{Λ^{-1}αDβ(u)'M_kkDβ(u)α'} in q_θθ, and that this tends to infinity, since the columns of Dβ(u) are orthogonal to β. Thus we apply the above general argument, even though we have two different normalizations, and have shown that inference for β can be conducted for fixed value of the other parameters, and vice versa.
It is not difficult to see by the central limit theorem for stationary ergodic processes (see White (1984)) that the derivatives T^{1/2}(q_Γ_1, ..., q_Γ_{k−1}, q_α, q_Φ, q_μ) are asymptotically Gaussian with mean zero and variance matrix which is also the limit of the matrix of the second derivatives with opposite sign with respect to these parameters, such that the first conclusion of the Theorem holds and the variance is given by (C.7); see also Lutkepohl and Reimers (1989).
To find the asymptotic distribution of the estimate of θ we expand the likelihood function around the point θ, such that the other parameters are kept fixed. The relation (C.4) now takes the form

The limiting behavior of the various matrices is given by Lemma A.3: From the identity B_T(γ, T^{1/2}τ)' = γγ' + ττ' = I − P_β, we get, since β'Dβ(u) = 0, that for u, v ∈ R^k,

say. The matrix M_kk should be normalized by T^{-1} and by B_T to get convergence (see Lemma A.3), but of course the factor T^{-1/2}τ has to be taken care of. Now introduce the coordinate system w_i = T^{-1/2}u_i, i = 1, ..., k_1, and w_i = T^{-1}u_i, i = k_1 + 1, ..., k. Then M(w_i, w_j) is weakly convergent towards

T^{1/2}Dβ(u)'M_k0Λ^{-1}α = T^{1/2}Dβ(u)'(γ, T^{-1/2}τ)B_T'M_k0Λ^{-1}α ⇒ u'Dβ' vec(∫G(dV)').
With this notation we can replace (C.8) by the result (C.6). By a Taylor expansion we find (C.7). By expanding the likelihood function around a fixed value of θ one finds that the test statistic for a simple hypothesis for θ is

{vec(∫G(dV)')}'Dβ{Dβ'(∫GG' du ⊗ α'Λ^{-1}α)Dβ}^{-1}Dβ'{vec(∫G(dV)')},

which for given G is χ² distributed with k degrees of freedom. Now if θ = θ(ξ) one finds the same result except that Dβ is replaced by (Dβ)(Dθ), i.e. the pr × k matrix Dβ is multiplied by the k × s matrix of derivatives Dθ.
The likelihood ratio test statistic for the hypothesis θ = θ(ξ) is then asymptotically distributed as
shows that T^{1/2}μ̂ decomposes into two terms. The first term converges towards a Gaussian distribution with mean zero, and the second, −α̂T^{1/2}(β̂ − β)'X̄_k, is normalized just right, so that the distribution can be found from the second part of Theorem C.1. Note that the asymptotic distribution of α_⊥'μ̂ is Gaussian, but the component which goes into the cointegrating relation has a more complicated limit distribution.
Proof of Theorem 5.1

The proof consists of simplifying the expressions given in Theorem C.1. A point in sp(H) can be represented as β + (γ_H, τ_H)θ, where (β, γ_H, τ_H) are orthogonal and span sp(H), and where θ = {θ_ij, i = 1, ..., s − r, j = 1, ..., r}. It follows that Dβ(u) = (γ_H, τ_H)u and that the equation

τ'Dβ(u) = τ'(γ_H, τ_H)u = (0, τ'τ_H)u = 0

(see (C.2) and (C.3)) can be solved by choosing u_ij = o_i e_j', i = 1, ..., s − r − 1, j = 1, ..., r, where o_i are unit vectors in R^{s−r} and e_j unit vectors in R^r. Thus in this case k = (s − r)r and k_1 = (s − r − 1)r. The (i, j)th column of Dβ (see (C.4) and (C.5)) is given by
where P is a notation for

P = [ γ_H'γ ∫G_1G_1' du γ'γ_H    0                         ]
    [ 0                           τ_H'τ ∫G_2G_2 du τ'τ_H  ],

so that

(Tγ_H, T^{3/2}τ_H)'(β̂_c − β) ⇒ (γ_H, τ_H)P^{-1}(γ_H, τ_H)'(γ, τ)∫G(dV)'(α'Λ^{-1}α)^{-1},

which shows that

(C.9) T(β̂_c − β) ⇒ (I − β_c c')(γ_H, 0)P^{-1}(γ_H, τ_H)'(γ, τ)∫G(dV)'(α'Λ^{-1}α)^{-1}(c'β)^{-1}.
The consistent estimator (5.6) for the asymptotic conditional variance is found from Lemma C.2 below, for the choice K = I − β̂_c c', which satisfies Kβ̂_c = 0.
Finally we note that the normalization V̂'S_hh.bV̂ = I (see (3.2)) implies that HS_hh.b^{-1}H' = HV̂V̂'H' = β̂β̂' + Hûû'H'. Since (I − β̂_cc')β̂_c = 0, (I − β̂_cc')β̂ is O_p(T^{-1}) and (I − β̂_cc')γ_H is O_p(T^{-1/2}), such that

Hence one can apply either of these in the consistent estimation of the asymptotic conditional variance. The relation (A.6) from Lemma A.1 gives the identity (α̂'Λ̂^{-1}α̂)^{-1} = diag(λ̂_1^{-1} − 1, ..., λ̂_r^{-1} − 1). This completes the proof of Theorem 5.1.
Consistent Estimates of the Asymptotic Conditional Variance

The next results are needed for the consistent estimation of the limiting conditional variance in the limiting distribution of β̂_c. We take the coordinates (β, γ_H, τ_H), where γ_H (p × (s − r − 1)) is chosen such that (β, γ_H, τ_H) span sp(H). We let û = (v̂_{r+1}, ..., v̂_s) (see (3.2) with the normalization û'S_hh.bû = I).
If τ_H = 0 then this result holds with γ_H (p × (s − r)) chosen such that (β, γ_H) span sp(H) and G_{1.2} is replaced by G_1.
it follows that V̂, and hence also the coordinates (e, g, f), are O_p(T^{-1/2}). From the normalization û'S_hh.bû = I, it even follows that f is O_p(T^{-1}) and that, since e →_p 0, we have

= K'B_T(T^{-1}B_T'S_kkB_T)^{-1}B_T'K,

for B_T = (γ_H, T^{-1/2}τ_H). The terms involving T^{-1/2}τ_H are of smaller order of magnitude than the
If τ_H = 0 we can drop the terms involving τ_H and choose γ_H orthogonal to β such that they span sp(H), and apply Lemma A.3 again.
REFERENCES
AHN, S. K., AND G. C. REINSEL (1990): "Estimation for Partially Non-stationary Multivariate
Autoregressive Models," Journal of the American Statistical Association, 85, 813-823.
ANDERSON, T. W. (1951): "Estimating Linear Restrictions on Regression Coefficients for Multivari-
ate Normal Distributions," Annals of Mathematical Statistics, 22, 327-351.
(1971): The Statistical Analysis of Time Series. New York: Wiley.
BOSSAERTS, P. (1988): "Common Nonstationary Components of Asset Prices," Journal of Economic Dynamics and Control, 12, 347-364.
Box, G. E. P., AND G. C. TIAO (1981): "A Canonical Analysis of Multiple Time Series with
Applications," Biometrika, 64, 355-365.
DUNSMUIR, W., AND E. J. HANNAN (1976): "Vector Linear Time Series Models," Advances in
Applied Probability, 8, 339-364.
ENGLE, R. F., AND C. W. J. GRANGER (1987): "Co-integration and Error Correction: Representa-
tion, Estimation, and Testing," Econometrica, 55, 251-276.
ENGLE, R. F., AND B. S. Yoo (1989): "Cointegrated Economic Time Series: A Survey with New
Results," Discussion Paper 89-38, University of California, San Diego.
FOUNTIS, N. G., AND D. A. DICKEY (1989): "Testing for Unit Root Nonstationarity in Multivariate Autoregressive Time Series," Annals of Statistics, 17, 419-428.
GONZALO, J. (1989): "Comparison of Five Alternative Methods of Estimating Long-Run Equilib-
rium Relationships," Discussion Paper 89-55, University of California, San Diego.
GRANGER, C. W. J. (1983): "Cointegrated Variables and Error Correction Models," Discussion
Paper, 83-13a, University of California, San Diego.
(1981): "Some Properties of Time Series Data and their Use in Econometric Model
Specification," Journal of Econometrics, 16, 121-130.
GRANGER, C. W. J., AND A. A. WEISS (1983): "Time Series Analysis of Error Correcting Models," in Studies in Econometrics, Time Series and Multivariate Statistics, ed. by S. Karlin, T. Amemiya, and L. A. Goodman. New York: Academic Press, 255-278.
HOFFMAN, D., AND R. H. RASCHE (1989): "Long-run Income and Interest Elasticities of Money
Demand in the United States," National Bureau of Economic Research Discussion Paper No.
2949.
JEGANATHAN, P. (1988): "Some Aspects of Asymptotic Theory with Applications to Time Series
Models," The University of Michigan.
JOHANSEN, S. (1988a): "The Mathematical Structure of Error Correction Models," Contemporary Mathematics, 80, 259-286.
(1988b): "Statistical Analysis of Cointegration Vectors," Journal of Economic Dynamics and
Control, 12, 231-254.
(1990): "A Representation of Vector Autoregressive Processes Integrated of Order 2," to
appear in Econometric Theory.
(1991): "The Statistical Analysis of I(2) Variables," University of Copenhagen.
JOHANSEN, S., AND K. JUSELIUS (1990): "Maximum Likelihood Estimation and Inference on
Cointegration-with Applications to the Demand for Money," Oxford Bulletin of Economics and
Statistics, 52, 169-210.
(1991): "Some Structural Hypotheses in a Multivariate Cointegration Analysis of the
Purchasing Power Parity and the Uncovered Interest Parity for UK," to appear in Journal of
Econometrics.
KUNST, R., AND K. NEUSSER (1990): "Cointegration in a Macro-economic System," Journal of Applied Econometrics, 5, 351-365.
LUTKEPOHL, H., AND H.-E. REIMERS (1989): "Impulse Response Analysis of Cointegrated Systems with an Investigation of German Money Demand," Christian-Albrechts Universitat Kiel.
MOSCONI, R., AND C. GIANNINI (1992): "Non-Causality in Cointegrated Systems: Representation, Estimation and Testing," to appear in Oxford Bulletin of Economics and Statistics.
OSTERWALD-LENUM, M. (1992): "A Note with Fractiles of the Asymptotic Distribution of the Likelihood Cointegration Rank Test Statistics: Four Cases," to appear in Oxford Bulletin of Economics and Statistics.
PARK, J. Y. (1988): "Canonical Cointegrating Regressions," Cornell University.
PARK, J. Y., AND P. C. B. PHILLIPS (1988): "Statistical Inference in Regressions with Integrated Processes: Part 1," Econometric Theory, 4, 468-497.
(1989): "Statistical Inference in Regressions with Integrated Processes: Part 2," Econometric Theory, 5, 95-131.
PEÑA, D., AND G. E. P. BOX (1987): "Identifying a Simplifying Structure in Time Series," Journal of the American Statistical Association, 82, 836-843.
PHILLIPS, P. C. B. (1988): "Spectral Regression for Cointegrated Time Series," Cowles Foundation Discussion Paper No. 872.
(1990): "Optimal Inference in Cointegrated Systems," Econometrica, 59, 283-306.
PHILLIPS, P. C. B., AND S. N. DURLAUF (1986): "Multiple Time Series Regression with Integrated Processes," Review of Economic Studies, 53, 473-495.
PHILLIPS, P. C. B., AND S. OULIARIS (1988): "Testing for Cointegration using Principal Components Methods," Journal of Economic Dynamics and Control, 12, 1-26.
PHILLIPS, P. C. B., AND J. Y. PARK (1988): "Asymptotic Equivalence of OLS and GLS in Regression with Integrated Regressors," Journal of the American Statistical Association, 83, 111-115.
PHILLIPS, P. C. B., AND B. E. HANSEN (1990): "Statistical Inference with I(1) Processes," Review of Economic Studies, 57, 99-124.
REINSEL, G. C., AND S. K. AHN (1990): "Vector AR Models with Unit Roots and Reduced Rank Structure: Estimation, Likelihood Ratio Test, and Forecasting," University of Wisconsin.
SIMS, C. A., J. H. STOCK, AND M. W. WATSON (1990): "Inference in Linear Time Series Models with some Unit Roots," Econometrica, 58, 113-144.
STOCK, J. H. (1987): "Asymptotic Properties of Least Squares Estimates of Cointegration Vectors," Econometrica, 55, 1035-1056.
STOCK, J. H., AND M. W. WATSON (1988): "Testing for Common Trends,"Journal of the American
Statistical Association, 83, 1097-1107.
Tso, M.K.-S. (1981): "Reduced-Rank Regression and Canonical Analysis," Journal of the Royal
Statistical Society, Series B, 43, 183-189.
VELU, R. P., G. C. REINSEL, AND D. W. WICHERN (1986): "Reduced Rank Models for Multiple
Time Series,"Biometrika,73, 105-118.
VELU,R. P., D. W. WICHERN, AND G. C. REINSEL (1987): "A Note on Non-stationary and Canonical
Analysis of Multiple Time Series Models," Journal of Time Series Analysis, 8, 479-487.
WHITE, H. (1984): Asymptotic Theoryfor Econometricians. New York: Academic Press.