Unit Root Testing in AR(1) Processes
BACHELOR OF SCIENCE
in
TECHNISCHE WISKUNDE (Applied Mathematics)
by
Delft, The Netherlands
January 2015
Supervisor
Other committee members
The purpose of this study is to investigate the asymptotics of a first order auto regressive unit root process, AR(1). The goal is to determine which tests can be used to test for the presence of a unit root in a first order auto regressive process. A unit root is present when the root of the characteristic equation of this process equals unity. In order to test for the presence of a unit root, we first develop an understanding of the characteristics of the AR(1) process, so that the difference between a trend stationary process and a unit root process becomes clear.
The first test that will be examined is the Dickey-Fuller test. The estimator of this test is based on Ordinary Least Squares Regression and a t-test statistic, which is why we have computed an ordinary least squares estimator and the corresponding test statistic to test for the presence of a unit root in the first order auto regressive process. Furthermore, we examined the consistency of this estimator and its asymptotic properties. The limiting distribution of the test statistic is known as the Dickey-Fuller distribution. With a Monte Carlo approach, we implemented the Dickey-Fuller test statistic in Matlab and computed the (asymptotic) power of this test. Under the assumption of Gaussian innovations (or shocks) the limiting distribution of the unit root process is the same as without the normality assumption. When there is reason to assume Gaussianity of the innovations, the Likelihood Ratio test can be used to test for a unit root.
The asymptotic power envelope is obtained with the help of the Likelihood Ratio test, since the Neyman-Pearson lemma states that the Likelihood Ratio test is the point optimal test for simple hypotheses. By calculating the likelihood functions the test statistic was obtained, such that an explicit formula for the power envelope was found. Since each fixed alternative results in a different critical value, and thus in a different unit root test, there is no uniformly most powerful test available. Instead we are interested in asymptotically point optimal tests, and we will analyze which of these point optimal tests is the overall best performing test. By comparing the asymptotic power curve to the asymptotic power envelope for each fixed alternative, we could draw a conclusion on which fixed alternative results in the overall best performing test.
On the basis of the results of this research, it can be concluded that there does not exist a uniformly most powerful test; nonetheless, we can define an overall best performing test.
Contents
List of Tables
List of Figures
1 Introduction
6.4 The Asymptotic Power Envelope
6.5 Analytic Solution to the Ornstein-Uhlenbeck Process
6.6 Asymptotically Point Optimal Unit Root Test
7 Summary of Results
8 Discussion
Bibliography
Appendices
A Auxiliary results
A.1 Central Limit Theorem
A.2 Functional Central Limit Theorem
A.3 Continuous Mapping Theorem
A.4 Neyman-Pearson Lemma
List of Tables
4.1 Critical values k_T^α of t_T for several significance levels α and sample sizes T.
5.1 The power of the Dickey-Fuller test for finite sample sizes T at significance level α = 0.05.
5.2 The power of the Dickey-Fuller test for N(0, 1) innovations at nominal significance level α = 0.05.
5.3 The power of the Dickey-Fuller test for N(0, σ² = 2²) innovations at nominal significance level α = 0.05.
5.4 The power of the Dickey-Fuller test for several innovations and large sample size T = 500 with nominal significance level α = 0.05.
List of Figures
2.1 A stationary AR(1) process (φ = 0.5) and a unit root process (φ = 1).
2.2 Trend stationary process compared to a unit root process.
5.1 The power of the Dickey-Fuller test at significance level α = 0.05 for T = 25, 50, 100, 250.
5.2 Power of the Dickey-Fuller test for local alternatives φ_1 = 1 + c/T close to unity.
6.1 The power of the test close to unity corresponds to the asymptotic size α = 0.05.
6.2 The asymptotic power envelope.
Chapter 1
Introduction
The statistical analysis of a stationary time series is more straightforward and better studied than the analysis of a non-stationary time series. That is why, in order to perform regression analysis on the data, the raw data is often transformed into a stationary process. Roughly speaking, a stationary process is a process whose statistical properties do not change over time; that is, the mean and the variance are constant. For many real applications the assumption of stationarity is not valid, and therefore stationarity of the data needs to be tested. One possible reason for a time series to be non-stationary is the presence of a unit root. The following example illustrates the importance of testing whether a unit root is present.
Figure 1.1: The unemployment rate of the Netherlands from 2006 to 2014.
Two competing economic theories describe the behavior of the unemployment rate: the Natural Rate Hypothesis (NRH) and the Hysteresis Hypothesis (HH). The NRH states that the unemployment rate fluctuates
around a certain rate, the natural rate. The unemployment rate can temporarily deviate
from the natural rate due to for example an exogenous shock, but in the long run it will
revert to its natural rate. The NRH theory is therefore consistent with the absence of
a unit root in the unemployment rate time series. Opposed to the NRH, the Hysteresis
Hypothesis states that there does not exist an equilibrium level of the unemployment rate,
with the consequence that a shock has a permanent effect on the unemployment rate time
series. From a statistical point of view this implies that if the HH theory holds the time
series contains a unit root.
Just as in the rest of the European Union, the recession starting in 2008 had a huge negative effect on the Dutch economy. One way to mitigate an economic recession in a country is to keep the inflation rate low. The positive effect of a low inflation rate is the opportunity for the labor market to adjust quickly to negative changes. Monetary authorities such as De Nederlandsche Bank (the Dutch central bank) have the power to keep the inflation rate stable and low. In particular, a contractionary monetary policy aims to reduce the growth of the money supply such that the rate of inflation will stop growing or will even shrink. However, a contractionary monetary policy does not only have this positive effect on the inflation rate. The undesirable effect of this policy is that it has the tendency to increase the unemployment rate of a country. Because of these policy implications, the decision whether or not the unemployment rate time series has a unit root is very important for a monetary authority. If the NRH theory holds, the rise in the unemployment rate due to a contractionary monetary policy does not have a permanent effect, and the time series will eventually revert to its natural rate. However, if the HH theory is correct, the adverse effect of the monetary policy is permanent, which will lead to a permanently higher unemployment rate in the Netherlands. Therefore we can conclude that whether or not the unemployment rate has a unit root is a key element in designing an optimal policy by De Nederlandsche Bank.
The unemployment rate is not the only time series that could be non-stationary; in fact, many time series such as exchange rates, inflation rates and real output should be tested for stationarity. The nature of the non-stationarity can differ: seasonality of the data, the presence of a deterministic trend, the presence of a stochastic trend and structural breaks are all examples of non-stationarity. Seasonality, structural breaks and the presence of a deterministic trend in time series are very interesting, yet complicated topics in their own right, and the present thesis will not discuss them, since it focuses on the unit root problem. Since testing for a unit root is important, this thesis will present two possible unit root tests: the Dickey-Fuller test and the Likelihood Ratio test.
If a time series is non-stationary, the standard statistical analysis is not valid. The most common methods of statistical analysis rely on the Law of Large Numbers and the Central Limit Theorem, and both theorems require the assumption that the time series is stationary; applying the standard methods of statistical analysis to a non-stationary time series will therefore give incorrect results. This thesis is a study of the asymptotic properties of inference procedures designed for non-stationary time series. In large sample theory (asymptotic theory) the properties of an estimator and test statistic are examined as the sample size becomes indefinitely large. The idea is that the properties of an estimator when the sample size becomes arbitrarily large are comparable to the properties of the estimator when the sample is finite.
The unit root problem is a well-studied topic in econometrics. For example, Elliott, Stock and Rothenberg (1992) [4] studied the efficiency of tests for auto regressive unit root processes, while earlier Dickey and Fuller (1979) [1] studied the distribution of the estimators for auto regressive time series with a unit root. Of course, there are results for more complex time series than the first order auto regressive process, but they are beyond the scope of this thesis. With this thesis, we aim to examine a non-standard problem in a simple case and show how non-stationarity of a time series can influence statistical analysis.
First we will give a brief introduction to the first order auto regressive process and the unit root process, and explain the difference between a trend stationary process and a unit root process. The basics of unit root testing, such as the hypotheses of interest, ordinary least squares regression and the asymptotic properties of the unit root process, are dealt with in Chapter 3. The Dickey-Fuller test is examined in Chapters 4 and 5. The Likelihood Ratio test is dealt with in Chapter 6, together with the computation of the asymptotic power envelope. The last chapter summarizes the results on the asymptotic properties of the Dickey-Fuller test and the Likelihood Ratio test.
Chapter 2
Substituting for Y_{t-2} and iterating backwards yields
$$Y_t = \phi^t Y_0 + \sum_{i=0}^{t-1}\phi^i \epsilon_{t-i}. \qquad (2.1.4)$$
The factor φ has a strong effect on the behavior of the AR(1) process. We distinguish
three cases:
• |φ| < 1;
• |φ| > 1;
• φ = 1.
Let us consider the first case, −1 < φ < 1. The weights on past shocks in (2.1.4) decay geometrically, so shocks that occurred far in the past have no significant influence on the behavior of the time series: the weight given to a shock which occurred far in the past is extremely small. Therefore the time series has a long term mean and is stationary. On the other hand, if |φ| > 1 the observation at time t, Y_t, will be large, and the weight given to a shock a long time ago will be greater than the weight given to recent shocks. In the long run this process will explode; clearly this process is non-stationary. Finally, let us consider the case in which φ = 1. In this case the process is non-stationary and behaves as a random walk process. We will discuss this process in detail later on. Mathematically we can show (weak) stationarity of the process in the case |φ| < 1 by computing its mean and variance. If |φ| < 1 we can rewrite the model in the following way:
$$Y_t = \phi Y_{t-1} + \epsilon_t = \epsilon_t + \phi\epsilon_{t-1} + \phi^2\epsilon_{t-2} + \cdots. \qquad (2.1.5)$$
By taking expectations we obtain E[Y_t] = E[ε_t] + φE[ε_{t-1}] + ··· = 0, so the mean is constant. For the variance,
$$\mathrm{Var}(Y_t) = \mathrm{Var}(\phi Y_{t-1} + \epsilon_t) = \mathrm{Var}(\phi Y_{t-1}) + \mathrm{Var}(\epsilon_t) = \phi^2\,\mathrm{Var}(Y_{t-1}) + \sigma^2. \qquad (2.1.7)$$
Under the stationarity assumption Var(Y_t) = Var(Y_{t-1}). Substituting Var(Y_{t-1}) by Var(Y_t) in (2.1.7) and solving, we obtain
$$\mathrm{Var}(Y_t) = \frac{\sigma^2}{1 - \phi^2}.$$
Since Var(Y_t) > 0, it follows that 1 − φ² > 0, and we see that the stationarity assumption is satisfied for |φ| < 1. Therefore the process is stationary if |φ| < 1. We conclude that the mean and the variance of an AR(1) process with |φ| < 1 are constant and thus {Y_t} is stationary (see Definition 2.1.2).
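This variance formula is easy to verify numerically. The thesis's simulations were done in Matlab; the following Python sketch (not part of the thesis, with illustrative parameter values) simulates a long stationary AR(1) path and compares the sample variance to σ²/(1 − φ²):

```python
import numpy as np

# Sketch: simulate a stationary AR(1) process with |phi| < 1 and check that
# the sample variance is close to sigma^2 / (1 - phi^2).
rng = np.random.default_rng(0)
phi, sigma, T = 0.5, 1.0, 200_000

eps = rng.normal(0.0, sigma, T)
Y = np.empty(T)
Y[0] = eps[0]
for t in range(1, T):
    Y[t] = phi * Y[t - 1] + eps[t]

burn = 1_000                       # discard start-up effects
sample_var = Y[burn:].var()
theory_var = sigma**2 / (1 - phi**2)
print(sample_var, theory_var)      # both close to 4/3
```

For φ = 0.5 and σ = 1 the theoretical variance is 1/(1 − 0.25) ≈ 1.33, and the sample variance of a long simulated path lands very close to it.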
Figure 2.1: A stationary AR(1) process (φ = 0.5) and a unit root process (φ = 1). [Time series plot; Time 0–100 on the horizontal axis, Observation on the vertical axis.]
Figure 2.1 indicates the difference between a stationary AR(1) process and a unit root process, since these time series are simulated with the same innovations {ε_t}. In the stationary case, the shocks do not have a permanent effect, while in the unit root process the shocks do have a permanent effect on the behavior of the process. A process can be stationary in two senses: weakly stationary or strictly (strongly) stationary, according to the following definitions:

Definition 2.1.2. A stochastic process {Y_t} is weakly stationary when it meets the following properties:
• E[Y_t] = μ, ∀t ∈ N;
• Var(Y_t) = σ_Y² < ∞, ∀t ∈ N;
• Cov(Y_t, Y_{t+h}) depends only on the lag h.

Definition 2.1.3. A stochastic process {Y_t} is strictly (or strongly) stationary when the joint distribution of (Y_{t1}, Y_{t2}, ..., Y_{tk}) equals the joint distribution of (Y_{t1+h}, Y_{t2+h}, ..., Y_{tk+h}) for every shift h and every finite collection of indices t1, ..., tk.

By Definition 2.1.2 of weak stationarity we conclude that the AR(1) process with |φ| < 1 is a weakly stationary process.
Define the lag operator L by
$$LY_t \equiv Y_{t-1}.$$
In the case of the AR(1) process, the lag polynomial notation gives
$$(1 - \phi L)Y_t = \epsilon_t,$$
with characteristic equation
$$1 - \phi z = 0. \qquad (2.1.10)$$
Thus if the AR(1) process has a unit root, i.e. z = 1 is a root of the characteristic equation (2.1.10), φ must equal 1:
$$1 - \phi z = 0 \;\Leftrightarrow\; z = \frac{1}{\phi}, \quad \text{so } z = 1 \Leftrightarrow \phi = 1.$$
If φ = 1 the process is non-stationary. This is easy to verify by computing the variance, since in the case of φ = 1,
$$Y_t = \sum_{i=0}^{t} \epsilon_i. \qquad (2.1.11)$$
2.2 Trend Stationary Process vs. Unit Root Process
In economics, many time series are not stationary. In general we distinguish between two cases, the Trend Stationary Process and the Unit Root Process:
• Trend Stationary Process: a non-stationary process that is stationary around a deterministic trend;
• Unit Root Process: a non-stationary process with a stochastic trend or a unit root.
Figure 2.2: Trend stationary process compared to a unit root process. [Time series plot; Time 0–100 on the horizontal axis, Observation on the vertical axis.]
In a time series a unit root and a deterministic trend could both be present; in that case the process satisfies equation (2.1.1) with φ = 1 and γ ≠ 0. In this thesis we will not analyze this special case, but in Chapter 8 we have added explanatory notes on this process. From Figure 2.2 we conclude that in the trend stationary process a positive trend is present, while in the unit root process there seems to be no deterministic trend. The trend stationary process is given by
$$Y_t = m + \gamma t + \phi Y_{t-1} + \epsilon_t. \qquad (2.2.1)$$
In this equation the additional term γt represents the deterministic linear trend, which is independent of the stochastic term Y_t, and m represents the intercept. Computing the mean of this trend stationary process shows that the mean contains a linear trend dependent on γ, so the mean is not constant. We conclude that the trend stationary process is not stationary.
A unit root process can be made stationary by differencing. For example, consider the first order auto regressive process AR(1) with φ = 1, for which
$$\Delta Y_t = Y_t - Y_{t-1} = \epsilon_t.$$
Since ε_t has constant variance and zero mean, the differenced process is stationary, so we write Y_t ∼ I(1): the process is integrated of order one and a unit root is present. In conclusion, we have seen that a unit root process is a non-stationary process, since the variance of this process (as obtained in equation (2.1.12)) equals Var(Y_t) = tσ² and increases as t becomes larger.
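The contrast between the growing variance of the random walk and the constant variance of its first difference can be illustrated with a short simulation. A Python sketch (not from the thesis; parameter values are illustrative):

```python
import numpy as np

# Sketch: differencing a unit root process (phi = 1) yields a stationary
# series. The variance of the random walk grows like t * sigma^2, while the
# variance of the differenced series is constant (sigma^2).
rng = np.random.default_rng(1)
sigma, T, reps = 1.0, 400, 5_000

eps = rng.normal(0.0, sigma, (reps, T))
Y = eps.cumsum(axis=1)             # random walk: Y_t = eps_1 + ... + eps_t
dY = np.diff(Y, axis=1)            # Delta Y_t = eps_t

var_at_t = Y.var(axis=0)           # grows roughly linearly in t
var_diff = dY.var()                # roughly sigma^2, constant
print(var_at_t[99], var_at_t[399], var_diff)
```

Across the 5,000 replications the variance at t = 100 is near 100σ² and at t = 400 near 400σ², while the variance of the differenced series stays near σ².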
Chapter 3
The previous chapter discussed two types of non-stationary processes, a trend stationary
process and a unit root process. In order to determine whether a time series contains a
stochastic trend (a unit root) we perform a unit root test. The unit root test tests the null hypothesis of the presence of a unit root in the AR(1) process against the alternative hypothesis that the process has no unit root and as a result is stationary. A unit root test is used to test the following hypotheses:
$$H_0: \phi = 1 \quad \text{versus} \quad H_1: \phi < 1.$$
The Dickey-Fuller test statistic is based on the Ordinary Least Squares estimator. In this chapter we will introduce the Ordinary Least Squares (OLS) estimator for φ and we will specify the distribution of this OLS estimator. Furthermore we will show that this estimator is consistent. Consider the AR(1) process
$$Y_t = \phi Y_{t-1} + \epsilon_t.$$
With the method of Ordinary Least Squares Regression we can compute an estimator φ̂_T for the parameter of interest. The idea is to minimize the sum of the squared residuals (SSR) with respect to φ̂_T,
$$\mathrm{SSR} = \sum_{t=1}^{T} \hat\epsilon_t^2 = \sum_{t=1}^{T} (Y_t - \hat\phi_T Y_{t-1})^2, \qquad (3.1.1)$$
where T represents the sample size. The goal is to find the value φ̂_T that minimizes the SSR. Differentiating,
$$\frac{\partial(\mathrm{SSR})}{\partial \hat\phi_T} = \frac{\partial}{\partial \hat\phi_T}\left[\sum_{t=1}^{T}(Y_t - \hat\phi_T Y_{t-1})^2\right] \qquad (3.1.2)$$
$$= \sum_{t=1}^{T}\frac{\partial}{\partial \hat\phi_T}(Y_t - \hat\phi_T Y_{t-1})^2 = \sum_{t=1}^{T} -2\, Y_{t-1}(Y_t - \hat\phi_T Y_{t-1}). \qquad (3.1.3)$$
In order to minimize the SSR we set the partial derivative (3.1.3) equal to zero and solve for φ̂_T:
$$\frac{\partial(\mathrm{SSR})}{\partial \hat\phi_T} = 0 \;\Leftrightarrow\; \sum_{t=1}^{T} -2\, Y_{t-1}(Y_t - \hat\phi_T Y_{t-1}) = 0 \qquad (3.1.4)$$
$$\Leftrightarrow\; \sum_{t=1}^{T} Y_{t-1} Y_t = \hat\phi_T \sum_{t=1}^{T} Y_{t-1}^2. \qquad (3.1.5)$$
The Ordinary Least Squares estimator φ̂_T which minimizes the sum of the squared residuals is therefore
$$\hat\phi_T = \frac{\sum_{t=1}^{T} Y_{t-1} Y_t}{\sum_{t=1}^{T} Y_{t-1}^2}. \qquad (3.1.9)$$
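As a sanity check (not part of the thesis, which used Matlab), the estimator (3.1.9) can be computed directly from a simulated AR(1) sample; for a stationary φ the estimate should be close to the true value. A Python sketch, assuming Y_0 = 0:

```python
import numpy as np

# Sketch: the OLS estimator (3.1.9) for an AR(1) sample, with Y_0 = 0.
def ols_phi(Y):
    """phi_hat = sum(Y_{t-1} Y_t) / sum(Y_{t-1}^2), prepending Y_0 = 0."""
    Ylag = np.concatenate(([0.0], Y[:-1]))
    return (Ylag * Y).sum() / (Ylag**2).sum()

rng = np.random.default_rng(2)
phi, T = 0.8, 10_000
eps = rng.normal(size=T)
Y = np.empty(T)
Y[0] = eps[0]
for t in range(1, T):
    Y[t] = phi * Y[t - 1] + eps[t]

print(ols_phi(Y))   # close to the true phi = 0.8
```

With T = 10,000 observations the standard error implied by (3.1.10) is roughly √((1 − φ²)/T) ≈ 0.006, so the estimate lands very close to 0.8.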
Having calculated an estimator for φ by the method of Ordinary Least Squares Regression, it is time to focus on the asymptotic properties of this estimator. By the Central Limit Theorem (A.1) we know that if the process is stationary (|φ| < 1), then for T → ∞,
$$\sqrt{T}(\hat\phi_T - \phi) \xrightarrow{d} N(0,\, 1 - \phi^2). \qquad (3.1.10)$$
Under the null hypothesis Y_t is a non-stationary process, so we are interested in the asymptotic distribution of φ̂_T when φ = 1. Under the null hypothesis we simply cannot use the Central Limit Theorem in the way we used it earlier. Note that for φ = 1, (3.1.10) would imply
$$\sqrt{T}(\hat\phi_T - \phi) = \sqrt{T}(\hat\phi_T - 1) \xrightarrow{d} N(0, 0) = 0, \qquad (3.1.11)$$
and we obtain a degenerate limiting distribution. A degenerate distribution is a probability distribution concentrated on a single value; in our case it is centered at zero with zero variance. Our aim is to find a non-degenerate asymptotic distribution for φ̂_T under the null hypothesis, and therefore we have to rescale the OLS estimator (3.1.9). It turns out that we need to multiply (φ̂_T − φ) by T rather than by √T. To show why scaling with T is needed under H_0 (i.e. φ = 1), note that under the null hypothesis the difference between φ̂_T and φ equals
$$\hat\phi_T - \phi = \hat\phi_T - 1 = \frac{\sum_{t=1}^{T} Y_{t-1}\epsilon_t}{\sum_{t=1}^{T} Y_{t-1}^2}. \qquad (3.1.12)$$
Multiplying (3.1.12) by T and rewriting with the normalizations 1/T and 1/T² gives
$$T(\hat\phi_T - 1) = T\left(\frac{\sum_{t=1}^{T} Y_{t-1}Y_t}{\sum_{t=1}^{T} Y_{t-1}^2} - 1\right) = \frac{\frac{1}{T}\sum_{t=1}^{T} Y_{t-1}Y_t}{\frac{1}{T^2}\sum_{t=1}^{T} Y_{t-1}^2} - T. \qquad (3.1.13)$$
If we replace 1 with
$$1 = \frac{\sum_{t=1}^{T} Y_{t-1}^2}{\sum_{t=1}^{T} Y_{t-1}^2} \qquad (3.1.14)$$
and substitute in (3.1.13), we obtain
$$T(\hat\phi_T - 1) = \frac{\frac{1}{T}\left(\sum_{t=1}^{T} Y_{t-1}Y_t - \sum_{t=1}^{T} Y_{t-1}^2\right)}{\frac{1}{T^2}\sum_{t=1}^{T} Y_{t-1}^2} = \frac{\frac{1}{T}\sum_{t=1}^{T} Y_{t-1}\left[Y_t - Y_{t-1}\right]}{\frac{1}{T^2}\sum_{t=1}^{T} Y_{t-1}^2} = \frac{\frac{1}{T}\sum_{t=1}^{T} \Delta Y_t\, Y_{t-1}}{\frac{1}{T^2}\sum_{t=1}^{T} Y_{t-1}^2}. \qquad (3.1.15)$$
Recall that we work under the null hypothesis, so that we can substitute
$$\Delta Y_t = Y_t - Y_{t-1} = \epsilon_t,$$
and therefore (3.1.15) yields
$$T(\hat\phi_T - 1) = \frac{\frac{1}{T}\sum_{t=1}^{T} \epsilon_t Y_{t-1}}{\frac{1}{T^2}\sum_{t=1}^{T} Y_{t-1}^2}. \qquad (3.1.16)$$
With this result we can now examine if the limiting distribution of this estimator is non-
degenerate. In the next section we will examine the asymptotic properties of the OLS
estimator and we will show that by scaling (φ̂T − 1) with T we obtain a non-degenerate
distribution of the OLS estimator.
$$Y_t = Y_{t-1} + \epsilon_t = Y_0 + \epsilon_1 + \epsilon_2 + \cdots + \epsilon_t. \qquad (3.2.1)$$
Since we assume Y0 = 0 the process is the sum of random IID innovations. The aim
is to find the asymptotic properties of the AR(1) process with φ = 1 by means of the
asymptotic properties of a random walk. First we will introduce the Wiener Process. A
Wiener Process is a continuous time stochastic process satisfying the following definition:
Definition 3.2.1 (Standard Brownian Motion (Wiener Process)). ¹
A continuous-time stochastic process {W(t)}_{t≥0} is called a Standard Brownian Motion (Wiener Process) when it meets the following properties:
• W(0) = 0;
• for any collection of dates 0 ≤ t_1 < t_2 < ··· < t_k, the increments [W(t_2) − W(t_1)], [W(t_3) − W(t_2)], ..., [W(t_k) − W(t_{k−1})] are independent;
• W(t + s) − W(t) ∼ N(0, s) for s > 0.
The Wiener Process is closely related to a random walk. According to Donsker's Theorem, or the Functional Central Limit Theorem (A.2), a discrete random walk approaches a Standard Brownian Motion as the number of steps increases (t → ∞) and the step size becomes smaller. As a result, the Wiener Process is the scaling limit of a random walk. The following proposition holds for a random walk:
Proposition 3.2.1 (Convergence of a random walk). Suppose ψ_t is a random walk,
$$\psi_t = \psi_{t-1} + u_t,$$
where u_t is IID with zero mean and constant variance σ². Then the following properties hold:
$$1.\quad \frac{1}{T}\sum_{t=1}^{T} u_t \psi_{t-1} \xrightarrow{d} \sigma^2 \int_0^1 W(t)\,dW(t) = \tfrac{1}{2}\sigma^2\left(W(1)^2 - 1\right);$$
$$2.\quad \frac{1}{T^2}\sum_{t=1}^{T} \psi_{t-1}^2 \xrightarrow{d} \sigma^2 \int_0^1 W(t)^2\,dt;$$
where {W(t)} denotes a Wiener Process and →d denotes convergence in distribution.

¹ Time Series Analysis, James D. Hamilton [8].
As we have shown in (3.1.16), the deviation of the OLS estimator from the actual value φ satisfies
$$T(\hat\phi_T - 1) = \frac{\frac{1}{T}\sum_{t=1}^{T} \epsilon_t Y_{t-1}}{\frac{1}{T^2}\sum_{t=1}^{T} Y_{t-1}^2}. \qquad (3.2.2)$$
Under the null hypothesis Y_t describes a random walk with IID innovations ε_t with zero mean and constant variance σ². By Proposition 3.2.1 and the Continuous Mapping Theorem (A.3) we may conclude that the asymptotic distribution of (3.2.2) is given by
$$T(\hat\phi_T - 1) \xrightarrow{d} \frac{\frac{1}{2}\left(W(1)^2 - 1\right)}{\int_0^1 W(t)^2\,dt}. \qquad (3.2.3)$$
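The limiting functional in (3.2.3) can be approximated by discretizing the Wiener process, which is how its quantiles are tabulated in practice. A Python sketch (not from the thesis; n and the number of replications are illustrative):

```python
import numpy as np

# Sketch: approximate the limit in (3.2.3) by discretizing a Wiener process
# on [0, 1] with n steps. W(1)^2 is the squared endpoint, and the integral
# of W(t)^2 dt is approximated by a Riemann sum.
rng = np.random.default_rng(4)
n, reps = 2_000, 5_000

draws = np.empty(reps)
for r in range(reps):
    dW = rng.normal(0.0, np.sqrt(1.0 / n), n)
    W = dW.cumsum()                    # W at grid points 1/n, ..., 1
    num = 0.5 * (W[-1]**2 - 1.0)
    den = (W**2).mean()                # Riemann sum of int_0^1 W(t)^2 dt
    draws[r] = num / den

# Quantiles of `draws` approximate the distribution of T(phi_hat - 1);
# the 5% quantile gives the coefficient-statistic critical value.
print(np.quantile(draws, 0.05), np.median(draws))
```

The resulting distribution is markedly asymmetric, with a long left tail and a negative median, which is exactly why standard normal critical values cannot be used for unit root inference.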
Consider the AR(1) process (2.1.1) under the null hypothesis satisfying Assumption 1. Then
$$Y_t = \sum_{s=1}^{t} \phi^{t-s}\epsilon_s = \sum_{s=1}^{t} \epsilon_s = \epsilon_t + \epsilon_{t-1} + \cdots + \epsilon_1. \qquad (3.3.1)$$
When we make the extra assumption ε_t ∼ N(0, σ²), the process {Y_t}_{t≥0} is a sum of Gaussian random variables. Therefore (3.3.1) implies that Y_t is Gaussian with zero mean and variance tσ²:
$$Y_t \sim N(0, t\sigma^2). \qquad (3.3.2)$$
Since Y_t represents a random walk, we can rewrite the squared process Y_t² in the following way:
$$Y_t^2 = (Y_{t-1} + \epsilon_t)^2 = Y_{t-1}^2 + 2Y_{t-1}\epsilon_t + \epsilon_t^2 \;\Leftrightarrow\; Y_{t-1}\epsilon_t = \frac{Y_t^2 - Y_{t-1}^2 - \epsilon_t^2}{2}. \qquad (3.3.3)$$
We are interested in the sum of all the squared observations of the process {Y_t}_{t≥0}, so if we sum (3.3.3) from 1 to T, the sum telescopes:
$$\sum_{t=1}^{T} Y_{t-1}\epsilon_t = \sum_{t=1}^{T} \frac{Y_t^2 - Y_{t-1}^2 - \epsilon_t^2}{2} = \frac{Y_T^2 - Y_0^2}{2} - \frac{\sum_{t=1}^{T}\epsilon_t^2}{2}. \qquad (3.3.4)$$
Using Y_0 = 0 and dividing by σ²T, we obtain
$$\frac{1}{\sigma^2 T}\sum_{t=1}^{T} Y_{t-1}\epsilon_t = \frac{Y_T^2}{2\sigma^2 T} - \frac{1}{2\sigma^2 T}\sum_{t=1}^{T}\epsilon_t^2 = \frac{1}{2}\left(\frac{Y_T}{\sigma\sqrt{T}}\right)^2 - \frac{1}{2\sigma^2 T}\sum_{t=1}^{T}\epsilon_t^2. \qquad (3.3.6)$$
Since Y_T ∼ N(0, σ²T), it follows that Y_T/(σ√T) ∼ N(0, 1). Then, by definition, its square follows a Chi-Squared distribution with one degree of freedom:
$$\left(\frac{Y_T}{\sigma\sqrt{T}}\right)^2 \sim \chi^2(1).$$
Now consider the term Σ_{t=1}^T ε_t². This is a sum of squared IID normal random variables with zero mean and constant variance σ², such that by the Law of Large Numbers²:
$$\frac{1}{T}\sum_{t=1}^{T}\epsilon_t^2 \xrightarrow{p} \sigma^2. \qquad (3.3.7)$$
Combining the previous results, we have shown that
$$\frac{1}{\sigma^2 T}\sum_{t=1}^{T} Y_{t-1}\epsilon_t \xrightarrow{d} \frac{1}{2}\left(Y - 1\right), \qquad (3.3.8)$$
where Y ∼ χ²(1).
As a result we have found the limiting distribution of the numerator of equation (3.2.3) for Gaussian innovations. By the definition of a Wiener Process it follows that W(1)² ∼ χ²(1), which implies that
$$\frac{1}{2}\left(W(1)^2 - 1\right) \overset{d}{=} \frac{1}{2}\left(\chi^2(1) - 1\right), \qquad (3.3.9)$$
such that the limiting distributions of the numerator of equation (3.2.3) for Gaussian and non-Gaussian innovations are indeed the same. By a similar argument as in Section 3.2 we can conclude from the Continuous Mapping Theorem (A.3) and Proposition 3.2.1 that the limiting distribution of T(φ̂_T − 1) satisfies
$$T(\hat\phi_T - 1) = \frac{\frac{1}{T}\sum_{t=1}^{T}\epsilon_t Y_{t-1}}{\frac{1}{T^2}\sum_{t=1}^{T} Y_{t-1}^2} \xrightarrow{d} \frac{\frac{1}{2}\left(\chi^2(1) - 1\right)}{\int_0^1 W(t)^2\,dt}. \qquad (3.3.10)$$

² J. Doob, Stochastic Processes, John Wiley & Sons, 1953 [3].
Hence we can conclude that the OLS estimator φ̂_T is a super-consistent estimator of the true value φ: it converges at rate T rather than the usual √T. With the asymptotic properties of this estimator we are able to examine the asymptotic distribution of a test statistic for the presence of a unit root. Such a test was developed by David Dickey and Wayne Fuller in 1979. In the next chapter we will investigate the Dickey-Fuller test and explain how it is used to test for a unit root. For the special case in which the innovations are Gaussian, ε_t ∼ N(0, σ²), we will construct the Likelihood Ratio test and approximate the asymptotic power of this test.
Chapter 4
In 1979 Wayne Fuller and David Dickey developed a test to examine whether there is
a unit root present in a first order auto regressive process {Yt }t≥0 . This test is named
after the two statisticians and is known as the Dickey-Fuller test. The Dickey-Fuller test
studies the presence of a unit root in the first order auto regressive process (2.1.1), even
if Assumption 1 is not valid. The consideration to include intercept and/or trend results
in three possible auto regressive processes:
• Testing for a unit root:
$$Y_t = \phi Y_{t-1} + \epsilon_t$$
• Testing for a unit root with drift:
$$Y_t = m + \phi Y_{t-1} + \epsilon_t$$
• Testing for a unit root with drift and deterministic time trend:
$$Y_t = m + \phi Y_{t-1} + \gamma t + \epsilon_t$$
Each model results in a different test statistic and different critical values for the
Dickey-Fuller test. That is why it is important for practical reasons to select the correct
underlying auto regressive process before performing a unit root test. Clearly, the first
two processes are simplifications of the third more general auto regressive process. The
AR(1) process under Assumption 1 corresponds to the first model and we will perform
a Monte Carlo simulation to obtain the (asymptotic) critical values for the Dickey-Fuller
test statistic introduced in the next section.
Consider the AR(1) process
$$Y_t = \phi Y_{t-1} + \epsilon_t, \qquad (4.1.1)$$
where ε_t ∼ WN(0, σ²). In this section we will examine the OLS estimator and we will
introduce the test statistic that we use for testing whether there is a unit root present.
In Chapter 3 we found the OLS estimator φ̂_T to be
$$\hat\phi_T = \frac{\sum_{t=1}^{T} Y_{t-1}Y_t}{\sum_{t=1}^{T} Y_{t-1}^2}.$$
Let us define the standard t-statistic t_T,
$$t_T = \frac{\hat\phi_T - 1}{\hat\sigma_{\hat\phi_T}}, \qquad (4.1.2)$$
where σ̂_{φ̂_T} is the usual OLS standard error of the estimator,
$$\hat\sigma_{\hat\phi_T} = \sqrt{\frac{s_T^2}{\sum_{t=1}^{T} Y_{t-1}^2}},$$
with s_T² the OLS estimator of the residual variance.
The distributions (3.2.3) and (4.1.4) are known as Dickey-Fuller distributions, since David
Dickey and Wayne Fuller developed the asymptotics of this unit root test.
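The statistic (4.1.2) is straightforward to compute from a sample. A Python sketch (not from the thesis; here s_T² is taken as the sum of squared OLS residuals divided by T − 1, an assumption since the exact definition of s_T² does not survive in this excerpt):

```python
import numpy as np

# Sketch of the Dickey-Fuller t-statistic (4.1.2) for the no-drift model,
# assuming Y_0 = 0 and s_T^2 = sum(residual^2) / (T - 1).
def df_tstat(Y):
    Ylag = np.concatenate(([0.0], Y[:-1]))
    phi_hat = (Ylag * Y).sum() / (Ylag**2).sum()
    resid = Y - phi_hat * Ylag
    s2 = (resid**2).sum() / (len(Y) - 1)
    se = np.sqrt(s2 / (Ylag**2).sum())
    return (phi_hat - 1.0) / se

rng = np.random.default_rng(5)
Y = rng.normal(size=500).cumsum()   # a unit root sample (H0 true)
print(df_tstat(Y))
```

Under H0 the statistic follows the Dickey-Fuller distribution rather than a t-distribution, which is why its critical values must be obtained by simulation, as in the next section.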
4.2 Critical Values of the Dickey-Fuller Distribution
With the Monte Carlo approach we simulated a distribution for the t-statistic (4.1.2) and are able to calculate the critical values for the test statistic t_T. Monte Carlo simulation is
based on the idea of repeated random sampling in order to approximate the underlying
distribution. Under the null hypothesis of a unit root, we repeatedly (N = 50, 000 times)
sampled a first order auto regressive process of length T to approximate the distribution
of tT . Figure 4.1 and Figure 4.2 show the outline of the distribution of the t-statistic for
the sample sizes T = 25, 50, 100, 250.
Figure 4.1: The approximation of the distributions of t_T for sample sizes T = 25 (a) and T = 50 (b). [Histograms of simulated t_T values.]
Figure 4.2: The approximation of the distributions of t_T for sample sizes T = 100 and T = 250. [Histograms of simulated t_T values.]
As we can see in both Figure 4.1 and Figure 4.2 the distribution of the t-statistic tT is
positively skewed. The finite sample critical values are obtained by sorting the results of
the finite sample Monte Carlo simulation and determining the critical value at significance
level α. Table 4.1 shows the critical values kTα of the test statistic tT for several different
sample sizes and significance levels α. The asymptotic distribution of the test statistic tT
is known as the Dickey-Fuller distribution (4.1.4). Instead of simulating data from the AR(1) process, we obtain the asymptotic critical values of t_∞ by sampling a Wiener Process and approximating the asymptotic distribution of the test statistic t_∞. Figure 4.3 shows an approximation of the distribution of t_∞, which has been used to approximate the asymptotic critical values of the Dickey-Fuller distribution listed in Table 4.1.
[Figure 4.3: Histogram approximating the distribution of t_∞.]
Table 4.1: Critical values kTα of tT for several significance levels α and sample sizes T .
Having obtained the critical values of the Dickey-Fuller test, we are able to examine the power of the Dickey-Fuller test at a certain significance level α. The next chapter examines the power of the Dickey-Fuller test for several values of φ and different sample sizes T of the AR(1) process (2.1.1) under Assumption 1. These powers will be evaluated at significance level α = 0.05, such that we will only consider the critical values listed in the last column of Table 4.1. We will not only look into the power of the finite sample process, but also examine the asymptotic power when the parameter φ is close to unity. If we examine the finite sample power we will use the finite sample critical values, and if we examine the asymptotic power we will use the asymptotic critical value k_∞^α = −1.9312.
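The Monte Carlo procedure described above can be sketched compactly. The thesis used Matlab with N = 50,000 replications; the following Python sketch uses a smaller N for speed, with the same structure (simulate under H0, compute t_T, take the α-quantile):

```python
import numpy as np

# Sketch of the Monte Carlo critical-value computation: simulate the AR(1)
# process under H0 (phi = 1) N times, compute the t-statistic for each
# replication, and take the alpha-quantile of the simulated statistics.
rng = np.random.default_rng(6)
T, N, alpha = 100, 5_000, 0.05

tstats = np.empty(N)
for r in range(N):
    Y = rng.normal(size=T).cumsum()          # random walk under H0
    Ylag = np.concatenate(([0.0], Y[:-1]))
    phi_hat = (Ylag * Y).sum() / (Ylag**2).sum()
    resid = Y - phi_hat * Ylag
    se = np.sqrt((resid**2).sum() / (T - 1) / (Ylag**2).sum())
    tstats[r] = (phi_hat - 1.0) / se

k_alpha = np.quantile(tstats, alpha)
print(k_alpha)   # in the neighborhood of the tabulated value -1.93
```

Sorting the simulated statistics and reading off the α-quantile is exactly the procedure used for Table 4.1; the residual-variance convention here is an assumption, so small numerical differences from the tabulated values are expected.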
Chapter 5
Power analysis provides information on how well a test detects an effect. An effect is the difference between the value of the parameter under the null hypothesis (φ_0 = 1) and the actual true value (φ_1). Since we will discuss the performance of the Dickey-Fuller test, it is useful to check the power of the test under different circumstances. We will analyze the power of the test for several sample sizes and effect sizes. The power of a statistical test is the probability that the null hypothesis is rejected at a fixed significance level α when in fact the null hypothesis is false, which is equivalent to correctly accepting the alternative hypothesis. Thus
$$\mathrm{Power} = \Pr(H_0 \text{ is rejected} \mid H_0 \text{ is false}) = \Pr(H_0 \text{ is rejected} \mid \phi = \phi_1 < 1). \qquad (5.0.1)$$
The statistical power depends on the sample size; power calculations are often used to determine the sample size needed to detect an effect of a given size at significance level α. To enlarge the power of a
sample critical values listed in Table 4.1 we can obtain the power of the Dickey-Fuller
test. With Monte Carlo simulation we approximated the power of the Dickey-Fuller test
at significance level α = 0.05. We calculated the power of the test for AR(1) processes
under Assumption 1, for different alternative hypotheses H1 : φ = φ1 and several sample
sizes T = 25, T = 50, T = 100 and T = 250. For the sake of clarity we define φ1 as
the parameter of the AR(1) process we simulated from and φ = 1 as the value of the
parameter under the null hypothesis.
Since the power of the test depends on the difference between the true parameter φ_1 and φ = 1, we compute the power of the Dickey-Fuller test for a sequence of alternatives φ_1. In Table 5.1 we see that for φ_1 = 0.5 the power of the test is very close to 1, which is very high. But if we consider φ_1 = 0.9 we conclude that the power of the test is low: the probability of a type II error β, satisfying Power = 1 − β, is very high; in other words, the probability of failing to reject the false null hypothesis is large.
φ1     T = 25   T = 50   T = 100   T = 250
0.5    0.907    1.000    1.000     1.000
0.6    0.774    0.996    1.000     1.000
0.7    0.566    0.977    1.000     1.000
0.8    0.335    0.761    0.999     1.000
0.9    0.153    0.322    0.752     1.000
1      0.050    0.057    0.045     0.051
Table 5.1: The power of the Dickey-Fuller test for finite sample sizes T at significance
level α = 0.05.
As shown in Table 5.1, the power decreases as the distance |φ1 − φ| gets smaller. To
illustrate this decreasing power, Figure 5.1 shows the power of the Dickey-Fuller test at
significance level α = 0.05 for several sample sizes T.
Figure 5.1: The power of the Dickey-Fuller test at significance level α = 0.05 for T =
25, 50, 100, 250.
Figure 5.1 illustrates the importance of unit root testing. When the value φ1 approaches
unity, the Dickey-Fuller test does not perform the way we would like it to:
the power of the test for values of φ1 close to unity is low. However, if the distance
|φ1 − 1| is large, there is no need to test for non-stationarity since the time series will show
stationary properties. Therefore the area of interest is the region of φ1 close to unity, and
we will examine the properties of the Dickey-Fuller test for a sequence of φ1's close to 1.
Since we have examined the asymptotic properties of the Dickey-Fuller test, we assume
that we can add data such that the sample size grows indefinitely. To shrink the area
of interest to a small neighborhood of unity, we introduce the local alternative framework,
which shrinks the neighborhood of unity as the sample size T grows indefinitely.
Figure 5.2: Power of the Dickey-Fuller test for local alternatives φ1 = 1 + c1/T close to unity.
5.2.1 εt ∼ N(0, 1)
In this section, we will examine the power of the Dickey-Fuller test when the innovations
are standard normally distributed, εt ∼ N(0, 1). Monte Carlo simulation is used to perform
the Dickey-Fuller test. To be able to compare the power of the Dickey-Fuller test
simulated with Gaussian innovations with the power simulated with White Noise innovations,
we will sample an AR(1) process for several local alternatives φ1 = 1 + c1/T close to unity
at nominal significance level α = 0.05 for different sample sizes T.
Table 5.2: The power of the Dickey-Fuller test for N (0, 1) innovations at nominal signifi-
cance level α = 0.05.
5.2.2 εt ∼ N(0, σ²)
Since the innovations could also be normally distributed with mean zero and a known
constant variance σ² ≠ 1, it is interesting to examine the power of the Dickey-Fuller test
in this special case. By Monte Carlo simulation we have obtained Table 5.3, which contains
the power of the Dickey-Fuller test for several sample sizes T and local alternatives
c1 ∈ Z, c1 < 0, such that φ1 = 1 + c1/T. For simulation purposes, we performed Monte Carlo
simulation with known variance σ² = 4.
Table 5.3: The power of the Dickey-Fuller test for N(0, σ² = 2²) innovations at nominal
significance level α = 0.05.
By Monte Carlo simulation we have obtained the power for different types of innovations,
and we can compare these values in order to see which type of innovations produces
the highest power of the Dickey-Fuller test. Table 5.4 illustrates the difference between
the power for the three types of innovations for several fixed local alternatives c1 ∈ Z, c1 < 0.
From Table 5.4 we can conclude that the time series simulated with standard normal
innovations leads to the least biased power estimation. We have fixed the significance
level at α = 0.05, which means that we allow the probability of falsely rejecting the null
hypothesis (the probability of a type I error) to be 5%. As we can see in the first line of
Table 5.4, the power of the Dickey-Fuller test for the three types of innovations does not
correspond to the significance level α = 0.05; therefore we can conclude that the results
are biased. Since the probability of a type I error for the AR(1) process simulated with
standard normal innovations is the closest to 0.05, we conclude that the power of the
Dickey-Fuller test for this type of innovations is the least biased.
If the data gives reason to assume the innovations to be Gaussian, a different unit root
test becomes available. If indeed the innovations are IID and Gaussian, we are able to
implement the Likelihood Ratio test for unit roots. In the next chapter we will look into
the Likelihood Ratio test and examine the power of this unit root test. Furthermore,
with the Likelihood Ratio test we can compute the asymptotic power envelope for unit
root tests.
−c1    WN(0, σ²)   N(0, 1)   N(0, σ²)
0      0.042       0.054     0.063
2      0.106       0.136     0.128
4      0.230       0.274     0.244
6      0.399       0.415     0.419
8      0.579       0.625     0.580
10     0.761       0.761     0.750
Table 5.4: The power of the Dickey-Fuller test for several innovations and large sample
size T = 500 with nominal significance level α = 0.05.
Chapter 6
Section 5.2 approximates the asymptotic power of the Dickey-Fuller test under the as-
sumption that the innovations are Gaussian. The Monte Carlo simulation of the AR(1)
process is done with Gaussian innovations rather than White Noise innovations. In order
to perform further power analysis on the AR(1) process with Gaussian innovations, we can
implement the Likelihood Ratio test. This test can also be implemented for other types
of innovations, but in that case the computation of the Likelihood Ratio would become
rather difficult. Instead we stick to Gaussian innovations which are rather common in ap-
plications. By means of the Likelihood Ratio test, we can compute the asymptotic power
envelope for unit root tests. The asymptotic power envelope is a great tool to compare
unit root tests to the maximum asymptotically attainable power. Therefore, within this
chapter the Likelihood Ratio test will be defined and it will be explained how to compute
the power envelope as well as the asymptotic power envelope for unit root tests.
The Likelihood Ratio test is based on the likelihoods of two models: the first is the model
under the null hypothesis and the second is the model under the alternative hypothesis.
Both models will be fitted to the time series data and the likelihood functions will be
calculated in order to determine which model is more likely to be true (hence the name
Likelihood Ratio test). If the model under the null hypothesis is more likely to be true,
this results in a large test statistic, denoted ΛT(·) (dependent on the sample size T);
the null hypothesis is therefore rejected for small values of the test statistic ΛT(·). In the
unit root problem the null hypothesis and the simple alternative hypothesis are the following:

\[
H_0: \phi = 1 \qquad \text{versus} \qquad H_1: \phi = \phi_1 < 1.
\]
6.1 Computing the Likelihood Functions
First we calculate the likelihood functions L(φ = φ0 | Y1, . . . , YT) and L(φ = φ1 | Y1, . . . , YT).
The trick is to write the process {Yt}t≥0 in terms of the innovations εt = Yt − φYt−1. Since
we assumed the innovations to be independent and identically distributed Gaussian with zero
mean and known constant variance σ², we can easily compute the likelihood functions.
Let us compute the likelihood functions of the hypotheses of interest:
\[
\begin{aligned}
f(\varepsilon_1, \ldots, \varepsilon_T \mid \phi = 1) &= \prod_{t=1}^{T} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{\varepsilon_t^2}{2\sigma^2}\right)\\
&= (2\pi\sigma^2)^{-T/2} \exp\left(-\frac{\sum_{t=1}^{T} \varepsilon_t^2}{2\sigma^2}\right)\\
&= (2\pi\sigma^2)^{-T/2} \exp\left(-\frac{\sum_{t=1}^{T} (Y_t - Y_{t-1})^2}{2\sigma^2}\right),
\end{aligned}
\tag{6.1.1}
\]

\[
\begin{aligned}
f(\varepsilon_1, \ldots, \varepsilon_T \mid \phi = \phi_1) &= \prod_{t=1}^{T} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{\varepsilon_t^2}{2\sigma^2}\right)\\
&= (2\pi\sigma^2)^{-T/2} \exp\left(-\frac{\sum_{t=1}^{T} \varepsilon_t^2}{2\sigma^2}\right)\\
&= (2\pi\sigma^2)^{-T/2} \exp\left(-\frac{\sum_{t=1}^{T} (Y_t - \phi_1 Y_{t-1})^2}{2\sigma^2}\right).
\end{aligned}
\tag{6.1.2}
\]
With these likelihood functions we will obtain the Likelihood Ratio test statistic ΛT (Yt ) (6.0.1):
By substituting φ1 = 1 + c1/T in (6.1.3) we obtain

\[
\begin{aligned}
\Lambda_T(\varepsilon_t) &= \exp\left(-\frac{1}{2\sigma^2}\sum_{t=1}^{T}\left[\left(2\left(1+\frac{c_1}{T}\right)-2\right)Y_t Y_{t-1} + \left(1-\left(1+\frac{c_1}{T}\right)^2\right)Y_{t-1}^2\right]\right)\\
&= \exp\left(-\frac{1}{2\sigma^2}\sum_{t=1}^{T}\left[\frac{2c_1}{T}Y_t Y_{t-1} - \frac{2c_1}{T}Y_{t-1}^2 - \frac{c_1^2}{T^2}Y_{t-1}^2\right]\right)\\
&= \exp\left(-\frac{1}{2\sigma^2}\sum_{t=1}^{T}\left[\frac{2c_1}{T}Y_{t-1}(Y_t - Y_{t-1}) - \frac{c_1^2}{T^2}Y_{t-1}^2\right]\right)\\
&= \exp\left(-\frac{1}{2\sigma^2}\sum_{t=1}^{T}\left[\frac{2c_1}{T}Y_{t-1}\Delta Y_t - \frac{c_1^2}{T^2}Y_{t-1}^2\right]\right).
\end{aligned}
\tag{6.1.4}
\]

The resulting Likelihood Ratio test statistic is thus

\[
\Lambda_T(\varepsilon_t) = \exp\left(-\frac{1}{\sigma^2}\sum_{t=1}^{T}\left[\frac{c_1}{T}Y_{t-1}\Delta Y_t - \frac{1}{2}\left(\frac{c_1}{T}\right)^2 Y_{t-1}^2\right]\right). \tag{6.1.5}
\]

The log Likelihood Ratio is defined as the natural logarithm of the Likelihood Ratio:

\[
\log\left[\Lambda_T(\varepsilon_t)\right] = -\frac{1}{\sigma^2}\sum_{t=1}^{T}\left[\frac{c_1}{T}Y_{t-1}\Delta Y_t - \frac{1}{2}\left(\frac{c_1}{T}\right)^2 Y_{t-1}^2\right]
= -\frac{c_1}{\sigma^2 T}\sum_{t=1}^{T}Y_{t-1}\Delta Y_t + \frac{c_1^2}{2T^2\sigma^2}\sum_{t=1}^{T}Y_{t-1}^2. \tag{6.1.6}
\]

We can simplify (6.1.6) by substituting

\[
A_T = \frac{1}{T\sigma^2}\sum_{t=2}^{T}Y_{t-1}\Delta Y_t, \qquad B_T = \frac{1}{T^2\sigma^2}\sum_{t=2}^{T}Y_{t-1}^2,
\]

and we obtain the test statistic of the Likelihood Ratio test:

\[
\log\left[\Lambda_T(\varepsilon_t)\right] = -c_1 A_T + \frac{1}{2}c_1^2 B_T. \tag{6.1.7}
\]
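As a sanity check on (6.1.7), the statistic can be computed directly and compared with the difference of the two Gaussian log likelihoods; a Python sketch follows (σ² = 1; the sample size and the value of c1 are arbitrary illustrative choices, and the thesis code itself is Matlab):

```python
import numpy as np

def log_lr(y, c1, sigma2=1.0):
    """log Lambda_T = -c1*A_T + 0.5*c1**2*B_T, cf. (6.1.7)."""
    T = len(y)
    dy = np.diff(y)       # Delta Y_t = Y_t - Y_{t-1}
    y_lag = y[:-1]        # Y_{t-1}
    A_T = np.dot(y_lag, dy) / (T * sigma2)
    B_T = np.dot(y_lag, y_lag) / (T**2 * sigma2)
    return -c1 * A_T + 0.5 * c1**2 * B_T

rng = np.random.default_rng(0)
T, c1 = 500, -5.0
phi1 = 1 + c1 / T         # local alternative
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi1 * y[t - 1] + rng.standard_normal()
stat = log_lr(y, c1)
```

Up to the Gaussian normalising constants (which cancel in the ratio), the statistic is exactly the log likelihood under the null minus the log likelihood under the alternative, which is why the test rejects for small values.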
As a result we have obtained the test statistic of the log Likelihood Ratio test. The
test rejects the null hypothesis, in favor of the process being stationary, for small values
of (6.1.7). Since the alternative hypothesis depends on the fixed alternative c1, each local
alternative corresponds to its own critical value lTα(c1) at significance level α. Let us
define the set of M ∈ N negative local alternatives

\[
C = \{c_1, c_2, \ldots, c_M \mid c_i \in \mathbb{Z}_{<0} \text{ and } c_i > c_{i+1} \ \forall i \le M\}.
\]
6.2 Asymptotic Critical Values of the Likelihood Ratio Test

The previous section resulted in the derivation of the log Likelihood Ratio test statistic. Since
each critical value lTα(ci) corresponds to a fixed local alternative ci ∈ C being tested,
we have to determine the critical values corresponding to the M fixed alternatives. The
Likelihood Ratio test rejects the null hypothesis in favor of the fixed alternative ci for small
values of the test statistic log [ΛT(Yt)], so the critical values satisfy

\[
\Pr_{\phi=1}\left(\log\left[\Lambda_T(Y_t)\right] \le l_T^{\alpha}(c_i)\right) = \alpha.
\]
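In practice these critical values are obtained by Monte Carlo: simulate the statistic under φ = 1 and take the empirical α-quantile. A Python sketch (σ² = 1; sample size and replication count are arbitrary assumptions, and the thesis implementation is Matlab):

```python
import numpy as np

def lr_critical_value(c, T=500, alpha=0.05, reps=2000, rng=None):
    """Empirical alpha-quantile of log Lambda_T under H0: phi = 1 (reject for small values)."""
    if rng is None:
        rng = np.random.default_rng(0)
    stats = np.empty(reps)
    for k in range(reps):
        # random walk with Y_0 = 0: the process under the null hypothesis
        y = np.concatenate(([0.0], np.cumsum(rng.standard_normal(T - 1))))
        dy, y_lag = np.diff(y), y[:-1]
        A_T = np.dot(y_lag, dy) / T
        B_T = np.dot(y_lag, y_lag) / T**2
        stats[k] = -c * A_T + 0.5 * c**2 * B_T
    return np.quantile(stats, alpha)
```

By construction, roughly a fraction α of the null-simulated statistics falls below the returned value.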
for several (local-to-unity) fixed alternatives ci ∈ C such that φi = 1 + ci/T. The power
envelope is defined by:

\[
\Pi_T^{\alpha}(\phi_i) = \sup_{\rho_T} \Pr_{\phi = \phi_i}\left(\rho_T(Y) \text{ rejects } H_0\right),
\]
where ρT is a unit root test from the class, Y defines the time series of length T and φi the
fixed alternative. With help of the Neyman-Pearson lemma (explained in Appendix A.4),
there is a simple way to derive the power envelope. The Neyman-Pearson lemma states
that the Likelihood Ratio test is the most powerful test for two simple hypotheses H0 :
φ = 1 and H1 : φ = φi < 1. In section 6.1 we have calculated the Likelihood Ratio test
statistic and by the lemma we conclude that the point optimal unit root test rejects the
null hypothesis for small values of
\[
\log\left[\Lambda_T(\varepsilon_t)\right] = -T(\phi_i - 1)A_T + \frac{1}{2}\left[T(\phi_i - 1)\right]^2 B_T. \tag{6.3.2}
\]
This results in an explicit formula for the power envelope ΠαT(φi):

\[
\Pi_T^{\alpha}(\phi_i) = \Pr_{\phi_i}\left(-T(\phi_i - 1)A_T + \frac{1}{2}\left[T(\phi_i - 1)\right]^2 B_T \le l_T^{\alpha}(\phi_i)\right),
\]

with AT and BT defined in (6.1.7) and lTα(φi) the critical value corresponding to the fixed
alternative φi = 1 + ci/T at significance level α = 0.05, which satisfies

\[
\Pr_{\phi=1}\left(-T(\phi_i - 1)A_T + \frac{1}{2}\left[T(\phi_i - 1)\right]^2 B_T \le l_T^{\alpha}(\phi_i)\right) = \alpha.
\]
For every fixed alternative φi we can compute the corresponding lTα(φi) and calculate the
maximum power ΠαT(φi), such that the power envelope is pointwise obtainable. The result
is a sequence of most powerful tests, depending on the alternative being considered, for test
size α = 0.05. The optimal test against the alternative φ = φi < 1 depends on the
fixed alternative φi, and as a result there does not exist a uniformly most powerful test
at significance level α. As we shall see later on, this also holds for the asymptotic power
envelope.
Figure 6.1: The power of the test close to unity corresponds to the asymptotic size
α = 0.05.
Figure 6.1 shows the asymptotic power envelope for local alternatives extremely close
to unity. The figure shows that the size is only achieved asymptotically, such that we can
no longer speak of the exact test size, but of the asymptotic size of the test. The power of
the test for local alternatives close to unity approaches the value 0.05, which corresponds
with the test size α = 0.05. If we assume that the limit (6.4.1) exists, there is an explicit
formula for the asymptotic power envelope:

\[
\Pi_\infty^{\alpha}(c_i) = \lim_{T\to\infty} \Pr_{c_i}\left(-c_i A_T + \frac{1}{2}c_i^2 B_T \le l_\infty^{\alpha}(c_i)\right). \tag{6.4.2}
\]
Elliott et al. have proven that the asymptotic power envelope equals:

\[
\begin{aligned}
\Pi_\infty^{\alpha}(c_i) &= \lim_{T\to\infty} \Pr_{c_i}\left(-c_i A_T + \frac{1}{2}c_i^2 B_T \le l_\infty^{\alpha}(c_i)\right)\\
&= \Pr\left(-c_i \int_0^1 W_{c_i}(r)\,dW(r) - \frac{1}{2}c_i^2 \int_0^1 W_{c_i}(r)^2\,dr \le l_\infty^{\alpha}(c_i)\right).
\end{aligned}
\tag{6.4.3}
\]
\[
\begin{aligned}
\Pi_T^{\alpha}(\phi_i) &= \Pr_{\phi_i}\left(-T(\phi_i - 1)A_T + \frac{1}{2}\left[T(\phi_i - 1)\right]^2 B_T \le l_T^{\alpha}(\phi_i)\right)\\
&= \Pr_{c_i}\left(-c_i A_T + \frac{1}{2}c_i^2 B_T \le l_T^{\alpha}(c_i)\right)\\
&= \Pr_{c_i}\left(-c_i A_T + c_i^2 B_T - \frac{1}{2}c_i^2 B_T \le l_T^{\alpha}(c_i)\right)\\
&= \Pr_{c_i}\left(-c_i(A_T - c_i B_T) - \frac{1}{2}c_i^2 B_T \le l_T^{\alpha}(c_i)\right).
\end{aligned}
\tag{6.4.4}
\]
Elliott et al. provided us with the asymptotic distribution of the expressions for AT and
BT:

\[
A_T - c_i B_T \xrightarrow{d} \int_0^1 W_{c_i}(t)\,dW(t), \qquad B_T \xrightarrow{d} \int_0^1 W_{c_i}(t)^2\,dt.
\]
\[
\Pi_T^{\alpha}(c_i) = \Pr_{c_i}\left(-c_i(A_T - c_i B_T) - \frac{1}{2}c_i^2 B_T \le l_T^{\alpha}(c_i)\right)
\xrightarrow{L}
\Pi_\infty^{\alpha}(c_i) = \Pr_{c_i}\left(-c_i \int_0^1 W_{c_i}(t)\,dW(t) - \frac{1}{2}c_i^2 \int_0^1 W_{c_i}(t)^2\,dt \le l_\infty^{\alpha}(c_i)\right), \tag{6.4.5}
\]

where W(t) denotes a Wiener process and Wci(t) denotes an Ornstein-Uhlenbeck process
which satisfies the following stochastic differential equation:
\[
dW_{c_i}(t) = k\left(\mu - W_{c_i}(t)\right)dt + \sigma\,dW(t). \tag{6.5.1}
\]
In (6.5.1) Wci(·) denotes the Ornstein-Uhlenbeck process, W(·) a Wiener process, µ the
long term mean towards which the process tends to drift over time, σ = 1, and k = −ci,
where ci ∈ C is the fixed alternative. If µ ≠ 0 the data is not centered; therefore we
substitute Y(t) = Wci(t) − µ. The process Y(t) satisfies
\[
\begin{aligned}
e^{kt}Y(t) &= e^{ks}Y(s) + \int_s^t \sigma e^{ku}\,dW(u)\\
\Leftrightarrow\quad Y(t) &= e^{-k(t-s)}Y(s) + \int_s^t \sigma e^{-k(t-u)}\,dW(u).
\end{aligned}
\tag{6.5.6}
\]
As a result, after back substitution of Y(t) = Wci(t) − µ and k = −ci, we have found the
analytic solution to the Ornstein-Uhlenbeck stochastic differential equation:

\[
W_{c_i}(t) = \mu + e^{c_i(t-s)}\left(W_{c_i}(s) - \mu\right) + \int_s^t \sigma e^{c_i(t-u)}\,dW(u). \tag{6.5.7}
\]
Note that (6.4.6) is a simplified version of the general Ornstein-Uhlenbeck stochastic
differential equation (6.5.1), with µ = 0 and σ = 1. By means of this solution the simulation
of an Ornstein-Uhlenbeck process in Matlab is straightforward. With the solution to
this process we can calculate the asymptotic power envelope. Figure 6.2 illustrates the
asymptotic power envelope for fixed local alternatives ci ∈ C. The graph shows for each
alternative ci the maximum asymptotically attainable power of the class of unit root tests.

³By the product rule for Itô integrals [14].
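For instance, an Euler-Maruyama discretisation of this simplified process (µ = 0, σ = 1, k = −ci) on [0, 1] can be sketched as follows; this is an illustrative Python version (the thesis uses Matlab), and the step count is an arbitrary choice:

```python
import numpy as np

def simulate_ou(c, n_steps=1000, rng=None):
    """Euler-Maruyama path of dW_c(t) = c*W_c(t) dt + dW(t) on [0,1], starting at 0."""
    if rng is None:
        rng = np.random.default_rng(0)
    dt = 1.0 / n_steps
    w = np.zeros(n_steps + 1)
    for j in range(n_steps):
        dW = np.sqrt(dt) * rng.standard_normal()   # Wiener increment
        w[j + 1] = w[j] + c * w[j] * dt + dW       # drift pulls the path back to 0 for c < 0
    return w
```

With c = 0 the recursion reduces to a plain Wiener path; with c < 0 the path is mean-reverting, which is exactly the local-to-unity behaviour entering the power envelope.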
[Figure 6.2: the asymptotic power envelope for fixed local alternatives ci ∈ C, plotted as power against φ close to unity.]
\[
\Pi_\infty^{\alpha}(c_i) = \Pr_{c_i}\left(-c_i \int_0^1 W_{c_i}(t)\,dW(t) - \frac{1}{2}c_i^2 \int_0^1 W_{c_i}(t)^2\,dt \le l_\infty^{\alpha}(c_i)\right). \tag{6.6.1}
\]
The asymptotic power envelope is the maximum attainable asymptotic power of unit
root tests against the fixed alternative ci . Another way to define this maximum at-
tainable power is in terms of the powercurve Πα∞ (c, ci ), which is the power of the unit
root test corresponding to the critical value lα∞(ci), subject to the set of alternatives
C = {c1, c2, . . . , cM | ci ∈ Z<0 and ci > ci+1 ∀ i ≤ M}:

\[
\Pi_\infty^{\alpha}(c, c_i) = \Pr\left(-c_i \int_0^1 W_c(t)\,dW(t) - \frac{1}{2}c_i^2 \int_0^1 W_c(t)^2\,dt \le l_\infty^{\alpha}(c_i)\right), \tag{6.6.2}
\]
where ci denotes the alternative for which the power curve is computed and c defines the
actual value of the AR(1) process. If we maximize this power curve Πα∞(c, ci) in (6.6.2)
over the range of local alternatives in the small neighborhood around unity, we obtain the
asymptotic power envelope at the fixed alternative ci:
\[
\begin{aligned}
\max_c \Pi_\infty^{\alpha}(c, c_i) &= \max_c \Pr\left(-c_i \int_0^1 W_c(t)\,dW(t) - \frac{1}{2}c_i^2 \int_0^1 W_c(t)^2\,dt \le l_\infty^{\alpha}(c_i)\right)\\
&= \Pr\left(-c_i \int_0^1 W_{c_i}(t)\,dW(t) - \frac{1}{2}c_i^2 \int_0^1 W_{c_i}(t)^2\,dt \le l_\infty^{\alpha}(c_i)\right)\\
&= \Pi_\infty^{\alpha}(c_i).
\end{aligned}
\tag{6.6.3}
\]
To select the overall best performing unit root test, we compare each power curve to
the asymptotic power envelope; the method of least squares then yields the best overall
performing test. The method of least squares minimizes, over the design alternative
ci ∈ C, the following function:

\[
\min_{c_i \in C} \sum_{j=1}^{M} \left(\Pi_\infty^{\alpha}(c_j) - \Pi_\infty^{\alpha}(c_j, c_i)\right)^2.
\]

This method returns the fixed alternative ci such that the test obtained with this fixed
alternative is the best performing one. As a result we have found a unit root test with
the best overall performance.
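The selection step itself is elementary; the sketch below (in Python, with made-up placeholder power values rather than thesis results) picks the design alternative whose power curve is closest to the envelope in the least-squares sense:

```python
import numpy as np

# Hypothetical power values on a grid of alternatives C = {c_1, ..., c_M} (placeholders)
envelope = np.array([0.10, 0.30, 0.55, 0.80])        # envelope values Pi_inf^alpha(c_j)
# power_curves[i, j]: power of the test designed against c_i, evaluated at c_j
power_curves = np.array([
    [0.10, 0.28, 0.50, 0.70],
    [0.09, 0.29, 0.54, 0.78],
    [0.07, 0.25, 0.53, 0.79],
])

# Least-squares distance of each power curve to the envelope
sse = np.sum((envelope - power_curves) ** 2, axis=1)
best = int(np.argmin(sse))   # index of the best-performing design alternative
```

Here the second curve hugs the envelope most closely, so `best` selects it.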
Chapter 7
Summary of Results
This thesis has focused on the unit root process. In the first order auto regressive process
Yt = φYt−1 + εt, a unit root is present if φ = 1, and we investigated the ordinary least
squares estimator to construct a standard t-test statistic. The test we examined is known
as the Dickey-Fuller test, which rejects the null hypothesis for small values of
tT = (φ̂ − 1)/σ̂φ̂, and we concluded that the limiting distribution corresponds to the
Dickey-Fuller distribution (4.1.4). The distribution of the innovations is important for the
type of unit root test being considered. If we assume εt ∼ WN(0, σ²), the Dickey-Fuller
test is an appropriate unit root test, whereas if we assume εt ∼ N(0, σ²) we can test the
null hypothesis with the Likelihood Ratio test, which has the benefit of allowing further
power analysis on the unit root process.
The Likelihood Ratio test rejects the null hypothesis of the presence of a unit root for
small values of

\[
\log\left[\Lambda_T(\varepsilon_t)\right] = -c_1 A_T + \frac{1}{2}c_1^2 B_T,
\]

and with help of the Neyman-Pearson lemma we obtained that this test provides, for each
fixed alternative, the point optimal test. The Likelihood Ratio gave the opportunity to
compute the asymptotic power envelope. The analysis of the asymptotic power envelope
by Elliott, Rothenberg and Stock proved that there exists an explicit formula for the
asymptotic power envelope:

\[
\Pi_\infty^{\alpha}(c_i) = \Pr_{c_i}\left(-c_i \int_0^1 W_{c_i}(t)\,dW(t) - \frac{1}{2}c_i^2 \int_0^1 W_{c_i}(t)^2\,dt \le l_\infty^{\alpha}(c_i)\right).
\]
Chapter 8
Discussion
This thesis discussed the Dickey-Fuller test, the Likelihood Ratio test, the asymptotic
theory of unit root processes, and power envelopes for the AR(1) process without intercept
and deterministic trend. AR(1) processes are not always suitable in empirical studies on
time series and, often, higher order auto regressive processes, AR(p) with p ∈ Z, p > 1, with
intercept and/or a deterministic trend are considered. Therefore we recommend further
research on the asymptotic distribution of the AR(p) process when Assumption 1
is not valid. We will give an outline of the extra simulation and transformations that
need to be done in the case of several other auto regressive processes. Since the unit root
problem is widely studied in econometrics, there are quite some papers which shed light
on the less simplified auto regressive process with a unit root. In the following paragraphs
we will give insight into the way one could obtain these different limiting distributions.
The first order auto regressive process under Assumption 1 is a simplified version
of the general first order auto regressive process: in Assumption 1 there is no intercept
and no deterministic trend present. The Dickey-Fuller test we examined is only valid
when we work with the AR(1) process (2.1.1) under Assumption 1. If we add an intercept
m or a deterministic trend γt to the process' equation, it will result in a different test
statistic and asymptotic distribution compared to the one we examined in this thesis. If
an intercept m ≠ 0 and a deterministic trend are present in the AR(1) process, this yields
the following auto regressive process:

\[
Y_t = m + \phi Y_{t-1} + \gamma t + \varepsilon_t, \tag{8.0.1}
\]

where εt represents IID White Noise with zero mean and constant variance. David Dickey
and Wayne Fuller computed alternative ways to obtain the limiting distribution of (8.0.1).
If one is interested in computing the test statistic and the power of the Dickey-Fuller test
when an intercept m ≠ 0 is present, it is possible to center the data by examining Yt − m
as opposed to Yt. In this case, under the null hypothesis the increments are equal to the
innovations and the process Yt − m behaves as a random walk.
Most often it is not evident whether a time series contains a deterministic trend, thus one
has to examine the presence of a trend before choosing the appropriate Dickey-Fuller test.
Harvey, Leybourne and Taylor (2009) [9] have written a paper in which they discuss the
uncertainty about the trend and the initial value Y0. They concluded that it is possible to
detrend the data in order to perform a Dickey-Fuller type unit root test. As a result they
could argue that Dickey-Fuller type unit root tests are almost asymptotically efficient
when a deterministic trend is present in the data. We can conclude that there are ways
to use the Dickey-Fuller unit root test for data containing a deterministic trend, and we
would recommend considering several detrending methods in order to find the test with
locally optimal power.
In reality the first order auto regressive process is not often used to fit
a time series; often higher orders p ∈ N of regression are fitted instead. Since
the characteristic equation of such a series has several roots, testing for the presence of
a unit root becomes more difficult. Instead of the standard Dickey-Fuller test, the
Augmented Dickey-Fuller test, introduced by Wayne Fuller [6], is used. An important
implication of testing for a unit root in higher order auto regressive processes is the
proper selection of the lag p.
Another point of criticism we should mention is the fact that the least squares estimator
is biased. By definition of the critical values, the probability of a type I error must be
smaller than the significance level α = 0.05. Table 5.4 shows that these results do not
match the significance level α, thus we can conclude that the results are biased. There
are several methods available to reduce bias in a Monte Carlo simulation, but as this was
not the aim of the thesis, we did not implement and test these bias reduction methods.
To perform a more accurate power analysis of the Dickey-Fuller test, we recommend
applying a bias reduction method.
Bibliography
[1] David A. Dickey and Wayne A. Fuller. Distribution of the estimators for autoregressive
time series with a unit root. Journal of the American Statistical Association,
74(366a):427–431, 1979.
[2] David A. Dickey and Wayne A. Fuller. Likelihood ratio statistics for autoregressive
time series with a unit root. Econometrica: Journal of the Econometric Society,
pages 1057–1072, 1981.
[3] Joseph L. Doob. Stochastic processes, volume 101. New York Wiley, 1953.
[4] Graham Elliott, Thomas J. Rothenberg, and James H. Stock. Efficient tests for an
autoregressive unit root, 1992.
[5] Graham Elliott, Thomas J. Rothenberg, and James H. Stock. Efficient tests for an
autoregressive unit root, 1996.
[6] Wayne A. Fuller. Introduction to statistical time series, volume 428. John Wiley &
Sons, 2009.
[7] Niels Haldrup and Michael Jansson. Improving size and power in unit root testing.
Palgrave handbook of econometrics, 1:252–277, 2006.
[8] James D. Hamilton. Time series analysis, volume 2. Princeton University Press,
Princeton, 1994.
[9] David I. Harvey, Stephen J. Leybourne, and Robert Taylor. Unit root testing in
practice: dealing with uncertainty over the trend and initial condition. Econometric
Theory, 25(03):587–636, 2009.
[10] Robin High. Important factors in designing statistical power analysis studies.
Computing News, Summer(14-15), 2000.
[11] Jerzy Neyman and Egon S. Pearson. On the Problem of the Most Efficient Tests of
Statistical Hypotheses. Royal Society of London Philosophical Transactions Series
A, 231:289–337, 1933.
[12] Peter C.B. Phillips and Pierre Perron. Testing for a unit root in time series regression.
Biometrika, 75(2):335–346, 1988.
[13] G. William Schwert. Tests for unit roots: A Monte Carlo investigation. Journal of
Business & Economic Statistics, 20(1):5–17, 2002.
[14] Steven E. Shreve. Stochastic calculus for finance II: Continuous-time models, vol-
ume 11. Springer, 2004.
[15] G. E. Uhlenbeck and L. S. Ornstein. On the theory of the brownian motion. Phys.
Rev., 36:823–841, Sep 1930.
Appendices
Appendix A
Auxiliary results
¹Without loss of generality we can take the variance σ² = 1.
Theorem A.3.1 (Continuous Mapping Theorem). The continuous mapping theorem
states that if ST (·) → S(·) and g(·) is a continuous functional, with S(·) a continuous-
time stochastic process and with S(r) representing its value at some time r ∈ [0, 1], then
g(ST (·)) → g(S(·)).
Appendix B
% Finite-sample critical values of the Dickey-Fuller test.
% The declarations of N, T and phi and the computation of the t-statistic
% were lost from this listing; the lines marked "assumed/restored" are
% reconstructions, not the original values.
clear all;
%randn('seed',1234);
N=10000;   % number of Monte Carlo replications (assumed)
T=100;     % sample size (assumed)
phi=1;     % simulate under the null hypothesis (restored)
hatsigma=zeros(1,N);
hatphi=zeros(1,N);
ttest=zeros(1,N);
for k=1:N
X=zeros(1,T);
Z1=zeros(1,T);
Z2=zeros(1,T);
Y1=zeros(1,T);
for j = 2:T
X(j)= phi*X(j-1)+rand-0.5;   % White Noise innovations, uniform on [-0.5,0.5]
Z1(j)=X(j-1)*X(j);
Z2(j)=X(j-1)^2;
end
Z1=cumsum(Z1);
Z2=cumsum(Z2);
hatphi(k)=Z1(T)/Z2(T);       % OLS estimator of phi
for j=2:T
Y1(j)=(X(j)-hatphi(k)*X(j-1))^2;
end
Y1 = cumsum(Y1);
hatsigma(k)=sqrt((Y1(T)/(T-1))/Z2(T));   % standard error of hatphi (restored)
ttest(k)=(hatphi(k)-1)/hatsigma(k);      % Dickey-Fuller t-statistic (restored)
end
dfcritical=sort(ttest);
plot(dfcritical);
nbins=100;
hist(dfcritical,nbins)
% Critical values at the 1%, 2.5% and 5% level (restored)
alpha1=dfcritical(round(0.01*N));
alpha2=dfcritical(round(0.025*N));
alpha3=dfcritical(round(0.05*N));
alpha=[alpha1,alpha2,alpha3]
% Asymptotic critical values of the Dickey-Fuller distribution,
% simulated via a discretised Wiener process on [0,1].
% The declarations of N, N2 and dt and the df(k) computation were lost;
% the lines marked "assumed/restored" are reconstructions.
clear all;
%randn('seed',1234);
N=1000;    % number of time steps on [0,1] (assumed)
N2=10000;  % number of Monte Carlo replications (assumed)
T=1;
dt=T/N;
num=zeros(1,N2); %numerator of the DF distr.
den=zeros(1,N2); %denominator of the DF distr.
df=zeros(1,N2);
for k=1:N2
dW = zeros(1,N);
W = zeros(1,N);
Z=zeros(1,N);
Y= zeros(1,N);
dW(1) = sqrt(dt)*randn;
W(1) = dW(1);
for j = 2:N
dW(j) = sqrt(dt)*randn; % Wiener process simulation
W(j) = W(j-1)+dW(j);
Z(j) = (W(j-1)^2);
Y(j) = W(j-1) * dW(j);
end
Z = cumsum(Z);
Y = cumsum(Y);
den(k)=sqrt(dt*Z(N)); %denominator of the DF distr
num(k)= Y(N); %numerator of the DF distr
df(k)=num(k)/den(k);  % draw from the asymptotic DF distribution (restored)
end
dfcritical=sort(df);
plot(dfcritical);
nbins=100;
hist(dfcritical,nbins)
% Critical values at the 1%, 2.5% and 5% level (restored)
alpha1=dfcritical(round(0.01*N2));
alpha2=dfcritical(round(0.025*N2));
alpha3=dfcritical(round(0.05*N2));
alpha=[alpha1,alpha2,alpha3]
% Dickey Fuller test
% White Noise errors (rand-0.5)
N=1000;
T=500;   % sample size (assumed; the original declaration was lost)
c=[-10:1:0]; %fixed alternatives
%phi1=0.5:0.05:1;
M=length(c);
type2=zeros(1,M);
power=zeros(1,M);
for i=1:M
phi=1+c(i)/T;
Y=zeros(T,1);
hatphi=zeros(1,N);
teller=zeros(T,N);
noemer=zeros(T,N);
teller2=zeros(T,N);
hatsigma=zeros(1,N);
testvalue=zeros(1,N);
criticalvalue=-1.941;
count=0;
for k=1:N
Y(1)=0;
for t=2:T
Y(t)=phi*Y(t-1)+rand-0.5;   % White Noise innovations (was randn; changed to match the header comment)
% construction of hatphi
teller(t,k)=Y(t-1)*Y(t);
noemer(t,k)=Y(t-1)^2;
end
% construction of hatphi
teller=cumsum(teller);
noemer=cumsum(noemer);
hatphi(k)=teller(T,k)/noemer(T,k);
% construction of hatsigma
for t=2:T
teller2(t,k)=(Y(t)-hatphi(k)*Y(t-1))^2;
end
teller2=cumsum(teller2);
hatsigma(k)=sqrt(((teller2(T,k))/(T-1))/noemer(T,k));
testvalue(k)=(hatphi(k)-1)/hatsigma(k);
if(testvalue(k)> criticalvalue)
count=count+1;
end
end
%nbins=100;
%hist(testvalue,nbins);
type2(i)=count/N;   % fraction of non-rejections: type II error rate
power(i)=1-type2(i);
end
plot(1+c/T, power)
% Power of the Dickey-Fuller test for N(0,1) innovations
% (continuation: N, T and c are reused from the script above)
M=length(c);
type2=zeros(1,M);
power5=zeros(1,M);
power6=zeros(1,M);
power4=zeros(1,M);
for i=1:M
% Generate one unit root process of length T
% Calculate type II error
phi=1+(c(i)/T);
Y=zeros(T,1);
hatphi=zeros(1,N);
teller=zeros(T,N);
noemer=zeros(T,N);
teller2=zeros(T,N);
hatsigma=zeros(1,N);
testvalue=zeros(1,N);
criticalvalue=-1.9522;
count=0;
for k=1:N
Y(1)=0;
for t=2:T
Y(t)=phi*Y(t-1)+randn;   % N(0,1) innovations
% construction of hatphi
teller(t,k)=Y(t-1)*Y(t);
noemer(t,k)=Y(t-1)^2;
end
% construction of hatphi
teller=cumsum(teller);
noemer=cumsum(noemer);
hatphi(k)=teller(T,k)/noemer(T,k);
% construction of hatsigma
for t=2:T
teller2(t,k)=(Y(t)-hatphi(k)*Y(t-1))^2;
end
teller2=cumsum(teller2);
hatsigma(k)=sqrt(((teller2(T,k))/(T-1))/noemer(T,k));
testvalue(k)=(hatphi(k)-1)/hatsigma(k);
if(testvalue(k)> criticalvalue)
count=count+1;
end
end
type2(i)=count/N;
power5(i)=1-type2(i);
end
plot(1+c/T,power5)
hold on;
% Power of the Dickey-Fuller test for N(0,2^2) innovations (2*randn)
for i=1:M
phi=1+(c(i)/T);
Y=zeros(T,1);
hatphi=zeros(1,N);
teller=zeros(T,N);
noemer=zeros(T,N);
teller2=zeros(T,N);
hatsigma=zeros(1,N);
testvalue=zeros(1,N);
criticalvalue=-1.9645;
count=0;
for k=1:N
Y(1)=0;
for t=2:T
Y(t)=phi*Y(t-1)+2*randn;   % N(0,4) innovations
% construction of hatphi
teller(t,k)=Y(t-1)*Y(t);
noemer(t,k)=Y(t-1)^2;
end
% construction of hatphi
teller=cumsum(teller);
noemer=cumsum(noemer);
hatphi(k)=teller(T,k)/noemer(T,k);
% construction of hatsigma
for t=2:T
teller2(t,k)=(Y(t)-hatphi(k)*Y(t-1))^2;
end
teller2=cumsum(teller2);
hatsigma(k)=sqrt(((teller2(T,k))/(T-1))/noemer(T,k));
testvalue(k)=(hatphi(k)-1)/hatsigma(k);
if(testvalue(k)> criticalvalue)
count=count+1;
end
end
type2(i)=count/N;
power6(i)=1-type2(i);
end
% Power for 2*randn innovations at critical value -1.9682
for i=1:M
phi=1+c(i)/T;
Y=zeros(T,1);
hatphi=zeros(1,N);
teller=zeros(T,N);
noemer=zeros(T,N);
teller2=zeros(T,N);
hatsigma=zeros(1,N);
testvalue=zeros(1,N);
criticalvalue=-1.9682;
count=0;
for k=1:N
Y(1)=0;
for t=2:T
Y(t)=phi*Y(t-1)+2*randn;
% construction of hatphi
teller(t,k)=Y(t-1)*Y(t);
noemer(t,k)=Y(t-1)^2;
end
% construction of hatphi
teller=cumsum(teller);
noemer=cumsum(noemer);
hatphi(k)=teller(T,k)/noemer(T,k);
% construction of hatsigma
for t=2:T
teller2(t,k)=(Y(t)-hatphi(k)*Y(t-1))^2;
end
teller2=cumsum(teller2);
hatsigma(k)=sqrt(((teller2(T,k))/(T-1))/noemer(T,k));
testvalue(k)=(hatphi(k)-1)/hatsigma(k);
if(testvalue(k)> criticalvalue)
count=count+1;
end
end
type2(i)=count/N;
power4(i)=1-type2(i);
end
% critical values for the likelihood ratio test are dependent on the fixed
% alternative which is being used. Therefore we construct a vector cv with
% a critical value for each fixed alternative.
% The declarations below and the quantile step were lost from this listing;
% the lines marked "assumed/restored" are reconstructions.
clear all
N=1000;        % number of time steps on [0,1] (assumed)
N2=10000;      % number of Monte Carlo replications (assumed)
T=1;
dt=T/N;
c=-10:0.5:0;   % fixed alternatives (assumed, matching the power envelope script)
M=length(c);
cv=zeros(1,M);
lr=zeros(1,N2);
A=zeros(1,N2);
B=zeros(1,N2);
for i=1:M
for k=1:N2
dW = zeros(1,N);
W = zeros(1,N);
Z=zeros(1,N);
Y= zeros(1,N);
X=zeros(1,N);
dW(1) = sqrt(dt)*randn;
W(1) = dW(1);
for j=2:N   % loop header restored; missing from the listing
dW(j) = sqrt(dt)*randn;
W(j) = W(j-1)+dW(j);
Z(j) = (W(j-1)^2);
Y(j) = W(j-1) * dW(j);
end
Z = cumsum(Z);
Y = cumsum(Y);
A(k)=dt*Z(N);   % approximates int_0^1 W(t)^2 dt
B(k)= Y(N);     % approximates int_0^1 W(t) dW(t)
lr(k)=-c(i)*B(k)+0.5*c(i)^2*A(k);
end
lrsorted=sort(lr);
cv(i)=lrsorted(round(0.05*N2));   % empirical 5% quantile under the null (restored)
end
randn('seed',1234);
clearvars -except cv; %delete all variables except cv
%%%%%%%%%% Declaration of variables %%%%%%%%%%%%%%
% (the original declarations were lost; the values below are assumptions)
N=1000;        % number of Monte Carlo replications (assumed)
T=500;         % sample size (assumed)
c=-10:0.5:0;   % fixed alternatives, matching the critical values cv (assumed)
M=length(c);
powerlik=zeros(1,M);
A=zeros(T,N);
B=zeros(T,N);
Y=zeros(T,1);
lrtest=zeros(1,N);
for i=1:M
falt=c(i);     % fixed alternative used in the simulation (restored)
count=0;
for k=1:N
Y(1)=0;
for t=2:T
Y(t)=(1+(falt/T))*Y(t-1)+randn; %AR(1) process simulation
A(t,k)=Y(t-1)*(Y(t)-Y(t-1));
B(t,k)=Y(t-1)^2;
end
% log likelihood ratio statistic (6.1.7) (restored; missing from the listing)
AT=sum(A(:,k))/T;
BT=sum(B(:,k))/T^2;
lrtest(k)=-c(i)*AT+0.5*c(i)^2*BT;
if(lrtest(k)< cv(i))
count=count+1;
end
end
powerlik(i)=count/N;   % (restored)
end
plot(-c,powerlik);
hold on;
N=1000; %number of intervals of the integral
N2=1000; %reps
T=1;
dt=T/N;
c=-10:0.5:0; %fixed alternative
critval=cv;
powerenv=zeros(1,length(cv));
lrtest=zeros(1,N2);
for k=1:length(c);
count=0;
for i=1:N2
dW=zeros(1,N);
W=zeros(1,N);
Z=zeros(1,N);
Y=zeros(1,N);
dW(1) = sqrt(dt)*randn;
W(1) = dW(1);
for j=2:N
% Ornstein-Uhlenbeck path under the alternative c(k) (loop body restored; missing)
dW(j) = sqrt(dt)*randn;
W(j) = W(j-1) + c(k)*W(j-1)*dt + dW(j);
Z(j) = W(j-1)*dW(j);   % approximates int W_c(t) dW(t)
Y(j) = W(j-1)^2;       % approximates int W_c(t)^2 dt after scaling by dt
end
Z = cumsum(Z);
Y = dt*cumsum(Y);
lrtest(i)=-c(k)*Z(N)-0.5*c(k)^2*Y(N);
if lrtest(i)<cv(k)
count=count+1;
end
end
powerenv(k)=count/N2;
end
plot(-c,powerenv)