
Chapter Six: Time Series Econometrics 2024

CHAPTER SIX
TIME SERIES ANALYSIS

6.1. Introduction
Time Series Data is one of the important types of data used in empirical analysis. Time series
analysis comprises methods for analyzing time series data in order to extract meaningful statistics
and other characteristics of the data. It is an attempt to understand the nature of a time series and to develop appropriate models useful for future forecasting. Therefore, this chapter shall be devoted to the discussion of introductory concepts in time series analysis, mainly for two major reasons:
1. Time series data is frequently used in practice
2. Time series data analysis poses several challenges to econometricians and
practitioners.

The following are some of the challenges in time series analysis:


A. Serial correlation/nonindependence of observations: Autocorrelation is common in time
series analysis because the underlying time series is oftentimes nonstationary.
B. Spurious or false regression: in regressing a time series variable on another time series
variable (or variables), one often obtains a very high R² (in excess of 0.9) even though there is no
meaningful relationship between the two variables. Sometimes we expect no relationship
between two variables, yet a regression of one on the other often shows a
significant relationship. This situation exemplifies the problem of spurious, or nonsense,
regression. It is therefore very important to find out whether the relationship between economic
variables is spurious or nonsensical.
C. Some financial time series, such as stock prices, exhibit what is known as the random
walk phenomenon, that is, such series are non-stationary. Therefore, forecasting in such
series would be a futile exercise.
D. Causality tests (such as the Granger and Sims tests) assume that the time series involved in the
analysis are stationary. Therefore, tests of stationarity should precede tests of causality.
Note that almost all of the above problems arise mainly due to non-stationarity of time series data
sets. Therefore, our major emphasis in this chapter shall be on discussion of nature and tests of
Stationarity of time series, and the remedial measures for non-stationary data sets.

6.2. The Nature of Time Series Data


An obvious characteristic of time series data which distinguishes it from cross-sectional data is that
a time series data set comes with a temporal ordering. For instance, in this Chapter, we will
briefly discuss a time series data set on GDP(Y), trade balance (TB), money supply (MS2),
exchange rate (EX) and government expenditure (G) of Ethiopia for the period 1963-2003 E.C. In
this data set, we must know that the data for 1963 immediately precede the data for 1964. For
analyzing time series data, we must recognize that the past can affect the future, but not vice versa.

By: Teklebirhan A. (Asst. Prof.) Page 1



To emphasize the proper ordering of time series data, Table 5.1 below gives a listing of the data on
five macroeconomic variables in Ethiopia for the period 1963-2003.

In chapter-2, we studied the statistical properties of the OLS estimators based on the notion that
samples were randomly drawn from the appropriate population. Understanding why cross-
sectional data should be viewed as random outcomes is fairly straightforward: a different sample
drawn from the population will generally yield different values of the variables. Therefore, the
OLS estimates computed from different random samples will generally differ, and this is why we
consider the OLS estimators to be random variables.

How should we think about randomness in time series data? Certainly, economic time series
satisfy the intuitive requirements for being outcomes of random variables. For example, today we
do not know what the trade balance of Ethiopia will be at the end of this year. We do not know
what the annual growth in output will be in Ethiopia during the coming year. Since the outcomes
of these variables are not known in advance, they should clearly be viewed as random variables.

Table 5.1: Time series Data on some macroeconomic variables in Ethiopia

Year TB MS2 GDP EXR G


1963 0.69 629.6 9,400 2.4000 303.99
1964 0.69 658.3 9,873 2.3000 316.19
1965 1.04 808 9,892 2.1900 348.14
1966 1.15 1066.6 10,353 2.0700 368.57
1967 0.71 1139.4 11,412 2.0700 442.89
1968 0.79 1421.8 11,145 2.0700 551.99
1969 0.86 1467.9 11,916 2.0700 654.73
1970 0.84 1682.2 13,221 2.0700 752.34
1971 0.61 1848 13,890 2.0700 942.56
1972 0.65 2053.2 15,143 2.0700 993.59
1973 0.62 2377.6 16,135 2.0700 1,017.56
1974 0.47 2643.7 16,530 2.0700 1,090.74
1975 0.29 3040.5 17,498 2.0700 1,244.54
1976 0.45 3383.7 19,655 2.0700 1,506.13
1977 0.42 3849 17,865 2.0700 1,454.12
1978 0.43 4448.2 21,517 2.0700 1,524.52
1979 0.36 4808.7 22,367 2.0700 1,636.04
1980 0.34 5238.7 23,679 2.0700 1,726.70
1981 0.43 5705 24,260 2.0700 2,066.50
1982 0.40 6708.2 25,413 2.0700 2,336.99
1983 0.29 7959.2 27,323 2.0700 2,467.53
1984 0.18 9010.7 31,362 2.0700 2,416.75
1985 0.26 10136.7 34,621 4.2700 2,819.51


1986 0.30 11598.7 43,171 5.7700 3,770.58


1987 0.43 14408.5 43,849 6.2500 4,220.57
1988 0.35 15654.8 55,536 6.3200 5,378.98
1989 0.46 16550.6 62,268 6.5000 5,671.26
1990 0.44 18585.3 64,501 6.8800 5,984.25
1991 0.31 19399.0 62,028 7.5100 7,069.36
1992 0.35 22177.8 66,648 8.1400 11,921.90
1993 0.31 24516.2 68,027 8.3300 9,963.85
1994 0.27 27322.0 66,557 8.5400 9,873.38
1995 0.26 30469.6 73,432 8.5809 9,849.58
1996 0.23 34662.5 86,661 8.6197 11,315.21
1997 0.23 40211.7 106,473 8.6518 13,203.04
1998 0.22 46377.4 131,641 8.6810 16,080.46
1999 0.23 56651.9 171,989 8.7943 18,071.82
2000 0.22 68182.1 248,303 9.2441 24,364.45
2001 0.19 82509.8 335,392 10.4205 27,592.06
2002 0.24 104432.4 382,939 12.8909 37,527.99
2003 0.33 145377.0 511,157 16.1178 50,093.38

Formally, a sequence of random variables indexed by time is called a stochastic process or a time
series process. (“Stochastic” is a synonym for random). When we collect a time series data set, we
obtain one possible outcome, or realization, of the stochastic process. We can only see a single
realization, because we cannot go back in time and start the process over again. (This is analogous
to cross-sectional analysis where we can collect only one random sample). However, if certain
conditions in history had been different, we would generally obtain a different realization for the
stochastic process, and this is why we think of time series data as the outcome of random variables.
The set of all possible realizations of a time series process plays the role of the population in cross-
sectional analysis.

A random or stochastic process is a collection of random variables ordered in time. If we let Y
denote a random variable, and if it is continuous, we denote it as Y(t), but if it is discrete, we
denote it as Y_t. Since most economic data are collected at discrete points in time, for our purpose
we will use the notation Y_t rather than Y(t). In what sense can we regard GDP as a stochastic
process? Consider for instance the GDP of 9.4 billion Birr of Ethiopia for the period 1963. In
theory, the GDP figure for the period 1963 could have been any number, depending on the
economic and political climate then prevailing. The figure of 9.4 billion Birr is a particular
realization of all such possibilities. Therefore, we can say that GDP is a stochastic process and the
actual values we observed for the period 1963-2003 are a particular realization of that process (i.e.,
sample). The distinction between the stochastic process and its realization is akin to the distinction
between population and sample in cross-sectional data. Just as we use sample data to draw


inferences about a population, in time series we use the realization to draw inferences about the
underlying stochastic process.

6.3. Trends and Seasonality

In describing time series, we have used words such as “trend” and “seasonal” which need to be
defined more carefully. The main features of many time series are trends and seasonal variation.

a) Trend component of time series

A trend exists when there is a long-term increase or decrease in the series. It does not
have to be linear. Sometimes we refer to a trend as "changing direction" when it goes
from an increasing trend to a decreasing trend. There is a trend in the time series data shown in
Figure 5.1.

We distinguish two types of time series trends:

• If the trend in a time series is completely predictable and not variable, we call it a
deterministic trend;
• if it is not predictable, we call it a stochastic trend.

In a nutshell, a deterministic trend is a nonrandom function of time, whereas a stochastic trend is
random and varies over time. According to Stock and Watson (2007), it is more appropriate to
model economic time series as having stochastic rather than deterministic trends. Therefore, our
treatment of trends in economic time series focuses mainly on stochastic rather than deterministic
trends, and when we refer to "trends" in time series data, we mean stochastic trends.

Figure 5.1: Deterministic Versus Stochastic Trend


b) Seasonality component of time series

A seasonal pattern occurs when a time series is affected by seasonal factors such as the
time of the year or the day of the week. Seasonality is always of a fixed and known
frequency. For example, the monthly sales of antidiabetic drugs may show seasonality,
induced partly by the change in the cost of the drugs at the end of the calendar year.
Thus, seasonality is a repeating short-term cycle in the series, or a repeating pattern of
behavior over time.

Seasonality is the repetition of data at a certain time interval. For example, every year
people tend to go on vacation during December and January; this is seasonality. It is
another of the most important characteristics of time series analysis. It is generally
measured by autocorrelation after subtracting the trend from the data.

Figure 5.2: Seasonality pattern of time series

From Figure 5.2 above, it is clear that there is a spike at the start of every year: every
January, people tend to take 'Diet' as their resolution more than in any other month.
This is a perfect example of seasonality.

A cycle, by contrast, occurs when the data exhibit rises and falls that are not of a fixed
frequency.

Note that we should remove the trend and the seasonal component to get stationary
residuals.
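The trend-plus-seasonal removal just described can be sketched numerically. The series below is simulated (hypothetical monthly data, not the chapter's), with a linear trend, a 12-month seasonal cycle, and noise; removing an estimated moving-average trend and the per-month average deviations leaves roughly stationary residuals.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(120)                                   # 10 years of monthly data
y = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 1, 120)

# 1) Estimate the trend with a 12-month moving average and remove it.
trend_hat = np.convolve(y, np.ones(12) / 12, mode="same")
detrended = y - trend_hat

# 2) Estimate the seasonal component as the mean deviation for each month
#    and remove it as well.
seasonal_hat = np.array([detrended[m::12].mean() for m in range(12)])
resid = detrended - np.tile(seasonal_hat, 10)

# The residuals fluctuate around zero with no trend or seasonal pattern
# (apart from edge effects of the moving average).
print(abs(round(float(resid.mean()), 3)))            # prints 0.0 (exact by construction)
```

This is only a sketch of the idea; dedicated decomposition routines handle the moving-average edge effects more carefully.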


6.4. Stationary and Non-stationary Stochastic Processes


A type of stochastic process that has received a great deal of attention by time series
analysts is the so-called stationary stochastic process. In colloquial language, stationarity
means that the probabilistic character of the series must not change over time, i.e., that
any section of the time series is “typical” for every other section with the same length. In
other words, a stochastic process is said to be stationary if its mean and variance are
constant over time and the value of the covariance between any two time periods
depends only on the distance or gap or lag between the two time periods and not the
actual time at which the covariance is computed. In the time series literature, such a
stochastic process is known as weakly stationary, covariance stationary, or second-order
stationary. This type of stationarity is sufficient for applied time series analysis, and strict
stationarity is of little practical use here. Weak stationarity is a necessary condition for
building a time series model that is useful for future forecasting. To explain weak
stationarity, let Y_t be a stochastic time series with these properties:

Mean: E(Y_t) = μ (1)
Variance: Var(Y_t) = E(Y_t − μ)² = σ² (2)
Covariance: γ_k = Cov(Y_t, Y_{t+k}) = E[(Y_t − μ)(Y_{t+k} − μ)] (3)

where γ_k, the covariance (or autocovariance) at lag k, is the covariance between the values of Y_t and
Y_{t+k}, that is, between two Y values k periods apart. If k = 0, we obtain γ_0, which is simply the
variance of Y (= σ²); if k = 1, γ_1 is the covariance between two adjacent values of Y.
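The three moment conditions above can be checked informally on simulated data (a hypothetical white-noise series, not the chapter's data): for a weakly stationary series, the sample mean, variance, and lag-k autocovariance computed over different sub-spans should be roughly equal, since none of them depends on t.

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.normal(0, 1, 4000)        # white noise: weakly stationary by construction

def autocov(x, k):
    """Sample autocovariance at lag k: average of (x_t - xbar)(x_{t+k} - xbar)."""
    xbar = x.mean()
    if k == 0:
        return ((x - xbar) ** 2).mean()              # gamma_0 = the variance
    return ((x[:-k] - xbar) * (x[k:] - xbar)).mean()

first, second = u[:2000], u[2000:]                   # two halves of one realization
# gamma_k should not depend on which stretch of time we use:
same_variance = abs(autocov(first, 0) - autocov(second, 0)) < 0.2
print(same_variance)
```

For a trending (non-stationary) series, the same comparison would fail: the two halves would have visibly different means and variances.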

In short, if a time series is stationary, its mean, variance, and autocovariance (at various lags)
remain the same no matter at what point we measure them; that is, they are time invariant. Such a
time series will tend to return to its mean (called mean reversion) and fluctuations around this
mean (measured by its variance) will have broadly constant amplitude. If a time series is not
stationary in the sense just defined, it is called a non-stationary time series. In other words, a
non-stationary time series will have a time varying mean or a time-varying variance or both.

Why are stationary time series so important? There are two major reasons.
1. If a time series is non-stationary, we can study its behavior only for the time period under
consideration. Each set of time series data will therefore be for a particular episode. As a
result, it is not possible to generalize it to other time periods. Therefore, for the purpose of
forecasting or policy analysis, such (non-stationary) time series may be of little practical
value.
2. If we have two or more non-stationary time series, regression analysis involving such time
series may lead to the phenomenon of spurious or non-sense regression.
How do we know that a particular time series is stationary? In particular, are the time series
shown in the above five figures stationary? If we depend on common sense, it would seem that
the time series depicted in the above five figures are non-stationary, at least in the mean values.
This is because some of them are trending upward while others are trending downwards.


Although our interest is in stationary time series, we often encounter Non-stationary time series,
the classic example being the random walk model (RWM). It is often said that asset prices, such
as stock prices follow a random walk; that is, they are non-stationary. We distinguish two types of
random walks: (1) random walk without drift (i.e., no constant or intercept term) and (2) random
walk with drift (i.e., a constant term is present).

A. Random Walk without Drift


A random walk is defined as a process where the current value of a variable is composed of its
past value plus an error term defined as white noise (a normally distributed error with zero mean
and constant variance σ²). Algebraically, a random walk is represented as follows:
Y_t = Y_{t−1} + u_t (4)

In the random walk model, as (4) shows, the value of Y at time t is equal to its value at time
(t − 1) plus a random shock u_t. We can think of (4) as a regression of Y at time t on its value
lagged one period.

Now from (4) we can write:

Y_1 = Y_0 + u_1
Y_2 = Y_1 + u_2 = Y_0 + u_1 + u_2
Y_3 = Y_2 + u_3 = Y_0 + u_1 + u_2 + u_3

In general, if the process started at some time 0 with a value of Y_0, we have:

Y_t = Y_0 + Σu_t (5)

Therefore,

E(Y_t) = Y_0 (6)
Var(Y_t) = tσ² (7)

As the preceding expressions show, the mean of Y_t is equal to its initial, or starting, value, which is
constant, but as t increases, its variance increases indefinitely, thus violating a condition of
stationarity. In short, the RWM without drift is a non-stationary stochastic process. In practice,
Y_0 is often set at zero, in which case E(Y_t) = 0. An interesting feature of the RWM is the
persistence of random shocks (random errors), which is clear from (5): Y_t is the sum of the initial
Y_0 plus the sum of random shocks. As a result, the impact of a particular shock does not die away. For
example, if u_2 = 2 rather than u_2 = 0, then all Y_t's from Y_2 onward will be 2 units higher, and the
effect of this shock never dies out. That is why the random walk is said to have an infinite memory.
Interestingly, if we write (4) as

Y_t − Y_{t−1} = ΔY_t = u_t (8)

where Δ is the first-difference operator, it is easy to show that, while Y_t is non-stationary, its first
difference is stationary. In other words, the first differences of a random walk time series are
stationary. But we will have more to say about this later.
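The random-walk algebra above can be verified by simulation (hypothetical shocks with σ = 1, not data from this chapter): generate many independent random walks and check that the variance across walks at a fixed t grows in proportion to t, while the first difference simply recovers the stationary shocks.

```python
import numpy as np

rng = np.random.default_rng(42)
shocks = rng.normal(0.0, 1.0, size=(10_000, 200))  # u_t for 10,000 walks, 200 steps
walks = shocks.cumsum(axis=1)                      # Y_t = sum of shocks (Y_0 = 0)

# Variance across walks at a fixed t approximates Var(Y_t) = t * sigma^2.
var_50 = walks[:, 49].var()     # t = 50  -> about 50
var_200 = walks[:, 199].var()   # t = 200 -> about 200

# First-differencing recovers the stationary shocks: Var(dY_t) = sigma^2.
diffs = np.diff(walks, axis=1)

print(round(var_200 / var_50))  # variance scales with t, so the ratio is about 4
```

The quadrupling of the cross-sectional variance between t = 50 and t = 200 is exactly the "variance increases indefinitely" property, while `diffs` behaves like plain white noise.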


B. Random Walk with Drift


Let us modify (4) as follows:

Y_t = δ + Y_{t−1} + u_t (9)

where δ is known as the drift parameter. The name drift comes from the fact that if we write the
preceding equation as

Y_t − Y_{t−1} = ΔY_t = δ + u_t (10)

it shows that Y_t drifts upward or downward, depending on δ being positive or negative. Following
the procedure discussed for the random walk without drift, it can be shown that for the random walk
with drift model (9),

Y_t = Y_0 + tδ + Σu_t

E(Y_t) = Y_0 + tδ (11)
Var(Y_t) = tσ² (12)

As can be seen from the above, for the RWM with drift the mean as well as the variance increases over
time, again violating the conditions of (weak) stationarity. In short, the RWM, with or without drift, is
a non-stationary stochastic process.
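The drift results above can be checked the same way by simulation. With a hypothetical drift δ = 0.2 and starting value Y_0 = 5 (illustrative numbers only), the average of many simulated paths should track Y_0 + tδ, and the cross-sectional variance should again grow like tσ².

```python
import numpy as np

rng = np.random.default_rng(7)
delta, y0, n_walks, T = 0.2, 5.0, 20_000, 100
shocks = rng.normal(0.0, 1.0, size=(n_walks, T))
walks = y0 + (delta + shocks).cumsum(axis=1)   # Y_t = Y_0 + t*delta + sum of u

t = 100
mean_t = walks[:, t - 1].mean()                # should be near y0 + t*delta = 25
var_t = walks[:, t - 1].var()                  # should be near t * sigma^2 = 100
print(round(mean_t), round(var_t, -1))
```

The mean drifts upward deterministically while the spread around that drifting mean keeps widening, which is why neither moment condition of weak stationarity holds.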

Remark
• The random walk model is an example of what is known in the literature as a unit root
process. Since this term has gained tremendous currency in the time series literature, we
need to note what a unit root process is.
• Let us rewrite the RWM (4) as:

Y_t = ρY_{t−1} + u_t, −1 ≤ ρ ≤ 1 (13)

• If ρ = 1, (13) becomes a RWM (without drift). If ρ is in fact 1, we face what is known as
the unit root problem, that is, a situation of non-stationarity; we already know that in this case
the variance of Y_t is not stationary. The name unit root is due to the fact that ρ = 1. As noted
above, the first differences of a random walk time series (a unit root process) are stationary.
Thus, the terms non-stationarity, random walk, and unit root can be treated as synonymous.

6.5. Test of Stationarity of Time Series Data


In time series analysis, one of the most important preliminary steps in regression analysis is
to uncover the characteristics of the data used in the analysis. The main goal of a
stationarity test is to establish whether a variable has a constant mean, a constant variance, and a
time-invariant covariance, i.e., whether it is second-order (covariance) stationary. Following the
stationarity test, if the variables are non-stationary, we cannot use the data for forecasting purposes
unless the series is transformed to a stationary one.

Thus, now we may have two important practical questions:


(1) How do we find out if a given time series is stationary?


(2) If we find that a given time series is not stationary, is there a way that it can be made
stationary? We now discuss these two questions in turn.
There are basically three ways to examine the stationarity of a time series, namely: (1) graphical
analysis, (2) correlogram, and (3) the unit root test.

6.5.1 Graphical Analysis


As noted earlier, before one pursues formal tests, it is always advisable to plot the time series
under study, as we have done above for the data given in Table 5.1. Such a plot gives an initial
clue about the likely nature of the time series. Take, for instance, the trade balance time series
shown in Figure-5.3. You will see that over the period of study, trade balance has been declining,
that is, showing a downward trend, suggesting perhaps that the mean of the TB has been changing.
This perhaps suggests that the TB series is not stationary. Such an intuitive feel is the starting point
of more formal tests of Stationarity.

Figure-5.3: The Trade Balance (TB) of Ethiopia through time

Figure-5.4: The trend of Money Supply (MS2) of Ethiopia through Time


Figure-5.5: The Exchange Rate (ETH/$US) of Ethiopia through time

Figure-5.6: The Government Expenditure of Ethiopia through time

Some of the above time series graphs show an upward trend (MS2, EX, and G), which may be an
indication of the non-stationarity of these data sets. That means the mean or variance, or both, may
be increasing with the passage of time. The graph for the Trade Balance (TB) of Ethiopia
(Figure-5.3), by contrast, showed a downward trend, which may also be an indication of the
non-stationarity of the TB series.

6.5.2 Autocorrelation Function (ACF) and Correlogram


Autocorrelation is the correlation between a variable lagged one or more periods and itself. The
correlogram or autocorrelation function is a graph of the autocorrelations for various lags of a
time series data.

The autocorrelation function (ACF) at lag k, denoted by ρ_k, is defined as:

ρ_k = γ_k / γ_0 = (covariance at lag k) / (variance)


Since covariance and variance are measured in the same units, ρ_k is a unitless, or pure,
number. It lies between −1 and +1, as any correlation coefficient does. If we plot ρ_k
against k, the graph we obtain is known as the population correlogram. Since in practice we only
have a realization (i.e., a sample) of a stochastic process, we can only compute the sample
autocorrelation function (SACF), ρ̂_k. To compute this, we must first compute the sample
covariance at lag k, γ̂_k, and the sample variance, γ̂_0, which are defined as

γ̂_k = Σ(Y_t − Ȳ)(Y_{t+k} − Ȳ) / n
γ̂_0 = Σ(Y_t − Ȳ)² / n

where n is the sample size and Ȳ is the sample mean. Therefore, the sample autocorrelation
function at lag k, the ratio of the sample covariance (at lag k) to the sample variance, is given by

ρ̂_k = γ̂_k / γ̂_0

A plot of ρ̂_k against k is known as the sample correlogram.
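The sample ACF formula above is easy to compute directly. The sketch below applies it to a simulated series with a stochastic trend (illustrative data, not the exchange-rate series of Table 5.1); the autocorrelations start near +1 and die out only slowly, which is the correlogram signature of non-stationarity.

```python
import numpy as np

def sample_acf(y, nlags):
    """Sample autocorrelations rho_hat_k = gamma_hat_k / gamma_hat_0, k = 1..nlags."""
    y = np.asarray(y, dtype=float)
    n, ybar = len(y), y.mean()
    gamma0 = ((y - ybar) ** 2).sum() / n
    return [((y[:n - k] - ybar) * (y[k:] - ybar)).sum() / n / gamma0
            for k in range(1, nlags + 1)]

rng = np.random.default_rng(3)
trending = np.cumsum(rng.normal(0.5, 1.0, 200))   # random walk with drift
acf = sample_acf(trending, 10)
print(acf[0] > 0.9)                               # lag-1 autocorrelation close to +1
```

Running the same function on white noise instead would give autocorrelations scattered near zero at every lag, the stationary pattern described in the text.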

How does a sample correlogram enable us to find out if a particular time series is stationary?
For this purpose, let us first present the sample correlogram of our Exchange rate data set given in
table-5.1. The correlogram of the Exchange rate data is presented for 20 lags in figure-5.6.

Figure-5.6: Correlogram of Ethiopian Exchange Rate (EX), 1963-2003


Now, look at the column labeled AC, which is the sample autocorrelation function, and the first
diagram on the left, labeled autocorrelation. The solid vertical line in this diagram represents the
zero axis; observations above the line are positive values and those below the line are negative
values. Given a correlogram, if the value of ACF is close to zero from above or below, the data are
said to be stationary. In other words, for stationary time series, the autocorrelations (i.e., ACF)


between Y_t and Y_{t+k} for any lag k are close to zero (i.e., the autocorrelation coefficients are
statistically insignificant). That is, the successive values of a time series (such as Y_t and Y_{t−1}) are
not related to each other. Moreover, the bars on the right-hand side of the correlogram stay
short, close to the vertical zero line, for a stationary time series.

On the other hand, if a series has a (stochastic) trend, i.e., nonstationary, successive observations
are highly correlated, and the autocorrelation coefficients are typically significantly different from
zero for the first several time lags and then gradually drop toward zero as the number of lags
increases. The autocorrelation coefficient for time lag 1 is often very large (close to 1).

Thus, looking at figure-5.6, we can see that the autocorrelation coefficients for the exchange rate
series at various lags are high (i.e., close to +1) and decline only slowly as the lag increases.
Figure 5.6 is thus an example of the correlogram of a nonstationary time series.

For further illustration, the correlogram of the Trade balance and Money supply time series are
presented for 10 lags in figure-5.7 & 5.8 below.

In both correlograms, if we look at the column labeled AC, all the values are close to 1. Moreover,
the bars on the right-hand side of the correlograms are long, extending far to the right of the vertical
zero line; that is, the autocorrelations are close to +1. Thus, both the money supply and trade
balance time series are non-stationary.

Figure-5.7: Correlogram of Ethiopian Trade Balance, 1963-2003


Figure-5.8: Correlogram of Ethiopian Money Supply (MS2), 1963-2003


• An important practical question from the above analysis is how to choose the lag
length for computing the ACF. A rule of thumb is to compute the ACF up to one-third to one-
quarter of the length of the time series.

6.5.3. The Unit Root Test (Augmented Dickey-Fuller Test)


A test of stationarity (or non-stationarity) that has become widely popular over the past several
years is the unit root test. In this section, therefore, we shall address the unit root test using the
Augmented Dickey-Fuller (ADF) test of stationarity of the time series. Dickey and Fuller (1979,
1981) devised a procedure to formally test for non-stationarity. The ADF test is a modified version
of the original Dickey-Fuller (DF) test; the modification is the inclusion of extra lagged terms of the
dependent variable to eliminate autocorrelation in the test equation.

The key insight of their test is that testing for non-stationarity is equivalent to testing for the
existence of a unit root. Thus, the starting point is the unit root (stochastic) process given by:

Y_t = ρY_{t−1} + u_t, −1 ≤ ρ ≤ 1 (20)

We know that if ρ = 1, that is, in the case of a unit root, (20) becomes a random walk model
without drift, which we know is a non-stationary stochastic process. Therefore, why not simply
regress Y_t on its one-period lagged value Y_{t−1} and find out if the estimated ρ is statistically equal
to 1? If it is, then Y_t is non-stationary. This is the general idea behind the unit root test of
stationarity.

In a nutshell, what we need to examine here is whether ρ = 1 (unity, and hence "unit root"). Obviously, the
null hypothesis is H0: ρ = 1, and the alternative hypothesis is H1: ρ < 1.


We obtain a more convenient version of the test by subtracting Y_{t−1} from both sides of (20):

Y_t − Y_{t−1} = ρY_{t−1} − Y_{t−1} + u_t
ΔY_t = (ρ − 1)Y_{t−1} + u_t
ΔY_t = δY_{t−1} + u_t (21)

where Δ is the first-difference operator and δ = ρ − 1. In practice, therefore, instead of
estimating (20), we estimate (21) and test the null hypothesis H0: δ = 0 against the alternative
hypothesis H1: δ < 0. If δ = 0, then ρ = 1 (i.e., we have a unit root): Y_t follows a
pure random walk (and, of course, is non-stationary).

Now let us turn to the estimation of (21). This is simple enough; all we have to do is take the
first differences of Y_t and regress them on Y_{t−1}, and see if the estimated slope coefficient in this
regression (δ̂) is zero or not. If it is zero, we conclude that Y_t is non-stationary. But if it is
negative, we conclude that Y_t is stationary.

The modified version of the test equation given by (21), i.e., the test equation with extra lagged
terms of the dependent variable, is specified as:

ΔY_t = δY_{t−1} + Σ_{i=1}^{p} β_i ΔY_{t−i} + u_t (22)

Dickey and Fuller (1981) also proposed two alternative regression equations that can be used for
testing for the presence of a unit root. The first contains a constant in the random walk process,
and the second contains both a constant and a non-stochastic time trend, as in the following
equations, respectively:

ΔY_t = α + δY_{t−1} + Σ_{i=1}^{p} β_i ΔY_{t−i} + u_t (23)
ΔY_t = α + λt + δY_{t−1} + Σ_{i=1}^{p} β_i ΔY_{t−i} + u_t (24)

The difference between the three regressions concerns the presence of the deterministic elements
α and λt.

Note that, , in all the above three test equations is the lag length for the number of lagged
dependent variables to be included. That is, we need to choose a lag length to run the ADF test so
that the residuals are not serially correlated. To determine the number of lags, , we can use one
of the following procedures.
a. General-to-specific testing: Start with Pmax and drop lags until the last lag
is statistically significant, i.e., delete insignificant lags and include the significant ones.
b. Use information criteria such as the Schwarz information criteria, Akaike’s information
criterion (AIC), Final Prediction Error (FPE), or Hannan-Quinn criterion (HQIC).


In practice, we just click the ‘automatic selection’ on the ‘lag length’ dialog box in EViews.

Now, the only question is which test we use to find out whether the estimated coefficient of Y_{t−1} in (22),
(23), and (24) is statistically zero or not. The ADF test for stationarity is simply the normal 't' test on
the coefficient of the lagged dependent variable Y_{t−1} from one of the three models (22, 23, and
24). This statistic does not, however, follow the conventional 't' distribution, so we must use special
critical values which were originally calculated by Dickey and Fuller, known as the
Dickey-Fuller tau statistic.¹ However, most modern statistical packages such as Stata and EViews
routinely produce the critical values for Dickey-Fuller tests at the 1%, 5%, and 10% significance
levels.

• In all three test equations, the ADF test concerns whether δ = 0. The ADF test statistic
is the 't' statistic on the lagged dependent variable. The ADF statistic is a negative number,
and the more negative it is, the stronger the rejection of the hypothesis that
there is a unit root.
• Null hypothesis (H0): if not rejected, it suggests the time series has a unit root,
meaning it is non-stationary (it has some time-dependent structure).
• Alternative hypothesis (H1): if the null hypothesis is rejected, it suggests
the time series does not have a unit root, meaning it is stationary.
Equivalently, if
• p-value > 0.05: fail to reject H0; the data have a unit root and are non-stationary.
• p-value ≤ 0.05: reject H0; the data do not have a unit root and are stationary.

In short, if the ADF test statistic is more negative than the critical value(s), i.e., larger in absolute
terms, or equivalently if the p-value for the ADF test statistic is significant, then we reject the null hypothesis
of a unit root and conclude that Y_t is a stationary process.
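As an illustration of the mechanics (a simplified sketch, not the EViews procedure used in this chapter), the test equation (21) with a constant added can be estimated by OLS and the 't' ratio on δ compared against a Dickey-Fuller critical value. The −2.87 below is an approximate 5% tau critical value for the constant-only specification at this sample size; in practice one would use a packaged routine (e.g. `adfuller` in statsmodels), which reports exact critical values and handles the lagged-difference terms.

```python
import numpy as np

def df_tstat(y):
    """t-ratio on delta in: dY_t = alpha + delta * Y_{t-1} + u_t (no lagged diffs)."""
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])  # constant and lagged level
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    s2 = resid @ resid / (len(dy) - X.shape[1])      # residual variance
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])  # std. error of delta_hat
    return beta[1] / se

rng = np.random.default_rng(11)
walk = np.cumsum(rng.normal(0, 1, 500))    # unit-root (non-stationary) series
noise = rng.normal(0, 1, 500)              # stationary series

# Reject the unit-root null only when the statistic is MORE negative than -2.87.
print(df_tstat(noise) < -2.87)             # stationary series: reject H0 -> prints True
```

For the simulated random walk, by contrast, the t-ratio typically falls well above −2.87, so the unit-root null is not rejected, matching the chapter's results for TB, MS2, and EX.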

Note that the choice among the three possible forms of the ADF test equation depends on the
econometrician's knowledge about the nature of his/her data. Plotting the data² and observing
the graph is sometimes very useful because it can clearly indicate the presence or absence of
deterministic regressors. However, if the form of the data-generating process is unknown, the
suggested approach is to estimate the most general model, given by (24), answer a set of questions
regarding the appropriateness of each model, and move to the next model in turn (i.e., the
general-to-specific procedure).

¹ The Dickey-Fuller tau statistic can be found in the appendix section of most statistics and
econometrics textbooks (e.g., Gujarati).
² Sometimes, if your data are exponentially trending, you might need to take the log of the data
before differencing. In that case, in your ADF unit root tests you will need to take the differences
of the log of the series rather than the differences of the series itself.

By: Teklebirhan A. (Asst. Prof.) Page 15



Illustration: Unit Root Test of Some Macroeconomic Variables of Ethiopia using EViews.

As noted above, the ADF test is based on the null hypothesis that a unit root exists in the time
series. Using the ADF test, some of the variables (TB, MS2, and EX) from our time series data set
are examined for unit root as follows.

Table-5.2: Augmented Dickey-Fuller Unit Root Test on log of Trade Balance

Table-5.3: Augmented Dickey-Fuller Unit Root Test on log of Exchange Rate

Table-5.4: Augmented Dickey-Fuller Unit Root Test on Log of Money Supply


All the above test results show that the series are non-stationary: the test statistic is smaller in
absolute terms than the critical values at the 1%, 5% and 10% levels of significance for all three
series. This implies acceptance of the null hypothesis that there is a unit root in the data.

6.5.4. Transforming Non-stationary Time Series


Now that we know the problems associated with non-stationary time series, the practical question
is what to do. To avoid the spurious regression problem that may arise from regressing a non-
stationary time series on one or more non-stationary time series, we have to transform non-
stationary time series to make them stationary. The transformation method depends on whether the
time series are difference stationary process (DSP) or trend stationary process (TSP). We consider
each of these methods in turn.

A) Difference-Stationary Processes
If a time series has a unit root, the first differences of such a series (i.e., a series with a stochastic
trend) are stationary. Therefore, the solution here is to take the first differences of the time series.
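Why first differencing works can be seen directly: for a pure random walk y_t = y_{t−1} + e_t, the first difference is exactly the stationary shock series. A short sketch on simulated data (the series and seed are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(7)
e = rng.standard_normal(250)   # stationary white-noise shocks
y = np.cumsum(e)               # random walk: y_t = y_{t-1} + e_t, non-stationary
dy = np.diff(y)                # first difference: Δy_t

# differencing recovers the stationary shocks exactly
assert np.allclose(dy, e[1:])
```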

Returning to our Ethiopian Trade Balance (TB) time series, we have already seen that it has a unit
root. Let us now see what happens if we take the first differences of the TB series.

Let ΔTB_t = (TB_t − TB_t−1). Now consider the following regression of the differenced series
on its own lag:

Δ(ΔTB_t) = δ̂ ΔTB_t−1

where the estimated δ̂ is negative and its t (= τ) statistic appears in the EViews output below.
The 1 percent critical ADF τ value is about −3.5073, as given in Appendix D, Table 7 of
Gujarati. Since the computed τ (= t) statistic is more negative than the critical value, we conclude
that the first-differenced TB series is stationary; that is, it is I(0). In other words, the overall unit
root test result for the TB series at first difference is as given below (i.e., the test result from
EViews).
Table-5.5: Unit Root test of the Log of Trade balance data set at first difference


Figure-5.9: The first difference of the log of Ethiopian Money Supply (LMS2), 1963-2003
If you compare the above figure with the corresponding level-series figure shown earlier, you
will see the obvious difference between the two.

Table-5.6: Unit Root test of the Log of MS2 data set at first difference

Table-5.7: Unit Root test of the Log of EXR data set at first difference

In general, the above test results revealed that trade balance (TB), log of money supply (LMS2),
and exchange rate (EX) became stationary at first difference. In all the above test results, the
null hypothesis of a unit root is rejected.


B) Trend-Stationary Process (TSP)


As we have noted before, a TSP is stationary around its trend line. Hence, the simplest way to
make such a time series stationary is to regress it on time; the residuals from this regression
will then be stationary. In other words, run the following regression:

Y_t = β₁ + β₂t + u_t

where Y_t is the time series under study and t is the trend variable measured chronologically.
Now,

û_t = (Y_t − β̂₁ − β̂₂t)

will be stationary. û_t is known as a (linearly) detrended time series.
It is important to note that the trend may be nonlinear. For example, it could be

Y_t = β₁ + β₂t + β₃t² + u_t

which is a quadratic trend series. If that is the case, the residuals from this regression will
be a (quadratically) detrended time series.
Note, however, that most macroeconomic time series are DSP rather than TSP.
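The detrending procedure can be sketched with simulated data (the series, coefficients, and seed below are illustrative assumptions, not the chapter's data):

```python
# A sketch of linear detrending on a simulated trend-stationary series:
# regress Y_t on time, keep the residuals û_t = Y_t - b1 - b2·t.
import numpy as np

rng = np.random.default_rng(3)
n = 200
t = np.arange(n)
y = 2.0 + 0.5 * t + rng.standard_normal(n)   # TSP: trend plus stationary noise

slope, intercept = np.polyfit(t, y, deg=1)   # OLS fit of y on t
detrended = y - (intercept + slope * t)      # (linearly) detrended series

# the trend is removed: the residuals have (numerically) zero mean
print(abs(detrended.mean()) < 1e-8)
```

For a quadratic trend, the same idea applies with `np.polyfit(t, y, deg=2)`.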
6.6. What next after unit root testing?
The outcome of unit root testing matters for the empirical model to be estimated. The following
cases explain the implications of unit root testing for further analysis.
CASE 1: Series in the model under examination are stationary.
What if all the time series under consideration are stationary? Technically speaking, we mean they
are I(0) series (integrated of order zero).
 Under this scenario, a cointegration test is not required, as any shock to the system in the
short run quickly adjusts to the long run. Therefore, only the long run model should be
estimated; the estimation of a short run model is not necessary if the series are I(0).
CASE 2: Series in the model under consideration are I(1).
Under this scenario, the series are assumed to be non-stationary. One special feature of these
series is that they are of the same order of integration, I(1).
 Under this scenario, the model in question is not entirely useless although the variables
are unpredictable. To verify further the relevance of the model, there is need to test for
cointegration.
 That is, can we assume a long run relationship in the model despite the fact that the series
are trending either upward or downward?
 If there is cointegration, that means the series in question are related and therefore can be
combined in a linear fashion. This implies that, even if there are shocks in the short run,
which may affect movement in the individual series, they would converge with time (in


the long run). However, there is no long run if series are not cointegrated. This implies
that, if there are shocks to the system, the model is not likely to converge in the long run.
 Note that both long run and short run models must be estimated when there is
cointegration. If there is no cointegration, there is no long run and therefore, only the
short run model will be estimated.
 There are, however, two prominent cointegration tests for I(1) series in the literature:
the Engle-Granger cointegration test and the Johansen cointegration test.
 The Engle-Granger test is meant for single equation models, while the Johansen test is
considered when dealing with multiple equations.
CASE 3: The series are of different orders of integration.
Researchers are more likely to be confronted with this situation. For instance, some of the
variables may be I(0) while others may be I(1).
 Like case 2, a cointegration test is also required under this scenario.
 Recall that Engle-Granger and Johansen cointegration tests are only valid for I(1) series.
 Where the series are of different orders of integration, the appropriate test to use is the
Bounds cointegration test.
 Similar to case 2, if the series are not cointegrated based on the Bounds test, we are
expected to estimate only the short run model. However, both the long run and short run
models are valid if there is cointegration.

6.7. Co-integration Test of Time Series Data


We have warned that the regression of a nonstationary time series on another nonstationary time
series may produce a spurious regression. A cointegration test is the appropriate method for
detecting the existence of a long-run relationship. Engle and Granger (1987) argue that, even
though a set of economic series is not stationary, there may exist some linear combination of the
variables that is stationary if the variables are really related. If the separate series are stationary
only after differencing but a linear combination of their levels is stationary, the series are
cointegrated.
Cointegration becomes an overriding requirement for any economic model using nonstationary
time series data. If the variables do not co-integrate, we usually face the problems of spurious
regression and econometric work becomes almost meaningless.

Suppose there really is a genuine long-run relationship between two variables X_t and Y_t.
Although the variables will rise or decline over time (because they are trended), there will be a
common trend that links them together. For an equilibrium, or long-run relationship, to exist, what


we require, then, is a linear combination of X_t and Y_t that is a stationary variable [an I(0)
variable].

Assume that two variables X_t and Y_t are individually nonstationary time series. A linear
combination of X_t and Y_t can be obtained by estimating the following regression:

Y_t = β₁ + β₂X_t + u_t

and taking the residuals:

û_t = Y_t − β̂₁ − β̂₂X_t

We now subject û_t to unit root analysis, and hence if we find that û_t ~ I(0), then the variables
X_t and Y_t are said to be co-integrated. This is an interesting situation, for although X_t and Y_t
are individually I(1), that is, they have stochastic trends, their linear combination
û_t = Y_t − β̂₁ − β̂₂X_t is I(0). So to speak, the linear combination cancels out the stochastic
trends in the two series. If you take consumption and income as two I(1) variables, savings
defined as (income − consumption) could be I(0). As a result, the regression
Y_t = β₁ + β₂X_t + u_t would be meaningful (i.e., not spurious). In this case we say that the two
variables are co-integrated. Economically speaking, two variables will be co-integrated if they
have a long-term, or equilibrium, relationship between them.

In short, provided we check that the residuals from regressions like Y_t = β₁ + β₂X_t + u_t are
I(0), i.e., stationary, the traditional regression methodology (including the t and F tests) that we
have considered extensively is applicable to data involving (nonstationary) time series. The
valuable contribution of the concepts of unit root, co-integration, etc., is to force us to find out
whether the regression residuals are stationary. As Granger notes, "A test for cointegration can be
thought of as a pre-test to avoid 'spurious regression' situations".

In the language of co-integration theory, a regression such as Y_t = β₁ + β₂X_t + u_t is known as
a co-integrating regression, and the slope parameter β₂ is known as the co-integrating parameter.
The concept of co-integration can be extended to a regression model containing k regressors, in
which case we will have k co-integrating parameters.

Testing for Cointegration


A number of methods for testing cointegration have been proposed in the literature. These include:
a. The Engle-Granger (EG) test - appropriate only when all variables are I(1)
b. The Johansen and Juselius (1990) co-integration test
c. The cointegrating regression Durbin-Watson (CRDW) test
d. The Bounds cointegration test - appropriate when variables are a mix of I(0) and I(1),
or mutually I(0), or mutually I(1)

In this chapter, however, we will use the Engle-Granger (1987) two-stage co-integration test
procedure.


Engle-Granger Test for Cointegration


For a single equation model, the simplest test of cointegration is the ADF unit root test applied to
the residuals estimated from the cointegrating regression. This modified unit root test is known as
the Engle-Granger (EG) or Augmented Engle-Granger (AEG) test. Notice the difference between
the unit root and cointegration tests: tests for unit roots are performed on a single time series,
whereas a cointegration test deals with the relationship among a group of variables, each having a
unit root.

According to the Engle-Granger two-step method, first we have to estimate the static regression
equation to get the long run multiplier. In the second step, an error correction model is formulated
and estimated using the residuals from the first step as the equilibrium error correction term.

In other words, all we have to do is estimate a regression like Y_t = β₁ + β₂X_t + u_t, obtain the
residuals, and apply the ADF unit root test procedure. There is one precaution to exercise,
however: since the estimated residuals are based on the estimated co-integrating parameter β̂₂,
the standard Dickey-Fuller critical values are not quite appropriate. However, Engle and Granger
have calculated these values, which can be found in the appendices of most statistics books.
Therefore, the ADF test in the present context is known as the Engle-Granger (EG) test.
Moreover, several software packages now present these critical values along with other outputs.
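The two-step Engle-Granger idea can be sketched on simulated cointegrated series. Everything below is an illustrative assumption: the data are generated with a common stochastic trend, and −3.34 is only an approximate 5% Engle-Granger critical value for the two-variable case, not the exact tabulated value.

```python
# A sketch of the Engle-Granger two-step test on simulated data: two I(1)
# series share a common stochastic trend, so the residual of their level
# regression should be stationary.
import numpy as np

rng = np.random.default_rng(11)
n = 400
trend = np.cumsum(rng.standard_normal(n))        # common stochastic trend
x = trend + rng.standard_normal(n)               # I(1)
y = 2.0 + 1.5 * trend + rng.standard_normal(n)   # I(1), cointegrated with x

# Step 1: cointegrating regression  y_t = b1 + b2·x_t + u_t
X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
u = y - X @ b                                    # estimated equilibrium error

# Step 2: DF regression on the residuals (no constant): Δu_t = δ·u_{t-1} + e_t
du, ulag = np.diff(u), u[:-1]
delta = (ulag @ du) / (ulag @ ulag)
resid = du - delta * ulag
se = np.sqrt((resid @ resid) / (len(du) - 1) / (ulag @ ulag))
tau = delta / se
print(tau < -3.34)    # True here: residuals stationary => cointegrated
```

Because the shared trend cancels in the residual, the tau statistic comes out strongly negative and the no-cointegration null is rejected.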

Let us illustrate this test using a Trade Balance model.


1. Long run/Static Model of the Trade Balance of Ethiopia
In econometrics, a long run model is a static model in which variables are neither lagged nor
differenced. Therefore, the equation below is a long run model for the trade balance:

LTB_t = β₀ + β₁LMS2_t + β₂LGDP_t + β₃LEXR_t + β₄LG_t + u_t

where all variables are in log form.

Estimate the above model using least squares and save the residuals for the stationarity test.

Dependent Variable: LTB


Method: Least Squares
Included observations: 40 after adjustments
Variables Coefficients Standard Error t-Statistics P-value
LMS2 -0.734042 0.256962 -2.856615 0.0071

LGDP 0.219502 0.174400 1.258610 0.2163

LEXR 0.332259 0.137795 2.411259 0.0211


LG 0.175736 0.298237 0.589248 0.5594

Constant 1.500707 0.951265 1.577591 0.1234

R² = 0.79; Adjusted R² = 0.77; Durbin-Watson d statistic = 1.18

a. Interpret the above regression results


b. Which variables significantly affect the trade balance of Ethiopia in the long run?

2. Test of the Stationarity of the residuals obtained from the long run model

Δû_t = α + δû_{t−1} + v_t

Augmented Dickey-Fuller Test Equation

Dependent Variable: D(û_t)
Method: Least Squares; Included observations: 40 after adjustments
Variables Coefficients Standard Error t-Statistics P-value
û(t−1) -0.650962 0.150385 -4.328652 0.0001
Constant 0.014651 0.031967 0.458332 0.6493

R-squared 0.330246 F-statistic 18.73722
Adjusted R-squared 0.312621 Prob(F-statistic) 0.000105
Durbin-Watson stat 1.812471

a. Test the above model for Stationarity


b. Is the null hypothesis accepted or rejected?

The above regression result shows that the test is significant, and we accept the alternative
hypothesis. That means the residuals from the long run model are stationary, which implies that
the regression LTB_t = β₀ + β₁LMS2_t + β₂LGDP_t + β₃LEXR_t + β₄LG_t + u_t is meaningful
(i.e., not spurious). Thus, the variables in the model have a long run relationship, or equilibrium.

3. The Short Run Dynamics (Error Correction Models)


We just showed that LTB, LMS2, LGDP, LEXR and LG are cointegrated; that is, there is a long-
term, or equilibrium, relationship between these variables. Of course, in the short run there may be
disequilibrium. Therefore, one can treat the error term u_t in

LTB_t = β₀ + β₁LMS2_t + β₂LGDP_t + β₃LEXR_t + β₄LG_t + u_t

as the "equilibrium error", and we can use this error term to tie the short-run behavior of TB to its
long-run value. The error correction mechanism (ECM), first used by Sargan and later
popularized by Engle and Granger, corrects for disequilibrium. An important theorem, known as
the Granger representation theorem, states that if two variables Y and X are cointegrated, then
the relationship between the two can be expressed as an ECM. To see what this means, we will
revert to our example of the trade balance of Ethiopia. But before that, let us consider a typical
example.


Assume that Y_t and X_t are co-integrated and, further, that X_t is exogenous. The ECM can
be given as follows:

ΔY_t = α₀ + α₁ΔX_t + πû_{t−1} + ε_t

To account for short run dynamics, include lagged terms as:

ΔY_t = α₀ + Σᵢ βᵢΔY_{t−i} + Σⱼ γⱼΔX_{t−j} + πû_{t−1} + ε_t

This is called the Error Correction Model (ECM) in time series econometrics. û_{t−1}, the lagged
value of the residuals from the long run model, is used in the short run regression because the
last period's disequilibrium (equilibrium error) affects the direction of the dependent variable in
the current period.

The coefficient π of the lagged residuals is expected to be negative and less than one in absolute
value. It is less than 1 because the adjustment towards equilibrium may not be 100%. If the
coefficient is zero, the system is at equilibrium.
Note:
a. If π is negative and significant, the adjustment is towards the equilibrium.
b. If π is positive and significant, the adjustment is away from the equilibrium.
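On simulated data, the ECM mechanics look as follows. The series, coefficients, and seed are illustrative assumptions; in this design the true speed-of-adjustment coefficient is about −0.6, so the estimate on the lagged residual should come out negative.

```python
# A sketch of an error-correction regression on simulated cointegrated data:
#   ΔY_t = a0 + a1·ΔX_t + π·û_{t-1} + e_t
import numpy as np

rng = np.random.default_rng(5)
n = 500
x = np.cumsum(rng.standard_normal(n))    # I(1) regressor
u = np.zeros(n)                          # equilibrium error: AR(1) with phi = 0.4
for t in range(1, n):
    u[t] = 0.4 * u[t - 1] + rng.standard_normal()
y = 1.0 + 2.0 * x + u                    # cointegrated with x

# Step 1: residuals from the static (long run) regression
X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
uhat = y - X @ b

# Step 2: ECM regression of ΔY on ΔX and the lagged residual
dy, dx, ect = np.diff(y), np.diff(x), uhat[:-1]
Z = np.column_stack([np.ones(n - 1), dx, ect])
a, *_ = np.linalg.lstsq(Z, dy, rcond=None)
print(a[2] < 0)    # speed-of-adjustment estimate is negative
```

A negative, significant coefficient on the lagged residual is exactly the "adjustment towards equilibrium" case (a) above.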
Illustration: Error Correction Model for the Trade Balance of Ethiopia

The Error Correction Model (ECM) for the Trade Balance of Ethiopia is specified as:

ΔLTB_t = α₀ + Σᵢ₌₁ᵖ α₁ᵢΔLTB_{t−i} + Σⱼ₌₀ᵠ α₂ⱼΔLEXR_{t−j} + Σⱼ₌₀ᵠ α₃ⱼΔLMS2_{t−j}
+ Σⱼ₌₀ᵠ α₄ⱼΔLGDP_{t−j} + Σⱼ₌₀ᵠ α₅ⱼΔLG_{t−j} + πECT_{t−1} + ε_t

where p and q are the optimal lag lengths (which can be determined by minimizing the model
selection information criteria) for the dependent and explanatory variables, respectively,
while ECT_{t−1} is the one period lagged value of the residuals obtained from the
cointegrating regression, i.e., the long run/static model.

Short Run Determinants of the Trade Balance of Ethiopia (Output from EViews)
Dependent Variable: DLTB
Method: Least Squares
Included observations: 39 after adjustments
Variables Coefficients Standard Error t-Statistics P-value
D(LTB(-1)) 0.238526 0.162666 1.466352 0.1523
D(LEXR(-1)) -0.079162 0.274364 -0.288531 0.7748


D(LMS2(-1)) -0.540010 0.580852 -0.929685 0.3595


D(LGDP(-1)) -0.160748 0.351836 -0.456882 0.6508
D(LG(-1)) 0.401352 0.277733 1.445103 0.1582
ECT(-1) -0.888009 0.189013 -4.698145 0.0000

R-squared = 0.516354
Adjusted R-squared = 0.425670
Durbin-Watson stat = 1.968382

c. Interpret the above regression results of ECM


d. Which variables significantly affect the trade balance of Ethiopia in the short run?

The coefficient of the error-correction term, about −0.89, suggests that about 89% of the
discrepancy between the long-run and short-run trade balance is corrected within a year (yearly
data), indicating a high speed of adjustment to equilibrium.

In summary, the Engle-Granger Two Stage Procedures:


1. Run the static model at level:
LTB_t = β₀ + β₁LMS2_t + β₂LGDP_t + β₃LEXR_t + β₄LG_t + u_t
2. Save the residuals from this long run or static model:
û_t = LTB_t − β̂₀ − β̂₁LMS2_t − β̂₂LGDP_t − β̂₃LEXR_t − β̂₄LG_t

3. Test the above residual for unit root:


Δû_t = α + δû_{t−1} + v_t
where Δû_t = û_t − û_{t−1}
and ECT_t = û_t

4. If the null hypothesis is accepted, the regression

LTB_t = β₀ + β₁LMS2_t + β₂LGDP_t + β₃LEXR_t + β₄LG_t + u_t
is spurious. We do not have a long run model, and we have to run only the following short
run model, disregarding the long run model:
ΔLTB_t = α₁ΔLMS2_t + α₂ΔLGDP_t + α₃ΔLEXR_t + α₄ΔLG_t + ε_t
(NB: one can also add lagged values of the dependent and explanatory variables.)
5. If the null hypothesis is rejected, the regression
LTB_t = β₀ + β₁LMS2_t + β₂LGDP_t + β₃LEXR_t + β₄LG_t + u_t


is meaningful (non-spurious). We do have both the long run model and short run dynamics.
We have to include the lagged value of the residuals in the short run model as one
explanatory variable.

6.8. Autoregressive Distributed Lag (ARDL(p,q)) Models


Autoregressive distributed lag models are models that contain the lagged values of the dependent
variable and the current and lagged values of the regressors as explanatory variables.
 The ARDL bounds testing approach was developed by Pesaran et al. (2001)
 ARDL Models can be specified and used:
 If the variables are integrated of different orders. That is, a model having a combination
of variables with I(0) and I(1) order of integration.
 If all variables are integrated of order one, I(1). That is stationary after first difference.
 But no variable should be included that is integrated of order two, I(2). Unit root test is
vital to ascertain that no variable is I(2).
 The ARDL model makes use of Bounds cointegration test method. Then, if the variables are
cointegrated, one can specify and estimate both the long-run and short run models.
 The short-run and long-run coefficients of the model are estimated simultaneously.
 ARDL model is relatively efficient in small sample data sets. Moreover, by applying ARDL
model, unbiased long run estimates are obtained.

Specification of Autoregressive Distributed Lag Model (ARDL(p,q))


For two variables, a generalized ARDL(p, q) model is specified as:

Y_t = α + Σᵢ₌₁ᵖ φᵢY_{t−i} + Σⱼ₌₀ᵠ θⱼX_{t−j} + ε_t

where p and q are the optimal lag lengths for the dependent and independent variables,
respectively, and ε_t is the usual white noise residual.
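Once the lag matrix is built, an ARDL model can be estimated by ordinary least squares. The sketch below fits an ARDL(1,1) to simulated data (the coefficients and seed are illustrative assumptions) and also computes the implied long-run multiplier (θ₀ + θ₁)/(1 − φ).

```python
# A sketch of estimating an ARDL(1,1) by OLS on simulated data:
#   Y_t = c + φ·Y_{t-1} + θ0·X_t + θ1·X_{t-1} + e_t
# with true values c = 0.5, φ = 0.6, θ0 = 0.8, θ1 = 0.3.
import numpy as np

rng = np.random.default_rng(9)
n = 400
x = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 + 0.6 * y[t - 1] + 0.8 * x[t] + 0.3 * x[t - 1] + rng.standard_normal()

# build the lag matrix [1, Y_{t-1}, X_t, X_{t-1}] and estimate by OLS
Z = np.column_stack([np.ones(n - 1), y[:-1], x[1:], x[:-1]])
coef, *_ = np.linalg.lstsq(Z, y[1:], rcond=None)

# implied long-run multiplier of X on Y: (θ0 + θ1) / (1 - φ) ≈ 2.75 here
long_run = (coef[2] + coef[3]) / (1 - coef[1])
```

The long-run multiplier is the steady-state effect of a permanent unit change in X, which is what the long run (level) part of the ARDL specification estimates.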

An ARDL representation of our Trade Balance model is formulated as follows:

ΔLTB_t = α₀ + Σᵢ₌₁ᵖ α₁ᵢΔLTB_{t−i} + Σⱼ₌₀ᵠ α₂ⱼΔLMS2_{t−j} + Σⱼ₌₀ᵠ α₃ⱼΔLEXR_{t−j}
+ Σⱼ₌₀ᵠ α₄ⱼΔLGDP_{t−j} + Σⱼ₌₀ᵠ α₅ⱼΔLG_{t−j} + λ₁LTB_{t−1} + λ₂LMS2_{t−1}
+ λ₃LEXR_{t−1} + λ₄LGDP_{t−1} + λ₅LG_{t−1} + ε_t


The coefficients λ₁-λ₅ correspond to the long-run relationship. The remaining expressions with
the summation signs (α₁-α₅) represent the short-run dynamics of the model.

To investigate the presence of long-run relationships among the LTB, LMS2, LEXR, LGDP and
LG, bound testing under Pesaran, et al. (2001) procedure is used. The bound testing procedure is
based on the F-test. The F-test is actually a test of the hypothesis of no co-integration among the
variables against the existence or presence of cointegration among the variables, denoted as:
H₀: λ₁ = λ₂ = λ₃ = λ₄ = λ₅ = 0
i.e., there is no cointegration among the variables.
H₁: λ₁ ≠ λ₂ ≠ λ₃ ≠ λ₄ ≠ λ₅ ≠ 0
i.e., there is cointegration among the variables.

The ARDL bounds test is based on the Wald test (F-statistic). The asymptotic distribution of the
Wald test statistic is non-standard under the null hypothesis of no cointegration among the
variables. Two sets of critical values are given by Pesaran et al. (2001) for the cointegration test.
The lower critical bound assumes all the variables are I(0), meaning that there is no cointegration
relationship between the examined variables. The upper bound assumes that all the variables are
I(1), meaning that there is cointegration among the variables. When the computed F-statistic is
greater than the upper bound critical value, H₀ is rejected (the variables are cointegrated).

If the F-statistic is below the lower bound critical value, H₀ cannot be rejected (there is no
cointegration among the variables). When the computed F-statistic falls between the lower and
upper bounds, the result is inconclusive.
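The three-way decision rule just described can be written as a small helper. The critical bounds used in the example calls are placeholders, since the actual values depend on the number of regressors, the sample size, and the deterministic case in the Pesaran et al. (2001) tables.

```python
# The bounds-test decision rule, with placeholder critical bounds.
def bounds_decision(f_stat: float, lower: float, upper: float) -> str:
    if f_stat > upper:
        return "cointegration"       # reject H0 of no long-run relationship
    if f_stat < lower:
        return "no cointegration"    # cannot reject H0
    return "inconclusive"            # F falls between the two bounds

print(bounds_decision(6.2, 2.86, 4.01))   # -> cointegration
print(bounds_decision(1.5, 2.86, 4.01))   # -> no cointegration
print(bounds_decision(3.2, 2.86, 4.01))   # -> inconclusive
```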

Upon the Bounds cointegration test:

If there is no cointegration, we estimate only the short run model (as there is no long run
model), given below:

ΔLTB_t = α₀ + Σᵢ₌₁ᵖ α₁ᵢΔLTB_{t−i} + Σⱼ₌₀ᵠ α₂ⱼΔLMS2_{t−j} + Σⱼ₌₀ᵠ α₃ⱼΔLEXR_{t−j}
+ Σⱼ₌₀ᵠ α₄ⱼΔLGDP_{t−j} + Σⱼ₌₀ᵠ α₅ⱼΔLG_{t−j} + ε_t


If there is cointegration, we have to estimate both the short run and long run models. The
error correction model (ECM) representation of the trade balance ARDL model is as
follows:

ΔLTB_t = α₀ + Σᵢ₌₁ᵖ α₁ᵢΔLTB_{t−i} + Σⱼ₌₀ᵠ α₂ⱼΔLMS2_{t−j} + Σⱼ₌₀ᵠ α₃ⱼΔLEXR_{t−j}
+ Σⱼ₌₀ᵠ α₄ⱼΔLGDP_{t−j} + Σⱼ₌₀ᵠ α₅ⱼΔLG_{t−j} + πECT_{t−1} + ε_t

where π is the speed of adjustment parameter and ECT is the residual obtained from the
estimated cointegrating (long run) model.

Autoregressive Distributed Lag Model (ARDL(p,q)) Using EViews


The relevant steps for the implementation of the ARDL model using EViews 9 are as follows:
Step 1: Specification of the model in static form
 Click on Quick on the Menu bar and select Estimate Equation
 Specify the appropriate static equation in the Equation Specification Window.
Step 2: Choose the appropriate estimation technique
Click on the drop-down button in front of Methods under the Estimation settings and select
ARDL – Auto regressive Distributed Lag Models

Step 3: Choose the appropriate maximum lags and trend specification


The lag length must be selected such that the degrees of freedom (computed as n − k) are not
less than 30. The Linear Trend option under the Trend specification is also selected.

Step 4: Choose the appropriate lag selection criterion for optimal lag
Click on Options tab, then click on the drop-down button under Model Selection Criteria and
select the Schwarz Criterion (SC).
(Considering steps-1-4 above, we will have the below equation estimation dialog box)


Step 5: Estimate the model based on Steps 1 to 4 (i.e., click Ok in the above dialog box), to
obtain:


Step-6: Test the ARDL model for cointegration using the Bounds cointegration test. To do this,
on the equation estimated above, click View/Coefficient Diagnostics/Bounds Test; you will then
obtain the following test result:

As can be seen from the test result above, the F-statistic is higher than the upper bound critical
value even at the 1% level of significance. Therefore, we reject the null hypothesis of no long
run relationship. In such cases, we can estimate both the SHORT RUN and LONG RUN models;
so, consider step-7.

Step-7: Estimate the long run model


a. Click on View on the Menu Bar
b. Click on Coefficient Diagnostics
c. Select the Cointegration and Long Run Form option


(The EViews output displays the error correction term, the short run estimates, and the long run
estimates.)

Diagnostic Tests for ARDL


ARDL is a linear regression model, and therefore the underlying assumptions of the CLRM have
to be verified. These assumptions, as earlier highlighted, are linearity (correct model
specification), homoscedasticity, absence of serial correlation, and normality of the error term,
among others. Note that we have previously demonstrated how to use EViews to test for these
assumptions; the same applies to the ARDL model estimated here, so try it for yourself.

 Please go to your lecture PowerPoint for a fuller understanding of ARDL.
