

SOME NOTES ON UNIVARIATE TIME SERIES ANALYSIS


0. Basic Problem

A. A, B, C's of Time Series Models


1. Stationarity
2. Non-stationary: two important examples
a. Random walk with drift
b. Trend stationary model
c. Summary
d. Dickey-Fuller Tests
3. Random walks, trends, and spurious regressions
4. Cointegration

B. Basic ARIMA Models: ARIMA (p,d,q)

C. Characteristics of some ARIMA Models and Identification


1. An overview and simple models
a. AR(1)
b. MA(1)
c. Summary
2. AR(p)
3. MA (q)
4. ARMA (p,q)
5. ARIMA (p,d,q)

D. Diagnostic Analysis

E. Estimation

F. Forecasting

G. General comments

H. Help: Computer Programs

Appendices
AR(p) and Yule Walker equations
MA(q)
Spectral density functions
James B. McDonald
Brigham Young University
Minor revisions 3/2000

Univariate Time Series Analysis

0. Basic Problem: Given n observations on a single variable Y (Y1, Y2, ..., Yn), the problem is to obtain forecasts of Y at time period n + h, denoted by Ŷn+h or Yn(h). This might be represented pictorially as a time plot of the observed series with the forecast extending h periods beyond observation n.

The data might correspond to GNP, consumer prices, foreign exchange rates, telephone calls, unemployment rates, product sales, the number of empty beds in a hospital, commodity prices, or any other time series. The forecasting problem becomes that of attempting to obtain a forecast h periods into the future from known observations. Many techniques have been developed to extrapolate from past observations into the future. The techniques differ in structure and in the assumptions made, including the "amount" of past information used in forecasting. Some techniques assume that past levels, changes, or percentage changes can be used to forecast the next period. Other techniques are based upon "moving" averages of past values and possibly allow for trends or seasonality in the underlying series. When forecasting, we would do well to remember the admonition found in the 88th Section of the Doctrine and Covenants (verse 79) about studying things "which must shortly come to pass" and be very cautious about long-run forecasts.

Autoregressive Integrated Moving Average (ARIMA) models are a very general specification which includes some of the previously mentioned forecasting approaches as special cases, as well as many other approaches. These techniques will be studied in order of increasing "sophistication."

Before getting into the details of time series forecasting techniques, it is useful to give a brief overview of applications of these techniques. They can be used (1) on a "stand alone" basis to predict future values of a dependent variable of interest, YN(h), by extrapolating past trends; (2) to predict a future value for an independent variable (Xt), XN(h), which can be substituted into an econometric model

YN(h) = f(XN(h))

to obtain forecasts of the dependent variable; and finally (3) to predict systematic components in the residuals. Thus time series techniques can be used separately from the specification of an economic model or in conjunction with an economic model.

A. A,B,C’s of time series models

1. Stationarity

Consider a stochastic process Yt which is defined for all integer values of t. Yt is said to be weakly or covariance stationary if

E(Yt) = µ,   Var(Yt) = σ²,   Cov(Yt, Yt-s) = γs        (A.1a-c)

for all t. A stronger definition of stationarity is that the joint distribution functions F(Yt+1, ..., Yt+n) and F(Ys+1, ..., Ys+n) are identical for any values of t and s.

Consider the following four figures:

[Four figures: time plots of Yt against t.]

The first series would be classified as stationary, whereas the other three would not.

It will be useful to define the autocorrelation coefficients corresponding to Yt in terms of the autocovariances (γs):

ρs = γs/γ0 = correlation(Yt, Yt-s)        (A.2)

The plot of ρs against s is referred to as the correlogram.

[Figure: a correlogram, plotting ρs against s = 0, 1, 2, 3, ...]

An example of a stationary series is the first order autoregressive model, AR(1),

Yt = φ1 Yt-1 + εt        (A.3 a-b)

where εt ~ N[0, σ²].

From this specification, it can be shown that [do this using the lag or backshift operator]

E(Yt) = 0

Var(Yt) = σ²/(1 - φ1²) = γ0

Cov(Yt, Yt-s) = φ1^s σ²/(1 - φ1²) = φ1^s γ0 = γs

with autocorrelation coefficients given by

ρs = γs/γ0 = φ1^s,

which decay exponentially for |φ1| < 1.
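These properties are easy to verify numerically. The following sketch (not part of the original notes; it assumes Python with numpy, and the variable names are illustrative) simulates an AR(1) with φ1 = .8 and compares the sample autocorrelations with the theoretical values φ1^s:

import numpy as np

# Simulate an AR(1) series Y_t = phi1*Y_{t-1} + eps_t with |phi1| < 1
rng = np.random.default_rng(0)
phi1, n = 0.8, 1000
eps = rng.normal(0.0, 1.0, n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi1 * y[t - 1] + eps[t]

# Sample autocorrelations rho_s = gamma_s/gamma_0 for s = 0..5, next to phi1**s
yc = y - y.mean()
gamma0 = np.mean(yc * yc)
for s in range(6):
    gamma_s = gamma0 if s == 0 else np.mean(yc[s:] * yc[:n - s])
    print(s, round(gamma_s / gamma0, 3), round(phi1 ** s, 3))

With 1000 observations the sample values track the geometric decay fairly closely.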

While the AR(1) model is stationary if |φ1| < 1, many economic series are not stationary. Remember, a time series is not stationary if its mean, variance, or covariances change with time. Thus a series which increases over time, or is characterized by heteroskedasticity, is not stationary. Since many forecasting techniques are based upon the assumption that the series is stationary, it is comforting to note that some non-stationary series can be transformed into stationary series. We now consider two such series.

2. Non-stationary series: two important examples

a. Consider the random walk with drift:

Yt = µ + Yt-1 + εt        (A.4)

where εt ~ N[0, σ²]. By recursive substitution this model can be rewritten as

Yt = µt + Σ_{i=0}^{t} ε_{t-i}        (A.5)

and we note that Yt is not stationary because the mean of Yt (µt) increases linearly with t and the variance of Yt, about the linear trend, increases with time. It is important to note that the first difference of Yt, ΔYt,

Zt = Yt - Yt-1 = (Yt - BYt) = (I - B)Yt = ΔYt = µ + εt        (A.6)

is stationary. B (or L) denotes the backshift operator. Yt is said to be integrated of order one, I(1), because the first difference of the series is stationary. A series is said to be integrated of order d, I(d), if Δ^d Yt is stationary.

b. Now consider the trend stationary model defined by

Yt = µ + βt + εt        (A.7)

where εt ~ N[0, σ²]. Like the random walk with drift, the trend stationary model has a mean (µ + βt) which increases linearly with t; however, the variance of Yt about its trend is a constant σ², in contrast to the random walk with drift whose variance increases with t. The first differences of a trend stationary model,

Zt = Yt - Yt-1 = β + εt - εt-1        (A.8)

have a constant mean (β) and variance 2σ², but the errors are moving averages: the correlation between Zt and Zt-1 is -1/2 and the correlation between Zt and Zt-s for s > 1 is zero. The optimal estimation approaches for these two series are different. The best way to estimate the random walk with drift is to take first differences and then estimate the unknown parameters associated with the differenced series. The best way to work with the trend stationary series is to estimate a polynomial in "t" and then analyze the residuals. Thus, the two non-stationary series are estimated in different ways.
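A small simulation (not in the original notes; a Python/numpy sketch with illustrative names) makes the contrast concrete: differencing stabilizes the random walk with drift, while detrending stabilizes the trend stationary series.

import numpy as np

rng = np.random.default_rng(1)
n, mu, beta = 200, 0.5, 0.5
eps = rng.normal(size=n)
t = np.arange(1, n + 1)

rw = np.cumsum(mu + eps)          # random walk with drift: Y_t = mu + Y_{t-1} + eps_t
ts = mu + beta * t + eps          # trend stationary: Y_t = mu + beta*t + eps_t

# Differencing the random walk leaves mu + eps_t (stationary);
# detrending the trend-stationary series leaves roughly eps_t (stationary).
print("variance of differenced random walk:", np.var(np.diff(rw)).round(2))
print("variance of detrended trend series :",
      np.var(ts - np.polyval(np.polyfit(t, ts, 1), t)).round(2))
print("variance of the random walk in levels (grows with n):", np.var(rw).round(2))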

c. The differences between these two important series can be summarized in the following table.

Series type                                     Differences: Δ^d Yt         Detrend [regress Y on a
                                                                            polynomial time trend]
------------------------------------------------------------------------------------------------
Random walk with drift                          Optimal                     Not optimal
  Yt = µ + Yt-1 + εt
  Yt = µt + Σ_{i=0}^{t} ε_{t-i}
  Behavior: increasing mean and variance
  The impact of innovations (εj) persists.

Trend stationary                                Not optimal; standard       OLS optimal
  Yt = µ + βt + εt                              statistical tests are
  Behavior: increasing mean with                problematical
    constant variance
  The impact of innovations (εi) is transitory.
------------------------------------------------------------------------------------------------
The implications of the two different models can be important in some applications. A practical problem is that it may be difficult to differentiate between the trend and difference stationary models and, hence, to know the most efficient estimation techniques to use. One approach to discriminating between the two models is to consider the following "nesting" regression:

Yt - Yt-1 = µ(1 - γ) + βγ + β(1 - γ)t + (γ - 1)Yt-1 + εt        (A.9)

Consider how this equation simplifies for

(1) γ = 0: Trend Stationary

(2) γ = 1: Random Walk with drift, or DS (difference stationary)

There are a series of tests known as Dickey-Fuller tests which explore the null hypothesis Ho: γ = 1, and a variety of tests are based on the regression:

Yt - Yt-1 = α0 + α1 Yt-1 + α2 t + εt,   where α1 = (γ - 1).

The hypothesis γ = 1 implies that the coefficients of the variables t and Yt-1 equal 0. Standard t-tests are not appropriate; special tables of statistical significance have been constructed using Monte Carlo methods. If the coefficient of Yt-1 is negative, the evidence favors trend stationarity.

In some applications there is concern that the εt will be characterized by autocorrelation. In these cases, the augmented Dickey-Fuller test is based on estimating the equation

Yt - Yt-1 = α0 + α1 Yt-1 + α2 t + Σ_{j=1}^{p-1} φj ΔYt-j + εt,   α1 = (γ - 1)        (A.10)

where the term involving the summation has been added to pick up the impact of autocorrelation. The test for differencing is performed by testing the null hypothesis

Ho: α1 = (γ - 1) = 0.

The SHAZAM command for these tests is given by

COINT Y / options

SHAZAM reports a number of tests for unit roots and difference stationarity which are based on (A.10) above and on (A.11), defined by the equation

Yt - Yt-1 = α0 + α1 Yt-1 + Σ_{j=1}^{p-1} φj ΔYt-j + εt,   α1 = (γ - 1)        (A.11)

(A.11) would be used for series not associated with a time trend, whereas (A.10) would be used in the presence of a time trend. The summation term in (A.11) has again been added to account for the possibility of autocorrelated errors.

The tests are:

(1) α1 = 0 in (A.11): two tests

(2) α0 = α1 = 0 in (A.11) using an F test: unit root test with zero drift

(3) α1 = 0 in (A.10): two tests

(4) α0 = α1 = α2 = 0 in (A.10) using an F test: unit root test with zero drift

(5) α1 = α2 = 0 in (A.10) using an F test: unit root test with non-zero drift

Notes:

1. Tests of α1 = 0, the null hypothesis of a unit root, are one-tailed tests. The hypothesis is rejected if the estimated test statistic is less than the reported critical value. For example, for series without time trends, asymptotic critical values for the unit root t-test [with no time trend] are given by

Significance level      1%       2.5%     5%       10%
Critical value          -3.43    -3.12    -2.86    -2.57

If α̂1/s_α̂1 < -3.43, then we would reject the hypothesis of a unit root at the 1% level. In the case of unit roots, one approach is to work with first differences of the data; there are other approaches.
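For readers without SHAZAM, the same augmented Dickey-Fuller tests are available in other packages. A minimal sketch, assuming Python with statsmodels (its adfuller function), applied to a simulated random walk with drift:

import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(2)
y = np.cumsum(0.1 + rng.normal(size=250))      # a random walk with drift

# regression="ct" includes a constant and a time trend, as in (A.10);
# regression="c" includes only a constant, as in (A.11)
stat, pvalue, usedlag, nobs, crit, icbest = adfuller(y, regression="ct")
print("ADF statistic:", round(stat, 3))
print("critical values:", crit)   # reject the unit root only if stat < critical value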

2. Peter Phillips and others have explored the use of fractional differences, Δ^d with 0 < d < 1, which provide an intermediate ground between working with levels (d = 0) and working with first differences (d = 1).

3. Random Walks, Trends, and Spurious Regressions

Have you ever wondered what happens if you regress one series on an unrelated series when both series grow over time? This was the question explored by Granger and Newbold in a classic paper published in the Journal of Econometrics in 1974. Their finding may not be surprising: in these cases, standard statistical tests often suggest statistical significance when, in fact, there is no relationship. They concluded that in such cases much larger t-statistics would be needed than suggested by traditional t-tables; Granger and Newbold find a critical value of 11.2 more appropriate than 1.96. Their study was based on Monte Carlo simulations. Peter Phillips works with a more general model, uses analytical methods rather than simulation studies, and recommends the use of a critical value given by (N^0.5) times the critical t value from the table.

The bottom line of all of this is that we should consider fitting relationships to appropriately differenced or detrended data. However, there are alternatives.
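The Granger-Newbold experiment is easy to reproduce. The following Monte Carlo sketch (an illustration in Python/numpy, not part of the original notes) regresses one independent random walk on another and records how often the usual t-test spuriously "rejects":

import numpy as np

rng = np.random.default_rng(3)
n, reps, tstats = 100, 500, []
for _ in range(reps):
    y = np.cumsum(rng.normal(size=n))              # two independent random walks
    x = np.cumsum(rng.normal(size=n))
    X = np.column_stack([np.ones(n), x])
    b = np.linalg.lstsq(X, y, rcond=None)[0]       # OLS of y on a constant and x
    resid = y - X @ b
    s2 = resid @ resid / (n - 2)
    se_b1 = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    tstats.append(abs(b[1] / se_b1))

print("share of |t| > 1.96:", np.mean(np.array(tstats) > 1.96))   # far above the nominal 5%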

4. Cointegration

An interesting problem arises if Y and X are integrated of different orders. This makes it impossible for the error term εt = Yt - βXt to be stationary. Greene gives a very abbreviated treatment of this important issue.

B. Basic ARIMA Models

An important general class of stochastic models which has been widely adopted for time series is the class of autoregressive integrated moving average (ARIMA) models. These models include several of the models in the previous section as special cases and are extremely versatile in terms of their statistical properties. An ARIMA model with parameters (p,d,q) is defined as follows:

Y*t - φ1 Y*t-1 - ... - φp Y*t-p = εt - θ1 εt-1 - ... - θq εt-q        (B.1)

or

φ(B)Y*t = θ(B)εt

where

• Y*t = (I-B)^d Yt; e.g., for d = 1, Y*t = Yt - Yt-1

• B^s Yt = Yt-s

• εt ~ N(0, σ²)

• φ(B) = 1 - φ1 B - ... - φp B^p

• θ(B) = 1 - θ1 B - ... - θq B^q

This model is denoted ARIMA(p,d,q) and, as previously mentioned, includes many useful models as special cases.

Some special cases include

(1) ARIMA(1,0,0) = AR(1)

Yt - φ1 Yt-1 = εt

This is the common form used to model autocorrelation in regression models.

(2) ARIMA(p,0,0) = AR(p)

Yt - φ1 Yt-1 - ... - φp Yt-p = εt

(3) ARIMA(0,0,1) = MA(1)

Yt = εt - θ1 εt-1

(4) ARIMA(0,0,q) = MA(q)

Yt = εt - θ1 εt-1 - ... - θq εt-q

(5) ARIMA(p,0,q) = ARMA(p,q)

(6) ARIMA(0,0,0)

Yt = εt

Note that the portion of the expression on the left-hand side of the equal sign in equation (B.1) is the autoregressive (AR) portion with p lags, and that on the right-hand side is the moving average (MA) component with q lags. The d refers to the number of times the series is differenced. Other special cases include exponential smoothing, ARIMA(0,1,1), and the Holt-Winters nonseasonal model, which corresponds to ARIMA(0,2,2).

Traditional applications of ARIMA models as forecasting tools involve the following four steps:

(1) IDENTIFICATION -- determine values for p, d, and q. In other words, the appropriate number of autoregressive lags (p), the number of moving average lags (q), and the appropriate degree of differencing (d) need to be determined.

"d" is selected to be the number of times that the series must be differenced to obtain a stationary series.

Probably the most common method of "identifying" or selecting p and q is by analyzing the behavior of the estimated autocorrelation coefficients (ρi's) and the partial autocorrelation coefficients φii (related to the φi and defined shortly). Different ARIMA models will be associated with different behavioral patterns for the true autocorrelation and partial autocorrelation coefficients. These patterns depend upon the values of p and q as well as the values of the coefficients φi and θi in the model. Thus, an inspection of the autocorrelation and partial autocorrelation coefficients can help identify an ARIMA model (p, d, and q) much like fingerprints or DNA "identify" an individual. Values for p and q are selected so that the theoretical patterns of the autocorrelation and partial autocorrelation coefficients mimic the patterns observed in the estimated coefficients.

The behavior of the autocorrelation and partial autocorrelation coefficients corresponding to different values of p and q can be determined mathematically or using the computer program

THEORYTS

This program generates true population autocorrelation and partial autocorrelation coefficients corresponding to arbitrary values for p, q, φi, and θi, and provides a useful tutorial in the identification process. Sample outputs are included in these notes.

Many programs are available which will estimate the autocorrelation and partial autocorrelation coefficients corresponding to a series of data.

The SHAZAM command used in "identifying" the form of the ARIMA model is

ARIMA y / NLAG= NLAGP= PLOTAC PLOTPAC

where y is the name of the variable to be analyzed; NLAG, NLAGP, PLOTAC, and PLOTPAC indicate the number of autocorrelation and partial autocorrelation coefficients to be estimated and plotted.

Partial autocorrelation coefficients: The partial autocorrelation coefficients are denoted by φii and are useful in determining the order of the autoregressive process. φii is equal to the coefficient φi if the model has an ith order autoregressive component, AR(i). For example,

φ11 = φ1 in an AR(1)
φ22 = φ2 in an AR(2)
φ33 = φ3 in an AR(3)
.
.
φpp = φp in an AR(p)

Unfortunately, in applications we do not know the true values of φii, which must be estimated from the data. The partial autocorrelation coefficients are related to the autocorrelation coefficients through the YULE-WALKER EQUATIONS, which will be discussed later.

We will use the estimated partial autocorrelation coefficients as a tool in determining p. For example, if p = 2, then we wouldn't expect to find estimates of φ33 statistically different from zero. More on this later.

Alternative and complementary approaches to the identification process involve the use of the spectral density function or the specification of an objective function which is optimized over p, q, φ, and θ. The Akaike information criterion, AIC = -2 ln(likelihood) + 2(p+q), is an example of this procedure. The AIC is minimized over p and q as well as the coefficients in the ARIMA specification.
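As an illustration of the identification step outside SHAZAM or THEORYTS, the following sketch (assuming Python with statsmodels; the parameter values match the AR(2) example used later in these notes) computes sample autocorrelation and partial autocorrelation coefficients for a simulated series:

import numpy as np
from statsmodels.tsa.stattools import acf, pacf

rng = np.random.default_rng(4)
# Simulate an AR(2) series with phi1 = .2, phi2 = .7
n, phi1, phi2 = 500, 0.2, 0.7
y = np.zeros(n)
eps = rng.normal(size=n)
for t in range(2, n):
    y[t] = phi1 * y[t - 1] + phi2 * y[t - 2] + eps[t]

r = acf(y, nlags=10)
pr = pacf(y, nlags=10)
print("sample ACF :", np.round(r[1:], 2))    # tails off slowly
print("sample PACF:", np.round(pr[1:], 2))   # roughly two spikes, then near zero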

After the model has been identified (values for p, d, q selected), the second step involves estimating the specified model.

(2) ESTIMATION -- Given values for p and q, nonlinear estimation techniques can be employed to estimate σ² = Var(εt), φ1, ..., φp, θ1, ..., θq. Conditional maximum likelihood or nonlinear least squares estimators can be obtained. Some of the associated details are discussed in section D.

The estimation command in SHAZAM is of the form

ARIMA y / NAR=p NMA=q RESID=e COEF=b NDIF=d

where y denotes the name of the variable, NAR denotes the number of autoregressive parameters to be estimated, NMA denotes the number of moving average parameters to be estimated, RESID saves the residuals in a variable "e" to be used in possible additional diagnostics, COEF saves the coefficients in a variable "b" to be used in forecasting, and NDIF=d denotes the number of times the series should be differenced.

The estimation routine is based upon an equivalent AR(∞) representation of the errors εt:

εt = Yt - φ1 Yt-1 - ... - φp Yt-p + θ1 εt-1 + ... + θq εt-q

   = θ^{-1}(B) φ(B) Δ^d Yt

In either representation εt depends upon the φi's and θi's (the autoregressive and moving average parameters). The associated sum of squared errors is given by

SSE(φ, θ) = Σt εt²

          = Σt (Yt - φ1 Yt-1 - ... - φp Yt-p + θ1 εt-1 + ... + θq εt-q)²

          = Σt {θ^{-1}(B) φ(B) Δ^d Yt}².

This formulation gives maximum likelihood estimates for normally distributed error terms. Alternative formulations based on different probability density functions can be employed.
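A modern counterpart of the SHAZAM estimation command (a sketch only, assuming Python with statsmodels; the simulated series and variable names are illustrative) is:

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(5)
# Simulate an ARIMA(1,1,0) series: first differences follow an AR(1) with phi1 = .6
dz = np.zeros(300)
eps = rng.normal(size=300)
for t in range(1, 300):
    dz[t] = 0.6 * dz[t - 1] + eps[t]
y = np.cumsum(dz)

# Counterpart of ARIMA y / NAR=1 NMA=0 NDIF=1
fit = ARIMA(y, order=(1, 1, 0)).fit()
print(fit.params)      # estimated phi1 and error variance
resid = fit.resid      # counterpart of RESID=e, used below in diagnostics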

(3) DIAGNOSTICS -- Given that the hypothesized model has been estimated, tests are performed to check the validity of the conjectured model.

• Given estimated values for the φ's and θ's, we can obtain estimated residuals:

ε̂t = θ̂^{-1}(B) φ̂(B) Δ^d Yt

• The estimated residuals (ε̂t) can then be tested for the existence of underlying patterns. This can be done by checking for patterns in the associated autocorrelation and partial autocorrelation coefficients. In a correctly specified model, the estimated residuals should be white noise, without statistically significant autocorrelation or partial autocorrelation coefficients. A "Q statistic" provides the basis for a statistical test of the hypothesis that the autocorrelation coefficients of the residuals are zero. Use either the Box-Pierce Q statistic,

Q = T Σ_{k=1}^{p} rk² ~ χ²(p),

or the Ljung-Box Q statistic,

Q = T(T+2) Σ_{k=1}^{p} rk²/(T - k) ~ χ²(p).

• This analysis could also be performed by using the ARIMA command on the residuals and investigating the associated patterns of the autocorrelation and partial autocorrelation coefficients:

ARIMA e / NLAG= NLAGP= PLOTAC PLOTPAC

• Extra AR or MA parameters may be included, and the statistical significance of the extra terms tested using a "t-type" statistic or a likelihood ratio test.
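The Q-statistic diagnostics are likewise available in standard software. A brief sketch, assuming statsmodels and continuing with the hypothetical "fit" object from the estimation sketch above:

from statsmodels.stats.diagnostic import acorr_ljungbox

# Ljung-Box Q statistics on the estimated residuals; small p-values indicate
# remaining autocorrelation, i.e., a misspecified model.
lb = acorr_ljungbox(fit.resid, lags=[5, 10], model_df=1)   # model_df = p + q already estimated
print(lb)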

(4) FORECAST

The forecasts can be thought of as being generated from the equivalent MA(∞) representation, if it exists, of the ARIMA model identified and estimated above. This form is given by:

Yt = [θ(B)/φ(B)] εt

   = ψ(B)εt = εt + ψ1 εt-1 + ψ2 εt-2 + ...

The form of the SHAZAM forecasting command is

ARIMA name_of_var / NAR=p NMA=q COEF=b NDIF=d FBEG= FEND=

where the options are defined as above and FBEG and FEND denote the first and last observations to be forecast.
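A counterpart of the SHAZAM forecasting command, again continuing the hypothetical statsmodels sketch from the estimation step:

# Point forecasts and asymptotic intervals from the fitted ARIMA model above
fc = fit.get_forecast(steps=8)
print(fc.predicted_mean)             # point forecasts Y_N(1), ..., Y_N(8)
print(fc.conf_int(alpha=0.05))       # approximate 95% forecast intervals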

This section has attempted to give a brief overview of alternative models, their characteristics and identification, estimation, model diagnostics, and forecasting procedures. Each of these issues will be considered in additional detail in subsequent sections. We turn to a more thorough analysis of a number of simple ARIMA(p,d,q) models, including an investigation of the behavior of the associated autocorrelation coefficients.
C. Characteristics of some ARIMA Models and Identification

(1) An overview and introduction.

We begin by considering the behavior of the autocorrelation coefficients and partial autocorrelation coefficients of a first order autoregressive model, AR(1), and a first order moving average model, MA(1). Then we will analyze AR(p), MA(q), and ARIMA(p,d,q) models.

(a) Autoregressive Model of Order 1 [ARIMA(1,0,0) or AR(1)]

Yt - φ1 Yt-1 = εt        (C.1)

where εt ~ N[0, σ²].

From this specification, it can be shown that

E(Yt) = 0        (C.2)

Var(Yt) = σ²/(1 - φ1²) = γ0

Cov(Yt, Yt-s) = φ1^s σ²/(1 - φ1²) = φ1^s γ0 = γs

The autocorrelation coefficients

ρs = γs/γ0 = φ1^s

decay exponentially. Note φ1 = ρ1.

The partial autocorrelation coefficients, denoted φii, can be shown to be φ11 = ρ1 and φii = 0 for i > 1.

Some sample output from the THEORYTS program illustrates these patterns. The reader might experiment with some other values of φ1.
DOS mode: TheoryTS
ENTER THE NUMBER OF AUTOREGRESSIVE PARAMETERS: 1
ENTER THE NUMBER OF MOVING AVERAGE PARAMETERS: 0
ENTER THE NUMBER OF AUTOCORRELATION COEFFICIENTS: 15
ENTER THE VALUE OF PHI(1): .8
****AUTO CORRELATION COEFFICIENTS****
LAGS VALUES
15 I* 0.035
14 I* 0.044
13 I* 0.055
12 I* 0.069
11 I* 0.086
10 I* 0.107
9 I * 0.134
8 I * 0.168
7 I * 0.210
6 I * 0.262
5 I * 0.328
4 I * 0.410
3 I * 0.514
2 I * 0.640
1 I * 0.800
I.I.I.I.I.I.I.I.I.I.I.I.I.I
-1 0 +1

ENTER THE NUMBER OF PARTIALS: 15


* * * PARTIAL AUTOCORRELATION COEFFICIENTS * * *
LAGS VALUES
15 * -0.000
14 * -0.000
13 * 0.000
12 * 0.000
11 * -0.000
10 * 0.000
9 * -0.000
8 * 0.000
7 * -0.000
6 * 0.000
5 * -0.000
4 * 0.000
3 * -0.000
2 * -0.000
1 I * 0.800
I.I.I.I.I.I.I.I.I.I.I.I.I.I.I
-1 0 +1
ENTER 1 FOR NEW PROCESS. 2 FOR THE SAME PROCESS: 1
ENTER THE NUMBER OF AUTOREGRESSIVE PARAMETERS:
ENTER THE NUMBER OF MOVING AVERAGE PARAMETERS: 0
ENTER THE NUMBER OF AUTOCORRELATION COEFFICIENTS: 15
ENTER THE VALUE OF PHI(1): -.8
* * * * AUTO CORRELATION COEFFICIENTS * * * *
LAGS VALUE
15 *I -0.035
14 I* 0.044
13 *I -0.055
12 I* 0.069
11 * I -0.086
10 I * 0.107
9 * I -0.134
8 I * 0.168
7 * I -0.210
6 I * 0.262
5 * I -0.320
4 I * 0.410
3 * I -0.512
2 I * 0.640
1 * I -0.800
I.I.I.I.I.I.I.I.I.I.I.I.I.I.I
ENTER THE NUMBER OF PARTIALS: 15

* * * PARTIAL AUTOCORRELATION COEFFICIENTS * * *

LAGS VALUES
15 * -0.000
14 * -0.000
13 * -0.000
12 * -0.000
11 * -0.000
10 * -0.000
9 * -0.000
8 * -0.000
7 * -0.000
6 * -0.000
5 * -0.000
4 * -0.000
3 * -0.000
2 * -0.000
1 * I -0.800
I.I.I.I.I.I.I.I.I.I.I.I.I.I.I
-1 0 +1
Note that in each case the autocorrelation coefficients decline geometrically as φ1^s. The partial autocorrelation coefficients are all equal to zero except for φ11 (the first), which is equal to φ1. It will also be instructive to note that an AR(1) with |φ1| < 1 can be written as an infinite moving average, MA(∞). This demonstration, along with a derivation of the previous results, follows. It might also be mentioned that if the process is AR(p), then the first p partial autocorrelation coefficients may be nonzero and all others are zero. The corresponding autocorrelation coefficients decline to zero.

Details associated with the previous results:

Yt - φ1 Yt-1 = (1 - φ1 B)Yt = εt

Yt = [1/(1 - φ1 B)] εt = Σ_{i=0}^{∞} φ1^i B^i εt

   = εt + φ1 εt-1 + φ1² εt-2 + ...

E(Yt) = E(Σ φ1^i εt-i) = Σ φ1^i E(εt-i) = 0

Var(Yt) = Var(Σ φ1^i εt-i) = Σ φ1^{2i} Var(εt-i) = σ² Σ φ1^{2i} = σ²/(1 - φ1²)

γs = E(Yt Yt-s) = E[(εt + φ1 εt-1 + ... + φ1^s εt-s + ...)(εt-s + φ1 εt-s-1 + ...)]

   = φ1^s E(εt-s + φ1 εt-s-1 + ...)²

   = φ1^s Var(Yt-s) = φ1^s σ²/(1 - φ1²)

ρs = γs/γ0 = φ1^s

(b) Moving average of order 1, ARIMA(0,0,1) or MA(1)

The MA(1) model is defined by

Yt = εt - θ1 εt-1 = (1 - θ1 B)εt = θ(B)εt

The MA(1) model is invertible and can be written as an AR(∞) if |θ1| < 1. In particular,

εt = θ^{-1}(B)Yt = (1 - θ1 B)^{-1} Yt = Σ_{i=0}^{∞} θ1^i B^i Yt = Σ_{i=0}^{∞} θ1^i Yt-i

or

Yt + θ1 Yt-1 + θ1² Yt-2 + ... = εt

which is an AR(∞).

Looking at the form of the coefficients of the lagged Yt's suggests that the partial autocorrelation coefficients will decline geometrically. This is in fact what happens for an MA(1).

In order to determine the behavior of the autocorrelation coefficients, we derive expressions for ρs.

From the form of the moving average model we obtain the following:

• E(Yt) = E(εt - θ1 εt-1) = E(εt) - θ1 E(εt-1) = 0

• Var(Yt) = Var(εt - θ1 εt-1) = Var(εt) + θ1² Var(εt-1) = σ² + θ1² σ² = σ²(1 + θ1²)

• Cov(Yt, Yt-1) = E(Yt Yt-1) = E[(εt - θ1 εt-1)(εt-1 - θ1 εt-2)] = -θ1 E(εt-1)² = -θ1 σ²

• Cov(Yt, Yt-2) = E[(εt - θ1 εt-1)(εt-2 - θ1 εt-3)] = 0

• Cov(Yt, Yt-s) = E[(εt - θ1 εt-1)(εt-s - θ1 εt-s-1)] = 0 for s > 1

Therefore

ρs = -θ1/(1 + θ1²)   for s = 1
   = 0               for s = 2, 3, ...

This result demonstrates that a moving average process of order 1, MA(1), only has a "memory" of one period. Similarly, a moving average process of order q, MA(q), only has a "memory" of q periods. In other words, for an MA(q) model ρs = 0 for s > q; the impact of an innovation εt on the Y's will completely die out after q periods.

The following computer printout illustrates the typical behavior of the autocorrelation and partial autocorrelation coefficients corresponding to an MA(1) with θ1 = .9. Note that there is only one nonzero autocorrelation coefficient and the partial autocorrelation coefficients decline geometrically. Computational details will be reviewed following the graphs.
ENTER 1 FOR NEW PROCESS. 2 FOR THE SAME PROCESS: 1
ENTER THE NUMBER OF AUTOREGRESSIVE PARAMETERS: 0
ENTER THE NUMBER OF MOVING AVERAGE PARAMETERS: 1
ENTER THE NUMBER OF AUTOCORRELATION COEFFICIENTS: 15
ENTER THE VALUE OF THETA (1): .9
* * * * AUTO CORRELATION COEFFICIENTS * * * *
LAGS VALUES
15 * 0.000
14 * 0.000
13 * 0.000
12 * 0.000
11 * 0.000
10 * 0.000
9 * 0.000
8 * 0.000
7 * 0.000
6 * 0.000
5 * 0.000
4 * 0.000
3 * 0.000
2 * 0.000
1 * I -0.497
I.I.I.I.I.I.I.I.I.I.I.I.I.I.I
-1 0 +1

ENTER THE NUMBER OF PARTIALS: 15


* * * PARTIAL AUTOCORRELATION COEFFICIENTS * * *
LAGS VALUES
15 *I -0.041
14 *I -0.045
13 *I -0.051
12 *I -0.057
11 *I -0.065
10 *I -0.073
9 * I -0.084
8 * I -0.096
7 * I -0.112
6 * I -0.131
5 * I -0.156
4 * I -0.191
3 * I -0.243
2 * I -0.329
1 * I -0.497
I.I.I.I.I.I.I.I.I.I.I.I.I.I.I
-1 0 +1
Computational details associated with the previous example:

ρi = 0 for i > 1

ρ1 = -θ1/(1 + θ1²) = -.9/(1 + .81) = -.497
(c) In summary and as a preview, the patterns of the autocorrelation and partial autocorrelation coefficients for an AR(p) and an MA(q) "appear" as follows:

AR(p): the autocorrelation coefficients tail off, while the partial autocorrelation coefficients show p "spikes" and then cut off.

MA(q): the autocorrelation coefficients show q "spikes" and then cut off, while the partial autocorrelation coefficients tail off.

[Figures: sketches of the autocorrelation and partial autocorrelation functions for an AR(p) and an MA(q).]

Sampling variation and the possible presence of both autoregressive and moving average components make the identification process more difficult.

We now turn to an analysis of higher order and mixed processes.

(2) Autoregressive model of order p, ARIMA(p,0,0) or AR(p)

Yt - φ1 Yt-1 - ... - φp Yt-p = εt        (C.3)

φ(B)Yt = εt        (C.4)

(a) An AR(p) will be stationary, and can be written as an MA(∞), if the roots of φ(z) are greater than one in absolute value.

Details:

φ(z) = 1 - φ1 z - φ2 z² - ... - φp z^p

     = (1 - φ̃1 z)(1 - φ̃2 z) ... (1 - φ̃p z)        (factoring the polynomial)

The roots of φ(z) are then given by (1 - φ̃i z) = 0, i.e., z = 1/φ̃i.

Now consider

Yt = {1/φ(B)} εt

   = [1/(1 - φ̃1 B)] ... [1/(1 - φ̃p B)] εt

   = [Σ_{i=0}^{∞} φ̃1^i B^i] ... [Σ_{j=0}^{∞} φ̃p^j B^j] εt

   = {1 + φ̃1 B + φ̃1² B² + ...} ... {1 + φ̃p B + φ̃p² B² + ...} εt

   = {1 + (φ̃1 + φ̃2 + ... + φ̃p)B + (φ̃1² + ... + φ̃p² + φ̃1φ̃2 + ... + φ̃p-1φ̃p)B² + ...} εt

which is valid if the |φ̃i| < 1, i.e., the roots of φ(z) are greater than one in absolute value.

(b) E(Yt) = 0 if the series (C.3) is stationary.

(c) Yule-Walker equations

The relationship between the autocorrelation coefficients and the φi's in (C.3) is given by

[ρ1]     [1      ρ1     ...  ρp-1] [φ1]
[ρ2]  =  [ρ1     1      ...  ρp-2] [φ2]        (C.5)
[...]    [...    ...         ... ] [...]
[ρp]     [ρp-1   ρp-2   ...  1   ] [φp]

or

ρk = Σ_{j=1}^{p} φj ρk-j   for k > 0.

(C.5) is referred to as the system of Yule-Walker equations. The φi's can then be expressed in terms of the ρi's, and the ρi's in terms of the φi's. For example, for

p = 1:  ρ1 = φ1

p = 2:  [ρ1]   [1   ρ1] [φ1]
        [ρ2] = [ρ1  1 ] [φ2]

        φ1 = ρ1(1 - ρ2)/(1 - ρ1²),    ρ1 = φ1/(1 - φ2)

        φ2 = (ρ2 - ρ1²)/(1 - ρ1²),    ρ2 = [φ2(1 - φ2) + φ1²]/(1 - φ2)

Derivation of the Yule-Walker equations: Multiply (C.3) by Yt-k and take the expected value:

E(Yt Yt-k) - φ1 E(Yt-1 Yt-k) - ... - φp E(Yt-p Yt-k) = E(εt Yt-k)

or

γk - φ1 γk-1 - ... - φp γk-p = 0   for k > 0.

Dividing by γ0 = Var(Yt) yields

ρk - φ1 ρk-1 - ... - φp ρk-p = 0.

(d) Partial Autocorrelation Coefficients

If different values of p are selected and the "last coefficient" φp is obtained for each p using the Yule-Walker equations, then these coefficients are referred to as partial autocorrelation coefficients and are useful in determining the order of the autoregressive process. This is analogous to deciding how many terms to include in a multiple regression. For an autoregressive process of order p, the first p partial autocorrelations will be nonzero and higher order partial autocorrelation coefficients will equal zero. The partial autocorrelation coefficients are denoted by φii.

As an example, for p = 1 the Yule-Walker equations give

ρ1 = φ1 = φ11

For p = 2, the Yule-Walker equations are

[ρ1]   [1   ρ1] [φ1]
[ρ2] = [ρ1  1 ] [φ2];

therefore, using Cramer's rule to solve for φ2 yields (writing |·| for a determinant with rows separated by semicolons)

φ22 = |1 ρ1; ρ1 ρ2| / |1 ρ1; ρ1 1| = (ρ2 - ρ1²)/(1 - ρ1²).

For p = 3, the Yule-Walker equations are

[ρ1]   [1   ρ1  ρ2] [φ1]
[ρ2] = [ρ1  1   ρ1] [φ2].
[ρ3]   [ρ2  ρ1  1 ] [φ3]

The corresponding third partial autocorrelation coefficient is

φ33 = |1 ρ1 ρ1; ρ1 1 ρ2; ρ2 ρ1 ρ3| / |1 ρ1 ρ2; ρ1 1 ρ1; ρ2 ρ1 1|.

More generally,

φii = |Ri*| / |Ri|,

where Ri is the i×i matrix with (j,k) element ρ_{|j-k|} (and ρ0 = 1), and Ri* is Ri with its last column replaced by (ρ1, ρ2, ..., ρi)'.

Note:

If the actual value of p is 1, then

φ11 = ρ1

φ22 = (ρ2 - ρ1²)/(1 - ρ1²) = (ρ1² - ρ1²)/(1 - ρ1²) = 0

since ρi = φ1^i for a first order autoregressive process. Therefore,

φii = 0 for i ≥ 2.

The determination of an appropriate estimate for p becomes a statistical question. The ρk can be estimated by

ρ̂k = Σ_{t=1}^{n-k} (Yt - Ȳ)(Yt+k - Ȳ) / Σ_{t=1}^{n} (Yt - Ȳ)²        (C.6)

The Yule-Walker equations can then be used to estimate the corresponding partial autocorrelation coefficients. The associated asymptotic standard errors were shown by Quenouille to be

s_φ̂kk = (1/T)^{1/2}   for k ≥ p + 1.        (C.7)

If one assumes that n (the number of observations used in fitting) is large enough for φ̂kk to be approximately normally distributed, we have a procedure which can be used in determining a reasonable value of p.
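The mapping from autocorrelations to partial autocorrelations can be carried out directly by solving the Yule-Walker system of increasing order. A short sketch (Python/numpy, not part of the original notes) using the AR(1) values ρs = .8^s reproduces φ11 = .8 and φkk = 0 for k > 1:

import numpy as np

def pacf_from_acf(rho, kmax):
    # Partial autocorrelations phi_kk: for each k, solve the order-k Yule-Walker
    # system and keep the last element of the solution.
    out = []
    for k in range(1, kmax + 1):
        R = np.array([[rho[abs(i - j)] for j in range(k)] for i in range(k)])  # Toeplitz matrix
        r = np.array([rho[i] for i in range(1, k + 1)])
        out.append(np.linalg.solve(R, r)[-1])
    return np.array(out)

# AR(1) with phi1 = .8: rho_s = .8**s, so phi_11 = .8 and phi_kk = 0 for k > 1
rho = [0.8 ** s for s in range(11)]
print(np.round(pacf_from_acf(rho, 10), 3))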
The following computer printout corresponds to the theoretical autocorrelation and partial autocorrelation coefficients for an AR(2) with (φ1, φ2) = (.2, .7). The numerical calculations associated with this graph follow.
ENTER 1 FOR NEW PROCESS. 2 FOR THE SAME PROCESS: 1
ENTER THE NUMBER OF AUTOREGRESSIVE PARAMETERS: 2
ENTER THE NUMBER OF MOVING AVERAGE PARAMETERS: 0
ENTER THE NUMBER OF AUTOCORRELATION COEFFICIENTS: 15
ENTER THE VALUE OF PHI(1): .2
ENTER THE VALUE OF PHI(2): .7
* * * * AUTO CORRELATION COEFFICIENTS * * * *
LAGS VALUES
15 I * 0.343
14 I * 0.368
13 I * 0.384
12 I * 0.416
11 I * 0.430
10 I * 0.471
9 I * 0.480
8 I * 0.536
7 I * 0.533
6 I * 0.614
5 I * 0.585
4 I * 0.710
3 I * 0.633
2 I * 0.833
1 I * 0.667

I.I.I.I.I.I.I.I.I.I.I.I.I.I.I
-1 0 +1

ENTER THE NUMBER OF PARTIALS 15


* * * PARTIAL AUTOCORRELATION COEFFICIENTS * * *
LAGS VALUES
15 -0.000
14 0.000
13 0.000
12 * 0.000
11 * 0.000
10 * -0.000
9 * 0.000
8 * -0.000
7 * -0.000
6 * 0.000
5 * -0.000
4 * 0.000
3 * -0.000
2 I * 0.700
1 I * 0.667
I.I.I.I.I.I.I.I.I.I.I.I.I.I.I
-1 0 +1
Numerical calculations associated with this example (Yule-Walker equations for p = 2):

ρk = φ1 ρk-1 + φ2 ρk-2

ρ0 = 1

ρ1 = φ1/(1 - φ2) = .2/(1 - .7) = 2/3 = .667

ρ2 = [φ2(1 - φ2) + φ1²]/(1 - φ2) = [(.7)(.3) + .04]/(1 - .7) = .25/.30 = .833

φ11 = ρ1 = .667

φ22 = (ρ2 - ρ1²)/(1 - ρ1²) = (5/6 - (2/3)²)/(1 - (2/3)²) = 21/30 = .7

φ33 = 0

(3) qth order moving average model, ARIMA(0,0,q) or MA(q)

Yt = εt - θ1 εt-1 - θ2 εt-2 - ... - θq εt-q        (C.8)

   = θ(B)εt

In order for the process defined by (C.8) to be invertible, the roots of θ(z) must have modulus greater than one, where

θ(z) = Π_{i=1}^{q} (1 - θ̃i z) = 0.

Hint: θ^{-1}(B) = Π_{j=1}^{q} (1 - θ̃j B)^{-1} = Π_{j=1}^{q} Σ_{i=0}^{∞} (θ̃j B)^i is valid if |θ̃j| < 1 for all j. The roots of θ(z) are equal to 1/θ̃j.

The autocovariances and autocorrelations can be evaluated by considering

γk = E(Yt Yt-k)

   = E[(εt - θ1 εt-1 - ... - θq εt-q)(εt-k - θ1 εt-k-1 - ... - θq εt-k-q)]        (C.9)

γ0 = (1 + θ1² + ... + θq²)σ²

γk = (-θk + θ1 θk+1 + θ2 θk+2 + ... + θq-k θq)σ²   for k = 1, 2, ..., q        (C.10)

   = 0   for k > q

ρk = (-θk + θ1 θk+1 + ... + θq-k θq)/(1 + θ1² + ... + θq²)   for k = 1, 2, ..., q        (C.11)

   = 0   for k > q

From (C.10) we see that the autocorrelation function of an MA(q) has a "cut-off" at lag q. We might say that an MA(q) has a memory of length q.

Bartlett's approximation for the standard error of estimators of ρk is useful in determining an estimate of q. For any process for which the autocorrelations ρi are zero for i > q, Bartlett's approximation is given by

s²_ρ̂k = (1/T)[1 + 2 Σ_{i=1}^{q} ρi²]   for k > q.        (C.12)

The reader should be reminded that the appropriateness of the convention of using the limiting normal density with (C.7) or (C.12) for purposes of assessing statistical significance is questionable for small samples.

Consider the following example. The autocorrelation and partial autocorrelation coefficients corresponding to the

MA(2): Yt = εt - .5 εt-1 - .3 εt-2

are given in the next figure, with

ρ1 = (-θ1 + θ1 θ2)/(1 + θ1² + θ2²) = -.26   and   ρ2 = -θ2/(1 + θ1² + θ2²) = -.224.
ENTER 1 FOR NEW PROCESS. 2 FOR THE SAME PROCESS: 1
ENTER THE NUMBER OF AUTOREGRESSIVE PARAMETERS: 0
ENTER THE NUMBER OF MOVING AVERAGE PARAMETERS: 2
ENTER THE NUMBER OF AUTOCORRELATION COEFFICIENTS: 15
ENTER THE VALUE OF THETA(1): .5
ENTER THE VALUE OF THETA(2): .3
* * * AUTO CORRELATION COEFFICIENTS * * *
LAGS VALUES
15 * 0.000
14 * 0.000
13 * 0.000
12 * 0.000
11 * 0.000
10 * 0.000
9 * 0.000
8 * 0.000
7 * 0.000
6 * 0.000
5 * 0.000
4 * 0.000
3 * 0.000
2 * I -0.224
1 * I -0.261
I.I.I.I.I.I.I.I.I.I.I.I.I.I.I
-1 0 +1

ENTER THE NUMBER OF PARTIALS: 15

* * * PARTIAL AUTOCORRELATION COEFFICIENTS * * *


LAGS VALUES

15 * -0.023
14 *I -0.027
13 *I -0.032
12 *I -0.037
11 *I -0.044
10 *I -0.052
9 *I -0.062
8 *I -0.074
7 * I -0.088
6 * I -0.107
5 * I -0.127
4 * I -0.165
3 * I -0.189
2 * I -0.313
1 * I -0.261
I.I.I.I.I.I.I.I.I.I.I.I.I.I.I
-1 0 +1

(4) Autoregressive moving average processes, ARIMA(p,0,q) or ARMA(p,q)

Yt - φ1 Yt-1 - ... - φp Yt-p = εt - θ1 εt-1 - ... - θq εt-q        (C.13)

φ(B)Yt = θ(B)εt

The process defined by (C.13) will be stationary if the roots of φ(B) have modulus greater than one and will be invertible if the roots of θ(B) have modulus greater than one.

Given that the conditions of stationarity and invertibility are satisfied, we note that an ARMA(p,q) process can be expressed as

AR(∞):  θ^{-1}(B)φ(B)Yt = εt        (C.14)

MA(∞):  Yt = φ^{-1}(B)θ(B)εt        (C.15)

Multiplying (C.13) by Yt-k and taking expected values, we see that

ρk = φ1 ρk-1 + ... + φp ρk-p + ρk(Y,ε) - θ1 ρk-1(Y,ε) - ... - θq ρk-q(Y,ε)        (C.16)

where ρi(Y,ε) = E(Yt-i εt) = 0 for i > 0. (C.16) simplifies to

ρk = φ1 ρk-1 + φ2 ρk-2 + ... + φp ρk-p   for k ≥ q + 1.

The first q autocorrelation coefficients will depend upon the moving average parameters θi as well as the autoregressive parameters φj. Therefore, the autocorrelation coefficients will exhibit an irregular pattern at lags 1 through q and then tail off according to (C.16).

The process (C.13) is equivalent to an AR(∞); hence, the partial autocorrelation coefficients eventually tail off in the same manner as with a pure autoregressive process.

These results will assist in the determination of p and q. The following tables are taken from Nelson (p. 89) and Box and Jenkins (pp. 176-7) and provide useful summary information to assist in the determination of p and q. It should be noted that the autocorrelation and partial autocorrelation coefficients provide the basis for this determination. Recall that the asymptotic standard errors of ρ̂k and φ̂kk are given by

s_ρ̂k = (1/T)^{1/2} [1 + 2(ρ1² + ... + ρq²)]^{1/2}   for k > q

s_φ̂kk = (1/T)^{1/2}   for k > p.

Table 1. Characteristic behavior of autocorrelations and partial autocorrelations for three classes of processes.

Class of process               Autocorrelations                          Partial autocorrelations
_____________________________________________________________________________
Moving average                 Spikes at lags 1 through q,               Tail off
                               then cut off

Autoregressive                 Tail off according to                     Spikes at lags 1 through p,
                               ρj = φ1 ρj-1 + ... + φp ρj-p              then cut off

Mixed autoregressive-          Irregular pattern at lags 1 through q,    Tail off
moving average                 then tail off according to
                               ρj = φ1 ρj-1 + ... + φp ρj-p
_____________________________________________________________________________
Table 2. Behavior of the autocorrelation functions for the dth difference of an ARIMA process of order (p,d,q). (Table A and Charts B, C and D are included at the end of this volume to facilitate the calculation of approximate estimates of the parameters for first-order moving average, second-order autoregressive, second-order moving average, and for the mixed ARMA(1,1) process.)
_____________________________________________________________________________
Order                      (1,d,0)                              (0,d,1)
_____________________________________________________________________________
Behavior of ρk             decays exponentially                 only ρ1 nonzero

Behavior of φkk            only φ11 nonzero                     decays exponentially

Preliminary estimates      φ1 = ρ1                              ρ1 = -θ1/(1 + θ1²)

Admissible region          -1 < φ1 < 1                          -1 < θ1 < 1
_____________________________________________________________________________
Order                      (2,d,0)                              (0,d,2)
_____________________________________________________________________________
Behavior of ρk             mixture of exponentials              only ρ1 and ρ2 nonzero
                           or damped sine wave

Behavior of φkk            only φ11 and φ22 nonzero             dominated by mixture of
                                                                exponentials or damped
                                                                sine wave

Preliminary estimates      φ1 = ρ1(1 - ρ2)/(1 - ρ1²)            ρ1 = -θ1(1 - θ2)/(1 + θ1² + θ2²)

                           φ2 = (ρ2 - ρ1²)/(1 - ρ1²)            ρ2 = -θ2/(1 + θ1² + θ2²)

Admissible region          -1 < φ2 < 1                          -1 < θ2 < 1
                           φ2 + φ1 < 1                          θ2 + θ1 < 1
                           φ2 - φ1 < 1                          θ2 - θ1 < 1
_____________________________________________________________________________
Order                      (1,d,1)
_____________________________________________________________________________
Behavior of ρk             decays exponentially from first lag

Behavior of φkk            dominated by exponential decay from first lag

Preliminary estimates      from ρ1 = (1 - θ1φ1)(φ1 - θ1)/(1 + θ1² - 2φ1θ1)   and   ρ2 = ρ1 φ1

Admissible region          -1 < φ1 < 1,   -1 < θ1 < 1
_____________________________________________________________________________

(5) ARIMA(p,d,q). The foregoing discussion has assumed that the underlying series is stationary. If this is not the case, for any of several reasons, the previous approaches are not strictly appropriate. For example, if a time series doesn't have a fixed mean, then the series isn't stationary. It will frequently be the case that Yt - Yt-1 = (I-B)Yt, or (I-B)²Yt, or (I-B)^d Yt, will be stationary. If such a value of d can be determined, then the previous techniques can be applied to the "differenced" series, (I-B)^d Yt. For example, if Yt is basically distributed with constant variance about a linear (quadratic) trend, then (I-B)Yt ((I-B)²Yt) will be stationary. If Yt shows evidence of trend stationarity, then Y could be regressed on a polynomial in "t" and the previously described techniques applied to the resulting residuals. Sometimes a nonlinear transformation of Yt, such as ln Yt, may facilitate the search for a stationary process.

D. Diagnostic Analysis

There are several approaches which can be utilized to determine the "validity" of the estimated model; three of the most common involve (1) considering more general models, (2) an analysis of the estimated residuals, and (3) the Q-statistic.

(1) Generalized Model. Assume that an ARIMA(p,d,q) has been "identified." The researcher might estimate an ARIMA(p',d,q'), where p' and q' are larger than p and q, and then check the statistical significance of the additional coefficients. This approach has at least two limitations: the validity of the statistical inference (test statistics) is questionable for small samples, and an ARIMA(p,d,q) process is uniquely determined by the autocorrelation structure only up to a multiple of polynomials in B. "t-type" statistics or likelihood ratio tests may be used.

(2) Analysis of Estimated Residuals

ε̂t = θ̂^{-1}(B) φ̂(B) (I-B)^d Yt

One might consider an analysis of the behavior of

ρ̂k(ε̂t) = Σ ε̂t ε̂t-k / Σ ε̂t²

and the associated partial autocorrelation coefficients.

It should be mentioned that the distributional characteristics of ε̂t are not necessarily exactly the same as those of εt. Autocorrelation and partial autocorrelation coefficients of the ε̂t's are frequently used in this analysis. If the model has been correctly specified, the estimated residuals and the associated autocorrelation and partial autocorrelation coefficients should correspond to white noise. However, if the estimated residuals appear to have an AR or MA component, then the model should be respecified. For example, if the ε̂t's appear to be an AR(1), then one more AR component should be included in the specification for the Yt series.

(3) Q-Statistics.

Box and Pierce define

Q = n Σ_{k=1}^{M} ρ̂k²(ε̂t)

or, alternatively (Ljung-Box),

Q = T(T+2) Σ_{k=1}^{M} ρ̂k²/(T - k),

each of which is approximately distributed as χ²(M - p - q) under the hypothesis that the εt are independently and identically distributed as N(0, σ²). This follows because ρ̂i is approximately N[0, 1/T^0.5] if the model is correctly specified. The hypothesis of white noise residuals is rejected by large values of Q.

E. Estimation

Once values for p and q have been determined, the coefficients in the ARIMA(p,d,q) model need to be estimated in order to use the model.

(1 - φ1 B - ... - φp B^p)Yt = (1 - θ1 B - ... - θq B^q)εt        (E.1)

or

φ(B)Yt = θ(B)εt

Note that εt can be explicitly expressed as

εt = Yt - φ1 Yt-1 - ... - φp Yt-p + θ1 εt-1 + ... + θq εt-q        (E.2)

   = φ(B)Yt + θ1 εt-1 + ... + θq εt-q

and also as

εt = θ^{-1}(B) φ(B)Yt.        (E.3)

In either representation εt depends upon the φi's and θi's (the autoregressive and moving average parameters). The associated sum of squared errors is given by

SSE(φ, θ) = Σt εt²        (E.4)

          = Σt (Yt - φ1 Yt-1 - ... - φp Yt-p + θ1 εt-1 + ... + θq εt-q)²

          = Σt {θ^{-1}(B) φ(B)Yt}².

Under the assumption that the εt's are independently and identically distributed as N(0, σ²), the log likelihood function is

ln L(θ, φ, σ²) = ln [ e^{-Σ εt²/2σ²} / ((2π)^{N/2} (σ²)^{N/2}) ]

               = -SSE(φ, θ)/(2σ²) - (N/2) ln(2π) - (N/2) ln(σ²).

Hence, minimizing SSE(φ, θ) with respect to φ and θ is a necessary part of obtaining maximum likelihood estimators.

A close inspection of the expression for SSE reveals that εt depends on previous random disturbances and on observations of the variable Y, e.g.,

ε1 = Y1 - φ1 Y0 - φ2 Y-1 - ... - φp Y1-p + θ1 ε0 + ... + θq ε1-q.

Several approaches to this problem have been proposed; it is frequently referred to as the question of how to initialize the series. One approach is to replace the unobservable values of Yt and εt by their expected values, in this case 0; hence

ε1 = Y1

ε2 = Y2 - φ1 Y1 + θ1 ε1

ε3 = Y3 - φ1 Y2 - φ2 Y1 + θ1 ε2 + θ2 ε1

The associated sum of squared error terms is

Σ_{t=1}^{N} εt².

Another approach is to start the sum at t = p + 1:

SSE = Σ_{t=p+1}^{N} (Yt - φ1 Yt-1 - ... - φp Yt-p + θ1 εt-1 + ... + θq εt-q)²

where εp, εp-1, ..., εp+1-q are set equal to zero.

The minimization of (E.4) requires a nonlinear optimization routine if there is a moving average component in the series, i.e., if some of the θi's are nonzero. If there is no moving average component (all θi's = 0), then a regular linear regression package can be used in the estimation process. A number of studies have found that autoregressive models perform very well. Recall that an ARIMA(p,d,q) model can be expressed as an AR(∞).

Nonlinear optimization routines require that initial estimates of the parameters be provided. The Yule-Walker equations are frequently used for this purpose:

[φ1]     [1      ρ1     ...  ρp-1]^{-1} [ρ1]
[...]  = [ρ1     1      ...  ρp-2]      [...]
[φp]     [ρp-1   ρp-2   ...  1   ]      [ρp]

where ρi is estimated by

ρ̂i = Σ(Yt - Ȳ)(Yt-i - Ȳ) / Σ(Yt - Ȳ)².

F. Forecasts

Let YN(h) denote a forecast of YN+h made at time t = N. Assume that Yt is an ARIMA(p,d,q) stochastic process, i.e.,

φ(B)Yt = θ(B)εt        (F.1)

Yt can be expressed in terms of εt as

Yt = [θ(B)/φ(B)] εt

   = ψ(B)εt

   = εt + ψ1 εt-1 + ψ2 εt-2 + ...        (F.2)

From (F.2) we can express YN+h as

YN+h = εN+h + ψ1 εN+h-1 + ... + ψh-1 εN+1  +  ψh εN + ψh+1 εN-1 + ...        (F.3)
       [unknown at time t = N]                [can be estimated at t = N]

It can be shown that the optimal (minimum mean squared error) forecast of YN+h at time N is given by

YN(h) = ψh εN + ψh+1 εN-1 + ψh+2 εN-2 + ... = Σ_{j=0}^{∞} ψh+j εN-j        (F.4)

A recurrence relationship can also be developed from (F.3) which facilitates the evaluation of forecasts:

YN(h) = Σ_{i=1}^{p} φi YN(h-i) + Σ_{i=0}^{∞} θi+h εN-i

      = φ1 YN(h-1) + ... + φp YN(h-p) + θh εN + θh+1 εN-1 + ...

(terms with θ subscripts greater than q vanish, since θi = 0 for i > q).

As an example, consider forecasts corresponding to an AR(1) model:

YN(h) = φ1 YN(h-1)

YN(1) = φ1 YN

YN(h) = φ1^h YN

The forecast error is defined by

eN(h) = YN+h - YN(h)        (F.6)

      = Σ_{i=0}^{∞} ψi εN+h-i - Σ_{j=0}^{∞} ψh+j εN-j

      = εN+h + ψ1 εN+h-1 + ... + ψh-1 εN+1

(see equations F.3 and F.4).

The variance of the forecast error is given by

Var(eN(h)) = Var(εN+h + ψ1 εN+h-1 + ... + ψh-1 εN+1)        (F.7)

           = Var(εN+h) + ψ1² Var(εN+h-1) + ... + ψ²h-1 Var(εN+1)

           = σ²ε (1 + ψ1² + ... + ψ²h-1)

Note:

(1) The variance of the forecast error increases as the lead time increases.

(2) σ²ε can be estimated by

σ̂²ε = SSE/(N - p - q).

(3) "Asymptotic" confidence intervals are given by

YN(h) ± Zα [σ̂²ε (ψ0² + ψ1² + ... + ψ²h-1)]^{1/2}

[Figure: point forecasts YN(h) and widening confidence bands plotted beyond observation N.]

The forecasts will eventually converge to the "trend."

(4) The expression for the variance of the forecast error in (F.7) doesn't take account of parameter uncertainty.
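As a numerical illustration of (F.7) (a sketch, not part of the original notes; Python/numpy, with ψj = φ1^j for an AR(1)):

import numpy as np

# Forecast-error variance for an AR(1): psi_j = phi1**j, so
# Var(e_N(h)) = sigma2 * (1 + psi_1**2 + ... + psi_{h-1}**2)
phi1, sigma2 = 0.8, 1.0
for h in (1, 2, 5, 10):
    psi = phi1 ** np.arange(h)               # psi_0, ..., psi_{h-1}
    var_eh = sigma2 * np.sum(psi ** 2)
    print(h, round(var_eh, 3), round(1.96 * np.sqrt(var_eh), 3))  # variance and 95% half-width

The half-widths grow with h and level off at the unconditional standard deviation of the process, consistent with note (1) above.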

G. General Comments

(1) It should be mentioned that

φ(B)(I-B)^d Yt = θ(B)εt

is uniquely determined by a given autocorrelation structure (given d) only up to a multiple of a polynomial in B. This suggests working with the simplest model.

(2) If the time series exhibits seasonal behavior [a large ρ12 (monthly data) or ρ4 (quarterly data)], then the previously developed model can be modified to incorporate this behavior into the modeling process.

(3) Just a reminder: simple exponential smoothing is an ARIMA(0,1,1) model and the Holt-Winters nonseasonal predictor is an ARIMA(0,2,2) model.

(4) A number of texts suggest that time series models based on ARIMA formulations should only be expected to be "successful" if at least 40 observations are available. For smaller samples, use other techniques, such as exponential smoothing or Holt-Winters.

(5) A complementary method of analysis is spectral analysis. The spectral density of a stationary series is given by

g(f) = 2[1 + 2 Σ_{k=1}^{∞} ρk cos 2πfk],   0 ≤ f ≤ 1/2.

In practice this is estimated using "lag windows" by

ĝ(f) = 2[1 + 2 Σ_{k=1}^{n-1} λk ρ̂k cos 2πfk].

The spectral density function provides information about the cyclical behavior of a series and shows how the variance of a stochastic process is distributed over a continuous range of frequencies.
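As a numerical check on the definition of g(f) (a Python/numpy sketch, not part of the original notes), the truncated sum can be compared with its closed form for an AR(1), for which ρk = φ1^k and g(f) = 2(1 - φ1²)/(1 - 2φ1 cos 2πf + φ1²):

import numpy as np

# g(f) = 2[1 + 2*sum_k rho_k cos(2 pi f k)] for an AR(1) with rho_k = phi1**k,
# compared with the closed form 2(1 - phi1**2)/(1 - 2*phi1*cos(2*pi*f) + phi1**2)
phi1 = 0.8
k = np.arange(1, 200)
for f in np.linspace(0.0, 0.5, 6):
    g_sum = 2.0 * (1.0 + 2.0 * np.sum(phi1 ** k * np.cos(2 * np.pi * f * k)))
    g_closed = 2.0 * (1.0 - phi1 ** 2) / (1.0 - 2.0 * phi1 * np.cos(2 * np.pi * f) + phi1 ** 2)
    print(round(f, 2), round(g_sum, 3), round(g_closed, 3))   # low frequencies dominate when phi1 > 0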

Spectral Density Functions

For an ARMA(p,q), φ(B)Yt = θ(B)εt, or Yt = φ^{-1}(B)θ(B)εt, the spectral density function is given by

f(u) = σ² [θ(e^{iu}) θ(e^{-iu})] / [φ(e^{iu}) φ(e^{-iu})],   -π < u < π.

DeMoivre's theorem: e^{iθ} = cos θ + i sin θ, and e^{imθ} = cos mθ + i sin mθ.

Model          Spectral density function

ARMA(0,0)      σ²ε

AR(1)          σ²ε / (1 + φ1² - 2φ1 cos ω)

AR(2)          σ²ε / [1 + φ1² + φ2² - 2φ1(1 - φ2)cos ω - 2φ2 cos 2ω]

MA(1)          σ²ε (1 + θ1² - 2θ1 cos ω)

MA(2)          σ²ε (1 + θ1² + θ2² - 2θ1(1 - θ2)cos ω - 2θ2 cos 2ω)

ARMA(1,1)      σ²ε (1 + θ1² - 2θ1 cos ω) / (1 + φ1² - 2φ1 cos ω)

ARMA(1,2)      σ²ε (1 + θ1² + θ2² - 2θ1(1 - θ2)cos ω - 2θ2 cos 2ω) / (1 + φ1² - 2φ1 cos ω)

ARMA(2,1)      σ²ε (1 + θ1² - 2θ1 cos ω) / [1 + φ1² + φ2² - 2φ1(1 - φ2)cos ω - 2φ2 cos 2ω]

ARMA(2,2)      σ²ε (1 + θ1² + θ2² - 2θ1(1 - θ2)cos ω - 2θ2 cos 2ω) / [1 + φ1² + φ2² - 2φ1(1 - φ2)cos ω - 2φ2 cos 2ω]
COMPUTER PROGRAMS (TIME SERIES ANALYSIS)
I. THEORYTS

ENTER THE NUMBER OF AUTOREGRESSIVE PARAMETERS: 2

ENTER THE NUMBER OF MOVING AVERAGE PARAMETERS: 0

ENTER THE NUMBER OF AUTOCORRELATION COEFFICIENTS: 25

ENTER THE NUMBER OF PARTIAL AUTO CORRELATION COEFFICIENTS: 25

ENTER THE VALUE OF PHI(1): .2

ENTER THE VALUE OF PHI(2): .7

II. RUN SHR:DATATSF

HOW MANY SERIES DO YOU WANT? 1

HOW MANY OBSERVATIONS ARE TO BE IN EACH SERIES? 300

WHAT ARE THE AR PARAMETERS? (UP TO TWO) .9

WHAT ARE THE MA PARAMETERS? (UP TO TWO)

DO YOU WANT THE SIBYL RUNNER HEADING? N

WHAT IS THE SEASONAL AR PARAMETER? 0

WHAT IS THE SEASONAL MA PARAMETER?

HOW MANY TIMES SHOULD THE SERIES BE SUMMED? 0

HOW MANY TIMES SEASONALLY SUMMED? 0

WHAT IS THE SPAN OF SEASONALITY? 12

WHAT IS THE MEAN OF THE SERIES? 0

WHAT TYPE OF ERROR TERMS DO YOU WANT?


(N=NORMAL, L=LOG-NORMAL, P=PARETO, E=EXPONENTIAL) N

WHAT VALUE OF SIGMA DO YOU WANT? 1. **(remember the decimal)**


STOP

III. RUN SHR:BYUTSF

* * * * BYU TIME SERIES FORECASTING PACKAGE * * * *

WOULD YOU LIKE TO TEST LEVELS OF DIFFERENCING? N (see next section,


used for identification)

WHAT FORECASTING TECHNIQUE WOULD YOU LIKE TO RUN?


FOR HELP TYPE HELP

THE FOLLOWING FORECASTING TECHNIQUES ARE AVAILABLE

BOXJN = BOX-JENKINS MODEL


EXPS = SINGLE PARAMETER EXPONENTIAL SMOOTHING
WINTR = WINTER'S 3-PARAMETER MODEL
MARM = TRANSFER FUNCTIONS (MARMA MODELS)

WHICH OF THESE WOULD YOU LIKE TO RUN? BOXJN

WHAT IS THE DATA FILE NAME? SER1.DAT

HOW MANY OBSERVATIONS ARE TO BE USED? 300

IS THIS A SIBYL-RUNNER FILE? N

IV. * * * * BYU TIME SERIES FORECASTING PACKAGE * * * *

WOULD YOU LIKE TO TEST LEVELS OF DIFFERENCING? N

WHAT IS THE DATA FILE NAME? L.DAT

HOW MANY OBSERVATIONS ARE TO BE USED? 50

IS THIS A SIBYL-RUNNER FILE? N

DO YOU WANT A GRAPH OF THE DATA? N

HOW MANY TIMES MUST THE SERIES BE DIFFERENCED? 1

HOW MANY TIMES MUST THE SERIES BE SEASONALLY DIFFERENCED?


WHAT IS THE SPAN OF SEASONALITY?

DO YOU WANT TO SEE GRAPH OF THE NEW DATA? N

WOULD YOU LIKE TO SEE THE SPECTRUM? N

HOW MANY AUTOCORRELATION COEFFICIENTS ARE TO BE SEEN? 10

HOW MANY PARTIAL AUTO'S ARE TO BE SEEN? 10

WOULD YOU LIKE TO REPEAT THIS PROCEDURE? N
