APPLIED ECONOMETRIC TIME SERIES
4TH EDITION

Chapter 2: STATIONARY TIME-SERIES MODELS

WALTER ENDERS, UNIVERSITY OF ALABAMA

Copyright © 2015 John Wiley & Sons, Inc.


Section 1

STOCHASTIC DIFFERENCE
EQUATION MODELS

Example of a time-series model

$m_t = \rho (1.03)^t m_0^* + (1 - \rho) m_{t-1} + \varepsilon_t$    (2.2)

1. Although the money supply is a continuous variable, (2.2) is a discrete
difference equation. Since the forcing process {εt} is stochastic, the
money supply is stochastic; we can call (2.2) a linear stochastic
difference equation.
2. If we knew the distribution of {εt}, we could calculate the distribution
for each element in the {mt} sequence. Since (2.2) shows how the
realizations of the {mt} sequence are linked across time, we would be
able to calculate the various joint probabilities. Notice that the
distribution of the money supply sequence is completely determined by
the parameters of the difference equation (2.2) and the distribution of
the {εt} sequence.
3. Having observed the first t observations in the {mt} sequence, we can
make forecasts of mt+1, mt+2, ….

White Noise
• E(εt) = E(εt–1) = … = 0

• E(εt²) = E(εt–1²) = … = σ²

• [or var(εt) = var(εt–1) = … = σ²]

• E(εt εt–s) = E(εt–j εt–j–s) = 0 for all j and all s ≠ 0

• [or cov(εt, εt–s) = cov(εt–j, εt–j–s) = 0]

$x_t = \sum_{i=0}^{q} \beta_i \varepsilon_{t-i}$

A sequence formed in this manner is called a moving average of
order q and is denoted by MA(q).
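As a rough illustration of how an MA(q) sequence is built from white noise, the following sketch filters a simulated {εt} series through a set of weights. The function name `moving_average`, the seed, the value σ = 1, and the weights 1, 0.5, 0.25 are all assumptions chosen for the example, not values from the slides.

```python
import numpy as np

def moving_average(eps, betas):
    """Return x_t = sum_{i=0}^{q} beta_i * eps_{t-i} for every t with q lags available."""
    eps = np.asarray(eps, dtype=float)
    betas = np.asarray(betas, dtype=float)
    # "valid" mode keeps only the dates for which all q lagged shocks exist.
    return np.convolve(eps, betas, mode="valid")

rng = np.random.default_rng(0)               # arbitrary seed (assumption)
eps = rng.normal(0.0, 1.0, size=1000)        # white noise with sigma = 1 (assumption)
x = moving_average(eps, [1.0, 0.5, 0.25])    # an illustrative MA(2)
```

The first q observations are dropped because their lagged shocks are not observed.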

2. ARMA MODELS
In the ARMA(p, q) model

yt = a0 + a1yt–1 + … + apyt–p + εt + β1εt–1 + … + βqεt–q

where the {εt} series consists of serially uncorrelated "shocks", the particular solution (with β0 = 1) is:

$y_t = \left( a_0 + \sum_{i=0}^{q} \beta_i \varepsilon_{t-i} \right) \Big/ \left( 1 - \sum_{i=1}^{p} a_i L^i \right)$

Note that all roots of the polynomial $1 - \sum_{i=1}^{p} a_i L^i$ must lie outside of the unit circle
(equivalently, all characteristic roots of the homogeneous equation must lie inside it).
If this is the case, we have the MA representation

$y_t = c + \sum_{i=0}^{\infty} c_i \varepsilon_{t-i}$


Section 3
Stationarity Restrictions for an AR(1) Process

STATIONARITY

Covariance Stationary Series
• Mean is time-invariant
• Variance is constant
• All covariances are constant
– all autocorrelations are constant
• Examples of series that are not covariance stationary
– yt = α + β time
– yt = yt-1 + εt (Random Walk)

Formal Definition
A stochastic process having a finite mean and variance is
covariance stationary if for all t and t − s,

1. E(yt) = E(yt-s) = µ

2. E[(yt – µ)²] = E[(yt–s – µ)²] = σy²

or [var(yt) = var(yt–s) = σy²]

3. E[(yt – µ)(yt–s – µ)] = E[(yt–j – µ)(yt–j–s – µ)] = γs

or cov(yt, yt–s) = cov(yt–j, yt–j–s) = γs

where µ, σy², and γs are all constants.

4. STATIONARITY RESTRICTIONS
FOR AN ARMA(p, q) MODEL

• Stationarity Restrictions for the Autoregressive Coefficients

Stationarity of an AR(1) Process
yt = a0 + a1yt–1 + εt with an initial condition y0:

$y_t = a_0 \sum_{i=0}^{t-1} a_1^i + a_1^t y_0 + \sum_{i=0}^{t-1} a_1^i \varepsilon_{t-i}$

Only if t is large is this stationary (given |a1| < 1):

$\lim y_t = \frac{a_0}{1 - a_1} + \sum_{i=0}^{\infty} a_1^i \varepsilon_{t-i}$

E[(yt – µ)(yt–s – µ)] = E{[εt + a1εt–1 + (a1)²εt–2 + …][εt–s + a1εt–s–1 + (a1)²εt–s–2 + …]}
= σ²(a1)^s[1 + (a1)² + (a1)⁴ + …]
= σ²(a1)^s/[1 – (a1)²]
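The last step of the autocovariance derivation sums a geometric series. A minimal numerical check, assuming illustrative values a1 = 0.7, σ² = 1 and s = 3 (none of which come from the slides), compares the truncated series with the closed form:

```python
def ar1_autocov_truncated(a1, sigma2, s, n_terms=2000):
    """sigma^2 * a1^s * [1 + a1^2 + a1^4 + ...], truncated after n_terms terms."""
    return sigma2 * a1 ** s * sum(a1 ** (2 * i) for i in range(n_terms))

def ar1_autocov_closed(a1, sigma2, s):
    """Closed form sigma^2 * a1^s / (1 - a1^2); requires |a1| < 1."""
    return sigma2 * a1 ** s / (1.0 - a1 ** 2)

# Illustrative values only (assumptions).
approx = ar1_autocov_truncated(0.7, 1.0, 3)
exact = ar1_autocov_closed(0.7, 1.0, 3)
```

The two numbers agree to many decimal places when |a1| < 1; for |a1| ≥ 1 the series does not converge and the closed form is not meaningful.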

Restrictions for the AR Coefficients
Let $y_t = a_0 + \sum_{i=1}^{p} a_i y_{t-i} + \varepsilon_t$

so that $y_t = a_0 \Big/ \left( 1 - \sum_{i=1}^{p} a_i \right) + \sum_{i=0}^{\infty} c_i \varepsilon_{t-i}$

We know that the sequence {ci} will eventually solve the difference equation

ci – a1ci–1 – a2ci–2 – … – apci−p = 0 (2.21)

If the characteristic roots of (2.21) are all inside the unit circle, the {ci} sequence
will be convergent.

The stability conditions can be stated succinctly:


1. The homogeneous solution must be zero. Either the sequence must have started
infinitely far in the past or the process must always be in equilibrium (so that the
arbitrary constant is zero).
2. The characteristic root a1 must be less than unity in absolute value.
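One way to check the root condition numerically is to solve the characteristic equation of (2.21), α^p − a1α^(p−1) − … − ap = 0, and test whether every root is inside the unit circle. The sketch below does this with `numpy.roots`; the helper names are hypothetical, and the AR(2) coefficients 1.05 and −0.22 are taken from the AR(2) column of Table 2.4 purely as an illustration.

```python
import numpy as np

def characteristic_roots(ar_coefs):
    """Roots of alpha^p - a1*alpha^(p-1) - ... - ap = 0 for coefficients [a1, ..., ap]."""
    poly = np.concatenate(([1.0], -np.asarray(ar_coefs, dtype=float)))
    return np.roots(poly)

def is_stable(ar_coefs):
    """True when every characteristic root lies strictly inside the unit circle."""
    return bool(np.all(np.abs(characteristic_roots(ar_coefs)) < 1.0))

stable_ar2 = is_stable([1.05, -0.22])   # AR(2) estimates from Table 2.4
random_walk = is_stable([1.0])          # unit root, so not stable
```

A root close to one in absolute value signals that the usual stationarity-based inference should be treated with caution.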


A Pure MA Process: $x_t = \sum_{i=0}^{\infty} \beta_i \varepsilon_{t-i}$

1. Take the expected value of xt

E(xt) = E(εt + β1εt–1 + β2εt–2 + …)
= Eεt + β1Eεt–1 + β2Eεt–2 + … = 0
E(xt−s) = E(εt−s + β1εt−s–1 + β2εt−s–2 + …) = 0
Hence, all elements in the {xt} sequence have the same finite mean (µ = 0).

2. Form var(xt) as
var(xt) = E[(εt + β1εt–1 + β2εt–2 + …)²]
= σ²[1 + (β1)² + (β2)² + …]

As long as Σ(βi)² is finite, it follows that var(xt) is finite.

var(xt−s) = E[(εt−s + β1εt−s–1 + β2εt−s–2 + …)²]
= σ²[1 + (β1)² + (β2)² + …]

Thus, var(xt) = var(xt−s) for all t and t−s.

Are all autocovariances finite and time independent?
E[xtxt−s] = E[(εt + β1εt–1 + β2εt–2 + …)(εt−s + β1εt−s–1 + β2εt−s–2 + …)]
= σ²(βs + β1βs+1 + β2βs+2 + …)
Restricting the sum βs + β1βs+1 + β2βs+2 + … to be finite means that E(xtxt−s) is finite.

The Autocorrelation Function of an AR(2) Process
The Autocorrelation Function of an MA(1) Process
The Autocorrelation Function of an ARMA(1, 1) Process

5. THE AUTOCORRELATION
FUNCTION

The Autocorrelation Function of an MA(1) Process
Consider yt = εt + βεt–1. Again, multiply yt by each yt−s and
take expectations

γ0 = var(yt) = Eytyt = E[(εt + βεt–1)(εt + βεt–1)] = (1 + β²)σ²

γ1 = cov(yt, yt–1) = Eytyt–1 = E[(εt + βεt–1)(εt–1 + βεt–2)] = βσ²

and

γs = Eytyt−s = E[(εt + βεt–1)(εt−s + βεt−s–1)] = 0 for all s > 1

The ACF of an ARMA(1, 1) Process:
Let yt = a1yt–1 + εt + β1εt–1.
Eytyt = a1Eyt–1yt + Eεtyt + β1Eεt–1yt

⇒ γ0 = a1γ1 + σ² + β1(a1 + β1)σ²

Eytyt–1 = a1Eyt–1yt–1 + Eεtyt–1 + β1Eεt–1yt–1

⇒ γ1 = a1γ0 + β1σ²

Eytyt–2 = a1Eyt–1yt–2 + Eεtyt–2 + β1Eεt–1yt–2

⇒ γ2 = a1γ1
…
Eytyt−s = a1Eyt–1yt−s + Eεtyt−s + β1Eεt–1yt−s
⇒ γs = a1γs–1
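The first two equations form a pair of linear equations in γ0 and γ1. As a worked step (a sketch of the algebra, assuming |a1| < 1), substituting the second into the first gives closed forms for γ0, γ1 and the first autocorrelation:

```latex
\begin{aligned}
\gamma_0 &= a_1(a_1\gamma_0 + \beta_1\sigma^2) + \sigma^2 + \beta_1(a_1+\beta_1)\sigma^2
\;\Rightarrow\;
\gamma_0 = \frac{1 + \beta_1^2 + 2a_1\beta_1}{1 - a_1^2}\,\sigma^2 \\
\gamma_1 &= a_1\gamma_0 + \beta_1\sigma^2
          = \frac{(1 + a_1\beta_1)(a_1 + \beta_1)}{1 - a_1^2}\,\sigma^2 \\
\rho_1 &= \frac{\gamma_1}{\gamma_0}
        = \frac{(1 + a_1\beta_1)(a_1 + \beta_1)}{1 + \beta_1^2 + 2a_1\beta_1},
\qquad \rho_s = a_1\rho_{s-1} \ \text{for } s \ge 2
\end{aligned}
```

The sign of ρ1 is therefore the sign of (a1 + β1) whenever 1 + a1β1 > 0, and all later autocorrelations decay at the rate a1.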

ACF of an AR(2) Process

Eytyt = a1Eyt–1yt + a2Eyt–2yt + Eεtyt
Eytyt–1 = a1Eyt–1yt–1 + a2Eyt–2yt–1 + Eεtyt–1
Eytyt–2 = a1Eyt–1yt–2 + a2Eyt–2yt–2 + Eεtyt–2
…
Eytyt−s = a1Eyt–1yt−s + a2Eyt–2yt−s + Eεtyt−s

So that
γ0 = a1γ1 + a2γ2 + σ²
γ1 = a1γ0 + a2γ1 → ρ1 = a1/(1 − a2)
γs = a1γs–1 + a2γs–2 → ρi = a1ρi–1 + a2ρi–2
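The recursion for the ρi can be iterated directly once ρ0 = 1 and ρ1 = a1/(1 − a2) are known. The sketch below does that; the function name `ar2_acf` and the coefficient values are illustrative assumptions.

```python
def ar2_acf(a1, a2, max_lag):
    """Theoretical ACF of a stationary AR(2): rho_0, ..., rho_max_lag."""
    rho = [1.0, a1 / (1.0 - a2)]
    for _ in range(2, max_lag + 1):
        # rho_i = a1*rho_{i-1} + a2*rho_{i-2}
        rho.append(a1 * rho[-1] + a2 * rho[-2])
    return rho[: max_lag + 1]

# Illustrative coefficients (assumptions); complex roots give an oscillating ACF.
acf_values = ar2_acf(0.7, -0.49, 10)
```

With a2 = 0 the recursion collapses to the AR(1) case, ρs = a1^s.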

6. THE PARTIAL
AUTOCORRELATION
FUNCTION
PACF of an AR Process

yt = a + a1yt-1 + εt

yt = a + a1yt-1 + a2yt-2+ εt

yt = a + a1yt-1 + a2yt-2 + a3yt-3 + εt

The estimate of the last coefficient in each successive regression (a1 in the
first, a2 in the second, a3 in the third) is the partial autocorrelation at
that lag.

PACF of a MA(1)

yt = εt + β1εt-1

but εt–1 = yt–1 – β1εt–2

yt = εt + β1[yt–1 – β1εt–2]

= εt + β1yt–1 – (β1)²εt–2

yt = εt + β1yt–1 – (β1)²[yt–2 – β1εt–3] …

Continuing the substitution expresses yt in terms of its own lags, so it looks like an AR(∞)

or Using Lag Operators

yt = εt + β1εt−1 = (1 + β1L)εt
yt/(1 + β1L) = εt
Recall yt/(1 − a1L) = yt + a1yt−1 + a1²yt−2 + a1³yt−3 + …
so that −β1 plays the role of a1
yt/(1 + β1L) = yt/[1 − (−β1)L] =
yt − β1yt−1 + β1²yt−2 − β1³yt−3 + … = εt
or
yt = β1yt−1 − β1²yt−2 + β1³yt−3 − … + εt
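The inversion can be checked numerically: the weights (−β1)^i applied to current and lagged y values should reproduce the shock obtained by undoing the MA(1) recursively. The following sketch assumes a pre-sample shock of zero and uses made-up shocks with β1 = 0.6; the function names are hypothetical.

```python
def ar_inf_weights(beta1, n_terms):
    """Weights pi_i = (-beta1)**i in eps_t = sum_i pi_i * y_{t-i}."""
    return [(-beta1) ** i for i in range(n_terms)]

def recover_shocks(y, beta1):
    """Undo y_t = eps_t + beta1*eps_{t-1} recursively, assuming the pre-sample shock is 0."""
    eps, prev = [], 0.0
    for yt in y:
        prev = yt - beta1 * prev
        eps.append(prev)
    return eps

beta1 = 0.6                                  # illustrative value (assumption)
eps_true = [1.0, -0.5, 2.0, 0.3]             # made-up shocks
y = [e + beta1 * (eps_true[t - 1] if t > 0 else 0.0) for t, e in enumerate(eps_true)]
eps_hat = recover_shocks(y, beta1)
weights = ar_inf_weights(beta1, len(y))
last_shock = sum(w * y[len(y) - 1 - i] for i, w in enumerate(weights))
```

The weights shrink geometrically only when |β1| < 1, which is the invertibility condition.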

Summary: Autocorrelations and Partial Autocorrelations

ACF
• AR(1)
– Geometric decay
• MA(q)
– Cuts off at lag q

PACF
• AR(p)
– Cuts off at lag p
• MA(1)
– Geometric decay

For stationary processes, the key points to note are the
following:
• 1. The ACF of an ARMA(p,q) process will begin to decay
after lag q. After lag q, the coefficients of the ACF (i.e.,
the ρi) will satisfy the difference equation (ρi = a1ρi–1 +
a2ρi–2 + … + apρi–p).

• 2. The PACF of an ARMA(p,q) process will begin to
decay after lag p. After lag p, the coefficients of the PACF
(i.e., the φss) will mimic the ACF coefficients from the
model yt/(1 + β1L + β2L² + … + βqL^q).

TABLE 2.1: Properties of the ACF and PACF

Process | ACF | PACF
White Noise | All ρs = 0 (s ≠ 0) | All φss = 0
AR(1): a1 > 0 | Direct exponential decay: ρs = a1^s | φ11 = ρ1; φss = 0 for s ≥ 2
AR(1): a1 < 0 | Oscillating decay: ρs = a1^s | φ11 = ρ1; φss = 0 for s ≥ 2
AR(p) | Decays toward zero. Coefficients may oscillate. | Spikes through lag p. All φss = 0 for s > p.
MA(1): β > 0 | Positive spike at lag 1. ρs = 0 for s ≥ 2 | Oscillating decay: φ11 > 0.
MA(1): β < 0 | Negative spike at lag 1. ρs = 0 for s ≥ 2 | Geometric decay: φ11 < 0.
ARMA(1, 1): a1 > 0 | Geometric decay beginning after lag 1. Sign ρ1 = sign(a1 + β) | Oscillating decay after lag 1. φ11 = ρ1
ARMA(1, 1): a1 < 0 | Oscillating decay beginning after lag 1. Sign ρ1 = sign(a1 + β) | Geometric decay beginning after lag 1. φ11 = ρ1 and sign(φss) = sign(φ11).
ARMA(p, q) | Decay (either direct or oscillatory) beginning after lag q. | Decay (either direct or oscillatory) beginning after lag p.

Testing the significance of ρi
• Under the null ρi = 0, the sampling distribution of the estimate ρ̂i is:
– approximately normal (but bounded at −1.0 and +1.0)
when T is large
– distributed as a Student's t when T is small.

• The standard formula for computing the appropriate t value to
test the significance of a correlation coefficient is:

$t = \hat{\rho}_i \sqrt{\frac{T - 2}{1 - \hat{\rho}_i^2}}$ with df = T − 2

• SD(ρ) = [(1 – ρ²)/(T – 2)]^(1/2)

• In reasonably large samples, the test statistic for the null that ρi = 0
simplifies to ρ̂i·T^(1/2). Equivalently, the standard deviation of the
correlation coefficient is approximately (1/T)^0.5.

Significance Levels
• A single autocorrelation
– st.dev(ρ) = [(1 – ρ²)/(T – 2)]^(1/2)
• For small ρ and large T, st.dev(ρ) is
approx. (1/T)^(1/2)
– If the autocorrelation exceeds 2/T^(1/2) in absolute value, we can reject the
null that ρ = 0.
• A group of k autocorrelations:

$Q = T(T + 2) \sum_{i=1}^{k} \hat{\rho}_i^2 / (T - i)$

is a chi-square with degrees of freedom = k

Model Selection Criteria
Estimation of an AR(1) Model
Estimation of an ARMA(1, 1) Model
Estimation of an AR(2) Model

7. SAMPLE
AUTOCORRELATIONS
OF STATIONARY SERIES
Sample Autocorrelations

Form the sample autocorrelations

$r_s = \frac{\sum_{t=s+1}^{T} (y_t - \bar{y})(y_{t-s} - \bar{y})}{\sum_{t=1}^{T} (y_t - \bar{y})^2}$

Test groups of correlations

$Q = T(T + 2) \sum_{k=1}^{s} r_k^2 / (T - k)$

If the sample value of Q exceeds the critical value of χ² with s degrees of
freedom, then at least one value of rk is statistically different from zero at the
specified significance level.

The Box–Pierce and Ljung–Box Q-statistics also serve as a check to see if
the residuals from an estimated ARMA(p,q) model behave as a white-noise
process. However, the degrees of freedom are reduced by the number of
estimated parameters.
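The two formulas above translate almost line for line into code. This is a minimal sketch with hypothetical function names; it returns the Q value only, and the comparison with a χ² critical value is left to the reader.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag using the full-sample mean."""
    y = np.asarray(y, dtype=float)
    dev = y - y.mean()
    denom = np.sum(dev ** 2)
    # dev[s:] holds y_t for t = s+1..T and dev[:-s] holds the matching y_{t-s}.
    return np.array([np.sum(dev[s:] * dev[:-s]) / denom for s in range(1, max_lag + 1)])

def ljung_box_q(y, s):
    """Ljung-Box Q = T(T+2) * sum_{k=1}^{s} r_k^2 / (T - k)."""
    T = len(y)
    r = sample_acf(y, s)
    k = np.arange(1, s + 1)
    return T * (T + 2) * np.sum(r ** 2 / (T - k))

toy = [1.0, 2.0, 3.0, 4.0, 5.0]   # tiny made-up series, for illustration only
r_toy = sample_acf(toy, 2)
q_toy = ljung_box_q(toy, 2)
```

When the series is a set of residuals from an estimated ARMA(p, q) model, the degrees of freedom used for the critical value should be reduced as described above.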
Model Selection
• AIC = T ln(sum of squared residuals) + 2n
• SBC = T ln(sum of squared residuals) + n ln(T)

where n = number of parameters estimated (p + q + possible
constant term) and T = number of usable observations.

ALTERNATIVE
• AIC* = –2ln(L)/T + 2n/T
• SBC* = –2ln(L)/T + n ln(T)/T

• where n and T are as defined above, and ln(L) = maximized value
of the log of the likelihood function.
• For a normal distribution, –2ln(L) = T ln(2π) + T ln(σ²) + (1/σ²)
(sum of squared residuals)
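As a quick check of the first pair of formulas, the sketch below recomputes the AIC and SBC reported in Table 2.2. It assumes T = 99 usable observations, which is an inference from the 98 degrees of freedom with one estimated parameter rather than a figure stated on the slides.

```python
import math

def aic(T, ssr, n):
    """AIC = T ln(SSR) + 2n."""
    return T * math.log(ssr) + 2 * n

def sbc(T, ssr, n):
    """SBC = T ln(SSR) + n ln(T)."""
    return T * math.log(ssr) + n * math.log(T)

T = 99  # assumed number of usable observations (98 df + 1 parameter in Model 1)
model1 = (aic(T, 85.10, 1), sbc(T, 85.10, 1))
model2 = (aic(T, 85.07, 2), sbc(T, 85.07, 2))
```

Under that assumption the values round to the reported 441.9 / 444.5 for Model 1 and 443.9 / 449.1 for Model 2; both criteria favor the smaller model because the extra MA coefficient barely lowers the sum of squared residuals.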

Figure 2.3: ACF and PACF for two simulated processes
Panel a: ACF for the AR(1) Process
Panel b: PACF for the AR(1) Process
Panel c: ACF for the ARMA(1,1) Process
Panel d: PACF for the ARMA(1,1) Process
[Plots omitted; each panel shows lags 0 to 15.]

Table 2.2: Estimates of an AR(1) Model

Item | Model 1 | Model 2
Specification | yt = a1yt-1 + et | yt = a1yt-1 + et + b12et-12
Degrees of Freedom | 98 | 97
Sum of Squared Residuals | 85.10 | 85.07
Estimated a1 (standard error) | 0.7904 (0.0624) | 0.7938 (0.0643)
Estimated b (standard error) | | -0.0250 (0.1141)
AIC / SBC | AIC = 441.9; SBC = 444.5 | AIC = 443.9; SBC = 449.1
Ljung-Box Q-statistics for the residuals (significance level in parentheses) | Q(8) = 6.43 (0.490); Q(16) = 15.86 (0.391); Q(24) = 21.74 (0.536) | Q(8) = 6.48 (0.485); Q(16) = 15.75 (0.400); Q(24) = 21.56 (0.547)

Figure 2.4: ACF of the Residuals from Model 1

Table 2.3: Estimates of an ARMA(1,1) Model

Model | Estimates | Q-Statistics | AIC / SBC
Model 1 | a1: -0.835 (.053) | Q(8) = 26.19 (.000); Q(24) = 41.10 (.001) | AIC = 496.5; SBC = 499.0
Model 2 | a1: -0.679 (.076); b1: -0.676 (.081) | Q(8) = 3.86 (.695); Q(24) = 14.23 (.892) | AIC = 471.0; SBC = 476.2
Model 3 | a1: -1.16 (.093); a2: -0.378 (.092) | Q(8) = 11.44 (.057); Q(24) = 22.59 (.424) | AIC = 482.8; SBC = 487.9

ACF of Nonstationary Series
[Figure omitted: the series plotted over observations 50 to 500, and its correlations (CORRS) and partial correlations (PARTIALS) at lags 0 to 20 with 0 differences.]

Parsimony
Stationarity and Invertibility
Goodness of Fit
Post-Estimation Evaluation

8. BOX–JENKINS MODEL
SELECTION

Box Jenkins Model Selection
• Parsimony
– Extra AR coefficients reduce degrees of freedom by 2
– Similar processes can be approximated by very
different models
– Common Factor Problem
• yt = εt and yt = 0.5 yt-1 + εt - 0.5εt-1
– Hence: All t-stats should exceed 2.0

– Model should have a good fit as measured by AIC or BIC (SBC)

Box-Jenkins II
• Stationarity and Invertibility
– t-stats, ACF, Q-stats, … all assume that the process is stationary
– Be suspicious of implied roots near the unit circle
– Invertibility implies the model has a finite-order or convergent AR representation.
• No unit root in MA part of the model
• Diagnostic Checking
– Plot residuals—look for outliers, periods of poor fit
– Residuals should be serially uncorrelated
• Examine ACF and PACF of residuals
– Overfit the model
– Divide sample into subperiods
– F = [(ssr – ssr1 – ssr2)/(p + q + 1)] / [(ssr1 + ssr2)/(T – 2p – 2q – 2)]

Residuals Plot
[Figure omitted: Deviations from Trend GDP, 1947 to 1997.]

What can we learn by plotting the residuals?

What if there is a systematic pattern in the residuals?

Requirements for Box-Jenkins
• Successful in practice, especially short term
forecasts
• Good forecasts generally require at least 50
observations
– more with seasonality
• Most useful for short-term forecasts
• You need to ‘detrend’ the data.
• Disadvantages
– Need to rely on individual judgment
• However, very different models can provide nearly identical
forecasts

Higher-Order Models
Forecast Evaluation
The Granger–Newbold Test
The Diebold–Mariano Test

9. PROPERTIES OF
FORECASTS

Forecasting with ARMA Models

The MA(1) Model


yt = β0 + β1εt-1 + εt

Updating 1 period:

yt+1 = β0 + β1εt + εt+1

Hence, the optimal 1-step ahead forecast is:

Etyt+1 = β0 + β1εt

Note: Etyt+j is a short-hand way to write the conditional expectation of yt+j

The 2-step ahead forecast is:

Etyt+2 = Et[ β0 + β1εt+1 + εt+2 ] = β0

Similarly, the n-step ahead forecasts are all β0

Forecast errors
The 1-step ahead forecast error is:

yt+1 - Etyt+1 = β0 + β1εt + εt+1 - β0 - β1εt = εt+1

Hence, the 1-step ahead forecast error is the "unforecastable" portion of yt+1

The 2-step ahead forecast error is:

yt+2 - Etyt+2 = β0 + β1εt+1 + εt+2 - β0 = β1εt+1 + εt+2

Forecast error variance


The variance of the 1-step ahead forecast error is: var(εt+1) = σ²
The variance of the 2-step ahead forecast error is: var(β1εt+1 + εt+2) = (1 + β1²)σ²

Confidence intervals
The 95% confidence interval for the 1-step ahead forecast is:
β0 + β1εt ± 1.96σ

The 95% confidence interval for the 2-step ahead forecast is:
β0 ± 1.96(1 + β1²)^(1/2)σ

In the general case of an MA(q), the confidence intervals increase up to lag q.

The AR(1) Model: yt = a0 + a1yt-1 + εt.

Updating 1 period, yt+1 = a0 + a1yt + εt+1, so that

Etyt+1 = a0 + a1yt [*]

The 2-step ahead forecast is:


Etyt+2 = a0 + a1Etyt+1
and using [*]
Etyt+2 = a0 + a0a1 + a1²yt

It should not take too much effort to convince yourself that:

Etyt+3 = a0 + a0a1 + a0a1² + a1³yt

and in general:

Etyt+j = a0[1 + a1 + a1² + … + a1^(j−1)] + a1^j yt

If we take the limit of Etyt+j as j → ∞ (with |a1| < 1), we find that Etyt+j → a0/(1 − a1). This result is really
quite general; for any stationary ARMA model, the conditional forecast of yt+j
converges to the unconditional mean.

Forecast errors
The 1-step ahead forecast error is:
yt+1 – Etyt+1 = a0 + a1yt + εt+1 - a0 - a1yt = εt+1

The 2-step ahead forecast error is yt+2 − Etyt+2. Since yt+2 = a0 + a1a0 + a1²yt +
εt+2 + a1εt+1 and Etyt+2 = a0 + a1a0 + a1²yt, it follows that:

yt+2 − Etyt+2 = εt+2 + a1εt+1

Continuing in this fashion, the j-step ahead forecast error is:

yt+j − Etyt+j = εt+j + a1εt+j−1 + a1²εt+j−2 + a1³εt+j−3 + … + a1^(j−1)εt+1

Forecast error variance: The j-step ahead forecast error variance is:
σ²[1 + a1² + a1⁴ + a1⁶ + … + a1^(2(j−1))]

The variance of the forecast error is an increasing function of j. As such, you
can have more confidence in short-term forecasts than in long-term forecasts.
In the limit the forecast error variance converges to σ²/(1 − a1²); hence, the
forecast error variance converges to the unconditional variance of the {yt}
sequence.
Confidence intervals

• The 95% confidence interval for the 1-step ahead forecast is:

a0 + a1yt ± 1.96σ

• Thus, the 95% confidence interval for the 2-step ahead forecast is:

a0(1 + a1) + a1²yt ± 1.96σ(1 + a1²)^(1/2)
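The AR(1) forecast function, its error variance, and the resulting interval can be collected in a few lines. This is a sketch with hypothetical function names; the parameter values a0 = 1, a1 = 0.5, yt = 4 and σ² = 1 are made up for illustration, and the parameters are treated as known rather than estimated.

```python
import math

def ar1_forecast(a0, a1, y_t, j):
    """E_t y_{t+j} = a0*(1 + a1 + ... + a1^(j-1)) + a1^j * y_t."""
    return a0 * sum(a1 ** i for i in range(j)) + a1 ** j * y_t

def ar1_forecast_var(a1, sigma2, j):
    """j-step forecast error variance: sigma^2 * (1 + a1^2 + ... + a1^(2(j-1)))."""
    return sigma2 * sum(a1 ** (2 * i) for i in range(j))

def ar1_interval(a0, a1, y_t, sigma2, j, z=1.96):
    """Approximate 95% interval around the j-step forecast (parameter uncertainty ignored)."""
    f = ar1_forecast(a0, a1, y_t, j)
    half = z * math.sqrt(ar1_forecast_var(a1, sigma2, j))
    return f - half, f + half

# Illustrative values only (assumptions).
one_step = ar1_forecast(1.0, 0.5, 4.0, 1)
two_step = ar1_forecast(1.0, 0.5, 4.0, 2)
band_two_step = ar1_interval(1.0, 0.5, 4.0, 1.0, 2)
```

As j grows, the point forecast approaches a0/(1 − a1) = 2 in this example and the interval widens toward the one implied by the unconditional variance.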

Forecast Evaluation
• Out-of-sample Forecasts:
1. Hold back a portion of the observations from the estimation process
and estimate the alternative models over the shortened span of data.
2. Use these estimates to forecast the observations of the holdback
period.
3. Compare the properties of the forecast errors from the two models.
• Example:
1. If {yt} contains a total of 150 observations, use the first 100
observations to estimate an AR(1) and an MA(1) and use each to
forecast the value of y101. Construct the forecast error obtained from the
AR(1) and from the MA(1).
2. Reestimate an AR(1) and an MA(1) model using the first 101
observations and construct two more forecast errors.
3. Continue this process so as to obtain two series of one-step ahead
forecast errors, each containing 50 observations.

• A regression based method to assess the forecasts is to use the 50
forecasts from the AR(1) to estimate an equation of the form
y100+t = a0 + a1f1t + v1t
• If the forecasts are unbiased, an F-test should allow you to impose the
restriction a0 = 0 and a1 = 1. Repeat the process with the forecasts from
the MA(1). In particular, use the 50 forecasts from the MA(1) to
estimate
y100+t = b0 + b1f2t + v2t t = 1, … , 50
• If the significance levels from the two F-tests are similar, you might
select the model with the smallest residual variance; that is, select the
AR(1) if var(v1t) < var(v2t).

• Instead of using a regression-based approach, many researchers would
select the model with the smallest mean square prediction error
(MSPE). If there are H observations in the holdback periods, the
MSPE for the AR(1) can be calculated as

$MSPE = \frac{1}{H} \sum_{i=1}^{H} e_{1i}^2$

The Diebold–Mariano Test

Let the loss from a forecast error in period i be denoted by g(ei). In the
typical case of mean-squared errors, the loss is ei².

We can write the differential loss in period i from using model 1 versus
model 2 as di = g(e1i) – g(e2i). The mean loss can be obtained as

$\bar{d} = \frac{1}{H} \sum_{i=1}^{H} [g(e_{1i}) - g(e_{2i})]$

If the {di} series is serially uncorrelated with a sample variance of γ0, the
estimate of var($\bar{d}$) is simply γ0/(H − 1). The expression

$\bar{d} \Big/ \sqrt{\gamma_0 / (H - 1)}$

has a t-distribution with H − 1 degrees of freedom
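A minimal sketch of the statistic for the serially uncorrelated case follows. Two choices here are assumptions rather than something the slides specify: γ0 is computed with divisor H, and the default loss is the squared error. The forecast errors are made up.

```python
import math

def mspe(errors):
    """Mean square prediction error of a series of forecast errors."""
    return sum(e ** 2 for e in errors) / len(errors)

def dm_statistic(e1, e2, loss=lambda e: e ** 2):
    """d_bar / sqrt(gamma0 / (H - 1)), assuming {d_i} is serially uncorrelated."""
    d = [loss(a) - loss(b) for a, b in zip(e1, e2)]
    H = len(d)
    d_bar = sum(d) / H
    gamma0 = sum((x - d_bar) ** 2 for x in d) / H   # sample variance, divisor H (assumption)
    return d_bar / math.sqrt(gamma0 / (H - 1))

# Made-up forecast errors from two hypothetical models.
e_model1 = [1.0, 2.0, 1.0, 2.0]
e_model2 = [1.0, 1.0, 1.0, 1.0]
dm = dm_statistic(e_model1, e_model2)
```

A positive value indicates larger average loss for model 1; the value would be compared with a t-distribution with H − 1 degrees of freedom. If the {di} series is serially correlated, this simple variance estimate is not appropriate.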

Out-of-Sample Forecasts

10. A MODEL OF THE


INTEREST RATE SPREAD

Figure 2.6: ACF and PACF of the Spread
[Plot omitted: autocorrelations and PACF at lags 0 to 12.]

Table 2.4: Estimates of the Interest Rate Spread

Parameter | AR(7) | AR(6) | AR(2) | p = 1, 2, and 7 | ARMA(1, 1) | ARMA(2, 1) | p = 2, ma = (1, 7)
µy | 1.20 (6.57) | 1.20 (7.55) | 1.19 (6.02) | 1.19 (6.80) | 1.19 (6.16) | 1.19 (5.56) | 1.20 (5.74)
a1 | 1.11 (15.76) | 1.09 (15.54) | 1.05 (15.25) | 1.04 (14.83) | 0.76 (14.69) | 0.43 (2.78) | 0.36 (3.15)
a2 | -0.45 (-4.33) | -0.43 (-4.11) | -0.22 (-3.18) | -0.20 (-2.80) | | 0.31 (2.19) | 0.38 (3.52)
a3 | 0.40 (3.68) | 0.36 (3.39) | | | | |
a4 | -0.30 (-2.70) | -0.25 (-2.30) | | | | |
a5 | 0.22 (2.02) | 0.16 (1.53) | | | | |
a6 | -0.30 (-2.86) | -0.15 (-2.11) | | | | |
a7 | 0.14 (1.93) | | | -0.03 (-0.77) | | |
β1 | | | | | 0.38 (5.23) | 0.69 (5.65) | 0.77 (9.62)
β7 | | | | | | | -0.14 (-3.27)
SSR | 43.86 | 44.68 | 48.02 | 47.87 | 46.93 | 45.76 | 43.72
AIC | 791.10 | 792.92 | 799.67 | 801.06 | 794.96 | 791.81 | 784.46
SBC | 817.68 | 816.18 | 809.63 | 814.35 | 804.93 | 805.10 | 801.07
Q(4) | 0.18 | 0.29 | 8.99 | 8.56 | 6.63 | 1.18 | 0.76
Q(8) | 5.69 | 10.93 | 21.74 | 22.39 | 18.48 | 12.27 | 2.60
Q(12) | 13.67 | 16.75 | 29.37 | 29.16 | 24.38 | 19.14 | 11.13

Models of Seasonal Data
Seasonal Differencing

11. SEASONALITY

Seasonality in the Box-Jenkins framework

• Seasonal AR coefficients
– yt = a1yt-1+a12yt-12 + a13yt-13
– yt = a1yt-1+a12yt-12 + a1a12yt-13
– (1 – a1L)(1 – a12L12)yt

• Seasonal MA Coefficients

• Seasonal differencing:
– ∆yt = yt – yt-1 versus ∆12yt = yt – yt-12
• NOTE: You do not difference 12 times
– In RATS you can use: dif(sdiffs=1) y / sdy
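For readers not using RATS, the same distinction between a first difference and a single seasonal difference can be sketched with NumPy slicing. The helper name `difference` and the toy series are assumptions made for the example.

```python
import numpy as np

def difference(y, lag=1):
    """Return y_t - y_{t-lag}; lag=1 is the first difference, lag=12 one seasonal difference."""
    y = np.asarray(y, dtype=float)
    return y[lag:] - y[:-lag]

y = np.arange(24, dtype=float) ** 2   # made-up series with 24 monthly observations
dy = difference(y)                    # 23 first differences
sdy = difference(y, lag=12)           # 12 seasonal differences, computed once
```

Note that the seasonal difference is taken a single time at lag 12; it is not the first difference applied twelve times.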

Figure 2.8: ACF and PACF
Panel a: M1 Growth
[Plot omitted: autocorrelations and PACF at lags 0 to 22.]

Three Models of Money growth

Model 1: AR(1) with Seasonal MA

mt = a0 + a1mt–1 + εt + β4εt–4

Model 2: Multiplicative Autoregressive

mt = a0 + (1 + a1L)(1 + a4L⁴)mt–1 + εt

Model 3: Multiplicative Moving Average

mt = a0 + (1 + β1L)(1 + β4L⁴)εt

Table 2.5 Three Models of Money Growth

Model 1 Model 2 Model 3


a1 0.541 0.496
(7.66)
(8.59)
a4 −0.476
(−7.28)
β1 0.453
(6.84)
β4 −0.759 −0.751
(−14.87)
(−15.11)
SSR 0.0177 0.0214 0.0193
AIC −735.9 −701.3 −720.1
SBC −726.2 −691.7 −710.4
Q(4) 1.39 (0.845) 3.97 (0.410) 22.19 (0.000)
Q(8) 6.34 (0.609) 24.21 (0.002) 30.41 (0.000)
Q(12) 14.34 (0.279) 32.75 (0.001) 42.55 (0.000)

Figure 2.9: Forecasts of M1
[Plot omitted: M1 in billions and forecasts, 2000 to 2014.]

Testing for Structural Change
Endogenous Breaks
Parameter Instability
An Example of a Break

12. PARAMETER INSTABILITY


AND STRUCTURAL CHANGE

Parameter Instability and the CUSUMs
• Brown, Durbin and Evans (1975) calculate whether the cumulated sum of the
forecast errors is statistically different from zero. Define:

$CUSUM_N = \sum_{i=n}^{N} e_i(1) / \sigma_e$,   N = n, …, T − 1

where n = date of the first forecast error you constructed and σe is the estimated
standard deviation of the forecast errors.

Example: With 150 total observations (T = 150), if you start the procedure
using the first 10 observations (n = 10), 140 forecast errors (T − n) can be
created. Note that σe is created using all T – n forecast errors.

To create CUSUM10, use the first ten observations to create e10(1)/σe. Now
let N = 11 and create CUSUM11 as [e10(1) + e11(1)]/σe. Similarly, CUSUMT−1 =
[e10(1) + … + eT−1(1)]/σe.

If you use the 5% significance level, the plotted value of each
CUSUMN should be within a band of approximately ± 0.948[(T − n)^0.5 +
2(N – n)(T − n)^−0.5].
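The CUSUM path and its 5% band can be sketched as follows. The function names are hypothetical, and using the sample standard deviation with divisor (number of errors − 1) for σe is an assumption; the slides say only that σe is estimated from all T − n forecast errors.

```python
import numpy as np

def cusum(errors):
    """Cumulated one-step-ahead forecast errors scaled by their estimated std. deviation."""
    e = np.asarray(errors, dtype=float)
    sigma_e = e.std(ddof=1)          # assumption: sample std. dev. over all T - n errors
    return np.cumsum(e) / sigma_e

def cusum_band(T, n):
    """Half-width 0.948*[(T-n)^0.5 + 2(N-n)(T-n)^-0.5] for N = n, ..., T-1."""
    N = np.arange(n, T)
    return 0.948 * (np.sqrt(T - n) + 2 * (N - n) / np.sqrt(T - n))

band = cusum_band(T=150, n=10)       # the T = 150, n = 10 example from the text
path = cusum([1.0, -1.0, 2.0])       # tiny made-up error series
```

Parameter instability is suggested when the CUSUM path crosses +band or −band at any N.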

Figure 2.10: Recursive Estimation of the Model
Panel 1: The Series
Panel 2: Intercept (with + 2 and − 2 standard deviation bands)
Panel 3: AR(1) Coefficient (with + 2 and − 2 standard deviation bands)
Panel 4: The CUSUM Test (CUSUMs with upper and lower 5% bands)
[Plots omitted.]

Section 13

COMBINING FORECASTS

13 Combining Forecasts

Consider the composite forecast fct constructed as a weighted average of the
individual forecasts
fct = w1f1t + w2f2t + … + wnfnt (2.71)

and ∑ wi = 1

If the forecasts are unbiased (so that Et−1fit = yt), it follows that the
composite forecast is also unbiased:

Et−1fct = w1Et−1f1t + w2Et−1f2t + … + wnEt−1fnt
= w1yt + w2yt + … + wnyt = yt

A Simple Example
To keep the notation simple, let n = 2.

Subtract yt from each side of (2.71) to obtain

fct − yt = w1(f1t − yt) + (1 − w1)(f2t − yt)

Now let e1t and e2t denote the series containing the one-step-ahead forecast errors
from models 1 and 2 (i.e., eit = yt − fit) and let ect be the composite forecast error.
As such, we can write

ect = w1e1t + (1 − w1)e2t

The variance of the composite forecast error is

var(ect) = w1²var(e1t) + (1 − w1)²var(e2t) + 2w1(1 − w1)cov(e1te2t) (2.72)

Suppose that the forecast error variances are the same size and that cov(e1te2t)
= 0. If you take a simple average by setting w1 = 0.5, (2.72) indicates that the
variance of the composite forecast is 50% of the variance of either forecast:
var(ect) = 0.25var(e1t) + 0.25var(e2t) = 0.5var(e1t) = 0.5var(e2t).
Optimal Weights
var(ect) = (w1)²var(e1t) + (1 − w1)²var(e2t) + 2w1(1 − w1)cov(e1te2t)
Select the weight w1 so as to minimize var(ect):

$\frac{\partial\, \mathrm{var}(e_{ct})}{\partial w_1} = 2 w_1 \mathrm{var}(e_{1t}) - 2(1 - w_1)\mathrm{var}(e_{2t}) + 2(1 - 2 w_1)\mathrm{cov}(e_{1t} e_{2t})$

Bates and Granger (1969) recommend constructing the weights
excluding the covariance terms.

$w_1^* = \frac{\mathrm{var}(e_{2t})}{\mathrm{var}(e_{1t}) + \mathrm{var}(e_{2t})} = \frac{\mathrm{var}(e_{1t})^{-1}}{\mathrm{var}(e_{1t})^{-1} + \mathrm{var}(e_{2t})^{-1}}$

In the n-variable case:

$w_1^* = \frac{\mathrm{var}(e_{1t})^{-1}}{\mathrm{var}(e_{1t})^{-1} + \mathrm{var}(e_{2t})^{-1} + \dots + \mathrm{var}(e_{nt})^{-1}}$
Alternative methods
Consider the regression equation
yt = α0 + α1f1t + α2f2t + … + αnfnt + vt (2.75)

It is also possible to force α0 = 0 and α1 + α2 + … + αn = 1.
Under these conditions, the αi’s would have the direct
interpretation of optimal weights.
Here, an estimated weight may be negative. Some researchers
would reestimate the regression without the forecast
associated with the most negative coefficient.

Granger and Ramanathan recommend the inclusion of an intercept
to account for any bias and to leave the αi’s unconstrained.

As surveyed in Clemen (1989), not all researchers agree with the
Granger–Ramanathan recommendation and a substantial amount
of work has been conducted so as to obtain optimal weights.
The SBC
Let SBCi be the SBC from model i and let SBC* be the SBC from the
best fitting model.
Form αi = exp[(SBC* − SBCi)/2] and then construct the weights

$w_i^* = \alpha_i \Big/ \sum_{i=1}^{n} \alpha_i$

Since exp(0) = 1, the model with the best fit has the weight 1/Σαi. Since
αi is decreasing in the value of SBCi, models with a poor fit will have
smaller weights than models with small values of the SBC.
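The SBC weighting scheme can be tried on the seven SBC values reported in Table 2.4. The sketch below uses a hypothetical helper name, takes the best-fitting model to be the one with the smallest SBC, and reuses the seven one-step-ahead forecasts listed in the example of the spread; it is a check on the arithmetic, not the author's own code.

```python
import math

def sbc_weights(sbc_values):
    """w_i = alpha_i / sum(alpha), with alpha_i = exp[(SBC* - SBC_i)/2] and SBC* the smallest SBC."""
    best = min(sbc_values)
    alphas = [math.exp((best - s) / 2.0) for s in sbc_values]
    total = sum(alphas)
    return [a / total for a in alphas]

# SBC row of Table 2.4 and the seven forecasts for 2013:1, in the same model order.
sbc_table = [817.68, 816.18, 809.63, 814.35, 804.93, 805.10, 801.07]
forecasts = [0.775, 0.775, 0.709, 0.687, 0.729, 0.725, 0.799]

weights = sbc_weights(sbc_table)
composite = sum(w * f for w, f in zip(weights, forecasts))
```

Because the weights are exponential in the SBC differences, a gap of only a few points is enough to concentrate most of the weight on a single model.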

Example of the Spread
I estimated seven different ARMA models of the interest rate spread. The data
ends in April 2012, and using each of the seven models to make a one-step-
ahead forecast for January 2013 gives:

Model | AR(7) | AR(6) | AR(2) | AR(||1,2,7||) | ARMA(1,1) | ARMA(2,1) | ARMA(2,||1,7||)
fi2013:1 | 0.775 | 0.775 | 0.709 | 0.687 | 0.729 | 0.725 | 0.799

Simple averaging of the individual forecasts results in a combined forecast of 0.743.

Construct 50 1-step-ahead out-of-sample forecasts for each model so as to obtain

Model | AR(7) | AR(6) | AR(2) | AR(||1,2,7||) | ARMA(1,1) | ARMA(2,1) | ARMA(2,||1,7||)
var(eit) | 0.635 | 0.618 | 0.583 | 0.587 | 0.582 | 0.600 | 0.606
wi | 0.135 | 0.139 | 0.147 | 0.146 | 0.148 | 0.143 | 0.141
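The wi row follows from weighting each model by the inverse of its forecast error variance. A minimal sketch, with a hypothetical helper name, recomputes the weights from the rounded variances shown above; small differences in the third decimal place relative to the reported weights are to be expected because the inputs are rounded.

```python
def inverse_variance_weights(variances):
    """w_i = var(e_i)^-1 / sum_j var(e_j)^-1, ignoring covariances between forecast errors."""
    inv = [1.0 / v for v in variances]
    total = sum(inv)
    return [x / total for x in inv]

variances = [0.635, 0.618, 0.583, 0.587, 0.582, 0.600, 0.606]   # var(eit) row above
forecasts = [0.775, 0.775, 0.709, 0.687, 0.729, 0.725, 0.799]   # fi2013:1 row above

weights = inverse_variance_weights(variances)
simple_average = sum(forecasts) / len(forecasts)
weighted_forecast = sum(w * f for w, f in zip(weights, forecasts))
```

Because the seven variances are so similar, the weights are all close to 1/7 and the weighted forecast is close to the simple average.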

Next, use the spread (st) to estimate a regression in the form of (2.75). If you
omit the intercept and constrain the weights to sum to unity, you should obtain:

st = 0.55f1t – 0.25f2t − 2.37f3t + 2.44f4t + 0.84f5t – 0.28f6t + 1.17f7t (6)

Although some researchers would include the negative weights in (6), most
would eliminate those that are negative. If you successively reestimate the
model by eliminating the forecast with the most negative coefficient, you
should obtain:
st = 0.326f4t + 0.170f5t + 0.504f7t
The composite forecast using the regression method is:
0.326(0.687) + 0.170(0.729) + 0.504(0.799) = 0.751.
If you use the values of the SBC as weights, you should obtain:

Model | AR(7) | AR(6) | AR(2) | AR(||1,2,7||) | ARMA(1,1) | ARMA(2,1) | ARMA(2,||1,7||)
wi | 0.000 | 0.000 | 0.011 | 0.001 | 0.112 | 0.103 | 0.773

The composite forecast using SBC weights is 0.782. In actuality, the spread in 2013:1 turned
out to be 0.74 (the actual data contains only two decimal places). Of the four methods, simple
averaging and weighting by the forecast error variances did quite well. In this instance, the
regression method and constructing the weights using the SBC provided the worst composite
forecasts.
APPENDIX 2.1: ML ESTIMATION OF A REGRESSION

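$\frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( \frac{-\varepsilon_t^2}{2\sigma^2} \right)$

$L = \prod_{t=1}^{T} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( \frac{-\varepsilon_t^2}{2\sigma^2} \right)$

$\ln L = -\frac{T}{2}\ln(2\pi) - \frac{T}{2}\ln\sigma^2 - \frac{1}{2\sigma^2} \sum_{t=1}^{T} \varepsilon_t^2$

Let εt = yt – βxt

$\ln L = -\frac{T}{2}\ln(2\pi) - \frac{T}{2}\ln\sigma^2 - \frac{1}{2\sigma^2} \sum_{t=1}^{T} (y_t - \beta x_t)^2$

$\frac{\partial \ln L}{\partial \sigma^2} = -\frac{T}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{t=1}^{T} (y_t - \beta x_t)^2$

$\frac{\partial \ln L}{\partial \beta} = \frac{1}{\sigma^2} \sum_{t=1}^{T} (y_t x_t - \beta x_t^2)$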

ML ESTIMATION OF AN MA(1)
Now let yt = βεt–1 + εt. The problem is to construct
the {εt} sequence from the observed values of
{yt}. If we knew the true value of β and knew that
ε0 = 0, we could construct ε1, … , εT recursively.
Given that ε0 = 0, it follows that:
ε 1 = y1
ε2 = y2 – βε1 = y2 – βy1
ε3 = y3 – βε2 = y3 – β (y2 – βy1 )
ε4 = y4 – βε3 = y4 – β [y3 – β (y2 – βy1 ) ]
In general, εt = yt – βεt–1 so that if L is the lag
operator

$\varepsilon_t = y_t / (1 + \beta L) = \sum_{i=0}^{t-1} (-\beta)^i y_{t-i}$

$\ln L = -\frac{T}{2}\ln(2\pi) - \frac{T}{2}\ln\sigma^2 - \frac{1}{2\sigma^2} \sum_{t=1}^{T} \left( \sum_{i=0}^{t-1} (-\beta)^i y_{t-i} \right)^2$