Topic Four: IV. Box-Jenkins Methodology
We consider how to fit an ARIMA(p,d,q) model to historical data {x1, x2, ...xn}. We assume that trends
and seasonal effects have been removed from the data.
If the tentatively identified model passes the diagnostic tests, it can be used for forecasting.
If it does not, the diagnostic tests should indicate how the model should be modified, and a new cycle of
• Identification
• Estimation
• Diagnostic checks
is performed.
Recall: in a simple linear regression model, yi = β0 + β1xi + ei, ei ~ IN(0, σ²), we use regression diagnostic plots of the residuals êi to test the goodness of fit of the model, i.e. whether the assumptions ei ~ IN(0, σ²) are justified.
The error variables et form a zero-mean white noise process: they are uncorrelated, with common variance σ²:

E(et) = 0 for all t

γk = Cov(et, et−k) = σ² if k = 0, and 0 otherwise
Thus the ACF and PACF of a white noise process (when plotted against k) look like this:

[Figure: ACF (ρk) and PACF (φkk) plotted against k = 1, 2, 3, ..., with all values at zero]

i.e. apart from ρ0 = 1, we have ρk = 0 for k = 1, 2, ... and φkk = 0 for k = 1, 2, ...
Question: how do we test if the residuals from a time series model look like a realisation of a white
noise process?
Answer: we look at the SACF and SPACF of the residuals. In studying the SACF and SPACF, we realise that even if the original process was white noise, we would not expect rk = 0 for k = 1, 2, ... and φ̂kk = 0 for k = 1, 2, ..., as rk is only an estimate of ρk and φ̂kk is only an estimate of φkk.
Question: how close to 0 should rk and φ̂kk be, if ρk = 0 for k = 1, 2, ... and φkk = 0 for k = 1, 2, ...?
Answer: If the original model is white noise, Xt = µ + et, then for each k, the SACF and SPACF satisfy

rk ~ N(0, 1/n) and φ̂kk ~ N(0, 1/n)

This is true for large samples, i.e. for large values of n.
Values of rk or φ̂kk outside the range (−2/√n, 2/√n) can be taken as suggesting that a white noise model is inappropriate.
However, these are only approximate 95% confidence intervals. If ρk = 0, we can be 95% certain that rk lies between these limits. This means that 1 value in 20 will lie outside these limits even if the white noise model is correct.
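This sampling behaviour is easy to check numerically. The following is a minimal sketch in Python (NumPy only; the simulated series, seed and number of lags are illustrative assumptions, not part of the notes): it simulates white noise, computes the sample ACF, and counts how many rk fall outside ±2/√n.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of a series x."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    c0 = np.sum(x * x)
    return np.array([np.sum(x[:-k] * x[k:]) / c0 for k in range(1, max_lag + 1)])

rng = np.random.default_rng(0)
n = 500
e = rng.standard_normal(n)      # a white noise realisation
r = sample_acf(e, 20)
bound = 2 / np.sqrt(n)          # approximate 95% limits
outside = int(np.sum(np.abs(r) > bound))
# roughly 1 lag in 20 should fall outside the limits by chance
```

Even for genuine white noise, a few sample autocorrelations will typically stray outside the limits, which is why a single excursion is not treated as significant.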
Hence a single value of rk or φ̂kk outside these limits would not be regarded as significant on its own, but three such values might well be significant.
There is an overall goodness-of-fit test, based on all the rk's in the SACF rather than on individual rk's, called the Portmanteau test of Ljung and Box. It consists of checking whether the m sample autocorrelation coefficients of the residuals are too large to resemble those of a white noise process (for which they should all be negligible).
Given residuals from an estimated ARMA(p,q) model, under the null hypothesis that ρk = 0 for all k, the Q-statistic given below is asymptotically χ²-distributed with s = m − p − q degrees of freedom, or, if a constant (say µ) is included, s = m − p − q − 1 degrees of freedom. If the Q-statistic is found to be greater than the 95th percentile of that χ² distribution, the null hypothesis is rejected, which means that the alternative hypothesis that "at least one autocorrelation is non-zero" is accepted. Statistical packages print these statistics. For large n, the Ljung-Box Q-statistic tends to closely approximate the Box-Pierce statistic:

Q = n(n + 2) Σ_{k=1}^m rk²/(n − k) ≈ n Σ_{k=1}^m rk²
The overall diagnostic test is therefore performed as follows (for centred realisations):
• Fit ARMA(p,q) model
• Estimate (p+q) parameters
• Test if

Q = n(n + 2) Σ_{k=1}^m rk²/(n − k) ~ χ² with m − p − q degrees of freedom
Remark: the above Ljung-Box Q-statistic was first suggested to improve upon the simpler Box-Pierce
test statistic
Q = n Σ_{k=1}^m rk²
which was found to perform poorly even for moderately large sample sizes.
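Both statistics are straightforward to compute side by side. A sketch in Python (NumPy only; the white-noise "residuals", seed and choice of m are illustrative assumptions) also shows that the Ljung-Box form is always slightly larger for finite n, since each term carries a factor n(n + 2)/(n − k) > n:

```python
import numpy as np

def sample_acf(x, max_lag):
    x = np.asarray(x, dtype=float) - np.mean(x)
    c0 = np.sum(x * x)
    return np.array([np.sum(x[:-k] * x[k:]) / c0 for k in range(1, max_lag + 1)])

def ljung_box_q(x, m):
    """Ljung-Box: Q = n(n+2) * sum_{k=1}^m r_k^2 / (n - k)."""
    n = len(x)
    r = sample_acf(x, m)
    return n * (n + 2) * np.sum(r**2 / (n - np.arange(1, m + 1)))

def box_pierce_q(x, m):
    """Box-Pierce: Q = n * sum_{k=1}^m r_k^2."""
    n = len(x)
    r = sample_acf(x, m)
    return n * np.sum(r**2)

rng = np.random.default_rng(1)
x = rng.standard_normal(400)    # stand-in for ARMA residuals (illustrative)
m = 10
q_lb = ljung_box_q(x, m)
q_bp = box_pierce_q(x, m)
```

In practice q_lb would be compared with the 95th percentile of the χ² distribution on m − p − q degrees of freedom for residuals from a fitted ARMA(p,q) model.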
b. Identification of MA(q)
Recall: for an MA(q) process, ρk = 0 for all k > q, i.e. the "ACF cuts off after lag q".
To test if an MA(q) model is appropriate, we see if rk is close to 0 for all k > q. If the data do come from an MA(q) model, then for k > q (since ρ1, ..., ρq may be non-zero),

rk ~ N(0, (1 + 2 Σ_{i=1}^q ρi²)/n)

so approximately 95% of such sample values should lie in the interval

[−1.96 √((1 + 2 Σ_{i=1}^q ρi²)/n), +1.96 √((1 + 2 Σ_{i=1}^q ρi²)/n)]

(note that it is common to use 2 instead of 1.96 in the above formula). We would expect 1 in 20 values to lie outside the interval. In practice, the ρi's are replaced by ri's. The "confidence limits" on SACF plots are based on this. If rk lies outside these limits it is "significantly different from zero" and we conclude that ρk ≠ 0. Otherwise, rk is not significantly different from zero and we conclude that ρk = 0.
[Figure: SACF plot of rk against lag k, with dashed upper and lower confidence limits]
For q = 0, the limits for k = 1 are (−1.96/√n, 1.96/√n), as for testing for a white noise model. Coefficient r1 is compared with these limits. For q = 1, the limits for k = 2 are

[−1.96 √((1 + 2r1²)/n), +1.96 √((1 + 2r1²)/n)]

and r2 is compared with these limits. Again, 2 is often used in place of 1.96.
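These lag-dependent limits are simple to compute. A small sketch in Python (NumPy only; the function name and the numeric values r1 = 0.5, n = 100 are illustrative assumptions):

```python
import numpy as np

def ma_q_limit(r, q, n, z=1.96):
    """Approximate 95% limit for |r_k|, k > q, under a hypothesised MA(q) model.

    r: sample autocorrelations [r_1, r_2, ...] used in place of the rho_i;
    returns z * sqrt((1 + 2 * sum_{i=1}^q r_i^2) / n).
    """
    s = 1.0 + 2.0 * sum(r[i] ** 2 for i in range(q))  # r[0] holds r_1
    return z * np.sqrt(s / n)

# q = 0 (white noise): reduces to 1.96 / sqrt(n).
lim0 = ma_q_limit([], 0, 100)
# q = 1 with r_1 = 0.5, n = 100: limit for comparing r_2.
lim1 = ma_q_limit([0.5], 1, 100)
```

Note how the limit widens as more low-order autocorrelations are allowed to be non-zero, so a value of rk that is "significant" against the white-noise limits may not be significant against the MA(q) limits.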
c. Identification of AR(p)
Recall: for an AR(p) process, we have φkk = 0 for all k > p, i.e. the "PACF cuts off after lag p".
To test if an AR(p) model is appropriate, we see if the sample estimate φ̂kk is close to 0 for all k > p. If the data do come from an AR(p) model, then for k > p,

φ̂kk ~ N(0, 1/n)
and 95% of the sample estimates should lie in the interval
(−2/√n, 2/√n)
The "confidence limits" on SPACF plots are based on this: if the sample estimate φ̂kk lies outside these limits, it is "significant".
[Figure: sample PACF plotted against lag k (lags 1 to 15), with confidence limits at roughly ±0.2]
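The partial autocorrelations φkk can be computed from the autocorrelations by the standard Durbin-Levinson recursion. A sketch (the recursion itself is standard; the AR(1) check values with φ = 0.5 are an illustrative assumption):

```python
import numpy as np

def pacf_from_acf(rho):
    """Durbin-Levinson recursion: phi_kk from rho_1, ..., rho_m (rho_0 = 1 implied)."""
    m = len(rho)
    phi = np.zeros((m + 1, m + 1))
    pacf = []
    phi[1, 1] = rho[0]
    pacf.append(phi[1, 1])
    for k in range(2, m + 1):
        num = rho[k - 1] - sum(phi[k - 1, j] * rho[k - 1 - j] for j in range(1, k))
        den = 1.0 - sum(phi[k - 1, j] * rho[j - 1] for j in range(1, k))
        phi[k, k] = num / den
        for j in range(1, k):                      # update intermediate coefficients
            phi[k, j] = phi[k - 1, j] - phi[k, k] * phi[k - 1, k - j]
        pacf.append(phi[k, k])
    return pacf

# For an AR(1) with phi = 0.5, rho_k = 0.5**k, so phi_11 = 0.5 and
# phi_kk = 0 for k > 1: the PACF "cuts off after lag 1".
p = pacf_from_acf([0.5 ** k for k in range(1, 6)])
```

Applied to the sample autocorrelations rk, the same recursion yields the SPACF values φ̂kk that are compared with the ±2/√n limits.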
IV.3 Model fitting
Assume:
• An appropriate value of d has been found and the differenced series {zd+1, zd+2, ..., zn} is stationary.
• For simplicity, we take d = 0 (to simplify the upper and lower limits of sums).
• If the SACF appears to cut off after lag q, an MA(q) model is indicated (we use the tests of
significance described previously).
• If the SPACF appears to cut off after lag p, an AR(p) model is indicated.
If neither the SACF nor the SPACF cut off, mixed models must be considered, starting with
ARMA(1,1).
Having identified the values for the parameters p and q, we must now estimate the values of the parameters φ1, φ2, ..., φp and θ1, θ2, ..., θq in the model

Zt = φ1Zt−1 + ... + φpZt−p + et + θ1et−1 + ... + θqet−q
Least squares (LS) estimation is equivalent to maximum likelihood (ML) estimation if et is assumed
normally distributed.
Example: in the AR(p) model, et = Zt − φ1Zt−1 − ... − φpZt−p. The estimators φ̂1, ..., φ̂p are chosen to minimise

Σ_{t=p+1}^n (zt − φ̂1zt−1 − ... − φ̂pzt−p)²
For general ARMA models, êt cannot be deduced directly from the zt. In the MA(1) model for instance,

êt = zt − θ̂êt−1

We can solve this iteratively for êt as long as some starting value ê0 is assumed. For an ARMA(p,q) model, the list of starting values is (ê0, ê1, ..., êq−1). The starting values are estimated recursively by backforecasting:
0. Assume (ê0, ê1, ..., êq−1) are all zero
1. Estimate the φi and θj
2. Use forecasting on the time-reversed process {zn, ..., z1} to predict values for (ê0, ê1, ..., êq−1)
3. Repeat cycle (1)-(2) until the estimates converge.
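For the MA(1) case, the iterative residual computation is only a few lines. A sketch in Python (the data and the value of θ̂ are illustrative assumptions, and ê0 is simply set to zero here rather than backforecast):

```python
def ma1_residuals(z, theta_hat, e0=0.0):
    """Residuals of an MA(1) fit via the recursion e_t = z_t - theta_hat * e_{t-1}."""
    e_hat = []
    prev = e0                       # starting value (assumed zero in this sketch)
    for zt in z:
        et = zt - theta_hat * prev
        e_hat.append(et)
        prev = et
    return e_hat

res = ma1_residuals([1.0, 2.0, 3.0], theta_hat=0.5)
# e_1 = 1 - 0.5*0 = 1.0;  e_2 = 2 - 0.5*1 = 1.5;  e_3 = 3 - 0.5*1.5 = 2.25
```

The backforecasting cycle above would replace the crude choice ê0 = 0 with a forecast from the time-reversed series, then re-estimate θ until convergence.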
Method of moments estimation proceeds as follows:
• Calculate the theoretical ACF of the ARMA(p,q) model: the ρk's will be functions of the φ's and θ's.
• Set ρk = rk and solve for the φ's and θ's. These are the method of moments estimators.
Example: suppose the data come from the MA(1) model

xt = et + θet−1, et ~ N(0,1)

We have r1 = γ̂1/γ̂0 = −0.25. Setting ρ1 = θ/(1 + θ²) = r1 = −0.25 gives θ² + 4θ + 1 = 0, so θ = −2 ± √3, i.e. θ = −0.268 or θ = −3.732.
Recall: the MA(1) process is invertible IFF |θ| < 1. So for θ = −0.268, the model is invertible. But for θ = −3.732 the model is not invertible.
Note: If r1 = −0.5 here, then ρ1 = r1 = θ/(1 + θ²) = −0.5, which gives (θ + 1)² = 0, so θ = −1, and the resulting estimate does not give an invertible model.
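The method-of-moments step for MA(1) amounts to solving the quadratic r1·θ² − θ + r1 = 0 and keeping the invertible root, when one exists. A sketch (the function name and edge-case handling are assumptions of this illustration):

```python
import math

def ma1_moment_estimate(r1):
    """Method-of-moments estimate of theta in x_t = e_t + theta * e_{t-1}.

    Solves r1 = theta / (1 + theta^2); returns the invertible root
    (|theta| < 1), or None if no real invertible solution exists.
    """
    if r1 == 0.0:
        return 0.0
    disc = 1.0 - 4.0 * r1 * r1
    if disc < 0.0:                  # |r1| > 0.5: no real solution
        return None
    roots = [(1.0 + math.sqrt(disc)) / (2.0 * r1),
             (1.0 - math.sqrt(disc)) / (2.0 * r1)]
    for theta in roots:
        if abs(theta) < 1.0:
            return theta
    return None                     # e.g. r1 = -0.5 gives only theta = -1

theta = ma1_moment_estimate(-0.25)  # the example from the notes
# theta is approximately -0.268; the other root, -3.732, is discarded
```

The two roots are reciprocals of each other, which is why at most one of them can be invertible.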
Recall that in the simple linear model Yi = β0 + β1Xi + ei, ei ~ IN(0, σ²), σ² is estimated by

σ̂² = (1/(n − 2)) Σ_{i=1}^n êi²

Similarly, in the ARMA(p,q) model, σ² is estimated by

σ̂² = (1/n) Σ_{t=p+1}^n êt² = (1/n) Σ_{t=p+1}^n (zt − φ̂1zt−1 − ... − φ̂pzt−p − θ̂1êt−1 − ... − θ̂qêt−q)²
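For a pure AR fit the residuals, and hence σ̂², can be computed directly from the data. A sketch for AR(1) (the series and φ̂ are illustrative numbers, not estimates from real data):

```python
import numpy as np

def ar1_sigma2_hat(z, phi_hat):
    """Estimate sigma^2 as (1/n) * sum_{t=2}^{n} (z_t - phi_hat * z_{t-1})^2."""
    z = np.asarray(z, dtype=float)
    n = len(z)
    e_hat = z[1:] - phi_hat * z[:-1]    # residuals for t = 2, ..., n
    return np.sum(e_hat ** 2) / n

s2 = ar1_sigma2_hat([1.0, 2.0, 3.0, 4.0], phi_hat=0.5)
# residuals: 1.5, 2.0, 2.5  ->  sigma2_hat = (2.25 + 4 + 6.25) / 4 = 3.125
```

Note the divisor n rather than the number of residuals n − p, matching the formula above.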
No matter which estimation method is used, this parameter is estimated last, as estimates of the φ's and θ's are required first.
Note: In using either Least Squares or Maximum Likelihood estimation we also find the residuals êt, whereas using the Method of Moments to estimate the φ's and θ's these residuals have to be calculated afterwards.
Note: for large n, there will be little difference between LS, ML and Method of Moments estimators.
d. Diagnostic checking
Assume we have identified a tentative ARIMA(p,d,q) model and calculated the estimates

µ̂, σ̂², φ̂1, ..., φ̂p, θ̂1, ..., θ̂q.
We must perform diagnostic checks based on the residuals. If the ARMA(p,q) model is a good
approximation to the underlying time series process, then the residuals ê t will form a good
approximation to a white noise process.
If the SACF or SPACF of the residuals has too many values outside the interval (−1.96/√n, 1.96/√n), we conclude that the fitted model does not have enough parameters and a new model with additional parameters should be fitted.
The Portmanteau test may also be used for this purpose. Other tests are:
• plot êt against t
• plot êt against zt
Any patterns evident in these plots may indicate that the residuals are not a realisation of a set of independent (uncorrelated) variables and so the model is inadequate.
(III) Counting Turning Points:
This is a test of independence. Are the residuals a realisation of a set of independent variables?
Possible configurations for a turning point are:

[Figure: the six possible configurations (a)–(f) of three consecutive values]

In the diagram above, there exists a turning point in all configurations except (a) and (b). Since four out of the six possible configurations exhibit a turning point, the probability of observing one is 4/6 = 2/3.
If y1, y2, ..., yn is a sequence of numbers, the sequence has a turning point at time k if
either
yk-1 < yk AND yk > yk+1
or
yk-1 > yk AND yk < yk+1
Therefore, the number of turning points in a realisation of Y1, Y2, ..., YN has mean (2/3)(N − 2) and variance (16N − 29)/90, and should lie within the approximate 95% confidence interval

[(2/3)(N − 2) − 1.96 √((16N − 29)/90), (2/3)(N − 2) + 1.96 √((16N − 29)/90)]
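The turning point test is easy to implement directly from the definition. A sketch in Python (standard library only; the short example sequence is an illustrative assumption):

```python
import math

def turning_point_test(y, z=1.96):
    """Count turning points and return the approximate 95% interval under independence."""
    n = len(y)
    count = sum(
        1 for k in range(1, n - 1)
        if (y[k - 1] < y[k] > y[k + 1]) or (y[k - 1] > y[k] < y[k + 1])
    )
    mean = 2.0 * (n - 2) / 3.0              # expected number of turning points
    sd = math.sqrt((16 * n - 29) / 90.0)    # standard deviation
    return count, (mean - z * sd, mean + z * sd)

count, (lo, hi) = turning_point_test([1.0, 3.0, 2.0, 4.0, 1.0])
# 3 turning points (at t = 2, 3, 4); expected number is 2(5-2)/3 = 2
```

A count falling outside the interval suggests the residuals are not a realisation of independent variables.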
Recall: the spectral density function of a white noise process is f(ω) = σ²/(2π), −π < ω < π. So the sample spectral density function of the residuals should be roughly constant if the residuals form a white noise process.
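This flatness check can be made concrete with the periodogram I(ωj) = |Σt xt e^{−iωj t}|²/(2πn). By Parseval's identity its average over the Fourier frequencies equals the sample second moment divided by 2π, so for white-noise residuals it hovers around σ²/(2π). A sketch (NumPy only; the simulated residual series and seed are illustrative assumptions):

```python
import numpy as np

def periodogram(x):
    """Periodogram I(omega_j) = |sum_t x_t exp(-i omega_j t)|^2 / (2 pi n)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    return np.abs(np.fft.fft(x)) ** 2 / (2.0 * np.pi * n)

rng = np.random.default_rng(2)
e = rng.standard_normal(512)    # stand-in for white-noise residuals
per = periodogram(e)
# By Parseval, per.mean() equals mean(e^2) / (2 pi), close to sigma^2/(2 pi) = 1/(2 pi) here.
```

Large isolated peaks, or a clearly sloping periodogram, would indicate remaining periodic or autocorrelated structure in the residuals.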