Main Linear Models of Time Series
3.1 Introduction
The main linear models used for modelling stationary time series are the autoregressive process $AR(p)$, the moving average process $MA(q)$ and the autoregressive moving average process $ARMA(p,q)$.
The definitions of each of these processes, presented below, involve the standard zero-mean white noise process $\{e_t : t = 1, 2, \dots\}$ defined in Section 2.3.
In practice we often wish to model processes which are not $I(0)$ (stationary) but $I(1)$. For this purpose a further model is considered: the autoregressive integrated moving average process, $ARIMA(p,d,q)$.
ARMA and ARIMA are pronounced as single words (similar to ‘armour’ and ‘areema’).
Autoregressive
An autoregressive process of order p (the notation AR ( p ) is commonly used) is a
sequence of random variables X t defined consecutively by the rule:
$X_t = \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \cdots + \alpha_p X_{t-p} + e_t$
Thus the autoregressive model attempts to explain the current value of X as a linear
combination of past values with some additional externally generated random variation.
The similarity to the procedure of linear regression is clear, and explains the origin of the
name ‘autoregression’.
Moving average
A moving average process of order q , denoted MA(q ) , is a sequence X t defined by the
rule:
$X_t = e_t + \beta_1 e_{t-1} + \cdots + \beta_q e_{t-q}$
The moving average model explains the relationship between the X t as an indirect effect,
arising from the fact that the current value of the process results from the recently passed
random error terms as well as the current one. In this sense, X t is ‘smoothed noise’.
Autoregressive moving average
An autoregressive moving average process, denoted $ARMA(p,q)$, combines the two models:
$X_t = \alpha_1 X_{t-1} + \cdots + \alpha_p X_{t-p} + e_t + \beta_1 e_{t-1} + \cdots + \beta_q e_{t-q}$
The backwards shift operator, $B$, acts on the process $X$ to give a process $BX$ such that:
$BX_t = X_{t-1}$
If we apply the backwards shift operator to a constant, then it doesn't change it:
$B\mu = \mu$
The difference operator, $\nabla = 1 - B$, acts on $X$ to give:
$\nabla X_t = X_t - X_{t-1}$
Applying it twice:
$(\nabla^2 X)_t = (\nabla X)_t - (\nabla X)_{t-1} = X_t - 2X_{t-1} + X_{t-2}$
The usefulness of both of these operators will become apparent in later sections.
Note that:
$\nabla^2 X_t = (1 - B)^2 X_t = (1 - 2B + B^2)X_t = X_t - 2X_{t-1} + X_{t-2}$
Similarly:
$\nabla^3 X_t = (1 - B)^3 X_t = (1 - 3B + 3B^2 - B^3)X_t = X_t - 3X_{t-1} + 3X_{t-2} - X_{t-3}$
For example:
$X_t - 5X_{t-1} + 7X_{t-2} - 3X_{t-3} = (X_t - X_{t-1}) - 4(X_{t-1} - X_{t-2}) + 3(X_{t-2} - X_{t-3}) = \nabla^2 X_t - 3\nabla^2 X_{t-1}$
Question
Suppose that $w_n = \nabla x_n$. Give a formula for $x_n$ in terms of $x_0$ and the differences:
$w_n = x_n - x_{n-1},\ \ w_{n-1} = x_{n-1} - x_{n-2},\ \ \dots,\ \ w_1 = x_1 - x_0$
Solution
We have:
$x_n = w_n + x_{n-1}$
$= w_n + w_{n-1} + x_{n-2}$
$= \cdots$
$= w_n + w_{n-1} + \cdots + w_1 + x_0$
$= x_0 + \sum_{i=1}^{n} w_i$
The R commands for generating the differenced values of some time series x are:
diff(x, lag=1, differences=1)
for the simple difference $\nabla x$,
diff(x, lag=1, differences=3)
for the third-order difference $\nabla^3 x$, and:
diff(x, lag=12, differences=1)
for a simple seasonal difference with period 12, $\nabla_{12} x$ (see Section 1.4 in Chapter 14).
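As a quick illustrative check of the relationship $x_n = x_0 + \sum_{i=1}^n w_i$ derived above, the following R sketch (the series x here is just simulated data, not taken from the course) differences a series and then rebuilds it from its starting value:
set.seed(1)
x <- cumsum(rnorm(100))              # an arbitrary series to experiment with
w <- diff(x, lag=1, differences=1)   # w[i] = x[i+1] - x[i]
x.rebuilt <- x[1] + c(0, cumsum(w))  # x_0 plus the accumulated differences
all.equal(x, x.rebuilt)              # TRUE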
The first-order autoregressive process, $AR(1)$, with mean $\mu$ is defined by:
$X_t - \mu = \alpha(X_{t-1} - \mu) + e_t \qquad (13.1)$
Iterating this relation gives:
$X_t = \mu + \alpha^t(X_0 - \mu) + \sum_{j=0}^{t-1}\alpha^j e_{t-j} \qquad (13.2)$
This representation can be obtained by substituting in for $X_{t-1}$, then for $X_{t-2}$, and so on, until we reach $X_0$. It is important to realise that $X_0$ itself will be a random variable in general – it is not necessarily a given constant, although it might be.
$\mu_t = \mu + \alpha^t(\mu_0 - \mu)$
Here the notation $\mu_t$ is being used in place of $E(X_t)$, and $\mu_0 = E(X_0)$. This result follows by taking expectations of both sides of Equation (13.2), and noting that the white noise terms have zero mean.
The white noise terms are also uncorrelated with each other, and with X 0 . It follows that the
variance of Xt can be found by summing the variances of the terms on the right-hand side.
$\text{var}(X_t) = \sigma^2\,\frac{1 - \alpha^{2t}}{1 - \alpha^2} + \alpha^{2t}\,\text{var}(X_0)$
where, as before, $\sigma^2$ denotes the common variance of the white noise terms $e_t$.
Question
Derive the above expression for $\text{var}(X_t)$.
Solution
$\text{var}(X_t) = \text{var}\!\left(\mu + \alpha^t(X_0 - \mu) + \sum_{j=0}^{t-1}\alpha^j e_{t-j}\right)$
$= \alpha^{2t}\,\text{var}(X_0) + \sum_{j=0}^{t-1}\alpha^{2j}\,\text{var}(e_{t-j})$
$= \alpha^{2t}\,\text{var}(X_0) + \sigma^2\sum_{j=0}^{t-1}\alpha^{2j}$
Now using the formula $a + ar + ar^2 + \cdots + ar^{n-1} = \dfrac{a(1 - r^n)}{1 - r}$ for summing the first $n$ terms of a geometric progression, we see that:
$\text{var}(X_t) = \alpha^{2t}\,\text{var}(X_0) + \sigma^2\,\frac{1 - \alpha^{2t}}{1 - \alpha^2}$
For the process X to be stationary, its mean and variance must both be constant. A quick look at
the expressions above is enough to see that this will not be the case in general. It is therefore
natural to ask for conditions under which the process is stationary.
From this it follows that a stationary process $X$ satisfying (13.1) can exist only if $|\alpha| < 1$.
It should be clear that we require $\mu_0 = \mu$ in order to remove the $t$-dependence from the mean. Similarly, we require $\text{var}(X_0) = \dfrac{\sigma^2}{1 - \alpha^2}$ in order to make the variance constant. (We are assuming $\alpha \neq 0$; otherwise $X$ is a white noise process, which is certainly stationary.)
We also require $|\alpha| < 1$. One way of seeing this is to note that the variance has to be a finite non-negative number.
Notice that this implies that $X$ can be stationary only if $X_0$ is random. If $X_0$ is a known constant, then $\text{var}(X_0) = 0$ and $\text{var}(X_t)$ is no longer independent of $t$, whereas if $X_0$ has expectation different from $\mu$ then the process $X$ will have non-constant expectation.
We now consider the situation in which $\mu_0 \neq \mu$ and/or $\text{var}(X_0) \neq \dfrac{\sigma^2}{1 - \alpha^2}$. From what we've just said, the process will then be non-stationary. However, what we are about to see is that even if the process is non-stationary, as long as $|\alpha| < 1$, the process will become stationary in the long run, without any extra conditions.
It is easy to see that the difference $\mu_t - \mu$ is a multiple of $\alpha^t$ and that $\text{var}(X_t) - \dfrac{\sigma^2}{1 - \alpha^2}$ is a multiple of $\alpha^{2t}$.
This follows by writing the equations we derived above for the mean and variance in the form:
$\mu_t - \mu = \alpha^t(\mu_0 - \mu) \qquad\text{and}\qquad \text{var}(X_t) - \frac{\sigma^2}{1 - \alpha^2} = \alpha^{2t}\left(\text{var}(X_0) - \frac{\sigma^2}{1 - \alpha^2}\right)$
Both of these terms will decay away to zero for large $t$ if $|\alpha| < 1$, implying that $X$ will be virtually stationary for large $t$.
We can also turn this result on its head: if we assume that the process has already been running
for a very long time, then the process will be stationary. In other words, any AR(1) process with
an infinite history and $|\alpha| < 1$ will be stationary.
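A small simulation illustrates this long-run behaviour. The values below ($\mu = 10$, $\alpha = 0.8$, a starting value of 100) are chosen purely for illustration; after a burn-in period the simulated path behaves like a stationary series fluctuating around $\mu$:
set.seed(1)
mu <- 10; alpha <- 0.8
x <- numeric(500)
x[1] <- 100                                        # start a long way from mu
for (t in 2:500) x[t] <- mu + alpha*(x[t-1] - mu) + rnorm(1)
mean(x[401:500])                                   # close to mu once the start has been 'forgotten'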
In this case $X_t$ can be written as an infinite sum of past white noise terms:
$X_t = \mu + \sum_{j=0}^{\infty}\alpha^j e_{t-j} \qquad (13.3)$
This is an explicit representation of the value of $X_t$ in terms of the historic values of the process $e$. The infinite sum on the right-hand side only makes sense, ie converges, if $|\alpha| < 1$. The equation can be derived in two ways – either using an iterative procedure as used above to derive Equation (13.2), or by using the backward shift operator.
For the latter method, we write the defining equation $X_t - \mu = \alpha(X_{t-1} - \mu) + e_t$ in the form:
$(1 - \alpha B)(X_t - \mu) = e_t$
The expression $1 - \alpha B$ will be invertible (using an expansion in powers of $B$) if and only if $|\alpha| < 1$. (In fact we could expand it for $|\alpha| > 1$ using $(1 - \alpha B)^{-1} = -\alpha^{-1}B^{-1}\left(1 + \alpha^{-1}B^{-1} + \alpha^{-2}B^{-2} + \cdots\right)$. However, this would give an expansion in terms of future values, since $B^{-1}$ is effectively a forward shift, and would not therefore be of much use. We will not point out this qualification in future.)
$X_t - \mu = (1 - \alpha B)^{-1}e_t = \left(1 + \alpha B + \alpha^2 B^2 + \cdots\right)e_t = e_t + \alpha e_{t-1} + \alpha^2 e_{t-2} + \cdots$
In other words:
$X_t = \mu + \sum_{j=0}^{\infty}\alpha^j e_{t-j}$
This representation makes it clear that $X_t$ has expectation $\mu$ and variance equal to:
$\sigma^2\sum_{j=0}^{\infty}\alpha^{2j} = \frac{\sigma^2}{1 - \alpha^2}$
if $|\alpha| < 1$.
This last step uses the formula for the sum of an infinite geometric progression.
So, in this case, the process does satisfy the conditions required for stationarity given above.
We have only looked at the mean and variance so far, however. We also need to look at the
autocovariance function.
$\gamma_k = \text{cov}(X_t, X_{t+k}) = \sum_{j=0}^{\infty}\sum_{i=0}^{\infty}\alpha^i\alpha^j\,\text{cov}(e_{t-j}, e_{t+k-i}) = \sigma^2\sum_{j=0}^{\infty}\alpha^{2j+k} = \alpha^k\sigma^2\sum_{j=0}^{\infty}\alpha^{2j} = \alpha^k\gamma_0$
The double sum has been simplified here by noting that the covariance will be zero unless the subscripts $t-j$ and $t+k-i$ are equal, ie unless $i = j + k$. In this case the covariance equals $\sigma^2$. So in the sum over $i$, we only include the term for $i = j + k$.
We have also used the formula $\gamma_0 = \text{var}(X_t) = \sigma^2\sum_{j=0}^{\infty}\alpha^{2j} = \dfrac{\sigma^2}{1 - \alpha^2}$ from before.
It is worth introducing here a method of more general utility for calculating autocovariance
functions. From (13.1) we have, assuming that X is stationary:
$\gamma_k = \text{cov}(X_t, X_{t-k}) = \text{cov}\bigl(\mu + \alpha(X_{t-1} - \mu) + e_t,\; X_{t-k}\bigr) = \alpha\,\text{cov}(X_{t-1}, X_{t-k}) = \alpha\gamma_{k-1}$
implying that:
$\gamma_k = \alpha^k\gamma_0 = \frac{\alpha^k\sigma^2}{1 - \alpha^2}$
and:
$\rho_k = \frac{\gamma_k}{\gamma_0} = \alpha^k$
for $k \geq 0$.
The partial autocorrelation function of the AR(1) is:
$\phi_1 = \rho_1 = \alpha$
$\phi_2 = \frac{\rho_2 - \rho_1^2}{1 - \rho_1^2} = \frac{\alpha^2 - \alpha^2}{1 - \alpha^2} = 0$
Indeed, since the best linear estimator of $X_t$ given $X_{t-1}, X_{t-2}, X_{t-3}, \dots$ is a function of $X_{t-1}$ alone, the definition of the PACF implies that $\phi_k = 0$ for all $k > 1$. Notice the contrast with the ACF, which decreases geometrically towards 0.
The following lines in R generate the ACF and PACF functions for an AR (1) model:
par(mfrow=c(1,2))
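One possible continuation of this code, using the ARMAacf function in the stats package and taking $\alpha = 0.7$ purely for illustration, is:
barplot(ARMAacf(ar=0.7, lag.max=10), main="ACF of AR(1)")
barplot(ARMAacf(ar=0.7, lag.max=10, pacf=TRUE), main="PACF of AR(1)")
The ACF bars decrease geometrically, while only the first PACF bar is non-zero, as described above.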
Example
One of the well known applications of a univariate autoregressive model is the description
of the evolution of the consumer price index $\{Q_t : t = 1, 2, 3, \dots\}$. The force of inflation,
$r_t = \ln(Q_t / Q_{t-1})$, is assumed to follow the $AR(1)$ process:
$r_t = \mu + \alpha(r_{t-1} - \mu) + e_t$
One initial condition, the value for r0 , is required for the complete specification of the
model for the force of inflation rt .
The process $r_t$ is said to be mean-reverting, ie it has a long-run mean, and if it drifts away, then it tends to be dragged back towards it. In this case, the long-run mean is $\mu$. The equation for $r_t$ can be written in the form:
$r_t - \mu = \alpha(r_{t-1} - \mu) + e_t$
If we ignore the error term, which has a mean of zero, then this equation says that the difference between $r$ and the long-run mean $\mu$ at time $t$ is $\alpha$ times the previous difference. In order to be mean-reverting, this distance must reduce, so we need $|\alpha| < 1$, as for stationarity. In fact, we probably wouldn't expect the force of inflation to be dragged to the other side of the mean, so a realistic model is likely to have $0 < \alpha < 1$.
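A path of this model can be simulated with arima.sim; the parameter values used here ($\mu = 3\%$ pa, $\alpha = 0.6$, $\sigma = 1\%$) are illustrative only:
set.seed(1)
mu <- 0.03; alpha <- 0.6; sigma <- 0.01
r <- mu + arima.sim(model=list(ar=alpha), n=240, sd=sigma)   # mean-reverting force of inflation
plot(r); abline(h=mu, lty=2)                                 # path fluctuates around mu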
The general autoregressive process of order $p$, $AR(p)$, with mean $\mu$ is defined by:
$X_t - \mu = \alpha_1(X_{t-1} - \mu) + \alpha_2(X_{t-2} - \mu) + \cdots + \alpha_p(X_{t-p} - \mu) + e_t \qquad (13.4)$
As seen for $AR(1)$, there are some restrictions on the values of the $\alpha_j$ which are permitted
if the process is to be stationary. In particular, we have the following result.
If $X$ is a stationary $AR(p)$ process satisfying (13.4), then the roots of the characteristic equation:
$1 - \alpha_1 z - \alpha_2 z^2 - \cdots - \alpha_p z^p = 0$
are all strictly greater than 1 in magnitude.
For the AR(1) process the characteristic equation is:
$1 - \alpha z = 0$
The root of this equation is $z = 1/\alpha$. So for an AR(1) process to be stationary, we must have:
$\left|\frac{1}{\alpha}\right| > 1$
ie:
$|\alpha| < 1$
It is important to realise what this result does not say. Although for an AR(1) process we can look at the coefficient $\alpha$ and deduce stationarity if and only if $|\alpha| < 1$, we cannot do this for higher order autoregressive processes. For example, an AR(2) process:
$X_t = \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + e_t$
would not necessarily be stationary just because $|\alpha_i| < 1$ for both $i = 1$ and 2. We would need to look at the roots of the characteristic polynomial.
For a stationary $AR(p)$ process, the autocovariance function satisfies:
$\gamma_k = \text{cov}(X_t, X_{t-k}) = \text{cov}\!\left(\sum_{j=1}^{p}\alpha_j X_{t-j} + e_t,\; X_{t-k}\right) = \sum_{j=1}^{p}\alpha_j\gamma_{k-j}$
for $k \geq p$.
This is a $p$th-order difference equation with constant coefficients; it has a solution of the form:
$\gamma_k = \sum_{j=1}^{p}A_j z_j^{-k}$
for all $k \geq 0$, where $z_1, \dots, z_p$ are the $p$ roots of the characteristic polynomial and $A_1, \dots, A_p$ are constants. (We will show this in a moment.) As $X$ is purely indeterministic, we must have $\gamma_k \to 0$ as $k \to \infty$, which requires that $|z_j| > 1$ for each $j$.
Question
Show by substitution that $\gamma_k = \sum_{j=1}^{p}A_j z_j^{-k}$ is a solution of the given difference equation.
Solution
Substituting $\gamma_k = \sum_{j=1}^{p}A_j z_j^{-k}$ into the right-hand side of the difference equation $\gamma_k = \sum_{j=1}^{p}\alpha_j\gamma_{k-j}$, we get:
$\sum_{j=1}^{p}\alpha_j\gamma_{k-j} = \sum_{j=1}^{p}\alpha_j\sum_{i=1}^{p}A_i z_i^{-(k-j)} = \sum_{i=1}^{p}A_i z_i^{-k}\sum_{j=1}^{p}\alpha_j z_i^{j}$
Since each $z_i$ is a root of the characteristic polynomial, $1 - \alpha_1 z_i - \cdots - \alpha_p z_i^p = 0$, so:
$\sum_{j=1}^{p}\alpha_j z_i^{j} = 1$
So:
$\sum_{j=1}^{p}\alpha_j\gamma_{k-j} = \sum_{i=1}^{p}A_i z_i^{-k} = \gamma_k$
as required.
The converse of Result 13.2 is also true (but the proof is not given here): if the roots of the
characteristic polynomial are all greater than 1 in absolute value, then it is possible to
construct a stationary process X satisfying (13.4). In order for an arbitrary process X
satisfying (13.4) to be stationary, the variances and covariances of the initial values
$X_0, X_1, \dots, X_{p-1}$ must also be equal to the appropriate values.
Although we do not give a formal proof, we will provide another way of thinking about this result.
Recall that in the AR(1) case we said that the process turned out to be stationary if and only if
X t could be written as a (convergent) sum of white noise terms. Equivalently, if we start from
the equation:
$(1 - \alpha B)(X_t - \mu) = e_t$
then the process is stationary if and only if we can invert the term $1 - \alpha B$, since this is the case if and only if $|\alpha| < 1$.
For a general AR(p) process, the defining equation can be factorised as:
$\left(1 - \frac{B}{z_1}\right)\left(1 - \frac{B}{z_2}\right)\cdots\left(1 - \frac{B}{z_p}\right)(X_t - \mu) = e_t$
where $z_1, z_2, \dots, z_p$ are the $p$ (possibly complex) roots of the characteristic polynomial. In other
words, the characteristic polynomial factorises as:
$1 - \alpha_1 z - \alpha_2 z^2 - \cdots - \alpha_p z^p = \left(1 - \frac{z}{z_1}\right)\left(1 - \frac{z}{z_2}\right)\cdots\left(1 - \frac{z}{z_p}\right)$
It follows that in order to write $X_t$ in terms of the process $e$, we need to be able to invert all $p$ of the factors $1 - \dfrac{B}{z_i}$. This will be the case if and only if $|z_i| > 1$ for all $i = 1, 2, \dots, p$.
Question
Write down the characteristic polynomial of the AR(2) process:
$X_t = 5 + 2(X_{t-1} - 5) - 3(X_{t-2} - 5) + e_t$
and calculate its roots. Hence comment on the stationarity of the process.
Solution
We first rearrange the equation for the process so that all the $X$ terms appear on the same side. Doing this we obtain:
$X_t - 2X_{t-1} + 3X_{t-2} = 10 + e_t$
The characteristic polynomial is therefore $1 - 2z + 3z^2$, whose roots are:
$z = \frac{2 \pm \sqrt{4 - 12}}{6} = \frac{1 \pm i\sqrt{2}}{3}$
Each root has magnitude $\sqrt{\tfrac{1}{9} + \tfrac{2}{9}} = \tfrac{1}{\sqrt{3}} < 1$, so the process is not stationary.
There is no requirement to use the letter $z$ (or indeed any particular letter) when writing down the characteristic polynomial. The letter $\lambda$ is often used instead.
Question
Determine whether the process:
$X_n = \tfrac{11}{6}X_{n-1} - X_{n-2} + \tfrac{1}{6}X_{n-3} + e_n$
is stationary and, if not, whether it is $I(1)$.
Solution
The characteristic equation of the process:
$X_n = \tfrac{11}{6}X_{n-1} - X_{n-2} + \tfrac{1}{6}X_{n-3} + e_n$
is:
$1 - \tfrac{11}{6}\lambda + \lambda^2 - \tfrac{1}{6}\lambda^3 = 0$
By trying small integer values, we find that $\lambda = 2$ is a root, so the left-hand side factorises as:
$1 - \tfrac{11}{6}\lambda + \lambda^2 - \tfrac{1}{6}\lambda^3 = (\lambda - 2)(a\lambda^2 + b\lambda + c)$
where a , b and c are constants. The values of a , b and c can be determined in several ways,
eg by comparing the coefficients on both sides of this equation, by long division of polynomials, or
by synthetic division. We find that:
$1 - \tfrac{11}{6}\lambda + \lambda^2 - \tfrac{1}{6}\lambda^3 = (\lambda - 2)\left(-\tfrac{1}{6}\lambda^2 + \tfrac{2}{3}\lambda - \tfrac{1}{2}\right) = -\tfrac{1}{6}(\lambda - 2)(\lambda - 3)(\lambda - 1)$
The process is not stationary, since the characteristic equation has a root that is not strictly
greater than 1 in magnitude.
It is easy to see that differencing the process once will eliminate the root of 1. The two remaining
roots (ie 2 and 3) are both strictly greater than 1 in magnitude, so the differenced process is
stationary. Hence X is I(1) .
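This can be checked numerically in R with polyroot, which takes the coefficients of the characteristic polynomial in increasing powers of $\lambda$:
Mod(polyroot(c(1, -11/6, 1, -1/6)))   # moduli of the roots: 1, 2, 3
The root of modulus 1 confirms that the process itself is not stationary, while the two remaining roots exceed 1 in modulus.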
Often exact values for the $\gamma_k$ are required, entailing finding the values of the constants $A_k$. From (13.4) we have:
$\gamma_k = \sum_{j=1}^{p}\alpha_j\gamma_{k-j} + \sigma^2\,1_{\{k=0\}}$
for $0 \leq k \leq p$. (These are known as the Yule-Walker equations.) Here the notation $1_{\{k=0\}}$ denotes an indicator, equal to 1 when $k = 0$ and 0 otherwise. For example, in the case $p = 3$ the equations are:
$\gamma_3 = \alpha_1\gamma_2 + \alpha_2\gamma_1 + \alpha_3\gamma_0$
$\gamma_2 = \alpha_1\gamma_1 + \alpha_2\gamma_0 + \alpha_3\gamma_1$
$\gamma_1 = \alpha_1\gamma_0 + \alpha_2\gamma_1 + \alpha_3\gamma_2$
$\gamma_0 = \alpha_1\gamma_1 + \alpha_2\gamma_2 + \alpha_3\gamma_3 + \sigma^2$
The second and third of these equations are sufficient to deduce $\gamma_2$ and $\gamma_1$ in terms of $\gamma_0$, which is all that is required to find $\rho_2$ and $\rho_1$. The first and fourth of the equations are needed when the values of the $\gamma_k$ are to be found explicitly.
The PACF, $\{\phi_k : k \geq 1\}$, of the $AR(p)$ process can be calculated from the defining equations, but is not memorable. In particular, the first three equations above can be written in terms of $\rho_1$, $\rho_2$, $\rho_3$, and the resulting solution for $\alpha_3$ as a function of $\rho_1$, $\rho_2$, $\rho_3$ is the expression for $\phi_3$. The same idea applies to all values of $k$, so that $\phi_k$ is the solution for $\alpha_k$ in a system of $k$ linear equations, including the expressions $\phi_1 = \rho_1$ and $\phi_2 = \dfrac{\rho_2 - \rho_1^2}{1 - \rho_1^2}$ that we have seen before.
$\phi_k = 0$ for $k > p$
This property of the PACF is characteristic of autoregressive processes and forms the basis
of the most frequently used test for determining whether an AR ( p ) model fits the data. It
would be difficult to base a test on the ACF as the ACF of an autoregressive process is a
sum of geometrically decreasing components. (See Section 2 in Chapter 14.)
Question
Show that, for the $AR(3)$ process defined by (13.4) with $p = 3$:
$\gamma_0 = \alpha_1\gamma_1 + \alpha_2\gamma_2 + \alpha_3\gamma_3 + \sigma^2$
Solution
$\gamma_0 = \text{cov}(X_t, X_t)$
Expanding the LHS only (which will always be our approach when determining the autocovariance function of an autoregressive process), and remembering that the covariance is unaffected by the mean $\mu$, we see that:
$\gamma_0 = \text{cov}(\alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \alpha_3 X_{t-3} + e_t,\; X_t) = \alpha_1\gamma_1 + \alpha_2\gamma_2 + \alpha_3\gamma_3 + \text{cov}(e_t, X_t)$
But:
$\text{cov}(e_t, X_t) = \text{cov}(e_t,\; \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \alpha_3 X_{t-3} + e_t) = 0 + 0 + 0 + \sigma^2 = \sigma^2$
This is because $X_{t-1}, X_{t-2}, \dots$ are functions of past white noise terms, and $e_t$ is independent of earlier values. So:
$\gamma_0 = \alpha_1\gamma_1 + \alpha_2\gamma_2 + \alpha_3\gamma_3 + \sigma^2$
We will now derive a formula for the ACF of an AR(2) process. To do this we need to remember a
little about difference equations. Some formulae relating to difference equations are given on
page 4 of the Tables.
Question
Calculate the ACF and PACF of the $AR(2)$ process:
$X_t = \tfrac{5}{6}X_{t-1} - \tfrac{1}{6}X_{t-2} + e_t$
Solution
We do not actually need to calculate 0 in order to find the ACF. This is always the case for an
autoregressive process.
By definition:
$\rho_0 = \frac{\gamma_0}{\gamma_0} = 1$
$\gamma_1 = \text{cov}(X_t, X_{t-1}) = \text{cov}\bigl(\tfrac{5}{6}X_{t-1} - \tfrac{1}{6}X_{t-2} + e_t,\; X_{t-1}\bigr) = \tfrac{5}{6}\gamma_0 - \tfrac{1}{6}\gamma_1$
Rearranging gives:
$\tfrac{7}{6}\gamma_1 = \tfrac{5}{6}\gamma_0$
So:
$\gamma_1 = \tfrac{5}{7}\gamma_0 \qquad\text{and}\qquad \rho_1 = \frac{\gamma_1}{\gamma_0} = \tfrac{5}{7}$
$\gamma_2 = \text{cov}(X_t, X_{t-2}) = \text{cov}\bigl(\tfrac{5}{6}X_{t-1} - \tfrac{1}{6}X_{t-2} + e_t,\; X_{t-2}\bigr) = \tfrac{5}{6}\gamma_1 - \tfrac{1}{6}\gamma_0$
So:
$\gamma_2 = \tfrac{5}{6}\cdot\tfrac{5}{7}\gamma_0 - \tfrac{1}{6}\gamma_0 = \tfrac{3}{7}\gamma_0 \qquad\text{and}\qquad \rho_2 = \frac{\gamma_2}{\gamma_0} = \tfrac{3}{7}$
In general, for $k \geq 2$:
$\gamma_k = \tfrac{5}{6}\gamma_{k-1} - \tfrac{1}{6}\gamma_{k-2}, \qquad\text{ie}\qquad \rho_k = \tfrac{5}{6}\rho_{k-1} - \tfrac{1}{6}\rho_{k-2}$
We can solve this second-order difference equation. The characteristic equation (for the difference equation, which is unfortunately slightly different to the characteristic equation for the process) is:
$\lambda^2 - \tfrac{5}{6}\lambda + \tfrac{1}{6} = \bigl(\lambda - \tfrac{1}{2}\bigr)\bigl(\lambda - \tfrac{1}{3}\bigr) = 0$
Using the formula on page 4 of the Tables, the general solution of this difference equation is of
the form:
$\rho_k = A\bigl(\tfrac{1}{2}\bigr)^k + B\bigl(\tfrac{1}{3}\bigr)^k$
In order to find the solution we want, we need to use two boundary conditions to determine the
two constants. We know that $\rho_0 = 1$ and $\rho_1 = \tfrac{5}{7}$. So:
$\rho_0 = A + B = 1$
$\rho_1 = \tfrac{1}{2}A + \tfrac{1}{3}B = \tfrac{5}{7}$
Solving these simultaneously gives $A = \tfrac{16}{7}$ and $B = -\tfrac{9}{7}$. Hence:
$\rho_k = \tfrac{16}{7}\bigl(\tfrac{1}{2}\bigr)^k - \tfrac{9}{7}\bigl(\tfrac{1}{3}\bigr)^k$
We are also asked for the partial autocorrelation function. Using the formulae on page 40 of the
Tables:
$\phi_1 = \rho_1 = \tfrac{5}{7}$
$\phi_2 = \frac{\rho_2 - \rho_1^2}{1 - \rho_1^2} = \frac{\tfrac{3}{7} - \tfrac{25}{49}}{1 - \tfrac{25}{49}} = -\tfrac{1}{6}$
$\phi_k = 0$ for $k = 3, 4, 5, \dots$
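These values can be verified numerically with ARMAacf:
round(ARMAacf(ar=c(5/6, -1/6), lag.max=4), 4)              # 1, 0.7143, 0.4286, ...
round(ARMAacf(ar=c(5/6, -1/6), lag.max=4, pacf=TRUE), 4)   # 0.7143, -0.1667, 0, 0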
The simplest moving average process is $MA(1)$, defined by:
$X_t = \mu + e_t + \beta e_{t-1}$
Its autocovariance function is:
$\gamma_0 = \text{var}(e_t + \beta e_{t-1}) = (1 + \beta^2)\sigma^2$
$\gamma_1 = \text{cov}(e_t + \beta e_{t-1},\; e_{t-1} + \beta e_{t-2}) = \beta\sigma^2$
$\gamma_k = 0 \text{ for } k > 1$
and so the ACF is:
$\rho_0 = 1$
$\rho_1 = \frac{\beta}{1 + \beta^2}$
$\rho_k = 0 \text{ for } k > 1$
Question
Show that the moving average process $X_n = Z_n + \beta Z_{n-1}$ is weakly stationary, where $Z_n$ is a white noise process with mean $\mu$ and variance $\sigma^2$.
Solution
The mean, $E(X_n) = (1 + \beta)\mu$, does not depend on $n$. The variance is:
$\text{cov}(X_n, X_n) = \text{cov}(Z_n + \beta Z_{n-1},\; Z_n + \beta Z_{n-1}) = (1 + \beta^2)\sigma^2$
or alternatively:
$\text{var}(X_n) = \text{var}(Z_n + \beta Z_{n-1}) = (1 + \beta^2)\sigma^2$
and the covariance at lag 1 is:
$\text{cov}(X_n, X_{n-1}) = \text{cov}(Z_n + \beta Z_{n-1},\; Z_{n-1} + \beta Z_{n-2}) = \beta\sigma^2$
The covariance at all higher lags is 0, since there is then no overlap between the $Z$'s. The covariances at the corresponding negative lags are the same.
Since none of these expressions depends on n, it follows that the process is weakly stationary.
An MA(1) process is stationary regardless of the values of its parameters. The parameters
are nevertheless usually constrained by imposing the condition of invertibility. This may be
explained as follows.
It is possible to have two distinct MA(1) models with identical ACFs: consider, for example, $\beta = 0.5$ and $\beta = 2$, both of which have $\rho_1 = \dfrac{\beta}{1 + \beta^2} = 0.4$.
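This can be confirmed directly in R:
ARMAacf(ma=0.5, lag.max=2)   # lag-1 autocorrelation 0.4
ARMAacf(ma=2,   lag.max=2)   # lag-1 autocorrelation also 0.4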
The defining equation of the MA(1) may be written in terms of the backwards shift operator:
$X - \mu = (1 + \beta B)e \qquad (13.6)$
Inverting the operator $1 + \beta B$ gives:
$(1 + \beta B)^{-1}(X - \mu) = e$
ie:
$(X_t - \mu) - \beta(X_{t-1} - \mu) + \beta^2(X_{t-2} - \mu) - \beta^3(X_{t-3} - \mu) + \cdots = e_t$
The original moving average model has therefore been transformed into an autoregression
of infinite order. But this procedure is only valid if the sum on the left-hand side is
convergent, in other words if $|\beta| < 1$. When this condition is satisfied the $MA(1)$ is called
invertible. Although more than one MA process may share a given ACF, at most one of the
processes will be invertible.
We might want to know the historic values of the white noise process. Although the values
$x_0, x_1, \dots, x_t$ can be observed, and are therefore known at time $t$, the values of the white noise process $e_0, e_1, \dots, e_t$ are not. Can we obtain the unknown $e$ values from the known $x$ values?
The answer is yes, in theory, if and only if the process is invertible, since we can then write the
value et in terms of the x’s, as above. In practice, we wouldn’t actually have an infinite history of
x values, but since the coefficients of the x’s get smaller as we go back in time, for an invertible
process, the contribution of the values before time 1, say, will be negligible. We can make this
more precise, as in the following question.
Question
Show that, for the $MA(1)$ process above (taking $\mu = 0$):
$e_n = (-\beta)^n e_0 + \sum_{i=0}^{n-1}(-\beta)^i x_{n-i}$
Solution
$e_n = x_n - \beta e_{n-1}$
$= x_n - \beta(x_{n-1} - \beta e_{n-2}) = x_n - \beta x_{n-1} + \beta^2 e_{n-2}$
$= \cdots$
$= x_n - \beta x_{n-1} + \beta^2 x_{n-2} - \cdots + (-\beta)^{n-1}x_1 + (-\beta)^n e_0 = \sum_{i=0}^{n-1}(-\beta)^i x_{n-i} + (-\beta)^n e_0$
Notice that, as $n$ gets large, the dependence of $e_n$ on $e_0$ will be small provided $|\beta| < 1$.
The condition for a MA(1) process to be invertible is similar to the condition that an AR(1)
process is stationary. An AR(1) process is stationary if and only if the process X can be written
explicitly in terms of the process e. The invertibility condition ensures that the white noise
process e can be written in terms of the X process. This relationship generalises to AR(p) and
MA(q) processes, as we will see shortly.
It is possible, at the cost of considerable effort, to calculate the PACF of the $MA(1)$, giving:
$\phi_k = (-1)^{k+1}\,\frac{\beta^k(1 - \beta^2)}{1 - \beta^{2(k+1)}}$
This decays approximately geometrically as $k$ increases, highlighting the way in which the ACF
and PACF are complementary: the PACF of a MA(1) behaves like the ACF of an AR (1) , and
the PACF of an AR (1) behaves like the ACF of a MA(1) .
The general moving average process of order $q$, $MA(q)$, is defined by:
$X - \mu = (1 + \beta_1 B + \beta_2 B^2 + \cdots + \beta_q B^q)e$
ie:
$X_t = \mu + e_t + \beta_1 e_{t-1} + \beta_2 e_{t-2} + \cdots + \beta_q e_{t-q}$
Moving average processes are always stationary, as they are a linear combination of white noise,
which is itself stationary.
Recall that for a stationary AR(p) process, X t can be expressed as an (infinite and convergent)
sum of white noise terms. This means that any stationary autoregressive process can be
considered to be a moving average of infinite order. However, by a moving average process we
will usually mean one of finite order.
The autocovariance function of the $MA(q)$ process is:
$\gamma_k = \sum_{i=0}^{q}\sum_{j=0}^{q}\beta_i\beta_j\,E(e_{t-i}\,e_{t-j-k}) = \sigma^2\sum_{i=0}^{q-k}\beta_i\beta_{i+k}$
for $0 \leq k \leq q$, where $\beta_0 = 1$.
Note that $\text{cov}(e_i, e_j) = E(e_ie_j) - E(e_i)E(e_j) = E(e_ie_j) = 0$ for $i \neq j$ (since the random variables $e_t$ have 0 mean and are uncorrelated), and $\text{cov}(e_i, e_i) = \text{var}(e_i) = E(e_i^2) = \sigma^2$. So, in the double sum above, the only non-zero terms will be where the subscripts of $e_{t-i}$ and $e_{t-j-k}$ match, ie when $i = j + k$. This means that we can simplify the double sum by writing everything in terms of $j$. We need to get the limits right for $j$, which cannot go above $q - k$ because $i = j + k$ and $i$ only goes up to $q$. So we get:
$\gamma_k = \sum_{j=0}^{q-k}\beta_{j+k}\beta_j\,\sigma^2$
This matches the formula above, except that i has been used in place of j .
The importance of this observation will become apparent in Section 2 of Chapter 14. We will look
at an explicit case of this result to make things clearer.
Question
Calculate the autocovariance function of the process:
$X_n = 3 + e_n + e_{n-1} + 0.25e_{n-2}$
where $e_n$ is a zero-mean white noise process with variance $\sigma^2$.
Solution
We have:
$\gamma_0 = \text{cov}(X_n, X_n) = \text{var}(X_n) = \text{var}(e_n + e_{n-1} + 0.25e_{n-2}) = (1 + 1 + 0.0625)\sigma^2 = 2.0625\sigma^2$
$\gamma_1 = \text{cov}(X_n, X_{n-1}) = \text{cov}(e_n + e_{n-1} + 0.25e_{n-2},\; e_{n-1} + e_{n-2} + 0.25e_{n-3}) = (1 + 0.25)\sigma^2 = 1.25\sigma^2$
$\gamma_2 = \text{cov}(X_n, X_{n-2}) = \text{cov}(0.25e_{n-2},\, e_{n-2}) = 0.25\sigma^2$
$\gamma_3 = \text{cov}(X_n, X_{n-3}) = 0$
The covariance at higher lags is also 0 since there is no overlap between the e’s.
In the solution above, we have expanded the terms on both sides of the covariance expression.
This will always be our strategy when calculating the autocovariance function for a moving
average series. For all other types of series, we just expand the term on the LHS of the covariance
expression.
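As a check, for the MA(2) coefficients used in the question above (1 and 0.25), ARMAacf returns the corresponding autocorrelations $\gamma_k/\gamma_0$:
ARMAacf(ma=c(1, 0.25), lag.max=3)   # 1, then 1.25/2.0625 = 0.6061, 0.25/2.0625 = 0.1212, 0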
The equation:
$1 + \beta_1 z + \beta_2 z^2 + \cdots + \beta_q z^q = 0$
can be used to determine invertibility. This can be thought of as the characteristic equation of the white noise terms. The $MA(q)$ process:
$X_t = \mu + e_t + \beta_1 e_{t-1} + \beta_2 e_{t-2} + \cdots + \beta_q e_{t-q}$
is invertible if and only if all the roots of this equation are strictly greater than 1 in magnitude.
This is equivalent to saying that the value $e_t$ can be written explicitly as a (convergent) sum of $X$
values.
It follows that in the same way as a stationary autoregression can be thought of as a moving
average of infinite order, so a moving average can be thought of as an autoregression of infinite
order.
Question
Solution
Although there may be many moving average processes with the same ACF, at most one of
them is invertible, since no two invertible processes have the same autocorrelation
function. Moving average models fitted to data by statistical packages will always be
invertible.
The autoregressive moving average process $ARMA(p,q)$ with mean $\mu$ is defined by:
$X_t - \mu = \alpha_1(X_{t-1} - \mu) + \cdots + \alpha_p(X_{t-p} - \mu) + e_t + \beta_1 e_{t-1} + \cdots + \beta_q e_{t-q}$
In terms of the backwards shift operator:
$(1 - \alpha_1 B - \cdots - \alpha_p B^p)(X - \mu) = (1 + \beta_1 B + \cdots + \beta_q B^q)e$
which we can write as:
$\phi(B)(X_n - \mu) = \theta(B)e_n$
where $\phi$ and $\theta$ denote the autoregressive and moving average polynomials respectively.
If $\phi(B)$ and $\theta(B)$ have any factors (ie roots) in common, then the defining relation could be
simplified. For example, we may have a stationary ARMA(1,1) process defined by $(1 - \alpha B)X_n = (1 - \alpha B)e_n$ with $|\alpha| < 1$. Since the factors $1 - \alpha B$ are invertible, we can multiply both sides of the equation by $(1 - \alpha B)^{-1}$ to see that $X_n = e_n$. So this process would actually be classified as ARMA(0,0).
In general, it would be assumed that the polynomials $\phi(B)$ and $\theta(B)$ have no common roots.
Autoregressive and moving average processes are special cases of ARMA processes. AR(p) is the
same as ARMA(p,0) . MA(q) is the same as ARMA(0, q) .
To check the stationarity of an ARMA process, we just need to examine the autoregressive part.
The moving average part (which involves the white noise terms) is always stationary. The test is
the same as for an autoregressive process – we need to determine the roots of the characteristic
equation formed by the X terms. The process is stationary if and only if all the roots are strictly
greater than 1 in magnitude.
Similarly, we can check for invertibility by examining the roots of the characteristic equation that
is obtained from the white noise terms. The process is invertible if and only if all the roots are
strictly greater than 1 in magnitude.
Neither the ACF nor the PACF of the ARMA process eventually becomes equal to zero. This
makes it more difficult to identify an ARMA model than either a pure autoregression or a
pure moving average.
Theoretically, both the ACF and PACF of a stationary ARMA process will tend towards 0 for large
lags, but neither will have a cut off property.
It is possible to calculate the ACF by a method similar to the method employing the
Yule-Walker equations for the ACF of an autoregression.
We will show that the autocorrelation function of the stationary zero-mean ARMA(1,1)
process:
$X_t = \alpha X_{t-1} + e_t + \beta e_{t-1} \qquad (13.7)$
is given by:
$\rho_1 = \frac{(1 + \alpha\beta)(\alpha + \beta)}{1 + 2\alpha\beta + \beta^2}$
$\rho_k = \alpha\rho_{k-1}, \qquad k = 2, 3, \dots$
These results can be obtained from the formula for $\rho_k$ given on page 40 of the Tables. Figure 13.1 in Section 2.5 shows the ACF and PACF values of such a process with $\alpha = 0.7$ and $\beta = 0.5$.
Taking the covariance of each side of:
$X_t = \alpha X_{t-1} + e_t + \beta e_{t-1}$
with $e_t$, we have:
$\text{cov}(X_t, e_t) = \sigma^2$
Similarly:
$\text{cov}(X_t, e_{t-1}) = \alpha\,\text{cov}(X_{t-1}, e_{t-1}) + \text{cov}(e_t, e_{t-1}) + \beta\,\text{cov}(e_{t-1}, e_{t-1}) = (\alpha + \beta)\sigma^2$
Taking the covariance of each side with $X_t$ and with $X_{t-1}$ gives:
$\text{cov}(X_t, X_t) = \alpha\,\text{cov}(X_{t-1}, X_t) + \text{cov}(e_t, X_t) + \beta\,\text{cov}(e_{t-1}, X_t)$
$\text{cov}(X_t, X_{t-1}) = \alpha\,\text{cov}(X_{t-1}, X_{t-1}) + \text{cov}(e_t, X_{t-1}) + \beta\,\text{cov}(e_{t-1}, X_{t-1})$
and, for $k > 1$:
$\text{cov}(X_t, X_{t-k}) = \alpha\,\text{cov}(X_{t-1}, X_{t-k})$
So:
$\gamma_0 = \alpha\gamma_1 + (1 + \alpha\beta + \beta^2)\sigma^2$
$\gamma_1 = \alpha\gamma_0 + \beta\sigma^2$
$\gamma_k = \alpha\gamma_{k-1}, \qquad k = 2, 3, \dots$
Solving the first two of these equations simultaneously gives:
$\gamma_0 = \frac{1 + 2\alpha\beta + \beta^2}{1 - \alpha^2}\,\sigma^2$
$\gamma_1 = \frac{(\alpha + \beta)(1 + \alpha\beta)}{1 - \alpha^2}\,\sigma^2$
and:
$\gamma_k = \alpha\gamma_{k-1} = \alpha^{k-1}\gamma_1, \qquad k = 2, 3, \dots$
Hence:
$\rho_1 = \frac{\gamma_1}{\gamma_0} = \frac{(1 + \alpha\beta)(\alpha + \beta)}{1 + 2\alpha\beta + \beta^2}$
and:
$\rho_k = \frac{\gamma_k}{\gamma_0} = \frac{\alpha\gamma_{k-1}}{\gamma_0} = \alpha\rho_{k-1}, \qquad k = 2, 3, \dots$
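For example, with $\alpha = 0.7$ and $\beta = 0.5$ (the values used in Figure 13.1), ARMAacf reproduces these autocorrelations:
ARMAacf(ar=0.7, ma=0.5, lag.max=4)
# rho_1 = (1 + 0.35)(1.2)/(1 + 0.7 + 0.25) = 0.8308; each later value is 0.7 times the previous one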
Question
Show that the process $12X_t = 10X_{t-1} - 2X_{t-2} + 12e_t + 11e_{t-1} + 2e_{t-2}$ is both stationary and invertible.
Solution
We start by rewriting the process so that all the $X$'s are on one side and all the $e$'s are on the other:
$12X_t - 10X_{t-1} + 2X_{t-2} = 12e_t + 11e_{t-1} + 2e_{t-2}$
The characteristic equation of the autoregressive part is:
$12 - 10\lambda + 2\lambda^2 = 2(\lambda - 2)(\lambda - 3) = 0$
with roots 2 and 3, both strictly greater than 1 in magnitude, so the process is stationary. The characteristic equation of the moving average part is:
$12 + 11\lambda + 2\lambda^2 = (2\lambda + 3)(\lambda + 4) = 0$
with roots $-1.5$ and $-4$, both strictly greater than 1 in magnitude, so the process is also invertible.
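Assuming the coefficients as written above, the two sets of roots can be checked with polyroot:
Mod(polyroot(c(12, -10, 2)))   # AR side: moduli 2 and 3
Mod(polyroot(c(12, 11, 2)))    # MA side: moduli 1.5 and 4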
Suppose we are given some time series data xn , where n varies over some finite range. If we
want to model the data, then we would expect to take sample statistics, in particular the sample
autocorrelation function, sample partial autocorrelation function and sample mean; these will be
discussed in more detail in Chapter 14. The modelling process would then involve finding a
stochastic process with similar characteristics. In the Box-Jenkins approach, the model is picked
from the ARMA(p, q) class. However, the theoretical counterparts of the autocorrelation and
partial autocorrelation functions are only defined for stationary series. The upshot of this is that
we can only directly apply these methods if the original data values are stationary.
However, we can get around this problem as follows. First transform the data into a stationary
form, which we will discuss in a moment. We can then model this stationary series as suggested
above. Finally, we carry out the inverse transform on our model to obtain a model for the original
series. The question remains as to what we mean by ‘transform’ and ‘inverse transform’.
The backward difference operator can turn a non-stationary series into a stationary one. For
example, a random walk:
$X_t = X_{t-1} + e_t$
is non-stationary (its characteristic equation is $1 - \lambda = 0$, which has a root of 1). However, the difference:
$\nabla X_t = X_t - X_{t-1} = e_t$
is white noise, which is stationary.
For the moment we will assume that it’s possible to turn the data set into a stationary series by
repeated differencing. We may have to difference the series several times, the specific number
usually being denoted by d.
Now assuming we’ve transformed our data into stationary form by differencing, we can model
this series using a stationary ARMA(p, q) process. The final step is to reverse the differencing
procedure to obtain a model of the original series. The inverse process of differencing is
integration since we must sum the differences to obtain the original series.
Question
The values $x_0, x_1, \dots, x_n$ of a time series are differenced to give $w_i = \nabla x_i = x_i - x_{i-1}$. State the range of values of $i$ and give an expression for $x_j$ in terms of $x_0$ and the $w$'s.
Solution
The differences are defined for $i = 1, 2, \dots, n$. Summing them:
$x_j = x_0 + \sum_{i=1}^{j} w_i$
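In R this summation (integration) step is carried out by diffinv, the inverse of diff; a quick sketch with simulated data:
set.seed(1)
x <- cumsum(rnorm(50))              # an arbitrary series
w <- diff(x)
all.equal(diffinv(w, xi=x[1]), x)   # TRUE: xi supplies the starting value x_0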
In many applications the process being modelled cannot be assumed stationary, but can
reasonably be fitted by a model with stationary increments, that is, if the first difference of
$X$, $Y = \nabla X$, is itself a stationary process.
Example 1
The simplest example of an ARIMA process is the random walk, $X_t = X_{t-1} + e_t$, which can be written $X_t = X_0 + \sum_{j=1}^{t} e_j$. The expectation of $X_t$ is equal to $E(X_0)$ but the variance is $\text{var}(X_0) + t\sigma^2$, which increases without limit as $t$ increases.
Here we are assuming (as we usually do) that the white noise process has zero mean.
The first difference of the random walk, however, is stationary, being just white noise:
$Y_t = \nabla X_t = e_t$
Example 2
Let Zt denote the closing price of a share on day t . The evolution of Z is frequently
described by the model:
$Z_t = Z_{t-1}\exp(\mu + e_t)$
By taking logarithms we see that this model is equivalent to an $I(1)$ model, since $Y_t = \ln Z_t$ satisfies the equation:
$Y_t = Y_{t-1} + \mu + e_t$
which is the defining equation of a random walk with drift, because $Y_t = Y_0 + \mu t + \sum_{j=1}^{t} e_j$. The model is based on the assumption that the daily returns $\ln(Z_t / Z_{t-1})$ are independent of the past prices $Z_0, Z_1, \dots, Z_{t-1}$.
Example 3
The logarithm of the consumer price index can be described by the ARIMA(1,1,0) model:
$(1 - B)\ln Q_t - \mu = \alpha\bigl[(1 - B)\ln Q_{t-1} - \mu\bigr] + e_t$
When analysing the behaviour of an ARIMA( p,1, q ) model, the standard technique is to look
at the first difference of the process and to perform the kind of analysis which is suitable for
an ARMA model. Once complete, this can be used to provide predictions for the original,
undifferenced, process.
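In R this whole procedure is handled by the arima function. As an illustrative sketch only (lnQ is a hypothetical vector holding the log consumer price index):
fit <- arima(lnQ, order=c(1, 1, 0))   # difference once, then fit an AR(1) to the differences
predict(fit, n.ahead=12)              # forecasts are given for the original, undifferenced series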
ARIMA(p,d,q) processes
In certain cases it may be considered desirable to continue beyond the first difference, if the
process X is still not stationary after being differenced once. The notation extends in a
natural way.
In terms of the backwards shift operator, the equation of the $ARIMA(p,d,q)$ process is:
$\phi(B)(1 - B)^d X_t = \theta(B)e_t$
so that the $d$th difference $\nabla^d X$ is a stationary $ARMA(p,q)$ process – hence the name AR(p)-I(d)-MA(q).
To identify $d$ for a given process, we check whether the polynomial on the left-hand side is divisible by $1 - B$; if so, we factorise it out. We continue to do this until the remaining polynomial is not divisible by $1 - B$.
In the example considered here, we should also check that the roots of the characteristic polynomial of $\nabla X_t$, ie $1 + 0.4\lambda + 0.1\lambda^2$, are both strictly greater than 1 in magnitude. In fact, the roots are $-2 \pm i\sqrt{6}$. The magnitude of $-2 + i\sqrt{6}$ is:
$\sqrt{(-2)^2 + (\sqrt{6})^2} = \sqrt{10}$
The magnitude of $-2 - i\sqrt{6}$ is also $\sqrt{10}$. So both roots are strictly greater than 1 in magnitude.
Differencing once removes the factor of $(1 - B)$. Hence, the differenced process $\nabla X_t$ is stationary, as required.
Alternatively, we could work with the characteristic polynomial of $X_t$ itself, which in this example is a cubic. There is no simple formula for solving cubic equations, so we should start by checking whether 1 is a root. Setting $\lambda = 1$, we see that the left-hand side is:
$1 - 0.6 - 0.3 - 0.1 = 0$
so $\lambda = 1$ is a root, and the cubic factorises in the form:
$(\lambda - 1)(a\lambda^2 + b\lambda + c)$
The equation:
$1 - 0.6\lambda - 0.3\lambda^2 - 0.1\lambda^3 = 0$
can therefore be written as:
$(1 - \lambda)(1 + 0.4\lambda + 0.1\lambda^2) = 0$
ie $\lambda = 1$ or:
$0.1\lambda^2 + 0.4\lambda + 1 = 0$
As we have already seen, the roots of this equation are $-2 \pm i\sqrt{6}$, and these both have a magnitude of $\sqrt{10}$. So $\nabla X$ is a stationary ARMA(2,1) process, and hence $X$ is ARIMA(2,1,1).
ARIMA models play a central role in the Box-Jenkins methodology, which aims to provide a
consistent and unified framework for analysis and prediction using time series models.
(See Section 3.1 of Chapter 14.)