
3 Main linear models of time series

3.1 Introduction
The main linear models used for modelling stationary time series are:

• Autoregressive process (AR)

• Moving average process (MA)

• Autoregressive moving average process (ARMA).

The definitions of each of these processes, presented below, involve the standard zero-mean white noise process $\{e_t : t = 1, 2, \ldots\}$ defined in Section 2.3.

In practice we often wish to model processes which are not $I(0)$ (stationary) but $I(1)$. For this purpose a further model is considered:

• Autoregressive integrated moving average (ARIMA).

ARMA and ARIMA are pronounced as single words (similar to ‘armour’ and ‘areema’).

Autoregressive
An autoregressive process of order $p$ (the notation $AR(p)$ is commonly used) is a sequence of random variables $\{X_t\}$ defined consecutively by the rule:

$$X_t = \mu + \alpha_1 (X_{t-1} - \mu) + \alpha_2 (X_{t-2} - \mu) + \cdots + \alpha_p (X_{t-p} - \mu) + e_t$$

Thus the autoregressive model attempts to explain the current value of X as a linear
combination of past values with some additional externally generated random variation.
The similarity to the procedure of linear regression is clear, and explains the origin of the
name ‘autoregression’.

Moving average
A moving average process of order $q$, denoted $MA(q)$, is a sequence $\{X_t\}$ defined by the rule:

$$X_t = \mu + e_t + \beta_1 e_{t-1} + \cdots + \beta_q e_{t-q}$$

The moving average model explains the relationship between the $X_t$ as an indirect effect, arising from the fact that the current value of the process results from the recently passed random error terms as well as the current one. In this sense, $X_t$ is 'smoothed noise'.

Autoregressive moving average


The two basic processes (AR and MA) can be combined to give an autoregressive moving average, or ARMA, process. The defining equation of an $ARMA(p,q)$ process is:

$$X_t = \mu + \alpha_1 (X_{t-1} - \mu) + \cdots + \alpha_p (X_{t-p} - \mu) + e_t + \beta_1 e_{t-1} + \cdots + \beta_q e_{t-q}$$


Note: $ARMA(p,0)$ is $AR(p)$; $ARMA(0,q)$ is $MA(q)$.

Autoregressive integrated moving average


The definition of an $ARIMA(p,d,q)$ process is given in Section 3.8.

3.2 The backwards shift operator, B, and the difference operator, ∇


Further discussion of the various models will be helped by the use of two operators which operate on the whole time series process $X$.

The backwards shift operator, $B$, acts on the process $X$ to give a process $BX$ such that:

$$(BX)_t = X_{t-1}$$

If we apply the backwards shift operator to a constant, then it doesn't change it:

$$B\mu = \mu$$

The difference operator, $\nabla$, is defined as $\nabla = 1 - B$, or in other words:

$$(\nabla X)_t = X_t - X_{t-1}$$

Both operators can be applied repeatedly. For example:

$$(B^2 X)_t = (B(BX))_t = (BX)_{t-1} = X_{t-2}$$

$$(\nabla^2 X)_t = (\nabla X)_t - (\nabla X)_{t-1} = X_t - 2X_{t-1} + X_{t-2}$$

and can be combined as, for example:

$$(B \nabla X)_t = (B(1-B)X)_t = (BX)_t - (B^2 X)_t = X_{t-1} - X_{t-2}$$

The usefulness of both of these operators will become apparent in later sections.

We could also work out $\nabla^2 X_t$ as follows:

$$\nabla^2 X_t = (1-B)^2 X_t = (1 - 2B + B^2) X_t = X_t - 2X_{t-1} + X_{t-2}$$

Similarly:

$$\nabla^3 X_t = (1-B)^3 X_t = (1 - 3B + 3B^2 - B^3) X_t = X_t - 3X_{t-1} + 3X_{t-2} - X_{t-3}$$

In addition, we can use the difference operator to write $X_t - 5X_{t-1} + 7X_{t-2} - 3X_{t-3}$ as follows:

$$\begin{aligned}
X_t - 5X_{t-1} + 7X_{t-2} - 3X_{t-3} &= (X_t - X_{t-1}) - 4(X_{t-1} - X_{t-2}) + 3(X_{t-2} - X_{t-3}) \\
&= \nabla X_t - 4 \nabla X_{t-1} + 3 \nabla X_{t-2} \\
&= (\nabla X_t - \nabla X_{t-1}) - 3(\nabla X_{t-1} - \nabla X_{t-2}) \\
&= \nabla^2 X_t - 3 \nabla^2 X_{t-1}
\end{aligned}$$


Question

Suppose that $w_n = \nabla x_n$. Give a formula for $x_n$ in terms of $x_0$ and the differences:

$$w_n = x_n - x_{n-1}, \quad w_{n-1} = x_{n-1} - x_{n-2}, \quad \ldots, \quad w_1 = x_1 - x_0$$

Solution

We have:

$$\begin{aligned}
x_n &= w_n + x_{n-1} \\
&= w_n + w_{n-1} + x_{n-2} \\
&\;\;\vdots \\
&= w_n + w_{n-1} + \cdots + w_1 + x_0 \\
&= x_0 + \sum_{i=1}^{n} w_i
\end{aligned}$$

The R commands for generating the differenced values of some time series x are:

diff(x,lag=1,differences=1)

for the ordinary difference $\nabla$,

diff(x,lag=1,differences=3)

for differencing three times, $\nabla^3$, and:

diff(x,lag=12,differences=1)

for a simple seasonal difference with period 12, $\nabla_{12}$ (see Section 1.4 in Chapter 14).
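As a quick sanity check (a sketch, using an arbitrary illustrative data vector), the output of diff can be reproduced directly from the definition of $\nabla$:

x <- c(5, 8, 6, 9, 12, 10)          # arbitrary illustrative data

diff(x, lag = 1, differences = 1)   # same as x[-1] - x[-length(x)]

all.equal(diff(x, differences = 3), diff(diff(diff(x))))   # differencing three times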

3.3 The first-order autoregressive model, AR(1)


The simplest autoregressive process is the $AR(1)$, given by:

$$X_t = \mu + \alpha (X_{t-1} - \mu) + e_t \qquad (13.1)$$

A process satisfying this recursive definition can be represented as:

$$X_t = \mu + \alpha^t (X_0 - \mu) + \sum_{j=0}^{t-1} \alpha^j e_{t-j} \qquad (13.2)$$

This representation can be obtained by substituting in for $X_{t-1}$, then for $X_{t-2}$, and so on, until we reach $X_0$. It is important to realise that $X_0$ itself will in general be a random variable – it is not necessarily a given constant, although it might be.


It follows that the mean function $\mu_t$ is given by:

$$\mu_t = \mu + \alpha^t (\mu_0 - \mu)$$

Here the notation $\mu_t$ is being used in place of $E(X_t)$. This result follows by taking expectations of both sides of Equation (13.2), and noting that the white noise terms have zero mean.

The white noise terms are also uncorrelated with each other, and with $X_0$. It follows that the variance of $X_t$ can be found by summing the variances of the terms on the right-hand side.

The same representation (13.2) gives the variance:

$$\mathrm{var}(X_t) = \sigma^2 \, \frac{1 - \alpha^{2t}}{1 - \alpha^2} + \alpha^{2t} \, \mathrm{var}(X_0)$$

where, as before, $\sigma^2$ denotes the common variance of the white noise terms $\{e_t\}$.

Question

Derive this expression.

Solution

From Equation (13.2), we have:

$$\begin{aligned}
\mathrm{var}(X_t) &= \mathrm{var}\left(\mu + \alpha^t (X_0 - \mu) + \sum_{j=0}^{t-1} \alpha^j e_{t-j}\right) \\
&= \alpha^{2t} \, \mathrm{var}(X_0 - \mu) + \sum_{j=0}^{t-1} \alpha^{2j} \, \mathrm{var}(e_{t-j}) \\
&= \alpha^{2t} \, \mathrm{var}(X_0) + \sigma^2 \sum_{j=0}^{t-1} \alpha^{2j}
\end{aligned}$$

Now using the formula $a + ar + ar^2 + \cdots + ar^{n-1} = \frac{a(1-r^n)}{1-r}$ for summing the first $n$ terms of a geometric progression, we see that:

$$\mathrm{var}(X_t) = \alpha^{2t} \, \mathrm{var}(X_0) + \sigma^2 \left(\frac{1 - \alpha^{2t}}{1 - \alpha^2}\right)$$

For the process X to be stationary, its mean and variance must both be constant. A quick look at
the expressions above is enough to see that this will not be the case in general. It is therefore
natural to ask for conditions under which the process is stationary.


From this it follows that a stationary process $X$ satisfying (13.1) can exist only if $|\alpha| < 1$. Further requirements are that $\mu_0 = \mu$ and that $\mathrm{var}(X_0) = \sigma^2 / (1 - \alpha^2)$.

It should be clear that we require $\mu_0 = \mu$ in order to remove the $t$-dependence from the mean. Similarly, we require $\mathrm{var}(X_0) = \frac{\sigma^2}{1 - \alpha^2}$ in order to make the variance constant. (We are assuming $\alpha \ne 0$, otherwise $X$ is a white noise process, which is certainly stationary.)

We also require $|\alpha| < 1$. One way of seeing this is to note that the variance has to be a finite non-negative number.

Notice that this implies that $X$ can be stationary only if $X_0$ is random. If $X_0$ is a known constant, then $\mathrm{var}(X_0) = 0$ and $\mathrm{var}(X_t)$ is no longer independent of $t$, whereas if $X_0$ has expectation different from $\mu$ then the process $X$ will have non-constant expectation.

2
We now consider the situation in which t   , and/or var  X0   . From what we’ve just
1 2
said, the process will then be non-stationary. However, what we are about to see is that even if
the process is non-stationary, as long as   1 , the process will become stationary in the long
run, without any extra conditions.

It is easy to see that the difference $\mu_t - \mu$ is a multiple of $\alpha^t$ and that $\mathrm{var}(X_t) - \sigma^2 / (1 - \alpha^2)$ is a multiple of $\alpha^{2t}$.

This follows by writing the equations we derived above for the mean and variance in the form:

$$\mu_t - \mu = \alpha^t (\mu_0 - \mu) \qquad \text{and} \qquad \mathrm{var}(X_t) - \frac{\sigma^2}{1 - \alpha^2} = \alpha^{2t} \left(\mathrm{var}(X_0) - \frac{\sigma^2}{1 - \alpha^2}\right)$$

Both of these terms will decay away to zero for large $t$ if $|\alpha| < 1$, implying that $X$ will be virtually stationary for large $t$.

We can also turn this result on its head: if we assume that the process has already been running for a very long time, then the process will be stationary. In other words, any $AR(1)$ process with an infinite history and $|\alpha| < 1$ will be stationary.

In this context it is often helpful to assume that $X_1, \ldots, X_n$ is merely a sub-sequence of a process $\{\ldots, X_{-1}, X_0, X_1, \ldots, X_n\}$ which has been going on unobserved for a long time and has already reached a 'steady state' by the time of the first observation. A double-sided infinite process satisfying (13.1) can be represented as:

$$X_t = \mu + \sum_{j=0}^{\infty} \alpha^j e_{t-j} \qquad (13.3)$$


This is an explicit representation of the value of $X_t$ in terms of the historic values of the process $e$. The infinite sum on the right-hand side only makes sense, ie converges, if $|\alpha| < 1$. The equation can be derived in two ways – either using an iterative procedure as used above to derive Equation (13.2), or by using the backward shift operator.

For the latter method, we write the defining equation $X_t = \mu + \alpha (X_{t-1} - \mu) + e_t$ in the form:

$$(1 - \alpha B)(X_t - \mu) = e_t$$

The expression $(1 - \alpha B)$ will be invertible (using an expansion) if and only if $|\alpha| < 1$. (In fact we could expand it for $|\alpha| > 1$ using $(1 - \alpha B)^{-1} = -\alpha^{-1} B^{-1} \left(1 - \alpha^{-1} B^{-1}\right)^{-1}$. However, this would give an expansion in terms of future values, since $B^{-1}$ is effectively a forward shift. This would not therefore be of much use. We will not point out this qualification in future.)

If 1   B  is invertible, then we can write:

Xt     1   B 
1

et  1   B   2B2   et
 et   et 1   2et 2  

In other words:

$$X_t = \mu + \sum_{j=0}^{\infty} \alpha^j e_{t-j}$$

This representation makes it clear that $X_t$ has expectation $\mu$ and variance equal to:

$$\sum_{j=0}^{\infty} \alpha^{2j} \sigma^2 = \frac{\sigma^2}{1 - \alpha^2}$$

if $|\alpha| < 1$.

This last step uses the formula for the sum of an infinite geometric progression.

So, in this case, the process does satisfy the conditions required for stationarity given above.

We have only looked at the mean and variance so far, however. We also need to look at the
autocovariance function.


In order to deduce that $X$ is stationary we also need to calculate the autocovariance function:

$$\gamma_k = \mathrm{cov}(X_t, X_{t+k}) = \sum_{j=0}^{\infty} \sum_{i=0}^{\infty} \alpha^i \alpha^j \, \mathrm{cov}(e_{t-j}, e_{t+k-i}) = \sum_{j=0}^{\infty} \sigma^2 \alpha^{2j+k} = \alpha^k \sum_{j=0}^{\infty} \sigma^2 \alpha^{2j} = \alpha^k \gamma_0$$

The double sum has been simplified here by noting that the covariance will be zero unless the subscripts $t-j$ and $t+k-i$ are equal, ie unless $i = j + k$. In this case the covariance equals $\sigma^2$. So in the sum over $i$, we only include the term for $i = j + k$.

We have also used the formula $\gamma_0 = \mathrm{var}(X_t) = \sum_{j=0}^{\infty} \alpha^{2j} \sigma^2 = \frac{\sigma^2}{1 - \alpha^2}$ from before.

This is independent of $t$, and thus a stationary process exists as long as $|\alpha| < 1$.

It is worth introducing here a method of more general utility for calculating autocovariance functions. From (13.1) we have, assuming that $X$ is stationary:

$$\begin{aligned}
\gamma_k &= \mathrm{cov}(X_t, X_{t-k}) \\
&= \mathrm{cov}(\mu + \alpha (X_{t-1} - \mu) + e_t, \; X_{t-k}) \\
&= \alpha \, \mathrm{cov}(X_{t-1}, X_{t-k}) \\
&= \alpha \gamma_{k-1}
\end{aligned}$$

implying that:

$$\gamma_k = \alpha^k \gamma_0 = \alpha^k \, \frac{\sigma^2}{1 - \alpha^2}$$

and:

$$\rho_k = \frac{\gamma_k}{\gamma_0} = \alpha^k$$

for $k \ge 0$.

The partial autocorrelation function $\psi_k$ is given by:

$$\psi_1 = \rho_1 = \alpha, \qquad \psi_2 = \frac{\rho_2 - \rho_1^2}{1 - \rho_1^2} = 0$$

Indeed, since the best linear estimator of $X_t$ given $X_{t-1}, X_{t-2}, X_{t-3}, \ldots$ is just $\alpha X_{t-1}$, the definition of the PACF implies that $\psi_k = 0$ for all $k > 1$. Notice the contrast with the ACF, which decreases geometrically towards 0.


The following lines in R generate the ACF and PACF functions for an $AR(1)$ model:

par(mfrow=c(1,2))

barplot(ARMAacf(ar=0.7,lag.max=12)[-1],main="ACF of AR(1)",col="red")

barplot(ARMAacf(ar=0.7,lag.max=12,pacf=TRUE),main="PACF of AR(1)",col="red")

Figure 13.2: ACF and PACF of $AR(1)$ with $\alpha = 0.7$

Example

One of the well-known applications of a univariate autoregressive model is the description of the evolution of the consumer price index $\{Q_t : t = 1, 2, 3, \ldots\}$. The force of inflation, $r_t = \ln(Q_t / Q_{t-1})$, is assumed to follow the $AR(1)$ process:

$$r_t = \mu + \alpha (r_{t-1} - \mu) + e_t$$

One initial condition, the value for $r_0$, is required for the complete specification of the model for the force of inflation $r_t$.

The process $r_t$ is said to be mean-reverting, ie it has a long-run mean, and if it drifts away, then it tends to be dragged back towards it. In this case, the long-run mean is $\mu$. The equation for $r_t$ can be written in the form:

$$r_t - \mu = \alpha (r_{t-1} - \mu) + e_t$$


If we ignore the error term, which has a mean of zero, then this equation says that the difference between $r$ and the long-run mean at time $t$ is $\alpha$ times the previous difference. In order to be mean-reverting, this distance must reduce, so we need $|\alpha| < 1$, as for stationarity. In fact, we probably wouldn't expect the force of inflation to be dragged to the other side of the mean, so a realistic model is likely to have $0 < \alpha < 1$.
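The following R sketch simulates this mean-reverting behaviour using arima.sim; the parameter values ($\mu = 0.02$, $\alpha = 0.7$, $\sigma = 0.005$) are illustrative choices of our own, not values from the example above:

set.seed(123)
mu <- 0.02; alpha <- 0.7

# arima.sim generates a zero-mean stationary AR(1); adding mu gives r_t
r <- mu + arima.sim(model = list(ar = alpha), n = 200, sd = 0.005)

ts.plot(r, main = "Simulated force of inflation")
abline(h = mu, lty = 2)   # the path keeps being dragged back towards mu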

3.4 The autoregressive model, AR(p)


The equation of the more general $AR(p)$ process is:

$$X_t = \mu + \alpha_1 (X_{t-1} - \mu) + \alpha_2 (X_{t-2} - \mu) + \cdots + \alpha_p (X_{t-p} - \mu) + e_t \qquad (13.4)$$

or, in terms of the backwards shift operator:

$$(1 - \alpha_1 B - \alpha_2 B^2 - \cdots - \alpha_p B^p)(X_t - \mu) = e_t \qquad (13.5)$$

As seen for $AR(1)$, there are some restrictions on the values of the $\alpha_j$ which are permitted if the process is to be stationary. In particular, we have the following result.

Condition for stationarity of an AR(p) process (Result 13.2)

If the time series process $X$ given by (13.4) is stationary, then the roots of the equation:

$$1 - \alpha_1 z - \alpha_2 z^2 - \cdots - \alpha_p z^p = 0$$

are all greater than 1 in absolute value.

(The polynomial $1 - \alpha_1 z - \alpha_2 z^2 - \cdots - \alpha_p z^p$ is called the characteristic polynomial of the autoregression.)

The equation $1 - \alpha_1 z - \alpha_2 z^2 - \cdots - \alpha_p z^p = 0$ is known as the characteristic equation of the process.

For an $AR(1)$ process, $X_t = \mu + \alpha (X_{t-1} - \mu) + e_t$, the characteristic equation is:

$$1 - \alpha z = 0$$

The root of this equation is $z = 1/\alpha$. So for an $AR(1)$ process to be stationary, we must have:

$$\frac{1}{|\alpha|} > 1$$

ie:

$$|\alpha| < 1$$

This is the same stationarity condition that we derived in Section 3.3.


It is important to realise what this result does not say. Although for an $AR(1)$ process we can look at the coefficient $\alpha$ and deduce stationarity if and only if $|\alpha| < 1$, we cannot do this for higher order autoregressive processes. For example, an $AR(2)$ process:

$$X_t = \mu + \alpha_1 (X_{t-1} - \mu) + \alpha_2 (X_{t-2} - \mu) + e_t$$

would not necessarily be stationary just because $|\alpha_i| < 1$ for both $i = 1$ and 2. We would need to look at the roots of the characteristic polynomial, as in the sketch below.
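The roots can be checked numerically in R with polyroot, which takes a polynomial's coefficients in increasing order of powers. As an illustration (with coefficients of our own choosing), the AR(2) process with $\alpha_1 = 0.6$ and $\alpha_2 = 0.5$ has both coefficients less than 1 in magnitude but is not stationary:

Mod(polyroot(c(1, -0.6, -0.5)))   # characteristic polynomial 1 - 0.6z - 0.5z^2
# approximately 0.94 and 2.14: one root lies inside the unit circle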

We can prove the stationarity condition as follows.

Proof of Result 13.2


If $X$ is stationary then its autocovariance function satisfies:

$$\gamma_k = \mathrm{cov}(X_t, X_{t-k}) = \mathrm{cov}\left(\sum_{j=1}^{p} \alpha_j X_{t-j} + e_t, \; X_{t-k}\right) = \sum_{j=1}^{p} \alpha_j \gamma_{k-j}$$

for $k \ge p$.

The $\mu$'s are constant and do not therefore contribute to the covariance.

This is a $p$-th order difference equation with constant coefficients; it has a solution of the form:

$$\gamma_k = \sum_{j=1}^{p} A_j z_j^{-k}$$

for all $k \ge 0$, where $z_1, \ldots, z_p$ are the $p$ roots of the characteristic polynomial and $A_1, \ldots, A_p$ are constants. (We will show this in a moment.) As $X$ is purely indeterministic, we must have $\gamma_k \to 0$, which requires that $|z_j| > 1$ for each $j$.

Question

Show by substitution that $\gamma_k = \sum_{j=1}^{p} A_j z_j^{-k}$ is a solution of the given difference equation.

Solution

Substituting $\gamma_k = \sum_{j=1}^{p} A_j z_j^{-k}$ into the right-hand side of the difference equation $\gamma_k = \sum_{j=1}^{p} \alpha_j \gamma_{k-j}$, we get:

$$\sum_{j=1}^{p} \alpha_j \gamma_{k-j} = \sum_{j=1}^{p} \alpha_j \left(\sum_{i=1}^{p} A_i z_i^{-(k-j)}\right) = \sum_{i=1}^{p} A_i z_i^{-k} \left(\sum_{j=1}^{p} \alpha_j z_i^{j}\right)$$


By definition of $z_i$ as a root of the characteristic equation, we have:

$$\sum_{j=1}^{p} \alpha_j z_i^{j} = 1$$

So:

$$\sum_{j=1}^{p} \alpha_j \gamma_{k-j} = \sum_{i=1}^{p} A_i z_i^{-k} = \gamma_k$$

as required.

The converse of Result 13.2 is also true (but the proof is not given here): if the roots of the characteristic polynomial are all greater than 1 in absolute value, then it is possible to construct a stationary process $X$ satisfying (13.4). In order for an arbitrary process $X$ satisfying (13.4) to be stationary, the variances and covariances of the initial values $X_0, X_{-1}, \ldots, X_{-p+1}$ must also be equal to the appropriate values.

Although we do not give a formal proof, we will provide another way of thinking about this result.

Recall that in the $AR(1)$ case we said that the process turned out to be stationary if and only if $X_t$ could be written as a (convergent) sum of white noise terms. Equivalently, if we start from the equation:

$$(1 - \alpha B)(X_t - \mu) = e_t$$

then the process is stationary if and only if we can invert the term $(1 - \alpha B)$, since this is the case if and only if $|\alpha| < 1$.

Analogously, we can write an $AR(p)$ process in the form:

$$\left(1 - \frac{B}{z_1}\right)\left(1 - \frac{B}{z_2}\right)\cdots\left(1 - \frac{B}{z_p}\right)(X_t - \mu) = e_t$$

where $z_1, z_2, \ldots, z_p$ are the $p$ (possibly complex) roots of the characteristic polynomial. In other words, the characteristic polynomial factorises as:

$$1 - \alpha_1 z - \alpha_2 z^2 - \cdots - \alpha_p z^p = \left(1 - \frac{z}{z_1}\right)\left(1 - \frac{z}{z_2}\right)\cdots\left(1 - \frac{z}{z_p}\right)$$

It follows that in order to write $X_t$ in terms of the process $e$, we need to be able to invert all $p$ of the factors $\left(1 - \frac{B}{z_i}\right)$. This will be the case if and only if $|z_i| > 1$ for all $i = 1, 2, \ldots, p$.


Question

Determine the characteristic polynomial of the process defined by the equation:

$$X_t - 5 = 2(X_{t-1} - 5) + 3(X_{t-2} - 5) + e_t$$

and calculate its roots. Hence comment on the stationarity of the process.

Solution

We first rearrange the equation for the process so that all the $X$ terms appear on the same side. Doing this we obtain:

$$(X_t - 5) - 2(X_{t-1} - 5) - 3(X_{t-2} - 5) = e_t$$

We now replace $X_{t-s}$ by $z^s$, ie we replace $X_t$ by 1, $X_{t-1}$ by $z$, and $X_{t-2}$ by $z^2$. So the characteristic polynomial is $1 - 2z - 3z^2$.

This polynomial can be factorised as $(1 - 3z)(1 + z)$, so its roots are $\frac13$ and $-1$. Neither root is strictly greater than 1 in absolute value.

This shows that the process is not stationary.

There is no requirement to use the letter $z$ (or indeed any particular letter) when writing down the characteristic polynomial. The letter $\lambda$ is often used instead.

Question

Given that $\lambda = 2$ is a root of the characteristic equation of the process:

$$X_n = \tfrac{11}{6} X_{n-1} - X_{n-2} + \tfrac16 X_{n-3} + e_n$$

calculate the other roots and classify the process as $I(d)$.

Solution

The process can be written in the form:

$$X_n - \tfrac{11}{6} X_{n-1} + X_{n-2} - \tfrac16 X_{n-3} = e_n$$

so that the characteristic equation is:

$$1 - \tfrac{11}{6} \lambda + \lambda^2 - \tfrac16 \lambda^3 = 0$$


We are given that $\lambda = 2$ is a root. So $(\lambda - 2)$ is a factor and hence:

$$1 - \tfrac{11}{6} \lambda + \lambda^2 - \tfrac16 \lambda^3 = (\lambda - 2)(a \lambda^2 + b \lambda + c)$$

where $a$, $b$ and $c$ are constants. The values of $a$, $b$ and $c$ can be determined in several ways, eg by comparing the coefficients on both sides of this equation, by long division of polynomials, or by synthetic division. We find that:

$$1 - \tfrac{11}{6} \lambda + \lambda^2 - \tfrac16 \lambda^3 = (\lambda - 2)\left(-\tfrac16 \lambda^2 + \tfrac23 \lambda - \tfrac12\right) = -\tfrac16 (\lambda - 2)(\lambda - 3)(\lambda - 1)$$

So the other roots of the characteristic equation are $\lambda = 3$ and $\lambda = 1$.

The process is not stationary, since the characteristic equation has a root that is not strictly greater than 1 in magnitude.

It is easy to see that differencing the process once will eliminate the root of 1. The two remaining roots (ie 2 and 3) are both strictly greater than 1 in magnitude, so the differenced process is stationary. Hence $X$ is $I(1)$.
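The roots can be confirmed numerically with polyroot (a sketch; coefficients in increasing powers of $\lambda$):

Mod(polyroot(c(1, -11/6, 1, -1/6)))   # magnitudes 1, 2 and 3

The single root of magnitude 1 is the unit root removed by differencing once.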

Often exact values for the $\gamma_k$ are required, entailing finding the values of the constants $A_k$. From (13.4) we have:

$$\mathrm{cov}(X_t, X_{t-k}) = \alpha_1 \, \mathrm{cov}(X_{t-1}, X_{t-k}) + \cdots + \alpha_p \, \mathrm{cov}(X_{t-p}, X_{t-k}) + \mathrm{cov}(e_t, X_{t-k})$$

which can be re-expressed as:

$$\gamma_k = \alpha_1 \gamma_{k-1} + \alpha_2 \gamma_{k-2} + \cdots + \alpha_p \gamma_{k-p} + \sigma^2 \, 1_{\{k=0\}}$$

for $0 \le k \le p$. (These are known as the Yule-Walker equations.) Here the notation $1_{\{k=0\}}$ denotes an indicator function, taking the value 1 if $k = 0$, the value 0 otherwise.

For $p = 3$ we have 4 equations:

$$\gamma_3 = \alpha_1 \gamma_2 + \alpha_2 \gamma_1 + \alpha_3 \gamma_0$$

$$\gamma_2 = \alpha_1 \gamma_1 + \alpha_2 \gamma_0 + \alpha_3 \gamma_1$$

$$\gamma_1 = \alpha_1 \gamma_0 + \alpha_2 \gamma_1 + \alpha_3 \gamma_2$$

$$\gamma_0 = \alpha_1 \gamma_1 + \alpha_2 \gamma_2 + \alpha_3 \gamma_3 + \sigma^2$$

The second and third of these equations are sufficient to deduce $\gamma_2$ and $\gamma_1$ in terms of $\gamma_0$, which is all that is required to find $\rho_2$ and $\rho_1$. The first and fourth of the equations are needed when the values of the $\gamma_k$ are to be found explicitly.


The PACF, $\{\psi_k : k \ge 1\}$, of the $AR(p)$ process can be calculated from the defining equations, but is not memorable. In particular, the first three equations above can be written in terms of $\rho_1$, $\rho_2$, $\rho_3$, and the resulting solution for $\alpha_3$ as a function of $\rho_1$, $\rho_2$, $\rho_3$ is the expression for $\psi_3$. The same idea applies to all values of $k$, so that $\psi_k$ is the solution for $\alpha_k$ in a system of $k$ linear equations, including those for $\psi_1 = \rho_1$ and $\psi_2 = \frac{\rho_2 - \rho_1^2}{1 - \rho_1^2}$ that we have seen before.

It is important to note, though, that $\psi_k = 0$ for all $k > p$.

This result is worth repeating.

Behaviour of the PACF for an AR(p) process

For an $AR(p)$ process:

$$\psi_k = 0 \quad \text{for } k > p$$

This property of the PACF is characteristic of autoregressive processes and forms the basis of the most frequently used test for determining whether an $AR(p)$ model fits the data, as illustrated in the sketch below. It would be difficult to base a test on the ACF, as the ACF of an autoregressive process is a sum of geometrically decreasing components. (See Section 2 in Chapter 14.)
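For example, the PACF of an AR(2) process with illustrative coefficients $\alpha_1 = 0.5$ and $\alpha_2 = 0.3$ (our own choice) cuts off after lag 2:

round(ARMAacf(ar = c(0.5, 0.3), lag.max = 6, pacf = TRUE), 4)
# psi_1 and psi_2 are non-zero; psi_3, psi_4, ... are all zero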

Question

Give a derivation of the equation:

$$\gamma_0 = \alpha_1 \gamma_1 + \alpha_2 \gamma_2 + \alpha_3 \gamma_3 + \sigma^2$$

Solution

The autocovariance at lag 0 is:

$$\gamma_0 = \mathrm{cov}(X_t, X_t)$$

Expanding the LHS only (which will always be our approach when determining the autocovariance function of an autoregressive process), and remembering that the covariance is unaffected by the mean $\mu$, we see that:

$$\gamma_0 = \mathrm{cov}(\alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \alpha_3 X_{t-3} + e_t, \; X_t)$$

Now, using the properties of covariance:

$$\begin{aligned}
\gamma_0 &= \alpha_1 \, \mathrm{cov}(X_{t-1}, X_t) + \alpha_2 \, \mathrm{cov}(X_{t-2}, X_t) + \alpha_3 \, \mathrm{cov}(X_{t-3}, X_t) + \mathrm{cov}(e_t, X_t) \\
&= \alpha_1 \gamma_1 + \alpha_2 \gamma_2 + \alpha_3 \gamma_3 + \mathrm{cov}(e_t, X_t)
\end{aligned}$$


But:

$$\begin{aligned}
\mathrm{cov}(e_t, X_t) &= \mathrm{cov}(e_t, \; \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \alpha_3 X_{t-3} + e_t) \\
&= \alpha_1 \, \mathrm{cov}(e_t, X_{t-1}) + \alpha_2 \, \mathrm{cov}(e_t, X_{t-2}) + \alpha_3 \, \mathrm{cov}(e_t, X_{t-3}) + \mathrm{cov}(e_t, e_t) \\
&= 0 + 0 + 0 + \sigma^2 = \sigma^2
\end{aligned}$$

This is because $X_{t-1}, X_{t-2}, \ldots$ are functions of past white noise terms, and $e_t$ is independent of earlier values. So:

$$\gamma_0 = \alpha_1 \gamma_1 + \alpha_2 \gamma_2 + \alpha_3 \gamma_3 + \sigma^2$$

We will now derive a formula for the ACF of an AR(2) process. To do this we need to remember a
little about difference equations. Some formulae relating to difference equations are given on
page 4 of the Tables.

Question

A stationary $AR(2)$ process is defined by the equation:

$$X_t = \tfrac56 X_{t-1} - \tfrac16 X_{t-2} + e_t$$

Determine the values of $\rho_k$ and $\psi_k$ for $k = 1, 2, 3, \ldots$

Solution

We do not actually need to calculate $\gamma_0$ in order to find the ACF. This is always the case for an autoregressive process.

By definition:

$$\rho_0 = \frac{\gamma_0}{\gamma_0} = 1$$

The autocovariance at lag 1 is:

$$\begin{aligned}
\gamma_1 &= \mathrm{cov}(X_t, X_{t-1}) \\
&= \mathrm{cov}(\tfrac56 X_{t-1} - \tfrac16 X_{t-2} + e_t, \; X_{t-1}) \\
&= \tfrac56 \, \mathrm{cov}(X_{t-1}, X_{t-1}) - \tfrac16 \, \mathrm{cov}(X_{t-2}, X_{t-1}) + \mathrm{cov}(e_t, X_{t-1}) \\
&= \tfrac56 \gamma_0 - \tfrac16 \gamma_1
\end{aligned}$$

Rearranging gives:

$$\tfrac76 \gamma_1 = \tfrac56 \gamma_0$$


So:

$$\gamma_1 = \tfrac57 \gamma_0 \qquad \text{and} \qquad \rho_1 = \frac{\gamma_1}{\gamma_0} = \tfrac57$$

Similarly, the autocovariance at lag 2 is:

$$\begin{aligned}
\gamma_2 &= \mathrm{cov}(X_t, X_{t-2}) \\
&= \mathrm{cov}(\tfrac56 X_{t-1} - \tfrac16 X_{t-2} + e_t, \; X_{t-2}) \\
&= \tfrac56 \, \mathrm{cov}(X_{t-1}, X_{t-2}) - \tfrac16 \, \mathrm{cov}(X_{t-2}, X_{t-2}) + \mathrm{cov}(e_t, X_{t-2}) \\
&= \tfrac56 \gamma_1 - \tfrac16 \gamma_0
\end{aligned}$$

Using the fact that $\gamma_1 = \tfrac57 \gamma_0$, we have:

$$\gamma_2 = \tfrac56 \times \tfrac57 \gamma_0 - \tfrac16 \gamma_0 = \tfrac37 \gamma_0 \qquad \text{and} \qquad \rho_2 = \frac{\gamma_2}{\gamma_0} = \tfrac37$$

In general, for $k \ge 2$, we have:

$$\rho_k = \tfrac56 \rho_{k-1} - \tfrac16 \rho_{k-2}$$

We can solve this second-order difference equation. The characteristic equation (for the difference equation, which is unfortunately slightly different to the characteristic equation for the process) is:

$$\lambda^2 - \tfrac56 \lambda + \tfrac16 = \left(\lambda - \tfrac12\right)\left(\lambda - \tfrac13\right) = 0$$

Using the formula on page 4 of the Tables, the general solution of this difference equation is of the form:

$$\rho_k = A \left(\tfrac12\right)^k + B \left(\tfrac13\right)^k$$

In order to find the solution we want, we need to use two boundary conditions to determine the two constants. We know that $\rho_0 = 1$ and $\rho_1 = \tfrac57$. So:

$$\rho_0 = A + B = 1$$

$$\rho_1 = \tfrac12 A + \tfrac13 B = \tfrac57$$

Solving these equations gives $A = \tfrac{16}{7}$ and $B = -\tfrac97$.

The autocorrelation function is therefore:

$$\rho_k = \tfrac{16}{7} \left(\tfrac12\right)^k - \tfrac97 \left(\tfrac13\right)^k$$


We are also asked for the partial autocorrelation function. Using the formulae on page 40 of the Tables:

$$\psi_1 = \rho_1 = \tfrac57$$

$$\psi_2 = \frac{\rho_2 - \rho_1^2}{1 - \rho_1^2} = -\tfrac16$$

Also, since this is an $AR(2)$ process:

$$\psi_k = 0 \quad \text{for } k = 3, 4, 5, \ldots$$

3.5 The first-order moving average model, MA(1)


A first-order moving average process, denoted $MA(1)$, is a process given by:

$$X_t = \mu + e_t + \beta e_{t-1}$$

The mean of this process is $\mu_t = \mu$.

The variance and autocovariance are:

$$\gamma_0 = \mathrm{var}(e_t + \beta e_{t-1}) = (1 + \beta^2) \sigma^2$$

$$\gamma_1 = \mathrm{cov}(e_t + \beta e_{t-1}, \; e_{t-1} + \beta e_{t-2}) = \beta \sigma^2$$

$$\gamma_k = 0 \quad \text{for } k > 1$$

Hence the ACF of the $MA(1)$ process is:

$$\rho_0 = 1, \qquad \rho_1 = \frac{\beta}{1 + \beta^2}, \qquad \rho_k = 0 \text{ for } k > 1$$

Question

Show that the moving average process $X_n = Z_n + \beta Z_{n-1}$ is weakly stationary, where $\{Z_n\}$ is a white noise process with mean $\mu$ and variance $\sigma^2$.


Solution

The mean is constant, since $E(X_n) = (1 + \beta) \mu$.

For the covariance:

$$\mathrm{cov}(X_n, X_n) = \mathrm{cov}(Z_n + \beta Z_{n-1}, \; Z_n + \beta Z_{n-1}) = (1 + \beta^2) \sigma^2$$

or alternatively:

$$\mathrm{var}(X_n) = \mathrm{var}(Z_n + \beta Z_{n-1}) = (1 + \beta^2) \sigma^2$$

and:

$$\mathrm{cov}(X_n, X_{n-1}) = \mathrm{cov}(Z_n + \beta Z_{n-1}, \; Z_{n-1} + \beta Z_{n-2}) = \mathrm{cov}(\beta Z_{n-1}, Z_{n-1}) = \beta \sigma^2$$

$$\mathrm{cov}(X_n, X_{n-2}) = \mathrm{cov}(Z_n + \beta Z_{n-1}, \; Z_{n-2} + \beta Z_{n-3}) = 0$$

In fact, the covariance at higher lags remains 0 since there is no overlap between the $Z$'s. The covariances at the corresponding negative lags are the same.

Since none of these expressions depends on $n$, it follows that the process is weakly stationary.

An MA(1) process is stationary regardless of the values of its parameters. The parameters
are nevertheless usually constrained by imposing the condition of invertibility. This may be
explained as follows.

It is possible to have two distinct $MA(1)$ models with identical ACFs: consider, for example, $\beta = 0.5$ and $\beta = 2$, both of which have $\rho_1 = \frac{\beta}{1 + \beta^2} = 0.4$.

The defining equation of the $MA(1)$ may be written in terms of the backwards shift operator:

$$X - \mu = (1 + \beta B) e \qquad (13.6)$$

In many circumstances an autoregressive model is more convenient than a moving average model.

We may rewrite (13.6) as:

$$(1 + \beta B)^{-1} (X - \mu) = e$$

and use the standard expansion $(1 + \beta B)^{-1} = 1 - \beta B + \beta^2 B^2 - \beta^3 B^3 + \cdots$ to give:

$$(X_t - \mu) - \beta (X_{t-1} - \mu) + \beta^2 (X_{t-2} - \mu) - \beta^3 (X_{t-3} - \mu) + \cdots = e_t$$

The expansion referred to here is given on page 2 of the Tables.


The original moving average model has therefore been transformed into an autoregression of infinite order. But this procedure is only valid if the sum on the left-hand side is convergent, in other words if $|\beta| < 1$. When this condition is satisfied the $MA(1)$ is called invertible. Although more than one MA process may share a given ACF, at most one of the processes will be invertible.

We might want to know the historic values of the white noise process. Although the values $\{x_0, x_1, \ldots, x_t\}$ can be observed, and are therefore known at time $t$, the values of the white noise process $\{e_0, e_1, \ldots, e_t\}$ are not. Can we obtain the unknown $e$ values from the known $x$ values? The answer is yes, in theory, if and only if the process is invertible, since we can then write the value $e_t$ in terms of the $x$'s, as above. In practice we wouldn't actually have an infinite history of $x$ values but, since the coefficients of the $x$'s get smaller as we go back in time for an invertible process, the contribution of the values before time 1, say, will be negligible. We can make this more precise, as in the following question.

Question

Show that the process $X_n = \mu + e_n + \beta e_{n-1}$ may be inverted as follows:

$$e_n = (-\beta)^n e_0 + \sum_{i=0}^{n-1} (-\beta)^i (x_{n-i} - \mu)$$

Solution

A simple algebraic rearrangement shows that $X_n = \mu + e_n + \beta e_{n-1}$ can be rewritten as $e_n = X_n - \mu - \beta e_{n-1}$. Now using an iterative procedure:

$$\begin{aligned}
e_n &= X_n - \mu - \beta e_{n-1} \\
&= X_n - \mu - \beta (X_{n-1} - \mu - \beta e_{n-2}) \\
&\;\;\vdots \\
&= (X_n - \mu) - \beta (X_{n-1} - \mu) + \beta^2 (X_{n-2} - \mu) - \cdots + (-\beta)^{n-1} (X_1 - \mu) + (-\beta)^n e_0
\end{aligned}$$

Notice that, as $n$ gets large, the dependence of $e_n$ on $e_0$ will be small provided $|\beta| < 1$.

The condition for an $MA(1)$ process to be invertible is similar to the condition for an $AR(1)$ process to be stationary. An $AR(1)$ process is stationary if and only if the process $X$ can be written explicitly in terms of the process $e$. The invertibility condition ensures that the white noise process $e$ can be written in terms of the $X$ process. This relationship generalises to $AR(p)$ and $MA(q)$ processes, as we will see shortly.


It is possible, at the cost of considerable effort, to calculate the PACF of the $MA(1)$, giving:

$$\psi_k = (-1)^{k+1} \, \frac{(1 - \beta^2) \beta^k}{1 - \beta^{2(k+1)}}$$

This formula can be found on page 41 of the Tables.

This decays approximately geometrically as $k \to \infty$, highlighting the way in which the ACF and PACF are complementary: the PACF of an $MA(1)$ behaves like the ACF of an $AR(1)$, and the PACF of an $AR(1)$ behaves like the ACF of an $MA(1)$.

Figure 13.3: ACF and PACF of $MA(1)$ with $\beta = 0.7$
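A figure along these lines can be generated with the same approach as used for Figure 13.2 (a sketch):

par(mfrow=c(1,2))

barplot(ARMAacf(ma=0.7,lag.max=12)[-1],main="ACF of MA(1)",col="red")

barplot(ARMAacf(ma=0.7,lag.max=12,pacf=TRUE),main="PACF of MA(1)",col="red")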

3.6 The moving average model, MA(q)


The defining equation of the general $q$th order moving average is, in backwards shift notation:

$$X - \mu = (1 + \beta_1 B + \beta_2 B^2 + \cdots + \beta_q B^q) e$$

In other words, it is:

$$X_t = \mu + e_t + \beta_1 e_{t-1} + \beta_2 e_{t-2} + \cdots + \beta_q e_{t-q}$$

Moving average processes are always stationary, as they are a linear combination of white noise,
which is itself stationary.

Recall that for a stationary $AR(p)$ process, $X_t$ can be expressed as an (infinite and convergent) sum of white noise terms. This means that any stationary autoregressive process can be considered to be a moving average of infinite order. However, by a moving average process we will usually mean one of finite order.


The autocovariance function is easier to find than in the case of $AR(p)$:

$$\gamma_k = \sum_{i=0}^{q} \sum_{j=0}^{q} \beta_i \beta_j \, E(e_{t-i} e_{t-j-k}) = \sigma^2 \sum_{i=0}^{q-k} \beta_i \beta_{i+k}$$

provided $k \le q$. (Here $\beta_0$ denotes 1.)

Note that $\mathrm{cov}(e_i, e_j) = E(e_i e_j) - E(e_i) E(e_j) = E(e_i e_j) = 0$ for $i \ne j$ (since the random variables $e_t$ have 0 mean and are uncorrelated), and $\mathrm{cov}(e_i, e_i) = \mathrm{var}(e_i) = E(e_i^2) = \sigma^2$. So, in the double sum above, the only non-zero terms will be where the subscripts of $e_{t-i}$ and $e_{t-j-k}$ match, ie when $i = j + k$. This means that we can simplify the double sum by writing everything in terms of $j$. We need to get the limits right for $j$, which cannot go above $q - k$ because $i = j + k$ and $i$ only goes up to $q$. So we get:

$$\gamma_k = \sum_{j=0}^{q-k} \beta_{j+k} \beta_j \sigma^2$$

This matches the formula above, except that $i$ has been used in place of $j$.

For $k > q$ it is obvious that $\gamma_k = 0$. Just as autoregressive processes are characterised by the property that the partial ACF is equal to zero for sufficiently large $k$, moving average processes are characterised by the property that the ACF is equal to zero for sufficiently large $k$.

The importance of this observation will become apparent in Section 2 of Chapter 14. We will look
at an explicit case of this result to make things clearer.

Question

Calculate $\gamma_k$, $k = 0, 1, 2, 3, \ldots$ for the process:

$$X_n = 3 + e_n - e_{n-1} + 0.25 e_{n-2}$$

where $\{e_n\}$ is a white noise process with mean 0 and variance 1.

Solution

We have:

$$\begin{aligned}
\gamma_0 &= \mathrm{cov}(X_n, X_n) = \mathrm{var}(X_n) \\
&= \mathrm{var}(e_n - e_{n-1} + 0.25 e_{n-2}) \\
&= \mathrm{var}(e_n) + (-1)^2 \, \mathrm{var}(e_{n-1}) + 0.25^2 \, \mathrm{var}(e_{n-2}) \\
&= 1 + 1 + 0.0625 \\
&= 2.0625
\end{aligned}$$


$$\begin{aligned}
\gamma_1 &= \mathrm{cov}(X_n, X_{n-1}) \\
&= \mathrm{cov}(e_n - e_{n-1} + 0.25 e_{n-2}, \; e_{n-1} - e_{n-2} + 0.25 e_{n-3}) \\
&= -\mathrm{cov}(e_{n-1}, e_{n-1}) - \mathrm{cov}(0.25 e_{n-2}, e_{n-2}) \\
&= -1 - 0.25 \\
&= -1.25
\end{aligned}$$

$$\begin{aligned}
\gamma_2 &= \mathrm{cov}(X_n, X_{n-2}) \\
&= \mathrm{cov}(e_n - e_{n-1} + 0.25 e_{n-2}, \; e_{n-2} - e_{n-3} + 0.25 e_{n-4}) \\
&= \mathrm{cov}(0.25 e_{n-2}, e_{n-2}) \\
&= 0.25
\end{aligned}$$

$$\begin{aligned}
\gamma_3 &= \mathrm{cov}(X_n, X_{n-3}) \\
&= \mathrm{cov}(e_n - e_{n-1} + 0.25 e_{n-2}, \; e_{n-3} - e_{n-4} + 0.25 e_{n-5}) \\
&= 0
\end{aligned}$$

The covariance at higher lags is also 0 since there is no overlap between the $e$'s.

In the solution above, we have expanded the terms on both sides of the covariance expression.
This will always be our strategy when calculating the autocovariance function for a moving
average series. For all other types of series, we just expand the term on the LHS of the covariance
expression.
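As a numerical check, dividing by $\gamma_0$ gives the autocorrelations, which ARMAacf reproduces (a sketch):

c(-1.25, 0.25, 0) / 2.0625                   # rho_1, rho_2, rho_3 from above
ARMAacf(ma = c(-1, 0.25), lag.max = 3)[-1]   # should match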

We said above that an $MA(1)$ process $X_t = \mu + e_t + \beta e_{t-1}$ is invertible if $|\beta| < 1$, and we drew the analogy with an $AR(1)$ process being stationary. The same goes for this more general case. Recall that an $AR(p)$ process is stationary if and only if the roots of the characteristic equation are all strictly greater than 1 in magnitude.

For an $MA(q)$ process we have:

$$X_t - \mu = (1 + \beta_1 B + \beta_2 B^2 + \cdots + \beta_q B^q) e_t$$

The equation:

$$1 + \beta_1 z + \beta_2 z^2 + \cdots + \beta_q z^q = 0$$

can be used to determine invertibility. This can be thought of as the characteristic equation of the white noise terms.


Condition for invertibility of an MA(q) process

The process $X$ defined by the equation:

$$X_t = \mu + e_t + \beta_1 e_{t-1} + \beta_2 e_{t-2} + \cdots + \beta_q e_{t-q}$$

is invertible if and only if the roots of the equation:

$$1 + \beta_1 z + \beta_2 z^2 + \cdots + \beta_q z^q = 0$$

are all strictly greater than 1 in magnitude.

This is equivalent to saying that the value $e_t$ can be written explicitly as a (convergent) sum of $X$ values.

Again, $\lambda$ is often used instead of $z$ in the characteristic equation.

It follows that in the same way as a stationary autoregression can be thought of as a moving
average of infinite order, so a moving average can be thought of as an autoregression of infinite
order.

Question

Determine whether the process $X_t = 2 + e_t + 5 e_{t-1} + 6 e_{t-2}$ is invertible.

Solution

The equation $1 + 5\lambda + 6\lambda^2 = (1 + 2\lambda)(1 + 3\lambda) = 0$ has roots $-\frac13$ and $-\frac12$, so the process is not invertible.

Although there may be many moving average processes with the same ACF, at most one of
them is invertible, since no two invertible processes have the same autocorrelation
function. Moving average models fitted to data by statistical packages will always be
invertible.

3.7 The autoregressive moving average process, ARMA(p,q)


A combination of the moving average and autoregressive models, an ARMA model includes direct dependence of $X_t$ on both past values of $X$ and present and past values of $e$.

The defining equation is:

$$X_t = \mu + \alpha_1 (X_{t-1} - \mu) + \cdots + \alpha_p (X_{t-p} - \mu) + e_t + \beta_1 e_{t-1} + \cdots + \beta_q e_{t-q}$$

or, in backwards shift operator notation:

$$(1 - \alpha_1 B - \cdots - \alpha_p B^p)(X - \mu) = (1 + \beta_1 B + \cdots + \beta_q B^q) e$$


This might also be written:

$$\phi(B)(X_n - \mu) = \theta(B) e_n$$

where $\phi(B)$ and $\theta(B)$ are polynomials of degrees $p$ and $q$, respectively.

If $\phi(B)$ and $\theta(B)$ have any factors (ie roots) in common, then the defining relation could be simplified. For example, we may have a stationary $ARMA(1,1)$ process defined by $(1 - \alpha B) X_n = (1 - \alpha B) e_n$ with $|\alpha| < 1$. Since the factors $(1 - \alpha B)$ are invertible, we can multiply both sides of the equation by $(1 - \alpha B)^{-1}$ to see that $X_n = e_n$. So this process would actually be classified as $ARMA(0,0)$.

In general, it would be assumed that the polynomials $\phi(B)$ and $\theta(B)$ have no common roots.

Autoregressive and moving average processes are special cases of ARMA processes: $AR(p)$ is the same as $ARMA(p,0)$, and $MA(q)$ is the same as $ARMA(0,q)$.

To check the stationarity of an ARMA process, we just need to examine the autoregressive part.
The moving average part (which involves the white noise terms) is always stationary. The test is
the same as for an autoregressive process – we need to determine the roots of the characteristic
equation formed by the X terms. The process is stationary if and only if all the roots are strictly
greater than 1 in magnitude.

Similarly, we can check for invertibility by examining the roots of the characteristic equation that
is obtained from the white noise terms. The process is invertible if and only if all the roots are
strictly greater than 1 in magnitude.

Neither the ACF nor the PACF of the ARMA process eventually becomes equal to zero. This
makes it more difficult to identify an ARMA model than either a pure autoregression or a
pure moving average.

Theoretically, both the ACF and PACF of a stationary ARMA process will tend towards 0 for large
lags, but neither will have a cut off property.

It is possible to calculate the ACF by a method similar to the method employing the
Yule-Walker equations for the ACF of an autoregression.

We will show that the autocorrelation function of the stationary zero-mean $ARMA(1,1)$ process:

$$X_t = \alpha X_{t-1} + e_t + \beta e_{t-1} \qquad (13.7)$$

is given by:

$$\rho_1 = \frac{(1 + \alpha\beta)(\alpha + \beta)}{1 + \beta^2 + 2\alpha\beta}, \qquad \rho_k = \alpha^{k-1} \rho_1, \quad k = 2, 3, \ldots$$

These results can be obtained from the formula for $\rho_k$ given on page 40 of the Tables.


Figure 13.1 in Section 2.5 shows the ACF and PACF values of such a process with $\alpha = 0.7$ and $\beta = 0.5$.

Before we tackle the Yule-Walker equations, we need a couple of preliminary results.

Using equation (13.7), ie:

$$X_t = \alpha X_{t-1} + e_t + \beta e_{t-1}$$

we have:

$$\mathrm{cov}(X_t, e_t) = \alpha \, \mathrm{cov}(X_{t-1}, e_t) + \mathrm{cov}(e_t, e_t) + \beta \, \mathrm{cov}(e_{t-1}, e_t) = \sigma^2$$

since $e_t$ is independent of both $e_{t-1}$ and $X_{t-1}$.

Similarly:

$$\mathrm{cov}(X_t, e_{t-1}) = \alpha \, \mathrm{cov}(X_{t-1}, e_{t-1}) + \mathrm{cov}(e_t, e_{t-1}) + \beta \, \mathrm{cov}(e_{t-1}, e_{t-1}) = (\alpha + \beta) \sigma^2$$

This enables us to deduce the autocovariance function of $X$. Again from (13.7):

$$\mathrm{cov}(X_t, X_t) = \alpha \, \mathrm{cov}(X_{t-1}, X_t) + \mathrm{cov}(e_t, X_t) + \beta \, \mathrm{cov}(e_{t-1}, X_t)$$

$$\mathrm{cov}(X_t, X_{t-1}) = \alpha \, \mathrm{cov}(X_{t-1}, X_{t-1}) + \mathrm{cov}(e_t, X_{t-1}) + \beta \, \mathrm{cov}(e_{t-1}, X_{t-1})$$

and, for $k > 1$:

$$\mathrm{cov}(X_t, X_{t-k}) = \alpha \, \mathrm{cov}(X_{t-1}, X_{t-k}) + \mathrm{cov}(e_t, X_{t-k}) + \beta \, \mathrm{cov}(e_{t-1}, X_{t-k})$$

So:

$$\gamma_0 = \alpha \gamma_1 + (1 + \alpha\beta + \beta^2) \sigma^2$$

$$\gamma_1 = \alpha \gamma_0 + \beta \sigma^2$$

$$\gamma_k = \alpha \gamma_{k-1}$$

The solution is:

$$\gamma_0 = \frac{1 + 2\alpha\beta + \beta^2}{1 - \alpha^2} \, \sigma^2$$

$$\gamma_1 = \frac{(\alpha + \beta)(1 + \alpha\beta)}{1 - \alpha^2} \, \sigma^2$$

$$\gamma_k = \alpha^{k-1} \gamma_1, \quad k = 2, 3, \ldots$$

assuming that the process is stationary, ie that $|\alpha| < 1$.


Hence:

$$\rho_1 = \frac{\gamma_1}{\gamma_0} = \frac{(1 + \alpha\beta)(\alpha + \beta)}{1 + 2\alpha\beta + \beta^2}$$

and:

$$\rho_k = \frac{\gamma_k}{\gamma_0} = \frac{\alpha^{k-1} \gamma_1}{\gamma_0} = \alpha^{k-1} \rho_1, \quad k = 2, 3, \ldots$$

Question

Show that the process $12 X_t = 10 X_{t-1} - 2 X_{t-2} + 12 e_t + 11 e_{t-1} + 2 e_{t-2}$ is both stationary and invertible.

Solution

We start by rewriting the process so that all the $X$'s are on one side and all the $e$'s are on the other:

$$12 X_t - 10 X_{t-1} + 2 X_{t-2} = 12 e_t + 11 e_{t-1} + 2 e_{t-2}$$

The characteristic equation of the AR part is $12 - 10\lambda + 2\lambda^2 = 2(\lambda - 2)(\lambda - 3) = 0$, which has roots 2 and 3. These are both strictly greater than 1 in magnitude, so the process is stationary.

The characteristic equation of the MA part is $12 + 11\lambda + 2\lambda^2 = (2\lambda + 3)(\lambda + 4) = 0$. The roots of this equation are $-1.5$ and $-4$, both strictly greater than 1 in magnitude. The process is therefore invertible.
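Equivalently, as a numerical sketch:

Mod(polyroot(c(12, -10, 2)))   # AR part: magnitudes 2 and 3, so stationary
Mod(polyroot(c(12, 11, 2)))    # MA part: magnitudes 1.5 and 4, so invertible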

3.8 Modelling non-stationary processes: the ARIMA model


This is the most general class of models we will consider. They lie at the heart of the Box-Jenkins
approach to modelling time series. In order to understand the rationale underlying the definition,
it will be useful to give a brief introduction to the Box-Jenkins method; a more detailed discussion
will be given in Chapter 14.

Suppose we are given some time series data $x_n$, where $n$ varies over some finite range. If we
want to model the data, then we would expect to take sample statistics, in particular the sample
autocorrelation function, sample partial autocorrelation function and sample mean; these will be
discussed in more detail in Chapter 14. The modelling process would then involve finding a
stochastic process with similar characteristics. In the Box-Jenkins approach, the model is picked
from the ARMA(p, q) class. However, the theoretical counterparts of the autocorrelation and
partial autocorrelation functions are only defined for stationary series. The upshot of this is that
we can only directly apply these methods if the original data values are stationary.


However, we can get around this problem as follows. First transform the data into a stationary
form, which we will discuss in a moment. We can then model this stationary series as suggested
above. Finally, we carry out the inverse transform on our model to obtain a model for the original
series. The question remains as to what we mean by ‘transform’ and ‘inverse transform’.

The backward difference operator can turn a non-stationary series into a stationary one. For
example, a random walk:

$$X_t = X_{t-1} + e_t$$

is non-stationary (as its characteristic equation is $1 - \lambda = 0$, which has a root of 1). However, the difference:

$$\nabla X_t = X_t - X_{t-1} = e_t$$

is just white noise, which is stationary.

For the moment we will assume that it’s possible to turn the data set into a stationary series by
repeated differencing. We may have to difference the series several times, the specific number
usually being denoted by d.

Now assuming we’ve transformed our data into stationary form by differencing, we can model
this series using a stationary ARMA(p, q) process. The final step is to reverse the differencing
procedure to obtain a model of the original series. The inverse process of differencing is
integration since we must sum the differences to obtain the original series.

Question

From a data set $x_0, x_1, x_2, \ldots, x_N$ the first order differences $w_i = x_i - x_{i-1}$ are calculated.

State the range of values of $i$ and give an expression for $x_j$ in terms of $x_0$ and the $w$'s.

Solution

The values of $i$ are $1, 2, \ldots, N$ and:

$$x_j = x_0 + \sum_{i=1}^{j} w_i$$

In many applications the process being modelled cannot be assumed stationary, but can reasonably be fitted by a model with stationary increments, that is, if the first difference of $X$, $Y = \nabla X$, is itself a stationary process.

A process $X$ is called an $ARIMA(p,1,q)$ process if $X$ is non-stationary but the first difference of $X$ is a stationary $ARMA(p,q)$ process.

We will now consider some examples.


Example 1

The simplest example of an ARIMA process is the random walk, $X_t = X_{t-1} + e_t$, which can be written $X_t = X_0 + \sum_{j=1}^{t} e_j$. The expectation of $X_t$ is equal to $E(X_0)$ but the variance is $\mathrm{var}(X_0) + t\sigma^2$, so that $X$ is not itself stationary.

Here we are assuming (as we usually do) that the white noise process has zero mean.

The first difference, however, is given by:

$$Y_t = \nabla X_t = e_t$$

which certainly is stationary. Thus the random walk is an $ARIMA(0,1,0)$ process.

Example 2

Let $Z_t$ denote the closing price of a share on day $t$. The evolution of $Z$ is frequently described by the model:

$$Z_t = Z_{t-1} \exp(\mu + e_t)$$

By taking logarithms we see that this model is equivalent to an $I(1)$ model, since $Y_t = \ln Z_t$ satisfies the equation:

$$Y_t = \mu + Y_{t-1} + e_t$$

which is the defining equation of a random walk with drift, because $Y_t = Y_0 + t\mu + \sum_{j=1}^{t} e_j$. The model is based on the assumption that the daily returns $\ln(Z_t / Z_{t-1})$ are independent of the past prices $Z_0, Z_1, \ldots, Z_{t-1}$.

Example 3

The logarithm of the consumer price index can be described by the $ARIMA(1,1,0)$ model:

$$(1 - B) \ln Q_t - \mu = \alpha \left[(1 - B) \ln Q_{t-1} - \mu\right] + e_t$$

When analysing the behaviour of an $ARIMA(p,1,q)$ model, the standard technique is to look at the first difference of the process and to perform the kind of analysis which is suitable for an ARMA model. Once complete, this can be used to provide predictions for the original, undifferenced, process.

ARIMA(p,d,q) processes
In certain cases it may be considered desirable to continue beyond the first difference, if the
process X is still not stationary after being differenced once. The notation extends in a
natural way.


Definition of an ARIMA process

If $X$ needs to be differenced at least $d$ times in order to reduce it to stationarity, and if the $d$th difference $Y = \nabla^d X$ is an $ARMA(p,q)$ process, then $X$ is termed an $ARIMA(p,d,q)$ process.

In terms of the backwards shift operator, the equation of the $ARIMA(p,d,q)$ process is:

$$(1 - \alpha_1 B - \cdots - \alpha_p B^p)(1 - B)^d (X - \mu) = (1 + \beta_1 B + \cdots + \beta_q B^q) e$$

An $ARIMA(p,d,q)$ process is $I(d)$. We can think of the classification $ARIMA(p,d,q)$ as:

$$AR(p) \; I(d) \; MA(q)$$

We now consider an example where we classify a time series as an ARIMA process.

To identify the values of $p$, $d$ and $q$ for which $X$ is an $ARIMA(p,d,q)$ process, where:

$$X_t = 0.6 X_{t-1} + 0.3 X_{t-2} + 0.1 X_{t-3} + e_t - 0.25 e_{t-1}$$

we can write the equation in terms of the backwards shift operator:

$$(1 - 0.6B - 0.3B^2 - 0.1B^3) X = (1 - 0.25B) e$$

We now check whether the polynomial on the left-hand side is divisible by $1 - B$; if so, we factorise it out. We continue to do this until the remaining polynomial is not divisible by $1 - B$:

$$(1 - B)(1 + 0.4B + 0.1B^2) X = (1 - 0.25B) e$$

The model can now be seen to be $ARIMA(2,1,1)$.

We should also check that the roots of the characteristic polynomial of $\nabla X_t$, ie $1 + 0.4\lambda + 0.1\lambda^2$, are both strictly greater than 1 in magnitude. In fact, the roots are $-2 \pm i\sqrt{6}$. The magnitude of the complex number $a + bi$ is $\sqrt{a^2 + b^2}$. So the magnitude of $-2 + i\sqrt{6}$ is:

$$\sqrt{(-2)^2 + (\sqrt{6})^2} = \sqrt{4 + 6} = \sqrt{10}$$

The magnitude of $-2 - i\sqrt{6}$ is also $\sqrt{10}$. So both roots are strictly greater than 1 in magnitude.

Differencing once removes the factor of $(1 - B)$. Hence the differenced process $\nabla X_t$ is stationary, as required.

Alternatively, we could write the equation for the process as:

$$X_t - 0.6 X_{t-1} - 0.3 X_{t-2} - 0.1 X_{t-3} = e_t - 0.25 e_{t-1}$$


The characteristic equation of the AR part is:

$$1 - 0.6\lambda - 0.3\lambda^2 - 0.1\lambda^3 = 0$$

There is no simple formula for solving cubic equations, so we should start by checking whether 1 is a root of this equation. Setting $\lambda = 1$, we see that the left-hand side is:

$$1 - 0.6 - 0.3 - 0.1 = 0$$

So 1 is a root, and hence $(1 - \lambda)$ is a factor. The characteristic polynomial can therefore be written in the form:

$$(1 - \lambda)(a\lambda^2 + b\lambda + c)$$

We can determine the values of $a$, $b$ and $c$ by comparing coefficients, by long division of polynomials, or by synthetic division. We find that $a = 0.1$, $b = 0.4$ and $c = 1$, and the roots of the equation $0.1\lambda^2 + 0.4\lambda + 1 = 0$ are $-2 \pm i\sqrt{6}$, as stated above; these are strictly greater than 1 in magnitude. Since the characteristic equation has one root of 1 and the other roots are strictly greater than 1 in magnitude, differencing once will give us a stationary process. So $d = 1$. There are two other roots, so $p = 2$. In addition, $q = 1$ since the moving average part is of order 1. Hence the process is $ARIMA(2,1,1)$.

Another alternative is to write the defining equation in terms of the differences.

The equation:

$$X_t = 0.6 X_{t-1} + 0.3 X_{t-2} + 0.1 X_{t-3} + e_t - 0.25 e_{t-1}$$

can be rearranged as:

$$X_t - 0.6 X_{t-1} - 0.3 X_{t-2} - 0.1 X_{t-3} = e_t - 0.25 e_{t-1}$$

or:

$$(X_t - X_{t-1}) + 0.4 (X_{t-1} - X_{t-2}) + 0.1 (X_{t-2} - X_{t-3}) = e_t - 0.25 e_{t-1}$$

or:

$$\nabla X_t + 0.4 \nabla X_{t-1} + 0.1 \nabla X_{t-2} = e_t - 0.25 e_{t-1}$$

The characteristic equation formed by the $\nabla X$ terms is:

$$1 + 0.4\lambda + 0.1\lambda^2 = 0$$

As we have already seen, the roots of this equation are $-2 \pm i\sqrt{6}$, and these both have a magnitude of $\sqrt{10}$. So $\nabla X$ is a stationary $ARMA(2,1)$ process, and hence $X$ is $ARIMA(2,1,1)$.
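Numerically, polyroot applied to the full AR polynomial shows the unit root directly (a sketch):

Mod(polyroot(c(1, -0.6, -0.3, -0.1)))   # 1 - 0.6z - 0.3z^2 - 0.1z^3
# one root of magnitude 1; the other two have magnitude sqrt(10) = 3.162...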

ARIMA models play a central role in the Box-Jenkins methodology, which aims to provide a
consistent and unified framework for analysis and prediction using time series models.
(See Section 3.1 of Chapter 14.)
